@calvingiles
Collaborator

No description provided.

This commit activates the semantic test-adherence feature specification by:

- Moving spec from specs/future/ to specs/ (activating for implementation)
- Changing Status from "Provisional" to "Draft"
- Adding LiteLLM as the integration library (REQ-021)
- Adding Groq provider support for free-tier CI/CD usage (REQ-022)
- Adding requirement for default model configurations pinned in releases (REQ-023)
- Expanding provider support: groq, anthropic, openai, ollama, vertex_ai, bedrock (REQ-036)
- Setting groq as default provider (REQ-037)
- Updating environment variables for all providers (REQ-042)
- Fixing SPEC ID conflict: spec-coverage-linter changed from SPEC-003 to SPEC-004

The specification now contains 67 functional requirements ready for implementation.
This aligns with the technical design document in technical-notes/llm-provider-selection.md.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit implements the core MVP for the semantic test-adherence checker
defined in SPEC-003, enabling AI/LLM-powered validation of test-requirement alignment.

## What's Added:

### Core Modules:
- **semantic_test_result.py**: Result dataclasses for semantic analysis
  - SemanticAnalysisResult: Individual test-requirement pair analysis
  - SemanticTestAdherenceResult: Overall validation results with reporting

- **llm_provider.py**: LLM provider abstraction layer
  - LLMProvider: Abstract base class for LLM providers
  - LiteLLMProvider: Implementation using LiteLLM library
  - Support for 6 providers: Groq (default), Anthropic, OpenAI, Ollama, Vertex AI, Bedrock
  - Default models pinned per provider (as per REQ-023)
  - Retry logic with exponential backoff (3 retries, as per REQ-025)
  - JSON response parsing with error handling
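The retry and JSON-parsing behaviour listed for llm_provider.py can be sketched roughly as below. This is not the code from the PR: `call_with_retries`, `parse_json_response`, the 1-second base delay, and the reading of "3 retries" as three attempts after the initial call are all assumptions made for illustration. The LLM call is passed in as a plain callable so the sketch does not depend on LiteLLM.

```python
import json
import time
from typing import Callable


def call_with_retries(
    call: Callable[[], str],
    max_retries: int = 3,          # assumed: 3 retries after the first attempt
    base_delay: float = 1.0,       # assumed base delay in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Run `call`, retrying on any exception with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    raise AssertionError("unreachable")


def parse_json_response(text: str) -> dict:
    """Extract and parse the JSON object embedded in a model reply."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in LLM response")
    try:
        return json.loads(text[start:end + 1])
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON in LLM response: {exc}") from exc
```

Injecting `sleep` keeps the helper testable without real waiting; a production version would likely retry only on transient errors rather than on every exception.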

- **semantic_test_analyzer.py**: Core semantic analyzer
  - Requirement discovery from spec files with full text extraction
  - Test discovery with source code and docstring extraction
  - Per-test-requirement pair semantic analysis using LLM
  - Confidence scoring and threshold-based validation
  - Provisional spec exclusion (consistent with check-coverage)
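To illustrate how the result dataclasses and the threshold-based validation might fit together, here is a minimal sketch. Only the class names `SemanticAnalysisResult` and `SemanticTestAdherenceResult` come from the commit message; every field name, the `analyze_pairs` helper, the injected `score` callable, and the 0.7 default threshold are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class SemanticAnalysisResult:
    """Outcome for one test-requirement pair (field names are assumptions)."""
    requirement_id: str
    test_name: str
    confidence: float  # assumed to lie in [0.0, 1.0]
    passed: bool


@dataclass
class SemanticTestAdherenceResult:
    """Aggregate over all analysed pairs (shape is an assumption)."""
    results: list[SemanticAnalysisResult] = field(default_factory=list)

    @property
    def failures(self) -> list[SemanticAnalysisResult]:
        return [r for r in self.results if not r.passed]

    @property
    def is_valid(self) -> bool:
        return not self.failures


def analyze_pairs(
    pairs: Iterable[tuple[str, str]],
    score: Callable[[str, str], float],  # stands in for the LLM call
    threshold: float = 0.7,              # hypothetical default
) -> SemanticTestAdherenceResult:
    """Score each (requirement, test) pair and flag those below the threshold."""
    outcome = SemanticTestAdherenceResult()
    for requirement_id, test_name in pairs:
        confidence = score(requirement_id, test_name)
        outcome.results.append(
            SemanticAnalysisResult(
                requirement_id, test_name, confidence, confidence >= threshold
            )
        )
    return outcome
```

Passing the scorer as a callable means the same loop can be exercised with a stub in unit tests and with a real provider in production.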

### CLI Integration:
- Added `check-semantic-test-adherence` command to cli.py
- Command-line arguments: --llm-provider, --llm-model, --threshold, --specs-dir, --tests-dir
- Comprehensive help text with examples and LLM provider configuration
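For orientation, the listed arguments could be declared with `argparse` along these lines. The flag names, the provider list, and the `groq` default are taken from the commit messages; the standalone `build_parser` function, the default threshold, and the default directory names are assumptions, and the real command is presumably registered inside cli.py rather than as its own parser.

```python
import argparse

# Provider identifiers as listed in the spec update (REQ-036).
PROVIDERS = ["groq", "anthropic", "openai", "ollama", "vertex_ai", "bedrock"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (defaults are hypothetical)."""
    parser = argparse.ArgumentParser(prog="check-semantic-test-adherence")
    parser.add_argument("--llm-provider", choices=PROVIDERS, default="groq")
    parser.add_argument("--llm-model", default=None,
                        help="override the provider's pinned default model")
    parser.add_argument("--threshold", type=float, default=0.7)  # assumed default
    parser.add_argument("--specs-dir", default="specs")          # assumed default
    parser.add_argument("--tests-dir", default="tests")          # assumed default
    return parser
```

With this sketch, `build_parser().parse_args(["--llm-provider", "ollama", "--threshold", "0.9"])` yields a namespace whose `llm_provider` is `"ollama"` and whose `threshold` is `0.9`.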

### Dependencies:
- Added litellm package for unified LLM provider access
- Updated pyproject.toml and uv.lock

### Exports:
- Updated __init__.py to export new classes for library usage

## Implementation Notes:

- Follows existing project patterns (reuses logic from spec_coverage_linter)
- Defaults to Groq provider for free-tier CI/CD usage (REQ-037)
- All code passes ruff linting and formatting checks
- Existing test suite passes (131 tests)

## What's Not Yet Implemented (Future Work):

- Comprehensive test suite for semantic analyzer (TEST-001 to TEST-010)
- Configuration file support in config.py (REQ-031)
- Batching optimization for multiple tests (REQ-030)
- Caching support (REQ-044 to REQ-045)
- Alternative output formats: JSON, Markdown (REQ-052 to REQ-054)
- Concurrent request pooling (REQ-043)

## Requirements Addressed:

Core functionality implements:
- REQ-001 to REQ-020: Requirement/test discovery and semantic analysis
- REQ-021 to REQ-029: LLM integration with LiteLLM
- REQ-031 to REQ-037: Basic configuration (CLI args only; config file support under REQ-031 is still pending)
- REQ-046 to REQ-048: Reporting
- REQ-055 to REQ-057: Exit codes
- REQ-058 to REQ-062: Error handling
- REQ-063 to REQ-067: Integration with existing tools

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit enhances the CI pipeline with two new validation steps:

1. **Unique Spec IDs Validation**:
   - Validates that all SPEC IDs and requirement IDs are unique
   - Prevents duplicate identifiers across the codebase
   - Required check (must pass for CI to succeed)

2. **Semantic Test-Adherence Validation (Optional)**:
   - Uses an LLM to validate that each test actually exercises its linked requirement
   - Uses Groq provider with GROQ_API_KEY from GitHub secrets
   - Marked optional with `continue-on-error: true`, since the API key may not be configured
   - Provides early feedback when API key is available

Both checks are added to the lint job for comprehensive validation.
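The uniqueness check amounts to grouping declared IDs by value and reporting any that appear more than once. A minimal sketch follows, assuming the IDs have already been extracted per file; `find_duplicate_ids` and the file names in the usage note are hypothetical and do not reflect the project's actual validator.

```python
from collections import defaultdict


def find_duplicate_ids(declared: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each ID declared in more than one place to the files declaring it.

    `declared` maps a file path to the IDs that file declares.
    """
    seen: dict[str, list[str]] = defaultdict(list)
    for path, ids in declared.items():
        for identifier in ids:
            seen[identifier].append(path)
    return {i: paths for i, paths in seen.items() if len(paths) > 1}
```

Fed two spec files that both declare `SPEC-003` (the kind of conflict the first commit resolved by renumbering spec-coverage-linter to SPEC-004), the function returns that ID together with both paths, and a CI step could fail whenever the result is non-empty.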

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
3 participants