Conversation
🛡️ Watchflow Governance Checks

Status: ❌ 3 Violations Found | 🟡 Medium Severity (3)

- Checks the PR description (body) and title for a linked issue reference (e.g. `#123`, `Fixes #123`, `Closes #456`); use when the rule requires issue refs in either field. Violation: PR does not reference a linked issue (e.g. `#123` or `closes #123` in body/title).
- Validates that total lines changed (additions + deletions) in a PR do not exceed a maximum; enforces a maximum LOC per pull request. Violation: Pull request exceeds maximum lines changed (1209 > 500).
- Validates that the PR title matches a specific pattern. Violation: PR title 'Feature/agentic guidelines' does not match required pattern `^feat|^fix|^docs|^style|^refactor|^test|^chore|^perf|^ci|^build|^revert`.
📝 Walkthrough

This PR introduces AI-based scanning and translation of AI rule files stored in repositories. It adds new API endpoints for scanning repositories and translating identified files into Watchflow YAML format, extends the GitHub API client with tree retrieval capabilities, integrates scanning into PR and push event processors, and provides comprehensive pattern matching and rule extraction logic.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor EventSource as Push/PR Event
    participant Processor as Event Processor
    participant Relevance as Relevance Check
    participant Orchestrator as get_suggested_rules_from_repo
    participant GitHub as GitHub API
    participant Scanner as AI Rules Scanner
    participant Translator as YAML Translator
    EventSource->>Processor: Event triggered (push/PR)
    Processor->>Relevance: is_relevant_push/is_relevant_pr(payload)
    Relevance-->>Processor: true/false
    alt Event is Relevant
        Processor->>Orchestrator: get_suggested_rules_from_repo(...)
        Orchestrator->>GitHub: get_repository(repo_url)
        GitHub-->>Orchestrator: repo_data
        Orchestrator->>GitHub: get_repository_tree(repo_full_name, ref)
        GitHub-->>Orchestrator: tree_entries
        Orchestrator->>Scanner: scan_repo_for_ai_rule_files(tree_entries, fetch_content)
        Scanner->>GitHub: get_file_content(path, ref) [for each candidate]
        GitHub-->>Scanner: file_content
        Scanner-->>Orchestrator: candidates with content/keywords
        Orchestrator->>Translator: translate_ai_rule_files_to_yaml(candidates)
        Translator-->>Orchestrator: yaml_rules, sources, ambiguities
        Orchestrator-->>Processor: (yaml_rules, rule_count, sources, ambiguities)
        Processor->>Processor: Log rules details
    else Event is Not Relevant
        Processor->>Processor: Log skip message
    end
    Processor-->>EventSource: Processing complete
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/integrations/github/api.py (1)
1200-1235: ⚠️ Potential issue | 🟠 Major

Retry decorator incorrectly retries deterministic authentication failures.
The `@retry` decorator on `execute_graphql` (lines 1200–1235) retries all exceptions, including the deterministic auth-failure path at line 35. When `_get_auth_headers` returns `None`, raising `Exception("Authentication required for GraphQL query.")` will be retried 3 times before finally failing, wasting latency and resources. Restrict retries to transient transport failures only: use `retry_if_exception_type` to exclude non-retriable errors like authentication failures, and raise `PermissionError` instead of a generic `Exception`.

Note: the method uses `aiohttp.ClientSession` (not `httpx`), so conditional retries should target `aiohttp.ClientError` and related transport exceptions.

Suggested fix:
```diff
-from tenacity import retry, stop_after_attempt, wait_exponential
+from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

-@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=4, max=10))
+@retry(
+    stop=stop_after_attempt(3),
+    wait=wait_exponential(multiplier=1, min=4, max=10),
+    retry=retry_if_exception_type((aiohttp.ClientError, TimeoutError)),
+    reraise=True,
+)
 async def execute_graphql(
@@
         if not headers:
             logger.error("GraphQL execution failed: No authentication headers available.")
-            raise Exception("Authentication required for GraphQL query.")
+            raise PermissionError("Authentication required for GraphQL query.")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/integrations/github/api.py` around lines 1200 - 1235, The execute_graphql retry currently retries all exceptions (including deterministic auth failures); update the retry decorator to only retry transient transport errors by using retry_if_exception_type targeting aiohttp.ClientError (and keep stop_after_attempt/ wait_exponential as is) and ensure retry imports include retry_if_exception_type; inside execute_graphql replace the generic raise Exception("Authentication required for GraphQL query.") with raise PermissionError("Authentication required for GraphQL query.") so authentication failures are not retried; add any necessary imports for aiohttp.ClientError and retry_if_exception_type and update _get_auth_headers handling accordingly.
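The behavior the fix asks for can be hard to picture in isolation. Below is a minimal, self-contained sketch (plain stdlib, no tenacity) of restricting retries to transient errors: transient failures are retried, while deterministic ones such as `PermissionError` propagate immediately. The decorator name and the choice of marker exception types are illustrative, not part of the project.

```python
import asyncio

def retry_only(exc_types: tuple, attempts: int = 3):
    """Retry an async callable, but only for the given exception types.

    Mirrors what tenacity's retry_if_exception_type achieves: anything
    outside exc_types (e.g. PermissionError) fails fast on the first call.
    """
    def decorator(fn):
        async def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return await fn(*args, **kwargs)
                except exc_types:
                    if attempt == attempts:
                        raise
        return wrapper
    return decorator

calls = {"flaky": 0, "auth": 0}

@retry_only((ConnectionError,))
async def flaky():
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise ConnectionError("transient network blip")  # retried
    return "ok"

@retry_only((ConnectionError,))
async def no_auth():
    calls["auth"] += 1
    raise PermissionError("Authentication required")  # not in exc_types

result = asyncio.run(flaky())   # succeeds on the third attempt
try:
    asyncio.run(no_auth())
except PermissionError:
    pass                        # raised after exactly one call, no retries
```

With tenacity itself, the equivalent is `retry=retry_if_exception_type((aiohttp.ClientError, TimeoutError))` plus `reraise=True`, as shown in the suggested fix.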
🧹 Nitpick comments (9)
src/api/recommendations.py (3)
22-22: Duplicate `yaml` import.

The `yaml` module is imported here at module level, but it's also imported again at line 658 inside the `recommend_rules` function. Remove the duplicate import inside the function.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/api/recommendations.py` at line 22, Remove the duplicate import of the yaml module inside the recommend_rules function: the module-level import "import yaml" already provides yaml to the module, so delete the inner "import yaml" statement found within the recommend_rules function (keep any code that uses yaml unchanged).
496-552: Consider adding concurrency bounds for multiple file fetches.

`get_suggested_rules_from_repo` fetches content for multiple files via `scan_repo_for_ai_rule_files`. Per coding guidelines, bound fan-out with `asyncio.Semaphore` to prevent overwhelming the GitHub API when scanning repositories with many candidate files.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/api/recommendations.py` around lines 496 - 552, Add a concurrency bound when fetching file contents in get_suggested_rules_from_repo by creating an asyncio.Semaphore (e.g., MAX_CONCURRENT_REQUESTS) near the start of the function and using it inside the get_content closure so each call to github_client.get_file_content acquires the semaphore (use "async with semaphore" or acquire/release in try/finally) before performing the network call; alternatively, pass the semaphore or a bounded fetch helper into scan_repo_for_ai_rule_files so that fetch_content/get_file_content calls are limited, preventing unbounded fan-out to the GitHub API.
550-552: Logging lacks structured fields.

Per coding guidelines, use structured logging at boundaries with fields: `operation`, `subject_ids`, `decision`, `latency_ms`. The warning logs `repo` and `error` but should include `operation` for consistency.

Proposed fix:
```diff
-        logger.warning("get_suggested_rules_from_repo_failed", repo=repo_full_name, error=str(e))
+        logger.warning("get_suggested_rules_from_repo_failed", operation="agentic_scan", repo=repo_full_name, error=str(e))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/api/recommendations.py` around lines 550 - 552, The except block in get_suggested_rules_from_repo is missing the required structured logging fields; update the logger.warning call to include operation="get_suggested_rules_from_repo", subject_ids=[repo_full_name] (or appropriate list of subject ids), decision="failure", and latency_ms (e.g., 0 or None) alongside the existing error field so the log conforms to the structured logging boundary contract; keep the existing return tuple unchanged.

tests/unit/rules/test_ai_rules_scan.py (1)
1-20: Consider adding tests for `extract_rule_statements_from_markdown` and `translate_ai_rule_files_to_yaml`.

The current test suite covers the scanning utilities well but doesn't test the statement extraction and translation pipeline, which are core to the feature. Based on learnings, write unit tests for deterministic rule evaluation.
Would you like me to help generate test cases for these functions?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/rules/test_ai_rules_scan.py` around lines 1 - 20, Add unit tests for extract_rule_statements_from_markdown and translate_ai_rule_files_to_yaml to cover deterministic statement extraction from Markdown and correct translation into YAML rule objects: write tests calling extract_rule_statements_from_markdown with Markdown inputs containing multiple rule sections, fenced code blocks, and edge cases (no rules, malformed markers) asserting the returned list of statements and their order; and write tests for translate_ai_rule_files_to_yaml that pass sample file objects (filename + content) and assert resulting YAML structure, field mappings, and behavior when content_has_ai_keywords is false/true. Reference the functions extract_rule_statements_from_markdown and translate_ai_rule_files_to_yaml and reuse existing fixtures/helpers (e.g., sample tree entries) to keep tests deterministic and isolated from network I/O.

src/rules/ai_rules_scan.py (2)
119-152: Content fetches are sequential; consider parallel fetching.

The `scan_repo_for_ai_rule_files` function fetches content for each candidate file sequentially. For repositories with many candidate files, this could be slow. Consider using `asyncio.gather` with a semaphore to fetch content in parallel while respecting rate limits.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/rules/ai_rules_scan.py` around lines 119 - 152, The loop in scan_repo_for_ai_rule_files currently awaits get_file_content sequentially; change it to schedule per-entry async tasks and run them with asyncio.gather while using an asyncio.Semaphore to limit concurrency; implement a small inner async helper (e.g., fetch_and_check(path)) that acquires the semaphore, calls await get_file_content(path), computes has_keywords via content_has_ai_keywords(content), handles exceptions and returns the dict with "path","has_keywords","content", then replace the for-loop with creating tasks for each candidate and awaiting asyncio.gather on them to populate results while preserving the existing logging on exceptions.
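The bounded fan-out shape described above can be sketched as follows. `scan_candidates`, the fake fetcher, and the keyword check are hypothetical stand-ins for the project's `scan_repo_for_ai_rule_files` and `get_file_content`; the point is only the semaphore plus `asyncio.gather` structure.

```python
import asyncio

async def scan_candidates(paths, get_file_content, max_concurrency=5):
    """Fetch candidate files in parallel, bounded by a semaphore."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch_and_check(path):
        async with semaphore:
            try:
                content = await get_file_content(path)
            except Exception:
                # In real code, log and skip rather than failing the whole scan.
                return {"path": path, "content": None, "has_keywords": False}
        # Illustrative keyword check, not the project's actual heuristic.
        has_keywords = "cursor" in content.lower() or "claude" in content.lower()
        return {"path": path, "content": content, "has_keywords": has_keywords}

    # gather preserves input order, so results line up with paths.
    return await asyncio.gather(*(fetch_and_check(p) for p in paths))

async def fake_fetch(path):
    """Stand-in for a GitHub content fetch; no network involved."""
    return "Claude: always use type hints" if path.endswith(".md") else "plain"

results = asyncio.run(scan_candidates([".cursorrules", "CLAUDE.md"], fake_fetch))
```

Only the fetch itself is held under the semaphore; the cheap keyword check runs outside it, so slots free up as soon as the I/O completes.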
7-14: Use `structlog` for consistency with other modules.

Other modules in this codebase use `structlog` for structured logging (e.g., `src/api/recommendations.py`). Consider using `structlog.get_logger()` for consistency and structured log output.

Proposed fix:
```diff
-import logging
+import structlog
 import re
 from collections.abc import Awaitable, Callable
 from typing import Any, cast
 from src.core.utils.patterns import matches_any
 import yaml

-logger = logging.getLogger(__name__)
+logger = structlog.get_logger()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/rules/ai_rules_scan.py` around lines 7 - 14, Replace the stdlib logging usage with structlog: import structlog and replace the module-level logger = logging.getLogger(__name__) with logger = structlog.get_logger() (and remove or stop using the logging import). Update any calls that assume a stdlib Logger API if needed so they use the structlog logger in this module (ai_rules_scan.py) and keep the symbol name logger for consistency with other modules like src/api/recommendations.py.

tests/integration/test_scan_ai_files.py (1)
71-71: Add trailing newline.

File should end with a newline character per POSIX conventions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_scan_ai_files.py` at line 71, The file tests/integration/test_scan_ai_files.py is missing a trailing newline; open that test file and add a single newline character at the end of the file (ensure the file ends with '\n' when saved) so it conforms to POSIX conventions and prevents diff/tooling warnings.

src/integrations/github/api.py (2)
263-267: Prefer `params` over manual query-string concatenation for `ref`.

Building `?ref=...` inline is brittle and bypasses URL-encoding safeguards. Use `session.get(..., params={"ref": ref})`.

Suggested fix:
```diff
 url = f"{config.github.api_base_url}/repos/{repo_full_name}/contents/{file_path}"
-if ref:
-    url = f"{url}?ref={ref}"
+params = {"ref": ref} if ref else None
 session = await self._get_session()
-async with session.get(url, headers=headers) as response:
+async with session.get(url, headers=headers, params=params) as response:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/integrations/github/api.py` around lines 263 - 267, The code concatenates "?ref=..." onto the URL before calling session.get, which bypasses URL-encoding and is brittle; change the session.get call (the async with session.get(url, headers=headers) in the method that calls self._get_session()) to pass the ref via the params parameter (e.g., params={"ref": ref}) and remove the manual string concatenation so URL encoding is handled by the HTTP client.
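The encoding hazard is easy to demonstrate with the stdlib alone. A hypothetical ref containing reserved characters survives `urlencode` (which is what passing `params=` to the HTTP client does for you) but is silently mangled by string concatenation:

```python
from urllib.parse import urlencode, urlsplit, parse_qs

base = "https://api.github.com/repos/acme/widgets/contents/docs/guide.md"
ref = "feature/agentic#wip"  # illustrative ref with reserved characters

# Manual concatenation leaves reserved characters unencoded:
manual = f"{base}?ref={ref}"

# Encoding the query properly (what aiohttp's params=... does for you):
encoded = f"{base}?{urlencode({'ref': ref})}"

# In the manual URL, '#' starts a fragment and silently truncates the ref:
assert urlsplit(manual).query == "ref=feature/agentic"
assert urlsplit(manual).fragment == "wip"

# The encoded URL round-trips the ref intact:
assert parse_qs(urlsplit(encoded).query)["ref"] == ["feature/agentic#wip"]
```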
207-210: Avoid hardcoding `main` when `ref` is omitted.

Defaulting to `"main"` will fail on repositories whose default branch is different (e.g., `master`, `develop`). Resolve `default_branch` first (or require an explicit `ref`) before the tree lookup.

Suggested fix:
```diff
-        ref = ref or "main"
+        if not ref:
+            repo_data, _ = await self.get_repository(
+                repo_full_name,
+                installation_id=installation_id,
+                user_token=user_token,
+            )
+            if not repo_data:
+                return []
+            ref = cast("str", repo_data.get("default_branch") or "main")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/integrations/github/api.py` around lines 207 - 210, The code currently hardcodes ref = "main" before calling _resolve_tree_sha; instead retrieve the repository's default branch via the repo metadata and use that when ref is not provided (or raise if you want to require an explicit ref). Update the logic around the ref variable in the function that calls _resolve_tree_sha to first fetch the repo's default_branch (e.g., via an existing repo metadata method or a new API call), set ref = default_branch when ref is falsy, then call await self._resolve_tree_sha(repo_full_name, ref, headers); ensure error handling covers missing metadata and preserves the previous return [] behavior if no tree_sha is resolved.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/api/recommendations.py`:
- Around line 1116-1121: Remove the fragile string-based counting and the bare
except: stop computing rules_count with rules_yaml.count(...) and instead
initialize rules_count = 0, then call yaml.safe_load(rules_yaml) and set
rules_count = len(parsed.get("rules", [])) if isinstance(parsed, dict) else 0;
catch only yaml.YAMLError (or ValueError) and handle it explicitly (e.g., log a
warning via the module logger or return a validation error) rather than using a
bare `except Exception: pass`, and if you must catch a generic Exception
re-raise it after logging so errors are not silently swallowed; update the block
that uses rules_yaml, parsed, yaml.safe_load, and rules_count accordingly.
- Around line 544-548: The try/except around yaml.safe_load that sets parsed and
rules_count is swallowing all exceptions; change it to catch only expected
parsing errors (e.g., yaml.YAMLError and ValueError) and handle them
explicitly—log the error via the appropriate logger or re-raise instead of using
a bare except/pass—so update the try/except block around
yaml.safe_load(rules_yaml) and the parsed/rules_count assignment to catch
yaml.YAMLError (and optionally ValueError) and then log the exception context
(or raise) rather than silently passing.
In `@src/integrations/github/api.py`:
- Around line 132-169: get_repository currently returns a tuple of raw dicts
which is shape-unsafe; change it to return validated Pydantic models (e.g.,
RepositoryResponse for success and ApiErrorResponse for failures) instead of
tuple[dict[str, Any] | None, dict[str, Any] | None]. Update the function
signature and all return sites inside get_repository (including the
authentication error branch using _get_auth_headers and all response.status
branches after _get_session().get) to instantiate and return the appropriate
BaseModel instances with the same fields (status, message, and repo data mapped
into RepositoryResponse fields), and adjust callers to consume the typed models
rather than raw dicts. Ensure models are defined as Pydantic BaseModel
subclasses and include parsing/validation of the JSON body (use the
model.parse_obj or model(...) constructors) before returning.
- Around line 224-241: _resolve_tree_sha currently only queries
git/ref/heads/{ref} and cannot resolve tags or raw commit SHAs; update it to
call the single commits endpoint (/repos/{owner}/{repo}/commits/{ref}) which
accepts branch names, tags, and commit SHAs and returns the commit object with a
tree.sha. In function _resolve_tree_sha use self._get_session() and session.get
to fetch f"{config.github.api_base_url}/repos/{repo_full_name}/commits/{ref}"
with the provided headers, check for a 200 response, parse JSON into
commit_data, and return commit_data.get("tree", {}).get("sha") or None on
failure; keep existing error/None semantics and ensure you remove the previous
git/ref/heads + git/commits two-step logic so get_file_content's documented ref
support works.
In `@src/rules/ai_rules_scan.py`:
- Around line 304-309: The log call indicating a successful deterministic
mapping match is using logger.warning; change that to a non-warning level
(prefer logger.debug) so successful matches are not treated as warnings: in the
block that logs "deterministic_mapping_matched statement=%r pattern=%r" (just
before the return dict(rule_dict)) replace logger.warning(...) with
logger.debug(...) keeping the same message and arguments (statement[:100], p) to
preserve context.
- Around line 353-369: The feasibility agent output is used without verifying
its required fields or confidence; update the block that calls
get_feasibility_agent() and agent.execute(...) to validate that result.data
contains the required keys ("decision"/"is_feasible", "confidence", "reasoning")
and only accept the YAML when result.success is true, result.data["is_feasible"]
is truthy, yaml_content exists, and result.data["confidence"] >= 0.5; if
confidence is missing or < 0.5 or required keys are absent, append to ambiguous
with a clear reason (e.g., "low confidence" or "missing agent fields") including
the reported confidence/decision, otherwise proceed to parse yaml_content and
add rules to all_rules as before.
---
Outside diff comments:
In `@src/integrations/github/api.py`:
- Around line 1200-1235: The execute_graphql retry currently retries all
exceptions (including deterministic auth failures); update the retry decorator
to only retry transient transport errors by using retry_if_exception_type
targeting aiohttp.ClientError (and keep stop_after_attempt/ wait_exponential as
is) and ensure retry imports include retry_if_exception_type; inside
execute_graphql replace the generic raise Exception("Authentication required for
GraphQL query.") with raise PermissionError("Authentication required for GraphQL
query.") so authentication failures are not retried; add any necessary imports
for aiohttp.ClientError and retry_if_exception_type and update _get_auth_headers
handling accordingly.
---
Nitpick comments:
In `@src/api/recommendations.py`:
- Line 22: Remove the duplicate import of the yaml module inside the
recommend_rules function: the module-level import "import yaml" already provides
yaml to the module, so delete the inner "import yaml" statement found within the
recommend_rules function (keep any code that uses yaml unchanged).
- Around line 496-552: Add a concurrency bound when fetching file contents in
get_suggested_rules_from_repo by creating an asyncio.Semaphore (e.g.,
MAX_CONCURRENT_REQUESTS) near the start of the function and using it inside the
get_content closure so each call to github_client.get_file_content acquires the
semaphore (use "async with semaphore" or acquire/release in try/finally) before
performing the network call; alternatively, pass the semaphore or a bounded
fetch helper into scan_repo_for_ai_rule_files so that
fetch_content/get_file_content calls are limited, preventing unbounded fan-out
to the GitHub API.
- Around line 550-552: The except block in get_suggested_rules_from_repo is
missing the required structured logging fields; update the logger.warning call
to include operation="get_suggested_rules_from_repo",
subject_ids=[repo_full_name] (or appropriate list of subject ids),
decision="failure", and latency_ms (e.g., 0 or None) alongside the existing
error field so the log conforms to the structured logging boundary contract;
keep the existing return tuple unchanged.
In `@src/integrations/github/api.py`:
- Around line 263-267: The code concatenates "?ref=..." onto the URL before
calling session.get, which bypasses URL-encoding and is brittle; change the
session.get call (the async with session.get(url, headers=headers) in the method
that calls self._get_session()) to pass the ref via the params parameter (e.g.,
params={"ref": ref}) and remove the manual string concatenation so URL encoding
is handled by the HTTP client.
- Around line 207-210: The code currently hardcodes ref = "main" before calling
_resolve_tree_sha; instead retrieve the repository's default branch via the repo
metadata and use that when ref is not provided (or raise if you want to require
an explicit ref). Update the logic around the ref variable in the function that
calls _resolve_tree_sha to first fetch the repo's default_branch (e.g., via an
existing repo metadata method or a new API call), set ref = default_branch when
ref is falsy, then call await self._resolve_tree_sha(repo_full_name, ref,
headers); ensure error handling covers missing metadata and preserves the
previous return [] behavior if no tree_sha is resolved.
In `@src/rules/ai_rules_scan.py`:
- Around line 119-152: The loop in scan_repo_for_ai_rule_files currently awaits
get_file_content sequentially; change it to schedule per-entry async tasks and
run them with asyncio.gather while using an asyncio.Semaphore to limit
concurrency; implement a small inner async helper (e.g., fetch_and_check(path))
that acquires the semaphore, calls await get_file_content(path), computes
has_keywords via content_has_ai_keywords(content), handles exceptions and
returns the dict with "path","has_keywords","content", then replace the for-loop
with creating tasks for each candidate and awaiting asyncio.gather on them to
populate results while preserving the existing logging on exceptions.
- Around line 7-14: Replace the stdlib logging usage with structlog: import
structlog and replace the module-level logger = logging.getLogger(__name__) with
logger = structlog.get_logger() (and remove or stop using the logging import).
Update any calls that assume a stdlib Logger API if needed so they use the
structlog logger in this module (ai_rules_scan.py) and keep the symbol name
logger for consistency with other modules like src/api/recommendations.py.
In `@tests/integration/test_scan_ai_files.py`:
- Line 71: The file tests/integration/test_scan_ai_files.py is missing a
trailing newline; open that test file and add a single newline character at the
end of the file (ensure the file ends with '\n' when saved) so it conforms to
POSIX conventions and prevents diff/tooling warnings.
In `@tests/unit/rules/test_ai_rules_scan.py`:
- Around line 1-20: Add unit tests for extract_rule_statements_from_markdown and
translate_ai_rule_files_to_yaml to cover deterministic statement extraction from
Markdown and correct translation into YAML rule objects: write tests calling
extract_rule_statements_from_markdown with Markdown inputs containing multiple
rule sections, fenced code blocks, and edge cases (no rules, malformed markers)
asserting the returned list of statements and their order; and write tests for
translate_ai_rule_files_to_yaml that pass sample file objects (filename +
content) and assert resulting YAML structure, field mappings, and behavior when
content_has_ai_keywords is false/true. Reference the functions
extract_rule_statements_from_markdown and translate_ai_rule_files_to_yaml and
reuse existing fixtures/helpers (e.g., sample tree entries) to keep tests
deterministic and isolated from network I/O.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- src/api/recommendations.py
- src/event_processors/pull_request/processor.py
- src/event_processors/push.py
- src/integrations/github/api.py
- src/rules/ai_rules_scan.py
- tests/integration/test_scan_ai_files.py
- tests/unit/api/test_proceed_with_pr.py
- tests/unit/integrations/github/test_api.py
- tests/unit/rules/test_ai_rules_scan.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/guidelines.mdc)
**/*.py: Use modern typing only: `dict[str, Any]`, `list[str]`, `str | None` (no `Dict`, `List`, `Optional`)
GitHub/HTTP/DB calls must be `async def`; avoid blocking calls (time.sleep, sync HTTP) in async paths
All agent outputs and external payloads must use validated `BaseModel` from Pydantic
Use dataclasses for internal immutable state where appropriate
Use structured logging at boundaries with fields: `operation`, `subject_ids`, `decision`, `latency_ms`
Implement Agent pattern: single-responsibility agents with typed inputs/outputs
Use Decorator pattern for retries, metrics, caching as cross-cutting concerns
Agent outputs must include: `decision`, `confidence` (0..1), short `reasoning`, `recommendations`, `strategy_used`
Implement confidence policy: reject or route to human-in-the-loop when confidence < 0.5
Use minimal, step-driven prompts; provide Chain-of-Thought only for complexity > 0.7 or ambiguity > 0.6
Strip secrets/PII from agent prompts; scope tools; keep raw reasoning out of logs (store summaries only)
Cache idempotent lookups; lazy-import heavy dependencies; bound fan-out with `asyncio.Semaphore`
Avoid redundant LLM calls; memoize per event when safe
Use domain errors (e.g., `AgentError`) with `error_type`, `message`, `context`, `timestamp`, `retry_count`
Use exponential backoff for transient failures; circuit-break noisy integrations when needed
Fail closed for risky decisions; provide actionable remediation in error paths
Validate all external inputs; verify webhook signatures
Implement prompt-injection hardening; sanitize repository content passed to LLMs
Performance targets: Static validation ~<100ms typical, hybrid decisions sub-second when cache warm, budget LLM paths thoughtfully
Reject old typing syntax (`Dict`, `List`, `Optional`) in code review
Reject blocking calls in async code; reject bare `except:` clauses; reject swallowed errors
Reject LLM calls for trivial/deterministic checks
Reject unvalidated agent outputs and missing confidenc...
Files:
- tests/unit/api/test_proceed_with_pr.py
- tests/unit/rules/test_ai_rules_scan.py
- src/event_processors/pull_request/processor.py
- tests/unit/integrations/github/test_api.py
- tests/integration/test_scan_ai_files.py
- src/integrations/github/api.py
- src/api/recommendations.py
- src/rules/ai_rules_scan.py
- src/event_processors/push.py
tests/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/guidelines.mdc)
tests/**/*.py: Write unit tests for deterministic rule evaluation (pass/warn/block), model validation, and error paths
Write integration tests for webhook parsing, idempotency, multi-agent coordination, and state persistence
Use `pytest.mark.asyncio` for async tests; avoid live network calls; freeze time and seed randomness
Write regression tests for every bug fix; keep CI coverage thresholds green
Files:
- tests/unit/api/test_proceed_with_pr.py
- tests/unit/rules/test_ai_rules_scan.py
- tests/unit/integrations/github/test_api.py
- tests/integration/test_scan_ai_files.py
🧠 Learnings (3)
📚 Learning: 2026-01-31T19:35:22.504Z
Learnt from: CR
Repo: warestack/watchflow PR: 0
File: .cursor/rules/guidelines.mdc:0-0
Timestamp: 2026-01-31T19:35:22.504Z
Learning: Applies to tests/**/*.py : Write unit tests for deterministic rule evaluation (pass/warn/block), model validation, and error paths
Applied to files:
- tests/unit/rules/test_ai_rules_scan.py
- tests/integration/test_scan_ai_files.py
📚 Learning: 2026-01-31T19:35:22.504Z
Learnt from: CR
Repo: warestack/watchflow PR: 0
File: .cursor/rules/guidelines.mdc:0-0
Timestamp: 2026-01-31T19:35:22.504Z
Learning: Applies to tests/**/*.py : Use `pytest.mark.asyncio` for async tests; avoid live network calls; freeze time and seed randomness
Applied to files:
tests/unit/rules/test_ai_rules_scan.py
📚 Learning: 2026-01-31T19:35:22.504Z
Learnt from: CR
Repo: warestack/watchflow PR: 0
File: .cursor/rules/guidelines.mdc:0-0
Timestamp: 2026-01-31T19:35:22.504Z
Learning: Applies to tests/**/*.py : Write integration tests for webhook parsing, idempotency, multi-agent coordination, and state persistence
Applied to files:
tests/integration/test_scan_ai_files.py
🧬 Code graph analysis (6)
tests/unit/rules/test_ai_rules_scan.py (1)
src/rules/ai_rules_scan.py (4)
- content_has_ai_keywords (59-64)
- filter_tree_entries_for_ai_rules (97-113)
- path_matches_ai_rule_patterns (51-56)
- scan_repo_for_ai_rule_files (119-152)
src/event_processors/pull_request/processor.py (2)
src/api/recommendations.py (1)
- get_suggested_rules_from_repo (496-552)

src/rules/ai_rules_scan.py (1)
- is_relevant_pr (83-95)
tests/unit/integrations/github/test_api.py (1)
src/integrations/github/api.py (2)
- get_repository (130-169)
- get_repository_tree (192-221)
src/integrations/github/api.py (2)
src/tasks/task_queue.py (2)
- installation_id (41-47)
- repo_full_name (33-38)

src/core/models.py (1)
- repo_full_name (137-139)
src/rules/ai_rules_scan.py (3)
src/core/utils/patterns.py (1)
- matches_any (76-95)

src/integrations/github/api.py (1)
- get_file_content (243-281)

src/agents/factory.py (1)
- get_agent (20-52)
src/event_processors/push.py (3)
src/api/recommendations.py (1)
- get_suggested_rules_from_repo (496-552)

src/rules/ai_rules_scan.py (1)
- is_relevant_push (66-80)

src/tasks/task_queue.py (3)
- Task (19-47)
- installation_id (41-47)
- repo_full_name (33-38)
🔇 Additional comments (18)
src/api/recommendations.py (2)
145-199: LGTM!

The new Pydantic models (`ScanAIFilesRequest`, `ScanAIFilesCandidate`, `ScanAIFilesResponse`, `TranslateAIFilesRequest`, `TranslateAIFilesResponse`) are well-structured with proper type hints, field descriptions, and `Field` defaults. They comply with the coding guidelines for validated `BaseModel` usage.
486-493: LGTM!

The `_ref_to_branch` helper correctly handles edge cases (None, empty string, whitespace) and properly strips the `refs/heads/` prefix. Clean implementation.

src/rules/ai_rules_scan.py (3)
17-26: LGTM!

The `AI_RULE_FILE_PATTERNS` list comprehensively covers common AI assistant rule file locations (Cursor, Claude, Copilot conventions). The glob patterns are well-structured for both root-level and nested paths.
66-80: LGTM!

`is_relevant_push` correctly identifies relevant pushes by checking if the ref targets the default branch or if any commit touches AI rule file paths. The iteration over added/modified/removed files is thorough.
180-217: LGTM! `extract_rule_statements_from_markdown` implements a clean deterministic extraction with:
- Line prefix detection for known patterns (`Cursor rule:`, `Claude:`, etc.)
- Phrase marker detection (`always use`, `never commit`, etc.)
- Deduplication via normalization
- Reasonable line length limit (500 chars)
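The steps listed above can be sketched as a single pure function. Everything here is a hypothetical approximation: the marker strings, normalization, and limit are assumptions based on the bullet list, not the repository's actual implementation:

```python
import re

PREFIX_MARKERS = ("cursor rule:", "claude:")     # assumed known prefixes
PHRASE_MARKERS = ("always use", "never commit")  # assumed phrase markers
MAX_LINE_LEN = 500

def extract_rule_statements_from_markdown(markdown: str) -> list[str]:
    """Deterministically pull rule-like statements out of markdown text."""
    seen: set[str] = set()
    statements: list[str] = []
    for raw in markdown.splitlines():
        line = raw.strip().lstrip("-*# ").strip()  # drop bullet/heading markers
        if not line or len(line) > MAX_LINE_LEN:
            continue
        lower = line.lower()
        if not (lower.startswith(PREFIX_MARKERS) or any(m in lower for m in PHRASE_MARKERS)):
            continue
        norm = re.sub(r"\s+", " ", lower)  # normalize whitespace for dedup
        if norm in seen:
            continue
        seen.add(norm)
        statements.append(line)
    return statements
```

Lines differing only in case or whitespace collapse to one statement via the normalized key.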
src/event_processors/pull_request/processor.py (2)
65-90: LGTM on agentic scanning integration. The conditional execution based on `is_relevant_pr` is appropriate, and the exception handling gracefully degrades without crashing PR processing. The detailed logging provides good observability during the rollout phase.

Consider reducing log verbosity for production by:
- Moving `logger.info(" YAML:\n%s", rules_yaml)` to debug level for large outputs
- Truncating `rule_sources` if the list grows large
6-7: LGTM! The new imports cleanly bring in the agentic scanning capabilities from the appropriate modules.
src/event_processors/push.py (2)
68-93: LGTM! The agentic scanning integration for push events mirrors the PR processor implementation appropriately. The conditional check via `is_relevant_push`, graceful exception handling, and detailed logging are consistent with the PR processor approach.
6-7: LGTM! Imports are correctly added for the new agentic scanning functionality.
tests/unit/api/test_proceed_with_pr.py (1)
9-10: LGTM! The test double is correctly updated to return the new tuple signature `(repo_data, error)`, matching the updated `get_repository` API.

tests/unit/integrations/github/test_api.py (3)
129-132: LGTM! Assertions correctly updated to unpack and validate the new tuple return signature `(repo_data, repo_error)`.
143-147: LGTM! Error case properly validates both `repo_data is None` and that `repo_error` contains the expected status code.
230-276: LGTM! Comprehensive test for `get_repository_tree` that:
- Mocks `_get_auth_headers` and `_resolve_tree_sha` properly
- Validates the correct number of entries returned
- Checks expected paths are present in results

Good use of `patch.object` with context managers for clean test isolation.

tests/integration/test_scan_ai_files.py (1)
20-70: LGTM! Good integration test that:
- Properly mocks GitHub client methods with tuple return signatures
- Validates response structure (status code, repo_full_name, ref, candidate_files)
- Checks that matching paths appear in results
- Verifies required fields (`path`, `has_keywords`) are present

Consider adding error path tests (invalid URL, rate limit, auth failure) in a follow-up.
tests/unit/rules/test_ai_rules_scan.py (3)
23-61: LGTM! Comprehensive parameterized tests for `path_matches_ai_rule_patterns` covering:
- Various matching patterns (rules, guidelines, prompt, .cursor/rules/)
- Non-matching paths
- Edge cases (empty, whitespace)
- Backslash normalization
64-90: LGTM! Good coverage for `content_has_ai_keywords`:
- Multiple keyword samples with parameterized tests
- Case-insensitivity validation
- Empty/None handling
130-190: LGTM! Async tests for `scan_repo_for_ai_rule_files` properly:
- Use the `pytest.mark.asyncio` decorator
- Test the no-content-fetch path
- Test content fetch with keyword detection
- Test failure handling (network error)

Good mock implementation for the `get_file_content` callback.

src/integrations/github/api.py (1)
174-177: Auth-only directory listing change is consistent with the stricter access model. This update aligns `list_directory_any_auth` behavior with the new authenticated flow.
```python
try:
    parsed = yaml.safe_load(rules_yaml)
    rules_count = len(parsed.get("rules", [])) if isinstance(parsed, dict) else 0
except Exception:
    pass
```
Bare except with pass swallows errors silently.
This broadly catches every `Exception` and then discards it without logging. Per coding guidelines, reject bare `except:` clauses and swallowed errors.
Proposed fix:

```diff
 try:
     parsed = yaml.safe_load(rules_yaml)
     rules_count = len(parsed.get("rules", [])) if isinstance(parsed, dict) else 0
-except Exception:
-    pass
+except yaml.YAMLError as e:
+    logger.debug("yaml_parse_failed_for_rules_count", error=str(e))
```
+ logger.debug("yaml_parse_failed_for_rules_count", error=str(e))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| try: | |
| parsed = yaml.safe_load(rules_yaml) | |
| rules_count = len(parsed.get("rules", [])) if isinstance(parsed, dict) else 0 | |
| except Exception: | |
| pass | |
| try: | |
| parsed = yaml.safe_load(rules_yaml) | |
| rules_count = len(parsed.get("rules", [])) if isinstance(parsed, dict) else 0 | |
| except yaml.YAMLError as e: | |
| logger.debug("yaml_parse_failed_for_rules_count", error=str(e)) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/api/recommendations.py` around lines 544 - 548, The try/except around
yaml.safe_load that sets parsed and rules_count is swallowing all exceptions;
change it to catch only expected parsing errors (e.g., yaml.YAMLError and
ValueError) and handle them explicitly—log the error via the appropriate logger
or re-raise instead of using a bare except/pass—so update the try/except block
around yaml.safe_load(rules_yaml) and the parsed/rules_count assignment to catch
yaml.YAMLError (and optionally ValueError) and then log the exception context
(or raise) rather than silently passing.
```python
rules_count = rules_yaml.count("\n - ") + (1 if rules_yaml.strip() != "rules: []" and " - " in rules_yaml else 0)
try:
    parsed = yaml.safe_load(rules_yaml)
    rules_count = len(parsed.get("rules", [])) if isinstance(parsed, dict) else 0
except Exception:
    pass
```
Redundant and error-prone rules counting logic.
Line 1116 uses string counting (`rules_yaml.count("\n - ")`), which is fragile and immediately overwritten by the YAML parse result. Remove the unreliable string-based count. Also, the bare `except Exception` with `pass` violates coding guidelines.
Proposed fix:

```diff
-rules_count = rules_yaml.count("\n - ") + (1 if rules_yaml.strip() != "rules: []" and " - " in rules_yaml else 0)
+rules_count = 0
 try:
     parsed = yaml.safe_load(rules_yaml)
     rules_count = len(parsed.get("rules", [])) if isinstance(parsed, dict) else 0
-except Exception:
-    pass
+except yaml.YAMLError as e:
+    logger.debug("yaml_parse_failed", error=str(e))
```

🤖 Prompt for AI Agents
+ logger.debug("yaml_parse_failed", error=str(e))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/api/recommendations.py` around lines 1116 - 1121, Remove the fragile
string-based counting and the bare except: stop computing rules_count with
rules_yaml.count(...) and instead initialize rules_count = 0, then call
yaml.safe_load(rules_yaml) and set rules_count = len(parsed.get("rules", [])) if
isinstance(parsed, dict) else 0; catch only yaml.YAMLError (or ValueError) and
handle it explicitly (e.g., log a warning via the module logger or return a
validation error) rather than using a bare `except Exception: pass`, and if you
must catch a generic Exception re-raise it after logging so errors are not
silently swallowed; update the block that uses rules_yaml, parsed,
yaml.safe_load, and rules_count accordingly.
```diff
 ) -> tuple[dict[str, Any] | None, dict[str, Any] | None]:
     """
     Fetch repository metadata. Returns (repo_data, None) on success;
     (None, {"status": int, "message": str}) on failure for meaningful API responses.
     """
     headers = await self._get_auth_headers(
-        installation_id=installation_id, user_token=user_token, allow_anonymous=True
+        installation_id=installation_id, user_token=user_token
     )
     if not headers:
-        return None
+        return (
+            None,
+            {"status": 401, "message": "Authentication required. Provide github_token or installation_id in the request."},
+        )
     url = f"{config.github.api_base_url}/repos/{repo_full_name}"
     session = await self._get_session()
     async with session.get(url, headers=headers) as response:
         if response.status == 200:
             data = await response.json()
-            return cast("dict[str, Any]", data)
-    return None
+            return cast("dict[str, Any]", data), None
+        try:
+            body = await response.json()
+            gh_message = body.get("message", "") if isinstance(body, dict) else ""
+        except Exception:
+            gh_message = ""
+        if response.status == 404:
+            msg = gh_message or "Repository not found or access denied. Check repo name and token permissions."
+            return None, {"status": 404, "message": msg}
+        if response.status == 403:
+            msg = "GitHub API rate limit exceeded. Try again later or provide github_token for higher limits."
+            if gh_message and "rate limit" in gh_message.lower():
+                msg = gh_message
+            return None, {"status": 403, "message": msg}
+        if response.status == 401:
+            return (
+                None,
+                {"status": 401, "message": gh_message or "Invalid or expired token. Check github_token or installation_id."},
+            )
+        return None, {"status": response.status, "message": gh_message or f"GitHub API returned {response.status}."}
```
🛠️ Refactor suggestion | 🟠 Major
Use validated models instead of tuple-of-raw-dicts for `get_repository`.
The new `(repo_data, error_info)` contract is shape-unsafe (`dict[str, Any]`) and easy to misuse. Please return a Pydantic response model (typed success/error branches) so callers can rely on validated fields. As per coding guidelines, "All agent outputs and external payloads must use validated BaseModel from Pydantic".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/integrations/github/api.py` around lines 132 - 169, get_repository
currently returns a tuple of raw dicts which is shape-unsafe; change it to
return validated Pydantic models (e.g., RepositoryResponse for success and
ApiErrorResponse for failures) instead of tuple[dict[str, Any] | None, dict[str,
Any] | None]. Update the function signature and all return sites inside
get_repository (including the authentication error branch using
_get_auth_headers and all response.status branches after _get_session().get) to
instantiate and return the appropriate BaseModel instances with the same fields
(status, message, and repo data mapped into RepositoryResponse fields), and
adjust callers to consume the typed models rather than raw dicts. Ensure models
are defined as Pydantic BaseModel subclasses and include parsing/validation of
the JSON body (use the model.parse_obj or model(...) constructors) before
returning.
```python
async def _resolve_tree_sha(self, repo_full_name: str, ref: str, headers: dict[str, str]) -> str | None:
    """Resolve the SHA of the tree for the given ref (commit SHA from ref -> tree SHA from commit)."""
    session = await self._get_session()
    ref_url = f"{config.github.api_base_url}/repos/{repo_full_name}/git/ref/heads/{ref}"
    async with session.get(ref_url, headers=headers) as response:
        if response.status != 200:
            return None
        data = await response.json()
    commit_sha = data.get("object", {}).get("sha") if isinstance(data, dict) else None
    if not commit_sha:
        return None
    commit_url = f"{config.github.api_base_url}/repos/{repo_full_name}/git/commits/{commit_sha}"
    async with session.get(commit_url, headers=headers) as response:
        if response.status != 200:
            return None
        commit_data = await response.json()
    tree_sha = commit_data.get("tree", {}).get("sha") if isinstance(commit_data, dict) else None
    return tree_sha
```
🧩 Analysis chain

The following verification scripts were run against the `warestack/watchflow` repository:

```shell
# First, let's find and examine the file
find . -name "api.py" -path "*/github/*" -type f

# Get the content of the relevant file section
head -300 src/integrations/github/api.py | tail -100

# Search for _resolve_tree_sha usage and context
rg "_resolve_tree_sha" -A 5 -B 5

# Check for any tests or usage examples
rg "resolve_tree_sha|_resolve_tree_sha" -l

# Look at the test file for _resolve_tree_sha
cat tests/unit/integrations/github/test_api.py | head -500

# Search for any GitHub API endpoint documentation or comments
rg "git/ref|git/commits|/commits/" -B 2 -A 2

# Check if there are any tests that verify the behavior with different ref types
rg "resolve_tree_sha" -A 20

# Look for any usage with tag refs or SHA refs
rg "refs/tags|refs/heads"

# Search for any GitHub API documentation comments in the codebase
rg "github.com/rest|api.github.com|GitHub API" -B 2 -A 2

# Check how callers pass refs to _resolve_tree_sha
rg "get_repository_tree" -B 5 -A 15

# Verify the issue: check if tags or full SHAs could be passed to _resolve_tree_sha
rg "get_repository_tree\(" -B 3 -A 3

# Check if there are any error cases or comments about ref handling
rg "refs/tags|tag|commit.*sha|commit.*SHA" -B 2 -A 2

# Search for any existing tests or mocks that show the response structure
rg "commit_data|tree_sha" -B 5 -A 5

# Check if there are comments or documentation about the API response structures
rg "object.*sha|commit.*tree" -B 3 -A 3

# Look for any existing usage patterns in similar methods
rg "git/commits|/commits/" -B 2 -A 8 | head -100

# Check what refs are actually passed to _resolve_tree_sha or get_repository_tree
rg "get_repository_tree\|_resolve_tree_sha" -B 10 -A 5 | grep -E "ref=|get_repository_tree|_resolve_tree_sha" | head -40

# Look at the _ref_to_branch function to understand ref normalization
rg "_ref_to_branch" -B 5 -A 10

# Check if there are any cases where refs with prefixes might be passed
rg "refs/tags\|refs/heads/\|commit.*sha" -B 3 -A 3 | head -60

# Let me check the actual response structure for both endpoints using documentation comments
rg "git/ref|commits/" -B 5 -A 10 | head -80

# Check if there are any direct SHA passes to get_repository_tree
rg "get_repository_tree\(" -B 15 | grep -E "ref=|sha|commit"

# Verify the actual implementation one more time
cat -n src/integrations/github/api.py | sed -n '224,241p'

# Check docstring for get_file_content to see what refs it claims to support
rg -A 5 "Fetches the content of a file"

# Check if there are any comments or documentation about ref parameter handling
rg "branch name.*tag.*commit|commit SHA" -B 2 -A 2

# Check if there are any other uses of the /commits/ endpoint to see response structure
rg "/commits/" -B 3 -A 8 | grep -E "commit|tree|sha|get\(|\.get"

# Check for any integration tests or mocks that show response structures
rg "commit.*tree\|tree.*commit" -B 2 -A 2

# See the full context of how get_repository_tree is being called
cat -n src/integrations/github/api.py | sed -n '208,223p'
```
Resolve ref resolution to handle branches, tags, and commit SHAs.
`_resolve_tree_sha` cannot resolve tags or commit SHAs; it only handles `heads/{ref}`. This conflicts with `get_file_content`'s documented support for "branch name, tag, or commit SHA". Replace the two-call `git/ref/heads` + `git/commits` approach with the single `/repos/{owner}/{repo}/commits/{ref}` endpoint, which accepts all ref types and returns the tree SHA directly.
Suggested fix:

```diff
-    ref_url = f"{config.github.api_base_url}/repos/{repo_full_name}/git/ref/heads/{ref}"
-    async with session.get(ref_url, headers=headers) as response:
-        if response.status != 200:
-            return None
-        data = await response.json()
-    commit_sha = data.get("object", {}).get("sha") if isinstance(data, dict) else None
-    if not commit_sha:
-        return None
-    commit_url = f"{config.github.api_base_url}/repos/{repo_full_name}/git/commits/{commit_sha}"
+    commit_url = f"{config.github.api_base_url}/repos/{repo_full_name}/commits/{ref}"
     async with session.get(commit_url, headers=headers) as response:
         if response.status != 200:
             return None
         commit_data = await response.json()
-    tree_sha = commit_data.get("tree", {}).get("sha") if isinstance(commit_data, dict) else None
+    tree_sha = commit_data.get("commit", {}).get("tree", {}).get("sha") if isinstance(commit_data, dict) else None
     return tree_sha
```

🤖 Prompt for AI Agents
return tree_sha🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/integrations/github/api.py` around lines 224 - 241, _resolve_tree_sha
currently only queries git/ref/heads/{ref} and cannot resolve tags or raw commit
SHAs; update it to call the single commits endpoint
(/repos/{owner}/{repo}/commits/{ref}) which accepts branch names, tags, and
commit SHAs and returns the commit object with a tree.sha. In function
_resolve_tree_sha use self._get_session() and session.get to fetch
f"{config.github.api_base_url}/repos/{repo_full_name}/commits/{ref}" with the
provided headers, check for a 200 response, parse JSON into commit_data, and
return commit_data.get("tree", {}).get("sha") or None on failure; keep existing
error/None semantics and ensure you remove the previous git/ref/heads +
git/commits two-step logic so get_file_content's documented ref support works.
```python
logger.warning(
    "deterministic_mapping_matched statement=%r pattern=%r",
    statement[:100],
    p,
)
return dict(rule_dict)
```
Incorrect log level for successful operation.
`logger.warning` is used to log a successful deterministic mapping match. This should be `logger.debug` or `logger.info`, since it is not a warning condition.
Proposed fix:

```diff
 if p in lower:
-    logger.warning(
+    logger.debug(
         "deterministic_mapping_matched statement=%r pattern=%r",
         statement[:100],
         p,
     )
     return dict(rule_dict)
```
return dict(rule_dict)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| logger.warning( | |
| "deterministic_mapping_matched statement=%r pattern=%r", | |
| statement[:100], | |
| p, | |
| ) | |
| return dict(rule_dict) | |
| logger.debug( | |
| "deterministic_mapping_matched statement=%r pattern=%r", | |
| statement[:100], | |
| p, | |
| ) | |
| return dict(rule_dict) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/rules/ai_rules_scan.py` around lines 304 - 309, The log call indicating a
successful deterministic mapping match is using logger.warning; change that to a
non-warning level (prefer logger.debug) so successful matches are not treated as
warnings: in the block that logs "deterministic_mapping_matched statement=%r
pattern=%r" (just before the return dict(rule_dict)) replace logger.warning(...)
with logger.debug(...) keeping the same message and arguments (statement[:100],
p) to preserve context.
```python
try:
    agent = get_feasibility_agent()
    result = await agent.execute(rule_description=st)
    if result.success and result.data.get("is_feasible") and result.data.get("yaml_content"):
        yaml_content = result.data["yaml_content"].strip()
        parsed = yaml.safe_load(yaml_content)
        if isinstance(parsed, dict) and "rules" in parsed and isinstance(parsed["rules"], list):
            for r in parsed["rules"]:
                if isinstance(r, dict):
                    all_rules.append(r)
                    rule_sources.append("agent")
        else:
            ambiguous.append({"statement": st, "path": path, "reason": "Feasibility agent returned invalid YAML"})
    else:
        ambiguous.append({"statement": st, "path": path, "reason": result.message or "Not feasible"})
except Exception as e:
    ambiguous.append({"statement": st, "path": path, "reason": str(e)})
```
Missing confidence gating for agent output.
Per coding guidelines: "Agent outputs must include: decision, confidence (0..1), short reasoning" and "Implement confidence policy: reject or route to human-in-the-loop when confidence < 0.5". The feasibility agent result is used without checking a confidence threshold.
Proposed fix:

```diff
 try:
     agent = get_feasibility_agent()
     result = await agent.execute(rule_description=st)
-    if result.success and result.data.get("is_feasible") and result.data.get("yaml_content"):
+    confidence = result.data.get("confidence", 0.0) if result.data else 0.0
+    if confidence < 0.5:
+        ambiguous.append({"statement": st, "path": path, "reason": f"Low confidence ({confidence})"})
+        continue
+    if result.success and result.data.get("is_feasible") and result.data.get("yaml_content"):
         yaml_content = result.data["yaml_content"].strip()
```

🤖 Prompt for AI Agents
yaml_content = result.data["yaml_content"].strip()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/rules/ai_rules_scan.py` around lines 353 - 369, The feasibility agent
output is used without verifying its required fields or confidence; update the
block that calls get_feasibility_agent() and agent.execute(...) to validate that
result.data contains the required keys ("decision"/"is_feasible", "confidence",
"reasoning") and only accept the YAML when result.success is true,
result.data["is_feasible"] is truthy, yaml_content exists, and
result.data["confidence"] >= 0.5; if confidence is missing or < 0.5 or required
keys are absent, append to ambiguous with a clear reason (e.g., "low confidence"
or "missing agent fields") including the reported confidence/decision, otherwise
proceed to parse yaml_content and add rules to all_rules as before.
Codecov Report

❌ Patch coverage is 58.2%. Your patch status has failed because the patch coverage (58.2%) is below the target coverage (80.0%). You can increase the patch coverage or adjust the target coverage.

```
@@           Coverage Diff           @@
##            main      #60    +/-  ##
=======================================
+ Coverage   67.0%    68.6%   +1.6%
=======================================
  Files        151      157      +6
  Lines       9384    10247    +863
=======================================
+ Hits        6290     7037    +747
- Misses      3094     3210    +116
```

Continue to review the full report in Codecov by Sentry.
dkargatzis left a comment
Thanks for picking up this issue! You've built a very solid structural foundation here, especially with the clean integration into the event processors and the pragmatic reuse of the existing feasibility agent for translation.
However, based on the issue description and your attached PDF, there are a few key areas where the implementation deviates from the "agentic" vision of the feature and needs adjustment:
1. The Extractor: Deterministic vs. Agentic
In your PDF, you noted: "No LLM, no graph, no chain — just one synchronous function." While keeping things simple is usually a great principle, in this specific case, it contradicts the core requirement of the issue: "Build a LangGraph chain/agent: Extractor Agent: Read MD content -> pull out rule-like statements (prompt engineering here is key)."
Markdown guidelines (like .cursorrules or .github/copilot-instructions.md) are rarely structured predictably. They are written in paragraphs, bullet points, or complex sentences. A simple synchronous function that splits by newlines and looks for exact substring matches like "always use" or "Cursor rule:" is too brittle and will miss implicit rules or lose context.
Recommendation: Please replace the synchronous string-matching Extractor with an actual LLM-powered Extractor Agent. It should take the raw markdown, read it holistically, understand the context, and return a structured list of rule statements.
2. Missing Core Feature: Pre-Merge Enforcement
Currently, in src/event_processors/pull_request/processor.py, the dynamically translated rules are generated and then simply logged to the console via logger.info. The issue explicitly calls for enforcing these as pre-merge checks. Logging the rules does not block bad PRs or add comments for violations, meaning the feature isn't actually functioning as a guardrail yet.
Recommendation: Parse the dynamically generated rules_yaml into rule objects and append them to the existing repository rules (typically fetched via rule_provider.get_rules()) before the PR evaluation engine runs. This ensures Watchflow will automatically evaluate them, leave comments, and block the merge if they fail.
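The merge step described above could be sketched roughly as follows. This is a hypothetical helper; the real integration point with `rule_provider.get_rules()` and the evaluation engine will differ:

```python
import yaml

def merge_translated_rules(existing_rules: list[dict], rules_yaml: str) -> list[dict]:
    """Append dynamically translated rules (YAML text) to the repo's existing rules."""
    try:
        parsed = yaml.safe_load(rules_yaml)
    except yaml.YAMLError:
        return existing_rules  # fall back to the static rules on bad YAML
    extra = parsed.get("rules", []) if isinstance(parsed, dict) else []
    merged = list(existing_rules)
    for rule in extra:
        if isinstance(rule, dict) and rule not in merged:  # skip exact duplicates
            merged.append(rule)
    return merged
```

The merged list would then be handed to the existing PR evaluation engine so the dynamic rules are enforced like any other.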
3. Missing Bonus Feature: The Self-Improving Loop (Push Event)
In src/event_processors/push.py, the processor detects when an AI rule file is pushed and translates it, but again only logs the output. The issue asks for a self-improving loop: "if an MD rules file changes -> auto-re-run translator -> suggest/commit updated .watchflow/rules.yaml".
Recommendation: If a push event modifies an AI rule file, use the GitHub API client to either directly update the .watchflow/rules.yaml file on the default branch or (preferably) open a new Pull Request against the repository with the newly translated YAML rules so the team can review the auto-generated constraints.
4. Performance: Sequential I/O
In scan_repo_for_ai_rule_files, the code loops over candidate files and awaits get_file_content sequentially. For repositories with many files in a .cursor/rules/ directory, fetching files one by one will create high latency and could cause the webhook processing to time out.
Recommendation: Use asyncio.gather paired with an asyncio.Semaphore to fetch the contents of all candidate files concurrently while respecting GitHub's rate limits. (Note: The CodeRabbit AI bot actually caught this one as well).
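A bounded-concurrency fetch along these lines might look like the sketch below. `get_file_content` here stands in for the real client method, and the concurrency limit of 5 is an assumption, not a measured value:

```python
import asyncio

async def fetch_all_contents(paths, get_file_content, max_concurrency=5):
    """Fetch candidate file contents concurrently, bounded by a semaphore."""
    sem = asyncio.Semaphore(max_concurrency)

    async def fetch_one(path):
        async with sem:  # at most max_concurrency requests in flight
            try:
                return path, await get_file_content(path)
            except Exception:
                return path, None  # treat per-file failures as "no content"

    results = await asyncio.gather(*(fetch_one(p) for p in paths))
    return dict(results)
```

Keyword scanning can then run over the returned mapping exactly as the sequential version does today.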
Great start overall! Looking forward to seeing the LLM Extractor Agent in action.
I implemented 2 features.
I attached the document here:
PR_from_Ryan.pdf