Conversation
🛡️ Watchflow Governance Checks

Status: ❌ 3 Violations Found | 🟡 Medium Severity (3)

- Checks the PR description (body) and title for a linked issue reference (e.g. `#123`, `Fixes #123`, `Closes #456`); use when the rule requires issue refs in either field. Violation: PR does not reference a linked issue (e.g. `#123` or `closes #123` in body/title).
- Validates that total lines changed (additions + deletions) in a PR do not exceed a maximum; enforces a maximum LOC per pull request. Violation: Pull request exceeds maximum lines changed (1209 > 500).
- Validates that the PR title matches a specific pattern. Violation: PR title 'Feature/agentic guidelines' does not match required pattern `^feat|^fix|^docs|^style|^refactor|^test|^chore|^perf|^ci|^build|^revert`.
📝 Walkthrough

This PR introduces AI-based scanning and translation of AI rule files stored in repositories. It adds new API endpoints for scanning repositories and translating identified files into Watchflow YAML format, extends the GitHub API client with tree retrieval capabilities, integrates scanning into PR and push event processors, and provides comprehensive pattern matching and rule extraction logic.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor EventSource as Push/PR Event
    participant Processor as Event Processor
    participant Relevance as Relevance Check
    participant Orchestrator as get_suggested_rules_from_repo
    participant GitHub as GitHub API
    participant Scanner as AI Rules Scanner
    participant Translator as YAML Translator
    EventSource->>Processor: Event triggered (push/PR)
    Processor->>Relevance: is_relevant_push/is_relevant_pr(payload)
    Relevance-->>Processor: true/false
    alt Event is Relevant
        Processor->>Orchestrator: get_suggested_rules_from_repo(...)
        Orchestrator->>GitHub: get_repository(repo_url)
        GitHub-->>Orchestrator: repo_data
        Orchestrator->>GitHub: get_repository_tree(repo_full_name, ref)
        GitHub-->>Orchestrator: tree_entries
        Orchestrator->>Scanner: scan_repo_for_ai_rule_files(tree_entries, fetch_content)
        Scanner->>GitHub: get_file_content(path, ref) [for each candidate]
        GitHub-->>Scanner: file_content
        Scanner-->>Orchestrator: candidates with content/keywords
        Orchestrator->>Translator: translate_ai_rule_files_to_yaml(candidates)
        Translator-->>Orchestrator: yaml_rules, sources, ambiguities
        Orchestrator-->>Processor: (yaml_rules, rule_count, sources, ambiguities)
        Processor->>Processor: Log rules details
    else Event is Not Relevant
        Processor->>Processor: Log skip message
    end
    Processor-->>EventSource: Processing complete
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/integrations/github/api.py (1)
1200-1235: ⚠️ Potential issue | 🟠 Major

Retry decorator incorrectly retries deterministic authentication failures.
The `@retry` decorator on `execute_graphql` (lines 1200–1235) retries all exceptions, including the deterministic auth-failure path at line 35. When `_get_auth_headers` returns `None`, raising `Exception("Authentication required for GraphQL query.")` will be retried 3 times before finally failing, wasting latency and resources. Restrict retries to transient transport failures only: use `retry_if_exception_type` to exclude non-retriable errors like authentication failures, and raise `PermissionError` instead of a generic `Exception`.

Note: the method uses `aiohttp.ClientSession` (not `httpx`), so conditional retries should target `aiohttp.ClientError` and related transport exceptions.

Suggested fix:
```diff
-from tenacity import retry, stop_after_attempt, wait_exponential
+from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

-@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=4, max=10))
+@retry(
+    stop=stop_after_attempt(3),
+    wait=wait_exponential(multiplier=1, min=4, max=10),
+    retry=retry_if_exception_type((aiohttp.ClientError, TimeoutError)),
+    reraise=True,
+)
 async def execute_graphql(
@@
         if not headers:
             logger.error("GraphQL execution failed: No authentication headers available.")
-            raise Exception("Authentication required for GraphQL query.")
+            raise PermissionError("Authentication required for GraphQL query.")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/integrations/github/api.py` around lines 1200 - 1235, The execute_graphql retry currently retries all exceptions (including deterministic auth failures); update the retry decorator to only retry transient transport errors by using retry_if_exception_type targeting aiohttp.ClientError (and keep stop_after_attempt/ wait_exponential as is) and ensure retry imports include retry_if_exception_type; inside execute_graphql replace the generic raise Exception("Authentication required for GraphQL query.") with raise PermissionError("Authentication required for GraphQL query.") so authentication failures are not retried; add any necessary imports for aiohttp.ClientError and retry_if_exception_type and update _get_auth_headers handling accordingly.
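The behavior the fix asks for can be hard to picture in isolation. Below is a minimal, self-contained sketch (plain stdlib, no tenacity) of restricting retries to transient errors: transient failures are retried, while deterministic ones such as `PermissionError` propagate immediately. The decorator name and the choice of marker exception types are illustrative, not part of the project.

```python
import asyncio

def retry_only(exc_types: tuple, attempts: int = 3):
    """Retry an async callable, but only for the given exception types.

    Mirrors what tenacity's retry_if_exception_type achieves: anything
    outside exc_types (e.g. PermissionError) fails fast on the first call.
    """
    def decorator(fn):
        async def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return await fn(*args, **kwargs)
                except exc_types:
                    if attempt == attempts:
                        raise
        return wrapper
    return decorator

calls = {"flaky": 0, "auth": 0}

@retry_only((ConnectionError,))
async def flaky():
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise ConnectionError("transient network blip")  # retried
    return "ok"

@retry_only((ConnectionError,))
async def no_auth():
    calls["auth"] += 1
    raise PermissionError("Authentication required")  # not in exc_types

result = asyncio.run(flaky())   # succeeds on the third attempt
try:
    asyncio.run(no_auth())
except PermissionError:
    pass                        # raised after exactly one call, no retries
```

With tenacity itself, the equivalent is `retry=retry_if_exception_type((aiohttp.ClientError, TimeoutError))` plus `reraise=True`, as shown in the suggested fix.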
🧹 Nitpick comments (9)
src/api/recommendations.py (3)
22-22: Duplicate `yaml` import.

The `yaml` module is imported here at module level, but it's also imported again at line 658 inside the `recommend_rules` function. Remove the duplicate import inside the function.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/api/recommendations.py` at line 22, Remove the duplicate import of the yaml module inside the recommend_rules function: the module-level import "import yaml" already provides yaml to the module, so delete the inner "import yaml" statement found within the recommend_rules function (keep any code that uses yaml unchanged).
496-552: Consider adding concurrency bounds for multiple file fetches.

`get_suggested_rules_from_repo` fetches content for multiple files via `scan_repo_for_ai_rule_files`. Per coding guidelines, bound fan-out with `asyncio.Semaphore` to prevent overwhelming the GitHub API when scanning repositories with many candidate files.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/api/recommendations.py` around lines 496 - 552, Add a concurrency bound when fetching file contents in get_suggested_rules_from_repo by creating an asyncio.Semaphore (e.g., MAX_CONCURRENT_REQUESTS) near the start of the function and using it inside the get_content closure so each call to github_client.get_file_content acquires the semaphore (use "async with semaphore" or acquire/release in try/finally) before performing the network call; alternatively, pass the semaphore or a bounded fetch helper into scan_repo_for_ai_rule_files so that fetch_content/get_file_content calls are limited, preventing unbounded fan-out to the GitHub API.
550-552: Logging lacks structured fields.

Per coding guidelines, use structured logging at boundaries with fields: `operation`, `subject_ids`, `decision`, `latency_ms`. The warning logs `repo` and `error` but should include `operation` for consistency.

Proposed fix:
```diff
-        logger.warning("get_suggested_rules_from_repo_failed", repo=repo_full_name, error=str(e))
+        logger.warning("get_suggested_rules_from_repo_failed", operation="agentic_scan", repo=repo_full_name, error=str(e))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/api/recommendations.py` around lines 550 - 552, The except block in get_suggested_rules_from_repo is missing the required structured logging fields; update the logger.warning call to include operation="get_suggested_rules_from_repo", subject_ids=[repo_full_name] (or appropriate list of subject ids), decision="failure", and latency_ms (e.g., 0 or None) alongside the existing error field so the log conforms to the structured logging boundary contract; keep the existing return tuple unchanged.

tests/unit/rules/test_ai_rules_scan.py (1)
1-20: Consider adding tests for `extract_rule_statements_from_markdown` and `translate_ai_rule_files_to_yaml`.

The current test suite covers the scanning utilities well but doesn't test the statement extraction and translation pipeline, which are core to the feature. Based on learnings, write unit tests for deterministic rule evaluation.
Would you like me to help generate test cases for these functions?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/rules/test_ai_rules_scan.py` around lines 1 - 20, Add unit tests for extract_rule_statements_from_markdown and translate_ai_rule_files_to_yaml to cover deterministic statement extraction from Markdown and correct translation into YAML rule objects: write tests calling extract_rule_statements_from_markdown with Markdown inputs containing multiple rule sections, fenced code blocks, and edge cases (no rules, malformed markers) asserting the returned list of statements and their order; and write tests for translate_ai_rule_files_to_yaml that pass sample file objects (filename + content) and assert resulting YAML structure, field mappings, and behavior when content_has_ai_keywords is false/true. Reference the functions extract_rule_statements_from_markdown and translate_ai_rule_files_to_yaml and reuse existing fixtures/helpers (e.g., sample tree entries) to keep tests deterministic and isolated from network I/O.

src/rules/ai_rules_scan.py (2)
119-152: Content fetches are sequential; consider parallel fetching.

The `scan_repo_for_ai_rule_files` function fetches content for each candidate file sequentially. For repositories with many candidate files, this could be slow. Consider using `asyncio.gather` with a semaphore to fetch content in parallel while respecting rate limits.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/rules/ai_rules_scan.py` around lines 119 - 152, The loop in scan_repo_for_ai_rule_files currently awaits get_file_content sequentially; change it to schedule per-entry async tasks and run them with asyncio.gather while using an asyncio.Semaphore to limit concurrency; implement a small inner async helper (e.g., fetch_and_check(path)) that acquires the semaphore, calls await get_file_content(path), computes has_keywords via content_has_ai_keywords(content), handles exceptions and returns the dict with "path","has_keywords","content", then replace the for-loop with creating tasks for each candidate and awaiting asyncio.gather on them to populate results while preserving the existing logging on exceptions.
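The bounded fan-out shape described above can be sketched as follows. `scan_candidates`, the fake fetcher, and the keyword check are hypothetical stand-ins for the project's `scan_repo_for_ai_rule_files` and `get_file_content`; the point is only the semaphore plus `asyncio.gather` structure.

```python
import asyncio

async def scan_candidates(paths, get_file_content, max_concurrency=5):
    """Fetch candidate files in parallel, bounded by a semaphore."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch_and_check(path):
        async with semaphore:
            try:
                content = await get_file_content(path)
            except Exception:
                # In real code, log and skip rather than failing the whole scan.
                return {"path": path, "content": None, "has_keywords": False}
        # Illustrative keyword check, not the project's actual heuristic.
        has_keywords = "cursor" in content.lower() or "claude" in content.lower()
        return {"path": path, "content": content, "has_keywords": has_keywords}

    # gather preserves input order, so results line up with paths.
    return await asyncio.gather(*(fetch_and_check(p) for p in paths))

async def fake_fetch(path):
    """Stand-in for a GitHub content fetch; no network involved."""
    return "Claude: always use type hints" if path.endswith(".md") else "plain"

results = asyncio.run(scan_candidates([".cursorrules", "CLAUDE.md"], fake_fetch))
```

Only the fetch itself is held under the semaphore; the cheap keyword check runs outside it, so slots free up as soon as the I/O completes.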
7-14: Use `structlog` for consistency with other modules.

Other modules in this codebase use `structlog` for structured logging (e.g., `src/api/recommendations.py`). Consider using `structlog.get_logger()` for consistency and structured log output.

Proposed fix:
```diff
-import logging
+import structlog
 import re
 from collections.abc import Awaitable, Callable
 from typing import Any, cast
 from src.core.utils.patterns import matches_any
 import yaml

-logger = logging.getLogger(__name__)
+logger = structlog.get_logger()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/rules/ai_rules_scan.py` around lines 7 - 14, Replace the stdlib logging usage with structlog: import structlog and replace the module-level logger = logging.getLogger(__name__) with logger = structlog.get_logger() (and remove or stop using the logging import). Update any calls that assume a stdlib Logger API if needed so they use the structlog logger in this module (ai_rules_scan.py) and keep the symbol name logger for consistency with other modules like src/api/recommendations.py.

tests/integration/test_scan_ai_files.py (1)
71-71: Add trailing newline.

File should end with a newline character per POSIX conventions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_scan_ai_files.py` at line 71, The file tests/integration/test_scan_ai_files.py is missing a trailing newline; open that test file and add a single newline character at the end of the file (ensure the file ends with '\n' when saved) so it conforms to POSIX conventions and prevents diff/tooling warnings.

src/integrations/github/api.py (2)
263-267: Prefer `params` over manual query-string concatenation for `ref`.

Building `?ref=...` inline is brittle and bypasses URL-encoding safeguards. Use `session.get(..., params={"ref": ref})`.

Suggested fix:
```diff
 url = f"{config.github.api_base_url}/repos/{repo_full_name}/contents/{file_path}"
-if ref:
-    url = f"{url}?ref={ref}"
+params = {"ref": ref} if ref else None
 session = await self._get_session()
-async with session.get(url, headers=headers) as response:
+async with session.get(url, headers=headers, params=params) as response:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/integrations/github/api.py` around lines 263 - 267, The code concatenates "?ref=..." onto the URL before calling session.get, which bypasses URL-encoding and is brittle; change the session.get call (the async with session.get(url, headers=headers) in the method that calls self._get_session()) to pass the ref via the params parameter (e.g., params={"ref": ref}) and remove the manual string concatenation so URL encoding is handled by the HTTP client.
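The encoding hazard is easy to demonstrate with the stdlib alone. A hypothetical ref containing reserved characters survives `urlencode` (which is what passing `params=` to the HTTP client does for you) but is silently mangled by string concatenation:

```python
from urllib.parse import urlencode, urlsplit, parse_qs

base = "https://api.github.com/repos/acme/widgets/contents/docs/guide.md"
ref = "feature/agentic#wip"  # illustrative ref with reserved characters

# Manual concatenation leaves reserved characters unencoded:
manual = f"{base}?ref={ref}"

# Encoding the query properly (what aiohttp's params=... does for you):
encoded = f"{base}?{urlencode({'ref': ref})}"

# In the manual URL, '#' starts a fragment and silently truncates the ref:
assert urlsplit(manual).query == "ref=feature/agentic"
assert urlsplit(manual).fragment == "wip"

# The encoded URL round-trips the ref intact:
assert parse_qs(urlsplit(encoded).query)["ref"] == ["feature/agentic#wip"]
```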
207-210: Avoid hardcoding `main` when `ref` is omitted.

Defaulting to `"main"` will fail on repositories whose default branch is different (e.g., `master`, `develop`). Resolve `default_branch` first (or require an explicit `ref`) before the tree lookup.

Suggested fix:
```diff
-        ref = ref or "main"
+        if not ref:
+            repo_data, _ = await self.get_repository(
+                repo_full_name,
+                installation_id=installation_id,
+                user_token=user_token,
+            )
+            if not repo_data:
+                return []
+            ref = cast("str", repo_data.get("default_branch") or "main")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/integrations/github/api.py` around lines 207 - 210, The code currently hardcodes ref = "main" before calling _resolve_tree_sha; instead retrieve the repository's default branch via the repo metadata and use that when ref is not provided (or raise if you want to require an explicit ref). Update the logic around the ref variable in the function that calls _resolve_tree_sha to first fetch the repo's default_branch (e.g., via an existing repo metadata method or a new API call), set ref = default_branch when ref is falsy, then call await self._resolve_tree_sha(repo_full_name, ref, headers); ensure error handling covers missing metadata and preserves the previous return [] behavior if no tree_sha is resolved.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/api/recommendations.py`:
- Around line 1116-1121: Remove the fragile string-based counting and the bare
except: stop computing rules_count with rules_yaml.count(...) and instead
initialize rules_count = 0, then call yaml.safe_load(rules_yaml) and set
rules_count = len(parsed.get("rules", [])) if isinstance(parsed, dict) else 0;
catch only yaml.YAMLError (or ValueError) and handle it explicitly (e.g., log a
warning via the module logger or return a validation error) rather than using a
bare `except Exception: pass`, and if you must catch a generic Exception
re-raise it after logging so errors are not silently swallowed; update the block
that uses rules_yaml, parsed, yaml.safe_load, and rules_count accordingly.
- Around line 544-548: The try/except around yaml.safe_load that sets parsed and
rules_count is swallowing all exceptions; change it to catch only expected
parsing errors (e.g., yaml.YAMLError and ValueError) and handle them
explicitly—log the error via the appropriate logger or re-raise instead of using
a bare except/pass—so update the try/except block around
yaml.safe_load(rules_yaml) and the parsed/rules_count assignment to catch
yaml.YAMLError (and optionally ValueError) and then log the exception context
(or raise) rather than silently passing.
In `@src/integrations/github/api.py`:
- Around line 132-169: get_repository currently returns a tuple of raw dicts
which is shape-unsafe; change it to return validated Pydantic models (e.g.,
RepositoryResponse for success and ApiErrorResponse for failures) instead of
tuple[dict[str, Any] | None, dict[str, Any] | None]. Update the function
signature and all return sites inside get_repository (including the
authentication error branch using _get_auth_headers and all response.status
branches after _get_session().get) to instantiate and return the appropriate
BaseModel instances with the same fields (status, message, and repo data mapped
into RepositoryResponse fields), and adjust callers to consume the typed models
rather than raw dicts. Ensure models are defined as Pydantic BaseModel
subclasses and include parsing/validation of the JSON body (use the
model.parse_obj or model(...) constructors) before returning.
- Around line 224-241: _resolve_tree_sha currently only queries
git/ref/heads/{ref} and cannot resolve tags or raw commit SHAs; update it to
call the single commits endpoint (/repos/{owner}/{repo}/commits/{ref}) which
accepts branch names, tags, and commit SHAs and returns the commit object with a
tree.sha. In function _resolve_tree_sha use self._get_session() and session.get
to fetch f"{config.github.api_base_url}/repos/{repo_full_name}/commits/{ref}"
with the provided headers, check for a 200 response, parse JSON into
commit_data, and return commit_data.get("tree", {}).get("sha") or None on
failure; keep existing error/None semantics and ensure you remove the previous
git/ref/heads + git/commits two-step logic so get_file_content's documented ref
support works.
In `@src/rules/ai_rules_scan.py`:
- Around line 304-309: The log call indicating a successful deterministic
mapping match is using logger.warning; change that to a non-warning level
(prefer logger.debug) so successful matches are not treated as warnings: in the
block that logs "deterministic_mapping_matched statement=%r pattern=%r" (just
before the return dict(rule_dict)) replace logger.warning(...) with
logger.debug(...) keeping the same message and arguments (statement[:100], p) to
preserve context.
- Around line 353-369: The feasibility agent output is used without verifying
its required fields or confidence; update the block that calls
get_feasibility_agent() and agent.execute(...) to validate that result.data
contains the required keys ("decision"/"is_feasible", "confidence", "reasoning")
and only accept the YAML when result.success is true, result.data["is_feasible"]
is truthy, yaml_content exists, and result.data["confidence"] >= 0.5; if
confidence is missing or < 0.5 or required keys are absent, append to ambiguous
with a clear reason (e.g., "low confidence" or "missing agent fields") including
the reported confidence/decision, otherwise proceed to parse yaml_content and
add rules to all_rules as before.
---
Outside diff comments:
In `@src/integrations/github/api.py`:
- Around line 1200-1235: The execute_graphql retry currently retries all
exceptions (including deterministic auth failures); update the retry decorator
to only retry transient transport errors by using retry_if_exception_type
targeting aiohttp.ClientError (and keep stop_after_attempt/ wait_exponential as
is) and ensure retry imports include retry_if_exception_type; inside
execute_graphql replace the generic raise Exception("Authentication required for
GraphQL query.") with raise PermissionError("Authentication required for GraphQL
query.") so authentication failures are not retried; add any necessary imports
for aiohttp.ClientError and retry_if_exception_type and update _get_auth_headers
handling accordingly.
---
Nitpick comments:
In `@src/api/recommendations.py`:
- Line 22: Remove the duplicate import of the yaml module inside the
recommend_rules function: the module-level import "import yaml" already provides
yaml to the module, so delete the inner "import yaml" statement found within the
recommend_rules function (keep any code that uses yaml unchanged).
- Around line 496-552: Add a concurrency bound when fetching file contents in
get_suggested_rules_from_repo by creating an asyncio.Semaphore (e.g.,
MAX_CONCURRENT_REQUESTS) near the start of the function and using it inside the
get_content closure so each call to github_client.get_file_content acquires the
semaphore (use "async with semaphore" or acquire/release in try/finally) before
performing the network call; alternatively, pass the semaphore or a bounded
fetch helper into scan_repo_for_ai_rule_files so that
fetch_content/get_file_content calls are limited, preventing unbounded fan-out
to the GitHub API.
- Around line 550-552: The except block in get_suggested_rules_from_repo is
missing the required structured logging fields; update the logger.warning call
to include operation="get_suggested_rules_from_repo",
subject_ids=[repo_full_name] (or appropriate list of subject ids),
decision="failure", and latency_ms (e.g., 0 or None) alongside the existing
error field so the log conforms to the structured logging boundary contract;
keep the existing return tuple unchanged.
In `@src/integrations/github/api.py`:
- Around line 263-267: The code concatenates "?ref=..." onto the URL before
calling session.get, which bypasses URL-encoding and is brittle; change the
session.get call (the async with session.get(url, headers=headers) in the method
that calls self._get_session()) to pass the ref via the params parameter (e.g.,
params={"ref": ref}) and remove the manual string concatenation so URL encoding
is handled by the HTTP client.
- Around line 207-210: The code currently hardcodes ref = "main" before calling
_resolve_tree_sha; instead retrieve the repository's default branch via the repo
metadata and use that when ref is not provided (or raise if you want to require
an explicit ref). Update the logic around the ref variable in the function that
calls _resolve_tree_sha to first fetch the repo's default_branch (e.g., via an
existing repo metadata method or a new API call), set ref = default_branch when
ref is falsy, then call await self._resolve_tree_sha(repo_full_name, ref,
headers); ensure error handling covers missing metadata and preserves the
previous return [] behavior if no tree_sha is resolved.
In `@src/rules/ai_rules_scan.py`:
- Around line 119-152: The loop in scan_repo_for_ai_rule_files currently awaits
get_file_content sequentially; change it to schedule per-entry async tasks and
run them with asyncio.gather while using an asyncio.Semaphore to limit
concurrency; implement a small inner async helper (e.g., fetch_and_check(path))
that acquires the semaphore, calls await get_file_content(path), computes
has_keywords via content_has_ai_keywords(content), handles exceptions and
returns the dict with "path","has_keywords","content", then replace the for-loop
with creating tasks for each candidate and awaiting asyncio.gather on them to
populate results while preserving the existing logging on exceptions.
- Around line 7-14: Replace the stdlib logging usage with structlog: import
structlog and replace the module-level logger = logging.getLogger(__name__) with
logger = structlog.get_logger() (and remove or stop using the logging import).
Update any calls that assume a stdlib Logger API if needed so they use the
structlog logger in this module (ai_rules_scan.py) and keep the symbol name
logger for consistency with other modules like src/api/recommendations.py.
In `@tests/integration/test_scan_ai_files.py`:
- Line 71: The file tests/integration/test_scan_ai_files.py is missing a
trailing newline; open that test file and add a single newline character at the
end of the file (ensure the file ends with '\n' when saved) so it conforms to
POSIX conventions and prevents diff/tooling warnings.
In `@tests/unit/rules/test_ai_rules_scan.py`:
- Around line 1-20: Add unit tests for extract_rule_statements_from_markdown and
translate_ai_rule_files_to_yaml to cover deterministic statement extraction from
Markdown and correct translation into YAML rule objects: write tests calling
extract_rule_statements_from_markdown with Markdown inputs containing multiple
rule sections, fenced code blocks, and edge cases (no rules, malformed markers)
asserting the returned list of statements and their order; and write tests for
translate_ai_rule_files_to_yaml that pass sample file objects (filename +
content) and assert resulting YAML structure, field mappings, and behavior when
content_has_ai_keywords is false/true. Reference the functions
extract_rule_statements_from_markdown and translate_ai_rule_files_to_yaml and
reuse existing fixtures/helpers (e.g., sample tree entries) to keep tests
deterministic and isolated from network I/O.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- src/api/recommendations.py
- src/event_processors/pull_request/processor.py
- src/event_processors/push.py
- src/integrations/github/api.py
- src/rules/ai_rules_scan.py
- tests/integration/test_scan_ai_files.py
- tests/unit/api/test_proceed_with_pr.py
- tests/unit/integrations/github/test_api.py
- tests/unit/rules/test_ai_rules_scan.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/guidelines.mdc)
**/*.py: Use modern typing only: `dict[str, Any]`, `list[str]`, `str | None` (no `Dict`, `List`, `Optional`)
GitHub/HTTP/DB calls must be `async def`; avoid blocking calls (time.sleep, sync HTTP) in async paths
All agent outputs and external payloads must use validated `BaseModel` from Pydantic
Use dataclasses for internal immutable state where appropriate
Use structured logging at boundaries with fields: `operation`, `subject_ids`, `decision`, `latency_ms`
Implement Agent pattern: single-responsibility agents with typed inputs/outputs
Use Decorator pattern for retries, metrics, caching as cross-cutting concerns
Agent outputs must include: `decision`, `confidence` (0..1), short `reasoning`, `recommendations`, `strategy_used`
Implement confidence policy: reject or route to human-in-the-loop when confidence < 0.5
Use minimal, step-driven prompts; provide Chain-of-Thought only for complexity > 0.7 or ambiguity > 0.6
Strip secrets/PII from agent prompts; scope tools; keep raw reasoning out of logs (store summaries only)
Cache idempotent lookups; lazy-import heavy dependencies; bound fan-out with `asyncio.Semaphore`
Avoid redundant LLM calls; memoize per event when safe
Use domain errors (e.g., `AgentError`) with `error_type`, `message`, `context`, `timestamp`, `retry_count`
Use exponential backoff for transient failures; circuit-break noisy integrations when needed
Fail closed for risky decisions; provide actionable remediation in error paths
Validate all external inputs; verify webhook signatures
Implement prompt-injection hardening; sanitize repository content passed to LLMs
Performance targets: Static validation ~<100ms typical, hybrid decisions sub-second when cache warm, budget LLM paths thoughtfully
Reject old typing syntax (`Dict`, `List`, `Optional`) in code review
Reject blocking calls in async code; reject bare `except:` clauses; reject swallowed errors
Reject LLM calls for trivial/deterministic checks
Reject unvalidated agent outputs and missing confidenc...
Files:
- tests/unit/api/test_proceed_with_pr.py
- tests/unit/rules/test_ai_rules_scan.py
- src/event_processors/pull_request/processor.py
- tests/unit/integrations/github/test_api.py
- tests/integration/test_scan_ai_files.py
- src/integrations/github/api.py
- src/api/recommendations.py
- src/rules/ai_rules_scan.py
- src/event_processors/push.py
tests/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/guidelines.mdc)
tests/**/*.py: Write unit tests for deterministic rule evaluation (pass/warn/block), model validation, and error paths
Write integration tests for webhook parsing, idempotency, multi-agent coordination, and state persistence
Use `pytest.mark.asyncio` for async tests; avoid live network calls; freeze time and seed randomness
Write regression tests for every bug fix; keep CI coverage thresholds green
Files:
- tests/unit/api/test_proceed_with_pr.py
- tests/unit/rules/test_ai_rules_scan.py
- tests/unit/integrations/github/test_api.py
- tests/integration/test_scan_ai_files.py
🧠 Learnings (3)
📚 Learning: 2026-01-31T19:35:22.504Z
Learnt from: CR
Repo: warestack/watchflow PR: 0
File: .cursor/rules/guidelines.mdc:0-0
Timestamp: 2026-01-31T19:35:22.504Z
Learning: Applies to tests/**/*.py : Write unit tests for deterministic rule evaluation (pass/warn/block), model validation, and error paths
Applied to files:
- tests/unit/rules/test_ai_rules_scan.py
- tests/integration/test_scan_ai_files.py
📚 Learning: 2026-01-31T19:35:22.504Z
Learnt from: CR
Repo: warestack/watchflow PR: 0
File: .cursor/rules/guidelines.mdc:0-0
Timestamp: 2026-01-31T19:35:22.504Z
Learning: Applies to tests/**/*.py : Use `pytest.mark.asyncio` for async tests; avoid live network calls; freeze time and seed randomness
Applied to files:
tests/unit/rules/test_ai_rules_scan.py
📚 Learning: 2026-01-31T19:35:22.504Z
Learnt from: CR
Repo: warestack/watchflow PR: 0
File: .cursor/rules/guidelines.mdc:0-0
Timestamp: 2026-01-31T19:35:22.504Z
Learning: Applies to tests/**/*.py : Write integration tests for webhook parsing, idempotency, multi-agent coordination, and state persistence
Applied to files:
tests/integration/test_scan_ai_files.py
🧬 Code graph analysis (6)
tests/unit/rules/test_ai_rules_scan.py (1)
src/rules/ai_rules_scan.py (4)
- content_has_ai_keywords (59-64)
- filter_tree_entries_for_ai_rules (97-113)
- path_matches_ai_rule_patterns (51-56)
- scan_repo_for_ai_rule_files (119-152)
src/event_processors/pull_request/processor.py (2)
src/api/recommendations.py (1)
- get_suggested_rules_from_repo (496-552)

src/rules/ai_rules_scan.py (1)
- is_relevant_pr (83-95)
tests/unit/integrations/github/test_api.py (1)
src/integrations/github/api.py (2)
- get_repository (130-169)
- get_repository_tree (192-221)
src/integrations/github/api.py (2)
src/tasks/task_queue.py (2)
- installation_id (41-47)
- repo_full_name (33-38)

src/core/models.py (1)
- repo_full_name (137-139)
src/rules/ai_rules_scan.py (3)
src/core/utils/patterns.py (1)
- matches_any (76-95)

src/integrations/github/api.py (1)
- get_file_content (243-281)

src/agents/factory.py (1)
- get_agent (20-52)
src/event_processors/push.py (3)
src/api/recommendations.py (1)
- get_suggested_rules_from_repo (496-552)

src/rules/ai_rules_scan.py (1)
- is_relevant_push (66-80)

src/tasks/task_queue.py (3)
- Task (19-47)
- installation_id (41-47)
- repo_full_name (33-38)
🔇 Additional comments (18)
src/api/recommendations.py (2)
145-199: LGTM!

The new Pydantic models (`ScanAIFilesRequest`, `ScanAIFilesCandidate`, `ScanAIFilesResponse`, `TranslateAIFilesRequest`, `TranslateAIFilesResponse`) are well-structured with proper type hints, field descriptions, and `Field` defaults. They comply with the coding guidelines for validated `BaseModel` usage.
486-493: LGTM!

The `_ref_to_branch` helper correctly handles edge cases (None, empty string, whitespace) and properly strips the `refs/heads/` prefix. Clean implementation.

src/rules/ai_rules_scan.py (3)
17-26: LGTM!

The `AI_RULE_FILE_PATTERNS` list comprehensively covers common AI assistant rule file locations (Cursor, Claude, Copilot conventions). The glob patterns are well-structured for both root-level and nested paths.
66-80: LGTM!

`is_relevant_push` correctly identifies relevant pushes by checking if the ref targets the default branch or if any commit touches AI rule file paths. The iteration over added/modified/removed files is thorough.
180-217: LGTM! `extract_rule_statements_from_markdown` implements a clean deterministic extraction with:
- Line prefix detection for known patterns (`Cursor rule:`, `Claude:`, etc.)
- Phrase marker detection (`always use`, `never commit`, etc.)
- Deduplication via normalization
- Reasonable line length limit (500 chars)
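The steps listed above can be sketched as a single pure function. Everything here is a hypothetical approximation: the marker strings, normalization, and limit are assumptions based on the bullet list, not the repository's actual implementation:

```python
import re

PREFIX_MARKERS = ("cursor rule:", "claude:")     # assumed known prefixes
PHRASE_MARKERS = ("always use", "never commit")  # assumed phrase markers
MAX_LINE_LEN = 500

def extract_rule_statements_from_markdown(markdown: str) -> list[str]:
    """Deterministically pull rule-like statements out of markdown text."""
    seen: set[str] = set()
    statements: list[str] = []
    for raw in markdown.splitlines():
        line = raw.strip().lstrip("-*# ").strip()  # drop bullet/heading markers
        if not line or len(line) > MAX_LINE_LEN:
            continue
        lower = line.lower()
        if not (lower.startswith(PREFIX_MARKERS) or any(m in lower for m in PHRASE_MARKERS)):
            continue
        norm = re.sub(r"\s+", " ", lower)  # normalize whitespace for dedup
        if norm in seen:
            continue
        seen.add(norm)
        statements.append(line)
    return statements
```

Lines differing only in case or whitespace collapse to one statement via the normalized key.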
src/event_processors/pull_request/processor.py (2)
65-90: LGTM on agentic scanning integration. The conditional execution based on `is_relevant_pr` is appropriate, and the exception handling gracefully degrades without crashing PR processing. The detailed logging provides good observability during the rollout phase.

Consider reducing log verbosity for production by:
- Moving `logger.info(" YAML:\n%s", rules_yaml)` to debug level for large outputs
- Truncating `rule_sources` if the list grows large
6-7: LGTM! The new imports cleanly bring in the agentic scanning capabilities from the appropriate modules.
src/event_processors/push.py (2)
68-93: LGTM! The agentic scanning integration for push events mirrors the PR processor implementation appropriately. The conditional check via `is_relevant_push`, graceful exception handling, and detailed logging are consistent with the PR processor approach.
6-7: LGTM! Imports are correctly added for the new agentic scanning functionality.
tests/unit/api/test_proceed_with_pr.py (1)
9-10: LGTM! The test double is correctly updated to return the new tuple signature `(repo_data, error)`, matching the updated `get_repository` API.

tests/unit/integrations/github/test_api.py (3)
129-132: LGTM! Assertions correctly updated to unpack and validate the new tuple return signature `(repo_data, repo_error)`.
143-147: LGTM! Error case properly validates both `repo_data is None` and that `repo_error` contains the expected status code.
230-276: LGTM! Comprehensive test for `get_repository_tree` that:
- Mocks `_get_auth_headers` and `_resolve_tree_sha` properly
- Validates the correct number of entries returned
- Checks expected paths are present in results

Good use of `patch.object` with context managers for clean test isolation.

tests/integration/test_scan_ai_files.py (1)
20-70: LGTM! Good integration test that:
- Properly mocks GitHub client methods with tuple return signatures
- Validates response structure (status code, repo_full_name, ref, candidate_files)
- Checks that matching paths appear in results
- Verifies required fields (`path`, `has_keywords`) are present

Consider adding error path tests (invalid URL, rate limit, auth failure) in a follow-up.
tests/unit/rules/test_ai_rules_scan.py (3)
23-61: LGTM! Comprehensive parameterized tests for `path_matches_ai_rule_patterns` covering:
- Various matching patterns (rules, guidelines, prompt, .cursor/rules/)
- Non-matching paths
- Edge cases (empty, whitespace)
- Backslash normalization
64-90: LGTM! Good coverage for `content_has_ai_keywords`:
- Multiple keyword samples with parameterized tests
- Case-insensitivity validation
- Empty/None handling
130-190: LGTM! Async tests for `scan_repo_for_ai_rule_files` properly:
- Use the `pytest.mark.asyncio` decorator
- Test the no-content-fetch path
- Test content fetch with keyword detection
- Test failure handling (network error)

Good mock implementation for the `get_file_content` callback.

src/integrations/github/api.py (1)
174-177: Auth-only directory listing change is consistent with the stricter access model. This update aligns `list_directory_any_auth` behavior with the new authenticated flow.
```python
try:
    parsed = yaml.safe_load(rules_yaml)
    rules_count = len(parsed.get("rules", [])) if isinstance(parsed, dict) else 0
except Exception:
    pass
```
Bare except with pass swallows errors silently.
This broadly catches every `Exception` and then discards it without logging. Per coding guidelines, reject bare `except:` clauses and swallowed errors.
Proposed fix:

```diff
 try:
     parsed = yaml.safe_load(rules_yaml)
     rules_count = len(parsed.get("rules", [])) if isinstance(parsed, dict) else 0
-except Exception:
-    pass
+except yaml.YAMLError as e:
+    logger.debug("yaml_parse_failed_for_rules_count", error=str(e))
```
+ logger.debug("yaml_parse_failed_for_rules_count", error=str(e))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| try: | |
| parsed = yaml.safe_load(rules_yaml) | |
| rules_count = len(parsed.get("rules", [])) if isinstance(parsed, dict) else 0 | |
| except Exception: | |
| pass | |
| try: | |
| parsed = yaml.safe_load(rules_yaml) | |
| rules_count = len(parsed.get("rules", [])) if isinstance(parsed, dict) else 0 | |
| except yaml.YAMLError as e: | |
| logger.debug("yaml_parse_failed_for_rules_count", error=str(e)) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/api/recommendations.py` around lines 544 - 548, The try/except around
yaml.safe_load that sets parsed and rules_count is swallowing all exceptions;
change it to catch only expected parsing errors (e.g., yaml.YAMLError and
ValueError) and handle them explicitly—log the error via the appropriate logger
or re-raise instead of using a bare except/pass—so update the try/except block
around yaml.safe_load(rules_yaml) and the parsed/rules_count assignment to catch
yaml.YAMLError (and optionally ValueError) and then log the exception context
(or raise) rather than silently passing.
```python
rules_count = rules_yaml.count("\n - ") + (1 if rules_yaml.strip() != "rules: []" and " - " in rules_yaml else 0)
try:
    parsed = yaml.safe_load(rules_yaml)
    rules_count = len(parsed.get("rules", [])) if isinstance(parsed, dict) else 0
except Exception:
    pass
```
Redundant and error-prone rules counting logic.
Line 1116 uses string counting (`rules_yaml.count("\n - ")`), which is fragile and immediately overwritten by the YAML parse result. Remove the unreliable string-based count. Also, the bare `except Exception` with `pass` violates coding guidelines.
Proposed fix:

```diff
-rules_count = rules_yaml.count("\n - ") + (1 if rules_yaml.strip() != "rules: []" and " - " in rules_yaml else 0)
+rules_count = 0
 try:
     parsed = yaml.safe_load(rules_yaml)
     rules_count = len(parsed.get("rules", [])) if isinstance(parsed, dict) else 0
-except Exception:
-    pass
+except yaml.YAMLError as e:
+    logger.debug("yaml_parse_failed", error=str(e))
```

🤖 Prompt for AI Agents
+ logger.debug("yaml_parse_failed", error=str(e))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/api/recommendations.py` around lines 1116 - 1121, Remove the fragile
string-based counting and the bare except: stop computing rules_count with
rules_yaml.count(...) and instead initialize rules_count = 0, then call
yaml.safe_load(rules_yaml) and set rules_count = len(parsed.get("rules", [])) if
isinstance(parsed, dict) else 0; catch only yaml.YAMLError (or ValueError) and
handle it explicitly (e.g., log a warning via the module logger or return a
validation error) rather than using a bare `except Exception: pass`, and if you
must catch a generic Exception re-raise it after logging so errors are not
silently swallowed; update the block that uses rules_yaml, parsed,
yaml.safe_load, and rules_count accordingly.
```diff
 ) -> tuple[dict[str, Any] | None, dict[str, Any] | None]:
     """
     Fetch repository metadata. Returns (repo_data, None) on success;
     (None, {"status": int, "message": str}) on failure for meaningful API responses.
     """
     headers = await self._get_auth_headers(
-        installation_id=installation_id, user_token=user_token, allow_anonymous=True
+        installation_id=installation_id, user_token=user_token
     )
     if not headers:
-        return None
+        return (
+            None,
+            {"status": 401, "message": "Authentication required. Provide github_token or installation_id in the request."},
+        )
     url = f"{config.github.api_base_url}/repos/{repo_full_name}"
     session = await self._get_session()
     async with session.get(url, headers=headers) as response:
         if response.status == 200:
             data = await response.json()
-            return cast("dict[str, Any]", data)
-    return None
+            return cast("dict[str, Any]", data), None
+        try:
+            body = await response.json()
+            gh_message = body.get("message", "") if isinstance(body, dict) else ""
+        except Exception:
+            gh_message = ""
+        if response.status == 404:
+            msg = gh_message or "Repository not found or access denied. Check repo name and token permissions."
+            return None, {"status": 404, "message": msg}
+        if response.status == 403:
+            msg = "GitHub API rate limit exceeded. Try again later or provide github_token for higher limits."
+            if gh_message and "rate limit" in gh_message.lower():
+                msg = gh_message
+            return None, {"status": 403, "message": msg}
+        if response.status == 401:
+            return (
+                None,
+                {"status": 401, "message": gh_message or "Invalid or expired token. Check github_token or installation_id."},
+            )
+        return None, {"status": response.status, "message": gh_message or f"GitHub API returned {response.status}."}
```
🛠️ Refactor suggestion | 🟠 Major
Use validated models instead of tuple-of-raw-dicts for `get_repository`.
The new `(repo_data, error_info)` contract is shape-unsafe (`dict[str, Any]`) and easy to misuse. Please return a Pydantic response model (typed success/error branches) so callers can rely on validated fields. As per coding guidelines, "All agent outputs and external payloads must use validated BaseModel from Pydantic".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/integrations/github/api.py` around lines 132 - 169, get_repository
currently returns a tuple of raw dicts which is shape-unsafe; change it to
return validated Pydantic models (e.g., RepositoryResponse for success and
ApiErrorResponse for failures) instead of tuple[dict[str, Any] | None, dict[str,
Any] | None]. Update the function signature and all return sites inside
get_repository (including the authentication error branch using
_get_auth_headers and all response.status branches after _get_session().get) to
instantiate and return the appropriate BaseModel instances with the same fields
(status, message, and repo data mapped into RepositoryResponse fields), and
adjust callers to consume the typed models rather than raw dicts. Ensure models
are defined as Pydantic BaseModel subclasses and include parsing/validation of
the JSON body (use the model.parse_obj or model(...) constructors) before
returning.
```python
async def _resolve_tree_sha(self, repo_full_name: str, ref: str, headers: dict[str, str]) -> str | None:
    """Resolve the SHA of the tree for the given ref (commit SHA from ref -> tree SHA from commit)."""
    session = await self._get_session()
    ref_url = f"{config.github.api_base_url}/repos/{repo_full_name}/git/ref/heads/{ref}"
    async with session.get(ref_url, headers=headers) as response:
        if response.status != 200:
            return None
        data = await response.json()
    commit_sha = data.get("object", {}).get("sha") if isinstance(data, dict) else None
    if not commit_sha:
        return None
    commit_url = f"{config.github.api_base_url}/repos/{repo_full_name}/git/commits/{commit_sha}"
    async with session.get(commit_url, headers=headers) as response:
        if response.status != 200:
            return None
        commit_data = await response.json()
    tree_sha = commit_data.get("tree", {}).get("sha") if isinstance(commit_data, dict) else None
    return tree_sha
```
🧩 Analysis chain

The following verification scripts were run against the `warestack/watchflow` repository:

```shell
# First, let's find and examine the file
find . -name "api.py" -path "*/github/*" -type f

# Get the content of the relevant file section
head -300 src/integrations/github/api.py | tail -100

# Search for _resolve_tree_sha usage and context
rg "_resolve_tree_sha" -A 5 -B 5

# Check for any tests or usage examples
rg "resolve_tree_sha|_resolve_tree_sha" -l

# Look at the test file for _resolve_tree_sha
cat tests/unit/integrations/github/test_api.py | head -500

# Search for any GitHub API endpoint documentation or comments
rg "git/ref|git/commits|/commits/" -B 2 -A 2

# Check if there are any tests that verify the behavior with different ref types
rg "resolve_tree_sha" -A 20

# Look for any usage with tag refs or SHA refs
rg "refs/tags|refs/heads"

# Search for any GitHub API documentation comments in the codebase
rg "github.com/rest|api.github.com|GitHub API" -B 2 -A 2

# Check how callers pass refs to _resolve_tree_sha
rg "get_repository_tree" -B 5 -A 15

# Verify the issue: check if tags or full SHAs could be passed to _resolve_tree_sha
rg "get_repository_tree\(" -B 3 -A 3

# Check if there are any error cases or comments about ref handling
rg "refs/tags|tag|commit.*sha|commit.*SHA" -B 2 -A 2

# Search for any existing tests or mocks that show the response structure
rg "commit_data|tree_sha" -B 5 -A 5

# Check if there are comments or documentation about the API response structures
rg "object.*sha|commit.*tree" -B 3 -A 3

# Look for any existing usage patterns in similar methods
rg "git/commits|/commits/" -B 2 -A 8 | head -100

# Check what refs are actually passed to _resolve_tree_sha or get_repository_tree
rg "get_repository_tree\|_resolve_tree_sha" -B 10 -A 5 | grep -E "ref=|get_repository_tree|_resolve_tree_sha" | head -40

# Look at the _ref_to_branch function to understand ref normalization
rg "_ref_to_branch" -B 5 -A 10

# Check if there are any cases where refs with prefixes might be passed
rg "refs/tags\|refs/heads/\|commit.*sha" -B 3 -A 3 | head -60

# Let me check the actual response structure for both endpoints using documentation comments
rg "git/ref|commits/" -B 5 -A 10 | head -80

# Check if there are any direct SHA passes to get_repository_tree
rg "get_repository_tree\(" -B 15 | grep -E "ref=|sha|commit"

# Verify the actual implementation one more time
cat -n src/integrations/github/api.py | sed -n '224,241p'

# Check docstring for get_file_content to see what refs it claims to support
rg -A 5 "Fetches the content of a file"

# Check if there are any comments or documentation about ref parameter handling
rg "branch name.*tag.*commit|commit SHA" -B 2 -A 2

# Check if there are any other uses of the /commits/ endpoint to see response structure
rg "/commits/" -B 3 -A 8 | grep -E "commit|tree|sha|get\(|\.get"

# Check for any integration tests or mocks that show response structures
rg "commit.*tree\|tree.*commit" -B 2 -A 2

# See the full context of how get_repository_tree is being called
cat -n src/integrations/github/api.py | sed -n '208,223p'
```
Resolve ref resolution to handle branches, tags, and commit SHAs.
`_resolve_tree_sha` cannot resolve tags or commit SHAs; it only handles `heads/{ref}`. This conflicts with `get_file_content`'s documented support for "branch name, tag, or commit SHA". Replace the two-call `git/ref/heads` + `git/commits` approach with the single `/repos/{owner}/{repo}/commits/{ref}` endpoint, which accepts all ref types and returns the tree SHA directly.
Suggested fix:

```diff
-    ref_url = f"{config.github.api_base_url}/repos/{repo_full_name}/git/ref/heads/{ref}"
-    async with session.get(ref_url, headers=headers) as response:
-        if response.status != 200:
-            return None
-        data = await response.json()
-    commit_sha = data.get("object", {}).get("sha") if isinstance(data, dict) else None
-    if not commit_sha:
-        return None
-    commit_url = f"{config.github.api_base_url}/repos/{repo_full_name}/git/commits/{commit_sha}"
+    commit_url = f"{config.github.api_base_url}/repos/{repo_full_name}/commits/{ref}"
     async with session.get(commit_url, headers=headers) as response:
         if response.status != 200:
             return None
         commit_data = await response.json()
-    tree_sha = commit_data.get("tree", {}).get("sha") if isinstance(commit_data, dict) else None
+    tree_sha = commit_data.get("commit", {}).get("tree", {}).get("sha") if isinstance(commit_data, dict) else None
     return tree_sha
```

🤖 Prompt for AI Agents
return tree_sha🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/integrations/github/api.py` around lines 224 - 241, _resolve_tree_sha
currently only queries git/ref/heads/{ref} and cannot resolve tags or raw commit
SHAs; update it to call the single commits endpoint
(/repos/{owner}/{repo}/commits/{ref}) which accepts branch names, tags, and
commit SHAs and returns the commit object with a tree.sha. In function
_resolve_tree_sha use self._get_session() and session.get to fetch
f"{config.github.api_base_url}/repos/{repo_full_name}/commits/{ref}" with the
provided headers, check for a 200 response, parse JSON into commit_data, and
return commit_data.get("tree", {}).get("sha") or None on failure; keep existing
error/None semantics and ensure you remove the previous git/ref/heads +
git/commits two-step logic so get_file_content's documented ref support works.
```python
logger.warning(
    "deterministic_mapping_matched statement=%r pattern=%r",
    statement[:100],
    p,
)
return dict(rule_dict)
```
Incorrect log level for successful operation.
`logger.warning` is used to log a successful deterministic mapping match. This should be `logger.debug` or `logger.info`, since it is not a warning condition.
Proposed fix:

```diff
 if p in lower:
-    logger.warning(
+    logger.debug(
         "deterministic_mapping_matched statement=%r pattern=%r",
         statement[:100],
         p,
     )
     return dict(rule_dict)
```
return dict(rule_dict)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| logger.warning( | |
| "deterministic_mapping_matched statement=%r pattern=%r", | |
| statement[:100], | |
| p, | |
| ) | |
| return dict(rule_dict) | |
| logger.debug( | |
| "deterministic_mapping_matched statement=%r pattern=%r", | |
| statement[:100], | |
| p, | |
| ) | |
| return dict(rule_dict) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/rules/ai_rules_scan.py` around lines 304 - 309, The log call indicating a
successful deterministic mapping match is using logger.warning; change that to a
non-warning level (prefer logger.debug) so successful matches are not treated as
warnings: in the block that logs "deterministic_mapping_matched statement=%r
pattern=%r" (just before the return dict(rule_dict)) replace logger.warning(...)
with logger.debug(...) keeping the same message and arguments (statement[:100],
p) to preserve context.
```python
try:
    agent = get_feasibility_agent()
    result = await agent.execute(rule_description=st)
    if result.success and result.data.get("is_feasible") and result.data.get("yaml_content"):
        yaml_content = result.data["yaml_content"].strip()
        parsed = yaml.safe_load(yaml_content)
        if isinstance(parsed, dict) and "rules" in parsed and isinstance(parsed["rules"], list):
            for r in parsed["rules"]:
                if isinstance(r, dict):
                    all_rules.append(r)
                    rule_sources.append("agent")
        else:
            ambiguous.append({"statement": st, "path": path, "reason": "Feasibility agent returned invalid YAML"})
    else:
        ambiguous.append({"statement": st, "path": path, "reason": result.message or "Not feasible"})
except Exception as e:
    ambiguous.append({"statement": st, "path": path, "reason": str(e)})
```
Missing confidence gating for agent output.
Per coding guidelines: "Agent outputs must include: decision, confidence (0..1), short reasoning" and "Implement confidence policy: reject or route to human-in-the-loop when confidence < 0.5". The feasibility agent result is used without checking a confidence threshold.
Proposed fix:

```diff
 try:
     agent = get_feasibility_agent()
     result = await agent.execute(rule_description=st)
-    if result.success and result.data.get("is_feasible") and result.data.get("yaml_content"):
+    confidence = result.data.get("confidence", 0.0) if result.data else 0.0
+    if confidence < 0.5:
+        ambiguous.append({"statement": st, "path": path, "reason": f"Low confidence ({confidence})"})
+        continue
+    if result.success and result.data.get("is_feasible") and result.data.get("yaml_content"):
         yaml_content = result.data["yaml_content"].strip()
```

🤖 Prompt for AI Agents
yaml_content = result.data["yaml_content"].strip()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/rules/ai_rules_scan.py` around lines 353 - 369, The feasibility agent
output is used without verifying its required fields or confidence; update the
block that calls get_feasibility_agent() and agent.execute(...) to validate that
result.data contains the required keys ("decision"/"is_feasible", "confidence",
"reasoning") and only accept the YAML when result.success is true,
result.data["is_feasible"] is truthy, yaml_content exists, and
result.data["confidence"] >= 0.5; if confidence is missing or < 0.5 or required
keys are absent, append to ambiguous with a clear reason (e.g., "low confidence"
or "missing agent fields") including the reported confidence/decision, otherwise
proceed to parse yaml_content and add rules to all_rules as before.
Codecov Report

❌ Patch coverage is 58.2%. Your patch status has failed because the patch coverage (58.2%) is below the target coverage (80.0%). You can increase the patch coverage or adjust the target coverage.

```
@@           Coverage Diff           @@
##            main      #60    +/-  ##
=======================================
+ Coverage   67.0%    68.6%   +1.6%
=======================================
  Files        151      157      +6
  Lines       9384    10247    +863
=======================================
+ Hits        6290     7037    +747
- Misses      3094     3210    +116
```

Continue to review the full report in Codecov by Sentry.
dkargatzis left a comment
Thanks for picking up this issue! You've built a very solid structural foundation here, especially with the clean integration into the event processors and the pragmatic reuse of the existing feasibility agent for translation.
However, based on the issue description and your attached PDF, there are a few key areas where the implementation deviates from the "agentic" vision of the feature and needs adjustment:
1. The Extractor: Deterministic vs. Agentic
In your PDF, you noted: "No LLM, no graph, no chain — just one synchronous function." While keeping things simple is usually a great principle, in this specific case, it contradicts the core requirement of the issue: "Build a LangGraph chain/agent: Extractor Agent: Read MD content -> pull out rule-like statements (prompt engineering here is key)."
Markdown guidelines (like .cursorrules or .github/copilot-instructions.md) are rarely structured predictably. They are written in paragraphs, bullet points, or complex sentences. A simple synchronous function that splits by newlines and looks for exact substring matches like "always use" or "Cursor rule:" is too brittle and will miss implicit rules or lose context.
Recommendation: Please replace the synchronous string-matching Extractor with an actual LLM-powered Extractor Agent. It should take the raw markdown, read it holistically, understand the context, and return a structured list of rule statements.
2. Missing Core Feature: Pre-Merge Enforcement
Currently, in src/event_processors/pull_request/processor.py, the dynamically translated rules are generated and then simply logged to the console via logger.info. The issue explicitly calls for enforcing these as pre-merge checks. Logging the rules does not block bad PRs or add comments for violations, meaning the feature isn't actually functioning as a guardrail yet.
Recommendation: Parse the dynamically generated rules_yaml into rule objects and append them to the existing repository rules (typically fetched via rule_provider.get_rules()) before the PR evaluation engine runs. This ensures Watchflow will automatically evaluate them, leave comments, and block the merge if they fail.
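The merge step described above could be sketched roughly as follows. This is a hypothetical helper; the real integration point with `rule_provider.get_rules()` and the evaluation engine will differ:

```python
import yaml

def merge_translated_rules(existing_rules: list[dict], rules_yaml: str) -> list[dict]:
    """Append dynamically translated rules (YAML text) to the repo's existing rules."""
    try:
        parsed = yaml.safe_load(rules_yaml)
    except yaml.YAMLError:
        return existing_rules  # fall back to the static rules on bad YAML
    extra = parsed.get("rules", []) if isinstance(parsed, dict) else []
    merged = list(existing_rules)
    for rule in extra:
        if isinstance(rule, dict) and rule not in merged:  # skip exact duplicates
            merged.append(rule)
    return merged
```

The merged list would then be handed to the existing PR evaluation engine so the dynamic rules are enforced like any other.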
3. Missing Bonus Feature: The Self-Improving Loop (Push Event)
In src/event_processors/push.py, the processor detects when an AI rule file is pushed and translates it, but again only logs the output. The issue asks for a self-improving loop: "if an MD rules file changes -> auto-re-run translator -> suggest/commit updated .watchflow/rules.yaml".
Recommendation: If a push event modifies an AI rule file, use the GitHub API client to either directly update the .watchflow/rules.yaml file on the default branch or (preferably) open a new Pull Request against the repository with the newly translated YAML rules so the team can review the auto-generated constraints.
4. Performance: Sequential I/O
In scan_repo_for_ai_rule_files, the code loops over candidate files and awaits get_file_content sequentially. For repositories with many files in a .cursor/rules/ directory, fetching files one by one will create high latency and could cause the webhook processing to time out.
Recommendation: Use asyncio.gather paired with an asyncio.Semaphore to fetch the contents of all candidate files concurrently while respecting GitHub's rate limits. (Note: The CodeRabbit AI bot actually caught this one as well).
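A bounded-concurrency fetch along these lines might look like the sketch below. `get_file_content` here stands in for the real client method, and the concurrency limit of 5 is an assumption, not a measured value:

```python
import asyncio

async def fetch_all_contents(paths, get_file_content, max_concurrency=5):
    """Fetch candidate file contents concurrently, bounded by a semaphore."""
    sem = asyncio.Semaphore(max_concurrency)

    async def fetch_one(path):
        async with sem:  # at most max_concurrency requests in flight
            try:
                return path, await get_file_content(path)
            except Exception:
                return path, None  # treat per-file failures as "no content"

    results = await asyncio.gather(*(fetch_one(p) for p in paths))
    return dict(results)
```

Keyword scanning can then run over the returned mapping exactly as the sequential version does today.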
Great start overall! Looking forward to seeing the LLM Extractor Agent in action.
I implemented 2 features.
I attached the document here:
PR_from_Ryan.pdf