
feat: issue 24 diff and codeowners #59

Draft
dkargatzis wants to merge 14 commits into main from feat/issue-24-diff-and-codeowners

Conversation

@dkargatzis
Member

@dkargatzis dkargatzis commented Feb 28, 2026

This PR resolves #24 and expands our hybrid validation engine with diff-based pattern matching and advanced enterprise compliance guardrails.

We've consolidated our GitHub API layers by deprecating the legacy graphql_client in favor of a strictly typed Pydantic/httpx GraphQL implementation that fetches reviewThreads and is_verified commit signatures. Leveraging this new data, this branch introduces DiffPatternCondition (regex matching over raw patch diffs), UnresolvedCommentsCondition (SLA checking against GraphQL thread nodes), and native CODEOWNERS API fetching with an AsyncCache layer to bypass local disk reads.

Additionally, we've shipped a suite of Separation of Duties (SoD) conditions (NoSelfApproval, SignedCommits, ChangelogRequired) and an enterprise-rules-roadmap.md document to guide future native integrations with CodeQL and Dependabot. All schemas have been updated, and the new validators have 100% test coverage.
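The diff-scanning conditions described above hinge on regex matching over raw unified-diff patches. A minimal, self-contained sketch of that idea follows; names like Violation and the shape of changed_files are simplified stand-ins for Watchflow's actual models, not its real API:

```python
import re
from dataclasses import dataclass


@dataclass
class Violation:
    """Simplified stand-in for Watchflow's violation model."""
    message: str
    severity: str = "medium"


def match_patterns_in_patch(patch: str, patterns: list[str]) -> list[str]:
    """Return the regex patterns that match any added line of a unified diff."""
    added = [
        line[1:]
        for line in patch.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]
    matched = []
    for pattern in patterns:
        try:
            rx = re.compile(pattern)
        except re.error:
            continue  # skip invalid regexes rather than failing the whole check
        if any(rx.search(line) for line in added):
            matched.append(pattern)
    return matched


def evaluate_diff_patterns(changed_files: list[dict], patterns: list[str]) -> list[Violation]:
    """Flag files whose patches contain any restricted pattern."""
    violations = []
    for file_info in changed_files:
        patch = file_info.get("patch")
        if not patch:
            continue
        hits = match_patterns_in_patch(patch, patterns)
        if hits:
            filename = file_info.get("filename", "unknown")
            violations.append(Violation(f"Restricted pattern(s) {hits} found in {filename}"))
    return violations
```

Only added lines are scanned, so patterns flag newly introduced code rather than pre-existing context or removed lines.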

Summary by CodeRabbit

Release Notes

  • New Features

    • Added enterprise compliance and governance guardrails with multi-layer rule engine
    • Integrated PR review thread fetching and monitoring
    • Five new rule conditions: enforce signed commits, require changelog updates, enforce test coverage, detect restricted patterns, and monitor comment response SLA
    • Advanced access control: prevent self-approval and require team-based approvals
  • Tests

    • Added comprehensive test coverage for all new rule conditions
  • Chores

    • Refactored internal utility architecture for improved maintainability

@dkargatzis dkargatzis self-assigned this Feb 28, 2026
@watchflow

watchflow bot commented Feb 28, 2026

🛡️ Watchflow Governance Checks

Status: ❌ 1 Violation Found

🟡 Medium Severity (1)

Validates that total lines changed (additions + deletions) in a PR do not exceed the configured maximum LOC per pull request.

Pull request exceeds maximum lines changed (1497 > 500)
How to fix: Reduce the size of this PR to at most 500 lines changed (additions + deletions).


💡 Reply with @watchflow ack [reason] to override these rules, or @watchflow help for commands.

Thanks for using Watchflow! It's completely free for OSS and private repositories. You can also self-host it easily.

@coderabbitai

coderabbitai bot commented Feb 28, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

This PR introduces enterprise guardrails and review thread capabilities for Watchflow. It adds GraphQL-backed review thread fetching, multiple new rule conditions for compliance/access control/code quality/temporal checks, new webhook handlers for review events, refactored CODEOWNERS utilities, and a comprehensive design roadmap for regulated industry features.

Changes

Cohort / File(s) Summary
Enterprise Roadmap & Documentation
docs/enterprise-rules-roadmap.md
New comprehensive design document outlining guardrails, rule engine architecture, and six rule categories (compliance, access control, operations, documentation, GitHub integrations, OPA/open-source).
GitHub GraphQL & API Integration
src/integrations/github/api.py, src/integrations/github/graphql.py, src/integrations/github/graphql_client.py
Adds get_pull_request_review_threads() method to fetch review threads via GraphQL; introduces typed execute_query_typed() for validation; removes legacy graphql_client.py in favor of centralized integration. Improves error handling in fetch_pr_hygiene_stats.
Event Types & PR Data Models
src/core/models.py, src/integrations/github/models.py
Adds PULL_REQUEST_REVIEW and PULL_REQUEST_REVIEW_THREAD event types; introduces review thread models (ThreadCommentNode, ThreadCommentConnection, ReviewThreadNode, ReviewThreadConnection) and wires into PullRequest.
PR Data Enrichment
src/event_processors/pull_request/enricher.py
Fetches and stores review threads from GraphQL; adds patch field to file metadata in enriched payloads.
Compliance Rule Conditions
src/rules/conditions/compliance.py
Introduces SignedCommitsCondition (enforces GPG/SSH/S-MIME signatures) and ChangelogRequiredCondition (requires changelog updates for source changes).
Advanced Access Control Conditions
src/rules/conditions/access_control_advanced.py
Adds NoSelfApprovalCondition (prevents author self-approval) and CrossTeamApprovalCondition (requires multi-team approvals).
Pattern Matching & Code Quality Conditions
src/rules/conditions/pull_request.py, src/rules/conditions/filesystem.py, src/rules/conditions/temporal.py
Introduces DiffPatternCondition, SecurityPatternCondition, UnresolvedCommentsCondition for pattern/diff scanning; TestCoverageCondition for test correlation; CommentResponseTimeCondition for SLA enforcement.
Rule Infrastructure & Registry
src/rules/acknowledgment.py, src/rules/registry.py, src/rules/conditions/__init__.py
Adds five new RuleID entries (DIFF_PATTERN, SECURITY_PATTERN, UNRESOLVED_COMMENTS, TEST_COVERAGE, COMMENT_RESPONSE_TIME); maps them to condition classes; exports them in the registry and package __all__.
CODEOWNERS & Diff Utilities
src/rules/utils/codeowners.py, src/rules/utils/diff.py, src/rules/utils/__init__.py
Refactors get_file_owners() and is_critical_file() to accept a codeowners_content string instead of repo_path; removes load_codeowners(); adds extract_added_lines(), extract_removed_lines(), and match_patterns_in_patch() utilities.
Webhook Event Handlers
src/webhooks/handlers/pull_request_review.py, src/webhooks/handlers/pull_request_review_thread.py
New handlers for PR review/review_thread events; validate the action field and delegate to the existing PR rule evaluation engine.
Application Integration
src/main.py
Registers new webhook handlers for PULL_REQUEST_REVIEW and PULL_REQUEST_REVIEW_THREAD event types.
Test Coverage
tests/unit/rules/conditions/test_*, tests/unit/rules/test_acknowledgment.py
Comprehensive async test suites for all new conditions (compliance, access control, pull request patterns, temporal SLA) and violation acknowledgment mappings.
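The CODEOWNERS refactor above changes get_file_owners() to take the raw file content instead of a repository path. As a rough sketch of content-based matching (fnmatch globs and last-match-wins are simplifications of GitHub's real CODEOWNERS semantics, and the signature here is an assumption about the refactored helper):

```python
import fnmatch


def get_file_owners(codeowners_content: str, path: str) -> list[str]:
    """Return the owners for `path` from raw CODEOWNERS text.

    Simplified semantics: patterns are matched with fnmatch globs and the
    last matching rule wins, mirroring GitHub's CODEOWNERS precedence.
    """
    owners: list[str] = []
    for raw in codeowners_content.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        pattern, *rule_owners = line.split()
        pattern = pattern.lstrip("/")
        # Match the path itself, or anything under a directory-style pattern.
        if fnmatch.fnmatch(path, pattern) or fnmatch.fnmatch(path, pattern.rstrip("/") + "/*"):
            owners = rule_owners  # last match wins
    return owners
```

Passing the content string rather than a repo path is what lets the API-fetched (and cached) CODEOWNERS file be used directly, without local disk reads.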

Sequence Diagram

sequenceDiagram
    actor GitHub
    participant Handler as Webhook Handler
    participant Enricher as PR Enricher
    participant GraphQL as GraphQL Client
    participant RuleEngine as Rule Engine
    participant Storage as Storage

    GitHub->>Handler: pull_request_review event
    Handler->>Handler: Validate action
    Handler->>RuleEngine: Re-evaluate PR rules
    RuleEngine->>Enricher: Get enriched PR data
    Enricher->>GraphQL: Fetch review threads
    GraphQL->>GitHub: Query reviewThreads
    GitHub-->>GraphQL: Thread nodes
    GraphQL-->>Enricher: Parsed review threads
    Enricher-->>RuleEngine: Enriched event with threads
    RuleEngine->>RuleEngine: Evaluate conditions<br/>(unresolved comments,<br/>response time, etc.)
    RuleEngine-->>Storage: Store violations
    Storage-->>Handler: Result
    Handler-->>GitHub: Status update

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • PR #54: Modifies src/rules/acknowledgment.py violation-text ↔ RuleID mappings and updates rule registry infrastructure, overlapping with this PR's extension of acknowledgment mappings.
  • PR #43: Alters GitHub GraphQL client surface in src/integrations/github/graphql.py and GraphQL execution/validation methods, directly related to GraphQL integration changes.

Poem

🐰 Review threads now flow through the warren,
Patterns caught before they tarry on,
Rules enforce what's signed and tested true,
Five new sentries guard the path for you,
Compliance's march, one thread at a time! 🧵✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 60.87%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Title check — ❓ Inconclusive: the title 'feat: issue 24 diff and codeowners' is partially related to the changeset but lacks clarity and specificity about the main changes. Resolution: consider a more descriptive title, such as 'feat: add diff pattern matching and enterprise compliance conditions' or 'feat: implement GraphQL review threads and access control conditions'.

✅ Passed checks (1 passed)

  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@dkargatzis dkargatzis marked this pull request as draft February 28, 2026 08:47

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

🧹 Nitpick comments (16)
src/rules/acknowledgment.py (1)

61-61: Consider a more specific pattern for COMMENT_RESPONSE_TIME mapping.

The pattern "exceeded the" is quite generic and could inadvertently match other violation messages (e.g., "file size exceeded the limit"). Consider a more specific substring like "response time exceeded" or "SLA timeframe" to avoid false positives in rule ID mapping.

♻️ Suggested improvement
-    "exceeded the": RuleID.COMMENT_RESPONSE_TIME,
+    "response time exceeded": RuleID.COMMENT_RESPONSE_TIME,

Or alternatively, ensure the violation message from CommentResponseTimeCondition uses a distinctive phrase.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/acknowledgment.py` at line 61, The mapping key "exceeded the" in
src/rules/acknowledgment.py is too generic for RuleID.COMMENT_RESPONSE_TIME;
replace it with a more specific substring such as "response time exceeded" or
"SLA timeframe" (or update the violation text emitted by
CommentResponseTimeCondition to include a distinctive phrase) so the lookup for
RuleID.COMMENT_RESPONSE_TIME only matches CommentResponseTimeCondition
violations and avoids false positives.
tests/unit/rules/test_acknowledgment.py (1)

160-162: Missing test coverage for TEST_COVERAGE and COMMENT_RESPONSE_TIME mappings.

The test covers three of the five new RuleID mappings but is missing test cases for:

  • TEST_COVERAGE (maps from "without corresponding test changes")
  • COMMENT_RESPONSE_TIME (maps from "exceeded the")

Consider adding parametrized test cases for these to ensure complete coverage of all new violation text mappings.

💚 Suggested test additions
             ("PR has 1 unresolved review comment thread(s)", RuleID.UNRESOLVED_COMMENTS),
+            ("Source files modified without corresponding test changes", RuleID.TEST_COVERAGE),
+            ("Comment response time exceeded the 24-hour SLA", RuleID.COMMENT_RESPONSE_TIME),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/rules/test_acknowledgment.py` around lines 160 - 162, Add
parametrized test cases to tests/unit/rules/test_acknowledgment.py to cover the
two missing RuleID mappings: include a case that expects RuleID.TEST_COVERAGE
when the violation text contains "without corresponding test changes" and a case
that expects RuleID.COMMENT_RESPONSE_TIME when the violation text contains
"exceeded the"; update the same parameter list or the test function that
currently asserts mappings for RuleID.DIFF_PATTERN, RuleID.SECURITY_PATTERN, and
RuleID.UNRESOLVED_COMMENTS so it also asserts these two new tuples (i.e., add
("without corresponding test changes", RuleID.TEST_COVERAGE) and ("exceeded
the", RuleID.COMMENT_RESPONSE_TIME)) ensuring the test iterates those inputs and
verifies the mapping logic.
src/webhooks/handlers/pull_request_review.py (2)

21-22: Use structured logging with fields instead of f-string interpolation.

Per coding guidelines, structured logging should include fields like operation, subject_ids, decision. The current f-string interpolation loses the benefits of structured logging.

♻️ Suggested structured logging
-        logger.info(f"Ignoring pull_request_review action: {action}")
+        logger.info(
+            "pull_request_review_skipped",
+            operation="handle_pull_request_review",
+            action=action,
+            decision="skipped",
+        )

As per coding guidelines: "Use structured logging at boundaries with fields: operation, subject_ids, decision, latency_ms"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/webhooks/handlers/pull_request_review.py` around lines 21 - 22, Replace
the f-string logger.info call in the pull_request_review handler with structured
logging that emits fields instead of interpolated text: use logger.info(...,
extra={...}) or logger.info with keyword fields to include
operation="pull_request_review:ignore", subject_ids (use the PR/review IDs
available in the handler context or None/empty list if not present),
decision="skipped", and latency_ms (calculate or set to 0 if not measured); keep
the return value as-is. Locate the logger.info call currently using f"Ignoring
pull_request_review action: {action}" and update it to log these structured
fields (operation, subject_ids, decision, latency_ms) while preserving the
return {"status":"skipped","reason":...}.

11-11: Consider using WebhookResponse model for type safety.

The function returns dict[str, Any] but the return value structure ({"status": "skipped", "reason": ...}) aligns with the WebhookResponse model defined in src/core/models.py. Using the model would provide validation and better type safety.

♻️ Suggested improvement
-async def handle_pull_request_review(event_type: str, payload: dict[str, Any], event: WebhookEvent) -> dict[str, Any]:
+from src.core.models import WebhookResponse
+
+async def handle_pull_request_review(event_type: str, payload: dict[str, Any], event: WebhookEvent) -> WebhookResponse | dict[str, Any]:

And for the skip case:

-        return {"status": "skipped", "reason": f"Action {action} ignored"}
+        return WebhookResponse(status="ignored", detail=f"Action {action} ignored")

Also applies to: 22-22

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/webhooks/handlers/pull_request_review.py` at line 11, The handler
currently types its return as dict[str, Any]; switch it to return the
WebhookResponse model for type safety by importing WebhookResponse from
src.core.models, change the function signature of handle_pull_request_review to
-> WebhookResponse, and replace literal dict returns like {"status": "skipped",
"reason": ...} with WebhookResponse(status="skipped", reason="...") (and
similarly for any other return paths) so the model validates structure and
types.
tests/unit/rules/conditions/test_access_control.py (1)

11-11: Clarify the purpose of the suppressed unused import.

The # noqa: F401 comment suppresses the unused import warning, but it's unclear why this import is needed. If it's for module initialization side effects (e.g., registering something), consider adding a comment explaining the purpose. If it's not actually needed, remove it.

💡 Suggested clarification
-import src.rules.utils.codeowners  # noqa: F401
+# Import triggers registration of codeowners utilities needed by CodeOwnersCondition
+import src.rules.utils.codeowners  # noqa: F401

Or if unnecessary:

-import src.rules.utils.codeowners  # noqa: F401
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/rules/conditions/test_access_control.py` at line 11, The
suppressed unused import "src.rules.utils.codeowners  # noqa: F401" lacks
explanation; either remove the import if it serves no purpose, or keep it and
add an inline comment clarifying its side-effect (e.g., module-level
registration/fixtures) so the noqa is justified; update the import line for the
symbol src.rules.utils.codeowners accordingly to include a short comment like "#
imported for side-effects: <what it registers>" or delete the import if it's
unused.
src/main.py (1)

28-29: Inconsistent handler pattern: function vs. class-based handlers.

The new handlers (handle_pull_request_review, handle_pull_request_review_thread) are registered as bare functions, while all other handlers use the class-based pattern (e.g., PullRequestEventHandler().handle). This is functionally correct but creates an inconsistent codebase pattern.

Consider either:

  1. Wrapping these in handler classes for consistency, or
  2. Documenting that functional handlers are acceptable alongside class-based ones

This is a minor style observation and doesn't affect functionality.

Also applies to: 79-80
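Option 1 could look roughly like the following; the handler body here is a stub standing in for the real function in src/webhooks/handlers/pull_request_review.py, so only the wrapping pattern is illustrated:

```python
import asyncio
from typing import Any


async def handle_pull_request_review(event_type: str, payload: dict[str, Any], event: Any) -> dict[str, Any]:
    """Stub standing in for the existing functional handler."""
    return {"status": "processed", "event_type": event_type}


class PullRequestReviewEventHandler:
    """Thin class wrapper so registration matches the PullRequestEventHandler().handle pattern."""

    async def handle(self, event_type: str, payload: dict[str, Any], event: Any) -> dict[str, Any]:
        # Delegate to the existing function; no behavior change, only a
        # consistent registration surface.
        return await handle_pull_request_review(event_type, payload, event)
```

Registration would then use PullRequestReviewEventHandler().handle, matching the style of the other handlers.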

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main.py` around lines 28 - 29, The codebase mixes function-style handlers
(handle_pull_request_review, handle_pull_request_review_thread) with class-based
handlers; wrap each functional handler in a small class (e.g.,
PullRequestReviewEventHandler and PullRequestReviewThreadEventHandler) that
implements a handle(self, event) method which delegates to the existing
functions, then register instances the same way other handlers are registered
(matching the pattern used by PullRequestEventHandler().handle); apply the same
change for the other occurrence of these imports/registrations noted in the file
so all handlers follow the class-based pattern for consistency.
src/rules/conditions/compliance.py (2)

54-56: Consider removing redundant validate method overrides.

Both condition classes override validate with the exact same implementation that BaseCondition already provides (see src/rules/conditions/base.py lines 47-60). These overrides are functionally equivalent to the inherited implementation and could be removed to reduce code duplication.

♻️ Suggested removal
-    async def validate(self, parameters: dict[str, Any], event: dict[str, Any]) -> bool:
-        violations = await self.evaluate({"parameters": parameters, "event": event})
-        return len(violations) == 0

Remove both validate methods and rely on the inherited BaseCondition.validate implementation.

Also applies to: 105-107

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/conditions/compliance.py` around lines 54 - 56, The two validate
method overrides in the condition classes (async def validate(self, parameters:
dict[str, Any], event: dict[str, Any]) -> bool) are redundant because they
simply call self.evaluate and return len(violations) == 0 exactly like
BaseCondition.validate; delete these validate methods from the condition classes
so they inherit BaseCondition.validate (refer to BaseCondition.validate and the
evaluate method used) to eliminate duplication.

30-32: Consider how to handle missing commit data when signed commits are required.

When require_signed_commits is True but commits is empty, the condition silently passes. This could mask issues where commit data wasn't enriched properly. Consider whether this should:

  1. Return a violation indicating commit data is unavailable for verification, or
  2. Log a warning so operators can diagnose enrichment issues

The current behavior may be intentional if commit data is optional, but it's worth confirming this edge case is expected.
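To make the fail-closed option concrete, here is a hedged sketch of option 1; field names such as "commits" and "verified" are assumptions about the enriched payload, not Watchflow's confirmed schema:

```python
from typing import Any


def check_signed_commits(parameters: dict[str, Any], event: dict[str, Any]) -> list[str]:
    """Return violation messages; fails closed when commit data is missing."""
    if not parameters.get("require_signed_commits", False):
        return []
    commits = event.get("commits", [])
    if not commits:
        # Option 1 from the comment above: treat missing enrichment data
        # as a violation rather than silently passing.
        return ["Commit data unavailable; signatures could not be verified"]
    return [
        f"Commit {c.get('sha', 'unknown')[:7]} is not signed"
        for c in commits
        if not c.get("verified", False)
    ]
```

Failing closed surfaces enrichment bugs immediately, at the cost of false positives when commit data is legitimately absent; option 2 (a logged warning) trades the reverse way.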

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/conditions/compliance.py` around lines 30 - 32, The early-return
that ignores empty commit lists (the line with commits = event.get("commits",
[]) and if not commits: return []) should be changed to handle the case where
require_signed_commits is True: instead of silently returning an empty result
when commits is empty, either (A) return a violation indicating "commit data
unavailable for verification" so the rule fails when require_signed_commits is
enabled, or (B) log a warning/error so operators can investigate enrichment
issues; implement this by replacing the empty-list return with a check on the
require_signed_commits flag (referencing commits and require_signed_commits) and
then producing the appropriate violation object or logging call before exiting.
src/integrations/github/graphql.py (1)

59-61: Consider redacting or truncating data in error logs.

Logging the full data dict on validation failure may inadvertently include sensitive information (tokens, user data). Per coding guidelines, strip secrets/PII from logs.

♻️ Suggested fix
         except ValidationError as e:
-            logger.error("graphql_validation_failed", error=str(e), data=data)
+            logger.error("graphql_validation_failed", error=str(e), data_keys=list(data.keys()) if isinstance(data, dict) else type(data).__name__)
             raise
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/integrations/github/graphql.py` around lines 59 - 61, The log in the
except block that calls logger.error("graphql_validation_failed", error=str(e),
data=data) must not emit raw request payloads; replace data with a
sanitized/truncated version before logging. Implement or reuse a helper like
sanitize_for_logging(redact_sensitive_fields) to remove/replace known
secrets/PII keys (e.g., token, password, access_token, authorization, email,
name) and apply truncation for large string/blob values, then pass the
sanitized_data to logger.error instead of data; ensure this change is made in
the except ValidationError as e handler where logger.error is invoked.
src/webhooks/handlers/pull_request_review_thread.py (1)

22-22: Prefer structured logging fields over f-string interpolation.

Per coding guidelines, use structured logging at boundaries. Replace the f-string with keyword arguments for better log aggregation and querying.

♻️ Suggested fix
-        logger.info(f"Ignoring pull_request_review_thread action: {action}")
+        logger.info("Ignoring pull_request_review_thread action", action=action)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/webhooks/handlers/pull_request_review_thread.py` at line 22, Replace the
f-string log call so the action is logged as a structured field instead of
interpolated text: find the logger.info call that currently reads
logger.info(f"Ignoring pull_request_review_thread action: {action}") in the
pull_request_review_thread handler and change it to emit the message string with
the action as a keyword argument (e.g., logger.info("Ignoring
pull_request_review_thread action", action=action)) so downstream log systems
can index/query the action field.
src/integrations/github/api.py (1)

520-524: Hardcoded pagination limits will truncate data for large PRs.

The query fetches at most 50 review threads with 10 comments each, but the GraphQL query includes no pageInfo or cursor fields to enable pagination. PRs with more than 50 review threads or threads with more than 10 comments will have data silently truncated with no indication. Recommend either documenting this limitation prominently or implementing cursor-based pagination to support enterprise PRs with extensive review discussions.
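One possible shape for the cursor-based loop, sketched against GitHub's reviewThreads connection; run_query stands in for whatever typed GraphQL executor the client exposes, and the selected fields are trimmed for brevity:

```python
import asyncio
from typing import Any, Awaitable, Callable


async def fetch_all_review_threads(
    run_query: Callable[[str, dict[str, Any]], Awaitable[dict[str, Any]]],
    owner: str,
    repo: str,
    number: int,
) -> list[dict[str, Any]]:
    """Page through the pull request's reviewThreads connection via endCursor."""
    query = """
    query($owner: String!, $repo: String!, $number: Int!, $after: String) {
      repository(owner: $owner, name: $repo) {
        pullRequest(number: $number) {
          reviewThreads(first: 50, after: $after) {
            pageInfo { hasNextPage endCursor }
            nodes { isResolved }
          }
        }
      }
    }
    """
    threads: list[dict[str, Any]] = []
    cursor: str | None = None
    while True:
        data = await run_query(query, {"owner": owner, "repo": repo, "number": number, "after": cursor})
        conn = data["repository"]["pullRequest"]["reviewThreads"]
        threads.extend(conn["nodes"])  # accumulate nodes across pages
        if not conn["pageInfo"]["hasNextPage"]:
            return threads
        cursor = conn["pageInfo"]["endCursor"]
```

The same pageInfo/after pattern applies to the nested comments connection, which would need its own inner loop for threads exceeding the per-thread comment limit.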

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/integrations/github/api.py` around lines 520 - 524, The GraphQL selection
uses hardcoded reviewThreads(first: 50) and comments(first: 10) which silently
truncates large PRs; modify the query in src/integrations/github/api.py to
support cursor-based pagination by adding pageInfo { hasNextPage endCursor } on
reviewThreads and comments and exposing/accepting after cursors, then implement
iterative fetching in the function that executes the query (e.g., the pull
request fetcher / get_pull_request / fetch_pull_request_details method) to loop
using endCursor until pageInfo.hasNextPage is false (or parameterize the limits
and clearly document them if you choose not to paginate). Ensure you propagate
cursors into subsequent queries and concatenate nodes from reviewThreads and
comments instead of replacing them.
tests/unit/rules/conditions/test_access_control_advanced.py (1)

37-52: Add severity assertion for CrossTeamApprovalCondition violation.

The TestNoSelfApprovalCondition tests correctly assert Severity.CRITICAL, but TestCrossTeamApprovalCondition doesn't verify severity. Per the implementation, this should be Severity.HIGH.

♻️ Proposed fix to add severity assertion
         violations = await condition.evaluate(context)
         assert len(violations) == 1
         assert "security" in violations[0].message
         assert "backend" in violations[0].message
+        assert violations[0].severity == Severity.HIGH
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/rules/conditions/test_access_control_advanced.py` around lines 37
- 52, The test
TestCrossTeamApprovalCondition.test_evaluate_returns_violations_when_missing_teams
should also assert the violation severity; update the test to check that the
returned violation(s) have severity Severity.HIGH (import or reference Severity
and assert violations[0].severity == Severity.HIGH) alongside the existing
message and length assertions so the test verifies severity for
CrossTeamApprovalCondition.
src/rules/conditions/pull_request.py (2)

370-428: Consider consolidating DiffPatternCondition and SecurityPatternCondition to reduce duplication.

Both conditions follow the same pattern: iterate changed files, extract patch, call match_patterns_in_patch, and create violations. The only differences are:

  • Parameter key (diff_restricted_patterns vs security_patterns)
  • Severity (MEDIUM vs CRITICAL)
  • Violation message text

This could be refactored into a single base class or a factory function with configurable severity and parameter key.

♻️ Example refactor using a base class
class PatternMatchCondition(BaseCondition):
    """Base class for pattern-matching conditions on diffs."""
    
    pattern_param_key: str = ""
    violation_severity: Severity = Severity.MEDIUM
    
    async def evaluate(self, context: Any) -> list[Violation]:
        parameters = context.get("parameters", {})
        event = context.get("event", {})

        patterns = parameters.get(self.pattern_param_key)
        if not patterns or not isinstance(patterns, list):
            return []

        changed_files = event.get("changed_files", [])
        if not changed_files:
            return []

        from src.rules.utils.diff import match_patterns_in_patch

        violations = []
        for file_info in changed_files:
            patch = file_info.get("patch")
            if not patch:
                continue

            matched = match_patterns_in_patch(patch, patterns)
            if matched:
                filename = file_info.get("filename", "unknown")
                violations.append(self._create_violation(matched, filename))

        return violations
    
    def _create_violation(self, matched: list[str], filename: str) -> Violation:
        raise NotImplementedError


class DiffPatternCondition(PatternMatchCondition):
    name = "diff_pattern"
    pattern_param_key = "diff_restricted_patterns"
    violation_severity = Severity.MEDIUM
    # ...


class SecurityPatternCondition(PatternMatchCondition):
    name = "security_pattern"
    pattern_param_key = "security_patterns"
    violation_severity = Severity.CRITICAL
    # ...

Also applies to: 431-491

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/conditions/pull_request.py` around lines 370 - 428,
DiffPatternCondition duplicates logic found in SecurityPatternCondition: both
iterate changed_files, extract patch, call match_patterns_in_patch and build
Violation objects; refactor by introducing a shared base like
PatternMatchCondition (or a factory) that defines pattern_param_key,
violation_severity and a common evaluate that uses match_patterns_in_patch and
constructs violations via an overridable _create_violation method; then make
DiffPatternCondition and SecurityPatternCondition subclasses that only set name,
pattern_param_key (diff_restricted_patterns vs security_patterns),
violation_severity (Severity.MEDIUM vs Severity.CRITICAL) and implement
_create_violation to customize the message text so duplicate iteration and
matching code is removed.

392-392: Repeated inline import of match_patterns_in_patch.

The same import from src.rules.utils.diff import match_patterns_in_patch appears 4 times across different methods. While lazy imports are acceptable per guidelines, consider moving this to the module level since diff.py is lightweight and match_patterns_in_patch is used in multiple code paths.

♻️ Move import to module level
 import logging
 import re
 from typing import Any

 from src.core.models import Severity, Violation
 from src.rules.conditions.base import BaseCondition
+from src.rules.utils.diff import match_patterns_in_patch

 logger = logging.getLogger(__name__)

Then remove the inline imports on lines 392, 421, 453, and 484.

Also applies to: 421-421, 453-453, 484-484

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/conditions/pull_request.py` at line 392, Move the repeated inline
import of match_patterns_in_patch to the top of the module and remove the
duplicate inline imports inside the methods that currently call it; specifically
add "from src.rules.utils.diff import match_patterns_in_patch" at module level
of pull_request.py and delete the four inline imports found where
match_patterns_in_patch is invoked (the import currently repeated inside the
methods that call match_patterns_in_patch). This consolidates the import while
leaving all usages (calls to match_patterns_in_patch) unchanged.
src/rules/utils/diff.py (1)

71-75: Consider logging a warning for invalid regex patterns.

Invalid regex patterns are silently skipped, which could make debugging difficult when patterns don't match as expected.

♻️ Proposed change to add logging
+import logging
+
+logger = logging.getLogger(__name__)
+
 # ... in match_patterns_in_patch:
     for p in patterns:
         try:
             compiled_patterns.append((p, re.compile(p)))
         except re.error:
+            logger.warning(f"Invalid regex pattern skipped: {p}")
             continue
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/utils/diff.py` around lines 71 - 75, Invalid regexes are currently
skipped silently in the loop that builds compiled_patterns from patterns; modify
the try/except around re.compile(p) to log a warning when re.error is caught. In
the except block for re.error, call the module logger (e.g., logger.warning or
logger.warn) with a clear message that includes the offending pattern p and the
exception text, then continue appending only successful (p, re.compile(p))
entries to compiled_patterns; ensure logger is imported/defined in the module if
not already.
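A minimal sketch of what the fix could look like (names assumed; the real code lives in `src/rules/utils/diff.py` and builds `(pattern, compiled)` tuples inline rather than in a helper):

```python
import logging
import re

logger = logging.getLogger(__name__)


def compile_patterns(patterns: list[str]) -> list[tuple[str, re.Pattern[str]]]:
    """Compile regex patterns, logging (not silently dropping) invalid ones."""
    compiled: list[tuple[str, re.Pattern[str]]] = []
    for p in patterns:
        try:
            compiled.append((p, re.compile(p)))
        except re.error as e:
            # Surface the bad pattern so misconfigured rules are debuggable.
            logger.warning("Invalid regex pattern skipped: %s (%s)", p, e)
    return compiled
```

The `%s`-style arguments keep the message lazily formatted and closer to the repo's structured-logging guideline than an f-string.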
src/rules/conditions/access_control_advanced.py (1)

9-9: Unused logger variable.

logger is defined but never used in this module. Either add logging for significant operations or remove the import and variable.

♻️ Remove unused logger
-import logging
 from typing import Any
 
 from src.core.models import Severity, Violation
 from src.rules.conditions.base import BaseCondition
-
-logger = logging.getLogger(__name__)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/conditions/access_control_advanced.py` at line 9, The module
defines an unused logger variable ("logger = logging.getLogger(__name__)") which
should be removed or used; either delete the "import logging" statement and the
"logger" assignment to eliminate dead code, or add meaningful log statements
using the "logger" at key points (e.g., entry/exit or error paths of functions
in this module) so the logger is actually referenced.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/enterprise-rules-roadmap.md`:
- Around line 1-92: The docs file lists many new rule types (e.g.,
SignedCommitsCondition, SecretScanningCondition, BannedDependenciesCondition,
CrossTeamApprovalCondition, NoSelfApprovalCondition, MigrationSafetyCondition,
FeatureFlagRequiredCondition, JiraTicketStatusCondition,
CodeQLAnalysisCondition, DependabotAlertsCondition) but lacks runnable YAML
examples and migration cross-links; add 1–2 minimal, executable YAML snippets
per highlighted rule (showing parameters like require_signed_commits: true,
block_on_secret_alerts: true, banned_licenses/banned_packages,
required_team_approvals, safe_migrations_only, require_active_jira_ticket,
block_on_critical_codeql, max_dependabot_severity), and add explicit cross-links
from each rule header to the canonical README/docs migration/usage sections and
an examples directory, plus update the docs README to list these new example
files and a short “migration notes” section; ensure symbols above are used as
section headings and link targets so users can run the examples directly.

In `@src/rules/conditions/access_control_advanced.py`:
- Around line 90-100: The current approval check inside the loop that uses
clean_team, requested_team_slugs, reviews and sets missing_teams/req_team is
overly broad because has_approval uses any approval from any reviewer; update
the logic to note this limitation by adding a clear TODO comment and/or a
tracking issue reference above the block (near the conditional that computes
has_approval) stating that approvals must be validated against team membership
via the GitHub GraphQL API (i.e., ensure the approving reviewer is a member of
clean_team) and include a ticket/issue ID or link for the follow-up work so
future maintainers know to replace the simplified any(...) check with a proper
team-membership lookup.
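For the follow-up itself, the membership-aware check could look like this sketch (the `team_members` set and the review dict shape are assumptions; membership would be fetched via the GitHub GraphQL `organization.team.members` connection):

```python
def has_team_approval(reviews: list[dict], team_members: set[str]) -> bool:
    """True if at least one APPROVED review comes from a member of the team.

    ``team_members`` is assumed to be fetched separately via the GitHub
    GraphQL API, replacing the overly broad any-approval check.
    """
    return any(
        r.get("state") == "APPROVED"
        and r.get("user", {}).get("login") in team_members
        for r in reviews
    )
```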

In `@src/rules/conditions/filesystem.py`:
- Around line 321-326: In evaluate() and validate() the set of ignored
extensions is inconsistent (evaluate() checks .md, .txt, .yaml, .json while
validate() omits .txt and .json); modify the code so both paths use the same
rule by extracting the extension list into a shared constant or helper (e.g.,
IGNORED_FILE_EXTS or is_ignored_file(filename)) and replace the inline checks in
evaluate() and validate() to call that helper, ensuring .md, .txt, .yaml, and
.json are consistently ignored.
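The shared helper could be as small as this sketch (constant and function names are suggestions, not the module's current API):

```python
from pathlib import PurePosixPath

# Single source of truth shared by evaluate() and validate().
IGNORED_FILE_EXTS: frozenset[str] = frozenset({".md", ".txt", ".yaml", ".json"})


def is_ignored_file(filename: str) -> bool:
    """Return True for documentation/config files exempt from the coverage rule."""
    return PurePosixPath(filename).suffix.lower() in IGNORED_FILE_EXTS
```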
- Around line 309-314: The current handling of an invalid test_file_pattern
swallows the error and returns a permissive success (empty list/True); instead,
update both regex-compile branches (where re.compile(test_pattern) is used and
the except re.error currently logs then returns []) to fail closed: capture the
exception, log the full error, and return a clear remediation result (e.g., a
failure entry or raise a ValidationError) that includes "Invalid
test_file_pattern regex" and the exception message so the rule enforcement
stops; look for the occurrences around the compiled_pattern / test_file_pattern
compile blocks and replace the permissive return with a failing response that
surfaces remediation guidance.
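A fail-closed version of that branch might look like the sketch below; plain dicts stand in for the project's `Violation` model, and the function name is illustrative:

```python
import re


def evaluate_test_pattern(test_pattern: str, filenames: list[str]) -> list[dict]:
    """Fail closed: an invalid regex yields a violation instead of a permissive pass."""
    try:
        compiled = re.compile(test_pattern)
    except re.error as e:
        # Surfacing the error stops rule enforcement rather than skipping it.
        return [{"message": f"Invalid test_file_pattern regex: {e}",
                 "remediation": "Fix the test_file_pattern in the rule configuration."}]
    if not any(compiled.search(f) for f in filenames):
        return [{"message": f"No file matches test_file_pattern {test_pattern!r}"}]
    return []
```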

In `@src/rules/conditions/temporal.py`:
- Around line 263-265: The current falsy check "if not max_hours" will treat 0
as absent; change the guard that checks
parameters.get("max_comment_response_time_hours") so it only returns early when
the parameter is truly missing (e.g., check "if max_hours is None" or "if
max_hours is None or max_hours == ''") rather than using a generic falsy check;
update the block around the
parameters.get("max_comment_response_time_hours")/max_hours variable to allow
max_hours == 0 to proceed into evaluation.

In `@src/rules/utils/diff.py`:
- Around line 29-47: The function extract_removed_lines is dead code; either
remove it or mark intent—either delete the extract_removed_lines function
entirely, or keep it but add a clear doc/comment stating it's intentionally
unused (e.g., reserved for future symmetric processing) and update any module
exports if applicable; reference extract_removed_lines, its symmetric partner
extract_added_lines, and the consumer match_patterns_in_patch to locate the
related logic so reviewers understand the change.

In `@tests/unit/rules/conditions/test_temporal_sla.py`:
- Around line 12-13: Remove trailing whitespace from the test file so pre-commit
passes: delete the extra space at the end of the line defining past_time
(variable past_time = now - timedelta(hours=25)) and any trailing spaces on the
other affected test lines (around the assertions or datetime lines referenced
near lines 34-35 and 54-55); then run pre-commit/flake8 locally to confirm no
trailing whitespace remains.

---

Nitpick comments:
In `@src/integrations/github/api.py`:
- Around line 520-524: The GraphQL selection uses hardcoded reviewThreads(first:
50) and comments(first: 10) which silently truncates large PRs; modify the query
in src/integrations/github/api.py to support cursor-based pagination by adding
pageInfo { hasNextPage endCursor } on reviewThreads and comments and
exposing/accepting after cursors, then implement iterative fetching in the
function that executes the query (e.g., the pull request fetcher /
get_pull_request / fetch_pull_request_details method) to loop using endCursor
until pageInfo.hasNextPage is false (or parameterize the limits and clearly
document them if you choose not to paginate). Ensure you propagate cursors into
subsequent queries and concatenate nodes from reviewThreads and comments instead
of replacing them.

In `@src/integrations/github/graphql.py`:
- Around line 59-61: The log in the except block that calls
logger.error("graphql_validation_failed", error=str(e), data=data) must not emit
raw request payloads; replace data with a sanitized/truncated version before
logging. Implement or reuse a helper like
sanitize_for_logging(redact_sensitive_fields) to remove/replace known
secrets/PII keys (e.g., token, password, access_token, authorization, email,
name) and apply truncation for large string/blob values, then pass the
sanitized_data to logger.error instead of data; ensure this change is made in
the except ValidationError as e handler where logger.error is invoked.
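One possible shape for that helper (key list and truncation limit are illustrative defaults, not existing project constants):

```python
SENSITIVE_KEYS = {"token", "password", "access_token", "authorization", "email", "name"}
MAX_VALUE_LEN = 200


def sanitize_for_logging(data: object) -> object:
    """Redact known secret/PII keys and truncate long values before logging."""
    if isinstance(data, dict):
        return {
            k: "[REDACTED]" if k.lower() in SENSITIVE_KEYS else sanitize_for_logging(v)
            for k, v in data.items()
        }
    if isinstance(data, list):
        return [sanitize_for_logging(v) for v in data]
    if isinstance(data, str) and len(data) > MAX_VALUE_LEN:
        return data[:MAX_VALUE_LEN] + "...[truncated]"
    return data
```

The handler would then call `logger.error("graphql_validation_failed", error=str(e), data=sanitize_for_logging(data))`.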

In `@src/main.py`:
- Around line 28-29: The codebase mixes function-style handlers
(handle_pull_request_review, handle_pull_request_review_thread) with class-based
handlers; wrap each functional handler in a small class (e.g.,
PullRequestReviewEventHandler and PullRequestReviewThreadEventHandler) that
implements a handle(self, event) method which delegates to the existing
functions, then register instances the same way other handlers are registered
(matching the pattern used by PullRequestEventHandler().handle); apply the same
change for the other occurrence of these imports/registrations noted in the file
so all handlers follow the class-based pattern for consistency.

In `@src/rules/acknowledgment.py`:
- Line 61: The mapping key "exceeded the" in src/rules/acknowledgment.py is too
generic for RuleID.COMMENT_RESPONSE_TIME; replace it with a more specific
substring such as "response time exceeded" or "SLA timeframe" (or update the
violation text emitted by CommentResponseTimeCondition to include a distinctive
phrase) so the lookup for RuleID.COMMENT_RESPONSE_TIME only matches
CommentResponseTimeCondition violations and avoids false positives.

In `@src/rules/conditions/access_control_advanced.py`:
- Line 9: The module defines an unused logger variable ("logger =
logging.getLogger(__name__)") which should be removed or used; either delete the
"import logging" statement and the "logger" assignment to eliminate dead code,
or add meaningful log statements using the "logger" at key points (e.g.,
entry/exit or error paths of functions in this module) so the logger is actually
referenced.

In `@src/rules/conditions/compliance.py`:
- Around line 54-56: The two validate method overrides in the condition classes
(async def validate(self, parameters: dict[str, Any], event: dict[str, Any]) ->
bool) are redundant because they simply call self.evaluate and return
len(violations) == 0 exactly like BaseCondition.validate; delete these validate
methods from the condition classes so they inherit BaseCondition.validate (refer
to BaseCondition.validate and the evaluate method used) to eliminate
duplication.
- Around line 30-32: The early-return that ignores empty commit lists (the line
with commits = event.get("commits", []) and if not commits: return []) should be
changed to handle the case where require_signed_commits is True: instead of
silently returning an empty result when commits is empty, either (A) return a
violation indicating "commit data unavailable for verification" so the rule
fails when require_signed_commits is enabled, or (B) log a warning/error so
operators can investigate enrichment issues; implement this by replacing the
empty-list return with a check on the require_signed_commits flag (referencing
commits and require_signed_commits) and then producing the appropriate violation
object or logging call before exiting.
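Option (A) could look like this sketch; dicts stand in for `Violation` objects, and the `verified`/`sha` commit fields are assumptions about the enriched event shape:

```python
def evaluate_signed_commits(parameters: dict, event: dict) -> list[dict]:
    """Fail closed when signed commits are required but commit data is missing."""
    require_signed = parameters.get("require_signed_commits", False)
    commits = event.get("commits", [])
    if not commits:
        if require_signed:
            # Missing enrichment must not silently pass the rule.
            return [{"message": "commit data unavailable for verification"}]
        return []
    return [
        {"message": f"Unsigned commit: {c.get('sha', '?')}"}
        for c in commits
        if not c.get("verified", False)
    ]
```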

In `@src/rules/conditions/pull_request.py`:
- Around line 370-428: DiffPatternCondition duplicates logic found in
SecurityPatternCondition: both iterate changed_files, extract patch, call
match_patterns_in_patch and build Violation objects; refactor by introducing a
shared base like PatternMatchCondition (or a factory) that defines
pattern_param_key, violation_severity and a common evaluate that uses
match_patterns_in_patch and constructs violations via an overridable
_create_violation method; then make DiffPatternCondition and
SecurityPatternCondition subclasses that only set name, pattern_param_key
(diff_restricted_patterns vs security_patterns), violation_severity
(Severity.MEDIUM vs Severity.CRITICAL) and implement _create_violation to
customize the message text so duplicate iteration and matching code is removed.
- Line 392: Move the repeated inline import of match_patterns_in_patch to the
top of the module and remove the duplicate inline imports inside the methods
that currently call it; specifically add "from src.rules.utils.diff import
match_patterns_in_patch" at module level of pull_request.py and delete the four
inline imports found where match_patterns_in_patch is invoked (the import
currently repeated inside the methods that call match_patterns_in_patch). This
consolidates the import while leaving all usages (calls to
match_patterns_in_patch) unchanged.
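The proposed base-class refactor could be sketched as follows; the inline `match_patterns_in_patch` is a simplified stand-in for `src/rules/utils/diff.py`, and severities are plain strings rather than the project's `Severity` enum:

```python
import re


def match_patterns_in_patch(patch: str, patterns: list[str]) -> list[str]:
    # Stand-in for src.rules.utils.diff.match_patterns_in_patch.
    return [p for p in patterns if re.search(p, patch)]


class PatternMatchCondition:
    """Shared evaluate loop; subclasses only set the parameter key and severity."""

    pattern_param_key = "patterns"
    severity = "medium"

    def _create_violation(self, filename: str, pattern: str) -> dict:
        # Overridable hook for subclass-specific message text.
        return {"file": filename, "pattern": pattern, "severity": self.severity}

    def evaluate(self, parameters: dict, event: dict) -> list[dict]:
        patterns = parameters.get(self.pattern_param_key, [])
        violations: list[dict] = []
        for f in event.get("changed_files", []):
            for matched in match_patterns_in_patch(f.get("patch", ""), patterns):
                violations.append(self._create_violation(f.get("filename", ""), matched))
        return violations


class DiffPatternCondition(PatternMatchCondition):
    pattern_param_key = "diff_restricted_patterns"
    severity = "medium"


class SecurityPatternCondition(PatternMatchCondition):
    pattern_param_key = "security_patterns"
    severity = "critical"
```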

In `@src/rules/utils/diff.py`:
- Around line 71-75: Invalid regexes are currently skipped silently in the loop
that builds compiled_patterns from patterns; modify the try/except around
re.compile(p) to log a warning when re.error is caught. In the except block for
re.error, call the module logger (e.g., logger.warning or logger.warn) with a
clear message that includes the offending pattern p and the exception text, then
continue appending only successful (p, re.compile(p)) entries to
compiled_patterns; ensure logger is imported/defined in the module if not
already.

In `@src/webhooks/handlers/pull_request_review_thread.py`:
- Line 22: Replace the f-string log call so the action is logged as a structured
field instead of interpolated text: find the logger.info call that currently
reads logger.info(f"Ignoring pull_request_review_thread action: {action}") in
the pull_request_review_thread handler and change it to emit the message string
with the action as a keyword argument (e.g., logger.info("Ignoring
pull_request_review_thread action", action=action)) so downstream log systems
can index/query the action field.
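With stdlib logging the same effect is achieved via `extra` (the structlog-style keyword form in the comment above assumes the project's logger supports it); a sketch:

```python
import logging

logger = logging.getLogger("webhooks.pull_request_review_thread")


def log_ignored_action(action: str) -> None:
    # Emit the action as a structured field rather than interpolating it
    # into the message, so log pipelines can index and query it.
    logger.info("Ignoring pull_request_review_thread action", extra={"action": action})
```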

In `@src/webhooks/handlers/pull_request_review.py`:
- Around line 21-22: Replace the f-string logger.info call in the
pull_request_review handler with structured logging that emits fields instead of
interpolated text: use logger.info(..., extra={...}) or logger.info with keyword
fields to include operation="pull_request_review:ignore", subject_ids (use the
PR/review IDs available in the handler context or None/empty list if not
present), decision="skipped", and latency_ms (calculate or set to 0 if not
measured); keep the return value as-is. Locate the logger.info call currently
using f"Ignoring pull_request_review action: {action}" and update it to log
these structured fields (operation, subject_ids, decision, latency_ms) while
preserving the return {"status":"skipped","reason":...}.
- Line 11: The handler currently types its return as dict[str, Any]; switch it
to return the WebhookResponse model for type safety by importing WebhookResponse
from src.core.models, change the function signature of
handle_pull_request_review to -> WebhookResponse, and replace literal dict
returns like {"status": "skipped", "reason": ...} with
WebhookResponse(status="skipped", reason="...") (and similarly for any other
return paths) so the model validates structure and types.

In `@tests/unit/rules/conditions/test_access_control_advanced.py`:
- Around line 37-52: The test
TestCrossTeamApprovalCondition.test_evaluate_returns_violations_when_missing_teams
should also assert the violation severity; update the test to check that the
returned violation(s) have severity Severity.HIGH (import or reference Severity
and assert violations[0].severity == Severity.HIGH) alongside the existing
message and length assertions so the test verifies severity for
CrossTeamApprovalCondition.

In `@tests/unit/rules/conditions/test_access_control.py`:
- Line 11: The suppressed unused import "src.rules.utils.codeowners  # noqa:
F401" lacks explanation; either remove the import if it serves no purpose, or
keep it and add an inline comment clarifying its side-effect (e.g., module-level
registration/fixtures) so the noqa is justified; update the import line for the
symbol src.rules.utils.codeowners accordingly to include a short comment like "#
imported for side-effects: <what it registers>" or delete the import if it's
unused.

In `@tests/unit/rules/test_acknowledgment.py`:
- Around line 160-162: Add parametrized test cases to
tests/unit/rules/test_acknowledgment.py to cover the two missing RuleID
mappings: include a case that expects RuleID.TEST_COVERAGE when the violation
text contains "without corresponding test changes" and a case that expects
RuleID.COMMENT_RESPONSE_TIME when the violation text contains "exceeded the";
update the same parameter list or the test function that currently asserts
mappings for RuleID.DIFF_PATTERN, RuleID.SECURITY_PATTERN, and
RuleID.UNRESOLVED_COMMENTS so it also asserts these two new tuples (i.e., add
("without corresponding test changes", RuleID.TEST_COVERAGE) and ("exceeded
the", RuleID.COMMENT_RESPONSE_TIME)) ensuring the test iterates those inputs and
verifies the mapping logic.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 60efa70 and 797e531.

📒 Files selected for processing (28)
  • docs/enterprise-rules-roadmap.md
  • src/core/models.py
  • src/event_processors/pull_request/enricher.py
  • src/integrations/github/api.py
  • src/integrations/github/graphql.py
  • src/integrations/github/graphql_client.py
  • src/integrations/github/models.py
  • src/main.py
  • src/rules/acknowledgment.py
  • src/rules/conditions/__init__.py
  • src/rules/conditions/access_control.py
  • src/rules/conditions/access_control_advanced.py
  • src/rules/conditions/compliance.py
  • src/rules/conditions/filesystem.py
  • src/rules/conditions/pull_request.py
  • src/rules/conditions/temporal.py
  • src/rules/registry.py
  • src/rules/utils/__init__.py
  • src/rules/utils/codeowners.py
  • src/rules/utils/diff.py
  • src/webhooks/handlers/pull_request_review.py
  • src/webhooks/handlers/pull_request_review_thread.py
  • tests/unit/rules/conditions/test_access_control.py
  • tests/unit/rules/conditions/test_access_control_advanced.py
  • tests/unit/rules/conditions/test_compliance.py
  • tests/unit/rules/conditions/test_pull_request.py
  • tests/unit/rules/conditions/test_temporal_sla.py
  • tests/unit/rules/test_acknowledgment.py
💤 Files with no reviewable changes (1)
  • src/integrations/github/graphql_client.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/guidelines.mdc)

**/*.py: Use modern typing only: dict[str, Any], list[str], str | None (no Dict, List, Optional)
GitHub/HTTP/DB calls must be async def; avoid blocking calls (time.sleep, sync HTTP) in async paths
All agent outputs and external payloads must use validated BaseModel from Pydantic
Use dataclasses for internal immutable state where appropriate
Use structured logging at boundaries with fields: operation, subject_ids, decision, latency_ms
Implement Agent pattern: single-responsibility agents with typed inputs/outputs
Use Decorator pattern for retries, metrics, caching as cross-cutting concerns
Agent outputs must include: decision, confidence (0..1), short reasoning, recommendations, strategy_used
Implement confidence policy: reject or route to human-in-the-loop when confidence < 0.5
Use minimal, step-driven prompts; provide Chain-of-Thought only for complexity > 0.7 or ambiguity > 0.6
Strip secrets/PII from agent prompts; scope tools; keep raw reasoning out of logs (store summaries only)
Cache idempotent lookups; lazy-import heavy dependencies; bound fan-out with asyncio.Semaphore
Avoid redundant LLM calls; memoize per event when safe
Use domain errors (e.g., AgentError) with error_type, message, context, timestamp, retry_count
Use exponential backoff for transient failures; circuit-break noisy integrations when needed
Fail closed for risky decisions; provide actionable remediation in error paths
Validate all external inputs; verify webhook signatures
Implement prompt-injection hardening; sanitize repository content passed to LLMs
Performance targets: Static validation ~<100ms typical, hybrid decisions sub-second when cache warm, budget LLM paths thoughtfully
Reject old typing syntax (Dict, List, Optional) in code review
Reject blocking calls in async code; reject bare except: clauses; reject swallowed errors
Reject LLM calls for trivial/deterministic checks
Reject unvalidated agent outputs and missing confidenc...

Files:

  • src/rules/conditions/temporal.py
  • tests/unit/rules/conditions/test_compliance.py
  • src/rules/utils/diff.py
  • src/core/models.py
  • tests/unit/rules/conditions/test_access_control_advanced.py
  • src/event_processors/pull_request/enricher.py
  • src/main.py
  • src/integrations/github/graphql.py
  • src/rules/conditions/__init__.py
  • tests/unit/rules/conditions/test_temporal_sla.py
  • src/webhooks/handlers/pull_request_review_thread.py
  • src/rules/conditions/pull_request.py
  • src/integrations/github/api.py
  • src/webhooks/handlers/pull_request_review.py
  • tests/unit/rules/conditions/test_pull_request.py
  • tests/unit/rules/conditions/test_access_control.py
  • src/rules/conditions/filesystem.py
  • src/rules/conditions/compliance.py
  • src/rules/registry.py
  • src/rules/acknowledgment.py
  • src/rules/conditions/access_control_advanced.py
  • src/rules/conditions/access_control.py
  • tests/unit/rules/test_acknowledgment.py
  • src/integrations/github/models.py
  • src/rules/utils/__init__.py
  • src/rules/utils/codeowners.py
tests/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/guidelines.mdc)

tests/**/*.py: Write unit tests for deterministic rule evaluation (pass/warn/block), model validation, and error paths
Write integration tests for webhook parsing, idempotency, multi-agent coordination, and state persistence
Use pytest.mark.asyncio for async tests; avoid live network calls; freeze time and seed randomness
Write regression tests for every bug fix; keep CI coverage thresholds green

Files:

  • tests/unit/rules/conditions/test_compliance.py
  • tests/unit/rules/conditions/test_access_control_advanced.py
  • tests/unit/rules/conditions/test_temporal_sla.py
  • tests/unit/rules/conditions/test_pull_request.py
  • tests/unit/rules/conditions/test_access_control.py
  • tests/unit/rules/test_acknowledgment.py
docs/**

📄 CodeRabbit inference engine (.cursor/rules/guidelines.mdc)

docs/**: Update README and docs for user-visible changes and migrations
Provide runnable examples for new/changed rules; keep cross-links current

Files:

  • docs/enterprise-rules-roadmap.md
🧠 Learnings (1)
📚 Learning: 2026-01-31T19:35:22.504Z
Learnt from: CR
Repo: warestack/watchflow PR: 0
File: .cursor/rules/guidelines.mdc:0-0
Timestamp: 2026-01-31T19:35:22.504Z
Learning: Applies to tests/**/*.py : Write unit tests for deterministic rule evaluation (pass/warn/block), model validation, and error paths

Applied to files:

  • tests/unit/rules/conditions/test_compliance.py
  • tests/unit/rules/conditions/test_access_control_advanced.py
  • tests/unit/rules/conditions/test_pull_request.py
🧬 Code graph analysis (17)
src/rules/conditions/temporal.py (2)
src/rules/conditions/base.py (1)
  • BaseCondition (15-74)
src/core/models.py (2)
  • Violation (20-31)
  • Severity (8-17)
tests/unit/rules/conditions/test_compliance.py (2)
src/core/models.py (1)
  • Severity (8-17)
src/rules/conditions/compliance.py (2)
  • ChangelogRequiredCondition (59-107)
  • SignedCommitsCondition (11-56)
tests/unit/rules/conditions/test_access_control_advanced.py (2)
src/core/models.py (1)
  • Severity (8-17)
src/rules/conditions/access_control_advanced.py (2)
  • NoSelfApprovalCondition (11-56)
  • CrossTeamApprovalCondition (59-116)
src/event_processors/pull_request/enricher.py (1)
src/integrations/github/api.py (1)
  • get_pull_request_review_threads (501-555)
src/integrations/github/graphql.py (1)
src/integrations/github/models.py (1)
  • GraphQLResponse (116-120)
tests/unit/rules/conditions/test_temporal_sla.py (2)
src/core/models.py (1)
  • Severity (8-17)
src/rules/conditions/temporal.py (5)
  • CommentResponseTimeCondition (249-331)
  • evaluate (31-55)
  • evaluate (77-118)
  • evaluate (162-213)
  • evaluate (258-326)
src/webhooks/handlers/pull_request_review_thread.py (2)
src/core/models.py (1)
  • WebhookEvent (118-146)
src/webhooks/handlers/issue_comment.py (1)
  • event_type (18-19)
src/integrations/github/api.py (1)
src/integrations/github/graphql.py (2)
  • GitHubGraphQLClient (12-61)
  • execute_query_typed (52-61)
src/webhooks/handlers/pull_request_review.py (2)
src/core/models.py (1)
  • WebhookEvent (118-146)
src/webhooks/handlers/issue_comment.py (1)
  • event_type (18-19)
tests/unit/rules/conditions/test_pull_request.py (1)
src/rules/conditions/pull_request.py (19)
  • DiffPatternCondition (370-428)
  • SecurityPatternCondition (431-491)
  • UnresolvedCommentsCondition (494-560)
  • evaluate (26-75)
  • evaluate (109-156)
  • evaluate (189-232)
  • evaluate (266-288)
  • evaluate (319-353)
  • evaluate (379-412)
  • evaluate (440-475)
  • evaluate (503-540)
  • validate (77-97)
  • validate (158-177)
  • validate (234-254)
  • validate (290-300)
  • validate (355-367)
  • validate (414-428)
  • validate (477-491)
  • validate (542-560)
src/rules/conditions/filesystem.py (2)
src/rules/conditions/base.py (1)
  • BaseCondition (15-74)
src/core/models.py (2)
  • Violation (20-31)
  • Severity (8-17)
src/rules/conditions/compliance.py (2)
src/core/models.py (2)
  • Severity (8-17)
  • Violation (20-31)
src/rules/conditions/base.py (1)
  • BaseCondition (15-74)
src/rules/registry.py (5)
src/rules/conditions/filesystem.py (1)
  • TestCoverageCondition (282-376)
src/rules/conditions/pull_request.py (7)
  • DiffPatternCondition (370-428)
  • MinApprovalsCondition (257-300)
  • RequiredLabelsCondition (180-254)
  • RequireLinkedIssueCondition (310-367)
  • SecurityPatternCondition (431-491)
  • TitlePatternCondition (17-97)
  • UnresolvedCommentsCondition (494-560)
src/rules/conditions/temporal.py (2)
  • AllowedHoursCondition (65-150)
  • CommentResponseTimeCondition (249-331)
src/rules/acknowledgment.py (1)
  • RuleID (20-41)
src/rules/conditions/workflow.py (1)
  • WorkflowDurationCondition (17-99)
src/rules/conditions/access_control_advanced.py (2)
src/core/models.py (2)
  • Severity (8-17)
  • Violation (20-31)
src/rules/conditions/base.py (1)
  • BaseCondition (15-74)
src/rules/conditions/access_control.py (1)
src/rules/utils/codeowners.py (1)
  • is_critical_file (196-222)
tests/unit/rules/test_acknowledgment.py (1)
src/rules/acknowledgment.py (1)
  • RuleID (20-41)
src/rules/utils/__init__.py (1)
src/rules/utils/codeowners.py (1)
  • CodeOwnersParser (14-160)
🪛 GitHub Actions: Run pre-commit hooks
tests/unit/rules/conditions/test_compliance.py

[error] 1-1: Trailing whitespace detected and removed by pre-commit (trailing-whitespace hook).

tests/unit/rules/conditions/test_access_control_advanced.py

[error] 1-1: Trailing whitespace detected and removed by pre-commit (trailing-whitespace hook).

tests/unit/rules/conditions/test_temporal_sla.py

[error] 1-1: Trailing whitespace detected and removed by pre-commit (trailing-whitespace hook).

src/rules/conditions/compliance.py

[error] 1-1: Trailing whitespace detected and removed by pre-commit (trailing-whitespace hook).

src/rules/conditions/access_control_advanced.py

[error] 1-1: Trailing whitespace detected and removed by pre-commit (trailing-whitespace hook).

Comment on lines +1 to +92
# Enterprise & Regulated Industry Guardrails

To level up Watchflow for large engineering teams and highly regulated industries (FinTech, HealthTech, Enterprise SaaS), we should expand our rule engine to support strict compliance, auditability, and advanced access control.

## 1. Compliance & Security Verification Rules

### `SignedCommitsCondition`
**Purpose:** Ensure all commits in a PR are signed (GPG/SSH/S/MIME).
**Why:** Required by SOC2, FedRAMP, and most enterprise security teams to prevent impersonation.
**Parameters:** `require_signed_commits: true`

### `SecretScanningCondition` (Enhanced)
**Purpose:** Integrate with GitHub Advanced Security or detect specific sensitive file extensions.
**Why:** Hardcoded secrets slipping into merges is a massive pain point. We built regex parsing, but we can add hooks to check whether GitHub's native secret scanner has raised alerts on the branch.
**Parameters:** `block_on_secret_alerts: true`

### `BannedDependenciesCondition`
**Purpose:** Parse `package.json`, `requirements.txt`, or `go.mod` diffs to block banned licenses (e.g., AGPL) or deprecated libraries.
**Why:** Open-source license compliance and CVE prevention.
**Parameters:** `banned_licenses: ["AGPL", "GPL"]`, `banned_packages: ["requests<2.0.0"]`

## 2. Advanced Access Control (Separation of Duties)

### `CrossTeamApprovalCondition`
**Purpose:** Require approvals from at least two different GitHub Teams.
**Why:** Regulated environments require "Separation of Duties" (e.g., a dev from `backend-team` and a dev from `qa-team` must both approve).
**Parameters:** `required_team_approvals: ["@org/backend", "@org/qa"]`

### `NoSelfApprovalCondition`
**Purpose:** Explicitly block PR authors from approving their own PRs (or using a secondary admin account to do so).
**Why:** Strict SOX/SOC2 requirement.
**Parameters:** `block_self_approval: true`

## 3. Operations & Reliability

### `MigrationSafetyCondition`
**Purpose:** If a PR modifies database schemas/migrations (e.g., `alembic/`, `prisma/migrations/`), enforce that it does *not* contain destructive operations like `DROP TABLE` or `DROP COLUMN`.
**Why:** Prevents junior devs from accidentally wiping production data.
**Parameters:** `safe_migrations_only: true`

### `FeatureFlagRequiredCondition`
**Purpose:** If a PR exceeds a certain size or modifies core routing, ensure a feature flag is added.
**Why:** Enables safe rollbacks and trunk-based development.
**Parameters:** `require_feature_flags_for_large_prs: true`

## 4. Documentation & Traceability

### `JiraTicketStatusCondition`
**Purpose:** Instead of just checking if a Jira ticket *exists* in the title, make an API call to Jira to ensure the ticket is in the "In Progress" or "In Review" state.
**Why:** Prevents devs from linking to closed, backlog, or fake tickets just to bypass the basic `RequireLinkedIssue` rule.
**Parameters:** `require_active_jira_ticket: true`

### `ChangelogRequiredCondition`
**Purpose:** If `src/` files change, require an addition to `CHANGELOG.md` or a `.changeset/` file.
**Why:** Maintains release notes for compliance audits automatically.
**Parameters:** `require_changelog_update: true`

## 5. Potential GitHub Ecosystem Integrations

To make Watchflow a true "single pane of glass" for governance, we can build custom condition handlers that hook directly into GitHub's native ecosystem.

### `CodeQLAnalysisCondition`
**Purpose:** Block merges if CodeQL (or other static analysis tools) has detected critical vulnerabilities in the PR diff.
**How to build:** Call the GitHub `code-scanning/alerts` API for the current `head_sha`.
**Why:** Instead of developers having to check multiple tabs, Watchflow summarizes the CodeQL alerts and makes them enforceable via YAML.
**Parameters:** `block_on_critical_codeql: true`

### `DependabotAlertsCondition`
**Purpose:** Ensure developers do not merge PRs that introduce new dependencies with known CVEs.
**How to build:** Hook into the `dependabot/alerts` REST API for the repository, filtering by the PR's branch.
**Why:** Shifting security left.
**Parameters:** `max_dependabot_severity: "high"`

## 6. Open-Source Ecosystem Integrations

We can leverage popular open-source Python SDKs directly within our rule engine to parse specific file types during the event evaluation.

### Open Policy Agent (OPA) / Rego Validation
**Purpose:** If a PR modifies `.rego` files or Kubernetes manifests, validate them against the OPA engine.
**How to build:** Embed the `opa` CLI or use the `PyOPA` library to evaluate the diff.
**Why:** Infrastructure-as-Code (IaC) teams need a way to ensure PRs don't introduce misconfigurations.

### Pydantic Schema Breakage Detection
**Purpose:** Detect backward-incompatible changes to REST API models.
**How to build:** If `models.py` changes, parse the old and new AST (Abstract Syntax Tree) to see if a required field was deleted or changed types.
**Why:** Breaking API contracts is a massive incident vector in enterprise microservices.

### Ruff / Black / ESLint Override Detection
**Purpose:** Flag PRs that introduce new `# noqa`, `# type: ignore`, or `// eslint-disable` comments.
**How to build:** Use our existing diff/patch parser to explicitly hunt for suppression comments in the added lines.
**Why:** Keeps technical debt from quietly slipping into the codebase.
**Parameters:** `allow_linter_suppressions: false`
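
The "hunt for suppression comments" step could be as simple as the sketch below; the pattern list and function name are illustrative, and the input is assumed to be the added lines already extracted by the diff parser:

```python
import re

# Suppression markers to hunt for in added lines (regexes are illustrative).
SUPPRESSION_PATTERNS = [
    re.compile(r"#\s*noqa\b"),
    re.compile(r"#\s*type:\s*ignore\b"),
    re.compile(r"//\s*eslint-disable"),
]


def find_suppressions(added_lines: list[str]) -> list[tuple[int, str]]:
    """Return (index, line) pairs for added lines carrying a suppression comment."""
    return [
        (i, line)
        for i, line in enumerate(added_lines)
        if any(p.search(line) for p in SUPPRESSION_PATTERNS)
    ]
```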

⚠️ Potential issue | 🟠 Major

Add runnable rule examples and migration cross-links.

This roadmap is useful, but for user-facing rule additions it should include at least a few executable YAML examples and links to the canonical README/docs sections describing migration/usage.

As per coding guidelines, "docs/**: Update README and docs for user-visible changes and migrations" and "Provide runnable examples for new/changed rules; keep cross-links current".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/enterprise-rules-roadmap.md` around lines 1 - 92, The docs file lists
many new rule types (e.g., SignedCommitsCondition, SecretScanningCondition,
BannedDependenciesCondition, CrossTeamApprovalCondition,
NoSelfApprovalCondition, MigrationSafetyCondition, FeatureFlagRequiredCondition,
JiraTicketStatusCondition, CodeQLAnalysisCondition, DependabotAlertsCondition)
but lacks runnable YAML examples and migration cross-links; add 1–2 minimal,
executable YAML snippets per highlighted rule (showing parameters like
require_signed_commits: true, block_on_secret_alerts: true,
banned_licenses/banned_packages, required_team_approvals, safe_migrations_only,
require_active_jira_ticket, block_on_critical_codeql, max_dependabot_severity),
and add explicit cross-links from each rule header to the canonical README/docs
migration/usage sections and an examples directory, plus update the docs README
to list these new example files and a short “migration notes” section; ensure
symbols above are used as section headings and link targets so users can run the
examples directly.

Comment on lines +90 to +100
if clean_team in requested_team_slugs:
    # Team was requested, now check if anyone approved (simplified check)
    has_approval = any(
        (r.get("state") == "APPROVED" if isinstance(r, dict) else getattr(r, "state", None) == "APPROVED")
        for r in reviews
    )
    if not has_approval:
        missing_teams.append(req_team)
else:
    # Team wasn't even requested
    missing_teams.append(req_team)

⚠️ Potential issue | 🟡 Minor

Team approval logic may produce false positives.

The current implementation checks if ANY approval exists when a team is requested, not if the approval came from a member of that specific team. As noted in the comment (lines 78-81), this is a simplified check, but it could incorrectly pass when:

  • Team A and Team B are both required
  • Only Team A member approved
  • This would still pass because has_approval is True for any approval

Consider adding a TODO or tracking issue for the proper GraphQL-based team membership check.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/conditions/access_control_advanced.py` around lines 90 - 100, The
current approval check inside the loop that uses clean_team,
requested_team_slugs, reviews and sets missing_teams/req_team is overly broad
because has_approval uses any approval from any reviewer; update the logic to
note this limitation by adding a clear TODO comment and/or a tracking issue
reference above the block (near the conditional that computes has_approval)
stating that approvals must be validated against team membership via the GitHub
GraphQL API (i.e., ensure the approving reviewer is a member of clean_team) and
include a ticket/issue ID or link for the follow-up work so future maintainers
know to replace the simplified any(...) check with a proper team-membership
lookup.
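
Until that follow-up lands, the membership-aware check could look like this minimal sketch; the review dict shape and the `team_members` mapping (e.g. hydrated from the GraphQL `organization.team.members` connection) are illustrative:

```python
def missing_team_approvals(
    required_teams: list[str],
    requested_team_slugs: set[str],
    reviews: list[dict],
    team_members: dict[str, set[str]],
) -> list[str]:
    """Return required teams that lack an approval from one of their own members.

    `team_members` maps a team slug to its member logins; shapes are
    illustrative, not the repository's actual data model.
    """
    # Logins of reviewers whose latest state is APPROVED.
    approvers = {
        r.get("user", {}).get("login")
        for r in reviews
        if r.get("state") == "APPROVED"
    }
    missing = []
    for team in required_teams:
        slug = team.split("/")[-1].lower()
        members = team_members.get(slug, set())
        # Fail the team if it was never requested OR none of its members approved.
        if slug not in requested_team_slugs or not (approvers & members):
            missing.append(team)
    return missing
```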

Comment on lines +309 to +314
try:
    compiled_pattern = re.compile(test_pattern)
except re.error:
    logger.error(f"Invalid test_file_pattern regex: {test_pattern}")
    return []


⚠️ Potential issue | 🟠 Major

Invalid regex currently bypasses enforcement (fail-open).

At Line 311-314 and Line 355-358, an invalid test_file_pattern silently passes the rule ([] / True). For a compliance gate, this should fail closed with a clear remediation message.

Proposed fix
@@
         try:
             compiled_pattern = re.compile(test_pattern)
         except re.error:
-            logger.error(f"Invalid test_file_pattern regex: {test_pattern}")
-            return []
+            logger.error(f"Invalid test_file_pattern regex: {test_pattern}")
+            return [
+                Violation(
+                    rule_description=self.description,
+                    severity=Severity.MEDIUM,
+                    message=f"Invalid test_file_pattern regex: '{test_pattern}'",
+                    how_to_fix="Provide a valid regular expression for 'test_file_pattern'.",
+                )
+            ]
@@
         try:
             compiled_pattern = re.compile(test_pattern)
         except re.error:
-            return True
+            return False

As per coding guidelines, "Fail closed for risky decisions; provide actionable remediation in error paths".

Also applies to: 355-358

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/conditions/filesystem.py` around lines 309 - 314, The current
handling of an invalid test_file_pattern swallows the error and returns a
permissive success (empty list/True); instead, update both regex-compile
branches (where re.compile(test_pattern) is used and the except re.error
currently logs then returns []) to fail closed: capture the exception, log the
full error, and return a clear remediation result (e.g., a failure entry or
raise a ValidationError) that includes "Invalid test_file_pattern regex" and the
exception message so the rule enforcement stops; look for the occurrences around
the compiled_pattern / test_file_pattern compile blocks and replace the
permissive return with a failing response that surfaces remediation guidance.

Comment on lines +321 to +326
if (
    filename.endswith(".md")
    or filename.endswith(".txt")
    or filename.endswith(".yaml")
    or filename.endswith(".json")
):

⚠️ Potential issue | 🟠 Major

evaluate() and validate() classify ignored files inconsistently.

Line 321-326 ignores .txt and .json, but Line 365-366 does not. This can make evaluate() pass while validate() fails for the same payload. Align the ignore rules in both paths.

Proposed fix
-            if not filename or filename.endswith(".md") or filename.endswith(".yaml"):
+            if not filename or filename.endswith((".md", ".txt", ".yaml", ".json")):
                 continue

Also applies to: 365-366

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/conditions/filesystem.py` around lines 321 - 326, In evaluate() and
validate() the set of ignored extensions is inconsistent (evaluate() checks .md,
.txt, .yaml, .json while validate() omits .txt and .json); modify the code so
both paths use the same rule by extracting the extension list into a shared
constant or helper (e.g., IGNORED_FILE_EXTS or is_ignored_file(filename)) and
replace the inline checks in evaluate() and validate() to call that helper,
ensuring .md, .txt, .yaml, and .json are consistently ignored.

Comment on lines +263 to +265
max_hours = parameters.get("max_comment_response_time_hours")
if not max_hours:
    return []

⚠️ Potential issue | 🟡 Minor

Edge case: max_hours=0 would skip evaluation due to falsy check.

If a user wants to enforce "immediate response" (0-hour SLA), max_hours=0 would be treated as falsy and skip evaluation entirely.

🐛 Proposed fix to handle zero-hour SLA
-        max_hours = parameters.get("max_comment_response_time_hours")
-        if not max_hours:
+        max_hours = parameters.get("max_comment_response_time_hours")
+        if max_hours is None:
             return []
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/conditions/temporal.py` around lines 263 - 265, The current falsy
check "if not max_hours" will treat 0 as absent; change the guard that checks
parameters.get("max_comment_response_time_hours") so it only returns early when
the parameter is truly missing (e.g., check "if max_hours is None" or "if
max_hours is None or max_hours == ''") rather than using a generic falsy check;
update the block around the
parameters.get("max_comment_response_time_hours")/max_hours variable to allow
max_hours == 0 to proceed into evaluation.

Comment on lines +29 to +47
def extract_removed_lines(patch: str) -> list[str]:
    """
    Extract lines that were removed in a patch.

    Args:
        patch: The unified diff patch string.

    Returns:
        A list of removed lines.
    """
    if not patch:
        return []

    removed_lines = []
    for line in patch.split("\n"):
        if line.startswith("-") and not line.startswith("---"):
            removed_lines.append(line[1:])

    return removed_lines

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Scripts executed (repository: warestack/watchflow):

rg -n "extract_removed_lines" --type py
head -n 50 src/rules/utils/diff.py
rg "extract_added_lines|from.*diff import|import.*diff" --type py -n
sed -n '48,75p' src/rules/utils/diff.py



extract_removed_lines is unused and should be removed or clearly documented.

The function has no references in the codebase. While its symmetric pair extract_added_lines is actively used in match_patterns_in_patch, extract_removed_lines is never called or imported. Either remove it or add a comment explaining the intent for future use.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/utils/diff.py` around lines 29 - 47, The function
extract_removed_lines is dead code; either remove it or mark intent—either
delete the extract_removed_lines function entirely, or keep it but add a clear
doc/comment stating it's intentionally unused (e.g., reserved for future
symmetric processing) and update any module exports if applicable; reference
extract_removed_lines, its symmetric partner extract_added_lines, and the
consumer match_patterns_in_patch to locate the related logic so reviewers
understand the change.

Comment on lines +12 to +13
past_time = now - timedelta(hours=25)


⚠️ Potential issue | 🟡 Minor

Fix trailing whitespace flagged by pre-commit.

The pipeline indicates trailing whitespace was detected. Ensure whitespace is removed to pass the pre-commit hooks.

🔧 Suggested fix
         past_time = now - timedelta(hours=25)
-        
+
         event = {

Apply similar fixes to lines 34-35 and 54-55 if they also have trailing whitespace.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/rules/conditions/test_temporal_sla.py` around lines 12 - 13,
Remove trailing whitespace from the test file so pre-commit passes: delete the
extra space at the end of the line defining past_time (variable past_time = now
- timedelta(hours=25)) and any trailing spaces on the other affected test lines
(around the assertions or datetime lines referenced near lines 34-35 and 54-55);
then run pre-commit/flake8 locally to confirm no trailing whitespace remains.

Successfully merging this pull request may close these issues.

feat: expand rule support for PR comments, diff analysis, and fix CODEOWNERS/contributor rules
