feat(mcp): optimize token consumption in MCP responses #128
nioasoft wants to merge 4 commits into AutoForgeAI:master from
Conversation
📝 Walkthrough

Adds token-efficient feature serialization and cycle-check helpers, and introduces a new public tool (`spec_get_summary`).

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Assistant as Assistant
    participant MCP_Server as MCP Server
    participant FileSystem as File System
    participant Database as Database

    Assistant->>MCP_Server: call spec_get_summary()
    MCP_Server->>FileSystem: read app_spec.txt
    FileSystem-->>MCP_Server: raw spec text
    MCP_Server->>MCP_Server: parse & condense (project_name, overview, stack, ports, feature_count)
    MCP_Server-->>Assistant: return JSON summary

    Assistant->>MCP_Server: call feature_get_ready(minimal=true)
    MCP_Server->>Database: query essential feature columns
    Database-->>MCP_Server: feature rows
    MCP_Server->>MCP_Server: serialize with to_minimal_dict()
    MCP_Server-->>Assistant: minimal feature list

    Note over MCP_Server,Database: Dependency updates use to_cycle_check_dict() for cycle checks
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
- Add to_minimal_dict() and to_cycle_check_dict() to Feature model
- Use minimal serialization for cycle detection (~95% token reduction)
- Add minimal parameter to feature_get_ready/blocked (default True)
- Optimize feature_get_graph to query only needed columns
- Add spec_get_summary MCP tool (~800 tokens vs 12,500 full)
- Implement progressive history summarization in assistant chat
- Update coding prompt to recommend new token-efficient tools

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Force-pushed from 97c9739 to 32dfc85
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
server/services/expand_chat_session.py (2)
42-46: Duplicate constant: `EXPAND_FEATURE_TOOLS` is unused.

Two identical constants are defined: `EXPAND_FEATURE_TOOLS` (lines 42-46) and `FEATURE_MCP_TOOLS` (lines 64-69). Only `FEATURE_MCP_TOOLS` is actually used (lines 180, 241). The `EXPAND_FEATURE_TOOLS` constant is dead code and should be removed.

🧹 Remove unused constant

```diff
-# Feature MCP tools needed for expand session
-EXPAND_FEATURE_TOOLS = [
-    "mcp__features__feature_create",
-    "mcp__features__feature_create_bulk",
-    "mcp__features__feature_get_stats",
-]
-
```

Also applies to: 64-69
219-229: Dead code: `mcp_servers` dict is defined but never used.

The `mcp_servers` dictionary defined at lines 219-229 is never referenced. The MCP configuration is now written to a file (lines 189-206) and the file path is passed to the SDK at line 243 (`mcp_servers=str(mcp_config_file)`). This appears to be leftover code from before the refactor to file-based MCP configuration.

🧹 Remove dead code

```diff
 # Determine model from environment or use default
 # This allows using alternative APIs (e.g., GLM via z.ai) that may not support Claude model names
 model = os.getenv("ANTHROPIC_DEFAULT_OPUS_MODEL", "claude-opus-4-5-20251101")

-# Build MCP servers config for feature creation
-mcp_servers = {
-    "features": {
-        "command": sys.executable,
-        "args": ["-m", "mcp_server.feature_mcp"],
-        "env": {
-            "PROJECT_DIR": str(self.project_dir.resolve()),
-            "PYTHONPATH": str(ROOT_DIR.resolve()),
-        },
-    },
-}
-
 # Create Claude SDK client
```

server/websocket.py (1)
172-208: Cleanup scheduling can be skipped when updates are emitted.

`process_line` returns early on agent start/complete and most state-change updates, so the `_should_cleanup()` block is often bypassed. Under steady output, stale agents may never be purged. Consider scheduling cleanup before any early return.

🛠️ Suggested fix to always schedule cleanup

```diff
@@
-    # Check for orchestrator status messages first
+    # Periodic cleanup of stale agents (every 5 minutes)
+    if self._should_cleanup():
+        asyncio.create_task(self.cleanup_stale_agents())
+
+    # Check for orchestrator status messages first
@@
-    # Periodic cleanup of stale agents (every 5 minutes)
-    if self._should_cleanup():
-        # Schedule cleanup without blocking
-        asyncio.create_task(self.cleanup_stale_agents())
     return None
```

server/main.py (1)
29-100: Add `SlowAPIMiddleware` to enforce the global rate limit.

The limiter instance and exception handler are configured, but without `SlowAPIMiddleware`, the `default_limits=["200/minute"]` will not be applied. SlowAPI enforces default limits only through middleware; without it, routes remain unlimited unless explicitly decorated with `@limiter.limit()`.

Add the middleware

```diff
-from slowapi import Limiter, _rate_limit_exceeded_handler
+from slowapi import Limiter, _rate_limit_exceeded_handler
+from slowapi.middleware import SlowAPIMiddleware
 from slowapi.errors import RateLimitExceeded
@@
 app.state.limiter = limiter
 app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)
+app.add_middleware(SlowAPIMiddleware)
```

mcp_server/feature_mcp.py (1)
1-34: Implement missing tools required by coding guidelines.

The Feature MCP server is missing three required tools from the coding guidelines: `feature_get_next`, `feature_claim_next`, and `feature_get_for_regression`. While `feature_claim_and_get` provides similar functionality to `feature_claim_next`, the other two tools have no alternatives in the current implementation.
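As an illustration of filling that gap, here is a hedged sketch of `feature_get_next`. It assumes this server's existing FastMCP instance (`mcp`), a `get_session()` helper, and the `Feature` columns referenced elsewhere in this review; none of these signatures are taken from the actual guidelines.

```python
# Hypothetical sketch only: `mcp`, `get_session`, and the exact ready-check
# are assumptions, not the repository's actual implementation.
@mcp.tool()
def feature_get_next() -> dict:
    """Return the highest-priority ready feature in minimal form."""
    session = get_session()
    try:
        passing_ids = {
            f.id
            for f in session.query(Feature).filter(Feature.passes.is_(True)).all()
        }
        candidates = (
            session.query(Feature)
            .filter(Feature.passes.is_(False), Feature.in_progress.is_(False))
            .order_by(Feature.priority, Feature.id)
            .all()
        )
        for feature in candidates:
            # Ready = every dependency already passing
            if all(dep in passing_ids for dep in (feature.dependencies or [])):
                return feature.to_minimal_dict()
        return {"error": "No ready features"}
    finally:
        session.close()
```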
🤖 Fix all issues with AI agents
In @.claude/templates/coding_prompt.template.md:
- Around line 310-320: The template incorrectly references the tool name
feature_get_summary (which doesn't exist) while the correct tool introduced in
this PR is spec_get_summary; update the TOKEN EFFICIENCY section to replace the
feature_get_summary mention with spec_get_summary (and verify nearby mentions of
feature_get_by_id remain unchanged), ensuring all mentions consistently use the
exact tool symbol spec_get_summary so coding agents aren't confused.
In `@api/feature_repository.py`:
- Around line 37-77: The retry logic in _commit_with_retry currently calls
session.rollback(), which discards pending changes so retries commit nothing
(breaking methods like mark_in_progress); fix by changing _commit_with_retry to
accept an apply_changes callable (e.g., def _commit_with_retry(session: Session,
apply_changes: Callable[[Session], None], max_retries: int =
MAX_COMMIT_RETRIES)) and invoke apply_changes(session) before each commit
attempt (including retries) so mutations are re-applied after rollback;
alternatively, if you prefer minimal change, call session.flush() before commit
to surface lock errors early and avoid discarding pending changes, but the
recommended approach is the apply_changes pattern so callers like
mark_in_progress re-apply their status mutation on each retry.
In `@mcp_server/feature_mcp.py`:
- Around line 507-542: feature_release_testing currently ignores the tested_ok
parameter and only clears Feature.in_progress; update it to record the test
outcome by calling feature_mark_passing(feature_id) when tested_ok is True and
feature_mark_failing(feature_id) when False (or directly set feature.passes
accordingly before commit), then clear feature.in_progress and commit; ensure
you import/locate and use the existing feature_mark_passing and
feature_mark_failing helpers (or set Feature.passes) and keep the same error
handling/rollback/session.close behavior.
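A minimal sketch of what that prompt asks for, assuming the tool's existing session and error handling stay as-is (surrounding boilerplate omitted):

```python
# Sketch: record the outcome instead of ignoring tested_ok, then release.
feature = session.query(Feature).filter(Feature.id == feature_id).first()
if feature is None:
    return {"error": f"Feature {feature_id} not found"}

feature.passes = bool(tested_ok)  # persist the test outcome
feature.in_progress = False       # release the testing claim
session.commit()
```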
🧹 Nitpick comments (17)
pyproject.toml (1)
25-27: Consider narrowing the DeprecationWarning filter.

Blanket `ignore::DeprecationWarning` suppresses all deprecation warnings, which could hide important notices from dependencies or Python itself. Consider scoping to specific modules or warning messages:

```toml
filterwarnings = [
    "ignore::DeprecationWarning:pytest_asyncio.*:",
    "ignore::pytest.PytestReturnNotNoneWarning",
]
```

requirements.txt (1)
1-15: Minor inconsistency in version specifier styles.

Most dependencies now use `~=` (compatible release) but `claude-agent-sdk` (line 2) and `apscheduler` (line 11) still use `>=,<` bounds. This is functionally fine but inconsistent. Consider aligning if feasible:

```diff
-claude-agent-sdk>=0.1.0,<0.2.0
+claude-agent-sdk~=0.1.0
-apscheduler>=3.10.0,<4.0.0
+apscheduler~=3.10
```

api/config.py (1)
129-157: Consider thread safety for the singleton pattern.

The lazy singleton pattern works well for typical use cases where config is loaded once at startup. However, if `get_config()` is called concurrently from multiple threads during initial load, or if `reload_config()` is called during active request handling, there's a potential race condition. For a more robust implementation, consider using a lock:

🔧 Optional: Thread-safe singleton

```diff
+import threading
+
 # Global config instance (lazy loaded)
 _config: Optional[AutocoderConfig] = None
+_config_lock = threading.Lock()

 def get_config() -> AutocoderConfig:
     global _config
-    if _config is None:
-        _config = AutocoderConfig()
+    if _config is None:
+        with _config_lock:
+            if _config is None:  # Double-check after acquiring lock
+                _config = AutocoderConfig()
     return _config
```

parallel_orchestrator.py (1)
137-139: Type annotation could be more precise.

The type `dict[int, tuple[int, subprocess.Popen] | None]` indicates values can be `None` (for placeholders), but the comment explains this well. Consider using a more explicit type alias for clarity.

💡 Optional: Type alias for clarity

```diff
+# Type for testing agent entries: (feature_id, process) or None for placeholder
+TestingAgentEntry = tuple[int, subprocess.Popen] | None
+
 # Testing agents: agent_id (pid) -> (feature_id, process)
 # Using pid as key allows multiple agents to test the same feature
-self.running_testing_agents: dict[int, tuple[int, subprocess.Popen] | None] = {}
+self.running_testing_agents: dict[int, TestingAgentEntry] = {}
```

server/routers/devserver.py (1)
51-79: Minor inconsistency in validation approach across routers.

This router uses `validate_project_name` (which raises `HTTPException`), while `terminal.py` and `assistant_chat.py` use `is_valid_project_name` (which returns a boolean). Both approaches work correctly, but the inconsistency may cause confusion when maintaining the codebase. Since `devserver.py` is REST-only (no WebSocket endpoints with custom close codes), using the exception-raising validator is the cleaner choice here.
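Schematically, the two styles differ only in how failure is signaled; the bodies below are placeholders, not the project's actual validation rules:

```python
from fastapi import HTTPException

def validate_project_name(name: str) -> None:
    """REST style: raise, and FastAPI turns it into a 4xx response."""
    if not name.isidentifier():  # placeholder rule, not the real check
        raise HTTPException(status_code=400, detail="Invalid project name")

def is_valid_project_name(name: str) -> bool:
    """WebSocket style: return a bool so the caller can close with a custom code."""
    return name.isidentifier()  # placeholder rule, not the real check
```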
422-427: Consider extracting the repeated sys.path manipulation.

The pattern of adding the root to `sys.path` (lines 423-425) is duplicated across multiple functions in this file (`_init_imports`, `_get_registry_functions`). Consider extracting this to a helper or using a single initialization point. That said, this follows the existing pattern in the file, so it's acceptable for consistency.
api/feature_repository.py (1)
291-311: Missing priority-based sorting for ready features.

The `get_ready_features` method returns features without any ordering. Compare with `api/dependency_resolver.py` (lines 391-420), which sorts ready features by scheduling score, priority, and ID. Without sorting, callers may receive features in non-deterministic order, which could affect which features get worked on first.

♻️ Proposed fix: Add priority-based sorting

```diff
 def get_ready_features(self) -> list[Feature]:
     """Get features that are ready to implement.
     ...
     """
     passing_ids = self.get_passing_ids()
     candidates = self.get_pending()

     ready = []
     for f in candidates:
         deps = f.dependencies or []
         if all(dep_id in passing_ids for dep_id in deps):
             ready.append(f)

+    # Sort by priority (lower = higher priority), then by ID for stability
+    ready.sort(key=lambda f: (f.priority, f.id))
+
     return ready
```

tests/test_async_examples.py (1)
261-263: Consider making the expected feature count less brittle.

The assertion `assert results[0] == 5` assumes `populated_db` always has exactly 5 features. If the fixture changes, this test will fail with a non-obvious error.

♻️ Proposed improvement

```diff
 # All should return the same count
 assert all(r == results[0] for r in results)
-assert results[0] == 5  # populated_db has 5 features
+# populated_db fixture creates 5 features; verify count is positive and consistent
+assert results[0] > 0, "Expected populated_db to have features"
```

Or keep the explicit count but add the fixture's expected count as a constant that can be shared with the fixture definition.
api/migrations.py (2)
22-32: Add type annotation for `engine` parameter.

All migration functions lack type hints for the `engine` parameter. Consider adding type annotations for better IDE support and static analysis.

✨ Suggested type annotation

```diff
+from sqlalchemy.engine import Engine
+
-def migrate_add_in_progress_column(engine) -> None:
+def migrate_add_in_progress_column(engine: Engine) -> None:
```

Apply similarly to all other migration functions.
176-186: SQL construction is safe here but consider parameterization for future-proofing.

The f-string SQL construction on Line 185 is currently safe since `col_name` and `col_type` come from a hardcoded list (Lines 176-181). However, if this pattern is extended in the future with dynamic values, it could become a risk.
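Since column names and types are identifiers, they can't be bound as SQL parameters; if the list ever becomes dynamic, an allowlist check is the usual hardening. A sketch (the helper name and type list are hypothetical):

```python
import re
from sqlalchemy import text

_IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_ALLOWED_COLUMN_TYPES = {"TEXT", "INTEGER", "BOOLEAN DEFAULT 0"}

def _add_column_safely(conn, table: str, col_name: str, col_type: str) -> None:
    # Identifiers can't be parameterized, so validate them explicitly.
    if (
        not _IDENTIFIER_RE.match(table)
        or not _IDENTIFIER_RE.match(col_name)
        or col_type not in _ALLOWED_COLUMN_TYPES
    ):
        raise ValueError(f"Refusing to add column {col_name!r} of type {col_type!r}")
    conn.execute(text(f"ALTER TABLE {table} ADD COLUMN {col_name} {col_type}"))
```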
tests/conftest.py (3)

72-87: Redundant `create_database` call in `db_session` fixture.

The `temp_db` fixture already calls `create_database(project_dir)` on Line 65. Calling it again on Line 80 is redundant and could mask issues if the two calls return different session factories.
♻️ Suggested fix: reuse the database from temp_db

```diff
 @pytest.fixture
-def db_session(temp_db: Path):
+def db_session(temp_db: Path):
     """Get a database session for testing.

     Provides a session that is automatically rolled back after each test.
     """
     from api.database import create_database

-    _, SessionLocal = create_database(temp_db)
+    # create_database is idempotent and returns existing engine/session
+    _, SessionLocal = create_database(temp_db)
     session = SessionLocal()
```
Alternatively, consider having `temp_db` yield both the path and session factory to avoid the second call.
95-110: Async fixture performs only synchronous operations.

The `async_temp_db` fixture is declared as `async` but contains no `await` expressions. The comment on Line 107 acknowledges this. Consider making it a regular sync fixture unless there's a specific reason for async (e.g., compatibility with async test collection).
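If the async form isn't needed, the same setup works as a plain sync fixture; a sketch assuming the `create_database(project_dir)` contract mentioned above (the fixture name is hypothetical):

```python
import pytest

@pytest.fixture
def temp_db_sync(tmp_path):
    """Plain sync variant: no event loop required for purely blocking setup."""
    from api.database import create_database

    project_dir = tmp_path / "project"
    project_dir.mkdir()
    create_database(project_dir)
    return project_dir
```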
216-244: `populated_db` fixture should use `db_session` fixture to avoid duplication.

This fixture duplicates the session management pattern from `db_session`. Consider depending on `db_session` instead to reduce code duplication and ensure consistent cleanup.

♻️ Suggested refactor

```diff
 @pytest.fixture
-def populated_db(temp_db: Path, sample_feature_data: dict) -> Path:
+def populated_db(temp_db: Path, db_session, sample_feature_data: dict) -> Path:
     """Create a database populated with sample features.

     Returns the project directory path.
     """
     from api.database import Feature, create_database

-    _, SessionLocal = create_database(temp_db)
-    session = SessionLocal()
-
-    try:
-        # Add sample features
-        for i in range(5):
-            feature = Feature(
-                priority=i + 1,
-                category=f"category_{i % 2}",
-                name=f"Feature {i + 1}",
-                description=f"Description for feature {i + 1}",
-                steps=[f"Step {j}" for j in range(3)],
-                passes=i < 2,  # First 2 features are passing
-                in_progress=i == 2,  # Third feature is in progress
-            )
-            session.add(feature)
-
-        session.commit()
-    finally:
-        session.close()
+    # Add sample features
+    for i in range(5):
+        feature = Feature(
+            priority=i + 1,
+            category=f"category_{i % 2}",
+            name=f"Feature {i + 1}",
+            description=f"Description for feature {i + 1}",
+            steps=[f"Step {j}" for j in range(3)],
+            passes=i < 2,
+            in_progress=i == 2,
+        )
+        db_session.add(feature)
+
+    db_session.commit()

     return temp_db
```

mcp_server/feature_mcp.py (3)
46-50: Duplicate `_utc_now()` function - import from `api.models` instead.

This helper is also defined in `api/models.py` (Lines 27-30). Duplicating it creates maintenance burden and potential for divergence.

♻️ Suggested fix

```diff
-def _utc_now() -> datetime:
-    """Return current UTC time."""
-    return datetime.now(timezone.utc)
-
 from mcp.server.fastmcp import FastMCP
 from pydantic import BaseModel, Field
+from api.models import _utc_now
```

Note: You may need to rename the function in `api/models.py` to `utc_now` (without underscore) if it should be part of the public API.
1189-1201: Multiple COUNT queries could be consolidated.

The `feature_get_attempts` function makes 4 separate queries for statistics. Consider combining into a single aggregate query similar to `feature_get_stats` (Lines 162-166).

♻️ Suggested optimization

```diff
+from sqlalchemy import case, func
+
 # Calculate statistics
-total_attempts = session.query(FeatureAttempt).filter(
-    FeatureAttempt.feature_id == feature_id
-).count()
-
-success_count = session.query(FeatureAttempt).filter(
-    FeatureAttempt.feature_id == feature_id,
-    FeatureAttempt.outcome == "success"
-).count()
-
-failure_count = session.query(FeatureAttempt).filter(
-    FeatureAttempt.feature_id == feature_id,
-    FeatureAttempt.outcome == "failure"
-).count()
+stats = session.query(
+    func.count(FeatureAttempt.id).label('total'),
+    func.sum(case((FeatureAttempt.outcome == 'success', 1), else_=0)).label('success'),
+    func.sum(case((FeatureAttempt.outcome == 'failure', 1), else_=0)).label('failure')
+).filter(FeatureAttempt.feature_id == feature_id).first()
+
+total_attempts = stats.total or 0
+success_count = int(stats.success or 0)
+failure_count = int(stats.failure or 0)
```
1407-1497: `spec_get_summary` provides good token optimization but regex parsing may be fragile.

The function effectively reduces token consumption (~94% per PR objectives). However, the regex-based parsing assumes specific XML-like tag formats. Consider adding a note about expected format or providing fallback for malformed specs.

Minor: The function imports `re` inside the function body (Line 1424). Moving it to module level would be more conventional.
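One way to make the parsing degrade gracefully is a small extractor that returns a default when a tag is absent or malformed; the tag name in the usage line is illustrative, not confirmed from app_spec.txt:

```python
import re

def _extract_tag(spec_text: str, tag: str, default: str = "") -> str:
    """Pull <tag>...</tag> content, tolerating a missing or unclosed tag."""
    match = re.search(rf"<{tag}>(.*?)</{tag}>", spec_text, re.DOTALL | re.IGNORECASE)
    return match.group(1).strip() if match else default

# e.g. overview = _extract_tag(raw_spec, "overview", default="(no overview section)")
```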
api/models.py (1)

21-21: Use updated import path for `declarative_base`.

`sqlalchemy.ext.declarative.declarative_base` is deprecated since SQLAlchemy 1.4. Use the new import path for forward compatibility.

♻️ Suggested fix

```diff
-from sqlalchemy.ext.declarative import declarative_base
+from sqlalchemy.orm import declarative_base
```
## TOKEN EFFICIENCY

To maximize context window usage:

- **Don't read files unnecessarily** - Feature details from `feature_get_by_id` contain everything you need
- **Be concise** - Short, focused responses save tokens for actual work
- **Use `feature_get_summary`** for status checks (lighter than `feature_get_by_id`)
- **Use `spec_get_summary`** for project context (~800 tokens vs 12,500 for full app_spec.txt)
- **Avoid re-reading large files** - Read once, remember the content

---
Inconsistent tool name reference.

Line 316 references `feature_get_summary` but the actual tool introduced in this PR is `spec_get_summary` (as correctly shown on line 317 and line 276). This may confuse coding agents.
📝 Proposed fix

```diff
 - **Don't read files unnecessarily** - Feature details from `feature_get_by_id` contain everything you need
 - **Be concise** - Short, focused responses save tokens for actual work
-- **Use `feature_get_summary`** for status checks (lighter than `feature_get_by_id`)
+- **Use `feature_get_stats`** for status checks (lighter than `feature_get_by_id`)
 - **Use `spec_get_summary`** for project context (~800 tokens vs 12,500 for full app_spec.txt)
```

🤖 Prompt for AI Agents
In @.claude/templates/coding_prompt.template.md around lines 310 - 320, The
template incorrectly references the tool name feature_get_summary (which doesn't
exist) while the correct tool introduced in this PR is spec_get_summary; update
the TOKEN EFFICIENCY section to replace the feature_get_summary mention with
spec_get_summary (and verify nearby mentions of feature_get_by_id remain
unchanged), ensuring all mentions consistently use the exact tool symbol
spec_get_summary so coding agents aren't confused.
api/feature_repository.py (Outdated)
```python
def _commit_with_retry(session: Session, max_retries: int = MAX_COMMIT_RETRIES) -> None:
    """
    Commit a session with retry logic for transient errors.

    Handles SQLITE_BUSY, SQLITE_LOCKED, and similar transient errors
    with exponential backoff.

    Args:
        session: SQLAlchemy session to commit
        max_retries: Maximum number of retry attempts

    Raises:
        OperationalError: If commit fails after all retries
    """
    delay_ms = INITIAL_RETRY_DELAY_MS
    last_error = None

    for attempt in range(max_retries + 1):
        try:
            session.commit()
            return
        except OperationalError as e:
            error_msg = str(e).lower()
            # Retry on lock/busy errors
            if "locked" in error_msg or "busy" in error_msg:
                last_error = e
                if attempt < max_retries:
                    logger.warning(
                        f"Database commit failed (attempt {attempt + 1}/{max_retries + 1}), "
                        f"retrying in {delay_ms}ms: {e}"
                    )
                    time.sleep(delay_ms / 1000)
                    delay_ms *= 2  # Exponential backoff
                    session.rollback()  # Reset session state before retry
                    continue
            raise

    # If we get here, all retries failed
    if last_error:
        logger.error(f"Database commit failed after {max_retries + 1} attempts")
        raise last_error
```
Retry logic may not preserve pending changes after rollback.

After `session.rollback()` on line 70, any pending changes to objects in the session are discarded. When the loop retries `session.commit()`, there's nothing to commit because the changes were rolled back. For the status update methods (e.g., `mark_in_progress`), this means a retry after rollback will commit an empty transaction, silently failing to persist the intended change.
🐛 Proposed fix: Re-apply changes before retry
One approach is to have the caller handle retries by passing a callable that applies the changes:
```diff
-def _commit_with_retry(session: Session, max_retries: int = MAX_COMMIT_RETRIES) -> None:
+def _commit_with_retry(
+    session: Session,
+    apply_changes: callable = None,
+    max_retries: int = MAX_COMMIT_RETRIES
+) -> None:
     """
     Commit a session with retry logic for transient errors.
-
-    Handles SQLITE_BUSY, SQLITE_LOCKED, and similar transient errors
-    with exponential backoff.
+
+    Args:
+        session: SQLAlchemy session to commit
+        apply_changes: Optional callable to re-apply changes after rollback
+        max_retries: Maximum number of retry attempts
     ...
     """
     delay_ms = INITIAL_RETRY_DELAY_MS
     last_error = None

     for attempt in range(max_retries + 1):
         try:
+            if attempt > 0 and apply_changes:
+                apply_changes()
             session.commit()
             return
         except OperationalError as e:
             error_msg = str(e).lower()
             if "locked" in error_msg or "busy" in error_msg:
                 last_error = e
                 if attempt < max_retries:
                     ...
                     session.rollback()
                     continue
             raise
```

Alternatively, consider using `session.flush()` to detect lock errors before commit, or restructure to re-query and re-apply the mutation on each retry.
🤖 Prompt for AI Agents
In `@api/feature_repository.py` around lines 37 - 77, The retry logic in
_commit_with_retry currently calls session.rollback(), which discards pending
changes so retries commit nothing (breaking methods like mark_in_progress); fix
by changing _commit_with_retry to accept an apply_changes callable (e.g., def
_commit_with_retry(session: Session, apply_changes: Callable[[Session], None],
max_retries: int = MAX_COMMIT_RETRIES)) and invoke apply_changes(session) before
each commit attempt (including retries) so mutations are re-applied after
rollback; alternatively, if you prefer minimal change, call session.flush()
before commit to surface lock errors early and avoid discarding pending changes,
but the recommended approach is the apply_changes pattern so callers like
mark_in_progress re-apply their status mutation on each retry.
- Use feature_get_stats for progress checks
- Use feature_get_summary for single feature status

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The testing architecture was simplified - testing agents now work concurrently without claim/release coordination.

- Remove feature_release_testing from allowed tools in client.py
- Update testing_prompt.template.md to remove claim/release workflow

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @.claude/templates/testing_prompt.template.md:
- Around line 86-91: The prompt's "end your session" is ambiguous; update the
template around the echo step (the echo "[Testing] Feature #{id} verified -
still passing" >> claude-progress.txt line and the surrounding text) to include
an explicit termination action the agent should perform, e.g. call the
orchestrator completion API/tool, run an exit command (explicitly state "exit 0"
or equivalent), or note that the session will auto-terminate after logging —
reference the echo line and the phrase "end your session" so implementers
replace the vague wording with one concrete mechanism (tool call name or exact
exit instruction).
🧹 Nitpick comments (1)
.claude/templates/testing_prompt.template.md (1)
186-186: Consider clarifying "one iteration" language.

The phrase "You have one iteration" (line 186) is somewhat ambiguous. Combined with line 171 ("Test ONE feature thoroughly"), the intent appears to be limiting scope to a single feature per session. However, "iteration" could be misinterpreted as:
- Only one attempt to fix a regression (even if the first fix fails)
- No retries if browser automation encounters transient failures
- One verification pass only
Consider more explicit phrasing such as:
- "Your session is scoped to ONE feature. Complete all verification and any necessary fixes for that feature."
- "Focus on testing ONE feature thoroughly. You may iterate on fixes for that feature until it passes."
- Add explicit note that session auto-terminates after logging
- Clarify that 'one iteration' means one feature per session, not one fix attempt

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…orgeAI#128) Added TestHarnessKernelBudgetEnforcement class with 8 tests proving all 7 verification steps for HarnessKernel.execute() budget enforcement:

1. execute() called with compiled AgentSpec via --spec path
2. turns_used incremented after each turn, matches actual count
3. max_turns=2 stops execution after exactly 2 turns
4. Budget exhaustion sets status='timeout' (not 'failed')
5. 'timeout' event recorded in agent_events with payload
6. Acceptance validators run after budget exhaustion (graceful termination)
7. tokens_in and tokens_out tracked and stored on AgentRun
8. Token tracking persists even on timeout

All 54 tests pass with no regressions.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
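As a flavor of what such a test might look like — the `HarnessKernel` construction and `AgentRun` fields below are assumptions inferred from the commit message, not the actual test code:

```python
def test_budget_exhaustion_sets_timeout(make_kernel):
    """Hypothetical sketch: max_turns=2 halts the run and marks it 'timeout'."""
    kernel = make_kernel(max_turns=2)        # make_kernel is an assumed fixture
    run = kernel.execute(spec_path="agent_spec.json")  # signature assumed

    assert run.turns_used == 2               # stopped after exactly 2 turns
    assert run.status == "timeout"           # budget exhaustion, not 'failed'
    assert run.tokens_in >= 0 and run.tokens_out >= 0  # tracking persists on timeout
```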
Summary
Reduces token consumption across three main areas:
- `spec_get_summary` tool returns a condensed project spec (~800 tokens vs ~12,500)

Changes
Phase 1: MCP Response Optimization (Highest Impact)
`api/models.py` (sketched below):

- `to_minimal_dict()` - Returns only essential fields (~80% smaller than `to_dict()`)
- `to_cycle_check_dict()` - Returns only id and dependencies (~95% smaller)

`mcp_server/feature_mcp.py`:

- `feature_add_dependency()` - Uses `to_cycle_check_dict()` for cycle checking
- `feature_set_dependencies()` - Uses `to_cycle_check_dict()` for cycle checking
- `feature_get_ready()` - Added `minimal=True` parameter (default)
- `feature_get_blocked()` - Added `minimal=True` parameter (default)
- `feature_get_graph()` - Queries only needed columns
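A hedged sketch of the two serializers' shape — the minimal field list is illustrative; only the id-plus-dependencies shape of the cycle-check variant is stated in this PR:

```python
class Feature(Base):
    # ...existing columns...

    def to_minimal_dict(self) -> dict:
        """Essential fields only (~80% smaller than to_dict())."""
        return {
            "id": self.id,
            "name": self.name,
            "priority": self.priority,
            "passes": self.passes,
            "in_progress": self.in_progress,
            "dependencies": self.dependencies or [],
        }

    def to_cycle_check_dict(self) -> dict:
        """Just enough for dependency cycle detection (~95% smaller)."""
        return {"id": self.id, "dependencies": self.dependencies or []}
```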
Phase 2: App Spec Summary Tool

New `spec_get_summary()` MCP tool that returns:

Phase 3: Assistant Chat History Optimization
`server/services/assistant_chat_session.py`:

Documentation
Expected Token Savings
Test plan
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Behavior Changes
Performance & Optimization