Skip to content

Conversation

@ColeMurray
Copy link
Owner

Summary

Extract sandbox lifecycle logic (~400 lines) from SessionDO into dedicated modules with pure decision functions and a provider abstraction.

Changes

  • Add SandboxProvider interface for pluggable sandbox providers with error classification (transient vs permanent for circuit breaker)
  • Add ModalSandboxProvider wrapping existing ModalClient
  • Add pure decision functions (circuit breaker, spawn, inactivity, heartbeat) - side-effect free for easy testing
  • Add SandboxLifecycleManager orchestrating lifecycle operations with dependency injection
  • Add 66 unit tests for decision functions (44) and manager (22)
  • Update SessionDO to delegate lifecycle methods to manager

Architecture Benefits

  1. Testable - Pure decision functions and mocked dependencies enable comprehensive unit testing
  2. Decoupled - SessionDO lifecycle methods now delegate to the manager
  3. Extensible - Provider abstraction enables future sandbox providers (Fly.io, Docker, etc.)
  4. Maintainable - Clear separation between decision logic and side effects

Files

File Description
src/sandbox/provider.ts SandboxProvider interface, error types
src/sandbox/providers/modal-provider.ts Modal implementation
src/sandbox/lifecycle/decisions.ts Pure decision functions
src/sandbox/lifecycle/decisions.test.ts Decision tests (44 tests)
src/sandbox/lifecycle/manager.ts SandboxLifecycleManager
src/sandbox/lifecycle/manager.test.ts Manager tests (22 tests)
src/sandbox/index.ts Updated exports
src/session/durable-object.ts Integrated lifecycle manager

Test plan

  • All 66 unit tests pass
  • TypeScript type checking passes
  • ESLint passes
  • Build succeeds (132kb bundle)
  • Manual testing of sandbox lifecycle in staging

Extract sandbox lifecycle logic (~400 lines) from SessionDO into dedicated
modules with pure decision functions and a provider abstraction:

- Add SandboxProvider interface for pluggable sandbox providers
- Add ModalSandboxProvider wrapping existing ModalClient
- Add pure decision functions (circuit breaker, spawn, inactivity, heartbeat)
- Add SandboxLifecycleManager orchestrating lifecycle operations
- Add 66 unit tests for decision functions and manager
- Update SessionDO to delegate lifecycle methods to manager

This refactoring enables:
- Unit testing via dependency injection and pure functions
- Future sandbox providers (Fly.io, Docker, etc.)
- Clear separation between decision logic and side effects
@github-actions
Copy link

Terraform Validation Results

Step Status
Format ⚠️
Init
Validate

Note: Terraform plan was skipped because secrets are not configured. This is expected for external contributors. See docs/GETTING_STARTED.md for setup instructions.

Pushed by: @ColeMurray, Action: pull_request

@greptile-apps
Copy link

greptile-apps bot commented Jan 29, 2026

Greptile Overview

Greptile Summary

Extracted ~400 lines of sandbox lifecycle logic from SessionDO into dedicated, testable modules with a provider abstraction. Added 66 unit tests covering decision functions and lifecycle management.

Key improvements

  • Pure decision functions enable comprehensive testing without mocking
  • Provider abstraction allows future sandbox providers (Fly.io, Docker, etc.)
  • Circuit breaker properly distinguishes transient vs permanent errors
  • Clean separation between decision logic and side effects

Critical issue found

The refactoring leaves duplicate isSpawningSandbox flags in both SessionDO and SandboxLifecycleManager that can get out of sync. When a sandbox connects via WebSocket, the DO sets its flag to false (line 398), but the manager's flag remains unchanged. This breaks spawn coordination and could prevent future spawns from proceeding. The DO's flag should be removed entirely since the manager now owns this state.

Architecture validation

  • Dependency injection pattern correctly implemented with adapter interfaces
  • Lazy initialization of lifecycle manager is appropriate
  • Circuit breaker error classification logic is sound (transient vs permanent)
  • Decision functions are properly pure with no side effects

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 file reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

@greptile-apps
Copy link

greptile-apps bot commented Jan 29, 2026

Additional Comments (2)

packages/control-plane/src/session/durable-object.ts
isSpawningSandbox flag mismatch between SessionDO and SandboxLifecycleManager

The isSpawningSandbox flag is now owned by SandboxLifecycleManager (packages/control-plane/src/sandbox/lifecycle/manager.ts:156), but SessionDO still maintains its own copy and sets it to false here. This creates two separate flags that can get out of sync.

When the sandbox connects via WebSocket, the DO sets its flag to false, but the manager's flag remains unchanged. This could cause the manager to believe a spawn is still in progress when it's not, blocking future spawn attempts.

Remove the isSpawningSandbox field from SessionDO entirely since the manager now handles this state.

Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/control-plane/src/session/durable-object.ts
Line: 398:398

Comment:
`isSpawningSandbox` flag mismatch between SessionDO and SandboxLifecycleManager

The `isSpawningSandbox` flag is now owned by `SandboxLifecycleManager` (packages/control-plane/src/sandbox/lifecycle/manager.ts:156), but SessionDO still maintains its own copy and sets it to `false` here. This creates two separate flags that can get out of sync.

When the sandbox connects via WebSocket, the DO sets its flag to false, but the manager's flag remains unchanged. This could cause the manager to believe a spawn is still in progress when it's not, blocking future spawn attempts.

Remove the `isSpawningSandbox` field from SessionDO entirely since the manager now handles this state.

How can I resolve this? If you propose a fix, please make it concise.

packages/control-plane/src/session/durable-object.ts
uses SessionDO's isSpawningSandbox flag instead of delegating to manager

This check uses the DO's local isSpawningSandbox flag, but the manager has its own copy. The DO should delegate the warmSandbox() call to the manager without checking the flag, since warmSandbox() in the manager already checks isSpawningInMemory.

Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/control-plane/src/session/durable-object.ts
Line: 881:881

Comment:
uses SessionDO's `isSpawningSandbox` flag instead of delegating to manager

This check uses the DO's local `isSpawningSandbox` flag, but the manager has its own copy. The DO should delegate the `warmSandbox()` call to the manager without checking the flag, since `warmSandbox()` in the manager already checks `isSpawningInMemory`.

How can I resolve this? If you propose a fix, please make it concise.

- Remove duplicate isSpawningSandbox flag from SessionDO
- Add isSpawning() and onSandboxConnected() methods to manager
- Add tests for isSpawningSandbox reset after restore failure
- Add 24 unit tests for ModalSandboxProvider error classification

The refactoring previously left duplicate flags in both SessionDO and
SandboxLifecycleManager. When sandbox connected via WebSocket, the DO
set its flag to false but the manager's flag remained unchanged,
breaking spawn coordination.

Test coverage increased from 66 to 92 tests.
@github-actions
Copy link

Terraform Validation Results

Step Status
Format ⚠️
Init
Validate

Note: Terraform plan was skipped because secrets are not configured. This is expected for external contributors. See docs/GETTING_STARTED.md for setup instructions.

Pushed by: @ColeMurray, Action: pull_request

- Fix restore not storing modal_object_id for future snapshots
- Add HTTP response.ok validation in modal-provider
- Fix falsy timestamp checks in decisions.ts (use == null)

Test coverage: 97 tests passing.
@github-actions
Copy link

Terraform Validation Results

Step Status
Format ⚠️
Init
Validate

Note: Terraform plan was skipped because secrets are not configured. This is expected for external contributors. See docs/GETTING_STARTED.md for setup instructions.

Pushed by: @ColeMurray, Action: pull_request

@ColeMurray ColeMurray merged commit a525bc4 into main Jan 29, 2026
10 checks passed
@ColeMurray ColeMurray deleted the refactor/sandbox-lifecycle-manager branch January 29, 2026 08:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants