@ColeMurray
Owner

Summary

  • Fix infinite retry loop when control plane returns HTTP 410 (session terminated)
  • Bridge now exits gracefully on fatal HTTP errors (410, 401, 404) instead of retrying forever
  • Users can restore sessions by sending a new prompt, which triggers snapshot restoration

Problem

After deployments or inactivity timeouts, sandboxes would see this pattern indefinitely:

[bridge] Connection error: server rejected WebSocket connection: HTTP 410
[bridge] Reconnecting in 60.0s (attempt 11)...

The control plane correctly returns 410 when a session is stopped/stale, but the bridge didn't recognize this as a terminal state.

Solution

  • Add SessionTerminatedError exception for non-recoverable session termination
  • Handle InvalidStatus exception to catch HTTP rejection at WebSocket connect time
  • Add _is_fatal_connection_error() as fallback for string-based detection
  • Exit cleanly so Modal can shut down the container

Test plan

  • Added unit tests for _is_fatal_connection_error() covering fatal and non-fatal cases
  • Added unit tests for SessionTerminatedError exception behavior
  • All 44 bridge tests pass
  • Linting passes

…ever

Previously, when the control plane returned HTTP 410 (session terminated),
the bridge would retry indefinitely with exponential backoff. This happened
after deployments or inactivity timeouts when the session was marked as
stopped/stale.

Now the bridge recognizes fatal HTTP errors (410, 401, 404) and exits
cleanly, allowing Modal to shut down the container. Users can restore
sessions by sending a new prompt, which triggers snapshot restoration.

Changes:
- Add SessionTerminatedError for non-recoverable session termination
- Handle InvalidStatus exception to catch HTTP rejection at connect time
- Add _is_fatal_connection_error() as fallback for string-based detection
- Add unit tests for error handling logic
@github-actions

Terraform Validation Results

Step Status
Format ⚠️
Init
Validate

Note: Terraform plan was skipped because secrets are not configured. This is expected for external contributors. See docs/GETTING_STARTED.md for setup instructions.

Pushed by: @ColeMurray, Action: pull_request

@greptile-apps

greptile-apps bot commented Jan 29, 2026

Greptile Overview

Greptile Summary

Prevents infinite retry loop when control plane terminates sessions by catching HTTP 410/401/404 errors and exiting gracefully instead of retrying.

  • Introduces SessionTerminatedError exception to signal non-recoverable session termination
  • Catches InvalidStatus exception from websockets library to handle HTTP errors at connection time
  • Adds _is_fatal_connection_error() fallback for string-based error detection
  • Comprehensive unit tests cover both fatal and non-fatal error scenarios

@greptile-apps bot left a comment

1 file reviewed, 2 comments

- Add HTTP 403 (Forbidden) to fatal error patterns
- Use defensive getattr() chain for InvalidStatus response access
- Add test for HTTP 403

@ColeMurray merged commit 9325b57 into main on Jan 29, 2026
10 checks passed
@ColeMurray deleted the fix/bridge-graceful-exit-on-410 branch on January 29, 2026 09:03