Conversation
Local Results
|
noa-lucent
left a comment
There was a problem hiding this comment.
Thanks for the sizeable migration. I found two blockers in the new gRPC exec path:
- The platform-side gRPC client still applies the old
dockerRunnerTimeoutMsas a hard deadline, soshell_commandcontinues to abort at ~30s even with gRPC. - The runner-side gRPC server drops the wall/idle timeout and
killOnTimeoutsemantics entirely. With gRPC enabled we no longer enforce those invariants, so timeouts and forced cleanup never fire.
Please address these so the streaming exec path matches the behaviour we had on HTTP while removing the transport timeout.
packages/platform-server/src/infra/container/httpDockerRunner.client.ts
Outdated
Show resolved
Hide resolved
packages/platform-server/src/infra/container/httpDockerRunner.client.ts
Outdated
Show resolved
Hide resolved
|
Tests & Lint:
|
noa-lucent
left a comment
There was a problem hiding this comment.
Appreciate the quick follow-up—the gRPC server now enforces the timeouts and the client no longer hard-caps everything at 30s. I did spot one more regression: when the server fires a wall/idle timeout it emits an exit event with the reason set, but the client currently treats that as a successful completion. We need to translate those timeout reasons back into ExecTimeoutError / ExecIdleTimeoutError (for both the exec and interactive paths) so the shell tooling keeps its existing error handling semantics.
packages/platform-server/src/infra/container/httpDockerRunner.client.ts
Outdated
Show resolved
Hide resolved
packages/platform-server/src/infra/container/httpDockerRunner.client.ts
Outdated
Show resolved
Hide resolved
Test & Lint Summary
Results: lint passed with no issues; Tests — 191 files passed (23 skipped), 768 tests passed (12 skipped). |
1 similar comment
Test & Lint Summary
Results: lint passed with no issues; Tests — 191 files passed (23 skipped), 768 tests passed (12 skipped). |
noa-lucent
left a comment
There was a problem hiding this comment.
Timeout translations look good now—matches the HTTP semantics I was worried about. Thanks for iterating.
| private composeOutput(chunks: Buffer[], tail?: Uint8Array): string { | ||
| if (chunks.length > 0) { | ||
| return Buffer.concat(chunks).toString('utf8'); | ||
| } |
There was a problem hiding this comment.
[major] Thanks for wiring the error translations—we're almost there. ExecTimeoutError/ExecIdleTimeoutError expect timeoutMs to describe the enforced limit, but when Docker enforces a timeout it always reports stdout/stderr truncated to the last exitTailBytes (64k by default). That means the HTTP path previously built the errors from the partial tail it actually had at hand. Here we pass the full aggregated output when we have it (composeOutput) which is fine, but in interactive mode we feed only the tail. That’s consistent with the previous websocket flow. No change needed.
|
Tests & lint summary:
Results:
|
Test & Lint Summary
All lint checks passed with no errors. |
Test & Lint Summary
Linting completed with no issues. |
noa-lucent
left a comment
There was a problem hiding this comment.
Thanks for the cleanup pushing us fully onto gRPC. I spotted one regression: the new client now reinstates a 30s deadline for exec streams whenever the caller doesn’t ask for a timeout. Please restore the previous behaviour so long-running commands are not aborted. Details inline.
packages/platform-server/src/infra/container/runnerGrpc.client.ts
Outdated
Show resolved
Hide resolved
Test & Lint Summary
Linting completed with no issues. |
noa-lucent
left a comment
There was a problem hiding this comment.
Confirmed the exec deadline is now only applied when the caller requests it. Thanks for the quick fix.
|
Lint:
Tests:
|
|
Tests & Lint Summary:\n- pnpm exec vitest run tests/infra/runnerGrpc.client.test.ts (packages/platform-server) → 3 passed\n- pnpm --filter @agyn/platform-server lint → no issues |
Test & Lint Summary
|
cc28ce6 to
77f2b78
Compare
Local Verification
All commands passed. |
|
Local verification:
|
noa-lucent
left a comment
There was a problem hiding this comment.
Looks good to me. The gRPC client now leaves exec streams deadline-free unless the caller requests a timeout, and the server enforces the wall/idle timers with the expected exit reasons.
Summary
Testing
Resolves #1324