Add HEALTHCHECK to bot containers that polls the OpenClaw /health endpoint. Dashboard now uses actual container health status instead of time-based heuristic to determine when bot is ready.

- Add Healthcheck config when creating containers (2s interval, 30 retries)
- Add health field to ContainerStatus type (none/starting/healthy/unhealthy)
- Update getContainerStatus to extract health from Docker inspect
- Update BotCard to use health status for starting/running distinction

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Containers created before the healthcheck feature was added have health='none'. These should be treated as running, not starting.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Change healthcheck from wget to curl (curl is installed in botenv, wget is not)
- Change endpoint from /health to / (OpenClaw Control UI root, /health doesn't exist via HTTP)
- Extract getEffectiveStatus to shared utility (DRY principle)
- Update DashboardTab to use health-based status instead of 8-second heuristic
- Fixes inconsistency where BotCard and DashboardTab used different status logic
Pull request overview
This pull request adds Docker health check support to bot containers, replacing a time-based heuristic with proper health status monitoring. The PR introduces a health check configuration for OpenClaw bot containers and refactors the status determination logic across backend and frontend to use this health information.
Changes:
- Added Docker health checks to bot containers with curl-based HTTP checks
- Added `HealthStatus` type and `health` field to container status interfaces in both backend and frontend
- Refactored duplicate status determination logic into a shared utility function that uses health status
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/types/container.ts | Adds HealthStatus type and health field to ContainerStatus interface |
| src/services/DockerService.ts | Adds health check configuration to container creation and extracts health status in inspect method |
| dashboard/src/types.ts | Mirrors backend types by adding HealthStatus type and health field |
| dashboard/src/utils/bot-status.ts | New utility function that derives effective bot status using health checks instead of time-based heuristic |
| dashboard/src/dashboard/DashboardTab.tsx | Removes duplicate inline function, imports shared utility |
| dashboard/src/dashboard/BotCard.tsx | Removes duplicate inline function, imports shared utility |
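
The `DockerService.ts` change that extracts health from the inspect result is not shown in the excerpts below. As a rough sketch only, assuming the inspect payload follows the Docker Engine API shape (`State.Health.Status`), the mapping could look like this. `InspectLike` and `extractHealth` are hypothetical names for illustration, not the PR's actual code.

```typescript
// Health values named in the PR description.
type HealthStatus = 'none' | 'starting' | 'healthy' | 'unhealthy';

// Hypothetical, trimmed-down view of a Docker inspect result; only the
// fields this sketch reads are declared.
interface InspectLike {
  State?: {
    Health?: { Status?: string };
  };
}

// Map the inspect payload to a HealthStatus. Containers created without a
// healthcheck have no State.Health block, so they fall back to 'none'.
function extractHealth(info: InspectLike): HealthStatus {
  const status = info.State?.Health?.Status;
  if (status === 'starting' || status === 'healthy' || status === 'unhealthy') {
    return status;
  }
  // Missing or unrecognized values are treated as "no healthcheck".
  return 'none';
}
```

Falling back to `'none'` for anything unrecognized matches the second commit's intent that containers created before the healthcheck existed are treated as running rather than starting.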

    export function getEffectiveStatus(bot: Bot): BotStatus {
      const containerState = bot.container_status?.state;

      if (containerState === 'running') {
        // Use Docker health check status to determine if bot is ready
        const health = bot.container_status?.health;
        if (health === 'starting') {
          return 'starting';
        }
        if (health === 'unhealthy') {
          return 'error';
        }
        // 'healthy' or 'none' (no healthcheck configured) = running
        return 'running';
      }

      if (containerState === 'exited' || containerState === 'dead') {
        return bot.container_status?.exitCode === 0 ? 'stopped' : 'error';
      }

      // Fallback to database status
      return bot.status;
    }

This utility function lacks test coverage. The codebase includes test files (e.g., dashboard/src/api.test.ts), indicating that utility functions should be tested. Consider adding tests for getEffectiveStatus to cover different health status combinations and edge cases, such as when health is 'starting', 'healthy', 'unhealthy', 'none', when container_status is null or undefined, and various container states.
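
The cases listed in this comment can be written as a small table-driven check. This is a sketch under stated assumptions: the `Bot`, `ContainerStatus`, and `BotStatus` types below are minimal stand-ins (the real ones in `dashboard/src/types.ts` may differ), the function is copied inline so the example runs on its own, and plain assertions are used because the project's test runner is not named in this PR.

```typescript
// Minimal stand-ins for the dashboard types (assumption: the real
// definitions may carry more fields and more status values).
type HealthStatus = 'none' | 'starting' | 'healthy' | 'unhealthy';
type BotStatus = 'starting' | 'running' | 'stopped' | 'error';

interface ContainerStatus {
  state: string;
  exitCode: number;
  health: HealthStatus;
}

interface Bot {
  status: BotStatus;
  container_status?: ContainerStatus | null;
}

// Copy of the function under review.
function getEffectiveStatus(bot: Bot): BotStatus {
  const containerState = bot.container_status?.state;
  if (containerState === 'running') {
    const health = bot.container_status?.health;
    if (health === 'starting') return 'starting';
    if (health === 'unhealthy') return 'error';
    return 'running';
  }
  if (containerState === 'exited' || containerState === 'dead') {
    return bot.container_status?.exitCode === 0 ? 'stopped' : 'error';
  }
  return bot.status;
}

// Helper to build a bot with a given container status (hypothetical).
function botWith(state: string, health: HealthStatus, exitCode = 0): Bot {
  return { status: 'stopped', container_status: { state, exitCode, health } };
}

// Each case: description, input bot, expected effective status.
const cases: Array<[string, Bot, BotStatus]> = [
  ['running + starting', botWith('running', 'starting'), 'starting'],
  ['running + healthy', botWith('running', 'healthy'), 'running'],
  ['running + none', botWith('running', 'none'), 'running'],
  ['running + unhealthy', botWith('running', 'unhealthy'), 'error'],
  ['exited with code 0', botWith('exited', 'none', 0), 'stopped'],
  ['exited with code 1', botWith('exited', 'none', 1), 'error'],
  ['dead with code 137', botWith('dead', 'none', 137), 'error'],
  ['null container_status', { status: 'error', container_status: null }, 'error'],
  ['undefined container_status', { status: 'starting' }, 'starting'],
  ['unhandled state falls back', { ...botWith('created', 'none'), status: 'starting' }, 'starting'],
];

// Run every case and collect descriptions of the ones that fail.
function runCases(): string[] {
  return cases
    .filter(([, bot, expected]) => getEffectiveStatus(bot) !== expected)
    .map(([name]) => name);
}
```

In a real test file each row would typically become its own test case in whatever runner `dashboard/src/api.test.ts` already uses.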

        [LABEL_BOT_HOSTNAME]: hostname
      },
      Healthcheck: {
        Test: ['CMD', 'curl', '-sf', `http://localhost:${config.port}/`],

The health check assumes the OpenClaw gateway responds to HTTP requests at the root path. Consider verifying that OpenClaw exposes a health endpoint or root endpoint that can be used for health checks. If OpenClaw doesn't respond to root path requests, the health check will fail even for healthy containers. You may need to use a specific health endpoint path (e.g., `/health`), request `curl -f http://localhost:${config.port}` without expecting specific content, or use `CMD-SHELL` with a simpler command that only checks whether the port is listening, such as `nc -z localhost ${config.port}`.
Suggested change:

    - Test: ['CMD', 'curl', '-sf', `http://localhost:${config.port}/`],
    + Test: ['CMD-SHELL', `nc -z localhost ${config.port}`],

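
The two probe styles discussed here can be compared side by side. The sketch below is illustrative only: `buildHealthcheck`, `ProbeKind`, and the local `HealthConfig` interface are hypothetical names, and whether `nc` is installed in the bot image is not stated anywhere in this PR (the commit message only confirms `curl`).

```typescript
// Subset of Docker's healthcheck config used in this PR.
// Durations are expressed in nanoseconds.
interface HealthConfig {
  Test: string[];
  Interval: number;
  Timeout: number;
  Retries: number;
}

// Hypothetical switch between the PR's HTTP probe and the suggested TCP probe.
type ProbeKind = 'http' | 'tcp';

const NS_PER_SECOND = 1_000_000_000;

function buildHealthcheck(port: number, probe: ProbeKind): HealthConfig {
  const test =
    probe === 'http'
      // curl -s silences progress output; -f exits non-zero on HTTP errors (>= 400).
      ? ['CMD', 'curl', '-sf', `http://localhost:${port}/`]
      // nc -z only checks that something accepts connections on the port.
      // Assumption: nc exists in the image, which this PR does not confirm.
      : ['CMD-SHELL', `nc -z localhost ${port}`];

  return {
    Test: test,
    Interval: 2 * NS_PER_SECOND, // values taken from the PR
    Timeout: 3 * NS_PER_SECOND,
    Retries: 30,
  };
}
```

The HTTP probe proves that the server answers requests, while the TCP probe only proves that the port is open, so the TCP variant can report healthy before the application is actually ready to serve.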

        Test: ['CMD', 'curl', '-sf', `http://localhost:${config.port}/`],
        Interval: 2_000_000_000, // 2s in nanoseconds
        Timeout: 3_000_000_000, // 3s in nanoseconds
        Retries: 30,

With 30 retries at 2-second intervals, the container will be marked as unhealthy after roughly 60 seconds of consecutive failures (longer if individual checks run up to the 3-second timeout). This is a very high retry count. For context, the botmaker and keyring-proxy containers in `docker-compose.yml` use only 3 retries. Consider whether 30 retries is appropriate for bot containers, or whether this should be reduced to a more typical value like 3-5 retries to detect issues faster.
Suggested change:

    - Retries: 30,
    + Retries: 5,

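
The timing trade-off behind this suggestion can be worked out directly. The sketch below assumes Docker waits one interval after each check completes and that no start period is configured (none appears in the snippet). `unhealthyWindowSeconds` and `HealthTiming` are hypothetical names for illustration.

```typescript
const NS_PER_SECOND = 1_000_000_000;

// Timing fields from the Healthcheck config, in nanoseconds.
interface HealthTiming {
  Interval: number;
  Timeout: number;
  Retries: number;
}

// Estimate how long consecutive failures take before the container is
// marked unhealthy.
//   min: every check fails immediately (e.g. connection refused)
//   max: every check runs until it hits the timeout
function unhealthyWindowSeconds(t: HealthTiming): { min: number; max: number } {
  const min = (t.Retries * t.Interval) / NS_PER_SECOND;
  const max = (t.Retries * (t.Interval + t.Timeout)) / NS_PER_SECOND;
  return { min, max };
}

// Values from the PR: 2s interval, 3s timeout, 30 retries.
const current = unhealthyWindowSeconds({
  Interval: 2_000_000_000,
  Timeout: 3_000_000_000,
  Retries: 30,
}); // { min: 60, max: 150 }

// Same timing with the suggested 5 retries.
const suggested = unhealthyWindowSeconds({
  Interval: 2_000_000_000,
  Timeout: 3_000_000_000,
  Retries: 5,
}); // { min: 10, max: 25 }
```

Under these assumptions, a bot that needs more than about 10 to 25 seconds to start serving HTTP would be reported as `unhealthy` (and shown as `error` in the dashboard) with 5 retries, so the lower value only fits if startup is reliably faster than that.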