Added

  • Docker daemon health checks with configurable interval and timeout (docker_health.check_interval and docker_health.timeout in config)

  • /health endpoint now includes Docker daemon connectivity status and returns 503 Service Unavailable when Docker is down

  • Three new Prometheus metrics: docker_daemon_healthy, docker_daemon_consecutive_failures, docker_daemon_checks_total

  • Graceful shutdown coordinator ensures in-flight deployments complete before process exit (30-second timeout)

  • HTTP request body size limit (1 MB max) with a 413 response when the limit is exceeded

  • HTTP request timeout enforcement (30s default for most endpoints, no timeout for SSE)

  • SSE connection resource limits (max 100 concurrent connections, 24-hour timeout)

  • SSE connection tracking and graceful disconnection on shutdown
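
A minimal Go sketch of how the Docker health state behind the three metrics and the 503 response could be tracked. The type healthTracker, its methods, and errDaemonDown are hypothetical names chosen for illustration and are not taken from the project; plain counters stand in for the Prometheus collectors.

```go
package main

import (
	"errors"
	"net/http"
	"sync"
)

// errDaemonDown is a hypothetical sentinel used only in this sketch.
var errDaemonDown = errors.New("docker daemon unreachable")

// healthTracker is a hypothetical type that records the outcome of each
// periodic Docker daemon check. The fields mirror the metric names from
// the changelog; how the project wires them into Prometheus is not shown.
type healthTracker struct {
	mu                  sync.Mutex
	healthy             bool // docker_daemon_healthy
	consecutiveFailures int  // docker_daemon_consecutive_failures
	checksTotal         int  // docker_daemon_checks_total
}

// record stores the result of one check; a nil error means the daemon responded.
func (t *healthTracker) record(err error) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.checksTotal++
	if err != nil {
		t.healthy = false
		t.consecutiveFailures++
		return
	}
	t.healthy = true
	t.consecutiveFailures = 0
}

// statusCode maps the current state to the HTTP status for /health:
// 200 while Docker is reachable, 503 otherwise.
func (t *healthTracker) statusCode() int {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.healthy {
		return http.StatusOK
	}
	return http.StatusServiceUnavailable
}
```

In this sketch a successful check resets the consecutive-failure count while the total keeps growing, which matches the usual gauge and counter semantics; the project's actual behavior may differ.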

Changed

  • /health endpoint response now includes a status field and a components structure with detailed health information

  • Shutdown behavior: SIGTERM/SIGINT now trigger graceful shutdown instead of immediate termination

  • API handlers wrapped with timeout contexts and body size limiters for DoS protection
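
For clients that parse /health, the new response shape might look roughly like the Go sketch below. Only the status and components keys come from this changelog; the "docker" component key, the "healthy"/"unhealthy" values, the error field, and all Go identifiers are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
)

// errDockerUnreachable is a hypothetical error used only in this sketch.
var errDockerUnreachable = errors.New("docker daemon unreachable")

// componentHealth describes one component; the field names are assumed.
type componentHealth struct {
	Status string `json:"status"`
	Error  string `json:"error,omitempty"`
}

// healthResponse carries the overall status plus per-component detail.
type healthResponse struct {
	Status     string                     `json:"status"`
	Components map[string]componentHealth `json:"components"`
}

// healthBody renders the JSON body for /health given the result of the
// latest Docker check (nil means the daemon is reachable).
func healthBody(dockerErr error) (string, error) {
	docker := componentHealth{Status: "healthy"}
	overall := "healthy"
	if dockerErr != nil {
		docker = componentHealth{Status: "unhealthy", Error: dockerErr.Error()}
		overall = "unhealthy"
	}
	resp := healthResponse{
		Status:     overall,
		Components: map[string]componentHealth{"docker": docker},
	}
	out, err := json.Marshal(resp)
	if err != nil {
		return "", err
	}
	return string(out), nil
}
```

Clients that previously checked only the HTTP status code should keep working; clients that parsed the old body need to read the new structure instead.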

Security

  • CRITICAL: Fixed command injection vulnerability in compose execution by validating project names and file paths

  • Compose project names restricted to [a-zA-Z0-9_-]{1,64} pattern

  • Compose file paths must be absolute and must not contain path traversal segments (.. is forbidden)

  • Env file paths are validated with the same constraints as compose file paths

  • All file paths verified to exist and be regular files at config load time
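
The validation rules above can be sketched in Go as follows. The regular expression is the pattern stated in this section; the function names (validProjectName, checkPathSyntax, requireRegularFile) are hypothetical, and the project's real checks may be stricter or structured differently.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"regexp"
	"strings"
)

// projectNameRe is the pattern from the changelog, anchored at both ends
// so that a valid prefix followed by shell metacharacters is rejected.
var projectNameRe = regexp.MustCompile(`^[a-zA-Z0-9_-]{1,64}$`)

// validProjectName reports whether name is an allowed compose project name.
func validProjectName(name string) bool {
	return projectNameRe.MatchString(name)
}

// checkPathSyntax rejects relative paths and any ".." path segment.
// It does not touch the filesystem.
func checkPathSyntax(p string) error {
	if !filepath.IsAbs(p) {
		return fmt.Errorf("path %q must be absolute", p)
	}
	for _, seg := range strings.Split(filepath.ToSlash(p), "/") {
		if seg == ".." {
			return fmt.Errorf("path %q contains a \"..\" segment", p)
		}
	}
	return nil
}

// requireRegularFile verifies that p exists and is a regular file.
// os.Stat follows symlinks, so a symlink to a regular file passes here.
func requireRegularFile(p string) error {
	info, err := os.Stat(p)
	if err != nil {
		return fmt.Errorf("path %q: %w", p, err)
	}
	if !info.Mode().IsRegular() {
		return fmt.Errorf("path %q is not a regular file", p)
	}
	return nil
}
```

Under these rules a project name such as my-app_1 passes, while app; rm -rf / fails; /srv/app/docker-compose.yml passes the syntax check, while docker-compose.yml and /srv/../etc/passwd fail. The absolute-path examples assume a Unix-like system.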

Fixed

  • Shutdown no longer interrupts in-flight deployments mid-operation

  • Audit logs are flushed to disk before shutdown completes

  • HTTP and SSE connections close gracefully during shutdown

  • SSE connections no longer leak when clients disconnect unexpectedly

  • Audit logger no longer unlocks its mutex twice (the redundant deferred unlock was removed)
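
For context on the double-unlock fix: in Go, calling Unlock on a sync.Mutex that is not locked is a fatal run-time error, so a function that both defers an unlock and unlocks explicitly will crash. The sketch below shows the corrected shape with a single explicit unlock. The auditLogger type and its in-memory buffer are hypothetical stand-ins, not the project's actual logger.

```go
package main

import (
	"bytes"
	"sync"
)

// auditLogger is a hypothetical stand-in that buffers entries in memory.
type auditLogger struct {
	mu  sync.Mutex
	buf bytes.Buffer
}

// log appends one entry. The mutex is released exactly once, by the
// explicit Unlock below; pairing it with "defer l.mu.Unlock()" would
// unlock twice and abort with "sync: unlock of unlocked mutex".
func (l *auditLogger) log(entry string) {
	l.mu.Lock()
	l.buf.WriteString(entry)
	l.buf.WriteByte('\n')
	l.mu.Unlock()
}
```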