Files
hermes-agent/tests/hermes_cli
teknium1 f30db14ced fix(kanban): SIGTERM on worker must terminate the process (#28181)
The single-query signal handler in cli.py raises KeyboardInterrupt on
SIGTERM/SIGHUP. For interactive 'hermes chat -q' that unwinds the main
thread cleanly. For kanban workers spawned by the dispatcher, the
worker process is likely to have a non-daemon thread alive (terminal
_wait_for_process, custom plugins, etc.). With KeyboardInterrupt only
the main thread unwinds; the non-daemon thread keeps the process alive,
the gateway has already restarted, and the dispatcher's _pid_alive
check returns True forever — task stuck in 'running' indefinitely.

When HERMES_KANBAN_TASK is set (dispatcher-spawned worker), flush
logging + stdout/stderr, then os._exit(0) instead of raising
KeyboardInterrupt. The kernel reclaims the PID immediately, and the
existing zombie-state detection in _pid_alive flips the task to
crashed on the next dispatcher tick. detect_crashed_workers then
re-spawns it on the following tick — no manual recovery needed.

A SIGALRM(2s) deadman is armed before the flush so a pathological
blocking-I/O flush can't wedge the worker forever. In practice the
reporter measured flush in <1ms; the alarm is a failsafe, never
the common path.

Interactive (non-kanban) chat -q is unchanged — the env-gated branch
only fires for dispatcher-spawned workers.

Live verification on this machine:
- Without HERMES_KANBAN_TASK + non-daemon thread alive: process hangs
  alive 4+ seconds after SIGTERM. Dispatcher's _pid_alive returns
  True → task stuck.
- With HERMES_KANBAN_TASK + same non-daemon thread: process exits in
  0.10s via os._exit(0). Dispatcher reclaims on next tick.

Tests:
- tests/hermes_cli/test_signal_handler_kanban_worker.py (3 cases):
  end-to-end subprocess test with a non-daemon thread,
  HERMES_KANBAN_TASK env, SIGTERM, dispatcher-style _pid_alive check.
  Plus a source-level invariant test catching future refactors that
  drop the env-gated exit.
- 452/452 kanban tests pass.

Co-authored-by: andrewhosf <andrewho.sf@gmail.com>
2026-05-28 11:59:58 -07:00
..