fix(tui): stop persisting full tool output in trail lines (silent OOM death)

A heavy --tui session (browser snapshots, large tool outputs) got the Node
parent silently OOM-killed within minutes. The kill closed the gateway
child's stdin, so the user saw only a bare "gateway exited" / stdin EOF.
The CLI was unaffected. Root cause: each completed tool's verbose trail line
embedded up to 16KB of result_text, persisted in transcript Msg.tools[]
for the whole session and rendered EXPANDED by default, so an Ink
render-node tree was built for every one of up to 800 messages at once.
That tree blew past Node's heap at a few hundred MB, far below the 2.5GB
memory-monitor exit threshold, so the monitor never fired and the death
went unattributed.

- text.ts: persisted verbose tool-trail blocks now cap to a small preview
  (VERBOSE_TRAIL_MAX_CHARS=800/12 lines), not the 16KB live-render budget.
  Retained trail strings drop ~17x (12.2MB -> 0.7MB at 800 msgs); the live
  streaming tail still uses the larger LIVE_RENDER budget.
- tui_gateway/server.py: lower the gateway-side verbose text cap to match
  (1KB/16 lines) so we stop shipping output the TUI no longer renders.
- memoryMonitor.ts: derive critical/high thresholds from the real V8 heap
  ceiling (~88%/70%) instead of the hardcoded 2.5GB that killed the process
  at 31% of an 8GB ceiling; add a one-shot onWarn early-warning on fast
  sub-threshold heap growth so the next such death is diagnosable, not silent.
- entry.tsx: wire onWarn to a crash-log breadcrumb + stderr line.
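
A rough sketch of the threshold change, written in Python for illustration
only (the real code is memoryMonitor.ts). The 88%/70% fractions and the
2.5GB/8GB figures come from this message; every name below
(derive_thresholds, HeapGrowthWatcher, growth_bytes_per_sec) and the growth
rate itself are assumptions, not the actual implementation.

```python
GIB = 1024 ** 3


def derive_thresholds(heap_limit_bytes: int,
                      critical_frac: float = 0.88,
                      high_frac: float = 0.70) -> dict:
    """Scale the high/critical thresholds with the heap ceiling."""
    return {
        "high": int(heap_limit_bytes * high_frac),
        "critical": int(heap_limit_bytes * critical_frac),
    }


def fraction_of_ceiling(threshold_bytes: float, heap_limit_bytes: int) -> float:
    """How far into the heap a fixed threshold sits."""
    return threshold_bytes / heap_limit_bytes


class HeapGrowthWatcher:
    """One-shot early warning on fast heap growth (hypothetical shape)."""

    def __init__(self, on_warn, growth_bytes_per_sec: float):
        self._on_warn = on_warn
        self._rate = growth_bytes_per_sec  # assumed tunable, not from the commit
        self._last = None                  # (timestamp_sec, heap_used_bytes)
        self._warned = False

    def sample(self, now_sec: float, heap_used: int) -> None:
        if self._last is not None and not self._warned:
            t0, h0 = self._last
            dt = now_sec - t0
            if dt > 0 and (heap_used - h0) / dt >= self._rate:
                self._warned = True        # fire at most once per process
                self._on_warn(heap_used)
        self._last = (now_sec, heap_used)


# The old fixed threshold sat at 2.5/8 = 0.3125 of an 8GB ceiling.
old_fraction = fraction_of_ceiling(2.5 * GIB, 8 * GIB)
new_thresholds = derive_thresholds(8 * GIB)
```

Deriving the thresholds from the ceiling keeps them meaningful whatever heap
size the process is started with, which a fixed byte count cannot do.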

Full tool output is unchanged in the agent context and SQLite session — this
is display/transport only, no behavior or context change.
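
A minimal sketch of what a display-only cap like this could look like,
assuming the 1KB/16-line limits from server.py. The function name and the
exact wording of the omitted label are hypothetical; the real
_cap_tui_verbose_text body is not reproduced here.

```python
def cap_verbose_text(text: str, max_chars: int = 1_000, max_lines: int = 16) -> str:
    """Return a short preview of `text`; the caller's string is not modified."""
    lines = text.split("\n")
    if len(text) <= max_chars and len(lines) <= max_lines:
        return text  # already within budget, ship as-is
    # Apply the line cap first, then the character cap.
    preview = "\n".join(lines[:max_lines])[:max_chars]
    omitted = len(text) - len(preview)
    # Label format is an assumption for illustration.
    return f"{preview}\n[omitted {omitted} chars]"


full_output = "x" * 16_000            # stands in for a large tool result
shown = cap_verbose_text(full_output)  # only this preview goes to the TUI
```

Because the cap returns a new string, whatever holds the full output (the
agent context, the session store) keeps it untouched.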

Fixes #34095. Related #27282.

Tests: ui-tui text + new memoryMonitor suites (33 pass), python verbose-cap
guard (5 pass); full ui-tui suite shows no new failures vs pristine main.
E2E repro confirms the retention drop.
teknium1
2026-06-03 06:00:22 -07:00
parent ba57ebec33
commit e76d8bf5aa
8 changed files with 251 additions and 9 deletions

@@ -1657,8 +1657,15 @@ def _tool_ctx(name: str, args: dict) -> str:
     return ""
-_TUI_VERBOSE_TEXT_MAX_CHARS = 16_000
-_TUI_VERBOSE_TEXT_MAX_LINES = 240
+# Tool Args/Result text shipped to the TUI for the verbose trail line. The TUI
+# renders only a small persisted preview (ui-tui VERBOSE_TRAIL_MAX_CHARS), kept
+# all session and expanded by default — so shipping more than that is pure pipe
+# waste AND feeds the Ink render-tree blowup that silently OOM-killed the TUI
+# parent (#34095). Cap here to match the render budget (a hair more, so the
+# "[omitted …]" label is still informative when output is genuinely large).
+# Full output stays in the agent context and the SQLite session, untouched.
+_TUI_VERBOSE_TEXT_MAX_CHARS = 1_000
+_TUI_VERBOSE_TEXT_MAX_LINES = 16
 def _cap_tui_verbose_text(text: str) -> str: