fix(compress): abort instead of dropping messages when summary LLM fails (#28102)

When auxiliary compression's summary generation returns None (aux model
errored, returned non-JSON, timed out, etc.) the compressor previously
still dropped every middle message between compress_start..compress_end
and replaced them with a static 'Summary generation was unavailable'
placeholder. The session kept going but the user silently lost N turns
of context for nothing.

New behavior: on summary failure, compress() aborts entirely — returns
the input messages unchanged and sets _last_compress_aborted=True. The
existing _summary_failure_cooldown_until gate (30-60s) keeps the aux
model from being burned on every turn. Auto-compress callers detect
the no-op (len(after) == len(before)) and stop looping. The chat is
'frozen' at its current size until the next /compress or /new.

Manual /compress (CLI + gateway) now passes force=True which clears
the cooldown so users can retry immediately after an auto-abort. If
the manual retry also fails, the user gets a visible warning telling
them nothing was dropped and how to retry.

- agent/context_compressor.py: compress() gains force= kwarg; failure
  branch sets _last_compress_aborted and returns messages unchanged
  instead of inserting placeholder.
- run_agent.py: _compress_context() detects abort, surfaces warning,
  skips session-rotation entirely, returns messages unchanged.
- cli.py + gateway/run.py: manual /compress paths pass force=True.
- gateway/run.py: hygiene + /compress handlers detect _last_compress_aborted
  and emit the new 'Compression aborted' warning (gateway.compress.aborted)
  instead of the old 'N historical messages were removed' message.
- locales/*.yaml: new gateway.compress.aborted key in all 16 locales.
- tests: updated to assert the abort contract (messages preserved,
  compression_count not incremented, abort flag set, no placeholder
  leaked). New test_force_true_bypasses_failure_cooldown covers the
  manual-retry path.
This commit is contained in:
Teknium
2026-05-18 10:19:40 -07:00
committed by GitHub
parent 65e0c49b77
commit 1634397ddb
24 changed files with 249 additions and 103 deletions

View File

@ -3714,12 +3714,19 @@ class AIAgent:
"""
return self.api_mode != "codex_responses"
def _compress_context(self, messages: list, system_message: str, *, approx_tokens: int = None, task_id: str = "default", focus_topic: str = None) -> tuple:
"""Forwarder — see ``agent.conversation_compression.compress_context``."""
def _compress_context(self, messages: list, system_message: str, *, approx_tokens: int = None, task_id: str = "default", focus_topic: str = None, force: bool = False) -> tuple:
"""Forwarder — see ``agent.conversation_compression.compress_context``.
``force=True`` is passed by the manual ``/compress`` slash command
so users can bypass the summary-failure cooldown after an
auto-compress abort. Auto-compress callers use the default
``force=False``.
"""
from agent.conversation_compression import compress_context
return compress_context(
self, messages, system_message,
approx_tokens=approx_tokens, task_id=task_id, focus_topic=focus_topic,
force=force,
)
def _set_tool_guardrail_halt(self, decision: ToolGuardrailDecision) -> None: