fix(compress): abort instead of dropping messages when summary LLM fails (#28102)
When auxiliary compression's summary generation returns None (aux model errored, returned non-JSON, timed out, etc.) the compressor previously still dropped every middle message between compress_start..compress_end and replaced them with a static 'Summary generation was unavailable' placeholder. The session kept going but the user silently lost N turns of context for nothing. New behavior: on summary failure, compress() aborts entirely — returns the input messages unchanged and sets _last_compress_aborted=True. The existing _summary_failure_cooldown_until gate (30-60s) keeps the aux model from being burned on every turn. Auto-compress callers detect the no-op (len(after) == len(before)) and stop looping. The chat is 'frozen' at its current size until the next /compress or /new. Manual /compress (CLI + gateway) now passes force=True which clears the cooldown so users can retry immediately after an auto-abort. If the manual retry also fails, the user gets a visible warning telling them nothing was dropped and how to retry. - agent/context_compressor.py: compress() gains force= kwarg; failure branch sets _last_compress_aborted and returns messages unchanged instead of inserting placeholder. - run_agent.py: _compress_context() detects abort, surfaces warning, skips session-rotation entirely, returns messages unchanged. - cli.py + gateway/run.py: manual /compress paths pass force=True. - gateway/run.py: hygiene + /compress handlers detect _last_compress_aborted and emit the new 'Compression aborted' warning (gateway.compress.aborted) instead of the old 'N historical messages were removed' message. - locales/*.yaml: new gateway.compress.aborted key in all 16 locales. - tests: updated to assert the abort contract (messages preserved, compression_count not incremented, abort flag set, no placeholder leaked). New test_force_true_bypasses_failure_cooldown covers the manual-retry path.
This commit is contained in:
11
run_agent.py
11
run_agent.py
@ -3714,12 +3714,19 @@ class AIAgent:
|
||||
"""
|
||||
return self.api_mode != "codex_responses"
|
||||
|
||||
def _compress_context(self, messages: list, system_message: str, *, approx_tokens: int = None, task_id: str = "default", focus_topic: str = None) -> tuple:
|
||||
"""Forwarder — see ``agent.conversation_compression.compress_context``."""
|
||||
def _compress_context(self, messages: list, system_message: str, *, approx_tokens: int = None, task_id: str = "default", focus_topic: str = None, force: bool = False) -> tuple:
|
||||
"""Forwarder — see ``agent.conversation_compression.compress_context``.
|
||||
|
||||
``force=True`` is passed by the manual ``/compress`` slash command
|
||||
so users can bypass the summary-failure cooldown after an
|
||||
auto-compress abort. Auto-compress callers use the default
|
||||
``force=False``.
|
||||
"""
|
||||
from agent.conversation_compression import compress_context
|
||||
return compress_context(
|
||||
self, messages, system_message,
|
||||
approx_tokens=approx_tokens, task_id=task_id, focus_topic=focus_topic,
|
||||
force=force,
|
||||
)
|
||||
|
||||
def _set_tool_guardrail_halt(self, decision: ToolGuardrailDecision) -> None:
|
||||
|
||||
Reference in New Issue
Block a user