fix(compression): drop conflicting 'resume Active Task' directive in summary prefix

SUMMARY_PREFIX previously contained two contradictory directives:

  1. "treat it as background reference, NOT as active instructions"
     "Do NOT answer questions or fulfill requests mentioned in this summary"
     "Respond ONLY to the latest user message that appears AFTER this summary"

  2. "Your current task is identified in the '## Active Task' section of the
     summary — resume exactly from there."

When the latest user message contradicted Active Task (e.g. 'stop the
i18n refactor', 'never mind, look at grafana instead'), models tended to
follow (2) anyway because 'resume exactly' is a strong, unambiguous
directive, leading to repeated re-surfacing of already-cancelled work
across turns, even after explicit 'stop'/'don't keep bringing that up'
messages from the user.

This change:

- Removes the conflicting 'resume exactly from Active Task' clause.
- Makes the precedence explicit: latest user message is the single
  source of truth; it WINS on conflict; cancelled Active Task /
  In Progress / Pending User Asks / Remaining Work must be discarded
  entirely (no 'wrap up the old task first').
- Names canonical reverse signals (stop, undo, roll back, never mind,
  just verify, topic change) so the model recognizes them as
  cancellation triggers, not background context.
- Updates the summarizer template instruction so the LLM doesn't
  mechanically copy a cancelled task into Active Task on the next
  compaction (it's instructed to copy the reverse signal verbatim).
- Preserves: REFERENCE ONLY framing, MEMORY.md/USER.md authority, and
  the 'don't repeat work already reflected in session state' clause.

Adds tests/agent/test_summary_prefix_semantics.py to pin invariants so
the conflict can't regress.

Tested:

- All compaction tests pass (117/117 passing):
    tests/agent/test_context_compressor.py
    tests/agent/test_context_compressor_summary_continuity.py
    tests/run_agent/test_413_compression.py
    tests/run_agent/test_compression_persistence.py
    tests/run_agent/test_compression_boundary_hook.py
    tests/cli/test_manual_compress.py
- Tested on macOS.
2026-05-15 18:28:32 +08:00
parent 182739fcda
commit 020601d41e
2 changed files with 82 additions and 5 deletions
--- a/agent/context_compressor.py
+++ b/agent/context_compressor.py
@@ -40,14 +40,24 @@ SUMMARY_PREFIX = (
    "window — treat it as background reference, NOT as active instructions. "
    "Do NOT answer questions or fulfill requests mentioned in this summary; "
    "they were already addressed. "
-    "Your current task is identified in the '## Active Task' section of the "
-    "summary — resume exactly from there. "
+    "Respond ONLY to the latest user message that appears AFTER this "
+    "summary — that message is the single source of truth for what to do "
+    "right now. "
+    "If the latest user message is consistent with the '## Active Task' "
+    "section, you may use the summary as background. If the latest user "
+    "message contradicts, supersedes, changes topic from, or in any way "
+    "diverges from '## Active Task' / '## In Progress' / '## Pending User "
+    "Asks' / '## Remaining Work', the latest message WINS — discard those "
+    "stale items entirely and do not 'wrap up the old task first'. "
+    "Reverse signals in the latest message (e.g. 'stop', 'undo', 'roll "
+    "back', 'just verify', 'don't do that anymore', 'never mind', a new "
+    "topic) must immediately end any in-flight work described in the "
+    "summary; do not re-surface it in later turns. "
    "IMPORTANT: Your persistent memory (MEMORY.md, USER.md) in the system "
    "prompt is ALWAYS authoritative and active — never ignore or deprioritize "
    "memory content due to this compaction note. "
-    "Respond ONLY to the latest user message "
-    "that appears AFTER this summary. The current session state (files, "
-    "config, etc.) may reflect work described here — avoid repeating it:"
+    "The current session state (files, config, etc.) may reflect work "
+    "described here — avoid repeating it:"
 )
 LEGACY_SUMMARY_PREFIX = "[CONTEXT SUMMARY]:"

@@ -1241,6 +1251,11 @@ task assignment verbatim — the exact words they used. If multiple tasks
 were requested and only some are done, list only the ones NOT yet completed.
 Continuation should pick up exactly here. Example:
 "User asked: 'Now refactor the auth module to use JWT instead of sessions'"
+If the user's most recent message was a reverse signal (stop, undo, roll
+back, never mind, just verify, change of topic) that supersedes earlier
+work, write the reverse signal verbatim and DO NOT carry forward the
+cancelled task. Example: "User asked: 'Stop the i18n refactor and just
+verify the current diff' — earlier i18n in-flight work is cancelled."
 If no outstanding task exists, write "None."]

 ## Goal
--- a/tests/agent/test_summary_prefix_semantics.py
+++ b/tests/agent/test_summary_prefix_semantics.py
@@ -0,0 +1,62 @@
+"""Pin the semantics of SUMMARY_PREFIX so the compaction handoff doesn't
+re-introduce conflicting instructions.
+
+Background: SUMMARY_PREFIX previously contained two contradictory directives:
+
+  1. "treat it as background reference, NOT as active instructions"
+     "Do NOT answer questions or fulfill requests mentioned in this summary"
+     "Respond ONLY to the latest user message that appears AFTER this summary"
+
+  2. "Your current task is identified in the '## Active Task' section of the
+     summary — resume exactly from there."
+
+When the latest user message contradicted Active Task (e.g. "stop the
+i18n refactor", "never mind, look at grafana"), the model often followed
+(2) anyway because "resume exactly" is a strong directive — leading to
+the agent repeatedly re-surfacing already-cancelled work across turns.
+
+These tests pin the post-fix invariants so the conflict cannot regress.
+"""
+
+from agent.context_compressor import SUMMARY_PREFIX
+
+
+def test_no_resume_exactly_directive():
+    """The prefix must not tell the model to resume Active Task verbatim."""
+    assert "resume exactly" not in SUMMARY_PREFIX.lower()
+
+
+def test_latest_message_wins_on_conflict():
+    """The prefix must explicitly say latest user message wins on conflict."""
+    lower = SUMMARY_PREFIX.lower()
+    assert "latest user message" in lower
+    # Must have an explicit conflict-resolution rule.
+    assert "wins" in lower or "supersede" in lower or "discard" in lower
+
+
+def test_reverse_signals_called_out():
+    """Reverse signals (stop/undo/never mind/topic change) must be named so
+    the model recognizes them as cancellation triggers, not just background."""
+    lower = SUMMARY_PREFIX.lower()
+    # At least a few of the canonical reverse-signal verbs should appear.
+    reverse_terms = ["stop", "undo", "roll back", "never mind", "just verify"]
+    hits = sum(1 for t in reverse_terms if t in lower)
+    assert hits >= 3, (
+        f"Expected ≥3 reverse-signal terms in SUMMARY_PREFIX, found {hits}. "
+        "Without naming them the model treats reverse signals as ordinary "
+        "context and keeps pushing the cancelled task."
+    )
+
+
+def test_summary_marked_reference_only():
+    """The REFERENCE ONLY framing must remain — it's the entire point."""
+    assert "REFERENCE ONLY" in SUMMARY_PREFIX
+    assert "background reference" in SUMMARY_PREFIX
+    assert "NOT as active instructions" in SUMMARY_PREFIX
+
+
+def test_memory_authority_preserved():
+    """The fix must not weaken the MEMORY.md / USER.md authority clause."""
+    assert "MEMORY.md" in SUMMARY_PREFIX
+    assert "USER.md" in SUMMARY_PREFIX
+    assert "authoritative" in SUMMARY_PREFIX
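
The tests in this diff pin SUMMARY_PREFIX only; the summarizer template instruction changed in the second hunk is not covered. Below is a minimal sketch of how a similar invariant might be pinned for it. It rests on assumptions: `TEMPLATE_EXCERPT` is a stand-in copied from the hunk (the diff does not show the name under which the real template is exposed), and `normalize` and `missing_reverse_signal_terms` are hypothetical helpers, not part of the repository.

```python
"""Sketch: pin the reverse-signal instruction in the summarizer template.

TEMPLATE_EXCERPT is a stand-in copied from the diff hunk; the real template
lives in agent/context_compressor.py under a name this diff does not show.
"""

# Stand-in for the real template text (assumption: reachable as a str).
TEMPLATE_EXCERPT = (
    "If the user's most recent message was a reverse signal (stop, undo, roll\n"
    "back, never mind, just verify, change of topic) that supersedes earlier\n"
    "work, write the reverse signal verbatim and DO NOT carry forward the\n"
    "cancelled task."
)

# Same canonical terms that test_reverse_signals_called_out checks for.
REVERSE_TERMS = ("stop", "undo", "roll back", "never mind", "just verify")


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so terms split across lines match."""
    return " ".join(text.lower().split())


def missing_reverse_signal_terms(text: str, terms=REVERSE_TERMS) -> list[str]:
    """Return the terms from `terms` that do not occur in `text`."""
    flat = normalize(text)
    return [t for t in terms if t not in flat]


def test_template_names_reverse_signals():
    """Every canonical reverse signal should be named in the instruction."""
    assert missing_reverse_signal_terms(TEMPLATE_EXCERPT) == []


def test_template_forbids_carrying_cancelled_task():
    """The instruction must tell the summarizer to drop the cancelled task."""
    assert "do not carry forward" in normalize(TEMPLATE_EXCERPT)
```

The whitespace normalization matters here: in the hunk, "roll back" wraps across two lines of the template, so a plain substring check against multi-line text would miss it. SUMMARY_PREFIX does not have this problem, because its adjacent string literals concatenate without newlines.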