From 56b8dccf252fcb60fa7b69c623071e096d2e2ce2 Mon Sep 17 00:00:00 2001
From: Mathijs van den Hurk
Date: Tue, 26 May 2026 21:44:57 +0200
Subject: [PATCH] fix(compressor): treat unanswered user questions as Active Task, not 'None'
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The Active Task field in compression summaries is the single most
important field for task continuity across context boundaries. The
previous template described it narrowly as a 'task assignment' or
'request', which caused the summary LLM to write 'None' whenever the
user's most recent input was a question, a decision request, or a
discussion turn rather than an imperative command. The assistant on the
other side of the compaction then treated the conversation as resolved
and gave a generic recap instead of answering the still-open question.

Expand the template guidance to cover:

* explicit task assignments
* questions awaiting an answer
* decisions awaiting input (A vs B)
* ongoing discussions where the assistant owes the next substantive reply

Reserve 'None' for the rare case where the last exchange was fully
resolved (e.g. user said 'thanks, that's all').

Also tighten the trailing CRITICAL instruction in the summary prompt so
the LLM cannot fall back to the old 'no imperative command → None'
heuristic.

No behavioural code changes; template strings only. All 83 existing
compressor tests pass.
---
 agent/context_compressor.py | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/agent/context_compressor.py b/agent/context_compressor.py
index 700952990..4f1b91894 100644
--- a/agent/context_compressor.py
+++ b/agent/context_compressor.py
@@ -1246,11 +1246,22 @@ Summary generation was unavailable, so this is a best-effort deterministic fallb
 
 # Shared structured template (used by both paths).
 _template_sections = f"""## Active Task
-[THE SINGLE MOST IMPORTANT FIELD. Copy the user's most recent request or
-task assignment verbatim — the exact words they used. If multiple tasks
-were requested and only some are done, list only the ones NOT yet completed.
-Continuation should pick up exactly here. Example:
+[THE SINGLE MOST IMPORTANT FIELD. Capture the user's most recent unfulfilled
+input verbatim — the exact words they used. This includes:
+- Explicit task assignments ("refactor the auth module")
+- Questions awaiting an answer ("waarom staat X op Y?", "wat zijn de volgende stappen?")
+- Decisions awaiting input ("optie A of B?")
+- Ongoing discussions where the assistant owes the next substantive reply
+A conversation where the user just asked a question IS an active task — the
+task is "answer that question with full context". Do NOT write "None" merely
+because the user did not issue an imperative command; reserve "None" for the
+rare case where the last exchange was fully resolved and the user said
+something like "thanks, that's all".
+If multiple items are outstanding, list only the ones NOT yet completed.
+Continuation should pick up exactly here. Examples:
 "User asked: 'Now refactor the auth module to use JWT instead of sessions'"
+"User asked: 'Waarom stond provider ineens op openrouter?' — needs investigation + answer"
+"User chose option A; awaiting implementation of step 2"
 If the user's most recent message was a reverse signal (stop, undo, roll back,
 never mind, just verify, change of topic) that supersedes earlier work,
 write the reverse signal verbatim and DO NOT carry forward the
@@ -1321,7 +1332,7 @@ PREVIOUS SUMMARY:
 NEW TURNS TO INCORPORATE:
 {content_to_summarize}
 
-Update the summary using this exact structure. PRESERVE all existing information that is still relevant. ADD new completed actions to the numbered list (continue numbering). Move items from "In Progress" to "Completed Actions" when done. Move answered questions to "Resolved Questions". Update "Active State" to reflect current state. Remove information only if it is clearly obsolete. CRITICAL: Update "## Active Task" to reflect the user's most recent unfulfilled request — this is the most important field for task continuity.
+Update the summary using this exact structure. PRESERVE all existing information that is still relevant. ADD new completed actions to the numbered list (continue numbering). Move items from "In Progress" to "Completed Actions" when done. Move answered questions to "Resolved Questions". Update "Active State" to reflect current state. Remove information only if it is clearly obsolete. CRITICAL: Update "## Active Task" to reflect the user's most recent unfulfilled input — this includes any question, decision request, or discussion turn that the assistant has not yet answered. Only write "None" if the last exchange was fully resolved.
 
 {_template_sections}"""
 else:
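
Review note: since the change is template strings only, one way to guard against the guidance being narrowed again is a small string-level check. The sketch below is hypothetical and not part of this patch or the project's test suite. `ACTIVE_TASK_GUIDANCE` is an abridged excerpt of the added lines, standing in for the real `_template_sections` string, and `REQUIRED_PHRASES` and `missing_guidance` are illustrative names.

```python
# Hypothetical regression check; not part of the patch or the existing tests.
# ACTIVE_TASK_GUIDANCE is an abridged excerpt of the lines added in the diff,
# used here as a stand-in for _template_sections in agent/context_compressor.py.
ACTIVE_TASK_GUIDANCE = (
    '- Explicit task assignments ("refactor the auth module")\n'
    "- Questions awaiting an answer\n"
    "- Decisions awaiting input\n"
    "- Ongoing discussions where the assistant owes the next substantive reply\n"
    'Do NOT write "None" merely because the user did not issue an imperative command\n'
)

# Phrases that mark the four covered cases plus the restriction on "None".
REQUIRED_PHRASES = (
    "Explicit task assignments",
    "Questions awaiting an answer",
    "Decisions awaiting input",
    "Ongoing discussions",
    'Do NOT write "None"',
)


def missing_guidance(template: str) -> list[str]:
    """Return the required phrases that do not appear in the template text."""
    return [phrase for phrase in REQUIRED_PHRASES if phrase not in template]


if __name__ == "__main__":
    # An empty list means every required phrase is present.
    print(missing_guidance(ACTIVE_TASK_GUIDANCE))
```

In a real test the excerpt would be replaced by the template string from the module itself; how that string is exposed for import is not shown in this diff, so that wiring is left as an assumption.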