fix(compressor): treat unanswered user questions as Active Task, not 'None'

The Active Task field in compression summaries is the single most important
field for task continuity across context boundaries. The previous template
described it narrowly as a 'task assignment' or 'request', which caused the
summary LLM to write 'None' whenever the user's most recent input was a
question, a decision request, or a discussion turn rather than an
imperative command. The assistant on the other side of the compaction then
treated the conversation as resolved and gave a generic recap instead of
answering the still-open question.

Expand the template guidance to cover:

  * explicit task assignments
  * questions awaiting an answer
  * decisions awaiting input (A vs B)
  * ongoing discussions where the assistant owes the next substantive reply

Reserve 'None' for the rare case where the last exchange was fully
resolved (e.g. user said 'thanks, that's all').
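
As an illustration only, the rule above can be written as a small deterministic function. The compressor itself does not work this way; it relies on the summary LLM following the prompt text. The names TurnKind, LastUserTurn and active_task are hypothetical and exist only for this sketch:

```python
from dataclasses import dataclass
from enum import Enum, auto


class TurnKind(Enum):
    """Hypothetical categories for the user's most recent input."""
    TASK = auto()        # explicit task assignment
    QUESTION = auto()    # question awaiting an answer
    DECISION = auto()    # decision awaiting input (A vs B)
    DISCUSSION = auto()  # assistant owes the next substantive reply
    RESOLVED = auto()    # fully resolved, e.g. "thanks, that's all"


@dataclass
class LastUserTurn:
    kind: TurnKind
    text: str  # the user's exact words


def active_task(turn: LastUserTurn) -> str:
    """Return the Active Task text a summary should carry forward."""
    if turn.kind is TurnKind.RESOLVED:
        # The only case in which "None" is appropriate.
        return "None"
    # Every other kind is unfulfilled input and is quoted verbatim.
    return f"User asked: '{turn.text}'"
```

Under the old wording, only the TASK branch would have produced a non-'None' value; the expanded guidance asks for all four unresolved kinds to be carried forward.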

Also tighten the trailing CRITICAL instruction in the summary prompt so the
LLM cannot fall back to the old 'no imperative command → None' heuristic.

No behavioural code changes — template strings only. All 83 existing
compressor tests pass.
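
Because the change is confined to template strings, a string-level check is one possible way to guard the wording against regressions. The sketch below is not part of this commit: ACTIVE_TASK_GUIDANCE is a shortened stand-in for the real template text, and REQUIRED_PHRASES and missing_phrases are hypothetical names.

```python
# Shortened stand-in for the template text; the real string lives in the
# compressor module and is longer than this.
ACTIVE_TASK_GUIDANCE = (
    "Capture the user's most recent unfulfilled input verbatim. This includes:\n"
    "- Explicit task assignments\n"
    "- Questions awaiting an answer\n"
    "- Decisions awaiting input\n"
    "- Ongoing discussions where the assistant owes the next substantive reply\n"
    'Do NOT write "None" merely because the user did not issue an imperative command.\n'
)

# Phrases the expanded guidance is expected to contain (assumed list).
REQUIRED_PHRASES = (
    "questions awaiting an answer",
    "decisions awaiting input",
    "ongoing discussions",
    'do not write "none"',
)


def missing_phrases(template: str) -> list[str]:
    """Return the required phrases that do not appear in the template."""
    lowered = template.lower()
    return [phrase for phrase in REQUIRED_PHRASES if phrase not in lowered]
```

A test could then assert that missing_phrases(...) returns an empty list for the current template.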
Mathijs van den Hurk
2026-05-26 21:44:57 +02:00
committed by Teknium
parent 020601d41e
commit 56b8dccf25

@@ -1246,11 +1246,22 @@ Summary generation was unavailable, so this is a best-effort deterministic fallb
 # Shared structured template (used by both paths).
 _template_sections = f"""## Active Task
-[THE SINGLE MOST IMPORTANT FIELD. Copy the user's most recent request or
-task assignment verbatim — the exact words they used. If multiple tasks
-were requested and only some are done, list only the ones NOT yet completed.
-Continuation should pick up exactly here. Example:
+[THE SINGLE MOST IMPORTANT FIELD. Capture the user's most recent unfulfilled
+input verbatim — the exact words they used. This includes:
+- Explicit task assignments ("refactor the auth module")
+- Questions awaiting an answer ("waarom staat X op Y?", "wat zijn de volgende stappen?")
+- Decisions awaiting input ("optie A of B?")
+- Ongoing discussions where the assistant owes the next substantive reply
+A conversation where the user just asked a question IS an active task — the
+task is "answer that question with full context". Do NOT write "None" merely
+because the user did not issue an imperative command; reserve "None" for the
+rare case where the last exchange was fully resolved and the user said
+something like "thanks, that's all".
+If multiple items are outstanding, list only the ones NOT yet completed.
+Continuation should pick up exactly here. Examples:
 "User asked: 'Now refactor the auth module to use JWT instead of sessions'"
+"User asked: 'Waarom stond provider ineens op openrouter?' — needs investigation + answer"
+"User chose option A; awaiting implementation of step 2"
 If the user's most recent message was a reverse signal (stop, undo, roll
 back, never mind, just verify, change of topic) that supersedes earlier
 work, write the reverse signal verbatim and DO NOT carry forward the
@@ -1321,7 +1332,7 @@ PREVIOUS SUMMARY:
 NEW TURNS TO INCORPORATE:
 {content_to_summarize}
-Update the summary using this exact structure. PRESERVE all existing information that is still relevant. ADD new completed actions to the numbered list (continue numbering). Move items from "In Progress" to "Completed Actions" when done. Move answered questions to "Resolved Questions". Update "Active State" to reflect current state. Remove information only if it is clearly obsolete. CRITICAL: Update "## Active Task" to reflect the user's most recent unfulfilled request — this is the most important field for task continuity.
+Update the summary using this exact structure. PRESERVE all existing information that is still relevant. ADD new completed actions to the numbered list (continue numbering). Move items from "In Progress" to "Completed Actions" when done. Move answered questions to "Resolved Questions". Update "Active State" to reflect current state. Remove information only if it is clearly obsolete. CRITICAL: Update "## Active Task" to reflect the user's most recent unfulfilled input — this includes any question, decision request, or discussion turn that the assistant has not yet answered. Only write "None" if the last exchange was fully resolved.
 {_template_sections}"""
 else: