The length-continue path's user-facing vprint and continuation prompt
both told the model "your response was truncated by the output length
limit." That's a lie when the stub came from a partial-stream network
error (issue #30963) — and a lie the model can detect, leading to "I
wasn't truncated, I'm done" no-op responses that defeat the
continuation entirely.
Detect the partial-stream-stub via response.id and swap in:
- vprint: "Stream interrupted by network error
(finish_reason='length' on partial-stream-stub)"
- prompt: "[System: The previous response was cut off by a network
error mid-stream. Continue exactly where you left off.
Do not restart or repeat prior text. Finish the answer
directly.]"
Real length truncations still see the original "truncated by output
length limit" prompt — the model needs to know which class of failure
it's recovering from. Same length_continue_retries=3 budget,
truncated_response_parts merging, and final-response stitching
infrastructure on both branches.
Refs: NousResearch/hermes-agent#30963