Sanjays2402
570f8bab8f
fix(compression): exclude completion tokens from compression trigger (#12026)
Cherry-picked from PR #12481 by @Sanjays2402.
Reasoning models (GLM-5.1, QwQ, DeepSeek R1) inflate completion_tokens
with internal thinking tokens. The compression trigger summed
prompt_tokens + completion_tokens, causing premature compression at ~42%
actual context usage instead of the configured 50% threshold.
Now uses only prompt_tokens — completion tokens don't consume context
window space for the next API call.
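
A minimal sketch of the changed trigger, for illustration only. The names `Usage`, `should_compress_old`, and `should_compress` are hypothetical and are not taken from the repository; the 100,000-token window and the token counts are assumed figures chosen to match the ~42% case described above.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Token counts as reported by an API response (hypothetical container)."""
    prompt_tokens: int
    completion_tokens: int


def should_compress_old(usage: Usage, context_window: int, threshold: float = 0.5) -> bool:
    # Previous behavior as described: prompt and completion tokens are summed,
    # so thinking tokens from reasoning models push the total over the threshold early.
    return usage.prompt_tokens + usage.completion_tokens >= threshold * context_window


def should_compress(usage: Usage, context_window: int, threshold: float = 0.5) -> bool:
    # New behavior as described: only prompt_tokens count toward the trigger.
    return usage.prompt_tokens >= threshold * context_window


# Assumed numbers: the prompt fills 42% of the window, and the completion
# (mostly internal reasoning) adds another 8,000 tokens.
usage = Usage(prompt_tokens=42_000, completion_tokens=8_000)
old_decision = should_compress_old(usage, context_window=100_000)
new_decision = should_compress(usage, context_window=100_000)
```

With these assumed numbers the old check fires at 42% prompt usage, while the new check waits until the prompt itself reaches the configured 50%.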
- 3 new regression tests
- Added AUTHOR_MAP entry for @Sanjays2402
Closes #12026
2026-04-20 05:12:10 -07:00