From 2159d2a72964865d047b1b46f6347be1e2a74e9f Mon Sep 17 00:00:00 2001
From: Teknium <127238744+teknium1@users.noreply.github.com>
Date: Fri, 29 May 2026 04:50:14 -0700
Subject: [PATCH] docs(credential-pools): document immediate rotation on usage-limit 429 (#34580)

The rotation flowchart only described the generic 'retry once, rotate on
second 429' path. ChatGPT/Codex plan-limit 429s carry a
usage_limit_reached reason and rotate to the next pool key immediately
(no retry, since the cap won't clear on retry). Document that case so
the docs match the code.
---
 website/docs/user-guide/features/credential-pools.md | 7 +++++--
 .../current/user-guide/features/credential-pools.md | 7 +++++--
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/website/docs/user-guide/features/credential-pools.md b/website/docs/user-guide/features/credential-pools.md
index 57bf3552b..807000570 100644
--- a/website/docs/user-guide/features/credential-pools.md
+++ b/website/docs/user-guide/features/credential-pools.md
@@ -22,8 +22,11 @@ Your request
   → Pick key from pool (round_robin / least_used / fill_first / random)
   → Send to provider
   → 429 rate limit?
-      → Retry same key once (transient blip)
-      → Second 429 → rotate to next pool key
+      → Plan/usage limit reached (e.g. ChatGPT/Codex "usage limit reached")?
+          → Rotate to next pool key immediately (no retry — the cap won't clear on retry)
+      → Generic / transient 429?
+          → Retry same key once (transient blip)
+          → Second 429 → rotate to next pool key
       → All keys exhausted → fallback_model (different provider)
   → 402 billing error?
       → Immediately rotate to next pool key (24h cooldown)
diff --git a/website/i18n/zh-Hans/docusaurus-plugin-content-docs/current/user-guide/features/credential-pools.md b/website/i18n/zh-Hans/docusaurus-plugin-content-docs/current/user-guide/features/credential-pools.md
index fe538fb9b..d232f4350 100644
--- a/website/i18n/zh-Hans/docusaurus-plugin-content-docs/current/user-guide/features/credential-pools.md
+++ b/website/i18n/zh-Hans/docusaurus-plugin-content-docs/current/user-guide/features/credential-pools.md
@@ -18,8 +18,11 @@ Your request
   → Pick key from pool (round_robin / least_used / fill_first / random)
   → Send to provider
   → 429 rate limit?
-      → Retry same key once (transient blip)
-      → Second 429 → rotate to next pool key
+      → Plan/usage limit reached (e.g. ChatGPT/Codex "usage limit reached")?
+          → Rotate to next pool key immediately (no retry — the cap won't clear on retry)
+      → Generic / transient 429?
+          → Retry same key once (transient blip)
+          → Second 429 → rotate to next pool key
       → All keys exhausted → fallback_model (different provider)
   → 402 billing error?
       → Immediately rotate to next pool key (24h cooldown)
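
Reviewer note: the documented flow can be sketched in Python as below. This is an illustration only, not the project's implementation. `Response`, `send_with_pool`, `send`, and `fallback` are hypothetical names; the sketch walks the pool in list order rather than modeling the round_robin / least_used / fill_first / random strategies, does not model the 24h cooldown, and assumes the fallback is also used when every key is skipped because of a 402.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Response:
    """Hypothetical provider response: HTTP status plus an optional error reason."""
    status: int
    reason: Optional[str] = None  # e.g. "usage_limit_reached" on a plan-limit 429


def send_with_pool(
    keys: Sequence[str],
    send: Callable[[str], Response],
    fallback: Callable[[], Response],
) -> Response:
    """Try each pool key in order, following the flowchart's rotation rules."""
    for key in keys:
        resp = send(key)

        if resp.status == 429:
            if resp.reason == "usage_limit_reached":
                # Plan/usage cap: retrying the same key cannot help, rotate now.
                continue
            # Generic 429: retry the same key once in case it was a transient blip.
            resp = send(key)
            if resp.status == 429:
                # Second 429 on this key: rotate to the next pool key.
                continue

        if resp.status == 402:
            # Billing error: rotate immediately (cooldown tracking omitted here).
            continue

        return resp

    # All keys exhausted: hand off to the fallback model (different provider).
    return fallback()
```

With a pool of three keys where the first returns a usage-limit 429 and the second returns generic 429s, this sketch calls the first key once, the second key twice, and then succeeds on the third.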