docs(credential-pools): document immediate rotation on usage-limit 429 (#34580)

The rotation flowchart only described the generic 'retry once, rotate on
second 429' path. ChatGPT/Codex plan-limit 429s carry a usage_limit_reached
reason and rotate to the next pool key immediately (no retry, since the cap
won't clear on retry). Document that case so the docs match the code.
Teknium
2026-05-29 04:50:14 -07:00
committed by GitHub
parent 0dba60f73b
commit 2159d2a729
2 changed files with 10 additions and 4 deletions

@@ -22,8 +22,11 @@ Your request
   → Pick key from pool (round_robin / least_used / fill_first / random)
   → Send to provider
   → 429 rate limit?
-      → Retry same key once (transient blip)
-      → Second 429 → rotate to next pool key
+      → Plan/usage limit reached (e.g. ChatGPT/Codex "usage limit reached")?
+          → Rotate to next pool key immediately (no retry — the cap won't clear on retry)
+      → Generic / transient 429?
+          → Retry same key once (transient blip)
+          → Second 429 → rotate to next pool key
       → All keys exhausted → fallback_model (different provider)
   → 402 billing error?
       → Immediately rotate to next pool key (24h cooldown)

@@ -18,8 +18,11 @@ Your request
   → Pick key from pool (round_robin / least_used / fill_first / random)
   → Send to provider
   → 429 rate limit?
-      → Retry same key once (transient blip)
-      → Second 429 → rotate to next pool key
+      → Plan/usage limit reached (e.g. ChatGPT/Codex "usage limit reached")?
+          → Rotate to next pool key immediately (no retry — the cap won't clear on retry)
+      → Generic / transient 429?
+          → Retry same key once (transient blip)
+          → Second 429 → rotate to next pool key
       → All keys exhausted → fallback_model (different provider)
   → 402 billing error?
       → Immediately rotate to next pool key (24h cooldown)
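
For readers who prefer code to a flowchart, the documented decision flow can be sketched roughly as follows. This is an illustrative sketch only, not the project's implementation: `Response`, `send_with_rotation`, `send`, and `fallback` are hypothetical names, the pool is walked in list order rather than by a configured strategy, the 24h cooldown on 402 is not modelled, and falling back after the pool is exhausted by 402s is an assumption (the flowchart shows the fallback only under 429).

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class Response:
    status: int
    # Hypothetical field: plan-limit 429s are assumed to carry this reason.
    reason: Optional[str] = None


def send_with_rotation(
    keys: Sequence[str],
    send: Callable[[str], Response],
    fallback: Optional[Callable[[], Response]] = None,
) -> Tuple[Response, Optional[str]]:
    """Try each pool key in order and return (response, key_used).

    key_used is None when the fallback produced the response.
    """
    for key in keys:
        resp = send(key)

        # Generic / transient 429: retry the same key once.
        # A usage-limit 429 skips this retry, since the cap won't clear.
        if resp.status == 429 and resp.reason != "usage_limit_reached":
            resp = send(key)

        # Still rate-limited, or a 402 billing error: rotate to the next key.
        if resp.status in (429, 402):
            continue

        return resp, key

    # All keys exhausted: hand off to the fallback, if one is configured.
    if fallback is not None:
        return fallback(), None
    raise RuntimeError("all pool keys exhausted and no fallback configured")
```

With a pool of three keys where the first hits its plan limit and the second returns a generic 429 twice, this sketch calls the first key once, the second key twice, and then succeeds on the third.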