fix(model_metadata): drop stale ≤256,000 cache entries for Grok-4.3

The ``grok-4.3`` (1M context) catalog entry was added on 2026-05-15
(ce0e189d3).  Between 2026-04-10 (when ``grok-4`` at 256,000 was first
added by b57769718) and 2026-05-15, grok-4.3 slugs resolved via the
generic ``grok-4`` substring catch-all and that 256,000 value was
persisted to context_length_cache.yaml.  Users who first queried
grok-4.3 in that 35-day window are stuck at 256K forever — the cache
is read at step 1 before the hardcoded defaults in step 8, so the
correct 1M entry is never reached.

Mirror the existing Kimi/Codex/MiniMax-M3 stale-cache guards: add
_model_name_suggests_grok_4_3() and an elif branch that drops any
cached value ≤ 256,000 for a grok-4.3 slug so the next lookup falls
through to the 1M hardcoded default.

Adds 4 regression tests: helper unit test, stale-drop-and-re-resolve,
correct-cache-preserved, and no-clobber for plain grok-4 (256K correct).
This commit is contained in:
AhmetArif0
2026-06-02 02:43:38 +03:00
committed by Teknium
parent b04c6e95f6
commit 9756dff5fd
2 changed files with 81 additions and 0 deletions

View File

@ -1140,6 +1140,18 @@ def _model_name_suggests_minimax_m3(model: str) -> bool:
return "minimax-m3" in model.lower()
def _model_name_suggests_grok_4_3(model: str) -> bool:
"""Return True if the model name looks like a Grok 4.3 variant.
Catches ``grok-4.3``, ``grok-4.3-latest``, and similar slugs.
Used as a guard against stale cache entries seeded by pre-catalog builds
that resolved grok-4.3 via the generic ``grok-4`` catch-all (256,000)
before the ``grok-4.3`` (1M) entry was added to DEFAULT_CONTEXT_LENGTHS
on 2026-05-15.
"""
return "grok-4.3" in model.lower()
def _query_local_context_length(model: str, base_url: str, api_key: str = "") -> Optional[int]:
"""Query a local server for the model's context length."""
import httpx
@ -1564,6 +1576,19 @@ def get_model_context_length(
model, base_url, f"{cached:,}",
)
_invalidate_cached_context_length(model, base_url)
# Invalidate stale ≤256,000 cache entries for Grok-4.3. The
# ``grok-4.3`` (1M) entry was added to DEFAULT_CONTEXT_LENGTHS on
# 2026-05-15; prior to that, grok-4.3 slugs resolved via the
# ``grok-4`` catch-all (256,000) and that value was persisted.
# grok-4.3 is 1M, so any sub-262K cached value is a pre-catalog
# leftover — drop it and fall through to the hardcoded default.
elif cached <= 256_000 and _model_name_suggests_grok_4_3(model):
logger.info(
"Dropping stale Grok-4.3 cache entry %s@%s -> %s (pre-catalog value); "
"re-resolving via hardcoded defaults",
model, base_url, f"{cached:,}",
)
_invalidate_cached_context_length(model, base_url)
# Nous Portal: the portal /v1/models endpoint is authoritative.
# Bypass the persistent cache so step 5b can always reconcile
# against it — this corrects pre-fix entries seeded from the