Eleven new tests pinning the #29344 fix. Layout mirrors the existing
"Fix D" entitlement section so the bad-credentials disambiguator
sits alongside the entitlement-block tests it complements.
Classifier-level coverage:
* ``test_is_entitlement_failure_false_for_bad_credentials_wke_suffix``
— verbatim shape from the reporter's wire capture
(``{code: 'caller does not have permission', error: 'OAuth2 access
token could not be validated. [WKE=unauthenticated:bad-credentials]'}``)
↦ classifier must return False so the refresh path runs.
* ``test_is_entitlement_failure_false_for_wke_suffix_in_normalized_shape``
— same body after ``_extract_api_error_context`` has rewritten it
to ``{reason, message}``. The disambiguator must fire in BOTH
shapes; without this guard the production call site at
``_recover_with_credential_pool`` (which goes through the
normalised extractor) would still misclassify.
* ``test_is_entitlement_failure_false_for_any_wke_unauthenticated_variant``
— parametrised forward-compat: ``bad-credentials``,
``expired-token``, ``revoked``, ``some-future-reason``. xAI
documents the prefix as stable, the suffix after the colon as a
reason code that can grow; every variant under
``unauthenticated:`` must route to refresh.
* ``test_is_entitlement_failure_false_via_oauth2_validation_phrase_alone``
— belt-and-braces guard: if a future API revision drops the WKE
suffix but keeps "OAuth2 access token could not be validated", we
still classify correctly.
* ``test_is_entitlement_failure_wke_signal_overrides_entitlement_keywords``
— defensive: if a body ever carries BOTH the WKE suffix and
entitlement language, the WKE signal wins. Auth is recoverable;
entitlement isn't, and a refreshed token will resurface the
entitlement message on the next request.
* ``test_is_entitlement_failure_case_insensitive_wke_match`` —
pins that the classifier lowercases the haystack so a future xAI
build that uppercases the prefix doesn't reintroduce the bug.
Recovery-path coverage (end-to-end through
``_recover_with_credential_pool``):
* ``test_recover_with_credential_pool_refreshes_on_xai_bad_credentials_403``
— the headline test the reporter requested: a bad-credentials 403
with the exact wire body must call ``try_refresh_current()``
exactly once and ``_swap_credential`` once. Pre-fix this returned
``(False, _)`` because the entitlement classifier over-matched and
short-circuited the refresh path.
* ``test_recover_with_credential_pool_still_blocks_real_entitlement``
— companion regression guard for #26847: a pure unsubscribed-
account body (no WKE suffix, no OAuth2-validation phrase) must
still surface as entitlement and skip refresh. The new
disambiguator must not weaken the original loop-protection it
was added to preserve.
The scaffolding reuses ``_make_codex_agent``, ``_FakePool``, and the
existing ``MagicMock`` patterns from the surrounding tests so the
new section reads as a natural extension of "Fix D" rather than a
separate test file.