fix(dashboard): sanction plugin WS/upload auth via SDK helpers (gated mode)

Dashboard plugins (kanban, hermes-achievements) previously read
window.__HERMES_SESSION_TOKEN__ directly and hand-assembled WebSocket
URLs with ?token=. That works in loopback/--insecure mode but is
rejected on OAuth-gated deployments, where the session token is absent
and _ws_auth_ok only accepts single-use ?ticket= auth. The result was
401s on plugin REST calls and 1008/403 on the kanban live-events WS
whenever the dashboard ran behind OAuth (e.g. hosted Fly agents).

Make the plugin SDK the single sanctioned auth surface:

- web/src/lib/api.ts: add authedFetch() (raw Response for FormData
  uploads / blob downloads, token-or-cookie auth, no throw / no 401
  redirect) and buildWsUrl() (assembles a ws(s):// URL with the correct
  auth param for the active mode — fresh single-use ticket in gated
  mode, token in loopback).
- web/src/plugins/registry.ts: expose authedFetch, buildWsUrl,
  buildWsAuthParam, and sdkVersion on window.__HERMES_PLUGIN_SDK__;
  add SDK_CONTRACT_VERSION.
- web/src/plugins/sdk.d.ts: hand-authored typed contract for the
  plugin SDK + registry globals (single source of truth for the
  Window declarations).
- plugins/kanban + hermes-achievements dist bundles: stop reading the
  session token directly; route uploads/downloads through
  SDK.authedFetch and the live-events WS through SDK.buildWsUrl.
- plugins/kanban plugin_api.py: _ws_upgrade_authorized() delegates the
  /events WS upgrade to the canonical web_server._ws_auth_ok gate, so
  it transparently accepts loopback token / gated ticket / internal
  credential and can never drift from core auth again.
- tests: guard test asserting no plugin dist reads
  __HERMES_SESSION_TOKEN__ directly; kanban gated-ticket WS test.
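
For illustration only, the mode switch that buildWsUrl() performs could
be sketched roughly as below. This is not the code in web/src/lib/api.ts:
the function name buildWsUrlSketch, the env object and its fields
(location, sessionToken, basePath, mintTicket) are hypothetical stand-ins
for whatever the real helper reads from the page and the ticket endpoint.

```javascript
// Hypothetical sketch of the buildWsUrl() behaviour described above.
// All inputs arrive via `env` so the example runs outside a browser;
// none of these field names are taken from the real SDK.
async function buildWsUrlSketch(path, params, env) {
  const proto = env.location.protocol === "https:" ? "wss:" : "ws:";
  const qs = new URLSearchParams(params || {});
  if (env.sessionToken) {
    // Loopback / --insecure: the injected session token rides as ?token=.
    qs.set("token", env.sessionToken);
  } else {
    // Gated OAuth: no token is injected, so mint a fresh single-use ticket
    // for this connection attempt (assumed to be an async call).
    qs.set("ticket", await env.mintTicket());
  }
  // Assumed: the dashboard base path is a plain prefix such as "/dash".
  return `${proto}//${env.location.host}${env.basePath || ""}${path}?${qs}`;
}
```

Because the ticket is single-use, a caller following this shape would
build a new URL on every reconnect rather than caching the first one.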

Verified live on a gated staging Fly agent: kanban /events upgrades
101 with a minted ticket (ticket_len=43, ws_auth_ok=True) where the
old code got 403.
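
The guard test listed above boils down to a text scan over the plugin
dist bundles. A minimal sketch of that check, assuming the bundles are
already loaded as strings (the helper name findDirectTokenReads and the
bundle-map shape are hypothetical, not the actual test code):

```javascript
// Hypothetical helper: given { bundleName: sourceText }, return the names
// of bundles that still reference the session-token global directly.
function findDirectTokenReads(bundles) {
  const needle = "__HERMES_SESSION_TOKEN__";
  return Object.entries(bundles)
    .filter(([, source]) => source.includes(needle))
    .map(([name]) => name);
}
```

A guard in this style would fail the suite whenever the returned list is
non-empty, which catches a plugin that reintroduces the direct read.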
Ben
2026-06-04 09:17:59 +10:00
committed by Teknium
parent 1c88360fed
commit a6e47314f9
8 changed files with 501 additions and 95 deletions


@ -48,22 +48,16 @@
return tier ? "ha-tier-" + tier.toLowerCase() : "ha-tier-pending";
};
async function api(path, options) {
function api(path, options) {
// Delegate to the host SDK's fetchJSON so auth is handled correctly in
// BOTH dashboard modes: loopback (X-Hermes-Session-Token header) and
// gated OAuth (hermes_session_at cookie via credentials:'include').
// Hand-rolling fetch + reading window.__HERMES_SESSION_TOKEN__ directly
// 401s in gated mode (the token isn't injected there). fetchJSON throws
// Error("<status>: <body>") on non-2xx — the call sites' .catch() relies
// on that to surface errors, so we let it propagate (don't swallow).
const url = "/api/plugins/hermes-achievements" + path;
const token = window.__HERMES_SESSION_TOKEN__ || "";
const headers = { ...((options && options.headers) || {}) };
if (token) headers["X-Hermes-Session-Token"] = token;
const res = await fetch(url, { ...(options || {}), headers });
if (!res.ok) {
const text = await res.text().catch(function () { return res.statusText; });
throw new Error(res.status + ": " + text);
}
const text = await res.text();
try {
return JSON.parse(text);
} catch (_) {
return null;
}
return SDK.fetchJSON(url, options);
}
function AchievementIcon({ icon }) {


@ -588,52 +588,62 @@
wsClosedRef.current = false;
function openWs() {
if (wsClosedRef.current) return;
const token = window.__HERMES_SESSION_TOKEN__ || "";
const proto = window.location.protocol === "https:" ? "wss:" : "ws:";
const qsParams = {
since: String(cursorRef.current || 0),
token: token,
};
// Build the WS URL via the host SDK so the correct auth param is used
// in BOTH modes: single-use ?ticket= in gated OAuth mode, ?token= in
// loopback. Reading window.__HERMES_SESSION_TOKEN__ directly (the old
// path) sends an empty token and is rejected in gated mode. buildWsUrl
// also applies the dashboard base-path prefix for reverse-proxied
// deployments, which the old inline URL did not. It's async (gated
// mode mints a fresh ticket per connect), so resolve then open.
const wsParams = { since: String(cursorRef.current || 0) };
// Pin the WS stream to the currently-selected board so events
// from other boards don't bleed in. Includes "default" so the
// dashboard's own board pin always wins over the server-side
// ``current`` file — same rationale as ``withBoard()`` above.
// Regression: #20879.
if (board) qsParams.board = board;
const qs = new URLSearchParams(qsParams);
const url = `${proto}//${window.location.host}${API}/events?${qs}`;
let ws;
try { ws = new WebSocket(url); } catch (_e) { return; }
wsRef.current = ws;
ws.onopen = function () { wsBackoffRef.current = 1000; };
ws.onmessage = function (ev) {
try {
const msg = JSON.parse(ev.data);
if (msg && Array.isArray(msg.events) && msg.events.length > 0) {
cursorRef.current = msg.cursor || cursorRef.current;
// Stamp per-task signal so the TaskDrawer can reload itself.
setTaskEventTick(function (prev) {
const next = Object.assign({}, prev);
for (const e of msg.events) {
if (e && e.task_id) next[e.task_id] = (next[e.task_id] || 0) + 1;
}
return next;
});
scheduleReload();
}
} catch (_e) { /* ignore */ }
};
ws.onclose = function (ev) {
if (board) wsParams.board = board;
SDK.buildWsUrl(`${API}/events`, wsParams).then(function (url) {
if (wsClosedRef.current) return;
let ws;
try { ws = new WebSocket(url); } catch (_e) { return; }
wsRef.current = ws;
ws.onopen = function () { wsBackoffRef.current = 1000; };
ws.onmessage = function (ev) {
try {
const msg = JSON.parse(ev.data);
if (msg && Array.isArray(msg.events) && msg.events.length > 0) {
cursorRef.current = msg.cursor || cursorRef.current;
// Stamp per-task signal so the TaskDrawer can reload itself.
setTaskEventTick(function (prev) {
const next = Object.assign({}, prev);
for (const e of msg.events) {
if (e && e.task_id) next[e.task_id] = (next[e.task_id] || 0) + 1;
}
return next;
});
scheduleReload();
}
} catch (_e) { /* ignore */ }
};
ws.onclose = function (ev) {
if (wsClosedRef.current) return;
if (ev && ev.code === 1008) {
setError(tx(t, "wsAuthFailed",
"WebSocket auth failed — reload the page to refresh the session token."));
return;
}
const delay = Math.min(wsBackoffRef.current, 30000);
wsBackoffRef.current = Math.min(wsBackoffRef.current * 2, 30000);
setTimeout(openWs, delay);
};
}).catch(function () {
// Ticket mint / URL build failed (e.g. session expired). Back off
// and retry; a hard auth failure surfaces via the 1008 close path.
if (wsClosedRef.current) return;
if (ev && ev.code === 1008) {
setError(tx(t, "wsAuthFailed",
"WebSocket auth failed — reload the page to refresh the session token."));
return;
}
const delay = Math.min(wsBackoffRef.current, 30000);
wsBackoffRef.current = Math.min(wsBackoffRef.current * 2, 30000);
setTimeout(openWs, delay);
};
});
}
openWs();
return function () {
@ -2837,8 +2847,6 @@
if (!files.length) return;
setUploadBusy(true);
setUploadErr(null);
const token = window.__HERMES_SESSION_TOKEN__ || "";
const headers = token ? { Authorization: "Bearer " + token } : {};
const url = withBoard(`${API}/tasks/${encodeURIComponent(props.taskId)}/attachments`, boardSlug);
// Upload sequentially so a partial failure leaves a clear state.
let chain = Promise.resolve();
@ -2846,7 +2854,11 @@
chain = chain.then(function () {
const fd = new FormData();
fd.append("file", f, f.name);
return fetch(url, { method: "POST", headers: headers, credentials: "same-origin", body: fd })
// SDK.authedFetch handles auth in BOTH modes (loopback token header /
// gated cookie) and applies the dashboard base-path prefix. The old
// hand-rolled Authorization:Bearer + credentials:'same-origin' sent
// an empty token and 401'd in gated mode.
return SDK.authedFetch(url, { method: "POST", body: fd })
.then(function (resp) {
if (!resp.ok) {
return resp.text().then(function (txt) {
@ -3073,15 +3085,16 @@
const fileRef = useRef(null);
const [dlErr, setDlErr] = useState(null);
// Download via authenticated fetch → blob → synthetic anchor click.
// A plain <a href> can't carry the session header/bearer the dashboard
// auth middleware requires in loopback mode, so fetch with the token
// and hand the browser a blob URL instead.
// A plain <a href> can't carry the auth the dashboard middleware requires,
// so fetch authenticated and hand the browser a blob URL instead.
function downloadAttachment(a) {
const token = window.__HERMES_SESSION_TOKEN__ || "";
const headers = token ? { Authorization: "Bearer " + token } : {};
// SDK.authedFetch handles auth in BOTH modes (loopback token header /
// gated cookie) and applies the dashboard base-path prefix. The old
// hand-rolled Authorization:Bearer + credentials:'same-origin' sent an
// empty token and 401'd in gated mode.
const url = withBoard(`${API}/attachments/${a.id}`, props.boardSlug);
setDlErr(null);
fetch(url, { headers: headers, credentials: "same-origin" })
SDK.authedFetch(url)
.then(function (resp) {
if (!resp.ok) {
return resp.text().then(function (txt) {


@ -36,7 +36,6 @@ the port.
from __future__ import annotations
import asyncio
import hmac
import json
import logging
import os
@ -63,15 +62,29 @@ router = APIRouter()
# existing plugin-bypass; this is documented above).
# ---------------------------------------------------------------------------
def _check_ws_token(provided: Optional[str]) -> bool:
"""Constant-time compare against the dashboard session token.
def _ws_upgrade_authorized(ws: "WebSocket") -> bool:
"""Authorize a WebSocket upgrade by delegating to the dashboard's canonical
WS auth gate (``hermes_cli.web_server._ws_auth_ok``).
Delegating (rather than re-implementing a ``_SESSION_TOKEN``-only check)
means this endpoint transparently accepts whatever the core gate accepts
in each mode:
* loopback / ``--insecure``: legacy ``?token=<_SESSION_TOKEN>``
* gated OAuth: single-use ``?ticket=`` (the browser SDK's
``buildWsUrl`` mints one per connect)
* server-internal: the process-lifetime ``?internal=`` credential
The previous bespoke check only understood ``_SESSION_TOKEN``, so the
kanban live-events WS was rejected on every OAuth-gated deployment even
though the rest of the dashboard worked. Routing through the shared gate
also means this can never drift from core auth again.
Imported lazily so the plugin still loads in test contexts where the
dashboard web_server module isn't importable (e.g. the bare-FastAPI
test harness).
dashboard ``web_server`` module isn't importable (e.g. the bare-FastAPI
test harness); there we accept so the tail loop stays testable, matching
the prior behaviour.
"""
if not provided:
return False
try:
from hermes_cli import web_server as _ws
except Exception:
@ -79,10 +92,7 @@ def _check_ws_token(provided: Optional[str]) -> bool:
# testable; in production the dashboard module always imports
# cleanly because it's the caller.
return True
expected = getattr(_ws, "_SESSION_TOKEN", None)
if not expected:
return True
return hmac.compare_digest(str(provided), str(expected))
return bool(_ws._ws_auth_ok(ws))
def _resolve_board(board: Optional[str]) -> Optional[str]:
@ -2375,11 +2385,12 @@ def set_orchestration_settings(payload: OrchestrationSettingsBody):
@router.websocket("/events")
async def stream_events(ws: WebSocket):
# Enforce the dashboard session token as a query param — browsers can't
# set Authorization on a WS upgrade. This matches how the PTY bridge
# authenticates in hermes_cli/web_server.py.
token = ws.query_params.get("token")
if not _check_ws_token(token):
# Authorize the upgrade via the dashboard's canonical WS gate so the
# correct credential is accepted in every mode (loopback token / gated
# single-use ticket / server-internal credential). Browsers can't set
# Authorization on a WS upgrade, so the credential rides in the query
# string — the browser SDK's buildWsUrl() assembles it.
if not _ws_upgrade_authorized(ws):
await ws.close(code=http_status.WS_1008_POLICY_VIOLATION)
return
await ws.accept()