Commit Graph

1229 Commits

Author SHA1 Message Date
2f523a4691 fix(tui): cgroup-aware V8 heap cap so memory-limited containers stop dying silently (#38541)
The TUI hardcoded --max-old-space-size=8192. V8 is not cgroup-aware, so in a
Docker/k8s container capped below ~9-10GB the heap grows past the container
limit and the cgroup OOM-killer SIGKILLs the Node parent BEFORE V8's own heap
monitor fires. SIGKILL runs no JS handler, writes no [tui-parent] breadcrumb,
and closes the gateway child's stdin — the user sees only a bare gateway
'stdin EOF'. Complements #38224 (trail-text cap), which reduced pressure but
left the 8GB-vs-container mismatch in place.

- _read_cgroup_memory_limit(): read cgroup v2 (memory.max) then v1
  (memory.limit_in_bytes); handle 'max', the v1 unlimited sentinel, blank/zero,
  and >=1PB as unconstrained.
- _resolve_tui_heap_mb(): unconstrained -> 8192; constrained -> 75% of the
  cgroup limit (headroom for non-heap RSS + the Python child sharing the
  cgroup), floored at 1536MB, never above 8192.
- NODE_OPTIONS block uses the sized value; still respects a user-supplied
  --max-old-space-size.

Net: V8 now GCs/exits gracefully (onCritical breadcrumb fires) instead of being
reaped silently. Display/transport only — no agent context or behavior change.

Tests: tests/hermes_cli/test_tui_heap_sizing.py (20 tests).
2026-06-03 16:40:28 -07:00
8a19884bf3 fix(update): stop stash/restore from clobbering desktop source on managed clones (#38542)
The stash/restore cycle in the update path was observed to clobber
freshly-pulled source files (apps/desktop/ deletion -> Vite
'[UNRESOLVED_ENTRY] Cannot resolve entry module index.html'). On a
managed clone the user never edits the source tree, so any 'dirty' state
is pure git artifact (CRLF renormalization, npm lockfile churn, files
left behind when a directory was deleted upstream such as
apps/bootstrap-installer/). Stashing that and re-applying it after a pull
is fragile and unnecessary.

- hermes update (hermes_cli/main.py): on a non-fork (managed) clone,
  discard working-tree dirt via reset --hard HEAD + clean -fd instead of
  stash/apply. Forks keep the stash machinery so intentional edits
  survive. Also pin core.autocrlf=false on Windows so the dirt is never
  created (mirrors install.ps1 #38239).
- install.sh: replace the update-path stash/restore dance with a hard
  reset to origin/<branch>; the installer is a managed-only entry point.
- install.sh + install.ps1 desktop stage: prefer 'npm ci' (wipes and
  reinstalls node_modules from the lockfile) over bare 'npm install',
  which can report 'up to date' against a stale marker while node_modules
  is empty -- leaving tsc unresolved so 'npm run pack' fails.

Tests: managed clone cleans instead of stashing; fork still stashes;
existing stash tests force the stash path explicitly.
2026-06-03 16:40:13 -07:00
26a57467a8 fix(cli): harden hermes portal SystemExit handling + finish model-pick doc sweep
Self-review of #38465 surfaced three real items:

1. SystemExit escape (defense): `_login_nous` raises SystemExit(130)/(1) on
   cancel/failure. The logged-out login path inside `_model_flow_nous` catches
   it, but the expired-session re-login path (main.py) only catches Exception,
   so a Ctrl-C during re-auth could propagate past `_run_portal_one_shot` and
   kill the CLI. Add SystemExit to the portal handler so all cancel/abort cases
   end with the graceful 'Setup cancelled / retry later' message.

2. Doc sweep: the model-pick step was only added to the bare-`hermes portal`
   prose. Propagate it to the surfaces describing `hermes setup --portal`
   behavior that still omitted model selection:
   - `--portal` argparse help (main.py)
   - nous-portal.md intro + the numbered 'what it does' step list (EN + zh-Hans)
   - run-hermes-with-nous-portal.md 'default model after setup --portal' line,
     which was now contradictory (there's a picker, not a forced default) (EN + zh)

3. Test coverage: add parametrized regression test asserting the portal handler
   swallows KeyboardInterrupt / EOFError / SystemExit (returns None, no escape).

Note on 'Skip (keep current)': delegating to _model_flow_nous means picking
Skip preserves the prior provider instead of force-switching to nous — this is
intentional and matches quick setup exactly; docs now say 'sets Nous as your
provider (when you pick a model)' rather than unconditionally.
2026-06-04 02:33:33 +05:30
cd188b814e feat(cli): make hermes portal run the full quick-setup Nous flow (model picker)
`hermes portal` / `hermes setup --portal` previously logged in and set
provider=nous but left the model UNSELECTED (blank -> runtime default) and
never showed a picker — unlike the first-time quick setup, which runs the
model picker.

Route `_run_portal_one_shot` through `_model_flow_nous` — the exact same
routine quick setup (`_run_first_time_quick_setup`) and `hermes model` -> Nous
use. It handles both the logged-out path (device-code OAuth, which picks a
model internally) and the logged-in path (curated Nous model picker), then
offers the Tool Gateway opt-in and sets provider=nous. Net effect: `hermes
portal` now offers a model picker every time and is a true single-command
collapse of quick setup's Nous step.

Removes the hand-rolled auth_add_command + manual provider write + separate
Tool Gateway prompt (now a single source of truth). Re-syncs the in-memory
config from disk afterward so a caller's later save_config can't clobber the
model/provider written by the login flow.

Docs (CLI help, portal_cli docstrings, nous-portal EN + zh-Hans) updated to
mention model selection. New regression test asserts `_run_portal_one_shot`
delegates to `_model_flow_nous`.

Verified live: `hermes portal` now shows the 27-model curated picker, 'Skip
(keep current)' preserves prior provider/model.
2026-06-04 02:20:31 +05:30
da4f407e51 feat(cli): make hermes portal the human-readable Portal onboarding alias
`hermes portal` (no subcommand) now runs the one-shot Nous Portal onboarding
— OAuth login, switch provider to Nous, offer Tool Gateway — identical to
`hermes setup --portal` and the human-readable alias for
`hermes auth add nous --type oauth` (which still works).

The prior status default moves to `hermes portal info`; `status` is kept as a
hidden back-compat alias. `open`/`tools` subcommands are unchanged.

User-facing hints and docs (status.py, conversation_loop 401 guidance,
SystemPage, README, website docs + zh-Hans) now point at `hermes portal` /
`hermes portal info`. `--manual-paste` references keep the explicit auth
command since `hermes portal` does not expose that flag.
2026-06-04 01:19:28 +05:30
1b89715e15 fix(desktop): guard reconnect sockets and keep branch search precise
Avoid stale WebSocket events from an old reconnect attempt flipping the gateway state after a newer socket opens. Also limit session-search dedupe to compression edges so branch-specific hits still open the branch instead of collapsing to the parent.
2026-06-03 13:13:21 -05:00
93228d5299 fix(desktop): persist pins, reconnect after sleep, dedupe session search
Four related desktop session-management bugs:

- Pins lost until refresh: pinned sessions are joined against the
  paginated in-memory session list, so a pinned chat that aged off the
  most-recent page got evicted on the next refresh (every message.complete
  triggers one) and the Pinned section went empty. mergeWorkingSessions ->
  mergeSessionPage now also preserves pinned rows (matched by live id or
  lineage root). Pin id checks in the chat header, command center, and
  delete/archive are normalized to the durable sessionPinId so pins survive
  auto-compression.

- Stuck on "Starting Hermes" after sleep: macOS sleep drops the renderer
  WebSocket; nothing reconnected on wake so the composer stayed disabled.
  The gateway boot hook now auto-reconnects with backoff on close/error and
  on wake signals (powerMonitor resume/unlock-screen IPC, window online,
  visibilitychange). connect() gains an open timeout so a hung reconnect
  can't deadlock in 'connecting'. Composer placeholder distinguishes
  "Reconnecting to Hermes" from a cold start.

- Loses chats from itself: the same hard-replace that dropped pins also
  dropped loaded sessions; mergeSessionPage keeps them.

- Multiple copies/branches in search: /api/sessions/search deduped only by
  raw session_id, so compression segments and branches surfaced as separate
  hits. It now dedupes by lineage root and returns the live compression tip,
  matching the session_search tool's behavior.
2026-06-03 12:39:31 -05:00
df848bd2da test(gateway): cover schtasks locale-safe decoding on Windows
Assert _exec_schtasks passes an explicit encoding and errors="replace" to
subprocess.run, and that _schtasks_encoding falls back to utf-8 when the
locale lookup is empty or raises (#38172).
2026-06-03 09:29:19 -07:00
9666305630 fix(dashboard): clamp PTY resize dimensions for WSL2 winsize garbage (#38200)
* fix(dashboard): clamp PTY resize dimensions for WSL2 winsize garbage

WSL2 reports columns=131072, rows=1 from a broken winsize probe. The
dashboard /chat tab forwards xterm.js dimensions through PtyBridge.resize(),
which packs them as unsigned short via struct.pack. 131072 > 65535 raised
struct.error — uncaught (only OSError was handled) — breaking the resize
path and leaving the TUI laid out for a one-row, absurdly-wide screen, which
surfaces as blank/disappearing text.

Clamp cols/rows to a sane [1, 2000]x[1, 1000] range before packing.
Non-finite/non-integer probes fall back to the minimum so nothing can reach
struct.pack and raise.

* test(dashboard): de-flake pub/events broadcast test

test_pub_broadcasts_to_events_subscribers round-tripped a frame through
two nested Starlette TestClient WebSocket portals within a 10s wall-clock
budget. Under heavy parallel CI load a starved ASGI thread occasionally
blew that budget even though the server logic is correct, producing
intermittent 'broadcast not received within 10s' failures.

Drive _broadcast_event directly under asyncio with fake subscribers
instead. Same fan-out contract (verbatim delivery to every subscriber on
the channel, nothing to other channels), zero scheduling surface. Runs in
~0.3s, deterministic across 10 consecutive runs.
2026-06-03 09:00:16 -07:00
7fb8a6b5c5 feat(dashboard): enrich profiles dashboard and de-dupe channel env vars (#37872)
* feat(desktop): enrich profiles dashboard and de-dupe channel env vars

Add active-profile switching, role descriptions (manual + auto-generate
via the auxiliary LLM), per-profile model selection, and gateway-running
/ distribution badges to the GUI Profiles page. New profile creation
gains clone-all, optional description and model assignment.

Hide messaging-platform credentials (channel_managed) from the Keys/Env
page since the Channels page is the canonical surface for them, and
relabel the trimmed "messaging" category as "Gateway".

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(desktop): address review feedback on profiles/env changes

- ProfilesPage: scope the action-menu outside-click handler to the menu's
  own container via a ref so opening one card's menu no longer leaves
  others open.
- EnvPage: route the "Gateway" label and hint through i18n
  (t.common.gateway / gatewayHint) instead of hard-coded English, with an
  English fallback for untranslated locales.
- web_server: only report description_auto=true when auto-generation
  actually succeeded.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(desktop): address second-round review on profiles

- ProfilesPage: treat describe-auto success by null-checking the
  description and trust the response's description_auto flag instead of
  assuming true; disable the model-editor Save button unless the selected
  choice resolves to a real /api/model/options entry (avoids silent
  no-op saves).
- tests: cover the new profile endpoints (active get/set + 404,
  description round-trip + 404, model round-trip + 400 validation, and
  describe-auto success/failure contracts).

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(desktop): more profiles review fixes (toggles, races, tests)

- ProfilesPage: use the canonical `active` returned by setActiveProfile;
  make the SOUL/description/model action-menu items toggle their editor
  closed when already open; guard description save/auto-describe against
  stale responses via an activeDescRequest ref so a late reply can't
  clobber a different open editor.
- tests: assert /api/env channel_managed classification matches
  _channel_managed_env_keys().

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-03 10:37:36 -04:00
6ee046a72f fix(doctor): detect + repair stale HERMES_MAX_ITERATIONS .env ghost shadowing config.yaml (#38222)
* fix(doctor): detect + repair stale HERMES_MAX_ITERATIONS .env ghost shadowing config.yaml

hermes doctor now flags when ~/.hermes/.env carries a HERMES_MAX_ITERATIONS
value that disagrees with agent.max_turns in config.yaml, and 'hermes doctor
--fix' removes the stale .env line so config.yaml is authoritative. 'hermes
config show' surfaces the same drift inline under Max turns.

The setup wizard stopped dual-writing this value, but users who edited only
config.yaml from a pre-fix install keep a .env ghost. The gateway bridge
normally overrides it at startup, but if the bridge bails on any earlier
config-parse error the ghost silently wins — config says 400 while the
gateway activity line reads N/90.

The detector reads the .env FILE directly (load_env), not get_env_value/
os.environ, since the startup bridge may already have overwritten os.environ
with the config value.

Closes #17534.

* fix(config): stop offering HERMES_MAX_ITERATIONS as an editable env var

Removes HERMES_MAX_ITERATIONS from OPTIONAL_ENV_VARS so the dashboard env
editor (PUT /api/env) and any env-var prompt no longer let a user write it
to .env — which would recreate the stale ghost that shadows config.yaml's
agent.max_turns (issue #17534). The iteration budget is configured only via
config.yaml; the env var stays a read-only backward-compat fallback in the
gateway/CLI, never a promoted write target.

Regression test asserts it is absent from OPTIONAL_ENV_VARS.
2026-06-03 06:38:40 -07:00
0d9b7132ff feat(observability): observer-grade telemetry hooks + NeMo-Relay plugin
Adds backend-neutral observer hooks for plugins: session, turn, API
request, tool, approval, and subagent lifecycle events with stable
correlation IDs (session_id, task_id, turn_id, api_request_id,
tool_call_id, parent/child subagent ids). Extends VALID_HOOKS with
api_request_error and subagent_start.

Hot path is zero-cost when no plugin subscribes: has_hook()/presence
checks gate all payload construction, request payloads are returned
by reference when no middleware rewrites, and the sanitized response
payload no longer embeds raw response objects.

Bundles the optional NeMo-Relay observability plugin
(plugins/observability/nemo_relay) as an in-repo consumer of the new
hooks, peer to the existing langfuse plugin. Fails open when the
optional nemo-relay package is not installed.

Authored-by: Bryan Bednarski <bbednarski@nvidia.com>
Salvaged from #29722 onto current main.
2026-06-03 06:36:46 -07:00
4c544b633d fix(kanban): don't permanently block tasks that hit a provider rate limit (#38223)
A kanban worker that exhausted its retries purely on a provider rate
limit / quota wall (e.g. opencode-go's 5-hour window) exited with code 1.
The dispatcher counted that as a crash, and with DEFAULT_FAILURE_LIMIT=2
two quota-wall hits permanently blocked the card. Fanning out many
workers against one shared quota made this routine.

Now a rate-limited worker exits with EX_TEMPFAIL (75); the dispatcher
classifies that as a 'rate_limited' exit, releases the task back to
'ready' WITHOUT incrementing consecutive_failures (the breaker can't trip
on a transient throttle), and the respawn guard defers the next attempt
on a cooldown (default 5min, HERMES_KANBAN_RATE_LIMIT_COOLDOWN_SECONDS)
until the quota window clears. Genuine crashes still count and trip the
breaker as before. The 120s Retry-After cap is unchanged — no worker
parks for hours holding a slot.

- conversation_loop.py: surface failure_reason in the exhaustion return
- cli.py: kanban worker picks exit 75 on rate_limit/billing failure
- kanban_db.py: rate_limited exit kind, no-count requeue, cooldown guard
2026-06-03 06:19:32 -07:00
c5d199eada feat(dashboard): check-before-update flow on the System page (#38205)
The dashboard's update button ran 'hermes update' immediately with no
preview. Now the System page shows whether an update is available and
asks the user to confirm before applying it.

- New GET /api/hermes/update/check: reports install method, current
  version, and commits-behind (via banner.check_for_updates, 6h-cached;
  ?force=1 busts the cache). Soft-fails to behind=null on network error;
  marks docker/nix/homebrew as can_apply=false with the out-of-band cmd.
- System page: update-status badge on the Hermes version row (latest /
  N behind), a Check-for-updates button, and an Update-now button that
  opens a ConfirmDialog showing the commit count before POST /api/hermes/
  update fires. Cached status loads with the rest of the page.
- Docs + 5 endpoint tests (git/up-to-date/docker/soft-failure + auth gate).
2026-06-03 05:57:15 -07:00
1b302a0474 feat(debug): include desktop.log in hermes debug share / /debug / hermes logs (#38203)
The Electron desktop app writes boot failures, backend spawn output, and
Python tracebacks to HERMES_HOME/logs/desktop.log, but debug-share only
captured agent/errors/gateway — so desktop boot issues never made it into
shared debug reports.

- logs.py: register desktop -> desktop.log (enables 'hermes logs desktop')
- debug.py: capture desktop snapshot, add to summary report, upload full
  desktop.log in 'share', update privacy notice
- gateway /debug inherits the desktop tail via collect_debug_report()
- main.py + docs: help text and log-name table (also adds missing gui row)
- tests: desktop seed in fixture, new report test, three_pastes -> four_pastes
2026-06-03 05:41:35 -07:00
1d90b23982 fix(mcp): banner shows 'disabled' not 'failed' for enabled:false servers (#38204)
get_mcp_status() treated every non-connected server as a failure, so a
server configured with enabled: false rendered as red '— failed' in the
startup banner even though it was intentionally off. Add a 'disabled'
field derived from the enabled flag and render disabled servers dim as
'— disabled' instead.
2026-06-03 05:41:13 -07:00
192020992d fix(cli): exclude desktop-managed backend from stale-dashboard kill
Fixes #37532
2026-06-03 04:59:49 -07:00
e114b31eda test(dashboard): direct unit coverage for internal WS credential + docstring fix
Follow-up to Ben's PR #37892. Adds a TestInternalCredential block to
test_dashboard_auth_ws_tickets.py exercising the mint-once stability,
multi-use, unminted-rejection, empty-value, wrong-value, reset-and-remint,
and ticket-store-independence branches directly (previously only covered
indirectly via _ws_auth_ok, which left the unminted and empty-value
branches unexercised).

Also corrects the consume_internal_credential docstring: the returned
identity dict is discarded by the current _ws_auth_ok caller (which only
needs the boolean outcome), so the prior 'carry it into its session log'
wording over-promised.
2026-06-02 23:43:27 -07:00
Ben
fd1ec8033d fix(dashboard): authenticate server-spawned PTY child WS with a process-internal credential
The embedded-TUI PTY child attaches to two server-internal WebSockets:
/api/ws (its primary JSON-RPC gateway backend) and /api/pub (the event
sidecar). Both URLs are built server-side in web_server.py and handed to
the child via its environment.

In OAuth-gated mode (auth_required=true, every hosted Fly agent), _ws_auth_ok
unconditionally rejects the legacy ?token=<_SESSION_TOKEN> path — a leaked
session token must not grant WS access once the gate is engaged. But
_build_gateway_ws_url() still only emitted ?token=, with no gated-mode
branch (its sibling _build_sidecar_url had been given a ticket branch; the
gateway-url builder was missed). So the TUI child's /api/ws upgrade was
rejected 4401 -> 'gateway websocket connection failed' -> 'gateway startup
timeout', leaving the embedded chat unusable on every gated deployment.

A single-use 30s browser ticket is the wrong shape for this link: the child
reads its attach URL once at startup and reuses it on every reconnect, and
on a slow cold boot it may not dial within the TTL. (_build_sidecar_url's
own docstring already flagged this fragility.)

Fix: add a process-lifetime, multi-use internal credential to
dashboard_auth.ws_tickets (internal_ws_credential / consume_internal_credential),
minted once per process and NEVER injected into the SPA — it only leaves the
process via a spawned child's env, so browser-side XSS can't read it, and a
leak grants no more than a ticket already does. _ws_auth_ok accepts it via
?internal= in gated mode only. Both _build_gateway_ws_url and
_build_sidecar_url now use it, so the child can reconnect both sockets.

Loopback / --insecure behavior is unchanged (still ?token=).

Needs review: touches _ws_auth_ok + dashboard_auth (core auth surface).
2026-06-02 23:43:27 -07:00
dd28f2ac9c fix(dashboard): trust non-web WS origins on OAuth-gated binds after ticket auth (#37870)
Generalises #37747. The WS Origin guard (_ws_host_origin_is_allowed) only
trusted the packaged Electron app's non-web origin (file:// / null / app://)
when the bind was NOT OAuth-gated. The packaged Hermes Desktop renderer loads
over file://, so when it drives a remote OAuth-gated gateway its /api/ws
upgrade was rejected with HTTP 403 even though _ws_auth_ok had already
validated the single-use ?ticket= one line earlier.

This guard runs only AFTER _ws_auth_ok has accepted the WS credential, which
is the real auth boundary in every mode:
  * loopback bind          -> legacy dashboard session token
  * non-loopback --insecure -> legacy session token (Tailscale / LAN, #37747)
  * OAuth-gated public bind -> single-use, 30s-TTL, identity-bound ?ticket=
A non-web origin can only come from a native client; a DNS-rebinding attack
always arrives from an http(s) origin and is still match-checked against the
bound host. So once the upstream credential check has passed, the Origin guard
adds nothing for a non-web origin. Collapsed the loopback/non-gated special
cases to 'return True' for non-web origins.

http(s) origins keep the strict same-host check, so browser DNS-rebinding
defence is unchanged.

Tests: gated file:///null/app:// now asserted ALLOWED; cross-site http(s)
still rejected on gated and loopback binds; #37747's loopback and
non-loopback-insecure cases retained. 37/37 test_dashboard_auth_ws_auth +
test_web_server_host_header pass.
2026-06-03 14:32:53 +10:00
b28dd3417d fix(setup): default browser/TTS picker to free local backend, not paid Nous (#37800)
The Browser Automation and Text-to-Speech provider pickers listed the paid
"Nous Subscription" gateway row first, so on a fresh install the menu cursor
defaulted to index 0 (Nous). Pressing Enter selected it and ran the inline
Nous Portal device-code login — walking users into a paid offering they
never chose.

Reorder both provider lists so the free, no-key local backend is index 0
(Local Browser / Microsoft Edge TTS). Users who already configured Nous are
unaffected: _detect_active_provider_index still resolves their active row
first, so the cursor lands on Nous (now index 1) for them.

Reported by Javier via Kujila.
2026-06-02 19:49:10 -07:00
918aef267b Merge pull request #37782 from NousResearch/bb/configurable-default-interface
feat(cli): configurable default interface (cli vs tui) + --cli flag
2026-06-02 21:16:19 -05:00
d6b0c23f87 feat(cli): configurable default interface (cli vs tui)
Add `display.interface` config key so users can make the modern TUI the
default for bare `hermes` / `hermes chat` without exporting HERMES_TUI=1 in
every shell. Default stays "cli" to preserve current behavior.

Add a `--cli` flag (mirrors `--tui`) so an explicit invocation can force the
classic prompt_toolkit REPL even when `display.interface: tui` is configured.

Precedence (highest first): `--cli` > `--tui`/`HERMES_TUI=1` > config
`display.interface` > classic REPL. Two resolvers enforce it:

  * `_resolve_use_tui(args)` — the args-aware resolver used by `cmd_chat`
    and the Termux fast-TUI path (uses full load_config()).
  * `_wants_tui_early(argv)` — a dependency-free early resolver used by
    mouse-residue suppression and the Termux fast paths, which run before
    argparse / hermes_cli.config are importable (minimal cached YAML read).

Both `--cli` and `--tui` are registered via `_inherited_flag`, so they are
carried across self-relaunch automatically.

- config: add display.interface ("cli" default), bump _config_version 25->26.
  The generic missing-field migration + load_config() deep-merge seed the key
  for existing configs; no bespoke migration block needed.
- docs: document --cli flag and display.interface in cli-commands.md and
  the TUI user guide.
- tests: new test_default_interface_resolution.py covering resolver
  precedence at every layer, early resolver edge cases (missing/garbage
  config), parser flags, and relaunch inheritance.
2026-06-02 20:49:44 -05:00
6ed9a2de8f fix(dashboard): allow desktop websocket origins on remote binds 2026-06-02 18:29:08 -07:00
46e513ef51 fix(desktop): configure Linux Electron sandbox helper
Electron's chrome-sandbox helper must be root:root 4755 on Linux or the
sandboxed renderer aborts before the desktop app starts. The existing
installer only searched for macOS .app bundles, so a successful Linux
build was reported as missing.

Changes:
- Add _desktop_linux_sandbox_fixup() to hermes_cli/main.py, called
  before launching a packaged desktop app on Linux.
- Use lstat() + S_ISREG check to reject symlinks — chown/chmod on a
  symlink target would set SUID on an arbitrary path.
- Update install.sh to recognize Linux unpacked artifacts and configure
  chrome-sandbox with proper error handling (the original PR silently
  ignored chown/chmod failures).
- Add regression tests: normal fixup flow, symlink rejection, and
  already-configured skip path.

Closes #37529 (rebased, merge conflicts resolved, copilot review
feedback addressed).
2026-06-02 20:30:13 -04:00
4a626ed187 fix(tests): add _patch_managed_uv autouse fixture to uv-dependent test files
Production code now uses ensure_uv()/update_managed_uv() from
managed_uv.py instead of shutil.which("uv") directly. Tests that
patched shutil.which to control uv availability no longer controlled
the actual code path, causing CI failures.

Add an autouse _patch_managed_uv fixture to test_update_autostash.py
and test_uv_tool_update.py (matching the existing fixture in
test_cmd_update.py). The fixture makes managed_uv functions delegate
to shutil.which so existing test patches flow through naturally.
2026-06-02 20:29:54 -04:00
4df280d511 refactor(uv): single managed-uv path, delete fts5 installer escalation
Replace the multi-path UV resolution chain (PATH probing, conda guards,
5-location trust ordering, temp-dir fallback installs) with a single
managed uv binary at $HERMES_HOME/bin/uv. Every code path that needs
uv resolves it from that one location; if missing, ensure_uv()
bootstraps it via the official standalone installer.

Key changes:

- New hermes_cli/managed_uv.py: managed_uv_path(), resolve_uv(),
  ensure_uv() (returns (path, freshly_bootstrapped) tuple),
  update_managed_uv(), rebuild_venv(), installer internals.
- hermes_cli/main.py: replace all shutil.which('uv') with ensure_uv(),
  add venv rebuild on first-time managed uv bootstrap, update_managed_uv
  before dep install on all 3 update paths.
- scripts/install.sh: install_uv() always installs to
  $HERMES_HOME/bin/uv; delete ensure_fts5, _python_has_fts5,
  _reinstall_python_with_fts5, _warn_no_fts5 (61 lines).
  Managed uv always installs current Python with FTS5.
- scripts/install.ps1: Install-Uv always installs to
  $HermesHome\bin\uv.exe; Resolve-UvCmd checks managed location first.
- hermes_state.py: simplified FTS5 warning now suggests 'hermes update'
  as the fix instead of blaming install method.
- tests: 15 tests in test_managed_uv.py, autouse _patch_managed_uv
  fixture in test_cmd_update.py.

Closes #37605, Closes #37622
2026-06-02 20:29:54 -04:00
a51a7b9b92 fix(node/nix): consolidate workspace lockfile + update all consumers
Consolidate per-package package-lock.json files into a single root-level
workspace lockfile.  Update all consumers:

- Nix: shared src/npmDeps/npmDepsHash in lib.nix; devshell hook stamps
  package.json paths then runs npm ci from root; individual .nix files
  use mkNpmPassthru attrs instead of per-package fetchNpmDeps.
- Python CLI: new _workspace_root() helper so _tui_need_npm_install,
  _make_tui_argv, _build_web_ui resolve lockfile/node_modules from the
  workspace root.
- Desktop: replace --force-build/mtime heuristic with content-hash build
  stamp (_compute_desktop_content_hash via pathspec).  Remove --force-build
  flag.
- Dockerfile: single root npm install; no per-directory lockfile copies.
- CI: nix-lockfile-fix and osv-scanner reference root package-lock.json;
  apps/dashboard → apps/desktop.
- Tests: new test_tui_npm_install.py; desktop stamp tests in
  test_gui_command.py; updated assertions in test_cmd_update.py,
  test_web_ui_build.py, test_dockerfile_pid1_reaping.py.
- Docs: remove --force-build from desktop flag table.

Deleted: apps/desktop/package-lock.json, ui-tui/package-lock.json,
ui-tui/packages/hermes-ink/package-lock.json, web/package-lock.json.
2026-06-02 20:28:18 -04:00
123b945731 Merge remote-tracking branch 'origin/main' into bb/grok-provider-desktop 2026-06-02 18:41:32 -05:00
cbc82511ea fix(web-server): move event channel state from module globals to app.state (#37683)
Module-level asyncio.Lock() binds to whatever event loop was active at
import time.  When the same web_server module is reused across multiple
TestClient instances (or across uvicorn reloads), the old lock still
references a defunct loop, causing 'attached to a different loop' errors
and flaky subscriber-registration races in CI.

Replace the module-level _event_channels dict + _event_lock with:
  - _lifespan() async context manager that creates both on the running
    event loop during FastAPI startup (guaranteed correct loop binding)
  - _get_event_state() lazy accessor that initialises on app.state when
    TestClient is used without a `with` block (preserves backward compat)

All call sites (_broadcast_event, /api/pub, /api/events) now receive the
app reference and read state via _get_event_state(app) instead of the
module globals.  The test polling loop is updated to check
app.state.event_channels rather than the removed module attribute.
2026-06-02 18:40:12 -05:00
a13db76eaa fix(desktop): signal loopback worker to stop on cancel
Shutting down the callback server stopped the serve thread but left the
worker spinning in _xai_wait_for_callback (which polls callback_result)
until the timeout. Flag callback_result as cancelled on DELETE so the
wait returns promptly and the daemon thread exits — avoids thread
buildup on repeated cancel/retry.
2026-06-02 18:28:24 -05:00
d963ad56c1 fix(desktop): address second Copilot pass on xAI loopback flow
- onboarding: openSignInUrl now falls back to window.open when the desktop
  bridge's openExternal throws/rejects (OS handler missing, user denied),
  not just when the bridge is absent
- web_server: cancelling a loopback session shuts down the 127.0.0.1
  callback server + joins its thread immediately, freeing the port instead
  of holding it until the wait times out (+ regression test)
- web_server: document the new "loopback" flow in the /api/providers/oauth
  enum, the poll-endpoint docstring, and the Phase 2 flow comment block
2026-06-02 18:14:00 -05:00
3be9fb7317 fix(desktop): address Copilot review on xAI loopback flow
- web_server: join the callback-server thread in the start error path so a
  failed discovery/URL build doesn't leave a daemon thread running
- web_server: loopback worker now bails if the session was cancelled while
  waiting for the callback or exchanging the code, instead of persisting
  tokens the user no longer wants (+ regression test)
- onboarding: fall back to window.open when the desktop bridge's
  openExternal is unavailable, so the flow never silently stalls
2026-06-02 17:55:22 -05:00
dd5e97bd7f feat(desktop): make xAI Grok a first-class OAuth provider in the launcher
xAI Grok was only reachable via the "I have an API key" form. xAI's
OAuth (SuperGrok / Premium+) flow already exists in the backend
(`hermes auth add xai-oauth`) but was never surfaced in the desktop
onboarding launcher.

Add a loopback PKCE flow: the local backend binds the 127.0.0.1
callback listener, the client opens the browser, and the redirect lands
back automatically — no code to copy/paste. Reuses the existing xAI
OAuth helpers (discovery, callback server, token exchange, persist)
rather than duplicating them.

- web_server: catalog entry (flow: loopback) + status dispatch +
  _start_xai_loopback_flow + background worker + route branch
- desktop: 'loopback' flow type, awaiting_browser status, xAI Grok card
  (PROVIDER_DISPLAY / FLOW_SUBTITLES / FlowPanel waiting render)
- tests: catalog listing, start authorize-url, worker persist, state
  mismatch rejection
2026-06-02 17:34:00 -05:00
c2050183a5 feat(desktop): content-hash build stamp with --build-only and --force-build flags
Add a SHA-256 content-hash based build stamp to `hermes desktop` so
unchanged source trees skip the npm install + build step. Uses pathspec
for .gitignore-aware file matching instead of a hardcoded skip-list.

New CLI flags:
- --build-only: run the build but don't launch the app
- --force-build: rebuild even when the stamp matches

`hermes update` now calls `hermes desktop --build-only` so the
desktop app is rebuilt (if needed) as part of the update flow.

16/16 tests passing.
2026-06-02 15:45:30 -04:00
bb0619dbce fix(auth): align Codex OAuth persistence paths (#37517)
* fix(desktop): codex OAuth onboarding now resolves on fresh install

The desktop codex device-code worker persisted tokens with a hand-rolled
pool.add_entry(), writing only credential_pool.openai-codex. It never set
active_provider, so on a fresh install the onboarding setup.runtime_check
resolved provider "auto", couldn't detect the Codex OAuth session, and raised
"No inference provider configured" — while setup.status (which sniffs the pool)
reported configured. The disagreement surfaced as the onboarding banner
"Connected, but Hermes still cannot resolve a usable provider."

Use the canonical _save_codex_tokens() instead, matching the CLI's
`hermes auth add openai-codex` path and the Nous/MiniMax dashboard workers.
It writes the providers.openai-codex singleton (setting active_provider) and
syncs the pool.

* fix(auth): align Codex OAuth persistence paths

Ensure desktop and CLI Codex OAuth logins both write the canonical provider state so fresh installs resolve a usable runtime provider.

---------

Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
2026-06-02 12:19:44 -05:00
6d14a24b79 feat(dashboard): nous-blue theme, bulk sessions, schedule picker (#37383)
* feat(dashboard): nous-blue theme, bulk sessions, schedule picker

Batch of related dashboard improvements gathered on
austin/fix/dashboard-changes:

* Nous Blue theme — faithful port of the LENS_5I overlay system onto
  the existing DashboardTheme. Lifts the foreground inversion layer to
  z-index 200 to fix the long-standing hover / loading visual artifact,
  adds an explicit swatchColors slot so the theme picker shows the
  post-inversion preview, and migrates the legacy "lens-5i" theme key
  from localStorage / API to "nous-blue" on first read.
* Theme-aware series colors: new --series-input-token /
  --series-output-token CSS vars consumed by Analytics + Models
  charts; ToolCall + ModelInfoCard switched to semantic
  --color-success for diff lines and the Tools capability badge.
* Analytics + Models headers: consolidate period selector + refresh
  next to the page title and drop the redundant period badge.
* Bulk session management — "Delete empty (N)" button + per-row
  checkboxes with shift-click range select and a bulk-delete action
  bar. Backed by SessionDB.delete_sessions() /
  delete_empty_sessions() plus POST /api/sessions/bulk-delete and
  DELETE /api/sessions/empty (registered before the templated
  /api/sessions/{session_id} family so they don't get shadowed).
  Hard cap of 500 IDs per bulk request. Full pytest coverage.
* Cron page — human-readable schedule picker (every-interval / daily
  / weekly / monthly / once / custom) replaces the raw cron
  expression input; the job list now renders "Weekly on Mon, Wed,
  Fri at 14:30" instead of "30 14 * * 1,3,5". English-only ordinals
  for monthly schedules so non-English locales don't get incorrect
  suffixes.
* example-dashboard plugin moved from plugins/ to tests/fixtures/ so
  stock installs no longer ship the demo. Tests install it
  dynamically via a pytest fixture that also reorders the FastAPI
  routes.
* i18n: 40+ new keys for the bulk-select UI and schedule
  picker/describer translated across all 16 locales.

Co-authored-by: Cursor <cursoragent@cursor.com>

* refactor(dashboard): dedupe memory provider picker

The memory provider <Select> lived on both /system and /plugins,
writing the same config.yaml field through two different endpoints
with no cross-page refresh. Remove the picker from /system in favor
of a read-only status row + link to /plugins, where it pairs with
the context-engine picker under "Plugin providers".

/system retains the destructive admin controls (file sizes, Reset
MEMORY.md / USER.md / all). The api.setMemoryProvider client and
PUT /api/memory/provider backend endpoint are left in place for
CLI / script callers.

Co-authored-by: Cursor <cursoragent@cursor.com>

* docs(dashboard): address Copilot review on PR #37383

- Backdrop layer-stack comment claimed LENS_5I-style themes override
  --component-backdrop-bg-blend-mode to multiply, but our only
  LENS_5I-style theme (nous-blue) keeps the default difference.
  Reword to describe what the code actually does and present the
  var as a forward-looking extension hook.
- /api/sessions/bulk-delete docstring promised the response would
  echo back the list of deleted IDs, but the implementation only
  returns {ok, deleted}. Tighten the docstring to match the wire
  format; the client already knows what it asked to delete, so the
  IDs aren't needed.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(dashboard): address copilot review on cron describe + bulk-select checkbox

- schedule.ts: restrict `describeCronExpression` to strictly 5-field cron
  expressions. The backend `parse_schedule` also accepts the 6-field
  `min hour dom month dow year` form, and humanising those by
  destructuring only the first five fields would silently drop the year
  (e.g. ``0 9 * * * 2099`` rendered as "Daily at 09:00"). 6+ field
  expressions now fall through to the raw-string fallback so the user
  sees what's actually scheduled.

- SessionsPage.tsx (SessionRow): wire the bulk-select Checkbox's
  ``onClick`` directly instead of attaching it to a parent ``<span>``
  with a no-op ``onCheckedChange``. Radix forwards onClick to the
  underlying ``<button role=checkbox>``, so the same handler now drives
  both mouse clicks (preserving shift-key state for range select) and
  keyboard activation (Space on the focused checkbox, which the browser
  synthesises as a click on the <button>). Improves a11y / keyboard UX
  without changing the controlled-selection model.

- SessionsPage.tsx: also extend ``SessionRowProps`` with the new
  ``onRename`` / ``onExport`` props introduced on main so the row's
  destructured prop types resolve after the merge.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-02 12:37:40 -04:00
267e7fd395 Merge branch 'main' of github.com:NousResearch/hermes-agent into bb/desktop-session-list 2026-06-02 09:27:34 -05:00
afea650e16 fix(model-picker): OpenAI shows curated models; OpenRouter no longer phantom-shows (#37404)
The model picker now matches `hermes model` for OpenAI, and OpenRouter
stops appearing as authenticated when only OPENAI_API_KEY is set.

- models.py: provider_model_ids() for the default api.openai.com endpoint
  intersects the live /v1/models dump (120+ entries incl. embeddings,
  whisper, tts, dall-e, moderation, legacy chat) with the curated agentic
  list, preserving curated order. Custom OpenAI-compatible endpoints keep
  the live list verbatim so discovery still works.
- providers.py: drop extra_env_vars=("OPENAI_API_KEY",) from the openrouter
  overlay. list_authenticated_providers reads extra_env_vars to decide
  whether a provider is authenticated, so any OpenAI user saw a phantom
  OpenRouter row. Runtime OpenRouter credential resolution still falls back
  to OPENAI_API_KEY (runtime_provider.py), independent of the overlay.
- Regression tests for both paths.
2026-06-02 06:31:37 -07:00
de8bdf529d fix(desktop): keep pinned + recent sessions visible across compression
Long-running sessions auto-compress: the gateway ends the original session
and surfaces the live continuation under a new id (list_sessions_rich projects
the root forward to its tip). Two symptoms fell out of the id rotation:

- A pinned session "vanished" — the pin is stored as the pre-compression root
  id, but the sidebar only matched on the live id, so it was filtered out.
  Pins now resolve on the durable lineage-root id (`_lineage_root_id`, already
  surfaced by the projection): the sidebar indexes sessions by both ids, pin/
  unpin and reorder operate on the durable id, and `sessionPinId()` is shared
  with the Cmd+P toggle. Existing pins keep working with no migration.

- A freshly-continued session was missing from the list until you ungrouped +
  "load 50 more" — the list paginated by original start time, so an old-but-
  active conversation sat past the first page. The desktop now requests
  `order=recent` (GET /api/sessions gains an `order` param backed by the
  existing recency CTE), surfacing live continuations on the first page.
2026-06-02 07:12:05 -05:00
c10ccaaf51 feat(dashboard-auth): rotate dashboard sessions via refresh token (#37247)
* feat(dashboard-auth): rotate dashboard sessions via refresh token

The dashboard auth-code grant now issues a 24h rotating refresh token
(server side: NousResearch/nous-account-service#293). This wires up the
Hermes client half so an expired access token is transparently refreshed
instead of bouncing the user to /login every 15 minutes.

plugins/dashboard_auth/nous:
- refresh_session() now POSTs grant_type=refresh_token to Portal's token
  endpoint and returns a Session carrying the ROTATED refresh token (was
  an unconditional RefreshExpiredError under the old "no RT in V1"
  contract). The RT is sent in BOTH the request body (Portal's schema
  requires it there) and the X-Refresh-Token header (log redaction) —
  verified against the #293 preview deploy: header-only is rejected as
  invalid_request, body is accepted.
- A 400 from Portal (expired / revoked / reuse-detected) maps to
  RefreshExpiredError so the middleware forces a clean re-login; network
  errors map to ProviderError; empty RT fast-fails without a network call.
- complete_login now captures the initial refresh token Portal returns
  (forward-tolerant: empty string if a deploy omits it).
- Extracted the shared token-response handling into
  _token_response_to_session, parameterised on the 400 exception type so
  the auth-code path raises InvalidCodeError and the refresh path raises
  RefreshExpiredError.
- revoke_session stays a best-effort no-op: Portal exposes no public
  token-endpoint revocation grant (revocation is the authenticated
  /sessions UI, keyed by sessionId+userId), so logout is cookie-clearing
  and the 24h session expires on its own. Documented for a future
  revoke grant.

hermes_cli/dashboard_auth/middleware:
- On an expired/invalid access token the gate now attempts refresh via
  the session's RT BEFORE forcing re-login. On success it serves the
  request and re-sets the rotated cookies on the response (mandatory:
  Portal rotates the RT every refresh and reuse-detects, so a stale RT
  cookie would revoke the whole session on the next refresh). On
  RefreshExpiredError (or no RT) it falls through to clear-and-relogin.
- ProviderError during refresh (Portal unreachable) forces a clean
  re-login rather than 500-ing the request.
- Uses the existing REFRESH_SUCCESS / REFRESH_FAILURE audit events.

Validation:
- 176 dashboard-auth unit/integration tests pass.
- Live E2E against the #293 preview deploy: refresh_session(bad rt) ->
  RefreshExpiredError through the real token endpoint; live JWKS fetch +
  RS256 verification rejects a forged token; empty-RT fast-fail. The
  successful happy-path rotation is covered by unit tests (a live run
  needs an interactive browser OAuth round trip + registered agent:*
  client).

Depends on: NousResearch/nous-account-service#293 (server-side RT issuance).

* fix(dashboard-auth): use Portal's x-nous-refresh-token header name

The refresh-token header must match Portal's REFRESH_TOKEN_HEADER exactly
("x-nous-refresh-token"); the initial cut used "X-Refresh-Token", which
Portal silently ignores (harmless since the RT is also in the body, which
is what the schema requires — but the header redaction was a no-op).
Confirmed against the NAS token route + re-validated live against the
#293 preview deploy.

* fix(dashboard-auth): refresh session when access-token cookie has been evicted

The gated middleware bounced users to /login the instant the access-token
cookie was absent, without ever consulting the refresh token:

    at, _rt = read_session_cookies(request)
    if not at:
        return _unauth_response(...)   # bailed here

This made transparent refresh effectively dead for the common case. The
access-token cookie is set with Max-Age = access_token_expires_in (~15 min),
so a real browser EVICTS hermes_session_at the moment the token lapses while
hermes_session_rt persists (30-day Max-Age). From that point the browser
sends only the refresh-token cookie — and the old guard rejected it before
_attempt_refresh could run. The _attempt_refresh path only fired for a
present-but-invalid access token, which never happens in a browser.

Fix: only hard-bounce when NEITHER cookie is present. A request carrying
just the refresh token now skips verification (no AT to verify) and flows
into the existing refresh path, which rotates both cookies and serves the
request transparently. A dead/expired RT still raises RefreshExpiredError
and falls through to clear-and-relogin.

This failure mode escaped the original tests + manual refresh button because
both kept the access-token cookie present; only a real browser evicting the
cookie at Max-Age exposes it. Added 3 regression tests covering: AT-evicted +
RT-present (transparent refresh), no-cookies (still bounces), and RT-only with
a dead RT (clean 401, no 500).
2026-06-02 21:16:41 +10:00
89db6c8534 Merge pull request #37283 from NousResearch/fix-toolset-provider-selection-display
fix(desktop): reflect active toolset provider in config panel
2026-06-02 04:05:52 -04:00
134643a2fa fix(desktop): reflect active toolset provider in config panel
The toolset config panel highlighted the first keyless provider (e.g.
Nous Portal) on load instead of the provider actually written to config.
The /api/tools/toolsets/{name}/config endpoint never reported which
provider was active, so the GUI's default-expand logic fell back to
"first configured" — and keyless providers are always "configured".

Backend now annotates each provider with is_active (via the same
_is_provider_active helper the CLI 'hermes tools' picker uses) plus a
top-level active_provider summary. The panel prefers that signal before
falling back to first-configured/first.

Adds a frontend regression test (active provider is expanded on load)
and backend coverage (config reports is_active/active_provider; selecting
a provider round-trips into the next config read).
2026-06-02 03:25:46 -04:00
bd8e2ec1a6 feat(dashboard): complete admin panel — MCP catalog, enable/disable toggles, hook creation, system stats (#36736)
* feat(dashboard): MCP catalog + enable/disable, webhook toggle, hook create/delete, system stats

Backend for the comprehensive admin pass:
- MCP: GET /api/mcp/catalog (browse Nous-approved optional-mcps), POST
  /api/mcp/catalog/install, PUT /api/mcp/servers/{name}/enabled
- Webhooks: PUT /api/webhooks/{name}/enabled; gateway rejects disabled routes
  with 403 (hot-reloaded, no restart)
- Hooks: POST/DELETE /api/ops/hooks — create (with consent approval) + remove;
  list now reports accurate allowlist status + valid events
- System: GET /api/system/stats — OS/arch/python/cpu + psutil memory/disk/
  uptime/process, stdlib fallback

All gated by dashboard auth; secrets never returned.

* feat(dashboard): MCP catalog UI, enable/disable toggles, hook create, system stats

- McpPage: catalog section (browse Nous-approved MCPs, one-click install with
  env prompts) + per-server enable/disable toggle with gateway-restart note
- WebhooksPage: per-subscription enable/disable toggle (muted + badge when off)
- SystemPage: new Host stats section (OS/arch/python/cpu/mem/disk/uptime/load),
  shell-hook create modal + delete, 'Create backup' label
- api.ts: client methods + types for catalog, toggles, hook CRUD, system stats

* test(dashboard): cover catalog, toggles, hook CRUD, system stats, webhook toggle

Adds tests for the comprehensive pass: MCP enable/disable + catalog list +
catalog-install-unknown, hook create/delete with consent, system stats shape,
and webhook enable/disable. 26 tests total, all green.

* docs(dashboard): document the comprehensive admin pass + fresh screenshots

Updates the MCP/Webhooks/Pairing/System sections for catalog browse+install,
enable/disable toggles, hook creation, and host system stats; adds the new
endpoints to the API table; replaces the screenshots with live captures of
the rebuilt pages (real data, no dummies) including the hook-create modal.

* feat(dashboard): curator, portal status, and prompt-size/dump/migrate ops

Closes the last in-scope CLI gaps from the coverage audit:
- Curator: GET /api/curator (status), PUT /api/curator/paused, POST
  /api/curator/run (background)
- Portal: GET /api/portal (Nous auth + Tool Gateway routing, read-only)
- Diagnostics: POST /api/ops/prompt-size, /api/ops/dump, /api/ops/config-migrate
  (backgrounded, tailed via action status)

Host-bound commands (secrets/proxy/lsp/acp/computer-use/desktop/completion/
postinstall/uninstall/claw) remain CLI-only by design.

* feat(dashboard): curator + portal + diagnostics UI, tests

- SystemPage: Nous Portal status section (auth + Tool Gateway routing),
  Skill curator card (status + pause/resume + run now), and three new
  Operations buttons (prompt size, support dump, migrate config)
- api.ts: client methods + CuratorStatus/PortalStatus types
- tests: curator pause/resume, portal shape, system-stats shape, + auth-gate
  coverage for the new GET endpoints (31 tests total)

* docs(dashboard): document curator, portal, and diagnostics + refresh System screenshots

Updates the System section for the Nous Portal status, Skill curator
controls, and the new prompt-size/dump/migrate operations; adds them to the
API table; refreshes the System screenshots (now showing Portal + Curator)
and adds a dedicated curator/gateway/memory capture.

* feat(dashboard): session stats/export/prune + skills hub search endpoints

Completes the existing tabs' backend depth (audit vs CLI):
- Sessions: GET /api/sessions/stats (store stats), GET /api/sessions/{id}/export,
  POST /api/sessions/prune. /stats is registered before /{session_id} so the
  literal path isn't captured by the parameterized route.
- Skills: GET /api/skills/hub/search — parallel multi-source hub search (threaded),
  returns installable identifiers
- (rename via PATCH and cron-edit via PUT already existed; now surfaced in UI)

* feat(dashboard): complete existing tabs — sessions mgmt, skills hub browse, cron edit

Audited every existing tab against its CLI command and filled the gaps:
- Sessions: store stats bar, per-row rename + export (JSON download), and a
  prune-old-sessions control (mirrors hermes sessions rename/export/prune/stats)
- Skills: new 'Browse hub' view — search the skill hub across all sources,
  install by identifier with a live install log, and 'Update all' (mirrors
  hermes skills search/install/update)
- Cron: per-job Edit modal (pre-filled) calling updateCronJob (hermes cron edit)
- api.ts: renameSession/getSessionStats/exportSessionUrl/pruneSessions,
  updateCronJob, searchSkillsHub + types

Models tab was already comprehensive (provider+model picker, dynamic per-provider
lists, main + all 11 aux-task assignments, reset) — verified, no change needed.

* test(dashboard): cover session stats/rename/export/prune + skills hub search

Adds the route-shadowing guard for /api/sessions/stats (must not be captured
by /api/sessions/{session_id}), rename/export/prune, and the empty-query
short-circuit for hub search. 36 tests total, all green.

* docs(dashboard): document enhanced Sessions, Skills hub, and Cron edit

Sessions: stats bar, rename, export, prune (+ screenshot). Skills: new Browse
hub view for search/install/update (+ screenshot). Cron: edit action. API
table updated with the new endpoints.
2026-06-02 00:16:11 -04:00
21f55af769 fix(model-picker): stop routing OpenAI selection to OpenRouter (#37175)
The /model picker emitted a standalone slug=openai row (gated on
OPENAI_API_KEY). Selecting it ran resolve_provider_full("openai"),
which resolved the legacy providers.py alias openai->openrouter BEFORE
checking the user's own providers.openai config — silently switching
users onto OpenRouter (HTTP 401 when they have no OR key).

- model_switch.list_authenticated_providers: skip vendor names that are
  aliases to an aggregator (isolates openai->openrouter; copilot/kimi/etc.
  are real providers and unaffected). Kills the phantom picker row.
- providers.resolve_provider_full: user-config providers.<name> now wins
  over the built-in alias table, so providers.openai (api.openai.com)
  beats the alias.
- model_switch PATH A: user-config providers resolve credentials via
  their own endpoint instead of the name-based runtime resolver that
  doesn't know user-config slugs; plus a fail-loud guard for explicit
  unauthed-aggregator hops.

Verified E2E with the reporter's config (no OR key): selecting OpenAI +
gpt-4o-mini now resolves to api.openai.com instead of openrouter.ai.
2026-06-01 20:27:41 -07:00
72e82f88c0 fix(kanban): decompose children inherit root workspace instead of forcing scratch (#37172)
decompose_triage_task hardcoded every fan-out child to workspace_kind
'scratch', ignoring the root task's workspace. A code-gen task created
with a dir:/worktree: workspace would fan out into throwaway scratch tmp
dirs (GC'd on archive), so generated code never landed in the project.

Children now inherit the root's workspace_kind + workspace_path. A child
dict may still override with its own workspace_kind/workspace_path; the
path only carries over when kinds match. Scratch roots are unchanged.
2026-06-01 20:26:57 -07:00
eee32cdd52 fix(gateway): fall back to in-process heartbeat when s6 sleep is missing (#36208) (#37120)
Inside an s6 container, `gateway run` redirects to the supervised
gateway and then keeps the CMD process alive as a no-op heartbeat so
/init doesn't start stage-3 shutdown. That heartbeat is
`os.execvp("sleep", ["sleep", "infinity"])`, which does a PATH lookup
for the `sleep` binary. When PATH was empty/truncated/clobbered at that
point — e.g. after user customizations rewrote PATH, or on a minimal
image without `sleep` on PATH — the exec raised FileNotFoundError,
killing the CMD process and causing /init to tear down every service:
the container failed to start (issue #36208, a regression in the s6
image from 2026.5.28).

Wrap the exec in try/except OSError: on success it still replaces the
process with the cheap `sleep` heartbeat (no resident Python
interpreter, and the existing process-tree/recursion contract is
preserved); on failure it falls back to `_block_until_terminated()` —
a SIGTERM handler (clean 128+signum exit on `docker stop`) plus a
signal.pause() loop, which needs no external binary and so can't fail
on PATH state. A threading.Event().wait() fallback covers platforms
without signal.pause().

Keeping execvp as the primary path (rather than replacing it outright)
preserves the `sleep infinity` heartbeat that the docker integration
tests assert (test_gateway_run_supervised.py) and avoids leaving a
full Python interpreter resident for the container's lifetime.

Verified end-to-end on a built image: with execvp forced to fail,
_block_until_terminated() blocks cleanly instead of raising
FileNotFoundError; normal boot still runs the cheap `sleep infinity`
heartbeat; the 6 test_gateway_run_supervised.py integration tests pass.

Salvages the two community fixes for this issue — the fallback design
from #36221 (@Pluviobyte) and the signal.pause() heartbeat from #36267
(@karmeleon) — and adds regression tests for both the normal and
sleep-missing paths.

Co-authored-by: Pluviobyte <Pluviobyte@users.noreply.github.com>
Co-authored-by: karmeleon <karmeleon@users.noreply.github.com>

Closes #36208.
2026-06-02 11:59:27 +10:00
85b65e29f0 feat(desktop): session hygiene, archive, media streaming + connecting overlay (#37099)
* feat(desktop): session hygiene, archive, media streaming + connecting overlay

Address a batch of desktop feedback:

- Stop leaking empty "Untitled" sessions: the TUI gateway pre-created a DB
  row on every session.create (i.e. every launch/draft). Persist the row
  lazily on first prompt instead, and hide message-less rows in the sidebar.
- Archive/hide sessions: new `archived` column + set_session_archived, web
  API (`?archived=` + PATCH archived), Ctrl/⌘-click and a context-menu item
  in the sidebar, and an "Archived Chats" settings panel to restore/delete.
- Videos load via a streaming `hermes-media://` protocol instead of capped,
  in-memory data URLs (16 MB limit) — bypasses the cap and supports seeking.
- Background-process completions route to the session that launched them:
  the completion event now carries session_key and each poller only consumes
  its own.
- Sidebar: "Group by workspace" toggle is always visible; each workspace
  group gets a "+" to start a session in that directory; "New agent"/"Agents"
  relabeled to "New session"/"Sessions".
- New gateway connecting overlay (ascii decode → fade out) replacing the bare
  skeleton/"starting gateway" state.

* fix(desktop): bail connecting overlay on boot error

The shownRef latch kept the connecting overlay mounted behind
BootFailureOverlay after a hard boot failure. Return null on boot.error
so the failure recovery surface fully owns the screen.

* fix(desktop): address Copilot review

- /api/sessions: validate `archived` (400 on unknown) and return `archived`
  as a JSON boolean instead of SQLite's 0/1.
- PATCH /api/sessions/{id}: 400 (not a misleading 404) when the body has no
  updatable fields; stop conflating a no-op with "not found".
- hermes-media protocol: drop `bypassCSP` — streaming only needs
  secure/standard/stream/supportFetchAPI.
- Sidebar workspace header: split the toggle and the "+" into sibling buttons
  so we no longer nest interactive elements inside a <button>.

* fix(desktop): address Copilot re-review

- hermes-media protocol: restrict streaming to an audio/video extension
  allowlist (415 otherwise) so it can't be used to read arbitrary local files.
- Connecting overlay: use z-[1200] instead of the non-standard z-1200 utility.

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-06-01 20:41:34 -05:00
0fdab53ef0 feat(cli): ranked fuzzy search in the curses model picker
Wires the salvaged search helpers into the shared curses menu driver and
turns on type-to-filter for the CLI model pickers (the 100+ model lists
that previously required scrolling).

- Search lives in the shared `_run_curses_menu` driver behind a
  `searchable` flag + `search_labels`, so both `curses_radiolist` and
  `curses_single_select` get it without per-menu duplication. `/` opens
  the filter, BACKSPACE edits, Ctrl+U clears, ESC clears the filter then
  cancels. Returned values are always original item indices.
- `_filter_indices` RANKS matches (best-first) via a Python port of the
  TS scorer in ui-tui/src/lib/fuzzy.ts and web/src/lib/fuzzy.ts. The port
  is byte-identical in score: same per-char bonuses, prefix (+8) and
  exact (+20) bonuses, camelCase/word-boundary detection (matching on the
  lowercased target, boundary on the original case), and the -len*0.01
  length tiebreak — so the CLI, TUI, and WebUI rank results identically.
  A cross-language parity test pins the exact scores.
- `_prompt_model_selection` (the canonical picker across the model flows)
  and the custom-provider model list pass `searchable=True`.
- Split `_decode_menu_key` out of `read_menu_key` so the search loop can
  peek the raw key (catch `/`) before nav decoding.
- ESC during active search now clears the query (restores the full list)
  so a no-match filter can't strand the user; printable-key capture is
  restricted to ASCII to avoid Latin-1 mojibake.
- Update two setup-menu tests whose mock signatures predate the new
  `searchable` kwarg; add ranked-scorer + parity + state-machine tests.
2026-06-01 16:58:58 -07:00
53f598e7a2 feat(cli): add fuzzy search helpers for curses pickers
Pure, refactor-independent helpers for type-to-filter search in the
curses single-/radio-select menus: subsequence matching, filtered-index
mapping, cursor reconciliation, scroll clamping, and an active-search
key handler, plus unit tests.

Salvaged from #22758 (the curses event loop was since refactored into a
shared driver on main, so the integration is rebuilt in a follow-up
commit; these pure helpers and their tests carry over unchanged).
2026-06-01 16:58:58 -07:00