96cd37e212e819994c0e8f7135476986b14fa57c
241 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
| 96cd37e212 |
fix(dashboard): reap orphaned embedded-chat sessions to stop slash_worker leak
Since #38591 made the dashboard's embedded chat unconditional, every browser refresh of /chat spins up a fresh session.create (new sid + a fresh _SlashWorker via _deferred_build) over /api/ws, but the old tab's WS disconnect only DETACHES the transport (ws.py) — it never closes the old session or its slash_worker. The dashboard's in-process gateway is long-lived, so the detached _SlashWorker subprocess's stdin pipe stays open forever and the worker never reaches EOF: one leaked python process per refresh. Fix at the session-lifecycle layer (not PTY signal timing — verified that a process whose owning gateway dies is always reaped via stdin-EOF; the leak is specifically the long-lived dashboard process keeping detached sessions parked). On WS disconnect, schedule a grace-delayed reap of any session left orphaned (transport detached to stdio, not mid-turn). A quick reconnect / session.resume / prompt.submit rebinds a live transport and cancels the reap, preserving the intentional detach-for-reconnect window. - server.py: extract _teardown_session() (shared with session.close), add _ws_session_is_orphaned() + _schedule_ws_orphan_reap(), gated by HERMES_TUI_WS_ORPHAN_REAP_GRACE_S (default 20s, 0 disables). - ws.py: schedule the reap for each detached session on disconnect. - tests: reap-closes-worker, spares-reattached/mid-turn/finalized, disabled-when-grace-zero. |
|||
| 5bcb63e400 |
fix(tui): add thread-safety locks for _sessions and prompt dicts
C1: Add _sessions_lock to protect all compound mutations and iterations
on the global _sessions dict across 5+ concurrent execution contexts
(main dispatcher, pool workers, daemon threads, notification poller,
atexit handler).
C2: Add _prompt_lock to protect _pending/_pending_prompt_payloads/_answers
dicts from races between _block() (agent callback thread) and
_respond() (pool worker). Lock scope is kept tight — _block() only
holds the lock during registration/cleanup, releasing it before
_emit() and ev.wait() to avoid blocking other prompts for 300s.
All 187 existing TUI tests pass with no regressions.
|
|||
| e7a7872a87 |
fix(tui_gateway): dedup re-queued process notifications flooding TUI
_ notification_poller_loop_ re-emits status.update every cycle when a background process completes while the session is busy. The same completion event gets re-queued and re-emitted to the TUI every few ms, flooding the transcript with duplicate lines. Add _notification_event_dedup_key(evt) that returns a tuple identity for each notification event. Only emit status.update on first sight per identity: - completions: (sid, type) — one-shot per process session - watch_match: (sid, type, command, pattern, output, ...) - watch_overflow/disabled: (sid, type, command, message, ...) The dedup key design was refined from an initial sid:type approach after @lordbuffcloud identified that distinct watch_match events (READY vs DONE) for the same process would be incorrectly collapsed. Tests from @tymrtn cover distinct watch matches, exact replay dedup, and completion one-shot behavior. Co-authored-by: tymrtn <ty@tmrtn.com> |
|||
| a3fb48b2ce |
fix(state): keep /branch sessions visible after parent reopen
/branch (aka /fork) sessions vanished from /resume and /sessions. Both surfaces funnel through list_sessions_rich(include_children=False), which hid any session with a parent_session_id unless identified as a branch via a heuristic — parent.end_reason == 'branched' AND child.started_at >= parent.ended_at. Two ways that heuristic failed: 1. CLI/gateway branches: once the parent was reopened (e.g. resumed) and re-ended with a different end_reason (tui_shutdown overwriting 'branched'), the heuristic stopped matching and the branch was hidden permanently. 2. TUI branches (tui_gateway session.branch): the TUI never ends the parent as 'branched' — it creates the child while the parent is still live — so the heuristic NEVER matched and TUI branches were hidden from the moment they were created (this is the macOS desktop app's primary symptom). Fix: persist a stable '_branched_from' marker in the branch session's model_config at creation time across ALL THREE branch paths (CLI cli.py, gateway gateway/run.py, and TUI tui_gateway/server.py), and OR a json_extract(model_config, '$._branched_from') IS NOT NULL check into the list_sessions_rich filter. The marker is immutable across the parent's lifecycle, so the branch stays visible regardless of how/whether the parent is ended. The legacy end_reason heuristic is kept (OR'd) so pre-existing branches remain visible. Subagent/compression children (no marker, parent not 'branched') stay correctly hidden. Fixes #20856. Approach by liuhao1024 (PR #20864); reimplemented on current main, extended to the TUI branch path (which the original missed), with regression tests for the reopen+re-end scenario and the TUI marker persistence. |
|||
| 8077e7d2fb |
fix(tui): narrow resume lock to avoid blocking session.close
The salvaged fix held _session_resume_lock across _make_agent (MCP discovery + AIAgent construction, seconds), serializing it against session.close. Since session.close runs on the main RPC dispatch thread (not a _LONG_HANDLER), a close racing a mid-build resume would stall all fast-path RPCs (approval.respond, session.interrupt). Restructure to double-checked locking: build the agent outside the lock, then re-check _find_live_session_by_key under the lock before _init_session. A losing concurrent resume discards its just-built agent (no worker/poller wired yet) and reuses the winner. Updated the concurrent-resume regression test to assert the real invariant (one surviving live session + loser agent closed) rather than the implementation detail of a single _make_agent call. |
|||
| bd6d098762 | fix(tui): keep resumed live history current | |||
| 98903d0313 | fix(tui): reuse live session on resume | |||
| f66a929a6b |
fix(desktop): render approval/sudo/secret prompts so tools stop silently timing out (#38578)
* fix(desktop): render approval/sudo/secret prompts so tools stop silently timing out The desktop app's gateway event handler (use-message-stream.ts) handled clarify.request but had no case for approval.request, sudo.request, or secret.request. When a tool needed approval, the gateway emitted approval.request and blocked the agent thread in _await_gateway_decision() for up to 5 min (approvals.gateway_timeout); the desktop dropped the unknown event, never showed a dialog, then the agent returned BLOCKED. No prompt, just a stall then a block. The Ink TUI already handles all three (createGatewayEventHandler.ts); this brings the Electron app to parity. - store/prompts.ts: approval/sudo/secret atoms (+ request-id-guarded clears) - components/prompt-overlays.tsx: Radix dialogs; close/Esc maps to refusal so silence is never mistaken for consent (parity with TUI Esc->deny) - use-message-stream.ts: wire the three *.request cases; clearAllPrompts on message.complete so an overlay can't outlive its turn - chat-messages.ts: GatewayEventPayload gains command/description/env_var/prompt - mount PromptOverlays in the chat shell * feat(desktop): inline tool-call approval bar (Cursor-style "Run") Render dangerous-command / execute_code approval inline on the pending tool row instead of as a modal. Binding is positional: the desktop tool.start payload carries no structured args, but approval.request only fires from the terminal/execute_code guards and the agent blocks on one approval at a time, so the single pending row of those tools is the one that raised it. Command/description text comes from $approvalRequest. Drops ApprovalDialog from PromptOverlays (sudo/secret stay modal). * style(desktop): make inline approval bar match Cursor's command card Drop the amber alert styling for a neutral elevated card: command on a terminal-prefixed row up top, a divided footer with the muted description on the left and right-aligned controls — a ghost "Reject" (Esc) plus a split primary "Run" (⌘⏎) whose chevron opens "Allow this session" / "Always allow" / "Reject". Wire ⌘/Ctrl+Enter → Run and Esc → Reject to match Cursor's accept/skip bindings, guarded against double-send via the $approvalRequest atom. * style(desktop): shrink inline approval to a tiny Cursor-style button strip The running tool row already shows the command, so drop the whole card + command echo + description band. What's left is a compact strip under the row: a small split "Run ⌘⏎" button (chevron → Allow this session / Always allow / Reject) and a ghost "Reject Esc", indented to sit under the row's title text. * style(desktop): drop the loud blue Run button for a quiet outlined control Swap the primary (blue) Run for a subtle outlined split control — neutral border, transparent fill, hover-accent — so the approval strip reads as quiet inline affordance rather than a big CTA. Reject stays ghost. * style(desktop): make Run a soft primary badge Tint the Run split control with the primary color as a badge (bg-primary/10, primary text, primary/25 border, rounded-md, hover primary/15) instead of a solid CTA or a neutral outline. * style(desktop): slim the approval chevron and space out Reject The chevron button had ballooned because dropping the size prop fell back to the big default size (h-9 + has-svg px-3). Pin size=xs everywhere and give the chevron a tight w-5/px-0. Bump the gap between the Run badge and Reject (gap-2.5) and loosen Reject's internal spacing. * feat(desktop): confirm before "Always allow" persists an approval "Always allow" writes the matched pattern to ~/.hermes/config.yaml and suppresses the prompt in every future session — too consequential to fire straight from a menu click. Route it through a confirm dialog that names the pattern + command and the file it touches. The dialog owns the keyboard while open so Esc closes it instead of denying the approval. * fix(gateway): make sudo + secret prompts actually fire in the desktop Tek's PR added the sudo/secret overlays and callback wiring, but neither reached the live path: - Sudo: the sudo password callback is thread-local (terminal_tool _callback_tls), and _wire_callbacks runs on the agent-build thread, not the turn thread that executes tools. At command time the callback was missing, so terminal sudo fell through to /dev/tty and hung the headless gateway. Re-wire callbacks at the top of the prompt-submit turn thread. - Secret: skills_tool short-circuited to the "secret entry unsupported" hint for any gateway surface, before invoking the callback. Interactive surfaces (desktop/TUI) register a secret-capture callback that routes to the secret.request overlay; only short-circuit when no callback exists, so messaging still gets the hint but the desktop prompts. * docs(desktop): drop Cursor references from approval comments * docs(desktop): drop Cursor reference from prompt-overlays comment * fix(skills): gate in-band secret capture on HERMES_INTERACTIVE, not callback presence The desktop/sudo PR switched the gateway secret-capture short-circuit from "any gateway surface" to "gateway surface with no callback registered". That made a messaging gateway (telegram/discord/...) attempt interactive in-band secret capture whenever any callback happened to be registered, instead of returning the safe "setup unsupported" hint — and broke test_gateway_still_loads_skill_but_returns_setup_guidance. Discriminate on HERMES_INTERACTIVE instead: the desktop app / TUI set it in _enable_gateway_prompts (alongside registering the secret.request callback), while messaging platforms never do. This is the same flag tools/approval.py uses to tell an interactive surface from a messaging one, so messaging keeps the hint and desktop/TUI still prompt. --------- Co-authored-by: Brooklyn Nicholson <brooklyn.bb.nicholson@gmail.com> |
|||
| 1927ff217e |
Merge pull request #38517 from NousResearch/bb/desktop-yolo-statusbar-toggle
feat(desktop): YOLO toggle in the status bar (per-session, TUI parity) |
|||
| e02a6038a4 |
fix(tui): save TUI /save snapshots under Hermes home with system prompt (#38251)
* fix(tui): save TUI /save snapshots under Hermes home with system prompt The TUI gateway's session.save RPC wrote hermes_conversation_<ts>.json to the workspace/project CWD via os.path.abspath(...) and only exported model and messages. This diverged from the classic CLI /save (which writes under the Hermes profile home) and from the dashboard save (which includes the system prompt). Write the snapshot under get_hermes_home()/sessions/saved/ and include system_prompt, session_id, and session_start so the TUI export matches the CLI and dashboard behavior. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(tui): prefer agent.session_start for /save export; assert it in test Address review feedback: derive session_start from the agent's session_start datetime (matching the classic CLI export) and fall back to the gateway session's created_at only when unavailable. Assert session_start in the regression test. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com> |
|||
| e76d8bf5aa |
fix(tui): stop persisting full tool output in trail lines (silent OOM death)
A heavy --tui session (browser snapshots, large tool outputs) silently OOM-killed the Node parent within minutes — closing the gateway child's stdin, which the user saw only as a bare "gateway exited" / stdin EOF. CLI was immune. Root cause: each completed tool's verbose trail line embedded up to 16KB of result_text, persisted in transcript Msg.tools[] for the whole session and rendered EXPANDED by default, so an Ink render-node tree was built for every one of up to 800 messages at once. That tree blew past Node's heap at a few hundred MB — far below the 2.5GB memory-monitor exit threshold, so the death was never even attributed. - text.ts: persisted verbose tool-trail blocks now cap to a small preview (VERBOSE_TRAIL_MAX_CHARS=800/12 lines), not the 16KB live-render budget. Retained trail strings drop ~17x (12.2MB -> 0.7MB at 800 msgs); the live streaming tail still uses the larger LIVE_RENDER budget. - tui_gateway/server.py: lower the gateway-side verbose text cap to match (1KB/16 lines) so we stop shipping output the TUI no longer renders. - memoryMonitor.ts: derive critical/high thresholds from the real V8 heap ceiling (~88%/70%) instead of the hardcoded 2.5GB that killed the process at 31% of an 8GB ceiling; add a one-shot onWarn early-warning on fast sub-threshold heap growth so the next such death is diagnosable, not silent. - entry.tsx: wire onWarn to a crash-log breadcrumb + stderr line. Full tool output is unchanged in the agent context and SQLite session — this is display/transport only, no behavior or context change. Fixes #34095. Related #27282. Tests: ui-tui text + new memoryMonitor suites (33 pass), python verbose-cap guard (5 pass); full ui-tui suite shows no new failures vs pristine main. E2E repro confirms the retention drop. |
|||
| ea4fe15631 |
feat(desktop): inline model picker in the status bar
Replace the status-bar model chip's modal with a Cursor-style dropdown: - providers grouped by name in a stable order (no recency reshuffle on select) - per-model hover-Edit submenu for reasoning effort + fast, gated by per-model capabilities now surfaced in the model.options payload - unified Fast toggle: flips the speed=fast param where supported, else swaps to the model's `-fast` variant (base and variant collapse into one row) - localStorage-backed "Edit Models" dialog to choose which models appear Adds reusable dropdown primitives (DropdownMenuSearch, shared row/label tokens, portaled + collision-aware submenus) and reads session state from nanostores rather than prop-drilling, so editing options doesn't rebuild and close the menu. |
|||
| 31c40c72c0 |
fix(desktop): stabilize project folder sessions (#37586)
* fix(desktop): stabilize project folder sessions Keep desktop folder selection aligned with new sessions and scope TUI gateway cwd through session context so prompts and tools resolve against the selected workspace. * fix(desktop): address review feedback on folder sessions Snapshot sessions before iterating to avoid concurrent-mutation crashes, optional-chain the revealLogs catch, and read console-message args from the correct Electron event/messageDetails positions. * fix(desktop): address second review pass on folder sessions Sync the remembered workspace key with the cwd atom (clear on empty), only load tree children for real directory nodes, and throttle renderer auto-reloads so a deterministic startup crash can't loop forever. * fix(desktop): inherit parent workspace for ephemeral agent tasks Background and preview tasks use ephemeral ids absent from the session map, so pass the parent session cwd into the session context explicitly instead of clearing it back to the gateway launch dir. Also correct the set_session_vars docstring about clear_session_vars semantics. * fix(desktop): validate preview cwd before pinning session context A non-empty but non-existent client cwd would pin an unusable override and silently fall back to the launch dir. Validate once, reuse for both the session context and the terminal override, and fall back to the parent session workspace when invalid. * fix(desktop): harden preview cwd normalization and adopt normalized cwd Guard preview cwd normalization against malformed client paths so a bad input can't fail the whole restart, and adopt the backend's normalized config.get cwd in the no-active-session path so the persisted workspace stays consistent with what the agent uses. |
|||
| 8977bf282e |
Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> |
|||
| 135c65093a |
feat(desktop): stable in-workspace ordering + No-workspace default
- Sidebar: rows within a workspace group now sort by creation time instead of last activity, so they stop reshuffling every time a message lands (muscle memory). Groups still float up by recency. - Sessions only persist a workspace cwd when one was explicitly chosen; an auto-detected launch directory is no longer stamped on the row, so untargeted sessions group under "No workspace" instead of "desktop". The agent still runs in the detected directory. |
|||
| 85b65e29f0 |
feat(desktop): session hygiene, archive, media streaming + connecting overlay (#37099)
* feat(desktop): session hygiene, archive, media streaming + connecting overlay
Address a batch of desktop feedback:
- Stop leaking empty "Untitled" sessions: the TUI gateway pre-created a DB
row on every session.create (i.e. every launch/draft). Persist the row
lazily on first prompt instead, and hide message-less rows in the sidebar.
- Archive/hide sessions: new `archived` column + set_session_archived, web
API (`?archived=` + PATCH archived), Ctrl/⌘-click and a context-menu item
in the sidebar, and an "Archived Chats" settings panel to restore/delete.
- Videos load via a streaming `hermes-media://` protocol instead of capped,
in-memory data URLs (16 MB limit) — bypasses the cap and supports seeking.
- Background-process completions route to the session that launched them:
the completion event now carries session_key and each poller only consumes
its own.
- Sidebar: "Group by workspace" toggle is always visible; each workspace
group gets a "+" to start a session in that directory; "New agent"/"Agents"
relabeled to "New session"/"Sessions".
- New gateway connecting overlay (ascii decode → fade out) replacing the bare
skeleton/"starting gateway" state.
* fix(desktop): bail connecting overlay on boot error
The shownRef latch kept the connecting overlay mounted behind
BootFailureOverlay after a hard boot failure. Return null on boot.error
so the failure recovery surface fully owns the screen.
* fix(desktop): address Copilot review
- /api/sessions: validate `archived` (400 on unknown) and return `archived`
as a JSON boolean instead of SQLite's 0/1.
- PATCH /api/sessions/{id}: 400 (not a misleading 404) when the body has no
updatable fields; stop conflating a no-op with "not found".
- hermes-media protocol: drop `bypassCSP` — streaming only needs
secure/standard/stream/supportFetchAPI.
- Sidebar workspace header: split the toggle and the "+" into sibling buttons
so we no longer nest interactive elements inside a <button>.
* fix(desktop): address Copilot re-review
- hermes-media protocol: restrict streaming to an audio/video extension
allowlist (415 otherwise) so it can't be used to read arbitrary local files.
- Connecting overlay: use z-[1200] instead of the non-standard z-1200 utility.
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
|
|||
| 3f7d1c801d |
feat(undo): /undo [N] backs up N user turns with prefill + soft-delete
Extends the existing /undo command from a single in-memory exchange removal into a full rewind: back up N user turns (default 1), soft-delete the truncated rows in SessionDB (active=0, kept for audit, hidden from re-prompts and search), notify memory providers, and prefill the composer with the backed-up message text for editing — CLI and TUI. Reuses the SessionDB rewind primitives, the on_session_switch(rewound=True) memory hook, and the TUI command.dispatch prefill payload from SaguaroDev's #21910 work, wired to /undo [N] instead of a separate /rewind picker. - cli.py: undo_last(n, prefill) — in-memory truncate + SQLite soft-delete + agent surgery (system-prompt invalidate, flush-index reset) + memory notify + editable buffer prefill; /undo dispatch parses optional count; checkpoint-rollback caller passes prefill=False - tui_gateway/server.py: command.dispatch undo branch (was rewind) parses count, picks Nth-from-last user turn, clamps to oldest - commands.py: /undo gains [N] args_hint - tests: rename + expand TUI suite (multi-turn, clamp, invalid-count) - release.py: AUTHOR_MAP entry for SaguaroDev Co-authored-by: SaguaroDev <74339271+SaguaroDev@users.noreply.github.com> |
|||
| 243e836dce |
feat(tui): wire /rewind through command.dispatch + prefill payload (#21910)
Adds the TUI half of the /rewind feature so the Ink terminal UI gets the same affordance as the prompt_toolkit CLI. Python side (tui_gateway/server.py): - /rewind added to _PENDING_INPUT_COMMANDS so slash.exec rejects it and the TUI falls through to command.dispatch (the only path with access to live session state + memory hooks). - New command.dispatch branch for name == "rewind": v1 auto-picks the most recent user turn (Claude-Code-style single- step undo), calls SessionDB.rewind_to_message, refreshes the in-memory history, fires _memory_manager.on_session_switch with rewound=True, and returns the new "prefill" payload. - A dedicated picker overlay (multi-step rewind) is tracked as a follow-up to #21910. TS side (ui-tui/src/): - New "prefill" variant on CommandDispatchResponse + asCommandDispatch validator. Mirrors "send" but does NOT auto-submit; the client drops the message into the composer for editing. - createSlashHandler renders the optional notice via sys() and calls ctx.composer.setInput(d.message), letting the user edit-and-resubmit the rewound turn — the core UX promised by the issue. Tests: - 7 new tui_gateway tests covering prefill payload shape, in-memory history truncation, DB soft-delete, memory-provider notification (rewound=True), busy-session refusal, missing-session error, and registry placement in _PENDING_INPUT_COMMANDS. - Extended asCommandDispatch vitest covering the new prefill variant (with + without notice, and rejection of malformed payloads). Out of scope for v1 (tracked as #21910 follow-up): - Dedicated picker overlay in Ink (the multi-step rewind UI). v1 auto- picks the most recent user turn, matching the most common case. - Gateway platforms (Telegram, Discord, etc.) — issue scopes v1 to CLI + TUI only. |
|||
| 77bb64813c |
fix(desktop): report desktop_contract in lazy session.create info (#36112)
The lazy session.create path hand-builds a partial info dict that omitted desktop_contract. The desktop GUI reads a missing contract as undefined and treats it as an out-of-date backend, so it surfaced a "Backend out of date" toast on every launch even against a current backend. Carry the contract in the lazy payload like _session_info already does for resume/branch. |
|||
| 51c68d4ab1 |
Add Hermes desktop app (#20059)
* feat: better composer etc * docs: add desktop and dashboard run instructions * fix(desktop): address security scan findings * fix(dashboard): resolve @nous-research/ui path under npm workspaces The sync-assets prebuild step shelled out to 'cp -r node_modules/@nous-research/ui/dist/fonts ...' with a path relative to apps/dashboard/. That works only when the dep is installed locally in the dashboard workspace, but 'npm install' at the repo root (the documented setup — see apps/desktop/README.md) hoists shared deps to the root node_modules under npm workspaces. The relative cp then fails with 'No such file or directory', sync-assets exits 1, the Vite build aborts, and 'hermes dashboard' surfaces a generic 'Web UI build failed' message. Replace the shell one-liner with scripts/sync-assets.cjs, which walks up from the dashboard directory looking for node_modules/ @nous-research/ui — working in both the hoisted (workspaces) and co-located (standalone) layouts. Also guards against a missing dist/fonts or dist/assets with a clearer error pointing at a rebuild of the UI package rather than silently copying nothing. * feat(desktop): support connecting to a remote Hermes backend Add HERMES_DESKTOP_REMOTE_URL and HERMES_DESKTOP_REMOTE_TOKEN env vars that, when set, short-circuit the local-child spawn in startHermes() and connect the Electron renderer to an already- running 'hermes dashboard' server reachable over the network. Motivating use case: WSL2 users who want to run the Hermes core (agent loop, tools, filesystem access) inside their WSL distribution while rendering the Electron GUI on native Windows. Before this change, the desktop app always spawned a local Python child on the same host as the renderer, which doesn't cross the WSL/Windows boundary. The remote path reuses waitForHermes() as a liveness probe (/api/status is in the backend's public endpoint allowlist), so the connection is only returned once the backend is actually ready. WebSocket URL derivation picks ws:// or wss:// based on the input scheme. URL validation rejects non-http(s) schemes and requires both env vars together to avoid a half-configured connection that would silently fall through to the spawn path. No behaviour change when the env vars are unset — the default local-spawn flow is untouched. Typical usage: # in WSL2 hermes dashboard --tui --no-open --host 0.0.0.0 --port 9119 --insecure # on Windows set HERMES_DESKTOP_REMOTE_URL=http://localhost:9119 set HERMES_DESKTOP_REMOTE_TOKEN=<session token> set HERMES_DESKTOP_IGNORE_EXISTING=1 (launch Hermes desktop) * ci(desktop): automate desktop releases Add GitHub Actions release channels for signed desktop installers and document the stable/nightly download paths. * feat: file tabs * refactor(desktop): tighten right-rail tab close API Promote closeRightRailTab/closeActiveRightRailTab as the single public entry point. Drops the activeTabRef + handleCloseDocument indirection in ChatPreviewRail, the unused $rightRailHasContent atom, and the legacy dismissFilePreviewTarget alias. -70 LOC. * feat(desktop): polish composer pill toward reference look Solid foreground-on-background send/voice-conversation circle (black-on-white in light, white-on-black in dark) anchors the right edge as the primary CTA instead of the orange theme primary. Bumps the primary control to 2.125rem so it visually outranks the ghost mic/plus controls. Opens up the surface padding (0.625rem x / 0.5rem y) so the input row breathes around its controls, and nudges the corner radius from 20 to 24px for a slightly pill-ier silhouette. LiquidGlass distortion is preserved. * feat(desktop): add startup and onboarding flow Add phase-based desktop boot progress, fresh-install sandbox testing, and first-run provider credential onboarding so packaged installs can start cleanly without manual settings detours. * fix(desktop): gate prompts on provider setup Show the desktop provider onboarding flow before prompt submission when no inference provider is configured, preventing fresh installs from falling through to backend credential errors. * fix(desktop): surface provider onboarding from session warnings Propagate credential warnings through session runtime info and open desktop onboarding whenever a session reports no usable provider, so unconfigured installs cannot fall through to prompt errors. * fix(desktop): route gateway provider errors to onboarding The "No inference provider configured" auth error reaches the renderer through gateway error events, not the prompt.submit promise; the previous patch only caught the latter, so the error toast still surfaced and onboarding never opened. Also strip credential-shaped env vars from the test:desktop:fresh sandbox so the packaged backend can't see provider keys leaking from the launching shell. * fix(desktop): use strict runtime check to drive onboarding setup.status returned True whenever any provider auth state was discoverable, including indirect fallbacks like a gh-CLI Copilot token. That made desktop think the user was set up while the agent's actual resolve_runtime_provider call still raised AuthError, leaving the user with a useless toast and no onboarding. Add a setup.runtime_check gateway method that runs the same resolver the agent uses on session creation, and switch the desktop onboarding overlay and prompt precheck to use it. * feat(desktop): OAuth-first onboarding using existing dashboard provider API Replace the engineer-flavored API key form with a Sign-in-first onboarding overlay that uses the dashboard's existing /api/providers/oauth catalog and PKCE/device-code endpoints (Anthropic, Nous, OpenAI Codex, etc.). API key entry is now a fallback tab with friendly provider names instead of env var prefixes, and the loud raw resolver error is gone in favor of a one-line welcome message. * fix(desktop): polish onboarding provider list Reorder OAuth providers so Nous Portal is first, give the segmented Sign in / API key control equal column widths, and replace the engineer-flavored backend names like "Anthropic (Claude API)" / "MiniMax (OAuth)" with friendlier in-app titles. External-CLI providers now show a softer subtitle and an external-link icon instead of a chevron. * refactor(desktop): split onboarding overlay into store + view Move the OAuth state machine, runtime check, copy-to-clipboard, and api-key save into store/onboarding.ts (matching the boot.ts pattern), leaving the overlay as a presentation layer that subscribes via useStore. Tabs are now table-driven, child panels read flow from the store instead of prop-drilling, and the polling/PKCE/error/success branches share a small Status atom. * fix(desktop): external CLI providers + center mode tabs External-CLI providers (Claude Code, Qwen Code) now open an in-overlay panel with the CLI command, copy button, and an "I've signed in" recheck instead of firing an invisible toast. Center the Sign in / API key tab control so it sits under the heading instead of hugging the left edge. * fix(desktop): drop onboarding tabs for an inline link, group device-code waiting state Replace the Sign in / API key tab pair with an "I have an API key" footer link under the OAuth provider list, with a "Back to sign in" affordance inside the API key form. Group the device-code "Waiting for you to authorize..." status next to the Cancel button so the alignment matches the action. * refactor(desktop): tighten onboarding store + overlay Drop the dead isOnboardingBusy/BUSY set, factor the catch-fallback dance into safeReq, and share a single reloadAndConnect helper between PKCE submit, device-code success, external recheck, and api-key save. In the overlay, extract Step / CodeBlock / FlowFooter / CancelBtn / DocsLink atoms so the four sign-in panels share the same chrome instead of repeating it inline. Net effect: fewer literal divs, one place to touch the spacing, and the code-block + footer rows are reusable across future flows. * fix(desktop): mount onboarding from frame 1 to kill the FOUT Default onboarding.configured to null (unknown until the runtime check resolves) and have the onboarding overlay render whenever it's not yet confirmed true. The boot overlay now yields to it, so the very first paint is the Welcome card with a "While we get you set up..." progress strip instead of a flash of the chat shell between boot dismiss and onboarding mount. The picker swaps in cleanly once the gateway opens and the runtime check confirms the user is not configured. Already-configured users see the same prep card briefly while their existing runtime warms up, then the overlay dismisses without touching the chat shell. * fix(desktop): top-align empty sessions placeholder The "Start a chat to build your history." empty state used a min-h-35 grid place-items-center container, which floated the text in a tall dead zone. Render it as a flat paragraph that sits right under the section header like the empty pinned state does. * refactor(desktop): drop dead boot overlay Onboarding overlay subsumes the boot card now that it mounts from frame 1 and renders boot progress inline. The standalone DesktopBootOverlay is unreachable in every flow (yields whenever onboarding has not confirmed configured, dismisses once it has). * fix(desktop): hide pinned/recents sections until first session A fresh sidebar showed the Pinned and Recent chats headers with floating empty-state copy underneath. Drop both sections (and the now-orphan SidebarEmptySessionState) when there are no sessions yet — they reappear after the first chat. Skeletons during initial load are unchanged. * feat(gui): route embedded TUI through dashboard gateway (#21979) Inject HERMES_TUI_GATEWAY_URL into dashboard PTY sessions so embedded ui-tui instances attach to the in-process websocket gateway, with coverage for the new env wiring. * Add desktop remote gateway settings Make the desktop gateway connection configurable from settings so local remains the default while remote backends can be saved, tested, and applied without environment variables. * feat(gui): first-class Messaging page + gateway menu redesign - Add Messaging page to the desktop app with per-platform setup, status, and inline guidance. Catalog derives from gateway.config Platform enum + plugin registry, so every messaging adapter the CLI supports (Telegram, Discord, Slack, Mattermost, Matrix, WhatsApp, Signal, BlueBubbles, Home Assistant, Email, SMS, DingTalk, Feishu, WeCom, Weixin, QQ, Yuanbao, API server, Webhooks, plugins) shows up without per-platform code. - New REST endpoints: GET /api/messaging/platforms, PUT and POST /test on the same path. Secrets go through the existing .env pipeline; enable/disable writes config.yaml. - Replace gateway statusbar dropdown with a richer panel: status row, icon-only restart + system-panel actions, recent activity (with timestamps trimmed in display, full text on hover), platform list. - Auto-poll the messaging page every 6s (paused when hidden) so status updates without a manual check. - Drop Settings / Command Center from the sidebar nav (still reachable via shortcuts and the titlebar cog). - Flatten top corners on Messaging/Skills/Artifacts/Chat panes. - Share new StatusDot component across messaging + gateway menu. - Fix gateway/config.py so an explicit platforms.<name>.enabled=false in config.yaml is honored when env tokens are present. - pb-9 on the chat content area for breathing room above the composer. * Potential fix for pull request finding 'CodeQL / Clear-text logging of sensitive information' Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * pin electron version * hide application menu on non-mac systems * interpret compactPreview for non-string vlaues as JSON or an empty string * fix(desktop): keep composer contenteditable mounted across stacked toggle The composer rendered {input} inside two different parent fragments depending on `stacked`. When auto-expand flipped `stacked` (e.g. the moment typed text wrapped past two lines), React reconciled the two branches as different positions and unmounted/remounted the contenteditable. The fresh mount started empty, so any in-flight characters — most reliably reproduced by holding a key — were lost. Replace the conditional with a single CSS Grid whose template-areas swap on `stacked`. The three children (menu, input, controls) keep stable identities across the toggle; only their grid placement changes, which the browser handles without React tearing down the editor. * refactor(desktop): align install layout with install.ps1 / install.sh Make the desktop app's runtime layout match what scripts/install.ps1 and scripts/install.sh produce, so a desktop-only user and a CLI-only user end up with the same files in the same places and can share one install. Layout - ACTIVE_HERMES_ROOT = HERMES_HOME/hermes-agent (was: process.resourcesPath/hermes-agent, read-only) - VENV_ROOT = HERMES_HOME/hermes-agent/venv (was: userData/hermes-runtime) - desktop.log = HERMES_HOME/logs/desktop.log (was: userData/desktop.log) - HERMES_HOME default: %LOCALAPPDATA%\hermes on Windows, ~/.hermes elsewhere The packaged .app/.exe still ships a read-only payload at process.resourcesPath/hermes-agent (FACTORY_HERMES_ROOT). On first launch or after an installer-driven upgrade we sync factory -> active, then provision the venv and run pip install -e . against the active root. Key behaviors - Pin HERMES_HOME in the spawned Python's env so get_hermes_home() resolves to the same path resolveHermesHome() picked. Without this, Python falls back to ~/.hermes on every platform - fine on mac/linux, a split-state bug on Windows where our default is %LOCALAPPDATA%\hermes. - Detect developer installs by .git presence at ACTIVE; never overwrite a user's checkout via factory sync. - Marker at ACTIVE/.hermes-desktop-runtime.json (schema v4) tracks pyproject hash + factory version + runtime schema version. depsFresh fast-paths when nothing changed. - Dev (npm run dev) prefers SOURCE_REPO_ROOT over ACTIVE so devs run their local edits, not whatever's under HERMES_HOME. - Better error messages distinguish "no payload" from "no Python". - Preserve a legacy ~/.hermes on Windows when no %LOCALAPPDATA%\hermes exists, so users with prior pip/manual installs aren't orphaned. pyproject.toml - Promote fastapi, uvicorn[standard], ptyprocess (non-Windows), and pywinpty (Windows) to main dependencies. The dashboard backend (hermes dashboard) needs them at runtime; the previous lazy-import fallback was a footgun for fresh installs. - Empty the [pty] optional-extra; kept as a no-op back-compat alias for any existing pip install hermes-agent[pty] invocations. Drops the hardcoded BUNDLED_RUNTIME_REQUIREMENTS list in main.cjs - the desktop now installs whatever pyproject.toml says, single source of truth. Files - apps/desktop/electron/main.cjs: runtime layout, HERMES_HOME pin, factory->active sync, marker v4 - apps/desktop/scripts/test-desktop.mjs: track new venv location - apps/desktop/README.md: new Setup, Runtime Bootstrap, and Debugging sections - pyproject.toml: fastapi/uvicorn/pty backends in main dependencies; [pty] extra emptied Tested locally on Windows: npm run dev boots cleanly, sessions land at the new location, type-check + lint + test:desktop:platforms all pass. Verified end-to-end on a fresh Win11 VM via dist:win installer. Known gaps (filed as follow-ups, not in this PR): - Skills not seeded on packaged installs (sync_skills only runs in cmd_chat, not cmd_dashboard). Need to move to shared pre-dispatch. - Git Bash not bundled or detected; agent's terminal tool errors out with a useful message but desktop bootstrapper should pre-flight it. - install.ps1 / install.sh should be decomposed into composable phase libraries so the desktop bootstrapper can reuse them as a single source of truth across all install surfaces. * feat(desktop): theme polish, prose chat typography, composer chrome - DS tokens/midground, Backdrop, scoped scrollbars, typography plugin + prose - Composer liquid/radius utilities, thread font parity, tool/thinking cues - File tree label scale, preview flex, thread retry loading + streaming tests * feat(desktop): NSIS prereq detection page + auto-install via winget The packaged Windows installer now detects Python 3.11+ and Git for Windows at install time and offers to install missing prereqs via winget. Mirrors the prereq logic scripts/install.ps1 already runs for CLI installs, so desktop installer users get the same out-of-the-box experience as install.ps1 users. Why - Hermes' terminal tool calls bash.exe directly (tools/environments/ local.py); on Windows that's Git Bash from Git for Windows. Without it, the agent fails on the first terminal() call. - Hermes' Python runtime needs 3.11+. Without it, the desktop bootstrapper errors out at venv creation. - Both gaps surfaced on a fresh Windows 11 VM smoke test: VM had Python pre-installed but no Git, so the agent's first terminal call failed with "Git Bash isn't installed." - install.ps1 has had Install-Git + Install-Uv functions for ages. The desktop installer was the asymmetric outlier. How — NSIS prereq page - New file: apps/desktop/installer/prereq-check.nsh (plugged into electron-builder via build.nsis.include) - Real Wizard page using nsDialogs, inserted via customPageAfterChangeDir hook (between the Directory page and InstFiles). - Group boxes for Python and Git, each showing detection status. - Pre-checked install checkboxes when winget is available. - Auto-skips silently if both prereqs are already installed. - Falls back to manual download URLs when winget itself is missing. - Detection: - Python: probes `py -3.11`/`-3.12`/`-3.13`/`-3.14` via the Python launcher. Microsoft Store "Python stub" (no py.exe) is correctly classified as not-installed. - Git: `where git`. - winget: `where winget` (Win10 1809+ / Win11 with App Installer). - Install execution (in customInstall macro): - Python: nsExec::ExecToLog with `--scope user --silent`. Per-user install, no UAC prompt, output streams to install log. - Git: ExecShellWait via Windows ShellExecute. Critical because Git always installs per-machine and triggers UAC; ShellExecute preserves the foreground focus chain across non-elevated → elevated process spawns, so UAC actually comes to the foreground. nsExec::ExecToLog breaks the chain because winget runs hidden. - Both pass `--disable-interactivity --accept-package-agreements --accept-source-agreements` to suppress winget's own dialogs. - Verification: probes Git's standard install locations via FileExists rather than `where git`. NSIS's process inherits PATH at startup, so a freshly-installed Git won't be visible to `where` until restart. - Silent installs (/S) skip the prompts; managed deploys handle prereqs out-of-band via Group Policy / Intune. How — Electron-side safety net - New findGitBash() in main.cjs, parallel to findSystemPython(). Probes the same locations as tools/environments/local.py:_find_bash() so a positive result here means the agent's terminal tool will work. - ensureRuntime now throws a clear, actionable error on Windows when Git Bash isn't found, matching the existing "Python 3.11+ is required" error path. - Catches users the NSIS page doesn't: .msi installer users (NSIS prereq page doesn't run for MSI), `npm run dev` users, manual installers, anyone who unchecked the install boxes on the NSIS prereq page. - All gated on `IS_WINDOWS`; macOS / Linux unaffected. NSIS build issue (resolved) - electron-builder defaults to `-WX` (warnings as errors). NSIS optimizer emits "warning 6010: function not referenced" for our page functions because Page custom directives don't count as references in its static-analysis pass. The functions ARE called at runtime when NSIS invokes the page; the optimizer just can't see it statically. - Set `build.nsis.warningsAsErrors=false` in package.json so this spurious warning doesn't fail the build. (Documented option from electron-builder's nsisOptions.) Out of scope (filed for future work) - MSI prereq detection: Windows Installer custom actions are a different mechanism. Enterprise deploys typically handle prereqs via GP/Intune. - Bundle PortableGit + python-build-standalone in extraResources for zero-network installs. ~80MB increase. - Mac / Linux GUI prereq flows (different installer formats; Xcode CLT covers most macOS prereqs already; Linux is per-distro hard). Files - apps/desktop/installer/prereq-check.nsh (new, ~290 lines NSIS) - apps/desktop/package.json (build.nsis.include + warningsAsErrors) - apps/desktop/electron/main.cjs (findGitBash + preflight) - apps/desktop/README.md (Runtime prerequisites section) Cross-platform impact - macOS / Linux builds (dist:mac, dist:mac:dmg, dist:mac:zip): nsis config is ignored entirely; .nsh is dormant. - npm run dev: .nsh dormant; main.cjs preflight gated on IS_WINDOWS. - scripts/install.ps1, scripts/install.sh: no reference to any new files; CLI install paths untouched. - Hermes CLI / dashboard / gateway: no reference; runtime untouched. - All checks: node --check on main.cjs and test-desktop.mjs pass; npm run test:desktop:platforms 4/4 passing; node --test green. Tested - npm run dist:win produces signed .exe and .msi without errors. - Fresh Win11 VM (Python pre-installed, no Git): prereq page renders, Python check shows detected, Git checkbox pre-checked. Click Next → Git installs via winget with UAC prompt in foreground. - After install completes, Hermes launches and the agent's terminal tool can run bash commands. Verified Git Bash is detected at `C:\Program Files\Git\bin\bash.exe` by ensureRuntime's preflight. * feat: theme changes, composer tweaks, in app update ux, finesse * fix(cli): seed bundled skills on dashboard + gateway entrypoints `sync_skills(quiet=True)` was only being called from inside `cmd_chat`, which meant `hermes dashboard` (the desktop GUI's backend) and `hermes gateway` (Telegram/Discord/Slack/etc daemons) never seeded the bundled skill library into ~/.hermes/skills/. This surfaced as "No skills found" in the desktop GUI's skills panel on fresh installs, despite the agent having access to the full bundled library when invoked via `hermes chat`. scripts/install.ps1 worked around it by running skills_sync.py as part of Copy-ConfigTemplates, but that's not part of the desktop installer's bootstrap chain. Fix - Extract the skills-sync block from cmd_chat into a module-level `_sync_bundled_skills_quietly()` helper. - Call the helper from cmd_chat (preserving existing behavior), cmd_dashboard (after the --status/--stop early-return paths and fastapi import check, so we don't run skills_sync on management commands or when deps aren't installed), and cmd_gateway. Why these three entrypoints - cmd_chat: the user's primary CLI entrypoint - cmd_dashboard: the desktop GUI's backend; this is what `hermes dashboard --tui` invokes when the desktop bootstrapper spawns Hermes - cmd_gateway: long-running daemons where the user expects the agent to have full skill access Other entrypoints (cmd_config, cmd_doctor, cmd_login, cmd_status, etc.) are management commands that don't need skill discovery and were never running skills_sync in the first place — leaving them alone. Idempotence - tools/skills_sync.py is manifest-based: skipped skills cost milliseconds. Calling it from multiple entrypoints adds no real cost, and users running `hermes chat` then `hermes dashboard` get two fast no-ops on the second call. Failure handling - Helper wraps skills_sync in try/except. Skills are an enhancement, not a hard dependency — Hermes runs fine with an empty skills/ dir. Files - hermes_cli/main.py: + new helper `_sync_bundled_skills_quietly()` at module level + cmd_chat: replace inline block with helper call + cmd_dashboard: add helper call after fastapi import succeeds + cmd_gateway: add helper call before delegating to gateway_command * feat(desktop): hoisted todo widget, JSON tool summaries, history grouping & timer fixes - Hoist todo to first-class widget (shadcn checkboxes, brand colors, no tool-accordion). Header derives label from active task; non-active rows fade. - Replace raw JSON dumps with structured key/value summaries via formatToolResultSummary; nested error extraction for clearer failures. - Fix loaded-session grouping: stitch interleaved assistant/tool iterations into one bubble instead of orphaned synthetic messages. - Stable tool/thinking timers via keyed registry so unmount/scroll doesn't reset elapsed counts; gate "running" on real live thread state. - Reorganize chat-only assistant-ui components under components/chat/. * fix(desktop): address CodeQL alerts on PR #20059 - settings/helpers.ts: harden setNested against prototype pollution. POLLUTING_PATH_PARTS check is now applied at every assignment site (loop + leaf) and uses Object.defineProperty so CodeQL can see the guard inline rather than via a helper function call. - lib/markdown-preprocess.ts: rebuild the dangling-fence close regex from a fence-char + length instead of marker.replace(...). The marker is captured by `(`{3,}|~{3,})` so it can only be backticks or tildes, but CodeQL was tracing tainted input text into the RegExp source and flagging hostname dots from input as part of the pattern (false positive js/incomplete-hostname-regexp on the test fixture URLs). Reconstructing from a literal char breaks the dataflow. - scripts/notarize-artifact.cjs: drop args from the run() rejection message. Args carry --key-id / --issuer / key file path; the existing outer catch already squashes errors to a generic line, but CodeQL was flagging the args.join(' ') as clear-text logging of APPLE_API_KEY_ID. Composer DOM-text-as-HTML alerts (composer/index.tsx:379, :547) are already addressed in 4dd9732a9 — innerHTML assignment was replaced with renderComposerContents which builds DOM via replaceChildren / append text nodes (no HTML interpretation). * fix(desktop): inline prototype-pollution guard so CodeQL sees it CodeQL's dataflow doesn't follow the helper-function guard inside `safeSet`, so it kept flagging Object.defineProperty as prototype- polluting. Inline the literal `__proto__`/`constructor`/`prototype` check at the assignment site to break the dataflow. Behavior unchanged — same set of disallowed keys, same throw. * feat(ui-tui): resolve links to readable page titles Mirror desktop pretty-link behavior in the TUI by resolving HTTP links to page titles with shared caching and safe fetch filters, plus slug-based fallbacks so chat links stay readable even when title fetch fails. * fix(desktop): drop RegExp from dangling-fence close detection Previous attempt tried to break the dataflow by reconstructing the close-fence regex from a literal char + marker.length, but CodeQL still traced marker.length back to input and kept flagging the test-fixture URLs as hostname-regex sources (js/incomplete-hostname-regexp). Replace `new RegExp(...)` + `closeRe.test(body)` with a string-only hasCloseFenceLine() helper that splits on '\n' and uses ===. No regex on this path now, so input data can no longer reach a RegExp source. Behavior preserved: matches lines that are (whitespace + marker + whitespace), which is what the original `\n[ \t]*${marker}[ \t]*(?=\n|$)` matched. All 12 markdown-text tests still pass. * fix(process-registry): suppress windows-footgun false positive on guarded killpg Keep the existing POSIX-only process-group teardown path, but make the signal selection explicit via getattr and add an inline windows-footgun suppression marker on the guarded os.killpg line so the Windows footgun check no longer blocks CI on this intentionally platform-gated code. * feat(desktop): reconcile live tool events, polish thread chrome, harden boot - chat-messages: match tool rows by overlapping query/context/preview values so preview-first `tool.progress` rows reliably adopt later stable-id `tool.start` payloads instead of spawning ghost rows or mis-merging parallel same-name calls; preserve prior args/result across phases. - tui_gateway: emit full args + parsed result on `tool.start` / `tool.complete`, drop redundant `tool.started` re-emit from `tool.progress`. - electron/main: prefer SOURCE_REPO_ROOT before PATH `hermes` in dev so local backend edits actually run; split hardening helpers into `electron/hardening.cjs` with tests. - thread/tool UI: one-shot enter animation keyed by stable ids, braille spinner for running rows, Cursor-like disclosure rows, drill-down + duration/count formatting via new tool-fallback-model. - composer: extract `text-utils`, drop liquid-glass overrides. - right-rail: split preview-pane into preview-console / preview-file. - runtime: incremental external-store runtime + runtime-readiness gate; onboarding store + tests; route-resume hook test. - regression tests for live tool reconciliation (parallel tools, id-less progress, preview-first rows, structured args/results). * feat(desktop): add ripgrep to NSIS prereq page + polish layout Add ripgrep as a third (recommended) prereq alongside Python and Git in the NSIS prereq detection page, and clean up the page layout based on on-VM testing. Why ripgrep - Hermes' search_files tool calls `rg` directly for content + filename search (tools/file_operations.py:1382). Falls back to grep/find from Git Bash when missing — works but slower and noisier (no .gitignore awareness). - ~5MB winget install via `BurntSushi.ripgrep.MSVC --scope user` — no UAC prompt, parallel to how Python installs. - scripts/install.ps1 already installs ripgrep as part of Install-SystemPackages; this brings the desktop installer to parity. Why "recommended" not "required" - Python and Git are hard requirements: without them the agent runtime or terminal tool refuses to start. The bootstrapper preflight throws. - ripgrep is a performance enhancement: missing it just means slower searches. Page wording reflects this; failure to install is logged but doesn't show a MessageBox or block. Layout polish (response to on-VM screenshot review) - Wizard header now correctly reads "System Requirements" instead of the leftover "Choose Install Location" from the previous page. Set via `GetDlgItem $HWNDPARENT 1037/1038` + WM_SETTEXT — the standard NSIS pattern for overriding the page header on a custom Page. - Removed redundant in-body title + verbose intro paragraph; the wizard header IS the title now. Body has one short intro line. - Group boxes tightened to 26u with content positioned just below the groupbox title (not top-anchored status + bottom-anchored checkbox with empty space in the middle). All three panels + footer fit comfortably in 126u, well under the 140u page limit. - Checkbox labels simplified: dropped "(per-user, no admin prompt)" and "(administrator approval required)" suffixes. The footer note still calls out UAC for Git when relevant. - Footer text trimmed to fit cleanly without clipping. Install order (in customInstall macro) - Python → ripgrep → Git - Python and ripgrep are silent and run first; Git's UAC prompt comes last so the user's approval interaction isn't interrupted by silent activity afterwards. Skip behavior unchanged - All three detected → page auto-skips via Abort - Silent install (/S) → customInstall winget block skips - User unchecks all → page advances without running winget Files - apps/desktop/installer/prereq-check.nsh: ripgrep detection block, ripgrep page panel + checkbox, ripgrep customInstall block, GetDlgItem header override, layout reflow - apps/desktop/README.md: Runtime prerequisites section updated to list ripgrep as recommended, with manual winget command * feat(desktop): add model-confirmation step to onboarding After OAuth/API-key login completes, onboarding now shows a confirmation card with the curated default model and a Change button before dropping the user into chat. Closes the gap where the desktop's `model.default` was empty after first launch and the agent had to fall back to whatever heuristic happened to fire — leaving users wondering "why am I getting sonnet-4 when I logged into Nous Portal?" Why - Desktop onboarding only persisted credentials, never `model.default`. The CLI's `hermes model` command pairs provider + model selection, but the desktop's onboarding skipped the model step entirely. - Result: users saw whichever model the agent's auto-fallback picked, unpredictably and undocumented. - For the BUILD demo we want users to land on the model they expect for their provider, with a clear "this is what you're getting" UI and a one-click path to change it before chatting. How - New `confirming_model` flow status carries the just-authenticated provider slug, current default model, label, and a saving flag. - `completeWithModelConfirm()` runs after credentials succeed: reloads env, verifies runtime, fetches /api/model/options to find the curated first-model for the provider, persists it via /api/model/set, then transitions into `confirming_model`. - If anything fails (no providers returned, network error), falls through to the previous behaviour — onboarding completes without the confirm step. Polish, not a hard requirement. - All four credential paths (device_code OAuth, PKCE OAuth, external CLI flow, API key) now use completeWithModelConfirm instead of reloadAndConnect. UI - `ConfirmingModelPanel` shows: green "<provider> connected" banner, card with "Default model: <name>" + Change button, and a "Start chatting" CTA that finalises onboarding. - Reuses the existing `ModelPickerDialog` (the same picker available from the chat shell) for the change-model UX. Search, filtering, multi-provider listing — all already built. - Stacking: ModelPickerDialog defaults to z-130, which renders UNDER the onboarding overlay (z-1300) and breaks pointer events. Added optional `contentClassName` prop to ModelPickerDialog so callers can override; onboarding passes `z-[1310]`. Provider-slug matching - For OAuth flows: pass `provider.id` directly as the preferred slug. - For API-key flows: `OPENROUTER_API_KEY` → "openrouter" via env-key prefix strip. Also includes the user-visible label as a fallback candidate. - fetchProviderDefaultModel falls back to the first authenticated provider in the response if no preferred slug matches — so even a miss still surfaces a reasonable default. Files - apps/desktop/src/store/onboarding.ts: + new `confirming_model` flow variant + fetchProviderDefaultModel + completeWithModelConfirm helpers + setOnboardingModel (optimistic update + revert on failure) + confirmOnboardingModel (finalises onboarding from the card) - reloadAndConnect (replaced; the four call sites now go through completeWithModelConfirm) - apps/desktop/src/components/desktop-onboarding-overlay.tsx: + ConfirmingModelPanel component + new branch in FlowPanel for status `confirming_model` + ModelPickerDialog usage with z-[1310] content class - apps/desktop/src/components/model-picker.tsx: + optional `contentClassName` prop on ModelPickerDialog so the dialog can be stacked on top of other fixed overlays Tested - `npm run type-check` passes - `npx eslint` clean on touched files - Live test in `npm run dev`: cleared onboarding cache, walked through Nous device-code flow, saw confirm card with curated default, clicked Change → ModelPickerDialog rendered above the onboarding overlay with working pointer events, picked a different model, "Start chatting" persisted to ~/.hermes/config.yaml. * fix(desktop): suppress generic provider warning in onboarding Hide the red setup notice when the message is the generic missing-provider guidance, since onboarding already presents provider auth actions. Centralize provider-setup matching across desktop hooks and add coverage for the matcher. * fix(desktop): add 2u clearance below prereq checkboxes Group box bottom border was clipping the checkboxes by 1-2px. Bumped each box height 26u→30u; checkboxes now sit 2u above the bottom border. * fix(nix): refresh dashboard lockfile hash Update the web npm deps hash in nix/web.nix to match the committed apps/dashboard/package-lock.json so bb/gui passes the nix lockfile check. * fix(desktop): install TUI deps in release workflow Ensure desktop release builds install the standalone ui-tui package before bundling the TUI payload. * fix(desktop): run release builder from app package Invoke the desktop builder through the package script so electron-builder uses apps/desktop/package.json. * fix(desktop): expand release artifact names safely Build desktop artifact names from workflow version/channel while preserving electron-builder platform macros. * fix(desktop): use package artifact naming in release workflow Let electron-builder's desktop package config provide platform-specific artifact extensions while the workflow injects the release version/channel metadata. * fix(nix): fetch dashboard npm deps from package root Point the dashboard npm dependency fetch at apps/dashboard so Nix can find the package lockfile after the dashboard move. * fix(nix): build dashboard from package directory Set the web package source root to apps/dashboard so npm patch/build phases run beside the dashboard lockfile while keeping apps/shared available as a sibling. * feat(desktop): render LaTeX math via KaTeX after streaming completes Add @streamdown/math plugin to the chat markdown renderer. Inline ($x^2$) and block ($$...$$) math both supported with singleDollarTextMath enabled. Plugin is gated to non-streaming state to match the existing pattern for syntax highlighting — math renders when the message completes, avoiding KaTeX re-render churn during streaming. KaTeX CSS is imported in styles.css; ~30KB CSS + ~430KB JS added to the bundle. Smoothness improvements during streaming deferred to a follow-up. * perf(desktop): memoize KaTeX renders so math streams without re-rendering Wrap rehype-katex with a per-equation LRU cache (keyed by displayMode + source text) and re-enable math during streaming. Stock @streamdown/math runs rehype-katex on every markdown commit, so each new token re-katexes every equation in the message. For math-heavy responses (an equation derived step-by-step) that's hundreds of ms of wasted work per token and the streaming UI chokes. With memoization, each equation pays katex.renderToString exactly once; subsequent tokens re-walk the tree but hit cache for unchanged equations. The wrapper mirrors rehype-katex's semantics exactly: same class detection (language-math, math-inline, math-display), same <pre>-walk-up for fenced math blocks, same parent.children.splice replacement, same SKIP traversal, same strict-then-lenient render strategy with VFile message reporting. Cached children are structuredCloned on each splice so downstream rehype plugins or toJsxRuntime can't mutate the cache. * fix(desktop): declare katex-memo deps directly + drop per-app lockfile katex-memo.ts (added in 112cad59b) imports hast-util-from-html-isomorphic, hast-util-to-text, remark-math, katex, and unist-util-visit-parents but those were never added to apps/desktop/package.json. They were silently resolving via @streamdown/math at the workspace root, which broke the moment `npm i --prefix apps/desktop` ran with the per-workspace lockfile because that install only consults apps/desktop/package.json. Add them as direct deps, plus unified/vfile/@types/hast for the type imports. Also delete apps/desktop/package-lock.json — root package.json declares workspaces: ["apps/*"], so npm manages all lockfile state at the root. The stale per-app lockfile is what made `npm i --prefix apps/desktop` diverge from the workspace install in the first place and left an empty apps/desktop/node_modules/@assistant-ui/ stub that Vite's dep optimizer then tried (and failed) to open at @assistant-ui/core/dist/internal.js. * feat(desktop): disable Backdrop noise overlay by default The noise overlay defaulted to on, which adds a busy speckle layer over the whole window for every new user. Flip the Leva default to off; the toggle stays in Backdrop / Noise for anyone who wants it back. * fix(desktop): polish LaTeX rendering — currency, code blocks, brackets Five distinct bugs surfaced from a math-heavy stress test: 1. Adjacent code fences glued together. scrubBacktickNoise's second-pass regex /``\s*``/g matched the LAST 2 backticks of one fence + whitespace + FIRST 2 backticks of the next, collapsing two blocks into one. Fixed with lookbehind/lookahead so we only match exactly 2 backticks not part of a longer run. 2. Whitespace eaten between fences and following content. stripPreviewTargets internally calls .trim() which strips leading/ trailing whitespace from each split-segment. For segments between two fences this collapsed \n\n to '', gluing fence close to next block. Fixed by capturing leading/trailing whitespace at the call site and restoring it after the transform. 3. Currency dollar signs eaten as math. With singleDollarTextMath:true remark-math greedy-matched any pair of $, so '$5 ... $10' became one inline math span. Added escapeCurrencyDollars to escape $<digit> patterns to \$<digit> in prose segments (not in code). Trade-off: math expressions starting with a digit (rare — '$5x = 10$') get escaped too. Mirrors the convention in ChatGPT/Claude's UIs. 4. \(...\) and \[...\] LaTeX brackets unsupported. Models often emit these instead of $...$ / $$...$$. Added rewriteLatexBracketDelimiters preprocessor pass. 5. ```latex / ```tex blocks were being routed to KaTeX via a rewrite to ```math. Aligns with GitHub markdown convention: ```math = render as math; ```latex / ```tex = LaTeX/TeX source code (syntax highlighted, not rendered). Conflating them broke teaching/showing-source use cases. MATH_FENCE_LANGUAGES pruned to {'math'} only. Also flipped parseIncompleteMarkdown to true (was !isStreaming) so the math parser can't see $ inside streaming-but-not-yet-closed code fences. Shiki was already deferred via defer={isStreaming} so this doesn't introduce new tokenization cost. Test: 18/18 existing tests still pass; one test updated to expect escaped \$ in currency-prose-with-URL case. * fix(desktop): detect Python via registry/filesystem; pin to 3.11–3.13 Two related fixes for Python detection on Windows: 1. py.exe (Python launcher) is missing from per-user installs that didn't check the launcher option, so 'py -3.X --version' alone misses real Python installs. User-reported case: clean Win11 + official Python.org 3.14 install -> 'where py' returned nothing, our installer offered to install Python again. Both NSIS prereq page and main.cjs now probe in this order: 1. py.exe launcher (when present) 2. PEP 514 registry: HKLM/HKCU\SOFTWARE\Python\PythonCore\<v>\InstallPath 3. Filesystem: %ProgramFiles%\Python<v>, %LocalAppData%\Programs\Python\Python<v> Crucially, we never fall back to running 'python.exe' from PATH on Windows — the WindowsApps stub at %LOCALAPPDATA%\Microsoft\ WindowsApps\python.exe is a redirector that opens the Microsoft Store window if no Store Python is installed. Triggering that during boot would be terrible UX. Registry/filesystem probes never execute the binary. 2. Drop 3.14 from the supported version set. Several Hermes deps (notably pywinpty, which carries Rust crates like windows_x86_64_msvc) don't yet publish 3.14 wheels. With wheels missing, 'pip install -e .' falls back to building from sdist, which needs a Rust toolchain — users see 'could not compile windows_x86_64_msvc build script' on first run. install.ps1 sidesteps this by pinning to 3.11 via uv; the desktop installer doesn't yet have the same uv-managed-Python pathway, so for now we accept 3.11/3.12/3.13 and tell winget to install 3.11 if none of those are present. Revisit when the wheel ecosystem catches up to 3.14 (~early 2026). * feat(desktop): Cron, Profiles, usage analytics, and titlebar fixes - Add Cron and Profiles sidebar routes with full CRUD-style flows and API wiring. - Extend Command Center with auxiliary task overrides and a Usage panel (7d/30d/90d). - Fix titlebar geometry for WSL/Windows (native overlay width, tool spacing). - Remove stray merge conflict markers from pyproject.toml optional deps. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(title-bar): position sidebar toggle button * feat(desktop): composer queue — queue many, edit/delete/cancel-edit, Cursor-style Press Enter while busy with a draft to queue it; with no draft to interrupt and send the next queued turn. Auto-drains one queued turn each time the session settles, same as Cursor. Queue persists across reloads so an interrupted-and-queued turn isn't lost on refresh. Each queued row supports edit-in-composer (with explicit Save/Cancel), send-now (↑), and delete. Drain skips only the entry currently being edited so the rest of the queue keeps flowing. Queue dequeue is transactional — an entry only leaves the queue after `prompt.submit` is accepted, so a rejected submit doesn't drop the turn. Also shrinks the `[interrupted]` marker to a muted one-liner and drops its assistant footer so it stops looking like a real reply. * fix(desktop): handle empty usage analytics totals Co-authored-by: Cursor <cursoragent@cursor.com> * fix(desktop): address PR review titlebar and usage races Co-authored-by: Cursor <cursoragent@cursor.com> * feat(desktop): add MCP settings and live subagent tree Surface configured MCP servers in Settings with JSON edit/save and a gateway-backed reload action so users can manage tool servers without falling back to slash commands. Track live subagent gateway events in a desktop store, show active subagent counts in the Agents statusbar item, and replace the Agents overlay stub with a live spawn tree for the active session. * fix(desktop): move power-user views out of sidebar Keep Cron and Profiles available through lower-prominence chrome entry points so the workspace sidebar stays focused on core chat navigation. Co-authored-by: Cursor <cursoragent@cursor.com> * refactor(desktop): subagent overlay reads like a live transcript, not a dashboard Strip the card chrome and rewire /agents to feel like peeking into the child agent's stream: - subagents store: single `stream` of typed entries (thinking/tool/progress/ summary) replaces the parallel notes/thinking/tools arrays. Drop unused fields (toolsets, depth, apiCalls, reasoningTokens, sessionId). - agents view: no OverlayCards, no boxed stream, no per-row borders. Goal + status pill + indented stream lines, full row width. - Group root spawns into "Delegation N" sections when batch shape + spawn time match — hides task-index interleaving and makes hierarchy obvious. - Sort tree by spawn time, then task_index. Step indicator is one colored pill (primary while running, emerald when done) inside the row, not a trailing pill that wrapped under the chevron. - Tree picks up `subagent.start` (not only `spawn_requested`) and prunes delegate-tool fallback rows once native subagent events land for the session — fixes duplicate "Delegated task" rows alongside the real ones. * feat(desktop): Esc closes every OverlayView-based overlay Lift the keyboard handler into the shared OverlayView so Agents, Settings, Command Center — and anything we build on top of it later — all dismiss on Esc by default. Nested Radix dialogs stop propagation themselves, so a modal opened inside an overlay (e.g. model picker inside Settings) still closes the modal first, not the overlay underneath. Drop the now-redundant Esc handlers in Settings (kept Cmd/Ctrl+P) and Command Center. * fix(desktop): drop numbered step pill on subagent rows The pill was getting clipped at the overlay edge anyway. Just use the status glyph (●/✓/✗/■/○) — the delegation header already conveys "3 workers, 3 active", and order in the list implies which step you're looking at. * fix(desktop): drop noisy "returned N items / empty object" stub strings When a tool returns nothing useful, the row should be silent — the title ("Search Files", etc.) already tells the user what happened. Counting the fields in an opaque payload is engineer-noise. `formatToolResultSummary` and `minimalValueSummary` now return '' for empty arrays / records / unrecognized values; tool-fallback already hides the detail section when its body is empty. * refactor(desktop): subagent rows borrow chat tool patterns (fade-in, lucide glyphs, shimmer) Pull the agents view closer to how chat tool blocks render: - statusGlyph() returns the same lucide BrailleSpinner / CheckCircle2 / AlertCircle vocabulary as tool-fallback's statusGlyph - Stream lines fade-in via useEnterAnimation (one-shot WAAPI), keyed per entry so streamed deltas settle in instead of popping - Subagent rows fade in too, and pick up the existing data-slot=tool-block spacing rules between blocks - Active stream line trails a BrailleSpinner instead of a hand-rolled pulsing rectangle - Goal text drops FadeText (which forces nowrap); keep FadeText only for the single-line meta subtitle - Running rows shimmer the title — same affordance the chat thinking row uses * refactor(desktop): make /agents subagent-only, drop sidebar + dead sections Activity rail and History stub were both noise. Strip the split layout, sidebar, route enum, and the rail/stub helpers — the overlay is now just the spawn tree, centered in a max-w-3xl column so it stops claiming the whole screen for one section's worth of content. * feat: update cron modals * Add dedicated GUI log stream for dashboard debugging. Capture dashboard and PTY websocket lifecycle failures in gui.log and expose it via hermes logs. * Improve desktop runtime UX by surfacing inference readiness in gateway status and hardening WSL link opening. This also stabilizes markdown code/table block spacing and adds root-install guards so desktop dev runs use a healthy workspace dependency tree. * Log detailed GUI websocket failure metadata. Capture richer reject/disconnect/send/parse context for dashboard gateway websocket flows so GUI connection failures are diagnosable from logs. * Default dashboard startup logging to GUI mode. Detect the dashboard subcommand during early CLI bootstrap so gui.log is attached from process start and GUI startup failures are always captured. * Clean up gateway status conditionals and logging bootstrap mode detection. Simplify nested dashboard gateway status branches for readability and use a concise first-subcommand check when selecting early GUI logging mode. * add logging to nsis installer * feat: glass ui pass * fix(desktop): persist inline assistant errors across hydrate/resume - Detect provider failure text arriving via message.complete (HTTP 4xx, "API call failed after N retries", Provider/Gateway error: ...) and persist as an inline assistant error instead of regular completion text, blocking the hydrate that was wiping it. - preserveLocalAssistantErrors: merge by id so same-id hydrated messages keep their local error, and preserve the optimistic user+error pair as a unit (with tail-user dedupe). - Hook all hydrate/resume writers (use-session-actions resume + fallback, hydrateFromStoredSession, syncSessionStateToView) into the merge so stale snapshots can't clobber a failed turn. - Add error to chatMessagesEquivalent so the resume diff actually sees error-only changes and paints them. - editMessage on a failed turn now submits a plain resend (no truncate_before_user_ordinal) and retries plainly on the "no longer in session history" race. Style polish on touched files: - Inline error: text-only treatment (no card). - User stop / edit-composer send: shared Tabler IconPlayerStopFilled glyph + shared icon-button class slot for parity. * feat(desktop): theme xterm with active light/dark mode The right-sidebar terminal hardcoded a light palette, which read poorly on the dark glass surface. Subscribe to `useTheme().resolvedMode` and hot-swap `term.options.theme` so Shift+X (and any other mode change) updates the terminal in place without tearing down the PTY session. Dark mode uses xterm's built-in defaults (white fg/cursor + vivid ANSI 16) with just a transparent background so the glass shows through; light mode keeps the existing hand-tuned overrides for legibility on a bright surface. * feat(sidebar): right-click + drag-reorder sessions and workspaces - Wire right-click on session rows to open the same actions menu; suppresses the OS-native context menu so Windows stops looking awful. - Share dropdown + context menu items via useSessionActions() driving a single declarative ItemSpec[]; render polymorphic over MenuItem. - New shadcn ContextMenu primitive mirroring DropdownMenu styling. - Restore drag-and-drop reordering for Agents (lost during the cwd cleanup) and add reordering of workspace groups via a right-side grab handle. Pinned reorder unchanged. - Generic orderByIds<T> replaces the duplicated session/group orderers; useSortableBindings() hook collapses the two Sortable wrappers. - cursor-pointer on every actionable element; cursor-grab on handles. - KISS pass: baseName() helper, AGE_TICKS table, single WORKSPACE_PAGE constant, flatter SidebarSessionsSection render. * feat(desktop): solarize the xterm palette in both light & dark xterm's default ANSI 16 is tuned for dark and reads candy-bright on the light glass surface (vivid cyans/greens). Ship the canonical Solarized palette (Schoonover) for both modes — same 16 accents either way, only fg/cursor swap between `base00/01` (light) and `base0/1` (dark), so a prompt's colors look uniform across a Shift+X toggle. Background stays transparent in both modes — Solarized's cream/slate backgrounds would fight the glass. * feat(desktop): virtualize chat thread + sidebar via TanStack Virtual Replaces `use-stick-to-bottom` and per-row session rendering with `@tanstack/react-virtual`, matching what Cursor uses. Chat thread (`thread-virtualizer.tsx`): - Natural-flow virtualization (padding spacers, not absolute items) so `position: sticky` on the human bubble still resolves cleanly against the scroller. - Custom at-bottom anchor: pins when armed, disarms on user-driven upward scroll, re-arms at bottom, jumps on session switch + `thread.runStart`. - Loading indicator and `--thread-last-message-clearance` move to a real `[data-slot=aui_composer-clearance]` node; drops the brittle `:nth-last-child(1 of …)` rule that can't fire reliably under virtualization. Sidebar (`virtual-session-list.tsx`): - Flat agents list virtualizes at >=25 rows; pinned and workspace-grouped paths stay direct-render. - `SortableContext` keeps all IDs; only the window mounts; dnd-kit's `setNodeRef` is merged with `virtualizer.measureElement` so rows participate in both DnD hit-testing and TanStack measurement. Drops `use-stick-to-bottom`. Streaming test gets a global `offsetWidth/offsetHeight` stub so the virtualizer's viewport sizing works in jsdom; the scroll-up-doesn't-pull-back invariant still passes. * feat: more ui qa * fix(desktop): trim sidebar terminal startup spacer Drop zsh's initial spacer row before writing the first terminal prompt so new sidebar terminal sessions do not open with a selectable blank line. * chore: uptick * feat(desktop): thin installer + first-launch install.ps1 bootstrap Converges the Windows packaged desktop installer onto a single canonical install topology: drop the Electron shell only (~80MB instead of ~500MB), clone Hermes Agent at a build-time-pinned commit on first launch via install.ps1's stage protocol, and treat the resulting git checkout at %LOCALAPPDATA%\hermes\hermes-agent\ as the canonical install location (same path the CLI installer uses). Future updates flow through the existing applyUpdates() git-pull path. Replaces the previous fat-installer architecture where the .exe bundled a pre-staged hermes-agent source tree under resources/hermes-agent/ that was then sync'd into ACTIVE_HERMES_ROOT at launch -- a complicated factory-vs-active dance with several footguns (FACTORY_HERMES_ROOT mismatch on path resolve, isGitCheckout guard regressions, pyproject hash drift detection inside the sync loop). Architecture overview --------------------- Build time apps/desktop/scripts/write-build-stamp.cjs writes apps/desktop/build/install-stamp.json with {commit, branch, builtAt, dirty}. Honours $GITHUB_SHA / $GITHUB_REF_NAME in CI, falls back to `git rev-parse HEAD` locally. apps/desktop/scripts/stage-native-deps.cjs copies the runtime subset of @homebridge/node-pty-prebuilt-multiarch from the workspace-root node_modules into apps/desktop/build/native-deps/. Workspace dedup hoists this dep to the root, out of reach of electron-builder's `files:`-restricted collector; staging gives us a deterministic path to extraResources. electron-builder ships both into resources/install-stamp.json and resources/native-deps/ respectively. Boot resolver (electron/main.cjs) Resolver order: 1. HERMES_DESKTOP_HERMES_ROOT override 2. SOURCE_REPO_ROOT (dev mode) 3. ACTIVE_HERMES_ROOT git checkout WITH .hermes-bootstrap-complete marker -- the post-install fast path 4. `hermes` on PATH (CLI-installed user adding the desktop) 5. pip-installed hermes_cli via system Python 6. bootstrap-needed sentinel -> hand off to runBootstrap Deletes the entire FACTORY_HERMES_ROOT / RUNTIME_MARKER / syncTreeExcludingVenv machinery (-200 lines). The isGitCheckout guard that bit us in the install.ps1 PR is gone. First-launch bootstrap (electron/bootstrap-runner.cjs) 1. Resolve install.ps1: prefer SOURCE_REPO_ROOT/scripts (dev), else download from GitHub raw at INSTALL_STAMP.commit (cached at HERMES_HOME\bootstrap-cache\install-<sha>.ps1). 2. Fetch the stage manifest via install.ps1 -Manifest -Commit X -Branch Y. 3. Iterate stages: install.ps1 -Stage <name> -NonInteractive -Json -Commit X -Branch Y per stage. 4. On all stages green: write the .hermes-bootstrap-complete marker with {schemaVersion, pinnedCommit, pinnedBranch, completedAt, desktopVersion}. Per-run log to HERMES_HOME\logs\bootstrap-<ts>.log. Cancellation via AbortSignal. Manifest cache so retries don't re-download. Install overlay (src/components/desktop-install-overlay.tsx) Mounted alongside the existing onboarding overlay; flexbox card with header (static) + middle (scrollable) + footer (failure-only, static). Subscribes to hermes:bootstrap:event IPC + resyncs from hermes:bootstrap:get on mount/reload. Renders: - 14-stage checklist with per-stage state icons - Overall progress bar + current-stage spotlight - Auto-expanded installer-output panel on failure - "Copy output" button (full ring buffer + error to clipboard) - "Reload and retry" wired through hermes:bootstrap:reset to clear main.cjs's latched failure Synthetic empty-manifest event from main.cjs flips the overlay to 'active' immediately so the slow install.ps1 download doesn't leave the user staring at the generic Preparing splash. Failure latching (main.cjs) bootstrapFailure module-scope variable holds the rejection after install.ps1 fails. startHermes() throws the latched error immediately when set, bypassing the entire ensureRuntime + runBootstrap chain. Without this, the renderer's ensureGatewayOpen retries would re-run install.ps1 in a 5-10 min hot loop while the user was still reading the failure overlay. Cleared via hermes:bootstrap:reset on user-driven retry. Unsupported-platform overlay (1F) macOS / Linux packaged builds (no install.sh stage protocol yet) emit an unsupported-platform event with a copy-pasteable install command + docs URL. Dedicated overlay branch with "Copy command" + "I've run it -- retry" buttons. install.ps1 additions (Phase 1F.3 + 1F.5) ----------------------------------------- New -Commit and -Tag string params. Precedence Commit > Tag > Branch. Honoured by all three code paths (update / fresh clone / ZIP fallback), with archive URL selection that handles each ref-type variant. Detached-HEAD checkouts intentionally -- they're pins, not branches the user pulls into. EAP=Continue wrap around the new pin-step git invocations. `git fetch origin <commit>` writes the routine 'From <url>' info line to stderr; under the script's global EAP=Stop that terminates the script even though fetch+checkout succeed. Matches the established pattern in Install-Uv, Test-Python, _Run-NpmInstall. Backend fix (hermes_cli/web_server.py) -------------------------------------- CORS allow_origin_regex now accepts Origin: 'null'. Packaged Electron loads index.html via file://; Chromium sets the WebSocket upgrade Origin header to the opaque origin 'null', which the old regex rejected with HTTP 403 before gateway_ws() ever ran. This failure mode was masked in the older FACTORY_HERMES_ROOT architecture because the resolver often found an existing hermes on PATH with different binding behavior. Security maintained: localhost-only bind keeps cross-machine pages out; per-process session token still gates every authenticated /api/ endpoint regardless of Origin. Desktop QoL ----------- DevTools is now enabled in packaged builds (F12 / Cmd+Opt+I). Field-debugging trade-off: tiny attack surface increase versus a much better support story when CSP / WS / theme issues surface. NSIS prereq-check page deleted (-767 lines). The standard Welcome -> License -> Directory -> InstallFiles -> Finish wizard now installs without custom Python/Git/ripgrep detection -- those prereqs are install.ps1's job at first launch. Test infrastructure (Phase 1G) ------------------------------ apps/desktop/scripts/test-desktop.mjs rewritten as a cross-platform bundle validator (was darwin-only and asserted on dead factory- payload paths): NEGATIVE: hermes_cli/main.py is NOT shipped (regression guard) POSITIVE: install-stamp.json carries a real commit + branch POSITIVE: node-pty native deps shipped under resources/native-deps POSITIVE: renderer dist/index.html reachable (asar or unpacked) New nsis mode and npm run test:desktop:nsis script. Validated end-to-end on clean Win10 VM -------------------------------------- Confirmed: NSIS installer drops Electron shell, app launches, install overlay shows progress, install.ps1 clones the pinned commit, 14 stages run to completion, marker written, backend spawns, WebSocket connects, onboarding overlay asks for API key, main UI loads, integrated terminal works. Failures handled: bootstrap stays failed (no hot-loop retry), "Copy output" gives actionable transcript, "Reload and retry" explicitly re-runs install.ps1. What's deferred --------------- - MSIX wrapping (Phase 2): same Electron .exe under MSIX manifest with runFullTrust, signed and submitted to Microsoft Store. - install.sh stage protocol parity (Phase 2): once shipped, the unsupported-platform overlay becomes drive-it-yourself and macOS/Linux packaged installers gain feature parity with Windows. * feat(desktop): persistent terminal pane + fullscreen takeover Adds a VSCode-style "focus terminal" toggle to the right sidebar's Terminal tab that takes over the chat pane area without unmounting the shell. The xterm host is mounted once at the layout root and CSS-overlayed onto whichever <TerminalSlot /> is currently active, so the PTY session, scrollback, selection, focus, and WebGL renderer survive every toggle. Also: - WebGL renderer (matching dashboard ChatPage) so Hermes' TUI skins paint faithfully instead of muting through xterm's default DOM renderer - File drag/drop from the project tree or OS into xterm — paths are shell-quoted (zsh/bash/pwsh/cmd) and written straight into the PTY - Solarized dark canvas with brights promoted to real accent variants (Schoonover's UI-gray brights washed out every TUI accent) - Strip NO_COLOR/FORCE_COLOR/COLORFGBG/TERM=dumb leaking from non-tty parents (CI runners, Cursor's agent shell) so the embedded shell gets truecolor regardless of how Electron was launched - rAF-debounced ResizeObserver — running fit.fit() synchronously during sibling pane transitions crashed the WebGL texture-atlas rebuild * fix(install.ps1): strip UTF-8 BOM regression that broke 'irm | iex' The canonical install flow irm https://raw.githubusercontent.com/.../scripts/install.ps1 | iex fails on PowerShell 5.1 with a cascade of 'The assignment expression is not valid' errors at every param() default value: [string]$Branch = 'main', ~~~~~~ The assignment expression is not valid. The input to an assignment operator must be an object that is able to accept assignments... Root cause: scripts/install.ps1 carries a UTF-8 BOM (0xEF 0xBB 0xBF) as its first three bytes. 'irm' returns the response body as a string; on PS 5.1 the BOM survives into that string as a leading \ufeff character. 'iex' then evaluates the string and PS's parser chokes on the invisible character before param() -- error recovery proceeds into the body but every assignment is reported as broken. This was the exact failure mode the install.ps1 hardening pass (PR #27224) deliberately fixed by stripping the BOM and ensuring the file body is pure ASCII. Commit |
|||
| cbf851ae1d |
perf(tui): stop slow/dead MCP servers from freezing TUI startup
The 'summoning hermes…' phase blocked on gateway.ready, which ran MCP tool discovery inline. Any configured-but-unreachable MCP server burned its full connect-retry backoff (1+2+4s ≈ 7s) before the composer appeared — startup went from instant to ~7.5s of dead air for anyone with a down stdio/http server in mcp_servers. Move discovery into a background daemon thread so gateway.ready fires immediately; tools register into the shared registry as servers connect, and the agent isn't built until the first prompt. Measured spawn→ready: ~7500ms → ~115ms (dead twozero_td server in config). Also drop rich.console + prompt_toolkit off banner.py's import path (lazy-imported inside cprint/build_welcome_banner). tui_gateway.server imports banner only to reach the lightweight prefetch_update_check helper; the eager rich/pt imports added ~45ms before gateway.ready for no benefit. tui_gateway.server import: ~115ms → ~69ms. |
|||
| 66827f8947 |
chore: prune unused imports and duplicate import redefinitions
Remove unused imports (F401) and duplicate/shadowed import redefinitions (F811) across the codebase using ruff's safe autofixes. No behavioral changes -- imports only. - ~1400 safe autofixes applied across 644 files (net -1072 lines) - __init__.py re-exports preserved (excluded from F401 removal so public re-export surfaces stay intact) - Re-exports that are imported or monkeypatched by tests but look unused in their defining module are kept with explicit # noqa: F401 (gateway/run.py load_dotenv; run_agent re-exports from agent.message_sanitization, agent.context_compressor, agent.retry_utils, agent.prompt_builder, agent.process_bootstrap, agent.codex_responses_adapter) - Unsafe F841 (unused-variable) fixes deliberately skipped -- those can change behavior when the RHS has side effects - ruff lints remain disabled in pyproject.toml (only PLW1514 is selected); this is a one-time cleanup, not a config change Verification: - python -m compileall: clean - pytest --collect-only: all 27161 tests collect (zero import errors) - core entry points import clean (run_agent, model_tools, cli, toolsets, hermes_state, batch_runner, gateway) - static scan: every name any test imports directly from an edited module still resolves |
|||
| 3a9bc9d88a |
fix(model picker): unify /model and hermes model lists, add disk cache (#33867)
* fix(model picker): unify /model and `hermes model` model lists, add disk cache
The /model slash picker and `hermes model` were drifting apart. /model
read the raw static `OPENROUTER_MODELS` list (31 entries, including 5
that fail at runtime — no tool-call support or absent from live catalog),
while `hermes model` ran the same list through the live OpenRouter
/v1/models tool-support filter and showed 26 valid entries. Same problem
existed for every other authed provider: /model used curated static
lists, `hermes model` used live /v1/models.
Unifies both surfaces on `provider_model_ids()` and adds a generic
disk-cached wrapper so the picker stays snappy.
Changes
- hermes_cli/models.py: new `cached_provider_model_ids()` —
~/.hermes/provider_models_cache.json, 1h TTL, per-provider entries
keyed by credential fingerprint (env vars + OAuth file mtimes).
Stale-data-beats-no-data on transient failures. Pair with
`clear_provider_models_cache(provider=None)`.
- hermes_cli/models.py: `provider_model_ids("nous")` now falls back
to the docs-hosted manifest (not the in-repo snapshot) when the live
Portal /models call fails — preserves the model_catalog regression
guarantee while still going through the unified pathway.
- hermes_cli/model_switch.py: `list_authenticated_providers` routes
sections 1, 2, and 2b through `cached_provider_model_ids(slug)` with
curated fallback when the live fetcher comes up empty.
- hermes_cli/model_switch.py: `parse_model_flags` extended to a
4-tuple, parses `--refresh`.
- cli.py / gateway/run.py / tui_gateway/server.py: updated unpacking;
CLI + gateway wire `--refresh` to `clear_provider_models_cache()`.
- hermes_cli/main.py: `hermes model --refresh` argparse flag.
- hermes_cli/commands.py: `/model` args_hint advertises `--refresh`.
- tests/hermes_cli/test_inventory.py: refresh stale comment.
Live PTY parity verification
- /model → OpenRouter row: `(26 models)` (was 31, with broken entries)
- `hermes model` → OpenRouter: 26 models (unchanged)
- The 5 dropped entries: `pareto-code` (no tool-call support),
`gemini-3-pro-image-preview` (no tool-call support),
`elephant-alpha`, `hy3-preview:free`, `ring-2.6-1t:free` (gone
from OpenRouter's live catalog).
Live PTY timing
- First /model open, empty cache: 4624 ms (full network round trip
across every authed provider)
- Second /model open, warm cache: 51 ms (90× faster)
- `/model --refresh` clears the disk cache and re-fetches.
Cache schema (~/.hermes/provider_models_cache.json, ~3 KB):
{ "anthropic": {"fp": "<sha256:16>", "at": 1748..., "models": [...]},
... }
Targeted tests: tests/hermes_cli/ + gateway model tests + tui_gateway —
5855/5855 pass.
* fix(model picker): use blake2b for cache fingerprint to silence CodeQL
py/weak-sensitive-data-hashing flagged the sha256 call in
_credential_fingerprint() as a high-severity alert because the input
includes env var values whose names contain *_API_KEY / *_TOKEN.
The hash is used solely as a cache-bust identity — never reversed, never
stored, collisions are harmless (worst case: cache miss → live re-fetch).
blake2b serves the same purpose and isn't flagged by this rule.
Functional behavior identical: 16-hex-char digest, cache hit/miss logic
unchanged. Live re-verified — 26 OpenRouter models, warm-cache 78ms.
|
|||
| 0a83247e9f |
feat: add TUI session orchestrator
Add a first-class active-session orchestrator for the Ink TUI: - list, activate, close, and launch live process-local TUI sessions - hydrate committed and in-flight output when switching sessions - dispatch a new prompt session from the +new row with session-scoped model picks - expose a clickable live-session count in the status chrome - preserve stable row order while initially focusing the current session - support mouse hit-testing for floating orchestrator overlays - add backend and frontend regression coverage for the lifecycle and UI helpers |
|||
| c9b3eeabdc |
fix(cli): decouple tool_progress=verbose from global DEBUG logging (#31379)
PR #6a1aa420e coupled `display.tool_progress: verbose` (a per-tool display toggle for full args / results / think blocks) to `self.verbose` — which controls root-logger DEBUG level. Result: setting tool_progress: verbose in config silently flipped every module in the process to DEBUG and flooded the terminal with internal logging, far beyond just full tool calls. The two concepts are separate: - `tool_progress_mode == 'verbose'` → display behavior (tool rendering) - `self.verbose` → logging behavior (root logger → DEBUG, line 9795) This change keeps PR #6a1aa420e's argparse.SUPPRESS / config-fallback plumbing but severs the verbose-display → debug-logging link. Changes: - cli.py:2868 — `self.verbose` only follows explicit `verbose=` arg; no longer auto-True when tool_progress_mode == 'verbose'. - cli.py:_toggle_verbose — slash-cycle through tool progress modes no longer flips `self.verbose` / `agent.verbose_logging` / `agent.quiet_mode`. - cli.py:9355 — fix misleading label (drop 'and debug logs'). - tui_gateway/server.py:_make_agent — same decoupling on the TUI side (verbose_logging no longer derived from tool_progress_mode). - tests/cli/test_tool_progress_scrollback.py — invert the test that asserted the broken coupling; add coverage for explicit `--verbose` still enabling DEBUG independent of tool_progress. Live verified: - tool_progress: verbose, no --verbose flag → 0 DEBUG/INFO log lines - --verbose flag explicit → 32 DEBUG/INFO log lines (as expected) |
|||
| a627981a65 |
fix(tui): stop slash dropdown from chopping last char of /goal (#31311)
Two independent bugs caused the slash-command autocomplete to render
`/goal` as `/goa` (and `/gquota` as `/gquot` for that matter) in the TUI:
1. `tui_gateway/server.py` was forwarding `c.display` from
prompt_toolkit's `Completion` straight into the JSON-RPC payload.
prompt_toolkit normalizes `display=` into `FormattedText` (a `list`
subclass), so the wire format became `[["", "/goal"]]` instead of
the `string` that `CompletionItem.display` in the TUI declares.
`meta` already went through `to_plain_text` — `display` did not.
2. The dropdown row in `appOverlays.tsx` used `flexDirection="row"`
with the display `<Text>` and the (very long) meta `<Text>` as
siblings. When the meta overflows the row width, Ink/Yoga shrinks
the *first* column by one cell, lopping the trailing character off
the command name. `/goal` triggers it reliably because its meta
string is the longest of any built-in command (description +
embedded `[text | pause | resume | clear | status]` usage hint).
Wrapping the display column in `<Box flexShrink={0}>` keeps it at
its natural width and lets the meta wrap or truncate instead.
|
|||
| 83f6a83b24 | fix(tui): handle images with codex app-server | |||
| 7de8cd4c5f |
fix(tui): clear TTS env var on voice off, and add TTS indicator to status bar
Bug 1: /voice off in TUI mode did not clear HERMES_VOICE_TTS, leaving TTS stuck ON with no way to disable it (the voice.toggle tts handler requires voice mode to be ON). Bug 2: TUI status bar only showed 'voice on/off' without any indication of whether TTS speech output is active, because the frontend never tracked voiceTts state. - tui_gateway/server.py: clear HERMES_VOICE_TTS when voice is turned off - ui-tui/src/app/useMainApp.ts: add voiceTts state, thread setVoiceTts through voice contexts, display [tts] in status bar - ui-tui/src/app/slash/commands/session.ts: sync tts from voice.toggle response - ui-tui/src/app/interfaces.ts: add setVoiceTts to all voice context interfaces |
|||
| 1264fab156 |
fix(tui): surface verbose tool details (#30225)
* fix(tui): surface verbose tool details Emit redacted structured verbose args/results to the TUI so /verbose verbose can show full tool detail without reopening stdout, and fail closed if redaction is unavailable. Salvages #29011. Co-authored-by: helix4u <4317663+helix4u@users.noreply.github.com> * fix(tui): address verbose detail review Label verbose tool failures as errors, cover forced verbose reasoning, and avoid new diff type warnings from the redaction regression tests. * fix(tui): bound verbose tool payloads Cap verbose tool detail text before emitting JSON-RPC events and preserve verbose results on inline diff completions. * fix(tui): align termux argv test with gc flag Update the stale TUI launch expectation so the Termux freshness path matches the current direct Node argv. --------- Co-authored-by: helix4u <4317663+helix4u@users.noreply.github.com> |
|||
| a7cd254c29 |
feat(tui): mouse_tracking DEC mode presets (salvage of #26681) (#30084)
* feat(tui): make display.mouse_tracking pick which DEC modes to enable Previously the boolean flag was all-or-nothing across modes 1000+1002+1003+1006. Inside tmux, mode 1003 (any-motion) makes every mouse cross of the prompt row fire a clipboard probe that surfaces as "No image in clipboard" — sometimes dozens in a row. Disabling tracking entirely killed scroll-wheel scrolling too, since tmux's own scrollback is preempted by the alt-screen TUI. `display.mouse_tracking` (and `/mouse <preset>`) now accepts `off | wheel | buttons | all` in addition to the legacy booleans. `wheel` is 1000+1006: scroll wheel + click only, no drag, no hover — the tmux-friendly subset. `buttons` adds 1002 for drag-to-select. `all` (= legacy `true`) keeps the hover-driven UI (scrollbar paginate-on-hover, link mouseenter, etc.). * fix(tui): repaint + sync mouse mode when display.mouse_tracking changes Two interacting bugs left the TUI blank when `display.mouse_tracking` switched at runtime (config edit, /mouse <preset>): 1. AlternateScreen's effect re-runs on every `mouseTracking` change, tearing down and re-entering the alt screen. After re-entry, ink's frame buffers are reset by `resetFramesForAltScreen()` but nothing schedules the follow-up render — the alt screen sits blank until some other state change happens to trigger one. Add a `scheduleRender()` in `setAltScreenActive`'s active=true branch so the freshly-entered alt screen gets a full repaint immediately. 2. `setAltScreenActive` early-returns when `active` hasn't changed, which silently drops a `mouseTracking` change if the cleanup→setup pair somehow leaves `altScreenActive` already true. Call `setAltScreenMouseTracking` explicitly from the AlternateScreen effect so the in-memory mode and terminal DECSET sequence stay in sync regardless of how `setAltScreenActive` resolved (the call is a no-op when the mode is unchanged). * fix(tui): address copilot review #4341269705 - tui_gateway/server.py: drop the never-referenced _MOUSE_TRACKING_MODES frozenset (comment #3284802434). _MOUSE_TRACKING_ALIASES already centralizes the canonical preset set via its values; the separate constant added no behavior. - tests/test_tui_gateway_server.py: update the existing test_config_mouse_uses_documented_key_with_legacy_fallback to assert the new preset strings ('all'/'off' instead of 'on'/'off', display.mouse_tracking persisted as 'all' instead of True) and add test_config_mouse_accepts_preset_strings_and_aliases covering /mouse set with wheel/click/unknown (comment #3284802453). The on/off legacy config.set return shape was an implementation detail of the boolean flag, not a stable API — the slash command, gateway help text, and docs all advertise the preset values now. - ui-tui/packages/hermes-ink/src/ink/ink.tsx: schedule a render at the end of reenterAltScreen() (comment #3284802461). Mirrors the same fix in setAltScreenActive() from ece0a2f4c — without it, SIGCONT/resize self-heal/stdin-gap re-entry leaves the alt screen blank because every caller returns early after invoking us. * fix(tui): address copilot review #4341308478 round 2 - ui-tui/src/config/env.ts (comment #3284837577): the precedence comment was misleading. Actual behavior on origin/main is HERMES_TUI_MOUSE_TRACKING (explicit override) > Termux default > HERMES_TUI_DISABLE_MOUSE legacy kill-switch. This is preserved from main; the only change here was the wrong comment that claimed DISABLE_MOUSE kept kill-switch semantics. Rewrote the comment block to document the actual precedence ladder. - tui_gateway/server.py /mouse set (comment #3284837607): replaced 'str(value or "").strip().lower()' with the explicit None idiom already used for /indicator, so programmatic callers can pass 0 / False and have them route through _MOUSE_TRACKING_ALIASES → 'off' instead of collapsing to '' and triggering the toggle path. - ui-tui/packages/hermes-ink/src/ink/components/AlternateScreen.tsx (comment #3284837620): always prepend DISABLE_MOUSE_TRACKING before enableMouseTrackingFor(...) on mount. Otherwise selecting 'wheel'/'buttons' from a state where DEC 1003 was already asserted (crash, another app, debugger) would silently leave hover on. Also unconditionally DISABLE on unmount so a crash mid-mount can't leak DEC modes back to the host shell. * chore(release): map nat@nthrow.io to @nthrow for #26681 salvage * fix(tui): drop redundant setAltScreenMouseTracking in AlternateScreen Copilot review #4341356637 (comment #3284880417). The explicit setAltScreenMouseTracking(mouseTracking) after setAltScreenActive(true, mouseTracking) was defensive paranoia added in the previous fix commit that's not actually reachable in practice: - React's cleanup always runs before the next setup, so on any prop change (mouseTracking or writeRaw) the cleanup sets active=false first. Setup then sees active was false and applies the new mode via setAltScreenActive without early-returning. - On the impossible 'active stayed true' path, the writeRaw above has already sent DISABLE_MOUSE_TRACKING + enableMouseTrackingFor(newMode) to the terminal, so the in-memory mode would lag but the visible state is already correct. Removing the redundant call means a single DEC sequence per mount. If the 'active stayed true' path ever manifests in practice, the right fix is in setAltScreenActive (track mode regardless of the active early-return), not here. * fix(tui): always DISABLE before enableMouseTrackingFor in ink.tsx Copilot review #4341379994 (comments #3284900825, #3284900840, #3284900852). Three remaining call sites in ink.tsx still re-enabled mouse tracking without first sending DISABLE_MOUSE_TRACKING: - handleResize alt-screen recovery (line ~577) - reassertTerminalModes stdin-gap re-assertion (line ~1351) - reenterAltScreen SIGCONT/resize/stdin-gap self-heal (line ~1408) For 'wheel'/'buttons' presets, omitting DISABLE leaves any externally- asserted DEC 1003 (other apps, prior crash, tmux state) still active and the hover-free preset silently has hover on. DISABLE_MOUSE_TRACKING is idempotent and safe to send unconditionally — it resets all four modes. Matches the pattern already in setAltScreenMouseTracking and the AlternateScreen mount path. * fix(tui): always DISABLE before enableMouseTrackingFor in exitAlternateScreen Copilot review #4341452823 (comment #3284959762). exitAlternateScreen() was the last call site in ink.tsx still re-enabling mouse tracking without DISABLE first. Editors (vim/nvim/less) and tmux can leave DEC 1003 hover asserted across the handoff back; without DISABLE, 'wheel'/'buttons' presets silently kept hover on after the editor quit. Now all five enableMouseTrackingFor() call sites in ink.tsx prepend DISABLE_MOUSE_TRACKING — handleResize, reassertTerminalModes, reenterAltScreen, setAltScreenMouseTracking, exitAlternateScreen. * fix(tui): add defensive default to enableMouseTrackingFor switch Copilot review #4341485231 (comment #3284979323). TS exhaustive switch returns string per the type system, but a JS caller / corrupted config / hot-reload-in-dev could reach the function with an unknown value at runtime. Without a default, that path returns undefined which then concatenates as the literal string 'undefined' into the terminal byte stream — visibly garbling output. Treat unknown as 'off' (no DEC sequences) so the worst case is silent input loss rather than a wrecked screen. --------- Co-authored-by: Nat Thrower <nat@nthrow.io> |
|||
| 697d38a3f4 |
feat: auto-launch Chromium-family browser for CDP
Add browser CDP launch candidates for Chrome, Chromium, Brave, and Edge while preserving Chrome-first selection. Retry candidate launch failures instead of giving up after the first executable. Update /browser CLI and TUI messaging, docs, and tool descriptions from Chrome-only wording to Chromium-family browser support. Add regression coverage for Brave/Edge paths, Chrome-first precedence, fallback launches, and CDP endpoint probing. |
|||
| 8c3b065124 | fix(cli): show active profile in TUI prompt | |||
| b5c1fe78aa |
feat(skills): add skill bundles — alias /<name> loads multiple skills (#28373)
Skill bundles are tiny YAML files in ~/.hermes/skill-bundles/ that
group several skills under one slash command. Invoking /<bundle-name>
from any surface (CLI, TUI, dashboard, any gateway platform) loads
every referenced skill into a single combined user message.
Use cases:
- /backend-dev → loads github-code-review + test-driven-development
+ github-pr-workflow as one bundle.
- /research → loads several research skills together.
- Team task profiles shared via dotfiles.
Behavior:
- Bundles take precedence over individual skills when slugs collide.
- Missing skills are skipped with a note, not fatal.
- No system-prompt mutation — bundles generate a fresh user message
at invocation time, the same way /<skill> does. Prompt cache stays
intact.
- Works in CLI dispatch, gateway dispatch, autocomplete (CLI + TUI),
/help display.
Schema (~/.hermes/skill-bundles/<slug>.yaml):
name: backend-dev
description: Backend feature work.
skills:
- github-code-review
- test-driven-development
instruction: |
Optional extra guidance prepended to the loaded skills.
New module: agent/skill_bundles.py — load, scan, resolve, build
invocation message, save, delete. yaml.safe_load only; broken
bundles log a warning and are skipped, never raise.
New CLI subcommand: hermes bundles {list,show,create,delete,reload}.
Implementation in hermes_cli/bundles.py; wired in hermes_cli/main.py.
'bundles' added to _BUILTIN_SUBCOMMANDS so plugin discovery skips it.
New in-session slash command: /bundles lists installed bundles in
both CLI and gateway. /<bundle-name> dispatch added to CLI (cli.py)
and gateway (gateway/run.py) before the existing /<skill-name> path.
Autocomplete: SlashCommandCompleter gained an optional
skill_bundles_provider parameter that defaults to None — the prompt
shows '▣ <description> (N skills)' for bundles vs '⚡' for skills.
Tests:
- tests/agent/test_skill_bundles.py — 33 tests covering slugify,
scan/cache freshness, resolve (including underscore→hyphen
Telegram alias), build_bundle_invocation_message (loading, missing
skills, user/bundle instruction injection, dedup), save/delete,
reload diff, list sort.
- tests/hermes_cli/test_bundles.py — 8 tests for the CLI
subcommand (create/list/show/delete/reload, --force, missing
bundle errors).
- tests/gateway/test_bundles_command.py — 4 tests for the gateway
handler and bundle resolution priority.
Live E2E: verified subprocess invocations of hermes bundles
{list,create,show,reload,delete} round-trip correctly against an
isolated HERMES_HOME.
Docs:
- website/docs/user-guide/features/skills.md — new 'Skill Bundles'
section with quick example, YAML schema, management commands,
behavior notes.
- website/docs/reference/cli-commands.md — 'hermes bundles' added to
the top-level command table and given its own subcommand section.
|
|||
| 2ef501e1f5 |
feat(cli): add /update slash command to CLI and TUI (#23854)
* feat: add /update slash command to CLI and TUI * test(cli): add Python tests for /update slash command Co-authored-by: Cursor <cursoragent@cursor.com> * fix(cli): address Copilot review for /update slash command Route classic CLI /update through prompt_toolkit modal confirmation and defer relaunch to the main-thread cleanup path after app.exit(). Tighten Y/n semantics, add Python wrapper and catalog coverage tests, and assert /update stays visible in the TUI command catalog. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(cli): address review feedback on /update command - Replace raw input() with _prompt_text_input_modal in _handle_update_command to avoid EOF/hang/keystroke-leak races with prompt_toolkit's stdin ownership - Fix confirmation logic: only proceed on recognized affirmative aliases (y/yes/1/ok); cancel on everything else including empty string, typos, and unrecognized input — matches all other [Y/n] prompts in the codebase - Route relaunch through main-thread shutdown path: set _pending_relaunch and return False from process_command so process_loop triggers app.exit(); run() then calls relaunch() after prompt_toolkit has restored terminal modes and after cleanup — safe on both POSIX (execvp) and Windows (subprocess+exit) - Fix misleading docstring in test_update_command.py: the Vitest only covers the TypeScript slash handler that emits code 42, not the Python wrapper branch that acts on it - Rewrite tests to use SimpleNamespace pattern (like test_destructive_slash_confirm) so _prompt_text_input_modal can be stubbed directly - Add Python test for _launch_tui exit-code-42 → relaunch branch in main.py Agent-Logs-Url: https://github.com/NousResearch/hermes-agent/sessions/f6da68cf-e7b1-4b7a-aed6-3d4b0f523bdb Co-authored-by: austinpickett <260188+austinpickett@users.noreply.github.com> * fix(cli): polish test fixtures for /update command - Remove unused _prompt_text_input from SimpleNamespace stub - Use pytest.fail sentinel in managed-install guard test to catch unexpected modal invocations Agent-Logs-Url: https://github.com/NousResearch/hermes-agent/sessions/f6da68cf-e7b1-4b7a-aed6-3d4b0f523bdb Co-authored-by: austinpickett <260188+austinpickett@users.noreply.github.com> * chore: re-trigger CI after Copilot review fixes Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: austinpickett <260188+austinpickett@users.noreply.github.com> |
|||
| 9df9816dab |
feat(azure-foundry): add Microsoft Entra ID auth
Use azure-identity DefaultAzureCredential for keyless Foundry auth. Preserve refreshable callable credentials through OpenAI and Anthropic client paths. Add setup, doctor, auth status, docs, and tests for Entra auth. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> |
|||
| d5416284f1 |
fix(tui): autonomous background process completion notifications (#26071) (#26327)
* feat(process-registry): add format_process_notification shared helper * feat(process-registry): add drain_notifications method * refactor(cli): use shared drain_notifications and format_process_notification * feat(tui): add background notification poller for completion_queue * feat(tui): wire notification poller into session init/finalize * refactor(tui): add post-turn drain using shared helper as safety net |
|||
| efc32ab639 |
refactor(inventory): extract shared ConfigContext + build_models_payload
Three call-sites in the codebase each duplicated the same config-slice
+ list_authenticated_providers + post-processing pattern:
- hermes_cli/web_server.py /api/model/options
- tui_gateway/server.py model.options JSON-RPC
- tui_gateway/server.py model.save_key JSON-RPC
This consolidates them onto hermes_cli/inventory.py:
load_picker_context() -> ConfigContext
Replaces the 17-LOC config-slice (model.{default,name,provider,
base_url}, providers:, custom_providers:) every consumer did
inline.
ConfigContext.with_overrides(*, current_provider=, current_model=,
current_base_url=) -> ConfigContext
Truthy-only overlay for TUI agent-session state on top of disk
config. Empty getattr(agent, ...) attrs MUST NOT clobber disk.
build_models_payload(ctx, *, include_unconfigured, picker_hints,
canonical_order, max_models) -> dict
Single payload builder. Delegates curation to
list_authenticated_providers (does not call provider_model_ids
per row \u2014 that pulls non-agentic models). picker_hints +
canonical_order produce the TUI ModelPickerDialog shape;
defaults match the dashboard's existing /api/model/options
contract.
Two latent bugs fixed by consolidation:
1. The dashboard read cfg.get('custom_providers') directly, missing
the v12+ keyed providers: form. Now both surfaces go through
get_compatible_custom_providers().
2. The TUI's canonical-merge keyed on is_user_defined to decide order.
Section 3 of list_authenticated_providers sets is_user_defined=True
on rows from the providers: config dict even when the slug is
canonical \u2014 that silently demoted them to the picker tail.
_reorder_canonical now keys on slug membership instead.
Stats: +666 / -145 (net +521). Module 240 LOC; 18 behavior tests.
This PR replaces the rejected #23369 (which bundled the consolidation
with new scriptable CLI surfaces \u2014 hermes models list/status, hermes
providers list \u2014 and a JSON contract that have no external user
demand). Just the refactor; the CLI surface is deferred to a separate
PR gated on actual demand.
Refs #23359.
|
|||
| 557deece6f |
fix(tui): use TERMINAL_CWD in _session_info for accurate status line path
_session_info() used os.getcwd() which reflects the gateway process working directory, not the user's actual working directory. This caused the TUI status line to display incorrect paths (e.g. D:\HermesWork instead of D:\Hermes\HermesWork) after agent turns that changed the process cwd. Align with session.create which already correctly reads TERMINAL_CWD env var set by the CLI launcher. |
|||
| ce0f529cde |
chore: ruff auto-fix C401, C416, C408, PLR1722 (#23940)
C401: set(x for x in y) -> {x for x in y} (set comprehension)
C416: [(k,v) for k,v in d] -> list(d.items()) (unnecessary listcomp)
C408: tuple()/dict() -> ()/{} (unnecessary collection call)
PLR1722: exit() -> sys.exit() (adds import sys where needed)
21 instances fixed, 0 remaining. 19 files, +40/-36.
|
|||
| 2ec8d2b42f |
chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)
Replace with for all literal-tuple membership tests. Set lookup is O(1) vs O(n) for tuple — consistent micro-optimization across the codebase. 608 instances fixed via `ruff --fix --unsafe-fixes`, 0 remaining. 133 files, +626/-626 (net zero). |
|||
| 3e7145e0bb |
revert: roll back /goal checklist + /subgoal feature stack (#23813)
* Revert "fix(goals): force judge to use tool calls instead of JSON-text replies (#23547)" This reverts commit |
|||
| 404640a2b7 |
feat(goals): /goal checklist + /subgoal user controls (#23456)
* feat(goals): /goal checklist + /subgoal user controls
Two-phase judge for /goal — Phase A decomposes the goal into a detailed
checklist on first turn; Phase B evaluates each pending item harshly
against the agent's most recent response. The goal completes only when
every item is in a terminal status (completed or impossible). Adds
/subgoal so the user can append, complete, mark impossible, undo,
remove, or clear items the judge missed or got wrong.
Mechanics:
- GoalState gains `checklist` and `decomposed` fields, both backwards
compatible (old state_meta rows load unchanged).
- Phase A: aux call writes a harsh, exhaustive checklist; biased toward
more items not fewer. Falls through to legacy freeform judge when
decompose fails.
- Phase B: judge gets the checklist + last-response snippet + path to
a per-session conversation dump at <HERMES_HOME>/goals/<sid>.json.
A bounded read_file tool (max 5 calls per turn, restricted to that
one file) lets the judge inspect history when the snippet is
ambiguous. Stickiness in code: terminal items are frozen, only the
user can revert via /subgoal undo.
- Continuation prompt shows checklist progress when non-empty;
reverts to old prompt when empty.
- Status line shows M/N done counts.
CLI + gateway + TUI gateway all pass the agent reference into
evaluate_after_turn so the dump can be written. Gateway-side
/subgoal is allowed mid-run since it only modifies the checklist
the judge consults at turn boundaries.
Tests: 24 new cases — backcompat round-trip, Phase A decompose,
Phase B updates + new_items + stickiness, user override flows,
conversation dump (incl. unsafe-sid sanitization), judge read_file
restriction. Existing freeform-mode tests updated to patch the
renamed `judge_goal_freeform` and skip Phase A explicitly.
* fix(goals): off-by-one in judge index, message-list plumbing, prompt tuning
Three live-test findings from running /goal end-to-end against
gemini-3-flash-preview as the judge:
1. Off-by-one bug — the judge sees the checklist rendered with 1-based
indices ('1. [ ] foo, 2. [ ] bar') but the apply layer indexed
state.checklist as 0-based. Result: every judge update landed on
the wrong item, evidence got attached to neighbouring rows, and
the genuine 'first pending' item (usually #1) never got marked.
Fix: convert 1 → 0 in _parse_evaluate_response. Also tightened the
user prompt to call out the 1-based scheme explicitly. New tests
cover the parser conversion + an end-to-end fake-judge round-trip.
2. Conversation dump never happened — _extract_agent_messages tried
common AIAgent attribute names (.messages, .conversation_history,
etc.) but AIAgent doesn't expose the message list as an instance
attribute; it lives inside run_conversation()'s scope. Result: the
judge's read_file tool always saw history_path=unavailable. Fix:
added an explicit messages= kwarg to evaluate_after_turn that all
three call sites (CLI, gateway, TUI gateway) now pass directly.
Agent-attribute extraction kept as back-compat fallback.
3. Prompt was too harsh on simple goals. The original 'be HARSH,
default to leaving items pending' wording made the judge refuse
to mark 'file exists' completed even after the agent ran ls,
test -f, os.path.isfile, and find — burning the entire 8-turn
budget on a fizzbuzz task. Softened to 'strict but not absurd'
with explicit guidance on what counts as evidence and a directive
not to require re-proving items already established earlier.
Re-tested live with the same fizzbuzz goal: now terminates in 2
turns with all 8 checklist items correctly attributed to their
own evidence. /subgoal user-action flow (add / complete / undo /
impossible) verified live as well.
|
|||
| c7f0aab949 |
feat(openrouter): wire Pareto Code router with min_coding_score knob (#22838)
Pick openrouter/pareto-code as your model and OpenRouter auto-routes each
request to the cheapest model meeting your coding-quality bar (ranked by
Artificial Analysis). The new openrouter.min_coding_score config key (0.0-1.0,
default 0.65) tunes the floor.
- hermes_cli/models.py: add openrouter/pareto-code to OPENROUTER_MODELS so
it shows up in the picker with a description
- hermes_cli/config.py: add openrouter.min_coding_score (default 0.65 — lands
on a mid-tier coder on the current Pareto frontier)
- plugins/model-providers/openrouter: emit extra_body.plugins =
[{id: pareto-router, min_coding_score: X}] when model is openrouter/pareto-code
AND the score is a valid float in [0.0, 1.0]
- agent/transports/chat_completions.py: same emission on the legacy flag
path (when no provider profile is loaded)
- run_agent.py: openrouter_min_coding_score kwarg + storage; plumbed into
both build_kwargs() invocations and the context-summary extra_body path
- cli.py: read openrouter.min_coding_score once at init, validate float in
[0,1], pass to AIAgent constructions (CLI + background-task paths)
- cron/scheduler.py, batch_runner.py, tools/delegate_tool.py,
tui_gateway/server.py: propagate the kwarg (mirrors providers_order
plumbing — subagents inherit, cron/batch read from config)
- tests: profile-level + transport-level coverage of the model gating,
unset/empty/out-of-range handling, and the legacy flag path
- docs: new 'OpenRouter Pareto Code Router' section in providers.md
Verified end-to-end against api.openrouter.ai: at score=0.65 we land on a
mid-tier coder, at omission we get the strongest. Score is silently dropped
on any model other than openrouter/pareto-code, so it's safe to leave set.
|
|||
| cbce5e93fc |
codebase: add encoding='utf-8' to all bare open() calls (PLW1514)
Closes the last Python-on-Windows UTF-8 exposure by making every
text-mode open() call explicit about its encoding.
Before: on Windows, bare open(path, 'r') defaults to the system
locale encoding (cp1252 on US-locale installs). That means reading
any config/yaml/markdown/json file with non-ASCII content either
crashes with UnicodeDecodeError or silently mis-decodes bytes.
After: all 89 affected call sites in production code now pass
encoding='utf-8' explicitly. Works identically on every platform
and every locale, no surprise behavior.
Mechanical sweep via:
ruff check --preview --extend-select PLW1514 --unsafe-fixes --fix --exclude 'tests,venv,.venv,node_modules,website,optional-skills, skills,tinker-atropos,plugins' .
All 89 fixes have the same shape: open(x) or open(x, mode) became
open(x, encoding='utf-8') or open(x, mode, encoding='utf-8'). Nothing
else changed. Every modified file still parses and the Windows/sandbox
test suite is still green (85 passed, 14 skipped, 0 failed across
tests/tools/test_code_execution_windows_env.py +
tests/tools/test_code_execution_modes.py + tests/tools/test_env_passthrough.py +
tests/test_hermes_bootstrap.py).
Scope notes:
- tests/ excluded: test fixtures can use locale encoding intentionally
(exercising edge cases). If we want to tighten tests later that's
a separate PR.
- plugins/ excluded: plugin-specific conventions may differ; plugin
authors own their code.
- optional-skills/ and skills/ excluded: skill scripts are user-authored
and we don't want to mass-edit them.
- website/ and tinker-atropos/ excluded: vendored / generated content.
46 files touched, 89 +/- lines (symmetric replacement). No behavior
change on POSIX or on Windows when the file is ASCII; bug fix on
Windows when the file contains non-ASCII.
|
|||
| 7f92e5506e |
Merge pull request #20942 from NousResearch/austin/fix/personality
fix(tui): preserve session when switching personality |
|||
| d87c7b99e2 |
fix(analytics): prevent silent token loss and add Claude 4.5–4.7 pricing (#21455)
- Add pricing entries for Claude Opus 4.5/4.6/4.7, Sonnet 4.5/4.6, and Haiku 4.5 with updated source URLs (platform.claude.com) - Add _normalize_anthropic_model_name() to handle dot-notation variants (e.g. claude-opus-4.7 → claude-opus-4-7) for pricing lookups - Fix silent token loss: ensure session row exists before UPDATE in both run_agent.py and hermes_state.py (INSERT OR IGNORE is idempotent) - Log token persistence failures at DEBUG level instead of swallowing them silently — makes undercounted analytics diagnosable - Surface reasoning tokens in CLI /usage and TUI usage panel - Add 'reasoning' and 'cost_status' fields to TUI Usage type |
|||
| ec9d0e26d4 | fix(tui): render structured content on resume | |||
| 6e46f99e7e |
fix(tui): surface backend error as visible text when final_response is empty (#21245)
When the provider rejects a request (e.g. invalid model slug like
'--provider nous --model kimi-k2.6' where the valid slug is
'moonshotai/kimi-k2.6'), run_conversation() returns
{failed: True, error: <detail>, final_response: None}. The TUI gateway
and one-shot CLI mode both dropped the error on the floor and emitted
an empty turn, so the user saw a blank response with no indication
that anything went wrong.
Mirror the interactive CLI's existing pattern (cli.py:9832): when
final_response is empty AND (failed|partial) is set AND error is
populated, surface 'Error: <detail>' as the visible text. Leaves
the None-with-no-error path and the '(empty)' sentinel path
untouched — an empty successful turn still renders empty, and
existing sentinel handlers keep owning their lane.
Reported by @counterposition in PR #20873; taking a minimal fix
rather than the broader structured-failure refactor proposed there.
|
|||
| 65c762b2e8 |
fix(tui): preserve session when switching personality
Previously, /personality in the TUI called _reset_session_agent() which destroyed the agent, cleared conversation history, and effectively started a new session. This made personality switching disruptive — users lost their entire conversation context. Now /personality updates the agent's ephemeral_system_prompt in-place and injects a pivot marker into the conversation history. The marker tells the model to adopt the new persona from that point forward, which is necessary because LLMs tend to pattern-match their prior responses and continue the established tone without an explicit signal. Changes: - tui_gateway/server.py: Rewrite _apply_personality_to_session to update the agent in-place instead of resetting. Inject a user-role pivot marker so the model actually switches style mid-conversation. - ui-tui/src/app/slash/commands/session.ts: Update help text (no longer mentions history reset). - tests/test_tui_gateway_server.py: Update test to verify history is preserved, pivot marker is injected, and ephemeral prompt is set. |
|||
| 04cf4788cc |
fix(tui): restore voice push-to-talk parity (#20897)
* fix(tui): restore classic CLI voice push-to-talk parity (cherry picked from commit 93b9ae301bb89f5b5e01b4b9f8ac91ffa74fbd9d) * fix(tui): harden voice push-to-talk stop flow Address review feedback from PR #16189 by stopping the active recorder before background transcription, documenting single-shot voice capture, and covering the TUI gateway flags with regression tests. * fix(tui): preserve silent voice strike tracking Keep single-shot voice recording's no-speech counter alive across starts so the TUI can still emit the three-strikes auto-disable event, and bind the auto-restart state at module scope for type checking. * fix(tui): clean up voice stop failure path Address follow-up review by naming the TUI flow as single-shot push-to-talk and cancelling the recorder when forced stop cannot produce a WAV. * fix(tui): report busy voice capture starts Return explicit start state from the voice wrapper so the TUI gateway does not report recording while forced-stop transcription is still cleaning up. * fix(tui): handle busy voice record responses Apply the gateway busy status immediately in the TUI and route forced-stop voice events to the session that sent the stop request. * fix(tui): clear voice recording on null response Treat a null voice.record RPC result as a failed optimistic start so the REC badge cannot stick after gateway-side errors. * fix(tui): count silent manual voice stops Preserve single-shot voice no-speech strikes through forced stop transcription so empty push-to-talk captures still trigger the three-strikes guard. --------- Co-authored-by: Montbra <montbra@gmail.com> |