Commit Graph

10036 Commits

Author SHA1 Message Date
8ae0802d59 fix(skills): make _rmtree_writable handle read-only directories, not just files
The cherry-picked fix's onerror handler chmod'd only the failing path, but
unlinking a child requires write permission on its PARENT directory. On a true
Nix-store copy (r-xr-xr-x dirs + files) rmtree still failed. Now chmod the
parent dir as well before retrying.

Also rewrites the regression test: the original asserted the helper FAILS on a
read-only dir (documenting the limitation), which is the wrong success criterion.
Split into two tests — restore succeeds on a full read-only tree (real Nix case),
and manifest is preserved when removal genuinely cannot proceed (monkeypatched).
2026-05-30 02:05:10 -07:00
83a7d0b601 fix(skills): fix transaction ordering in reset_bundled_skill and handle read-only files in rmtree
Two related bugs in tools/skills_sync.py affecting Nix-store and
immutable-package installs:

**#34972 — reset_bundled_skill corrupts manifest on rmtree failure:**
The function deleted the manifest entry BEFORE attempting rmtree. If
rmtree failed (read-only files from Nix store), the function returned
early — leaving the skill in a manifest-less limbo state where future
syncs silently skip it forever.

Fix: reorder steps — attempt rmtree FIRST, only delete manifest entry
after rmtree succeeds. If rmtree fails, nothing is changed.

**#34860 — stale .bak directories after sync:**
sync_skills() called shutil.rmtree(backup, ignore_errors=True) which
silently failed on read-only files, leaving persistent .bak dirs.

Fix: add _rmtree_writable() helper that makes files writable via an
onerror callback before retrying removal. Used in both sync_skills()
backup cleanup and reset_bundled_skill().

Fixes #34972
Fixes #34860
2026-05-30 02:05:10 -07:00
a57cc00081 fix(packaging): include mcp_serve in py-modules so hermes mcp serve works on pip installs
mcp_serve.py was missing from the setuptools py-modules list, causing
hermes mcp serve to crash with ModuleNotFoundError on standard pip installs.

Fixes #34871
2026-05-30 01:45:30 -07:00
93e6a05efc feat(model-picker): group multi-endpoint providers under one row (#35227)
* Inspired by Claude Code: /compress here [N] — boundary-aware 'summarize up to here'

Adds a user-chosen compression boundary to the existing /compress command.
/compress here [N] summarizes everything except the most recent N exchanges
(default 2), which are preserved verbatim — letting the user pick the
compression boundary instead of relying on the automatic token-budget heuristic.

Inspired by Claude Code's Rewind 'Summarize up to here' action (v2.1.139,
Week 20, May 2026): https://code.claude.com/docs/en/whats-new/2026-w20

- hermes_cli/partial_compress.py: pure split/parse helpers + seam-alternation
  guard (shared by CLI and gateway).
- cli.py / gateway/run.py: route 'here [N]' / '--keep N' to partial compression;
  compress only the head, re-append the verbatim tail through the seam guard.
- Preserves message-flow role alternation (seam guard merges any illegal
  user->user / assistant->assistant adjacency).
- Reuses the existing _compress_context session-rotation/lock machinery — no
  changes to the compression core.
- Bare /compress (full) and /compress <focus> behavior unchanged.

Tests: 12 helper unit tests + 5 CLI integration tests + E2E (interleaved
tool-call transcript, degenerate/multimodal seams, real handler path).

* feat(model-picker): group multi-endpoint providers under one row

The interactive provider pickers (hermes model, setup wizard, Telegram
/model) listed every provider slug flat, so vendors with several endpoints
(Kimi/Moonshot, MiniMax, xAI Grok, Google Gemini, OpenAI, OpenCode, GitHub
Copilot) each occupied multiple top-level rows. Now related slugs fold into
one top-level row that drills down to the specific endpoint.

- models.py: add PROVIDER_GROUPS table + group_providers() fold (display
  only — CANONICAL_PROVIDERS, slugs, --provider, /model <provider:model>
  all unchanged and individually addressable).
- hermes model (main.py): group rows drill into a member sub-picker, then
  dispatch to the existing _model_flow_* unchanged. setup wizard inherits it.
- Telegram /model: new mpg:<group> callback expands to member mp:<slug>
  buttons; single authenticated member degrades to a direct button.
- Grouping is the single shared fold across all three surfaces.

Validation: 163 targeted tests pass; E2E confirms group->member->model
resolves to the correct concrete slug for all families.
2026-05-30 01:41:33 -07:00
14517ac1f5 fix(update): export launcher virtualenv to uv 2026-05-30 01:41:29 -07:00
8e5a6854c3 fix(kanban): align recompute_ready guard with breaker's configured failure_limit
Follow-up to the budget-exhaustion recovery fix. recompute_ready's
new circuit-breaker guard resolved its effective limit from per-task
max_retries -> DEFAULT_FAILURE_LIMIT, skipping the dispatcher's
configured kanban.failure_limit. _record_task_failure resolves
max_retries -> failure_limit(config) -> DEFAULT, so the two disagreed
whenever an operator set kanban.failure_limit != 2:

- config > 2: a task could get stuck at DEFAULT(2) before reaching its
  allowed retry count.
- config < 2: a task the breaker already blocked could be auto-recovered
  back to ready, defeating the stricter limit.

Thread the dispatcher's failure_limit through dispatch_once into
recompute_ready so the guard and the breaker share one resolution order.
Updated test_circuit_breaker_block_still_auto_promotes (it asserted a
failures=5 block auto-recovers and resets the counter — that's the
pre-#35072 behavior the loop fix removes); it now exercises a below-limit
transient block, with the at-limit case covered in test_kanban_db.py.
Added two tests for the config-tier and per-task override resolution.
2026-05-30 01:40:57 -07:00
6ab71d3bb4 fix(kanban): prevent infinite retry loop when worker exhausts iteration budget
recompute_ready() previously reset consecutive_failures to 0 when
auto-recovering a blocked task.  This defeated the circuit-breaker:
a task that repeatedly exhausted its iteration budget would cycle
forever (block → auto-recover with counter=0 → respawn → budget
exhausted → block → …) with no signal to the operator.

Fix: don't auto-recover tasks whose consecutive_failures has reached
the effective failure limit (per-task max_retries or
DEFAULT_FAILURE_LIMIT).  The counter is also preserved across
recovery so the breaker can accumulate across cycles.

Fixes #35072
2026-05-30 01:40:57 -07:00
c70dca3a88 fix(kanban): rebuild legacy TEXT-PK tables to INTEGER AUTOINCREMENT on open
Legacy kanban boards (pre-AUTOINCREMENT schema) crashed the gateway
notifier on every tick — int(None) on a NULL id in unseen_events_for_sub
— silently losing all kanban notifications. CREATE TABLE IF NOT EXISTS
skips existing tables regardless of schema and _add_column_if_missing
only adds columns, so neither could fix a drifted primary-key type.

_rebuild_drifted_tables() detects the legacy shape via PRAGMA table_info
and rebuilds task_events/task_comments/task_runs (TEXT PK -> INTEGER
AUTOINCREMENT) and kanban_notify_subs.last_event_id (TEXT/NULL -> INTEGER
NOT NULL DEFAULT 0), preserving data. The whole pass is one transaction
so an interruption can't leave a table half-renamed, and recreates every
index DROP TABLE would otherwise take down (including idx_events_run).

Co-authored-by: liuhao1024 <liuhao1024@users.noreply.github.com>
2026-05-30 01:40:49 -07:00
16882cfded refactor(tui): simplify base64 clipboard write to a stdin flag
The per-entry psScript callback was identical for every PowerShell entry,
so the function-valued union member added structure without behavior. Collapse
WriteCmd to a plain stdin boolean and apply the one shared base64 script in the
write loop. Document the CP936 root cause inline.

Co-authored-by: BROCCOLO1D <279959838+BROCCOLO1D@users.noreply.github.com>
2026-05-30 01:40:44 -07:00
64998fa93e fix(tui): use base64 encoding for PowerShell clipboard writes to preserve UTF-8
When writing text to the clipboard via PowerShell (WSL2 and native Windows),
the previous implementation piped text through stdin using `Set-Clipboard
-Value $input`. PowerShell reads stdin using the Windows system's default
ANSI code page (e.g. CP936 for Chinese Windows), causing all non-ASCII
characters (CJK, emoji, accented) to become garbled.

Fix: encode the text as base64 in Node.js and pass it as a command argument.
PowerShell decodes it from base64 using explicit UTF-8, bypassing the code
page issue entirely.

Fixes #35107
2026-05-30 01:40:44 -07:00
b4cf114f68 fix(vision): fail fast on non-retryable image download errors (#35221)
_download_image() wrapped every download attempt in a blanket
`except Exception` and retried 3x with 2s/4s/8s backoff regardless of
cause. A 404/403 image URL would never resolve on retry, so it just
burned up to 6s of wall-clock + extra GETs before failing — inflating
latency for a deterministic failure (issue #32296, umbrella #35114).

Add _is_retryable_download_error(): 4xx client errors (except 429),
website-policy PermissionError, and too-large/SSRF ValueError now raise
on the first attempt. 429, 5xx, and unclassified network errors stay
retryable. Removed the now-unreachable fall-through branch since the
loop always returns on success or re-raises on the final/terminal attempt.
2026-05-30 01:40:39 -07:00
e481b15333 Merge pull request #35216 from kshitijk4poor/fix/agents-nudge-single-delegate
fix: surface /agents nudge for single-delegate fan-out (TUI + CLI)
2026-05-30 00:57:15 -07:00
9d2571c86a fix: surface /agents nudge while delegate_task is in-flight (TUI + CLI)
The subagent spawn-observability overlay added a `(/agents)` hint, but
only on the standalone "Spawn tree" panel, gated behind `!inlineDelegateKey`
— it never showed for a single delegate_task call, and only appeared once
subagents had already registered. A nudge that arrives at the end (or only
after spawn) is useless for the actual goal: letting users open the live
monitor *while* delegation is running.

Surface it the moment delegation starts, on both surfaces:

TUI (ui-tui/src/components/thinking.tsx)
- Show `(/agents)` on any "Delegate Task" tool group as soon as it appears
  (in-flight, before any subagent registers), not gated on subagents
  already existing. Same `startsWith('Delegate Task')` predicate already
  used for delegateGroups.

CLI (agent/tool_executor.py)
- Append `· /agents to monitor` to the delegate spinner label, which is
  displayed for the full duration of the delegate_task call. The previous
  attempt put the hint on the completion line (get_cute_tool_message),
  which only renders after the call finishes — reverted.

TUI tsc clean (pre-existing execFileNoThrow type errors unrelated);
subagentTree 35/35; display.py reverted to upstream.
2026-05-30 13:22:45 +05:30
bb79bcde61 fix: detect pyproject.toml / __init__.py version drift in hermes doctor (#35142)
A git conflict resolution (reset --hard or merge) can revert
hermes_cli/__init__.py to a stale __version__ while pyproject.toml stays
current, so 'hermes --version' silently reports the wrong version. Nothing
cross-checked the two files.

Add a version-consistency check to the doctor 'Python Environment' section:
reads the [project] version from pyproject.toml and compares it to
hermes_cli.__version__. Reports OK when they match, fails with a re-sync
hint when they drift, and is a silent no-op for installed wheels where
pyproject.toml isn't present.

Closes #35070
2026-05-30 00:32:05 -07:00
e5765e61fa chore(release): map wei.chen.coder@gmail.com -> wenchengxucool 2026-05-30 00:30:55 -07:00
84ee80eb5d feat: set process title to 'hermes' in ps/top/htop
Adds _set_process_title() in hermes_cli/main.py, called first thing in
main(). Tries setproctitle (optional) for a full ps-args rewrite, then
falls back to ctypes prctl(PR_SET_NAME) on Linux / pthread_setname_np on
macOS. No-op on Windows and on any failure. No new dependency: the
setproctitle path is best-effort via ImportError guard.

Fixes #35108
2026-05-30 00:30:55 -07:00
17103a1f11 chore: add SeaXen to AUTHOR_MAP for salvaged PR #33278 2026-05-30 00:23:44 -07:00
e8076c1ebe fix(dashboard): allow chat websockets on insecure public bind
Allow non-loopback websocket peers when the dashboard is explicitly exposed with --host 0.0.0.0/:: and --insecure.

This fixes the failure mode where /chat rendered over LAN but /api/ws and /api/events were rejected with HTTP 403, leaving the embedded TUI chat disconnected.

Add regression coverage for the insecure public bind case in the dashboard websocket auth tests.
2026-05-30 00:23:44 -07:00
636ff636d7 fix(agent): strip schema-foreign keys from max-iterations summary request (#34436)
The max-iterations summary path (`handle_max_iterations`) hand-builds its
message list and calls `chat.completions.create()` directly, bypassing
`ChatCompletionsTransport.convert_messages()`. It only popped
("reasoning", "finish_reason", "_thinking_prefill"), so `tool_name` (SQLite
FTS bookkeeping), the `codex_*` reasoning carriers, and other internal
`_`-prefixed scaffolding leaked to the wire.

Strict OpenAI-compatible gateways (Fireworks-backed OpenCode Go, Mistral,
Moonshot/Kimi) reject these with HTTP 400 "Extra inputs are not permitted,
field: 'messages[N].tool_name'", so a long tool-using session that exhausts
the iteration budget fails to summarise instead of returning the result.

Mirror convert_messages() in this path: also drop tool_name,
codex_reasoning_items, codex_message_items, and every `_`-prefixed key.
Copy-on-write is already in place, so internal history keeps the fields for
FTS / Codex-fallback.

Adds a regression test to TestHandleMaxIterations asserting the summary
request carries none of the schema-foreign keys (fails on main, passes here).
2026-05-30 00:22:53 -07:00
c1b2d0917f fix(cli): don't treat any container as the Docker image for updates (#35139)
detect_install_method() returned "docker" for any container (is_container()),
before the .git check. Both supported installs already self-identify via the
.install_method stamp read first: the curl installer (scripts/install.sh)
git-clones and stamps "git"; the published nousresearch/hermes-agent image
stamps "docker" at boot via docker/stage2-hook.sh. An unsupported manual
install dropped into a container has no stamp, so the bare container check
hijacked it to "docker" and 'hermes update' bailed with the docker-pull
guidance.

Drop the redundant is_container() -> docker fallback. Unstamped installs now
fall through to the .git/pip checks like any off-path install; both supported
paths are unaffected because the stamp wins first.

Fixes #34397.
2026-05-30 00:22:46 -07:00
8738cb92c3 Merge pull request #34704 from kshitijk4poor/feat/tui-agents-nudge
feat(tui): nudge toward /agents dashboard when delegation starts
2026-05-30 00:01:59 -07:00
5a72e82fd8 feat(tui): nudge toward /agents dashboard when delegation starts
The TUI already ships a rich /agents spawn-tree dashboard (live tree,
timeline, per-child tokens/cost/files/tools, kill/pause), but nothing
surfaced it — during delegation the transcript stayed quiet and users
had to already know to type /agents.

Drop a one-time transient activity hint ("subagents working · /agents
to watch live") the first time a turn starts delegating, matching the
existing "· /logs to inspect" house style. Guards keep it unobtrusive:

- fires at most once per turn (resets on message.start)
- silent when the /agents overlay is already open
- gated by display.tui_agents_nudge (default true)

Hooked on subagent.start, not subagent.spawn_requested: the delegate
progress callback in tools/delegate_tool.py only relays start/complete
to the gateway and drops spawn_requested, so start is the first
delegation event the TUI reliably receives. spawn_requested is wired
too for the future case, guarded once-per-turn.

Adds the display.tui_agents_nudge config default and gatewayTypes entry.
2026-05-30 12:26:36 +05:30
7b0915037c test: remove low-value model-catalog mirror tests
These tests asserted that hardcoded curated model lists/constants still
contained specific model strings (e.g. 'glm-5' in provider_model_ids('zai'),
exact context-length values per model key, PROVIDER_TO_MODELS_DEV entries).
They mirror a constant rather than exercise logic, so they only ever break
when models are added/retired and never catch a real bug.

Removed 22 such functions across 7 files (149 deletions, 0 additions).
Behavioral siblings are kept: live-catalog-wins, fallback ordering,
substring/longest-match resolution, normalization, credential discovery,
and probe-tier stepping all still tested.
2026-05-29 23:45:05 -07:00
0437137fff security: pin patched Starlette (>=1.0.1) for CVE-2026-48710 BadHost (#35118)
Starlette < 1.0.1 is affected by CVE-2026-48710 ("BadHost", CWE-444).
The HTTP Host header was not validated before being used to rebuild
`request.url`, so a malformed Host could make `request.url.path` desync
from the raw ASGI path the router actually dispatched. Middleware and
endpoints that apply path-based authorization off `request.url` (rather
than `scope["path"]`) can therefore be bypassed.

Hermes pulls Starlette transitively, never directly:
  - [web]          -> fastapi==0.133.1  (starlette>=0.40.0, no upper bound)
  - [mcp]          -> mcp==1.26.0 + sse-starlette (starlette>=0.27 / >=0.49.1)
  - [computer-use] -> mcp==1.26.0
  - [dev]          -> mcp==1.26.0

A fresh resolve landed starlette 0.52.1 — vulnerable. With no upper
bound on the transitive specs, pip/uv could resolve any pre-1.0.1
release on a fresh install.

Fix: pin starlette==1.0.1 directly in every extra that exposes a
Starlette-backed server surface, regenerate uv.lock (only starlette
moves: 0.52.1 -> 1.0.1, hash-verified), and mirror the pin in the
lazy-install map (tools/lazy_deps.py `tool.dashboard`) so `hermes`
on-demand dashboard installs can't re-resolve a vulnerable version.

1.0.1 is the advisory's named fix floor and the oldest patched release
(more bake time than 1.1.0/1.2.0, which are days old); it satisfies
every carrier constraint and our requires-python>=3.11.

Scope note: this is a dependency-level fix complementing the
application-layer Host-header validator added in #34162
(`hermes_cli/web_server.py` `_is_accepted_host`). Defense in depth at
both the framework and app layers.

Guards: two invariant tests in tests/test_packaging_metadata.py assert
every server-surface extra pins starlette and that pyproject + uv.lock
both resolve >= the 1.0.1 CVE floor — a dropped pin or stale lock fails
in CI instead of shipping the bypass.

Closes #35067
2026-05-29 23:23:54 -07:00
827ce602db fix(honcho): harden self-hosted setup paths
Self-hosted Honcho setup had four sharp edges:

- local/cloud URLs ending in /vN double-prefixed by the SDK (/v3/v3/... 404)
- authenticated local servers had no setup prompt for a JWT/bearer token
- profile-derived host keys could be dot-containing workspace IDs Honcho rejects
- memory-provider config files with API keys written world-readable per umask

This keeps existing behavior but makes those paths safer:

- strip a trailing /vN version segment from any configured baseUrl before SDK
  init (the SDK's route builders always prepend their own version prefix);
  auth-skipping stays loopback-only
- add an optional local JWT/bearer prompt in honcho setup, stored under
  hosts.<host>.apiKey
- derive new profile host keys with underscores, still reading legacy
  hermes.<profile> blocks
- write memory-provider config files atomically with 0600 via a shared
  utils.atomic_json_write(mode=) arg (honcho/hindsight/mem0/supermemory)
- skip honcho.json parsing in gateway cache-busting unless Honcho is the active
  memory provider; memoize by honcho.json mtime when active
- bust the gateway agent cache on memory.provider change
- add a hermes memory setup <provider> one-liner so fresh installs can configure
  a named provider without the picker (the per-provider hermes <provider>
  subcommand only registers once that provider is active)

Closes #20688, #29885, #26459, #30246, #33382, #32244.

Co-authored-by: BROCCOLO1D
2026-05-29 22:29:48 -07:00
aa32edcac5 fix(setup): write config for image_gen and video_gen in apply_nous_managed_defaults (#35109)
apply_nous_managed_defaults() was adding image_gen and video_gen to the
'changed' return set without writing any config values.  The caller
(tools_command first_install flow) uses 'changed' to skip manual
configuration, so these tools ended up in platform_toolsets but with no
video_gen.provider, video_gen.use_gateway, or image_gen.use_gateway in
config.yaml.

At runtime the FAL plugin's is_available() returned False because there
was no FAL_KEY and no use_gateway config — the tool never loaded despite
being 'enabled' in the toolset list.

For image_gen this was a latent bug masked by the gateway offer prompt
(prompt_enable_tool_gateway) running earlier in the setup flow and
writing image_gen.use_gateway=True via apply_gateway_defaults().  But if
the user skipped the gateway offer, image_gen would silently break the
same way.

For video_gen (added in PR #33259) the bug was always hit because the
gateway offer ran before the user checked video_gen in the toolset
checklist.

Fix: write provider/use_gateway config values before adding to 'changed',
matching the pattern used by web, tts, and browser.
2026-05-30 03:45:12 +00:00
a7421dc7d2 fix(session): point no-FTS5 warning at the supported install
When FTS5 is missing the warning now explains the likely cause (an
unsupported / pip-managed Python whose bundled SQLite lacks FTS5) and
links the supported install at hermes-agent.nousresearch.com, instead
of just logging the raw error.
2026-05-29 20:11:07 -07:00
4fa20f9a8b fix(install): ensure the uv-managed Python ships SQLite FTS5
uv's python-build-standalone distributions only gained FTS5 in mid-2025
(#694). A stale interpreter already in uv's store — which `uv python find`
reuses without checking — can lack it, leaving the supported install with
a SQLite that can't create the FTS5 virtual tables hermes_state.py needs
for full-text session search ("no such module: fts5").

check_python now probes the resolved interpreter for FTS5 and, if missing,
reinstalls the latest patch for $PYTHON_VERSION (which has FTS5) and
re-resolves. If an FTS5-capable Python still can't be obtained (offline,
pinned env), it warns and continues — Hermes degrades gracefully and only
disables session search. No bundled second SQLite, no user action.
2026-05-29 20:11:07 -07:00
97ecfa0fc4 fix(session): extend no-FTS5 degradation to the trigram CJK index
The salvaged contributor commit guarded only messages_fts. Current main
also creates a second virtual table, messages_fts_trigram (CJK substring
search), whose CREATE VIRTUAL TABLE ... USING fts5 still raised
"no such module: fts5" on builds without FTS5 — re-crashing SessionDB
init. Wrap the trigram setup with the same guard, and broaden the test's
no-fts5 mock to fail BOTH tables so the regression test actually
exercises a faithful no-FTS5 build.
2026-05-29 20:11:07 -07:00
5ad2b4c6da fix(session): degrade gracefully when SQLite lacks FTS5 2026-05-29 20:11:07 -07:00
860cf28dab docs: clarify compression threshold is derived from the main model's context window (#35099)
The compression threshold is threshold × context_length where context_length
is the MAIN agent model's window, not the auxiliary/summary model's. On a
262,144-token model at the default 0.50 the threshold is 131,072 — close to a
common 128K figure by coincidence of the percentage, which has led to confusion
that the auxiliary model's context limit is the trigger. Add a note preempting
that misreading and pointing to the separate summary-model-context constraint.
2026-05-29 19:59:04 -07:00
fb0ab27649 fix(agent): register explainer config key + shorten footer prefix
Follow-up to the salvaged #34452 turn-completion explainer:
- Register display.turn_completion_explainer: True in DEFAULT_CONFIG so the
  setting is discoverable, matching the file_mutation_verifier precedent.
- Shorten the repeated footer prefix from 'Turn ended without a usable
  reply: ' to 'No reply: ' so the 10 reason variants don't all open with
  the same 8-word boilerplate.
- Update the 7 assertions that referenced the old prefix.
2026-05-29 19:23:05 -07:00
de6d6023d7 test(run_agent): align test_dict_tool_call_args with explainer suffix
PR #34470 adds an explainer suffix to abnormal turn endings (e.g.
max_iterations_reached) so users see why the response is short instead
of receiving a bare/blank reply. test_tool_call_validation_accepts_dict_arguments
runs the agent at max_iterations=3 which hits the explainer path; the
existing strict-equality assertion (== "done") no longer matches once
the suffix is appended.

Switch the assertion to .startswith("done") so the test continues to
verify that the models actual text survives intact while leaving the
explainer suffix wording owned by conversation_loop (where it belongs).

Test now passes (1 passed in 0.88s).
2026-05-29 19:23:05 -07:00
59b0ea98c8 fix(agent): explain abnormal turn endings instead of blank/partial reply
When a turn ends abnormally after substantive tool calls (empty content
after retries, a partial/truncated stream, exhausted retries, or an
iteration/budget limit), the CLI/TUI response area was left blank or
showed only a fragment (e.g. "The") with no consolidated reason. The
internal turn_exit_reason values (empty_response_exhausted,
partial_stream_recovery, etc.) were never surfaced to the user.

Add a turn-completion explainer that mirrors the existing file-mutation
verifier footer: at turn end, map an abnormal turn_exit_reason to a
short, actionable message and either replace the bare "(empty)"
sentinel or append the reason after a partial fragment. Normal
text_response exits (e.g. a terse "Done.") stay quiet.

Gated by display.turn_completion_explainer (default on) with
HERMES_TURN_COMPLETION_EXPLAINER env override, matching the
file-mutation verifier seam.

Closes #34452
2026-05-29 19:23:05 -07:00
897f9533ed fix: keep CLI context display in sync with preflight token estimate (#35079)
* Inspired by Claude Code: /compress here [N] — boundary-aware 'summarize up to here'

Adds a user-chosen compression boundary to the existing /compress command.
/compress here [N] summarizes everything except the most recent N exchanges
(default 2), which are preserved verbatim — letting the user pick the
compression boundary instead of relying on the automatic token-budget heuristic.

Inspired by Claude Code's Rewind 'Summarize up to here' action (v2.1.139,
Week 20, May 2026): https://code.claude.com/docs/en/whats-new/2026-w20

- hermes_cli/partial_compress.py: pure split/parse helpers + seam-alternation
  guard (shared by CLI and gateway).
- cli.py / gateway/run.py: route 'here [N]' / '--keep N' to partial compression;
  compress only the head, re-append the verbatim tail through the seam guard.
- Preserves message-flow role alternation (seam guard merges any illegal
  user->user / assistant->assistant adjacency).
- Reuses the existing _compress_context session-rotation/lock machinery — no
  changes to the compression core.
- Bare /compress (full) and /compress <focus> behavior unchanged.

Tests: 12 helper unit tests + 5 CLI integration tests + E2E (interleaved
tool-call transcript, degenerate/multimodal seams, real handler path).

* fix: keep CLI context display in sync with preflight token estimate

The status bar reads compressor.last_prompt_tokens, which only updates
from a successful API response. When loaded history is oversized but
compression no-ops (e.g. the auxiliary summary model times out), no fresh
usage arrives and the bar stays frozen at the old, smaller value while the
preflight estimate reports a much larger number — looking permanently out
of sync (reported: 74.4K display vs ~144,669 preflight).

Seed last_prompt_tokens with the fresh preflight estimate (upward-only, so
a real usage figure is never clobbered and a successful compression's
downward correction still wins). Display-only; no behavioral change to
compression, caching, or the agent loop.
2026-05-29 19:21:15 -07:00
9d4c81130a fix(gateway): name what the /status token number actually is
Sharpen the label from 'Session usage (cumulative)' to 'Cumulative API
tokens (re-sent each call)'. The number is real provider-reported usage
summed across every API call in the session — not context size. In an
agentic loop the same context is re-sent each iteration, so a one-hour
tool-heavy session legitimately reaches tens of millions of tokens. The
new label explains the magnitude so users stop reading it as a bug or as
a total across all sessions.
2026-05-29 19:14:37 -07:00
2259c15e4d fix(gateway): clarify status session usage label 2026-05-29 19:14:37 -07:00
45bc65abbe fix(gateway): drop outbound silence-narration messages pre-send
Hallucinated 'silence' tokens (*(silent)*, _silent_, the bare '.', '...',
'silent', no response/reply, the mute emoji) are emitted when a persona has
nothing actionable to say. In bot-to-bot channels the receiving bot mirrors
the token back, creating a tight loop that burns API tokens and can crash a
model with 'no content after all retries'. SOUL.md/prompt rules drift across
providers and have already failed in practice, so add a substrate-level guard.

_deliver_to_platform now drops a message whose finalized content is only a
silence-narration token, logs a WARNING with platform/chat_id/truncated
content, and returns {success: True, filtered: 'silence_narration',
delivered: False} instead of calling the adapter. Single chokepoint covers
every platform adapter; the regex is anchored start/end with a 64-char guard
so prose like 'Silence is golden — here is the plan...' or 'Silent install
completed' is never dropped. Local/file delivery is a separate path and is
left untouched. Opt out via gateway.filter_silence_narration: false or the
HERMES_FILTER_SILENCE_NARRATION env override (env wins when set).

Closes #34616
2026-05-29 19:06:05 -07:00
9dbc3722ae test(compression): fix StopIteration in large-rough-growth preflight test
The rough-estimate mock supplied only 2 side_effect values but the
conversation loop calls estimate_request_tokens_rough a third time for
the post-response real-token estimate, exhausting the iterator. Use a
callable side_effect that returns 125k once (to fire preflight) then
sub-threshold values, independent of call count.
2026-05-29 19:05:03 -07:00
e38b0b55d1 fix(compression): avoid repeat preflight compaction from rough estimates 2026-05-29 19:05:03 -07:00
04de307d62 fix(cli): repaint input area after inline /steer and /model submit (#34839)
handle_enter dispatches /steer and /model inline on the UI thread while
the agent is running, calling buffer.reset() then returning. Unlike every
other early-return branch in the handler, these two skipped
event.app.invalidate(). process_command() prints through patch_stdout
(scrolls output above the prompt without redrawing the input line), so the
just-cleared input area could keep showing the submitted '/steer <text>'
until an unrelated redraw fired — looking unsent and inviting an accidental
re-submit.

Add event.app.invalidate() after reset in both inline branches to match
the sibling branches. AST regression test pins the invariant: every
reset-then-return branch in handle_enter must invalidate first.

Fixes #34569
2026-05-29 19:04:40 -07:00
bcc8301000 Inspired by Claude Code: /compress here [N] — boundary-aware 'summarize up to here' (#35048)
Adds a user-chosen compression boundary to the existing /compress command.
/compress here [N] summarizes everything except the most recent N exchanges
(default 2), which are preserved verbatim — letting the user pick the
compression boundary instead of relying on the automatic token-budget heuristic.

Inspired by Claude Code's Rewind 'Summarize up to here' action (v2.1.139,
Week 20, May 2026): https://code.claude.com/docs/en/whats-new/2026-w20

- hermes_cli/partial_compress.py: pure split/parse helpers + seam-alternation
  guard (shared by CLI and gateway).
- cli.py / gateway/run.py: route 'here [N]' / '--keep N' to partial compression;
  compress only the head, re-append the verbatim tail through the seam guard.
- Preserves message-flow role alternation (seam guard merges any illegal
  user->user / assistant->assistant adjacency).
- Reuses the existing _compress_context session-rotation/lock machinery — no
  changes to the compression core.
- Bare /compress (full) and /compress <focus> behavior unchanged.

Tests: 12 helper unit tests + 5 CLI integration tests + E2E (interleaved
tool-call transcript, degenerate/multimodal seams, real handler path).
2026-05-29 17:49:15 -07:00
54aa4db1de fix(cli): remove Hermes-managed node/npm/npx symlinks on uninstall
The POSIX installer drops node/npm/npx symlinks in ~/.local/bin pointing
into $HERMES_HOME/node and prepends ~/.local/bin to PATH, shadowing an
existing nvm. Uninstall removed the hermes wrapper but left these behind,
so the user's default node/npm/npx stayed redirected after uninstall.

Add remove_node_symlinks() and call it from run_uninstall. It removes
~/.local/bin/{node,npm,npx} only when each is a symlink resolving into the
current Hermes home's node dir, so a link the user repointed at nvm or a
real binary is never touched. Handles dangling links too.

Closes #34536
2026-05-29 17:24:38 -07:00
2062a84000 fix(auxiliary): stop capping output with max_tokens by default (#34530) (#34845)
* fix(auxiliary): stop capping output with max_tokens by default

Auxiliary LLM calls (compression, titles, vision, etc.) no longer send
max_tokens on the OpenAI-compatible chat-completions path. Most providers
treat an omitted max_tokens as "use the model max", which is what we want;
an explicit cap only risks truncation or a wire-format 400.

This was surfaced by GitHub Copilot / GPT-5 (#34530): those models reject
max_tokens and require max_completion_tokens, so compression 400'd and fell
back to a static context marker. Omitting the param sidesteps that quirk
(and ZAI vision's error 1210) entirely.

The Anthropic Messages wire (MiniMax + /anthropic endpoints) keeps
max_tokens because it is a mandatory field there.

* test(auxiliary): update temperature-retry assertions for omitted max_tokens

The temperature-retry tests asserted retry_kwargs["max_tokens"] == 500 on an
api.openai.com endpoint. Now that auxiliary calls omit max_tokens on
OpenAI-compatible endpoints (#34530), that key is absent. Assert it's absent
in both first and retry kwargs and use model as the survives-the-retry witness.
2026-05-29 17:24:30 -07:00
f9daa4a41d fix(deps): declare setuptools in dev extra for packaging tests (#34851)
* fix(deps): declare setuptools in dev extra for packaging tests

tests/test_packaging_metadata.py imports `from setuptools import
find_packages` at module scope to validate package discovery against
the live tree. setuptools was being picked up ambiently from the CI
runner image, but recent ubuntu-latest images no longer ship it in the
test venv, so collection fails with ModuleNotFoundError on every PR.

Declare setuptools==82.0.1 in the dev optional-dependencies so `.[all,dev]`
installs it explicitly rather than relying on the runner environment.

* test(packaging): skip packaging-metadata tests when setuptools absent

Belt-and-suspenders alongside declaring setuptools in [dev]: guard the
module-level `from setuptools import find_packages` with
pytest.importorskip so a runner missing setuptools SKIPS these checks
instead of erroring out collection for the entire test shard.

* chore(deps): sync uv.lock for setuptools dev dependency
2026-05-29 17:24:23 -07:00
689ef5e233 feat(cli): warn on unsupported pip installs + fix stale update-check cache (#34491) (#34846)
* docs(code-execution): document HERMES_* env narrowing + passthrough workaround

The execute_code sandbox-child env scrub (108397726, #27303) deliberately
dropped the broad HERMES_ prefix passthrough, keeping only an operational
4-var allowlist (HERMES_HOME/PROFILE/CONFIG/ENV). A script that relied on a
non-secret HERMES_* var (HERMES_BASE_URL, HERMES_KANBAN_DB, HERMES_*_WEBHOOK,
or a plugin-defined one) now sees it unset in the child.

Document the behavior change and the two recovery routes (terminal.env_passthrough
in config.yaml, or required_environment_variables in skill frontmatter), plus
the debug log line that surfaces the drop for diagnosis.

* feat(cli): warn on unsupported pip installs + fix stale update-check cache after pip upgrade

Banner now shows a yellow warning when detect_install_method() == 'pip':
'pip install hermes-agent' isn't the supported install path (it exists on
PyPI for internal/CI reasons), so updates and issue support don't behave
correctly. Reuses existing install-method detection; warn, never block.

Also fixes #34491: check_for_updates() keyed its 6h cache only on ts+rev.
On the pip path (no HERMES_REVISION), rev is always None, so a
'pip install --upgrade' changed VERSION but left the cache valid — the
stale 'N commits behind' count survived the upgrade. Cache now also keys
on the installed VERSION and invalidates on mismatch.
2026-05-29 13:30:28 -07:00
bb50825716 chore(release): map annguyenNous to AUTHOR_MAP
Clears the check-attribution CI gate on PR #34468 — the contributor's
noreply email was unmapped.
2026-05-29 13:29:34 -07:00
9f5afc7636 fix(mcp): widen isinstance check to BaseException for CancelledError
asyncio.gather(return_exceptions=True) captures CancelledError as a
BaseException value. The previous isinstance(result, Exception) check
missed CancelledError, silently dropping it without logging.

Since Python 3.9, CancelledError is a BaseException subclass (not
Exception). This one-line change ensures all failure types from MCP
server connections are properly logged.

Fixes NousResearch/hermes-agent#34443
2026-05-29 13:29:34 -07:00
4fd8521e44 test(tui-gateway): isolate completion_queue in poller requeue test
test_notification_poller_requeues_when_busy drained and reused the
process-global process_registry.completion_queue, so a concurrent test
in the same xdist worker could put/get on the shared singleton mid-run
and empty the event the poller requeues — flaking 'assert not
completion_queue.empty()' under parallel CI load only.

Monkeypatch a fresh Queue onto the singleton for the test's duration so
nothing external can interleave. The poller reads completion_queue by
attribute at runtime, so the isolated queue is what it operates on.
monkeypatch restores the original on teardown. Verified immune: 50/50
passes under a background thread hammering the global queue.
2026-05-29 13:29:24 -07:00
edfdc77664 fix(cli): resume the selected chat when a bare number follows /resume
A bare `/resume` printed the recent-sessions list but armed no selection
state, so typing just `3` on the next line was sent to the agent as chat
instead of resuming session #3. `/resume 3` worked, but the natural
list-then-pick flow did not.

Arm a one-shot pending-resume prompt when bare `/resume` shows the list,
and consume the next bare numeric input as the selection (out-of-range is
reported, non-numeric/other commands disarm it). Resolves against the same
_list_recent_sessions(limit=10) list used everywhere else.

Closes #34584.
2026-05-29 13:29:24 -07:00