hermes-agent

Author	SHA1	Message	Date
teknium1	8ae0802d59	fix(skills): make _rmtree_writable handle read-only directories, not just files The cherry-picked fix's onerror handler chmod'd only the failing path, but unlinking a child requires write permission on its PARENT directory. On a true Nix-store copy (r-xr-xr-x dirs + files) rmtree still failed. Now chmod the parent dir as well before retrying. Also rewrites the regression test: the original asserted the helper FAILS on a read-only dir (documenting the limitation), which is the wrong success criterion. Split into two tests — restore succeeds on a full read-only tree (real Nix case), and manifest is preserved when removal genuinely cannot proceed (monkeypatched).	2026-05-30 02:05:10 -07:00
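The parent-directory point in the commit above can be sketched as follows. This is a minimal illustration, not the code in tools/skills_sync.py: the name `rmtree_writable` and the handler are assumptions, and only the standard-library `shutil.rmtree` error hooks are used (`onexc` on Python 3.12+, `onerror` before that).

```python
import os
import shutil
import stat
import sys


def rmtree_writable(path):
    """Remove a tree even when its files and directories are read-only.

    Hypothetical helper for illustration; not the project's implementation.
    """

    def _retry(func, failing_path, exc):
        # onerror passes an exc_info tuple; onexc passes the exception itself.
        error = exc[1] if isinstance(exc, tuple) else exc
        # Only removal calls are retried; other failures are re-raised.
        if func not in (os.unlink, os.remove, os.rmdir):
            raise error
        # Unlinking a child needs write permission on its PARENT directory,
        # so make both the parent and the failing path writable first.
        for target in (os.path.dirname(failing_path), failing_path):
            try:
                os.chmod(target, os.stat(target).st_mode | stat.S_IWUSR)
            except OSError:
                pass  # best effort; the retry below surfaces the real error
        func(failing_path)

    if sys.version_info >= (3, 12):
        shutil.rmtree(path, onexc=_retry)
    else:
        shutil.rmtree(path, onerror=_retry)
```

If the retry itself fails, the exception propagates to the caller, which is what lets a caller leave its manifest untouched when removal cannot proceed.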
annguyenNous	83a7d0b601	fix(skills): fix transaction ordering in reset_bundled_skill and handle read-only files in rmtree Two related bugs in tools/skills_sync.py affecting Nix-store and immutable-package installs: #34972 — reset_bundled_skill corrupts manifest on rmtree failure: The function deleted the manifest entry BEFORE attempting rmtree. If rmtree failed (read-only files from Nix store), the function returned early — leaving the skill in a manifest-less limbo state where future syncs silently skip it forever. Fix: reorder steps — attempt rmtree FIRST, only delete manifest entry after rmtree succeeds. If rmtree fails, nothing is changed. #34860 — stale .bak directories after sync: sync_skills() called shutil.rmtree(backup, ignore_errors=True) which silently failed on read-only files, leaving persistent .bak dirs. Fix: add _rmtree_writable() helper that makes files writable via an onerror callback before retrying removal. Used in both sync_skills() backup cleanup and reset_bundled_skill(). Fixes #34972 Fixes #34860	2026-05-30 02:05:10 -07:00
liuhao1024	a57cc00081	fix(packaging): include mcp_serve in py-modules so hermes mcp serve works on pip installs mcp_serve.py was missing from the setuptools py-modules list, causing hermes mcp serve to crash with ModuleNotFoundError on standard pip installs. Fixes #34871	2026-05-30 01:45:30 -07:00
Teknium	93e6a05efc	feat(model-picker): group multi-endpoint providers under one row (#35227 ) * Inspired by Claude Code: /compress here [N] — boundary-aware 'summarize up to here' Adds a user-chosen compression boundary to the existing /compress command. /compress here [N] summarizes everything except the most recent N exchanges (default 2), which are preserved verbatim — letting the user pick the compression boundary instead of relying on the automatic token-budget heuristic. Inspired by Claude Code's Rewind 'Summarize up to here' action (v2.1.139, Week 20, May 2026): https://code.claude.com/docs/en/whats-new/2026-w20 - hermes_cli/partial_compress.py: pure split/parse helpers + seam-alternation guard (shared by CLI and gateway). - cli.py / gateway/run.py: route 'here [N]' / '--keep N' to partial compression; compress only the head, re-append the verbatim tail through the seam guard. - Preserves message-flow role alternation (seam guard merges any illegal user->user / assistant->assistant adjacency). - Reuses the existing _compress_context session-rotation/lock machinery — no changes to the compression core. - Bare /compress (full) and /compress <focus> behavior unchanged. Tests: 12 helper unit tests + 5 CLI integration tests + E2E (interleaved tool-call transcript, degenerate/multimodal seams, real handler path). * feat(model-picker): group multi-endpoint providers under one row The interactive provider pickers (hermes model, setup wizard, Telegram /model) listed every provider slug flat, so vendors with several endpoints (Kimi/Moonshot, MiniMax, xAI Grok, Google Gemini, OpenAI, OpenCode, GitHub Copilot) each occupied multiple top-level rows. Now related slugs fold into one top-level row that drills down to the specific endpoint. - models.py: add PROVIDER_GROUPS table + group_providers() fold (display only — CANONICAL_PROVIDERS, slugs, --provider, /model <provider:model> all unchanged and individually addressable). 
- hermes model (main.py): group rows drill into a member sub-picker, then dispatch to the existing _model_flow_* unchanged. setup wizard inherits it. - Telegram /model: new mpg:<group> callback expands to member mp:<slug> buttons; single authenticated member degrades to a direct button. - Grouping is the single shared fold across all three surfaces. Validation: 163 targeted tests pass; E2E confirms group->member->model resolves to the correct concrete slug for all families.	2026-05-30 01:41:33 -07:00
LeonSGP43	14517ac1f5	fix(update): export launcher virtualenv to uv	2026-05-30 01:41:29 -07:00
teknium1	8e5a6854c3	fix(kanban): align recompute_ready guard with breaker's configured failure_limit Follow-up to the budget-exhaustion recovery fix. recompute_ready's new circuit-breaker guard resolved its effective limit from per-task max_retries -> DEFAULT_FAILURE_LIMIT, skipping the dispatcher's configured kanban.failure_limit. _record_task_failure resolves max_retries -> failure_limit(config) -> DEFAULT, so the two disagreed whenever an operator set kanban.failure_limit != 2: - config > 2: a task could get stuck at DEFAULT(2) before reaching its allowed retry count. - config < 2: a task the breaker already blocked could be auto-recovered back to ready, defeating the stricter limit. Thread the dispatcher's failure_limit through dispatch_once into recompute_ready so the guard and the breaker share one resolution order. Updated test_circuit_breaker_block_still_auto_promotes (it asserted a failures=5 block auto-recovers and resets the counter — that's the pre-#35072 behavior the loop fix removes); it now exercises a below-limit transient block, with the at-limit case covered in test_kanban_db.py. Added two tests for the config-tier and per-task override resolution.	2026-05-30 01:40:57 -07:00
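The shared resolution order described above (per-task max_retries, then the configured kanban.failure_limit, then the default) can be sketched as below. The function names and signatures are assumptions for illustration; only the order and the default of 2 come from the commit message.

```python
DEFAULT_FAILURE_LIMIT = 2  # the commit message refers to DEFAULT(2)


def effective_failure_limit(task_max_retries=None, config_failure_limit=None):
    """Resolve the limit: per-task override, then config, then default."""
    if task_max_retries is not None:
        return task_max_retries
    if config_failure_limit is not None:
        return config_failure_limit
    return DEFAULT_FAILURE_LIMIT


def may_auto_recover(consecutive_failures, task_max_retries=None,
                     config_failure_limit=None):
    """A blocked task is auto-recovered only while it is below the limit."""
    limit = effective_failure_limit(task_max_retries, config_failure_limit)
    return consecutive_failures < limit
```

With a configured limit of 5, a task with 2 failures is still recoverable; with no configuration it is not, which is the disagreement the commit removes by using one resolver on both sides.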
liuhao1024	6ab71d3bb4	fix(kanban): prevent infinite retry loop when worker exhausts iteration budget recompute_ready() previously reset consecutive_failures to 0 when auto-recovering a blocked task. This defeated the circuit-breaker: a task that repeatedly exhausted its iteration budget would cycle forever (block → auto-recover with counter=0 → respawn → budget exhausted → block → …) with no signal to the operator. Fix: don't auto-recover tasks whose consecutive_failures has reached the effective failure limit (per-task max_retries or DEFAULT_FAILURE_LIMIT). The counter is also preserved across recovery so the breaker can accumulate across cycles. Fixes #35072	2026-05-30 01:40:57 -07:00
teknium1	c70dca3a88	fix(kanban): rebuild legacy TEXT-PK tables to INTEGER AUTOINCREMENT on open Legacy kanban boards (pre-AUTOINCREMENT schema) crashed the gateway notifier on every tick — int(None) on a NULL id in unseen_events_for_sub — silently losing all kanban notifications. CREATE TABLE IF NOT EXISTS skips existing tables regardless of schema and _add_column_if_missing only adds columns, so neither could fix a drifted primary-key type. _rebuild_drifted_tables() detects the legacy shape via PRAGMA table_info and rebuilds task_events/task_comments/task_runs (TEXT PK -> INTEGER AUTOINCREMENT) and kanban_notify_subs.last_event_id (TEXT/NULL -> INTEGER NOT NULL DEFAULT 0), preserving data. The whole pass is one transaction so an interruption can't leave a table half-renamed, and recreates every index DROP TABLE would otherwise take down (including idx_events_run). Co-authored-by: liuhao1024 <liuhao1024@users.noreply.github.com>	2026-05-30 01:40:49 -07:00
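A rough sketch of the detect-and-rebuild idea above, using only the standard `sqlite3` module. The column set (`task_id`, `kind`) and the index name are hypothetical, the helper names are assumptions, and the connection is assumed to be opened with `isolation_level=None` so the explicit `BEGIN`/`COMMIT` controls the transaction.

```python
import sqlite3


def pk_is_integer(conn, table):
    """Return True if the table's primary-key column is declared INTEGER."""
    # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk).
    # The table name is interpolated, so it must be a trusted identifier.
    for _cid, _name, col_type, _notnull, _default, pk in conn.execute(
        f"PRAGMA table_info({table})"
    ):
        if pk:
            return col_type.upper() == "INTEGER"
    return False


def rebuild_task_events(conn):
    """Rebuild a legacy TEXT-PK task_events table inside one transaction."""
    if pk_is_integer(conn, "task_events"):
        return False  # already on the current schema
    conn.execute("BEGIN")
    try:
        conn.execute(
            "CREATE TABLE task_events_new ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, task_id TEXT, kind TEXT)"
        )
        # Copy the payload columns in insertion order; new ids are assigned.
        conn.execute(
            "INSERT INTO task_events_new (task_id, kind) "
            "SELECT task_id, kind FROM task_events ORDER BY rowid"
        )
        conn.execute("DROP TABLE task_events")  # also drops its indexes
        conn.execute("ALTER TABLE task_events_new RENAME TO task_events")
        conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_events_task "
            "ON task_events(task_id)"
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return True
```

Because the create, copy, drop, rename, and index steps share one transaction, a failure part-way through rolls back to the original table rather than leaving a half-renamed one.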
teknium1	16882cfded	refactor(tui): simplify base64 clipboard write to a stdin flag The per-entry psScript callback was identical for every PowerShell entry, so the function-valued union member added structure without behavior. Collapse WriteCmd to a plain stdin boolean and apply the one shared base64 script in the write loop. Document the CP936 root cause inline. Co-authored-by: BROCCOLO1D <279959838+BROCCOLO1D@users.noreply.github.com>	2026-05-30 01:40:44 -07:00
annguyenNous	64998fa93e	fix(tui): use base64 encoding for PowerShell clipboard writes to preserve UTF-8 When writing text to the clipboard via PowerShell (WSL2 and native Windows), the previous implementation piped text through stdin using `Set-Clipboard -Value $input`. PowerShell reads stdin using the Windows system's default ANSI code page (e.g. CP936 for Chinese Windows), causing all non-ASCII characters (CJK, emoji, accented) to become garbled. Fix: encode the text as base64 in Node.js and pass it as a command argument. PowerShell decodes it from base64 using explicit UTF-8, bypassing the code page issue entirely. Fixes #35107	2026-05-30 01:40:44 -07:00
Teknium	b4cf114f68	fix(vision): fail fast on non-retryable image download errors (#35221) _download_image() wrapped every download attempt in a blanket `except Exception` and retried 3x with 2s/4s/8s backoff regardless of cause. A 404/403 image URL would never resolve on retry, so it just burned up to 6s of wall-clock + extra GETs before failing, inflating latency for a deterministic failure (issue #32296, umbrella #35114). Add _is_retryable_download_error(): 4xx client errors (except 429), website-policy PermissionError, and too-large/SSRF ValueError now raise on the first attempt. 429, 5xx, and unclassified network errors stay retryable. Removed the now-unreachable fall-through branch since the loop always returns on success or re-raises on the final/terminal attempt.	2026-05-30 01:40:39 -07:00
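The classification rule above can be sketched like this. `HTTPStatusError` is a hypothetical stand-in for whatever status exception the real HTTP client raises, and the function body is an assumption that follows the rule as stated in the commit message, not the project's code.

```python
class HTTPStatusError(Exception):
    """Hypothetical stand-in for an HTTP client's status-code error."""

    def __init__(self, status_code):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def is_retryable_download_error(exc):
    """Return False for failures that a retry cannot fix."""
    # Policy blocks and too-large/SSRF rejections are deterministic.
    if isinstance(exc, (PermissionError, ValueError)):
        return False
    if isinstance(exc, HTTPStatusError):
        if exc.status_code == 429:
            return True  # rate limiting is worth waiting out
        if 400 <= exc.status_code < 500:
            return False  # other client errors will not change on retry
    # 5xx and unclassified network errors stay retryable.
    return True
```

A retry loop would call this in its `except` branch and re-raise immediately when it returns False, skipping the backoff sleep.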
kshitij	e481b15333	Merge pull request #35216 from kshitijk4poor/fix/agents-nudge-single-delegate fix: surface /agents nudge for single-delegate fan-out (TUI + CLI)	2026-05-30 00:57:15 -07:00
kshitijk4poor	9d2571c86a	fix: surface /agents nudge while delegate_task is in-flight (TUI + CLI) The subagent spawn-observability overlay added a `(/agents)` hint, but only on the standalone "Spawn tree" panel, gated behind `!inlineDelegateKey` — it never showed for a single delegate_task call, and only appeared once subagents had already registered. A nudge that arrives at the end (or only after spawn) is useless for the actual goal: letting users open the live monitor while delegation is running. Surface it the moment delegation starts, on both surfaces: TUI (ui-tui/src/components/thinking.tsx) - Show `(/agents)` on any "Delegate Task" tool group as soon as it appears (in-flight, before any subagent registers), not gated on subagents already existing. Same `startsWith('Delegate Task')` predicate already used for delegateGroups. CLI (agent/tool_executor.py) - Append `· /agents to monitor` to the delegate spinner label, which is displayed for the full duration of the delegate_task call. The previous attempt put the hint on the completion line (get_cute_tool_message), which only renders after the call finishes — reverted. TUI tsc clean (pre-existing execFileNoThrow type errors unrelated); subagentTree 35/35; display.py reverted to upstream.	2026-05-30 13:22:45 +05:30
Teknium	bb79bcde61	fix: detect pyproject.toml / __init__.py version drift in hermes doctor (#35142) A git conflict resolution (reset --hard or merge) can revert hermes_cli/__init__.py to a stale __version__ while pyproject.toml stays current, so 'hermes --version' silently reports the wrong version. Nothing cross-checked the two files. Add a version-consistency check to the doctor 'Python Environment' section: reads the [project] version from pyproject.toml and compares it to hermes_cli.__version__. Reports OK when they match, fails with a re-sync hint when they drift, and is a silent no-op for installed wheels where pyproject.toml isn't present. Closes #35070	2026-05-30 00:32:05 -07:00
teknium1	e5765e61fa	chore(release): map wei.chen.coder@gmail.com -> wenchengxucool	2026-05-30 00:30:55 -07:00
weichengxu	84ee80eb5d	feat: set process title to 'hermes' in ps/top/htop Adds _set_process_title() in hermes_cli/main.py, called first thing in main(). Tries setproctitle (optional) for a full ps-args rewrite, then falls back to ctypes prctl(PR_SET_NAME) on Linux / pthread_setname_np on macOS. No-op on Windows and on any failure. No new dependency: the setproctitle path is best-effort via ImportError guard. Fixes #35108	2026-05-30 00:30:55 -07:00
teknium1	17103a1f11	chore: add SeaXen to AUTHOR_MAP for salvaged PR #33278	2026-05-30 00:23:44 -07:00
SeaXen	e8076c1ebe	fix(dashboard): allow chat websockets on insecure public bind Allow non-loopback websocket peers when the dashboard is explicitly exposed with --host 0.0.0.0/:: and --insecure. This fixes the failure mode where /chat rendered over LAN but /api/ws and /api/events were rejected with HTTP 403, leaving the embedded TUI chat disconnected. Add regression coverage for the insecure public bind case in the dashboard websocket auth tests.	2026-05-30 00:23:44 -07:00
Max Hsu	636ff636d7	fix(agent): strip schema-foreign keys from max-iterations summary request (#34436) The max-iterations summary path (`handle_max_iterations`) hand-builds its message list and calls `chat.completions.create()` directly, bypassing `ChatCompletionsTransport.convert_messages()`. It only popped ("reasoning", "finish_reason", "_thinking_prefill"), so `tool_name` (SQLite FTS bookkeeping), the `codex_*` reasoning carriers, and other internal `_`-prefixed scaffolding leaked to the wire. Strict OpenAI-compatible gateways (Fireworks-backed OpenCode Go, Mistral, Moonshot/Kimi) reject these with HTTP 400 "Extra inputs are not permitted, field: 'messages[N].tool_name'", so a long tool-using session that exhausts the iteration budget fails to summarise instead of returning the result. Mirror convert_messages() in this path: also drop tool_name, codex_reasoning_items, codex_message_items, and every `_`-prefixed key. Copy-on-write is already in place, so internal history keeps the fields for FTS / Codex-fallback. Adds a regression test to TestHandleMaxIterations asserting the summary request carries none of the schema-foreign keys (fails on main, passes here).	2026-05-30 00:22:53 -07:00
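The key-stripping step above can be sketched as a copy-on-write filter. The function name and the constant are assumptions for illustration; the dropped key names are the ones listed in the commit message.

```python
# Keys named in the commit message as internal or schema-foreign.
DROPPED_KEYS = {
    "reasoning",
    "finish_reason",
    "tool_name",
    "codex_reasoning_items",
    "codex_message_items",
}


def strip_schema_foreign_keys(messages):
    """Return new message dicts without internal keys; inputs are untouched."""
    return [
        {
            key: value
            for key, value in message.items()
            if key not in DROPPED_KEYS and not key.startswith("_")
        }
        for message in messages
    ]
```

Building new dicts rather than popping in place keeps the internal history intact, so fields such as `tool_name` remain available to the code that still needs them.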
Teknium	c1b2d0917f	fix(cli): don't treat any container as the Docker image for updates (#35139) detect_install_method() returned "docker" for any container (is_container()), before the .git check. Both supported installs already self-identify via the .install_method stamp read first: the curl installer (scripts/install.sh) git-clones and stamps "git"; the published nousresearch/hermes-agent image stamps "docker" at boot via docker/stage2-hook.sh. An unsupported manual install dropped into a container has no stamp, so the bare container check hijacked it to "docker" and 'hermes update' bailed with the docker-pull guidance. Drop the redundant is_container() -> docker fallback. Unstamped installs now fall through to the .git/pip checks like any off-path install; both supported paths are unaffected because the stamp wins first. Fixes #34397.	2026-05-30 00:22:46 -07:00
kshitij	8738cb92c3	Merge pull request #34704 from kshitijk4poor/feat/tui-agents-nudge feat(tui): nudge toward /agents dashboard when delegation starts	2026-05-30 00:01:59 -07:00
kshitijk4poor	5a72e82fd8	feat(tui): nudge toward /agents dashboard when delegation starts The TUI already ships a rich /agents spawn-tree dashboard (live tree, timeline, per-child tokens/cost/files/tools, kill/pause), but nothing surfaced it — during delegation the transcript stayed quiet and users had to already know to type /agents. Drop a one-time transient activity hint ("subagents working · /agents to watch live") the first time a turn starts delegating, matching the existing "· /logs to inspect" house style. Guards keep it unobtrusive: - fires at most once per turn (resets on message.start) - silent when the /agents overlay is already open - gated by display.tui_agents_nudge (default true) Hooked on subagent.start, not subagent.spawn_requested: the delegate progress callback in tools/delegate_tool.py only relays start/complete to the gateway and drops spawn_requested, so start is the first delegation event the TUI reliably receives. spawn_requested is wired too for the future case, guarded once-per-turn. Adds the display.tui_agents_nudge config default and gatewayTypes entry.	2026-05-30 12:26:36 +05:30
kshitijk4poor	7b0915037c	test: remove low-value model-catalog mirror tests These tests asserted that hardcoded curated model lists/constants still contained specific model strings (e.g. 'glm-5' in provider_model_ids('zai'), exact context-length values per model key, PROVIDER_TO_MODELS_DEV entries). They mirror a constant rather than exercise logic, so they only ever break when models are added/retired and never catch a real bug. Removed 22 such functions across 7 files (149 deletions, 0 additions). Behavioral siblings are kept: live-catalog-wins, fallback ordering, substring/longest-match resolution, normalization, credential discovery, and probe-tier stepping all still tested.	2026-05-29 23:45:05 -07:00
Teknium	0437137fff	security: pin patched Starlette (>=1.0.1) for CVE-2026-48710 BadHost (#35118 ) Starlette < 1.0.1 is affected by CVE-2026-48710 ("BadHost", CWE-444). The HTTP Host header was not validated before being used to rebuild `request.url`, so a malformed Host could make `request.url.path` desync from the raw ASGI path the router actually dispatched. Middleware and endpoints that apply path-based authorization off `request.url` (rather than `scope["path"]`) can therefore be bypassed. Hermes pulls Starlette transitively, never directly: - [web] -> fastapi==0.133.1 (starlette>=0.40.0, no upper bound) - [mcp] -> mcp==1.26.0 + sse-starlette (starlette>=0.27 / >=0.49.1) - [computer-use] -> mcp==1.26.0 - [dev] -> mcp==1.26.0 A fresh resolve landed starlette 0.52.1 — vulnerable. With no upper bound on the transitive specs, pip/uv could resolve any pre-1.0.1 release on a fresh install. Fix: pin starlette==1.0.1 directly in every extra that exposes a Starlette-backed server surface, regenerate uv.lock (only starlette moves: 0.52.1 -> 1.0.1, hash-verified), and mirror the pin in the lazy-install map (tools/lazy_deps.py `tool.dashboard`) so `hermes` on-demand dashboard installs can't re-resolve a vulnerable version. 1.0.1 is the advisory's named fix floor and the oldest patched release (more bake time than 1.1.0/1.2.0, which are days old); it satisfies every carrier constraint and our requires-python>=3.11. Scope note: this is a dependency-level fix complementing the application-layer Host-header validator added in #34162 (`hermes_cli/web_server.py` `_is_accepted_host`). Defense in depth at both the framework and app layers. Guards: two invariant tests in tests/test_packaging_metadata.py assert every server-surface extra pins starlette and that pyproject + uv.lock both resolve >= the 1.0.1 CVE floor — a dropped pin or stale lock fails in CI instead of shipping the bypass. Closes #35067	2026-05-29 23:23:54 -07:00
Erosika	827ce602db	fix(honcho): harden self-hosted setup paths Self-hosted Honcho setup had four sharp edges: - local/cloud URLs ending in /vN double-prefixed by the SDK (/v3/v3/... 404) - authenticated local servers had no setup prompt for a JWT/bearer token - profile-derived host keys could be dot-containing workspace IDs Honcho rejects - memory-provider config files with API keys written world-readable per umask This keeps existing behavior but makes those paths safer: - strip a trailing /vN version segment from any configured baseUrl before SDK init (the SDK's route builders always prepend their own version prefix); auth-skipping stays loopback-only - add an optional local JWT/bearer prompt in honcho setup, stored under hosts.<host>.apiKey - derive new profile host keys with underscores, still reading legacy hermes.<profile> blocks - write memory-provider config files atomically with 0600 via a shared utils.atomic_json_write(mode=) arg (honcho/hindsight/mem0/supermemory) - skip honcho.json parsing in gateway cache-busting unless Honcho is the active memory provider; memoize by honcho.json mtime when active - bust the gateway agent cache on memory.provider change - add a hermes memory setup <provider> one-liner so fresh installs can configure a named provider without the picker (the per-provider hermes <provider> subcommand only registers once that provider is active) Closes #20688, #29885, #26459, #30246, #33382, #32244. Co-authored-by: BROCCOLO1D	2026-05-29 22:29:48 -07:00
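The trailing-version fix in the first bullet above can be sketched with a small regular expression. The helper name and the pattern are assumptions for illustration, not the project's code; the idea is only to drop a final `/vN` segment so an SDK that prepends its own version prefix does not produce `/v3/v3/...`.

```python
import re

# Matches a final "/v<digits>" path segment, with an optional trailing slash.
_VERSION_SUFFIX = re.compile(r"/v\d+/?$")


def strip_version_suffix(base_url):
    """Drop a trailing /vN segment from a configured base URL."""
    return _VERSION_SUFFIX.sub("", base_url)
```

The pattern is anchored at the end of the string, so a version segment in the middle of a path is left alone.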
Siddharth Balyan	aa32edcac5	fix(setup): write config for image_gen and video_gen in apply_nous_managed_defaults (#35109 ) apply_nous_managed_defaults() was adding image_gen and video_gen to the 'changed' return set without writing any config values. The caller (tools_command first_install flow) uses 'changed' to skip manual configuration, so these tools ended up in platform_toolsets but with no video_gen.provider, video_gen.use_gateway, or image_gen.use_gateway in config.yaml. At runtime the FAL plugin's is_available() returned False because there was no FAL_KEY and no use_gateway config — the tool never loaded despite being 'enabled' in the toolset list. For image_gen this was a latent bug masked by the gateway offer prompt (prompt_enable_tool_gateway) running earlier in the setup flow and writing image_gen.use_gateway=True via apply_gateway_defaults(). But if the user skipped the gateway offer, image_gen would silently break the same way. For video_gen (added in PR #33259) the bug was always hit because the gateway offer ran before the user checked video_gen in the toolset checklist. Fix: write provider/use_gateway config values before adding to 'changed', matching the pattern used by web, tts, and browser.	2026-05-30 03:45:12 +00:00
teknium1	a7421dc7d2	fix(session): point no-FTS5 warning at the supported install When FTS5 is missing the warning now explains the likely cause (an unsupported / pip-managed Python whose bundled SQLite lacks FTS5) and links the supported install at hermes-agent.nousresearch.com, instead of just logging the raw error.	2026-05-29 20:11:07 -07:00
teknium1	4fa20f9a8b	fix(install): ensure the uv-managed Python ships SQLite FTS5 uv's python-build-standalone distributions only gained FTS5 in mid-2025 (#694). A stale interpreter already in uv's store — which `uv python find` reuses without checking — can lack it, leaving the supported install with a SQLite that can't create the FTS5 virtual tables hermes_state.py needs for full-text session search ("no such module: fts5"). check_python now probes the resolved interpreter for FTS5 and, if missing, reinstalls the latest patch for $PYTHON_VERSION (which has FTS5) and re-resolves. If an FTS5-capable Python still can't be obtained (offline, pinned env), it warns and continues — Hermes degrades gracefully and only disables session search. No bundled second SQLite, no user action.	2026-05-29 20:11:07 -07:00
teknium1	97ecfa0fc4	fix(session): extend no-FTS5 degradation to the trigram CJK index The salvaged contributor commit guarded only messages_fts. Current main also creates a second virtual table, messages_fts_trigram (CJK substring search), whose CREATE VIRTUAL TABLE ... USING fts5 still raised "no such module: fts5" on builds without FTS5 — re-crashing SessionDB init. Wrap the trigram setup with the same guard, and broaden the test's no-fts5 mock to fail BOTH tables so the regression test actually exercises a faithful no-FTS5 build.	2026-05-29 20:11:07 -07:00
LeonSGP43	5ad2b4c6da	fix(session): degrade gracefully when SQLite lacks FTS5	2026-05-29 20:11:07 -07:00
Teknium	860cf28dab	docs: clarify compression threshold is derived from the main model's context window (#35099) The compression threshold is threshold × context_length where context_length is the MAIN agent model's window, not the auxiliary/summary model's. On a 262,144-token model at the default 0.50 the threshold is 131,072, close to a common 128K figure by coincidence of the percentage, which has led to confusion that the auxiliary model's context limit is the trigger. Add a note preempting that misreading and pointing to the separate summary-model-context constraint.	2026-05-29 19:59:04 -07:00
teknium1	fb0ab27649	fix(agent): register explainer config key + shorten footer prefix Follow-up to the salvaged #34452 turn-completion explainer: - Register display.turn_completion_explainer: True in DEFAULT_CONFIG so the setting is discoverable, matching the file_mutation_verifier precedent. - Shorten the repeated footer prefix from 'Turn ended without a usable reply: ' to 'No reply: ' so the 10 reason variants don't all open with the same 8-word boilerplate. - Update the 7 assertions that referenced the old prefix.	2026-05-29 19:23:05 -07:00
Bartok9	de6d6023d7	test(run_agent): align test_dict_tool_call_args with explainer suffix PR #34470 adds an explainer suffix to abnormal turn endings (e.g. max_iterations_reached) so users see why the response is short instead of receiving a bare/blank reply. test_tool_call_validation_accepts_dict_arguments runs the agent at max_iterations=3 which hits the explainer path; the existing strict-equality assertion (== "done") no longer matches once the suffix is appended. Switch the assertion to .startswith("done") so the test continues to verify that the model's actual text survives intact while leaving the explainer suffix wording owned by conversation_loop (where it belongs). Test now passes (1 passed in 0.88s).	2026-05-29 19:23:05 -07:00
Bartok9	59b0ea98c8	fix(agent): explain abnormal turn endings instead of blank/partial reply When a turn ends abnormally after substantive tool calls (empty content after retries, a partial/truncated stream, exhausted retries, or an iteration/budget limit), the CLI/TUI response area was left blank or showed only a fragment (e.g. "The") with no consolidated reason. The internal turn_exit_reason values (empty_response_exhausted, partial_stream_recovery, etc.) were never surfaced to the user. Add a turn-completion explainer that mirrors the existing file-mutation verifier footer: at turn end, map an abnormal turn_exit_reason to a short, actionable message and either replace the bare "(empty)" sentinel or append the reason after a partial fragment. Normal text_response exits (e.g. a terse "Done.") stay quiet. Gated by display.turn_completion_explainer (default on) with HERMES_TURN_COMPLETION_EXPLAINER env override, matching the file-mutation verifier seam. Closes #34452	2026-05-29 19:23:05 -07:00
Teknium	897f9533ed	fix: keep CLI context display in sync with preflight token estimate (#35079 ) * Inspired by Claude Code: /compress here [N] — boundary-aware 'summarize up to here' Adds a user-chosen compression boundary to the existing /compress command. /compress here [N] summarizes everything except the most recent N exchanges (default 2), which are preserved verbatim — letting the user pick the compression boundary instead of relying on the automatic token-budget heuristic. Inspired by Claude Code's Rewind 'Summarize up to here' action (v2.1.139, Week 20, May 2026): https://code.claude.com/docs/en/whats-new/2026-w20 - hermes_cli/partial_compress.py: pure split/parse helpers + seam-alternation guard (shared by CLI and gateway). - cli.py / gateway/run.py: route 'here [N]' / '--keep N' to partial compression; compress only the head, re-append the verbatim tail through the seam guard. - Preserves message-flow role alternation (seam guard merges any illegal user->user / assistant->assistant adjacency). - Reuses the existing _compress_context session-rotation/lock machinery — no changes to the compression core. - Bare /compress (full) and /compress <focus> behavior unchanged. Tests: 12 helper unit tests + 5 CLI integration tests + E2E (interleaved tool-call transcript, degenerate/multimodal seams, real handler path). * fix: keep CLI context display in sync with preflight token estimate The status bar reads compressor.last_prompt_tokens, which only updates from a successful API response. When loaded history is oversized but compression no-ops (e.g. the auxiliary summary model times out), no fresh usage arrives and the bar stays frozen at the old, smaller value while the preflight estimate reports a much larger number — looking permanently out of sync (reported: 74.4K display vs ~144,669 preflight). 
Seed last_prompt_tokens with the fresh preflight estimate (upward-only, so a real usage figure is never clobbered and a successful compression's downward correction still wins). Display-only; no behavioral change to compression, caching, or the agent loop.	2026-05-29 19:21:15 -07:00
teknium	9d4c81130a	fix(gateway): name what the /status token number actually is Sharpen the label from 'Session usage (cumulative)' to 'Cumulative API tokens (re-sent each call)'. The number is real provider-reported usage summed across every API call in the session — not context size. In an agentic loop the same context is re-sent each iteration, so a one-hour tool-heavy session legitimately reaches tens of millions of tokens. The new label explains the magnitude so users stop reading it as a bug or as a total across all sessions.	2026-05-29 19:14:37 -07:00
helix4u	2259c15e4d	fix(gateway): clarify status session usage label	2026-05-29 19:14:37 -07:00
Bartok9	45bc65abbe	fix(gateway): drop outbound silence-narration messages pre-send Hallucinated 'silence' tokens ((silent), _silent_, the bare '.', '...', 'silent', no response/reply, the mute emoji) are emitted when a persona has nothing actionable to say. In bot-to-bot channels the receiving bot mirrors the token back, creating a tight loop that burns API tokens and can crash a model with 'no content after all retries'. SOUL.md/prompt rules drift across providers and have already failed in practice, so add a substrate-level guard. _deliver_to_platform now drops a message whose finalized content is only a silence-narration token, logs a WARNING with platform/chat_id/truncated content, and returns {success: True, filtered: 'silence_narration', delivered: False} instead of calling the adapter. Single chokepoint covers every platform adapter; the regex is anchored start/end with a 64-char guard so prose like 'Silence is golden — here is the plan...' or 'Silent install completed' is never dropped. Local/file delivery is a separate path and is left untouched. Opt out via gateway.filter_silence_narration: false or the HERMES_FILTER_SILENCE_NARRATION env override (env wins when set). Closes #34616	2026-05-29 19:06:05 -07:00
teknium1	9dbc3722ae	test(compression): fix StopIteration in large-rough-growth preflight test The rough-estimate mock supplied only 2 side_effect values but the conversation loop calls estimate_request_tokens_rough a third time for the post-response real-token estimate, exhausting the iterator. Use a callable side_effect that returns 125k once (to fire preflight) then sub-threshold values, independent of call count.	2026-05-29 19:05:03 -07:00
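The callable side_effect described above can be sketched with `unittest.mock`. The factory name and the sub-threshold value are assumptions for illustration; the point is that a callable keeps answering no matter how many times the loop asks, where a fixed list of values raises StopIteration once exhausted.

```python
from unittest.mock import MagicMock


def make_rough_estimate_mock(first=125_000, later=1_000):
    """Mock that returns a large estimate once, then small ones forever."""
    calls = {"count": 0}

    def _side_effect(*_args, **_kwargs):
        calls["count"] += 1
        # Only the first call is large enough to trigger preflight.
        return first if calls["count"] == 1 else later

    return MagicMock(side_effect=_side_effect)
```

A test would patch this in place of estimate_request_tokens_rough, so adding another call site in the conversation loop later does not break the mock.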
helix4u	e38b0b55d1	fix(compression): avoid repeat preflight compaction from rough estimates	2026-05-29 19:05:03 -07:00
Teknium	04de307d62	fix(cli): repaint input area after inline /steer and /model submit (#34839) handle_enter dispatches /steer and /model inline on the UI thread while the agent is running, calling buffer.reset() then returning. Unlike every other early-return branch in the handler, these two skipped event.app.invalidate(). process_command() prints through patch_stdout (scrolls output above the prompt without redrawing the input line), so the just-cleared input area could keep showing the submitted '/steer <text>' until an unrelated redraw fired, looking unsent and inviting an accidental re-submit. Add event.app.invalidate() after reset in both inline branches to match the sibling branches. AST regression test pins the invariant: every reset-then-return branch in handle_enter must invalidate first. Fixes #34569	2026-05-29 19:04:40 -07:00
Teknium	bcc8301000	Inspired by Claude Code: /compress here [N] — boundary-aware 'summarize up to here' (#35048 ) Adds a user-chosen compression boundary to the existing /compress command. /compress here [N] summarizes everything except the most recent N exchanges (default 2), which are preserved verbatim — letting the user pick the compression boundary instead of relying on the automatic token-budget heuristic. Inspired by Claude Code's Rewind 'Summarize up to here' action (v2.1.139, Week 20, May 2026): https://code.claude.com/docs/en/whats-new/2026-w20 - hermes_cli/partial_compress.py: pure split/parse helpers + seam-alternation guard (shared by CLI and gateway). - cli.py / gateway/run.py: route 'here [N]' / '--keep N' to partial compression; compress only the head, re-append the verbatim tail through the seam guard. - Preserves message-flow role alternation (seam guard merges any illegal user->user / assistant->assistant adjacency). - Reuses the existing _compress_context session-rotation/lock machinery — no changes to the compression core. - Bare /compress (full) and /compress <focus> behavior unchanged. Tests: 12 helper unit tests + 5 CLI integration tests + E2E (interleaved tool-call transcript, degenerate/multimodal seams, real handler path).	2026-05-29 17:49:15 -07:00
Bartok9	54aa4db1de	fix(cli): remove Hermes-managed node/npm/npx symlinks on uninstall The POSIX installer drops node/npm/npx symlinks in ~/.local/bin pointing into $HERMES_HOME/node and prepends ~/.local/bin to PATH, shadowing an existing nvm. Uninstall removed the hermes wrapper but left these behind, so the user's default node/npm/npx stayed redirected after uninstall. Add remove_node_symlinks() and call it from run_uninstall. It removes ~/.local/bin/{node,npm,npx} only when each is a symlink resolving into the current Hermes home's node dir, so a link the user repointed at nvm or a real binary is never touched. Handles dangling links too. Closes #34536	2026-05-29 17:24:38 -07:00
Teknium	2062a84000	fix(auxiliary): stop capping output with max_tokens by default (#34530) (#34845) * fix(auxiliary): stop capping output with max_tokens by default Auxiliary LLM calls (compression, titles, vision, etc.) no longer send max_tokens on the OpenAI-compatible chat-completions path. Most providers treat an omitted max_tokens as "use the model max", which is what we want; an explicit cap only risks truncation or a wire-format 400. This was surfaced by GitHub Copilot / GPT-5 (#34530): those models reject max_tokens and require max_completion_tokens, so compression 400'd and fell back to a static context marker. Omitting the param sidesteps that quirk (and ZAI vision's error 1210) entirely. The Anthropic Messages wire (MiniMax + /anthropic endpoints) keeps max_tokens because it is a mandatory field there. * test(auxiliary): update temperature-retry assertions for omitted max_tokens The temperature-retry tests asserted retry_kwargs["max_tokens"] == 500 on an api.openai.com endpoint. Now that auxiliary calls omit max_tokens on OpenAI-compatible endpoints (#34530), that key is absent. Assert it's absent in both first and retry kwargs and use model as the survives-the-retry witness.	2026-05-29 17:24:30 -07:00
Teknium	f9daa4a41d	fix(deps): declare setuptools in dev extra for packaging tests (#34851) * fix(deps): declare setuptools in dev extra for packaging tests tests/test_packaging_metadata.py imports `from setuptools import find_packages` at module scope to validate package discovery against the live tree. setuptools was being picked up ambiently from the CI runner image, but recent ubuntu-latest images no longer ship it in the test venv, so collection fails with ModuleNotFoundError on every PR. Declare setuptools==82.0.1 in the dev optional-dependencies so `.[all,dev]` installs it explicitly rather than relying on the runner environment. * test(packaging): skip packaging-metadata tests when setuptools absent Belt-and-suspenders alongside declaring setuptools in [dev]: guard the module-level `from setuptools import find_packages` with pytest.importorskip so a runner missing setuptools SKIPS these checks instead of erroring out collection for the entire test shard. * chore(deps): sync uv.lock for setuptools dev dependency	2026-05-29 17:24:23 -07:00
Teknium	689ef5e233	feat(cli): warn on unsupported pip installs + fix stale update-check cache (#34491) (#34846) * docs(code-execution): document HERMES_* env narrowing + passthrough workaround The execute_code sandbox-child env scrub (`108397726`, #27303) deliberately dropped the broad HERMES_ prefix passthrough, keeping only an operational 4-var allowlist (HERMES_HOME/PROFILE/CONFIG/ENV). A script that relied on a non-secret HERMES_* var (HERMES_BASE_URL, HERMES_KANBAN_DB, HERMES__WEBHOOK, or a plugin-defined one) now sees it unset in the child. Document the behavior change and the two recovery routes (terminal.env_passthrough in config.yaml, or required_environment_variables in skill frontmatter), plus the debug log line that surfaces the drop for diagnosis. feat(cli): warn on unsupported pip installs + fix stale update-check cache after pip upgrade Banner now shows a yellow warning when detect_install_method() == 'pip': 'pip install hermes-agent' isn't the supported install path (it exists on PyPI for internal/CI reasons), so updates and issue support don't behave correctly. Reuses existing install-method detection; warn, never block. Also fixes #34491: check_for_updates() keyed its 6h cache only on ts+rev. On the pip path (no HERMES_REVISION), rev is always None, so a 'pip install --upgrade' changed VERSION but left the cache valid — the stale 'N commits behind' count survived the upgrade. Cache now also keys on the installed VERSION and invalidates on mismatch.	2026-05-29 13:30:28 -07:00
teknium1	bb50825716	chore(release): map annguyenNous to AUTHOR_MAP Clears the check-attribution CI gate on PR #34468 — the contributor's noreply email was unmapped.	2026-05-29 13:29:34 -07:00
annguyenNous	9f5afc7636	fix(mcp): widen isinstance check to BaseException for CancelledError asyncio.gather(return_exceptions=True) captures CancelledError as a BaseException value. The previous isinstance(result, Exception) check missed CancelledError, silently dropping it without logging. Since Python 3.8, CancelledError is a BaseException subclass (not Exception). This one-line change ensures all failure types from MCP server connections are properly logged. Fixes NousResearch/hermes-agent#34443	2026-05-29 13:29:34 -07:00
teknium1	4fd8521e44	test(tui-gateway): isolate completion_queue in poller requeue test test_notification_poller_requeues_when_busy drained and reused the process-global process_registry.completion_queue, so a concurrent test in the same xdist worker could put/get on the shared singleton mid-run and empty the event the poller requeues — flaking 'assert not completion_queue.empty()' under parallel CI load only. Monkeypatch a fresh Queue onto the singleton for the test's duration so nothing external can interleave. The poller reads completion_queue by attribute at runtime, so the isolated queue is what it operates on. monkeypatch restores the original on teardown. Verified immune: 50/50 passes under a background thread hammering the global queue.	2026-05-29 13:29:24 -07:00
Bartok9	edfdc77664	fix(cli): resume the selected chat when a bare number follows /resume A bare `/resume` printed the recent-sessions list but armed no selection state, so typing just `3` on the next line was sent to the agent as chat instead of resuming session #3. `/resume 3` worked, but the natural list-then-pick flow did not. Arm a one-shot pending-resume prompt when bare `/resume` shows the list, and consume the next bare numeric input as the selection (out-of-range is reported, non-numeric/other commands disarm it). Resolves against the same _list_recent_sessions(limit=10) list used everywhere else. Closes #34584.	2026-05-29 13:29:24 -07:00
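
The list-then-pick flow described in edfdc77664 can be sketched roughly as follows. This is only an illustration: the `ResumePrompt` class, its `handle` method, and the returned strings are hypothetical and not taken from hermes_cli. The sketch also assumes that an out-of-range number disarms the prompt, which the commit message does not state.

```python
from typing import List


class ResumePrompt:
    """Hypothetical one-shot selection state for a bare /resume (illustrative only)."""

    def __init__(self, sessions: List[str]) -> None:
        self.sessions = sessions  # stand-in for the recent-sessions list
        self.armed = False        # True only right after a bare /resume

    def handle(self, line: str) -> str:
        text = line.strip()
        if text == "/resume":
            # Show the list and arm the one-shot selection.
            self.armed = True
            return "list"
        was_armed, self.armed = self.armed, False  # any next input disarms
        if text.startswith("/"):
            return "command"
        if was_armed and text.isdecimal():
            index = int(text)
            if 1 <= index <= len(self.sessions):
                return f"resume:{self.sessions[index - 1]}"
            return "out-of-range"  # reported to the user, not sent as chat
        return "chat"


prompt = ResumePrompt(["alpha", "beta", "gamma"])
print(prompt.handle("3"))        # chat (nothing armed yet)
print(prompt.handle("/resume"))  # list
print(prompt.handle("3"))        # resume:gamma
print(prompt.handle("3"))        # chat (selection was one-shot)
```

Clearing the flag on every input after the list keeps a stray number typed later in the conversation from being read as a session pick.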

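The isinstance widening in 9f5afc7636 can be reproduced in isolation. This is a minimal sketch, assuming Python 3.8 or later (where asyncio.CancelledError derives from BaseException). The three coroutines and the `failures` helper are made up for illustration and are not the MCP connection code.

```python
import asyncio


async def connect_ok() -> str:
    return "connected"


async def connect_fail() -> str:
    raise RuntimeError("connection refused")


async def connect_cancelled() -> str:
    # Stand-in for a server connection that gets cancelled mid-flight.
    raise asyncio.CancelledError()


async def gather_results() -> list:
    # return_exceptions=True hands failures back as values instead of raising.
    return await asyncio.gather(
        connect_ok(), connect_fail(), connect_cancelled(),
        return_exceptions=True,
    )


def failures(results: list, base: type) -> list:
    """Pick out the results that are instances of the given exception base."""
    return [r for r in results if isinstance(r, base)]


results = asyncio.run(gather_results())
narrow = failures(results, Exception)      # misses CancelledError
wide = failures(results, BaseException)    # catches it as well
print(len(narrow), len(wide))  # 1 2
```

With the narrow check the cancelled connection is neither treated as a success nor logged as a failure, which matches the silent drop the commit describes.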
10036 Commits