376 Commits

Author SHA1 Message Date
abf1e98f62 chore: release v0.7.0 (2026.4.3) (#4812)
168 merged PRs, 223 commits, 46 resolved issues, 40+ contributors.

Highlights: pluggable memory providers, credential pools, Camofox browser,
inline diff previews, API server session continuity, ACP MCP registration,
gateway hardening, secret exfiltration blocking.
2026-04-03 11:14:55 -07:00
e492420df4 fix: route memory provider tools in sequential execution path (#4803)
Memory provider tools (hindsight_retain, honcho_search, etc.) were
advertised to the model via tool schemas but failed with 'Unknown tool'
at execution time. The concurrent path (_invoke_tool) correctly checks
self._memory_manager.has_tool() before falling through to the registry,
but the sequential path (_execute_tool_calls_sequential) was never
updated with this check. Since sequential is the default for single
tool calls, memory provider tools always hit the registry dispatcher
which returns 'Unknown tool' because they're not registered there.

Add the memory_manager dispatch check between the delegate_task handler
and the quiet_mode fallthrough in the sequential path, with proper
spinner/display handling to match the existing pattern.

Reported by KiBenderOP — all memory providers affected (Honcho,
Hindsight, Holographic, etc.).
2026-04-03 10:31:53 -07:00
67e3620c5c fix: persist API server sessions to shared SessionDB (state.db) (#4802)
The API server adapter created AIAgent instances without passing
session_db, so conversations via Open WebUI and other OpenAI-compatible
frontends were never persisted to state.db. This meant 'hermes sessions
list' showed no API server sessions — they were effectively stateless.

Changes:
- Add _ensure_session_db() helper for lazy SessionDB initialization
- Pass session_db=self._ensure_session_db() in _create_agent()
- Refactor existing X-Hermes-Session-Id handler to use the shared helper

Sessions now persist with source='api_server' and are visible alongside
CLI and gateway sessions in hermes sessions list/search.
2026-04-03 10:31:11 -07:00
aecbf7fa4a fix(discord): register /approve and /deny slash commands, wire up button-based approval UI (#4800)
Two fixes for Discord exec approval:

1. Register /approve and /deny as native Discord slash commands so they
   appear in Discord's command picker (autocomplete). Previously they
   were only handled as text commands, so users saw 'no commands found'
   when typing /approve.

2. Wire up the existing ExecApprovalView button UI (was dead code):
   - ExecApprovalView now calls resolve_gateway_approval() to actually
     unblock the waiting agent thread when a button is clicked
   - Gateway's _approval_notify_sync() detects adapters with
     send_exec_approval() and routes through the button UI
   - Added 'Allow Session' button for parity with /approve session
   - send_exec_approval() now accepts session_key and metadata for
     thread support
   - Graceful fallback to text-based /approve prompt if button send fails

Also updates test mocks to include grey/secondary ButtonStyle and
purple Color (used by new button styles).
2026-04-03 10:24:07 -07:00
5db630aae4 fix: respect per-platform disabled skills in Telegram menu and gateway dispatch (#4799)
Three interconnected bugs caused `hermes skills config` per-platform
settings to be silently ignored:

1. telegram_menu_commands() never filtered disabled skills — all skills
   consumed menu slots regardless of platform config, hitting Telegram's
   100 command cap. Now loads disabled skills for 'telegram' and excludes
   them from the menu.

2. Gateway skill dispatch executed disabled skills because
   get_skill_commands() (process-global cache) only filters by the global
   disabled list at scan time. Added per-platform check before execution,
   returning an actionable 'skill is disabled' message.

3. get_disabled_skill_names() only checked HERMES_PLATFORM env var, but
   the gateway sets HERMES_SESSION_PLATFORM instead. Added
   HERMES_SESSION_PLATFORM as fallback, plus an explicit platform=
   parameter for callers that know their platform (menu builder, gateway
   dispatch). Also added platform to prompt_builder's skills cache key
   so multi-platform gateways get correct per-platform skill prompts.

Reported by SteveSkedasticity (CLAW community).
2026-04-03 10:10:53 -07:00
b6f9b70afd fix(gateway): route /approve and /deny through running-agent guard (#4798)
When the agent is blocked on a dangerous command approval (threading.Event
wait inside tools/approval.py), incoming /approve and /deny commands were
falling through to the generic interrupt path instead of being dispatched
to their command handlers. The interrupt sets _interrupt_requested on the
agent, but the agent thread is blocked on event.wait() — not checking the
flag. Result: approval times out after 300s (5 minutes) before executing.

Fix: intercept /approve and /deny in the running-agent early-intercept
block (alongside /stop, /new, /queue) and route directly to
_handle_approve_command / _handle_deny_command.
2026-04-03 09:59:52 -07:00
93334b2b92 docs: add community FAQ entries — multi-model workflows, WhatsApp binding, verbose control, skills config, thread sessions, migration, install troubleshooting (#4797)
Addresses common questions from the Nous Research community Discord:
- Multi-model workflows via delegation config
- WhatsApp per-chat binding limitations and workarounds
- Controlling tool progress display on Telegram
- Per-platform skills config and Telegram 100-command limit
- Shared thread sessions across multiple users
- Exporting/migrating Hermes to a new machine
- Permission denied on shell reload after install
- HTTP 400 on first agent run
2026-04-03 09:58:22 -07:00
d50e5be500 fix: handle None mcp_servers in _get_platform_tools()
When config.yaml has 'mcp_servers:' with no value, YAML parses it as
None. dict.get('mcp_servers', {}) only returns the default when the key
is absent, not when it's explicitly None. Use 'or {}' pattern to handle
both cases, matching the other two assignment sites in the same file.
2026-04-03 09:08:20 -07:00
cc54818d26 fix(mcp): stability fix pack — reload timeout, shutdown cleanup, event loop handler, OAuth non-blocking (#4757)
Four fixes for MCP server stability issues reported by community member
(terminal lockup, zombie processes, escape sequence pollution, startup hang):

1. MCP reload timeout guard (cli.py): _check_config_mcp_changes now runs
   _reload_mcp in a separate daemon thread with a 30s hard timeout. Previously,
   a hung MCP server could block the process_loop thread indefinitely, freezing
   the entire TUI (user can type but nothing happens, only Ctrl+D/Ctrl+\ work).

2. MCP stdio subprocess PID tracking (mcp_tool.py): Tracks child PIDs spawned
   by stdio_client via before/after snapshots of /proc children. On shutdown,
   _stop_mcp_loop force-kills any tracked PIDs that survived the SDK's graceful
   SIGTERM→SIGKILL cleanup. Prevents zombie MCP server processes from
   accumulating across sessions.

3. MCP event loop exception handler (mcp_tool.py): Installs
   _mcp_loop_exception_handler on the MCP background event loop — same pattern
   as the existing _suppress_closed_loop_errors on prompt_toolkit's loop.
   Suppresses benign 'Event loop is closed' RuntimeError from httpx transport
   __del__ during MCP shutdown. Salvaged from PR #2538 (acsezen).

4. MCP OAuth non-blocking (mcp_oauth.py): Replaces blocking input() call in
   _wait_for_callback with OAuthNonInteractiveError raise. Adds _is_interactive()
   TTY detection. In non-interactive environments, build_oauth_auth() still
   returns a provider (cached tokens + refresh work), but the callback handler
   raises immediately instead of blocking the MCP event loop for 120s. Re-raises
   OAuth setup failures in _run_http so failed servers are reported cleanly
   without blocking others. Salvaged from PRs #4521 (voidborne-d) and #4465
   (heathley).

Closes #2537, closes #4462
Related: #4128, #3436
2026-04-03 02:29:20 -07:00
f374ae4c61 fix: prevent compression death spiral from API disconnects (#2153) (#4750)
Three fixes for long-running gateway sessions that enter a death spiral
when API disconnects prevent token data collection, which prevents
compression, which causes more disconnects:

Layer 1 — Stale token counter fallback (run_agent.py in-loop):
When last_prompt_tokens is 0 (stale after API disconnect or provider
returned no usage data), fall back to estimate_messages_tokens_rough()
instead of passing 0 to should_compress(), which would never fire.

Layer 2 — Server disconnect heuristic (run_agent.py error handler):
When ReadError/RemoteProtocolError hits a large session (>60% context
or >200 messages), treat it as a context-length error and trigger
compression rather than burning through retries that all fail the
same way.

Layer 3 — Hard message count limit (gateway/run.py hygiene):
Force compression when a session exceeds 400 messages, regardless of
token estimates. This catches runaway growth even when all token-based
checks fail due to missing API data.

Based on the analysis from PR #2157 by ygd58 — the gateway threshold
direction fix (1.4x multiplier) was already resolved on main.
2026-04-03 02:16:46 -07:00
8fd9fafc84 fix: handle Anthropic Sonnet long-context tier 429 by reducing to 200k (#4747)
Anthropic returns HTTP 429 'Extra usage is required for long context
requests' when a Claude Max subscription doesn't include the 1M context
tier. This is NOT a transient rate limit — retrying won't help.

Only applies to Sonnet models (Opus 1M is general access). Detects
this specific error before the generic rate-limit handler and:
1. Reduces context_length from 1M to 200k (the standard tier)
2. Triggers context compression to fit
3. Retries with the reduced context

The reduction is session-scoped (not persisted) so it auto-recovers
if the user later enables extra usage on their subscription.

Fixes: Sonnet 4.6 instant rate limits on Claude Max without extra usage
2026-04-03 02:05:02 -07:00
26d6083624 fix: correct qwen3.6-plus model slug
Renamed qwen/qwen3.6-plus-preview:free to qwen/qwen3.6-plus:free in both
OPENROUTER_MODELS and _PROVIDER_MODELS['nous'] lists.
2026-04-03 01:56:43 -07:00
470c3ea51a fix: handle Anthropic long-context tier 429 by reducing to 200k
Anthropic returns HTTP 429 'Extra usage is required for long context
requests' when a Claude Max subscription doesn't include the 1M context
tier. This is NOT a transient rate limit — retrying won't help.

Detect this specific error before the generic rate-limit handler and:
1. Reduce context_length from 1M to 200k (the standard tier)
2. Trigger context compression to fit
3. Retry with the reduced context

The reduction is session-scoped (not persisted) so it auto-recovers
if the user later enables extra usage on their subscription.

Fixes: Sonnet 4.6 instant rate limits on Claude Max without extra usage
2026-04-03 01:56:43 -07:00
388241f798 docs(acp): fix zed config 2026-04-03 01:46:45 -07:00
67ae7a79df fix: use get_hermes_home(), consolidate git_cmd, update tests
Follow-up for salvaged PR #2352:
- Replace hardcoded Path(os.getenv('HERMES_HOME', ...)) with
  get_hermes_home() from hermes_constants (2 places)
- Consolidate redundant git_cmd_base into the existing git_cmd
  variable, constructed once before fork detection
- Update autostash tests for the unmerged index check added
  in the previous commit
2026-04-03 01:46:42 -07:00
6b0022bb7b Add fork detection and upstream sync to hermes update
- Detect if origin points to a fork (not NousResearch/hermes-agent)
- Show warning when updating from a fork: origin URL
- After pulling from origin/main on a fork:
  - Prompt to add upstream remote if not present
  - Respect ~/.hermes/.skip_upstream_prompt to avoid repeated prompts
  - Compare origin/main with upstream/main
  - If origin has commits not on upstream, skip (don't trample user's work)
  - If upstream is ahead, pull from upstream and try to sync fork
  - Use --force-with-lease for safe fork syncing

Non-main branches are unaffected - they just pull from origin/{branch}.

Co-authored-by: Avery <avery@hermes-agent.ai>
2026-04-03 01:46:42 -07:00
0109547fa2 fix(update): handle conflicted git index during hermes update (#4735)
* fix(gateway): race condition, photo media loss, and flood control in Telegram

Three bugs causing intermittent silent drops, partial responses, and
flood control delays on the Telegram platform:

1. Race condition in handle_message() — _active_sessions was set inside
   the background task, not before create_task(). Two rapid messages
   could both pass the guard and spawn duplicate processing tasks.
   Fix: set _active_sessions synchronously before spawning the task
   (grammY sequentialize / aiogram EventIsolation pattern).

2. Photo media loss on dequeue — when a photo (no caption) was queued
   during active processing and later dequeued, only .text was
   extracted. Empty text → message silently dropped.
   Fix: _build_media_placeholder() creates text context for media-only
   events so they survive the dequeue path.

3. Progress message edits triggered Telegram flood control — rapid tool
   calls edited the progress message every 0.3s, hitting Telegram's
   rate limit (23s+ waits). This blocked progress updates and could
   cause stream consumer timeouts.
   Fix: throttle edits to 1.5s minimum interval, detect flood control
   errors and gracefully degrade to new messages. edit_message() now
   returns failure for flood waits >5s instead of blocking.

* fix(gateway): downgrade empty/None response log from WARNING to DEBUG

This warning fires on every successful streamed response (streaming
delivers the text, handler returns None via already_sent=True) and
on every queued message during active processing. Both are expected
behavior, not error conditions. Downgrade to DEBUG to reduce log noise.

* fix(gateway): prevent stuck sessions with agent timeout and staleness eviction

Three changes to prevent sessions from getting permanently locked:

1. Agent execution timeout (HERMES_AGENT_TIMEOUT, default 10min):
   Wraps run_in_executor with asyncio.wait_for so a hung API call or
   runaway tool can't lock a session indefinitely. On timeout, the
   agent is interrupted and the user gets an actionable error message.

2. Staleness eviction for _running_agents:
   Tracks start timestamps for each session entry. When a new message
   arrives and the entry is older than timeout + 1min grace, it's
   evicted as a leaked lock. Safety net for any cleanup path that
   fails to remove the entry.

3. Cron job timeout (HERMES_CRON_TIMEOUT, default 10min):
   Wraps run_conversation in a ThreadPoolExecutor with timeout so a
   hung cron job doesn't block the ticker thread (and all subsequent
   cron jobs) indefinitely.

Follows grammY runner's per-update timeout pattern and aiogram's
asyncio.wait_for approach for handler deadlines.

* fix(gateway): STT config resolution, stream consumer flood control fallback

Three targeted fixes from user-reported issues:

1. STT config resolution (transcription_tools.py):
   _has_openai_audio_backend() and _resolve_openai_audio_client_config()
   now check stt.openai.api_key/base_url in config.yaml FIRST, before
   falling back to env vars. Fixes voice transcription breaking when
   using a custom OpenAI-compatible endpoint via config.yaml.

2. Stream consumer flood control fallback (stream_consumer.py):
   When an edit fails mid-stream (e.g., Telegram flood control returns
   failure for waits >5s), reset _already_sent to False so the normal
   final send path delivers the complete response. Previously, a
   truncated partial was left as the final message.

3. Telegram edit_message comment alignment (telegram.py):
   Clarify that long flood waits return failure so streaming can fall
   back to a normal final send.

* refactor: simplify and harden PR fixes after review

- Fix cron ThreadPoolExecutor blocking on timeout: use shutdown(wait=False,
  cancel_futures=True) instead of context manager that waits indefinitely
- Extract _dequeue_pending_text() to deduplicate media-placeholder logic
  in interrupt and normal-completion dequeue paths
- Remove hasattr guards for _running_agents_ts: add class-level default
  so partial test construction works without scattered defensive checks
- Move `import concurrent.futures` to top of cron/scheduler.py
- Progress throttle: sleep remaining interval instead of busy-looping
  0.1s (~15 wakeups per 1.5s window → 1 wakeup)
- Deduplicate _load_stt_config() in transcription_tools.py:
  _has_openai_audio_backend() now delegates to _resolve_openai_audio_client_config()

* fix: move class-level attribute after docstring, clarify throttle comment

Follow-up nits for salvaged PR #4577:
- Move _running_agents_ts class attribute below the docstring so
  GatewayRunner.__doc__ is preserved.
- Add clarifying comment explaining the throttle continue behavior
  (batches queued messages during the throttle interval).

* fix(update): handle conflicted git index during hermes update

When the git index has unmerged entries (e.g. from an interrupted
merge or rebase), git stash fails with 'needs merge / could not
write index'. Detect this with git ls-files --unmerged and clear
the conflict state with git reset before attempting the stash.
Working-tree changes are preserved.

Reported by @LLMJunky — package-lock.json conflict from a prior
merge left the index dirty, blocking hermes update entirely.

---------

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-04-03 01:17:12 -07:00
c66c688727 fix: remove redundant restart message from update launchd path
launchd_restart() already prints stop/start confirmation via its
internal helpers — the extra 'Gateway restarted via launchd' line
was redundant. Update test assertion to match.
2026-04-03 01:16:42 -07:00
988ecc7420 fix(update): avoid launchd restart race on macOS 2026-04-03 01:16:42 -07:00
7165eff901 fix(whatsapp): add free_response_chats, mention stripping, and interactive message unwrapping
Address feature gaps vs Telegram/Discord/Mattermost adapters:
- free_response_chats whitelist to bypass mention gating per-group
- strip bot @phone mentions from body before forwarding to agent
- unwrap templateMessage/buttonsMessage/listMessage in bridge
- info-level log on successful mention pattern compilation
- use module-level json import instead of inline import in config
- eliminate double _normalize_whatsapp_id call via walrus operator
- hoist botIds computation outside per-message loop in bridge
2026-04-03 01:16:39 -07:00
714e4941b8 fix(whatsapp): enforce require_mention in group chats 2026-04-03 01:16:39 -07:00
23addf48d3 fix: allow running gateway service as root for LXC/container environments (#4732)
Previously, `hermes gateway install --system` hard-refused to create a
service running as root, even when explicitly requested via
`--run-as-user root`. This forced LXC/container users (where root is
the only user) to either create throwaway users or comment out the check
in source.

Changes:
- Auto-detected root (no explicit --run-as-user) still raises, but with
  a message explaining how to override
- Explicit `--run-as-user root` now allowed with a warning about
  security implications
- Interactive setup wizard prompt accepts 'root' as a valid username
  (warning comes from _system_service_identity downstream)
- Added tests for all three paths: auto-detected root rejection,
  explicit root allowance, and normal non-root passthrough
2026-04-03 01:14:21 -07:00
4d99305345 fix(cli): surface recent sessions inside /history and /resume
When /history is used in an empty chat or /resume with no argument,
show an inline table of recent resumable sessions with title, preview,
relative timestamp, and session ID instead of a dead-end message.

Table formatting matches the existing hermes sessions list style
(column headers + thin separators, no box drawing).

Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-04-03 00:50:49 -07:00
a933079564 fix: move class-level attribute after docstring, clarify throttle comment
Follow-up nits for salvaged PR #4577:
- Move _running_agents_ts class attribute below the docstring so
  GatewayRunner.__doc__ is preserved.
- Add clarifying comment explaining the throttle continue behavior
  (batches queued messages during the throttle interval).
2026-04-03 00:50:17 -07:00
0ed28ab80c refactor: simplify and harden PR fixes after review
- Fix cron ThreadPoolExecutor blocking on timeout: use shutdown(wait=False,
  cancel_futures=True) instead of context manager that waits indefinitely
- Extract _dequeue_pending_text() to deduplicate media-placeholder logic
  in interrupt and normal-completion dequeue paths
- Remove hasattr guards for _running_agents_ts: add class-level default
  so partial test construction works without scattered defensive checks
- Move `import concurrent.futures` to top of cron/scheduler.py
- Progress throttle: sleep remaining interval instead of busy-looping
  0.1s (~15 wakeups per 1.5s window → 1 wakeup)
- Deduplicate _load_stt_config() in transcription_tools.py:
  _has_openai_audio_backend() now delegates to _resolve_openai_audio_client_config()
2026-04-03 00:50:17 -07:00
28380e7aed fix(gateway): STT config resolution, stream consumer flood control fallback
Three targeted fixes from user-reported issues:

1. STT config resolution (transcription_tools.py):
   _has_openai_audio_backend() and _resolve_openai_audio_client_config()
   now check stt.openai.api_key/base_url in config.yaml FIRST, before
   falling back to env vars. Fixes voice transcription breaking when
   using a custom OpenAI-compatible endpoint via config.yaml.

2. Stream consumer flood control fallback (stream_consumer.py):
   When an edit fails mid-stream (e.g., Telegram flood control returns
   failure for waits >5s), reset _already_sent to False so the normal
   final send path delivers the complete response. Previously, a
   truncated partial was left as the final message.

3. Telegram edit_message comment alignment (telegram.py):
   Clarify that long flood waits return failure so streaming can fall
   back to a normal final send.
2026-04-03 00:50:17 -07:00
970042deab fix(gateway): prevent stuck sessions with agent timeout and staleness eviction
Three changes to prevent sessions from getting permanently locked:

1. Agent execution timeout (HERMES_AGENT_TIMEOUT, default 10min):
   Wraps run_in_executor with asyncio.wait_for so a hung API call or
   runaway tool can't lock a session indefinitely. On timeout, the
   agent is interrupted and the user gets an actionable error message.

2. Staleness eviction for _running_agents:
   Tracks start timestamps for each session entry. When a new message
   arrives and the entry is older than timeout + 1min grace, it's
   evicted as a leaked lock. Safety net for any cleanup path that
   fails to remove the entry.

3. Cron job timeout (HERMES_CRON_TIMEOUT, default 10min):
   Wraps run_conversation in a ThreadPoolExecutor with timeout so a
   hung cron job doesn't block the ticker thread (and all subsequent
   cron jobs) indefinitely.

Follows grammY runner's per-update timeout pattern and aiogram's
asyncio.wait_for approach for handler deadlines.
2026-04-03 00:50:17 -07:00
9bb83d1298 fix(gateway): downgrade empty/None response log from WARNING to DEBUG
This warning fires on every successful streamed response (streaming
delivers the text, handler returns None via already_sent=True) and
on every queued message during active processing. Both are expected
behavior, not error conditions. Downgrade to DEBUG to reduce log noise.
2026-04-03 00:50:17 -07:00
69f85a4dce fix(gateway): race condition, photo media loss, and flood control in Telegram
Three bugs causing intermittent silent drops, partial responses, and
flood control delays on the Telegram platform:

1. Race condition in handle_message() — _active_sessions was set inside
   the background task, not before create_task(). Two rapid messages
   could both pass the guard and spawn duplicate processing tasks.
   Fix: set _active_sessions synchronously before spawning the task
   (grammY sequentialize / aiogram EventIsolation pattern).

2. Photo media loss on dequeue — when a photo (no caption) was queued
   during active processing and later dequeued, only .text was
   extracted. Empty text → message silently dropped.
   Fix: _build_media_placeholder() creates text context for media-only
   events so they survive the dequeue path.

3. Progress message edits triggered Telegram flood control — rapid tool
   calls edited the progress message every 0.3s, hitting Telegram's
   rate limit (23s+ waits). This blocked progress updates and could
   cause stream consumer timeouts.
   Fix: throttle edits to 1.5s minimum interval, detect flood control
   errors and gracefully degrade to new messages. edit_message() now
   returns failure for flood waits >5s instead of blocking.
2026-04-03 00:50:17 -07:00
3659e1f0c2 test(acp): add E2E tests for MCP registration and tool-result reporting
Tests the full ACP flow:
- new_session with mcpServers → config conversion → register_mcp_servers
- prompt → tool_progress_callback → ToolCallStart events
- step_callback with results → ToolCallUpdate with rawOutput
- toolCallId pairing between start and completion events
- server names with slashes/dots sanitized correctly
- all session lifecycle methods (load/resume/fork) register MCP
2026-04-02 20:54:27 -07:00
21c2d32471 fix(gateway): normalize step_callback prev_tools for backward compat
The PR changed prev_tools from list[str] to list[dict] with name/result
keys.  The gateway's _step_callback_sync passed this directly to hooks
as 'tool_names', breaking user-authored hooks that call
', '.join(tool_names).

Now:
- 'tool_names' always contains strings (backward-compatible)
- 'tools' carries the enriched dicts for hooks that want results

Also adds summary logging to register_mcp_servers() and comprehensive
tests for all three PR changes:
- sanitize_mcp_name_component edge cases
- register_mcp_servers public API
- _register_session_mcp_servers ACP integration
- step_callback result forwarding
- gateway normalization backward compat
2026-04-02 20:54:27 -07:00
f66b3fe76b fix(acp): include tool results in step_callback for ACP tool_call_update events
The step_callback previously only forwarded tool names as strings,
so build_tool_complete received result=None and ACP tool_call_update
events had empty content/rawOutput. Now prev_tools carries dicts with
both name and result by pairing each tool_call with its matching
tool-role message via tool_call_id.
2026-04-02 20:54:27 -07:00
9aa82d4807 fix(acp): use raw server name as registry key, only sanitize for tool name prefixes 2026-04-02 20:54:27 -07:00
9b2fb1cc2e feat(acp): register client-provided MCP servers as agent tools
ACP clients pass MCP server definitions in session/new, load_session,
resume_session, and fork_session. Previously these were accepted but
silently ignored — the agent never connected to them.

This wires the mcp_servers parameter into the existing MCP registration
pipeline (tools/mcp_tool.py) so client-provided servers are connected,
their tools discovered, and the agent's tool surface refreshed before
the first prompt.

Changes:

tools/mcp_tool.py:
- Extract sanitize_mcp_name_component() to replace all non-[A-Za-z0-9_]
  characters (fixes crash when server names contain / or other chars
  that violate provider tool-name validation rules)
- Use it in _convert_mcp_schema, _sync_mcp_toolsets, _build_utility_schemas
- Extract register_mcp_servers(servers: dict) as a public API that takes
  an explicit {name: config} map. discover_mcp_tools() becomes a thin
  wrapper that loads config.yaml and calls register_mcp_servers()

acp_adapter/server.py:
- Add _register_session_mcp_servers() which converts ACP McpServerStdio /
  McpServerHttp / McpServerSse objects to Hermes MCP config dicts,
  registers them via asyncio.to_thread (avoids blocking the ACP event
  loop), then rebuilds agent.tools, valid_tool_names, and invalidates
  the cached system prompt
- Call it from new_session, load_session, resume_session, fork_session

Tested with Eden (theproxycompany.com) as ACP client — 5 MCP servers
(HTTP + stdio) registered successfully, 110 tools available to the agent.
2026-04-02 20:54:27 -07:00
29c98e8f83 feat(honcho): add configurable observation mode (unified/directional)
Adds observationMode config field to HonchoClientConfig:
- 'unified' (default): user peer self-observations, all agents share one pool
- 'directional': AI peer observes user, each agent keeps its own view

Changes:
- client.py: observation_mode field, _normalize_observation_mode(), config resolution
- session.py: add_peers respects mode (peer observation flags), dialectic_query
  routes through correct peer, create_conclusion uses correct observer
2026-04-02 20:38:36 -07:00
9e0fc62650 feat(honcho): restore full integration parity in memory provider plugin
Implements all features from the post-merge Honcho plugin spec:

B1: recall_mode support (context/tools/hybrid)
B2: peer_memory_mode gating (stub for ABC suppression mechanism)
B3: resolve_session_name() session key resolution
B4: first-turn context baking in system_prompt_block()
B5: cost-awareness (cadence, injection frequency, reasoning cap)
B6: memory file migration in initialize()
B7: pre-warming context at init

Ports from open PRs:
- #3265: token budget enforcement in prefetch()
- #4053: cron guard (skip activation for cron/flush sessions)
- #2645: baseUrl-only flow verified in is_available()
- #1969: aiPeer sync from SOUL.md
- #1957: lazy session init in tools mode

Single file change: plugins/memory/honcho/__init__.py
No modifications to client.py, session.py, or any files outside the plugin.
2026-04-02 20:38:36 -07:00
924bc67eee feat(memory): pluggable memory provider interface with profile isolation, review fixes, and honcho CLI restoration (#4623)
* feat(memory): add pluggable memory provider interface with profile isolation

Introduces a pluggable MemoryProvider ABC so external memory backends can
integrate with Hermes without modifying core files. Each backend becomes a
plugin implementing a standard interface, orchestrated by MemoryManager.

Key architecture:
- agent/memory_provider.py — ABC with core + optional lifecycle hooks
- agent/memory_manager.py — single integration point in the agent loop
- agent/builtin_memory_provider.py — wraps existing MEMORY.md/USER.md

Profile isolation fixes applied to all 6 shipped plugins:
- Cognitive Memory: use get_hermes_home() instead of raw env var
- Hindsight Memory: check $HERMES_HOME/hindsight/config.json first,
  fall back to legacy ~/.hindsight/ for backward compat
- Hermes Memory Store: replace hardcoded ~/.hermes paths with
  get_hermes_home() for config loading and DB path defaults
- Mem0 Memory: use get_hermes_home() instead of raw env var
- RetainDB Memory: auto-derive profile-scoped project name from
  hermes_home path (hermes-<profile>), explicit env var overrides
- OpenViking Memory: read-only, no local state, isolation via .env

MemoryManager.initialize_all() now injects hermes_home into kwargs so
every provider can resolve profile-scoped storage without importing
get_hermes_home() themselves.

Plugin system: adds register_memory_provider() to PluginContext and
get_plugin_memory_providers() accessor.

Based on PR #3825. 46 tests (37 unit + 5 E2E + 4 plugin registration).

* refactor(memory): drop cognitive plugin, rewrite OpenViking as full provider

Remove cognitive-memory plugin (#727) — core mechanics are broken:
decay runs 24x too fast (hourly not daily), prefetch uses row ID as
timestamp, search limited by importance not similarity.

Rewrite openviking-memory plugin from a read-only search wrapper into
a full bidirectional memory provider using the complete OpenViking
session lifecycle API:

- sync_turn: records user/assistant messages to OpenViking session
  (threaded, non-blocking)
- on_session_end: commits session to trigger automatic memory extraction
  into 6 categories (profile, preferences, entities, events, cases,
  patterns)
- prefetch: background semantic search via find() endpoint
- on_memory_write: mirrors built-in memory writes to the session
- is_available: checks env var only, no network calls (ABC compliance)

Tools expanded from 3 to 5:
- viking_search: semantic search with mode/scope/limit
- viking_read: tiered content (abstract ~100tok / overview ~2k / full)
- viking_browse: filesystem-style navigation (list/tree/stat)
- viking_remember: explicit memory storage via session
- viking_add_resource: ingest URLs/docs into knowledge base

Uses direct HTTP via httpx (no openviking SDK dependency needed).
Response truncation on viking_read to prevent context flooding.

* fix(memory): harden Mem0 plugin — thread safety, non-blocking sync, circuit breaker

- Remove redundant mem0_context tool (identical to mem0_search with
  rerank=true, top_k=5 — wastes a tool slot and confuses the model)
- Thread sync_turn so it's non-blocking — Mem0's server-side LLM
  extraction can take 5-10s, was stalling the agent after every turn
- Add threading.Lock around _get_client() for thread-safe lazy init
  (prefetch and sync threads could race on first client creation)
- Add circuit breaker: after 5 consecutive API failures, pause calls
  for 120s instead of hammering a down server every turn. Auto-resets
  after cooldown. Logs a warning when tripped.
- Track success/failure in prefetch, sync_turn, and all tool calls
- Wait for previous sync to finish before starting a new one (prevents
  unbounded thread accumulation on rapid turns)
- Clean up shutdown to join both prefetch and sync threads

* fix(memory): enforce single external memory provider limit

MemoryManager now rejects a second non-builtin provider with a warning.
Built-in memory (MEMORY.md/USER.md) is always accepted. Only ONE
external plugin provider is allowed at a time. This prevents tool
schema bloat (some providers add 3-5 tools each) and conflicting
memory backends.

The warning message directs users to configure memory.provider in
config.yaml to select which provider to activate.

Updated all 47 tests to use builtin + one external pattern instead
of multiple externals. Added test_second_external_rejected to verify
the enforcement.

* feat(memory): add ByteRover memory provider plugin

Implements the ByteRover integration (from PR #3499 by hieuntg81) as a
MemoryProvider plugin instead of direct run_agent.py modifications.

ByteRover provides persistent memory via the brv CLI — a hierarchical
knowledge tree with tiered retrieval (fuzzy text then LLM-driven search).
Local-first with optional cloud sync.

Plugin capabilities:
- prefetch: background brv query for relevant context
- sync_turn: curate conversation turns (threaded, non-blocking)
- on_memory_write: mirror built-in memory writes to brv
- on_pre_compress: extract insights before context compression

Tools (3):
- brv_query: search the knowledge tree
- brv_curate: store facts/decisions/patterns
- brv_status: check CLI version and context tree state

Profile isolation: working directory at $HERMES_HOME/byterover/ (scoped
per profile). Binary resolution cached with thread-safe double-checked
locking. All write operations threaded to avoid blocking the agent
(curate can take 120s with LLM processing).

* fix(memory): thread remaining sync_turns, fix holographic, add config key

Plugin fixes:
- Hindsight: thread sync_turn (was blocking up to 30s via _run_in_thread)
- RetainDB: thread sync_turn (was blocking on HTTP POST)
- Both: shutdown now joins sync threads alongside prefetch threads

Holographic retrieval fixes:
- reason(): removed dead intersection_key computation (bundled but never
  used in scoring). Now reuses pre-computed entity_residuals directly,
  moved role_content encoding outside the inner loop.
- contradict(): added _MAX_CONTRADICT_FACTS=500 scaling guard. Above
  500 facts, only checks the most recently updated ones to avoid O(n^2)
  explosion (~125K comparisons at 500 is acceptable).

Config:
- Added memory.provider key to DEFAULT_CONFIG ("" = builtin only).
  No version bump needed (deep_merge handles new keys automatically).

* feat(memory): extract Honcho as a MemoryProvider plugin

Creates plugins/honcho-memory/ as a thin adapter over the existing
honcho_integration/ package. All 4 Honcho tools (profile, search,
context, conclude) move from the normal tool registry to the
MemoryProvider interface.

The plugin delegates all work to HonchoSessionManager — no Honcho
logic is reimplemented. It uses the existing config chain:
$HERMES_HOME/honcho.json -> ~/.honcho/config.json -> env vars.

Lifecycle hooks:
- initialize: creates HonchoSessionManager via existing client factory
- prefetch: background dialectic query
- sync_turn: records messages + flushes to API (threaded)
- on_memory_write: mirrors user profile writes as conclusions
- on_session_end: flushes all pending messages

This is a prerequisite for the MemoryManager wiring in run_agent.py.
Once wired, Honcho goes through the same provider interface as all
other memory plugins, and the scattered Honcho code in run_agent.py
can be consolidated into the single MemoryManager integration point.

* feat(memory): wire MemoryManager into run_agent.py

Adds 8 integration points for the external memory provider plugin,
all purely additive (zero existing code modified):

1. Init (~L1130): Create MemoryManager, find matching plugin provider
   from memory.provider config, initialize with session context
2. Tool injection (~L1160): Append provider tool schemas to self.tools
   and self.valid_tool_names after memory_manager init
3. System prompt (~L2705): Add external provider's system_prompt_block
   alongside existing MEMORY.md/USER.md blocks
4. Tool routing (~L5362): Route provider tool calls through
   memory_manager.handle_tool_call() before the catchall handler
5. Memory write bridge (~L5353): Notify external provider via
   on_memory_write() when the built-in memory tool writes
6. Pre-compress (~L5233): Call on_pre_compress() before context
   compression discards messages
7. Prefetch (~L6421): Inject provider prefetch results into the
   current-turn user message (same pattern as Honcho turn context)
8. Turn sync + session end (~L8161, ~L8172): sync_all() after each
   completed turn, queue_prefetch_all() for next turn, on_session_end()
   + shutdown_all() at conversation end

All hooks are wrapped in try/except — a failing provider never breaks
the agent. The existing memory system, Honcho integration, and all
other code paths are completely untouched.

Full suite: 7222 passed, 4 pre-existing failures.

* refactor(memory): remove legacy Honcho integration from core

Extracts all Honcho-specific code from run_agent.py, model_tools.py,
toolsets.py, and gateway/run.py. Honcho is now exclusively available
as a memory provider plugin (plugins/honcho-memory/).

Removed from run_agent.py (-457 lines):
- Honcho init block (session manager creation, activation, config)
- 8 Honcho methods: _honcho_should_activate, _strip_honcho_tools,
  _activate_honcho, _register_honcho_exit_hook, _queue_honcho_prefetch,
  _honcho_prefetch, _honcho_save_user_observation, _honcho_sync
- _inject_honcho_turn_context module-level function
- Honcho system prompt block (tool descriptions, CLI commands)
- Honcho context injection in api_messages building
- Honcho params from __init__ (honcho_session_key, honcho_manager,
  honcho_config)
- HONCHO_TOOL_NAMES constant
- All honcho-specific tool dispatch forwarding

Removed from other files:
- model_tools.py: honcho_tools import, honcho params from handle_function_call
- toolsets.py: honcho toolset definition, honcho tools from core tools list
- gateway/run.py: honcho params from AIAgent constructor calls

Removed tests (-339 lines):
- 9 Honcho-specific test methods from test_run_agent.py
- TestHonchoAtexitFlush class from test_exit_cleanup_interrupt.py

Restored two regex constants (_SURROGATE_RE, _BUDGET_WARNING_RE) that
were accidentally removed during the honcho function extraction.

The honcho_integration/ package is kept intact — the plugin delegates
to it. tools/honcho_tools.py registry entries are now dead code (import
commented out in model_tools.py) but the file is preserved for reference.

Full suite: 7207 passed, 4 pre-existing failures. Zero regressions.

* refactor(memory): restructure plugins, add CLI, clean gateway, migration notice

Plugin restructure:
- Move all memory plugins from plugins/<name>-memory/ to plugins/memory/<name>/
  (byterover, hindsight, holographic, honcho, mem0, openviking, retaindb)
- New plugins/memory/__init__.py discovery module that scans the directory
  directly, loading providers by name without the general plugin system
- run_agent.py uses load_memory_provider() instead of get_plugin_memory_providers()

CLI wiring:
- hermes memory setup — interactive curses picker + config wizard
- hermes memory status — show active provider, config, availability
- hermes memory off — disable external provider (built-in only)
- hermes honcho — now shows migration notice pointing to hermes memory setup

Gateway cleanup:
- Remove _get_or_create_gateway_honcho (already removed in prev commit)
- Remove _shutdown_gateway_honcho and _shutdown_all_gateway_honcho methods
- Remove all calls to shutdown methods (4 call sites)
- Remove _honcho_managers/_honcho_configs dict references

Dead code removal:
- Delete tools/honcho_tools.py (279 lines, import was already commented out)
- Delete tests/gateway/test_honcho_lifecycle.py (131 lines, tested removed methods)
- Remove if False placeholder from run_agent.py

Migration:
- Honcho migration notice on startup: detects existing honcho.json or
  ~/.honcho/config.json, prints guidance to run hermes memory setup.
  Only fires when memory.provider is not set and not in quiet mode.

Full suite: 7203 passed, 4 pre-existing failures. Zero regressions.

* feat(memory): standardize plugin config + add per-plugin documentation

Config architecture:
- Add save_config(values, hermes_home) to MemoryProvider ABC
- Honcho: writes to $HERMES_HOME/honcho.json (SDK native)
- Mem0: writes to $HERMES_HOME/mem0.json
- Hindsight: writes to $HERMES_HOME/hindsight/config.json
- Holographic: writes to config.yaml under plugins.hermes-memory-store
- OpenViking/RetainDB/ByteRover: env-var only (default no-op)

Setup wizard (hermes memory setup):
- Now calls provider.save_config() for non-secret config
- Secrets still go to .env via env vars
- Only memory.provider activation key goes to config.yaml

Documentation:
- README.md for each of the 7 providers in plugins/memory/<name>/
- Requirements, setup (wizard + manual), config reference, tools table
- Consistent format across all providers

The contract for new memory plugins:
- get_config_schema() declares all fields (REQUIRED)
- save_config() writes native config (REQUIRED if not env-var-only)
- Secrets use env_var field in schema, written to .env by wizard
- README.md in the plugin directory

* docs: add memory providers user guide + developer guide

New pages:
- user-guide/features/memory-providers.md — comprehensive guide covering
  all 7 shipped providers (Honcho, OpenViking, Mem0, Hindsight,
  Holographic, RetainDB, ByteRover). Each with setup, config, tools,
  cost, and unique features. Includes comparison table and profile
  isolation notes.
- developer-guide/memory-provider-plugin.md — how to build a new memory
  provider plugin. Covers ABC, required methods, config schema,
  save_config, threading contract, profile isolation, testing.

Updated pages:
- user-guide/features/memory.md — replaced Honcho section with link to
  new Memory Providers page
- user-guide/features/honcho.md — replaced with migration redirect to
  the new Memory Providers page
- sidebars.ts — added both new pages to navigation

* fix(memory): auto-migrate Honcho users to memory provider plugin

When honcho.json or ~/.honcho/config.json exists but memory.provider
is not set, automatically set memory.provider: honcho in config.yaml
and activate the plugin. The plugin reads the same config files, so
all data and credentials are preserved. Zero user action needed.

Persists the migration to config.yaml so it only fires once. Prints
a one-line confirmation in non-quiet mode.

* fix(memory): only auto-migrate Honcho when enabled + credentialed

Check HonchoClientConfig.enabled AND (api_key OR base_url) before
auto-migrating — not just file existence. Prevents false activation
for users who disabled Honcho, stopped using it (config lingers),
or have ~/.honcho/ from a different tool.

* feat(memory): auto-install pip dependencies during hermes memory setup

Reads pip_dependencies from plugin.yaml, checks which are missing,
installs them via pip before config walkthrough. Also shows install
guidance for external_dependencies (e.g. brv CLI for ByteRover).

Updated all 7 plugin.yaml files with pip_dependencies:
- honcho: honcho-ai
- mem0: mem0ai
- openviking: httpx
- hindsight: hindsight-client
- holographic: (none)
- retaindb: requests
- byterover: (external_dependencies for brv CLI)

* fix: remove remaining Honcho crash risks from cli.py and gateway

cli.py: removed Honcho session re-mapping block (would crash importing
deleted tools/honcho_tools.py), Honcho flush on compress, Honcho
session display on startup, Honcho shutdown on exit, honcho_session_key
AIAgent param.

gateway/run.py: removed honcho_session_key params from helper methods,
sync_honcho param, _honcho.shutdown() block.

tests: fixed test_cron_session_with_honcho_key_skipped (was passing
removed honcho_key param to _flush_memories_for_session).

* fix: include plugins/ in pyproject.toml package list

Without this, plugins/memory/ wouldn't be included in non-editable
installs. Hermes always runs from the repo checkout so this is belt-
and-suspenders, but prevents breakage if the install method changes.

* fix(memory): correct pip-to-import name mapping for dep checks

The heuristic dep.replace('-', '_') fails for packages where the pip
name differs from the import name: honcho-ai→honcho, mem0ai→mem0,
hindsight-client→hindsight_client. Added explicit mapping table so
hermes memory setup doesn't try to reinstall already-installed packages.

* chore: remove dead code from old plugin memory registration path

- hermes_cli/plugins.py: removed register_memory_provider(),
  _memory_providers list, get_plugin_memory_providers() — memory
  providers now use plugins/memory/ discovery, not the general plugin system
- hermes_cli/main.py: stripped 74 lines of dead honcho argparse
  subparsers (setup, status, sessions, map, peer, mode, tokens,
  identity, migrate) — kept only the migration redirect
- agent/memory_provider.py: updated docstring to reflect new
  registration path
- tests: replaced TestPluginMemoryProviderRegistration with
  TestPluginMemoryDiscovery that tests the actual plugins/memory/
  discovery system. Added 3 new tests (discover, load, nonexistent).

* chore: delete dead honcho_integration/cli.py and its tests

cli.py (794 lines) was the old 'hermes honcho' command handler — nobody
calls it since cmd_honcho was replaced with a migration redirect.

Deleted tests that imported from removed code:
- tests/honcho_integration/test_cli.py (tested _resolve_api_key)
- tests/honcho_integration/test_config_isolation.py (tested CLI config paths)
- tests/tools/test_honcho_tools.py (tested the deleted tools/honcho_tools.py)

Remaining honcho_integration/ files (actively used by the plugin):
- client.py (445 lines) — config loading, SDK client creation
- session.py (991 lines) — session management, queries, flush

* refactor: move honcho_integration/ into the honcho plugin

Moves client.py (445 lines) and session.py (991 lines) from the
top-level honcho_integration/ package into plugins/memory/honcho/.
No Honcho code remains in the main codebase.

- plugins/memory/honcho/client.py — config loading, SDK client creation
- plugins/memory/honcho/session.py — session management, queries, flush
- Updated all imports: run_agent.py (auto-migration), hermes_cli/doctor.py,
  plugin __init__.py, session.py cross-import, all tests
- Removed honcho_integration/ package and pyproject.toml entry
- Renamed tests/honcho_integration/ → tests/honcho_plugin/

* docs: update architecture + gateway-internals for memory provider system

- architecture.md: replaced honcho_integration/ with plugins/memory/
- gateway-internals.md: replaced Honcho-specific session routing and
  flush lifecycle docs with generic memory provider interface docs

* fix: update stale mock path for resolve_active_host after honcho plugin migration

* fix(memory): address review feedback — P0 lifecycle, ABC contract, honcho CLI restore

Review feedback from Honcho devs (erosika):

P0 — Provider lifecycle:
- Remove on_session_end() + shutdown_all() from run_conversation() tail
  (was killing providers after every turn in multi-turn sessions)
- Add shutdown_memory_provider() method on AIAgent for callers
- Wire shutdown into CLI atexit, reset_conversation, gateway stop/expiry

Bug fixes:
- Remove sync_honcho=False kwarg from /btw callsites (TypeError crash)
- Fix doctor.py references to dead 'hermes honcho setup' command
- Cache prefetch_all() before tool loop (was re-calling every iteration)

ABC contract hardening (all backwards-compatible):
- Add session_id kwarg to prefetch/sync_turn/queue_prefetch
- Make on_pre_compress() return str (provider insights in compression)
- Add **kwargs to on_turn_start() for runtime context
- Add on_delegation() hook for parent-side subagent observation
- Document agent_context/agent_identity/agent_workspace kwargs on
  initialize() (prevents cron corruption, enables profile scoping)
- Fix docstring: single external provider, not multiple

Honcho CLI restoration:
- Add plugins/memory/honcho/cli.py (from main's honcho_integration/cli.py
  with imports adapted to plugin path)
- Restore full hermes honcho command with all subcommands (status, peer,
  mode, tokens, identity, enable/disable, sync, peers, --target-profile)
- Restore auto-clone on profile creation + sync on hermes update
- hermes honcho setup now redirects to hermes memory setup

* fix(memory): wire on_delegation, skip_memory for cron/flush, fix ByteRover return type

- Wire on_delegation() in delegate_tool.py — parent's memory provider
  is notified with task+result after each subagent completes
- Add skip_memory=True to cron scheduler (prevents cron system prompts
  from corrupting user representations — closes #4052)
- Add skip_memory=True to gateway flush agent (throwaway agent shouldn't
  activate memory provider)
- Fix ByteRover on_pre_compress() return type: None -> str

* fix(honcho): port profile isolation fixes from PR #4632

Ports 5 bug fixes found during profile testing (erosika's PR #4632):

1. 3-tier config resolution — resolve_config_path() now checks
   $HERMES_HOME/honcho.json → ~/.hermes/honcho.json → ~/.honcho/config.json
   (non-default profiles couldn't find shared host blocks)

2. Thread host=_host_key() through from_global_config() in cmd_setup,
   cmd_status, cmd_identity (--target-profile was being ignored)

3. Use bare profile name as aiPeer (not host key with dots) — Honcho's
   peer ID pattern is ^[a-zA-Z0-9_-]+$, dots are invalid

4. Wrap add_peers() in try/except — was fatal on new AI peers, killed
   all message uploads for the session

5. Gate Honcho clone behind --clone/--clone-all on profile create
   (bare create should be blank-slate)

Also: sanitize assistant_peer_id via _sanitize_id()

* fix(tests): add module cleanup fixture to test_cli_provider_resolution

test_cli_provider_resolution._import_cli() wipes tools.*, cli, and
run_agent from sys.modules to force fresh imports, but had no cleanup.
This poisoned all subsequent tests on the same xdist worker — mocks
targeting tools.file_tools, tools.send_message_tool, etc. patched the
NEW module object while already-imported functions still referenced
the OLD one. Caused ~25 cascade failures: send_message KeyError,
process_registry FileNotFoundError, file_read_guards timeouts,
read_loop_detection file-not-found, mcp_oauth None port, and
provider_parity/codex_execution stale tool lists.

Fix: autouse fixture saves all affected modules before each test and
restores them after, matching the pattern in
test_managed_browserbase_and_modal.py.
2026-04-02 15:33:51 -07:00
e0b2bdb089 fix: webhook platform support — skip home channel prompt, disable tool progress (salvage #4363) (#4660)
Cherry-picked from PR #4363 by @bennyhodl with follow-up fixes:

- Skip 'No home channel' prompt for webhook platform (webhooks deliver
  to configured targets, not a home channel)
- Disable tool progress for webhooks (no message editing support)
- Add webhook to PLATFORMS in tools_config.py and skills_config.py
- Add hermes-webhook toolset to toolsets.py + hermes-gateway includes
- Removed overly aggressive <50 char content filter that blocked
  legitimate short responses (tool progress already handled at source)

Co-authored-by: bennyhodl <bennyhodl@users.noreply.github.com>
2026-04-02 14:00:22 -07:00
6d68fbf756 Merge pull request #4654 from SHL0MS/skill/research-paper-writing
Replace ml-paper-writing with research-paper-writing: full end-to-end research pipeline
2026-04-02 13:24:12 -07:00
b86647c295 Replace ml-paper-writing with research-paper-writing: full research pipeline skill
Replaces the writing-focused ml-paper-writing skill (940 lines) with a
complete end-to-end research paper pipeline (1,599 lines SKILL.md + 3,184
lines across 7 reference files).

New content:
- Full 8-phase pipeline: project setup, literature review, experiment
  design, execution/monitoring, analysis, paper drafting, review/revision,
  submission preparation
- Iterative refinement strategy guide from autoreason research (when to use
  autoreason vs critique-and-revise vs single-pass, model selection)
- Hermes agent integration: delegate_task parallel drafting, cronjob
  monitoring, memory/todo state management, skill composition
- Professional LaTeX tooling: microtype, siunitx, TikZ diagram patterns,
  algorithm2e, subcaption, latexdiff, SciencePlots
- Human evaluation design: annotation protocols, inter-annotator agreement,
  crowdsourcing platforms
- Title, Figure 1, conclusion, appendix strategy, page budget management
- Anonymization checklist, rebuttal writing, camera-ready preparation
- AAAI and COLM venue coverage (checklists, reviewer guidelines)

Preserved from ml-paper-writing:
- All writing philosophy (Nanda, Farquhar, Gopen & Swan, Lipton, Perez)
- Citation verification workflow (5-step mandatory process)
- All 6 conference templates (NeurIPS, ICML, ICLR, ACL, AAAI, COLM)
- Conference requirements, format conversion workflow
- Proactivity/collaboration guidance

Bug fixes in inherited reference files:
- BibLaTeX recommendation now correctly says natbib for conferences
- Bare except clauses fixed to except Exception
- Jinja2 template tags removed from citation-workflow.md
- Stale date caveats added to reviewer-guidelines.md
2026-04-02 16:13:26 -04:00
798a7b99e4 docs: add Configuration Options section to Slack docs (#4644)
* docs: add Configuration Options section to Slack docs

Documents all config.yaml options for the Slack bot:
- Thread & reply behavior (reply_to_mode, reply_broadcast)
- Session isolation (group_sessions_per_user)
- Mention & trigger behavior (require_mention, mention_patterns, reply_prefix)
- Unauthorized user handling (unauthorized_dm_behavior)
- Voice transcription (stt_enabled)
- Full example config showing all options together

Includes a note about Slack's hardcoded @mention requirement in channels
(no free_response_channels equivalent like Discord/Telegram).

* docs: consolidate reply_in_thread into Configuration Options section

Folds the standalone Reply Threading subsection from PR #4643 into
the Thread & Reply Behavior subsection, keeping all config options
in one place. Adds reply_in_thread to the table and full example.
2026-04-02 12:38:13 -07:00
d2b08406a4 fix(agent): classify think-only empty responses before retrying 2026-04-02 12:29:18 -07:00
241cbeeccd docs: add reply_in_thread config to Slack docs 2026-04-02 12:18:40 -07:00
b9a968c1de feat(slack): add reply_in_thread config option
By default, Hermes always threads replies to channel messages. Teams
that prefer direct channel replies had no way to opt out without
patching the source.

Add a reply_in_thread option (default: true) to the Slack platform
extra config:

  platforms:
    slack:
      extra:
        reply_in_thread: false

When false, _resolve_thread_ts() returns None for top-level channel
messages, so replies go directly to the channel. Messages already
inside an existing thread are still replied in-thread to preserve
conversation context. Default is true for full backward compatibility.
2026-04-02 12:18:40 -07:00
d89cc7fec1 feat(prompt): add Google model operational guidance for Gemini and Gemma (#4641)
Adapted from OpenCode's gemini.txt. Gemini and Gemma models now get
structured operational directives alongside tool-use enforcement:
absolute paths, verify-before-edit, dependency checks, conciseness,
parallel tool calls, non-interactive flags, autonomous execution.

Based on PR #4026, extended to cover Gemma models.
2026-04-02 11:52:34 -07:00
3186668799 feat: per-turn primary runtime restoration and transport recovery (#4624)
Makes provider fallback turn-scoped in long-lived CLI sessions. Previously, a single transient failure pinned the session to the fallback provider for every subsequent turn.

- _primary_runtime dict snapshot at __init__ (model, provider, base_url, api_mode, client_kwargs, compressor state)
- _restore_primary_runtime() at top of run_conversation() — restores all state, resets fallback chain index
- _try_recover_primary_transport() — one extra recovery cycle (client rebuild + cooldown) for transient transport errors on direct endpoints before fallback
- Skipped for aggregator providers (OpenRouter, Nous)
- 25 tests

Inspired by #4612 (@betamod). Closes #4612.
2026-04-02 10:52:01 -07:00
918d593544 chore: gitignore generated skills.json
Follow-up to #4500 — the extraction script generates this file at
build time, so it should not be committed.
2026-04-02 10:48:15 -07:00
b8dd059c40 feat(website): add skills browse and search page to docs (#4500)
Adds a Skills Hub page to the documentation site with browsable/searchable catalog of all skills (built-in, optional, and community from cached hub indexes).

- Python extraction script (website/scripts/extract-skills.py) parses SKILL.md frontmatter and hub index caches into skills.json
- React page (website/src/pages/skills/) with search, category filtering, source filtering, and expandable skill cards
- CI workflow updated to run extraction before Docusaurus build
- Deploy trigger expanded to include skills/ and optional-skills/ changes

Authored by @IAvecilla
2026-04-02 10:47:38 -07:00
20441cf2c8 fix(insights): persist token usage for non-CLI sessions 2026-04-02 10:47:13 -07:00
585855d2ca fix: preserve Anthropic thinking block signatures across tool-use turns
Anthropic extended thinking blocks include an opaque 'signature' field
required for thinking chain continuity across multi-turn tool-use
conversations. Previously, normalize_anthropic_response() extracted
only the thinking text and set reasoning_details=None, discarding the
signature. On subsequent turns the API could not verify the chain.

Changes:
- _to_plain_data(): new recursive SDK-to-dict converter with depth cap
  (20 levels) and path-based cycle detection for safety
- _extract_preserved_thinking_blocks(): rehydrates preserved thinking
  blocks (including signature) from reasoning_details on assistant
  messages, placing them before tool_use blocks as Anthropic requires
- normalize_anthropic_response(): stores full thinking blocks in
  reasoning_details via _to_plain_data()
- _extract_reasoning(): adds 'thinking' key to the detail lookup chain
  so Anthropic-format details are found alongside OpenRouter format

Salvaged from PR #4503 by @priveperfumes — focused on the thinking
block continuity fix only (cache strategy and other changes excluded).
2026-04-02 10:30:32 -07:00
28a073edc6 fix: repair OpenCode model routing and selection (#4508)
OpenCode Zen and Go are mixed-API-surface providers — different models
behind them use different API surfaces (GPT on Zen uses codex_responses,
Claude on Zen uses anthropic_messages, MiniMax on Go uses
anthropic_messages, GLM/Kimi on Go use chat_completions).

Changes:
- Add normalize_opencode_model_id() and opencode_model_api_mode() to
  models.py for model ID normalization and API surface routing
- Add _provider_supports_explicit_api_mode() to runtime_provider.py
  to prevent stale api_mode from leaking across provider switches
- Wire opencode routing into all three api_mode resolution paths:
  pool entry, api_key provider, and explicit runtime
- Add api_mode field to ModelSwitchResult for propagation through the
  switch pipeline
- Consolidate _PROVIDER_MODELS from main.py into models.py (single
  source of truth, eliminates duplicate dict)
- Add opencode normalization to setup wizard and model picker flows
- Add opencode block to _normalize_model_for_provider in CLI
- Add opencode-zen/go fallback model lists to setup.py

Tests: 160 targeted tests pass (26 new tests covering normalization,
api_mode routing per provider/model, persistence, and setup wizard
normalization).

Based on PR #3017 by SaM13997.

Co-authored-by: SaM13997 <139419381+SaM13997@users.noreply.github.com>
2026-04-02 09:36:24 -07:00
f4f64c413f fix(cli): ensure zero exit code on successful quiet mode queries (#4601) 2026-04-02 09:33:31 -07:00
8dc5b11e95 fix(honcho): remove redundant local HOST import in _all_profile_host_configs
HOST is already imported at module level from honcho_integration.client.
The local import inside _all_profile_host_configs() was unnecessary.
2026-04-02 09:25:16 -07:00
37d73d94bb fix: patch _local_config_path in tests for write isolation 2026-04-02 09:25:16 -07:00
a0eae33248 fix(honcho): address PR review findings
- Remove duplicate cmd_sync definition (kept version with error output)
- Fix from_env workspace to stay shared (hermes) not profile-derived
- Add docstring clarifying get_or_create is idempotent in status
- Remove unused import importlib in test
- Fix test assertion for shared workspace in from_env path
- Add 3 tests for sync_honcho_profiles_quiet
2026-04-02 09:25:16 -07:00
c146631e3b feat(honcho): sync command + auto-sync on hermes update
- hermes honcho sync: scan all profiles, create missing host blocks
- hermes update: automatically syncs Honcho config to all profiles
  after skill sync (existing users get profile mapping on next update)
- sync_honcho_profiles_quiet() for silent use from update path
2026-04-02 09:25:16 -07:00
89eab74c67 feat(honcho): --target-profile flag + peer card display in status
- hermes honcho --target-profile <name> <command>: target another
  profile's Honcho config without switching profiles. Works with all
  subcommands (status, peer, mode, tokens, enable, disable, etc.)
- hermes honcho status now shows user peer card and AI peer
  representation when connected (fetched live from Honcho API)
2026-04-02 09:25:16 -07:00
5f6bf2a473 fix(honcho): share workspace across profiles by default
Profiles inherit the default workspace instead of deriving a separate
one. All profiles see the same user context, sessions, and project
history. Each profile is a different AI peer in a shared space.

Workspace can still be overridden per-profile via config if isolation
is needed.
2026-04-02 09:25:16 -07:00
f27da5fe8e fix(honcho): remove linkedHosts from peers table 2026-04-02 09:25:16 -07:00
0e90df1216 feat(honcho): eager peer creation + enable/disable per profile
- Eagerly create AI and user peers in Honcho when a profile is created
  (not deferred to first message). Uses idempotent peer() SDK call.
- hermes honcho enable: turn on Honcho for active profile, clone
  settings from default if first time, create peer immediately
- hermes honcho disable: turn off Honcho for active profile
- _ensure_peer_exists() helper for idempotent peer creation
2026-04-02 09:25:16 -07:00
37458e72a2 feat(honcho): auto-clone config to new profiles on creation
When a profile is created and Honcho is already configured on the
default host, automatically creates a host block for the new profile
with inherited settings (memory mode, recall mode, write frequency,
peer name, etc.) and auto-derived workspace/aiPeer.

Zero-friction path: hermes profile create coder -> Honcho config
cloned as hermes.coder with all settings inherited.
2026-04-02 09:25:16 -07:00
d1189f2be9 feat(honcho): add cross-profile observability for Honcho integration
- hermes honcho status: shows active profile name + host key
- hermes honcho status --all: compact table of all profiles with mode,
  recall, write frequency per host block
- hermes honcho peers: cross-profile peer identity table (user peer,
  AI peer, linked hosts)
- All write commands (peer, mode, tokens) print [host_key] label when
  operating on a non-default profile
2026-04-02 09:25:16 -07:00
18c156af8e feat(honcho): scope host and peer resolution to active Hermes profile
Derives the Honcho host key from the active Hermes profile so that each
profile gets its own Honcho host block, workspace, and AI peer identity.

Profile "coder" resolves to host "hermes.coder", reads from
hosts["hermes.coder"] in honcho.json, and defaults workspace + aiPeer
to the derived host name.

Resolution order: HERMES_HONCHO_HOST env var > active profile name >
"hermes" (default).

Complements #3681 (profiles) with the Honcho identity layer that was
part of #2845 (named instances), adapted to the merged profiles system.
2026-04-02 09:25:16 -07:00
661a1b0ba2 fix: exclude matrix from [all] extras — python-olm is upstream-broken (#4615)
python-olm (required by matrix-nio[e2e]) fails to build on modern macOS:
- CMake 4 rejects vendored libolm's cmake_minimum_required(VERSION 3.4)
- Apple Clang 21+ rejects a C++ type error in include/olm/list.hh
- Upstream libolm repo is archived, no fix forthcoming

Including matrix in [all] causes the entire extras install to fail during
`hermes update`, silently dropping all other extras (telegram, discord,
slack, cron, etc.) when the fallback kicks in.

The [matrix] extra is preserved for opt-in install:
  pip install 'hermes-agent[matrix]'

Closes #4178
2026-04-02 09:21:37 -07:00
acea9ee20b fix(tests): fix 11 real test failures + major cascade poisoner (#4570)
Three root causes addressed:

1. AIAgent no longer defaults base_url to OpenRouter (9 tests)
   Tests that assert OpenRouter-specific behavior (prompt caching,
   reasoning extra_body, provider preferences) need explicit base_url
   and model set on the agent. Updated test_run_agent.py and
   test_provider_parity.py.

2. Credential pool auto-seeding from host env (2 tests)
   test_auxiliary_client.py tests for Anthropic OAuth and custom
   endpoint fallback were not mocking _select_pool_entry, so the
   host's credential pool interfered. Added pool + codex mocks.

3. sys.modules corruption cascade (major - ~250 tests)
   test_managed_modal_environment.py replaced sys.modules entries
   (tools, hermes_cli, agent packages) with SimpleNamespace stubs
   but had NO cleanup fixture. Every subsequent test in the process
   saw corrupted imports: 'cannot import get_config_path from
   <unknown module name>' and 'module tools has no attribute
   environments'. Added _restore_tool_and_agent_modules autouse
   fixture matching the pattern in test_managed_browserbase_and_modal.py.

   This was also the root cause of CI failures (104 failed on main).
2026-04-02 08:43:06 -07:00
624ad582a5 fix: make gateway approval block agent thread like CLI does (#4557)
The gateway's dangerous command approval system was fundamentally broken:
the agent loop continued running after a command was flagged, and the
approval request only reached the user after the agent finished its
entire conversation loop. By then the context was lost.

This change makes the gateway approval mirror the CLI's synchronous
behavior. When a dangerous command is detected:

1. The agent thread blocks on a threading.Event
2. The approval request is sent to the user immediately
3. The user responds with /approve or /deny
4. The event is signaled and the agent resumes with the real result

The agent never sees 'approval_required' as a tool result. It either
gets the command output (approved) or a definitive BLOCKED message
(denied/timed out) — same as CLI mode.

Queue-based design supports multiple concurrent approvals (parallel
subagents via delegate_task, execute_code RPC handlers). Each approval
gets its own _ApprovalEntry with its own threading.Event. /approve
resolves the oldest (FIFO); /approve all resolves all at once.

Changes:
- tools/approval.py: Queue-based per-session blocking gateway approval
  (register/unregister callbacks, resolve with FIFO or all-at-once)
- gateway/run.py: Register approval callback in run_sync(), remove
  post-loop pop_pending hack, /approve and /deny support 'all' flag
- tests: 21 tests including parallel subagent E2E scenarios
2026-04-02 01:47:19 -07:00
64584a931f cleanup: use _generate_session_key for parent key, fix trailing whitespace 2026-04-02 01:33:53 -07:00
8cb3596939 fix(gateway): seed DM thread sessions with parent transcript to preserve context 2026-04-02 01:33:53 -07:00
e94b4b2b40 fix: preserve allowed_users during setup reconfigure and quiet unconfigured provider warnings
Setup wizard now shows existing allowed_users when reconfiguring a
platform and preserves them if the user presses Enter. Previously the
wizard would display a misleading "No allowlist set" warning even when
the .env still held the original IDs.

Also downgrades the "provider X has no API key configured" log from
WARNING to DEBUG in resolve_provider_client — callers already handle
the None return with their own contextual messages. This eliminates
noisy startup warnings for providers in the fallback chain that the
user never configured (e.g. minimax).
2026-04-02 01:00:29 -07:00
835defe074 fix: invalidate update cache for all profiles, not just current
hermes update only cleared .update_check for the active HERMES_HOME,
leaving other profiles showing stale 'N commits behind' in their banner.

Now _invalidate_update_cache() iterates over ~/.hermes/ (default) plus
every directory under ~/.hermes/profiles/ to clear all caches. The git
repo is shared across profiles so a single update brings them all current.

Reported by SteveSkedasticity on Discord.
2026-04-02 00:49:17 -07:00
e4db72ef39 fix: merge dotted+hyphenated FTS5 quoting into single pass
The original PR applied dotted and hyphenated regex quoting in two
sequential steps.  For terms with both dots and hyphens (e.g.
my-app.config.ts), step 2 would re-match inside already-quoted output,
producing malformed double-quoted FTS5 syntax.

Merged into a single regex pass: \w+(?:[.-]\w+)+ — handles dots,
hyphens, and mixed terms in one shot.  Added test coverage for the
mixed case.
2026-04-02 00:49:11 -07:00
9825cd7b1e fix(state): quote dotted terms in FTS5 queries
FTS5 queries containing dots (e.g. P2.2, simulate.p2.test.ts) can trigger query parse edge cases that yield OperationalError or empty results unless quoted. Extend _sanitize_fts5_query to wrap dotted tokens in double quotes (similar to hyphenated terms) and add regression tests.
2026-04-02 00:49:11 -07:00
c4e626b1fa refactor: extract _detect_file_drop() + add 28 tests
Extract the inline file-drop detection logic into a standalone
_detect_file_drop() function at module level for testability. The main
loop now calls this function instead of inlining the logic.

Tests cover:
- Slash commands still route correctly (/help, /quit, /xyz)
- Image paths auto-detected (.png, .jpg, .gif, etc.)
- Non-image files detected (.py, .txt, Makefile, etc.)
- Backslash-escaped spaces from macOS drag-and-drop
- Trailing user text preserved as remainder
- Edge cases: directories, symlinks, no-extension files
- Non-string input, empty strings, nonexistent paths
2026-04-02 00:40:27 -07:00
1841886898 fix(cli): detect dragged file paths instead of treating them as slash commands
When a user drags a file into the terminal, macOS pastes the absolute
path (e.g. /Users/roland/Desktop/Screenshot.png) which starts with '/'
and was incorrectly routed to process_command(), producing an 'Unknown
command' error.

This change adds file-path detection before the slash-command check:
- Parses the first token, handling backslash-escaped spaces from macOS
- Checks if the path exists as a real file via Path.exists()
- Image files (.png, .jpg, etc.) are auto-attached to the message
- Non-image files are reformatted as [User attached file: ...] context
- Falls through to normal slash-command handling if not a real file path
2026-04-02 00:40:27 -07:00
f4bc6aa856 fix: scope extras retry to [all] group only
_load_installable_optional_extras() was returning ALL extras from
pyproject.toml except 'all', which included 'rl' and 'yc-bench' —
extras not referenced by [all] that install heavy research deps
(atroposlib, tinker, wandb) from git repos. Changed to parse the
[all] group's references and only retry those 18 extras.

Also moved tomllib import to function-level since it only runs
during the rare fallback path.
2026-04-02 00:40:07 -07:00
c91f4ef4ed fix(update): preserve optional extras during fallback install 2026-04-02 00:40:07 -07:00
5101f853ba Merge pull request #3287 from NousResearch/rewbs/tool-use-charge-to-subscription 2026-04-01 18:42:47 -07:00
a0f5fc2570 fix(tools): add debug logging for token refresh and tighten domain check
- Add logger + debug log to read_nous_access_token() catch-all so token
  refresh failures are observable instead of silently swallowed
- Tighten _is_nous_auxiliary_client() domain check to use proper URL
  hostname parsing instead of substring match, preventing false-positives
  on domains like not-nousresearch.com or nousresearch.com.evil.com
2026-04-02 12:40:03 +11:00
Ben
647f99d4dd fix: resolve post-merge issues in auxiliary_client and model flow
- Add missing `from agent.credential_pool import load_pool` import to
  auxiliary_client.py (introduced by the credential pool feature in main)
- Thread `args` through `select_provider_and_model(args=None)` so TLS
  options from `cmd_model` reach `_model_flow_nous`
- Mock `_require_tty` in test_cmd_model_forwards_nous_login_tls_options
  so it can run in non-interactive test environments

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-02 00:50:40 +00:00
a2e56d044b Merge branch 'main' into rewbs/tool-use-charge-to-subscription 2026-04-02 11:00:35 +11:00
bd9e0b605f test(e2e): remove section separator comments 2026-04-01 15:23:52 -07:00
99e6f44204 test(e2e): remove unused imports and duplicate fixtures 2026-04-01 15:23:52 -07:00
1f1297f56c ci: merge e2e into tests workflow as separate job
Move e2e tests into tests.yml as a parallel job instead of a separate
workflow. Unit tests now also ignore tests/e2e/ to avoid running them
twice. Both jobs appear as independent checks in the PR.
2026-04-01 15:23:52 -07:00
04e60cfacd test(e2e): add authorization, session lifecycle, and resilience tests
New test classes:
- TestSessionLifecycle: /new then /status sequence, idempotent resets
- TestAuthorization: unauthorized users get pairing code, not commands
- TestSendFailureResilience: pipeline survives send() failures

Additional command coverage: /provider, /verbose, /personality, /yolo.

Note: /provider test is xfail - found a real bug where model_cfg is
referenced unbound when config.yaml is absent (run.py:3247).
2026-04-01 15:23:52 -07:00
ecd9bf2ca0 test(e2e): revert intentional failure after CI verification
CI correctly detected the broken assertion — e2e workflow works.
2026-04-01 15:23:52 -07:00
b209dc0f43 test(e2e): add intentional failure to verify CI detection
Temporary commit — will be reverted after confirming CI catches it.
2026-04-01 15:23:52 -07:00
67e1170b01 ci: add e2e test workflow
Separate workflow for gateway e2e tests, runs on push/PR to main.
Same Python 3.11 + uv setup as existing tests.yml but targets only
tests/e2e/ with verbose output.
2026-04-01 15:23:52 -07:00
bff34b1df9 test(e2e): add telegram slash command e2e tests
Tests /help, /status, /new, /stop, /commands through the full adapter
background-task pipeline. Validates command dispatch, session lifecycle,
and response delivery without any LLM involvement.
2026-04-01 15:23:52 -07:00
ba48cfe84a test(e2e): add telegram gateway e2e test infrastructure
Fixtures and helpers for driving messages through the full async
pipeline: adapter.handle_message → background task → GatewayRunner
command dispatch → adapter.send (mocked).

Uses the established _make_runner pattern (object.__new__) to skip
filesystem side effects while exercising real command dispatch logic.
2026-04-01 15:23:52 -07:00
de9bba8d7c fix: remove hardcoded OpenRouter/opus defaults
No model, base_url, or provider is assumed when the user hasn't
configured one.  Previously the defaults dict in cli.py, AIAgent
constructor args, and several fallback paths all hardcoded
anthropic/claude-opus-4.6 + openrouter.ai/api/v1 — silently routing
unconfigured users to OpenRouter, which 404s for anyone using a
different provider.

Now empty defaults force the setup wizard to run, and existing users
who already completed setup are unaffected (their config.yaml has
the model they chose).

Files changed:
- cli.py: defaults dict, _DEFAULT_CONFIG_MODEL
- run_agent.py: AIAgent.__init__ defaults, main() defaults
- hermes_cli/config.py: DEFAULT_CONFIG
- hermes_cli/runtime_provider.py: is_fallback sentinel
- acp_adapter/session.py: default_model
- tests: updated to reflect empty defaults
2026-04-01 15:22:26 -07:00
3628ccc8c4 feat: use 'developer' role for GPT-5 and Codex models (#4498)
OpenAI's newer models (GPT-5, Codex) give stronger instruction-following
weight to the 'developer' role vs 'system'. Swap the role at the API
boundary in _build_api_kwargs() for the chat_completions path so internal
message representation stays consistent ('system' everywhere).

Applies regardless of provider — OpenRouter, Nous portal, direct, etc.
The codex_responses path (direct OpenAI) uses 'instructions' instead of
message roles, so it's unaffected.

DEVELOPER_ROLE_MODELS constant in prompt_builder.py defines the matching
model name substrings: ('gpt-5', 'codex').
2026-04-01 14:49:32 -07:00
c59ab8b0da fix: profile model.model promoted to model.default when default not set
When a profile config sets model.model but not model.default, the
hardcoded default (claude-opus-4.6) survived the config merge and
took precedence in HermesCLI.__init__ because it checks model.default
first. Profile model configs were silently ignored.

Now model.model is promoted to model.default during the merge when the
user didn't explicitly set model.default. Fixes #4486.
2026-04-01 13:46:18 -07:00
16d9f58445 fix(gateway): persist memory flush state to prevent redundant re-flushes on restart (#4481)
* fix: force-close TCP sockets on client cleanup, detect and recover dead connections

When a provider drops connections mid-stream (e.g. OpenRouter outage),
httpx's graceful close leaves sockets in CLOSE-WAIT indefinitely. These
zombie connections accumulate and can prevent recovery without restarting.

Changes:
- _force_close_tcp_sockets: walks the httpx connection pool and issues
  socket.shutdown(SHUT_RDWR) + close() to force TCP RST on every socket
  when a client is closed, preventing CLOSE-WAIT accumulation
- _cleanup_dead_connections: probes the primary client's pool for dead
  sockets (recv MSG_PEEK), rebuilds the client if any are found
- Pre-turn health check at the start of each run_conversation call that
  auto-recovers with a user-facing status message
- Primary client rebuild after stale stream detection to purge pool
- User-facing messages on streaming connection failures:
  "Connection to provider dropped — Reconnecting (attempt 2/3)"
  "Connection failed after 3 attempts — try again in a moment"

Made-with: Cursor

* fix: pool entry missing base_url for openrouter, clean error messages

- _resolve_runtime_from_pool_entry: add OPENROUTER_BASE_URL fallback
  when pool entry has no runtime_base_url (pool entries from auth.json
  credential_pool often omit base_url)
- Replace Rich console.print for auth errors with plain print() to
  prevent ANSI escape code mangling through prompt_toolkit's stdout patch
- Force-close TCP sockets on client cleanup to prevent CLOSE-WAIT
  accumulation after provider outages
- Pre-turn dead connection detection with auto-recovery and user message
- Primary client rebuild after stale stream detection
- User-facing status messages on streaming connection failures/retries

Made-with: Cursor

* fix(gateway): persist memory flush state to prevent redundant re-flushes on restart

The _session_expiry_watcher tracked flushed sessions in an in-memory set
(_pre_flushed_sessions) that was lost on gateway restart. Expired sessions
remained in sessions.json and were re-discovered every restart, causing
redundant AIAgent runs that burned API credits and blocked the event loop.

Fix: Add a memory_flushed boolean field to SessionEntry, persisted in
sessions.json. The watcher sets it after a successful flush. On restart,
the flag survives and the watcher skips already-flushed sessions.

- Add memory_flushed field to SessionEntry with to_dict/from_dict support
- Old sessions.json entries without the field default to False (backward compat)
- Remove the ephemeral _pre_flushed_sessions set from SessionStore
- Update tests: save/load roundtrip, legacy entry compat, auto-reset behavior
2026-04-01 12:05:02 -07:00
1515e8c8f2 fix: rewrite test mock secrets and add redaction fixture
The original test file had mock secrets corrupted by secret-redaction
tooling before commit — the test values (sk-ant...l012) didn't actually
trigger the PREFIX_RE regex, so 4 of 10 tests were asserting against
values that never appeared in the input.

- Replace truncated mock values with proper fake keys built via string
  concatenation (avoids tool redaction during file writes)
- Add _ensure_redaction_enabled autouse fixture to patch the module-level
  _REDACT_ENABLED constant, matching the pattern from test_redact.py
2026-04-01 12:03:56 -07:00
127a4e512b security: redact secrets from auxiliary and vision LLM responses
LLM responses from browser snapshot extraction and vision analysis
could echo back secrets that appeared on screen or in page content.
Input redaction alone is insufficient — the LLM may reproduce secrets
it read from screenshots (which cannot be text-redacted).

Now redact outputs from:
- _extract_relevant_content (auxiliary LLM response)
- browser_vision (vision LLM response)
- camofox_vision (vision LLM response)
2026-04-01 12:03:56 -07:00
712aa44325 security: block secret exfiltration via browser URLs and auxiliary LLM calls
Three exfiltration vectors closed:

1. Browser URL exfil — agent could embed secrets in URL params and
   navigate to attacker-controlled server. Now scans URLs for known
   API key patterns before navigating (browser_navigate, web_extract).

2. Browser snapshot leak — page displaying env vars or API keys would
   send secrets to auxiliary LLM via _extract_relevant_content before
   run_agent.py's redaction layer sees the result. Now redacts snapshot
   text before the auxiliary call.

3. Camofox annotation leak — accessibility tree text sent to vision
   LLM could contain secrets visible on screen. Now redacts annotation
   context before the vision call.

10 new tests covering URL blocking, snapshot redaction, and annotation
redaction for both browser and camofox backends.
2026-04-01 12:03:56 -07:00
7e91009018 fix: lazy-init SessionDB on adapter instance instead of per-request
Reuse a single SessionDB across requests by caching on self._session_db
with lazy initialization. Avoids creating a new SQLite connection per
request when X-Hermes-Session-Id is used. Updated tests to set
adapter._session_db directly instead of patching the constructor.
2026-04-01 11:41:32 -07:00
bf19623a53 feat(api-server): support X-Hermes-Session-Id header for session continuity
Allow callers to pass X-Hermes-Session-Id in request headers to continue
an existing conversation. When provided, history is loaded from SessionDB
instead of the request body, and the session_id is echoed in the response
header. Without the header, existing behavior is preserved (new uuid per
request).

This enables web UI clients to maintain thread continuity without modifying
any session state themselves — the same mechanism the gateway uses for IM
platforms (Telegram, Discord, etc.).
2026-04-01 11:41:32 -07:00
3ff9e0101d fix(skill_utils): add type check for metadata field in extract_skill_conditions
When PyYAML is unavailable or YAML frontmatter is malformed, the fallback
parser may return metadata as a string instead of a dict. This causes
AttributeError when calling .get("hermes") on the string.

Added explicit type checks to handle cases where metadata or hermes fields
are not dicts, preventing the crash.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
2026-04-01 11:34:56 -07:00
b267516851 fix: also exclude .env from default profile exports
The original PR excluded auth.json from _DEFAULT_EXPORT_EXCLUDE_ROOT and
filtered both auth.json and .env from named profile exports, but missed
adding .env to the default profile exclusion set. Default exports would
still leak .env containing API keys.

Added .env to _DEFAULT_EXPORT_EXCLUDE_ROOT, added test coverage, and
updated the existing test that incorrectly asserted .env presence.
2026-04-01 11:20:33 -07:00
d435acc2c0 fix(security): exclude auth.json and .env from profile exports 2026-04-01 11:20:33 -07:00
bacc86d031 fix: use RedactingFormatter on stderr handler, update types and test mock
- stderr handler now uses RedactingFormatter to match file handlers
- restart path uses verbose=0 (int) instead of verbose=False (bool)
- test mock updated with new run_gateway(verbose, quiet, replace) signature
2026-04-01 11:05:07 -07:00
5bd01b838c fix(gateway): wire -v/-q flags to stderr logging
By default 'hermes gateway run' now prints WARNING+ to stderr so
connection errors and startup failures are visible in the terminal
without having to tail ~/.hermes/logs/gateway.log.

- gateway/run.py: start_gateway() accepts verbosity: Optional[int]=0.
  When not None, attaches a StreamHandler to stderr with level mapped
  from the count (0=WARNING, 1=INFO, 2+=DEBUG). Root logger level is
  also lowered when DEBUG is requested so records are not swallowed.

- hermes_cli/gateway.py: run_gateway() gains verbose: int and
  quiet: bool params. -q translates to verbosity=None (no stderr
  handler). Wired through gateway_command().

- hermes_cli/main.py: -v changed from store_true to action=count so
  -v/-vv/-vvv each increment the level. -q/--quiet added as a new flag.

Behaviour summary:
  hermes gateway run        -> WARNING+ on stderr (default)
  hermes gateway run -q     -> silent
  hermes gateway run -v     -> INFO+
  hermes gateway run -vv    -> DEBUG
2026-04-01 11:05:07 -07:00
3400098481 fix: update fetch_transcript.py for youtube-transcript-api v1.x
The library removed the static get_transcript() method in v1.0.
Migrate to the new instance-based fetch() API and normalize
FetchedTranscriptSnippet objects back to dicts for compatibility
with the rest of the script.
2026-04-01 10:49:24 -07:00
e905768ffd fix(gateway): remap HERMES_HOME to target user in system service unit
When `sudo hermes gateway install --system --run-as-user <user>` generates
the systemd unit, get_hermes_home() resolves to /root/.hermes because
Path.home() returns root's home under sudo. The unit correctly sets
HOME= and User= via _system_service_identity(), but HERMES_HOME was
computed independently and pointed to root's config directory.

Add _hermes_home_for_target_user() which remaps the current HERMES_HOME
to the equivalent path under the target user's home. This handles:
- Default ~/.hermes → target user's ~/.hermes
- Profiles (e.g. ~/.hermes/profiles/coder) → preserves relative structure
- Custom paths (e.g. /opt/hermes) → kept as-is

Supersedes #3861 which only handled the default case and left profiles
broken (also flagged by Copilot review).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 06:09:33 -07:00
e0abf2416d fix: restore _config_version to 11 (reverted by stale-branch merge in #4419) (#4440)
PR #4419 was based on pre-credential-pools main where _config_version was 10.
The squash merge downgraded it from 11 (set by #2647) back to 10.
Also fixes the test assertion.
2026-04-01 04:34:04 -07:00
f6ada27d1c feat(skills): size limits for agent writes + fuzzy matching for patch (#4414)
* feat(skills): add content size limits for agent-created skills

Agent writes via skill_manage (create/edit/patch/write_file) are now
constrained to prevent unbounded growth:

- SKILL.md and supporting files: 100,000 character limit
- Supporting files: additional 1 MiB byte limit
- Patches on oversized hand-placed skills that reduce the size are
  allowed (shrink path), but patches that grow beyond the limit are
  rejected

Hand-placed skills and hub-installed skills have NO hard limit —
they load and function normally regardless of size. Hub installs
get a warning in the log if SKILL.md exceeds 100k chars.

This mirrors the memory system's char_limit pattern. Without this,
the agent auto-grows skills indefinitely through iterative patches
(hermes-agent-dev reached 197k chars / 72k tokens — 40x larger than
the largest skill in the entire skills.sh ecosystem).

Constants: MAX_SKILL_CONTENT_CHARS (100k), MAX_SKILL_FILE_BYTES (1MiB)
Tests: 14 new tests covering all write paths and edge cases

* feat(skills): add fuzzy matching to skill patch

_patch_skill now uses the same 8-strategy fuzzy matching engine
(tools/fuzzy_match.py) as the file patch tool. Handles whitespace
normalization, indentation differences, escape sequences, and
block-anchor matching. Eliminates exact-match failures when agents
patch skills with minor formatting mismatches.
2026-04-01 04:19:19 -07:00
70744add15 feat(browser): add persistent Camofox sessions and VNC URL discovery (salvage #4400) (#4419)
Adds two Camofox features:

1. Persistent browser sessions: new `browser.camofox.managed_persistence`
   config option. When enabled, Hermes sends a deterministic profile-scoped
   userId to Camofox so the server maps it to a persistent browser profile
   directory. Cookies, logins, and browser state survive across restarts.
   Default remains ephemeral (random userId per session).

2. VNC URL discovery: Camofox /health endpoint returns vncPort when running
   in headed mode. Hermes constructs the VNC URL and includes it in navigate
   responses so the agent can share it with users.

Also fixes camofox_vision bug where call_llm response object was passed
directly to json.dumps instead of extracting .choices[0].message.content.

Changes from original PR:
- Removed browser_evaluate tool (separate feature, needs own PR)
- Removed snapshot truncation limit change (unrelated)
- Config.yaml only for managed_persistence (no env var, no version bump)
- Rewrote tests to use config mock instead of env var
- Reverted package-lock.json churn

Co-authored-by: analista <psikonetik@gmail.com.com>
2026-04-01 04:18:50 -07:00
85e96a4638 fix(skills): move unified hermes-agent skill into autonomous-ai-agents category (#4435)
The unified skill from PR #4332 was placed at a top-level
skills/hermes-agent/ directory, creating a redundant standalone
category. Move it to skills/autonomous-ai-agents/hermes-agent/
alongside claude-code, codex, and opencode where it belongs.
2026-04-01 03:39:25 -07:00
c9dc6c4749 fix(insights): show cache tokens in overview so total adds up (#4428)
The total_tokens field includes cache_read + cache_write tokens, but
the display only showed input + output — making the math look wrong
(e.g. 765K + 134K displayed but total said 9.2M). Now shows a cache
line when cache tokens are present so all visible numbers sum to the
displayed total.

Affects both terminal (hermes insights) and gateway (/insights)
formats.
2026-04-01 03:06:47 -07:00
935137f0d9 feat: add inline diff previews for write actions
Show inline diffs in the CLI transcript when write_file, patch, or
skill_manage modifies files. Captures a filesystem snapshot before the
tool runs, computes a unified diff after, and renders it with ANSI
coloring in the activity feed.

Adds tool_start_callback and tool_complete_callback hooks to AIAgent
for pre/post tool execution notifications.

Also fixes _extract_parallel_scope_path to normalize relative paths
to absolute, preventing the parallel overlap detection from missing
conflicts when the same file is referenced with different path styles.

Gated by display.inline_diffs config option (default: true).

Based on PR #3774 by @kshitijk4poor.
2026-04-01 02:13:57 -07:00
68fc4aec21 fix: comprehensive default profile export exclusions and import guard
- Add _DEFAULT_EXPORT_EXCLUDE_ROOT constant with 25+ entries to exclude
  from default profile exports: repo checkout (hermes-agent), worktrees,
  databases (state.db), caches, runtime state, logs, binaries
- Add _default_export_ignore() with root-level and universal exclusions
  (__pycache__, *.sock, *.tmp at any depth)
- Remove redundant shutil/tempfile imports from contributor's if-block
- Block import_profile() from accepting 'default' as target name with
  clear guidance to use --name
- Add 7 tests covering: archive creation, inclusion of profile data,
  exclusion of infrastructure, nested __pycache__ exclusion, import
  rejection without --name, import rejection with --name default,
  full export-import roundtrip with a different name

Addresses review feedback on PR #4370.
2026-04-01 01:43:51 -07:00
f04977f45a fix(cli): support exporting the default root profile (#4366) 2026-04-01 01:43:51 -07:00
996250d178 fix(cli): pin entire TUI to bottom of terminal on startup (#4412)
Replace the per-response padding from PR #4359 (which created a void
between short responses and the prompt) with a one-time initial scroll
at session start.  Prints terminal_height newlines before the banner so
the cursor starts at the bottom row — banner, responses, and prompt all
appear pinned to the bottom with empty space above, not below.

patch_stdout naturally keeps the prompt at the bottom from there, so
no per-response padding is needed.
2026-04-01 01:41:09 -07:00
afa75a6185 fix(client): handle is_closed as method in OpenAI SDK
The openai SDK's SyncAPIClient.is_closed is a method, not a property.
getattr(client, 'is_closed', False) returned the bound method object,
which is always truthy — causing _is_openai_client_closed() to report
all clients as closed and triggering unnecessary client recreation
(~100-200ms TCP+TLS overhead per API call).

Fix: check if is_closed is callable and call it, otherwise treat as bool.

Fixes #4377
Co-authored-by: Bartok9 <Bartok9@users.noreply.github.com>
2026-04-01 01:40:43 -07:00
9a581bba50 fix(gateway): resume agent after /approve executes blocked command
When a dangerous command was blocked and the user approved it via /approve,
the command was executed but the agent loop had already exited — the agent
never received the command output and the task died silently.

Now _handle_approve_command sends immediate feedback to the user, then
creates a synthetic continuation message with the command output and feeds
it through _handle_message so the agent picks up where it left off.

- Send command result to chat immediately via adapter.send()
- Create synthetic MessageEvent with command + output as context
- Spawn asyncio task to re-invoke agent via _handle_message
- Return None (feedback already sent directly)
- Add test for agent re-invocation after approval
- Update existing approval tests for new return behavior
2026-04-01 01:38:55 -07:00
8327f7cc61 fix(docs): use compound selector instead of media query
Target the exact state that breaks: when .navbar-sidebar--show is active
on the same <nav> element. This preserves the blur on mobile when the
sidebar is closed, and only removes it when the sidebar is open.
2026-04-01 01:14:39 -07:00
7baee0b023 fix(docs): restrict backdrop-filter to desktop to fix mobile sidebar
backdrop-filter on .navbar creates a new CSS stacking context that
hides .navbar-sidebar menu content on mobile (only the close button
is visible). Scope the blur effect to min-width: 997px so it only
applies on desktop where the sidebar is not rendered inside the navbar.

Ref: facebook/docusaurus#6996, facebook/docusaurus#6853
2026-04-01 01:14:39 -07:00
efa327a998 fix: add missing provider attrs to cli_obj test fixture
_show_status() now references self.provider and self._provider_source,
added after the original PR was submitted.
2026-04-01 01:12:23 -07:00
9b99ea176e fix(cli): initialize ctx_len before compact banner path 2026-04-01 01:12:23 -07:00
a7f7e87070 fix: preserve credential_pool through smart routing and defer eager fallback on 429 (#4361)
Three bugs prevented credential pool rotation from working when multiple
Codex OAuth tokens were configured:

1. credential_pool was dropped during smart model turn routing.
   resolve_turn_route() constructed runtime dicts without it, so the
   AIAgent was created without pool access. Fixed in smart_model_routing.py
   (no-route and fallback paths), cli.py, and gateway/run.py.

2. Eager fallback fired before pool rotation on 429. The rate-limit
   handler at line ~7180 switched to a fallback provider immediately,
   before _recover_with_credential_pool got a chance to rotate to the
   next credential. Now deferred when the pool still has credentials.

3. (Non-issue) Retry budget was reported as too small, but successful
   pool rotations already skip retry_count increment — no change needed.

Reported by community member Schinsly who identified all three root
causes and verified the fix locally with multiple Codex accounts.
2026-04-01 01:02:34 -07:00
ef2ae3e48f fix(file_tools): refresh staleness timestamp after writes (#4390)
After a successful write_file or patch, update the stored read
timestamp to match the file's new modification time.  Without this,
consecutive edits by the same task (read → write → write) would
false-warn on the second write because the stored timestamp still
reflected the original read, not the first write.

Also renames the internal tracker key from 'file_mtimes' to
'read_timestamps' for clarity.
2026-04-01 00:50:08 -07:00
83dec2b3ec fix: skip empty/whitespace text in Telegram send to prevent 400 errors
Telegram API returns HTTP 400 when sent whitespace-only or empty
text. Add a guard at the top of send() to silently succeed on
blank content instead of crashing.

Equivalent to OpenClaw #56620.
2026-03-31 19:10:26 -07:00
f4d44c777b feat(discord): only create threads and reactions for authorized users 2026-03-31 19:06:46 -07:00
0a6d366327 fix(security): redact secrets from execute_code sandbox output
* fix: root-level provider in config.yaml no longer overrides model.provider

load_cli_config() had a priority inversion: a stale root-level
'provider' key in config.yaml would OVERRIDE the canonical
'model.provider' set by 'hermes model'. The gateway reads
model.provider directly from YAML and worked correctly, but
'hermes chat -q' and the interactive CLI went through the merge
logic and picked up the stale root-level key.

Fix: root-level provider/base_url are now only used as a fallback
when model.provider/model.base_url is not set (never as an override).

Also added _normalize_root_model_keys() to config.py load_config()
and save_config() — migrates root-level provider/base_url into the
model section and removes the root-level keys permanently.

Reported by (≧▽≦) in Discord: opencode-go provider persisted as a
root-level key and overrode the correct model.provider=openrouter,
causing 401 errors.

* fix(security): redact secrets from execute_code sandbox output

The execute_code sandbox stripped env vars with secret-like names from
the child process (preventing os.environ access), but scripts could
still read secrets from disk (e.g. open('~/.hermes/.env')) and print
them to stdout. The raw values entered the model context unredacted.

terminal_tool and file_tools already applied redact_sensitive_text()
to their output — execute_code was the only tool that skipped this
step. Now the same redaction runs on both stdout and stderr after
ANSI stripping.

Reported via Discord (not filed on GitHub to avoid public disclosure
of the reproduction steps).
2026-03-31 18:52:11 -07:00
3604665e44 feat: add qwen/qwen3.6-plus-preview:free to OpenRouter and Nous model lists (#4376) 2026-03-31 18:05:40 -07:00
c36aa5fe98 Merge pull request #4034 from bcross/docker-optimization
fix(docker): optimize docker contanier image creation
2026-03-31 15:27:06 -07:00
f8cb54ba04 fix(cli): anchor input prompt near bottom of terminal after responses (#4359)
After short agent responses, the prompt_toolkit input area sat mid-screen
with empty terminal space below it. Now prints padding newlines (half
terminal height) after each response to push the prompt toward the bottom.
patch_stdout renders the padding above the input area.
2026-03-31 14:56:35 -07:00
b118f607b2 feat(skills): unify hermes-agent and hermes-agent-setup into single skill (#4332)
Merges the hermes-agent-spawning skill (autonomous-ai-agents/) and
hermes-agent-setup skill (dogfood/) into a single comprehensive
skills/hermes-agent/ skill.

The unified skill covers:
- What Hermes Agent is and how it compares to Claude Code/Codex/OpenClaw
- Complete CLI reference (all subcommands and flags)
- Slash command reference
- Configuration guide (providers, toolsets, config sections)
- Voice/STT/TTS setup
- Spawning additional agent instances (one-shot and interactive PTY)
- Multi-agent coordination patterns
- Troubleshooting guide
- Where-to-find-things lookup table with docs links
- Concise contributor quick reference

Removes:
- skills/autonomous-ai-agents/hermes-agent/ (hermes-agent-spawning)
- skills/dogfood/hermes-agent-setup/
2026-03-31 14:49:20 -07:00
f04986029c feat(file_tools): detect stale files on write and patch (#4345)
Track file mtime when read_file is called.  When write_file or patch
subsequently targets the same file, compare the current mtime against
the recorded one.  If they differ (external edit, concurrent agent,
user change), include a _warning in the result advising the agent to
re-read.  The write still proceeds — this is a soft signal, not a
hard block.

Key design points:
- Per-task isolation: task A's reads don't affect task B's writes.
- Files never read produce no warning (not enforcing read-before-write).
- mtime naturally updates after the agent's own writes, so the warning
  only fires on external changes, not the agent's own edits.
- V4A multi-file patches check all target paths.

Tests: 10 new tests covering write staleness, patch staleness,
never-read files, cross-task isolation, and the helper function.
2026-03-31 14:49:00 -07:00
f5cc597afc fix: add CAMOFOX_PORT=9377 to Docker commands for camofox-browser (#4340)
The camofox-browser image defaults to port 3000 internally, not 9377.
Without -e CAMOFOX_PORT=9377, the -p 9377:9377 mapping silently fails
because nothing listens on 9377 inside the container.

E2E verified: -p 9377:9377 alone → connection reset,
-p 9377:9377 -e CAMOFOX_PORT=9377 → healthy and functional.
2026-03-31 13:38:22 -07:00
1b62ad9de7 fix: root-level provider in config.yaml no longer overrides model.provider
load_cli_config() had a priority inversion: a stale root-level
'provider' key in config.yaml would OVERRIDE the canonical
'model.provider' set by 'hermes model'. The gateway reads
model.provider directly from YAML and worked correctly, but
'hermes chat -q' and the interactive CLI went through the merge
logic and picked up the stale root-level key.

Fix: root-level provider/base_url are now only used as a fallback
when model.provider/model.base_url is not set (never as an override).

Also added _normalize_root_model_keys() to config.py load_config()
and save_config() — migrates root-level provider/base_url into the
model section and removes the root-level keys permanently.

Reported by (≧▽≦) in Discord: opencode-go provider persisted as a
root-level key and overrode the correct model.provider=openrouter,
causing 401 errors.
2026-03-31 12:54:22 -07:00
e3f8347be3 feat(file_tools): harden read_file with size guard, dedup, and device blocking (#4315)
* feat(file_tools): harden read_file with size guard, dedup, and device blocking

Three improvements to read_file_tool to reduce wasted context tokens and
prevent process hangs:

1. Character-count guard: reads that produce more than 100K characters
   (≈25-35K tokens across tokenisers) are rejected with an error that
   tells the model to use offset+limit for a smaller range.  The
   effective cap is min(file_size, 100K) so small files that happen to
   have long lines aren't over-penalised.  Large truncated files also
   get a hint nudging toward targeted reads.

2. File-read deduplication: when the same (path, offset, limit) is read
   a second time and the file hasn't been modified (mtime unchanged),
   return a lightweight stub instead of re-sending the full content.
   Writes and patches naturally change mtime, so post-edit reads always
   return fresh content.  The dedup cache is cleared on context
   compression — after compression the original read content is
   summarised away, so the model needs the full content again.

3. Device path blocking: paths like /dev/zero, /dev/random, /dev/stdin
   etc. are rejected before any I/O to prevent process hangs from
   infinite-output or blocking-input devices.

Tests: 17 new tests covering all three features plus the dedup-reset-
on-compression integration.  All 52 file-read tests pass (35 existing +
17 new).  Full tool suite (2124 tests) passes with 0 failures.

* feat: make file_read_max_chars configurable, add docs

Add file_read_max_chars to DEFAULT_CONFIG (default 100K).  read_file_tool
reads this on first call and caches for the process lifetime.  Users on
large-context models can raise it; users on small local models can lower it.

Also adds a 'File Read Safety' section to the configuration docs
explaining the char limit, dedup behavior, and example values.
2026-03-31 12:53:19 -07:00
d3f1987a05 fix(security): add .config/gh to read protection for @file references (#4327)
Follow-up to PR #4305 — .config/gh was added to the write-deny list
but missed from _SENSITIVE_HOME_DIRS, leaving GitHub CLI OAuth tokens
exposed via @file:~/.config/gh/hosts.yml context injection.
2026-03-31 12:48:30 -07:00
655eea2db8 fix(security): protect .docker, .azure, and .config/gh from read and write 2026-03-31 12:47:10 -07:00
c94a5fa1b2 fix(cli): use atomic write in save_config_value to prevent config loss on interrupt
save_config_value() used bare open(path, 'w') + yaml.dump() which truncates
the file to zero bytes on open. If the process is interrupted mid-write,
config.yaml is left empty. Replace with atomic_yaml_write() (temp file +
fsync + os.replace), matching the gateway config write path.

Co-authored-by: Hermes Agent <hermes@nousresearch.com>
2026-03-31 12:21:55 -07:00
7f78deebe7 fix: apply same path traversal checks to config-based credential files
_load_config_files() had the same hermes_home / item pattern without
containment checks. While config.yaml is user-controlled (lower threat
than skill frontmatter), defense in depth prevents exploitation via
config injection or copy-paste mistakes.
2026-03-31 12:16:37 -07:00
a97641b9f2 fix(security): reject path traversal in credential file registration 2026-03-31 12:16:37 -07:00
0f2ea2062b fix(profiles): validate tar archive member paths on import
Fixes a zip-slip path traversal vulnerability in hermes profile import.
shutil.unpack_archive() on untrusted tar members allows entries like
../../escape.txt to write files outside ~/.hermes/profiles/.

- Add _normalize_profile_archive_parts() to reject absolute paths
  (POSIX and Windows), traversal (..), empty paths, backslash tricks
- Add _safe_extract_profile_archive() for manual per-member extraction
  that only allows regular files and directories (rejects symlinks)
- Replace shutil.unpack_archive() with the safe extraction path
- Add regression tests for traversal and absolute-path attacks

Co-authored-by: Gutslabs <gutslabsxyz@gmail.com>
2026-03-31 12:14:27 -07:00
08171c1c31 fix: allow voice mode in WSL when PulseAudio bridge is configured
WSL detection was treated as a hard fail, blocking voice mode even when
audio worked via PulseAudio bridge. Now PULSE_SERVER env var presence
makes WSL a soft notice instead of a blocking warning. Device query
failures in WSL with PULSE_SERVER are also treated as non-blocking.
2026-03-31 12:13:33 -07:00
7f670a06cf feat: add --max-turns CLI flag to hermes chat
Exposes the existing max_turns parameter (cli.py main()) as a CLI flag
so programmatic callers (Paperclip adapter, scripts) can control the
agent's tool-calling iteration limit without editing config.yaml.

Priority chain unchanged: CLI flag > config agent.max_turns > env
HERMES_MAX_ITERATIONS > default 90.
2026-03-31 12:10:12 -07:00
cac9d20c4f test: add codex transport drop regression 2026-03-31 12:05:06 -07:00
e75964d46d fix: harden codex responses transport handling 2026-03-31 12:05:06 -07:00
161acb0086 fix: credential pool 401 recovery rotates to next credential after failed refresh (#4300)
When an OAuth token refresh fails on a 401 error, the pool recovery
would return 'not recovered' without trying the next credential in the
pool. This meant users who added a second valid credential via
'hermes auth add' would never see it used when the primary credential
was dead.

Now: try refresh first (handles expired tokens quickly), and if that
fails, rotate to the next available credential — same as 429/402
already did.

Adds three tests covering 401 refresh success, refresh-fail-then-rotate,
and refresh-fail-with-no-remaining-credentials.
2026-03-31 12:02:29 -07:00
143b74ec00 fix: first-run guard stuck in loop when provider configured via config.yaml (#4298)
The _has_any_provider_configured() guard only checked env vars, .env file,
and auth.json — missing config.yaml model.provider/base_url/api_key entirely.
Users who configured a provider through setup (saving to config.yaml) but had
empty API key placeholders in .env from the install template were permanently
blocked by the 'not configured' message.

Changes:
- _has_any_provider_configured() now checks config.yaml model section for
  explicit provider, base_url, or api_key — covers custom endpoints and
  providers that store credentials in config rather than env vars
- .env.example: comment out all empty API key placeholders so they don't
  pollute the environment when copied to .env by the installer
- .env.example: mark LLM_MODEL as deprecated (config.yaml is source of truth)
- 4 new tests for the config.yaml detection path

Reported by OkadoOP on Discord.
2026-03-31 11:42:52 -07:00
57625329a2 docs+feat: comprehensive local LLM provider guides and context length warning (#4294)
* docs: update llama.cpp section with --jinja flag and tool calling guide

The llama.cpp docs were missing the --jinja flag which is required for
tool calling to work. Without it, models output tool calls as raw JSON
text instead of structured API responses, making Hermes unable to
execute them.

Changes:
- Add --jinja and -fa flags to the server startup example
- Replace deprecated env vars (OPENAI_BASE_URL, LLM_MODEL) with
  hermes model interactive setup
- Add caution block explaining the --jinja requirement and symptoms
- List models with native tool calling support
- Add /props endpoint verification tip

* docs+feat: comprehensive local LLM provider guides and context length warning

Docs (providers.md):
- Rewrote Ollama section with context length warning (defaults to 4k on
  <24GB VRAM), three methods to increase it, and verification steps
- Rewrote vLLM section with --max-model-len, tool calling flags
  (--enable-auto-tool-choice, --tool-call-parser), and context guidance
- Rewrote SGLang section with --context-length, --tool-call-parser,
  and warning about 128-token default max output
- Added LM Studio section (port 1234, context length defaults to 2048,
  tool calling since 0.3.6)
- Added llama.cpp context length flag (-c) and GPU offload (-ngl)
- Added Troubleshooting Local Models section covering:
  - Tool calls appearing as text (with per-server fix table)
  - Silent context truncation and diagnosis commands
  - Low detected context at startup
  - Truncated responses
- Replaced all deprecated env vars (OPENAI_BASE_URL, LLM_MODEL) with
  hermes model interactive setup and config.yaml examples
- Added deprecation warning for legacy env vars in General Setup

Code (cli.py):
- Added context length warning in show_banner() when detected context
  is <= 8192 tokens, with server-specific fix hints:
  - Ollama (port 11434): suggests OLLAMA_CONTEXT_LENGTH env var
  - LM Studio (port 1234): suggests model settings adjustment
  - Other servers: suggests config.yaml override

Tests:
- 9 new tests covering warning thresholds, server-specific hints,
  and no-warning cases
2026-03-31 11:42:48 -07:00
0240baa357 fix: strip orphaned think/reasoning tags from user-facing responses
Some models (e.g. Kimi K2.5 on Alibaba OpenAI-compatible endpoint)
emit reasoning text followed by a closing </think> without a matching
opening <think> tag.  The existing paired-tag regexes in
_strip_think_blocks() cannot match these orphaned tags, so </think>
leaks into user-facing responses on all platforms.

Add a catch-all regex that strips any remaining opening or closing
think/thinking/reasoning/REASONING_SCRATCHPAD tags after the existing
paired-block removal pass.

Closes #4285
2026-03-31 11:42:44 -07:00
c1606aed69 fix(cli): allow empty strings and falsy values in config set
`hermes config set KEY ""` and `hermes config set KEY 0` were rejected
because the guard used `not value` which is truthy for empty strings,
zero, and False. Changed to `value is None` so only truly missing
arguments are rejected.

Closes #4277

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 11:41:12 -07:00
49d7210fed fix(gateway): parse thread_id from delivery target format
The delivery target parser uses split(':', 1) which only splits on the
first colon. For the documented format platform:chat_id:thread_id
(e.g. 'telegram:-1001234567890:17585'), thread_id gets munged into
chat_id and is never extracted.

Fix: split(':', 2) to correctly extract all three parts. Also fix
to_string() to include thread_id for proper round-tripping.

The downstream plumbing in _deliver_to_platform() already handles
thread_id correctly (line 292-293) — it just never received a value.
2026-03-31 10:45:27 -07:00
84a541b619 feat: support * wildcard in platform allowlists and improve WhatsApp docs
* docs: clarify WhatsApp allowlist behavior and document WHATSAPP_ALLOW_ALL_USERS

- Add WHATSAPP_ALLOW_ALL_USERS and WHATSAPP_DEBUG to env vars reference
- Warn that * is not a wildcard and silently blocks all messages
- Show WHATSAPP_ALLOWED_USERS as optional, not required
- Update troubleshooting with the * trap and debug mode tip
- Fix Security section to mention the allow-all alternative

Prompted by a user report in Discord where WHATSAPP_ALLOWED_USERS=*
caused all incoming messages to be silently dropped at the bridge level.

* feat: support * wildcard in platform allowlists

Follow the precedent set by SIGNAL_GROUP_ALLOWED_USERS which already
supports * as an allow-all wildcard.

Bridge (allowlist.js): matchesAllowedUser() now checks for * in the
allowedUsers set before iterating sender aliases.

Gateway (run.py): _is_authorized() checks for * in allowed_ids after
parsing the allowlist. This is generic — works for all platforms, not
just WhatsApp.

Updated docs to document * as a supported value instead of warning
against it. Added WHATSAPP_ALLOW_ALL_USERS and WHATSAPP_DEBUG to
the env vars reference.

Tests: JS allowlist test + 2 Python gateway tests (WhatsApp + Telegram
to verify cross-platform behavior).
2026-03-31 10:42:03 -07:00
cca0996a28 fix(browser): skip SSRF check for local backends (Camofox, headless Chromium) (#4292)
The SSRF protection added in #3041 blocks all private/internal addresses
unconditionally in browser_navigate(). This prevents legitimate local use
cases (localhost apps, LAN devices) when using Camofox or the built-in
headless Chromium without a cloud provider.

The check is only meaningful for cloud backends (Browserbase, BrowserUse)
where the agent could reach internal resources on a remote machine. Local
backends give the user full terminal and network access already — the
SSRF check adds zero security value.

Add _is_local_backend() helper that returns True when Camofox is active
or no cloud provider is configured. Both the pre-navigation and
post-redirect SSRF checks now skip when running locally. The
browser.allow_private_urls config option remains available as an
explicit opt-out for cloud mode.
2026-03-31 10:40:13 -07:00
fad3f338d1 fix: patch _REDACT_ENABLED in test fixture for module-level snapshot
The _REDACT_ENABLED constant is snapshotted at import time, so
monkeypatch.delenv() alone doesn't re-enable redaction during tests
when HERMES_REDACT_SECRETS=false is set in the host environment.
2026-03-31 10:30:48 -07:00
6dcc3330b3 fix(security): add missing GitHub OAuth token patterns and snapshot redact flag
- Add gho_, ghu_, ghs_, ghr_ prefix patterns (OAuth, user-to-server,
  server-to-server, and refresh tokens) — all four types used by
  GitHub Apps and Copilot auth flows were absent from _PREFIX_PATTERNS
- Snapshot HERMES_REDACT_SECRETS at module import time instead of
  re-reading os.getenv() on every call, preventing runtime env mutations
  (e.g. LLM-generated export commands) from disabling redaction
2026-03-31 10:30:48 -07:00
289df5dd1c Merge branch 'NousResearch:main' into docker-optimization 2026-03-31 07:08:44 -05:00
344239c2db feat: auto-detect models from server probe in custom endpoint setup (#4218)
Custom endpoint setup (_model_flow_custom) now probes the server first
and presents detected models instead of asking users to type blind:

- Single model: auto-confirms with Y/n prompt
- Multiple models: numbered list picker, or type a name
- No models / probe failed: falls back to manual input

Context length prompt also moved after model selection so the user sees
the verified endpoint before being asked for details.

All recent fixes preserved: config dict sync (#4172), api_key
persistence (#4182), no save_env_value for URLs (#4165).

Inspired by PR #4194 by sudoingX — re-implemented against current main.

Co-authored-by: Xpress AI (Dip KD) <200180104+sudoingX@users.noreply.github.com>
2026-03-31 03:29:00 -07:00
79b2694b9a fix: _allow_private_urls name collision + stale OPENAI_BASE_URL test (#4217)
1. browser_tool.py: _allow_private_urls() used 'global _allow_private_urls'
   then assigned a bool to it, replacing the function in the module namespace.
   After first call, subsequent calls hit TypeError: 'bool' object is not
   callable. Renamed cache variable to _cached_allow_private_urls.

2. test_provider_parity.py: test_custom_endpoint_when_no_nous relied on
   OPENAI_BASE_URL env var (removed in config refactor). Mock
   _resolve_custom_runtime directly instead.
2026-03-31 03:16:40 -07:00
8d59881a62 feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647)
* feat(auth): add same-provider credential pools and rotation UX

Add same-provider credential pooling so Hermes can rotate across
multiple credentials for a single provider, recover from exhausted
credentials without jumping providers immediately, and configure
that behavior directly in hermes setup.

- agent/credential_pool.py: persisted per-provider credential pools
- hermes auth add/list/remove/reset CLI commands
- 429/402/401 recovery with pool rotation in run_agent.py
- Setup wizard integration for pool strategy configuration
- Auto-seeding from env vars and existing OAuth state

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Salvaged from PR #2647

* fix(tests): prevent pool auto-seeding from host env in credential pool tests

Tests for non-pool Anthropic paths and auth remove were failing when
host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials
were present. The pool auto-seeding picked these up, causing unexpected
pool entries in tests.

- Mock _select_pool_entry in auxiliary_client OAuth flag tests
- Clear Anthropic env vars and mock _seed_from_singletons in auth remove test

* feat(auth): add thread safety, least_used strategy, and request counting

- Add threading.Lock to CredentialPool for gateway thread safety
  (concurrent requests from multiple gateway sessions could race on
  pool state mutations without this)
- Add 'least_used' rotation strategy that selects the credential
  with the lowest request_count, distributing load more evenly
- Add request_count field to PooledCredential for usage tracking
- Add mark_used() method to increment per-credential request counts
- Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current()
  with lock acquisition
- Add tests: least_used selection, mark_used counting, concurrent
  thread safety (4 threads × 20 selects with no corruption)

* feat(auth): add interactive mode for bare 'hermes auth' command

When 'hermes auth' is called without a subcommand, it now launches an
interactive wizard that:

1. Shows full credential pool status across all providers
2. Offers a menu: add, remove, reset cooldowns, set strategy
3. For OAuth-capable providers (anthropic, nous, openai-codex), the
   add flow explicitly asks 'API key or OAuth login?' — making it
   clear that both auth types are supported for the same provider
4. Strategy picker shows all 4 options (fill_first, round_robin,
   least_used, random) with the current selection marked
5. Remove flow shows entries with indices for easy selection

The subcommand paths (hermes auth add/list/remove/reset) still work
exactly as before for scripted/non-interactive use.

* fix(tests): update runtime_provider tests for config.yaml source of truth (#4165)

Tests were using OPENAI_BASE_URL env var which is no longer consulted
after #4165. Updated to use model config (provider, base_url, api_key)
which is the new single source of truth for custom endpoint URLs.

* feat(auth): support custom endpoint credential pools keyed by provider name

Custom OpenAI-compatible endpoints all share provider='custom', making
the provider-keyed pool useless. Now pools for custom endpoints are
keyed by 'custom:<normalized_name>' where the name comes from the
custom_providers config list (auto-generated from URL hostname).

- Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)'
- load_pool('custom:name') seeds from custom_providers api_key AND
  model.api_key when base_url matches
- hermes auth add/list now shows custom endpoints alongside registry
  providers
- _resolve_openrouter_runtime and _resolve_named_custom_runtime check
  pool before falling back to single config key
- 6 new tests covering custom pool keying, seeding, and listing

* docs: add Excalidraw diagram of full credential pool flow

Comprehensive architecture diagram showing:
- Credential sources (env vars, auth.json OAuth, config.yaml, CLI)
- Pool storage and auto-seeding
- Runtime resolution paths (registry, custom, OpenRouter)
- Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh)
- CLI management commands and strategy configuration

Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g

* fix(tests): update setup wizard pool tests for unified select_provider_and_model flow

The setup wizard now delegates to select_provider_and_model() instead
of using its own prompt_choice-based provider picker. Tests needed:
- Mock select_provider_and_model as no-op (provider pre-written to config)
- Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it)
- Pre-write model.provider to config so the pool step is reached

* docs: add comprehensive credential pool documentation

- New page: website/docs/user-guide/features/credential-pools.md
  Full guide covering quick start, CLI commands, rotation strategies,
  error recovery, custom endpoint pools, auto-discovery, thread safety,
  architecture, and storage format.
- Updated fallback-providers.md to reference credential pools as the
  first layer of resilience (same-provider rotation before cross-provider)
- Added hermes auth to CLI commands reference with usage examples
- Added credential_pool_strategies to configuration guide

* chore: remove excalidraw diagram from repo (external link only)

* refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns

- _load_config_safe(): replace 4 identical try/except/import blocks
- _iter_custom_providers(): shared generator for custom provider iteration
- PooledCredential.extra dict: collapse 11 round-trip-only fields
  (token_type, scope, client_id, portal_base_url, obtained_at,
  expires_in, agent_key_id, agent_key_expires_in, agent_key_reused,
  agent_key_obtained_at, tls) into a single extra dict with
  __getattr__ for backward-compatible access
- _available_entries(): shared exhaustion-check between select and peek
- Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical)
- SimpleNamespace replaces class _Args boilerplate in auth_commands
- _try_resolve_from_custom_pool(): shared pool-check in runtime_provider

Net -17 lines. All 383 targeted tests pass.

---------

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-31 03:10:01 -07:00
2ae50bdddd fix(telegram): enforce 32-char limit on command names with collision avoidance (#4211)
Telegram Bot API requires command names to be 1-32 characters. Plugin
and skill names that exceed this limit now get truncated. If truncation
creates a collision (with core commands, other plugins, or other skills),
the name is shortened to 31 chars and a digit 0-9 is appended.

Adds _clamp_telegram_names() helper used for both plugin and skill
entries in telegram_menu_commands(). Core CommandDef commands are tracked
as reserved names so truncated plugin/skill names never shadow them.

Addresses the fix from PR #4191 (sroecker) with collision-safe truncation.

Tests: 9 new tests covering truncation, digit suffixes, exhaustion, dedup.
2026-03-31 02:41:50 -07:00
50302ed70a fix(tools): make browser SSRF check configurable via browser.allow_private_urls (#4198)
* fix(tools): skip SSRF check in local browser mode

The SSRF protection added in #3041 blocks all private/internal
addresses unconditionally in browser_navigate(). This prevents
legitimate local development use cases (localhost testing, LAN
device access) when using the local Chromium backend.

The SSRF check is only meaningful for cloud browsers (Browserbase,
BrowserUse) where the agent could reach internal resources on a
remote machine. In local mode, the user already has full terminal
and network access, so the check adds no security value.

This change makes the SSRF check conditional on _get_cloud_provider(),
keeping full protection in cloud mode while allowing private addresses
in local mode.

* fix(tools): make SSRF check configurable via browser.allow_private_urls

Replace unconditional SSRF check with a configurable setting.
Default (False) keeps existing security behavior. Setting to True
allows navigating to private/internal IPs for local dev and LAN use cases.

---------

Co-authored-by: Nils (Norya) <nils@begou.dev>
2026-03-31 02:11:55 -07:00
086ec5590d fix: gate Claude Code credentials behind explicit Hermes config in wizard trigger (#4210)
If a user has Claude Code installed but never configured Hermes, the
first-run guard found those external credentials and skipped the setup
wizard. Users got silently routed to someone else's inference without
being asked.

Now _has_any_provider_configured() checks whether Hermes itself has been
explicitly configured (model in config differs from hardcoded default)
before counting Claude Code credentials. Fresh installs trigger the
wizard regardless of what external tools are on the machine.

Salvaged from PR #4194 by sudoingX — wizard trigger fix only.
Model auto-detect change under separate review.

Co-authored-by: Xpress AI (Dip KD) <200180104+sudoingX@users.noreply.github.com>
2026-03-31 02:01:15 -07:00
c53a296df1 feat: add MiniMax M2.7 to hermes model picker and opencode-go (#4208)
Add MiniMax-M2.7 and M2.7-highspeed to _PROVIDER_MODELS for minimax
and minimax-cn providers in main.py so hermes model shows them.
Update opencode-go bare ID from m2.5 to m2.7 in models.py.

Salvaged from PR #4197 by octo-patch.
2026-03-31 01:54:13 -07:00
1bca6f3930 fix: save API key to model config for custom endpoints (#4182)
Custom cloud endpoints (Together.ai, RunPod, Groq, etc.) lost their
API key after #4165 removed OPENAI_API_KEY .env saves.  The key was
only saved to the custom_providers list which is unreachable at
runtime for plain 'custom' provider resolution.

Save model.api_key to config.yaml alongside model.provider and
model.base_url in all three custom endpoint code paths:
- _model_flow_custom (new endpoint with model name)
- _model_flow_custom (new endpoint without model name)
- _model_flow_named_custom (switching to a saved endpoint)

The runtime resolver already reads model.api_key (runtime_provider.py
line 224-228), so the key is picked up automatically.  Each custom
endpoint carries its own key in config — no shared OPENAI_API_KEY
env var needed.
2026-03-31 01:36:15 -07:00
a994cf5e5a docs: update adding-providers guide for unified setup flow
setup_model_provider() now delegates to select_provider_and_model()
from main.py, so new providers only need to be wired in main.py.
Removed setup.py from file checklists, replaced the setup.py section
with a tip explaining the automatic inheritance.
2026-03-31 01:29:43 -07:00
ff78ad4c81 feat: add discord.reactions config option to disable message reactions (#4199)
Adds a 'reactions' key under the discord config section (default: true).
When set to false, the bot no longer adds 👀// reactions to messages
during processing. The config maps to DISCORD_REACTIONS env var following
the same pattern as require_mention and auto_thread.

Files changed:
- hermes_cli/config.py: Add reactions default to DEFAULT_CONFIG
- gateway/config.py: Map discord.reactions to DISCORD_REACTIONS env var
- gateway/platforms/discord.py: Gate on_processing_start/complete hooks
- tests/gateway/test_discord_reactions.py: 3 new tests for config gate
2026-03-31 01:24:48 -07:00
491e79bca9 refactor: unify setup wizard provider selection with hermes model
setup_model_provider() had 800+ lines of duplicated provider handling
that reimplemented the same credential prompting, OAuth flows, and model
selection that hermes model already provides via the _model_flow_*
functions.  Every new provider had to be added in both places, and the
two implementations diverged in config persistence (setup.py did raw
YAML writes, _set_model_provider, and _update_config_for_provider
depending on the provider — main.py used its own load/save cycle).

This caused the #4172 bug: _model_flow_custom saved config to disk but
the wizard's final save_config(config) overwrote it with stale values.

Fix: extract the core of cmd_model() into select_provider_and_model()
and have setup_model_provider() call it.  After the call, re-sync the
wizard's config dict from disk.  Deletes ~800 lines of duplicated
provider handling from setup.py.

Also fixes cmd_model() double-AuthError crash on fresh installs with
no API keys configured.
2026-03-31 01:04:07 -07:00
89d8127772 fix: setup wizard overwrites custom endpoint config (#4172)
_model_flow_custom() saved model.provider and model.base_url to disk
via its own load_config/save_config cycle, but never updated the
setup wizard's in-memory config dict.  The wizard's final
save_config(config) then overwrote the custom settings with the
stale default string model value.

Fix: after saving to disk, also mutate the caller's config dict so
the wizard's final save preserves model.provider='custom' and the
base_url.  Both the model_name and no-model_name branches are
covered.

Added regression tests that simulate the full wizard flow including
the final save_config(config) call — the step that was previously
untested.
2026-03-30 23:17:26 -07:00
f890a94c12 refactor: make config.yaml the single source of truth for endpoint URLs (#4165)
OPENAI_BASE_URL was written to .env AND config.yaml, creating a dual-source
confusion. Users (especially Docker) would see the URL in .env and assume
that's where all config lives, then wonder why LLM_MODEL in .env didn't work.

Changes:
- Remove all 27 save_env_value("OPENAI_BASE_URL", ...) calls across main.py,
  setup.py, and tools_config.py
- Remove OPENAI_BASE_URL env var reading from runtime_provider.py, cli.py,
  models.py, and gateway/run.py
- Remove LLM_MODEL/HERMES_MODEL env var reading from gateway/run.py and
  auxiliary_client.py — config.yaml model.default is authoritative
- Vision base URL now saved to config.yaml auxiliary.vision.base_url
  (both setup wizard and tools_config paths)
- Tests updated to set config values instead of env vars

Convention enforced: .env is for SECRETS only (API keys). All other
configuration (model names, base URLs, provider selection) lives
exclusively in config.yaml.
2026-03-30 22:02:53 -07:00
4d7e3c7157 fix(tests): provide model name in Codex 401 refresh tests for CI (#4166)
CI has no config.yaml, so cron/gateway resolve an empty model name.
The Codex Responses validator rejects empty models before the mock
API call is reached. Provide explicit model in job dict and env var.
2026-03-30 21:17:09 -07:00
1bd206ea5d feat: add /btw command for ephemeral side questions (#4161)
Adds /btw <question> — ask a quick follow-up using the current
session context without interrupting the main conversation.

- Snapshots conversation history, answers with a no-tools agent
- Response is not persisted to session history or DB
- Runs in a background thread (CLI) / async task (gateway)
- Per-session guard prevents concurrent /btw in gateway

Implementation:
- model_tools.py: enabled_toolsets=[] now correctly means "no tools"
  (was falsy, fell through to default "all tools")
- run_agent.py: persist_session=False gates _persist_session()
- cli.py: _handle_btw_command (background thread, Rich panel output)
- gateway/run.py: _handle_btw_command + _run_btw_task (async task)
- hermes_cli/commands.py: CommandDef for "btw"

Inspired by PR #3504 by areu01or00, reimplemented cleanly on current
main with the enabled_toolsets=[] fix and without the __btw_no_tools__
hack.
2026-03-30 21:10:05 -07:00
f8e1ee10aa Fix profile list model display (#4160)
Co-authored-by: txhno <roshwarrier@gmail.com>
2026-03-30 20:40:13 -07:00
c1ef9b2250 fix(cli): ensure on_session_end hook fires on interrupted exits (#4159)
- Add SIGTERM/SIGHUP signal handlers for graceful shutdown
- Add BrokenPipeError to exit exception handling (SSH disconnects)
- Fire on_session_end plugin hook in finally block, guarded by
  _agent_running to avoid double-firing on normal exits (the hook
  already fires per-turn from run_conversation)

Co-authored-by: kelsia14 <kelsia14@users.noreply.github.com>
2026-03-30 20:37:17 -07:00
3a68ec3172 feat: add Fireworks context length detection support (#4158)
- Add api.fireworks.ai to _URL_TO_PROVIDER for automatic provider detection
- Add fireworks to PROVIDER_TO_MODELS_DEV mapped to 'fireworks-ai' (the
  correct models.dev provider key — original PR used 'fireworks' which
  would silently fail the lookup)


Cherry-picked from PR #3989 with models.dev key fix.

Co-authored-by: sroecker <sroecker@users.noreply.github.com>
2026-03-30 20:37:08 -07:00
d30ea65c9b fix: URL-based auth for third-party Anthropic endpoints + CI test fixes (#4148)
* fix(tests): mock sys.stdin.isatty for cmd_model TTY guard

* fix(tests): update camofox snapshot format + trajectory compressor mock path

- test_browser_camofox: mock response now uses snapshot format (accessibility tree)
- test_trajectory_compressor: mock _get_async_client instead of setting async_client directly

* fix: URL-based auth detection for third-party Anthropic endpoints + test fixes

Reverts the key-prefix approach from #4093 which broke JWT and managed
key OAuth detection. Instead, detects third-party endpoints by URL:
if base_url is set and isn't anthropic.com, it's a proxy (Azure AI
Foundry, AWS Bedrock, etc.) that uses x-api-key regardless of key format.

Auth decision chain is now:
1. _requires_bearer_auth(url) → MiniMax → Bearer
2. _is_third_party_anthropic_endpoint(url) → Azure/Bedrock → x-api-key
3. _is_oauth_token(key) → OAuth on direct Anthropic → Bearer
4. else → x-api-key

Also includes test fixes from PR #4051 by @erosika:
- Mock sys.stdin.isatty for cmd_model TTY guard
- Update camofox snapshot format mock
- Fix trajectory compressor async client mock path

---------

Co-authored-by: Erosika <eri@plasticlabs.ai>
2026-03-30 20:36:56 -07:00
fb4b87f4af chore: add claude-sonnet-4.6 to OpenRouter and Nous model lists (#4157) 2026-03-30 20:33:21 -07:00
5b0243e6ad docs: deep quality pass — expand 10 thin pages, fix specific issues (#4134)
Developer guide stubs expanded to full documentation:
- trajectory-format.md: 56→233 lines (JSONL format, ShareGPT example,
  normalization rules, reasoning markup, replay code)
- session-storage.md: 66→388 lines (SQLite schema, migration table,
  FTS5 search syntax, lineage queries, Python API examples)
- context-compression-and-caching.md: 72→321 lines (dual compression
  system, config defaults, 4-phase algorithm, before/after example,
  prompt caching mechanics, cache-aware patterns)
- tools-runtime.md: 65→246 lines (registry API, dispatch flow,
  availability checking, error wrapping, approval flow)
- prompt-assembly.md: 89→246 lines (concrete assembled prompt example,
  SOUL.md injection, context file discovery table)

User-facing pages expanded:
- docker.md: 62→224 lines (volumes, env forwarding, docker-compose,
  resource limits, troubleshooting)
- updating.md: 79→167 lines (update behavior, version checking,
  rollback instructions, Nix users)
- skins.md: 80→206 lines (all color/spinner/branding keys, built-in
  skin descriptions, full custom skin YAML template)

Hub pages improved:
- integrations/index.md: 25→82 lines (web search backends table,
  TTS/browser providers, quick config example)
- features/overview.md: added Integrations section with 6 missing links

Specific fixes:
- configuration.md: removed duplicate Gateway Streaming section
- mcp.md: removed internal "PR work" language
- plugins.md: added inline minimal plugin example (self-contained)

13 files changed, ~1700 lines added. Docusaurus build verified clean.
2026-03-30 20:30:11 -07:00
54b876a5c9 fix: add actionable guidance to context-exceeded error messages (#4155)
When context compression fails, users now see hints suggesting /new
or /compress instead of a dead-end error. Covers all 4 error paths:
payload-too-large, max compression attempts (2 paths), and context
length exceeded.

Closes #4061
Salvaged from PR #4076 by SHL0MS.

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
2026-03-30 20:23:28 -07:00
83e5249be6 fix(gateway): use setsid instead of systemd-run --user for /update (salvage #4024) (#4104)
Salvaged from PR #4024 by @Sertug17. Fixes #4017.

- Replace systemd-run --user --scope with setsid for portable session detach
- Add system-level service detection to cmd_update gateway restart
- Falls back to start_new_session=True on systems without setsid (macOS, minimal containers)
2026-03-30 20:22:09 -07:00
fb2af3bd1d docs: document tool progress streaming in API server and Open WebUI (#4138)
Update docs to reflect that tool progress now streams inline during
SSE responses. Previously docs said tool calls were invisible.

- api-server.md: add 'Tool progress in streams' note to streaming docs
- open-webui.md: update 'How It Works' steps, add Tool Progress tip
2026-03-30 19:40:39 -07:00
cc63b2d1cd fix(gateway): remove user-facing compression warnings (#4139)
Auto-compression still runs silently in the background with server-side
logging, but no longer sends messages to the user's chat about it.

Removed:
- 'Session is large... Auto-compressing' pre-compression notification
- 'Compressed: N → M messages' post-compression notification
- 'Session is still very large after compression' warning
- 'Auto-compression failed' warning
- Rate-limit tracking (only existed for these warnings)
2026-03-30 19:17:07 -07:00
45396aaa92 fix(alibaba): use standard DashScope international endpoint (#4133)
* fix(alibaba): use standard DashScope international endpoint

The Alibaba Cloud provider was hardcoded to the coding-intl endpoint
(https://coding-intl.dashscope.aliyuncs.com/v1) which only accepts
Alibaba Coding Plan API keys.

Standard DashScope API keys fail with invalid_api_key error against
this endpoint. Changed to the international compatible-mode endpoint
(https://dashscope-intl.aliyuncs.com/compatible-mode/v1) which works
with standard DashScope keys.

Users with Coding Plan keys or China-region keys can still override
via DASHSCOPE_BASE_URL or config.yaml base_url.

Fixes #3912

* fix: update test to match new DashScope default endpoint

---------

Co-authored-by: kagura-agent <kagura.chen28@gmail.com>
2026-03-30 19:06:30 -07:00
04367e2fac fix(cron): stop truncating job IDs in list view (#4132)
Remove [:8] truncation from hermes cron list output. Job IDs are 12
hex chars — truncating to 8 makes them unusable for cron run/pause/remove
which require the full ID.

Co-authored-by: vitobotta <vitobotta@users.noreply.github.com>
2026-03-30 19:05:34 -07:00
cdb64a869a fix(security): reject private and loopback IPs in Telegram DoH fallback (#4129)
Co-authored-by: Maymun <139681654+maymuneth@users.noreply.github.com>
2026-03-30 18:53:24 -07:00
1e59d4813c feat(api_server): stream tool progress to Open WebUI (#4092)
Wire the existing tool_progress_callback through the API server's
streaming handler so Open WebUI users see what tool is running.

Uses the existing 3-arg callback signature (name, preview, args)
that fires at tool start — no changes to run_agent.py needed.
Progress appears as inline markdown in the SSE content stream.

Inspired by PR #4032 by sroecker, reimplemented to avoid breaking
the callback signature used by CLI and gateway consumers.
2026-03-30 18:50:27 -07:00
f776191650 fix: persist compressed context to gateway session after mid-run compression
When context compression fires during run_conversation() in the gateway,
the compressed messages were silently lost on the next turn. Two bugs:

1. Agent-side: _flush_messages_to_session_db() calculated
   flush_from = max(len(conversation_history), _last_flushed_db_idx).
   After compression, _last_flushed_db_idx was correctly reset to 0,
   but conversation_history still had its original pre-compression
   length (e.g. 200). Since compressed messages are shorter (~30),
   messages[200:] was empty — nothing written to the new session's
   SQLite.

   Fix: Set conversation_history = None after each _compress_context()
   call so start_idx = 0 and all compressed messages are flushed.

2. Gateway-side: history_offset was always len(agent_history) — the
   original pre-compression length. After compression shortened the
   message list, agent_messages[200:] was empty, causing the gateway
   to fall back to writing only a user/assistant pair, losing the
   compressed summary and tail context.

   Fix: Detect session splits (agent.session_id != original) and set
   history_offset = 0 so all compressed messages are written to JSONL.
2026-03-30 18:49:14 -07:00
44d02f35d2 docs: restructure site navigation — promote features and platforms to top-level (#4116)
Major reorganization of the documentation site for better discoverability
and navigation. 94 pages across 8 top-level sections (was 5).

Structural changes:
- Promote Features from 3-level-deep subcategory to top-level section
  with new Overview hub page categorizing all 26 feature pages
- Promote Messaging Platforms from User Guide subcategory to top-level
  section, add platform comparison matrix (13 platforms x 7 features)
- Create new Integrations section with hub page, grouping MCP, ACP,
  API Server, Honcho, Provider Routing, Fallback Providers
- Extract AI provider content (626 lines) from configuration.md into
  dedicated integrations/providers.md — configuration.md drops from
  1803 to 1178 lines
- Subcategorize Developer Guide into Architecture, Extending, Internals
- Rename "User Guide" to "Using Hermes" for top-level items

Orphan fixes (7 pages now reachable via sidebar):
- build-a-hermes-plugin.md added to Guides
- sms.md added to Messaging Platforms
- context-references.md added to Features > Core
- plugins.md added to Features > Core
- git-worktrees.md added to Using Hermes
- checkpoints-and-rollback.md added to Using Hermes
- checkpoints.md (30-line stub) deleted, superseded by
  checkpoints-and-rollback.md (203 lines)

New files:
- integrations/index.md — Integrations hub page
- integrations/providers.md — AI provider setup (extracted)
- user-guide/features/overview.md — Features hub page

Broken link fixes:
- quickstart.md, faq.md: update context-length-detection anchors
- configuration.md: update checkpoints link
- overview.md: fix checkpoint link path

Docusaurus build verified clean (zero broken links/anchors).
2026-03-30 18:39:51 -07:00
b2e1a095f8 fix(anthropic): write scopes field to Claude Code credentials on token refresh (#4126)
Claude Code >=2.1.81 checks for a 'scopes' array containing 'user:inference'
in ~/.claude/.credentials.json before accepting stored OAuth tokens as valid.

When Hermes refreshes the token, it writes only accessToken, refreshToken, and
expiresAt — omitting the scopes field. This causes Claude Code to report
'loggedIn: false' and refuse to start, even though the token is valid.

This commit:
- Parses the 'scope' field from the OAuth refresh response
- Passes it to _write_claude_code_credentials() as a keyword argument
- Persists the scopes array in the claudeAiOauth credential store
- Preserves existing scopes when the refresh response omits the field

Tested against Claude Code v2.1.87 on Linux — auth status correctly reports
loggedIn: true and claude --print works after this fix.

Co-authored-by: Nick <git@flybynight.io>
2026-03-30 18:35:16 -07:00
ffd5d37f9b fix: treat non-sk-ant- keys as regular API keys, not OAuth tokens (#4093)
* fix: treat non-sk-ant- prefixed keys (Azure AI Foundry) as regular API keys, not OAuth tokens

* fix: treat non-sk-ant- keys as regular API keys, not OAuth tokens

_is_oauth_token() returned True for any key not starting with
sk-ant-api, misclassifying Azure AI Foundry keys as OAuth tokens
and sending Bearer auth instead of x-api-key → 401 rejection.

Real Anthropic OAuth tokens all start with sk-ant-oat (confirmed
from live .credentials.json). Non-sk-ant- keys are third-party
provider keys that should use x-api-key.

Test fixtures updated to use realistic sk-ant-oat01- prefixed
tokens instead of fake strings.

Salvaged from PR #4075 by @HangGlidersRule.

---------

Co-authored-by: Clawdbot <clawdbot@openclaw.ai>
2026-03-30 17:41:13 -07:00
720507efac feat: add post-migration cleanup for OpenClaw directories (#4100)
After migrating from OpenClaw, leftover workspace directories contain
state files (todo.json, sessions, logs) that confuse the agent — it
discovers them and reads/writes to stale locations instead of the
Hermes state directory, causing issues like cron jobs reading a
different todo list than interactive sessions.

Changes:
- hermes claw migrate now offers to archive the source directory after
  successful migration (rename to .pre-migration, not delete)
- New `hermes claw cleanup` subcommand for users who already migrated
  and need to archive leftover OpenClaw directories
- Migration notes updated with explicit cleanup guidance
- 42 tests covering all new functionality

Reported by SteveSkedasticity — multiple todo.json files across
~/.hermes/, ~/.openclaw/workspace/, and ~/.openclaw/workspace-assistant/
caused cron jobs to read from wrong locations.
2026-03-30 17:39:08 -07:00
8a794d029d fix(ci): add repo conditionals to prevent fork workflow failures (#4107)
Add github.repository checks to docker-publish and deploy-site
workflows so they skip on forks where upstream-specific resources
(Docker Hub org, custom domain) are unavailable.

Co-authored-by: StreamOfRon <StreamOfRon@users.noreply.github.com>
2026-03-30 17:38:32 -07:00
e64b047663 chore: prepare Hermes for Homebrew packaging (#4099)
Co-authored-by: Yabuku-xD <78594762+Yabuku-xD@users.noreply.github.com>
2026-03-30 17:34:43 -07:00
1b7473e702 Fixes and refactors enabled by recent updates to main. 2026-03-31 09:29:59 +09:00
1126284c97 Merge branch 'main' into rewbs/tool-use-charge-to-subscription 2026-03-31 09:29:43 +09:00
11aa44d34d docs(telegram): add webhook mode documentation (#4089)
Documents the Telegram webhook mode from #3880:
- New 'Webhook Mode' section in telegram.md with polling vs webhook
  comparison, config table, Fly.io deployment example, troubleshooting
- Add TELEGRAM_WEBHOOK_URL/PORT/SECRET to environment-variables.md
- Add Telegram section to .env.example (existing + webhook vars)

Co-authored-by: raulbcs <raulbcs@users.noreply.github.com>
2026-03-30 17:21:59 -07:00
07746dca0c fix(matrix): E2EE decryption — request keys, auto-trust devices, retry buffered events (#4083)
When the Matrix adapter receives encrypted events it can't decrypt
(MegolmEvent), it now:

1. Requests the missing room key from other devices via
   client.request_room_key(event) instead of silently dropping the message

2. Buffers undecrypted events (bounded to 100, 5 min TTL) and retries
   decryption after each E2EE maintenance cycle when new keys arrive

3. Auto-trusts/verifies all devices after key queries so other clients
   share session keys with the bot proactively

4. Exports Megolm keys on disconnect and imports them on connect, so
   session keys survive gateway restarts

This addresses the 'could not decrypt event' warnings that caused the
bot to miss messages in encrypted rooms.
2026-03-30 17:16:09 -07:00
7e0c2c3ce3 docs: comprehensive documentation audit — fix 9 HIGH, 20+ MEDIUM gaps (#4087)
Reference docs fixes:
- cli-commands.md: remove non-existent --provider alibaba, add hermes
  profile/completion/plugins/mcp to top-level table, add --profile/-p
  global flag, add --source chat option
- slash-commands.md: add /yolo and /commands, fix /q alias conflict
  (resolves to /queue not /quit), add missing aliases (/bg, /set-home,
  /reload_mcp, /gateway)
- toolsets-reference.md: fix hermes-api-server (not same as hermes-cli,
  omits clarify/send_message/text_to_speech)
- profile-commands.md: fix show name required not optional, --clone-from
  not --from, add --remove/--name to alias, fix alias path, fix export/
  import arg types, remove non-existent fish completion
- tools-reference.md: add EXA_API_KEY to web tools requires_env
- mcp-config-reference.md: add auth key for OAuth, tool name sanitization
- environment-variables.md: add EXA_API_KEY, update provider values
- plugins.md: remove non-existent ctx.register_command(), add
  ctx.inject_message()

Feature docs additions:
- security.md: add /yolo mode, approval modes (manual/smart/off),
  configurable timeout, expanded dangerous patterns table
- cron.md: add wrap_response config, [SILENT] suppression
- mcp.md: add dynamic tool discovery, MCP sampling support
- cli.md: add Ctrl+Z suspend, busy_input_mode, tool_preview_length
- docker.md: add skills/credential file mounting

Messaging platform docs:
- telegram.md: add webhook mode, DoH fallback IPs
- slack.md: add multi-workspace OAuth support
- discord.md: add DISCORD_IGNORE_NO_MENTION
- matrix.md: add MSC3245 native voice messages
- feishu.md: expand from 129 to 365 lines (encrypt key, verification
  token, group policy, card actions, media, rate limiting, markdown,
  troubleshooting)
- wecom.md: expand from 86 to 264 lines (per-group allowlists, media,
  AES decryption, stream replies, reconnection, troubleshooting)

Configuration docs:
- quickstart.md: add DeepSeek, Copilot, Copilot ACP providers
- configuration.md: add DeepSeek provider, Exa web backend, terminal
  env_passthrough/images, browser.command_timeout, compression params,
  discord config, security/tirith config, timezone, auxiliary models

21 files changed, ~1000 lines added
2026-03-30 17:15:21 -07:00
3c8f910973 feat: respect NO_COLOR env var and TERM=dumb (#4079)
Add should_use_color() function to hermes_cli/colors.py that checks
NO_COLOR (https://no-color.org/) and TERM=dumb before emitting ANSI
escapes. The existing color() helper now uses this function instead
of a bare isatty() check.

This is the foundation — cli.py and banner.py still have inline ANSI
constants that bypass this module (tracked in #4071).

Closes #4066

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
2026-03-30 17:07:21 -07:00
13f3e67165 ux: show 'Initializing agent...' on first message (#4086)
Display a brief status message before the heavy agent initialization
(OpenAI client setup, tool loading, memory init, etc.) so users
aren't staring at a blank screen for several seconds.

Only prints when self.agent is None (first use or after model switch).

Closes #4060

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
2026-03-30 17:05:40 -07:00
4a7c17fca5 fix(gateway): read custom_providers context_length in hygiene compression (#4085)
Gateway hygiene pre-compression only checked model.context_length from
the top-level config, missing per-model context_length defined in
custom_providers entries. This caused premature compression for custom
provider users (e.g. 128K default instead of 200K configured).

The AIAgent's own compressor already reads custom_providers correctly
(run_agent.py lines 1171-1189). This adds the same fallback to the
gateway hygiene path, running after runtime provider resolution so
the base_url is available for matching.
2026-03-30 17:04:31 -07:00
6e4598ce1e Merge branch 'main' into rewbs/tool-use-charge-to-subscription 2026-03-31 08:48:54 +09:00
f007284d05 fix: rate-limit pairing rejection messages to prevent spam (#4081)
* fix: rate-limit pairing rejection messages to prevent spam

When generate_code() returns None (rate limited or max pending), the
"Too many pairing requests" message was sent on every subsequent DM
with no cooldown. A user sending 30 messages would get 30 rejection
replies — reported as potential hack on WhatsApp.

Now check _is_rate_limited() before any pairing response, and record
rate limit after sending a rejection. Subsequent messages from the
same user are silently ignored until the rate limit window expires.

* test: add coverage for pairing response rate limiting

Follow-up to cherry-picked PR #4042 — adds tests verifying:
- Rate-limited users get silently ignored (no response sent)
- Rejection messages record rate limit for subsequent suppression

---------

Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>
2026-03-30 16:48:00 -07:00
3d47af01c3 fix(honcho): write config to instance-local path for profile isolation (#4037)
Multiple agents/profiles running 'hermes honcho setup' all wrote to
the shared global ~/.honcho/config.json, overwriting each other's
configuration.

Root cause: _write_config() defaulted to resolve_config_path() which
returns the global path when no instance-local file exists yet (i.e.
on first setup).

Fix: _write_config() now defaults to _local_config_path() which always
returns $HERMES_HOME/honcho.json. Each profile gets its own config file.
Reading still falls back to global for cross-app interop and seeding.

Also updates cmd_setup and cmd_status messaging to show the actual
write path.

Includes 10 new tests verifying profile isolation, global fallback
reads, and multi-profile independence.
2026-03-30 16:41:19 -07:00
275fcc6673 Merge pull request #4054 from NousResearch/ascii-video/text-readability-and-layout-oracle
ascii-video skill: text readability techniques and external layout oracle
2026-03-30 15:52:14 -07:00
ab62614a89 ascii-video: add text readability techniques and external layout oracle pattern
- composition.md: add text backdrop (gaussian dark mask behind glyphs) and
  external layout oracle pattern (browser-based text layout → JSON → Python
  renderer pipeline for obstacle-aware text reflow)
- shaders.md: add reverse vignette shader (center-darkening for text readability)
- troubleshooting.md: add diagnostic entries for text-over-busy-background
  readability and kaleidoscope-destroys-text pitfall
2026-03-30 18:48:22 -04:00
0287597d02 Optimize Playwright install 2026-03-30 17:38:07 -05:00
de368cac54 fix(tools): show browser and TTS in reconfigure menu (#4041)
* fix(gateway): honor default for invalid bool-like config values

* refactor: simplify web backend priority detection

Replace cascading boolean conditions with a priority-ordered loop.
Same behavior (verified against all 16 env var combinations),
half the lines, trivially extensible for new backends.

* fix(tools): show browser and TTS in reconfigure menu

_toolset_has_keys() returned False for toolsets with no-key providers
(Local Browser, Edge TTS) because it only checked providers with
env_vars. Users couldn't find these tools in the reconfigure list
and had no obvious way to switch browser/TTS backends.

Now treats providers with empty env_vars as always-configured, so
toolsets with free/local options always appear in the reconfigure menu.

---------

Co-authored-by: aydnOktay <xaydinoktay@gmail.com>
2026-03-30 14:11:39 -07:00
3a1e489dd6 Add build-essential to Dockerfile dependencies 2026-03-30 15:57:22 -05:00
0d1003559d refactor: simplify web backend priority detection (#4036)
* fix(gateway): honor default for invalid bool-like config values

* refactor: simplify web backend priority detection

Replace cascading boolean conditions with a priority-ordered loop.
Same behavior (verified against all 16 env var combinations),
half the lines, trivially extensible for new backends.

---------

Co-authored-by: aydnOktay <xaydinoktay@gmail.com>
2026-03-30 13:37:25 -07:00
4f4d7c4eeb Merge branch 'NousResearch:main' into docker-optimization 2026-03-30 15:29:27 -05:00
5de312c9e3 Simplify dockerignore 2026-03-30 15:29:06 -05:00
48942c89b5 Further npm optimizations 2026-03-30 15:27:11 -05:00
eba8d52d54 fix: show correct shell config path for macOS/zsh in install script (#4025)
- print_success() hardcoded 'source ~/.bashrc' regardless of user's shell
- On macOS (default zsh), ~/.bashrc doesn't exist, leaving users unable to
  find the hermes command after install
- Now detects $SHELL and shows the correct file (zshrc/bashrc)
- Also captures .[all] install failure output instead of silencing with
  2>/dev/null, so users can diagnose why full extras failed
2026-03-30 13:25:11 -07:00
72104eb06f fix(gateway): honor default for invalid bool-like config values (#4029)
Co-authored-by: aydnOktay <xaydinoktay@gmail.com>
2026-03-30 13:24:48 -07:00
fdef0456a7 Merge branch 'NousResearch:main' into docker-optimization 2026-03-30 15:21:45 -05:00
4b35836ba4 fix(auth): use bearer auth for MiniMax Anthropic endpoints (#4028)
MiniMax's /anthropic endpoints implement Anthropic's Messages API but
require Authorization: Bearer instead of x-api-key. Without this fix,
MiniMax users get 401 errors in gateway sessions.

Adds _requires_bearer_auth() to detect MiniMax endpoints and route
through auth_token in the Anthropic SDK. Check runs before OAuth
token detection so MiniMax keys aren't misclassified as setup tokens.

Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-03-30 13:21:39 -07:00
bd376fe976 fix(docs): improve mobile sidebar navigation
The sidebar had all categories expanded by default (collapsed: false),
which on mobile created a 60+ item flat list when opening the sidebar.
Reported by danny on Discord.

Changes:
- Set all top-level categories to collapsed: true (tap to expand)
- Enable autoCollapseCategories: true (accordion — opening one section
  closes others, prevents the overwhelming flat list)
- Enable hideable sidebar (swipe-to-dismiss on mobile)
- Add mobile CSS: larger touch targets (0.75rem padding), bolder
  category headers, visible subcategory indentation with left border,
  wider sidebar (85vw / 360px max), darker backdrop overlay
2026-03-30 13:20:55 -07:00
f93637b3a1 feat: add /profile slash command to show active profile (#4027)
Adds /profile to COMMAND_REGISTRY (Info category) with handlers in
both CLI and gateway. Shows the active profile name and home directory.

Works on all platforms — CLI, Telegram, Discord, Slack, etc.
Detects profile by checking if HERMES_HOME is under ~/.hermes/profiles/.
Shows 'default' when running without a profile.
2026-03-30 13:20:06 -07:00
8210e7aba6 Optimize Dockerfile: combine RUN commands, clear caches, add .dockerignore
- Combine apt-get update and install into single RUN with cache clearing
- Remove APT lists after installation
- Add --no-cache-dir to pip install
- Add --prefer-offline --no-audit to npm install
- Create .dockerignore to exclude unnecessary files from build context
- Update docker-publish.yml workflow to tag images with release names
- Ensure buildx caching is used (type=gha)
2026-03-30 15:19:52 -05:00
7b4fe0528f fix(auth): use bearer auth for MiniMax Anthropic endpoints (#4028)
MiniMax's /anthropic endpoints implement Anthropic's Messages API but
require Authorization: Bearer instead of x-api-key. Without this fix,
MiniMax users get 401 errors in gateway sessions.

Adds _requires_bearer_auth() to detect MiniMax endpoints and route
through auth_token in the Anthropic SDK. Check runs before OAuth
token detection so MiniMax keys aren't misclassified as setup tokens.

Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-03-30 13:19:44 -07:00
950f69475f feat(browser): add Camofox local anti-detection browser backend (#4008)
Camofox-browser is a self-hosted Node.js server wrapping Camoufox
(Firefox fork with C++ fingerprint spoofing). When CAMOFOX_URL is set,
all 11 browser tools route through the Camofox REST API instead of
the agent-browser CLI.

Maps 1:1 to the existing browser tool interface:
- Navigate, snapshot, click, type, scroll, back, press, close
- Get images, vision (screenshot + LLM analysis)
- Console (returns empty with note — camofox limitation)

Setup: npm start in camofox-browser dir, or docker run -p 9377:9377
Then: CAMOFOX_URL=http://localhost:9377 in ~/.hermes/.env

Advantages over Browserbase (cloud):
- Free (no per-session API costs)
- Local (zero network latency for browser ops)
- Anti-detection at C++ level (bypasses Cloudflare/Google bot detection)
- Works offline, Docker-ready

Files:
- tools/browser_camofox.py: Full REST backend (~400 lines)
- tools/browser_tool.py: Routing at each tool function
- hermes_cli/config.py: CAMOFOX_URL env var entry
- tests/tools/test_browser_camofox.py: 20 tests
2026-03-30 13:18:42 -07:00
7dac75f2ae fix: prevent context pressure warning spam after compression (#4012)
* feat: add /yolo slash command to toggle dangerous command approvals

Adds a /yolo command that toggles HERMES_YOLO_MODE at runtime, skipping
all dangerous command approval prompts for the current session. Works in
both CLI and gateway (Telegram, Discord, etc.).

- /yolo -> ON: all commands auto-approved, no confirmation prompts
- /yolo -> OFF: normal approval flow restored

The --yolo CLI flag already existed for launch-time opt-in. This adds
the ability to toggle mid-session without restarting.

Session-scoped — resets when the process ends. Uses the existing
HERMES_YOLO_MODE env var that check_all_command_guards() already
respects.

* fix: prevent context pressure warning spam (agent loop + gateway rate-limit)

Two complementary fixes for repeated context pressure warnings spamming
gateway users (Telegram, Discord, etc.):

1. Agent-level loop fix (run_agent.py):
   After compression, only reset _context_pressure_warned if the
   post-compression estimate is actually below the 85% warning level.
   Previously the flag was unconditionally reset, causing the warning
   to re-fire every loop iteration when compression couldn't reduce
   below 85% of the threshold (e.g. very low threshold like 15%,
   or system prompt alone exceeds the warning level).

2. Gateway-level rate-limit (gateway/run.py, salvaged from PR #3786):
   Per-chat_id cooldown of 1 hour on compression warning messages.
   Both warning paths ('still large after compression' and 'compression
   failed') are gated. Defense-in-depth — even if the agent-level fix
   has edge cases, users won't see more than one warning per hour.

Co-authored-by: dlkakbs <dlkakbs@users.noreply.github.com>

---------

Co-authored-by: dlkakbs <dlkakbs@users.noreply.github.com>
2026-03-30 13:18:21 -07:00
ed9af6e589 fix: create AsyncOpenAI lazily in trajectory_compressor to avoid closed event loop (#4013)
The AsyncOpenAI client was created once at __init__ and stored as an
instance attribute. process_directory() calls asyncio.run() which creates
and closes a fresh event loop. On a second call, the client's httpx
transport is still bound to the closed loop, raising RuntimeError:
"Event loop is closed" — the same pattern fixed by PR #3398 for the
main agent loop.

Create the client lazily in _get_async_client() so each asyncio.run()
gets a client bound to the current loop.

Co-authored-by: binhnt92 <binhnt.ht.92@gmail.com>
2026-03-30 13:16:16 -07:00
158f49f19a fix: enforce priority order in Telegram menu — core > plugins > skills (#4023)
The menu now has explicit priority tiers:
1. Core CommandDef commands (always included, never bumped)
2. Plugin slash commands (take precedence over skills)
3. Built-in skill commands (fill remaining slots alphabetically)

Only skills get trimmed when the 100-command cap is hit. Adding new
core commands or plugin commands automatically pushes skills out,
not the other way around.
2026-03-30 13:04:06 -07:00
86250a3e45 docs: expand terminal backends section + fix docs build (#4016)
* feat(telegram): add webhook mode as alternative to polling

When TELEGRAM_WEBHOOK_URL is set, the adapter starts an HTTP webhook
server (via python-telegram-bot's start_webhook()) instead of long
polling. This enables cloud platforms like Fly.io and Railway to
auto-wake suspended machines on inbound HTTP traffic.

Polling remains the default — no behavior change unless the env var
is set.

Env vars:
  TELEGRAM_WEBHOOK_URL    Public HTTPS URL for Telegram to push to
  TELEGRAM_WEBHOOK_PORT   Local listen port (default 8443)
  TELEGRAM_WEBHOOK_SECRET Secret token for update verification

Cherry-picked and adapted from PR #2022 by SHL0MS. Preserved all
current main enhancements (network error recovery, polling conflict
detection, DM topics setup).

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>

* fix: send_document call in background task delivery + vision download timeout

Two fixes salvaged from PR #2269 by amethystani:

1. gateway/run.py: adapter.send_file() → adapter.send_document()
   send_file() doesn't exist on BasePlatformAdapter. Background task
   media files were silently never delivered (AttributeError swallowed
   by except Exception: pass).

2. tools/vision_tools.py: configurable image download timeout via
   HERMES_VISION_DOWNLOAD_TIMEOUT env var (default 30s), plus guard
   against raise None when max_retries=0.

The third fix in #2269 (opencode-go auth config) was already resolved
on main.

Co-authored-by: amethystani <amethystani@users.noreply.github.com>

* docs: expand terminal backends section + fix feishu MDX build error

---------

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
Co-authored-by: amethystani <amethystani@users.noreply.github.com>
2026-03-30 12:59:58 -07:00
ea342f2382 Fix banner alignment in installer script (#4011)
Co-authored-by: Ahmed Khaled <wakeupwithme000@gmail.com>
2026-03-30 11:24:10 -07:00
60ecde8ac7 fix: fit all 100 commands in Telegram menu with 40-char descriptions (#4010)
* fix: truncate skill descriptions to 100 chars in Telegram menu

* fix: 40-char desc cap + 100 command limit for Telegram menu

setMyCommands has an undocumented total payload size limit.
50 commands with 256-char descriptions failed, 50 with 100-char
worked, and 100 with 40-char descriptions also works (~5300 total
chars). Truncate skill descriptions to 40 chars in the menu picker
and set cap back to 100. Full descriptions available via /commands.
2026-03-30 11:21:13 -07:00
f3069c649c fix(cli): add missing subprocess.run() timeouts in doctor and status (#4009)
Add timeout parameters to 4 subprocess.run() calls that could hang
indefinitely if the child process blocks (e.g., unresponsive docker
daemon, systemctl waiting for D-Bus):

- doctor.py: docker info (timeout=10), ssh check (timeout=15)
- status.py: systemctl is-active (timeout=5), launchctl list (timeout=5)

Each call site now catches subprocess.TimeoutExpired and treats it as
a failure, consistent with how non-zero return codes are already handled.

Add AST-based regression test that verifies every subprocess.run() call
in CLI modules specifies a timeout keyword argument.

Co-authored-by: dieutx <dangtc94@gmail.com>
2026-03-30 11:17:15 -07:00
0976bf6cd0 feat: add /yolo slash command to toggle dangerous command approvals (#3990)
Adds a /yolo command that toggles HERMES_YOLO_MODE at runtime, skipping
all dangerous command approval prompts for the current session. Works in
both CLI and gateway (Telegram, Discord, etc.).

- /yolo -> ON: all commands auto-approved, no confirmation prompts
- /yolo -> OFF: normal approval flow restored

The --yolo CLI flag already existed for launch-time opt-in. This adds
the ability to toggle mid-session without restarting.

Session-scoped — resets when the process ends. Uses the existing
HERMES_YOLO_MODE env var that check_all_command_guards() already
respects.
2026-03-30 11:17:09 -07:00
da3e22bcfa fix: cap Telegram menu at 50 commands — API rejects above ~60 (#4006)
* fix: use SKILLS_DIR not repo path for Telegram menu skill filter

Skills are synced to ~/.hermes/skills/ (SKILLS_DIR), not the repo's
skills/ directory. The previous filter compared against the repo path
so no skills matched. Now checks SKILLS_DIR and excludes .hub/
subdirectory (user-installed hub skills).

* fix: cap Telegram menu at 50 commands — API rejects above ~60

Telegram's setMyCommands returns BOT_COMMANDS_TOO_MUCH when
registering close to 100 commands despite docs claiming 100 is the
limit. Metadata overhead causes rejection above ~60. Cap at 50 for
reliability — remaining commands accessible via /commands.
2026-03-30 11:05:20 -07:00
9fd78c7a8e fix: use SKILLS_DIR not repo path for Telegram menu skill filter (#4005)
Skills are synced to ~/.hermes/skills/ (SKILLS_DIR), not the repo's
skills/ directory. The previous filter compared against the repo path
so no skills matched. Now checks SKILLS_DIR and excludes .hub/
subdirectory (user-installed hub skills).
2026-03-30 11:01:13 -07:00
5ceed021dc feat(gateway): skill-aware slash commands, paginated /commands, Telegram 100-cap (#3934)
* feat(gateway): skill-aware slash commands, paginated /commands, Telegram 100-cap

Map active skills to Telegram's slash command menu so users can
discover and invoke skills directly. Three changes:

1. Telegram menu now includes active skill commands alongside built-in
   commands, capped at 100 entries (Telegram Bot API limit). Overflow
   commands remain callable but hidden from the picker. Logged at
   startup when cap is hit.

2. New /commands [page] gateway command for paginated browsing of all
   commands + skills. /help now shows first 10 skill commands and
   points to /commands for the full list.

3. When a user types a slash command that matches a disabled or
   uninstalled skill, they get actionable guidance:
   - Disabled: 'Enable it with: hermes skills config'
   - Optional (not installed): 'Install with: hermes skills install official/<path>'

Built on ideas from PR #3921 by @kshitijk4poor.

* chore: move 21 niche skills to optional-skills

Move specialized/niche skills from built-in (skills/) to optional
(optional-skills/) to reduce the default skill count. Users can
install them with: hermes skills install official/<category>/<name>

Moved skills (21):
- mlops: accelerate, chroma, faiss, flash-attention,
  hermes-atropos-environments, huggingface-tokenizers, instructor,
  lambda-labs, llava, nemo-curator, pinecone, pytorch-lightning,
  qdrant, saelens, simpo, slime, tensorrt-llm, torchtitan
- research: domain-intel, duckduckgo-search
- devops: inference-sh cli

Built-in skills: 96 → 75
Optional skills: 22 → 43

* fix: only include repo built-in skills in Telegram menu, not user-installed

User-installed skills (from hub or manually added) stay accessible via
/skills and by typing the command directly, but don't get registered
in the Telegram slash command picker. Only skills whose SKILL.md is
under the repo's skills/ directory are included in the menu.

This keeps the Telegram menu focused on the curated built-in set while
user-installed skills remain discoverable through /skills and /commands.
2026-03-30 10:57:30 -07:00
97d6813f51 fix(cache): use deterministic call_id fallbacks instead of random UUIDs (#3991)
When the API doesn't provide a call_id for tool calls, the fallback
generated a random uuid4 hex. This made every API call's input unique
when replayed, preventing OpenAI's prompt cache from matching the
prefix across turns.

Replaced all four uuid4 fallback sites with a deterministic hash of
(function_name, arguments, position_index). The same tool call now
always produces the same fallback call_id, preserving cache-friendly
input stability.

Affected code paths:
- _chat_messages_to_responses_input() — Codex input reconstruction
- _normalize_codex_response() — function_call and custom_tool_call
- _build_assistant_message() — assistant message construction
2026-03-30 09:43:56 -07:00
37825189dd fix(skills): validate hub bundle paths before install (#3986)
Co-authored-by: Gutslabs <gutslabsxyz@gmail.com>
2026-03-30 08:37:19 -07:00
e08778fa1e chore: release v0.6.0 (2026.3.30) (#3985) 2026-03-30 08:29:38 -07:00
fb634068df fix(security): extend secret redaction to ElevenLabs, Tavily and Exa API keys (#3920)
ElevenLabs (sk_), Tavily (tvly-), and Exa (exa_) keys were not covered
by _PREFIX_PATTERNS, leaking in plain text via printenv or log output.

Salvaged from PR #3790 by @memosr. Tests rewritten with correct
assertions (original tests had vacuously true checks).

Co-authored-by: memosr <memosr@users.noreply.github.com>
2026-03-30 08:13:01 -07:00
74181fe726 fix: add TTY guard to interactive CLI commands to prevent CPU spin (#3933)
When interactive TUI commands are invoked non-interactively (e.g. via
the agent's terminal() tool through a subprocess pipe), curses loops
spin at 100% CPU and input() calls hang indefinitely.

Defense in depth — two layers:

1. Source-level guard in curses_checklist() (curses_ui.py + checklist.py):
   Returns cancel_returns immediately when stdin is not a TTY. This
   catches ALL callers automatically, including future code.

2. Command-level guards with clear error messages:
   - hermes tools (interactive checklist, not list/disable/enable)
   - hermes setup (interactive wizard)
   - hermes model (provider/model picker)
   - hermes whatsapp (pairing setup)
   - hermes skills config (skill toggle)
   - hermes mcp configure (tool selection)
   - hermes uninstall (confirmation prompt)

Non-interactive subcommands (hermes tools list, hermes tools enable,
hermes mcp add/remove/list/test, hermes skills search/install/browse)
remain unaffected.
2026-03-30 08:10:23 -07:00
1e896b0251 fix: resolve 7 failing CI tests (#3936)
1. matrix voice: _on_room_message_media unconditionally overwrote
   media_urls with the image cache path (always None for non-images),
   wiping the locally-cached voice path. Now only overrides when
   cached_path is truthy.

2. cli_tools_command: /tools disable no longer prompts for confirmation
   (input() removed in earlier commit to fix TUI hang), but tests still
   expected the old Y/N prompt flow. Updated tests to match current
   behavior (direct apply + session reset).

3. slack app_mention: connect() was refactored for multi-workspace
   (creates AsyncWebClient per token), but test only mocked the old
   self._app.client path. Added AsyncWebClient and acquire_scoped_lock
   mocks.

4. website_policy: module-level _cached_policy from earlier tests caused
   fast-path return of None. Added invalidate_cache() before assertion.

5. codex 401 refresh: already passing on current main (fixed by
   intervening commit).
2026-03-30 08:10:14 -07:00
0b0c1b326c fix: openclaw migration overwrites model config dict with string (#3924)
migrate_model_config() was writing `config["model"] = model_str` which
replaces the entire model dict (default, provider, base_url) with a
bare string. This causes 'str' object has no attribute 'get' errors
throughout Hermes when any code does model_cfg.get("default").

Now preserves the existing model dict and only updates the "default"
key, keeping provider/base_url intact.
2026-03-30 03:02:28 -07:00
b4496b33b5 fix: background task media delivery + vision download timeout (#3919)
* feat(telegram): add webhook mode as alternative to polling

When TELEGRAM_WEBHOOK_URL is set, the adapter starts an HTTP webhook
server (via python-telegram-bot's start_webhook()) instead of long
polling. This enables cloud platforms like Fly.io and Railway to
auto-wake suspended machines on inbound HTTP traffic.

Polling remains the default — no behavior change unless the env var
is set.

Env vars:
  TELEGRAM_WEBHOOK_URL    Public HTTPS URL for Telegram to push to
  TELEGRAM_WEBHOOK_PORT   Local listen port (default 8443)
  TELEGRAM_WEBHOOK_SECRET Secret token for update verification

Cherry-picked and adapted from PR #2022 by SHL0MS. Preserved all
current main enhancements (network error recovery, polling conflict
detection, DM topics setup).

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>

* fix: send_document call in background task delivery + vision download timeout

Two fixes salvaged from PR #2269 by amethystani:

1. gateway/run.py: adapter.send_file() → adapter.send_document()
   send_file() doesn't exist on BasePlatformAdapter. Background task
   media files were silently never delivered (AttributeError swallowed
   by except Exception: pass).

2. tools/vision_tools.py: configurable image download timeout via
   HERMES_VISION_DOWNLOAD_TIMEOUT env var (default 30s), plus guard
   against raise None when max_retries=0.

The third fix in #2269 (opencode-go auth config) was already resolved
on main.

Co-authored-by: amethystani <amethystani@users.noreply.github.com>

---------

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
Co-authored-by: amethystani <amethystani@users.noreply.github.com>
2026-03-30 02:59:39 -07:00
d028a94b83 fix(whatsapp): skip reply prefix in bot mode — only needed for self-chat (#3931)
The WhatsApp bridge prepends '⚕ *Hermes Agent*\n────────────\n' to
every outgoing message. In self-chat mode this is necessary to
distinguish the bot's responses from the user's own messages. In bot
mode the messages already come from a different number, making the
prefix redundant and cluttered.

Now only prepends the prefix when WHATSAPP_MODE is 'self-chat' (the
default). Bot mode messages are sent clean.
2026-03-30 02:55:33 -07:00
0e592aa5b4 fix(cli): remove input() from /tools disable that freezes the terminal (#3918)
input() hangs inside prompt_toolkit's TUI event loop — this is a known
pitfall (AGENTS.md). The /tools disable and /tools enable commands used
input() for a Y/N confirmation prompt, causing the terminal to freeze
with no way to type a response.

Fix: remove the confirmation prompt. The user typing '/tools disable web'
is implicit consent. The change is applied directly with a status message.
2026-03-30 02:53:21 -07:00
efae525dc5 feat(plugins): add inject_message interface for remote message injection (#3778) 2026-03-30 02:48:06 -07:00
5148682b43 feat: mount skills directory into all remote backends with live sync (#3890)
Skills with scripts/, templates/, and references/ subdirectories need
those files available inside sandboxed execution environments. Previously
the skills directory was missing entirely from remote backends.

Live sync — files stay current as credentials refresh and skills update:
- Docker/Singularity: bind mounts are inherently live (host changes
  visible immediately)
- Modal: _sync_files() runs before each command with mtime+size caching,
  pushing only changed credential and skill files (~13μs no-op overhead)
- SSH: rsync --safe-links before each command (naturally incremental)
- Daytona: _upload_if_changed() with mtime+size caching before each command

Security — symlink filtering:
- Docker/Singularity: sanitized temp copy when symlinks detected
- Modal/Daytona: iter_skills_files() skips symlinks
- SSH: rsync --safe-links skips symlinks pointing outside source tree
- Temp dir cleanup via atexit + reuse across calls

Non-root user support:
- SSH: detects remote home via echo $HOME, syncs to $HOME/.hermes/
- Daytona: detects sandbox home before sync, uploads to $HOME/.hermes/
- Docker/Modal/Singularity: run as root, /root/.hermes/ is correct

Also:
- credential_files.py: fix name/path key fallback in required_credential_files
- Singularity, SSH, Daytona: gained credential file support
- 14 tests covering symlink filtering, name/path fallback, iter_skills_files
2026-03-30 02:45:41 -07:00
791f4e94b2 feat(slack): multi-workspace support via OAuth token file (#3903)
Salvaged from PR #2033 by yoannes. Adds multi-workspace Slack support
so a single Hermes instance can serve multiple Slack workspaces after
OAuth installs.

Changes:
- Support comma-separated bot tokens in SLACK_BOT_TOKEN env var
- Load additional OAuth-persisted tokens from HERMES_HOME/slack_tokens.json
- Route all Slack API calls through workspace-aware _get_client(chat_id)
  instead of always using the primary app client
- Track channel → workspace mapping from incoming events
- Per-workspace bot_user_id for correct mention detection
- Workspace-aware file downloads (correct auth token per workspace)

Backward compatible: single-token setups work identically.

Token file format (slack_tokens.json):
  {"T12345": {"token": "xoxb-...", "team_name": "My Workspace"}}

Fixed from original PR:
- Uses get_hermes_home() instead of hardcoded ~/.hermes/ path

Co-authored-by: yoannes <yoannes@users.noreply.github.com>
2026-03-30 01:51:48 -07:00
a4b064763d fix(cron): tighten [SILENT] instruction to prevent report-with-silent-prefix (#3901)
The model was interpreting [SILENT] as a metadata prefix and writing
full reports with [SILENT] slapped at the front. The old instruction
said 'optionally followed by a brief internal note' which gave too
much room. New instruction explicitly says: [SILENT] means nothing
else, do NOT combine it with a report.
2026-03-30 00:11:00 -07:00
138ea3fbe8 fix(docs): escape angle-bracket URLs in feishu.md breaking MDX build (#3902) 2026-03-30 00:09:30 -07:00
ee61485cac feat(matrix): support native voice messages via MSC3245 (#3877)
* feat(matrix): support native voice messages

* fix: skip matrix voice tests when matrix-nio not installed

---------

Co-authored-by: Carlos Alberto Pereira Gomes <carlosapgomes@users.noreply.github.com>
2026-03-30 00:02:51 -07:00
947faed3bc feat(approvals): make dangerous command approval timeout configurable (#3886)
* feat(approvals): make dangerous command approval timeout configurable

Read `approvals.timeout` from config.yaml (default 60s) instead of
hardcoding 60 seconds in both the fallback CLI prompt and the TUI
prompt_toolkit callback.

Follows the same pattern as `clarify.timeout` which is already
configurable via CLI_CONFIG.

Closes #3765

* fix: add timeout default to approvals section in DEFAULT_CONFIG

---------

Co-authored-by: acsezen <asezen@icloud.com>
2026-03-30 00:02:02 -07:00
c288bbfb57 fix(cli): prevent status bar wrapping into duplicate rows (#3883)
- measure status bar display width using prompt_toolkit cell widths
- trim rendered status text when fragments would overflow
- add a final single-fragment fallback to prevent wrapping
- update width assertions to validate display cells instead of len()
2026-03-29 23:59:07 -07:00
a347921314 docs: comprehensive OpenClaw migration guide (#3900)
New standalone guide at guides/migrate-from-openclaw.md with:
- Complete config key mapping tables for every category
- Agent behavior mappings (thinkingDefault → reasoning_effort, etc.)
- Session reset policy mapping (session.reset vs resetTriggers)
- TTS dual-source explanation (messages.tts.providers + talk config)
- MCP server field-by-field mapping
- Messaging platform table with exact config paths and env vars
- API key resolution: 3 sources, priority order, supported targets
- SecretRef handling: plain strings, env templates, SecretRef objects
- Post-migration checklist (6 steps)
- Troubleshooting section
- Complete archived items table with recreation guidance

CLI commands reference condensed to summary + link to full guide.
Added to sidebar under Guides & Tutorials.
2026-03-29 23:58:12 -07:00
09def65eff fix(migration): expand OpenClaw migration to cover full data footprint (#3869)
Cross-referenced the OpenClaw Zod schema and TypeScript source against
our migration script. Found and fixed:

Expanded data sources:
- Legacy config fallback: clawdbot.json, moldbot.json
- Legacy dir fallback: ~/.clawdbot/, ~/.moldbot/
- API keys from ~/.openclaw/.env and auth-profiles.json
- Personal skills from ~/.agents/skills/
- Project skills from workspace/.agents/skills/
- BOOTSTRAP.md archived (was silently skipped)
- Expanded env key allowlist: DEEPSEEK, GEMINI, ZAI, MINIMAX

Fixed wrong config paths (verified against Zod schema):
- humanDelay.enabled → humanDelay.mode (field doesn't exist as .enabled)
- agents.defaults.exec.timeout → tools.exec.timeoutSec (wrong path + name)
- messages.tts.elevenlabs.voiceId → messages.tts.providers.elevenlabs.voiceId
- session.resetTriggers (string[]) → session.reset (structured object)
- approvals.mode → approvals.exec.mode (no top-level mode)
- browser.inactivityTimeoutMs → doesn't exist; map cdpUrl+headless instead
- tools.webSearch.braveApiKey → tools.web.search.brave.apiKey
- tools.exec.timeout → tools.exec.timeoutSec

Added SecretRef resolution:
- All token/apiKey fields in OpenClaw can be strings, env templates
  (${VAR}), or SecretRef objects ({source:'env',id:'VAR'}). Added
  resolve_secret_input() to handle all three forms.

Fixed auth-profiles.json:
- Canonical field is 'key' not 'apiKey' (though alias accepted)
- File wraps entries in a 'profiles' key — now handled

Fixed TTS config:
- Provider settings at messages.tts.providers.{name} (not flat)
- Also checks top-level 'talk' config as fallback source

Docs updated with new sources and key list.
2026-03-29 22:49:34 -07:00
649d149438 feat(telegram): add webhook mode as alternative to polling (#3880)
When TELEGRAM_WEBHOOK_URL is set, the adapter starts an HTTP webhook
server (via python-telegram-bot's start_webhook()) instead of long
polling. This enables cloud platforms like Fly.io and Railway to
auto-wake suspended machines on inbound HTTP traffic.

Polling remains the default — no behavior change unless the env var
is set.

Env vars:
  TELEGRAM_WEBHOOK_URL    Public HTTPS URL for Telegram to push to
  TELEGRAM_WEBHOOK_PORT   Local listen port (default 8443)
  TELEGRAM_WEBHOOK_SECRET Secret token for update verification

Cherry-picked and adapted from PR #2022 by SHL0MS. Preserved all
current main enhancements (network error recovery, polling conflict
detection, DM topics setup).

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
2026-03-29 22:36:07 -07:00
5602458794 security: harden dangerous command detection and add file tool path guards (#3872)
Closes gaps that allowed an agent to expose Docker's Remote API to the
internet by writing to /etc/docker/daemon.json.

Terminal tool (approval.py):
- chmod: now catches 666 and symbolic modes (o+w, a+w), not just 777
- cp/mv/install: detected when targeting /etc/
- sed -i/--in-place: detected when targeting /etc/

File tools (file_tools.py):
- write_file and patch now refuse to write to sensitive system paths
  (/etc/, /boot/, /usr/lib/systemd/, docker.sock)
- Directs users to the terminal tool (which has approval prompts) for
  system file modifications
2026-03-29 22:33:47 -07:00
1c900c45e3 fix(agent): support full context length resolution for direct Gemini API endpoints (#3876)
* add .aac audio file format support to transcription tool

* fix(agent): support full context length resolution for direct Gemini API endpoints

Add generativelanguage.googleapis.com to _URL_TO_PROVIDER so direct
Gemini API users get correct 1M+ context length instead of the 128K
unknown-proxy fallback.

Co-authored-by: bb873 <bb873@users.noreply.github.com>

---------

Co-authored-by: Adrian Scott <adrian@adrianscott.com>
Co-authored-by: bb873 <bb873@users.noreply.github.com>
2026-03-29 21:56:07 -07:00
227601c200 feat(discord): add message processing reactions (salvage #1980) (#3871)
Adds lifecycle hooks to the base platform adapter so Discord (and future
platforms) can react to message processing events:

  👀  when processing starts
    on successful completion (delivery confirmed)
    on failure, error, or cancellation

Implementation:
- base.py: on_processing_start/on_processing_complete hooks with
  _run_processing_hook error isolation wrapper; delivery tracking
  via _record_delivery closure for accurate success detection
- discord.py: _add_reaction/_remove_reaction helpers + hook overrides
- Tests for base hook lifecycle and Discord-specific reactions

Co-authored-by: alanwilhelm <alanwilhelm@users.noreply.github.com>
2026-03-29 21:55:23 -07:00
fd29933a6d fix: use argparse entrypoint in top-level launcher (#3874)
The ./hermes convenience script still used the legacy Fire-based
cli.main wrapper, which doesn't support subcommands (gateway, cron,
doctor, etc.). The installed 'hermes' command already uses
hermes_cli.main:main (argparse) — this aligns the launcher.

Salvaged from PR #2009 by gito369.
2026-03-29 21:54:36 -07:00
839f798b74 feat(telegram): add group mention gating and regex triggers (#3870)
Adds Discord-style mention gating for Telegram groups:
- telegram.require_mention: gate group messages (default: false)
- telegram.mention_patterns: regex wake-word triggers
- telegram.free_response_chats: bypass gating for specific chats

When require_mention is enabled, group messages are accepted only for:
- slash commands
- replies to the bot
- @botusername mentions
- regex wake-word pattern matches

DMs remain unrestricted. @mention text is stripped before passing to
the agent. Invalid regex patterns are ignored with a warning.

Config bridges follow the existing Discord pattern (yaml → env vars).

Cherry-picked and adapted from PR #1977 by mcleay. Fixed ChatType
comparison to work without python-telegram-bot installed (uses string
matching instead of enum, consistent with other entity_type checks).

Co-authored-by: mcleay <mcleay@users.noreply.github.com>
2026-03-29 21:53:59 -07:00
366bfc3c76 fix(setup): auto-install matrix-nio during hermes setup (#3873)
Setup previously only printed a manual install hint for matrix-nio,
causing the gateway to crash with 'matrix-nio not installed' after
configuring Matrix. Now auto-installs matrix-nio (or matrix-nio[e2e]
when E2EE is enabled) using the same uv-first/pip-fallback pattern
as Daytona and Modal backends.

Also adds hermes-agent[matrix] to the [all] extra in pyproject.toml
and a regression test to keep it there.

Co-authored-by: Gutslabs <Gutslabs@users.noreply.github.com>
Co-authored-by: cutepawss <cutepawss@users.noreply.github.com>
2026-03-29 21:53:28 -07:00
b4ceb541a7 fix(terminal): preserve partial output when command times out (#3868)
When a command timed out, all captured output was discarded — the agent
only saw 'Command timed out after Xs' with zero context. Now returns
the buffered output followed by a timeout marker, matching the existing
interrupt path behavior.

Salvaged from PR #3286 by @binhnt92.

Co-authored-by: nguyen binh <binhnt92@users.noreply.github.com>
2026-03-29 21:51:44 -07:00
ccf7bb1102 fix(nous): use curated model list instead of full API dump for Nous Portal (#3867)
All three Nous Portal model selection paths (hermes model, first-time
login, setup wizard) were hitting the live /models endpoint and showing
every model available — potentially hundreds. Now uses the curated
_PROVIDER_MODELS['nous'] list (25 agentic models matching OpenRouter
defaults) with 'Enter custom model name' for anything else.

Fixed in:
- hermes_cli/main.py: _model_flow_nous()
- hermes_cli/auth.py: _login_nous() model selection
- hermes_cli/setup.py: post-login model selection
2026-03-29 21:38:10 -07:00
ce2841f3c9 feat(gateway): add WeCom (Enterprise WeChat) platform support (#3847)
Adds WeCom as a gateway platform adapter using the AI Bot WebSocket
gateway for real-time bidirectional communication. No public endpoint
or new pip dependencies needed (uses existing aiohttp + httpx).

Features:
- WebSocket persistent connection with auto-reconnect (exponential backoff)
- DM and group messaging with configurable access policies
- Media upload/download with AES decryption for encrypted attachments
- Markdown rendering, quote context preservation
- Proactive + passive reply message modes
- Chunked media upload pipeline (512KB chunks)

Cherry-picked from PR #1898 by EvilRan with:
- Moved to current main (PR was 300 commits behind)
- Skipped base.py regressions (reply_to additions are good but belong
  in a separate PR since they affect all platforms)
- Fixed test assertions to match current base class send() signature
  (reply_to=None kwarg now explicit)
- All 16 integration points added surgically to current main
- No new pip dependencies (aiohttp + httpx already installed)

Fixes #1898

Co-authored-by: EvilRan <EvilRan@users.noreply.github.com>
2026-03-29 21:29:13 -07:00
e296efbf24 fix: add INFO-level logging for auxiliary provider resolution (#3866)
The auxiliary client's auto-detection chain was a black box — when
compression, summarization, or memory flush failed, the only clue was
a generic 'Request timed out' with no indication of which provider was
tried or why it was skipped.

Now logs at INFO level:
- 'Auxiliary auto-detect: using local/custom (qwen3.5-9b) — skipped:
  openrouter, nous' when auto-detection picks a provider
- 'Auxiliary compression: using auto (qwen3.5-9b) at http://localhost:11434/v1'
  before each auxiliary call
- 'Auxiliary compression: provider custom unavailable, falling back to
  openrouter' on fallback
- Clear warning with actionable guidance when NO provider is available:
  'Set OPENROUTER_API_KEY or configure a local model in config.yaml'
2026-03-29 21:29:00 -07:00
1cbb1b99cc Gate tool-gateway behind an env var, so it's not in users' faces until we're ready. Even if users enable it, it'll be blocked server-side for now, until we unlock for non-admin users on tool-gateway. 2026-03-30 13:28:10 +09:00
2ff2cd3a59 add .aac audio file format support to transcription tool (#3865)
Co-authored-by: Adrian Scott <adrian@adrianscott.com>
2026-03-29 21:27:03 -07:00
f39ca81bab docs: comprehensive hermes claw migrate reference (#3864)
The existing docs were two lines. The migration script handles 35
categories of data across persona, memory, skills, messaging platforms,
model providers, MCP servers, agent config, and more.

New docs cover:
- All CLI options (--dry-run, --preset, --overwrite, --migrate-secrets,
  --source, --workspace-target, --skill-conflict, --yes)
- 27 directly-imported categories with source → destination mapping
- 7 archived categories with manual recreation guidance
- Security notes on API key allowlisting
- Usage examples for common migration scenarios
2026-03-29 21:25:13 -07:00
3fad1e7cc1 fix(cron): resolve human-friendly delivery labels via channel directory (#3860)
Cron jobs configured with deliver labels from send_message(action='list')
like 'whatsapp:Alice (dm)' passed the label as a literal chat_id.
WhatsApp bridge failed with jidDecode error since 'Alice (dm)' isn't
a valid JID.

Now _resolve_delivery_target() strips display suffixes like ' (dm)' and
resolves human-friendly names via the channel directory before using
them. Raw IDs pass through unchanged when the directory has no match.

Fixes #1945.
2026-03-29 21:24:17 -07:00
86ac23c8da fix(auth): stop silently falling back to OpenRouter when no provider is configured (#3862)
Previously, when no API keys or provider credentials were found, Hermes
silently defaulted to OpenRouter + Claude Opus. This caused confusion
when users configured local servers (LM Studio, Ollama, etc.) with a
typo or unrecognized provider name — the system would silently route to
OpenRouter instead of telling them something was wrong.

Changes:
- resolve_provider() now raises AuthError when no credentials are found
  instead of returning 'openrouter' as a silent fallback
- Added local server aliases: lmstudio, ollama, vllm, llamacpp → custom
- Removed hardcoded 'anthropic/claude-opus-4.6' fallback from gateway
  and cron scheduler (they read from config.yaml instead)
- Updated cli-config.yaml.example with complete provider documentation
  including all supported providers, aliases, and local server setup
2026-03-29 21:06:35 -07:00
3cc50532d1 fix: auxiliary client uses placeholder key for local servers without auth (#3842)
Local inference servers (Ollama, llama.cpp, vLLM, LM Studio) don't
require API keys, but the auxiliary client's _resolve_custom_runtime()
rejected endpoints with empty keys — causing the auto-detection chain
to skip the user's local server entirely.  This broke compression,
summarization, and memory flush for users running local models without
an OpenRouter/cloud API key.

The main CLI already had this fix (PR #2556, 'no-key-required'
placeholder), but the auxiliary client's resolution path was missed.

Two fixes:
- _resolve_custom_runtime(): use 'no-key-required' placeholder instead
  of returning None when base_url is present but key is empty
- resolve_provider_client() custom branch: same placeholder fallback
  for explicit_base_url without explicit_api_key

Updates 2 tests that expected the old (broken) behavior.
2026-03-29 21:05:36 -07:00
2d607d36f6 fix(security): catch sensitive path writes in approval checks (#3859)
Co-authored-by: Gutslabs <gutslabsxyz@gmail.com>
2026-03-29 20:57:57 -07:00
aa389924ad fix: prefer curated model list when live probe returns fewer models (#3856)
The model picker for API-key providers (MiniMax, z.ai, etc.) probes
the live /models endpoint when the curated list has fewer than 8
models. When the live endpoint returns fewer models than the curated
list (e.g. MiniMax's Anthropic-compatible endpoint doesn't list M2.7),
the incomplete live list was used instead.

Now falls back to the curated list when live returns fewer models,
ensuring new models like MiniMax-M2.7 always appear in the picker.
2026-03-29 20:55:15 -07:00
5e67fc8c40 fix(vision): reject non-image files and enforce website policy (salvage #1940) (#3845)
Three safety gaps in vision_analyze_tool:

1. Local files accepted without checking if they're actually images —
   a renamed text file would get base64-encoded and sent to the model.
   Now validates magic bytes (PNG, JPEG, GIF, BMP, WebP, SVG).

2. No website policy enforcement on image URLs — blocked domains could
   be fetched via the vision tool. Now checks before download.

3. No redirect check — if an allowed URL redirected to a blocked domain,
   the download would proceed. Now re-checks the final URL.

Fixed one test that needed _validate_image_url mocked to bypass DNS
resolution on the fake blocked.test domain (is_safe_url does DNS
checks that were added after the original PR).

Co-authored-by: GutSlabs <GutSlabs@users.noreply.github.com>
2026-03-29 20:55:04 -07:00
b60cfd6ce6 fix(telegram): gracefully handle deleted reply targets (#3858)
* fix: add gpt-5.4-mini to Codex fallback catalog

* fix(telegram): gracefully handle deleted reply targets

When a user deletes their message while Hermes is processing, Telegram
returns BadRequest 'Message to be replied not found'. Previously this
was an unhandled permanent error causing silent delivery failure.

Now clears reply_to_id and retries so the response is still delivered,
matching the existing 'thread not found' recovery pattern.

Inspired by PR #3231 by @heathley. Fixes #3229.

---------

Co-authored-by: Clippy <clippy@grads.flow>
Co-authored-by: Nigel Gibbs <heathley@users.noreply.github.com>
2026-03-29 20:47:07 -07:00
981e14001c fix: clear api_mode on provider switch instead of hardcoding chat_completions (#3857)
PR #3726 fixed stale codex_responses persisting when switching providers
by hardcoding api_mode=chat_completions in 5 model flows. This broke
MiniMax, MiniMax-CN, and Alibaba which use /anthropic endpoints that
need anthropic_messages — the hardcoded value overrides the URL-based
auto-detection in runtime_provider.py.

Fix: pop api_mode from config in the 3 URL-dependent flows (custom
endpoint, Kimi, api_key_provider) instead of hardcoding. The runtime
resolver already correctly auto-detects api_mode from the base_url
suffix (/anthropic -> anthropic_messages, else chat_completions).

OpenRouter and Copilot ACP flows keep the explicit value since their
api_mode is always known.

Reported by stefan171.
2026-03-29 20:44:39 -07:00
9d28f4aba3 fix: add gpt-5.4-mini to Codex fallback catalog (#3855)
Co-authored-by: Clippy <clippy@grads.flow>
2026-03-29 20:10:00 -07:00
3e203de125 fix(skills): block category path traversal in skill manager (#3844)
Validate category names in _create_skill() before using them as
filesystem path segments. Previously, categories like '../escape' or
'/tmp/pwned' could write skill files outside ~/.hermes/skills/.

Adds _validate_category() that rejects slashes, backslashes, absolute
paths, and non-alphanumeric characters (reuses existing VALID_NAME_RE).

Tests: 5 new tests for traversal, absolute paths, and valid categories.

Salvaged from PR #1939 by Gutslabs.
2026-03-29 20:08:22 -07:00
2d264a4562 fix(tests): resolve 10 CI failures across hooks, tiktoken, plugins (#3848)
test_hooks.py (7 failures): Built-in boot-md hook was always loaded
by _register_builtin_hooks(), adding +1 to every expected hook count.
Mock out built-in registration in TestDiscoverAndLoad so tests isolate
user-hook discovery logic.

test_tool_token_estimation.py (2 failures): tiktoken is not in
core/[all] dependencies. The estimation function gracefully returns {}
when tiktoken is missing, but tests expected non-empty results. Added
skipif markers for tests that need tiktoken.

test_plugins_cmd.py (1 failure): bare 'hermes plugins' now dispatches
to cmd_toggle() (interactive curses UI) instead of cmd_list(). Updated
test to match the new behavior.
2026-03-29 20:05:59 -07:00
3e2c8c529b fix(whatsapp): resolve LID↔phone aliases in allowlist matching (#3830)
WhatsApp DMs can arrive with LID sender IDs even when
WHATSAPP_ALLOWED_USERS is configured with phone numbers. The allowlist
check now reads bridge session mapping files (lid-mapping-*.json) to
resolve phone↔LID aliases, matching users regardless of which
identifier format the message uses.

Both the Python gateway (_is_user_authorized) and the Node bridge
(allowlist.js) now share the same mapping-file-based resolution logic.

Co-authored-by: Frederico Ribeiro <fr@tecompanytea.com>
2026-03-29 18:21:50 -07:00
e4d575e563 fix: report subagent status as completed when summary exists (#3829)
When a subagent hit max_iterations, status was always 'failed' even
if it produced a usable summary via _handle_max_iterations(). This
happened because the status check required both completed=True AND
a summary, but completed is False whenever max_iterations is reached
(run_agent.py line 7969).

Now gates status on whether a summary was produced — if the subagent
returned a final_response, the parent has usable output regardless of
iteration budget. The exit_reason field already distinguishes
'completed' vs 'max_iterations' for anything that needs to know how
the task ended.

Closes #1899.
2026-03-29 18:21:36 -07:00
2a0e8b001f fix(cli): handle closed stdout ValueError in safe print paths (#3843)
When stdout is closed (piped to a dead process, broken terminal),
Python raises ValueError('I/O operation on closed file'), not OSError.
_safe_print and the API error printer only caught OSError, letting the
ValueError propagate and crash the agent.

Salvaged from PR #3760 by @apexscaleai. Fixes #3534.

Co-authored-by: apexscaleai <apexscaleai@users.noreply.github.com>
2026-03-29 18:21:27 -07:00
ca4907dfbc feat(gateway): add Feishu/Lark platform support (#3817)
Adds Feishu (ByteDance's enterprise messaging platform) as a gateway
platform adapter with full feature parity: WebSocket + webhook transports,
message batching, dedup, rate limiting, rich post/card content parsing,
media handling (images/audio/files/video), group @mention gating,
reaction routing, and interactive card button support.

Cherry-picked from PR #1793 by penwyp with:
- Moved to current main (PR was 458 commits behind)
- Fixed _send_with_retry shadowing BasePlatformAdapter method (renamed to
  _feishu_send_with_retry to avoid signature mismatch crash)
- Fixed import structure: aiohttp/websockets imported independently of
  lark_oapi so they remain available when SDK is missing
- Fixed get_hermes_home import (hermes_constants, not hermes_cli.config)
- Added skip decorators for tests requiring lark_oapi SDK
- All 16 integration points added surgically to current main

New dependency: lark-oapi>=1.5.3,<2 (optional, pip install hermes-agent[feishu])

Fixes #1788

Co-authored-by: penwyp <penwyp@users.noreply.github.com>
2026-03-29 18:17:42 -07:00
e314833c9d feat(display): configurable tool preview length -- show full paths by default (#3841)
Tool call previews (paths, commands, queries) were hardcoded to truncate
at 35-40 chars across CLI spinners, completion lines, and gateway progress
messages. Users could not see full file paths in tool output.

New config option: display.tool_preview_length (default 0 = no limit).
Set a positive number to truncate at that length.

Changes:
- display.py: module-level _tool_preview_max_len with getter/setter;
  build_tool_preview() and get_cute_tool_message() _trunc/_path respect it
- cli.py: reads config at startup, spinner widget respects config
- gateway/run.py: reads config per-message, progress callback respects config
- run_agent.py: removed redundant 30-char quiet-mode spinner truncation
- config.py: added display.tool_preview_length to DEFAULT_CONFIG

Reported by kriskaminski
2026-03-29 18:02:42 -07:00
59f2b228f7 fix(paths): respect HERMES_HOME for protected .env write-deny path (#3840)
The write-deny list in file_operations.py hardcoded ~/.hermes/.env,
which misses the actual .env in custom HERMES_HOME or profile setups.
Use get_hermes_home() for profile-safe path resolution.

Salvaged from PR #3232 by @erhnysr.

Co-authored-by: Erhnysr <erhnysr@users.noreply.github.com>
2026-03-29 18:02:11 -07:00
d6b7836210 fix: update session_log_file during context compression (#3835)
When compression creates a child session with a new session_id,
session_log_file was still pointing to the old session's JSON file.
This caused _save_session_log() to write new data to the wrong file.

Closes #3731.

Co-authored-by: kelsia14 <kelsia14@users.noreply.github.com>
2026-03-29 17:49:58 -07:00
17b6000e90 feat(skills): add songwriting-and-ai-music creative skill (salvage #1901) (#3834)
Adds a songwriting craft and AI music prompt engineering skill covering
song structure, rhyme/meter, emotional arcs, Suno metatag reference,
phonetic tricks for AI singers, parody adaptation, and production workflow.

Complements existing music skills (heartmula, audiocraft, songsee) which
cover model setup/usage — this one covers the creative process itself.

Also removes the empty skills/music-creation/ category (only had a
DESCRIPTION.md, no actual skills).

Co-authored-by: 123mikeyd <123mikeyd@users.noreply.github.com>
2026-03-29 17:49:19 -07:00
45c8d3da96 fix(banner): show lazy-initialized tools in yellow instead of red (salvage #1854) (#3822)
Tools from check_fn-gated toolsets (honcho, homeassistant) showed as
red (disabled) in the startup banner even when properly configured.
This happened because check_fn runs lazily after session context is
set, but the banner renders before agent init.

Now distinguishes three states:
  - red:    truly unavailable (missing env var, no API key)
  - yellow: lazy-initialized (check_fn pending, will activate on use)
  - normal: available and ready

Only the banner fix was salvaged from the original PR; unrelated
bundled changes (context_compressor, STT config, auth default_model,
SessionResetPolicy) were discarded.

Co-authored-by: Jah-yee <Jah-yee@users.noreply.github.com>
2026-03-29 16:53:29 -07:00
5ca6d681f0 feat(skills): add memento-flashcards optional skill (#3827)
* feat(skills): add memento-flashcards skill

* docs(skills): clarify memento-flashcards interaction model

* fix: use HERMES_HOME env var for profile-safe data path

---------

Co-authored-by: Magnus Ahmad <magnus.ahmad@gmail.com>
2026-03-29 16:52:52 -07:00
df806bdbaf feat(cron): add cron.wrap_response config to disable delivery wrapping (#3807)
Adds a config option to suppress the header/footer text that wraps
cron job responses when delivered to messaging platforms.

Set cron.wrap_response: false in config.yaml for clean output without
the 'Cronjob Response: <name>' header and 'The agent cannot see this
message' footer.  Default is true (preserves current behavior).
2026-03-29 16:31:01 -07:00
0ef80c5f32 fix(whatsapp): reuse persistent aiohttp session across requests (#3818)
Replace per-request aiohttp.ClientSession() in every WhatsApp adapter
method with a single persistent self._http_session, matching the pattern
used by Mattermost, HomeAssistant, and SMS adapters.

Changes:
- Create self._http_session in connect(), close in disconnect()
- All bridge HTTP calls (send, edit, send-media, typing, get_chat_info,
  poll_messages) now use the shared session
- Explicitly cancel _poll_task on disconnect() instead of relying
  solely on self._running = False
- Health-check sessions in connect() remain ephemeral (persistent
  session not yet created at that point)
- Remove per-method ImportError guards for aiohttp (always available
  when gateway runs via [messaging] extras)

Salvaged from PR #1851 by Himess. The _poll_task storage was already
on main from PR #3267; this adds the disconnect cancellation and the
persistent session.

Tests: 4 new tests for session close, already-closed skip, poll task
cancellation, and done-task skip.
2026-03-29 16:25:20 -07:00
c4cf20f564 fix: clear __pycache__ during update to prevent stale bytecode ImportError (#3819)
Third report of gateway crashing with:
  ImportError: cannot import name 'get_hermes_home' from 'hermes_constants'

Root cause: stale .pyc bytecode files survive code updates. When Python
loads a cached .pyc that references names from the old source, the import
fails and the gateway won't start.

Two bugs fixed:
1. Git update path: no cache clearing at all after git pull
2. ZIP update path: __pycache__ was explicitly in the preserve set

Added _clear_bytecode_cache() helper that removes all __pycache__ dirs
under PROJECT_ROOT (skipping venv/node_modules/.git/.worktrees). Called
in both git and ZIP update paths, before pip install.
2026-03-29 16:23:36 -07:00
68d5472810 fix: omit tools param entirely when empty instead of sending None (#3820)
Some providers (Fireworks AI) reject tools=null, and others (Anthropic)
reject tools=[]. The safest approach is to not include the key at all
when there are no tools — the OpenAI SDK treats a missing parameter as
NOT_GIVEN and omits it from the request entirely.

Inspired by PR #3736 (@kelsia14).
2026-03-29 16:12:47 -07:00
252fbea005 feat(providers): add ordered fallback provider chain (salvage #1761) (#3813)
Extends the single fallback_model mechanism into an ordered chain.
When the primary model fails, Hermes tries each fallback provider in
sequence until one succeeds or the chain is exhausted.

Config format (new):
  fallback_providers:
    - provider: openrouter
      model: anthropic/claude-sonnet-4
    - provider: openai
      model: gpt-4o

Legacy single-dict fallback_model format still works unchanged.

Key fix vs original PR: the call sites in the retry loop now use
_fallback_index < len(_fallback_chain) instead of the old one-shot
_fallback_activated guard, so the chain actually advances through
all configured providers.

Changes:
- run_agent.py: _fallback_chain list + _fallback_index replaces
  one-shot _fallback_model; _try_activate_fallback() advances
  through chain; failed provider resolution skips to next entry;
  call sites updated to allow chain advancement
- cli.py: reads fallback_providers with legacy fallback_model compat
- gateway/run.py: same
- hermes_cli/config.py: fallback_providers: [] in DEFAULT_CONFIG
- tests: 12 new chain tests + 6 existing test fixtures updated

Co-authored-by: uzaylisak <uzaylisak@users.noreply.github.com>
2026-03-29 16:04:53 -07:00
c774833667 fix(banner): show honcho tools as available when configured (#3810)
The honcho check_fn only checked runtime session state, which isn't
set until the agent initializes. At banner time, honcho tools showed
as red/disabled even when properly configured.

Now checks configuration (enabled + api_key/base_url) as a fallback
when the session context isn't active yet. Fast path (session active)
unchanged; slow path (config check) only runs at banner time.

Adds 4 tests covering: session active, configured but no session,
not configured, and import failure graceful fallback.

Closes #1843.
2026-03-29 15:55:05 -07:00
d5d22fe7ba feat(mcp): dynamic tool discovery via notifications/tools/list_changed (#3812)
When a connected MCP server sends a ToolListChangedNotification (per the
MCP spec), Hermes now automatically re-fetches the tool list, deregisters
removed tools, and registers new ones — without requiring a restart.

This enables MCP servers with dynamic toolsets (e.g. GitHub MCP with
GITHUB_DYNAMIC_TOOLSETS=1) to add/remove tools at runtime.

Changes:
- registry.py: add ToolRegistry.deregister() for nuke-and-repave refresh
- mcp_tool.py: extract _register_server_tools() from
  _discover_and_register_server() as a shared helper for both initial
  discovery and dynamic refresh
- mcp_tool.py: add _make_message_handler() and _refresh_tools() on
  MCPServerTask, wired into all 3 ClientSession sites (stdio, new HTTP,
  deprecated HTTP)
- Graceful degradation: silently falls back to static discovery when the
  MCP SDK lacks notification types or message_handler support
- 8 new tests covering registration, refresh, handler dispatch, and
  deregister

Salvaged from PR #1794 by shivvor2.
2026-03-29 15:52:54 -07:00
bf84cdfa5e fix: ensure tool schema always includes name field in get_definitions (#3811)
When a tool plugin registers a schema without an explicit 'name' key,
get_definitions() crashes with KeyError:

    available_tool_names = {t["function"]["name"] for t in filtered_tools}

Fix: always merge entry.name into schema so 'name' is never missing.

Refs: #3729

Co-authored-by: ekkoitac <ekko.itac@gmail.com>
2026-03-29 15:49:21 -07:00
38d694f559 fix(gateway): apply home channel env overrides consistently (#3808)
Home channel env vars (SLACK_HOME_CHANNEL, SIGNAL_HOME_CHANNEL, etc.)
for Slack, Signal, Mattermost, Matrix, Email, and SMS were nested
inside the credential-env blocks, so they were ignored when the
platform was already configured via config.yaml.

Moved the home channel handling outside the credential blocks with a
Platform.X in config.platforms guard, matching the existing pattern
for Telegram and Discord.

Co-authored-by: cutepawss <cutepawss@users.noreply.github.com>
2026-03-29 15:48:51 -07:00
ed6427e0a7 fix(agent): user-friendly 429 rate limit messages with Retry-After support (#3809)
When hitting rate limits (429), the agent now:
- Extracts the Retry-After header from the provider response and uses it
  as the wait time instead of blind exponential backoff (capped at 120s)
- Shows rate-limit-specific messaging: 'Rate limit reached. Waiting Xs
  before retry (attempt N/M)...'
- Shows a distinct exhaustion message: 'Rate limit persisted after N
  retries. Please try again later.'

Non-429 errors keep the existing exponential backoff and generic messaging.

Co-authored-by: ygd58 <ygd58@users.noreply.github.com>
2026-03-29 15:48:06 -07:00
0fd3b59ba1 feat(cli): add Ctrl+Z process suspend support (#3802)
Adds a Ctrl+Z key binding to suspend the hermes CLI to background
using standard Unix job control. Uses prompt_toolkit's run_in_terminal()
to properly save/restore terminal state, then sends SIGTSTP to the
process group. Prints a branded message with resume instructions.
Shows a not-supported notice on Windows.

Co-authored-by: CharlieKerfoot <CharlieKerfoot@users.noreply.github.com>
2026-03-29 15:47:55 -07:00
6716e66e89 feat: add MCP server mode — hermes mcp serve (#3795)
hermes mcp serve starts a stdio MCP server that lets any MCP client
(Claude Code, Cursor, Codex, etc.) interact with Hermes conversations.

Matches OpenClaw's 9-tool channel bridge surface:

Tools exposed:
- conversations_list: list active sessions across all platforms
- conversation_get: details on one conversation
- messages_read: read message history
- attachments_fetch: extract non-text content from messages
- events_poll: poll for new events since a cursor
- events_wait: long-poll / block until next event (near-real-time)
- messages_send: send to any platform via send_message_tool
- channels_list: browse available messaging targets
- permissions_list_open: list pending approval requests
- permissions_respond: allow/deny approvals

Architecture:
- EventBridge: background thread polls SessionDB for new messages,
  maintains in-memory event queue with waiter support
- Reads sessions.json + SessionDB directly (no gateway dep for reads)
- Reuses send_message_tool for sending (same platform adapters)
- FastMCP server with stdio transport
- Zero new dependencies (uses existing mcp>=1.2.0 optional dep)

Files:
- mcp_serve.py: MCP server + EventBridge (~600 lines)
- hermes_cli/main.py: added serve sub-parser to hermes mcp
- hermes_cli/mcp_config.py: route serve action to run_mcp_server
- tests/test_mcp_serve.py: 53 tests
- docs: updated MCP page + CLI commands reference
2026-03-29 15:47:19 -07:00
d02561af85 feat: add Gemini 3.1 preview models to OpenRouter and Nous catalogs (#3803)
* Add new Gemini 3.1 model entries to models.py

* fix: also add Gemini 3.1 models to nous provider list

---------

Co-authored-by: Andrei Ignat <andrei@ignat.se>
2026-03-29 15:44:07 -07:00
8eb70a6885 fix(email): close SMTP and IMAP connections on failure (#3804)
SMTP connections in _send_email() and _send_email_with_attachment() leak
when login() or send_message() raises before quit() is reached. Both now
wrapped in try/finally with a close() fallback if quit() also fails.

IMAP connection in _fetch_new_messages() leaks when UID processing raises,
since logout() sits after the loop. Restructured with try/finally so
logout() runs unconditionally.

Co-authored-by: Himess <Himess@users.noreply.github.com>
2026-03-29 15:38:32 -07:00
ee3d2941cc feat: show estimated tool token context in hermes tools checklist (#3805)
* feat: show estimated tool token context in hermes tools checklist

Adds a live token estimate indicator to the bottom of the interactive
tool configuration checklist (hermes tools / hermes setup). As users
toggle toolsets on/off, the total estimated context cost updates in
real time.

Implementation:
- tools/registry.py: Add get_schema() for check_fn-free schema access
- hermes_cli/curses_ui.py: Add optional status_fn callback to
  curses_checklist — renders at bottom-right of terminal, stays fixed
  while items scroll
- hermes_cli/tools_config.py: Add _estimate_tool_tokens() using
  tiktoken (cl100k_base, already installed) to count tokens in the
  JSON-serialised OpenAI-format tool schemas. Results are cached
  per-process. The status function deduplicates overlapping tools
  (e.g. browser includes web_search) for accurate totals.
- 12 new tests covering estimation, caching, graceful degradation
  when tiktoken is unavailable, status_fn wiring, deduplication,
  and the numbered fallback display

* fix: use effective toolsets (includes plugins) for token estimation index mapping

The status_fn closure built ts_keys from CONFIGURABLE_TOOLSETS but the
checklist uses _get_effective_configurable_toolsets() which appends plugin
toolsets. With plugins present, the indices would mismatch, causing
IndexError when selecting a plugin toolset.
2026-03-29 15:36:56 -07:00
475205e30b fix: restore terminalbench2_env.py from patch-tool redaction corruption (#3801)
Commit ed27b826 introduced patch-tool redaction corruption that:
- Replaced max_token_length=16000 with max_token_length=***
- Truncated api_key=os.getenv(...) to api_key=os.get...EY
- Truncated tokenizer_name to NousRe...1-8B
- Deleted 409 lines including _run_tests(), _eval_with_timeout(),
  evaluate(), wandb_log(), and the __main__ entry point

Restores the file from pre-corruption state (ed27b826^) and re-applies
the two legitimate changes from subsequent commits:
- eval_concurrency config field (from ed27b826)
- docker_image registration in register_task_env_overrides (from ed27b826)
- ManagedServer branching for vLLM/SGLang backends (from 13f54596)

Closes #1737, #1740.
2026-03-29 15:33:52 -07:00
612321631f fix(gateway): use atomic writes for config.yaml to prevent data loss (#3800)
Replace all 5 plain open(config_path, 'w') calls in gateway command
handlers with atomic_yaml_write() from utils.py. This uses the
established tempfile + fsync + os.replace pattern to ensure config.yaml
is never left half-written if the process is killed mid-write.

Affected handlers: /personality (clear + set), /sethome, /reasoning
(_save_config_key helper), /verbose (tool_progress cycling).

Also fixes missing encoding='utf-8' on the /personality clear write.

Salvaged from PR #1211 by albatrosjj.
2026-03-29 15:32:46 -07:00
83cbf7b5bb fix(gateway): use atomic writes for config.yaml to prevent data loss (#3800)
Replace all 5 plain open(config_path, 'w') calls in gateway command
handlers with atomic_yaml_write() from utils.py. This uses the
established tempfile + fsync + os.replace pattern to ensure config.yaml
is never left half-written if the process is killed mid-write.

Affected handlers: /personality (clear + set), /sethome, /reasoning
(_save_config_key helper), /verbose (tool_progress cycling).

Also fixes missing encoding='utf-8' on the /personality clear write.

Salvaged from PR #1211 by albatrosjj.
2026-03-29 15:31:21 -07:00
563101e2a9 feat: add Canvas LMS skill for fetching courses and assignments (#3799)
Adds a Canvas LMS integration skill under optional-skills/productivity/canvas/
with a Python CLI wrapper (canvas_api.py) for listing courses and assignments
via personal access token auth.

Cherry-picked from PR #1250 by Alicorn-Max-S with:
- Moved from skills/ to optional-skills/ (niche educational integration)
- Fixed hardcoded ~/.hermes/ path to use $HERMES_HOME
- Removed Canvas env vars from .env.example (optional skill)
- Cleaned stale 'mini-swe-agent backend' reference from .env.example header

Co-authored-by: Alicorn-Max-S <Alicorn-Max-S@users.noreply.github.com>
2026-03-29 15:28:32 -07:00
fe6a916284 feat(skills): add one-three-one-rule communication skill (#3797)
Adds a structured 1-3-1 decision-making framework as an optional skill.
Produces: one problem statement, three options with trade-offs, one
recommendation with definition of done and implementation plan.

Moved to optional-skills/ (niche communication framework, not broadly
needed by default). Improved description with clearer trigger conditions
and replaced implementation-specific example with a generic one.

Based on PR #1262 by Willardgmoore.

Co-authored-by: Willard Moore <willardgmoore@users.noreply.github.com>
2026-03-29 15:25:12 -07:00
57481c8ac5 fix(tools): implement send_message routing for Matrix, Mattermost, HomeAssistant, DingTalk (#3796)
* fix(tools): implement send_message routing for Matrix, Mattermost, HomeAssistant, DingTalk

Matrix, Mattermost, HomeAssistant, and DingTalk were present in
platform_map but fell through to the "not yet implemented" else branch,
causing send_message tool calls to silently fail on these platforms.

Add four async sender functions:
- _send_mattermost: POST /api/v4/posts via Mattermost REST API
- _send_matrix: PUT /_matrix/client/v3/rooms/.../send via Matrix CS API
- _send_homeassistant: POST /api/services/notify/notify via HA REST API
- _send_dingtalk: POST to session webhook URL

Add routing in _send_to_platform() and 17 unit tests covering success,
HTTP errors, missing config, env var fallback, and Matrix txn_id uniqueness.

* fix: pass platform tokens explicitly to Mattermost/Matrix/HA senders

The original PR passed pconfig.extra to sender functions, but tokens
live at pconfig.token (not in extra). This caused the senders to always
fall through to env var lookup instead of using the gateway-resolved
token.

Changes:
- Mattermost/Matrix/HA: accept token as first arg, matching the
  Telegram/Discord/Slack sender pattern
- DingTalk: add DINGTALK_WEBHOOK_URL env var fallback + docstring
  explaining the session-webhook vs robot-webhook difference
- Tests updated for new signatures + new DingTalk env var test

---------

Co-authored-by: sprmn24 <oncuevtv@gmail.com>
2026-03-29 15:17:46 -07:00
c62cadb73a fix: make display_hermes_home imports lazy to prevent ImportError during hermes update (#3776)
When a user runs 'hermes update', the Python process caches old modules
in sys.modules.  After git pull updates files on disk, lazy imports of
newly-updated modules fail because they try to import display_hermes_home
from the cached (old) hermes_constants which doesn't have the function.

This specifically broke the gateway auto-restart in cmd_update — importing
hermes_cli/gateway.py triggered the top-level 'from hermes_constants
import display_hermes_home' against the cached old module.  The ImportError
was silently caught, so the gateway was never restarted after update.

Users with a running gateway then hit the ImportError on their next
Telegram/Discord message when the stale gateway process lazily loaded
run_agent.py (new version) which also had the top-level import.

Fixes:
- hermes_cli/gateway.py: lazy import at call site (line 940)
- run_agent.py: lazy import at call site (line 6927)
- tools/terminal_tool.py: lazy imports at 3 call sites
- tools/tts_tool.py: static schema string (no module-level call)
- hermes_cli/auth.py: lazy import at call site (line 2024)
- hermes_cli/main.py: reload hermes_constants after git pull in cmd_update

Also fixes 4 pre-existing test failures in test_parse_env_var caused by
NameError on display_hermes_home in terminal_tool.py.
2026-03-29 15:15:17 -07:00
442888a05b fix: store token lock identity at acquire time for Slack and Discord
Community review (devoruncommented) correctly identified that the Slack
adapter re-read SLACK_APP_TOKEN from os.getenv() during disconnect,
which could differ from the value used during connect if the environment
changed. Discord had the same pattern with self.config.token (less risky
but still not bulletproof).

Both now follow the Telegram pattern: store the token identity on self
at acquire time, use the stored value for release, clear after release.

Also fixes docs: alias naming was hermes-<name> in docs but actual
implementation creates <name> directly (e.g. ~/.local/bin/coder not
~/.local/bin/hermes-coder).
2026-03-29 11:09:17 -07:00
b151d5f7a7 docs: fix profile alias naming and improve quick start
The docs incorrectly showed aliases as 'hermes-work' when the actual
implementation creates 'work' (profile name directly, no prefix).

Rewrote the user guide to lead with the alias pattern:
  hermes profile create coder → coder chat, coder setup, etc.

Also clarified that the banner shows 'Profile: coder' and the prompt
shows 'coder ❯' when a non-default profile is active.

Fixed alias paths in command reference (hermes-work → work).
2026-03-29 10:51:51 -07:00
f6db1b27ba feat: add profiles — run multiple isolated Hermes instances (#3681)
Each profile is a fully independent HERMES_HOME with its own config,
API keys, memory, sessions, skills, gateway, cron, and state.db.

Core module: hermes_cli/profiles.py (~900 lines)
  - Profile CRUD: create, delete, list, show, rename
  - Three clone levels: blank, --clone (config), --clone-all (everything)
  - Export/import: tar.gz archive for backup and migration
  - Wrapper alias scripts (~/.local/bin/<name>)
  - Collision detection for alias names
  - Sticky default via ~/.hermes/active_profile
  - Skill seeding via subprocess (handles module-level caching)
  - Auto-stop gateway on delete with disable-before-stop for services
  - Tab completion generation for bash and zsh

CLI integration (hermes_cli/main.py):
  - _apply_profile_override(): pre-import -p/--profile flag + sticky default
  - Full 'hermes profile' subcommand: list, use, create, delete, show,
    alias, rename, export, import
  - 'hermes completion bash/zsh' command
  - Multi-profile skill sync in hermes update

Display (cli.py, banner.py, gateway/run.py):
  - CLI prompt: 'coder ❯' when using a non-default profile
  - Banner shows profile name
  - Gateway startup log includes profile name

Gateway safety:
  - Token locks: Discord, Slack, WhatsApp, Signal (extends Telegram pattern)
  - Port conflict detection: API server, webhook adapter

Diagnostics (hermes_cli/doctor.py):
  - Profile health section: lists profiles, checks config, .env, aliases
  - Orphan alias detection: warns when wrapper points to deleted profile

Tests (tests/hermes_cli/test_profiles.py):
  - 71 automated tests covering: validation, CRUD, clone levels, rename,
    export/import, active profile, isolation, alias collision, completion
  - Full suite: 6760 passed, 0 new failures

Documentation:
  - website/docs/user-guide/profiles.md: full user guide (12 sections)
  - website/docs/reference/profile-commands.md: command reference (12 commands)
  - website/docs/reference/faq.md: 6 profile FAQ entries
  - website/sidebars.ts: navigation updated
2026-03-29 10:41:20 -07:00
0df4d1278e feat(plugins): add enable/disable commands + interactive toggle UI (#3747)
Adds plugin management with three interfaces:

  hermes plugins          # interactive curses checklist (like hermes tools)
  hermes plugins enable   # non-interactive enable
  hermes plugins disable  # non-interactive disable
  hermes plugins list     # table with status column

Disabled plugins are stored in config.yaml under plugins.disabled and
skipped during discovery. Uses the same curses_checklist component as
hermes tools for the interactive UI.

Changes:
- hermes_cli/plugins.py: _get_disabled_plugins() + skip disabled during
  discover_and_load()
- hermes_cli/plugins_cmd.py: cmd_toggle() interactive UI, cmd_enable(),
  cmd_disable(), updated cmd_list() with status column
- hermes_cli/main.py: enable/disable subparser entries
- website/docs/reference/cli-commands.md: updated plugins section
- website/docs/user-guide/features/plugins.md: updated managing section
2026-03-29 10:39:57 -07:00
95f99ea4b9 feat: built-in boot-md hook — run BOOT.md on gateway startup (#3733)
The gateway now ships with a built-in boot-md hook that checks for
~/.hermes/BOOT.md on every startup. If the file exists, the agent
executes its instructions in a background thread. No installation
or configuration needed — just create the file.

No BOOT.md = zero overhead (the hook silently returns).

Implementation:
- gateway/builtin_hooks/boot_md.py: handler with boot prompt,
  background thread, [SILENT] suppression, error handling
- gateway/hooks.py: _register_builtin_hooks() called at the start
  of discover_and_load() to wire in built-in hooks
- Docs updated: hooks page documents BOOT.md as a built-in feature
2026-03-29 10:19:54 -07:00
811adca277 feat(skills): add SiYuan Note and Scrapling as optional skills (#3742)
Add two new optional skills:

- siyuan (optional-skills/productivity/): SiYuan Note knowledge base
  API skill — search, read, create, and manage blocks/documents in a
  self-hosted SiYuan instance via curl. Requires SIYUAN_TOKEN.

- scrapling (optional-skills/research/): Intelligent web scraping skill
  using the Scrapling library — anti-bot fetching, Cloudflare bypass,
  CSS/XPath selectors, spider framework for multi-page crawling.

Placed in optional-skills/ (not bundled) since both are niche tools
that require external dependencies.

Co-authored-by: FEUAZUR <FEUAZUR@users.noreply.github.com>
2026-03-29 09:34:56 -07:00
aafe37012a docs: update skills catalog — add red-teaming and optional skills (#3745)
* fix(discord): clean up deferred "thinking..." after slash commands complete

After a slash command is deferred (interaction.response.defer), the
"thinking..." indicator persisted indefinitely because the code used
followup.send() which creates a separate message instead of replacing
or removing the deferred response.

Fix: use edit_original_response() to replace "thinking..." with the
confirmation text when provided, or delete_original_response() to
remove it when there is no confirmation. Also consolidated /reasoning
and /voice handlers to use _run_simple_slash instead of duplicating
the defer+dispatch pattern.

Fixes #3595.

* docs: update skills catalog — add red-teaming category and all 16 optional skills

The skills catalog was missing:
- red-teaming category with the godmode jailbreaking skill
- The entire optional skills section (16 skills across 10 categories)

Added both with descriptions sourced from each SKILL.md frontmatter.
Verified against the actual skills/ and optional-skills/ directories.
2026-03-29 09:34:35 -07:00
909de72426 fix: set api_mode when switching providers via hermes model (#3726)
When switching providers via 'hermes model', the previous provider's
api_mode persisted in config.yaml. Switching from Copilot
(codex_responses) to a chat_completions provider like Z.AI would send
requests to the wrong endpoint (404).

Set api_mode = chat_completions in the 4 provider flows that were
missing it: OpenRouter, custom endpoint, Kimi, and api_key_provider.

Co-authored-by: Nour Eddine Hamaidi <HenkDz@users.noreply.github.com>
2026-03-29 08:07:11 -07:00
ba1b600bce fix(tests): align skill/setup and platform mocks with current behavior (#3721)
- Skill invocation: no secret capture callback so SSH remote setup note is emitted
- Patch agent.skill_utils.sys for platform checks (skill_matches_platform)
- Skip CLAUDE.md priority test on Darwin (case-insensitive FS)

Made-with: Cursor

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-29 07:51:43 -07:00
fcd1645223 feat(skills): support external skill directories via config (#3678)
Add skills.external_dirs config option — a list of additional directories
to scan for skills alongside ~/.hermes/skills/. External dirs are read-only:
skill creation/editing always writes to the local dir. Local skills take
precedence when names collide.

This lets users share skills across tools/agents without copying them into
Hermes's own directory (e.g. ~/.agents/skills, /shared/team-skills).

Changes:
- agent/skill_utils.py: add get_external_skills_dirs() and get_all_skills_dirs()
- agent/prompt_builder.py: scan external dirs in build_skills_system_prompt()
- tools/skills_tool.py: _find_all_skills() and skill_view() search external dirs;
  security check recognizes configured external dirs as trusted
- agent/skill_commands.py: /skill slash commands discover external skills
- hermes_cli/config.py: add skills.external_dirs to DEFAULT_CONFIG
- cli-config.yaml.example: document the option
- tests/agent/test_external_skills.py: 11 tests covering discovery, precedence,
  deduplication, and skill_view for external skills

Requested by community member primco.
2026-03-29 00:33:30 -07:00
253a9adc72 docs(skills): clarify DuckDuckGo runtime requirements (#3680)
Co-authored-by: kshitij <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-29 00:17:57 -07:00
300964178f docs: document credential file passthrough and env var forwarding for remote backends (#3677)
Three docs pages updated:

- security.md: New 'Credential File Passthrough' section, updated
  sandbox filter table to include Docker/Modal rows, added info box
  about Docker env_passthrough merge
- creating-skills.md: New 'Credential File Requirements' section
  with frontmatter examples and guidance on when to use env vars
  vs credential files
- environment-variables.md: Updated TERMINAL_DOCKER_FORWARD_ENV
  description to note auto-passthrough from skills
2026-03-29 00:16:34 -07:00
7a3682ac3f feat: mount skill credential files + fix env passthrough for remote backends (#3671)
Two related fixes for remote terminal backends (Modal/Docker):

1. NEW: Credential file mounting system
   Skills declare required_credential_files in frontmatter. Files are
   mounted into Docker (read-only bind mounts) and Modal (mounts at
   creation + sync via exec on each command for mid-session changes).
   Google Workspace skill updated with the new field.

2. FIX: Docker backend now includes env_passthrough vars
   Skills that declare required_environment_variables (e.g. Notion with
   NOTION_API_KEY) register vars in the env_passthrough system. The
   local backend checked this, but Docker's forward_env was a separate
   disconnected list. Now Docker exec merges both sources, so
   skill-declared env vars are forwarded into containers automatically.

   This fixes the reported issue where NOTION_API_KEY in ~/.hermes/.env
   wasn't reaching the Docker container despite being registered via
   the Notion skill's prerequisites.

Closes #3665
2026-03-28 23:53:40 -07:00
9f01244137 fix: replace user-facing hardcoded ~/.hermes paths with display_hermes_home()
Prep for profiles: user-facing messages now use display_hermes_home() so
diagnostic output shows the correct path for each profile.

New helper: display_hermes_home() in hermes_constants.py
12 files swept, ~30 user-facing string replacements.
Includes dynamic TTS schema description.
2026-03-28 23:47:21 -07:00
0a80dd9c7a fix(discord): clean up deferred "thinking..." after slash commands complete (#3674)
After a slash command is deferred (interaction.response.defer), the
"thinking..." indicator persisted indefinitely because the code used
followup.send() which creates a separate message instead of replacing
or removing the deferred response.

Fix: use edit_original_response() to replace "thinking..." with the
confirmation text when provided, or delete_original_response() to
remove it when there is no confirmation. Also consolidated /reasoning
and /voice handlers to use _run_simple_slash instead of duplicating
the defer+dispatch pattern.

Fixes #3595.
2026-03-28 23:46:43 -07:00
4764e06fde fix(acp): complete session management surface for editor clients (salvage #3501) (#3675)
* fix acp adapter session methods

* test: stub local command in transcription provider cases

---------

Co-authored-by: David Zhang <david.d.zhang@gmail.com>
2026-03-28 23:45:53 -07:00
4c532c153b fix: URL-encode Signal phone numbers and correct attachment RPC parameter (#3670)
Fixes two Signal bugs:

1. SSE connection: URL-encode phone numbers so + isn't interpreted as space (400 Bad Request)
2. Attachment fetch: use 'id' parameter instead of 'attachmentId' (NullPointerException in signal-cli)

Also refactors Signal tests with shared helpers.
2026-03-28 23:45:28 -07:00
a99c0478d0 fix(skills): move parallel-cli to optional-skills (#3673)
parallel-cli is a paid third-party vendor skill that requires
PARALLEL_API_KEY, but it was shipped in the default skills/ directory
with no env-var gate. This caused it to appear in every user's system
prompt even when they have no Parallel account or API key.

Move it to optional-skills/ so it is only visible through the Skills
Hub and must be explicitly installed. Also remove it from the default
skills catalog docs.
2026-03-28 23:45:05 -07:00
c6e3084baf fix(gateway): replace print() with logger calls in BasePlatformAdapter (#3669)
Salvage of PR #3616 (memosr). Replaces 6 print() calls with proper logger calls in BasePlatformAdapter + removes redundant traceback.print_exc().

Co-Authored-By: memosr <memosr@users.noreply.github.com>
2026-03-28 22:25:35 -07:00
dcbdfdbb2b feat(docker): add Docker container for the agent (salvage #1841) (#3668)
Adds a complete Docker packaging for Hermes Agent:
- Dockerfile based on debian:13.4 with all deps
- Entrypoint that bootstraps .env, config.yaml, SOUL.md on first run
- CI workflow to build, test, and push to DockerHub
- Documentation for interactive, gateway, and upgrade workflows

Closes #850, #913.

Changes vs original PR:
- Removed pre-created legacy cache/platform dirs from entrypoint
  (image_cache, audio_cache, pairing, whatsapp/session) — these are
  now created on demand by the application using the consolidated
  layout from get_hermes_dir()
- Moved docs from docs/docker.md to website/docs/user-guide/docker.md
  and added to Docusaurus sidebar

Co-authored-by: benbarclay <benbarclay@users.noreply.github.com>
2026-03-28 22:21:48 -07:00
91b881f931 feat(mattermost): configurable mention behavior — respond without @mention (#3664)
Adds MATTERMOST_REQUIRE_MENTION and MATTERMOST_FREE_RESPONSE_CHANNELS
env vars, matching Discord's existing mention gating pattern.

- MATTERMOST_REQUIRE_MENTION=false: respond to all channel messages
- MATTERMOST_FREE_RESPONSE_CHANNELS=id1,id2: specific channels where
  bot responds without @mention even when require_mention is true
- DMs always respond regardless of mention settings
- @mention is now stripped from message text (clean agent input)

7 new tests for mention gating, free-response channels, DM bypass,
and mention stripping. Updated existing test for mention stripping.

Docs: updated mattermost.md with Mention Behavior section,
environment-variables.md with new vars, config.py with metadata.
2026-03-28 22:17:43 -07:00
3e1157080a fix(tools): use non-deprecated streamable_http_client for MCP HTTP transport (#3646)
Switch MCP HTTP transport from the deprecated streamablehttp_client()
(mcp < 1.24.0) to the new streamable_http_client() API that accepts a
pre-built httpx.AsyncClient.

Changes vs the original PR #3391:
- Separate try/except imports so mcp < 1.24.0 doesn't break (graceful
  fallback to deprecated API instead of losing HTTP MCP entirely)
- Wrap httpx.AsyncClient in async-with for proper lifecycle management
  (the new SDK API explicitly skips closing caller-provided clients)
- Match SDK's own create_mcp_http_client defaults: follow_redirects=True,
  Timeout(connect_timeout, read=300.0)
- Keep deprecated code path as fallback for older SDK versions

Co-authored-by: HenkDz <HenkDz@users.noreply.github.com>
2026-03-28 18:20:49 -07:00
1a032ccf79 fix(skills): stop marking persisted env vars missing on remote backends (#3650)
Salvage of PR #3452 (kentimsit). Fixes skill readiness checks on remote backends — persisted env vars are no longer incorrectly marked as missing.

Co-Authored-By: kentimsit <kentimsit@users.noreply.github.com>
2026-03-28 17:52:32 -07:00
0bd7e95dfc fix(honcho): allow self-hosted local instances without API key (#3644)
Self-hosted Honcho on localhost doesn't require authentication, but
both the activation gates and the SDK client required an API key.

Combined fix from three contributor PRs:
- Relax all 8 activation gates to accept (api_key OR base_url) as
  valid credentials (#3482 by @cameronbergh)
- Use 'local' placeholder for the SDK client when base_url points to
  localhost/127.0.0.1/::1 (#3570 by @ygd58)

Files changed: run_agent.py (2 gates), cli.py (1 gate),
gateway/run.py (1 gate), honcho_integration/cli.py (2 gates),
hermes_cli/doctor.py (2 gates), honcho_integration/client.py (SDK).

Co-authored-by: cameronbergh <cameronbergh@users.noreply.github.com>
Co-authored-by: ygd58 <ygd58@users.noreply.github.com>
Co-authored-by: devorun <devorun@users.noreply.github.com>
2026-03-28 17:49:56 -07:00
d35567c6e0 feat(web): add Exa as a web search and extract backend (#3648)
Adds Exa (https://exa.ai) as a fourth web backend alongside Parallel,
Firecrawl, and Tavily. Follows the exact same integration pattern:

- Backend selection: config web.backend=exa or auto-detect from EXA_API_KEY
- Search: _exa_search() with highlights for result descriptions
- Extract: _exa_extract() with full text content extraction
- Lazy singleton client with x-exa-integration header
- Wired into web_search_tool and web_extract_tool dispatchers
- check_web_api_key() and requires_env updated
- CLI: hermes setup summary, hermes tools config, hermes config show
- config.py: EXA_API_KEY in OPTIONAL_ENV_VARS with metadata
- pyproject.toml: exa-py>=2.9.0,<3 in dependencies


Salvaged from PR #1850.

Co-authored-by: louiswalsh <louiswalsh@users.noreply.github.com>
2026-03-28 17:35:53 -07:00
bea49e02a3 fix: route /bg spinner through TUI widget to prevent status bar collision (#3643)
Background agent's KawaiiSpinner wrote \r-based animation and stop()
messages through StdoutProxy, colliding with prompt_toolkit's status bar.

Two fixes:
- display.py: use isinstance(out, StdoutProxy) instead of fragile
  hasattr+name check for detecting prompt_toolkit's stdout wrapper
- cli.py: silence bg agent's raw spinner (_print_fn=no-op) and route
  thinking updates through the TUI widget only when no foreground
  agent is active; clear spinner text in finally block with same guard

Closes #2718

Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-03-28 17:29:37 -07:00
c6e2e486bf fix: add download retry to cache_audio_from_url matching cache_image_from_url (#3401)
PR #3323 added retry with exponential backoff to cache_image_from_url
but missed the sibling function cache_audio_from_url 18 lines below in
the same file. A single transient 429/5xx/timeout loses voice messages
while image downloads now survive them.

Apply the same retry pattern: 3 attempts with 1.5s exponential backoff,
immediate raise on non-retryable 4xx.
2026-03-28 17:28:38 -07:00
973deb4f76 fix(browser): guard LLM response content against None in snapshot and vision (#3642)
Salvage of PR #3532 (binhnt92). Guards browser_tool.py against None content from reasoning-only models (DeepSeek-R1, QwQ). Follow-up to #3449.

Co-Authored-By: binhnt92 <binhnt92@users.noreply.github.com>
2026-03-28 17:25:04 -07:00
dc74998718 fix(sessions): support stdout (-) in session and snapshot export (salvage #3617) (#3641)
* fix(sessions): support stdout when output path is '-' in session export

* fix: style cleanup + extend stdout support to snapshot export

Follow-up for salvaged PR #3617:
- Fix import sys; on one line (style consistency)
- Update help text to mention - for stdout
- Apply same stdout support to hermes skills snapshot export

---------

Co-authored-by: ygd58 <buraysandro9@gmail.com>
2026-03-28 17:24:32 -07:00
17617e4399 feat(discord): DISCORD_IGNORE_NO_MENTION — skip messages that @mention others but not the bot (#3640)
Salvage of PR #3310 (luojiesi). When DISCORD_IGNORE_NO_MENTION=true (default), messages that @mention other users but not the bot are silently skipped in server channels. DMs excluded — mentions there are just references.

Co-Authored-By: luojiesi <luojiesi@users.noreply.github.com>
2026-03-28 17:19:41 -07:00
ffdfeb91d8 fix(nix): unify directory and file permissions across all three layers (#3619)
Activation script, tmpfiles, and container entrypoint now agree on
0750 for all directories. Tighten config.yaml and workspace documents
from 0644 to 0640 (group-readable, no world access). Add explicit
chmod for .managed marker and container $TARGET_HOME to eliminate
umask dependence. Secrets (auth.json, .env) remain 0600.
2026-03-29 05:29:24 +05:30
857a5d7b47 fix: sanitize surrogate characters from clipboard paste to prevent UnicodeEncodeError (#3624)
Pasting text from rich-text editors (Google Docs, Word, etc.) can inject
lone surrogate characters (U+D800..U+DFFF) that are invalid UTF-8.
The OpenAI SDK serializes messages with ensure_ascii=False, then encodes
to UTF-8 for the HTTP body — surrogates crash this with:
  UnicodeEncodeError: 'utf-8' codec can't encode character '\udce2'

Three-layer fix:
1. Primary: sanitize user_message at the top of run_conversation()
2. CLI: sanitize in chat() before appending to conversation_history
3. Safety net: catch UnicodeEncodeError in the API error handler,
   sanitize the entire messages list in-place, and retry once.
   Also exclude UnicodeEncodeError from is_local_validation_error
   so it doesn't get classified as non-retryable.

Includes 14 new tests covering the sanitization helpers and the
integration with run_conversation().
2026-03-28 16:53:14 -07:00
b029742092 fix(cli): strengthen paste collapse fallback for terminals without bracketed paste (#3625)
The _on_text_changed fallback only detected pastes when all characters
arrived in a single event (chars_added > 1).  Some terminals (notably
VSCode integrated terminal in certain configs) may deliver paste data
differently, causing the fallback to miss.

Add a second heuristic: if the newline count jumps by 4+ in a single
text-change event, treat it as a paste.  Alt+Enter only adds 1 newline
per event, so this never false-positives on manual multi-line input.

Also fixes: the fallback path was missing _paste_just_collapsed flag
set before replacing buffer text, which could cause a re-trigger loop.
2026-03-28 15:40:49 -07:00
02fb7c4aaf docs: comprehensive docs audit — fix 12 stale/missing items across 10 pages (#3618)
Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from #3551/#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from #3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  #3572/#3573/#3576/#3580/#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (#3583).
- skills.md: List default GitHub taps including garrytan/gstack (#3605).
2026-03-28 15:26:35 -07:00
1e924e99b9 refactor: consolidate ~/.hermes directory layout with backward compat (#3610)
New installs get a cleaner structure:
  cache/images/      (was image_cache/)
  cache/audio/       (was audio_cache/)
  cache/documents/   (was document_cache/)
  cache/screenshots/ (was browser_screenshots/)
  platforms/whatsapp/session/ (was whatsapp/session/)
  platforms/matrix/store/    (was matrix/store/)
  platforms/pairing/         (was pairing/)

Existing installs are unaffected -- get_hermes_dir() checks for the
old path first and uses it if present. No migration needed.

Adds get_hermes_dir(new_subpath, old_name) helper to hermes_constants.py
for reuse by any future subsystem.
2026-03-28 15:22:19 -07:00
614e43d3d9 feat(skills): add garrytan/gstack as default Skills Hub tap (#3605)
Add the gstack community skills repo to the default tap list and fix
skill_identifier construction for repos with an empty path prefix.

Co-authored-by: Tugrul Guner <tugrulguner@users.noreply.github.com>
2026-03-28 14:55:49 -07:00
e4480ff426 fix(config): accept 'model' key as alias for 'default' in model config (#3603)
Users intuitively write model: { model: my-model } instead of
model: { default: my-model } and it silently falls back to the
hardcoded default. Now both spellings work across all three config
consumers: runtime_provider, CLI, and gateway.

Co-authored-by: ygd58 <ygd58@users.noreply.github.com>
2026-03-28 14:55:27 -07:00
9a364f2805 fix: cap percentage displays at 100% in stats, gateway, and memory tool (#3599)
Salvage of PR #3533 (binhnt92). Follow-up to #3480 — applies min(100, ...) to 5 remaining unclamped percentage display sites in context_compressor, cli /stats, gateway /stats, and memory tool. Defensive clamps now that the root cause (estimation heuristic) was already removed in #3480.

Co-Authored-By: binhnt92 <binhnt92@users.noreply.github.com>
2026-03-28 14:55:18 -07:00
1b2d4f21f3 feat(cli): show resume-by-title command in exit summary (#3607)
When exiting a session that has a title (auto-generated or manual),
the exit summary now also shows:
  hermes -c "Session Title"
alongside the existing hermes --resume <id> command.

Also adds the title to the session info block.
2026-03-28 14:54:53 -07:00
9009169eeb fix: recover updater when venv pip is missing (#3608)
Some environments lose pip inside the venv. Before invoking pip install,
check pip --version and bootstrap with ensurepip if missing. Applied to
both update code paths (_update_via_zip and cmd_update).


Salvaged from PR #3359.

Co-authored-by: Git-on-my-level <Git-on-my-level@users.noreply.github.com>
2026-03-28 14:54:49 -07:00
0f042f3930 fix(email): filter automated/noreply senders to prevent reply loops (salvage #3461) (#3606)
* fix(gateway): filter automated/noreply senders in email adapter

Fixes #3453

Adds noreply/automated sender filtering to the email adapter. Drops emails from noreply, mailer-daemon, postmaster addresses and bulk mail headers (Auto-Submitted, Precedence, List-Unsubscribe) before dispatching. Prevents pairing codes and AI responses being sent to automated senders.

* fix: remove redundant seen_uids add + trailing whitespace cleanup

---------

Co-authored-by: devorun <130918800+devorun@users.noreply.github.com>
2026-03-28 14:50:50 -07:00
7a9e45e560 fix: regenerate uv.lock to match v0.5.0 in pyproject.toml (#3594)
The lockfile was still pinned to hermes-agent 0.4.0 after the v0.5.0
release, causing downstream consumers (e.g. the Nix package built via
uv2nix) to report the wrong version.  Also drops stale transitive deps
(bashlex, boto3, swe-rex) that were carried over from the removed
swe-rex integration.
2026-03-29 03:19:47 +05:30
a641f20cac fix(gateway): self-heal missing launchd plist on start (#3601)
When the plist is deleted (manual cleanup, failed upgrade),
hermes gateway start now regenerates it automatically instead of
failing. Also simplifies the returncode==3 error path since the
plist is guaranteed to exist at that point.

Co-authored-by: Bartok9 <Bartok9@users.noreply.github.com>
2026-03-28 14:48:55 -07:00
ee066b7be6 fix: use placeholder api_key for custom providers without credentials (#3604)
Local/custom OpenAI-compatible providers (Ollama, LM Studio, vLLM) that
don't require auth were hitting empty api_key rejections from the OpenAI
SDK, especially when used as smart model routing targets.

Uses the same 'no-key-required' placeholder already used in
_resolve_openrouter_runtime() for the identical scenario.


Salvaged from PR #3543.

Co-authored-by: scottlowry <scottlowry@users.noreply.github.com>
2026-03-28 14:47:41 -07:00
a6bc13ce13 fix(github-auth): check ~/.hermes/.env before ~/.git-credentials for token extraction (#3466)
* fix(github-auth): check ~/.hermes/.env before ~/.git-credentials for token extraction

Users who configured their token via `hermes setup` have it stored in
~/.hermes/.env (GITHUB_TOKEN=...), not in ~/.git-credentials. On macOS
with osxkeychain as the default git credential helper, ~/.git-credentials
may not exist at all, causing silent 401 failures in all GitHub skills.

Add ~/.hermes/.env as the first fallback in the auth detection block and
the inline "Extracting the Token from Git Credentials" example.

Priority order: env var → ~/.hermes/.env → ~/.git-credentials → none

Part of fix for NousResearch/hermes-agent#3464

* fix(github-auth): check ~/.hermes/.env before ~/.git-credentials

Fixes #3464

* fix(github-auth): check ~/.hermes/.env before ~/.git-credentials

Fixes #3464

* fix(github-auth): check ~/.hermes/.env before ~/.git-credentials

Fixes #3464

* fix(github-auth): check ~/.hermes/.env before ~/.git-credentials

Fixes #3464

* fix(github-auth): check ~/.hermes/.env before ~/.git-credentials

Fixes #3464

* fix(github-auth): check ~/.hermes/.env before ~/.git-credentials

Fixes #3464
2026-03-28 14:46:49 -07:00
f803f66339 fix(terminal): avoid merging heredoc EOF with fence wrapper (#3598)
One-shot local execution built `printf FENCE; <cmd>; __hermes_rc=...`, so a
command ending in a heredoc produced a closing line like `EOF; __hermes_rc=...`,
which is not a valid delimiter. Bash then treated the rest of the wrapper as
heredoc body, leaking it into tool output (e.g. gh issue/PR flows).

Use newline-separated wrapper lines so the delimiter stays alone and the
trailer runs after the heredoc completes.

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-28 14:43:41 -07:00
839d9d7471 feat(agent): configurable timeouts for auxiliary LLM calls via config.yaml (#3597)
Add per-task timeout settings under auxiliary.{task}.timeout in config.yaml
instead of hardcoded values. Users with slow local models (Ollama, llama.cpp)
can now increase timeouts for compression, vision, session search, etc.

Defaults:
  - auxiliary.compression.timeout: 120s (was hardcoded 45s)
  - auxiliary.vision.timeout: 30s (unchanged)
  - all other aux tasks: 30s (was hardcoded 30s)
  - title_generator: 30s (was hardcoded 15s)

call_llm/async_call_llm now auto-resolve timeout from config when not
explicitly passed. Callers can still override with an explicit timeout arg.

Based on PR #3406 by alanfwilliams. Converted from env vars to config.yaml
per project conventions.

Co-authored-by: alanfwilliams <alanfwilliams@users.noreply.github.com>
2026-03-28 14:35:28 -07:00
404a0b823e fix: add self-termination guard for pkill/killall targeting hermes/gateway (#3593)
Prevent the agent from accidentally killing its own process with
pkill -f gateway, killall hermes, etc. Adds a dangerous command
pattern that triggers the approval flow.

Co-authored-by: arasovic <arasovic@users.noreply.github.com>
2026-03-28 14:33:48 -07:00
dabe3c34cc feat(webhook): hermes webhook CLI + skill for event-driven subscriptions (#3578)
Adds 'hermes webhook' CLI subcommand and a skill — zero new model tools.

CLI commands (require webhook platform to be enabled):
  hermes webhook subscribe <name> [--events, --prompt, --deliver, ...]
  hermes webhook list
  hermes webhook remove <name>
  hermes webhook test <name>

All commands gate on webhook platform being enabled in config. If not
configured, prints setup instructions (gateway setup wizard, manual
config.yaml, or env vars).

The agent uses these via terminal tool, guided by the webhook-subscriptions
skill which documents setup, common patterns (GitHub, Stripe, CI/CD,
monitoring), prompt template syntax, security, and troubleshooting.

Adapter enhancement: webhook.py hot-reloads dynamic subscriptions from
~/.hermes/webhook_subscriptions.json on each incoming request (mtime-gated).
Static config.yaml routes always take precedence.

Docs: updated webhooks.md with Dynamic Subscriptions section, added
hermes webhook to cli-commands.md reference.

No new model tools. No toolset changes.

24 new tests for CLI CRUD, persistence, enabled-gate, and adapter
dynamic route loading.
2026-03-28 14:33:35 -07:00
82d6c28bd5 fix(skills): cache-aware /skills install and uninstall in TUI (#3586)
Two fixes for /skills install and /skills uninstall slash commands:

1. input() hangs indefinitely inside prompt_toolkit's TUI event loop,
   soft-locking the CLI. The user typing the slash command is already
   implicit consent, so confirmation is now always skipped.

2. Cache invalidation was unconditional — installing or uninstalling a
   skill mid-session silently broke the prompt cache, increasing costs.
   The slash handler now defers cache invalidation by default (skill
   takes effect next session). Pass --now to invalidate immediately,
   with a message explaining the cost tradeoff. The CLI argparse path
   (hermes skills install) is unaffected and still invalidates.

Fixes #3474
Salvaged from PR #3496 by dlkakbs.
2026-03-28 14:32:23 -07:00
dc7d504aca Remove incorrect docker alternative for signal-cli (#3545)
Removed docker alternative for signal-cli-rest-api from the documentation. It does not support the raw signal-cli http daemon. See https://github.com/bbernhard/signal-cli-rest-api/issues/720
2026-03-28 14:28:57 -07:00
9e411f7d70 fix(update): skip config migration prompts in non-interactive sessions (#3584)
hermes update hangs on input() when run from cron, scripts, or piped
contexts. Check both stdin and stdout isatty(), catch EOFError as a
fallback, and print guidance to run 'hermes config migrate' later.

Co-authored-by: phippsbot-byte <phippsbot-byte@users.noreply.github.com>
2026-03-28 14:26:32 -07:00
708f187549 fix(gateway): exit with failure when all platforms fail with retryable errors (#3592)
When all messaging platforms exhaust retries and get queued for background
reconnection, exit with code 1 so systemd Restart=on-failure can restart
the process. Previously the gateway stayed alive as a zombie with no
connected platforms and exit code 0.

Salvaged from PR #3567 by kelsia14. Test updates added.

Co-authored-by: kelsia14 <kelsia14@users.noreply.github.com>
2026-03-28 14:25:12 -07:00
d7c41f3cef fix(telegram): honor proxy env vars in fallback transport (salvage #3411) (#3591)
* fix: keep gateway running through telegram proxy failures

- continue gateway startup in degraded mode when Telegram cannot connect yet
- ensure Telegram fallback transport also honors proxy env vars
- support reconnect retries without taking down the whole gateway

* test(telegram): cover proxy env handling in fallback transport

---------

Co-authored-by: kufufu9 <pi@local>
2026-03-28 14:23:27 -07:00
6893c3befc fix(gateway): inject PATH + VIRTUAL_ENV into launchd plist for macOS service (#3585)
Salvage of PR #2173 (hanai) and PR #3432 (timknip).

Injects PATH, VIRTUAL_ENV, and HERMES_HOME into the macOS launchd plist so gateway subprocesses find user-installed tools (node, ffmpeg, etc.). Matches systemd unit parity with venv/bin, node_modules/.bin, and resolved node dir in PATH. Includes 7 new tests and docs updates across 4 pages.

Co-Authored-By: Han <ihanai1991@gmail.com>
Co-Authored-By: timknip <timknip@users.noreply.github.com>
2026-03-28 14:23:26 -07:00
5cdc24c2e2 docs(slack): add missing Messages Tab setup step (#3590)
Without enabling the Messages Tab in App Home settings, users see
"Sending messages to this app has been turned off" when trying to DM
the bot — even with all correct scopes and event subscriptions.

Add Step 5 (Enable the Messages Tab) between Event Subscriptions and
Install App, with a danger admonition. Also add troubleshooting entry
for this specific error message. Renumber subsequent steps (6→7→8→9).

Co-authored-by: Alberto Leal <mail4alberto@gmail.com>
2026-03-28 14:23:19 -07:00
2dd286c162 fix: write models.dev disk cache atomically (#3588)
Use atomic_json_write() from utils.py instead of plain open()/json.dump()
for the models.dev disk cache. Prevents corrupted cache if the process is
killed mid-write — _load_disk_cache() silently returns {} on corrupt JSON,
losing all model metadata until the next successful API fetch.

Co-authored-by: memosr <memosr@users.noreply.github.com>
2026-03-28 14:20:30 -07:00
924857c3e3 fix: prevent tool name/arg concatenation for Ollama-compatible endpoints (#3582)
Ollama reuses index 0 for every tool call in a parallel batch,
distinguishing them only by id.  The streaming accumulator now
detects a new non-empty id at an already-active index and redirects
it to a fresh slot, preventing names and arguments from being
concatenated into a single tool call.

No-op for normal providers that use incrementing indices.

Co-authored-by: dmater01 <dmater01@users.noreply.github.com>
2026-03-28 14:08:26 -07:00
ba3bbf5b53 fix: add missing mattermost/matrix/dingtalk toolsets + platform consistency tests (salvage #3512) (#3583)
* Fixing mattermost configuration parsing bugs

* fix: add homeassistant to skills_config + platform consistency tests

Follow-up for cherry-picked #3512:
- Add homeassistant to skills_config.py PLATFORMS (was in tools_config
  but missing from skills_config)
- Add 3 consistency tests that verify all platforms in tools_config have
  matching toolset definitions, gateway includes, and skills_config entries
  — prevents this class of bug from recurring

---------

Co-authored-by: DaneelV3 <dannel@v3rtical.tech>
2026-03-28 14:05:02 -07:00
d6b4fa2e9f fix: strip @botname from commands so /new@TigerNanoBot resolves correctly (#3581)
Commands sent directly to the bot in groups include @botname suffix
(e.g. /compress@TigerNanoBot). get_command() now strips the @anything
part before lookup, matching how Telegram bot menu generates commands.
Fixes all slash commands silently doing nothing when sent with @mention.

Co-authored-by: MacroAnarchy <MacroAnarchy@users.noreply.github.com>
2026-03-28 14:01:01 -07:00
df1bf0a209 feat(api-server): add basic security headers (#3576)
Add X-Content-Type-Options: nosniff and Referrer-Policy: no-referrer
to all API server responses via a new security_headers_middleware.

Co-authored-by: Oktay Aydin <aydnOktay@users.noreply.github.com>
2026-03-28 14:00:52 -07:00
49a49983e4 feat(api-server): add Access-Control-Max-Age to CORS preflight responses (#3580)
Adds Access-Control-Max-Age: 600 to CORS preflight responses, telling
browsers to cache the preflight for 10 minutes. Reduces redundant OPTIONS
requests and improves perceived latency for browser-based API clients.

Salvaged from PR #3514 by aydnOktay.

Co-authored-by: aydnOktay <xaydinoktay@gmail.com>
2026-03-28 14:00:03 -07:00
e97c0cb578 fix: replace hardcoded ~/.hermes paths with get_hermes_home() for profile support
* feat: GPT tool-use steering + strip budget warnings from history

Two changes to improve tool reliability, especially for OpenAI GPT models:

1. GPT tool-use enforcement prompt: Adds GPT_TOOL_USE_GUIDANCE to the
   system prompt when the model name contains 'gpt' and tools are loaded.
   This addresses a known behavioral pattern where GPT models describe
   intended actions ('I will run the tests') instead of actually making
   tool calls. Inspired by similar steering in OpenCode (beast.txt) and
   Cline (GPT-5.1 variant).

2. Budget warning history stripping: Budget pressure warnings injected by
   _get_budget_warning() into tool results are now stripped when
   conversation history is replayed via run_conversation(). Previously,
   these turn-scoped signals persisted across turns, causing models to
   avoid tool calls in all subsequent messages after any turn that hit
   the 70-90% iteration threshold.

* fix: replace hardcoded ~/.hermes paths with get_hermes_home() for profile support

Prep for the upcoming profiles feature — each profile is a separate
HERMES_HOME directory, so all paths must respect the env var.

Fixes:
- gateway/platforms/matrix.py: Matrix E2EE store was hardcoded to
  ~/.hermes/matrix/store, ignoring HERMES_HOME. Now uses
  get_hermes_home() so each profile gets its own Matrix state.

- gateway/platforms/telegram.py: Two locations reading config.yaml via
  Path.home()/.hermes instead of get_hermes_home(). DM topic thread_id
  persistence and hot-reload would read the wrong config in a profile.

- tools/file_tools.py: Security path for hub index blocking was
  hardcoded to ~/.hermes, would miss the actual profile's hub cache.

- hermes_cli/gateway.py: Service naming now uses the profile name
  (hermes-gateway-coder) instead of a cryptic hash suffix. Extracted
  _profile_suffix() helper shared by systemd and launchd.

- hermes_cli/gateway.py: Launchd plist path and Label now scoped per
  profile (ai.hermes.gateway-coder.plist). Previously all profiles
  would collide on the same plist file on macOS.

- hermes_cli/gateway.py: Launchd plist now includes HERMES_HOME in
  EnvironmentVariables — was missing entirely, making custom
  HERMES_HOME broken on macOS launchd (pre-existing bug).

- All launchctl commands in gateway.py, main.py, status.py updated
  to use get_launchd_label() instead of hardcoded string.

Test fixes: DM topic tests now set HERMES_HOME env var alongside
Path.home() mock. Launchd test uses get_launchd_label() for expected
commands.
2026-03-28 13:51:08 -07:00
c0aa06f300 fix(test): update streaming test to match PR #3566 behavior change (#3574)
PR #3566 intentionally routes suppressed content to stream_delta_callback
when tool calls are present, so reasoning tag extraction can fire during
streaming. The test was still asserting the old behavior where content
after tool calls was fully suppressed from the callback.

Updated the assertion to match: content IS delivered to the callback
(for tag extraction), with display-level suppression handled by the
CLI's _stream_delta.
2026-03-28 13:41:23 -07:00
3273732891 fix(api-server): add CORS headers to streaming SSE responses (#3573)
StreamResponse headers are flushed on prepare() before the CORS
middleware can inject them. Resolve CORS headers up front using
_cors_headers_for_origin() so the full set (including
Access-Control-Allow-Origin) is present on SSE streams.

Co-authored-by: ygd58 <ygd58@users.noreply.github.com>
2026-03-28 13:38:30 -07:00
09ebf8b252 feat(api-server): add /v1/health alias for OpenAI compatibility (#3572)
Add GET /v1/health as an alias to the existing /health endpoint so
OpenAI-compatible health checks work out of the box.

Co-authored-by: Oktay Aydin <aydnOktay@users.noreply.github.com>
2026-03-28 13:32:39 -07:00
33c89e52ec fix(whatsapp): add **kwargs to media sending methods to accept metadata (#3571)
The base orchestrator passes metadata=_thread_metadata to
send_image_file, send_video, and send_document. WhatsApp was the
only platform adapter missing the parameter, causing TypeError
crashes when sending media.

Extended to all three methods (original PR only fixed send_image_file).


Salvaged from PR #3144.

Co-authored-by: afifai <afifai@users.noreply.github.com>
2026-03-28 13:28:04 -07:00
e95965d76a Merge branch 'main' into rewbs/tool-use-charge-to-subscription 2026-03-26 16:18:28 -07:00
95dc9aaa75 feat: add managed tool gateway and Nous subscription support
- add managed modal and gateway-backed tool integrations\n- improve CLI setup, auth, and configuration for subscriber flows\n- expand tests and docs for managed tool support
2026-03-26 16:17:58 -07:00
574 changed files with 73140 additions and 9341 deletions

15
.dockerignore Normal file
View File

@ -0,0 +1,15 @@
# Git
.git
.gitignore
.gitmodules
# Dependencies
node_modules
# CI/CD
.github
# Environment files
.env
*.md

View File

@ -7,18 +7,19 @@
# OpenRouter provides access to many models through one API
# All LLM calls go through OpenRouter - no direct provider keys needed
# Get your key at: https://openrouter.ai/keys
OPENROUTER_API_KEY=
# OPENROUTER_API_KEY=
# Default model to use (OpenRouter format: provider/model)
# Examples: anthropic/claude-opus-4.6, openai/gpt-4o, google/gemini-3-flash-preview, zhipuai/glm-4-plus
LLM_MODEL=anthropic/claude-opus-4.6
# Default model is configured in ~/.hermes/config.yaml (model.default).
# Use 'hermes model' or 'hermes setup' to change it.
# LLM_MODEL is no longer read from .env — this line is kept for reference only.
# LLM_MODEL=anthropic/claude-opus-4.6
# =============================================================================
# LLM PROVIDER (z.ai / GLM)
# =============================================================================
# z.ai provides access to ZhipuAI GLM models (GLM-4-Plus, etc.)
# Get your key at: https://z.ai or https://open.bigmodel.cn
GLM_API_KEY=
# GLM_API_KEY=
# GLM_BASE_URL=https://api.z.ai/api/paas/v4 # Override default base URL
# =============================================================================
@ -28,7 +29,7 @@ GLM_API_KEY=
# Get your key at: https://platform.kimi.ai (Kimi Code console)
# Keys prefixed sk-kimi- use the Kimi Code API (api.kimi.com) by default.
# Legacy keys from platform.moonshot.ai need KIMI_BASE_URL override below.
KIMI_API_KEY=
# KIMI_API_KEY=
# KIMI_BASE_URL=https://api.kimi.com/coding/v1 # Default for sk-kimi- keys
# KIMI_BASE_URL=https://api.moonshot.ai/v1 # For legacy Moonshot keys
# KIMI_BASE_URL=https://api.moonshot.cn/v1 # For Moonshot China keys
@ -38,11 +39,11 @@ KIMI_API_KEY=
# =============================================================================
# MiniMax provides access to MiniMax models (global endpoint)
# Get your key at: https://www.minimax.io
MINIMAX_API_KEY=
# MINIMAX_API_KEY=
# MINIMAX_BASE_URL=https://api.minimax.io/v1 # Override default base URL
# MiniMax China endpoint (for users in mainland China)
MINIMAX_CN_API_KEY=
# MINIMAX_CN_API_KEY=
# MINIMAX_CN_BASE_URL=https://api.minimaxi.com/v1 # Override default base URL
# =============================================================================
@ -50,7 +51,7 @@ MINIMAX_CN_API_KEY=
# =============================================================================
# OpenCode Zen provides curated, tested models (GPT, Claude, Gemini, MiniMax, GLM, Kimi)
# Pay-as-you-go pricing. Get your key at: https://opencode.ai/auth
OPENCODE_ZEN_API_KEY=
# OPENCODE_ZEN_API_KEY=
# OPENCODE_ZEN_BASE_URL=https://opencode.ai/zen/v1 # Override default base URL
# =============================================================================
@ -58,7 +59,7 @@ OPENCODE_ZEN_API_KEY=
# =============================================================================
# OpenCode Go provides access to open models (GLM-5, Kimi K2.5, MiniMax M2.5)
# $10/month subscription. Get your key at: https://opencode.ai/auth
OPENCODE_GO_API_KEY=
# OPENCODE_GO_API_KEY=
# =============================================================================
# LLM PROVIDER (Hugging Face Inference Providers)
@ -67,34 +68,38 @@ OPENCODE_GO_API_KEY=
# Free tier included ($0.10/month), no markup on provider rates.
# Get your token at: https://huggingface.co/settings/tokens
# Required permission: "Make calls to Inference Providers"
HF_TOKEN=
# HF_TOKEN=
# OPENCODE_GO_BASE_URL=https://opencode.ai/zen/go/v1 # Override default base URL
# =============================================================================
# TOOL API KEYS
# =============================================================================
# Exa API Key - AI-native web search and contents
# Get at: https://exa.ai
# EXA_API_KEY=
# Parallel API Key - AI-native web search and extract
# Get at: https://parallel.ai
PARALLEL_API_KEY=
# PARALLEL_API_KEY=
# Firecrawl API Key - Web search, extract, and crawl
# Get at: https://firecrawl.dev/
FIRECRAWL_API_KEY=
# FIRECRAWL_API_KEY=
# FAL.ai API Key - Image generation
# Get at: https://fal.ai/
FAL_KEY=
# FAL_KEY=
# Honcho - Cross-session AI-native user modeling (optional)
# Builds a persistent understanding of the user across sessions and tools.
# Get at: https://app.honcho.dev
# Also requires ~/.honcho/config.json with enabled=true (see README).
HONCHO_API_KEY=
# HONCHO_API_KEY=
# =============================================================================
# TERMINAL TOOL CONFIGURATION (mini-swe-agent backend)
# TERMINAL TOOL CONFIGURATION
# =============================================================================
# Backend type: "local", "singularity", "docker", "modal", or "ssh"
# Terminal backend is configured in ~/.hermes/config.yaml (terminal.backend).
@ -177,10 +182,10 @@ TERMINAL_LIFETIME_SECONDS=300
# Browserbase API Key - Cloud browser execution
# Get at: https://browserbase.com/
BROWSERBASE_API_KEY=
# BROWSERBASE_API_KEY=
# Browserbase Project ID - From your Browserbase dashboard
BROWSERBASE_PROJECT_ID=
# BROWSERBASE_PROJECT_ID=
# Enable residential proxies for better CAPTCHA solving (default: true)
# Routes traffic through residential IPs, significantly improves success rate
@ -212,7 +217,7 @@ BROWSER_INACTIVITY_TIMEOUT=120
# Uses OpenAI's API directly (not via OpenRouter).
# Named VOICE_TOOLS_OPENAI_KEY to avoid interference with OpenRouter.
# Get at: https://platform.openai.com/api-keys
VOICE_TOOLS_OPENAI_KEY=
# VOICE_TOOLS_OPENAI_KEY=
# =============================================================================
# SLACK INTEGRATION
@ -227,6 +232,21 @@ VOICE_TOOLS_OPENAI_KEY=
# Slack allowed users (comma-separated Slack user IDs)
# SLACK_ALLOWED_USERS=
# =============================================================================
# TELEGRAM INTEGRATION
# =============================================================================
# Telegram Bot Token - From @BotFather (https://t.me/BotFather)
# TELEGRAM_BOT_TOKEN=
# TELEGRAM_ALLOWED_USERS= # Comma-separated user IDs
# TELEGRAM_HOME_CHANNEL= # Default chat for cron delivery
# TELEGRAM_HOME_CHANNEL_NAME= # Display name for home channel
# Webhook mode (optional — for cloud deployments like Fly.io/Railway)
# Default is long polling. Setting TELEGRAM_WEBHOOK_URL switches to webhook mode.
# TELEGRAM_WEBHOOK_URL=https://my-app.fly.dev/telegram
# TELEGRAM_WEBHOOK_PORT=8443
# TELEGRAM_WEBHOOK_SECRET= # Recommended for production
# WhatsApp (built-in Baileys bridge — run `hermes whatsapp` to pair)
# WHATSAPP_ENABLED=false
# WHATSAPP_ALLOWED_USERS=15551234567
@ -283,11 +303,11 @@ IMAGE_TOOLS_DEBUG=false
# Tinker API Key - RL training service
# Get at: https://tinker-console.thinkingmachines.ai/keys
TINKER_API_KEY=
# TINKER_API_KEY=
# Weights & Biases API Key - Experiment tracking and metrics
# Get at: https://wandb.ai/authorize
WANDB_API_KEY=
# WANDB_API_KEY=
# RL API Server URL (default: http://localhost:8080)
# Change if running the rl-server on a different host/port

View File

@ -6,6 +6,8 @@ on:
paths:
- 'website/**'
- 'landingpage/**'
- 'skills/**'
- 'optional-skills/**'
- '.github/workflows/deploy-site.yml'
workflow_dispatch:
@ -19,6 +21,8 @@ concurrency:
jobs:
build-and-deploy:
# Only run on the upstream repository, not on forks
if: github.repository == 'NousResearch/hermes-agent'
runs-on: ubuntu-latest
environment:
name: github-pages
@ -32,6 +36,16 @@ jobs:
cache: npm
cache-dependency-path: website/package-lock.json
- uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Install PyYAML for skill extraction
run: pip install pyyaml
- name: Extract skill metadata for dashboard
run: python3 website/scripts/extract-skills.py
- name: Install dependencies
run: npm ci
working-directory: website

79
.github/workflows/docker-publish.yml vendored Normal file
View File

@ -0,0 +1,79 @@
name: Docker Build and Publish
on:
push:
branches: [main]
pull_request:
branches: [main]
release:
types: [published]
concurrency:
group: docker-${{ github.ref }}
cancel-in-progress: true
jobs:
build-and-push:
# Only run on the upstream repository, not on forks
if: github.repository == 'NousResearch/hermes-agent'
runs-on: ubuntu-latest
timeout-minutes: 30
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
submodules: recursive
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Build image
uses: docker/build-push-action@v6
with:
context: .
file: Dockerfile
load: true
tags: nousresearch/hermes-agent:test
cache-from: type=gha
cache-to: type=gha,mode=max
- name: Test image starts
run: |
docker run --rm \
-v /tmp/hermes-test:/opt/data \
--entrypoint /opt/hermes/docker/entrypoint.sh \
nousresearch/hermes-agent:test --help
- name: Log in to Docker Hub
if: github.event_name == 'push' && github.ref == 'refs/heads/main' || github.event_name == 'release'
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Push image (main branch)
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
uses: docker/build-push-action@v6
with:
context: .
file: Dockerfile
push: true
tags: |
nousresearch/hermes-agent:latest
nousresearch/hermes-agent:${{ github.sha }}
cache-from: type=gha
cache-to: type=gha,mode=max
- name: Push image (release)
if: github.event_name == 'release'
uses: docker/build-push-action@v6
with:
context: .
file: Dockerfile
push: true
tags: |
nousresearch/hermes-agent:latest
nousresearch/hermes-agent:${{ github.event.release.tag_name }}
nousresearch/hermes-agent:${{ github.sha }}
cache-from: type=gha
cache-to: type=gha,mode=max

View File

@ -27,8 +27,11 @@ jobs:
with:
python-version: '3.11'
- name: Install ascii-guard
run: python -m pip install ascii-guard
- name: Install Python dependencies
run: python -m pip install ascii-guard pyyaml
- name: Extract skill metadata for dashboard
run: python3 website/scripts/extract-skills.py
- name: Lint docs diagrams
run: npm run lint:diagrams

View File

@ -34,9 +34,37 @@ jobs:
- name: Run tests
run: |
source .venv/bin/activate
python -m pytest tests/ -q --ignore=tests/integration --tb=short -n auto
python -m pytest tests/ -q --ignore=tests/integration --ignore=tests/e2e --tb=short -n auto
env:
# Ensure tests don't accidentally call real APIs
OPENROUTER_API_KEY: ""
OPENAI_API_KEY: ""
NOUS_API_KEY: ""
e2e:
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Install uv
uses: astral-sh/setup-uv@v5
- name: Set up Python 3.11
run: uv python install 3.11
- name: Install dependencies
run: |
uv venv .venv --python 3.11
source .venv/bin/activate
uv pip install -e ".[all,dev]"
- name: Run e2e tests
run: |
source .venv/bin/activate
python -m pytest tests/e2e/ -v --tb=short
env:
OPENROUTER_API_KEY: ""
OPENAI_API_KEY: ""
NOUS_API_KEY: ""

View File

@ -210,6 +210,10 @@ registry.register(
The registry handles schema collection, dispatch, availability checking, and error wrapping. All handlers MUST return a JSON string.
**Path references in tool schemas**: If the schema description mentions file paths (e.g. default output directories), use `display_hermes_home()` to make them profile-aware. The schema is generated at import time, which is after `_apply_profile_override()` sets `HERMES_HOME`.
**State files**: If a tool stores persistent state (caches, logs, checkpoints), use `get_hermes_home()` for the base directory — never `Path.home() / ".hermes"`. This ensures each profile gets its own state.
**Agent-level tools** (todo, memory): intercepted by `run_agent.py` before `handle_function_call()`. See `todo_tool.py` for the pattern.
---
@ -358,8 +362,69 @@ in config.yaml (or `HERMES_BACKGROUND_NOTIFICATIONS` env var):
---
## Profiles: Multi-Instance Support
Hermes supports **profiles** — multiple fully isolated instances, each with its own
`HERMES_HOME` directory (config, API keys, memory, sessions, skills, gateway, etc.).
The core mechanism: `_apply_profile_override()` in `hermes_cli/main.py` sets
`HERMES_HOME` before any module imports. All 119+ references to `get_hermes_home()`
automatically scope to the active profile.
### Rules for profile-safe code
1. **Use `get_hermes_home()` for all HERMES_HOME paths.** Import from `hermes_constants`.
NEVER hardcode `~/.hermes` or `Path.home() / ".hermes"` in code that reads/writes state.
```python
# GOOD
from hermes_constants import get_hermes_home
config_path = get_hermes_home() / "config.yaml"
# BAD — breaks profiles
config_path = Path.home() / ".hermes" / "config.yaml"
```
2. **Use `display_hermes_home()` for user-facing messages.** Import from `hermes_constants`.
This returns `~/.hermes` for default or `~/.hermes/profiles/<name>` for profiles.
```python
# GOOD
from hermes_constants import display_hermes_home
print(f"Config saved to {display_hermes_home()}/config.yaml")
# BAD — shows wrong path for profiles
print("Config saved to ~/.hermes/config.yaml")
```
3. **Module-level constants are fine** — they cache `get_hermes_home()` at import time,
which is AFTER `_apply_profile_override()` sets the env var. Just use `get_hermes_home()`,
not `Path.home() / ".hermes"`.
4. **Tests that mock `Path.home()` must also set `HERMES_HOME`** — since code now uses
`get_hermes_home()` (reads env var), not `Path.home() / ".hermes"`:
```python
with patch.object(Path, "home", return_value=tmp_path), \
patch.dict(os.environ, {"HERMES_HOME": str(tmp_path / ".hermes")}):
...
```
5. **Gateway platform adapters should use token locks** — if the adapter connects with
a unique credential (bot token, API key), call `acquire_scoped_lock()` from
`gateway.status` in the `connect()`/`start()` method and `release_scoped_lock()` in
`disconnect()`/`stop()`. This prevents two profiles from using the same credential.
See `gateway/platforms/telegram.py` for the canonical pattern.
6. **Profile operations are HOME-anchored, not HERMES_HOME-anchored** — `_get_profiles_root()`
returns `Path.home() / ".hermes" / "profiles"`, NOT `get_hermes_home() / "profiles"`.
This is intentional — it lets `hermes -p coder profile list` see all profiles regardless
of which one is active.
## Known Pitfalls
### DO NOT hardcode `~/.hermes` paths
Use `get_hermes_home()` from `hermes_constants` for code paths. Use `display_hermes_home()`
for user-facing print/log messages. Hardcoding `~/.hermes` breaks profiles — each profile
has its own `HERMES_HOME` directory. This was the source of 5 bugs fixed in PR #3575.
### DO NOT use `simple_term_menu` for interactive menus
Rendering bugs in tmux/iTerm2 — ghosting on scroll. Use `curses` (stdlib) instead. See `hermes_cli/tools_config.py` for the pattern.
@ -375,6 +440,19 @@ Tool schema descriptions must not mention tools from other toolsets by name (e.g
### Tests must not write to `~/.hermes/`
The `_isolate_hermes_home` autouse fixture in `tests/conftest.py` redirects `HERMES_HOME` to a temp dir. Never hardcode `~/.hermes/` paths in tests.
**Profile tests**: When testing profile features, also mock `Path.home()` so that
`_get_profiles_root()` and `_get_default_hermes_home()` resolve within the temp dir.
Use the pattern from `tests/hermes_cli/test_profiles.py`:
```python
@pytest.fixture
def profile_env(tmp_path, monkeypatch):
home = tmp_path / ".hermes"
home.mkdir()
monkeypatch.setattr(Path, "home", lambda: tmp_path)
monkeypatch.setenv("HERMES_HOME", str(home))
return home
```
---
## Testing

25
Dockerfile Normal file
View File

@ -0,0 +1,25 @@
FROM debian:13.4
# Install system dependencies in one layer, clear APT cache
RUN apt-get update && \
apt-get install -y --no-install-recommends \
build-essential nodejs npm python3 python3-pip ripgrep ffmpeg gcc python3-dev libffi-dev && \
rm -rf /var/lib/apt/lists/*
COPY . /opt/hermes
WORKDIR /opt/hermes
# Install Python and Node dependencies in one layer, no cache
RUN pip install --no-cache-dir -e ".[all]" --break-system-packages && \
npm install --prefer-offline --no-audit && \
npx playwright install --with-deps chromium --only-shell && \
cd /opt/hermes/scripts/whatsapp-bridge && \
npm install --prefer-offline --no-audit && \
npm cache clean --force
WORKDIR /opt/hermes
RUN chmod +x /opt/hermes/docker/entrypoint.sh
ENV HERMES_HOME=/opt/data
VOLUME [ "/opt/data" ]
ENTRYPOINT [ "/opt/hermes/docker/entrypoint.sh" ]

4
MANIFEST.in Normal file
View File

@ -0,0 +1,4 @@
graft skills
graft optional-skills
global-exclude __pycache__
global-exclude *.py[cod]

249
RELEASE_v0.6.0.md Normal file
View File

@ -0,0 +1,249 @@
# Hermes Agent v0.6.0 (v2026.3.30)
**Release Date:** March 30, 2026
> The multi-instance release — Profiles for running isolated agent instances, MCP server mode, Docker container, fallback provider chains, two new messaging platforms (Feishu/Lark and WeCom), Telegram webhook mode, Slack multi-workspace OAuth, 95 PRs and 16 resolved issues in 2 days.
---
## ✨ Highlights
- **Profiles — Multi-Instance Hermes** — Run multiple isolated Hermes instances from the same installation. Each profile gets its own config, memory, sessions, skills, and gateway service. Create with `hermes profile create`, switch with `hermes -p <name>`, export/import for sharing. Full token-lock isolation prevents two profiles from using the same bot credential. ([#3681](https://github.com/NousResearch/hermes-agent/pull/3681))
- **MCP Server Mode** — Expose Hermes conversations and sessions to any MCP-compatible client (Claude Desktop, Cursor, VS Code, etc.) via `hermes mcp serve`. Browse conversations, read messages, search across sessions, and manage attachments — all through the Model Context Protocol. Supports both stdio and Streamable HTTP transports. ([#3795](https://github.com/NousResearch/hermes-agent/pull/3795))
- **Docker Container** — Official Dockerfile for running Hermes Agent in a container. Supports both CLI and gateway modes with volume-mounted config. ([#3668](https://github.com/NousResearch/hermes-agent/pull/3668), closes [#850](https://github.com/NousResearch/hermes-agent/issues/850))
- **Ordered Fallback Provider Chain** — Configure multiple inference providers with automatic failover. When your primary provider returns errors or is unreachable, Hermes automatically tries the next provider in the chain. Configure via `fallback_providers` in config.yaml. ([#3813](https://github.com/NousResearch/hermes-agent/pull/3813), closes [#1734](https://github.com/NousResearch/hermes-agent/issues/1734))
- **Feishu/Lark Platform Support** — Full gateway adapter for Feishu (飞书) and Lark with event subscriptions, message cards, group chat, image/file attachments, and interactive card callbacks. ([#3799](https://github.com/NousResearch/hermes-agent/pull/3799), [#3817](https://github.com/NousResearch/hermes-agent/pull/3817), closes [#1788](https://github.com/NousResearch/hermes-agent/issues/1788))
- **WeCom (Enterprise WeChat) Platform Support** — New gateway adapter for WeCom (企业微信) with text/image/voice messages, group chats, and callback verification. ([#3847](https://github.com/NousResearch/hermes-agent/pull/3847))
- **Slack Multi-Workspace OAuth** — Connect a single Hermes gateway to multiple Slack workspaces via OAuth token file. Each workspace gets its own bot token, resolved dynamically per incoming event. ([#3903](https://github.com/NousResearch/hermes-agent/pull/3903))
- **Telegram Webhook Mode & Group Controls** — Run the Telegram adapter in webhook mode as an alternative to polling — faster response times and better for production deployments behind a reverse proxy. New group mention gating controls when the bot responds: always, only when @mentioned, or via regex triggers. ([#3880](https://github.com/NousResearch/hermes-agent/pull/3880), [#3870](https://github.com/NousResearch/hermes-agent/pull/3870))
- **Exa Search Backend** — Add Exa as an alternative web search and content extraction backend alongside Firecrawl and DuckDuckGo. Set `EXA_API_KEY` and configure as preferred backend. ([#3648](https://github.com/NousResearch/hermes-agent/pull/3648))
- **Skills & Credentials on Remote Backends** — Mount skill directories and credential files into Modal and Docker containers, so remote terminal sessions have access to the same skills and secrets as local execution. ([#3890](https://github.com/NousResearch/hermes-agent/pull/3890), [#3671](https://github.com/NousResearch/hermes-agent/pull/3671), closes [#3665](https://github.com/NousResearch/hermes-agent/issues/3665), [#3433](https://github.com/NousResearch/hermes-agent/issues/3433))
---
## 🏗️ Core Agent & Architecture
### Provider & Model Support
- **Ordered fallback provider chain** — automatic failover across multiple configured providers ([#3813](https://github.com/NousResearch/hermes-agent/pull/3813))
- **Fix api_mode on provider switch** — switching providers via `hermes model` now correctly clears stale `api_mode` instead of hardcoding `chat_completions`, fixing 404s for providers with Anthropic-compatible endpoints ([#3726](https://github.com/NousResearch/hermes-agent/pull/3726), [#3857](https://github.com/NousResearch/hermes-agent/pull/3857), closes [#3685](https://github.com/NousResearch/hermes-agent/issues/3685))
- **Stop silent OpenRouter fallback** — when no provider is configured, Hermes now raises a clear error instead of silently routing to OpenRouter ([#3807](https://github.com/NousResearch/hermes-agent/pull/3807), [#3862](https://github.com/NousResearch/hermes-agent/pull/3862))
- **Gemini 3.1 preview models** — added to OpenRouter and Nous Portal catalogs ([#3803](https://github.com/NousResearch/hermes-agent/pull/3803), closes [#3753](https://github.com/NousResearch/hermes-agent/issues/3753))
- **Gemini direct API context length** — full context length resolution for direct Google AI endpoints ([#3876](https://github.com/NousResearch/hermes-agent/pull/3876))
- **gpt-5.4-mini** added to Codex fallback catalog ([#3855](https://github.com/NousResearch/hermes-agent/pull/3855))
- **Curated model lists preferred** over live API probe when the probe returns fewer models ([#3856](https://github.com/NousResearch/hermes-agent/pull/3856), [#3867](https://github.com/NousResearch/hermes-agent/pull/3867))
- **User-friendly 429 rate limit messages** with Retry-After countdown ([#3809](https://github.com/NousResearch/hermes-agent/pull/3809))
- **Auxiliary client placeholder key** for local servers without auth requirements ([#3842](https://github.com/NousResearch/hermes-agent/pull/3842))
- **INFO-level logging** for auxiliary provider resolution ([#3866](https://github.com/NousResearch/hermes-agent/pull/3866))
### Agent Loop & Conversation
- **Subagent status reporting** — reports `completed` status when summary exists instead of generic failure ([#3829](https://github.com/NousResearch/hermes-agent/pull/3829))
- **Session log file updated during compression** — prevents stale file references after context compression ([#3835](https://github.com/NousResearch/hermes-agent/pull/3835))
- **Omit empty tools param** — sends no `tools` parameter when empty instead of `None`, fixing compatibility with strict providers ([#3820](https://github.com/NousResearch/hermes-agent/pull/3820))
### Profiles & Multi-Instance
- **Profiles system** — `hermes profile create/list/switch/delete/export/import/rename`. Each profile gets isolated HERMES_HOME, gateway service, CLI wrapper. Token locks prevent credential collisions. Tab completion for profile names. ([#3681](https://github.com/NousResearch/hermes-agent/pull/3681))
- **Profile-aware display paths** — all user-facing `~/.hermes` paths replaced with `display_hermes_home()` to show the correct profile directory ([#3623](https://github.com/NousResearch/hermes-agent/pull/3623))
- **Lazy display_hermes_home imports** — prevents `ImportError` during `hermes update` when modules cache stale bytecode ([#3776](https://github.com/NousResearch/hermes-agent/pull/3776))
- **HERMES_HOME for protected paths** — `.env` write-deny path now respects HERMES_HOME instead of hardcoded `~/.hermes` ([#3840](https://github.com/NousResearch/hermes-agent/pull/3840))
---
## 📱 Messaging Platforms (Gateway)
### New Platforms
- **Feishu/Lark** — Full adapter with event subscriptions, message cards, group chat, image/file attachments, interactive card callbacks ([#3799](https://github.com/NousResearch/hermes-agent/pull/3799), [#3817](https://github.com/NousResearch/hermes-agent/pull/3817))
- **WeCom (Enterprise WeChat)** — Text/image/voice messages, group chats, callback verification ([#3847](https://github.com/NousResearch/hermes-agent/pull/3847))
### Telegram
- **Webhook mode** — run as webhook endpoint instead of polling for production deployments ([#3880](https://github.com/NousResearch/hermes-agent/pull/3880))
- **Group mention gating & regex triggers** — configurable bot response behavior in groups: always, @mention-only, or regex-matched ([#3870](https://github.com/NousResearch/hermes-agent/pull/3870))
- **Gracefully handle deleted reply targets** — no more crashes when the message being replied to was deleted ([#3858](https://github.com/NousResearch/hermes-agent/pull/3858), closes [#3229](https://github.com/NousResearch/hermes-agent/issues/3229))
### Discord
- **Message processing reactions** — adds a reaction emoji while processing and removes it when done, giving visual feedback in channels ([#3871](https://github.com/NousResearch/hermes-agent/pull/3871))
- **DISCORD_IGNORE_NO_MENTION** — skip messages that @mention other users/bots but not Hermes ([#3640](https://github.com/NousResearch/hermes-agent/pull/3640))
- **Clean up deferred "thinking..."** — properly removes the "thinking..." indicator after slash commands complete ([#3674](https://github.com/NousResearch/hermes-agent/pull/3674), closes [#3595](https://github.com/NousResearch/hermes-agent/issues/3595))
### Slack
- **Multi-workspace OAuth** — connect to multiple Slack workspaces from a single gateway via OAuth token file ([#3903](https://github.com/NousResearch/hermes-agent/pull/3903))
### WhatsApp
- **Persistent aiohttp session** — reuse HTTP sessions across requests instead of creating new ones per message ([#3818](https://github.com/NousResearch/hermes-agent/pull/3818))
- **LID↔phone alias resolution** — correctly match Linked ID and phone number formats in allowlists ([#3830](https://github.com/NousResearch/hermes-agent/pull/3830))
- **Skip reply prefix in bot mode** — cleaner message formatting when running as a WhatsApp bot ([#3931](https://github.com/NousResearch/hermes-agent/pull/3931))
### Matrix
- **Native voice messages via MSC3245** — send voice messages as proper Matrix voice events instead of file attachments ([#3877](https://github.com/NousResearch/hermes-agent/pull/3877))
### Mattermost
- **Configurable mention behavior** — respond to messages without requiring @mention ([#3664](https://github.com/NousResearch/hermes-agent/pull/3664))
### Signal
- **URL-encode phone numbers** and correct attachment RPC parameter — fixes delivery failures with certain phone number formats ([#3670](https://github.com/NousResearch/hermes-agent/pull/3670)) — @kshitijk4poor
### Email
- **Close SMTP/IMAP connections on failure** — prevents connection leaks during error scenarios ([#3804](https://github.com/NousResearch/hermes-agent/pull/3804))
### Gateway Core
- **Atomic config writes** — use atomic file writes for config.yaml to prevent data loss during crashes ([#3800](https://github.com/NousResearch/hermes-agent/pull/3800))
- **Home channel env overrides** — apply environment variable overrides for home channels consistently ([#3796](https://github.com/NousResearch/hermes-agent/pull/3796), [#3808](https://github.com/NousResearch/hermes-agent/pull/3808))
- **Replace print() with logger** — BasePlatformAdapter now uses proper logging instead of print statements ([#3669](https://github.com/NousResearch/hermes-agent/pull/3669))
- **Cron delivery labels** — resolve human-friendly delivery labels via channel directory ([#3860](https://github.com/NousResearch/hermes-agent/pull/3860), closes [#1945](https://github.com/NousResearch/hermes-agent/issues/1945))
- **Cron [SILENT] tightening** — prevent agents from prefixing reports with [SILENT] to suppress delivery ([#3901](https://github.com/NousResearch/hermes-agent/pull/3901))
- **Background task media delivery** and vision download timeout fixes ([#3919](https://github.com/NousResearch/hermes-agent/pull/3919))
- **Boot-md hook** — example built-in hook to run a BOOT.md file on gateway startup ([#3733](https://github.com/NousResearch/hermes-agent/pull/3733))
---
## 🖥️ CLI & User Experience
### Interactive CLI
- **Configurable tool preview length** — show full file paths by default instead of truncating at 40 chars ([#3841](https://github.com/NousResearch/hermes-agent/pull/3841))
- **Tool token context display** — `hermes tools` checklist now shows estimated token cost per toolset ([#3805](https://github.com/NousResearch/hermes-agent/pull/3805))
- **/bg spinner TUI fix** — route background task spinner through the TUI widget to prevent status bar collision ([#3643](https://github.com/NousResearch/hermes-agent/pull/3643))
- **Prevent status bar wrapping** into duplicate rows ([#3883](https://github.com/NousResearch/hermes-agent/pull/3883)) — @kshitijk4poor
- **Handle closed stdout ValueError** in safe print paths — fixes crashes when stdout is closed during gateway thread shutdown ([#3843](https://github.com/NousResearch/hermes-agent/pull/3843), closes [#3534](https://github.com/NousResearch/hermes-agent/issues/3534))
- **Remove input() from /tools disable** — eliminates freeze in terminal when disabling tools ([#3918](https://github.com/NousResearch/hermes-agent/pull/3918))
- **TTY guard for interactive CLI commands** — prevent CPU spin when launched without a terminal ([#3933](https://github.com/NousResearch/hermes-agent/pull/3933))
- **Argparse entrypoint** — use argparse in the top-level launcher for cleaner error handling ([#3874](https://github.com/NousResearch/hermes-agent/pull/3874))
- **Lazy-initialized tools show yellow** in banner instead of red, reducing false alarm about "missing" tools ([#3822](https://github.com/NousResearch/hermes-agent/pull/3822))
- **Honcho tools shown in banner** when configured ([#3810](https://github.com/NousResearch/hermes-agent/pull/3810))
### Setup & Configuration
- **Auto-install matrix-nio** during `hermes setup` when Matrix is selected ([#3802](https://github.com/NousResearch/hermes-agent/pull/3802), [#3873](https://github.com/NousResearch/hermes-agent/pull/3873))
- **Session export stdout support** — export sessions to stdout with `-` for piping ([#3641](https://github.com/NousResearch/hermes-agent/pull/3641), closes [#3609](https://github.com/NousResearch/hermes-agent/issues/3609))
- **Configurable approval timeouts** — set how long dangerous command approval prompts wait before auto-denying ([#3886](https://github.com/NousResearch/hermes-agent/pull/3886), closes [#3765](https://github.com/NousResearch/hermes-agent/issues/3765))
- **Clear __pycache__ during update** — prevents stale bytecode ImportError after `hermes update` ([#3819](https://github.com/NousResearch/hermes-agent/pull/3819))
---
## 🔧 Tool System
### MCP
- **MCP Server Mode** — `hermes mcp serve` exposes conversations, sessions, and attachments to MCP clients via stdio or Streamable HTTP ([#3795](https://github.com/NousResearch/hermes-agent/pull/3795))
- **Dynamic tool discovery** — respond to `notifications/tools/list_changed` events to pick up new tools from MCP servers without reconnecting ([#3812](https://github.com/NousResearch/hermes-agent/pull/3812))
- **Non-deprecated HTTP transport** — switched from `sse_client` to `streamable_http_client` ([#3646](https://github.com/NousResearch/hermes-agent/pull/3646))
### Web Tools
- **Exa search backend** — alternative to Firecrawl and DuckDuckGo for web search and extraction ([#3648](https://github.com/NousResearch/hermes-agent/pull/3648))
### Browser
- **Guard against None LLM responses** in browser snapshot and vision tools ([#3642](https://github.com/NousResearch/hermes-agent/pull/3642))
### Terminal & Remote Backends
- **Mount skill directories** into Modal and Docker containers ([#3890](https://github.com/NousResearch/hermes-agent/pull/3890))
- **Mount credential files** into remote backends with mtime+size caching ([#3671](https://github.com/NousResearch/hermes-agent/pull/3671))
- **Preserve partial output** when commands time out instead of losing everything ([#3868](https://github.com/NousResearch/hermes-agent/pull/3868))
- **Stop marking persisted env vars as missing** on remote backends ([#3650](https://github.com/NousResearch/hermes-agent/pull/3650))
### Audio
- **.aac format support** in transcription tool ([#3865](https://github.com/NousResearch/hermes-agent/pull/3865), closes [#1963](https://github.com/NousResearch/hermes-agent/issues/1963))
- **Audio download retry** — retry logic for `cache_audio_from_url` matching the existing image download pattern ([#3401](https://github.com/NousResearch/hermes-agent/pull/3401)) — @binhnt92
### Vision
- **Reject non-image files** and enforce website-only policy for vision analysis ([#3845](https://github.com/NousResearch/hermes-agent/pull/3845))
### Tool Schema
- **Ensure name field** always present in tool definitions, fixing `KeyError: 'name'` crashes ([#3811](https://github.com/NousResearch/hermes-agent/pull/3811), closes [#3729](https://github.com/NousResearch/hermes-agent/issues/3729))
### ACP (Editor Integration)
- **Complete session management surface** for VS Code/Zed/JetBrains clients — proper task lifecycle, cancel support, session persistence ([#3675](https://github.com/NousResearch/hermes-agent/pull/3675))
---
## 🧩 Skills & Plugins
### Skills System
- **External skill directories** — configure additional skill directories via `skills.external_dirs` in config.yaml ([#3678](https://github.com/NousResearch/hermes-agent/pull/3678))
- **Category path traversal blocked** — prevents `../` attacks in skill category names ([#3844](https://github.com/NousResearch/hermes-agent/pull/3844))
- **parallel-cli moved to optional-skills** — reduces default skill footprint ([#3673](https://github.com/NousResearch/hermes-agent/pull/3673)) — @kshitijk4poor
### New Skills
- **memento-flashcards** — spaced repetition flashcard system ([#3827](https://github.com/NousResearch/hermes-agent/pull/3827))
- **songwriting-and-ai-music** — songwriting craft and AI music generation prompts ([#3834](https://github.com/NousResearch/hermes-agent/pull/3834))
- **SiYuan Note** — integration with SiYuan note-taking app ([#3742](https://github.com/NousResearch/hermes-agent/pull/3742))
- **Scrapling** — web scraping skill using Scrapling library ([#3742](https://github.com/NousResearch/hermes-agent/pull/3742))
- **one-three-one-rule** — communication framework skill ([#3797](https://github.com/NousResearch/hermes-agent/pull/3797))
### Plugin System
- **Plugin enable/disable commands** — `hermes plugins enable/disable <name>` for managing plugin state without removing them ([#3747](https://github.com/NousResearch/hermes-agent/pull/3747))
- **Plugin message injection** — plugins can now inject messages into the conversation stream on behalf of the user via `ctx.inject_message()` ([#3778](https://github.com/NousResearch/hermes-agent/pull/3778)) — @winglian
- **Honcho self-hosted support** — allow local Honcho instances without requiring an API key ([#3644](https://github.com/NousResearch/hermes-agent/pull/3644))
---
## 🔒 Security & Reliability
### Security Hardening
- **Hardened dangerous command detection** — expanded pattern matching for risky shell commands and added file tool path guards for sensitive locations (`/etc/`, `/boot/`, docker.sock) ([#3872](https://github.com/NousResearch/hermes-agent/pull/3872))
- **Sensitive path write checks** in approval system — catch writes to system config files through file tools, not just terminal ([#3859](https://github.com/NousResearch/hermes-agent/pull/3859))
- **Secret redaction expansion** — now covers ElevenLabs, Tavily, and Exa API keys ([#3920](https://github.com/NousResearch/hermes-agent/pull/3920))
- **Vision file rejection** — reject non-image files passed to vision analysis to prevent information disclosure ([#3845](https://github.com/NousResearch/hermes-agent/pull/3845))
- **Category path traversal blocking** — prevent directory traversal in skill category names ([#3844](https://github.com/NousResearch/hermes-agent/pull/3844))
### Reliability
- **Atomic config.yaml writes** — prevent data loss during gateway crashes ([#3800](https://github.com/NousResearch/hermes-agent/pull/3800))
- **Clear __pycache__ on update** — prevent stale bytecode from causing ImportError after updates ([#3819](https://github.com/NousResearch/hermes-agent/pull/3819))
- **Lazy imports for update safety** — prevent ImportError chains during `hermes update` when modules reference new functions ([#3776](https://github.com/NousResearch/hermes-agent/pull/3776))
- **Restore terminalbench2 from patch corruption** — recovered file damaged by patch tool's secret redaction ([#3801](https://github.com/NousResearch/hermes-agent/pull/3801))
- **Terminal timeout preserves partial output** — no more lost command output on timeout ([#3868](https://github.com/NousResearch/hermes-agent/pull/3868))
---
## 🐛 Notable Bug Fixes
- **OpenClaw migration model config overwrite** — migration no longer overwrites model config dict with a string ([#3924](https://github.com/NousResearch/hermes-agent/pull/3924)) — @0xbyt4
- **OpenClaw migration expanded** — covers full data footprint including sessions, cron, memory ([#3869](https://github.com/NousResearch/hermes-agent/pull/3869))
- **Telegram deleted reply targets** — gracefully handle replies to deleted messages instead of crashing ([#3858](https://github.com/NousResearch/hermes-agent/pull/3858))
- **Discord "thinking..." persistence** — properly cleans up deferred response indicators ([#3674](https://github.com/NousResearch/hermes-agent/pull/3674))
- **WhatsApp LID↔phone aliases** — fixes allowlist matching failures with Linked ID format ([#3830](https://github.com/NousResearch/hermes-agent/pull/3830))
- **Signal URL-encoded phone numbers** — fixes delivery failures with certain formats ([#3670](https://github.com/NousResearch/hermes-agent/pull/3670))
- **Email connection leaks** — properly close SMTP/IMAP connections on error ([#3804](https://github.com/NousResearch/hermes-agent/pull/3804))
- **_safe_print ValueError** — no more gateway thread crashes on closed stdout ([#3843](https://github.com/NousResearch/hermes-agent/pull/3843))
- **Tool schema KeyError 'name'** — ensure name field always present in tool definitions ([#3811](https://github.com/NousResearch/hermes-agent/pull/3811))
- **api_mode stale on provider switch** — correctly clear when switching providers via `hermes model` ([#3857](https://github.com/NousResearch/hermes-agent/pull/3857))
---
## 🧪 Testing
- Resolved 10+ CI failures across hooks, tiktoken, plugins, and skill tests ([#3848](https://github.com/NousResearch/hermes-agent/pull/3848), [#3721](https://github.com/NousResearch/hermes-agent/pull/3721), [#3936](https://github.com/NousResearch/hermes-agent/pull/3936))
---
## 📚 Documentation
- **Comprehensive OpenClaw migration guide** — step-by-step guide for migrating from OpenClaw/Claw3D to Hermes Agent ([#3864](https://github.com/NousResearch/hermes-agent/pull/3864), [#3900](https://github.com/NousResearch/hermes-agent/pull/3900))
- **Credential file passthrough docs** — document how to forward credential files and env vars to remote backends ([#3677](https://github.com/NousResearch/hermes-agent/pull/3677))
- **DuckDuckGo requirements clarified** — note runtime dependency on duckduckgo-search package ([#3680](https://github.com/NousResearch/hermes-agent/pull/3680))
- **Skills catalog updated** — added red-teaming category and optional skills listing ([#3745](https://github.com/NousResearch/hermes-agent/pull/3745))
- **Feishu docs MDX fix** — escape angle-bracket URLs that break Docusaurus build ([#3902](https://github.com/NousResearch/hermes-agent/pull/3902))
---
## 👥 Contributors
### Core
- **@teknium1** — 90 PRs across all subsystems
### Community Contributors
- **@kshitijk4poor** — 3 PRs: Signal phone number fix ([#3670](https://github.com/NousResearch/hermes-agent/pull/3670)), parallel-cli to optional-skills ([#3673](https://github.com/NousResearch/hermes-agent/pull/3673)), status bar wrapping fix ([#3883](https://github.com/NousResearch/hermes-agent/pull/3883))
- **@winglian** — 1 PR: Plugin message injection interface ([#3778](https://github.com/NousResearch/hermes-agent/pull/3778))
- **@binhnt92** — 1 PR: Audio download retry logic ([#3401](https://github.com/NousResearch/hermes-agent/pull/3401))
- **@0xbyt4** — 1 PR: OpenClaw migration model config fix ([#3924](https://github.com/NousResearch/hermes-agent/pull/3924))
### Issues Resolved from Community
@Material-Scientist ([#850](https://github.com/NousResearch/hermes-agent/issues/850)), @hanxu98121 ([#1734](https://github.com/NousResearch/hermes-agent/issues/1734)), @penwyp ([#1788](https://github.com/NousResearch/hermes-agent/issues/1788)), @dan-and ([#1945](https://github.com/NousResearch/hermes-agent/issues/1945)), @AdrianScott ([#1963](https://github.com/NousResearch/hermes-agent/issues/1963)), @clawdbot47 ([#3229](https://github.com/NousResearch/hermes-agent/issues/3229)), @alanfwilliams ([#3404](https://github.com/NousResearch/hermes-agent/issues/3404)), @kentimsit ([#3433](https://github.com/NousResearch/hermes-agent/issues/3433)), @hayka-pacha ([#3534](https://github.com/NousResearch/hermes-agent/issues/3534)), @primmer ([#3595](https://github.com/NousResearch/hermes-agent/issues/3595)), @dagelf ([#3609](https://github.com/NousResearch/hermes-agent/issues/3609)), @HenkDz ([#3685](https://github.com/NousResearch/hermes-agent/issues/3685)), @tmdgusya ([#3729](https://github.com/NousResearch/hermes-agent/issues/3729)), @TypQxQ ([#3753](https://github.com/NousResearch/hermes-agent/issues/3753)), @acsezen ([#3765](https://github.com/NousResearch/hermes-agent/issues/3765))
---
**Full Changelog**: [v2026.3.28...v2026.3.30](https://github.com/NousResearch/hermes-agent/compare/v2026.3.28...v2026.3.30)

290
RELEASE_v0.7.0.md Normal file
View File

@ -0,0 +1,290 @@
# Hermes Agent v0.7.0 (v2026.4.3)
**Release Date:** April 3, 2026
> The resilience release — pluggable memory providers, credential pool rotation, Camofox anti-detection browser, inline diff previews, gateway hardening across race conditions and approval routing, and deep security fixes across 168 PRs and 46 resolved issues.
---
## ✨ Highlights
- **Pluggable Memory Provider Interface** — Memory is now an extensible plugin system. Third-party memory backends (Honcho, vector stores, custom DBs) implement a simple provider ABC and register via the plugin system. Built-in memory is the default provider. Honcho integration restored to full parity as the reference plugin with profile-scoped host/peer resolution. ([#4623](https://github.com/NousResearch/hermes-agent/pull/4623), [#4616](https://github.com/NousResearch/hermes-agent/pull/4616), [#4355](https://github.com/NousResearch/hermes-agent/pull/4355))
- **Same-Provider Credential Pools** — Configure multiple API keys for the same provider with automatic rotation. Thread-safe `least_used` strategy distributes load across keys, and 401 failures trigger automatic rotation to the next credential. Set up via the setup wizard or `credential_pool` config. ([#4188](https://github.com/NousResearch/hermes-agent/pull/4188), [#4300](https://github.com/NousResearch/hermes-agent/pull/4300), [#4361](https://github.com/NousResearch/hermes-agent/pull/4361))
- **Camofox Anti-Detection Browser Backend** — New local browser backend using Camoufox for stealth browsing. Persistent sessions with VNC URL discovery for visual debugging, configurable SSRF bypass for local backends, auto-install via `hermes tools`. ([#4008](https://github.com/NousResearch/hermes-agent/pull/4008), [#4419](https://github.com/NousResearch/hermes-agent/pull/4419), [#4292](https://github.com/NousResearch/hermes-agent/pull/4292))
- **Inline Diff Previews** — File write and patch operations now show inline diffs in the tool activity feed, giving you visual confirmation of what changed before the agent moves on. ([#4411](https://github.com/NousResearch/hermes-agent/pull/4411), [#4423](https://github.com/NousResearch/hermes-agent/pull/4423))
- **API Server Session Continuity & Tool Streaming** — The API server (Open WebUI integration) now streams tool progress events in real-time and supports `X-Hermes-Session-Id` headers for persistent sessions across requests. Sessions persist to the shared SessionDB. ([#4092](https://github.com/NousResearch/hermes-agent/pull/4092), [#4478](https://github.com/NousResearch/hermes-agent/pull/4478), [#4802](https://github.com/NousResearch/hermes-agent/pull/4802))
- **ACP: Client-Provided MCP Servers** — Editor integrations (VS Code, Zed, JetBrains) can now register their own MCP servers, which Hermes picks up as additional agent tools. Your editor's MCP ecosystem flows directly into the agent. ([#4705](https://github.com/NousResearch/hermes-agent/pull/4705))
- **Gateway Hardening** — Major stability pass across race conditions, photo media delivery, flood control, stuck sessions, approval routing, and compression death spirals. The gateway is substantially more reliable in production. ([#4727](https://github.com/NousResearch/hermes-agent/pull/4727), [#4750](https://github.com/NousResearch/hermes-agent/pull/4750), [#4798](https://github.com/NousResearch/hermes-agent/pull/4798), [#4557](https://github.com/NousResearch/hermes-agent/pull/4557))
- **Security: Secret Exfiltration Blocking** — Browser URLs and LLM responses are now scanned for secret patterns, blocking exfiltration attempts via URL encoding, base64, or prompt injection. Credential directory protections expanded to `.docker`, `.azure`, `.config/gh`. Execute_code sandbox output is redacted. ([#4483](https://github.com/NousResearch/hermes-agent/pull/4483), [#4360](https://github.com/NousResearch/hermes-agent/pull/4360), [#4305](https://github.com/NousResearch/hermes-agent/pull/4305), [#4327](https://github.com/NousResearch/hermes-agent/pull/4327))
---
## 🏗️ Core Agent & Architecture
### Provider & Model Support
- **Same-provider credential pools** — configure multiple API keys with automatic `least_used` rotation and 401 failover ([#4188](https://github.com/NousResearch/hermes-agent/pull/4188), [#4300](https://github.com/NousResearch/hermes-agent/pull/4300))
- **Credential pool preserved through smart routing** — pool state survives fallback provider switches and defers eager fallback on 429 ([#4361](https://github.com/NousResearch/hermes-agent/pull/4361))
- **Per-turn primary runtime restoration** — after fallback provider use, the agent automatically restores the primary provider on the next turn with transport recovery ([#4624](https://github.com/NousResearch/hermes-agent/pull/4624))
- **`developer` role for GPT-5 and Codex models** — uses OpenAI's recommended system message role for newer models ([#4498](https://github.com/NousResearch/hermes-agent/pull/4498))
- **Google model operational guidance** — Gemini and Gemma models get provider-specific prompting guidance ([#4641](https://github.com/NousResearch/hermes-agent/pull/4641))
- **Anthropic long-context tier 429 handling** — automatically reduces context to 200k when hitting tier limits ([#4747](https://github.com/NousResearch/hermes-agent/pull/4747))
- **URL-based auth for third-party Anthropic endpoints** + CI test fixes ([#4148](https://github.com/NousResearch/hermes-agent/pull/4148))
- **Bearer auth for MiniMax Anthropic endpoints** ([#4028](https://github.com/NousResearch/hermes-agent/pull/4028))
- **Fireworks context length detection** ([#4158](https://github.com/NousResearch/hermes-agent/pull/4158))
- **Standard DashScope international endpoint** for Alibaba provider ([#4133](https://github.com/NousResearch/hermes-agent/pull/4133), closes [#3912](https://github.com/NousResearch/hermes-agent/issues/3912))
- **Custom providers context_length** honored in hygiene compression ([#4085](https://github.com/NousResearch/hermes-agent/pull/4085))
- **Non-sk-ant keys** treated as regular API keys, not OAuth tokens ([#4093](https://github.com/NousResearch/hermes-agent/pull/4093))
- **Claude-sonnet-4.6** added to OpenRouter and Nous model lists ([#4157](https://github.com/NousResearch/hermes-agent/pull/4157))
- **Qwen 3.6 Plus Preview** added to model lists ([#4376](https://github.com/NousResearch/hermes-agent/pull/4376))
- **MiniMax M2.7** added to hermes model picker and OpenCode ([#4208](https://github.com/NousResearch/hermes-agent/pull/4208))
- **Auto-detect models from server probe** in custom endpoint setup ([#4218](https://github.com/NousResearch/hermes-agent/pull/4218))
- **Config.yaml single source of truth** for endpoint URLs — no more env var vs config.yaml conflicts ([#4165](https://github.com/NousResearch/hermes-agent/pull/4165))
- **Setup wizard no longer overwrites** custom endpoint config ([#4180](https://github.com/NousResearch/hermes-agent/pull/4180), closes [#4172](https://github.com/NousResearch/hermes-agent/issues/4172))
- **Unified setup wizard provider selection** with `hermes model` — single code path for both flows ([#4200](https://github.com/NousResearch/hermes-agent/pull/4200))
- **Root-level provider config** no longer overrides `model.provider` ([#4329](https://github.com/NousResearch/hermes-agent/pull/4329))
- **Rate-limit pairing rejection messages** to prevent spam ([#4081](https://github.com/NousResearch/hermes-agent/pull/4081))
### Agent Loop & Conversation
- **Preserve Anthropic thinking block signatures** across tool-use turns ([#4626](https://github.com/NousResearch/hermes-agent/pull/4626))
- **Classify think-only empty responses** before retrying — prevents infinite retry loops on models that produce thinking blocks without content ([#4645](https://github.com/NousResearch/hermes-agent/pull/4645))
- **Prevent compression death spiral** from API disconnects — stops the loop where compression triggers, fails, compresses again ([#4750](https://github.com/NousResearch/hermes-agent/pull/4750), closes [#2153](https://github.com/NousResearch/hermes-agent/issues/2153))
- **Persist compressed context** to gateway session after mid-run compression ([#4095](https://github.com/NousResearch/hermes-agent/pull/4095))
- **Context-exceeded error messages** now include actionable guidance ([#4155](https://github.com/NousResearch/hermes-agent/pull/4155), closes [#4061](https://github.com/NousResearch/hermes-agent/issues/4061))
- **Strip orphaned think/reasoning tags** from user-facing responses ([#4311](https://github.com/NousResearch/hermes-agent/pull/4311), closes [#4285](https://github.com/NousResearch/hermes-agent/issues/4285))
- **Harden Codex responses preflight** and stream error handling ([#4313](https://github.com/NousResearch/hermes-agent/pull/4313))
- **Deterministic call_id fallbacks** instead of random UUIDs for prompt cache consistency ([#3991](https://github.com/NousResearch/hermes-agent/pull/3991))
- **Context pressure warning spam** prevented after compression ([#4012](https://github.com/NousResearch/hermes-agent/pull/4012))
- **AsyncOpenAI created lazily** in trajectory compressor to avoid closed event loop errors ([#4013](https://github.com/NousResearch/hermes-agent/pull/4013))
### Memory & Sessions
- **Pluggable memory provider interface** — ABC-based plugin system for custom memory backends with profile isolation ([#4623](https://github.com/NousResearch/hermes-agent/pull/4623))
- **Honcho full integration parity** restored as reference memory provider plugin ([#4355](https://github.com/NousResearch/hermes-agent/pull/4355)) — @erosika
- **Honcho profile-scoped** host and peer resolution ([#4616](https://github.com/NousResearch/hermes-agent/pull/4616))
- **Memory flush state persisted** to prevent redundant re-flushes on gateway restart ([#4481](https://github.com/NousResearch/hermes-agent/pull/4481))
- **Memory provider tools** routed through sequential execution path ([#4803](https://github.com/NousResearch/hermes-agent/pull/4803))
- **Honcho config** written to instance-local path for profile isolation ([#4037](https://github.com/NousResearch/hermes-agent/pull/4037))
- **API server sessions** persist to shared SessionDB ([#4802](https://github.com/NousResearch/hermes-agent/pull/4802))
- **Token usage persisted** for non-CLI sessions ([#4627](https://github.com/NousResearch/hermes-agent/pull/4627))
- **Quote dotted terms in FTS5 queries** — fixes session search for terms containing dots ([#4549](https://github.com/NousResearch/hermes-agent/pull/4549))
---
## 📱 Messaging Platforms (Gateway)
### Gateway Core
- **Race condition fixes** — photo media loss, flood control, stuck sessions, and STT config issues resolved in one hardening pass ([#4727](https://github.com/NousResearch/hermes-agent/pull/4727))
- **Approval routing through running-agent guard** — `/approve` and `/deny` now route correctly when the agent is blocked waiting for approval instead of being swallowed as interrupts ([#4798](https://github.com/NousResearch/hermes-agent/pull/4798), [#4557](https://github.com/NousResearch/hermes-agent/pull/4557), closes [#4542](https://github.com/NousResearch/hermes-agent/issues/4542))
- **Resume agent after /approve** — tool result is no longer lost when executing blocked commands ([#4418](https://github.com/NousResearch/hermes-agent/pull/4418))
- **DM thread sessions seeded** with parent transcript to preserve context ([#4559](https://github.com/NousResearch/hermes-agent/pull/4559))
- **Skill-aware slash commands** — gateway dynamically registers installed skills as slash commands with paginated `/commands` list and Telegram 100-command cap ([#3934](https://github.com/NousResearch/hermes-agent/pull/3934), [#4005](https://github.com/NousResearch/hermes-agent/pull/4005), [#4006](https://github.com/NousResearch/hermes-agent/pull/4006), [#4010](https://github.com/NousResearch/hermes-agent/pull/4010), [#4023](https://github.com/NousResearch/hermes-agent/pull/4023))
- **Per-platform disabled skills** respected in Telegram menu and gateway dispatch ([#4799](https://github.com/NousResearch/hermes-agent/pull/4799))
- **Remove user-facing compression warnings** — cleaner message flow ([#4139](https://github.com/NousResearch/hermes-agent/pull/4139))
- **`-v/-q` flags wired to stderr logging** for gateway service ([#4474](https://github.com/NousResearch/hermes-agent/pull/4474))
- **HERMES_HOME remapped** to target user in system service unit ([#4456](https://github.com/NousResearch/hermes-agent/pull/4456))
- **Honor default for invalid bool-like config values** ([#4029](https://github.com/NousResearch/hermes-agent/pull/4029))
- **setsid instead of systemd-run** for `/update` command to avoid systemd permission issues ([#4104](https://github.com/NousResearch/hermes-agent/pull/4104), closes [#4017](https://github.com/NousResearch/hermes-agent/issues/4017))
- **'Initializing agent...'** shown on first message for better UX ([#4086](https://github.com/NousResearch/hermes-agent/pull/4086))
- **Allow running gateway service as root** for LXC/container environments ([#4732](https://github.com/NousResearch/hermes-agent/pull/4732))
### Telegram
- **32-char limit on command names** with collision avoidance ([#4211](https://github.com/NousResearch/hermes-agent/pull/4211))
- **Priority order enforced** in menu — core > plugins > skills ([#4023](https://github.com/NousResearch/hermes-agent/pull/4023))
- **Capped at 50 commands** — API rejects above ~60 ([#4006](https://github.com/NousResearch/hermes-agent/pull/4006))
- **Skip empty/whitespace text** to prevent 400 errors ([#4388](https://github.com/NousResearch/hermes-agent/pull/4388))
- **E2E gateway tests** added ([#4497](https://github.com/NousResearch/hermes-agent/pull/4497)) — @pefontana
### Discord
- **Button-based approval UI** — register `/approve` and `/deny` slash commands with interactive button prompts ([#4800](https://github.com/NousResearch/hermes-agent/pull/4800))
- **Configurable reactions** — `discord.reactions` config option to disable message processing reactions ([#4199](https://github.com/NousResearch/hermes-agent/pull/4199))
- **Skip reactions and auto-threading** for unauthorized users ([#4387](https://github.com/NousResearch/hermes-agent/pull/4387))
### Slack
- **Reply in thread** — `slack.reply_in_thread` config option for threaded responses ([#4643](https://github.com/NousResearch/hermes-agent/pull/4643), closes [#2662](https://github.com/NousResearch/hermes-agent/issues/2662))
### WhatsApp
- **Enforce require_mention in group chats** ([#4730](https://github.com/NousResearch/hermes-agent/pull/4730))
### Webhook
- **Platform support fixes** — skip home channel prompt, disable tool progress for webhook adapters ([#4660](https://github.com/NousResearch/hermes-agent/pull/4660))
### Matrix
- **E2EE decryption hardening** — request missing keys, auto-trust devices, retry buffered events ([#4083](https://github.com/NousResearch/hermes-agent/pull/4083))
---
## 🖥️ CLI & User Experience
### New Slash Commands
- **`/yolo`** — toggle dangerous command approvals on/off for the session ([#3990](https://github.com/NousResearch/hermes-agent/pull/3990))
- **`/btw`** — ephemeral side questions that don't affect the main conversation context ([#4161](https://github.com/NousResearch/hermes-agent/pull/4161))
- **`/profile`** — show active profile info without leaving the chat session ([#4027](https://github.com/NousResearch/hermes-agent/pull/4027))
### Interactive CLI
- **Inline diff previews** for write and patch operations in the tool activity feed ([#4411](https://github.com/NousResearch/hermes-agent/pull/4411), [#4423](https://github.com/NousResearch/hermes-agent/pull/4423))
- **TUI pinned to bottom** on startup — no more large blank spaces between response and input ([#4412](https://github.com/NousResearch/hermes-agent/pull/4412), [#4359](https://github.com/NousResearch/hermes-agent/pull/4359), closes [#4398](https://github.com/NousResearch/hermes-agent/issues/4398), [#4421](https://github.com/NousResearch/hermes-agent/issues/4421))
- **`/history` and `/resume`** now surface recent sessions directly instead of requiring search ([#4728](https://github.com/NousResearch/hermes-agent/pull/4728))
- **Cache tokens shown** in `/insights` overview so total adds up ([#4428](https://github.com/NousResearch/hermes-agent/pull/4428))
- **`--max-turns` CLI flag** for `hermes chat` to limit agent iterations ([#4314](https://github.com/NousResearch/hermes-agent/pull/4314))
- **Detect dragged file paths** instead of treating them as slash commands ([#4533](https://github.com/NousResearch/hermes-agent/pull/4533)) — @rolme
- **Allow empty strings and falsy values** in `config set` ([#4310](https://github.com/NousResearch/hermes-agent/pull/4310), closes [#4277](https://github.com/NousResearch/hermes-agent/issues/4277))
- **Voice mode in WSL** when PulseAudio bridge is configured ([#4317](https://github.com/NousResearch/hermes-agent/pull/4317))
- **Respect `NO_COLOR` env var** and `TERM=dumb` for accessibility ([#4079](https://github.com/NousResearch/hermes-agent/pull/4079), closes [#4066](https://github.com/NousResearch/hermes-agent/issues/4066)) — @SHL0MS
- **Correct shell reload instruction** for macOS/zsh users ([#4025](https://github.com/NousResearch/hermes-agent/pull/4025))
- **Zero exit code** on successful quiet mode queries ([#4613](https://github.com/NousResearch/hermes-agent/pull/4613), closes [#4601](https://github.com/NousResearch/hermes-agent/issues/4601)) — @devorun
- **on_session_end hook fires** on interrupted exits ([#4159](https://github.com/NousResearch/hermes-agent/pull/4159))
- **Profile list display** reads `model.default` key correctly ([#4160](https://github.com/NousResearch/hermes-agent/pull/4160))
- **Browser and TTS** shown in reconfigure menu ([#4041](https://github.com/NousResearch/hermes-agent/pull/4041))
- **Web backend priority** detection simplified ([#4036](https://github.com/NousResearch/hermes-agent/pull/4036))
### Setup & Configuration
- **Allowed_users preserved** during setup and quiet unconfigured provider warnings ([#4551](https://github.com/NousResearch/hermes-agent/pull/4551)) — @kshitijk4poor
- **Save API key to model config** for custom endpoints ([#4202](https://github.com/NousResearch/hermes-agent/pull/4202), closes [#4182](https://github.com/NousResearch/hermes-agent/issues/4182))
- **Claude Code credentials gated** behind explicit Hermes config in wizard trigger ([#4210](https://github.com/NousResearch/hermes-agent/pull/4210))
- **Atomic writes in save_config_value** to prevent config loss on interrupt ([#4298](https://github.com/NousResearch/hermes-agent/pull/4298), [#4320](https://github.com/NousResearch/hermes-agent/pull/4320))
- **Scopes field written** to Claude Code credentials on token refresh ([#4126](https://github.com/NousResearch/hermes-agent/pull/4126))
### Update System
- **Fork detection and upstream sync** in `hermes update` ([#4744](https://github.com/NousResearch/hermes-agent/pull/4744))
- **Preserve working optional extras** when one extra fails during update ([#4550](https://github.com/NousResearch/hermes-agent/pull/4550))
- **Handle conflicted git index** during hermes update ([#4735](https://github.com/NousResearch/hermes-agent/pull/4735))
- **Avoid launchd restart race** on macOS ([#4736](https://github.com/NousResearch/hermes-agent/pull/4736))
- **Missing subprocess.run() timeouts** added to doctor and status commands ([#4009](https://github.com/NousResearch/hermes-agent/pull/4009))
---
## 🔧 Tool System
### Browser
- **Camofox anti-detection browser backend** — local stealth browsing with auto-install via `hermes tools` ([#4008](https://github.com/NousResearch/hermes-agent/pull/4008))
- **Persistent Camofox sessions** with VNC URL discovery for visual debugging ([#4419](https://github.com/NousResearch/hermes-agent/pull/4419))
- **Skip SSRF check for local backends** (Camofox, headless Chromium) ([#4292](https://github.com/NousResearch/hermes-agent/pull/4292))
- **Configurable SSRF check** via `browser.allow_private_urls` ([#4198](https://github.com/NousResearch/hermes-agent/pull/4198)) — @nils010485
- **CAMOFOX_PORT=9377** added to Docker commands ([#4340](https://github.com/NousResearch/hermes-agent/pull/4340))
### File Operations
- **Inline diff previews** on write and patch actions ([#4411](https://github.com/NousResearch/hermes-agent/pull/4411), [#4423](https://github.com/NousResearch/hermes-agent/pull/4423))
- **Stale file detection** on write and patch — warns when file was modified externally since last read ([#4345](https://github.com/NousResearch/hermes-agent/pull/4345))
- **Staleness timestamp refreshed** after writes ([#4390](https://github.com/NousResearch/hermes-agent/pull/4390))
- **Size guard, dedup, and device blocking** on read_file ([#4315](https://github.com/NousResearch/hermes-agent/pull/4315))
### MCP
- **Stability fix pack** — reload timeout, shutdown cleanup, event loop handler, OAuth non-blocking ([#4757](https://github.com/NousResearch/hermes-agent/pull/4757), closes [#4462](https://github.com/NousResearch/hermes-agent/issues/4462), [#2537](https://github.com/NousResearch/hermes-agent/issues/2537))
### ACP (Editor Integration)
- **Client-provided MCP servers** registered as agent tools — editors pass their MCP servers to Hermes ([#4705](https://github.com/NousResearch/hermes-agent/pull/4705))
### Skills System
- **Size limits for agent writes** and **fuzzy matching for skill patch** — prevents oversized skill writes and improves edit reliability ([#4414](https://github.com/NousResearch/hermes-agent/pull/4414))
- **Validate hub bundle paths** before install — blocks path traversal in skill bundles ([#3986](https://github.com/NousResearch/hermes-agent/pull/3986))
- **Unified hermes-agent and hermes-agent-setup** into single skill ([#4332](https://github.com/NousResearch/hermes-agent/pull/4332))
- **Skill metadata type check** in extract_skill_conditions ([#4479](https://github.com/NousResearch/hermes-agent/pull/4479))
### New/Updated Skills
- **research-paper-writing** — full end-to-end research pipeline (replaced ml-paper-writing) ([#4654](https://github.com/NousResearch/hermes-agent/pull/4654)) — @SHL0MS
- **ascii-video** — text readability techniques and external layout oracle ([#4054](https://github.com/NousResearch/hermes-agent/pull/4054)) — @SHL0MS
- **youtube-transcript** updated for youtube-transcript-api v1.x ([#4455](https://github.com/NousResearch/hermes-agent/pull/4455)) — @el-analista
- **Skills browse and search page** added to documentation site ([#4500](https://github.com/NousResearch/hermes-agent/pull/4500)) — @IAvecilla
---
## 🔒 Security & Reliability
### Security Hardening
- **Block secret exfiltration** via browser URLs and LLM responses — scans for secret patterns in URL encoding, base64, and prompt injection vectors ([#4483](https://github.com/NousResearch/hermes-agent/pull/4483))
- **Redact secrets from execute_code sandbox output** ([#4360](https://github.com/NousResearch/hermes-agent/pull/4360))
- **Protect `.docker`, `.azure`, `.config/gh` credential directories** from read/write via file tools and terminal ([#4305](https://github.com/NousResearch/hermes-agent/pull/4305), [#4327](https://github.com/NousResearch/hermes-agent/pull/4327)) — @memosr
- **GitHub OAuth token patterns** added to redaction + snapshot redact flag ([#4295](https://github.com/NousResearch/hermes-agent/pull/4295))
- **Reject private and loopback IPs** in Telegram DoH fallback ([#4129](https://github.com/NousResearch/hermes-agent/pull/4129))
- **Reject path traversal** in credential file registration ([#4316](https://github.com/NousResearch/hermes-agent/pull/4316))
- **Validate tar archive member paths** on profile import — blocks zip-slip attacks ([#4318](https://github.com/NousResearch/hermes-agent/pull/4318))
- **Exclude auth.json and .env** from profile exports ([#4475](https://github.com/NousResearch/hermes-agent/pull/4475))
### Reliability
- **Prevent compression death spiral** from API disconnects ([#4750](https://github.com/NousResearch/hermes-agent/pull/4750), closes [#2153](https://github.com/NousResearch/hermes-agent/issues/2153))
- **Handle `is_closed` as method** in OpenAI SDK — prevents false positive client closure detection ([#4416](https://github.com/NousResearch/hermes-agent/pull/4416), closes [#4377](https://github.com/NousResearch/hermes-agent/issues/4377))
- **Exclude matrix from [all] extras** — python-olm is upstream-broken, prevents install failures ([#4615](https://github.com/NousResearch/hermes-agent/pull/4615), closes [#4178](https://github.com/NousResearch/hermes-agent/issues/4178))
- **OpenCode model routing** repaired ([#4508](https://github.com/NousResearch/hermes-agent/pull/4508))
- **Docker container image** optimized ([#4034](https://github.com/NousResearch/hermes-agent/pull/4034)) — @bcross
### Windows & Cross-Platform
- **Voice mode in WSL** with PulseAudio bridge ([#4317](https://github.com/NousResearch/hermes-agent/pull/4317))
- **Homebrew packaging** preparation ([#4099](https://github.com/NousResearch/hermes-agent/pull/4099))
- **CI fork conditionals** to prevent workflow failures on forks ([#4107](https://github.com/NousResearch/hermes-agent/pull/4107))
---
## 🐛 Notable Bug Fixes
- **Gateway approval blocked agent thread** — approval now blocks the agent thread like CLI does, preventing tool result loss ([#4557](https://github.com/NousResearch/hermes-agent/pull/4557), closes [#4542](https://github.com/NousResearch/hermes-agent/issues/4542))
- **Compression death spiral** from API disconnects — detected and halted instead of looping ([#4750](https://github.com/NousResearch/hermes-agent/pull/4750), closes [#2153](https://github.com/NousResearch/hermes-agent/issues/2153))
- **Anthropic thinking blocks lost** across tool-use turns ([#4626](https://github.com/NousResearch/hermes-agent/pull/4626))
- **Profile model config ignored** with `-p` flag — model.model now promoted to model.default correctly ([#4160](https://github.com/NousResearch/hermes-agent/pull/4160), closes [#4486](https://github.com/NousResearch/hermes-agent/issues/4486))
- **CLI blank space** between response and input area ([#4412](https://github.com/NousResearch/hermes-agent/pull/4412), [#4359](https://github.com/NousResearch/hermes-agent/pull/4359), closes [#4398](https://github.com/NousResearch/hermes-agent/issues/4398))
- **Dragged file paths** treated as slash commands instead of file references ([#4533](https://github.com/NousResearch/hermes-agent/pull/4533)) — @rolme
- **Orphaned `</think>` tags** leaking into user-facing responses ([#4311](https://github.com/NousResearch/hermes-agent/pull/4311), closes [#4285](https://github.com/NousResearch/hermes-agent/issues/4285))
- **OpenAI SDK `is_closed`** is a method not property — false positive client closure ([#4416](https://github.com/NousResearch/hermes-agent/pull/4416), closes [#4377](https://github.com/NousResearch/hermes-agent/issues/4377))
- **MCP OAuth server** could block Hermes startup instead of degrading gracefully ([#4757](https://github.com/NousResearch/hermes-agent/pull/4757), closes [#4462](https://github.com/NousResearch/hermes-agent/issues/4462))
- **MCP event loop closed** on shutdown with HTTP servers ([#4757](https://github.com/NousResearch/hermes-agent/pull/4757), closes [#2537](https://github.com/NousResearch/hermes-agent/issues/2537))
- **Alibaba provider** hardcoded to wrong endpoint ([#4133](https://github.com/NousResearch/hermes-agent/pull/4133), closes [#3912](https://github.com/NousResearch/hermes-agent/issues/3912))
- **Slack reply_in_thread** missing config option ([#4643](https://github.com/NousResearch/hermes-agent/pull/4643), closes [#2662](https://github.com/NousResearch/hermes-agent/issues/2662))
- **Quiet mode exit code** — successful `-q` queries no longer exit nonzero ([#4613](https://github.com/NousResearch/hermes-agent/pull/4613), closes [#4601](https://github.com/NousResearch/hermes-agent/issues/4601))
- **Mobile sidebar** shows only close button due to backdrop-filter issue in docs site ([#4207](https://github.com/NousResearch/hermes-agent/pull/4207)) — @xsmyile
- **Config restore reverted** by stale-branch squash merge — `_config_version` fixed ([#4440](https://github.com/NousResearch/hermes-agent/pull/4440))
---
## 🧪 Testing
- **Telegram gateway E2E tests** — full integration test suite for the Telegram adapter ([#4497](https://github.com/NousResearch/hermes-agent/pull/4497)) — @pefontana
- **11 real test failures fixed** plus sys.modules cascade poisoner resolved ([#4570](https://github.com/NousResearch/hermes-agent/pull/4570))
- **7 CI failures resolved** across hooks, plugins, and skill tests ([#3936](https://github.com/NousResearch/hermes-agent/pull/3936))
- **Codex 401 refresh tests** updated for CI compatibility ([#4166](https://github.com/NousResearch/hermes-agent/pull/4166))
- **Stale OPENAI_BASE_URL test** fixed ([#4217](https://github.com/NousResearch/hermes-agent/pull/4217))
---
## 📚 Documentation
- **Comprehensive documentation audit** — 9 HIGH and 20+ MEDIUM gaps fixed across 21 files ([#4087](https://github.com/NousResearch/hermes-agent/pull/4087))
- **Site navigation restructured** — features and platforms promoted to top-level ([#4116](https://github.com/NousResearch/hermes-agent/pull/4116))
- **Tool progress streaming** documented for API server and Open WebUI ([#4138](https://github.com/NousResearch/hermes-agent/pull/4138))
- **Telegram webhook mode** documentation ([#4089](https://github.com/NousResearch/hermes-agent/pull/4089))
- **Local LLM provider guides** — comprehensive setup guides with context length warnings ([#4294](https://github.com/NousResearch/hermes-agent/pull/4294))
- **WhatsApp allowlist behavior** clarified with `WHATSAPP_ALLOW_ALL_USERS` documentation ([#4293](https://github.com/NousResearch/hermes-agent/pull/4293))
- **Slack configuration options** — new config section in Slack docs ([#4644](https://github.com/NousResearch/hermes-agent/pull/4644))
- **Terminal backends section** expanded + docs build fixes ([#4016](https://github.com/NousResearch/hermes-agent/pull/4016))
- **Adding-providers guide** updated for unified setup flow ([#4201](https://github.com/NousResearch/hermes-agent/pull/4201))
- **ACP Zed config** fixed ([#4743](https://github.com/NousResearch/hermes-agent/pull/4743))
- **Community FAQ** entries for common workflows and troubleshooting ([#4797](https://github.com/NousResearch/hermes-agent/pull/4797))
- **Skills browse and search page** on docs site ([#4500](https://github.com/NousResearch/hermes-agent/pull/4500)) — @IAvecilla
---
## 👥 Contributors
### Core
- **@teknium1** — 135 commits across all subsystems
### Top Community Contributors
- **@kshitijk4poor** — 13 commits: preserve allowed_users during setup ([#4551](https://github.com/NousResearch/hermes-agent/pull/4551)), and various fixes
- **@erosika** — 12 commits: Honcho full integration parity restored as memory provider plugin ([#4355](https://github.com/NousResearch/hermes-agent/pull/4355))
- **@pefontana** — 9 commits: Telegram gateway E2E test suite ([#4497](https://github.com/NousResearch/hermes-agent/pull/4497))
- **@bcross** — 5 commits: Docker container image optimization ([#4034](https://github.com/NousResearch/hermes-agent/pull/4034))
- **@SHL0MS** — 4 commits: NO_COLOR/TERM=dumb support ([#4079](https://github.com/NousResearch/hermes-agent/pull/4079)), ascii-video skill updates ([#4054](https://github.com/NousResearch/hermes-agent/pull/4054)), research-paper-writing skill ([#4654](https://github.com/NousResearch/hermes-agent/pull/4654))
### All Contributors
@0xbyt4, @arasovic, @Bartok9, @bcross, @binhnt92, @camden-lowrance, @curtitoo, @Dakota, @Dave Tist, @Dean Kerr, @devorun, @dieutx, @Dilee, @el-analista, @erosika, @Gutslabs, @IAvecilla, @Jack, @Johannnnn506, @kshitijk4poor, @Laura Batalha, @Leegenux, @Lume, @MacroAnarchy, @maymuneth, @memosr, @NexVeridian, @Nick, @nils010485, @pefontana, @Penov, @rolme, @SHL0MS, @txchen, @xsmyile
### Issues Resolved from Community
@acsezen ([#2537](https://github.com/NousResearch/hermes-agent/issues/2537)), @arasovic ([#4285](https://github.com/NousResearch/hermes-agent/issues/4285)), @camden-lowrance ([#4462](https://github.com/NousResearch/hermes-agent/issues/4462)), @devorun ([#4601](https://github.com/NousResearch/hermes-agent/issues/4601)), @eloklam ([#4486](https://github.com/NousResearch/hermes-agent/issues/4486)), @HenkDz ([#3719](https://github.com/NousResearch/hermes-agent/issues/3719)), @hypotyposis ([#2153](https://github.com/NousResearch/hermes-agent/issues/2153)), @kazamak ([#4178](https://github.com/NousResearch/hermes-agent/issues/4178)), @lstep ([#4366](https://github.com/NousResearch/hermes-agent/issues/4366)), @Mark-Lok ([#4542](https://github.com/NousResearch/hermes-agent/issues/4542)), @NoJster ([#4421](https://github.com/NousResearch/hermes-agent/issues/4421)), @patp ([#2662](https://github.com/NousResearch/hermes-agent/issues/2662)), @pr0n ([#4601](https://github.com/NousResearch/hermes-agent/issues/4601)), @saulmc ([#4377](https://github.com/NousResearch/hermes-agent/issues/4377)), @SHL0MS ([#4060](https://github.com/NousResearch/hermes-agent/issues/4060), [#4061](https://github.com/NousResearch/hermes-agent/issues/4061), [#4066](https://github.com/NousResearch/hermes-agent/issues/4066), [#4172](https://github.com/NousResearch/hermes-agent/issues/4172), [#4277](https://github.com/NousResearch/hermes-agent/issues/4277)), @Z-Mackintosh ([#4398](https://github.com/NousResearch/hermes-agent/issues/4398))
---
**Full Changelog**: [v2026.3.30...v2026.4.3](https://github.com/NousResearch/hermes-agent/compare/v2026.3.30...v2026.4.3)

View File

@ -74,7 +74,7 @@ def main() -> None:
agent = HermesACPAgent()
try:
asyncio.run(acp.run_agent(agent))
asyncio.run(acp.run_agent(agent, use_unstable_protocol=True))
except KeyboardInterrupt:
logger.info("Shutting down (KeyboardInterrupt)")
except Exception:

View File

@ -22,9 +22,15 @@ from acp.schema import (
InitializeResponse,
ListSessionsResponse,
LoadSessionResponse,
McpServerHttp,
McpServerSse,
McpServerStdio,
NewSessionResponse,
PromptResponse,
ResumeSessionResponse,
SetSessionConfigOptionResponse,
SetSessionModelResponse,
SetSessionModeResponse,
ResourceContentBlock,
SessionCapabilities,
SessionForkCapabilities,
@ -90,15 +96,83 @@ class HermesACPAgent(acp.Agent):
self._conn = conn
logger.info("ACP client connected")
async def _register_session_mcp_servers(
self,
state: SessionState,
mcp_servers: list[McpServerStdio | McpServerHttp | McpServerSse] | None,
) -> None:
"""Register ACP-provided MCP servers and refresh the agent tool surface."""
if not mcp_servers:
return
try:
from tools.mcp_tool import register_mcp_servers
config_map: dict[str, dict] = {}
for server in mcp_servers:
name = server.name
if isinstance(server, McpServerStdio):
config = {
"command": server.command,
"args": list(server.args),
"env": {item.name: item.value for item in server.env},
}
else:
config = {
"url": server.url,
"headers": {item.name: item.value for item in server.headers},
}
config_map[name] = config
await asyncio.to_thread(register_mcp_servers, config_map)
except Exception:
logger.warning(
"Session %s: failed to register ACP MCP servers",
state.session_id,
exc_info=True,
)
return
try:
from model_tools import get_tool_definitions
enabled_toolsets = getattr(state.agent, "enabled_toolsets", None) or ["hermes-acp"]
disabled_toolsets = getattr(state.agent, "disabled_toolsets", None)
state.agent.tools = get_tool_definitions(
enabled_toolsets=enabled_toolsets,
disabled_toolsets=disabled_toolsets,
quiet_mode=True,
)
state.agent.valid_tool_names = {
tool["function"]["name"] for tool in state.agent.tools or []
}
invalidate = getattr(state.agent, "_invalidate_system_prompt", None)
if callable(invalidate):
invalidate()
logger.info(
"Session %s: refreshed tool surface after ACP MCP registration (%d tools)",
state.session_id,
len(state.agent.tools or []),
)
except Exception:
logger.warning(
"Session %s: failed to refresh tool surface after ACP MCP registration",
state.session_id,
exc_info=True,
)
# ---- ACP lifecycle ------------------------------------------------------
async def initialize(
self,
protocol_version: int,
protocol_version: int | None = None,
client_capabilities: ClientCapabilities | None = None,
client_info: Implementation | None = None,
**kwargs: Any,
) -> InitializeResponse:
resolved_protocol_version = (
protocol_version if isinstance(protocol_version, int) else acp.PROTOCOL_VERSION
)
provider = detect_provider()
auth_methods = None
if provider:
@ -111,7 +185,11 @@ class HermesACPAgent(acp.Agent):
]
client_name = client_info.name if client_info else "unknown"
logger.info("Initialize from %s (protocol v%s)", client_name, protocol_version)
logger.info(
"Initialize from %s (protocol v%s)",
client_name,
resolved_protocol_version,
)
return InitializeResponse(
protocol_version=acp.PROTOCOL_VERSION,
@ -139,6 +217,7 @@ class HermesACPAgent(acp.Agent):
**kwargs: Any,
) -> NewSessionResponse:
state = self.session_manager.create_session(cwd=cwd)
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("New session %s (cwd=%s)", state.session_id, cwd)
return NewSessionResponse(session_id=state.session_id)
@ -153,6 +232,7 @@ class HermesACPAgent(acp.Agent):
if state is None:
logger.warning("load_session: session %s not found", session_id)
return None
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("Loaded session %s", session_id)
return LoadSessionResponse()
@ -167,6 +247,7 @@ class HermesACPAgent(acp.Agent):
if state is None:
logger.warning("resume_session: session %s not found, creating new", session_id)
state = self.session_manager.create_session(cwd=cwd)
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("Resumed session %s", state.session_id)
return ResumeSessionResponse()
@ -190,6 +271,8 @@ class HermesACPAgent(acp.Agent):
) -> ForkSessionResponse:
state = self.session_manager.fork_session(session_id, cwd=cwd)
new_id = state.session_id if state else ""
if state is not None:
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("Forked session %s -> %s", session_id, new_id)
return ForkSessionResponse(session_id=new_id)
@ -471,7 +554,7 @@ class HermesACPAgent(acp.Agent):
async def set_session_model(
self, model_id: str, session_id: str, **kwargs: Any
):
) -> SetSessionModelResponse | None:
"""Switch the model for a session (called by ACP protocol)."""
state = self.session_manager.get_session(session_id)
if state:
@ -489,4 +572,37 @@ class HermesACPAgent(acp.Agent):
)
self.session_manager.save_session(session_id)
logger.info("Session %s: model switched to %s", session_id, model_id)
return SetSessionModelResponse()
logger.warning("Session %s: model switch requested for missing session", session_id)
return None
async def set_session_mode(
self, mode_id: str, session_id: str, **kwargs: Any
) -> SetSessionModeResponse | None:
"""Persist the editor-requested mode so ACP clients do not fail on mode switches."""
state = self.session_manager.get_session(session_id)
if state is None:
logger.warning("Session %s: mode switch requested for missing session", session_id)
return None
setattr(state, "mode", mode_id)
self.session_manager.save_session(session_id)
logger.info("Session %s: mode switched to %s", session_id, mode_id)
return SetSessionModeResponse()
async def set_config_option(
self, config_id: str, session_id: str, value: str, **kwargs: Any
) -> SetSessionConfigOptionResponse | None:
"""Accept ACP config option updates even when Hermes has no typed ACP config surface yet."""
state = self.session_manager.get_session(session_id)
if state is None:
logger.warning("Session %s: config update requested for missing session", session_id)
return None
options = getattr(state, "config_options", None)
if not isinstance(options, dict):
options = {}
options[str(config_id)] = value
setattr(state, "config_options", options)
self.session_manager.save_session(session_id)
logger.info("Session %s: config option %s updated", session_id, config_id)
return SetSessionConfigOptionResponse(config_options=[])

View File

@ -426,7 +426,7 @@ class SessionManager:
config = load_config()
model_cfg = config.get("model")
default_model = "anthropic/claude-opus-4.6"
default_model = ""
config_provider = None
if isinstance(model_cfg, dict):
default_model = str(model_cfg.get("default") or default_model)

View File

@ -10,6 +10,7 @@ Auth supports:
- Claude Code credentials (~/.claude.json or ~/.claude/.credentials.json) → Bearer auth
"""
import copy
import json
import logging
import os
@ -162,6 +163,36 @@ def _is_oauth_token(key: str) -> bool:
return True
def _is_third_party_anthropic_endpoint(base_url: str | None) -> bool:
"""Return True for non-Anthropic endpoints using the Anthropic Messages API.
Third-party proxies (Azure AI Foundry, AWS Bedrock, self-hosted) authenticate
with their own API keys via x-api-key, not Anthropic OAuth tokens. OAuth
detection should be skipped for these endpoints.
"""
if not base_url:
return False # No base_url = direct Anthropic API
normalized = base_url.rstrip("/").lower()
if "anthropic.com" in normalized:
return False # Direct Anthropic API — OAuth applies
return True # Any other endpoint is a third-party proxy
def _requires_bearer_auth(base_url: str | None) -> bool:
"""Return True for Anthropic-compatible providers that require Bearer auth.
Some third-party /anthropic endpoints implement Anthropic's Messages API but
require Authorization: Bearer instead of Anthropic's native x-api-key header.
MiniMax's global and China Anthropic-compatible endpoints follow this pattern.
"""
if not base_url:
return False
normalized = base_url.rstrip("/").lower()
return normalized.startswith("https://api.minimax.io/anthropic") or normalized.startswith(
"https://api.minimaxi.com/anthropic"
)
def build_anthropic_client(api_key: str, base_url: str = None):
"""Create an Anthropic client, auto-detecting setup-tokens vs API keys.
@ -180,7 +211,25 @@ def build_anthropic_client(api_key: str, base_url: str = None):
if base_url:
kwargs["base_url"] = base_url
if _is_oauth_token(api_key):
if _requires_bearer_auth(base_url):
# Some Anthropic-compatible providers (e.g. MiniMax) expect the API key in
# Authorization: Bearer even for regular API keys. Route those endpoints
# through auth_token so the SDK sends Bearer auth instead of x-api-key.
# Check this before OAuth token shape detection because MiniMax secrets do
# not use Anthropic's sk-ant-api prefix and would otherwise be misread as
# Anthropic OAuth/setup tokens.
kwargs["auth_token"] = api_key
if _COMMON_BETAS:
kwargs["default_headers"] = {"anthropic-beta": ",".join(_COMMON_BETAS)}
elif _is_third_party_anthropic_endpoint(base_url):
# Third-party proxies (Azure AI Foundry, AWS Bedrock, etc.) use their
# own API keys with x-api-key auth. Skip OAuth detection — their keys
# don't follow Anthropic's sk-ant-* prefix convention and would be
# misclassified as OAuth tokens.
kwargs["api_key"] = api_key
if _COMMON_BETAS:
kwargs["default_headers"] = {"anthropic-beta": ",".join(_COMMON_BETAS)}
elif _is_oauth_token(api_key):
# OAuth access token / setup-token → Bearer auth + Claude Code identity.
# Anthropic routes OAuth requests based on user-agent and headers;
# without Claude Code's fingerprint, requests get intermittent 500s.
@ -259,71 +308,105 @@ def is_claude_code_token_valid(creds: Dict[str, Any]) -> bool:
return now_ms < (expires_at - 60_000)
def _refresh_oauth_token(creds: Dict[str, Any]) -> Optional[str]:
"""Attempt to refresh an expired Claude Code OAuth token.
Uses the same token endpoint and client_id as Claude Code / OpenCode.
Only works for credentials that have a refresh token (from claude /login
or claude setup-token with OAuth flow).
Tries the new platform.claude.com endpoint first (Claude Code >=2.1.81),
then falls back to console.anthropic.com for older tokens.
Returns the new access token, or None if refresh fails.
"""
def refresh_anthropic_oauth_pure(refresh_token: str, *, use_json: bool = False) -> Dict[str, Any]:
"""Refresh an Anthropic OAuth token without mutating local credential files."""
import time
import urllib.parse
import urllib.request
if not refresh_token:
raise ValueError("refresh_token is required")
client_id = "9d1c250a-e61b-44d9-88ed-5944d1962f5e"
if use_json:
data = json.dumps({
"grant_type": "refresh_token",
"refresh_token": refresh_token,
"client_id": client_id,
}).encode()
content_type = "application/json"
else:
data = urllib.parse.urlencode({
"grant_type": "refresh_token",
"refresh_token": refresh_token,
"client_id": client_id,
}).encode()
content_type = "application/x-www-form-urlencoded"
token_endpoints = [
"https://platform.claude.com/v1/oauth/token",
"https://console.anthropic.com/v1/oauth/token",
]
last_error = None
for endpoint in token_endpoints:
req = urllib.request.Request(
endpoint,
data=data,
headers={
"Content-Type": content_type,
"User-Agent": f"claude-cli/{_get_claude_code_version()} (external, cli)",
},
method="POST",
)
try:
with urllib.request.urlopen(req, timeout=10) as resp:
result = json.loads(resp.read().decode())
except Exception as exc:
last_error = exc
logger.debug("Anthropic token refresh failed at %s: %s", endpoint, exc)
continue
access_token = result.get("access_token", "")
if not access_token:
raise ValueError("Anthropic refresh response was missing access_token")
next_refresh = result.get("refresh_token", refresh_token)
expires_in = result.get("expires_in", 3600)
return {
"access_token": access_token,
"refresh_token": next_refresh,
"expires_at_ms": int(time.time() * 1000) + (expires_in * 1000),
}
if last_error is not None:
raise last_error
raise ValueError("Anthropic token refresh failed")
def _refresh_oauth_token(creds: Dict[str, Any]) -> Optional[str]:
"""Attempt to refresh an expired Claude Code OAuth token."""
refresh_token = creds.get("refreshToken", "")
if not refresh_token:
logger.debug("No refresh token available — cannot refresh")
return None
# Client ID used by Claude Code's OAuth flow
CLIENT_ID = "9d1c250a-e61b-44d9-88ed-5944d1962f5e"
# Anthropic migrated OAuth from console.anthropic.com to platform.claude.com
# (Claude Code v2.1.81+). Try new endpoint first, fall back to old.
token_endpoints = [
"https://platform.claude.com/v1/oauth/token",
"https://console.anthropic.com/v1/oauth/token",
]
payload = json.dumps({
"grant_type": "refresh_token",
"refresh_token": refresh_token,
"client_id": CLIENT_ID,
}).encode()
headers = {
"Content-Type": "application/json",
"User-Agent": f"claude-cli/{_get_claude_code_version()} (external, cli)",
}
for endpoint in token_endpoints:
req = urllib.request.Request(
endpoint, data=payload, headers=headers, method="POST",
try:
refreshed = refresh_anthropic_oauth_pure(refresh_token, use_json=False)
_write_claude_code_credentials(
refreshed["access_token"],
refreshed["refresh_token"],
refreshed["expires_at_ms"],
)
try:
with urllib.request.urlopen(req, timeout=10) as resp:
result = json.loads(resp.read().decode())
new_access = result.get("access_token", "")
new_refresh = result.get("refresh_token", refresh_token)
expires_in = result.get("expires_in", 3600)
if new_access:
new_expires_ms = int(time.time() * 1000) + (expires_in * 1000)
_write_claude_code_credentials(new_access, new_refresh, new_expires_ms)
logger.debug("Refreshed Claude Code OAuth token via %s", endpoint)
return new_access
except Exception as e:
logger.debug("Token refresh failed at %s: %s", endpoint, e)
return None
logger.debug("Successfully refreshed Claude Code OAuth token")
return refreshed["access_token"]
except Exception as e:
logger.debug("Failed to refresh Claude Code token: %s", e)
return None
def _write_claude_code_credentials(access_token: str, refresh_token: str, expires_at_ms: int) -> None:
"""Write refreshed credentials back to ~/.claude/.credentials.json."""
def _write_claude_code_credentials(
access_token: str,
refresh_token: str,
expires_at_ms: int,
*,
scopes: Optional[list] = None,
) -> None:
"""Write refreshed credentials back to ~/.claude/.credentials.json.
The optional *scopes* list (e.g. ``["user:inference", "user:profile", ...]``)
is persisted so that Claude Code's own auth check recognises the credential
as valid. Claude Code >=2.1.81 gates on the presence of ``"user:inference"``
in the stored scopes before it will use the token.
"""
cred_path = Path.home() / ".claude" / ".credentials.json"
try:
# Read existing file to preserve other fields
@ -331,11 +414,19 @@ def _write_claude_code_credentials(access_token: str, refresh_token: str, expire
if cred_path.exists():
existing = json.loads(cred_path.read_text(encoding="utf-8"))
existing["claudeAiOauth"] = {
oauth_data: Dict[str, Any] = {
"accessToken": access_token,
"refreshToken": refresh_token,
"expiresAt": expires_at_ms,
}
if scopes is not None:
oauth_data["scopes"] = scopes
elif "claudeAiOauth" in existing and "scopes" in existing["claudeAiOauth"]:
# Preserve previously-stored scopes when the refresh response
# does not include a scope field.
oauth_data["scopes"] = existing["claudeAiOauth"]["scopes"]
existing["claudeAiOauth"] = oauth_data
cred_path.parent.mkdir(parents=True, exist_ok=True)
cred_path.write_text(json.dumps(existing, indent=2), encoding="utf-8")
@ -495,10 +586,208 @@ def run_oauth_setup_token() -> Optional[str]:
return None
# ── Hermes-native PKCE OAuth flow ────────────────────────────────────────
# Mirrors the flow used by Claude Code, pi-ai, and OpenCode.
# Stores credentials in ~/.hermes/.anthropic_oauth.json (our own file).
_OAUTH_CLIENT_ID = "9d1c250a-e61b-44d9-88ed-5944d1962f5e"
_OAUTH_TOKEN_URL = "https://console.anthropic.com/v1/oauth/token"
_OAUTH_REDIRECT_URI = "https://console.anthropic.com/oauth/code/callback"
_OAUTH_SCOPES = "org:create_api_key user:profile user:inference"
_HERMES_OAUTH_FILE = get_hermes_home() / ".anthropic_oauth.json"
def _generate_pkce() -> tuple:
"""Generate PKCE code_verifier and code_challenge (S256)."""
import base64
import hashlib
import secrets
verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
challenge = base64.urlsafe_b64encode(
hashlib.sha256(verifier.encode()).digest()
).rstrip(b"=").decode()
return verifier, challenge
def run_hermes_oauth_login_pure() -> Optional[Dict[str, Any]]:
"""Run Hermes-native OAuth PKCE flow and return credential state."""
import time
import webbrowser
verifier, challenge = _generate_pkce()
params = {
"code": "true",
"client_id": _OAUTH_CLIENT_ID,
"response_type": "code",
"redirect_uri": _OAUTH_REDIRECT_URI,
"scope": _OAUTH_SCOPES,
"code_challenge": challenge,
"code_challenge_method": "S256",
"state": verifier,
}
from urllib.parse import urlencode
auth_url = f"https://claude.ai/oauth/authorize?{urlencode(params)}"
print()
print("Authorize Hermes with your Claude Pro/Max subscription.")
print()
print("╭─ Claude Pro/Max Authorization ────────────────────╮")
print("│ │")
print("│ Open this link in your browser: │")
print("╰───────────────────────────────────────────────────╯")
print()
print(f" {auth_url}")
print()
try:
webbrowser.open(auth_url)
print(" (Browser opened automatically)")
except Exception:
pass
print()
print("After authorizing, you'll see a code. Paste it below.")
print()
try:
auth_code = input("Authorization code: ").strip()
except (KeyboardInterrupt, EOFError):
return None
if not auth_code:
print("No code entered.")
return None
splits = auth_code.split("#")
code = splits[0]
state = splits[1] if len(splits) > 1 else ""
try:
import urllib.request
exchange_data = json.dumps({
"grant_type": "authorization_code",
"client_id": _OAUTH_CLIENT_ID,
"code": code,
"state": state,
"redirect_uri": _OAUTH_REDIRECT_URI,
"code_verifier": verifier,
}).encode()
req = urllib.request.Request(
_OAUTH_TOKEN_URL,
data=exchange_data,
headers={
"Content-Type": "application/json",
"User-Agent": f"claude-cli/{_get_claude_code_version()} (external, cli)",
},
method="POST",
)
with urllib.request.urlopen(req, timeout=15) as resp:
result = json.loads(resp.read().decode())
except Exception as e:
print(f"Token exchange failed: {e}")
return None
access_token = result.get("access_token", "")
refresh_token = result.get("refresh_token", "")
expires_in = result.get("expires_in", 3600)
if not access_token:
print("No access token in response.")
return None
expires_at_ms = int(time.time() * 1000) + (expires_in * 1000)
return {
"access_token": access_token,
"refresh_token": refresh_token,
"expires_at_ms": expires_at_ms,
}
def run_hermes_oauth_login() -> Optional[str]:
"""Run Hermes-native OAuth PKCE flow for Claude Pro/Max subscription.
Opens a browser to claude.ai for authorization, prompts for the code,
exchanges it for tokens, and stores them in ~/.hermes/.anthropic_oauth.json.
Returns the access token on success, None on failure.
"""
result = run_hermes_oauth_login_pure()
if not result:
return None
access_token = result["access_token"]
refresh_token = result["refresh_token"]
expires_at_ms = result["expires_at_ms"]
_save_hermes_oauth_credentials(access_token, refresh_token, expires_at_ms)
_write_claude_code_credentials(access_token, refresh_token, expires_at_ms)
print("Authentication successful!")
return access_token
def _save_hermes_oauth_credentials(access_token: str, refresh_token: str, expires_at_ms: int) -> None:
"""Save OAuth credentials to ~/.hermes/.anthropic_oauth.json."""
data = {
"accessToken": access_token,
"refreshToken": refresh_token,
"expiresAt": expires_at_ms,
}
try:
_HERMES_OAUTH_FILE.parent.mkdir(parents=True, exist_ok=True)
_HERMES_OAUTH_FILE.write_text(json.dumps(data, indent=2), encoding="utf-8")
_HERMES_OAUTH_FILE.chmod(0o600)
except (OSError, IOError) as e:
logger.debug("Failed to save Hermes OAuth credentials: %s", e)
def read_hermes_oauth_credentials() -> Optional[Dict[str, Any]]:
"""Read Hermes-managed OAuth credentials from ~/.hermes/.anthropic_oauth.json."""
if _HERMES_OAUTH_FILE.exists():
try:
data = json.loads(_HERMES_OAUTH_FILE.read_text(encoding="utf-8"))
if data.get("accessToken"):
return data
except (json.JSONDecodeError, OSError, IOError) as e:
logger.debug("Failed to read Hermes OAuth credentials: %s", e)
return None
def refresh_hermes_oauth_token() -> Optional[str]:
"""Refresh the Hermes-managed OAuth token using the stored refresh token.
Returns the new access token, or None if refresh fails.
"""
creds = read_hermes_oauth_credentials()
if not creds or not creds.get("refreshToken"):
return None
try:
refreshed = refresh_anthropic_oauth_pure(
creds["refreshToken"],
use_json=True,
)
_save_hermes_oauth_credentials(
refreshed["access_token"],
refreshed["refresh_token"],
refreshed["expires_at_ms"],
)
_write_claude_code_credentials(
refreshed["access_token"],
refreshed["refresh_token"],
refreshed["expires_at_ms"],
)
logger.debug("Successfully refreshed Hermes OAuth token")
return refreshed["access_token"]
except Exception as e:
logger.debug("Failed to refresh Hermes OAuth token: %s", e)
return None
# ---------------------------------------------------------------------------
@ -661,6 +950,69 @@ def _convert_content_part_to_anthropic(part: Any) -> Optional[Dict[str, Any]]:
return block
def _to_plain_data(value: Any, *, _depth: int = 0, _path: Optional[set] = None) -> Any:
"""Recursively convert SDK objects to plain Python data structures.
Guards against circular references (``_path`` tracks ``id()`` of objects
on the *current* recursion path) and runaway depth (capped at 20 levels).
Uses path-based tracking so shared (but non-cyclic) objects referenced by
multiple siblings are converted correctly rather than being stringified.
"""
_MAX_DEPTH = 20
if _depth > _MAX_DEPTH:
return str(value)
if _path is None:
_path = set()
obj_id = id(value)
if obj_id in _path:
return str(value)
if hasattr(value, "model_dump"):
_path.add(obj_id)
result = _to_plain_data(value.model_dump(), _depth=_depth + 1, _path=_path)
_path.discard(obj_id)
return result
if isinstance(value, dict):
_path.add(obj_id)
result = {k: _to_plain_data(v, _depth=_depth + 1, _path=_path) for k, v in value.items()}
_path.discard(obj_id)
return result
if isinstance(value, (list, tuple)):
_path.add(obj_id)
result = [_to_plain_data(v, _depth=_depth + 1, _path=_path) for v in value]
_path.discard(obj_id)
return result
if hasattr(value, "__dict__"):
_path.add(obj_id)
result = {
k: _to_plain_data(v, _depth=_depth + 1, _path=_path)
for k, v in vars(value).items()
if not k.startswith("_")
}
_path.discard(obj_id)
return result
return value
def _extract_preserved_thinking_blocks(message: Dict[str, Any]) -> List[Dict[str, Any]]:
"""Return Anthropic thinking blocks previously preserved on the message."""
raw_details = message.get("reasoning_details")
if not isinstance(raw_details, list):
return []
preserved: List[Dict[str, Any]] = []
for detail in raw_details:
if not isinstance(detail, dict):
continue
block_type = str(detail.get("type", "") or "").strip().lower()
if block_type not in {"thinking", "redacted_thinking"}:
continue
preserved.append(copy.deepcopy(detail))
return preserved
def _convert_content_to_anthropic(content: Any) -> Any:
"""Convert OpenAI-style multimodal content arrays to Anthropic blocks."""
if not isinstance(content, list):
@ -707,7 +1059,7 @@ def convert_messages_to_anthropic(
continue
if role == "assistant":
blocks = []
blocks = _extract_preserved_thinking_blocks(m)
if content:
if isinstance(content, list):
converted_content = _convert_content_to_anthropic(content)
@ -991,6 +1343,7 @@ def normalize_anthropic_response(
"""
text_parts = []
reasoning_parts = []
reasoning_details = []
tool_calls = []
for block in response.content:
@ -998,6 +1351,9 @@ def normalize_anthropic_response(
text_parts.append(block.text)
elif block.type == "thinking":
reasoning_parts.append(block.thinking)
block_dict = _to_plain_data(block)
if isinstance(block_dict, dict):
reasoning_details.append(block_dict)
elif block.type == "tool_use":
name = block.name
if strip_tool_prefix and name.startswith(_MCP_TOOL_PREFIX):
@ -1028,7 +1384,7 @@ def normalize_anthropic_response(
tool_calls=tool_calls or None,
reasoning="\n\n".join(reasoning_parts) if reasoning_parts else None,
reasoning_content=None,
reasoning_details=None,
reasoning_details=reasoning_details or None,
),
finish_reason,
)
)

View File

@ -7,7 +7,7 @@ the best available backend without duplicating fallback logic.
Resolution order for text tasks (auto mode):
1. OpenRouter (OPENROUTER_API_KEY)
2. Nous Portal (~/.hermes/auth.json active provider)
3. Custom endpoint (OPENAI_BASE_URL + OPENAI_API_KEY)
3. Custom endpoint (config.yaml model.base_url + OPENAI_API_KEY)
4. Codex OAuth (Responses API via chatgpt.com with gpt-5.3-codex,
wrapped to look like a chat.completions client)
5. Native Anthropic
@ -47,6 +47,7 @@ from typing import Any, Dict, List, Optional, Tuple
from openai import OpenAI
from agent.credential_pool import load_pool
from hermes_cli.config import get_hermes_home
from hermes_constants import OPENROUTER_BASE_URL
@ -96,6 +97,45 @@ _CODEX_AUX_MODEL = "gpt-5.2-codex"
_CODEX_AUX_BASE_URL = "https://chatgpt.com/backend-api/codex"
def _select_pool_entry(provider: str) -> Tuple[bool, Optional[Any]]:
"""Return (pool_exists_for_provider, selected_entry)."""
try:
pool = load_pool(provider)
except Exception as exc:
logger.debug("Auxiliary client: could not load pool for %s: %s", provider, exc)
return False, None
if not pool or not pool.has_credentials():
return False, None
try:
return True, pool.select()
except Exception as exc:
logger.debug("Auxiliary client: could not select pool entry for %s: %s", provider, exc)
return True, None
def _pool_runtime_api_key(entry: Any) -> str:
if entry is None:
return ""
# Use the PooledCredential.runtime_api_key property which handles
# provider-specific fallback (e.g. agent_key for nous).
key = getattr(entry, "runtime_api_key", None) or getattr(entry, "access_token", "")
return str(key or "").strip()
def _pool_runtime_base_url(entry: Any, fallback: str = "") -> str:
if entry is None:
return str(fallback or "").strip().rstrip("/")
# runtime_base_url handles provider-specific logic (e.g. nous prefers inference_base_url).
# Fall back through inference_base_url and base_url for non-PooledCredential entries.
url = (
getattr(entry, "runtime_base_url", None)
or getattr(entry, "inference_base_url", None)
or getattr(entry, "base_url", None)
or fallback
)
return str(url or "").strip().rstrip("/")
# ── Codex Responses → chat.completions adapter ─────────────────────────────
# All auxiliary consumers call client.chat.completions.create(**kwargs) and
# read response.choices[0].message.content. This adapter translates those
@ -439,6 +479,22 @@ def _read_nous_auth() -> Optional[dict]:
Returns the provider state dict if Nous is active with tokens,
otherwise None.
"""
pool_present, entry = _select_pool_entry("nous")
if pool_present:
if entry is None:
return None
return {
"access_token": getattr(entry, "access_token", ""),
"refresh_token": getattr(entry, "refresh_token", None),
"agent_key": getattr(entry, "agent_key", None),
"inference_base_url": _pool_runtime_base_url(entry, _NOUS_DEFAULT_BASE_URL),
"portal_base_url": getattr(entry, "portal_base_url", None),
"client_id": getattr(entry, "client_id", None),
"scope": getattr(entry, "scope", None),
"token_type": getattr(entry, "token_type", "Bearer"),
"source": "pool",
}
try:
if not _AUTH_JSON_PATH.is_file():
return None
@ -467,6 +523,11 @@ def _nous_base_url() -> str:
def _read_codex_access_token() -> Optional[str]:
"""Read a valid, non-expired Codex OAuth access token from Hermes auth store."""
pool_present, entry = _select_pool_entry("openai-codex")
if pool_present:
token = _pool_runtime_api_key(entry)
return token or None
try:
from hermes_cli.auth import _read_codex_tokens
data = _read_codex_tokens()
@ -513,6 +574,24 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
if provider_id == "anthropic":
return _try_anthropic()
pool_present, entry = _select_pool_entry(provider_id)
if pool_present:
api_key = _pool_runtime_api_key(entry)
if not api_key:
continue
base_url = _pool_runtime_base_url(entry, pconfig.inference_base_url) or pconfig.inference_base_url
model = _API_KEY_PROVIDER_AUX_MODELS.get(provider_id, "default")
logger.debug("Auxiliary text client: %s (%s) via pool", pconfig.name, model)
extra = {}
if "api.kimi.com" in base_url.lower():
extra["default_headers"] = {"User-Agent": "KimiCLI/1.0"}
elif "api.githubcopilot.com" in base_url.lower():
from hermes_cli.models import copilot_default_headers
extra["default_headers"] = copilot_default_headers()
return OpenAI(api_key=api_key, base_url=base_url, **extra), model
creds = resolve_api_key_provider_credentials(provider_id)
api_key = str(creds.get("api_key", "")).strip()
if not api_key:
@ -562,6 +641,16 @@ def _get_auxiliary_env_override(task: str, suffix: str) -> Optional[str]:
def _try_openrouter() -> Tuple[Optional[OpenAI], Optional[str]]:
pool_present, entry = _select_pool_entry("openrouter")
if pool_present:
or_key = _pool_runtime_api_key(entry)
if not or_key:
return None, None
base_url = _pool_runtime_base_url(entry, OPENROUTER_BASE_URL) or OPENROUTER_BASE_URL
logger.debug("Auxiliary client: OpenRouter via pool")
return OpenAI(api_key=or_key, base_url=base_url,
default_headers=_OR_HEADERS), _OPENROUTER_MODEL
or_key = os.getenv("OPENROUTER_API_KEY")
if not or_key:
return None, None
@ -577,22 +666,22 @@ def _try_nous() -> Tuple[Optional[OpenAI], Optional[str]]:
global auxiliary_is_nous
auxiliary_is_nous = True
logger.debug("Auxiliary client: Nous Portal")
model = "gemini-3-flash" if nous.get("source") == "pool" else _NOUS_MODEL
return (
OpenAI(api_key=_nous_api_key(nous), base_url=_nous_base_url()),
_NOUS_MODEL,
OpenAI(
api_key=_nous_api_key(nous),
base_url=str(nous.get("inference_base_url") or _nous_base_url()).rstrip("/"),
),
model,
)
def _read_main_model() -> str:
"""Read the user's configured main model from config/env.
"""Read the user's configured main model from config.yaml.
Falls back through HERMES_MODEL → LLM_MODEL → config.yaml model.default
so the auxiliary client can use the same model as the main agent when no
dedicated auxiliary model is available.
config.yaml model.default is the single source of truth for the active
model. Environment variables are no longer consulted.
"""
from_env = os.getenv("OPENAI_MODEL") or os.getenv("HERMES_MODEL") or os.getenv("LLM_MODEL")
if from_env:
return from_env.strip()
try:
from hermes_cli.config import load_config
cfg = load_config()
@ -627,8 +716,6 @@ def _resolve_custom_runtime() -> Tuple[Optional[str], Optional[str]]:
custom_key = runtime.get("api_key")
if not isinstance(custom_base, str) or not custom_base.strip():
return None, None
if not isinstance(custom_key, str) or not custom_key.strip():
return None, None
custom_base = custom_base.strip().rstrip("/")
if "openrouter.ai" in custom_base.lower():
@ -636,6 +723,13 @@ def _resolve_custom_runtime() -> Tuple[Optional[str], Optional[str]]:
# configured. Treat that as "no custom endpoint" for auxiliary routing.
return None, None
# Local servers (Ollama, llama.cpp, vLLM, LM Studio) don't require auth.
# Use a placeholder key — the OpenAI SDK requires a non-empty string but
# local servers ignore the Authorization header. Same fix as cli.py
# _ensure_runtime_credentials() (PR #2556).
if not isinstance(custom_key, str) or not custom_key.strip():
custom_key = "no-key-required"
return custom_base, custom_key.strip()
@ -654,11 +748,19 @@ def _try_custom_endpoint() -> Tuple[Optional[OpenAI], Optional[str]]:
def _try_codex() -> Tuple[Optional[Any], Optional[str]]:
codex_token = _read_codex_access_token()
if not codex_token:
return None, None
pool_present, entry = _select_pool_entry("openai-codex")
if pool_present:
codex_token = _pool_runtime_api_key(entry)
if not codex_token:
return None, None
base_url = _pool_runtime_base_url(entry, _CODEX_AUX_BASE_URL) or _CODEX_AUX_BASE_URL
else:
codex_token = _read_codex_access_token()
if not codex_token:
return None, None
base_url = _CODEX_AUX_BASE_URL
logger.debug("Auxiliary client: Codex OAuth (%s via Responses API)", _CODEX_AUX_MODEL)
real_client = OpenAI(api_key=codex_token, base_url=_CODEX_AUX_BASE_URL)
real_client = OpenAI(api_key=codex_token, base_url=base_url)
return CodexAuxiliaryClient(real_client, _CODEX_AUX_MODEL), _CODEX_AUX_MODEL
@ -668,14 +770,21 @@ def _try_anthropic() -> Tuple[Optional[Any], Optional[str]]:
except ImportError:
return None, None
token = resolve_anthropic_token()
pool_present, entry = _select_pool_entry("anthropic")
if pool_present:
if entry is None:
return None, None
token = _pool_runtime_api_key(entry)
else:
entry = None
token = resolve_anthropic_token()
if not token:
return None, None
# Allow base URL override from config.yaml model.base_url, but only
# when the configured provider is anthropic — otherwise a non-Anthropic
# base_url (e.g. Codex endpoint) would leak into Anthropic requests.
base_url = _ANTHROPIC_DEFAULT_BASE_URL
base_url = _pool_runtime_base_url(entry, _ANTHROPIC_DEFAULT_BASE_URL) if pool_present else _ANTHROPIC_DEFAULT_BASE_URL
try:
from hermes_cli.config import load_config
cfg = load_config()
@ -737,16 +846,37 @@ def _resolve_forced_provider(forced: str) -> Tuple[Optional[OpenAI], Optional[st
return None, None
_AUTO_PROVIDER_LABELS = {
"_try_openrouter": "openrouter",
"_try_nous": "nous",
"_try_custom_endpoint": "local/custom",
"_try_codex": "openai-codex",
"_resolve_api_key_provider": "api-key",
}
def _resolve_auto() -> Tuple[Optional[OpenAI], Optional[str]]:
"""Full auto-detection chain: OpenRouter → Nous → custom → Codex → API-key → None."""
global auxiliary_is_nous
auxiliary_is_nous = False # Reset — _try_nous() will set True if it wins
tried = []
for try_fn in (_try_openrouter, _try_nous, _try_custom_endpoint,
_try_codex, _resolve_api_key_provider):
fn_name = getattr(try_fn, "__name__", "unknown")
label = _AUTO_PROVIDER_LABELS.get(fn_name, fn_name)
client, model = try_fn()
if client is not None:
if tried:
logger.info("Auxiliary auto-detect: using %s (%s) — skipped: %s",
label, model or "default", ", ".join(tried))
else:
logger.info("Auxiliary auto-detect: using %s (%s)", label, model or "default")
return client, model
logger.debug("Auxiliary client: none available")
tried.append(label)
logger.warning("Auxiliary auto-detect: no provider available (tried: %s). "
"Compression, summarization, and memory flush will not work. "
"Set OPENROUTER_API_KEY or configure a local model in config.yaml.",
", ".join(tried))
return None, None
@ -897,11 +1027,12 @@ def resolve_provider_client(
custom_key = (
(explicit_api_key or "").strip()
or os.getenv("OPENAI_API_KEY", "").strip()
or "no-key-required" # local servers don't need auth
)
if not custom_base or not custom_key:
if not custom_base:
logger.warning(
"resolve_provider_client: explicit custom endpoint requested "
"but no API key was found (set explicit_api_key or OPENAI_API_KEY)"
"but base_url is empty"
)
return None, None
final_model = model or _read_main_model() or "gpt-4o-mini"
@ -947,9 +1078,9 @@ def resolve_provider_client(
tried_sources = list(pconfig.api_key_env_vars)
if provider == "copilot":
tried_sources.append("gh auth token")
logger.warning("resolve_provider_client: provider %s has no API "
"key configured (tried: %s)",
provider, ", ".join(tried_sources))
logger.debug("resolve_provider_client: provider %s has no API "
"key configured (tried: %s)",
provider, ", ".join(tried_sources))
return None, None
base_url = str(creds.get("base_url", "")).strip().rstrip("/") or pconfig.inference_base_url
@ -1458,6 +1589,29 @@ def _resolve_task_provider_model(
return "auto", resolved_model, None, None
_DEFAULT_AUX_TIMEOUT = 30.0
def _get_task_timeout(task: str, default: float = _DEFAULT_AUX_TIMEOUT) -> float:
"""Read timeout from auxiliary.{task}.timeout in config, falling back to *default*."""
if not task:
return default
try:
from hermes_cli.config import load_config
config = load_config()
except ImportError:
return default
aux = config.get("auxiliary", {}) if isinstance(config, dict) else {}
task_config = aux.get(task, {}) if isinstance(aux, dict) else {}
raw = task_config.get("timeout")
if raw is not None:
try:
return float(raw)
except (ValueError, TypeError):
pass
return default
def _build_call_kwargs(
provider: str,
model: str,
@ -1515,7 +1669,7 @@ def call_llm(
temperature: float = None,
max_tokens: int = None,
tools: list = None,
timeout: float = 30.0,
timeout: float = None,
extra_body: dict = None,
) -> Any:
"""Centralized synchronous LLM call.
@ -1533,7 +1687,7 @@ def call_llm(
temperature: Sampling temperature (None = provider default).
max_tokens: Max output tokens (handles max_tokens vs max_completion_tokens).
tools: Tool definitions (for function calling).
timeout: Request timeout in seconds.
timeout: Request timeout in seconds (None = read from auxiliary.{task}.timeout config).
extra_body: Additional request body fields.
Returns:
@ -1589,8 +1743,8 @@ def call_llm(
)
# For auto/custom, fall back to OpenRouter
if not resolved_base_url:
logger.warning("Provider %s unavailable, falling back to openrouter",
resolved_provider)
logger.info("Auxiliary %s: provider %s unavailable, falling back to openrouter",
task or "call", resolved_provider)
client, final_model = _get_cached_client(
"openrouter", resolved_model or _OPENROUTER_MODEL)
if client is None:
@ -1598,10 +1752,19 @@ def call_llm(
f"No LLM provider configured for task={task} provider={resolved_provider}. "
f"Run: hermes setup")
effective_timeout = timeout if timeout is not None else _get_task_timeout(task)
# Log what we're about to do — makes auxiliary operations visible
_base_info = str(getattr(client, "base_url", resolved_base_url) or "")
if task:
logger.info("Auxiliary %s: using %s (%s)%s",
task, resolved_provider or "auto", final_model or "default",
f" at {_base_info}" if _base_info and "openrouter" not in _base_info else "")
kwargs = _build_call_kwargs(
resolved_provider, final_model, messages,
temperature=temperature, max_tokens=max_tokens,
tools=tools, timeout=timeout, extra_body=extra_body,
tools=tools, timeout=effective_timeout, extra_body=extra_body,
base_url=resolved_base_url)
# Handle max_tokens vs max_completion_tokens retry
@ -1683,7 +1846,7 @@ async def async_call_llm(
temperature: float = None,
max_tokens: int = None,
tools: list = None,
timeout: float = 30.0,
timeout: float = None,
extra_body: dict = None,
) -> Any:
"""Centralized asynchronous LLM call.
@ -1744,10 +1907,12 @@ async def async_call_llm(
f"No LLM provider configured for task={task} provider={resolved_provider}. "
f"Run: hermes setup")
effective_timeout = timeout if timeout is not None else _get_task_timeout(task)
kwargs = _build_call_kwargs(
resolved_provider, final_model, messages,
temperature=temperature, max_tokens=max_tokens,
tools=tools, timeout=timeout, extra_body=extra_body,
tools=tools, timeout=effective_timeout, extra_body=extra_body,
base_url=resolved_base_url)
try:

View File

@ -0,0 +1,113 @@
"""BuiltinMemoryProvider — wraps MEMORY.md / USER.md as a MemoryProvider.
Always registered as the first provider. Cannot be disabled or removed.
This is the existing Hermes memory system exposed through the provider
interface for compatibility with the MemoryManager.
The actual storage logic lives in tools/memory_tool.py (MemoryStore).
This provider is a thin adapter that delegates to MemoryStore and
exposes the memory tool schema.
"""
from __future__ import annotations
import json
import logging
from typing import Any, Dict, List, Optional
from agent.memory_provider import MemoryProvider
logger = logging.getLogger(__name__)
class BuiltinMemoryProvider(MemoryProvider):
"""Built-in file-backed memory (MEMORY.md + USER.md).
Always active, never disabled by other providers. The `memory` tool
is handled by run_agent.py's agent-level tool interception (not through
the normal registry), so get_tool_schemas() returns an empty list —
the memory tool is already wired separately.
"""
def __init__(
self,
memory_store=None,
memory_enabled: bool = False,
user_profile_enabled: bool = False,
):
self._store = memory_store
self._memory_enabled = memory_enabled
self._user_profile_enabled = user_profile_enabled
@property
def name(self) -> str:
return "builtin"
def is_available(self) -> bool:
"""Built-in memory is always available."""
return True
def initialize(self, session_id: str, **kwargs) -> None:
"""Load memory from disk if not already loaded."""
if self._store is not None:
self._store.load_from_disk()
def system_prompt_block(self) -> str:
"""Return MEMORY.md and USER.md content for the system prompt.
Uses the frozen snapshot captured at load time. This ensures the
system prompt stays stable throughout a session (preserving the
prompt cache), even though the live entries may change via tool calls.
"""
if not self._store:
return ""
parts = []
if self._memory_enabled:
mem_block = self._store.format_for_system_prompt("memory")
if mem_block:
parts.append(mem_block)
if self._user_profile_enabled:
user_block = self._store.format_for_system_prompt("user")
if user_block:
parts.append(user_block)
return "\n\n".join(parts)
def prefetch(self, query: str, *, session_id: str = "") -> str:
"""Built-in memory doesn't do query-based recall — it's injected via system_prompt_block."""
return ""
def sync_turn(self, user_content: str, assistant_content: str, *, session_id: str = "") -> None:
"""Built-in memory doesn't auto-sync turns — writes happen via the memory tool."""
def get_tool_schemas(self) -> List[Dict[str, Any]]:
"""Return empty list.
The `memory` tool is an agent-level intercepted tool, handled
specially in run_agent.py before normal tool dispatch. It's not
part of the standard tool registry. We don't duplicate it here.
"""
return []
def handle_tool_call(self, tool_name: str, args: Dict[str, Any], **kwargs) -> str:
"""Not used — the memory tool is intercepted in run_agent.py."""
return json.dumps({"error": "Built-in memory tool is handled by the agent loop"})
def shutdown(self) -> None:
"""No cleanup needed — files are saved on every write."""
# -- Property access for backward compatibility --------------------------
@property
def store(self):
"""Access the underlying MemoryStore for legacy code paths."""
return self._store
@property
def memory_enabled(self) -> bool:
return self._memory_enabled
@property
def user_profile_enabled(self) -> bool:
return self._user_profile_enabled

View File

@ -141,7 +141,7 @@ class ContextCompressor:
"last_prompt_tokens": self.last_prompt_tokens,
"threshold_tokens": self.threshold_tokens,
"context_length": self.context_length,
"usage_percent": (self.last_prompt_tokens / self.context_length * 100) if self.context_length else 0,
"usage_percent": min(100, (self.last_prompt_tokens / self.context_length * 100)) if self.context_length else 0,
"compression_count": self.compression_count,
}
@ -347,7 +347,7 @@ Write only the summary body. Do not include any preamble or prefix."""
"messages": [{"role": "user", "content": prompt}],
"temperature": 0.3,
"max_tokens": summary_budget * 2,
"timeout": 45.0,
# timeout resolved from auxiliary.compression.timeout config by call_llm
}
if self.summary_model:
call_kwargs["model"] = self.summary_model

View File

@ -17,7 +17,7 @@ REFERENCE_PATTERN = re.compile(
r"(?<![\w/])@(?:(?P<simple>diff|staged)\b|(?P<kind>file|folder|git|url):(?P<value>\S+))"
)
TRAILING_PUNCTUATION = ",.;!?"
_SENSITIVE_HOME_DIRS = (".ssh", ".aws", ".gnupg", ".kube")
_SENSITIVE_HOME_DIRS = (".ssh", ".aws", ".gnupg", ".kube", ".docker", ".azure", ".config/gh")
_SENSITIVE_HERMES_DIRS = (Path("skills") / ".hub",)
_SENSITIVE_HOME_FILES = (
Path(".ssh") / "authorized_keys",

848
agent/credential_pool.py Normal file
View File

@ -0,0 +1,848 @@
"""Persistent multi-credential pool for same-provider failover."""
from __future__ import annotations
import logging
import random
import threading
import time
import uuid
import os
from dataclasses import dataclass, fields, replace
from typing import Any, Dict, List, Optional, Set, Tuple
from hermes_constants import OPENROUTER_BASE_URL
import hermes_cli.auth as auth_mod
from hermes_cli.auth import (
ACCESS_TOKEN_REFRESH_SKEW_SECONDS,
CODEX_ACCESS_TOKEN_REFRESH_SKEW_SECONDS,
DEFAULT_AGENT_KEY_MIN_TTL_SECONDS,
PROVIDER_REGISTRY,
_agent_key_is_usable,
_codex_access_token_is_expiring,
_decode_jwt_claims,
_is_expiring,
_load_auth_store,
_load_provider_state,
read_credential_pool,
write_credential_pool,
)
logger = logging.getLogger(__name__)
def _load_config_safe() -> Optional[dict]:
"""Load config.yaml, returning None on any error."""
try:
from hermes_cli.config import load_config
return load_config()
except Exception:
return None
# --- Status and type constants ---
STATUS_OK = "ok"
STATUS_EXHAUSTED = "exhausted"
AUTH_TYPE_OAUTH = "oauth"
AUTH_TYPE_API_KEY = "api_key"
SOURCE_MANUAL = "manual"
STRATEGY_FILL_FIRST = "fill_first"
STRATEGY_ROUND_ROBIN = "round_robin"
STRATEGY_RANDOM = "random"
STRATEGY_LEAST_USED = "least_used"
SUPPORTED_POOL_STRATEGIES = {
STRATEGY_FILL_FIRST,
STRATEGY_ROUND_ROBIN,
STRATEGY_RANDOM,
STRATEGY_LEAST_USED,
}
# Cooldown before retrying an exhausted credential.
# 429 (rate-limited) cools down faster since quotas reset frequently.
# 402 (billing/quota) and other codes use a longer default.
EXHAUSTED_TTL_429_SECONDS = 60 * 60 # 1 hour
EXHAUSTED_TTL_DEFAULT_SECONDS = 24 * 60 * 60 # 24 hours
# Pool key prefix for custom OpenAI-compatible endpoints.
# Custom endpoints all share provider='custom' but are keyed by their
# custom_providers name: 'custom:<normalized_name>'.
CUSTOM_POOL_PREFIX = "custom:"
# Fields that are only round-tripped through JSON — never used for logic as attributes.
_EXTRA_KEYS = frozenset({
"token_type", "scope", "client_id", "portal_base_url", "obtained_at",
"expires_in", "agent_key_id", "agent_key_expires_in", "agent_key_reused",
"agent_key_obtained_at", "tls",
})
@dataclass
class PooledCredential:
provider: str
id: str
label: str
auth_type: str
priority: int
source: str
access_token: str
refresh_token: Optional[str] = None
last_status: Optional[str] = None
last_status_at: Optional[float] = None
last_error_code: Optional[int] = None
base_url: Optional[str] = None
expires_at: Optional[str] = None
expires_at_ms: Optional[int] = None
last_refresh: Optional[str] = None
inference_base_url: Optional[str] = None
agent_key: Optional[str] = None
agent_key_expires_at: Optional[str] = None
request_count: int = 0
extra: Dict[str, Any] = None # type: ignore[assignment]
def __post_init__(self):
if self.extra is None:
self.extra = {}
def __getattr__(self, name: str):
if name in _EXTRA_KEYS:
return self.extra.get(name)
raise AttributeError(f"'{type(self).__name__}' object has no attribute {name!r}")
@classmethod
def from_dict(cls, provider: str, payload: Dict[str, Any]) -> "PooledCredential":
field_names = {f.name for f in fields(cls) if f.name != "provider"}
data = {k: payload.get(k) for k in field_names if k in payload}
extra = {k: payload[k] for k in _EXTRA_KEYS if k in payload and payload[k] is not None}
data["extra"] = extra
data.setdefault("id", uuid.uuid4().hex[:6])
data.setdefault("label", payload.get("source", provider))
data.setdefault("auth_type", AUTH_TYPE_API_KEY)
data.setdefault("priority", 0)
data.setdefault("source", SOURCE_MANUAL)
data.setdefault("access_token", "")
return cls(provider=provider, **data)
def to_dict(self) -> Dict[str, Any]:
_ALWAYS_EMIT = {"last_status", "last_status_at", "last_error_code"}
result: Dict[str, Any] = {}
for field_def in fields(self):
if field_def.name in ("provider", "extra"):
continue
value = getattr(self, field_def.name)
if value is not None or field_def.name in _ALWAYS_EMIT:
result[field_def.name] = value
for k, v in self.extra.items():
if v is not None:
result[k] = v
return result
@property
def runtime_api_key(self) -> str:
if self.provider == "nous":
return str(self.agent_key or self.access_token or "")
return str(self.access_token or "")
@property
def runtime_base_url(self) -> Optional[str]:
if self.provider == "nous":
return self.inference_base_url or self.base_url
return self.base_url
def label_from_token(token: str, fallback: str) -> str:
claims = _decode_jwt_claims(token)
for key in ("email", "preferred_username", "upn"):
value = claims.get(key)
if isinstance(value, str) and value.strip():
return value.strip()
return fallback
def _next_priority(entries: List[PooledCredential]) -> int:
return max((entry.priority for entry in entries), default=-1) + 1
def _is_manual_source(source: str) -> bool:
normalized = (source or "").strip().lower()
return normalized == SOURCE_MANUAL or normalized.startswith(f"{SOURCE_MANUAL}:")
def _exhausted_ttl(error_code: Optional[int]) -> int:
"""Return cooldown seconds based on the HTTP status that caused exhaustion."""
if error_code == 429:
return EXHAUSTED_TTL_429_SECONDS
return EXHAUSTED_TTL_DEFAULT_SECONDS
def _normalize_custom_pool_name(name: str) -> str:
"""Normalize a custom provider name for use as a pool key suffix."""
return name.strip().lower().replace(" ", "-")
def _iter_custom_providers(config: Optional[dict] = None):
"""Yield (normalized_name, entry_dict) for each valid custom_providers entry."""
if config is None:
config = _load_config_safe()
if config is None:
return
custom_providers = config.get("custom_providers")
if not isinstance(custom_providers, list):
return
for entry in custom_providers:
if not isinstance(entry, dict):
continue
name = entry.get("name")
if not isinstance(name, str):
continue
yield _normalize_custom_pool_name(name), entry
def get_custom_provider_pool_key(base_url: str) -> Optional[str]:
"""Look up the custom_providers list in config.yaml and return 'custom:<name>' for a matching base_url.
Returns None if no match is found.
"""
if not base_url:
return None
normalized_url = base_url.strip().rstrip("/")
for norm_name, entry in _iter_custom_providers():
entry_url = str(entry.get("base_url") or "").strip().rstrip("/")
if entry_url and entry_url == normalized_url:
return f"{CUSTOM_POOL_PREFIX}{norm_name}"
return None
def list_custom_pool_providers() -> List[str]:
"""Return all 'custom:*' pool keys that have entries in auth.json."""
pool_data = read_credential_pool(None)
return sorted(
key for key in pool_data
if key.startswith(CUSTOM_POOL_PREFIX)
and isinstance(pool_data.get(key), list)
and pool_data[key]
)
def _get_custom_provider_config(pool_key: str) -> Optional[Dict[str, Any]]:
"""Return the custom_providers config entry matching a pool key like 'custom:together.ai'."""
if not pool_key.startswith(CUSTOM_POOL_PREFIX):
return None
suffix = pool_key[len(CUSTOM_POOL_PREFIX):]
for norm_name, entry in _iter_custom_providers():
if norm_name == suffix:
return entry
return None
def get_pool_strategy(provider: str) -> str:
"""Return the configured selection strategy for a provider."""
config = _load_config_safe()
if config is None:
return STRATEGY_FILL_FIRST
strategies = config.get("credential_pool_strategies")
if not isinstance(strategies, dict):
return STRATEGY_FILL_FIRST
strategy = str(strategies.get(provider, "") or "").strip().lower()
if strategy in SUPPORTED_POOL_STRATEGIES:
return strategy
return STRATEGY_FILL_FIRST
class CredentialPool:
def __init__(self, provider: str, entries: List[PooledCredential]):
self.provider = provider
self._entries = sorted(entries, key=lambda entry: entry.priority)
self._current_id: Optional[str] = None
self._strategy = get_pool_strategy(provider)
self._lock = threading.Lock()
def has_credentials(self) -> bool:
return bool(self._entries)
def has_available(self) -> bool:
"""True if at least one entry is not currently in exhaustion cooldown."""
return bool(self._available_entries())
def entries(self) -> List[PooledCredential]:
return list(self._entries)
def current(self) -> Optional[PooledCredential]:
if not self._current_id:
return None
return next((entry for entry in self._entries if entry.id == self._current_id), None)
def _replace_entry(self, old: PooledCredential, new: PooledCredential) -> None:
"""Swap an entry in-place by id, preserving sort order."""
for idx, entry in enumerate(self._entries):
if entry.id == old.id:
self._entries[idx] = new
return
def _persist(self) -> None:
write_credential_pool(
self.provider,
[entry.to_dict() for entry in self._entries],
)
def _mark_exhausted(self, entry: PooledCredential, status_code: Optional[int]) -> PooledCredential:
updated = replace(
entry,
last_status=STATUS_EXHAUSTED,
last_status_at=time.time(),
last_error_code=status_code,
)
self._replace_entry(entry, updated)
self._persist()
return updated
def _refresh_entry(self, entry: PooledCredential, *, force: bool) -> Optional[PooledCredential]:
if entry.auth_type != AUTH_TYPE_OAUTH or not entry.refresh_token:
if force:
self._mark_exhausted(entry, None)
return None
try:
if self.provider == "anthropic":
from agent.anthropic_adapter import refresh_anthropic_oauth_pure
refreshed = refresh_anthropic_oauth_pure(
entry.refresh_token,
use_json=entry.source.endswith("hermes_pkce"),
)
updated = replace(
entry,
access_token=refreshed["access_token"],
refresh_token=refreshed["refresh_token"],
expires_at_ms=refreshed["expires_at_ms"],
)
elif self.provider == "openai-codex":
refreshed = auth_mod.refresh_codex_oauth_pure(
entry.access_token,
entry.refresh_token,
)
updated = replace(
entry,
access_token=refreshed["access_token"],
refresh_token=refreshed["refresh_token"],
last_refresh=refreshed.get("last_refresh"),
)
elif self.provider == "nous":
nous_state = {
"access_token": entry.access_token,
"refresh_token": entry.refresh_token,
"client_id": entry.client_id,
"portal_base_url": entry.portal_base_url,
"inference_base_url": entry.inference_base_url,
"token_type": entry.token_type,
"scope": entry.scope,
"obtained_at": entry.obtained_at,
"expires_at": entry.expires_at,
"agent_key": entry.agent_key,
"agent_key_expires_at": entry.agent_key_expires_at,
"tls": entry.tls,
}
refreshed = auth_mod.refresh_nous_oauth_from_state(
nous_state,
min_key_ttl_seconds=DEFAULT_AGENT_KEY_MIN_TTL_SECONDS,
force_refresh=force,
force_mint=force,
)
# Apply returned fields: dataclass fields via replace, extras via dict update
field_updates = {}
extra_updates = dict(entry.extra)
_field_names = {f.name for f in fields(entry)}
for k, v in refreshed.items():
if k in _field_names:
field_updates[k] = v
elif k in _EXTRA_KEYS:
extra_updates[k] = v
updated = replace(entry, extra=extra_updates, **field_updates)
else:
return entry
except Exception as exc:
logger.debug("Credential refresh failed for %s/%s: %s", self.provider, entry.id, exc)
self._mark_exhausted(entry, None)
return None
updated = replace(updated, last_status=STATUS_OK, last_status_at=None, last_error_code=None)
self._replace_entry(entry, updated)
self._persist()
return updated
def _entry_needs_refresh(self, entry: PooledCredential) -> bool:
if entry.auth_type != AUTH_TYPE_OAUTH:
return False
if self.provider == "anthropic":
if entry.expires_at_ms is None:
return False
return int(entry.expires_at_ms) <= int(time.time() * 1000) + 120_000
if self.provider == "openai-codex":
return _codex_access_token_is_expiring(
entry.access_token,
CODEX_ACCESS_TOKEN_REFRESH_SKEW_SECONDS,
)
if self.provider == "nous":
# Nous refresh/mint can require network access and should happen when
# runtime credentials are actually resolved, not merely when the pool
# is enumerated for listing, migration, or selection.
return False
return False
def mark_used(self, entry_id: Optional[str] = None) -> None:
"""Increment request_count for tracking. Used by least_used strategy."""
target_id = entry_id or self._current_id
if not target_id:
return
with self._lock:
for idx, entry in enumerate(self._entries):
if entry.id == target_id:
self._entries[idx] = replace(entry, request_count=entry.request_count + 1)
return
def select(self) -> Optional[PooledCredential]:
with self._lock:
return self._select_unlocked()
def _available_entries(self, *, clear_expired: bool = False, refresh: bool = False) -> List[PooledCredential]:
"""Return entries not currently in exhaustion cooldown.
When *clear_expired* is True, entries whose cooldown has elapsed are
reset to STATUS_OK and persisted. When *refresh* is True, entries
that need a token refresh are refreshed (skipped on failure).
"""
now = time.time()
cleared_any = False
available: List[PooledCredential] = []
for entry in self._entries:
if entry.last_status == STATUS_EXHAUSTED:
ttl = _exhausted_ttl(entry.last_error_code)
if entry.last_status_at and now - entry.last_status_at < ttl:
continue
if clear_expired:
cleared = replace(entry, last_status=STATUS_OK, last_status_at=None, last_error_code=None)
self._replace_entry(entry, cleared)
entry = cleared
cleared_any = True
if refresh and self._entry_needs_refresh(entry):
refreshed = self._refresh_entry(entry, force=False)
if refreshed is None:
continue
entry = refreshed
available.append(entry)
if cleared_any:
self._persist()
return available
def _select_unlocked(self) -> Optional[PooledCredential]:
available = self._available_entries(clear_expired=True, refresh=True)
if not available:
self._current_id = None
return None
if self._strategy == STRATEGY_RANDOM:
entry = random.choice(available)
self._current_id = entry.id
return entry
if self._strategy == STRATEGY_LEAST_USED and len(available) > 1:
entry = min(available, key=lambda e: e.request_count)
self._current_id = entry.id
return entry
if self._strategy == STRATEGY_ROUND_ROBIN and len(available) > 1:
entry = available[0]
rotated = [candidate for candidate in self._entries if candidate.id != entry.id]
rotated.append(replace(entry, priority=len(self._entries) - 1))
self._entries = [replace(candidate, priority=idx) for idx, candidate in enumerate(rotated)]
self._persist()
self._current_id = entry.id
return self.current() or entry
entry = available[0]
self._current_id = entry.id
return entry
def peek(self) -> Optional[PooledCredential]:
current = self.current()
if current is not None:
return current
available = self._available_entries()
return available[0] if available else None
def mark_exhausted_and_rotate(self, *, status_code: Optional[int]) -> Optional[PooledCredential]:
with self._lock:
entry = self.current() or self._select_unlocked()
if entry is None:
return None
self._mark_exhausted(entry, status_code)
self._current_id = None
return self._select_unlocked()
def try_refresh_current(self) -> Optional[PooledCredential]:
with self._lock:
return self._try_refresh_current_unlocked()
def _try_refresh_current_unlocked(self) -> Optional[PooledCredential]:
entry = self.current()
if entry is None:
return None
refreshed = self._refresh_entry(entry, force=True)
if refreshed is not None:
self._current_id = refreshed.id
return refreshed
def reset_statuses(self) -> int:
count = 0
new_entries = []
for entry in self._entries:
if entry.last_status or entry.last_status_at or entry.last_error_code:
new_entries.append(replace(entry, last_status=None, last_status_at=None, last_error_code=None))
count += 1
else:
new_entries.append(entry)
if count:
self._entries = new_entries
self._persist()
return count
def remove_index(self, index: int) -> Optional[PooledCredential]:
if index < 1 or index > len(self._entries):
return None
removed = self._entries.pop(index - 1)
self._entries = [
replace(entry, priority=new_priority)
for new_priority, entry in enumerate(self._entries)
]
self._persist()
if self._current_id == removed.id:
self._current_id = None
return removed
def add_entry(self, entry: PooledCredential) -> PooledCredential:
entry = replace(entry, priority=_next_priority(self._entries))
self._entries.append(entry)
self._persist()
return entry
def _upsert_entry(entries: List[PooledCredential], provider: str, source: str, payload: Dict[str, Any]) -> bool:
existing_idx = None
for idx, entry in enumerate(entries):
if entry.source == source:
existing_idx = idx
break
if existing_idx is None:
payload.setdefault("id", uuid.uuid4().hex[:6])
payload.setdefault("priority", _next_priority(entries))
payload.setdefault("label", payload.get("label") or source)
entries.append(PooledCredential.from_dict(provider, payload))
return True
existing = entries[existing_idx]
field_updates = {}
extra_updates = {}
_field_names = {f.name for f in fields(existing)}
for key, value in payload.items():
if key in {"id", "priority"} or value is None:
continue
if key == "label" and existing.label:
continue
if key in _field_names:
if getattr(existing, key) != value:
field_updates[key] = value
elif key in _EXTRA_KEYS:
if existing.extra.get(key) != value:
extra_updates[key] = value
if field_updates or extra_updates:
if extra_updates:
field_updates["extra"] = {**existing.extra, **extra_updates}
entries[existing_idx] = replace(existing, **field_updates)
return True
return False
def _normalize_pool_priorities(provider: str, entries: List[PooledCredential]) -> bool:
if provider != "anthropic":
return False
source_rank = {
"env:ANTHROPIC_TOKEN": 0,
"env:CLAUDE_CODE_OAUTH_TOKEN": 1,
"hermes_pkce": 2,
"claude_code": 3,
"env:ANTHROPIC_API_KEY": 4,
}
manual_entries = sorted(
(entry for entry in entries if _is_manual_source(entry.source)),
key=lambda entry: entry.priority,
)
seeded_entries = sorted(
(entry for entry in entries if not _is_manual_source(entry.source)),
key=lambda entry: (
source_rank.get(entry.source, len(source_rank)),
entry.priority,
entry.label,
),
)
ordered = [*manual_entries, *seeded_entries]
id_to_idx = {entry.id: idx for idx, entry in enumerate(entries)}
changed = False
for new_priority, entry in enumerate(ordered):
if entry.priority != new_priority:
entries[id_to_idx[entry.id]] = replace(entry, priority=new_priority)
changed = True
return changed
def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tuple[bool, Set[str]]:
changed = False
active_sources: Set[str] = set()
auth_store = _load_auth_store()
if provider == "anthropic":
from agent.anthropic_adapter import read_claude_code_credentials, read_hermes_oauth_credentials
for source_name, creds in (
("hermes_pkce", read_hermes_oauth_credentials()),
("claude_code", read_claude_code_credentials()),
):
if creds and creds.get("accessToken"):
active_sources.add(source_name)
changed |= _upsert_entry(
entries,
provider,
source_name,
{
"source": source_name,
"auth_type": AUTH_TYPE_OAUTH,
"access_token": creds.get("accessToken", ""),
"refresh_token": creds.get("refreshToken"),
"expires_at_ms": creds.get("expiresAt"),
"label": label_from_token(creds.get("accessToken", ""), source_name),
},
)
elif provider == "nous":
state = _load_provider_state(auth_store, "nous")
if state:
active_sources.add("device_code")
changed |= _upsert_entry(
entries,
provider,
"device_code",
{
"source": "device_code",
"auth_type": AUTH_TYPE_OAUTH,
"access_token": state.get("access_token", ""),
"refresh_token": state.get("refresh_token"),
"expires_at": state.get("expires_at"),
"token_type": state.get("token_type"),
"scope": state.get("scope"),
"client_id": state.get("client_id"),
"portal_base_url": state.get("portal_base_url"),
"inference_base_url": state.get("inference_base_url"),
"agent_key": state.get("agent_key"),
"agent_key_expires_at": state.get("agent_key_expires_at"),
"tls": state.get("tls") if isinstance(state.get("tls"), dict) else None,
"label": label_from_token(state.get("access_token", ""), "device_code"),
},
)
elif provider == "openai-codex":
state = _load_provider_state(auth_store, "openai-codex")
tokens = state.get("tokens") if isinstance(state, dict) else None
if isinstance(tokens, dict) and tokens.get("access_token"):
active_sources.add("device_code")
changed |= _upsert_entry(
entries,
provider,
"device_code",
{
"source": "device_code",
"auth_type": AUTH_TYPE_OAUTH,
"access_token": tokens.get("access_token", ""),
"refresh_token": tokens.get("refresh_token"),
"base_url": "https://chatgpt.com/backend-api/codex",
"last_refresh": state.get("last_refresh"),
"label": label_from_token(tokens.get("access_token", ""), "device_code"),
},
)
return changed, active_sources
def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool, Set[str]]:
changed = False
active_sources: Set[str] = set()
if provider == "openrouter":
token = os.getenv("OPENROUTER_API_KEY", "").strip()
if token:
source = "env:OPENROUTER_API_KEY"
active_sources.add(source)
changed |= _upsert_entry(
entries,
provider,
source,
{
"source": source,
"auth_type": AUTH_TYPE_API_KEY,
"access_token": token,
"base_url": OPENROUTER_BASE_URL,
"label": "OPENROUTER_API_KEY",
},
)
return changed, active_sources
pconfig = PROVIDER_REGISTRY.get(provider)
if not pconfig or pconfig.auth_type != AUTH_TYPE_API_KEY:
return changed, active_sources
env_url = ""
if pconfig.base_url_env_var:
env_url = os.getenv(pconfig.base_url_env_var, "").strip().rstrip("/")
env_vars = list(pconfig.api_key_env_vars)
if provider == "anthropic":
env_vars = [
"ANTHROPIC_TOKEN",
"CLAUDE_CODE_OAUTH_TOKEN",
"ANTHROPIC_API_KEY",
]
for env_var in env_vars:
token = os.getenv(env_var, "").strip()
if not token:
continue
source = f"env:{env_var}"
active_sources.add(source)
auth_type = AUTH_TYPE_OAUTH if provider == "anthropic" and not token.startswith("sk-ant-api") else AUTH_TYPE_API_KEY
base_url = env_url or pconfig.inference_base_url
changed |= _upsert_entry(
entries,
provider,
source,
{
"source": source,
"auth_type": auth_type,
"access_token": token,
"base_url": base_url,
"label": env_var,
},
)
return changed, active_sources
def _prune_stale_seeded_entries(entries: List[PooledCredential], active_sources: Set[str]) -> bool:
retained = [
entry
for entry in entries
if _is_manual_source(entry.source)
or entry.source in active_sources
or not (
entry.source.startswith("env:")
or entry.source in {"claude_code", "hermes_pkce"}
)
]
if len(retained) == len(entries):
return False
entries[:] = retained
return True
def _seed_custom_pool(pool_key: str, entries: List[PooledCredential]) -> Tuple[bool, Set[str]]:
"""Seed a custom endpoint pool from custom_providers config and model config."""
changed = False
active_sources: Set[str] = set()
# Seed from the custom_providers config entry's api_key field
cp_config = _get_custom_provider_config(pool_key)
if cp_config:
api_key = str(cp_config.get("api_key") or "").strip()
base_url = str(cp_config.get("base_url") or "").strip().rstrip("/")
name = str(cp_config.get("name") or "").strip()
if api_key:
source = f"config:{name}"
active_sources.add(source)
changed |= _upsert_entry(
entries,
pool_key,
source,
{
"source": source,
"auth_type": AUTH_TYPE_API_KEY,
"access_token": api_key,
"base_url": base_url,
"label": name or source,
},
)
# Seed from model.api_key if model.provider=='custom' and model.base_url matches
try:
config = _load_config_safe()
model_cfg = config.get("model") if config else None
if isinstance(model_cfg, dict):
model_provider = str(model_cfg.get("provider") or "").strip().lower()
model_base_url = str(model_cfg.get("base_url") or "").strip().rstrip("/")
model_api_key = ""
for k in ("api_key", "api"):
v = model_cfg.get(k)
if isinstance(v, str) and v.strip():
model_api_key = v.strip()
break
if model_provider == "custom" and model_base_url and model_api_key:
# Check if this model's base_url matches our custom provider
matched_key = get_custom_provider_pool_key(model_base_url)
if matched_key == pool_key:
source = "model_config"
active_sources.add(source)
changed |= _upsert_entry(
entries,
pool_key,
source,
{
"source": source,
"auth_type": AUTH_TYPE_API_KEY,
"access_token": model_api_key,
"base_url": model_base_url,
"label": "model_config",
},
)
except Exception:
pass
return changed, active_sources
def load_pool(provider: str) -> CredentialPool:
provider = (provider or "").strip().lower()
raw_entries = read_credential_pool(provider)
entries = [PooledCredential.from_dict(provider, payload) for payload in raw_entries]
if provider.startswith(CUSTOM_POOL_PREFIX):
# Custom endpoint pool — seed from custom_providers config and model config
custom_changed, custom_sources = _seed_custom_pool(provider, entries)
changed = custom_changed
changed |= _prune_stale_seeded_entries(entries, custom_sources)
else:
singleton_changed, singleton_sources = _seed_from_singletons(provider, entries)
env_changed, env_sources = _seed_from_env(provider, entries)
changed = singleton_changed or env_changed
changed |= _prune_stale_seeded_entries(entries, singleton_sources | env_sources)
changed |= _normalize_pool_priorities(provider, entries)
if changed:
write_credential_pool(
provider,
[entry.to_dict() for entry in sorted(entries, key=lambda item: item.priority)],
)
return CredentialPool(provider, entries)

View File

@ -10,6 +10,9 @@ import os
import sys
import threading
import time
from dataclasses import dataclass, field
from difflib import unified_diff
from pathlib import Path
# ANSI escape codes for coloring tool failure indicators
_RED = "\033[31m"
@ -17,6 +20,39 @@ _RESET = "\033[0m"
logger = logging.getLogger(__name__)
_ANSI_RESET = "\033[0m"
_ANSI_DIM = "\033[38;2;150;150;150m"
_ANSI_FILE = "\033[38;2;180;160;255m"
_ANSI_HUNK = "\033[38;2;120;120;140m"
_ANSI_MINUS = "\033[38;2;255;255;255;48;2;120;20;20m"
_ANSI_PLUS = "\033[38;2;255;255;255;48;2;20;90;20m"
_MAX_INLINE_DIFF_FILES = 6
_MAX_INLINE_DIFF_LINES = 80
@dataclass
class LocalEditSnapshot:
"""Pre-tool filesystem snapshot used to render diffs locally after writes."""
paths: list[Path] = field(default_factory=list)
before: dict[str, str | None] = field(default_factory=dict)
# =========================================================================
# Configurable tool preview length (0 = no limit)
# Set once at startup by CLI or gateway from display.tool_preview_length config.
# =========================================================================
_tool_preview_max_len: int = 0 # 0 = unlimited
def set_tool_preview_max_len(n: int) -> None:
"""Set the global max length for tool call previews. 0 = no limit."""
global _tool_preview_max_len
_tool_preview_max_len = max(int(n), 0) if n else 0
def get_tool_preview_max_len() -> int:
"""Return the configured max preview length (0 = unlimited)."""
return _tool_preview_max_len
# =========================================================================
# Skin-aware helpers (lazy import to avoid circular deps)
@ -94,8 +130,14 @@ def _oneline(text: str) -> str:
return " ".join(text.split())
def build_tool_preview(tool_name: str, args: dict, max_len: int = 40) -> str | None:
"""Build a short preview of a tool call's primary argument for display."""
def build_tool_preview(tool_name: str, args: dict, max_len: int | None = None) -> str | None:
"""Build a short preview of a tool call's primary argument for display.
*max_len* controls truncation. ``None`` (default) defers to the global
``_tool_preview_max_len`` set via config; ``0`` means unlimited.
"""
if max_len is None:
max_len = _tool_preview_max_len
if not args:
return None
primary_args = {
@ -190,11 +232,305 @@ def build_tool_preview(tool_name: str, args: dict, max_len: int = 40) -> str | N
preview = _oneline(str(value))
if not preview:
return None
if len(preview) > max_len:
if max_len > 0 and len(preview) > max_len:
preview = preview[:max_len - 3] + "..."
return preview
# =========================================================================
# Inline diff previews for write actions
# =========================================================================
def _resolved_path(path: str) -> Path:
"""Resolve a possibly-relative filesystem path against the current cwd."""
candidate = Path(os.path.expanduser(path))
if candidate.is_absolute():
return candidate
return Path.cwd() / candidate
def _snapshot_text(path: Path) -> str | None:
"""Return UTF-8 file content, or None for missing/unreadable files."""
try:
return path.read_text(encoding="utf-8")
except (FileNotFoundError, IsADirectoryError, UnicodeDecodeError, OSError):
return None
def _display_diff_path(path: Path) -> str:
"""Prefer cwd-relative paths in diffs when available."""
try:
return str(path.resolve().relative_to(Path.cwd().resolve()))
except Exception:
return str(path)
def _resolve_skill_manage_paths(args: dict) -> list[Path]:
"""Resolve skill_manage write targets to filesystem paths."""
action = args.get("action")
name = args.get("name")
if not action or not name:
return []
from tools.skill_manager_tool import _find_skill, _resolve_skill_dir
if action == "create":
skill_dir = _resolve_skill_dir(name, args.get("category"))
return [skill_dir / "SKILL.md"]
existing = _find_skill(name)
if not existing:
return []
skill_dir = Path(existing["path"])
if action in {"edit", "patch"}:
file_path = args.get("file_path")
return [skill_dir / file_path] if file_path else [skill_dir / "SKILL.md"]
if action in {"write_file", "remove_file"}:
file_path = args.get("file_path")
return [skill_dir / file_path] if file_path else []
if action == "delete":
files = [path for path in sorted(skill_dir.rglob("*")) if path.is_file()]
return files
return []
def _resolve_local_edit_paths(tool_name: str, function_args: dict | None) -> list[Path]:
"""Resolve local filesystem targets for write-capable tools."""
if not isinstance(function_args, dict):
return []
if tool_name == "write_file":
path = function_args.get("path")
return [_resolved_path(path)] if path else []
if tool_name == "patch":
path = function_args.get("path")
return [_resolved_path(path)] if path else []
if tool_name == "skill_manage":
return _resolve_skill_manage_paths(function_args)
return []
def capture_local_edit_snapshot(tool_name: str, function_args: dict | None) -> LocalEditSnapshot | None:
"""Capture before-state for local write previews."""
paths = _resolve_local_edit_paths(tool_name, function_args)
if not paths:
return None
snapshot = LocalEditSnapshot(paths=paths)
for path in paths:
snapshot.before[str(path)] = _snapshot_text(path)
return snapshot
def _result_succeeded(result: str | None) -> bool:
"""Conservatively detect whether a tool result represents success."""
if not result:
return False
try:
data = json.loads(result)
except (json.JSONDecodeError, TypeError):
return False
if not isinstance(data, dict):
return False
if data.get("error"):
return False
if "success" in data:
return bool(data.get("success"))
return True
def _diff_from_snapshot(snapshot: LocalEditSnapshot | None) -> str | None:
"""Generate unified diff text from a stored before-state and current files."""
if not snapshot:
return None
chunks: list[str] = []
for path in snapshot.paths:
before = snapshot.before.get(str(path))
after = _snapshot_text(path)
if before == after:
continue
display_path = _display_diff_path(path)
diff = "".join(
unified_diff(
[] if before is None else before.splitlines(keepends=True),
[] if after is None else after.splitlines(keepends=True),
fromfile=f"a/{display_path}",
tofile=f"b/{display_path}",
)
)
if diff:
chunks.append(diff)
if not chunks:
return None
return "".join(chunk if chunk.endswith("\n") else chunk + "\n" for chunk in chunks)
def extract_edit_diff(
tool_name: str,
result: str | None,
*,
function_args: dict | None = None,
snapshot: LocalEditSnapshot | None = None,
) -> str | None:
"""Extract a unified diff from a file-edit tool result."""
if tool_name == "patch" and result:
try:
data = json.loads(result)
except (json.JSONDecodeError, TypeError):
data = None
if isinstance(data, dict):
diff = data.get("diff")
if isinstance(diff, str) and diff.strip():
return diff
if tool_name not in {"write_file", "patch", "skill_manage"}:
return None
if not _result_succeeded(result):
return None
return _diff_from_snapshot(snapshot)
def _emit_inline_diff(diff_text: str, print_fn) -> bool:
"""Emit rendered diff text through the CLI's prompt_toolkit-safe printer."""
if print_fn is None or not diff_text:
return False
try:
print_fn(" ┊ review diff")
for line in diff_text.rstrip("\n").splitlines():
print_fn(line)
return True
except Exception:
return False
def _render_inline_unified_diff(diff: str) -> list[str]:
"""Render unified diff lines in Hermes' inline transcript style."""
rendered: list[str] = []
from_file = None
to_file = None
for raw_line in diff.splitlines():
if raw_line.startswith("--- "):
from_file = raw_line[4:].strip()
continue
if raw_line.startswith("+++ "):
to_file = raw_line[4:].strip()
if from_file or to_file:
rendered.append(f"{_ANSI_FILE}{from_file or 'a/?'}{to_file or 'b/?'}{_ANSI_RESET}")
continue
if raw_line.startswith("@@"):
rendered.append(f"{_ANSI_HUNK}{raw_line}{_ANSI_RESET}")
continue
if raw_line.startswith("-"):
rendered.append(f"{_ANSI_MINUS}{raw_line}{_ANSI_RESET}")
continue
if raw_line.startswith("+"):
rendered.append(f"{_ANSI_PLUS}{raw_line}{_ANSI_RESET}")
continue
if raw_line.startswith(" "):
rendered.append(f"{_ANSI_DIM}{raw_line}{_ANSI_RESET}")
continue
if raw_line:
rendered.append(raw_line)
return rendered
def _split_unified_diff_sections(diff: str) -> list[str]:
"""Split a unified diff into per-file sections."""
sections: list[list[str]] = []
current: list[str] = []
for line in diff.splitlines():
if line.startswith("--- ") and current:
sections.append(current)
current = [line]
continue
current.append(line)
if current:
sections.append(current)
return ["\n".join(section) for section in sections if section]
def _summarize_rendered_diff_sections(
diff: str,
*,
max_files: int = _MAX_INLINE_DIFF_FILES,
max_lines: int = _MAX_INLINE_DIFF_LINES,
) -> list[str]:
"""Render diff sections while capping file count and total line count."""
sections = _split_unified_diff_sections(diff)
rendered: list[str] = []
omitted_files = 0
omitted_lines = 0
for idx, section in enumerate(sections):
if idx >= max_files:
omitted_files += 1
omitted_lines += len(_render_inline_unified_diff(section))
continue
section_lines = _render_inline_unified_diff(section)
remaining_budget = max_lines - len(rendered)
if remaining_budget <= 0:
omitted_lines += len(section_lines)
omitted_files += 1
continue
if len(section_lines) <= remaining_budget:
rendered.extend(section_lines)
continue
rendered.extend(section_lines[:remaining_budget])
omitted_lines += len(section_lines) - remaining_budget
omitted_files += 1 + max(0, len(sections) - idx - 1)
for leftover in sections[idx + 1:]:
omitted_lines += len(_render_inline_unified_diff(leftover))
break
if omitted_files or omitted_lines:
summary = f"… omitted {omitted_lines} diff line(s)"
if omitted_files:
summary += f" across {omitted_files} additional file(s)/section(s)"
rendered.append(f"{_ANSI_HUNK}{summary}{_ANSI_RESET}")
return rendered
def render_edit_diff_with_delta(
tool_name: str,
result: str | None,
*,
function_args: dict | None = None,
snapshot: LocalEditSnapshot | None = None,
print_fn=None,
) -> bool:
"""Render an edit diff inline without taking over the terminal UI."""
diff = extract_edit_diff(
tool_name,
result,
function_args=function_args,
snapshot=snapshot,
)
if not diff:
return False
try:
rendered_lines = _summarize_rendered_diff_sections(diff)
except Exception as exc:
logger.debug("Could not render inline diff: %s", exc)
return False
return _emit_inline_diff("\n".join(rendered_lines), print_fn)
# =========================================================================
# KawaiiSpinner
# =========================================================================
@ -284,11 +620,11 @@ class KawaiiSpinner:
The CLI already drives a TUI widget (_spinner_text) for spinner display,
so KawaiiSpinner's \\r-based animation is redundant under StdoutProxy.
"""
out = self._out
# StdoutProxy has a 'raw' attribute (bool) that plain file objects lack.
if hasattr(out, 'raw') and type(out).__name__ == 'StdoutProxy':
return True
return False
try:
from prompt_toolkit.patch_stdout import StdoutProxy
return isinstance(self._out, StdoutProxy)
except ImportError:
return False
def _animate(self):
# When stdout is not a real terminal (e.g. Docker, systemd, pipe),
@ -484,10 +820,14 @@ def get_cute_tool_message(
def _trunc(s, n=40):
s = str(s)
if _tool_preview_max_len == 0:
return s # no limit
return (s[:n-3] + "...") if len(s) > n else s
def _path(p, n=35):
p = str(p)
if _tool_preview_max_len == 0:
return p # no limit
return ("..." + p[-(n-3):]) if len(p) > n else p
def _wrap(line: str) -> str:

View File

@ -644,6 +644,9 @@ class InsightsEngine:
lines.append(f" Sessions: {o['total_sessions']:<12} Messages: {o['total_messages']:,}")
lines.append(f" Tool calls: {o['total_tool_calls']:<12,} User messages: {o['user_messages']:,}")
lines.append(f" Input tokens: {o['total_input_tokens']:<12,} Output tokens: {o['total_output_tokens']:,}")
cache_total = o.get("total_cache_read_tokens", 0) + o.get("total_cache_write_tokens", 0)
if cache_total > 0:
lines.append(f" Cache read: {o['total_cache_read_tokens']:<12,} Cache write: {o['total_cache_write_tokens']:,}")
cost_str = f"${o['estimated_cost']:.2f}"
if o.get("models_without_pricing"):
cost_str += " *"
@ -746,7 +749,11 @@ class InsightsEngine:
# Overview
lines.append(f"**Sessions:** {o['total_sessions']} | **Messages:** {o['total_messages']:,} | **Tool calls:** {o['total_tool_calls']:,}")
lines.append(f"**Tokens:** {o['total_tokens']:,} (in: {o['total_input_tokens']:,} / out: {o['total_output_tokens']:,})")
cache_total = o.get("total_cache_read_tokens", 0) + o.get("total_cache_write_tokens", 0)
if cache_total > 0:
lines.append(f"**Tokens:** {o['total_tokens']:,} (in: {o['total_input_tokens']:,} / out: {o['total_output_tokens']:,} / cache: {cache_total:,})")
else:
lines.append(f"**Tokens:** {o['total_tokens']:,} (in: {o['total_input_tokens']:,} / out: {o['total_output_tokens']:,})")
cost_note = ""
if o.get("models_without_pricing"):
cost_note = " _(excludes custom/self-hosted models)_"

335
agent/memory_manager.py Normal file
View File

@ -0,0 +1,335 @@
"""MemoryManager — orchestrates the built-in memory provider plus at most
ONE external plugin memory provider.
Single integration point in run_agent.py. Replaces scattered per-backend
code with one manager that delegates to registered providers.
The BuiltinMemoryProvider is always registered first and cannot be removed.
Only ONE external (non-builtin) provider is allowed at a time — attempting
to register a second external provider is rejected with a warning. This
prevents tool schema bloat and conflicting memory backends.
Usage in run_agent.py:
self._memory_manager = MemoryManager()
self._memory_manager.add_provider(BuiltinMemoryProvider(...))
# Only ONE of these:
self._memory_manager.add_provider(plugin_provider)
# System prompt
prompt_parts.append(self._memory_manager.build_system_prompt())
# Pre-turn
context = self._memory_manager.prefetch_all(user_message)
# Post-turn
self._memory_manager.sync_all(user_msg, assistant_response)
self._memory_manager.queue_prefetch_all(user_msg)
"""
from __future__ import annotations
import json
import logging
from typing import Any, Dict, List, Optional
from agent.memory_provider import MemoryProvider
logger = logging.getLogger(__name__)
class MemoryManager:
"""Orchestrates the built-in provider plus at most one external provider.
The builtin provider is always first. Only one non-builtin (external)
provider is allowed. Failures in one provider never block the other.
"""
def __init__(self) -> None:
self._providers: List[MemoryProvider] = []
self._tool_to_provider: Dict[str, MemoryProvider] = {}
self._has_external: bool = False # True once a non-builtin provider is added
# -- Registration --------------------------------------------------------
def add_provider(self, provider: MemoryProvider) -> None:
"""Register a memory provider.
Built-in provider (name ``"builtin"``) is always accepted.
Only **one** external (non-builtin) provider is allowed — a second
attempt is rejected with a warning.
"""
is_builtin = provider.name == "builtin"
if not is_builtin:
if self._has_external:
existing = next(
(p.name for p in self._providers if p.name != "builtin"), "unknown"
)
logger.warning(
"Rejected memory provider '%s' — external provider '%s' is "
"already registered. Only one external memory provider is "
"allowed at a time. Configure which one via memory.provider "
"in config.yaml.",
provider.name, existing,
)
return
self._has_external = True
self._providers.append(provider)
# Index tool names → provider for routing
for schema in provider.get_tool_schemas():
tool_name = schema.get("name", "")
if tool_name and tool_name not in self._tool_to_provider:
self._tool_to_provider[tool_name] = provider
elif tool_name in self._tool_to_provider:
logger.warning(
"Memory tool name conflict: '%s' already registered by %s, "
"ignoring from %s",
tool_name,
self._tool_to_provider[tool_name].name,
provider.name,
)
logger.info(
"Memory provider '%s' registered (%d tools)",
provider.name,
len(provider.get_tool_schemas()),
)
@property
def providers(self) -> List[MemoryProvider]:
"""All registered providers in order."""
return list(self._providers)
@property
def provider_names(self) -> List[str]:
"""Names of all registered providers."""
return [p.name for p in self._providers]
def get_provider(self, name: str) -> Optional[MemoryProvider]:
"""Get a provider by name, or None if not registered."""
for p in self._providers:
if p.name == name:
return p
return None
# -- System prompt -------------------------------------------------------
def build_system_prompt(self) -> str:
"""Collect system prompt blocks from all providers.
Returns combined text, or empty string if no providers contribute.
Each non-empty block is labeled with the provider name.
"""
blocks = []
for provider in self._providers:
try:
block = provider.system_prompt_block()
if block and block.strip():
blocks.append(block)
except Exception as e:
logger.warning(
"Memory provider '%s' system_prompt_block() failed: %s",
provider.name, e,
)
return "\n\n".join(blocks)
# -- Prefetch / recall ---------------------------------------------------
def prefetch_all(self, query: str, *, session_id: str = "") -> str:
"""Collect prefetch context from all providers.
Returns merged context text labeled by provider. Empty providers
are skipped. Failures in one provider don't block others.
"""
parts = []
for provider in self._providers:
try:
result = provider.prefetch(query, session_id=session_id)
if result and result.strip():
parts.append(result)
except Exception as e:
logger.debug(
"Memory provider '%s' prefetch failed (non-fatal): %s",
provider.name, e,
)
return "\n\n".join(parts)
def queue_prefetch_all(self, query: str, *, session_id: str = "") -> None:
"""Queue background prefetch on all providers for the next turn."""
for provider in self._providers:
try:
provider.queue_prefetch(query, session_id=session_id)
except Exception as e:
logger.debug(
"Memory provider '%s' queue_prefetch failed (non-fatal): %s",
provider.name, e,
)
# -- Sync ----------------------------------------------------------------
def sync_all(self, user_content: str, assistant_content: str, *, session_id: str = "") -> None:
"""Sync a completed turn to all providers."""
for provider in self._providers:
try:
provider.sync_turn(user_content, assistant_content, session_id=session_id)
except Exception as e:
logger.warning(
"Memory provider '%s' sync_turn failed: %s",
provider.name, e,
)
# -- Tools ---------------------------------------------------------------
def get_all_tool_schemas(self) -> List[Dict[str, Any]]:
"""Collect tool schemas from all providers."""
schemas = []
seen = set()
for provider in self._providers:
try:
for schema in provider.get_tool_schemas():
name = schema.get("name", "")
if name and name not in seen:
schemas.append(schema)
seen.add(name)
except Exception as e:
logger.warning(
"Memory provider '%s' get_tool_schemas() failed: %s",
provider.name, e,
)
return schemas
def get_all_tool_names(self) -> set:
"""Return set of all tool names across all providers."""
return set(self._tool_to_provider.keys())
def has_tool(self, tool_name: str) -> bool:
"""Check if any provider handles this tool."""
return tool_name in self._tool_to_provider
def handle_tool_call(
self, tool_name: str, args: Dict[str, Any], **kwargs
) -> str:
"""Route a tool call to the correct provider.
Returns JSON string result. Raises ValueError if no provider
handles the tool.
"""
provider = self._tool_to_provider.get(tool_name)
if provider is None:
return json.dumps({"error": f"No memory provider handles tool '{tool_name}'"})
try:
return provider.handle_tool_call(tool_name, args, **kwargs)
except Exception as e:
logger.error(
"Memory provider '%s' handle_tool_call(%s) failed: %s",
provider.name, tool_name, e,
)
return json.dumps({"error": f"Memory tool '{tool_name}' failed: {e}"})
# -- Lifecycle hooks -----------------------------------------------------
def on_turn_start(self, turn_number: int, message: str, **kwargs) -> None:
"""Notify all providers of a new turn.
kwargs may include: remaining_tokens, model, platform, tool_count.
"""
for provider in self._providers:
try:
provider.on_turn_start(turn_number, message, **kwargs)
except Exception as e:
logger.debug(
"Memory provider '%s' on_turn_start failed: %s",
provider.name, e,
)
def on_session_end(self, messages: List[Dict[str, Any]]) -> None:
"""Notify all providers of session end."""
for provider in self._providers:
try:
provider.on_session_end(messages)
except Exception as e:
logger.debug(
"Memory provider '%s' on_session_end failed: %s",
provider.name, e,
)
def on_pre_compress(self, messages: List[Dict[str, Any]]) -> str:
"""Notify all providers before context compression.
Returns combined text from providers to include in the compression
summary prompt. Empty string if no provider contributes.
"""
parts = []
for provider in self._providers:
try:
result = provider.on_pre_compress(messages)
if result and result.strip():
parts.append(result)
except Exception as e:
logger.debug(
"Memory provider '%s' on_pre_compress failed: %s",
provider.name, e,
)
return "\n\n".join(parts)
def on_memory_write(self, action: str, target: str, content: str) -> None:
"""Notify external providers when the built-in memory tool writes.
Skips the builtin provider itself (it's the source of the write).
"""
for provider in self._providers:
if provider.name == "builtin":
continue
try:
provider.on_memory_write(action, target, content)
except Exception as e:
logger.debug(
"Memory provider '%s' on_memory_write failed: %s",
provider.name, e,
)
def on_delegation(self, task: str, result: str, *,
child_session_id: str = "", **kwargs) -> None:
"""Notify all providers that a subagent completed."""
for provider in self._providers:
try:
provider.on_delegation(
task, result, child_session_id=child_session_id, **kwargs
)
except Exception as e:
logger.debug(
"Memory provider '%s' on_delegation failed: %s",
provider.name, e,
)
def shutdown_all(self) -> None:
"""Shut down all providers (reverse order for clean teardown)."""
for provider in reversed(self._providers):
try:
provider.shutdown()
except Exception as e:
logger.warning(
"Memory provider '%s' shutdown failed: %s",
provider.name, e,
)
def initialize_all(self, session_id: str, **kwargs) -> None:
"""Initialize all providers.
Automatically injects ``hermes_home`` into *kwargs* so that every
provider can resolve profile-scoped storage paths without importing
``get_hermes_home()`` themselves.
"""
if "hermes_home" not in kwargs:
from hermes_constants import get_hermes_home
kwargs["hermes_home"] = str(get_hermes_home())
for provider in self._providers:
try:
provider.initialize(session_id=session_id, **kwargs)
except Exception as e:
logger.warning(
"Memory provider '%s' initialize failed: %s",
provider.name, e,
)

231
agent/memory_provider.py Normal file
View File

@ -0,0 +1,231 @@
"""Abstract base class for pluggable memory providers.
Memory providers give the agent persistent recall across sessions. One
external provider is active at a time alongside the always-on built-in
memory (MEMORY.md / USER.md). The MemoryManager enforces this limit.
Built-in memory is always active as the first provider and cannot be removed.
External providers (Honcho, Hindsight, Mem0, etc.) are additive — they never
disable the built-in store. Only one external provider runs at a time to
prevent tool schema bloat and conflicting memory backends.
Registration:
1. Built-in: BuiltinMemoryProvider — always present, not removable.
2. Plugins: Ship in plugins/memory/<name>/, activated by memory.provider config.
Lifecycle (called by MemoryManager, wired in run_agent.py):
initialize() — connect, create resources, warm up
system_prompt_block() — static text for the system prompt
prefetch(query) — background recall before each turn
sync_turn(user, asst) — async write after each turn
get_tool_schemas() — tool schemas to expose to the model
handle_tool_call() — dispatch a tool call
shutdown() — clean exit
Optional hooks (override to opt in):
on_turn_start(turn, message, **kwargs) — per-turn tick with runtime context
on_session_end(messages) — end-of-session extraction
on_pre_compress(messages) -> str — extract before context compression
on_memory_write(action, target, content) — mirror built-in memory writes
on_delegation(task, result, **kwargs) — parent-side observation of subagent work
"""
from __future__ import annotations
import logging
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional
logger = logging.getLogger(__name__)
class MemoryProvider(ABC):
"""Abstract base class for memory providers."""
@property
@abstractmethod
def name(self) -> str:
"""Short identifier for this provider (e.g. 'builtin', 'honcho', 'hindsight')."""
# -- Core lifecycle (implement these) ------------------------------------
@abstractmethod
def is_available(self) -> bool:
"""Return True if this provider is configured, has credentials, and is ready.
Called during agent init to decide whether to activate the provider.
Should not make network calls — just check config and installed deps.
"""
@abstractmethod
def initialize(self, session_id: str, **kwargs) -> None:
"""Initialize for a session.
Called once at agent startup. May create resources (banks, tables),
establish connections, start background threads, etc.
kwargs always include:
- hermes_home (str): The active HERMES_HOME directory path. Use this
for profile-scoped storage instead of hardcoding ``~/.hermes``.
- platform (str): "cli", "telegram", "discord", "cron", etc.
kwargs may also include:
- agent_context (str): "primary", "subagent", "cron", or "flush".
Providers should skip writes for non-primary contexts (cron system
prompts would corrupt user representations).
- agent_identity (str): Profile name (e.g. "coder"). Use for
per-profile provider identity scoping.
- agent_workspace (str): Shared workspace name (e.g. "hermes").
- parent_session_id (str): For subagents, the parent's session_id.
- user_id (str): Platform user identifier (gateway sessions).
"""
def system_prompt_block(self) -> str:
"""Return text to include in the system prompt.
Called during system prompt assembly. Return empty string to skip.
This is for STATIC provider info (instructions, status). Prefetched
recall context is injected separately via prefetch().
"""
return ""
def prefetch(self, query: str, *, session_id: str = "") -> str:
"""Recall relevant context for the upcoming turn.
Called before each API call. Return formatted text to inject as
context, or empty string if nothing relevant. Implementations
should be fast — use background threads for the actual recall
and return cached results here.
session_id is provided for providers serving concurrent sessions
(gateway group chats, cached agents). Providers that don't need
per-session scoping can ignore it.
"""
return ""
def queue_prefetch(self, query: str, *, session_id: str = "") -> None:
"""Queue a background recall for the NEXT turn.
Called after each turn completes. The result will be consumed
by prefetch() on the next turn. Default is no-op — providers
that do background prefetching should override this.
"""
def sync_turn(self, user_content: str, assistant_content: str, *, session_id: str = "") -> None:
"""Persist a completed turn to the backend.
Called after each turn. Should be non-blocking — queue for
background processing if the backend has latency.
"""
@abstractmethod
def get_tool_schemas(self) -> List[Dict[str, Any]]:
"""Return tool schemas this provider exposes.
Each schema follows the OpenAI function calling format:
{"name": "...", "description": "...", "parameters": {...}}
Return empty list if this provider has no tools (context-only).
"""
def handle_tool_call(self, tool_name: str, args: Dict[str, Any], **kwargs) -> str:
"""Handle a tool call for one of this provider's tools.
Must return a JSON string (the tool result).
Only called for tool names returned by get_tool_schemas().
"""
raise NotImplementedError(f"Provider {self.name} does not handle tool {tool_name}")
def shutdown(self) -> None:
"""Clean shutdown — flush queues, close connections."""
# -- Optional hooks (override to opt in) ---------------------------------
def on_turn_start(self, turn_number: int, message: str, **kwargs) -> None:
"""Called at the start of each turn with the user message.
Use for turn-counting, scope management, periodic maintenance.
kwargs may include: remaining_tokens, model, platform, tool_count.
Providers use what they need; extras are ignored.
"""
def on_session_end(self, messages: List[Dict[str, Any]]) -> None:
"""Called when a session ends (explicit exit or timeout).
Use for end-of-session fact extraction, summarization, etc.
messages is the full conversation history.
NOT called after every turn — only at actual session boundaries
(CLI exit, /reset, gateway session expiry).
"""
def on_pre_compress(self, messages: List[Dict[str, Any]]) -> str:
"""Called before context compression discards old messages.
Use to extract insights from messages about to be compressed.
messages is the list that will be summarized/discarded.
Return text to include in the compression summary prompt so the
compressor preserves provider-extracted insights. Return empty
string for no contribution (backwards-compatible default).
"""
return ""
def on_delegation(self, task: str, result: str, *,
child_session_id: str = "", **kwargs) -> None:
"""Called on the PARENT agent when a subagent completes.
The parent's memory provider gets the task+result pair as an
observation of what was delegated and what came back. The subagent
itself has no provider session (skip_memory=True).
task: the delegation prompt
result: the subagent's final response
child_session_id: the subagent's session_id
"""
def get_config_schema(self) -> List[Dict[str, Any]]:
"""Return config fields this provider needs for setup.
Used by 'hermes memory setup' to walk the user through configuration.
Each field is a dict with:
key: config key name (e.g. 'api_key', 'mode')
description: human-readable description
secret: True if this should go to .env (default: False)
required: True if required (default: False)
default: default value (optional)
choices: list of valid values (optional)
url: URL where user can get this credential (optional)
env_var: explicit env var name for secrets (default: auto-generated)
Return empty list if no config needed (e.g. local-only providers).
"""
return []
def save_config(self, values: Dict[str, Any], hermes_home: str) -> None:
"""Write non-secret config to the provider's native location.
Called by 'hermes memory setup' after collecting user inputs.
``values`` contains only non-secret fields (secrets go to .env).
``hermes_home`` is the active HERMES_HOME directory path.
Providers with native config files (JSON, YAML) should override
this to write to their expected location. Providers that use only
env vars can leave the default (no-op).
All new memory provider plugins MUST implement either:
- save_config() for native config file formats, OR
- use only env vars (in which case get_config_schema() fields
should all have ``env_var`` set and this method stays no-op).
"""
def on_memory_write(self, action: str, target: str, content: str) -> None:
"""Called when the built-in memory tool writes an entry.
action: 'add', 'replace', or 'remove'
target: 'memory' or 'user'
content: the entry content
Use to mirror built-in memory writes to your backend.
"""

View File

@ -171,10 +171,12 @@ _URL_TO_PROVIDER: Dict[str, str] = {
"dashscope.aliyuncs.com": "alibaba",
"dashscope-intl.aliyuncs.com": "alibaba",
"openrouter.ai": "openrouter",
"generativelanguage.googleapis.com": "google",
"inference-api.nousresearch.com": "nous",
"api.deepseek.com": "deepseek",
"api.githubcopilot.com": "copilot",
"models.github.ai": "copilot",
"api.fireworks.ai": "fireworks",
}

View File

@ -15,6 +15,8 @@ import time
from pathlib import Path
from typing import Any, Dict, Optional
from utils import atomic_json_write
import requests
logger = logging.getLogger(__name__)
@ -41,6 +43,7 @@ PROVIDER_TO_MODELS_DEV: Dict[str, str] = {
"opencode-zen": "opencode",
"opencode-go": "opencode-go",
"kilocode": "kilo",
"fireworks": "fireworks-ai",
}
@ -64,12 +67,10 @@ def _load_disk_cache() -> Dict[str, Any]:
def _save_disk_cache(data: Dict[str, Any]) -> None:
"""Save models.dev data to disk cache."""
"""Save models.dev data to disk cache atomically."""
try:
cache_path = _get_cache_path()
cache_path.parent.mkdir(parents=True, exist_ok=True)
with open(cache_path, "w", encoding="utf-8") as f:
json.dump(data, f, separators=(",", ":"))
atomic_json_write(cache_path, data, indent=None, separators=(",", ":"))
except Exception as e:
logger.debug("Failed to save models.dev disk cache: %s", e)

View File

@ -18,6 +18,7 @@ from typing import Optional
from agent.skill_utils import (
extract_skill_conditions,
extract_skill_description,
get_all_skills_dirs,
get_disabled_skill_names,
iter_skill_index_files,
parse_frontmatter,
@ -186,7 +187,36 @@ TOOL_USE_ENFORCEMENT_GUIDANCE = (
# Model name substrings that trigger tool-use enforcement guidance.
# Add new patterns here when a model family needs explicit steering.
TOOL_USE_ENFORCEMENT_MODELS = ("gpt", "codex")
TOOL_USE_ENFORCEMENT_MODELS = ("gpt", "codex", "gemini", "gemma")
# Gemini/Gemma-specific operational guidance, adapted from OpenCode's gemini.txt.
# Injected alongside TOOL_USE_ENFORCEMENT_GUIDANCE when the model is Gemini or Gemma.
GOOGLE_MODEL_OPERATIONAL_GUIDANCE = (
"# Google model operational directives\n"
"Follow these operational rules strictly:\n"
"- **Absolute paths:** Always construct and use absolute file paths for all "
"file system operations. Combine the project root with relative paths.\n"
"- **Verify first:** Use read_file/search_files to check file contents and "
"project structure before making changes. Never guess at file contents.\n"
"- **Dependency checks:** Never assume a library is available. Check "
"package.json, requirements.txt, Cargo.toml, etc. before importing.\n"
"- **Conciseness:** Keep explanatory text brief — a few sentences, not "
"paragraphs. Focus on actions and results over narration.\n"
"- **Parallel tool calls:** When you need to perform multiple independent "
"operations (e.g. reading several files), make all the tool calls in a "
"single response rather than sequentially.\n"
"- **Non-interactive commands:** Use flags like -y, --yes, --non-interactive "
"to prevent CLI tools from hanging on prompts.\n"
"- **Keep going:** Work autonomously until the task is fully resolved. "
"Don't stop with a plan — execute it.\n"
)
# Model name substrings that should use the 'developer' role instead of
# 'system' for the system prompt. OpenAI's newer models (GPT-5, Codex)
# give stronger instruction-following weight to the 'developer' role.
# The swap happens at the API boundary in _build_api_kwargs() so internal
# message representation stays consistent ("system" everywhere).
DEVELOPER_ROLE_MODELS = ("gpt-5", "codex")
PLATFORM_HINTS = {
"whatsapp": (
@ -444,18 +474,33 @@ def build_skills_system_prompt(
mtime/size manifest — survives process restarts
Falls back to a full filesystem scan when both layers miss.
External skill directories (``skills.external_dirs`` in config.yaml) are
scanned alongside the local ``~/.hermes/skills/`` directory. External dirs
are read-only — they appear in the index but new skills are always created
in the local dir. Local skills take precedence when names collide.
"""
hermes_home = get_hermes_home()
skills_dir = hermes_home / "skills"
external_dirs = get_all_skills_dirs()[1:] # skip local (index 0)
if not skills_dir.exists():
if not skills_dir.exists() and not external_dirs:
return ""
# ── Layer 1: in-process LRU cache ─────────────────────────────────
# Include the resolved platform so per-platform disabled-skill lists
# produce distinct cache entries (gateway serves multiple platforms).
_platform_hint = (
os.environ.get("HERMES_PLATFORM")
or os.environ.get("HERMES_SESSION_PLATFORM")
or ""
)
cache_key = (
str(skills_dir.resolve()),
tuple(str(d) for d in external_dirs),
tuple(sorted(str(t) for t in (available_tools or set()))),
tuple(sorted(str(ts) for ts in (available_toolsets or set()))),
_platform_hint,
)
with _SKILLS_PROMPT_CACHE_LOCK:
cached = _SKILLS_PROMPT_CACHE.get(cache_key)
@ -540,6 +585,56 @@ def build_skills_system_prompt(
category_descriptions,
)
# ── External skill directories ─────────────────────────────────────
# Scan external dirs directly (no snapshot caching — they're read-only
# and typically small). Local skills already in skills_by_category take
# precedence: we track seen names and skip duplicates from external dirs.
seen_skill_names: set[str] = set()
for cat_skills in skills_by_category.values():
for name, _desc in cat_skills:
seen_skill_names.add(name)
for ext_dir in external_dirs:
if not ext_dir.exists():
continue
for skill_file in iter_skill_index_files(ext_dir, "SKILL.md"):
try:
is_compatible, frontmatter, desc = _parse_skill_file(skill_file)
if not is_compatible:
continue
entry = _build_snapshot_entry(skill_file, ext_dir, frontmatter, desc)
skill_name = entry["skill_name"]
if skill_name in seen_skill_names:
continue
if entry["frontmatter_name"] in disabled or skill_name in disabled:
continue
if not _skill_should_show(
extract_skill_conditions(frontmatter),
available_tools,
available_toolsets,
):
continue
seen_skill_names.add(skill_name)
skills_by_category.setdefault(entry["category"], []).append(
(skill_name, entry["description"])
)
except Exception as e:
logger.debug("Error reading external skill %s: %s", skill_file, e)
# External category descriptions
for desc_file in iter_skill_index_files(ext_dir, "DESCRIPTION.md"):
try:
content = desc_file.read_text(encoding="utf-8")
fm, _ = parse_frontmatter(content)
cat_desc = fm.get("description")
if not cat_desc:
continue
rel = desc_file.relative_to(ext_dir)
cat = "/".join(rel.parts[:-1]) if len(rel.parts) > 1 else "general"
category_descriptions.setdefault(cat, str(cat_desc).strip().strip("'\""))
except Exception as e:
logger.debug("Could not read external skill description %s: %s", desc_file, e)
if not skills_by_category:
result = ""
else:
@ -587,6 +682,73 @@ def build_skills_system_prompt(
return result
def build_nous_subscription_prompt(valid_tool_names: "set[str] | None" = None) -> str:
"""Build a compact Nous subscription capability block for the system prompt."""
try:
from hermes_cli.nous_subscription import get_nous_subscription_features
from tools.tool_backend_helpers import managed_nous_tools_enabled
except Exception as exc:
logger.debug("Failed to import Nous subscription helper: %s", exc)
return ""
if not managed_nous_tools_enabled():
return ""
valid_names = set(valid_tool_names or set())
relevant_tool_names = {
"web_search",
"web_extract",
"browser_navigate",
"browser_snapshot",
"browser_click",
"browser_type",
"browser_scroll",
"browser_console",
"browser_close",
"browser_press",
"browser_get_images",
"browser_vision",
"image_generate",
"text_to_speech",
"terminal",
"process",
"execute_code",
}
if valid_names and not (valid_names & relevant_tool_names):
return ""
features = get_nous_subscription_features()
def _status_line(feature) -> str:
if feature.managed_by_nous:
return f"- {feature.label}: active via Nous subscription"
if feature.active:
current = feature.current_provider or "configured provider"
return f"- {feature.label}: currently using {current}"
if feature.included_by_default and features.nous_auth_present:
return f"- {feature.label}: included with Nous subscription, not currently selected"
if feature.key == "modal" and features.nous_auth_present:
return f"- {feature.label}: optional via Nous subscription"
return f"- {feature.label}: not currently available"
lines = [
"# Nous Subscription",
"Nous subscription includes managed web tools (Firecrawl), image generation (FAL), OpenAI TTS, and browser automation (Browserbase) by default. Modal execution is optional.",
"Current capability status:",
]
lines.extend(_status_line(feature) for feature in features.items())
lines.extend(
[
"When a Nous-managed feature is active, do not ask the user for Firecrawl, FAL, OpenAI TTS, or Browserbase API keys.",
"If the user is not subscribed and asks for a capability that Nous subscription would unlock or simplify, suggest Nous subscription as one option alongside direct setup or local alternatives.",
"Do not mention subscription unless the user asks about it or it directly solves the current missing capability.",
"Useful commands: hermes setup, hermes setup tools, hermes setup terminal, hermes status.",
]
)
return "\n".join(lines)
# =========================================================================
# Context files (SOUL.md, AGENTS.md, .cursorrules)
# =========================================================================

View File

@ -13,11 +13,19 @@ import re
logger = logging.getLogger(__name__)
# Snapshot at import time so runtime env mutations (e.g. LLM-generated
# `export HERMES_REDACT_SECRETS=false`) cannot disable redaction mid-session.
_REDACT_ENABLED = os.getenv("HERMES_REDACT_SECRETS", "").lower() not in ("0", "false", "no", "off")
# Known API key prefixes -- match the prefix + contiguous token chars
_PREFIX_PATTERNS = [
r"sk-[A-Za-z0-9_-]{10,}", # OpenAI / OpenRouter / Anthropic (sk-ant-*)
r"ghp_[A-Za-z0-9]{10,}", # GitHub PAT (classic)
r"github_pat_[A-Za-z0-9_]{10,}", # GitHub PAT (fine-grained)
r"gho_[A-Za-z0-9]{10,}", # GitHub OAuth access token
r"ghu_[A-Za-z0-9]{10,}", # GitHub user-to-server token
r"ghs_[A-Za-z0-9]{10,}", # GitHub server-to-server token
r"ghr_[A-Za-z0-9]{10,}", # GitHub refresh token
r"xox[baprs]-[A-Za-z0-9-]{10,}", # Slack tokens
r"AIza[A-Za-z0-9_-]{30,}", # Google API keys
r"pplx-[A-Za-z0-9]{10,}", # Perplexity
@ -37,6 +45,9 @@ _PREFIX_PATTERNS = [
r"dop_v1_[A-Za-z0-9]{10,}", # DigitalOcean PAT
r"doo_v1_[A-Za-z0-9]{10,}", # DigitalOcean OAuth
r"am_[A-Za-z0-9_-]{10,}", # AgentMail API key
r"sk_[A-Za-z0-9_]{10,}", # ElevenLabs TTS key (sk_ underscore, not sk- dash)
r"tvly-[A-Za-z0-9]{10,}", # Tavily search API key
r"exa_[A-Za-z0-9]{10,}", # Exa search API key
]
# ENV assignment patterns: KEY=value where KEY contains a secret-like name
@ -106,7 +117,7 @@ def redact_sensitive_text(text: str) -> str:
text = str(text)
if not text:
return text
if os.getenv("HERMES_REDACT_SECRETS", "").lower() in ("0", "false", "no", "off"):
if not _REDACT_ENABLED:
return text
# Known prefixes (sk-, ghp_, etc.)

View File

@ -128,7 +128,11 @@ def _build_skill_message(
supporting.append(rel)
if supporting and skill_dir:
skill_view_target = str(skill_dir.relative_to(SKILLS_DIR))
try:
skill_view_target = str(skill_dir.relative_to(SKILLS_DIR))
except ValueError:
# Skill is from an external dir — use the skill name instead
skill_view_target = skill_dir.name
parts.append("")
parts.append("[This skill has supporting files you can load with the skill_view tool:]")
for sf in supporting:
@ -158,38 +162,49 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
_skill_commands = {}
try:
from tools.skills_tool import SKILLS_DIR, _parse_frontmatter, skill_matches_platform, _get_disabled_skill_names
if not SKILLS_DIR.exists():
return _skill_commands
from agent.skill_utils import get_external_skills_dirs
disabled = _get_disabled_skill_names()
for skill_md in SKILLS_DIR.rglob("SKILL.md"):
if any(part in ('.git', '.github', '.hub') for part in skill_md.parts):
continue
try:
content = skill_md.read_text(encoding='utf-8')
frontmatter, body = _parse_frontmatter(content)
# Skip skills incompatible with the current OS platform
if not skill_matches_platform(frontmatter):
seen_names: set = set()
# Scan local dir first, then external dirs
dirs_to_scan = []
if SKILLS_DIR.exists():
dirs_to_scan.append(SKILLS_DIR)
dirs_to_scan.extend(get_external_skills_dirs())
for scan_dir in dirs_to_scan:
for skill_md in scan_dir.rglob("SKILL.md"):
if any(part in ('.git', '.github', '.hub') for part in skill_md.parts):
continue
name = frontmatter.get('name', skill_md.parent.name)
# Respect user's disabled skills config
if name in disabled:
try:
content = skill_md.read_text(encoding='utf-8')
frontmatter, body = _parse_frontmatter(content)
# Skip skills incompatible with the current OS platform
if not skill_matches_platform(frontmatter):
continue
name = frontmatter.get('name', skill_md.parent.name)
if name in seen_names:
continue
# Respect user's disabled skills config
if name in disabled:
continue
description = frontmatter.get('description', '')
if not description:
for line in body.strip().split('\n'):
line = line.strip()
if line and not line.startswith('#'):
description = line[:80]
break
seen_names.add(name)
cmd_name = name.lower().replace(' ', '-').replace('_', '-')
_skill_commands[f"/{cmd_name}"] = {
"name": name,
"description": description or f"Invoke the {name} skill",
"skill_md_path": str(skill_md),
"skill_dir": str(skill_md.parent),
}
except Exception:
continue
description = frontmatter.get('description', '')
if not description:
for line in body.strip().split('\n'):
line = line.strip()
if line and not line.startswith('#'):
description = line[:80]
break
cmd_name = name.lower().replace(' ', '-').replace('_', '-')
_skill_commands[f"/{cmd_name}"] = {
"name": name,
"description": description or f"Invoke the {name} skill",
"skill_md_path": str(skill_md),
"skill_dir": str(skill_md.parent),
}
except Exception:
continue
except Exception:
pass
return _skill_commands

View File

@ -118,12 +118,17 @@ def skill_matches_platform(frontmatter: Dict[str, Any]) -> bool:
# ── Disabled skills ───────────────────────────────────────────────────────
def get_disabled_skill_names() -> Set[str]:
def get_disabled_skill_names(platform: str | None = None) -> Set[str]:
"""Read disabled skill names from config.yaml.
Resolves platform from ``HERMES_PLATFORM`` env var, falls back to
the global disabled list. Reads the config file directly (no CLI
config imports) to stay lightweight.
Args:
platform: Explicit platform name (e.g. ``"telegram"``). When
*None*, resolves from ``HERMES_PLATFORM`` or
``HERMES_SESSION_PLATFORM`` env vars. Falls back to the
global disabled list when no platform is determined.
Reads the config file directly (no CLI config imports) to stay
lightweight.
"""
config_path = get_hermes_home() / "config.yaml"
if not config_path.exists():
@ -140,7 +145,11 @@ def get_disabled_skill_names() -> Set[str]:
if not isinstance(skills_cfg, dict):
return set()
resolved_platform = os.getenv("HERMES_PLATFORM")
resolved_platform = (
platform
or os.getenv("HERMES_PLATFORM")
or os.getenv("HERMES_SESSION_PLATFORM")
)
if resolved_platform:
platform_disabled = (skills_cfg.get("platform_disabled") or {}).get(
resolved_platform
@ -158,12 +167,85 @@ def _normalize_string_set(values) -> Set[str]:
return {str(v).strip() for v in values if str(v).strip()}
# ── External skills directories ──────────────────────────────────────────
def get_external_skills_dirs() -> List[Path]:
"""Read ``skills.external_dirs`` from config.yaml and return validated paths.
Each entry is expanded (``~`` and ``${VAR}``) and resolved to an absolute
path. Only directories that actually exist are returned. Duplicates and
paths that resolve to the local ``~/.hermes/skills/`` are silently skipped.
"""
config_path = get_hermes_home() / "config.yaml"
if not config_path.exists():
return []
try:
parsed = yaml_load(config_path.read_text(encoding="utf-8"))
except Exception:
return []
if not isinstance(parsed, dict):
return []
skills_cfg = parsed.get("skills")
if not isinstance(skills_cfg, dict):
return []
raw_dirs = skills_cfg.get("external_dirs")
if not raw_dirs:
return []
if isinstance(raw_dirs, str):
raw_dirs = [raw_dirs]
if not isinstance(raw_dirs, list):
return []
local_skills = (get_hermes_home() / "skills").resolve()
seen: Set[Path] = set()
result: List[Path] = []
for entry in raw_dirs:
entry = str(entry).strip()
if not entry:
continue
# Expand ~ and environment variables
expanded = os.path.expanduser(os.path.expandvars(entry))
p = Path(expanded).resolve()
if p == local_skills:
continue
if p in seen:
continue
if p.is_dir():
seen.add(p)
result.append(p)
else:
logger.debug("External skills dir does not exist, skipping: %s", p)
return result
def get_all_skills_dirs() -> List[Path]:
"""Return all skill directories: local ``~/.hermes/skills/`` first, then external.
The local dir is always first (and always included even if it doesn't exist
yet — callers handle that). External dirs follow in config order.
"""
dirs = [get_hermes_home() / "skills"]
dirs.extend(get_external_skills_dirs())
return dirs
# ── Condition extraction ──────────────────────────────────────────────────
def extract_skill_conditions(frontmatter: Dict[str, Any]) -> Dict[str, List]:
"""Extract conditional activation fields from parsed frontmatter."""
hermes = (frontmatter.get("metadata") or {}).get("hermes") or {}
metadata = frontmatter.get("metadata")
# Handle cases where metadata is not a dict (e.g., a string from malformed YAML)
if not isinstance(metadata, dict):
metadata = {}
hermes = metadata.get("hermes") or {}
if not isinstance(hermes, dict):
hermes = {}
return {
"fallback_for_toolsets": hermes.get("fallback_for_toolsets", []),
"requires_toolsets": hermes.get("requires_toolsets", []),

View File

@ -6,6 +6,8 @@ import os
import re
from typing import Any, Dict, Optional
from utils import is_truthy_value
_COMPLEX_KEYWORDS = {
"debug",
"debugging",
@ -47,13 +49,7 @@ _URL_RE = re.compile(r"https?://|www\.", re.IGNORECASE)
def _coerce_bool(value: Any, default: bool = False) -> bool:
if value is None:
return default
if isinstance(value, bool):
return value
if isinstance(value, str):
return value.strip().lower() in {"1", "true", "yes", "on"}
return bool(value)
return is_truthy_value(value, default=default)
def _coerce_int(value: Any, default: int) -> int:
@ -127,6 +123,7 @@ def resolve_turn_route(user_message: str, routing_config: Optional[Dict[str, Any
"api_mode": primary.get("api_mode"),
"command": primary.get("command"),
"args": list(primary.get("args") or []),
"credential_pool": primary.get("credential_pool"),
},
"label": None,
"signature": (
@ -162,6 +159,7 @@ def resolve_turn_route(user_message: str, routing_config: Optional[Dict[str, Any
"api_mode": primary.get("api_mode"),
"command": primary.get("command"),
"args": list(primary.get("args") or []),
"credential_pool": primary.get("credential_pool"),
},
"label": None,
"signature": (

View File

@ -19,7 +19,7 @@ _TITLE_PROMPT = (
)
def generate_title(user_message: str, assistant_response: str, timeout: float = 15.0) -> Optional[str]:
def generate_title(user_message: str, assistant_response: str, timeout: float = 30.0) -> Optional[str]:
"""Generate a session title from the first exchange.
Uses the auxiliary LLM client (cheapest/fastest available model).

View File

@ -7,17 +7,33 @@
# =============================================================================
model:
# Default model to use (can be overridden with --model flag)
# Both "default" and "model" work as the key name here.
default: "anthropic/claude-opus-4.6"
# Inference provider selection:
# "auto" - Use Nous Portal if logged in, otherwise OpenRouter/env vars (default)
# "nous-api" - Use Nous Portal via API key (requires: NOUS_API_KEY)
# "openrouter" - Always use OpenRouter API key from OPENROUTER_API_KEY
# "nous" - Always use Nous Portal (requires: hermes login)
# "zai" - Use z.ai / ZhipuAI GLM models (requires: GLM_API_KEY)
# "kimi-coding"- Use Kimi / Moonshot AI models (requires: KIMI_API_KEY)
# "minimax" - Use MiniMax global endpoint (requires: MINIMAX_API_KEY)
# "minimax-cn" - Use MiniMax China endpoint (requires: MINIMAX_CN_API_KEY)
# "auto" - Auto-detect from credentials (default)
# "openrouter" - OpenRouter (requires: OPENROUTER_API_KEY or OPENAI_API_KEY)
# "nous" - Nous Portal OAuth (requires: hermes login)
# "nous-api" - Nous Portal API key (requires: NOUS_API_KEY)
# "anthropic" - Direct Anthropic API (requires: ANTHROPIC_API_KEY)
# "openai-codex" - OpenAI Codex (requires: hermes login --provider openai-codex)
# "copilot" - GitHub Copilot / GitHub Models (requires: GITHUB_TOKEN)
# "zai" - z.ai / ZhipuAI GLM (requires: GLM_API_KEY)
# "kimi-coding" - Kimi / Moonshot AI (requires: KIMI_API_KEY)
# "minimax" - MiniMax global (requires: MINIMAX_API_KEY)
# "minimax-cn" - MiniMax China (requires: MINIMAX_CN_API_KEY)
# "huggingface" - Hugging Face Inference (requires: HF_TOKEN)
# "kilocode" - KiloCode gateway (requires: KILOCODE_API_KEY)
# "ai-gateway" - Vercel AI Gateway (requires: AI_GATEWAY_API_KEY)
#
# Local servers (LM Studio, Ollama, vLLM, llama.cpp):
# "custom" - Any OpenAI-compatible endpoint. Set base_url below.
# Aliases: "lmstudio", "ollama", "vllm", "llamacpp" all map to "custom".
# Example for LM Studio:
# provider: "lmstudio"
# base_url: "http://localhost:1234/v1"
# No API key needed — local servers typically ignore auth.
#
# Can also be overridden with --provider flag or HERMES_INFERENCE_PROVIDER env var.
provider: "auto"
@ -308,6 +324,9 @@ compression:
# vision:
# provider: "auto"
# model: "" # e.g. "google/gemini-2.5-flash", "openai/gpt-4o"
# timeout: 30 # LLM API call timeout (seconds)
# download_timeout: 30 # Image HTTP download timeout (seconds)
# # Increase for slow connections or self-hosted image servers
#
# # Web page scraping / summarization + browser page text extraction
# web_extract:
@ -401,6 +420,15 @@ skills:
# Set to 0 to disable.
creation_nudge_interval: 15
# External skill directories — share skills across tools/agents without
# copying them into ~/.hermes/skills/. Each path is expanded (~ and ${VAR})
# and resolved to an absolute path. External dirs are read-only: skill
# creation always writes to ~/.hermes/skills/. Local skills take precedence
# when names collide.
# external_dirs:
# - ~/.agents/skills
# - /home/shared/team-skills
# =============================================================================
# Agent Behavior
# =============================================================================

848
cli.py

File diff suppressed because it is too large Load Diff

View File

@ -9,6 +9,7 @@ runs at a time if multiple processes overlap.
"""
import asyncio
import concurrent.futures
import json
import logging
import os
@ -26,6 +27,7 @@ except ImportError:
msvcrt = None
from pathlib import Path
from hermes_constants import get_hermes_home
from hermes_cli.config import load_config
from typing import Optional
from hermes_time import now as _hermes_now
@ -86,6 +88,22 @@ def _resolve_delivery_target(job: dict) -> Optional[dict]:
chat_id, thread_id = rest.split(":", 1)
else:
chat_id, thread_id = rest, None
# Resolve human-friendly labels like "Alice (dm)" to real IDs.
# send_message(action="list") shows labels with display suffixes
# that aren't valid platform IDs (e.g. WhatsApp JIDs).
try:
from gateway.channel_directory import resolve_channel_name
target = chat_id
# Strip display suffix like " (dm)" or " (group)"
if target.endswith(")") and " (" in target:
target = target.rsplit(" (", 1)[0].strip()
resolved = resolve_channel_name(platform_name.lower(), target)
if resolved:
chat_id = resolved
except Exception:
pass
return {
"platform": platform_name,
"chat_id": chat_id,
@ -145,6 +163,8 @@ def _deliver_result(job: dict, content: str) -> None:
"mattermost": Platform.MATTERMOST,
"homeassistant": Platform.HOMEASSISTANT,
"dingtalk": Platform.DINGTALK,
"feishu": Platform.FEISHU,
"wecom": Platform.WECOM,
"email": Platform.EMAIL,
"sms": Platform.SMS,
}
@ -164,18 +184,29 @@ def _deliver_result(job: dict, content: str) -> None:
logger.warning("Job '%s': platform '%s' not configured/enabled", job["id"], platform_name)
return
# Wrap the content so the user knows this is a cron delivery and that
# the interactive agent has no visibility into it.
task_name = job.get("name", job["id"])
wrapped = (
f"Cronjob Response: {task_name}\n"
f"-------------\n\n"
f"{content}\n\n"
f"Note: The agent cannot see this message, and therefore cannot respond to it."
)
# Optionally wrap the content with a header/footer so the user knows this
# is a cron delivery. Wrapping is on by default; set cron.wrap_response: false
# in config.yaml for clean output.
wrap_response = True
try:
user_cfg = load_config()
wrap_response = user_cfg.get("cron", {}).get("wrap_response", True)
except Exception:
pass
if wrap_response:
task_name = job.get("name", job["id"])
delivery_content = (
f"Cronjob Response: {task_name}\n"
f"-------------\n\n"
f"{content}\n\n"
f"Note: The agent cannot see this message, and therefore cannot respond to it."
)
else:
delivery_content = content
# Run the async send in a fresh event loop (safe from any thread)
coro = _send_to_platform(platform, pconfig, chat_id, wrapped, thread_id=thread_id)
coro = _send_to_platform(platform, pconfig, chat_id, delivery_content, thread_id=thread_id)
try:
result = asyncio.run(coro)
except RuntimeError:
@ -186,7 +217,7 @@ def _deliver_result(job: dict, content: str) -> None:
coro.close()
import concurrent.futures
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
future = pool.submit(asyncio.run, _send_to_platform(platform, pconfig, chat_id, wrapped, thread_id=thread_id))
future = pool.submit(asyncio.run, _send_to_platform(platform, pconfig, chat_id, delivery_content, thread_id=thread_id))
result = future.result(timeout=30)
except Exception as e:
logger.error("Job '%s': delivery to %s:%s failed: %s", job["id"], platform_name, chat_id, e)
@ -206,11 +237,12 @@ def _build_job_prompt(job: dict) -> str:
# Always prepend [SILENT] guidance so the cron agent can suppress
# delivery when it has nothing new or noteworthy to report.
silent_hint = (
"[SYSTEM: If you have nothing new or noteworthy to report, respond "
"with exactly \"[SILENT]\" (optionally followed by a brief internal "
"note). This suppresses delivery to the user while still saving "
"output locally. Only use [SILENT] when there are genuinely no "
"changes worth reporting.]\n\n"
"[SYSTEM: If you have a meaningful status report or findings, "
"send them — that is the whole point of this job. Only respond "
"with exactly \"[SILENT]\" (nothing else) when there is genuinely "
"nothing new to report. [SILENT] suppresses delivery to the user. "
"Never combine [SILENT] with content — either report your "
"findings normally, or say [SILENT] and nothing more.]\n\n"
)
prompt = silent_hint + prompt
if skills is None:
@ -308,7 +340,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
if delivery_target.get("thread_id") is not None:
os.environ["HERMES_CRON_AUTO_DELIVER_THREAD_ID"] = str(delivery_target["thread_id"])
model = job.get("model") or os.getenv("HERMES_MODEL") or "anthropic/claude-opus-4.6"
model = job.get("model") or os.getenv("HERMES_MODEL") or ""
# Load config.yaml for model, reasoning, prefill, toolsets, provider routing
_cfg = {}
@ -406,13 +438,36 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
provider_sort=pr.get("sort"),
disabled_toolsets=["cronjob", "messaging", "clarify"],
quiet_mode=True,
skip_memory=True, # Cron system prompts would corrupt user representations
platform="cron",
session_id=_cron_session_id,
session_db=_session_db,
)
result = agent.run_conversation(prompt)
# Run the agent with a timeout so a hung API call or tool doesn't
# block the cron ticker thread indefinitely. Default 10 minutes;
# override via env var. Uses a separate thread because
# run_conversation is synchronous.
_cron_timeout = float(os.getenv("HERMES_CRON_TIMEOUT", 600))
_cron_pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
_cron_future = _cron_pool.submit(agent.run_conversation, prompt)
try:
result = _cron_future.result(timeout=_cron_timeout)
except concurrent.futures.TimeoutError:
logger.error(
"Job '%s' timed out after %.0fs — interrupting agent",
job_name, _cron_timeout,
)
if hasattr(agent, "interrupt"):
agent.interrupt("Cron job timed out")
_cron_pool.shutdown(wait=False, cancel_futures=True)
raise TimeoutError(
f"Cron job '{job_name}' timed out after "
f"{int(_cron_timeout // 60)} minutes"
)
finally:
_cron_pool.shutdown(wait=False)
final_response = result.get("final_response", "") or ""
# Use a separate variable for log display; keep final_response clean
# for delivery logic (empty response = no delivery).

15
docker/SOUL.md Normal file
View File

@ -0,0 +1,15 @@
# Hermes Agent Persona
<!--
This file defines the agent's personality and tone.
The agent will embody whatever you write here.
Edit this to customize how Hermes communicates with you.
Examples:
- "You are a warm, playful assistant who uses kaomoji occasionally."
- "You are a concise technical expert. No fluff, just facts."
- "You speak like a friendly coworker who happens to know everything."
This file is loaded fresh each message -- no restart needed.
Delete the contents (or this file) to use the default personality.
-->

34
docker/entrypoint.sh Normal file
View File

@ -0,0 +1,34 @@
#!/bin/bash
# Docker entrypoint: bootstrap config files into the mounted volume, then run hermes.
set -e
HERMES_HOME="/opt/data"
INSTALL_DIR="/opt/hermes"
# Create essential directory structure. Cache and platform directories
# (cache/images, cache/audio, platforms/whatsapp, etc.) are created on
# demand by the application — don't pre-create them here so new installs
# get the consolidated layout from get_hermes_dir().
mkdir -p "$HERMES_HOME"/{cron,sessions,logs,hooks,memories,skills}
# .env
if [ ! -f "$HERMES_HOME/.env" ]; then
cp "$INSTALL_DIR/.env.example" "$HERMES_HOME/.env"
fi
# config.yaml
if [ ! -f "$HERMES_HOME/config.yaml" ]; then
cp "$INSTALL_DIR/cli-config.yaml.example" "$HERMES_HOME/config.yaml"
fi
# SOUL.md
if [ ! -f "$HERMES_HOME/SOUL.md" ]; then
cp "$INSTALL_DIR/docker/SOUL.md" "$HERMES_HOME/SOUL.md"
fi
# Sync bundled skills (manifest-based so user edits are preserved)
if [ -d "$INSTALL_DIR/skills" ]; then
python3 "$INSTALL_DIR/tools/skills_sync.py"
fi
exec hermes "$@"

View File

@ -76,14 +76,13 @@ Open Zed settings (`Cmd+,` on macOS or `Ctrl+,` on Linux) and add to your
```json
{
"acp": {
"agents": [
{
"name": "hermes-agent",
"registry_dir": "/path/to/hermes-agent/acp_registry"
}
]
}
"agent_servers": {
"hermes-agent": {
"type": "custom",
"command": "hermes",
"args": ["acp"],
},
},
}
```

View File

@ -209,7 +209,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
# Agent settings -- TB2 tasks are complex, need many turns
max_agent_turns=60,
max_token_length=***
max_token_length=16000,
agent_temperature=0.6,
system_prompt=None,
@ -233,7 +233,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
steps_per_eval=1,
total_steps=1,
tokenizer_name="NousRe...1-8B",
tokenizer_name="NousResearch/Hermes-3-Llama-3.1-8B",
use_wandb=True,
wandb_name="terminal-bench-2",
ensure_scores_are_not_same=False, # Binary rewards may all be 0 or 1
@ -245,7 +245,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
base_url="https://openrouter.ai/api/v1",
model_name="anthropic/claude-sonnet-4",
server_type="openai",
api_key=os.get...EY", ""),
api_key=os.getenv("OPENROUTER_API_KEY", ""),
health_check=False,
)
]
@ -513,3 +513,446 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
reward = 0.0
else:
# Run tests in a thread so the blocking ctx.terminal() calls
# don't freeze the entire event loop (which would stall all
# other tasks, tqdm updates, and timeout timers).
ctx = ToolContext(task_id)
try:
loop = asyncio.get_event_loop()
reward = await loop.run_in_executor(
None, # default thread pool
self._run_tests, eval_item, ctx, task_name,
)
except Exception as e:
logger.error("Task %s: test verification failed: %s", task_name, e)
reward = 0.0
finally:
ctx.cleanup()
passed = reward == 1.0
status = "PASS" if passed else "FAIL"
elapsed = time.time() - task_start
tqdm.write(f" [{status}] {task_name} (turns={result.turns_used}, {elapsed:.0f}s)")
logger.info(
"Task %s: reward=%.1f, turns=%d, finished=%s",
task_name, reward, result.turns_used, result.finished_naturally,
)
out = {
"passed": passed,
"reward": reward,
"task_name": task_name,
"category": category,
"turns_used": result.turns_used,
"finished_naturally": result.finished_naturally,
"messages": result.messages,
}
self._save_result(out)
return out
except Exception as e:
elapsed = time.time() - task_start
logger.error("Task %s: rollout failed: %s", task_name, e, exc_info=True)
tqdm.write(f" [ERROR] {task_name}: {e} ({elapsed:.0f}s)")
out = {
"passed": False, "reward": 0.0,
"task_name": task_name, "category": category,
"error": str(e),
}
self._save_result(out)
return out
finally:
# --- Cleanup: clear overrides, sandbox, and temp files ---
clear_task_env_overrides(task_id)
try:
cleanup_vm(task_id)
except Exception as e:
logger.debug("VM cleanup for %s: %s", task_id[:8], e)
if task_dir and task_dir.exists():
shutil.rmtree(task_dir, ignore_errors=True)
def _run_tests(
self, item: Dict[str, Any], ctx: ToolContext, task_name: str
) -> float:
"""
Upload and execute the test suite in the agent's sandbox, then
download the verifier output locally to read the reward.
Follows Harbor's verification pattern:
1. Upload tests/ directory into the sandbox
2. Execute test.sh inside the sandbox
3. Download /logs/verifier/ directory to a local temp dir
4. Read reward.txt locally with native Python I/O
Downloading locally avoids issues with the file_read tool on
the Modal VM and matches how Harbor handles verification.
TB2 test scripts (test.sh) typically:
1. Install pytest via uv/pip
2. Run pytest against the test files in /tests/
3. Write results to /logs/verifier/reward.txt
Args:
item: The TB2 task dict (contains tests_tar, test_sh)
ctx: ToolContext scoped to this task's sandbox
task_name: For logging
Returns:
1.0 if tests pass, 0.0 otherwise
"""
tests_tar = item.get("tests_tar", "")
test_sh = item.get("test_sh", "")
if not test_sh:
logger.warning("Task %s: no test_sh content, reward=0", task_name)
return 0.0
# Create required directories in the sandbox
ctx.terminal("mkdir -p /tests /logs/verifier")
# Upload test files into the sandbox (binary-safe via base64)
if tests_tar:
tests_temp = Path(tempfile.mkdtemp(prefix=f"tb2-tests-{task_name}-"))
try:
_extract_base64_tar(tests_tar, tests_temp)
ctx.upload_dir(str(tests_temp), "/tests")
except Exception as e:
logger.warning("Task %s: failed to upload test files: %s", task_name, e)
finally:
shutil.rmtree(tests_temp, ignore_errors=True)
# Write the test runner script (test.sh)
ctx.write_file("/tests/test.sh", test_sh)
ctx.terminal("chmod +x /tests/test.sh")
# Execute the test suite
logger.info(
"Task %s: running test suite (timeout=%ds)",
task_name, self.config.test_timeout,
)
test_result = ctx.terminal(
"bash /tests/test.sh",
timeout=self.config.test_timeout,
)
exit_code = test_result.get("exit_code", -1)
output = test_result.get("output", "")
# Download the verifier output directory locally, then read reward.txt
# with native Python I/O. This avoids issues with file_read on the
# Modal VM and matches Harbor's verification pattern.
reward = 0.0
local_verifier_dir = Path(tempfile.mkdtemp(prefix=f"tb2-verifier-{task_name}-"))
try:
ctx.download_dir("/logs/verifier", str(local_verifier_dir))
reward_file = local_verifier_dir / "reward.txt"
if reward_file.exists() and reward_file.stat().st_size > 0:
content = reward_file.read_text().strip()
if content == "1":
reward = 1.0
elif content == "0":
reward = 0.0
else:
# Unexpected content -- try parsing as float
try:
reward = float(content)
except (ValueError, TypeError):
logger.warning(
"Task %s: reward.txt content unexpected (%r), "
"falling back to exit_code=%d",
task_name, content, exit_code,
)
reward = 1.0 if exit_code == 0 else 0.0
else:
# reward.txt not written -- fall back to exit code
logger.warning(
"Task %s: reward.txt not found after download, "
"falling back to exit_code=%d",
task_name, exit_code,
)
reward = 1.0 if exit_code == 0 else 0.0
except Exception as e:
logger.warning(
"Task %s: failed to download verifier dir: %s, "
"falling back to exit_code=%d",
task_name, e, exit_code,
)
reward = 1.0 if exit_code == 0 else 0.0
finally:
shutil.rmtree(local_verifier_dir, ignore_errors=True)
# Log test output for debugging failures
if reward == 0.0:
output_preview = output[-500:] if output else "(no output)"
logger.info(
"Task %s: FAIL (exit_code=%d)\n%s",
task_name, exit_code, output_preview,
)
return reward
# =========================================================================
# Evaluate -- main entry point for the eval subcommand
# =========================================================================
async def _eval_with_timeout(self, item: Dict[str, Any]) -> Dict:
"""
Wrap rollout_and_score_eval with a per-task wall-clock timeout.
If the task exceeds task_timeout seconds, it's automatically scored
as FAIL. This prevents any single task from hanging indefinitely.
"""
task_name = item.get("task_name", "unknown")
category = item.get("category", "unknown")
try:
return await asyncio.wait_for(
self.rollout_and_score_eval(item),
timeout=self.config.task_timeout,
)
except asyncio.TimeoutError:
from tqdm import tqdm
elapsed = self.config.task_timeout
tqdm.write(f" [TIMEOUT] {task_name} (exceeded {elapsed}s wall-clock limit)")
logger.error("Task %s: wall-clock timeout after %ds", task_name, elapsed)
out = {
"passed": False, "reward": 0.0,
"task_name": task_name, "category": category,
"error": f"timeout ({elapsed}s)",
}
self._save_result(out)
return out
async def evaluate(self, *args, **kwargs) -> None:
"""
Run Terminal-Bench 2.0 evaluation over all tasks.
This is the main entry point when invoked via:
python environments/terminalbench2_env.py evaluate
Runs all tasks through rollout_and_score_eval() via asyncio.gather()
(same pattern as GPQA and other Atropos eval envs). Each task is
wrapped with a wall-clock timeout so hung tasks auto-fail.
Suppresses noisy Modal/terminal output (HERMES_QUIET) so the tqdm
bar stays visible.
"""
start_time = time.time()
# Route all logging through tqdm.write() so the progress bar stays
# pinned at the bottom while log lines scroll above it.
from tqdm import tqdm
class _TqdmHandler(logging.Handler):
def emit(self, record):
try:
tqdm.write(self.format(record))
except Exception:
self.handleError(record)
handler = _TqdmHandler()
handler.setFormatter(logging.Formatter(
"%(asctime)s [%(name)s] %(levelname)s: %(message)s",
datefmt="%H:%M:%S",
))
root = logging.getLogger()
root.handlers = [handler] # Replace any existing handlers
root.setLevel(logging.INFO)
# Silence noisy third-party loggers that flood the output
logging.getLogger("httpx").setLevel(logging.WARNING) # Every HTTP request
logging.getLogger("openai").setLevel(logging.WARNING) # OpenAI client retries
logging.getLogger("rex-deploy").setLevel(logging.WARNING) # Swerex deployment
logging.getLogger("rex_image_builder").setLevel(logging.WARNING) # Image builds
print(f"\n{'='*60}")
print("Starting Terminal-Bench 2.0 Evaluation")
print(f"{'='*60}")
print(f" Dataset: {self.config.dataset_name}")
print(f" Total tasks: {len(self.all_eval_items)}")
print(f" Max agent turns: {self.config.max_agent_turns}")
print(f" Task timeout: {self.config.task_timeout}s")
print(f" Terminal backend: {self.config.terminal_backend}")
print(f" Tool thread pool: {self.config.tool_pool_size}")
print(f" Terminal timeout: {self.config.terminal_timeout}s/cmd")
print(f" Terminal lifetime: {self.config.terminal_lifetime}s (auto: task_timeout + 120)")
print(f" Max concurrent tasks: {self.config.max_concurrent_tasks}")
print(f"{'='*60}\n")
# Semaphore to limit concurrent Modal sandbox creations.
# Without this, all 86 tasks fire simultaneously, each creating a Modal
# sandbox via asyncio.run() inside a thread pool worker. Modal's blocking
# calls (App.lookup, etc.) deadlock when too many are created at once.
semaphore = asyncio.Semaphore(self.config.max_concurrent_tasks)
async def _eval_with_semaphore(item):
async with semaphore:
return await self._eval_with_timeout(item)
# Fire all tasks with wall-clock timeout, track live accuracy on the bar
total_tasks = len(self.all_eval_items)
eval_tasks = [
asyncio.ensure_future(_eval_with_semaphore(item))
for item in self.all_eval_items
]
results = []
passed_count = 0
pbar = tqdm(total=total_tasks, desc="Evaluating TB2", dynamic_ncols=True)
try:
for coro in asyncio.as_completed(eval_tasks):
result = await coro
results.append(result)
if result and result.get("passed"):
passed_count += 1
done = len(results)
pct = (passed_count / done * 100) if done else 0
pbar.set_postfix_str(f"pass={passed_count}/{done} ({pct:.1f}%)")
pbar.update(1)
except (KeyboardInterrupt, asyncio.CancelledError):
pbar.close()
print(f"\n\nInterrupted! Cleaning up {len(eval_tasks)} tasks...")
# Cancel all pending tasks
for task in eval_tasks:
task.cancel()
# Let cancellations propagate (finally blocks run cleanup_vm)
await asyncio.gather(*eval_tasks, return_exceptions=True)
# Belt-and-suspenders: clean up any remaining sandboxes
from tools.terminal_tool import cleanup_all_environments
cleanup_all_environments()
print("All sandboxes cleaned up.")
return
finally:
pbar.close()
end_time = time.time()
# Filter out None results (shouldn't happen, but be safe)
valid_results = [r for r in results if r is not None]
if not valid_results:
print("Warning: No valid evaluation results obtained")
return
# ---- Compute metrics ----
total = len(valid_results)
passed = sum(1 for r in valid_results if r.get("passed"))
overall_pass_rate = passed / total if total > 0 else 0.0
# Per-category breakdown
cat_results: Dict[str, List[Dict]] = defaultdict(list)
for r in valid_results:
cat_results[r.get("category", "unknown")].append(r)
# Build metrics dict
eval_metrics = {
"eval/pass_rate": overall_pass_rate,
"eval/total_tasks": total,
"eval/passed_tasks": passed,
"eval/evaluation_time_seconds": end_time - start_time,
}
# Per-category metrics
for category, cat_items in sorted(cat_results.items()):
cat_passed = sum(1 for r in cat_items if r.get("passed"))
cat_total = len(cat_items)
cat_pass_rate = cat_passed / cat_total if cat_total > 0 else 0.0
cat_key = category.replace(" ", "_").replace("-", "_").lower()
eval_metrics[f"eval/pass_rate_{cat_key}"] = cat_pass_rate
# Store metrics for wandb_log
self.eval_metrics = [(k, v) for k, v in eval_metrics.items()]
# ---- Print summary ----
print(f"\n{'='*60}")
print("Terminal-Bench 2.0 Evaluation Results")
print(f"{'='*60}")
print(f"Overall Pass Rate: {overall_pass_rate:.4f} ({passed}/{total})")
print(f"Evaluation Time: {end_time - start_time:.1f} seconds")
print("\nCategory Breakdown:")
for category, cat_items in sorted(cat_results.items()):
cat_passed = sum(1 for r in cat_items if r.get("passed"))
cat_total = len(cat_items)
cat_rate = cat_passed / cat_total if cat_total > 0 else 0.0
print(f" {category}: {cat_rate:.1%} ({cat_passed}/{cat_total})")
# Print individual task results
print("\nTask Results:")
for r in sorted(valid_results, key=lambda x: x.get("task_name", "")):
status = "PASS" if r.get("passed") else "FAIL"
turns = r.get("turns_used", "?")
error = r.get("error", "")
extra = f" (error: {error})" if error else ""
print(f" [{status}] {r['task_name']} (turns={turns}){extra}")
print(f"{'='*60}\n")
# Build sample records for evaluate_log (includes full conversations)
samples = [
{
"task_name": r.get("task_name"),
"category": r.get("category"),
"passed": r.get("passed"),
"reward": r.get("reward"),
"turns_used": r.get("turns_used"),
"error": r.get("error"),
"messages": r.get("messages"),
}
for r in valid_results
]
# Log evaluation results
try:
await self.evaluate_log(
metrics=eval_metrics,
samples=samples,
start_time=start_time,
end_time=end_time,
generation_parameters={
"temperature": self.config.agent_temperature,
"max_tokens": self.config.max_token_length,
"max_agent_turns": self.config.max_agent_turns,
"terminal_backend": self.config.terminal_backend,
},
)
except Exception as e:
print(f"Error logging evaluation results: {e}")
# Close streaming file
if hasattr(self, "_streaming_file") and not self._streaming_file.closed:
self._streaming_file.close()
print(f" Live results saved to: {self._streaming_path}")
# Kill all remaining sandboxes. Timed-out tasks leave orphaned thread
# pool workers still executing commands -- cleanup_all stops them.
from tools.terminal_tool import cleanup_all_environments
print("\nCleaning up all sandboxes...")
cleanup_all_environments()
# Shut down the tool thread pool so orphaned workers from timed-out
# tasks are killed immediately instead of retrying against dead
# sandboxes and spamming the console with TimeoutError warnings.
from environments.agent_loop import _tool_executor
_tool_executor.shutdown(wait=False, cancel_futures=True)
print("Done.")
# =========================================================================
# Wandb logging
# =========================================================================
async def wandb_log(self, wandb_metrics: Optional[Dict] = None):
"""Log TB2-specific metrics to wandb."""
if wandb_metrics is None:
wandb_metrics = {}
# Add stored eval metrics
for metric_name, metric_value in self.eval_metrics:
wandb_metrics[metric_name] = metric_value
self.eval_metrics = []
await super().wandb_log(wandb_metrics)
if __name__ == "__main__":
TerminalBench2EvalEnv.cli()

View File

@ -11,11 +11,11 @@ Solution:
_AsyncWorker thread internally, making it safe for both CLI and Atropos use.
No monkey-patching is required.
This module is kept for backward compatibility apply_patches() is now a no-op.
This module is kept for backward compatibility. apply_patches() is a no-op.
Usage:
Call apply_patches() once at import time (done automatically by hermes_base_env.py).
This is idempotent — calling it multiple times is safe.
This is idempotent and safe to call multiple times.
"""
import logging
@ -26,17 +26,10 @@ _patches_applied = False
def apply_patches():
"""Apply all monkey patches needed for Atropos compatibility.
Now a no-op — Modal async safety is built directly into ModalEnvironment.
Safe to call multiple times.
"""
"""Apply all monkey patches needed for Atropos compatibility."""
global _patches_applied
if _patches_applied:
return
# Modal async-safety is now built into tools/environments/modal.py
# via the _AsyncWorker class. No monkey-patching needed.
logger.debug("apply_patches() called — no patches needed (async safety is built-in)")
logger.debug("apply_patches() called; no patches needed (async safety is built-in)")
_patches_applied = True

View File

@ -0,0 +1 @@
"""Built-in gateway hooks that are always registered."""

View File

@ -0,0 +1,86 @@
"""Built-in boot-md hook — run ~/.hermes/BOOT.md on gateway startup.
This hook is always registered. It silently skips if no BOOT.md exists.
To activate, create ``~/.hermes/BOOT.md`` with instructions for the
agent to execute on every gateway restart.
Example BOOT.md::
# Startup Checklist
1. Check if any cron jobs failed overnight
2. Send a status update to Discord #general
3. If there are errors in /opt/app/deploy.log, summarize them
The agent runs in a background thread so it doesn't block gateway
startup. If nothing needs attention, it replies with [SILENT] to
suppress delivery.
"""
import logging
import os
import threading
from pathlib import Path
logger = logging.getLogger("hooks.boot-md")
HERMES_HOME = Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes"))
BOOT_FILE = HERMES_HOME / "BOOT.md"
def _build_boot_prompt(content: str) -> str:
"""Wrap BOOT.md content in a system-level instruction."""
return (
"You are running a startup boot checklist. Follow the BOOT.md "
"instructions below exactly.\n\n"
"---\n"
f"{content}\n"
"---\n\n"
"Execute each instruction. If you need to send a message to a "
"platform, use the send_message tool.\n"
"If nothing needs attention and there is nothing to report, "
"reply with ONLY: [SILENT]"
)
def _run_boot_agent(content: str) -> None:
"""Spawn a one-shot agent session to execute the boot instructions."""
try:
from run_agent import AIAgent
prompt = _build_boot_prompt(content)
agent = AIAgent(
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
max_iterations=20,
)
result = agent.run_conversation(prompt)
response = result.get("final_response", "")
if response and "[SILENT]" not in response:
logger.info("boot-md completed: %s", response[:200])
else:
logger.info("boot-md completed (nothing to report)")
except Exception as e:
logger.error("boot-md agent failed: %s", e)
async def handle(event_type: str, context: dict) -> None:
"""Gateway startup handler — run BOOT.md if it exists."""
if not BOOT_FILE.exists():
return
content = BOOT_FILE.read_text(encoding="utf-8").strip()
if not content:
return
logger.info("Running BOOT.md (%d chars)", len(content))
# Run in a background thread so we don't block gateway startup.
thread = threading.Thread(
target=_run_boot_agent,
args=(content,),
name="boot-md",
daemon=True,
)
thread.start()

View File

@ -17,6 +17,7 @@ from typing import Dict, List, Optional, Any
from enum import Enum
from hermes_cli.config import get_hermes_home
from utils import is_truthy_value
logger = logging.getLogger(__name__)
@ -25,11 +26,14 @@ def _coerce_bool(value: Any, default: bool = True) -> bool:
"""Coerce bool-ish config values, preserving a caller-provided default."""
if value is None:
return default
if isinstance(value, bool):
return value
if isinstance(value, str):
return value.strip().lower() in ("true", "1", "yes", "on")
return bool(value)
lowered = value.strip().lower()
if lowered in ("true", "1", "yes", "on"):
return True
if lowered in ("false", "0", "no", "off"):
return False
return default
return is_truthy_value(value, default=default)
def _normalize_unauthorized_dm_behavior(value: Any, default: str = "pair") -> str:
@ -57,6 +61,8 @@ class Platform(Enum):
DINGTALK = "dingtalk"
API_SERVER = "api_server"
WEBHOOK = "webhook"
FEISHU = "feishu"
WECOM = "wecom"
@dataclass
@ -274,6 +280,12 @@ class GatewayConfig:
# Webhook uses enabled flag only (secrets are per-route)
elif platform == Platform.WEBHOOK:
connected.append(platform)
# Feishu uses extra dict for app credentials
elif platform == Platform.FEISHU and config.extra.get("app_id"):
connected.append(platform)
# WeCom uses extra dict for bot credentials
elif platform == Platform.WECOM and config.extra.get("bot_id"):
connected.append(platform)
return connected
def get_home_channel(self, platform: Platform) -> Optional[HomeChannel]:
@ -507,6 +519,10 @@ def load_gateway_config() -> GatewayConfig:
)
if "reply_prefix" in platform_cfg:
bridged["reply_prefix"] = platform_cfg["reply_prefix"]
if "require_mention" in platform_cfg:
bridged["require_mention"] = platform_cfg["require_mention"]
if "mention_patterns" in platform_cfg:
bridged["mention_patterns"] = platform_cfg["mention_patterns"]
if not bridged:
continue
plat_data = platforms_data.setdefault(plat.value, {})
@ -531,6 +547,34 @@ def load_gateway_config() -> GatewayConfig:
os.environ["DISCORD_FREE_RESPONSE_CHANNELS"] = str(frc)
if "auto_thread" in discord_cfg and not os.getenv("DISCORD_AUTO_THREAD"):
os.environ["DISCORD_AUTO_THREAD"] = str(discord_cfg["auto_thread"]).lower()
if "reactions" in discord_cfg and not os.getenv("DISCORD_REACTIONS"):
os.environ["DISCORD_REACTIONS"] = str(discord_cfg["reactions"]).lower()
# Telegram settings → env vars (env vars take precedence)
telegram_cfg = yaml_cfg.get("telegram", {})
if isinstance(telegram_cfg, dict):
if "require_mention" in telegram_cfg and not os.getenv("TELEGRAM_REQUIRE_MENTION"):
os.environ["TELEGRAM_REQUIRE_MENTION"] = str(telegram_cfg["require_mention"]).lower()
if "mention_patterns" in telegram_cfg and not os.getenv("TELEGRAM_MENTION_PATTERNS"):
import json as _json
os.environ["TELEGRAM_MENTION_PATTERNS"] = _json.dumps(telegram_cfg["mention_patterns"])
frc = telegram_cfg.get("free_response_chats")
if frc is not None and not os.getenv("TELEGRAM_FREE_RESPONSE_CHATS"):
if isinstance(frc, list):
frc = ",".join(str(v) for v in frc)
os.environ["TELEGRAM_FREE_RESPONSE_CHATS"] = str(frc)
whatsapp_cfg = yaml_cfg.get("whatsapp", {})
if isinstance(whatsapp_cfg, dict):
if "require_mention" in whatsapp_cfg and not os.getenv("WHATSAPP_REQUIRE_MENTION"):
os.environ["WHATSAPP_REQUIRE_MENTION"] = str(whatsapp_cfg["require_mention"]).lower()
if "mention_patterns" in whatsapp_cfg and not os.getenv("WHATSAPP_MENTION_PATTERNS"):
os.environ["WHATSAPP_MENTION_PATTERNS"] = json.dumps(whatsapp_cfg["mention_patterns"])
frc = whatsapp_cfg.get("free_response_chats")
if frc is not None and not os.getenv("WHATSAPP_FREE_RESPONSE_CHATS"):
if isinstance(frc, list):
frc = ",".join(str(v) for v in frc)
os.environ["WHATSAPP_FREE_RESPONSE_CHATS"] = str(frc)
except Exception as e:
logger.warning(
"Failed to process config.yaml — falling back to .env / gateway.json values. "
@ -647,14 +691,13 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.platforms[Platform.SLACK] = PlatformConfig()
config.platforms[Platform.SLACK].enabled = True
config.platforms[Platform.SLACK].token = slack_token
# Home channel
slack_home = os.getenv("SLACK_HOME_CHANNEL")
if slack_home:
config.platforms[Platform.SLACK].home_channel = HomeChannel(
platform=Platform.SLACK,
chat_id=slack_home,
name=os.getenv("SLACK_HOME_CHANNEL_NAME", ""),
)
slack_home = os.getenv("SLACK_HOME_CHANNEL")
if slack_home and Platform.SLACK in config.platforms:
config.platforms[Platform.SLACK].home_channel = HomeChannel(
platform=Platform.SLACK,
chat_id=slack_home,
name=os.getenv("SLACK_HOME_CHANNEL_NAME", ""),
)
# Signal
signal_url = os.getenv("SIGNAL_HTTP_URL")
@ -668,13 +711,13 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
"account": signal_account,
"ignore_stories": os.getenv("SIGNAL_IGNORE_STORIES", "true").lower() in ("true", "1", "yes"),
})
signal_home = os.getenv("SIGNAL_HOME_CHANNEL")
if signal_home:
config.platforms[Platform.SIGNAL].home_channel = HomeChannel(
platform=Platform.SIGNAL,
chat_id=signal_home,
name=os.getenv("SIGNAL_HOME_CHANNEL_NAME", "Home"),
)
signal_home = os.getenv("SIGNAL_HOME_CHANNEL")
if signal_home and Platform.SIGNAL in config.platforms:
config.platforms[Platform.SIGNAL].home_channel = HomeChannel(
platform=Platform.SIGNAL,
chat_id=signal_home,
name=os.getenv("SIGNAL_HOME_CHANNEL_NAME", "Home"),
)
# Mattermost
mattermost_token = os.getenv("MATTERMOST_TOKEN")
@ -687,13 +730,13 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.platforms[Platform.MATTERMOST].enabled = True
config.platforms[Platform.MATTERMOST].token = mattermost_token
config.platforms[Platform.MATTERMOST].extra["url"] = mattermost_url
mattermost_home = os.getenv("MATTERMOST_HOME_CHANNEL")
if mattermost_home:
config.platforms[Platform.MATTERMOST].home_channel = HomeChannel(
platform=Platform.MATTERMOST,
chat_id=mattermost_home,
name=os.getenv("MATTERMOST_HOME_CHANNEL_NAME", "Home"),
)
mattermost_home = os.getenv("MATTERMOST_HOME_CHANNEL")
if mattermost_home and Platform.MATTERMOST in config.platforms:
config.platforms[Platform.MATTERMOST].home_channel = HomeChannel(
platform=Platform.MATTERMOST,
chat_id=mattermost_home,
name=os.getenv("MATTERMOST_HOME_CHANNEL_NAME", "Home"),
)
# Matrix
matrix_token = os.getenv("MATRIX_ACCESS_TOKEN")
@ -715,13 +758,13 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.platforms[Platform.MATRIX].extra["password"] = matrix_password
matrix_e2ee = os.getenv("MATRIX_ENCRYPTION", "").lower() in ("true", "1", "yes")
config.platforms[Platform.MATRIX].extra["encryption"] = matrix_e2ee
matrix_home = os.getenv("MATRIX_HOME_ROOM")
if matrix_home:
config.platforms[Platform.MATRIX].home_channel = HomeChannel(
platform=Platform.MATRIX,
chat_id=matrix_home,
name=os.getenv("MATRIX_HOME_ROOM_NAME", "Home"),
)
matrix_home = os.getenv("MATRIX_HOME_ROOM")
if matrix_home and Platform.MATRIX in config.platforms:
config.platforms[Platform.MATRIX].home_channel = HomeChannel(
platform=Platform.MATRIX,
chat_id=matrix_home,
name=os.getenv("MATRIX_HOME_ROOM_NAME", "Home"),
)
# Home Assistant
hass_token = os.getenv("HASS_TOKEN")
@ -748,13 +791,13 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
"imap_host": email_imap,
"smtp_host": email_smtp,
})
email_home = os.getenv("EMAIL_HOME_ADDRESS")
if email_home:
config.platforms[Platform.EMAIL].home_channel = HomeChannel(
platform=Platform.EMAIL,
chat_id=email_home,
name=os.getenv("EMAIL_HOME_ADDRESS_NAME", "Home"),
)
email_home = os.getenv("EMAIL_HOME_ADDRESS")
if email_home and Platform.EMAIL in config.platforms:
config.platforms[Platform.EMAIL].home_channel = HomeChannel(
platform=Platform.EMAIL,
chat_id=email_home,
name=os.getenv("EMAIL_HOME_ADDRESS_NAME", "Home"),
)
# SMS (Twilio)
twilio_sid = os.getenv("TWILIO_ACCOUNT_SID")
@ -763,13 +806,13 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.platforms[Platform.SMS] = PlatformConfig()
config.platforms[Platform.SMS].enabled = True
config.platforms[Platform.SMS].api_key = os.getenv("TWILIO_AUTH_TOKEN", "")
sms_home = os.getenv("SMS_HOME_CHANNEL")
if sms_home:
config.platforms[Platform.SMS].home_channel = HomeChannel(
platform=Platform.SMS,
chat_id=sms_home,
name=os.getenv("SMS_HOME_CHANNEL_NAME", "Home"),
)
sms_home = os.getenv("SMS_HOME_CHANNEL")
if sms_home and Platform.SMS in config.platforms:
config.platforms[Platform.SMS].home_channel = HomeChannel(
platform=Platform.SMS,
chat_id=sms_home,
name=os.getenv("SMS_HOME_CHANNEL_NAME", "Home"),
)
# API Server
api_server_enabled = os.getenv("API_SERVER_ENABLED", "").lower() in ("true", "1", "yes")
@ -811,6 +854,55 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
if webhook_secret:
config.platforms[Platform.WEBHOOK].extra["secret"] = webhook_secret
# Feishu / Lark
feishu_app_id = os.getenv("FEISHU_APP_ID")
feishu_app_secret = os.getenv("FEISHU_APP_SECRET")
if feishu_app_id and feishu_app_secret:
if Platform.FEISHU not in config.platforms:
config.platforms[Platform.FEISHU] = PlatformConfig()
config.platforms[Platform.FEISHU].enabled = True
config.platforms[Platform.FEISHU].extra.update({
"app_id": feishu_app_id,
"app_secret": feishu_app_secret,
"domain": os.getenv("FEISHU_DOMAIN", "feishu"),
"connection_mode": os.getenv("FEISHU_CONNECTION_MODE", "websocket"),
})
feishu_encrypt_key = os.getenv("FEISHU_ENCRYPT_KEY", "")
if feishu_encrypt_key:
config.platforms[Platform.FEISHU].extra["encrypt_key"] = feishu_encrypt_key
feishu_verification_token = os.getenv("FEISHU_VERIFICATION_TOKEN", "")
if feishu_verification_token:
config.platforms[Platform.FEISHU].extra["verification_token"] = feishu_verification_token
feishu_home = os.getenv("FEISHU_HOME_CHANNEL")
if feishu_home:
config.platforms[Platform.FEISHU].home_channel = HomeChannel(
platform=Platform.FEISHU,
chat_id=feishu_home,
name=os.getenv("FEISHU_HOME_CHANNEL_NAME", "Home"),
)
# WeCom (Enterprise WeChat)
wecom_bot_id = os.getenv("WECOM_BOT_ID")
wecom_secret = os.getenv("WECOM_SECRET")
if wecom_bot_id and wecom_secret:
if Platform.WECOM not in config.platforms:
config.platforms[Platform.WECOM] = PlatformConfig()
config.platforms[Platform.WECOM].enabled = True
config.platforms[Platform.WECOM].extra.update({
"bot_id": wecom_bot_id,
"secret": wecom_secret,
})
wecom_ws_url = os.getenv("WECOM_WEBSOCKET_URL", "")
if wecom_ws_url:
config.platforms[Platform.WECOM].extra["websocket_url"] = wecom_ws_url
wecom_home = os.getenv("WECOM_HOME_CHANNEL")
if wecom_home:
config.platforms[Platform.WECOM].home_channel = HomeChannel(
platform=Platform.WECOM,
chat_id=wecom_home,
name=os.getenv("WECOM_HOME_CHANNEL_NAME", "Home"),
)
# Session settings
idle_minutes = os.getenv("SESSION_IDLE_MINUTES")
if idle_minutes:
@ -825,5 +917,3 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.default_reset_policy.at_hour = int(reset_hour)
except ValueError:
pass

View File

@ -70,12 +70,15 @@ class DeliveryTarget:
if target == "local":
return cls(platform=Platform.LOCAL)
# Check for platform:chat_id format
# Check for platform:chat_id or platform:chat_id:thread_id format
if ":" in target:
platform_str, chat_id = target.split(":", 1)
parts = target.split(":", 2)
platform_str = parts[0]
chat_id = parts[1] if len(parts) > 1 else None
thread_id = parts[2] if len(parts) > 2 else None
try:
platform = Platform(platform_str)
return cls(platform=platform, chat_id=chat_id, is_explicit=True)
return cls(platform=platform, chat_id=chat_id, thread_id=thread_id, is_explicit=True)
except ValueError:
# Unknown platform, treat as local
return cls(platform=Platform.LOCAL)
@ -94,6 +97,8 @@ class DeliveryTarget:
return "origin"
if self.platform == Platform.LOCAL:
return "local"
if self.chat_id and self.thread_id:
return f"{self.platform.value}:{self.chat_id}:{self.thread_id}"
if self.chat_id:
return f"{self.platform.value}:{self.chat_id}"
return self.platform.value

View File

@ -51,14 +51,33 @@ class HookRegistry:
"""Return metadata about all loaded hooks."""
return list(self._loaded_hooks)
def _register_builtin_hooks(self) -> None:
"""Register built-in hooks that are always active."""
try:
from gateway.builtin_hooks.boot_md import handle as boot_md_handle
self._handlers.setdefault("gateway:startup", []).append(boot_md_handle)
self._loaded_hooks.append({
"name": "boot-md",
"description": "Run ~/.hermes/BOOT.md on gateway startup",
"events": ["gateway:startup"],
"path": "(builtin)",
})
except Exception as e:
print(f"[hooks] Could not load built-in boot-md hook: {e}", flush=True)
def discover_and_load(self) -> None:
"""
Scan the hooks directory for hook directories and load their handlers.
Also registers built-in hooks that are always active.
Each hook directory must contain:
- HOOK.yaml with at least 'name' and 'events' keys
- handler.py with a top-level 'handle' function (sync or async)
"""
self._register_builtin_hooks()
if not HOOKS_DIR.exists():
return

View File

@ -25,7 +25,7 @@ import time
from pathlib import Path
from typing import Optional
from hermes_cli.config import get_hermes_home
from hermes_constants import get_hermes_dir
# Unambiguous alphabet -- excludes 0/O, 1/I to prevent confusion
@ -41,7 +41,7 @@ LOCKOUT_SECONDS = 3600 # Lockout duration after too many failures
MAX_PENDING_PER_PLATFORM = 3 # Max pending codes per platform
MAX_FAILED_ATTEMPTS = 5 # Failed approvals before lockout
PAIRING_DIR = get_hermes_home() / "pairing"
PAIRING_DIR = get_hermes_dir("platforms/pairing", "pairing")
def _secure_write(path: Path, data: str) -> None:

View File

@ -2,7 +2,7 @@
OpenAI-compatible API server platform adapter.
Exposes an HTTP server with endpoints:
- POST /v1/chat/completions — OpenAI Chat Completions format (stateless)
- POST /v1/chat/completions — OpenAI Chat Completions format (stateless; opt-in session continuity via X-Hermes-Session-Id header)
- POST /v1/responses — OpenAI Responses API format (stateful via previous_response_id)
- GET /v1/responses/{response_id} — Retrieve a stored response
- DELETE /v1/responses/{response_id} — Delete a stored response
@ -223,6 +223,23 @@ if AIOHTTP_AVAILABLE:
else:
body_limit_middleware = None # type: ignore[assignment]
_SECURITY_HEADERS = {
"X-Content-Type-Options": "nosniff",
"Referrer-Policy": "no-referrer",
}
if AIOHTTP_AVAILABLE:
@web.middleware
async def security_headers_middleware(request, handler):
"""Add security headers to all responses (including errors)."""
response = await handler(request)
for k, v in _SECURITY_HEADERS.items():
response.headers.setdefault(k, v)
return response
else:
security_headers_middleware = None # type: ignore[assignment]
class _IdempotencyCache:
"""In-memory idempotency cache with TTL and basic LRU semantics."""
@ -283,6 +300,7 @@ class APIServerAdapter(BasePlatformAdapter):
self._runner: Optional["web.AppRunner"] = None
self._site: Optional["web.TCPSite"] = None
self._response_store = ResponseStore()
self._session_db: Optional[Any] = None # Lazy-init SessionDB for session continuity
@staticmethod
def _parse_cors_origins(value: Any) -> tuple[str, ...]:
@ -307,6 +325,7 @@ class APIServerAdapter(BasePlatformAdapter):
if "*" in self._cors_origins:
headers = dict(_CORS_HEADERS)
headers["Access-Control-Allow-Origin"] = "*"
headers["Access-Control-Max-Age"] = "600"
return headers
if origin not in self._cors_origins:
@ -315,6 +334,7 @@ class APIServerAdapter(BasePlatformAdapter):
headers = dict(_CORS_HEADERS)
headers["Access-Control-Allow-Origin"] = origin
headers["Vary"] = "Origin"
headers["Access-Control-Max-Age"] = "600"
return headers
def _origin_allowed(self, origin: str) -> bool:
@ -352,6 +372,24 @@ class APIServerAdapter(BasePlatformAdapter):
status=401,
)
# ------------------------------------------------------------------
# Session DB helper
# ------------------------------------------------------------------
def _ensure_session_db(self):
"""Lazily initialise and return the shared SessionDB instance.
Sessions are persisted to ``state.db`` so that ``hermes sessions list``
shows API-server conversations alongside CLI and gateway ones.
"""
if self._session_db is None:
try:
from hermes_state import SessionDB
self._session_db = SessionDB()
except Exception as e:
logger.debug("SessionDB unavailable for API server: %s", e)
return self._session_db
# ------------------------------------------------------------------
# Agent creation helper
# ------------------------------------------------------------------
@ -361,6 +399,7 @@ class APIServerAdapter(BasePlatformAdapter):
ephemeral_system_prompt: Optional[str] = None,
session_id: Optional[str] = None,
stream_delta_callback=None,
tool_progress_callback=None,
) -> Any:
"""
Create an AIAgent instance using the gateway's runtime config.
@ -393,6 +432,8 @@ class APIServerAdapter(BasePlatformAdapter):
session_id=session_id,
platform="api_server",
stream_delta_callback=stream_delta_callback,
tool_progress_callback=tool_progress_callback,
session_db=self._ensure_session_db(),
)
return agent
@ -475,7 +516,22 @@ class APIServerAdapter(BasePlatformAdapter):
status=400,
)
session_id = str(uuid.uuid4())
# Allow caller to continue an existing session by passing X-Hermes-Session-Id.
# When provided, history is loaded from state.db instead of from the request body.
provided_session_id = request.headers.get("X-Hermes-Session-Id", "").strip()
if provided_session_id:
session_id = provided_session_id
try:
db = self._ensure_session_db()
if db is not None:
history = db.get_messages_as_conversation(session_id)
except Exception as e:
logger.warning("Failed to load session history for %s: %s", session_id, e)
history = []
else:
session_id = str(uuid.uuid4())
# history already set from request body above
completion_id = f"chatcmpl-{uuid.uuid4().hex[:29]}"
model_name = body.get("model", "hermes-agent")
created = int(time.time())
@ -495,6 +551,15 @@ class APIServerAdapter(BasePlatformAdapter):
if delta is not None:
_stream_q.put(delta)
def _on_tool_progress(name, preview, args):
"""Inject tool progress into the SSE stream for Open WebUI."""
if name.startswith("_"):
return # Skip internal events (_thinking)
from agent.display import get_tool_emoji
emoji = get_tool_emoji(name)
label = preview or name
_stream_q.put(f"\n`{emoji} {label}`\n")
# Start agent in background. agent_ref is a mutable container
# so the SSE writer can interrupt the agent on client disconnect.
agent_ref = [None]
@ -504,12 +569,13 @@ class APIServerAdapter(BasePlatformAdapter):
ephemeral_system_prompt=system_prompt,
session_id=session_id,
stream_delta_callback=_on_delta,
tool_progress_callback=_on_tool_progress,
agent_ref=agent_ref,
))
return await self._write_sse_chat_completion(
request, completion_id, model_name, created, _stream_q,
agent_task, agent_ref,
agent_task, agent_ref, session_id=session_id,
)
# Non-streaming: run the agent (with optional Idempotency-Key)
@ -568,11 +634,11 @@ class APIServerAdapter(BasePlatformAdapter):
},
}
return web.json_response(response_data)
return web.json_response(response_data, headers={"X-Hermes-Session-Id": session_id})
async def _write_sse_chat_completion(
self, request: "web.Request", completion_id: str, model: str,
created: int, stream_q, agent_task, agent_ref=None,
created: int, stream_q, agent_task, agent_ref=None, session_id: str = None,
) -> "web.StreamResponse":
"""Write real streaming SSE from agent's stream_delta_callback queue.
@ -582,10 +648,16 @@ class APIServerAdapter(BasePlatformAdapter):
"""
import queue as _q
response = web.StreamResponse(
status=200,
headers={"Content-Type": "text/event-stream", "Cache-Control": "no-cache"},
)
sse_headers = {"Content-Type": "text/event-stream", "Cache-Control": "no-cache"}
# CORS middleware can't inject headers into StreamResponse after
# prepare() flushes them, so resolve CORS headers up front.
origin = request.headers.get("Origin", "")
cors = self._cors_headers_for_origin(origin) if origin else None
if cors:
sse_headers.update(cors)
if session_id:
sse_headers["X-Hermes-Session-Id"] = session_id
response = web.StreamResponse(status=200, headers=sse_headers)
await response.prepare(request)
try:
@ -1171,6 +1243,7 @@ class APIServerAdapter(BasePlatformAdapter):
ephemeral_system_prompt: Optional[str] = None,
session_id: Optional[str] = None,
stream_delta_callback=None,
tool_progress_callback=None,
agent_ref: Optional[list] = None,
) -> tuple:
"""
@ -1191,6 +1264,7 @@ class APIServerAdapter(BasePlatformAdapter):
ephemeral_system_prompt=ephemeral_system_prompt,
session_id=session_id,
stream_delta_callback=stream_delta_callback,
tool_progress_callback=tool_progress_callback,
)
if agent_ref is not None:
agent_ref[0] = agent
@ -1218,10 +1292,11 @@ class APIServerAdapter(BasePlatformAdapter):
return False
try:
mws = [mw for mw in (cors_middleware, body_limit_middleware) if mw is not None]
mws = [mw for mw in (cors_middleware, body_limit_middleware, security_headers_middleware) if mw is not None]
self._app = web.Application(middlewares=mws)
self._app["api_server_adapter"] = self
self._app.router.add_get("/health", self._handle_health)
self._app.router.add_get("/v1/health", self._handle_health)
self._app.router.add_get("/v1/models", self._handle_models)
self._app.router.add_post("/v1/chat/completions", self._handle_chat_completions)
self._app.router.add_post("/v1/responses", self._handle_responses)
@ -1237,6 +1312,17 @@ class APIServerAdapter(BasePlatformAdapter):
self._app.router.add_post("/api/jobs/{job_id}/resume", self._handle_resume_job)
self._app.router.add_post("/api/jobs/{job_id}/run", self._handle_run_job)
# Port conflict detection — fail fast if port is already in use
import socket as _socket
try:
with _socket.socket(_socket.AF_INET, _socket.SOCK_STREAM) as _s:
_s.settimeout(1)
_s.connect(('127.0.0.1', self._port))
logger.error('[%s] Port %d already in use. Set a different port in config.yaml: platforms.api_server.port', self.name, self._port)
return False
except (ConnectionRefusedError, OSError):
pass # port is free
self._runner = web.AppRunner(self._app)
await self._runner.setup()
self._site = web.TCPSite(self._runner, self._host, self._port)

View File

@ -27,6 +27,7 @@ sys.path.insert(0, str(_Path(__file__).resolve().parents[2]))
from gateway.config import Platform, PlatformConfig
from gateway.session import SessionSource, build_session_key
from hermes_cli.config import get_hermes_home
from hermes_constants import get_hermes_dir
GATEWAY_SECRET_CAPTURE_UNSUPPORTED_MESSAGE = (
@ -44,8 +45,8 @@ GATEWAY_SECRET_CAPTURE_UNSUPPORTED_MESSAGE = (
# (e.g. Telegram file URLs expire after ~1 hour).
# ---------------------------------------------------------------------------
# Default location: {HERMES_HOME}/image_cache/
IMAGE_CACHE_DIR = get_hermes_home() / "image_cache"
# Default location: {HERMES_HOME}/cache/images/ (legacy: image_cache/)
IMAGE_CACHE_DIR = get_hermes_dir("cache/images", "image_cache")
def get_image_cache_dir() -> Path:
@ -147,7 +148,7 @@ def cleanup_image_cache(max_age_hours: int = 24) -> int:
# here so the STT tool (OpenAI Whisper) can transcribe them from local files.
# ---------------------------------------------------------------------------
AUDIO_CACHE_DIR = get_hermes_home() / "audio_cache"
AUDIO_CACHE_DIR = get_hermes_dir("cache/audio", "audio_cache")
def get_audio_cache_dir() -> Path:
@ -174,29 +175,51 @@ def cache_audio_from_bytes(data: bytes, ext: str = ".ogg") -> str:
return str(filepath)
async def cache_audio_from_url(url: str, ext: str = ".ogg") -> str:
async def cache_audio_from_url(url: str, ext: str = ".ogg", retries: int = 2) -> str:
"""
Download an audio file from a URL and save it to the local cache.
Retries on transient failures (timeouts, 429, 5xx) with exponential
backoff so a single slow CDN response doesn't lose the media.
Args:
url: The HTTP/HTTPS URL to download from.
ext: File extension including the dot (e.g. ".ogg", ".mp3").
retries: Number of retry attempts on transient failures.
Returns:
Absolute path to the cached audio file as a string.
"""
import asyncio
import httpx
import logging as _logging
_log = _logging.getLogger(__name__)
last_exc = None
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
response = await client.get(
url,
headers={
"User-Agent": "Mozilla/5.0 (compatible; HermesAgent/1.0)",
"Accept": "audio/*,*/*;q=0.8",
},
)
response.raise_for_status()
return cache_audio_from_bytes(response.content, ext)
for attempt in range(retries + 1):
try:
response = await client.get(
url,
headers={
"User-Agent": "Mozilla/5.0 (compatible; HermesAgent/1.0)",
"Accept": "audio/*,*/*;q=0.8",
},
)
response.raise_for_status()
return cache_audio_from_bytes(response.content, ext)
except (httpx.TimeoutException, httpx.HTTPStatusError) as exc:
last_exc = exc
if isinstance(exc, httpx.HTTPStatusError) and exc.response.status_code < 429:
raise
if attempt < retries:
wait = 1.5 * (attempt + 1)
_log.debug("Audio cache retry %d/%d for %s (%.1fs): %s",
attempt + 1, retries, url[:80], wait, exc)
await asyncio.sleep(wait)
continue
raise
raise last_exc
# ---------------------------------------------------------------------------
@ -206,7 +229,7 @@ async def cache_audio_from_url(url: str, ext: str = ".ogg") -> str:
# here so the agent can reference them by local file path.
# ---------------------------------------------------------------------------
DOCUMENT_CACHE_DIR = get_hermes_home() / "document_cache"
DOCUMENT_CACHE_DIR = get_hermes_dir("cache/documents", "document_cache")
SUPPORTED_DOCUMENT_TYPES = {
".pdf": "application/pdf",
@ -333,7 +356,10 @@ class MessageEvent:
return None
# Split on space and get first word, strip the /
parts = self.text.split(maxsplit=1)
return parts[0][1:].lower() if parts else None
raw = parts[0][1:].lower() if parts else None
if raw and "@" in raw:
raw = raw.split("@", 1)[0]
return raw
def get_command_args(self) -> str:
"""Get the arguments after a command."""
@ -872,6 +898,26 @@ class BasePlatformAdapter(ABC):
except Exception:
pass
# ── Processing lifecycle hooks ──────────────────────────────────────────
# Subclasses override these to react to message processing events
# (e.g. Discord adds 👀/✅/❌ reactions).
async def on_processing_start(self, event: MessageEvent) -> None:
"""Hook called when background processing begins."""
async def on_processing_complete(self, event: MessageEvent, success: bool) -> None:
"""Hook called when background processing completes."""
async def _run_processing_hook(self, hook_name: str, *args: Any, **kwargs: Any) -> None:
"""Run a lifecycle hook without letting failures break message flow."""
hook = getattr(self, hook_name, None)
if not callable(hook):
return
try:
await hook(*args, **kwargs)
except Exception as e:
logger.warning("[%s] %s hook failed: %s", self.name, hook_name, e)
@staticmethod
def _is_retryable_error(error: Optional[str]) -> bool:
"""Return True if the error string looks like a transient network failure."""
@ -979,7 +1025,7 @@ class BasePlatformAdapter(ABC):
# simultaneous messages. Queue them without interrupting the active run,
# then process them immediately after the current task finishes.
if event.message_type == MessageType.PHOTO:
print(f"[{self.name}] 🖼️ Queuing photo follow-up for session {session_key} without interrupt")
logger.debug("[%s] Queuing photo follow-up for session %s without interrupt", self.name, session_key)
existing = self._pending_messages.get(session_key)
if existing and existing.message_type == MessageType.PHOTO:
existing.media_urls.extend(event.media_urls)
@ -994,12 +1040,19 @@ class BasePlatformAdapter(ABC):
return # Don't interrupt now - will run after current task completes
# Default behavior for non-photo follow-ups: interrupt the running agent
print(f"[{self.name}] New message while session {session_key} is active - triggering interrupt")
logger.debug("[%s] New message while session %s is active triggering interrupt", self.name, session_key)
self._pending_messages[session_key] = event
# Signal the interrupt (the processing task checks this)
self._active_sessions[session_key].set()
return # Don't process now - will be handled after current task finishes
# Mark session as active BEFORE spawning background task to close
# the race window where a second message arriving before the task
# starts would also pass the _active_sessions check and spawn a
# duplicate task. (grammY sequentialize / aiogram EventIsolation
# pattern — set the guard synchronously, not inside the task.)
self._active_sessions[session_key] = asyncio.Event()
# Spawn background task to process this message
task = asyncio.create_task(self._process_message_background(event, session_key))
try:
@ -1034,8 +1087,22 @@ class BasePlatformAdapter(ABC):
async def _process_message_background(self, event: MessageEvent, session_key: str) -> None:
"""Background task that actually processes the message."""
# Create interrupt event for this session
interrupt_event = asyncio.Event()
# Track delivery outcomes for the processing-complete hook
delivery_attempted = False
delivery_succeeded = False
def _record_delivery(result):
nonlocal delivery_attempted, delivery_succeeded
if result is None:
return
delivery_attempted = True
if getattr(result, "success", False):
delivery_succeeded = True
# Reuse the interrupt event set by handle_message() (which marks
# the session active before spawning this task to prevent races).
# Fall back to a new Event only if the entry was removed externally.
interrupt_event = self._active_sessions.get(session_key) or asyncio.Event()
self._active_sessions[session_key] = interrupt_event
# Start continuous typing indicator (refreshes every 2 seconds)
@ -1043,12 +1110,17 @@ class BasePlatformAdapter(ABC):
typing_task = asyncio.create_task(self._keep_typing(event.source.chat_id, metadata=_thread_metadata))
try:
await self._run_processing_hook("on_processing_start", event)
# Call the handler (this can take a while with tool calls)
response = await self._message_handler(event)
# Send response if any
# Send response if any. A None/empty response is normal when
# streaming already delivered the text (already_sent=True) or
# when the message was queued behind an active agent. Log at
# DEBUG to avoid noisy warnings for expected behavior.
if not response:
logger.warning("[%s] Handler returned empty/None response for %s", self.name, event.source.chat_id)
logger.debug("[%s] Handler returned empty/None response for %s", self.name, event.source.chat_id)
if response:
# Extract MEDIA:<path> tags (from TTS tool) before other processing
media_files, response = self.extract_media(response)
@ -1112,6 +1184,7 @@ class BasePlatformAdapter(ABC):
reply_to=event.message_id,
metadata=_thread_metadata,
)
_record_delivery(result)
# Human-like pacing delay between text and media
human_delay = self._get_human_delay()
@ -1180,9 +1253,9 @@ class BasePlatformAdapter(ABC):
)
if not media_result.success:
print(f"[{self.name}] Failed to send media ({ext}): {media_result.error}")
logger.warning("[%s] Failed to send media (%s): %s", self.name, ext, media_result.error)
except Exception as media_err:
print(f"[{self.name}] Error sending media: {media_err}")
logger.warning("[%s] Error sending media: %s", self.name, media_err)
# Send auto-detected local files as native attachments
for file_path in local_files:
@ -1211,10 +1284,14 @@ class BasePlatformAdapter(ABC):
except Exception as file_err:
logger.error("[%s] Error sending local file %s: %s", self.name, file_path, file_err)
# Determine overall success for the processing hook
processing_ok = delivery_succeeded if delivery_attempted else not bool(response)
await self._run_processing_hook("on_processing_complete", event, processing_ok)
# Check if there's a pending message that was queued during our processing
if session_key in self._pending_messages:
pending_event = self._pending_messages.pop(session_key)
print(f"[{self.name}] 📨 Processing queued message from interrupt")
logger.debug("[%s] Processing queued message from interrupt", self.name)
# Clean up current session before processing pending
if session_key in self._active_sessions:
del self._active_sessions[session_key]
@ -1227,10 +1304,12 @@ class BasePlatformAdapter(ABC):
await self._process_message_background(pending_event, session_key)
return # Already cleaned up
except asyncio.CancelledError:
await self._run_processing_hook("on_processing_complete", event, False)
raise
except Exception as e:
print(f"[{self.name}] Error handling message: {e}")
import traceback
traceback.print_exc()
await self._run_processing_hook("on_processing_complete", event, False)
logger.error("[%s] Error handling message: %s", self.name, e, exc_info=True)
# Send the error to the user so they aren't left with radio silence
try:
error_type = type(e).__name__

View File

@ -408,7 +408,7 @@ class VoiceReceiver:
class DiscordAdapter(BasePlatformAdapter):
"""
Discord bot adapter.
Handles:
- Receiving messages from servers and DMs
- Sending responses with Discord markdown
@ -418,10 +418,10 @@ class DiscordAdapter(BasePlatformAdapter):
- Auto-threading for long conversations
- Reaction-based feedback
"""
# Discord message limits
MAX_MESSAGE_LENGTH = 2000
# Auto-disconnect from voice channel after this many seconds of inactivity
VOICE_TIMEOUT = 300
@ -449,7 +449,7 @@ class DiscordAdapter(BasePlatformAdapter):
self._bot_task: Optional[asyncio.Task] = None
# Cap to prevent unbounded growth (Discord threads get archived).
self._MAX_TRACKED_THREADS = 500
async def connect(self) -> bool:
"""Connect to Discord and start receiving events."""
if not DISCORD_AVAILABLE:
@ -480,12 +480,23 @@ class DiscordAdapter(BasePlatformAdapter):
logger.warning("Opus codec found at %s but failed to load", opus_path)
if not discord.opus.is_loaded():
logger.warning("Opus codec not found — voice channel playback disabled")
if not self.config.token:
logger.error("[%s] No bot token configured", self.name)
return False
try:
# Acquire scoped lock to prevent duplicate bot token usage
from gateway.status import acquire_scoped_lock
self._token_lock_identity = self.config.token
acquired, existing = acquire_scoped_lock('discord-bot-token', self._token_lock_identity, metadata={'platform': 'discord'})
if not acquired:
owner_pid = existing.get('pid') if isinstance(existing, dict) else None
message = f'Discord bot token already in use' + (f' (PID {owner_pid})' if owner_pid else '') + '. Stop the other gateway first.'
logger.error('[%s] %s', self.name, message)
self._set_fatal_error('discord_token_lock', message, retryable=False)
return False
# Set up intents -- members intent needed for username-to-ID resolution
intents = Intents.default()
intents.message_content = True
@ -493,13 +504,13 @@ class DiscordAdapter(BasePlatformAdapter):
intents.guild_messages = True
intents.members = True
intents.voice_states = True
# Create bot
self._client = commands.Bot(
command_prefix="!", # Not really used, we handle raw messages
intents=intents,
)
# Parse allowed user entries (may contain usernames or IDs)
allowed_env = os.getenv("DISCORD_ALLOWED_USERS", "")
if allowed_env:
@ -507,17 +518,17 @@ class DiscordAdapter(BasePlatformAdapter):
_clean_discord_id(uid) for uid in allowed_env.split(",")
if uid.strip()
}
adapter_self = self # capture for closure
# Register event handlers
@self._client.event
async def on_ready():
logger.info("[%s] Connected as %s", adapter_self.name, adapter_self._client.user)
# Resolve any usernames in the allowed list to numeric IDs
await adapter_self._resolve_allowed_usernames()
# Sync slash commands with Discord
try:
synced = await adapter_self._client.tree.sync()
@ -525,18 +536,22 @@ class DiscordAdapter(BasePlatformAdapter):
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[%s] Slash command sync failed: %s", adapter_self.name, e, exc_info=True)
adapter_self._ready_event.set()
@self._client.event
async def on_message(message: DiscordMessage):
# Always ignore our own messages
if message.author == self._client.user:
return
# Ignore Discord system messages (thread renames, pins, member joins, etc.)
# Allow both default and reply types — replies have a distinct MessageType.
if message.type not in (discord.MessageType.default, discord.MessageType.reply):
return
# Check if the message author is in the allowed user list
if not self._is_allowed_user(str(message.author.id)):
return
# Bot message filtering (DISCORD_ALLOW_BOTS):
# "none" — ignore all other bots (default)
# "mentions" — accept bot messages only when they @mention us
@ -549,7 +564,23 @@ class DiscordAdapter(BasePlatformAdapter):
if not self._client.user or self._client.user not in message.mentions:
return
# "all" falls through to handle_message
# If the message @mentions other users but NOT the bot, the
# sender is talking to someone else — stay silent. Only
# applies in server channels; in DMs the user is always
# talking to the bot (mentions are just references).
# Controlled by DISCORD_IGNORE_NO_MENTION (default: true).
_ignore_no_mention = os.getenv(
"DISCORD_IGNORE_NO_MENTION", "true"
).lower() in ("true", "1", "yes")
if _ignore_no_mention and message.mentions and not isinstance(message.channel, discord.DMChannel):
_bot_mentioned = (
self._client.user is not None
and self._client.user in message.mentions
)
if not _bot_mentioned:
return # Talking to someone else, don't interrupt
await self._handle_message(message)
@self._client.event
@ -587,23 +618,23 @@ class DiscordAdapter(BasePlatformAdapter):
# Register slash commands
self._register_slash_commands()
# Start the bot in background
self._bot_task = asyncio.create_task(self._client.start(self.config.token))
# Wait for ready
await asyncio.wait_for(self._ready_event.wait(), timeout=30)
self._running = True
return True
except asyncio.TimeoutError:
logger.error("[%s] Timeout waiting for connection to Discord", self.name, exc_info=True)
return False
except Exception as e: # pragma: no cover - defensive logging
logger.error("[%s] Failed to connect to Discord: %s", self.name, e, exc_info=True)
return False
async def disconnect(self) -> None:
"""Disconnect from Discord."""
# Clean up all active voice connections before closing the client
@ -622,8 +653,61 @@ class DiscordAdapter(BasePlatformAdapter):
self._running = False
self._client = None
self._ready_event.clear()
# Release the token lock
try:
from gateway.status import release_scoped_lock
if getattr(self, '_token_lock_identity', None):
release_scoped_lock('discord-bot-token', self._token_lock_identity)
self._token_lock_identity = None
except Exception:
pass
logger.info("[%s] Disconnected", self.name)
async def _add_reaction(self, message: Any, emoji: str) -> bool:
"""Add an emoji reaction to a Discord message."""
if not message or not hasattr(message, "add_reaction"):
return False
try:
await message.add_reaction(emoji)
return True
except Exception as e:
logger.debug("[%s] add_reaction failed (%s): %s", self.name, emoji, e)
return False
async def _remove_reaction(self, message: Any, emoji: str) -> bool:
"""Remove the bot's own emoji reaction from a Discord message."""
if not message or not hasattr(message, "remove_reaction") or not self._client or not self._client.user:
return False
try:
await message.remove_reaction(emoji, self._client.user)
return True
except Exception as e:
logger.debug("[%s] remove_reaction failed (%s): %s", self.name, emoji, e)
return False
def _reactions_enabled(self) -> bool:
"""Check if message reactions are enabled via config/env."""
return os.getenv("DISCORD_REACTIONS", "true").lower() not in ("false", "0", "no")
async def on_processing_start(self, event: MessageEvent) -> None:
"""Add an in-progress reaction for normal Discord message events."""
if not self._reactions_enabled():
return
message = event.raw_message
if hasattr(message, "add_reaction"):
await self._add_reaction(message, "👀")
async def on_processing_complete(self, event: MessageEvent, success: bool) -> None:
"""Swap the in-progress reaction for a final success/failure reaction."""
if not self._reactions_enabled():
return
message = event.raw_message
if hasattr(message, "add_reaction"):
await self._remove_reaction(message, "👀")
await self._add_reaction(message, "" if success else "")
async def send(
self,
chat_id: str,
@ -640,24 +724,24 @@ class DiscordAdapter(BasePlatformAdapter):
channel = self._client.get_channel(int(chat_id))
if not channel:
channel = await self._client.fetch_channel(int(chat_id))
if not channel:
return SendResult(success=False, error=f"Channel {chat_id} not found")
# Format and split message if needed
formatted = self.format_message(content)
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
message_ids = []
reference = None
if reply_to:
try:
ref_msg = await channel.fetch_message(int(reply_to))
reference = ref_msg
except Exception as e:
logger.debug("Could not fetch reply-to message: %s", e)
for i, chunk in enumerate(chunks):
chunk_reference = reference if i == 0 else None
try:
@ -684,13 +768,13 @@ class DiscordAdapter(BasePlatformAdapter):
else:
raise
message_ids.append(str(msg.id))
return SendResult(
success=True,
message_id=message_ids[0] if message_ids else None,
raw_response={"message_ids": message_ids}
)
except Exception as e: # pragma: no cover - defensive logging
logger.error("[%s] Failed to send Discord message: %s", self.name, e, exc_info=True)
return SendResult(success=False, error=str(e))
@ -1162,25 +1246,25 @@ class DiscordAdapter(BasePlatformAdapter):
"""Send an image natively as a Discord file attachment."""
if not self._client:
return SendResult(success=False, error="Not connected")
try:
import aiohttp
channel = self._client.get_channel(int(chat_id))
if not channel:
channel = await self._client.fetch_channel(int(chat_id))
if not channel:
return SendResult(success=False, error=f"Channel {chat_id} not found")
# Download the image and send as a Discord file attachment
# (Discord renders attachments inline, unlike plain URLs)
async with aiohttp.ClientSession() as session:
async with session.get(image_url, timeout=aiohttp.ClientTimeout(total=30)) as resp:
if resp.status != 200:
raise Exception(f"Failed to download image: HTTP {resp.status}")
image_data = await resp.read()
# Determine filename from URL or content type
content_type = resp.headers.get("content-type", "image/png")
ext = "png"
@ -1190,16 +1274,16 @@ class DiscordAdapter(BasePlatformAdapter):
ext = "gif"
elif "webp" in content_type:
ext = "webp"
import io
file = discord.File(io.BytesIO(image_data), filename=f"image.{ext}")
msg = await channel.send(
content=caption if caption else None,
file=file,
)
return SendResult(success=True, message_id=str(msg.id))
except ImportError:
logger.warning(
"[%s] aiohttp not installed, falling back to URL. Run: pip install aiohttp",
@ -1250,7 +1334,7 @@ class DiscordAdapter(BasePlatformAdapter):
except Exception as e: # pragma: no cover - defensive logging
logger.error("[%s] Failed to send document, falling back to base adapter: %s", self.name, e, exc_info=True)
return await super().send_document(chat_id, file_path, caption, file_name, reply_to, metadata=metadata)
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""Start a persistent typing indicator for a channel.
@ -1294,20 +1378,20 @@ class DiscordAdapter(BasePlatformAdapter):
await task
except (asyncio.CancelledError, Exception):
pass
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Get information about a Discord channel."""
if not self._client:
return {"name": "Unknown", "type": "dm"}
try:
channel = self._client.get_channel(int(chat_id))
if not channel:
channel = await self._client.fetch_channel(int(chat_id))
if not channel:
return {"name": str(chat_id), "type": "dm"}
# Determine channel type
if isinstance(channel, discord.DMChannel):
chat_type = "dm"
@ -1323,7 +1407,7 @@ class DiscordAdapter(BasePlatformAdapter):
else:
chat_type = "channel"
name = getattr(channel, "name", str(chat_id))
return {
"name": name,
"type": chat_type,
@ -1333,7 +1417,7 @@ class DiscordAdapter(BasePlatformAdapter):
except Exception as e: # pragma: no cover - defensive logging
logger.error("[%s] Failed to get chat info for %s: %s", self.name, chat_id, e, exc_info=True)
return {"name": str(chat_id), "type": "dm", "error": str(e)}
async def _resolve_allowed_usernames(self) -> None:
"""
Resolve non-numeric entries in DISCORD_ALLOWED_USERS to Discord user IDs.
@ -1401,7 +1485,7 @@ class DiscordAdapter(BasePlatformAdapter):
def format_message(self, content: str) -> str:
"""
Format message for Discord.
Discord uses its own markdown variant.
"""
# Discord markdown is fairly standard, no special escaping needed
@ -1413,15 +1497,23 @@ class DiscordAdapter(BasePlatformAdapter):
command_text: str,
followup_msg: str | None = None,
) -> None:
"""Common handler for simple slash commands that dispatch a command string."""
"""Common handler for simple slash commands that dispatch a command string.
Defers the interaction (shows "thinking..."), dispatches the command,
then cleans up the deferred response. If *followup_msg* is provided
the "thinking..." indicator is replaced with that text; otherwise it
is deleted so the channel isn't cluttered.
"""
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, command_text)
await self.handle_message(event)
if followup_msg:
try:
await interaction.followup.send(followup_msg, ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
try:
if followup_msg:
await interaction.edit_original_response(content=followup_msg)
else:
await interaction.delete_original_response()
except Exception as e:
logger.debug("Discord interaction cleanup failed: %s", e)
def _register_slash_commands(self) -> None:
"""Register Discord slash commands on the command tree."""
@ -1446,9 +1538,7 @@ class DiscordAdapter(BasePlatformAdapter):
@tree.command(name="reasoning", description="Show or change reasoning effort")
@discord.app_commands.describe(effort="Reasoning effort: xhigh, high, medium, low, minimal, or none.")
async def slash_reasoning(interaction: discord.Interaction, effort: str = ""):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, f"/reasoning {effort}".strip())
await self.handle_message(event)
await self._run_simple_slash(interaction, f"/reasoning {effort}".strip())
@tree.command(name="personality", description="Set a personality")
@discord.app_commands.describe(name="Personality name. Leave empty to list available.")
@ -1521,14 +1611,22 @@ class DiscordAdapter(BasePlatformAdapter):
discord.app_commands.Choice(name="status — show current mode", value="status"),
])
async def slash_voice(interaction: discord.Interaction, mode: str = ""):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, f"/voice {mode}".strip())
await self.handle_message(event)
await self._run_simple_slash(interaction, f"/voice {mode}".strip())
@tree.command(name="update", description="Update Hermes Agent to the latest version")
async def slash_update(interaction: discord.Interaction):
await self._run_simple_slash(interaction, "/update", "Update initiated~")
@tree.command(name="approve", description="Approve a pending dangerous command")
@discord.app_commands.describe(scope="Optional: 'all', 'session', 'always', 'all session', 'all always'")
async def slash_approve(interaction: discord.Interaction, scope: str = ""):
await self._run_simple_slash(interaction, f"/approve {scope}".strip())
@tree.command(name="deny", description="Deny a pending dangerous command")
@discord.app_commands.describe(scope="Optional: 'all' to deny all pending commands")
async def slash_deny(interaction: discord.Interaction, scope: str = ""):
await self._run_simple_slash(interaction, f"/deny {scope}".strip())
@tree.command(name="thread", description="Create a new thread and start a Hermes session in it")
@discord.app_commands.describe(
name="Thread name",
@ -1563,7 +1661,7 @@ class DiscordAdapter(BasePlatformAdapter):
chat_name = interaction.channel.name
if hasattr(interaction.channel, "guild") and interaction.channel.guild:
chat_name = f"{interaction.channel.guild.name} / #{chat_name}"
# Get channel topic (if available)
chat_topic = getattr(interaction.channel, "topic", None)
@ -1772,33 +1870,41 @@ class DiscordAdapter(BasePlatformAdapter):
return None
async def send_exec_approval(
self, chat_id: str, command: str, approval_id: str
self, chat_id: str, command: str, session_key: str,
description: str = "dangerous command",
metadata: Optional[dict] = None,
) -> SendResult:
"""
Send a button-based exec approval prompt for a dangerous command.
Returns SendResult. The approval is resolved when a user clicks a button.
The buttons call ``resolve_gateway_approval()`` to unblock the waiting
agent thread — this replaces the text-based ``/approve`` flow on Discord.
"""
if not self._client or not DISCORD_AVAILABLE:
return SendResult(success=False, error="Not connected")
try:
channel = self._client.get_channel(int(chat_id))
# Resolve channel — use thread_id from metadata if present
target_id = chat_id
if metadata and metadata.get("thread_id"):
target_id = metadata["thread_id"]
channel = self._client.get_channel(int(target_id))
if not channel:
channel = await self._client.fetch_channel(int(chat_id))
channel = await self._client.fetch_channel(int(target_id))
# Discord embed description limit is 4096; show full command up to that
max_desc = 4088
cmd_display = command if len(command) <= max_desc else command[: max_desc - 3] + "..."
embed = discord.Embed(
title="Command Approval Required",
title="⚠️ Command Approval Required",
description=f"```\n{cmd_display}\n```",
color=discord.Color.orange(),
)
embed.set_footer(text=f"Approval ID: {approval_id}")
embed.add_field(name="Reason", value=description, inline=False)
view = ExecApprovalView(
approval_id=approval_id,
session_key=session_key,
allowed_user_ids=self._allowed_user_ids,
)
@ -1967,7 +2073,7 @@ class DiscordAdapter(BasePlatformAdapter):
if doc_ext in SUPPORTED_DOCUMENT_TYPES:
msg_type = MessageType.DOCUMENT
break
# When auto-threading kicked in, route responses to the new thread
effective_channel = auto_threaded_channel or message.channel
@ -1986,7 +2092,7 @@ class DiscordAdapter(BasePlatformAdapter):
# Get channel topic (if available - TextChannels have topics, DMs/threads don't)
chat_topic = getattr(message.channel, "topic", None)
# Build source
source = self.build_source(
chat_id=str(effective_channel.id),
@ -1997,7 +2103,7 @@ class DiscordAdapter(BasePlatformAdapter):
thread_id=thread_id,
chat_topic=chat_topic,
)
# Build media URLs -- download image attachments to local cache so the
# vision tool can access them reliably (Discord CDN URLs can expire).
media_urls = []
@ -2091,7 +2197,7 @@ class DiscordAdapter(BasePlatformAdapter):
"[Discord] Failed to cache document %s: %s",
att.filename, e, exc_info=True,
)
event_text = message.content
if pending_text_injection:
event_text = f"{pending_text_injection}\n\n{event_text}" if event_text else pending_text_injection
@ -2131,13 +2237,15 @@ if DISCORD_AVAILABLE:
"""
Interactive button view for exec approval of dangerous commands.
Shows three buttons: Allow Once (green), Always Allow (blue), Deny (red).
Only users in the allowed list can click. The view times out after 5 minutes.
Shows four buttons: Allow Once, Allow Session, Always Allow, Deny.
Clicking a button calls ``resolve_gateway_approval()`` to unblock the
waiting agent thread — the same mechanism as the text ``/approve`` flow.
Only users in the allowed list can click. Times out after 5 minutes.
"""
def __init__(self, approval_id: str, allowed_user_ids: set):
def __init__(self, session_key: str, allowed_user_ids: set):
super().__init__(timeout=300) # 5-minute timeout
self.approval_id = approval_id
self.session_key = session_key
self.allowed_user_ids = allowed_user_ids
self.resolved = False
@ -2148,9 +2256,10 @@ if DISCORD_AVAILABLE:
return str(interaction.user.id) in self.allowed_user_ids
async def _resolve(
self, interaction: discord.Interaction, action: str, color: discord.Color
self, interaction: discord.Interaction, choice: str,
color: discord.Color, label: str,
):
"""Resolve the approval and update the message."""
"""Resolve the approval via the gateway approval queue and update the embed."""
if self.resolved:
await interaction.response.send_message(
"This approval has already been resolved~", ephemeral=True
@ -2169,7 +2278,7 @@ if DISCORD_AVAILABLE:
embed = interaction.message.embeds[0] if interaction.message.embeds else None
if embed:
embed.color = color
embed.set_footer(text=f"{action} by {interaction.user.display_name}")
embed.set_footer(text=f"{label} by {interaction.user.display_name}")
# Disable all buttons
for child in self.children:
@ -2177,33 +2286,40 @@ if DISCORD_AVAILABLE:
await interaction.response.edit_message(embed=embed, view=self)
# Store the approval decision
# Unblock the waiting agent thread via the gateway approval queue
try:
from tools.approval import approve_permanent
if action == "allow_once":
pass # One-time approval handled by gateway
elif action == "allow_always":
approve_permanent(self.approval_id)
except ImportError:
pass
from tools.approval import resolve_gateway_approval
count = resolve_gateway_approval(self.session_key, choice)
logger.info(
"Discord button resolved %d approval(s) for session %s (choice=%s, user=%s)",
count, self.session_key, choice, interaction.user.display_name,
)
except Exception as exc:
logger.error("Failed to resolve gateway approval from button: %s", exc)
@discord.ui.button(label="Allow Once", style=discord.ButtonStyle.green)
async def allow_once(
self, interaction: discord.Interaction, button: discord.ui.Button
):
await self._resolve(interaction, "allow_once", discord.Color.green())
await self._resolve(interaction, "once", discord.Color.green(), "Approved once")
@discord.ui.button(label="Allow Session", style=discord.ButtonStyle.grey)
async def allow_session(
self, interaction: discord.Interaction, button: discord.ui.Button
):
await self._resolve(interaction, "session", discord.Color.blue(), "Approved for session")
@discord.ui.button(label="Always Allow", style=discord.ButtonStyle.blurple)
async def allow_always(
self, interaction: discord.Interaction, button: discord.ui.Button
):
await self._resolve(interaction, "allow_always", discord.Color.blue())
await self._resolve(interaction, "always", discord.Color.purple(), "Approved permanently")
@discord.ui.button(label="Deny", style=discord.ButtonStyle.red)
async def deny(
self, interaction: discord.Interaction, button: discord.ui.Button
):
await self._resolve(interaction, "deny", discord.Color.red())
await self._resolve(interaction, "deny", discord.Color.red(), "Denied")
async def on_timeout(self):
"""Handle view timeout -- disable buttons and mark as expired."""

View File

@ -43,6 +43,20 @@ from gateway.platforms.base import (
from gateway.config import Platform, PlatformConfig
logger = logging.getLogger(__name__)
# Automated sender patterns — emails from these are silently ignored
_NOREPLY_PATTERNS = (
"noreply", "no-reply", "no_reply", "donotreply", "do-not-reply",
"mailer-daemon", "postmaster", "bounce", "notifications@",
"automated@", "auto-confirm", "auto-reply", "automailer",
)
# RFC headers that indicate bulk/automated mail
_AUTOMATED_HEADERS = {
"Auto-Submitted": lambda v: v.lower() != "no",
"Precedence": lambda v: v.lower() in ("bulk", "list", "junk"),
"X-Auto-Response-Suppress": lambda v: bool(v),
"List-Unsubscribe": lambda v: bool(v),
}
# Gmail-safe max length per email body
MAX_MESSAGE_LENGTH = 50_000
@ -50,7 +64,17 @@ MAX_MESSAGE_LENGTH = 50_000
# Supported image extensions for inline detection
_IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}
def _is_automated_sender(address: str, headers: dict) -> bool:
"""Return True if this email is from an automated/noreply source."""
addr = address.lower()
if any(pattern in addr for pattern in _NOREPLY_PATTERNS):
return True
for header, check in _AUTOMATED_HEADERS.items():
value = headers.get(header, "")
if value and check(value):
return True
return False
def check_email_requirements() -> bool:
"""Check if email platform dependencies are available."""
addr = os.getenv("EMAIL_ADDRESS")
@ -313,55 +337,63 @@ class EmailAdapter(BasePlatformAdapter):
results = []
try:
imap = imaplib.IMAP4_SSL(self._imap_host, self._imap_port, timeout=30)
imap.login(self._address, self._password)
imap.select("INBOX")
try:
imap.login(self._address, self._password)
imap.select("INBOX")
status, data = imap.uid("search", None, "UNSEEN")
if status != "OK" or not data or not data[0]:
imap.logout()
return results
status, data = imap.uid("search", None, "UNSEEN")
if status != "OK" or not data or not data[0]:
return results
for uid in data[0].split():
if uid in self._seen_uids:
continue
self._seen_uids.add(uid)
# Trim periodically to prevent unbounded memory growth
if len(self._seen_uids) > self._seen_uids_max:
self._trim_seen_uids()
for uid in data[0].split():
if uid in self._seen_uids:
continue
self._seen_uids.add(uid)
# Trim periodically to prevent unbounded memory growth
if len(self._seen_uids) > self._seen_uids_max:
self._trim_seen_uids()
status, msg_data = imap.uid("fetch", uid, "(RFC822)")
if status != "OK":
continue
status, msg_data = imap.uid("fetch", uid, "(RFC822)")
if status != "OK":
continue
raw_email = msg_data[0][1]
msg = email_lib.message_from_bytes(raw_email)
raw_email = msg_data[0][1]
msg = email_lib.message_from_bytes(raw_email)
sender_raw = msg.get("From", "")
sender_addr = _extract_email_address(sender_raw)
sender_name = _decode_header_value(sender_raw)
# Remove email from name if present
if "<" in sender_name:
sender_name = sender_name.split("<")[0].strip().strip('"')
sender_raw = msg.get("From", "")
sender_addr = _extract_email_address(sender_raw)
sender_name = _decode_header_value(sender_raw)
# Remove email from name if present
if "<" in sender_name:
sender_name = sender_name.split("<")[0].strip().strip('"')
subject = _decode_header_value(msg.get("Subject", "(no subject)"))
message_id = msg.get("Message-ID", "")
in_reply_to = msg.get("In-Reply-To", "")
body = _extract_text_body(msg)
attachments = _extract_attachments(msg, skip_attachments=self._skip_attachments)
subject = _decode_header_value(msg.get("Subject", "(no subject)"))
message_id = msg.get("Message-ID", "")
in_reply_to = msg.get("In-Reply-To", "")
# Skip automated/noreply senders before any processing
msg_headers = dict(msg.items())
if _is_automated_sender(sender_addr, msg_headers):
logger.debug("[Email] Skipping automated sender: %s", sender_addr)
continue
body = _extract_text_body(msg)
attachments = _extract_attachments(msg, skip_attachments=self._skip_attachments)
results.append({
"uid": uid,
"sender_addr": sender_addr,
"sender_name": sender_name,
"subject": subject,
"message_id": message_id,
"in_reply_to": in_reply_to,
"body": body,
"attachments": attachments,
"date": msg.get("Date", ""),
})
imap.logout()
results.append({
"uid": uid,
"sender_addr": sender_addr,
"sender_name": sender_name,
"subject": subject,
"message_id": message_id,
"in_reply_to": in_reply_to,
"body": body,
"attachments": attachments,
"date": msg.get("Date", ""),
})
finally:
try:
imap.logout()
except Exception:
pass
except Exception as e:
logger.error("[Email] IMAP fetch error: %s", e)
return results
@ -374,6 +406,11 @@ class EmailAdapter(BasePlatformAdapter):
if sender_addr == self._address.lower():
return
# Never reply to automated senders
if _is_automated_sender(sender_addr, {}):
logger.debug("[Email] Dropping automated sender at dispatch: %s", sender_addr)
return
subject = msg_data["subject"]
body = msg_data["body"].strip()
attachments = msg_data["attachments"]
@ -469,10 +506,15 @@ class EmailAdapter(BasePlatformAdapter):
msg.attach(MIMEText(body, "plain", "utf-8"))
smtp = smtplib.SMTP(self._smtp_host, self._smtp_port, timeout=30)
smtp.starttls(context=ssl.create_default_context())
smtp.login(self._address, self._password)
smtp.send_message(msg)
smtp.quit()
try:
smtp.starttls(context=ssl.create_default_context())
smtp.login(self._address, self._password)
smtp.send_message(msg)
finally:
try:
smtp.quit()
except Exception:
smtp.close()
logger.info("[Email] Sent reply to %s (subject: %s)", to_addr, subject)
return msg_id
@ -556,10 +598,15 @@ class EmailAdapter(BasePlatformAdapter):
msg.attach(part)
smtp = smtplib.SMTP(self._smtp_host, self._smtp_port, timeout=30)
smtp.starttls(context=ssl.create_default_context())
smtp.login(self._address, self._password)
smtp.send_message(msg)
smtp.quit()
try:
smtp.starttls(context=ssl.create_default_context())
smtp.login(self._address, self._password)
smtp.send_message(msg)
finally:
try:
smtp.quit()
except Exception:
smtp.close()
return msg_id

3255
gateway/platforms/feishu.py Normal file

File diff suppressed because it is too large Load Diff

View File

@ -17,6 +17,8 @@ Environment variables:
from __future__ import annotations
import asyncio
import io
import json
import logging
import mimetypes
import os
@ -40,11 +42,21 @@ logger = logging.getLogger(__name__)
MAX_MESSAGE_LENGTH = 4000
# Store directory for E2EE keys and sync state.
_STORE_DIR = Path.home() / ".hermes" / "matrix" / "store"
# Uses get_hermes_home() so each profile gets its own Matrix store.
from hermes_constants import get_hermes_dir as _get_hermes_dir
_STORE_DIR = _get_hermes_dir("platforms/matrix/store", "matrix/store")
# Grace period: ignore messages older than this many seconds before startup.
_STARTUP_GRACE_SECONDS = 5
# E2EE key export file for persistence across restarts.
_KEY_EXPORT_FILE = _STORE_DIR / "exported_keys.txt"
_KEY_EXPORT_PASSPHRASE = "hermes-matrix-e2ee-keys"
# Pending undecrypted events: cap and TTL for retry buffer.
_MAX_PENDING_EVENTS = 100
_PENDING_EVENT_TTL = 300 # seconds — stop retrying after 5 min
def check_matrix_requirements() -> bool:
"""Return True if the Matrix adapter can be used."""
@ -107,6 +119,10 @@ class MatrixAdapter(BasePlatformAdapter):
self._processed_events: deque = deque(maxlen=1000)
self._processed_events_set: set = set()
# Buffer for undecrypted events pending key receipt.
# Each entry: (room, event, timestamp)
self._pending_megolm: list = []
def _is_duplicate_event(self, event_id) -> bool:
"""Return True if this event was already processed. Tracks the ID otherwise."""
if not event_id:
@ -228,6 +244,16 @@ class MatrixAdapter(BasePlatformAdapter):
logger.info("Matrix: E2EE crypto initialized")
except Exception as exc:
logger.warning("Matrix: crypto init issue: %s", exc)
# Import previously exported Megolm keys (survives restarts).
if _KEY_EXPORT_FILE.exists():
try:
await client.import_keys(
str(_KEY_EXPORT_FILE), _KEY_EXPORT_PASSPHRASE,
)
logger.info("Matrix: imported Megolm keys from backup")
except Exception as exc:
logger.debug("Matrix: could not import keys: %s", exc)
elif self._encryption:
logger.warning(
"Matrix: E2EE requested but crypto store is not loaded; "
@ -282,6 +308,18 @@ class MatrixAdapter(BasePlatformAdapter):
except (asyncio.CancelledError, Exception):
pass
# Export Megolm keys before closing so the next restart can decrypt
# events that used sessions from this run.
if self._client and self._encryption and getattr(self._client, "olm", None):
try:
_STORE_DIR.mkdir(parents=True, exist_ok=True)
await self._client.export_keys(
str(_KEY_EXPORT_FILE), _KEY_EXPORT_PASSPHRASE,
)
logger.info("Matrix: exported Megolm keys for next restart")
except Exception as exc:
logger.debug("Matrix: could not export keys on disconnect: %s", exc)
if self._client:
await self._client.close()
self._client = None
@ -510,8 +548,11 @@ class MatrixAdapter(BasePlatformAdapter):
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Upload an audio file as a voice message."""
return await self._send_local_file(chat_id, audio_path, "m.audio", caption, reply_to, metadata=metadata)
"""Upload an audio file as a voice message (MSC3245 native voice)."""
return await self._send_local_file(
chat_id, audio_path, "m.audio", caption, reply_to,
metadata=metadata, is_voice=True
)
async def send_video(
self,
@ -544,13 +585,16 @@ class MatrixAdapter(BasePlatformAdapter):
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
is_voice: bool = False,
) -> SendResult:
"""Upload bytes to Matrix and send as a media message."""
import nio
# Upload to homeserver.
resp = await self._client.upload(
data,
# nio expects a DataProvider (callable) or file-like object, not raw bytes.
# nio.upload() returns a tuple (UploadResponse|UploadError, Optional[Dict])
resp, maybe_encryption_info = await self._client.upload(
io.BytesIO(data),
content_type=content_type,
filename=filename,
)
@ -572,6 +616,10 @@ class MatrixAdapter(BasePlatformAdapter):
},
}
# Add MSC3245 voice flag for native voice messages.
if is_voice:
msg_content["org.matrix.msc3245.voice"] = {}
if reply_to:
msg_content["m.relates_to"] = {
"m.in_reply_to": {"event_id": reply_to}
@ -599,6 +647,7 @@ class MatrixAdapter(BasePlatformAdapter):
reply_to: Optional[str] = None,
file_name: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
is_voice: bool = False,
) -> SendResult:
"""Read a local file and upload it."""
p = Path(file_path)
@ -611,7 +660,7 @@ class MatrixAdapter(BasePlatformAdapter):
ct = mimetypes.guess_type(fname)[0] or "application/octet-stream"
data = p.read_bytes()
return await self._upload_and_send(room_id, data, fname, ct, msgtype, caption, reply_to, metadata)
return await self._upload_and_send(room_id, data, fname, ct, msgtype, caption, reply_to, metadata, is_voice)
# ------------------------------------------------------------------
# Sync loop
@ -650,17 +699,22 @@ class MatrixAdapter(BasePlatformAdapter):
Hermes uses a custom sync loop instead of matrix-nio's sync_forever(),
so we need to explicitly drive the key management work that sync_forever()
normally handles for encrypted rooms.
Also auto-trusts all devices (so senders share session keys with us)
and retries decryption for any buffered MegolmEvents.
"""
client = self._client
if not client or not self._encryption or not getattr(client, "olm", None):
return
did_query_keys = client.should_query_keys
tasks = [asyncio.create_task(client.send_to_device_messages())]
if client.should_upload_keys:
tasks.append(asyncio.create_task(client.keys_upload()))
if client.should_query_keys:
if did_query_keys:
tasks.append(asyncio.create_task(client.keys_query()))
if client.should_claim_keys:
@ -676,6 +730,111 @@ class MatrixAdapter(BasePlatformAdapter):
except Exception as exc:
logger.warning("Matrix: E2EE maintenance task failed: %s", exc)
# After key queries, auto-trust all devices so senders share keys with
# us. For a bot this is the right default — we want to decrypt
# everything, not enforce manual verification.
if did_query_keys:
self._auto_trust_devices()
# Retry any buffered undecrypted events now that new keys may have
# arrived (from key requests, key queries, or to-device forwarding).
if self._pending_megolm:
await self._retry_pending_decryptions()
def _auto_trust_devices(self) -> None:
"""Trust/verify all unverified devices we know about.
When other clients see our device as verified, they proactively share
Megolm session keys with us. Without this, many clients will refuse
to include an unverified device in key distributions.
"""
client = self._client
if not client:
return
device_store = getattr(client, "device_store", None)
if not device_store:
return
own_device = getattr(client, "device_id", None)
trusted_count = 0
try:
# DeviceStore.__iter__ yields OlmDevice objects directly.
for device in device_store:
if getattr(device, "device_id", None) == own_device:
continue
if not getattr(device, "verified", False):
client.verify_device(device)
trusted_count += 1
except Exception as exc:
logger.debug("Matrix: auto-trust error: %s", exc)
if trusted_count:
logger.info("Matrix: auto-trusted %d new device(s)", trusted_count)
async def _retry_pending_decryptions(self) -> None:
"""Retry decrypting buffered MegolmEvents after new keys arrive."""
import nio
client = self._client
if not client or not self._pending_megolm:
return
now = time.time()
still_pending: list = []
for room, event, ts in self._pending_megolm:
# Drop events that have aged past the TTL.
if now - ts > _PENDING_EVENT_TTL:
logger.debug(
"Matrix: dropping expired pending event %s (age %.0fs)",
getattr(event, "event_id", "?"), now - ts,
)
continue
try:
decrypted = client.decrypt_event(event)
except Exception:
# Still missing the key — keep in buffer.
still_pending.append((room, event, ts))
continue
if isinstance(decrypted, nio.MegolmEvent):
# decrypt_event returned the same undecryptable event.
still_pending.append((room, event, ts))
continue
logger.info(
"Matrix: decrypted buffered event %s (%s)",
getattr(event, "event_id", "?"),
type(decrypted).__name__,
)
# Route to the appropriate handler based on decrypted type.
try:
if isinstance(decrypted, nio.RoomMessageText):
await self._on_room_message(room, decrypted)
elif isinstance(
decrypted,
(nio.RoomMessageImage, nio.RoomMessageAudio,
nio.RoomMessageVideo, nio.RoomMessageFile),
):
await self._on_room_message_media(room, decrypted)
else:
logger.debug(
"Matrix: decrypted event %s has unhandled type %s",
getattr(event, "event_id", "?"),
type(decrypted).__name__,
)
except Exception as exc:
logger.warning(
"Matrix: error processing decrypted event %s: %s",
getattr(event, "event_id", "?"), exc,
)
self._pending_megolm = still_pending
# ------------------------------------------------------------------
# Event callbacks
# ------------------------------------------------------------------
@ -697,13 +856,29 @@ class MatrixAdapter(BasePlatformAdapter):
if event_ts and event_ts < self._startup_ts - _STARTUP_GRACE_SECONDS:
return
# Handle decrypted MegolmEvents — extract the inner event.
# Handle undecryptable MegolmEvents: request the missing session key
# and buffer the event for retry once the key arrives.
if isinstance(event, nio.MegolmEvent):
# Failed to decrypt.
logger.warning(
"Matrix: could not decrypt event %s in %s",
"Matrix: could not decrypt event %s in %s — requesting key",
event.event_id, room.room_id,
)
# Ask other devices in the room to forward the session key.
try:
resp = await self._client.request_room_key(event)
if hasattr(resp, "event_id") or not isinstance(resp, Exception):
logger.debug(
"Matrix: room key request sent for session %s",
getattr(event, "session_id", "?"),
)
except Exception as exc:
logger.debug("Matrix: room key request failed: %s", exc)
# Buffer for retry on next maintenance cycle.
self._pending_megolm.append((room, event, time.time()))
if len(self._pending_megolm) > _MAX_PENDING_EVENTS:
self._pending_megolm = self._pending_megolm[-_MAX_PENDING_EVENTS:]
return
# Skip edits (m.replace relation).
@ -806,11 +981,19 @@ class MatrixAdapter(BasePlatformAdapter):
event_mimetype = (content_info.get("info") or {}).get("mimetype", "")
media_type = "application/octet-stream"
msg_type = MessageType.DOCUMENT
is_voice_message = False
if isinstance(event, nio.RoomMessageImage):
msg_type = MessageType.PHOTO
media_type = event_mimetype or "image/png"
elif isinstance(event, nio.RoomMessageAudio):
msg_type = MessageType.AUDIO
# Check for MSC3245 voice flag: org.matrix.msc3245.voice: {}
source_content = getattr(event, "source", {}).get("content", {})
if source_content.get("org.matrix.msc3245.voice") is not None:
is_voice_message = True
msg_type = MessageType.VOICE
else:
msg_type = MessageType.AUDIO
media_type = event_mimetype or "audio/ogg"
elif isinstance(event, nio.RoomMessageVideo):
msg_type = MessageType.VIDEO
@ -848,6 +1031,31 @@ class MatrixAdapter(BasePlatformAdapter):
if relates_to.get("rel_type") == "m.thread":
thread_id = relates_to.get("event_id")
# For voice messages, cache audio locally for transcription tools.
# Use the authenticated nio client to download (Matrix requires auth for media).
media_urls = [http_url] if http_url else None
media_types = [media_type] if http_url else None
if is_voice_message and url and url.startswith("mxc://"):
try:
import nio
from gateway.platforms.base import cache_audio_from_bytes
resp = await self._client.download(mxc=url)
if isinstance(resp, nio.MemoryDownloadResponse):
# Extract extension from mimetype or default to .ogg
ext = ".ogg"
if media_type and "/" in media_type:
subtype = media_type.split("/")[1]
ext = f".{subtype}" if subtype else ".ogg"
local_path = cache_audio_from_bytes(resp.body, ext)
media_urls = [local_path]
logger.debug("Matrix: cached voice message to %s", local_path)
else:
logger.warning("Matrix: failed to download voice: %s", getattr(resp, "message", resp))
except Exception as e:
logger.warning("Matrix: failed to cache voice message, using HTTP URL: %s", e)
source = self.build_source(
chat_id=room.room_id,
chat_type=chat_type,
@ -856,8 +1064,9 @@ class MatrixAdapter(BasePlatformAdapter):
thread_id=thread_id,
)
# Use cached local path for images, HTTP URL for other media types
media_urls = [cached_path] if cached_path else ([http_url] if http_url else None)
# Use cached local path for images (voice messages already handled above).
if cached_path:
media_urls = [cached_path]
media_types = [media_type] if media_urls else None
msg_event = MessageEvent(

View File

@ -603,9 +603,19 @@ class MattermostAdapter(BasePlatformAdapter):
# For DMs, user_id is sufficient. For channels, check for @mention.
message_text = post.get("message", "")
# Mention-only mode: skip channel messages that don't @mention the bot.
# DMs (type "D") are always processed.
# Mention-gating for non-DM channels.
# Config (env vars):
# MATTERMOST_REQUIRE_MENTION: Require @mention in channels (default: true)
# MATTERMOST_FREE_RESPONSE_CHANNELS: Channel IDs where bot responds without mention
if channel_type_raw != "D":
require_mention = os.getenv(
"MATTERMOST_REQUIRE_MENTION", "true"
).lower() not in ("false", "0", "no")
free_channels_raw = os.getenv("MATTERMOST_FREE_RESPONSE_CHANNELS", "")
free_channels = {ch.strip() for ch in free_channels_raw.split(",") if ch.strip()}
is_free_channel = channel_id in free_channels
mention_patterns = [
f"@{self._bot_username}",
f"@{self._bot_user_id}",
@ -614,13 +624,21 @@ class MattermostAdapter(BasePlatformAdapter):
pattern.lower() in message_text.lower()
for pattern in mention_patterns
)
if not has_mention:
if require_mention and not is_free_channel and not has_mention:
logger.debug(
"Mattermost: skipping non-DM message without @mention (channel=%s)",
channel_id,
)
return
# Strip @mention from the message text so the agent sees clean input.
if has_mention:
for pattern in mention_patterns:
message_text = re.sub(
re.escape(pattern), "", message_text, flags=re.IGNORECASE
).strip()
# Resolve sender info.
sender_id = post.get("user_id", "")
sender_name = data.get("sender_name", "").lstrip("@") or sender_id

View File

@ -22,7 +22,7 @@ import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List, Optional, Any
from urllib.parse import unquote
from urllib.parse import quote, unquote
import httpx
@ -184,6 +184,8 @@ class SignalAdapter(BasePlatformAdapter):
self._recent_sent_timestamps: set = set()
self._max_recent_timestamps = 50
self._phone_lock_identity: Optional[str] = None
logger.info("Signal adapter initialized: url=%s account=%s groups=%s",
self.http_url, _redact_phone(self.account),
"enabled" if self.group_allow_from else "disabled")
@ -198,6 +200,29 @@ class SignalAdapter(BasePlatformAdapter):
logger.error("Signal: SIGNAL_HTTP_URL and SIGNAL_ACCOUNT are required")
return False
# Acquire scoped lock to prevent duplicate Signal listeners for the same phone
try:
from gateway.status import acquire_scoped_lock
self._phone_lock_identity = self.account
acquired, existing = acquire_scoped_lock(
"signal-phone",
self._phone_lock_identity,
metadata={"platform": self.platform.value},
)
if not acquired:
owner_pid = existing.get("pid") if isinstance(existing, dict) else None
message = (
"Another local Hermes gateway is already using this Signal account"
+ (f" (PID {owner_pid})." if owner_pid else ".")
+ " Stop the other gateway before starting a second Signal listener."
)
logger.error("Signal: %s", message)
self._set_fatal_error("signal_phone_lock", message, retryable=False)
return False
except Exception as e:
logger.warning("Signal: Could not acquire phone lock (non-fatal): %s", e)
self.client = httpx.AsyncClient(timeout=30.0)
# Health check — verify signal-cli daemon is reachable
@ -245,6 +270,14 @@ class SignalAdapter(BasePlatformAdapter):
await self.client.aclose()
self.client = None
if self._phone_lock_identity:
try:
from gateway.status import release_scoped_lock
release_scoped_lock("signal-phone", self._phone_lock_identity)
except Exception as e:
logger.warning("Signal: Error releasing phone lock: %s", e, exc_info=True)
self._phone_lock_identity = None
logger.info("Signal: disconnected")
# ------------------------------------------------------------------
@ -253,7 +286,7 @@ class SignalAdapter(BasePlatformAdapter):
async def _sse_listener(self) -> None:
"""Listen for SSE events from signal-cli daemon."""
url = f"{self.http_url}/api/v1/events?account={self.account}"
url = f"{self.http_url}/api/v1/events?account={quote(self.account, safe='')}"
backoff = SSE_RETRY_DELAY_INITIAL
while self._running:
@ -521,7 +554,7 @@ class SignalAdapter(BasePlatformAdapter):
"""Fetch an attachment via JSON-RPC and cache it. Returns (path, ext)."""
result = await self._rpc("getAttachment", {
"account": self.account,
"attachmentId": attachment_id,
"id": attachment_id,
})
if not result:

View File

@ -9,6 +9,7 @@ Uses slack-bolt (Python) with Socket Mode for:
"""
import asyncio
import json
import logging
import os
import re
@ -73,6 +74,10 @@ class SlackAdapter(BasePlatformAdapter):
self._bot_user_id: Optional[str] = None
self._user_name_cache: Dict[str, str] = {} # user_id → display name
self._socket_mode_task: Optional[asyncio.Task] = None
# Multi-workspace support
self._team_clients: Dict[str, AsyncWebClient] = {} # team_id → WebClient
self._team_bot_user_ids: Dict[str, str] = {} # team_id → bot_user_id
self._channel_team: Dict[str, str] = {} # channel_id → team_id
async def connect(self) -> bool:
"""Connect to Slack via Socket Mode."""
@ -82,23 +87,70 @@ class SlackAdapter(BasePlatformAdapter):
)
return False
bot_token = self.config.token
raw_token = self.config.token
app_token = os.getenv("SLACK_APP_TOKEN")
if not bot_token:
if not raw_token:
logger.error("[Slack] SLACK_BOT_TOKEN not set")
return False
if not app_token:
logger.error("[Slack] SLACK_APP_TOKEN not set")
return False
try:
self._app = AsyncApp(token=bot_token)
# Support comma-separated bot tokens for multi-workspace
bot_tokens = [t.strip() for t in raw_token.split(",") if t.strip()]
# Get our own bot user ID for mention detection
auth_response = await self._app.client.auth_test()
self._bot_user_id = auth_response.get("user_id")
bot_name = auth_response.get("user", "unknown")
# Also load tokens from OAuth token file
from hermes_constants import get_hermes_home
tokens_file = get_hermes_home() / "slack_tokens.json"
if tokens_file.exists():
try:
saved = json.loads(tokens_file.read_text(encoding="utf-8"))
for team_id, entry in saved.items():
tok = entry.get("token", "") if isinstance(entry, dict) else ""
if tok and tok not in bot_tokens:
bot_tokens.append(tok)
team_label = entry.get("team_name", team_id) if isinstance(entry, dict) else team_id
logger.info("[Slack] Loaded saved token for workspace %s", team_label)
except Exception as e:
logger.warning("[Slack] Failed to read %s: %s", tokens_file, e)
try:
# Acquire scoped lock to prevent duplicate app token usage
from gateway.status import acquire_scoped_lock
self._token_lock_identity = app_token
acquired, existing = acquire_scoped_lock('slack-app-token', app_token, metadata={'platform': 'slack'})
if not acquired:
owner_pid = existing.get('pid') if isinstance(existing, dict) else None
message = f'Slack app token already in use' + (f' (PID {owner_pid})' if owner_pid else '') + '. Stop the other gateway first.'
logger.error('[%s] %s', self.name, message)
self._set_fatal_error('slack_token_lock', message, retryable=False)
return False
# First token is the primary — used for AsyncApp / Socket Mode
primary_token = bot_tokens[0]
self._app = AsyncApp(token=primary_token)
# Register each bot token and map team_id → client
for token in bot_tokens:
client = AsyncWebClient(token=token)
auth_response = await client.auth_test()
team_id = auth_response.get("team_id", "")
bot_user_id = auth_response.get("user_id", "")
bot_name = auth_response.get("user", "unknown")
team_name = auth_response.get("team", "unknown")
self._team_clients[team_id] = client
self._team_bot_user_ids[team_id] = bot_user_id
# First token sets the primary bot_user_id (backward compat)
if self._bot_user_id is None:
self._bot_user_id = bot_user_id
logger.info(
"[Slack] Authenticated as @%s in workspace %s (team: %s)",
bot_name, team_name, team_id,
)
# Register message event handler
@self._app.event("message")
@ -123,7 +175,10 @@ class SlackAdapter(BasePlatformAdapter):
self._socket_mode_task = asyncio.create_task(self._handler.start_async())
self._running = True
logger.info("[Slack] Connected as @%s (Socket Mode)", bot_name)
logger.info(
"[Slack] Socket Mode connected (%d workspace(s))",
len(self._team_clients),
)
return True
except Exception as e: # pragma: no cover - defensive logging
@ -138,8 +193,25 @@ class SlackAdapter(BasePlatformAdapter):
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[Slack] Error while closing Socket Mode handler: %s", e, exc_info=True)
self._running = False
# Release the token lock (use stored identity, not re-read env)
try:
from gateway.status import release_scoped_lock
if getattr(self, '_token_lock_identity', None):
release_scoped_lock('slack-app-token', self._token_lock_identity)
self._token_lock_identity = None
except Exception:
pass
logger.info("[Slack] Disconnected")
def _get_client(self, chat_id: str) -> AsyncWebClient:
"""Return the workspace-specific WebClient for a channel."""
team_id = self._channel_team.get(chat_id)
if team_id and team_id in self._team_clients:
return self._team_clients[team_id]
return self._app.client # fallback to primary
async def send(
self,
chat_id: str,
@ -176,7 +248,7 @@ class SlackAdapter(BasePlatformAdapter):
if broadcast and i == 0:
kwargs["reply_broadcast"] = True
last_result = await self._app.client.chat_postMessage(**kwargs)
last_result = await self._get_client(chat_id).chat_postMessage(**kwargs)
return SendResult(
success=True,
@ -198,7 +270,7 @@ class SlackAdapter(BasePlatformAdapter):
if not self._app:
return SendResult(success=False, error="Not connected")
try:
await self._app.client.chat_update(
await self._get_client(chat_id).chat_update(
channel=chat_id,
ts=message_id,
text=content,
@ -232,7 +304,7 @@ class SlackAdapter(BasePlatformAdapter):
return # Can only set status in a thread context
try:
await self._app.client.assistant_threads_setStatus(
await self._get_client(chat_id).assistant_threads_setStatus(
channel_id=chat_id,
thread_ts=thread_ts,
status="is thinking...",
@ -251,7 +323,18 @@ class SlackAdapter(BasePlatformAdapter):
Prefers metadata thread_id (the thread parent's ts, set by the
gateway) over reply_to (which may be a child message's ts).
When ``reply_in_thread`` is ``false`` in the platform extra config,
top-level channel messages receive direct channel replies instead of
thread replies. Messages that originate inside an existing thread are
always replied to in-thread to preserve conversation context.
"""
# When reply_in_thread is disabled (default: True for backward compat),
# only thread messages that are already part of an existing thread.
if not self.config.extra.get("reply_in_thread", True):
existing_thread = (metadata or {}).get("thread_id") or (metadata or {}).get("thread_ts")
return existing_thread or None
if metadata:
if metadata.get("thread_id"):
return metadata["thread_id"]
@ -274,7 +357,7 @@ class SlackAdapter(BasePlatformAdapter):
if not os.path.exists(file_path):
raise FileNotFoundError(f"File not found: {file_path}")
result = await self._app.client.files_upload_v2(
result = await self._get_client(chat_id).files_upload_v2(
channel=chat_id,
file=file_path,
filename=os.path.basename(file_path),
@ -376,7 +459,7 @@ class SlackAdapter(BasePlatformAdapter):
if not self._app:
return False
try:
await self._app.client.reactions_add(
await self._get_client(channel).reactions_add(
channel=channel, timestamp=timestamp, name=emoji
)
return True
@ -392,7 +475,7 @@ class SlackAdapter(BasePlatformAdapter):
if not self._app:
return False
try:
await self._app.client.reactions_remove(
await self._get_client(channel).reactions_remove(
channel=channel, timestamp=timestamp, name=emoji
)
return True
@ -402,7 +485,7 @@ class SlackAdapter(BasePlatformAdapter):
# ----- User identity resolution -----
async def _resolve_user_name(self, user_id: str) -> str:
async def _resolve_user_name(self, user_id: str, chat_id: str = "") -> str:
"""Resolve a Slack user ID to a display name, with caching."""
if not user_id:
return ""
@ -413,7 +496,8 @@ class SlackAdapter(BasePlatformAdapter):
return user_id
try:
result = await self._app.client.users_info(user=user_id)
client = self._get_client(chat_id) if chat_id else self._app.client
result = await client.users_info(user=user_id)
user = result.get("user", {})
# Prefer display_name → real_name → user_id
profile = user.get("profile", {})
@ -477,7 +561,7 @@ class SlackAdapter(BasePlatformAdapter):
response = await client.get(image_url)
response.raise_for_status()
result = await self._app.client.files_upload_v2(
result = await self._get_client(chat_id).files_upload_v2(
channel=chat_id,
content=response.content,
filename="image.png",
@ -537,7 +621,7 @@ class SlackAdapter(BasePlatformAdapter):
return SendResult(success=False, error=f"Video file not found: {video_path}")
try:
result = await self._app.client.files_upload_v2(
result = await self._get_client(chat_id).files_upload_v2(
channel=chat_id,
file=video_path,
filename=os.path.basename(video_path),
@ -578,7 +662,7 @@ class SlackAdapter(BasePlatformAdapter):
display_name = file_name or os.path.basename(file_path)
try:
result = await self._app.client.files_upload_v2(
result = await self._get_client(chat_id).files_upload_v2(
channel=chat_id,
file=file_path,
filename=display_name,
@ -606,7 +690,7 @@ class SlackAdapter(BasePlatformAdapter):
return {"name": chat_id, "type": "unknown"}
try:
result = await self._app.client.conversations_info(channel=chat_id)
result = await self._get_client(chat_id).conversations_info(channel=chat_id)
channel = result.get("channel", {})
is_dm = channel.get("is_im", False)
return {
@ -639,6 +723,11 @@ class SlackAdapter(BasePlatformAdapter):
user_id = event.get("user", "")
channel_id = event.get("channel", "")
ts = event.get("ts", "")
team_id = event.get("team", "")
# Track which workspace owns this channel
if team_id and channel_id:
self._channel_team[channel_id] = team_id
# Determine if this is a DM or channel message
channel_type = event.get("channel_type", "")
@ -655,11 +744,12 @@ class SlackAdapter(BasePlatformAdapter):
thread_ts = event.get("thread_ts") or ts # ts fallback for channels
# In channels, only respond if bot is mentioned
if not is_dm and self._bot_user_id:
if f"<@{self._bot_user_id}>" not in text:
bot_uid = self._team_bot_user_ids.get(team_id, self._bot_user_id)
if not is_dm and bot_uid:
if f"<@{bot_uid}>" not in text:
return
# Strip the bot mention from the text
text = text.replace(f"<@{self._bot_user_id}>", "").strip()
text = text.replace(f"<@{bot_uid}>", "").strip()
# Determine message type
msg_type = MessageType.TEXT
@ -679,7 +769,7 @@ class SlackAdapter(BasePlatformAdapter):
if ext not in (".jpg", ".jpeg", ".png", ".gif", ".webp"):
ext = ".jpg"
# Slack private URLs require the bot token as auth header
cached = await self._download_slack_file(url, ext)
cached = await self._download_slack_file(url, ext, team_id=team_id)
media_urls.append(cached)
media_types.append(mimetype)
msg_type = MessageType.PHOTO
@ -690,7 +780,7 @@ class SlackAdapter(BasePlatformAdapter):
ext = "." + mimetype.split("/")[-1].split(";")[0]
if ext not in (".ogg", ".mp3", ".wav", ".webm", ".m4a"):
ext = ".ogg"
cached = await self._download_slack_file(url, ext, audio=True)
cached = await self._download_slack_file(url, ext, audio=True, team_id=team_id)
media_urls.append(cached)
media_types.append(mimetype)
msg_type = MessageType.VOICE
@ -721,7 +811,7 @@ class SlackAdapter(BasePlatformAdapter):
continue
# Download and cache
raw_bytes = await self._download_slack_file_bytes(url)
raw_bytes = await self._download_slack_file_bytes(url, team_id=team_id)
cached_path = cache_document_from_bytes(
raw_bytes, original_filename or f"document{ext}"
)
@ -750,7 +840,7 @@ class SlackAdapter(BasePlatformAdapter):
logger.warning("[Slack] Failed to cache document from %s: %s", url, e, exc_info=True)
# Resolve user display name (cached after first lookup)
user_name = await self._resolve_user_name(user_id)
user_name = await self._resolve_user_name(user_id, chat_id=channel_id)
# Build source
source = self.build_source(
@ -787,6 +877,11 @@ class SlackAdapter(BasePlatformAdapter):
text = command.get("text", "").strip()
user_id = command.get("user_id", "")
channel_id = command.get("channel_id", "")
team_id = command.get("team_id", "")
# Track which workspace owns this channel
if team_id and channel_id:
self._channel_team[channel_id] = team_id
# Map subcommands to gateway commands — derived from central registry.
# Also keep "compact" as a Slack-specific alias for /compress.
@ -818,12 +913,12 @@ class SlackAdapter(BasePlatformAdapter):
await self.handle_message(event)
async def _download_slack_file(self, url: str, ext: str, audio: bool = False) -> str:
async def _download_slack_file(self, url: str, ext: str, audio: bool = False, team_id: str = "") -> str:
"""Download a Slack file using the bot token for auth, with retry."""
import asyncio
import httpx
bot_token = self.config.token
bot_token = self._team_clients[team_id].token if team_id and team_id in self._team_clients else self.config.token
last_exc = None
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
@ -853,12 +948,12 @@ class SlackAdapter(BasePlatformAdapter):
raise
raise last_exc
async def _download_slack_file_bytes(self, url: str) -> bytes:
async def _download_slack_file_bytes(self, url: str, team_id: str = "") -> bytes:
"""Download a Slack file and return raw bytes, with retry."""
import asyncio
import httpx
bot_token = self.config.token
bot_token = self._team_clients[team_id].token if team_id and team_id in self._team_clients else self.config.token
last_exc = None
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:

View File

@ -8,6 +8,7 @@ Uses python-telegram-bot library for:
"""
import asyncio
import json
import logging
import os
import re
@ -122,6 +123,8 @@ class TelegramAdapter(BasePlatformAdapter):
super().__init__(config, Platform.TELEGRAM)
self._app: Optional[Application] = None
self._bot: Optional[Bot] = None
self._webhook_mode: bool = False
self._mention_patterns = self._compile_mention_patterns()
self._reply_to_mode: str = getattr(config, 'reply_to_mode', 'first') or 'first'
# Buffer rapid/album photo updates so Telegram image bursts are handled
# as a single MessageEvent instead of self-interrupting multiple turns.
@ -345,7 +348,8 @@ class TelegramAdapter(BasePlatformAdapter):
def _persist_dm_topic_thread_id(self, chat_id: int, topic_name: str, thread_id: int) -> None:
"""Save a newly created thread_id back into config.yaml so it persists across restarts."""
try:
config_path = _Path.home() / ".hermes" / "config.yaml"
from hermes_constants import get_hermes_home
config_path = get_hermes_home() / "config.yaml"
if not config_path.exists():
logger.warning("[%s] Config file not found at %s, cannot persist thread_id", self.name, config_path)
return
@ -455,7 +459,19 @@ class TelegramAdapter(BasePlatformAdapter):
self._persist_dm_topic_thread_id(int(chat_id), topic_name, thread_id)
async def connect(self) -> bool:
"""Connect to Telegram and start polling for updates."""
"""Connect to Telegram via polling or webhook.
By default, uses long polling (outbound connection to Telegram).
If ``TELEGRAM_WEBHOOK_URL`` is set, starts an HTTP webhook server
instead. Webhook mode is useful for cloud deployments (Fly.io,
Railway) where inbound HTTP can wake a suspended machine.
Env vars for webhook mode::
TELEGRAM_WEBHOOK_URL Public HTTPS URL (e.g. https://app.fly.dev/telegram)
TELEGRAM_WEBHOOK_PORT Local listen port (default 8443)
TELEGRAM_WEBHOOK_SECRET Secret token for update verification
"""
if not TELEGRAM_AVAILABLE:
logger.error(
"[%s] python-telegram-bot not installed. Run: pip install python-telegram-bot",
@ -549,37 +565,76 @@ class TelegramAdapter(BasePlatformAdapter):
else:
raise
await self._app.start()
loop = asyncio.get_running_loop()
def _polling_error_callback(error: Exception) -> None:
if self._polling_error_task and not self._polling_error_task.done():
return
if self._looks_like_polling_conflict(error):
self._polling_error_task = loop.create_task(self._handle_polling_conflict(error))
elif self._looks_like_network_error(error):
logger.warning("[%s] Telegram network error, scheduling reconnect: %s", self.name, error)
self._polling_error_task = loop.create_task(self._handle_polling_network_error(error))
else:
logger.error("[%s] Telegram polling error: %s", self.name, error, exc_info=True)
# Decide between webhook and polling mode
webhook_url = os.getenv("TELEGRAM_WEBHOOK_URL", "").strip()
# Store reference for retry use in _handle_polling_conflict
self._polling_error_callback_ref = _polling_error_callback
if webhook_url:
# ── Webhook mode ─────────────────────────────────────
# Telegram pushes updates to our HTTP endpoint. This
# enables cloud platforms (Fly.io, Railway) to auto-wake
# suspended machines on inbound HTTP traffic.
webhook_port = int(os.getenv("TELEGRAM_WEBHOOK_PORT", "8443"))
webhook_secret = os.getenv("TELEGRAM_WEBHOOK_SECRET", "").strip() or None
from urllib.parse import urlparse
webhook_path = urlparse(webhook_url).path or "/telegram"
await self._app.updater.start_polling(
allowed_updates=Update.ALL_TYPES,
drop_pending_updates=True,
error_callback=_polling_error_callback,
)
await self._app.updater.start_webhook(
listen="0.0.0.0",
port=webhook_port,
url_path=webhook_path,
webhook_url=webhook_url,
secret_token=webhook_secret,
allowed_updates=Update.ALL_TYPES,
drop_pending_updates=True,
)
self._webhook_mode = True
logger.info(
"[%s] Webhook server listening on 0.0.0.0:%d%s",
self.name, webhook_port, webhook_path,
)
else:
# ── Polling mode (default) ───────────────────────────
loop = asyncio.get_running_loop()
def _polling_error_callback(error: Exception) -> None:
if self._polling_error_task and not self._polling_error_task.done():
return
if self._looks_like_polling_conflict(error):
self._polling_error_task = loop.create_task(self._handle_polling_conflict(error))
elif self._looks_like_network_error(error):
logger.warning("[%s] Telegram network error, scheduling reconnect: %s", self.name, error)
self._polling_error_task = loop.create_task(self._handle_polling_network_error(error))
else:
logger.error("[%s] Telegram polling error: %s", self.name, error, exc_info=True)
# Store reference for retry use in _handle_polling_conflict
self._polling_error_callback_ref = _polling_error_callback
await self._app.updater.start_polling(
allowed_updates=Update.ALL_TYPES,
drop_pending_updates=True,
error_callback=_polling_error_callback,
)
# Register bot commands so Telegram shows a hint menu when users type /
# List is derived from the central COMMAND_REGISTRY — adding a new
# gateway command there automatically adds it to the Telegram menu.
try:
from telegram import BotCommand
from hermes_cli.commands import telegram_bot_commands
from hermes_cli.commands import telegram_menu_commands
# Telegram allows up to 100 commands but has an undocumented
# payload size limit. Skill descriptions are truncated to 40
# chars in telegram_menu_commands() to fit 100 commands safely.
menu_commands, hidden_count = telegram_menu_commands(max_commands=100)
await self._bot.set_my_commands([
BotCommand(name, desc) for name, desc in telegram_bot_commands()
BotCommand(name, desc) for name, desc in menu_commands
])
if hidden_count:
logger.info(
"[%s] Telegram menu: %d commands registered, %d hidden (over 100 limit). Use /commands for full list.",
self.name, len(menu_commands), hidden_count,
)
except Exception as e:
logger.warning(
"[%s] Could not register Telegram command menu: %s",
@ -589,7 +644,8 @@ class TelegramAdapter(BasePlatformAdapter):
)
self._mark_connected()
logger.info("[%s] Connected and polling for Telegram updates", self.name)
mode = "webhook" if self._webhook_mode else "polling"
logger.info("[%s] Connected to Telegram (%s mode)", self.name, mode)
# Set up DM topics (Bot API 9.4 — Private Chat Topics)
# Runs after connection is established so the bot can call createForumTopic.
@ -617,7 +673,7 @@ class TelegramAdapter(BasePlatformAdapter):
return False
async def disconnect(self) -> None:
"""Stop polling, cancel pending album flushes, and disconnect."""
"""Stop polling/webhook, cancel pending album flushes, and disconnect."""
pending_media_group_tasks = list(self._media_group_tasks.values())
for task in pending_media_group_tasks:
task.cancel()
@ -686,6 +742,10 @@ class TelegramAdapter(BasePlatformAdapter):
if not self._bot:
return SendResult(success=False, error="Not connected")
# Skip whitespace-only text to prevent Telegram 400 empty-text errors.
if not content or not content.strip():
return SendResult(success=True, message_id=None)
try:
# Format and split message if needed
formatted = self.format_message(content)
@ -761,6 +821,16 @@ class TelegramAdapter(BasePlatformAdapter):
)
effective_thread_id = None
continue
if "message to be replied not found" in err_lower and reply_to_id is not None:
# Original message was deleted before we
# could reply — clear reply target and retry
# so the response is still delivered.
logger.warning(
"[%s] Reply target deleted, retrying without reply_to: %s",
self.name, send_err,
)
reply_to_id = None
continue
# Other BadRequest errors are permanent — don't retry
raise
if _send_attempt < 2:
@ -830,7 +900,9 @@ class TelegramAdapter(BasePlatformAdapter):
except Exception:
pass # best-effort truncation
return SendResult(success=True, message_id=message_id)
# Flood control / RetryAfter — back off and retry once
# Flood control / RetryAfter — short waits are retried inline,
# long waits return a failure immediately so streaming can fall back
# to a normal final send instead of leaving a truncated partial.
retry_after = getattr(e, "retry_after", None)
if retry_after is not None or "retry after" in err_str:
wait = retry_after if retry_after else 1.0
@ -838,6 +910,8 @@ class TelegramAdapter(BasePlatformAdapter):
"[%s] Telegram flood control, waiting %.1fs",
self.name, wait,
)
if wait > 5.0:
return SendResult(success=False, error=f"flood_control:{wait}")
await asyncio.sleep(wait)
try:
await self._bot.edit_message_text(
@ -1314,6 +1388,148 @@ class TelegramAdapter(BasePlatformAdapter):
return text
# ── Group mention gating ──────────────────────────────────────────────
def _telegram_require_mention(self) -> bool:
"""Return whether group chats should require an explicit bot trigger."""
configured = self.config.extra.get("require_mention")
if configured is not None:
if isinstance(configured, str):
return configured.lower() in ("true", "1", "yes", "on")
return bool(configured)
return os.getenv("TELEGRAM_REQUIRE_MENTION", "false").lower() in ("true", "1", "yes", "on")
def _telegram_free_response_chats(self) -> set[str]:
raw = self.config.extra.get("free_response_chats")
if raw is None:
raw = os.getenv("TELEGRAM_FREE_RESPONSE_CHATS", "")
if isinstance(raw, list):
return {str(part).strip() for part in raw if str(part).strip()}
return {part.strip() for part in str(raw).split(",") if part.strip()}
def _compile_mention_patterns(self) -> List[re.Pattern]:
"""Compile optional regex wake-word patterns for group triggers."""
patterns = self.config.extra.get("mention_patterns")
if patterns is None:
raw = os.getenv("TELEGRAM_MENTION_PATTERNS", "").strip()
if raw:
try:
loaded = json.loads(raw)
except Exception:
loaded = [part.strip() for part in raw.splitlines() if part.strip()]
if not loaded:
loaded = [part.strip() for part in raw.split(",") if part.strip()]
patterns = loaded
if patterns is None:
return []
if isinstance(patterns, str):
patterns = [patterns]
if not isinstance(patterns, list):
logger.warning(
"[%s] telegram mention_patterns must be a list or string; got %s",
self.name,
type(patterns).__name__,
)
return []
compiled: List[re.Pattern] = []
for pattern in patterns:
if not isinstance(pattern, str) or not pattern.strip():
continue
try:
compiled.append(re.compile(pattern, re.IGNORECASE))
except re.error as exc:
logger.warning("[%s] Invalid Telegram mention pattern %r: %s", self.name, pattern, exc)
if compiled:
logger.info("[%s] Loaded %d Telegram mention pattern(s)", self.name, len(compiled))
return compiled
def _is_group_chat(self, message: Message) -> bool:
chat = getattr(message, "chat", None)
if not chat:
return False
chat_type = str(getattr(chat, "type", "")).split(".")[-1].lower()
return chat_type in ("group", "supergroup")
def _is_reply_to_bot(self, message: Message) -> bool:
if not self._bot or not getattr(message, "reply_to_message", None):
return False
reply_user = getattr(message.reply_to_message, "from_user", None)
return bool(reply_user and getattr(reply_user, "id", None) == getattr(self._bot, "id", None))
def _message_mentions_bot(self, message: Message) -> bool:
if not self._bot:
return False
bot_username = (getattr(self._bot, "username", None) or "").lstrip("@").lower()
bot_id = getattr(self._bot, "id", None)
def _iter_sources():
yield getattr(message, "text", None) or "", getattr(message, "entities", None) or []
yield getattr(message, "caption", None) or "", getattr(message, "caption_entities", None) or []
for source_text, entities in _iter_sources():
if bot_username and f"@{bot_username}" in source_text.lower():
return True
for entity in entities:
entity_type = str(getattr(entity, "type", "")).split(".")[-1].lower()
if entity_type == "mention" and bot_username:
offset = int(getattr(entity, "offset", -1))
length = int(getattr(entity, "length", 0))
if offset < 0 or length <= 0:
continue
if source_text[offset:offset + length].strip().lower() == f"@{bot_username}":
return True
elif entity_type == "text_mention":
user = getattr(entity, "user", None)
if user and getattr(user, "id", None) == bot_id:
return True
return False
def _message_matches_mention_patterns(self, message: Message) -> bool:
if not self._mention_patterns:
return False
for candidate in (getattr(message, "text", None), getattr(message, "caption", None)):
if not candidate:
continue
for pattern in self._mention_patterns:
if pattern.search(candidate):
return True
return False
def _clean_bot_trigger_text(self, text: Optional[str]) -> Optional[str]:
if not text or not self._bot or not getattr(self._bot, "username", None):
return text
username = re.escape(self._bot.username)
cleaned = re.sub(rf"(?i)@{username}\b[,:\-]*\s*", "", text).strip()
return cleaned or text
def _should_process_message(self, message: Message, *, is_command: bool = False) -> bool:
"""Apply Telegram group trigger rules.
DMs remain unrestricted. Group/supergroup messages are accepted when:
- the chat is explicitly allowlisted in ``free_response_chats``
- ``require_mention`` is disabled
- the message is a command
- the message replies to the bot
- the bot is @mentioned
- the text/caption matches a configured regex wake-word pattern
"""
if not self._is_group_chat(message):
return True
if str(getattr(getattr(message, "chat", None), "id", "")) in self._telegram_free_response_chats():
return True
if not self._telegram_require_mention():
return True
if is_command:
return True
if self._is_reply_to_bot(message):
return True
if self._message_mentions_bot(message):
return True
return self._message_matches_mention_patterns(message)
async def _handle_text_message(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle incoming text messages.
@ -1323,14 +1539,19 @@ class TelegramAdapter(BasePlatformAdapter):
"""
if not update.message or not update.message.text:
return
if not self._should_process_message(update.message):
return
event = self._build_message_event(update.message, MessageType.TEXT)
event.text = self._clean_bot_trigger_text(event.text)
self._enqueue_text_event(event)
async def _handle_command(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle incoming command messages."""
if not update.message or not update.message.text:
return
if not self._should_process_message(update.message, is_command=True):
return
event = self._build_message_event(update.message, MessageType.COMMAND)
await self.handle_message(event)
@ -1339,6 +1560,8 @@ class TelegramAdapter(BasePlatformAdapter):
"""Handle incoming location/venue pin messages."""
if not update.message:
return
if not self._should_process_message(update.message):
return
msg = update.message
venue = getattr(msg, "venue", None)
@ -1482,6 +1705,8 @@ class TelegramAdapter(BasePlatformAdapter):
"""Handle incoming media messages, downloading images to local cache."""
if not update.message:
return
if not self._should_process_message(update.message):
return
msg = update.message
@ -1505,7 +1730,7 @@ class TelegramAdapter(BasePlatformAdapter):
# Add caption as text
if msg.caption:
event.text = msg.caption
event.text = self._clean_bot_trigger_text(msg.caption)
# Handle stickers: describe via vision tool with caching
if msg.sticker:
@ -1757,7 +1982,8 @@ class TelegramAdapter(BasePlatformAdapter):
recognized without a gateway restart.
"""
try:
config_path = _Path.home() / ".hermes" / "config.yaml"
from hermes_constants import get_hermes_home
config_path = get_hermes_home() / "config.yaml"
if not config_path.exists():
return

View File

@ -12,6 +12,7 @@ from __future__ import annotations
import asyncio
import ipaddress
import logging
import os
import socket
from typing import Iterable, Optional
@ -43,6 +44,14 @@ _DOH_PROVIDERS: list[dict] = [
_SEED_FALLBACK_IPS: list[str] = ["149.154.167.220"]
def _resolve_proxy_url() -> str | None:
for key in ("HTTPS_PROXY", "HTTP_PROXY", "ALL_PROXY", "https_proxy", "http_proxy", "all_proxy"):
value = (os.environ.get(key) or "").strip()
if value:
return value
return None
class TelegramFallbackTransport(httpx.AsyncBaseTransport):
"""Retry Telegram Bot API requests via fallback IPs while preserving TLS/SNI.
@ -54,6 +63,9 @@ class TelegramFallbackTransport(httpx.AsyncBaseTransport):
def __init__(self, fallback_ips: Iterable[str], **transport_kwargs):
self._fallback_ips = [ip for ip in dict.fromkeys(_normalize_fallback_ips(fallback_ips))]
proxy_url = _resolve_proxy_url()
if proxy_url and "proxy" not in transport_kwargs:
transport_kwargs["proxy"] = proxy_url
self._primary = httpx.AsyncHTTPTransport(**transport_kwargs)
self._fallbacks = {
ip: httpx.AsyncHTTPTransport(**transport_kwargs) for ip in self._fallback_ips
@ -123,6 +135,9 @@ def _normalize_fallback_ips(values: Iterable[str]) -> list[str]:
if addr.version != 4:
logger.warning("Ignoring non-IPv4 Telegram fallback IP: %s", raw)
continue
if addr.is_private or addr.is_loopback or addr.is_link_local or addr.is_unspecified:
logger.warning("Ignoring private/internal Telegram fallback IP: %s", raw)
continue
normalized.append(str(addr))
return normalized

View File

@ -27,6 +27,7 @@ import hashlib
import hmac
import json
import logging
import os
import re
import subprocess
import time
@ -53,6 +54,7 @@ logger = logging.getLogger(__name__)
DEFAULT_HOST = "0.0.0.0"
DEFAULT_PORT = 8644
_INSECURE_NO_AUTH = "INSECURE_NO_AUTH"
_DYNAMIC_ROUTES_FILENAME = "webhook_subscriptions.json"
def check_webhook_requirements() -> bool:
@ -68,7 +70,10 @@ class WebhookAdapter(BasePlatformAdapter):
self._host: str = config.extra.get("host", DEFAULT_HOST)
self._port: int = int(config.extra.get("port", DEFAULT_PORT))
self._global_secret: str = config.extra.get("secret", "")
self._routes: Dict[str, dict] = config.extra.get("routes", {})
self._static_routes: Dict[str, dict] = config.extra.get("routes", {})
self._dynamic_routes: Dict[str, dict] = {}
self._dynamic_routes_mtime: float = 0.0
self._routes: Dict[str, dict] = dict(self._static_routes)
self._runner = None
# Delivery info keyed by session chat_id — consumed by send()
@ -96,6 +101,9 @@ class WebhookAdapter(BasePlatformAdapter):
# ------------------------------------------------------------------
async def connect(self) -> bool:
# Load agent-created subscriptions before validating
self._reload_dynamic_routes()
# Validate routes at startup — secret is required per route
for name, route in self._routes.items():
secret = route.get("secret", self._global_secret)
@ -110,6 +118,17 @@ class WebhookAdapter(BasePlatformAdapter):
app.router.add_get("/health", self._handle_health)
app.router.add_post("/webhooks/{route_name}", self._handle_webhook)
# Port conflict detection — fail fast if port is already in use
import socket as _socket
try:
with _socket.socket(_socket.AF_INET, _socket.SOCK_STREAM) as _s:
_s.settimeout(1)
_s.connect(('127.0.0.1', self._port))
logger.error('[webhook] Port %d already in use. Set a different port in config.yaml: platforms.webhook.port', self._port)
return False
except (ConnectionRefusedError, OSError):
pass # port is free
self._runner = web.AppRunner(app)
await self._runner.setup()
site = web.TCPSite(self._runner, self._host, self._port)
@ -182,8 +201,46 @@ class WebhookAdapter(BasePlatformAdapter):
"""GET /health — simple health check."""
return web.json_response({"status": "ok", "platform": "webhook"})
def _reload_dynamic_routes(self) -> None:
"""Reload agent-created subscriptions from disk if the file changed."""
from pathlib import Path as _Path
hermes_home = _Path(
os.getenv("HERMES_HOME", str(_Path.home() / ".hermes"))
).expanduser()
subs_path = hermes_home / _DYNAMIC_ROUTES_FILENAME
if not subs_path.exists():
if self._dynamic_routes:
self._dynamic_routes = {}
self._routes = dict(self._static_routes)
logger.debug("[webhook] Dynamic subscriptions file removed, cleared dynamic routes")
return
try:
mtime = subs_path.stat().st_mtime
if mtime <= self._dynamic_routes_mtime:
return # No change
data = json.loads(subs_path.read_text(encoding="utf-8"))
if not isinstance(data, dict):
return
# Merge: static routes take precedence over dynamic ones
self._dynamic_routes = {
k: v for k, v in data.items()
if k not in self._static_routes
}
self._routes = {**self._dynamic_routes, **self._static_routes}
self._dynamic_routes_mtime = mtime
logger.info(
"[webhook] Reloaded %d dynamic route(s): %s",
len(self._dynamic_routes),
", ".join(self._dynamic_routes.keys()) or "(none)",
)
except Exception as e:
logger.warning("[webhook] Failed to reload dynamic routes: %s", e)
async def _handle_webhook(self, request: "web.Request") -> "web.Response":
"""POST /webhooks/{route_name} — receive and process a webhook event."""
# Hot-reload dynamic subscriptions on each request (mtime-gated, cheap)
self._reload_dynamic_routes()
route_name = request.match_info.get("route_name", "")
route_config = self._routes.get(route_name)

1338
gateway/platforms/wecom.py Normal file

File diff suppressed because it is too large Load Diff

View File

@ -16,9 +16,11 @@ with different backends via a bridge pattern.
"""
import asyncio
import json
import logging
import os
import platform
import re
import subprocess
_IS_WINDOWS = platform.system() == "Windows"
@ -26,6 +28,7 @@ from pathlib import Path
from typing import Dict, Optional, Any
from hermes_cli.config import get_hermes_home
from hermes_constants import get_hermes_dir
logger = logging.getLogger(__name__)
@ -134,13 +137,140 @@ class WhatsAppAdapter(BasePlatformAdapter):
)
self._session_path: Path = Path(config.extra.get(
"session_path",
get_hermes_home() / "whatsapp" / "session"
get_hermes_dir("platforms/whatsapp/session", "whatsapp/session")
))
self._reply_prefix: Optional[str] = config.extra.get("reply_prefix")
self._mention_patterns = self._compile_mention_patterns()
self._message_queue: asyncio.Queue = asyncio.Queue()
self._bridge_log_fh = None
self._bridge_log: Optional[Path] = None
self._poll_task: Optional[asyncio.Task] = None
self._http_session: Optional["aiohttp.ClientSession"] = None
self._session_lock_identity: Optional[str] = None
def _whatsapp_require_mention(self) -> bool:
configured = self.config.extra.get("require_mention")
if configured is not None:
if isinstance(configured, str):
return configured.lower() in ("true", "1", "yes", "on")
return bool(configured)
return os.getenv("WHATSAPP_REQUIRE_MENTION", "false").lower() in ("true", "1", "yes", "on")
def _whatsapp_free_response_chats(self) -> set[str]:
raw = self.config.extra.get("free_response_chats")
if raw is None:
raw = os.getenv("WHATSAPP_FREE_RESPONSE_CHATS", "")
if isinstance(raw, list):
return {str(part).strip() for part in raw if str(part).strip()}
return {part.strip() for part in str(raw).split(",") if part.strip()}
def _compile_mention_patterns(self):
patterns = self.config.extra.get("mention_patterns")
if patterns is None:
raw = os.getenv("WHATSAPP_MENTION_PATTERNS", "").strip()
if raw:
try:
patterns = json.loads(raw)
except Exception:
patterns = [part.strip() for part in raw.splitlines() if part.strip()]
if not patterns:
patterns = [part.strip() for part in raw.split(",") if part.strip()]
if patterns is None:
return []
if isinstance(patterns, str):
patterns = [patterns]
if not isinstance(patterns, list):
logger.warning("[%s] whatsapp mention_patterns must be a list or string; got %s", self.name, type(patterns).__name__)
return []
compiled = []
for pattern in patterns:
if not isinstance(pattern, str) or not pattern.strip():
continue
try:
compiled.append(re.compile(pattern, re.IGNORECASE))
except re.error as exc:
logger.warning("[%s] Invalid WhatsApp mention pattern %r: %s", self.name, pattern, exc)
if compiled:
logger.info("[%s] Loaded %d WhatsApp mention pattern(s)", self.name, len(compiled))
return compiled
@staticmethod
def _normalize_whatsapp_id(value: Optional[str]) -> str:
if not value:
return ""
normalized = str(value).strip()
if ":" in normalized and "@" in normalized:
normalized = normalized.replace(":", "@", 1)
return normalized
def _bot_ids_from_message(self, data: Dict[str, Any]) -> set[str]:
bot_ids = set()
for candidate in data.get("botIds") or []:
normalized = self._normalize_whatsapp_id(candidate)
if normalized:
bot_ids.add(normalized)
return bot_ids
def _message_is_reply_to_bot(self, data: Dict[str, Any]) -> bool:
quoted_participant = self._normalize_whatsapp_id(data.get("quotedParticipant"))
if not quoted_participant:
return False
return quoted_participant in self._bot_ids_from_message(data)
def _message_mentions_bot(self, data: Dict[str, Any]) -> bool:
bot_ids = self._bot_ids_from_message(data)
if not bot_ids:
return False
mentioned_ids = {
nid
for candidate in (data.get("mentionedIds") or [])
if (nid := self._normalize_whatsapp_id(candidate))
}
if mentioned_ids & bot_ids:
return True
body = str(data.get("body") or "")
lower_body = body.lower()
for bot_id in bot_ids:
bare_id = bot_id.split("@", 1)[0].lower()
if bare_id and (f"@{bare_id}" in lower_body or bare_id in lower_body):
return True
return False
def _message_matches_mention_patterns(self, data: Dict[str, Any]) -> bool:
if not self._mention_patterns:
return False
body = str(data.get("body") or "")
return any(pattern.search(body) for pattern in self._mention_patterns)
def _clean_bot_mention_text(self, text: str, data: Dict[str, Any]) -> str:
if not text:
return text
bot_ids = self._bot_ids_from_message(data)
cleaned = text
for bot_id in bot_ids:
bare_id = bot_id.split("@", 1)[0]
if bare_id:
cleaned = re.sub(rf"@{re.escape(bare_id)}\b[,:\-]*\s*", "", cleaned)
return cleaned.strip() or text
def _should_process_message(self, data: Dict[str, Any]) -> bool:
if not data.get("isGroup"):
return True
chat_id = str(data.get("chatId") or "")
if chat_id in self._whatsapp_free_response_chats():
return True
if not self._whatsapp_require_mention():
return True
body = str(data.get("body") or "").strip()
if body.startswith("/"):
return True
if self._message_is_reply_to_bot(data):
return True
if self._message_mentions_bot(data):
return True
return self._message_matches_mention_patterns(data)
async def connect(self) -> bool:
"""
@ -159,6 +289,29 @@ class WhatsAppAdapter(BasePlatformAdapter):
logger.info("[%s] Bridge found at %s", self.name, bridge_path)
# Acquire scoped lock to prevent duplicate sessions
try:
from gateway.status import acquire_scoped_lock
self._session_lock_identity = str(self._session_path)
acquired, existing = acquire_scoped_lock(
"whatsapp-session",
self._session_lock_identity,
metadata={"platform": self.platform.value},
)
if not acquired:
owner_pid = existing.get("pid") if isinstance(existing, dict) else None
message = (
"Another local Hermes gateway is already using this WhatsApp session"
+ (f" (PID {owner_pid})." if owner_pid else ".")
+ " Stop the other gateway before starting a second WhatsApp bridge."
)
logger.error("[%s] %s", self.name, message)
self._set_fatal_error("whatsapp_session_lock", message, retryable=False)
return False
except Exception as e:
logger.warning("[%s] Could not acquire session lock (non-fatal): %s", self.name, e)
# Auto-install npm dependencies if node_modules doesn't exist
bridge_dir = bridge_path.parent
if not (bridge_dir / "node_modules").exists():
@ -199,6 +352,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
print(f"[{self.name}] Using existing bridge (status: {bridge_status})")
self._mark_connected()
self._bridge_process = None # Not managed by us
self._http_session = aiohttp.ClientSession()
self._poll_task = asyncio.create_task(self._poll_messages())
return True
else:
@ -304,6 +458,9 @@ class WhatsAppAdapter(BasePlatformAdapter):
print(f"[{self.name}] Bridge log: {self._bridge_log}")
print(f"[{self.name}] If session expired, re-pair: hermes whatsapp")
# Create a persistent HTTP session for all bridge communication
self._http_session = aiohttp.ClientSession()
# Start message polling task
self._poll_task = asyncio.create_task(self._poll_messages())
@ -312,6 +469,12 @@ class WhatsAppAdapter(BasePlatformAdapter):
return True
except Exception as e:
if self._session_lock_identity:
try:
from gateway.status import release_scoped_lock
release_scoped_lock("whatsapp-session", self._session_lock_identity)
except Exception:
pass
logger.error("[%s] Failed to start bridge: %s", self.name, e, exc_info=True)
self._close_bridge_log()
return False
@ -369,10 +532,32 @@ class WhatsAppAdapter(BasePlatformAdapter):
else:
# Bridge was not started by us, don't kill it
print(f"[{self.name}] Disconnecting (external bridge left running)")
# Cancel the poll task explicitly
if self._poll_task and not self._poll_task.done():
self._poll_task.cancel()
try:
await self._poll_task
except (asyncio.CancelledError, Exception):
pass
self._poll_task = None
# Close the persistent HTTP session
if self._http_session and not self._http_session.closed:
await self._http_session.close()
self._http_session = None
if self._session_lock_identity:
try:
from gateway.status import release_scoped_lock
release_scoped_lock("whatsapp-session", self._session_lock_identity)
except Exception as e:
logger.warning("[%s] Error releasing WhatsApp session lock: %s", self.name, e, exc_info=True)
self._mark_disconnected()
self._bridge_process = None
self._close_bridge_log()
self._session_lock_identity = None
print(f"[{self.name}] Disconnected")
async def send(
@ -383,7 +568,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
metadata: Optional[Dict[str, Any]] = None
) -> SendResult:
"""Send a message via the WhatsApp bridge."""
if not self._running:
if not self._running or not self._http_session:
return SendResult(success=False, error="Not connected")
bridge_exit = await self._check_managed_bridge_exit()
if bridge_exit:
@ -391,36 +576,29 @@ class WhatsAppAdapter(BasePlatformAdapter):
try:
import aiohttp
payload = {
"chatId": chat_id,
"message": content,
}
if reply_to:
payload["replyTo"] = reply_to
async with aiohttp.ClientSession() as session:
payload = {
"chatId": chat_id,
"message": content,
}
if reply_to:
payload["replyTo"] = reply_to
async with session.post(
f"http://127.0.0.1:{self._bridge_port}/send",
json=payload,
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
data = await resp.json()
return SendResult(
success=True,
message_id=data.get("messageId"),
raw_response=data
)
else:
error = await resp.text()
return SendResult(success=False, error=error)
except ImportError:
return SendResult(
success=False,
error="aiohttp not installed. Run: pip install aiohttp"
)
async with self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/send",
json=payload,
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
data = await resp.json()
return SendResult(
success=True,
message_id=data.get("messageId"),
raw_response=data
)
else:
error = await resp.text()
return SendResult(success=False, error=error)
except Exception as e:
return SendResult(success=False, error=str(e))
@ -431,28 +609,27 @@ class WhatsAppAdapter(BasePlatformAdapter):
content: str,
) -> SendResult:
"""Edit a previously sent message via the WhatsApp bridge."""
if not self._running:
if not self._running or not self._http_session:
return SendResult(success=False, error="Not connected")
bridge_exit = await self._check_managed_bridge_exit()
if bridge_exit:
return SendResult(success=False, error=bridge_exit)
try:
import aiohttp
async with aiohttp.ClientSession() as session:
async with session.post(
f"http://127.0.0.1:{self._bridge_port}/edit",
json={
"chatId": chat_id,
"messageId": message_id,
"message": content,
},
timeout=aiohttp.ClientTimeout(total=15)
) as resp:
if resp.status == 200:
return SendResult(success=True, message_id=message_id)
else:
error = await resp.text()
return SendResult(success=False, error=error)
async with self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/edit",
json={
"chatId": chat_id,
"messageId": message_id,
"message": content,
},
timeout=aiohttp.ClientTimeout(total=15)
) as resp:
if resp.status == 200:
return SendResult(success=True, message_id=message_id)
else:
error = await resp.text()
return SendResult(success=False, error=error)
except Exception as e:
return SendResult(success=False, error=str(e))
@ -465,7 +642,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
file_name: Optional[str] = None,
) -> SendResult:
"""Send any media file via bridge /send-media endpoint."""
if not self._running:
if not self._running or not self._http_session:
return SendResult(success=False, error="Not connected")
bridge_exit = await self._check_managed_bridge_exit()
if bridge_exit:
@ -486,22 +663,21 @@ class WhatsAppAdapter(BasePlatformAdapter):
if file_name:
payload["fileName"] = file_name
async with aiohttp.ClientSession() as session:
async with session.post(
f"http://127.0.0.1:{self._bridge_port}/send-media",
json=payload,
timeout=aiohttp.ClientTimeout(total=120),
) as resp:
if resp.status == 200:
data = await resp.json()
return SendResult(
success=True,
message_id=data.get("messageId"),
raw_response=data,
)
else:
error = await resp.text()
return SendResult(success=False, error=error)
async with self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/send-media",
json=payload,
timeout=aiohttp.ClientTimeout(total=120),
) as resp:
if resp.status == 200:
data = await resp.json()
return SendResult(
success=True,
message_id=data.get("messageId"),
raw_response=data,
)
else:
error = await resp.text()
return SendResult(success=False, error=error)
except Exception as e:
return SendResult(success=False, error=str(e))
@ -526,6 +702,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
image_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
**kwargs,
) -> SendResult:
"""Send a local image file natively via bridge."""
return await self._send_media_to_bridge(chat_id, image_path, "image", caption)
@ -536,6 +713,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
video_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
**kwargs,
) -> SendResult:
"""Send a video natively via bridge — plays inline in WhatsApp."""
return await self._send_media_to_bridge(chat_id, video_path, "video", caption)
@ -547,6 +725,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
caption: Optional[str] = None,
file_name: Optional[str] = None,
reply_to: Optional[str] = None,
**kwargs,
) -> SendResult:
"""Send a document/file as a downloadable attachment via bridge."""
return await self._send_media_to_bridge(
@ -556,45 +735,43 @@ class WhatsAppAdapter(BasePlatformAdapter):
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""Send typing indicator via bridge."""
if not self._running:
if not self._running or not self._http_session:
return
if await self._check_managed_bridge_exit():
return
try:
import aiohttp
async with aiohttp.ClientSession() as session:
await session.post(
f"http://127.0.0.1:{self._bridge_port}/typing",
json={"chatId": chat_id},
timeout=aiohttp.ClientTimeout(total=5)
)
await self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/typing",
json={"chatId": chat_id},
timeout=aiohttp.ClientTimeout(total=5)
)
except Exception:
pass # Ignore typing indicator failures
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Get information about a WhatsApp chat."""
if not self._running:
if not self._running or not self._http_session:
return {"name": "Unknown", "type": "dm"}
if await self._check_managed_bridge_exit():
return {"name": chat_id, "type": "dm"}
try:
import aiohttp
async with aiohttp.ClientSession() as session:
async with session.get(
f"http://127.0.0.1:{self._bridge_port}/chat/{chat_id}",
timeout=aiohttp.ClientTimeout(total=10)
) as resp:
if resp.status == 200:
data = await resp.json()
return {
"name": data.get("name", chat_id),
"type": "group" if data.get("isGroup") else "dm",
"participants": data.get("participants", []),
}
async with self._http_session.get(
f"http://127.0.0.1:{self._bridge_port}/chat/{chat_id}",
timeout=aiohttp.ClientTimeout(total=10)
) as resp:
if resp.status == 200:
data = await resp.json()
return {
"name": data.get("name", chat_id),
"type": "group" if data.get("isGroup") else "dm",
"participants": data.get("participants", []),
}
except Exception as e:
logger.debug("Could not get WhatsApp chat info for %s: %s", chat_id, e)
@ -602,29 +779,26 @@ class WhatsAppAdapter(BasePlatformAdapter):
async def _poll_messages(self) -> None:
"""Poll the bridge for incoming messages."""
try:
import aiohttp
except ImportError:
print(f"[{self.name}] aiohttp not installed, message polling disabled")
return
import aiohttp
while self._running:
if not self._http_session:
break
bridge_exit = await self._check_managed_bridge_exit()
if bridge_exit:
print(f"[{self.name}] {bridge_exit}")
break
try:
async with aiohttp.ClientSession() as session:
async with session.get(
f"http://127.0.0.1:{self._bridge_port}/messages",
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
messages = await resp.json()
for msg_data in messages:
event = await self._build_message_event(msg_data)
if event:
await self.handle_message(event)
async with self._http_session.get(
f"http://127.0.0.1:{self._bridge_port}/messages",
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
messages = await resp.json()
for msg_data in messages:
event = await self._build_message_event(msg_data)
if event:
await self.handle_message(event)
except asyncio.CancelledError:
break
except Exception as e:
@ -640,6 +814,9 @@ class WhatsAppAdapter(BasePlatformAdapter):
async def _build_message_event(self, data: Dict[str, Any]) -> Optional[MessageEvent]:
"""Build a MessageEvent from bridge message data, downloading images to cache."""
try:
if not self._should_process_message(data):
return None
# Determine message type
msg_type = MessageType.TEXT
if data.get("hasMedia"):
@ -721,6 +898,8 @@ class WhatsAppAdapter(BasePlatformAdapter):
# the message text so the agent can read it inline.
# Cap at 100KB to match Telegram/Discord/Slack behaviour.
body = data.get("body", "")
if data.get("isGroup"):
body = self._clean_bot_mention_text(body, data)
MAX_TEXT_INJECT_BYTES = 100 * 1024
if msg_type == MessageType.DOCUMENT and cached_urls:
for doc_path in cached_urls:

File diff suppressed because it is too large Load Diff

View File

@ -364,6 +364,12 @@ class SessionEntry:
auto_reset_reason: Optional[str] = None # "idle" or "daily"
reset_had_activity: bool = False # whether the expired session had any messages
# Set by the background expiry watcher after it successfully flushes
# memories for this session. Persisted to sessions.json so the flag
# survives gateway restarts (the old in-memory _pre_flushed_sessions
# set was lost on restart, causing redundant re-flushes).
memory_flushed: bool = False
def to_dict(self) -> Dict[str, Any]:
result = {
"session_key": self.session_key,
@ -381,6 +387,7 @@ class SessionEntry:
"last_prompt_tokens": self.last_prompt_tokens,
"estimated_cost_usd": self.estimated_cost_usd,
"cost_status": self.cost_status,
"memory_flushed": self.memory_flushed,
}
if self.origin:
result["origin"] = self.origin.to_dict()
@ -416,6 +423,7 @@ class SessionEntry:
last_prompt_tokens=data.get("last_prompt_tokens", 0),
estimated_cost_usd=data.get("estimated_cost_usd", 0.0),
cost_status=data.get("cost_status", "unknown"),
memory_flushed=data.get("memory_flushed", False),
)
@ -479,9 +487,6 @@ class SessionStore:
self._loaded = False
self._lock = threading.Lock()
self._has_active_processes_fn = has_active_processes_fn
# on_auto_reset is deprecated — memory flush now runs proactively
# via the background session expiry watcher in GatewayRunner.
self._pre_flushed_sessions: set = set() # session_ids already flushed by watcher
# Initialize SQLite session database
self._db = None
@ -684,15 +689,12 @@ class SessionStore:
self._save()
return entry
else:
# Session is being auto-reset. The background expiry watcher
# should have already flushed memories proactively; discard
# the marker so it doesn't accumulate.
# Session is being auto-reset.
was_auto_reset = True
auto_reset_reason = reset_reason
# Track whether the expired session had any real conversation
reset_had_activity = entry.total_tokens > 0
db_end_session_id = entry.session_id
self._pre_flushed_sessions.discard(entry.session_id)
else:
was_auto_reset = False
auto_reset_reason = None
@ -736,71 +738,58 @@ class SessionStore:
except Exception as e:
print(f"[gateway] Warning: Failed to create SQLite session: {e}")
# Seed new DM thread sessions with parent DM session history.
# When a bot reply creates a Slack thread and the user responds in it,
# the thread gets a new session (keyed by thread_ts). Without seeding,
# the thread session starts with zero context — the user's original
# question and the bot's answer are invisible. Fix: copy the parent
# DM session's transcript into the new thread session so context carries
# over while still keeping threads isolated from each other.
if (
source.chat_type == "dm"
and source.thread_id
and entry.created_at == entry.updated_at # brand-new session
and not was_auto_reset
):
parent_source = SessionSource(
platform=source.platform,
chat_id=source.chat_id,
chat_type="dm",
user_id=source.user_id,
# no thread_id — this is the parent DM session
)
parent_key = self._generate_session_key(parent_source)
with self._lock:
parent_entry = self._entries.get(parent_key)
if parent_entry and parent_entry.session_id != entry.session_id:
try:
parent_history = self.load_transcript(parent_entry.session_id)
if parent_history:
self.rewrite_transcript(entry.session_id, parent_history)
logger.info(
"[Session] Seeded DM thread session %s with %d messages from parent %s",
entry.session_id, len(parent_history), parent_entry.session_id,
)
except Exception as e:
logger.warning("[Session] Failed to seed thread session: %s", e)
return entry
def update_session(
self,
session_key: str,
input_tokens: int = 0,
output_tokens: int = 0,
cache_read_tokens: int = 0,
cache_write_tokens: int = 0,
last_prompt_tokens: int = None,
model: str = None,
estimated_cost_usd: Optional[float] = None,
cost_status: Optional[str] = None,
cost_source: Optional[str] = None,
provider: Optional[str] = None,
base_url: Optional[str] = None,
) -> None:
"""Update a session's metadata after an interaction."""
db_session_id = None
"""Update lightweight session metadata after an interaction."""
with self._lock:
self._ensure_loaded_locked()
if session_key in self._entries:
entry = self._entries[session_key]
entry.updated_at = _now()
# Direct assignment — the gateway receives cumulative totals
# from the cached agent, not per-call deltas.
entry.input_tokens = input_tokens
entry.output_tokens = output_tokens
entry.cache_read_tokens = cache_read_tokens
entry.cache_write_tokens = cache_write_tokens
if last_prompt_tokens is not None:
entry.last_prompt_tokens = last_prompt_tokens
if estimated_cost_usd is not None:
entry.estimated_cost_usd = estimated_cost_usd
if cost_status:
entry.cost_status = cost_status
entry.total_tokens = (
entry.input_tokens
+ entry.output_tokens
+ entry.cache_read_tokens
+ entry.cache_write_tokens
)
self._save()
db_session_id = entry.session_id
if self._db and db_session_id:
try:
self._db.set_token_counts(
db_session_id,
input_tokens=input_tokens,
output_tokens=output_tokens,
cache_read_tokens=cache_read_tokens,
cache_write_tokens=cache_write_tokens,
estimated_cost_usd=estimated_cost_usd,
cost_status=cost_status,
cost_source=cost_source,
billing_provider=provider,
billing_base_url=base_url,
model=model,
absolute=True,
)
except Exception as e:
logger.debug("Session DB operation failed: %s", e)
def reset_session(self, session_key: str) -> Optional[SessionEntry]:
"""Force reset a session, creating a new session ID."""

View File

@ -174,12 +174,12 @@ class GatewayStreamConsumer:
self._already_sent = True
self._last_sent_text = text
else:
# Edit not supported by this adapter — stop streaming,
# let the normal send path handle the final response.
# Without this guard, adapters like Signal/Email would
# flood the chat with a new message every edit_interval.
# If an edit fails mid-stream (especially Telegram flood control),
# stop progressive edits and let the normal final send path deliver
# the complete answer instead of leaving the user with a partial.
logger.debug("Edit failed, disabling streaming for this adapter")
self._edit_supported = False
self._already_sent = False
else:
# Editing not supported — skip intermediate updates.
# The final response will be sent by the normal path.

11
hermes
View File

@ -1,12 +1,11 @@
#!/usr/bin/env python3
"""
Hermes Agent CLI Launcher
Hermes Agent CLI launcher.
This is a convenience wrapper to launch the Hermes CLI.
Usage: ./hermes [options]
This wrapper should behave like the installed `hermes` command, including
subcommands such as `gateway`, `cron`, and `doctor`.
"""
if __name__ == "__main__":
from cli import main
import fire
fire.Fire(main)
from hermes_cli.main import main
main()

View File

@ -11,5 +11,5 @@ Provides subcommands for:
- hermes cron - Manage cron jobs
"""
__version__ = "0.5.0"
__release_date__ = "2026.3.28"
__version__ = "0.7.0"
__release_date__ = "2026.4.3"

View File

@ -160,7 +160,7 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
id="alibaba",
name="Alibaba Cloud (DashScope)",
auth_type="api_key",
inference_base_url="https://coding-intl.dashscope.aliyuncs.com/v1",
inference_base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
api_key_env_vars=("DASHSCOPE_API_KEY",),
base_url_env_var="DASHSCOPE_BASE_URL",
),
@ -200,6 +200,10 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
id="opencode-go",
name="OpenCode Go",
auth_type="api_key",
# OpenCode Go mixes API surfaces by model:
# - GLM / Kimi use OpenAI-compatible chat completions under /v1
# - MiniMax models use Anthropic Messages under /v1/messages
# Keep the provider base at /v1 and select api_mode per-model.
inference_base_url="https://opencode.ai/zen/go/v1",
api_key_env_vars=("OPENCODE_GO_API_KEY",),
base_url_env_var="OPENCODE_GO_BASE_URL",
@ -545,7 +549,11 @@ def _load_auth_store(auth_file: Optional[Path] = None) -> Dict[str, Any]:
except Exception:
return {"version": AUTH_STORE_VERSION, "providers": {}}
if isinstance(raw, dict) and isinstance(raw.get("providers"), dict):
if isinstance(raw, dict) and (
isinstance(raw.get("providers"), dict)
or isinstance(raw.get("credential_pool"), dict)
):
raw.setdefault("providers", {})
return raw
# Migrate from PR's "systems" format if present
@ -613,6 +621,30 @@ def _save_provider_state(auth_store: Dict[str, Any], provider_id: str, state: Di
auth_store["active_provider"] = provider_id
def read_credential_pool(provider_id: Optional[str] = None) -> Dict[str, Any]:
"""Return the persisted credential pool, or one provider slice."""
auth_store = _load_auth_store()
pool = auth_store.get("credential_pool")
if not isinstance(pool, dict):
pool = {}
if provider_id is None:
return dict(pool)
provider_entries = pool.get(provider_id)
return list(provider_entries) if isinstance(provider_entries, list) else []
def write_credential_pool(provider_id: str, entries: List[Dict[str, Any]]) -> Path:
"""Persist one provider's credential pool under auth.json."""
with _auth_store_lock():
auth_store = _load_auth_store()
pool = auth_store.get("credential_pool")
if not isinstance(pool, dict):
pool = {}
auth_store["credential_pool"] = pool
pool[provider_id] = list(entries)
return _save_auth_store(auth_store)
def get_provider_auth_state(provider_id: str) -> Optional[Dict[str, Any]]:
"""Return persisted auth state for a provider, or None."""
auth_store = _load_auth_store()
@ -638,10 +670,25 @@ def clear_provider_auth(provider_id: Optional[str] = None) -> bool:
return False
providers = auth_store.get("providers", {})
if target not in providers:
return False
if not isinstance(providers, dict):
providers = {}
auth_store["providers"] = providers
del providers[target]
pool = auth_store.get("credential_pool")
if not isinstance(pool, dict):
pool = {}
auth_store["credential_pool"] = pool
cleared = False
if target in providers:
del providers[target]
cleared = True
if target in pool:
del pool[target]
cleared = True
if not cleared:
return False
if auth_store.get("active_provider") == target:
auth_store["active_provider"] = None
_save_auth_store(auth_store)
@ -696,6 +743,10 @@ def resolve_provider(
"hf": "huggingface", "hugging-face": "huggingface", "huggingface-hub": "huggingface",
"go": "opencode-go", "opencode-go-sub": "opencode-go",
"kilo": "kilocode", "kilo-code": "kilocode", "kilo-gateway": "kilocode",
# Local server aliases — route through the generic custom provider
"lmstudio": "custom", "lm-studio": "custom", "lm_studio": "custom",
"ollama": "custom", "vllm": "custom", "llamacpp": "custom",
"llama.cpp": "custom", "llama-cpp": "custom",
}
normalized = _PROVIDER_ALIASES.get(normalized, normalized)
@ -742,7 +793,12 @@ def resolve_provider(
if has_usable_secret(os.getenv(env_var, "")):
return pid
return "openrouter"
raise AuthError(
"No inference provider configured. Run 'hermes model' to choose a "
"provider and model, or set an API key (OPENROUTER_API_KEY, "
"OPENAI_API_KEY, etc.) in ~/.hermes/.env.",
code="no_provider_configured",
)
# =============================================================================
@ -889,15 +945,14 @@ def _save_codex_tokens(tokens: Dict[str, str], last_refresh: str = None) -> None
_save_auth_store(auth_store)
def _refresh_codex_auth_tokens(
tokens: Dict[str, str],
timeout_seconds: float,
) -> Dict[str, str]:
"""Refresh Codex access token using the refresh token.
Saves the new tokens to Hermes auth store automatically.
"""
refresh_token = tokens.get("refresh_token")
def refresh_codex_oauth_pure(
access_token: str,
refresh_token: str,
*,
timeout_seconds: float = 20.0,
) -> Dict[str, Any]:
"""Refresh Codex OAuth tokens without mutating Hermes auth state."""
del access_token # Access token is only used by callers to decide whether to refresh.
if not isinstance(refresh_token, str) or not refresh_token.strip():
raise AuthError(
"Codex auth is missing refresh_token. Run `hermes login` to re-authenticate.",
@ -952,8 +1007,8 @@ def _refresh_codex_auth_tokens(
relogin_required=True,
) from exc
access_token = refresh_payload.get("access_token")
if not isinstance(access_token, str) or not access_token.strip():
refreshed_access = refresh_payload.get("access_token")
if not isinstance(refreshed_access, str) or not refreshed_access.strip():
raise AuthError(
"Codex token refresh response was missing access_token.",
provider="openai-codex",
@ -961,11 +1016,33 @@ def _refresh_codex_auth_tokens(
relogin_required=True,
)
updated_tokens = dict(tokens)
updated_tokens["access_token"] = access_token.strip()
updated = {
"access_token": refreshed_access.strip(),
"refresh_token": refresh_token.strip(),
"last_refresh": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
}
next_refresh = refresh_payload.get("refresh_token")
if isinstance(next_refresh, str) and next_refresh.strip():
updated_tokens["refresh_token"] = next_refresh.strip()
updated["refresh_token"] = next_refresh.strip()
return updated
def _refresh_codex_auth_tokens(
tokens: Dict[str, str],
timeout_seconds: float,
) -> Dict[str, str]:
"""Refresh Codex access token using the refresh token.
Saves the new tokens to Hermes auth store automatically.
"""
refreshed = refresh_codex_oauth_pure(
str(tokens.get("access_token", "") or ""),
str(tokens.get("refresh_token", "") or ""),
timeout_seconds=timeout_seconds,
)
updated_tokens = dict(tokens)
updated_tokens["access_token"] = refreshed["access_token"]
updated_tokens["refresh_token"] = refreshed["refresh_token"]
_save_codex_tokens(updated_tokens)
return updated_tokens
@ -1304,6 +1381,205 @@ def _agent_key_is_usable(state: Dict[str, Any], min_ttl_seconds: int) -> bool:
return not _is_expiring(state.get("agent_key_expires_at"), min_ttl_seconds)
def resolve_nous_access_token(
*,
timeout_seconds: float = 15.0,
insecure: Optional[bool] = None,
ca_bundle: Optional[str] = None,
refresh_skew_seconds: int = ACCESS_TOKEN_REFRESH_SKEW_SECONDS,
) -> str:
"""Resolve a refresh-aware Nous Portal access token for managed tool gateways."""
with _auth_store_lock():
auth_store = _load_auth_store()
state = _load_provider_state(auth_store, "nous")
if not state:
raise AuthError(
"Hermes is not logged into Nous Portal.",
provider="nous",
relogin_required=True,
)
portal_base_url = (
_optional_base_url(state.get("portal_base_url"))
or os.getenv("HERMES_PORTAL_BASE_URL")
or os.getenv("NOUS_PORTAL_BASE_URL")
or DEFAULT_NOUS_PORTAL_URL
).rstrip("/")
client_id = str(state.get("client_id") or DEFAULT_NOUS_CLIENT_ID)
verify = _resolve_verify(insecure=insecure, ca_bundle=ca_bundle, auth_state=state)
access_token = state.get("access_token")
refresh_token = state.get("refresh_token")
if not isinstance(access_token, str) or not access_token:
raise AuthError(
"No access token found for Nous Portal login.",
provider="nous",
relogin_required=True,
)
if not _is_expiring(state.get("expires_at"), refresh_skew_seconds):
return access_token
if not isinstance(refresh_token, str) or not refresh_token:
raise AuthError(
"Session expired and no refresh token is available.",
provider="nous",
relogin_required=True,
)
timeout = httpx.Timeout(timeout_seconds if timeout_seconds else 15.0)
with httpx.Client(
timeout=timeout,
headers={"Accept": "application/json"},
verify=verify,
) as client:
refreshed = _refresh_access_token(
client=client,
portal_base_url=portal_base_url,
client_id=client_id,
refresh_token=refresh_token,
)
now = datetime.now(timezone.utc)
access_ttl = _coerce_ttl_seconds(refreshed.get("expires_in"))
state["access_token"] = refreshed["access_token"]
state["refresh_token"] = refreshed.get("refresh_token") or refresh_token
state["token_type"] = refreshed.get("token_type") or state.get("token_type") or "Bearer"
state["scope"] = refreshed.get("scope") or state.get("scope")
state["obtained_at"] = now.isoformat()
state["expires_in"] = access_ttl
state["expires_at"] = datetime.fromtimestamp(
now.timestamp() + access_ttl,
tz=timezone.utc,
).isoformat()
state["portal_base_url"] = portal_base_url
state["client_id"] = client_id
state["tls"] = {
"insecure": verify is False,
"ca_bundle": verify if isinstance(verify, str) else None,
}
_save_provider_state(auth_store, "nous", state)
_save_auth_store(auth_store)
return state["access_token"]
def refresh_nous_oauth_pure(
access_token: str,
refresh_token: str,
client_id: str,
portal_base_url: str,
inference_base_url: str,
*,
token_type: str = "Bearer",
scope: str = DEFAULT_NOUS_SCOPE,
obtained_at: Optional[str] = None,
expires_at: Optional[str] = None,
agent_key: Optional[str] = None,
agent_key_expires_at: Optional[str] = None,
min_key_ttl_seconds: int = DEFAULT_AGENT_KEY_MIN_TTL_SECONDS,
timeout_seconds: float = 15.0,
insecure: Optional[bool] = None,
ca_bundle: Optional[str] = None,
force_refresh: bool = False,
force_mint: bool = False,
) -> Dict[str, Any]:
"""Refresh Nous OAuth state without mutating auth.json."""
state: Dict[str, Any] = {
"access_token": access_token,
"refresh_token": refresh_token,
"client_id": client_id or DEFAULT_NOUS_CLIENT_ID,
"portal_base_url": (portal_base_url or DEFAULT_NOUS_PORTAL_URL).rstrip("/"),
"inference_base_url": (inference_base_url or DEFAULT_NOUS_INFERENCE_URL).rstrip("/"),
"token_type": token_type or "Bearer",
"scope": scope or DEFAULT_NOUS_SCOPE,
"obtained_at": obtained_at,
"expires_at": expires_at,
"agent_key": agent_key,
"agent_key_expires_at": agent_key_expires_at,
"tls": {
"insecure": bool(insecure),
"ca_bundle": ca_bundle,
},
}
verify = _resolve_verify(insecure=insecure, ca_bundle=ca_bundle, auth_state=state)
timeout = httpx.Timeout(timeout_seconds if timeout_seconds else 15.0)
with httpx.Client(timeout=timeout, headers={"Accept": "application/json"}, verify=verify) as client:
if force_refresh or _is_expiring(state.get("expires_at"), ACCESS_TOKEN_REFRESH_SKEW_SECONDS):
refreshed = _refresh_access_token(
client=client,
portal_base_url=state["portal_base_url"],
client_id=state["client_id"],
refresh_token=state["refresh_token"],
)
now = datetime.now(timezone.utc)
access_ttl = _coerce_ttl_seconds(refreshed.get("expires_in"))
state["access_token"] = refreshed["access_token"]
state["refresh_token"] = refreshed.get("refresh_token") or state["refresh_token"]
state["token_type"] = refreshed.get("token_type") or state.get("token_type") or "Bearer"
state["scope"] = refreshed.get("scope") or state.get("scope")
refreshed_url = _optional_base_url(refreshed.get("inference_base_url"))
if refreshed_url:
state["inference_base_url"] = refreshed_url
state["obtained_at"] = now.isoformat()
state["expires_in"] = access_ttl
state["expires_at"] = datetime.fromtimestamp(
now.timestamp() + access_ttl, tz=timezone.utc
).isoformat()
if force_mint or not _agent_key_is_usable(state, max(60, int(min_key_ttl_seconds))):
mint_payload = _mint_agent_key(
client=client,
portal_base_url=state["portal_base_url"],
access_token=state["access_token"],
min_ttl_seconds=min_key_ttl_seconds,
)
now = datetime.now(timezone.utc)
state["agent_key"] = mint_payload.get("api_key")
state["agent_key_id"] = mint_payload.get("key_id")
state["agent_key_expires_at"] = mint_payload.get("expires_at")
state["agent_key_expires_in"] = mint_payload.get("expires_in")
state["agent_key_reused"] = bool(mint_payload.get("reused", False))
state["agent_key_obtained_at"] = now.isoformat()
minted_url = _optional_base_url(mint_payload.get("inference_base_url"))
if minted_url:
state["inference_base_url"] = minted_url
return state
def refresh_nous_oauth_from_state(
state: Dict[str, Any],
*,
min_key_ttl_seconds: int = DEFAULT_AGENT_KEY_MIN_TTL_SECONDS,
timeout_seconds: float = 15.0,
force_refresh: bool = False,
force_mint: bool = False,
) -> Dict[str, Any]:
"""Refresh Nous OAuth from a state dict. Thin wrapper around refresh_nous_oauth_pure."""
tls = state.get("tls") or {}
return refresh_nous_oauth_pure(
state.get("access_token", ""),
state.get("refresh_token", ""),
state.get("client_id", "hermes-cli"),
state.get("portal_base_url", DEFAULT_NOUS_PORTAL_URL),
state.get("inference_base_url", DEFAULT_NOUS_INFERENCE_URL),
token_type=state.get("token_type", "Bearer"),
scope=state.get("scope", DEFAULT_NOUS_SCOPE),
obtained_at=state.get("obtained_at"),
expires_at=state.get("expires_at"),
agent_key=state.get("agent_key"),
agent_key_expires_at=state.get("agent_key_expires_at"),
min_key_ttl_seconds=min_key_ttl_seconds,
timeout_seconds=timeout_seconds,
insecure=tls.get("insecure"),
ca_bundle=tls.get("ca_bundle"),
force_refresh=force_refresh,
force_mint=force_mint,
)
def resolve_nous_runtime_credentials(
*,
min_key_ttl_seconds: int = DEFAULT_AGENT_KEY_MIN_TTL_SECONDS,
@ -2021,7 +2297,8 @@ def _login_openai_codex(args, pconfig: ProviderConfig) -> None:
config_path = _update_config_for_provider("openai-codex", creds.get("base_url", DEFAULT_CODEX_BASE_URL))
print()
print("Login successful!")
print(" Auth state: ~/.hermes/auth.json")
from hermes_constants import display_hermes_home as _dhh
print(f" Auth state: {_dhh()}/auth.json")
print(f" Config updated: {config_path} (model.provider=openai-codex)")
@ -2170,34 +2447,36 @@ def _codex_device_code_login() -> Dict[str, Any]:
}
def _login_nous(args, pconfig: ProviderConfig) -> None:
"""Nous Portal device authorization flow."""
def _nous_device_code_login(
*,
portal_base_url: Optional[str] = None,
inference_base_url: Optional[str] = None,
client_id: Optional[str] = None,
scope: Optional[str] = None,
open_browser: bool = True,
timeout_seconds: float = 15.0,
insecure: bool = False,
ca_bundle: Optional[str] = None,
min_key_ttl_seconds: int = 5 * 60,
) -> Dict[str, Any]:
"""Run the Nous device-code flow and return full OAuth state without persisting."""
pconfig = PROVIDER_REGISTRY["nous"]
portal_base_url = (
getattr(args, "portal_url", None)
portal_base_url
or os.getenv("HERMES_PORTAL_BASE_URL")
or os.getenv("NOUS_PORTAL_BASE_URL")
or pconfig.portal_base_url
).rstrip("/")
requested_inference_url = (
getattr(args, "inference_url", None)
inference_base_url
or os.getenv("NOUS_INFERENCE_BASE_URL")
or pconfig.inference_base_url
).rstrip("/")
client_id = getattr(args, "client_id", None) or pconfig.client_id
scope = getattr(args, "scope", None) or pconfig.scope
open_browser = not getattr(args, "no_browser", False)
timeout_seconds = getattr(args, "timeout", None) or 15.0
client_id = client_id or pconfig.client_id
scope = scope or pconfig.scope
timeout = httpx.Timeout(timeout_seconds)
insecure = bool(getattr(args, "insecure", False))
ca_bundle = (
getattr(args, "ca_bundle", None)
or os.getenv("HERMES_CA_BUNDLE")
or os.getenv("SSL_CERT_FILE")
)
verify: bool | str = False if insecure else (ca_bundle if ca_bundle else True)
# Skip browser open in SSH sessions
if _is_remote_session():
open_browser = False
@ -2208,74 +2487,109 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
elif ca_bundle:
print(f"TLS verification: custom CA bundle ({ca_bundle})")
try:
with httpx.Client(timeout=timeout, headers={"Accept": "application/json"}, verify=verify) as client:
device_data = _request_device_code(
client=client, portal_base_url=portal_base_url,
client_id=client_id, scope=scope,
)
verification_url = str(device_data["verification_uri_complete"])
user_code = str(device_data["user_code"])
expires_in = int(device_data["expires_in"])
interval = int(device_data["interval"])
print()
print("To continue:")
print(f" 1. Open: {verification_url}")
print(f" 2. If prompted, enter code: {user_code}")
if open_browser:
opened = webbrowser.open(verification_url)
if opened:
print(" (Opened browser for verification)")
else:
print(" Could not open browser automatically — use the URL above.")
effective_interval = max(1, min(interval, DEVICE_AUTH_POLL_INTERVAL_CAP_SECONDS))
print(f"Waiting for approval (polling every {effective_interval}s)...")
token_data = _poll_for_token(
client=client, portal_base_url=portal_base_url,
client_id=client_id, device_code=str(device_data["device_code"]),
expires_in=expires_in, poll_interval=interval,
)
# Process token response
now = datetime.now(timezone.utc)
token_expires_in = _coerce_ttl_seconds(token_data.get("expires_in", 0))
expires_at = now.timestamp() + token_expires_in
inference_base_url = (
_optional_base_url(token_data.get("inference_base_url"))
or requested_inference_url
with httpx.Client(timeout=timeout, headers={"Accept": "application/json"}, verify=verify) as client:
device_data = _request_device_code(
client=client,
portal_base_url=portal_base_url,
client_id=client_id,
scope=scope,
)
if inference_base_url != requested_inference_url:
print(f"Using portal-provided inference URL: {inference_base_url}")
auth_state = {
"portal_base_url": portal_base_url,
"inference_base_url": inference_base_url,
"client_id": client_id,
"scope": token_data.get("scope") or scope,
"token_type": token_data.get("token_type", "Bearer"),
"access_token": token_data["access_token"],
"refresh_token": token_data.get("refresh_token"),
"obtained_at": now.isoformat(),
"expires_at": datetime.fromtimestamp(expires_at, tz=timezone.utc).isoformat(),
"expires_in": token_expires_in,
"tls": {
"insecure": verify is False,
"ca_bundle": verify if isinstance(verify, str) else None,
},
"agent_key": None,
"agent_key_id": None,
"agent_key_expires_at": None,
"agent_key_expires_in": None,
"agent_key_reused": None,
"agent_key_obtained_at": None,
}
verification_url = str(device_data["verification_uri_complete"])
user_code = str(device_data["user_code"])
expires_in = int(device_data["expires_in"])
interval = int(device_data["interval"])
print()
print("To continue:")
print(f" 1. Open: {verification_url}")
print(f" 2. If prompted, enter code: {user_code}")
if open_browser:
opened = webbrowser.open(verification_url)
if opened:
print(" (Opened browser for verification)")
else:
print(" Could not open browser automatically — use the URL above.")
effective_interval = max(1, min(interval, DEVICE_AUTH_POLL_INTERVAL_CAP_SECONDS))
print(f"Waiting for approval (polling every {effective_interval}s)...")
token_data = _poll_for_token(
client=client,
portal_base_url=portal_base_url,
client_id=client_id,
device_code=str(device_data["device_code"]),
expires_in=expires_in,
poll_interval=interval,
)
now = datetime.now(timezone.utc)
token_expires_in = _coerce_ttl_seconds(token_data.get("expires_in", 0))
expires_at = now.timestamp() + token_expires_in
resolved_inference_url = (
_optional_base_url(token_data.get("inference_base_url"))
or requested_inference_url
)
if resolved_inference_url != requested_inference_url:
print(f"Using portal-provided inference URL: {resolved_inference_url}")
auth_state = {
"portal_base_url": portal_base_url,
"inference_base_url": resolved_inference_url,
"client_id": client_id,
"scope": token_data.get("scope") or scope,
"token_type": token_data.get("token_type", "Bearer"),
"access_token": token_data["access_token"],
"refresh_token": token_data.get("refresh_token"),
"obtained_at": now.isoformat(),
"expires_at": datetime.fromtimestamp(expires_at, tz=timezone.utc).isoformat(),
"expires_in": token_expires_in,
"tls": {
"insecure": verify is False,
"ca_bundle": verify if isinstance(verify, str) else None,
},
"agent_key": None,
"agent_key_id": None,
"agent_key_expires_at": None,
"agent_key_expires_in": None,
"agent_key_reused": None,
"agent_key_obtained_at": None,
}
return refresh_nous_oauth_from_state(
auth_state,
min_key_ttl_seconds=min_key_ttl_seconds,
timeout_seconds=timeout_seconds,
force_refresh=False,
force_mint=True,
)
def _login_nous(args, pconfig: ProviderConfig) -> None:
"""Nous Portal device authorization flow."""
timeout_seconds = getattr(args, "timeout", None) or 15.0
insecure = bool(getattr(args, "insecure", False))
ca_bundle = (
getattr(args, "ca_bundle", None)
or os.getenv("HERMES_CA_BUNDLE")
or os.getenv("SSL_CERT_FILE")
)
try:
auth_state = _nous_device_code_login(
portal_base_url=getattr(args, "portal_url", None) or pconfig.portal_base_url,
inference_base_url=getattr(args, "inference_url", None) or pconfig.inference_base_url,
client_id=getattr(args, "client_id", None) or pconfig.client_id,
scope=getattr(args, "scope", None) or pconfig.scope,
open_browser=not getattr(args, "no_browser", False),
timeout_seconds=timeout_seconds,
insecure=insecure,
ca_bundle=ca_bundle,
min_key_ttl_seconds=5 * 60,
)
inference_base_url = auth_state["inference_base_url"]
verify: bool | str = False if insecure else (ca_bundle if ca_bundle else True)
# Save auth state
with _auth_store_lock():
auth_store = _load_auth_store()
_save_provider_state(auth_store, "nous", auth_state)
@ -2287,34 +2601,29 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
print(f" Auth state: {saved_to}")
print(f" Config updated: {config_path} (model.provider=nous)")
# Mint an initial agent key and list available models
try:
runtime_creds = resolve_nous_runtime_credentials(
min_key_ttl_seconds=5 * 60,
timeout_seconds=timeout_seconds,
insecure=insecure, ca_bundle=ca_bundle,
)
runtime_key = runtime_creds.get("api_key")
runtime_base_url = runtime_creds.get("base_url") or inference_base_url
runtime_key = auth_state.get("agent_key") or auth_state.get("access_token")
if not isinstance(runtime_key, str) or not runtime_key:
raise AuthError("No runtime API key available to fetch models",
provider="nous", code="invalid_token")
raise AuthError(
"No runtime API key available to fetch models",
provider="nous",
code="invalid_token",
)
model_ids = fetch_nous_models(
inference_base_url=runtime_base_url,
api_key=runtime_key,
timeout_seconds=timeout_seconds,
verify=verify,
)
# Use curated model list (same as OpenRouter defaults) instead
# of the full /models dump which returns hundreds of models.
from hermes_cli.models import _PROVIDER_MODELS
model_ids = _PROVIDER_MODELS.get("nous", [])
print()
if model_ids:
print(f"Showing {len(model_ids)} curated models — use \"Enter custom model name\" for others.")
selected_model = _prompt_model_selection(model_ids)
if selected_model:
_save_model_choice(selected_model)
print(f"Default model set to: {selected_model}")
else:
print("No models were returned by the inference API.")
print("No curated models available for Nous Portal.")
except Exception as exc:
message = format_auth_error(exc) if isinstance(exc, AuthError) else str(exc)
print()

470
hermes_cli/auth_commands.py Normal file
View File

@ -0,0 +1,470 @@
"""Credential-pool auth subcommands."""
from __future__ import annotations
from getpass import getpass
import math
import time
from types import SimpleNamespace
import uuid
from agent.credential_pool import (
AUTH_TYPE_API_KEY,
AUTH_TYPE_OAUTH,
CUSTOM_POOL_PREFIX,
SOURCE_MANUAL,
STATUS_EXHAUSTED,
STRATEGY_FILL_FIRST,
STRATEGY_ROUND_ROBIN,
STRATEGY_RANDOM,
STRATEGY_LEAST_USED,
SUPPORTED_POOL_STRATEGIES,
PooledCredential,
_normalize_custom_pool_name,
get_pool_strategy,
label_from_token,
list_custom_pool_providers,
load_pool,
_exhausted_ttl,
)
import hermes_cli.auth as auth_mod
from hermes_cli.auth import PROVIDER_REGISTRY
from hermes_constants import OPENROUTER_BASE_URL
# Providers that support OAuth login in addition to API keys.
_OAUTH_CAPABLE_PROVIDERS = {"anthropic", "nous", "openai-codex"}
def _get_custom_provider_names() -> list:
"""Return list of (display_name, pool_key) tuples for custom_providers in config."""
try:
from hermes_cli.config import load_config
config = load_config()
except Exception:
return []
custom_providers = config.get("custom_providers")
if not isinstance(custom_providers, list):
return []
result = []
for entry in custom_providers:
if not isinstance(entry, dict):
continue
name = entry.get("name")
if not isinstance(name, str) or not name.strip():
continue
pool_key = f"{CUSTOM_POOL_PREFIX}{_normalize_custom_pool_name(name)}"
result.append((name.strip(), pool_key))
return result
def _resolve_custom_provider_input(raw: str) -> str | None:
"""If raw input matches a custom_providers entry name (case-insensitive), return its pool key."""
normalized = (raw or "").strip().lower().replace(" ", "-")
if not normalized:
return None
# Direct match on 'custom:name' format
if normalized.startswith(CUSTOM_POOL_PREFIX):
return normalized
for display_name, pool_key in _get_custom_provider_names():
if _normalize_custom_pool_name(display_name) == normalized:
return pool_key
return None
def _normalize_provider(provider: str) -> str:
normalized = (provider or "").strip().lower()
if normalized in {"or", "open-router"}:
return "openrouter"
# Check if it matches a custom provider name
custom_key = _resolve_custom_provider_input(normalized)
if custom_key:
return custom_key
return normalized
def _provider_base_url(provider: str) -> str:
if provider == "openrouter":
return OPENROUTER_BASE_URL
if provider.startswith(CUSTOM_POOL_PREFIX):
from agent.credential_pool import _get_custom_provider_config
cp_config = _get_custom_provider_config(provider)
if cp_config:
return str(cp_config.get("base_url") or "").strip()
return ""
pconfig = PROVIDER_REGISTRY.get(provider)
return pconfig.inference_base_url if pconfig else ""
def _oauth_default_label(provider: str, count: int) -> str:
return f"{provider}-oauth-{count}"
def _api_key_default_label(count: int) -> str:
return f"api-key-{count}"
def _display_source(source: str) -> str:
return source.split(":", 1)[1] if source.startswith("manual:") else source
def _format_exhausted_status(entry) -> str:
if entry.last_status != STATUS_EXHAUSTED:
return ""
code = f" ({entry.last_error_code})" if entry.last_error_code else ""
if not entry.last_status_at:
return f" exhausted{code}"
remaining = max(0, int(math.ceil((entry.last_status_at + _exhausted_ttl(entry.last_error_code)) - time.time())))
if remaining <= 0:
return f" exhausted{code} (ready to retry)"
minutes, seconds = divmod(remaining, 60)
hours, minutes = divmod(minutes, 60)
if hours:
wait = f"{hours}h {minutes}m"
elif minutes:
wait = f"{minutes}m {seconds}s"
else:
wait = f"{seconds}s"
return f" exhausted{code} ({wait} left)"
def auth_add_command(args) -> None:
provider = _normalize_provider(getattr(args, "provider", ""))
if provider not in PROVIDER_REGISTRY and provider != "openrouter" and not provider.startswith(CUSTOM_POOL_PREFIX):
raise SystemExit(f"Unknown provider: {provider}")
requested_type = str(getattr(args, "auth_type", "") or "").strip().lower()
if requested_type in {AUTH_TYPE_API_KEY, "api-key"}:
requested_type = AUTH_TYPE_API_KEY
if not requested_type:
if provider.startswith(CUSTOM_POOL_PREFIX):
requested_type = AUTH_TYPE_API_KEY
else:
requested_type = AUTH_TYPE_OAUTH if provider in {"anthropic", "nous", "openai-codex"} else AUTH_TYPE_API_KEY
pool = load_pool(provider)
if requested_type == AUTH_TYPE_API_KEY:
token = (getattr(args, "api_key", None) or "").strip()
if not token:
token = getpass("Paste your API key: ").strip()
if not token:
raise SystemExit("No API key provided.")
default_label = _api_key_default_label(len(pool.entries()) + 1)
label = (getattr(args, "label", None) or "").strip()
if not label:
label = input(f"Label (optional, default: {default_label}): ").strip() or default_label
entry = PooledCredential(
provider=provider,
id=uuid.uuid4().hex[:6],
label=label,
auth_type=AUTH_TYPE_API_KEY,
priority=0,
source=SOURCE_MANUAL,
access_token=token,
base_url=_provider_base_url(provider),
)
pool.add_entry(entry)
print(f'Added {provider} credential #{len(pool.entries())}: "{label}"')
return
if provider == "anthropic":
from agent import anthropic_adapter as anthropic_mod
creds = anthropic_mod.run_hermes_oauth_login_pure()
if not creds:
raise SystemExit("Anthropic OAuth login did not return credentials.")
label = (getattr(args, "label", None) or "").strip() or label_from_token(
creds["access_token"],
_oauth_default_label(provider, len(pool.entries()) + 1),
)
entry = PooledCredential(
provider=provider,
id=uuid.uuid4().hex[:6],
label=label,
auth_type=AUTH_TYPE_OAUTH,
priority=0,
source=f"{SOURCE_MANUAL}:hermes_pkce",
access_token=creds["access_token"],
refresh_token=creds.get("refresh_token"),
expires_at_ms=creds.get("expires_at_ms"),
base_url=_provider_base_url(provider),
)
pool.add_entry(entry)
print(f'Added {provider} OAuth credential #{len(pool.entries())}: "{entry.label}"')
return
if provider == "nous":
creds = auth_mod._nous_device_code_login(
portal_base_url=getattr(args, "portal_url", None),
inference_base_url=getattr(args, "inference_url", None),
client_id=getattr(args, "client_id", None),
scope=getattr(args, "scope", None),
open_browser=not getattr(args, "no_browser", False),
timeout_seconds=getattr(args, "timeout", None) or 15.0,
insecure=bool(getattr(args, "insecure", False)),
ca_bundle=getattr(args, "ca_bundle", None),
min_key_ttl_seconds=max(60, int(getattr(args, "min_key_ttl_seconds", 5 * 60))),
)
label = (getattr(args, "label", None) or "").strip() or label_from_token(
creds.get("access_token", ""),
_oauth_default_label(provider, len(pool.entries()) + 1),
)
entry = PooledCredential.from_dict(provider, {
**creds,
"label": label,
"auth_type": AUTH_TYPE_OAUTH,
"source": f"{SOURCE_MANUAL}:device_code",
"base_url": creds.get("inference_base_url"),
})
pool.add_entry(entry)
print(f'Added {provider} OAuth credential #{len(pool.entries())}: "{entry.label}"')
return
if provider == "openai-codex":
creds = auth_mod._codex_device_code_login()
label = (getattr(args, "label", None) or "").strip() or label_from_token(
creds["tokens"]["access_token"],
_oauth_default_label(provider, len(pool.entries()) + 1),
)
entry = PooledCredential(
provider=provider,
id=uuid.uuid4().hex[:6],
label=label,
auth_type=AUTH_TYPE_OAUTH,
priority=0,
source=f"{SOURCE_MANUAL}:device_code",
access_token=creds["tokens"]["access_token"],
refresh_token=creds["tokens"].get("refresh_token"),
base_url=creds.get("base_url"),
last_refresh=creds.get("last_refresh"),
)
pool.add_entry(entry)
print(f'Added {provider} OAuth credential #{len(pool.entries())}: "{entry.label}"')
return
raise SystemExit(f"`hermes auth add {provider}` is not implemented for auth type {requested_type} yet.")
def auth_list_command(args) -> None:
provider_filter = _normalize_provider(getattr(args, "provider", "") or "")
if provider_filter:
providers = [provider_filter]
else:
providers = sorted({
*PROVIDER_REGISTRY.keys(),
"openrouter",
*list_custom_pool_providers(),
})
for provider in providers:
pool = load_pool(provider)
entries = pool.entries()
if not entries:
continue
current = pool.peek()
print(f"{provider} ({len(entries)} credentials):")
for idx, entry in enumerate(entries, start=1):
marker = " "
if current is not None and entry.id == current.id:
marker = ""
status = _format_exhausted_status(entry)
source = _display_source(entry.source)
print(f" #{idx} {entry.label:<20} {entry.auth_type:<7} {source}{status} {marker}".rstrip())
print()
def auth_remove_command(args) -> None:
provider = _normalize_provider(getattr(args, "provider", ""))
index = int(getattr(args, "index"))
pool = load_pool(provider)
removed = pool.remove_index(index)
if removed is None:
raise SystemExit(f"No credential #{index} for provider {provider}.")
print(f"Removed {provider} credential #{index} ({removed.label})")
def auth_reset_command(args) -> None:
provider = _normalize_provider(getattr(args, "provider", ""))
pool = load_pool(provider)
count = pool.reset_statuses()
print(f"Reset status on {count} {provider} credentials")
def _interactive_auth() -> None:
"""Interactive credential pool management when `hermes auth` is called bare."""
# Show current pool status first
print("Credential Pool Status")
print("=" * 50)
auth_list_command(SimpleNamespace(provider=None))
print()
# Main menu
choices = [
"Add a credential",
"Remove a credential",
"Reset cooldowns for a provider",
"Set rotation strategy for a provider",
"Exit",
]
print("What would you like to do?")
for i, choice in enumerate(choices, 1):
print(f" {i}. {choice}")
try:
raw = input("\nChoice: ").strip()
except (EOFError, KeyboardInterrupt):
return
if not raw or raw == str(len(choices)):
return
if raw == "1":
_interactive_add()
elif raw == "2":
_interactive_remove()
elif raw == "3":
_interactive_reset()
elif raw == "4":
_interactive_strategy()
def _pick_provider(prompt: str = "Provider") -> str:
"""Prompt for a provider name with auto-complete hints."""
known = sorted(set(list(PROVIDER_REGISTRY.keys()) + ["openrouter"]))
custom_names = _get_custom_provider_names()
if custom_names:
custom_display = [name for name, _key in custom_names]
print(f"\nKnown providers: {', '.join(known)}")
print(f"Custom endpoints: {', '.join(custom_display)}")
else:
print(f"\nKnown providers: {', '.join(known)}")
try:
raw = input(f"{prompt}: ").strip()
except (EOFError, KeyboardInterrupt):
raise SystemExit()
return _normalize_provider(raw)
def _interactive_add() -> None:
provider = _pick_provider("Provider to add credential for")
if provider not in PROVIDER_REGISTRY and provider != "openrouter" and not provider.startswith(CUSTOM_POOL_PREFIX):
raise SystemExit(f"Unknown provider: {provider}")
# For OAuth-capable providers, ask which type
if provider in _OAUTH_CAPABLE_PROVIDERS:
print(f"\n{provider} supports both API keys and OAuth login.")
print(" 1. API key (paste a key from the provider dashboard)")
print(" 2. OAuth login (authenticate via browser)")
try:
type_choice = input("Type [1/2]: ").strip()
except (EOFError, KeyboardInterrupt):
return
if type_choice == "2":
auth_type = "oauth"
else:
auth_type = "api_key"
else:
auth_type = "api_key"
auth_add_command(SimpleNamespace(
provider=provider, auth_type=auth_type, label=None, api_key=None,
portal_url=None, inference_url=None, client_id=None, scope=None,
no_browser=False, timeout=None, insecure=False, ca_bundle=None,
))
def _interactive_remove() -> None:
provider = _pick_provider("Provider to remove credential from")
pool = load_pool(provider)
if not pool.has_credentials():
print(f"No credentials for {provider}.")
return
# Show entries with indices
for i, e in enumerate(pool.entries(), 1):
exhausted = _format_exhausted_status(e)
print(f" #{i} {e.label:25s} {e.auth_type:10s} {e.source}{exhausted}")
try:
raw = input("Remove # (or blank to cancel): ").strip()
except (EOFError, KeyboardInterrupt):
return
if not raw:
return
try:
index = int(raw)
except ValueError:
print("Invalid number.")
return
auth_remove_command(SimpleNamespace(provider=provider, index=index))
def _interactive_reset() -> None:
provider = _pick_provider("Provider to reset cooldowns for")
auth_reset_command(SimpleNamespace(provider=provider))
def _interactive_strategy() -> None:
provider = _pick_provider("Provider to set strategy for")
current = get_pool_strategy(provider)
strategies = [STRATEGY_FILL_FIRST, STRATEGY_ROUND_ROBIN, STRATEGY_LEAST_USED, STRATEGY_RANDOM]
print(f"\nCurrent strategy for {provider}: {current}")
print()
descriptions = {
STRATEGY_FILL_FIRST: "Use first key until exhausted, then next",
STRATEGY_ROUND_ROBIN: "Cycle through keys evenly",
STRATEGY_LEAST_USED: "Always pick the least-used key",
STRATEGY_RANDOM: "Random selection",
}
for i, s in enumerate(strategies, 1):
marker = "" if s == current else ""
print(f" {i}. {s:15s}{descriptions.get(s, '')}{marker}")
try:
raw = input("\nStrategy [1-4]: ").strip()
except (EOFError, KeyboardInterrupt):
return
if not raw:
return
try:
idx = int(raw) - 1
strategy = strategies[idx]
except (ValueError, IndexError):
print("Invalid choice.")
return
from hermes_cli.config import load_config, save_config
cfg = load_config()
pool_strategies = cfg.get("credential_pool_strategies") or {}
if not isinstance(pool_strategies, dict):
pool_strategies = {}
pool_strategies[provider] = strategy
cfg["credential_pool_strategies"] = pool_strategies
save_config(cfg)
print(f"Set {provider} strategy to: {strategy}")
def auth_command(args) -> None:
action = getattr(args, "auth_action", "")
if action == "add":
auth_add_command(args)
return
if action == "list":
auth_list_command(args)
return
if action == "remove":
auth_remove_command(args)
return
if action == "reset":
auth_reset_command(args)
return
# No subcommand — launch interactive mode
_interactive_auth()

View File

@ -258,7 +258,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
get_toolset_for_tool: Callable to map tool name -> toolset name.
context_length: Model's context window size in tokens.
"""
from model_tools import check_tool_availability
from model_tools import check_tool_availability, TOOLSET_REQUIREMENTS
if get_toolset_for_tool is None:
from model_tools import get_toolset_for_tool
@ -267,8 +267,18 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
_, unavailable_toolsets = check_tool_availability(quiet=True)
disabled_tools = set()
# Tools whose toolset has a check_fn are lazy-initialized (e.g. honcho,
# homeassistant) — they show as unavailable at banner time because the
# check hasn't run yet, but they aren't misconfigured.
lazy_tools = set()
for item in unavailable_toolsets:
disabled_tools.update(item.get("tools", []))
toolset_name = item.get("name", "")
ts_req = TOOLSET_REQUIREMENTS.get(toolset_name, {})
tools_in_ts = item.get("tools", [])
if ts_req.get("check_fn"):
lazy_tools.update(tools_in_ts)
else:
disabled_tools.update(tools_in_ts)
layout_table = Table.grid(padding=(0, 2))
layout_table.add_column("left", justify="center")
@ -328,6 +338,8 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
for name in sorted(tool_names):
if name in disabled_tools:
colored_names.append(f"[red]{name}[/]")
elif name in lazy_tools:
colored_names.append(f"[yellow]{name}[/]")
else:
colored_names.append(f"[{text}]{name}[/]")
@ -347,6 +359,8 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
colored_names.append("[dim]...[/]")
elif name in disabled_tools:
colored_names.append(f"[red]{name}[/]")
elif name in lazy_tools:
colored_names.append(f"[yellow]{name}[/]")
else:
colored_names.append(f"[{text}]{name}[/]")
tools_str = ", ".join(colored_names)
@ -403,16 +417,26 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
if mcp_connected:
summary_parts.append(f"{mcp_connected} MCP servers")
summary_parts.append("/help for commands")
# Show active profile name when not 'default'
try:
from hermes_cli.profiles import get_active_profile_name
_profile_name = get_active_profile_name()
if _profile_name and _profile_name != "default":
right_lines.append(f"[bold {accent}]Profile:[/] [{text}]{_profile_name}[/]")
except Exception:
pass # Never break the banner over a profiles.py bug
right_lines.append(f"[dim {dim}]{' · '.join(summary_parts)}[/]")
# Update check — use prefetched result if available
try:
behind = get_update_result(timeout=0.5)
if behind and behind > 0:
from hermes_cli.config import recommended_update_command
commits_word = "commit" if behind == 1 else "commits"
right_lines.append(
f"[bold yellow]⚠ {behind} {commits_word} behind[/]"
f"[dim yellow] — run [bold]hermes update[/bold] to update[/]"
f"[dim yellow] — run [bold]{recommended_update_command()}[/bold] to update[/]"
)
except Exception:
pass # Never break the banner over an update check

View File

@ -12,6 +12,7 @@ import getpass
from hermes_cli.banner import cprint, _DIM, _RST
from hermes_cli.config import save_env_value_secure
from hermes_constants import display_hermes_home
def clarify_callback(cli, question, choices):
@ -131,7 +132,8 @@ def prompt_for_secret(cli, var_name: str, prompt: str, metadata=None) -> dict:
}
stored = save_env_value_secure(var_name, value)
cprint(f"\n{_DIM} ✓ Stored secret in ~/.hermes/.env as {var_name}{_RST}")
_dhh = display_hermes_home()
cprint(f"\n{_DIM} ✓ Stored secret in {_dhh}/.env as {var_name}{_RST}")
return {
**stored,
"skipped": False,
@ -183,7 +185,8 @@ def prompt_for_secret(cli, var_name: str, prompt: str, metadata=None) -> dict:
}
stored = save_env_value_secure(var_name, value)
cprint(f"\n{_DIM} ✓ Stored secret in ~/.hermes/.env as {var_name}{_RST}")
_dhh = display_hermes_home()
cprint(f"\n{_DIM} ✓ Stored secret in {_dhh}/.env as {var_name}{_RST}")
return {
**stored,
"skipped": False,
@ -238,7 +241,8 @@ def approval_callback(cli, command: str, description: str) -> str:
lock = cli._approval_lock
with lock:
timeout = 60
from cli import CLI_CONFIG
timeout = CLI_CONFIG.get("approvals", {}).get("timeout", 60)
response_queue = queue.Queue()
choices = ["once", "session", "always", "deny"]
if len(command) > 70:

View File

@ -5,6 +5,7 @@ toggleable list of items. Falls back to a numbered text UI when
curses is unavailable (Windows without curses, piped stdin, etc.).
"""
import sys
from typing import List, Set
from hermes_cli.colors import Colors, color
@ -26,6 +27,10 @@ def curses_checklist(
The indices the user confirmed as checked. On cancel (ESC/q),
returns ``pre_selected`` unchanged.
"""
# Safety: return defaults when stdin is not a terminal.
if not sys.stdin.isatty():
return set(pre_selected)
try:
import curses
selected = set(pre_selected)

View File

@ -4,14 +4,19 @@ Usage:
hermes claw migrate # Interactive migration from ~/.openclaw
hermes claw migrate --dry-run # Preview what would be migrated
hermes claw migrate --preset full --overwrite # Full migration, overwrite conflicts
hermes claw cleanup # Archive leftover OpenClaw directories
hermes claw cleanup --dry-run # Preview what would be archived
"""
import importlib.util
import logging
import shutil
import sys
from datetime import datetime
from pathlib import Path
from hermes_cli.config import get_hermes_home, get_config_path, load_config, save_config
from hermes_constants import get_optional_skills_dir
from hermes_cli.setup import (
Colors,
color,
@ -19,6 +24,7 @@ from hermes_cli.setup import (
print_info,
print_success,
print_error,
print_warning,
prompt_yes_no,
)
@ -27,8 +33,7 @@ logger = logging.getLogger(__name__)
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
_OPENCLAW_SCRIPT = (
PROJECT_ROOT
/ "optional-skills"
get_optional_skills_dir(PROJECT_ROOT / "optional-skills")
/ "migration"
/ "openclaw-migration"
/ "scripts"
@ -45,6 +50,18 @@ _OPENCLAW_SCRIPT_INSTALLED = (
/ "openclaw_to_hermes.py"
)
# Known OpenClaw directory names (current + legacy)
_OPENCLAW_DIR_NAMES = (".openclaw", ".clawdbot", ".moldbot")
# State files commonly found in OpenClaw workspace directories that cause
# confusion after migration (the agent discovers them and writes to them)
_WORKSPACE_STATE_GLOBS = (
"*/todo.json",
"*/sessions/*",
"*/memory/*.json",
"*/logs/*",
)
def _find_migration_script() -> Path | None:
"""Find the openclaw_to_hermes.py script in known locations."""
@ -71,24 +88,105 @@ def _load_migration_module(script_path: Path):
return mod
def _find_openclaw_dirs() -> list[Path]:
"""Find all OpenClaw directories on disk."""
found = []
for name in _OPENCLAW_DIR_NAMES:
candidate = Path.home() / name
if candidate.is_dir():
found.append(candidate)
return found
def _scan_workspace_state(source_dir: Path) -> list[tuple[Path, str]]:
"""Scan an OpenClaw directory for workspace state files that cause confusion.
Returns a list of (path, description) tuples.
"""
findings: list[tuple[Path, str]] = []
# Direct state files in the root
for name in ("todo.json", "sessions", "logs"):
candidate = source_dir / name
if candidate.exists():
kind = "directory" if candidate.is_dir() else "file"
findings.append((candidate, f"Root {kind}: {name}"))
# State files inside workspace directories
for child in sorted(source_dir.iterdir()):
if not child.is_dir() or child.name.startswith("."):
continue
# Check for workspace-like subdirectories
for state_name in ("todo.json", "sessions", "logs", "memory"):
state_path = child / state_name
if state_path.exists():
kind = "directory" if state_path.is_dir() else "file"
rel = state_path.relative_to(source_dir)
findings.append((state_path, f"Workspace {kind}: {rel}"))
return findings
def _archive_directory(source_dir: Path, dry_run: bool = False) -> Path:
"""Rename an OpenClaw directory to .pre-migration.
Returns the archive path.
"""
timestamp = datetime.now().strftime("%Y%m%d")
archive_name = f"{source_dir.name}.pre-migration"
archive_path = source_dir.parent / archive_name
# If archive already exists, add timestamp
if archive_path.exists():
archive_name = f"{source_dir.name}.pre-migration-{timestamp}"
archive_path = source_dir.parent / archive_name
# If still exists (multiple runs same day), add counter
counter = 2
while archive_path.exists():
archive_name = f"{source_dir.name}.pre-migration-{timestamp}-{counter}"
archive_path = source_dir.parent / archive_name
counter += 1
if not dry_run:
source_dir.rename(archive_path)
return archive_path
def claw_command(args):
"""Route hermes claw subcommands."""
action = getattr(args, "claw_action", None)
if action == "migrate":
_cmd_migrate(args)
elif action in ("cleanup", "clean"):
_cmd_cleanup(args)
else:
print("Usage: hermes claw migrate [options]")
print("Usage: hermes claw <command> [options]")
print()
print("Commands:")
print(" migrate Migrate settings from OpenClaw to Hermes")
print(" cleanup Archive leftover OpenClaw directories after migration")
print()
print("Run 'hermes claw migrate --help' for migration options.")
print("Run 'hermes claw <command> --help' for options.")
def _cmd_migrate(args):
"""Run the OpenClaw → Hermes migration."""
source_dir = Path(getattr(args, "source", None) or Path.home() / ".openclaw")
# Check current and legacy OpenClaw directories
explicit_source = getattr(args, "source", None)
if explicit_source:
source_dir = Path(explicit_source)
else:
source_dir = Path.home() / ".openclaw"
if not source_dir.is_dir():
# Try legacy directory names
for legacy in (".clawdbot", ".moldbot"):
candidate = Path.home() / legacy
if candidate.is_dir():
source_dir = candidate
break
dry_run = getattr(args, "dry_run", False)
preset = getattr(args, "preset", "full")
overwrite = getattr(args, "overwrite", False)
@ -198,6 +296,168 @@ def _cmd_migrate(args):
# Print results
_print_migration_report(report, dry_run)
# After successful non-dry-run migration, offer to archive the source directory
if not dry_run and report.get("summary", {}).get("migrated", 0) > 0:
_offer_source_archival(source_dir, getattr(args, "yes", False))
def _offer_source_archival(source_dir: Path, auto_yes: bool = False):
"""After migration, offer to rename the source directory to prevent state fragmentation.
OpenClaw workspace directories contain state files (todo.json, sessions, etc.)
that the agent may discover and write to, causing confusion. Renaming the
directory prevents this.
"""
if not source_dir.is_dir():
return
# Scan for state files that could cause problems
state_files = _scan_workspace_state(source_dir)
print()
print_header("Post-Migration Cleanup")
print_info("The OpenClaw directory still exists and contains workspace state files")
print_info("that can confuse the agent (todo lists, sessions, logs).")
if state_files:
print()
print(color(" Found state files:", Colors.YELLOW))
# Show up to 10 most relevant findings
for path, desc in state_files[:10]:
print(f" {desc}")
if len(state_files) > 10:
print(f" ... and {len(state_files) - 10} more")
print()
print_info(f"Recommend: rename {source_dir.name}/ to {source_dir.name}.pre-migration/")
print_info("This prevents the agent from discovering old workspace directories.")
print_info("You can always rename it back if needed.")
print()
if auto_yes or prompt_yes_no(f"Archive {source_dir} now?", default=True):
try:
archive_path = _archive_directory(source_dir)
print_success(f"Archived: {source_dir}{archive_path}")
print_info("The original directory has been renamed, not deleted.")
print_info(f"To undo: mv {archive_path} {source_dir}")
except OSError as e:
print_error(f"Could not archive: {e}")
print_info(f"You can do it manually: mv {source_dir} {source_dir}.pre-migration")
else:
print_info("Skipped. You can archive later with: hermes claw cleanup")
def _cmd_cleanup(args):
"""Archive leftover OpenClaw directories after migration.
Scans for OpenClaw directories that still exist after migration and offers
to rename them to .pre-migration to prevent state fragmentation.
"""
dry_run = getattr(args, "dry_run", False)
auto_yes = getattr(args, "yes", False)
explicit_source = getattr(args, "source", None)
print()
print(
color(
"┌─────────────────────────────────────────────────────────┐",
Colors.MAGENTA,
)
)
print(
color(
"│ ⚕ Hermes — OpenClaw Cleanup │",
Colors.MAGENTA,
)
)
print(
color(
"└─────────────────────────────────────────────────────────┘",
Colors.MAGENTA,
)
)
# Find OpenClaw directories
if explicit_source:
dirs_to_check = [Path(explicit_source)]
else:
dirs_to_check = _find_openclaw_dirs()
if not dirs_to_check:
print()
print_success("No OpenClaw directories found. Nothing to clean up.")
return
total_archived = 0
for source_dir in dirs_to_check:
print()
print_header(f"Found: {source_dir}")
# Scan for state files
state_files = _scan_workspace_state(source_dir)
# Show directory stats
try:
workspace_dirs = [
d for d in source_dir.iterdir()
if d.is_dir() and not d.name.startswith(".")
and any((d / name).exists() for name in ("todo.json", "SOUL.md", "MEMORY.md", "USER.md"))
]
except OSError:
workspace_dirs = []
if workspace_dirs:
print_info(f"Workspace directories: {len(workspace_dirs)}")
for ws in workspace_dirs[:5]:
items = []
if (ws / "todo.json").exists():
items.append("todo.json")
if (ws / "sessions").is_dir():
items.append("sessions/")
if (ws / "SOUL.md").exists():
items.append("SOUL.md")
if (ws / "MEMORY.md").exists():
items.append("MEMORY.md")
detail = ", ".join(items) if items else "empty"
print(f" {ws.name}/ ({detail})")
if len(workspace_dirs) > 5:
print(f" ... and {len(workspace_dirs) - 5} more")
if state_files:
print()
print(color(f" {len(state_files)} state file(s) that could cause confusion:", Colors.YELLOW))
for path, desc in state_files[:8]:
print(f" {desc}")
if len(state_files) > 8:
print(f" ... and {len(state_files) - 8} more")
print()
if dry_run:
archive_path = _archive_directory(source_dir, dry_run=True)
print_info(f"Would archive: {source_dir}{archive_path}")
else:
if auto_yes or prompt_yes_no(f"Archive {source_dir}?", default=True):
try:
archive_path = _archive_directory(source_dir)
print_success(f"Archived: {source_dir}{archive_path}")
total_archived += 1
except OSError as e:
print_error(f"Could not archive: {e}")
print_info(f"Try manually: mv {source_dir} {source_dir}.pre-migration")
else:
print_info("Skipped.")
# Summary
print()
if dry_run:
print_info(f"Dry run complete. {len(dirs_to_check)} directory(ies) would be archived.")
print_info("Run without --dry-run to archive them.")
elif total_archived:
print_success(f"Cleaned up {total_archived} OpenClaw directory(ies).")
print_info("Directories were renamed, not deleted. You can undo by renaming them back.")
else:
print_info("No directories were archived.")
def _print_migration_report(report: dict, dry_run: bool):
"""Print a formatted migration report."""

View File

@ -12,6 +12,8 @@ import os
logger = logging.getLogger(__name__)
DEFAULT_CODEX_MODELS: List[str] = [
"gpt-5.4-mini",
"gpt-5.4",
"gpt-5.3-codex",
"gpt-5.2-codex",
"gpt-5.1-codex-max",
@ -19,8 +21,9 @@ DEFAULT_CODEX_MODELS: List[str] = [
]
_FORWARD_COMPAT_TEMPLATE_MODELS: List[tuple[str, tuple[str, ...]]] = [
("gpt-5.3-codex", ("gpt-5.2-codex",)),
("gpt-5.4-mini", ("gpt-5.3-codex", "gpt-5.2-codex")),
("gpt-5.4", ("gpt-5.3-codex", "gpt-5.2-codex")),
("gpt-5.3-codex", ("gpt-5.2-codex",)),
("gpt-5.3-codex-spark", ("gpt-5.3-codex", "gpt-5.2-codex")),
]

View File

@ -1,8 +1,24 @@
"""Shared ANSI color utilities for Hermes CLI modules."""
import os
import sys
def should_use_color() -> bool:
"""Return True when colored output is appropriate.
Respects the NO_COLOR environment variable (https://no-color.org/)
and TERM=dumb, in addition to the existing TTY check.
"""
if os.environ.get("NO_COLOR") is not None:
return False
if os.environ.get("TERM") == "dumb":
return False
if not sys.stdout.isatty():
return False
return True
class Colors:
RESET = "\033[0m"
BOLD = "\033[1m"
@ -16,7 +32,7 @@ class Colors:
def color(text: str, *codes) -> str:
"""Apply color codes to text (only when output is a TTY)."""
if not sys.stdout.isatty():
"""Apply color codes to text (only when color output is appropriate)."""
if not should_use_color():
return text
return "".join(codes) + text + Colors.RESET

View File

@ -67,10 +67,13 @@ COMMAND_REGISTRY: list[CommandDef] = [
gateway_only=True),
CommandDef("background", "Run a prompt in the background", "Session",
aliases=("bg",), args_hint="<prompt>"),
CommandDef("btw", "Ephemeral side question using session context (no tools, not persisted)", "Session",
args_hint="<question>"),
CommandDef("queue", "Queue a prompt for the next turn (doesn't interrupt)", "Session",
aliases=("q",), args_hint="<prompt>"),
CommandDef("status", "Show session info", "Session",
gateway_only=True),
CommandDef("profile", "Show active profile name and home directory", "Info"),
CommandDef("sethome", "Set this chat as the home channel", "Session",
gateway_only=True, aliases=("set-home",)),
CommandDef("resume", "Resume a previously-named session", "Session",
@ -90,6 +93,8 @@ COMMAND_REGISTRY: list[CommandDef] = [
CommandDef("verbose", "Cycle tool progress display: off -> new -> all -> verbose",
"Configuration", cli_only=True,
gateway_config_gate="display.tool_progress_command"),
CommandDef("yolo", "Toggle YOLO mode (skip all dangerous command approvals)",
"Configuration"),
CommandDef("reasoning", "Manage reasoning effort and display", "Configuration",
args_hint="[level|show|hide]",
subcommands=("none", "low", "minimal", "medium", "high", "xhigh", "show", "hide", "on", "off")),
@ -118,6 +123,8 @@ COMMAND_REGISTRY: list[CommandDef] = [
"Tools & Skills", cli_only=True),
# Info
CommandDef("commands", "Browse all commands and skills (paginated)", "Info",
gateway_only=True, args_hint="[page]"),
CommandDef("help", "Show available commands", "Info"),
CommandDef("usage", "Show token usage for the current session", "Info"),
CommandDef("insights", "Show usage insights and analytics", "Info",
@ -361,6 +368,134 @@ def telegram_bot_commands() -> list[tuple[str, str]]:
return result
_TG_NAME_LIMIT = 32
def _clamp_telegram_names(
entries: list[tuple[str, str]],
reserved: set[str],
) -> list[tuple[str, str]]:
"""Enforce Telegram's 32-char command name limit with collision avoidance.
Names exceeding 32 chars are truncated. If truncation creates a duplicate
(against *reserved* names or earlier entries in the same batch), the name is
shortened to 31 chars and a digit ``0``-``9`` is appended to differentiate.
If all 10 digit slots are taken the entry is silently dropped.
"""
used: set[str] = set(reserved)
result: list[tuple[str, str]] = []
for name, desc in entries:
if len(name) > _TG_NAME_LIMIT:
candidate = name[:_TG_NAME_LIMIT]
if candidate in used:
prefix = name[:_TG_NAME_LIMIT - 1]
for digit in range(10):
candidate = f"{prefix}{digit}"
if candidate not in used:
break
else:
# All 10 digit slots exhausted — skip entry
continue
name = candidate
if name in used:
continue
used.add(name)
result.append((name, desc))
return result
def telegram_menu_commands(max_commands: int = 100) -> tuple[list[tuple[str, str]], int]:
"""Return Telegram menu commands capped to the Bot API limit.
Priority order (higher priority = never bumped by overflow):
1. Core CommandDef commands (always included)
2. Plugin slash commands (take precedence over skills)
3. Built-in skill commands (fill remaining slots, alphabetical)
Skills are the only tier that gets trimmed when the cap is hit.
User-installed hub skills are excluded — accessible via /skills.
Skills disabled for the ``"telegram"`` platform (via ``hermes skills
config``) are excluded from the menu entirely.
Returns:
(menu_commands, hidden_count) where hidden_count is the number of
skill commands omitted due to the cap.
"""
core_commands = list(telegram_bot_commands())
# Reserve core names so plugin/skill truncation can't collide with them
reserved_names = {n for n, _ in core_commands}
all_commands = list(core_commands)
# Plugin slash commands get priority over skills
plugin_entries: list[tuple[str, str]] = []
try:
from hermes_cli.plugins import get_plugin_manager
pm = get_plugin_manager()
plugin_cmds = getattr(pm, "_plugin_commands", {})
for cmd_name in sorted(plugin_cmds):
tg_name = cmd_name.replace("-", "_")
desc = "Plugin command"
if len(desc) > 40:
desc = desc[:37] + "..."
plugin_entries.append((tg_name, desc))
except Exception:
pass
# Clamp plugin names to 32 chars with collision avoidance
plugin_entries = _clamp_telegram_names(plugin_entries, reserved_names)
reserved_names.update(n for n, _ in plugin_entries)
all_commands.extend(plugin_entries)
# Load per-platform disabled skills so they don't consume menu slots.
# get_skill_commands() already filters the *global* disabled list, but
# per-platform overrides (skills.platform_disabled.telegram) were never
# applied here — that's what this block fixes.
_platform_disabled: set[str] = set()
try:
from agent.skill_utils import get_disabled_skill_names
_platform_disabled = get_disabled_skill_names(platform="telegram")
except Exception:
pass
# Remaining slots go to built-in skill commands (not hub-installed).
skill_entries: list[tuple[str, str]] = []
try:
from agent.skill_commands import get_skill_commands
from tools.skills_tool import SKILLS_DIR
_skills_dir = str(SKILLS_DIR.resolve())
_hub_dir = str((SKILLS_DIR / ".hub").resolve())
skill_cmds = get_skill_commands()
for cmd_key in sorted(skill_cmds):
info = skill_cmds[cmd_key]
skill_path = info.get("skill_md_path", "")
if not skill_path.startswith(_skills_dir):
continue
if skill_path.startswith(_hub_dir):
continue
# Skip skills disabled for telegram
skill_name = info.get("name", "")
if skill_name in _platform_disabled:
continue
name = cmd_key.lstrip("/").replace("-", "_")
desc = info.get("description", "")
# Keep descriptions short — setMyCommands has an undocumented
# total payload limit. 40 chars fits 100 commands safely.
if len(desc) > 40:
desc = desc[:37] + "..."
skill_entries.append((name, desc))
except Exception:
pass
# Clamp skill names to 32 chars with collision avoidance
skill_entries = _clamp_telegram_names(skill_entries, reserved_names)
# Skills fill remaining slots — they're the only tier that gets trimmed
remaining_slots = max(0, max_commands - len(all_commands))
hidden_count = max(0, len(skill_entries) - remaining_slots)
all_commands.extend(skill_entries[:remaining_slots])
return all_commands[:max_commands], hidden_count
def slack_subcommand_map() -> dict[str, str]:
"""Return subcommand -> /command mapping for Slack /hermes handler.

View File

@ -22,6 +22,8 @@ import tempfile
from pathlib import Path
from typing import Dict, Any, Optional, List, Tuple
from tools.tool_backend_helpers import managed_nous_tools_enabled as _managed_nous_tools_enabled
_IS_WINDOWS = platform.system() == "Windows"
_ENV_VAR_NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
# Env var names written to .env that aren't in OPTIONAL_ENV_VARS
@ -34,12 +36,13 @@ _EXTRA_ENV_KEYS = frozenset({
"SIGNAL_ACCOUNT", "SIGNAL_HTTP_URL",
"SIGNAL_ALLOWED_USERS", "SIGNAL_GROUP_ALLOWED_USERS",
"DINGTALK_CLIENT_ID", "DINGTALK_CLIENT_SECRET",
"FEISHU_APP_ID", "FEISHU_APP_SECRET", "FEISHU_ENCRYPT_KEY", "FEISHU_VERIFICATION_TOKEN",
"WECOM_BOT_ID", "WECOM_SECRET",
"TERMINAL_ENV", "TERMINAL_SSH_KEY", "TERMINAL_SSH_PORT",
"WHATSAPP_MODE", "WHATSAPP_ENABLED",
"MATTERMOST_HOME_CHANNEL", "MATTERMOST_REPLY_MODE",
"MATRIX_PASSWORD", "MATRIX_ENCRYPTION", "MATRIX_HOME_ROOM",
})
import yaml
from hermes_cli.colors import Colors, color
@ -50,26 +53,86 @@ from hermes_cli.default_soul import DEFAULT_SOUL_MD
# Managed mode (NixOS declarative config)
# =============================================================================
_MANAGED_TRUE_VALUES = ("true", "1", "yes")
_MANAGED_SYSTEM_NAMES = {
"brew": "Homebrew",
"homebrew": "Homebrew",
"nix": "NixOS",
"nixos": "NixOS",
}
def get_managed_system() -> Optional[str]:
"""Return the package manager owning this install, if any."""
raw = os.getenv("HERMES_MANAGED", "").strip()
if raw:
normalized = raw.lower()
if normalized in _MANAGED_TRUE_VALUES:
return "NixOS"
return _MANAGED_SYSTEM_NAMES.get(normalized, raw)
managed_marker = get_hermes_home() / ".managed"
if managed_marker.exists():
return "NixOS"
return None
def is_managed() -> bool:
"""Check if hermes is running in Nix-managed mode.
"""Check if Hermes is running in package-manager-managed mode.
Two signals: the HERMES_MANAGED env var (set by the systemd service),
or a .managed marker file in HERMES_HOME (set by the NixOS activation
script, so interactive shells also see it).
"""
if os.getenv("HERMES_MANAGED", "").lower() in ("true", "1", "yes"):
return True
managed_marker = get_hermes_home() / ".managed"
return managed_marker.exists()
return get_managed_system() is not None
def get_managed_update_command() -> Optional[str]:
"""Return the preferred upgrade command for a managed install."""
managed_system = get_managed_system()
if managed_system == "Homebrew":
return "brew upgrade hermes-agent"
if managed_system == "NixOS":
return "sudo nixos-rebuild switch"
return None
def recommended_update_command() -> str:
"""Return the best update command for the current installation."""
return get_managed_update_command() or "hermes update"
def format_managed_message(action: str = "modify this Hermes installation") -> str:
"""Build a user-facing error for managed installs."""
managed_system = get_managed_system() or "a package manager"
raw = os.getenv("HERMES_MANAGED", "").strip().lower()
if managed_system == "NixOS":
env_hint = "true" if raw in _MANAGED_TRUE_VALUES else raw or "true"
return (
f"Cannot {action}: this Hermes installation is managed by NixOS "
f"(HERMES_MANAGED={env_hint}).\n"
"Edit services.hermes-agent.settings in your configuration.nix and run:\n"
" sudo nixos-rebuild switch"
)
if managed_system == "Homebrew":
env_hint = raw or "homebrew"
return (
f"Cannot {action}: this Hermes installation is managed by Homebrew "
f"(HERMES_MANAGED={env_hint}).\n"
"Use:\n"
" brew upgrade hermes-agent"
)
return (
f"Cannot {action}: this Hermes installation is managed by {managed_system}.\n"
"Use your package manager to upgrade or reinstall Hermes."
)
def managed_error(action: str = "modify configuration"):
"""Print user-friendly error for managed mode."""
print(
f"Cannot {action}: configuration is managed by NixOS (HERMES_MANAGED=true).\n"
"Edit services.hermes-agent.settings in your configuration.nix and run:\n"
" sudo nixos-rebuild switch",
file=sys.stderr,
)
print(format_managed_message(action), file=sys.stderr)
# =============================================================================
@ -134,7 +197,9 @@ def ensure_hermes_home():
# =============================================================================
DEFAULT_CONFIG = {
"model": "anthropic/claude-opus-4.6",
"model": "",
"fallback_providers": [],
"credential_pool_strategies": {},
"toolsets": ["hermes-cli"],
"agent": {
"max_turns": 90,
@ -148,6 +213,7 @@ DEFAULT_CONFIG = {
"terminal": {
"backend": "local",
"modal_mode": "auto",
"cwd": ".", # Use current directory
"timeout": 180,
# Environment variables to pass through to sandboxed execution
@ -182,6 +248,14 @@ DEFAULT_CONFIG = {
"inactivity_timeout": 120,
"command_timeout": 30, # Timeout for browser commands in seconds (screenshot, navigate, etc.)
"record_sessions": False, # Auto-record browser sessions as WebM videos
"allow_private_urls": False, # Allow navigating to private/internal IPs (localhost, 192.168.x.x, etc.)
"camofox": {
# When true, Hermes sends a stable profile-scoped userId to Camofox
# so the server can map it to a persistent browser profile directory.
# Requires Camofox server to be configured with CAMOFOX_PROFILE_DIR.
# When false (default), each session gets a random userId (ephemeral).
"managed_persistence": False,
},
},
# Filesystem checkpoints — automatic snapshots before destructive file ops.
@ -191,6 +265,11 @@ DEFAULT_CONFIG = {
"enabled": True,
"max_snapshots": 50, # Max checkpoints to keep per directory
},
# Maximum characters returned by a single read_file call. Reads that
# exceed this are rejected with guidance to use offset+limit.
# 100K chars ≈ 2535K tokens across typical tokenisers.
"file_read_max_chars": 100_000,
"compression": {
"enabled": True,
@ -220,49 +299,57 @@ DEFAULT_CONFIG = {
"model": "", # e.g. "google/gemini-2.5-flash", "gpt-4o"
"base_url": "", # direct OpenAI-compatible endpoint (takes precedence over provider)
"api_key": "", # API key for base_url (falls back to OPENAI_API_KEY)
"timeout": 30, # seconds — increase for slow local vision models
"timeout": 30, # seconds — LLM API call timeout; increase for slow local vision models
"download_timeout": 30, # seconds — image HTTP download timeout; increase for slow connections
},
"web_extract": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 30, # seconds — increase for slow local models
},
"compression": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 120, # seconds — compression summarises large contexts; increase for local models
},
"session_search": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 30,
},
"skills_hub": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 30,
},
"approval": {
"provider": "auto",
"model": "", # fast/cheap model recommended (e.g. gemini-flash, haiku)
"base_url": "",
"api_key": "",
"timeout": 30,
},
"mcp": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 30,
},
"flush_memories": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 30,
},
},
@ -274,9 +361,11 @@ DEFAULT_CONFIG = {
"bell_on_complete": False,
"show_reasoning": False,
"streaming": False,
"inline_diffs": True, # Show inline diff previews for write actions (write_file, patch, skill_manage)
"show_cost": False, # Show $ cost in the status bar (off by default)
"skin": "default",
"tool_progress_command": False, # Enable /verbose command in messaging gateway
"tool_preview_length": 0, # Max chars for tool call previews (0 = no limit, show full paths/commands)
},
# Privacy settings
@ -339,6 +428,11 @@ DEFAULT_CONFIG = {
"user_profile_enabled": True,
"memory_char_limit": 2200, # ~800 tokens at 2.75 chars/token
"user_char_limit": 1375, # ~500 tokens at 2.75 chars/token
# External memory provider plugin (empty = built-in only).
# Set to a provider name to activate: "openviking", "mem0",
# "hindsight", "holographic", "retaindb", "byterover".
# Only ONE external provider is allowed at a time.
"provider": "",
},
# Subagent delegation — override the provider:model used by delegate_task
@ -359,6 +453,13 @@ DEFAULT_CONFIG = {
# Never saved to sessions, logs, or trajectories.
"prefill_messages_file": "",
# Skills — external skill directories for sharing skills across tools/agents.
# Each path is expanded (~, ${VAR}) and resolved. Read-only — skill creation
# always goes to ~/.hermes/skills/.
"skills": {
"external_dirs": [], # e.g. ["~/.agents/skills", "/shared/team-skills"]
},
# Honcho AI-native memory -- reads ~/.honcho/config.json as single source of truth.
# This section is only needed for hermes-specific overrides; everything else
# (apiKey, workspace, peerName, sessions, enabled) comes from the global config.
@ -373,6 +474,7 @@ DEFAULT_CONFIG = {
"require_mention": True, # Require @mention to respond in server channels
"free_response_channels": "", # Comma-separated channel IDs where bot responds without mention
"auto_thread": True, # Auto-create threads on @mention in channels (like Slack)
"reactions": True, # Add 👀/✅/❌ reactions to messages during processing
},
# WhatsApp platform settings (gateway mode)
@ -389,6 +491,7 @@ DEFAULT_CONFIG = {
# off — skip all approval prompts (equivalent to --yolo)
"approvals": {
"mode": "manual",
"timeout": 60,
},
# Permanently allowed dangerous command patterns (added via "always" approval)
@ -414,8 +517,14 @@ DEFAULT_CONFIG = {
},
},
"cron": {
# Wrap delivered cron responses with a header (task name) and footer
# ("The agent cannot see this message"). Set to false for clean output.
"wrap_response": True,
},
# Config schema version - bump this when adding new required fields
"_config_version": 10,
"_config_version": 11,
}
# =============================================================================
@ -430,6 +539,7 @@ ENV_VARS_BY_VERSION: Dict[int, List[str]] = {
5: ["WHATSAPP_ENABLED", "WHATSAPP_MODE", "WHATSAPP_ALLOWED_USERS",
"SLACK_BOT_TOKEN", "SLACK_APP_TOKEN", "SLACK_ALLOWED_USERS"],
10: ["TAVILY_API_KEY"],
11: ["TERMINAL_MODAL_MODE"],
}
# Required environment variables with metadata for migration prompts.
@ -616,6 +726,14 @@ OPTIONAL_ENV_VARS = {
},
# ── Tool API keys ──
"EXA_API_KEY": {
"description": "Exa API key for AI-native web search and contents",
"prompt": "Exa API key",
"url": "https://exa.ai/",
"tools": ["web_search", "web_extract"],
"password": True,
"category": "tool",
},
"PARALLEL_API_KEY": {
"description": "Parallel API key for AI-native web search and extract",
"prompt": "Parallel API key",
@ -640,6 +758,38 @@ OPTIONAL_ENV_VARS = {
"category": "tool",
"advanced": True,
},
"FIRECRAWL_GATEWAY_URL": {
"description": "Exact Firecrawl tool-gateway origin override for Nous Subscribers only (optional)",
"prompt": "Firecrawl gateway URL (leave empty to derive from domain)",
"url": None,
"password": False,
"category": "tool",
"advanced": True,
},
"TOOL_GATEWAY_DOMAIN": {
"description": "Shared tool-gateway domain suffix for Nous Subscribers only, used to derive vendor hosts, e.g. nousresearch.com -> firecrawl-gateway.nousresearch.com",
"prompt": "Tool-gateway domain suffix",
"url": None,
"password": False,
"category": "tool",
"advanced": True,
},
"TOOL_GATEWAY_SCHEME": {
"description": "Shared tool-gateway URL scheme for Nous Subscribers only, used to derive vendor hosts (`https` by default, set `http` for local gateway testing)",
"prompt": "Tool-gateway URL scheme",
"url": None,
"password": False,
"category": "tool",
"advanced": True,
},
"TOOL_GATEWAY_USER_TOKEN": {
"description": "Explicit Nous Subscriber access token for tool-gateway requests (optional; otherwise read from the Hermes auth store)",
"prompt": "Tool-gateway user token",
"url": None,
"password": True,
"category": "tool",
"advanced": True,
},
"TAVILY_API_KEY": {
"description": "Tavily API key for AI-native web search, extract, and crawl",
"prompt": "Tavily API key",
@ -672,6 +822,14 @@ OPTIONAL_ENV_VARS = {
"password": True,
"category": "tool",
},
"CAMOFOX_URL": {
"description": "Camofox browser server URL for local anti-detection browsing (e.g. http://localhost:9377)",
"prompt": "Camofox server URL",
"url": "https://github.com/jo-inc/camofox-browser",
"tools": ["browser_navigate", "browser_click"],
"password": False,
"category": "tool",
},
"FAL_KEY": {
"description": "FAL API key for image generation",
"prompt": "FAL API key",
@ -802,6 +960,20 @@ OPTIONAL_ENV_VARS = {
"password": False,
"category": "messaging",
},
"MATTERMOST_REQUIRE_MENTION": {
"description": "Require @mention in Mattermost channels (default: true). Set to false to respond to all messages.",
"prompt": "Require @mention in channels",
"url": None,
"password": False,
"category": "messaging",
},
"MATTERMOST_FREE_RESPONSE_CHANNELS": {
"description": "Comma-separated Mattermost channel IDs where bot responds without @mention",
"prompt": "Free-response channel IDs (comma-separated)",
"url": None,
"password": False,
"category": "messaging",
},
"MATRIX_HOMESERVER": {
"description": "Matrix homeserver URL (e.g. https://matrix.example.org)",
"prompt": "Matrix homeserver URL",
@ -947,6 +1119,15 @@ OPTIONAL_ENV_VARS = {
},
}
if not _managed_nous_tools_enabled():
for _hidden_var in (
"FIRECRAWL_GATEWAY_URL",
"TOOL_GATEWAY_DOMAIN",
"TOOL_GATEWAY_SCHEME",
"TOOL_GATEWAY_USER_TOKEN",
):
OPTIONAL_ENV_VARS.pop(_hidden_var, None)
def get_missing_env_vars(required_only: bool = False) -> List[Dict[str, Any]]:
"""
@ -1249,6 +1430,36 @@ def _expand_env_vars(obj):
return obj
def _normalize_root_model_keys(config: Dict[str, Any]) -> Dict[str, Any]:
"""Move stale root-level provider/base_url into model section.
Some users (or older code) placed ``provider:`` and ``base_url:`` at the
config root instead of inside ``model:``. These root-level keys are only
used as a fallback when the corresponding ``model.*`` key is empty — they
never override an existing ``model.provider`` or ``model.base_url``.
After migration the root-level keys are removed so they can't cause
confusion on subsequent loads.
"""
# Only act if there are root-level keys to migrate
has_root = any(config.get(k) for k in ("provider", "base_url"))
if not has_root:
return config
config = dict(config)
model = config.get("model")
if not isinstance(model, dict):
model = {"default": model} if model else {}
config["model"] = model
for key in ("provider", "base_url"):
root_val = config.get(key)
if root_val and not model.get(key):
model[key] = root_val
config.pop(key, None)
return config
def _normalize_max_turns_config(config: Dict[str, Any]) -> Dict[str, Any]:
"""Normalize legacy root-level max_turns into agent.max_turns."""
config = dict(config)
@ -1290,7 +1501,7 @@ def load_config() -> Dict[str, Any]:
except Exception as e:
print(f"Warning: Failed to load config: {e}")
return _expand_env_vars(_normalize_max_turns_config(config))
return _expand_env_vars(_normalize_root_model_keys(_normalize_max_turns_config(config)))
_SECURITY_COMMENT = """
@ -1397,7 +1608,7 @@ def save_config(config: Dict[str, Any]):
ensure_hermes_home()
config_path = get_config_path()
normalized = _normalize_max_turns_config(config)
normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
# Build optional commented-out sections for features that are off by
# default or only relevant when explicitly configured.
@ -1672,6 +1883,7 @@ def show_config():
keys = [
("OPENROUTER_API_KEY", "OpenRouter"),
("VOICE_TOOLS_OPENAI_KEY", "OpenAI (STT/TTS)"),
("EXA_API_KEY", "Exa"),
("PARALLEL_API_KEY", "Parallel"),
("FIRECRAWL_API_KEY", "Firecrawl"),
("TAVILY_API_KEY", "Tavily"),
@ -1831,7 +2043,9 @@ def set_config_value(key: str, value: str):
# Check if it's an API key (goes to .env)
api_keys = [
'OPENROUTER_API_KEY', 'OPENAI_API_KEY', 'ANTHROPIC_API_KEY', 'VOICE_TOOLS_OPENAI_KEY',
'PARALLEL_API_KEY', 'FIRECRAWL_API_KEY', 'FIRECRAWL_API_URL', 'TAVILY_API_KEY',
'EXA_API_KEY', 'PARALLEL_API_KEY', 'FIRECRAWL_API_KEY', 'FIRECRAWL_API_URL',
'FIRECRAWL_GATEWAY_URL', 'TOOL_GATEWAY_DOMAIN', 'TOOL_GATEWAY_SCHEME',
'TOOL_GATEWAY_USER_TOKEN', 'TAVILY_API_KEY',
'BROWSERBASE_API_KEY', 'BROWSERBASE_PROJECT_ID', 'BROWSER_USE_API_KEY',
'FAL_KEY', 'TELEGRAM_BOT_TOKEN', 'DISCORD_BOT_TOKEN',
'TERMINAL_SSH_HOST', 'TERMINAL_SSH_USER', 'TERMINAL_SSH_KEY',
@ -1887,6 +2101,7 @@ def set_config_value(key: str, value: str):
# config.yaml is authoritative, but terminal_tool only reads TERMINAL_ENV etc.
_config_to_env_sync = {
"terminal.backend": "TERMINAL_ENV",
"terminal.modal_mode": "TERMINAL_MODAL_MODE",
"terminal.docker_image": "TERMINAL_DOCKER_IMAGE",
"terminal.singularity_image": "TERMINAL_SINGULARITY_IMAGE",
"terminal.modal_image": "TERMINAL_MODAL_IMAGE",
@ -1920,7 +2135,7 @@ def config_command(args):
elif subcmd == "set":
key = getattr(args, 'key', None)
value = getattr(args, 'value', None)
if not key or not value:
if not key or value is None:
print("Usage: hermes config set <key> <value>")
print()
print("Examples:")

View File

@ -56,7 +56,7 @@ def cron_list(show_all: bool = False):
print()
for job in jobs:
job_id = job.get("id", "?")[:8]
job_id = job.get("id", "?")
name = job.get("name", "(unnamed)")
schedule = job.get("schedule_display", job.get("schedule", {}).get("value", "?"))
state = job.get("state", "scheduled" if job.get("enabled", True) else "paused")

View File

@ -4,7 +4,8 @@ Used by `hermes tools` and `hermes skills` for interactive checklists.
Provides a curses multi-select with keyboard navigation, plus a
text-based numbered fallback for terminals without curses support.
"""
from typing import List, Set
import sys
from typing import Callable, List, Optional, Set
from hermes_cli.colors import Colors, color
@ -15,6 +16,7 @@ def curses_checklist(
selected: Set[int],
*,
cancel_returns: Set[int] | None = None,
status_fn: Optional[Callable[[Set[int]], str]] = None,
) -> Set[int]:
"""Curses multi-select checklist. Returns set of selected indices.
@ -23,10 +25,18 @@ def curses_checklist(
items: Display labels for each row.
selected: Indices that start checked (pre-selected).
cancel_returns: Returned on ESC/q. Defaults to the original *selected*.
status_fn: Optional callback ``f(chosen_indices) -> str`` whose return
value is rendered on the bottom row of the terminal. Use this for
live aggregate info (e.g. estimated token counts).
"""
if cancel_returns is None:
cancel_returns = set(selected)
# Safety: curses and input() both hang or spin when stdin is not a
# terminal (e.g. subprocess pipe). Return defaults immediately.
if not sys.stdin.isatty():
return cancel_returns
try:
import curses
chosen = set(selected)
@ -47,6 +57,9 @@ def curses_checklist(
stdscr.clear()
max_y, max_x = stdscr.getmaxyx()
# Reserve bottom row for status bar when status_fn provided
footer_rows = 1 if status_fn else 0
# Header
try:
hattr = curses.A_BOLD
@ -62,7 +75,7 @@ def curses_checklist(
pass
# Scrollable item list
visible_rows = max_y - 3
visible_rows = max_y - 3 - footer_rows
if cursor < scroll_offset:
scroll_offset = cursor
elif cursor >= scroll_offset + visible_rows:
@ -72,7 +85,7 @@ def curses_checklist(
range(scroll_offset, min(len(items), scroll_offset + visible_rows))
):
y = draw_i + 3
if y >= max_y - 1:
if y >= max_y - 1 - footer_rows:
break
check = "" if i in chosen else " "
arrow = "" if i == cursor else " "
@ -87,6 +100,20 @@ def curses_checklist(
except curses.error:
pass
# Status bar (bottom row, right-aligned)
if status_fn:
try:
status_text = status_fn(chosen)
if status_text:
# Right-align on the bottom row
sx = max(0, max_x - len(status_text) - 1)
sattr = curses.A_DIM
if curses.has_colors():
sattr |= curses.color_pair(3)
stdscr.addnstr(max_y - 1, sx, status_text, max_x - sx - 1, sattr)
except curses.error:
pass
stdscr.refresh()
key = stdscr.getch()
@ -107,7 +134,7 @@ def curses_checklist(
return result_holder[0] if result_holder[0] is not None else cancel_returns
except Exception:
return _numbered_fallback(title, items, selected, cancel_returns)
return _numbered_fallback(title, items, selected, cancel_returns, status_fn)
def _numbered_fallback(
@ -115,6 +142,7 @@ def _numbered_fallback(
items: List[str],
selected: Set[int],
cancel_returns: Set[int],
status_fn: Optional[Callable[[Set[int]], str]] = None,
) -> Set[int]:
"""Text-based toggle fallback for terminals without curses."""
chosen = set(selected)
@ -125,6 +153,10 @@ def _numbered_fallback(
for i, label in enumerate(items):
marker = color("[✓]", Colors.GREEN) if i in chosen else "[ ]"
print(f" {marker} {i + 1:>2}. {label}")
if status_fn:
status_text = status_fn(chosen)
if status_text:
print(color(f"\n {status_text}", Colors.DIM))
print()
try:
val = input(color(" Toggle # (or Enter to confirm): ", Colors.DIM)).strip()

View File

@ -10,9 +10,11 @@ import subprocess
import shutil
from hermes_cli.config import get_project_root, get_hermes_home, get_env_path
from hermes_constants import display_hermes_home
PROJECT_ROOT = get_project_root()
HERMES_HOME = get_hermes_home()
_DHH = display_hermes_home() # user-facing display path (e.g. ~/.hermes or ~/.hermes/profiles/coder)
# Load environment variables from ~/.hermes/.env so API key checks work
from dotenv import load_dotenv
@ -53,10 +55,10 @@ def _has_provider_env_config(content: str) -> bool:
def _honcho_is_configured_for_doctor() -> bool:
"""Return True when Honcho is configured, even if this process has no active session."""
try:
from honcho_integration.client import HonchoClientConfig
from plugins.memory.honcho.client import HonchoClientConfig
cfg = HonchoClientConfig.from_global_config()
return bool(cfg.enabled and cfg.api_key)
return bool(cfg.enabled and (cfg.api_key or cfg.base_url))
except Exception:
return False
@ -209,14 +211,14 @@ def run_doctor(args):
# Check ~/.hermes/.env (primary location for user config)
env_path = HERMES_HOME / '.env'
if env_path.exists():
check_ok("~/.hermes/.env file exists")
check_ok(f"{_DHH}/.env file exists")
# Check for common issues
content = env_path.read_text()
if _has_provider_env_config(content):
check_ok("API key or custom endpoint configured")
else:
check_warn("No API key found in ~/.hermes/.env")
check_warn(f"No API key found in {_DHH}/.env")
issues.append("Run 'hermes setup' to configure API keys")
else:
# Also check project root as fallback
@ -224,11 +226,11 @@ def run_doctor(args):
if fallback_env.exists():
check_ok(".env file exists (in project directory)")
else:
check_fail("~/.hermes/.env file missing")
check_fail(f"{_DHH}/.env file missing")
if should_fix:
env_path.parent.mkdir(parents=True, exist_ok=True)
env_path.touch()
check_ok("Created empty ~/.hermes/.env")
check_ok(f"Created empty {_DHH}/.env")
check_info("Run 'hermes setup' to configure API keys")
fixed_count += 1
else:
@ -238,7 +240,7 @@ def run_doctor(args):
# Check ~/.hermes/config.yaml (primary) or project cli-config.yaml (fallback)
config_path = HERMES_HOME / 'config.yaml'
if config_path.exists():
check_ok("~/.hermes/config.yaml exists")
check_ok(f"{_DHH}/config.yaml exists")
else:
fallback_config = PROJECT_ROOT / 'cli-config.yaml'
if fallback_config.exists():
@ -248,11 +250,11 @@ def run_doctor(args):
if should_fix and example_config.exists():
config_path.parent.mkdir(parents=True, exist_ok=True)
shutil.copy2(str(example_config), str(config_path))
check_ok("Created ~/.hermes/config.yaml from cli-config.yaml.example")
check_ok(f"Created {_DHH}/config.yaml from cli-config.yaml.example")
fixed_count += 1
elif should_fix:
check_warn("config.yaml not found and no example to copy from")
manual_issues.append("Create ~/.hermes/config.yaml manually")
manual_issues.append(f"Create {_DHH}/config.yaml manually")
else:
check_warn("config.yaml not found", "(using defaults)")
@ -294,28 +296,28 @@ def run_doctor(args):
hermes_home = HERMES_HOME
if hermes_home.exists():
check_ok("~/.hermes directory exists")
check_ok(f"{_DHH} directory exists")
else:
if should_fix:
hermes_home.mkdir(parents=True, exist_ok=True)
check_ok("Created ~/.hermes directory")
check_ok(f"Created {_DHH} directory")
fixed_count += 1
else:
check_warn("~/.hermes not found", "(will be created on first use)")
check_warn(f"{_DHH} not found", "(will be created on first use)")
# Check expected subdirectories
expected_subdirs = ["cron", "sessions", "logs", "skills", "memories"]
for subdir_name in expected_subdirs:
subdir_path = hermes_home / subdir_name
if subdir_path.exists():
check_ok(f"~/.hermes/{subdir_name}/ exists")
check_ok(f"{_DHH}/{subdir_name}/ exists")
else:
if should_fix:
subdir_path.mkdir(parents=True, exist_ok=True)
check_ok(f"Created ~/.hermes/{subdir_name}/")
check_ok(f"Created {_DHH}/{subdir_name}/")
fixed_count += 1
else:
check_warn(f"~/.hermes/{subdir_name}/ not found", "(will be created on first use)")
check_warn(f"{_DHH}/{subdir_name}/ not found", "(will be created on first use)")
# Check for SOUL.md persona file
soul_path = hermes_home / "SOUL.md"
@ -324,11 +326,11 @@ def run_doctor(args):
# Check if it's just the template comments (no real content)
lines = [l for l in content.splitlines() if l.strip() and not l.strip().startswith(("<!--", "-->", "#"))]
if lines:
check_ok("~/.hermes/SOUL.md exists (persona configured)")
check_ok(f"{_DHH}/SOUL.md exists (persona configured)")
else:
check_info("~/.hermes/SOUL.md exists but is empty — edit it to customize personality")
check_info(f"{_DHH}/SOUL.md exists but is empty — edit it to customize personality")
else:
check_warn("~/.hermes/SOUL.md not found", "(create it to give Hermes a custom personality)")
check_warn(f"{_DHH}/SOUL.md not found", "(create it to give Hermes a custom personality)")
if should_fix:
soul_path.parent.mkdir(parents=True, exist_ok=True)
soul_path.write_text(
@ -337,13 +339,13 @@ def run_doctor(args):
"You are Hermes, a helpful AI assistant.\n",
encoding="utf-8",
)
check_ok("Created ~/.hermes/SOUL.md with basic template")
check_ok(f"Created {_DHH}/SOUL.md with basic template")
fixed_count += 1
# Check memory directory
memories_dir = hermes_home / "memories"
if memories_dir.exists():
check_ok("~/.hermes/memories/ directory exists")
check_ok(f"{_DHH}/memories/ directory exists")
memory_file = memories_dir / "MEMORY.md"
user_file = memories_dir / "USER.md"
if memory_file.exists():
@ -357,10 +359,10 @@ def run_doctor(args):
else:
check_info("USER.md not created yet (will be created when the agent first writes a memory)")
else:
check_warn("~/.hermes/memories/ not found", "(will be created on first use)")
check_warn(f"{_DHH}/memories/ not found", "(will be created on first use)")
if should_fix:
memories_dir.mkdir(parents=True, exist_ok=True)
check_ok("Created ~/.hermes/memories/")
check_ok(f"Created {_DHH}/memories/")
fixed_count += 1
# Check SQLite session store
@ -372,11 +374,11 @@ def run_doctor(args):
cursor = conn.execute("SELECT COUNT(*) FROM sessions")
count = cursor.fetchone()[0]
conn.close()
check_ok(f"~/.hermes/state.db exists ({count} sessions)")
check_ok(f"{_DHH}/state.db exists ({count} sessions)")
except Exception as e:
check_warn(f"~/.hermes/state.db exists but has issues: {e}")
check_warn(f"{_DHH}/state.db exists but has issues: {e}")
else:
check_info("~/.hermes/state.db not created yet (will be created on first session)")
check_info(f"{_DHH}/state.db not created yet (will be created on first session)")
_check_gateway_service_linger(issues)
@ -404,8 +406,11 @@ def run_doctor(args):
if terminal_env == "docker":
if shutil.which("docker"):
# Check if docker daemon is running
result = subprocess.run(["docker", "info"], capture_output=True)
if result.returncode == 0:
try:
result = subprocess.run(["docker", "info"], capture_output=True, timeout=10)
except subprocess.TimeoutExpired:
result = None
if result is not None and result.returncode == 0:
check_ok("docker", "(daemon running)")
else:
check_fail("docker daemon not running")
@ -424,12 +429,16 @@ def run_doctor(args):
ssh_host = os.getenv("TERMINAL_SSH_HOST")
if ssh_host:
# Try to connect
result = subprocess.run(
["ssh", "-o", "ConnectTimeout=5", "-o", "BatchMode=yes", ssh_host, "echo ok"],
capture_output=True,
text=True
)
if result.returncode == 0:
try:
result = subprocess.run(
["ssh", "-o", "ConnectTimeout=5", "-o", "BatchMode=yes", ssh_host, "echo ok"],
capture_output=True,
text=True,
timeout=15
)
except subprocess.TimeoutExpired:
result = None
if result is not None and result.returncode == 0:
check_ok(f"SSH connection to {ssh_host}")
else:
check_fail(f"SSH connection to {ssh_host}")
@ -691,7 +700,7 @@ def run_doctor(args):
if github_token:
check_ok("GitHub token configured (authenticated API access)")
else:
check_warn("No GITHUB_TOKEN", "(60 req/hr rate limit — set in ~/.hermes/.env for better rates)")
check_warn("No GITHUB_TOKEN", f"(60 req/hr rate limit — set in {_DHH}/.env for better rates)")
# =========================================================================
# Honcho memory
@ -700,19 +709,19 @@ def run_doctor(args):
print(color("◆ Honcho Memory", Colors.CYAN, Colors.BOLD))
try:
from honcho_integration.client import HonchoClientConfig, resolve_config_path
from plugins.memory.honcho.client import HonchoClientConfig, resolve_config_path
hcfg = HonchoClientConfig.from_global_config()
_honcho_cfg_path = resolve_config_path()
if not _honcho_cfg_path.exists():
check_warn("Honcho config not found", "run: hermes honcho setup")
check_warn("Honcho config not found", "run: hermes memory setup")
elif not hcfg.enabled:
check_info(f"Honcho disabled (set enabled: true in {_honcho_cfg_path} to activate)")
elif not hcfg.api_key:
check_fail("Honcho API key not set", "run: hermes honcho setup")
issues.append("No Honcho API key — run 'hermes honcho setup'")
elif not (hcfg.api_key or hcfg.base_url):
check_fail("Honcho API key or base URL not set", "run: hermes memory setup")
issues.append("No Honcho API key — run 'hermes memory setup'")
else:
from honcho_integration.client import get_honcho_client, reset_honcho_client
from plugins.memory.honcho.client import get_honcho_client, reset_honcho_client
reset_honcho_client()
try:
get_honcho_client(hcfg)
@ -728,6 +737,53 @@ def run_doctor(args):
except Exception as _e:
check_warn("Honcho check failed", str(_e))
# =========================================================================
# Profiles
# =========================================================================
try:
from hermes_cli.profiles import list_profiles, _get_wrapper_dir, profile_exists
import re as _re
named_profiles = [p for p in list_profiles() if not p.is_default]
if named_profiles:
print()
print(color("◆ Profiles", Colors.CYAN, Colors.BOLD))
check_ok(f"{len(named_profiles)} profile(s) found")
wrapper_dir = _get_wrapper_dir()
for p in named_profiles:
parts = []
if p.gateway_running:
parts.append("gateway running")
if p.model:
parts.append(p.model[:30])
if not (p.path / "config.yaml").exists():
parts.append("⚠ missing config")
if not (p.path / ".env").exists():
parts.append("no .env")
wrapper = wrapper_dir / p.name
if not wrapper.exists():
parts.append("no alias")
status = ", ".join(parts) if parts else "configured"
check_ok(f" {p.name}: {status}")
# Check for orphan wrappers
if wrapper_dir.is_dir():
for wrapper in wrapper_dir.iterdir():
if not wrapper.is_file():
continue
try:
content = wrapper.read_text()
if "hermes -p" in content:
_m = _re.search(r"hermes -p (\S+)", content)
if _m and not profile_exists(_m.group(1)):
check_warn(f"Orphan alias: {wrapper.name} → profile '{_m.group(1)}' no longer exists")
except Exception:
pass
except ImportError:
pass
except Exception as _e:
logger.debug("Profile health check failed: %s", _e)
# =========================================================================
# Summary
# =========================================================================

View File

@ -15,6 +15,8 @@ from pathlib import Path
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
from hermes_cli.config import get_env_value, get_hermes_home, save_env_value, is_managed, managed_error
# display_hermes_home is imported lazily at call sites to avoid ImportError
# when hermes_constants is cached from a pre-update version during `hermes update`.
from hermes_cli.setup import (
print_header, print_info, print_success, print_warning, print_error,
prompt, prompt_choice, prompt_yes_no,
@ -125,20 +127,43 @@ _SERVICE_BASE = "hermes-gateway"
SERVICE_DESCRIPTION = "Hermes Agent Gateway - Messaging Platform Integration"
def _profile_suffix() -> str:
"""Derive a service-name suffix from the current HERMES_HOME.
Returns ``""`` for the default ``~/.hermes``, the profile name for
``~/.hermes/profiles/<name>``, or a short hash for any other custom
HERMES_HOME path.
"""
import hashlib
import re
from pathlib import Path as _Path
home = get_hermes_home().resolve()
default = (_Path.home() / ".hermes").resolve()
if home == default:
return ""
# Detect ~/.hermes/profiles/<name> pattern → use the profile name
profiles_root = (default / "profiles").resolve()
try:
rel = home.relative_to(profiles_root)
parts = rel.parts
if len(parts) == 1 and re.match(r"^[a-z0-9][a-z0-9_-]{0,63}$", parts[0]):
return parts[0]
except ValueError:
pass
# Fallback: short hash for arbitrary HERMES_HOME paths
return hashlib.sha256(str(home).encode()).hexdigest()[:8]
def get_service_name() -> str:
"""Derive a systemd service name scoped to this HERMES_HOME.
Default ``~/.hermes`` returns ``hermes-gateway`` (backward compatible).
Any other HERMES_HOME appends a short hash so multiple installations
can each have their own systemd service without conflicting.
Profile ``~/.hermes/profiles/coder`` returns ``hermes-gateway-coder``.
Any other HERMES_HOME appends a short hash for uniqueness.
"""
import hashlib
from pathlib import Path as _Path # local import to avoid monkeypatch interference
home = get_hermes_home().resolve()
default = (_Path.home() / ".hermes").resolve()
if home == default:
suffix = _profile_suffix()
if not suffix:
return _SERVICE_BASE
suffix = hashlib.sha256(str(home).encode()).hexdigest()[:8]
return f"{_SERVICE_BASE}-{suffix}"
@ -233,8 +258,11 @@ def _system_service_identity(run_as_user: str | None = None) -> tuple[str, str,
username = (run_as_user or os.getenv("SUDO_USER") or os.getenv("USER") or os.getenv("LOGNAME") or getpass.getuser()).strip()
if not username:
raise ValueError("Could not determine which user the gateway service should run as")
if username == "root" and not run_as_user:
raise ValueError("Refusing to install the gateway system service as root; pass --run-as-user root to override (e.g. in LXC containers)")
if username == "root":
raise ValueError("Refusing to install the gateway system service as root; pass --run-as USER")
print_warning("Installing gateway service to run as root.")
print_info(" This is fine for LXC/container environments but not recommended on bare-metal hosts.")
try:
user_info = pwd.getpwnam(username)
@ -296,9 +324,9 @@ def install_linux_gateway_from_setup(force: bool = False) -> tuple[str | None, b
while True:
run_as_user = prompt(" Run the system gateway service as which user?", default="")
run_as_user = (run_as_user or "").strip()
if run_as_user and run_as_user != "root":
if run_as_user:
break
print_error(" Enter a non-root username.")
print_error(" Enter a username.")
systemd_install(force=force, system=True, run_as_user=run_as_user)
return scope, True
@ -369,7 +397,14 @@ def print_systemd_linger_guidance() -> None:
print(" sudo loginctl enable-linger $USER")
def get_launchd_plist_path() -> Path:
return Path.home() / "Library" / "LaunchAgents" / "ai.hermes.gateway.plist"
"""Return the launchd plist path, scoped per profile.
Default ``~/.hermes`` → ``ai.hermes.gateway.plist`` (backward compatible).
Profile ``~/.hermes/profiles/coder`` → ``ai.hermes.gateway-coder.plist``.
"""
suffix = _profile_suffix()
name = f"ai.hermes.gateway-{suffix}" if suffix else "ai.hermes.gateway"
return Path.home() / "Library" / "LaunchAgents" / f"{name}.plist"
def _detect_venv_dir() -> Path | None:
"""Detect the active virtualenv directory.
@ -431,6 +466,32 @@ def _build_user_local_paths(home: Path, path_entries: list[str]) -> list[str]:
return [p for p in candidates if p not in path_entries and Path(p).exists()]
def _hermes_home_for_target_user(target_home_dir: str) -> str:
"""Remap the current HERMES_HOME to the equivalent under a target user's home.
When installing a system service via sudo, get_hermes_home() resolves to
root's home. This translates it to the target user's equivalent path:
/root/.hermes → /home/alice/.hermes
/root/.hermes/profiles/coder → /home/alice/.hermes/profiles/coder
/opt/custom-hermes → /opt/custom-hermes (kept as-is)
"""
current_hermes = get_hermes_home().resolve()
current_default = (Path.home() / ".hermes").resolve()
target_default = Path(target_home_dir) / ".hermes"
# Default ~/.hermes → remap to target user's default
if current_hermes == current_default:
return str(target_default)
# Profile or subdir of ~/.hermes → preserve the relative structure
try:
relative = current_hermes.relative_to(current_default)
return str(target_default / relative)
except ValueError:
# Completely custom path (not under ~/.hermes) — keep as-is
return str(current_hermes)
def generate_systemd_unit(system: bool = False, run_as_user: str | None = None) -> str:
python_path = get_python_path()
working_dir = str(PROJECT_ROOT)
@ -446,12 +507,11 @@ def generate_systemd_unit(system: bool = False, run_as_user: str | None = None)
if resolved_node_dir not in path_entries:
path_entries.append(resolved_node_dir)
hermes_home = str(get_hermes_home().resolve())
common_bin_paths = ["/usr/local/sbin", "/usr/local/bin", "/usr/sbin", "/usr/bin", "/sbin", "/bin"]
if system:
username, group_name, home_dir = _system_service_identity(run_as_user)
hermes_home = _hermes_home_for_target_user(home_dir)
path_entries.extend(_build_user_local_paths(Path(home_dir), path_entries))
path_entries.extend(common_bin_paths)
sane_path = ":".join(path_entries)
@ -486,6 +546,7 @@ StandardError=journal
WantedBy=multi-user.target
"""
hermes_home = str(get_hermes_home().resolve())
path_entries.extend(_build_user_local_paths(Path.home(), path_entries))
path_entries.extend(common_bin_paths)
sane_path = ":".join(path_entries)
@ -769,18 +830,46 @@ def systemd_status(deep: bool = False, system: bool = False):
# Launchd (macOS)
# =============================================================================
def get_launchd_label() -> str:
"""Return the launchd service label, scoped per profile."""
suffix = _profile_suffix()
return f"ai.hermes.gateway-{suffix}" if suffix else "ai.hermes.gateway"
def generate_launchd_plist() -> str:
python_path = get_python_path()
working_dir = str(PROJECT_ROOT)
hermes_home = str(get_hermes_home().resolve())
log_dir = get_hermes_home() / "logs"
log_dir.mkdir(parents=True, exist_ok=True)
label = get_launchd_label()
# Build a sane PATH for the launchd plist. launchd provides only a
# minimal default (/usr/bin:/bin:/usr/sbin:/sbin) which misses Homebrew,
# nvm, cargo, etc. We prepend venv/bin and node_modules/.bin (matching
# the systemd unit), then capture the user's full shell PATH so every
# user-installed tool (node, ffmpeg, …) is reachable.
detected_venv = _detect_venv_dir()
venv_bin = str(detected_venv / "bin") if detected_venv else str(PROJECT_ROOT / "venv" / "bin")
venv_dir = str(detected_venv) if detected_venv else str(PROJECT_ROOT / "venv")
node_bin = str(PROJECT_ROOT / "node_modules" / ".bin")
# Resolve the directory containing the node binary (e.g. Homebrew, nvm)
# so it's explicitly in PATH even if the user's shell PATH changes later.
priority_dirs = [venv_bin, node_bin]
resolved_node = shutil.which("node")
if resolved_node:
resolved_node_dir = str(Path(resolved_node).resolve().parent)
if resolved_node_dir not in priority_dirs:
priority_dirs.append(resolved_node_dir)
sane_path = ":".join(
dict.fromkeys(priority_dirs + [p for p in os.environ.get("PATH", "").split(":") if p])
)
return f"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>ai.hermes.gateway</string>
<string>{label}</string>
<key>ProgramArguments</key>
<array>
@ -795,6 +884,16 @@ def generate_launchd_plist() -> str:
<key>WorkingDirectory</key>
<string>{working_dir}</string>
<key>EnvironmentVariables</key>
<dict>
<key>PATH</key>
<string>{sane_path}</string>
<key>VIRTUAL_ENV</key>
<string>{venv_dir}</string>
<key>HERMES_HOME</key>
<string>{hermes_home}</string>
</dict>
<key>RunAtLoad</key>
<true/>
@ -867,7 +966,8 @@ def launchd_install(force: bool = False):
print()
print("Next steps:")
print(" hermes gateway status # Check status")
print(" tail -f ~/.hermes/logs/gateway.log # View logs")
from hermes_constants import display_hermes_home as _dhh
print(f" tail -f {_dhh()}/logs/gateway.log # View logs")
def launchd_uninstall():
plist_path = get_launchd_plist_path()
@ -880,20 +980,33 @@ def launchd_uninstall():
print("✓ Service uninstalled")
def launchd_start():
refresh_launchd_plist_if_needed()
plist_path = get_launchd_plist_path()
label = get_launchd_label()
# Self-heal if the plist is missing entirely (e.g., manual cleanup, failed upgrade)
if not plist_path.exists():
print("↻ launchd plist missing; regenerating service definition")
plist_path.parent.mkdir(parents=True, exist_ok=True)
plist_path.write_text(generate_launchd_plist(), encoding="utf-8")
subprocess.run(["launchctl", "load", str(plist_path)], check=True)
subprocess.run(["launchctl", "start", label], check=True)
print("✓ Service started")
return
refresh_launchd_plist_if_needed()
try:
subprocess.run(["launchctl", "start", "ai.hermes.gateway"], check=True)
subprocess.run(["launchctl", "start", label], check=True)
except subprocess.CalledProcessError as e:
if e.returncode != 3 or not plist_path.exists():
if e.returncode != 3:
raise
print("↻ launchd job was unloaded; reloading service definition")
subprocess.run(["launchctl", "load", str(plist_path)], check=True)
subprocess.run(["launchctl", "start", "ai.hermes.gateway"], check=True)
subprocess.run(["launchctl", "start", label], check=True)
print("✓ Service started")
def launchd_stop():
subprocess.run(["launchctl", "stop", "ai.hermes.gateway"], check=True)
label = get_launchd_label()
subprocess.run(["launchctl", "stop", label], check=True)
print("✓ Service stopped")
def _wait_for_gateway_exit(timeout: float = 10.0, force_after: float = 5.0):
@ -948,8 +1061,9 @@ def launchd_restart():
def launchd_status(deep: bool = False):
plist_path = get_launchd_plist_path()
label = get_launchd_label()
result = subprocess.run(
["launchctl", "list", "ai.hermes.gateway"],
["launchctl", "list", label],
capture_output=True,
text=True
)
@ -981,11 +1095,12 @@ def launchd_status(deep: bool = False):
# Gateway Runner
# =============================================================================
def run_gateway(verbose: bool = False, replace: bool = False):
def run_gateway(verbose: int = 0, quiet: bool = False, replace: bool = False):
"""Run the gateway in foreground.
Args:
verbose: Enable verbose logging output.
verbose: Stderr log verbosity count added on top of default WARNING (0=WARNING, 1=INFO, 2+=DEBUG).
quiet: Suppress all stderr log output.
replace: If True, kill any existing gateway instance before starting.
This prevents systemd restart loops when the old process
hasn't fully exited yet.
@ -1004,7 +1119,8 @@ def run_gateway(verbose: bool = False, replace: bool = False):
# Exit with code 1 if gateway fails to connect any platform,
# so systemd Restart=on-failure will retry on transient errors
success = asyncio.run(start_gateway(replace=replace))
verbosity = None if quiet else verbose
success = asyncio.run(start_gateway(replace=replace, verbosity=verbosity))
if not success:
sys.exit(1)
@ -1237,6 +1353,59 @@ _PLATFORMS = [
"help": "The AppSecret from your DingTalk application credentials."},
],
},
{
"key": "feishu",
"label": "Feishu / Lark",
"emoji": "🪽",
"token_var": "FEISHU_APP_ID",
"setup_instructions": [
"1. Go to https://open.feishu.cn/ (or https://open.larksuite.com/ for Lark)",
"2. Create an app and copy the App ID and App Secret",
"3. Enable the Bot capability for the app",
"4. Choose WebSocket (recommended) or Webhook connection mode",
"5. Add the bot to a group chat or message it directly",
"6. Restrict access with FEISHU_ALLOWED_USERS for production use",
],
"vars": [
{"name": "FEISHU_APP_ID", "prompt": "App ID", "password": False,
"help": "The App ID from your Feishu/Lark application."},
{"name": "FEISHU_APP_SECRET", "prompt": "App Secret", "password": True,
"help": "The App Secret from your Feishu/Lark application."},
{"name": "FEISHU_DOMAIN", "prompt": "Domain — feishu or lark (default: feishu)", "password": False,
"help": "Use 'feishu' for Feishu China, or 'lark' for Lark international."},
{"name": "FEISHU_CONNECTION_MODE", "prompt": "Connection mode — websocket or webhook (default: websocket)", "password": False,
"help": "websocket is recommended unless you specifically need webhook mode."},
{"name": "FEISHU_ALLOWED_USERS", "prompt": "Allowed user IDs (comma-separated, or empty)", "password": False,
"is_allowlist": True,
"help": "Restrict which Feishu/Lark users can interact with the bot."},
{"name": "FEISHU_HOME_CHANNEL", "prompt": "Home chat ID (optional, for cron/notifications)", "password": False,
"help": "Chat ID for scheduled results and notifications."},
],
},
{
"key": "wecom",
"label": "WeCom (Enterprise WeChat)",
"emoji": "💬",
"token_var": "WECOM_BOT_ID",
"setup_instructions": [
"1. Go to WeCom Admin Console → Applications → Create AI Bot",
"2. Copy the Bot ID and Secret from the bot's credentials page",
"3. The bot connects via WebSocket — no public endpoint needed",
"4. Add the bot to a group chat or message it directly in WeCom",
"5. Restrict access with WECOM_ALLOWED_USERS for production use",
],
"vars": [
{"name": "WECOM_BOT_ID", "prompt": "Bot ID", "password": False,
"help": "The Bot ID from your WeCom AI Bot."},
{"name": "WECOM_SECRET", "prompt": "Secret", "password": True,
"help": "The secret from your WeCom AI Bot."},
{"name": "WECOM_ALLOWED_USERS", "prompt": "Allowed user IDs (comma-separated, or empty)", "password": False,
"is_allowlist": True,
"help": "Restrict which WeCom users can interact with the bot."},
{"name": "WECOM_HOME_CHANNEL", "prompt": "Home chat ID (optional, for cron/notifications)", "password": False,
"help": "Chat ID for scheduled results and notifications."},
],
},
]
@ -1454,7 +1623,7 @@ def _is_service_running() -> bool:
return False
elif is_macos() and get_launchd_plist_path().exists():
result = subprocess.run(
["launchctl", "list", "ai.hermes.gateway"],
["launchctl", "list", get_launchd_label()],
capture_output=True, text=True
)
return result.returncode == 0
@ -1725,9 +1894,10 @@ def gateway_command(args):
# Default to run if no subcommand
if subcmd is None or subcmd == "run":
verbose = getattr(args, 'verbose', False)
verbose = getattr(args, 'verbose', 0)
quiet = getattr(args, 'quiet', False)
replace = getattr(args, 'replace', False)
run_gateway(verbose, replace=replace)
run_gateway(verbose, quiet=quiet, replace=replace)
return
if subcmd == "setup":
@ -1855,7 +2025,7 @@ def gateway_command(args):
# Start fresh
print("Starting gateway...")
run_gateway(verbose=False)
run_gateway(verbose=0)
elif subcmd == "status":
deep = getattr(args, 'deep', False)

File diff suppressed because it is too large Load Diff

View File

@ -24,6 +24,7 @@ from hermes_cli.config import (
get_hermes_home, # noqa: F401 — used by test mocks
)
from hermes_cli.colors import Colors, color
from hermes_constants import display_hermes_home
logger = logging.getLogger(__name__)
@ -244,7 +245,7 @@ def cmd_mcp_add(args):
api_key = _prompt("API key / Bearer token", password=True)
if api_key:
save_env_value(env_key, api_key)
_success(f"Saved to ~/.hermes/.env as {env_key}")
_success(f"Saved to {display_hermes_home()}/.env as {env_key}")
# Set header with env var interpolation
if api_key or existing_key:
@ -332,7 +333,7 @@ def cmd_mcp_add(args):
_save_mcp_server(name, server_config)
print()
_success(f"Saved '{name}' to ~/.hermes/config.yaml ({tool_count}/{total} tools enabled)")
_success(f"Saved '{name}' to {display_hermes_home()}/config.yaml ({tool_count}/{total} tools enabled)")
_info("Start a new session to use these tools.")
@ -510,6 +511,10 @@ def _interpolate_value(value: str) -> str:
def cmd_mcp_configure(args):
"""Reconfigure which tools are enabled for an existing MCP server."""
import sys as _sys
if not _sys.stdin.isatty():
print("Error: 'hermes mcp configure' requires an interactive terminal.", file=_sys.stderr)
_sys.exit(1)
name = args.name
servers = _get_mcp_servers()
@ -607,6 +612,11 @@ def mcp_command(args):
"""Main dispatcher for ``hermes mcp`` subcommands."""
action = getattr(args, "mcp_action", None)
if action == "serve":
from mcp_serve import run_mcp_server
run_mcp_server(verbose=getattr(args, "verbose", False))
return
handlers = {
"add": cmd_mcp_add,
"remove": cmd_mcp_remove,
@ -625,6 +635,7 @@ def mcp_command(args):
# No subcommand — show list
cmd_mcp_list()
print(color(" Commands:", Colors.CYAN))
_info("hermes mcp serve Run as MCP server")
_info("hermes mcp add <name> --url <endpoint> Add an MCP server")
_info("hermes mcp add <name> --command <cmd> Add a stdio server")
_info("hermes mcp remove <name> Remove a server")

451
hermes_cli/memory_setup.py Normal file
View File

@ -0,0 +1,451 @@
"""hermes memory setup|status — configure memory provider plugins.
Auto-detects installed memory providers via the plugin system.
Interactive curses-based UI for provider selection, then walks through
the provider's config schema. Writes config to config.yaml + .env.
"""
from __future__ import annotations
import getpass
import os
import sys
from pathlib import Path
# ---------------------------------------------------------------------------
# Curses-based interactive picker (same pattern as hermes tools)
# ---------------------------------------------------------------------------
def _curses_select(title: str, items: list[tuple[str, str]], default: int = 0) -> int:
"""Interactive single-select with arrow keys.
items: list of (label, description) tuples.
Returns selected index, or default on escape/quit.
"""
try:
import curses
result = [default]
def _menu(stdscr):
curses.curs_set(0)
if curses.has_colors():
curses.start_color()
curses.use_default_colors()
curses.init_pair(1, curses.COLOR_GREEN, -1)
curses.init_pair(2, curses.COLOR_YELLOW, -1)
curses.init_pair(3, curses.COLOR_CYAN, -1)
cursor = default
while True:
stdscr.clear()
max_y, max_x = stdscr.getmaxyx()
# Title
try:
stdscr.addnstr(0, 0, title, max_x - 1,
curses.A_BOLD | (curses.color_pair(2) if curses.has_colors() else 0))
stdscr.addnstr(1, 0, " ↑↓ navigate ⏎ select q quit", max_x - 1,
curses.color_pair(3) if curses.has_colors() else curses.A_DIM)
except curses.error:
pass
for i, (label, desc) in enumerate(items):
y = i + 3
if y >= max_y - 1:
break
arrow = "" if i == cursor else " "
line = f" {arrow} {label}"
if desc:
line += f" {desc}"
attr = curses.A_NORMAL
if i == cursor:
attr = curses.A_BOLD
if curses.has_colors():
attr |= curses.color_pair(1)
try:
stdscr.addnstr(y, 0, line[:max_x - 1], max_x - 1, attr)
except curses.error:
pass
stdscr.refresh()
key = stdscr.getch()
if key in (curses.KEY_UP, ord('k')):
cursor = (cursor - 1) % len(items)
elif key in (curses.KEY_DOWN, ord('j')):
cursor = (cursor + 1) % len(items)
elif key in (curses.KEY_ENTER, 10, 13):
result[0] = cursor
return
elif key in (27, ord('q')):
return
curses.wrapper(_menu)
return result[0]
except Exception:
# Fallback: numbered input
print(f"\n {title}\n")
for i, (label, desc) in enumerate(items):
marker = "" if i == default else " "
d = f" {desc}" if desc else ""
print(f" {marker} {i + 1}. {label}{d}")
while True:
try:
val = input(f"\n Select [1-{len(items)}] ({default + 1}): ")
if not val:
return default
idx = int(val) - 1
if 0 <= idx < len(items):
return idx
except (ValueError, EOFError):
return default
def _prompt(label: str, default: str | None = None, secret: bool = False) -> str:
"""Prompt for a value with optional default and secret masking."""
suffix = f" [{default}]" if default else ""
if secret:
sys.stdout.write(f" {label}{suffix}: ")
sys.stdout.flush()
if sys.stdin.isatty():
val = getpass.getpass(prompt="")
else:
val = sys.stdin.readline().strip()
else:
sys.stdout.write(f" {label}{suffix}: ")
sys.stdout.flush()
val = sys.stdin.readline().strip()
return val or (default or "")
# ---------------------------------------------------------------------------
# Provider discovery
# ---------------------------------------------------------------------------
def _install_dependencies(provider_name: str) -> None:
"""Install pip dependencies declared in plugin.yaml."""
import subprocess
from pathlib import Path as _Path
plugin_dir = _Path(__file__).parent.parent / "plugins" / "memory" / provider_name
yaml_path = plugin_dir / "plugin.yaml"
if not yaml_path.exists():
return
try:
import yaml
with open(yaml_path) as f:
meta = yaml.safe_load(f) or {}
except Exception:
return
pip_deps = meta.get("pip_dependencies", [])
if not pip_deps:
return
# pip name → import name mapping for packages where they differ
_IMPORT_NAMES = {
"honcho-ai": "honcho",
"mem0ai": "mem0",
"hindsight-client": "hindsight_client",
}
# Check which packages are missing
missing = []
for dep in pip_deps:
import_name = _IMPORT_NAMES.get(dep, dep.replace("-", "_").split("[")[0])
try:
__import__(import_name)
except ImportError:
missing.append(dep)
if not missing:
return
print(f"\n Installing dependencies: {', '.join(missing)}")
try:
subprocess.run(
[sys.executable, "-m", "pip", "install", "--quiet"] + missing,
check=True, timeout=120,
capture_output=True,
)
print(f" ✓ Installed {', '.join(missing)}")
except subprocess.CalledProcessError as e:
print(f" ⚠ Failed to install {', '.join(missing)}")
stderr = (e.stderr or b"").decode()[:200]
if stderr:
print(f" {stderr}")
print(f" Run manually: pip install {' '.join(missing)}")
except Exception as e:
print(f" ⚠ Install failed: {e}")
print(f" Run manually: pip install {' '.join(missing)}")
# Also show external dependencies (non-pip) if any
ext_deps = meta.get("external_dependencies", [])
for dep in ext_deps:
dep_name = dep.get("name", "")
check_cmd = dep.get("check", "")
install_cmd = dep.get("install", "")
if check_cmd:
try:
subprocess.run(
check_cmd, shell=True, capture_output=True, timeout=5
)
except Exception:
if install_cmd:
print(f"\n'{dep_name}' not found. Install with:")
print(f" {install_cmd}")
def _get_available_providers() -> list:
"""Discover memory providers from plugins/memory/.
Returns list of (name, description, provider_instance) tuples.
"""
try:
from plugins.memory import discover_memory_providers, load_memory_provider
raw = discover_memory_providers()
except Exception:
raw = []
results = []
for name, desc, available in raw:
try:
provider = load_memory_provider(name)
if not provider:
continue
except Exception:
continue
# Override description with setup hint
schema = provider.get_config_schema() if hasattr(provider, "get_config_schema") else []
has_secrets = any(f.get("secret") for f in schema)
if has_secrets:
setup_hint = "requires API key"
elif not schema:
setup_hint = "no setup needed"
else:
setup_hint = "local"
results.append((name, setup_hint, provider))
return results
# ---------------------------------------------------------------------------
# Setup wizard
# ---------------------------------------------------------------------------
def cmd_setup(args) -> None:
"""Interactive memory provider setup wizard."""
from hermes_cli.config import load_config, save_config
providers = _get_available_providers()
if not providers:
print("\n No memory provider plugins detected.")
print(" Install a plugin to ~/.hermes/plugins/ and try again.\n")
return
# Build picker items
items = []
for name, desc, _ in providers:
items.append((name, f"{desc}"))
items.append(("Built-in only", "— MEMORY.md / USER.md (default)"))
builtin_idx = len(items) - 1
selected = _curses_select("Memory provider setup", items, default=builtin_idx)
config = load_config()
if not isinstance(config.get("memory"), dict):
config["memory"] = {}
# Built-in only
if selected >= len(providers) or selected < 0:
config["memory"]["provider"] = ""
save_config(config)
print("\n ✓ Memory provider: built-in only")
print(" Saved to config.yaml\n")
return
name, _, provider = providers[selected]
# Install pip dependencies if declared in plugin.yaml
_install_dependencies(name)
schema = provider.get_config_schema() if hasattr(provider, "get_config_schema") else []
# Provider config section
provider_config = config["memory"].get(name, {})
if not isinstance(provider_config, dict):
provider_config = {}
env_path = Path(os.environ.get("HERMES_HOME", os.path.expanduser("~/.hermes"))) / ".env"
env_writes = {}
if schema:
print(f"\n Configuring {name}:\n")
for field in schema:
key = field["key"]
desc = field.get("description", key)
default = field.get("default")
is_secret = field.get("secret", False)
choices = field.get("choices")
env_var = field.get("env_var")
url = field.get("url")
if choices and not is_secret:
# Use curses picker for choice fields
choice_items = [(c, "") for c in choices]
current = provider_config.get(key, default)
current_idx = 0
if current and current in choices:
current_idx = choices.index(current)
sel = _curses_select(f" {desc}", choice_items, default=current_idx)
provider_config[key] = choices[sel]
elif is_secret:
# Prompt for secret
existing = os.environ.get(env_var, "") if env_var else ""
if existing:
masked = f"...{existing[-4:]}" if len(existing) > 4 else "set"
val = _prompt(f"{desc} (current: {masked}, blank to keep)", secret=True)
else:
hint = f" Get yours at {url}" if url else ""
if hint:
print(hint)
val = _prompt(desc, secret=True)
if val and env_var:
env_writes[env_var] = val
else:
# Regular text prompt
current = provider_config.get(key)
effective_default = current or default
val = _prompt(desc, default=str(effective_default) if effective_default else None)
if val:
provider_config[key] = val
# Write activation key to config.yaml
config["memory"]["provider"] = name
save_config(config)
# Write non-secret config to provider's native location
hermes_home = str(Path(os.environ.get("HERMES_HOME", os.path.expanduser("~/.hermes"))))
if provider_config and hasattr(provider, "save_config"):
try:
provider.save_config(provider_config, hermes_home)
except Exception as e:
print(f" ⚠ Failed to write provider config: {e}")
# Write secrets to .env
if env_writes:
_write_env_vars(env_path, env_writes)
print(f"\n ✓ Memory provider: {name}")
print(f" ✓ Activation saved to config.yaml")
if provider_config:
print(f" ✓ Provider config saved")
if env_writes:
print(f" ✓ API keys saved to .env")
print(f"\n Start a new session to activate.\n")
def _write_env_vars(env_path: Path, env_writes: dict) -> None:
"""Append or update env vars in .env file."""
env_path.parent.mkdir(parents=True, exist_ok=True)
existing_lines = []
if env_path.exists():
existing_lines = env_path.read_text().splitlines()
updated_keys = set()
new_lines = []
for line in existing_lines:
key_match = line.split("=", 1)[0].strip() if "=" in line else ""
if key_match in env_writes:
new_lines.append(f"{key_match}={env_writes[key_match]}")
updated_keys.add(key_match)
else:
new_lines.append(line)
for key, val in env_writes.items():
if key not in updated_keys:
new_lines.append(f"{key}={val}")
env_path.write_text("\n".join(new_lines) + "\n")
# ---------------------------------------------------------------------------
# Status
# ---------------------------------------------------------------------------
def cmd_status(args) -> None:
"""Show current memory provider config."""
from hermes_cli.config import load_config
config = load_config()
mem_config = config.get("memory", {})
provider_name = mem_config.get("provider", "")
print(f"\nMemory status\n" + "" * 40)
print(f" Built-in: always active")
print(f" Provider: {provider_name or '(none — built-in only)'}")
if provider_name:
provider_config = mem_config.get(provider_name, {})
if provider_config:
print(f"\n {provider_name} config:")
for key, val in provider_config.items():
print(f" {key}: {val}")
providers = _get_available_providers()
found = any(name == provider_name for name, _, _ in providers)
if found:
print(f"\n Plugin: installed ✓")
for pname, _, p in providers:
if pname == provider_name:
if p.is_available():
print(f" Status: available ✓")
else:
print(f" Status: not available ✗")
schema = p.get_config_schema() if hasattr(p, "get_config_schema") else []
secrets = [f for f in schema if f.get("secret")]
if secrets:
print(f" Missing:")
for s in secrets:
env_var = s.get("env_var", "")
url = s.get("url", "")
is_set = bool(os.environ.get(env_var))
mark = "" if is_set else ""
line = f" {mark} {env_var}"
if url and not is_set:
line += f"{url}"
print(line)
break
else:
print(f"\n Plugin: NOT installed ✗")
print(f" Install the '{provider_name}' memory plugin to ~/.hermes/plugins/")
providers = _get_available_providers()
if providers:
print(f"\n Installed plugins:")
for pname, desc, _ in providers:
active = " ← active" if pname == provider_name else ""
print(f"{pname} ({desc}){active}")
print()
# ---------------------------------------------------------------------------
# Router
# ---------------------------------------------------------------------------
def memory_command(args) -> None:
"""Route memory subcommands."""
sub = getattr(args, "memory_command", None)
if sub == "setup":
cmd_setup(args)
elif sub == "status":
cmd_status(args)
else:
cmd_status(args)

View File

@ -26,6 +26,7 @@ class ModelSwitchResult:
provider_changed: bool = False
api_key: str = ""
base_url: str = ""
api_mode: str = ""
persist: bool = False
error_message: str = ""
warning_message: str = ""
@ -73,6 +74,7 @@ def switch_model(
detect_provider_for_model,
validate_requested_model,
_PROVIDER_LABELS,
opencode_model_api_mode,
)
from hermes_cli.runtime_provider import resolve_runtime_provider
@ -98,11 +100,13 @@ def switch_model(
# Step 4: Resolve credentials for target provider
api_key = current_api_key
base_url = current_base_url
api_mode = ""
if provider_changed:
try:
runtime = resolve_runtime_provider(requested=target_provider)
api_key = runtime.get("api_key", "")
base_url = runtime.get("base_url", "")
api_mode = runtime.get("api_mode", "")
except Exception as e:
provider_label = _PROVIDER_LABELS.get(target_provider, target_provider)
if target_provider == "custom":
@ -130,6 +134,7 @@ def switch_model(
runtime = resolve_runtime_provider(requested=current_provider)
api_key = runtime.get("api_key", "")
base_url = runtime.get("base_url", "")
api_mode = runtime.get("api_mode", "")
except Exception:
pass
@ -166,6 +171,12 @@ def switch_model(
and ("localhost" in (base_url or "") or "127.0.0.1" in (base_url or ""))
)
if target_provider in {"opencode-zen", "opencode-go"}:
# Recompute against the requested new model, not the currently-configured
# model used during runtime resolution. OpenCode mixes API surfaces by
# model family, so a same-provider model switch can change api_mode.
api_mode = opencode_model_api_mode(target_provider, new_model)
return ModelSwitchResult(
success=True,
new_model=new_model,
@ -173,6 +184,7 @@ def switch_model(
provider_changed=provider_changed,
api_key=api_key,
base_url=base_url,
api_mode=api_mode,
persist=bool(validation.get("persist")),
warning_message=validation.get("message") or "",
is_custom_target=is_custom_target,

View File

@ -27,6 +27,8 @@ GITHUB_MODELS_CATALOG_URL = COPILOT_MODELS_URL
# (model_id, display description shown in menus)
OPENROUTER_MODELS: list[tuple[str, str]] = [
("anthropic/claude-opus-4.6", "recommended"),
("anthropic/claude-sonnet-4.6", ""),
("qwen/qwen3.6-plus:free", "free"),
("anthropic/claude-sonnet-4.5", ""),
("anthropic/claude-haiku-4.5", ""),
("openai/gpt-5.4", ""),
@ -35,6 +37,8 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
("openai/gpt-5.3-codex", ""),
("google/gemini-3-pro-preview", ""),
("google/gemini-3-flash-preview", ""),
("google/gemini-3.1-pro-preview", ""),
("google/gemini-3.1-flash-lite-preview", ""),
("qwen/qwen3.5-plus-02-15", ""),
("qwen/qwen3.5-35b-a3b", ""),
("stepfun/step-3.5-flash", ""),
@ -54,6 +58,8 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
_PROVIDER_MODELS: dict[str, list[str]] = {
"nous": [
"anthropic/claude-opus-4.6",
"anthropic/claude-sonnet-4.6",
"qwen/qwen3.6-plus:free",
"anthropic/claude-sonnet-4.5",
"anthropic/claude-haiku-4.5",
"openai/gpt-5.4",
@ -62,6 +68,8 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"openai/gpt-5.3-codex",
"google/gemini-3-pro-preview",
"google/gemini-3-flash-preview",
"google/gemini-3.1-pro-preview",
"google/gemini-3.1-flash-lite-preview",
"qwen/qwen3.5-plus-02-15",
"qwen/qwen3.5-35b-a3b",
"stepfun/step-3.5-flash",
@ -117,6 +125,12 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"kimi-k2-turbo-preview",
"kimi-k2-0905-preview",
],
"moonshot": [
"kimi-k2.5",
"kimi-k2-thinking",
"kimi-k2-turbo-preview",
"kimi-k2-0905-preview",
],
"minimax": [
"MiniMax-M2.7",
"MiniMax-M2.7-highspeed",
@ -185,7 +199,7 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"opencode-go": [
"glm-5",
"kimi-k2.5",
"minimax-m2.5",
"minimax-m2.7",
],
"ai-gateway": [
"anthropic/claude-opus-4.6",
@ -343,7 +357,7 @@ def list_available_providers() -> list[dict[str, str]]:
try:
from hermes_cli.auth import get_auth_status, has_usable_secret
if pid == "custom":
custom_base_url = _get_custom_base_url() or os.getenv("OPENAI_BASE_URL", "")
custom_base_url = _get_custom_base_url() or ""
has_creds = bool(custom_base_url.strip())
elif pid == "openrouter":
has_creds = has_usable_secret(os.getenv("OPENROUTER_API_KEY", ""))
@ -940,6 +954,53 @@ def copilot_model_api_mode(
return "chat_completions"
def normalize_opencode_model_id(provider_id: Optional[str], model_id: Optional[str]) -> str:
"""Normalize OpenCode config IDs to the bare model slug used in API requests."""
provider = normalize_provider(provider_id)
current = str(model_id or "").strip()
if not current or provider not in {"opencode-zen", "opencode-go"}:
return current
prefix = f"{provider}/"
if current.lower().startswith(prefix):
return current[len(prefix):]
return current
def opencode_model_api_mode(provider_id: Optional[str], model_id: Optional[str]) -> str:
"""Determine the API mode for an OpenCode Zen / Go model.
OpenCode routes different models behind different API surfaces:
- GPT-5 / Codex models on Zen use ``/v1/responses``
- Claude models on Zen use ``/v1/messages``
- MiniMax models on Go use ``/v1/messages``
- GLM / Kimi on Go use ``/v1/chat/completions``
- Other Zen models (Gemini, GLM, Kimi, MiniMax, Qwen, etc.) use
``/v1/chat/completions``
This follows the published OpenCode docs for Zen and Go endpoints.
"""
provider = normalize_provider(provider_id)
normalized = normalize_opencode_model_id(provider_id, model_id).lower()
if not normalized:
return "chat_completions"
if provider == "opencode-go":
if normalized.startswith("minimax-"):
return "anthropic_messages"
return "chat_completions"
if provider == "opencode-zen":
if normalized.startswith("claude-"):
return "anthropic_messages"
if normalized.startswith("gpt-"):
return "codex_responses"
return "chat_completions"
return "chat_completions"
def github_model_reasoning_efforts(
model_id: Optional[str],
*,

View File

@ -0,0 +1,517 @@
"""Helpers for Nous subscription managed-tool capabilities."""
from __future__ import annotations
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Iterable, Optional, Set
from hermes_cli.auth import get_nous_auth_status
from hermes_cli.config import get_env_value, load_config
from tools.managed_tool_gateway import is_managed_tool_gateway_ready
from tools.tool_backend_helpers import (
has_direct_modal_credentials,
managed_nous_tools_enabled,
normalize_browser_cloud_provider,
normalize_modal_mode,
resolve_modal_backend_state,
resolve_openai_audio_api_key,
)
_DEFAULT_PLATFORM_TOOLSETS = {
"cli": "hermes-cli",
}
@dataclass(frozen=True)
class NousFeatureState:
key: str
label: str
included_by_default: bool
available: bool
active: bool
managed_by_nous: bool
direct_override: bool
toolset_enabled: bool
current_provider: str = ""
explicit_configured: bool = False
@dataclass(frozen=True)
class NousSubscriptionFeatures:
subscribed: bool
nous_auth_present: bool
provider_is_nous: bool
features: Dict[str, NousFeatureState]
@property
def web(self) -> NousFeatureState:
return self.features["web"]
@property
def image_gen(self) -> NousFeatureState:
return self.features["image_gen"]
@property
def tts(self) -> NousFeatureState:
return self.features["tts"]
@property
def browser(self) -> NousFeatureState:
return self.features["browser"]
@property
def modal(self) -> NousFeatureState:
return self.features["modal"]
def items(self) -> Iterable[NousFeatureState]:
ordered = ("web", "image_gen", "tts", "browser", "modal")
for key in ordered:
yield self.features[key]
def _model_config_dict(config: Dict[str, object]) -> Dict[str, object]:
model_cfg = config.get("model")
if isinstance(model_cfg, dict):
return dict(model_cfg)
if isinstance(model_cfg, str) and model_cfg.strip():
return {"default": model_cfg.strip()}
return {}
def _toolset_enabled(config: Dict[str, object], toolset_key: str) -> bool:
from toolsets import resolve_toolset
platform_toolsets = config.get("platform_toolsets")
if not isinstance(platform_toolsets, dict) or not platform_toolsets:
platform_toolsets = {"cli": [_DEFAULT_PLATFORM_TOOLSETS["cli"]]}
target_tools = set(resolve_toolset(toolset_key))
if not target_tools:
return False
for platform, raw_toolsets in platform_toolsets.items():
if isinstance(raw_toolsets, list):
toolset_names = list(raw_toolsets)
else:
default_toolset = _DEFAULT_PLATFORM_TOOLSETS.get(platform)
toolset_names = [default_toolset] if default_toolset else []
if not toolset_names:
default_toolset = _DEFAULT_PLATFORM_TOOLSETS.get(platform)
if default_toolset:
toolset_names = [default_toolset]
available_tools: Set[str] = set()
for toolset_name in toolset_names:
if not isinstance(toolset_name, str) or not toolset_name:
continue
try:
available_tools.update(resolve_toolset(toolset_name))
except Exception:
continue
if target_tools and target_tools.issubset(available_tools):
return True
return False
def _has_agent_browser() -> bool:
import shutil
agent_browser_bin = shutil.which("agent-browser")
local_bin = (
Path(__file__).parent.parent / "node_modules" / ".bin" / "agent-browser"
)
return bool(agent_browser_bin or local_bin.exists())
def _browser_label(current_provider: str) -> str:
mapping = {
"browserbase": "Browserbase",
"browser-use": "Browser Use",
"camofox": "Camofox",
"local": "Local browser",
}
return mapping.get(current_provider or "local", current_provider or "Local browser")
def _tts_label(current_provider: str) -> str:
mapping = {
"openai": "OpenAI TTS",
"elevenlabs": "ElevenLabs",
"edge": "Edge TTS",
"neutts": "NeuTTS",
}
return mapping.get(current_provider or "edge", current_provider or "Edge TTS")
def _resolve_browser_feature_state(
*,
browser_tool_enabled: bool,
browser_provider: str,
browser_provider_explicit: bool,
browser_local_available: bool,
direct_camofox: bool,
direct_browserbase: bool,
direct_browser_use: bool,
managed_browser_available: bool,
) -> tuple[str, bool, bool, bool]:
"""Resolve browser availability using the same precedence as runtime."""
if direct_camofox:
return "camofox", True, bool(browser_tool_enabled), False
if browser_provider_explicit:
current_provider = browser_provider or "local"
if current_provider == "browserbase":
provider_available = managed_browser_available or direct_browserbase
available = bool(browser_local_available and provider_available)
managed = bool(
browser_tool_enabled
and browser_local_available
and managed_browser_available
and not direct_browserbase
)
active = bool(browser_tool_enabled and available)
return current_provider, available, active, managed
if current_provider == "browser-use":
available = bool(browser_local_available and direct_browser_use)
active = bool(browser_tool_enabled and available)
return current_provider, available, active, False
if current_provider == "camofox":
return current_provider, False, False, False
current_provider = "local"
available = bool(browser_local_available)
active = bool(browser_tool_enabled and available)
return current_provider, available, active, False
if managed_browser_available or direct_browserbase:
available = bool(browser_local_available)
managed = bool(
browser_tool_enabled
and browser_local_available
and managed_browser_available
and not direct_browserbase
)
active = bool(browser_tool_enabled and available)
return "browserbase", available, active, managed
available = bool(browser_local_available)
active = bool(browser_tool_enabled and available)
return "local", available, active, False
def get_nous_subscription_features(
config: Optional[Dict[str, object]] = None,
) -> NousSubscriptionFeatures:
if config is None:
config = load_config() or {}
config = dict(config)
model_cfg = _model_config_dict(config)
provider_is_nous = str(model_cfg.get("provider") or "").strip().lower() == "nous"
try:
nous_status = get_nous_auth_status()
except Exception:
nous_status = {}
managed_tools_flag = managed_nous_tools_enabled()
nous_auth_present = bool(nous_status.get("logged_in"))
subscribed = provider_is_nous or nous_auth_present
web_tool_enabled = _toolset_enabled(config, "web")
image_tool_enabled = _toolset_enabled(config, "image_gen")
tts_tool_enabled = _toolset_enabled(config, "tts")
browser_tool_enabled = _toolset_enabled(config, "browser")
modal_tool_enabled = _toolset_enabled(config, "terminal")
web_cfg = config.get("web") if isinstance(config.get("web"), dict) else {}
tts_cfg = config.get("tts") if isinstance(config.get("tts"), dict) else {}
browser_cfg = config.get("browser") if isinstance(config.get("browser"), dict) else {}
terminal_cfg = config.get("terminal") if isinstance(config.get("terminal"), dict) else {}
web_backend = str(web_cfg.get("backend") or "").strip().lower()
tts_provider = str(tts_cfg.get("provider") or "edge").strip().lower()
browser_provider_explicit = "cloud_provider" in browser_cfg
browser_provider = normalize_browser_cloud_provider(
browser_cfg.get("cloud_provider") if browser_provider_explicit else None
)
terminal_backend = (
str(terminal_cfg.get("backend") or "local").strip().lower()
)
modal_mode = normalize_modal_mode(
terminal_cfg.get("modal_mode")
)
direct_exa = bool(get_env_value("EXA_API_KEY"))
direct_firecrawl = bool(get_env_value("FIRECRAWL_API_KEY") or get_env_value("FIRECRAWL_API_URL"))
direct_parallel = bool(get_env_value("PARALLEL_API_KEY"))
direct_tavily = bool(get_env_value("TAVILY_API_KEY"))
direct_fal = bool(get_env_value("FAL_KEY"))
direct_openai_tts = bool(resolve_openai_audio_api_key())
direct_elevenlabs = bool(get_env_value("ELEVENLABS_API_KEY"))
direct_camofox = bool(get_env_value("CAMOFOX_URL"))
direct_browserbase = bool(get_env_value("BROWSERBASE_API_KEY") and get_env_value("BROWSERBASE_PROJECT_ID"))
direct_browser_use = bool(get_env_value("BROWSER_USE_API_KEY"))
direct_modal = has_direct_modal_credentials()
managed_web_available = managed_tools_flag and nous_auth_present and is_managed_tool_gateway_ready("firecrawl")
managed_image_available = managed_tools_flag and nous_auth_present and is_managed_tool_gateway_ready("fal-queue")
managed_tts_available = managed_tools_flag and nous_auth_present and is_managed_tool_gateway_ready("openai-audio")
managed_browser_available = managed_tools_flag and nous_auth_present and is_managed_tool_gateway_ready("browserbase")
managed_modal_available = managed_tools_flag and nous_auth_present and is_managed_tool_gateway_ready("modal")
modal_state = resolve_modal_backend_state(
modal_mode,
has_direct=direct_modal,
managed_ready=managed_modal_available,
)
web_managed = web_backend == "firecrawl" and managed_web_available and not direct_firecrawl
web_active = bool(
web_tool_enabled
and (
web_managed
or (web_backend == "exa" and direct_exa)
or (web_backend == "firecrawl" and direct_firecrawl)
or (web_backend == "parallel" and direct_parallel)
or (web_backend == "tavily" and direct_tavily)
)
)
web_available = bool(
managed_web_available or direct_exa or direct_firecrawl or direct_parallel or direct_tavily
)
image_managed = image_tool_enabled and managed_image_available and not direct_fal
image_active = bool(image_tool_enabled and (image_managed or direct_fal))
image_available = bool(managed_image_available or direct_fal)
tts_current_provider = tts_provider or "edge"
tts_managed = (
tts_tool_enabled
and tts_current_provider == "openai"
and managed_tts_available
and not direct_openai_tts
)
tts_available = bool(
tts_current_provider in {"edge", "neutts"}
or (tts_current_provider == "openai" and (managed_tts_available or direct_openai_tts))
or (tts_current_provider == "elevenlabs" and direct_elevenlabs)
)
tts_active = bool(tts_tool_enabled and tts_available)
browser_local_available = _has_agent_browser()
(
browser_current_provider,
browser_available,
browser_active,
browser_managed,
) = _resolve_browser_feature_state(
browser_tool_enabled=browser_tool_enabled,
browser_provider=browser_provider,
browser_provider_explicit=browser_provider_explicit,
browser_local_available=browser_local_available,
direct_camofox=direct_camofox,
direct_browserbase=direct_browserbase,
direct_browser_use=direct_browser_use,
managed_browser_available=managed_browser_available,
)
if terminal_backend != "modal":
modal_managed = False
modal_available = True
modal_active = bool(modal_tool_enabled)
modal_direct_override = False
elif modal_state["selected_backend"] == "managed":
modal_managed = bool(modal_tool_enabled)
modal_available = True
modal_active = bool(modal_tool_enabled)
modal_direct_override = False
elif modal_state["selected_backend"] == "direct":
modal_managed = False
modal_available = True
modal_active = bool(modal_tool_enabled)
modal_direct_override = bool(modal_tool_enabled)
elif modal_mode == "managed":
modal_managed = False
modal_available = bool(managed_modal_available)
modal_active = False
modal_direct_override = False
elif modal_mode == "direct":
modal_managed = False
modal_available = bool(direct_modal)
modal_active = False
modal_direct_override = False
else:
modal_managed = False
modal_available = bool(managed_modal_available or direct_modal)
modal_active = False
modal_direct_override = False
tts_explicit_configured = False
raw_tts_cfg = config.get("tts")
if isinstance(raw_tts_cfg, dict) and "provider" in raw_tts_cfg:
tts_explicit_configured = tts_provider not in {"", "edge"}
features = {
"web": NousFeatureState(
key="web",
label="Web tools",
included_by_default=True,
available=web_available,
active=web_active,
managed_by_nous=web_managed,
direct_override=web_active and not web_managed,
toolset_enabled=web_tool_enabled,
current_provider=web_backend or "",
explicit_configured=bool(web_backend),
),
"image_gen": NousFeatureState(
key="image_gen",
label="Image generation",
included_by_default=True,
available=image_available,
active=image_active,
managed_by_nous=image_managed,
direct_override=image_active and not image_managed,
toolset_enabled=image_tool_enabled,
current_provider="FAL" if direct_fal else ("Nous Subscription" if image_managed else ""),
explicit_configured=direct_fal,
),
"tts": NousFeatureState(
key="tts",
label="OpenAI TTS",
included_by_default=True,
available=tts_available,
active=tts_active,
managed_by_nous=tts_managed,
direct_override=tts_active and not tts_managed,
toolset_enabled=tts_tool_enabled,
current_provider=_tts_label(tts_current_provider),
explicit_configured=tts_explicit_configured,
),
"browser": NousFeatureState(
key="browser",
label="Browser automation",
included_by_default=True,
available=browser_available,
active=browser_active,
managed_by_nous=browser_managed,
direct_override=browser_active and not browser_managed,
toolset_enabled=browser_tool_enabled,
current_provider=_browser_label(browser_current_provider),
explicit_configured=browser_provider_explicit,
),
"modal": NousFeatureState(
key="modal",
label="Modal execution",
included_by_default=False,
available=modal_available,
active=modal_active,
managed_by_nous=modal_managed,
direct_override=terminal_backend == "modal" and modal_direct_override,
toolset_enabled=modal_tool_enabled,
current_provider="Modal" if terminal_backend == "modal" else terminal_backend or "local",
explicit_configured=terminal_backend == "modal",
),
}
return NousSubscriptionFeatures(
subscribed=subscribed,
nous_auth_present=nous_auth_present,
provider_is_nous=provider_is_nous,
features=features,
)
def get_nous_subscription_explainer_lines() -> list[str]:
if not managed_nous_tools_enabled():
return []
return [
"Nous subscription enables managed web tools, image generation, OpenAI TTS, and browser automation by default.",
"Those managed tools bill to your Nous subscription. Modal execution is optional and can bill to your subscription too.",
"Change these later with: hermes setup tools, hermes setup terminal, or hermes status.",
]
def apply_nous_provider_defaults(config: Dict[str, object]) -> set[str]:
"""Apply provider-level Nous defaults shared by `hermes setup` and `hermes model`."""
if not managed_nous_tools_enabled():
return set()
features = get_nous_subscription_features(config)
if not features.provider_is_nous:
return set()
tts_cfg = config.get("tts")
if not isinstance(tts_cfg, dict):
tts_cfg = {}
config["tts"] = tts_cfg
current_tts = str(tts_cfg.get("provider") or "edge").strip().lower()
if current_tts not in {"", "edge"}:
return set()
tts_cfg["provider"] = "openai"
return {"tts"}
def apply_nous_managed_defaults(
config: Dict[str, object],
*,
enabled_toolsets: Optional[Iterable[str]] = None,
) -> set[str]:
if not managed_nous_tools_enabled():
return set()
features = get_nous_subscription_features(config)
if not features.provider_is_nous:
return set()
selected_toolsets = set(enabled_toolsets or ())
changed: set[str] = set()
web_cfg = config.get("web")
if not isinstance(web_cfg, dict):
web_cfg = {}
config["web"] = web_cfg
tts_cfg = config.get("tts")
if not isinstance(tts_cfg, dict):
tts_cfg = {}
config["tts"] = tts_cfg
browser_cfg = config.get("browser")
if not isinstance(browser_cfg, dict):
browser_cfg = {}
config["browser"] = browser_cfg
if "web" in selected_toolsets and not features.web.explicit_configured and not (
get_env_value("PARALLEL_API_KEY")
or get_env_value("TAVILY_API_KEY")
or get_env_value("FIRECRAWL_API_KEY")
or get_env_value("FIRECRAWL_API_URL")
):
web_cfg["backend"] = "firecrawl"
changed.add("web")
if "tts" in selected_toolsets and not features.tts.explicit_configured and not (
resolve_openai_audio_api_key()
or get_env_value("ELEVENLABS_API_KEY")
):
tts_cfg["provider"] = "openai"
changed.add("tts")
if "browser" in selected_toolsets and not features.browser.explicit_configured and not (
get_env_value("BROWSERBASE_API_KEY")
or get_env_value("BROWSER_USE_API_KEY")
):
browser_cfg["cloud_provider"] = "browserbase"
changed.add("browser")
if "image_gen" in selected_toolsets and not get_env_value("FAL_KEY"):
changed.add("image_gen")
return changed

View File

@ -38,6 +38,8 @@ from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Callable, Dict, List, Optional, Set
from utils import env_var_enabled
try:
import yaml
except ImportError: # pragma: no cover yaml is optional at import time
@ -65,7 +67,18 @@ _NS_PARENT = "hermes_plugins"
def _env_enabled(name: str) -> bool:
"""Return True when an env var is set to a truthy opt-in value."""
return os.getenv(name, "").strip().lower() in {"1", "true", "yes", "on"}
return env_var_enabled(name)
def _get_disabled_plugins() -> set:
"""Read the disabled plugins list from config.yaml."""
try:
from hermes_cli.config import load_config
config = load_config()
disabled = config.get("plugins", {}).get("disabled", [])
return set(disabled) if isinstance(disabled, list) else set()
except Exception:
return set()
# ---------------------------------------------------------------------------
@ -141,6 +154,34 @@ class PluginContext:
self._manager._plugin_tool_names.add(name)
logger.debug("Plugin %s registered tool: %s", self.manifest.name, name)
# -- message injection --------------------------------------------------
def inject_message(self, content: str, role: str = "user") -> bool:
"""Inject a message into the active conversation.
If the agent is idle (waiting for user input), this starts a new turn.
If the agent is running, this interrupts and injects the message.
This enables plugins (e.g. remote control viewers, messaging bridges)
to send messages into the conversation from external sources.
Returns True if the message was queued successfully.
"""
cli = self._manager._cli_ref
if cli is None:
logger.warning("inject_message: no CLI reference (not available in gateway mode)")
return False
msg = content if role == "user" else f"[{role}] {content}"
if getattr(cli, "_agent_running", False):
# Agent is mid-turn — interrupt with the message
cli._interrupt_queue.put(msg)
else:
# Agent is idle — queue as next input
cli._pending_input.put(msg)
return True
# -- hook registration --------------------------------------------------
def register_hook(self, hook_name: str, callback: Callable) -> None:
@ -173,6 +214,7 @@ class PluginManager:
self._hooks: Dict[str, List[Callable]] = {}
self._plugin_tool_names: Set[str] = set()
self._discovered: bool = False
self._cli_ref = None # Set by CLI after plugin discovery
# -----------------------------------------------------------------------
# Public
@ -199,8 +241,15 @@ class PluginManager:
# 3. Pip / entry-point plugins
manifests.extend(self._scan_entry_points())
# Load each manifest
# Load each manifest (skip user-disabled plugins)
disabled = _get_disabled_plugins()
for manifest in manifests:
if manifest.name in disabled:
loaded = LoadedPlugin(manifest=manifest, enabled=False)
loaded.error = "disabled via config"
self._plugins[manifest.name] = loaded
logger.debug("Skipping disabled plugin '%s'", manifest.name)
continue
self._load_plugin(manifest)
if manifests:

View File

@ -265,10 +265,11 @@ def cmd_install(identifier: str, force: bool = False) -> None:
)
sys.exit(1)
if mv_int > _SUPPORTED_MANIFEST_VERSION:
from hermes_cli.config import recommended_update_command
console.print(
f"[red]Error:[/red] Plugin '{plugin_name}' requires manifest_version "
f"{mv}, but this installer only supports up to {_SUPPORTED_MANIFEST_VERSION}.\n"
f"Run [bold]hermes update[/bold] to get a newer installer."
f"Run [bold]{recommended_update_command()}[/bold] to get a newer installer."
)
sys.exit(1)
@ -374,6 +375,73 @@ def cmd_remove(name: str) -> None:
_display_removed(name, plugins_dir)
def _get_disabled_set() -> set:
"""Read the disabled plugins set from config.yaml."""
try:
from hermes_cli.config import load_config
config = load_config()
disabled = config.get("plugins", {}).get("disabled", [])
return set(disabled) if isinstance(disabled, list) else set()
except Exception:
return set()
def _save_disabled_set(disabled: set) -> None:
"""Write the disabled plugins list to config.yaml."""
from hermes_cli.config import load_config, save_config
config = load_config()
if "plugins" not in config:
config["plugins"] = {}
config["plugins"]["disabled"] = sorted(disabled)
save_config(config)
def cmd_enable(name: str) -> None:
"""Enable a previously disabled plugin."""
from rich.console import Console
console = Console()
plugins_dir = _plugins_dir()
# Verify the plugin exists
target = plugins_dir / name
if not target.is_dir():
console.print(f"[red]Plugin '{name}' is not installed.[/red]")
sys.exit(1)
disabled = _get_disabled_set()
if name not in disabled:
console.print(f"[dim]Plugin '{name}' is already enabled.[/dim]")
return
disabled.discard(name)
_save_disabled_set(disabled)
console.print(f"[green]✓[/green] Plugin [bold]{name}[/bold] enabled. Takes effect on next session.")
def cmd_disable(name: str) -> None:
"""Disable a plugin without removing it."""
from rich.console import Console
console = Console()
plugins_dir = _plugins_dir()
# Verify the plugin exists
target = plugins_dir / name
if not target.is_dir():
console.print(f"[red]Plugin '{name}' is not installed.[/red]")
sys.exit(1)
disabled = _get_disabled_set()
if name in disabled:
console.print(f"[dim]Plugin '{name}' is already disabled.[/dim]")
return
disabled.add(name)
_save_disabled_set(disabled)
console.print(f"[yellow]⊘[/yellow] Plugin [bold]{name}[/bold] disabled. Takes effect on next session.")
def cmd_list() -> None:
"""List installed plugins."""
from rich.console import Console
@ -393,8 +461,11 @@ def cmd_list() -> None:
console.print("[dim]Install with:[/dim] hermes plugins install owner/repo")
return
disabled = _get_disabled_set()
table = Table(title="Installed Plugins", show_lines=False)
table.add_column("Name", style="bold")
table.add_column("Status")
table.add_column("Version", style="dim")
table.add_column("Description")
table.add_column("Source", style="dim")
@ -420,11 +491,86 @@ def cmd_list() -> None:
if (d / ".git").exists():
source = "git"
table.add_row(name, str(version), description, source)
is_disabled = name in disabled or d.name in disabled
status = "[red]disabled[/red]" if is_disabled else "[green]enabled[/green]"
table.add_row(name, status, str(version), description, source)
console.print()
console.print(table)
console.print()
console.print("[dim]Interactive toggle:[/dim] hermes plugins")
console.print("[dim]Enable/disable:[/dim] hermes plugins enable/disable <name>")
def cmd_toggle() -> None:
"""Interactive curses checklist to enable/disable installed plugins."""
from rich.console import Console
try:
import yaml
except ImportError:
yaml = None
console = Console()
plugins_dir = _plugins_dir()
dirs = sorted(d for d in plugins_dir.iterdir() if d.is_dir())
if not dirs:
console.print("[dim]No plugins installed.[/dim]")
console.print("[dim]Install with:[/dim] hermes plugins install owner/repo")
return
disabled = _get_disabled_set()
# Build items list: "name — description" for display
names = []
labels = []
selected = set()
for i, d in enumerate(dirs):
manifest_file = d / "plugin.yaml"
name = d.name
description = ""
if manifest_file.exists() and yaml:
try:
with open(manifest_file) as f:
manifest = yaml.safe_load(f) or {}
name = manifest.get("name", d.name)
description = manifest.get("description", "")
except Exception:
pass
names.append(name)
label = f"{name}{description}" if description else name
labels.append(label)
if name not in disabled and d.name not in disabled:
selected.add(i)
from hermes_cli.curses_ui import curses_checklist
result = curses_checklist(
title="Plugins — toggle enabled/disabled",
items=labels,
selected=selected,
)
# Compute new disabled set from deselected items
new_disabled = set()
for i, name in enumerate(names):
if i not in result:
new_disabled.add(name)
if new_disabled != disabled:
_save_disabled_set(new_disabled)
enabled_count = len(names) - len(new_disabled)
console.print(
f"\n[green]✓[/green] {enabled_count} enabled, {len(new_disabled)} disabled. "
f"Takes effect on next session."
)
else:
console.print("\n[dim]No changes.[/dim]")
def plugins_command(args) -> None:
@ -437,8 +583,14 @@ def plugins_command(args) -> None:
cmd_update(args.name)
elif action in ("remove", "rm", "uninstall"):
cmd_remove(args.name)
elif action in ("list", "ls") or action is None:
elif action == "enable":
cmd_enable(args.name)
elif action == "disable":
cmd_disable(args.name)
elif action in ("list", "ls"):
cmd_list()
elif action is None:
cmd_toggle()
else:
from rich.console import Console

1054
hermes_cli/profiles.py Normal file

File diff suppressed because it is too large Load Diff

View File

@ -6,8 +6,10 @@ import os
from typing import Any, Dict, Optional
from hermes_cli import auth as auth_mod
from agent.credential_pool import CredentialPool, PooledCredential, get_custom_provider_pool_key, load_pool
from hermes_cli.auth import (
AuthError,
DEFAULT_CODEX_BASE_URL,
PROVIDER_REGISTRY,
format_auth_error,
resolve_provider,
@ -63,10 +65,13 @@ def _get_model_config() -> Dict[str, Any]:
model_cfg = config.get("model")
if isinstance(model_cfg, dict):
cfg = dict(model_cfg)
# Accept "model" as alias for "default" (users intuitively write model.model)
if not cfg.get("default") and cfg.get("model"):
cfg["default"] = cfg["model"]
default = (cfg.get("default") or "").strip()
base_url = (cfg.get("base_url") or "").strip()
is_local = "localhost" in base_url or "127.0.0.1" in base_url
is_fallback = not default or default == "anthropic/claude-opus-4.6"
is_fallback = not default
if is_local and is_fallback and base_url:
detected = _auto_detect_local_model(base_url)
if detected:
@ -77,9 +82,27 @@ def _get_model_config() -> Dict[str, Any]:
return {}
def _provider_supports_explicit_api_mode(provider: Optional[str], configured_provider: Optional[str] = None) -> bool:
"""Check whether a persisted api_mode should be honored for a given provider.
Prevents stale api_mode from a previous provider leaking into a
different one after a model/provider switch. Only applies the
persisted mode when the config's provider matches the runtime
provider (or when no configured provider is recorded).
"""
normalized_provider = (provider or "").strip().lower()
normalized_configured = (configured_provider or "").strip().lower()
if not normalized_configured:
return True
if normalized_provider == "custom":
return normalized_configured == "custom" or normalized_configured.startswith("custom:")
return normalized_configured == normalized_provider
def _copilot_runtime_api_mode(model_cfg: Dict[str, Any], api_key: str) -> str:
configured_provider = str(model_cfg.get("provider") or "").strip().lower()
configured_mode = _parse_api_mode(model_cfg.get("api_mode"))
if configured_mode:
if configured_mode and _provider_supports_explicit_api_mode("copilot", configured_provider):
return configured_mode
model_name = str(model_cfg.get("default") or "").strip()
@ -106,6 +129,56 @@ def _parse_api_mode(raw: Any) -> Optional[str]:
return None
def _resolve_runtime_from_pool_entry(
*,
provider: str,
entry: PooledCredential,
requested_provider: str,
model_cfg: Optional[Dict[str, Any]] = None,
pool: Optional[CredentialPool] = None,
) -> Dict[str, Any]:
model_cfg = model_cfg or _get_model_config()
base_url = (getattr(entry, "runtime_base_url", None) or getattr(entry, "base_url", None) or "").rstrip("/")
api_key = getattr(entry, "runtime_api_key", None) or getattr(entry, "access_token", "")
api_mode = "chat_completions"
if provider == "openai-codex":
api_mode = "codex_responses"
base_url = base_url or DEFAULT_CODEX_BASE_URL
elif provider == "anthropic":
api_mode = "anthropic_messages"
cfg_provider = str(model_cfg.get("provider") or "").strip().lower()
cfg_base_url = ""
if cfg_provider == "anthropic":
cfg_base_url = str(model_cfg.get("base_url") or "").strip().rstrip("/")
base_url = cfg_base_url or base_url or "https://api.anthropic.com"
elif provider == "openrouter":
base_url = base_url or OPENROUTER_BASE_URL
elif provider == "nous":
api_mode = "chat_completions"
elif provider == "copilot":
api_mode = _copilot_runtime_api_mode(model_cfg, getattr(entry, "runtime_api_key", ""))
else:
configured_provider = str(model_cfg.get("provider") or "").strip().lower()
configured_mode = _parse_api_mode(model_cfg.get("api_mode"))
if configured_mode and _provider_supports_explicit_api_mode(provider, configured_provider):
api_mode = configured_mode
elif provider in ("opencode-zen", "opencode-go"):
from hermes_cli.models import opencode_model_api_mode
api_mode = opencode_model_api_mode(provider, model_cfg.get("default", ""))
elif base_url.rstrip("/").endswith("/anthropic"):
api_mode = "anthropic_messages"
return {
"provider": provider,
"api_mode": api_mode,
"base_url": base_url,
"api_key": api_key,
"source": getattr(entry, "source", "pool"),
"credential_pool": pool,
"requested_provider": requested_provider,
}
def resolve_requested_provider(requested: Optional[str] = None) -> str:
"""Resolve provider request from explicit arg, config, then env."""
if requested and requested.strip():
@ -125,6 +198,37 @@ def resolve_requested_provider(requested: Optional[str] = None) -> str:
return "auto"
def _try_resolve_from_custom_pool(
base_url: str,
provider_label: str,
api_mode_override: Optional[str] = None,
) -> Optional[Dict[str, Any]]:
"""Check if a credential pool exists for a custom endpoint and return a runtime dict if so."""
pool_key = get_custom_provider_pool_key(base_url)
if not pool_key:
return None
try:
pool = load_pool(pool_key)
if not pool.has_credentials():
return None
entry = pool.select()
if entry is None:
return None
pool_api_key = getattr(entry, "runtime_api_key", None) or getattr(entry, "access_token", "")
if not pool_api_key:
return None
return {
"provider": provider_label,
"api_mode": api_mode_override or _detect_api_mode_for_url(base_url) or "chat_completions",
"base_url": base_url,
"api_key": pool_api_key,
"source": f"pool:{pool_key}",
"credential_pool": pool,
}
except Exception:
return None
def _get_named_custom_provider(requested_provider: str) -> Optional[Dict[str, Any]]:
requested_norm = _normalize_custom_provider_name(requested_provider or "")
if not requested_norm or requested_norm == "custom":
@ -189,6 +293,11 @@ def _resolve_named_custom_runtime(
if not base_url:
return None
# Check if a credential pool exists for this custom endpoint
pool_result = _try_resolve_from_custom_pool(base_url, "custom", custom_provider.get("api_mode"))
if pool_result:
return pool_result
api_key_candidates = [
(explicit_api_key or "").strip(),
str(custom_provider.get("api_key", "") or "").strip(),
@ -203,7 +312,7 @@ def _resolve_named_custom_runtime(
or _detect_api_mode_for_url(base_url)
or "chat_completions",
"base_url": base_url,
"api_key": api_key,
"api_key": api_key or "no-key-required",
"source": f"custom_provider:{custom_provider.get('name', requested_provider)}",
}
@ -226,28 +335,22 @@ def _resolve_openrouter_runtime(
requested_norm = (requested_provider or "").strip().lower()
cfg_provider = cfg_provider.strip().lower()
env_openai_base_url = os.getenv("OPENAI_BASE_URL", "").strip()
env_openrouter_base_url = os.getenv("OPENROUTER_BASE_URL", "").strip()
# Use config base_url when available and the provider context matches.
# OPENAI_BASE_URL env var is no longer consulted — config.yaml is
# the single source of truth for endpoint URLs.
use_config_base_url = False
if cfg_base_url.strip() and not explicit_base_url:
if requested_norm == "auto":
if (not cfg_provider or cfg_provider == "auto") and not env_openai_base_url:
if not cfg_provider or cfg_provider == "auto":
use_config_base_url = True
elif requested_norm == "custom" and cfg_provider == "custom":
# provider: custom — use base_url from config (Fixes #1760).
use_config_base_url = True
# When the user explicitly requested the openrouter provider, skip
# OPENAI_BASE_URL — it typically points to a custom / non-OpenRouter
# endpoint and would prevent switching back to OpenRouter (#874).
skip_openai_base = requested_norm == "openrouter"
# For custom, prefer config base_url over env so config.yaml is honored (#1760).
base_url = (
(explicit_base_url or "").strip()
or (cfg_base_url.strip() if use_config_base_url else "")
or ("" if skip_openai_base else env_openai_base_url)
or env_openrouter_base_url
or OPENROUTER_BASE_URL
).rstrip("/")
@ -284,6 +387,15 @@ def _resolve_openrouter_runtime(
# Also provide a placeholder API key for local servers that don't require
# authentication — the OpenAI SDK requires a non-empty api_key string.
effective_provider = "custom" if requested_norm == "custom" else "openrouter"
# For custom endpoints, check if a credential pool exists
if effective_provider == "custom" and base_url:
pool_result = _try_resolve_from_custom_pool(
base_url, effective_provider, _parse_api_mode(model_cfg.get("api_mode")),
)
if pool_result:
return pool_result
if effective_provider == "custom" and not api_key and not _is_openrouter_url:
api_key = "no-key-required"
@ -298,6 +410,134 @@ def _resolve_openrouter_runtime(
}
def _resolve_explicit_runtime(
*,
provider: str,
requested_provider: str,
model_cfg: Dict[str, Any],
explicit_api_key: Optional[str] = None,
explicit_base_url: Optional[str] = None,
) -> Optional[Dict[str, Any]]:
explicit_api_key = str(explicit_api_key or "").strip()
explicit_base_url = str(explicit_base_url or "").strip().rstrip("/")
if not explicit_api_key and not explicit_base_url:
return None
if provider == "anthropic":
cfg_provider = str(model_cfg.get("provider") or "").strip().lower()
cfg_base_url = ""
if cfg_provider == "anthropic":
cfg_base_url = str(model_cfg.get("base_url") or "").strip().rstrip("/")
base_url = explicit_base_url or cfg_base_url or "https://api.anthropic.com"
api_key = explicit_api_key
if not api_key:
from agent.anthropic_adapter import resolve_anthropic_token
api_key = resolve_anthropic_token()
if not api_key:
raise AuthError(
"No Anthropic credentials found. Set ANTHROPIC_TOKEN or ANTHROPIC_API_KEY, "
"run 'claude setup-token', or authenticate with 'claude /login'."
)
return {
"provider": "anthropic",
"api_mode": "anthropic_messages",
"base_url": base_url,
"api_key": api_key,
"source": "explicit",
"requested_provider": requested_provider,
}
if provider == "openai-codex":
base_url = explicit_base_url or DEFAULT_CODEX_BASE_URL
api_key = explicit_api_key
last_refresh = None
if not api_key:
creds = resolve_codex_runtime_credentials()
api_key = creds.get("api_key", "")
last_refresh = creds.get("last_refresh")
if not explicit_base_url:
base_url = creds.get("base_url", "").rstrip("/") or base_url
return {
"provider": "openai-codex",
"api_mode": "codex_responses",
"base_url": base_url,
"api_key": api_key,
"source": "explicit",
"last_refresh": last_refresh,
"requested_provider": requested_provider,
}
if provider == "nous":
state = auth_mod.get_provider_auth_state("nous") or {}
base_url = (
explicit_base_url
or str(state.get("inference_base_url") or auth_mod.DEFAULT_NOUS_INFERENCE_URL).strip().rstrip("/")
)
api_key = explicit_api_key or str(state.get("agent_key") or state.get("access_token") or "").strip()
expires_at = state.get("agent_key_expires_at") or state.get("expires_at")
if not api_key:
creds = resolve_nous_runtime_credentials(
min_key_ttl_seconds=max(60, int(os.getenv("HERMES_NOUS_MIN_KEY_TTL_SECONDS", "1800"))),
timeout_seconds=float(os.getenv("HERMES_NOUS_TIMEOUT_SECONDS", "15")),
)
api_key = creds.get("api_key", "")
expires_at = creds.get("expires_at")
if not explicit_base_url:
base_url = creds.get("base_url", "").rstrip("/") or base_url
return {
"provider": "nous",
"api_mode": "chat_completions",
"base_url": base_url,
"api_key": api_key,
"source": "explicit",
"expires_at": expires_at,
"requested_provider": requested_provider,
}
pconfig = PROVIDER_REGISTRY.get(provider)
if pconfig and pconfig.auth_type == "api_key":
env_url = ""
if pconfig.base_url_env_var:
env_url = os.getenv(pconfig.base_url_env_var, "").strip().rstrip("/")
base_url = explicit_base_url
if not base_url:
if provider == "kimi-coding":
creds = resolve_api_key_provider_credentials(provider)
base_url = creds.get("base_url", "").rstrip("/")
else:
base_url = env_url or pconfig.inference_base_url
api_key = explicit_api_key
if not api_key:
creds = resolve_api_key_provider_credentials(provider)
api_key = creds.get("api_key", "")
if not base_url:
base_url = creds.get("base_url", "").rstrip("/")
api_mode = "chat_completions"
if provider == "copilot":
api_mode = _copilot_runtime_api_mode(model_cfg, api_key)
else:
configured_mode = _parse_api_mode(model_cfg.get("api_mode"))
if configured_mode:
api_mode = configured_mode
elif base_url.rstrip("/").endswith("/anthropic"):
api_mode = "anthropic_messages"
return {
"provider": provider,
"api_mode": api_mode,
"base_url": base_url.rstrip("/"),
"api_key": api_key,
"source": "explicit",
"requested_provider": requested_provider,
}
return None
def resolve_runtime_provider(
*,
requested: Optional[str] = None,
@ -321,6 +561,57 @@ def resolve_runtime_provider(
explicit_api_key=explicit_api_key,
explicit_base_url=explicit_base_url,
)
model_cfg = _get_model_config()
explicit_runtime = _resolve_explicit_runtime(
provider=provider,
requested_provider=requested_provider,
model_cfg=model_cfg,
explicit_api_key=explicit_api_key,
explicit_base_url=explicit_base_url,
)
if explicit_runtime:
return explicit_runtime
should_use_pool = provider != "openrouter"
if provider == "openrouter":
cfg_provider = str(model_cfg.get("provider") or "").strip().lower()
cfg_base_url = str(model_cfg.get("base_url") or "").strip()
env_openai_base_url = os.getenv("OPENAI_BASE_URL", "").strip()
env_openrouter_base_url = os.getenv("OPENROUTER_BASE_URL", "").strip()
has_custom_endpoint = bool(
explicit_base_url
or env_openai_base_url
or env_openrouter_base_url
)
if cfg_base_url and cfg_provider in {"auto", "custom"}:
has_custom_endpoint = True
has_runtime_override = bool(explicit_api_key or explicit_base_url)
should_use_pool = (
requested_provider in {"openrouter", "auto"}
and not has_custom_endpoint
and not has_runtime_override
)
try:
pool = load_pool(provider) if should_use_pool else None
except Exception:
pool = None
if pool and pool.has_credentials():
entry = pool.select()
pool_api_key = ""
if entry is not None:
pool_api_key = (
getattr(entry, "runtime_api_key", None)
or getattr(entry, "access_token", "")
)
if entry is not None and pool_api_key:
return _resolve_runtime_from_pool_entry(
provider=provider,
entry=entry,
requested_provider=requested_provider,
model_cfg=model_cfg,
pool=pool,
)
if provider == "nous":
creds = resolve_nous_runtime_credentials(
@ -374,7 +665,6 @@ def resolve_runtime_provider(
# Allow base URL override from config.yaml model.base_url, but only
# when the configured provider is anthropic — otherwise a non-Anthropic
# base_url (e.g. Codex endpoint) would leak into Anthropic requests.
model_cfg = _get_model_config()
cfg_provider = str(model_cfg.get("provider") or "").strip().lower()
cfg_base_url = ""
if cfg_provider == "anthropic":
@ -393,16 +683,19 @@ def resolve_runtime_provider(
pconfig = PROVIDER_REGISTRY.get(provider)
if pconfig and pconfig.auth_type == "api_key":
creds = resolve_api_key_provider_credentials(provider)
model_cfg = _get_model_config()
base_url = creds.get("base_url", "").rstrip("/")
api_mode = "chat_completions"
if provider == "copilot":
api_mode = _copilot_runtime_api_mode(model_cfg, creds.get("api_key", ""))
else:
# Check explicit api_mode from model config first
configured_provider = str(model_cfg.get("provider") or "").strip().lower()
# Only honor persisted api_mode when it belongs to the same provider family.
configured_mode = _parse_api_mode(model_cfg.get("api_mode"))
if configured_mode:
if configured_mode and _provider_supports_explicit_api_mode(provider, configured_provider):
api_mode = configured_mode
elif provider in ("opencode-zen", "opencode-go"):
from hermes_cli.models import opencode_model_api_mode
api_mode = opencode_model_api_mode(provider, model_cfg.get("default", ""))
# Auto-detect Anthropic-compatible endpoints by URL convention
# (e.g. https://api.minimax.io/anthropic, https://dashscope.../anthropic)
elif base_url.rstrip("/").endswith("/anthropic"):

File diff suppressed because it is too large Load Diff

View File

@ -24,6 +24,13 @@ PLATFORMS = {
"whatsapp": "📱 WhatsApp",
"signal": "📡 Signal",
"email": "📧 Email",
"homeassistant": "🏠 Home Assistant",
"mattermost": "💬 Mattermost",
"matrix": "💬 Matrix",
"dingtalk": "💬 DingTalk",
"feishu": "🪽 Feishu",
"wecom": "💬 WeCom",
"webhook": "🔗 Webhook",
}
# ─── Config Helpers ───────────────────────────────────────────────────────────

View File

@ -21,6 +21,7 @@ from rich.table import Table
# Lazy imports to avoid circular dependencies and slow startup.
# tools.skills_hub and tools.skills_guard are imported inside functions.
from hermes_constants import display_hermes_home
_console = Console()
@ -304,7 +305,8 @@ def do_browse(page: int = 1, page_size: int = 20, source: str = "all",
def do_install(identifier: str, category: str = "", force: bool = False,
console: Optional[Console] = None, skip_confirm: bool = False) -> None:
console: Optional[Console] = None, skip_confirm: bool = False,
invalidate_cache: bool = True) -> None:
"""Fetch, quarantine, scan, confirm, and install a skill."""
from tools.skills_hub import (
GitHubAuth, create_source_router, ensure_hub_dirs,
@ -352,7 +354,14 @@ def do_install(identifier: str, category: str = "", force: bool = False,
extra_metadata.update(getattr(bundle, "metadata", {}) or {})
# Quarantine the bundle
q_path = quarantine_bundle(bundle)
try:
q_path = quarantine_bundle(bundle)
except ValueError as exc:
c.print(f"[bold red]Installation blocked:[/] {exc}\n")
from tools.skills_hub import append_audit_log
append_audit_log("BLOCKED", bundle.name, bundle.source,
bundle.trust_level, "invalid_path", str(exc))
return
c.print(f"[dim]Quarantined to {q_path.relative_to(q_path.parent.parent.parent)}[/]")
# Scan
@ -387,7 +396,7 @@ def do_install(identifier: str, category: str = "", force: bool = False,
"[bold bright_cyan]This is an official optional skill maintained by Nous Research.[/]\n\n"
"It ships with hermes-agent but is not activated by default.\n"
"Installing will copy it to your skills directory where the agent can use it.\n\n"
f"Files will be at: [cyan]~/.hermes/skills/{category + '/' if category else ''}{bundle.name}/[/]",
f"Files will be at: [cyan]{display_hermes_home()}/skills/{category + '/' if category else ''}{bundle.name}/[/]",
title="Official Skill",
border_style="bright_cyan",
))
@ -397,7 +406,7 @@ def do_install(identifier: str, category: str = "", force: bool = False,
"External skills can contain instructions that influence agent behavior,\n"
"shell commands, and scripts. Even after automated scanning, you should\n"
"review the installed files before use.\n\n"
f"Files will be at: [cyan]~/.hermes/skills/{category + '/' if category else ''}{bundle.name}/[/]",
f"Files will be at: [cyan]{display_hermes_home()}/skills/{category + '/' if category else ''}{bundle.name}/[/]",
title="Disclaimer",
border_style="yellow",
))
@ -412,17 +421,29 @@ def do_install(identifier: str, category: str = "", force: bool = False,
return
# Install
install_dir = install_from_quarantine(q_path, bundle.name, category, bundle, result)
try:
install_dir = install_from_quarantine(q_path, bundle.name, category, bundle, result)
except ValueError as exc:
c.print(f"[bold red]Installation blocked:[/] {exc}\n")
shutil.rmtree(q_path, ignore_errors=True)
from tools.skills_hub import append_audit_log
append_audit_log("BLOCKED", bundle.name, bundle.source,
bundle.trust_level, "invalid_path", str(exc))
return
from tools.skills_hub import SKILLS_DIR
c.print(f"[bold green]Installed:[/] {install_dir.relative_to(SKILLS_DIR)}")
c.print(f"[dim]Files: {', '.join(bundle.files.keys())}[/]\n")
# Invalidate the skills prompt cache so the new skill appears immediately
try:
from agent.prompt_builder import clear_skills_system_prompt_cache
clear_skills_system_prompt_cache(clear_snapshot=True)
except Exception:
pass
if invalidate_cache:
# Invalidate the skills prompt cache so the new skill appears immediately
try:
from agent.prompt_builder import clear_skills_system_prompt_cache
clear_skills_system_prompt_cache(clear_snapshot=True)
except Exception:
pass
else:
c.print("[dim]Skill will be available in your next session.[/]")
c.print("[dim]Use /reset to start a new session now, or --now to activate immediately (invalidates prompt cache).[/]\n")
def do_inspect(identifier: str, console: Optional[Console] = None) -> None:
@ -610,7 +631,8 @@ def do_audit(name: Optional[str] = None, console: Optional[Console] = None) -> N
def do_uninstall(name: str, console: Optional[Console] = None,
skip_confirm: bool = False) -> None:
skip_confirm: bool = False,
invalidate_cache: bool = True) -> None:
"""Remove a hub-installed skill with confirmation."""
from tools.skills_hub import uninstall_skill
@ -630,11 +652,15 @@ def do_uninstall(name: str, console: Optional[Console] = None,
success, msg = uninstall_skill(name)
if success:
c.print(f"[bold green]{msg}[/]\n")
try:
from agent.prompt_builder import clear_skills_system_prompt_cache
clear_skills_system_prompt_cache(clear_snapshot=True)
except Exception:
pass
if invalidate_cache:
try:
from agent.prompt_builder import clear_skills_system_prompt_cache
clear_skills_system_prompt_cache(clear_snapshot=True)
except Exception:
pass
else:
c.print("[dim]Change will take effect in your next session.[/]")
c.print("[dim]Use /reset to start a new session now, or --now to apply immediately (invalidates prompt cache).[/]\n")
else:
c.print(f"[bold red]Error:[/] {msg}\n")
@ -734,7 +760,7 @@ def do_publish(skill_path: str, target: str = "github", repo: str = "",
auth = GitHubAuth()
if not auth.is_authenticated():
c.print("[bold red]Error:[/] GitHub authentication required.\n"
"Set GITHUB_TOKEN in ~/.hermes/.env or run 'gh auth login'.\n")
f"Set GITHUB_TOKEN in {display_hermes_home()}/.env or run 'gh auth login'.\n")
return
c.print(f"[bold]Publishing '{name}' to {repo}...[/]")
@ -877,10 +903,15 @@ def do_snapshot_export(output_path: str, console: Optional[Console] = None) -> N
"taps": tap_list,
}
out = Path(output_path)
out.write_text(json.dumps(snapshot, indent=2, ensure_ascii=False) + "\n")
c.print(f"[bold green]Snapshot exported:[/] {out}")
c.print(f"[dim]{len(installed)} skill(s), {len(tap_list)} tap(s)[/]\n")
payload = json.dumps(snapshot, indent=2, ensure_ascii=False) + "\n"
if output_path == "-":
import sys
sys.stdout.write(payload)
else:
out = Path(output_path)
out.write_text(payload)
c.print(f"[bold green]Snapshot exported:[/] {out}")
c.print(f"[dim]{len(installed)} skill(s), {len(tap_list)} tap(s)[/]\n")
def do_snapshot_import(input_path: str, force: bool = False,
@ -1071,19 +1102,23 @@ def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:
elif action == "install":
if not args:
c.print("[bold red]Usage:[/] /skills install <identifier> [--category <cat>] [--force|--yes]\n")
c.print("[bold red]Usage:[/] /skills install <identifier> [--category <cat>] [--force] [--now]\n")
return
identifier = args[0]
category = ""
# --yes / -y bypasses confirmation prompt (needed in TUI mode)
# --force handles reinstall override
skip_confirm = any(flag in args for flag in ("--yes", "-y"))
# Slash commands run inside prompt_toolkit where input() hangs.
# Always skip confirmation — the user typing the command is implicit consent.
skip_confirm = True
force = "--force" in args
# --now invalidates prompt cache immediately (costs more money).
# Default: defer to next session to preserve cache.
invalidate_cache = "--now" in args
for i, a in enumerate(args):
if a == "--category" and i + 1 < len(args):
category = args[i + 1]
do_install(identifier, category=category, force=force,
skip_confirm=skip_confirm, console=c)
skip_confirm=skip_confirm, invalidate_cache=invalidate_cache,
console=c)
elif action == "inspect":
if not args:
@ -1113,10 +1148,13 @@ def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:
elif action == "uninstall":
if not args:
c.print("[bold red]Usage:[/] /skills uninstall <name> [--yes]\n")
c.print("[bold red]Usage:[/] /skills uninstall <name> [--now]\n")
return
skip_confirm = any(flag in args for flag in ("--yes", "-y"))
do_uninstall(args[0], console=c, skip_confirm=skip_confirm)
# Slash commands run inside prompt_toolkit where input() hangs.
skip_confirm = True
invalidate_cache = "--now" in args
do_uninstall(args[0], console=c, skip_confirm=skip_confirm,
invalidate_cache=invalidate_cache)
elif action == "publish":
if not args:

View File

@ -15,8 +15,10 @@ from hermes_cli.auth import AuthError, resolve_provider
from hermes_cli.colors import Colors, color
from hermes_cli.config import get_env_path, get_env_value, get_hermes_home, load_config
from hermes_cli.models import provider_label
from hermes_cli.nous_subscription import get_nous_subscription_features
from hermes_cli.runtime_provider import resolve_requested_provider
from hermes_constants import OPENROUTER_MODELS_URL
from tools.tool_backend_helpers import managed_nous_tools_enabled
def check_mark(ok: bool) -> str:
if ok:
@ -186,6 +188,31 @@ def show_status(args):
if codex_status.get("error") and not codex_logged_in:
print(f" Error: {codex_status.get('error')}")
# =========================================================================
# Nous Subscription Features
# =========================================================================
if managed_nous_tools_enabled():
features = get_nous_subscription_features(config)
print()
print(color("◆ Nous Subscription Features", Colors.CYAN, Colors.BOLD))
if not features.nous_auth_present:
print(" Nous Portal ✗ not logged in")
else:
print(" Nous Portal ✓ managed tools available")
for feature in features.items():
if feature.managed_by_nous:
state = "active via Nous subscription"
elif feature.active:
current = feature.current_provider or "configured provider"
state = f"active via {current}"
elif feature.included_by_default and features.nous_auth_present:
state = "included by subscription, not currently selected"
elif feature.key == "modal" and features.nous_auth_present:
state = "available via subscription (optional)"
else:
state = "not configured"
print(f" {feature.label:<15} {check_mark(feature.available or feature.active or feature.managed_by_nous)} {state}")
# =========================================================================
# API-Key Providers
# =========================================================================
@ -254,6 +281,9 @@ def show_status(args):
"Slack": ("SLACK_BOT_TOKEN", None),
"Email": ("EMAIL_ADDRESS", "EMAIL_HOME_ADDRESS"),
"SMS": ("TWILIO_ACCOUNT_SID", "SMS_HOME_CHANNEL"),
"DingTalk": ("DINGTALK_CLIENT_ID", None),
"Feishu": ("FEISHU_APP_ID", "FEISHU_HOME_CHANNEL"),
"WeCom": ("WECOM_BOT_ID", "WECOM_HOME_CHANNEL"),
}
for name, (token_var, home_var) in platforms.items():
@ -282,22 +312,31 @@ def show_status(args):
_gw_svc = get_service_name()
except Exception:
_gw_svc = "hermes-gateway"
result = subprocess.run(
["systemctl", "--user", "is-active", _gw_svc],
capture_output=True,
text=True
)
is_active = result.stdout.strip() == "active"
try:
result = subprocess.run(
["systemctl", "--user", "is-active", _gw_svc],
capture_output=True,
text=True,
timeout=5
)
is_active = result.stdout.strip() == "active"
except subprocess.TimeoutExpired:
is_active = False
print(f" Status: {check_mark(is_active)} {'running' if is_active else 'stopped'}")
print(" Manager: systemd (user)")
elif sys.platform == 'darwin':
result = subprocess.run(
["launchctl", "list", "ai.hermes.gateway"],
capture_output=True,
text=True
)
is_loaded = result.returncode == 0
from hermes_cli.gateway import get_launchd_label
try:
result = subprocess.run(
["launchctl", "list", get_launchd_label()],
capture_output=True,
text=True,
timeout=5
)
is_loaded = result.returncode == 0
except subprocess.TimeoutExpired:
is_loaded = False
print(f" Status: {check_mark(is_loaded)} {'loaded' if is_loaded else 'not loaded'}")
print(" Manager: launchd")
else:

View File

@ -9,6 +9,8 @@ Saves per-platform tool configuration to ~/.hermes/config.yaml under
the `platform_toolsets` key.
"""
import json as _json
import logging
import sys
from pathlib import Path
from typing import Dict, List, Optional, Set
@ -18,6 +20,13 @@ from hermes_cli.config import (
load_config, save_config, get_env_value, save_env_value,
)
from hermes_cli.colors import Colors, color
from hermes_cli.nous_subscription import (
apply_nous_managed_defaults,
get_nous_subscription_features,
)
from tools.tool_backend_helpers import managed_nous_tools_enabled
logger = logging.getLogger(__name__)
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
@ -136,8 +145,12 @@ PLATFORMS = {
"homeassistant": {"label": "🏠 Home Assistant", "default_toolset": "hermes-homeassistant"},
"email": {"label": "📧 Email", "default_toolset": "hermes-email"},
"matrix": {"label": "💬 Matrix", "default_toolset": "hermes-matrix"},
"dingtalk": {"label": "💬 DingTalk", "default_toolset": "hermes-dingtalk"},
"dingtalk": {"label": "💬 DingTalk", "default_toolset": "hermes-dingtalk"},
"feishu": {"label": "🪽 Feishu", "default_toolset": "hermes-feishu"},
"wecom": {"label": "💬 WeCom", "default_toolset": "hermes-wecom"},
"api_server": {"label": "🌐 API Server", "default_toolset": "hermes-api-server"},
"mattermost": {"label": "💬 Mattermost", "default_toolset": "hermes-mattermost"},
"webhook": {"label": "🔗 Webhook", "default_toolset": "hermes-webhook"},
}
@ -151,6 +164,15 @@ TOOL_CATEGORIES = {
"name": "Text-to-Speech",
"icon": "🔊",
"providers": [
{
"name": "Nous Subscription",
"tag": "Managed OpenAI TTS billed to your subscription",
"env_vars": [],
"tts_provider": "openai",
"requires_nous_auth": True,
"managed_nous_feature": "tts",
"override_env_vars": ["VOICE_TOOLS_OPENAI_KEY", "OPENAI_API_KEY"],
},
{
"name": "Microsoft Edge TTS",
"tag": "Free - no API key needed",
@ -181,6 +203,15 @@ TOOL_CATEGORIES = {
"setup_note": "A free DuckDuckGo search skill is also included — skip this if you don't need a premium provider.",
"icon": "🔍",
"providers": [
{
"name": "Nous Subscription",
"tag": "Managed Firecrawl billed to your subscription",
"web_backend": "firecrawl",
"env_vars": [],
"requires_nous_auth": True,
"managed_nous_feature": "web",
"override_env_vars": ["FIRECRAWL_API_KEY", "FIRECRAWL_API_URL"],
},
{
"name": "Firecrawl Cloud",
"tag": "Hosted service - search, extract, and crawl",
@ -189,6 +220,14 @@ TOOL_CATEGORIES = {
{"key": "FIRECRAWL_API_KEY", "prompt": "Firecrawl API key", "url": "https://firecrawl.dev"},
],
},
{
"name": "Exa",
"tag": "AI-native search and contents",
"web_backend": "exa",
"env_vars": [
{"key": "EXA_API_KEY", "prompt": "Exa API key", "url": "https://exa.ai"},
],
},
{
"name": "Parallel",
"tag": "AI-native search and extract",
@ -219,6 +258,14 @@ TOOL_CATEGORIES = {
"name": "Image Generation",
"icon": "🎨",
"providers": [
{
"name": "Nous Subscription",
"tag": "Managed FAL image generation billed to your subscription",
"env_vars": [],
"requires_nous_auth": True,
"managed_nous_feature": "image_gen",
"override_env_vars": ["FAL_KEY"],
},
{
"name": "FAL.ai",
"tag": "FLUX 2 Pro with auto-upscaling",
@ -232,11 +279,21 @@ TOOL_CATEGORIES = {
"name": "Browser Automation",
"icon": "🌐",
"providers": [
{
"name": "Nous Subscription (Browserbase cloud)",
"tag": "Managed Browserbase billed to your subscription",
"env_vars": [],
"browser_provider": "browserbase",
"requires_nous_auth": True,
"managed_nous_feature": "browser",
"override_env_vars": ["BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID"],
"post_setup": "browserbase",
},
{
"name": "Local Browser",
"tag": "Free headless Chromium (no API key needed)",
"env_vars": [],
"browser_provider": None,
"browser_provider": "local",
"post_setup": "browserbase", # Same npm install for agent-browser
},
{
@ -258,6 +315,16 @@ TOOL_CATEGORIES = {
"browser_provider": "browser-use",
"post_setup": "browserbase",
},
{
"name": "Camofox",
"tag": "Local anti-detection browser (Firefox/Camoufox)",
"env_vars": [
{"key": "CAMOFOX_URL", "prompt": "Camofox server URL", "default": "http://localhost:9377",
"url": "https://github.com/jo-inc/camofox-browser"},
],
"browser_provider": "camofox",
"post_setup": "camofox",
},
],
},
"homeassistant": {
@ -317,10 +384,33 @@ def _run_post_setup(post_setup_key: str):
if result.returncode == 0:
_print_success(" Node.js dependencies installed")
else:
_print_warning(" npm install failed - run manually: cd ~/.hermes/hermes-agent && npm install")
from hermes_constants import display_hermes_home
_print_warning(f" npm install failed - run manually: cd {display_hermes_home()}/hermes-agent && npm install")
elif not node_modules.exists():
_print_warning(" Node.js not found - browser tools require: npm install (in hermes-agent directory)")
elif post_setup_key == "camofox":
camofox_dir = PROJECT_ROOT / "node_modules" / "@askjo" / "camoufox-browser"
if not camofox_dir.exists() and shutil.which("npm"):
_print_info(" Installing Camofox browser server...")
import subprocess
result = subprocess.run(
["npm", "install", "--silent"],
capture_output=True, text=True, cwd=str(PROJECT_ROOT)
)
if result.returncode == 0:
_print_success(" Camofox installed")
else:
_print_warning(" npm install failed - run manually: npm install")
if camofox_dir.exists():
_print_info(" Start the Camofox server:")
_print_info(" npx @askjo/camoufox-browser")
_print_info(" First run downloads the Camoufox engine (~300MB)")
_print_info(" Or use Docker: docker run -p 9377:9377 -e CAMOFOX_PORT=9377 jo-inc/camofox-browser")
elif not shutil.which("npm"):
_print_warning(" Node.js not found. Install Camofox via Docker:")
_print_info(" docker run -p 9377:9377 -e CAMOFOX_PORT=9377 jo-inc/camofox-browser")
elif post_setup_key == "rl_training":
try:
__import__("tinker_atropos")
@ -471,7 +561,7 @@ def _get_platform_tools(
# MCP servers are expected to be available on all platforms by default.
# If the platform explicitly lists one or more MCP server names, treat that
# as an allowlist. Otherwise include every globally enabled MCP server.
mcp_servers = config.get("mcp_servers", {})
mcp_servers = config.get("mcp_servers") or {}
enabled_mcp_servers = {
name
for name, server_cfg in mcp_servers.items()
@ -533,8 +623,11 @@ def _save_platform_tools(config: dict, platform: str, enabled_toolset_keys: Set[
save_config(config)
def _toolset_has_keys(ts_key: str) -> bool:
def _toolset_has_keys(ts_key: str, config: dict = None) -> bool:
"""Check if a toolset's required API keys are configured."""
if config is None:
config = load_config()
if ts_key == "vision":
try:
from agent.auxiliary_client import resolve_vision_provider_client
@ -544,12 +637,20 @@ def _toolset_has_keys(ts_key: str) -> bool:
except Exception:
return False
if ts_key in {"web", "image_gen", "tts", "browser"}:
features = get_nous_subscription_features(config)
feature = features.features.get(ts_key)
if feature and (feature.available or feature.managed_by_nous):
return True
# Check TOOL_CATEGORIES first (provider-aware)
cat = TOOL_CATEGORIES.get(ts_key)
if cat:
for provider in cat.get("providers", []):
for provider in _visible_providers(cat, config):
env_vars = provider.get("env_vars", [])
if env_vars and all(get_env_value(e["key"]) for e in env_vars):
if not env_vars:
return True # No-key provider (e.g. Local Browser, Edge TTS)
if all(get_env_value(e["key"]) for e in env_vars):
return True
return False
@ -643,9 +744,61 @@ def _prompt_choice(question: str, choices: list, default: int = 0) -> int:
return default
# ─── Token Estimation ────────────────────────────────────────────────────────
# Module-level cache so discovery + tokenization runs at most once per process.
_tool_token_cache: Optional[Dict[str, int]] = None
def _estimate_tool_tokens() -> Dict[str, int]:
"""Return estimated token counts per individual tool name.
Uses tiktoken (cl100k_base) to count tokens in the JSON-serialised
OpenAI-format tool schema. Triggers tool discovery on first call,
then caches the result for the rest of the process.
Returns an empty dict when tiktoken or the registry is unavailable.
"""
global _tool_token_cache
if _tool_token_cache is not None:
return _tool_token_cache
try:
import tiktoken
enc = tiktoken.get_encoding("cl100k_base")
except Exception:
logger.debug("tiktoken unavailable; skipping tool token estimation")
_tool_token_cache = {}
return _tool_token_cache
try:
# Trigger full tool discovery (imports all tool modules).
import model_tools # noqa: F401
from tools.registry import registry
except Exception:
logger.debug("Tool registry unavailable; skipping token estimation")
_tool_token_cache = {}
return _tool_token_cache
counts: Dict[str, int] = {}
for name in registry.get_all_tool_names():
schema = registry.get_schema(name)
if schema:
# Mirror what gets sent to the API:
# {"type": "function", "function": <schema>}
text = _json.dumps({"type": "function", "function": schema})
counts[name] = len(enc.encode(text))
_tool_token_cache = counts
return _tool_token_cache
def _prompt_toolset_checklist(platform_label: str, enabled: Set[str]) -> Set[str]:
"""Multi-select checklist of toolsets. Returns set of selected toolset keys."""
from hermes_cli.curses_ui import curses_checklist
from toolsets import resolve_toolset
# Pre-compute per-tool token counts (cached after first call).
tool_tokens = _estimate_tool_tokens()
effective = _get_effective_configurable_toolsets()
@ -661,11 +814,27 @@ def _prompt_toolset_checklist(platform_label: str, enabled: Set[str]) -> Set[str
if ts_key in enabled
}
# Build a live status function that shows deduplicated total token cost.
status_fn = None
if tool_tokens:
ts_keys = [ts_key for ts_key, _, _ in effective]
def status_fn(chosen: set) -> str:
# Collect unique tool names across all selected toolsets
all_tools: set = set()
for idx in chosen:
all_tools.update(resolve_toolset(ts_keys[idx]))
total = sum(tool_tokens.get(name, 0) for name in all_tools)
if total >= 1000:
return f"Est. tool context: ~{total / 1000:.1f}k tokens"
return f"Est. tool context: ~{total} tokens"
chosen = curses_checklist(
f"Tools for {platform_label}",
labels,
pre_selected,
cancel_returns=pre_selected,
status_fn=status_fn,
)
return {effective[i][0] for i in chosen}
@ -687,11 +856,45 @@ def _configure_toolset(ts_key: str, config: dict):
_configure_simple_requirements(ts_key)
def _visible_providers(cat: dict, config: dict) -> list[dict]:
"""Return provider entries visible for the current auth/config state."""
features = get_nous_subscription_features(config)
visible = []
for provider in cat.get("providers", []):
if provider.get("managed_nous_feature") and not managed_nous_tools_enabled():
continue
if provider.get("requires_nous_auth") and not features.nous_auth_present:
continue
visible.append(provider)
return visible
def _toolset_needs_configuration_prompt(ts_key: str, config: dict) -> bool:
"""Return True when enabling this toolset should open provider setup."""
cat = TOOL_CATEGORIES.get(ts_key)
if not cat:
return not _toolset_has_keys(ts_key, config)
if ts_key == "tts":
tts_cfg = config.get("tts", {})
return not isinstance(tts_cfg, dict) or "provider" not in tts_cfg
if ts_key == "web":
web_cfg = config.get("web", {})
return not isinstance(web_cfg, dict) or "backend" not in web_cfg
if ts_key == "browser":
browser_cfg = config.get("browser", {})
return not isinstance(browser_cfg, dict) or "cloud_provider" not in browser_cfg
if ts_key == "image_gen":
return not get_env_value("FAL_KEY")
return not _toolset_has_keys(ts_key, config)
def _configure_tool_category(ts_key: str, cat: dict, config: dict):
"""Configure a tool category with provider selection."""
icon = cat.get("icon", "")
name = cat["name"]
providers = cat["providers"]
providers = _visible_providers(cat, config)
# Check Python version requirement
if cat.get("requires_python"):
@ -756,6 +959,27 @@ def _configure_tool_category(ts_key: str, cat: dict, config: dict):
def _is_provider_active(provider: dict, config: dict) -> bool:
"""Check if a provider entry matches the currently active config."""
managed_feature = provider.get("managed_nous_feature")
if managed_feature:
features = get_nous_subscription_features(config)
feature = features.features.get(managed_feature)
if feature is None:
return False
if managed_feature == "image_gen":
return feature.managed_by_nous
if provider.get("tts_provider"):
return (
feature.managed_by_nous
and config.get("tts", {}).get("provider") == provider["tts_provider"]
)
if "browser_provider" in provider:
current = config.get("browser", {}).get("cloud_provider")
return feature.managed_by_nous and provider["browser_provider"] == current
if provider.get("web_backend"):
current = config.get("web", {}).get("backend")
return feature.managed_by_nous and current == provider["web_backend"]
return feature.managed_by_nous
if provider.get("tts_provider"):
return config.get("tts", {}).get("provider") == provider["tts_provider"]
if "browser_provider" in provider:
@ -782,6 +1006,13 @@ def _detect_active_provider_index(providers: list, config: dict) -> int:
def _configure_provider(provider: dict, config: dict):
"""Configure a single provider - prompt for API keys and set config."""
env_vars = provider.get("env_vars", [])
managed_feature = provider.get("managed_nous_feature")
if provider.get("requires_nous_auth"):
features = get_nous_subscription_features(config)
if not features.nous_auth_present:
_print_warning(" Nous Subscription is only available after logging into Nous Portal.")
return
# Set TTS provider in config if applicable
if provider.get("tts_provider"):
@ -790,11 +1021,12 @@ def _configure_provider(provider: dict, config: dict):
# Set browser cloud provider in config if applicable
if "browser_provider" in provider:
bp = provider["browser_provider"]
if bp:
if bp == "local":
config.setdefault("browser", {})["cloud_provider"] = "local"
_print_success(" Browser set to local mode")
elif bp:
config.setdefault("browser", {})["cloud_provider"] = bp
_print_success(f" Browser cloud provider set to: {bp}")
else:
config.get("browser", {}).pop("cloud_provider", None)
# Set web search backend in config if applicable
if provider.get("web_backend"):
@ -802,7 +1034,16 @@ def _configure_provider(provider: dict, config: dict):
_print_success(f" Web backend set to: {provider['web_backend']}")
if not env_vars:
if provider.get("post_setup"):
_run_post_setup(provider["post_setup"])
_print_success(f" {provider['name']} - no configuration needed!")
if managed_feature:
_print_info(" Requests for this tool will be billed to your Nous subscription.")
override_envs = provider.get("override_env_vars", [])
if any(get_env_value(env_var) for env_var in override_envs):
_print_warning(
" Direct credentials are still configured and may take precedence until you remove them from ~/.hermes/.env."
)
return
# Prompt for each required env var
@ -865,8 +1106,13 @@ def _configure_simple_requirements(ts_key: str):
key_label = " OPENAI_API_KEY" if "api.openai.com" in base_url.lower() else " API key"
api_key = _prompt(key_label, password=True)
if api_key and api_key.strip():
save_env_value("OPENAI_BASE_URL", base_url)
save_env_value("OPENAI_API_KEY", api_key.strip())
# Save vision base URL to config (not .env — only secrets go there)
from hermes_cli.config import load_config, save_config
_cfg = load_config()
_aux = _cfg.setdefault("auxiliary", {}).setdefault("vision", {})
_aux["base_url"] = base_url
save_config(_cfg)
if "api.openai.com" in base_url.lower():
save_env_value("AUXILIARY_VISION_MODEL", "gpt-4o-mini")
_print_success(" Saved")
@ -905,7 +1151,7 @@ def _reconfigure_tool(config: dict):
cat = TOOL_CATEGORIES.get(ts_key)
reqs = TOOLSET_ENV_REQUIREMENTS.get(ts_key)
if cat or reqs:
if _toolset_has_keys(ts_key):
if _toolset_has_keys(ts_key, config):
configurable.append((ts_key, ts_label))
if not configurable:
@ -935,7 +1181,7 @@ def _configure_tool_category_for_reconfig(ts_key: str, cat: dict, config: dict):
"""Reconfigure a tool category - provider selection + API key update."""
icon = cat.get("icon", "")
name = cat["name"]
providers = cat["providers"]
providers = _visible_providers(cat, config)
if len(providers) == 1:
provider = providers[0]
@ -970,6 +1216,13 @@ def _configure_tool_category_for_reconfig(ts_key: str, cat: dict, config: dict):
def _reconfigure_provider(provider: dict, config: dict):
"""Reconfigure a provider - update API keys."""
env_vars = provider.get("env_vars", [])
managed_feature = provider.get("managed_nous_feature")
if provider.get("requires_nous_auth"):
features = get_nous_subscription_features(config)
if not features.nous_auth_present:
_print_warning(" Nous Subscription is only available after logging into Nous Portal.")
return
if provider.get("tts_provider"):
config.setdefault("tts", {})["provider"] = provider["tts_provider"]
@ -977,12 +1230,12 @@ def _reconfigure_provider(provider: dict, config: dict):
if "browser_provider" in provider:
bp = provider["browser_provider"]
if bp:
if bp == "local":
config.setdefault("browser", {})["cloud_provider"] = "local"
_print_success(" Browser set to local mode")
elif bp:
config.setdefault("browser", {})["cloud_provider"] = bp
_print_success(f" Browser cloud provider set to: {bp}")
else:
config.get("browser", {}).pop("cloud_provider", None)
_print_success(" Browser set to local mode")
# Set web search backend in config if applicable
if provider.get("web_backend"):
@ -990,7 +1243,16 @@ def _reconfigure_provider(provider: dict, config: dict):
_print_success(f" Web backend set to: {provider['web_backend']}")
if not env_vars:
if provider.get("post_setup"):
_run_post_setup(provider["post_setup"])
_print_success(f" {provider['name']} - no configuration needed!")
if managed_feature:
_print_info(" Requests for this tool will be billed to your Nous subscription.")
override_envs = provider.get("override_env_vars", [])
if any(get_env_value(env_var) for env_var in override_envs):
_print_warning(
" Direct credentials are still configured and may take precedence until you remove them from ~/.hermes/.env."
)
return
for var in env_vars:
@ -1099,13 +1361,23 @@ def tools_command(args=None, first_install: bool = False, config: dict = None):
label = next((l for k, l, _ in _get_effective_configurable_toolsets() if k == ts), ts)
print(color(f" - {label}", Colors.RED))
auto_configured = apply_nous_managed_defaults(
config,
enabled_toolsets=new_enabled,
)
if managed_nous_tools_enabled():
for ts_key in sorted(auto_configured):
label = next((l for k, l, _ in CONFIGURABLE_TOOLSETS if k == ts_key), ts_key)
print(color(f"{label}: using your Nous subscription defaults", Colors.GREEN))
# Walk through ALL selected tools that have provider options or
# need API keys. This ensures browser (Local vs Browserbase),
# TTS (Edge vs OpenAI vs ElevenLabs), etc. are shown even when
# a free provider exists.
to_configure = [
ts_key for ts_key in sorted(new_enabled)
if TOOL_CATEGORIES.get(ts_key) or TOOLSET_ENV_REQUIREMENTS.get(ts_key)
if (TOOL_CATEGORIES.get(ts_key) or TOOLSET_ENV_REQUIREMENTS.get(ts_key))
and ts_key not in auto_configured
]
if to_configure:
@ -1198,7 +1470,7 @@ def tools_command(args=None, first_install: bool = False, config: dict = None):
# Configure API keys for newly enabled tools
for ts_key in sorted(added):
if (TOOL_CATEGORIES.get(ts_key) or TOOLSET_ENV_REQUIREMENTS.get(ts_key)):
if not _toolset_has_keys(ts_key):
if _toolset_needs_configuration_prompt(ts_key, config):
_configure_toolset(ts_key, config)
_save_platform_tools(config, pk, new_enabled)
save_config(config)
@ -1238,7 +1510,7 @@ def tools_command(args=None, first_install: bool = False, config: dict = None):
# Configure newly enabled toolsets that need API keys
for ts_key in sorted(added):
if (TOOL_CATEGORIES.get(ts_key) or TOOLSET_ENV_REQUIREMENTS.get(ts_key)):
if not _toolset_has_keys(ts_key):
if _toolset_needs_configuration_prompt(ts_key, config):
_configure_toolset(ts_key, config)
_save_platform_tools(config, pkey, new_enabled)
@ -1255,7 +1527,8 @@ def tools_command(args=None, first_install: bool = False, config: dict = None):
platform_choices[idx] = f"Configure {pinfo['label']} ({new_count}/{total} enabled)"
print()
print(color(" Tool configuration saved to ~/.hermes/config.yaml", Colors.DIM))
from hermes_constants import display_hermes_home
print(color(f" Tool configuration saved to {display_hermes_home()}/config.yaml", Colors.DIM))
print(color(" Changes take effect on next 'hermes' or gateway restart.", Colors.DIM))
print()

260
hermes_cli/webhook.py Normal file
View File

@ -0,0 +1,260 @@
"""hermes webhook — manage dynamic webhook subscriptions from the CLI.
Usage:
hermes webhook subscribe <name> [options]
hermes webhook list
hermes webhook remove <name>
hermes webhook test <name> [--payload '{"key": "value"}']
Subscriptions persist to ~/.hermes/webhook_subscriptions.json and are
hot-reloaded by the webhook adapter without a gateway restart.
"""
import json
import os
import re
import secrets
import time
from pathlib import Path
from typing import Dict, Optional
from hermes_constants import display_hermes_home
_SUBSCRIPTIONS_FILENAME = "webhook_subscriptions.json"
def _hermes_home() -> Path:
return Path(
os.getenv("HERMES_HOME", str(Path.home() / ".hermes"))
).expanduser()
def _subscriptions_path() -> Path:
return _hermes_home() / _SUBSCRIPTIONS_FILENAME
def _load_subscriptions() -> Dict[str, dict]:
path = _subscriptions_path()
if not path.exists():
return {}
try:
data = json.loads(path.read_text(encoding="utf-8"))
return data if isinstance(data, dict) else {}
except Exception:
return {}
def _save_subscriptions(subs: Dict[str, dict]) -> None:
path = _subscriptions_path()
path.parent.mkdir(parents=True, exist_ok=True)
tmp_path = path.with_suffix(".tmp")
tmp_path.write_text(
json.dumps(subs, indent=2, ensure_ascii=False),
encoding="utf-8",
)
os.replace(str(tmp_path), str(path))
def _get_webhook_config() -> dict:
"""Load webhook platform config. Returns {} if not configured."""
try:
from hermes_cli.config import load_config
cfg = load_config()
return cfg.get("platforms", {}).get("webhook", {})
except Exception:
return {}
def _is_webhook_enabled() -> bool:
return bool(_get_webhook_config().get("enabled"))
def _get_webhook_base_url() -> str:
wh = _get_webhook_config().get("extra", {})
host = wh.get("host", "0.0.0.0")
port = wh.get("port", 8644)
display_host = "localhost" if host == "0.0.0.0" else host
return f"http://{display_host}:{port}"
def _setup_hint() -> str:
_dhh = display_hermes_home()
return f"""
Webhook platform is not enabled. To set it up:
1. Run the gateway setup wizard:
hermes gateway setup
2. Or manually add to {_dhh}/config.yaml:
platforms:
webhook:
enabled: true
extra:
host: "0.0.0.0"
port: 8644
secret: "your-global-hmac-secret"
3. Or set environment variables in {_dhh}/.env:
WEBHOOK_ENABLED=true
WEBHOOK_PORT=8644
WEBHOOK_SECRET=your-global-secret
Then start the gateway: hermes gateway run
"""
def _require_webhook_enabled() -> bool:
"""Check webhook is enabled. Print setup guide and return False if not."""
if _is_webhook_enabled():
return True
print(_setup_hint())
return False
def webhook_command(args):
"""Entry point for 'hermes webhook' subcommand."""
sub = getattr(args, "webhook_action", None)
if not sub:
print("Usage: hermes webhook {subscribe|list|remove|test}")
print("Run 'hermes webhook --help' for details.")
return
if not _require_webhook_enabled():
return
if sub in ("subscribe", "add"):
_cmd_subscribe(args)
elif sub in ("list", "ls"):
_cmd_list(args)
elif sub in ("remove", "rm"):
_cmd_remove(args)
elif sub == "test":
_cmd_test(args)
def _cmd_subscribe(args):
name = args.name.strip().lower().replace(" ", "-")
if not re.match(r'^[a-z0-9][a-z0-9_-]*$', name):
print(f"Error: Invalid name '{name}'. Use lowercase alphanumeric with hyphens/underscores.")
return
subs = _load_subscriptions()
is_update = name in subs
secret = args.secret or secrets.token_urlsafe(32)
events = [e.strip() for e in args.events.split(",")] if args.events else []
route = {
"description": args.description or f"Agent-created subscription: {name}",
"events": events,
"secret": secret,
"prompt": args.prompt or "",
"skills": [s.strip() for s in args.skills.split(",")] if args.skills else [],
"deliver": args.deliver or "log",
"created_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
}
if args.deliver_chat_id:
route["deliver_extra"] = {"chat_id": args.deliver_chat_id}
subs[name] = route
_save_subscriptions(subs)
base_url = _get_webhook_base_url()
status = "Updated" if is_update else "Created"
print(f"\n {status} webhook subscription: {name}")
print(f" URL: {base_url}/webhooks/{name}")
print(f" Secret: {secret}")
if events:
print(f" Events: {', '.join(events)}")
else:
print(" Events: (all)")
print(f" Deliver: {route['deliver']}")
if route.get("prompt"):
prompt_preview = route["prompt"][:80] + ("..." if len(route["prompt"]) > 80 else "")
print(f" Prompt: {prompt_preview}")
print(f"\n Configure your service to POST to the URL above.")
print(f" Use the secret for HMAC-SHA256 signature validation.")
print(f" The gateway must be running to receive events (hermes gateway run).\n")
def _cmd_list(args):
subs = _load_subscriptions()
if not subs:
print(" No dynamic webhook subscriptions.")
print(" Create one with: hermes webhook subscribe <name>")
return
base_url = _get_webhook_base_url()
print(f"\n {len(subs)} webhook subscription(s):\n")
for name, route in subs.items():
events = ", ".join(route.get("events", [])) or "(all)"
deliver = route.get("deliver", "log")
desc = route.get("description", "")
print(f"{name}")
if desc:
print(f" {desc}")
print(f" URL: {base_url}/webhooks/{name}")
print(f" Events: {events}")
print(f" Deliver: {deliver}")
print()
def _cmd_remove(args):
name = args.name.strip().lower()
subs = _load_subscriptions()
if name not in subs:
print(f" No subscription named '{name}'.")
print(" Note: Static routes from config.yaml cannot be removed here.")
return
del subs[name]
_save_subscriptions(subs)
print(f" Removed webhook subscription: {name}")
def _cmd_test(args):
"""Send a test POST to a webhook route."""
name = args.name.strip().lower()
subs = _load_subscriptions()
if name not in subs:
print(f" No subscription named '{name}'.")
return
route = subs[name]
secret = route.get("secret", "")
base_url = _get_webhook_base_url()
url = f"{base_url}/webhooks/{name}"
payload = args.payload or '{"test": true, "event_type": "test", "message": "Hello from hermes webhook test"}'
import hmac
import hashlib
sig = "sha256=" + hmac.new(
secret.encode(), payload.encode(), hashlib.sha256
).hexdigest()
print(f" Sending test POST to {url}")
try:
import urllib.request
req = urllib.request.Request(
url,
data=payload.encode(),
headers={
"Content-Type": "application/json",
"X-Hub-Signature-256": sig,
"X-GitHub-Event": "test",
},
method="POST",
)
with urllib.request.urlopen(req, timeout=10) as resp:
body = resp.read().decode()
print(f" Response ({resp.status}): {body}")
except Exception as e:
print(f" Error: {e}")
print(" Is the gateway running? (hermes gateway run)")

View File

@ -17,6 +17,61 @@ def get_hermes_home() -> Path:
return Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
def get_optional_skills_dir(default: Path | None = None) -> Path:
"""Return the optional-skills directory, honoring package-manager wrappers.
Packaged installs may ship ``optional-skills`` outside the Python package
tree and expose it via ``HERMES_OPTIONAL_SKILLS``.
"""
override = os.getenv("HERMES_OPTIONAL_SKILLS", "").strip()
if override:
return Path(override)
if default is not None:
return default
return get_hermes_home() / "optional-skills"
def get_hermes_dir(new_subpath: str, old_name: str) -> Path:
"""Resolve a Hermes subdirectory with backward compatibility.
New installs get the consolidated layout (e.g. ``cache/images``).
Existing installs that already have the old path (e.g. ``image_cache``)
keep using it — no migration required.
Args:
new_subpath: Preferred path relative to HERMES_HOME (e.g. ``"cache/images"``).
old_name: Legacy path relative to HERMES_HOME (e.g. ``"image_cache"``).
Returns:
Absolute ``Path`` — old location if it exists on disk, otherwise the new one.
"""
home = get_hermes_home()
old_path = home / old_name
if old_path.exists():
return old_path
return home / new_subpath
def display_hermes_home() -> str:
"""Return a user-friendly display string for the current HERMES_HOME.
Uses ``~/`` shorthand for readability::
default: ``~/.hermes``
profile: ``~/.hermes/profiles/coder``
custom: ``/opt/hermes-custom``
Use this in **user-facing** print/log messages instead of hardcoding
``~/.hermes``. For code that needs a real ``Path``, use
:func:`get_hermes_home` instead.
"""
home = get_hermes_home()
try:
return "~/" + str(home.relative_to(Path.home()))
except ValueError:
return str(home)
VALID_REASONING_EFFORTS = ("xhigh", "high", "medium", "low", "minimal")

View File

@ -1009,8 +1009,9 @@ class SessionDB:
Strategy:
- Preserve properly paired quoted phrases (``"exact phrase"``)
- Strip unmatched FTS5-special characters that would cause errors
- Wrap unquoted hyphenated terms in quotes so FTS5 matches them
as exact phrases instead of splitting on the hyphen
- Wrap unquoted hyphenated and dotted terms in quotes so FTS5
matches them as exact phrases instead of splitting on the
hyphen/dot (e.g. ``chat-send``, ``P2.2``, ``my-app.config.ts``)
"""
# Step 1: Extract balanced double-quoted phrases and protect them
# from further processing via numbered placeholders.
@ -1035,11 +1036,13 @@ class SessionDB:
sanitized = re.sub(r"(?i)^(AND|OR|NOT)\b\s*", "", sanitized.strip())
sanitized = re.sub(r"(?i)\s+(AND|OR|NOT)\s*$", "", sanitized.strip())
# Step 5: Wrap unquoted hyphenated terms (e.g. ``chat-send``) in
# double quotes. FTS5's tokenizer splits on hyphens, turning
# ``chat-send`` into ``chat AND send``. Quoting preserves the
# intended phrase match.
sanitized = re.sub(r"\b(\w+(?:-\w+)+)\b", r'"\1"', sanitized)
# Step 5: Wrap unquoted dotted and/or hyphenated terms in double
# quotes. FTS5's tokenizer splits on dots and hyphens, turning
# ``chat-send`` into ``chat AND send`` and ``P2.2`` into ``p2 AND 2``.
# Quoting preserves phrase semantics. A single pass avoids the
# double-quoting bug that would occur if dotted and hyphenated
# patterns were applied sequentially (e.g. ``my-app.config``).
sanitized = re.sub(r"\b(\w+(?:[.-]\w+)+)\b", r'"\1"', sanitized)
# Step 6: Restore preserved quoted phrases
for i, quoted in enumerate(_quoted_parts):

View File

@ -1,9 +0,0 @@
"""Honcho integration for AI-native memory.
This package is only active when honcho.enabled=true in config and
HONCHO_API_KEY is set. All honcho-ai imports are deferred to avoid
ImportError when the package is not installed.
Named ``honcho_integration`` (not ``honcho``) to avoid shadowing the
``honcho`` package installed by the ``honcho-ai`` SDK.
"""

868
mcp_serve.py Normal file
View File

@ -0,0 +1,868 @@
"""
Hermes MCP Server — expose messaging conversations as MCP tools.
Starts a stdio MCP server that lets any MCP client (Claude Code, Cursor, Codex,
etc.) list conversations, read message history, send messages, poll for live
events, and manage approval requests across all connected platforms.
Matches OpenClaw's 9-tool MCP channel bridge surface:
conversations_list, conversation_get, messages_read, attachments_fetch,
events_poll, events_wait, messages_send, permissions_list_open,
permissions_respond
Plus: channels_list (Hermes-specific extra)
Usage:
hermes mcp serve
hermes mcp serve --verbose
MCP client config (e.g. claude_desktop_config.json):
{
"mcpServers": {
"hermes": {
"command": "hermes",
"args": ["mcp", "serve"]
}
}
}
"""
from __future__ import annotations
import json
import logging
import os
import re
import sys
import threading
import time
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List, Optional
logger = logging.getLogger("hermes.mcp_serve")
# ---------------------------------------------------------------------------
# Lazy MCP SDK import
# ---------------------------------------------------------------------------
_MCP_SERVER_AVAILABLE = False
try:
from mcp.server.fastmcp import FastMCP
_MCP_SERVER_AVAILABLE = True
except ImportError:
FastMCP = None # type: ignore[assignment,misc]
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _get_sessions_dir() -> Path:
"""Return the sessions directory using HERMES_HOME."""
try:
from hermes_constants import get_hermes_home
return get_hermes_home() / "sessions"
except ImportError:
return Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes")) / "sessions"
def _get_session_db():
"""Get a SessionDB instance for reading message transcripts."""
try:
from hermes_state import SessionDB
return SessionDB()
except Exception as e:
logger.debug("SessionDB unavailable: %s", e)
return None
def _load_sessions_index() -> dict:
"""Load the gateway sessions.json index directly.
Returns a dict of session_key -> entry_dict with platform routing info.
This avoids importing the full SessionStore which needs GatewayConfig.
"""
sessions_file = _get_sessions_dir() / "sessions.json"
if not sessions_file.exists():
return {}
try:
with open(sessions_file, "r", encoding="utf-8") as f:
return json.load(f)
except Exception as e:
logger.debug("Failed to load sessions.json: %s", e)
return {}
def _load_channel_directory() -> dict:
"""Load the cached channel directory for available targets."""
try:
from hermes_constants import get_hermes_home
directory_file = get_hermes_home() / "channel_directory.json"
except ImportError:
directory_file = Path(
os.environ.get("HERMES_HOME", Path.home() / ".hermes")
) / "channel_directory.json"
if not directory_file.exists():
return {}
try:
with open(directory_file, "r", encoding="utf-8") as f:
return json.load(f)
except Exception as e:
logger.debug("Failed to load channel_directory.json: %s", e)
return {}
def _extract_message_content(msg: dict) -> str:
"""Extract text content from a message, handling multi-part content."""
content = msg.get("content", "")
if isinstance(content, list):
text_parts = [
p.get("text", "") for p in content
if isinstance(p, dict) and p.get("type") == "text"
]
return "\n".join(text_parts)
return str(content) if content else ""
def _extract_attachments(msg: dict) -> List[dict]:
"""Extract non-text attachments from a message.
Finds: multi-part image/file content blocks, MEDIA: tags in text,
image URLs, and file references.
"""
attachments = []
content = msg.get("content", "")
# Multi-part content blocks (image_url, file, etc.)
if isinstance(content, list):
for part in content:
if not isinstance(part, dict):
continue
ptype = part.get("type", "")
if ptype == "image_url":
url = part.get("image_url", {}).get("url", "") if isinstance(part.get("image_url"), dict) else ""
if url:
attachments.append({"type": "image", "url": url})
elif ptype == "image":
url = part.get("url", part.get("source", {}).get("url", ""))
if url:
attachments.append({"type": "image", "url": url})
elif ptype not in ("text",):
# Unknown non-text content type
attachments.append({"type": ptype, "data": part})
# MEDIA: tags in text content
text = _extract_message_content(msg)
if text:
media_pattern = re.compile(r'MEDIA:\s*(\S+)')
for match in media_pattern.finditer(text):
path = match.group(1)
attachments.append({"type": "media", "path": path})
return attachments
# ---------------------------------------------------------------------------
# Event Bridge — polls SessionDB for new messages, maintains event queue
# ---------------------------------------------------------------------------
QUEUE_LIMIT = 1000
POLL_INTERVAL = 0.2 # seconds between DB polls (200ms)
@dataclass
class QueueEvent:
"""An event in the bridge's in-memory queue."""
cursor: int
type: str # "message", "approval_requested", "approval_resolved"
session_key: str = ""
data: dict = field(default_factory=dict)
class EventBridge:
"""Background poller that watches SessionDB for new messages and
maintains an in-memory event queue with waiter support.
This is the Hermes equivalent of OpenClaw's WebSocket gateway bridge.
Instead of WebSocket events, we poll the SQLite database for changes.
"""
def __init__(self):
self._queue: List[QueueEvent] = []
self._cursor = 0
self._lock = threading.Lock()
self._new_event = threading.Event()
self._running = False
self._thread: Optional[threading.Thread] = None
self._last_poll_timestamps: Dict[str, float] = {} # session_key -> unix timestamp
# In-memory approval tracking (populated from events)
self._pending_approvals: Dict[str, dict] = {}
# mtime cache — skip expensive work when files haven't changed
self._sessions_json_mtime: float = 0.0
self._state_db_mtime: float = 0.0
self._cached_sessions_index: dict = {}
def start(self):
"""Start the background polling thread."""
if self._running:
return
self._running = True
self._thread = threading.Thread(target=self._poll_loop, daemon=True)
self._thread.start()
logger.debug("EventBridge started")
def stop(self):
"""Stop the background polling thread."""
self._running = False
self._new_event.set() # Wake any waiters
if self._thread:
self._thread.join(timeout=5)
logger.debug("EventBridge stopped")
def poll_events(
self,
after_cursor: int = 0,
session_key: Optional[str] = None,
limit: int = 20,
) -> dict:
"""Return events since after_cursor, optionally filtered by session_key."""
with self._lock:
events = [
e for e in self._queue
if e.cursor > after_cursor
and (not session_key or e.session_key == session_key)
][:limit]
next_cursor = events[-1].cursor if events else after_cursor
return {
"events": [
{"cursor": e.cursor, "type": e.type,
"session_key": e.session_key, **e.data}
for e in events
],
"next_cursor": next_cursor,
}
def wait_for_event(
self,
after_cursor: int = 0,
session_key: Optional[str] = None,
timeout_ms: int = 30000,
) -> Optional[dict]:
"""Block until a matching event arrives or timeout expires."""
deadline = time.monotonic() + (timeout_ms / 1000.0)
while time.monotonic() < deadline:
with self._lock:
for e in self._queue:
if e.cursor > after_cursor and (
not session_key or e.session_key == session_key
):
return {
"cursor": e.cursor, "type": e.type,
"session_key": e.session_key, **e.data,
}
remaining = deadline - time.monotonic()
if remaining <= 0:
break
self._new_event.clear()
self._new_event.wait(timeout=min(remaining, POLL_INTERVAL))
return None
def list_pending_approvals(self) -> List[dict]:
"""List approval requests observed during this bridge session."""
with self._lock:
return sorted(
self._pending_approvals.values(),
key=lambda a: a.get("created_at", ""),
)
def respond_to_approval(self, approval_id: str, decision: str) -> dict:
"""Resolve a pending approval (best-effort without gateway IPC)."""
with self._lock:
approval = self._pending_approvals.pop(approval_id, None)
if not approval:
return {"error": f"Approval not found: {approval_id}"}
self._enqueue(QueueEvent(
cursor=0, # Will be set by _enqueue
type="approval_resolved",
session_key=approval.get("session_key", ""),
data={"approval_id": approval_id, "decision": decision},
))
return {"resolved": True, "approval_id": approval_id, "decision": decision}
def _enqueue(self, event: QueueEvent) -> None:
"""Add an event to the queue and wake any waiters."""
with self._lock:
self._cursor += 1
event.cursor = self._cursor
self._queue.append(event)
# Trim queue to limit
while len(self._queue) > QUEUE_LIMIT:
self._queue.pop(0)
self._new_event.set()
def _poll_loop(self):
"""Background loop: poll SessionDB for new messages."""
db = _get_session_db()
if not db:
logger.warning("EventBridge: SessionDB unavailable, event polling disabled")
return
while self._running:
try:
self._poll_once(db)
except Exception as e:
logger.debug("EventBridge poll error: %s", e)
time.sleep(POLL_INTERVAL)
def _poll_once(self, db):
"""Check for new messages across all sessions.
Uses mtime checks on sessions.json and state.db to skip work
when nothing has changed — makes 200ms polling essentially free.
"""
# Check if sessions.json has changed (mtime check is ~1μs)
sessions_file = _get_sessions_dir() / "sessions.json"
try:
sj_mtime = sessions_file.stat().st_mtime if sessions_file.exists() else 0.0
except OSError:
sj_mtime = 0.0
if sj_mtime != self._sessions_json_mtime:
self._sessions_json_mtime = sj_mtime
self._cached_sessions_index = _load_sessions_index()
# Check if state.db has changed
try:
from hermes_constants import get_hermes_home
db_file = get_hermes_home() / "state.db"
except ImportError:
db_file = Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes")) / "state.db"
try:
db_mtime = db_file.stat().st_mtime if db_file.exists() else 0.0
except OSError:
db_mtime = 0.0
if db_mtime == self._state_db_mtime and sj_mtime == self._sessions_json_mtime:
return # Nothing changed since last poll — skip entirely
self._state_db_mtime = db_mtime
entries = self._cached_sessions_index
for session_key, entry in entries.items():
session_id = entry.get("session_id", "")
if not session_id:
continue
last_seen = self._last_poll_timestamps.get(session_key, 0.0)
try:
messages = db.get_messages(session_id)
except Exception:
continue
if not messages:
continue
# Normalize timestamps to float for comparison
def _ts_float(ts) -> float:
if isinstance(ts, (int, float)):
return float(ts)
if isinstance(ts, str) and ts:
try:
return float(ts)
except ValueError:
# ISO string — parse to epoch
try:
from datetime import datetime
return datetime.fromisoformat(ts).timestamp()
except Exception:
return 0.0
return 0.0
# Find messages newer than our last seen timestamp
new_messages = []
for msg in messages:
ts = _ts_float(msg.get("timestamp", 0))
role = msg.get("role", "")
if role not in ("user", "assistant"):
continue
if ts > last_seen:
new_messages.append(msg)
for msg in new_messages:
content = _extract_message_content(msg)
if not content:
continue
self._enqueue(QueueEvent(
cursor=0,
type="message",
session_key=session_key,
data={
"role": msg.get("role", ""),
"content": content[:500],
"timestamp": str(msg.get("timestamp", "")),
"message_id": str(msg.get("id", "")),
},
))
# Update last seen to the most recent message timestamp
all_ts = [_ts_float(m.get("timestamp", 0)) for m in messages]
if all_ts:
latest = max(all_ts)
if latest > last_seen:
self._last_poll_timestamps[session_key] = latest
# ---------------------------------------------------------------------------
# MCP Server
# ---------------------------------------------------------------------------
def create_mcp_server(event_bridge: Optional[EventBridge] = None) -> "FastMCP":
"""Create and return the Hermes MCP server with all tools registered."""
if not _MCP_SERVER_AVAILABLE:
raise ImportError(
"MCP server requires the 'mcp' package. "
"Install with: pip install 'hermes-agent[mcp]'"
)
mcp = FastMCP(
"hermes",
instructions=(
"Hermes Agent messaging bridge. Use these tools to interact with "
"conversations across Telegram, Discord, Slack, WhatsApp, Signal, "
"Matrix, and other connected platforms."
),
)
bridge = event_bridge or EventBridge()
# -- conversations_list ------------------------------------------------
@mcp.tool()
def conversations_list(
platform: Optional[str] = None,
limit: int = 50,
search: Optional[str] = None,
) -> str:
"""List active messaging conversations across connected platforms.
Returns conversations with their session keys (needed for messages_read),
platform, chat type, display name, and last activity time.
Args:
platform: Filter by platform name (telegram, discord, slack, etc.)
limit: Maximum number of conversations to return (default 50)
search: Optional text to filter conversations by name
"""
entries = _load_sessions_index()
conversations = []
for key, entry in entries.items():
origin = entry.get("origin", {})
entry_platform = entry.get("platform") or origin.get("platform", "")
if platform and entry_platform.lower() != platform.lower():
continue
display_name = entry.get("display_name", "")
chat_name = origin.get("chat_name", "")
if search:
search_lower = search.lower()
if (search_lower not in display_name.lower()
and search_lower not in chat_name.lower()
and search_lower not in key.lower()):
continue
conversations.append({
"session_key": key,
"session_id": entry.get("session_id", ""),
"platform": entry_platform,
"chat_type": entry.get("chat_type", origin.get("chat_type", "")),
"display_name": display_name,
"chat_name": chat_name,
"user_name": origin.get("user_name", ""),
"updated_at": entry.get("updated_at", ""),
})
conversations.sort(key=lambda c: c.get("updated_at", ""), reverse=True)
conversations = conversations[:limit]
return json.dumps({
"count": len(conversations),
"conversations": conversations,
}, indent=2)
# -- conversation_get --------------------------------------------------
@mcp.tool()
def conversation_get(session_key: str) -> str:
"""Get detailed info about one conversation by its session key.
Args:
session_key: The session key from conversations_list
"""
entries = _load_sessions_index()
entry = entries.get(session_key)
if not entry:
return json.dumps({"error": f"Conversation not found: {session_key}"})
origin = entry.get("origin", {})
return json.dumps({
"session_key": session_key,
"session_id": entry.get("session_id", ""),
"platform": entry.get("platform") or origin.get("platform", ""),
"chat_type": entry.get("chat_type", origin.get("chat_type", "")),
"display_name": entry.get("display_name", ""),
"user_name": origin.get("user_name", ""),
"chat_name": origin.get("chat_name", ""),
"chat_id": origin.get("chat_id", ""),
"thread_id": origin.get("thread_id"),
"updated_at": entry.get("updated_at", ""),
"created_at": entry.get("created_at", ""),
"input_tokens": entry.get("input_tokens", 0),
"output_tokens": entry.get("output_tokens", 0),
"total_tokens": entry.get("total_tokens", 0),
}, indent=2)
# -- messages_read -----------------------------------------------------
@mcp.tool()
def messages_read(
session_key: str,
limit: int = 50,
) -> str:
"""Read recent messages from a conversation.
Returns the message history in chronological order with role, content,
and timestamp for each message.
Args:
session_key: The session key from conversations_list
limit: Maximum number of messages to return (default 50, most recent)
"""
entries = _load_sessions_index()
entry = entries.get(session_key)
if not entry:
return json.dumps({"error": f"Conversation not found: {session_key}"})
session_id = entry.get("session_id", "")
if not session_id:
return json.dumps({"error": "No session ID for this conversation"})
db = _get_session_db()
if not db:
return json.dumps({"error": "Session database unavailable"})
try:
all_messages = db.get_messages(session_id)
except Exception as e:
return json.dumps({"error": f"Failed to read messages: {e}"})
filtered = []
for msg in all_messages:
role = msg.get("role", "")
if role in ("user", "assistant"):
content = _extract_message_content(msg)
if content:
filtered.append({
"id": str(msg.get("id", "")),
"role": role,
"content": content[:2000],
"timestamp": msg.get("timestamp", ""),
})
messages = filtered[-limit:]
return json.dumps({
"session_key": session_key,
"count": len(messages),
"total_in_session": len(filtered),
"messages": messages,
}, indent=2)
# -- attachments_fetch -------------------------------------------------
@mcp.tool()
def attachments_fetch(
session_key: str,
message_id: str,
) -> str:
"""List non-text attachments for a message in a conversation.
Extracts images, media files, and other non-text content blocks
from the specified message.
Args:
session_key: The session key from conversations_list
message_id: The message ID from messages_read
"""
entries = _load_sessions_index()
entry = entries.get(session_key)
if not entry:
return json.dumps({"error": f"Conversation not found: {session_key}"})
session_id = entry.get("session_id", "")
if not session_id:
return json.dumps({"error": "No session ID for this conversation"})
db = _get_session_db()
if not db:
return json.dumps({"error": "Session database unavailable"})
try:
all_messages = db.get_messages(session_id)
except Exception as e:
return json.dumps({"error": f"Failed to read messages: {e}"})
# Find the target message
target_msg = None
for msg in all_messages:
if str(msg.get("id", "")) == message_id:
target_msg = msg
break
if not target_msg:
return json.dumps({"error": f"Message not found: {message_id}"})
attachments = _extract_attachments(target_msg)
return json.dumps({
"message_id": message_id,
"count": len(attachments),
"attachments": attachments,
}, indent=2)
# -- events_poll -------------------------------------------------------
@mcp.tool()
def events_poll(
after_cursor: int = 0,
session_key: Optional[str] = None,
limit: int = 20,
) -> str:
"""Poll for new conversation events since a cursor position.
Returns events that have occurred since the given cursor. Use the
returned next_cursor value for subsequent polls.
Event types: message, approval_requested, approval_resolved
Args:
after_cursor: Return events after this cursor (0 for all)
session_key: Optional filter to one conversation
limit: Maximum events to return (default 20)
"""
result = bridge.poll_events(
after_cursor=after_cursor,
session_key=session_key,
limit=limit,
)
return json.dumps(result, indent=2)
# -- events_wait -------------------------------------------------------
@mcp.tool()
def events_wait(
after_cursor: int = 0,
session_key: Optional[str] = None,
timeout_ms: int = 30000,
) -> str:
"""Wait for the next conversation event (long-poll).
Blocks until a matching event arrives or the timeout expires.
Use this for near-real-time event delivery without polling.
Args:
after_cursor: Wait for events after this cursor
session_key: Optional filter to one conversation
timeout_ms: Maximum wait time in milliseconds (default 30000)
"""
event = bridge.wait_for_event(
after_cursor=after_cursor,
session_key=session_key,
timeout_ms=min(timeout_ms, 300000), # Cap at 5 minutes
)
if event:
return json.dumps({"event": event}, indent=2)
return json.dumps({"event": None, "reason": "timeout"}, indent=2)
# -- messages_send -----------------------------------------------------
@mcp.tool()
def messages_send(
target: str,
message: str,
) -> str:
"""Send a message to a platform conversation.
The target format is "platform:chat_id" — same format used by the
channels_list tool. You can also use human-friendly channel names
that will be resolved automatically.
Examples:
target="telegram:6308981865"
target="discord:#general"
target="slack:#engineering"
Args:
target: Platform target in "platform:identifier" format
message: The message text to send
"""
if not target or not message:
return json.dumps({"error": "Both target and message are required"})
try:
from tools.send_message_tool import send_message_tool
result_str = send_message_tool(
{"action": "send", "target": target, "message": message}
)
return result_str
except ImportError:
return json.dumps({"error": "Send message tool not available"})
except Exception as e:
return json.dumps({"error": f"Send failed: {e}"})
# -- channels_list -----------------------------------------------------
@mcp.tool()
def channels_list(platform: Optional[str] = None) -> str:
"""List available messaging channels and targets across platforms.
Returns channels that you can send messages to. The target strings
returned here can be used directly with the messages_send tool.
Args:
platform: Filter by platform name (telegram, discord, slack, etc.)
"""
directory = _load_channel_directory()
if not directory:
entries = _load_sessions_index()
targets = []
seen = set()
for key, entry in entries.items():
origin = entry.get("origin", {})
p = entry.get("platform") or origin.get("platform", "")
chat_id = origin.get("chat_id", "")
if not p or not chat_id:
continue
if platform and p.lower() != platform.lower():
continue
target_str = f"{p}:{chat_id}"
if target_str in seen:
continue
seen.add(target_str)
targets.append({
"target": target_str,
"platform": p,
"name": entry.get("display_name") or origin.get("chat_name", ""),
"chat_type": entry.get("chat_type", origin.get("chat_type", "")),
})
return json.dumps({"count": len(targets), "channels": targets}, indent=2)
channels = []
for plat, entries_list in directory.items():
if platform and plat.lower() != platform.lower():
continue
if isinstance(entries_list, list):
for ch in entries_list:
if isinstance(ch, dict):
chat_id = ch.get("id", ch.get("chat_id", ""))
channels.append({
"target": f"{plat}:{chat_id}" if chat_id else plat,
"platform": plat,
"name": ch.get("name", ch.get("display_name", "")),
"chat_type": ch.get("type", ""),
})
return json.dumps({"count": len(channels), "channels": channels}, indent=2)
# -- permissions_list_open ---------------------------------------------
@mcp.tool()
def permissions_list_open() -> str:
"""List pending approval requests observed during this bridge session.
Returns exec and plugin approval requests that the bridge has seen
since it started. Approvals are live-session only — older approvals
from before the bridge connected are not included.
"""
approvals = bridge.list_pending_approvals()
return json.dumps({
"count": len(approvals),
"approvals": approvals,
}, indent=2)
# -- permissions_respond -----------------------------------------------
@mcp.tool()
def permissions_respond(
id: str,
decision: str,
) -> str:
"""Respond to a pending approval request.
Args:
id: The approval ID from permissions_list_open
decision: One of "allow-once", "allow-always", or "deny"
"""
if decision not in ("allow-once", "allow-always", "deny"):
return json.dumps({
"error": f"Invalid decision: {decision}. "
f"Must be allow-once, allow-always, or deny"
})
result = bridge.respond_to_approval(id, decision)
return json.dumps(result, indent=2)
return mcp
# ---------------------------------------------------------------------------
# Entry point
# ---------------------------------------------------------------------------
def run_mcp_server(verbose: bool = False) -> None:
"""Start the Hermes MCP server on stdio."""
if not _MCP_SERVER_AVAILABLE:
print(
"Error: MCP server requires the 'mcp' package.\n"
"Install with: pip install 'hermes-agent[mcp]'",
file=sys.stderr,
)
sys.exit(1)
if verbose:
logging.basicConfig(level=logging.DEBUG, stream=sys.stderr)
else:
logging.basicConfig(level=logging.WARNING, stream=sys.stderr)
bridge = EventBridge()
bridge.start()
server = create_mcp_server(event_bridge=bridge)
import asyncio
async def _run():
try:
await server.run_stdio_async()
finally:
bridge.stop()
try:
asyncio.run(_run())
except KeyboardInterrupt:
bridge.stop()

View File

@ -156,7 +156,7 @@ def _discover_tools():
"tools.delegate_tool",
"tools.process_registry",
"tools.send_message_tool",
"tools.honcho_tools",
# "tools.honcho_tools", # Removed — Honcho is now a memory provider plugin
"tools.homeassistant_tool",
]
import importlib
@ -252,7 +252,7 @@ def get_tool_definitions(
# Determine which tool names the caller wants
tools_to_include: set = set()
if enabled_toolsets:
if enabled_toolsets is not None:
for toolset_name in enabled_toolsets:
if validate_toolset(toolset_name):
resolved = resolve_toolset(toolset_name)
@ -371,8 +371,6 @@ def handle_function_call(
task_id: Optional[str] = None,
user_task: Optional[str] = None,
enabled_tools: Optional[List[str]] = None,
honcho_manager: Optional[Any] = None,
honcho_session_key: Optional[str] = None,
) -> str:
"""
Main function call dispatcher that routes calls to the tool registry.
@ -417,16 +415,12 @@ def handle_function_call(
function_name, function_args,
task_id=task_id,
enabled_tools=sandbox_enabled,
honcho_manager=honcho_manager,
honcho_session_key=honcho_session_key,
)
else:
result = registry.dispatch(
function_name, function_args,
task_id=task_id,
user_task=user_task,
honcho_manager=honcho_manager,
honcho_session_key=honcho_session_key,
)
try:

Some files were not shown because too many files have changed in this diff Show More