Tasks can now carry file attachments (PDFs, images, source docs) that
workers read directly — closes the gap where source material had to be
pasted as a path into the task body.
- kanban_db: task_attachments table (additive), Attachment dataclass,
add/list/get/delete accessors, attachments_root/task_attachments_dir
path helpers (per-board, HERMES_KANBAN_ATTACHMENTS_ROOT override)
- build_worker_context: surfaces each attachment's absolute path so the
worker (full file/terminal tool access) reads it via read_file/pdftotext
- dashboard API: POST/GET/DELETE attachment routes (multipart upload,
25MB cap, traversal-safe filenames, root-containment check on download)
- dashboard UI: Attachments section in the task drawer — upload button,
list with download, per-row remove
- docs + tests (13 cases: DB accessors, REST round-trip, traversal
rejection, collision suffixing, worker-context surfacing)
Closes#35338
* fix(file-tools): handle UTF-8 BOM in read_file / write_file / patch
Some Windows editors prepend an invisible UTF-8 BOM (U+FEFF) to text
files. We had no awareness of it, so: read_file surfaced a phantom
U+FEFF as the first character; patch matches against the true first
line could miss; and a write/patch round-trip silently stripped the
marker, changing the file's byte signature.
Now:
- read_file / read_file_raw strip a single leading BOM so the model
never sees it (only on the first chunk — the marker lives at byte 0).
- patch_replace strips the BOM before fuzzy-matching (so an exact
first-line match works) and its post-write verification compares
BOM-stripped content.
- write_file restores the BOM when the original file had one and the
new content doesn't, mirroring the existing line-ending preservation
(detect on disk via a cheap `head -c 3` probe or reuse pre_content,
re-prepend across the edit). Guards against double-BOM.
Mid-content U+FEFF is left alone (it's data there, not a file marker).
Tests: TestBomHandling (real LocalEnvironment) — read-strips, raw-read
strips, write preserves, no-BOM-when-original-had-none, no-double-BOM,
patch round-trip preserves, patch matches first line through a BOM,
plus helper unit tests. 208 file-tool tests green.
* fix(kanban): respect mobile safe areas in task detail drawer
The task detail drawer is a body-level z-60 fixed overlay using
height:100vh starting at the viewport top. On mobile this puts the
drawer header behind the dashboard's fixed top bar (min-h-14, z-40)
and lets the bottom comment input sit under the browser's collapsing
nav bar.
- drawer: 100vh -> 100dvh (+ max-height:100dvh), 100vh kept as fallback
- head: padding-top honors env(safe-area-inset-top); mobile (<1024px,
matching the lg breakpoint where the fixed bar shows) clears the
3.5rem header
- comment-row + body: bottom padding extended with
env(safe-area-inset-bottom) so the bottom-most element clears the
mobile browser chrome
Mirrors the host shell idiom (100dvh + env(safe-area-inset-bottom) in
web/), and web/index.html already sets viewport-fit=cover so the insets
resolve. max()/calc() fallbacks leave desktop unchanged.
Closes#35324
Self-hosted Honcho setup had four sharp edges:
- local/cloud URLs ending in /vN double-prefixed by the SDK (/v3/v3/... 404)
- authenticated local servers had no setup prompt for a JWT/bearer token
- profile-derived host keys could be dot-containing workspace IDs Honcho rejects
- memory-provider config files with API keys written world-readable per umask
This keeps existing behavior but makes those paths safer:
- strip a trailing /vN version segment from any configured baseUrl before SDK
init (the SDK's route builders always prepend their own version prefix);
auth-skipping stays loopback-only
- add an optional local JWT/bearer prompt in honcho setup, stored under
hosts.<host>.apiKey
- derive new profile host keys with underscores, still reading legacy
hermes.<profile> blocks
- write memory-provider config files atomically with 0600 via a shared
utils.atomic_json_write(mode=) arg (honcho/hindsight/mem0/supermemory)
- skip honcho.json parsing in gateway cache-busting unless Honcho is the active
memory provider; memoize by honcho.json mtime when active
- bust the gateway agent cache on memory.provider change
- add a hermes memory setup <provider> one-liner so fresh installs can configure
a named provider without the picker (the per-provider hermes <provider>
subcommand only registers once that provider is active)
Closes#20688, #29885, #26459, #30246, #33382, #32244.
Co-authored-by: BROCCOLO1D
Adds tests for the echo-loop fix (outgoing X-Tags header, inbound skip
on tagged events, genuine tags pass through) and extends the tag to the
out-of-process _standalone_send() path so cron / send_message deliveries
to a self-subscribed topic are also skipped. Maps both contributors in
release.py AUTHOR_MAP.
Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>
When publish_topic equals the subscribe topic, the agent's own replies
are echoed back by ntfy as incoming messages, creating an infinite
reply spiral.
Fix: tag outgoing messages with X-Tags: hermes-agent header, and skip
incoming messages that carry this tag. This is zero-config — works
automatically regardless of topic configuration.
FixesNousResearch/hermes-agent#34447
FAL veo3.1 API expects duration as "4s"/"6s"/"8s" (with unit suffix),
not bare "4"/"6"/"8" like other families. Add per-family duration_suffix
field and apply it in _build_payload. Also add "4k" to veo3.1 resolutions
per FAL API docs.
Note: the managed gateway currently rejects the "4s" format (expects
integer duration). Gateway-side fix needed for veo3.1 to work through
the Nous subscription path.
Wire plugins/video_gen/fal/__init__.py to use the same
_ManagedFalSyncClient pattern that image gen already uses.
Changes:
- Add managed gateway resolution, client caching, and
_submit_fal_video_request() that routes between direct FAL_KEY
and Nous gateway modes
- Update is_available() to return True when either FAL_KEY or the
managed gateway is reachable
- Update generate() to use submit+get handle pattern instead of
fal_client.subscribe() directly
- Fix happy-horse endpoint namespace: fal-ai/ → alibaba/ (matches
the tool-gateway allowlist from fal-video-gen branch)
- Surface actionable error on 4xx gateway rejections
Tests:
- 4 new tests in test_managed_media_gateways.py (gateway routing,
client reuse, direct mode fallback, alibaba namespace)
- Updated existing test_fal_plugin.py fixture to use submit/handle
pattern and patch _resolve_managed_fal_video_gateway for isolation
Closes the termination-control gap left by PR #28432, which shipped the
read-only sibling endpoints (/workers/active, /runs/{run_id},
/runs/{run_id}/inspect) but no way to stop a misbehaving worker from
the dashboard without dropping to the CLI.
The new endpoint resolves run_id -> task_id and delegates to the
existing kanban_db.reclaim_task() flow, so the SIGTERM->SIGKILL
escalation, run-outcome bookkeeping, and event-log append all match
POST /tasks/{task_id}/reclaim exactly. No new termination semantics
introduced.
Responses:
200 {ok, run_id, task_id} on success
404 unknown run_id
409 run already ended OR task no longer reclaimable
Refs: #23762
Remove unused imports (F401) and duplicate/shadowed import
redefinitions (F811) across the codebase using ruff's safe
autofixes. No behavioral changes -- imports only.
- ~1400 safe autofixes applied across 644 files (net -1072 lines)
- __init__.py re-exports preserved (excluded from F401 removal so
public re-export surfaces stay intact)
- Re-exports that are imported or monkeypatched by tests but look
unused in their defining module are kept with explicit # noqa:
F401 (gateway/run.py load_dotenv; run_agent re-exports from
agent.message_sanitization, agent.context_compressor,
agent.retry_utils, agent.prompt_builder, agent.process_bootstrap,
agent.codex_responses_adapter)
- Unsafe F841 (unused-variable) fixes deliberately skipped -- those
can change behavior when the RHS has side effects
- ruff lints remain disabled in pyproject.toml (only PLW1514 is
selected); this is a one-time cleanup, not a config change
Verification:
- python -m compileall: clean
- pytest --collect-only: all 27161 tests collect (zero import errors)
- core entry points import clean (run_agent, model_tools, cli,
toolsets, hermes_state, batch_runner, gateway)
- static scan: every name any test imports directly from an edited
module still resolves
The opencode-go relay defaults max_tokens to 262144 when none is sent,
but Xiami mimo-v2.5-pro only supports 131072 completion tokens — every
request 400s with "max_tokens is too large: 262144" before the agent
can do anything.
Add a get_max_tokens(model) hook on ProviderProfile (default returns
default_max_tokens) so profiles fronting multiple upstreams can vary
the cap per-model. Wire chat_completions transport through the hook.
Override on OpenCodeGoProfile with mimo-v2.5-pro=131072.
Only mimo-v2.5-pro is capped — other opencode-go models (kimi, glm,
qwen, minimax, other mimo variants) unchanged.
The original change's description and README claimed the per-call
hindsight_recall tool was unaffected by the new observation-only default.
That is inaccurate: hindsight_recall reads the same self._recall_types
instance attribute as the auto-recall prefetch path, and RECALL_SCHEMA
exposes no per-call types argument, so the model cannot override it.
Narrowing the default narrows BOTH paths.
Corrects the README behavior-change note, the config-table row, and the
get_config_schema description to reflect that recall_types applies to
both auto-recall and the hindsight_recall tool.
Auto-recall used to surface every fact type Hindsight had on the
session — `world`, `experience`, and `observation`. That triple-ships
the same underlying signal in three different framings: observations
are the concrete events the user said/did/asked, while world and
experience facts are aggregate summaries Hindsight derives from those
exact observations. Including all three burns most of
`recall_max_tokens` on rephrasings, crowds out events the model
actually needs to see, and produces effective duplicates in the
prompt — observations themselves are deduplicated by construction
so observation-only recall is denser per token and closer to
conversational ground truth.
Change
------
- Default `_recall_types = ["observation"]` (was `None`, which
delegated to server-side "return everything").
- `initialize()` now treats a missing `recall_types` config the same
way; also accepts comma-separated strings for parity with `recall_tags`.
- An explicit `recall_types=[]` config falls back to the default rather
than disabling the filter (would silently widen recall vs. the new
default).
- Added to `get_config_schema()` so it's discoverable via `hermes config`.
Per-call `hindsight_recall` tool invocations are unaffected — they
already only forward `types` when the caller passes the argument.
Docs / migration
----------------
plugins/memory/hindsight/README.md grows a "Behavior change" callout
explaining the why (no-duplicates, information-efficient) and how to
restore the legacy broad recall:
"recall_types": "observation,world,experience" # or a JSON list
in `~/.hermes/hindsight/config.json`.
Tests
-----
- `test_default_values` updated for the new default.
- New cases: explicit list override, CSV string accepted, empty list
falls back to default (not "wider than default").
OpenRouter supports a session_id field in extra_body that pins
multi-turn conversations to the same provider endpoint, enabling
prompt cache reuse across turns. The session_id was already threaded
through to build_extra_body() but never included in the returned dict.
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
The web_crawl_tool() function was an orphan — no model schema registered
it, no skill or CLI command called it, and the agent had no way to invoke
it. PR #32608 proposed wiring it up as a model-callable tool; we've
decided not to expose crawl as a separate capability since web_search +
web_extract cover the use cases we want models to have.
Removed:
- tools/web_tools.py: web_crawl_tool() (~230 LOC)
- plugins/web/firecrawl/provider.py: supports_crawl() + crawl()
- plugins/web/tavily/provider.py: supports_crawl() + crawl()
- plugins/web/xai/provider.py: supports_crawl() override
- agent/web_search_provider.py: supports_crawl() + crawl() ABC methods
- agent/web_search_registry.py: get_active_crawl_provider() +
the 'crawl' branch in _resolve()
- agent/display.py: web_crawl tool-progress rendering
- hermes_cli/config.py: 'web_crawl' from TAVILY_API_KEY.tools
- tools/website_policy.py: stale comment reference
- Tests: removed TestWebCrawlTavily class, the two website-policy
web_crawl tests, the searxng/ddgs/brave-free crawl-error tests,
the integration test_web_crawl method, and the
test_unconfigured_crawl_emits_top_level_error test. Trimmed the
capability-flag parametrize list and the WebSearchProvider ABC
conformance tests.
- Docs: trimmed the Crawl column from capability tables in both EN
and zh-Hans, updated the developer-guide ABC table.
Net: 25 files, +115/-1067.
Closes#33762 (the schema-text bug only existed if #32608 landed).
Supersedes #32608.
When auto-threading kicked in, the broadened backfill gate ran on the
freshly-created thread — but the thread has no prior context to fetch,
and the parent-channel reference passed to _fetch_channel_context would
have leaked unrelated context (see #31467).
Skip backfill when auto_threaded_channel is set. Also teach the
_FakeTextChannel / _FakeThreadChannel test doubles to expose a no-op
history() async generator so the broadened gate doesn't trip
AttributeError → discord.Forbidden (MagicMock) → TypeError in the
existing auto-thread tests. Add a regression test that asserts
auto-threaded messages do not trigger backfill.
Drop the _needed_mention local variable now that it has only one use,
inline its expression as _has_mention_gap, and add a comment explaining
the three backfill cases (mention-gated channel, thread, DM skip).
Behaviorally identical to the prior commit; cleanup only.
Co-authored-by: liuhao1024 <liuhao1024@users.noreply.github.com>
Discord threads where the bot has already participated bypass mention gating by default, but the backfill check was still tied to the mention-needed condition. That meant follow-up thread messages could trigger a response without providing recent thread history to the session.
Run history backfill for thread messages whenever backfill is enabled, while keeping DMs skipped and channel mention backfill behavior unchanged. Add a regression test for a known thread follow-up without an explicit mention.
Fixes#33666
Co-authored-by: Cursor <cursoragent@cursor.com>
Salvage follow-up on top of @vynxevainglory-ai's PR #29233. Keep the
column-body flex:1 + min-height:0 fix (tall columns scroll internally
now), but drop the flex-wrap: wrap part — instead just stop hiding
the existing horizontal scrollbar.
PR #523254b34 (sadiksaifi, May 18) deliberately moved the kanban board
from a wrapping grid to a single-row pinned-width flex so the board
stays as one stable horizontal row. The mistake in that PR was the
scrollbar-width: none + ::-webkit-scrollbar { display: none } pair,
which hid the affordance so columns past the viewport became visually
inaccessible. Fixing that hidden-scrollbar bug while keeping the
single-row design honors both contributors' intent.
Two CSS issues in the kanban dashboard:
1. Columns overflow horizontally with no way to reach them — the
original scrollbar-width: none hid the scrollbar entirely, and
even with a scrollbar, a wrapping layout is better UX for a board
with 8+ columns. Changed to flex-wrap: wrap and removed the
overflow-x: auto + hidden scrollbar rules. Columns now flow into
multiple rows (~3 per row on a typical viewport) instead of
running off-screen.
2. .hermes-kanban-column-body lacked flex: 1 and min-height: 0,
so the flex child's implicit min-height: auto prevented it from
shrinking below its content size. Columns with many cards pushed
past the parent max-height instead of scrolling internally.
Verified: 9 columns wrap into 3 rows, all visible without
horizontal scroll. Done column (53 tasks) scrolls vertically
within its column bounds.
Condenses the substance of PRs #16453, #17453, #16451, #17600, and #13373
into a minimal generic host contract that external context engine plugins
(e.g. hermes-lcm) need to integrate cleanly. Drops scaffolding that
duplicated existing infrastructure or had marginal value.
Five concrete changes:
1. `_transition_context_engine_session()` on AIAgent — generic lifecycle
helper that fires on_session_end → on_session_reset → on_session_start
→ optional carry_over_new_session_context. Engines implement only the
hooks they need; missing hooks are skipped. Built-in compressor keeps
its existing reset-only behavior because callers default to no
metadata. `reset_session_state()` now optionally accepts
previous_messages / old_session_id / carry_over_context and delegates
to the transition helper when provided. (#16453)
2. `conversation_id` passed to `on_session_start()` — both the
agent-init call site and the compression-boundary call site now
forward `self._gateway_session_key` so plugin engines have a stable
conversation identity that survives session_id rotation (compression
splits, /new, resume). The key already existed on AIAgent; it just
wasn't reaching engines. (#16453)
3. Canonical cache buckets forwarded to engines — the usage dict passed
to `update_from_response()` now includes input_tokens, output_tokens,
cache_read_tokens, cache_write_tokens, and reasoning_tokens on top of
the legacy prompt/completion/total keys. Engines can make decisions on
cache-hit ratios and reasoning costs instead of only aggregates. ABC
docstring updated. (#17453)
4. Plugin-registered context engines visible in the picker —
`_discover_context_engines()` in plugins_cmd.py now also includes
engines registered via `ctx.register_context_engine()` from plugin
manifests, deduplicating by name so repo-shipped descriptions win on
collision. (#16451)
5. `_EngineCollector.register_command()` — context engines using the
standard `register(ctx)` pattern can now expose slash commands (e.g.
`/lcm`). Routes to the global plugin command registry with the same
conflict-rejection policy regular plugins use (no shadowing built-ins,
no clobbering other plugins). Previously these calls hit a no-op and
the slash commands silently never appeared. (#17600)
Dropped from the original 5 PRs:
- Compression boundary signal (`boundary_reason="compression"`) from
#16453 — already on main at `agent/conversation_compression.py:412-424`,
landed via the bg-review extraction.
- `discover_plugins()` before fallback in run_agent.py from #16451 —
redundant: `get_plugin_context_engine()` already routes through
`_ensure_plugins_discovered()` which is idempotent.
- Runtime identity diagnostics method + helpers from #13373 (+251 LOC) —
operators can already read engine state via `engine.get_status()`;
the diagnostics view added marginal value relative to its surface area.
- The 553-LOC slash-command machinery from #17600 — replaced with a
20-LOC `register_command` method on the collector that reuses the
existing plugin command registry instead of building a parallel one.
Net: ~215 LOC of host-contract changes + 282 LOC of focused tests, vs
~1,176 LOC across the original 5 PRs.
Co-authored-by: Tosko4 <1294707+Tosko4@users.noreply.github.com>
Closes#16453.
Closes#17453.
Closes#16451.
Closes#17600.
Closes#13373.
Related: stephenschoettler/hermes-lcm#68.
* feat(image_gen): add Krea provider plugin (Krea 2 Medium + Large)
New built-in image_gen backend wrapping Krea's Krea 2 foundation
image model family. Auto-discovered like the other image_gen plugins
and appears in 'hermes tools' → Image Generation → Krea.
Krea's API is asynchronous — submit returns a job_id, poll /jobs/{id}
until terminal. The provider hides that behind the synchronous
ImageGenProvider.generate() contract: submit, poll every 2s with
light backoff (max 5s), 3-minute ceiling matching Krea's hosted-tool
timeout. Result URL is materialised to $HERMES_HOME/cache/images/
to avoid CDN-expiry 404s downstream (same fix as xAI #26942).
Models:
- krea-2-medium (default — Krea's 'start here' recommendation)
- krea-2-large
Aspect ratios map landscape→16:9, square→1:1, portrait→9:16.
Resolution: 1K (Krea's only current option).
Kwarg passthrough: seed, creativity (raw/low/medium/high), styles,
image_style_references (capped 10), moodboards (capped 1) — matches
Krea's per-request limits. Unknown kwargs are ignored.
Config knobs (config.yaml):
image_gen.provider: krea
image_gen.krea.model: krea-2-medium | krea-2-large
image_gen.krea.creativity: raw | low | medium | high
Env overrides: KREA_API_KEY (required), KREA_IMAGE_MODEL.
KREA_API_KEY is registered in OPTIONAL_ENV_VARS so 'hermes setup'
prompts for it.
31 new tests; image_gen suite + picker + tools_config: 211/211.
* fix(image_gen/krea): address review feedback
- Update KREA_API_KEY setup URL to the canonical token-creation page
(https://www.krea.ai/app/api/tokens). The previous URL returned 404.
- Fail fast on non-retryable HTTP statuses during poll. The previous
loop retried every HTTPError for the full 180s deadline, so an auth
(401), billing (402), forbidden (403), or not-found (404) response
would make image_generate hang for three minutes. Only retry
transient statuses (408/409/425/429/5xx); surface everything else
immediately.
- Add 5 tests covering fail-fast on 401/403/404 and retry on 429/503.
* fix(krea): point users at the real API token dashboard URL
Three call sites linked users to dashboard pages that don't exist:
- hermes_cli/config.py: https://www.krea.ai/app/api/tokens
- plugins/image_gen/krea/__init__.py get_setup_schema: https://www.krea.ai/api-keys
- plugins/image_gen/krea/__init__.py auth_required error: https://www.krea.ai/api-keys
Per Krea's own docs (https://docs.krea.ai/developers/api-keys-and-billing),
the real dashboard URL is https://www.krea.ai/settings/api-tokens. All three
sites now point there.
Use the shared observer/target resolver for session context so peer='user' and explicit configured peer IDs query Honcho from the same assistant-observed perspective when allowed. Add regression coverage for user alias, explicit peer, and self-observer fallback.
honcho_profile(peer="user") returned an empty card even when Honcho
held a populated peer card for the user. Two independent bugs combined
to produce the symptom:
1. Read path: get_peer_card() called _fetch_peer_card(observer, target=user),
which hits GET /peers/{observer}/card?target={user} — the observer's local
card of the user. On self-hosted Honcho v3 this slot is empty unless writes
also use it. The peer card lives on the user peer itself
(GET /peers/{user}/card). Add a fallback: when the observer-target slot is
empty and a target exists, retry against the target peer's own card.
2. Write path: set_peer_card() resolved only the target peer and called
user_peer.set_card(card). The read path uses the assistant peer as
observer, so writes and reads addressed different Honcho card scopes.
Align set_peer_card() with _resolve_observer_target() so writes go to
assistant_peer.set_card(card, target=user_peer_id), matching the read.
Both paths now use the same observer/target resolution, and the read
path additionally falls back to the target's own card for compatibility
with deployments where cards were written directly to the peer.
Closes: related to #13375, #17124, #20729
Three related regressions stemming from the pinUserPeer alias landing:
- Setup wizard read host-only fields when detecting current shape but the
parser supports root-level config and gives host pinUserPeer higher
precedence than pinPeerName. Re-running setup could mis-detect shape
and silently flip routing. Detection now uses the same resolver order
as HonchoClientConfig, and each shape branch scrubs every peer-mapping
key before writing so a stale pinUserPeer=false can't outrank a freshly
written pinPeerName=true. Multi no longer auto-writes
userPeerAliases={} (was silently masking root-level baselines).
- clone_honcho_for_profile inherited pinPeerName but not pinUserPeer, so
a default profile configured with the newer key produced cloned
profiles without the pin.
- Gateway cache-busting signature fingerprinted Honcho user-peer fields
but not ai_peer. Since HonchoSessionManager freezes cfg.ai_peer at
init, mid-flight aiPeer edits kept assistant writes on the old peer
until an unrelated cache eviction. ai_peer is now part of the
signature.
Remove "PR #14984 / #27371 / #1969" references and "the original key /
legacy / backwards-compatible / Port #N" narration from the honcho
plugin README, tests, and one stale code comment. These artefacts age
poorly: they describe how a change happened rather than what the code
does today, and they tax readers who weren't around for the original
work.
Also drop a dangling reference to scratch/memory-plugin-ux-specs.md in
__init__.py — the file isn't in the repo or git history.
No behaviour change.
Three correctness gaps when honcho.json's identity-mapping config changes
mid-flight:
1. The gateway's agent cache signature ignored honcho identity keys, so
editing peerName / pinPeerName / userPeerAliases / runtimePeerPrefix
was silently dropped until an unrelated cache eviction. Extend
_extract_cache_busting_config to fingerprint the resolved honcho
config so the AIAgent rebuilds on the next message.
2. cmd_setup let single → multi flips orphan the pinned-pool history
under peerName without warning. Detect the transition, warn that
runtime users will resolve to fresh empty peers, and auto-steer to
hybrid (alias the operator's runtime IDs back to peerName) so the
operator's own continuity survives. yes / no overrides available.
3. README didn't document the orphaning behaviour. Add a "Migrating
single → multi" callout under Deployment shapes.
Tests:
- TestPinTransition (test_pin_peer_name.py): fresh-manager flip resolves
to runtime, in-process flip is gated by the per-key session cache
(documents the gateway-cache-must-bust contract), 3 cache-bust
signature tests for pin / aliases / prefix.
- TestProfilePeerUniqueness: two profiles pinned to distinct peerNames
resolve to distinct peers; host-level peerName overrides root when
pinned.
- test_single_to_multi_steers_to_hybrid_by_default and
test_single_to_multi_yes_override_keeps_multi (test_cli.py): wizard
guard end-to-end coverage.
PR #27371 introduced three new identity-mapping config keys
(pinPeerName, userPeerAliases, runtimePeerPrefix), but the README's
'Full Configuration Reference' didn't mention them. Operators had
to read the source to understand the resolver, leading to predictable
support questions ("why is my user split across two peers?", "what
does pinPeerName actually pin?").
Add a new 'Identity Mapping' subsection that covers:
* The four config keys (pinUserPeer + alias, userPeerAliases,
runtimePeerPrefix) with concrete examples.
* The 7-step resolver ladder so operators can predict which peer a
given runtime ID will land on.
* Why there's no symmetric pinAiPeer (the AI peer is already pinned
by construction; the asymmetry is intentional).
* Host vs root semantics (host-level replaces root for maps, wipes
with empty value).
* The three deployment shapes ('hermes honcho setup' uses these same
shape names) with one-line guidance per shape.
The original key 'pinPeerName' from #14984 is ambiguous: a fresh
reader can't tell whether it pins the user peer or the AI peer from
the name alone. The resolver only ever pins the user-side
(_resolve_user_peer_id short-circuits when pin_peer_name is true; the
AI peer is already pinned by construction via aiPeer).
Add 'pinUserPeer' as the canonical alias. Both keys land on the
same internal pin_peer_name field; precedence is host pinUserPeer →
host pinPeerName → root pinUserPeer → root pinPeerName → default.
Host-level always beats root-level regardless of alias, so a host
block can still explicitly disable a root-level pin even via the new
key.
Make _resolve_bool variadic so it can express the four-value
precedence chain. All existing callers pass two positional args +
default keyword, which the new signature accepts unchanged.
Internal var name (pin_peer_name) stays the same to keep the
cherry-picked #27371 commits clean and avoid a noisy rename diff.
The PR #27371 resolver introduced three identity-mapping config keys
(pinPeerName, userPeerAliases, runtimePeerPrefix), but operators had
no guided way to set them — they had to read the README, understand
the resolver ladder, and hand-edit honcho.json. This commit adds an
interactive step to 'hermes honcho setup' that asks one question
('what's your deployment shape?') and writes the right combination
of keys.
Three shapes cover the realistic deployments:
* single -- pinPeerName=true. All gateway users collapse to your
peerName. Recommended for personal/single-operator use.
* multi -- pinPeerName=false, no aliases. Each runtime user gets
their own peer. Optional runtimePeerPrefix for cross-
platform namespace isolation.
* hybrid -- pinPeerName=false, with userPeerAliases mapping YOUR
runtime IDs (Telegram UID, Discord snowflake, Slack
user, Matrix MXID) to peerName. Multi-user gateway
where you are a privileged operator.
A 'skip' option leaves existing identity-mapping config untouched —
critical because re-running setup must not silently wipe operator-
curated aliases.
The wizard detects the current shape from existing config so the
prompt's default matches what the operator already has.
PR #27371 added host-scoped userPeerAliases, runtimePeerPrefix, and
pinPeerName, but the cloned-profile allowlist in
plugins/memory/honcho/cli.py::clone_honcho_for_profile() omitted them.
A new profile created via 'hermes honcho setup' or similar would
silently drop the operator's identity-mapping config, causing gateway
users to resolve to raw runtime IDs and fragmenting Honcho memory
across an unintended set of peers.
Add the three keys to the allowlist and a regression test class
covering all three plus the unset case.
Per AGENTS.md, ~/.hermes/.env is reserved for API keys / secrets and
config.yaml is the surface for non-secret configuration. The Nous
Portal plugin previously read HERMES_DASHBOARD_OAUTH_CLIENT_ID and
HERMES_DASHBOARD_PORTAL_URL from the environment only, which forced
local-dev / on-prem operators to put non-secret per-instance
configuration in .env — violating the convention.
Add dashboard.oauth.{client_id,portal_url} to DEFAULT_CONFIG and have
the plugin resolve each setting with env-overrides-config precedence:
1. Env var when set to a non-empty value (Fly.io platform-secret
injection — what pushes per-deploy client_ids without baking
them into the image).
2. config.yaml entry (canonical surface for local dev / on-prem).
3. Plugin default (no provider registered when client_id is empty;
portal_url defaults to https://portal.nousresearch.com).
Empty env values are explicitly treated as unset so a provisioned-but-
not-populated Fly secret can't accidentally shadow a valid config.yaml
entry with an empty string — operators would otherwise lose the gate.
Implementation:
- hermes_cli/config.py: add dashboard.oauth.{client_id,portal_url}
block to DEFAULT_CONFIG with full doc comment explaining the
override precedence and Fly.io rationale.
- plugins/dashboard_auth/nous/__init__.py: add _load_config_oauth_section,
_resolve_client_id, _resolve_portal_url helpers; replace the two
direct os.environ.get() calls in register() with the resolvers.
Update the skip-reason string to mention BOTH surfaces so an
operator looking at the fail-closed bind error knows config.yaml
is a valid alternative to the env var.
- plugins/dashboard_auth/nous/plugin.yaml: update description to
name both surfaces. requires_env stays pointing at the env var
name — it's metadata-only (not used by the plugin loader for
gating) so this is documentation/UX, not enforcement.
- cli-config.yaml.example: append commented dashboard.oauth block
with the same override rationale operators see in code.
- website/docs/user-guide/features/web-dashboard.md: rewrite the
'Default provider: Nous Research' section to lead with config.yaml,
present env vars as operator overrides (Fly.io's primary path).
Updated the example fail-closed bind error to match the new
skip-reason text.
Test coverage — new TestConfigYamlSource class (8 tests) pinning
every tier of the precedence chain:
- config-yaml-only path registers correctly
- both config-yaml fields (client_id + portal_url) honoured
- env var overrides config for client_id (Fly.io critical path)
- env var overrides config for portal_url
- empty env string does NOT shadow config (CI/Fly edge case)
- neither source set → skip with reason mentioning BOTH surfaces
- load_config() raising falls through to env-only path (resilience)
- non-dict oauth section falls through cleanly (typo resilience)
Mutation-tested: flipping the precedence to config-wins-over-env trips
exactly test_env_overrides_config_client_id while the other 7 stay
green, confirming the suite discriminates the order, not just the
sources.
This closes the last item in Teknium's PR review (PR #30156).
When jwt.decode raises InvalidTokenError, decode the token a second time
without signature verification (safe — we never trust the values, just
display them) and append the actual iss/aud claims plus our configured
expected values to the error message. Lets operators see config drift
between HERMES_DASHBOARD_PORTAL_URL / HERMES_DASHBOARD_OAUTH_CLIENT_ID
and what Portal is actually emitting without having to hand-decode the
JWT from the browser cookie.
The Nous OAuth provider plugin (plugins/dashboard_auth/nous) is bundled
and auto-loaded — same as before — but previously refused to register
unless BOTH HERMES_DASHBOARD_OAUTH_CLIENT_ID and HERMES_DASHBOARD_PORTAL_URL
were set, then the gate's fail-closed branch told the operator 'install
the default Nous provider'. That message is misleading: the provider IS
installed; it's just unconfigured. And the contract only really needs
the per-instance client_id — the portal URL is the same for everyone
in production.
Three changes:
1. plugins/dashboard_auth/nous/__init__.py:
- HERMES_DASHBOARD_PORTAL_URL is now optional and defaults to
'https://portal.nousresearch.com'. Override only for staging
(portal.rewbs.uk) or a custom deployment. Empty string also
falls back to the default so an empty Fly secret can't point
the dashboard at nowhere.
- Plugin exposes a module-level LAST_SKIP_REASON: str that the gate
reads when no providers register. Cleared on each register() call.
Skip reasons are human-readable and actionable
('HERMES_DASHBOARD_OAUTH_CLIENT_ID is not set. The Nous Portal
provisions this env var…').
2. plugins/dashboard_auth/nous/plugin.yaml:
- requires_env drops HERMES_DASHBOARD_PORTAL_URL; only the client_id
is mandatory. Description updated to reflect this.
3. hermes_cli/web_server.py:
- When the gate fail-closes for 'no providers', it now reads each
bundled plugin's LAST_SKIP_REASON and embeds them in the SystemExit
message. Operator sees the specific config fix needed:
Bundled providers reported these issues:
• nous: HERMES_DASHBOARD_OAUTH_CLIENT_ID is not set. …
instead of the prior generic 'Install the default Nous provider'.
Tests:
- TestPluginRegister rewritten to assert the new defaults +
LAST_SKIP_REASON contents (6 tests, +1 new for empty-string env).
- New gate test test_start_server_surfaces_nous_skip_reason_when_unconfigured.
- test_get_method_is_not_allowed widened to handle the SPA-shell 200
path explicitly — assertion now verifies no JSON ticket leaks
rather than asserting a specific status code (covers all four of
401/404/405/200).
Docs updated: web-dashboard.md's 'Default provider' section now shows
the env-var table with required/optional columns and embeds the
fail-closed error message verbatim so operators can match what they
see at the prompt.
Bundled, kind=backend, auto-loads. Activates ONLY when Portal-injected
env vars are present:
HERMES_DASHBOARD_OAUTH_CLIENT_ID — agent:{instance_id}
HERMES_DASHBOARD_PORTAL_URL — Portal base URL
Loopback / --insecure operators leave both unset and never see this
plugin register anything. The fail-closed branch in start_server handles
the 'public bind + zero providers' case independently.
Implementation follows nous-account-service PR #180's published OAuth
contract verbatim:
- client_id is per-instance (agent:{instance_id}); the suffix is
cross-checked against the token's agent_instance_id claim as
defense-in-depth (contract C9).
- scope is agent_dashboard:access only (contract C3).
- aud is the bare client_id, no hermes-cli: prefix (contract C2).
- RS256 JWT verification against /.well-known/jwks.json with
5-minute cache (contract C7).
- No refresh tokens in V1: refresh_session always raises
RefreshExpiredError; revoke_session is a no-op (contract C5).
- oauth_contract_version claim: missing → warn + proceed; present
and != 1 → refuse (contract C11, OQ-C2 tolerant treatment).
- redirect_uri validated client-side as defense before bouncing to
Portal; authoritative check is server-side per agent-redirect-uri.ts.
41 new tests covering construction, plugin-entry env gating, start_login
shape, complete_login httpx-mocked happy path + error mapping,
verify_session JWT verification (RSA keypair fixture, full claim-check
matrix), refresh_session always raising, revoke_session no-op.
PyJWT + cryptography are already in the venv (jose was previously
suggested; switched to pyjwt[crypto] since the latter is already
pulled in transitively).
New opt-in plugin that scans the content passed to write_file / patch /
skill_manage for 25 known-dangerous code patterns — pickle.load,
yaml.load, eval(, os.system, subprocess(shell=True), child_process.exec,
dangerouslySetInnerHTML, innerHTML/outerHTML/document.write/
insertAdjacentHTML, crypto.createCipher (no IV), AES ECB,
TLS verification disabled, XXE-prone xml.etree/minidom parsers,
<script src=//...> without SRI, torch.load without weights_only=True,
GitHub Actions ${{ github.event.* }} injection — and appends a
"Security guidance" warning block to the tool result via the
transform_tool_result hook.
Default behaviour is non-blocking: the file is written and the warning
rides back to the model in the next turn so it can self-correct or
document why the construct is safe. SECURITY_GUIDANCE_BLOCK=1 upgrades
to refusing the write entirely; SECURITY_GUIDANCE_DISABLE=1 is the
kill switch.
Pattern data (patterns.py) is a verbatim Apache-2.0 fork of
Anthropic's claude-plugins-official/plugins/security-guidance/hooks/
patterns.py at commit 0bde168 (2026-05-26). LICENSE and NOTICE
preserve attribution. The Hermes-side plugin glue (__init__.py,
plugin.yaml, README.md, tests) is original work.
Plugin is opt-in like all bundled plugins:
hermes plugins enable security-guidance
Inspired by https://x.com/ClaudeDevs/status/1927108527247... — Anthropic
shipped this as their security-guidance plugin for Claude Code on
2026-05-26 with a measured 30-40% reduction in security-related PR
comments on internal rollout.
What's NOT ported (deferred):
* Layer 2 (LLM diff review on turn end) — would route through main
model by default on Hermes, real money on reasoning models. A
follow-up can wire it to a cheap aux model with explicit opt-in.
* Layer 3 (agentic commit-time review) — agent can run this on
demand via delegate_task today.
* .hermes/security-guidance.md project-rules file — only used by
layers 2/3 upstream.
* fix(plugins/discord): correct install_hint extra to [messaging]
The Discord platform registered install_hint pointing at
'hermes-agent[discord]', but pyproject.toml has no [discord] extra —
the deps live in [messaging] alongside Telegram and Slack. Users hitting
"Platform 'Discord' requirements not met" were directed at a pip command
that installs nothing.
* feat(nix): add #messaging and #full package variants
Make Discord/Telegram/Slack work out of the box for `nix profile install`
users. Messaging deps were dropped from [all] on 2026-05-12 in favor of
lazy-install, but lazy-install can't write to the read-only /nix/store —
users hit "No adapter available for discord" with no actionable guidance.
- #messaging: pre-built with discord.py/telegram/slack (+33 MB venv)
- #full: all 18 platform-portable extras + matrix on Linux only
(python-olm lacks Darwin PyPI wheels) (+738 MB venv)
Also adds a `messaging-variant` flake check that verifies `import discord`
succeeds in the sealed venv — regression guard for the lazy-install
migration.
Docs updated: Quick Start callout, extraDependencyGroups rewrite with
messaging as primary example + full extras table, troubleshooting row,
cheatsheet row.
Closure size deltas (measured x86_64-linux):
default 1792 MB pkg / 512 MB venv
messaging 1826 MB pkg / 546 MB venv (+33 MB)
full 2530 MB pkg / 1250 MB venv (+738 MB)
* chore(nix): trim variant comments + alphabetize full extras
Drop the date-stamped changelog from messaging-variant's comment and the
"+33 MB / +704 MB" numbers from the variant defs — those drift and belong
in the PR description, not source. Alphabetize the 18-extra list in #full
so future additions produce clean one-line diffs.
No semantic change. messaging-variant check still passes.
* remove Vercel AI Gateway provider and Vercel Sandbox terminal backend
Both Vercel-hosted integrations are removed end-to-end. Users on the AI
Gateway should switch to OpenRouter or one of the other aggregators
(Nous Portal, Kilo Code). Users on the Vercel Sandbox backend should
switch to Docker, Modal, Daytona, or SSH.
What's removed:
- `plugins/model-providers/ai-gateway/` provider plugin
- `hermes_cli/vercel_auth.py` Vercel-Sandbox auth helper
- `tools/environments/vercel_sandbox.py` terminal backend
- `ai-gateway` provider wiring across auth, doctor, setup, models,
config, status, providers, main, web_server, model_normalize, dump
- `vercel_sandbox` backend wiring across terminal_tool, file_tools,
code_execution_tool, file_operations, approval, skills_tool,
environments/local, credential_files, lazy_deps, prompt_builder,
cli, gateway/run
- `AI_GATEWAY_BASE_URL` constant, `_AI_GATEWAY_HEADERS` auxiliary-client
header set, run_agent base-URL header/reasoning special-cases
- `[vercel]` pyproject extra and `vercel`/`vercel-workers` from uv.lock
- env vars: `AI_GATEWAY_API_KEY`, `AI_GATEWAY_BASE_URL`, `VERCEL_TOKEN`,
`VERCEL_PROJECT_ID`, `VERCEL_TEAM_ID`, `VERCEL_OIDC_TOKEN`,
`TERMINAL_VERCEL_RUNTIME`
- Tests: deletes test_ai_gateway_models.py and
test_vercel_sandbox_environment.py; scrubs references across 23
surviving test files (no entire tests deleted unless they were
dedicated to AI Gateway / Sandbox)
- Docs: provider tables, env-var reference, setup guides, security
notes, tool config, terminal-backend tables — English plus zh-Hans
i18n parity
- `hermes-agent` skill: provider table entry and remote-backend list
What stays (intentional):
- `popular-web-designs/templates/vercel.md` — CSS design reference,
unrelated to Vercel-the-AI-product
- `x-vercel-id` in `stream_diag.py` headers — generic Vercel CDN
response header, useful diag signal on any Vercel-hosted endpoint
- `vercel-labs/agent-browser` URL in browser config — lightpanda
browser project, different OSS effort
- `userStories.json` historical contributor entry mentioning Vercel
Sandbox — archive, not active docs
Validation:
- 1153 tests in the 22 targeted files pass (`scripts/run_tests.sh`)
- Full repo `py_compile` clean
- Live import of every touched module + invariant check (no
`ai-gateway` in `PROVIDER_REGISTRY`, no `_AI_GATEWAY_HEADERS`, no
`vercel_sandbox` in `_REMOTE_TERMINAL_BACKENDS`)
* test: convert profile-count check from change-detector to invariant
The hardcoded "== 34" assertion broke when ai-gateway was removed.
Per AGENTS.md change-detector-test guidance, assert the relationship
(registry count >= number of plugin dirs) instead of a literal count.
Counts shift when providers are added/removed; that's expected.
Extend PR #31716 to plugin setup paths that were also using bare
getpass.getpass(): hindsight (4 sites), honcho, simplex, line. Same
mechanical swap onto hermes_cli.secret_prompt.masked_secret_prompt.
X Premium+ also grants Grok OAuth access — the 'SuperGrok Subscription'
wording suggested SuperGrok was the only entitlement path. Updated to
'SuperGrok / Premium+' across the picker label, setup wizard, auth flows,
and docs so Premium+ subscribers know the row applies to them too.