The Hermes Docker image's venv is built with `uv sync`, which does not
bootstrap pip into the venv. When the google-workspace setup script needs
to install its deps and the running interpreter has no pip,
`sys.executable -m pip install` dead-ends with "No module named pip"
(reported via Discord support).
install_deps() now falls back to `uv pip install --python <interpreter>`
when the pip path fails and uv is on PATH. uv installs into the exact
interpreter the script is running under without needing pip present, so
the pip-less venv self-heals (e.g. a dep evicted on image update, or a
build without the [google]/[all] extra). On environments with neither
pip nor uv, the [google] extra hint is printed as before.
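The fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the actual `install_deps()` body: the `run` and `which` parameters are hypothetical injection points added here so the logic can be exercised without touching a real environment.

```python
import shutil
import subprocess
import sys


def install_deps(packages, run=subprocess.run, which=shutil.which):
    """Install packages into the running interpreter; return True on success."""
    # First choice: pip inside the current interpreter's environment.
    pip_cmd = [sys.executable, "-m", "pip", "install", *packages]
    if run(pip_cmd, capture_output=True).returncode == 0:
        return True

    # Fallback: uv can target an interpreter that has no pip module.
    uv = which("uv")
    if uv is None:
        return False  # neither installer works; the caller prints the hint
    uv_cmd = [uv, "pip", "install", "--python", sys.executable, *packages]
    return run(uv_cmd, capture_output=True).returncode == 0
```

Because uv is only looked up after the pip command fails, an environment where pip works never consults uv, which matches the control case in the regression tests.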
Verified E2E against nousresearch/hermes-agent:latest: under the venv
python with a missing dep, --install-deps now prints "Dependencies
installed." and exits 0 instead of failing.
Adds TestInstallDeps regression coverage: pip path, uv fallback,
uv-not-consulted-when-pip-works control, and both no-installer-available
and uv-also-fails failure cases.
* feat(kanban): goal_mode cards run workers in a /goal loop
A goal_mode card wraps its dispatched worker in the Ralph-style goal
loop behind /goal: after each turn an auxiliary judge checks the
worker's response against the card title+body, and if not done the
worker keeps going in the SAME session until the judge agrees, the
worker terminates the task itself, or the turn budget runs out (which
blocks the card for human review — never a silent exit).
- kanban_db: goal_mode + goal_max_turns columns (additive migration),
Task fields, create_task params, INSERT wiring, created-event payload.
- kanban_tools: goal_mode/goal_max_turns on the kanban_create tool so
orchestrators can opt cards in when fanning out.
- kanban CLI: --goal / --goal-max-turns on 'kanban create'.
- dashboard API: goal_mode/goal_max_turns on the create endpoint
(auto-surfaced back via asdict).
- _default_spawn: sets HERMES_KANBAN_GOAL_MODE / _GOAL_MAX_TURNS only
when the card opts in.
- goals.run_kanban_goal_loop: standalone, callback-injected loop engine
(no SessionDB persistence; ephemeral worker). cli.py quiet path calls
it after the worker's first turn when the env vars are set.
- Docs: orchestrator skill + kanban feature page.
Tests: DB roundtrip + legacy migration, spawn env gating, and the loop's
continuation/completion/budget-block/finalize-nudge branches. E2E run
against a real kanban DB confirms a budget-exhausted goal worker lands
in a sticky blocked state.
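As a rough sketch of the loop's control flow (not the real `goals.run_kanban_goal_loop` signature): the callback names `continue_turn` and `judge`, the nudge text, and the return strings are all assumptions made for illustration, and the branch where the worker terminates the task itself is left out.

```python
from typing import Callable


def run_goal_loop(
    goal: str,
    first_response: str,
    continue_turn: Callable[[str], str],
    judge: Callable[[str, str], bool],
    max_turns: int,
) -> str:
    """Return 'done' when the judge accepts, 'blocked' when the budget runs out."""
    response = first_response
    turns_used = 1  # the worker's first turn has already happened
    while True:
        if judge(goal, response):
            return "done"
        if turns_used >= max_turns:
            return "blocked"  # surfaced for human review, never a silent exit
        # Same session: nudge the worker to keep going toward the goal.
        response = continue_turn(f"Goal not met yet. Keep working on: {goal}")
        turns_used += 1
```

Injecting the worker and judge as callbacks keeps the engine free of session persistence, which is the property the commit relies on for ephemeral workers.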
* feat(kanban/dashboard): goal-mode toggle in the create form
Wires the goal_mode card setting into the dashboard UI (the plugin's
hand-written IIFE bundle, no build step):
- InlineCreate: 'goal mode' checkbox after the skills field; checking it
reveals an optional 'max turns' number input. Both reset on submit and
only post goal_mode/goal_max_turns when enabled.
- TaskDrawer: a 'Goal mode: on (max N turns)' MetaRow so a card's
goal-mode setting is visible after creation (auto-fed by asdict via the
existing _task_dict).
Live-tested through the running dashboard with a browser: created a
goal-mode card with max-turns=8, confirmed it persisted to the kanban DB
(goal_mode=1, goal_max_turns=8) and rendered back in the drawer as
'on (max 8 turns)'. No JS console errors.
Normalize Gmail API message header names to lowercase before lookup so
gmail get/search/reply populate to/subject/from regardless of the casing
the message was stored with. Emit conventional MIME header casing
(To/Subject/Cc/From) on send and reply.
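A minimal sketch of the lookup side, assuming the usual Gmail API shape where `payload.headers` is a list of `{"name", "value"}` objects. The helper name `header_map` is hypothetical.

```python
def header_map(payload_headers):
    """Map lowercase header name -> value from a Gmail `payload.headers` list."""
    return {h["name"].lower(): h["value"] for h in payload_headers}


headers = header_map([
    {"name": "TO", "value": "a@example.com"},
    {"name": "subject", "value": "Hi"},
    {"name": "From", "value": "b@example.com"},
])
# Lookups use lowercase keys, so the stored casing no longer matters.
summary = {key: headers.get(key, "") for key in ("to", "subject", "from")}
```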
Fixes #34806
Co-authored-by: Donovan Yohan <donovan-yohan@users.noreply.github.com>
Adds a `grok` skill under `skills/autonomous-ai-agents/`, a third coding-agent orchestration guide alongside `codex` and `claude-code`. It teaches Hermes to delegate coding tasks to Grok Build (xAI's `grok` CLI).
- Headless `-p` one-shots (preferred)
- Interactive TUI via pty + tmux
- Session resume, background tasks, structured JSON output
- PR review and parallel worktree patterns
- Auth via SuperGrok / X Premium+ (`grok login`)
- Full pitfalls and config notes
Kanban workers run headless — no live user is on the other side of `clarify`,
so the call times out (~120s default) and the task sits silently in `running`
with no signal to the operator that input is needed. Reporter observed a real
incident where a worker asked 'promote to production, or check staging first?'
via clarify, the call timed out, the agent hallucinated a fallback, and the
task sat 'running' for hours.
Fix: explicit 'do not call clarify' bullet in two surfaces every kanban worker
sees —
- `agent/prompt_builder.py` KANBAN_GUIDANCE `## Do NOT` section (auto-injected
into every dispatcher-spawned worker run).
- `skills/devops/kanban-worker/SKILL.md` `## Do NOT` section (the bundled
worker skill).
Both point at the right pattern: `kanban_comment` (context) + `kanban_block`
(decision needed) — the task surfaces on the board as blocked, the operator
sees it, unblocks with their answer in a comment, and the worker respawns
with the thread.
Co-authored-by: kweiner <17778+kweiner@users.noreply.github.com>
Remove unused imports (F401) and duplicate/shadowed import
redefinitions (F811) across the codebase using ruff's safe
autofixes. No behavioral changes -- imports only.
- ~1400 safe autofixes applied across 644 files (net -1072 lines)
- __init__.py re-exports preserved (excluded from F401 removal so
public re-export surfaces stay intact)
- Re-exports that are imported or monkeypatched by tests but look
unused in their defining module are kept with explicit # noqa:
F401 (gateway/run.py load_dotenv; run_agent re-exports from
agent.message_sanitization, agent.context_compressor,
agent.retry_utils, agent.prompt_builder, agent.process_bootstrap,
agent.codex_responses_adapter)
- Unsafe F841 (unused-variable) fixes deliberately skipped -- those
can change behavior when the RHS has side effects
- ruff lints remain disabled in pyproject.toml (only PLW1514 is
selected); this is a one-time cleanup, not a config change
Verification:
- python -m compileall: clean
- pytest --collect-only: all 27161 tests collect (zero import errors)
- core entry points import clean (run_agent, model_tools, cli,
toolsets, hermes_state, batch_runner, gateway)
- static scan: every name any test imports directly from an edited
module still resolves
* remove Vercel AI Gateway provider and Vercel Sandbox terminal backend
Both Vercel-hosted integrations are removed end-to-end. Users on the AI
Gateway should switch to OpenRouter or one of the other aggregators
(Nous Portal, Kilo Code). Users on the Vercel Sandbox backend should
switch to Docker, Modal, Daytona, or SSH.
What's removed:
- `plugins/model-providers/ai-gateway/` provider plugin
- `hermes_cli/vercel_auth.py` Vercel-Sandbox auth helper
- `tools/environments/vercel_sandbox.py` terminal backend
- `ai-gateway` provider wiring across auth, doctor, setup, models,
config, status, providers, main, web_server, model_normalize, dump
- `vercel_sandbox` backend wiring across terminal_tool, file_tools,
code_execution_tool, file_operations, approval, skills_tool,
environments/local, credential_files, lazy_deps, prompt_builder,
cli, gateway/run
- `AI_GATEWAY_BASE_URL` constant, `_AI_GATEWAY_HEADERS` auxiliary-client
header set, run_agent base-URL header/reasoning special-cases
- `[vercel]` pyproject extra and `vercel`/`vercel-workers` from uv.lock
- env vars: `AI_GATEWAY_API_KEY`, `AI_GATEWAY_BASE_URL`, `VERCEL_TOKEN`,
`VERCEL_PROJECT_ID`, `VERCEL_TEAM_ID`, `VERCEL_OIDC_TOKEN`,
`TERMINAL_VERCEL_RUNTIME`
- Tests: deletes test_ai_gateway_models.py and
test_vercel_sandbox_environment.py; scrubs references across 23
surviving test files (no entire tests deleted unless they were
dedicated to AI Gateway / Sandbox)
- Docs: provider tables, env-var reference, setup guides, security
notes, tool config, terminal-backend tables — English plus zh-Hans
i18n parity
- `hermes-agent` skill: provider table entry and remote-backend list
What stays (intentional):
- `popular-web-designs/templates/vercel.md` — CSS design reference,
unrelated to Vercel-the-AI-product
- `x-vercel-id` in `stream_diag.py` headers — generic Vercel CDN
response header, useful diag signal on any Vercel-hosted endpoint
- `vercel-labs/agent-browser` URL in browser config — lightpanda
browser project, different OSS effort
- `userStories.json` historical contributor entry mentioning Vercel
Sandbox — archive, not active docs
Validation:
- 1153 tests in the 22 targeted files pass (`scripts/run_tests.sh`)
- Full repo `py_compile` clean
- Live import of every touched module + invariant check (no
`ai-gateway` in `PROVIDER_REGISTRY`, no `_AI_GATEWAY_HEADERS`, no
`vercel_sandbox` in `_REMOTE_TERMINAL_BACKENDS`)
* test: convert profile-count check from change-detector to invariant
The hardcoded "== 34" assertion broke when ai-gateway was removed.
Per AGENTS.md change-detector-test guidance, assert the relationship
(registry count >= number of plugin dirs) instead of a literal count.
Counts shift when providers are added/removed; that's expected.
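The shape of the invariant can be sketched like this. The helper names and the directory layout are illustrative assumptions, not the actual test code.

```python
from pathlib import Path


def count_plugin_dirs(root: Path) -> int:
    """Count immediate subdirectories, one per provider plugin."""
    return sum(1 for entry in root.iterdir() if entry.is_dir())


def check_registry_covers_plugins(registry: dict, plugins_root: Path) -> None:
    # Invariant: the registry holds at least one entry per plugin directory.
    # Adding or removing a provider changes both sides, so no literal count.
    assert len(registry) >= count_plugin_dirs(plugins_root)
```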
'hermes login' was removed (the command now just prints a deprecation
message and exits). The bundled hermes-agent SKILL.md, in-code error
messages, the tip rotation, the proxy adapters, and the docs site
still pointed agents and users at the dead command — so models loading
the skill kept running 'hermes login --provider openai-codex' and
getting a dead-end print.
Replacements use the canonical 'hermes auth add <provider>' surface
(or bare 'hermes auth' for the interactive manager).
Files:
- skills/autonomous-ai-agents/hermes-agent/SKILL.md (+ regenerated docs page)
- hermes_cli/tips.py (tip rotation)
- agent/google_oauth.py (gemini-cli error message)
- agent/conversation_loop.py (nous re-auth troubleshooting line)
- agent/credential_sources.py (docstring)
- hermes_cli/proxy/cli.py + hermes_cli/proxy/adapters/nous_portal.py (proxy auth hints)
- tests/hermes_cli/test_proxy.py (updated assertions)
- website/docs/reference/faq.md, website/docs/user-guide/features/subscription-proxy.md
- zh-Hans i18n mirrors for the above
'hermes logout' is still a live command and is left untouched.
The 'hermes login' stub in hermes_cli/auth.py:login_command() and
the cli-commands.md 'Deprecated' rows are intentionally kept as
the discoverable deprecation surface.
Phase 5 of the s6-overlay supervision plan. Documentation + small
diagnostic cleanups; no behavior changes.
website/docs/user-guide/docker.md:
- Replace the old 'entrypoint script does the bootstrap' section
with the s6-overlay boot flow (cont-init.d/01-hermes-setup,
cont-init.d/02-reconcile-profiles, static main-hermes + dashboard
services, ENTRYPOINT-as-main-program pattern).
- Add a 'Per-profile gateway supervision' subsection covering the
new lifecycle commands, restart semantics, log persistence, and
'Manager: s6 (container supervisor)' status reporting.
- Add 'Breaking change vs. pre-s6 images' callout naming the
/init ENTRYPOINT and pointing affected wrappers at the pin
workaround.
website/docs/user-guide/profiles.md:
- Add a note under 'Persistent services' pointing container users
at the docker.md section explaining s6 supervision inside the
image. Host-side systemd/launchd documentation is unchanged.
skills/software-development/hermes-s6-container-supervision/SKILL.md:
- New maintainer skill covering the supervision-tree map, file
layout, the Architecture B rationale (cont-init.d args + halt
exit-code propagation), quick recipes, and the 8 pitfalls we hit
while implementing the plan (PATH-without-/command, root-owned
profile dirs, SOUL.md as marker, the '143' anti-pattern, etc.).
hermes_cli/doctor.py:
- _check_gateway_service_linger skips on s6 (the linger concept
doesn't apply inside the container).
- New _check_s6_supervision section reports main-hermes/dashboard
state and per-profile-gateway count (registered vs supervised
up), only inside the s6 container. Host doctor output unchanged.
- External Tools / Docker check no longer emits a 'docker not
found' warning inside the container; prints an explanatory
info line instead. Still respects an explicit TERMINAL_ENV=docker
(in case the user mounted /var/run/docker.sock).
hermes_cli/gateway.py:
- Document _container_systemd_operational more precisely: it's
NOT for our Hermes Docker image (s6-overlay handles that via
detect_service_manager() == 's6'). It still covers
systemd-nspawn / k8s-with-systemd-init cases, so leaving it in
place is correct; the docstring just makes that explicit.
Test harness (verification, no test changes in this commit):
19 passed, 0 xfailed. 66 service-manager / container-boot /
profiles-s6-hooks / gateway-s6-dispatch unit tests still green.
61 doctor tests still green. Hadolint + shellcheck clean.
Refs: docs/plans/2026-05-07-s6-overlay-dynamic-subagent-gateways.md
PR #29182 deleted the per-session JSON snapshot writer outright because
state.db is canonical and the snapshots had no in-tree consumer. Some
users have external tooling that reads `~/.hermes/sessions/session_{sid}.json`
directly, so reintroduce the writer behind a config flag that defaults
to off.
- Add `sessions.write_json_snapshots` (default False) to DEFAULT_CONFIG
- Restore `AIAgent._save_session_log` + `_clean_session_content` as
gated methods. When the flag is off the call is a fast no-op; when
on, the writer behaves as before (atomic write, truncation guard
preserved, REASONING_SCRATCHPAD → think tag normalization)
- Re-derive the target path from `agent.session_id` on each call so
`/branch` and `/compress` re-points happen automatically — no need
to restore the explicit re-point bookkeeping at call sites
- Wire the single call site in `_persist_session` (the cleanup-on-exit
hook). Did NOT restore the 7 intra-turn calls the original PR deleted
— those were redundant writes within the same turn that doubled disk
I/O without adding any persistence guarantee `_persist_session` does
not already provide
- Read the flag once at agent init via `load_config()`, cache as
`agent._session_json_enabled`
- Update `TestNoSessionJsonSnapshot` → `TestSessionJsonSnapshotOptIn`
to pin behavior: default off (no file), opt-in true (file written),
no-op method on default agents, logs_dir retained unconditionally
- Update CONTRIBUTING.md and the bundled `hermes-agent` skill to
document the flag and its default
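A simplified sketch of the gating and path re-derivation, assuming a standalone class rather than the real `AIAgent` methods. `SnapshotWriter` and its payload shape are hypothetical, and the truncation guard and scratchpad normalization are omitted.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


class SnapshotWriter:
    def __init__(self, sessions_dir: Path, enabled: bool = False):
        self.sessions_dir = sessions_dir
        self.enabled = enabled  # read once at init and cached

    def save(self, session_id: str, messages: list) -> Optional[Path]:
        if not self.enabled:
            return None  # flag off: fast no-op, no file is created
        # Re-derive the target from the current session id on every call,
        # so a changed id is picked up without extra bookkeeping.
        target = self.sessions_dir / f"session_{session_id}.json"
        self.sessions_dir.mkdir(parents=True, exist_ok=True)
        fd, tmp_path = tempfile.mkstemp(dir=self.sessions_dir, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump({"session_id": session_id, "messages": messages}, fh)
        os.replace(tmp_path, target)  # atomic swap into place
        return target
```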
Salvages #26496 by @aqilaziz. Adds branch_name column + CLI flag so
tasks with workspace_kind='worktree' can pin a target branch on
create. Schema migration added to _migrate_add_optional_columns.
- Task.branch_name field + DB column + migration
- create_task accepts branch_name kwarg
- hermes kanban create --branch <name> flag
- kanban show output includes 'Branch: <name>' when set
Cherry-picked the substantive commit (a7558cf27); the PR's tip was
an unrelated service-path-dirs commit. Resolved 2 INSERT-column-list
and show-output conflicts alongside main's session_id and
max_runtime_seconds additions; kept all three.
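For readers unfamiliar with additive migrations, the idea can be sketched with sqlite3 as below. This is an assumption about the general technique, not the contents of `_migrate_add_optional_columns`; the function name and table layout are illustrative.

```python
import sqlite3


def add_optional_columns(conn, table, columns):
    """Add any missing columns; safe to run repeatedly on an existing DB."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    for name, declaration in columns.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {declaration}")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT)")
add_optional_columns(conn, "tasks", {"branch_name": "TEXT"})
add_optional_columns(conn, "tasks", {"branch_name": "TEXT"})  # second run is a no-op
```

Existing rows get NULL for the new column, so legacy cards keep working without a backfill.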
Salvages #28199 by @bensargotest-sys. Aligns Kanban docs with current
tool registration: dispatcher-spawned task workers get task tools,
profiles that explicitly enable the kanban toolset get orchestrator
routing tools (kanban_list, kanban_unblock). Corrects failure-limit
text to current default of 2. Hardens the e2e subprocess script to
resolve repo root and use the spawnable default assignee. Updates the
diagnostics severity fixture to assert error below the critical
threshold.
Addresses review feedback on #13193:
1. Reference-image flow no longer assumes write_file/read_file handle
binaries. vision_analyze produces a textual description; the binary
is optionally copied via terminal (cp/curl). The description is what
gets embedded in prompts.
2. image_generate's URL-only return is now explicit. Step 6 downloads
the returned URL to local disk via terminal (curl -sSL -o ...), then
verifies non-zero size before proceeding.
3. Removed "Please use nano banana pro..." line from prompts/system.md —
the backend is user-configured and not agent-selectable, so routing
hints in the prompt are misleading.
PORT_NOTES.md updated: prompts/system.md is no longer verbatim, and the
file-ops/backend-selection rows now reflect Hermes' actual tool surface
(write_file/read_file for text, terminal for binaries and URL downloads,
vision_analyze for reading images).
Adapts the upstream baoyu-article-illustrator skill (verbatim-copied in
the previous commit) to Hermes' tool ecosystem, matching the pattern
used by baoyu-infographic.
- Metadata: openclaw → hermes; add author, license, tags, category
- Triggering: slash command + CLI flags → natural language
- User config: remove EXTEND.md, first-time-setup, preferences-schema
- User prompts: AskUserQuestion (batched) → clarify (one at a time)
- Image gen: baoyu-imagine → image_generate (describe refs in prompt text)
- Platform: drop Windows/PowerShell; Linux/macOS only
- File ops: switch to write_file / read_file
- Watermark: opt-in per-article instead of EXTEND.md-driven
- Add PORT_NOTES.md describing the adaptation and sync procedure
Style, palette, and prompt/system.md reference files are verbatim copies
and are the sync points with upstream.
Six days after #23937 (608 fixes) the codebase had accumulated 241 new
PLR6201 violations. Same mechanical `x in (...)` → `x in {...}` fix,
same zero-risk profile: set lookup is average-case O(1) vs O(n) for a tuple, and the
two are semantically equivalent for hashable scalar membership tests.
All 241 instances fixed via `ruff check --select PLR6201 --fix
--unsafe-fixes`, zero remaining. Every changed value is a hashable
scalar (str/int/None/enum/signal); no risk of unhashable runtime
errors. No behavior change.
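A small illustration of why the hashable-scalar condition matters. The function names and status values are made up for the example.

```python
def is_active_tuple(status):
    return status in ("ready", "running", "blocked")


def is_active_set(status):
    return status in {"ready", "running", "blocked"}


# For hashable scalars the two forms always agree.
agree = all(
    is_active_tuple(s) == is_active_set(s)
    for s in ("ready", "done", None, 3)
)

# The forms differ only when the left operand is unhashable: the tuple
# test returns False, while the set test raises TypeError.
tuple_result = is_active_tuple(["ready"])
try:
    is_active_set(["ready"])
    set_raised = False
except TypeError:
    set_raised = True
```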
Test plan:
- 119 files changed, +244/-244 (net zero) — exactly one-line edits
- `ruff check` clean afterward
- Compile checks pass on the largest touched files (cli.py, run_agent.py,
gateway/run.py, gateway/platforms/discord.py, model_tools.py)
- Subset broad test run on tests/gateway/ tests/hermes_cli/ tests/agent/
tests/tools/: 18187 passed, 59 pre-existing failures (verified against
origin/main with the same shape — identical failure count, identical
category — all xdist test-order flakes unrelated to this change)
Follows the same template as PR #23937 ([tracker: #23972](https://github.com/NousResearch/hermes-agent/issues/23972)).
* feat(skills/notion): overhaul for Notion Developer Platform (May 2026)
Notion shipped its Developer Platform on May 13, 2026: ntn CLI, Workers,
Markdown API, bidirectional webhooks, agent tools. The existing skill only
covered curl + integration token CRUD, so it didn't surface any of the new
ergonomics — particularly the /markdown endpoints (much easier for agents
to consume) and the ntn CLI for headless API + Workers management.
This rewrite (v1.0.0 -> v2.0.0):
- Splits setup into Path A (HTTP, cross-platform incl. Windows), Path B
(ntn CLI on macOS/Linux, with NOTION_API_TOKEN env var for headless),
and Path C (Windows fallback — HTTP API or WSL2; native ntn is 'coming
soon').
- Keeps the full curl reference (still the only Windows-compatible path).
- Adds /markdown endpoints — GET and PATCH page-as-markdown, plus POST
/v1/pages with a markdown body param. Agent-friendly, no CLI required.
- Adds ntn CLI cheat sheet for raw API shorthand, file uploads, and
workspace flags.
- Adds Notion Workers section: scaffold, tool/webhook capability shapes,
lifecycle commands. Gated on Business/Enterprise plans + macOS/Linux.
- Adds Notion-flavored Markdown reference (callouts, toggles, columns,
mentions, colors) for the /markdown endpoints.
- Adds a 'choose the right path' decision table at the bottom.
- Notes the new efficient Notion MCP server as an optional wiring path.
Auto-generated docs page regenerated via
website/scripts/generate-skill-docs.py.
* docs(skills-catalog): update notion description for v2.0.0
Adds references/template-integrity.md covering safe conversion of the
official comfyui-workflow-templates package from editor format to API
format — Reroute bypass via link tracing, dotted dynamic-input keys
(values.a, resize_type.width) that must NOT be flattened, server-error
"patch don't rebuild" loop, Cloud quirks (302 redirect to signed GCS
URL, free-tier 1 concurrent job, 1920x1080 OOM on RTX 5090), and a
Discord-compatible ffmpeg stitch recipe (yuv420p + xfade/acrossfade).
SKILL.md lists the new reference so the agent loads it when starting
from an official template. purzbeats added to author list and to
scripts/release.py AUTHOR_MAP.
Co-authored-by: purzbeats <97489706+purzbeats@users.noreply.github.com>
Closes the architectural-pin part of #19931. Most of what that issue
asked for is already implemented (logs under kanban root, env-pinned
workspace, dispatcher routing of unknown assignees, lifecycle
ownership, structured handoff conventions). What was missing:
1. A written contract integrators can point at when adding a new
worker lane shape, and
2. The "code-changing workers should not auto-promote success to
done" convention.
This commit ships both as docs+convention layered on existing primitives.
No kernel changes — the kanban_complete / kanban_block / kanban_comment
surfaces already support the review-required pattern; we just hadn't
written it down or made it visible to workers.
Changes:
- `agent/prompt_builder.py::KANBAN_GUIDANCE`: append the review-required
exception to step 5 of the lifecycle. Workers get the cue
auto-injected into their system prompt — drop structured metadata
into a kanban_comment first, then end with
kanban_block(reason="review-required: <summary>") instead of
kanban_complete when the work needs review. Total prompt size went
from ~3000 to ~3275 chars; well under the 4096 budget enforced by
test_kanban_guidance_size.
- `skills/devops/kanban-worker/SKILL.md`: add a worked example to the
existing "Good summary + metadata shapes" section between the
Coding-task and Research-task examples. Same shape as the others
(kanban_comment with structured handoff JSON, then kanban_block with
the human-readable reason). Plus a one-line guide on when to use
kanban_complete vs the review-required pattern.
- `website/docs/user-guide/features/kanban-worker-lanes.md` (new): the
integrator-facing contract. Covers the hierarchy, the three things
every lane must provide (assignee, spawn mechanism, lifecycle
terminator), the env vars the dispatcher injects, the
review-required convention, the failure modes the kernel handles
for free, and an explicit "external CLI worker lane" deferred-
pending-concrete-asker section that links to #19931 and #19924.
- `website/sidebars.ts`: link the new page under user-guide/features.
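The review-required convention amounts to the following ordering of calls. This sketch passes the three kanban tools in as plain callables; their exact signatures, the `finish_task` helper, and the metadata shape are assumptions, with only the `reason="review-required: <summary>"` form taken from the guidance text.

```python
import json


def finish_task(needs_review, summary, metadata,
                kanban_comment, kanban_block, kanban_complete):
    """End a worker run with either a review handoff or a plain completion."""
    if needs_review:
        # Structured handoff first, so the reviewer sees it in the thread.
        kanban_comment(json.dumps(metadata))
        # Then block instead of completing; the reason is human-readable.
        kanban_block(reason=f"review-required: {summary}")
    else:
        kanban_complete(summary=summary)
```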
The "specialist worker lanes for external CLI tools (Codex / Claude
Code / OpenCode)" runner is NOT shipped here. The dispatcher's
spawn_fn parameter already supports plugin-shaped extension; the
per-CLI integration work (auth, sandbox policy, exit-code mapping)
needs a concrete asker. The new docs page tells would-be integrators
the contract any such lane must satisfy.
Refs #19931
The skill enumerated 8 specialist profile names (researcher, analyst,
writer, reviewer, backend-eng, frontend-eng, ops, pm) as "the standard
roster" and told orchestrators to "assume these exist." Almost no real
Hermes setup matches that fleet — single-profile setups, Docker-worker
setups, and curated-team setups all violate it — so following the skill
literally produced cards assigned to non-existent profiles, which the
dispatcher silently failed to spawn (no autocorrect, no fallback; the
card just sits in `ready` forever).
Changes:
- Drop the standard-specialist-roster table.
- Add a "Profiles are user-configured — not a fixed roster" section at
the top with a Step 0 that prescribes `hermes profile list` (or asking
the user) before fanning out. Cache the result in working memory.
- Rewrite the worked task-graph example with placeholder names
(<profile-A>, <profile-B>, <profile-C>) so the structure is still
teachable but doesn't invite copy-paste of role names that may not
exist.
- Reframe the "If no specialist fits" anti-temptation rule: don't
invent profile names; ask the user.
- Add an "Inventing profile names that don't exist" entry to Pitfalls.
- Bump skill version 2.0.0 → 3.0.0 (semantic break: previous behavior
promised a roster the skill no longer enumerates).
- Update website/docs/user-guide/features/kanban.md to drop the
matching "(researcher, writer, analyst, backend-eng, reviewer, ops)"
line and explain the discovery prompt instead.
- Re-run website/scripts/generate-skill-docs.py to refresh the
auto-generated skill page + catalog.
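The Step 0 check can be thought of as a simple pre-flight over the planned cards. The card dictionaries and the `unknown_assignees` helper below are hypothetical; in practice the known profile names would come from `hermes profile list` or from the user.

```python
def unknown_assignees(cards, known_profiles):
    """Return assignee names that do not match any configured profile."""
    known = set(known_profiles)
    return sorted({c["assignee"] for c in cards if c["assignee"] not in known})


planned = [
    {"title": "Collect sources", "assignee": "default"},
    {"title": "Draft report", "assignee": "writer"},
]
missing = unknown_assignees(planned, known_profiles=["default"])
# A non-empty result means: ask the user instead of creating the cards.
```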
Closes #21131 in spirit: addresses the same hardcoded-names footgun
@yehuosi flagged, with a different shape than their PR (delete the
roster rather than replace each name with placeholder, since the
roster table was the load-bearing footgun and the worked example is
salvageable with placeholder profile names).
Co-authored-by: yehuosi <yehuosi@users.noreply.github.com>
These skills require heavy GPU/CUDA stacks or are niche enough that they shouldn't
be active by default. Moved to optional-skills/ where users opt-in via
`hermes skills install official/...`.
Moved:
- mlops/training/axolotl
- mlops/training/trl-fine-tuning
- mlops/training/unsloth
- mlops/inference/outlines
Counts: 91 -> 87 built-in, 72 -> 76 optional.
Auto-regenerated docs (per-skill pages + catalogs) reflect the move.
Completes the Windows-gating coverage for the built-in skills/ tree. Every
bundled SKILL.md now carries an explicit platforms: declaration so the
loader (agent.skill_utils.skill_matches_platform) can skip-load skills
that don't fit the current OS.
74 skills declared cross-platform (platforms: [linux, macos, windows]):
Creative (16): ascii-art, ascii-video, architecture-diagram, baoyu-comic,
baoyu-infographic, claude-design, creative-ideation, design-md,
excalidraw, humanizer, manim-video, p5js, pixel-art,
popular-web-designs, pretext, sketch, songwriting-and-ai-music,
touchdesigner-mcp
Autonomous agents: claude-code, codex, hermes-agent, opencode
Data/devops: jupyter-live-kernel, kanban-orchestrator, kanban-worker,
webhook-subscriptions, dogfood, codebase-inspection
GitHub: github-auth, github-code-review, github-issues,
github-pr-workflow, github-repo-management
Media: gif-search, heartmula, songsee, spotify, youtube-content
MCP / email / gaming / notes / smart-home: native-mcp, himalaya,
pokemon-player, obsidian, openhue
mlops (non-broken): weights-and-biases, huggingface-hub, llama-cpp,
outlines, segment-anything-model, dspy, trl-fine-tuning
Productivity: airtable, google-workspace, linear, maps, nano-pdf,
notion, ocr-and-documents, powerpoint
Red-teaming / research: godmode, arxiv, blogwatcher, llm-wiki,
polymarket
Software-dev: debugging-hermes-tui-commands, hermes-agent-skill-authoring,
node-inspect-debugger, plan, requesting-code-review, spike,
subagent-driven-development, systematic-debugging,
test-driven-development, writing-plans
Misc: yuanbao
5 skills gated from Windows (platforms: [linux, macos]):
  mlops/inference/vllm (serving-llms-vllm)
    vLLM is officially Linux-only; Windows requires WSL.
  mlops/training/axolotl
    Axolotl's flash-attn + deepspeed + bitsandbytes stack is Linux-first.
  mlops/training/unsloth
    Requires Triton + xformers + flash-attn; Linux only in practice.
  mlops/models/audiocraft (audiocraft-audio-generation)
    torchaudio ffmpeg backend + encodec dependencies are Linux-first.
  mlops/inference/obliteratus
    Research abliteration workflow; relies on Linux-focused pytorch
    kernels and MLX, with no first-class Windows path.
Same strict-over-lenient policy as the optional-skills sweep: when the
underlying tool's Windows support is rough, missing, or WSL-only, gate the
skill. Easier to un-gate after verified Windows support lands than to leak
partial support that manifests as mid-task failures.
Combined with prior commits in this branch, every bundled SKILL.md
(skills/ + optional-skills/) now has a platforms: declaration.
Hermes's skill loader (agent/skill_utils.skill_matches_platform) already honors
the 'platforms:' frontmatter field and skip-loads skills whose declared
platform list doesn't include sys.platform. Seven bundled skills are in fact
Linux/macOS-only but never declared it, so they leak into Windows skill
listings and sometimes load with broken instructions.
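A sketch of what platform gating of this kind typically looks like. This is not the code in `agent/skill_utils`; the function name, the name mapping, and the treatment of a missing `platforms:` field (load everywhere, consistent with the leak described above) are assumptions.

```python
import sys

# Map sys.platform prefixes to the names used in SKILL.md frontmatter.
_PLATFORM_NAMES = {"linux": "linux", "darwin": "macos", "win32": "windows"}


def matches_platform(frontmatter, platform=None):
    """Return True when the skill should load on the given platform."""
    declared = frontmatter.get("platforms")
    if not declared:
        return True  # assumption: an undeclared skill loads everywhere
    current = platform or sys.platform
    for prefix, name in _PLATFORM_NAMES.items():
        if current.startswith(prefix):
            return name in declared
    return False  # unrecognized platform with an explicit list: skip
```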
Audited all 160 SKILL.md files (skills/ + optional-skills/) for Windows-
hostile signals: apt-get/brew/systemd/chmod+x install flows, ptrace/proc
runtime dependencies, bash-only launcher scripts, and package dependencies
with no Windows build. The 7 below fail one or more of those tests in a way
that fundamentally can't be papered over by docs edits:
  minecraft-modpack-server                 bash start.sh + chmod +x + apt openjdk
  evaluating-llms-harness                  lm-eval-harness bash launcher scripts
  distributed-llm-pretraining-torchtitan   bash multi-node torchrun launcher
  python-debugpy                           remote attach relies on /proc ptrace_scope
  pytorch-fsdp                             NCCL backend; Windows path is WSL only
  tensorrt-llm                             NVIDIA TensorRT-LLM has no Windows build
  searxng-search                           Docker volume flow assumes POSIX $(pwd)
All seven get 'platforms: [linux, macos]'. On Windows the loader now skips
them silently — no more phantom skill listings, no more mid-task failures
because an Apple-only path was surfaced as a suggestion.
Cross-platform skills that merely CONTAIN signals in examples or
install-instructions (brew install as one of several paths, /tmp/ in a code
snippet, etc.) are NOT touched by this commit. A broader audit that
declares the ~140 cross-platform skills as 'platforms: [linux, macos,
windows]' can follow as a separate change once each has been verified
working on Windows.
The installed user copies under ~/AppData/Local/hermes/skills/ (when they
exist) are also patched so the running session reflects the gating
immediately, but only the in-repo files are committed here.
Adds a dedicated '## Windows-Specific Quirks' section to the hermes-agent
skill so Windows pitfalls have one discoverable place to evolve. Inaugural
entries cover:
- Input / keybindings — Alt+Enter intercepted by Windows Terminal,
Ctrl+Enter as the Windows newline keystroke, mintty/git-bash behavior,
pointer to scripts/keystroke_diagnostic.py for investigation.
- Config / files — UTF-8 BOM HTTP-400 trap.
- execute_code / sandbox — WinError 10106 SYSTEMROOT root cause +
_WINDOWS_ESSENTIAL_ENV_VARS fix location.
- Testing / contributing — scripts/run_tests.sh POSIX-venv limitation and
the system-Python workaround, POSIX-only test skip-guard patterns.
- Path / filesystem — line-ending warnings (cosmetic), forward-slash
portability.
Collapses the old scattered Windows bullets under 'Platform-specific
issues' into a single pointer at the new dedicated section so there's
only one place to maintain this content.
Also adds the scripts/keystroke_diagnostic.py the skill now references —
a small prompt_toolkit Application that prints the Keys.* identifier and
raw escape bytes for every keystroke. Used to establish the Ctrl+Enter
= c-j fact on Windows Terminal; generally useful for anyone adding a
platform-aware keybinding.
Fifth and final slice polish on top of @dlkakbs's docs + skill. Three
things ship here:
1. Subscription renewal cron recipe (the #1 operational footgun).
Microsoft Graph webhook subscriptions expire at 72 hours max and
don't auto-renew. The shipped operator runbook mentioned
`maintain-subscriptions --dry-run` as a "daily or periodic check"
but never told operators how to actually automate it. Without a
scheduled job, any production deployment silently stops ingesting
meetings three days after go-live.
Adds an "Automating subscription renewal (REQUIRED for production)"
section to website/docs/guides/operate-teams-meeting-pipeline.md
with three concrete options and copy-pasteable configs:
- Option 1: Hermes cron (`hermes cron add --schedule "0 */12 * * *"
--script-only --command "hermes teams-pipeline maintain-subscriptions"`)
- Option 2: systemd service + timer (12h cadence, Persistent=true
so missed runs catch up after reboots)
- Option 3: plain crontab with a wrapper that sources .env for
credentials
Go-Live Checklist gains a bolded mandatory item for the schedule
being in place, with a cross-link to the section.
website/docs/user-guide/messaging/teams-meetings.md adds a
`:::warning` admonition right after the manual `subscribe`
examples so anyone who creates a subscription manually is told
the same day that it will silently expire in 72 hours.
2. Sidebar wiring. Shela's new docs pages (teams-meetings.md and
operate-teams-meeting-pipeline.md) weren't in website/sidebars.ts,
so they were orphaned URLs — reachable only if someone knew the
path. Wired teams-meetings into Messaging Platforms next to the
existing teams entry, and operate-teams-meeting-pipeline into
Guides & Tutorials next to microsoft-graph-app-registration from
PR #21922. Adjacent placement keeps the related pages discoverable
from each other.
3. SKILL.md rewrite (v1.0.0 → v1.1.0).
The original skill had five Turkish-only trigger phrases, which
work in a Turkish-speaking session but don't match requests made
in English. Rewrote the skill to:
- Describe triggers by intent instead of exact phrases, with
explicit "works in any language" framing and example phrases
in both English and Turkish.
- Add a Decision Tree section covering the three most common user
asks (missing summary, setup verification, re-run request) and
the specific CLI command sequence for each.
- Add a dedicated "Critical pitfall: Graph subscriptions expire
in 72 hours" section that tells the agent exactly what to do
when a user reports "worked yesterday, nothing today" — the
most common operational failure mode.
- Expand the command reference into three labeled groups (Status
and inspection / Re-running and debugging / Subscription
management) so the agent can reach for the right command
without scanning.
- Add cross-links to all four related docs pages (Azure app
registration, webhook listener setup, full pipeline setup,
operator runbook).
Validation:
- npm run build: all new pages route, and the anchor
#automating-subscription-renewal-required-for-production resolves
from both the runbook TOC and the teams-meetings.md admonition.
- scripts/run_tests.sh on the relevant test suites (607 tests): all
pass.
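The 72-hour expiry and the 12-hour schedule above leave some slack, which a small calculation makes concrete. This is only a sketch of the timing arithmetic: `needs_renewal` and the 24-hour safety margin are hypothetical, and nothing here describes what `maintain-subscriptions` does internally.

```python
from datetime import datetime, timedelta, timezone

MAX_LIFETIME = timedelta(hours=72)  # expiry ceiling stated in the runbook text
RENEW_EVERY = timedelta(hours=12)   # cadence used by the cron/timer options


def needs_renewal(expires_at: datetime, now: datetime,
                  safety_margin: timedelta = 2 * RENEW_EVERY) -> bool:
    """True when the subscription expires within the safety margin.

    The margin (assumed: two scheduler intervals) means one missed run
    still leaves another chance to renew before expiry.
    """
    return expires_at - now <= safety_margin


created = datetime(2025, 1, 1, 0, 0, tzinfo=timezone.utc)
expires = created + MAX_LIFETIME

# Walk the 12-hour schedule and find the first run that would renew.
run = created
while not needs_renewal(expires, run):
    run += RENEW_EVERY
first_renewing_run = run

# How many scheduled runs fit between that point and expiry.
runs_before_expiry = (expires - first_renewing_run) // RENEW_EVERY
print(first_renewing_run.isoformat(), runs_before_expiry)
```

With no scheduled job at all, the same arithmetic gives the failure described above: nothing renews, and ingestion stops 72 hours after the subscription was created.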
Background macOS desktop control via cua-driver MCP — does NOT steal the
user's cursor or keyboard focus, works with any tool-capable model.
Replaces the Anthropic-native `computer_20251124` approach from the
abandoned #4562 with a generic OpenAI function-calling schema plus SOM
(set-of-mark) captures so Claude, GPT, Gemini, and open models can all
drive the desktop via numbered element indices.
- `tools/computer_use/` package — swappable ComputerUseBackend ABC +
CuaDriverBackend (stdio MCP client to trycua/cua's cua-driver binary).
- Universal `computer_use` tool with one schema for all providers.
Actions: capture (som/vision/ax), click, double_click, right_click,
middle_click, drag, scroll, type, key, wait, list_apps, focus_app.
- Multimodal tool-result envelope (`_multimodal=True`, OpenAI-style
`content: [text, image_url]` parts) that flows through
handle_function_call into the tool message. Anthropic adapter converts
into native `tool_result` image blocks; OpenAI-compatible providers
get the parts list directly.
- Image eviction in convert_messages_to_anthropic: only the 3 most
recent screenshots carry real image data; older ones become text
placeholders to cap per-turn token cost.
- Context compressor image pruning: old multimodal tool results have
their image parts stripped instead of being skipped.
- Image-aware token estimation: each image counts as a flat 1500 tokens
instead of its base64 char length (~1MB would have registered as
~250K tokens before).
- COMPUTER_USE_GUIDANCE system-prompt block — injected when the toolset
is active.
- Session DB persistence strips base64 from multimodal tool messages.
- Trajectory saver normalises multimodal messages to text-only.
- `hermes tools` post-setup installs cua-driver via the upstream script
and prints permission-grant instructions.
- CLI approval callback wired so destructive computer_use actions go
through the same prompt_toolkit approval dialog as terminal commands.
- Hard safety guards at the tool level: blocked type patterns
(curl|bash, sudo rm -rf, fork bomb), blocked key combos (empty trash,
force delete, lock screen, log out).
- Skill `apple/macos-computer-use/SKILL.md` — universal (model-agnostic)
workflow guide.
- Docs: `user-guide/features/computer-use.md` plus reference catalog
entries.
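The screenshot-eviction rule in the list above can be sketched on OpenAI-style content parts. This is an illustration under assumptions, not the code in convert_messages_to_anthropic: the function name `evict_old_images`, the placeholder text, and the message shape are hypothetical.

```python
import copy

MAX_LIVE_IMAGES = 3  # "only the 3 most recent screenshots carry real image data"
PLACEHOLDER = {"type": "text", "text": "[screenshot removed to save tokens]"}


def evict_old_images(messages: list, keep: int = MAX_LIVE_IMAGES) -> list:
    """Return a copy of messages where only the newest `keep` image parts survive."""
    out = copy.deepcopy(messages)
    remaining = keep
    # Walk newest-to-oldest so the budget is spent on the most recent images.
    for msg in reversed(out):
        content = msg.get("content")
        if not isinstance(content, list):
            continue  # plain-string content has no image parts
        for i in range(len(content) - 1, -1, -1):
            if content[i].get("type") != "image_url":
                continue
            if remaining > 0:
                remaining -= 1
            else:
                content[i] = dict(PLACEHOLDER)  # older image becomes text
    return out


def shot(n: int) -> dict:
    """Build a fake tool message carrying one text part and one image part."""
    return {
        "role": "tool",
        "content": [
            {"type": "text", "text": f"capture {n}"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,IMG{n}"}},
        ],
    }


history = [shot(n) for n in range(5)]
trimmed = evict_old_images(history)
kinds = [m["content"][1]["type"] for m in trimmed]
print(kinds)
```

Working on a deep copy keeps the stored history intact, so eviction only affects what is sent on a given turn.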
44 new tests in tests/tools/test_computer_use.py covering schema
shape (universal, not Anthropic-native), dispatch routing, safety
guards, multimodal envelope, Anthropic adapter conversion, screenshot
eviction, context compressor pruning, image-aware token estimation,
run_agent helpers, and universality guarantees.
469/469 pass across tests/tools/test_computer_use.py + the affected
agent/ test suites.
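The effect of image-aware token estimation is easiest to see with numbers. The sketch below assumes a rough 4-characters-per-token heuristic for text, which is not taken from the source; only the flat 1500-token image cost comes from the description above. `estimate_tokens` is a hypothetical name.

```python
IMAGE_TOKEN_COST = 1500  # flat per-image cost stated above
CHARS_PER_TOKEN = 4      # assumed text heuristic, not from the source


def estimate_tokens(content) -> int:
    """Estimate tokens for a string or a list of OpenAI-style content parts."""
    if isinstance(content, str):
        return len(content) // CHARS_PER_TOKEN
    total = 0
    for part in content:
        if part.get("type") == "image_url":
            total += IMAGE_TOKEN_COST  # ignore the base64 payload length
        else:
            total += len(part.get("text", "")) // CHARS_PER_TOKEN
    return total


fake_b64 = "A" * 1_000_000  # stands in for a ~1MB base64 screenshot
parts = [
    {"type": "text", "text": "x" * 400},
    {"type": "image_url",
     "image_url": {"url": "data:image/png;base64," + fake_b64}},
]

naive = len(str(parts)) // CHARS_PER_TOKEN  # counts every base64 character
aware = estimate_tokens(parts)
print(naive, aware)
```

The naive count lands above 250K tokens for a single screenshot, while the image-aware count stays at 1600, which is what keeps the context compressor from firing on every capture.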
- `model_tools.py` provider-gating: the tool is available to every
provider. Providers without multi-part tool message support will see
text-only tool results (graceful degradation via `text_summary`).
- Anthropic server-side `clear_tool_uses_20250919` — deferred;
client-side eviction + compressor pruning cover the same cost ceiling
without a beta header.
- macOS only. cua-driver uses private SkyLight SPIs
(SLEventPostToPid, SLPSPostEventRecordTo,
_AXObserverAddNotificationAndCheckRemote) that can break on any macOS
update. Pin with HERMES_CUA_DRIVER_VERSION.
- Requires Accessibility + Screen Recording permissions — the post-setup
prints the Settings path.
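The graceful-degradation note above (text-only results via `text_summary`) might look roughly like this. Where `text_summary` lives in the envelope and the fallback string are assumptions, and `to_text_only` is a hypothetical helper, not a function in model_tools.py.

```python
def to_text_only(result: dict) -> str:
    """Collapse a multimodal tool result into plain text for text-only providers."""
    summary = result.get("text_summary")
    if summary:
        return summary
    # No summary supplied: fall back to whatever text parts exist.
    texts = [p.get("text", "") for p in result.get("content", [])
             if p.get("type") == "text"]
    return "\n".join(t for t in texts if t) or "[non-text tool result omitted]"


envelope = {
    "_multimodal": True,
    # Assumed location of the summary field inside the envelope.
    "text_summary": "Captured Safari window, 14 elements indexed.",
    "content": [
        {"type": "text", "text": "capture ok"},
        {"type": "image_url",
         "image_url": {"url": "data:image/png;base64,AAAA"}},
    ],
}

flat = to_text_only(envelope)
no_summary = to_text_only({"content": envelope["content"]})
print(flat)
print(no_summary)
```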
Supersedes PR #4562 (pyautogui/Quartz foreground backend, Anthropic-
native schema). Credit @0xbyt4 for the original #3816 groundwork whose
context/eviction/token design is preserved here in generic form.
Expand the google-workspace skill beyond read-only access to Drive and
Docs. Sheets already had full scope — just adds the missing create verb.
New subcommands:
- drive get : metadata for a single file
- drive upload : upload a local file (auto MIME detection)
- drive download : download or export (Docs/Sheets/Slides export to pdf/csv/pdf by default)
- drive create-folder
- drive share : user/group/domain/anyone + reader/writer/etc.
- drive delete : default trashes (reversible); --permanent skips the trash
- sheets create : new spreadsheet with optional first-tab name
- docs create : new doc, optional initial body
- docs append : append text at end of an existing doc
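The default export behaviour of `drive download` can be sketched as a lookup from Google-native MIME types to export formats. The MIME strings are the standard Drive ones; the function name `plan_download` and the returned dict shape are hypothetical and not taken from the skill's script.

```python
GOOGLE_DOC = "application/vnd.google-apps.document"
GOOGLE_SHEET = "application/vnd.google-apps.spreadsheet"
GOOGLE_SLIDES = "application/vnd.google-apps.presentation"

# Defaults described above: Docs -> pdf, Sheets -> csv, Slides -> pdf.
DEFAULT_EXPORT = {
    GOOGLE_DOC: ("application/pdf", ".pdf"),
    GOOGLE_SHEET: ("text/csv", ".csv"),
    GOOGLE_SLIDES: ("application/pdf", ".pdf"),
}


def plan_download(mime_type: str, name: str) -> dict:
    """Decide between a plain download and an export for one Drive file."""
    if mime_type in DEFAULT_EXPORT:
        export_mime, ext = DEFAULT_EXPORT[mime_type]
        # Google-native files have no binary content, so they must be exported.
        return {"action": "export", "mime": export_mime, "filename": name + ext}
    # Anything else (PDF, PNG, zip, ...) is fetched as stored.
    return {"action": "download", "mime": mime_type, "filename": name}


sheet_plan = plan_download(GOOGLE_SHEET, "budget")
pdf_plan = plan_download("application/pdf", "report.pdf")
print(sheet_plan)
print(pdf_plan)
```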
Scope changes:
- drive.readonly -> drive
- documents.readonly -> documents
Existing users with old tokens will hit the existing partial-scope
warning path (AUTHENTICATED (partial) ...) — the troubleshooting table
now points them at $GSETUP --revoke + redo steps 3-5 to pick up the
write scopes.
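Why old tokens land on the partial-scope path follows from a set difference between required and granted scopes. The sketch covers only the two scopes changed here and assumes the check works this way; `missing_scopes` is a hypothetical name, and the real skill requests more scopes than these.

```python
# Only the two scopes touched by this change; the skill requests others too.
REQUIRED_SCOPES = {
    "https://www.googleapis.com/auth/drive",
    "https://www.googleapis.com/auth/documents",
}


def missing_scopes(granted) -> list:
    """Return required scopes absent from the token, sorted for stable output."""
    return sorted(REQUIRED_SCOPES - set(granted))


# A token issued before the change still carries the read-only variants.
old_token_scopes = [
    "https://www.googleapis.com/auth/drive.readonly",
    "https://www.googleapis.com/auth/documents.readonly",
]

gaps = missing_scopes(old_token_scopes)
status = "AUTHENTICATED (partial)" if gaps else "AUTHENTICATED"
print(status, gaps)
```

A read-only scope never satisfies its write counterpart, so the only fix is a fresh consent, which is what the revoke-and-redo instruction achieves.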
Small follow-ups on top of #19643:
- check_auth() takes a quiet kwarg to suppress its AUTHENTICATED print
when called from check_auth_live(), so the final status line reflects
the live-call outcome only.
- Drop redundant _ensure_deps() call in check_auth_live() (check_auth()
already calls it).
- Add AUTHOR_MAP entry for ygd58 so the release attribution script works.
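The quiet kwarg follow-up can be sketched as below. The function bodies are stand-ins: the token check always succeeds, the LIVE_CHECK_FAILED and NOT_AUTHENTICATED strings are hypothetical, and only the `quiet` parameter and the AUTHENTICATED print are taken from the description above.

```python
import contextlib
import io


def check_auth(quiet: bool = False) -> bool:
    """Stand-in for the stored-token check; always succeeds in this sketch."""
    token_ok = True  # hypothetical: the real check inspects the saved token
    if token_ok and not quiet:
        print("AUTHENTICATED")
    return token_ok


def check_auth_live(api_call) -> bool:
    """Run the token check silently, then report only the live-call outcome."""
    if not check_auth(quiet=True):
        print("NOT_AUTHENTICATED")  # hypothetical status string
        return False
    try:
        api_call()
    except Exception as exc:  # sketch only: real code would catch narrower errors
        print(f"LIVE_CHECK_FAILED: {exc}")  # hypothetical status string
        return False
    print("AUTHENTICATED")
    return True


def failing_call():
    raise RuntimeError("token revoked")


buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    result = check_auth_live(failing_call)
lines = buf.getvalue().splitlines()
print(result, lines)
```

Without `quiet=True`, the failing case would print AUTHENTICATED first and the failure second, leaving two contradictory status lines.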