hermes-agent

Author	SHA1	Message	Date
Ben Barclay	c10ccaaf51	feat(dashboard-auth): rotate dashboard sessions via refresh token (#37247) * feat(dashboard-auth): rotate dashboard sessions via refresh token The dashboard auth-code grant now issues a 24h rotating refresh token (server side: NousResearch/nous-account-service#293). This wires up the Hermes client half so an expired access token is transparently refreshed instead of bouncing the user to /login every 15 minutes. plugins/dashboard_auth/nous: - refresh_session() now POSTs grant_type=refresh_token to Portal's token endpoint and returns a Session carrying the ROTATED refresh token (was an unconditional RefreshExpiredError under the old "no RT in V1" contract). The RT is sent in BOTH the request body (Portal's schema requires it there) and the X-Refresh-Token header (log redaction) — verified against the #293 preview deploy: header-only is rejected as invalid_request, body is accepted. - A 400 from Portal (expired / revoked / reuse-detected) maps to RefreshExpiredError so the middleware forces a clean re-login; network errors map to ProviderError; empty RT fast-fails without a network call. - complete_login now captures the initial refresh token Portal returns (forward-tolerant: empty string if a deploy omits it). - Extracted the shared token-response handling into _token_response_to_session, parameterised on the 400 exception type so the auth-code path raises InvalidCodeError and the refresh path raises RefreshExpiredError. - revoke_session stays a best-effort no-op: Portal exposes no public token-endpoint revocation grant (revocation is the authenticated /sessions UI, keyed by sessionId+userId), so logout is cookie-clearing and the 24h session expires on its own. Documented for a future revoke grant. hermes_cli/dashboard_auth/middleware: - On an expired/invalid access token the gate now attempts refresh via the session's RT BEFORE forcing re-login. 
On success it serves the request and re-sets the rotated cookies on the response (mandatory: Portal rotates the RT every refresh and reuse-detects, so a stale RT cookie would revoke the whole session on the next refresh). On RefreshExpiredError (or no RT) it falls through to clear-and-relogin. - ProviderError during refresh (Portal unreachable) forces a clean re-login rather than 500-ing the request. - Uses the existing REFRESH_SUCCESS / REFRESH_FAILURE audit events. Validation: - 176 dashboard-auth unit/integration tests pass. - Live E2E against the #293 preview deploy: refresh_session(bad rt) -> RefreshExpiredError through the real token endpoint; live JWKS fetch + RS256 verification rejects a forged token; empty-RT fast-fail. The successful happy-path rotation is covered by unit tests (a live run needs an interactive browser OAuth round trip + registered agent:* client). Depends on: NousResearch/nous-account-service#293 (server-side RT issuance). * fix(dashboard-auth): use Portal's x-nous-refresh-token header name The refresh-token header must match Portal's REFRESH_TOKEN_HEADER exactly ("x-nous-refresh-token"); the initial cut used "X-Refresh-Token", which Portal silently ignores (harmless since the RT is also in the body, which is what the schema requires — but the header redaction was a no-op). Confirmed against the NAS token route + re-validated live against the #293 preview deploy. * fix(dashboard-auth): refresh session when access-token cookie has been evicted The gated middleware bounced users to /login the instant the access-token cookie was absent, without ever consulting the refresh token: at, _rt = read_session_cookies(request) if not at: return _unauth_response(...) # bailed here This made transparent refresh effectively dead for the common case. The access-token cookie is set with Max-Age = access_token_expires_in (~15 min), so a real browser EVICTS hermes_session_at the moment the token lapses while hermes_session_rt persists (30-day Max-Age). 
From that point the browser sends only the refresh-token cookie — and the old guard rejected it before _attempt_refresh could run. The _attempt_refresh path only fired for a present-but-invalid access token, which never happens in a browser. Fix: only hard-bounce when NEITHER cookie is present. A request carrying just the refresh token now skips verification (no AT to verify) and flows into the existing refresh path, which rotates both cookies and serves the request transparently. A dead/expired RT still raises RefreshExpiredError and falls through to clear-and-relogin. This failure mode escaped the original tests + manual refresh button because both kept the access-token cookie present; only a real browser evicting the cookie at Max-Age exposes it. Added 3 regression tests covering: AT-evicted + RT-present (transparent refresh), no-cookies (still bounces), and RT-only with a dead RT (clean 401, no 500).	2026-06-02 21:16:41 +10:00
Julien Talbot	8104b20269	fix(xai): route video models by modality	2026-06-01 19:00:30 -07:00
Teknium	b47cb1bbf2	feat(kanban): file attachments on tasks (#35395) Tasks can now carry file attachments (PDFs, images, source docs) that workers read directly — closes the gap where source material had to be pasted as a path into the task body. - kanban_db: task_attachments table (additive), Attachment dataclass, add/list/get/delete accessors, attachments_root/task_attachments_dir path helpers (per-board, HERMES_KANBAN_ATTACHMENTS_ROOT override) - build_worker_context: surfaces each attachment's absolute path so the worker (full file/terminal tool access) reads it via read_file/pdftotext - dashboard API: POST/GET/DELETE attachment routes (multipart upload, 25MB cap, traversal-safe filenames, root-containment check on download) - dashboard UI: Attachments section in the task drawer — upload button, list with download, per-row remove - docs + tests (13 cases: DB accessors, REST round-trip, traversal rejection, collision suffixing, worker-context surfacing) Closes #35338	2026-05-30 07:41:04 -07:00
Erosika	827ce602db	fix(honcho): harden self-hosted setup paths Self-hosted Honcho setup had four sharp edges: - local/cloud URLs ending in /vN double-prefixed by the SDK (/v3/v3/... 404) - authenticated local servers had no setup prompt for a JWT/bearer token - profile-derived host keys could be dot-containing workspace IDs Honcho rejects - memory-provider config files with API keys written world-readable per umask This keeps existing behavior but makes those paths safer: - strip a trailing /vN version segment from any configured baseUrl before SDK init (the SDK's route builders always prepend their own version prefix); auth-skipping stays loopback-only - add an optional local JWT/bearer prompt in honcho setup, stored under hosts.<host>.apiKey - derive new profile host keys with underscores, still reading legacy hermes.<profile> blocks - write memory-provider config files atomically with 0600 via a shared utils.atomic_json_write(mode=) arg (honcho/hindsight/mem0/supermemory) - skip honcho.json parsing in gateway cache-busting unless Honcho is the active memory provider; memoize by honcho.json mtime when active - bust the gateway agent cache on memory.provider change - add a hermes memory setup <provider> one-liner so fresh installs can configure a named provider without the picker (the per-provider hermes <provider> subcommand only registers once that provider is active) Closes #20688, #29885, #26459, #30246, #33382, #32244. Co-authored-by: BROCCOLO1D	2026-05-29 22:29:48 -07:00
Cornna	d473e7c938	fix(cron): exclude jobs.json registry from disk-cleanup pattern Closes #32164	2026-05-29 13:22:54 -07:00
alt-glitch	3183b2e28c	fix(video_gen): veo3.1 duration format and 4k resolution FAL veo3.1 API expects duration as "4s"/"6s"/"8s" (with unit suffix), not bare "4"/"6"/"8" like other families. Add per-family duration_suffix field and apply it in _build_payload. Also add "4k" to veo3.1 resolutions per FAL API docs. Note: the managed gateway currently rejects the "4s" format (expects integer duration). Gateway-side fix needed for veo3.1 to work through the Nous subscription path.	2026-05-29 22:26:24 +05:30
alt-glitch	b6294ea9f1	test(video_gen): cover gateway decision matrix gaps and 4xx error path - Add test for 4xx ValueError with actionable remediation message - Add test for is_available() returning True via managed gateway - Add test for prefers_gateway overriding direct FAL_KEY - Add test for is_available() via gateway in plugin test file	2026-05-29 22:26:24 +05:30
alt-glitch	d04b3c193e	feat(video_gen): route FAL video gen through managed Nous gateway Wire plugins/video_gen/fal/__init__.py to use the same _ManagedFalSyncClient pattern that image gen already uses. Changes: - Add managed gateway resolution, client caching, and _submit_fal_video_request() that routes between direct FAL_KEY and Nous gateway modes - Update is_available() to return True when either FAL_KEY or the managed gateway is reachable - Update generate() to use submit+get handle pattern instead of fal_client.subscribe() directly - Fix happy-horse endpoint namespace: fal-ai/ → alibaba/ (matches the tool-gateway allowlist from fal-video-gen branch) - Surface actionable error on 4xx gateway rejections Tests: - 4 new tests in test_managed_media_gateways.py (gateway routing, client reuse, direct mode fallback, alibaba namespace) - Updated existing test_fal_plugin.py fixture to use submit/handle pattern and patch _resolve_managed_fal_video_gateway for isolation	2026-05-29 22:26:24 +05:30
Rohit Sharma	9d4fda9952	feat(kanban): add POST /runs/{run_id}/terminate endpoint Closes the termination-control gap left by PR #28432, which shipped the read-only sibling endpoints (/workers/active, /runs/{run_id}, /runs/{run_id}/inspect) but no way to stop a misbehaving worker from the dashboard without dropping to the CLI. The new endpoint resolves run_id -> task_id and delegates to the existing kanban_db.reclaim_task() flow, so the SIGTERM->SIGKILL escalation, run-outcome bookkeeping, and event-log append all match POST /tasks/{task_id}/reclaim exactly. No new termination semantics introduced. Responses: 200 {ok, run_id, task_id} on success 404 unknown run_id 409 run already ended OR task no longer reclaimable Refs: #23762	2026-05-29 00:21:54 -07:00
kshitijk4poor	66827f8947	chore: prune unused imports and duplicate import redefinitions Remove unused imports (F401) and duplicate/shadowed import redefinitions (F811) across the codebase using ruff's safe autofixes. No behavioral changes -- imports only. - ~1400 safe autofixes applied across 644 files (net -1072 lines) - __init__.py re-exports preserved (excluded from F401 removal so public re-export surfaces stay intact) - Re-exports that are imported or monkeypatched by tests but look unused in their defining module are kept with explicit # noqa: F401 (gateway/run.py load_dotenv; run_agent re-exports from agent.message_sanitization, agent.context_compressor, agent.retry_utils, agent.prompt_builder, agent.process_bootstrap, agent.codex_responses_adapter) - Unsafe F841 (unused-variable) fixes deliberately skipped -- those can change behavior when the RHS has side effects - ruff lints remain disabled in pyproject.toml (only PLW1514 is selected); this is a one-time cleanup, not a config change Verification: - python -m compileall: clean - pytest --collect-only: all 27161 tests collect (zero import errors) - core entry points import clean (run_agent, model_tools, cli, toolsets, hermes_state, batch_runner, gateway) - static scan: every name any test imports directly from an edited module still resolves	2026-05-28 22:26:25 -07:00
Nicolò Boschi	490b3e76b1	feat(hindsight): default recall_types to observation only Auto-recall used to surface every fact type Hindsight had on the session — `world`, `experience`, and `observation`. That triple-ships the same underlying signal in three different framings: observations are the concrete events the user said/did/asked, while world and experience facts are aggregate summaries Hindsight derives from those exact observations. Including all three burns most of `recall_max_tokens` on rephrasings, crowds out events the model actually needs to see, and produces effective duplicates in the prompt — observations themselves are deduplicated by construction so observation-only recall is denser per token and closer to conversational ground truth. Change ------ - Default `_recall_types = ["observation"]` (was `None`, which delegated to server-side "return everything"). - `initialize()` now treats a missing `recall_types` config the same way; also accepts comma-separated strings for parity with `recall_tags`. - An explicit `recall_types=[]` config falls back to the default rather than disabling the filter (would silently widen recall vs. the new default). - Added to `get_config_schema()` so it's discoverable via `hermes config`. Per-call `hindsight_recall` tool invocations are unaffected — they already only forward `types` when the caller passes the argument. Docs / migration ---------------- plugins/memory/hindsight/README.md grows a "Behavior change" callout explaining the why (no-duplicates, information-efficient) and how to restore the legacy broad recall: "recall_types": "observation,world,experience" # or a JSON list in `~/.hermes/hindsight/config.json`. Tests ----- - `test_default_values` updated for the new default. - New cases: explicit list override, CSV string accepted, empty list falls back to default (not "wider than default").	2026-05-28 13:07:20 -07:00
Teknium	5e1f793430	chore(web): remove web_crawl tool + provider crawl plumbing (#33824) The web_crawl_tool() function was an orphan — no model schema registered it, no skill or CLI command called it, and the agent had no way to invoke it. PR #32608 proposed wiring it up as a model-callable tool; we've decided not to expose crawl as a separate capability since web_search + web_extract cover the use cases we want models to have. Removed: - tools/web_tools.py: web_crawl_tool() (~230 LOC) - plugins/web/firecrawl/provider.py: supports_crawl() + crawl() - plugins/web/tavily/provider.py: supports_crawl() + crawl() - plugins/web/xai/provider.py: supports_crawl() override - agent/web_search_provider.py: supports_crawl() + crawl() ABC methods - agent/web_search_registry.py: get_active_crawl_provider() + the 'crawl' branch in _resolve() - agent/display.py: web_crawl tool-progress rendering - hermes_cli/config.py: 'web_crawl' from TAVILY_API_KEY.tools - tools/website_policy.py: stale comment reference - Tests: removed TestWebCrawlTavily class, the two website-policy web_crawl tests, the searxng/ddgs/brave-free crawl-error tests, the integration test_web_crawl method, and the test_unconfigured_crawl_emits_top_level_error test. Trimmed the capability-flag parametrize list and the WebSearchProvider ABC conformance tests. - Docs: trimmed the Crawl column from capability tables in both EN and zh-Hans, updated the developer-guide ABC table. Net: 25 files, +115/-1067. Closes #33762 (the schema-text bug only existed if #32608 landed). Supersedes #32608.	2026-05-28 04:52:42 -07:00
Robin Fernandes	406901b27d	feat(auth): normalise how we check whether a user has free/paid access to Nous Portal so we can expose behaviour and error messages accordingly.	2026-05-28 00:19:31 -07:00
Teknium	9919caff46	feat(image_gen): add Krea provider plugin (Krea 2 Medium + Large) (#33236) * feat(image_gen): add Krea provider plugin (Krea 2 Medium + Large) New built-in image_gen backend wrapping Krea's Krea 2 foundation image model family. Auto-discovered like the other image_gen plugins and appears in 'hermes tools' → Image Generation → Krea. Krea's API is asynchronous — submit returns a job_id, poll /jobs/{id} until terminal. The provider hides that behind the synchronous ImageGenProvider.generate() contract: submit, poll every 2s with light backoff (max 5s), 3-minute ceiling matching Krea's hosted-tool timeout. Result URL is materialised to $HERMES_HOME/cache/images/ to avoid CDN-expiry 404s downstream (same fix as xAI #26942). Models: - krea-2-medium (default — Krea's 'start here' recommendation) - krea-2-large Aspect ratios map landscape→16:9, square→1:1, portrait→9:16. Resolution: 1K (Krea's only current option). Kwarg passthrough: seed, creativity (raw/low/medium/high), styles, image_style_references (capped 10), moodboards (capped 1) — matches Krea's per-request limits. Unknown kwargs are ignored. Config knobs (config.yaml): image_gen.provider: krea image_gen.krea.model: krea-2-medium | krea-2-large image_gen.krea.creativity: raw | low | medium | high Env overrides: KREA_API_KEY (required), KREA_IMAGE_MODEL. KREA_API_KEY is registered in OPTIONAL_ENV_VARS so 'hermes setup' prompts for it. 31 new tests; image_gen suite + picker + tools_config: 211/211. * fix(image_gen/krea): address review feedback - Update KREA_API_KEY setup URL to the canonical token-creation page (https://www.krea.ai/app/api/tokens). The previous URL returned 404. - Fail fast on non-retryable HTTP statuses during poll. The previous loop retried every HTTPError for the full 180s deadline, so an auth (401), billing (402), forbidden (403), or not-found (404) response would make image_generate hang for three minutes. 
Only retry transient statuses (408/409/425/429/5xx); surface everything else immediately. - Add 5 tests covering fail-fast on 401/403/404 and retry on 429/503. * fix(krea): point users at the real API token dashboard URL Three call sites linked users to dashboard pages that don't exist: - hermes_cli/config.py: https://www.krea.ai/app/api/tokens - plugins/image_gen/krea/__init__.py get_setup_schema: https://www.krea.ai/api-keys - plugins/image_gen/krea/__init__.py auth_required error: https://www.krea.ai/api-keys Per Krea's own docs (https://docs.krea.ai/developers/api-keys-and-billing), the real dashboard URL is https://www.krea.ai/settings/api-tokens. All three sites now point there.	2026-05-27 11:01:47 -07:00
Ben	61dcc33893	feat(dashboard-auth): config.yaml as canonical surface for dashboard.oauth Per AGENTS.md, ~/.hermes/.env is reserved for API keys / secrets and config.yaml is the surface for non-secret configuration. The Nous Portal plugin previously read HERMES_DASHBOARD_OAUTH_CLIENT_ID and HERMES_DASHBOARD_PORTAL_URL from the environment only, which forced local-dev / on-prem operators to put non-secret per-instance configuration in .env — violating the convention. Add dashboard.oauth.{client_id,portal_url} to DEFAULT_CONFIG and have the plugin resolve each setting with env-overrides-config precedence: 1. Env var when set to a non-empty value (Fly.io platform-secret injection — what pushes per-deploy client_ids without baking them into the image). 2. config.yaml entry (canonical surface for local dev / on-prem). 3. Plugin default (no provider registered when client_id is empty; portal_url defaults to https://portal.nousresearch.com). Empty env values are explicitly treated as unset so a provisioned-but-not-populated Fly secret can't accidentally shadow a valid config.yaml entry with an empty string — operators would otherwise lose the gate. Implementation: - hermes_cli/config.py: add dashboard.oauth.{client_id,portal_url} block to DEFAULT_CONFIG with full doc comment explaining the override precedence and Fly.io rationale. - plugins/dashboard_auth/nous/__init__.py: add _load_config_oauth_section, _resolve_client_id, _resolve_portal_url helpers; replace the two direct os.environ.get() calls in register() with the resolvers. Update the skip-reason string to mention BOTH surfaces so an operator looking at the fail-closed bind error knows config.yaml is a valid alternative to the env var. - plugins/dashboard_auth/nous/plugin.yaml: update description to name both surfaces. requires_env stays pointing at the env var name — it's metadata-only (not used by the plugin loader for gating) so this is documentation/UX, not enforcement. 
- cli-config.yaml.example: append commented dashboard.oauth block with the same override rationale operators see in code. - website/docs/user-guide/features/web-dashboard.md: rewrite the 'Default provider: Nous Research' section to lead with config.yaml, present env vars as operator overrides (Fly.io's primary path). Updated the example fail-closed bind error to match the new skip-reason text. Test coverage — new TestConfigYamlSource class (8 tests) pinning every tier of the precedence chain: - config-yaml-only path registers correctly - both config-yaml fields (client_id + portal_url) honoured - env var overrides config for client_id (Fly.io critical path) - env var overrides config for portal_url - empty env string does NOT shadow config (CI/Fly edge case) - neither source set → skip with reason mentioning BOTH surfaces - load_config() raising falls through to env-only path (resilience) - non-dict oauth section falls through cleanly (typo resilience) Mutation-tested: flipping the precedence to config-wins-over-env trips exactly test_env_overrides_config_client_id while the other 7 stay green, confirming the suite discriminates the order, not just the sources. This closes the last item in Teknium's PR review (PR #30156).	2026-05-27 02:12:27 -07:00
Ben	a498485631	feat(dashboard-auth-nous): surface token iss/aud in verification-failure error When jwt.decode raises InvalidTokenError, decode the token a second time without signature verification (safe — we never trust the values, just display them) and append the actual iss/aud claims plus our configured expected values to the error message. Lets operators see config drift between HERMES_DASHBOARD_PORTAL_URL / HERMES_DASHBOARD_OAUTH_CLIENT_ID and what Portal is actually emitting without having to hand-decode the JWT from the browser cookie.	2026-05-27 02:12:27 -07:00
Ben	b3dc539304	feat(dashboard-auth): Nous plugin always-on; default portal URL; specific error messages The Nous OAuth provider plugin (plugins/dashboard_auth/nous) is bundled and auto-loaded — same as before — but previously refused to register unless BOTH HERMES_DASHBOARD_OAUTH_CLIENT_ID and HERMES_DASHBOARD_PORTAL_URL were set, then the gate's fail-closed branch told the operator 'install the default Nous provider'. That message is misleading: the provider IS installed; it's just unconfigured. And the contract only really needs the per-instance client_id — the portal URL is the same for everyone in production. Three changes: 1. plugins/dashboard_auth/nous/__init__.py: - HERMES_DASHBOARD_PORTAL_URL is now optional and defaults to 'https://portal.nousresearch.com'. Override only for staging (portal.rewbs.uk) or a custom deployment. Empty string also falls back to the default so an empty Fly secret can't point the dashboard at nowhere. - Plugin exposes a module-level LAST_SKIP_REASON: str that the gate reads when no providers register. Cleared on each register() call. Skip reasons are human-readable and actionable ('HERMES_DASHBOARD_OAUTH_CLIENT_ID is not set. The Nous Portal provisions this env var…'). 2. plugins/dashboard_auth/nous/plugin.yaml: - requires_env drops HERMES_DASHBOARD_PORTAL_URL; only the client_id is mandatory. Description updated to reflect this. 3. hermes_cli/web_server.py: - When the gate fail-closes for 'no providers', it now reads each bundled plugin's LAST_SKIP_REASON and embeds them in the SystemExit message. Operator sees the specific config fix needed: Bundled providers reported these issues: • nous: HERMES_DASHBOARD_OAUTH_CLIENT_ID is not set. … instead of the prior generic 'Install the default Nous provider'. Tests: - TestPluginRegister rewritten to assert the new defaults + LAST_SKIP_REASON contents (6 tests, +1 new for empty-string env). - New gate test test_start_server_surfaces_nous_skip_reason_when_unconfigured. 
- test_get_method_is_not_allowed widened to handle the SPA-shell 200 path explicitly — assertion now verifies no JSON ticket leaks rather than asserting a specific status code (covers all four of 401/404/405/200). Docs updated: web-dashboard.md's 'Default provider' section now shows the env-var table with required/optional columns and embeds the fail-closed error message verbatim so operators can match what they see at the prompt.	2026-05-27 02:12:27 -07:00
Ben	848baeb0a8	feat(dashboard-auth): plugins/dashboard_auth/nous — contract-compliant Nous OAuth provider Bundled, kind=backend, auto-loads. Activates ONLY when Portal-injected env vars are present: HERMES_DASHBOARD_OAUTH_CLIENT_ID — agent:{instance_id} HERMES_DASHBOARD_PORTAL_URL — Portal base URL Loopback / --insecure operators leave both unset and never see this plugin register anything. The fail-closed branch in start_server handles the 'public bind + zero providers' case independently. Implementation follows nous-account-service PR #180's published OAuth contract verbatim: - client_id is per-instance (agent:{instance_id}); the suffix is cross-checked against the token's agent_instance_id claim as defense-in-depth (contract C9). - scope is agent_dashboard:access only (contract C3). - aud is the bare client_id, no hermes-cli: prefix (contract C2). - RS256 JWT verification against /.well-known/jwks.json with 5-minute cache (contract C7). - No refresh tokens in V1: refresh_session always raises RefreshExpiredError; revoke_session is a no-op (contract C5). - oauth_contract_version claim: missing → warn + proceed; present and != 1 → refuse (contract C11, OQ-C2 tolerant treatment). - redirect_uri validated client-side as defense before bouncing to Portal; authoritative check is server-side per agent-redirect-uri.ts. 41 new tests covering construction, plugin-entry env gating, start_login shape, complete_login httpx-mocked happy path + error mapping, verify_session JWT verification (RSA keypair fixture, full claim-check matrix), refresh_session always raising, revoke_session no-op. PyJWT + cryptography are already in the venv (jose was previously suggested; switched to pyjwt[crypto] since the latter is already pulled in transitively).	2026-05-27 02:12:27 -07:00
Teknium	249534e472	plugins: add security-guidance — pattern-matched warnings on dangerous code writes (#33131) New opt-in plugin that scans the content passed to write_file / patch / skill_manage for 25 known-dangerous code patterns — pickle.load, yaml.load, eval(, os.system, subprocess(shell=True), child_process.exec, dangerouslySetInnerHTML, innerHTML/outerHTML/document.write/insertAdjacentHTML, crypto.createCipher (no IV), AES ECB, TLS verification disabled, XXE-prone xml.etree/minidom parsers, <script src=//...> without SRI, torch.load without weights_only=True, GitHub Actions ${{ github.event.* }} injection — and appends a "Security guidance" warning block to the tool result via the transform_tool_result hook. Default behaviour is non-blocking: the file is written and the warning rides back to the model in the next turn so it can self-correct or document why the construct is safe. SECURITY_GUIDANCE_BLOCK=1 upgrades to refusing the write entirely; SECURITY_GUIDANCE_DISABLE=1 is the kill switch. Pattern data (patterns.py) is a verbatim Apache-2.0 fork of Anthropic's claude-plugins-official/plugins/security-guidance/hooks/patterns.py at commit 0bde168 (2026-05-26). LICENSE and NOTICE preserve attribution. The Hermes-side plugin glue (__init__.py, plugin.yaml, README.md, tests) is original work. Plugin is opt-in like all bundled plugins: hermes plugins enable security-guidance Inspired by https://x.com/ClaudeDevs/status/1927108527247... — Anthropic shipped this as their security-guidance plugin for Claude Code on 2026-05-26 with a measured 30-40% reduction in security-related PR comments on internal rollout. What's NOT ported (deferred): * Layer 2 (LLM diff review on turn end) — would route through main model by default on Hermes, real money on reasoning models. A follow-up can wire it to a cheap aux model with explicit opt-in. * Layer 3 (agentic commit-time review) — agent can run this on demand via delegate_task today. 
* .hermes/security-guidance.md project-rules file — only used by layers 2/3 upstream.	2026-05-27 02:07:21 -07:00
Will Falcon	bba50977bc	fix: parse Codex image generation SSE directly	2026-05-26 20:40:29 -07:00
teknium1	d3ffbc6409	feat(stt): add stt.providers.<name> command-provider registry Mirror of the TTS command-provider registry (PR #17843) for STT. Lets any shell-driven ASR engine — Doubao ASR, NVIDIA Parakeet, whisper.cpp builds, SenseVoice, curl pipelines — become an STT backend with zero Python. Complements the legacy HERMES_LOCAL_STT_COMMAND escape hatch (preserved untouched via the built-in local_command path) and the register_transcription_provider() Python plugin hook also shipped in this PR. Resolution order (mirrors TTS exactly): 1. Built-in (local, local_command, groq, openai, mistral, xai) → native handler. Always wins. 2. stt.providers.<name>: type: command → command-provider runner. 3. Plugin-registered TranscriptionProvider → plugin dispatch. 4. No match → 'No STT provider available'. Files ----- - tools/transcription_tools.py: BUILTIN_STT_PROVIDERS frozenset retained; added _resolve_command_stt_provider_config, _transcribe_command_stt, and local helpers for template rendering, shell-quote context, and process-tree termination. Helpers are documented as mirrors of their tts_tool.py counterparts (kept local to avoid cross-tool private import). Wire-in is one insertion point in transcribe_audio() after the xai elif and before the plugin dispatcher. Plugin dispatcher additionally defensively short-circuits when a same-name command config exists (command-wins-over-plugin invariant). 
- tests/tools/test_transcription_command_providers.py: 50 new tests covering resolution (builtin precedence, type/command gating, case-insensitive lookup, legacy stt.<name> back-compat), helpers (timeout fallback, format validation, iter, has-any), template rendering (shell-quote contexts, doubled-brace preservation), end-to-end via _transcribe_command_stt (output_path read, stdout fallback, timeout, nonzero exit envelope, model override, language precedence), and dispatcher integration via the real transcribe_audio() including command-wins-over-plugin and builtin-shadow-rejection. - tests/plugins/transcription/check_parity_vs_main.py: extended from 10 to 13 scenarios. New cases: command-provider-installed, command-vs-plugin-same-name (verifies command wins precedence), explicit-openai-with-command-shadow (verifies built-in wins). Adds command_provider dispatch_kind detection via transcript prefix (CMD: vs PLUGIN:) so command-provider scenarios can be distinguished from plugin scenarios even when sharing a provider name. - website/docs/user-guide/features/tts.md: new 'STT custom command providers' section symmetric to the TTS section — example config, placeholder grammar table (input_path / output_path / output_dir / format / language / model), transcript-read-back semantics (file first, then stdout fallback), optional keys table, behavior notes, security note. Updated 'Python plugin providers (STT)' to include the new 'When to pick which (STT)' decision table and updated resolution-order section (now 4 layers instead of 3). Verification ------------ 189/189 STT targeted tests + 50/50 new command-provider tests pass. Combined sweep: tests/tools/ 5576/5576, tests/agent/ + tests/hermes_cli/ 8623/8623 — zero regressions across 14,199 tests. Parity harness: 13 scenarios, 9 OK + 4 expected diffs (no_provider_error → plugin, plugin_unavailable, command_provider × 2). 
E2E live-verified in an isolated HERMES_HOME with a real .wav file: command: → dispatched to stt.providers.my-fake-cli plugin: → dispatched to registered TranscriptionProvider command-wins-over-plugin: → command provider beats same-name plugin builtin-wins-over-command: → built-in OpenAI handler fires; stt.providers.openai: type: command does NOT hijack it.	2026-05-25 01:41:19 -07:00
kshitijk4poor	2cd952e110	feat(stt): add register_transcription_provider() plugin hook Add an opt-in Python plugin surface for speech-to-text backends, mirroring the TTS hook pattern. New backends (OpenRouter, SenseAudio, Gemini-STT, custom proprietary engines) can be implemented as plugins without modifying tools/transcription_tools.py. Built-ins always win -------------------- The 6 built-in STT providers (local/faster-whisper, local_command, groq, openai, mistral, xai) keep their native handlers. Plugins attempting to register under a built-in name are rejected at registration time with a warning and re-checked defensively at dispatch. Resolution order ---------------- 1. stt.provider matches a built-in → built-in dispatch (unchanged) 2. stt.provider matches a registered plugin → a. if plugin.is_available() returns False → unavailability envelope identifying the plugin (not the generic "No STT provider" message — the user explicitly opted into this plugin) b. otherwise plugin.transcribe() with model + language forwarded from stt.<provider>.{model,language} config 3. No match → legacy "No STT provider available" error (unchanged) Per-provider config namespace ----------------------------- Plugins read their config from stt.<provider> in config.yaml, mirroring how built-ins read stt.openai.model / stt.mistral.model. The dispatcher forwards `model` and `language` from this section. Caller's explicit `model=` argument overrides the config-set model. 
Files ----- - agent/transcription_provider.py: TranscriptionProvider ABC - agent/transcription_registry.py: register/get/list providers, built-in shadow guard, _reset_for_tests - hermes_cli/plugins.py: register_transcription_provider() on PluginContext - tools/transcription_tools.py: BUILTIN_STT_PROVIDERS frozenset, _dispatch_to_plugin_provider() with availability gate, wire-in after xai branch and before "No STT provider" error - tests/agent/test_transcription_registry.py: 27 tests - tests/hermes_cli/test_plugins_transcription_registration.py: 3 tests - tests/tools/test_transcription_plugin_dispatch.py: 28 tests (covering built-in short-circuit, plugin dispatch, exception envelope, non-dict guard, availability gate, language forwarding) - tests/plugins/transcription/check_parity_vs_main.py: 10-scenario subprocess-pinned parity harness vs origin/main - website/docs/user-guide/features/{tts,plugins}.md: docs Behavior parity --------------- 10 scenarios, 8 OK + 2 expected DIFFs: no_provider_error → plugin (plugin-installed scenario) no_provider_error → plugin_unavailable (plugin-installed-unavailable scenario; PR returns cleaner envelope) Zero behavior change for users not opting into a plugin. Issue follow-up to #30398.	2026-05-25 01:41:19 -07:00
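The three-step STT resolution order in the commit above can be sketched as follows. This is a minimal illustration, not the code in tools/transcription_tools.py: `EchoPlugin`, `dispatch_stt`, and the exact envelope keys are hypothetical, and the built-in names are taken from the commit message.

```python
from typing import Dict, Optional

# Built-in names as listed in the commit message.
BUILTIN_STT_PROVIDERS = frozenset(
    {"local", "local_command", "groq", "openai", "mistral", "xai"}
)


class EchoPlugin:
    """Hypothetical stand-in for a registered TranscriptionProvider."""

    def __init__(self, available: bool = True) -> None:
        self._available = available

    def is_available(self) -> bool:
        return self._available

    def transcribe(self, path: str, model: Optional[str], language: Optional[str]) -> dict:
        return {"success": True, "transcript": f"PLUGIN:{path}",
                "model": model, "language": language}


def dispatch_stt(provider: str, plugins: Dict[str, EchoPlugin],
                 stt_config: Dict[str, dict], path: str,
                 model: Optional[str] = None) -> dict:
    # 1. Built-ins always win, even if a plugin registered the same name.
    if provider in BUILTIN_STT_PROVIDERS:
        return {"dispatch": "builtin", "provider": provider}

    # 3. No match: the legacy generic error.
    plugin = plugins.get(provider)
    if plugin is None:
        return {"success": False, "error": "No STT provider available"}

    # 2a. The user opted into this plugin, so name it in the error.
    if not plugin.is_available():
        return {"success": False,
                "error": f"STT plugin '{provider}' is not available"}

    # 2b. Forward model + language from stt.<provider>; an explicit
    #     model argument overrides the configured one.
    section = stt_config.get(provider, {})
    return plugin.transcribe(path, model or section.get("model"),
                             section.get("language"))
```

Keeping the built-in check first means a plugin can never shadow a native handler, even if the registration-time guard were bypassed.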
Teknium	031f9c9edc	fix(image_gen): cache xAI ephemeral URL responses to disk (#26942 ) (#31759 ) xAI's grok-imagine-image API returns ephemeral imgen.x.ai/xai-tmp-* URLs that 404 within minutes — long before downstream consumers (Telegram send_photo, browser preview, multi-tier delivery fallback) get a chance to fetch them. The xAI image_gen provider was passing those URLs through unchanged on the elif url: branch; b64 responses were already cached locally via save_b64_image. Result: every image_generate call on a Telegram-routed xai-oauth profile delivered no image, falling through to text-only. Adds agent.image_gen_provider.save_url_image() — a sibling helper to save_b64_image that downloads URL bytes to $HERMES_HOME/cache/images/. Content-type-aware extension inference with URL-suffix fallback; oversize cap (25MB default) with partial-write cleanup; empty-body refusal. Mirrors the audio_cache pattern used by text_to_speech. Wires save_url_image into both the xAI and OpenAI providers' URL branches. When the download fails (network blip, 404 in-flight) we log a warning and fall back to the bare URL rather than turning the tool call into a hard error — the gateway's existing URL-send fallback then gets a chance to surface the original error legibly. Test plan: - tests/agent/test_save_url_image.py — 8 direct tests against a real in-process HTTP server: bytes round-trip, content-type → extension, URL-suffix fallback, default-to-png, 404 propagation, empty-body refusal, oversize cap + cleanup, filename uniqueness. - tests/plugins/image_gen/test_xai_provider.py — flip test_successful_url_response (was asserting the bug), add test_url_response_falls_back_to_bare_url_when_download_fails. - tests/plugins/image_gen/test_openai_provider.py — symmetric pair. 160/160 in the broader image_gen test surface.	2026-05-24 18:10:47 -07:00
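The extension inference described for save_url_image() (content type first, then URL suffix, then a png default) might look roughly like this. The function name, the MIME table, and the suffix set are assumptions for illustration; the real helper in agent.image_gen_provider may differ.

```python
import os
from typing import Optional
from urllib.parse import urlparse

# Assumed mapping; the actual helper may recognise more types.
_CONTENT_TYPE_EXT = {
    "image/png": ".png",
    "image/jpeg": ".jpg",
    "image/webp": ".webp",
    "image/gif": ".gif",
}
_KNOWN_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".gif"}


def infer_image_extension(content_type: Optional[str], url: str) -> str:
    """Pick a file extension for a downloaded image (hypothetical helper)."""
    # Prefer the Content-Type header, ignoring parameters such as charset.
    if content_type:
        mime = content_type.split(";", 1)[0].strip().lower()
        if mime in _CONTENT_TYPE_EXT:
            return _CONTENT_TYPE_EXT[mime]
    # Fall back to the suffix of the URL path (query string excluded).
    suffix = os.path.splitext(urlparse(url).path)[1].lower()
    if suffix in _KNOWN_SUFFIXES:
        return suffix
    # Last resort: default to png.
    return ".png"
```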
kshitijk4poor	00ec0b617c	feat(tts): add register_tts_provider() plugin hook (closes #30398 ) Adds a `TTSProvider(ABC)` + `register_tts_provider()` extension point to the plugin context API, alongside the existing config-driven `tts.providers.<name>: type: command` registry from PR #17843. This is additive — the command-provider surface stays as the primary way to add a TTS backend. The hook covers cases the shell-template grammar can't reasonably express: - Native Python SDKs without a CLI (Cartesia, Fish Audio, etc.) - Streaming synthesis (chunked Opus → voice-bubble delivery) - Voice metadata API for the `hermes tools` picker - OAuth-refreshing auth flows None of the 10 inline built-in providers (`edge`, `openai`, `elevenlabs`, `minimax`, `gemini`, `mistral`, `xai`, `piper`, `kittentts`, `neutts`) are migrated to plugins. They stay inline. The hook is for new engines that aren't built-in. ## Resolution order The dispatcher's resolution order is the load-bearing invariant: 1. `tts.provider` is a built-in name → built-in dispatch. Always wins. 2. `tts.provider` matches `tts.providers.<name>` with `command:` set → command-provider dispatch (PR #17843). 3. `tts.provider` matches a plugin-registered `TTSProvider` → plugin dispatch (new). 4. No match → falls through to Edge TTS default (legacy behavior). Built-ins-always-win is enforced at THREE layers: - Registry: `register_provider()` rejects shadowing names with a warning. - Dispatcher: `_dispatch_to_plugin_provider()` short-circuits built-in names defensively before consulting the registry. - Picker: `_plugin_tts_providers()` filters built-in shadows out of the `hermes tools` row list defensively. Command-providers-win-over-plugins is enforced at TWO layers: - The caller in `text_to_speech_tool` checks `_resolve_command_provider_config` first. - `_dispatch_to_plugin_provider` re-checks for a same-name command config defensively so a refactor of the caller can't silently break the invariant. 
## New files - `agent/tts_provider.py` — `TTSProvider(ABC)` with `synthesize()` (required), `list_voices()`, `list_models()`, `get_setup_schema()`, `stream()`, `voice_compatible` (all optional with sane defaults). Mirrors `agent/image_gen_provider.py` shape. - `agent/tts_registry.py` — `register_provider`/`get_provider`/`list_providers` with `_BUILTIN_NAMES` reject-shadowing invariant. Mirrors `agent/image_gen_registry.py` shape. - `plugins/tts/...` directory ready for community plugins (none shipped). ## Modified files - `hermes_cli/plugins.py` — `register_tts_provider()` method on `PluginContext`. Matches the gating shape of `register_image_gen_provider()` / `register_browser_provider()`. - `tools/tts_tool.py` — `_dispatch_to_plugin_provider()` + `_plugin_provider_is_voice_compatible()` + walrus-elif wiring into the main dispatcher. Built-in elif chain untouched. - `hermes_cli/tools_config.py` — `_plugin_tts_providers()` injects plugin rows into the Text-to-Speech picker category alongside the 10 hardcoded built-in rows. ## Tests - `tests/agent/test_tts_registry.py` — 47 tests covering registration, lookup, ABC contract, helpers, AND a `TestBuiltinSync` regression test that fails if `agent.tts_registry._BUILTIN_NAMES` drifts from `tools.tts_tool.BUILTIN_TTS_PROVIDERS` (kept duplicated due to circular import constraints). - `tests/tools/test_tts_plugin_dispatch.py` — 35 tests covering built-in-always-wins, command-wins-over-plugin, plugin dispatch, exception passthrough, voice_compatible helper. - `tests/hermes_cli/test_tts_picker.py` — 10 tests covering the picker surface, builtin shadowing defense, integration with `_visible_providers`. - `tests/hermes_cli/test_plugins_tts_registration.py` — 3 end-to-end tests via `PluginManager.discover_and_load()`. - `tests/plugins/tts/check_parity_vs_main.py` — 9-scenario subprocess parity harness vs `origin/main`. The only intentional diff is `fallback_edge → plugin` for the `plugin-installed` scenario. 
## Verification - 95/95 new tests pass. - 170/170 pre-existing TTS tests (test_tts_command_providers, test_tts_max_text_length, test_tts_speed, etc.) pass unchanged. - Parity harness against `origin/main`: 8 OK + 1 expected DIFF. - E2E smoke: a registered plugin's `synthesize()` is called via `text_to_speech_tool` with the standard JSON envelope returned. - Ruff clean on all touched files. ## Docs - `website/docs/user-guide/features/tts.md` — new "Python plugin providers" section with a decision table (command-provider vs plugin), minimal plugin example, and the optional-hook reference. - `website/docs/user-guide/features/plugins.md` — TTS row updated to mention both surfaces (command-provider primary, plugin for SDK/streaming). Closes #30398	2026-05-24 18:04:54 -07:00
teknium1	70aaa774be	fix(opencode-go): emit Kimi reasoning_effort, match KimiProfile shape The Kimi K2 branch added in the prior commit only emitted extra_body.thinking and dropped reasoning_effort entirely. KimiProfile (api.moonshot.ai/v1) sends both fields, and OpenCode Go proxies to the same Moonshot backend. Mirror that shape on the Go path so /reasoning effort actually reaches Kimi. - low/medium/high pass through verbatim - xhigh/max clamp to high (Moonshot's max supported value) - minimal / unknown effort → omit reasoning_effort, keep thinking on - disabled / no config → unchanged - DeepSeek branch unchanged	2026-05-23 02:20:28 -07:00
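The effort clamping rules listed in this commit can be sketched as a small mapping. Both function names are hypothetical, and the `thinking` payload shape is an assumption used only to show that thinking stays on when `reasoning_effort` is omitted.

```python
from typing import Optional


def clamp_kimi_effort(effort: Optional[str]) -> Optional[str]:
    """Map a requested effort to a value Kimi accepts (hypothetical helper)."""
    if effort in ("low", "medium", "high"):
        return effort          # passed through verbatim
    if effort in ("xhigh", "max"):
        return "high"          # clamp to the highest supported value
    return None                # minimal / unknown: omit the field


def build_kimi_reasoning_kwargs(effort: Optional[str]) -> dict:
    """Request kwargs for the reasoning-enabled case only."""
    # Assumed payload shape; thinking is kept on regardless of effort.
    kwargs = {"extra_body": {"thinking": {"type": "enabled"}}}
    clamped = clamp_kimi_effort(effort)
    if clamped is not None:
        kwargs["reasoning_effort"] = clamped
    return kwargs
```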
Harish Kukreja	3589960e03	fix(provider): expose OpenCode Go reasoning controls	2026-05-23 02:20:28 -07:00
0xDevNinja	3ac2125140	refactor(image_gen): port FAL backend to plugins/image_gen/fal Mirrors the architecture established by the web (#25182), browser (#25214), and video_gen (#25126) plugin migrations: * `tools/fal_common.py` — stateless atoms shared by both FAL-backed plugins (image_gen + video_gen). Holds the lazy `fal_client` import helper, `_ManagedFalSyncClient`, `_normalize_fal_queue_url_format`, `_extract_http_status`. Stateful pieces (`fal_client` module global, `_managed_fal_client` cache, `_submit_fal_request`, `_resolve_managed_fal_gateway`, `_get_managed_fal_client`) intentionally stay on `tools.image_generation_tool` so the existing `monkeypatch.setattr(image_tool, ...)` patch sites keep working unchanged. `plugins/video_gen/fal/__init__.py` — drops its inline `_load_fal_client` duplicate; consumes `tools.fal_common.import_fal_client`. * `plugins/image_gen/fal/{plugin.yaml,__init__.py}` — new plugin. `FalImageGenProvider` is a thin registration adapter that resolves the legacy module via `import tools.image_generation_tool as _it` and calls `_it.image_generate_tool` + `_it._resolve_fal_model` at call time. The 18-model catalog, `_build_fal_payload`, managed- gateway selection, and Clarity Upscaler chaining all remain in `tools.image_generation_tool` as the single source of truth — the plugin is a registration adapter, not a parallel implementation. * `tools/image_generation_tool.py::_dispatch_to_plugin_provider` — drops the `configured == "fal"` skip. Setting `image_gen.provider: fal` now routes through the registry like any other provider; the plugin re-enters this module's pipeline so behavior is identical. Unset `image_gen.provider` still falls through to the in-tree pipeline (preserves no-config-with-FAL_KEY UX from #15696). 
* `hermes_cli/tools_config.py` — drops the hardcoded "FAL.ai" row from `TOOL_CATEGORIES["image_gen"]["providers"]` (now injected by `_plugin_image_gen_providers` like every other backend) and the `getattr(provider, "name") == "fal"` skip that protected against duplication with the hardcoded row. The "Nous Subscription" row stays as a setup-flow entry — same shape browser kept "Nous Subscription (Browser Use cloud)" after #25214. * `tests/plugins/image_gen/test_fal_provider.py` — 14 cases covering the ABC surface, call-time indirection (verifying `monkeypatch.setattr(image_tool, "image_generate_tool", ...)` takes effect through the plugin), response-shape stamping, exception handling, and registry wiring. * `tests/plugins/image_gen/check_parity_vs_main.py` — subprocess harness mirroring `tests/plugins/browser/check_parity_vs_main.py`. Pins one path to origin/main, one to the worktree; runs six scenarios (unset, explicit-fal-no-creds, explicit-fal-with-creds, explicit-fal-with-model, typo provider, managed-gateway-only) and diffs the reduced shape `{dispatch_kind, provider_name, model}` per scenario. The only acceptable diff is "legacy_fal → plugin (fal)" for explicit-FAL paths — every other delta is flagged as a regression. * `tests/hermes_cli/test_image_gen_picker.py::test_fal_surfaced_alongside_other_plugins` — flips the previous `test_fal_skipped_to_avoid_duplicate` to match the new shape (FAL is a plugin now, no dedup needed). Verified: 195/195 tests across `tests/{tools/test_image_generation*,tools/test_managed_media_gateways,plugins/image_gen,plugins/video_gen,hermes_cli/test_image_gen_picker}.py` pass on this branch with no test patches modified outside the picker test that asserted the old skip behaviour. Fixes #26241	2026-05-22 04:10:45 -07:00
ethernet	48be2e0e4d	test: use subprocesses for each test file (#29016 ) * ci(tests): install ripgrep from prebuilt tarball instead of apt apt-get update + install of ripgrep takes ~4 min on the GHA Ubuntu runners (the apt-get update against archive.ubuntu.com is the slow part; ripgrep itself is small). Switching to the upstream musl binary tarball cuts the step to a few seconds. - Pinned to ripgrep 15.1.0 with sha256 verification (same hash as published in the releases sha256 sidecar file). - Drops the `rg` binary into /usr/local/bin so it is on PATH for every subsequent step without GITHUB_PATH manipulation. - Applied to both the test and e2e jobs in tests.yml. * fix(cli): compile syntax check to tempdir, not source __pycache__ `_validate_critical_files_syntax` runs `py_compile.compile()` on each critical bootstrap file after a successful `git pull`. The default `py_compile` writes the resulting `.pyc` next to the source under `__pycache__/`, which causes two real problems: 1. Parallel test workers walking the same source tree (e.g. running the suite under per-file process isolation) can race against each other on the `__pycache__` write — manifests as flaky 'directory not empty' errors during teardown. 2. In production, the post-pull syntax check leaves a `.pyc` behind that the next interpreter run might pick up — fine when the interpreter version matches, sketchy if it doesn't. Fix: write the compiled output to a `tempfile.TemporaryDirectory()` that's discarded on function exit. We only care about the compile-or-not signal, not the artifact. * test(runner): per-file process isolation, drop manual state reset + xdist Replace fragile manual _reset_module_state test fixtures with robust per-file subprocess isolation. Each test file runs in a fresh `python -m pytest <file>` subprocess via ThreadPoolExecutor. No xdist, no custom pytest plugin, no shared worker state. 
Key changes: * scripts/run_tests_parallel.py — new runner: discovers test files, runs N in parallel via ThreadPoolExecutor, captures stdout per file, treats exit code 5 (no tests collected) as pass, kills all children on exit. Change from cpu_count to cpu_count * 2. The runner is I/O-bound (waiting on subprocess.communicate() from pytest children). The parent process does almost no CPU work, so 2x oversubscription keeps more pipes full. When a file fails, immediately show the last 30 lines of pytest output (stack traces + FAILED summary) plus a ready-to-copy repro command: python -m pytest tests/agent/test_auxiliary_client.py * scripts/run_tests.sh — delegates to run_tests_parallel.py * .github/workflows/tests.yml — test step: python scripts/run_tests_parallel.py * pyproject.toml — drop pytest-xdist, pytest-split; simplify addopts * tests/conftest.py — remove ~200 lines of manual state-reset fixtures * AGENTS.md — update Testing section for per-file design * test(runner): speed gateway test antipattern scan up * fix(test): web search provider plugin test missing xai * fix(tests): make 14 test files pass under per-file subprocess isolation Tests that relied on cross-file state pollution from xdist workers fail when run in isolation (per-file subprocess model). 
Root causes and fixes: Tool registry not populated: - test_video_generation_tool_surface_matrix: add discover_builtin_tools() - test_web_providers_brave_free/ddgs/searxng/general: autouse fixtures registering all 8 bundled web providers, reset after each test - test_website_policy: same provider registration pattern - test_web_tools_tavily: same pattern across 3 dispatch test classes - Also add is_safe_url/check_website_access mocks where SSRF check blocks example.com (DNS resolution fails in isolated envs) Stale check_fn cache: - test_kanban_tools: invalidate_check_fn_cache() + _clear_tool_defs_cache() in both kanban guidance tests (prior test cached False for kanban_show) - test_discord_tool: cache invalidation in setup/teardown - test_homeassistant_tool: invalidate_check_fn_cache() before registry queries Module-level state pollution: - test_auxiliary_client: autouse fixture clearing _aux_unhealthy_until cache - test_skill_commands: set_session_vars() instead of patch.dict(os.environ) (ContextVar takes precedence over os.environ) - test_dm_topics: overwrite sys.modules + separate telegram.constants mock + force-reimport of gateway.platforms.telegram - test_terminal_tool_requirements: removed duplicate class declaration, autouse _clear_caches fixture * change(tests): run_tests.sh explicitly includes env vars instead of manually dropping some vars, now we just only include some * fix(tests): 5 more isolation/NixOS fixes - test_approval_plugin_hooks: isolate HERMES_HOME so real user's command_allowlist doesn't short-circuit the approval path - test_google_chat: skipif when Platform.GOOGLE_CHAT not in enum (feature not merged on this branch) - test_write_deny: test systemd prefix against tmp_path instead of /etc/systemd which resolves to /nix/store on NixOS - test_pty_bridge: use shutil.which('cat') instead of /bin/cat (doesn't exist on NixOS) - profiles.py: rmtree onexc handler chmod's parent dirs too, fixing profile deletion when copytree preserved read-only modes 
from nix store * fix(tests): clear unhealthy cache in autouse fixture for auxiliary_client * fix(tests): skip send_message when telegram not installed; handle missing worker_id in browser_supervisor * fix: py3.11 rmtree onexc compat + belt-and-suspenders unhealthy cache clear for expired codex test * fix: address PR #29016 review feedback - Remove tracked .pytest-cache/ artifact and add to .gitignore - Fix stale 'xdist worker' comment in conftest.py - Deduplicate web provider registration into tests/tools/conftest.py shared helper (register_all_web_providers), replacing 8 copy-pasted blocks across 6 test files - Update PR description: remove stale recovered-test-files claim, fix worker count to match code (cpu_count * 2) * fix: eliminate race in stale-cache achievements test The background scan thread could complete and overwrite _SNAPSHOT_CACHE before evaluate_all() returned the stale data — only 10 fake sessions made the scan finish instantly. Added scan_delay param to _FakeSessionDB and set it to 2s in the stale-cache test so the background thread can't win the race.	2026-05-21 16:40:04 +05:30
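The "compile syntax check to tempdir" fix in this commit can be illustrated with a short sketch. It assumes only the standard library; `compiles_cleanly` is a hypothetical name, not the body of `_validate_critical_files_syntax`.

```python
import os
import py_compile
import tempfile


def compiles_cleanly(source_path: str) -> bool:
    """Return True if source_path compiles, leaving no .pyc next to it."""
    # Write the bytecode into a throwaway directory so nothing lands in
    # the source tree's __pycache__; only the pass/fail signal matters.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "check.pyc")
        try:
            py_compile.compile(source_path, cfile=target, doraise=True)
        except py_compile.PyCompileError:
            return False
    return True
```

Passing an explicit `cfile` is what keeps parallel test processes from racing on the same `__pycache__` directory.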
Teknium	362ef912ea	fix(kanban-dashboard): restore implementations dropped during salvages (#28481 ) Four kanban dashboard test failures, all from PR salvages that picked up the test additions but dropped the corresponding implementations. - BOARD_COLUMNS: add 'review' (status added by PR `f55d94a1e` but the board API never grew the column → test_board_empty failed because VALID_STATUSES - {archived} mismatched the rendered columns). - update_task: enrich the 'ready' 409 detail with the blocking parent list (id, title, status) and add _parents_blocking_ready helper. Implementation lost in the #26744 salvage (commit `e215558ba`) which pinned the test but not the server-side code. - dist/index.js: add parseApiErrorMessage helper, wire it through the drag/drop banner, add patchErr state to the TaskDrawer and surface it inline by the action row. Lost in the same #26744 salvage. - test_diagnostics_endpoint_severity_filter: update to at-or-above semantics (PR `a94ddd807` changed the filter from exact-match so the warning filter now correctly includes error+critical too).	2026-05-18 21:54:56 -07:00
Jpalmer95	dfcf48b476	feat(kanban): drag-to-delete trash zone + bulk delete for task cards Salvages #28125 by @Jpalmer95. Adds: - Drag-to-delete trash zone in the kanban dashboard - Bulk delete endpoint with cascading delete_task cleanup - Frontend updates (drag visual + drop handler) - Confirmation prompt before delete Resolved end-of-file test conflict by appending both halves.	2026-05-18 21:40:13 -07:00
roycepersonalassistant	e3823657d6	feat(kanban): add scheduled status for delayed follow-ups Salvages #24533 by @roycepersonalassistant. Adds a first-class 'scheduled' Kanban status for time-delay follow-ups that aren't waiting on human input. - hermes kanban schedule <task_id> [reason] CLI command - Dashboard/API transitions to/from Scheduled - unblock_task() now releases both 'blocked' AND 'scheduled' tasks (re-checking parent dependencies before moving to ready/todo) - i18n + docs updates Resolved conflicts: kept HEAD's failure-counter reset on unblock alongside the PR's scheduled state, kept HEAD's 'running' direct-set rejection, combined both bulk-status branches. Dropped the dist/ bundle changes (months-stale; would need rebuild from source).	2026-05-18 21:39:03 -07:00
bensargotest-sys	81584940fe	docs: align kanban readiness docs and smoke tests Salvages #28199 by @bensargotest-sys. Aligns Kanban docs with current tool registration: dispatcher-spawned task workers get task tools, profiles that explicitly enable the kanban toolset get orchestrator routing tools (kanban_list, kanban_unblock). Corrects failure-limit text to current default of 2. Hardens the e2e subprocess script to resolve repo root and use the spawnable default assignee. Updates the diagnostics severity fixture to assert error below the critical threshold.	2026-05-18 21:07:03 -07:00
xxxigm	e215558ba7	test(kanban-dashboard): pin enriched 409 detail and inline error wiring (#26744) - Existing ``test_patch_drag_drop_move_todo_to_ready`` now asserts the enriched 409 detail names the blocking parent (id, quoted title, and current status), so the dashboard always has something actionable to render. - New bundle-assertion test ``test_dashboard_surfaces_ready_blocked_error_inline`` pins the frontend wiring: the ``parseApiErrorMessage`` helper exists, the drag/drop banner runs through it, and the drawer maintains a visible ``patchErr`` state that's cleared between PATCHes and tasks.	2026-05-18 21:02:49 -07:00
Interstellar-code	02efad704f	feat(kanban): worker visibility endpoints (workers/active, runs/{id}, inspect) Adds three read-only endpoints to the kanban dashboard plugin so the SwitchUI workspace (and any other dashboard consumer) can track workers across tasks without N+1 round-trips through /tasks/{task_id}. - GET /workers/active Single SQL JOIN of task_runs + tasks where ended_at IS NULL, worker_pid IS NOT NULL, status='running'. Returns {workers: [...], count, checked_at}. - GET /runs/{run_id} Direct lookup of any task_run row by id. Reuses existing kanban_db.get_run() helper and _run_dict() serialiser. 404 when not found. Mirrors GET /tasks/{task_id} 404 shape. - GET /runs/{run_id}/inspect Live PID stats via psutil.Process.as_dict() — cpu_percent, memory_rss_bytes, memory_vms_bytes, num_threads, num_fds, status, create_time, cmdline. Short-circuits with alive:false when run has ended, has no worker_pid, the pid is gone, or psutil is unavailable. AccessDenied surfaces as alive:true with error rather than a 500. 11 new tests in tests/plugins/test_kanban_worker_runs.py cover the empty-board case, running-task case, ended-run filtering, missing-pid filtering, 404 paths, already-ended inspect, no-pid inspect, dead-pid inspect, and live-pid inspect (psutil mocked). All pass. Companion termination endpoint (POST /runs/{run_id}/terminate) is intentionally out of scope here — opening a separate issue first since the RBAC and dispatcher-mediated soft-cancel design needs maintainer input before code. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 21:01:47 -07:00
kronexoi	e8ce7b83fa	fix(kanban): reject direct running transitions in dashboard bulk updates Salvages #24050 by @kronexoi. The single-task PATCH already rejects direct status='running' since it bypasses the dispatcher/claim invariant, but the bulk-update endpoint still accepted it. Aligns bulk with single by emitting an error result row for any 'running' entry.	2026-05-18 20:38:32 -07:00
roycepersonalassistant	6c4f11c64a	fix: show scheduled kanban tasks in dashboard	2026-05-18 20:25:45 -07:00
moortekweb-art	4f6101cc74	Fix Kanban dashboard initial board selection	2026-05-18 20:18:21 -07:00
Drexuxux	917e51858d	fix(kanban): demote ready children when a parent is reopened	2026-05-18 20:17:28 -07:00
Zyrixtrex	326c15d955	fix(kanban): preserve notifier_profile for dashboard home subscriptions	2026-05-18 20:14:45 -07:00
wuli666	028bbc5425	test(kanban-dashboard): cover _task_dict task_age fallback The fix in `061a1830` added an outer try/except in plugin_api._task_dict so that a future failure mode in kanban_db.task_age (anything _safe_int doesn't already absorb) cannot 500 the GET /board response. The _safe_int / task_age corruption paths got regression coverage in tests/hermes_cli/test_kanban_db.py, but the OUTER fallback contract remained untested -- meaning a refactor that drops the try/except would not be caught by CI. Pin that contract from both consumers of _task_dict: - GET /board returns 200 with the literal fallback age dict for the affected card (other cards continue to render via the same path) - GET /tasks/:id (drawer view) returns 200 with the same fallback, so a single corrupt task can't block its own drawer Both tests force task_age to raise RuntimeError rather than ValueError on '%s', because ValueError is absorbed by _safe_int and never reaches the outer try/except -- testing that path would only re-cover what test_kanban_db.py already pins. Manually verified the regression discipline: git checkout 061a1830^ -- plugins/kanban/dashboard/plugin_api.py pytest -k task_age_exception # both FAIL with 500 git checkout HEAD -- plugins/kanban/dashboard/plugin_api.py pytest -k task_age_exception # both PASS	2026-05-18 20:12:52 -07:00
LeonSGP43	c91ad90bff	test(kanban): cover default board dashboard pin	2026-05-18 20:11:43 -07:00
kshitijk4poor	c74ff2c8ef	fix(browser): self-review pass — dead-import, log levels, future-proofing Addresses findings from two self-review passes pre-merge. First pass (3-agent parallel review): 1. plugins/browser/browser_use/provider.py: drop the ``_ = managed_nous_tools_enabled`` dead-import-hider in _get_config_or_none(). The import was actively misleading — the helper IS used in _get_config() (separate method, separate import), not here. The "keep static analysis happy" comment was wrong about what the helper does in this scope. 2. agent/browser_provider.py: drop ``pragma: no cover`` from is_configured() / provider_name() backward-compat aliases. They ARE covered by ``TestLegacyAbcAliases`` — the pragma would have masked future regressions. 3. tools/browser_tool.py: refactor _is_legacy_provider_registry_overridden() to compare against a module-frozen _DEFAULT_PROVIDER_REGISTRY snapshot instead of hardcoded set of 3 keys. Future maintainers adding a 4th built-in provider now just extend _PROVIDER_REGISTRY; the override detection adapts automatically. Previously the hardcoded ``set(...) != {"browserbase", "browser-use", "firecrawl"}`` would flip True forever on any 4-key registry, silently routing every install onto the legacy fixture path. 4. tools/browser_tool.py: when explicit ``browser.cloud_provider`` is set but the registry has no matching plugin (typo, uninstalled plugin, discovery failure), emit a WARNING with actionable text instead of silently falling through to auto-detect. Legacy code surfaced a typed credentials error via direct class instantiation; this log restores the signal in the post-migration path. 5. agent/browser_registry.py: trim the triple-redundant _LEGACY_PREFERENCE documentation. Module docstring + 13-line block-comment + 5-line inline comment was repeating the same point. Kept the docstring and trimmed the block-comment to 5 lines. 6. 
agent/browser_registry.py: upgrade is_available()-raised logging from DEBUG to WARNING with exc_info=True. A provider's availability check throwing is unusual enough that users debugging "no cloud provider" need the traceback in logs. 7. tests/plugins/browser/check_parity_vs_main.py: drop dead top-level imports (os, shutil, tempfile — only referenced inside the SUBPROCESS_SCRIPT string literal that runs in a child process). Second pass (architecture + claim-verification review): 8. tools/browser_tool.py: rewrite the inline comment in _get_cloud_provider auto-detect branch. Prior text claimed it "routes through the plugin registry's legacy preference walk so third-party plugins still get a chance to be selected when they're explicitly configured" — false on both counts. The branch uses module-level legacy class aliases (BrowserUseProvider / BrowserbaseProvider) directly; third-party plugins are intentionally reachable only via explicit ``browser.cloud_provider``. Corrected comment now matches behaviour and cross-references _LEGACY_PREFERENCE for the firecrawl gate rationale. 9. tools/browser_tool.py + tests/tools/test_managed_browserbase_and_modal.py: drop the unused ``get_active_browser_provider as _registry_get_active_browser_provider`` alias from the ``from agent.browser_registry import ...`` block. It was never referenced; matching test-stub line in the agent.browser_registry SimpleNamespace also dropped. ``get_provider`` is still imported (used by the explicit-config dispatch path at line 535). 10. plugins/browser/firecrawl/provider.py: align emergency_cleanup() with the early-guard pattern used in browserbase + browser_use plugins. Previously firecrawl tried the DELETE and relied on ``_headers()`` raising ValueError to trip a "missing credentials" warning; same final outcome but a different control flow that read like a bug to a maintainer skimming the three modules. 
Now: if is_available() is False, log+return early — identical shape to the other two providers. Verification: 54/54 unit tests + 13/13 parity scenarios still pass.	2026-05-17 04:04:15 -07:00
kshitijk4poor	1bb6f03724	fix(browser): ensure plugin discovery before registry lookup; parity harness Two changes that go together: 1. tools/browser_tool.py — add _ensure_browser_plugins_loaded() and call it from _get_cloud_provider() before consulting the registry. Normally model_tools triggers discover_plugins() as an import side-effect, but _get_cloud_provider() can be reached from contexts that haven't gone through model_tools (standalone scripts, certain unit-test paths, the new parity-sweep harness). Without the defensive call, the registry is empty and _registry_get_browser_provider() returns None — silently downgrading users to local mode when they explicitly configured a cloud provider with no credentials yet. The behavior-parity sweep below caught this as 4 scenario regressions (explicit-X-no-creds for all 3 providers, and explicit-firecrawl-with-creds). 2. tests/plugins/browser/check_parity_vs_main.py — subprocess harness that pins one Python invocation to origin/main and one to this PR's worktree via sys.path.insert(), runs _get_cloud_provider() across a 13-scenario config matrix, and diffs the reduced shape tuple (is_local, provider_name, is_available). Provider_name pulls from provider.provider_name() which is the legacy CloudBrowserProvider API and remains as a backward-compat alias on the new BrowserProvider ABC, so the comparison is apples-to-apples regardless of class identity. Final result: PARITY OK across 13 scenarios. 
The four observable config/credential matrices that exercise the dispatcher all match origin/main bit-for-bit: - no-config + no-env → local - explicit local + any env → local - explicit BB / BU / FC + no creds → provider returned with is_available()==False (so dispatcher surfaces typed credentials error; matches main exactly) - explicit BB / BU / FC + creds → provider returned with is_available()==True - no-config + BU creds → Browser Use - no-config + BB creds → Browserbase - no-config + both → Browser Use (legacy walk first hit) - no-config + FC only → local (firecrawl NOT in legacy walk) - no-config + FC + BB → Browserbase (legacy walk skips firecrawl) Per the dev skill's "behavior-parity for refactor PRs" rule — without this subprocess sweep, 31/31 unit tests pass while the production code path is silently broken for users who type `browser.cloud_provider: browserbase` and run a single browser command without prior model_tools import. Caught + fixed before push.	2026-05-17 04:04:15 -07:00
kshitijk4poor	fec0a0da98	test(plugins/browser): coverage for the 3-plugin migration Mirrors tests/plugins/web/test_web_search_provider_plugins.py from PR #25182. 31 tests across 5 classes: TestBundledPluginsRegister (8 tests) - Three plugins register (browserbase, browser-use, firecrawl) - Each plugin's name + display_name accessible - get_setup_schema() returns picker-shaped dict with post_setup hook - All three lifecycle methods (create_session, close_session, emergency_cleanup) overridden on every plugin TestIsAvailable (4 tests) - browserbase needs BOTH BROWSERBASE_API_KEY and BROWSERBASE_PROJECT_ID - browserbase: api_key alone or project_id alone insufficient - browser-use satisfied by BROWSER_USE_API_KEY - firecrawl satisfied by FIRECRAWL_API_KEY TestRegistryResolution (8 tests) — most valuable, locks down pre-migration semantics: - _resolve(None) with no creds returns None (local mode) - _resolve('local') short-circuits to None - _resolve('browserbase') returns provider even when unavailable (so dispatcher surfaces typed credentials error) - _resolve('firecrawl') same: explicit-config wins - _resolve('unknown') falls through to auto-detect - Legacy walk picks browser-use over browserbase - browserbase-only configuration: browserbase wins - Regression: firecrawl is NEVER auto-selected even when single-eligible (preserves pre-migration gate; FIRECRAWL_API_KEY shared with web firecrawl must not silently route to paid cloud browser) TestLegacyAbcAliases (6 tests) - is_configured() delegates to is_available() for all three plugins - provider_name() returns display_name for all three plugins TestPickerIntegration (3 tests) - _plugin_browser_providers() exposes all three plugins as rows - Each row carries post_setup='agent_browser' - browser_plugin_name marker matches browser_provider All tests use real imports — no mocking of provider classes — so the suite catches drift in the ABC, registry, picker injection, and plugin glue layer simultaneously. 31/31 passing.	2026-05-17 04:04:15 -07:00
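The resolution semantics that TestRegistryResolution locks down can be sketched roughly as below. This is an illustration only, not the plugin registry code: `resolve_provider`, `REQUIRED_ENV`, and `LEGACY_WALK` are hypothetical names, and only the environment variable names and the ordering rules come from the commit message.

```python
from typing import Mapping, Optional

# Env vars each provider needs (variable names taken from the commit message).
REQUIRED_ENV = {
    "browserbase": ("BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID"),
    "browser-use": ("BROWSER_USE_API_KEY",),
    "firecrawl": ("FIRECRAWL_API_KEY",),
}

# Auto-detect order. Firecrawl is deliberately absent so a key shared with
# the web-search plugin never routes to a paid cloud browser by accident.
LEGACY_WALK = ("browser-use", "browserbase")


def is_available(name: str, env: Mapping[str, str]) -> bool:
    """A provider is available only when every required variable is set."""
    return all(env.get(var) for var in REQUIRED_ENV[name])


def resolve_provider(configured: Optional[str], env: Mapping[str, str]) -> Optional[str]:
    """Return the chosen provider name, or None for local mode (hypothetical helper)."""
    if configured == "local":
        return None  # explicit local short-circuits
    if configured in REQUIRED_ENV:
        # Explicit config wins even without credentials, so the caller can
        # surface a typed credentials error instead of silently going local.
        return configured
    # No config, or an unknown name: fall through to the legacy walk.
    for name in LEGACY_WALK:
        if is_available(name, env):
            return name
    return None
```

Returning the explicitly configured provider even when it is unavailable is what keeps the "explicit-X-no-creds" scenarios from degrading to local mode.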
kshitij	5fba236644	chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) Six days after #23937 (608 fixes) the codebase had accumulated 241 new PLR6201 violations. Same mechanical `x in (...)` → `x in {...}` fix, same zero-risk profile: set lookup is O(1) vs O(n) for tuple and the two are semantically equivalent for hashable scalar membership tests. All 241 instances fixed via `ruff check --select PLR6201 --fix --unsafe-fixes`, zero remaining. Every changed value is a hashable scalar (str/int/None/enum/signal); no risk of unhashable runtime errors. No behavior change. Test plan: - 119 files changed, +244/-244 (net zero) — exactly one-line edits - `ruff check` clean afterward - Compile checks pass on the largest touched files (cli.py, run_agent.py, gateway/run.py, gateway/platforms/discord.py, model_tools.py) - Subset broad test run on tests/gateway/ tests/hermes_cli/ tests/agent/ tests/tools/: 18187 passed, 59 pre-existing failures (verified against origin/main with the same shape — identical failure count, identical category — all xdist test-order flakes unrelated to this change) Follows the same template as PR #23937 ([tracker: #23972](https://github.com/NousResearch/hermes-agent/issues/23972)).	2026-05-17 02:29:41 -07:00
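A small sketch of why the tuple → set rewrite is safe for hashable scalars, and why ruff treats it as an unsafe fix in general. The state names and helper functions are made up for illustration and do not come from the codebase.

```python
# Hypothetical membership sets; the values are illustrative only.
STATES_TUPLE = ("done", "failed", "cancelled")
STATES_SET = {"done", "failed", "cancelled"}


def same_result(value) -> bool:
    """For hashable values, tuple and set membership agree."""
    return (value in STATES_TUPLE) == (value in STATES_SET)


def probe_unhashable():
    """Show where the two forms differ: an unhashable left-hand operand."""
    candidate = ["done"]  # lists are unhashable
    tuple_result = candidate in STATES_TUPLE  # compares with ==, no hashing
    try:
        candidate in STATES_SET  # must hash the operand first
    except TypeError:
        return tuple_result, "TypeError"
    return tuple_result, "no error"
```

The commit's claim that every changed value is a hashable scalar is what rules out the `TypeError` branch.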
teknium1	773a0faca0	fix(deepseek): set default_aux_model on profile so aux warning stops firing Closes #26924 (and supersedes #26926) in spirit. DeepSeek was missing `default_aux_model` on its `ProviderProfile`, so `_get_aux_model_for_provider("deepseek")` returned an empty string and the compression / vision / session-search paths emitted "No auxiliary LLM provider configured -- context compression will drop middle turns without a summary." on every DeepSeek session, even when the user had perfectly working DeepSeek credentials. Fix lands at the profile layer rather than the legacy `_API_KEY_PROVIDER_AUX_MODELS_FALLBACK` dict the original PR targeted. Every modern provider (gemini, zai, minimax, anthropic, kimi-coding, stepfun, ollama-cloud, gmi, novita, kilocode, ai-gateway, opencode-zen) sets `default_aux_model` on its `ProviderProfile`; the fallback dict only exists for providers that predate the profiles system. Tests added under `tests/plugins/model_providers/test_deepseek_profile.py`: - `test_profile_advertises_deepseek_chat` -- pins the profile attribute - `test_consumer_api_returns_deepseek_chat` -- pins the consumer API behavior - `test_consumer_api_returns_non_empty` -- regression guard for the symptom in the issue Original diagnosis and aux-model choice from @kriscolab in PR #26926; moved one layer up. Co-authored-by: kriscolab <71590782+kriscolab@users.noreply.github.com>	2026-05-16 22:54:22 -07:00
teknium1	cd9470f416	fix(deepseek): wire thinking-mode via DeepSeekProfile, not legacy fallback The cherry-picked PR #15251 from @tw2818 correctly identified the DeepSeek 400 root cause but placed the fix in the legacy fallback path of `build_kwargs`, which DeepSeek never reaches — DeepSeek has a registered ProviderProfile and goes through `_build_kwargs_from_profile` instead. The legacy-path block was therefore dead code. This commit pivots the fix to where it actually fires: - New `DeepSeekProfile` in `plugins/model-providers/deepseek/__init__.py` overrides `build_api_kwargs_extras` to emit DeepSeek's expected wire format (mirrors `KimiProfile`): {"reasoning_effort": "<low|medium|high|max>", "extra_body": {"thinking": {"type": "enabled" | "disabled"}}} - Model gating: only `deepseek-v4-*` and `deepseek-reasoner` emit thinking control. `deepseek-chat` (V3) is untouched — current behavior. - Effort mapping: low/medium/high passthrough, xhigh/max → max, unset → omitted (DeepSeek server applies its own default). - Revert the legacy-path additions from PR #15251 — they were dead code, and the `_copy_reasoning_content_for_api` strip block specifically would have nullified the existing reasoning_content padding machinery (`_needs_deepseek_tool_reasoning` → space-pad on replay) that the active provider already relies on for replay correctness. - Unit tests pin the wire-shape contract and the model gating rules (26 tests, all passing). Existing transport + provider profile suites (321 tests) continue to pass. - AUTHOR_MAP: map twebefy@gmail.com → tw2818 for release notes credit. Closes #15700, #17212, #17825. Co-authored-by: tw2818 <twebefy@gmail.com>	2026-05-15 17:03:26 -07:00
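The wire shape, model gating, and effort mapping described in this commit could look roughly like the sketch below. It is not the actual `DeepSeekProfile.build_api_kwargs_extras`: the function names, the `thinking_enabled` parameter, and the choice to send `reasoning_effort` whenever an effort is set are all assumptions made for illustration.

```python
from typing import Any, Dict, Optional

# low/medium/high pass through; xhigh and max both map to "max".
EFFORT_MAP = {"low": "low", "medium": "medium", "high": "high",
              "xhigh": "max", "max": "max"}


def supports_thinking(model: str) -> bool:
    """Only deepseek-v4-* and deepseek-reasoner get thinking control."""
    return model.startswith("deepseek-v4-") or model == "deepseek-reasoner"


def build_extras(model: str, thinking_enabled: bool = True,
                 effort: Optional[str] = None) -> Dict[str, Any]:
    """Hypothetical helper returning extra request kwargs for DeepSeek."""
    if not supports_thinking(model):
        return {}  # e.g. deepseek-chat is left untouched
    extras: Dict[str, Any] = {
        "extra_body": {
            "thinking": {"type": "enabled" if thinking_enabled else "disabled"}
        }
    }
    mapped = EFFORT_MAP.get(effort or "")
    if mapped:
        # Unset or unrecognised effort is omitted so the server default applies.
        extras["reasoning_effort"] = mapped
    return extras
```

Keeping the gating in one predicate makes it easy for tests to pin which model names emit thinking control and which do not.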
Jaaneek	b62c997973	feat(xai-oauth): add xAI Grok OAuth (SuperGrok Subscription) provider Adds a new authentication provider that lets SuperGrok subscribers sign in to Hermes with their xAI account via the standard OAuth 2.0 PKCE loopback flow, instead of pasting a raw API key from console.x.ai. Highlights ---------- * OAuth 2.0 PKCE loopback login against accounts.x.ai with discovery, state/nonce, and a strict CORS-origin allowlist on the callback. * Authorize URL carries `plan=generic` (required for non-allowlisted loopback clients) and `referrer=hermes-agent` for best-effort attribution in xAI's OAuth server logs. * Token storage in `auth.json` with file-locked atomic writes; JWT `exp`-based expiry detection with skew; refresh-token rotation synced both ways between the singleton store and the credential pool so multi-process / multi-profile setups don't tear each other's refresh tokens. * Reactive 401 retry: on a 401 from the xAI Responses API, the agent refreshes the token, swaps it back into `self.api_key`, and retries the call once. Guarded against silent account swaps when the active key was sourced from a different (manual) pool entry. * Auxiliary tasks (curator, vision, embeddings, etc.) route through a dedicated xAI Responses-mode auxiliary client instead of falling back to OpenRouter billing. * Direct HTTP tools (`tools/xai_http.py`, transcription, TTS, image-gen plugin) resolve credentials through a unified runtime → singleton → env-var fallback chain so xai-oauth users get them for free. * `hermes auth add xai-oauth` and `hermes auth remove xai-oauth N` are wired through the standard auth-commands surface; remove cleans up the singleton loopback_pkce entry so it doesn't silently reinstate. * `hermes model` provider picker shows "xAI Grok OAuth (SuperGrok Subscription)" and the model-flow falls back to pool credentials when the singleton is missing. 
Hardening --------- * Discovery and refresh responses validate the returned `token_endpoint` host against the same `.x.ai` allowlist as the authorization endpoint, blocking MITM persistence of a hostile endpoint. Discovery / refresh / token-exchange `response.json()` calls are wrapped to raise typed `AuthError` on malformed bodies (captive portals, proxy error pages) instead of leaking JSONDecodeError tracebacks. * `prompt_cache_key` is routed through `extra_body` on the codex transport (sending it as a top-level kwarg trips xAI's SDK with a TypeError). * Credential-pool sync-back preserves `active_provider` so refreshing an OAuth entry doesn't silently flip the active provider out from under the running agent. Testing ------- * New `tests/hermes_cli/test_auth_xai_oauth_provider.py` (~63 tests) covers JWT expiry, OAuth URL params (plan + referrer), CORS origins, redirect URI validation, singleton↔pool sync, concurrency races, refresh error paths, runtime resolution, and malformed-JSON guards. * Extended `test_credential_pool.py`, `test_codex_transport.py`, and `test_run_agent_codex_responses.py` cover the pool sync-back, `extra_body` routing, and 401 reactive refresh paths. * 165 tests passing on this branch via `scripts/run_tests.sh`.	2026-05-15 12:11:32 -07:00
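The `token_endpoint` host check from the Hardening notes might be sketched as follows. This is a minimal illustration, not the code in the auth module: `is_allowed_endpoint` is a hypothetical name, and requiring `https` and accepting the bare `x.ai` host are assumptions beyond what the commit message states.

```python
from urllib.parse import urlparse

ALLOWED_SUFFIX = ".x.ai"  # allowlist suffix named in the commit message


def is_allowed_endpoint(url: str) -> bool:
    """Accept only https URLs whose host is x.ai or a subdomain of it."""
    parsed = urlparse(url)
    if parsed.scheme != "https":  # assumption: plain http is never accepted
        return False
    host = (parsed.hostname or "").lower()
    # Matching on the leading dot rejects look-alikes such as "evil-x.ai".
    return host == "x.ai" or host.endswith(ALLOWED_SUFFIX)
```

Comparing the parsed hostname rather than the raw URL string avoids being fooled by hosts like `accounts.x.ai.evil.com` or by the allowlisted name appearing in the path.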
kshitij	db84a78e61	fix(langfuse): complete observability fix — trace I/O, tool outputs, placeholder credentials (closes #22342, #22763) (#26320) * fix(langfuse): reject placeholder credentials with one-shot warning When operators leave HERMES_LANGFUSE_PUBLIC_KEY / HERMES_LANGFUSE_SECRET_KEY at a template value like 'placeholder', 'test-key', or 'your-langfuse-key', the Langfuse SDK silently accepts the credentials at construction time and drops every trace at flush time. No warning, no error — just an empty Langfuse dashboard the operator only notices hours later. Add prefix-based validation in _get_langfuse() against the documented 'pk-lf-' / 'sk-lf-' prefixes that Langfuse always issues server-side. Anything else fires a single warning naming the offending env var(s) with a log-safe value preview (full string for short placeholders so the operator knows which template they left in place; truncated for long values so a real secret pasted into the wrong field never hits the log), then short-circuits via the existing _INIT_FAILED cache so the warning fires once per process, not once per hook invocation. The check sits after the 'Langfuse is None' SDK-installed guard so hosts without the optional langfuse SDK don't see misleading 'set real keys' hints when the actionable fix is 'pip install langfuse'. Missing credentials remains the documented opt-out path and stays silent — no log noise for unconfigured installs. Fixes #22763 Fixes #23823 * fix(langfuse): use actual API request messages for generation input on_pre_llm_request previously used the messages kwarg alone, which could be None when Hermes passes the payload via request_messages, conversation_history, or user_message instead. Add _coerce_request_messages to pick the first available list across all variants, falling back to a synthetic user message. Generations now show the real outbound payload rather than an empty input. 
* fix(langfuse): record tool call outputs in traces Tool observations showed input (arguments) but output was always undefined. Root cause: when tool_call_id is empty, pre_tool_call stored observations under a unique time-based key that post_tool_call could never reconstruct, so every tool span was closed without output by the _finish_trace sweep. Fix pre/post matching by routing empty-tool_call_id tools through a per-name FIFO queue (pending_tools_by_name) instead of the time-based key. Tools with a tool_call_id continue to use the id-keyed dict. Also: - Preserve OpenAI-style nested function shape in serialized tool calls so Langfuse renders name/arguments correctly - Keep name + tool_call_id on role:tool messages for proper pairing - Backfill tool results onto the matching turn_tool_calls entry so the generation's tool-call record carries the result alongside arguments - Coerce request messages from whichever field the runtime provides (request_messages, messages, conversation_history, user_message) * fix(langfuse): salvage-review polish — drop dead is_first_turn, shallow-copy request_messages, real threaded FIFO test Self-review of the combined #22345 + #23831 salvage surfaced three issues worth fixing in the same PR rather than as follow-ups: 1. Drop is_first_turn from the pre_api_request hook. The boolean expression `not bool(conversation_history)` was wrong: conversation_history is reassigned to None mid-run after compression (5 sites in run_agent.py), so the value flips False -> True mid-conversation on every post-compression API call. The langfuse plugin never consumed it, so the kwarg was both misleading AND dead. 2. Replace copy.deepcopy(request_messages) with shallow list() copy. The pre_api_request hook contract discards return values (invoke_hook never writes back to api_kwargs), and the langfuse plugin's _serialize_messages already builds its own snapshot dicts via _safe_value. 
A deepcopy on every API call would walk every tool result and base64 image — significant overhead for no real isolation benefit. Shallow copy of the outer list protects against later mutations of api_messages without paying for the inner-dict walk. 3. Rename test_empty_tool_call_id_concurrent_fifo_order -> test_empty_tool_call_id_observations_are_fifo_within_tool_name and add a real test_threaded_post_calls_preserve_fifo_under_lock that spawns 8 threads behind a barrier to actually exercise _STATE_LOCK on the pending_tools_by_name queue. The original test was sequential and only validated Python list semantics; this one validates the lock discipline. 4. Fix stale 'Cleared by reset_cache_for_tests()' comment on _INIT_FAILED — that function does not exist. Tests reload the module via sys.modules.pop + importlib.import_module instead. Tests: 37 langfuse plugin tests pass, 658 plugin tests overall pass. --------- Co-authored-by: xxxigm <tuancanhnguyen706@gmail.com> Co-authored-by: Brian Conklin <brian@dralth.com>	2026-05-15 05:04:02 -07:00
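The pre/post tool-call pairing described in this commit (id-keyed dict when a `tool_call_id` exists, per-name FIFO queue otherwise, both guarded by one lock) can be sketched as below. The `PendingTools` class and its method names are hypothetical; the plugin itself is described as using `pending_tools_by_name` and `_STATE_LOCK`, whose exact shape is not shown in the log.

```python
import threading
from collections import defaultdict, deque
from typing import Any, Deque, Dict, Optional


class PendingTools:
    """Pairs a tool's start observation with its later finish call."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._by_id: Dict[str, Any] = {}
        self._by_name: Dict[str, Deque[Any]] = defaultdict(deque)

    def start(self, name: str, tool_call_id: str, observation: Any) -> None:
        with self._lock:
            if tool_call_id:
                self._by_id[tool_call_id] = observation
            else:
                # No id to key on: queue per tool name, oldest first.
                self._by_name[name].append(observation)

    def finish(self, name: str, tool_call_id: str) -> Optional[Any]:
        """Return the matching observation, or None if nothing is pending."""
        with self._lock:
            if tool_call_id:
                return self._by_id.pop(tool_call_id, None)
            queue = self._by_name.get(name)
            return queue.popleft() if queue else None
```

Because the finish side can rebuild the lookup key from the tool name alone, an observation stored without an id is no longer orphaned the way a time-based key would leave it.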
binhnt92	63991bbd97	fix(memory): skip OpenViking upload symlinks	2026-05-14 07:48:03 -07:00

123 Commits