fix(update): harden venv rebuild + verify core deps after install

Two complementary fixes for a silent partial-install failure that bit
``hermes update`` in the wild: a fresh checkout pulled 145 commits,
``rebuild_venv`` failed to recreate the venv on Windows because
``shutil.rmtree(ignore_errors=True)`` couldn't delete files held open by
the running ``hermes.exe`` shim. ``uv venv`` then refused with
"A directory already exists at: venv" and the update fell back to
installing on top of the stale venv. The resulting partial install
missed exactly one newly-added base dep — ``pathspec==1.1.1`` — which
``hermes desktop --build-only`` imports at the top of its content-hash
check. The desktop rebuild died with ModuleNotFoundError and the parent
update only logged "⚠ Desktop build failed (non-fatal)". Same root cause
made the "default: sync failed" line in the skill-sync stage, because
that sync subprocess hit the same missing import.

Fix 1: ``rebuild_venv`` retries with ``--clear``
------------------------------------------------
If ``uv venv`` fails with "already exists" in stderr (which is what uv
prints, and what uv's own hint tells you to fix with --clear), retry
once with ``--clear``. Only this specific failure pattern triggers the
retry — disk-full / interpreter-download failures still surface as
before so we don't mask real problems.

Fix 2: post-install dep verification
------------------------------------
Belt-and-suspenders so future uv resolver quirks (or any other cause of
partial installs) surface immediately instead of hours later in a
downstream subprocess. After ``_install_python_dependencies_with_optional_fallback``
runs, ``_verify_core_dependencies_installed``:

  1. Reads ``[project.dependencies]`` straight from pyproject.toml
     (so we don't trust the venv's stale metadata).
  2. Filters by environment markers via ``packaging.requirements.Requirement``
     so cross-platform exclusions (``ptyprocess ; sys_platform != 'win32'``)
     don't false-positive on Windows.
  3. Runs ``importlib.metadata.version()`` for each remaining dep inside
     the *target* venv interpreter (resolved from ``VIRTUAL_ENV``, not
     ``sys.executable``).
  4. If anything is missing, reinstalls the base group with
     ``--reinstall`` to force re-resolution. If a second probe still
     reports missing deps, force-installs each one with its pinned spec.
  5. Treats final failure as a warning rather than a hard error — a
     single broken-on-PyPI dep shouldn't block an otherwise-successful
     update — but the message points at ``hermes update --force`` and
     names the missing packages so the user knows what's wrong.

Tests
-----
- ``TestRebuildVenv::test_retries_with_clear_when_dir_already_exists`` —
  simulates the rmtree-couldn't-delete-it failure mode and asserts the
  ``--clear`` retry path is taken and succeeds.
- ``TestRebuildVenv::test_does_not_retry_when_first_failure_is_not_dir_exists``
  — guards against masking real failures (disk full, etc.).
- ``test_verify_core_dependencies.py`` — 7 tests covering the happy
  path, the regression (missing pathspec triggers --reinstall), the
  per-package fallback when --reinstall doesn't help, the platform-
  marker filter so Windows doesn't try to install ptyprocess, the
  missing-pyproject noop, and the VIRTUAL_ENV resolver.

Co-authored-by: Kyssta <218078013+kyssta-exe@users.noreply.github.com>
This commit is contained in:
teknium1
2026-06-04 02:12:46 -07:00
committed by Teknium
parent 28ca4460a1
commit c136eb4de1
4 changed files with 547 additions and 6 deletions

View File

@ -9065,6 +9065,222 @@ def _install_python_dependencies_with_optional_fallback(
f" ⚠ Skipped optional extras that still failed: {', '.join(failed_extras)}"
)
# Belt-and-suspenders: verify every declared core dependency from
# pyproject.toml's [project.dependencies] is actually importable in the
# target venv. uv's incremental resolver has — in the wild — produced
# partial installs where a newly added base dep (e.g. ``pathspec``)
# silently fails to land on top of a half-stale venv, and the only
# symptom is a downstream subprocess crashing with ModuleNotFoundError
# hours later inside ``hermes update``'s desktop-rebuild or skill-sync
# stage. Reinstall with --reinstall to force resolution if anything is
# missing, then re-verify so the failure surfaces here instead of
# downstream.
_verify_core_dependencies_installed(install_cmd_prefix, env=env, group=group)
def _verify_core_dependencies_installed(
install_cmd_prefix: list[str],
*,
env: dict[str, str] | None = None,
group: str = "all",
) -> None:
"""Check that every base dep from pyproject.toml is importable; if not, retry.
Reads ``pyproject.toml`` directly (so we don't trust the venv's stale
metadata), filters out deps gated by ``;`` environment markers that don't
apply to this platform, and runs ``importlib.metadata.version()`` in the
venv interpreter for each one. If anything is missing we reinstall the
base group with ``--reinstall`` to force uv to re-resolve, then check
again. We treat the final state as a warning rather than a hard failure
so a single broken-on-PyPI dep can't block an otherwise-successful
update — but the warning makes the partial install visible at the spot
that caused it, instead of hours later in a downstream subprocess.
"""
try:
import tomllib # Python 3.11+
except ImportError: # pragma: no cover — Python < 3.11 unsupported but be safe
return
pyproject = PROJECT_ROOT / "pyproject.toml"
if not pyproject.is_file():
return
try:
with open(pyproject, "rb") as f:
data = tomllib.load(f)
raw_deps = data.get("project", {}).get("dependencies", []) or []
except Exception as e:
logger.debug("dep verification: failed to read pyproject.toml: %s", e)
return
# Parse each "name OP version ; marker" string into (dist_name, marker_obj).
# We use packaging.requirements when available (it ships with pip/uv envs),
# falling back to a naive split that's good enough for the canonical
# ``name==version[; marker]`` style this repo uses.
deps: list[tuple[str, "object | None"]] = []
try:
from packaging.requirements import Requirement # type: ignore
for spec in raw_deps:
try:
req = Requirement(spec)
deps.append((req.name, req.marker))
except Exception:
continue
except Exception:
for spec in raw_deps:
head = spec.split(";", 1)[0]
for op in ("==", ">=", "<=", "~=", ">", "<", "!="):
if op in head:
head = head.split(op, 1)[0]
break
name = head.strip().split("[", 1)[0].strip()
if name:
deps.append((name, None))
# Apply environment markers to drop deps that don't apply on this platform
# (e.g. ``ptyprocess ; sys_platform != 'win32'`` is correctly skipped on
# Windows). Without markers we'd false-positive every cross-platform exclusion.
applicable: list[str] = []
for name, marker in deps:
if marker is None:
applicable.append(name)
continue
try:
if marker.evaluate(): # type: ignore[union-attr]
applicable.append(name)
except Exception:
applicable.append(name)
if not applicable:
return
# Run the check inside the venv Python — sys.executable here may be the
# outer Python that drove ``hermes update``, not the venv we just wrote
# to. The uv install_cmd_prefix encodes which environment we targeted
# (either ``[uv, pip]`` with VIRTUAL_ENV in env, or
# ``[sys.executable, -m, pip]`` for the in-process Python); resolve the
# right interpreter for the verification.
venv_python = _resolve_install_target_python(install_cmd_prefix, env)
if venv_python is None:
return
def _missing_deps() -> list[str]:
check_script = (
"import importlib.metadata as md, sys\n"
"missing=[]\n"
"for name in sys.argv[1:]:\n"
" try: md.version(name)\n"
" except md.PackageNotFoundError: missing.append(name)\n"
"print('\\n'.join(missing))\n"
)
try:
result = subprocess.run(
[str(venv_python), "-c", check_script, *applicable],
capture_output=True,
text=True,
check=False,
env=env,
)
except Exception as e:
logger.debug("dep verification: subprocess failed: %s", e)
return []
return [line.strip() for line in result.stdout.splitlines() if line.strip()]
missing = _missing_deps()
if not missing:
return
print(
f" ⚠ Verification: {len(missing)} declared dep(s) missing after install: "
f"{', '.join(missing[:8])}{'...' if len(missing) > 8 else ''}"
)
print(" → Reinstalling base group with --reinstall to repair...")
# Reinstall base group with --reinstall so uv re-resolves from scratch
# against the current pyproject. We don't pass ``[{group}]`` here on
# purpose — the missing dep is in *base* deps; rerunning the full all-
# extras install can cost minutes and trips on whatever optional extra
# was already broken upstream. Base is fast and is what's actually wrong.
repair_args = ["install", "--reinstall", "-e", "."]
try:
_run_install_with_heartbeat(install_cmd_prefix + repair_args, env=env)
except subprocess.CalledProcessError as e:
logger.warning("dep verification: repair install failed: %s", e)
print(" ⚠ Repair install failed; check `hermes update` output above.")
return
still_missing = _missing_deps()
if not still_missing:
print(" ✓ All declared core dependencies now installed")
return
# Last-ditch: install each remaining missing dep with its pin directly.
# Useful when uv's resolver thinks the env is satisfied but the on-disk
# package metadata says otherwise (rare but observed).
name_to_spec = {}
for spec in raw_deps:
head = spec.split(";", 1)[0].strip()
bare = head
for op in ("==", ">=", "<=", "~=", ">", "<", "!="):
if op in bare:
bare = bare.split(op, 1)[0]
break
name_to_spec[bare.strip().split("[", 1)[0].strip()] = head
specs = [name_to_spec.get(n, n) for n in still_missing]
print(
f" → Force-installing remaining missing dep(s): {', '.join(specs)}"
)
try:
_run_install_with_heartbeat(
install_cmd_prefix + ["install", "--reinstall", *specs], env=env
)
except subprocess.CalledProcessError as e:
logger.warning("dep verification: per-package repair failed: %s", e)
print(
f" ⚠ Could not install: {', '.join(still_missing)}. "
"Run `hermes update --force` after closing other hermes processes."
)
return
final_missing = _missing_deps()
if final_missing:
print(
f" ⚠ Still missing after repair: {', '.join(final_missing)}. "
"Run `hermes update --force` after closing other hermes processes."
)
else:
print(" ✓ All declared core dependencies now installed")
def _resolve_install_target_python(
install_cmd_prefix: list[str], env: dict[str, str] | None
) -> Path | None:
"""Figure out which Python interpreter the install just targeted.
``_install_python_dependencies_with_optional_fallback`` is called with
either ``[uv, pip]`` (and a ``VIRTUAL_ENV`` env var pointing at the
target venv) or ``[sys.executable, -m, pip]`` (the in-process Python).
The verification step needs the *resulting* environment's Python so
``importlib.metadata`` queries the right site-packages.
"""
if env and "VIRTUAL_ENV" in env:
venv_root = Path(env["VIRTUAL_ENV"])
scripts = venv_root / ("Scripts" if _is_windows() else "bin")
candidate = scripts / ("python.exe" if _is_windows() else "python")
if candidate.exists():
return candidate
# Fallback: assume install_cmd_prefix[0] is the python interpreter (the
# ``[sys.executable, -m, pip]`` shape). Skip if it looks like ``uv``.
if install_cmd_prefix:
first = Path(install_cmd_prefix[0])
if first.exists() and "uv" not in first.name.lower():
return first
return None
def _is_termux_env(env: dict[str, str] | None = None) -> bool:
return _is_termux_startup_environment(env)

View File

@ -105,17 +105,40 @@ def rebuild_venv(uv_bin: str, venv_dir: Path, python_version: str = "3.11") -> b
old venv may point to a Python without FTS5, so we rebuild it with a
fresh interpreter from the current managed uv. Returns ``True`` on
success.
On Windows, ``shutil.rmtree(..., ignore_errors=True)`` can silently leave
the venv directory partially intact when another process is holding an
open handle to a file inside it (typical culprits: a running
``hermes.exe`` REPL, the gateway, AV scanners). If we don't notice that
and just call ``uv venv``, uv refuses with
``Caused by: A directory already exists at: venv`` and the *whole
update* falls back to installing on top of the stale venv — which has
historically produced partial installs where a freshly added dependency
(e.g. ``pathspec``) silently fails to land. Retry with ``--clear`` to
force uv past that condition before giving up.
"""
if venv_dir.exists():
print(f" → Rebuilding venv (old Python may lack FTS5)...")
shutil.rmtree(venv_dir, ignore_errors=True)
result = subprocess.run(
[uv_bin, "venv", str(venv_dir), "--python", python_version],
capture_output=True,
text=True,
check=False,
)
def _run_uv_venv(extra_args: list[str]) -> subprocess.CompletedProcess[str]:
return subprocess.run(
[uv_bin, "venv", str(venv_dir), "--python", python_version, *extra_args],
capture_output=True,
text=True,
check=False,
)
result = _run_uv_venv([])
# If uv refused because the directory still exists (rmtree above was
# blocked by an open file handle, common on Windows), retry with
# --clear so uv overwrites it. Match on stderr because uv's exit code
# alone doesn't distinguish "dir exists" from real failures.
if result.returncode != 0 and "already exists" in (result.stderr or "").lower():
print(" → venv dir not fully removed (likely an open file handle); retrying with --clear...")
result = _run_uv_venv(["--clear"])
if result.returncode == 0:
venv_python = venv_dir / ("Scripts" if platform.system() == "Windows" else "bin") / "python"
py_ver = subprocess.run(