fix(docker): chown hermes-owned top-level state files on boot (#35098) (#36236)

The targeted data-volume chown in stage2-hook.sh only covers hermes-owned
*subdirectories*; loose state files living directly under $HERMES_HOME
(auth.json, state.db, gateway.lock, gateway_state.json, …) are missed.
When created or rewritten by `docker exec <container> hermes …` (root
unless `-u` is passed) they land root-owned, and the unprivileged hermes
runtime then hits PermissionError on next startup, producing a gateway
restart loop.

Fix: reset ownership of an explicit allowlist of hermes-owned top-level
files on every boot. The list mirrors the top-level file entries of
hermes_cli.profile_distribution.USER_OWNED_EXCLUDE plus the runtime lock
files.
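
A minimal dry-run sketch of the allowlist pass, for illustration only. `plan_chown` is a hypothetical helper that is not part of this commit, and the file names are supplied by the caller rather than taken from the real list. It prints the `chown` commands that would run instead of executing them, which makes it easy to see that only listed names that exist are ever considered.

```shell
#!/bin/sh
# Dry-run sketch: print the chown commands an allowlist pass would run.
# plan_chown is a hypothetical name; the file list is passed by the caller.
plan_chown() {
    home=$1
    shift
    for f in "$@"; do
        # Only names on the caller's list are considered; any other entry
        # in $home is never inspected.
        if [ -e "$home/$f" ]; then
            printf 'chown hermes:hermes %s\n' "$home/$f"
        fi
    done
}
```

For example, `plan_chown "$HERMES_HOME" auth.json state.db gateway.lock` prints one line per listed file that exists and nothing for files outside the list.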

This uses a targeted allowlist rather than the originally-proposed blanket
`find $HERMES_HOME -maxdepth 1 -user root` sweep, preserving the
targeted-ownership contract from #19788 / PR #19795: a bind-mounted
$HERMES_HOME may contain host-owned files Hermes does not manage, and
those must never be chowned. Verified end-to-end: allowlisted root-owned
files are reset to hermes on restart while a non-allowlisted host file
keeps its root ownership.
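
One way to spot-check the result after a restart is sketched below. `check_owned` is a hypothetical helper, not the verification procedure actually used for this commit: it reports listed files whose owner differs from the expected user, reading the owner from the third field of `ls -ld`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): print "<file> <owner>" for
# each listed top-level file whose owner is not the expected user.
check_owned() {
    home=$1
    want=$2
    shift 2
    for f in "$@"; do
        path="$home/$f"
        # Skip names that do not exist, as the boot-time loop does.
        [ -e "$path" ] || continue
        # Third field of `ls -ld` is the owner name (numeric uid if unmapped).
        owner=$(ls -ld "$path" | awk '{print $3}')
        if [ "$owner" != "$want" ]; then
            printf '%s %s\n' "$f" "$owner"
        fi
    done
}
```

Running `check_owned "$HERMES_HOME" hermes auth.json state.db gateway.lock` inside the container should print nothing once the ownership reset has taken effect; any line it prints names a file that is still owned by someone else.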

Co-authored-by: x1am1 <2663402852@qq.com>
Ben Barclay
2026-06-01 14:38:08 +10:00
committed by GitHub
parent 0bc616ecf9
commit bdceedf784
3 changed files with 166 additions and 0 deletions

@@ -187,6 +187,33 @@ if [ -d "$HERMES_HOME/profiles" ]; then
    chown -R hermes:hermes "$HERMES_HOME/profiles" 2>/dev/null || true
fi
# Reset ownership of hermes-owned top-level state files on every boot.
# The targeted data-volume chown above only covers hermes-owned
# *subdirectories*; loose state files living directly under $HERMES_HOME
# are missed. When those files are created or rewritten by
# `docker exec <container> hermes …` (root unless `-u` is passed) they
# land root-owned, and the unprivileged hermes runtime then hits
# PermissionError on next startup (e.g. gateway.lock / state.db /
# auth.json), producing a gateway restart loop.
#
# We use an explicit allowlist rather than a blanket `find -user root`
# sweep so host-owned files in a bind-mounted $HERMES_HOME are never
# touched — same targeted-ownership contract as the subdir chown above
# (issue #19788, PR #19795). The list mirrors the top-level *file*
# entries of hermes_cli.profile_distribution.USER_OWNED_EXCLUDE plus the
# runtime lock files; keep them in sync if that set changes.
for f in \
    auth.json auth.lock .env \
    state.db state.db-shm state.db-wal \
    hermes_state.db \
    response_store.db response_store.db-shm response_store.db-wal \
    gateway.pid gateway.lock gateway_state.json processes.json \
    active_profile; do
    if [ -e "$HERMES_HOME/$f" ]; then
        chown hermes:hermes "$HERMES_HOME/$f" 2>/dev/null || true
    fi
done
# --- config.yaml permissions ---
# Ensure config.yaml is readable by the hermes runtime user even if it
# was edited on the host after initial ownership setup.