Diagnostic on prod confirmed the backend keeps sessions valid across deploys
(stable 64-char JWT key, 30-day access tokens, 62 refresh tokens persisting in
Redis with appendonly; redis/db never restart on deploy). The forced logout was
client-side:
1. The axios refresh path treated ANY refresh failure as "session gone" and
nuked the tokens. During the ~30s API restart window of a deploy, the refresh
POST gets a 502/timeout (transient) → user kicked to /login. Now refresh
distinguishes a definitive 4xx (truly invalid/expired refresh → log out) from
a transient network/5xx failure (reject + keep the session; retry later).
Refresh tokens are opaque Redis GUIDs, so they survive even a key rotation —
the only thing that was breaking sessions was this over-eager logout.
2. PWA service worker served a stale app shell after an update, pointing at JS
chunks the new build replaced. Added skipWaiting + clientsClaim +
cleanupOutdatedCaches and a NetworkFirst handler for navigations so the HTML
and its chunk refs always match the live deploy; hashed static stays
CacheFirst.
Net: a normal update no longer logs anyone out. tsc clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>