Commit Graph

12 Commits

Author SHA1 Message Date
Thomas Hallock 2082710ab2 fix: add retry middleware for zero-downtime deployments
The problem: During deployments, users pinned via sticky session to
the restarting container experienced ~60s of downtime because:
1. Health checks were too slow (10s interval)
2. No retry on failure - requests just failed

The fix:
- Add retry middleware: 3 attempts with 100ms initial interval
- Reduce health check interval from 10s to 3s
- Add health check timeout of 2s

Now when your pinned server restarts:
1. Request fails
2. Traefik retries on the OTHER healthy server
3. You get a response (possibly with a new server_id cookie)

Combined with Redis for session state, this should give true
zero-downtime deployments.
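
The retry behaviour described above can be sketched in TypeScript. This is an illustration of the idea only, not Traefik's implementation: `retryRequest`, `Backend`, `RetryOptions`, and the doubling backoff are all assumptions made for the sketch. It tries the pinned backend first, then moves on to the next one after a short wait.

```typescript
// Hypothetical names throughout; this is not Traefik's code.
type Backend = (path: string) => Promise<string>;

interface RetryOptions {
  attempts: number; // total tries, e.g. 3
  initialIntervalMs: number; // wait before the second try, e.g. 100
  sleep?: (ms: number) => Promise<void>; // injectable so tests need no real delay
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Try backends in order, starting with the pinned one (index 0),
// waiting a little longer after each failed attempt.
async function retryRequest(
  backends: Backend[],
  path: string,
  opts: RetryOptions,
): Promise<string> {
  if (backends.length === 0) throw new Error("no backends configured");
  const sleep = opts.sleep ?? defaultSleep;
  let wait = opts.initialIntervalMs;
  let lastError: unknown = new Error("no attempts made");
  for (let attempt = 0; attempt < opts.attempts; attempt++) {
    const backend = backends[attempt % backends.length];
    try {
      return await backend(path);
    } catch (err) {
      lastError = err;
      if (attempt < opts.attempts - 1) {
        await sleep(wait);
        wait *= 2; // assumed doubling backoff for this sketch
      }
    }
  }
  throw lastError;
}
```

With `attempts: 3` and `initialIntervalMs: 100`, a request to a restarting backend fails once, waits 100 ms, and is then served by the second backend.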

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-16 05:25:19 -06:00
Thomas Hallock 442d5218d0 infra: add dedicated Redis compose file for compose-updater
Redis was being removed during blue/green deployments because it was
only defined in the main docker-compose.yaml, which isn't managed by
compose-updater.

Now Redis has its own compose file with:
- compose-watcher labels so compose-updater manages it
- Independent lifecycle from blue/green deployments
- Persistent volume for data

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-15 20:49:22 -06:00
Thomas Hallock 57781a9ecc feat: enhance deployment info with health checks and refactor keypad
- Add useDeploymentInfo hook with live health/build info fetching
- Refactor DeploymentInfoContent with server health status, WebSocket
  connectivity, and database status displays
- Add Storybook stories and tests for DeploymentInfoContent
- Extract NumericKeypad styles to CSS file and config to separate module
- Add debug page index
- Update NAS deployment configs

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-15 17:22:52 -06:00
Thomas Hallock 4fbdb3fe50 feat(debug): add debugging tools for cross-instance issues
1. Enhanced /api/build-info endpoint:
   - Shows instance hostname and container ID
   - Shows Redis connection status
   - Shows Socket.IO adapter type (redis/memory)

2. Instance-specific subdomain routes:
   - blue.abaci.one routes to blue container only
   - green.abaci.one routes to green container only
   - Useful for testing cross-instance communication

3. Socket.IO debug page (/debug/socket):
   - Shows connection status and socket ID
   - Join/leave rooms (remote-camera, arcade, game)
   - Send custom events with JSON data
   - Real-time event log with direction arrows

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-15 11:42:54 -06:00
Thomas Hallock 2e77b46ca1 fix(deploy): remove depends_on from blue/green compose files
compose-updater can't resolve depends_on references to services
defined only in the main docker-compose.yaml. Remove depends_on
and rely on REDIS_URL environment variable instead.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-15 11:39:32 -06:00
Thomas Hallock 0346455b3e feat(remote-camera): add Redis for cross-instance session sharing
Production blue/green deployment caused the remote camera to fail because
desktop and phone could hit different instances, each with its own
in-memory session storage and Socket.IO rooms.

Changes:
- Add Redis service to docker-compose (production only)
- Create Redis client utility with optional connection
- Update session manager to use Redis when REDIS_URL is set
- Add Socket.IO Redis adapter for cross-instance room broadcasts
- Convert session manager functions to async
- Update tests for async functions

In development (no REDIS_URL), falls back to in-memory storage.
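
The fallback described above can be sketched as follows. This is a minimal sketch, not the repository's session manager: `SessionStore`, `RedisLike`, `createSessionStore`, the `Session` shape, and the `session:` key prefix are hypothetical. The Redis client is injected through a small interface so the sketch does not depend on any particular Redis library.

```typescript
// Hypothetical sketch; names and key prefix are assumptions, not the repo's code.
interface Session {
  id: string;
  phoneConnected: boolean;
}

interface SessionStore {
  get(id: string): Promise<Session | undefined>;
  set(session: Session): Promise<void>;
}

// Minimal subset of a Redis client's string commands, supplied by the caller.
interface RedisLike {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<unknown>;
}

// Development fallback: state lives in this process only.
class MemorySessionStore implements SessionStore {
  private sessions = new Map<string, Session>();
  async get(id: string): Promise<Session | undefined> {
    return this.sessions.get(id);
  }
  async set(session: Session): Promise<void> {
    this.sessions.set(session.id, session);
  }
}

// Production: state is shared by every instance that talks to the same Redis.
class RedisSessionStore implements SessionStore {
  private redis: RedisLike;
  constructor(redis: RedisLike) {
    this.redis = redis;
  }
  async get(id: string): Promise<Session | undefined> {
    const raw = await this.redis.get(`session:${id}`);
    return raw === null ? undefined : (JSON.parse(raw) as Session);
  }
  async set(session: Session): Promise<void> {
    await this.redis.set(`session:${session.id}`, JSON.stringify(session));
  }
}

// Pick the store from REDIS_URL; only connect when a URL is actually set.
function createSessionStore(
  redisUrl: string | undefined,
  connect: (url: string) => RedisLike,
): SessionStore {
  return redisUrl
    ? new RedisSessionStore(connect(redisUrl))
    : new MemorySessionStore();
}
```

Both stores are async, which matches the change that converted the session manager functions to async: callers await the same interface whether or not Redis is configured.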

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-15 11:18:22 -06:00
Thomas Hallock 8e4338bbbe fix(deploy): add sticky sessions for Socket.IO and remote camera
Remote camera sessions are stored in-memory per instance. Without sticky
sessions, Traefik could route desktop to Blue and phone to Green, causing
"session expired" errors and failed connections.

Sticky sessions ensure the same client always hits the same backend instance,
which is required for:
- Socket.IO connections (rooms are per-instance)
- Remote camera session state (in-memory Map)
- Any stateful WebSocket communication

Note: Sessions will still be lost on container restart/deployment. For full
robustness, sessions should be persisted to a database and Socket.IO should
use the Redis adapter. This is a workaround for the immediate issue.
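
The routing rule behind sticky sessions can be sketched in a few lines. This is an illustration only, not Traefik's implementation: `routeSticky`, `Routed`, and `pickNext` are hypothetical names, and the cookie is called `server_id` for the sake of the example.

```typescript
// Hypothetical sketch of cookie-based sticky routing; not Traefik's code.
interface Routed {
  backend: string;
  setCookie?: string; // present only when the client is (re)pinned
}

function routeSticky(
  cookieValue: string | undefined,
  healthyBackends: string[],
  pickNext: () => string, // load-balancing choice for unpinned clients
): Routed {
  // A client whose pinned backend is still healthy keeps hitting it.
  if (cookieValue !== undefined && healthyBackends.includes(cookieValue)) {
    return { backend: cookieValue };
  }
  // Otherwise pick a backend and pin the client to it with a cookie.
  const backend = pickNext();
  return { backend, setCookie: `server_id=${backend}` };
}
```

The third case in the function is the weakness the note above points out: when the pinned backend goes away, the client is re-pinned to another instance that has none of its in-memory state.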

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-15 11:08:32 -06:00
Thomas Hallock e703e90875 chore: cleanup unused imports and apply formatting
- Remove unused `and` import from VisionRecorder.ts
- Remove unused `IncomingMessage` and `ws` imports from socket-server.ts
- Add `muted` attribute to video element in ProblemVideoPlayer
- Apply code formatting across vision and practice components
- Update documentation formatting in DEPLOYMENT.md and README

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 18:49:31 -06:00
Thomas Hallock b47992f770 feat(deploy): add blue-green deployment with health endpoint
- Add /api/health endpoint that checks database connectivity
- Set up blue-green deployment with two containers (abaci-blue, abaci-green)
- Add docker-compose.yaml with YAML anchors for DRY config
- Add generate-compose.sh to create blue/green compose files from main
- Update deploy.sh with NAS-specific fixes (scp -O, PATH for docker)
- Fix deploy.sh to not overwrite production .env by default

The blue-green setup allows zero-downtime deployments via compose-updater,
which watches separate compose files and restarts containers independently.
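
A health check that reports database connectivity might look like the sketch below. It is not the repository's `/api/health` handler: `checkHealth`, `HealthResult`, the response body shape, and the 200/503 status choice are assumptions for illustration. The database probe is passed in as a function so the sketch stays independent of any particular database client.

```typescript
// Hypothetical sketch; the real /api/health handler may differ.
interface HealthResult {
  status: 200 | 503;
  body: { status: "healthy" | "unhealthy"; database: "ok" | "error" };
}

// pingDatabase should run a cheap query and reject if the database is unreachable.
async function checkHealth(
  pingDatabase: () => Promise<void>,
): Promise<HealthResult> {
  try {
    await pingDatabase();
    return { status: 200, body: { status: "healthy", database: "ok" } };
  } catch {
    // A non-2xx status lets a proxy or orchestrator stop routing to this container.
    return { status: 503, body: { status: "unhealthy", database: "error" } };
  }
}
```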

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 17:04:01 -06:00
Thomas Hallock c29501f666 fix: create arcade sessions on room join to enable config changes
Fixes "No active session found" error when adjusting game settings
before starting a game in arcade rooms.

**Problem:**
- Sessions were only created on START_GAME move
- SET_CONFIG moves require an active session in setup phase
- Users couldn't adjust settings until after starting a game

**Solution:**
- Create session in setup phase when user joins room (if none exists)
- Initialize with room's game config from database
- Allows SET_CONFIG moves before game starts
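
The solution above amounts to a get-or-create step on join. The following is a minimal sketch under stated assumptions, not the code in socket-server.ts: `joinArcadeSession`, `setConfig`, `ArcadeSession`, and the in-memory `Map` are hypothetical stand-ins for the real session manager and database lookup.

```typescript
// Hypothetical sketch; names and storage are assumptions, not the repo's code.
type Phase = "setup" | "playing";
type GameConfig = Record<string, unknown>;

interface ArcadeSession {
  roomId: string;
  phase: Phase;
  config: GameConfig;
}

const sessions = new Map<string, ArcadeSession>();

// On join: reuse the room's session, or create one in the setup phase
// seeded with the room's stored game config.
function joinArcadeSession(roomId: string, roomConfig: GameConfig): ArcadeSession {
  let session = sessions.get(roomId);
  if (!session) {
    session = { roomId, phase: "setup", config: { ...roomConfig } };
    sessions.set(roomId, session);
  }
  return session;
}

// SET_CONFIG: requires an existing session that is still in setup.
function setConfig(roomId: string, patch: GameConfig): ArcadeSession {
  const session = sessions.get(roomId);
  if (!session) throw new Error("No active session found");
  if (session.phase !== "setup") throw new Error("Config can only change during setup");
  session.config = { ...session.config, ...patch };
  return session;
}
```

Because the session now exists from the moment a user joins, `setConfig` no longer hits the "No active session found" branch before the game starts.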

**Changes:**
- socket-server.ts:72-100 - Auto-create session on join-arcade-session
- RoomMemoryPairsProvider.tsx:4 - Remove unused import
- nas-deployment/docker-compose.yaml:15 - Fix DB volume mount path

**Related:**
- Also fixes database persistence by correcting volume mount from
  ./data:/app/data to ./data:/app/apps/web/data

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-10 13:52:35 -05:00
Thomas Hallock bda5bc6c0e fix: prevent database imports from being bundled into client code
**Problem:**
- player-ownership.ts imported drizzle-orm and @/db at top level
- When RoomMemoryPairsProvider imported client-safe utilities, Webpack bundled ALL imports including database code
- This caused a hydration error: "The 'original' argument must be of type Function"
- Node.js util.promisify was being called in browser context

**Solution:**
1. Created player-ownership.client.ts with ONLY client-safe utilities
   - No database imports
   - Safe to import from 'use client' components
   - Contains: buildPlayerOwnershipFromRoomData(), buildPlayerMetadata(), helper functions

2. Updated player-ownership.ts to re-export client utilities and add server-only functions
   - Re-exports everything from .client.ts
   - Adds buildPlayerOwnershipMap() (async, database-backed)
   - Safe to import from server components/API routes

3. Updated RoomMemoryPairsProvider to import from .client.ts

**Result:**
- No more hydration errors on /arcade/room
- Client bundle doesn't include database code
- Server code can still use both client and server utilities

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-10 11:40:46 -05:00
Thomas Hallock eb8ed8b22c feat: add complete NAS deployment system for apps/web
- Add Dockerfile with multi-stage build for monorepo
- Add GitHub Actions workflow for automated CI/CD
- Add NAS deployment configuration for abaci.one
- Configure Porkbun DDNS integration
- Add Watchtower for auto-updates
- Fix Next.js standalone output configuration
- Add missing dependencies for package builds

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-27 08:42:41 -05:00