E5-F5: Queue Connection Resilience
Delivered by waiting-room
This feature is implemented by the standalone waiting-room service. waiting-room deliberately treats a WebSocket disconnect as transient: closing the WS does not transition the ticket, because mobile OSes drop WS connections within seconds of backgrounding. Push notifications (FCM/APNs) reconnect a backgrounded user to a still-live ticket. Eviction happens only through (a) the hard ticket_ttl_seconds / session_ttl_seconds BullMQ workers, or (b) a voluntary DELETE /tickets/{id}. Tune the TTLs for the worst case, in which a user vanishes without saying goodbye.
Epic: E5: Waiting Queue System
Size: S (Small)
Problem / Outcome
Users keep their queue position if the connection drops briefly.
Scope
In-Scope:
- Position persists for 30 minutes after disconnect
- Reconnect restores original position
- Grace period expiration handling
Out-of-Scope:
- Indefinite persistence
Acceptance Criteria
- AC1: Given the connection is lost, the position persists for 30 minutes
- AC2: On reconnect within 30 minutes, the original position is restored
- AC3: On reconnect after 30 minutes, the position is lost and the user must rejoin
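The three criteria above reduce to a single time comparison at reconnect. The following is a minimal sketch of that decision, not the waiting-room implementation: `resolveReconnect`, `ReconnectResult`, and `GRACE_PERIOD_MS` are hypothetical names, and treating a reconnect at exactly 30 minutes as still within the grace period is an assumption the criteria do not settle.

```typescript
// Hypothetical names throughout; only the 30-minute window comes from the
// acceptance criteria.
const GRACE_PERIOD_MS = 30 * 60 * 1000;

type ReconnectResult =
  | { outcome: "restored"; position: number }
  | { outcome: "rejoin_required" };

function resolveReconnect(
  position: number,
  disconnectedAt: Date,
  now: Date,
): ReconnectResult {
  const elapsedMs = now.getTime() - disconnectedAt.getTime();
  // AC1/AC2: within the grace period the original position is restored.
  // Assumption: the boundary (exactly 30 minutes) counts as "within".
  if (elapsedMs <= GRACE_PERIOD_MS) {
    return { outcome: "restored", position };
  }
  // AC3: past the grace period the position is lost and the user must rejoin.
  return { outcome: "rejoin_required" };
}
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the rule easy to exercise in the "queue resilience" tests without waiting for real time to elapse.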
Data Model Impact
QueueEntry table:
- last_heartbeat (TIMESTAMP)
- grace_period_expires_at (TIMESTAMP, nullable)
- is_connected (BOOLEAN)
Heartbeat mechanism:
- Client sends heartbeat every 30 seconds
- Server marks the entry disconnected after 60 seconds without a heartbeat
- Grace period starts on disconnect detection
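The heartbeat rules above can be sketched as two pure functions over a QueueEntry row. This is an illustrative sketch under stated assumptions: the field names mirror the columns listed in this section, while `recordHeartbeat`, `detectDisconnect`, and the two constants are hypothetical names, and how the server schedules the check is not specified here.

```typescript
// Field names mirror the QueueEntry columns listed above; the function and
// constant names are hypothetical.
interface QueueEntry {
  last_heartbeat: Date;
  grace_period_expires_at: Date | null;
  is_connected: boolean;
}

const DISCONNECT_AFTER_MS = 60 * 1000; // 60 seconds without a heartbeat
const GRACE_PERIOD_MS = 30 * 60 * 1000; // 30-minute grace period

// Called when a client heartbeat arrives (clients send one every 30 seconds).
function recordHeartbeat(entry: QueueEntry, now: Date): QueueEntry {
  return { ...entry, last_heartbeat: now };
}

// Called periodically by the server to detect clients that have gone silent.
function detectDisconnect(entry: QueueEntry, now: Date): QueueEntry {
  const silentMs = now.getTime() - entry.last_heartbeat.getTime();
  if (!entry.is_connected || silentMs < DISCONNECT_AFTER_MS) {
    return entry; // already marked disconnected, or heartbeat still fresh
  }
  // The grace period starts at detection time, not at the last heartbeat.
  return {
    ...entry,
    is_connected: false,
    grace_period_expires_at: new Date(now.getTime() + GRACE_PERIOD_MS),
  };
}
```

Because the grace period starts at detection, a client that goes silent immediately after a heartbeat keeps its position for up to 60 seconds plus 30 minutes, measured from that last heartbeat.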
Permissions/Roles
- System
How to Verify
npm test -- --grep "queue resilience"
Expected: Position retained within grace period.
Dependencies
Implementation Tasks
Doc References
Last Updated: January 2026