Chat & Online Presence — Overview
What we’re building
Two linked features:
- Online Presence: Show which users are currently active. When a user logs in and opens any portal, they appear as “online” to relevant peers. When they close the tab or their session expires, they appear “offline”.
- Chat (Direct Messages): A floating chat panel in each portal where users can send direct messages to other online users. Conversations are persisted and retrievable later even if the recipient was offline.
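The online/offline transitions described above can be sketched as a small connection counter. This is only an illustration under stated assumptions: `PresenceTracker` and its method names are hypothetical, the state is held in memory for a single process, and counting connections per user (so that closing one of several tabs does not mark the user offline) is an assumption rather than something this overview specifies.

```typescript
// Hypothetical sketch: these names are illustrative, not part of the design.
type UserId = string;

class PresenceTracker {
  // Number of open connections (tabs or portals) per user.
  private readonly connections = new Map<UserId, number>();

  /** Records a new connection. Returns true if the user just came online. */
  connect(userId: UserId): boolean {
    const count = (this.connections.get(userId) ?? 0) + 1;
    this.connections.set(userId, count);
    return count === 1;
  }

  /** Records a closed connection. Returns true if the user just went offline. */
  disconnect(userId: UserId): boolean {
    const count = this.connections.get(userId) ?? 0;
    if (count <= 1) {
      // Last connection closed, or the user was never tracked.
      this.connections.delete(userId);
      return count === 1;
    }
    this.connections.set(userId, count - 1);
    return false;
  }

  /** True while the user has at least one open connection. */
  isOnline(userId: UserId): boolean {
    return this.connections.has(userId);
  }
}
```

The boolean return values mark the moments at which an "online" or "offline" notification would be worth broadcasting to peers. A deployment with several API instances would need shared state instead of a per-process map.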
Scope by Portal
| Portal | Can see presence of | Can chat with |
|---|---|---|
| Manage | All users across all tenants | Any user (regardless of tenant) |
| Dashboard | Users in the same tenant only | Users in the same tenant only |
| DM Portal | Users in the same tenant only | Users in the same tenant only |
The tenant boundary is enforced server-side. A dashboard user cannot reach users from another tenant even if they craft a raw WebSocket message.
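As a minimal sketch of that server-side rule, the scope table can be reduced to a single check. The `Portal` values, the `ChatUser` shape, and the `canReach` function are hypothetical names chosen for illustration; the key assumption is that the portal and the sender's tenant come from the verified session, never from the message payload.

```typescript
// Hypothetical types; the portal values mirror the scope table above.
type Portal = "manage" | "dashboard" | "dm";

interface ChatUser {
  id: string;
  tenantId: string;
}

/**
 * Returns true if `sender`, connected through `portal`, may see or message
 * `recipient`. `portal` and `sender` must be taken from the authenticated
 * session on the server, not from anything the client sends.
 */
function canReach(portal: Portal, sender: ChatUser, recipient: ChatUser): boolean {
  // Manage users can reach any user, regardless of tenant.
  if (portal === "manage") return true;
  // Dashboard and DM Portal users are limited to their own tenant.
  return sender.tenantId === recipient.tenantId;
}
```

Running a check like this on every incoming message, rather than only when the user list is rendered, is what would make a hand-crafted message to another tenant fail.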
Framework Decision: Socket.IO
Why not plain WebSockets (ws package)?
Raw WebSockets give you a TCP-like pipe. You would need to build on top of it:
- Room/channel management (for tenant scoping)
- Automatic reconnection
- Fallback for environments that block WebSocket upgrades (corporate proxies, etc.)
- Broadcasting across multiple API instances
That is weeks of infrastructure work before writing a single product feature.
Why not SSE (Server-Sent Events, already in use)?
SSE is already used for audit log streaming. It is unidirectional — server pushes to client only. Chat requires the client to also send messages. You would still need a REST endpoint for sending and SSE for receiving, which is two separate protocols to maintain.
Why not a managed service (Ably, Pusher, Liveblocks)?
External services add per-message cost and vendor lock-in, and they require all message content to leave our infrastructure. That is not appropriate for a B2B marketing platform.
Socket.IO — chosen framework
Socket.IO uses WebSocket as its main transport and falls back transparently to HTTP long-polling when a WebSocket connection cannot be established. It provides:
- Rooms: map directly to tenants; a message broadcast to `tenant:{tenantId}` reaches only users in that room
- Namespaces: separate the manage admin namespace from the tenant namespace
- `@socket.io/redis-adapter`: plugs into the existing Redis infrastructure; when the API runs on multiple instances, a message published on instance A is automatically fanned out to clients connected to instances B and C
- Automatic reconnection: built into the Socket.IO client; no custom retry logic needed
- Middleware: JWT/session verification at the handshake level (before any room is joined), consistent with the existing API auth pattern
- `fastify-socket.io`: community-maintained Fastify plugin; attaches Socket.IO to the existing Fastify HTTP server without a separate process
The result is that the same Redis already used for BullMQ and audit-log pub/sub also handles Socket.IO message routing. No new infrastructure required.
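To make the handshake and room steps concrete, here is a minimal sketch of authenticating first and joining the tenant room afterwards. It uses small structural types instead of importing `socket.io` so that it stands alone; they mirror the `socket.handshake.auth`, `socket.data`, and `socket.join(room)` members of a server-side Socket.IO socket. `Session`, `VerifyToken`, `makeAuthMiddleware`, and `joinTenantRoom` are hypothetical names, and the token verifier stands in for whatever the existing API auth already does.

```typescript
// Structural stand-in for the parts of a Socket.IO server socket used here.
interface SocketLike {
  handshake: { auth: { token?: string } };
  data: { userId?: string; tenantId?: string };
  join(room: string): void | Promise<void>;
}

// Hypothetical session shape and verifier (assumptions, not the real API).
interface Session {
  userId: string;
  tenantId: string;
}
type VerifyToken = (token: string) => Session | null;

// Room name convention taken from the list above: tenant:{tenantId}.
const tenantRoom = (tenantId: string): string => `tenant:${tenantId}`;

// Handshake middleware: rejects the connection unless the token verifies,
// and stores the verified identity on the socket. No room is joined here.
function makeAuthMiddleware(verify: VerifyToken) {
  return (socket: SocketLike, next: (err?: Error) => void): void => {
    const token = socket.handshake.auth.token;
    const session = token ? verify(token) : null;
    if (!session) {
      next(new Error("unauthorized"));
      return;
    }
    socket.data.userId = session.userId;
    socket.data.tenantId = session.tenantId;
    next();
  };
}

// Connection handler: joins the tenant room using the verified tenant id.
function joinTenantRoom(socket: SocketLike): void {
  const tenantId = socket.data.tenantId;
  if (!tenantId) throw new Error("socket was not authenticated");
  void socket.join(tenantRoom(tenantId));
}
```

With a real Socket.IO server these would plausibly be registered as `io.use(makeAuthMiddleware(verify))` and `io.on("connection", joinTenantRoom)`, and a tenant-wide broadcast would go through `io.to(tenantRoom(id)).emit(...)`. The exact wiring for the two namespaces is left to architecture.md.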
Files in this folder
| File | Contents |
|---|---|
| `index.md` | This file: overview and framework rationale |
| `architecture.md` | Component diagram, auth flow, Redis layout, tenant isolation |
| `schema.md` | New Prisma models (`ChatMessage`) and Redis key conventions |
| `events.md` | Full Socket.IO event catalog (client ↔ server) |
| `implementation.md` | Phased implementation plan with file-level task breakdown |