API Gateway

Status: [To Build] · Pattern: Gateway layer within Fastify (Option A)

All requests to the Fastify API pass through a gateway layer before reaching any surface router. The gateway owns every cross-cutting concern — authentication, tenant resolution, rate limiting, throttling, request logging, and audit — so that none of those concerns leak into the individual surface routers.

Why a Gateway Layer

Previously, auth, rate limiting, and logging lived in a shared common/ module that each router imported. This caused three problems:

Cross-cutting logic duplicated or inconsistently applied across surfaces
Adding a new concern (e.g. audit logging) required touching every router
No single place to enforce policy changes globally

The gateway layer is a Fastify plugin registered before all routers. Every request — regardless of surface — passes through the same ordered chain of hooks. Routers only see a fully-authenticated, rate-checked, logged request.

Request Lifecycle

Every request flows through this chain in order:


Inbound request (from Traefik)
        │
        ▼
┌───────────────────────────────────────────────────────┐
│                    GATEWAY LAYER                       │
│                                                        │
│  1. Request ID        assign unique trace ID           │
│  2. IP Extraction     resolve real IP behind Traefik   │
│  3. Authentication    validate JWT or API key          │
│  4. Tenant Resolution resolve + attach tenantId        │
│  5. Rate Limiting     sliding window check (Redis)     │
│  6. Throttling        per-surface concurrency cap      │
│  7. Request Log       structured log entry (pre)       │
│  8. Audit Pre-hook    snapshot state (write ops only)  │
│                                                        │
└───────────────────┬───────────────────────────────────┘
                    │ forward
                    ▼
┌───────────────────────────────────────────────────────┐
│                  SURFACE ROUTER                        │
│   /dashboard/v1  /dm/v1  /admin/v1                    │
│   /mobile/v1     /cli/v1  /auth/v1  /agent/v1         │
└───────────────────┬───────────────────────────────────┘
                    │ response
                    ▼
┌───────────────────────────────────────────────────────┐
│                  GATEWAY LAYER (post)                  │
│                                                        │
│  9. Response Log      status, duration, size           │
│  10. Audit Post-hook  write audit record (write ops)   │
│                                                        │
└───────────────────────────────────────────────────────┘
        │
        ▼
Response returned to client
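
The ordering guarantee in the chain above can be illustrated without Fastify. This is a minimal, framework-free sketch: `Ctx`, `GatewayStep`, `runChain`, and the step bodies are hypothetical names, not the gateway's actual code. It shows only that steps run strictly in order and that a rejecting step stops the chain before the router is reached.

```typescript
// Hypothetical sketch of an ordered, short-circuiting step chain.
interface Ctx {
  headers: Record<string, string | undefined>;
  trace: string[]; // records which steps ran, for illustration only
}

type StepResult = { ok: true } | { ok: false; status: number; code: string };
type GatewayStep = { name: string; run: (ctx: Ctx) => StepResult };

function runChain(steps: GatewayStep[], ctx: Ctx): StepResult {
  for (const step of steps) {
    ctx.trace.push(step.name);
    const result = step.run(ctx);
    if (!result.ok) return result; // later steps and the router never run
  }
  return { ok: true };
}

// Three placeholder steps in lifecycle order; only "auth" can reject here.
const steps: GatewayStep[] = [
  { name: 'requestId', run: () => ({ ok: true }) },
  {
    name: 'auth',
    run: (ctx) =>
      ctx.headers['authorization']
        ? { ok: true }
        : { ok: false, status: 401, code: 'UNAUTHORIZED' },
  },
  { name: 'rateLimit', run: () => ({ ok: true }) },
];
```

In the real plugin the same effect would come from registering Fastify hooks in a fixed order and replying early from the hook that rejects.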

Gateway Responsibilities

1. Request ID

Every request is assigned a ULID requestId before any other processing. It is:

Attached to the Fastify request object (req.requestId)
Included in every log line for this request
Returned in the response header: X-Request-Id: <ulid>
Used to correlate logs across services (passed to BullMQ jobs, agent callbacks, MongoDB writes)


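// gateway/request-id.ts
import { ulid } from 'ulid';

// Assumes FastifyRequest is augmented elsewhere with a `requestId: string` field
fastify.addHook('onRequest', async (req, reply) => {
  req.requestId = ulid();
  req.log = req.log.child({ requestId: req.requestId }); // tag every log line
  reply.header('X-Request-Id', req.requestId);           // echo back to the client
});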

2. IP Extraction

The API sits behind Traefik. The real client IP is in the X-Forwarded-For header. The gateway normalises this to req.clientIp (used by rate limiting and audit logging).


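// Use the left-most X-Forwarded-For entry. This is only safe if Traefik is the sole
// ingress and discards client-supplied X-Forwarded-For values; otherwise it is spoofable.
const xff = req.headers['x-forwarded-for'];
const forwarded = (Array.isArray(xff) ? xff[0] : xff)?.split(',')[0]?.trim();
req.clientIp = forwarded || req.ip;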

3. Authentication

Two credential types are accepted:

Credential	Format	Used by
JWT (access token)	`Authorization: Bearer <jwt>`	Dashboard, DM Portal, Manage, Mobile
API key	`Authorization: ApiKey <key>`	CLI, external integrations

JWT validation flow:

Decode header — reject if malformed
Verify signature against JWT_SECRET
Check exp and reject if expired (return 401 with WWW-Authenticate: Bearer error="invalid_token", error_description="The access token expired")
Attach decoded payload to req.principal
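
The four steps above can be sketched with Node's built-in crypto module alone. This is an illustration under stated assumptions, not the gateway's implementation: it assumes tokens are HS256-signed (the text names only JWT_SECRET, not the algorithm), and `verifyJwt`, `decodeSegment`, and `JwtResult` are hypothetical names. A production build would more likely rely on an established JWT library.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

type JwtResult =
  | { ok: true; payload: Record<string, unknown> }
  | { ok: false; code: 'INVALID_TOKEN' | 'TOKEN_EXPIRED' };

// Decodes one base64url JSON segment; returns null if it is not a JSON object.
function decodeSegment(segment: string): Record<string, unknown> | null {
  try {
    const value: unknown = JSON.parse(Buffer.from(segment, 'base64url').toString('utf8'));
    return typeof value === 'object' && value !== null
      ? (value as Record<string, unknown>)
      : null;
  } catch {
    return null;
  }
}

// Assumption: HS256 with a shared secret. `nowSec` is the current Unix time in seconds.
function verifyJwt(token: string, secret: string, nowSec: number): JwtResult {
  const invalid: JwtResult = { ok: false, code: 'INVALID_TOKEN' };
  const parts = token.split('.');
  if (parts.length !== 3) return invalid;
  const [h = '', p = '', sig = ''] = parts;

  // Step 1: decode; reject if malformed
  const header = decodeSegment(h);
  const payload = decodeSegment(p);
  if (!header || !payload) return invalid;
  // Pin the algorithm so a token cannot choose its own (e.g. "none")
  if (header['alg'] !== 'HS256') return invalid;

  // Step 2: verify the signature in constant time
  const expected = createHmac('sha256', secret).update(`${h}.${p}`).digest();
  const actual = Buffer.from(sig, 'base64url');
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return invalid;

  // Step 3: check exp (a missing or non-numeric exp is treated as invalid here)
  const exp = payload['exp'];
  if (typeof exp !== 'number') return invalid;
  if (exp <= nowSec) return { ok: false, code: 'TOKEN_EXPIRED' };

  // Step 4: the caller attaches `payload` to req.principal
  return { ok: true, payload };
}
```

The two failure codes map onto the `INVALID_TOKEN` and `TOKEN_EXPIRED` rows of the gateway error table.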

API key validation flow:

Look up key hash in PostgreSQL api_keys table
Check is_active and expires_at
Load associated user + role + tenant scopes
Attach to req.principal in the same shape as JWT principal
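
A minimal sketch of the first two steps follows. The row shape, the SHA-256 hex hashing scheme, and the names `ApiKeyRow`, `hashApiKey`, and `validateApiKey` are assumptions; the text above names only the api_keys table and its is_active and expires_at columns. The `lookup` callback stands in for the PostgreSQL query.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical row shape; only is_active and expires_at appear in the text above.
interface ApiKeyRow {
  key_hash: string;
  is_active: boolean;
  expires_at: Date | null; // assumption: null means the key never expires
  user_id: string;
}

// Assumption: keys are stored as hex-encoded SHA-256 digests, never in plain text.
function hashApiKey(rawKey: string): string {
  return createHash('sha256').update(rawKey).digest('hex');
}

// Returns the matching row, or null if the key is unknown, inactive, or expired.
function validateApiKey(
  rawKey: string,
  lookup: (hash: string) => ApiKeyRow | undefined,
  now: Date,
): ApiKeyRow | null {
  const row = lookup(hashApiKey(rawKey));
  if (!row || !row.is_active) return null;
  if (row.expires_at !== null && row.expires_at.getTime() <= now.getTime()) return null;
  return row;
}
```

A null result would correspond to the `INVALID_API_KEY` error; loading the user, role, and tenant scopes would follow on success.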

Agent callbacks use a short-lived task-scoped JWT (issued per run, 30-min TTL, signed with JWT_SECRET). The gateway validates these the same way as user JWTs — the sub is the runId, not a userId.

Unauthenticated paths (bypass auth hook):

POST /auth/v1/login
POST /auth/v1/refresh
GET /health


interface GatewayPrincipal {
  id:         string;             // user ref_id or runId (agent)
  type:       'human' | 'agent' | 'api_key';
  role:       'admin' | 'member' | 'reviewer' | 'super_admin' | 'agent';
  tenantId?:  string;             // absent for super_admin and cross-tenant reviewers
  appAccess:  string[];           // which surfaces this principal can reach
  keyId?:     string;             // api_keys.ref_id (api_key type only)
}

4. Tenant Resolution

After auth, the gateway resolves and validates the tenant context:

Surface	Source of tenantId
`/dashboard/v1`	JWT `tenantId` field
`/mobile/v1`	JWT `tenantId` field
`/dm/v1`	Optional `?tenantId=` query param; validated against reviewer’s assigned tenants
`/admin/v1`	Optional path param `:tenantId`; no restriction (super_admin only)
`/cli/v1`	Optional `?tenantId=` or `X-Tenant-Id` header; validated against principal’s access
`/agent/v1`	From run record in PostgreSQL (looked up by `runId`)

The resolved tenant record is attached to req.tenant. All downstream route handlers use req.tenant.id — they never look it up themselves.
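
The per-surface rules in the table can be expressed as one pure function. This sketch resolves only the tenant ID; loading the tenant record and the `TENANT_NOT_FOUND` case are left out. `TenantInputs`, `resolveTenantId`, `allowedTenantIds`, and `runTenantId` are hypothetical names, and giving the query parameter precedence over the X-Tenant-Id header on the CLI surface is an assumption.

```typescript
type Surface = 'dashboard' | 'mobile' | 'dm' | 'admin' | 'cli' | 'agent';

interface TenantInputs {
  jwtTenantId?: string;       // tenantId field from the JWT
  queryTenantId?: string;     // ?tenantId=
  headerTenantId?: string;    // X-Tenant-Id
  pathTenantId?: string;      // :tenantId path param
  allowedTenantIds: string[]; // hypothetical: tenants this principal may access
  runTenantId?: string;       // hypothetical: tenant of the run record found by runId
}

type TenantResult =
  | { ok: true; tenantId: string | null }
  | { ok: false; status: 403; code: 'FORBIDDEN' };

function resolveTenantId(surface: Surface, input: TenantInputs): TenantResult {
  switch (surface) {
    case 'dashboard':
    case 'mobile':
      return { ok: true, tenantId: input.jwtTenantId ?? null };
    case 'admin':
      // No restriction: the surface is super_admin only
      return { ok: true, tenantId: input.pathTenantId ?? null };
    case 'agent':
      return { ok: true, tenantId: input.runTenantId ?? null };
    case 'dm':
    case 'cli': {
      // Assumption: the query param wins over the header; the header applies to cli only
      const requested =
        input.queryTenantId ?? (surface === 'cli' ? input.headerTenantId : undefined);
      if (requested === undefined) return { ok: true, tenantId: null };
      return input.allowedTenantIds.includes(requested)
        ? { ok: true, tenantId: requested }
        : { ok: false, status: 403, code: 'FORBIDDEN' };
    }
  }
}
```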

5. Rate Limiting

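Sliding window rate limits are enforced in Redis per (tenantId, userId) key. Note that the counter-plus-TTL steps listed below describe a fixed window, which can admit up to twice the limit across a window boundary; a true sliding window needs either a sorted set of request timestamps or a weighted blend of the current and previous window counts. Limits are configured per surface: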

Surface	Requests / minute	Burst allowance
`/dashboard/v1`	300	+60 (20% burst)
`/dm/v1`	600	+120
`/admin/v1`	120	+24
`/mobile/v1`	200	+40
`/cli/v1`	1,200	+240 (scripts may send bursts)
`/agent/v1`	60 per runId	—

On every request, the gateway:

Increments the Redis counter for rate:{surface}:{tenantId}:{userId}
Sets a TTL of 60s on the first write
If count > limit: returns 429 with Retry-After and X-RateLimit-* headers
Otherwise: attaches the remaining count to the response headers


X-RateLimit-Limit: 300
X-RateLimit-Remaining: 247
X-RateLimit-Reset: 1704067261   # Unix timestamp when window resets
Retry-After: 14                  # seconds (only on 429)
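
The counter steps and the headers above can be sketched with an in-memory map standing in for Redis (INCR plus a 60s expiry). `WindowCounter`, `hit`, and `rateKey` are hypothetical names. The sketch follows the steps exactly as listed, so it behaves as a single counter per window; how the burst allowance combines with the base limit is not specified in the text, so `limit` here is simply whatever cap the caller passes.

```typescript
// In-memory stand-in for the Redis counter. Not the gateway's actual implementation.
interface CounterWindow {
  count: number;
  resetAtMs: number;
}

interface RateDecision {
  allowed: boolean;
  headers: Record<string, string>;
}

class WindowCounter {
  private readonly windows = new Map<string, CounterWindow>();
  private readonly windowMs: number;

  constructor(windowMs = 60_000) {
    this.windowMs = windowMs;
  }

  hit(key: string, limit: number, nowMs: number): RateDecision {
    let w = this.windows.get(key);
    if (!w || nowMs >= w.resetAtMs) {
      // First write in a window: start the count and set the expiry
      w = { count: 0, resetAtMs: nowMs + this.windowMs };
      this.windows.set(key, w);
    }
    w.count += 1;

    const allowed = w.count <= limit;
    const headers: Record<string, string> = {
      'X-RateLimit-Limit': String(limit),
      'X-RateLimit-Remaining': String(Math.max(0, limit - w.count)),
      'X-RateLimit-Reset': String(Math.ceil(w.resetAtMs / 1000)), // Unix seconds
    };
    if (!allowed) {
      headers['Retry-After'] = String(Math.ceil((w.resetAtMs - nowMs) / 1000));
    }
    return { allowed, headers };
  }
}

const rateKey = (surface: string, tenantId: string, userId: string): string =>
  `rate:${surface}:${tenantId}:${userId}`;
```

Passing the clock in as `nowMs` keeps the function deterministic; against Redis, the increment and the expiry would need to be issued atomically (for example in a transaction or script) so that a crash between the two cannot leave a counter without a TTL.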

6. Throttling

Rate limiting counts total requests. Throttling caps concurrent requests per user on each surface, so that a single client cannot saturate the API with slow, long-running requests (e.g. long-lived SSE connections, file uploads).

Surface	Max concurrent per user
`/dashboard/v1`	10
`/dm/v1`	20
`/admin/v1`	5
`/mobile/v1`	8
`/cli/v1`	15
SSE connections	5 per user (shared across surfaces)

Throttling is implemented with a Redis counter that is incremented when a request starts and decremented when the response ends, including on error or client disconnect.
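
The increment/decrement pattern can be sketched with an in-memory map in place of Redis. `ConcurrencyGate`, `acquire`, and `active` are hypothetical names. The release function is idempotent because, in a real server, both an error handler and a connection-close handler may fire for the same request.

```typescript
// In-memory stand-in for the Redis concurrency counter. Not the gateway's actual code.
class ConcurrencyGate {
  private readonly inFlight = new Map<string, number>();

  // Returns a release function when a slot is granted, or null when the cap is reached.
  acquire(key: string, max: number): (() => void) | null {
    const current = this.inFlight.get(key) ?? 0;
    if (current >= max) return null; // caller responds 429 THROTTLED
    this.inFlight.set(key, current + 1);

    let released = false;
    return () => {
      if (released) return; // guard against double release (e.g. error + close)
      released = true;
      const next = (this.inFlight.get(key) ?? 1) - 1;
      if (next <= 0) this.inFlight.delete(key);
      else this.inFlight.set(key, next);
    };
  }

  active(key: string): number {
    return this.inFlight.get(key) ?? 0;
  }
}
```

With Redis the check and the increment must happen atomically (for instance INCR first, then DECR and reject if the result exceeds the cap), since a separate read followed by a write would race across API instances.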

7. Request Logging

Every request is logged as a structured JSON entry before the route handler runs:


{
  "level": "info",
  "time": "2026-04-04T09:00:00.000Z",
  "requestId": "01ARZ3NDEKTSV4RRFFQ69G5FAV",
  "method": "POST",
  "path": "/dm/v1/approvals/01ARZ.../resolve",
  "surface": "dm",
  "userId": "01BRZ...",
  "tenantId": "01CRZ...",
  "clientIp": "203.0.113.5",
  "userAgent": "Leadmetrics-CLI/1.0.0"
}

And after the handler completes:


{
  "level": "info",
  "requestId": "01ARZ3NDEKTSV4RRFFQ69G5FAV",
  "status": 200,
  "durationMs": 42,
  "responseBytes": 318
}

Logs are written via pino (already in the stack) and shipped to Grafana Loki.

8 & 10. Audit Logging

The gateway writes an audit record for every state-changing operation (POST, PUT, PATCH, DELETE). Read-only GETs are not audited (they are covered by request logs).

Pre-hook (step 8): Before the handler runs, for update/delete operations, the gateway fetches the current state of the resource and attaches it to req.auditBefore.

Post-hook (step 10): After the handler returns successfully, the gateway writes an audit_logs record to PostgreSQL:


interface AuditLog {
  id:           string;       // ULID
  requestId:    string;       // from step 1
  tenantId:     string | null;
  actorId:      string;       // user ref_id or 'agent:{runId}'
  actorType:    'human' | 'agent' | 'api_key';
  impersonating: string | null; // set if super_admin is impersonating
  surface:      string;       // 'dashboard' | 'dm' | 'admin' | 'mobile' | 'cli' | 'agent'
  method:       string;       // HTTP method
  path:         string;       // full request path
  action:       string;       // semantic label, e.g. 'approval.resolve'
  resourceType: string;       // e.g. 'approval', 'activity', 'tenant'
  resourceId:   string;       // ULID of the affected resource
  before:       JsonValue | null;  // state before change (nullable)
  after:        JsonValue | null;  // state after change (nullable)
  status:       number;       // HTTP response status
  durationMs:   number;
  createdAt:    Date;
}

The action label is set by the route handler via req.setAuditAction('approval.resolve'). If not set, the gateway derives it from the method + path (e.g. POST /dm/v1/approvals/:id/resolve → dm.approvals.resolve).
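
One way to derive the fallback label is sketched below. The example above does not show how the HTTP method contributes to the result, so this sketch uses only the route pattern; `deriveAuditAction` is a hypothetical name, and the rules (drop segments that look like `v1`, drop segments starting with `:`) are assumptions chosen to reproduce that one example.

```typescript
// Hypothetical helper: builds a dotted action label from a route pattern.
function deriveAuditAction(routePattern: string): string {
  return routePattern
    .split('/')
    .filter((seg) => seg.length > 0)
    .filter((seg) => !/^v\d+$/.test(seg))  // drop the version segment, e.g. "v1"
    .filter((seg) => !seg.startsWith(':')) // drop path params, e.g. ":id"
    .join('.');
}

// deriveAuditAction('/dm/v1/approvals/:id/resolve') returns 'dm.approvals.resolve'
```

Deriving from the route pattern rather than the concrete URL keeps resource IDs out of the label.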

Impersonation flagging: When a super_admin is impersonating a tenant, impersonating is set to the target tenantId. This makes every action traceable even across impersonation sessions.

Package Structure


apps/api/src/
├── gateway/
│   ├── index.ts          # Fastify plugin — registers hooks in order
│   ├── request-id.ts     # Step 1 — assign ULID request ID
│   ├── ip.ts             # Step 2 — resolve real client IP
│   ├── auth.ts           # Step 3 — JWT + API key validation
│   ├── tenant.ts         # Step 4 — tenant resolution + attachment
│   ├── rate-limit.ts     # Step 5 — sliding window Redis rate limiter
│   ├── throttle.ts       # Step 6 — concurrent request cap
│   ├── logger.ts         # Step 7 + 9 — request + response logs
│   └── audit.ts          # Step 8 + 10 — audit pre/post hooks
│
├── routers/
│   ├── auth/             # /auth/v1
│   ├── dashboard/        # /dashboard/v1
│   ├── dm/               # /dm/v1
│   ├── admin/            # /admin/v1
│   ├── mobile/           # /mobile/v1
│   ├── cli/              # /cli/v1
│   └── agent/            # /agent/v1
│
├── common/               # Shared non-gateway utilities
│   ├── pagination.ts
│   ├── error.ts
│   └── sse.ts
│
└── index.ts              # App bootstrap — register gateway plugin, then routers

Registration order in index.ts:


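// 1. Register gateway (must be first, before any routers).
//    gatewayPlugin must be wrapped with fastify-plugin; otherwise Fastify's
//    encapsulation scopes its hooks to the plugin and the routers never see them.
await fastify.register(gatewayPlugin);
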
// 2. Register surface routers (gateway hooks already in place)
await fastify.register(authRouter,      { prefix: '/auth/v1' });
await fastify.register(dashboardRouter, { prefix: '/dashboard/v1' });
await fastify.register(dmRouter,        { prefix: '/dm/v1' });
await fastify.register(adminRouter,     { prefix: '/admin/v1' });
await fastify.register(mobileRouter,    { prefix: '/mobile/v1' });
await fastify.register(cliRouter,       { prefix: '/cli/v1' });
await fastify.register(agentRouter,     { prefix: '/agent/v1' });

Surface Access Matrix

The gateway enforces this access matrix before forwarding to any router:

Surface prefix	Allowed roles	Allowed credential types
`/auth/v1`	— (public login/refresh endpoints)	None required
`/dashboard/v1`	`admin`, `member`	JWT
`/dm/v1`	`reviewer`, `super_admin`	JWT
`/admin/v1`	`super_admin`	JWT
`/mobile/v1`	`admin`, `member`	JWT
`/cli/v1`	`reviewer`, `super_admin`	API key, JWT
`/agent/v1`	`agent`	Task-scoped JWT

Any mismatch returns 403 Forbidden before the router is reached.
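
The matrix lends itself to a data-driven check. In this sketch the credential labels (`jwt`, `api_key`, `task_jwt`) and the names `ACCESS_MATRIX` and `isAllowed` are hypothetical; the roles and prefixes are copied from the table. Public `/auth/v1` paths are assumed to bypass this check, and denying unknown prefixes is also an assumption.

```typescript
type Role = 'admin' | 'member' | 'reviewer' | 'super_admin' | 'agent';
type Credential = 'jwt' | 'api_key' | 'task_jwt'; // hypothetical labels

interface SurfaceRule {
  roles: Role[];
  credentials: Credential[];
}

// Mirrors the table above; /auth/v1 is public and therefore absent.
const ACCESS_MATRIX: Record<string, SurfaceRule> = {
  '/dashboard/v1': { roles: ['admin', 'member'], credentials: ['jwt'] },
  '/dm/v1': { roles: ['reviewer', 'super_admin'], credentials: ['jwt'] },
  '/admin/v1': { roles: ['super_admin'], credentials: ['jwt'] },
  '/mobile/v1': { roles: ['admin', 'member'], credentials: ['jwt'] },
  '/cli/v1': { roles: ['reviewer', 'super_admin'], credentials: ['api_key', 'jwt'] },
  '/agent/v1': { roles: ['agent'], credentials: ['task_jwt'] },
};

// Returns true when both the role and the credential type are allowed on the surface.
function isAllowed(path: string, role: Role, credential: Credential): boolean {
  const entry = Object.entries(ACCESS_MATRIX).find(
    ([prefix]) => path === prefix || path.startsWith(`${prefix}/`),
  );
  if (!entry) return false; // assumption: unknown surfaces are denied
  const [, rule] = entry;
  return rule.roles.includes(role) && rule.credentials.includes(credential);
}
```

A false result would map to the `FORBIDDEN` row of the gateway error table. Matching on `prefix + '/'` avoids treating a path such as `/dmx/v1` as part of the `/dm/v1` surface.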

Error Responses from the Gateway

Gateway errors use the same ApiError envelope as surface routers:

Step	Error condition	Status	Code
Auth	Missing Authorization header	401	`UNAUTHORIZED`
Auth	JWT signature invalid	401	`INVALID_TOKEN`
Auth	JWT expired	401	`TOKEN_EXPIRED`
Auth	API key not found / inactive	401	`INVALID_API_KEY`
Tenant	tenantId in JWT not found in DB	401	`TENANT_NOT_FOUND`
Tenant	Reviewer not assigned to requested tenant	403	`FORBIDDEN`
Surface access	Role not allowed on this surface	403	`FORBIDDEN`
Rate limit	Request count exceeded	429	`RATE_LIMITED`
Throttle	Concurrent request cap exceeded	429	`THROTTLED`

Future Upgrade — Option B: Separate Gateway Service

Future story — not in scope for the current build.

As tenant volume and traffic grow, the gateway layer can be extracted into a standalone gateway service that proxies to separately-deployed surface services. The upgrade path is:

Extract gateway plugin into a standalone Fastify app (apps/gateway)
Split surface routers into separate deployable services (apps/api-dashboard, apps/api-dm, apps/api-admin, etc.) — each on its own port
Gateway proxies using @fastify/http-proxy — routes by URL prefix to the correct downstream service after running all gateway hooks
Each service removes its own auth/rate-limit middleware (gateway now owns this exclusively)
Coolify deploys gateway + each service as separate containers with a private Docker network between them — only the gateway is exposed to Traefik

Benefits of Option B:

Individual surfaces can scale independently (e.g. api-dm gets more replicas during peak review hours)
Surface services can be deployed separately without taking down the whole API
Gateway becomes a true choke point — circuit breaking, retries, and observability all in one place

Pre-conditions before migration:

Each surface must be independently testable
API contracts between gateway and services must be stable
Per-service latency monitoring must be in place to show where scaling is actually needed