Database Documentation
The system uses two database engines — PostgreSQL for relational, transactional data and MongoDB for large blobs, documents, and high-volume append-only logs.
Contents
| Doc | What it covers |
|---|---|
| Database Overview | Which data goes where and why — decision guide, split pattern, key relationships |
| PostgreSQL Schema | Full PostgreSQL schema: all tables, indexes, useful queries, security |
| MongoDB Collections | All MongoDB collections: schemas, indexes, TTLs, security |
| SQL Entity Relations | ER diagrams for all PostgreSQL tables and their relationships |
| NoSQL Entity Relations | MongoDB collections and their cross-references to PostgreSQL |
Quick Decision Guide
PostgreSQL — if the data:
- Has relationships to other records (foreign keys matter)
- Needs counting, aggregation, or billing queries
- Participates in ACID transactions (status state machines)
- Is small-to-medium in size with a fixed schema
MongoDB — if the data:
- Has variable or unpredictable structure (arbitrary JSON diffs, integration payloads)
- Is an append-only log with TTL cleanup and no relational queries
- Needs full-text search (skills content)
- Is small-to-medium structured metadata accessed by a single FK from PostgreSQL
S3 — if the data:
- Is a large binary or large text blob (PDF, multi-KB Markdown, scraped website text)
- Has no query requirements — accessed only by a known key
- Needs immutability guarantees (Object Lock / WORM for legal records)
- Benefits from direct client access without API proxying (pre-signed URLs)
See the overview for the full decision table.
Key Architecture Patterns
Split Pattern (PostgreSQL + MongoDB + S3)
Large content is split across all three stores. PostgreSQL holds structured metadata; MongoDB holds variable-structure metadata or operational logs; S3 holds the actual content bytes.
| Entity | PostgreSQL holds | MongoDB holds | S3 holds |
|---|---|---|---|
| Activity output | Status, output_ref pointer | Metadata + s3Key (activity_outputs) | Full content (Markdown/JSON) |
| Blog post | Metadata (status, channel, word count) | — | Full article Markdown (via output_ref → activity_outputs.s3Key) |
| Social post | Metadata (status, platform) | — | Post copy + hashtags (via output_ref → activity_outputs.s3Key) |
| Onboarding run | Status, context_file_id | S3 key pointers (onboarding_research) | Scraped text, research, context file |
| Strategy | Version, status, dates, approval | S3 key pointer (strategies_content) | Full strategy Markdown |
| Contract | Acceptance metadata, pdf_s3_key | — | PDF binary (Object Lock / WORM) |
| Tool call | Summary (name, method, status) | Full payloads (tool_call_logs) | — |
Multi-Tenancy
All records are scoped to a tenant:
- PostgreSQL:
tenant_id UUID NOT NULL REFERENCES tenants(id)on every table - MongoDB:
tenantId: stringon every document
Queries must always include the tenant filter. Global records (e.g. global skill templates) use tenantId: 'global' in MongoDB and tenant_id IS NULL in PostgreSQL.
Identifiers
| Identifier | Used for | Format |
|---|---|---|
id (PostgreSQL) | Internal FK joins only | UUID |
ref_id (PostgreSQL) | External APIs, URLs, webhooks | ULID (26 chars, sortable) |
_id (MongoDB) | MongoDB internal primary key | ObjectId |
refId (MongoDB) | External references | ULID |
PostgreSQL ref_id values and MongoDB ObjectIds are stored as cross-reference pointers (e.g. output_ref, content_ref, pdf_ref).
Storage Engines
| PostgreSQL 16 | MongoDB 7 | S3 / MinIO | |
|---|---|---|---|
| ORM/ODM/SDK | Prisma ORM | Mongoose | @aws-sdk/client-s3 |
| Package | providers/provider-db/ | providers/provider-db/ | providers/provider-storage/ |
| Connection | Via PgBouncer (transaction mode) | Direct MONGO_URL | S3_ENDPOINT + credentials |
| Multi-tenancy | tenant_id column + FK on every table | tenantId field on every document | {tenantId}/ key prefix |
| Soft deletes | WHERE deleted_on IS NULL | { deletedOn: null } | Object versioning + lifecycle rules |
| Local dev | Docker Compose postgres:16 | Docker Compose mongo:7 | Docker Compose minio/minio |
| Production | Managed cloud PostgreSQL | MongoDB Atlas or self-hosted | AWS S3 or DO Spaces |
| On-prem | Self-hosted PostgreSQL | Self-hosted MongoDB | Self-hosted MinIO |