Screens — Knowledge Base (RAG Management)

Audience: Tenant admins (Dashboard app) + DM Portal reviewers
Purpose: Manage the RAG knowledge base: upload documents, monitor ingestion, configure chunking, test retrieval, and trigger website crawls.
Platform: Web only (all screens)

Status: [To Build] — The Knowledge Base / RAG system is a new layer being added as part of the RAG integration. None of these screens exist in the current live app. All screens (KB1–KB6) are specified and ready to build.

| Screen | Status |
| --- | --- |
| KB1 — Knowledge Base Overview | [To Build] |
| KB2 — Dataset File Management | [To Build] |
| KB3 — Upload File Modal | [To Build] |
| KB4 — Retrieval Sandbox | [To Build] |
| KB5 — Dataset Configuration | [To Build] |
| KB6 — Website Crawl Settings modal | [To Build] |
| Dashboard Settings sub-nav (Knowledge Base entry) | [To Build] |
| Activity tab “Retrieved context” section | [To Build] |
| DM Portal Activity Detail “Retrieved context” | [To Build] |
| Manage Tenant Detail Knowledge Base tab | [To Build] (see screens-manage.md M3) |

Related: RAG Integration | RAG Architecture


Where These Screens Live

| App | Route | Who uses it |
| --- | --- | --- |
| Dashboard | /settings/knowledge-base | Tenant admin: manage their own knowledge base |
| Dashboard | /settings/knowledge-base/[datasetId] | Tenant admin: per-dataset file management |
| Dashboard | /settings/knowledge-base/[datasetId]/sandbox | Tenant admin + DM reviewer: test retrieval |
| Dashboard | /settings/knowledge-base/[datasetId]/configure | Tenant admin: chunking + parser settings |
| DM Portal | /activities/[id] (existing) | DM reviewer: see RAG chunks used in an agent run |
| Manage | /tenants/[id] → Knowledge Base tab (existing) | Super admin: view tenant RAG stats |

The Knowledge Base is a section of Settings, not a top-level nav item. It lives under the Settings sidebar section with its own sub-nav.


Screen KB1 — Knowledge Base Overview (/settings/knowledge-base)

Purpose: See all four standard datasets for this tenant, file counts, indexing status, and quick actions.

┌─────────────────────────────────────────────────────────────┐
│  Knowledge Base                                             │
│  Manage documents, website content, and research data       │
│  that agents use to improve their responses.                │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌─────────────────────────────────────────────────────┐    │
│  │ 📄 Client Documents                                 │    │
│  │ Brand guides, product sheets, tone-of-voice docs    │    │
│  │                                                     │    │
│  │ 3 files · 29 chunks indexed                         │    │
│  │ ████████████████████░░ 87%                          │    │
│  │ brand-guidelines.pdf   ✅                           │    │
│  │ tone-of-voice.docx     ✅                           │    │
│  │ q1-results.pdf         ⏳ indexing…                 │    │
│  │                                                     │    │
│  │ [+ Upload Docs]  [View All Files →]                 │    │
│  └─────────────────────────────────────────────────────┘    │
│                                                             │
│  ┌─────────────────────────────────────────────────────┐    │
│  │ 🌐 Website Content                                  │    │
│  │ Crawled pages from your website                     │    │
│  │                                                     │    │
│  │ 142 pages · 1,840 chunks indexed                    │    │
│  │ Last crawl: Apr 1, 2026 · Next: Apr 8, 2026         │    │
│  │                                                     │    │
│  │ [Re-crawl Now]  [View Pages →]                      │    │
│  └─────────────────────────────────────────────────────┘    │
│                                                             │
│  ┌─────────────────────────────────────────────────────┐    │
│  │ 📝 Published Content                                │    │
│  │ Blog posts and social posts published via platform  │    │
│  │                                                     │    │
│  │ 24 items · 312 chunks · auto-updated                │    │
│  │ 12 blog posts · 12 social posts                     │    │
│  │                                                     │    │
│  │ [View Content →]                                    │    │
│  └─────────────────────────────────────────────────────┘    │
│                                                             │
│  ┌─────────────────────────────────────────────────────┐    │
│  │ 🔍 Competitor Research                              │    │
│  │ Competitor data gathered by Content Researcher      │    │
│  │ 🔒 Privacy: local only — never sent to cloud        │    │
│  │                                                     │    │
│  │ 3 competitors · 18 pages · 210 chunks               │    │
│  │ Zapier · Make.com · n8n                             │    │
│  │                                                     │    │
│  │ [View Data →]  [Clear]                              │    │
│  └─────────────────────────────────────────────────────┘    │
│                                                             │
│  [Test Retrieval (Sandbox) →]                               │
└─────────────────────────────────────────────────────────────┘

Details:

  • Each dataset card shows: name, description, item count, chunk count, and status.
  • Ingestion progress bar appears on any dataset with files currently being indexed. Auto-updates via SSE without page refresh.
  • Published Content card has no upload button — it’s auto-populated. Shows counts and a “View Content” link.
  • Competitor Research card has a privacy badge (“local only”). The DM team can clear competitor data if needed.
  • “Test Retrieval” link at the bottom navigates to the sandbox.
  • Responsive: On mobile, dataset cards stack vertically. Progress bars and file names are truncated.
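The SSE-driven progress bar described above could be fed by a small reducer like the one below. This is a minimal sketch, not the platform's actual implementation: the spec does not define the event schema, so `IngestionProgressEvent` and its field names are assumptions made for illustration.

```typescript
// Hypothetical SSE payload: the spec does not define the event schema.
interface IngestionProgressEvent {
  fileId: string;
  processedChunks: number;
  totalChunks: number;
}

// Progress per file, keyed by file id.
type ProgressState = Record<string, { processed: number; total: number }>;

// Fold one SSE message body (a JSON string) into the current state.
function applyProgressEvent(state: ProgressState, rawData: string): ProgressState {
  const event = JSON.parse(rawData) as IngestionProgressEvent;
  const next: ProgressState = { ...state };
  next[event.fileId] = { processed: event.processedChunks, total: event.totalChunks };
  return next;
}

// Aggregate percentage for the dataset card's progress bar.
function datasetPercent(state: ProgressState): number {
  let processed = 0;
  let total = 0;
  for (const fileId of Object.keys(state)) {
    processed += state[fileId].processed;
    total += state[fileId].total;
  }
  return total === 0 ? 0 : Math.round((processed / total) * 100);
}
```

In the browser, a handler on the standard `EventSource` `onmessage` callback would pass `event.data` to `applyProgressEvent` and re-render the card with `datasetPercent`; the endpoint URL is not specified here.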

Screen KB2 — Dataset File Management (/settings/knowledge-base/[datasetId])

Purpose: View, upload, and manage all files in a specific dataset. Monitor ingestion status per file.

The screen has three tabs: Files, Sandbox, Configure.

Files tab (default)

┌─────────────────────────────────────────────────────────────┐
│  ← Knowledge Base / Client Documents                        │
│  [Files] [Sandbox] [Configure]                              │
├─────────────────────────────────────────────────────────────┤
│  [+ Upload Files]                                           │
│                                                             │
│  Name                  Size   Chunks  Status        Created │
│  ─────────────────────────────────────────────────────────  │
│  brand-guidelines.pdf  2.1MB  14      ✅ Indexed    Apr 1   │
│    ▸ 14 chunks active  [●] Enable  [Sandbox] [Delete]       │
│                                                             │
│  tone-of-voice.docx    0.4MB  6       ✅ Indexed    Apr 1   │
│    ▸ 6 chunks active   [●] Enable  [Sandbox] [Delete]       │
│                                                             │
│  q1-results.pdf        3.8MB  —       ⏳ Indexing   Apr 3   │
│    ████████████░░░░░░░░░ 55%  Embedding chunks…             │
│                                                             │
│  ┌─────────────────────────────────────────────────────────┐│
│  │ 📎 Drop files here or click to upload                   ││
│  │ PDF, DOCX, TXT, MD · max 10MB each · up to 32 files     ││
│  └─────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────┘

File row detail:

| Element | Description |
| --- | --- |
| Status icon | ✅ Indexed, ⏳ Indexing (with progress bar), ❌ Error, ⚫ Disabled |
| Chunk count | How many chunks are indexed in Qdrant for this file |
| Enable/Disable toggle | Flips the enabled field; disabled chunks are excluded from all searches without being deleted |
| Sandbox shortcut | Opens the sandbox pre-filtered to this file |
| Delete | Deletes the file + all its Qdrant vectors; requires confirmation |
| Progress bar | Shows during parsing + embedding (via SSE, updates live) |
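One way to honour the Enable/Disable toggle and the per-file Sandbox shortcut is to attach a payload filter to every search. The sketch below builds a filter in the `must` / `key` / `match.value` shape that Qdrant accepts; the payload field names (`dataset_id`, `enabled`, `file_id`) are assumptions, since the spec does not name them.

```typescript
// Minimal subset of Qdrant's filter shape (must + field match).
interface FieldCondition {
  key: string;
  match: { value: string | boolean };
}
interface ChunkFilter {
  must: FieldCondition[];
}

// Payload keys below are hypothetical; the spec only says an "enabled" field exists.
function buildChunkFilter(datasetId: string, fileId?: string): ChunkFilter {
  const must: FieldCondition[] = [
    { key: "dataset_id", match: { value: datasetId } },
    // Disabled chunks stay stored but are never returned by a search.
    { key: "enabled", match: { value: true } },
  ];
  if (fileId !== undefined) {
    // Sandbox shortcut: restrict results to a single file.
    must.push({ key: "file_id", match: { value: fileId } });
  }
  return { must };
}
```

Filtering at query time rather than deleting vectors is what lets a file be re-enabled instantly without re-ingestion.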

Upload Files flow:

  1. Drag-drop or click to select files.
  2. Upload File modal appears (Screen KB3).
  3. Files upload → rag_files records created with status: 'pending'.
  4. Ingestion jobs enqueued to BullMQ.
  5. File rows appear immediately with ⏳ Indexing and a live progress bar.
  6. Progress bar updates via SSE as the worker processes each chunk batch.
  7. Status becomes ✅ Indexed when complete.
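The flow above implies a small status lifecycle for each `rag_files` record. The following sketch models it as a transition table. Only the `'pending'` status is named in this spec; the other status values and the event names are assumptions chosen to match the status icons on the Files tab.

```typescript
// 'pending' comes from the spec; the remaining values are assumed.
type FileStatus = "pending" | "indexing" | "indexed" | "error";
// Hypothetical worker events.
type IngestionEvent = "job_started" | "job_completed" | "job_failed";

const TRANSITIONS: Record<FileStatus, Partial<Record<IngestionEvent, FileStatus>>> = {
  pending: { job_started: "indexing", job_failed: "error" },
  indexing: { job_completed: "indexed", job_failed: "error" },
  indexed: {},
  error: {},
};

// Returns the new status, or the current one if the event does not apply.
function nextStatus(current: FileStatus, event: IngestionEvent): FileStatus {
  const target = TRANSITIONS[current][event];
  return target === undefined ? current : target;
}

// Rows appear as "Indexing" immediately, so 'pending' shares that label.
function statusLabel(status: FileStatus): string {
  switch (status) {
    case "pending":
    case "indexing":
      return "⏳ Indexing";
    case "indexed":
      return "✅ Indexed";
    case "error":
      return "❌ Error";
  }
}
```

Mapping both `pending` and `indexing` to the same label is one way to reconcile step 3 (record created as pending) with step 5 (row shown as Indexing right away).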

Screen KB3 — Upload File Modal

Triggered by: “+ Upload Files” button on the Files tab.

┌──────────────────────────────────────────────────┐
│  Upload Files to Client Documents           [✕]  │
├──────────────────────────────────────────────────┤
│                                                  │
│  ┌────────────────────────────────────────────┐  │
│  │ 📎 Drop files here or click to browse      │  │
│  │                                            │  │
│  │ brand-guidelines.pdf ✅ 2.1 MB             │  │
│  │ tone-of-voice.docx   ✅ 0.4 MB             │  │
│  │ large-report.pdf     ❌ 14.2 MB — too large│  │
│  └────────────────────────────────────────────┘  │
│                                                  │
│  Supported: PDF, DOCX, TXT, MD                   │
│  Max 10 MB per file · Up to 32 files at once     │
│                                                  │
│  ──────────────────────────────────────────────  │
│  Parse on upload                           [ON]  │
│  Start indexing immediately after upload         │
│                                                  │
│  Parser engine                                   │
│  ● Built-in (fast, general purpose)              │
│  ○ Docling (layout-aware, best for PDFs)         │
│    ⚠ Requires Docling service to be configured   │
│                                                  │
│  [Cancel]                    [Upload 2 Files →]  │
└──────────────────────────────────────────────────┘

Fields:

| Field | Description |
| --- | --- |
| Drop zone | Multi-file. Shows file names + sizes. Red error for oversized / unsupported files. |
| Parse on upload | ON by default. If OFF, files upload but ingestion does not start; the tenant can trigger it later. |
| Parser engine | Built-in (Node.js pdf-parse / mammoth / fs.readFile). Docling only shown if DOCLING_URL is configured. |
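The drop-zone rules (PDF, DOCX, TXT, MD; max 10 MB per file; up to 32 files) can be checked client-side before any upload starts. The sketch below is one possible validator; the type names are illustrative, and treating 10 MB as 10 × 1024 × 1024 bytes is an assumption, since the spec does not say whether MB is binary or decimal.

```typescript
const MAX_FILE_BYTES = 10 * 1024 * 1024; // assumption: binary megabytes
const MAX_FILES = 32;
const ALLOWED_EXTENSIONS = ["pdf", "docx", "txt", "md"];

interface UploadCandidate {
  name: string;
  sizeBytes: number;
}
interface UploadVerdict {
  name: string;
  ok: boolean;
  reason?: string;
}

// Returns one verdict per selected file, in selection order.
function validateUploads(files: UploadCandidate[]): UploadVerdict[] {
  return files.map((file, index): UploadVerdict => {
    const dot = file.name.lastIndexOf(".");
    const ext = dot === -1 ? "" : file.name.slice(dot + 1).toLowerCase();
    if (ALLOWED_EXTENSIONS.indexOf(ext) === -1) {
      return { name: file.name, ok: false, reason: "unsupported type" };
    }
    if (file.sizeBytes > MAX_FILE_BYTES) {
      return { name: file.name, ok: false, reason: "too large" };
    }
    if (index >= MAX_FILES) {
      return { name: file.name, ok: false, reason: "too many files" };
    }
    return { name: file.name, ok: true };
  });
}
```

The count of verdicts with `ok: true` is what would drive the “Upload 2 Files →” button label in the wireframe, where the 14.2 MB file is rejected.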

Screen KB4 — Retrieval Sandbox (/settings/knowledge-base/[datasetId]/sandbox)

Also accessible from the overview as a global sandbox (queries all datasets).

Purpose: Test what the agent will retrieve before running campaigns. Adjust search parameters to optimise relevance.

┌─────────────────────────────────────────────────────────────┐
│  ← Client Documents / Sandbox                               │
│  [Files] [Sandbox] [Configure]                              │
├────────────────────────┬────────────────────────────────────┤
│                        │                                    │
│  Settings              │  Results                           │
│  ──────────────────    │  ──────────────────────────────    │
│  Dataset               │  Query:                            │
│  [Client Documents ▾]  │  What are our key product USPs?    │
│                        │  [Search]                          │
│  Search scope          │                                    │
│  ● This dataset only   │  4 results · 0.31s                 │
│  ○ All datasets        │  ──────────────────────────────    │
│                        │  ▸ Score 0.94 · brand-guidelines   │
│  TopK                  │    "Our four USPs: no-code setup,  │
│  [────●────] 5         │    AI-powered suggestions, 200+    │
│                        │    integrations, SOC2 certified…"  │
│  Vector weight         │                                    │
│  Keyword ●────── Vector│  ▸ Score 0.87 · pricing.html       │
│  [0.4]            [0.6]│    "Why choose us: Setup in 30     │
│                        │    minutes, no IT team needed…"    │
│  Similarity threshold  │                                    │
│  [────●────] 0.1       │  ▸ Score 0.81 · website_content    │
│                        │    "Compare: Acme vs Zapier — we   │
│  Reranker              │    offer 40% lower cost at scale"  │
│  [None ▾]              │                                    │
│                        │  ▸ Score 0.71 · tone-of-voice.docx │
│                        │    "Always lead with outcomes,     │
│  ────────────────────  │    not features. Speak to the      │
│  ⚙ Advanced            │    decision maker's time…"         │
│  Filter by source:     │                                    │
│  ☑ upload              │                                    │
│  ☑ website_crawl       │                                    │
│  ☑ published_content   │                                    │
│                        │                                    │
└────────────────────────┴────────────────────────────────────┘

Controls:

| Control | Description |
| --- | --- |
| Dataset | Switch between individual datasets or search all |
| Search scope | This dataset vs all tenant datasets |
| TopK slider | 1–20 results |
| Vector weight slider | 0.0 (keyword only) → 1.0 (vector only). Default 0.6 |
| Similarity threshold | Minimum score to include a result. Default 0.1 |
| Reranker | Optional cross-encoder model. None by default. |
| Source filter | Filter results by file source type |
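To make the interaction between these controls concrete, here is a sketch of how a result list could be produced from per-chunk keyword and vector scores. The linear weighted blend is an assumption (the spec only defines the slider's endpoints and defaults), and all type and field names are illustrative.

```typescript
interface ScoredCandidate {
  chunkId: string;
  source: string; // e.g. "upload", "website_crawl", "published_content"
  keywordScore: number; // assumed normalised to 0..1
  vectorScore: number; // assumed normalised to 0..1
}
interface SandboxSettings {
  topK: number;
  vectorWeight: number;
  similarityThreshold: number;
  sources: string[];
}
interface RankedResult {
  chunkId: string;
  source: string;
  score: number;
}

// Defaults taken from the controls table.
const DEFAULT_SETTINGS: SandboxSettings = {
  topK: 5,
  vectorWeight: 0.6,
  similarityThreshold: 0.1,
  sources: ["upload", "website_crawl", "published_content"],
};

function rankCandidates(candidates: ScoredCandidate[], settings: SandboxSettings): RankedResult[] {
  const w = Math.min(1, Math.max(0, settings.vectorWeight));
  return candidates
    .filter((c) => settings.sources.indexOf(c.source) !== -1) // source filter
    .map((c) => ({
      chunkId: c.chunkId,
      source: c.source,
      // Assumed fusion: 0.0 = keyword only, 1.0 = vector only.
      score: w * c.vectorScore + (1 - w) * c.keywordScore,
    }))
    .filter((r) => r.score >= settings.similarityThreshold) // similarity threshold
    .sort((a, b) => b.score - a.score)
    .slice(0, settings.topK); // TopK
}
```

With the defaults, a chunk scoring 0.9 on vector similarity and 1.0 on keyword match blends to roughly 0.94, while a chunk scoring 0.1 and 0.0 blends to 0.06 and is dropped by the threshold.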

Result card:

Each result shows:

  • Relevance score (0–1)
  • Source file name + dataset
  • Full chunk text (expandable if > 300 chars)
  • Chunk metadata (page number, section heading if available)

Responsive: On tablet/mobile, settings panel collapses to a “Settings” button that opens a sheet drawer. Results are shown full-width.


Screen KB5 — Dataset Configuration (/settings/knowledge-base/[datasetId]/configure)

Purpose: Adjust chunking strategy and parser settings for a dataset. These settings apply to all future uploads and re-ingestion.

┌─────────────────────────────────────────────────────────────┐
│  ← Client Documents / Configure                             │
│  [Files] [Sandbox] [Configure]                              │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Basic Details                                              │
│  ─────────────────────────────────────────────────────────  │
│  Name                                                       │
│  [Client Documents                                     ]    │
│                                                             │
│  Description                                                │
│  [Brand guides, product sheets, tone-of-voice docs     ]    │
│                                                             │
│  ─────────────────────────────────────────────────────────  │
│  Embedding Model                                            │
│  text-embedding-3-small (OpenAI)                            │
│  ⚠ Cannot be changed after dataset creation.                │
│    Changing the model invalidates all existing vectors.     │
│    Create a new dataset to use a different model.           │
│                                                             │
│  ─────────────────────────────────────────────────────────  │
│  Chunking                                                   │
│                                                             │
│  Parse type                                                 │
│  ● NAIVE — Fixed-size character chunking (recommended)      │
│  ○ MARKDOWN — Split on headings (best for .md / blogs)      │
│  ○ MANUAL — Split on --- delimiter                          │
│                                                             │
│  Chunk size (tokens)                                        │
│  [──────────────●─────] 512                                 │
│                                                             │
│  Chunk overlap (tokens)                                     │
│  [──────●─────────────] 64                                  │
│                                                             │
│  ─────────────────────────────────────────────────────────  │
│  Parser Engine                                              │
│  ● Built-in (fast, works for most documents)                │
│  ○ Docling (layout-aware PDF parsing — requires sidecar)    │
│                                                             │
│  ─────────────────────────────────────────────────────────  │
│  Danger Zone                                                │
│  [Re-index all files] — Re-runs ingestion with new settings │
│  [Delete dataset] — Deletes dataset + all vectors           │
└─────────────────────────────────────────────────────────────┘

Notes:

  • Embedding model is read-only — cannot be changed after creation. A warning banner explains why.
  • Re-index all files re-runs ingestion for all indexed files with the new chunk settings. Existing Qdrant vectors are deleted and regenerated. A confirmation dialog shows estimated time.
  • Delete dataset requires typing the dataset name to confirm.
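To illustrate how the chunk size and chunk overlap settings interact, the sketch below implements fixed-size chunking with overlap over a pre-tokenised input. It is an illustration only: the function name is hypothetical, and the spec does not say which tokenizer the ingestion worker uses, so the caller is assumed to supply the token array.

```typescript
// Fixed-size chunking with overlap. Defaults mirror the sliders (512 / 64).
function chunkTokens(tokens: string[], chunkSize: number = 512, overlap: number = 64): string[][] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("chunkSize must be > 0 and overlap must be in [0, chunkSize)");
  }
  const step = chunkSize - overlap; // each chunk starts this many tokens after the previous one
  const chunks: string[][] = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize));
    // Stop once a chunk has reached the end, so no tail-only duplicate is emitted.
    if (start + chunkSize >= tokens.length) break;
  }
  return chunks;
}
```

With the defaults, consecutive chunks start 448 tokens apart, so a 1,000-token document yields three chunks, and the last 64 tokens of each chunk reappear at the start of the next. Changing either slider changes every chunk boundary, which is why new settings only take effect on existing files after a re-index.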

Screen KB6 — Website Crawl Settings (modal)

Triggered by: “Re-crawl Now” or settings gear on the Website Content card.

┌──────────────────────────────────────────────────┐
│  Website Crawl Settings                     [✕]  │
├──────────────────────────────────────────────────┤
│                                                  │
│  Start URL                                       │
│  [https://acmecorp.com                        ]  │
│                                                  │
│  Crawl scope (URL path prefix)                   │
│  [Leave empty to crawl entire site            ]  │
│  e.g. /blog to crawl only the blog section       │
│                                                  │
│  Max pages                                       │
│  [──────────────●─────────] 200                  │
│                                                  │
│  Max depth                                       │
│  [──────────●────────────] 3                     │
│                                                  │
│  Schedule                                        │
│  ● Weekly (every Monday at 3am)                  │
│  ○ Monthly                                       │
│  ○ Manual only                                   │
│                                                  │
│  Previously crawled: 142 pages (Apr 1, 2026)     │
│  This crawl will replace all existing pages.     │
│                                                  │
│  [Cancel]                       [Start Crawl →]  │
└──────────────────────────────────────────────────┘

While a crawl is running, the Website Content card shows a live progress bar:

🌐 Website Content — Crawling…
████████████░░░░░░░░░░ 57%  114 / 200 pages
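The crawl settings translate naturally into a per-URL admission check. The sketch below shows one way a crawler could apply the start URL, path prefix, max depth, and max pages limits; the function and type names are hypothetical, and restricting the crawl to the start URL's origin is an assumption rather than something the spec states.

```typescript
interface CrawlSettings {
  startUrl: string;
  pathPrefix: string; // empty string = crawl the entire site
  maxPages: number;
  maxDepth: number;
}

// Decide whether a discovered link should be fetched.
// depth = number of link hops from the start URL; pagesSoFar = pages already crawled.
function shouldCrawl(url: string, depth: number, pagesSoFar: number, settings: CrawlSettings): boolean {
  const start = new URL(settings.startUrl);
  const target = new URL(url, settings.startUrl); // resolves relative links
  if (target.origin !== start.origin) return false; // assumed: stay on the same site
  if (settings.pathPrefix !== "" && target.pathname.indexOf(settings.pathPrefix) !== 0) {
    return false; // outside the configured scope
  }
  return depth <= settings.maxDepth && pagesSoFar < settings.maxPages;
}
```

For example, with start URL `https://acmecorp.com` and scope `/blog`, a link to `/blog/post-1` is admitted while `/pricing` and links to other domains are skipped.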

RAG Usage in Existing Screens

Campaign Detail — Activity tab (Dashboard D4)

When an agent activity used rag_search, the activity card shows a collapsible “Retrieved context” section:

🤖 Copywriter — Writing blog post: "Why Local SEO Matters"
Status: ✅ Completed · Cost: $0.012 · 3m 40s
▸ Retrieved context (3 queries)
  Query 1: "local SEO for small businesses client services"
    Dataset: client_docs · 3 chunks · Score: 0.91, 0.87, 0.82
    ↳ brand-guidelines.pdf — "Our primary audience is local businesses…"
  Query 2: "past blog posts about SEO"
    Dataset: published_content · 3 chunks · Score: 0.89, 0.85, 0.78
    ↳ "How to Rank on Google in 2026" — "When writing about SEO topics…"
  Query 3: "product features for SME clients"
    Dataset: website_content · 2 chunks · Score: 0.93, 0.76
    ↳ pricing.html — "Starter plan: ideal for businesses under 50 employees…"

This gives the DM reviewer full visibility into the context the agent used, so they can flag poor RAG results (which would indicate that a re-crawl or re-index is needed).
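A possible data shape behind the “Retrieved context” section is sketched below, together with a formatter for the per-query summary line and a simple check a reviewer tool could use to highlight weak retrievals. All type names, field names, and the threshold parameter are assumptions; the spec does not define how `rag_search` results are stored on an activity.

```typescript
interface RetrievedChunk {
  source: string; // file name or page title
  score: number; // relevance score, 0..1
  excerpt: string;
}
interface RetrievedQuery {
  query: string;
  dataset: string;
  chunks: RetrievedChunk[];
}

// Builds the summary line shown under each query in the activity card.
function summariseQuery(q: RetrievedQuery): string {
  const scores = q.chunks.map((c) => c.score.toFixed(2)).join(", ");
  return "Dataset: " + q.dataset + " · " + q.chunks.length + " chunks · Score: " + scores;
}

// True if any query's best chunk scores below the given (hypothetical) threshold.
function hasPoorResults(queries: RetrievedQuery[], minTopScore: number): boolean {
  return queries.some((q) => {
    const best = q.chunks.reduce((max, c) => Math.max(max, c.score), 0);
    return best < minTopScore;
  });
}
```

A query that returned no chunks has a best score of 0 in this sketch, so it is always reported as poor for any positive threshold.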

Activity Detail — DM Portal (P3)

Same “Retrieved context” section shown in the right metadata panel alongside cost, tokens, and model info.


Manage App — Tenant Knowledge Base Tab (M3 update)

The Tenant Detail screen (/tenants/[id]) adds a Knowledge Base tab alongside Overview, Config, Users, Agents, Billing:

Knowledge Base
──────────────────────────────────────────────────────────────
Dataset              Files  Chunks  Embedding Model     Status
──────────────────────────────────────────────────────────────
Client Documents     3      29      text-embed-3-small  ✅
Website Content      142p   1,840   text-embed-3-small  ✅ (weekly crawl)
Published Content    24     312     text-embed-3-small  ✅ (auto)
Competitor Research  18     210     nomic-embed-text    ✅ (local)
──────────────────────────────────────────────────────────────
Total vectors in Qdrant: 2,391
Qdrant collection size:  4.2 MB

Super admins can trigger a re-index or clear a dataset on behalf of a tenant (with audit log entry).


Add Knowledge Base to the Dashboard Settings sub-nav:

Settings
├── General            /settings
├── Knowledge Base     /settings/knowledge-base    ← NEW
├── Channels           /settings/channels          (was /channels)
├── Integrations       /settings/integrations
├── Skills             /settings/skills
├── Recurring Tasks    /settings/recurring-tasks
├── Audit Log          /settings/audit-log
└── Billing            /settings/billing

Screens Reference Index

| Screen | Route | App |
| --- | --- | --- |
| KB1 — Overview | /settings/knowledge-base | Dashboard |
| KB2 — Files tab | /settings/knowledge-base/[datasetId] | Dashboard |
| KB3 — Upload modal | (modal on KB2) | Dashboard |
| KB4 — Sandbox tab | /settings/knowledge-base/[datasetId]/sandbox | Dashboard |
| KB5 — Configure tab | /settings/knowledge-base/[datasetId]/configure | Dashboard |
| KB6 — Crawl settings | (modal on KB1 / KB2) | Dashboard |

© 2026 Leadmetrics — Internal use only