Gap 2: RAG Retrieval is Relevance-Only
Problem
The RAG pipeline scores retrieved chunks purely by cosine similarity (relevance). There is no weighting for how recent the source is or how important it is relative to other content in the knowledge base.
From the Generative Agents paper (Park et al., 2023), a three-factor retrieval score produces significantly better context selection:
```
score = α·relevance + β·recency + γ·importance
```

Current behaviour
A stale competitor analysis from six months ago scores identically to one written last week. A core brand positioning document scores the same as an incidental blog post. A RAG chunk that was previously cited in a high-scoring output carries no advantage over one that has never been used.
The result is that prompts can be populated with outdated or low-value context despite better alternatives existing in the vector store.
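To make the effect concrete, here is a hypothetical comparison using the weights proposed below (α=0.5, β=0.3, γ=0.2) and a 30-day recency half-life; the numbers are illustrative, not taken from the codebase:

```typescript
// Worked example of the three-factor score (illustrative numbers).
// Assumed weights: α=0.5 relevance, β=0.3 recency, γ=0.2 importance.
function score(relevance: number, ageDays: number, importance: number): number {
  const recency = Math.exp((-Math.LN2 * ageDays) / 30); // halves every 30 days
  return 0.5 * relevance + 0.3 * recency + 0.2 * importance;
}

// Six-month-old competitor analysis, slightly more relevant to the query:
const stale = score(0.85, 180, 0.7); // ≈ 0.57
// Week-old competitor analysis:
const fresh = score(0.80, 7, 0.7);   // ≈ 0.80
```

Under relevance-only scoring the stale document wins (0.85 vs 0.80); with recency and importance factored in, the fresher document outranks it.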
What to Build
1. Add metadata fields to indexed documents
In the search-indexer worker, when documents are synced to Typesense, include:
```ts
{
  // existing fields
  text: string,
  tenantId: string,
  datasetName: string,
  source: string,
  // new fields
  createdAt: number,       // unix timestamp — enables recency scoring
  importanceScore: number  // 0.0–1.0 — set at index time based on document type
}
```

Importance scores at index time (heuristic, no LLM call needed):
| Document type | Importance |
|---|---|
| Brand voice document | 1.0 |
| Client context file | 0.9 |
| Published landing page | 0.8 |
| Competitor analysis | 0.7 |
| Blog post (published) | 0.6 |
| Generic knowledge doc | 0.5 |
| Web crawl page | 0.4 |
2. Implement weighted scoring in search()
In `packages/feature-search/src/search.ts`, apply the formula after retrieving Typesense results:
```ts
function weightedScore(
  relevance: number,       // from Typesense text_match score, normalised 0–1
  createdAt: number,       // unix timestamp
  importanceScore: number, // 0.0–1.0, set at index time
  now = Date.now()
): number {
  const ageHours = (now - createdAt) / (1000 * 60 * 60);
  // exponential decay with a 30-day half-life
  // (the ln 2 factor makes the score actually halve every 30 days)
  const recency = Math.exp((-Math.LN2 * ageHours) / (30 * 24));
  const α = 0.5;
  const β = 0.3;
  const γ = 0.2;
  return α * relevance + β * recency + γ * importanceScore;
}
```

Re-rank results by `weightedScore` before returning `topK`.
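The re-rank itself is a sort-and-slice. A sketch, assuming each hit has already been annotated with its weighted score (the `ScoredHit` shape here is a simplification of the real result type):

```typescript
// Hypothetical re-rank step: order hits by weighted score, keep the top K.
interface ScoredHit {
  text: string;
  weightedScore: number;
}

function rerank(hits: ScoredHit[], topK: number): ScoredHit[] {
  return [...hits] // copy so the caller's array is not mutated
    .sort((a, b) => b.weightedScore - a.weightedScore)
    .slice(0, topK);
}
```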
3. Add a quality threshold filter
Currently all topK results are returned regardless of score. Add a minimum threshold:
```ts
const MIN_RELEVANCE_THRESHOLD = 0.3;

results = results.filter(r => r.weightedScore >= MIN_RELEVANCE_THRESHOLD);
```

This prevents low-quality chunks from consuming prompt token budget.
4. Track citation rate per chunk (long-term)
After a run completes, if the output text contains a substring that closely matches a retrieved chunk, mark that chunk as `cited: true` in a `RagCitation` log table. Over time, chunks with high citation rates get a boosted importance score on re-index.
This is the compound flywheel: better retrieval → better output → better citation signal → better retrieval.
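A naive version of the citation check could be a normalised-prefix match; this is a sketch only, and a production matcher would more likely use n-gram or fuzzy overlap:

```typescript
// Naive citation detector (assumption: a chunk counts as cited when a
// whitespace-normalised prefix of it appears verbatim in the output).
function normalise(s: string): string {
  return s.toLowerCase().replace(/\s+/g, " ").trim();
}

function wasCited(output: string, chunk: string, probeLen = 40): boolean {
  const probe = normalise(chunk).slice(0, probeLen);
  // Skip chunks too short to give a meaningful probe.
  return probe.length >= probeLen && normalise(output).includes(probe);
}
```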
Files to Change
- `apps/servers/search-indexer/src/workers/` — add `createdAt` and `importanceScore` to all 13 collection fetchers
- `packages/feature-search/src/search.ts` — implement weighted scoring and threshold filter
- `packages/db/prisma/schema.prisma` — add optional `RagCitation` model for citation tracking (phase 2)
Related
- Gap 9: Episodic memory (both improve what context agents receive)
- Gap 8: Context window management (filtering low-score chunks reduces prompt size)