Qdrant
Category: Vector Search
Integration type: Platform-level (sidecar service in Docker Compose)
External SDK: @qdrant/js-client-rest
Purpose
Qdrant is the vector database powering the RAG (Retrieval-Augmented Generation) system. It stores dense vector embeddings of client knowledge base documents and enables semantic similarity search. Agents query Qdrant to fetch relevant context before generating content.
Every tenant’s knowledge base is isolated within Qdrant using collection-per-tenant naming: rag_{tenantId}.
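As an illustration, the naming convention can be centralized in a small helper. This is a sketch; the helper name and the validation rule are assumptions, not existing code:

```typescript
// Hypothetical helper centralizing the collection-per-tenant convention.
// The prefix mirrors QDRANT_COLLECTION_PREFIX (default: rag_).
const COLLECTION_PREFIX = 'rag_';

export function tenantCollectionName(tenantId: string): string {
  // Guard against IDs that could collide or produce an invalid collection name.
  if (!/^[A-Za-z0-9_-]+$/.test(tenantId)) {
    throw new Error(`Invalid tenantId: ${tenantId}`);
  }
  return `${COLLECTION_PREFIX}${tenantId}`;
}
```

Centralizing the name derivation keeps ingestion, search, and deletion paths from drifting apart on the tenant-isolation boundary.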
Config Structure
Platform config (env vars)
QDRANT_URL=http://qdrant:6333 # Docker service name in Compose network
QDRANT_API_KEY= # Optional — leave empty for local Docker
QDRANT_COLLECTION_PREFIX=rag_ # Tenant collections named rag_{tenantId}
EMBEDDING_MODEL=text-embedding-3-small # OpenAI embedding model used for query vectors
EMBEDDING_DIMENSIONS=1536 # Must match collection vector size
Integration Pattern
Qdrant client (packages/rag/src/qdrant-client.ts)
import { QdrantClient } from '@qdrant/js-client-rest';

const client = new QdrantClient({
  url: config.QDRANT_URL,
  apiKey: config.QDRANT_API_KEY || undefined,
});
Collection management
Each tenant’s collection is created during Knowledge Base setup:
async function createTenantCollection(tenantId: string): Promise<void> {
  const collectionName = `rag_${tenantId}`;
  await client.createCollection(collectionName, {
    vectors: {
      size: 1536, // text-embedding-3-small dimensions
      distance: 'Cosine',
    },
    optimizers_config: {
      default_segment_number: 2,
    },
    replication_factor: 1,
  });
  // Create payload indexes for filtering
  await client.createPayloadIndex(collectionName, {
    field_name: 'dataset_id',
    field_schema: 'keyword',
  });
  await client.createPayloadIndex(collectionName, {
    field_name: 'source_type',
    field_schema: 'keyword',
  });
}
Ingestion (upsert)
Documents are chunked and embedded during RAG ingestion, then upserted into Qdrant:
async function upsertChunks(
  tenantId: string,
  datasetId: string,
  chunks: {
    id: string; // UUID — stable for idempotent re-ingestion
    text: string;
    embedding: number[]; // 1536-dimensional vector from OpenAI
    metadata: Record<string, string>;
  }[],
): Promise<void> {
  const collectionName = `rag_${tenantId}`;
  await client.upsert(collectionName, {
    wait: true,
    points: chunks.map(chunk => ({
      id: chunk.id,
      vector: chunk.embedding,
      payload: {
        text: chunk.text,
        dataset_id: datasetId,
        tenant_id: tenantId,
        source_type: chunk.metadata.sourceType,
        source_id: chunk.metadata.sourceId,
        ...chunk.metadata,
      },
    })),
  });
}
Search (hybrid)
The agent's rag_search tool performs semantic similarity search, optionally filtered by dataset:
async function search(
  tenantId: string,
  query: string, // The query text — embedded before search
  options: {
    datasetIds?: string[]; // Filter to specific datasets; omit for all
    limit?: number; // Default: 5
    scoreThreshold?: number; // Minimum similarity score (0–1); default 0.7
  } = {},
): Promise<RagSearchResult[]> {
  // Embed the query
  const embedding = await openai.embeddings.create({
    model: config.EMBEDDING_MODEL,
    input: query,
  });
  const queryVector = embedding.data[0].embedding;

  const collectionName = `rag_${tenantId}`;
  const filter = options.datasetIds?.length
    ? {
        must: [{
          key: 'dataset_id',
          match: { any: options.datasetIds },
        }],
      }
    : undefined;

  const results = await client.search(collectionName, {
    vector: queryVector,
    filter,
    limit: options.limit ?? 5,
    score_threshold: options.scoreThreshold ?? 0.7,
    with_payload: true,
  });

  return results.map(r => ({
    id: String(r.id),
    score: r.score,
    text: r.payload?.text as string,
    dataset: r.payload?.dataset_id as string,
    source: r.payload?.source_type as string,
  }));
}
Docker Compose Setup
# docker-compose.yml
services:
  qdrant:
    image: qdrant/qdrant:v1.9.0
    volumes:
      - qdrant_data:/qdrant/storage
    ports:
      - "6333:6333" # REST API
      - "6334:6334" # gRPC API
    environment:
      QDRANT__SERVICE__HTTP_PORT: 6333
volumes:
  qdrant_data:
Test Cases
Unit tests (packages/rag/src/qdrant-client.test.ts)
| Test | Approach |
|---|---|
| createTenantCollection() creates collection with correct vector size | Mock client.createCollection; assert size: 1536, distance: 'Cosine' |
| upsertChunks() maps chunks to Qdrant points | Mock client.upsert; assert id, vector, payload |
| search() embeds query and calls client.search | Mock OpenAI embedding; mock client.search; assert query vector passed |
| search() filters by datasetIds when provided | Assert filter.must contains dataset_id match |
| search() returns empty array when no results above threshold | Mock results: []; assert [] returned |
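A cheap way to cover the point-mapping case is to extract the chunk-to-point conversion into a pure function, so it can be asserted without mocking the Qdrant client. This is a hypothetical refactor of upsertChunks, not existing code:

```typescript
// Hypothetical extraction of the point-mapping logic from upsertChunks.
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
  metadata: Record<string, string>;
}

export function chunkToPoint(tenantId: string, datasetId: string, chunk: Chunk) {
  return {
    id: chunk.id,
    vector: chunk.embedding,
    payload: {
      text: chunk.text,
      dataset_id: datasetId,
      tenant_id: tenantId,
      source_type: chunk.metadata.sourceType,
      source_id: chunk.metadata.sourceId,
      ...chunk.metadata,
    },
  };
}
```

With the mapping isolated, the client mock in the upsert test only needs to verify that the mapped points are passed through.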
Integration tests
| Test | Approach |
|---|---|
| Create collection, upsert chunks, search | Start Qdrant in Docker; create test collection; upsert 3 chunks; search; assert top result |
| Dataset filter narrows results | Upsert to two datasets; search with one dataset filter; assert only that dataset returned |
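Similarly, the dataset-filter assertions in both tables become trivial if the filter construction inside search() is pulled into a pure function. A sketch under that assumption (the function name is hypothetical):

```typescript
// Hypothetical extraction of the Qdrant filter built inside search().
// Returns undefined when no dataset restriction applies.
export function buildDatasetFilter(datasetIds?: string[]) {
  if (!datasetIds?.length) return undefined;
  return {
    must: [{
      key: 'dataset_id',
      match: { any: datasetIds },
    }],
  };
}
```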
Related
- RAG Integration — how RAG fits into agents
- RAG Architecture — full ingestion pipeline, schema, hybrid search
- OpenAI Provider — embedding model used for vector generation