Topic Researcher

[Live] · agent__topic-researcher · gemma3:4b (Ollama/local)

Generates 10–15 blog topic ideas for a client’s industry and niche, each with a working title, angle, relevance rationale, search intent classification, and research difficulty rating — avoiding topics covered in the last 90 days.

Overview


Function	Generate a ranked list of fresh blog topic ideas for content planning
Type	Worker — Research
Model	gemma3:4b (Ollama/local)
Queue	`agent__topic-researcher`
Concurrency	2
Timeout	8 min
Est. cost / task	~$0 (Ollama/local)
Plan	Agency+ (requires Ollama configured)

Input


interface TopicResearcherInput {
  tenantId:    string;
  campaignId:  string;
 
  // Client content context
  industry:          string;    // e.g. "B2B SaaS", "Home Services", "E-commerce"
  niche:             string;    // more specific, e.g. "project management software for construction"
  targetAudience:    string;    // e.g. "marketing managers at mid-market B2B companies"
  contentGoals:      string[];  // e.g. ["drive organic traffic", "generate leads", "build authority"]
 
  // Recent history — the agent avoids duplicating these
  recentTopics: {
    title:       string;
    publishedAt: string;   // ISO date
  }[];
 
  // Temporal context for trend relevance
  currentMonth: string;   // e.g. "April 2026"
  currentSeason: 'spring' | 'summer' | 'autumn' | 'winter';
  hemisphere:   'northern' | 'southern';
 
  // Optional constraints
  preferredAngles?: string[];   // e.g. ["how-to", "listicle", "case study", "opinion"]
  excludeTopics?:   string[];   // topics the client has explicitly asked to avoid
  topicCount?:      number;     // default 12; range 10–15
}
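As a rough illustration of the `topicCount` default and the 90-day avoidance window, a caller might normalise the input along these lines. `resolveTopicCount` and `withinLastDays` are hypothetical helpers, not part of the agent. Clamping out-of-range values (rather than rejecting them) and keeping entries with unparseable dates are both assumptions.

```typescript
// Hypothetical helpers, not part of the agent's published interface.
const DEFAULT_TOPIC_COUNT = 12;
const MIN_TOPIC_COUNT = 10;
const MAX_TOPIC_COUNT = 15;

// Applies the documented default (12) and range (10 to 15).
// Clamping is an assumption; the real agent may reject out-of-range values.
function resolveTopicCount(requested?: number): number {
  if (requested === undefined || !Number.isFinite(requested)) {
    return DEFAULT_TOPIC_COUNT;
  }
  return Math.min(MAX_TOPIC_COUNT, Math.max(MIN_TOPIC_COUNT, Math.round(requested)));
}

interface RecentTopic {
  title: string;
  publishedAt: string; // ISO date
}

// Keeps topics published within the last `days` days before `now`.
// Entries whose date cannot be parsed are kept, to err on the side of
// avoiding a repeat (assumption).
function withinLastDays(topics: RecentTopic[], now: Date, days = 90): RecentTopic[] {
  const cutoff = now.getTime() - days * 24 * 60 * 60 * 1000;
  return topics.filter((t) => {
    const ts = Date.parse(t.publishedAt);
    if (Number.isNaN(ts)) return true;
    return ts >= cutoff && ts <= now.getTime();
  });
}
```

Passing `now` explicitly, instead of reading the clock inside the helper, keeps the window calculation easy to test.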

Output


interface TopicResearcherOutput {
  generatedAt:  string;
  topicCount:   number;
  topics:       TopicIdea[];
}
 
interface TopicIdea {
  rank:          number;         // 1 = highest priority recommendation
  workingTitle:  string;         // specific, publication-ready working title
  angle:         string;         // the unique hook or perspective — what makes this different
  relevanceNow:  string;         // why this topic is worth covering now (trend/seasonal/evergreen)
  searchIntent:  'informational' | 'commercial' | 'navigational' | 'transactional';
  evergreen:     boolean;        // true if topic has long-term relevance regardless of timing
  difficulty:    'easy' | 'medium' | 'hard';   // how much research/expertise is required
  difficultyRationale: string;   // one sentence explaining the difficulty rating
  suggestedFormat:  'how-to guide' | 'listicle' | 'case study' | 'opinion/thought leadership' | 'comparison' | 'interview' | 'news analysis';
  estimatedWordCount: number;    // suggested target word count for the format
  competitorGap:  boolean;       // true if RAG found this topic absent from competitor content
  notes:          string;        // anything the content team should know before committing to this topic
}

Sample output excerpt


## Topic Ideas — April 2026 | B2B SaaS / Project Management
 
---
 
### 1. "Why Your Project Management Software Is Making Your Team Slower (And How to Fix It)"
**Angle:** Contrarian take — most software adoption content is positive. This piece addresses the
friction points that arise after 6 months of use, speaking directly to the "we bought the tool but
adoption is struggling" pain point.
**Relevant now:** Evergreen, but spring is a common time for quarterly tool reviews and process resets.
**Search intent:** Informational
**Format:** How-to guide with problem/solution structure | ~2,000 words
**Difficulty:** Medium — requires specific examples of friction points; can be supported with user
research data or survey stats. No proprietary knowledge required.
**Competitor gap:** Yes — top 3 ranking articles are all vendor-produced and promotional in tone.
A neutral, critical angle creates clear differentiation.
 
---
 
### 2. "Construction Project Management Software Comparison: 2026 Buyer's Guide"
**Angle:** Segment-specific comparison targeting the construction niche rather than generic PM software.
**Relevant now:** Commercial intent; buyers research this year-round but activity increases in Q2
when new construction seasons begin.
**Search intent:** Commercial
**Format:** Comparison / buyer's guide | ~3,500 words
**Difficulty:** Hard — requires hands-on familiarity with 4–5 competing products and current pricing.
Requires significant research time and regular updates as products evolve.
**Competitor gap:** No — two direct competitors have comprehensive comparison pages. Needs a
differentiated angle (e.g. focus on integrations with construction-specific tools like Procore).

How It Works

1. Load client context. The Client Context File is injected into the system prompt with full brand, audience, product, and competitor context. The `niche` and `targetAudience` from the input narrow the topic search space.
2. RAG: avoid recent topics. Query Published Content for blog post titles and topics published in the last 90 days. These are merged with the `recentTopics` input to build a combined avoid-list. A topic counts as a repeat if its title has a Jaccard similarity above 0.75 (computed on title tokens) with any avoid-list title, or if its core subject is the same.
3. RAG: identify competitor content gaps. Query Competitor Research for the topics and keywords competitors are covering. Topics that competitors rank for but the client has no content on are flagged as `competitorGap: true` and treated as high-priority opportunities.
4. Web search: current trends. Call `web_search` with 2–3 queries targeting trending topics in the client’s niche. Queries are structured to surface trending questions, seasonal topics, and industry news from the last 30 days. Example queries: "[niche] trends 2026", "[niche] questions people are asking", "[industry] [season] topics blog ideas".
5. Synthesise topic list. Combine: (a) evergreen topics with strong search intent that the client hasn’t covered, (b) trending/seasonal topics from web search, (c) competitor gap topics from RAG. Remove any topic that overlaps with the recent-topics avoid-list.
6. Score and rank. Rank topics by: (1) competitor gap + search intent alignment, (2) trending relevance to current month/season, (3) alignment with stated content goals, (4) research difficulty (easy topics ranked higher if content velocity is the goal). Output the top `topicCount` results.
7. Write output. For each topic, produce all fields in the `TopicIdea` schema. Working titles should be specific and publication-ready, not vague placeholders. The `notes` field captures anything the content team needs to know before committing.
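The four ranking criteria in the "Score and rank" step could be approximated with a simple additive score. This is only a sketch: the `Candidate` shape, the signal fields, and every weight below are assumptions made for illustration, since the agent's actual weighting is not documented here.

```typescript
type Difficulty = 'easy' | 'medium' | 'hard';

// Hypothetical candidate shape: the signals are assumed to be computed
// upstream from the RAG and web search results.
interface Candidate {
  workingTitle: string;
  competitorGap: boolean;
  intentAligned: boolean; // search intent matches the content goals
  timely: boolean;        // relevant to the current month or season
  goalMatches: number;    // number of stated content goals the topic serves
  difficulty: Difficulty;
}

// Illustrative bonus, applied only when content velocity is the goal.
const EASE_BONUS: Record<Difficulty, number> = { easy: 2, medium: 1, hard: 0 };

function scoreCandidate(c: Candidate, preferEasy: boolean): number {
  let score = 0;
  if (c.competitorGap) score += 4;     // criterion 1: competitor gap
  if (c.intentAligned) score += 3;     // criterion 1: search intent alignment
  if (c.timely) score += 2;            // criterion 2: trending relevance
  score += Math.min(c.goalMatches, 2); // criterion 3: content goals (capped)
  if (preferEasy) score += EASE_BONUS[c.difficulty]; // criterion 4
  return score;
}

// Sorts by score (highest first), keeps the top `topicCount`, assigns ranks from 1.
function rankCandidates(
  candidates: Candidate[],
  topicCount: number,
  preferEasy = false,
): Array<Candidate & { rank: number }> {
  return candidates
    .map((c) => ({ c, score: scoreCandidate(c, preferEasy) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topicCount)
    .map(({ c }, i) => ({ ...c, rank: i + 1 }));
}
```

An additive score keeps the trade-offs visible in one place; a strict lexicographic ordering of the four criteria would be an equally plausible reading of the step.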

System Prompt


You are a content strategist generating blog topic ideas for a digital marketing agency client.
Your job is to produce a ranked list of specific, compelling blog topic ideas that are:
- Not already covered in the client's recent content
- Aligned with the client's audience and content goals
- Timely, trending, or evergreen with strong search potential
- Differentiated from what competitors are already covering

CLIENT CONTEXT:
{{CLIENT_CONTEXT}}

KNOWLEDGE BASE CONTEXT:
{{RAG_CONTEXT}}

You have been provided with:
- The client's industry, niche, and target audience
- A list of recently published blog topics to avoid repeating
- Current month and season for temporal relevance
- Web search results showing current trends in the niche
- Competitor content gaps identified from the knowledge base

Generate {{TOPIC_COUNT}} topic ideas. For each topic:
1. Write a specific, compelling working title — not a vague description
2. Identify the unique angle or hook that makes this piece worth reading
3. Explain why it's relevant now (seasonal, trending, or why it's a strong evergreen pick)
4. Classify the search intent: informational, commercial, navigational, or transactional
5. Rate research difficulty: easy (publicly available info), medium (requires synthesis),
   hard (requires expert interviews, proprietary data, or deep expertise)
6. Flag if this is a competitor gap — a topic competitors rank for that the client lacks

Rules:
- No topic should duplicate or closely resemble a topic in the recent-topics list
- Working titles must be specific enough to brief a writer immediately
- Rank topics with competitor gaps and commercial intent higher
- Include at least 2 evergreen topics and at least 2 timely/seasonal topics
- Do not suggest topics outside the client's stated niche or audience

Output as valid JSON matching the TopicResearcherOutput schema.
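The prompt uses `{{CLIENT_CONTEXT}}`, `{{RAG_CONTEXT}}`, and `{{TOPIC_COUNT}}` placeholders. One way they might be filled before the call to the model is sketched below. `renderPrompt` is a hypothetical helper, and the client text in the usage example is made up for illustration.

```typescript
// Replaces every {{PLACEHOLDER}} in the template with the matching value.
// Throws on a missing value so an unfilled placeholder never reaches the model.
function renderPrompt(template: string, values: Record<string, string>): string {
  return template.replace(/\{\{([A-Z_]+)\}\}/g, (placeholder: string, key: string) => {
    const value = values[key];
    if (value === undefined) {
      throw new Error(`No value supplied for ${placeholder}`);
    }
    return value;
  });
}

// Usage with a shortened template and made-up client context.
const rendered = renderPrompt(
  'CLIENT CONTEXT:\n{{CLIENT_CONTEXT}}\n\nGenerate {{TOPIC_COUNT}} topic ideas.',
  { CLIENT_CONTEXT: 'Acme builds scheduling software.', TOPIC_COUNT: '12' },
);
console.log(rendered);
```

Failing loudly on a missing value is a design choice in this sketch; silently leaving the placeholder in place would be harder to notice in the generated output.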

Skills Injected

Skill file	Purpose
`client-context-file.md`	Company, brand, audience — always injected
`content-topic-framework.md`	Framework for evaluating topic quality, search intent classification guide, difficulty rating rubric

`content-topic-framework.md` — content


# Content Topic Framework
 
## What Makes a Good Blog Topic
 
A good blog topic has at least two of the following:
1. **Search demand** — people are actively searching for this information
2. **Business relevance** — it attracts the target audience at the right stage of their journey
3. **Differentiation** — the client can cover it better or differently than what already ranks
4. **Timeliness** — it's trending now, or seasonal demand is peaking
 
## Search Intent Classification
 
**Informational:** The reader wants to learn something. Often starts with "how", "what", "why",
"guide", "tips". Best for top-of-funnel traffic. Examples: "How to set up Google Analytics 4",
"What is ROAS?", "5 signs your email list is unhealthy".
 
**Commercial:** The reader is evaluating options before a purchase decision. Often includes
"best", "vs", "review", "comparison", "alternatives". Best for mid-funnel. Examples: "Best
project management tools for small teams", "HubSpot vs Salesforce for SMBs".
 
**Transactional:** The reader is ready to take action. Often includes "buy", "pricing", "free
trial", "hire", "agency". Best for bottom-of-funnel. Examples: "Google Ads agency pricing",
"hire a content writer".
 
**Navigational:** The reader is looking for a specific site or resource. Rarely a content
opportunity unless the brand name is being searched.
 
## Difficulty Rating Rubric
 
**Easy:**
- Topic can be covered with publicly available information
- No expert interviews or proprietary data required
- A competent writer can complete it with 1–2 hours of research
- Examples: listicles, how-to guides on well-documented topics
 
**Medium:**
- Topic requires synthesis of multiple sources or some domain expertise
- May benefit from one expert quote or a specific data source
- Examples: comparison pieces, opinion/thought leadership, trend analysis
 
**Hard:**
- Topic requires deep domain expertise, original research, or proprietary data
- Expert interviews strongly recommended
- Risk of thin or low-quality output without adequate research investment
- Examples: technical deep-dives, original survey-based reports, case studies requiring
  client permission and data access
 
## Avoiding Topic Overlap
 
When checking for overlap with recent topics:
- Same core subject = overlap (avoid)
- Same title format with different subject = acceptable
- Same angle on a different audience segment = acceptable if the difference is meaningful
- Refreshing a topic published 12+ months ago = acceptable (update post, not new post)
 
## Format × Word Count Guide
| Format | Target Word Count |
|---|---|
| How-to guide | 1,500–2,500 |
| Listicle (5–10 items) | 1,200–2,000 |
| Comparison / buyer's guide | 2,500–4,000 |
| Case study | 1,000–1,800 |
| Opinion / thought leadership | 800–1,500 |
| News analysis | 600–1,200 |
| Interview | 1,200–2,000 |
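The Format × Word Count Guide above could be checked against `estimatedWordCount` after generation. The sketch below is not part of the skill file. The mapping from table rows to `suggestedFormat` values (for example "Comparison / buyer's guide" to `'comparison'`) is an assumption, and both helper names are hypothetical.

```typescript
type SuggestedFormat =
  | 'how-to guide' | 'listicle' | 'case study' | 'opinion/thought leadership'
  | 'comparison' | 'interview' | 'news analysis';

// [min, max] target word counts, copied from the guide table.
const WORD_COUNT_GUIDE: Record<SuggestedFormat, [number, number]> = {
  'how-to guide': [1500, 2500],
  'listicle': [1200, 2000],
  'comparison': [2500, 4000],
  'case study': [1000, 1800],
  'opinion/thought leadership': [800, 1500],
  'news analysis': [600, 1200],
  'interview': [1200, 2000],
};

// True when the estimate falls inside the guide range for its format.
function isWordCountInRange(format: SuggestedFormat, words: number): boolean {
  const [min, max] = WORD_COUNT_GUIDE[format];
  return words >= min && words <= max;
}

// Pulls an out-of-range estimate back to the nearest bound of the guide range.
function clampWordCount(format: SuggestedFormat, words: number): number {
  const [min, max] = WORD_COUNT_GUIDE[format];
  return Math.min(max, Math.max(min, words));
}
```

Both estimates in the sample output excerpt (about 2,000 words for the how-to guide and about 3,500 for the comparison) fall inside these ranges.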

RAG Usage

Dataset	Query	When
Published Content	`"blog posts published last 90 days topics titles"`	Step 2 — to build the topic avoid-list
Competitor Research	`"competitor blog topics keywords [industry] [niche]"`	Step 3 — to identify content gaps vs. competitors
Client Documents	Not typically queried	Client context arrives via the Client Context File skill
Website Content	Not typically queried

RAG query strategy: Run both queries in parallel before the web search step. Published Content avoids wasted effort on duplicate topics. Competitor Research informs the ranking — competitor gap topics are prioritised because they represent provable demand the client can capture. The web search step runs after RAG so that search queries can be refined by what’s already covered.
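The ordering described above (both RAG queries in parallel, then web search) might look roughly like this. The `ragSearch` and `webSearch` function signatures are assumptions standing in for the `rag_search` and `web_search` tools, whose real call shapes are not shown in this document. How the web queries are refined by RAG results is also unspecified, so the sketch only enforces the ordering.

```typescript
// Assumed tool signatures; the real rag_search and web_search tools may differ.
type RagSearch = (dataset: string, query: string) => Promise<string[]>;
type WebSearch = (query: string) => Promise<string[]>;

interface ResearchContext {
  published: string[];
  competitor: string[];
  web: string[];
}

async function gatherContext(
  ragSearch: RagSearch,
  webSearch: WebSearch,
  industry: string,
  niche: string,
): Promise<ResearchContext> {
  // Both RAG queries are started together and awaited as a pair.
  const [published, competitor] = await Promise.all([
    ragSearch('Published Content', 'blog posts published last 90 days topics titles'),
    ragSearch('Competitor Research', `competitor blog topics keywords ${industry} ${niche}`),
  ]);

  // Web search starts only after RAG has returned.
  const queries = [`${niche} trends 2026`, `${niche} questions people are asking`];
  const webResults = await Promise.all(queries.map((q) => webSearch(q)));
  const web = ([] as string[]).concat(...webResults);

  return { published, competitor, web };
}
```

`Promise.all` rejects as soon as either query rejects; a caller that wants the documented "proceed without" fallbacks would need to catch failures per query instead.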

Tools Required

Tool	Method	Purpose	Required?
`rag_search`	search	Query published content and competitor research	Yes
`web_search`	search	Find trending topics and current questions in the niche	Yes

HITL Gates

Review type: topic_review
Risk level: low
Trigger: Always — the generated topic list is presented to the human content strategist for selection before any brief or research note jobs are dispatched.
Reviewer action: Selects which topics to proceed with. Partial approval is the norm; not every generated topic will be commissioned. Selected topics trigger Research Note Writer jobs. Deselected topics are archived.
Editing: Reviewer can edit working titles, angles, and notes before approving. Edits are saved to the topic record.
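A reviewer decision with partial approval and inline edits could be applied as in the sketch below. `ReviewDecision`, `TopicEdit`, and `applyReview` are hypothetical names; how the review UI actually stores selections and edits is not described here.

```typescript
interface ReviewableTopic {
  rank: number;
  workingTitle: string;
  angle: string;
  notes: string;
}

// Only the three fields the reviewer is allowed to edit.
type TopicEdit = Partial<Pick<ReviewableTopic, 'workingTitle' | 'angle' | 'notes'>>;

// Hypothetical decision payload, keyed by topic rank.
interface ReviewDecision {
  selectedRanks: number[];
  edits: Record<number, TopicEdit>;
}

// Applies edits to every topic record, then splits the list into topics that
// go on to Research Note Writer jobs and topics that are archived.
function applyReview(topics: ReviewableTopic[], decision: ReviewDecision) {
  const selected = new Set(decision.selectedRanks);
  const commissioned: ReviewableTopic[] = [];
  const archived: ReviewableTopic[] = [];
  for (const topic of topics) {
    const edited: ReviewableTopic = { ...topic, ...(decision.edits[topic.rank] ?? {}) };
    (selected.has(topic.rank) ? commissioned : archived).push(edited);
  }
  return { commissioned, archived };
}
```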

Guardrails

Rule	Enforcement
No topic may duplicate a recent topic	String similarity check (> 0.75 Jaccard similarity on title tokens) against the combined recent-topics list; duplicates are removed and replaced
Must include ≥ 2 evergreen topics	Count check on `evergreen: true` entries; if fewer than 2, lowest-ranked trending topics are swapped for evergreen alternatives
Must include ≥ 2 informational intent topics	Count check on searchIntent; required for organic traffic potential
Working title must be ≥ 8 words	Too-short titles indicate vague topics; enforced with a minimum word count check
Output must meet topicCount target	If fewer topics are generated than requested, the agent retries the generation step with an explicit instruction to add more
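The guardrail checks in the table can be sketched as a single validation pass. The tokenisation (lowercase, split on non-alphanumeric characters) is an assumption, and `checkGuardrails` is a hypothetical helper that only reports violations; the replacement and swap behaviour described in the table is not implemented here.

```typescript
type SearchIntent = 'informational' | 'commercial' | 'navigational' | 'transactional';

interface GuardedTopic {
  workingTitle: string;
  evergreen: boolean;
  searchIntent: SearchIntent;
}

// Lowercases and splits on non-alphanumeric characters (assumed tokenisation).
function titleTokens(title: string): Set<string> {
  return new Set(title.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0));
}

// Jaccard similarity: size of the intersection divided by size of the union.
function jaccard(a: string, b: string): number {
  const ta = titleTokens(a);
  const tb = titleTokens(b);
  if (ta.size === 0 && tb.size === 0) return 1;
  let shared = 0;
  ta.forEach((t) => { if (tb.has(t)) shared += 1; });
  return shared / (ta.size + tb.size - shared);
}

// Returns human-readable violations; an empty array means every check passed.
function checkGuardrails(topics: GuardedTopic[], avoidTitles: string[], topicCount: number): string[] {
  const violations: string[] = [];
  for (const t of topics) {
    if (avoidTitles.some((recent) => jaccard(t.workingTitle, recent) > 0.75)) {
      violations.push(`duplicate: ${t.workingTitle}`);
    }
    if (t.workingTitle.trim().split(/\s+/).filter(Boolean).length < 8) {
      violations.push(`title too short: ${t.workingTitle}`);
    }
  }
  if (topics.filter((t) => t.evergreen).length < 2) {
    violations.push('fewer than 2 evergreen topics');
  }
  if (topics.filter((t) => t.searchIntent === 'informational').length < 2) {
    violations.push('fewer than 2 informational topics');
  }
  if (topics.length < topicCount) {
    violations.push(`only ${topics.length} of ${topicCount} topics`);
  }
  return violations;
}
```

Token-level Jaccard catches reworded titles that share most of their words, but it cannot detect "same core subject" overlap between titles with different vocabulary, so that part of the rule would still rely on the model or a reviewer.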

Tenant Settings Used

Setting	How it’s used
`industry`	Primary scoping for both RAG queries and web search queries
`targetAudience`	Informs angle selection — B2B audiences want tactical/strategic content; B2C audiences want inspiration and guidance
`brandVoice`	Influences `suggestedFormat` — an authoritative brand skews toward thought leadership and guides; a conversational brand skews toward listicles and how-tos

Cost Profile


Avg input tokens	~6,000 (context + RAG results + web search results)
Avg output tokens	~3,000 (12-topic JSON output)
Est. cost / task	~$0 (Ollama/local — gemma3:4b)

Note: gemma3:4b is a 4-billion-parameter model suited to structured content generation tasks. Topic ideation does not require advanced reasoning; it depends more on broad world knowledge and adherence to structured output, both of which gemma3:4b handles well. Because the agent requires Ollama infrastructure, it is limited to Agency+ plans.

Error Handling

Error	Response
Ollama unavailable or model not loaded	Fail job with error: “Ollama service unavailable — topic research requires Ollama to be running with gemma3:4b loaded”
Web search returns no results	Proceed with RAG-based topics only; note “Web search unavailable — topics based on knowledge base only”
RAG returns no Published Content results	Proceed without topic avoid-list; note “No published content found — all topics are eligible”
RAG returns no Competitor Research results	Proceed without competitor gap data; all topics have `competitorGap: false`
Output contains fewer than 10 topics after deduplication	Retry with explicit instruction to generate more; if still below 10 after retry, return what’s available with a note
Topic validation failures > 50% of output	Full regeneration with validation rules appended to prompt as explicit constraints
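The "fewer than 10 topics" row describes a single retry followed by a best-effort return. A minimal sketch of that behaviour follows, assuming a hypothetical `generate` callback that wraps the model call and returns topic titles. Keeping the longer of the two attempts is an assumption, as is the wording of the retry instruction and the note.

```typescript
interface GenerationResult {
  topics: string[];
  note?: string; // set only when the minimum could not be reached
}

// Calls `generate` once, retries a single time with an explicit instruction
// if too few topics came back, then returns whatever is available.
async function generateWithRetry(
  generate: (extraInstruction?: string) => Promise<string[]>,
  minTopics = 10,
): Promise<GenerationResult> {
  const first = await generate();
  if (first.length >= minTopics) return { topics: first };

  const second = await generate(`Return at least ${minTopics} distinct topics.`);
  // Assumption: keep whichever attempt produced more topics.
  const best = second.length >= first.length ? second : first;
  if (best.length >= minTopics) return { topics: best };

  return { topics: best, note: `Only ${best.length} topics available after one retry.` };
}
```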