Google Gemini
Category: AI / LLM + Image Generation
Integration type: Platform-level API key
External SDK: @google/generative-ai
Purpose
Google Gemini serves two distinct roles in the platform:
- Alternative LLM — Gemini Pro and Flash as drop-in alternatives to Claude/GPT-4o for content agents (useful for tenants with Google Cloud credits or who prefer Google’s models)
- AI Image Generation — Gemini’s Imagen model (imagen-3.0-generate-001) for generating custom images for blog posts, social content, and GBP posts when stock photos are insufficient
Model lineup
| Model | Use | Notes |
|---|---|---|
| gemini-2.5-pro | Complex reasoning, long context | Comparable to Claude Opus / GPT-4o |
| gemini-2.5-flash | Fast, balanced | Comparable to Claude Sonnet |
| gemini-2.0-flash-lite | Cheapest, fastest | Simple tasks, high-volume |
| gemini-nano | On-device / embedded | Not applicable for server-side; included for reference |
| imagen-3.0-generate-001 | Text-to-image | High-quality photorealistic output |
| gemini-2.5-flash-image | Image generation + editing | Multi-modal — can take reference images |
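The text-model tiers above map naturally onto a small selector. A sketch — the tier names and helper are illustrative, not part of the platform API; only the model IDs come from the lineup:

```typescript
// Illustrative tier → model mapping using the lineup above.
// TaskTier and pickModel are assumptions for illustration, not platform API.
type TaskTier = 'complex' | 'balanced' | 'high_volume';

const MODEL_BY_TIER: Record<TaskTier, string> = {
  complex: 'gemini-2.5-pro',            // long-context reasoning
  balanced: 'gemini-2.5-flash',         // platform default
  high_volume: 'gemini-2.0-flash-lite', // cheap, simple, high-volume tasks
};

function pickModel(tier: TaskTier): string {
  return MODEL_BY_TIER[tier];
}

console.log(pickModel('balanced')); // gemini-2.5-flash
```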
Config Structure
Platform config (env vars)
```bash
GOOGLE_AI_API_KEY=AIzaSyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx  # From console.cloud.google.com
GOOGLE_AI_DEFAULT_MODEL=gemini-2.5-flash
GOOGLE_AI_IMAGE_MODEL=imagen-3.0-generate-001
```
Integration Pattern
LLM adapter (packages/agent-core/src/adapters/gemini.ts)
Gemini implements the same LLMAdapter interface as Claude and OpenAI:
```typescript
import { GoogleGenerativeAI } from '@google/generative-ai';

class GeminiAdapter implements LLMAdapter {
  private client: GoogleGenerativeAI;
  private history: { role: 'user' | 'model'; parts: { text: string }[] }[] = [];

  constructor(
    private apiKey: string,
    private model: string = 'gemini-2.5-flash',
  ) {
    this.client = new GoogleGenerativeAI(apiKey);
  }

  async *run(systemPrompt: string, userPrompt: string): AsyncGenerator<LLMEvent> {
    const generativeModel = this.client.getGenerativeModel({
      model: this.model,
      systemInstruction: systemPrompt,
    });

    // Record the new user turn, but start the chat with history *excluding* it —
    // sendMessageStream() supplies the current turn itself.
    this.history.push({ role: 'user', parts: [{ text: userPrompt }] });
    const chat = generativeModel.startChat({ history: this.history.slice(0, -1) });
    const stream = await chat.sendMessageStream(userPrompt);

    let assistantText = '';
    for await (const chunk of stream.stream) {
      const text = chunk.text();
      assistantText += text;
      yield { type: 'content', text };
    }
    this.history.push({ role: 'model', parts: [{ text: assistantText }] });

    const usage = (await stream.response).usageMetadata;
    yield {
      type: 'usage',
      inputTokens: usage?.promptTokenCount ?? 0,
      outputTokens: usage?.candidatesTokenCount ?? 0,
    };
  }
}
```
Tool use with Gemini
Gemini supports native function calling. The adapter registers tools in the Gemini format:
```typescript
const tools = [{
  functionDeclarations: [
    {
      name: 'web_search',
      description: 'Search the web for current information',
      parameters: {
        type: 'object',
        properties: { query: { type: 'string', description: 'Search query' } },
        required: ['query'],
      },
    },
    {
      name: 'rag_search',
      description: 'Search the client knowledge base',
      parameters: {
        type: 'object',
        properties: {
          query: { type: 'string' },
          datasetIds: { type: 'array', items: { type: 'string' } },
        },
        required: ['query'],
      },
    },
  ],
}];
```
Image generation (packages/tools/src/gemini-image.ts)
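For orientation before the implementation: the shape of the Imagen :predict REST response that this section's tool consumes. This is a sketch inferred from the mapping code below, not an exhaustive schema:

```typescript
// Assumed shape of the Imagen :predict response, inferred from how the tool
// below reads it — not the full API surface.
interface ImagenPredictResponse {
  predictions: {
    bytesBase64Encoded: string; // raw image bytes, base64-encoded
    mimeType?: string;          // treated as image/png when absent
  }[];
}

// A canned response of the kind the unit tests mock:
const sample: ImagenPredictResponse = {
  predictions: [{ bytesBase64Encoded: 'abc' }],
};

console.log(sample.predictions[0].mimeType ?? 'image/png'); // image/png
```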
```typescript
import { GoogleGenerativeAI } from '@google/generative-ai';
import axios from 'axios';

class GeminiImageTool implements ImageGenerationTool {
  readonly name = 'gemini';
  private client: GoogleGenerativeAI;

  constructor(private apiKey: string) {
    this.client = new GoogleGenerativeAI(apiKey);
  }

  async generateImage(options: {
    prompt: string;
    negativePrompt?: string; // Things to avoid in the image
    aspectRatio?: '1:1' | '16:9' | '9:16' | '4:3' | '3:4';
    style?: 'photograph' | 'illustration' | 'painting' | 'digital_art';
    count?: number; // 1–4 images
  }): Promise<GeneratedImage[]> {
    // Imagen 3 via REST API (SDK wrapper not yet stable)
    const response = await axios.post(
      'https://generativelanguage.googleapis.com/v1beta/models/imagen-3.0-generate-001:predict',
      {
        instances: [{ prompt: this.buildPrompt(options) }],
        parameters: {
          sampleCount: options.count ?? 1,
          aspectRatio: options.aspectRatio ?? '16:9',
          negativePrompt: options.negativePrompt,
        },
      },
      {
        headers: {
          'Content-Type': 'application/json',
          'x-goog-api-key': this.apiKey,
        },
      },
    );

    return response.data.predictions.map((pred: any) => ({
      provider: 'gemini',
      model: 'imagen-3.0-generate-001',
      imageBase64: pred.bytesBase64Encoded,
      mimeType: pred.mimeType ?? 'image/png',
      revisedPrompt: null,
    }));
  }

  // Generate with reference images (Gemini 2.5 Flash Image multimodal)
  async editImage(options: {
    prompt: string;
    referenceImages: Buffer[]; // Input images for style/content reference
    aspectRatio?: string;
  }): Promise<GeneratedImage[]> {
    // Image *output* requires the image-capable variant from the model lineup
    // above; plain gemini-2.5-flash returns text only.
    const model = this.client.getGenerativeModel({ model: 'gemini-2.5-flash-image' });

    const parts: any[] = [
      { text: options.prompt },
      ...options.referenceImages.map(buf => ({
        inlineData: {
          data: buf.toString('base64'),
          mimeType: 'image/jpeg',
        },
      })),
    ];

    const result = await model.generateContent({ contents: [{ role: 'user', parts }] });
    const imagePart = result.response.candidates?.[0]?.content.parts
      .find((p: any) => p.inlineData);
    if (!imagePart) throw new Error('Gemini did not return an image');

    return [{
      provider: 'gemini',
      model: 'gemini-2.5-flash-image',
      imageBase64: imagePart.inlineData.data,
      mimeType: imagePart.inlineData.mimeType,
      revisedPrompt: null,
    }];
  }

  private buildPrompt(options: { prompt: string; style?: string }): string {
    const styleModifiers: Record<string, string> = {
      photograph: 'professional photograph, realistic, high quality, 4K',
      illustration: 'digital illustration, clean vector art style',
      painting: 'oil painting, artistic, painterly strokes',
      digital_art: 'digital art, concept art, highly detailed',
    };
    const suffix = options.style ? `, ${styleModifiers[options.style]}` : '';
    return `${options.prompt}${suffix}`;
  }
}
```
Image Generation Provider Abstraction
All image generation providers (Gemini, DALL-E, Stability AI) implement a shared interface:
```typescript
interface ImageGenerationTool {
  readonly name: string;
  generateImage(options: ImageGenerationOptions): Promise<GeneratedImage[]>;
}

interface GeneratedImage {
  provider: string;
  model: string;
  imageBase64: string;          // Base64-encoded image
  mimeType: string;
  revisedPrompt: string | null; // Some models return a revised prompt
}
```
The agent execution engine resolves which image provider to use based on tenant config and the task type.
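The resolution step itself isn't shown in this doc. A minimal sketch, assuming the tenant config carries a provider name (the field name, registry shape, and fallback are illustrative):

```typescript
// Sketch of provider resolution. The tenant-config field and registry shape
// are assumptions for illustration, not the platform's actual API.
interface ImageGenerationToolLike {
  readonly name: string;
}

function resolveImageProvider<T extends ImageGenerationToolLike>(
  registry: T[],
  tenantProvider: string | undefined,
  fallback = 'gemini',
): T {
  const wanted = tenantProvider ?? fallback;
  const tool =
    registry.find(t => t.name === wanted) ??
    registry.find(t => t.name === fallback); // fall back if tenant choice is unregistered
  if (!tool) throw new Error(`No image provider registered for "${wanted}"`);
  return tool;
}

const registry = [{ name: 'gemini' }, { name: 'dalle' }, { name: 'stability' }];
console.log(resolveImageProvider(registry, 'dalle').name);   // dalle
console.log(resolveImageProvider(registry, undefined).name); // gemini
```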
Gemini Nano
gemini-nano is Google’s smallest model, designed for on-device inference (Android AICore). It is not applicable to Leadmetrics server-side agents and is listed here for completeness — any reference to “Nano” in platform docs refers to the model family generally, not to a supported integration.
Cost Reference
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| gemini-2.5-pro | $1.25 | $10.00 |
| gemini-2.5-flash | $0.075 | $0.30 |
| gemini-2.0-flash-lite | $0.0075 | $0.03 |
| imagen-3.0-generate-001 | — | $0.03/image |
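The per-token rates above combine with the adapter's usage event to give a per-run cost estimate. A sketch — the rate table simply mirrors the pricing above, and the helper is illustrative:

```typescript
// Rough cost estimate from token counts, using the per-1M-token rates above.
// estimateCostUSD is an illustrative helper, not platform code.
const RATES: Record<string, { input: number; output: number }> = {
  'gemini-2.5-pro':        { input: 1.25,   output: 10.0 },
  'gemini-2.5-flash':      { input: 0.075,  output: 0.3 },
  'gemini-2.0-flash-lite': { input: 0.0075, output: 0.03 },
};

function estimateCostUSD(model: string, inputTokens: number, outputTokens: number): number {
  const rate = RATES[model];
  if (!rate) throw new Error(`No rate card for ${model}`);
  return (inputTokens / 1e6) * rate.input + (outputTokens / 1e6) * rate.output;
}

// 10k input + 2k output tokens on gemini-2.5-pro ≈ $0.0325
console.log(estimateCostUSD('gemini-2.5-pro', 10_000, 2_000).toFixed(4)); // 0.0325
```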
Test Cases
Unit tests (packages/agent-core/src/adapters/gemini.test.ts)
| Test | Approach |
|---|---|
| run() calls startChat with history minus last message | Assert history passed to startChat excludes current turn |
| run() yields content events from stream chunks | Mock stream; assert text chunks emitted |
| run() yields usage event with token counts | Mock usageMetadata; assert values |
| run() appends model reply to this.history | Assert history has model entry after run |
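The first assertion can be exercised without the real SDK. A framework-agnostic sketch with a hand-rolled fake (the actual tests presumably mock @google/generative-ai with jest/vitest; all names here are illustrative):

```typescript
// Hand-rolled fake that captures what startChat receives, illustrating the
// "history minus last message" assertion. Names are illustrative only.
type Turn = { role: 'user' | 'model'; parts: { text: string }[] };

class FakeModel {
  capturedHistory: Turn[] | null = null;
  startChat(opts: { history: Turn[] }) {
    this.capturedHistory = opts.history;
    return {};
  }
}

// Mirror the adapter's sequence: push the new user turn, then start the chat
// with everything *except* that turn.
const history: Turn[] = [{ role: 'user', parts: [{ text: 'earlier turn' }] }];
const fake = new FakeModel();

history.push({ role: 'user', parts: [{ text: 'current turn' }] });
fake.startChat({ history: history.slice(0, -1) });

// The current turn must be excluded; only the earlier turn is passed through.
console.log(fake.capturedHistory!.length);            // 1
console.log(fake.capturedHistory![0].parts[0].text);  // earlier turn
```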
Unit tests (packages/tools/src/gemini-image.test.ts)
| Test | Approach |
|---|---|
| generateImage() POSTs to Imagen endpoint | Mock axios.post; assert URL and x-goog-api-key |
| generateImage() appends style modifier to prompt | Assert prompt includes style string |
| generateImage() returns base64 image | Mock { predictions: [{ bytesBase64Encoded: 'abc' }] } |
| editImage() includes reference images as inlineData | Assert parts contain inlineData blocks |
Related
- DALL-E Provider — OpenAI image generation
- Stability AI Provider — Stable Diffusion image generation
- OpenAI Provider — GPT-4o LLM (similar adapter pattern)
- Claude Provider — primary LLM