
Gemini Local Adapter

Status: [Live] — integrated with setup.worker.ts and selectable per-agent via the Manage portal's Agent edit page. No agent uses it by default yet; assign it to an agent from that page.

CLI reference: Gemini CLI cheatsheet 

Skills reference: Agent Skills  · Skills tutorial 

Authentication: Gemini CLI authentication setup 

Overview

Mechanism: Spawns gemini (Google Gemini CLI) as a child process. Communicates via NDJSON (newline-delimited JSON) on stdout using --output-format stream-json.

Why Gemini for certain tasks:

  • Google account authentication — no separate API key required for personal use
  • Session resumption via session_id — stateful multi-turn context across runs
  • Latest Gemini 3 / 2.5 model family access
  • --yolo flag for fully unattended operation (skips all approval prompts)

When to use:

  • Tenants or agents configured to run on Google’s Gemini models
  • Workflows requiring long-context processing (Gemini 2.5 Pro has large context windows)
  • Cost-optimised runs using Gemini Flash models

Configuration

GeminiLocalConfig (stored in agent_configs.adapter_config):

```typescript
interface GeminiLocalConfig {
  cwd: string;                   // working directory; auto-created if absent
  model?: string;                // e.g. 'auto' (default), 'gemini-2.5-pro'
  promptTemplate?: string;       // template with {{variables}} substitution
  instructionsFilePath?: string; // path to Markdown file prepended to prompt
  env?: Record<string, string>;  // per-agent env vars (e.g. GEMINI_API_KEY)
  timeoutSec?: number;           // hard timeout (default: 120)
  graceSec?: number;             // SIGTERM → SIGKILL grace period (default: 5)
  yolo?: boolean;                // skip all approval prompts (--approval-mode yolo)
}
```
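
For concreteness, a stored adapter_config row might look like this (values are illustrative only; the paths, model choice, and timeouts are assumptions, not defaults from the codebase):

```json
{
  "cwd": "/var/agents/ai-copywriter",
  "model": "gemini-2.5-flash",
  "promptTemplate": "You are {{agent.name}} (agent {{agentId}}), run {{runId}}, acting for tenant {{tenant.name}}.",
  "instructionsFilePath": "/var/agents/ai-copywriter/INSTRUCTIONS.md",
  "env": { "GEMINI_API_KEY": "<redacted>" },
  "timeoutSec": 300,
  "graceSec": 5,
  "yolo": true
}
```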

promptTemplate variable substitution:

| Variable | Value |
|---|---|
| `{{agentId}}` | Agent identifier |
| `{{agent.name}}` | e.g. "AI Copywriter" |
| `{{tenantId}}` | Tenant identifier |
| `{{tenant.name}}` | Tenant display name |
| `{{runId}}` | This run’s ID |
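
The substitution above can be sketched as a small lookup-and-replace helper. This is a sketch, not the actual renderTemplate in execute.ts; the real implementation may differ:

```typescript
// Sketch of {{variable}} substitution over a flat map of known variables.
// Variable names mirror the table above; unknown placeholders are left intact.
type TemplateContext = Record<string, string>;

function renderTemplate(template: string, ctx: TemplateContext): string {
  return template.replace(/\{\{([\w.]+)\}\}/g, (match, name: string) =>
    name in ctx ? ctx[name] : match // leave unknown {{vars}} untouched
  );
}

// Example context keyed exactly like the table above
const exampleCtx: TemplateContext = {
  "agentId": "agent-42",
  "agent.name": "AI Copywriter",
  "tenantId": "t-7",
  "tenant.name": "Acme",
  "runId": "run-123",
};
```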

Supported Models

Common Aliases (resolved by the CLI at runtime):

| Model ID | Notes |
|---|---|
| `auto` | Default — resolves to `gemini-2.5-pro` (or `gemini-3-pro-preview` if preview features enabled) |
| `pro` | Alias for `gemini-2.5-pro` — complex reasoning |
| `flash` | Alias for `gemini-2.5-flash` — speed/efficiency |
| `flash-lite` | Alias for `gemini-2.5-flash-lite` — fastest, most cost-effective |

Specific Models:

| Model ID | Notes |
|---|---|
| `gemini-2.5-pro` | Gemini 2.5 Pro — current recommended pro model |
| `gemini-2.5-flash` | Gemini 2.5 Flash — fast, balanced |
| `gemini-2.5-flash-lite` | Gemini 2.5 Flash Lite — fastest |
| `gemini-2.0-flash` | Gemini 2.0 Flash |
| `gemini-1.5-pro` | Gemini 1.5 Pro |
| `gemini-1.5-flash` | Gemini 1.5 Flash |

Authentication modes — three options:

  1. Sign in with Google — run gemini interactively; OAuth via browser (personal / Google AI Pro / Ultra accounts)
  2. GEMINI_API_KEY env var — key from Google AI Studio; per-token billing
  3. Vertex AI — GOOGLE_CLOUD_PROJECT plus ADC / service account / GOOGLE_API_KEY (enterprise)

See authentication docs.


Full Subprocess Flow

```
Worker process (Node.js)
│
│ 1. ASSEMBLE INPUTS
│    renderTemplate(config.promptTemplate, ctx) → renderedPrompt
│    readFile(config.instructionsFilePath) → prepended to renderedPrompt
│
│ 2. SPAWN SUBPROCESS
│    ┌──────────────────────────────────────────────────────────┐
│    │ child_process.spawn('gemini', args, {                    │
│    │   cwd: config.cwd,                                       │
│    │   env: { ...process.env, ...config.env },                │
│    │   stdio: ['pipe', 'pipe', 'pipe'],     ← stdin piped     │
│    │   shell: process.platform === 'win32',                   │
│    │ })                                                       │
│    └──────────────────────────────────────────────────────────┘
│
│ 3. DATA IN — prompt written to stdin, then stdin closed
│    proc.stdin.end(renderedPrompt)
│    CLI args:
│      gemini --output-format stream-json [--model auto] \
│        [--resume <session_id>] [--approval-mode yolo]
│
│ 4. DATA OUT — read from child.stdout (NDJSON)
│    Each line is one complete JSON event.
└─────────────────────────────────────────────────────────────────
```
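
The argument assembly in step 3 can be sketched as follows. This is a simplified take on buildArgs in execute.ts; the real function may handle more flags:

```typescript
// Sketch: assemble gemini CLI args from config plus an optional resumed session.
// Note the prompt itself is NOT an argument; it is written to stdin.
interface ArgsInput {
  model?: string;      // e.g. 'auto', 'gemini-2.5-pro'
  yolo?: boolean;      // unattended mode
  sessionId?: string;  // resume a previous session
}

function buildArgs(input: ArgsInput): string[] {
  const args = ["--output-format", "stream-json"];
  if (input.model) args.push("--model", input.model);
  if (input.sessionId) args.push("--resume", input.sessionId);
  if (input.yolo) args.push("--approval-mode", "yolo"); // --yolo alias is deprecated
  return args;
}
```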

The NDJSON event stream

```jsonc
// Session info (always first)
{ "type": "init", "timestamp": "2026-04-01T10:00:00Z", "session_id": "3e0923c9-7033-438d-a978-a361d8031f47", "model": "auto-gemini-3" }

// User message echo
{ "type": "message", "timestamp": "2026-04-01T10:00:01Z", "role": "user", "content": "Reply with PONG" }

// Assistant response (may arrive as multiple delta chunks)
{ "type": "message", "timestamp": "2026-04-01T10:00:02Z", "role": "assistant", "content": "PONG", "delta": true }

// Final result — usage data is here
{ "type": "result", "timestamp": "2026-04-01T10:00:03Z", "status": "success", "stats": { "input_tokens": 100, "output_tokens": 10, "cached": 20, "duration_ms": 3000 } }

// Error result
{ "type": "result", "timestamp": "...", "status": "error", "error": "Session not found" }
```
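
Consuming the stream amounts to splitting stdout on newlines and inspecting each event's type. A minimal sketch (the real parse.ts exposes separate extract* helpers and is likely more defensive):

```typescript
// Sketch: fold a complete NDJSON stdout buffer into the fields the adapter cares about.
interface GeminiEvent {
  type: "init" | "message" | "result";
  session_id?: string;
  role?: "user" | "assistant";
  content?: string;
  status?: "success" | "error";
  error?: string;
  stats?: { input_tokens?: number; output_tokens?: number; cached?: number };
}

function parseStream(stdout: string) {
  const events: GeminiEvent[] = stdout
    .split("\n")
    .map((line) => line.trim())
    .filter(Boolean)
    .map((line) => JSON.parse(line) as GeminiEvent);

  const init = events.find((e) => e.type === "init");
  const result = events.find((e) => e.type === "result");
  return {
    sessionId: init?.session_id,
    // concatenate assistant chunks (delta chunks merge by concatenation)
    output: events
      .filter((e) => e.type === "message" && e.role === "assistant")
      .map((e) => e.content ?? "")
      .join(""),
    usage: result?.stats,
    error: result?.status === "error" ? result.error : undefined,
  };
}
```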

Key I/O summary

| Concern | How it’s handled |
|---|---|
| Passing the prompt | Written to stdin, then `stdin.end()` — avoids Windows arg-quoting issues with `shell: true` and the `-p` flag |
| Session resumption | `--resume <session_id>` flag — Gemini CLI manages its own session store |
| Getting streamed text | All `message` events where `role === "assistant"` → concatenate `content` (multiple delta chunks merged) |
| Getting the session ID | `init` event → `session_id` saved to sessions table |
| Getting token usage | `result` event → `stats.input_tokens`, `stats.output_tokens`, `stats.cached` |
| Approval bypass | `--approval-mode yolo` — required for unattended operation. (`--yolo` alias is deprecated as of early 2026; the adapter uses `--approval-mode yolo` automatically when `yolo: true`) |
| Process timeout | SIGTERM at `timeoutSec`, SIGKILL at `timeoutSec + graceSec`; resolveOnce pattern prevents double-resolve on Windows |

Why stdin and not -p?

On Windows, spawn("gemini", [..., "-p", "prompt text"], { shell: true }) causes a quoting error: “Cannot use both a positional prompt and the --prompt (-p) flag together”. This is because shell: true re-processes argument quoting through cmd.exe. The fix is to remove -p entirely and write the prompt to stdin instead — proc.stdin.end(renderedPrompt). The Gemini CLI reads from stdin when no positional prompt is provided.

Why shell: true on Windows?

spawn("gemini", ...) without shell: true fails with ENOENT on Windows because Node.js doesn’t resolve .cmd shims (the files created by npm install -g). With shell: true, Node.js delegates to cmd.exe which finds gemini.cmd on PATH. The trade-off is that arguments pass through the Windows shell, which is why stdin is used instead of arg-based prompt passing.

Why resolveOnce instead of resolve?

On Windows with shell: true, proc.kill("SIGKILL") kills the cmd.exe wrapper but not the Gemini child process. The close event may never fire after SIGKILL. The timeout handler force-resolves the promise 500ms after SIGKILL via resolveOnce. The resolveOnce wrapper ensures the close handler can’t double-resolve if it does eventually fire.
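
The guard described here can be sketched as a small wrapper (an assumption about the shape of the real helper, not its actual code):

```typescript
// Sketch: wrap a promise's resolve so that only the first call wins.
// Both the child's `close` handler and the post-SIGKILL timer can call it safely.
function makeResolveOnce<T>(resolve: (value: T) => void): (value: T) => void {
  let settled = false;
  return (value: T) => {
    if (settled) return; // later callers (e.g. a late `close` event) are no-ops
    settled = true;
    resolve(value);
  };
}
```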


Skills

Gemini CLI supports the Agent Skills  open standard. Skills extend the agent with on-demand expertise — only the skill name and description are loaded into the initial context; the full SKILL.md body is injected only when Gemini activates the skill via the activate_skill tool.

Discovery locations

Skills are discovered from three tiers at session start:

| Tier | Location |
|---|---|
| Workspace | `.gemini/skills/<name>/SKILL.md` or `.agents/skills/<name>/SKILL.md` (committed to version control) |
| User | `~/.gemini/skills/<name>/SKILL.md` or `~/.agents/skills/<name>/SKILL.md` (personal, all workspaces) |
| Extension | Bundled with installed Gemini CLI extensions |

Precedence: Workspace > User > Extension. Within the same tier, .agents/skills/ takes precedence over .gemini/skills/.

Skill format

Each skill is a directory containing SKILL.md:

```
.agents/skills/
└── api-auditor/
    ├── SKILL.md      ← required: frontmatter + instructions
    └── scripts/
        └── audit.js  ← optional: bundled assets
```

```markdown
---
name: api-auditor
description: |
  Expertise in auditing and testing API endpoints.
  Use when the user asks to "check", "test", or "audit" a URL or API.
---

# API Auditor Instructions

When this skill is active, you MUST:

1. Use the bundled `scripts/audit.js` utility to check the URL.
2. Analyze the output and explain any failures in plain English.
```
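
As a rough illustration (not part of the adapter or the CLI), the required frontmatter fields could be sanity-checked before committing a skill. This sketch assumes the simple YAML subset shown above; a real check would use a proper YAML parser:

```typescript
// Sketch: extract `name` and a multi-line `description: |` block from SKILL.md frontmatter.
function parseSkillFrontmatter(md: string): { name?: string; description?: string } {
  const fm = md.match(/^---\n([\s\S]*?)\n---/);
  if (!fm) return {};
  const body = fm[1];
  const name = body.match(/^name:\s*(.+)$/m)?.[1]?.trim();
  // `description: |` block: collect the indented lines that follow
  const description = body
    .match(/^description:\s*\|\n((?:[ \t]+.*\n?)+)/m)?.[1]
    ?.split("\n")
    .map((l) => l.trim())
    .filter(Boolean)
    .join(" ");
  return { name, description };
}
```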

Activation mechanism

Gemini autonomously decides which skill to apply based on your request and the skill’s description:

  1. Discovery — at session start, all skill names+descriptions are injected into the system prompt
  2. Activation — Gemini calls the activate_skill tool when a matching task is identified
  3. Consent — a confirmation prompt is shown (auto-approved when yolo: true / --approval-mode yolo)
  4. Injection — the full SKILL.md body and directory structure are added to conversation context
  5. Execution — the skill’s bundled scripts/assets become accessible throughout the session

Unlike Codex (which reads skills via a shell cat/Get-Content command), Gemini uses the activate_skill tool — a native tool call that injects skill content directly, with no extra shell round-trips.

Template for per-agent skills

For Leadmetrics agents, drop workspace skills in the agent’s configured cwd under .agents/skills/:

```
/var/agents/my-agent/
└── .agents/skills/
    └── my-skill/
        └── SKILL.md
```

Session Handling

  • Session ID = session_id from the init event
  • Persisted to sessions table
  • Subsequent runs: --resume <session_id> flag — Gemini CLI resumes from its own local session store
  • Session errors (e.g. “session not found”) are detected from result.status === "error" + error message containing “session” or “resume” → clearSession: true returned in result so the orchestrator resets the session
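
The detection rule in the last bullet might look roughly like this (a sketch; the exact strings the real adapter matches may differ):

```typescript
// Sketch: decide whether a failed result should clear the stored session
// so the orchestrator starts fresh on the next run.
interface GeminiResult {
  status: "success" | "error";
  error?: string;
}

function shouldClearSession(result: GeminiResult): boolean {
  if (result.status !== "error" || !result.error) return false;
  const msg = result.error.toLowerCase();
  return msg.includes("session") || msg.includes("resume");
}
```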

Health Checks

The testEnvironment(config) function runs these checks in sequence:

| Check | What it verifies |
|---|---|
| CLI installed | `gemini --version` exits 0 |
| cwd accessible | Config `cwd` exists (or will be created) |
| instructionsFilePath | File exists if path is configured |
| Live probe | Spawns `gemini --output-format stream-json [--yolo]` with "Respond with: hello" on stdin; expects a response within 20s |

Timeout Handling

Two-stage graceful shutdown:

  1. At timeoutSec: send SIGTERM — Gemini flushes and exits cleanly
  2. After graceSec (default 5s): send SIGKILL — hard kill
  3. 500ms after SIGKILL: force-resolve via resolveOnce (Windows shell: true safety net)

The result includes error: "Process timed out after Ns" with success: false.


Cost Source

Token counts come from the result event → stats.input_tokens, stats.output_tokens, stats.cached.

Cost is calculated via calculateCostUsd(modelId, usage, env) using the model from the init event’s model field.

| Auth mode | GEMINI_API_KEY present | costUsd |
|---|---|---|
| Google account (personal / AI Pro / Ultra) | No | 0.00 |
| Gemini API key (AI Studio) | Yes | Per-token (see table below) |
| Vertex AI | No (GOOGLE_CLOUD_PROJECT used instead) | 0.00 * |

* Vertex AI billing is not tracked by the adapter — cost is reported as $0.00.

Model pricing (USD per 1M tokens, source: Google AI Studio pricing ):

| Model | Input | Output | Cache read |
|---|---|---|---|
| `gemini-2.5-pro` | $1.25 | $10.00 | $0.31 |
| `gemini-2.5-flash` | $0.15 | $0.60 | $0.0375 |
| `gemini-2.5-flash-lite` | $0.10 | $0.40 | $0.025 |
| `gemini-2.0-flash` | $0.10 | $0.40 | $0.025 |
| `gemini-1.5-pro` | $1.25 | $5.00 | $0.31 |
| `gemini-1.5-flash` | $0.075 | $0.30 | $0.019 |
| `gemini-1.5-flash-8b` | $0.0375 | $0.15 | $0.01 |
| (unknown) | $0.15 | $0.60 | $0.0375 |
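
Putting the pricing table and the usage stats together, the computation can be sketched as below. This is an illustrative version, not the real calculateCostUsd: the actual signature takes an env argument (omitted here, replaced by a simple apiKeyPresent flag), and whether cached tokens are counted inside input_tokens is an assumption:

```typescript
// Sketch: per-run cost from the pricing table above (USD per 1M tokens).
// Only a few rows are reproduced; unknown models fall back to the "(unknown)" row.
const PRICING: Record<string, { input: number; output: number; cacheRead: number }> = {
  "gemini-2.5-pro":   { input: 1.25, output: 10.0, cacheRead: 0.31 },
  "gemini-2.5-flash": { input: 0.15, output: 0.6,  cacheRead: 0.0375 },
};
const FALLBACK = { input: 0.15, output: 0.6, cacheRead: 0.0375 }; // "(unknown)" row

interface Usage {
  input_tokens: number;  // assumption: excludes cached tokens
  output_tokens: number;
  cached: number;
}

function calculateCostUsd(modelId: string, usage: Usage, apiKeyPresent: boolean): number {
  if (!apiKeyPresent) return 0; // Google-account / Vertex runs are reported as $0.00
  const p = PRICING[modelId] ?? FALLBACK;
  return (
    (usage.input_tokens * p.input +
      usage.output_tokens * p.output +
      usage.cached * p.cacheRead) / 1_000_000
  );
}
```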

Package Location

```
packages/adapters/gemini-local/
├── src/
│   ├── index.ts        # type key, label, models, defaultModel, agentConfigurationDoc
│   ├── types.ts        # GeminiLocalConfig, stream event types, AdapterExecutionContext
│   ├── server/
│   │   ├── execute.ts  # buildArgs(), renderTemplate(), execute()
│   │   ├── parse.ts    # parseNdjsonLines(), extractSessionId(), extractOutput(), extractUsage(), buildTranscript()
│   │   ├── test.ts     # testEnvironment() — CLI probe + diagnostics
│   │   └── __tests__/
│   │       ├── execute.test.ts       # unit tests — buildArgs, renderTemplate
│   │       ├── parse.test.ts         # unit tests — parse/extract/buildTranscript
│   │       ├── build-config.test.ts  # unit tests — buildConfig, validateConfig
│   │       └── integration/
│   │           └── execute.integration.test.ts  # live gemini CLI tests
│   └── ui/
│       ├── build-config.ts  # configFields[], buildConfig(), validateConfig()
│       └── __tests__/
│           └── build-config.test.ts
├── vitest.config.ts
└── package.json
```

Test status: Unit tests cover buildArgs, renderTemplate, calculateCostUsd, extractModelId, parse/extract functions, and buildConfig/validateConfig. Integration tests require gemini CLI installed and authenticated (Google account or GEMINI_API_KEY).

© 2026 Leadmetrics — Internal use only