Gemini Local Adapter
Status: [Live]. Integrated with setup.worker.ts and selectable per agent via the Manage portal Agent edit page. No agent uses it by default yet; assign it to an agent from the Agent edit page.
CLI reference: Gemini CLI cheatsheet
Skills reference: Agent Skills · Skills tutorial
Authentication: Gemini CLI authentication setup
Overview
Mechanism: Spawns gemini (Google Gemini CLI) as a child process. Communicates via NDJSON (newline-delimited JSON) on stdout using --output-format stream-json.
Why Gemini for certain tasks:
- Google account authentication — no separate API key required for personal use
- Session resumption via session_id: stateful multi-turn context across runs
- Latest Gemini 3 / 2.5 model family access
- --yolo flag for fully unattended operation (skips all approval prompts)
When to use:
- Tenants or agents configured to run on Google’s Gemini models
- Workflows requiring long-context processing (Gemini 2.5 Pro accepts roughly 1M input tokens)
- Cost-optimised runs using Gemini Flash models
Configuration
GeminiLocalConfig (stored in agent_configs.adapter_config):
interface GeminiLocalConfig {
cwd: string; // working directory; auto-created if absent
model?: string; // e.g. 'auto' (default), 'gemini-2.5-pro'
promptTemplate?: string; // template with {{variables}} substitution
instructionsFilePath?: string; // path to Markdown file prepended to prompt
env?: Record<string, string>; // per-agent env vars (e.g. GEMINI_API_KEY)
timeoutSec?: number; // hard timeout (default: 120)
graceSec?: number; // SIGTERM → SIGKILL grace period (default: 5)
yolo?: boolean; // pass --approval-mode yolo to skip all approval prompts
}
promptTemplate variable substitution:
| Variable | Value |
|---|---|
| {{agentId}} | Agent identifier |
| {{agent.name}} | e.g. "AI Copywriter" |
| {{tenantId}} | Tenant identifier |
| {{tenant.name}} | Tenant display name |
| {{runId}} | This run’s ID |
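The substitution in the table above can be sketched as follows. This is a minimal illustration, not the adapter's actual renderTemplate: the TemplateContext shape is hypothetical, and leaving unknown placeholders untouched is an assumption.

```typescript
// Hypothetical context shape; field names mirror the variables in the table.
interface TemplateContext {
  agentId: string;
  agent: { name: string };
  tenantId: string;
  tenant: { name: string };
  runId: string;
}

// Replace each {{dotted.path}} placeholder with the string found at that path.
// Assumption: placeholders that do not resolve to a string are left as-is.
function renderTemplate(template: string, ctx: TemplateContext): string {
  return template.replace(
    /\{\{\s*([\w.]+)\s*\}\}/g,
    (match: string, path: string): string => {
      let value: unknown = ctx;
      for (const key of path.split(".")) {
        if (typeof value !== "object" || value === null || !(key in value)) {
          return match;
        }
        value = (value as Record<string, unknown>)[key];
      }
      return typeof value === "string" ? value : match;
    },
  );
}
```

With a context whose agent.name is "AI Copywriter", the template "You are {{agent.name}}" would render as "You are AI Copywriter".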
Supported Models
Common Aliases (resolved by the CLI at runtime):
| Model ID | Notes |
|---|---|
| auto | Default; resolves to gemini-2.5-pro (or gemini-3-pro-preview if preview features are enabled) |
| pro | Alias for gemini-2.5-pro (complex reasoning) |
| flash | Alias for gemini-2.5-flash (speed/efficiency) |
| flash-lite | Alias for gemini-2.5-flash-lite (fastest, most cost-effective) |
Specific Models:
| Model ID | Notes |
|---|---|
| gemini-2.5-pro | Gemini 2.5 Pro, the current recommended pro model |
| gemini-2.5-flash | Gemini 2.5 Flash (fast, balanced) |
| gemini-2.5-flash-lite | Gemini 2.5 Flash Lite (fastest) |
| gemini-2.0-flash | Gemini 2.0 Flash |
| gemini-1.5-pro | Gemini 1.5 Pro |
| gemini-1.5-flash | Gemini 1.5 Flash |
Authentication modes: Three options: (1) Sign in with Google (run gemini interactively, OAuth via browser; personal / Google AI Pro / Ultra accounts); (2) GEMINI_API_KEY env var (from Google AI Studio, per-token billing); (3) Vertex AI (GOOGLE_CLOUD_PROJECT + ADC / service account / GOOGLE_API_KEY; enterprise). See authentication docs.
Full Subprocess Flow
Worker process (Node.js)
│
│ 1. ASSEMBLE INPUTS
│ renderTemplate(config.promptTemplate, ctx) → renderedPrompt
│ readFile(config.instructionsFilePath) → prepended to renderedPrompt
│
│ 2. SPAWN SUBPROCESS
│ ┌──────────────────────────────────────────────────────────┐
│ │ child_process.spawn('gemini', args, { │
│ │ cwd: config.cwd, │
│ │ env: { ...process.env, ...config.env }, │
│ │ stdio: ['pipe', 'pipe', 'pipe'], ← stdin piped │
│ │ shell: process.platform === 'win32', │
│ │ }) │
│ └──────────────────────────────────────────────────────────┘
│
│ 3. DATA IN — prompt written to stdin, then stdin closed
│ proc.stdin.end(renderedPrompt)
│
│ CLI args:
│ gemini --output-format stream-json [--model auto] \
│ [--resume <session_id>] [--approval-mode yolo]
│
│ 4. DATA OUT — read from child.stdout (NDJSON)
│ Each line is one complete JSON event.
│
└─────────────────────────────────────────────────────────────────
The NDJSON event stream
// Session info (always first)
{ "type": "init", "timestamp": "2026-04-01T10:00:00Z", "session_id": "3e0923c9-7033-438d-a978-a361d8031f47", "model": "auto-gemini-3" }
// User message echo
{ "type": "message", "timestamp": "2026-04-01T10:00:01Z", "role": "user", "content": "Reply with PONG" }
// Assistant response (may arrive as multiple delta chunks)
{ "type": "message", "timestamp": "2026-04-01T10:00:02Z", "role": "assistant", "content": "PONG", "delta": true }
// Final result — usage data is here
{ "type": "result", "timestamp": "2026-04-01T10:00:03Z", "status": "success", "stats": { "input_tokens": 100, "output_tokens": 10, "cached": 20, "duration_ms": 3000 } }
// Error result
{ "type": "result", "timestamp": "...", "status": "error", "error": "Session not found" }
Key I/O summary
| Concern | How it’s handled |
|---|---|
| Passing the prompt | Written to stdin, then stdin.end() — avoids Windows arg-quoting issues with shell: true and -p flag |
| Session resumption | --resume <session_id> flag — Gemini CLI manages its own session store |
| Getting streamed text | All message events where role === "assistant" → concatenate content (multiple delta chunks merged) |
| Getting the session ID | init event → session_id saved to sessions table |
| Getting token usage | result event → stats.input_tokens, stats.output_tokens, stats.cached |
| Approval bypass | --approval-mode yolo — required for unattended operation. (--yolo alias is deprecated as of early 2026; the adapter uses --approval-mode yolo automatically when yolo: true) |
| Process timeout | SIGTERM at timeoutSec, SIGKILL at timeoutSec + graceSec; resolveOnce pattern prevents double-resolve on Windows |
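The extraction rules in the table above can be sketched as a small parser. The function names match those listed for parse.ts, but the bodies here are an illustrative guess; the GeminiEvent type only covers fields visible in the sample stream, and skipping non-JSON lines is an assumption.

```typescript
// Event shape inferred from the sample stream; other fields may exist.
interface GeminiEvent {
  type: string;
  role?: string;
  content?: string;
  session_id?: string;
  status?: string;
  error?: string;
  stats?: { input_tokens?: number; output_tokens?: number; cached?: number };
}

// Parse newline-delimited JSON. Assumption: blank or non-JSON lines are skipped.
function parseNdjsonLines(stdout: string): GeminiEvent[] {
  const events: GeminiEvent[] = [];
  for (const line of stdout.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (typeof parsed === "object" && parsed !== null && "type" in parsed) {
        events.push(parsed as GeminiEvent);
      }
    } catch {
      // Not valid JSON (for example stray log output); ignore the line.
    }
  }
  return events;
}

// The session ID comes from the init event.
function extractSessionId(events: GeminiEvent[]): string | undefined {
  return events.find((e) => e.type === "init")?.session_id;
}

// Concatenate every assistant message chunk in arrival order.
function extractOutput(events: GeminiEvent[]): string {
  return events
    .filter((e) => e.type === "message" && e.role === "assistant")
    .map((e) => e.content ?? "")
    .join("");
}

// Token usage comes from the result event's stats object.
function extractUsage(events: GeminiEvent[]) {
  const stats = events.find((e) => e.type === "result")?.stats;
  return {
    inputTokens: stats?.input_tokens ?? 0,
    outputTokens: stats?.output_tokens ?? 0,
    cachedTokens: stats?.cached ?? 0,
  };
}
```

Parsing the whole of stdout after exit keeps the sketch simple; a streaming variant would buffer partial lines between data chunks.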
Why stdin and not -p?
On Windows, spawn("gemini", [..., "-p", "prompt text"], { shell: true }) causes a quoting error: “Cannot use both a positional prompt and the --prompt (-p) flag together”. This is because shell: true re-processes argument quoting through cmd.exe. The fix is to remove -p entirely and write the prompt to stdin instead: proc.stdin.end(renderedPrompt). The Gemini CLI reads from stdin when no positional prompt is provided.
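A minimal sketch of how the argument list might be assembled without -p, following the CLI args shown in the subprocess flow. buildArgs is named in the package layout, but this body and the BuildArgsInput shape (including sessionId) are assumptions for illustration.

```typescript
// Hypothetical input shape: model and yolo come from GeminiLocalConfig,
// sessionId is the session_id saved from a previous run (if any).
interface BuildArgsInput {
  model?: string;
  yolo?: boolean;
  sessionId?: string;
}

function buildArgs(input: BuildArgsInput): string[] {
  const args = ["--output-format", "stream-json"];
  if (input.model) args.push("--model", input.model);
  if (input.sessionId) args.push("--resume", input.sessionId);
  if (input.yolo) args.push("--approval-mode", "yolo");
  // Deliberately no -p and no positional prompt: the prompt goes to stdin.
  return args;
}
```

Because the prompt never appears in the argument list, nothing user-controlled passes through cmd.exe quoting.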
Why shell: true on Windows?
spawn("gemini", ...) without shell: true fails with ENOENT on Windows because Node.js doesn’t resolve .cmd shims (the files created by npm install -g). With shell: true, Node.js delegates to cmd.exe which finds gemini.cmd on PATH. The trade-off is that arguments pass through the Windows shell, which is why stdin is used instead of arg-based prompt passing.
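The spawn options from the subprocess flow can be expressed as a pure function, which makes the Windows-only shell: true decision easy to unit test. This is a sketch under assumptions: buildSpawnOptions and SpawnPlan are hypothetical names, and the platform and base environment are passed in rather than read from process.

```typescript
// Hypothetical plain-object mirror of the options passed to child_process.spawn.
interface SpawnPlan {
  cwd: string;
  env: Record<string, string | undefined>;
  stdio: ["pipe", "pipe", "pipe"];
  shell: boolean;
}

function buildSpawnOptions(
  config: { cwd: string; env?: Record<string, string> },
  baseEnv: Record<string, string | undefined>,
  platform: string,
): SpawnPlan {
  return {
    cwd: config.cwd,
    // Per-agent env vars override the worker's own environment.
    env: { ...baseEnv, ...config.env },
    // stdin is piped so the prompt can be written to it.
    stdio: ["pipe", "pipe", "pipe"],
    // Only Windows needs the shell, to resolve the gemini.cmd shim.
    shell: platform === "win32",
  };
}
```

In the worker this would be called with process.env and process.platform, and the result handed to child_process.spawn("gemini", args, options).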
Why resolveOnce instead of resolve?
On Windows with shell: true, proc.kill("SIGKILL") kills the cmd.exe wrapper but not the Gemini child process. The close event may never fire after SIGKILL. The timeout handler force-resolves the promise 500ms after SIGKILL via resolveOnce. The resolveOnce wrapper ensures the close handler can’t double-resolve if it does eventually fire.
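The resolveOnce pattern described above might look like the following. makeResolveOnce is a hypothetical helper name; the source names only the pattern, not its implementation.

```typescript
// Wrap a resolver so that only the first call has any effect.
// Later calls (for example a late close event after a forced resolve) are ignored.
function makeResolveOnce<T>(resolve: (value: T) => void): (value: T) => void {
  let settled = false;
  return (value: T): void => {
    if (settled) return;
    settled = true;
    resolve(value);
  };
}
```

Both the timeout handler and the close handler would call the same wrapped function, so whichever fires first decides the result.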
Skills
Gemini CLI supports the Agent Skills open standard. Skills extend the agent with on-demand expertise — only the skill name and description are loaded into the initial context; the full SKILL.md body is injected only when Gemini activates the skill via the activate_skill tool.
Discovery locations
Skills are discovered from three tiers at session start:
| Tier | Location |
|---|---|
| Workspace | .gemini/skills/<name>/SKILL.md or .agents/skills/<name>/SKILL.md (committed to version control) |
| User | ~/.gemini/skills/<name>/SKILL.md or ~/.agents/skills/<name>/SKILL.md (personal, all workspaces) |
| Extension | Bundled with installed Gemini CLI extensions |
Precedence: Workspace > User > Extension. Within the same tier, .agents/skills/ takes precedence over .gemini/skills/.
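The precedence rule can be illustrated with a small resolver. This is not Gemini CLI's implementation, only a sketch of the stated ordering; the DiscoveredSkill shape is hypothetical and paths are assumed to use forward slashes.

```typescript
type Tier = "workspace" | "user" | "extension";

// Hypothetical record for one discovered SKILL.md.
interface DiscoveredSkill {
  name: string;
  tier: Tier;
  path: string;
}

const TIER_RANK: Record<Tier, number> = { workspace: 0, user: 1, extension: 2 };

// Lower rank wins. Within a tier, .agents/skills/ beats .gemini/skills/.
function rank(skill: DiscoveredSkill): number {
  const dirPenalty = skill.path.includes(".agents/skills/") ? 0 : 1;
  return TIER_RANK[skill.tier] * 2 + dirPenalty;
}

// Keep one winner per skill name according to the precedence above.
function resolveSkills(found: DiscoveredSkill[]): Map<string, DiscoveredSkill> {
  const winners = new Map<string, DiscoveredSkill>();
  for (const skill of found) {
    const current = winners.get(skill.name);
    if (!current || rank(skill) < rank(current)) {
      winners.set(skill.name, skill);
    }
  }
  return winners;
}
```

For an agent, the practical consequence is that a skill placed in the configured cwd under .agents/skills/ shadows any same-named user or extension skill.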
Skill format
Each skill is a directory containing SKILL.md:
.agents/skills/
  api-auditor/
    SKILL.md        ← required: frontmatter + instructions
    scripts/
      audit.js      ← optional: bundled assets
---
name: api-auditor
description: |
  Expertise in auditing and testing API endpoints. Use when the user asks to
  "check", "test", or "audit" a URL or API.
---
# API Auditor Instructions
When this skill is active, you MUST:
1. Use the bundled `scripts/audit.js` utility to check the URL.
2. Analyze the output and explain any failures in plain English.
Activation mechanism
Gemini autonomously decides which skill to apply based on your request and the skill’s description:
- Discovery: at session start, all skill names and descriptions are injected into the system prompt
- Activation: Gemini calls the activate_skill tool when a matching task is identified
- Consent: a confirmation prompt is shown (auto-approved when yolo: true / --approval-mode yolo)
- Injection: the full SKILL.md body and directory structure are added to the conversation context
- Execution: the skill’s bundled scripts/assets become accessible throughout the session
Unlike Codex (which reads skills via a shell cat/Get-Content command), Gemini uses the activate_skill tool — a native tool call that injects skill content directly, with no extra shell round-trips.
Template for per-agent skills
For Leadmetrics agents, drop workspace skills in the agent’s configured cwd under .agents/skills/:
/var/agents/my-agent/
  .agents/skills/
    my-skill/
      SKILL.md
Session Handling
- Session ID = session_id from the init event
- Persisted to the sessions table
- Subsequent runs: --resume <session_id> flag; Gemini CLI resumes from its own local session store
- Session errors (e.g. “session not found”) are detected from result.status === "error" plus an error message containing “session” or “resume”; clearSession: true is then returned in the result so the orchestrator resets the session
Health Checks
The testEnvironment(config) function runs these checks in sequence:
| Check | What it verifies |
|---|---|
| CLI installed | gemini --version exits 0 |
| cwd accessible | Config cwd exists (or will be created) |
| instructionsFilePath | File exists if path is configured |
| Live probe | Spawns gemini --output-format stream-json [--yolo] with "Respond with: hello" on stdin; expects response within 20s |
Timeout Handling
Two-stage graceful shutdown:
- At timeoutSec: send SIGTERM so Gemini can flush and exit cleanly
- After graceSec (default 5s): send SIGKILL (hard kill)
- 500ms after SIGKILL: force-resolve via resolveOnce (Windows shell: true safety net)
The result includes error: "Process timed out after Ns" with success: false.
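As a worked example of the timings above, the three deadlines can be computed from the config values. killSchedule and timeoutResult are hypothetical helpers, and reporting N as timeoutSec in the error message is an assumption.

```typescript
interface KillSchedule {
  sigtermAtMs: number;      // when SIGTERM is sent
  sigkillAtMs: number;      // when SIGKILL is sent
  forceResolveAtMs: number; // when the promise is force-resolved
}

// Defaults follow GeminiLocalConfig: timeoutSec 120, graceSec 5.
function killSchedule(timeoutSec: number = 120, graceSec: number = 5): KillSchedule {
  const sigtermAtMs = timeoutSec * 1000;
  const sigkillAtMs = sigtermAtMs + graceSec * 1000;
  return { sigtermAtMs, sigkillAtMs, forceResolveAtMs: sigkillAtMs + 500 };
}

// Shape of the failure result described above.
function timeoutResult(timeoutSec: number): { success: boolean; error: string } {
  return { success: false, error: `Process timed out after ${timeoutSec}s` };
}
```

With the defaults, SIGTERM lands at 120 s, SIGKILL at 125 s, and the forced resolve at 125.5 s after spawn.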
Cost Source
Token counts come from the result event → stats.input_tokens, stats.output_tokens, stats.cached.
Cost is calculated via calculateCostUsd(modelId, usage, env) using the model from the init event’s model field.
| Auth mode | GEMINI_API_KEY present | costUsd |
|---|---|---|
| Google account (personal / AI Pro / Ultra) | No | 0.00 |
| Gemini API key (AI Studio) | Yes | Per-token (see table below) |
| Vertex AI | No (GOOGLE_CLOUD_PROJECT used instead) | 0.00 * |
* Vertex AI billing is not tracked by the adapter — cost is reported as $0.00.
Model pricing (USD per 1M tokens, source: Google AI Studio pricing):
| Model | Input | Output | Cache read |
|---|---|---|---|
| gemini-2.5-pro | $1.25 | $10.00 | $0.31 |
| gemini-2.5-flash | $0.15 | $0.60 | $0.0375 |
| gemini-2.5-flash-lite | $0.10 | $0.40 | $0.025 |
| gemini-2.0-flash | $0.10 | $0.40 | $0.025 |
| gemini-1.5-pro | $1.25 | $5.00 | $0.31 |
| gemini-1.5-flash | $0.075 | $0.30 | $0.019 |
| gemini-1.5-flash-8b | $0.0375 | $0.15 | $0.01 |
| (unknown) | $0.15 | $0.60 | $0.0375 |
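A minimal sketch of how the two tables above could combine into a cost figure. The signature matches calculateCostUsd(modelId, usage, env) as named earlier, but the body is illustrative: only three price rows are reproduced, and billing input, output, and cached counts independently (rather than subtracting cached tokens from input) is an assumption.

```typescript
interface Usage {
  inputTokens: number;
  outputTokens: number;
  cachedTokens: number;
}

// USD per 1M tokens.
interface Price {
  input: number;
  output: number;
  cacheRead: number;
}

// Subset of the pricing table, for illustration only.
const PRICING: Record<string, Price> = {
  "gemini-2.5-pro": { input: 1.25, output: 10.0, cacheRead: 0.31 },
  "gemini-2.5-flash": { input: 0.15, output: 0.6, cacheRead: 0.0375 },
  "gemini-2.5-flash-lite": { input: 0.1, output: 0.4, cacheRead: 0.025 },
};

// The "(unknown)" row: used when the model ID has no entry.
const FALLBACK_PRICE: Price = { input: 0.15, output: 0.6, cacheRead: 0.0375 };

function calculateCostUsd(
  modelId: string,
  usage: Usage,
  env: Record<string, string | undefined>,
): number {
  // Google account and Vertex AI runs are reported as $0.00.
  if (!env["GEMINI_API_KEY"]) return 0;
  const price = PRICING[modelId] ?? FALLBACK_PRICE;
  const totalPerMillion =
    usage.inputTokens * price.input +
    usage.outputTokens * price.output +
    usage.cachedTokens * price.cacheRead;
  return totalPerMillion / 1e6;
}
```

For the sample result event (100 input, 10 output, 20 cached tokens) on gemini-2.5-pro with an API key, this works out to (125 + 100 + 6.2) / 1,000,000, or about $0.00023.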
Package Location
packages/adapters/gemini-local/
├── src/
│   ├── index.ts                  # type key, label, models, defaultModel, agentConfigurationDoc
│   ├── types.ts                  # GeminiLocalConfig, stream event types, AdapterExecutionContext
│   ├── server/
│   │   ├── execute.ts            # buildArgs(), renderTemplate(), execute()
│   │   ├── parse.ts              # parseNdjsonLines(), extractSessionId(), extractOutput(), extractUsage(), buildTranscript()
│   │   ├── test.ts               # testEnvironment(): CLI probe + diagnostics
│   │   └── __tests__/
│   │       ├── execute.test.ts       # unit tests: buildArgs, renderTemplate
│   │       ├── parse.test.ts         # unit tests: parse/extract/buildTranscript
│   │       ├── build-config.test.ts  # unit tests: buildConfig, validateConfig
│   │       └── integration/
│   │           └── execute.integration.test.ts  # live gemini CLI tests
│   └── ui/
│       ├── build-config.ts       # configFields[], buildConfig(), validateConfig()
│       └── __tests__/
│           └── build-config.test.ts
├── vitest.config.ts
└── package.json
Test status: Unit tests cover buildArgs, renderTemplate, calculateCostUsd, extractModelId, parse/extract functions, and buildConfig/validateConfig. Integration tests require gemini CLI installed and authenticated (Google account or GEMINI_API_KEY).