Gemini Local Adapter

Status: [Live] — integrated with setup.worker.ts and selectable per-agent via the Manage portal Agent edit page. No agent uses it by default yet; configure via the Agent edit page to assign it to any agent.

CLI reference: Gemini CLI cheatsheet

Skills reference: Agent Skills · Skills tutorial

Authentication: Gemini CLI authentication setup

Overview

Mechanism: Spawns gemini (Google Gemini CLI) as a child process. Communicates via NDJSON (newline-delimited JSON) on stdout using --output-format stream-json.

Why Gemini for certain tasks:

Google account authentication — no separate API key required for personal use
Session resumption via session_id — stateful multi-turn context across runs
Latest Gemini 3 / 2.5 model family access
--approval-mode yolo for fully unattended operation (skips all approval prompts; enabled with yolo: true)

When to use:

Tenants or agents configured to run on Google’s Gemini models
Workflows requiring long-context processing (Gemini 2.5 Pro has a context window of roughly 1M tokens)
Cost-optimised runs using Gemini Flash models

Configuration

GeminiLocalConfig (stored in agent_configs.adapter_config):


interface GeminiLocalConfig {
  cwd:                   string;   // working directory; auto-created if absent
  model?:                string;   // e.g. 'auto' (default), 'gemini-2.5-pro'
  promptTemplate?:       string;   // template with {{variables}} substitution
  instructionsFilePath?: string;   // path to Markdown file prepended to prompt
  env?:                  Record<string, string>; // per-agent env vars (e.g. GEMINI_API_KEY)
  timeoutSec?:           number;   // hard timeout (default: 120)
  graceSec?:             number;   // SIGTERM → SIGKILL grace period (default: 5)
  yolo?:                 boolean;  // pass --approval-mode yolo to skip all approval prompts
}
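
The defaults noted in the comments above can be applied in one place before a run. The following is a minimal sketch only: `applyDefaults` is a hypothetical helper, not a confirmed part of the adapter, and the interface is repeated so the block is self-contained. Treating an absent `yolo` as `false` and an absent `env` as empty are assumptions.

```typescript
interface GeminiLocalConfig {
  cwd: string;
  model?: string;
  promptTemplate?: string;
  instructionsFilePath?: string;
  env?: Record<string, string>;
  timeoutSec?: number;
  graceSec?: number;
  yolo?: boolean;
}

// Hypothetical helper: fills in the documented defaults
// (model 'auto', timeoutSec 120, graceSec 5).
// yolo -> false and env -> {} are assumptions, not documented defaults.
function applyDefaults(config: GeminiLocalConfig) {
  return {
    ...config,
    model: config.model ?? "auto",
    env: config.env ?? {},
    timeoutSec: config.timeoutSec ?? 120,
    graceSec: config.graceSec ?? 5,
    yolo: config.yolo ?? false,
  };
}

// Example: only cwd is required; everything else falls back to a default.
const exampleConfig = applyDefaults({
  cwd: "/var/agents/my-agent",
  promptTemplate: "You are {{agent.name}}. Run ID: {{runId}}.",
  yolo: true,
});
```

Resolving defaults once keeps the rest of the run logic free of repeated `?? 120` style fallbacks.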

promptTemplate variable substitution:

Variable	Value
`{{agentId}}`	Agent identifier
`{{agent.name}}`	e.g. `"AI Copywriter"`
`{{tenantId}}`	Tenant identifier
`{{tenant.name}}`	Tenant display name
`{{runId}}`	This run’s ID
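
To make the substitution rules concrete, here is one way a `{{variable}}` renderer could work. It is a sketch under assumptions, not the adapter's actual `renderTemplate`: the `TemplateContext` shape is inferred from the variable names in the table, and leaving unknown variables untouched is an illustrative choice.

```typescript
// Assumed context shape; only the variable names come from the table above.
interface TemplateContext {
  agentId: string;
  agent: { name: string };
  tenantId: string;
  tenant: { name: string };
  runId: string;
}

// Replaces each {{path}} with the string found by walking the dotted path.
// Unknown or non-string values are left as-is so typos stay visible.
function renderTemplate(template: string, ctx: TemplateContext): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match: string, path: string) => {
    let value: unknown = ctx;
    for (const key of path.split(".")) {
      if (typeof value !== "object" || value === null || !(key in value)) {
        return match;
      }
      value = (value as Record<string, unknown>)[key];
    }
    return typeof value === "string" ? value : match;
  });
}

const rendered = renderTemplate(
  "You are {{agent.name}}, working for {{tenant.name}}. Run: {{runId}}. {{unknown.var}}",
  {
    agentId: "agent-1",
    agent: { name: "AI Copywriter" },
    tenantId: "tenant-1",
    tenant: { name: "Acme Corp" },
    runId: "run-42",
  },
);
```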

Supported Models

Common Aliases (resolved by the CLI at runtime):

Model ID	Notes
`auto`	Default — resolves to `gemini-2.5-pro` (or `gemini-3-pro-preview` if preview features enabled)
`pro`	Alias for `gemini-2.5-pro` — complex reasoning
`flash`	Alias for `gemini-2.5-flash` — speed/efficiency
`flash-lite`	Alias for `gemini-2.5-flash-lite` — fastest, most cost-effective

Specific Models:

Model ID	Notes
`gemini-2.5-pro`	Gemini 2.5 Pro — current recommended pro model
`gemini-2.5-flash`	Gemini 2.5 Flash — fast, balanced
`gemini-2.5-flash-lite`	Gemini 2.5 Flash Lite — fastest
`gemini-2.0-flash`	Gemini 2.0 Flash
`gemini-1.5-pro`	Gemini 1.5 Pro
`gemini-1.5-flash`	Gemini 1.5 Flash

Authentication modes: three options are supported. (1) Sign in with Google: run gemini interactively and complete OAuth in the browser (personal, Google AI Pro, and Ultra accounts). (2) GEMINI_API_KEY env var, issued by Google AI Studio, with per-token billing. (3) Vertex AI for enterprise use: GOOGLE_CLOUD_PROJECT plus ADC, a service account, or GOOGLE_API_KEY. See the authentication docs.

Full Subprocess Flow


Worker process (Node.js)
│
│  1. ASSEMBLE INPUTS
│     renderTemplate(config.promptTemplate, ctx) → renderedPrompt
│     readFile(config.instructionsFilePath) → prepended to renderedPrompt
│
│  2. SPAWN SUBPROCESS
│     ┌──────────────────────────────────────────────────────────┐
│     │  child_process.spawn('gemini', args, {                   │
│     │    cwd:   config.cwd,                                    │
│     │    env:   { ...process.env, ...config.env },             │
│     │    stdio: ['pipe', 'pipe', 'pipe'],  ← stdin piped       │
│     │    shell: process.platform === 'win32',                  │
│     │  })                                                       │
│     └──────────────────────────────────────────────────────────┘
│
│  3. DATA IN — prompt written to stdin, then stdin closed
│     proc.stdin.end(renderedPrompt)
│
│     CLI args:
│       gemini --output-format stream-json [--model auto] \
│         [--resume <session_id>] [--approval-mode yolo]
│
│  4. DATA OUT — read from child.stdout (NDJSON)
│     Each line is one complete JSON event.
│
└─────────────────────────────────────────────────────────────────
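
The CLI argument list in step 3 can be assembled as shown below. This is a sketch, not the adapter's actual `buildArgs()`: the `ArgOptions` shape is an assumption, while the flags themselves are the ones listed in the flow above.

```typescript
// Assumed input shape for illustration.
interface ArgOptions {
  model?: string;
  sessionId?: string; // saved from a previous run's init event
  yolo?: boolean;
}

// Builds the argument array for spawn('gemini', args, ...).
// The prompt is deliberately absent: it is written to stdin instead.
function buildArgs(opts: ArgOptions): string[] {
  const args = ["--output-format", "stream-json"];
  if (opts.model) args.push("--model", opts.model);
  if (opts.sessionId) args.push("--resume", opts.sessionId);
  if (opts.yolo) args.push("--approval-mode", "yolo");
  return args;
}

const firstRun = buildArgs({ model: "auto", yolo: true });
const resumed = buildArgs({ sessionId: "3e0923c9-7033-438d-a978-a361d8031f47" });
```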

The NDJSON event stream


// Session info (always first)
{ "type": "init", "timestamp": "2026-04-01T10:00:00Z", "session_id": "3e0923c9-7033-438d-a978-a361d8031f47", "model": "auto-gemini-3" }
 
// User message echo
{ "type": "message", "timestamp": "2026-04-01T10:00:01Z", "role": "user", "content": "Reply with PONG" }
 
// Assistant response (may arrive as multiple delta chunks)
{ "type": "message", "timestamp": "2026-04-01T10:00:02Z", "role": "assistant", "content": "PONG", "delta": true }
 
// Final result — usage data is here
{ "type": "result", "timestamp": "2026-04-01T10:00:03Z", "status": "success", "stats": { "input_tokens": 100, "output_tokens": 10, "cached": 20, "duration_ms": 3000 } }
 
// Error result
{ "type": "result", "timestamp": "...", "status": "error", "error": "Session not found" }

Key I/O summary

Concern	How it’s handled
Passing the prompt	Written to `stdin`, then `stdin.end()` — avoids Windows arg-quoting issues with `shell: true` and `-p` flag
Session resumption	`--resume <session_id>` flag — Gemini CLI manages its own session store
Getting streamed text	All `message` events where `role === "assistant"` → concatenate `content` (multiple delta chunks merged)
Getting the session ID	`init` event → `session_id` saved to sessions table
Getting token usage	`result` event → `stats.input_tokens`, `stats.output_tokens`, `stats.cached`
Approval bypass	`--approval-mode yolo` — required for unattended operation. (`--yolo` alias is deprecated as of early 2026; the adapter uses `--approval-mode yolo` automatically when `yolo: true`)
Process timeout	SIGTERM at `timeoutSec`, SIGKILL at `timeoutSec + graceSec`; `resolveOnce` pattern prevents double-resolve on Windows
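
The extraction rules in the table can be sketched as a small parser. The event types below are inferred from the sample stream and may not match the adapter's real types in `types.ts`; `summarize` is a hypothetical name that stands in for the separate extract functions.

```typescript
interface UsageStats {
  input_tokens: number;
  output_tokens: number;
  cached: number;
  duration_ms?: number;
}

// Event shapes inferred from the sample stream; assumptions, not a spec.
type GeminiEvent =
  | { type: "init"; session_id: string; model: string }
  | { type: "message"; role: "user" | "assistant"; content: string; delta?: boolean }
  | { type: "result"; status: "success" | "error"; error?: string; stats?: UsageStats };

// One JSON document per line; blank and non-JSON lines are skipped.
function parseNdjsonLines(stdout: string): GeminiEvent[] {
  const events: GeminiEvent[] = [];
  for (const line of stdout.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    try {
      events.push(JSON.parse(trimmed) as GeminiEvent);
    } catch {
      // Ignore anything that is not a complete JSON line.
    }
  }
  return events;
}

interface RunSummary {
  sessionId?: string;
  output: string;
  usage?: UsageStats;
  error?: string;
}

// Hypothetical helper combining session, text, and usage extraction.
function summarize(events: GeminiEvent[]): RunSummary {
  const summary: RunSummary = { output: "" };
  for (const event of events) {
    if (event.type === "init") {
      summary.sessionId = event.session_id;
    } else if (event.type === "message") {
      if (event.role === "assistant") summary.output += event.content;
    } else if (event.type === "result") {
      summary.usage = event.stats;
      if (event.status === "error") summary.error = event.error ?? "unknown error";
    }
  }
  return summary;
}

const sampleStdout = [
  '{"type":"init","session_id":"3e0923c9-7033-438d-a978-a361d8031f47","model":"auto-gemini-3"}',
  '{"type":"message","role":"user","content":"Reply with PONG"}',
  '{"type":"message","role":"assistant","content":"PO","delta":true}',
  '{"type":"message","role":"assistant","content":"NG","delta":true}',
  '{"type":"result","status":"success","stats":{"input_tokens":100,"output_tokens":10,"cached":20,"duration_ms":3000}}',
].join("\n");

const summary = summarize(parseNdjsonLines(sampleStdout));
```

This version parses a complete stdout string. A reader that consumes the stream chunk by chunk would also need to buffer partial lines until the newline arrives.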

Why stdin and not `-p`?

On Windows, spawn("gemini", [..., "-p", "prompt text"], { shell: true }) fails with the error “Cannot use both a positional prompt and the --prompt (-p) flag together”. With shell: true, Node.js joins the arguments into a single command line without quoting them, so cmd.exe can split the prompt text into extra arguments that the CLI reads as a positional prompt. The fix is to drop -p entirely and write the prompt to stdin with proc.stdin.end(renderedPrompt). The Gemini CLI reads the prompt from stdin when no positional prompt is provided.

Why `shell: true` on Windows?

spawn("gemini", ...) without shell: true fails with ENOENT on Windows because Node.js doesn’t resolve .cmd shims (the files created by npm install -g). With shell: true, Node.js delegates to cmd.exe which finds gemini.cmd on PATH. The trade-off is that arguments pass through the Windows shell, which is why stdin is used instead of arg-based prompt passing.
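
The platform-dependent spawn options can be isolated in a pure function, which makes the Windows branch easy to unit test. This is a sketch: `planSpawn` is a hypothetical helper, and the platform and base environment are passed in as parameters (rather than read from `process`) purely for testability.

```typescript
interface SpawnPlan {
  command: string;
  options: {
    cwd: string;
    env: Record<string, string | undefined>;
    stdio: ["pipe", "pipe", "pipe"];
    shell: boolean;
  };
}

// Hypothetical helper: mirrors the spawn options shown in the subprocess flow.
// shell is enabled only on Windows so that cmd.exe can resolve gemini.cmd.
function planSpawn(
  cwd: string,
  configEnv: Record<string, string>,
  platform: string,
  baseEnv: Record<string, string | undefined>,
): SpawnPlan {
  return {
    command: "gemini",
    options: {
      cwd,
      env: { ...baseEnv, ...configEnv }, // per-agent vars override inherited ones
      stdio: ["pipe", "pipe", "pipe"],
      shell: platform === "win32",
    },
  };
}

const winPlan = planSpawn("C:\\agents\\a1", { GEMINI_API_KEY: "k" }, "win32", { PATH: "C:\\bin" });
const linuxPlan = planSpawn("/var/agents/a1", {}, "linux", { PATH: "/usr/bin" });

// Intended use in the worker (not executed here):
//   const { command, options } = planSpawn(config.cwd, config.env ?? {}, process.platform, process.env);
//   const proc = spawn(command, args, options);
//   proc.stdin.end(renderedPrompt);
```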

Why `resolveOnce` instead of `resolve`?

On Windows with shell: true, proc.kill("SIGKILL") kills the cmd.exe wrapper but not the Gemini child process. The close event may never fire after SIGKILL. The timeout handler force-resolves the promise 500ms after SIGKILL via resolveOnce. The resolveOnce wrapper ensures the close handler can’t double-resolve if it does eventually fire.
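
A `resolveOnce` guard of the kind described above can be sketched in a few lines. `makeResolveOnce` is a hypothetical name, and the adapter's actual wrapper may differ. The demonstration uses a plain callback in place of a promise's `resolve` so the effect is visible synchronously.

```typescript
// Wraps a resolve callback so that only the first call has any effect.
function makeResolveOnce<T>(resolve: (value: T) => void): (value: T) => void {
  let settled = false;
  return (value: T) => {
    if (settled) return;
    settled = true;
    resolve(value);
  };
}

// Simulate the race: the timeout path settles first, then a late close event fires.
const settledWith: string[] = [];
const resolveOnce = makeResolveOnce<string>((reason) => {
  settledWith.push(reason);
});
resolveOnce("timeout"); // force-resolve after SIGKILL
resolveOnce("close");   // late close handler: ignored
```

A native promise already ignores a second `resolve` call, so the guard matters most when the callback also performs side effects such as clearing timers or recording the run result.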

Skills

Gemini CLI supports the Agent Skills open standard. Skills extend the agent with on-demand expertise — only the skill name and description are loaded into the initial context; the full SKILL.md body is injected only when Gemini activates the skill via the activate_skill tool.

Discovery locations

Skills are discovered from three tiers at session start:

Tier	Location
Workspace	`.gemini/skills/<name>/SKILL.md` or `.agents/skills/<name>/SKILL.md` (committed to version control)
User	`~/.gemini/skills/<name>/SKILL.md` or `~/.agents/skills/<name>/SKILL.md` (personal, all workspaces)
Extension	Bundled with installed Gemini CLI extensions

Precedence: Workspace > User > Extension. Within the same tier, .agents/skills/ takes precedence over .gemini/skills/.
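
The precedence rule can be expressed as a ranking over discovered skills. The sketch below is illustrative and does not reflect how Gemini CLI implements discovery internally. The `DiscoveredSkill` shape is an assumption, and paths are assumed to use forward slashes.

```typescript
type Tier = "workspace" | "user" | "extension";

// Assumed shape for a skill found on disk.
interface DiscoveredSkill {
  name: string;
  tier: Tier;
  path: string; // path to SKILL.md, forward slashes assumed
}

const TIER_RANK: Record<Tier, number> = { workspace: 0, user: 1, extension: 2 };

// Lower rank wins. Within a tier, .agents/skills/ beats .gemini/skills/.
function rank(skill: DiscoveredSkill): number {
  const dirRank = skill.path.includes(".agents/skills/") ? 0 : 1;
  return TIER_RANK[skill.tier] * 2 + dirRank;
}

// Keeps the highest-precedence candidate for each skill name.
function resolveSkills(found: DiscoveredSkill[]): Map<string, DiscoveredSkill> {
  const winners = new Map<string, DiscoveredSkill>();
  for (const skill of found) {
    const current = winners.get(skill.name);
    if (!current || rank(skill) < rank(current)) winners.set(skill.name, skill);
  }
  return winners;
}

const resolved = resolveSkills([
  { name: "api-auditor", tier: "user", path: "~/.agents/skills/api-auditor/SKILL.md" },
  { name: "api-auditor", tier: "workspace", path: ".gemini/skills/api-auditor/SKILL.md" },
  { name: "api-auditor", tier: "workspace", path: ".agents/skills/api-auditor/SKILL.md" },
]);
```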

Skill format

Each skill is a directory containing SKILL.md:


.agents/skills/
  api-auditor/
    SKILL.md          ← required: frontmatter + instructions
    scripts/
      audit.js        ← optional: bundled assets


---
name: api-auditor
description: |
  Expertise in auditing and testing API endpoints. Use when the user asks to
  "check", "test", or "audit" a URL or API.
---
 
# API Auditor Instructions
 
When this skill is active, you MUST:
1. Use the bundled `scripts/audit.js` utility to check the URL.
2. Analyze the output and explain any failures in plain English.

Activation mechanism

Gemini autonomously decides which skill to apply based on your request and the skill’s description:

Discovery — at session start, all skill names+descriptions are injected into the system prompt
Activation — Gemini calls the activate_skill tool when a matching task is identified
Consent — a confirmation prompt is shown (auto-approved when yolo: true / --approval-mode yolo)
Injection — the full SKILL.md body and directory structure are added to conversation context
Execution — the skill’s bundled scripts/assets become accessible throughout the session

Unlike Codex (which reads skills via a shell cat/Get-Content command), Gemini uses the activate_skill tool — a native tool call that injects skill content directly, with no extra shell round-trips.

Template for per-agent skills

For Leadmetrics agents, drop workspace skills in the agent’s configured cwd under .agents/skills/:


/var/agents/my-agent/
  .agents/skills/
    my-skill/
      SKILL.md

Session Handling

Session ID = session_id from the init event
Persisted to sessions table
Subsequent runs: --resume <session_id> flag — Gemini CLI resumes from its own local session store
Session errors (e.g. “session not found”) are detected when result.status === "error" and the error message contains “session” or “resume”; the adapter then returns clearSession: true in its result so the orchestrator resets the session
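
The session-error check could look like the sketch below. `shouldClearSession` is a hypothetical name, and the case-insensitive match is an assumption (the sample stream shows "Session not found" with a capital S).

```typescript
// Minimal shape of the result event, inferred from the sample stream.
interface ResultEvent {
  type: "result";
  status: "success" | "error";
  error?: string;
}

// True when a failed run looks like a stale or missing session,
// in which case the stored session_id should be discarded.
function shouldClearSession(result: ResultEvent): boolean {
  if (result.status !== "error" || !result.error) return false;
  return /session|resume/i.test(result.error);
}

const stale = shouldClearSession({ type: "result", status: "error", error: "Session not found" });
const unrelated = shouldClearSession({ type: "result", status: "error", error: "Rate limit exceeded" });
const ok = shouldClearSession({ type: "result", status: "success" });
```

Matching on message text is inherently loose, so an unrelated error that happens to mention "session" would also reset the session. That costs lost context but does not cause a failed run.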

Health Checks

The testEnvironment(config) function runs these checks in sequence:

Check	What it verifies
CLI installed	`gemini --version` exits 0
cwd accessible	Config `cwd` exists (or will be created)
instructionsFilePath	File exists if path is configured
Live probe	Spawns `gemini --output-format stream-json [--approval-mode yolo]` with `"Respond with: hello"` on stdin; expects response within 20s

Timeout Handling

Two-stage graceful shutdown:

At timeoutSec: send SIGTERM — Gemini flushes and exits cleanly
After graceSec (default 5s): send SIGKILL — hard kill
500ms after SIGKILL: force-resolve via resolveOnce (Windows shell: true safety net)

The result includes error: "Process timed out after Ns" with success: false.
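
The three shutdown stages can be modelled as a schedule plus a small timer helper. This is a sketch under assumptions: `timeoutSchedule`, `armTimeout`, and the `Killable` interface are hypothetical names, and the real adapter may structure its timers differently.

```typescript
// Anything with a kill() method, e.g. a Node.js ChildProcess.
interface Killable {
  kill(signal: "SIGTERM" | "SIGKILL"): boolean;
}

interface TimeoutStep {
  atMs: number; // milliseconds after the process was started
  action: "SIGTERM" | "SIGKILL" | "force-resolve";
}

// Computes when each stage fires, using the documented defaults.
function timeoutSchedule(timeoutSec = 120, graceSec = 5): TimeoutStep[] {
  const termAt = timeoutSec * 1000;
  const killAt = termAt + graceSec * 1000;
  return [
    { atMs: termAt, action: "SIGTERM" },
    { atMs: killAt, action: "SIGKILL" },
    { atMs: killAt + 500, action: "force-resolve" }, // Windows shell: true safety net
  ];
}

// Arms one timer per stage and returns a cancel function,
// to be called from the close handler when the process exits in time.
function armTimeout(proc: Killable, steps: TimeoutStep[], forceResolve: () => void): () => void {
  const timers = steps.map((step) =>
    setTimeout(() => {
      if (step.action === "force-resolve") forceResolve();
      else proc.kill(step.action);
    }, step.atMs),
  );
  return () => timers.forEach((timer) => clearTimeout(timer));
}

const defaultSchedule = timeoutSchedule();
```

With the defaults, SIGTERM fires at 120 000 ms, SIGKILL at 125 000 ms, and the forced resolve at 125 500 ms. In practice `forceResolve` would call the `resolveOnce` wrapper with a timed-out result.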

Cost Source

Token counts come from the result event → stats.input_tokens, stats.output_tokens, stats.cached.

Cost is calculated via calculateCostUsd(modelId, usage, env) using the model from the init event’s model field.

Auth mode	`GEMINI_API_KEY` present	`costUsd`
Google account (personal / AI Pro / Ultra)	No	`0.00`
Gemini API key (AI Studio)	Yes	Per-token (see table below)
Vertex AI	No (`GOOGLE_CLOUD_PROJECT` used instead)	`0.00` *

* Vertex AI billing is not tracked by the adapter — cost is reported as $0.00.

Model pricing (USD per 1M tokens, source: Google AI Studio pricing):

Model	Input	Output	Cache read
gemini-2.5-pro	$1.25	$10.00	$0.31
gemini-2.5-flash	$0.15	$0.60	$0.0375
gemini-2.5-flash-lite	$0.10	$0.40	$0.025
gemini-2.0-flash	$0.10	$0.40	$0.025
gemini-1.5-pro	$1.25	$5.00	$0.31
gemini-1.5-flash	$0.075	$0.30	$0.019
gemini-1.5-flash-8b	$0.0375	$0.15	$0.01
(unknown)	$0.15	$0.60	$0.0375
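
The cost rules above can be sketched as follows. Only the `calculateCostUsd(modelId, usage, env)` signature and the prices come from this document. The body is an illustration, and it assumes that `cached` tokens are a subset of `input_tokens` billed at the cache-read rate, which the source does not confirm.

```typescript
interface Usage {
  input_tokens: number;
  output_tokens: number;
  cached: number;
}

interface Price {
  input: number;     // USD per 1M tokens
  output: number;
  cacheRead: number;
}

// Prices copied from the table above.
const PRICING: Record<string, Price | undefined> = {
  "gemini-2.5-pro": { input: 1.25, output: 10.0, cacheRead: 0.31 },
  "gemini-2.5-flash": { input: 0.15, output: 0.6, cacheRead: 0.0375 },
  "gemini-2.5-flash-lite": { input: 0.1, output: 0.4, cacheRead: 0.025 },
  "gemini-2.0-flash": { input: 0.1, output: 0.4, cacheRead: 0.025 },
  "gemini-1.5-pro": { input: 1.25, output: 5.0, cacheRead: 0.31 },
  "gemini-1.5-flash": { input: 0.075, output: 0.3, cacheRead: 0.019 },
  "gemini-1.5-flash-8b": { input: 0.0375, output: 0.15, cacheRead: 0.01 },
};

// The "(unknown)" row: used when the model ID is not in the table.
const FALLBACK_PRICE: Price = { input: 0.15, output: 0.6, cacheRead: 0.0375 };

function calculateCostUsd(
  modelId: string,
  usage: Usage,
  env: Record<string, string | undefined>,
): number {
  // Google account and Vertex AI runs are reported as $0.00.
  if (!env["GEMINI_API_KEY"]) return 0;
  const price = PRICING[modelId] ?? FALLBACK_PRICE;
  // ASSUMPTION: cached tokens are included in input_tokens.
  const freshInput = Math.max(0, usage.input_tokens - usage.cached);
  const total =
    freshInput * price.input +
    usage.output_tokens * price.output +
    usage.cached * price.cacheRead;
  return total / 1e6;
}

const sampleUsage: Usage = { input_tokens: 100, output_tokens: 10, cached: 20 };
const withKey = calculateCostUsd("gemini-2.5-pro", sampleUsage, { GEMINI_API_KEY: "test-key" });
const withoutKey = calculateCostUsd("gemini-2.5-pro", sampleUsage, {});
```

For the sample usage from the event stream, the keyed run works out to (80 × 1.25 + 10 × 10.00 + 20 × 0.31) / 1 000 000, roughly $0.0002 under the stated assumption.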

Package Location


packages/adapters/gemini-local/
├── src/
│   ├── index.ts                    # type key, label, models, defaultModel, agentConfigurationDoc
│   ├── types.ts                    # GeminiLocalConfig, stream event types, AdapterExecutionContext
│   ├── server/
│   │   ├── execute.ts              # buildArgs(), renderTemplate(), execute()
│   │   ├── parse.ts                # parseNdjsonLines(), extractSessionId(), extractOutput(), extractUsage(), buildTranscript()
│   │   ├── test.ts                 # testEnvironment() — CLI probe + diagnostics
│   │   └── __tests__/
│   │       ├── execute.test.ts     # unit tests — buildArgs, renderTemplate
│   │       ├── parse.test.ts       # unit tests — parse/extract/buildTranscript
│   │       ├── build-config.test.ts # unit tests — buildConfig, validateConfig
│   │       └── integration/
│   │           └── execute.integration.test.ts  # live gemini CLI tests
│   └── ui/
│       ├── build-config.ts         # configFields[], buildConfig(), validateConfig()
│       └── __tests__/
│           └── build-config.test.ts
├── vitest.config.ts
└── package.json

Test status: Unit tests cover buildArgs, renderTemplate, calculateCostUsd, extractModelId, parse/extract functions, and buildConfig/validateConfig. Integration tests require gemini CLI installed and authenticated (Google account or GEMINI_API_KEY).