DALL-E (OpenAI Image Generation)

Category: AI Image Generation
Integration type: Platform-level API key (same as OpenAI LLM)
External SDK: openai

Purpose

DALL-E 3 is OpenAI’s text-to-image generation model. It produces photorealistic images, illustrations, and stylised artwork from text prompts. It is used when stock photo searches (Pixabay, Unsplash) don’t yield a sufficiently specific image for the content being created.

DALL-E 3 uses the same API key as the OpenAI LLM integration — no separate credential required.

When DALL-E vs Stock Photos

Scenario	Use
Need a generic nature/business photo	Unsplash or Pixabay (free)
Need a specific scene that doesn’t exist in stock	DALL-E 3 (generates it)
Need brand-specific illustration	DALL-E 3 with brand style prompt
Need to edit an existing image	DALL-E 2 inpainting or Gemini edit
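
The table above can be read as a small decision rule. The sketch below is one hypothetical way to encode it; the `ImageRequest` fields and the `ImageSource` labels are illustrative names, not part of the platform's API.

```typescript
// Hypothetical labels for the options listed in the table above.
type ImageSource = 'stock' | 'dall-e-3' | 'dall-e-2-edit';

// Hypothetical request shape; every field name here is an assumption.
interface ImageRequest {
  hasSourceImage: boolean;  // caller wants to modify an existing image
  brandSpecific: boolean;   // needs an illustration in the brand's style
  stockMatchCount: number;  // usable results from the Pixabay/Unsplash searches
}

function chooseImageSource(request: ImageRequest): ImageSource {
  // Editing an existing image goes to DALL-E 2 inpainting (or Gemini edit).
  if (request.hasSourceImage) return 'dall-e-2-edit';
  // Brand-specific illustrations are generated with a brand style prompt.
  if (request.brandSpecific) return 'dall-e-3';
  // Otherwise prefer free stock photos and generate only when nothing fits.
  return request.stockMatchCount > 0 ? 'stock' : 'dall-e-3';
}
```

Checking stock results first keeps the paid generation path as a fallback, which matches the ordering the table implies.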

Config Structure

Platform config (env vars)


OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx   # Shared with OpenAI LLM
DALLE_DEFAULT_MODEL=dall-e-3    # 'dall-e-3' | 'dall-e-2'
DALLE_DEFAULT_SIZE=1024x1024    # See size options per model below
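
A minimal sketch of how these variables might be read and validated at startup. `loadDalleConfig`, `DalleConfig`, and `isDalleModel` are hypothetical names; the fallback values mirror the defaults shown above.

```typescript
type DalleModel = 'dall-e-3' | 'dall-e-2';

// Hypothetical config shape; not taken from the platform's codebase.
interface DalleConfig {
  apiKey: string;
  defaultModel: DalleModel;
  defaultSize: string;
}

function isDalleModel(value: string): value is DalleModel {
  return value === 'dall-e-3' || value === 'dall-e-2';
}

// Takes the environment as a plain record so it can be exercised without
// touching the real process environment.
function loadDalleConfig(env: Record<string, string | undefined>): DalleConfig {
  const apiKey = env['OPENAI_API_KEY'];
  if (!apiKey) {
    throw new Error('OPENAI_API_KEY is not set');
  }
  const model = env['DALLE_DEFAULT_MODEL'] ?? 'dall-e-3';
  if (!isDalleModel(model)) {
    throw new Error(`Unsupported DALLE_DEFAULT_MODEL: ${model}`);
  }
  return {
    apiKey,
    defaultModel: model,
    defaultSize: env['DALLE_DEFAULT_SIZE'] ?? '1024x1024',
  };
}
```

In a Node process this would typically be called as `loadDalleConfig(process.env)`. Failing fast on a missing key or an unknown model surfaces misconfiguration at startup rather than on the first image request.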

Integration Pattern

Image generation tool (`packages/tools/src/dall-e.ts`)


import OpenAI from 'openai';
 
class DallETool implements ImageGenerationTool {
  readonly name = 'dall-e';
  private client: OpenAI;
 
  constructor(apiKey: string) {
    this.client = new OpenAI({ apiKey });
  }
 
  async generateImage(options: {
    prompt:     string;
    model?:     'dall-e-3' | 'dall-e-2';
    size?:      string;   // See per-model options below
    quality?:   'standard' | 'hd';   // DALL-E 3 only; default 'standard'
    style?:     'vivid' | 'natural'; // DALL-E 3 only; default 'vivid'
    count?:     number;  // DALL-E 3: max 1; DALL-E 2: max 10
  }): Promise<GeneratedImage[]> {
    const model   = options.model   ?? 'dall-e-3';
    const size    = options.size    ?? this.defaultSize(model);
    const count   = Math.min(options.count ?? 1, model === 'dall-e-3' ? 1 : 10);
 
    const response = await this.client.images.generate({
      model,
      prompt:           options.prompt,
      n:                count,
      size:             size as any,
      quality:          model === 'dall-e-3' ? (options.quality ?? 'standard') : undefined,
      style:            model === 'dall-e-3' ? (options.style   ?? 'vivid')    : undefined,
      response_format:  'b64_json',
    });
 
    return response.data.map(image => ({
      provider:      'dall-e',
      model,
      imageBase64:   image.b64_json!,
      mimeType:      'image/png',
      revisedPrompt: image.revised_prompt ?? null,  // DALL-E 3 always revises the prompt
    }));
  }
 
  // DALL-E 2 image editing — inpainting
  async editImage(options: {
    image:  Buffer;   // Square PNG under 4 MB; if no mask is given, its transparent areas mark the region to edit
    mask?:  Buffer;   // PNG with the same dimensions as image; fully transparent areas mark the region to edit
    prompt: string;
    size?:  '256x256' | '512x512' | '1024x1024';
    count?: number;
  }): Promise<GeneratedImage[]> {
    const response = await this.client.images.edit({
      model: 'dall-e-2',  // explicit, so the request matches the model reported in the result
      image: new File([options.image], 'image.png', { type: 'image/png' }),
      mask:  options.mask ? new File([options.mask], 'mask.png', { type: 'image/png' }) : undefined,
      prompt: options.prompt,
      n:      options.count ?? 1,
      size:   (options.size ?? '1024x1024') as any,
      response_format: 'b64_json',
    });
 
    return response.data.map(image => ({
      provider:      'dall-e',
      model:         'dall-e-2',
      imageBase64:   image.b64_json!,
      mimeType:      'image/png',
      revisedPrompt: null,
    }));
  }
 
  // Image variation — DALL-E 2 only
  async createVariation(options: {
    image:  Buffer;   // PNG, square
    count?: number;
    size?:  string;
  }): Promise<GeneratedImage[]> {
    const response = await this.client.images.createVariation({
      model: 'dall-e-2',  // explicit, so the request matches the model reported in the result
      image: new File([options.image], 'image.png', { type: 'image/png' }),
      n:     options.count ?? 1,
      size:  (options.size ?? '1024x1024') as any,
      response_format: 'b64_json',
    });
 
    return response.data.map(image => ({
      provider:      'dall-e',
      model:         'dall-e-2',
      imageBase64:   image.b64_json!,
      mimeType:      'image/png',
      revisedPrompt: null,
    }));
  }
 
  private defaultSize(model: string): string {
    return model === 'dall-e-3' ? '1024x1024' : '512x512';
  }
}
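
`GeneratedImage` is referenced by the class above but not defined in this section. The sketch below infers its shape from the objects the three methods return and adds two hypothetical helpers (`toDataUrl`, `decodedByteLength`) for consuming the base64 payload. The real interface in the codebase may differ.

```typescript
// Shape inferred from the return values of DallETool; an assumption,
// not the platform's actual type definition.
interface GeneratedImage {
  provider: string;
  model: string;
  imageBase64: string;
  mimeType: string;
  revisedPrompt: string | null;
}

// Hypothetical helper: builds a data URL suitable for an <img> src.
function toDataUrl(image: GeneratedImage): string {
  return `data:${image.mimeType};base64,${image.imageBase64}`;
}

// Hypothetical helper: size of the decoded image in bytes, computed from
// the base64 length without decoding. Assumes standard padded base64.
function decodedByteLength(base64: string): number {
  const padding = /==$/.test(base64) ? 2 : /=$/.test(base64) ? 1 : 0;
  return (base64.length * 3) / 4 - padding;
}
```

The byte-length helper can be useful for enforcing upload limits before the payload is decoded or stored.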

Size Options

DALL-E 3

Size	Aspect ratio	Notes
`1024x1024`	Square	Best for Instagram, general use
`1792x1024`	Landscape, 7:4 (close to 16:9)	Blog featured images, banners
`1024x1792`	Portrait, 4:7 (close to 9:16)	Stories, Pinterest

DALL-E 2

Size	Notes
`256x256`	Low resolution; low cost
`512x512`	Medium resolution
`1024x1024`	Full resolution
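
The tool above passes `size` through with an `as any` cast, so an unsupported size only fails once the request reaches the API. A minimal sketch of validating against the two tables first; `SUPPORTED_SIZES` and `resolveSize` are hypothetical names, and the fallbacks match `defaultSize()` in the class.

```typescript
type DalleModel = 'dall-e-3' | 'dall-e-2';

// Sizes copied from the two tables above.
const SUPPORTED_SIZES: Record<DalleModel, readonly string[]> = {
  'dall-e-3': ['1024x1024', '1792x1024', '1024x1792'],
  'dall-e-2': ['256x256', '512x512', '1024x1024'],
};

// Returns the requested size if the model supports it, the model's default
// when no size is requested, and throws for anything else.
function resolveSize(model: DalleModel, requested?: string): string {
  const fallback = model === 'dall-e-3' ? '1024x1024' : '512x512';
  const size = requested ?? fallback;
  if (SUPPORTED_SIZES[model].indexOf(size) === -1) {
    throw new Error(
      `Size ${size} is not supported by ${model}; expected one of ${SUPPORTED_SIZES[model].join(', ')}`
    );
  }
  return size;
}
```

Rejecting the size locally gives the calling agent a clearer error than a generic 400 from the API.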

DALL-E 3 Prompt Revision

DALL-E 3 automatically rewrites the input prompt before generating the image, adding detail and applying safety filtering. The `revised_prompt` field in the response shows the prompt that was actually used. The platform logs this for debugging and transparency. If an agent’s prompts are consistently revised in ways that change the intended image, adjust that agent’s system prompt.
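
One rough way to spot prompts that are being changed substantially is to measure how many of the original prompt's words survive in the revised prompt. The sketch below is a heuristic only; `retainedWordRatio`, `isHeavilyRevised`, and the 0.5 threshold are assumptions for illustration, not something the platform or the OpenAI API defines.

```typescript
// Lowercased alphanumeric words in a prompt.
function promptWords(text: string): string[] {
  const matches = text.toLowerCase().match(/[a-z0-9]+/g);
  return matches === null ? [] : matches;
}

// Fraction of the original prompt's words that also appear in the revised
// prompt. Returns 1 when there is no revision (for example DALL-E 2 results).
function retainedWordRatio(original: string, revised: string | null): number {
  if (revised === null) return 1;
  const originalWords = promptWords(original);
  if (originalWords.length === 0) return 1;
  const revisedWords = promptWords(revised);
  const kept = originalWords.filter(word => revisedWords.indexOf(word) !== -1);
  return kept.length / originalWords.length;
}

// Hypothetical threshold: flag a revision that keeps fewer than half of the
// original words.
function isHeavilyRevised(original: string, revised: string | null, threshold = 0.5): boolean {
  return retainedWordRatio(original, revised) < threshold;
}
```

Because DALL-E 3 usually expands prompts, a low ratio is a stronger signal than a long revised prompt: it suggests requested terms were dropped rather than elaborated.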

Content Policy

DALL-E 3 applies strict content moderation:

No depictions of people who could be mistaken for real individuals, unless explicit consent has been given
No violent, adult, or disturbing content
Prompts involving brands or logos may be refused

Agent system prompts must avoid requesting people’s faces, real brand logos, or sensitive subjects. Violations result in HTTP 400 errors with the error code `content_policy_violation`.
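
A minimal sketch of how a caller might tell a policy refusal apart from other failures. It checks the error structurally, assuming the thrown object exposes `status` and `code` properties (as the `openai` SDK's API errors are understood to do). `classifyImageError` and the `ImageErrorKind` labels are hypothetical, and treating 429 and 5xx responses as retryable is an assumption, not platform policy.

```typescript
type ImageErrorKind = 'content_policy' | 'retryable' | 'fatal';

function classifyImageError(err: unknown): ImageErrorKind {
  if (typeof err !== 'object' || err === null) return 'fatal';
  const e = err as { status?: unknown; code?: unknown };

  // Policy refusal: retrying the same prompt will not help.
  if (e.status === 400 && e.code === 'content_policy_violation') {
    return 'content_policy';
  }
  // Assumption: rate limits and server errors are worth retrying.
  if (e.status === 429 || (typeof e.status === 'number' && e.status >= 500)) {
    return 'retryable';
  }
  return 'fatal';
}
```

A `content_policy` result is a signal to rewrite the prompt or fall back to stock photos, whereas a `retryable` result can go through the usual backoff path.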

Cost Reference

Model	Size	Quality	Price per image
DALL-E 3	1024×1024	Standard	$0.040
DALL-E 3	1024×1024	HD	$0.080
DALL-E 3	1792×1024	Standard	$0.080
DALL-E 3	1792×1024	HD	$0.120
DALL-E 2	1024×1024	—	$0.020
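
For budgeting, the table can be turned into a small lookup. The sketch below encodes only the rows listed above and throws for any combination the table does not cover; `PRICE_TABLE` and `estimateCostUsd` are hypothetical names. Prices are stored in thousandths of a dollar to avoid floating-point drift when multiplying.

```typescript
type Quality = 'standard' | 'hd' | null;  // null: quality does not apply (DALL-E 2)

interface PriceRow {
  model: 'dall-e-3' | 'dall-e-2';
  size: string;
  quality: Quality;
  mills: number;  // price per image in thousandths of a US dollar
}

// Rows copied from the cost table above.
const PRICE_TABLE: PriceRow[] = [
  { model: 'dall-e-3', size: '1024x1024', quality: 'standard', mills: 40 },
  { model: 'dall-e-3', size: '1024x1024', quality: 'hd',       mills: 80 },
  { model: 'dall-e-3', size: '1792x1024', quality: 'standard', mills: 80 },
  { model: 'dall-e-3', size: '1792x1024', quality: 'hd',       mills: 120 },
  { model: 'dall-e-2', size: '1024x1024', quality: null,       mills: 20 },
];

function estimateCostUsd(
  model: 'dall-e-3' | 'dall-e-2',
  size: string,
  quality: Quality,
  count: number
): number {
  for (const row of PRICE_TABLE) {
    if (row.model === model && row.size === size && row.quality === quality) {
      return (row.mills * count) / 1000;
    }
  }
  throw new Error(`No price listed for ${model} at ${size} (quality: ${quality ?? 'n/a'})`);
}
```

For example, three HD landscape images from DALL-E 3 come to 3 × $0.120 = $0.36 under the listed prices.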

Test Cases

Unit tests (`packages/tools/src/dall-e.test.ts`)

Test	Approach
`generateImage()` calls `images.generate` with correct params	Mock `openai.images.generate`; assert `model`, `size`, `quality`, `style`
`generateImage()` caps count at 1 for DALL-E 3	Pass `count: 3`; assert `n: 1` in API call
`generateImage()` returns `revisedPrompt` from response	Mock `{ revised_prompt: 'revised text' }`
`editImage()` passes image and mask as File objects	Assert `File` instances in `images.edit` call
`generateImage()` uses `response_format: 'b64_json'`	Assert this param on every mocked `images.generate` call
Throws on content policy violation (400)	Mock 400 with `content_policy_violation`; assert error
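
Several of these cases check only how request parameters are built. One option, sketched here as an assumption rather than the project's actual test setup, is to extract that logic into a pure function and assert on its output directly, leaving SDK mocking for the response-handling cases. `buildGenerateParams` and `expectEqual` are hypothetical names; the defaults mirror `generateImage()` above.

```typescript
type DalleModel = 'dall-e-3' | 'dall-e-2';

interface GenerateOptions {
  prompt: string;
  model?: DalleModel;
  size?: string;
  quality?: 'standard' | 'hd';
  style?: 'vivid' | 'natural';
  count?: number;
}

interface GenerateParams {
  model: DalleModel;
  prompt: string;
  n: number;
  size: string;
  quality?: 'standard' | 'hd';
  style?: 'vivid' | 'natural';
  response_format: 'b64_json';
}

// Pure version of the parameter logic in generateImage().
function buildGenerateParams(options: GenerateOptions): GenerateParams {
  const model = options.model ?? 'dall-e-3';
  const isV3 = model === 'dall-e-3';
  const params: GenerateParams = {
    model,
    prompt: options.prompt,
    n: Math.min(options.count ?? 1, isV3 ? 1 : 10),
    size: options.size ?? (isV3 ? '1024x1024' : '512x512'),
    response_format: 'b64_json',
  };
  if (isV3) {
    // quality and style are sent for DALL-E 3 only.
    params.quality = options.quality ?? 'standard';
    params.style = options.style ?? 'vivid';
  }
  return params;
}

// Tiny assertion helper so the checks run without a test framework.
function expectEqual(actual: unknown, expected: unknown, label: string): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${String(expected)}, got ${String(actual)}`);
  }
}

function runChecks(): void {
  const v3 = buildGenerateParams({ prompt: 'a lighthouse', count: 3 });
  expectEqual(v3.n, 1, 'DALL-E 3 count is capped at 1');
  expectEqual(v3.quality, 'standard', 'DALL-E 3 default quality');
  expectEqual(v3.response_format, 'b64_json', 'response format');

  const v2 = buildGenerateParams({ prompt: 'a lighthouse', model: 'dall-e-2', count: 25 });
  expectEqual(v2.n, 10, 'DALL-E 2 count is capped at 10');
  expectEqual(v2.style, undefined, 'style is omitted for DALL-E 2');
}
```

In a real suite the bodies of `runChecks()` would become individual test cases in whichever runner the package uses.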

Related Integrations

Google Gemini Provider: Imagen 3 image generation
Stability AI Provider: Stable Diffusion image generation
Pixabay Provider: free stock photos (use before AI generation)
Unsplash Provider: premium stock photos
OpenAI Provider: GPT-4o LLM (same API key)