
Social Post Designer

[Live] · agent__social-post-designer · Azure OpenAI GPT Image 1.5 / DALL-E 3

Generates brand-specific poster images for an approved social post. Produces one image for static/reel/story formats and up to five slides for carousels. Composites headline text and the tenant logo, uploads to DigitalOcean Spaces, creates Media rows, and auto-advances the post to client_review.

Related: Social Post Writer · Social Video Designer · Blog Image Generator · AI Image Generation


Overview

| | |
| --- | --- |
| Function | Generate poster images for approved social media posts |
| Type | Worker — Design |
| Model | Azure OpenAI GPT Image 1.5 (primary) · OpenAI DALL-E 3 (fallback) |
| Queue | `agent__social-post-designer` |
| Concurrency | 4 |
| Lock duration | 5 min |
| Est. cost / task | ~$0.08 per image (Azure standard quality) |
| Credits | 2 cr per image (`ai_image_generation`) |
| Plan | Pro+ |

Triggers

| Trigger type | When | Who initiates |
| --- | --- | --- |
| Automatic | DM approves social post copy → `enqueueSocialPostDesigner()` | DM reviewer |
| Rejection re-run | Client or DM rejects design → `enqueueSocialPostDesigner({ wakeReason: "rejection" })` | Client (dashboard) / DM portal |

Input

```ts
interface SocialPostDesignerJobData {
  tenantId: string;
  activityId: string;
  wakeReason?: "new_task" | "rejection";
  reviewerFeedback?: string; // client's rejection note; injected into prompt on re-run
}
```

Output

No structured return value. Side effects on completion:

  • `SocialPost.status` → `"client_review"`
  • `Deliverable.status` → `"needs_approval"`
  • `Media` rows created and linked via `socialPostId`
  • In-app notification sent to client: “Your social post for {platform} is ready for approval.”

On failure (all slides fail or insufficient credits):

  • `SocialPost.status` → `"dm_review"` (reverts to DM for inspection)
  • `Deliverable.status` → `"failed"`
  • Reserved credits released

Pipeline Flow

Approval path:
Social Post Writer completes → DM reviews copy in portal → DM approves → `enqueueSocialPostDesigner()` → status: `design_pending` → credits reserved (2 cr × slide count) → Design Intelligence context loaded (enabled features only) → images generated per slide (intelligence-enriched prompt) → text composited (`engagementHook` on cover, heading on content slides) → logo overlaid → uploaded to Spaces → `Media` rows created → status: `client_review` → client notified

Rejection path:
Client or DM rejects design → old `Media` rows unlinked (preserved in library) → re-enqueued with `wakeReason: "rejection"` → rejection feedback + history injected into prompt → new slides generated

How It Works

Step 1 — Load & Validate

  1. Load SocialPost via activityId (unique relation)
  2. Load BrandAssets including designerConfig for the tenant
  3. Set SocialPost.status → "design_pending", Deliverable.status → "generating"
  4. Determine slide count: 1 for static/reel/story; up to 5 for carousel (capped at paragraph count)
  5. Map platformFormat: "carousel" → mediaFormat: "carousel_slide" for correct dimension lookup
  6. Get pixel dimensions from getDimensions(platform, mediaFormat)
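The slide-count and format-mapping rules in steps 4–5 can be sketched as follows; `getSlideCount` and `toMediaFormat` are illustrative names, not the worker's actual helpers:

```typescript
type PlatformFormat = "static" | "reel" | "story" | "carousel";

// Carousel gets up to 5 slides, capped at the post's paragraph count;
// every other format gets exactly one image.
function getSlideCount(format: PlatformFormat, paragraphCount: number): number {
  return format === "carousel" ? Math.min(5, Math.max(1, paragraphCount)) : 1;
}

// "carousel" maps to the "carousel_slide" media format so the dimension
// lookup resolves correctly; other formats map 1:1.
function toMediaFormat(format: PlatformFormat): string {
  return format === "carousel" ? "carousel_slide" : format;
}
```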

Step 2 — Credit Reservation

  1. Reserve 2 credits × slideCount upfront via reserveCredits()
  2. If insufficient credits: revert status to dm_review, throw — DM is notified

Step 3 — Design Intelligence Context Loading

  1. Read BrandAssets.designerConfig to determine which intelligence features are enabled
  2. Load each enabled feature’s context in parallel (see Design Intelligence section below)
  3. Assemble IntelligenceContext object passed into prompt builder

Step 4 — Image Generation (per slide)

  1. Build enriched image prompt using scene map, platform tone, brand fields, and intelligence context
  2. Call OpenAIImagesProvider.generateImage() (or img2img variant if style reference enabled)
  3. If slide fails: skip slide, continue; do not abort the whole job

Step 5 — Compositing (per slide)

  1. Composite headline text (SVG gradient + word-wrapped text via Sharp)
  2. Composite brand logo on top (prefer logoWhiteUrl over logoUrl)

Step 6 — Upload & Persist

  1. Upload PNG to Spaces via uploadToSpaces() → create Media row with full prompt stored
  2. Consume credits for successfully generated slides; release unused for failed slides

Step 7 — Finalise

  1. Update SocialPost.status → "client_review", Deliverable.status → "needs_approval"
  2. Publish in-app notification to client

Design Intelligence

Design Intelligence is a set of optional context-loading features that enrich the image prompt using the tenant’s historical data from their channels, past designs, and brand research. Each feature is independently togglable via BrandAssets.designerConfig and is configured per-tenant by super admins in the Manage portal (tenant detail → Design Intelligence tab).

Config Schema

Stored as BrandAssets.designerConfig (JSON):

```ts
type VisualApproach = "auto" | "people" | "no-people" | "abstract" | "data-driven";

interface DesignerConfig {
  // Intelligence feature flags
  useRejectionHistory: boolean;   // Inject past rejection notes as negative guidance
  usePastApprovedStyles: boolean; // Extract visual patterns from approved images
  useChannelInsights: boolean;    // Use channel insight strengths + recommendations
  useStyleReference: boolean;     // Pass approved image as style anchor (img2img)
  useEngagementBias: boolean;     // Bias toward visual patterns of high-engagement posts
  useCompetitorDiff: boolean;     // Differentiate from competitor visual styles (RAG)
  useLLMTextRendering: boolean;   // Bake headline text into the image prompt (vs Sharp overlay)

  // Scene configuration
  visualApproach: VisualApproach;         // Applies a composition direction modifier to every scene
  sceneOverrides: Record<string, string>; // Per-content-type scene overrides for this tenant
}

// Defaults (applied when designerConfig is null or a field is missing)
const DESIGNER_CONFIG_DEFAULTS: DesignerConfig = {
  useRejectionHistory: true,
  usePastApprovedStyles: true,
  useChannelInsights: false, // off by default — turn on once channel insights are populated
  useStyleReference: false,  // off by default — requires img2img provider support
  useEngagementBias: false,  // off by default — requires engagement data linked
  useCompetitorDiff: false,  // off by default — slower (RAG query)
  useLLMTextRendering: true, // on by default — produces more natural-looking text in image
  visualApproach: "auto",
  sceneOverrides: {},
};
```

Feature 1 — Rejection History (default: ON)

What it does: Aggregates the last 5 rejection notes left by the client (clientRejectionNote) and DM (dmRejectionNote) on past social posts for the same platform. Injects them as “avoid” guidance in the prompt.

Why it matters: The client has already told us what they dislike — dark backgrounds, generic stock photos, no people, etc. — and we’re currently ignoring this. Injecting it costs zero extra credits and directly addresses the client’s known preferences.

Data source:

```ts
db.socialPost.findMany({
  where: {
    tenantId,
    platform,
    OR: [
      { clientRejectionNote: { not: null } },
      { dmRejectionNote: { not: null } },
    ],
  },
  orderBy: { updatedAt: "desc" },
  take: 5,
  select: { clientRejectionNote: true, dmRejectionNote: true },
})
```

Prompt injection:

```
Avoid these issues from past rejected designs for this client:
"Too dark and corporate — needed warmer tones."
"Generic stock photo style, not brand-specific."
"No visible people, felt empty."
```

Degrades gracefully: If no rejections exist yet, this feature is a no-op.
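A minimal sketch of how the queried notes could be flattened into the "avoid" guidance format shown above — `formatRejectionGuidance` is a hypothetical helper, not the worker's actual code:

```typescript
interface RejectedPost {
  clientRejectionNote: string | null;
  dmRejectionNote: string | null;
}

// Flatten both note fields, drop empties, and format the "avoid" guidance.
// Returns null when there is nothing to inject (the no-op case).
function formatRejectionGuidance(posts: RejectedPost[]): string | null {
  const notes = posts
    .flatMap((p) => [p.clientRejectionNote, p.dmRejectionNote])
    .filter((n): n is string => Boolean(n && n.trim()));
  if (notes.length === 0) return null;
  return (
    "Avoid these issues from past rejected designs for this client: " +
    notes.map((n) => `"${n.trim()}"`).join(" ")
  );
}
```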


Feature 2 — Past Approved Style Patterns (default: ON)

What it does: Queries the last 3 approved social_poster Media rows for the same platform. Extracts the stored prompt strings and distils the visual patterns that led to approval into a style hint injected into the new prompt.

Why it matters: Each approved image’s prompt captures what the model generated that the client liked. Reusing those patterns — specific lighting descriptions, composition choices, colour temperature — makes the new image feel visually consistent with past approved work, even without explicit client instruction.

Data source:

```ts
db.media.findMany({
  where: {
    tenantId,
    platform,
    mediaType: "social_poster",
    socialPost: { status: { in: ["client_approved", "published"] } },
  },
  orderBy: { createdAt: "desc" },
  take: 3,
  select: { prompt: true, createdAt: true },
})
```

Prompt injection:

```
Visual style from past approved designs for this client on instagram:
Warm colour grading, shallow depth of field, editorial photography,
professional subject at laptop, natural office lighting.
Maintain consistency with this established visual style.
```

Degrades gracefully: If no approved images exist, this feature is a no-op. More useful over time as the client approves more designs.


Feature 3 — Channel Insights (default: OFF)

What it does: Loads the latest completed ChannelInsight for the relevant platform. Extracts the top 2 strength titles and top 2 recommendation titles (sorted high → medium → low priority) from the structured insights.sections JSON and injects them as visual direction context.

Why it matters: The insight agent has already synthesised what content resonates with this audience and what visual direction is recommended — re-using that signal in the image prompt means the designer makes audience-aware decisions without needing another API call.

Default OFF because: The ChannelInsight.summary is a general analytics narrative that isn’t useful for image prompts. The extractor now uses insights.sections.strengths and insights.sections.recommendations (both InsightItem[] / RecommendationItem[] with title + detail + priority) — but this data is only meaningful once the insight agent has run and produced relevant visual guidance, not just engagement metrics.

Data source:

```ts
db.channelInsight.findFirst({
  where: { tenantId, channelType, status: "done" },
  orderBy: { completedAt: "desc" },
  select: { insights: true }, // summary is intentionally NOT selected
})

// Extracts:
//   insights.sections.strengths[0..1].title                            → top 2 strength titles
//   insights.sections.recommendations (sorted by priority)[0..1].title → top 2 rec titles
```

Prompt injection:

```
What resonates with this audience: Educational carousels drive high saves;
Professional lifestyle imagery outperforms.
Recommended visual direction: Use warm natural lighting; Feature real people
in authentic settings.
```

Degrades gracefully: Returns null if no completed insight exists, or if both sections are empty. Capped at 400 chars.
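The extraction described above can be sketched like this; the type shapes follow the `InsightItem`/`RecommendationItem` fields mentioned in this doc, but the helper name and exact wording are illustrative:

```typescript
type Priority = "high" | "medium" | "low";

interface InsightItem {
  title: string;
  detail: string;
  priority?: Priority;
}

const PRIORITY_ORDER: Record<Priority, number> = { high: 0, medium: 1, low: 2 };

// Top 2 strength titles + top 2 recommendation titles (high → medium → low),
// capped at 400 characters; null when both sections are empty.
function extractInsightContext(sections: {
  strengths?: InsightItem[];
  recommendations?: InsightItem[];
}): string | null {
  const strengths = (sections.strengths ?? []).slice(0, 2).map((s) => s.title);
  const recs = [...(sections.recommendations ?? [])]
    .sort(
      (a, b) =>
        PRIORITY_ORDER[a.priority ?? "low"] - PRIORITY_ORDER[b.priority ?? "low"],
    )
    .slice(0, 2)
    .map((r) => r.title);
  if (strengths.length === 0 && recs.length === 0) return null;

  const parts: string[] = [];
  if (strengths.length) parts.push(`What resonates with this audience: ${strengths.join("; ")}.`);
  if (recs.length) parts.push(`Recommended visual direction: ${recs.join("; ")}.`);
  return parts.join(" ").slice(0, 400);
}
```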


Feature 4 — Style Reference Image / img2img (default: OFF)

What it does: Fetches the most recent client-approved social_poster image for the same platform and passes it as a visual reference to the image generation API. The model generates a new image in a similar aesthetic — same colour temperature, composition style, visual weight — without explicitly copying the content.

Why it matters: Text prompts struggle to describe subtle visual qualities — the warmth of a particular lighting setup, the breathing room in a composition, the contrast between subject and background. An actual reference image communicates all of this instantly. This is the single highest-quality improvement available.

Requires: Provider must support image reference input. Azure GPT Image 1.5 edit mode, Flux 1.1 Pro (via fal.ai), and Stable Diffusion img2img all support this. Standard DALL-E 3 text-to-image does not.

Data source:

```ts
const ref = await db.media.findFirst({
  where: {
    tenantId,
    platform,
    mediaType: "social_poster",
    socialPost: { status: { in: ["client_approved", "published"] } },
  },
  orderBy: { createdAt: "desc" },
  select: { url: true },
});

const referenceBuffer = ref ? await fetchBuffer(ref.url) : null;
```

Generation change: When referenceBuffer is available, call provider.generateImageFromReference({ prompt, referenceImage: referenceBuffer, strength: 0.65 }) instead of provider.generateImage({ prompt }). Strength 0.65 keeps the style without copying the content.

Degrades gracefully: If no approved image exists, falls back to standard text-to-image generation silently.

Default OFF because: Requires img2img provider (not all environments have this configured). Turn on when using Flux or Azure GPT Image edit mode.
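The choice between the reference path and plain text-to-image can be sketched as below. The `ImageProvider` interface and `generateSlide` wrapper are illustrative; only the `generateImage`/`generateImageFromReference` method names and the 0.65 strength come from this doc:

```typescript
interface ImageProvider {
  generateImage(opts: { prompt: string }): Promise<Buffer>;
  // Optional — only providers with img2img / edit-mode support implement this.
  generateImageFromReference?(opts: {
    prompt: string;
    referenceImage: Buffer;
    strength: number;
  }): Promise<Buffer>;
}

// Prefer the reference path when the flag is on, a reference buffer was
// found, AND the provider supports it; otherwise fall back silently.
async function generateSlide(
  provider: ImageProvider,
  prompt: string,
  referenceBuffer: Buffer | null,
  useStyleReference: boolean,
): Promise<Buffer> {
  if (useStyleReference && referenceBuffer && provider.generateImageFromReference) {
    return provider.generateImageFromReference({
      prompt,
      referenceImage: referenceBuffer,
      strength: 0.65, // keeps the style without copying the content
    });
  }
  return provider.generateImage({ prompt });
}
```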


Feature 5 — Engagement Bias (default: OFF)

What it does: Queries the connected channel’s post analytics (via the channel’s access token) to find the tenant’s 3 highest-engagement posts on this platform. Fetches their associated Media.prompt strings and uses them to bias the new prompt toward visual patterns that demonstrably resonate with the audience.

Why it matters: Approval ≠ performance. A post the client approved might get 20 likes; a different visual style might get 500. Engagement data closes this loop — the system learns what the audience responds to, not just what the client personally likes.

Data source:

```ts
// Step 1: get top posts by engagement from the connected channel API
const topPostIds = await fetchTopEngagementPosts(tenantId, platform, 3); // limit = 3

// Step 2: find their Media rows in our DB
const topMedia = await db.media.findMany({
  where: { tenantId, platform, activityId: { in: topPostIds } },
  select: { prompt: true },
});
```

Prompt injection:

```
The following visual styles have driven the highest engagement for this
client's instagram audience. Lean toward these patterns:
Vibrant lifestyle scenes with warm natural light, people mid-action,
rule of thirds, shallow depth of field.
```

Default OFF because: Requires an active connected channel with analytics access. Not all tenants will have this. Can be enabled per-tenant once the channel is connected and has sufficient post history (minimum 10 published posts recommended).


Feature 6 — Competitor Visual Differentiation (default: OFF)

What it does: Queries the RAG Competitor Research dataset for the tenant and retrieves descriptions of competitors’ visual style on the same platform. Injects these as “avoid” guidance so the generated image is visually distinct from what the market looks like.

Why it matters: Without this, the model defaults to category-generic aesthetics — and every marketing agency’s Instagram posts end up looking the same. Explicit competitor differentiation ensures the client’s visual identity stands out.

Data source: RAG search against Competitor Research dataset:

```
Query:   "competitor social media visual style {platform} for {industry}"
Returns: descriptions of competitor post aesthetics, colour palettes,
         composition patterns
```

Prompt injection:

```
Competitors in this space typically use: dark backgrounds with white text,
heavy sans-serif typography, corporate office environments.
Deliberately differentiate — use warm tones, organic textures, human
connection rather than corporate formality.
```

Default OFF because: Adds a RAG query (~300ms) and requires populated competitor research data. Enable once competitor-researcher agent has run and data is indexed.


Feature 7 — LLM Text Rendering (default: ON)

What it does: Instead of compositing the headline text onto the image after generation (via Sharp gradient overlay), injects the full untruncated engagement hook or paragraph heading into the image prompt and instructs the model to render the text as part of the visual design. The Sharp compositeTextOverlay step is skipped entirely when this is enabled.

Why it matters: Sharp compositing applies a generic dark gradient band and a word-wrapped font that looks like a template overlay — the text floats on top of the image rather than belonging to it. When the model renders the text itself, it chooses a font weight, size, position, and style that integrates naturally with the scene composition.

Prompt injection (cover slide example):

```
Render this exact text as part of the image design, in large bold readable
font integrated naturally into the composition: "The average Indian SMB
spends ₹50,000–₹2,00,000/month on a marketing agency retainer — here's what
you're actually getting."
```

Text source: Full untruncated engagementHook (cover slide) or full paragraph text (carousel content slides). The model handles wrapping and length naturally — no code-side truncation needed.

Default ON because: Produces more natural-looking designs in the majority of cases. Turn OFF per tenant if the engagement hooks contain special characters (prices, formulas, non-Latin scripts) where model text accuracy is critical.

When OFF: Falls back to Sharp compositeTextOverlay with truncation (117 chars cover / 97 chars content slides) and the gradient band. The image prompt adds “No embedded text. No watermarks.” to prevent the model from also rendering text.


Intelligence Context Assembly

All enabled features load in parallel before generation starts:

```ts
const [
  rejectionHistory,
  pastApprovedStyles,
  channelInsights,
  styleReference,
  engagementBias,
  competitorDiff,
] = await Promise.all([
  config.useRejectionHistory   ? loadRejectionHistory(db, tenantId, platform)   : null,
  config.usePastApprovedStyles ? loadPastApprovedStyles(db, tenantId, platform) : null,
  config.useChannelInsights    ? loadChannelInsights(db, tenantId, platform)    : null,
  config.useStyleReference     ? loadStyleReference(db, tenantId, platform)     : null,
  config.useEngagementBias     ? loadEngagementBias(db, tenantId, platform)     : null,
  config.useCompetitorDiff     ? loadCompetitorDiff(rag, tenantId, platform)    : null,
]);

const intelligenceContext: IntelligenceContext = {
  rejectionHistory,
  pastApprovedStyles,
  channelInsights,
  styleReference, // Buffer | null — used as img2img input, not in prompt
  engagementBias,
  competitorDiff,
};
```

Each loader returns null if the feature is disabled or no data exists. The prompt builder skips null fields. A feature being ON but returning no data never causes a failure — it just contributes nothing to the prompt.


Config UI Panel

Design Intelligence settings are configured per tenant by super admins in the Manage portal under /tenants/[tenantId] → Design Intelligence tab. Clients and DM reviewers cannot change these flags.

Tenant-level panel

File: apps/manage/src/app/(manage)/tenants/[tenantId]/DesignIntelligenceTab.tsx — a dedicated tab in the 17-tab vertical sidebar on the tenant detail page.

Three sections:

  1. Feature toggles — 7 ON/OFF switches for the intelligence flags; auto-save on toggle with inline “Saved” confirmation.
  2. Visual Approach — segmented button group (Auto / People / No People / Abstract / Data-Driven); auto-save on click. Controls the visualApproach modifier appended to every resolved scene.
  3. Scene Overrides — accordion with one entry per content type (7 types). Each entry has an auto-save textarea (saves on blur) + “Reset to inherited” button. Empty = inherits platform default or SCENE_MAP. Non-empty = custom scene used for this tenant.

Partial merge on every save — only the keys in the PATCH body are changed; other keys are preserved.

Platform-level defaults page

File: apps/manage/src/app/(manage)/system/design-defaults/DesignDefaultsClient.tsx
Route: /system/design-defaults — under the System group in the Manage sidebar.

Accordion of 7 content types. Each entry has a textarea (auto-save on blur) backed by the PlatformSetting["sceneDefaults"] JSON key. Acts as Level 2 in the three-level scene stack — overrides SCENE_MAP for all tenants that haven’t set a tenant-specific override.

Empty string on save = delete key = revert to SCENE_MAP for all tenants without a tenant override.
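The save semantics can be sketched as a pure merge function (`mergeSceneDefaults` is a hypothetical name for illustration):

```typescript
// Patched keys overwrite existing ones; an empty string deletes the key so
// that content type reverts to SCENE_MAP for all tenants without a
// tenant-level override.
function mergeSceneDefaults(
  current: Record<string, string>,
  patch: Record<string, string>,
): Record<string, string> {
  const next = { ...current };
  for (const [contentType, scene] of Object.entries(patch)) {
    if (scene.trim() === "") {
      delete next[contentType]; // revert to SCENE_MAP
    } else {
      next[contentType] = scene;
    }
  }
  return next;
}
```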

Admin API endpoints

Tenant designer config (per-tenant overrides):

```
GET /admin/v1/tenants/:tenantId/brand-assets/designer-config
Authorization: Bearer <super-admin-JWT>

→ 200 { designerConfig: { ... } | null }

PATCH /admin/v1/tenants/:tenantId/brand-assets/designer-config
Authorization: Bearer <super-admin-JWT>
Content-Type: application/json

{
  "useRejectionHistory": true,
  "usePastApprovedStyles": true,
  "useChannelInsights": false,
  "useStyleReference": false,
  "useEngagementBias": false,
  "useCompetitorDiff": false,
  "useLLMTextRendering": true,
  "visualApproach": "no-people",
  "sceneOverrides": {
    "educational": "Whiteboard-style diagram on a clean light background, no people, flat-lay composition"
  }
}

→ 200 { ok: true, designerConfig: { ... }, updatedAt: "..." }
```

Platform scene defaults (Level 2 — applies to all tenants without a tenant override):

```
GET /admin/v1/design-defaults
Authorization: Bearer <super-admin-JWT>

→ 200 { sceneDefaults: Record<string, string> }

PATCH /admin/v1/design-defaults
Authorization: Bearer <super-admin-JWT>
Content-Type: application/json

{
  "sceneDefaults": {
    "educational": "Abstract knowledge graph, glowing nodes, deep blue background",
    "promotional": ""   ← empty string = remove key, revert to SCENE_MAP
  }
}

→ 200 { ok: true, sceneDefaults: { ... } }
```

All routes: super-admin JWT only (requireSuperAdmin). Designer-config PATCH writes audit log (brand_assets.designer_config_updated).


Schema Changes Required

BrandAssets model — new field

```prisma
model BrandAssets {
  // ... existing fields ...

  // Design Intelligence feature toggles (JSON to avoid a migration per feature)
  designerConfig Json? // DesignerConfig — see docs/agents/social-post-designer.md
}
```

No migration needed for existing rows — null falls back to DESIGNER_CONFIG_DEFAULTS in the worker.
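The fallback behaviour can be sketched as a spread-merge over the defaults from the Config Schema section (types trimmed to a few fields here for brevity; `resolveDesignerConfig` is an illustrative name):

```typescript
type VisualApproach = "auto" | "people" | "no-people" | "abstract" | "data-driven";

// Trimmed DesignerConfig — the real interface has all 9 fields.
interface DesignerConfig {
  useRejectionHistory: boolean;
  useLLMTextRendering: boolean;
  visualApproach: VisualApproach;
  sceneOverrides: Record<string, string>;
}

const DESIGNER_CONFIG_DEFAULTS: DesignerConfig = {
  useRejectionHistory: true,
  useLLMTextRendering: true,
  visualApproach: "auto",
  sceneOverrides: {},
};

// null / missing config falls back field-by-field to the defaults, so a
// partially-populated JSON blob never produces undefined flags.
function resolveDesignerConfig(
  stored: Partial<DesignerConfig> | null,
): DesignerConfig {
  return { ...DESIGNER_CONFIG_DEFAULTS, ...(stored ?? {}) };
}
```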


Prompt Construction

Three-level scene resolution

The scene description for each slide is resolved through a priority stack at runtime via resolveScene():

```
1. Tenant scene override   (BrandAssets.designerConfig.sceneOverrides[contentType])
     ↓ fallback if empty
2. Platform default        (PlatformSetting["sceneDefaults"][contentType])
     ↓ fallback if empty
3. Hardcoded SCENE_MAP     (in social-post-designer.worker.ts)

+ VisualApproach modifier appended to whichever level resolves
```

VisualApproach modifiers — appended to whichever base scene resolves:

| Value | Appended instruction |
| --- | --- |
| `"auto"` | (no modifier — scene is used as-is) |
| `"people"` | Include people authentically and naturally in the scene. |
| `"no-people"` | No people in the scene. Focus on environment, objects, data, or graphic composition. |
| `"abstract"` | Abstract graphic composition. Bold shapes and brand colours. No realistic photography. |
| `"data-driven"` | Data visualization, charts, statistics, or infographic style. Bold and informative. No people. |

resolveScene() function:

```ts
function resolveScene(opts: {
  contentType: string;
  visualStyle: string;                      // BrandAssets.visualStyle fallback
  tenantOverrides: Record<string, string>;  // from designerConfig.sceneOverrides
  platformDefaults: Record<string, string>; // from PlatformSetting["sceneDefaults"]
  visualApproach: VisualApproach;
}): string {
  const base =
    opts.tenantOverrides[opts.contentType]?.trim() ||
    opts.platformDefaults[opts.contentType]?.trim() ||
    SCENE_MAP[opts.contentType] ||
    `${opts.visualStyle} professional editorial photograph`;

  const modifier = VISUAL_APPROACH_MODIFIERS[opts.visualApproach];
  return modifier ? `${base}. ${modifier}` : base;
}
```

Hardcoded SCENE_MAP (level 3 default)

All entries are environment/concept-focused — not person-prescriptive. Use visualApproach to control whether people appear.

| contentType | Default scene |
| --- | --- |
| educational | Bright modern office environment, concept-driven composition, editorial photography, shallow depth of field — data visualizations, devices, or workspace details in focus |
| promotional | Product or service hero shot, studio flat-lay composition, dramatic lighting, brand colours dominant, clean negative space |
| inspirational | Aspirational setting at golden hour, warm cinematic tones, rule of thirds, wide open environment with depth and natural light |
| announcement | Bold geometric shapes, graphic composition, brand colours dominant, modern minimal design |
| engagement | Warm community setting — bright café or modern office, inviting natural light, genuine human atmosphere |
| behind-the-scenes | Candid workplace moment, authentic documentary-style lighting, real office or studio environment |
| ugc | Authentic product-in-use setting, natural lifestyle context, mobile-photography aesthetic, warm and relatable |

Platform art direction

| Platform | Tone |
| --- | --- |
| instagram | Warm colour grading, lifestyle photography, vibrant yet natural |
| linkedin | Cool-neutral professional tones, authoritative composition, business setting |
| facebook | Friendly and approachable, bright natural light, inclusive community feel |
| x | High contrast, bold graphic, minimal elements, single strong focal point |
| tiktok | Vibrant energetic colours, dynamic composition, youth-friendly atmosphere |

Full prompt template

The scene is resolved before buildImagePrompt is called (via resolveScene()) and passed in as opts.scene. The prompt builder accepts it directly — it does not re-look up SCENE_MAP when scene is provided.

```ts
function buildImagePrompt(
  opts: { scene?: string; /* ... */ },
  intelligence: IntelligenceContext,
): string {
  const scene =
    opts.scene ??
    SCENE_MAP[opts.contentType] ??
    `${opts.visualStyle} professional editorial photograph`;
  const platformDir = PLATFORM_TONE[opts.platform] ?? "";
  const slideLabel =
    opts.slideCount > 1 ? `Slide ${opts.slideIndex + 1} of ${opts.slideCount}. ` : "";
  const colorStr = [
    `primary ${opts.primaryColor}`,
    `secondary ${opts.secondaryColor}`,
    opts.accentColor ? `accent ${opts.accentColor}` : "",
    opts.backgroundColor ? `background ${opts.backgroundColor}` : "",
  ].filter(Boolean).join(", ");

  return [
    `${slideLabel}${scene}.`,
    platformDir ? `Platform aesthetic: ${platformDir}.` : "",
    `Brand colours: ${colorStr}.`,
    opts.designNotes ? `Brand style notes: ${opts.designNotes}.` : "",

    // ── Design Intelligence injections ──────────────────────────────────────
    intelligence.pastApprovedStyles
      ? `Visual style from past approved designs: ${intelligence.pastApprovedStyles}.` : "",
    intelligence.channelInsights
      ? `Target audience: ${intelligence.channelInsights}.` : "",
    intelligence.engagementBias
      ? `High-engagement visual patterns for this audience: ${intelligence.engagementBias}.` : "",
    intelligence.competitorDiff
      ? `Differentiate from competitors who use: ${intelligence.competitorDiff}.` : "",
    intelligence.rejectionHistory
      ? `Avoid these issues from past rejected designs: ${intelligence.rejectionHistory}.` : "",
    opts.reviewerFeedback
      ? `Previous design was rejected with this feedback: "${opts.reviewerFeedback}". Address specifically.` : "",
    // ────────────────────────────────────────────────────────────────────────

    `Subject context: ${opts.slideContent.slice(0, 200)}.`,
    `Shot on professional camera, f/2.0 shallow depth of field, studio lighting, magazine-quality composition, 8K detail.`,
    `Diverse authentic representation. No brand logos. No embedded text. No watermarks.`,
    `High quality image optimised for ${opts.platform}.`,
  ].filter(Boolean).join(" ");
}
```

Text Compositing

Controlled by useLLMTextRendering (default ON):

  • ON — Sharp compositing is skipped. The full untruncated engagement hook / paragraph is injected into the image prompt (see Feature 7 above) and the model renders it as part of the image.
  • OFF — Sharp compositeTextOverlay runs: two composite passes applied after image generation:
    1. Gradient overlay — SVG rect covering bottom 45% of image, black 0%→68% opacity
    2. Headline text — word-wrapped, font-size proportional to image width (~42px at 1080px), drop-shadow filter for readability
| Slide type | Text source | Truncation (Sharp only) | Position |
| --- | --- | --- | --- |
| Cover (slide 0 or single) | `SocialPost.engagementHook` | 117 chars | Lower third |
| Content slide (carousel 1–4) | First sentence of slide’s paragraph | 97 chars | Lower third |

Font family: `BrandAssets.fontPrimary` → "Liberation Sans" → Arial → sans-serif
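The Sharp-path character limits can be sketched as below; cutting at a word boundary and appending an ellipsis are assumptions — the doc only specifies the 117/97 limits:

```typescript
// Cover slides allow 117 chars, content slides 97. Anything longer is cut
// at the last word boundary inside the limit and suffixed with an ellipsis.
function truncateHeadline(text: string, isCover: boolean): string {
  const max = isCover ? 117 : 97;
  if (text.length <= max) return text;
  const cut = text.slice(0, max);
  const lastSpace = cut.lastIndexOf(" ");
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut).trimEnd() + "…";
}
```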


Brand Asset Usage

| Field | Prompt | Compositing |
| --- | --- | --- |
| primaryColor | ✅ | ✅ gradient tint |
| secondaryColor | ✅ | — |
| accentColor | ✅ | — |
| backgroundColor | ✅ | — |
| logoUrl | — | ✅ fallback |
| logoWhiteUrl | — | ✅ preferred (dark background) |
| visualStyle | ✅ fallback for unknown contentType | — |
| designNotes | ✅ | — |
| fontPrimary | — | ✅ headline text |
| designerConfig | controls which intelligence features run | |

Image Size Mapping

| Aspect ratio | GPT Image 1.5 size | Used for |
| --- | --- | --- |
| Landscape (w/h > 1.2) | 1536x1024 | LinkedIn, Facebook standard |
| Portrait (w/h < 0.8) | 1024x1536 | Instagram Stories, TikTok |
| Square (default) | 1024x1024 | Instagram square, Facebook square |
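The mapping above reduces to a single ratio check; `pickImageSize` is an illustrative name for it:

```typescript
// Aspect ratio → GPT Image size, per the table above:
// > 1.2 landscape, < 0.8 portrait, otherwise square.
function pickImageSize(width: number, height: number): string {
  const ratio = width / height;
  if (ratio > 1.2) return "1536x1024"; // landscape
  if (ratio < 0.8) return "1024x1536"; // portrait
  return "1024x1024";                  // square default
}
```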

Provider Selection

```
AZURE_IMAGE_ENDPOINT + AZURE_IMAGE_API_KEY set?
  → Azure OpenAI GPT Image 1.5   ← preferred
else OPENAI_API_KEY set?
  → OpenAI DALL-E 3              ← fallback
else
  → throw (job fails)

useStyleReference enabled + referenceBuffer available?
  → use provider's img2img / edit endpoint instead of text-to-image
  → falls back to text-to-image if the provider doesn't support it
```

Future: Flux provider

Flux 1.1 Pro (via fal.ai) produces significantly more photorealistic results and natively supports img2img (required for useStyleReference). The billing config already reserves ai_image_flux: 0.5 credits. Add BrandAssets.imageProvider: "azure" | "openai" | "flux" to allow per-tenant provider selection.


Credit Costs

| Provider | Cost per image | Credit type |
| --- | --- | --- |
| Azure GPT Image 1.5 | 2 credits | `ai_image_generation` |
| OpenAI DALL-E 3 | 2 credits | `ai_image_generation` |
| Flux 1.1 Pro (future) | 0.5 credits | `ai_image_flux` |

Design Intelligence features add zero extra credit cost — they are DB/RAG queries that enrich the prompt, not additional API calls.


Storage Path Convention

```
Bucket:            leadmetrics-media (DigitalOcean Spaces)
Path:              tenants/{tenantId}/social-posters/{socialPostId}/slide-{slideIndex}-{nanoid}.png

Media.mediaType:   "social_poster"
Media.platform:    post.platform
Media.format:      "carousel_slide" | "reel" | "story" | "static"
Media.slideIndex:  0 … N
Media.generatedBy: "social-post-designer"
Media.prompt:      full prompt string incl. intelligence injections (stored for audit)
```

HITL Gates

| Gate | Actor | When |
| --- | --- | --- |
| Copy approval | DM reviewer | Before design job is enqueued |
| Design approval | Client (dashboard) | After all slides are generated |
| Design Intelligence config | Super Admin (Manage portal) | Any time via tenant detail → Design Intelligence tab; takes effect on next design job |

Guardrails

| Rule | Enforcement |
| --- | --- |
| Credits reserved before generation | `reserveCredits()` called before first API call |
| Unused credits always released | `releaseCredits()` in both failure and partial-failure paths |
| Intelligence features are best-effort | Any intelligence loader failure logs a warning and returns null; generation continues without it |
| Intelligence loaders run in parallel | `Promise.all()` — total intelligence load time is the slowest single loader, not the sum |
| Failed slides skipped, not fatal | Single slide failure does not abort the job |
| All slides fail → revert to dm_review | `createdMediaIds.length === 0` → revert, release credits |
| Old media unlinked on rejection, never deleted | `socialPostId` → null; preserved in media library |
| Prompt stored on Media row | Full enriched prompt (including intelligence injections) stored for debugging and audit |
| designerConfig null → safe defaults | Worker applies `DESIGNER_CONFIG_DEFAULTS` when designerConfig is null |

Environment Variables

Required in apps/servers/agents/.env:

```
# Azure OpenAI GPT Image 1.5 (primary)
AZURE_IMAGE_API_KEY=...
AZURE_IMAGE_ENDPOINT=https://{resource}.cognitiveservices.azure.com/openai/deployments/gpt-image-1.5/images/generations?api-version=2024-02-01

# OpenAI DALL-E 3 (fallback)
OPENAI_API_KEY=...

# DigitalOcean Spaces (required for upload)
DO_SPACES_KEY=...
DO_SPACES_SECRET=...
DO_SPACES_BUCKET=...
DO_SPACES_ENDPOINT=...
DO_SPACES_CDN_URL=...
DO_SPACES_REGION=...
```

Error Handling

| Error | Response |
| --- | --- |
| SocialPost not found for activityId | Throw immediately |
| Insufficient credits | Revert to dm_review; throw with credits-needed message |
| Image provider not configured | Throw before any generation |
| Intelligence loader throws | Log warning, treat as null, continue generation |
| Single slide generation fails | Log error, skip slide, continue |
| Logo fetch fails | Log warning, use image without logo |
| Text composite fails | Log warning, use image without text overlay |
| All slides fail | Release credits; revert to dm_review; throw |
| Upload to Spaces fails | Propagate error; BullMQ retries the job |

Observability

The worker publishes lifecycle events to the agent_events:{tenantId} Redis channel, which persists AgentRun records and surfaces them in Manage → Dashboards → Execution Queue.

| Event | When emitted |
| --- | --- |
| agent:started | After SocialPost is confirmed to exist |
| agent:failed | Insufficient credits path — before throw |
| agent:failed | All slides failed path — before throw |
| agent:completed | After client notification is sent |

runId format: social-post-designer-{activityId}-{timestamp}

Note: a separate creditRunId (social-post-designer-{postId}-{timestamp}) is used for the credit reservation / consumption API so that credit ledger entries remain tied to the specific SocialPost, not the activityId. The two IDs are intentionally different.
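The two ID formats can be sketched as below (assuming `Date.now()`-style millisecond timestamps; the builder names are illustrative):

```typescript
// Observability runId embeds the activityId…
function buildRunId(activityId: string, now: number = Date.now()): string {
  return `social-post-designer-${activityId}-${now}`;
}

// …while the credit-ledger runId embeds the SocialPost id, so ledger
// entries stay tied to the specific post.
function buildCreditRunId(postId: string, now: number = Date.now()): string {
  return `social-post-designer-${postId}-${now}`;
}
```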


Implementation Phases

| Phase | Status | Description |
| --- | --- | --- |
| Phase 1 | ✅ Live | Base generation, logo overlay, credit tracking, carousel format fix, rejection re-run |
| Phase 2 | ✅ Live | Content-type scene mapping, platform art direction, text compositing via Sharp, full brand field usage, rejection feedback injection |
| Phase 3 | ✅ Live | Design Intelligence — schema field, 9 config fields (7 flags + visualApproach + sceneOverrides), 5 context loaders, improved channel insights extractor (InsightItem/RecommendationItem), useLLMTextRendering (bakes text into prompt, skips Sharp), three-level scene stack (resolveScene(): tenant override → platform PlatformSetting["sceneDefaults"] → hardcoded SCENE_MAP + VisualApproach modifier), admin config panel in Manage (tenant Design Intelligence tab + /system/design-defaults page), GET+PATCH /admin/v1/tenants/:tenantId/brand-assets/designer-config, GET+PATCH /admin/v1/design-defaults |
| Phase 4 | 🔲 To Build | Style Reference img2img — Flux provider integration, useStyleReference loader, edit-mode API call |
| Phase 5 | 🔲 To Build | Engagement Bias — channel analytics query, top-post prompt extraction |
| Phase 6 | 🔲 To Build | Template compositing — platform-specific layout frames (safe zones, CTA band, brand border) |

© 2026 Leadmetrics — Internal use only