Creative Standard

Art

Static visual content across 20+ formats via Flux, Nano Banana Pro (Gemini 3 Pro), and GPT-Image-2 (OpenAI flagship).

Effort: medium

The Problem

Ask a generic AI to generate a blog header or architecture diagram and you get something that technically exists. The composition is generic, the style is inconsistent across posts, the text in diagrams is garbled, and there's no concept of where the image goes or what format it needs to be. The model dumps the file wherever it wants, sends a social thumbnail with a transparent background (which renders as black on X), and has no idea your blog uses a sepia background that needs specific handling.

How This Skill Approaches It

Each of the 20+ output types routes to a specific workflow: Essay for blog headers, TechnicalDiagrams for architecture, Frameworks for 2x2 matrices, D3Dashboards for interactive charts, Comics for sequential panels, and so on. Generate.ts enforces this: it won't run without a --workflow flag (or the explicit --freeform-confirmed opt-out), so the technique, palette, and composition rules encoded in each workflow file actually get applied. Model selection is job-specific: gpt-image-2 for text-heavy work (stat cards, aphorism cards, taxonomies) because it leads the Arena leaderboards on text rendering; nano-banana-pro for editorial illustration because PREFERENCES.md pins the aesthetic. Blog headers always use --thumbnail, which produces both a transparent PNG for the body and a sepia-backed thumbnail for social. All output stages to ~/Downloads/ first; nothing goes to a project directory until you've reviewed it.

Workflows: Essay, D3Dashboards, Visualize, Mermaid, TechnicalDiagrams, Taxonomies, Timelines, Frameworks, Comparisons, AnnotatedScreenshots, RecipeCards, Aphorisms, Maps, Stats, Comics, YouTubeThumbnailChecklist, AdHocYouTubeThumbnail, CreatePAIPackIcon, LogoWallpaper, EmbossedLogoWallpaper, RemoveBackground
SKILLCUSTOMIZATIONS loads PREFERENCES.md, CharacterSpecs.md, SceneConstruction.md from user customization dir
--remove-bg flag returns transparent PNG
Up to 14 reference images per Gemini API request (5 human, 6 object)
Output staged to ~/Downloads/ before project copy
Nano Banana Pro --size tier 1K/2K/4K + separate --aspect-ratio

Not for video or animation (use Remotion), or web UI design and integrated frontend layout (use Webdesign)

In Action

What you say to your DA, and what the Art skill actually does.

You say "make a header image for my post about how AI systems learn from feedback"

Routes to the Essay workflow, composes a prompt following the workflow's 8-step template, runs Generate.ts with --workflow=Essay --thumbnail, produces a transparent PNG for the blog body and a sepia-backed thumb for social frontmatter, both staged to ~/Downloads/ for review.

You say "diagram the SPQA pattern as a technical architecture visual"

Routes to the TechnicalDiagrams workflow, uses gpt-image-2 for strong text rendering, generates a structured architecture diagram with labeled components, outputs to ~/Downloads/ before any copy to the project.

You say "create a stat card showing 41k GitHub stars for Fabric"

Routes to the Stats workflow, selects gpt-image-2 for text-heavy layout, generates a clean stat card with the number as the visual anchor, stages to ~/Downloads/ for review.

Inside the Skill

The thinking, frameworks, and architecture that distinguish this skill from a generic version of the same task.

What It Does

Generates static visual content across 20+ formats — blog headers, technical and architecture diagrams, frameworks, taxonomies, timelines, comparisons, stat cards, comics, icons, wallpapers, D3 charts, Mermaid diagrams — using Flux, Nano Banana Pro (Gemini 3 Pro), and GPT-Image-2. Every request routes through a named workflow that encodes the technique and palette, output stages to ~/Downloads/ for review first, and blog headers ship both a transparent inline version and an opaque social thumbnail.

The Problem

The bare image model produces inconsistent, off-style output when handed a freeform prompt — one session shipped 12 rejected diagrams because the prompt skipped the workflow that holds the composition rules. Different formats need different models (text-heavy cards want GPT-Image-2; editorial headers want Nano Banana Pro), different size formats, and different transparency handling. Without a fixed routing-and-staging discipline, you get wrong sizes, opaque headers that bleed over the page background, and images pushed straight to a repo before anyone looked at them. This skill makes the workflow, the model choice, and the Downloads-first review mandatory in code, not just in markdown.

How It Works

A complete visual content system for illustrations, diagrams, and other static visuals. Each request picks a matching workflow file first, follows its prompt template, then calls Generate.ts with --workflow=<name> plus model/size/output flags. Two layers enforce that the workflow was followed (Generate.ts itself and the ArtWorkflowGuard.hook.ts PreToolUse hook), output always lands in ~/Downloads/ for preview, and blog headers run with --thumbnail to produce both the transparent PNG and the sepia-backed social thumbnail.

STRUCTURAL ENFORCEMENT — `--workflow=<name>` IS REQUIRED

This rule used to be markdown-only and was silently ignored, producing 12 rejected diagrams in one session (incident 2026-04-30, see ISA MEMORY/WORK/20260430-180000_art-skill-freeform-enforcement). It now lives in code.

Two layers enforce it:

Generate.ts itself refuses to run unless you pass --workflow=<name> (or the explicit --freeform-confirmed opt-out). It exits non-zero with the workflow lookup table.
ArtWorkflowGuard.hook.ts (PreToolUse Bash) blocks any Bash command containing Art/Tools/Generate.ts without --workflow= or --freeform-confirmed, with exit code 2 and the same lookup table.
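
The hook layer described above can be pictured as a check on the raw command string. The following is a minimal sketch, not the actual ArtWorkflowGuard.hook.ts: the function name `guardBashCommand`, the `GuardResult` type, and the requirement that `--workflow=` carry a non-empty value are assumptions made for illustration. Only the matched path, the two flags, and exit code 2 come from the text.

```typescript
// Hypothetical sketch of the hook-layer check; the real hook may differ.
type GuardResult = { exitCode: 0 | 2; reason: string };

// Path fragment the guard looks for, as stated in the text above.
const GENERATE_PATH = "Art/Tools/Generate.ts";

function guardBashCommand(command: string): GuardResult {
  // Commands that never touch Generate.ts are outside the guard's scope.
  if (command.indexOf(GENERATE_PATH) === -1) {
    return { exitCode: 0, reason: "not an Art generation command" };
  }
  // Assumption: --workflow= must carry a value, e.g. --workflow=Essay.
  const hasWorkflow = /(^|\s)--workflow=\S+/.test(command);
  // The explicit opt-out is the only other way through.
  const hasFreeform = /(^|\s)--freeform-confirmed(\s|$)/.test(command);
  if (hasWorkflow) return { exitCode: 0, reason: "workflow declared" };
  if (hasFreeform) return { exitCode: 0, reason: "freeform opt-out" };
  // Exit code 2 is the blocking code named in the text.
  return { exitCode: 2, reason: "missing --workflow=<name> or --freeform-confirmed" };
}
```

Checking the string rather than parsed arguments keeps the guard independent of how Generate.ts parses its flags, which is presumably why a second check inside Generate.ts still exists.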

The flow that works: read the matching workflow file → follow its prompt template → invoke Generate.ts with --workflow=<that-workflow-name> plus your model/prompt/size flags. The --workflow=<name> flag is your explicit assertion "I read the workflow and followed it."

The flow that's blocked: composing a freeform prompt and shipping it directly to Generate.ts. Both layers above will refuse.

Most Common Failure Mode (don't repeat it)

Reading the workflow's caps-warning, mentally noting "do the workflow," then composing a Bash command with your own prompt anyway because it feels faster. Stop. The workflow templates encode the technique, palette, composition rules, and validation gate the bare model fails to honor. Skipping them produced — verbatim — "absolute fucking ass" diagrams. Read the workflow file FIRST. Compose the prompt FROM the template. Pass --workflow=<name> so the gate can see you did it.

Workflow → command (copy-paste)

bun ~/.claude/skills/Art/Tools/Generate.ts \
  --workflow=<WorkflowName> \
  --model nano-banana-pro \
  --prompt "..." \
  --size 2K \
  --aspect-ratio 16:9 \
  --output ~/Downloads/<filename>.png

<WorkflowName> MUST match a file under Workflows/ (without .md):

Routing rules — pick a workflow FIRST, before writing any prompt:

| Request shape | Required workflow |
| --- | --- |
| Blog header / editorial essay illustration | `Workflows/Essay.md` (Steps 1–8 in order, no skipping) |
| Mermaid diagram | `Workflows/Mermaid.md` |
| Technical / architecture diagram | `Workflows/TechnicalDiagrams.md` |
| Framework / 2x2 / matrix | `Workflows/Frameworks.md` |
| D3 dashboard / chart | `Workflows/D3Dashboards.md` |
| Taxonomy / hierarchy | `Workflows/Taxonomies.md` |
| Timeline | `Workflows/Timelines.md` |
| Comparison | `Workflows/Comparisons.md` |
| Stat card | `Workflows/Stats.md` |
| Aphorism / quote card | `Workflows/Aphorisms.md` |
| Comic panel | `Workflows/Comics.md` |
| YouTube thumbnail | `Workflows/AdHocYouTubeThumbnail.md` or `Workflows/YouTubeThumbnailChecklist.md` |
| PAI pack icon | `Workflows/CreatePAIPackIcon.md` |
| Brand-logo wallpaper | `Workflows/LogoWallpaper.md` |
| Recipe card | `Workflows/RecipeCards.md` |
| Map / conceptual map | `Workflows/Maps.md` |
| Annotated screenshot | `Workflows/AnnotatedScreenshots.md` |
| Background removal only | `Workflows/RemoveBackground.md` |
| Embossed logo wallpaper | `Workflows/EmbossedLogoWallpaper.md` |
| Generic visualization (none of the above fit) | `Workflows/Visualize.md` |
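
The requirement that the workflow name match a file under Workflows/ can be sketched as a lookup against the 21 names in the routing table. This is an illustrative helper, not code from Generate.ts: `isWorkflowName` and `workflowFile` are hypothetical names, and treating the match as case-sensitive is an assumption.

```typescript
// Workflow names copied from the routing table; the helpers are hypothetical.
const WORKFLOWS = [
  "AdHocYouTubeThumbnail", "AnnotatedScreenshots", "Aphorisms", "Comics",
  "Comparisons", "CreatePAIPackIcon", "D3Dashboards", "EmbossedLogoWallpaper",
  "Essay", "Frameworks", "LogoWallpaper", "Maps", "Mermaid", "RecipeCards",
  "RemoveBackground", "Stats", "Taxonomies", "TechnicalDiagrams", "Timelines",
  "Visualize", "YouTubeThumbnailChecklist",
] as const;

type WorkflowName = (typeof WORKFLOWS)[number];

// Assumption: names are matched exactly, including case.
function isWorkflowName(name: string): name is WorkflowName {
  return (WORKFLOWS as readonly string[]).indexOf(name) !== -1;
}

// Maps a validated name to its workflow file, or fails with the valid list.
function workflowFile(name: string): string {
  if (!isWorkflowName(name)) {
    throw new Error(`Unknown workflow "${name}". Valid: ${WORKFLOWS.join(", ")}`);
  }
  return `Workflows/${name}.md`;
}
```

Failing with the full list of valid names mirrors the behavior described earlier, where a refused call prints the workflow lookup table.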

The ONLY exception: the user explicitly says "freeform" / "skip the workflow" / "just run Generate.ts directly with this prompt: ...". In that case, pass --freeform-confirmed to Generate.ts (which logs the explicit opt-out to stderr for audit). Without that explicit instruction from the user, ALWAYS pick the matching workflow and pass --workflow=<name> — both Generate.ts and ArtWorkflowGuard.hook.ts will refuse the call otherwise.

If no workflow matches the request, stop and surface to the user before generating — propose either (a) the closest existing workflow, (b) using Visualize.md as the generic catch-all, or (c) creating a new workflow first via the CreateSkill skill. Do not improvise.

Core Aesthetic

Default: Production-quality concept art style appropriate for editorial and technical content.

User customization defines specific aesthetic preferences including:

Visual style and influences
Line treatment and rendering approach
Color palette and wash technique
Character design specifications
Scene composition rules

Load from: ~/.claude/PAI/USER/CUSTOMIZATIONS/SKILLS/Art/PREFERENCES.md

Reference Images

User customization may include reference images for consistent style.

Check ~/.claude/PAI/USER/CUSTOMIZATIONS/SKILLS/Art/PREFERENCES.md for:

Reference image locations
Style examples by use case
Character and scene reference guidance

Usage: Before generating images, load relevant user-provided references to match their preferred style.

Image Generation

Default model: Check user customization at SKILLCUSTOMIZATIONS/Art/PREFERENCES.md

Fallback: nano-banana-pro (Gemini 3 Pro)

Model-Specific Size Requirements

Each model accepts different --size formats. Using the wrong format causes validation errors.

| Model | `--size` format | Valid values | Default |
| --- | --- | --- | --- |
| `flux` | Aspect ratio | `1:1`, `16:9`, `3:2`, `2:3`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `21:9` | `16:9` |
| `nano-banana` | Aspect ratio | `1:1`, `16:9`, `3:2`, `2:3`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `21:9` | `16:9` |
| `nano-banana-pro` | Resolution tier | `1K`, `2K`, `4K` (also accepts `--aspect-ratio` separately) | `2K` |
| `gpt-image-2` | Pixel dimensions | `1024x1024`, `1536x1024`, `1024x1536`, `2048x2048`, `auto` (also accepts `--quality` low/medium/high/auto) | `1024x1024` |
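
The size table can be read as a per-model validation rule. The sketch below encodes the table's values directly; the names `SIZE_RULES` and `resolveSize` are hypothetical, and the real validation in Generate.ts may be structured differently.

```typescript
// Values copied from the size table; the structure and names are illustrative.
type Model = "flux" | "nano-banana" | "nano-banana-pro" | "gpt-image-2";

const ASPECT_RATIOS: readonly string[] = [
  "1:1", "16:9", "3:2", "2:3", "3:4", "4:3", "4:5", "5:4", "9:16", "21:9",
];

const SIZE_RULES: Record<Model, { valid: readonly string[]; fallback: string }> = {
  "flux": { valid: ASPECT_RATIOS, fallback: "16:9" },
  "nano-banana": { valid: ASPECT_RATIOS, fallback: "16:9" },
  "nano-banana-pro": { valid: ["1K", "2K", "4K"], fallback: "2K" },
  "gpt-image-2": {
    valid: ["1024x1024", "1536x1024", "1024x1536", "2048x2048", "auto"],
    fallback: "1024x1024",
  },
};

// Returns the size to use, or throws when the format does not fit the model.
function resolveSize(model: Model, size?: string): string {
  const rule = SIZE_RULES[model];
  if (size === undefined) return rule.fallback;
  if (rule.valid.indexOf(size) === -1) {
    throw new Error(`--size ${size} is invalid for ${model}; use one of: ${rule.valid.join(", ")}`);
  }
  return size;
}
```

For example, `resolveSize("nano-banana-pro", "16:9")` throws, which is the mistake the table warns about: an aspect ratio passed where a resolution tier is expected.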

gpt-image-1 is DEPRECATED per OpenAI docs and is rejected by Generate.ts with a clear error message. There is no gpt-image-1.5 or gpt-image-2.5 — earlier versions of this skill referenced those as fallbacks; they do not exist. The OpenAI image lineup as of 2026-05-04 is exactly: gpt-image-2 (current) and gpt-image-1 (deprecated).

Model Selection — when to pick which

Three first-class models are wired into Generate.ts. PREFERENCES.md (if present) pins the user's default; in absence of a pin, pick by job:

| Job | Recommended model | Why |
| --- | --- | --- |
| Editorial illustration / blog header (default) | `nano-banana-pro` | Best composition fidelity for the user's editorial aesthetic; PREFERENCES.md typically pins it. |
| Text-heavy work: stat cards, framework diagrams, taxonomies, timelines, aphorism cards | `gpt-image-2` | Currently #1 across all Image Arena leaderboards (Arena.ai, 2026-05-04): text-to-image margin +242 Elo, single-image edit +125, multi-image edit +90. Strongest text rendering on the market right now. |
| Two-side bake-off for high-stakes editorial work | `compare` (runs both `gpt-image-2` + `nano-banana` in parallel) | Two interpretations of the same brief; pick the better one. See `Workflows/Essay.md`. |
| Stylistic variety / non-photoreal / iteration speed | `flux` or `nano-banana` | Different aesthetic register; `flux` is crisper for technical illustration. |

Arena leaderboard sweeps measure aesthetic preference at scale, not editorial style fit. They are a strong quality signal, not a default-override; respect PREFERENCES.md when it exists.
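
One way to read the selection rules is as a small lookup with the PREFERENCES.md pin taking priority. This is a sketch under stated assumptions: the `Job` categories and `pickModel` are hypothetical names, the pin is assumed to override every job type, and `flux` is chosen arbitrarily where the table allows either `flux` or `nano-banana`.

```typescript
// Hypothetical job categories that paraphrase the rows of the table.
type Job = "editorial" | "text-heavy" | "bake-off" | "stylistic";
type ModelChoice = "nano-banana-pro" | "gpt-image-2" | "compare" | "flux" | "nano-banana";

const MODEL_BY_JOB: Record<Job, ModelChoice> = {
  "editorial": "nano-banana-pro",
  "text-heavy": "gpt-image-2",
  "bake-off": "compare",
  // Assumption: the table allows flux or nano-banana here; flux is picked arbitrarily.
  "stylistic": "flux",
};

// A pinned default (as PREFERENCES.md would supply) wins; otherwise pick by job.
function pickModel(job: Job, pinned?: ModelChoice): ModelChoice {
  return pinned !== undefined ? pinned : MODEL_BY_JOB[job];
}
```

Keeping the pin check ahead of the lookup reflects the guidance that leaderboard results are a quality signal, not a default-override.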

Note: nano-banana-pro uses --size for resolution quality and a separate --aspect-ratio flag for aspect ratio (defaults to 16:9).

🚨 CRITICAL: Always Output to Downloads First

ALL generated images MUST go to ~/Downloads/ first for preview and selection.

Never output directly to a project's public/images/ directory. User needs to review images in Preview before they're used.

Workflow:

1. Generate to ~/Downloads/[descriptive-name].png
2. User reviews in Preview
3. If approved, THEN copy to final destination (e.g., cms/public/images/)
4. Create WebP and thumbnail versions at final destination

# CORRECT - Output to Downloads for preview (--workflow is required)
bun run ${CLAUDE_SKILL_DIR}/Tools/Generate.ts \
  --workflow=Essay \
  --model nano-banana-pro \
  --prompt "[PROMPT]" \
  --size 2K \
  --aspect-ratio 1:1 \
  --thumbnail \
  --output ~/Downloads/blog-header-concept.png

# After approval, copy to final location (substitute your blog/site path)
cp ~/Downloads/blog-header-concept.png ~/your-site/public/images/
cp ~/Downloads/blog-header-concept-thumb.png ~/your-site/public/images/
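
The Downloads-first rule can also be checked mechanically before a command runs. The helper below is a hypothetical sketch (neither `expandHome` nor `isStagedInDownloads` is named in the skill) and it works on path strings only, with the home directory passed in explicitly as an assumption to keep it free of environment lookups.

```typescript
// Hypothetical helpers; string-based only, no filesystem access.
function expandHome(path: string, home: string): string {
  if (path === "~") return home;
  return path.indexOf("~/") === 0 ? home + path.slice(1) : path;
}

// True only when the output path sits inside <home>/Downloads/.
function isStagedInDownloads(outputPath: string, home: string): boolean {
  const root = home.replace(/\/+$/, ""); // tolerate a trailing slash on home
  const expanded = expandHome(outputPath, root);
  // Reject ".." segments so a path cannot start in Downloads and escape it.
  const hasParentHop = expanded.split("/").indexOf("..") !== -1;
  return !hasParentHop && expanded.indexOf(root + "/Downloads/") === 0;
}
```

A caller could refuse to run generation when this returns false, leaving the copy to the project directory as a separate step after review.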

Multiple Reference Images (Character/Style Consistency)

For improved character or style consistency, use multiple --reference-image flags:

# Multiple reference images for better likeness (--workflow is still required)
bun run ${CLAUDE_SKILL_DIR}/Tools/Generate.ts \
  --workflow=<WorkflowName> \
  --model nano-banana-pro \
  --prompt "Person from references at a party..." \
  --reference-image face1.jpg \
  --reference-image face2.jpg \
  --reference-image face3.jpg \
  --size 2K \
  --aspect-ratio 16:9 \
  --output ~/Downloads/character-scene.png

API Limits (Gemini):

Up to 5 human reference images
Up to 6 object reference images
Maximum 14 total reference images per request
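
These limits can be checked before a request is sent. The sketch below is illustrative: `referenceLimitErrors` is a hypothetical name, and the `"other"` kind is an assumption added to account for references that are neither human nor object, since the human and object limits alone sum to 11 rather than 14.

```typescript
// Assumption: "other" covers references that are neither human nor object.
type RefKind = "human" | "object" | "other";

// Limits as listed above.
const REF_LIMITS = { human: 5, object: 6, total: 14 };

// Returns one message per violated limit; an empty array means the set is fine.
function referenceLimitErrors(kinds: RefKind[]): string[] {
  const humans = kinds.filter((k) => k === "human").length;
  const objects = kinds.filter((k) => k === "object").length;
  const errors: string[] = [];
  if (humans > REF_LIMITS.human) {
    errors.push(`${humans} human references exceeds the limit of ${REF_LIMITS.human}`);
  }
  if (objects > REF_LIMITS.object) {
    errors.push(`${objects} object references exceeds the limit of ${REF_LIMITS.object}`);
  }
  if (kinds.length > REF_LIMITS.total) {
    errors.push(`${kinds.length} total references exceeds the limit of ${REF_LIMITS.total}`);
  }
  return errors;
}
```

Collecting every violation instead of stopping at the first lets the caller report all problems with a reference set in one pass.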

API keys in: ${PAI_DIR}/.env

Examples

Example 1: Blog header image

User: "create a header for my AI agents post"
→ Invokes the Essay workflow
→ Generates charcoal sketch prompt
→ Creates image with architectural aesthetic
→ Saves to ~/Downloads/ for preview
→ After approval, copies to public/images/

Example 2: Technical architecture diagram

User: "make a diagram showing the SPQA pattern"
→ Invokes the TechnicalDiagrams workflow
→ Creates structured architecture visual
→ Outputs PNG with consistent styling

Example 3: Comparison visualization

User: "visualize humans vs AI decision-making"
→ Invokes the Comparisons workflow
→ Creates side-by-side visual
→ Charcoal sketch with labeled elements

Example 4: PAI pack icon

User: "create icon for the skill system pack"
→ Invokes the CreatePAIPackIcon workflow
→ Reads workflow from Workflows/CreatePAIPackIcon.md
→ Generates 1K image with --remove-bg for transparency
→ Resizes to 256x256 RGBA PNG
→ Outputs to ~/Downloads/ for preview
→ After approval, copies to ${PROJECTS_DIR}/PAI/Packs/icons/

Gotchas

Always output to ~/Downloads/ first — NEVER directly to project directories. User must preview before use. Multiple past failures from pushing wrong images directly to repos.
Verify image dimensions match target use case before claiming done. Social media previews, blog headers, and thumbnails have different size requirements. A header that works on the blog may break OG/social previews.
nano-banana-pro uses --size for resolution (1K/2K/4K) and SEPARATE --aspect-ratio flag. Don't pass aspect ratio values to --size.
Reference images: max 5 human, 6 object, 14 total per request (Gemini API limit).
After generating, use Read tool to visually confirm the image before reporting success. "Generated successfully" means nothing if you haven't looked at it.
When asked to use a specific image URL or file, use EXACTLY that asset. Don't substitute similar images. Past rating-1 failures from using wrong image assets.
--remove-bg may produce black backgrounds instead of transparency. Always verify transparent PNG output visually before deploying.
--remove-bg is unsafe for thin-linework technical diagrams. rembg classifies thin black ink on a light field as "background" and strips it, leaving a near-empty ghost. Documented 2026-05-11 on the free-will flowchart. Mitigations: (a) prompt for thick saturated linework first so rembg has a strong signal, or (b) skip --remove-bg entirely when the destination background matches the image's background (blog page is sepia #EAE9DF — opaque sepia diagram on sepia page composites with zero visible seam, no alpha needed).
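
The two --remove-bg mitigations in the last gotcha amount to a small decision rule. The following sketch is one possible reading, not logic taken from the skill: `decideBackgroundHandling`, the `BgContext` fields, and the decision labels are hypothetical, and background colors are compared as hex strings for simplicity.

```typescript
// Hypothetical decision labels; the skill text describes the cases in prose.
type BgDecision = "skip-remove-bg" | "regenerate-with-thicker-lines" | "remove-bg";

interface BgContext {
  thinLinework: boolean;          // e.g. a thin-ink technical diagram
  imageBackground: string;        // hex color of the generated image's background
  destinationBackground: string;  // hex color of the page it will sit on
}

// Sepia page color quoted in the gotcha above.
const BLOG_SEPIA = "#EAE9DF";

function decideBackgroundHandling(ctx: BgContext): BgDecision {
  const sameBackground =
    ctx.imageBackground.trim().toLowerCase() ===
    ctx.destinationBackground.trim().toLowerCase();
  // Mitigation (b): matching backgrounds composite cleanly, so no alpha is needed.
  if (sameBackground) return "skip-remove-bg";
  // Mitigation (a): thin linework gets stripped, so regenerate with thicker lines first.
  if (ctx.thinLinework) return "regenerate-with-thicker-lines";
  return "remove-bg";
}
```

Even when the result is "remove-bg", the earlier gotcha still applies: the transparent output needs a visual check before deploying.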

Workflows · 21

01 AdHocYouTubeThumbnail Workflows/AdHocYouTubeThumbnail.md
02 AnnotatedScreenshots Workflows/AnnotatedScreenshots.md
03 Aphorisms Workflows/Aphorisms.md
04 Comics Workflows/Comics.md
05 Comparisons Workflows/Comparisons.md
06 CreatePAIPackIcon Workflows/CreatePAIPackIcon.md
07 D3Dashboards Workflows/D3Dashboards.md
08 EmbossedLogoWallpaper Workflows/EmbossedLogoWallpaper.md
09 Essay Workflows/Essay.md
10 Frameworks Workflows/Frameworks.md
11 LogoWallpaper Workflows/LogoWallpaper.md
12 Maps Workflows/Maps.md
13 Mermaid Workflows/Mermaid.md
14 RecipeCards Workflows/RecipeCards.md
15 RemoveBackground Workflows/RemoveBackground.md
16 Stats Workflows/Stats.md
17 Taxonomies Workflows/Taxonomies.md
18 TechnicalDiagrams Workflows/TechnicalDiagrams.md
19 Timelines Workflows/Timelines.md
20 Visualize Workflows/Visualize.md
21 YouTubeThumbnailChecklist Workflows/YouTubeThumbnailChecklist.md

How to Invoke

Say any of these to your DA and PAI activates the Art skill automatically:

"art"
"illustration"
"diagram"
"flowchart"
"infographic"
"header image"
"thumbnail"
"visualize"
"generate image"
"mermaid"
"architecture diagram"
"comic"
"icon"
"blog art"

Or invoke explicitly:

Skill("Art")
