Development Standard

Browser

Headless browser automation via agent-browser — Rust CLI daemon with persistent auth profiles for fast, scriptable, parallel browser work.

Workflows: 3 · Reference: 1 · Triggers: 11 · Effort: medium

The Problem

Ask a generic AI to scrape ten pages in parallel while staying logged into a site, and it either refuses or spins up a new browser instance for every page. There's no persistent auth state, no daemon to reuse across commands, and no way to fan out YAML user stories to parallel reviewers. Every headless session starts cold, logs in from scratch, and costs you the time and tokens to babysit it.

How This Skill Approaches It

Browser wraps agent-browser, a Rust CLI daemon that holds a persistent session across commands. You log in once with --headed --profile, and every subsequent run reuses those cookies and IndexedDB headlessly. Parallel scraping uses --session to spin up isolated contexts under the same daemon. The ReviewStories workflow fans out YAML user-story files to parallel UIReviewers; Automate runs parameterized recipe templates for repeatable patterns like FormFill or ScreenshotCompare. If a site has bot detection, the skill routes to Interceptor automatically.

  • Supports batch commands, network interception, device emulation, per-site profile auth (one-time headed login, headless forever after), and parallel isolated sessions via --session
  • Workflows: ReviewStories (fan out YAML user stories to parallel UIReviewers), Automate (load/run parameterized recipe templates), Update
  • Delegates to general-purpose agents with agent-browser instructions for background parallel scraping
  • Falls back to Interceptor if the site has bot detection

Not for deploy verification or UI confirmation with real Chrome (use Interceptor), simple single-URL fetching (use WebFetch), CAPTCHA or bot-detection bypass (use BrightData or Interceptor), or social-platform actor-based scraping (use Apify).

In Action

What you say to your DA, and what the Browser skill actually does.

  • You say "scrape these 15 pages in parallel and grab the pricing table from each"
    Spins up parallel agent-browser --session contexts, one per page, fans work out to background agents with batch commands, and merges results — no cold-start login overhead per session.
  • You say "run the checkout story against staging and tell me if it passes"
    Loads the YAML story from Stories/, fans steps to a UIReviewer agent via the ReviewStories workflow, runs each action and assertion, and reports pass/fail with a snapshot on failure.
  • You say "automate the FormFill recipe on the contact page"
    Runs the Automate workflow, loads the FormFill.md recipe template with your parameters injected at {PROMPT}, and executes it via agent-browser against the target URL.

Inside the Skill

The thinking, frameworks, and architecture that distinguish this skill from a generic version of the same task.

What It Does

Drives a headless browser from the command line through agent-browser, a Rust CLI daemon. Opens pages, clicks, fills forms, takes screenshots, runs JS, intercepts network requests, and emulates devices. Holds per-site auth profiles so you log in once and stay logged in for every headless run after that. Batches commands, runs sessions in parallel, and hands browser work to background agents.

The Problem

Most browser automation is slow to script, leaks state between runs, and forces a fresh login every time. You either babysit a headed browser or fight a framework that re-authenticates on every call. When you need to screenshot ten pages, scrape a logged-in site, or run several scrapes at once, the per-run startup and auth cost dominates. This skill keeps a daemon warm, persists auth per site, and lets you fan work out to parallel sessions and agents.

How It Works

Tool: agent-browser — headless Rust CLI daemon with persistent auth profiles.

If agent-browser isn't working or a site has bot detection, use the Interceptor skill instead. Interceptor is a Chrome extension with zero CDP fingerprint — passes all major bot detection checks.

Does the site need auth?

Use --profile ~/.agent-browser/profiles/<site>. If the profile already exists, auth is automatic. If it does not, run once with --headed to log in manually; every run after that can be headless.


agent-browser

Native Rust daemon. Persistent profiles for auth. Headless by default.

Quick One-Shot Commands

agent-browser open https://example.com && agent-browser screenshot /tmp/shot.png
agent-browser open https://example.com && agent-browser screenshot --full /tmp/full.png
agent-browser open https://example.com && agent-browser pdf /tmp/page.pdf

Session-Based Interaction

# 1. OPEN
agent-browser open https://example.com

# 2. WORK
agent-browser snapshot                    # a11y tree with @eN refs (for AI)
agent-browser click @e12                  # click by ref
agent-browser fill @e15 "hello"           # fill input by ref
agent-browser screenshot /tmp/shot.png    # screenshot
agent-browser eval "document.title"       # run JS

# 3. CLOSE — when done
agent-browser close

Authenticated Browsing (Per-Site Profiles)

First-time setup (headed, one-time):

# Close any running daemon first
agent-browser close --all

# Launch headed with persistent profile — log in manually
agent-browser --headed --profile ~/.agent-browser/profiles/<site> open https://example.com

# After login completes, all future runs reuse the profile headlessly

Subsequent runs (headless, automatic):

agent-browser --profile ~/.agent-browser/profiles/<site> open https://example.com
# Auth is automatic — cookies, IndexedDB, cache all persist

To add a new site: Close daemon, run --headed --profile ~/.agent-browser/profiles/<name> once, log in, done.
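The headed-once, headless-after decision can be sketched as a small shell helper. This is a minimal sketch, not part of the skill: profile_cmd and PROFILE_ROOT are hypothetical names, and it assumes that an existing profile directory means the one-time login has already happened. It only prints the commands from this section rather than running them.

```shell
# Hypothetical helper (not part of agent-browser): print the command(s)
# to run for a site, adding --headed only when no profile exists yet.
PROFILE_ROOT="${PROFILE_ROOT:-$HOME/.agent-browser/profiles}"

profile_cmd() {
  site="$1"
  url="$2"
  profile="$PROFILE_ROOT/$site"
  if [ -d "$profile" ]; then
    # Assumption: an existing profile directory means login was done before.
    printf 'agent-browser --profile %s open %s\n' "$profile" "$url"
  else
    # First run: close any running daemon, then log in with a visible window.
    printf 'agent-browser close --all\n'
    printf 'agent-browser --headed --profile %s open %s\n' "$profile" "$url"
  fi
}
```

Calling profile_cmd mysite https://example.com prints one headless command when the profile directory exists, or the close and headed pair when it does not.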

Auth Vault (Alternative)

agent-browser auth save mysite --url https://example.com --username user --password-stdin
agent-browser auth login mysite    # auto-fills login form
agent-browser auth list            # show saved profiles

Batch Execution

# Send multiple commands in one shot (fewer tool calls = fewer tokens)
echo '[["open","https://example.com"],["snapshot"],["click","@e12"]]' | agent-browser batch
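For more than a couple of pages, the batch payload can be generated instead of typed by hand. The sketch below assumes the payload shape shown above (a JSON array of string arrays); batch_payload is a hypothetical helper name, and it does not escape quotes or backslashes inside URLs.

```shell
# Hypothetical helper: emit an open + snapshot pair for each URL argument,
# in the JSON array-of-arrays shape used by the batch example above.
# Limitation: URLs containing double quotes or backslashes are not escaped.
batch_payload() {
  sep=""
  printf '['
  for url in "$@"; do
    printf '%s["open","%s"],["snapshot"]' "$sep" "$url"
    sep=","
  done
  printf ']\n'
}

# Usage (requires agent-browser on PATH):
#   batch_payload https://example.com https://example.org | agent-browser batch
```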

Advanced Features

# Connect to already-running Chrome
agent-browser --auto-connect snapshot

# Network interception
agent-browser route "**/*.{png,jpg}" abort     # block images
agent-browser route "https://api.com/*" mock '{"data":"test"}'

# Device emulation
agent-browser --device "iPhone 15" open https://example.com

# Session persistence (cookies + localStorage by name)
agent-browser --session-name myapp open https://example.com

agent-browser Rules

  • Daemon model — first command starts daemon, subsequent commands connect instantly.
  • Refs use @eN syntax: write @e12, not e12.
  • Profiles persist everything — cookies, IndexedDB, cache, localStorage.
  • Close with agent-browser close, or agent-browser close --all to kill the daemon.
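Because a bare e12 is an easy mistake, a script can check ref syntax before passing it to click or fill. This is a minimal sketch assuming refs are always @e followed by one or more digits, as in the examples above; is_ref is a hypothetical helper, not an agent-browser command.

```shell
# Hypothetical guard: succeed only for refs of the form @e<digits>.
is_ref() {
  case "$1" in
    @e | @e*[!0-9]*) return 1 ;;  # nothing after @e, or a non-digit after it
    @e[0-9]*) return 0 ;;         # @e followed by digits only
    *) return 1 ;;                # anything else, including a bare e12
  esac
}

# Usage (requires agent-browser on PATH):
#   is_ref "$ref" && agent-browser click "$ref"
```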

Delegating Browser Work to Agents

When you need parallel or background browser work (scraping multiple pages, monitoring), spawn general-purpose agents with browser instructions. No dedicated browser agent type needed — this skill IS the expertise.

Agent(subagent_type="general-purpose", prompt="
  Use agent-browser CLI for all browser work.
  Commands: open <url>, snapshot, click @eN, fill @eN 'text', screenshot /path.
  For authenticated sites: --profile ~/.agent-browser/profiles/<site>
  Refs use @eN syntax from snapshots.
  [your specific task instructions here]
")

For parallel isolation, each agent uses --session <name>:

Agent 1: agent-browser --session scrape1 open https://site-a.com
Agent 2: agent-browser --session scrape2 open https://site-b.com
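Without agents, the same isolation can be sketched from a single shell using background jobs. This is an illustration only: scrape_parallel and the AGENT_BROWSER override are hypothetical, and it assumes that snapshot writes the accessibility tree to stdout.

```shell
# Hypothetical fan-out: one isolated --session per URL, run as background jobs.
# AGENT_BROWSER can be overridden (for example with a stub) for dry runs.
scrape_parallel() {
  outdir="$1"
  shift
  n=0
  for url in "$@"; do
    n=$((n + 1))
    name="scrape$n"
    (
      # Assumption: `snapshot` prints the a11y tree to stdout.
      "${AGENT_BROWSER:-agent-browser}" --session "$name" open "$url" &&
        "${AGENT_BROWSER:-agent-browser}" --session "$name" snapshot > "$outdir/$name.txt"
    ) &
  done
  wait  # block until every background session has finished
}

# Usage (requires agent-browser on PATH):
#   scrape_parallel /tmp/snapshots https://site-a.com https://site-b.com
```

Each URL gets its own session name, so the jobs do not share page state even though they talk to the same daemon.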

Fallback: If agent-browser fails or the site has bot detection, use the Interceptor skill instead.

Legacy built-in agents — DEPRECATED, do not invoke. BrowserAgent and UIReviewer are Claude Code built-ins whose internals cannot be modified; they run browser automation that PAI no longer uses. Route all browser work through the Interceptor skill (verification, authenticated flows) or agent-browser (headless scraping).


Stories — YAML User Story Validation

Define user stories in YAML and validate them in parallel with UIReviewer agents.

Directory: skills/Browser/Stories/

name: App Name
url: https://example.com
stories:
  - name: Story name
    steps:
      - action: click
        target: "LLM-readable description"
    assertions:
      - type: snapshot_contains
        text: "expected text"

Run with: "review stories" or "run stories in HackerNews.yaml"


Recipes — Parameterized Templates

Reusable Markdown templates with {PROMPT} injection.

Directory: skills/Browser/Recipes/

Recipe               | Description             | Tool
SummarizePage.md     | Extract content summary | BrowserAgent
ScreenshotCompare.md | Before/after comparison | agent-browser
FormFill.md          | Fill form fields        | agent-browser

Run with: "automate SummarizePage for https://example.com"
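The {PROMPT} injection amounts to plain text substitution. How the Automate workflow performs it internally is not documented here, so the following is only a sketch of the idea: render_recipe is a hypothetical helper that replaces every literal {PROMPT} in a template file with the given text. Note that awk interprets backslash escapes in the injected value.

```shell
# Hypothetical renderer: replace each literal {PROMPT} in a template file.
# $1 = path to the recipe template, $2 = text to inject.
render_recipe() {
  awk -v prompt="$2" '{
    out = ""
    # index() does a literal search, so the braces are not treated as regex.
    while ((i = index($0, "{PROMPT}")) > 0) {
      out = out substr($0, 1, i - 1) prompt
      $0 = substr($0, i + 8)   # 8 = length of "{PROMPT}"
    }
    print out $0
  }' "$1"
}

# Usage:
#   render_recipe skills/Browser/Recipes/FormFill.md "https://example.com/contact"
```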


Workflows · 3

  1. Workflows/ReviewStories.md

    review stories, run stories, ui review, validate stories

  2. Workflows/Automate.md

    automate, recipe, template, or a recipe name

  3. Workflows/Update.md

    update, check version

How to Invoke

Say any of these to your DA and PAI activates the Browser skill automatically:

  • "headless browser"
  • "batch scrape"
  • "fast screenshot"
  • "dev server test"
  • "parallel browser"
  • "background automation"
  • "extract data"
  • "review stories"
  • "automate recipe"
  • "batch screenshots"
  • "scrape multiple pages in parallel"

Or invoke explicitly:

Skill("Browser")

References · 1

Auxiliary files the skill loads at runtime — frameworks, guides, configs.

  • README
