How to Build a Screenshot Agent for Claude Code

Claude Code supports custom agents defined as markdown files in .claude/agents/. Each agent has a name, a description, a model, and a set of instructions. Any Claude Code session can spawn them. Once an agent exists, it’s available everywhere.

The screenshot agent uses that system to capture web pages and local HTML files on demand. Playwright handles the browser automation. Chromium runs headlessly in the background.

What it’s for

Three use cases that come up constantly:

Verifying deploys. After pushing a site update, you want to see what it actually looks like, not just whether the build succeeded. The agent can capture the live URL immediately after deploy.

Generating blog header images. Point it at any page or local HTML file. It captures a full-width crop at the right dimensions for a header. No Figma, no screenshots from a browser window.

Before/after comparisons. Capture what a page looks like before making changes, then again after. Useful when handing off visual context to another session or to a collaborator.

The agent definition

The agent lives at ~/.claude/agents/screenshot.md. The frontmatter looks like this:

---
name: screenshot
description: Takes screenshots of URLs or local HTML files using Playwright
model: claude-sonnet-4-5
---

The instructions section tells the agent what script to run, what arguments it accepts, and how to handle retina mode, full-page vs. cropped, and local file paths vs. live URLs.

The underlying script

The script is about 75 lines of Python using Playwright’s async_api. The key pieces:

Accepts a URL or a local file path as input
Launches a headless Chromium instance
Sets viewport dimensions based on the target format (default, retina, blog header)
Captures either the full page or a fixed-height crop
Saves the file with a timestamped name so nothing gets overwritten

Playwright handles the browser lifecycle cleanly. For local files, the script converts the file path to a file:// URL before loading it.

Basic usage

Take a screenshot of https://codeforcreatives.com

Screenshot this local file: /Users/sash/project/preview.html

Take a retina screenshot of the homepage for a blog header

The agent figures out which mode to use based on the request. It returns the file path when it’s done.

Why this matters

The screenshot agent is a small thing, but it illustrates something about how custom agents work. There’s no approval step, no environment switching, no context loss. Any session can say “take a screenshot of this” and the capability is just there.

Defining custom agents is one of the more underused features of Claude Code. Most people know about CLAUDE.md and skills, but agents are a separate layer. An agent is a specialized sub-process with its own model, its own instructions, and its own scope. You can have an agent for screenshots, one for running tests, one for database queries. Each one is a markdown file in a folder.

Once an agent exists, any session can spawn it.

tags: claude-code, playwright, screenshot, automation, custom agents

A note from Alex: hi i’m alex - i run code for creatives. i’m a writer so i feel that it is important to say - i had claude write this piece based on my ideas and ramblings, voice notes, and teachings. the concepts were mine but the words themselves aren’t. i want to say that because its important for me to distinguish, as a writer, what is written ‘by me’ and what’s not. maybe that idea will seem insane and antiquated in a year, i’m not sure, but for now it helps me feel okay about putting stuff out there like this that a) i know is helpful and b) is not MY voice but exists within the umbrella of my business and work. If you have any thoughts or musings on this, i’d genuinely love to hear them - its an open question, all of this stuff, and my guess is as good as yours.