Open Source · PyPI · MCP · ClawHub

Your AI can

Tappi gives AI agents control of your real browser — with your sessions, your cookies, your extensions. No screenshots. No DOM dumps. No token tax.

10x
fewer tokens
24
MCP tools
3/3
benchmark wins

Every other browser tool is expensive

Screenshot-based tools burn thousands of tokens per click. DOM dumpers flood your context window with noise. Headless browsers get blocked by every major site. There had to be a better way.

Screenshots

Vision models squint at pixels, guess coordinates, pray they click right. 5-10K tokens per interaction.

DOM Dumps

Entire accessibility trees — 50K+ tokens of nested divs. The LLM reads a novel just to click a button.

Headless Chrome

No cookies. No sessions. Reddit, Gmail, LinkedIn — all blocked at the front door. CAPTCHAs everywhere.

Tappi: indexed element lists. LLM says "click 4". Done.
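
To make the idea concrete, here is a minimal sketch of an indexed element list and of parsing a reply such as "click 4". The element fields, the rendered line format, the zero-based numbering, and the helper names render_elements and parse_action are all assumptions for illustration; they are not tappi's actual output or API.

```python
import re
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical record for one interactive element on a page."""
    kind: str   # e.g. "button", "link", "input"
    label: str  # visible text or accessible name


def render_elements(elements: list[Element]) -> str:
    """Render each element as one short numbered line (assumed format)."""
    return "\n".join(
        f"[{i}] {el.kind}: {el.label}" for i, el in enumerate(elements)
    )


# Accept replies of the form "<verb> <index>", e.g. "click 4".
_ACTION = re.compile(r"^\s*(click)\s+(\d+)\s*$", re.IGNORECASE)


def parse_action(reply: str) -> tuple[str, int]:
    """Turn a model reply like 'click 4' into a (verb, index) pair."""
    match = _ACTION.match(reply)
    if match is None:
        raise ValueError(f"unrecognized action: {reply!r}")
    return match.group(1).lower(), int(match.group(2))


page = [
    Element("link", "Home"),
    Element("input", "Search"),
    Element("button", "Sign in"),
]
print(render_elements(page))
print(parse_action("click 2"))
```

The point of the sketch is the size of the exchange: the model reads a few short lines and answers with a verb and a number, so there is no coordinate guessing to get wrong.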

Built for AI agents. Used by humans.

Every feature exists because we hit a wall with existing tools and built the thing we needed.

10x Token Efficiency

Compact indexed element lists instead of screenshots or DOM dumps. The LLM reasons less and acts faster.

Shadow DOM Piercing

Reddit, Gmail, GitHub — all use shadow DOM. Tappi pierces through automatically. Accessibility trees can't.

Your Real Browser

Connects to your existing Chrome via CDP. Your sessions, cookies, extensions. No login walls. No CAPTCHAs.

24 MCP Tools

Full MCP server for Claude Desktop, Cursor, Windsurf, and any MCP client. stdio + HTTP/SSE transport.

Cross-Origin Iframes

Payment forms, CAPTCHA solvers, OAuth popups — coordinate commands handle cross-origin boundaries.

Built-in AI Agent

6 tools (browser, files, PDF, spreadsheets, shell, cron). 7 LLM providers. Web UI with live tool visibility.

Sandboxed by Design

One browser. One workspace directory. No filesystem access beyond what you define. Deliberate constraints.

Zero-Config Install

pip install tappi. Or double-click the .mcpb bundle for Claude Desktop. One install, all features.

CLI + Python + MCP

Use from the command line, import as a Python library, or connect as an MCP server. Same power, three interfaces.

The benchmark nobody asked for

4 tools. 3 real-world tasks. Same model, same thinking level, same instructions. Only one went 3/3 with correct data.

                | 🔹 tappi        | 🔸 Browser Tool    | 🔷 Playwright     | 🔶 playwright-cli
Success Rate    | 🟢 3/3          | 🟢 3/3             | 🟡 1/3*           | 🔴 1/3
Total Context   | 59K             | 252K               | 44K               | 52K
Total Time      | 4m 13s          | 8m 38s             | 3m 42s            | 3m 36s
Auth Tasks      |                 |                    |                   |
Bot Detection   |                 |                    |                   |
Shadow DOM      |                 | ⚠️ Workaround      | N/A               | N/A
Data Quality    | ⭐ High         | ⭐ High            | ⚠️ Low            | N/A
Verdict         | 🏆 Best overall | Reliable but heavy | Cheap but brittle | Too limited

*Playwright's Reddit "success" returned automod bot comments instead of actual top comments on 4/5 posts — functionally incorrect.
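
One way to read these numbers is context spent per completed task. The sketch below uses only the Success Rate and Total Context figures reported above; "context per completed task" is a derived metric introduced here for illustration, and it counts Playwright's 1/3 as reported even though the footnote questions that result.

```python
# Figures copied from the benchmark: total context in thousands of tokens
# and number of tasks completed out of 3.
results = {
    "tappi": {"context_k": 59, "successes": 3},
    "Browser Tool": {"context_k": 252, "successes": 3},
    "Playwright": {"context_k": 44, "successes": 1},
    "playwright-cli": {"context_k": 52, "successes": 1},
}


def context_per_success(entry: dict) -> float:
    """Thousands of tokens spent per completed task (derived metric)."""
    return entry["context_k"] / entry["successes"]


for name, entry in results.items():
    print(f"{name}: {context_per_success(entry):.1f}K tokens per completed task")

# Head-to-head between the two tools that completed all three tasks.
ratio = results["Browser Tool"]["context_k"] / results["tappi"]["context_k"]
print(f"Browser Tool used {ratio:.1f}x the context of tappi")
```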

How it works

  You (CLI / Web UI / Claude Desktop)
           ↓
  ┌──────────────────┐
  │   LLM Agent      │ ← Sees compact element lists, not DOM dumps
  └────────┬─────────┘
           │
  ┌────────┴─────────┐
  │    Tool Calls     │
  ├──────────────────┤
  │ 🌐 Browser       │ → CDP → Your Chrome (with all your sessions)
  │ 📁 Files          │ → Sandboxed workspace directory  
  │ 📄 PDF            │ → Read/create PDFs
  │ 📊 Spreadsheets   │ → CSV/Excel (.xlsx)
  │ 💻 Shell          │ → Optional, workspace-only
  │ ⏰ Cron           │ → Scheduled recurring tasks
  └──────────────────┘

  No middleware. No cloud. No screenshots.
  Just structured data flowing between your browser and your LLM.

Four ways to use tappi

Pick the one that fits your workflow.

MCP Server

Claude Desktop, Cursor, Windsurf

$ tappi mcp
# or HTTP/SSE:
$ tappi mcp --sse
# Claude Desktop: double-click .mcpb
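
For clients that are configured through a JSON file rather than the .mcpb bundle, a stdio entry might look like the output of this sketch. The mcpServers layout follows the convention used by Claude Desktop and similar MCP clients. The server label "tappi" and the assumption that the tappi executable is on your PATH are illustrative, so check your client's documentation for the file location and exact schema.

```python
import json


def mcp_client_config(server_name: str = "tappi") -> dict:
    """Build a stdio MCP server entry that launches `tappi mcp`.

    The label is arbitrary; the command and argument come from the
    `tappi mcp` invocation shown above.
    """
    return {
        "mcpServers": {
            server_name: {
                "command": "tappi",  # assumes tappi is on PATH
                "args": ["mcp"],
            }
        }
    }


print(json.dumps(mcp_client_config(), indent=2))
```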

AI Agent

Built-in, 7 LLM providers

$ bpy setup
$ bpy agent "Summarize HN top 5"
$ bpy serve
# web UI with live tool calls

CLI / Shell

Direct browser control

$ tappi open github.com
$ tappi elements
$ tappi click 3
$ tappi text

Python Library

Import and build

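from tappi import Browser

b = Browser()                  # uses your running Chrome via CDP
b.open("https://github.com")
elements = b.elements()        # compact indexed list of page elements
b.click(3)                     # click the element at index 3
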
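
If you want to build on those calls without a browser attached, for example in unit tests, one option is to code against a small interface. The sketch below relies only on the three methods shown above (open, elements, click). BrowserLike, FakeBrowser, and open_and_click are hypothetical names introduced here, and re-indexing after a click is an assumption about good practice, not documented tappi behavior.

```python
from typing import Any, Protocol


class BrowserLike(Protocol):
    """The three calls used in the snippet above; nothing else is assumed."""
    def open(self, url: str) -> Any: ...
    def elements(self) -> Any: ...
    def click(self, index: int) -> Any: ...


def open_and_click(browser: BrowserLike, url: str, index: int) -> Any:
    """Open a page, index it, click one element, and return the new listing."""
    browser.open(url)
    browser.elements()         # index the page before acting on an index
    browser.click(index)
    return browser.elements()  # re-index, since a click can change the page


class FakeBrowser:
    """Stand-in for a dry run: records calls instead of driving Chrome."""
    def __init__(self) -> None:
        self.calls: list[tuple] = []

    def open(self, url: str) -> None:
        self.calls.append(("open", url))

    def elements(self) -> list[str]:
        self.calls.append(("elements",))
        return ["[0] link: Home"]

    def click(self, index: int) -> None:
        self.calls.append(("click", index))


fake = FakeBrowser()
open_and_click(fake, "https://github.com", 3)
print(fake.calls)
```

With tappi installed and Chrome running, passing Browser() in place of FakeBrowser() should work as long as the real class exposes these three methods as shown.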
pip install tappi

Python 3.10+ · Chrome or Chromium · Linux, macOS, Windows

The honest take

We're not going to pretend tappi is perfect. Here's what other tools do better — and why we think our tradeoffs are the right ones.

Vision-optional, not vision-first

Tappi's default flow doesn't use screenshots — it indexes elements into compact lists instead. Tools like Anthropic Computer Use or OpenAI Operator are vision-first: they "see" every page and reason about visual layout, button colors, spatial relationships. Tappi can take screenshots (tappi screenshot), but it's a fallback, not the primary interaction mode.

The tradeoff: Vision costs 5-10K tokens per screenshot. Tappi's element indexing costs ~200 tokens and is more reliable: the LLM picks an element by index, so it can't misclick because it "saw" the button 3 pixels off. For 95% of browser tasks, you don't need to see the page; you need to interact with it. And when you do need a visual? tappi screenshot is one command away.
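
As a rough worked example, the per-step figures quoted above can be multiplied out over a task. The 20-step task length is an arbitrary assumption, and the sketch counts only the cost of observing the page at each step, not prompts, history, or model output, so it will not match end-to-end totals.

```python
# Per-step observation costs quoted in the text.
SCREENSHOT_TOKENS = (5_000, 10_000)  # "5-10K tokens per screenshot"
ELEMENT_LIST_TOKENS = 200            # "~200 tokens"


def observation_cost(steps: int) -> dict:
    """Total tokens spent looking at the page over a task of `steps` steps."""
    low, high = SCREENSHOT_TOKENS
    return {
        "screenshots_low": steps * low,
        "screenshots_high": steps * high,
        "element_lists": steps * ELEMENT_LIST_TOKENS,
    }


cost = observation_cost(steps=20)  # 20 steps is an arbitrary example
print(cost)
```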

Requires a running Chrome

Playwright spins up a fresh browser anywhere — CI pipelines, Docker containers, serverless. Tappi needs Chrome running with a debug port. You can't run it in a GitHub Action without extra setup.

The tradeoff: That "limitation" is the feature. The running Chrome IS the point — it has your sessions, your cookies, your service workers, your extensions. A fresh headless Chromium is like an amnesiac trying to do your job. For testing and CI, use Playwright. For real-world agent work, use tappi.
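
To check whether a debuggable Chrome is reachable, you can probe Chrome's own DevTools HTTP endpoint. This sketch assumes Chrome was started with --remote-debugging-port=9222. Port 9222 is the conventional choice, not a documented tappi default, and the helper names are hypothetical.

```python
import http.client
import json
import urllib.error
import urllib.request
from typing import Optional


def version_url(port: int = 9222, host: str = "127.0.0.1") -> str:
    """URL of Chrome's DevTools version endpoint for a given debug port."""
    return f"http://{host}:{port}/json/version"


def chrome_debug_info(port: int = 9222, timeout: float = 1.0) -> Optional[dict]:
    """Return Chrome's version payload, or None if nothing usable answers."""
    try:
        with urllib.request.urlopen(version_url(port), timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, or a non-JSON reply from another service.
        return None


info = chrome_debug_info()
if info is None:
    print("No debuggable Chrome found; start it with --remote-debugging-port=9222")
else:
    print("Found:", info.get("Browser"))
```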

Smaller ecosystem

Playwright has Microsoft behind it, thousands of contributors, massive documentation, integrations with every CI system. Tappi is one developer and a growing community. Fewer Stack Overflow answers. Fewer tutorials.

The tradeoff: Every tool starts small. Playwright isn't designed for AI agents — it was built for testing. We're purpose-built for one thing: giving AI agents browser control that actually works. Smaller but sharper. And the code is simple enough to read in an afternoon.

Python only

Playwright supports Python, Node.js, .NET, and Java. Tappi is Python-only. If your stack is TypeScript/Node, you'll need to shell out or use the MCP server.

The tradeoff: Python is the lingua franca of AI/ML. Every major AI framework is Python-first. And the CLI + MCP server work from any language — your TypeScript agent can call tappi via MCP without writing a line of Python. Language boundaries dissolve when you speak protocol.

Tappi is not a Playwright replacement. It's a different tool for a different job. Playwright tests websites. Tappi uses them.

Give your AI a real browser

Stop burning tokens on screenshots. Stop getting blocked by headless detection. Start automating with the browser you already use.

pip install tappi