GhostQA

AI personas navigate your web app in real browsers, find bugs and UX issues. No scripts needed.

Content & Media · Package · Python · Open Source · External

Last updated

March 16, 2026

Visibility

Public

By Registry

About This MCP Server

Traditional E2E tests are brittle. You write selectors, they break. You maintain scripts, they rot. SpecterQA takes a different approach: AI vision models look at your actual UI and navigate it the way a person would.

You define personas (who is using your app) and journeys (what they're trying to do). SpecterQA's engine takes a screenshot, sends it to a Claude vision model, gets back a decision ("click this button", "fill this field"), executes it via Playwright, takes another screenshot, and repeats until the goal is achieved or something goes wrong.

When something goes wrong, you get evidence: screenshots, UX observations, cost breakdowns, and findings categorized by severity.

The core loop is simple:

1. Screenshot -- Playwright captures the current page state as a PNG
2. Decide -- A Claude vision model receives the screenshot + persona context + goal, returns a structured JSON action (click, fill, navigate, scroll, keyboard, wait, done, or stuck)
3. Execute -- Playwright performs the action (click at coordinates, type text, navigate to URL, etc.)
4. Repeat -- Loop until the goal is achieved, the agent gets stuck, or the budget runs out
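
The four steps above can be sketched as a small driver loop. This is a minimal illustration, not SpecterQA's actual engine: every name in it (Action, Browser, run_journey, FakeBrowser) is hypothetical, the browser and the vision model are injected as plain callables so the sketch runs without Playwright or an API key, and a simple step cap stands in for the cost budget.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol

# All names below are hypothetical and chosen for illustration only.

@dataclass
class Action:
    kind: str  # click, fill, navigate, scroll, keyboard, wait, done, or stuck
    params: dict = field(default_factory=dict)


class Browser(Protocol):
    def screenshot(self) -> bytes: ...
    def perform(self, action: Action) -> None: ...


# (png_bytes, persona_context, goal) -> Action
Decider = Callable[[bytes, str, str], Action]


def run_journey(browser: Browser, decide: Decider, persona_context: str,
                goal: str, max_steps: int = 20) -> tuple[str, list[Action]]:
    """Screenshot, decide, execute, repeat. Returns (outcome, action history)."""
    history: list[Action] = []
    for _ in range(max_steps):
        png = browser.screenshot()                    # 1. Screenshot
        action = decide(png, persona_context, goal)   # 2. Decide
        history.append(action)
        if action.kind in ("done", "stuck"):          # terminal decisions end the loop
            return action.kind, history
        browser.perform(action)                       # 3. Execute
    # 4. Repeat ended because the (step) budget ran out
    return "budget_exhausted", history


class FakeBrowser:
    """Stand-in for a real browser driver; records what it was asked to do."""

    def __init__(self) -> None:
        self.performed: list[Action] = []

    def screenshot(self) -> bytes:
        return b"\x89PNG"  # placeholder bytes, not a real image

    def perform(self, action: Action) -> None:
        self.performed.append(action)


if __name__ == "__main__":
    script = iter([Action("click", {"x": 120, "y": 340}), Action("done")])
    outcome, history = run_journey(
        FakeBrowser(), lambda png, persona, goal: next(script),
        persona_context="first-time user", goal="open the pricing page",
    )
    print(outcome, [a.kind for a in history])
```

In a real setup the Browser role would be filled by a Playwright page wrapper and the decider by a call to a vision model; keeping both behind small interfaces makes the loop itself testable offline.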

The persona's profile shapes how the AI behaves. A "tech-savvy developer" explores differently than a "frustrated first-time user." Persona patience, tech comfort, and frustrations all influence the system prompt.
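
One way to picture how a persona could feed the system prompt is sketched below. The Persona fields and the prompt wording are assumptions made for illustration; the listing does not show SpecterQA's real persona schema or prompt template.

```python
from dataclasses import dataclass, field


@dataclass
class Persona:
    # Hypothetical fields; the real YAML schema may differ.
    name: str
    role: str
    tech_comfort: str
    patience: str
    frustrations: list[str] = field(default_factory=list)


def build_system_prompt(persona: Persona, goal: str) -> str:
    """Turn persona traits into instructions that steer the model's behavior."""
    lines = [
        f"You are {persona.name}, a {persona.role}.",
        f"Tech comfort: {persona.tech_comfort}. Patience: {persona.patience}.",
    ]
    if persona.frustrations:
        lines.append("Things that frustrate you: " + "; ".join(persona.frustrations) + ".")
    lines.append(f"Your goal: {goal}")
    lines.append("Look at the screenshot and reply with exactly one JSON action.")
    return "\n".join(lines)


if __name__ == "__main__":
    dana = Persona(
        name="Dana",
        role="frustrated first-time user",
        tech_comfort="low",
        patience="low",
        frustrations=["long forms", "unclear error messages"],
    )
    print(build_system_prompt(dana, "Create an account"))
```

Swapping in a "tech-savvy developer" persona changes only the data, not the code, which is the point of keeping personas declarative.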

Tools & Endpoints (1)

Example Workflow

You'll need an Anthropic API key:

That's it. Three commands and an API key.

Why Use GhostQA?

Persona-based testing -- Define AI users with backgrounds, goals, frustrations, and tech comfort levels. They don't just follow scripts; they react to what they see.
Vision-powered -- No selectors, no DOM queries. The AI interprets screenshots like a human would. Catches visual/layout issues that selector-based tests miss entirely.
YAML-configured -- Products, personas, and journeys are all YAML files. PMs can read them. No code to maintain.
Budget enforcement -- Per-run, per-day, and per-month cost caps. The engine hard-stops if you hit the limit. No surprise bills.
JUnit XML output -- Drop --junit-xml results.xml and plug it into any CI system.
Tiered model routing -- Haiku for cheap navigation, Sonnet for complex reasoning, optional local Ollama for zero-cost simple actions.
Multi-platform -- Web apps (via Playwright), macOS native apps (via Accessibility API + pyobjc), iOS Simulator (via simctl). Same YAML format, different runners.
Evidence collection -- Every run produces screenshots, a findings report, cost breakdown, and a structured JSON result. Everything is saved to an evidence directory.
Stuck detection -- If the AI repeats the same action or the UI stops changing, the engine escalates to a stronger model, then aborts if nothing works. No infinite loops.
Template variables -- Use {{persona.credentials.email}} in your journey steps. Variables resolve from persona configs at runtime.
Precondition checks -- Verify services are up before running tests. Fail fast with clear errors instead of wasting API calls.
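
The template-variable feature above can be illustrated with a small resolver. This is a sketch under assumptions: the {{persona.credentials.email}} syntax comes from the listing, but the resolve function, the dotted-path lookup rules, and the error behavior are hypothetical and may not match how SpecterQA resolves variables.

```python
import re

# Matches {{ dotted.path }} with optional inner whitespace.
_TEMPLATE = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def resolve(text: str, context: dict) -> str:
    """Replace each {{a.b.c}} with context["a"]["b"]["c"]; fail loudly if missing."""

    def lookup(match: re.Match) -> str:
        path = match.group(1)
        value = context
        for key in path.split("."):
            if not isinstance(value, dict) or key not in value:
                raise KeyError(f"unresolved template variable: {path}")
            value = value[key]
        return str(value)

    return _TEMPLATE.sub(lookup, text)


if __name__ == "__main__":
    # Hypothetical persona config, as it might look after loading from YAML.
    ctx = {"persona": {"credentials": {"email": "dana@example.com"}}}
    print(resolve("Log in as {{persona.credentials.email}}", ctx))
```

Raising on an unknown variable, rather than leaving the placeholder in the step text, keeps a typo in a journey file from turning into a confusing instruction for the model.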

Limitations

Requires an Anthropic API key. No API key, no testing. There's no free tier built into SpecterQA itself.
Costs money. Every run makes API calls. A typical 3-step journey costs $0.30-0.60. Budget enforcement prevents surprises, but the meter is always running.
Vision models aren't perfect. The AI sometimes misreads small text, clicks the wrong element, or gets confused by complex layouts. It's good, not infallible. You'll occasionally see false positives and false negatives.
Not a replacement for unit tests. SpecterQA tests behavioral UX flows. It doesn't test your business logic, data integrity, or edge case handling. Use it alongside your existing test suite, not instead of it.
macOS native testing requires pyobjc. The specterqa[native] extra pulls in pyobjc packages (~200MB). Only needed for native macOS and iOS Simulator testing.
Alpha software. Version 0.4.0. APIs may change. File structure may change. Expect rough edges.
Single persona per journey (for now). Multi-persona concurrent testing (e.g., simulating a chat between two users) is on the roadmap but not yet supported.
Deterministic reproduction is hard. Because the AI makes decisions at runtime, the exact sequence of actions varies between runs. Same journey, same persona, slightly different clicks. This is by design (it catches more issues) but makes exact reproduction tricky.
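
Since every run costs money, it may help to see how a hard-stop cost cap of the kind mentioned above could work. The sketch below is an assumption-laden illustration: the Budget class, the window names, and the cap values are hypothetical, and SpecterQA's own accounting may differ.

```python
from dataclasses import dataclass, field


class BudgetExceeded(RuntimeError):
    """Raised before a charge that would push any window past its cap."""


@dataclass
class Budget:
    # Hypothetical caps in USD, keyed by window: run, day, month.
    caps: dict
    spent: dict = field(default_factory=lambda: {"run": 0.0, "day": 0.0, "month": 0.0})

    def charge(self, cost_usd: float) -> None:
        # Check every window first so a rejected charge changes nothing.
        for window, cap in self.caps.items():
            if self.spent[window] + cost_usd > cap:
                raise BudgetExceeded(
                    f"{window} cap of ${cap:.2f} would be exceeded by ${cost_usd:.2f}"
                )
        for window in self.caps:
            self.spent[window] += cost_usd


if __name__ == "__main__":
    budget = Budget(caps={"run": 1.00, "day": 5.00, "month": 50.00})
    budget.charge(0.60)      # a journey at the upper end of the quoted range
    try:
        budget.charge(0.60)  # would bring the run total to $1.20
    except BudgetExceeded as exc:
        print("stopped:", exc)
```

Checking the cap before recording the charge is what makes this a hard stop: the call that would overspend never goes out.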

Specifications

Status

live

Industry

Content & Media

Requirements

SpecterQA is distributed via PyPI (pip install specterqa) and requires Python 3.10 or later.
After installing, download the Playwright browser binaries:
For macOS native app testing and iOS Simulator support, install the optional native extra, specterqa[native]:
For MCP server support (integrating SpecterQA as a tool in Claude Desktop, Cursor, or other MCP clients):
You will also need an Anthropic API key to run tests:
To verify the installation:
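
Independently of SpecterQA's own commands, the two hard requirements stated above (Python 3.10 or later and an Anthropic API key) can be checked with a few lines of Python before spending any API calls. This is a hedged sketch: check_prerequisites is a hypothetical helper, and it assumes the key is supplied through the ANTHROPIC_API_KEY environment variable, which is the Anthropic SDK's default but is not confirmed by this listing.

```python
import os
import sys

MIN_PYTHON = (3, 10)  # from the requirement stated in this section


def check_prerequisites(env=None, version=None) -> list[str]:
    """Return a list of problems; an empty list means the basics are in place."""
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else tuple(version[:2])
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version[0]}.{version[1]}"
        )
    # ASSUMPTION: the key is read from ANTHROPIC_API_KEY.
    if not env.get("ANTHROPIC_API_KEY", "").strip():
        problems.append("ANTHROPIC_API_KEY is not set")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print("missing:", issue)
    sys.exit(1 if issues else 0)
```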

Hosting

Hosting Options

Package

API

Integrate this server into your application. Choose a connection method below.

Install

Install command

Python

pip install specterqa

Performance

Usage

Quick Reference

Name: GhostQA
Function: AI personas navigate your web app in real browsers, find bugs and UX issues. No scripts needed.
Available Tools: For schema definitions, type stubs, and federated protocol details, see docs/for-agents.md.
Transport: Package
Language: Python
Install: pip install specterqa
Source: External (Registry)
License: Open Source

Content & Media · Live · Verified

May 20, 2026 · External
