About OpenFunnel Bench

OpenFunnel Bench is the public benchmark hub published by OpenFunnel. It exists because the AI agent tooling space is too noisy to evaluate by vibes. Every benchmark here runs against verified ground truth, with scoring rules, input slices, and pricing assumptions documented openly.

Who runs it

OpenFunnel is the team behind the open agentic-GTM stack. We publish benchmarks for the categories we ship against and the ones our users keep asking about. We do not accept payment from benchmarked providers for placement, weighting, or removal.

How a benchmark is produced

  • Ground truth dataset. Each benchmark is bound to a curated dataset where the correct answers are known to OpenFunnel from first-party sources.
  • Provider runs. The same input is sent to every provider through their public API, using the documented credentials and rate limits of their cheapest publicly listed plan.
  • Scoring. Provider output is compared against ground truth on five metrics: correct rate, wrong rate, answer rate, accuracy when answered, and cost per correct result.
  • Publication. Aggregate scores are published here and through the public JSON API and MCP server. Raw inputs and PII never leave secure storage.
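The five scoring metrics above can be sketched in a few lines. This is a minimal illustration, not OpenFunnel's actual pipeline; the function name, the outcome labels, and the flat per-call cost are assumptions.

```python
def score(outcomes, cost_per_call):
    """Compute the five benchmark metrics from per-input outcomes.

    outcomes: list of "correct", "wrong", or "no_answer" strings,
              one per input in the dataset slice (labels assumed).
    cost_per_call: flat price per provider API call (assumed pricing model).
    """
    n = len(outcomes)
    correct = outcomes.count("correct")
    wrong = outcomes.count("wrong")
    answered = correct + wrong  # inputs where the provider returned any answer
    return {
        "correct_rate": correct / n,
        "wrong_rate": wrong / n,
        "answer_rate": answered / n,
        # Accuracy conditioned on having answered at all.
        "accuracy_when_answered": correct / answered if answered else 0.0,
        # Total spend divided by the number of correct results.
        "cost_per_correct": (n * cost_per_call) / correct if correct else float("inf"),
    }
```

Note that answer rate and accuracy-when-answered pull in opposite directions: a provider that declines to answer hard inputs scores higher on accuracy but lower on answer rate, which is why both are reported.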

What we publish openly

  • Methodology: the scoring rules, input slices, and pricing assumptions behind every number.
  • Aggregate scores for every provider, on this site and through the public JSON API and MCP server.
  • What we never publish: raw inputs and PII, which stay in secure storage.

Corrections and contact

If you are a benchmarked provider and you think a number is wrong, email hello@openfunnel.dev with the provider name, dataset slice, run timestamp, and your evidence. We re-run the affected slice and re-publish the results; if the methodology was at fault, we update it and say so.