1. The web is hostile to machines
Open any business website. The bakery on the corner. The dentist down the street. The repair shop, the law firm, the hotel.
What an agent needs from that page is usually 8 facts: name, address, phone, hours, services, prices, contact form, booking link. Maybe 2 KB of actual information.
What an agent has to download to extract those 8 facts is 500 KB to 5 MB of HTML, CSS, JavaScript, fonts, tracking pixels, ad networks, GDPR banners, image carousels, hero videos, animated logos, third-party widgets, customer testimonials, and a footer with 47 links. The 8 facts are buried in 99.6% noise.
The agent then has to parse all of it, identify what's relevant, throw away the rest, and hope nothing important was hidden in JavaScript that requires a headless browser to render.
This costs:
- Tokens. An average business homepage costs an LLM ~5,000 to 50,000 tokens to read. At GPT-4o prices, that's $0.01 to $0.10 per page just to look at it. An agent doing 100 lookups burns anywhere from one to ten dollars.
- Latency. A typical page takes 2-10 seconds to fully load and render. For an agent making real-time decisions for a user, that's an eternity.
- Reliability. Half the modern web requires JavaScript execution to render. The other half is hidden behind anti-bot defenses (Cloudflare, captchas, rate limits) that treat agents as adversaries.
- Sanity. Even when it works, the structure of a page changes constantly. Agents that depend on scraping break weekly.
The web wasn't designed with any of this in mind. It was designed for humans with eyes, browsers, and patience. AI agents have none of those things.
2. The fix isn't a better scraper. It's a different protocol.
The instinct of the last 20 years has been to fix this with better scrapers. Build smarter parsers. Use bigger LLMs to extract structured data from messy HTML. Pay for headless browser farms. Cache aggressively. Pray.
This works, sort of. It's also the wrong layer. It's like translating a fax into JSON every time you need to read one. Sure, it's possible. But the right move is to stop sending faxes.
The web has done this before. We added robots.txt for crawlers. We added sitemap.xml for search engines. We added schema.org for structured data. Each one was a small protocol layered on top of HTML to make a specific class of consumer happy.
None of those were enough for AI agents. Agents need more: they need actions, not just facts. They need confidence scores, not just text. They need stable identifiers, not URLs that move. They need a registry, not a search engine. They need an API, not a webpage.
We need a protocol designed for agents. From scratch.
3. The substrate, not a directory
Every existing approach treats agent-readable business data as something you bolt onto an existing website. "Add this JSON-LD snippet to your homepage. Now agents can read you."
This misses the point. Most businesses on Earth shouldn't have to build a website at all. The vast majority of small businesses are bakeries, dentists, plumbers, hairdressers, restaurants, repair shops, gyms, tutors, and trades. They don't need a homepage. They need to be findable.
Building a website costs them money, time, and ongoing maintenance. It exists not because the business needs it but because that was the only way to be on the internet for the last 30 years.
The agent web doesn't need to inherit that constraint.
The principle: a business should be able to exist on the agent web in 30 seconds, for free, without owning a domain, without hosting anything, without writing any code or HTML. Their data lives in one canonical place. Agents query that place. The data is theirs to edit, ours to host, and free to access.
This is the substrate of the agent web. AgentWeb is the canonical hosted place where the data lives — owned by the people who put it there, served free to anyone who asks, indexed by default for AI agents.
Think of it the way GitHub is for code, Wikipedia is for facts, YouTube is for video. One place that everyone trusts as the home for a category of data, where the producers do the editing and the consumers get the data for free.
4. What an agent-native record looks like
A business profile on AgentWeb isn't a webpage. It's a structured record optimized for the way agents actually consume data:
```json
{
  "id": "biz_4xz7p2",
  "name": "Berlin Bakery",
  "category": "bakery",
  "address": {
    "street": "Hauptstraße 5",
    "city": "Berlin",
    "country": "DE",
    "geo": [52.4905, 13.4267]
  },
  "phone": {
    "number": "+4930123456",
    "confidence": 0.98,
    "verified_at": "2026-04-05T11:23:00Z",
    "source": "owner"
  },
  "hours": { "mon-fri": "07:00-18:00", "sat": "08:00-14:00", "sun": "closed" },
  "is_open_now": true,
  "actions": [
    { "verb": "call", "endpoint": "tel:+4930123456" },
    { "verb": "book", "endpoint": "/v1/biz/biz_4xz7p2/book" },
    { "verb": "contact", "endpoint": "/v1/biz/biz_4xz7p2/contact" },
    { "verb": "menu", "endpoint": "/v1/biz/biz_4xz7p2/menu" }
  ],
  "trust": {
    "score": 0.94,
    "claimed_by_owner": true,
    "last_verified_at": "2026-04-05T11:23:00Z"
  }
}
```
Notice what's there and what isn't:
- No HTML, no CSS, no JavaScript. Just facts.
- Per-field confidence scores so agents can filter on trust.
- Per-field provenance so agents know where each fact came from.
- Per-field freshness so agents know when each fact was last verified.
- Computed fields like `is_open_now` so the agent doesn't have to do timezone math.
- Actions, not just data. Every fact that can be acted on has a verb attached. Agents don't have to construct API calls — they have to choose which one to invoke.
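To make the consumption pattern concrete, here is a minimal sketch of how an agent might read a record like the one above: gate facts on their confidence score, then select an action by verb instead of constructing an API call. The types mirror the example profile; the 0.9 confidence floor and helper names are illustrative, not part of any spec.

```typescript
// Types mirroring the example profile above (illustrative subset).
interface Action { verb: string; endpoint: string; }
interface Profile {
  name: string;
  phone: { number: string; confidence: number };
  is_open_now: boolean;
  actions: Action[];
}

// Trust gate: only act on facts above a confidence floor.
function trustedPhone(p: Profile, floor = 0.9): string | null {
  return p.phone.confidence >= floor ? p.phone.number : null;
}

// Action selection: the agent picks a verb; the endpoint comes with it.
function findAction(p: Profile, verb: string): Action | undefined {
  return p.actions.find(a => a.verb === verb);
}

const bakery: Profile = {
  name: "Berlin Bakery",
  phone: { number: "+4930123456", confidence: 0.98 },
  is_open_now: true,
  actions: [
    { verb: "call", endpoint: "tel:+4930123456" },
    { verb: "book", endpoint: "/v1/biz/biz_4xz7p2/book" },
  ],
};

console.log(trustedPhone(bakery));                 // "+4930123456"
console.log(findAction(bakery, "book")?.endpoint); // "/v1/biz/biz_4xz7p2/book"
```

The point of the shape: the agent never parses markup and never guesses a URL. Every decision is a field comparison or a verb lookup.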
This entire record is ~700 bytes. The same business on a typical website would take 300 KB to deliver. That's a 400x reduction in data per fetch. For an agent making millions of lookups, the cost difference is staggering.
5. Speed and cost as a feature
The agent web isn't just a different shape of data. It's a different cost curve.
| Metric | Typical website | AgentWeb profile |
|---|---|---|
| Response size | 300 KB – 5 MB | ~700 bytes – 4 KB |
| Latency (p95) | 2 – 10 seconds | < 50 ms |
| Tokens to parse | 5,000 – 50,000 | < 200 |
| Cost per lookup (LLM tokens) | $0.01 – $0.10 | $0.00001 |
| Reliability | Variable, fragile | Deterministic |
| Anti-bot defenses | Common | None — agents are first-class |
| Action support | None | Built in (call, book, contact, etc.) |
That's not a 2x or 10x improvement. Depending on the axis, it's two to four orders of magnitude. For agents at scale, that's the difference between "I can build this product" and "I can't afford to build this product."
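As a sanity check on the token row alone, here is the back-of-envelope arithmetic under an assumed LLM price of $2.50 per million input tokens (an assumption for illustration, not a quoted rate):

```typescript
// Back-of-envelope check of the token-cost row, under an assumed price.
// PRICE_PER_TOKEN is an assumption (~$2.50 per million input tokens).
const PRICE_PER_TOKEN = 2.5 / 1_000_000;

const tokensScrapedPage = 50_000; // upper bound from the table
const tokensAgentWeb = 200;       // upper bound from the table

const costScrape = tokensScrapedPage * PRICE_PER_TOKEN; // $0.125 per page
const costRecord = tokensAgentWeb * PRICE_PER_TOKEN;    // $0.0005 per lookup

console.log(costScrape / costRecord); // 250
```

Tokens alone buy a 250x saving; the size, latency, and reliability rows multiply on top of that.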
The agent web isn't just better. It's the only thing that makes agent-driven applications economically viable at scale.
6. What we're committing to building
This isn't just a manifesto. It's a roadmap. Here's what we're committing to:
Today: the substrate is live
- 11 million businesses across 233 countries, accessible via free REST API
- The substrate. Public, no-API-key reads at `/v1/r/{id}/agent.json` — sub-50ms global, ~700 bytes per record. Every business is also reachable as a human profile at `/r/{id}` with embedded JSON-LD.
- The agent shorthand format. Single-letter keys, ~80% fewer tokens than standard JSON. Self-described at `/v1/schema/short`.
- The claim flow. Any business owner can claim their profile or add a brand-new business in 30 seconds at `/claim` and `/claim/new`. No website required.
- Strong verification. Email-at-domain verification automatically promotes claims to `owner_verified`.
- Per-field provenance and confidence. Every fact comes with a source and a verification timestamp.
- The directory. Browse 15,000+ categories and 233 countries at `/directory`. Per-category and per-country landing pages with full SEO.
- The dispute flow. Public dispute form at `/dispute/{id}`. Self-healing data quality.
- Two-way contributions. `POST /v1/contribute` and `POST /v1/report` let agents add new businesses, enrich existing ones, and flag bad data. Tracked in a public leaderboard.
- An MCP server (`npm i agentweb-mcp`) so any agent runtime — Claude Desktop, Cursor, Continue, Cline — can call AgentWeb as a native tool. Six tools exposed: `search_businesses`, `get_business`, `agentweb_health`, `agentweb_leaderboard`, `contribute_business`, `report_business`.
- Browser-native i18n hint. Non-English visitors see a localised banner suggesting browser translation, in 21 languages.
- OpenStreetMap as the base layer, JSON-LD enrichment from owner sites, deduplication by name + coordinates.
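The shorthand format is easiest to see as a key-expansion pass on the consumer side. The sketch below is hypothetical: the real key map is self-described at `/v1/schema/short`, and the single-letter mapping shown here (`n` for name, `c` for category, and so on) is an illustration of the mechanism, not the actual schema.

```typescript
// Hypothetical key map -- the real one is published at /v1/schema/short.
const KEY_MAP: Record<string, string> = {
  n: "name", c: "category", p: "phone", h: "hours", a: "address",
};

// Expand a shorthand record into full field names.
function expand(short: Record<string, unknown>): Record<string, unknown> {
  const full: Record<string, unknown> = {};
  for (const [k, v] of Object.entries(short)) {
    full[KEY_MAP[k] ?? k] = v; // unknown keys pass through unchanged
  }
  return full;
}

const record = expand({ n: "Berlin Bakery", c: "bakery", p: "+4930123456" });
console.log(record.name); // "Berlin Bakery"
```

The token saving comes purely from the wire format; agents that want full field names expand locally at negligible cost.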
Next 90 days: the action layer
- `contact` action. Agents can send messages to businesses through AgentWeb. We handle deliverability, abuse prevention, and reply parsing.
- `book` action. Native booking for businesses that opt in, integrated with calendar tools.
- `subscribe` action. Agents can subscribe to changes — when a business updates hours, prices, or status, subscribed agents get notified.
- Phone callback verification. Twilio-mediated trust upgrades for businesses without a website.
- Semantic search. pgvector-backed natural-language queries: "a quiet cafe with wifi where I can work".
Beyond: the constellation
Once the substrate works for businesses, the same model extends to any other category that agents need to act on: products, events, services, real-time data. Each one becomes another vertical, all under the same registry, all served by the same MCP endpoint. One integration. The whole agent web.
7. The principles, in one place
Everything we build will be measured against these:
- Agents are the primary user. Humans get a nice profile page as a side effect, not the main thing. Every design decision optimizes for machines first.
- Free at the base. Reading the substrate is free, forever, with generous rate limits. Paid tiers are for action endpoints, verification, and SLA — not for access to data.
- Small payloads, fast responses. Sub-50ms global. Sub-1KB by default. Token efficiency is a first-class concern.
- Provenance and confidence on every fact. Agents shouldn't have to guess what to trust.
- Owners control their data. If you claim your profile, you own it. We host it; we don't sell it.
- Actions, not just data. Every readable noun has callable verbs attached.
- Open spec. The protocol, the schemas, the API are all public. Anyone can build a compatible implementation. We're not gatekeeping a standard.
- No website required. A bakery should be able to exist on the agent web with nothing but a phone.
8. Why this has to happen now
AI agents are about to become real economic actors. Within 12-24 months, a meaningful share of all consumer purchase decisions, customer service interactions, research tasks, and outbound communication will be initiated by agents acting on behalf of humans.
Those agents need infrastructure. They need a way to find businesses. They need a way to act on what they find. They need it to be cheap, fast, reliable, and free at the base. None of that exists today.
If nothing changes, the default will be: agents brute-force the human web with expensive LLM calls and headless browsers, and the experience will be slow, fragile, and economically unsustainable. The promise of agents will be capped by the friction of the substrate they have to operate on.
If the agent web exists, the default will be: agents query a structured registry, get clean data and callable actions, and the experience will be sub-second and pennies. The promise of agents will be unlocked.
This is the difference between AI agents being a curiosity and AI agents being infrastructure.
Someone is going to build this. The question is whether it's built by a company optimizing for shareholder returns, by a platform that gates access, or by something open and free that treats both businesses and agents as first-class citizens.
We're choosing the third one.
9. Where this goes
The end state is an agent web that looks nothing like the human web. Businesses don't have websites — they have profiles. Agents don't browse — they query. Search isn't a list of links — it's a structured answer with provenance. Actions aren't form-filling — they're verb invocation.
The human web doesn't go away. It stays where it is, for the things humans need: long-form reading, video, art, social, expression. The agent web sits next to it, a parallel substrate optimized for a parallel kind of consumer.
One built for eyes. One built for code. Both useful. Neither pretending to be the other.
That's the future we're building. Today is day one.
Build with us
If you're building AI agents that need to know about the real world — start here.
If you run a business — claim your profile. It takes 30 seconds. It's free forever.