For everyone building on agents
Stop burning tokens on WebFetch.
Reading a 40-page site costs roughly 80k tokens of raw HTML and about six round trips by the time your agent has parsed the DOM. Sitedex pre-indexes the web: chunked, embedded, cited, and kept fresh. One endpoint covers every site your agent touches.
Hosted MCP · REST API · npx @sitedex/cli
WebFetch vs Sitedex
Two ways to read a 40-page docs site
Without Sitedex
tokens of raw HTML per agent task
- 4–8 round trips of WebFetch + DOM parse
- Truncated context, stale answers, brittle scrapers
- Token bills you can't budget
With Sitedex
tokens for the same answer
- One MCP call returns ranked chunks with citations
- Pre-chunked, pre-embedded, deduplicated
- Every result carries a freshness stamp
Order-of-magnitude figures from typical agent runs.
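Because every result carries a freshness stamp, an agent can drop stale chunks before they reach the prompt. Here is a minimal sketch, assuming the stamp arrives as an ISO 8601 string in a `last_crawled` field with an explicit offset such as `+00:00`. The result shape and the `filter_fresh` helper are hypothetical, not Sitedex's documented schema.

```python
from datetime import datetime, timedelta, timezone


def filter_fresh(results, max_age_days, now=None):
    """Keep results whose last_crawled stamp is within max_age_days.

    Assumes each result is a dict with an ISO 8601 `last_crawled` string.
    That format is an assumption made for illustration.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    fresh = []
    for result in results:
        stamp = datetime.fromisoformat(result["last_crawled"])
        if stamp.tzinfo is None:
            # Treat naive timestamps as UTC so the comparison stays valid.
            stamp = stamp.replace(tzinfo=timezone.utc)
        if stamp >= cutoff:
            fresh.append(result)
    return fresh
```

An agent loop might call `filter_fresh(results, max_age_days=7)` and re-query or warn the user when nothing survives the filter.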
Try it without installing anything
Live query against the production index
This box hits api.sitedex.dev/search from your browser. Same endpoint an agent would call. No signup, no demo data, no caching tricks.
For Agent Builders
One endpoint for every site, instead of one integration each
Pre-indexed, kept fresh, served through MCP, REST, or a CLI. Stop wiring up MCPs by hand. Stop falling back to WebFetch and parsing HTML on the agent's clock.
https://mcp.sitedex.dev/mcp
Cross-site server. Every indexed site is a tool. Each result carries a last_crawled stamp. Per-site endpoints at /s/<domain>/mcp.
POST api.sitedex.dev/search
REST endpoints that return structured chunks, citations, and freshness metadata.
npx @sitedex/cli
Terminal-native. Drop into any agent loop. Works without the MCP stack.
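For the REST route, a call from an agent loop might look like the sketch below. Only the host and path (`api.sitedex.dev/search`) and the `POST` method come from this page. The `https` scheme, the request fields (`query`, `limit`), and the response shape (`results` with `text`, `url`, `last_crawled`) are assumptions for illustration; check the docs for the real schema.

```python
import json
import urllib.request

# Endpoint named on this page; the https scheme is assumed.
SEARCH_URL = "https://api.sitedex.dev/search"


def build_search_request(query, limit=5):
    """Build a POST request for the search endpoint.

    The JSON body fields (`query`, `limit`) are hypothetical.
    """
    body = json.dumps({"query": query, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        SEARCH_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def format_chunks(payload):
    """Turn a response payload into prompt-ready lines with citations.

    Assumes a shape like {"results": [{"text", "url", "last_crawled"}]}.
    """
    lines = []
    for item in payload.get("results", []):
        lines.append(
            f'{item["text"]} [source: {item["url"]}, crawled {item["last_crawled"]}]'
        )
    return lines


def search(query, limit=5, timeout=10):
    """Send the request and return formatted chunks (performs network I/O)."""
    request = build_search_request(query, limit)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return format_chunks(json.load(response))
```

Keeping request building and response formatting separate from the network call makes both easy to unit-test without hitting the live index.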
Wire it into your agent
Paste one block into your client
Remote where your client speaks streamable-http, npx where it doesn't. Same endpoint underneath. No credentials to manage.
Add to ~/Library/Application Support/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "sitedex": {
      "command": "npx",
      "args": ["-y", "-p", "@sitedex/cli", "sitedex-mcp"]
    }
  }
}
Restart Claude Desktop after saving. On Windows the path is %APPDATA%\Claude\claude_desktop_config.json.
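If the config file already lists other MCP servers, pasting the block wholesale would overwrite them. The sketch below merges the `sitedex` entry into whatever is already there. The entry itself is copied from the block above; the `add_sitedex_server` helper is hypothetical and not part of the Sitedex CLI.

```python
import json
from pathlib import Path

# Server entry copied from the config block on this page.
SITEDEX_ENTRY = {
    "command": "npx",
    "args": ["-y", "-p", "@sitedex/cli", "sitedex-mcp"],
}


def add_sitedex_server(config_path):
    """Add the sitedex entry to an MCP client config, keeping other servers."""
    path = Path(config_path).expanduser()
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    # An empty or missing file is treated as an empty config.
    config = json.loads(text) if text.strip() else {}
    config.setdefault("mcpServers", {})["sitedex"] = SITEDEX_ENTRY
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

On macOS this could be run as `add_sitedex_server("~/Library/Application Support/Claude/claude_desktop_config.json")`, followed by a restart of Claude Desktop.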
Something else? See the full docs →
How we compare
Pick the right tool for the job
Sitedex isn't a web crawler. It's a pre-indexed, agent-ready interface to the parts of the web your users care about.
| | Sitedex | Firecrawl | Brave Search API |
|---|---|---|---|
| Primary shape | Pre-indexed agent interface | On-demand crawler API | Web search API |
| Hosted MCP server | Yes | Yes | No |
| Per-site MCP endpoint | Every indexed site | No | No |
| Pre-indexed chunks + citations | Yes | On-demand fetch | Snippets only |
| Per-result freshness stamps | Every result | Last fetch | Search engine date |
| Site evaluation / scoring | Built in | No | No |
Comparison current as of May 2026. Always check the source for the latest.
Cross-site search
Search every indexed site at once.
A vendor MCP can answer questions about itself. Sitedex answers questions across everyone. That's the capability no single-site integration can replicate — and the reason agent builders reach for it.
- "Find a CRM with free tier + webhooks + great API docs."
- "Which auth libraries support Cloudflare D1 natively?"
- "Who charges a flat fee instead of a percentage?"
"auth libraries with cloudflare d1"
• better-auth.com: native adapter
• lucia-auth.com: community adapter
• clerk.com: not supported
One endpoint today. Every site your agent will ever touch.
No per-site integrations. No DOM parsing on the agent's clock. No token bills you can't explain.