Announcing Web endpoints: markdown, HTML, and more from any URL
Social Fetch now extracts markdown, HTML, and LLM answers from any public URL—same API key, same envelope, no second vendor.
We kept hearing the same thing from teams already using Social Fetch: the pipeline never stays inside social platforms.
You pull a creator profile from TikTok, and then you need the landing page in their bio. You monitor a brand on X, and then you need the pricing page they linked in a campaign. You build an AI research agent, and half its sources are blog posts and docs—not Instagram posts.
Every time, the answer was a second tool: a Puppeteer container nobody wanted to babysit, a generic scraping API with different auth and different billing, or a hand-rolled /utils/scrape.ts that aged badly.
Now it's just another Social Fetch call
Pass any public URL, get structured content back. Same x-api-key, same { data, meta } envelope, same one-credit pricing. No second vendor.
The gap we kept hearing about
The social data was the easy part—Social Fetch already handled that. The painful part was everything around social:
- A creator's Linktree, Shopify store, or personal blog
- Competitor pricing pages, changelogs, and feature comparisons
- Documentation and support articles an AI agent needs to reason about
- Press releases and news articles linked from social posts
These are not edge cases. They are the other half of every pipeline that touches creators, brands, or content. And until now, handling them meant context-switching to a different integration with a different mental model.
The best architecture is one where "I need this URL too" does not require a new vendor evaluation.
What we shipped
Four new routes, all under /v1/web/*, all using the credentials and envelope you already know:
- Markdown: clean text extraction with filter modes for readability, raw DOM, or relevance-ranked passages.
- HTML: cleaned HTML when your downstream tool expects markup.
- Ask: pass a URL and a question, get a grounded answer back.
- Crawl: batch up to five URLs in one request, get per-page results.
All four cost one credit per page—same as a TikTok profile lookup. No multipliers, no response-size tiers.
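To make "same credentials" concrete at the HTTP level, here is a minimal sketch that builds (but does not send) a request for one of the /v1/web/* routes. Only the /v1/web/ prefix and the x-api-key header come from this post. The base URL, the lowercase route names, and the use of query-string parameters are assumptions for illustration, so check the API reference for the real shapes.

```typescript
// Hypothetical base URL: the real host is not stated in this post.
const BASE_URL = "https://api.example.com";

// Route names are assumed to mirror the four routes listed above.
type WebRoute = "markdown" | "html" | "ask" | "crawl";

interface RequestSpec {
  url: string;
  headers: Record<string, string>;
}

// Build the URL and headers for a /v1/web/* call without sending it.
// Passing parameters in the query string is an assumption.
function buildWebRequest(
  route: WebRoute,
  params: Record<string, string>,
  apiKey: string,
): RequestSpec {
  const query = Object.keys(params)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(params[key])}`)
    .join("&");
  return {
    url: `${BASE_URL}/v1/web/${route}${query ? `?${query}` : ""}`,
    headers: { "x-api-key": apiKey },
  };
}

const spec = buildWebRequest(
  "markdown",
  { url: "https://competitor.com/pricing", filter: "fit" },
  "YOUR_API_KEY",
);
```

The returned object can be handed to whatever HTTP client you already use. Keeping the builder separate from the network call makes it easy to unit test.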
What teams are building with this
AI and RAG pipelines. The markdown endpoint with filter=fit strips navigation chrome and gives you content ready for chunking and embedding. If you only need passages relevant to a specific topic, filter=bm25 with a query returns only the sections that score well against it, like a keyword relevance search over a single page, without the indexing step.
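The chunking step mentioned above might look like the following sketch. It assumes you already have the markdown as a string; the function name chunkMarkdown and the character-based size limit are illustrative choices, not part of the Social Fetch SDK. A production pipeline would more likely count tokens than characters.

```typescript
// Split markdown into chunks, breaking only on blank lines so that
// paragraphs and headings stay intact. Chunks stay within maxChars
// unless a single block is itself larger, in which case that block
// becomes its own oversized chunk.
function chunkMarkdown(markdown: string, maxChars: number): string[] {
  const blocks = markdown
    .split(/\n\s*\n/)
    .map((block) => block.trim())
    .filter((block) => block.length > 0);

  const chunks: string[] = [];
  let current = "";
  for (const block of blocks) {
    const candidate = current ? `${current}\n\n${block}` : block;
    if (candidate.length <= maxChars || current === "") {
      // Either the block fits, or it starts a new (possibly oversized) chunk.
      current = candidate;
    } else {
      chunks.push(current);
      current = block;
    }
  }
  if (current) {
    chunks.push(current);
  }
  return chunks;
}

const sampleMarkdown = "# Pricing\n\nStarter is $10.\n\nPro is $30.";
const sampleChunks = chunkMarkdown(sampleMarkdown, 30);
// sampleChunks holds two chunks: the heading plus the Starter line, then the Pro line.
```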
Competitive intelligence. Product teams monitor competitor pricing, feature grids, and changelog updates. The crawl endpoint grabs a pricing page, a features page, and an about page in one call: one request, three credits, three pages of structured markdown you can diff weekly.
Page Q&A without building your own chain. The ask endpoint takes a URL and a natural-language question, reads the page, and returns a grounded answer. Teams use it for extracting structured facts ("what is the return policy?"), powering tooltips that explain linked content, or building quick comparison tools without writing a parser per site.
Enrichment beside social calls. You pull a creator profile and see a URL in their bio. One more request and you have the full text of whatever they are linking to—no context switch, no second integration, same SDK client.
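The "see a URL in their bio" step can be sketched as a small helper that pulls links out of free-form text. It takes a plain string because the profile field that holds the bio is not named in this post; extractUrls is a hypothetical helper, not an SDK method.

```typescript
// Pull http(s) URLs out of free-form bio text, trim trailing
// punctuation, and drop duplicates while preserving order.
function extractUrls(text: string): string[] {
  const matches = text.match(/https?:\/\/[^\s<>"')]+/g) || [];
  const cleaned = matches.map((match) => match.replace(/[.,;:!?]+$/, ""));
  return cleaned.filter((url, index) => cleaned.indexOf(url) === index);
}

const bio =
  "Shop: https://example.com/store. Blog at http://blog.example.com, and https://example.com/store";
const bioUrls = extractUrls(bio);
// bioUrls is ["https://example.com/store", "http://blog.example.com"]
```

Each extracted URL can then go to the markdown endpoint in a follow-up request.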
Content monitoring. Marketing teams pull their own landing pages as markdown, diff them against last week's version, and alert if messaging drifted after a rebrand or a deploy went wrong.
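For the weekly diff, a deliberately simple sketch is shown below. It compares two markdown snapshots as sets of non-empty lines, so it reports which lines appeared or disappeared but ignores reordering and repeated lines. The diffLines name and the LineDiff shape are illustrative; a dedicated diff library would give finer-grained output.

```typescript
interface LineDiff {
  added: string[];
  removed: string[];
}

// Compare two snapshots line by line, ignoring blank lines,
// surrounding whitespace, and line order.
function diffLines(previous: string, current: string): LineDiff {
  const normalize = (text: string): string[] =>
    text
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0);

  const before = normalize(previous);
  const after = normalize(current);
  return {
    added: after.filter((line) => before.indexOf(line) === -1),
    removed: before.filter((line) => after.indexOf(line) === -1),
  };
}

const lastWeek = "# Pricing\nStarter: $10/mo\nPro: $30/mo";
const thisWeek = "# Pricing\nStarter: $12/mo\nPro: $30/mo";
const pricingDiff = diffLines(lastWeek, thisWeek);
// pricingDiff.added is ["Starter: $12/mo"]; pricingDiff.removed is ["Starter: $10/mo"]
```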
A quick look
Same SDK client you already use for social endpoints:
const page = await client.web.getMarkdown({
  url: "https://competitor.com/pricing",
  filter: "fit",
});
if (page.ok) {
  // page.value.data.markdown.fit is ready for your LLM, your search index, or your diff tool
}

const answer = await client.web.ask({
  url: "https://competitor.com/pricing",
  q: "What is the cheapest plan?",
});
if (answer.ok) {
  console.log(answer.value.data.answer);
}
Markdown filters
- fit (default): readability-optimized, stripped of nav noise.
- raw: fuller DOM conversion.
- bm25: relevance-ranked against a query you provide.
Each filter is a separate request, not three fields from one call.
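One way to model "separate requests" in your own code is a discriminated union, so a bm25 request cannot be built without its query. This is a sketch of caller-side types, not the SDK's actual definitions. In particular, the query field name is an assumption, since this post only says bm25 takes "a query you provide".

```typescript
// Caller-side parameter types (hypothetical, not the SDK's own types).
// The `query` field name for bm25 is an assumption.
type MarkdownParams =
  | { url: string; filter?: "fit" | "raw" }
  | { url: string; filter: "bm25"; query: string };

// Getting all three views of a page means three requests,
// one parameter object per filter.
function requestsForAllFilters(url: string, query: string): MarkdownParams[] {
  return [
    { url, filter: "fit" },
    { url, filter: "raw" },
    { url, filter: "bm25", query },
  ];
}

const filterRequests = requestsForAllFilters("https://competitor.com/pricing", "enterprise plan");
```

Since each request is billed per page, fetching all three views of one URL would cost three credits under the pricing described above.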
Honest limits
- Public pages only. If a human needs a login or a cookie to see it, this API will not see it either.
- Live fetch, not cached. Expect seconds, not milliseconds. You get the page as it is right now, not a snapshot from an hour ago.
- Crawl is small and synchronous. Five URLs max. Built for "grab these specific pages," not "spider this entire domain."
- Ask is page-grounded. It reads what is publicly visible on the URL. It is not a general chatbot.
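Given the five-URL cap on crawl, a longer list has to be split on the caller's side. The sketch below shows one way to do that and to estimate credits before sending anything. The helper names are illustrative; only the limit of five and the one-credit-per-page price come from this post.

```typescript
// Limit stated above: at most five URLs per crawl request.
const MAX_URLS_PER_CRAWL = 5;

// Split a URL list into consecutive batches that respect the crawl limit.
function batchUrls(urls: string[], size: number = MAX_URLS_PER_CRAWL): string[][] {
  if (size < 1) {
    throw new Error("size must be at least 1");
  }
  const batches: string[][] = [];
  for (let start = 0; start < urls.length; start += size) {
    batches.push(urls.slice(start, start + size));
  }
  return batches;
}

// One credit per page, per the pricing described in this post.
function estimateCredits(urls: string[]): number {
  return urls.length;
}

const targets: string[] = [];
for (let i = 1; i <= 12; i++) {
  targets.push(`https://example.com/page-${i}`);
}
const crawlBatches = batchUrls(targets);
// 12 URLs become three batches of 5, 5, and 2, for an estimated 12 credits.
```

Because crawl is synchronous and each page is fetched live, sending the batches one after another is the conservative choice.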
Where to start
If you already have a Social Fetch API key, these endpoints are live now. No opt-in, no waitlist.
- Explore in the browser: open the playground under Web, paste a URL, see the response
- Wire it into your backend: same client.web.* methods in the TypeScript SDK
- Estimate volume: Pricing; one credit per page, same as everything else
Quickstart
Auth, first request, and the response envelope.
API reference
Full parameters and response shapes for all four Web routes.
Social Fetch is still 20+ social platforms in one integration. Web is the same key and the same JSON discipline for the URLs that are not on those networks—and that is the point.