Which platforms support comment listing?

TikTok, YouTube, Instagram, Facebook, Reddit, and Rumble expose comment endpoints under `/v1/{platform}/.../comments`. See the [API reference](/docs/api) for the full list.

Can I get reply threads?

Top-level comments paginate with `cursor`. YouTube also exposes `/v1/youtube/videos/comments/replies` — pass each comment's `repliesCursor` or the replies page `nextCursor`. Reddit returns nested `replies` on each comment; walk the tree or flatten before scoring.

Are deleted or hidden comments included?

You get what is publicly visible at lookup time. Private or removed comments will not appear; `lookupStatus` tells you if the parent post was reachable.

How are credits charged for comment pages?

Each paginated request is a separate lookup. A completed page that returns zero comments is still charged if the lookup completed successfully.

Should I score every comment on a viral post?

Usually no. Pull top pages by engagement, sample by like count, or cap pages per post. Full-thread pulls on 50k-comment videos burn credits without changing your aggregate sentiment much.

How do I dedupe comments across re-runs?

Upsert on `platform + comment.id`. Store `capturedAt` and `meta.requestId` on every row so you can tell a fresh pull from a cached snapshot.

Social Comments for Sentiment Analysis with an API (2026)

A sponsored caption can read positive while the comment section argues about shipping delays. Sentiment on post text alone misses that split — and on most networks the argument lives behind infinite scroll, different sort orders, and nested reply trees.

This guide walks through comment collection for scoring pipelines: which endpoints to call on TikTok, YouTube, Instagram, and Reddit, how pagination actually works, when to sample instead of draining a thread, and how to hand normalized text to keyword rules or an LLM classifier without double-counting the same reply on every cron run.

The short version

Pass a public post URL to a platform's comments endpoint, paginate with cursor, and feed comment.text into your classifier. Same { data, meta } envelope on every platform.

You'll need an API key. Test comment lookups in the Playground.

Why comments beat post text alone

Post copy is one voice. Comments are hundreds of independent reactions — often angrier, more specific, and closer to purchase intent than anything in the creative brief.

Signal	Where it shows up	Why it matters for sentiment
Objections	"shipping took 3 weeks"	Product gaps surface in replies, not captions
Brand perception	"used to love them until…"	Churn language clusters in threads
Campaign QA	Debate under sponsored posts	Creative testing via audience pushback
Creator risk	Toxic reply patterns	Partnership vetting before you sign

Manual copy-paste fails on pagination, breaks when URLs redirect, and does not scale to a launch monitor that runs hourly. A scheduled job with cursors and a warehouse table does.

Pair comment pulls with search-based listening when you do not already have the post URL — search finds candidates; comments endpoints score the ones that crossed your velocity threshold.

Which platforms expose comments

Social Fetch normalizes comment listing under predictable routes. Pattern: GET /v1/{platform}/posts/comments or .../videos/comments, required url, optional cursor.

Platform	Route	Notable query params	Reply depth
TikTok	`/v1/tiktok/videos/comments`	`trim` for lighter payloads	Top-level per page; `replyCount` on each row
YouTube	`/v1/youtube/videos/comments`	`order=top` or `newest`	`/v1/youtube/videos/comments/replies` per thread
Instagram	`/v1/instagram/posts/comments`	works on posts and reels	Top-level; `replyCount` when reported
Reddit	`/v1/reddit/posts/comments`	`trim`	Nested `replies` on each comment
Facebook	`/v1/facebook/posts/comments`	—	Top-level pages
Rumble	`/v1/rumble/videos/comments`	—	Top-level pages

Every response includes data.lookupStatus (found or not_found), data.comments, data.page.nextCursor, data.page.hasMore, and meta.requestId. Field names differ slightly per platform — TikTok uses likes on the comment object; YouTube uses likeCount; Reddit uses score — but text and id are always there for scoring.

Phase 0: Scope the scoring job

Before you paginate, write down what "done" means. Vague goals produce expensive full-thread pulls on videos that do not need them.

Job type	Example question	Typical pull
Launch monitor	Did sentiment flip negative after the drop?	Top 2–3 pages per flagged post, hourly
Creator vetting	Are reply threads hostile?	First 500 comments by likes on recent posts
Support triage	Which posts need a human?	Keyword filter on newest page only
Research archive	What objections appear in r/SaaS?	Full pagination on 10–20 high-score threads

List inputs: post URLs from your CMS, URLs from a Reddit research sprint, or video IDs from TikTok search. Set a per-post page cap before you write the loop — cursors make it easy to overspend.

TikTok video comments

TikTok threads move fast and skew short. Comments often carry the meme reaction while the on-screen caption stays neutral.

curl -sS \
  -H "x-api-key: $SOCIALFETCH_API_KEY" \
  -G "https://api.socialfetch.dev/v1/tiktok/videos/comments" \
  --data-urlencode "url=https://www.tiktok.com/@mrbeast/video/7596844935442189598"

Useful fields for scoring:

text — primary classifier input
likes — weight positive/negative aggregates (high-like sarcasm still counts)
language — route to locale-specific models without guessing
replyCount — prioritize threads with debate before you pay for more pages
author.handle — blocklist repeat spam accounts client-side

Optional trim=true drops ancillary fields when you only need text and ids for batch scoring.

Reference: TikTok video comments.

On viral videos, totalComments in the response helps you decide whether full pagination is worth it — a post with 40,000 comments rarely needs every page for a dashboard chart.

YouTube video comments

YouTube exposes sort order at query time. That matters because "top" and "newest" can disagree on sentiment for the same video.

const params = new URLSearchParams({"url":"https://www.youtube.com/watch?v=dQw4w9WgXcQ","order":"top"});

const response = await fetch(
  `https://api.socialfetch.dev/v1/youtube/videos/comments?${params.toString()}`,
  {
    headers: {
      "x-api-key": process.env.SOCIALFETCH_API_KEY,
    },
  }
);

const body = await response.json();

console.log(response.status, body);

`order` value	When to use
`top`	Default for brand monitoring — surfaces the visible thread
`newest`	Launch week, controversy spikes, or support triage

Fields worth storing:

likeCount and replyCount — same weighting story as TikTok
author.creator — creator replies often clarify context; tag separately in aggregates
repliesCursor — non-null means a sub-thread exists

Nested replies are a separate endpoint. After listing top-level comments, fetch replies for rows where replyCount exceeds your threshold:

curl -sS \
  -H "x-api-key: $SOCIALFETCH_API_KEY" \
  -G "https://api.socialfetch.dev/v1/youtube/videos/comments/replies" \
  --data-urlencode "cursor=REPLIES_CURSOR_FROM_COMMENT"

Each reply page bills separately. For sentiment, scoring top-level comments plus high-reply threads usually captures the argument without walking every leaf node.

Reference: YouTube video comments · Comment replies.

Instagram post comments

Instagram comments on reels and feed posts share one route. Pass the full post or reel URL from the share sheet — shortened instagr.am links usually work, but the canonical /p/ or /reel/ URL is safest.

curl -sS \
  -H "x-api-key: $SOCIALFETCH_API_KEY" \
  -G "https://api.socialfetch.dev/v1/instagram/posts/comments" \
  --data-urlencode "url=https://www.instagram.com/reel/example/"

TypeScript with the SDK:

const result = await client.instagram.getPostComments({
  url: "https://www.instagram.com/reel/example/",
});

if (result.ok && result.value.data.lookupStatus === "found") {
  for (const comment of result.value.data.comments) {
    console.log(comment.text, comment.likeCount);
  }
}

Instagram returns top-level comments per page. replyCount tells you where sub-threads exist; there is no separate replies route today — weight high-reply top-level comments heavily if the debate lives in one visible reply chain.

author.verified helps filter brand-account replies when you want audience-only sentiment.

Reference: Instagram post comments.

Reddit post comments

Reddit is where sentiment work pays off for B2B and product research — long objections, switching stories, and nested arguments. If you arrived from search, pull comments only on threads that passed your score filter (see the Reddit research guide).

curl -sS \
  -H "x-api-key: $SOCIALFETCH_API_KEY" \
  -G "https://api.socialfetch.dev/v1/reddit/posts/comments" \
  --data-urlencode "url=https://www.reddit.com/r/SaaS/comments/example/"

Reddit comments include depth, nested replies, and score. Flatten before feeding an LLM if your classifier expects a list of strings:

function flattenComments(comments: Array<{ text: string; replies?: { items: typeof comments } }>): string[] {
  const out: string[] = [];
  for (const c of comments) {
    out.push(c.text);
    if (c.replies?.items?.length) {
      out.push(...flattenComments(c.replies.items));
    }
  }
  return out;
}

Use trim=true when you only need text, score, and ids. Keep isSubmitter — OP clarifications change how you read negative replies.

Reference: Reddit post comments.

Facebook posts follow the same URL-in pattern at /v1/facebook/posts/comments when you need comment text on Page posts or public group discussions.

Paginate the full thread

Comment endpoints return one page plus data.page.nextCursor. Loop until hasMore is false or you hit your budget cap:

typescript

import { SocialFetchClient } from "@socialfetch/sdk";

const client = new SocialFetchClient({
  apiKey: process.env.SOCIALFETCH_API_KEY!,
});

const videoUrl = "https://www.tiktok.com/@mrbeast/video/7596844935442189598";
const comments = [];
let cursor: string | undefined;

do {
  const result = await client.tiktok.getVideoComments({ url: videoUrl, cursor });
  if (!result.ok) break;
  if (result.value.data.lookupStatus !== "found") break;

  comments.push(
    ...result.value.data.comments.map((c) => ({
      text: c.text,
      likes: c.metrics?.likes ?? 0,
    })),
  );
  cursor = result.value.data.page.nextCursor ?? undefined;
} while (cursor);

// Feed comments.text into your sentiment model or keyword rules.
console.log(comments.length, "comments ready for analysis");

Checklist for every pagination loop:

Stop when lookupStatus !== "found" — do not keep passing cursors after not_found.
Prefer hasMore over guessing from array length — empty pages can still have a cursor on some platforms.
Pass cursor verbatim — do not parse or construct tokens.
Log meta.requestId per page for support traces.

Re-running the same cursor chain twice bills twice. Treat cursors as single-use pagination tokens within one ingestion run.

Sampling strategies

Full-thread pulls are the exception, not the default. A 30-second TikTok with 8,000 comments does not need 8,000 rows for a launch dashboard — the top three pages by default sort already skew what humans see in-app.

Strategy	How	Good for
Page cap	Stop after N cursor iterations	Hourly monitors, credit budgets
Engagement floor	Skip comments where likes/score < threshold	Noise reduction on viral posts
Reply threshold	Only fetch YouTube reply pages when `replyCount` > 10	Debate-heavy threads
Time window	Re-fetch newest page only on repeat runs	Trend detection vs archive
Stratified sample	Top page + newest page (YouTube `order`)	When sort order disagrees
Search handoff	Comments only on posts from search hits above velocity	Monitoring workflows

For LLM batch jobs, chunk comments into batches of 50–100 texts with the post URL in the system prompt. Asking the model to score 5,000 comments in one prompt produces mushy averages and blows context limits.

Weight aggregates by engagement: weighted_score = sentiment_score * log1p(likes) damps single-comment noise without ignoring a 10k-like pile-on.

Feed LLM classifiers

Normalized JSON maps cleanly to three scoring patterns.

Keyword rules (cheap, deterministic)

Run first. Flag refund, scam, love this, competitor names before you spend tokens.

const NEGATIVE = /\b(scam|refund|broken|worst)\b/i;

for (const row of comments) {
  if (NEGATIVE.test(row.text)) {
    await routeToSlack({ postUrl, text: row.text, requestId: row.requestId });
  }
}

Classical ML (fast at volume)

Export text + likes/score to your sklearn or Hugging Face pipeline. Train on a few hundred hand-labeled comments per vertical; rules catch spikes, the model catches paraphrases.

LLM batch classification

Send structured input — one JSON object per batch:

{
  "postUrl": "https://www.tiktok.com/@brand/video/123",
  "comments": [
    { "id": "c1", "text": "shipping took forever", "likes": 240 },
    { "id": "c2", "text": "ordered two more", "likes": 89 }
  ]
}

Ask for per-id labels (positive, negative, neutral, mixed) and optional themes. Require JSON output; parse defensively. Store the model version on each row.

Preprocessing that usually helps:

Strip URLs and bare @mentions if your model treats them as noise.
Keep emoji when monitoring Gen-Z brands — stripping flattens sarcasm.
Attach language (TikTok) or detect locale before routing to the right prompt.
Pair with transcripts when spoken claims and comment reactions diverge.

Roll per-comment labels into post-level scores: weighted fraction positive, or "mixed" when top and bottom quartiles disagree.

Deduplication and warehouse rows

Comment pipelines fail in the warehouse, not at the HTTP layer. Plan keys before your second cron run.

Problem	Fix
Same comment on re-fetch	Upsert on `platform + comment.id`
Same text, different ids (rare)	Secondary hash on `normalize(text)` for analytics only
Same post scored twice in one run	Dedupe URLs before the outer loop
Stale sentiment in dashboards	Store `capturedAt`; re-pull on schedule, do not mutate old rows
Audit trail	Persist `meta.requestId` and `meta.creditsCharged` per page

Suggested flat row for Postgres or BigQuery:

{
  "platform": "tiktok",
  "postUrl": "https://www.tiktok.com/@brand/video/123",
  "commentId": "7596851467001119502",
  "text": "shipping took forever",
  "likes": 240,
  "sentiment": "negative",
  "themes": ["shipping"],
  "capturedAt": "2026-06-30T14:00:00.000Z",
  "requestId": "req_01example",
  "creditsCharged": 1
}

When you merge comment sentiment back to a listening snapshot, join on postUrl or platform-native video id — not on title text, which collides across reposts.

Troubleshooting

not_found but the post opens in a browser

Comments may be disabled on that post — lookup still resolves for many platforms; you get zero rows with found.
Private, age-gated, or region-blocked media returns not_found at request time.
Pass the canonical URL (full YouTube watch link, full Reddit /comments/ path). Tracker URLs and stripped share links fail more often.

Empty comments array with lookupStatus: found

Legitimate — new posts, brand accounts with comments off, or cleared threads.
Still billed — the upstream lookup completed. Budget for empty pages in cron jobs.

Pagination stops early or repeats

Only pass cursor from the immediately previous response.
If hasMore is true but nextCursor is null, stop and log requestId — that is an upstream anomaly worth a support ticket.
Cap pages per post so a bug does not loop until your balance drains.

YouTube top vs newest disagree

Run two passes with different order values if sort skew matters for your decision. That is two credit lines per page, not one.

Reddit nested replies missing from classifier input

Top-level pagination does not always inline every nested reply on one page. Walk replies.items recursively or flatten as shown above.

TikTok id looks like a placeholder

Some comments use deterministic ids derived from available fields. Still stable within a pull — use them for dedupe keys in that run.

lookup_failed or HTTP 503

Not charged. Retry with backoff; include meta.requestId from the failed attempt if you contact support.

Classifier drift

Re-score a frozen comment snapshot when you change models — do not compare June scores to March scores across model versions without a calibration pass.

Billing notes

Each page is its own metered lookup (typically one credit). meta.creditsCharged on every response tells you what that page cost.

Action	Credits (typ.)
1 TikTok comment page	1
1 YouTube comment page	1
1 YouTube reply page	1
1 Instagram comment page	1
1 Reddit comment page	1
10 posts × 3 pages each	30

Completed lookups that return not_found (deleted video) still ran upstream and are billed. Infrastructure failures (lookup_failed, 503) are not. See Credits.

Rough planning formula: (posts you care about) × (pages per post) + (YouTube reply pages you opt into). A launch monitor on 20 posts with a 3-page cap is about 60 credits per run.

What you can build

Launch monitoring — score comment sentiment hourly after a product drop; alert when negative fraction crosses a threshold.
Creator vetting — flag toxic reply patterns and recurring spam before signing a partnership.
Support triage — route negative threads to Slack when keyword rules fire on the newest page.
Competitive dashboards — same classifier on your posts and competitor posts discovered via search.
Research exports — flat CSV of Reddit thread comments with themes for PM decks.

Next steps: Playground · API reference · Pricing