How to Track AI Share of Voice (Step-by-Step, 2026)

Updated 2026-05-22

Questions this guide answers

How do I measure AI share of voice?
What is AI share of voice?
How to track brand share of voice in ChatGPT?
What's the formula for AI SOV?
Which tool measures AI search share of voice?

Direct answer

AI share of voice (SOV) is the share of brand mentions your brand earns in AI-engine answers, measured across a defined prompt set, time window, and tracked competitor set. The core formula: AI SOV = your brand mentions / (your brand mentions + competitor brand mentions) × 100, computed per engine (ChatGPT, Perplexity, Google AI Mode, Gemini, Claude), per prompt, per day.

SOV in AI search vs traditional SOV

Traditional share of voice was built for media and social listening. Brandwatch, Talkwalker, and the Cision-era tools defined SOV as "what percentage of category conversation is about your brand" across earned mentions in news, blogs, social posts, and forums. The denominator was the entire detectable conversation volume in a category. The unit was a mention in human-produced media.

AI SOV inverts the unit. Instead of "how often do humans talk about you," it measures "how often does an AI answer engine produce your brand name when a buyer asks a relevant question." The denominator is no longer organic conversation; it's a controlled set of buyer-intent prompts that you (the analyst) define. The numerator is responses in which your brand surfaces.

This matters for three reasons:

The prompt set is the experiment. Traditional SOV ingests whatever conversation exists. AI SOV requires you to construct the prompt set deliberately, because the prompt set defines the market boundary you're measuring against. Two analysts measuring the "same" brand can produce wildly different SOV numbers if their prompt sets differ. See our golden prompt set methodology for how to constrain this.
The engine matters as much as the prompt. A brand can hold 35% SOV in ChatGPT and 6% in Perplexity for the same prompt set, because the engines source different corpora and weight signals differently. ChatGPT Search runs on Bing + GPTBot; Perplexity does real-time RAG; Google AI Mode and AIO pull from the Google index with ranking signals; Claude pulls from Brave; Gemini uses the Google index with its own grounding. A single blended SOV number hides the channel reality.
Frequency matters. AI responses are probabilistic. The same prompt run three times in fresh sessions can produce three different brand sets. A one-shot SOV measurement is closer to a single sample than a metric. You need repeated runs over a time window before the number is decision-grade.
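To see why a one-shot measurement behaves like a single sample, it can help to put an uncertainty range around a mention rate. The sketch below uses a Wilson score interval, which is our choice for illustration and not something the method above prescribes. It assumes each run is an independent yes/no trial, which is only approximately true of real engine sessions; `wilson_interval` and the run counts are hypothetical.

```python
import math


def wilson_interval(hits, runs, z=1.96):
    """Approximate 95% Wilson score interval for a mention rate (fractions in 0..1)."""
    if runs <= 0:
        raise ValueError("need at least one run")
    p = hits / runs
    denom = 1 + z**2 / runs
    center = (p + z**2 / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z**2 / (4 * runs**2)) / denom
    return center - half, center + half


# Same observed rate (one mention in three runs), very different certainty.
one_shot = wilson_interval(1, 3)      # three fresh-session runs of one prompt
month = wilson_interval(30, 90)       # three runs a day for thirty days
print(one_shot, month)
```

The interval for three runs is several times wider than the interval for ninety, which is the practical meaning of "decision-grade": the observed rate is identical, but only the longer window narrows it enough to act on.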

The formula

There isn't one AI SOV formula — there are three, and they answer different questions. Use the one that matches the question you're trying to answer.

Question	Use
Are buyers seeing our brand in AI answers at all?	Mention-rate SOV
Are buyers seeing us as a recommended answer?	Primary-mention SOV
Is our owned content being used as the source?	Citation-rate SOV
Single board-level metric?	Primary-mention SOV (closest to revenue)

The three SOV formulas

Mention-rate SOV = (responses where your brand is mentioned) / (responses where any tracked brand, including yours, is mentioned) × 100

Window this by prompt set and time. This is the headline number most vendors report. It tells you what share of the answer surface your brand occupies relative to your tracked competitor set. Use it when you want a single comparable metric to track week-over-week.

Watch-out: it counts a passing mention ("alternatives include X, Y, and Z") the same as a primary recommendation ("the strongest option here is X"). For shortlist influence, that's misleading.

Primary-mention SOV = (responses where your brand is the named primary answer) / (total responses where any tracked brand is primary) × 100

A "primary" mention is one where the AI engine frames your brand as the answer, not a side option. Examples of primary patterns: "the leading option for X is BRAND," "I'd recommend BRAND," "BRAND is the best fit for this use case." You'll need a manual or LLM-judged classifier to distinguish primary from listed, but the work pays for itself — primary-mention SOV correlates with buyer shortlist behavior far better than mention rate. This is closer to what we call AI Share of Recommendation.
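As a very rough starting point for the primary-versus-listed distinction, a phrase-pattern check can pre-label the obvious cases before a human or an LLM judge reviews the rest. This is a crude stand-in, not the classifier described above: the templates are lifted from the example phrasings in this section, `is_primary_mention` is a hypothetical helper, and paraphrases will slip through.

```python
import re

# Templates based on the example phrasings in this guide; {brand} is filled in per call.
PRIMARY_TEMPLATES = [
    r"the leading option for .+? is {brand}\b",
    r"the strongest option here is {brand}\b",
    r"i(?:['’]d| would) recommend {brand}\b",
    r"{brand} is the best fit\b",
]


def is_primary_mention(answer, brand):
    """True if `answer` matches one of the known 'framed as the answer' patterns."""
    escaped = re.escape(brand)  # brand names may contain regex metacharacters
    return any(
        re.search(template.format(brand=escaped), answer, flags=re.IGNORECASE)
        for template in PRIMARY_TEMPLATES
    )


print(is_primary_mention("For small teams, I'd recommend Acme.", "Acme"))
print(is_primary_mention("Alternatives include Acme, Beta, and Gamma.", "Acme"))
```

A check like this is best treated as a recall-poor first pass: anything it flags is probably primary, while anything it misses still needs a judged label. Brand names that end in a non-word character would also need a different boundary than `\b`.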

Citation-rate SOV = (URLs from your domain cited across the prompt set) / (total cited URLs across tracked competitor domains) × 100

This is the URL-level metric. It tells you the share of the *source layer* your owned domain controls. Useful when you care about traffic potential and source authority, not just brand recall in the answer. A brand can have high mention SOV and low citation SOV (engines name you but cite a third-party review) or vice versa (engines cite your docs but don't surface your brand name in the body of the answer). Both gaps need different fixes.

Mention-rate SOV — use when you want one comparable breadth metric to track week-over-week; answers "are buyers seeing our brand in AI answers at all?"
Primary-mention SOV — use as your single board-level number; it's closest to revenue because it answers "are buyers seeing us as the recommended answer, not just listed?"
Citation-rate SOV — use when you care about traffic potential and source authority; answers "is our owned content being used as the source?"
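The three formulas above can be sketched over a list of captured responses. The `Response` schema, the brand names, and the `.example` domains are all hypothetical, and the sketch assumes the citation denominator includes your own domain alongside competitor domains, by analogy with the other two formulas.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Response:
    """One captured AI answer (hypothetical schema for this sketch)."""
    mentioned: Set[str]                  # tracked brands named anywhere in the answer
    primary: Optional[str] = None        # brand framed as the answer, if any
    cited_domains: List[str] = field(default_factory=list)  # one entry per cited URL


def mention_rate_sov(responses, brand):
    """Percent of brand-bearing responses that mention `brand`."""
    with_any = [r for r in responses if r.mentioned]
    if not with_any:
        return 0.0
    return 100 * sum(brand in r.mentioned for r in with_any) / len(with_any)


def primary_mention_sov(responses, brand):
    """Percent of responses with a primary brand where that brand is `brand`."""
    with_primary = [r for r in responses if r.primary is not None]
    if not with_primary:
        return 0.0
    return 100 * sum(r.primary == brand for r in with_primary) / len(with_primary)


def citation_rate_sov(responses, domain, tracked_domains):
    """Percent of cited URLs on tracked domains that belong to `domain`."""
    cited = [d for r in responses for d in r.cited_domains if d in tracked_domains]
    if not cited:
        return 0.0
    return 100 * cited.count(domain) / len(cited)


responses = [
    Response({"Acme", "Beta"}, primary="Acme",
             cited_domains=["acme.example", "reviews.example"]),
    Response({"Beta"}, primary="Beta", cited_domains=["beta.example"]),
    Response({"Acme", "Beta", "Gamma"}, cited_domains=["acme.example"]),
    Response(set()),  # names no tracked brand, so it drops out of every denominator
]
tracked = {"acme.example", "beta.example"}

acme_mention = mention_rate_sov(responses, "Acme")                   # 2 of 3 brand-bearing responses
acme_primary = primary_mention_sov(responses, "Acme")                # 1 of 2 responses with a primary
acme_citation = citation_rate_sov(responses, "acme.example", tracked)  # 2 of 3 tracked citations
print(acme_mention, acme_primary, acme_citation)
```

Because one response can name several brands, mention-rate figures for different brands need not sum to 100: in this toy data Beta appears in all three brand-bearing responses while Acme appears in two.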

Three measurement methods

A single blended SOV number is the metric equivalent of a vanity dashboard tile. Useful for the slide; useless for the diagnosis. You need three cuts.

Per-prompt SOV

Compute SOV one prompt at a time, across all engines, over your time window. This is the diagnostic view. You'll find that you hold 60% SOV on "best X for small teams" and 0% on "X alternatives", a pattern that points to a specific content gap (you don't have an alternatives page, your competitors do). Per-prompt is the only cut that drives actionable content decisions.

Per-engine SOV

Compute SOV per engine, holding the prompt set constant. This tells you which engine is your strength and which is your weakness. A typical B2B SaaS pattern: strong on Perplexity (because Perplexity's real-time RAG favors recent posts), weak on Google AI Mode (because Google AI Mode weights established domain authority). The fix for each engine is different: Perplexity rewards fresh first-party content; Google AI Mode rewards source-layer presence on high-authority third-party domains. The cross-engine inconsistency is normal; we wrote about why two AEO platforms can disagree, and the same logic applies to your own measurement.

Per-segment or per-persona SOV

Segment your prompt set by buyer journey stage (category-discovery, comparison, alternatives, implementation, post-purchase) or by persona (security buyer, end user, procurement). Compute SOV per segment. You may find that you dominate the bottom-of-funnel "BRAND vs COMPETITOR" prompts but lose the top-of-funnel "best X" prompts, which means buyers who already know you find you, but new buyers don't. That's a content portfolio gap, not a brand gap, and the fix is different from a category-mention deficit.
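All three cuts are the same calculation grouped by a different column, which a short sketch makes concrete. The row layout (`prompt`, `engine`, `segment`, `mentioned`) and the brand names are assumptions for illustration, and the function computes mention-rate SOV only.

```python
from collections import defaultdict


def sov_by(rows, brand, key):
    """Mention-rate SOV (percent) for `brand`, grouped by the `key` column of `rows`."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for row in rows:
        if not row["mentioned"]:   # skip answers that name no tracked brand
            continue
        totals[row[key]] += 1
        hits[row[key]] += brand in row["mentioned"]
    return {group: 100 * hits[group] / totals[group] for group in totals}


rows = [
    {"prompt": "best X for small teams", "engine": "ChatGPT",
     "segment": "category-discovery", "mentioned": {"Acme", "Beta"}},
    {"prompt": "best X for small teams", "engine": "Perplexity",
     "segment": "category-discovery", "mentioned": {"Acme"}},
    {"prompt": "X alternatives", "engine": "ChatGPT",
     "segment": "alternatives", "mentioned": {"Beta"}},
    {"prompt": "X alternatives", "engine": "Perplexity",
     "segment": "alternatives", "mentioned": {"Beta", "Gamma"}},
]

per_prompt = sov_by(rows, "Acme", "prompt")
per_engine = sov_by(rows, "Acme", "engine")
per_segment = sov_by(rows, "Acme", "segment")
print(per_prompt, per_engine, per_segment)
```

In this toy data the per-engine view shows Acme at 50% on both engines, which looks unremarkable, while the per-prompt view exposes the real gap: 100% on the "best X" prompt and 0% on "X alternatives".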

Worked example (illustrative scenario)

To make this concrete, here's an illustrative scenario in a neutral category — the cloud data-warehouse space — that mirrors what a real SOV run produces. We use it instead of a named live brand so the mechanics, not the specific vendors, are what you take away.

Methodology disclosure: the numbers below are invented-but-realistic, modeled on the shape a representative run produces — 22 prompts × 4 engines (ChatGPT, Perplexity, Google AI Overviews, Gemini) × 22 daily runs = 1,936 responses over a 30-day window. The leader names are real market leaders (Snowflake, Databricks, BigQuery); "Northwind Data" is an anonymous stand-in for a mid-market warehouse vendor. Treat the figures as an illustration of the method, not as a measured result.

Mention-rate SOV across the tracked competitor set:

Vendor	Mention-rate SOV
Snowflake	48.9%
Databricks	44.7%
BigQuery	22.5%
Northwind Data (illustrative mid-market vendor)	4.8%

Because a single response can name several vendors, the column does not sum to 100%.

How to read this:

The headline reading is "Snowflake owns category-level AI mindshare in the data-warehouse space, with Databricks a close second." Snowflake is named in nearly half of all responses for warehouse-category prompts. That's a dominant SOV position.
A mid-market vendor like "Northwind Data" sits near the bottom at 4.8%. That bottom-of-leaderboard position is exactly the kind of gap a measurement program is built to surface and close: being mentioned rarely, on the prompts buyers actually ask, is the diagnosis the SOV cut exists to deliver.
Mention-rate SOV alone overstates Snowflake's true competitive moat. If we recomputed using primary-mention SOV (filtering for responses where the vendor is framed as *the* answer, not one of several listed options), the leaderboard compresses meaningfully. Snowflake is "the answer" less often than its mention rate suggests; many of its mentions are in lists where competitors appear adjacent. That's why the primary cut matters.
Per-engine, the pattern would split further. Snowflake's strength is most pronounced on ChatGPT and Perplexity (where listicle-style third-party content cites it heavily). On Google AI Overviews, the SOV gap tightens because Google AIO pulls from a different source mix.

The actionable insight is not "Snowflake is winning." It's "the data-warehouse category is consolidating around two leaders in mention-rate, but no vendor is yet dominant in primary-mention or citation rate, which means the category is still a winnable game for any brand willing to operate the source layer (Wikipedia, Reddit, TechRadar, G2)." That's a category-level read that mention-rate alone can't surface.

Common SOV mistakes

Five mistakes we see almost every team make when they first start tracking AI SOV.

1. Confusing mention with primary mention. A brand mentioned eighth in a 12-brand list is not winning. If your dashboard only reports mention rate, you're optimizing for "appear in the answer somewhere," which is a much weaker goal than "be the recommended answer."

2. Mixing prompt-set definitions across vendors. Two AEO platforms can report different SOV numbers for the same brand because they're running different prompt sets. If you switch tools, you'll see a jump or drop that's purely an artifact of the prompt set. Document your prompt set explicitly. If you change it, treat the SOV series as a new series — don't pretend it continues.

3. Running once instead of continuously. AI responses are probabilistic. A single SOV snapshot is closer to a single sample than a metric. You need at least 3–5 runs per prompt per engine, ideally daily, over a 30-day window before the SOV number is decision-grade. Otherwise you're measuring noise.

4. Ignoring prompt-volume weighting. Not all prompts have equal buyer demand. "Best CRM for small teams" gets 100x the buyer query volume of "best CRM with native SAP integration." If you treat both prompts as equal in your SOV denominator, you're over-counting the long tail. Either weight prompts by external query-volume estimates (when available) or report SOV separately for high-volume vs long-tail prompt subsets.

5. Reporting one blended number to executives. A single category-level SOV number is fine for the board slide, but the operating cadence needs per-prompt, per-engine, per-segment cuts. The blended number can move 2 points week-over-week from noise; the per-prompt cut shows you exactly which prompt drove the move.

How to set up a starter SOV tracker (30-day rollout)

You can stand up a credible SOV tracker in a month. Here's the rollout we recommend.

Week 1: Define the competitive set and prompt set.

Pick 5–8 competitors. Fewer than 5 underrepresents the category; more than 8 dilutes the signal and creates a long tail of brands that drift in and out of the answer surface.
Build a 25–50 prompt set that covers category-discovery, comparison, alternatives, and implementation prompts. Use real buyer language, not internal product taxonomy. Our golden prompt set methodology walks through the construction.
Document the prompt set as a versioned artifact. Treat it the way an SEO team treats a tracked keyword list.

Week 2: Choose engines and tracking cadence.

Track at minimum ChatGPT, Perplexity, Google AI Mode (or AIO), and Gemini. Add Claude if your buyers are technical. Skip the "we cover 50+ engines" pitch: depth on four engines beats breadth on fifty.
Set cadence to daily runs with at least 3 fresh-session executions per prompt per engine. Weekly cadence is too sparse to catch directional changes; daily smooths the probabilistic noise.
Decide capture method: direct API (where available), platform-grade dual-channel measurement, or controlled scraping. Our visibility measurement methodology covers the trade-offs.

Week 3: Build the measurement pipeline.

Compute all three SOV variants (mention-rate, primary-mention, citation-rate) daily.
Build per-prompt, per-engine, and per-segment cuts.
Set up alerting on week-over-week SOV deltas above a noise threshold (typically ±3 points for a stable prompt set).

Week 4: Build the reporting cadence.

Daily: internal monitoring of SOV deltas and prompt-level drops. This is for the analyst desk, not executives.
Weekly: team-level review of per-prompt and per-engine cuts. Tie deltas to recent content shipped or competitor moves.
Monthly: executive rollup with one blended SOV number, three highlight prompts (best win, worst loss, biggest mover), and one decision the data is asking the team to make.

If you'd rather not stand all this up yourself, run a free 10-prompt audit on your brand. It returns the same SOV data shape against your top 3 competitors, with URL-level citation evidence, and is usable as your baseline week-1 measurement.
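The alerting step in the rollout (week-over-week deltas above a noise threshold of roughly ±3 points) can be sketched as a comparison of two SOV snapshots. The dictionary layout, the prompt names, and the `sov_alerts` helper are hypothetical; the 3-point default is the threshold suggested above for a stable prompt set.

```python
def sov_alerts(last_week, this_week, threshold=3.0):
    """Return {key: delta} for week-over-week SOV moves larger than `threshold` points."""
    alerts = {}
    for key, current in this_week.items():
        if key not in last_week:
            continue  # new prompt or engine: no baseline to compare against yet
        delta = current - last_week[key]
        if abs(delta) > threshold:
            alerts[key] = round(delta, 1)
    return alerts


last_week = {"best X for small teams": 60.0, "X alternatives": 0.0, "X vs Y": 41.0}
this_week = {"best X for small teams": 58.5, "X alternatives": 6.0,
             "X vs Y": 41.5, "X pricing": 10.0}

alerts = sov_alerts(last_week, this_week)
print(alerts)  # only "X alternatives" moved by more than 3 points
```

Keys without a prior-week value are skipped instead of flagged, which matches the advice to treat a changed prompt set as a new series and not as a movement in the old one.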

If you want to see your own brand's SOV against your top competitors without building the measurement pipeline yourself, run a free 10-prompt ChatGPT visibility audit. It returns the same prompt-set structure and per-prompt scoring described in this guide, scoped to a one-engine baseline for your brand.

*Last updated 2026-05-22. Worked-example data is an illustrative scenario in the cloud data-warehouse category, modeled on the shape of a representative 30-day cross-engine run (22 prompts × 4 engines × 22 daily runs = 1,936 responses); the figures are invented-but-realistic, not a measured result. See our visibility measurement methodology for the underlying capture protocol.*

FAQ

What's the difference between SOV and share of recommendation?

SOV measures whether your brand is mentioned. Share of recommendation measures whether your brand is recommended — framed as the answer, not just listed. SOV is the wider top-of-funnel metric; share of recommendation is the tighter shortlist-influence metric. We recommend tracking both: SOV as the breadth indicator, share of recommendation as the conversion-proximate one.

How many competitors should I track?

Five to eight. Below five, you under-represent the category and risk overstating your own SOV. Above eight, you dilute signal and the long-tail competitors drift in and out of the answer surface from week to week, adding noise. If your category has more than eight real competitors, tier them: a core 5–8 in the primary SOV calculation, plus a watch-list set you track but don't include in the headline denominator.

What about prompt-volume weighting?

If you can get external query-volume estimates per prompt (from SEO tools, internal search logs, or buyer interviews), weight your SOV calculation by them. The simplest implementation: multiply each prompt's SOV contribution by its estimated monthly query volume, then normalize. Without weighting, a long-tail prompt with 10 monthly buyer queries counts the same as a head prompt with 10,000 — which over-rewards niche wins and under-counts category-defining losses.
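The "multiply, then normalize" step can be sketched as a weighted average of per-prompt SOV values. The function name and the sample figures are hypothetical, and the sketch assumes you already have one SOV value and one volume estimate per prompt.

```python
def volume_weighted_sov(per_prompt_sov, monthly_volume):
    """Average per-prompt SOV (percent), weighted by estimated monthly query volume."""
    total_volume = sum(monthly_volume[p] for p in per_prompt_sov)
    if total_volume == 0:
        raise ValueError("total query volume is zero")
    weighted = sum(per_prompt_sov[p] * monthly_volume[p] for p in per_prompt_sov)
    return weighted / total_volume


per_prompt_sov = {"head prompt": 10.0, "long-tail prompt": 90.0}
monthly_volume = {"head prompt": 10_000, "long-tail prompt": 10}

unweighted = sum(per_prompt_sov.values()) / len(per_prompt_sov)
weighted = volume_weighted_sov(per_prompt_sov, monthly_volume)
print(unweighted, weighted)
```

Here the unweighted average is 50, while the weighted figure sits just above 10: the niche win barely moves the number once the head prompt's volume is counted.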

Can I DIY this in a spreadsheet?

For a one-time baseline across one engine and 25 prompts, yes — a spreadsheet plus manual prompt execution plus a simple count formula works. For continuous tracking across four engines, daily runs, three executions per prompt, and three SOV variants, you're looking at 1,500+ responses per week to label and parse. That's a pipeline, not a spreadsheet. Most teams move to an AEO platform by week 3 of the rollout.
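For a sense of scale, the weekly response volume follows directly from the rollout parameters. The helper below is a hypothetical one-liner that multiplies them out, using the 25–50 prompt range, four engines, and three runs per day suggested earlier in this guide.

```python
def responses_per_week(prompts, engines, runs_per_day, days=7):
    """Number of responses captured in one week of continuous tracking."""
    return prompts * engines * runs_per_day * days


low = responses_per_week(prompts=25, engines=4, runs_per_day=3)
high = responses_per_week(prompts=50, engines=4, runs_per_day=3)
print(low, high)
```

Even the 25-prompt end of the range comes out above the 1,500 mark, and each of those responses still has to be parsed for mentions, primary framing, and citations.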

Why do two AEO platforms give me different SOV numbers for the same brand?

Almost always because their prompt sets differ, their engine mix differs, or their capture methodology differs. Less commonly, because one is using primary-mention SOV and the other mention-rate SOV under the same label. We wrote a full breakdown of why two AEO platforms can disagree — short version: ask each vendor to show you the prompt set and the per-prompt result, not just the headline number.

Is SOV the same across engines?

No, and you should not expect it to be. ChatGPT Search (Bing + GPTBot), Perplexity (real-time RAG), Google AI Mode and AIO (Google index + ranking), Claude (Brave), and Gemini (Google index + grounding) source different corpora and weight signals differently. A 30-point SOV gap across engines for the same brand is common, not anomalous.

How often should I refresh the prompt set itself?

Quarterly is the working cadence. Buyer language shifts as categories mature; prompts that were category-defining six months ago can become long-tail. Treat the prompt set as a living artifact — but version it, and report SOV separately for each prompt-set version so you don't conflate a prompt-set change with a real SOV change.
