How to evaluate an AEO platform's data methodology - 7 questions that separate real measurement from vanity metrics
Most AEO platforms describe their measurement in marketing-speak ('comprehensive AI visibility tracking across all major engines') and hope buyers don't ask harder questions. The seven questions below separate vendors with reproducible, auditable data from vendors selling beautiful dashboards on weak foundations: where prompts come from, whether responses are captured from the consumer surface or the API, which model versions are queried, how engine non-determinism is handled, whether a single data point can be reproduced, what happens when an engine changes its default, and whether the cost model is honest about per-engine multipliers. This is the data-trust companion to the broader AEO Platform Buyer's Guide. Send the questions in writing before the call. A vendor that bristles at any of them is signaling something about how they handle scrutiny.
Updated 2026-05-08
Questions this guide answers
- How do I evaluate an AEO platform's data?
- What questions should I ask an AI visibility vendor about methodology?
- How do I know if AEO data is accurate?
- What's the difference between AI visibility platforms?
Direct answer
Most AEO and AI visibility platforms describe their measurement in marketing-speak ('comprehensive AI visibility tracking across all major engines') and hope buyers don't ask harder questions. The seven questions below separate vendors with reproducible, auditable data from vendors selling beautiful dashboards on weak foundations: where do prompts come from; are responses captured from the consumer surface or the API; which model versions are queried; how is engine non-determinism handled; can a single data point be reproduced; what happens when an engine changes its default; and is the cost model honest about per-engine multipliers?
This is the data-trust companion to the broader AEO Platform Buyer's Guide. That guide covers full vendor selection. This post focuses specifically on the data-and-methodology dimension - the part most buyers under-investigate before signing.
Why methodology is the under-investigated dimension
Buyers comparing AEO platforms typically focus on three dimensions: feature parity, price, and engine coverage. The fourth - whether the underlying data is reliable - is harder to evaluate from a sales call and easier to wave away with a slide. So most buyers skip it. Two specific failure modes recur.
Synthetic prompts dressed as research: a vendor says 'we track 300 prompts in your category' but the prompts were generated by a language model at indexing time, not grounded in real buyer behavior. The questions real users ask AI assistants are typically much longer and more problem-oriented than prompts derived from SEO keyword lists. If the prompts don't reflect real questions, the visibility numbers don't reflect real exposure.
API-only tracking sold as consumer measurement: a vendor says 'we track ChatGPT' but in practice queries an API endpoint that uses different default models, system prompts, and web-search behavior than the consumer chat product. The two surfaces can diverge enough that fixing one does not move the other.
The 7 methodology questions
Each question lists what you are listening for, common red flags, and why it matters.
Question 1: Where do your tracked prompts come from?
What you are listening for: specific data sources, not vague phrases like 'AI-generated based on your category.' Strong answers include some combination of intent volume data, trending community questions from public Q&A platforms, live follow-up questions captured from the engines themselves, and customer-supplied prompts at a transparent ratio.
Red flags: 'We use AI to generate prompts based on your category' (translation: synthetic), vague phrases like 'ensemble deep-learning models on 50+ sources' without naming the sources, refusal to disclose template-vs-customer ratio, and no mechanism for declining or replacing the vendor's default prompts.
Question 2: Are responses captured from the consumer surface or the API?
What you are listening for: a clear distinction, ideally with both. Strong answers explain that consumer-surface capture is used for engines like ChatGPT, Google AI Overviews, and Rufus because that is what buyers actually see; that the API is used separately for agent and deep-research use cases; and that each data point is tagged so you can filter to either channel.
Red flags: 'We use the API because it is more reliable' (translation: the data does not match what consumer users see); inability to explain how Google AI Overviews is captured (it has no public API); implicit claim that API and consumer-surface data are interchangeable.
Question 3: Which specific model versions do you track?
What you are listening for: named models and surfaces, plus a public registry that updates when defaults change. Strong answers describe a model registry published in a changelog, an SLA for updating tracking when a default changes, and per-data-point model signal where the engine discloses it.
Red flags: 'We track ChatGPT' without specifying the variant or surface; 'the platform decides which model gets used' (this means data is not reproducible); no public model registry or changelog.
Question 4: How do you handle the fact that AI engines give different answers each time?
What you are listening for: explicit acknowledgement of non-determinism, plus a methodology for it. Strong answers describe repeated capture on a recurring cadence with rolling-window aggregates and confidence bands; flag single-snapshot data points as such; and treat movement within historical engine variance as 'within noise' rather than as a real trend change.
Red flags: 'Our data is highly accurate' without explaining how non-determinism is handled; one run per refresh cycle treated as one data point; no confidence intervals, rolling windows, or acknowledgement that engines are non-deterministic.
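To make the 'strong answer' concrete, here is a minimal sketch of the kind of aggregation it implies: repeated captures of the same prompt over a rolling window, summarized with a confidence band, and week-over-week movement flagged as a trend only when the bands stop overlapping. The 3x/day cadence, 7-day window, and Wilson interval are illustrative assumptions, not a description of any specific vendor's method.

```python
import math

def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for a citation rate observed over repeated runs."""
    if trials == 0:
        return (0.0, 0.0)
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    margin = (z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2))) / denom
    return (max(0.0, center - margin), min(1.0, center + margin))

# Example: a prompt captured 3x/day over a 7-day rolling window (21 runs).
# The brand was cited in 13 of 21 runs this week vs. 11 of 21 last week.
this_week = wilson_interval(13, 21)   # ~ (0.41, 0.79)
last_week = wilson_interval(11, 21)   # ~ (0.32, 0.72)

# Overlapping intervals -> treat the move as within engine noise, not a trend change.
overlaps = this_week[0] <= last_week[1] and last_week[0] <= this_week[1]
print(f"this week: {this_week}, last week: {last_week}, within noise: {overlaps}")
```

A single snapshot of the same prompt would have reported either 100% or 0% visibility on any given day; the rolling window is what turns a noisy engine into a measurable trend.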
Question 5: Can a specific data point be reproduced?
What you are listening for: yes, with specifics. The vendor should be able to walk you from any chart to the underlying captured response, including prompt text, engine, region, timestamp, response, and citations. On higher-tier plans, raw structured data should be exportable.
Red flags: 'Our methodology is proprietary' (translation: you can't audit it); no drill-down from charts to raw responses; case studies that show only AI response screenshots without capture metadata.
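As a concrete reference point, the sketch below shows the minimum metadata a reproducible data point needs to carry. The field names are hypothetical, not any vendor's schema, but every aggregate a dashboard shows should be computable from records of roughly this shape.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class CapturedResponse:
    """One auditable data point: everything needed to walk a chart back to a raw capture."""
    prompt_text: str
    engine: str                  # e.g. "chatgpt-consumer", "google-ai-overviews"
    channel: str                 # "consumer-surface" or "api"
    region: str
    captured_at: datetime
    model_version: str | None    # populated where the engine discloses it
    response_text: str
    citations: list[str] = field(default_factory=list)

# Share of voice, citation rate, and similar metrics should be pure functions of
# records like this, so any chart value can be reproduced from the raw captures.
```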
Question 6: What happens when an engine changes its default model?
What you are listening for: a documented update process plus disclosure to customers. Strong answers describe ongoing monitoring of provider announcements, an SLA for updating tracking, dashboard surfacing of model changes, and automated alerting so the team does not miss provider updates.
Red flags: 'We use the latest model' without specifying how 'latest' is defined or who watches for changes; no changelog visible to customers; inability to describe what happened the last time a major engine updated its default.
Question 7: Is the cost model honest about per-engine multipliers?
What you are listening for: a pricing model that reconciles to forecastable spend. Strong answers express pricing as tracked prompts and included engines with a transparent multiplier structure; price add-ons individually; and avoid per-credit pricing that hides engine multipliers.
Red flags: '1 credit = 1 AI response' (the hidden multiplier is engines x cadence x refresh cycles); tiered plans with no published per-engine consumption rules; custom pricing with no reference points. Credit pricing without clear per-engine rules makes it almost impossible to forecast spend.
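The multiplier is easy to underestimate, so it is worth doing the arithmetic before the call. The numbers below are purely illustrative; substitute your own prompt count, engine list, and refresh cadence.

```python
# Illustrative numbers only: 200 tracked prompts, 4 engines, 3 runs per refresh
# (to average out non-determinism), refreshed weekly (~4.3 cycles per month).
prompts, engines, runs_per_refresh, refreshes_per_month = 200, 4, 3, 4.3

responses_per_month = prompts * engines * runs_per_refresh * refreshes_per_month
print(f"{responses_per_month:,.0f} responses per month")  # 10,320

# Under "1 credit = 1 AI response", a 10,000-credit plan that sounds generous
# for 200 prompts is exhausted before the month ends.
```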
How to use this checklist
Treat the seven questions as documentation requests, not accusations. A credible vendor will welcome them and have ready answers.
- Send the seven questions in writing before the call. This filters out vendors who can't answer in writing.
- Score each answer 0/1/2 (no answer / vague answer / specific answer). A vendor scoring below 10 out of 14 is not yet ready to support production buying decisions; see the scoring sketch after this list.
- Ask for documentation on the strongest claims. A vendor that says 'we publish a model registry' should be able to show it on the spot.
- Run a side-by-side test. Pick five prompts in your category, have the vendor run them, and run them yourself manually. The vendor's reported responses should align in substance with your manual checks.
- Read the vendor's methodology page if they have one before the call. If they don't have one, that is itself an answer.
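A minimal scoring sketch, with hypothetical answers, shows how quickly vague responses drag a vendor under the 10/14 threshold.

```python
# Hypothetical scorecard: 0 = no answer, 1 = vague, 2 = specific; maximum is 14.
scores = {
    "prompt sources": 2,
    "consumer surface vs API": 1,
    "model versions": 2,
    "non-determinism handling": 0,
    "reproducibility": 2,
    "default-model changes": 1,
    "cost transparency": 1,
}
total = sum(scores.values())
print(f"{total}/14 -> {'proceed' if total >= 10 else 'not ready'}")  # 9/14 -> not ready
```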
What we deliberately do not claim
To make this checklist credible, we want to be explicit about what no AEO platform - including SolCrys - can honestly promise.
- No platform can guarantee citation lift. Engine behavior is influenced by hundreds of inputs no vendor can fully control. The right promise is measurement and recommended actions, not outcomes.
- No platform fully covers every AI engine. Coverage trade-offs are real. Beware vendors who claim universal coverage; ask which engines are tracked at full fidelity vs. partial.
- No platform is immune to engine changes. What matters is how quickly the vendor responds and how transparently they disclose updates to customers.
How SolCrys answers its own checklist
We publish our prompt-selection methodology on the Golden Prompt Set methodology page and our visibility-measurement methodology on the Visibility Measurement methodology page. Every paid plan includes per-data-point drill-down with the prompt, engine, region, timestamp, and captured response. We disclose model versions in our changelog. Pricing is published with no per-engine multiplier surprises. We try to model the standard we are asking other vendors to meet.
FAQ
What is the single most important question on this list?
If we had to pick one: Question 5 (reproducibility). If a platform cannot show you the raw captured response behind any chart, none of the other questions matter - every claim becomes unfalsifiable. Reproducibility is the foundation that makes the other six questions checkable.
How is this different from your other Buyer's Guide?
The AEO Platform Buyer's Guide covers the full vendor-selection process: positioning fit, feature breadth, support, integrations, agency scenarios, and pricing fairness. This post zooms in on the data-and-methodology dimension specifically - the part most buyers underweight in vendor selection but that determines whether the data they pay for is trustworthy.
Should I ask all seven questions even if the vendor seems credible?
Yes. The questions are documentation requests, not accusations. A credible vendor will welcome them and have ready answers. A vendor that bristles at any of them is signaling something about how they handle scrutiny.
What about smaller or newer AEO vendors that haven't published methodology pages yet?
Some early-stage vendors have legitimate methodology that is not yet documented publicly. In that case, ask for a methodology document under NDA. If they can produce one in writing, they pass. If they cannot, treat them like any other vendor that won't show their work.
Is there a single failing answer that should disqualify a vendor outright?
Two: 'The platform decides which model gets used' (data is non-reproducible) and 'Our methodology is proprietary' (methodology that cannot be audited cannot be trusted). Implementation details can be confidential; the methodology itself should be transparent.
How does SolCrys score on its own checklist?
We publish prompt-selection and measurement methodology, disclose model versions, ship per-data-point drill-down on paid plans, and publish pricing with transparent per-engine rules. We try to model the standard we are asking other vendors to meet.
Related guides
Buyer Guides & Platform Decisions
AEO Platform Buyer's Guide 2026: 12 Questions to Ask Every Vendor
A practitioner buyer guide for choosing an AEO/AI visibility platform in 2026. The 12 questions that separate dashboard tools from execution platforms, the red flags to avoid, and a pricing reality check.
Prompt Intelligence
Golden Prompt Set Methodology
How SolCrys grounds AEO tracking on real intent volume, public community questions, AI query signals, and live engine follow-ups - not synthetic keyword lists.
Measurement
AI Visibility Measurement Methodology
How SolCrys captures AI visibility data: dual-channel measurement that combines consumer-surface capture from ChatGPT, Google AI Overviews, and Rufus with API capture for agents - every data point traceable to a prompt, platform, and timestamp.