Why AI Models Ignore 90% of Content (Research Fixes It)
Priyam Goyal
Co-Founder

Most content is invisible to AI search. Not because it reads badly, and not because someone forgot a meta description. It's invisible because there are now millions of pages saying roughly the same thing, and a language model has no reason to pick yours over the other 999,999.
Original research is the lever that breaks the tie. When you publish numbers nobody else has, you stop being one more voice repeating the consensus and start being the thing the consensus quotes. That is the whole game for AI search visibility right now, and most brands are nowhere near it.
We run AI search campaigns for a living, so this isn't theory we read in a deck. Here's what actually moves the needle, what the public data says, and where we think half the advice online is plain wrong.
What does AI search visibility actually mean?
AI search visibility is how often your brand or content gets surfaced, summarised or cited inside AI-generated answers (from ChatGPT, Google's AI Overviews and AI Mode, Perplexity and Gemini) rather than in the classic ten blue links.
That distinction matters because the blue links are drying up. A Pew Research Center study of 68,879 Google searches found that when an AI summary appeared, users clicked a traditional result in only 8% of visits, against 15% when there was no summary. They clicked a link inside the summary itself just 1% of the time.
Ahrefs put a sharper number on the damage. Their analysis of 300,000 keywords, using December 2025 data, found an average click-through rate 58% lower when an AI Overview is present. So the win that used to be a click is increasingly a citation inside the answer. If you're not in the answer, you're not in the conversation.
This is the shift behind the rise of what people now call generative engine optimisation, and it's why we treat AI search visibility as its own discipline rather than a bolt-on to ordinary SEO.
Why original research wins citations
Here's the uncomfortable bit for most marketing teams. A model doesn't cite you because your prose is lovely. It cites you when you're the most useful, most verifiable source for a specific claim. Opinion is abundant and cheap. A genuine data point with a clear method is rare and quotable.
When a model needs to say "X% of marketers do Y" or "the average cost of Z is £N", it has to point at a source. Be that source and you get pulled into answers across hundreds of related queries, often for months, without doing anything else.
Our take, after a lot of campaigns: original research is the single most durable AI visibility asset you can build. Keywords shift, models get retrained, citation patterns swing about. A clean, well-documented dataset that people quote keeps earning. We've watched a single decent stat get referenced across an entire topic cluster long after the post that launched it slipped down the rankings.
Volatility is real, by the way, so don't bet the farm on one platform. Semrush analysed 230,000 prompts and over 100 million AI citations across 13 weeks and watched Reddit's share of ChatGPT responses collapse from around 60% in early August 2025 to roughly 10% by mid-September. Wikipedia dropped from about 55% to under 20% on the same platform. If domains that big can swing that hard, your job is to be quotable enough that you survive the reshuffles.
What content types AI actually cites
This is where a lot of "publish a survey and you're sorted" advice falls apart. The format matters as much as the data.
Omniscient Digital analysed 23,387 unique sources behind AI citations across ChatGPT, Perplexity, Gemini, AI Mode and AI Overviews. For branded queries, 57% of citations went to reviews and social proof, 17% to directories and reference sites, and only around 5.4% to education and thought-leadership content. Brand "About" and home pages barely registered.
So research alone isn't a magic key. The lesson we take from it is that your data needs to live where AI engines already trust the format, packaged in genuinely useful, reference-grade content rather than a glossy thought-leadership PDF nobody can parse.
Two practical things we keep coming back to:
- Tables and plain text beat infographics. A model can read a data table. It struggles with the same numbers baked into a pretty image. If your only version of a stat is a picture, you've hidden it.
- Method has to be obvious and early. Sample size, dates, who you asked. If a reader (or a crawler) can't see how you got the number in the first screenful, it reads as an unsupported claim.
We dug into the early-position pattern in our own piece on where ChatGPT pulls citations from in the first 500 words, and it lines up with everything above. Lead with the finding. Don't make anyone scroll to the good bit.
The minimum viable research framework
You do not need a university grant or a six-figure study. You need an honest number nobody else has, documented well enough to trust. That's the bar.
Here's the approach we use when we build a research asset for a client:
- Pick one question your buyers keep asking. The narrower the better. "How long does X take" or "what does Y cost in 2026" beats a sprawling "state of the industry" report you'll never finish.
- Gather data you can actually stand behind. A focused survey, your own anonymised platform data, a benchmark you can rerun. Aggregating other people's numbers without adding analysis doesn't count, and models tend to skip it for the original source anyway.
- Write the method in plain English, up top. Who, when, how many, and the limitations. Honesty about limits builds credibility; it doesn't dent it.
- Put the headline number in the first paragraph. One clear, quotable stat that answers the question outright.
- Publish a machine-readable version. Tables in HTML, a downloadable dataset if you can. Make it trivially easy to quote.
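For the machine-readable step, here is a minimal sketch of turning results into a plain HTML table with a caption that carries the method. It uses only the Python standard library. The function name `render_html_table`, the caption wording and every figure in it are placeholders we made up for illustration, not real survey results or a required format.

```python
from html import escape


def render_html_table(caption, headers, rows):
    """Build a plain HTML table string from a caption, header labels and rows."""
    head_cells = "".join(f'<th scope="col">{escape(str(h))}</th>' for h in headers)
    body_rows = "\n".join(
        "    <tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return (
        "<table>\n"
        f"  <caption>{escape(caption)}</caption>\n"
        f"  <thead>\n    <tr>{head_cells}</tr>\n  </thead>\n"
        f"  <tbody>\n{body_rows}\n  </tbody>\n"
        "</table>"
    )


# Placeholder figures for illustration only; these are not real survey results.
caption = "Hypothetical survey, 300 respondents, fielded on placeholder dates"
headers = ["Answer", "Respondents", "Share"]
rows = [
    ("Option A", 150, "50%"),
    ("Option B", 90, "30%"),
    ("Option C", 60, "20%"),
]

print(render_html_table(caption, headers, rows))
```

Putting the sample size and dates in the caption keeps the method attached to the numbers wherever the table gets copied. Every value is escaped, so a stray ampersand or angle bracket in a survey answer won't break the markup.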
On sample size, ignore anyone who gives you a magic threshold. There isn't one. A tightly targeted survey of a few hundred relevant people, with the method stated clearly, is far more citable than thousands of random responses with no context. The Pew study earned its citations off transparency and rigour, not raw volume. Copy that, not the headcount.
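If you want a number to put in the limitations line, a rough margin of error is one option. The sketch below uses the standard normal-approximation formula for a proportion at 95% confidence. It assumes a simple random sample, which a targeted survey usually isn't, so treat the result as a guide to state alongside the method rather than a guarantee. The function name and defaults are ours.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a proportion (normal approximation).

    proportion=0.5 is the most cautious choice; z=1.96 corresponds to
    roughly 95% confidence. Assumes a simple random sample.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


for n in (100, 300, 1000):
    points = margin_of_error(n) * 100
    print(f"n={n}: +/- {points:.1f} percentage points")
```

Under those assumptions, 300 responses gives roughly plus or minus 5.7 percentage points and 1,000 gives about 3.1. The gain from chasing raw volume flattens quickly, which is one more reason to spend the effort on targeting and transparency instead.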
What doesn't count as research
Your hot take on where the industry is heading is not research. A roundup of stats you found elsewhere is not research. A claim like "studies show" with no study attached is worse than nothing, because it actively erodes trust. If you can't name the source, the date and the method, don't dress an opinion up as data.
How to publish so AI picks it up
Brilliant research published badly still gets ignored. We see it constantly. The fix is mostly mechanical.
Host the research on your own domain first, as the canonical source, before anyone syndicates it. When other sites pick up your numbers, and they will if the data is good, you want the trail leading back to you as the origin, not to whoever republished it with a prettier headline.
Then distribute across formats, because different engines favour different homes. The Semrush dataset showed how unevenly citations spread across platforms, and our own field experience matches it. A blog post, a clean data page, a short explainer video and a syndicated version cover far more ground than betting everything on one URL.
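One common way to keep the trail pointing home when a piece is syndicated is a `rel="canonical"` link on the republished copy that names your original URL. That mechanism is our assumption about how you'd implement the point above, and whether a partner site will add it is up to them. The sketch below checks a page's HTML for that tag using Python's built-in `html.parser`. The class and function names and the example.com URL are placeholders.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html_text):
    """Return the canonical URL declared in html_text, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


# Placeholder markup standing in for a syndicated copy of the research page.
syndicated_page = """
<html><head>
  <title>Republished study</title>
  <link rel="canonical" href="https://example.com/research/original-study">
</head><body><p>Findings...</p></body></html>
"""

print(find_canonical(syndicated_page))
```

If the function returns `None`, or a URL that isn't yours, that's the cue to ask the republisher to point back to the original.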
A word of realism on timing. Don't expect overnight results. These systems refresh on their own schedule, and it routinely takes weeks for fresh research to start appearing in AI answers after you publish. That lag drives marketers spare when they're used to checking rankings the next morning. Plan for it, and don't kill a good asset because it was quiet for a fortnight.
If you want the full mechanics, we've laid out our approach to getting your brand into AI answers and a more tactical breakdown of how to get cited in ChatGPT and AI Overviews. Both go deeper than we can here.
What Google itself says (and what it doesn't)
There's a cottage industry selling "AI markup" and secret schema that supposedly unlocks AI Overviews. We'd save your money.
Google's own documentation on AI features and your website is blunt about it: "You don't need to create new machine readable files, AI text files, or markup to appear in these features. There's also no special schema.org structured data that you need to add." It adds that there are "no additional requirements to appear in AI Overviews or AI Mode, nor other special optimizations necessary."
What Google does say is that the same fundamentals apply: make sure pages are indexable, fast and genuinely helpful, with what it calls "helpful, reliable, people-first content." Our reading is simple. There's no shortcut. Original research is powerful precisely because it's a content quality signal, not a technical trick. You can't fake your way past it with markup.
That said, structured data still earns its keep for rich results and entity clarity, and entities are doing a lot of quiet work in AI search. We get into that in our look at how Wikipedia and brand entities feed LLM citations.
How to measure AI search visibility
Traditional analytics won't show you this cleanly, which is why so many teams assume nothing's happening when plenty is. You have to go and look.
Start manual and cheap. Build a list of the questions your buyers ask, then run them through ChatGPT, Perplexity, Google AI Mode and AI Overviews on a regular schedule. Log when your brand appears, in what context, and against which competitors. That spreadsheet will teach you more in a month than most dashboards, because it captures the framing, not just a count.
Then layer in tooling. A growing set of platforms now track brand mentions and citation share across AI engines, and they're maturing fast. Use them for trend lines, but keep doing the manual checks for the qualitative reads a dashboard misses.
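The manual log can be as simple as a CSV. Here's a minimal sketch of one way to structure it, with a per-engine appearance rate for the trend line. The field names, the `Check` class and the "ExampleCo" entries are hypothetical. Adapt the columns to whatever framing you actually want to capture.

```python
import csv
import io
from collections import defaultdict
from dataclasses import asdict, dataclass, fields


@dataclass
class Check:
    date: str              # ISO date the query was run by hand
    engine: str            # e.g. "ChatGPT", "Perplexity"
    query: str             # the buyer question you asked
    brand_mentioned: bool  # did your brand appear in the answer?
    context: str           # how the brand was framed, in your own words
    competitors: str       # semicolon-separated competitors named


def appearance_rate(checks):
    """Return, per engine, the share of checks in which the brand appeared."""
    seen = defaultdict(int)
    hits = defaultdict(int)
    for check in checks:
        seen[check.engine] += 1
        hits[check.engine] += int(check.brand_mentioned)
    return {engine: hits[engine] / seen[engine] for engine in seen}


def to_csv(checks):
    """Serialise the log as CSV text, ready to save or paste into a sheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(Check)])
    writer.writeheader()
    for check in checks:
        writer.writerow(asdict(check))
    return buffer.getvalue()


# Hypothetical log entries for illustration only.
log = [
    Check("2026-01-05", "ChatGPT", "what does Y cost", True,
          "cited as source for headline stat", "RivalOne"),
    Check("2026-01-05", "Perplexity", "what does Y cost", True,
          "listed among three sources", "RivalOne;RivalTwo"),
    Check("2026-01-12", "ChatGPT", "how long does X take", False,
          "not mentioned", "RivalTwo"),
]

print(appearance_rate(log))
print(to_csv(log))
```

The free-text `context` column is the part a dashboard won't give you, so resist the urge to reduce it to a yes or no.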
One honest caution on conversion claims. You'll see plenty of confident "AI traffic converts X times better" stats flying around. We'd treat those with suspicion until the attribution catches up, because AI referral tracking is genuinely messy right now and a lot of those numbers are someone's small sample dressed up as a law of nature. What we can say from our own work is that AI-cited visibility tends to reach people further along in their decision, since they're already asking the model for a recommendation rather than browsing. That's a directional read, not a guaranteed multiplier.
For a sense of how the citation landscape itself is shifting, our breakdown of AI Overviews citation rates is a useful companion to the numbers above.
The mistakes that keep you invisible
Making research AI-invisible is genuinely easy. We see the same own goals over and over.
- Hiding the method. If the sample size, dates and approach aren't easy to find, the data reads as an unbacked claim and gets skipped.
- Locking numbers inside images. A gorgeous infographic with no text or table version is, as far as a model is concerned, decoration.
- Overclaiming. "Our research proves" off the back of forty responses is the kind of thing that quietly trashes your credibility. It can also stray into dodgy advertising territory if the claim is material and unsupported.
- One-and-done. A single study published once and abandoned won't compound. Consistent, modest research builds a citation library that does.
- Chasing one platform. Given how violently citation shares swing, as the Semrush data showed, betting everything on today's favourite engine is a fast way to disappear at the next reshuffle.
If you want the wider pattern of why brands botch this, we've catalogued the recurring failures in our piece on GEO best practices and where brands go wrong.
Where to start this week
You don't need a strategy offsite to begin. Pick one question your audience keeps asking. Survey two or three hundred of the right people, or pull a benchmark from data you already own. Publish it with the method in the first two paragraphs, the headline number up top, and the data in a real table.
Then wait, check the AI engines against your query list over the following weeks, and watch what gets quoted. The brands building this habit are quietly assembling a moat: while everyone else chases an algorithm that keeps moving, they're becoming the source the algorithm has to reach for.
That's the work we do day in, day out. If you'd rather not start from a blank page, tell us what your buyers keep asking and we'll help you turn it into research worth citing.


