Anchor Text in 2026: What Still Triggers Penguin, What AI Engines Actually Reward
Priyam Goyal
Co-Founder

On this page
- The Penguin rules never changed. They just stopped being announced.
- The anchor text ratios we actually use
- Here is the uncomfortable bit: exact-match barely moves the needle anyway
- What "descriptive anchor" actually means to Google
- How AI engines actually use anchor text
- The Common Crawl evidence base
- The split: Google wants safety, AI wants signal
- How to audit your own anchor text profile in 2026
- A real example from our agency data
- Co-occurrence: the anchor strategy that wins on both sides
- How knowledge graphs change the maths
- Why this matters more every quarter
- What we would do this week
Here is the thing that broke our brains about eight months ago. Two client portfolios, same agency, same outreach process, wildly different outcomes.
One client kept ranking happily on Google but vanished from AI Overviews. Another started turning up everywhere in Perplexity and then dropped three keywords on Google in a single month. We ran both accounts. The only meaningful difference was the anchor text profile.
That is what kicked off this post. Most of the "anchor text in 2026" content floating around still argues about whether exact-match should sit at 3% or 8%. Fine question. Wrong decade. The question that actually matters now is this: what does an anchor text strategy look like when a big chunk of your visibility comes from systems that index links by meaning, not by keyword density?
We run white-label link building for agencies and SEO for our own clients, so we have watched this play out across dozens of accounts in 2025 and 2026. Below is what we actually see, plus what Google's own documentation says, with the receipts.
The Penguin rules never changed. They just stopped being announced.
A surprising number of newer SEOs think Penguin retired in 2016. It did not. It got promoted.
Google Penguin launched on 24 April 2012 and affected roughly 3.1% of English search queries. Then on 23 September 2016, Google folded it into the core algorithm and made it run in real time. So Penguin does not get "refreshed" any more. It is always on, quietly evaluating links and anchor text every time Google re-crawls your backlink profile. No announcement, no warning, no recovery window you can wait out.
What does Google actually say about manipulative anchors? It is published in plain English. The Google Search spam policies list "links with optimized anchor text in articles, guest posts, or press releases distributed on other sites" as link spam. They also flag "keyword-rich, hidden, or low-quality links embedded in widgets that are distributed across various sites" and "widely distributed links in the footers or templates of various sites."
Read that carefully. Google is not banning guest posts. It is banning guest posts with optimised anchor text. The anchor is the trigger, not the placement.
What we see trigger Penguin in 2026
From the client audits we have run this year, these are the patterns that keep lining up with traffic drops:
- Exact-match anchor ratios above 8% pointing at a single commercial URL
- 15 or more guest posts using near-identical money-keyword anchors ("best widgets for X", "top widgets for X", "widgets for X reviewed")
- Footer or sidebar links across multiple sites all using the same anchor
- A cluster of anchors from the same low-grade domain class (ten random directories, say), all carrying product keywords
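The near-identical anchor pattern in that list is the easiest one to check mechanically. Here is a rough Python sketch, assuming you have your links as (referring domain, anchor) pairs. It strips a hypothetical set of modifier words from each anchor and groups what is left, so "best widgets for X" and "top widgets for X" collapse into the same core keyword. The MODIFIERS set, the function names and the example domains are all made up for illustration, and the default threshold of 15 simply mirrors the guest-post figure above.

```python
from collections import defaultdict

# Hypothetical modifier words that get shuffled around a money keyword.
MODIFIERS = {"best", "top", "reviewed", "review", "reviews", "cheap", "the"}

def anchor_core(anchor):
    """Lower-case the anchor and drop modifier words, keeping the core keyword tokens."""
    return frozenset(t for t in anchor.lower().split() if t not in MODIFIERS)

def near_identical_groups(links, min_domains=15):
    """Return each core keyword that is used by at least `min_domains` referring domains."""
    domains_by_core = defaultdict(set)
    for domain, anchor in links:
        core = anchor_core(anchor)
        if core:  # skip anchors made up only of modifier words
            domains_by_core[core].add(domain)
    return {core: doms for core, doms in domains_by_core.items() if len(doms) >= min_domains}

# Tiny illustrative data set, so the threshold is lowered to 3 for the demo.
links = [
    ("blog-a.example", "best widgets for X"),
    ("blog-b.example", "top widgets for X"),
    ("blog-c.example", "widgets for X reviewed"),
    ("news.example", "Acme"),
]
groups = near_identical_groups(links, min_domains=3)
for core, domains in groups.items():
    print(sorted(core), "->", sorted(domains))
```

Grouping by a token set is deliberately crude. It will miss reworded variants and synonyms, so treat a hit as a prompt to look closer, not as a verdict.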
The sites we have helped claw back from this kind of mess took anywhere from two to seven months. We wrote up the full workflow in our guide to the link reclamation process after a Penguin hit if you want the step-by-step.
The anchor text ratios we actually use
We hate publishing universal ratios because they get repeated as gospel and then someone audits a homepage that is 80% branded and panics. But people ask for numbers, so here is roughly what we aim for in our white hat link building work in 2026:
- Branded anchors: 40 to 60%. Brand name only, or brand plus a generic descriptor ("Acme", "the Acme team", "Acme's guide").
- Naked URLs and generic anchors: 15 to 25%. "acme.com", "this article", "the full study", "here" used sparingly.
- Partial-match and contextual anchors: 15 to 25%. The keyword living inside a longer descriptive phrase, like "how Acme handles widget compliance".
- Exact-match: 2 to 5% maximum. And only from genuinely strong sources where the anchor reads naturally in the sentence.
- Co-occurrence anchors: as many as you can earn. Branded anchor, keyword in the surrounding paragraph. This is the unlock for AI engines, and we will get to it.
Those numbers are not arbitrary. They are roughly what we see on the natural link profiles of sites that never did outreach in the first place. Genuinely organic profiles often lean even more branded, sometimes 70% or higher.
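If you want to hold a tagged profile up against those bands, a few lines of Python will do it. This is a minimal sketch under stated assumptions: the category keys, the function name and the example counts are hypothetical, and the shares are assumed to be counted per referring domain. Co-occurrence anchors have no band because the target there is "as many as you can earn".

```python
# Target bands from the list above, as (low, high) shares of the profile.
TARGET_BANDS = {
    "branded": (0.40, 0.60),
    "naked_or_generic": (0.15, 0.25),
    "partial_match": (0.15, 0.25),
    "exact_match": (0.02, 0.05),
}

def compare_to_bands(counts):
    """Return {category: (share, status)} where status is below, within or above the band."""
    total = sum(counts.values())
    report = {}
    for category, (low, high) in TARGET_BANDS.items():
        share = counts.get(category, 0) / total if total else 0.0
        if share < low:
            status = "below"
        elif share > high:
            status = "above"
        else:
            status = "within"
        report[category] = (round(share, 3), status)
    return report

# Hypothetical profile: 100 referring domains, already tagged by hand.
counts = {"branded": 50, "naked_or_generic": 20, "partial_match": 18, "exact_match": 12}
report = compare_to_bands(counts)
for category, (share, status) in report.items():
    print(f"{category}: {share:.0%} ({status})")
```

Read the output with the caveats above in mind. A branded share above its band is normal for organic profiles, and an exact-match share below its band is nothing to fix. The only result worth acting on is exact-match sitting above the band.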
This is also why we keep nudging clients to treat unlinked brand mentions as serious assets, not afterthoughts. A brand name dropped into a sentence with no link at all reads as more natural to both Google and AI retrieval than a keyword-stuffed link ever will.
Here is the uncomfortable bit: exact-match barely moves the needle anyway
If you still believe exact-match anchors are the lever that wins rankings, the data has bad news.
Ahrefs ran one of the largest public studies on this, analysing 384,614 pages across 19,840 keywords, and found "a relatively weak correlation between the percentage of exact-match anchored links and rankings." Their own conclusion was blunt: the best way to build a natural anchor text ratio is to stop trying to manipulate it.
So the keyword-stuffed anchor gives you very little upside and a real chance of tripping Penguin. That is a terrible trade. We have done client audits where the previous agency's headline deliverable was "50 contextual backlinks" and every single one carried the same three keywords. They paid for a liability.
What "descriptive anchor" actually means to Google
Google's documentation on links is more useful than most third-party guides, and almost nobody reads it.
The key line: "Try reading only the anchor text (out of context) and check if it's specific enough to make sense by itself. If you don't know what the page could be about, you need more descriptive anchor text." Google explicitly warns against "click here", "read more", "website" and "article". Not because they nuke your rankings, but because they fail the standalone-meaning test.
What Google wants is descriptive. Its own example is "list of cheese types". That phrase tells you what the destination page is about without being a stuffed monstrosity like "best cheese types list buy cheese online". Concise, relevant, human.
That distinction matters more than ever in 2026, because the AI engines are reading anchors the same way a curious human would.
How AI engines actually use anchor text
This is the part most SEO advice gets wrong. It treats AI search as if it were Google's core algorithm with extra hallucination bolted on. It is not the same machine.
Most of the big AI engines, including ChatGPT, Perplexity and Google's AI Overviews, lean on some form of retrieval-augmented generation (RAG). RAG converts documents into embeddings, which are numerical representations in a large vector space, and stores them in a vector database so the system can retrieve the most semantically relevant chunks when someone asks a question.
The retrieval step does not count anchor text ratios. It compares meaning. When a system decides which page to cite for "how should small SaaS teams structure their pricing pages", it is matching the meaning of that query against the meaning of paragraphs in its index.
Anchor text still matters here. It just does a different job:
- Anchors act as semantic labels attached to your URL across the web
- Multiple descriptive anchors covering different angles widen the topical territory your URL is associated with
- Short keyword anchors compress your page's meaning into something narrow, which actually hurts the semantic match
- Sentence-length anchors written in natural language map onto far more real user queries
In practice, "Acme's breakdown of how SaaS pricing pages convert better with annual-first ordering" is worth more to an AI engine than "SaaS pricing pages". The first matches a dozen queries. The second matches a handful and pattern-matches as commercial in Google's eyes. You lose on both fronts with the short version.
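To make the "matches more queries" point concrete, here is a toy Python comparison. It is emphatically not how any AI engine works: real retrieval uses learned dense embeddings, while this sketch swaps in plain word counts and cosine similarity as a crude stand-in. The function names, the second query and the 0.2 threshold are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Crude stand-in for an embedding: lower-cased word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def queries_covered(anchor, queries, threshold=0.2):
    """Return the queries whose similarity to the anchor reaches the (arbitrary) threshold."""
    anchor_vec = bag_of_words(anchor)
    return [q for q in queries if cosine(anchor_vec, bag_of_words(q)) >= threshold]

long_anchor = "Acme's breakdown of how SaaS pricing pages convert better with annual-first ordering"
short_anchor = "SaaS pricing pages"
queries = [
    "how should small SaaS teams structure their pricing pages",
    "does annual-first ordering convert better",  # hypothetical long-tail query
]

print(len(queries_covered(long_anchor, queries)), "queries covered by the long anchor")
print(len(queries_covered(short_anchor, queries)), "queries covered by the short anchor")
```

The short anchor still matches the head query. What it loses is the long-tail one, where it shares no words at all. Real embedding models also pick up paraphrase and synonyms, so treat this as an illustration of breadth and nothing more.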
We go deeper on this in our guide to getting your brand cited in AI answers, but the headline is simple. Describe, do not compress.
The Common Crawl evidence base
If you want to understand how these models learned what anchor text means in the first place, you have to know about Common Crawl.
Common Crawl is an open web archive that has been collected regularly since 2008 and now holds petabytes of data, free to access. Nearly every major LLM you have heard of was trained, at least partly, on it.
Researchers at Webis built a dataset called MS MARCO Anchor Texts, enriching up to 4.8 million documents with anchor text pulled from six Common Crawl snapshots between 2016 and 2021. Each snapshot covered between 1.7 and 3.4 billion documents. That is the closest thing we have to a public ground truth for how the open web genuinely uses anchor text at scale.
The implication is the bit people skip. Every major AI model has been trained on a representation of the web where anchor text is one of the labels stuck to each URL. When a model decides which sources to recall for a query, the historical anchor text of those URLs is part of how the meaning got encoded in the first place. Years of descriptive anchors compound. Years of "best widgets for X" compound too, just in the wrong direction.
This is exactly why our link-building framework leans on descriptive, earned mentions rather than aggressive exact-match campaigns. Press and editorial mentions tend to use natural, sentence-length anchors. Those get encoded across multiple snapshots over years, and the brand-to-topic association bakes itself into the underlying model.
The split: Google wants safety, AI wants signal
The cleanest way we frame this for clients in 2026:
- Google rewards a natural-looking profile. It penalises anything that looks engineered. Your anchor strategy needs to survive an audit by a bored junior on Google's spam team.
- AI engines reward semantic clarity. They want to know what your page is about and which questions it answers. Your anchors need to read like a human description, not a keyword.
The good news, and it genuinely surprised us at first, is that these two goals do not fight. They pull in the same direction, away from exact-match and towards descriptive, branded, contextual anchors.
The bad news is that the industry spent 15 years training everyone to do the opposite, and a lot of agencies still measure their own success by exact-match volume. When we take those clients on, the first move is usually a reclamation pass to dilute or remove the worst patterns. Sometimes that means requesting anchor changes. Sometimes it means new branded placements to drag the ratio down to something safe.
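The arithmetic behind those two options is worth seeing once, because dilution alone is far more expensive than it sounds. The sketch below assumes a hypothetical profile of 100 referring domains with 18 exact-match anchors and a 7% target, and assumes each new placement comes from a new referring domain. The function names are ours for illustration only.

```python
def ceil_div(a, b):
    """Integer division rounded up, avoiding floating-point error."""
    return -(-a // b)

def anchor_changes_needed(total, exact, target_pct):
    """Exact-match anchors to rewrite so that exact / total is at most target_pct percent."""
    allowed = (target_pct * total) // 100
    return max(0, exact - allowed)

def new_placements_needed(total, exact, target_pct, anchor_changes=0):
    """Non-exact placements to add, after `anchor_changes` rewrites, to reach target_pct percent."""
    remaining_exact = max(0, exact - anchor_changes)
    return max(0, ceil_div(remaining_exact * 100, target_pct) - total)

total, exact, target_pct = 100, 18, 7  # hypothetical profile and target

print(anchor_changes_needed(total, exact, target_pct))                    # rewrites only
print(new_placements_needed(total, exact, target_pct))                    # dilution only
print(new_placements_needed(total, exact, target_pct, anchor_changes=8))  # a mix of both
```

With these numbers, rewriting anchors alone takes 11 changes, dilution alone takes 158 new placements, and eight rewrites cut the dilution requirement to 43. That gap is why anchor-change requests are usually the cheaper first move where the referring site will cooperate.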
How to audit your own anchor text profile in 2026
This is what we do for every new account. You can run most of it yourself in an afternoon.
- Export every backlink with its anchor text from Ahrefs, Semrush or Majestic. Use referring domain as your unit, not individual links, so a site firing 200 sitewide links counts once.
- Tag each anchor as branded, naked URL, generic, partial match or exact match. A spreadsheet is fine. You get a feel for the borderline cases after about 50 rows.
- Calculate ratios per landing page, not just sitewide. Concentration on commercial URLs is the real danger. Your homepage can be 80% branded while your services page quietly burns.
- Flag any landing page where exact-match clears 8%, or where a single anchor variant repeats more than five times across different domains.
- Hunt for unnatural clusters. Multiple links from weak domains, all dated within the same month, all carrying similar anchors. That is the exact shape Penguin was built to catch.
- Check semantic spread. For your top five commercial pages, do the anchors describe different aspects of the page, or just reword one keyword over and over? If it is the latter, AI engines are getting a thin, narrow signal.
An automated SEO audit will surface the obvious structural patterns, but the anchor analysis itself is best done by eye. We have not found a shortcut here that we actually trust.
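For the counting part of that checklist, a first pass can be scripted. The sketch below is a rough starting point, assuming your export boils down to (referring domain, landing page, anchor) rows. The tagging rules are naive string heuristics we invented for illustration, the GENERIC set is hypothetical, and limiting the repeat check to exact-match and partial-match anchors is our assumption. The borderline cases still need a human eye.

```python
from collections import Counter, defaultdict

# Hypothetical list of generic anchors; extend it for your own data.
GENERIC = {"click here", "read more", "here", "this article", "website", "the full study"}

def tag_anchor(anchor, brand, site, keywords):
    """Naive first-pass tagging; brand, site and keywords are expected in lower case."""
    text = anchor.lower().strip()
    if site in text:
        return "naked_url"
    if text in keywords:
        return "exact_match"
    if any(k in text for k in keywords):
        return "partial_match"
    if brand in text:
        return "branded"
    if text in GENERIC:
        return "generic"
    return "other"

def audit(rows, brand, site, keywords, exact_limit=0.08, repeat_limit=5):
    """Flag landing pages with too much exact match or an over-repeated keyword anchor."""
    seen = set()
    tags_by_page = defaultdict(Counter)
    domains_by_anchor = defaultdict(set)
    for ref_domain, page, anchor in rows:
        if (ref_domain, page) in seen:
            continue  # count each referring domain once per landing page (first anchor wins)
        seen.add((ref_domain, page))
        tag = tag_anchor(anchor, brand, site, keywords)
        tags_by_page[page][tag] += 1
        if tag in ("exact_match", "partial_match"):
            domains_by_anchor[(page, anchor.lower().strip())].add(ref_domain)
    flags = {}
    for page, tags in tags_by_page.items():
        exact_share = tags["exact_match"] / sum(tags.values())
        repeated = sorted(a for (p, a), doms in domains_by_anchor.items()
                          if p == page and len(doms) > repeat_limit)
        if exact_share > exact_limit or repeated:
            flags[page] = {"exact_share": round(exact_share, 3), "repeated_anchors": repeated}
    return flags

# Illustrative rows only.
rows = [(f"site{i}.example", "/services", "Acme") for i in range(8)]
rows += [("blog1.example", "/services", "workflow automation software"),
         ("blog2.example", "/services", "Workflow Automation Software"),
         ("site0.example", "/services", "workflow automation software")]  # repeat domain, ignored
rows += [(f"site{i}.example", "/", "Acme") for i in range(5)]

flags = audit(rows, brand="acme", site="acme.com", keywords={"workflow automation software"})
print(flags)
```

Keeping only the first anchor per referring domain is a simplification. If a domain links with several different anchors, you may prefer to keep the most commercial one so the audit errs on the cautious side.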
A real example from our agency data
One anonymised case from earlier this year. A B2B SaaS client came to us with strong Google rankings and almost no AI Overview citations. Their profile looked like this:
- 18% exact-match on "workflow automation software"
- 22% partial match using slight variations
- 35% branded
- 25% other
The Google rankings were holding because their on-page content was strong enough to mask the over-optimisation. But every AI engine we tested was citing competitors with less authority and more topically descriptive anchors across their profiles. The competitors were not winning on links. They were winning on meaning.
We ran a 90-day campaign that did three things:
- Reached out to existing referring domains and requested anchor changes on the worst exact-match clusters
- Built 14 new placements with long descriptive anchors that paired the brand name with a specific use case
- Added 22 unlinked brand mentions through digital PR, each written into a paragraph that semantically described the product category
At 90 days, ChatGPT cited the client for 9 of the 14 queries we tracked. On Google, exact-match dropped from 18% to 7% and the rankings held. By month four, roughly 40% of inbound was coming from AI-driven discovery. Not every account lands this cleanly, and budgets and timelines vary, but the pattern repeats often enough that we now build for it from day one.
Co-occurrence: the anchor strategy that wins on both sides
This is the section we most want every SEO to read.
A co-occurrence anchor is where the visible link text is branded or generic, but the keyword you want associated with the URL sits in the surrounding sentence. Like this:
"Acme published a detailed breakdown of how SaaS pricing pages should be structured to convert annual-first."
The anchor is "Acme". Branded, clean, natural. Google sees a normal mention and Penguin has nothing to grab.
But the sentence around it contains "SaaS pricing pages" and "convert annual-first". When that paragraph gets embedded into a vector database, the URL gets associated with both concepts semantically. The retrieval system surfaces that URL when someone asks about either topic. You get the safety of a branded anchor and the semantic richness AI engines reward, in one move. It is the closest thing to a free lunch we have found in modern link building.
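You can check a placement for co-occurrence with a very small script. This is a minimal sketch, assuming you have the linking sentence, the anchor text and a list of topic phrases you care about. The function name is hypothetical, and the substring matching is far cruder than what an embedding model does with the same paragraph.

```python
def cooccurring_topics(sentence, anchor, topics):
    """Return the topic phrases found around the anchor, or [] if the anchor itself carries one."""
    anchor_l = anchor.lower()
    if any(t.lower() in anchor_l for t in topics):
        return []  # the keyword sits in the link text, so this is partial or exact match
    # Remove the anchor once so only the surrounding words are searched.
    context = sentence.lower().replace(anchor_l, " ", 1)
    return [t for t in topics if t.lower() in context]

sentence = ("Acme published a detailed breakdown of how SaaS pricing pages "
            "should be structured to convert annual-first.")
topics = ["SaaS pricing pages", "convert annual-first"]

print(cooccurring_topics(sentence, "Acme", topics))                # branded anchor
print(cooccurring_topics(sentence, "SaaS pricing pages", topics))  # keyword anchor
```

The branded anchor returns both topic phrases. The keyword anchor returns an empty list, because the topic lives in the link text and not in the words around it.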
The catch is that you only earn co-occurrence anchors at scale by writing things worth quoting. Original data, sharp opinions, case studies with real numbers. That is why we keep steering clients towards AI search visibility built on quotable assets, not towards raw link volume. Volume gets you anchors. Quality gets you anchors that mean something.
How knowledge graphs change the maths
One more layer almost nobody is discussing.
Google and the major AI engines maintain knowledge graphs that map entities to concepts and to each other. When your brand has a clearly defined entity in those graphs, anchor text plays a smaller role. It stops being the main story about your site and becomes reinforcement instead.
For brands without a strong entity presence, the anchor profile is doing far more of the heavy lifting, which is exactly when over-optimisation looks worst. The anchors become the only story Google has about you, so any unnatural pattern stands out a mile.
We dig into this in our piece on knowledge graphs and entity optimisation for AI search. The short version: the more clearly defined your brand entity is across structured data, mentions and citations, the less weight any single anchor carries, and the safer you are.
Why this matters more every quarter
If you think this is a niche concern, look at where clicks are going. Pew Research found that when an AI summary appeared in the results, users clicked a traditional search result in just 8% of visits, against 15% when no summary appeared, based on browsing data from 900 US adults in March 2025. Clicks on links inside the summary itself happened in only 1% of visits.
So a growing share of your visibility now depends on being the source the AI cites, not the blue link someone clicks. And being the cited source is decided by semantic relevance, which anchor text helps shape. The anchor strategy that protects you from Penguin is the same one that earns those citations. They stopped being in tension a while ago.
What we would do this week
If you took over a client account tomorrow, here is the order we would work in:
- Pull every backlink and tag the anchors. Branded, naked, generic, partial, exact. One row per referring domain.
- Calculate exact-match percentage per landing page. Flag anything over 8% on commercial URLs.
- Spot repeat anchors across different domains. Three or more domains sending the same partial-match anchor is your first red flag.
- Audit how your brand gets described in mentions. Are the sentences around your links semantically rich, or just keyword wrappers?
- Stop chasing exact-match. Brief outreach partners with two or three descriptive anchor options that include the brand name.
- Build co-occurrence into every placement. Brand in the anchor, topic in the paragraph. Both Google and AI engines reward it.
- Run one digital PR campaign around a quotable asset. Original data works best. The anchors follow on their own.
If you want to see how this plays out across real accounts, several of the brands in our work shifted their anchor strategy as the main driver of recovery or growth. And if you want a second opinion on what your profile looks like right now, our team runs link audits and outreach as part of our link-building service. Send us your domain and we will tell you straight whether your anchors are an asset or a liability. The quickest way in is to get in touch.
The industry spent a decade arguing over whether exact-match should be 3% or 5% or 8%. In 2026 the better question is whether your anchors mean anything to a machine that does not count them. Ours do. Yours can too.


