
Entity SEO: Optimise for Google's Knowledge Graph

How Google's Knowledge Graph uses entities, NER, and salience to rank content. Build entity authority and earn a Knowledge Panel in 2025–2026.

Google's Knowledge Graph now holds over 5 billion entities and 500 billion facts — and it reshapes every SERP you care about. (Digital Applied) More than 58% of searches already end without a click, meaning a brand that hasn't established entity authority is invisible before the user even reaches a blue link. (Somebody Digital) Additionally, Knowledge Panels now appear in 87% of search results tied to entities. (Niumatrix)

Entity SEO is the practice of making Google understand exactly who or what you are — so the Knowledge Graph, AI Overviews, and Knowledge Panels all agree on your identity, authority, and topic focus.

Quick answer:

Entity SEO means helping Google map your brand, content, or person to an unambiguous node in its Knowledge Graph. The core levers are: a well-structured entity home page with JSON-LD schema, a verified Wikidata entry, consistent "sameAs" signals across authoritative platforms, high entity salience in your content, and earned third-party mentions that corroborate your identity. Done right, this earns a Knowledge Panel and citation priority in AI Overviews — both increasingly more valuable than rank position alone.

How Google's Knowledge Graph Works

Google launched the Knowledge Graph in May 2012 with 570 million entities. By 2025 that figure exceeded 5 billion entities and 500 billion associated facts. (MarGen) Some sources indicate it expanded to 800 billion facts about 8 billion entities within 10 years. (Niumatrix) The graph connects entities — people, organisations, places, products, concepts — rather than pages or keywords. Google's own summary: "things, not strings."

Each entity inside the graph has:

A Machine ID (MID or kgmid): a unique, stable identifier (e.g. /m/0k8z for Apple Inc.)
Attributes: name, description, founding date, headquarters, industry
Edges: relationships to other entities (sameAs, worksFor, locatedIn)
Confidence scores: how certain Google is about each attribute

Sources feeding the graph include Wikipedia, Wikidata, licensed data providers (sports databases, financial data, government records), and structured data signals found across the web. (ReputationX) Google now draws from over 209,966 trusted sources, requiring approximately 30 corroborating sources to verify information as factual. (ReputationX) Wikidata, a core component, held over 750 million statements on 61 million items as of September 2019, and its RDF encoding comprised over 4.9 billion triples by April 2018. (PMC7077981, Malyshev et al.)

The June 2025 "Great Clarity Cleanup"

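In a single week in June 2025, Google reportedly contracted the Knowledge Graph by 6.26%, deleting over 3 billion entities. Taken together, those two numbers imply a baseline of roughly 48 billion entities, far above the 5 billion figure cited earlier, so the sources are evidently counting on different bases. "Thing"-typed entities dropped 15.27%. Temporary pandemic-era Event entities were purged in bulk. (Search Engine Land) The strategic signal is clear: Google is prioritising high-confidence, unambiguous entities over sheer volume. Vague or poorly typed entities are now a liability.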

Named Entity Recognition (NER): How Google Identifies Entities in Text

Before the Knowledge Graph can help you rank, Google must first identify which entities exist in your content. That process is Named Entity Recognition (NER).

Google's Cloud Natural Language API — a public proxy for understanding how Google reads text — returns entities classified as: PERSON, LOCATION, ORGANIZATION, EVENT, WORK_OF_ART, CONSUMER_GOOD, and more. Each entity carries:

name — the surface form
type — entity category
metadata — Wikipedia URL and Knowledge Graph MID where applicable
salience — a 0–1 score of centrality
mentions — all occurrences in the text
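
As a rough illustration of those fields, the Python sketch below ranks the entities in a response shaped like the list above. The sample payload is invented (including the placeholder MID), top_entities is a hypothetical helper name, and a real request would need a Google Cloud project and credentials.

```python
from typing import Any

# Hand-written payload that mirrors the fields described above.
# Every value here is made up for illustration; it is not real API output.
SAMPLE_RESPONSE: dict[str, Any] = {
    "entities": [
        {
            "name": "London",
            "type": "LOCATION",
            "metadata": {
                "wikipedia_url": "https://en.wikipedia.org/wiki/London",
                "mid": "/m/PLACEHOLDER",  # placeholder, not a real MID
            },
            "salience": 0.12,
            "mentions": [{"text": {"content": "London"}, "type": "PROPER"}],
        },
        {
            "name": "Example Co",
            "type": "ORGANIZATION",
            "metadata": {},  # no Knowledge Graph match in this sample
            "salience": 0.61,
            "mentions": [
                {"text": {"content": "Example Co"}, "type": "PROPER"},
                {"text": {"content": "the company"}, "type": "COMMON"},
            ],
        },
    ]
}


def top_entities(response: dict[str, Any], limit: int = 5) -> list[dict[str, Any]]:
    """Summarise entities and sort them by salience, highest first."""
    rows = []
    for entity in response.get("entities", []):
        metadata = entity.get("metadata", {})
        rows.append({
            "name": entity["name"],
            "type": entity["type"],
            "salience": entity.get("salience", 0.0),
            "mention_count": len(entity.get("mentions", [])),
            "in_knowledge_graph": "mid" in metadata,
        })
    rows.sort(key=lambda row: row["salience"], reverse=True)
    return rows[:limit]


for row in top_entities(SAMPLE_RESPONSE):
    print(row)
```

In this sample, the nominal reference "the company" counts as a second mention of Example Co, which is the behaviour the mentions field is meant to expose.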

Internally, Google's NLP pipeline uses contextual embeddings and self-attention mechanisms to understand relationships between words regardless of word order. (Impression Digital) This means Google can recognise a "nominal reference" ("the midfielder") or a pronoun ("he") as pointing to a named entity introduced earlier in the piece.

In AI search specifically, NER occurs at multiple pipeline stages: query understanding, query expansion via Knowledge Graph relationships, passage-level retrieval, and answer synthesis. Google's AI Mode uses a "query fan-out" technique — generating dozens of sub-searches, each driven by entity recognition — and query lengths in AI Mode average 2–3× those of traditional searches. (iPullRank)

API benchmark: task-specific NLP beats generalist LLMs for NER

An iPullRank study benchmarked Google Cloud NLP, AWS Comprehend, and IBM Watson against generative LLMs (including DeepSeek R1) for entity extraction. The task-specific APIs returned more entities, richer metadata (including Wikipedia URLs and KG identifiers), and reproducible outputs. LLMs were inconsistent. (iPullRank) For auditing your own content's entity profile, Google's Cloud NLP API remains the most practical tool.

Entity Salience: Making Your Core Topic Unmistakable

Entity salience is a score from 0 to 1 quantifying how central an entity is to a piece of text — a prediction of what a human reader would consider most important. (Google Cloud NLP Docs)

What the scores mean in practice

Salience range	Interpretation
< 0.10	Content focus problem — entity barely registers
0.10–0.20	Reasonable working range for supporting entities
0.20–0.50	Coherent, entity-aware content
≥ 0.50	Entity clearly central to the page — primary topical relevance

Industry heuristics from SEO researchers suggest ≥ 0.5 is the threshold for "primary topical relevance." (NEURONwriter) A published case study illustrated the gap well: an article on "cloud computing security" scored its main entity at 0.38 while top competitors reached 0.72 for the same concept. (Szymon Slowik)
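
The bands above can be sketched as a small lookup. How the boundary values are assigned (0.10, 0.20 and 0.50 each fall into the higher band) is an assumption, since the table's ranges touch at the edges, and interpret_salience is a hypothetical helper name.

```python
def interpret_salience(score: float) -> str:
    """Map a 0-1 salience score to the heuristic bands in the table above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("salience must be between 0 and 1")
    if score < 0.10:
        return "content focus problem"
    if score < 0.20:
        return "supporting-entity range"
    if score < 0.50:
        return "coherent, entity-aware"
    return "primary topical relevance"


# Figures from the case study quoted above.
own_score, competitor_score = 0.38, 0.72
print(interpret_salience(own_score))
print(interpret_salience(competitor_score))
print(round(competitor_score - own_score, 2))
```

With the case-study figures, the article's 0.38 lands in the "coherent" band while competitors at 0.72 clear the primary-relevance threshold, a gap of 0.34.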

Factors that raise salience

Placing the entity in the H1 and opening paragraph
Subject position in sentences (subject > object)
Consistent capitalisation and unambiguous naming throughout
High mention count including nominal references
Related entity co-occurrence — a PageRank-like computation runs over connected entities within the text (Impression Digital)
First-mention clarity: When introducing an entity for the first time, provide explicit context (e.g., "Ahrefs, an SEO analysis platform, shows…") to help Google confirm which entity you mean. (SevenSEO)

Critical caveat from Google

Google's John Mueller has explicitly warned that public NLP salience scores do not mirror internal ranking systems. (Google Developer Forum) Use salience as a diagnostic gap-analysis tool — not as a direct ranking signal to chase. Stuffing co-occurring entities until text becomes unreadable will hurt, not help.

Building Your Entity Home

The entity home is the single canonical URL — usually your About page — that serves as the primary source of truth for how algorithms understand your brand or personal identity. The concept was formalised by Jason Barnard (Kalicube), whose research shows the entity home is the anchor from which all Knowledge Graph confidence flows. (Digital Applied)

Minimum requirements for an effective entity home

Full name in H1 — exactly as it appears on every external profile
Professional bio — who you are, what you do, key credentials, affiliations
High-quality photo or logo — referenced in schema with a stable URL
Links to every verified external profile — Wikipedia, Wikidata, LinkedIn, Crunchbase, GitHub, ORCID (as applicable)
JSON-LD schema block — Organization or Person type with @id pointing to the canonical domain and a complete sameAs array
Internal links to your core topic pages — creates co-occurrence signals connecting you to your areas of expertise (knowsAbout)

The rule of thumb from Barnard: "Schema without substance is a well-formatted, empty declaration." Every JSON-LD claim must match what is visibly stated on the page. (Digital Applied)

One documented test found that improving only the entity home page lifted conversions by 6% for visitors who reached it — before any other page had been touched.

The self-confirming loop

Entity home → authoritative external sources (Wikidata, Wikipedia, Crunchbase) → those sources link back or reference the entity home → Google's confidence score rises. Breaking any link in this loop stalls Knowledge Panel emergence. (OutpaceSEO)

Brand signals for entity recognition

Brand signals commonly listed among Google's ranking factors include brand-name anchor text, branded searches, brand mentions in news, unlinked brand mentions, and a large social media presence. A presence in the Knowledge Graph, in turn, reinforces brand authority. (Optinmark)

Schema Markup and sameAs: The Machine-Readable Layer

Structured Data & Schema is the technical backbone of entity SEO. JSON-LD, usually placed in the <head>, is Google's recommended format. Schema.org contains approximately 1,400 entity types and over 20,000 properties/classes, arranged in a multiple inheritance hierarchy. (Schemantra, Vrandecic via ReputationX) The key schema patterns for entity SEO are:

Organization schema

Google recommends 28 properties for Organization entities. Beyond the basics, advanced fields such as legalName, taxID, naics, and vatID strengthen entity identity. (Schema.org/Organization)

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "@id": "https://example.com/#organization",
  "name": "Example Co",
  "url": "https://example.com",
  "logo": "https://example.com/logo.png",
  "foundingDate": "2018",
  "legalName": "Example Co Ltd",
  "taxID": "US123456789",
  "naics": "541810",
  "sameAs": [
    "https://www.wikidata.org/wiki/Q12345678",
    "https://en.wikipedia.org/wiki/Example_Co",
    "https://www.linkedin.com/company/example-co",
    "https://www.crunchbase.com/organization/example-co"
  ]
}

(Schema.org/Organization)

The @id and @graph pattern

The @id creates a stable, unique identifier for the entity node — treat it as permanent. Using an @graph array lets you nest related schemas (Article, Author, Organization) in a single block, allowing Google to understand the relationships between them. (Momentic Marketing)
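
A minimal Python sketch of the @graph pattern follows, assuming a hypothetical example.com site, author, and article. It builds one JSON-LD document in which Article, Person, and Organization nodes refer to each other by @id, then checks that every reference resolves. The helper unresolved_references is illustrative, not part of any library.

```python
import json

SITE = "https://example.com"  # hypothetical domain

graph = {
    "@context": "https://schema.org",
    "@graph": [
        {
            "@type": "Organization",
            "@id": f"{SITE}/#organization",
            "name": "Example Co",
            "url": SITE,
        },
        {
            "@type": "Person",
            "@id": f"{SITE}/#author",
            "name": "Jane Doe",
            "worksFor": {"@id": f"{SITE}/#organization"},
        },
        {
            "@type": "Article",
            "@id": f"{SITE}/blog/entity-seo/#article",
            "headline": "Entity SEO basics",
            "author": {"@id": f"{SITE}/#author"},
            "publisher": {"@id": f"{SITE}/#organization"},
        },
    ],
}


def unresolved_references(doc: dict) -> set[str]:
    """Return @id references that point at no node defined in the @graph."""
    defined = {node["@id"] for node in doc["@graph"]}
    missing: set[str] = set()

    def walk(value) -> None:
        if isinstance(value, dict):
            # A dict holding only "@id" is a reference to another node.
            if set(value) == {"@id"} and value["@id"] not in defined:
                missing.add(value["@id"])
            for child in value.values():
                walk(child)
        elif isinstance(value, list):
            for child in value:
                walk(child)

    for node in doc["@graph"]:
        walk(node)
    return missing


print(json.dumps(graph, indent=2))
print("unresolved:", unresolved_references(graph))
```

Generating the block from one source of truth keeps the @id values identical everywhere they appear, which is the point of treating them as permanent.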

March 2026 schema changes

The March 2026 Core Update reshaped rich result eligibility significantly:

FAQ rich result impressions dropped 47% across tracked sites
How-To rich results removed for supplementary content
Review schema on editorial posts was algorithmically demoted
31 schema types retain active rich result support as of March 2026

Critically, schema that accurately describes content now increases the probability of AI Mode citation independent of traditional rich result display. (Digital Applied / Schema Markup After March 2026)

Common mistakes that destroy entity clarity

Multi-typing trap: applying Product + Article + LocalBusiness to a single page signals contradictory entity types (OutpaceSEO)
sameAs URLs pointing to 301 chains or 404s
Schema only on the homepage — the @id anchor must live on a dedicated entity page
Missing alternateName for known name variations
sameAs block omits Wikidata even when a QID exists (Kalicube)
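
Several of these mistakes can be caught with a simple offline check. The sketch below is a hypothetical linter, not an official validator. It flags multi-typing, a missing @id, a sameAs block without a Wikidata URL, and a missing alternateName. It assumes a QID and name variations exist, so treat the last two warnings as prompts to verify. Checking sameAs targets for 301 chains or 404s would need network requests and is left out.

```python
from urllib.parse import urlparse


def lint_entity_schema(node: dict) -> list[str]:
    """Return warnings for the entity-clarity mistakes listed above."""
    warnings = []

    types = node.get("@type", [])
    if isinstance(types, str):
        types = [types]
    if len(types) > 1:
        warnings.append("multiple @type values: " + ", ".join(types))

    if "@id" not in node:
        warnings.append("missing @id")

    same_as = node.get("sameAs", [])
    if isinstance(same_as, str):
        same_as = [same_as]
    hosts = [(urlparse(url).hostname or "") for url in same_as]
    if not any(host.endswith("wikidata.org") for host in hosts):
        warnings.append("sameAs has no Wikidata URL")

    if "alternateName" not in node:
        warnings.append("missing alternateName")

    return warnings


# Hypothetical node that commits all four mistakes.
risky = {
    "@type": ["Product", "Article", "LocalBusiness"],
    "name": "Example Co",
    "sameAs": ["https://www.linkedin.com/company/example-co"],
}
for warning in lint_entity_schema(risky):
    print(warning)
```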

Schema maintenance

Google recommends quarterly reviews of schema markup to ensure accuracy and completeness, especially after major website changes. (ReputationX)

Wikidata and Wikipedia: Your Knowledge Graph Bridge

Wikidata — highest leverage single action

Wikidata's notability bar is far lower than Wikipedia's: an item generally needs to describe a clearly identifiable entity that can be backed by serious, publicly available references. Most legitimate businesses and professionals can meet that bar. The payoff is significant: a Wikidata QID is the machine-readable bridge between your website and the Knowledge Graph, and sameAs pointing to Wikidata is Google's clearest signal for entity disambiguation. (Digital Applied)

A well-structured Wikidata entry for an organisation includes:

Property	Description
P31	Instance of (organization, person, etc.)
P856	Official website URL
P571	Founding date
P17	Country
P2002	X (Twitter) handle
P2037	GitHub username

Each property should carry a reference (source URL) — unreferenced claims carry lower confidence. A 15–20 property entry is typically sufficient for Google's Knowledge Graph to anchor the entity. (Instant Press)
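
One way to audit an item against this guidance is to inspect its claims, as in the hedged sketch below. It assumes the claims dictionary follows Wikidata's entity JSON layout (property ID mapped to a list of statements, each with an optional references list). The sample data is hand-written, audit_claims is a hypothetical helper, and treating all six table properties as expected is an assumption.

```python
# Properties from the table above; treating all six as expected is an assumption.
EXPECTED_PROPERTIES = {
    "P31": "instance of",
    "P856": "official website",
    "P571": "founding date",
    "P17": "country",
    "P2002": "X (Twitter) handle",
    "P2037": "GitHub username",
}


def audit_claims(claims: dict, min_referenced: int = 15) -> dict:
    """Summarise property coverage and referencing for one item's claims."""
    referenced = {
        pid
        for pid, statements in claims.items()
        if any(statement.get("references") for statement in statements)
    }
    return {
        "missing_expected": sorted(set(EXPECTED_PROPERTIES) - set(claims)),
        "unreferenced": sorted(set(claims) - referenced),
        "referenced_count": len(referenced),
        "meets_target": len(referenced) >= min_referenced,
    }


# Hand-written sample: one referenced claim, one unreferenced claim.
SAMPLE_CLAIMS = {
    "P31": [{"mainsnak": {"property": "P31"}, "references": [{"snaks": {"P854": []}}]}],
    "P856": [{"mainsnak": {"property": "P856"}}],
}

print(audit_claims(SAMPLE_CLAIMS))
```

The min_referenced default of 15 reflects the lower end of the 15–20 property range quoted above.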

Wikipedia — powerful but not required

In an Econsultancy study cited by ReputationX, Wikipedia ranked on page 1 for an estimated 99% of a random sample of 1,000 keywords. It remains a powerful entity signal, but recent algorithm changes show decreased dependency, with only about 15% of Knowledge Panel descriptions now coming from Wikipedia. (ReputationX) Notability requirements mean it isn't available to everyone.

A 2025 case study showed a verified Knowledge Panel achieved with zero Wikipedia, zero Wikidata, zero paid press — using only hand-crafted Person schema, optimised LinkedIn/GitHub/Crunchbase profiles, and a consistent digital footprint. The first panel signals appeared within 2–3 weeks; the claimable panel appeared within 6–8 weeks. (Kashif Mukhtar)

The lesson: neither source is strictly required, but when you do anchor an entity externally, Wikidata is structurally preferred over Wikipedia because it is machine-readable and openly editable.

Knowledge Panels: Triggering, Claiming, and Keeping Them

A Knowledge Panel is the visible representation of selected Knowledge Graph data. It appears on the right side of desktop SERPs or at the top on mobile — and its presence dramatically increases branded SERP real estate and AI Overview citation probability. Knowledge Panels now feature in 87% of search results tied to entities. (Niumatrix)

The three pillars of panel eligibility

Notability — multiple independent, trusted sources refer to the entity by name; the entity is distinct from others; there is persistent visible activity over time
Sourceability — factual information exists in trusted sources: Wikipedia, Wikidata, IMDb, Crunchbase, Discogs, Google Books, major news outlets, government records
Consistency — facts agree across sources; NAP (Name, Address, Phone) data matches across 15–25+ authoritative profiles (Instant Press)

EntityTrust formula (empirical, Kalicube)

Research from Kalicube/Authoritas models the panel likelihood threshold as:

EntityTrust = 0.25 × Identity + 0.20 × Corroboration + 0.20 × Authority + 0.15 × Structured Data + 0.10 × Consistency + 0.10 × Notability

The empirical panel threshold is EntityTrust ≥ 0.72. (Kalicube/Authoritas)
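
As a worked example of the formula, the Python sketch below computes the weighted sum and compares it with the 0.72 threshold. Scoring each component on a 0–1 scale is an assumption (the weights sum to 1, so the result then also falls between 0 and 1), and the example component scores are invented.

```python
# Weights taken from the formula above; they sum to 1.0.
WEIGHTS = {
    "identity": 0.25,
    "corroboration": 0.20,
    "authority": 0.20,
    "structured_data": 0.15,
    "consistency": 0.10,
    "notability": 0.10,
}
PANEL_THRESHOLD = 0.72


def entity_trust(scores: dict[str, float]) -> float:
    """Weighted sum of component scores, each assumed to lie in [0, 1]."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Invented component scores for a hypothetical brand.
example = {
    "identity": 0.9,
    "corroboration": 0.7,
    "authority": 0.6,
    "structured_data": 0.9,
    "consistency": 0.8,
    "notability": 0.5,
}
score = entity_trust(example)
print(round(score, 2), score >= PANEL_THRESHOLD)
```

Here the terms are 0.225 + 0.14 + 0.12 + 0.135 + 0.08 + 0.05 = 0.75, which clears the threshold. Identity carries the largest weight, so a weak entity home drags the total down fastest.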

Panel timelines

Path	Typical duration
Wikidata first, schema added later	4–9 months
DIY approach (40–100 hours of work)	6–12 months (uncertain)
Professional accompanied build	3–6 months (plannable)
Personal/individual panels	12–24 months typically

After the EntityTrust threshold is crossed, panels typically emerge within 2–6 weeks — provided there is sufficient query demand for the entity. (Kalicube/Authoritas)

Real-world outcomes

Vertex Compliance Group: 3-month entity campaign resulted in a secured Knowledge Panel, +182% branded impressions, +64% branded clicks (TopSEOLinks)
Brightview Senior Living: External entity linking for "assisted living" produced a 25% increase in non-branded clicks (Schema App)
Backpacker Job Board (Kalicube Pro case study): Brand SERP went from 5/10 to permanent Knowledge Panel with logo within 4 months (Kalicube)
Schemantra Real Estate Case Study: Entity-first schema implementation (Apartment + RealEstateListing + Product) drove organic traffic up 100%, impressions up 200%, and leads up 100%. (Schemantra)
Interingilizce.com: Entity-first project achieved 1100% organic traffic increase in 145 days, from 10,000 to 200,000+ monthly visits — without traditional SEO tactics like page speed or brand power. (Oncrawl)

What to avoid

Paid "guaranteed Knowledge Panel" services — mostly spam
Coordinated profile creation (20+ identical bios in two weeks triggers detection)
Purchased Wikipedia articles (violates Wikipedia terms; flagged quickly)
Panel hacking: a deleted panel is harder to rebuild than one that never appeared, and rebuilding it takes an authority investment comparable to the original build. (Kalicube/Authoritas)

Entity SEO in the AI Search Era

Entity clarity is now the primary prerequisite for AI citation. The connection is direct: Gemini AI is trained on the Knowledge Graph. Entity establishment drives AI Overview inclusions, Knowledge Panel cards, and AI Mode answers. (Digital Applied)

Key data points:

92% of AI Overview citations come from domains already ranking in the top 10 — entity clarity tells Google which top-10 result is authoritative for that query
Brand mention correlation with AI Overview visibility: 0.664 vs. 0.218 for backlinks (Onely research cited in Digital Applied)
AI referral traffic grew more than 10× in the US between July 2024 and February 2025 — with AI-referred visitors browsing 12% more pages and showing a 23% lower bounce rate (Adobe)
LLMs grounded in structured knowledge graphs achieve 300% higher factual accuracy compared to unstructured data alone (Inter-dev)
AI crawler traffic increased 96% between May 2024 and May 2025; GPTBot's share of all crawler traffic jumped from 5% to 30% (Inter-dev / Search Engine Land)
ChatGPT now sees over 800 million active users weekly and handles more than 2.5 billion prompts daily (Search Engine Land)
79% of prospective students read AI-generated overviews when they appear in search results (iFactory)
LLM-driven traffic converts at 16% compared to 0.8% for traditional organic traffic — a 20x improvement (Somebody Digital)
Pages with valid schema markup are 2-4x more likely to appear in AI Overviews (iFactory)
Entity-optimized content is 50% more likely to appear in featured snippets (iFactory)

For a deeper look at how entities feed AI Overviews and answer engines, see AI Search & AEO.

Structuring content for AI extraction

AI systems extract from content differently than traditional crawlers. Practical adaptations:

60-word rule: answer the primary intent within 60 words of the H1
Monosemantic blocks: 75–225 word sections addressing exactly one concept — cleaner for LLM passage extraction
Factual grounding tables: structured data above the fold; numbers don't hallucinate
Key Takeaways at the top: mirrors the news lede format that AI citation systems favour
Content quality for AI-citability: include real metrics and case studies, challenge conventional wisdom, ensure 3,000+ words with comprehensive coverage, and link to related authority content. (Somebody Digital)
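
The word-count guidelines above are easy to check mechanically. The sketch below is a hypothetical helper that assumes sections are already split into (heading, body) pairs and that words are separated by whitespace. It reports which blocks fall outside the 75–225 word range and extracts the first 60 words for manual review of the answer-first rule.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def check_blocks(sections, low: int = 75, high: int = 225):
    """Label each (heading, body) section by its word count."""
    report = []
    for heading, body in sections:
        count = word_count(body)
        if count < low:
            status = "too short"
        elif count > high:
            status = "too long"
        else:
            status = "ok"
        report.append((heading, count, status))
    return report


def opening_excerpt(body: str, limit: int = 60) -> str:
    """Return the first `limit` words so a human can judge the answer."""
    return " ".join(body.split()[:limit])


# Placeholder sections built from repeated filler words.
sections = [
    ("What is entity SEO?", "word " * 80),
    ("Stub section", "word " * 10),
    ("Overlong section", "word " * 300),
]
for row in check_blocks(sections):
    print(row)
```

Whether the opening excerpt actually answers the primary intent still needs a human read; the function only isolates the text to review.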

GraphRAG and Model Context Protocol

New architectures like GraphRAG restructure information into knowledge graphs where entities become nodes and relationships become edges, enabling multi-hop reasoning — AI traversing connections to answer layered queries. (iFactory)

The Model Context Protocol (MCP), described as a "USB-C port for AI applications," provides a standardised way to connect AI models to data sources. OpenAI and Google have adopted MCP. The NLWeb project, created by Schema.org founder RV Guha, aims to simplify natural language interfaces for websites using structured data websites already publish. (Schema App)

12-Step Entity SEO Implementation Plan

Audit your entity presence — search your brand name; check whether a Knowledge Panel exists; use the Google Knowledge Graph Search API to find your kgmid
Create or improve your Wikidata entry — minimum 15–20 referenced properties; record the QID
Implement Organization or Person schema on the entity home with a full sameAs block pointing to Wikidata, Wikipedia (if applicable), LinkedIn, Crunchbase, GitHub
Standardise profiles across 15–25 authoritative platforms — identical name, title, description, and photo
Publish your entity home page at a stable canonical URL; ensure it passes HTML validation and renders without depending on JavaScript
Earn 8–15 pieces of third-party coverage in authoritative publications with consistent entity descriptions
Build the self-confirming loop — Entity Home → Wikidata → Wikipedia/Crunchbase → back to Entity Home via sameAs and external links
Audit content entity salience using the Google Cloud NLP API — target ≥ 0.50 salience for your primary entity on key pages; use an entity audit template like the one below
Adopt Answer-First UI — 60-word answers, monosemantic blocks, factual tables above fold
Monitor GSC Branded Queries Filter (launched November 2025) — segments branded vs. non-branded traffic using Google's entity-based AI classification
Track AI citation frequency using Bing Webmaster Tools AI Performance Report and the Gemini Grounding API
Avoid the multi-typing trap — one schema type per page; use distinct URLs for each intent modifier
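
For the first step, a lookup against the Google Knowledge Graph Search API might be sketched as follows. The code only builds the request URL and parses a response. The sample response is hand-written with a placeholder kgmid, the API key is a placeholder, and both helper names are hypothetical. Sending the request requires your own API key.

```python
from urllib.parse import urlencode

ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"


def build_search_url(query: str, api_key: str, limit: int = 5) -> str:
    """Build an entities:search URL for a brand or person name."""
    params = {"query": query, "key": api_key, "limit": limit}
    return f"{ENDPOINT}?{urlencode(params)}"


def extract_candidates(response: dict) -> list[dict]:
    """Pull kgmid, name, types, and score from a search response."""
    candidates = []
    for item in response.get("itemListElement", []):
        result = item.get("result", {})
        candidates.append({
            "kgmid": result.get("@id", "").removeprefix("kg:"),
            "name": result.get("name"),
            "types": result.get("@type", []),
            "score": item.get("resultScore"),
        })
    return candidates


# Hand-written sample; the identifier and score are placeholders.
SAMPLE_SEARCH_RESPONSE = {
    "@type": "ItemList",
    "itemListElement": [
        {
            "@type": "EntitySearchResult",
            "result": {
                "@id": "kg:/m/PLACEHOLDER",
                "name": "Example Co",
                "@type": ["Organization", "Thing"],
            },
            "resultScore": 123.4,
        }
    ],
}

print(build_search_url("Example Co", "YOUR_API_KEY"))
print(extract_candidates(SAMPLE_SEARCH_RESPONSE))
```

An empty itemListElement for your exact brand name suggests the entity is not yet resolvable, which makes the remaining steps the priority.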

Entity Audit Template

Entity	Current Salience	Competitor Average	Gap	Priority
Primary topic entity	0.45	0.72	-0.27	High
Supporting entity 1	0.12	0.31	-0.19	Medium
Missing entity	N/A	0.28	-0.28	High

(SearchAtlas)
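
The template's Gap and Priority columns can be derived rather than filled in by hand, as in this sketch. The priority cut-offs (a gap of -0.25 or worse, or a missing entity, is High; -0.10 or worse is Medium) are assumptions chosen to reproduce the sample rows, not thresholds stated by the source.

```python
from typing import Optional


def audit_entity(name: str, current: Optional[float], competitor_avg: float) -> dict:
    """Compute the salience gap and an assumed priority for one entity."""
    # A missing entity (None) is treated as salience 0 for the gap.
    own = current if current is not None else 0.0
    gap = round(own - competitor_avg, 2)
    if current is None or gap <= -0.25:
        priority = "High"
    elif gap <= -0.10:
        priority = "Medium"
    else:
        priority = "Low"
    return {
        "entity": name,
        "current": current,
        "competitor_avg": competitor_avg,
        "gap": gap,
        "priority": priority,
    }


# Rows from the template above.
rows = [
    audit_entity("Primary topic entity", 0.45, 0.72),
    audit_entity("Supporting entity 1", 0.12, 0.31),
    audit_entity("Missing entity", None, 0.28),
]
for row in rows:
    print(row)
```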

Frequently asked questions

What is the difference between a Knowledge Graph entity and a Knowledge Panel?

The Knowledge Graph contains all entities Google has indexed — billions of them. A Knowledge Panel is the visible SERP feature that Google displays for some entities when a user searches for them. Having a Knowledge Graph entity is necessary but not sufficient for a panel; you also need query demand and sufficient sourceability. (ReputationX)

Do I need a Wikipedia page to get a Knowledge Panel?

No. Wikipedia strengthens the signal significantly but is not mandatory. A Wikidata entry with 15–20 referenced properties, a complete Person/Organization schema with sameAs, and consistent presence across Crunchbase, LinkedIn, and authoritative industry platforms can be sufficient. One documented case achieved a claimable panel in 6–8 weeks with no Wikipedia entry at all. (Kashif Mukhtar)

How does entity salience affect my rankings?

Indirectly, not directly. Google's John Mueller has confirmed that public NLP salience scores are not internal ranking signals. However, high entity salience on your pages correlates with stronger topical relevance signals, which Google's Helpful Content System and Topic Authority assessments do reward. Use salience as a gap-analysis diagnostic, not a number to optimise in isolation. (Szymon Slowik / Google Developer Forum)

What happened to Knowledge Panels after Google's June 2025 update?

Google's "Great Clarity Cleanup" in June 2025 deleted over 3 billion entities — a 6.26% contraction — targeting ambiguous "Thing"-typed entities and temporary Event entities. Panels tied to poorly sourced or under-typed entities disappeared. The follow-on Graph Foundation Model update in July 2025 caused additional panel losses for some brands. The lesson: entity definitions need unambiguous typing (P31 in Wikidata, precise @type in schema) and multiple corroborating sources. (Search Engine Land)

How long does it realistically take to earn a Knowledge Panel?

After crossing the EntityTrust threshold (empirically ≥ 0.72), panels typically emerge within 2–6 weeks provided sufficient query demand exists. Getting to that threshold takes longer: 3–6 months with professional support, 6–12 months on a self-managed basis, and 12–24 months for individual/personal panels. Building Wikidata first and adding schema later extends the timeline to 4–9 months compared to doing both simultaneously. (Kalicube/Authoritas)

What's new (2026-06-22)

Knowledge Graph size details: Added launch figure of 570 million entities (May 2012), alternative figures up to 8 billion entities, and Wikidata scale (750M+ statements, 4.9B triples). (Niumatrix, PMC7077981, Malyshev et al.)
New statistics: Knowledge Panels appear in 87% of search results tied to entities; more than 58% of searches are zero-click; 83.3% of AI Overview citations from beyond top 10 (not integrated, but noted elsewhere). (Niumatrix, Somebody Digital)
Schema markup impact: Entity-optimized content is 50% more likely in featured snippets; pages with valid schema are 2-4x more likely in AI Overviews. (iFactory)
Schema maintenance: Added recommendation for quarterly reviews of schema markup. (ReputationX)
Wikipedia dependency: Updated to note decreased dependency, with only ~15% of panel descriptions from Wikipedia. (ReputationX)
Brand signals: Added brand signals from Google's ranking factors (brand anchors, branded searches, etc.). (Optinmark)
Entity audit template: Added new entity audit template from SearchAtlas. (SearchAtlas)
First-mention clarity rule: Added technique for explicit entity disambiguation on first mention. (SevenSEO)
AI search statistics: Added ChatGPT user/prompt data, LLM conversion rates (16% vs 0.8%), and GraphRAG/MCP architectures. (Somebody Digital, iFactory, Schema App)
Case studies: Added Schemantra real estate case (+100% traffic) and Interingilizce.com entity-first project (1100% growth). (Schemantra, Oncrawl)
Schema.org scale: Added note that Schema.org has ~1,400 types and 20,000+ properties/classes. (Schemantra)
Algorithm timeline: Added details for Hummingbird (90% searches), RankBrain (15%), MUM (1000x BERT). (Niumatrix)

Originally published in the EcomExperts SEO library.