Entity SEO: Optimise for Google's Knowledge Graph
How Google's Knowledge Graph uses entities, NER, and salience to rank content. Build entity authority and earn a Knowledge Panel in 2025–2026.
Google's Knowledge Graph now holds over 5 billion entities and 500 billion facts — and it reshapes every SERP you care about. (Digital Applied) More than 58% of searches already end without a click, meaning a brand that hasn't established entity authority is invisible before the user even reaches a blue link. (Somebody Digital) Additionally, Knowledge Panels now appear in 87% of search results tied to entities. (Niumatrix)
Entity SEO is the practice of making Google understand exactly who or what you are — so the Knowledge Graph, AI Overviews, and Knowledge Panels all agree on your identity, authority, and topic focus.
Quick answer:
Entity SEO means helping Google map your brand, content, or person to an unambiguous node in its Knowledge Graph. The core levers are: a well-structured entity home page with JSON-LD schema, a verified Wikidata entry, consistent "sameAs" signals across authoritative platforms, high entity salience in your content, and earned third-party mentions that corroborate your identity. Done right, this earns a Knowledge Panel and citation priority in AI Overviews, both of which are becoming more valuable than rank position alone.
How Google's Knowledge Graph Works
Google launched the Knowledge Graph in May 2012 with 570 million entities. By 2025 that figure exceeded 5 billion entities and 500 billion associated facts. (MarGen) Some sources indicate it expanded to 800 billion facts about 8 billion entities within 10 years. (Niumatrix) The graph connects entities — people, organisations, places, products, concepts — rather than pages or keywords. Google's own summary: "things, not strings."
Each entity inside the graph has:
- A Machine ID (MID or kgmid): a unique stable identifier (e.g. /m/0k8z for Apple Inc.)
- Attributes: name, description, founding date, headquarters, industry
- Edges: relationships to other entities (sameAs, worksFor, locatedIn)
- Confidence scores: how certain Google is about each attribute
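As a rough mental model, the four parts of an entity listed above can be sketched as a small data structure. This is only an illustration: the class names, the 0 to 1 confidence scale, and the placeholder MIDs are assumptions, not Google's internal schema.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    """One fact about an entity plus a confidence value (assumed 0-1 scale)."""
    value: str
    confidence: float


@dataclass
class Entity:
    """Illustrative stand-in for a Knowledge Graph node, not Google's schema."""
    mid: str  # Machine ID; the values used below are placeholders
    attributes: dict[str, Attribute] = field(default_factory=dict)
    edges: list[tuple[str, str]] = field(default_factory=list)  # (relation, target MID)

    def low_confidence(self, threshold: float = 0.5) -> list[str]:
        """Return attribute names whose confidence falls below the threshold."""
        return [name for name, attr in self.attributes.items()
                if attr.confidence < threshold]


example = Entity(mid="/m/placeholder")
example.attributes["name"] = Attribute("Example Co", 0.95)
example.attributes["foundingDate"] = Attribute("2018", 0.40)
example.edges.append(("locatedIn", "/m/placeholder_city"))

print(example.low_confidence())  # ['foundingDate']
```

Thinking of each attribute as carrying its own confidence helps explain why corroboration matters: a weakly supported fact can hold back an otherwise well-defined entity.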
Sources feeding the graph include Wikipedia, Wikidata, licensed data providers (sports databases, financial data, government records), and structured data signals found across the web. (ReputationX) Google now draws from over 209,966 trusted sources, requiring approximately 30 corroborating sources to verify information as factual. (ReputationX) Wikidata, a core component, held over 750 million statements on 61 million items as of September 2019, and its RDF encoding comprised over 4.9 billion triples by April 2018. (PMC7077981, Malyshev et al.)
The June 2025 "Great Clarity Cleanup"
In a single week in June 2025, Google contracted the Knowledge Graph by 6.26%, deleting over 3 billion entities. "Thing"-typed entities dropped 15.27%. Temporary pandemic-era Event entities were purged in bulk. (Search Engine Land) The strategic signal is clear: Google is prioritising high-confidence, unambiguous entities over sheer volume. Vague or poorly typed entities are now a liability.
Named Entity Recognition (NER): How Google Identifies Entities in Text
Before the Knowledge Graph can help you rank, Google must first identify which entities exist in your content. That process is Named Entity Recognition (NER).
Google's Cloud Natural Language API — a public proxy for understanding how Google reads text — returns entities classified as: PERSON, LOCATION, ORGANIZATION, EVENT, WORK_OF_ART, CONSUMER_GOOD, and more. Each entity carries:
- name: the surface form
- type: entity category
- metadata: Wikipedia URL and Knowledge Graph MID where applicable
- salience: a 0–1 score of centrality
- mentions: all occurrences in the text
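To make those fields concrete, here is a minimal sketch that flattens an entity-analysis response into one row per entity. The sample dictionary is hand-written to resemble the API's JSON output; every value in it (names, scores, the MID) is an invented placeholder, and `summarise_entities` is a hypothetical helper rather than part of any Google client library.

```python
from typing import Any

# Hand-written sample shaped like an entity-analysis response.
# All values, including the MID and URL, are invented placeholders.
SAMPLE_RESPONSE: dict[str, Any] = {
    "entities": [
        {
            "name": "Example Search Co",
            "type": "ORGANIZATION",
            "metadata": {"mid": "/m/placeholder",
                         "wikipedia_url": "https://en.wikipedia.org/wiki/Example"},
            "salience": 0.21,
            "mentions": [{"text": {"content": "Example Search Co"}, "type": "PROPER"}],
        },
        {
            "name": "Ahrefs",
            "type": "ORGANIZATION",
            "metadata": {},
            "salience": 0.62,
            "mentions": [
                {"text": {"content": "Ahrefs"}, "type": "PROPER"},
                {"text": {"content": "platform"}, "type": "COMMON"},
            ],
        },
    ],
    "language": "en",
}


def summarise_entities(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten each entity to a row of key fields, highest salience first."""
    rows = []
    for entity in response.get("entities", []):
        metadata = entity.get("metadata", {})
        rows.append({
            "name": entity["name"],
            "type": entity["type"],
            "mid": metadata.get("mid"),  # None when the response carries no MID
            "wikipedia_url": metadata.get("wikipedia_url"),
            "salience": entity.get("salience", 0.0),
            "mention_count": len(entity.get("mentions", [])),
        })
    return sorted(rows, key=lambda row: row["salience"], reverse=True)


for row in summarise_entities(SAMPLE_RESPONSE):
    print(f'{row["salience"]:.2f}  {row["type"]:<13} {row["name"]}  mid={row["mid"]}')
```

An entity that comes back without a MID is a useful audit finding in itself: the text mentions it, but the response does not tie it to a Knowledge Graph node.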
Internally, Google's NLP pipeline uses contextual embeddings and self-attention mechanisms to understand relationships between words regardless of word order. (Impression Digital) This means Google can recognise a "nominal reference" ("the midfielder") or a pronoun ("he") as pointing to a named entity introduced earlier in the piece.
In AI search specifically, NER occurs at multiple pipeline stages: query understanding, query expansion via Knowledge Graph relationships, passage-level retrieval, and answer synthesis. Google's AI Mode uses a "query fan-out" technique — generating dozens of sub-searches, each driven by entity recognition — and query lengths in AI Mode average 2–3× those of traditional searches. (iPullRank)
API benchmark: task-specific NLP beats generalist LLMs for NER
An iPullRank study benchmarked Google Cloud NLP, AWS Comprehend, and IBM Watson against generative LLMs (including DeepSeek R1) for entity extraction. The task-specific APIs returned more entities, richer metadata (including Wikipedia URLs and KG identifiers), and reproducible outputs. LLMs were inconsistent. (iPullRank) For auditing your own content's entity profile, Google's Cloud NLP API remains the most practical tool.
Entity Salience: Making Your Core Topic Unmistakable
Entity salience is a score from 0 to 1 quantifying how central an entity is to a piece of text — a prediction of what a human reader would consider most important. (Google Cloud NLP Docs)
What the scores mean in practice
| Salience range | Interpretation |
|---|---|
| < 0.10 | Content focus problem — entity barely registers |
| 0.10–0.20 | Reasonable working range for supporting entities |
| 0.20–0.50 | Coherent, entity-aware content |
| ≥ 0.50 | Entity clearly central to the page — primary topical relevance |
Industry heuristics from SEO researchers suggest ≥ 0.5 is the threshold for "primary topical relevance." (NEURONwriter) A published case study illustrated the gap well: an article on "cloud computing security" scored its main entity at 0.38 while top competitors reached 0.72 for the same concept. (Szymon Slowik)
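The bands in the table can be turned into a small lookup for audit scripts. This is a sketch of the heuristic only: the function name is hypothetical, and treating each lower bound as inclusive is an assumption, since the table does not say which band a score of exactly 0.20 belongs to.

```python
def interpret_salience(score: float) -> str:
    """Map a 0-1 salience score to the heuristic bands from the table.

    Lower bounds are treated as inclusive (an assumption).
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"salience must be between 0 and 1, got {score}")
    if score < 0.10:
        return "content focus problem"
    if score < 0.20:
        return "working range for supporting entities"
    if score < 0.50:
        return "coherent, entity-aware content"
    return "primary topical relevance"


# The two scores quoted in the cloud computing security case study.
print(interpret_salience(0.38))  # coherent, entity-aware content
print(interpret_salience(0.72))  # primary topical relevance
```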
Factors that raise salience
- Placing the entity in the H1 and opening paragraph
- Subject position in sentences (subject > object)
- Consistent capitalisation and unambiguous naming throughout
- High mention count including nominal references
- Related entity co-occurrence — a PageRank-like computation runs over connected entities within the text (Impression Digital)
- First-mention clarity: When introducing an entity for the first time, provide explicit context (e.g., "Ahrefs, an SEO analysis platform, shows…") to help Google confirm which entity you mean. (SevenSEO)
Critical caveat from Google
Google's John Mueller has explicitly warned that public NLP salience scores do not mirror internal ranking systems. (Google Developer Forum) Use salience as a diagnostic gap-analysis tool — not as a direct ranking signal to chase. Stuffing co-occurring entities until text becomes unreadable will hurt, not help.
Building Your Entity Home
The entity home is the single canonical URL — usually your About page — that serves as the primary source of truth for how algorithms understand your brand or personal identity. The concept was formalised by Jason Barnard (Kalicube), whose research shows the entity home is the anchor from which all Knowledge Graph confidence flows. (Digital Applied)
Minimum requirements for an effective entity home
- Full name in H1 — exactly as it appears on every external profile
- Professional bio — who you are, what you do, key credentials, affiliations
- High-quality photo or logo — referenced in schema with a stable URL
- Links to every verified external profile — Wikipedia, Wikidata, LinkedIn, Crunchbase, GitHub, ORCID (as applicable)
- JSON-LD schema block: Organization or Person type with @id pointing to the canonical domain and a complete sameAs array
- Internal links to your core topic pages: creates co-occurrence signals connecting you to your areas of expertise (knowsAbout)
The rule of thumb from Barnard: "Schema without substance is a well-formatted, empty declaration." Every JSON-LD claim must match what is visibly stated on the page. (Digital Applied)
One documented test found that improving only the entity home page lifted conversions by 6% for visitors who reached it — before any other page had been touched.
The self-confirming loop
Entity home → authoritative external sources (Wikidata, Wikipedia, Crunchbase) → those sources link back or reference the entity home → Google's confidence score rises. Breaking any link in this loop stalls Knowledge Panel emergence. (OutpaceSEO)
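One way to spot a broken link in this loop is to record which pages link to which and look for profiles that never point back. The sketch below assumes you have already collected the outbound links by hand or with a crawler; the URLs are placeholders and `unreciprocated` is a hypothetical helper.

```python
# Outbound links per page, gathered manually or by a crawler (placeholder URLs).
LINKS: dict[str, set[str]] = {
    "https://example.com/about": {
        "https://www.wikidata.org/wiki/Q12345678",
        "https://www.crunchbase.com/organization/example-co",
    },
    "https://www.wikidata.org/wiki/Q12345678": {"https://example.com/about"},
    "https://www.crunchbase.com/organization/example-co": set(),
}


def unreciprocated(entity_home: str, links: dict[str, set[str]]) -> list[str]:
    """Return profiles the entity home links to that do not link back to it."""
    targets = links.get(entity_home, set())
    return sorted(t for t in targets if entity_home not in links.get(t, set()))


print(unreciprocated("https://example.com/about", LINKS))
# ['https://www.crunchbase.com/organization/example-co']
```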
Brand signals for entity recognition
Google's ranking factors include brand name anchor text, branded searches, brand mentions in news, unlinked brand mentions, and a large social media presence. Being in the Knowledge Graph enhances brand authority. (Optinmark)
Schema Markup and sameAs: The Machine-Readable Layer
Structured Data & Schema is the technical backbone of entity SEO. JSON-LD, delivered in the <head>, is Google's preferred format. Schema.org contains approximately 1,400 entity types and over 20,000 properties/classes, arranged in a multiple inheritance hierarchy. (Schemantra, Vrandecic via ReputationX) The key schema patterns for entity SEO are:
Organization schema
Google recommends 28 properties for Organization entities. For entity authority, the most useful include advanced fields such as legalName, taxID, naics, and vatID, which strengthen entity identity. (Schema.org/Organization)
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "@id": "https://example.com/#organization",
  "name": "Example Co",
  "url": "https://example.com",
  "logo": "https://example.com/logo.png",
  "foundingDate": "2018",
  "legalName": "Example Co Ltd",
  "taxID": "US123456789",
  "naics": "541810",
  "sameAs": [
    "https://www.wikidata.org/wiki/Q12345678",
    "https://en.wikipedia.org/wiki/Example_Co",
    "https://www.linkedin.com/company/example-co",
    "https://www.crunchbase.com/organization/example-co"
  ]
}
The @id and @graph pattern
The @id creates a stable, unique identifier for the entity node, so treat it as permanent. Using an @graph array lets you group related schemas (Article, Author, Organization) in a single block and link them to one another by @id, allowing Google to understand the relationships between them. (Momentic Marketing)
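A minimal sketch of the pattern, built in Python so the references can be checked before publishing. The domain, names, and headline are placeholders, and `dangling_references` is a hypothetical helper that only inspects top-level properties of each node.

```python
import json

BASE = "https://example.com"  # placeholder domain
ORG_ID = f"{BASE}/#organization"
AUTHOR_ID = f"{BASE}/about/#person"

graph = {
    "@context": "https://schema.org",
    "@graph": [
        {"@type": "Organization", "@id": ORG_ID, "name": "Example Co", "url": BASE},
        {"@type": "Person", "@id": AUTHOR_ID, "name": "Jane Doe",
         "worksFor": {"@id": ORG_ID}},
        {"@type": "Article", "@id": f"{BASE}/blog/entity-seo/#article",
         "headline": "Entity SEO basics",
         "author": {"@id": AUTHOR_ID},
         "publisher": {"@id": ORG_ID}},
    ],
}


def dangling_references(doc: dict) -> set[str]:
    """Return @id values that are referenced but never defined in the graph."""
    defined = {node["@id"] for node in doc["@graph"] if "@id" in node}
    referenced = set()
    for node in doc["@graph"]:
        for value in node.values():
            # A bare {"@id": ...} object is a pointer to another node.
            if isinstance(value, dict) and set(value) == {"@id"}:
                referenced.add(value["@id"])
    return referenced - defined


print(dangling_references(graph))  # set()
print(json.dumps(graph, indent=2))
```

Defining each node once and pointing to it by @id avoids repeating the full Organization object inside every Article, which is where inconsistent copies tend to creep in.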
March 2026 schema changes
The March 2026 Core Update reshaped rich result eligibility significantly:
- FAQ rich result impressions dropped 47% across tracked sites
- How-To rich results removed for supplementary content
- Review schema on editorial posts was algorithmically demoted
- 31 schema types retain active rich result support as of March 2026
Critically, schema that accurately describes content now increases the probability of AI Mode citation independent of traditional rich result display. (Digital Applied / Schema Markup After March 2026)
Common mistakes that destroy entity clarity
- Multi-typing trap: applying Product + Article + LocalBusiness to a single page signals contradictory entity types (OutpaceSEO)
- sameAs URLs pointing to 301 chains or 404s
- Schema only on the homepage: the @id anchor must live on a dedicated entity page
- Missing alternateName for known name variations
- sameAs block omits Wikidata even when a QID exists (Kalicube)
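Several of these mistakes can be caught with a simple offline check on the parsed JSON-LD. The sketch below is a hypothetical linter, not an official validator: it flags any multi-typed node for manual review (some type combinations are legitimate), and it leaves out the redirect and 404 check because that would need live HTTP requests.

```python
def lint_entity_schema(node: dict) -> list[str]:
    """Return human-readable warnings for one parsed JSON-LD node."""
    problems = []

    types = node.get("@type", [])
    if isinstance(types, list) and len(types) > 1:
        problems.append(f"multiple @type values {types}: review for contradictions")

    if "@id" not in node:
        problems.append("no @id: the entity has no stable identifier")

    same_as = node.get("sameAs", [])
    if isinstance(same_as, str):  # sameAs may be a single URL string
        same_as = [same_as]
    if not any("wikidata.org" in url for url in same_as):
        problems.append("sameAs has no Wikidata URL: add it if a QID exists")

    if "alternateName" not in node:
        problems.append("no alternateName: add known name variations")

    return problems


page_schema = {
    "@type": ["Product", "Article", "LocalBusiness"],
    "name": "Example Co",
    "sameAs": ["https://www.linkedin.com/company/example-co"],
}
for warning in lint_entity_schema(page_schema):
    print("-", warning)
```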
Schema maintenance
Google recommends quarterly reviews of schema markup to ensure accuracy and completeness, especially after major website changes. (ReputationX)
Wikidata and Wikipedia: Your Knowledge Graph Bridge
Wikidata — highest leverage single action
Wikidata has no notability requirement. Any legitimate business or professional can create an item. The payoff is significant: a Wikidata QID is the machine-readable bridge between your website and the Knowledge Graph, and sameAs pointing to Wikidata is Google's clearest signal for entity disambiguation. (Digital Applied)
A well-structured Wikidata entry for an organisation includes:
| Property | Description |
|---|---|
| P31 | Instance of (organization, person, etc.) |
| P856 | Official website URL |
| P571 | Founding date |
| P17 | Country |
| P2002 | X (Twitter) handle |
| P2037 | GitHub username |
Each property should carry a reference (source URL) — unreferenced claims carry lower confidence. A 15–20 property entry is typically sufficient for Google's Knowledge Graph to anchor the entity. (Instant Press)
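The two checks in that paragraph (enough properties, each with a reference) can be run against an item's JSON export. The sample below is a hand-written, heavily trimmed imitation of that structure with a placeholder QID; `audit_claims` is a hypothetical helper, and the default minimum of 15 simply mirrors the lower end of the range quoted above.

```python
# Trimmed, hand-written imitation of a Wikidata item's JSON (placeholder QID).
SAMPLE_ITEM = {
    "id": "Q12345678",
    "claims": {
        "P31": [{"mainsnak": {"property": "P31"}, "references": [{"snaks": {}}]}],
        "P856": [{"mainsnak": {"property": "P856"}, "references": [{"snaks": {}}]}],
        "P571": [{"mainsnak": {"property": "P571"}}],  # no references recorded
    },
}


def audit_claims(item: dict, minimum: int = 15) -> dict:
    """Count properties and list those with at least one unreferenced statement."""
    claims = item.get("claims", {})
    unreferenced = sorted(
        pid for pid, statements in claims.items()
        if any(not statement.get("references") for statement in statements)
    )
    return {
        "property_count": len(claims),
        "meets_minimum": len(claims) >= minimum,
        "unreferenced": unreferenced,
    }


print(audit_claims(SAMPLE_ITEM))
# {'property_count': 3, 'meets_minimum': False, 'unreferenced': ['P571']}
```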
Wikipedia — powerful but not required
Wikipedia content ranks on page 1 for an estimated 99% of a random 1,000-keyword sample (Econsultancy study cited by Reputation X). It remains a powerful entity signal — but recent algorithm changes show decreased dependency, with only about 15% of Knowledge Panel descriptions now coming from Wikipedia. (ReputationX) Notability requirements mean it isn't available to everyone.
A 2025 case study showed a verified Knowledge Panel achieved with zero Wikipedia, zero Wikidata, zero paid press — using only hand-crafted Person schema, optimised LinkedIn/GitHub/Crunchbase profiles, and a consistent digital footprint. The first panel signals appeared within 2–3 weeks; the claimable panel appeared within 6–8 weeks. (Kashif Mukhtar)
The lesson: Wikidata is structurally preferred over Wikipedia for entity anchoring because it is machine-readable and openly editable.
Knowledge Panels: Triggering, Claiming, and Keeping Them
A Knowledge Panel is the visible representation of selected Knowledge Graph data. It appears on the right side of desktop SERPs or at the top on mobile — and its presence dramatically increases branded SERP real estate and AI Overview citation probability. Knowledge Panels now feature in 87% of search results tied to entities. (Niumatrix)
The three pillars of panel eligibility
- Notability — multiple independent, trusted sources refer to the entity by name; the entity is distinct from others; there is persistent visible activity over time
- Sourceability — factual information exists in trusted sources: Wikipedia, Wikidata, IMDb, Crunchbase, Discogs, Google Books, major news outlets, government records
- Consistency — facts agree across sources; NAP (Name, Address, Phone) data matches across 15–25+ authoritative profiles (Instant Press)
EntityTrust formula (empirical, Kalicube)
Research from Kalicube/Authoritas models the panel likelihood threshold as:
EntityTrust = 0.25 × Identity + 0.20 × Corroboration + 0.20 × Authority + 0.15 × Structured Data + 0.10 × Consistency + 0.10 × Notability
The empirical panel threshold is EntityTrust ≥ 0.72. (Kalicube/Authoritas)
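As a worked example, the weighted sum can be computed directly. The weights and the 0.72 threshold come from the formula above; scoring each component on a 0 to 1 scale is an assumption, and the sample scores are invented for illustration.

```python
WEIGHTS = {
    "identity": 0.25,
    "corroboration": 0.20,
    "authority": 0.20,
    "structured_data": 0.15,
    "consistency": 0.10,
    "notability": 0.10,
}
PANEL_THRESHOLD = 0.72


def entity_trust(scores: dict[str, float]) -> float:
    """Weighted sum of component scores (each assumed to lie between 0 and 1)."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise KeyError(f"missing component scores: {sorted(missing)}")
    return sum(weight * scores[name] for name, weight in WEIGHTS.items())


# Invented scores for a brand with strong identity but weak notability.
sample = {
    "identity": 0.9,
    "corroboration": 0.7,
    "authority": 0.6,
    "structured_data": 0.8,
    "consistency": 0.8,
    "notability": 0.5,
}
score = entity_trust(sample)  # 0.225 + 0.14 + 0.12 + 0.12 + 0.08 + 0.05
print(f"{score:.3f}", score >= PANEL_THRESHOLD)
```

Because identity carries the largest weight, a one-point improvement there moves the total 2.5 times as far as the same improvement in consistency or notability.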
Panel timelines
| Path | Typical duration |
|---|---|
| Wikidata first, schema added later | 4–9 months |
| DIY approach (40–100 hours of work) | 6–12 months (uncertain) |
| Professional accompanied build | 3–6 months (plannable) |
| Personal/individual panels | 12–24 months typically |
After the EntityTrust threshold is crossed, panels typically emerge within 2–6 weeks — provided there is sufficient query demand for the entity. (Kalicube/Authoritas)
Real-world outcomes
- Vertex Compliance Group: 3-month entity campaign resulted in a secured Knowledge Panel, +182% branded impressions, +64% branded clicks (TopSEOLinks)
- Brightview Senior Living: External entity linking for "assisted living" produced a 25% increase in non-branded clicks (Schema App)
- Backpacker Job Board (Kalicube Pro case study): Brand SERP went from 5/10 to permanent Knowledge Panel with logo within 4 months (Kalicube)
- Schemantra Real Estate Case Study: Entity-first schema implementation (Apartment + RealEstateListing + Product) drove organic traffic up 100%, impressions up 200%, and leads up 100%. (Schemantra)
- Interingilizce.com: Entity-first project achieved 1100% organic traffic increase in 145 days, from 10,000 to 200,000+ monthly visits — without traditional SEO tactics like page speed or brand power. (Oncrawl)
What to avoid
- Paid "guaranteed Knowledge Panel" services — mostly spam
- Coordinated profile creation (20+ identical bios in two weeks triggers detection)
- Purchased Wikipedia articles (violates Wikipedia terms; flagged quickly)
- Panel hacking: a deleted panel is harder to rebuild than one that never appeared and requires comparable authority investment to the original build. (Kalicube/Authoritas)
Entity SEO in the AI Search Era
Entity clarity is now the primary prerequisite for AI citation. The connection is direct: Gemini AI is trained on the Knowledge Graph. Entity establishment drives AI Overview inclusions, Knowledge Panel cards, and AI Mode answers. (Digital Applied)
Key data points:
- 92% of AI Overview citations come from domains already ranking in the top 10 — entity clarity tells Google which top-10 result is authoritative for that query
- Brand mention correlation with AI Overview visibility: 0.664 vs. 0.218 for backlinks (Onely research cited in Digital Applied)
- AI referral traffic grew more than 10× in the US between July 2024 and February 2025 — with AI-referred visitors browsing 12% more pages and showing a 23% lower bounce rate (Adobe)
- LLMs grounded in structured knowledge graphs achieve 300% higher factual accuracy compared to unstructured data alone (Inter-dev)
- AI crawler traffic increased 96% between May 2024 and May 2025; GPTBot's share of all crawler traffic jumped from 5% to 30% (Inter-dev / Search Engine Land)
- ChatGPT now sees over 800 million active users weekly and handles more than 2.5 billion prompts daily (Search Engine Land)
- 79% of prospective students read AI-generated overviews when they appear in search results (iFactory)
- LLM-driven traffic converts at 16% compared to 0.8% for traditional organic traffic — a 20x improvement (Somebody Digital)
- Pages with valid schema markup are 2-4x more likely to appear in AI Overviews (iFactory)
- Entity-optimized content is 50% more likely to appear in featured snippets (iFactory)
For a deeper look at how entities feed AI Overviews and answer engines, see AI Search & AEO.
Structuring content for AI extraction
AI systems extract from content differently than traditional crawlers. Practical adaptations:
- 60-word rule: answer the primary intent within 60 words of the H1
- Monosemantic blocks: 75–225 word sections addressing exactly one concept — cleaner for LLM passage extraction
- Factual grounding tables: structured data above the fold; numbers don't hallucinate
- Key Takeaways at the top: mirrors the news lede format that AI citation systems favour
- Content quality for AI-citability: include real metrics and case studies, challenge conventional wisdom, ensure 3,000+ words with comprehensive coverage, and link to related authority content. (Somebody Digital)
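The word-count guidelines in this list are easy to check mechanically. The sketch below is a rough proxy only: it counts whitespace-separated words, treats "within 60 words of the H1" as "the opening paragraph is at most 60 words", and uses hypothetical function and section names.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def opening_within_limit(opening_paragraph: str, limit: int = 60) -> bool:
    """Proxy for the 60-word rule: is the first paragraph short enough?"""
    return word_count(opening_paragraph) <= limit


def blocks_out_of_range(sections: dict[str, str],
                        low: int = 75, high: int = 225) -> dict[str, int]:
    """Return {heading: word count} for sections outside the 75-225 word band."""
    counts = {heading: word_count(body) for heading, body in sections.items()}
    return {heading: n for heading, n in counts.items() if not low <= n <= high}


sections = {
    "What is an entity home?": "word " * 120,  # placeholder body text
    "Schema basics": "word " * 40,
    "Everything about Wikidata": "word " * 400,
}
print(opening_within_limit("word " * 55))  # True
print(blocks_out_of_range(sections))
# {'Schema basics': 40, 'Everything about Wikidata': 400}
```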
GraphRAG and Model Context Protocol
New architectures like GraphRAG restructure information into knowledge graphs where entities become nodes and relationships become edges, enabling multi-hop reasoning — AI traversing connections to answer layered queries. (iFactory)
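The multi-hop idea can be illustrated with a toy graph and a breadth-first search. This is not GraphRAG itself, only the traversal concept it relies on; the entities, relations, and `find_path` helper are invented for the example.

```python
from collections import deque

# Toy knowledge graph: entity -> list of (relation, target entity).
EDGES = {
    "Jane Doe": [("worksFor", "Example Co")],
    "Example Co": [("locatedIn", "Leeds")],
    "Leeds": [("locatedIn", "United Kingdom")],
}


def find_path(start: str, goal: str):
    """Breadth-first search returning the chain of (source, relation, target) hops."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, target in EDGES.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, path + [(node, relation, target)]))
    return None  # no connection between the two entities


# "Which city does Jane Doe's employer operate from?" needs two hops.
for hop in find_path("Jane Doe", "Leeds"):
    print(hop)
```

Neither hop alone answers the question, which is why explicitly stated relationships between entities matter for layered queries.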
The Model Context Protocol (MCP), described as a "USB-C port for AI applications," provides a standardised way to connect AI models to data sources. OpenAI and Google have adopted MCP. The NLWeb project, created by Schema.org founder RV Guha, aims to simplify natural language interfaces for websites using structured data websites already publish. (Schema App)
12-Step Entity SEO Implementation Plan
- Audit your entity presence — search your brand name; check whether a Knowledge Panel exists; use the Google Knowledge Graph Search API to find your kgmid
- Create or improve your Wikidata entry — minimum 15–20 referenced properties; record the QID
- Implement Organisation or Person schema on the entity home with full sameAs block pointing to Wikidata, Wikipedia (if applicable), LinkedIn, Crunchbase, GitHub
- Standardise profiles across 15–25 authoritative platforms: identical name, title, description, and photo
- Publish your entity home page at a stable canonical URL; ensure it passes TypeScript/HTML validation and loads without JavaScript dependency
- Earn 8–15 pieces of third-party coverage in authoritative publications with consistent entity descriptions
- Build the self-confirming loop: Entity Home → Wikidata → Wikipedia/Crunchbase → back to Entity Home via sameAs and external links
- Audit content entity salience using the Google Cloud NLP API: target ≥ 0.50 salience for your primary entity on key pages; use an entity audit template like the one below
- Adopt Answer-First UI — 60-word answers, monosemantic blocks, factual tables above fold
- Monitor GSC Branded Queries Filter (launched November 2025) — segments branded vs. non-branded traffic using Google's entity-based AI classification
- Track AI citation frequency using Bing Webmaster Tools AI Performance Report and the Gemini Grounding API
- Avoid the multi-typing trap — one schema type per page; use distinct URLs for each intent modifier
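For the first step, looking up a kgmid, a minimal sketch of a Knowledge Graph Search API lookup might look like the following. It only builds the request URL and parses a response; the API key and the sample response are placeholders, the helper names are hypothetical, and fetching the URL is left to whichever HTTP client you prefer.

```python
from urllib.parse import urlencode

KG_ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"


def build_kg_search_url(query: str, api_key: str, limit: int = 5) -> str:
    """Build the GET URL for an entity search."""
    return f"{KG_ENDPOINT}?{urlencode({'query': query, 'key': api_key, 'limit': limit})}"


def extract_kgmids(response: dict) -> list[tuple[str, str, float]]:
    """Return (name, kgmid, resultScore) for each result in a parsed response."""
    rows = []
    for item in response.get("itemListElement", []):
        result = item.get("result", {})
        kgmid = result.get("@id", "").removeprefix("kg:")  # ids arrive as "kg:/m/..."
        rows.append((result.get("name", ""), kgmid, item.get("resultScore", 0.0)))
    return rows


# Hand-written sample response with placeholder values.
sample_response = {
    "itemListElement": [
        {"result": {"@id": "kg:/m/placeholder", "name": "Example Co",
                    "@type": ["Organization", "Thing"]},
         "resultScore": 123.4},
    ]
}

print(build_kg_search_url("Example Co", "YOUR_API_KEY"))
print(extract_kgmids(sample_response))  # [('Example Co', '/m/placeholder', 123.4)]
```

An empty result list for your exact brand name is a reasonable working signal that the entity has not yet been established, though it is not proof.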
Entity Audit Template
| Entity | Current Salience | Competitor Average | Gap | Priority |
|---|---|---|---|---|
| Primary topic entity | 0.45 | 0.72 | -0.27 | High |
| Supporting entity 1 | 0.12 | 0.31 | -0.19 | Medium |
| Missing entity | N/A | 0.28 | -0.28 | High |
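The Gap and Priority columns can be filled in programmatically once you have the salience scores. In the sketch below the priority cut-offs (a gap of -0.25 or worse is High, -0.10 or worse is Medium) are assumptions chosen so that the output matches the sample rows above; the template itself does not define them. A missing entity is treated as having a salience of 0.

```python
from typing import Optional


def gap_row(entity: str, current: Optional[float], competitor_avg: float) -> dict:
    """Compute the salience gap and an assumed priority label for one entity."""
    gap = round((current or 0.0) - competitor_avg, 2)  # missing entity counts as 0
    if gap <= -0.25:
        priority = "High"
    elif gap <= -0.10:
        priority = "Medium"
    else:
        priority = "Low"
    return {"entity": entity, "current": current,
            "competitor_avg": competitor_avg, "gap": gap, "priority": priority}


audit = [
    gap_row("Primary topic entity", 0.45, 0.72),
    gap_row("Supporting entity 1", 0.12, 0.31),
    gap_row("Missing entity", None, 0.28),
]
for row in audit:
    print(f'{row["entity"]:<22} gap={row["gap"]:+.2f}  priority={row["priority"]}')
```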
Frequently asked questions
What is the difference between a Knowledge Graph entity and a Knowledge Panel?
The Knowledge Graph contains all entities Google has indexed — billions of them. A Knowledge Panel is the visible SERP feature that Google displays for some entities when a user searches for them. Having a Knowledge Graph entity is necessary but not sufficient for a panel; you also need query demand and sufficient sourceability. (ReputationX)
Do I need a Wikipedia page to get a Knowledge Panel?
No. Wikipedia strengthens the signal significantly but is not mandatory. A Wikidata entry with 15–20 referenced properties, a complete Person/Organization schema with sameAs, and consistent presence across Crunchbase, LinkedIn, and authoritative industry platforms can be sufficient. One documented case achieved a claimable panel in 6–8 weeks with no Wikipedia entry at all. (Kashif Mukhtar)
How does entity salience affect my rankings?
Indirectly, not directly. Google's John Mueller has confirmed that public NLP salience scores are not internal ranking signals. However, high entity salience on your pages correlates with stronger topical relevance signals, which Google's Helpful Content System and Topic Authority assessments do reward. Use salience as a gap-analysis diagnostic, not a number to optimise in isolation. (Szymon Slowik / Google Developer Forum)
What happened to Knowledge Panels after Google's June 2025 update?
Google's "Great Clarity Cleanup" in June 2025 deleted over 3 billion entities — a 6.26% contraction — targeting ambiguous "Thing"-typed entities and temporary Event entities. Panels tied to poorly sourced or under-typed entities disappeared. The follow-on Graph Foundation Model update in July 2025 caused additional panel losses for some brands. The lesson: entity definitions need unambiguous typing (P31 in Wikidata, precise @type in schema) and multiple corroborating sources. (Search Engine Land)
How long does it realistically take to earn a Knowledge Panel?
After crossing the EntityTrust threshold (empirically ≥ 0.72), panels typically emerge within 2–6 weeks provided sufficient query demand exists. Getting to that threshold takes longer: 3–6 months with professional support, 6–12 months on a self-managed basis, and 12–24 months for individual/personal panels. Building Wikidata first and adding schema later extends the timeline to 4–9 months compared to doing both simultaneously. (Kalicube/Authoritas)
What's new (2026-06-22)
- Knowledge Graph size details: Added launch figure of 570 million entities (May 2012), alternative figures up to 8 billion entities, and Wikidata scale (750M+ statements, 4.9B triples). (Niumatrix, PMC7077981, Malyshev et al.)
- New statistics: 87% of search results feature Knowledge Panels; 58% of searches are zero-click; 83.3% of AI Overview citations from beyond top 10 (not integrated, but noted elsewhere). (Niumatrix, Somebody Digital)
- Schema markup impact: Entity-optimized content is 50% more likely in featured snippets; pages with valid schema are 2-4x more likely in AI Overviews. (iFactory)
- Schema maintenance: Added recommendation for quarterly reviews of schema markup. (ReputationX)
- Wikipedia dependency: Updated to note decreased dependency, with only ~15% of panel descriptions from Wikipedia. (ReputationX)
- Brand signals: Added brand signals from Google's ranking factors (brand anchors, branded searches, etc.). (Optinmark)
- Entity audit template: Added new entity audit template from SearchAtlas. (SearchAtlas)
- First-mention clarity rule: Added technique for explicit entity disambiguation on first mention. (SevenSEO)
- AI search statistics: Added ChatGPT user/prompt data, LLM conversion rates (16% vs 0.8%), and GraphRAG/MCP architectures. (Somebody Digital, iFactory, Schema App)
- Case studies: Added Schemantra real estate case (+100% traffic) and Interingilizce.com entity-first project (1100% growth). (Schemantra, Oncrawl)
- Schema.org scale: Added note that Schema.org has ~1,400 types and 20,000+ properties/classes. (Schemantra)
- Algorithm timeline: Added details for Hummingbird (90% searches), RankBrain (15%), MUM (1000x BERT). (Niumatrix)
Originally published in the EcomExperts SEO library.