I Analyzed 60+ AI Citations - Here's What Actually Gets Cited in 2025

Updated February 1, 2026

The Experiment

I wanted to understand what sources AI platforms trust when answering questions about AI SEO and content optimization. So I ran a controlled experiment using Savannabay's AI comparison platform.

The setup:

  • 4 strategic questions about AI search and content
  • 3 responses per question from GPT-5 with web search enabled, collected via neutral API calls that simulated the real chat settings and environment as closely as possible
  • Total: 12 AI responses analyzed
  • Result: 60+ distinct source citations tracked
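The bookkeeping behind that breakdown is simple enough to sketch. The Python tally below is illustrative only; the sample citations are a small subset of the tracked data, not the full dataset or the actual analysis pipeline:

```python
from collections import Counter

# Illustrative subset of the tracked citations: (domain, category) pairs.
# The real dataset covers 60+ citations across 12 responses.
citations = [
    ("arxiv.org", "academic"), ("arxiv.org", "academic"),
    ("reuters.com", "publisher"), ("theverge.com", "publisher"),
    ("salesforce.com", "seo_platform"), ("schema.org", "seo_platform"),
    ("entasher.com", "niche_blog"), ("lairedigital.com", "niche_blog"),
    ("techmidiasquare.com", "niche_blog"),
]

def category_share(citations):
    """Return each category's share of total citations as a percentage."""
    counts = Counter(cat for _, cat in citations)
    total = sum(counts.values())
    return {cat: round(100 * n / total, 1) for cat, n in counts.items()}

print(category_share(citations))
```

Running this over the full citation list is how the percentages in the breakdown below were produced.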

Here's what I discovered about AI citation patterns (and what it means for content creators).

Update (February 2026): After publishing these initial findings, we ran an expanded validation study analyzing 60 websites across 74 metrics, comparing sites ChatGPT cited versus those it didn't. The patterns held, and they revealed even more specific factors that make content citation-worthy. Read the full 60-website analysis: How to Get ChatGPT to Recommend Your Business in 2026

📊 RECENCY MATTERS

83.3% of cited sites were from 2025, versus 23.3% of random sites: a +257% difference, and the largest single factor identified.

The Discovery That Changed Everything

My first few queries returned zero citations. ChatGPT gave generic advice without sources.

Then I added one word to my query: "2025".

Suddenly, full citations appeared.

  • Without year: "How to get cited by AI search engines" → Zero citations
  • With year: "How to get cited by AI search engines in 2025" → 7+ citations per response

The insight: Temporal specificity triggers citation behavior. AI models interpret year-specific queries as requiring current, sourced information rather than general knowledge.
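That heuristic can be sketched as a simple check for temporal markers in a query or title. The marker list below is my assumption for illustration; GPT-5's actual trigger logic is not documented anywhere:

```python
import re

# Words and patterns that signal "I want current, sourced information".
# This list is an assumption for illustration, not anything GPT-5 publishes.
TEMPORAL_MARKERS = ("latest", "current", "recent")
YEAR_PATTERN = re.compile(r"\b20(2[4-9]|3\d)\b")  # years 2024-2039

def has_temporal_specificity(text: str) -> bool:
    """True if the text contains a recent year or a freshness keyword."""
    lowered = text.lower()
    return bool(YEAR_PATTERN.search(text)) or any(
        marker in lowered for marker in TEMPORAL_MARKERS
    )

print(has_temporal_specificity("How to get cited by AI search engines"))          # False
print(has_temporal_specificity("How to get cited by AI search engines in 2025"))  # True
```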

The Complete Citation Breakdown

Source Type Distribution (60+ citations):

Academic Research: 18%

  • arxiv.org appeared 11+ times
  • Most frequently cited single source
  • Used for theoretical backing and research validation

Major Tech/Business Publishers: 25%

  • Reuters (4x)
  • The Verge (3x)
  • LinkedIn (3x)
  • Le Monde, Financial Times (2x each)
  • Investopedia, Economic Times

SEO/Marketing Platforms: 20%

  • SEO.com (3x)
  • Salesforce Blog (5x)
  • Schema.org (2x)
  • RankTracker (2x)


Niche SEO/AI Blogs: 37%

  • entasher.com, techdevs.in, bloggerspice.com
  • apacheinteractive.com, aimodeboost.com, seoce.ai
  • bestprompt.art, keywordsearch.com, humanizeai.tools
  • abbacustechnologies.com, mediasearchgroup.com
  • Plus 15+ other specialized sites

AI Citation Source Distribution (based on 60+ citations from GPT-5 responses):

  • Niche Blogs: 37% (largest source category)
  • Major Publishers: 25% (Reuters, The Verge, etc.)
  • SEO Platforms: 20% (Salesforce, SEO.com, etc.)
  • Academic Sources: 18% (arxiv.org papers)

Key Finding

Niche blogs dominate AI citations at 37% — more than major publishers and academic sources combined. Small sites with tactical specificity compete directly with established brands.

Pattern #1: Academic Credibility Dominates

arxiv.org appeared more than any other single source (11+ times across 4 different queries).

When GPT-5 needs authoritative backing for claims about AI behavior, algorithm changes, or technical implementations, it defaults to academic research.

Example citation contexts:

  • "Recent work shows phrase-level rewrites measurably improve inclusion in LLM outputs" → arxiv paper
  • "Generative engines demand intent-driven content and structured responses" → arxiv paper
  • "RAG/LLM retrieval systems prefer" → arxiv paper

Why this matters: Publishing research (even preprints on arxiv) dramatically increases your citation likelihood. You don't need peer review; you need structured, data-backed insights.


Our new study validated this pattern: sites with proprietary data and original research are more likely to be recommended, especially by ChatGPT. Even low-authority websites with proprietary data, like robertyoung.consulting, were mentioned. Academic-style rigor, even without formal peer review, signals authority to AI models.

Pattern #2: Salesforce Outranks Major Media

Salesforce Blog appeared 5 times - more than Forbes, Wired, or any major tech publisher.

What Salesforce did right:

  • Practical, tactical guides ("6 tips to get your content surfaced by AI")
  • Clear, numbered frameworks
  • Specific implementation steps
  • Enterprise credibility without academic formality

The insight: Enterprise software companies publishing educational content get cited as heavily as traditional media. Authority comes from usefulness, not just brand recognition.

Pattern #3: Small Sites Punch Above Their Weight

Sites you've probably never heard of appeared alongside Reuters, The Verge, and major tech publishers in citation lists.

Examples from the data:

  • lairedigital.com - cited for specific tactical advice ("Add concise answer paragraphs under each H2 (40–120 words)")
  • techmidiasquare.com - cited for "Mastering AI search ranking in 2025: top expert tactics"
  • butterflai.pro - cited for AI SEO best practices and tools
  • entasher.com - cited twice for "Top 10 Hacks to Rank in AI Search in 2025"
  • bestprompt.art - cited for AI content trends
  • bloggerspice.com - cited for Bing content optimization

What these small sites have in common:

  • Specific, numbered tactical guides with exact metrics
  • Year (2025) in the title
  • Clear H2 structure with actionable steps
  • Deep, focused coverage of one specific angle
  • Concrete examples over generic advice

This is the game-changer: You don't need massive domain authority. Sites like lairedigital.com and techmidiasquare.com got cited alongside The Verge and Reuters because they provided tactical specificity and clear structure that major publishers often skip.

We validated this at scale: In our new 60-website study, we analyzed sites with Domain Rating 0-15 that successfully competed against major publishers. The top performers achieved this through:

  • Originality scores of 77.1% (unique angles + proprietary data + nuanced analysis)
  • Vocabulary density 15-25% higher than non-cited competitors
  • Perfect answer completeness (10/10 scores across the board)
  • Current year publication (83.3% from 2025)

Pattern #4: Reuters = Breaking News Source

Reuters appeared 4 times, always for the same reason: recent platform announcements.

Citation contexts:

  • Meta using AI chats to personalize content (October 2025)
  • Microsoft Edge Copilot mode launch (July 2025)
  • Platform policy changes

The pattern: Major news wires get cited for "what's new"—never for strategy, how-to, or analysis.

What this means: If you're not a news organization, don't try to compete on breaking news. Compete on analysis and implementation.

Pattern #5: The Documentation Citation

Schema.org appeared when discussing structured data implementation. Google Developers docs appeared when referencing official guidelines.

The pattern: Documentation sites get cited when the topic requires technical precision or official standards.

Your opportunity: Create documentation-quality resources for your niche. If you can become the "official unofficial" reference, you own that citation territory.

🎯 THE CITATION FORMULA

Based on analysis of 60 websites across 74 metrics, here's what actually determines citation likelihood:

  • Originality (40%): unique angle + proprietary data + nuanced analysis
  • Demonstrated Expertise (25%): technical depth + insider knowledge
  • Vocabulary Density (20%): rich language + technical terms + acronyms
  • Semantic Structure (10%): clean HTML + proper tags + lists
  • Answer Completeness (5%): direct response + specificity

Critical multiplier: is your content from 2025/2026? If not, your odds drop by roughly 75%.

Note: While domain authority still matters for competitive short-tail queries (DR 75-80+), this formula shows how low-DR sites can compete in mid and long-tail queries where authority requirements drop 12-30%.
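Read as a weighted score, the formula might be sketched like this. The 0-to-1 signal ratings and treating the recency multiplier as a flat 0.25 are my assumptions, not a model the study publishes:

```python
# Factor weights from the citation formula above.
WEIGHTS = {
    "originality": 0.40,
    "demonstrated_expertise": 0.25,
    "vocabulary_density": 0.20,
    "semantic_structure": 0.10,
    "answer_completeness": 0.05,
}

def citation_score(signals: dict, is_current_year: bool) -> float:
    """Weighted citation-likelihood score on a 0-1 scale.

    `signals` rates each factor from 0.0 to 1.0. Content not from the
    current year keeps only ~25% of its score, per the study's claim
    that stale content's odds drop by roughly 75%.
    """
    base = sum(w * signals.get(factor, 0.0) for factor, w in WEIGHTS.items())
    return base if is_current_year else base * 0.25

strong_page = {factor: 0.9 for factor in WEIGHTS}
print(citation_score(strong_page, is_current_year=True))
print(citation_score(strong_page, is_current_year=False))
```

The same content loses three quarters of its score just by being dated: that is the "critical multiplier" in code form.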

Pattern #6: Tools Get Mentioned, Not Cited

Across all responses, GPT-5 mentioned these tools multiple times:

  • Surfer SEO
  • Jasper
  • Clearscope
  • MarketMuse
  • Semrush

Citations to their websites: Zero

What happened: ChatGPT referenced the tools in advice but didn't cite their marketing sites, documentation, or blogs.

The lesson: Being a known tool gets you mentioned in AI responses, but doesn't guarantee citation. Educational content gets cited more than product pages.

If you're a SaaS company, your blog matters more than your homepage for AI visibility.

Pattern #7: Numbered Frameworks Win

The most-cited content followed this formula:

"[Number] [Action/Insight] [Topic] [Year]"

Examples of cited titles:

  • "Top 10 Hacks to Rank in AI Search in 2025"
  • "6 Tips to Get Your Content Surfaced by AI"
  • "5 SEO Updates You Can't Ignore: June 2025"
  • "9 AI Marketing Trends for 2025"

Why this works:

  • Numbered lists signal comprehensive coverage
  • Action words (hacks, tips, updates) promise utility
  • Year signals freshness and specificity
  • Scannable format is retrieval-friendly
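You can lint your own titles against that formula with a loose regex. The pattern below is my approximation of "[Number] [Action/Insight] [Topic] [Year]", not a rule extracted from the data:

```python
import re

# Loose approximation of "[Number] [Action/Insight] [Topic] [Year]":
# an optional "Top", a leading number, some words, and a 20xx year.
TITLE_FORMULA = re.compile(r"^(top\s+)?\d+\s+\w+.*\b20\d{2}\b", re.IGNORECASE)

def matches_citation_formula(title: str) -> bool:
    """True if the title fits the number-plus-year pattern."""
    return bool(TITLE_FORMULA.search(title))

for title in [
    "Top 10 Hacks to Rank in AI Search in 2025",
    "9 AI Marketing Trends for 2025",
    "How to Improve Your SEO",
]:
    print(f"{matches_citation_formula(title)!s:5}  {title}")
```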

When Content Quality Beats Domain Authority

ChatGPT's high variability (±30-36 DR) creates opportunities for exceptional content

🎯 Low-DR Sites That Got Cited

  • lairedigital.com (DR 0, small niche blog): cited for "Add concise answer paragraphs under each H2 (40–120 words)"
  • techmidiasquare.com (DR 0, niche SEO blog): cited for "Mastering AI search ranking in 2025: top expert tactics"
  • butterflai.pro (DR 0, AI optimization specialist): cited for "AI SEO best practices and tools 2025"

📰 The Major Publishers They Competed With

  • Reuters (global news agency, high domain authority): cited for platform announcements and breaking news
  • The Verge (major tech publisher, high authority): cited for tech industry analysis and updates
  • Financial Times (premium business publication, high authority): cited for business and market trends
💡 WHY THIS WORKS

While domain authority still matters for competitive short-tail queries (DR 75-80+), ChatGPT shows the highest citation variability of all platforms (±30-36 DR). For mid and long-tail queries, small sites can compete when they provide tactical specificity, exact metrics, and current data that major publishers often skip. Sites like lairedigital.com offered precise implementation details ("40-120 words") rather than general trends.

What Makes Content Citation-Worthy?

Based on 60+ citations analyzed, here's the definitive checklist:

Must-Have Elements:

1. Temporal Specificity

  • Include the year (2025) in your query/title
  • Signal freshness with "latest," "current," "recent"
  • Date your content clearly

2. Academic or Data Backing

  • Original research or data analysis
  • Citations to studies or papers
  • Quantified claims with sources


3. Tactical Specificity

  • Exact numbers ("40-120 word paragraphs")
  • Specific implementation steps
  • Code examples or templates
  • Named techniques or frameworks

4. Clear Structure

  • H2 headings as questions
  • Numbered lists or steps
  • Bullet points for scannability
  • Tables for comparisons

5. Authoritative Signals

  • Author credentials
  • Enterprise backing or partnerships
  • Links to primary sources
  • Recent publication or update dates

6. Retrieval-Friendly Format

  • Short, declarative lead paragraphs
  • FAQ sections
  • Structured data (schema.org)
  • Concise answer blocks
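For the structured-data item, here is a minimal schema.org FAQPage block generated with Python; the question and answer text are placeholders, not content from the study:

```python
import json

# Minimal schema.org FAQPage markup; the question/answer text below is
# placeholder content for illustration.
faq_schema = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "How do I get cited by AI search engines in 2025?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "Use year-specific titles, numbered tactical guides, "
                        "and concise answer paragraphs under each H2.",
            },
        }
    ],
}

# Embed the result in your page inside
# <script type="application/ld+json"> ... </script>
print(json.dumps(faq_schema, indent=2))
```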

Nice-to-Have Elements:

  • Multiple content formats (text, video, audio)
  • Interactive examples
  • Tool integrations or templates
  • Community validation (comments, shares)

The Source Types That Don't Get Cited

  • Product pages - Zero citations to tool homepages
  • Generic "best practices" posts - Without specifics or data
  • Paywalled content - AI can't access it, so it can't cite it
  • Personal opinion pieces - Without backing data or research
  • Old content - Without recent updates or year markers
  • Thin content - Short posts without depth

What This Means for Different Creator Types

For Bloggers & Content Creators:

Focus on numbered, tactical guides with year specificity. You can compete with major publishers if you go deeper on specific tactics.

Do: "7 Schema Markup Patterns That Get You Cited by AI in 2025"
Don't: "How to Improve Your SEO" (too generic, no year)

For SaaS Companies:

Your blog matters more than your homepage. Create educational content, not just product marketing.

Do: Tactical implementation guides
Don't: "Why Our Tool Is Best" posts

For Agencies:

Original research and data analysis get cited heavily. Publish studies, benchmarks, and frameworks.

Do: "We Analyzed 500 AI Citations - Here's What Works"
Don't: Generic client case studies

For Researchers:

arXiv preprints get cited as much as peer-reviewed papers. Don't wait for publication; share findings early.

Do: Publish working papers on arXiv
Don't: Wait months for peer review before sharing

The Limitations of This Research

This analysis is based on 12 responses to 4 questions from one AI platform (GPT-5). That's enough to spot clear patterns, but not enough to call them universal laws.

What we know:

  • Citation behavior is observable and trackable
  • Patterns exist across different query types
  • Small sites can compete with major publishers
  • Temporal specificity matters significantly


What we don't know:

  • How patterns vary across other AI platforms (Claude, Perplexity, Gemini)
  • Whether these patterns hold for non-SEO topics
  • How citation behavior evolves as AI models update
  • The full ranking algorithm behind source selection

What's next:

  • Expand analysis to other AI platforms
  • Track citation patterns across different industries
  • Monitor how patterns evolve over time

Key Takeaways

  1. Add "2025" to your content titles and queries - It's the citation trigger
  2. Academic backing matters more than domain authority - Link to research, publish data
  3. Small, tactical sites beat generic major publishers - Specificity > Brand
  4. Numbered frameworks are citation magnets - "7 ways" performs better than prose
  5. Enterprise blogs (like Salesforce) outrank news sites - Educational content wins
  6. Tools get mentioned but not cited - Educational content > Product pages
  7. Reuters gets cited for news, arxiv for research - Know your citation category

Actionable Next Steps

This week:

  • Add year (2025) to your top 5 articles
  • Create one numbered tactical guide
  • Add structured data (FAQ schema) to key pages

This month:

  • Publish original data or research
  • Rewrite generic advice into specific tactics
  • Build documentation-quality resources

This quarter:

  • Create your signature framework or methodology
  • Establish thought leadership in one narrow niche
  • Monitor which of your content gets cited

The Real Strategy

AI citation patterns reveal something important: You don't need to be HubSpot to get cited. You need to be the best answer to a specific question.

The sites that got cited weren't gaming algorithms; they were creating genuinely useful, specific, tactical content with clear structure and authoritative backing.

That's the strategy: Own a specific answer that nobody else is answering as clearly, as recently, or as tactically.


About this research: Initial data collected October 2025 using Savannabay's AI comparison platform. Analysis based on GPT-5 responses with full citation tracking. Methodology: 4 queries × 3 responses each = 12 total responses analyzed, with 60+ distinct source citations tracked and categorized by source type, citation context, and content characteristics.

Follow-up validation study (January 2026): We expanded this research with a controlled analysis of 60 websites across 74 metrics, comparing 30 sites ChatGPT cited versus 30 non-cited sites answering identical queries. The patterns identified in this initial study were validated and quantified at scale.

Richard Lowenthal is the founder of Savannabay, co-founder of GoBrunch and Live University, and an AI Search & GEO practitioner.