We analyzed 1,000,000+ links from AI responses…
This study explores how modern generative AI systems cite sources in response to realistic user prompts. Our objective was to quantify and characterize the nature of AI-generated citations across different use cases and vendor models. This includes their frequency, source types, and the prominence of earned and owned media.
To accomplish this, we constructed a large, diverse prompt set and executed it across several web-enabled language models, followed by systematic analysis of the responses and the cited links. The prompts span a variety of industries and subject matter. Some prompts mention companies by name; others do not.
Gemini, Perplexity, Claude, and ChatGPT were used to execute the queries between July and December 2025. Generative AI systems are rapidly evolving and inherently opaque. The behaviors observed in this study may shift as models are updated or retrained.
Sources:
- MuckRack-GenerativePulse2025-1.pdf (archived)
- https://generativepulse.ai/report
Related: Which journalists and news outlets are most cited by AI answer engines? [web-archive]
51% of answers which contain a Reddit citation also cite a journalistic source
Also known as
49% of answers which contain a Reddit citation paraphrase what a single person on the internet is claiming
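To make the arithmetic behind that figure concrete, here is a minimal Python sketch of how a co-citation share like this could be computed from a list of answers and their cited URLs. It is not the study's method: the `JOURNALISTIC_DOMAINS` set, the helper names, and the toy answers are all hypothetical, made up for illustration. The complement of the share is the fraction of Reddit-citing answers with no journalistic citation at all.

```python
from urllib.parse import urlparse

# Hypothetical allow-list of journalistic domains (an assumption for this
# sketch; the study's actual source classification is not described here).
JOURNALISTIC_DOMAINS = {"reuters.com", "apnews.com"}


def domain(url: str) -> str:
    """Return the hostname of a URL with a leading 'www.' stripped."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def is_reddit(url: str) -> bool:
    """True for reddit.com and any of its subdomains."""
    d = domain(url)
    return d == "reddit.com" or d.endswith(".reddit.com")


def reddit_cocitation_share(answers):
    """Fraction of Reddit-citing answers that also cite a journalistic domain.

    `answers` is a list of answers, each a list of cited URLs.
    Returns None when no answer cites Reddit.
    """
    reddit_answers = [urls for urls in answers if any(is_reddit(u) for u in urls)]
    if not reddit_answers:
        return None
    both = sum(
        1
        for urls in reddit_answers
        if any(domain(u) in JOURNALISTIC_DOMAINS for u in urls)
    )
    return both / len(reddit_answers)


# Toy data, invented for illustration only.
toy_answers = [
    ["https://www.reddit.com/r/x/comments/1", "https://www.reuters.com/a"],
    ["https://old.reddit.com/r/y/comments/2"],
    ["https://apnews.com/b"],
    ["https://www.reddit.com/r/z/comments/3", "https://example.com/blog"],
]

share = reddit_cocitation_share(toy_answers)
print(f"Reddit answers that also cite journalism: {share:.0%}")
print(f"Reddit answers with no journalistic citation: {1 - share:.0%}")
```

The denominator is only the answers that cite Reddit, so answers without a Reddit link (the third toy answer) do not affect the share.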



