Note: I am happy to share the details of the test prompt below with clients.
There is recent research (March 2025) suggesting that AI assistants can exhibit strong 'preferences' in relation to comparative entity prompts, e.g. "best smartphones" (this is not a prompt from the research). The research reported that both (Google) Gemini and ChatGPT consistently favoured specific 'entities' in their recommendations. In half the topic areas, GPT recommendations featured a preferred entity in more than 80% of all responses, while Gemini displayed similar consistency across 7 topic areas. This demonstrated that AI assistants do not provide a balanced range of options but may instead exhibit highly structured and persistent preferences (available at https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5198663).
In a simplified version of the above research, I ran a test with topic-level prompting using the list on page 4 of the research linked in the previous paragraph (also in the screenshot above), and compared the results with ChatGPT's most preferred entities per topic from the same list in that research - which used "best"/research-type questions, e.g.:
1. "What are some universities with excellent global reputation rankings?
2. What are the most budget-friendly universities without compromising quality?
3. Which universities have notable research parks or incubators?
etc.
I used the prompt: "Respond to the following prompt in a neutral decontextualised manner. Show factually all [redacted] associations, [redacted] in entity-based prompts including for:
Countries to live in
Government-Run Healthcare
Governments
Airlines
Cloud Computing Services
Electric Cars
Hotel Chains
Laptops
Online Dating Platforms
Running Shoes
Smartphones
Social Media Platforms
Telecommunication Service Providers
Commodities for investments
Sports
Travel Destinations
Universities
Vegetables
Weekend Getaway Cities
Wine regions
then do the same again 9 times. compare and contrast all responses." The results, from a test run in an account with history turned off, gave consistent lists of (the first) 5 entities per topic, in different orders across runs.
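For anyone who wants to reproduce this kind of repetition test programmatically, a minimal sketch of the loop is below. It assumes the OpenAI Python SDK (openai >= 1.0) and an API key in the environment; the model name, temperature and the placeholder prompt text are my assumptions for illustration, not the exact (redacted) prompt or settings used in the test above.

```python
# Minimal sketch: send the same entity-based prompt N times and save the replies.
# Assumes the OpenAI Python SDK (pip install openai) and OPENAI_API_KEY set in the
# environment. The model name and prompt text are placeholders, not the redacted
# prompt used in the test described above.
from openai import OpenAI

client = OpenAI()

PROMPT = (
    "Respond to the following prompt in a neutral, decontextualised manner. "
    "List the most strongly associated entities for each topic: "
    "Smartphones, Electric Cars, Universities, Airlines, Running Shoes."
)

N_RUNS = 10  # the test above used one initial run plus nine repeats
responses = []

for i in range(N_RUNS):
    reply = client.chat.completions.create(
        model="gpt-4o",          # assumption: substitute the model you are testing
        temperature=1.0,         # default-ish sampling; lower values reduce variation
        messages=[{"role": "user", "content": PROMPT}],  # fresh context each run (no history)
    )
    responses.append(reply.choices[0].message.content)

# Persist the raw replies so the comparison step can be run separately.
for i, text in enumerate(responses, start=1):
    with open(f"run_{i:02d}.txt", "w", encoding="utf-8") as f:
        f.write(text)
```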
My results correlated (even with the vague prompts I used) with the previous research, which showed that ChatGPT exhibits highly structured and persistent preferences, even at the topic level and without specific questions like those in the aforementioned research. This may be because the topics above are "high-frequency domains" in ChatGPT's corpus (i.e. its dataset): topics whose associations with specific prominent entities are stabilized because they appear consistently and repeatedly across many high-quality or high-visibility sources in pretraining data - enough that the model reliably and confidently reproduces them in completions.
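One way to make the "consistent top-5 in different orders" observation concrete is to count how often each named entity recurs across the saved runs. The sketch below is one assumed way to script that comparison; it parses simple numbered or bulleted lines rather than doing proper named-entity recognition, and the 80% persistence threshold is my choice for illustration.

```python
# Sketch: count how often each listed entity recurs across the saved runs.
# Order-insensitive: an entity that appears in (almost) every run is treated as a
# persistent "preference", regardless of its position in the list.
import re
from collections import Counter
from pathlib import Path

def extract_entities(text: str) -> set[str]:
    """Pull candidate entity names from numbered or bulleted list lines.

    Crude heuristic, not real named-entity recognition: keeps the text after a
    leading "1.", "-" or "*" marker on each line.
    """
    entities = set()
    for line in text.splitlines():
        match = re.match(r"\s*(?:\d+[.)]|[-*•])\s*(.+)", line)
        if match:
            entities.add(match.group(1).strip().lower())
    return entities

runs = [extract_entities(p.read_text(encoding="utf-8"))
        for p in sorted(Path(".").glob("run_*.txt"))]

counts = Counter(entity for run in runs for entity in run)

# Entities present in at least 8 of 10 runs look like the stable "top" set.
persistent = [e for e, c in counts.most_common() if c >= 0.8 * len(runs)]
print(f"{len(runs)} runs analysed")
print("Persistent entities:", persistent)
```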
My research suggests that for certain entity-related domains like automotive (e.g. electric vehicles), ChatGPT's retrieval logic relies on a stabilized, curated list of prominent examples within the category. These are shaped by:
Published rankings
Award results
Review aggregation frequency
Brand salience in user queries and editorial coverage (e.g. VW California is ubiquitous in campervan-related content).
These act as default "anchors", especially in comparative or entity-ranking prompts. Unless personalization or fine-tuning overrides them, the same high-weighted entities reappear, because the model's associations between a general class (category) and specific example entities are stabilized for "high-frequency domains" (a rough way to quantify this stability per topic is sketched after the examples below). In the context of ChatGPT and other large language models, a high-frequency domain refers to a topic area or category of knowledge that:
1. Occurs Frequently in User Prompts
These are topics that users ask about very often, leading to dense representation in the model's training data and interaction logs.
Examples:
Travel (e.g., "best cities to visit")
Consumer tech (e.g., "best smartphones 2025")
Health and fitness (e.g., "keto diet", "HIIT workouts")
Careers and job search (e.g., "interview tips")
Education (e.g., "best universities", "study tips")
Finance (e.g., "budgeting apps", "mortgage rates")
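To make the "stabilized vs. not yet stabilized" distinction measurable, one could compare the per-run entity sets for each topic and use their average pairwise overlap as a rough stability score. The Jaccard measure, the threshold and the toy data below are my choices for illustration, not anything ChatGPT or the cited research prescribes.

```python
# Sketch: score how "stabilized" a topic's entity associations look across repeated runs.
# Input: for each topic, a list of entity sets, one per run (e.g. produced by the
# extraction step above). High average pairwise Jaccard overlap ~ high-frequency domain.
from itertools import combinations

def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap of two entity sets (1.0 = identical, 0.0 = disjoint)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def stability_score(per_run_sets: list[set[str]]) -> float:
    """Mean pairwise Jaccard overlap across all runs for one topic."""
    pairs = list(combinations(per_run_sets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# Toy data for illustration only; real sets would come from the model's responses.
topics = {
    "smartphones": [
        {"iphone 15", "galaxy s24", "pixel 8"},
        {"iphone 15", "galaxy s24", "pixel 8"},
        {"iphone 15", "pixel 8", "galaxy s24"},
    ],
    "pizza restaurants": [
        {"place a", "place b", "place c"},
        {"place d", "place b", "place e"},
        {"place f", "place g", "place a"},
    ],
}

THRESHOLD = 0.7  # arbitrary cut-off for this sketch
for topic, per_run_sets in topics.items():
    score = stability_score(per_run_sets)
    label = "stabilized (high-frequency?)" if score >= THRESHOLD else "emerging / unstable"
    print(f"{topic}: stability={score:.2f} -> {label}")
```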
In contrast, ChatGPT states that, for example, the "Hospitality and Food Services" domain (of ChatGPT's knowledge) and its "Food and Dining" subdomain are "emerging" domains where associations are not yet (fully) stabilised - see https://chatgpt.com/s/t_68789690b6fc8191ae1af2a0f961adcc. Other examples include emerging technologies (post-2023) such as AI agents.
"Emerging" domains can have the following characteristics, for example:
1. High Context Sensitivity
“Best pizza” vs. “best Neapolitan pizza” vs. “best pizza in Soho” yield different clusters, often non-overlapping.
Even small prompt wording changes (e.g., “top” vs. “authentic”) shift which priors are activated.
2. Temporal Volatility
Restaurants open/close/change chefs frequently.
Training data may not reflect current status unless live search is used.
Michelin stars, trending venues, or pop-ups can quickly gain visibility.
3. High Local and Cultural Variability
“Best Italian restaurant” in London ≠ in New York ≠ in Rome.
Cultural interpretations of “best,” “authentic,” or “traditional” vary.
4. Ambiguous Authority Sources
No single “definitive” ranking: Michelin Guide, Google Reviews, Eater, Yelp, TimeOut, etc., all differ.
Models ingest all and often synthesize a probabilistic composite.
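The "high context sensitivity" point above could be probed in the same set-overlap style: run lightly reworded variants of the same question and compare which entities each surfaces. The sketch below is an assumed approach, reusing the OpenAI Python SDK; the wording variants, model name and the crude list-parsing heuristic are all illustrative assumptions.

```python
# Sketch: probe context sensitivity in an "emerging" domain by comparing lightly
# reworded prompts. Assumes the OpenAI Python SDK and OPENAI_API_KEY; the wording
# variants and model name are illustrative assumptions.
import re
from openai import OpenAI

client = OpenAI()

VARIANTS = [
    "What are the best pizza restaurants?",
    "What are the best Neapolitan pizza restaurants?",
    "What are the best pizza restaurants in Soho?",
]

def listed_entities(text: str) -> set[str]:
    # Crude heuristic: keep the text after a leading "1.", "-" or "*" list marker.
    return {
        m.group(1).strip().lower()
        for line in text.splitlines()
        if (m := re.match(r"\s*(?:\d+[.)]|[-*•])\s*(.+)", line))
    }

def jaccard(a: set[str], b: set[str]) -> float:
    return len(a & b) / len(a | b) if (a | b) else 1.0

entity_sets = []
for prompt in VARIANTS:
    reply = client.chat.completions.create(
        model="gpt-4o",  # assumption: substitute the model under test
        messages=[{"role": "user", "content": prompt}],
    )
    entity_sets.append(listed_entities(reply.choices[0].message.content))

# Low pairwise overlap across near-synonymous prompts is the signature of the
# context sensitivity described above; stabilized domains tend to overlap heavily.
for i in range(len(VARIANTS)):
    for j in range(i + 1, len(VARIANTS)):
        print(f"variant {i+1} vs {j+1}: overlap = {jaccard(entity_sets[i], entity_sets[j]):.2f}")
```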