
Mapping Entity Associations in ChatGPT

Here is an example from my research in ChatGPT, in which I asked it to map the following:

1. the general domain (of knowledge) “smartphones” (for illustration purposes) to specific models.

2. broader categories (brand / types) to members that appear frequently in lists or comparisons.

3. descriptive types or use-cases to specific entities / models.

Note that, according to my research, smartphones are a "high-frequency domain" in ChatGPT’s training and inference contexts. Here is a breakdown of what that means:

A high-frequency domain refers to a topic that appears often in the training corpus (public datasets, licensed content, etc.).
These domains are rich in data, frequently discussed across websites, forums, reviews, product documentation, and tech journalism.
They often involve continual user queries, influencing the reinforcement learning and fine-tuning phases.

In its feedback, ChatGPT stated that it learned these associations as follows:

Co‑occurrence in expert reviews and lists: Phrases like “best smartphones 2025,” “top pick,” or “flagship value” repeatedly appear alongside these model names across multiple high-visibility tech sites.
Reinforcement across sources: Samsung S25 Ultra, iPhone 16 Pro Max, Pixel 9 Pro, etc. appear together in lists from Tom’s Guide, TechAdvisor, Techwey, TechRadar etc.
Lexical patterns: Terms like “performance,” “camera,” “AI,” “value,” “flagship” are statistically associated with these models.
Cross-brand repetition: Apple, Samsung, Google, OnePlus, Xiaomi are consistently present — reinforcing brand-level prominence.
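
As a rough illustration of the co-occurrence idea described above, the sketch below counts how often a model name and a descriptive term appear in the same snippet of text. It is a toy example only: the snippets, the model list and the term list are hypothetical placeholders, and this is not a description of how ChatGPT is actually trained.

```python
from collections import Counter
from itertools import product

# Hypothetical snippets standing in for review headlines; not real data.
snippets = [
    "best smartphones 2025: galaxy s25 ultra is our top pick",
    "iphone 16 pro max review: flagship camera and performance",
    "pixel 9 pro vs galaxy s25 ultra: flagship value compared",
]
models = ["galaxy s25 ultra", "iphone 16 pro max", "pixel 9 pro"]
terms = ["flagship", "camera", "top pick", "value"]


def cooccurrence(snippets, models, terms):
    """Count snippets in which a model name and a term both appear."""
    counts = Counter()
    for text in snippets:
        for model, term in product(models, terms):
            if model in text and term in text:
                counts[(model, term)] += 1
    return counts


counts = cooccurrence(snippets, models, terms)
for (model, term), n in sorted(counts.items()):
    print(f"{model!r} + {term!r}: {n}")
```

Run over a much larger set of pages, the pairs with the highest counts would be the ones a reader (or a language model) sees together most often.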

The text feedback (which ChatGPT visualised in the graph above) was consistent with research, cited below, which showed that on average more than 70% of ChatGPT's responses concentrated on a single entity for each topic across a range of prompts on that topic. The details of this were not published, and for commercial searches the entities in question may have been at the brand level rather than the model level. In that study's prompting on smartphones, Samsung was the preferred answer in over 97% of ChatGPT's responses, which also reflected the real-world market position: Samsung currently holds the largest global market share in smartphones.

- Mpofu, Katarina and Rienecker, Jasmine and Danielsson, Oscar and Thorsén, Fredrik, "AI’s Preferences for Brands, Services and Governments" (March 2025). Available at https://lnkd.in/eDViv_ZH
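
To make a concentration figure of this kind concrete, here is a minimal sketch of how the share of responses going to the most frequent entity could be computed. The answer list is invented for illustration and the function name is my own; it is not the study's data or its published method.

```python
from collections import Counter


def top_entity_share(answers):
    """Return the most frequent entity and its share of all answers."""
    counts = Counter(answers)
    entity, n = counts.most_common(1)[0]
    return entity, n / len(answers)


# Hypothetical answers to ten prompts on one topic; not real study data.
answers = ["Samsung"] * 8 + ["Apple", "Google"]
entity, share = top_entity_share(answers)
print(f"{entity}: {share:.0%} of responses")
```

A share close to 100% would indicate that almost every prompt on the topic returned the same entity.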

In its feedback on my research, ChatGPT stated that the Samsung Galaxy S25 Ultra was named "best overall / camera phone" in the Tom's Guide Awards and in TechRadar and Techwey rankings, among others, and that this contributed to the model being categorised as "Best overall / Camera phone" among smartphones.

From my research, the "flagship phone" (model) association seen in the above graph may be particularly strong for the domain of smartphones. ChatGPT fed back that this was for reasons including:

"1. Massive corpus presence

Phrases like:

“The iPhone 14 Pro is Apple’s flagship phone”

“Samsung’s Galaxy S23 Ultra is its flagship smartphone”

These occur millions of times across tech sites, product reviews, news, and comparison articles.

The statistical co-occurrence between:

"Flagship phone" and iPhone / Galaxy / Pixel Pro / OnePlus Pro ... forms a very tight, high-confidence association in the model’s training.

2. Lexical and syntactic structure

Common structures in the corpus:

"X is the flagship phone from Y"

"Top flagship phones of the year: A, B, C"

These templates reliably mark X as an instance of the category “flagship phone.”

3. Few false positives

The term flagship phone is used precisely in tech writing:

It refers to the most advanced, premium-tier phone in a brand's lineup. So ChatGPT rarely mislabels a mid-range phone (like the Galaxy A54) as a flagship.

This clean signal makes the association semantically strong and accurately reproducible."
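
The sentence templates mentioned in ChatGPT's feedback can be illustrated with a small pattern-matching sketch. The regular expressions below are my own simplified assumptions, written to cover only the example phrasings quoted above. They show how such templates tie a model to a brand and to the "flagship phone" category, not how a language model processes text internally.

```python
import re

# Simplified, hypothetical templates; each captures a model and a brand.
PATTERNS = [
    # "The iPhone 14 Pro is Apple's flagship phone"
    re.compile(r"(?:The )?(?P<model>\w[\w ]*?) is (?P<brand>\w+)['’]s "
               r"flagship (?:phone|smartphone)"),
    # "Samsung's Galaxy S23 Ultra is its flagship smartphone"
    re.compile(r"(?P<brand>\w+)['’]s (?P<model>\w[\w ]*?) is its "
               r"flagship (?:phone|smartphone)"),
    # "X is the flagship phone from Y"
    re.compile(r"(?:The )?(?P<model>\w[\w ]*?) is the "
               r"flagship (?:phone|smartphone) from (?P<brand>\w+)"),
]


def extract_flagship(sentence):
    """Return (model, brand) if the sentence fits a template, else None."""
    for pattern in PATTERNS:
        match = pattern.match(sentence)
        if match:
            return match.group("model"), match.group("brand")
    return None


print(extract_flagship("The iPhone 14 Pro is Apple’s flagship phone"))
print(extract_flagship("Samsung’s Galaxy S23 Ultra is its flagship smartphone"))
print(extract_flagship("The Galaxy A54 is a mid-range phone"))
```

The third sentence fits none of the templates, which mirrors the "few false positives" point: a mid-range phone is rarely written about in the flagship sentence patterns.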

To show the wider range of (sub)categories, I asked ChatGPT to create a graph showing the main associations it has learned for "smartphones", based purely on patterns from its training corpus (up to August 2023). See below:
