Reverse Engineering the Response to "best smartphones" in ChatGPT 5.1

Screenshot: ChatGPT 5.1 response to "best smartphones"

By Chris Byrne (29.11.25)

This is my attempt to reverse engineer a response to a vague comparative entity prompt ("best smartphones") in ChatGPT 5.1 using the tool itself. I chose this prompt to trigger "Retrieval Augmented Generation" (RAG), where instead of relying solely on its pre-trained knowledge, the model retrieves relevant information from connected data sources. What follows is an interpretive, high-level account given by ChatGPT itself, not the model's internal data.

In the USA, for example, there are many thousands of "best smartphones"-type searches per month on Google, according to data from Google Keyword Planner. One might assume that LLMs are used in a similar way to compare products.

See the original response to my prompt “Best Smartphones” below. Note that I've broken some of the hyperlinks:

“Best smartphones

Samsung Galaxy S25 Ultra — flagship Android with top-tier cameras, display, and performance; widely listed as a 2025 top pick. Linked Citations:

https://www.techradar .com/uk/news/best-phone

https://www.droid-life .com/2025/02/11/galaxy-s25-ultra-review

Google Pixel 10 Pro / Pixel 10 Pro XL — leading Android for on-device AI features and a strong camera system in 2025 roundups. Linked Citations:

https://www.techradar .com/best/best-android-phones

https://www.androidfaithful .com/pixel-10-pro-review-a-great-phone-you-can-probably-skip/?utm_source=chatgpt.com

Apple iPhone 17 / iPhone 17 Pro Max — best iOS options for many buyers in 2025; strong camera and system integration. Linked Citations:

https://www.techradar .com/uk/news/best-phone

https://www.techradar .com/news/best-cameraphone

OnePlus 13 — high value flagship contender frequently recommended in 2025 reviews for performance and price balance.

Linked Citation: https://www.coolsmartphone .com/2025/01/14/oneplus-13-review

Value / Budget picks (examples) — Motorola Moto G55, OnePlus Nord CE 5, CMF Phone 2 Pro and Samsung A-series appear frequently in “best budget” lists.

Linked Citations:

https://www.expertreviews .co.uk/technology/phones/best-budget-smartphones

https://www.techadvisor .com/article/724318/best-phone.html”

ChatGPT stated that this response was generated as follows, with the full data pipeline (including RAG) mapped to seven stages.

The description below does not reveal internal token-level chain-of-thought. ChatGPT stated that it is a transparent, factual mapping of system stages and the retrieval evidence used.

1) Prompt Analysis & Intent Detection

Input: "best smartphones" (no context).

Interpretation chosen: broad, time-sensitive request for the current top phones across common buyer buckets (flagship, camera, value, budget). Because the topic is time-sensitive, ChatGPT treated it as requiring up-to-date web evidence.

2) Embedding retrieval & semantic expansion (RAG)

ChatGPT stated it expanded the short prompt into a set of targeted search queries to retrieve reviews, roundups, and model pages. According to ChatGPT, these are the exact search queries executed:

"best smartphones 2025 roundup"

"Samsung Galaxy S25 Ultra review 2025"

"Google Pixel 10 Pro review 2025"

"Google Pixel 10 Pro XL review 2025"

"iPhone 17 review 2025"

"iPhone 17 Pro Max camera review 2025"

"OnePlus 13 review 2025"

"best camera phone 2025 review"

"best budget phone 2025 UK"

"best phones TechRadar 2025" (this website's publisher, Future, is an OpenAI content partner: https://openai .com/index/openai-and-future-partner-on-specialist-content/)

"best phones The Guardian 2025"

"best phones TechAdvisor 2025"

"best budget phones 2025 list"

"best Android phones 2025"

"best phone cameras 2025 Telegraph"

(These queries were used to fetch the list of review and roundup pages that formed the factual basis of the recommendations.)
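The query list above follows a few visible patterns: a general roundup query, one review query per model, and one query per publisher. A minimal sketch of that kind of expansion is shown here. It is only an illustration of the pattern, not ChatGPT's actual mechanism; the function name `expand_prompt` and its parameters are hypothetical, and the model and publisher names are assumed to be known already (for example from training knowledge or a first roundup search).

```python
def expand_prompt(prompt, year, models, publishers):
    """Expand a short prompt into roundup, per-model and per-publisher queries.

    Hypothetical helper: the templates mirror the query patterns listed
    in the article, not any documented ChatGPT behaviour.
    """
    queries = [f"{prompt} {year} roundup"]
    queries += [f"{model} review {year}" for model in models]
    queries += [f"best phones {pub} {year}" for pub in publishers]
    # Drop duplicates while keeping the original order.
    return list(dict.fromkeys(queries))


queries = expand_prompt(
    "best smartphones",
    2025,
    models=["Samsung Galaxy S25 Ultra", "Google Pixel 10 Pro", "OnePlus 13"],
    publishers=["TechRadar", "The Guardian", "TechAdvisor"],
)
for q in queries:
    print(q)
```

Templates like these would explain why the reported queries look so regular, but the real expansion step may be far less rigid.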

3) Feature Selection for Comparison

ChatGPT stated that, from the retrieved sources, it extracted the standard reviewer attributes used to judge “best” phones:

- Performance (SoC, RAM)

- Display (size, refresh rate, peak brightness)

- Camera (hardware, zoom, low-light performance, image processing)

- Battery & charging (mAh, longevity, charging speed)

- Software & updates (OS, AI features, update policy)

- Price / value ratio

- Use-case fit (e.g., gaming, photography, battery endurance)

These attributes were applied differently for each use-case bucket (e.g., camera weight higher for “best camera”).
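To make the idea of bucket-dependent weighting concrete, here is a small sketch. ChatGPT described a qualitative process and gave no numbers, so every weight and rating below is invented purely for illustration, and the names `BUCKET_WEIGHTS` and `bucket_score` are hypothetical.

```python
# Hypothetical weights per use-case bucket (each row sums to 1.0).
# The source gives no numeric weights; these are illustrative only.
BUCKET_WEIGHTS = {
    "flagship": {"performance": 0.25, "display": 0.20, "camera": 0.20,
                 "battery": 0.15, "software": 0.10, "value": 0.10},
    "camera":   {"performance": 0.10, "display": 0.10, "camera": 0.50,
                 "battery": 0.10, "software": 0.10, "value": 0.10},
    "budget":   {"performance": 0.15, "display": 0.10, "camera": 0.15,
                 "battery": 0.15, "software": 0.05, "value": 0.40},
}


def bucket_score(ratings, bucket):
    """Weighted sum of 0-10 attribute ratings for the given bucket."""
    weights = BUCKET_WEIGHTS[bucket]
    return sum(w * ratings.get(attr, 0.0) for attr, w in weights.items())


# Two made-up phones: an all-rounder and a camera specialist.
phone_a = {"performance": 9, "display": 9, "camera": 7,
           "battery": 8, "software": 8, "value": 6}
phone_b = {"performance": 7, "display": 7, "camera": 10,
           "battery": 7, "software": 8, "value": 6}

for bucket in ("flagship", "camera"):
    print(bucket, bucket_score(phone_a, bucket), bucket_score(phone_b, bucket))
```

With these made-up numbers the all-rounder wins the "flagship" bucket while the camera specialist wins the "camera" bucket, which is the effect the sentence above describes.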

4) Knowledge base integration

ChatGPT stated it combined:

- Model training knowledge about historical brand tendencies and technical tradeoffs (static background).

- Fresh web retrievals (the documents found by the search queries above) for current 2025 model names, reviewer verdicts, and pricing/availability context.

All internet-derived factual claims in the picks above are supported by the sources cited above (there were additional regional and retailer pages consulted for price/context).

5) Ranking & scoring

Method used: "soft, human-like RAG ranking". ChatGPT states this is a "flexible consolidation heuristic" that the model uses after retrieval, where items are ranked by contextual consensus, salience, and qualitative evidence strength rather than by a fixed numeric metric or deterministic algorithm. This appears to differ from previous versions of ChatGPT.

Consensus elevation: models that repeatedly appear near the top of multiple up-to-date roundups/reviews (TechRadar, TechAdvisor, The Guardian, etc.) are elevated to flagship/all-round picks.

Use-case slotting: phones that are repeatedly recommended as “best camera”, “best value”, or “best budget” are placed into those buckets even if they aren’t top overall.

No single numeric score published: ChatGPT used a qualitative, evidence-weighted decision rather than a single aggregated numeric index, because different users weight attributes differently.
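"Consensus elevation" can be approximated, very crudely, by counting how many independent roundups list each model. The sketch below does exactly that. It is a deliberate simplification: ChatGPT described a soft, qualitative heuristic, not a counting rule, and the roundup contents here are invented sample data rather than the lists actually retrieved. The function name `consensus_picks` and the `min_sources` threshold are hypothetical.

```python
from collections import Counter


def consensus_picks(roundups, min_sources=2):
    """Return models listed by at least `min_sources` roundups.

    Results are ordered by number of sources (descending), then by name.
    """
    counts = Counter()
    for models in roundups.values():
        counts.update(set(models))  # each source counts once per model
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [model for model, n in ranked if n >= min_sources]


# Invented sample data for illustration only.
roundups = {
    "TechRadar": ["Galaxy S25 Ultra", "iPhone 17 Pro Max", "Pixel 10 Pro"],
    "TechAdvisor": ["Galaxy S25 Ultra", "Pixel 10 Pro", "OnePlus 13"],
    "The Guardian": ["iPhone 17 Pro Max", "Galaxy S25 Ultra"],
}

picks = consensus_picks(roundups)
print(picks)
```

A model that appears in only one roundup (OnePlus 13 in this sample) is not elevated by the counting rule, although under use-case slotting it could still be placed in a bucket such as "best value".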

6) Natural language generation

Generated concise bullets: phone name + short rationale (1–2 lines).
Each factual claim (e.g., “widely listed as a 2025 top pick”) was backed by one or more citations from the retrieved sources.

7) Output (response) assembly & post-processing

Formatted as a decontextualised recommendation list (no user-specific customisation).
Ensured key factual statements have citations to the web sources used.
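Stages 6 and 7 together amount to rendering each pick as a short bullet and checking that it carries at least one citation. A minimal sketch of that assembly step follows, assuming a simple in-memory representation. The `Pick` class, the `render` function, and the example.com URL are all hypothetical and stand in for whatever ChatGPT does internally.

```python
from dataclasses import dataclass, field


@dataclass
class Pick:
    """One recommended phone with its rationale and supporting URLs."""
    name: str
    rationale: str
    citations: list = field(default_factory=list)


def render(picks):
    """Render picks as plain-text bullets, refusing any uncited pick."""
    lines = []
    for pick in picks:
        if not pick.citations:
            raise ValueError(f"no citation for {pick.name}")
        lines.append(f"{pick.name}: {pick.rationale}")
        lines.extend(f"  {url}" for url in pick.citations)
    return "\n".join(lines)


print(render([
    Pick("Phone A", "strong camera", ["https://example.com/review"]),
]))
```

Raising an error for an uncited pick is one way to enforce the "key factual statements have citations" rule; a real system might instead drop the claim or soften its wording.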
