OpenAI has accused Chinese AI startup DeepSeek of potentially using its proprietary models to train DeepSeek's own AI systems. OpenAI and Microsoft are investigating whether DeepSeek used OpenAI's API to incorporate OpenAI's models into its own offerings, potentially violating OpenAI's terms of service.
Evidence of Distillation: OpenAI claims to have found indications that DeepSeek employed a technique called "distillation" (*) to train its AI models using OpenAI's technology.
OpenAI has evidence that its models helped train China’s DeepSeek - The Verge 29.01.2025
OpenAI and Microsoft are investigating whether the Chinese rival used OpenAI’s API to integrate OpenAI’s AI models into DeepSeek’s own models, according to Bloomberg. The outlet’s sources said Microsoft security researchers detected that large amounts of data were being exfiltrated through OpenAI developer accounts in late 2024, which the company believes are affiliated with DeepSeek.
OpenAI told the Financial Times that it found evidence linking DeepSeek to the use of distillation (*), a common technique developers use to train AI models by extracting data from larger, more capable ones.
President Donald Trump’s artificial intelligence czar David Sacks said “it is possible” that IP theft had occurred. “There’s substantial evidence that what DeepSeek did here is they distilled knowledge out of OpenAI models and I don’t think OpenAI is very happy about this,” Sacks told Fox News on Tuesday.
OpenAI says DeepSeek may have used its AI outputs 'inappropriately' to train new model - Business Insider Jan 29, 2025
Data Exfiltration: Microsoft's security researchers reportedly detected large amounts of data being exfiltrated through OpenAI developer accounts in late 2024, activity suspected to be linked to DeepSeek.
Cost Efficiency: DeepSeek has developed top-performing AI models using less-advanced chips at a fraction of the cost of US rivals such as OpenAI, Google, and Meta.
Terms of Service Violation: While developers may use OpenAI's API to integrate its capabilities into their applications, using the outputs to build competing models violates OpenAI's terms of service.
Government Involvement: OpenAI has stated it will work closely with the US government to protect advanced AI models from exploitation by adversaries and competitors.
Market Impact: News of DeepSeek's affordable yet powerful AI model led to a decline in US tech company stock prices.
(*) Distillation, according to IBM, refers to a machine-learning technique in which the learning of a large pre-trained "teacher model" is transferred to a smaller "student model."
Model distillation in AI training is a technique that transfers knowledge from a large, complex model (the "teacher") to a smaller, more efficient model (the "student"). This process aims to create a compact model that retains much of the performance of the larger model while being faster and less resource-intensive.
The distillation process typically involves the following steps:
Training the teacher model: A large, sophisticated model is trained on a dataset to achieve high accuracy and performance.
Generating soft targets: The teacher model produces probability distributions over classes (soft targets) for the training data, which carry more information than hard labels.
Training the student model: The student model is trained to mimic the teacher's outputs, often using a combination of the original training data and the soft targets generated by the teacher.
Knowledge transfer: The student model learns to replicate the teacher's decision-making process, including the relationships between inputs and outputs.
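The steps above can be sketched in a few lines of numpy. This is a minimal, illustrative implementation of the classic soft-target distillation loss (temperature-scaled softmax plus a hard-label term); the function names, temperature, and blending weight are assumptions for the sketch, not details from the source.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: higher T yields softer distributions."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, hard_labels,
                      temperature=2.0, alpha=0.5):
    """Blend of a soft-target term (KL to the teacher) and hard-label cross-entropy."""
    t = softmax(teacher_logits, temperature)   # soft targets from the teacher
    s = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions, scaled by T^2
    soft_loss = np.sum(t * (np.log(t + 1e-12) - np.log(s + 1e-12)), axis=-1)
    # standard cross-entropy of the student against the hard labels
    hard_probs = softmax(student_logits)
    hard_loss = -np.log(hard_probs[np.arange(len(hard_labels)), hard_labels] + 1e-12)
    return np.mean(alpha * temperature**2 * soft_loss + (1 - alpha) * hard_loss)
```

In a real training loop the student's weights would be updated by gradient descent on this loss; here the point is only that the soft targets expose the teacher's full output distribution, not just its top prediction.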
Soft targets: These provide more nuanced information about the teacher's predictions, allowing the student to learn subtle patterns in the data.
Feature-based distillation: The student model may learn from the teacher's internal features, minimizing the difference between their learned representations.
Relation-based distillation: This advanced technique focuses on transferring the underlying relationships between inputs and outputs from the teacher to the student.
Self-distillation: In some cases, a single model can act as both teacher and student, transferring knowledge from its deeper layers to shallower ones.
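The feature-based variant can likewise be sketched in numpy. Since the student's hidden dimension is typically narrower than the teacher's, a projection is needed to compare their representations; the linear projection and the mean-squared-error objective below are common choices but are assumptions of this sketch.

```python
import numpy as np

def feature_distillation_loss(student_feats, teacher_feats, projection):
    """MSE between teacher features and linearly projected student features.

    `projection` maps the student's (smaller) feature dimension up to the
    teacher's, so the two representations can be compared elementwise.
    """
    projected = student_feats @ projection
    return np.mean((projected - teacher_feats) ** 2)
```

In practice the projection matrix is trained jointly with the student, and this term is added to the output-level distillation loss rather than used alone.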
Model distillation offers several benefits, including model compression, improved generalization, and the ability to deploy high-performing models on resource-constrained devices. The technique has become particularly relevant with the proliferation of large language models (LLMs) and other massive AI systems.
Update 29.01.2025