
Open-Source Reasoning Breakthrough
On January 20, 2025, DeepSeek released R1—an open-source reasoning model that matches OpenAI o1's performance at a fraction of the cost. The release sent shockwaves through the AI industry, demonstrating that frontier-level reasoning doesn't require hundreds of billions of dollars in infrastructure.
DeepSeek, a Chinese AI lab backed by quantitative trading firm High-Flyer, built R1 using innovative training techniques that achieve remarkable efficiency. The model is fully open-source under the MIT license, with weights, training details, and distilled variants freely available.
Performance: Matching o1
R1's benchmarks are remarkably competitive with OpenAI's o1:
| Benchmark | DeepSeek-R1 | OpenAI o1 | Claude 3.5 | GPT-4o |
|---|---|---|---|---|
| AIME 2024 | 79.8% | 79.2% | 16.0% | 13.4% |
| MATH-500 | 97.3% | 96.4% | 78.3% | 76.6% |
| Codeforces (Elo) | 2029 | 1891 | ~900 | 808 |
| GPQA Diamond | 71.5% | 78.0% | 65.0% | 53.6% |
| SWE-bench Verified | 49.2% | 48.9% | 49.0% | 33.2% |
| MMLU | 90.8% | 91.8% | 88.7% | 87.2% |
On math and coding benchmarks, R1 actually exceeds o1 in several categories. The MATH-500 score of 97.3% and Codeforces Elo of 2029 are world-class results.
Training Innovation: Pure Reinforcement Learning
The most revolutionary aspect of R1 is its training methodology. DeepSeek showed, first with the intermediate R1-Zero model, that reasoning can emerge from pure reinforcement learning (RL) without supervised fine-tuning:
```
Traditional Reasoning Model Training:
1. Pre-train on massive text corpus
2. Supervised Fine-Tuning (SFT) with human examples
3. RLHF with human preference data
4. Additional reasoning-specific training

DeepSeek R1 Approach:
1. Pre-train base model (DeepSeek-V3)
2. Apply RL directly → reasoning emerges naturally
3. "Cold start" data for readability/formatting
4. Final RL stage for polish

Key insight: You don't need to teach reasoning step-by-step.
Given the right RL reward signal, models learn to reason on their own.
```
This "reasoning from RL" discovery was published in their technical paper and has been independently validated by researchers. It suggests that reasoning is a more fundamental capability than previously thought—it can be "unlocked" rather than explicitly trained.
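The R1 paper describes simple rule-based rewards for this RL stage—is the final answer correct, and is the output in the expected format—rather than a learned reward model. Below is a minimal sketch of such a reward function; the exact tags, helper names, and equal weighting are illustrative assumptions, not DeepSeek's implementation:

```python
import re

def format_reward(completion: str) -> float:
    """1.0 if the completion puts its reasoning inside <think> tags
    and then produces a final answer; 0.0 otherwise."""
    pattern = r"^<think>.*?</think>\s*\S"
    return 1.0 if re.match(pattern, completion, re.DOTALL) else 0.0

def accuracy_reward(completion: str, gold_answer: str) -> float:
    """1.0 if the text after </think> contains the reference answer.
    Real pipelines verify math symbolically or run unit tests for code."""
    answer = completion.split("</think>")[-1]
    return 1.0 if gold_answer in answer else 0.0

def total_reward(completion: str, gold_answer: str) -> float:
    return format_reward(completion) + accuracy_reward(completion, gold_answer)

good = "<think>2 + 2 is 4 because...</think> The answer is 4."
bad = "The answer is 5."
```

Because the reward is computed by deterministic rules over the model's own rollouts, no human-labeled reasoning traces are needed—which is the point of the "reasoning from RL" result.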
Distilled Models: R1 for Everyone
DeepSeek released six distilled variants based on open-source base models:
| Model | Base | Parameters | AIME 2024 | MATH-500 |
|---|---|---|---|---|
| R1-Distill-Qwen-1.5B | Qwen 2.5 | 1.5B | 28.9% | 83.9% |
| R1-Distill-Qwen-7B | Qwen 2.5 | 7B | 55.5% | 92.8% |
| R1-Distill-Qwen-14B | Qwen 2.5 | 14B | 69.7% | 93.9% |
| R1-Distill-Qwen-32B | Qwen 2.5 | 32B | 72.6% | 94.3% |
| R1-Distill-Llama-8B | Llama 3.1 | 8B | 50.4% | 89.1% |
| R1-Distill-Llama-70B | Llama 3.1 | 70B | 70.0% | 94.5% |
The 7B distilled model scores 92.8% on MATH-500—outperforming GPT-4o (76.6%). A model that runs on a laptop beats a flagship frontier model at math.
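R1-family models (including the distills) emit their chain of thought between `<think>` tags before the final answer, so callers typically split the two. A minimal parser, assuming that published output convention and with no error handling beyond a missing-tag fallback:

```python
def split_reasoning(response: str) -> tuple[str, str]:
    """Split an R1-style completion into (reasoning, final_answer).

    Assumes the '<think>...</think>' format that R1 models emit;
    if no closing tag is present, the whole text is the answer.
    """
    if "</think>" in response:
        thinking, answer = response.split("</think>", 1)
        return thinking.replace("<think>", "", 1).strip(), answer.strip()
    return "", response.strip()
```

The same function works whether the completion comes from the hosted API or a local instance such as the Ollama setup below.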
```bash
# Run R1-Distill locally with Ollama
ollama run deepseek-r1:7b

# Or the full model with vLLM
pip install vllm
vllm serve deepseek-ai/DeepSeek-R1 --tensor-parallel-size 8
```
Cost Efficiency
DeepSeek's API pricing is dramatically lower than competitors:
| Model | Input (1M tokens) | Output (1M tokens) | Reasoning Quality |
|---|---|---|---|
| DeepSeek-R1 | $0.55 | $2.19 | o1-level |
| OpenAI o1 | $15.00 | $60.00 | o1-level |
| OpenAI o1-mini | $3.00 | $12.00 | Good |
| Claude 3.5 | $3.00 | $15.00 | Good (non-reasoning) |
R1 is roughly 27x cheaper than o1 on both input and output tokens, with comparable performance. This pricing disrupted the assumption that frontier AI requires premium pricing.
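The gap can be checked directly from the table. A small cost estimator with the table's prices hard-coded (USD per million tokens; model keys are ad hoc labels, not API model IDs):

```python
# USD per 1M tokens (input, output), taken from the pricing table above.
PRICES = {
    "deepseek-r1": (0.55, 2.19),
    "openai-o1": (15.00, 60.00),
    "openai-o1-mini": (3.00, 12.00),
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the API cost of a workload for a given model."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

# A workload of 1M input + 1M output tokens:
#   R1: $2.74   o1: $75.00   → roughly a 27x difference
```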
The "DeepSeek Shock" Market Impact
R1's release caused a significant market reaction:
- NVIDIA stock dropped 17% (single-day loss of ~$600B market cap)
- Investors questioned whether massive GPU investments were necessary
- The "scaling hypothesis" (more compute = better AI) was challenged
- US tech companies faced pressure to justify AI infrastructure spending
The concern: if a Chinese lab with reportedly modest compute (~$5.6M training cost) can match OpenAI's results, are the $100B+ datacenter investments justified?
Open-Source Implications
R1's MIT license means:
- Commercial use without restrictions
- Modification and redistribution allowed
- No data sharing requirements
- No usage reporting to DeepSeek
This has spawned a wave of R1-based applications, fine-tuned variants, and research papers. The reasoning techniques are being applied to specialized domains: medical diagnosis, legal analysis, scientific research.
Geopolitical Context
R1's success raises complex geopolitical questions:
- Export controls: Despite US restrictions on AI chips to China, DeepSeek achieved frontier performance
- Efficiency vs. scale: Chinese labs are innovating on algorithms rather than brute-force compute
- Open-source dynamics: The model is freely available globally, bypassing any potential restrictions
- Competition narrative: The US-China AI race is more nuanced than compute alone
What This Means for AI Development
DeepSeek R1 proves three important things:
- Reasoning is accessible: You don't need OpenAI's resources to build reasoning models
- Open-source can compete: MIT-licensed models can match proprietary frontier models
- Efficiency matters: Algorithmic innovation can substitute for raw compute
The AI industry's assumption that frontier performance requires massive investment has been fundamentally challenged.
Sources: DeepSeek R1 Paper, DeepSeek GitHub, DeepSeek API


