What did DeepSeek figure out about reasoning with DeepSeek-R1?

https://www.seangoedecke.com/deepseek-r1

The Chinese AI lab DeepSeek recently released its new reasoning model, R1, which is supposedly (a) better than the current best reasoning models (OpenAI's o1 series) and (b) trained on a GPU cluster a fraction of the size of those used by the big Western AI labs.

DeepSeek uses a reinforcement learning approach, not a fine-tuning approach. There's no need to generate a huge body of chain-of-thought data ahead of time, and there's no need to run an expensive answer-checking model. Instead, the model generates its own chains of thought as it goes.
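A minimal sketch of that idea, assuming a GRPO-style setup (the method DeepSeek describes in the R1 paper) with a rule-based correctness reward; the function names and sample data here are illustrative, not DeepSeek's actual code:

```python
def reward(answer: str, reference: str) -> float:
    """Rule-based reward: 1 if the extracted final answer matches the
    reference, else 0. No learned answer-checking model is needed."""
    return 1.0 if answer.strip() == reference.strip() else 0.0

def grpo_advantages(rewards: list[float]) -> list[float]:
    """GRPO-style advantages: normalize each sampled completion's reward
    against its group's mean and std, so no value network is required."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5 or 1.0  # avoid dividing by zero on a uniform group
    return [(r - mean) / std for r in rewards]

# Hypothetical rollout: the model samples a group of chains of thought
# for one prompt; only the final answers are checked against the reference.
sampled_answers = ["42", "41", "42", "7"]  # stand-ins for model outputs
rewards = [reward(a, "42") for a in sampled_answers]
print(grpo_advantages(rewards))  # correct completions get positive advantage
```

Completions whose self-generated reasoning led to the right answer are reinforced, which is how the model learns to reason without any pre-built chain-of-thought dataset.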

https://medium.com/@ShankarsPayana/how-deepseek-r1-using-fp8-instead-of-fp32-beat-openai-meta-gemini-and-claude-c105d94d0c39

The secret behind their success? A bold move to train their models in FP8 (8-bit floating point) instead of the higher-precision 16- and 32-bit formats standard in large-model training.

By applying high precision only where it matters most, such as accumulations and per-block scale factors, they achieved large efficiency gains with little loss of accuracy.
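Not DeepSeek's actual kernels, but a rough numpy sketch of the general idea behind block-wise low-precision storage with high-precision scales; the block size and the crude e4m3 rounding here are illustrative assumptions:

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest normal value in the e4m3 8-bit format

def fake_fp8(v: np.ndarray) -> np.ndarray:
    """Crude e4m3 simulation: keep ~3 mantissa bits per value.
    (A real kernel would cast to a hardware FP8 dtype.)"""
    e = np.floor(np.log2(np.maximum(np.abs(v), 1e-12)))
    step = 2.0 ** (e - 3)
    return np.round(v / step) * step

def quantize_blockwise(x: np.ndarray, block: int = 128):
    """Store one high-precision (FP32) scale per block plus 8-bit values:
    precision is spent only on the scales, while the bulk of the tensor
    lives in 8 bits. Sensitive ops (e.g. matmul accumulation) stay FP32."""
    flat = x.reshape(-1, block)
    scales = np.abs(flat).max(axis=1, keepdims=True) / FP8_E4M3_MAX
    scales = np.where(scales == 0, 1.0, scales).astype(np.float32)
    return fake_fp8(flat / scales), scales

def dequantize_blockwise(q, scales, shape):
    return (q * scales).reshape(shape)

w = np.random.randn(4, 256).astype(np.float32)
q, s = quantize_blockwise(w)
w_hat = dequantize_blockwise(q, s, w.shape)
print(np.abs(w - w_hat).max())  # small reconstruction error
```

The per-block scales carry the dynamic range, so the 8-bit values only need to cover a narrow interval; that is the sense in which high precision is applied "only when absolutely necessary."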


R1 also predicts several tokens per step rather than one (multi-token prediction). The impressive part? These multi-token predictions are about 85–90% accurate, meaning DeepSeek R1 can deliver high-quality answers at roughly double the speed of its competitors.
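Back-of-the-envelope, those two numbers line up if you assume a speculative-decoding-style scheme where each step emits one guaranteed token plus one predicted token accepted with probability p; this framing is my inference, not the article's derivation:

```python
# Each step: 1 guaranteed token + 1 speculative token accepted with prob p.
for p in (0.85, 0.90):
    tokens_per_step = 1 + p  # expected tokens produced per decoding step
    print(f"acceptance {p:.0%} -> {tokens_per_step:.2f}x throughput")
# acceptance 85% -> 1.85x throughput
# acceptance 90% -> 1.90x throughput
```

An 85–90% acceptance rate yields roughly 1.85–1.9 tokens per step, which is where the "double the speed" figure comes from.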

https://www.tweaktown.com/news/102798/chinese-ai-firm-deepseek-has-50-000-nvidia-h100-gpus-says-ceo-even-with-us-restrictions/index.html

Chinese AI firm DeepSeek has 50,000 NVIDIA H100 AI GPUs, says CEO, even with US restrictions.

pIXELsHAM