DeepSeek’s Distilled R1 AI Model Achieves Impressive Performance on a Single GPU

DeepSeek has released an updated version of its R1 reasoning AI model, capturing the attention of the AI community. Alongside that release, the Chinese AI lab also introduced a smaller, distilled variant, DeepSeek-R1-0528-Qwen3-8B, which it says rivals similarly sized models on several benchmarks.

Built on Alibaba’s Qwen3-8B model, launched in May, DeepSeek-R1-0528-Qwen3-8B outperforms Google’s Gemini 2.5 Flash on AIME 2025, a challenging set of math questions.

The distilled model also closely matches Microsoft’s Phi 4 reasoning plus on the HMMT math skills test.

Distilled models like DeepSeek-R1-0528-Qwen3-8B are generally less capable than their full-sized counterparts, but they are far less computationally demanding. According to NodeShift, Qwen3-8B runs on a single GPU with 40GB–80GB of RAM, whereas the full-sized R1 requires around a dozen 80GB GPUs.
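A quick back-of-the-envelope calculation shows why the gap is so large. The sketch below assumes fp16/bf16 weights at 2 bytes per parameter and ignores KV-cache and activation overhead, so real-world requirements run higher; the 671B parameter count for the full R1 is DeepSeek’s published figure.

```python
# Rough VRAM estimate for holding model weights alone.
# Assumes 2 bytes per parameter (fp16/bf16); KV cache and activation
# overhead are ignored, so actual requirements are higher.

def weight_memory_gb(num_params_billion: float, bytes_per_param: float = 2) -> float:
    """Memory needed just for the weights, in GB."""
    return num_params_billion * 1e9 * bytes_per_param / 1e9

print(f"Qwen3-8B weights: ~{weight_memory_gb(8):.0f} GB")    # ~16 GB, fits one GPU
print(f"Full R1 weights:  ~{weight_memory_gb(671):.0f} GB")  # ~1342 GB in fp16

# At fp8 (1 byte per parameter) the full R1's weights still take ~671 GB,
# i.e. nine 80GB cards before any KV-cache or activation overhead, which
# is consistent with the "around a dozen 80GB GPUs" figure cited above.
```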

DeepSeek trained DeepSeek-R1-0528-Qwen3-8B by fine-tuning Qwen3-8B on text generated by the updated R1. On Hugging Face, DeepSeek describes the model as suited to both academic research on reasoning models and industrial development focused on small-scale models.
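DeepSeek has not published its training code, but the basic recipe (supervised fine-tuning of a small student model on text produced by a larger teacher) can be sketched in a few lines with Hugging Face transformers. The model name below is the real Qwen3-8B checkpoint; the dataset file `r1_traces.jsonl` is a hypothetical stand-in for R1-generated reasoning traces, and the hyperparameters are illustrative, not DeepSeek’s.

```python
# Minimal sketch of distillation as supervised fine-tuning: the student
# (Qwen3-8B) is trained on text produced by the teacher (the updated R1).
# Illustrative only, not DeepSeek's actual pipeline; "r1_traces.jsonl"
# is a hypothetical file of teacher-generated outputs.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

student_name = "Qwen/Qwen3-8B"
tokenizer = AutoTokenizer.from_pretrained(student_name)
model = AutoModelForCausalLM.from_pretrained(student_name)

# Each JSONL line holds one teacher-generated trace, e.g.
# {"text": "<prompt + R1's chain of thought + final answer>"}
dataset = load_dataset("json", data_files="r1_traces.jsonl")["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=4096)

tokenized = dataset.map(tokenize, batched=True,
                        remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="r1-distilled-qwen3-8b",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        learning_rate=1e-5,
        num_train_epochs=2,
        bf16=True,
    ),
    train_dataset=tokenized,
    # Standard causal-LM collator: labels are the inputs shifted by one.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```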

Available under a permissive MIT license, DeepSeek-R1-0528-Qwen3-8B can be used commercially without restriction. Platforms like LM Studio are already offering the model through an API.
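For example, LM Studio serves downloaded models through a local OpenAI-compatible endpoint (by default at http://localhost:1234/v1), so the distilled model can be queried with the standard openai client. The model identifier below is an assumption; use whatever name appears in your LM Studio model list.

```python
# Sketch of querying the model through LM Studio's local
# OpenAI-compatible server (default: http://localhost:1234/v1).
from openai import OpenAI

# LM Studio's local server accepts any placeholder API key.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="deepseek-r1-0528-qwen3-8b",  # assumed identifier; check your model list
    messages=[{"role": "user",
               "content": "What is 17 * 24? Think step by step."}],
)
print(response.choices[0].message.content)
```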
