AirBox Successfully Ports DeepSeek-R1 Models
The DeepSeek-R1-Distill-Qwen-7B/1.5B models have been successfully ported to the Radxa Fogwise® AirBox.
Performance Details:
DeepSeek-R1-Distill-Qwen-7B reaches 11 tokens/s
DeepSeek-R1-Distill-Qwen-1.5B reaches 30 tokens/s
The Radxa development team has ported the DeepSeek-R1-Distill-Qwen-7B and 1.5B distilled models to the Fogwise® AirBox. Using the TPU-MLIR toolchain for INT4 quantization and model compilation, we have enabled both DeepSeek-R1 distilled models to run on the AirBox, which provides 32 TOPS of compute.
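To see why INT4 quantization matters on an edge device, a rough estimate of weight storage at different precisions helps. The sketch below is plain arithmetic, not part of the TPU-MLIR toolchain. The parameter counts (7e9 and 1.5e9) are nominal values taken from the model names, and the function name `weight_footprint_gib` is hypothetical. It ignores quantization scales, any layers kept at higher precision, and the KV cache, so real memory use will be higher.

```python
def weight_footprint_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate storage for model weights in GiB.

    Assumption: every weight is stored at the same bit width, with no
    overhead for scales, zero points, or activations.
    """
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Nominal parameter counts inferred from the model names (assumption).
NOMINAL_PARAMS = {
    "deepseek-r1-distill-qwen-1.5b": 1.5e9,
    "deepseek-r1-distill-qwen-7b": 7e9,
}

if __name__ == "__main__":
    for name, params in NOMINAL_PARAMS.items():
        fp16 = weight_footprint_gib(params, 16)
        int4 = weight_footprint_gib(params, 4)
        print(f"{name}: FP16 ~{fp16:.2f} GiB, INT4 ~{int4:.2f} GiB")
```

Under these assumptions, INT4 weights take one quarter of the space of FP16 weights, which is the main reason a 7B model becomes practical on a device of this class.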
Performance Results
DeepSeek-R1-Distill-Qwen-7B reaches 11 tokens/s on the AirBox, making it a true edge computing monster.
Model | Quantization | Sequence Length | First Token Latency (s) | Tokens Per Second (tokens/s) |
---|---|---|---|---|
deepseek-r1-distill-qwen-1.5b | INT4 | 8192 | 5.159 | 30.448 |
deepseek-r1-distill-qwen-7b | INT4 | 2048 | 2.843 | 11.008 |
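The two latency columns can be combined into a rough end-to-end response time. The sketch below assumes the first token arrives after the measured first token latency and that every later token is produced at the steady rate from the table. The function name `estimated_response_seconds` and the 256-token response length are illustrative choices, not values from the benchmark. First token latency also depends on prompt length, which the table does not state.

```python
def estimated_response_seconds(
    first_token_latency_s: float,
    tokens_per_second: float,
    output_tokens: int,
) -> float:
    """Estimate wall-clock time to generate `output_tokens` tokens.

    Assumption: constant decode rate after the first token.
    """
    if output_tokens <= 0:
        return 0.0
    remaining_tokens = output_tokens - 1
    return first_token_latency_s + remaining_tokens / tokens_per_second


# Measured values copied from the table above:
# (first token latency in seconds, tokens per second).
BENCHMARKS = {
    "deepseek-r1-distill-qwen-1.5b": (5.159, 30.448),
    "deepseek-r1-distill-qwen-7b": (2.843, 11.008),
}

if __name__ == "__main__":
    response_length = 256  # hypothetical response length in tokens
    for model, (ftl, tps) in BENCHMARKS.items():
        seconds = estimated_response_seconds(ftl, tps, response_length)
        print(f"{model}: ~{seconds:.1f} s for {response_length} tokens")
```

Under these assumptions, the 1.5B model finishes a 256-token response in roughly half the time of the 7B model, despite its longer first token latency.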
Model Deployment and Usage
The porting method and detailed documentation for the DeepSeek-R1-Distill-Qwen-7B/1.5B models have been released on the Radxa official website. The models and code are fully open source, and everyone is welcome to try and deploy them.
Fogwise® AirBox Overview
The Radxa Fogwise® AirBox is an embedded AI microserver with up to 32 TOPS of compute. It supports multiple precisions (INT8, FP16/BF16, FP32) and local deployment of mainstream models, including LLMs, text-to-image generation models, and various CV models. It offers high performance, low power consumption, and strong environmental adaptability. With a range of deep learning algorithms, it supports applications such as facial recognition, video structuring, behavior analysis, and status monitoring, empowering digital transformation in smart cities, smart transportation, smart energy, smart finance, smart telecom, and smart industries.
Additionally, the Radxa Fogwise® AirBox is fully compatible with edge large models such as ChatGLM3, Llama3.1, Qwen2.5, Stable Diffusion3, FLUX.1, MiniCPM-V2.6, CLIP, Whisper, and more. For more details, please refer to the Radxa official documentation, and feel free to try it out.