DeepSeek: DeepSeek R1 0528 Qwen3 8B (free)
DeepSeek-R1-0528 is a lightly upgraded release of DeepSeek R1 that taps more compute and smarter post-training tricks, pushing its reasoning and inference to the brink of flagship models like O3 and Gemini 2.5 Pro. It now tops math, programming, and logic leaderboards, showcasing a step-change in depth-of-thought. The distilled variant, DeepSeek-R1-0528-Qwen3-8B, transfers this chain-of-thought into an 8 B-parameter form, beating standard Qwen3 8B by +10 pp and tying the 235 B “thinking” giant on AIME 2024.
Description
DeepSeek-R1-0528 is a lightly upgraded release of DeepSeek R1 that taps more compute and smarter post-training tricks, pushing its reasoning and inference to the brink of flagship models like O3 and Gemini 2.5 Pro. It now tops math, programming, and logic leaderboards, showcasing a step-change in depth-of-thought. The distilled variant, DeepSeek-R1-0528-Qwen3-8B, transfers this chain-of-thought into an 8 B-parameter form, beating standard Qwen3 8B by +10 pp and tying the 235 B “thinking” giant on AIME 2024.
ArchitectureАрхитектура
- Modality:
- text->text
- InputModalities:
- text
- OutputModalities:
- text
- Tokenizer:
- Qwen
- InstructionType:
- deepseek-r1
ContextAndLimits
- ContextLength:
- 131072 Tokens
- MaxResponseTokens:
- 0 Tokens
- Moderation:
- Disabled
PricingRUB
- Request:
- ₽
- Image:
- ₽
- WebSearch:
- ₽
- InternalReasoning:
- ₽
- Prompt1KTokens:
- ₽
- Completion1KTokens:
- ₽
DefaultParameters
- Temperature:
- 0
UserComments