DeepSeek-V4-Flash
DeepSeek's smaller, faster variant of V4: the same architecture at a fraction of the cost and latency
DeepSeek • April 2026
Training Data
Up to early 2026
Parameters
284 billion (13B active)
Training Method
MoE with hybrid attention, Muon optimizer
Context Window
1,000,000 tokens
Knowledge Cutoff
Not disclosed
Key Features
1M Context • Sparse MoE (13B Active) • Aggressive Pricing • Open Weights (MIT)
Capabilities
Speed: Outstanding
Cost Efficiency: Outstanding
Reasoning: Excellent
What's New in This Version
Positioned as the most affordable model at frontier-comparable quality; keeps V4-Pro's 1M-token context at roughly 5× lower cost ($0.14 input / $0.28 output per million tokens, MTok)
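To make the listed rates concrete, here is a minimal sketch that estimates the cost of a single request. It assumes only the two per-MTok prices quoted above and ignores any caching discounts or other pricing tiers, which this card does not describe. The function name `estimate_cost_usd` and the example token counts are hypothetical.

```python
# Rates quoted in this card for DeepSeek-V4-Flash, in USD per million tokens (MTok).
INPUT_USD_PER_MTOK = 0.14
OUTPUT_USD_PER_MTOK = 0.28


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD at the listed rates.

    Hypothetical helper: flat per-token pricing only, with no discounts.
    """
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical request: a full 1,000,000-token context with a 2,000-token reply.
cost = estimate_cost_usd(1_000_000, 2_000)
print(f"${cost:.4f}")
```

At these rates the input side dominates long-context requests: filling the whole 1M-token window costs about $0.14 before any output is generated.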
Other DeepSeek Models
Explore more models from DeepSeek
DeepSeek-V4-Pro
DeepSeek's frontier MoE flagship closing the gap with leading proprietary models on reasoning and agentic coding
DeepSeek-V3.2
DeepSeek's earlier flagship model, described as matching GPT-5 performance, with integrated tool-use thinking
DeepSeek-V3.2-Speciale
DeepSeek's competition-focused variant (expired Dec 15, 2025; was a temporary API-only release)