LATEST MODEL

DeepSeek-V4-Pro

DeepSeek • 🧠 The Thinker • Released April 2026

DeepSeek's frontier Mixture-of-Experts (MoE) flagship, closing the gap with leading proprietary models on reasoning and agentic coding.

Training Data

32+ trillion tokens, up to early 2026

Parameters

1.6 trillion (49B active)
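To illustrate what "49B active" means for a sparse MoE model, the minimal sketch below computes the share of parameters used per token from the two figures quoted above. The function name `active_share` is a hypothetical helper introduced here for illustration only.

```python
def active_share(total_params: float, active_params: float) -> float:
    """Return the fraction of a sparse MoE model's parameters used per token."""
    if total_params <= 0 or not 0 <= active_params <= total_params:
        raise ValueError("expected total_params > 0 and 0 <= active_params <= total_params")
    return active_params / total_params


# Figures quoted in this card.
TOTAL_PARAMS = 1.6e12   # 1.6 trillion total parameters
ACTIVE_PARAMS = 49e9    # 49B parameters active per token

share = active_share(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"Active per token: {share:.2%} of all parameters")
```

Under these figures, only about 3% of the weights take part in any single forward pass, which is why per-token compute tracks the active count rather than the total.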

Training Method

MoE with hybrid attention (CSA + HCA), Muon optimizer, two-stage post-training

Context Window

1,000,000 tokens

Knowledge Cutoff

Not disclosed

Key Features

Hybrid Compressed Attention • Manifold-Constrained Hyper-Connections • FP4/FP8 Mixed Precision • Open Weights (MIT)

Capabilities

Reasoning: Outstanding

Coding: Outstanding

Agentic: Outstanding

What's New in This Version

Uses 27% of the inference FLOPs and 10% of the KV cache of DeepSeek-V3.2 at 1M tokens. Benchmark scores: SWE-bench Verified 80.6, Terminal-Bench 2.0 67.9, MMLU-Pro 87.5, GPQA 90.1, LiveCodeBench 93.5.
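To make the efficiency figures concrete, the sketch below scales a baseline cost by the quoted ratios. It assumes the figures mean that V4-Pro needs 27% of the inference FLOPs and 10% of the KV cache of V3.2 at a 1M-token context. The baseline value and the helper names `scaled_cost` and `reduction_factor` are hypothetical placeholders, not measurements or published APIs.

```python
# Ratios relative to DeepSeek-V3.2 at a 1M-token context, as quoted in this card.
# Assumption: each percentage is the share of V3.2's cost that remains.
FLOPS_RATIO = 0.27
KV_CACHE_RATIO = 0.10


def scaled_cost(baseline: float, ratio: float) -> float:
    """Scale a baseline cost (any unit) by a relative ratio in [0, 1]."""
    if baseline < 0 or not 0 <= ratio <= 1:
        raise ValueError("expected baseline >= 0 and 0 <= ratio <= 1")
    return baseline * ratio


def reduction_factor(ratio: float) -> float:
    """Express a relative ratio as an 'N times smaller' factor."""
    if not 0 < ratio <= 1:
        raise ValueError("expected 0 < ratio <= 1")
    return 1.0 / ratio


# Hypothetical baseline: a V3.2 KV cache of 100 GiB at 1M tokens (placeholder value).
baseline_kv_gib = 100.0
v4_kv_gib = scaled_cost(baseline_kv_gib, KV_CACHE_RATIO)

print(f"KV cache: {v4_kv_gib:.1f} GiB ({reduction_factor(KV_CACHE_RATIO):.1f}x smaller)")
print(f"Inference FLOPs: {reduction_factor(FLOPS_RATIO):.1f}x fewer")
```

Swapping in a measured V3.2 baseline gives the corresponding V4-Pro estimate; the ratios themselves apply only at the 1M-token setting quoted above.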

Other DeepSeek Models

Explore more models from DeepSeek

DeepSeek-V4-Flash

DeepSeek-V3.2

DeepSeek-V3.2-Speciale