Mistral Small 4
Mistral AI • March 2026

Mistral's unified open-weight MoE model combining reasoning, multimodal, and coding capabilities under Apache 2.0 with only 6.5B active parameters.

Technical Specifications
Training Data: up to early 2026
Parameters: 119 billion total (6.5B active per token)
Architecture: Sparse Mixture of Experts, 128 experts with 4 active per token (see the routing sketch below)
Context Window: 256,000 tokens
Knowledge Cutoff: not disclosed
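To make the "128 experts, 4 active per token" line concrete, here is a minimal sketch of top-k expert routing in PyTorch. The hidden sizes, the SiLU MLP expert shape, and the renormalized softmax over the selected experts are illustrative assumptions, not Mistral's disclosed design:

```python
# Minimal top-k sparse MoE sketch: 128 experts, 4 active per token.
# Dimensions and expert shape are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model=4096, d_ff=8192, n_experts=128, top_k=4):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                     # x: (tokens, d_model)
        logits = self.router(x)               # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)  # renormalize over the 4 picks
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in idx[:, k].unique():      # send each token to its experts
                mask = idx[:, k] == e
                out[mask] += weights[mask, k, None] * self.experts[e](x[mask])
        return out
```

The efficiency of MoE models comes from exactly this routing: each token touches only 4 of the 128 expert MLPs, so per-token compute scales with active rather than total parameters.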
Key Features
Configurable Reasoning (usage sketch below) • Native Multimodal • Unified Architecture • Apache 2.0 license
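The card does not say how "configurable reasoning" is exposed; one common pattern for Magistral-style models is a system-prompt toggle. A hedged sketch against Mistral's OpenAI-compatible chat endpoint, where the model id and the exact prompt wording are assumptions:

```python
# Sketch of toggling reasoning via a system prompt. The model id
# ("mistral-small-4") and the prompt mechanism are assumptions for
# illustration; consult Mistral's docs for the actual interface.
import os
import requests

def chat(prompt: str, reasoning: bool = True) -> str:
    system = (
        "First draft your reasoning inside <think>...</think> tags, "
        "then give the final answer."          # hypothetical toggle
        if reasoning
        else "Answer directly without showing reasoning."
    )
    resp = requests.post(
        "https://api.mistral.ai/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
        json={
            "model": "mistral-small-4",        # hypothetical model id
            "messages": [
                {"role": "system", "content": system},
                {"role": "user", "content": prompt},
            ],
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

print(chat("How many weekdays are in March 2026?", reasoning=True))
```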
Capabilities
Reasoning: Excellent
Coding: Excellent
Multimodal: Very Good
What's New in This Version
Unifies Magistral, Pixtral, and Devstral into a single model, with 40% lower latency and 3x higher throughput than Mistral Small 3.
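As a sanity check on the efficiency claims, the published parameter counts can be decomposed with simple arithmetic. Assuming the model is just shared parameters plus 128 identical experts of which 4 are active per token (real architectures may add shared experts or other wrinkles):

```python
# Back-of-the-envelope split of the published counts, assuming
# total = shared + 128 * per_expert and active = shared + 4 * per_expert.
total, active = 119e9, 6.5e9
n_experts, top_k = 128, 4

per_expert = (total - active) / (n_experts - top_k)  # 124 * E = 112.5B
shared = active - top_k * per_expert

print(f"per-expert params ~ {per_expert / 1e9:.2f}B")  # ~ 0.91B
print(f"shared params     ~ {shared / 1e9:.2f}B")      # ~ 2.87B
print(f"active fraction   ~ {active / total:.1%}")     # ~ 5.5%
```

With roughly 5.5% of parameters active per token, sizable latency and throughput gains over a comparable dense model are the expected direction, though the exact 40% and 3x figures depend on serving details.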
Other Mistral AI Models
Mistral Large 3: Mistral's state-of-the-art open-weight frontier model with multimodal and multilingual capabilities under Apache 2.0
Ministral 3 14B: Mistral's high-performance dense model in the new Ministral 3 family
Ministral 3 8B: Mistral's efficient edge-ready model for drones, cars, robots, phones, and laptops