Mistral Small 4
Mistral AI • March 2026

Mistral's unified open-weight MoE model combining reasoning, multimodal, and coding capabilities under Apache 2.0 with only 6.5B active parameters.

Technical Specifications
Training Data: up to early 2026
Parameters: 119 billion total (6.5B active per token)
Architecture: Sparse Mixture of Experts, 128 experts with 4 active per token (see the routing sketch below)
Context Window: 256,000 tokens
Knowledge Cutoff: not disclosed
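To make the "128 experts, 4 active per token" line concrete, here is a minimal sketch of top-k expert routing in PyTorch. The hidden sizes, the SiLU MLP expert shape, and the renormalized softmax over the selected experts are illustrative assumptions, not Mistral's disclosed design:

```python
# Minimal top-k sparse MoE sketch: 128 experts, 4 active per token.
# Dimensions and expert shape are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model=4096, d_ff=8192, n_experts=128, top_k=4):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                     # x: (tokens, d_model)
        logits = self.router(x)               # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)  # renormalize over the 4 picks
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in idx[:, k].unique():      # send each token to its experts
                mask = idx[:, k] == e
                out[mask] += weights[mask, k, None] * self.experts[e](x[mask])
        return out
```

The efficiency of MoE models comes from exactly this routing: each token touches only 4 of the 128 expert MLPs, so per-token compute scales with active rather than total parameters.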
Key Features
Configurable Reasoning (usage sketch below) • Native Multimodal • Unified Architecture • Apache 2.0 license
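The card does not say how "configurable reasoning" is exposed; one common pattern for Magistral-style models is a system-prompt toggle. A hedged sketch against Mistral's OpenAI-compatible chat endpoint, where the model id and the exact prompt wording are assumptions:

```python
# Sketch of toggling reasoning via a system prompt. The model id
# ("mistral-small-4") and the prompt mechanism are assumptions for
# illustration; consult Mistral's docs for the actual interface.
import os
import requests

def chat(prompt: str, reasoning: bool = True) -> str:
    system = (
        "First draft your reasoning inside <think>...</think> tags, "
        "then give the final answer."          # hypothetical toggle
        if reasoning
        else "Answer directly without showing reasoning."
    )
    resp = requests.post(
        "https://api.mistral.ai/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
        json={
            "model": "mistral-small-4",        # hypothetical model id
            "messages": [
                {"role": "system", "content": system},
                {"role": "user", "content": prompt},
            ],
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

print(chat("How many weekdays are in March 2026?", reasoning=True))
```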
Capabilities
Reasoning: Excellent
Coding: Excellent
Multimodal: Very Good
What's New in This Version
Unifies Magistral, Pixtral, and Devstral into a single model, with 40% lower latency and 3x higher throughput than Mistral Small 3.
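As a sanity check on the efficiency claims, the published parameter counts can be decomposed with simple arithmetic. Assuming the model is just shared parameters plus 128 identical experts of which 4 are active per token (real architectures may add shared experts or other wrinkles):

```python
# Back-of-the-envelope split of the published counts, assuming
# total = shared + 128 * per_expert and active = shared + 4 * per_expert.
total, active = 119e9, 6.5e9
n_experts, top_k = 128, 4

per_expert = (total - active) / (n_experts - top_k)  # 124 * E = 112.5B
shared = active - top_k * per_expert

print(f"per-expert params ~ {per_expert / 1e9:.2f}B")  # ~ 0.91B
print(f"shared params     ~ {shared / 1e9:.2f}B")      # ~ 2.87B
print(f"active fraction   ~ {active / total:.1%}")     # ~ 5.5%
```

With roughly 5.5% of parameters active per token, sizable latency and throughput gains over a comparable dense model are the expected direction, though the exact 40% and 3x figures depend on serving details.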
Other Mistral AI Models
Mistral Large 3: Mistral's state-of-the-art open-weight frontier model with multimodal and multilingual capabilities under Apache 2.0
Ministral 3 14B: Mistral's high-performance dense model in the new Ministral 3 family
Ministral 3 8B: Mistral's efficient edge-ready model for drones, cars, robots, phones, and laptops