Mistral Small 4

Mixture-of-Experts (MoE) decoder architecture with Multi-head Latent Attention (MLA).

6.63B active / 119B total | 256K context | MLA | MoE

Architecture Specifications

Parameters: 6.63B active / 119B total

Context Window: 256K

Decoder Type: MoE

Attention: MLA

Active Parameters: 6.63B

Release Date: 2026-03

Category: Mixture of Experts

Organization: Mistral

Multi-head Latent Attention | Expert routing | Layer mix: 36 MLA | KV cache: 22.5 KiB/token
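
The listed figures support a few back-of-the-envelope estimates. The sketch below is illustrative only: it assumes "256K" means 256 × 1024 tokens, that the 22.5 KiB/token KV cache is split evenly across the 36 MLA layers, and that a single sequence is cached. The function names are hypothetical helpers, not part of any Mistral tooling.

```python
# Figures copied from the specification card.
ACTIVE_PARAMS_B = 6.63          # billions of parameters active per token
TOTAL_PARAMS_B = 119.0          # billions of parameters in total
KV_CACHE_KIB_PER_TOKEN = 22.5   # KiB of KV cache per token
MLA_LAYERS = 36                 # layer mix: 36 MLA

# Assumption: "256K" means 256 * 1024 tokens rather than 256,000.
CONTEXT_TOKENS = 256 * 1024


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the total parameters that is active for each token."""
    return active_b / total_b


def kv_bytes_per_layer_per_token(kib_per_token: float, layers: int) -> float:
    """Per-layer KV cache bytes per token, assuming an even split across layers."""
    return kib_per_token * 1024 / layers


def kv_cache_gib(kib_per_token: float, tokens: int) -> float:
    """KV cache size in GiB for one sequence of the given length."""
    return kib_per_token * tokens / (1024 * 1024)


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    per_layer = kv_bytes_per_layer_per_token(KV_CACHE_KIB_PER_TOKEN, MLA_LAYERS)
    full_ctx = kv_cache_gib(KV_CACHE_KIB_PER_TOKEN, CONTEXT_TOKENS)

    print(f"Active parameter share: {frac:.2%}")                      # 5.57%
    print(f"KV cache per layer per token: {per_layer:.0f} bytes")     # 640 bytes
    print(f"KV cache at full context: {full_ctx:.3f} GiB")            # 5.625 GiB
```

Under these assumptions, roughly 5.6% of the parameters are active per token, and one full-length sequence needs about 5.6 GiB of KV cache. Serving several sequences at once scales that figure linearly with the number of cached sequences.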
