MoE
Unknown · 2026-03
Sarvam 30B
MoE decoder architecture with GQA + QK-Norm attention mechanism.
Sarvam 30B decoder block architecture. Attention: GQA with QK-Norm. Normalization: RMSNorm. FFN: Mixture of Experts (2.4B active parameters). Position encoding: RoPE. Scale: 30B total parameters, 131K context, 19 layers. Decoder type: MoE.
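The page does not say how the MoE feed-forward layer selects experts, so the following is only a minimal sketch of one common scheme (top-k routing with softmax weights over the selected experts), not a description of Sarvam 30B's actual router. The expert count, the value of k, the toy experts, and the function names `route_top_k` and `moe_ffn` are all hypothetical and chosen for illustration.

```python
import math

def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    # Indices of the k largest logits, highest first (ties keep original order).
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only; subtract the max for stability.
    m = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_ffn(x, experts, router_logits, k):
    """Weighted sum of the outputs of the selected experts for one token."""
    out = [0.0] * len(x)
    for idx, weight in route_top_k(router_logits, k):
        y = experts[idx](x)  # only the selected experts are evaluated
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out

# Toy usage: 4 hypothetical experts that simply scale the input vector.
experts = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]
result = moe_ffn([1.0, -1.0], experts,
                 router_logits=[0.1, 2.0, -1.0, 2.0], k=2)
print(result)
```

Because only the selected experts run for each token, the parameters touched per token (the "active" count) can be much smaller than the total parameter count.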
Architecture Specifications
Parameters: 2.4B active / 30B total
Context Window: 131K
Decoder Type: MoE
Attention: GQA + QK-Norm
Active Parameters: 2.4B
Layers: 19
Hidden Size: 4,096
Vocabulary Size: 262K
Release Date: 2026-03
Category: Mixture of Experts
Organization: Unknown
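As a quick check on the parameter figures above, the arithmetic below computes the share of parameters active per token and a rough weight-storage estimate. The 2 bytes per parameter (bf16/fp16) is an assumption, since the page does not state a storage precision, and the "B" figures are taken as round numbers.

```python
# Figures from the specification table, treated as round numbers.
TOTAL_PARAMS = 30e9
ACTIVE_PARAMS = 2.4e9

# Assumption: weights stored at 2 bytes per parameter (bf16/fp16).
BYTES_PER_PARAM = 2

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
weight_gb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e9  # decimal gigabytes

print(f"active fraction per token: {active_fraction:.1%}")
print(f"weights at {BYTES_PER_PARAM} bytes/param: {weight_gb:.0f} GB")
```

Under these assumptions roughly 8% of the parameters participate in each token's forward pass, while all 30B still have to be held in memory.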
Key Features
Grouped Query Attention
QK normalization
Expert routing
Layer mix: 19 GQA
KV cache: 19 KiB/token
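The KV cache figure can be turned into a memory estimate for a full context window. The sketch below assumes "131K" means 131,072 tokens (2**17). It also shows the usual per-token KV formula with one hypothetical head configuration (4 KV heads of dimension 64 at 2 bytes per value) that happens to reproduce 19 KiB/token. The page does not state the KV head count, head dimension, or cache precision, and other combinations match equally well.

```python
KIB = 1024
KV_BYTES_PER_TOKEN = 19 * KIB   # from the feature list: 19 KiB/token
LAYERS = 19                     # from the specification table
CONTEXT_TOKENS = 131_072        # assumption: "131K" means 2**17 tokens

# Per-layer share of the per-token cache.
per_layer_bytes = KV_BYTES_PER_TOKEN // LAYERS

# Cache size for one sequence that fills the whole context window.
full_context_gib = KV_BYTES_PER_TOKEN * CONTEXT_TOKENS / 2**30

def kv_bytes_per_token(layers, kv_heads, head_dim, bytes_per_value):
    """Standard KV cache size per token; the factor 2 covers K and V."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value

# Hypothetical configuration, not stated on the page.
hypothetical = kv_bytes_per_token(LAYERS, kv_heads=4, head_dim=64,
                                  bytes_per_value=2)

print(f"per layer: {per_layer_bytes} bytes/token")
print(f"full context: {full_context_gib:.3f} GiB per sequence")
print(f"hypothetical config matches spec: {hypothetical == KV_BYTES_PER_TOKEN}")
```

Under these assumptions a single full-length sequence needs about 2.4 GiB of KV cache, in addition to the model weights.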
Colaberry AI provides architecture specifications, benchmark comparisons, and deployment guidance for enterprise AI teams.