Qwen3.5 397B

Mixture-of-experts (MoE) decoder that interleaves Gated DeltaNet and gated attention layers in a 3:1 ratio.


17B active / 397B total | 262K context | 3:1 Gated DeltaNet + Gated Attn | MoE

Architecture Specifications

Parameters: 17B active / 397B total
Context Window: 262K
Decoder Type: MoE
Attention: 3:1 Gated DeltaNet + Gated Attn
Active Parameters: 17B
Release Date: 2026-02
Category: Hybrid Architecture
Organization: Alibaba
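
As a rough illustration of what "17B active / 397B total" means, the sketch below computes the share of parameters used per token. The constant and function names are illustrative, and the counts are the rounded figures from the specification list, so the result is approximate.

```python
# Rounded parameter counts from the specification list, in billions.
ACTIVE_PARAMS_B = 17
TOTAL_PARAMS_B = 397


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the fraction of total parameters that are active per token."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active share per token: {fraction:.1%}")  # prints "Active share per token: 4.3%"
```

In other words, each token passes through only about one twenty-third of the model's weights, which is where the MoE design gets its compute savings.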

Expert routing

Layer mix: 15 gated attention + 45 DeltaNet

KV cache: 30 KiB/token
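
The layer mix and KV cache figures can be checked with a little arithmetic. The sketch below rests on two assumptions that the card does not state: that each repeating block places three DeltaNet layers before one gated attention layer, and that "262K" means 262,144 (2**18) tokens. All names are illustrative.

```python
# Figures from the card.
DELTANET_LAYERS = 45
ATTENTION_LAYERS = 15
KV_CACHE_KIB_PER_TOKEN = 30

# Assumption: "262K" is read as 2**18 tokens.
CONTEXT_TOKENS = 262_144


def layer_pattern(deltanet: int, attention: int) -> list[str]:
    """Build a layer list from repeated blocks of N DeltaNet + 1 attention.

    The within-block ordering (DeltaNet first) is an assumption.
    """
    ratio, remainder = divmod(deltanet, attention)
    if remainder:
        raise ValueError("layer counts do not form a whole-number ratio")
    block = ["deltanet"] * ratio + ["gated_attention"]
    return block * attention


def kv_cache_gib(tokens: int, kib_per_token: float = KV_CACHE_KIB_PER_TOKEN) -> float:
    """Convert a per-token KV cache cost in KiB to a total in GiB."""
    return tokens * kib_per_token / (1024 * 1024)


pattern = layer_pattern(DELTANET_LAYERS, ATTENTION_LAYERS)
print(len(pattern), pattern[:4])
print(f"KV cache at full context: {kv_cache_gib(CONTEXT_TOKENS)} GiB per sequence")
```

Under these assumptions the model has 60 layers in total, and a single sequence at full context needs about 7.5 GiB of KV cache. This is on top of the model weights and any recurrent state kept by the DeltaNet layers.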
