Cohere · 2026-02

Tiny Aya 3.35B

Dense decoder architecture with a GQA + 3:1 SWA attention mechanism.

Tiny Aya 3.35B decoder block architecture. Attention: GQA with a 3:1 mix of sliding-window (SWA) and global layers. Normalization: RMSNorm. FFN: SwiGLU. Position encoding: RoPE. Scale: 3.35B parameters, 8K context, 36 layers (27 sliding-window + 9 global). Decoder type: Dense.
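
The two attention ideas named above can be illustrated with a minimal sketch. It is not Cohere's implementation: the helper names `causal_mask` and `kv_head_for_query_head` are hypothetical, and the window size and head counts in the usage lines are illustrative values, since the card does not state them. A sliding-window layer lets each token attend only to its most recent neighbours, a global layer lets it attend to the whole prefix, and GQA lets several query heads share one key/value head.

```python
def causal_mask(seq_len, window=None):
    """Return a seq_len x seq_len boolean mask; True means query i may attend to key j.

    window=None gives a global causal mask. An integer window restricts each
    query to the most recent `window` positions, itself included.
    """
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            visible = j <= i  # causal: never look at future tokens
            if window is not None:
                visible = visible and (i - j) < window  # sliding window
            row.append(visible)
        mask.append(row)
    return mask


def kv_head_for_query_head(q_head, n_q_heads, n_kv_heads):
    """GQA: consecutive query heads are grouped onto one shared key/value head."""
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size


# Illustrative values only (not taken from the model card).
sliding = causal_mask(4, window=2)   # last row sees positions 2 and 3 only
full = causal_mask(4)                # last row sees positions 0..3
shared = kv_head_for_query_head(5, n_q_heads=8, n_kv_heads=2)
```

With a window of 2, the last of four tokens attends to positions 2 and 3 only, while the global mask exposes all four positions.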


Architecture Specifications

Parameters: 3.35B
Context Window: 8K
Decoder Type: Dense
Attention: GQA + 3:1 SWA attention
Release Date: 2026-02
Category: Efficient & Small
Organization: Cohere

Key Features

Grouped Query Attention
Sliding Window Attention
RoPE embeddings
Layer mix: 27 sliding-window + 9 global
KV cache: 72 KiB/token
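
The layer mix and KV-cache figures above lend themselves to a short arithmetic check. The sketch below rests on several assumptions: the ordering of three sliding-window layers followed by one global layer is one plausible way to lay out a 3:1 ratio, not a confirmed schedule. "8K" is taken to mean 8192 tokens. The 72 KiB/token figure is treated as a flat per-token cost, which gives an upper bound if sliding-window layers discard entries outside their window. The names `layer_schedule` and `kv_cache_mib` are hypothetical.

```python
def layer_schedule(n_layers, sliding_per_global=3):
    """Assumed ordering: every (sliding_per_global + 1)-th layer is global."""
    period = sliding_per_global + 1
    return [
        "global" if (i + 1) % period == 0 else "sliding"
        for i in range(n_layers)
    ]


# 27 sliding-window + 9 global layers, as listed in the key features.
schedule = layer_schedule(27 + 9)
n_sliding = schedule.count("sliding")
n_global = schedule.count("global")

KV_BYTES_PER_TOKEN = 72 * 1024   # 72 KiB/token, from the key features
CONTEXT_TOKENS = 8 * 1024        # assumption: "8K" means 8192 tokens


def kv_cache_mib(tokens, bytes_per_token=KV_BYTES_PER_TOKEN):
    """KV-cache size in MiB for one sequence of `tokens` tokens."""
    return tokens * bytes_per_token / (1024 ** 2)


full_context_mib = kv_cache_mib(CONTEXT_TOKENS)
```

Under these assumptions the schedule reproduces the 27 + 9 split, and a single full-context sequence needs about 576 MiB of KV cache.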