Hardware to run Kimi K2.7 Code 1T (MoE)

Jun 2026 release: a code-specialized variant built on K2.6 that uses ~30% fewer thinking tokens for the same task. The architecture is unchanged: 1 T total / 32 B active MoE (384 experts, top-8 routing, plus 1 shared expert), 256 K context, Modified MIT license. Only internal Moonshot benchmarks are available so far (Kimi Code Bench v2 62.0, MCP Atlas 76.0, MCP Mark Verified 81.1); the standard SWE-Bench and TB2 fields stay empty until third-party leaderboards publish results.

Moonshot · text
Kimi K2.7 Code 1T (MoE)
1000 B params · 540 GB Q4 file · 600 GB min Q4 · 714 GB min Q5 · 1152 GB min Q8 · 256K ctx · Modified MIT
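The minimum-memory figures above can be compared with each build's usable memory to see which quantizations fit. The following is a minimal sketch of that check, not the picker's actual logic: it assumes a build fits when its usable memory is at least the listed minimum, and the function names `fitting_quants` and `headroom_gb` are hypothetical. All numbers are copied from this page.

```python
# Minimum usable memory (GB) per quantization, as listed on this page.
MIN_USABLE_GB = {"Q4": 600, "Q5": 714, "Q8": 1152}

# Usable memory (GB) per build, as listed on this page.
USABLE_GB = {
    "8x Strix Halo": 768,
    "DGX B200": 1404,
    "2x Mac Studio M3 Ultra": 960,
    "8x RTX Pro 6000": 744,
    "DGX H200": 1100,
    "12x RTX Pro 6000": 1116,
    "8x DGX Spark": 976,
    "8x H100 80 GB": 620,
}


def fitting_quants(usable_gb):
    """Return the quantizations whose listed minimum fits in usable_gb.

    Assumption: "fits" means usable memory >= the listed minimum.
    """
    return [q for q, need in MIN_USABLE_GB.items() if usable_gb >= need]


def headroom_gb(usable_gb, quant="Q4"):
    """Memory left over (GB) after the listed minimum for `quant`."""
    return usable_gb - MIN_USABLE_GB[quant]


for name, usable in USABLE_GB.items():
    quants = ", ".join(fitting_quants(usable))
    print(f"{name:24s} {usable:5d} GB usable  fits: {quants}")
```

Under that assumption, only the DGX B200 clears the Q8 minimum, and the 8× H100 build fits Q4 with 20 GB to spare.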
Cheapest

8× Strix Halo cluster (1024 GB unified)

AMD · rack of 8 mini-PCs, 10/25 GbE fabric
$23,200
tokens / sec (Q4)
235B-MoE 32 t/s
671B-MoE 8.0 t/s
1T-MoE 5.0 t/s
Memory: 1024 GB · 768 usable
Bandwidth: 256 GB/s
Idle / Active: 64 W / 960 W
Sticker: $23,200
Why: Lowest sticker that still fits Kimi K2.7 Code 1T (MoE) ($23k USD).
Fastest

DGX B200 — 8× B200 server (1.44 TB HBM3e)

NVIDIA · 10U DGX server
$475,000
tokens / sec (Q4)
235B-MoE 165 t/s
671B-MoE 105 t/s
1T-MoE 90 t/s
Memory: 1440 GB · 1404 usable
Bandwidth: 8000 GB/s
Idle / Active: 900 W / 10200 W
Sticker: $475,000
Why: Highest measured tg/s — 90 t/s on Kimi K2.7 Code 1T (MoE)-class models at Q4.
All-rounder

2× Mac Studio M3 Ultra 512 GB cluster (TB5 / MLX)

Apple · two desktops, Thunderbolt 5 RDMA
$28,400
tokens / sec (Q4)
235B-MoE 29 t/s
671B-MoE 9.0 t/s
1T-MoE 7.0 t/s
Memory: 1024 GB · 960 usable
Bandwidth: 819 GB/s
Idle / Active: 24 W / 440 W
Sticker: $28,400
Why: Top quartile across speed, value, memory headroom, and efficiency — the "buy this if unsure" pick.
Best value

8× RTX Pro 6000 Blackwell server (768 GB)

NVIDIA · 4U server (e.g. SuperMicro AS-4125GS-TNRT)
$78,000
tokens / sec (Q4)
235B-MoE 220 t/s
671B-MoE 75 t/s
1T-MoE 40 t/s
Memory: 768 GB · 744 usable
Bandwidth: 1792 GB/s
Idle / Active: 220 W / 4800 W
Sticker: $78,000
Why: Best $/tg-per-second — ~$1,950 per t/s.
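The dollars-per-token-per-second figure is the sticker price divided by the 1T-MoE Q4 generation speed. A short sketch of that arithmetic over the builds listed on this page follows; the helper name `usd_per_tps` is hypothetical, and the H100 price is the rounded $280k shown in the table, so its ratio is approximate.

```python
# (sticker price in USD, 1T-MoE tokens/sec at Q4), copied from this page.
BUILDS = {
    "8x Strix Halo": (23_200, 5.0),
    "DGX B200": (475_000, 90.0),
    "2x Mac Studio M3 Ultra": (28_400, 7.0),
    "8x RTX Pro 6000": (78_000, 40.0),
    "DGX H200": (380_000, 85.0),
    "12x RTX Pro 6000": (118_000, 55.0),
    "8x DGX Spark": (43_500, 13.0),
    "8x H100 80 GB": (280_000, 60.0),  # price listed only as "$280k"
}


def usd_per_tps(sticker_usd, tokens_per_sec):
    """Sticker price divided by generation speed (lower is better)."""
    return sticker_usd / tokens_per_sec


# Rank builds from best to worst value on this single metric.
ranked = sorted(BUILDS, key=lambda name: usd_per_tps(*BUILDS[name]))

for name in ranked:
    print(f"{name:24s} ${usd_per_tps(*BUILDS[name]):,.0f} per t/s")
```

On these numbers the 8× RTX Pro 6000 server ranks first at $1,950 per t/s, with the 12× RTX Pro 6000 rack next. The metric ignores power cost and memory headroom.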
Best CUDA

DGX H200 — 8× H200 server (1.13 TB HBM3e)

NVIDIA · 8U DGX / HGX server rack
$380,000
tokens / sec (Q4)
235B-MoE 155 t/s
671B-MoE 100 t/s
1T-MoE 85 t/s
Memory: 1128 GB · 1100 usable
Bandwidth: 4800 GB/s
Idle / Active: 700 W / 6500 W
Sticker: $380,000
Why: Strongest CUDA-only software stack among fitting builds.
Most VRAM

12× RTX Pro 6000 Blackwell rack (1152 GB)

NVIDIA · 8U server rack (multi-node, 1-2 chassis)
$118,000
tokens / sec (Q4)
235B-MoE 260 t/s
671B-MoE 95 t/s
1T-MoE 55 t/s
Memory: 1152 GB · 1116 usable
Bandwidth: 1792 GB/s
Idle / Active: 340 W / 7400 W
Sticker: $118,000
Why: 1116 GB usable — most headroom for batching and longer contexts.
Efficient

8× DGX Spark cluster (1024 GB unified, CUDA)

NVIDIA · rack of 8 desktops, 200 GbE fabric
$43,500
tokens / sec (Q4)
235B-MoE 42 t/s
671B-MoE 16 t/s
1T-MoE 13 t/s
Memory: 1024 GB · 976 usable
Bandwidth: 273 GB/s
Idle / Active: 220 W / 1840 W
Sticker: $43,500
Why: 1840 W active, the lowest active power draw among the CUDA builds that fit.

Every other build that runs Kimi K2.7 Code 1T (MoE)

1 additional build fits Kimi K2.7 Code 1T (MoE) at Q4_K_M (600 GB usable minimum), sorted by sticker price.

Build | Price | Memory (total / usable) | Bandwidth | tg/s (Q4) | Active W | 5-yr power
8× H100 80 GB server (NVIDIA · server rack) | $280k | 640 / 620 GB | 3350 GB/s | 60 t/s | 5600 W | $20k
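The page does not state the electricity rate or duty cycle behind the 5-yr power column. The sketch below shows how such a figure could be computed and what rate the listed $20k would imply; the formula, the function names, and the default of running at active draw around the clock are all assumptions, not the picker's documented method.

```python
HOURS_PER_YEAR = 24 * 365  # ignores leap days


def energy_kwh(active_w, years=5, duty=1.0):
    """Energy used over `years` at `active_w` watts.

    `duty` is the assumed fraction of time spent at active draw
    (1.0 = running flat out 24/7).
    """
    return active_w / 1000 * HOURS_PER_YEAR * years * duty


def energy_cost_usd(active_w, usd_per_kwh, years=5, duty=1.0):
    """Electricity cost for an assumed rate and duty cycle."""
    return energy_kwh(active_w, years, duty) * usd_per_kwh


def implied_usd_per_kwh(cost_usd, active_w, years=5, duty=1.0):
    """Electricity rate that would produce `cost_usd` under the assumptions."""
    return cost_usd / energy_kwh(active_w, years, duty)


# 8x H100 row from the table: 5600 W active, $20k over five years.
print(round(implied_usd_per_kwh(20_000, 5600), 4))
print(round(energy_cost_usd(5600, usd_per_kwh=0.10)))
```

At full duty the listed $20k corresponds to roughly $0.08 per kWh; at a 50% duty cycle the same total would imply about twice that rate. Because the table value is rounded, these are estimates only.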

Last updated 2026-06-13