Xiaomi MiMo V2.5 on 2x DGX Spark: 32 tok/s single-stream, 184 tok/s at 8 concurrent, 97.8/100 tool quality (measured)
MiMo V2.5 is a 310B-parameter (15B active) omnimodal MoE from Xiaomi with MIT-licensed open weights; it reached ~13% of OpenRouter's token traffic in May. We ran the NVFP4 build on our 2-node DGX Spark cluster: ~32 tok/s single-stream, 184 tok/s at 8 concurrent, 1M context, and 97.8/100 on a 69-scenario tool eval. Two honest catches inside.
#software #performance #nvidia #dgx-spark #nvfp4 #mimo