Xiaomi, MiMo, TileRT, inference optimization, FP4, speculative decoding, MoE, LLM inference, 2026
Xiaomi MiMo UltraSpeed: A 1-Trillion-Parameter Model at 1,000 Tokens Per Second
ALGERIATECH Editorial
June 29, 2026
⚡ Key Takeaways
Xiaomi’s MiMo-V2.5-Pro-UltraSpeed delivers 1,000+ tokens/sec (peak 1,200) on a 1.02T-parameter MoE model using a standard 8-GPU node...