<b>Liquid AI has released a compact MoE model designed for consumer devices</b>

LFM2.5-8B-A1B is a Mixture-of-Experts language model with 8 billion total and 1 billion active parameters, optimized for on-device inference on laptops and smartphones and for reasoning tasks.

This release expands the LFM2 series: the context window is now 128k tokens, the pre-training volume has reached 38 trillion tokens, and large-scale reinforcement learning (RL) has been applied. The tokenizer vocabulary has doubled to 128k, significantly improving tokenization efficiency for non-Latin scripts.

Liquid AI emphasizes speed and tool use: the model reaches up to 253 tokens/sec on an Apple M5 Max (using under 6 GB of RAM) and ~30 tokens/sec on a smartphone. In benchmarks, it performs comparably to models such as Gemma-4-26B.
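
As a rough sanity check on the memory figure, the weights of an MoE model occupy memory in proportion to the total parameter count, while per-token compute follows the active count. The sketch below estimates the weight footprint at a few bit widths. The bit widths are illustrative assumptions, not figures from Liquid AI, and the estimate ignores the KV cache, activations, and runtime overhead.

```python
# Back-of-the-envelope estimate of weight memory for an MoE model.
# Parameter counts come from the announcement; the bit widths below are
# assumptions chosen for illustration, not published quantization settings.

TOTAL_PARAMS = 8e9   # all experts must be resident in memory
ACTIVE_PARAMS = 1e9  # parameters actually used per token


def weight_memory_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return total_params * bits_per_weight / 8 / 1e9


# Fraction of the weights that take part in computing each token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

if __name__ == "__main__":
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        print(f"{label}: ~{weight_memory_gb(TOTAL_PARAMS, bits):.1f} GB")
    print(f"active fraction per token: {active_fraction:.3f}")
```

Under these assumptions, only a weight format of roughly 4 to 6 bits per parameter would fit 8 billion parameters under 6 GB, which suggests the quoted figure refers to a quantized build.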

Support: llama.cpp, MLX, vLLM, SGLang, and ONNX Runtime.

📌 Licensing: LFM Open License

🟡 Blog Post
🟡 Documentation
🟡 Weights