Liquid AI’s new convolution-based, multi-hybrid LLM can work on smartphones and edge devices; uses evolutionary algorithms to auto-design model backbones and optimize for latency, memory usage, and quality • DigiBanker

Liquid AI has announced “Hyena Edge,” a new convolution-based, multi-hybrid model designed for smartphones and other edge devices. Hyena Edge is engineered to outperform strong Transformer baselines on both computational efficiency and language model quality. In real-world tests on a Samsung Galaxy S24 Ultra smartphone, Hyena Edge ran up to 30% faster in both prefill and decode than its Transformer++ counterpart, with the speed advantage growing at longer sequence lengths. Unlike most small models designed for mobile deployment, Hyena Edge moves away from traditional attention-heavy designs: it replaces two-thirds of the grouped-query attention (GQA) operators with gated convolutions from the Hyena-Y family. The new architecture is the result of Liquid AI’s Synthesis of Tailored Architectures (STAR) framework, which uses evolutionary algorithms to automatically design model backbones. STAR explores a wide range of operator compositions, rooted in the mathematical theory of linear input-varying systems, and optimizes for several objectives at once, such as latency and memory usage on the target hardware alongside model quality. With mobile devices increasingly expected to run sophisticated AI workloads natively, models like Hyena Edge could set a new baseline for what edge-optimized AI can achieve.
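To make the idea of evolving a model backbone more concrete, here is a minimal toy sketch in Python. It is not STAR and does not reflect Liquid AI's implementation: the latency, memory, and quality numbers are invented for illustration, the `fitness` function collapses the objectives into a single weighted score (a simplification of true multi-objective search), and every function name is hypothetical. It only shows the general loop of scoring candidate layer layouts, keeping the best, and producing new ones by crossover and mutation.

```python
import math
import random

# Two operator types a layer can use (names follow the article).
OPERATORS = ("gqa", "gated_conv")

# Hypothetical per-layer costs, invented purely for illustration.
LATENCY = {"gqa": 3.0, "gated_conv": 1.0}
MEMORY = {"gqa": 2.0, "gated_conv": 1.0}


def quality(genome):
    """Toy quality proxy: attention layers help, with diminishing returns."""
    n_gqa = genome.count("gqa")
    n_conv = len(genome) - n_gqa
    return 0.8 * n_conv + 2.0 * math.sqrt(n_gqa)


def fitness(genome, w_latency=0.1, w_memory=0.1):
    """Weighted single score: quality minus latency and memory penalties."""
    latency = sum(LATENCY[op] for op in genome)
    memory = sum(MEMORY[op] for op in genome)
    return quality(genome) - w_latency * latency - w_memory * memory


def mutate(genome, rng):
    """Resample the operator at one randomly chosen layer."""
    i = rng.randrange(len(genome))
    child = list(genome)
    child[i] = rng.choice(OPERATORS)
    return tuple(child)


def crossover(a, b, rng):
    """One-point crossover between two parent layouts of equal length."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(num_layers=12, pop_size=20, generations=30, seed=0):
    """Return the best layout found and the best score per generation."""
    rng = random.Random(seed)
    population = [
        tuple(rng.choice(OPERATORS) for _ in range(num_layers))
        for _ in range(pop_size)
    ]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # keep the top half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    best = max(population, key=fitness)
    return best, history


if __name__ == "__main__":
    best, history = evolve()
    print("best layout:", best)
    print("gqa layers:", best.count("gqa"), "of", len(best))
```

Because the top half of each generation is carried over unchanged, the best score never decreases from one generation to the next. Which mix of operators wins here depends entirely on the made-up cost table, so the result says nothing about the real Hyena Edge layout.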
