
DigiBanker

Bringing you cutting-edge new technologies and disruptive financial innovations.


Liquid AI’s new convolution-based, multi-hybrid LLM can work on smartphones and edge devices; uses evolutionary algorithms to auto-design model backbones and optimize for latency, memory usage, and quality

April 29, 2025 //  by Finnovate

Liquid AI announced “Hyena Edge,” a new convolution-based, multi-hybrid model designed for smartphones and other edge devices. Hyena Edge is engineered to outperform strong Transformer baselines on both computational efficiency and language model quality. In real-world tests on a Samsung Galaxy S24 Ultra smartphone, Hyena Edge achieved up to 30% lower prefill and decode latencies than its Transformer++ counterpart, with the speed advantage growing at longer sequence lengths.

Unlike most small models designed for mobile deployment, Hyena Edge steps away from traditional attention-heavy designs: it replaces two-thirds of the grouped-query attention (GQA) operators with gated convolutions from the Hyena-Y family.

The new architecture is the result of Liquid AI’s Synthesis of Tailored Architectures (STAR) framework, which uses evolutionary algorithms to automatically design model backbones. STAR explores a wide range of operator compositions, rooted in the mathematical theory of linear input-varying systems, and optimizes for multiple objectives at once, such as hardware-specific latency and memory usage alongside model quality. With mobile devices increasingly expected to run sophisticated AI workloads natively, models like Hyena Edge could set a new baseline for what edge-optimized AI can achieve.
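For readers unfamiliar with the term, the sketch below shows what a gated convolution looks like in its most generic form: a short causal convolution over the sequence whose output is multiplied elementwise by a gate. This is only an illustration of the general idea, not Liquid AI’s Hyena-Y operator. The function names, the fixed kernel, and the hand-written gate values are all assumptions made for this example; in a real model the kernel and gate would be learned or computed from the input.

```python
from typing import List, Sequence


def causal_conv(x: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Causal 1-D convolution: y[t] = sum_k kernel[k] * x[t - k].

    Positions before the start of the sequence are treated as zero,
    so y[t] never depends on future inputs.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            if t - k >= 0:
                acc += w * x[t - k]
        out.append(acc)
    return out


def gated_conv(
    x: Sequence[float], gate: Sequence[float], kernel: Sequence[float]
) -> List[float]:
    """Generic gated convolution (illustrative, NOT the Hyena-Y operator).

    The gate is applied elementwise to the causal convolution output.
    Here the gate is passed in directly; that is an assumption of this sketch.
    """
    if len(x) != len(gate):
        raise ValueError("x and gate must have the same length")
    return [g * y for g, y in zip(gate, causal_conv(x, kernel))]


# Hypothetical toy inputs, chosen only to make the arithmetic easy to follow.
x = [1.0, 2.0, 3.0, 4.0]
kernel = [0.5, 0.5]          # averages the current and previous token
gate = [1.0, 0.0, 1.0, 0.5]  # 0.0 blocks a position, 1.0 passes it through

print(gated_conv(x, gate, kernel))  # [0.5, 0.0, 2.5, 1.75]
```

With a short, fixed-length kernel, each output position does a constant amount of work, whereas an attention operator compares each token against every earlier token. That difference is one plausible reason the reported speed advantage grows with sequence length, though the article does not break the measurements down by operator.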




Copyright © 2025 Finnovate Research · All Rights Reserved · Privacy Policy
