
DigiBanker

Bringing you cutting-edge new technologies and disruptive financial innovations.

Hailo Technologies’ AI accelerator runs hybrid AI pipelines that blend LLMs, vision-language models (VLMs), and other multi-modal AI with traditional convolutional neural networks (CNNs) directly on-device, eliminating the need for cloud-based inference

July 25, 2025 //  by Finnovate

Israeli chipmaker Hailo Technologies has released the Hailo-10H, the first discrete AI accelerator designed for generative AI workloads at the edge. The device runs large language models (LLMs), vision-language models (VLMs), and other multi-modal AI directly on-device, eliminating the need for cloud-based inference. The Hailo-10H is built for power efficiency and low latency: it reaches first-token generation in under one second and sustains 10 tokens per second on 2-billion-parameter LLMs. It can also generate images with Stable Diffusion 2.1 in under five seconds, a notable step forward for offline generative workloads.

The chip is built on Hailo’s second-generation neural core architecture and delivers 40 tera-operations per second (TOPS) at INT4 precision and 20 TOPS at INT8, at a typical power draw of 2.5 W. It is fully compatible with TensorFlow, PyTorch, ONNX, and Keras, and is supported by Hailo’s mature software stack. The device is designed to work in hybrid AI pipelines that blend LLMs or VLMs with traditional convolutional neural networks (CNNs), which conserves power and keeps mission-critical applications such as video analytics responsive in real time.
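To put the quoted figures in perspective, the following is a rough back-of-the-envelope sketch in Python. It uses only the numbers given above (under one second to first token, 10 tokens per second, 40 and 20 TOPS, 2.5 W typical). The function names are illustrative, and the energy-per-token estimate assumes the typical 2.5 W draw applies during LLM generation, which the article does not state.

```python
# Back-of-the-envelope estimates from the figures quoted in the article.
# "Under one second" is treated as an upper bound of 1.0 s, so the
# response times below are approximate worst cases, not measurements.

TTFT_S = 1.0           # time to first token (upper bound from the article)
TOKENS_PER_S = 10.0    # sustained rate on a 2-billion-parameter LLM
INT4_TOPS = 40.0       # quoted INT4 throughput
INT8_TOPS = 20.0       # quoted INT8 throughput
TYPICAL_POWER_W = 2.5  # quoted typical power draw


def response_time_s(num_tokens: int) -> float:
    """Approximate time to stream num_tokens tokens."""
    if num_tokens <= 0:
        return 0.0
    # The first token arrives after TTFT_S; the rest follow at the sustained rate.
    return TTFT_S + (num_tokens - 1) / TOKENS_PER_S


def tops_per_watt(tops: float, power_w: float = TYPICAL_POWER_W) -> float:
    """Compute efficiency implied by a throughput figure and a power draw."""
    return tops / power_w


def energy_per_token_j(power_w: float = TYPICAL_POWER_W,
                       tokens_per_s: float = TOKENS_PER_S) -> float:
    """Joules per generated token, ASSUMING power_w holds during generation."""
    return power_w / tokens_per_s


if __name__ == "__main__":
    print(f"101-token reply: about {response_time_s(101):.1f} s")
    print(f"INT4 efficiency: {tops_per_watt(INT4_TOPS):.1f} TOPS/W")
    print(f"INT8 efficiency: {tops_per_watt(INT8_TOPS):.1f} TOPS/W")
    print(f"Energy per token: about {energy_per_token_j():.2f} J")
```

Under these assumptions, a reply of roughly 100 tokens would take on the order of 11 seconds, and the quoted throughput works out to 16 TOPS per watt at INT4 and 8 TOPS per watt at INT8.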


Category: AI & Machine Economy, Innovation Topics


Copyright © 2025 Finnovate Research · All Rights Reserved · Privacy Policy
Finnovate Research · Knyvett House · Watermans Business Park · The Causeway Staines · TW18 3BA · United Kingdom · About · Contact Us · Tel: +44-20-3070-0188
