
DigiBanker

Bringing you cutting-edge new technologies and disruptive financial innovations.


Hailo Technologies’ AI accelerator runs hybrid AI pipelines that blend LLMs, vision-language models (VLMs), and other multi-modal AI with traditional convolutional neural networks (CNNs) directly on-device, eliminating the need for cloud-based inference

July 25, 2025 //  by Finnovate

Israeli chipmaker Hailo Technologies has released the Hailo-10H, billed as the first discrete AI accelerator designed for generative AI workloads at the edge. The device runs large language models (LLMs), vision-language models (VLMs), and other multi-modal AI directly on-device, eliminating the need for cloud-based inference. The Hailo-10H is pitched on power efficiency and low latency: it generates the first token in under one second and sustains 10 tokens per second on 2-billion-parameter LLMs. It can also generate images with Stable Diffusion 2.1 in under five seconds, all without a connection to the cloud.

The chip is built on Hailo’s second-generation neural core architecture, providing 40 tera-operations per second (TOPS) of INT4 performance and 20 TOPS of INT8 at a typical power draw of 2.5 W. It is compatible with TensorFlow, PyTorch, ONNX, and Keras, and is supported by Hailo’s mature software stack. The device is also designed for hybrid AI pipelines that blend LLMs or VLMs with traditional convolutional neural networks (CNNs), which conserves power and keeps mission-critical applications such as video analytics responsive in real time.
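To put the quoted figures in perspective, the short Python sketch below derives two back-of-the-envelope numbers from them: nominal efficiency in TOPS per watt, and a rough time to generate a reply of a given length. The function names are illustrative, and two simplifications are assumptions rather than anything Hailo has stated: the peak TOPS figures are divided by the typical power draw, and first-token latency is taken as a full second even though the article says "under one second".

```python
# Figures quoted in the article; everything derived from them is a rough estimate.
INT4_TOPS = 40.0             # tera-operations per second at INT4
INT8_TOPS = 20.0             # tera-operations per second at INT8
TYPICAL_POWER_W = 2.5        # typical power draw in watts
TOKENS_PER_SECOND = 10.0     # sustained rate on 2-billion-parameter LLMs
FIRST_TOKEN_LATENCY_S = 1.0  # assumption: "under one second" rounded up to 1 s


def tops_per_watt(tops: float, power_w: float) -> float:
    """Nominal efficiency: quoted TOPS divided by quoted typical power."""
    return tops / power_w


def estimate_response_time_s(num_tokens: int) -> float:
    """Rough time to produce num_tokens: first-token latency plus the
    remaining tokens at the sustained generation rate."""
    if num_tokens < 1:
        raise ValueError("num_tokens must be at least 1")
    return FIRST_TOKEN_LATENCY_S + (num_tokens - 1) / TOKENS_PER_SECOND


if __name__ == "__main__":
    print(f"INT4: {tops_per_watt(INT4_TOPS, TYPICAL_POWER_W):.1f} TOPS/W")
    print(f"INT8: {tops_per_watt(INT8_TOPS, TYPICAL_POWER_W):.1f} TOPS/W")
    print(f"100-token reply: about {estimate_response_time_s(100):.1f} s")
```

Under these assumptions the chip works out to roughly 16 TOPS/W at INT4 and 8 TOPS/W at INT8, and a 100-token reply would take about 11 seconds. Real-world numbers will depend on the model, prompt length, and how close actual power draw is to the typical figure.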


Category: AI & Machine Economy, Innovation Topics


Copyright © 2025 Finnovate Research · All Rights Reserved · Privacy Policy
Finnovate Research · Knyvett House · Watermans Business Park · The Causeway Staines · TW18 3BA · United Kingdom · About · Contact Us · Tel: +44-20-3070-0188
