
DigiBanker

Bringing you cutting-edge new technologies and disruptive financial innovations.


Nvidia delivers state-of-the-art SLM performance with hybrid selective state modeling, reasoning toggle, and no commercial license fees for scalable real-world AI applications.

August 19, 2025 // by Finnovate

Nvidia has released Nemotron-Nano-9B-v2, a small language model (SLM) that leads its class in benchmark performance and includes a toggle for AI “reasoning” (self-checking before answering). The model was pruned from 12B to 9B parameters so it fits on a single Nvidia A10 GPU, a popular deployment choice, according to Oleksii Kuchiaev, Nvidia’s Director of AI Model Post-Training. As a hybrid model, it supports larger batch sizes and runs up to 6× faster than similarly sized transformer models.

Nemotron-Nano-9B-v2 is based on Nemotron-H, which uses a hybrid Mamba-Transformer architecture developed with input from Carnegie Mellon and Princeton researchers. The Mamba architecture integrates selective state space models (SSMs) that handle long sequences efficiently by maintaining state. The hybrid design reduces compute costs by replacing most attention layers with linear-time state space layers, achieving 2–3× higher throughput on long contexts at similar accuracy.

Nemotron-Nano-9B-v2 is a unified, text-only chat and reasoning model, trained from scratch, that by default generates a reasoning trace before its final answer. Users control reasoning behavior with tokens such as /think or /no_think. The model also introduces a “thinking budget” that lets developers cap internal reasoning tokens to balance accuracy against latency, well suited to customer support and autonomous agents.

Benchmark results are strong: 72.1% on AIME25, 97.8% on MATH500, 64.0% on GPQA, and 71.1% on LiveCodeBench. Nemotron-Nano-9B-v2 outperforms Qwen3-8B, a common comparison model in the open SLM category.

On licensing, the model can be put into production immediately without negotiating a separate commercial license or paying fees tied to usage thresholds, revenue levels, or user counts. Unlike some tiered open licenses used by other providers, there are no clauses requiring a paid license once a company reaches a certain scale.
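The linear-time advantage of state space layers comes from a simple recurrence: each token updates a fixed-size hidden state instead of attending to every prior token, so cost grows linearly with sequence length. A toy scalar sketch follows; this is an illustration of the general SSM recurrence, not Mamba's actual selective SSM, which uses input-dependent parameters and a hardware-efficient parallel scan.

```python
def ssm_scan(inputs, a, b, c):
    """Toy linear state space recurrence:
    h_t = a * h_{t-1} + b * x_t,   y_t = c * h_t.
    One fixed-size state update per token -> O(n) in sequence length,
    versus O(n^2) pairwise comparisons for full attention.
    """
    h, ys = 0.0, []
    for x in inputs:
        h = a * h + b * x   # carry compressed history forward in the state
        ys.append(c * h)    # read out from the state
    return ys

# An impulse input shows the state decaying geometrically (a = 0.5):
print(ssm_scan([1.0, 0.0, 0.0], a=0.5, b=1.0, c=2.0))  # [2.0, 1.0, 0.5]
```

Because the per-token work is constant, doubling the context length only doubles the compute, which is the property the article's 2–3× long-context throughput claim rests on.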
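The reasoning toggle and thinking budget can be sketched in a few lines. This is a minimal illustration under two assumptions: that the /think and /no_think control tokens are prepended to the prompt, and that the budget is enforced by capping the number of reasoning tokens kept before the final answer. The helper names are hypothetical, not Nvidia's API.

```python
def build_prompt(user_msg: str, reasoning: bool = True) -> str:
    # Control token selects whether the model emits a reasoning trace
    # before its final answer ("/think") or answers directly ("/no_think").
    control = "/think" if reasoning else "/no_think"
    return f"{control}\n{user_msg}"

def apply_thinking_budget(reasoning_tokens: list, budget: int) -> list:
    # Cap internal reasoning tokens to trade accuracy for latency:
    # tokens beyond the budget are dropped before the answer step.
    return reasoning_tokens[:budget]

prompt = build_prompt("What is 17 * 23?", reasoning=True)
trace = ["First", "multiply", "17", "by", "20", "then", "add", "17*3", "..."]
capped = apply_thinking_budget(trace, budget=6)
print(prompt.splitlines()[0])  # /think
print(len(capped))             # 6
```

A latency-sensitive agent would set a small budget (or /no_think) for routine turns and raise it for hard queries, which is the accuracy/latency trade-off the article describes.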


Category: AI & Machine Economy, Innovation Topics


Copyright © 2025 Finnovate Research · All Rights Reserved · Privacy Policy
Finnovate Research · Knyvett House · Watermans Business Park · The Causeway Staines · TW18 3BA · United Kingdom · About · Contact Us · Tel: +44-20-3070-0188
