
DigiBanker

Bringing you cutting-edge new technologies and disruptive financial innovations.


Startup Rime’s text-to-speech model can quickly generate “infinite” new voices of varying genders, ages, demographics and languages based only on a simple text description of the intended characteristics, while incorporating social context, individual speech habits and non-verbal conversational cues

June 10, 2025 //  by Finnovate

Startup Rime has built Arcana text-to-speech (TTS), a new spoken language model that can quickly generate “infinite” new voices of varying genders, ages, demographics and languages based only on a simple text description of the intended characteristics. The model has helped boost sales by 15% for customers such as Domino’s and Wingstop. Rime’s multimodal, autoregressive TTS model was trained on natural conversations with real people (as opposed to voice actors). Users simply type in a text prompt describing a voice with the desired demographic characteristics and language.

Rime’s Mist v2 TTS model was built for high-volume, business-critical applications, allowing enterprises to craft unique voices for their business needs. “The customer hears a voice that allows for a natural, dynamic conversation without needing a human agent,” said Lily Clifford, Rime CEO and co-founder.

Rime’s model generates audio tokens that are decoded into speech using a codec-based approach, which Rime says provides for “faster-than-real-time synthesis.” Rime’s data incorporates sociolinguistic conversation techniques (factoring in social context like class, gender and location), idiolect (individual speech habits) and paralinguistic nuances (non-verbal aspects of communication that go along with speech).

Rime intends to give customers the ability to find the voices that work best for their application. It built a “personalization harness” tool that lets users A/B test various voices. After a given interaction, the API reports back to Rime, which provides an analytics dashboard identifying the best-performing voices based on success metrics.

Another KPI customers are optimizing for is the caller’s willingness to talk to the AI. They’ve found that, after a switch to Rime, callers are 4x more likely to talk to the bot.
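The A/B testing loop behind the “personalization harness” can be illustrated with a minimal Python sketch. This is not Rime’s API or implementation: the `VoiceExperiment` class, the round-robin assignment, the voice names and the single true/false success signal are all assumptions made for illustration. It only shows the general idea of assigning a voice per interaction, reporting the outcome, and ranking voices by success rate.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class VoiceStats:
    """Running tally of interactions for one voice."""
    calls: int = 0
    successes: int = 0

    @property
    def success_rate(self) -> float:
        # Voices with no calls yet are scored 0.0 rather than dividing by zero.
        return self.successes / self.calls if self.calls else 0.0


class VoiceExperiment:
    """Hypothetical tracker: assigns voices round-robin and tallies outcomes."""

    def __init__(self, voices):
        if not voices:
            raise ValueError("at least one voice is required")
        self.voices = list(voices)
        self.stats = defaultdict(VoiceStats)
        self._next = 0

    def assign_voice(self) -> str:
        # Round-robin keeps the number of calls per voice balanced.
        voice = self.voices[self._next % len(self.voices)]
        self._next += 1
        return voice

    def report(self, voice: str, success: bool) -> None:
        # Called after each interaction with the (assumed) success signal,
        # e.g. whether the caller completed an order.
        stats = self.stats[voice]
        stats.calls += 1
        stats.successes += int(success)

    def best_voice(self) -> str:
        # Highest success rate wins; ties go to the voice listed first.
        return max(self.voices, key=lambda v: self.stats[v].success_rate)


# Usage with made-up voice names and outcomes.
exp = VoiceExperiment(["voice_a", "voice_b"])
for outcome in [True, False, True, True]:
    voice = exp.assign_voice()
    exp.report(voice, outcome)

print(exp.best_voice())  # prints "voice_a" (2 of 2 successes vs. 1 of 2)
```

A production system would likely randomize assignment and check statistical significance before declaring a winner; the sketch leaves both out for brevity.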


Category: Channels, Innovation Topics


