A three-way partnership between AI phone support company Phonely, inference optimization platform Maitai, and chip maker Groq has achieved a breakthrough that addresses one of conversational AI’s most persistent problems: the awkward delays that immediately signal to callers they’re talking to a machine. The collaboration has enabled Phonely to reduce response times by more than 70% while boosting accuracy from 81.5% to 99.2% across four model iterations, surpassing GPT-4o’s 94.7% benchmark by 4.5 percentage points.

The improvements stem from Groq’s new capability to instantly switch between multiple specialized AI models without added latency, orchestrated through Maitai’s optimization platform. The system works by collecting performance data from every interaction, identifying weak points, and iteratively improving the models without customer intervention.

“Since Maitai sits in the middle of the inference flow, we collect strong signals identifying where models underperform,” Maitai founder Christian DalSanto explained. “These ‘soft spots’ are clustered, labeled, and incrementally fine-tuned to address specific weaknesses without causing regressions.”

The performance gains translate directly to business outcomes. “One of our biggest customers saw a 32% increase in qualified leads as compared to a previous version using previous state-of-the-art models,” Will Bodewes, Phonely’s founder and CEO, noted. For call centers and customer service operations, the implications could be transformative: one of Phonely’s customers is replacing 350 human agents this month alone.
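
The feedback loop DalSanto describes, in which interactions are logged and underperforming areas are identified, can be illustrated with a rough sketch. This is not Maitai's implementation: the `Interaction` record, the `find_soft_spots` function, the intent labels, and the thresholds are all hypothetical, chosen only to show the idea of grouping logged calls and flagging the groups where accuracy lags.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Interaction:
    intent: str    # hypothetical label for the kind of caller request
    correct: bool  # whether the model's response was judged correct


def find_soft_spots(interactions, min_count=2, max_accuracy=0.9):
    """Return {intent: accuracy} for intents seen at least min_count times
    whose observed accuracy is below max_accuracy (assumed thresholds)."""
    totals = Counter(i.intent for i in interactions)
    hits = Counter(i.intent for i in interactions if i.correct)
    soft = {}
    for intent, n in totals.items():
        accuracy = hits[intent] / n
        if n >= min_count and accuracy < max_accuracy:
            soft[intent] = accuracy
    return soft


# Made-up log entries, for illustration only.
log = [
    Interaction("booking", True),
    Interaction("booking", True),
    Interaction("refund", False),
    Interaction("refund", True),
    Interaction("hours", True),
]

soft_spots = find_soft_spots(log)
print(soft_spots)
```

In a real pipeline, the flagged groups would presumably feed a fine-tuning step followed by regression checks before a new model version is served; the sketch stops at identification.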