LMArena, the company behind AI testing service Chatbot Arena, has raised $100 million in initial funding, marking one of the largest seed rounds in the AI sector to date. LMArena operates as a neutral benchmarking platform that enables users to compare large language models through head-to-head matchups. Users submit prompts, evaluate anonymous responses from different models and select the best reply. The result is a crowdsourced comparison method and unbiased rankings that reflect real-world user preferences. By not favoring any specific company or model, the platform has attracted participation from nearly every major company and lab that is developing large language models, giving it industry-wide relevance and legitimacy. The company’s platform has become the main way, and arguably one of the best, for both researchers and commercial AI developers to compare models. Major AI companies, including OpenAI, Google LLC and Anthropic PBC, submit their models to LMArena to showcase performance and gather community feedback. LMArena’s ability to generate detailed performance comparisons without the need for direct integration into third-party systems makes it highly scalable.
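To make the head-to-head idea concrete, the following is a minimal sketch of how pairwise votes could be turned into a ranking. It assumes an Elo-style update, which is one common choice for this kind of data; the article does not say which rating method LMArena uses, and the model names, the `k` factor and the starting rating are hypothetical.

```python
from collections import defaultdict


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def rank_models(votes, k: float = 32.0, initial: float = 1000.0):
    """Turn (winner, loser) votes into a leaderboard sorted by rating.

    Assumption: a plain Elo update, not LMArena's actual method.
    """
    ratings = defaultdict(lambda: initial)
    for winner, loser in votes:
        # The less expected the win, the larger the rating transfer.
        delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
        ratings[winner] += delta
        ratings[loser] -= delta
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


# Hypothetical votes: each tuple is one user picking the better anonymous reply.
votes = [("model_a", "model_b"), ("model_a", "model_c"), ("model_b", "model_c")]
leaderboard = rank_models(votes)
for name, rating in leaderboard:
    print(f"{name}: {rating:.1f}")
```

Because every vote moves the same number of points from loser to winner, the total rating stays constant, so only relative preference shapes the leaderboard.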
Anthropic’s new AI models can use tools in parallel, extract and save key facts from local files, operate in two modes including near-instant responses and extended thinking and can maintain full context to sustain focus on longer projects
Anthropic has introduced the next generation of its artificial intelligence (AI) models, Claude Opus 4 and Claude Sonnet 4. “These models advance our customers’ AI strategies across the board: Opus 4 pushes boundaries in coding, research, writing and scientific discovery, while Sonnet 4 brings frontier performance to everyday use cases as an instant upgrade from Sonnet 3.7,” the company said. The company said Claude Opus 4 is its most powerful model yet and “the world’s best coding model,” adding that it delivers sustained performance on complex, long-running tasks and agent workflows. Claude Sonnet 4 balances performance and efficiency. It provides a significant upgrade to its predecessor, Claude Sonnet 3.7, and offers superior coding and reasoning while responding more precisely to user instructions. Both models can use web search and other tools during extended thinking, use tools in parallel, and extract and save key facts from local files, per the announcement. In addition, both models offer two modes: near-instant responses and extended thinking. Anthropic said the models are a large step toward the virtual collaborator: maintaining full context, sustaining focus on longer projects, and driving transformational impact.
IBM’s multi-agent orchestration framework blends pre-built, domain-specific agents into existing systems without replacing entire software stacks with AI-native applications
IBM Corp. is focused on developing AI agents that execute across systems rather than merely assist at the edges. These agents are designed to integrate with legacy and modern tools, orchestrating processes across the full sprawl of enterprise infrastructure, according to Ritika Gunnar, general manager for data and artificial intelligence at IBM. Instead of replacing entire software stacks with AI-native applications, IBM blends agentic functionality into existing systems. That strategy includes leveraging fixed workflows, enabling agent-based enhancements and allowing customers to scale into full orchestration when needed, according to Gunnar. To help enterprises get started, IBM has unveiled a lineup of prebuilt AI agents in areas such as human resources, sales and procurement, with more planned in customer care and finance, according to Gunnar. These domain-specific agents can be customized, integrated and orchestrated using IBM’s frameworks. “[We have] a new interaction paradigm to work across this multi-agent orchestration framework, across all those systems, whether those be agents, tools or anything else underneath that. It is about [being] open … hybrid … because we know agents are going to run everywhere. Your systems are going to exist in many different forms, in agentic and non-agentic.” The agentic strategy converges with IBM’s push to unlock unstructured data. IBM’s watsonx offerings aim to bridge IT and business needs by enabling users to build intelligent AI agents grounded in enterprise data, according to Gunnar. “We believe that we’re going to see an explosion of the 90% of unstructured data that today has been untapped; you’re untapping a whole new set of intelligence that’s now available.”
Mistral AI’s API integrates server-side conversation management, a Python code interpreter, web search, image generation and document retrieval capabilities to enable building fully autonomous AI agents
Mistral AI, a rival to OpenAI, Anthropic PBC, Google LLC and others, has jumped into agentic AI development with the launch of a new API. The new Agents API equips developers with powerful tools for building sophisticated AI agents based on Mistral AI’s LLMs, which can autonomously plan and carry out complex, multistep tasks using external tools. Among its features, the API integrates server-side conversation management, a Python-based code interpreter, web search, image generation and document retrieval capabilities. It also supports AI agent orchestration, and it’s compatible with the emerging Model Context Protocol that aims to standardize the way agents interact with other applications. With its API, Mistral AI is keeping pace with the likes of OpenAI and Anthropic, which are also laser-focused on enabling the emergence of AI agents that can perform tasks on behalf of humans with minimal supervision, in an effort to turbocharge business automation. The API boasts dozens of useful “connectors” that should make it simpler to build some very capable AI agents. For instance, the Python Code Interpreter provides a way for agents to execute Python code in a secure, sandboxed environment, while the image generation tool powered by Black Forest Labs Inc.’s FLUX1.1 [pro] Ultra model means they’ll have powerful picture-generating capabilities. A premium version of web search provides access to a standard search engine, plus the Agence France-Presse and the Associated Press news agencies, so AI agents will be able to access up-to-date information about the real world. Other features include a document library that uses hosted retrieval-augmented generation from user-uploaded documents. In other words, Mistral’s AI agents will be able to read external documents and perform actions with them. The API also includes an “agent handoffs” mechanism that allows multiple agents to work together. One agent will be able to delegate a task to another, more specialized agent.
According to Mistral, the result will be a “seamless chain of actions,” with a single request able to trigger multiple agents into action so they can collaborate on complex tasks. The Agents API supports “stateful conversations” too, which means agents are able to maintain context over time by remembering the user’s earlier inputs.
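The handoff and stateful-conversation ideas can be illustrated with a small, self-contained sketch. This is not Mistral's Agents API: the `Agent` and `Conversation` classes, the handler functions and the keyword-based routing rule are all hypothetical stand-ins for what a model-driven agent would decide.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# A handler returns (reply, name of the agent to hand off to, or None to stop).
Handler = Callable[[str, List[str]], Tuple[str, Optional[str]]]


@dataclass
class Agent:
    name: str
    handler: Handler


@dataclass
class Conversation:
    """Keeps history across calls, so later turns can see earlier inputs."""
    agents: Dict[str, Agent]
    history: List[str] = field(default_factory=list)

    def send(self, entry_agent: str, user_input: str, max_hops: int = 5) -> str:
        self.history.append(f"user: {user_input}")
        current, reply = entry_agent, ""
        for _ in range(max_hops):  # bound the chain to avoid handoff loops
            reply, next_agent = self.agents[current].handler(user_input, self.history)
            self.history.append(f"{current}: {reply}")
            if next_agent is None:
                break
            current = next_agent
        return reply


def triage(user_input: str, history: List[str]) -> Tuple[str, Optional[str]]:
    # Hypothetical routing rule: chart requests go to the specialist.
    if "chart" in user_input.lower():
        return "handing off to coder", "coder"
    return "answered directly", None


def coder(user_input: str, history: List[str]) -> Tuple[str, Optional[str]]:
    user_turns = sum(1 for line in history if line.startswith("user: "))
    return f"coder handled request (user turns so far: {user_turns})", None


convo = Conversation(agents={"triage": Agent("triage", triage), "coder": Agent("coder", coder)})
first = convo.send("triage", "What is RAG?")
second = convo.send("triage", "Draw a chart of sales")
print(first)
print(second)
```

A single `send` call can pass through more than one agent, and the shared `history` is what lets the second request see that an earlier user turn exists.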
Chance AI’s visual reasoning AI model provides detailed history, context, and related information of any object along with step-by-step visual logic and conversational insight to explain how it discovers and interprets new information
Chance AI, the multi-agent visual AI for explorers, artists, and creatives, announced its most substantial model upgrade to date. Available on iOS and coming soon to Android, Chance AI’s latest release introduces real-time visual reasoning, support for 17 languages, and voice playback, which the company says make its visual AI more intuitive, helpful, and accessible. Users take a photo, and Chance AI provides history, context, and related information. It can uncover the story behind historic landmarks or art pieces, identify plants or objects, or tell users more about books, games, movies, and more. Chance AI is currently a free download with no ads or shopping links. The latest update brings real-time visual reasoning to Chance AI, allowing the model not just to identify what it sees but to explain how it discovers and interprets new information through step-by-step visual logic, like a thoughtful human observer. Whether it’s analyzing art, decoding design, or understanding the natural world, Chance now provides rich, conversational insight into visual intelligence. With this release, the company claims, Chance AI becomes the first true visual reasoning model, offering an unprecedented level of transparency and outperforming competitors in accuracy and contextual depth. The update also introduces audio output, so users can choose to read or listen to Chance AI’s responses.
Fabrix.ai’s platform offers automated network observability including guardrails, Model Context Protocol and agent-to-agent interfaces to streamline repetitive, time-consuming IT operations use cases
Fabrix.ai Inc., previously known as CloudFabrix, delivers a purpose-built agentic AI operational intelligence platform that enables enterprise users to streamline IT operations use cases, make better decisions more quickly and accelerate digital transformation. Fabrix.ai’s intelligent agents take over repetitive, time-consuming operational workloads for its enterprise customers, delivering increased agility and cost efficiency. There are three components to the Fabrix.ai operational platform: agentic AI, a generative AI copilot and Cisco-specific solutions. The company views its platform as having a unique capability to focus on automation, particularly in network observability. Running a network tends to be more stochastic than deterministic, so providing enterprises and service providers a solution requires additional building blocks, including guardrails, Model Context Protocol or agent-to-agent interfaces, and Fabrix.ai has built those. While Fabrix.ai continues to work closely with Cisco and telcos, the company is also branching out to serve customers in other areas, including AI security. One of the biggest differentiators for Fabrix.ai is the ability to work with real-time data. Fabrix.ai leverages many of the common building blocks, but the platform is purpose-built for IT ops use cases rather than trying to modify a generic AI model. Its focus on handling real-time information has enabled it to get traction in key verticals, especially telco. Fabrix.ai also leverages its growing partner ecosystem to bring its capabilities to more enterprise customers. The company can use whatever data platform a customer has, including Splunk, Elastic, OpenSearch, MinIO, HP or a data lake, because it has partnerships with many of the data platforms and its data abstraction layer can read directly from them.
Mistral AI’s agent framework combines its Medium 3 language model with persistent memory, tool integration and orchestration capabilities that allow maintaining context across conversations
Mistral AI released a comprehensive agent development platform that enables enterprises to build autonomous AI systems capable of executing complex, multi-step business processes. Mistral’s approach combines its Medium 3 language model with persistent memory, tool integration and orchestration capabilities that allow AI systems to maintain context across conversations while executing tasks like code analysis, document processing and web research. The timing suggests coordinated market movement toward standardized agent development frameworks. All the major agent development platforms now support the Model Context Protocol, an open standard created by Anthropic that enables agents to connect with external applications and data sources. This convergence indicates that the industry recognizes agent interoperability as a key determinant of long-term platform viability. Mistral’s approach differs from competitors in its emphasis on enterprise deployment flexibility. The company offers hybrid and on-premises installation options using as few as four GPUs, addressing data sovereignty concerns that prevent many organizations from adopting cloud-based AI services.
LandingAI’s agentic vision tech uses an iterative workflow to accurately extract a document’s text, diagrams, charts and form fields to produce an LLM-ready output
LandingAI, a pioneer in agentic vision technologies, announced major upgrades to Agentic Document Extraction (ADE). Unlike traditional optical character recognition (OCR), ADE sees a PDF or other document visually, and uses an iterative workflow to accurately extract a document’s text, diagrams, charts, form fields, and so on to produce an LLM-ready output. ADE utilizes layout-aware parsing, visual grounding, and no-template setup, allowing for quick deployment and dependable outcomes without the need for fine-tuning or model training. A leading healthcare platform provider, Eolas Medical, is processing over 100,000 clinical guidelines in the form of PDFs and complex documents with ADE, streamlining the creation of structured summaries with a view to supporting over 1.2 million queries per month from healthcare professionals on its platform. Its QA chatbot, powered by ADE, provides answers with direct references to the original documents, improving information traceability and reliability. In financial services, ADE is being used to automate document onboarding for use cases like Know Your Customer (KYC), mortgage and loan processing, and client due diligence. Visual grounding enables full auditability by linking extracted data directly to its source location in the document.
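As a rough illustration of what linking extracted data to its source location can look like, the sketch below pairs each extracted chunk with a page number and a bounding box. The `SourceLocation` and `GroundedChunk` types, the normalized coordinates and the sample values are hypothetical; they are not ADE's actual output schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SourceLocation:
    """Where a chunk sits in the original document (assumed normalized 0-1 box)."""
    page: int
    left: float
    top: float
    right: float
    bottom: float


@dataclass(frozen=True)
class GroundedChunk:
    chunk_type: str  # e.g. "text", "chart", "form_field"
    text: str
    location: SourceLocation


def find_with_citations(chunks: List[GroundedChunk], term: str) -> List[str]:
    """Return matching chunk text together with its source location."""
    results = []
    for chunk in chunks:
        if term.lower() in chunk.text.lower():
            loc = chunk.location
            box = f"({loc.left}, {loc.top}, {loc.right}, {loc.bottom})"
            results.append(f"{chunk.text} [page {loc.page}, box {box}]")
    return results


# Made-up sample data standing in for an extraction result.
chunks = [
    GroundedChunk("text", "Applicant name: Jane Doe", SourceLocation(1, 0.1, 0.2, 0.6, 0.25)),
    GroundedChunk("form_field", "Loan amount: 250000", SourceLocation(2, 0.1, 0.4, 0.5, 0.45)),
]
citations = find_with_citations(chunks, "loan amount")
print(citations)
```

Keeping the location next to every extracted value is what allows an auditor, or a chatbot answer, to point back to the exact region of the page a figure came from.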
Snyk launches real-time governance and adaptive policy enforcement, crucial for managing evolving risks in AI-driven software development
Cybersecurity company Snyk launched the Snyk AI Trust Platform, an AI-native agentic platform designed to empower organizations to accelerate AI-driven innovation, mitigate business risk and secure agentic and generative AI. The platform introduces several innovations, including Snyk Assist, an AI-powered chat interface offering contextual guidance, next-step recommendations and security intelligence. Another feature called Snyk Agent further extends these capabilities by automating fixes and security actions throughout the development lifecycle, leveraging its testing engines. Other parts of the offering include Snyk Guard, which provides real-time governance and adaptive policy enforcement, crucial for managing evolving AI risks. Complementing these capabilities is the Snyk AI Readiness Framework, which helps organizations assess and mature their secure AI development strategies over time. Also launching from Snyk are two new platform-supporting curated AI Trust environments. Snyk Labs is an innovation hub for researching, experimenting with and incubating the future of AI security, while Snyk Studio allows technology partners to collaborate with Snyk experts to build secure AI-native applications for mutual customers. With Snyk Studio, developers and technology providers can collaborate with its security experts to embed critical security context and controls into their AI-generated code and AI-powered workflows.
Mistral AI’s ‘plug and play’ platform offers built-in connectors to run Python code, create custom visuals, access documents stored in cloud and retrieve information from web for easy customization of AI agents
French AI startup Mistral AI is introducing its Agents API, a “plug and play” platform that enables third-party software developers to quickly add autonomous generative AI capabilities to their existing applications. The API uses Mistral’s proprietary Medium 3 model as the “brains” of each agent, allowing for easy customization and integration of AI agents into enterprise and developer workflows. The API complements Mistral’s existing Chat Completion API and focuses on agentic orchestration, built-in connectors, persistent memory, and the ability to coordinate multiple AI agents to tackle complex tasks. This approach aims to overcome the limitations of traditional language models. The Agents API comes equipped with several built-in connectors. Code Execution securely runs Python code, enabling applications in data visualization, scientific computing and other technical tasks. Image Generation leverages Black Forest Labs’ FLUX1.1 [pro] Ultra to create custom visuals for marketing, education or artistic uses. Document Library accesses documents stored in Mistral Cloud, enhancing retrieval-augmented generation (RAG) features. Web Search allows agents to retrieve up-to-date information from online sources, news outlets and other reputable platforms.