NTT DATA and Mistral AI are partnering to jointly sell and deploy enterprise-grade AI solutions that foster strategic autonomy for clients. The companies will combine NTT DATA’s GenAI and IT services portfolio, global delivery capabilities, industry expertise, and trusted client relationships with Mistral AI’s solutions and advanced generative AI models, known for their efficiency, performance, and enterprise empowerment. Initial focus areas include:

- Sustainable and Secure AI Co-Development: The companies will develop sustainable, highly secure private AI solutions for organizations in regulated sectors such as financial services, insurance, defense, and the public sector, providing end-to-end solutions, from infrastructure to AI-powered business processes, for clients operating on private clouds.
- AI-Driven Innovation for IT Infrastructure & Customer Experience: The companies will pioneer the integration of Mistral AI technologies into NTT DATA’s customer experience platforms, beginning with agentic AI call center solutions in Europe and Asia Pacific. Joint projects could include co-development of LLMs for specific languages, furthering AI innovation tailored to local markets and specialized needs.
- Go-To-Market Expansion: NTT DATA and Mistral AI will jointly develop and execute regional go-to-market strategies tailored to the unique dynamics of countries including France, Luxembourg, Spain, Singapore, and Australia. End-to-end AI services will range from use-case development and customization to implementation, support, and managed services. Dedicated sales teams will be assigned to address key client needs and priorities.
CIOs can adopt a four-stage framework for scaling agentic capabilities, starting with information retrieval agents that fetch data and progressing through simple orchestration in a single domain, complex orchestration with multi-step workflows, and multi-agent orchestration across domains and systems
CIOs need a practical framework that recognizes where they are and helps them scale agentic capabilities responsibly and effectively over time.

Level 1: Information retrieval agents. Most organizations will begin here. Agents fetch data and return insights. For example, one global brokerage handles 54,000 calls daily, and advisers used to spend 90 minutes prepping for each; with agents, prep now takes minutes. Results like this build momentum and prove that agentic AI can deliver value today.

Level 2: Simple orchestration, single domain. Agents begin to act: updating records, coordinating tasks, scheduling services. This stage expands both technical complexity and organizational trust.

Level 3: Complex orchestration, multiple domains. Agents manage multistep workflows and decision trees. What qualifies as “complex” depends on your industry and risk tolerance, but the concept is the same: greater autonomy within well-defined boundaries.

Level 4: Multi-agent orchestration. Agents collaborate across domains, systems, even organizations. Think supply chain agents interacting with logistics partners. This is the frontier: the architecture required is significant, but so is the potential.

The framework aims to give CIOs a simple, accessible way to make informed decisions about how, and how far, they want to scale agentic capabilities.
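Because the levels build on one another, one practical reading is a capability gate: an organization permits only the agent actions its current maturity level supports. Below is a minimal sketch of that idea; every name in it (AgentLevel, the action map, is_permitted) is a hypothetical illustration, not part of the framework itself.

```python
# A minimal sketch of the four-level model as a capability gate.
# All names here are hypothetical; the framework is conceptual, not an API.
from enum import IntEnum

class AgentLevel(IntEnum):
    INFO_RETRIEVAL = 1         # fetch data, return insights
    SIMPLE_ORCHESTRATION = 2   # act within a single domain
    COMPLEX_ORCHESTRATION = 3  # multistep workflows, decision trees
    MULTI_AGENT = 4            # collaborate across domains and systems

# Map each action type to the minimum maturity level it requires.
REQUIRED_LEVEL = {
    "read_account_summary": AgentLevel.INFO_RETRIEVAL,
    "update_crm_record": AgentLevel.SIMPLE_ORCHESTRATION,
    "run_onboarding_workflow": AgentLevel.COMPLEX_ORCHESTRATION,
    "negotiate_with_logistics_partner": AgentLevel.MULTI_AGENT,
}

def is_permitted(action: str, org_level: AgentLevel) -> bool:
    """Allow an action only if the org has reached the level it needs."""
    return org_level >= REQUIRED_LEVEL[action]

# An org at Level 2 can update records but not run cross-org workflows.
assert is_permitted("update_crm_record", AgentLevel.SIMPLE_ORCHESTRATION)
assert not is_permitted("negotiate_with_logistics_partner",
                        AgentLevel.SIMPLE_ORCHESTRATION)
```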
Codeglide.ai’s always-on MCP server platform syncs enterprise APIs in real time, enabling secure, context-rich AI access with no custom coding and up to 90% cost reduction
Codeglide.ai announced the launch of its groundbreaking MCP server lifecycle platform that continuously transforms APIs into AI-ready Model Context Protocol (MCP) servers, ensuring enterprises remain AI-ready as their API ecosystems evolve. Unlike static SDK generators or one-off migration tools, Codeglide.ai operates as an always-on MCP server lifecycle platform: it monitors API changes, automatically generates secure MCP servers, and keeps them in sync, providing AI agents and large language models (LLMs) with continuous, reliable, and context-rich access to enterprise systems. A subsidiary of Opsera, the leading AI-powered DevOps platform, Codeglide.ai is delivered as a SaaS platform that integrates seamlessly with the GitHub ecosystem, including GitHub Actions, secrets scanning, and more. Built 100% on the GitHub platform, the solution is easily accessible via SaaS or directly from the GitHub Marketplace through GitHub Actions workflows. By ensuring every API stays in sync with its MCP server counterpart and managing the end-to-end lifecycle of MCP servers, Codeglide.ai eliminates the integration churn that stalls AI projects. Codeglide.ai is the first continuous MCP server lifecycle platform that:

- Automates and maintains MCP servers for all enterprise APIs, legacy or modern;
- Adapts in real time as APIs evolve, avoiding costly rework;
- Delivers context-aware, secure, and stateful AI access without custom coding or expensive consulting;
- Leverages the GitHub ecosystem (GitHub, GitHub Actions, secrets scanning, etc.) and is accessible via SaaS and GitHub Actions on GitHub Marketplace;
- Empowers enterprises to efficiently manage the MCP server lifecycle; and
- Reduces integration time by up to 97% and costs by up to 90%.
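For context on what such a server looks like, here is a minimal hand-written MCP server wrapping a single REST endpoint, built with the open-source `mcp` Python SDK. The endpoint URL and tool name are invented for illustration; Codeglide.ai’s pitch is generating and maintaining servers like this automatically as the underlying API changes.

```python
# A minimal, hand-written MCP server wrapping one REST endpoint, using the
# open-source `mcp` Python SDK (pip install "mcp[cli]" requests).
# The URL and tool name are illustrative, not Codeglide.ai output.
import requests
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("orders-api")  # server name shown to AI agents

@mcp.tool()
def get_order(order_id: str) -> dict:
    """Fetch one order from a (hypothetical) enterprise orders API."""
    resp = requests.get(f"https://api.example.com/v1/orders/{order_id}",
                        timeout=10)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default, for local agent hosts
```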
Google brings air-gapped, multimodal AI to Distributed Cloud so regulated enterprises can deploy GenAI on premises without sacrificing data sovereignty
Google announced the general availability of its Gemini artificial intelligence models on Google Distributed Cloud, extending its most advanced AI capabilities into enterprise and government data centers. The launch, which makes Gemini available on GDC in an air-gapped configuration and in preview on GDC connected, allows organizations with strict data residency and compliance requirements to deploy generative AI without sacrificing control over sensitive information. By bringing models on premises, Google is addressing a longstanding issue faced by regulated industries: the choice between adopting modern AI tools and maintaining full sovereignty over their data. The integration provides access to Gemini’s multimodal capabilities, including text, images, audio and video. Google says that unlocks a range of use cases, including multilingual collaboration, automated document summarization, intelligent chatbots and AI-assisted code generation. The release also includes built-in safety tools that allow enterprises to improve compliance, detect harmful content and enforce policy adherence. Google argues that delivering these capabilities securely requires more than just models, positioning GDC as a full AI platform that combines infrastructure, model libraries and prebuilt agents such as the preview of Agentspace search. Under the hood, GDC makes use of Nvidia Corp.’s Hopper and Blackwell graphics processing units, paired with automated load balancing and zero-touch updates for high availability. Confidential computing is supported on both central processing units and GPUs, ensuring that sensitive data is encrypted even during processing. Customers also gain audit logging and granular access controls for end-to-end visibility of their AI workloads. Along with Gemini 2.5 Flash and Pro, the platform supports Vertex AI’s task-specific models and Google’s open-source Gemma family. Enterprises can also deploy their own open-source or proprietary models on managed virtual machines and Kubernetes clusters as part of a unified environment.
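As a rough sketch, calling a Gemini 2.5 model through the google-genai Python SDK looks like the snippet below. Pointing the client at a GDC deployment inside your own data center would use a private endpoint; the base_url shown is a placeholder assumption, not a documented GDC address.

```python
# A minimal sketch of calling Gemini 2.5 Flash with the google-genai SDK
# (pip install google-genai). The base_url below is a placeholder for a
# private, on-premises endpoint, not a documented GDC address.
from google import genai
from google.genai import types

client = genai.Client(
    api_key="YOUR_KEY",
    http_options=types.HttpOptions(base_url="https://gdc.internal.example.com"),
)

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Summarize this incident report in three bullet points: ...",
)
print(response.text)
```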
Reflection AI’s autonomous coding agent Asimov reads everything from emails to Slack messages, project notes to documentation, in addition to the code, to learn everything about how and why the app was created
AI startup Reflection AI has developed an autonomous agent known as Asimov. It has been trained to understand how software is created by ingesting not only code, but the entirety of a business’ data, to piece together why an application or system does what it does. Co-founder and Chief Executive Misha Laskin said that Asimov reads everything from emails to Slack messages, project notes to documentation, in addition to the code, to learn everything about how and why the app was created. He believes this is the simplest and most natural way for AI agents to become masters at coding. Asimov is actually a collection of multiple smaller AI agents that are deployed inside customers’ cloud environments so that the data remains within their control. Asimov’s agents then cooperate with one another to try to understand the underlying code of whatever piece of software they’ve been assigned to, so they can answer any questions that human users might have about it. Several smaller agents are designed to retrieve the necessary data, and they work with a larger “reasoning” agent that collects all of their findings and tries to generate coherent answers to users’ questions.
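The retriever-plus-reasoner pattern the article describes can be sketched in a few lines. Everything below (the class names, the keyword matcher, the sample sources) is hypothetical scaffolding; Asimov’s internals are not public.

```python
# A toy sketch of small retriever agents feeding one reasoning agent.
# All names and the keyword matcher are invented; not Asimov's code.
from dataclasses import dataclass

@dataclass
class Finding:
    source: str   # e.g. "code", "slack", "docs"
    excerpt: str

class RetrieverAgent:
    """One small agent that searches a single data source."""
    def __init__(self, source: str, records: list[str]):
        self.source, self.records = source, records

    def search(self, query: str) -> list[Finding]:
        return [Finding(self.source, r) for r in self.records
                if query.lower() in r.lower()]

class ReasoningAgent:
    """The larger agent that merges findings into one answer."""
    def answer(self, question: str, findings: list[Finding]) -> str:
        cites = "; ".join(f"[{f.source}] {f.excerpt}" for f in findings)
        return f"Q: {question}\nEvidence: {cites or 'none found'}"

retrievers = [
    RetrieverAgent("code", ["retry_policy(): exponential backoff, added 2023"]),
    RetrieverAgent("slack", ["we added backoff after the March outage"]),
]
findings = [f for r in retrievers for f in r.search("backoff")]
print(ReasoningAgent().answer("why was backoff added?", findings))
```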
MLCommons’s new benchmark for measuring the performance of LLMs on PCs adds support for NVIDIA and Apple Mac GPUs and new prompt categories, including structured prompts for code analysis and experimental long-context summarization tests using 4,000- and 8,000-token inputs
MLCommons, the consortium behind the industry-standard MLPerf benchmarks, released MLPerf Client v1.0, a benchmark that sets a new standard for measuring the performance of LLMs on PCs and other client-class systems. MLPerf Client v1.0 introduces an expanded set of supported models, including Llama 2 7B Chat, Llama 3.1 8B Instruct, and Phi 3.5 Mini Instruct, with Phi 4 Reasoning 14B added as an experimental option to preview next-generation high-reasoning-capable LLMs. These additions reflect real-world use cases across a broader range of model sizes and capabilities. The benchmark expands its evaluation scope with new prompt categories, including structured prompts for code analysis and experimental long-context summarization tests using 4,000- and 8,000-token inputs. Hardware and platform support has also grown significantly. MLPerf Client v1.0 supports AMD and Intel NPUs and GPUs via ONNX Runtime, Ryzen AI SDK, and OpenVINO, with additional support for NVIDIA GPUs and Apple Mac GPUs through llama.cpp. It offers both command-line and graphical user interfaces. The GUI includes real-time compute and memory usage, persistent results history, comparison tables, and CSV exports. The CLI supports automation and scripting for regression testing and large-scale evaluations, making MLPerf Client v1.0 a comprehensive tool for benchmarking LLMs on client systems.
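To make the metrics concrete, the toy harness below measures the two numbers client-side LLM benchmarks typically report: time to first token and generation throughput. The fake_generate stub stands in for a real llama.cpp or ONNX Runtime session; none of this is MLPerf Client code.

```python
# A toy illustration of client-side LLM metrics (time to first token,
# tokens per second). The stub generator is invented; not MLPerf Client.
import time

def fake_generate(prompt: str, n_tokens: int = 64):
    """Stub model: yields one token at a time, like a streaming runtime."""
    for i in range(n_tokens):
        time.sleep(0.005)  # pretend compute
        yield f"tok{i}"

def benchmark(prompt: str) -> dict:
    start = time.perf_counter()
    first_token_at = None
    count = 0
    for _ in fake_generate(prompt):
        if first_token_at is None:
            first_token_at = time.perf_counter()
        count += 1
    end = time.perf_counter()
    return {
        "ttft_s": first_token_at - start,  # time to first token
        "tokens_per_s": (count - 1) / (end - first_token_at),
    }

print(benchmark("Summarize this 4,000-token document: ..."))
```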
Datasite acquires Blueflame AI, an agentic AI platform for investment services, to automate complex workflows and enable full-scope analysis by connecting fragmented data sources used in deal sourcing, due diligence, market research, and fundraising activities
Datasite has acquired Blueflame AI™, a leading provider of agentic AI solutions for investment and financial services. “Blueflame’s agentic AI solutions will expand the collective capacity of our user base, automating complex workflows and enabling full-scope analysis,” said Rusty Wiley, CEO and President of Datasite. “Combining Datasite, Grata and Blueflame’s resources will create a unique offering.” “Datasite and Grata have unique data to feed into Blueflame’s purpose-built agentic AI solution,” said Raj Bakhru, CEO and Co-Founder of Blueflame. Blueflame’s AI-native, large language model-agnostic platform combines deep domain expertise, advanced reasoning models, and secure data connections. When used with internal systems, data rooms, proprietary data sources, and publicly available information, Blueflame drives efficiencies around investment workflows, connecting fragmented data sources used in deal sourcing, due diligence, market research, and fundraising activities. Blueflame will be added to Datasite’s Intelligence Unit and continue to be led by Raj Bakhru, supported by Datasite’s ongoing investment.
Private AI from Magic Research harnesses legacy hardware in a distributed mesh, orchestrating neural network shards to deliver supercomputer-level inference with up to 90% lower costs and full data control
Magic Research has launched an artificial intelligence platform for on-premises use that it claims can cut costs by up to 90% compared with cloud-based services. Called Private AI, the platform is intended to give organizations complete control over their data, infrastructure and brand by operating securely behind a firewall and leveraging existing computing resources. The proprietary technology underlying Private AI is Fabric Hypergrid, a distributed computing mesh that taps into existing hardware on a network, including legacy graphics processing units, central processing units and accelerators, to create the equivalent of an AI supercomputer at a small fraction of the cost. Hypergrid can “shard neural networks during the inference process,” said Humberto Farias, founder and chief executive of Magic Research. Sharding is an architecture originally created for database management systems that breaks monolithic databases into smaller, faster, more manageable pieces called shards that each hold a subset of the dataset. During the inference process, a dynamic model router analyzes the AI task and matches it with the best available computational resource within the private network. The Hypergrid layer then orchestrates the workload, performing model acceleration and distributing shards of the neural network across the available hardware before reassembling the results. The model-agnostic platform provides a white-labeled AI chatbot experience with customizable user interfaces, workflows, models and permissions. A component called GatewAI enforces policies, filters content, logs activity and ensures alignment with industry-specific regulations like FERPA, HIPAA, GDPR and SOC 2.
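The sharded-inference idea can be illustrated in miniature: split a model’s layers into shards, assign each shard to an available device, and pipe activations through them in sequence. The sketch below is purely conceptual; Fabric Hypergrid’s actual router and acceleration logic are proprietary and not shown here.

```python
# A toy sketch of sharded inference: split layers across "devices" and
# run them in order. Conceptual only; not Fabric Hypergrid code.
def make_layer(weight: float):
    return lambda x: x * weight + 1.0  # stand-in for a real neural layer

model_layers = [make_layer(w) for w in (0.5, 1.5, 0.8, 1.2)]
devices = ["legacy-gpu-0", "cpu-1"]  # whatever the mesh discovers

# Shard: split the layers evenly across the discovered devices.
per_device = -(-len(model_layers) // len(devices))  # ceiling division
shards = [model_layers[i:i + per_device]
          for i in range(0, len(model_layers), per_device)]

def run_inference(x: float) -> float:
    """Pipe activations through each shard, as if hopping across devices."""
    for device, shard in zip(devices, shards):
        for layer in shard:
            x = layer(x)  # on real hardware this would execute on `device`
    return x

print(run_inference(2.0))
```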
Salesforce debuts a CRM ‘flight simulator’ to harden AI agents in realistic business scenarios and benchmark their reliability and integration security, aiming to close the 95% pilot-failure gap
Salesforce is betting that rigorous testing in simulated business environments will solve one of enterprise artificial intelligence’s biggest problems: agents that work in demonstrations but fail in the messy reality of corporate operations. The cloud software giant unveiled three major AI research initiatives this week, including CRMArena-Pro, which it calls a “digital twin” of business operations where AI agents can be stress-tested before deployment. The announcement comes as enterprises grapple with widespread AI pilot failures and fresh security concerns following recent breaches that compromised hundreds of Salesforce customer instances. “Pilots don’t learn to fly in a storm; they train in flight simulators that push them to prepare in the most extreme challenges,” said Silvio Savarese, Salesforce’s chief scientist and head of AI research, during a press conference. “Similarly, AI agents benefit from simulation testing and training, preparing them to handle the unpredictability of daily business scenarios in advance of their deployment.” The research push reflects growing enterprise frustration with AI implementations. A recent MIT report found that 95% of generative AI pilots at companies are failing to reach production, while Salesforce’s own studies show that large language models alone achieve only 35% success rates in complex business scenarios.
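In spirit, simulation-based agent testing amounts to replaying scripted business scenarios against an agent and scoring the outcomes. The minimal harness below illustrates the shape of that loop; the scenarios and the echo-agent are invented placeholders, not Salesforce’s benchmark.

```python
# A minimal sketch of scenario-replay agent testing. The scenarios and
# the naive agent are invented examples, not CRMArena-Pro.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Scenario:
    prompt: str
    check: Callable[[str], bool]  # did the agent's reply meet the goal?

def run_suite(agent: Callable[[str], str],
              scenarios: list[Scenario]) -> float:
    passed = sum(1 for s in scenarios if s.check(agent(s.prompt)))
    return passed / len(scenarios)

scenarios = [
    Scenario("Customer asks for a refund on order 123.",
             lambda r: "refund" in r.lower()),
    Scenario("Escalate a security incident to tier 2.",
             lambda r: "escalate" in r.lower()),
]

def naive_agent(prompt: str) -> str:
    return "I can help with a refund."  # fails the escalation scenario

print(f"success rate: {run_suite(naive_agent, scenarios):.0%}")  # 50%
```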
Anthropic’s analytics dashboard for the Claude Code coding agent provides detailed breakdowns of activity and cost by user, including lines of code accepted, suggestion accept rates and total spend over time
Anthropic is rolling out a comprehensive analytics dashboard for its Claude Code AI programming assistant. The new dashboard will provide engineering managers with detailed metrics on how their teams use Claude Code, including lines of code accepted, suggestion accept rates, total user activity over time, total spend over time, average daily spend for each user, and average daily lines of code accepted for each user. The dashboard will track commits, pull requests, and provide detailed breakdowns of activity by user and cost — data that engineering leaders say is crucial for understanding how AI is changing development workflows. The feature includes role-based access controls, allowing organizations to configure who can view usage data. The system focuses on metadata rather than actual code content, addressing potential privacy concerns about employee surveillance. The platform has seen active user base growth of 300% and run-rate revenue expansion of more than 5.5 times, according to company data. Unlike some competitors that focus primarily on code completion, Claude Code offers what Anthropic calls “agentic” capabilities — the ability to understand entire codebases, make coordinated changes across multiple files, and work directly within existing development workflows.
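The per-user rollup such a dashboard implies can be sketched as a simple aggregation over suggestion events. The event schema below is hypothetical; Anthropic’s dashboard reports metadata of this kind, but its internal format is not public.

```python
# A sketch of a per-user metrics rollup: lines accepted, accept rate,
# and spend. The event schema is invented; not Anthropic's format.
from collections import defaultdict

events = [  # one record per suggestion shown to a developer
    {"user": "ana", "accepted": True,  "lines": 12, "cost_usd": 0.04},
    {"user": "ana", "accepted": False, "lines": 0,  "cost_usd": 0.03},
    {"user": "raj", "accepted": True,  "lines": 30, "cost_usd": 0.09},
]

rollup = defaultdict(lambda: {"shown": 0, "accepted": 0,
                              "lines": 0, "spend": 0.0})
for e in events:
    row = rollup[e["user"]]
    row["shown"] += 1
    row["accepted"] += e["accepted"]
    row["lines"] += e["lines"]
    row["spend"] += e["cost_usd"]

for user, row in rollup.items():
    rate = row["accepted"] / row["shown"]
    print(f"{user}: {row['lines']} lines accepted, "
          f"{rate:.0%} accept rate, ${row['spend']:.2f} spend")
```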