Tracelight, the UK-founded AI company making financial models easier to build and trust, has raised $3.6 million in seed funding. Tracelight bridges the gap between financial models and large language models (LLMs) by turning spreadsheet logic into LLM-friendly data, allowing financial analysts and consultants to harness the benefits of generative AI when building complex financial models. By integrating directly into the spreadsheet workflow and enhancing how LLMs interpret and work with Excel, Tracelight eliminates repetitive modelling tasks. From writing complex formulas and validating models to autonomously running analysis from simple natural language prompts, it acts as a force multiplier for analysts, augmenting their skills with AI so they can model faster and smarter without changing the way they work. Since launch, Tracelight has gained early traction among analysts at investment banks, private credit and private equity houses, and leading professional services firms. Early users are already reporting >90% time savings on laborious modelling tasks such as building common analyses, formatting, and finding errors. Crucially, Tracelight keeps humans in control of decision-making. Rather than replacing analysts, it enables them to build better financial models and focus on high-stakes decisions where human judgment is essential.
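The core idea of turning spreadsheet logic into LLM-friendly data can be illustrated with a short sketch. This is not Tracelight's implementation, just a toy example (using openpyxl, with a hypothetical file path) of serializing cell formulas and values into plain text that could be placed in an LLM prompt:

```python
# pip install openpyxl
from openpyxl import load_workbook

def spreadsheet_to_text(path: str) -> str:
    """Serialize sheet names, cell coordinates, formulas, and values as plain text."""
    wb = load_workbook(path)  # formulas stay as strings when data_only=False (default)
    lines = []
    for ws in wb.worksheets:
        lines.append(f"# Sheet: {ws.title}")
        for row in ws.iter_rows():
            for cell in row:
                if cell.value is not None:
                    # e.g. "B7: =SUM(B2:B6)" or "A1: Revenue"
                    lines.append(f"{cell.coordinate}: {cell.value}")
    return "\n".join(lines)

# The resulting text can be included in an LLM prompt so the model can reason
# about the spreadsheet's structure and formula logic.
print(spreadsheet_to_text("model.xlsx"))  # hypothetical workbook
```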
GEPA method optimizes LLMs with genetic prompt evolution and natural language reflection, outperforming reinforcement learning while using 35x fewer trials for cost-effective AI adaptation
Researchers from the University of California, Berkeley, Stanford University, and Databricks have introduced a new AI optimization method called GEPA (Genetic-Pareto) that significantly outperforms traditional reinforcement learning (RL) techniques for adapting LLMs to specialized tasks. GEPA uses an LLM’s own language understanding to reflect on its performance, diagnose errors, and iteratively evolve its instructions. In addition to being more accurate than established techniques, GEPA is significantly more efficient, achieving superior results with up to 35 times fewer trial runs. For businesses building complex AI agents and workflows, this translates directly into faster development cycles, substantially lower computational costs, and more performant, reliable applications. GEPA is designed for teams that need to optimize systems built on top-tier models that often can’t be fine-tuned, allowing them to improve performance without managing custom GPU clusters.

GEPA is a prompt optimizer that tackles this challenge by replacing the sparse, scalar rewards used in RL with rich, natural language feedback. It leverages the fact that the entire execution of an AI system (including its reasoning steps, tool calls, and even error messages) can be serialized into text that an LLM can read and understand.

GEPA’s methodology is built on three core pillars. First is “genetic prompt evolution,” where GEPA treats a population of prompts like a gene pool and iteratively “mutates” prompts to create new, potentially better versions. This mutation is an intelligent process driven by the second pillar, “reflection with natural language feedback.” After a few rollouts, GEPA provides an LLM with the full execution trace (what the system tried to do) and the outcome (what went right or wrong). The LLM then “reflects” on this feedback in natural language to diagnose the problem and write an improved, more detailed prompt. The third pillar is “Pareto-based selection,” which ensures smart exploration. Instead of focusing only on the single best-performing prompt, which can lead to getting stuck in a suboptimal solution (a “local optimum”), GEPA maintains a diverse roster of “specialist” prompts. It tracks which prompts perform best on different individual examples, creating a list of top candidates. By sampling from this diverse set of winning strategies, GEPA explores more solutions and is more likely to discover a prompt that generalizes well across a wide range of inputs.

GEPA’s core guidance is to structure feedback that surfaces not only outcomes but also intermediate trajectories and errors in plain text, the same evidence a human would use to diagnose system behavior. A major practical benefit is that GEPA’s instruction-based prompts are up to 9.2 times shorter than prompts produced by optimizers like MIPROv2, which include many few-shot examples. Shorter prompts decrease latency and reduce costs for API-based models, making the final application faster and cheaper to run in production.
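The optimization loop described above can be sketched in a few lines of Python. This is a simplified, hypothetical illustration of the genetic-Pareto idea, not the authors' implementation: run_system and reflect_and_rewrite are placeholder stubs standing in for real rollouts and for the LLM that reflects on serialized execution traces.

```python
import random

def run_system(prompt: str, example: dict) -> tuple[float, str]:
    """Placeholder rollout: returns a (score, natural-language trace) pair."""
    score = random.random()  # stand-in for a task metric
    trace = f"prompt={prompt!r} example={example} score={score:.2f}"
    return score, trace

def reflect_and_rewrite(prompt: str, traces: list[str]) -> str:
    """Placeholder for the reflection LLM that diagnoses traces and rewrites the prompt."""
    return prompt + " [refined after reflection]"

def gepa_optimize(seed_prompt: str, train: list[dict], budget: int = 20, minibatch: int = 3) -> str:
    candidates = [seed_prompt]
    # scores[p][i] = score of prompt p on training example i
    scores = {seed_prompt: [run_system(seed_prompt, ex)[0] for ex in train]}
    for _ in range(budget):
        # Pareto-based selection: any prompt that is best on at least one example
        # stays in the pool of possible parents, preserving diverse "specialists".
        pareto = {max(candidates, key=lambda p: scores[p][i]) for i in range(len(train))}
        parent = random.choice(sorted(pareto))
        # Reflection with natural-language feedback from a few rollouts.
        batch = random.sample(range(len(train)), k=min(minibatch, len(train)))
        traces = [run_system(parent, train[i])[1] for i in batch]
        child = reflect_and_rewrite(parent, traces)
        child_scores = [run_system(child, ex)[0] for ex in train]
        # Keep the mutated prompt only if it improves on the sampled minibatch.
        if sum(child_scores[i] for i in batch) > sum(scores[parent][i] for i in batch):
            candidates.append(child)
            scores[child] = child_scores
    # Return the candidate with the best total score across training examples.
    return max(candidates, key=lambda p: sum(scores[p]))

best = gepa_optimize("Answer the question step by step.", [{"q": "demo"}] * 5)
print(best)
```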
MIT says partner‑led GenAI deployments beat internal builds, but sustainable value needs persistent memory, integration, and user‑familiar interfaces over brittle bespoke apps
The GenAI implementation failure rate is staggering, according to a new report from MIT. While 80% of organizations have explored GenAI tools and 40% report deployment, only 5% of custom enterprise AI solutions reach production, creating a massive gap between pilot enthusiasm and actual transformation.

Investment allocation misses high-ROI opportunities: 50% of GenAI budgets flow to sales and marketing even though back-office automation delivers faster payback periods, with successful implementations generating $2-10M annually in BPO cost reductions. Strategic partnerships dramatically outperform internal builds: external partnerships achieve 66% deployment success compared to just 33% for internally developed tools, yet most organizations continue pursuing expensive internal development efforts. The contrast becomes even sharper when examining enterprise-specific AI solutions. While 60% of organizations have evaluated custom or vendor-sold GenAI systems, only 20% progress to the pilot stage. Of those brave enough to attempt implementation, a mere 5% achieve production deployment with sustained business value.

The paradox of GenAI adoption becomes clear when examining user preferences. The same professionals who praise ChatGPT for its flexibility and immediate utility express deep skepticism about custom enterprise tools. When asked to compare experiences, three consistent themes emerge: generic LLM interfaces consistently produce better answers, users already possess interface familiarity, and trust levels remain higher for consumer tools. This preference reveals a fundamental learning gap. The research finds a stark preference hierarchy based on task complexity and learning requirements. For simple tasks such as email drafting, basic analysis, and quick summaries, 70% of users prefer AI assistance. But for anything requiring sustained context, relationship memory, or iterative improvement, users prefer humans by 9-to-1 margins. The dividing line isn’t intelligence or capability; it’s memory, adaptability, and learning capacity. Current GenAI systems require extensive context input for each session, repeat identical mistakes, and cannot customize themselves to specific workflows or preferences. These limitations explain why 95% of enterprise AI initiatives fail to achieve sustainable value.

At the same time, employees’ unofficial “shadow” use of consumer AI tools demonstrates that individuals can successfully cross the GenAI Divide when given access to flexible, responsive tools. The pattern suggests that successful enterprise adoption must build on, rather than replace, this organic usage, providing the memory and integration capabilities that consumer tools lack while maintaining their flexibility and responsiveness.
NTT DATA and Mistral AI partner to shape the future of sustainable and secure private AI for enterprises, with AI applications for clients operating on private clouds
NTT DATA and Mistral AI are partnering to jointly sell and deploy enterprise-grade AI solutions that foster strategic autonomy for clients. The companies will combine NTT DATA’s GenAI and IT services portfolio, global delivery capabilities, industry expertise, and trusted client relationships with Mistral AI’s solutions and advanced generative AI models, known for their efficiency, performance, and enterprise empowerment. Initial focus areas include:
- Sustainable and Secure AI Co-Development: The companies will develop sustainable and highly secure private AI solutions for organizations in regulated sectors such as financial services, insurance, defense and public sector. The companies will provide end-to-end solutions from infrastructure to business processes powered by AI applications for clients operating on private clouds.
- AI-Driven Innovation for IT Infrastructure & Customer Experience: The companies will pioneer the integration of Mistral AI technologies into NTT DATA’s customer experience platforms, beginning with agentic AI call center solutions in Europe and Asia Pacific. Joint projects could include co-development of LLMs for specific languages, which will further AI innovation tailored to local markets and specialized needs.
- Go-To-Market Expansion: NTT DATA and Mistral AI will jointly develop and execute regional go-to-market strategies tailored to the unique dynamics of countries including France, Luxembourg, Spain, Singapore and Australia. End-to-end AI services will range from use-case development and customization to implementation, support and managed services. Dedicated sales teams will be assigned to address key client needs and priorities.
CIOs can adopt a 4-stage framework for scaling agentic capabilities that starts with information retrieval agents for fetching data and progresses through simple orchestration in a single domain and complex orchestration with multi-step workflows to multi-agent orchestration across domains and systems
CIOs need a practical framework that recognizes where they are and helps them scale agentic capabilities responsibly and effectively over time.
Level 1: Information retrieval agents. Most organizations will begin here. Agents fetch data and return insights. For example, one global brokerage handles 54,000 calls daily, and advisers used to spend 90 minutes prepping for each call. With agents, prep now takes minutes. Results like this build momentum and prove that agentic AI can deliver value today.
Level 2: Simple orchestration, single domain. Agents begin to act: updating records, coordinating tasks, scheduling services. This stage expands both technical complexity and organizational trust.
Level 3: Complex orchestration, multiple domains. Agents manage multistep workflows and decision trees. What qualifies as “complex” depends on your industry and risk tolerance, but the concept is the same: greater autonomy within well-defined boundaries.
Level 4: Multi-agent orchestration. Agents collaborate across domains, systems, and even organizations. Think supply chain agents interacting with logistics partners. This is the frontier: the architecture required is significant, but so is the potential.
The framework aims to give CIOs a simple, accessible way to make informed decisions about how, and how far, they want to scale agentic capabilities.
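As a toy illustration (not part of the original framework), the four levels can be encoded as a guardrail that gates what an agent is allowed to do based on an organization's current maturity level; all action names below are hypothetical:

```python
from enum import IntEnum

class AgentMaturity(IntEnum):
    INFORMATION_RETRIEVAL = 1   # fetch data, return insights
    SIMPLE_ORCHESTRATION = 2    # act within a single domain
    COMPLEX_ORCHESTRATION = 3   # multistep workflows and decision trees
    MULTI_AGENT = 4             # collaboration across domains and organizations

# Hypothetical mapping from actions to the maturity level they require.
REQUIRED_LEVEL = {
    "read_crm_record": AgentMaturity.INFORMATION_RETRIEVAL,
    "update_crm_record": AgentMaturity.SIMPLE_ORCHESTRATION,
    "run_claims_workflow": AgentMaturity.COMPLEX_ORCHESTRATION,
    "negotiate_with_partner_agent": AgentMaturity.MULTI_AGENT,
}

def is_allowed(action: str, current_level: AgentMaturity) -> bool:
    """Allow an action only if the organization has reached the required level."""
    return current_level >= REQUIRED_LEVEL[action]

print(is_allowed("update_crm_record", AgentMaturity.INFORMATION_RETRIEVAL))  # False
print(is_allowed("update_crm_record", AgentMaturity.COMPLEX_ORCHESTRATION))  # True
```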
Codeglide.ai’s always-on MCP server platform syncs enterprise APIs in real time, enabling secure, context-rich AI access with no custom coding and 90% cost reduction
Codeglide.ai announced the launch of its groundbreaking MCP server lifecycle platform that continuously transforms APIs into AI-ready Model Context Protocol (MCP) servers, ensuring enterprises remain AI-ready as their API ecosystems evolve. Unlike static SDK generators or one-off migration tools, Codeglide.ai operates as an always-on MCP server lifecycle platform. It monitors API changes, automatically generates secure MCP servers, and keeps them in sync, providing AI agents and large language models (LLMs) with continuous, reliable, and context-rich access to enterprise systems. A subsidiary of Opsera, the leading AI-powered DevOps platform, Codeglide.ai is delivered as a SaaS platform that integrates seamlessly with the GitHub ecosystem, including GitHub Actions, secrets scanning, and more. Built 100% on the GitHub platform, the solution is easily accessible via SaaS or directly from the GitHub Marketplace through GitHub Actions workflows. By keeping every API in sync with its MCP server counterpart and managing the end-to-end lifecycle of MCP servers, Codeglide.ai eliminates the integration churn that stalls AI projects. Codeglide.ai is the first continuous MCP server lifecycle platform that:
- Automates and maintains MCP servers for all enterprise APIs, legacy or modern;
- Adapts in real time as APIs evolve, avoiding costly rework;
- Delivers context-aware, secure, and stateful AI access without custom coding or expensive consulting;
- Leverages the GitHub ecosystem (GitHub, GitHub Actions, secrets scanning, etc.) and is accessible via SaaS and GitHub Actions on GitHub Marketplace;
- Empowers enterprises to efficiently manage the MCP server lifecycle; and
- Reduces integration time by up to 97% and costs by up to 90%.
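For context on what an MCP server wrapping an existing API looks like, here is a minimal hand-written sketch using the open-source MCP Python SDK. It is not Codeglide.ai's generated output, and the REST endpoint is hypothetical:

```python
# pip install "mcp[cli]" requests
import requests
from mcp.server.fastmcp import FastMCP

# An MCP server that exposes one enterprise API operation as a tool.
mcp = FastMCP("orders-api")

@mcp.tool()
def get_order(order_id: str) -> dict:
    """Fetch a single order from the (hypothetical) internal REST API."""
    resp = requests.get(f"https://api.example.internal/orders/{order_id}", timeout=10)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    mcp.run()  # serves the tool to MCP clients (stdio transport by default)
```

A lifecycle platform like the one described above would generate, regenerate, and redeploy servers of this shape automatically whenever the underlying API changes, rather than relying on hand-maintained code.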
Google brings air‑gapped, multimodal AI to Distributed Cloud so regulated enterprises can deploy GenAI on premises without sacrificing data sovereignty
Google announced the general availability of its Gemini artificial intelligence models on Google Distributed Cloud (GDC), extending its most advanced AI capabilities into enterprise and government data centers. The launch, which makes Gemini available on GDC in an air-gapped configuration and in preview on GDC connected, allows organizations with strict data residency and compliance requirements to deploy generative AI without sacrificing control over sensitive information. By bringing the models on-premises, Google is addressing a longstanding issue faced by regulated industries: the choice between adopting modern AI tools and maintaining full sovereignty over their data. The integration provides access to Gemini’s multimodal capabilities, including text, images, audio and video. Google says that unlocks a range of use cases, including multilingual collaboration, automated document summarization, intelligent chatbots and AI-assisted code generation. The release also includes built-in safety tools that help enterprises improve compliance, detect harmful content and enforce policy adherence. Google argues that delivering these capabilities securely requires more than just models, positioning GDC as a full AI platform that combines infrastructure, model libraries and prebuilt agents such as the preview of Agentspace search. Under the hood, GDC makes use of Nvidia Corp.’s Hopper and Blackwell graphics processing units, paired with automated load balancing and zero-touch updates for high availability. Confidential computing is supported on both central processing units and GPUs, ensuring that sensitive data remains encrypted even during processing. Customers also gain audit logging and granular access controls for end-to-end visibility of their AI workloads. Along with Gemini 2.5 Flash and Pro, the platform supports Vertex AI’s task-specific models and Google’s open-source Gemma family. Enterprises can also deploy their own open-source or proprietary models on managed virtual machines and Kubernetes clusters as part of a unified environment.
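As a rough illustration of the document summarization use case, here is a sketch using the public google-genai SDK against Vertex AI. The project, location, and file name are placeholders, and an air-gapped or connected GDC deployment would use its own endpoint and credential configuration rather than this public-cloud setup:

```python
# pip install google-genai
from google import genai

# Public Vertex AI configuration shown here (assumption); a GDC deployment
# would point the client at the environment's own endpoint and credentials.
client = genai.Client(vertexai=True, project="my-project", location="us-central1")

document = open("contract.txt", encoding="utf-8").read()  # hypothetical document
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=f"Summarize the key obligations in this document:\n\n{document}",
)
print(response.text)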
Amazon SageMaker HyperPod’s observability solution offers a comprehensive dashboard that provides insights into foundation model (FM) development tasks and cluster resources by consolidating health and performance data from various sources
Amazon SageMaker HyperPod offers a comprehensive dashboard that provides insights into foundation model (FM) development tasks and cluster resources. This unified observability solution automatically publishes key metrics to Amazon Managed Service for Prometheus and visualizes them in Amazon Managed Grafana dashboards. The dashboard consolidates health and performance data from various sources, including NVIDIA DCGM, instance-level Kubernetes node exporters, Elastic Fabric Adapter (EFA), integrated file systems, Kubernetes APIs, Kueue, and SageMaker HyperPod task operators. The solution also abstracts management of collector agents and scrapers across clusters, automatically scaling collectors across nodes as the cluster grows. The dashboards feature intuitive navigation across metrics and visualizations, helping users diagnose problems and take action faster. These capabilities save teams valuable time and resources during FM development, helping accelerate time-to-market and reduce the cost of generative AI innovations. To enable SageMaker HyperPod observability, users need to enable AWS IAM Identity Center and create a user in IAM Identity Center.
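Because the metrics land in a standard Amazon Managed Service for Prometheus workspace, they can also be queried outside Grafana. The sketch below is an assumption-laden illustration rather than part of the HyperPod documentation: the workspace ID is a placeholder, and DCGM_FI_DEV_GPU_UTIL is used as an example of a GPU utilization metric exported by NVIDIA DCGM.

```python
# pip install boto3 requests
import boto3
import requests
from urllib.parse import urlencode
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

region = "us-east-1"
workspace_id = "ws-EXAMPLE"              # placeholder AMP workspace ID
promql = "avg(DCGM_FI_DEV_GPU_UTIL)"     # example DCGM GPU utilization query

# Amazon Managed Service for Prometheus exposes a Prometheus-compatible query API.
url = (
    f"https://aps-workspaces.{region}.amazonaws.com/workspaces/{workspace_id}"
    "/api/v1/query?" + urlencode({"query": promql})
)

# Sign the request with SigV4 (service name "aps") and send it.
request = AWSRequest(method="GET", url=url)
SigV4Auth(boto3.Session().get_credentials(), "aps", region).add_auth(request)
response = requests.get(url, headers=dict(request.headers))
print(response.json())
```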
Amazon Web Services is launching a dedicated AI agent marketplace to enable startups to directly offer their AI agents to AWS customers while also letting enterprises browse and install AI agents based on their requirements from a central location
Amazon Web Services (AWS) is launching an AI agent marketplace next week at the AWS Summit in New York City on July 15, with Anthropic as one of its partners. The distribution of AI agents poses a challenge, as most companies offer them in silos. AWS appears to be taking a step to address this with its new move. The company’s dedicated agent marketplace will allow startups to directly offer their AI agents to AWS customers. It will also allow enterprise customers to browse, install, and search for AI agents based on their requirements from a single location. That could give Anthropic, and other AWS agent marketplace partners, a considerable boost. AWS’ marketplace would help Anthropic reach more customers, including those who may already use AI agents from rivals such as OpenAI. Anthropic’s involvement in the marketplace could also attract more developers to use its API to create agents, and eventually increase its revenue. The marketplace model will allow startups to charge customers for agents, with a structure similar to how a marketplace might price SaaS offerings rather than bundling them into broader services.
Docker’s new capabilities enable developers to define agents, models, and tools as services in a single Compose file and share and deploy agentic stacks across environments without rewriting infrastructure code
Docker announced major new capabilities that make it dramatically easier for developers to build, run, and scale intelligent, agentic applications. Docker is extending Compose into the agent era, enabling developers to define intelligent agent architectures consisting of models and tools in the same simple YAML files they already use for microservices and take those agents to production. With the new Compose capabilities, developers can:
- Define agents, models, and tools as services in a single Compose file;
- Run agentic workloads locally or deploy seamlessly to cloud services like Google Cloud Run or Azure Container Apps;
- Integrate with Docker’s open source Model Context Protocol (MCP) Gateway for secure tool discovery and communication;
- Share, version, and deploy agentic stacks across environments without rewriting infrastructure code.
Docker unveiled Docker Offload (Beta), a new capability that enables developers to offload AI and GPU-intensive workloads to the cloud without disrupting their existing workflows. With Docker Offload, developers can:
- Maintain local development speed while accessing cloud-scale compute and GPUs;
- Run large models and multi-agent systems in high-performance cloud environments;
- Choose where and when to offload workloads for privacy, cost, and performance optimization;
- Keep data and workloads within specific regions to meet sovereignty requirements and ensure data does not leave designated zones across the globe.