ThetaRay has introduced the Rule Builder Simulator, an AI-powered tool that enables banks and other financial institutions to build, test, and implement anti-money laundering (AML) rules faster. The tool offers a no-code interface for defining complex rule logic, reducing operational friction and approval bottlenecks. The Self-Service Rule Builder allows teams to test and analyze the impact of new rules in a secure environment before deployment, without affecting live systems. This is part of ThetaRay's mission to equip financial institutions with AI-enhanced tools that strengthen compliance and support growth and innovation. Key Compliance Benefits: Autonomy and Speed: Users can build, modify, simulate, and deploy AML rules independently, reducing rule lifecycle times from weeks to hours. Tailored Risk Coverage: Define complex rule logic using no-code customization and aggregation, addressing institution-specific compliance needs with precision. Safe Testing Environment: Validate new rules in a secure simulation environment before going live, ensuring confidence in compliance decisions. Optimized Detection: Simulations help teams evaluate rule and AI combinations for optimal results in detecting financial crime. Seamless Production Deployment: Approved simulations can be applied to production with built-in governance and oversight workflows.
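The pattern described above — running a candidate rule against historical data in isolation before deploying it — can be sketched in a few lines. This is an illustrative toy, not ThetaRay's API; the rule logic, field names, and country codes are all made up for the example.

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    amount: float
    country: str
    daily_count: int

# Hypothetical AML rule: flag high-value transactions from
# high-risk jurisdictions with unusual daily frequency.
HIGH_RISK = {"XX", "YY"}  # placeholder country codes

def rule_hits(tx: Transaction) -> bool:
    return tx.amount > 10_000 and tx.country in HIGH_RISK and tx.daily_count > 5

def simulate(rule, history):
    """Run a candidate rule against historical data without touching production."""
    hits = [tx for tx in history if rule(tx)]
    return {"alerts": len(hits), "alert_rate": len(hits) / max(len(history), 1)}

history = [
    Transaction(12_000, "XX", 7),
    Transaction(500, "DE", 1),
    Transaction(20_000, "YY", 2),
]
report = simulate(rule_hits, history)
print(report)  # 1 alert out of 3 historical transactions
```

The simulation report (alert counts, alert rates) is what lets a compliance team judge a rule's impact before it ever touches a live system.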
NTT DATA and Mistral AI partner to shape future of sustainable and secure private AI for enterprises with AI applications for clients operating on private clouds
NTT DATA and Mistral AI are partnering to jointly sell and deploy enterprise-grade AI solutions that foster strategic autonomy for clients. The companies will combine NTT DATA's GenAI and IT services portfolio, global delivery capabilities, industry expertise, and trusted client relationships with Mistral AI's solutions and advanced generative AI models, known for their efficiency, performance, and enterprise empowerment. Initial focus areas include: Sustainable and Secure AI Co-Development: The companies will develop sustainable and highly secure private AI solutions for organizations in regulated sectors such as financial services, insurance, defense, and the public sector. The companies will provide end-to-end solutions from infrastructure to business processes powered by AI applications for clients operating on private clouds. AI-Driven Innovation for IT Infrastructure & Customer Experience: The companies will pioneer the integration of Mistral AI technologies into NTT DATA's customer experience platforms, beginning with agentic AI call center solutions in Europe and Asia Pacific. Joint projects could include co-development of LLMs for specific languages, which will further AI innovation tailored to local markets and specialized needs. Go-To-Market Expansion: NTT DATA and Mistral AI will jointly develop and execute regional go-to-market strategies tailored to the unique dynamics of countries including France, Luxembourg, Spain, Singapore and Australia. End-to-end AI services will range from use-case development and customization to implementation, support and managed services. Dedicated sales teams will be assigned to address key client needs and priorities.
MLCommons’s new standard for measuring the performance of LLMs on PCs supports NVIDIA and Apple Mac GPUs and adds new prompt categories, including structured prompts for code analysis and experimental long-context summarization tests using 4,000- and 8,000-token inputs
MLCommons, the consortium behind the industry-standard MLPerf benchmarks, released MLPerf Client v1.0, a benchmark that sets a new standard for measuring the performance of LLMs on PCs and other client-class systems. MLPerf Client v1.0 introduces an expanded set of supported models, including Llama 2 7B Chat, Llama 3.1 8B Instruct, and Phi 3.5 Mini Instruct, with Phi 4 Reasoning 14B added as an experimental option to preview next-generation high-reasoning-capable LLMs. These additions reflect real-world use cases across a broader range of model sizes and capabilities. The benchmark expands its evaluation scope with new prompt categories, including structured prompts for code analysis and experimental long-context summarization tests using 4,000- and 8,000-token inputs. Hardware and platform support has also grown significantly. MLPerf Client v1.0 supports AMD and Intel NPUs and GPUs via ONNX Runtime, Ryzen AI SDK, and OpenVINO, with additional support for NVIDIA GPUs and Apple Mac GPUs through llama.cpp. It offers both command-line and graphical user interfaces. The GUI includes real-time compute and memory usage, persistent results history, comparison tables, and CSV exports. The CLI supports automation and scripting for regression testing and large-scale evaluations, making MLPerf Client v1.0 a comprehensive tool for benchmarking LLMs on client systems.
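Client-side LLM benchmarks of this kind typically report metrics such as time to first token and sustained token throughput across different prompt sizes. The sketch below illustrates that measurement pattern with a stubbed generator; it is not MLPerf Client's actual harness, and the per-token delay and output length are invented for the example.

```python
import time

def fake_generate(prompt_tokens: int, output_tokens: int = 64):
    """Stub standing in for an on-device LLM; yields tokens with a small delay."""
    for _ in range(output_tokens):
        time.sleep(0.001)  # pretend per-token latency
        yield "tok"

def benchmark(prompt_tokens: int) -> dict:
    start = time.perf_counter()
    first = None
    count = 0
    for _ in fake_generate(prompt_tokens):
        if first is None:
            first = time.perf_counter() - start  # time to first token
        count += 1
    total = time.perf_counter() - start
    return {"ttft_s": first, "tokens_per_s": count / total}

for n in (4_000, 8_000):  # long-context input sizes used by the benchmark
    r = benchmark(n)
    print(n, round(r["tokens_per_s"], 1))
```

Running the same loop across hardware backends (NPU, GPU) and prompt categories is what turns a one-off timing into a comparable benchmark result.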
Runloop’s platform provides isolated, ephemeral cloud-based development environments where AI agents can safely execute code with full filesystem and build tool access, simplifying their deployment in real-world production environments
Runloop, an infrastructure startup, has raised $7 million in seed funding to address what its founders call the “production gap” — the critical challenge of deploying AI coding agents beyond experimental prototypes into real-world enterprise environments. Runloop’s platform addresses a fundamental question that has emerged as AI coding tools proliferate: where do AI agents actually run when they need to perform complex, multi-step coding tasks? For Runloop, the answer lies in providing the infrastructure layer that makes AI agents as easy to deploy and manage as traditional software applications — turning the vision of digital employees from prototype to production reality. Runloop’s core product, called “devboxes,” provides isolated, cloud-based development environments where AI agents can safely execute code with full filesystem and build tool access. These environments are ephemeral — they can be spun up and torn down dynamically based on demand. One customer example illustrates the platform’s utility: a company that builds AI agents to automatically write unit tests for improving code coverage. When they detect production issues in their customers’ systems, they deploy thousands of devboxes simultaneously to analyze code repositories and generate comprehensive test suites. Despite only launching billing in March and self-service signup in May, Runloop has achieved significant momentum. The company reports “a few dozen customers,” including Series A companies and major model laboratories, with customer growth exceeding 200% and revenue growth exceeding 100% since March. Runloop’s second major product, Public Benchmarks, addresses another critical need: standardized testing for AI coding agents. Traditional AI evaluation focuses on single interactions between users and language models. Runloop’s approach is fundamentally different.
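The ephemeral spin-up/tear-down lifecycle described above maps naturally onto a context-manager pattern. The sketch below is a toy illustration of that pattern, not Runloop's actual SDK; the class and method names are invented.

```python
import contextlib
import uuid

class Devbox:
    """Toy stand-in for an isolated, ephemeral dev environment (not Runloop's SDK)."""
    def __init__(self):
        self.id = uuid.uuid4().hex[:8]
        self.alive = True

    def run(self, cmd: str) -> str:
        assert self.alive, "devbox already torn down"
        return f"[{self.id}] ran: {cmd}"

    def teardown(self):
        self.alive = False

@contextlib.contextmanager
def ephemeral_devbox():
    box = Devbox()  # spun up on demand
    try:
        yield box
    finally:
        box.teardown()  # always torn down, even if the agent fails mid-task

with ephemeral_devbox() as box:
    out = box.run("pytest --maxfail=1")
print(out)
```

Because teardown happens in `finally`, a crashed or runaway agent can never leave an environment behind — which is what makes deploying thousands of such environments at once tractable.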
Augment Code allows developers to integrate popular developer tools with just one click by leveraging the open MCP to stream rich runtime context such as build logs and error traces directly into its AI coding assistant without the complexity of traditional setup
Augment Code has launched Easy MCP, a new capability that allows developers to integrate popular developer tools like CircleCI, MongoDB, Redis, Sentry, and Stripe with just one click. Available immediately through the Augment Code extension in VS Code, Easy MCP leverages the open Model Context Protocol (MCP) to stream rich runtime context directly into the company’s AI coding assistant—without the complexity of traditional setup. Augment Code’s proprietary Context Engine streams live data—such as build logs, database schemas and error traces—directly into its core models. By applying real-time signals from across the stack, Augment delivers suggestions and agent runs that are context rich, accurate and production ready, helping teams move faster and deploy with greater confidence. Easy MCP integrations include: CircleCI: Build logs, test insights, flaky‑test detection via the mcp-server-circleci; MongoDB: Data exploration, database management, and context-aware code generation via the MongoDB MCP Server; Redis: Keyspace introspection, TTL audits, and migration helpers through the open‑source mcp-redis server; Sentry: Search issues, errors, traces, logs, and releases. Create RCAs and AI-generated fixes with Seer; Stripe: Real‑time payment events, refund status, subscription metrics, and secure tokenization via Stripe’s remote or local MCP servers (OAuth MCP in public preview).
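At its core, "streaming runtime context into a coding assistant" means assembling signals like build logs and error traces into a bounded context block the model can consume. A minimal sketch of that assembly step, assuming nothing about Augment's Context Engine internals (the function, budget, and source names here are illustrative):

```python
def build_context(sources: dict[str, str], budget_chars: int = 2000) -> str:
    """Assemble runtime signals (build logs, error traces, schemas) into a
    single prompt-context block, trimming each source to fit a budget."""
    per_source = budget_chars // max(len(sources), 1)
    parts = []
    for name, text in sources.items():
        parts.append(f"### {name}\n{text[-per_source:]}")  # keep the most recent tail
    return "\n\n".join(parts)

context = build_context({
    "build log": "...\nstep 12: compile OK\nstep 13: tests FAILED",
    "error trace": "AssertionError in test_checkout.py:42",
})
print(context)
```

Keeping the most recent tail of each source is one common budgeting choice; the value of MCP is that each tool (CircleCI, Sentry, Redis, etc.) can supply these sources through a standard protocol instead of a bespoke integration.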
Helix 2.0 can deploy GenAI on private infrastructure, ensuring compliance with regulations like GDPR and HIPAA while mitigating risks associated with public AI platforms, and slashes deployment times from 6-12 months to just 8 weeks
Helix announced Helix 2.0, a next-generation private AI platform that eliminates the complexity, high costs, and security risks of traditional AI deployments, providing everything needed to build, deploy, and manage powerful AI solutions with complete data sovereignty and predictable economics. Helix 2.0 slashes deployment times from 6-12 months to just 8 weeks with predictable, fixed licensing and infrastructure fees that reduce costs by up to 75% when compared to public AI platforms. Enterprise-grade testing, version control, and rollback capabilities reduce operational risk by 90%, while integrated Vision RAG technology enhances document processing accuracy by 85%, ensuring fidelity in complex financial, regulatory, and technical documents. Its intelligent orchestration engine dynamically allocates resources and optimizes model selection based on workload requirements. Native integration with enterprise DevOps pipelines supports automated testing, CI/CD workflows, and GitOps practices and enables rapid, auditable deployment of AI agents at scale. Integrated Vision RAG technology leverages advanced visual document understanding to process complex, multi-modal files with high fidelity, ensuring accurate extraction and analysis across diverse enterprise data types. Key Features Include: Deployment on Private Infrastructure – Ensure compliance with regulations like GDPR and HIPAA while mitigating risks associated with public AI platforms. Agentic AI and Enterprise CI/CD – Build, test, and deploy AI agents and LLMs with full software engineering rigor, including integration with leading CI/CD platforms, automated testing, GitOps workflow support, and full rollback capabilities. Vision RAG Integration – Process complex documents, including financial statements, regulatory filings, and technical diagrams, with 85% higher accuracy using ColPali-powered visual document understanding.
Kubernetes-Native Architecture – Effortlessly scale to 1000+ concurrent users with enterprise-grade reliability and performance. OpenAI-Compatible APIs – Seamlessly migrate existing projects without code changes, enabling immediate engagement and zero disruption. Enterprise-Grade Authentication – Integrate with Okta, Auth0, and Active Directory for robust, familiar security.
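"OpenAI-compatible API" means a private server exposes the same `/v1/chat/completions` request shape the OpenAI API uses, so existing client code only needs its base URL swapped. A stdlib-only sketch of what that migration looks like — the host, model name, and API key below are placeholders, not Helix's actual values:

```python
import json
import urllib.request

# Hypothetical private endpoint; any OpenAI-compatible server
# exposes the same /v1/chat/completions route.
BASE_URL = "http://helix.internal:8080/v1"

def chat_request(model: str, prompt: str) -> urllib.request.Request:
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer sk-local"},  # placeholder key
    )

req = chat_request("example-model", "Summarize our Q3 filing.")
print(req.full_url)
# urllib.request.urlopen(req) would send it to the private cluster.
```

In practice teams usually keep their existing OpenAI SDK and just point its `base_url` at the private endpoint, which is what makes the migration a zero-code-change exercise.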
Google’s Gemini 2.5 Deep Think, a multi-agent model with advanced AI reasoning capabilities, answers questions by exploring and considering multiple ideas simultaneously and choosing the best answer from those outputs, and can produce “much longer responses”
Google DeepMind is rolling out Gemini 2.5 Deep Think, its most advanced AI reasoning model, able to answer questions by exploring and considering multiple ideas simultaneously and then using those outputs to choose the best answer. Gemini 2.5 Deep Think is Google’s first publicly available multi-agent model. These systems spawn multiple AI agents to tackle a question in parallel, a process that uses significantly more computational resources than a single agent but tends to result in better answers. The Gemini 2.5 Deep Think model is a significant improvement over the version Google announced at I/O. The company also claims to have developed “novel reinforcement learning techniques” to encourage Gemini 2.5 Deep Think to make better use of its reasoning paths. “Deep Think can help people tackle problems that require creativity, strategic planning and making improvements step-by-step,” said Google in a blog post. Gemini 2.5 Deep Think achieves state-of-the-art performance on Humanity’s Last Exam (HLE) — a challenging test measuring AI’s ability to answer thousands of crowdsourced questions across math, humanities, and science. Google claims its model scored 34.8% on HLE (without tools), compared to xAI’s Grok 4, which scored 25.4%, and OpenAI’s o3, which scored 20.3%. Google also says Gemini 2.5 Deep Think outperforms AI models from OpenAI, xAI, and Anthropic on LiveCodeBench 6, a challenging test of competitive coding tasks. Google’s model scored 87.6%, whereas Grok 4 scored 79%, and OpenAI’s o3 scored 72%. Gemini 2.5 Deep Think automatically works with tools such as code execution and Google Search, and is capable of producing “much longer responses” than traditional AI models. In Google’s testing, the model produced more detailed and aesthetically pleasing results on web development tasks compared to other AI models. The model could aid researchers and “potentially accelerate the path to discovery.”
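The parallel-exploration pattern described above — spawn several agents on the same question, then select the best candidate — can be sketched with a thread pool and a scoring function. This is only an illustration of the general best-of-N technique, not Google's implementation; the agent and scorer below are trivial stubs.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy stand-ins: each "agent" proposes an answer; a scorer picks the best.
def agent(seed: int) -> str:
    return f"answer-{seed}" * (seed + 1)  # pretend longer = more worked-out

def score(answer: str) -> int:
    return len(answer)  # stand-in for a learned verifier / reward model

def deep_think(question: str, n_agents: int = 4) -> str:
    with ThreadPoolExecutor(max_workers=n_agents) as pool:
        candidates = list(pool.map(agent, range(n_agents)))  # explore in parallel
    return max(candidates, key=score)  # choose the best among parallel attempts

best = deep_think("prove the lemma", n_agents=4)
print(best)
```

The compute cost scales with the number of parallel candidates, which is why this style of model uses significantly more resources than a single-pass response — the quality gain comes from the selection step.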
Accenture invests in software engineering intelligence platform YearOne that surfaces real-time insights across workflows, individuals, and teams by bringing together insights from existing tools into a single system to enable accelerated digital product development
Accenture has announced a strategic investment in YearOne, a company that helps organizations accelerate software development through its data-driven software engineering intelligence platform. As part of the investment led by Accenture Ventures, Accenture will collaborate with YearOne to help businesses accelerate the lifecycle of digital product development with AI-powered visibility, coaching, and performance optimization. YearOne’s platform surfaces real-time insights across workflows, individuals, and teams by bringing together insights from existing tools into a single system of intelligence that closes the loop between data, behavior, and execution. The platform identifies hidden patterns, delivery bottlenecks, and productivity gaps, offering tailored recommendations and intelligent interventions that enable high-performance software teams to operate with precision. Tom Lounibos, global lead for Accenture Ventures, said, “YearOne’s platform can provide organizations with the clarity needed to develop innovative capabilities and strategic vision to move forward with confidence and purpose.” YearOne can also help teams rebalance deep work, meeting time, and delivery focus. Signals around workload fragmentation, skill gaps, and progress clarity help leaders proactively coach teams, reduce rework, and scale delivery quality. Accenture Song is using YearOne’s platform to establish a benchmark for engineering performance and output that can help teams identify areas for efficiency gains, including how teams are adopting and leveraging AI tools. These insights can also help identify where skills gaps exist and how additional training would be beneficial. Dan Garrison, chief technology officer at Accenture Song, said, “The platform can simultaneously improve digital product software development while also upskilling teams—helping people deliver faster and more accurately.
This is the collaborative nature of humans and AI that we envision will benefit talent and innovation.” YearOne will join Accenture Ventures’ Project Spotlight, a vertical accelerator for emerging technology companies in data and AI. Project Spotlight offers startups extensive access to Accenture’s domain expertise and enterprise client base—helping breakthrough technologies scale faster and deliver more impact.
BigID introduced AI TRiSM (Trust, Risk, and Security Management) – a new, integrated set of controls for governing AI usage, detecting emerging threats, and validating the integrity of the data
BigID introduced AI TRiSM (Trust, Risk, and Security Management) – a new, integrated set of controls that empowers organizations to govern AI usage, detect emerging threats, and validate the integrity of the data fueling their models. BigID’s AI TRiSM unifies three essential capabilities in a single platform: AI Data Trust: validate that training and inference data is compliant, accurate, and appropriate; AI Risk Assessment: quantify exposure across infrastructure, data, usage, and vendors; AI Security Posture Management (SPM): detect unauthorized GenAI use, prevent data exfiltration, and mitigate prompt injection attacks. Unlike tools that stop at visibility, BigID is built for action. AI TRiSM lets teams continuously monitor AI risk, trigger remediation workflows, and enforce policies based on model behavior, data sensitivity, and organizational requirements. AI TRiSM delivers the depth and reach teams need to govern AI across the enterprise—bringing trust, control, and accountability into every AI workflow. Key Takeaways: Detect risky AI behavior with AI Security Posture Management (SPM); Automate AI Risk Assessments across usage, vendors, and infrastructure; Validate training and inference data with AI Data Trust verification; Trigger remediation workflows and enforce policy-driven controls; Operationalize AI governance across data, models, and pipelines
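"Trigger remediation workflows and enforce policy-driven controls" boils down to routing detected AI-usage events to actions based on attributes like data sensitivity. A minimal sketch of that policy-dispatch idea — not BigID's API; the policy table, event fields, and action names are all invented for illustration:

```python
# Illustrative policy table mapping data sensitivity to a remediation action.
POLICIES = {
    "restricted": "block_and_alert",
    "confidential": "redact_and_log",
    "public": "allow",
}

def remediate(event: dict) -> str:
    """Pick a remediation action for a detected GenAI-usage event;
    unknown sensitivity levels fall through to manual review."""
    return POLICIES.get(event["sensitivity"], "quarantine_for_review")

events = [
    {"app": "chat-tool", "sensitivity": "restricted"},
    {"app": "code-assistant", "sensitivity": "public"},
]
actions = [remediate(e) for e in events]
print(actions)  # ['block_and_alert', 'allow']
```

The point of "built for action" platforms is that this dispatch runs continuously against monitored events rather than producing a one-time visibility report.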
Aquant’s platform enables users to build, deploy and integrate custom AI agents across channels by embedding deep domain expertise directly into its architecture and providing service-specific terminology, error codes, data models and built-in workflows
Aquant Inc. has launched its Agentic AI Platform, a horizontal platform infused with domain-specific knowledge. Aquant users gain the flexibility to build, deploy and integrate AI agents tailored to their unique service needs. Organizations can use its prebuilt agents, such as Troubleshooting, Knowledge Search, Parts Identification, IoT and Call Assist, for immediate impact. The platform also allows users to build custom agents using specialized tools, data models and deep domain expertise, or bring their own internally developed agents and integrate them into the platform. Aquant’s agents can be incorporated into existing enterprise AI ecosystems, offering broad compatibility and ensuring organizations don’t need to overhaul their current infrastructure. The platform is purpose-built to address the complexities of modern service environments by embedding deep domain expertise directly into its architecture. The offering provides service-specific terminology, error codes and workflows built in, allowing organizations to deploy custom AI agents faster and more effectively than with generic tools. It’s designed for easy integration, whether a company is starting from scratch or operating within a mature AI ecosystem. It supports multichannel deployment across voice interfaces, customer relationship management and enterprise resource planning systems, collaboration tools, offline environments and emerging channels, ensuring AI is accessible wherever work occurs. Additionally, the platform has a retrieval-augmented conversation layer that enhances effectiveness by delivering outcome-focused intelligence, producing responses that are not only relevant but also aligned with key business performance indicators.
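A retrieval-augmented conversation layer starts with a retrieval step: rank domain knowledge (error-code guides, service manuals) against the user's query and feed the top matches to the model. The sketch below shows the simplest possible version using term overlap — purely illustrative, not Aquant's implementation; the documents and scoring are invented.

```python
# Toy service-knowledge corpus (illustrative error-code snippets).
DOCS = [
    "Error E42 means replace the fuser unit and reset the controller.",
    "Routine maintenance: clean rollers every 30 days.",
]

def retrieve(query: str, docs=DOCS, k: int = 1) -> list[str]:
    """Rank snippets by naive term overlap with the query; real systems
    would use embeddings, but the pipeline shape is the same."""
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: -len(q & set(d.lower().split())))[:k]

top = retrieve("how do I fix error E42")
print(top[0])
```

The retrieved snippets are then placed in the model's context, which is how domain-specific terminology and error codes end up grounding the agent's answer instead of relying on the model's general knowledge.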
