Well-funded AI startup Anysphere Inc. is expanding beyond its viral generative AI code editor and into “agentic AI” with the launch of new web- and mobile-browser-based orchestration tools for coding agents. With its new application, developers can send natural language prompts from a mobile or web browser directly to background agents, instructing them to perform tasks like writing new features or fixing bugs. Using the web app, developers can also monitor fleets of agents that are busy working on different tasks, check their progress, and register completed work in the underlying codebase. Anysphere explained that developers can instruct its AI agents to complete tasks via the web app; if an agent is unable to do so, they can seamlessly switch to the IDE to take over and see what caused it to get stuck. Each agent has its own shareable link, which developers can click to see its progress.
Zango Global’s AI agents can read and interpret regulations with a high degree of accuracy, integrate them directly into a company’s day-to-day operations, and respond to inquiries or draft consulting reviews complete with citations
Zango Global raised $4.8 million in seed funding led by Nexus Venture Partners to provide artificial intelligence agents to financial firms and banks, with the aim of transforming how they deal with regulatory compliance. Zango uses AI agents, a type of artificial intelligence software that can make decisions, do research and achieve specific goals with a degree of autonomy. Agents are designed to carry out tasks with minimal or no human oversight while adapting to changing circumstances. This allows them to continuously integrate knowledge, including regulatory information, so they can respond to inquiries or draft consulting reviews complete with citations. The company said its large language models and AI agents don’t just read and interpret regulations with a high degree of accuracy; they can also integrate directly into a company’s day-to-day operations. In one example given by Zango, a regulatory process at a bank that would have taken 48 hours was reduced to under four hours using the agentic AI platform. Using the platform, the company said, staying compliant while launching a new product or service can be as simple as spinning up an agent and asking: “I want to launch a lending product in X market. What do I need to do?” The agents will go to work, track down all the necessary resources and produce research, compliance requirements, records, citations, an impact assessment and a gap analysis helpful for future-proofing the product.
OPAQUE Systems integrates confidential computing with popular data and AI tools, to process fully encrypted data from ingestion to inference by enforcing cryptographically verifiable privacy, secure code execution, and auditable proof of compliance
OPAQUE Systems, provider of the industry’s first confidential AI platform, announced the availability of its secure AI solution on the Microsoft Azure Marketplace. By integrating confidential computing with popular data and AI tools, OPAQUE lets enterprises process sensitive data fully encrypted, from ingestion to inference, without costly code rewrites or specialized cryptographic skills. Most confidential computing solutions focus on encrypting data in use and verifying the basic infrastructure, such as applications running in Confidential Virtual Machines. OPAQUE goes significantly further by enforcing privacy, security, and compliance policies from data ingestion to inference. This comprehensive coverage means customers can safely deploy classic analytics/ML and advanced AI agents on their most valuable, confidential data without compromising on sovereignty or compliance. By keeping sensitive information encrypted even during analysis and inference, organizations gain cryptographically verifiable privacy, protection against unapproved agents or code execution, and auditable proof of compliance at every step. This robust coverage frees enterprises to innovate at scale with their differentiated, proprietary data while minimizing regulatory risk on a single platform. OPAQUE describes itself as the only platform that meets these needs across these critical phases.
Adaptive Computer’s no-code web-app platform lets non-programmers build full-featured apps that include payments (via Stripe), scheduled tasks, and AI features such as image generation and speech synthesis, simply by entering a text prompt
Startup Adaptive Computer wants non-programmers to be using full-featured apps that they’ve created themselves, simply by entering a text prompt into Adaptive’s no-code web-app platform. To be clear, this isn’t about the computer itself or any hardware, despite the company’s name. The startup currently only builds web apps. For every app it builds, Adaptive Computer’s engine handles creating a database instance, user authentication and file management, and it can create apps that include payments (via Stripe), scheduled tasks, and AI features such as image generation, speech synthesis, content analysis, and web search/research. Besides taking care of the back-end database and other technical details, Adaptive apps can work together. For instance, a user can build a file-hosting app, and a second app can access those files. Founder Dennis Xu likens this to an “operating system” rather than a single web app. He says the difference between more established products and his startup is that the others were originally geared toward making programming easier for programmers. “We’re building for the everyday person who is interested in creating things to make their own lives better.”
OpenAI is looking to acquire AI coding startups for its next growth areas, amid pricing pressure on access to foundational models and competitors’ models outperforming its own on coding benchmarks
Anysphere, maker of AI coding assistant Cursor, is growing so quickly that it’s not in the market to be sold, even to OpenAI, a source close to the company tells TechCrunch. It’s been a hot target. Cursor is one of the most popular AI-powered coding tools, and its revenue has been growing astronomically — doubling on average every two months, according to another source. Anysphere’s current average annual recurring revenue is about $300 million, according to the two sources. The company previously walked away from early acquisition discussions with OpenAI, after the ChatGPT maker approached Cursor, the two sources close to the company confirmed, and CNBC previously reported. Anysphere has also received other acquisition offers that the company didn’t consider, according to one of these sources. Cursor turned down the offers because the startup wants to stay independent, said the two people close to the company. Instead, Anysphere has been in talks to raise capital at about a $10 billion valuation, Bloomberg reported last month. Although it didn’t nab Anysphere, OpenAI didn’t give up on buying an established AI coding tool startup. OpenAI talked with more than 20 others, CNBC reported. And then it got serious over the next-fastest-growing AI coding startup, Windsurf, with a $3 billion acquisition offer, Bloomberg reported last week. While Windsurf is a comparatively smaller company, its ARR is about $100 million, up from $40 million in ARR in February, according to a source. Windsurf has been gaining popularity with the developer community, too, and its coding product is designed to work with legacy enterprise systems. Windsurf did not respond to TechCrunch’s request for comment. OpenAI declined to comment on its acquisition talks. OpenAI is likely shopping because it’s looking for its next growth areas as competitors such as Google’s Gemini and China’s DeepSeek put pricing pressure on access to foundational models. 
Moreover, Anthropic and Google have recently released AI models that outperform OpenAI’s models on coding benchmarks, increasingly making them a preferred choice for developers. While OpenAI could build its own AI coding assistant, buying a product that is already popular with developers means the ChatGPT-maker wouldn’t have to start from scratch to build this business. VCs who invest in developer tool startups are certainly watching. Speculating about OpenAI’s strategy, Chris Farmer, partner and CEO at SignalFire, told TechCrunch of the company, “They’ll be acquisitive at the app layer. It’s existential for them.”
Amazon Bedrock’s serverless endpoint system dynamically predicts the response quality of each model and efficiently routes each request to the most appropriate model based on cost and response quality
Amazon Bedrock has announced the general availability of its Intelligent Prompt Routing, a serverless endpoint that efficiently routes requests between different foundation models within the same model family. The system dynamically predicts the response quality of each model for a given request and routes the request to the model it determines is most appropriate based on cost and response quality. The system incorporates state-of-the-art methods for training routers for different sets of models, tasks, and prompts. Developers can use the default prompt routers provided by Amazon Bedrock or configure their own, tuning the cost-performance trade-off linearly between two candidate LLMs. The system has reduced the overhead of the added components by over 20%, to approximately 85 ms (P90), resulting in an overall latency and cost benefit compared with always hitting the larger, more expensive model. Amazon Bedrock has conducted internal tests with proprietary and public data to evaluate the system’s performance metrics.
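To make the routing idea concrete, here is a minimal, self-contained sketch of quality-aware routing between a cheap and an expensive model. It is not Bedrock’s actual algorithm; the model names, the toy quality predictor, and the quality floor are all invented for illustration.

```python
# Illustrative sketch of quality-aware prompt routing (not Bedrock's
# actual implementation; names and numbers are made up).
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    cost_per_1k_tokens: float  # lower is cheaper

    def predict_quality(self, prompt: str) -> float:
        # Toy predictor: assume the larger model holds up on long/complex
        # prompts while the small model degrades as complexity grows.
        complexity = min(len(prompt.split()) / 100.0, 1.0)
        return {"small-model": 0.9 - 0.5 * complexity,
                "large-model": 0.95}[self.name]

def route(prompt: str, candidates: list[Candidate], quality_floor: float = 0.7) -> str:
    """Pick the cheapest candidate whose predicted quality clears the floor."""
    viable = [c for c in candidates if c.predict_quality(prompt) >= quality_floor]
    if not viable:  # nothing clears the floor: fall back to the best predicted quality
        return max(candidates, key=lambda c: c.predict_quality(prompt)).name
    return min(viable, key=lambda c: c.cost_per_1k_tokens).name

models = [Candidate("small-model", 0.3), Candidate("large-model", 3.0)]
print(route("What is 2 + 2?", models))                     # short prompt -> cheap model
print(route("Summarize this contract ... " * 50, models))  # complex prompt -> large model
```

The trade-off knob Bedrock exposes corresponds roughly to `quality_floor` here: raising it pushes more traffic to the stronger model, lowering it favors cost.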
Codacy’s solution integrates directly with AI coding assistants to enforce coding standards using an MCP server, flagging or fixing issues in real time
Codacy, provider of automated code quality and security solutions, launched Codacy Guardrails, a new product designed to bring real-time security, compliance, and quality enforcement to AI-generated code. Guardrails is the first solution of its kind: it integrates directly with AI coding assistants to enforce coding standards, checking AI-generated code before it ever reaches the developer and preventing non-compliant code from being generated in the first place. Built on Codacy’s SOC2-compliant platform, Codacy Guardrails empowers teams to define their own secure development policies and apply them across every AI-generated prompt. With Codacy Guardrails, AI-assisted tools gain full access to the security and quality context of a team’s codebase. At the core of the product is the Codacy MCP server, which connects development environments to the organization’s code standards. This gives LLMs the ability to reason about policies, flag or fix issues in real time, and deliver code that’s compliant by default. Guardrails integrates with popular IDEs like Cursor AI and Windsurf, as well as VSCode and IntelliJ through Codacy’s plugin, allowing developers to apply guardrails directly within their existing workflows.
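The core pattern of checking generated code against team policies before it reaches the developer can be sketched in a few lines. This is a toy illustration only; the policy IDs, patterns, and interface below are invented and are far simpler than Codacy’s actual rule engine.

```python
# Toy pre-delivery guardrail check: scan AI-generated code against a
# policy list before handing it to the developer. Patterns are invented.
import re

POLICIES = {
    "no-eval": re.compile(r"\beval\s*\("),
    "no-hardcoded-secret": re.compile(r"(?i)(api_key|password)\s*=\s*['\"]\w+['\"]"),
}

def check_generated_code(code: str) -> list[str]:
    """Return the IDs of violated policies; an empty list means compliant."""
    return [name for name, pattern in POLICIES.items() if pattern.search(code)]

snippet = 'api_key = "abc123"\nresult = eval(user_input)'
print(check_generated_code(snippet))  # both policies trip on this snippet
```

In the real product this context flows through the MCP server, so the assistant can consult the organization’s standards and fix violations before emitting code, rather than scanning after the fact.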
Docker aims to simplify AI software delivery by containerizing MCP servers, along with offering an enterprise-ready toolkit and a centralized platform for discovering and managing them from a catalog of 100+ servers
Software containerization company Docker is launching the Docker MCP Catalog and Docker MCP Toolkit, which bring more of the AI workflow into the existing Docker developer experience and simplify AI software delivery. The new offerings are based on the emerging Model Context Protocol standard created by its partner Anthropic PBC. Docker argues that the simplest way to use Anthropic’s MCP to improve LLMs is to containerize it. To do that, it offers tools such as Docker Desktop for building, testing and running MCP servers, as well as Docker Hub to distribute their container images, and Docker Scout to ensure they’re secure. By packaging MCP servers as containers, developers can eliminate the hassles of installing dependencies and configuring their runtime environments. The Docker MCP Catalog, integrated within Docker Hub, is a centralized way for developers to discover, run and manage MCP servers, while the Docker MCP Toolkit offers “enterprise-ready tooling” for putting AI applications to work. At launch, there are more than 100 MCP servers available within Docker MCP Catalog. President and Chief Operating Officer Mark Cavage explained that “The Docker MCP Catalog brings that all together in one place, a trusted, developer-friendly experience within Docker Hub, where tools are verified, secure, and easy to run.”
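As an illustration of what “containerized MCP server” means in practice, stdio-based MCP clients are commonly pointed at a `docker run` command instead of a locally installed binary, so the server’s dependencies live entirely inside the image. The server name and image below are assumptions for illustration, not taken from Docker’s announcement:

```json
{
  "mcpServers": {
    "example-server": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "mcp/example-server"]
    }
  }
}
```

With this shape of configuration, updating or removing a server is just pulling or deleting an image, which is the dependency-management simplification Docker is pitching.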
Amazon’s new benchmark to evaluate AI coding agents’ ability to navigate and understand complex codebases and GitHub issues
Amazon has introduced SWE-PolyBench, the first industry benchmark to evaluate AI coding agents’ ability to navigate and understand complex codebases across multiple languages. The benchmark measures system performance on resolving GitHub issues, a task that has spurred the development of capable coding agents and for which the Python-only SWE-Bench has become the de facto standard. SWE-PolyBench contains over 2,000 curated issues in four languages and a stratified subset of 500 issues for rapid experimentation. The benchmark aims to advance AI performance in real-world scenarios. Key features of SWE-PolyBench at a glance:
- Multi-language support: Java (165 tasks), JavaScript (1,017 tasks), TypeScript (729 tasks), and Python (199 tasks).
- Extensive dataset: 2,110 instances from 21 repositories, ranging from web frameworks to code editors and ML tools, on the same scale as the full SWE-Bench but drawn from more repositories.
- Task variety: bug fixes, feature requests, and code refactoring.
- Faster experimentation: SWE-PolyBench500, a stratified subset for efficient experimentation.
- Leaderboard: a leaderboard with a rich set of metrics for transparent benchmarking.
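A stratified 500-task subset of a 2,110-task pool can be allocated proportionally across languages; the sketch below uses the largest-remainder method on the per-language counts above. The allocation method is an assumption for illustration; Amazon’s exact sampling procedure may differ.

```python
# Proportional (stratified) allocation of a fixed-size subset across strata,
# using the largest-remainder method. Counts are SWE-PolyBench's per-language
# task totals; the allocation scheme itself is assumed, not documented.

def stratified_allocation(counts: dict[str, int], subset_size: int) -> dict[str, int]:
    total = sum(counts.values())
    # Floor each stratum's proportional share, then hand the leftover
    # slots to the strata with the largest fractional remainders.
    shares = {k: subset_size * v / total for k, v in counts.items()}
    alloc = {k: int(s) for k, s in shares.items()}
    leftover = subset_size - sum(alloc.values())
    for k in sorted(shares, key=lambda k: shares[k] - alloc[k], reverse=True)[:leftover]:
        alloc[k] += 1
    return alloc

tasks = {"Java": 165, "JavaScript": 1017, "TypeScript": 729, "Python": 199}
print(stratified_allocation(tasks, 500))
```

Stratifying this way keeps the subset’s language mix close to the full benchmark’s, so scores on the 500-task slice remain comparable to full-scale runs.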
New system for training AI agents focuses on multi-turn, interactive settings where agents must adapt, remember, and reason in the face of uncertainty, instead of static tasks like math solving or code generation
A collaborative team from Northwestern University, Microsoft, Stanford, and the University of Washington — including a former DeepSeek researcher named Zihan Wang, currently completing a computer science PhD at Northwestern — has introduced RAGEN, a new system for training and evaluating AI agents that they hope makes them more reliable and less brittle for real-world, enterprise-grade usage. Unlike static tasks like math solving or code generation, RAGEN focuses on multi-turn, interactive settings where agents must adapt, remember, and reason in the face of uncertainty. Built on a custom RL framework called StarPO (State-Thinking-Actions-Reward Policy Optimization), the system explores how LLMs can learn through experience rather than memorization. StarPO operates in two interleaved phases: a rollout stage where the LLM generates complete interaction sequences guided by reasoning, and an update stage where the model is optimized using normalized cumulative rewards. This structure supports a more stable and interpretable learning loop than standard policy optimization approaches. A stabilized variant, StarPO-S, incorporates three key interventions: uncertainty-based rollout filtering, KL penalty removal, and asymmetric PPO clipping. The team identified three dimensions that significantly impact training: task diversity, interaction granularity, and rollout freshness. Together, these factors make the training process more stable and effective.
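Two of the ideas above can be sketched compactly: computing normalized cumulative rewards for a rollout, and uncertainty-based filtering that keeps only prompts whose rollouts disagree (a uniform reward group carries no gradient signal). This is a toy sketch under assumed shapes and thresholds, not the RAGEN codebase.

```python
# Toy sketch of two StarPO ideas (data shapes and the variance threshold
# are invented): normalized cumulative rewards, and uncertainty-based
# rollout filtering that drops prompts whose rollouts all agree.
from statistics import mean, pstdev

def normalized_returns(step_rewards: list[float]) -> list[float]:
    """Cumulative reward-to-go per step, normalized to zero mean / unit std."""
    returns, running = [], 0.0
    for r in reversed(step_rewards):
        running += r
        returns.append(running)
    returns.reverse()
    mu, sigma = mean(returns), pstdev(returns) or 1.0  # avoid dividing by zero
    return [(g - mu) / sigma for g in returns]

def filter_rollouts(groups: dict[str, list[float]], min_std: float = 0.1) -> list[str]:
    """Keep prompts whose rollout rewards vary; uniform groups teach nothing."""
    return [p for p, rewards in groups.items() if pstdev(rewards) >= min_std]

groups = {"easy": [1.0, 1.0, 1.0], "hard": [0.0, 1.0, 0.5]}
print(filter_rollouts(groups))  # only "hard" carries a learning signal
```

In the rollout/update loop described above, the filtered prompts would feed the rollout stage, and the normalized returns would weight the policy update.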