Anthropic is increasing the amount of information that enterprise customers can send to Claude in a single prompt, part of an effort to attract more developers to the company’s popular AI coding models. For Anthropic’s API customers, the company’s Claude Sonnet 4 AI model now has a 1 million token context window — meaning the AI can handle requests as long as 750,000 words, more than the entire “Lord of the Rings” trilogy, or 75,000 lines of code. That’s roughly five times Claude’s previous limit (200,000 tokens), and more than double the 400,000-token context window offered by OpenAI’s GPT-5. Long context will also be available for Claude Sonnet 4 through Anthropic’s cloud partners, including Amazon Bedrock and Google Cloud’s Vertex AI. Anthropic’s product lead for the Claude platform, Brad Abrams, expects AI coding platforms to get a “lot of benefit” from this update. When asked if GPT-5 put a dent in Claude’s API usage, Abrams downplayed the concern, saying he’s “really happy with the API business and the way it’s been growing.” Whereas OpenAI generates most of its revenue from consumer subscriptions to ChatGPT, Anthropic’s business centers on selling AI models to enterprises through an API. That’s made AI coding platforms a key customer for Anthropic, and could be why the company is adding new perks to attract users in the face of GPT-5. Abrams also said that Claude’s large context window helps it perform better at long agentic coding tasks, in which the AI model works autonomously on a problem for minutes or hours: with a large context window, Claude can remember all its previous steps in long-horizon tasks. He added that Anthropic’s research team focused on increasing not just the context window for Claude, but also the “effective context window,” suggesting that its AI can actually understand most of the information it’s given.
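The headline numbers follow from a common rule of thumb of roughly 0.75 English words per token (an approximation, not an exact rate). A quick sketch of the arithmetic:

```python
# Rough arithmetic behind the headline figures, assuming ~0.75 words
# per token (a widely used rule of thumb, not an exact conversion).
def tokens_to_words(tokens: int, words_per_token: float = 0.75) -> int:
    """Estimate the word capacity of a given context window."""
    return int(tokens * words_per_token)

old_capacity = tokens_to_words(200_000)    # previous Claude limit
new_capacity = tokens_to_words(1_000_000)  # new 1M-token window

print(new_capacity)                 # 750000 words
print(new_capacity // old_capacity) # 5 (roughly five times the old limit)
```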
Anthropic’s Claude tops enterprise market share at 32%, outpacing OpenAI’s 25% and Google’s 20%, as firms prioritize performance, coding ROI, and MCP‑enabled tool use
Claude leads the enterprise race: According to a July report by venture capital firm Menlo Ventures, Anthropic has the top market share among enterprises, at 32%. OpenAI’s AI models used to lead, holding 50% at the end of 2023, but their share has since declined to 25%. Google, which the report said experienced “strong growth” in recent months, is at 20%. (Microsoft, as OpenAI’s largest investor, uses OpenAI’s models and offers them to clients.) Among open-source models, Meta’s Llama has a 9% share, while AI startup DeepSeek accounts for 1%. Anthropic’s Claude began gaining momentum in June 2024, with the release of Claude Sonnet 3.5, followed by Sonnet 3.7, Sonnet 4, Opus 4 and Claude Code. Another boost for Anthropic is its development of the Model Context Protocol (MCP), an open set of rules and standards that describe how AI models like Claude and Gemini can connect to tools, APIs, data sources and other agents. It can be broadly used across industries such as finance; MCP, for example, is part of the new stack for intelligent commerce, where companies such as Visa are using it to enable AI agents to interact with payments and other tools to perform tasks autonomously and securely. Menlo’s report also found that enterprises prioritize performance over price and prefer closed, proprietary models over open-source ones. One reason is that the performance of open-source AI models lags closed models by nine to 12 months; another is the technical complexity of going it alone and a reluctance to use Chinese APIs. CFOs, on the other hand, are more cost-conscious.
Agent2.AI’s AI orchestration platform can understand user intent, break down the request into smaller, manageable steps, delegate each task to focused atomic agents and deliver real, usable outputs such as reports, spreadsheets, and presentations
Agent2.AI announced the upcoming launch of Super Agent, a breakthrough AI orchestration platform designed to coordinate intelligent work across multiple agents, APIs, and even real human collaborators. Unlike traditional AI tools that focus on generating content or answering questions, Super Agent acts as an orchestration layer — a system that understands user intent, delegates work to the right components, and delivers real, usable outputs such as reports, spreadsheets, and presentations. “We’re not building just another AI agent,” said Chuci Qin, CEO of Agent2.AI. Users prompt Super Agent with a request, and the system automatically breaks it into smaller, manageable steps, each handled by a focused atomic agent. Each agent is built to do one specific job, such as finding information, organizing research, or creating slides. These atomic agents form a growing ecosystem inside Agent2.AI, each focused, reliable, and composable. Super Agent can also call on external tools and agents through standard protocols such as MCP or A2A, allowing the system to dynamically connect with open-source frameworks, third-party APIs, or no-code automations as needed. In some cases, tasks may require not just software but real-world execution, such as placing an order, contacting a vendor, or managing a physical deliverable. When that’s the case, Super Agent can seamlessly coordinate with vetted freelancers or agency partners. These human contributors are not fallback options, but core participants in a flexible, multi-agent system.
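The intent-to-atomic-agent pattern described above can be sketched in a few lines. This is purely illustrative: the agent names (research, organize, slides) and the planner interface are invented for the example, not Agent2.AI’s actual API.

```python
# Hypothetical sketch of the orchestration pattern: a request is
# planned into steps, each routed to a single-purpose "atomic agent".
from typing import Callable

# Each atomic agent does exactly one job (names are illustrative).
ATOMIC_AGENTS: dict[str, Callable[[str], str]] = {
    "research": lambda task: f"findings for {task!r}",
    "organize": lambda task: f"outline for {task!r}",
    "slides":   lambda task: f"deck for {task!r}",
}

def super_agent(request: str, plan: list[tuple[str, str]]) -> list[str]:
    """Delegate each planned step to the matching atomic agent."""
    return [ATOMIC_AGENTS[agent](task) for agent, task in plan]

outputs = super_agent(
    "market report on EV batteries",
    [("research", "EV battery market"),
     ("organize", "key findings"),
     ("slides", "10-slide summary")],
)
print(len(outputs))  # 3
```

In the real platform the plan itself would be produced by a model from the user’s natural-language request; here it is hard-coded to keep the sketch self-contained.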
BigID introduced AI TRiSM (Trust, Risk, and Security Management) – a new, integrated set of controls for governing AI usage, detecting emerging threats, and validating the integrity of the data
BigID introduced AI TRiSM (Trust, Risk, and Security Management) – a new, integrated set of controls that empowers organizations to govern AI usage, detect emerging threats, and validate the integrity of the data fueling their models. BigID’s AI TRiSM unifies three essential capabilities in a single platform: AI Data Trust: validate that training and inference data is compliant, accurate, and appropriate; AI Risk Assessment: quantify exposure across infrastructure, data, usage, and vendors; AI Security Posture Management (SPM): detect unauthorized GenAI use, prevent data exfiltration, and mitigate prompt injection attacks. Unlike tools that stop at visibility, BigID is built for action. AI TRiSM lets teams continuously monitor AI risk, trigger remediation workflows, and enforce policies based on model behavior, data sensitivity, and organizational requirements. AI TRiSM delivers the depth and reach teams need to govern AI across the enterprise—bringing trust, control, and accountability into every AI workflow. Key Takeaways: Detect risky AI behavior with AI Security Posture Management (SPM); Automate AI Risk Assessments across usage, vendors, and infrastructure; Validate training and inference data with AI Data Trust verification; Trigger remediation workflows and enforce policy-driven controls; Operationalize AI governance across data, models, and pipelines
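The policy-driven control loop described above — observe AI usage, match it against rules on data sensitivity and sanctioned use, then trigger remediation — can be sketched as follows. The event fields, sensitivity labels, and actions are all invented for illustration and do not reflect BigID’s actual product or API.

```python
# Illustrative sketch of policy-driven AI governance: an observed
# usage event is mapped to a remediation action based on whether the
# tool is sanctioned and how sensitive the data involved is.
from dataclasses import dataclass

@dataclass
class AIEvent:
    tool: str              # e.g. an unsanctioned GenAI app
    data_sensitivity: str  # "public" | "internal" | "restricted"
    sanctioned: bool       # is this tool approved by the org?

def assess(event: AIEvent) -> str:
    """Map an observed AI usage event to a remediation action."""
    if not event.sanctioned and event.data_sensitivity == "restricted":
        return "block_and_alert"   # e.g. possible data exfiltration
    if not event.sanctioned:
        return "flag_for_review"   # shadow AI, lower-sensitivity data
    return "allow"

print(assess(AIEvent("shadow-chatbot", "restricted", False)))  # block_and_alert
print(assess(AIEvent("approved-copilot", "internal", True)))   # allow
```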
LFM2-VL, a new generation of vision-language foundation models, can be deployed across a wide range of hardware — from smartphones and laptops to wearables and embedded systems — promising low-latency performance, strong accuracy, and flexibility for real-world applications
Liquid AI has released LFM2-VL, a new generation of vision-language foundation models designed for efficient deployment across a wide range of hardware — from smartphones and laptops to wearables and embedded systems. The models promise low-latency performance, strong accuracy, and flexibility for real-world applications. According to Liquid AI, the models deliver up to twice the GPU inference speed of comparable vision-language models, while maintaining competitive performance on common benchmarks. The release includes two model sizes: LFM2-VL-450M — a hyper-efficient model with less than half a billion parameters (internal settings) aimed at highly resource-constrained environments. LFM2-VL-1.6B — a more capable model that remains lightweight enough for single-GPU and device-based deployment. Both variants process images at native resolutions up to 512×512 pixels, avoiding distortion or unnecessary upscaling. For larger images, the system applies non-overlapping patching and adds a thumbnail for global context, enabling the model to capture both fine detail and the broader scene. Unlike traditional architectures, Liquid’s approach aims to deliver competitive or superior performance using significantly fewer computational resources, allowing for real-time adaptability during inference while maintaining low memory requirements. This makes LFMs well suited for both large-scale enterprise use cases and resource-limited edge deployments. LFM2-VL uses a modular architecture combining a language model backbone, a SigLIP2 NaFlex vision encoder, and a multimodal projector. The projector includes a two-layer MLP connector with pixel unshuffle, reducing the number of image tokens and improving throughput. Users can adjust parameters such as the maximum number of image tokens or patches, allowing them to balance speed and quality depending on the deployment scenario. 
The training process involved approximately 100 billion multimodal tokens, sourced from open datasets and in-house synthetic data.
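The tiling scheme described above — native resolution up to 512×512, non-overlapping patches plus a thumbnail for larger images — can be sketched as a small planning function. The exact patch size, rounding behavior, and thumbnail handling here are assumptions for illustration, not Liquid AI’s published implementation.

```python
import math

# Sketch of the described tiling: images at or under 512x512 are kept
# at native resolution; larger images are split into non-overlapping
# 512x512 patches, with one downscaled thumbnail for global context.
def plan_patches(width: int, height: int, patch: int = 512) -> dict:
    if width <= patch and height <= patch:
        # Native resolution, no upscaling or distortion needed.
        return {"patches": 1, "thumbnail": False}
    cols = math.ceil(width / patch)  # partial patches still count
    rows = math.ceil(height / patch)
    return {"patches": cols * rows, "thumbnail": True}

print(plan_patches(512, 512))   # {'patches': 1, 'thumbnail': False}
print(plan_patches(1024, 768))  # {'patches': 4, 'thumbnail': True}
```

This is also where the tunable speed/quality trade-off lives: capping the maximum number of patches or image tokens bounds the work the vision encoder does per image.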
SurrealDB’s MCP-compatible server gives AI agents secure, permissioned, real-time memory across vectors, graphs, and documents, enabling compliant recall, live queries, and policy‑aware updates
SurrealDB has announced the launch of SurrealMCP, the official Model Context Protocol (MCP) server for SurrealDB and SurrealDB Cloud. SurrealMCP gives AI assistants, AI agents, IDEs, chatbots, and data platforms the ability to securely store, recall, and reason over live structured data – giving them the persistent, permission-aware memory they’ve been missing. Built on the open Model Context Protocol standard (modelcontextprotocol.io), SurrealMCP connects any MCP-compatible client to SurrealDB with full portability and interoperability across the AI ecosystem. SurrealMCP gives agents a secure, real-time memory layer backed by SurrealDB’s multi-model engine. With SurrealMCP, agents can: Remember and recall events, facts, and conversations over time; Query and update live data with role-based access controls; Link vectors, graphs, and documents to create deep contextual understanding; Perform administrative tasks like creating schemas or seeding data, all through natural language. Example use cases: Agent Memory: “Store this chat and recall anything about shipping delays.” SurrealMCP stores the conversation as vectors, links related data in graph form, and makes it time-travel queryable. Business Intelligence: “Recall customers in the top ten percent by lifetime value.” SurrealMCP translates the request into optimized queries, respecting all access policies. Operational Automation: “Create a dev namespace in Europe, apply the schema, seed sample data.” SurrealMCP executes instantly, no dashboards, no manual scripts. Enterprise Co-pilots: Power contextual CRM insights, real-time inventory tracking, or customer support histories
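The “top ten percent by lifetime value” example above boils down to a ranking computation. Here it is sketched in plain Python for concreteness; SurrealMCP would instead translate the natural-language request into an optimized database query, and the customer data below is made up.

```python
# Plain-Python sketch of the query SurrealMCP is described as
# generating: rank customers by lifetime value and keep the top 10%.
def top_decile(customers: list[dict]) -> list[dict]:
    """Return the top 10% of customers by lifetime value (at least one)."""
    ranked = sorted(customers, key=lambda c: c["ltv"], reverse=True)
    k = max(1, len(ranked) // 10)
    return ranked[:k]

# 20 fabricated customers with lifetime values 100..2000.
customers = [{"id": i, "ltv": i * 100} for i in range(1, 21)]
print([c["id"] for c in top_decile(customers)])  # [20, 19]
```

In the real system, access policies would be applied before any rows are returned, so an agent only ever sees data its role permits.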
A new open-source method uses the MCP architecture to evaluate agent performance across a variety of available LLMs, gathering real-time information on how agents interact with tools, generating synthetic data, and creating a database to benchmark them
Researchers from Salesforce discovered another way to utilize MCP technology, this time to aid in evaluating AI agents themselves. The researchers unveiled MCPEval, a new method and open-source toolkit built on the architecture of the MCP system that tests agent performance when using tools. They noted that current evaluation methods for agents are limited because they “often relied on static, pre-defined tasks, thus failing to capture the interactive real-world agentic workflows.” MCPEval differentiates itself by being a fully automated process, which the researchers claimed allows for rapid evaluation of new MCP tools and servers. It gathers information on how agents interact with tools within an MCP server, generates synthetic data, and creates a database to benchmark agents. Users can choose which MCP servers, and which tools within those servers, to test the agent’s performance on. MCPEval’s framework follows a task generation, verification and model evaluation design. Because it leverages multiple large language models (LLMs), users can work with models they are more familiar with, and agents can be evaluated against a variety of LLMs available on the market. Enterprises can access MCPEval through an open-source toolkit released by Salesforce. Through a dashboard, users configure the server by selecting a model, which then automatically generates tasks for the agent to follow within the chosen MCP server. Once the user verifies the tasks, MCPEval takes them and determines the tool calls needed as ground truth; these tasks serve as the basis for the test. Users choose which model they prefer to run the evaluation. MCPEval can then generate a report on how well the agent and the test model functioned in accessing and using these tools. What makes MCPEval stand out from other agent evaluators is that it brings testing into the same environment in which the agent will be working.
Agents are evaluated on how well they access tools within the MCP server to which they will likely be deployed.
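The core of the evaluation loop described above — verified tasks carrying ground-truth tool calls, scored against what the agent actually invoked — can be sketched as follows. The task and tool names are invented for illustration, and real MCPEval reports are far richer than a single accuracy number.

```python
# Minimal sketch of tool-call evaluation: each generated task records
# the ground-truth sequence of tool calls; an agent's actual calls
# are compared against that sequence.
def score_agent(tasks: list[dict], agent_calls: dict[str, list[str]]) -> float:
    """Fraction of tasks where the agent's tool calls match ground truth."""
    hits = sum(
        1 for t in tasks
        if agent_calls.get(t["task_id"]) == t["ground_truth_calls"]
    )
    return hits / len(tasks)

tasks = [
    {"task_id": "t1", "ground_truth_calls": ["search_flights", "book_flight"]},
    {"task_id": "t2", "ground_truth_calls": ["get_weather"]},
]
# The agent matched t1 exactly but called the wrong tool on t2.
agent_calls = {"t1": ["search_flights", "book_flight"], "t2": ["get_forecast"]}
print(score_agent(tasks, agent_calls))  # 0.5
```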
Aquant’s platform enables users to build, deploy and integrate custom AI agents across channels by embedding deep domain expertise directly into its architecture and providing service-specific terminology, error codes, data models and built-in workflows
Aquant Inc. has launched its Agentic AI Platform, a horizontal platform infused with domain-specific knowledge. Aquant users gain the flexibility to build, deploy and integrate AI agents tailored to their unique service needs. Organizations can use its prebuilt agents, such as Troubleshooting, Knowledge Search, Parts Identification, IoT and Call Assist, for immediate impact. The platform also allows users to build custom agents using specialized tools, data models and deep domain expertise, or bring their own internally developed agents and integrate them into the platform. Aquant’s agents can be incorporated into existing enterprise AI ecosystems, offering broad compatibility and ensuring organizations don’t need to overhaul their current infrastructure. The platform is purpose-built to address the complexities of modern service environments by embedding deep domain expertise directly into its architecture. The offering provides service-specific terminology, error codes and workflows built in, allowing organizations to deploy custom AI agents faster and more effectively than with generic tools. It’s designed for easy integration, whether a company is starting from scratch or operating within a mature AI ecosystem. It supports multichannel deployment across voice interfaces, customer relationship management and enterprise resource planning systems, collaboration tools, offline environments and emerging channels, ensuring AI is accessible wherever work occurs. Additionally, the platform has a retrieval-augmented conversation layer that enhances effectiveness by delivering outcome-focused intelligence, producing responses that are not only relevant but also aligned with key business performance indicators.
Shortcut, an MIT startup’s system of AI agents, automates multi-step Excel work, adapts in real time to complex workflows, and enables full financial modeling and analysis from natural language prompts
Fundamental Research Labs, an artificial intelligence startup launched out of MIT, has created Shortcut, a system of AI agents that can do multi-step work in Excel, such as creating discounted cash flow models in finance. Shortcut is accessed through its website, which is designed to look quite similar to Excel, with its green lines and tabs. A sidebar opens on the right side where users can prompt the AI agents, and users can also open or upload Excel files in Shortcut. The user writes a prompt in natural language and uploads documents for the AI agents to “read”; the group of AI agents behind the tool then gets to work creating business models, financial statements and the like. Unlike macro scripts or cloud-based automation, Shortcut can adapt mid-task if something changes. That flexibility makes it more likely to handle the messy, inconsistent workflows that dominate office life. It also keeps sensitive data on-device, a selling point for regulated industries. Shortcut CEO Nico Christie said, “It’s not about replacing Excel — it’s about replacing the need to open Excel in the first place.” Asked how Shortcut differs from the Microsoft Copilot built into Excel, Christie said Copilot does specific tasks the user tells it to do, like writing formulas or creating charts, while Shortcut does full financial modeling and analysis. Shortcut scores over 80% on cases presented at the Microsoft Excel World Championship, described as a “thrilling” competition among Excel users. Christie said Shortcut finished the cases in about 10 minutes, roughly 10 times faster than humans.
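The discounted cash flow model mentioned above reduces to a single formula: the present value is the sum of each future cash flow divided by (1 + r)^t. The cash flows and the 10% discount rate below are made-up inputs for illustration, not Shortcut output.

```python
# Core of a discounted cash flow (DCF) model:
#   PV = sum over years t of CF_t / (1 + r)^t
def dcf(cash_flows: list[float], rate: float) -> float:
    """Present value of a series of future annual cash flows."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows, start=1))

# Three years of fabricated cash flows, discounted at 10%.
value = dcf([100.0, 110.0, 121.0], rate=0.10)
print(round(value, 2))  # 272.73
```

A tool like Shortcut layers the hard parts on top of this arithmetic: extracting the cash flow assumptions from uploaded documents and laying the model out across spreadsheet cells.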
Edgen rolls out multi‑agent stock intelligence that generates sub‑second market reports, AI ratings, and price forecasts via its EDGM model, enabling rapid, executable equity decisions at scale
Edgen, an AI-driven market intelligence platform, announced a series of major platform upgrades, centered on the launch of AI-generated stock picks, stock ratings, and stock price forecasts. The new AI stock picks feature draws on Edgen’s multi-agent system to identify opportunities across equities with speed and precision. Users can now see which stocks surface as high-potential investments, rated and ranked by AI across multiple dimensions. Stock ratings distill performance into a transparent scoring framework, providing both institutional and retail investors with a quick way to differentiate between stronger and weaker companies. This rating system, combined with stock price forecasts, enables investors to anticipate potential moves rather than react after the fact. The outcome is sharper, faster decision-making, where signals come directly from AI agents trained to scan, assess, and act at scale. Alongside these initiatives, the company is rolling out a new Market Report system and advancing its proprietary model, EDGM, bringing unprecedented speed and depth to investment research. Edgen’s new Market Report delivers professional-grade research in under a second. The platform provides structured analysis that consolidates financial data, market momentum, and forward-looking scenarios into a single, easy-to-read report, enabling confident investment decisions at speed. This capability is powered by EDGM, Edgen’s private model, now upgraded to deliver results almost instantly. What once required hours of manual research, cross-checking analyst notes, and piecing together market commentary can now be compressed into a few seconds of AI-powered insight. Edgen’s multi-agent architecture introduces a dynamic layer of discovery, exploration, recommendation, and rating. 
Each agent operates with a specialized focus, such as analyzing technical signals, identifying market trends, or flagging undercovered opportunities, before converging on insights to provide a unified view for the user.
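The convergence step described above — specialized agents each scoring a stock, then merging into one rating — can be sketched as a weighted average. The agent names, scores, and weights are illustrative only, not EDGM’s actual internals.

```python
# Sketch of merging per-agent scores into a unified rating.
# Each specialized agent (technical signals, trends, discovery)
# emits a 0-100 score; weights reflect how much each view counts.
def unified_rating(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-agent scores."""
    total_weight = sum(weights[a] for a in scores)
    return sum(scores[a] * weights[a] for a in scores) / total_weight

scores  = {"technical": 80.0, "trend": 60.0, "discovery": 70.0}
weights = {"technical": 0.5, "trend": 0.3, "discovery": 0.2}
print(round(unified_rating(scores, weights), 1))  # 72.0
```

A real system would also have to reconcile disagreement between agents (e.g. by surfacing dissenting signals), which a plain average hides.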