Startup Confident Security aims to be “the Signal for AI.” The company’s product, CONFSEC, is an end-to-end encryption tool that wraps around foundation models, guaranteeing that prompts and metadata can’t be stored, seen, or used for AI training, even by the model provider or any third party. The company wants to serve as an intermediary between AI providers and their customers, such as hyperscalers, governments, and enterprises. CONFSEC is modeled after Apple’s Private Cloud Compute (PCC) architecture, which “is 10x better than anything out there in terms of guaranteeing that Apple cannot see your data” when it runs certain AI tasks securely in the cloud. Like Apple’s PCC, Confident Security’s system first anonymizes data by encrypting it and routing it through services like Cloudflare or Fastly, so servers never see the original source or content. Next, it uses advanced encryption that only allows decryption under strict conditions. Finally, the software running the AI inference is publicly logged and open to review so that experts can verify its guarantees. CONFSEC is also well suited for the new AI browsers hitting the market, such as Perplexity’s Comet, giving customers guarantees that their sensitive data isn’t being stored on a server where the company or bad actors could access it, and that their work-related prompts aren’t being used to “train AI to do your job.”
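The anonymize-then-encrypt flow described above resembles the oblivious-relay pattern: the client encrypts a prompt to the inference node’s public key and hands the ciphertext to a relay that never sees plaintext, while the node never learns who sent it. Below is a minimal illustrative sketch of that pattern using PyNaCl sealed boxes; the function names and flow are hypothetical and are not CONFSEC’s implementation.

```python
# Illustrative sketch of the "encrypt, then route through a blind relay" pattern.
# Names and flow are hypothetical; this is not CONFSEC's implementation.
from nacl.public import PrivateKey, PublicKey, SealedBox

# The inference node publishes a public key. In a real system the client would
# first verify this key against the publicly logged, reviewable software image.
node_key = PrivateKey.generate()
node_pub = node_key.public_key

def client_encrypt(prompt: str, node_public_key: PublicKey) -> bytes:
    """Client seals the prompt so only the inference node can open it."""
    return SealedBox(node_public_key).encrypt(prompt.encode())

def relay_forward(ciphertext: bytes) -> bytes:
    """A CDN-style relay sees only an opaque blob and no client identity."""
    return ciphertext

def node_decrypt(ciphertext: bytes, node_private_key: PrivateKey) -> str:
    """The node decrypts the prompt without learning who sent it."""
    return SealedBox(node_private_key).decrypt(ciphertext).decode()

blob = client_encrypt("summarize my contract for risks", node_pub)
print(node_decrypt(relay_forward(blob), node_key))
```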
Augment Code allows developers to integrate popular developer tools with just one click by leveraging the open MCP to stream rich runtime context such as build logs and error traces directly into its AI coding assistant without the complexity of traditional setup
Augment Code has launched Easy MCP, a new capability that allows developers to integrate popular developer tools like CircleCI, MongoDB, Redis, Sentry, and Stripe with just one click. Available immediately through the Augment Code extension in VS Code, Easy MCP leverages the open Model Context Protocol (MCP) to stream rich runtime context directly into the company’s AI coding assistant—without the complexity of traditional setup. Augment Code’s proprietary Context Engine streams live data—such as build logs, database schemas and error traces—directly into its core models. By applying real-time signals from across the stack, Augment delivers suggestions and agent runs that are context-rich, accurate and production-ready, helping teams move faster and deploy with greater confidence. Easy MCP integrations include: CircleCI: Build logs, test insights, and flaky‑test detection via the mcp-server-circleci; MongoDB: Data exploration, database management, and context-aware code generation via the MongoDB MCP Server; Redis: Keyspace introspection, TTL audits, and migration helpers through the open‑source mcp-redis server; Sentry: Search across issues, errors, traces, logs, and releases, plus RCAs and AI-generated fixes with Seer; Stripe: Real‑time payment events, refund status, subscription metrics, and secure tokenization via Stripe’s remote or local MCP servers (OAuth MCP in public preview).
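Under the hood, an MCP integration is a client session speaking to a tool server; Easy MCP’s one-click setup hides that wiring. The sketch below shows roughly what a manual connection looks like with the MCP Python SDK, listing the tools a server exposes; the server launch command and package name are assumptions for illustration.

```python
# Minimal MCP client sketch: launch a tool server over stdio and list its tools.
# The server command below is an assumption for illustration; Easy MCP performs
# this wiring automatically inside the Augment Code extension.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    # Hypothetical launch command for a CircleCI MCP server.
    server = StdioServerParameters(command="npx", args=["-y", "mcp-server-circleci"])
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            # Each tool (e.g. build-log or flaky-test queries) becomes runtime
            # context the coding assistant can pull in on demand.
            for tool in result.tools:
                print(tool.name, "-", tool.description)

asyncio.run(main())
```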
“GPT-5 is a huge improvement over GPT-4 in a few key areas: It thinks better (reasoning), writes better (creativity), follows instructions more closely and is more aligned to user intent”
GPT-5 is giving ChatGPT more capabilities in at least five areas. Before GPT-5, ChatGPT users could choose between GPT models and the “o” reasoning series of models; GPT-5 merges both capabilities. “GPT-5 will automatically decide to use reasoning or not. Switching should be smoother in the next update,” said Elaine Ya Le, a researcher at OpenAI. ChatGPT Plus subscribers can send up to 160 messages using GPT-5 every three hours. That’s twice the limit of prior models, according to OpenAI’s community forums. “We are going to double rate limits for Plus users as we finish rollout,” OpenAI CEO Sam Altman confirmed during the Q&A with Redditors. For GPT-5 Thinking, Plus users are capped at 200 messages per week if they manually select this option; when ChatGPT switches to “Thinking” mode by itself, this does not count toward the quota. Eric Mitchell from OpenAI’s research team told Redditors that OpenAI “definitely” intends Plus users to have “unlimited access to reasoning.” Users don’t have to toggle tools on or off with GPT-5; they are enabled automatically depending on what the user needs. Tools include web search, data analysis, image analysis, file analysis, canvas, image generation, memory and custom instructions. There are two ways to use “voice” mode: clicking the microphone icon in the prompt window and speaking a prompt or query for ChatGPT to process, or activating full “voice” mode and interacting with the model directly. Sulman Choudhry, head of engineering at OpenAI, said voice mode is now better at following instructions with GPT-5. Overall, users should get more advanced capabilities with GPT-5: “GPT-5 is a huge improvement over GPT-4 in a few key areas: It thinks better (reasoning), writes better (creativity), follows instructions more closely and is more aligned to user intent.”
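For developers calling the model directly rather than through ChatGPT, the reasoning behavior that ChatGPT now chooses automatically is exposed as a request parameter. A minimal sketch with the OpenAI Python SDK’s Responses API follows; the model name and effort values are assumptions based on the rollout described above, so check OpenAI’s API documentation for the exact identifiers.

```python
# Sketch: asking GPT-5 for a response with an explicit reasoning-effort setting.
# Model name and parameter values are assumptions; ChatGPT itself decides how
# much reasoning to apply without any toggles.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    model="gpt-5",
    reasoning={"effort": "low"},  # lower effort trades depth for speed
    input="Explain why my unit test is flaky and suggest a fix.",
)
print(response.output_text)
```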
Lightning AI launches a unified multi-cloud GPU marketplace, enabling AI teams to cut compute costs by 70% and access on-demand or reserved clusters across hyperscalers and NeoClouds.
Lightning AI, the company building the infrastructure layer for AI development, announced the launch of its Multi-Cloud GPU Marketplace, a unified platform that gives AI teams access to on-demand and reserved GPUs across leading cloud providers, including top-tier hyperscalers and a new generation of specialized compute platforms known as NeoClouds. With Lightning AI, teams can now choose the best GPU provider for their goals, such as optimizing for cost, performance, or region, all within a single interface and an intuitive, unified platform for AI development trusted by over 300,000 developers and Fortune 500 enterprises alike. The Multi-Cloud GPU Marketplace supports both on-demand GPUs and large-scale reserved GPU clusters, where customers can choose fully managed SLURM, Kubernetes or Lightning’s next-gen AI orchestrator. This allows customers to bring their favorite tools and stack with no workflow changes so they can scale training, fine-tuning, and inference workloads on their terms. Built on Lightning AI’s end-to-end development platform, users can prototype, train, and deploy AI without worrying about infrastructure rework or cloud-specific setup. Lightning AI’s marketplace addresses a clear and growing need by giving teams the ability to scale AI with freedom of choice, cost transparency, and no friction. Key benefits include: Run across clouds using a single interface, with no manual orchestration or job rewrites; Access GPUs from top providers, including premium hyperscalers and emerging NeoClouds; Reserve compute or run on-demand depending on workload needs; Avoid vendor lock-in with a flexible, portable platform that works across your favorite clouds; Eliminate infrastructure overhead and use SLURM, Kubernetes, bare metal or Lightning without the DevOps burden.
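The marketplace’s core idea is picking a provider per workload by cost, region, and capacity mode. The toy sketch below illustrates that selection logic only; the offer data, prices, and helper function are invented and are not part of the Lightning AI SDK.

```python
# Toy sketch of marketplace-style provider selection by GPU type, region, and
# capacity mode. Offers, prices, and the helper are hypothetical illustrations,
# not the Lightning AI SDK.
from dataclasses import dataclass

@dataclass
class GpuOffer:
    provider: str      # hyperscaler or NeoCloud
    gpu: str
    region: str
    usd_per_hour: float
    reserved: bool

OFFERS = [
    GpuOffer("hyperscaler-a", "H100", "us-east", 6.50, reserved=False),
    GpuOffer("neocloud-x", "H100", "us-east", 2.20, reserved=False),
    GpuOffer("neocloud-y", "H100", "eu-west", 2.75, reserved=True),
]

def cheapest(gpu: str, region: str, reserved: bool | None = None) -> GpuOffer:
    """Pick the lowest-cost offer matching GPU type, region, and capacity mode."""
    matches = [
        o for o in OFFERS
        if o.gpu == gpu and o.region == region
        and (reserved is None or o.reserved == reserved)
    ]
    return min(matches, key=lambda o: o.usd_per_hour)

print(cheapest("H100", "us-east"))  # -> the neocloud-x offer at $2.20/hr
```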
Hyland debuts Enterprise Context Engine that unifies all company info (ERP, HR, CRM), creating a graph‑driven “living record” that feeds and informs AI workflows while preserving institutional knowledge
Enterprise content management firm Hyland Software Inc. launched two new components of its Content Innovation Cloud, which the company says present a unified, continuously updated view of an organization’s content, processes, people and applications to fuel a network of task-specific artificial intelligence agents. The Enterprise Context Engine pulls from systems such as enterprise resource planning, customer relationship management and human resources and maps relationships that create what the company describes as a “living record of enterprise activity.” Hyland describes the Enterprise Context Engine as “a shared services platform layer” that sits beneath its products and agentic solutions. It leverages graph analytics technologies to connect artifacts in a way that informs workflows and supports new applications. The Enterprise Agent Mesh is a network of task-specific agents tuned for specific industries, including healthcare, banking, insurance, government and higher education. Hyland said the mesh uses the context layer to make decisions and take actions inside complex workflows, while preserving institutional knowledge and incorporating human feedback. The company will provide prebuilt meshes for its core verticals and a no-code platform customers can use to adapt or assemble their own. The Agent Mesh architecture enables organizations to leverage the Enterprise Context Engine to replace business processes with agent meshes. The platform is designed to work in conjunction with customers’ existing repositories and workflow engines. The architecture uses the Model Context Protocol to connect to systems of record and other vendors’ agents.
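A “living record of enterprise activity” is, in practice, a graph that links artifacts across ERP, CRM, and HR systems so an agent can traverse relationships rather than query each silo. The sketch below is a minimal illustration of that idea with networkx; the node names, attributes, and relations are invented and do not reflect Hyland’s actual schema or APIs.

```python
# Illustrative context-graph sketch: link records from ERP, CRM, and HR systems
# so an agent can retrieve related context. Node names, attributes, and edge
# labels are invented and do not reflect Hyland's schema.
import networkx as nx

graph = nx.MultiDiGraph()
graph.add_node("invoice:1042", source="ERP", amount=12_500)
graph.add_node("customer:acme", source="CRM", tier="enterprise")
graph.add_node("employee:jlee", source="HR", role="account manager")

graph.add_edge("invoice:1042", "customer:acme", relation="billed_to")
graph.add_edge("employee:jlee", "customer:acme", relation="manages")

def context_for(node: str) -> list[tuple[str, str, str]]:
    """Collect the immediate relationships an agent could feed into a prompt."""
    outgoing = [(u, d["relation"], v) for u, v, d in graph.edges(node, data=True)]
    incoming = [(u, d["relation"], v) for u, v, d in graph.in_edges(node, data=True)]
    return outgoing + incoming

print(context_for("customer:acme"))
# e.g. [('invoice:1042', 'billed_to', 'customer:acme'),
#       ('employee:jlee', 'manages', 'customer:acme')]
```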
Analog Devices AI tool automates the end-to-end machine learning pipeline for edge AI, including model search and optimization with state-of-the-art algorithms, and verifies model size against the device’s RAM to enable successful deployment
Analog Devices Inc. (ADI) has introduced AutoML for Embedded, an AI tool that automates the end-to-end machine learning pipeline for edge AI. The tool, co-developed with Antmicro, is now available as part of the Kenning framework, integrated into CodeFusion Studio. The Kenning framework is a hardware-agnostic, open-source platform for optimizing, benchmarking, and deploying AI models on edge devices. AutoML for Embedded allows developers without data science expertise to build high-quality, efficient models that deliver robust performance. The tool automates model search and optimization using state-of-the-art algorithms, leveraging SMAC to explore model architectures and training parameters efficiently. It also verifies model size against the device’s RAM to enable successful deployment. Candidate models can be optimized, evaluated, and benchmarked using Kenning’s standard flows, with detailed reports on size, speed, and accuracy to guide deployment decisions. Antmicro’s Michael Gielda, VP of Business Development, said that AutoML in Kenning reduces the complexity of building optimized edge AI models, allowing customers to take full control of their products. AutoML for Embedded is a Visual Studio Code plugin built on the Kenning library that supports: ADI MAX78002 AI accelerator MCUs and MAX32690 devices — deploy models directly to industry-leading edge AI hardware. Simulation and RTOS workflows — leverage Renode-based simulation and Zephyr RTOS for rapid prototyping and testing. General-purpose, open-source tools — allowing flexible model optimization without platform lock-in.
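To make the SMAC-driven search concrete, the sketch below shows the general shape of such a loop: explore architecture and training parameters, and reject any candidate whose estimated footprint exceeds the target MCU’s RAM. The search space, size estimate, and scoring function are invented placeholders, not Kenning’s actual flow, which trains and benchmarks real models.

```python
# Sketch of SMAC-style architecture search with a RAM constraint, in the spirit
# of AutoML for Embedded. Search space, size estimate, and score are invented
# placeholders; Kenning's real flow trains and benchmarks actual models.
from ConfigSpace import Configuration, ConfigurationSpace
from smac import HyperparameterOptimizationFacade, Scenario

RAM_BUDGET_BYTES = 256 * 1024  # e.g. usable RAM on a small MCU

space = ConfigurationSpace({
    "num_filters": (8, 64),   # integer range
    "num_layers": (2, 6),     # integer range
    "dropout": (0.0, 0.5),    # float range
})

def evaluate(config: Configuration, seed: int = 0) -> float:
    """Return a cost to minimize: validation error plus a hard RAM penalty."""
    # Crude stand-in for checking model size against the device's RAM.
    est_size = config["num_filters"] * config["num_layers"] * 4 * 1024
    if est_size > RAM_BUDGET_BYTES:
        return 1.0  # infeasible configuration: worst possible error
    # Stand-in for training and evaluating the candidate model.
    return 0.3 - 0.002 * config["num_filters"] + 0.1 * config["dropout"]

scenario = Scenario(space, n_trials=50, seed=0)
best = HyperparameterOptimizationFacade(scenario, evaluate).optimize()
print(dict(best))
```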
Helix 2.0 deploys GenAI on private infrastructure, ensuring compliance with regulations like GDPR and HIPAA while mitigating risks associated with public AI platforms, and slashes deployment times from 6-12 months to just 8 weeks
Helix announced Helix 2.0, a next-generation private AI platform that eliminates the complexity, high costs, and security risks of traditional AI deployments, providing everything needed to build, deploy, and manage powerful AI solutions with complete data sovereignty and predictable economics. Helix 2.0 slashes deployment times from 6-12 months to just 8 weeks with predictable, fixed licensing and infrastructure fees that reduce costs by up to 75% compared to public AI platforms. Enterprise-grade testing, version control, and rollback capabilities reduce operational risk by 90%, while integrated Vision RAG technology enhances document processing accuracy by 85%, ensuring fidelity in complex financial, regulatory, and technical documents. Its intelligent orchestration engine dynamically allocates resources and optimizes model selection based on workload requirements. Native integration with enterprise DevOps pipelines supports automated testing, CI/CD workflows, and GitOps practices and enables rapid, auditable deployment of AI agents at scale. Integrated Vision RAG technology leverages advanced visual document understanding to process complex, multi-modal files with high fidelity, ensuring accurate extraction and analysis across diverse enterprise data types. Key features include: Deployment on Private Infrastructure – Ensure compliance with regulations like GDPR and HIPAA while mitigating risks associated with public AI platforms. Agentic AI and Enterprise CI/CD – Build, test, and deploy AI agents and LLMs with full software engineering rigor, including integration with leading CI/CD platforms, automated testing, GitOps workflow support, and full rollback capabilities. Vision RAG Integration – Process complex documents, including financial statements, regulatory filings, and technical diagrams, with 85% higher accuracy using ColPali-powered visual document understanding. Kubernetes-Native Architecture – Effortlessly scale to 1,000+ concurrent users with enterprise-grade reliability and performance. OpenAI-Compatible APIs – Seamlessly migrate existing projects without code changes, enabling immediate engagement and zero disruption. Enterprise-Grade Authentication – Integrate with Okta, Auth0, and Active Directory for robust, familiar security.
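The “OpenAI-Compatible APIs” item above typically means existing projects only need to point their client at a different base URL. A minimal sketch with the OpenAI Python SDK follows; the endpoint URL and model name are placeholders, not documented Helix 2.0 values.

```python
# Sketch of repointing an existing OpenAI-SDK project at an OpenAI-compatible
# private endpoint. The base URL and model name are placeholders, not
# documented Helix 2.0 values.
from openai import OpenAI

client = OpenAI(
    base_url="https://helix.internal.example.com/v1",  # hypothetical private endpoint
    api_key="sk-local-placeholder",                     # key issued by the private deployment
)

chat = client.chat.completions.create(
    model="llama-3.1-70b-instruct",  # whichever model the private platform serves
    messages=[{"role": "user", "content": "Summarize this quarter's regulatory filings."}],
)
print(chat.choices[0].message.content)
```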
Anthropic’s Claude Sonnet 4 model can now process up to 1 million tokens of context in a single request — a fivefold increase that allows developers to analyze entire software projects
Anthropic’s Claude Sonnet 4 artificial intelligence model can now process up to 1 million tokens of context in a single request — a fivefold increase that allows developers to analyze entire software projects or dozens of research papers without breaking them into smaller chunks. The expansion, available now in public beta through Anthropic’s API and Amazon Bedrock, represents a significant leap in how AI assistants can handle complex, data-intensive tasks. With the new capacity, developers can load codebases containing more than 75,000 lines of code, enabling Claude to understand complete project architecture and suggest improvements across entire systems rather than individual files. The extended context capability addresses a fundamental limitation that has constrained AI-powered software development. Eric Simons, CEO of Bolt.new, which integrates Claude into its browser-based development platform, said: “With the 1M context window, developers can now work on significantly larger projects while maintaining the high accuracy we need for real-world coding.” The expanded context enables three primary use cases that were previously difficult or impossible: comprehensive code analysis across entire repositories, document synthesis involving hundreds of files while maintaining awareness of relationships between them, and context-aware AI agents that can maintain coherence across hundreds of tool calls and complex workflows. The 1 million token context window represents a significant technical advancement in AI memory and attention mechanisms. Anthropic’s internal testing revealed perfect recall performance across diverse scenarios, a crucial capability as context windows expand: the company embedded specific information within massive text volumes and tested Claude’s ability to find and use those details when answering questions.
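In API terms, the larger window simply means a much bigger prompt can be sent in one request. A minimal sketch with the Anthropic Python SDK follows; the model identifier and long-context beta flag are assumptions, so check Anthropic’s documentation for the exact values.

```python
# Sketch of requesting the 1M-token context window through the Anthropic Python
# SDK's beta surface. The beta flag and model identifier below are assumptions;
# consult Anthropic's docs for the exact values.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

with open("entire_repo_dump.txt") as f:  # e.g. >75,000 lines of concatenated source
    codebase = f.read()

message = client.beta.messages.create(
    model="claude-sonnet-4-20250514",     # assumed Sonnet 4 identifier
    max_tokens=4096,
    betas=["context-1m-2025-08-07"],      # assumed long-context beta flag
    messages=[{
        "role": "user",
        "content": f"{codebase}\n\nMap the project architecture and flag cross-file issues.",
    }],
)
print(message.content[0].text)
```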
Inclusion Arena shifts LLM evaluation from static lab benchmarks to real-life app interactions, ranking models by user-preferred responses for more relevant enterprise AI selection
Researchers from Inclusion AI, which is affiliated with Alibaba’s Ant Group, proposed Inclusion Arena, a new model leaderboard and benchmark that focuses on a model’s performance in real-life scenarios. They argue that LLMs need a leaderboard that takes into account how people use them and how much people prefer their answers, rather than only the static knowledge capabilities models have. Inclusion Arena stands out among other model leaderboards due to its real-life aspect and its unique method of ranking models. Inclusion Arena works by integrating the benchmark into AI applications to gather datasets and conduct human evaluations. Currently, there are two apps available on Inclusion Arena: the character chat app Joyland and the education communication app T-Box. When people use the apps, their prompts are sent to multiple LLMs behind the scenes for responses. The users then choose which answer they like best, though they don’t know which model generated each response. The framework uses these user preferences to generate pairs of models for comparison. The Bradley-Terry algorithm is then used to calculate a score for each model, which produces the final leaderboard. According to the initial experiments with Inclusion Arena, the top-performing model is Anthropic’s Claude 3.7 Sonnet, followed by DeepSeek v3-0324, Claude 3.5 Sonnet, DeepSeek v3 and Qwen Max-0125. The Inclusion AI researchers argue that their new leaderboard “ensures evaluations reflect practical usage scenarios,” giving enterprises better information about the models they plan to adopt.
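Under the Bradley-Terry model, each model i gets a strength s_i such that the probability users prefer i over j is s_i / (s_i + s_j); strengths can be fit from pairwise win counts with a simple iterative update, and the leaderboard is just the models sorted by strength. The sketch below illustrates that fit with invented win counts; it is not Inclusion Arena’s data or code.

```python
# Minimal Bradley-Terry fit from pairwise preference counts, showing how a
# leaderboard score can be derived from "which answer did the user prefer"
# votes. The win counts below are invented, not Inclusion Arena data.

# wins[(a, b)] = number of times users preferred model a's answer over model b's
wins = {
    ("claude-3.7-sonnet", "deepseek-v3-0324"): 60,
    ("deepseek-v3-0324", "claude-3.7-sonnet"): 45,
    ("claude-3.7-sonnet", "qwen-max-0125"): 70,
    ("qwen-max-0125", "claude-3.7-sonnet"): 35,
    ("deepseek-v3-0324", "qwen-max-0125"): 55,
    ("qwen-max-0125", "deepseek-v3-0324"): 50,
}

models = sorted({m for pair in wins for m in pair})
strength = {m: 1.0 for m in models}

# Standard minorization-maximization update for Bradley-Terry strengths:
# s_i <- W_i / sum_j [ n_ij / (s_i + s_j) ], then renormalize.
for _ in range(100):
    new = {}
    for i in models:
        total_wins = sum(w for (a, _), w in wins.items() if a == i)
        denom = sum(
            (wins.get((i, j), 0) + wins.get((j, i), 0)) / (strength[i] + strength[j])
            for j in models if j != i
        )
        new[i] = total_wins / denom if denom else strength[i]
    norm = sum(new.values())
    strength = {m: s / norm for m, s in new.items()}

for model, score in sorted(strength.items(), key=lambda kv: -kv[1]):
    print(f"{model}: {score:.3f}")
```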
ChatGPT’s rumored ‘router’ function would automatically select the best OpenAI model to respond to the user’s input on the fly, switching between reasoning, non-reasoning, and tool-using models depending on the input’s content
Reports emerged over the last few days on X from AI influencers, including OpenAI’s own researcher “Roon” (@tszzl on X, speculated to be technical team member Tarun Gogineni), of a new “router” function that will automatically select the best OpenAI model to respond to the user’s input on the fly, depending on the specific input’s content. Similarly, Yuchen Jin, co-founder and CTO of AI inference cloud provider Hyperbolic Labs, wrote in an X post, “Heard GPT-5 is imminent, from a little bird. It’s not one model, but multiple models. It has a router that switches between reasoning, non-reasoning, and tool-using models. That’s why Sam said they’d “fix model naming”: prompts will just auto-route to the right model. GPT-6 is in training.” While a presumably far more advanced GPT-5 model would (and will) be huge news if and when released, the router may make life much easier and more intelligent for the average ChatGPT subscriber. It would also follow on the heels of third-party products such as the web-based Token Monster chatbot, which automatically selects and combines responses from multiple third-party LLMs to respond to user queries. Hopefully, any such OpenAI router seamlessly directs users to the right model for their needs, when they need it.
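Conceptually, such a router is just a lightweight classifier sitting in front of several models: inspect the prompt, decide whether it needs deep reasoning or tools, and dispatch accordingly. The toy sketch below illustrates the idea with hand-written heuristics; the rules and model names are invented, since OpenAI has not published how its router works.

```python
# Toy sketch of a prompt router that dispatches to a reasoning, tool-using, or
# fast general-purpose model. Heuristics and model names are invented; OpenAI
# has not published how its router works.
REASONING_HINTS = ("prove", "step by step", "debug", "optimize", "why does")
TOOL_HINTS = ("search the web", "latest", "current price", "browse", "run this code")

def route(prompt: str) -> str:
    """Return the model tier a router might pick for this prompt."""
    text = prompt.lower()
    if any(hint in text for hint in TOOL_HINTS):
        return "tool-using-model"
    if any(hint in text for hint in REASONING_HINTS) or len(text.split()) > 200:
        return "reasoning-model"
    return "fast-general-model"

print(route("What's the latest Nvidia share price?"))  # tool-using-model
print(route("Prove that the algorithm terminates."))   # reasoning-model
print(route("Write a haiku about autumn."))            # fast-general-model
```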