Archive360 has released the first modern archive platform that provides governed data for AI and analytics. The Archive360 Platform enables enterprises and government agencies to unlock the full potential of their archival assets with extensive data governance, security and compliance capabilities that prime archived data for intelligent insights. The Archive360 Modern Archiving Platform enables organizations to control how AI and analytics consume information from the archive, and simplifies connecting to and ingesting data from any application, so organizations can start realizing value faster. This control reduces the risk AI can pose to organizations, whether by inadvertently exposing regulated data and company trade secrets or by ingesting faulty and irrelevant data. The Archive360 AI & Data Governance Platform is deployed on a cloud-native, class-based architecture that gives each customer a dedicated SaaS environment, allowing them to completely segregate data, retain administrative access and entitlements, and integrate the platform into their own security protocols. It allows organizations to shift from application-centric to data-centric archiving; protect, classify and retire enterprise data; and activate data for AI.
Qlik launches Open Lakehouse offering 2.5x–5x faster query performance and up to 50% lower infrastructure costs, while maintaining full compatibility with the most widely used analytics and machine learning engines
Qlik announced the launch of Qlik Open Lakehouse, a fully managed Apache Iceberg solution built into Qlik Talend Cloud. Designed for enterprises under pressure to scale faster and spend less, Qlik Open Lakehouse delivers real-time ingestion, automated optimization, and multi-engine interoperability — without vendor lock-in or operational overhead. Qlik Open Lakehouse offers a new path: a fully managed lakehouse architecture powered by Apache Iceberg that delivers 2.5x–5x faster query performance and up to 50% lower infrastructure costs, while maintaining full compatibility with the most widely used analytics and machine learning engines. Qlik Open Lakehouse combines real-time ingestion, intelligent optimization, and true ecosystem interoperability in a single, fully managed platform: Real-time ingestion at enterprise scale; Intelligent Iceberg optimization, fully automated; Open by design, interoperable by default; Your compute, your cloud, your rules; One platform, end to end. As AI workloads demand faster access to broader, fresher datasets, open formats like Apache Iceberg are becoming the new foundation. Qlik Open Lakehouse responds to this shift by making it effortless to build and manage Iceberg-based architectures — without the need for custom code or pipeline babysitting. It also runs within the customer’s own AWS environment, ensuring data privacy, cost control, and full operational visibility.
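The open-table-format angle is easiest to see in code. The sketch below assumes a PyIceberg-compatible REST catalog and a hypothetical sales.orders table (neither detail is part of Qlik's announcement); it shows how any Iceberg-aware engine or client can read tables that a managed lakehouse maintains, which is the interoperability claim in practice:

```python
# Minimal sketch: reading an Iceberg table from an independent client.
# The catalog URI, warehouse path, and table name are hypothetical
# placeholders, not details published by Qlik.
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "lakehouse",
    **{
        "type": "rest",
        "uri": "https://catalog.example.com",          # any Iceberg REST catalog
        "warehouse": "s3://example-bucket/warehouse",  # hypothetical warehouse
    },
)

orders = catalog.load_table("sales.orders")

# Because the data lives in open Iceberg tables, the same scan could be run
# from Spark, Trino, Snowflake, or plain Python; here we pull a filtered
# slice into pandas.
df = (
    orders.scan(
        row_filter="order_date >= '2025-01-01'",
        selected_fields=("order_id", "customer_id", "amount"),
    )
    .to_pandas()
)
print(df.head())
```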
Elastic’s new plugin to accelerate open-source vector search index build times and queries on Nvidia GPUs; integrates with Nvidia validated designs to enable on-premises AI agents
Elastic announced that Elasticsearch integrates with the new NVIDIA Enterprise AI Factory validated design to provide a recommended vector database for enterprises building and deploying their own on-premises AI factories. Elastic will use NVIDIA cuVS to create a new Elasticsearch plugin that will accelerate vector search index build times and queries. NVIDIA Enterprise AI Factory validated designs enable Elastic customers to unlock faster, more relevant insights from their data. Elasticsearch is used throughout the industry for vector search and AI applications and has a thriving open source community. Elastic’s investment in GPU-accelerated vector search builds on its longstanding efforts to optimize vector database performance through hardware-accelerated CPU SIMD instructions, vector data compression innovations such as Better Binary Quantization, and faster filtered HNSW search. With Elasticsearch and the NVIDIA Enterprise AI Factory reference design, enterprises can unlock deeper insights and deliver more relevant, real-time information to AI agents and generative AI applications.
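For context, Elasticsearch exposes vector search through its standard dense_vector mapping and kNN query API; the announcement frames the cuVS plugin as speeding up index builds and queries rather than changing that surface, which is the assumption behind this minimal sketch with the Python client (index name, dimensions, and vectors are placeholders):

```python
# Minimal sketch of Elasticsearch vector search with the Python client.
# Host, credentials, index name, and embeddings are placeholders; real
# embeddings would come from an embedding model.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="<API_KEY>")

# Index with a dense_vector field; setting "index": True enables ANN indexing.
es.indices.create(
    index="docs",
    mappings={
        "properties": {
            "text": {"type": "text"},
            "embedding": {
                "type": "dense_vector",
                "dims": 384,
                "index": True,
                "similarity": "cosine",
            },
        }
    },
)

es.index(index="docs", document={"text": "quarterly revenue report", "embedding": [0.1] * 384})
es.indices.refresh(index="docs")

# Approximate kNN query over the vector field.
resp = es.search(
    index="docs",
    knn={"field": "embedding", "query_vector": [0.1] * 384, "k": 5, "num_candidates": 50},
)
print([hit["_source"]["text"] for hit in resp["hits"]["hits"]])
```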
Starburst Data’s lakehouse model supports AI models by using data where it already lives without needing to copy it into a centralized repository and without requiring external data pipelines
Starburst Data is unveiling a suite of enhancements intended to make it easier for enterprises to develop and apply artificial intelligence models. Starburst’s updates are focused on enabling what it calls an AI “lakeside,” in which companies can use data where it already lives without needing to copy it into a centralized repository. Starburst defines a lakeside as a staging ground for AI, or an area adjacent to the data lakehouse where data is the most complete, cost-efficient and governed. The company’s new Lakeside AI architecture combines AI-ready tools with an open data lakehouse model. It allows companies to experiment with, train and deploy AI systems while keeping sensitive or regulated data in place. Starburst AI Workflows accelerates AI application development by making it easier to transform unstructured data into vector embeddings, numerical representations that capture the meaning of and relationships between data points without requiring explicit keywords. AI Workflows also manages prompts and models through SQL and enforces governance policies. Starburst said these capabilities are fully contained within its platform and require no external data pipelines. Data is stored in Apache Iceberg tables, with connectors available for a variety of third-party vector databases. In practice, this means users can build AI features that rely on unstructured or semi-structured sources such as emails, documents and logs without having to move data or stitch together multiple tools. The Starburst AI Agent is a built-in interface that lets users query their data in natural language. It automatically scans for sensitive data such as names, email addresses and other personally identifiable information at the column level and tags it so access policies can be applied. That reduces the need for manual checks and helps organizations enforce privacy rules more consistently. A new Starburst data catalog replaces the aging Hive metastore and provides better support for the Iceberg format that is rapidly becoming the standard for cloud data lakes; the new catalog supports both legacy Hive data and Iceberg tables. To improve performance across large-scale deployments, Starburst is also introducing a native ODBC driver that improves connection speed and reliability with business intelligence tools such as Salesforce Inc.’s Tableau and Microsoft Corp.’s Power BI.
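Because these capabilities are exposed through SQL, the interaction model looks like an ordinary query against governed Iceberg tables. The sketch below uses the open-source trino Python client against a hypothetical Starburst cluster and table; the embed_text() function named in the comment is a placeholder, since the exact AI Workflows SQL syntax is product-specific and not quoted in the announcement:

```python
# Minimal sketch: querying governed Iceberg data through Starburst/Trino SQL.
# Host, catalog, schema, table, and the embed_text() function name are
# hypothetical placeholders for illustration only.
import trino

conn = trino.dbapi.connect(
    host="starburst.example.com",
    port=443,
    user="analyst",
    http_scheme="https",
    catalog="iceberg",
    schema="support",
)
cur = conn.cursor()

# Ordinary SQL over an Iceberg table; column-level access policies
# (e.g. masking PII-tagged columns) are enforced server-side.
cur.execute(
    """
    SELECT ticket_id, body
    FROM emails
    WHERE received_at >= DATE '2025-01-01'
    LIMIT 100
    """
)
rows = cur.fetchall()

# A Workflows-style embedding step would also be expressed in SQL, e.g.
#   SELECT ticket_id, embed_text(body) FROM emails   -- embed_text() is hypothetical
print(len(rows), "rows fetched")
```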
Data governance platform Relyance AI allows organizations to precisely detect bias by examining not just the immediate dataset used to train a model, but by tracing the potential bias to its source
Relyance AI, a data governance platform provider that secured $32.1 million in Series B funding last October, is launching a new solution aimed at solving one of the most pressing challenges in enterprise AI adoption: understanding exactly how data moves through complex systems. The company’s new Data Journeys platform addresses a critical blind spot for organizations implementing AI — tracking not just where data resides, but how and why it’s being used across applications, cloud services, and third-party systems. Data Journeys provides a comprehensive view, showing the complete data lifecycle from original collection through every transformation and use case. The system starts with code analysis rather than simply connecting to data repositories, giving it context about why data is being processed in specific ways. Data Journeys delivers value in four critical areas. First, compliance and risk management: the platform enables organizations to prove the integrity of their data practices when facing regulatory scrutiny. Second, precise bias detection: rather than just examining the immediate dataset used to train a model, companies can trace potential bias to its source. Third, explainability and accountability: for high-stakes AI decisions like loan approvals or medical diagnoses, understanding the complete data provenance becomes essential. Finally, regulatory compliance: the platform provides a “mathematical proof point” that companies are using data appropriately, helping them navigate increasingly complex global regulations. Customers have seen 70-80% time savings in compliance documentation and evidence gathering.
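To make the "trace bias to its source" idea concrete, data lineage can be modeled as a directed graph in which edges point from upstream sources to the datasets and models derived from them; walking the graph backwards from a trained model surfaces every upstream source that could have introduced skew. A small illustrative sketch follows (the node names are invented, and this is not Relyance's implementation):

```python
# Illustrative sketch: lineage as a directed graph, traced back from a model.
# Node names are hypothetical; a real system would derive edges from code
# analysis and pipeline metadata.
import networkx as nx

lineage = nx.DiGraph()
lineage.add_edges_from([
    ("crm.customers", "features.customer_profile"),
    ("web.clickstream", "features.customer_profile"),
    ("features.customer_profile", "training.loan_default_v3"),
    ("bureau.credit_scores", "training.loan_default_v3"),
    ("training.loan_default_v3", "model.loan_approval"),
])

# Every upstream dataset that feeds the model; each is a candidate origin
# for bias, not just the immediate training table.
sources = nx.ancestors(lineage, "model.loan_approval")
print(sorted(sources))
```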
Apache Airflow 3.0’s event-driven data orchestration makes real-time, multi-step inference possible at scale across various enterprise use cases
The Apache Airflow community is out with its biggest update in years with the debut of the 3.0 release. Apache Airflow 3.0 addresses critical enterprise needs with an architectural redesign that could improve how organizations build and deploy data applications. Unlike previous versions, this release breaks away from a monolithic package, introducing a distributed client model that provides flexibility and security. This new architecture allows enterprises to: execute tasks across multiple cloud environments; implement granular security controls; support diverse programming languages; and enable true multi-cloud deployments. Airflow 3.0’s expanded language support is also notable. While previous versions were primarily Python-centric, the new release natively supports multiple programming languages. Airflow 3.0 is set to support Python and Go, with planned support for Java, TypeScript and Rust. This approach means data engineers can write tasks in their preferred programming language, reducing friction in workflow development and integration. The release also introduces event-driven scheduling: instead of running a data processing job every hour, Airflow now automatically starts the job when a specific data file is uploaded or when a particular message appears. This could include data loaded into an Amazon S3 cloud storage bucket or a streaming data message in Apache Kafka.
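A minimal sketch of what event-driven scheduling looks like in DAG code, assuming Airflow 3's asset-based API from airflow.sdk; the S3 URI and task bodies are hypothetical placeholders:

```python
# Minimal sketch: an Airflow 3 DAG triggered by an asset update rather than
# a clock-based schedule. The asset URI and task logic are placeholders.
from airflow.sdk import Asset, dag, task

raw_orders = Asset("s3://example-bucket/raw/orders.parquet")

@dag(schedule=[raw_orders])  # runs whenever the asset is reported as updated
def score_orders():
    @task
    def load_batch() -> str:
        # read the newly landed file from object storage
        return "s3://example-bucket/raw/orders.parquet"

    @task
    def run_inference(path: str) -> None:
        # multi-step inference over the new batch would go here
        print(f"scoring {path}")

    run_inference(load_batch())

score_orders()
```

An upstream producer task would declare the same Asset in its outlets, so the downstream DAG starts as soon as the file lands instead of waiting for the next hourly tick.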
Datadog unifies observability across data and applications, combining AI with column-level lineage to detect, resolve and prevent data quality problems
Cloud security and application monitoring giant Datadog is looking to expand the scope of its data observability offerings after acquiring a startup called Metaplane. By adding Metaplane’s tools to its own suite, Datadog said, it will enable its users to identify and take instant action to remedy any data quality issues affecting their most critical business applications. Metaplane has built an end-to-end data observability platform that combines AI with column-level lineage to detect, resolve and prevent data quality problems. It’s an important tool for any company that’s trying to make data-driven decisions, since “bad” data means those decisions are being made based on the wrong insights. The platform can also notify customers of any data issues through tools such as Slack, PagerDuty and the like. Datadog Vice President Michael Whetten said Metaplane’s offerings will help the company unify observability across data and applications so its customers can “build reliable AI systems.” When the acquisition closes, Metaplane will continue to support its existing customers as a standalone product, though it will be rebranded as “Metaplane by Datadog.” Of course, Datadog will also look to integrate Metaplane’s capabilities within its own platform, and will likely do its utmost to get Metaplane’s customers on board.
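Column-level data observability boils down to continuously computing metrics such as freshness, volume, null rates and distributions per table and column, then flagging statistically unusual values. A toy sketch of that idea, not Metaplane's implementation, using a z-score over daily row counts with invented numbers:

```python
# Toy sketch of a volume anomaly check: flag today's row count if it sits
# far outside the recent distribution. History and threshold are invented.
from statistics import mean, stdev

daily_row_counts = [10_230, 10_105, 9_980, 10_340, 10_150, 10_290, 2_115]  # last value is today

history, today = daily_row_counts[:-1], daily_row_counts[-1]
mu, sigma = mean(history), stdev(history)
z = (today - mu) / sigma if sigma else 0.0

if abs(z) > 3:
    # A real observability platform would alert the owning team here
    # (for example via Slack or PagerDuty) and link to upstream lineage.
    print(f"Volume anomaly: {today} rows vs. mean {mu:.0f} (z={z:.1f})")
```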
Candescent and Ninth Wave’s integrated open data solution to facilitate secure, API-based, consumer-permissioned data sharing for banks and credit unions of all sizes and enable compliance with US CFPB Rule 1033
US digital banking platform Candescent has expanded its partnership with Ninth Wave to launch an integrated open data solution for banks and credit unions. The new offering is designed to facilitate secure, API-based, consumer-permissioned data sharing for banks and credit unions of all sizes. The development aims to support institutions in enhancing customer experience, operational efficiency, and regulatory compliance, including adherence to the US Consumer Financial Protection Bureau’s Rule 1033. The expanded collaboration seeks to replace traditional data-sharing practices—such as screen scraping and manual uploads—with modern, transparent alternatives. The new solution offers seamless integration with third-party applications used by both retail and business banking customers. Candescent chief product officer Gareth Gaston said: “With our integrated solution, banks and credit unions will be able to access Ninth Wave open data capabilities from within the Candescent digital banking platform.” By adopting this model, financial institutions are expected to gain improved control over shared data, as well as stronger compliance with evolving regulatory standards. Ninth Wave founder and CEO George Anderson said: “This partnership will allow financial institutions of all sizes to gain the operational efficiencies, reliability, and scalability of a single point of integration to open finance APIs and business applications.”
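In practice, consumer-permissioned sharing means a third-party app exchanges a consumer-granted OAuth authorization for scoped access to account data through a documented API, instead of logging in with the consumer's credentials and scraping pages. A generic, FDX-style sketch with entirely hypothetical endpoints (not Candescent's or Ninth Wave's actual API) follows:

```python
# Generic sketch of consumer-permissioned data access over OAuth 2.0.
# All URLs, client credentials, and scopes are hypothetical placeholders.
import requests

TOKEN_URL = "https://auth.bank.example.com/oauth2/token"
API_BASE = "https://openbanking.bank.example.com/fdx/v6"

# Exchange the authorization code the consumer granted for an access token
# scoped to account information only.
token = requests.post(
    TOKEN_URL,
    data={
        "grant_type": "authorization_code",
        "code": "<consumer-granted-code>",
        "redirect_uri": "https://app.example.com/callback",
        "client_id": "<client-id>",
        "client_secret": "<client-secret>",
    },
    timeout=10,
).json()["access_token"]

# Pull accounts through the documented API; no screen scraping, no stored
# banking credentials, and the consumer can revoke the grant at any time.
accounts = requests.get(
    f"{API_BASE}/accounts",
    headers={"Authorization": f"Bearer {token}"},
    timeout=10,
).json()
print(accounts)
```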
Reducto’s ingestion platform turns unstructured data that’s locked in complex documents into accurate LLM-ready inputs for AI pipelines
Reducto, the most accurate ingestion platform for unlocking unstructured data for AI pipelines, has raised a $24.5M Series A round of funding led by Benchmark, alongside existing investors First Round Capital, BoxGroup and Y Combinator. “Reducto’s unique technology enables companies of all sizes to leverage LLMs across a variety of unstructured data, regardless of scale or complexity,” said Chetan Puttagunta, General Partner at Benchmark. “The team’s incredibly fast execution on product development further underscores their commitment to delivering state-of-the-art software to customers.” Reducto turns complex documents into accurate LLM-ready inputs, allowing AI teams to reliably use the vast data that’s locked in PDFs and spreadsheets. Ingestion is a core bottleneck for AI teams today because traditional approaches fail to extract and chunk unstructured data accurately. These input errors lead to inaccurate and hallucinated outputs, making LLM applications unreliable for many real-world use cases such as processing medical records and financial statements. In benchmark studies, Reducto has been shown to be significantly more accurate than legacy providers like AWS, Google and Microsoft – in some cases by a margin of 20+ percent, alongside significant processing speed improvements. This is critical for high-stakes, production AI use cases.
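Ingestion failures usually happen at the extract-and-chunk step, so it helps to see what that step involves even in its simplest form. The sketch below uses pypdf plus naive fixed-size chunking purely to illustrate the pipeline stage Reducto targets; it is not Reducto's API, and documents with tables, multi-column layouts, or scanned pages are exactly where an approach this naive breaks down:

```python
# Naive extract-and-chunk sketch for illustration only; file path and chunk
# sizes are arbitrary, and this is not how Reducto parses documents.
from pypdf import PdfReader

def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping chunks sized for an LLM context window."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

reader = PdfReader("financial_statement.pdf")  # hypothetical input file
full_text = "\n".join(page.extract_text() or "" for page in reader.pages)

chunks = chunk_text(full_text)
print(f"{len(chunks)} chunks ready for embedding or prompt assembly")
```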