Studies show using RAG with LLMs increases “unsafe” outputs such as misinformation, creates a gateway through firewalls allowing for data leakage, and suffers a significant decline in accuracy as tasks become more complex

June 25, 2025 //  by Finnovate

Retrieval-Augmented Generation (RAG), a method used by generative AI tools such as OpenAI’s ChatGPT, is becoming a cornerstone for genAI tools, providing implementation flexibility, enhanced explainability, and composability with Large Language Models (LLMs). However, recent research suggests that RAG may be making genAI models less safe and reliable. Alan Nichol, CTO at Rasa, criticized RAG as “just a buzzword” that “just means adding a loop around large language models and data retrieval.” Two studies, by Bloomberg and The Association for Computational Linguistics (ACL), found that using RAG with LLMs can reduce their safety, even when both the LLMs and the documents they access are sound. Both studies found that “unsafe” outputs such as misinformation or privacy risks increased under RAG.

RAG therefore needs strong guardrails and red-teaming: researchers actively trying to find flaws, vulnerabilities, or weaknesses in the system, often by thinking like an adversary. To fully unlock RAG’s potential, enterprises also need to include fragmented structured data, such as customer information. RAG can also create a gateway through firewalls, allowing for data leakage, so security and data governance become critical in a RAG architecture. Separately, Apple’s research paper on Large Reasoning Models (LRMs) found that as tasks became more complex, both standard LLMs and LRMs experienced a significant decline in accuracy, reaching near-zero performance.
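To make the “loop around large language models and data retrieval” concrete, the sketch below shows where retrieval, prompt assembly, and a post-generation guardrail sit in a RAG pipeline. It is a minimal illustration under stated assumptions: the corpus, the keyword-overlap retriever, the llm() stub, and the blocklist guardrail are hypothetical placeholders, not any vendor’s API or the method evaluated in the Bloomberg or ACL studies.

```python
# Minimal RAG loop with a naive output guardrail (illustrative only).
from typing import List

CORPUS = [
    "Deposit tokens are fully backed by fiat deposits held on the bank's balance sheet.",
    "Customer account numbers and balances are confidential structured data.",
    "Large reasoning models lose accuracy as task complexity grows.",
]

BLOCKLIST = ["account number", "password"]  # toy stand-in for real safety filters


def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    """Rank documents by keyword overlap with the query and return the top k."""
    q_terms = set(query.lower().split())
    scored = sorted(corpus, key=lambda doc: -len(q_terms & set(doc.lower().split())))
    return scored[:k]


def llm(prompt: str) -> str:
    """Placeholder for a real LLM call (e.g. a hosted chat-completion endpoint)."""
    return f"Answer grounded in the retrieved context:\n{prompt[:200]}"


def guarded_rag(query: str) -> str:
    # The "loop": retrieve documents, inject them into the prompt, then generate.
    context = "\n".join(retrieve(query, CORPUS))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context."
    answer = llm(prompt)
    # Guardrail: block responses that surface sensitive terms pulled in via retrieval.
    if any(term in answer.lower() for term in BLOCKLIST):
        return "Response withheld: potential disclosure of sensitive data."
    return answer


if __name__ == "__main__":
    print(guarded_rag("How are deposit tokens backed?"))
```

The guardrail here runs after generation because, as the studies note, retrieval can pull unsafe or sensitive material into otherwise sound models; production systems would typically also filter at retrieval time and enforce data-governance rules on what the retriever may index.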

Read Article

Category: Members, Additional Reading

