Microsoft’s paper discusses closed-domain hallucination detection method, designed to provide traceability for processes with any number of generative steps, through an evidence trail that identifies error stage and verifies each claim separately • DigiBanker

Microsoft paper presents VeriTrail, the first closed-domain hallucination detection method designed to provide traceability for processes with any number of generative steps. VeriTrail outperforms baseline methods commonly used for hallucination detection. In this blog post, we provide an overview of VeriTrail’s design and performance. A key idea leveraged by VeriTrail is that a wide range of generative processes can be represented as a directed acyclic graph (DAG). Each node in the DAG represents a piece of text (i.e., source material, an intermediate output, or the final output) and each edge from node A to node B indicates that A was used as an input to produce B. Each node is assigned a unique ID, as well as a stage reflecting its position in the generative process.  VeriTrail takes as input a DAG representing a completed generative process and aims to determine whether the final output is fully supported by the source text. It begins by extracting claims (i.e., self-contained, verifiable statements) from the final output using Claimify. VeriTrail verifies claims in the reverse order of the generative process: it starts from the final output and moves toward the source text. Each claim is verified separately. In addition to assigning a final “Fully Supported,” “Not Fully Supported,” or “Inconclusive” verdict to each claim, VeriTrail returns (a) all Verdict Generation results and (b) an evidence trail composed of all Evidence Selection results: the selected sentences, their corresponding node IDs, and the generated summaries. Collectively, these outputs provide traceability: Provenance: For “Fully Supported” and “Inconclusive” claims, the evidence trail traces a path from the source material to the final output, helping users understand how the output may have been derived. Error Localization: For “Not Fully Supported” claims, VeriTrail uses the Verdict Generation results to identify the stage(s) of the process where the unsupported content was likely introduced. Error stage identification helps users address hallucinations and understand where in the process they are most likely to occur. The evidence trail also helps users verify the verdict: instead of reading through all nodes – which may be infeasible for processes that generate large amounts of text – users can simply review the evidence sentences and summaries. 

Read Article