Risk-Aware Hallucination Detection for Arbitrary Generative Models

Key Outcomes

Under 1.4 seconds to convert a 7B model
+95% accuracy with uncertainty-based selective answering
0 new training samples needed

Context

Generative AI and language interfaces represent a significant market opportunity, estimated at $42.6B in 2023 and projected to reach $98.1B by 2026, a 32.0% CAGR, with 60% of CIOs planning widespread AI deployments across their organizations by 2025. AI core platforms, i.e., the building blocks of AI and ML deployments, including the developer tools needed to build and deploy models to production, form a segment whose spending reached $32.4B in 2022 and is projected to grow at a 32.7% CAGR to $75.7B in 2024. Demand for expertise in this field has created a talent shortage, with 78% of organizations reporting increased difficulty hiring AI data scientists in 2022, making the purchase of AI software one of the leading strategies for closing the gap.

Moreover, as impressive as the capabilities of generative models have become, they are still trained as black-box systems and remain held back by seemingly unpredictable variations in output quality. Users are therefore reluctant to fully adopt these models in their day-to-day work, which in turn prevents the models from being integrated into larger workflows with stricter performance requirements. In simple terms, deployed models need to recognize when their outputs are unreliable so that proper quality assurance is possible.

Our Solution

Themis AI has developed Capsa, a software platform that automates the creation of uncertainty-aware machine learning models. It is the first uncertainty estimation library designed to be compatible with any model: it can perform precise, low-level modifications in any framework, for any type of model architecture, at any stage of development. The library performs these operations automatically in a few seconds through a one-line call that wraps an existing model, leaving the rest of the user’s code unchanged. Capsa provides estimates for multiple sources of uncertainty, including aleatoric, epistemic, and vacuitic. It ships with a broad library of state-of-the-art algorithms and methods, several of which are proprietary, and these methods have been successfully tested and deployed on vision, language, graph, and generative models.
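The exact Capsa interface is not reproduced here, but the pattern described above, wrap an existing model once and get a risk estimate alongside every prediction, can be sketched generically. The snippet below is a minimal illustration in plain PyTorch using Monte Carlo dropout as a stand-in epistemic-uncertainty method; the class name, sample count, and toy model are assumptions for illustration and are not the Capsa API.

```python
# Illustrative only: a generic uncertainty wrapper in plain PyTorch.
# The names here are hypothetical and are NOT the Capsa API; they sketch the
# "wrap once, get predictions plus risk" pattern, with Monte Carlo dropout
# standing in for the uncertainty method.
import torch
import torch.nn as nn


class UncertaintyWrapper(nn.Module):
    """Wraps an existing model; forward() returns (mean prediction, epistemic std)."""

    def __init__(self, model: nn.Module, n_samples: int = 8):
        super().__init__()
        self.model = model
        self.n_samples = n_samples

    def forward(self, x: torch.Tensor):
        # Keep dropout active at inference time so repeated passes differ.
        self.model.train()
        with torch.no_grad():
            samples = torch.stack([self.model(x) for _ in range(self.n_samples)])
        # Mean over samples is the prediction; the std across samples
        # approximates epistemic uncertainty (disagreement between passes).
        return samples.mean(dim=0), samples.std(dim=0)


# Usage: a single wrap call, leaving the rest of the pipeline unchanged.
base_model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Dropout(0.1), nn.Linear(64, 1))
risk_aware_model = UncertaintyWrapper(base_model)
prediction, risk = risk_aware_model(torch.randn(4, 16))
```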

[Diagram: model latent representation → Capsa risk estimate → output]
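The “+95% accuracy with uncertainty-based selective answering” outcome above rests on a simple decision rule applied to a risk score like the one in the diagram: answer only when the estimated risk falls below a threshold, abstain otherwise. The sketch below is a hypothetical, self-contained illustration of that rule and of the accuracy/coverage trade-off; the threshold values and synthetic data are placeholders, not Themis AI results.

```python
# Illustrative sketch of uncertainty-based selective answering (not Capsa code).
# Given per-question risk scores and correctness labels, answer only the
# questions whose risk falls below a threshold and measure accuracy/coverage.
import numpy as np


def selective_answering(risk: np.ndarray, correct: np.ndarray, threshold: float):
    """Return (accuracy on answered questions, fraction of questions answered)."""
    answered = risk < threshold
    if not answered.any():
        return float("nan"), 0.0
    return float(correct[answered].mean()), float(answered.mean())


# Placeholder data: lower risk is made to correlate with correct answers.
rng = np.random.default_rng(0)
risk = rng.uniform(0.0, 1.0, size=1000)
correct = (rng.uniform(0.0, 1.0, size=1000) > risk).astype(float)

# Sweeping the threshold shows the trade-off: stricter thresholds answer
# fewer questions but answer them more accurately.
for threshold in (0.2, 0.5, 0.8):
    acc, cov = selective_answering(risk, correct, threshold)
    print(f"threshold={threshold:.1f}  accuracy={acc:.2f}  coverage={cov:.2f}")
```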

Additional Resources

Themis AI, “Uncertainty-Aware Language Modeling for Selective Question Answering,” Internal Report, 2023.

Themis AI, “Systems and Methods for Uncertainty-Aware Generative Models,” Internal Report, 2023.

Themis AI, “Risk-Aware Image Generation,” Internal Report, 2023.