
Themis AI: Teaching AI Models What They Don’t Know
AI systems like ChatGPT often deliver answers with apparent certainty, even when their knowledge is incomplete. That overconfidence becomes a serious problem as AI is increasingly relied upon in high-stakes fields such as drug development and autonomous driving. Themis AI, an MIT spinout, is tackling the problem by quantifying model uncertainty and correcting unreliable outputs.
Themis AI’s flagship platform, Capsa, is designed to work with any machine-learning model, swiftly detecting and rectifying unreliable outputs. Capsa modifies AI models to identify patterns indicative of ambiguity, incompleteness, or bias in their data processing.
“The idea is to take a model, wrap it in Capsa, identify the uncertainties and failure modes of the model, and then enhance the model,” explains Themis AI co-founder and MIT Professor Daniela Rus, who also directs MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL). “We’re excited about offering a solution that can improve models and offer guarantees that the model is working correctly.”
Founded in 2021 by Rus, Alexander Amini, and Elaheh Ahmadi, Themis AI has already worked across several industries. The company has helped telecom companies with network planning, helped oil and gas companies use AI to interpret seismic imagery, and contributed to research on building more trustworthy chatbots.
“We want to enable AI in the highest-stakes applications of every industry,” Amini states. “We’ve all seen examples of AI hallucinating or making mistakes. As AI is deployed more broadly, those mistakes could lead to devastating consequences. Themis makes it possible that any AI can forecast and predict its own failures, before they happen.”
Rus’s research on model uncertainty dates back several years. In 2018, she received funding from Toyota to investigate the reliability of machine learning-based autonomous driving solutions, highlighting the critical importance of model reliability in safety-critical applications.
Earlier work by Rus, Amini, and their colleagues led to an algorithm capable of detecting and eliminating racial and gender bias in facial recognition systems. This algorithm reweighted the model’s training data by identifying and rebalancing unrepresentative parts.
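To make the idea of rebalancing concrete, here is a minimal sketch of one common reweighting scheme, inverse-frequency weighting. It is an illustration only and not the researchers' algorithm: it assumes each training example already carries a known group label, and the function name rebalancing_weights is hypothetical.

```python
from collections import Counter


def rebalancing_weights(group_labels):
    """Return one sampling weight per example.

    Assumption for illustration: every example has a known group label.
    Weights are inversely proportional to group frequency, so each group
    contributes the same total weight regardless of how many examples it has.
    """
    counts = Counter(group_labels)
    n_groups = len(counts)
    return [1.0 / (n_groups * counts[g]) for g in group_labels]


# Toy data: group "a" is overrepresented three to one.
labels = ["a", "a", "a", "b"]
weights = rebalancing_weights(labels)
```

In this toy case the single "b" example receives as much total weight as the three "a" examples combined, so a sampler driven by these weights would draw from both groups equally often.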
In 2021, the researchers showed that a similar approach could help pharmaceutical companies use AI models to predict the properties of drug candidates. They founded Themis AI later that year.
“Guiding drug discovery could potentially save a lot of money,” Rus notes. “That was the use case that made us realize how powerful this tool could be.”
Currently, Themis AI is collaborating with enterprises across various sectors, including those developing large language models (LLMs). Capsa enables these models to quantify their uncertainty for each output, addressing concerns about reliability.
Stewart Jamieson, Themis AI’s head of technology, adds, “Many companies are interested in using LLMs that are based on their data, but they’re concerned about reliability. We help LLMs self-report their confidence and uncertainty, which enables more reliable question answering and flagging unreliable outputs.”
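One generic way to attach an uncertainty score to an output is to compare several predictions for the same input and measure how much they disagree. The sketch below uses predictive entropy for this. It is not Capsa's API or method; the function names, the 0.5 threshold, and the example probabilities are all assumptions chosen for illustration.

```python
import math


def predictive_entropy(prob_sets):
    """Average several probability vectors for one input; return entropy in nats.

    prob_sets is assumed to hold predictions for the same input from
    different ensemble members or repeated stochastic forward passes.
    """
    n = len(prob_sets)
    k = len(prob_sets[0])
    mean = [sum(p[i] for p in prob_sets) / n for i in range(k)]
    return -sum(q * math.log(q) for q in mean if q > 0)


def flag_unreliable(prob_sets, threshold=0.5):
    """Flag an output when its entropy exceeds a hypothetical threshold."""
    return predictive_entropy(prob_sets) > threshold


# Two predictions that agree, and two that disagree about the top class.
confident = [[0.98, 0.01, 0.01], [0.97, 0.02, 0.01]]
uncertain = [[0.90, 0.05, 0.05], [0.10, 0.80, 0.10]]

confident_flag = flag_unreliable(confident)
uncertain_flag = flag_unreliable(uncertain)
```

Under these assumptions, only the second input is flagged, because its predictions disagree about the most likely class. In practice the threshold would need to be calibrated on held-out data.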
Themis AI is also engaged in discussions with semiconductor companies to build AI solutions on chips for use outside of cloud environments, potentially combining the efficiency of edge computing with high accuracy.
Pharmaceutical companies are using Capsa to improve AI models that identify drug candidates and predict their performance in clinical trials. Amini emphasizes that Capsa can quickly indicate whether a prediction is backed by evidence or is speculative, which could speed the identification of promising candidates and, in his view, bring significant societal benefits.
Looking ahead, Themis AI is exploring Capsa’s ability to improve accuracy in chain-of-thought reasoning, a technique in which LLMs lay out the steps they take to reach an answer. Jamieson believes this could significantly improve the LLM experience, reduce latency, and lower computational requirements.
For Rus, Themis AI represents an opportunity to translate her MIT research into real-world impact, addressing both the potential and the concerns surrounding AI.
“My students and I have become increasingly passionate about going the extra step to make our work relevant for the world,” Rus concludes. “AI has tremendous potential to transform industries, but AI also raises concerns. What excites me is the opportunity to help develop technical solutions that address these challenges and also build trust and understanding between people and the technologies that are becoming part of their daily lives.”