MIT Researchers Enhance AI Trustworthiness in Critical Applications

In high-stakes environments like medical diagnostics, the reliability of AI models is paramount. MIT researchers have addressed a significant challenge in this area by improving the trustworthiness and efficiency of AI predictions, particularly in medical imaging analysis.

The core issue lies in the ambiguity of medical images, where conditions like pleural effusion can easily be mistaken for pulmonary infiltrates. AI models can aid clinicians, but they often provide a single prediction, which may not be sufficient given the complexity of possible conditions.

Conformal classification, a method for generating a set of potential diagnoses, offers a promising solution. However, it often produces impractically large sets. Researchers at MIT have developed an enhancement that reduces these prediction sets by up to 30% while simultaneously increasing the reliability of the predictions.

Divya Shanmugam PhD ’24, a postdoc at Cornell Tech who conducted this research while she was an MIT graduate student, explains, “With fewer classes to consider, the sets of predictions are naturally more informative in that you are choosing between fewer options. In a sense, you are not really sacrificing anything in terms of accuracy for something that is more informative.”

The research team, including Helen Lu ’24; Swami Sankaranarayanan, a former MIT postdoc now at Lilia Biosciences; and senior author John Guttag, the Dugald C. Jackson Professor of Computer Science and Electrical Engineering at MIT, will present their findings at the Conference on Computer Vision and Pattern Recognition in June. Their work focuses on improving the utility and trustworthiness of AI in critical decision-making processes.

Traditional AI models provide probability scores with each prediction, but these scores are often unreliable. Conformal classification replaces this single prediction with a set of probable diagnoses and a guarantee that the correct diagnosis is within the set. However, the uncertainty in AI predictions can lead to excessively large and less useful sets.
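To make the idea concrete, here is a minimal sketch of split conformal classification in Python with NumPy. The function name and the choice of nonconformity score (one minus the softmax probability of the true class) are standard textbook choices for illustration, not necessarily the exact ones used in the MIT work.

```python
import numpy as np

def conformal_prediction_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Split conformal classification.

    cal_probs:  (n, n_classes) softmax outputs on held-out calibration data
    cal_labels: (n,) true labels for the calibration data
    test_probs: (m, n_classes) softmax outputs on new examples
    alpha:      target error rate; sets contain the true label with
                probability >= 1 - alpha

    Returns a list of arrays, one candidate-class set per test example.
    """
    n = len(cal_labels)
    # Nonconformity score: one minus the probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-corrected quantile of the calibration scores.
    q_level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    qhat = np.quantile(scores, q_level, method="higher")
    # Every class whose score falls below the threshold joins the set.
    return [np.where(1.0 - probs <= qhat)[0] for probs in test_probs]
```

The guarantee is marginal: across many examples, at least a 1 - alpha fraction of the sets contain the correct class. The cost is exactly the problem the article describes: when the model is uncertain, the sets that clear the threshold grow large.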

To combat this, the researchers integrated a technique called test-time augmentation (TTA). TTA involves creating multiple versions of a single image through cropping, flipping, and zooming, and then aggregating the AI model’s predictions across these augmented images. This approach enhances both the accuracy and robustness of the predictions.

Shanmugam elaborates, “In this way, you get multiple predictions from a single example. Aggregating predictions in this way improves predictions in terms of accuracy and robustness.”
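A minimal PyTorch sketch of this aggregation step follows. The specific views (a horizontal flip, small rotations, a zoomed center crop) are illustrative stand-ins for whatever augmentation set a practitioner chooses, not the paper's exact set.

```python
import torch
import torchvision.transforms.functional as TF

@torch.no_grad()
def tta_predict(model, image):
    """Average the model's softmax outputs over several views of one image.

    image: a (C, H, W) tensor. Returns aggregated class probabilities.
    """
    _, h, w = image.shape
    views = [
        image,
        TF.hflip(image),
        TF.rotate(image, 10),
        TF.rotate(image, -10),
        # Zoom: crop the central 80 percent and resize back to full size.
        TF.resized_crop(image, int(0.1 * h), int(0.1 * w),
                        int(0.8 * h), int(0.8 * w), [h, w]),
    ]
    batch = torch.stack(views)
    probs = model(batch).softmax(dim=-1)
    return probs.mean(dim=0)
```

Averaging over views smooths out prediction errors that depend on incidental details of a single crop or orientation, which is what makes the aggregated output more robust.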

The team applied TTA by holding out some of the labeled image data used during the conformal classification process. On those held-out data, they learned how to aggregate the predictions from the augmented images so that the aggregation maximizes the accuracy of the underlying model. The result is a smaller, more accurate set of predictions with the same probability guarantee.
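A hedged sketch of how such a learned aggregation might look: per-augmentation weights are fit on held-out labeled predictions by minimizing cross-entropy (one concrete proxy for maximizing accuracy), and the resulting weighted probabilities then replace the raw softmax outputs in the conformal calibration step sketched earlier. The tensor shapes and training recipe here are assumptions for illustration, not the paper's exact procedure.

```python
import torch

def learn_augmentation_weights(aug_probs, labels, steps=300, lr=0.05):
    """Fit convex weights over augmentations on held-out labeled data.

    aug_probs: (n, n_augs, n_classes) per-view softmax outputs
    labels:    (n,) true labels
    Returns an (n_augs,) weight vector summing to one.
    """
    w = torch.zeros(aug_probs.shape[1], requires_grad=True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        weights = torch.softmax(w, dim=0)              # keep the weights convex
        # Weighted mixture of per-view probabilities for every example.
        mixed = torch.einsum("a,nac->nc", weights, aug_probs)
        # Cross-entropy on the mixture stands in for "maximize accuracy."
        loss = torch.nn.functional.nll_loss(mixed.clamp_min(1e-12).log(), labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.softmax(w, dim=0).detach()
```

At test time the same weights aggregate each example's augmented predictions, and conformal calibration runs on those sharper probabilities, which is what shrinks the sets without touching the model itself.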

The researchers emphasized that the method requires no model retraining, making it both simple to implement and effective. Across several standard image classification benchmarks, the TTA-augmented method reduced prediction set sizes by 10 to 30 percent while maintaining the probability guarantee.

Future work will focus on validating this approach with models that classify text and on reducing the computational demands of TTA. The research highlights the importance of how labeled data is used after model training and opens new avenues for improving AI reliability in various applications.
