MIT Researchers Develop New Method to Improve Reliability of Radiologists’ Diagnostic Reports

A multidisciplinary team of researchers from MIT, in collaboration with Harvard Medical School-affiliated hospitals, has developed a new framework to assess and improve the reliability of radiologists’ diagnostic reports. The study addresses the inherent ambiguity in medical images, where radiologists often use terms like “may” or “likely” to describe potential pathologies.

The research highlights that radiologists tend to be overconfident when using phrases like “very likely” and underconfident when using terms like “possibly.” This discrepancy can impact clinical decision-making and patient care. The new framework aims to quantify the reliability of certainty phrases used by radiologists and provide suggestions for more accurate clinical reporting.

Peiqi Wang, an MIT graduate student and lead author of the paper on this research, emphasizes the importance of accurate language in radiology reports. “The words radiologists use are important. They affect how doctors intervene, in terms of their decision making for the patient. If these practitioners can be more reliable in their reporting, patients will be the ultimate beneficiaries,” says Wang.

The team’s approach involves using clinical data to create a framework that aligns the language used by radiologists with the actual occurrence of specific pathologies. This method treats certainty phrases as probability distributions rather than fixed percentages, capturing more nuances in meaning.
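
As a rough illustration of this idea, the sketch below compares the probability a certainty phrase implies with how often the pathology actually occurred when that phrase was used. It is a minimal sketch, not the researchers' method: the Beta parameters in `PHRASE_BETA`, the sample reports, and the 0.05 tolerance are all hypothetical, and each distribution is reduced to its mean for simplicity.

```python
from collections import defaultdict

# Hypothetical Beta(a, b) parameters for each certainty phrase.
# These values are illustrative assumptions, not taken from the study.
PHRASE_BETA = {
    "possibly": (2.0, 6.0),     # mean 0.25
    "likely": (6.0, 3.0),       # mean about 0.67
    "very likely": (9.0, 1.0),  # mean 0.90
}


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def calibration_gaps(reports):
    """Return implied-minus-observed probability for each phrase.

    reports: iterable of (phrase, pathology_present) pairs.
    A positive gap means the phrase promised more than the data delivered.
    """
    counts = defaultdict(lambda: [0, 0])  # phrase -> [times present, total uses]
    for phrase, present in reports:
        counts[phrase][0] += int(present)
        counts[phrase][1] += 1

    gaps = {}
    for phrase, (hits, total) in counts.items():
        implied = beta_mean(*PHRASE_BETA[phrase])
        observed = hits / total
        gaps[phrase] = implied - observed
    return gaps


def label(gap, tol=0.05):
    """Classify a gap; the tolerance is an arbitrary illustrative choice."""
    if gap > tol:
        return "overconfident"
    if gap < -tol:
        return "underconfident"
    return "calibrated"


# Made-up data: "very likely" was right 7 times out of 10,
# "possibly" was right 4 times out of 10.
reports = (
    [("very likely", True)] * 7 + [("very likely", False)] * 3
    + [("possibly", True)] * 4 + [("possibly", False)] * 6
)

gaps = calibration_gaps(reports)
for phrase, gap in gaps.items():
    print(phrase, round(gap, 2), label(gap))
```

With this made-up data, "very likely" comes out overconfident and "possibly" underconfident, which is the pattern the article describes. A fuller treatment would compare against the whole distribution rather than only its mean.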

According to the study, existing calibration methods rely on the confidence score provided by an AI model, which represents the model’s estimated likelihood that its prediction is correct.
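
For context, score-based calibration of this kind is commonly summarized with expected calibration error (ECE), a standard metric that the article itself does not name. The sketch below is an illustrative implementation under that assumption; the function name, the bin count, and the sample numbers are hypothetical.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: floats in [0, 1], the model's confidence in each prediction.
    correct: booleans, whether each prediction was right.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 falls in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    n = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(int(ok) for _, ok in members) / len(members)
        ece += len(members) / n * abs(avg_conf - accuracy)
    return ece


# Made-up example: four predictions at 0.9 confidence, three of them correct.
ece = expected_calibration_error(
    [0.9, 0.9, 0.9, 0.9], [True, True, True, False]
)
print(round(ece, 2))
```

This needs a numeric score for every prediction. A radiology report offers only words such as “likely,” which is the gap the phrase-based framework is meant to fill.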

The researchers suggest that by adjusting the frequency of certain phrases, radiologists can better align their confidence levels with reality. For example, the framework might suggest changing the phrase “pneumonia is present” to “likely present” to improve calibration in certain datasets.

The study also found that radiologists tend to be underconfident when diagnosing common conditions like atelectasis and overconfident with ambiguous conditions like infection. The same technique used to assess radiologists can also be used to measure and improve the calibration of large language models by better aligning the words models use to express confidence with the accuracy of their predictions.

The framework’s potential extends beyond radiology, as it can also be used to evaluate the reliability of language models. These models often use phrases like “certainly,” which may not encourage verification of the statements’ correctness, according to Wang.

Looking ahead, the researchers plan to collaborate with clinicians to expand the study to include data from abdominal CT scans and to assess how receptive radiologists are to calibration-improving suggestions.

Atul B. Shinagare, associate professor of radiology at Harvard Medical School, who was not involved with this work, notes, “Expression of diagnostic certainty is a crucial aspect of the radiology report, as it influences significant management decisions. This study takes a novel approach to analyzing and calibrating how radiologists express diagnostic certainty in chest X-ray reports, offering feedback on term usage and associated outcomes. This approach has the potential to improve radiologists’ accuracy and communication, which will help improve patient care.”

The research was funded by a Takeda Fellowship, the MIT-IBM Watson AI Lab, the MIT CSAIL Wistron Research Collaboration, and the MIT Jameel Clinic.
