
MIT Researchers Develop New Method to Improve Radiologists’ Diagnostic Reliability
A multidisciplinary team of MIT researchers, collaborating with Harvard Medical School-affiliated hospitals, has developed a novel framework to assess and improve the reliability of radiologists’ diagnostic reports. The study addresses the inherent ambiguity in medical images, which leads radiologists to use hedging terms like “may” or “likely” when describing pathologies such as pneumonia.
The research reveals that radiologists tend to be overconfident when using phrases like “very likely” and underconfident when using terms like “possibly.” To tackle this issue, the team created a framework that quantifies the reliability of certainty phrases used by radiologists, aligning their language with the actual occurrence of pathologies.
“The words radiologists use are important. They affect how doctors intervene, in terms of their decision making for the patient. If these practitioners can be more reliable in their reporting, patients will be the ultimate beneficiaries,” says Peiqi Wang, an MIT graduate student and lead author of the research paper published on arXiv.
The framework provides actionable suggestions to help radiologists choose certainty phrases that improve the reliability of their clinical reporting. The researchers also show that the same technique can calibrate large language models, so that the words a model uses to express confidence match the accuracy of its predictions.
By helping radiologists accurately describe the likelihood of pathologies in medical images, this framework promises to enhance the reliability of critical clinical information, ultimately benefiting patient care. The study uses clinical data and probability distributions to capture the nuances of terms such as “possible” and “likely.”
The researchers used prior work that surveyed radiologists to obtain probability distributions that correspond to each diagnostic certainty phrase, ranging from “very likely” to “consistent with.”
The team’s approach treats each certainty phrase as a probability distribution rather than a single percentage value, which gives a more nuanced representation of uncertainty. Calibration can then be evaluated and improved by adjusting how often each phrase is used, so that expressed confidence better matches how often the pathology is actually present.
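To make the idea concrete, here is a minimal Python sketch of what checking a phrase against reality might look like. It is an illustration only, not the researchers' method: the Beta parameters assigned to each phrase, the `PHRASES` table, and the `calibration_gaps` helper are all hypothetical, and the sketch collapses each distribution to its mean for a simple comparison with the observed rate of pathology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Beta:
    """A Beta(a, b) distribution standing in for one certainty phrase."""
    a: float
    b: float

    @property
    def mean(self) -> float:
        return self.a / (self.a + self.b)


# Hypothetical phrase-to-distribution table (values invented for illustration,
# not taken from the study or from any radiologist survey).
PHRASES = {
    "very likely": Beta(9.0, 1.0),  # mean 0.9
    "likely": Beta(7.0, 3.0),       # mean 0.7
    "possibly": Beta(3.0, 7.0),     # mean 0.3
}


def calibration_gaps(reports):
    """Return, per phrase, implied confidence minus observed pathology rate.

    `reports` is an iterable of (phrase, pathology_present) pairs.
    A positive gap suggests overconfidence; a negative gap, underconfidence.
    """
    counts = {}
    for phrase, present in reports:
        total, positives = counts.get(phrase, (0, 0))
        counts[phrase] = (total + 1, positives + int(present))
    return {
        phrase: PHRASES[phrase].mean - positives / total
        for phrase, (total, positives) in counts.items()
    }


if __name__ == "__main__":
    # Made-up reports: "very likely" is right 7 times in 10,
    # "possibly" is right 5 times in 10.
    reports = (
        [("very likely", True)] * 7 + [("very likely", False)] * 3
        + [("possibly", True)] * 5 + [("possibly", False)] * 5
    )
    for phrase, gap in calibration_gaps(reports).items():
        print(f"{phrase}: gap {gap:+.2f}")
```

With this invented data, “very likely” comes out overconfident and “possibly” underconfident, mirroring the pattern the study reports. Keeping the full distribution, as the researchers do, would retain information about spread that this mean-only comparison discards.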
The research team includes Peiqi Wang, Polina Golland, Barbara D. Lam, Yingcheng Liu, Ameneh Asgari-Targhi, Rameswar Panda, William M. Wells, and Tina Kapur. Their work will be presented at the International Conference on Learning Representations.
Atul B. Shinagare, associate professor of radiology at Harvard Medical School, notes that the study has the potential to improve radiologists’ accuracy and communication, which will help improve patient care.