Researchers prompted the generative AI tool GPT-4 to flag incidental findings such as lumps and lesions in CT scan reports, much as an experienced radiologist would, which could speed up radiology workflows. Lionel BONAVENTURE/AFP

Artificial Intelligence tools like ChatGPT can greatly help in speeding up radiology workflows, according to a new study published in the American Journal of Roentgenology.

The study, titled "Use of GPT-4 With Single-Shot Learning to Identify Incidental Findings in Radiology Reports", was carried out by Canadian researchers and showed how GPT-4 can pick out incidental findings from CT scan reports. As the title indicates, the model worked on the written reports rather than on the scan images themselves.

To understand whether ChatGPT can speed up radiology workflows, the Canadian scientists prompted GPT-4 to read CT scan reports using a process called 'single-shot learning'. In this method, the model is shown a single worked example in its prompt and then asked to perform the same task on new input, without being retrained.

GPT-4 was then asked to flag incidental findings in each report it was given. These findings can be case-specific and relevant to patients and radiologists. The researchers believe this could lessen the burden on healthcare professionals, including radiologists, and speed up the diagnosis of critical diseases.
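In the usual sense of the term, single-shot (or one-shot) prompting means placing one worked example ahead of the new input. The Python sketch below illustrates the idea only: the example report, the answer format and the function name `build_single_shot_prompt` are invented for illustration and are not taken from the study, and no call to any AI service is made.

```python
# Hypothetical worked example, invented for illustration (not from the study).
EXAMPLE_REPORT = "CT abdomen: small left adrenal nodule. Pancreas unremarkable."
EXAMPLE_ANSWER = (
    "adrenal nodule: yes; pancreatic lesion: no; vascular calcification: no"
)


def build_single_shot_prompt(new_report: str) -> str:
    """Build a prompt containing one worked example followed by the new report."""
    instructions = (
        "Identify incidental findings in the radiology report. "
        "Answer in the same format as the example."
    )
    return (
        f"{instructions}\n\n"
        f"Report: {EXAMPLE_REPORT}\n"
        f"Answer: {EXAMPLE_ANSWER}\n\n"
        f"Report: {new_report}\n"
        "Answer:"
    )


# The resulting text would be sent to a language model; that step is omitted here.
prompt = build_single_shot_prompt("CT abdomen: aortic calcification noted.")
print(prompt)
```

The model sees exactly one solved case before the report it has to assess, which is what separates this approach from retraining a model on many labelled examples.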

Dr Rajesh Bhayana, one of the researchers on the team that ran the experiment at the Toronto General Hospital, said: "Automatic identification of incidental findings in radiology reports could improve patient care by highlighting the findings to referring clinicians, automating management, or facilitating population health initiatives."

This comes at a time when UK researchers from Cardiff University demonstrated the effective use of generative AI in enhancing breast cancer detection and diagnosis.

The scientists randomly selected and analysed 1,000 radiology reports to test GPT-4. All were reports of abdominal CT scans taken from the Toronto General Hospital records.

The AI's accuracy was checked against a reference standard set by a physician with four years of experience and a seasoned radiologist with years of post-training experience. Together they defined the findings on which GPT-4 would be assessed, including a potential new malignancy indicated by an adrenal nodule measuring less than one centimetre, a pancreatic lesion or vascular calcification.

An adrenal nodule is a lump or tissue growth that is not a cause for concern under normal circumstances. However, some nodules produce abnormal levels of hormones or turn out to be malignant, which makes them a useful marker for diagnosing disease.

Vascular calcification is the build-up of mineral deposits in the blood vessels, which could be an indication of blockage, atherosclerosis and other cardiac issues.

These markers were essential for the radiologists to interpret the CT findings so that a proper diagnosis could be made. Once the reference standard was established, a more experienced radiologist with over a decade of clinical experience reviewed any discrepancies between it and GPT-4's analysis of the reports and resolved them.

The F1 scores reported in the study indicate how accurate the generative AI analysis was. The F1 score combines precision (how many of the flagged findings were real) and recall (how many of the real findings were flagged). A score of 1 means the large language model flagged every report containing the finding and raised no false alarms.
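For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The short Python sketch below shows the calculation; the counts are made up for illustration and are not figures from the study, and the function name `f1_score` is simply a label chosen here.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Return the F1 score (harmonic mean of precision and recall) from raw counts."""
    if true_pos == 0:
        # With no correct detections, precision and recall are both zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)  # share of flagged findings that were real
    recall = true_pos / (true_pos + false_neg)     # share of real findings that were flagged
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, not taken from the study.
perfect = f1_score(true_pos=50, false_pos=0, false_neg=0)
imperfect = f1_score(true_pos=45, false_pos=5, false_neg=5)
print(perfect, round(imperfect, 2))
```

In the first case nothing is missed and nothing is wrongly flagged, so the score is 1. In the second, five missed findings and five false alarms pull both precision and recall down to 0.9, and the F1 score falls with them.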

The analysis showed that GPT-4 accurately identified adrenal nodules in the CT scan reports, with an F1 score of 1. For vascular calcification and pancreatic lesions, the F1 scores were 0.99 and 0.91 respectively, results that were later confirmed by radiologists.

These results were similar to what a trained radiologist could have detected, except that here GPT-4 performed the initial sweep of the reports and could reduce errors by flagging additional findings. This is likely to help radiology departments save time as AI is integrated into the workflow after CT scans.

The study echoed the sentiments of the World Health Organisation which recently said AI could help speed up diagnosis and drug development.

The researchers underlined in their study how incidental findings are commonly mismanaged by hospital staff and hence "automatic identification in reports could improve management by increasing visibility or partially automating workup".

The flexibility of large language models will provide new opportunities to improve care when AI is incorporated into electronic medical records, said the study.