2 minute read

The performance of ChatGPT-4.0o in medical imaging evaluation: a cross-sectional study


Reviewer: Dr Jacqui Roots | ASA SIG: Emerging Technologies

Authors: Arruzza E, Evangelista C, Chau M

Why the study was performed

A large part of the medical imaging profession involves evaluating image quality and optimising images to allow accurate diagnosis. ChatGPT-4.0o has a new feature that allows the user to upload an image for evaluation. This study investigated the accuracy of the AI's evaluation of medical images, specifically radiographs. The authors aimed to assess how accurately ChatGPT-4.0o evaluated a radiograph's quality and positioning.

How the study was performed

Phantom radiographs of the knee, elbow, ankle, hand, shoulder and pelvis were used in this study. The authors created a variety of optimally and suboptimally positioned images, uploaded them to ChatGPT and entered a prompt. The AI was asked "(1) to identify positioning error(s), (2) to explain the error using specific and relevant anatomical terminology, and (3) to provide a suitable radiographic method to enhance the positioning". Each generated answer was then scored and graded according to the number of errors correctly or incorrectly identified, whether a justification was given, and whether the AI offered corrections.

What the study found

In only 20% of cases did ChatGPT-4.0o correctly recognise all errors and suggest appropriate improvements. In almost 27% of cases, it either failed to recognise errors or incorrectly identified an error in an optimally positioned image. The authors have demonstrated that the current ChatGPT-4.0o has low accuracy in the evaluation of radiographs. While additional information and more detailed prompts may have increased the accuracy, substantial improvement is required before it could be used as an educational or assistive tool.

Relevance to clinical practice

From an educational point of view, ChatGPT's new image-evaluation feature may be able to provide information on image quality and positioning errors. This could assist students in correcting basic errors and increase their confidence in image optimisation. For experienced staff, the feature may be useful when asked to complete an unusual or uncommon view. Currently, however, ChatGPT demonstrates only low accuracy in its image evaluation and in the corrections it provides.

“The phantoms were purposely positioned to include a range and spectrum of positioning errors, including radiographs that were optimally positioned”