June 22, 2023

Demystifying AI’s Reliability Concerns – Interview With Tim O’Connell, Radiologist and emtelligent Founder and CEO

by The Medical Futurist – 5 minutes

“Providers, healthcare technology companies, payers, government agencies that provide healthcare services – they all need to have a chief AI officer, a person who understands the technology and workflows and can be responsible for implementation” – says Dr. Tim O’Connell.

He is the founder and CEO of emtelligent, a Vancouver-based medical natural language processing technology company, a practicing radiologist, and the vice-chair of clinical informatics at the University of British Columbia.

What are your thoughts on the recent widespread adoption of large language models in various sectors?

I’m really only qualified to talk about large language models (LLMs) in healthcare, but the adoption of this technology is exciting. LLMs are an amazing tool that can make language processing, language inference, and language understanding much more efficient and accurate. They offer an incremental performance improvement over previous types of natural language processing.

But a lot of people misunderstand large language models because of their own easy interactions with ChatGPT and Bard. They’re thinking, “Now I can converse with this amazing AI engine.” What they tend to overlook is that these language models still have reliability problems in terms of accuracy and “hallucinations” (a phenomenon in which a model produces fluent, plausible-sounding output that isn’t supported by its training data or by reality). For safe use in healthcare, LLMs must be trained on extremely large sets of annotated medical data to improve accuracy.

Still, I’m bullish on the future of LLMs. They are a super exciting technology that’s moving the needle on how a computer understands language. They’re going to open up new markets and new possibilities. They will help people do their jobs better and more safely in healthcare. And they’ll help relieve physician burnout by taking over some of the tasks that consume so much clinician time.

In your opinion, what measures could be taken to ensure the safe and ethical utilization of large language models?

We need a couple of things. First, we will probably need some type of government regulation. I’m not saying that’s an easy thing to do, but ethical utilization of LLMs can’t simply be assumed, because there are people who will try to go beyond the boundaries of ethics to gain some kind of advantage. We know that; it’s part of human nature.

In addition to regulations, there needs to be corporate and organizational accountability for how AI is used in healthcare. Providers, healthcare technology companies, payers, government agencies that provide healthcare services – they all need to have a chief AI officer, a person who understands the technology and workflows and can be responsible for implementation. Otherwise, we’re going to see people in large organizations start implementing AI projects and tools built on LLMs, and some of those implementations aren’t going to be safe, appropriate, or ethical.

What are your thoughts on GPT-4 (or similar models) to also analyze images?

I want to be clear that I haven’t had enough experience using GPT-4 for image analysis to offer any expert insights, but I’d like to address its use for analyzing radiology images. OpenAI hasn’t disclosed what image sources GPT-4 was trained on, so I have to at least question whether it was trained on a significant number of radiology images, which generally aren’t available to the public in large numbers.

If GPT-4 wasn’t trained on a significant number of radiology images, I would be concerned about people using it to interpret radiology studies. While one wrist X-ray may look similar to another to the untrained eye, there are subtle differences that can be very important diagnostically. Even a model trained on 1,000 wrist X-rays wouldn’t be able to deal with all of the pathology possible in just the wrist on that single imaging modality. Models trained on tens to hundreds of thousands of images with widely varying pathology are required to achieve clinical accuracy.

Do you see beneficial use cases for LLMs in radiology?

Absolutely. So again, a large language model is just a piece of technology that can be used to improve natural language processing. That’s one of the things our company does. LLMs will enable more accurate information extraction from radiology reports for use cases such as analytics or patient cohort identification for research and AI model building, as well as for clinical tools to help radiologists, researchers, administrators, and other clinicians.
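
To make the information-extraction use case concrete, here is a minimal, hypothetical sketch of pulling structured findings out of a free-text radiology report with an LLM. The prompt wording, the `call_llm` helper (stubbed here with a canned response), and the JSON schema are illustrative assumptions for this article, not emtelligent’s actual pipeline or any specific vendor’s API.

```python
import json

# Example free-text report (invented for illustration).
REPORT = """FINDINGS: Mildly displaced fracture of the distal radius.
No acute intracranial hemorrhage. IMPRESSION: Distal radius fracture."""

# Prompt asking the model to return machine-readable findings.
PROMPT = f"""Extract every finding from the radiology report below.
Return a JSON list of objects with keys "finding", "anatomy", "present".
Report:
{REPORT}
"""

def call_llm(prompt: str) -> str:
    """Stand-in for whatever LLM endpoint is used; returns a canned reply."""
    return json.dumps([
        {"finding": "mildly displaced fracture of the distal radius",
         "anatomy": "distal radius", "present": True},
        {"finding": "acute intracranial hemorrhage",
         "anatomy": "brain", "present": False},
    ])

def extract_findings(prompt: str) -> list[dict]:
    raw = call_llm(prompt)
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # LLM output is not guaranteed to be valid JSON; fail loudly rather
        # than silently ingesting malformed clinical data.
        raise ValueError("Model did not return parseable JSON")

for item in extract_findings(PROMPT):
    print(item)
```

Structured output like this is what makes downstream analytics or cohort queries possible (for example, finding all patients with a distal radius fracture in a given year); in a clinical setting the extraction would need to be validated against expert-annotated data before anyone relies on it.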

If image-based models can be made good enough and safely implemented, that raises the potential for radiologists to automate the reporting of basic results for X-rays and other common radiology images. Based only on what I’ve seen in the market, I don’t think we’re quite there yet for completely automated reporting. We might be five or ten years away, though people have already come up with ancillary technologies, such as models that can automate the creation of summaries from the findings sections of radiology reports.

As medical imaging utilization increases every year, some radiology departments have been having trouble keeping pace due to staffing shortages and other resource constraints. AI technologies like LLMs hold great promise to allow us to continue to deliver high-quality imaging in the face of this increased utilization.

Thank you!

Take care,

Berci