With long waiting lists and rising costs in overburdened healthcare systems, many people are turning to AI-powered chatbots like ChatGPT for medical self-diagnosis. About one in six American adults already use chatbots for health advice at least monthly, according to one recent survey.
But placing too much trust in chatbots’ outputs can be risky, in part because people struggle to know what information to give chatbots for the best possible health recommendations, according to a recent Oxford-led study.
“The study revealed a two-way communication breakdown,” Adam Mahdi, director of graduate studies at the Oxford Internet Institute and a co-author of the study, told TechCrunch. “Those using [chatbots] didn’t make better decisions than participants who relied on traditional methods like online searches or their own judgment.”
For the study, the authors recruited around 1,300 people in the U.K. and gave them medical scenarios written by a group of doctors. The participants were tasked with identifying potential health conditions in the scenarios and using chatbots, as well as their own methods, to figure out possible courses of action (e.g., seeing a doctor or going to the hospital).
The participants used the default AI model powering ChatGPT, GPT-4o, as well as Cohere’s Command R+ and Meta’s Llama 3, which once underpinned the company’s Meta AI assistant. According to the authors, the chatbots not only made the participants less likely to identify a relevant health condition, but also made them more likely to underestimate the severity of the conditions they did identify.
Mahdi said that the participants often omitted key details when querying the chatbots or received answers that were difficult to interpret.
“[T]he responses they received [from the chatbots] frequently combined good and poor recommendations,” he added.
The findings come as tech companies increasingly push AI as a way to improve health outcomes. Apple is reportedly developing an AI tool that can dispense advice related to exercise, diet, and sleep. Amazon is exploring an AI-based way to analyze medical databases for “social determinants of health.” And Microsoft is helping build AI to triage messages that patients send to care providers.
But as TechCrunch has previously reported, professionals and patients alike are divided over whether AI is ready for higher-risk health applications. The American Medical Association recommends against physicians using chatbots like ChatGPT for assistance with clinical decisions, and major AI companies, including OpenAI, warn against making diagnoses based on their chatbots’ outputs.
“We would recommend relying on trusted sources of information for healthcare decisions,” Mahdi said. “Current evaluation methods for [chatbots] do not reflect the complexity of interacting with human users. Like clinical trials for new medications, [chatbot] systems should be tested in the real world before being deployed.”