WHO's new AI-powered chatbot is giving wrong medical answers
The World Health Organization is wading into the world of AI to provide basic health information through a human-like avatar. But while the bot responds sympathetically to users' facial expressions, it doesn't always know what it's talking about.
SARAH, short for Smart AI Resource Assistant for Health, is a virtual health worker that's available to talk 24/7 in eight different languages to explain topics like mental health, tobacco use and healthy eating. It's part of the WHO's campaign to find technology that can both educate people and fill staffing gaps with the world facing a health-care worker shortage.
WHO warns on its website that this early prototype, introduced on April 2, provides responses that "may not always be accurate." Some of SARAH's AI training is years behind the latest data. And the bot occasionally provides bizarre answers, known as hallucinations in AI models, that can spread misinformation about public health.
SARAH doesn't have a diagnostic feature like WebMD or Google. In fact, the bot is programmed to not talk about anything outside of the WHO's purview, including questions on specific drugs. So SARAH often sends people to a WHO website or says that users should "consult with your health-care provider."
"It lacks depth," Ramin Javan, radiologist and researcher at George Washington University, said. "But I think it's because they just don't want to overstep their boundaries and this is just the first step."
WHO says SARAH is meant to work in partnership with researchers and governments to provide accurate public health information. The agency is asking them for advice on how to improve the bot and use it in emergency health situations. But it emphasizes its AI assistant is still a work in progress.
"These technologies are not at the point where they are substitutes for interacting with a professional or getting medical advice from an actual trained clinician or health provider," Alain Labrique, the director of digital health and innovation at WHO, said.
SARAH was trained on OpenAI's ChatGPT 3.5, which used data through September 2021, so the bot doesn't have up-to-date information on medical advisories or news events. When asked whether the US Food and Drug Administration has approved the Alzheimer's drug lecanemab, for example, SARAH said the drug is still in clinical trials, when in fact it was approved for early disease treatment in January 2023.
Even the WHO's own data can trip SARAH up. When asked whether hepatitis deaths are increasing, it could not immediately provide details from a recent WHO report until prompted a second time to check the agency's website for updated statistics. The agency said it's checking on whether there's a lag in updates.
And sometimes the AI bot draws a blank. Javan asked SARAH where one could get a mammogram in Washington, DC, and it could not provide a response.
None of this is unusual in these early days of AI development. In a study last year looking at how ChatGPT responded to 284 medical questions, researchers at Vanderbilt University Medical Center found that while it provided correct answers most of the time, there were multiple instances where the chatbot was "spectacularly and surprisingly wrong."
Safety and privacy concerns
To mimic empathy during question sessions, SARAH accesses computer cameras and stores users' facial expressions for 30 seconds, then deletes the recordings, WHO communications director Jaimie Guerra said. Each visit is anonymous, but users can elect to share their questions with WHO in a survey to improve the experience. Guerra said any data collected is randomized and not tied to an IP address or person to protect users.
Still, using open source data like GPT's has its own dangers because it is a frequent target of cybercriminals, Jingquan Li, a public health and IT researcher at Hofstra University, said. Some people accessing SARAH through Wi-Fi are vulnerable to malware attacks or video camera hacking. Guerra said attacks trying to access data shouldn't be a problem because of the anonymous sessions.
Government partners and researchers also do not have regular access to the data, including questions that might help track health patterns, unless they ask for the voluntary survey data. Guerra said this means SARAH would not be the most accurate tool to predict the next flu outbreak, for example.
SARAH is a continuation of a 2021 WHO virtual health worker project called Florence that provided basic information on Covid-19 and tobacco. New Zealand-based Soul Machines Ltd. built the avatars for both projects. Soul Machines cannot access the SARAH data, but Chief Executive Officer Greg Cross said in a statement the company is using the GPT data to improve results and experience. Earlier this year, WHO released ethics guidance to its government partners for health-related AI models, including promoting data transparency and protecting safety.
While Florence appeared to be a young, nonwhite woman, SARAH appears White. Changing the appearance and updating the avatar isn't out of the question, Labrique said, and users may be able to choose an avatar preference in future versions.
As for SARAH's gender, when asked, it said: "I am a chatbot digital health promoter, so I do not have a gender or use personal pronouns. My purpose is to assist you in living a healthy lifestyle. Do you have any questions about quitting tobacco, reducing alcohol consumption, or improving your overall well-being?"