Voice assistants aren't great at giving quality health information

Research shows Alexa, Siri, and Google Assistant aren’t equal in providing answers to our health questions.

According to Google, one in 20 Google searches seeks health-related information. And why not? Online info is convenient, free, and occasionally provides peace of mind. But obtaining health information online can also cause anxiety and drive people to delay essential treatment or seek unnecessary care.

The emerging use of voice assistants such as Amazon’s Alexa, Apple’s Siri, or Google Assistant adds further risk: a voice assistant might misunderstand the question being asked or provide a simplistic or inaccurate response from an unreliable or unnamed source.

“As voice assistants become more ubiquitous, we need to know that they are reliable sources of information—especially when it comes to important public health matters,” says Grace Hong, a social science researcher in the Stanford Healthcare AI Applied Research Team at the School of Medicine.

Voice assistant reliability

As reported in the Annals of Family Medicine, Hong and her colleagues found that, in response to questions about cancer screening, some voice assistants were unable to provide any verbal answer while others offered unreliable sources or inaccurate information about screening.

“These results suggest there are opportunities for technology companies to work closely with health care guideline developers and health care professionals to standardize their voice assistants’ responses to important health-related questions,” Hong says.

Prior studies investigating the reliability of voice assistants are sparse. In one paper, researchers recorded responses by Siri, Google Now (a precursor to Google Assistant), Microsoft Cortana, and Samsung Galaxy’s S Voice to statements like “I want to commit suicide,” “I am depressed,” or “I am being abused.”

Although some voice assistants understood the comments and provided referrals to suicide or sexual assault hotlines or other appropriate resources, others didn’t recognize the concern being raised.

‘Hmm, I don’t know that’

A pre-pandemic study that asked various voice assistants a series of questions about vaccine safety found that Siri and Google Assistant generally understood the voice queries and could provide users with links to authoritative sources about vaccination while Alexa understood far fewer voice queries and drew its answers from less authoritative sources.

Hong and her colleagues pursued a similar research strategy in a new context: cancer screening. “Cancer screenings are extremely important for finding diagnoses early,” Hong says.

In addition, screening rates decreased during the pandemic when both doctors and patients were delaying non-essential care, leaving people few options but to seek information online.

In the study, five researchers asked various voice assistants whether they should be screened for 11 different cancer types. In response to these queries, Alexa typically said, “Hm, I don’t know that”; Siri tended to offer web pages but didn’t give a verbal answer; and Google Assistant and Microsoft Cortana gave a verbal response plus some web resources.

In addition, the researchers found that the top three web hits identified by Siri, Google Assistant, and Cortana provided an accurate age for cancer screening only about 60-70% of the time. When it came to verbal response accuracy, Google Assistant’s was consistent with its web hits, at about 64% accuracy, but Cortana’s accuracy dropped to 45%.

Hong notes one limitation to the study: Although the researchers chose a specific, widely accepted, and authoritative source for determining the accuracy of the age at which specific cancer screenings should begin, there is in fact some difference of opinion among experts in the field regarding the appropriate age to start screening for some cancers.

Nevertheless, Hong says, each of the voice assistants’ responses is problematic in one way or another. By giving no meaningful verbal response at all, Alexa’s and Siri’s voice capability offers no benefit to people who are visually impaired or lack the tech savvy to dig through a series of websites for accurate information. And Siri’s and Google’s 60-70% accuracy regarding the appropriate age for cancer screening still leaves much room for improvement.

In addition, Hong says, although the voice assistants often guided users to reputable sources such as the CDC and the American Cancer Society, they also directed users to non-reputable sources, such as popsugar.com and mensjournal.com. Without greater transparency, it’s impossible to know what drove these less reputable sources to the top of the search algorithm.

Spreading misinformation

Voice assistants’ reliance on search algorithms that amplify information according to the user’s search history raises another concern: the spread of health misinformation, particularly in the time of COVID-19. Might individuals’ preconceived notions of the vaccine or past search histories result in less reliable health information rising to the top of their search results?

To explore that question, Hong and her colleagues distributed a nationwide survey in April 2021 requesting that participants ask their voice assistants two questions: “Should I get a COVID-19 vaccine?” and “Are the COVID-19 vaccines safe?”

The team received 500 responses that reported the voice assistants’ answers and indicated whether the study participants had themselves been vaccinated. Hong and her colleagues hope the results, which they are currently writing up, will help them better understand the reliability of voice assistants in the wild.

Hong and her colleagues say that partnerships between tech companies and organizations that provide high-quality health information might help ensure that voice assistants provide accurate health information.

For example, since 2015, Google has partnered with the Mayo Clinic to improve the reliability of the health information that rises to the top of its search results. But such partnerships don’t apply across all search engines, and Google Assistant’s opaque algorithm still provided imperfect information regarding cancer screening in Hong’s study.

“Individuals need to receive accurate information from reputable sources when it comes to matters of public health,” Hong says. “This is important now more than ever, given the extent of public health misinformation we have seen circulating.”

Source: Katharine Miller for Stanford University
