How Alexa could be better at responding to toddlers

A new study shows that children often respond to voice-activated technologies by treating the device, such as Alexa, Siri, or Google Assistant, as a conversation partner. The study suggests that devices could be more responsive to children by prompting and filling in gaps in communication.

For such popular household technology, that’s a missed opportunity to reach every member of the family, the study finds. Children communicate with technology differently than adults do, and a more responsive device—one that repeats or prompts the user, for example—could be more useful to more people.

“There has to be more than ‘I’m sorry, I didn’t quite get that,’” says coauthor Alexis Hiniker, an assistant professor at the University of Washington Information School. “Voice interfaces now are designed in a cut-and-dried way that needs more nuance. Adults don’t talk to children and assume there will be perfect communication. That’s relevant here.”

The study is part of the proceedings of the 17th Interaction Design and Children Conference, which took place in June in Trondheim, Norway.

40 million homes

Nearly 40 million US homes have a voice-activated assistant like an Amazon Echo or Google Home, and it’s estimated that by 2022, more than half of US households will own one.

While some interfaces have features specifically aimed at younger users, research has shown that these devices generally rely on the clear, precise English of adult users—and specific ones, at that. People for whom English is not their first language, or those who have a regional accent—say, a Southern accent—tend to hit snags with smart speakers, according to a recent Washington Post analysis.

The study shows how children will persist in the face of a communication breakdown, treating a device as a conversation partner and, in effect, showing developers how to design technologies that are more responsive to families.

“They’re being billed as whole-home assistants, providing a centralized, shared, collaborative experience,” Hiniker says. “Developers should be thinking about the whole family as a design target.”

‘Quack!’

In this study, the team recorded 14 children, ages 3 to 5 (and, indirectly, their parents), as they played a Sesame Workshop game, “Cookie Monster’s Challenge,” on a lab-issued tablet. As designed, the game features a cartoon duck waddling across the screen at random intervals; the child is asked to “say ‘quack’ like a duck!” each time he or she sees the duck, and the duck is supposed to quack back.

Only in this study, the duck has lost its quack.

That scenario was something of an accident, Hiniker says. The team, with funding from Sesame Workshop, was originally evaluating how various tablet games affect children’s executive function skills. But after configuring the tablet to record the children’s responses, the researchers learned that their data-collection tool had shut off the device’s ability to “hear” the child.

What the team had instead was more than 100 recordings of children trying to get the duck to quack—in effect, attempting to repair a lapse in conversation—and their parents’ efforts to help. And a study of how children communicate with nonresponsive voice technology was born.

Researchers grouped children’s communication strategies into three categories: repetition, increased volume, and variation. Repetition—in this case, continuing to say “quack,” repeatedly or after pausing—was the most common approach, used 79 percent of the time. Less common among participants was speaking loudly—shouting “quack!” at the duck, for instance—and varying their response, through their pitch, tone, or use of the word. (Like trying an extended “quaaaaaack!” to no avail.)

In all, children kept trying to get the game to work, without any evidence of frustration, more than 75 percent of the time; frustration surfaced in fewer than one-fourth of the recordings. In only six recordings did children ask an adult to help.

Parents were happy to do so, but the team found they were also quick to determine something was wrong and take a break from the game. Adults usually suggested the child try again and took a shot at responding themselves; only once they pronounced the game broken did the child agree to stop trying.

Better responses

The results represented a series of real-life strategies families use when facing a “broken” or uncommunicative device, Hiniker says. The scenarios also provided a window into young children’s early communication processes.

“Adults are good at recognizing what a child wants to say and filling in for the child,” Hiniker says. “A device could also be designed to engage in partial understanding, to help the child go one step further.”

For example, a child might ask a smart speaker to play “Wheels on the Bus,” but if the device doesn’t pick up the full name of the song, it could respond with, “Play what?” or fill in part of the title, prompting the child for the rest.
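
As a rough illustration of that kind of prompting, here is a minimal Python sketch. The function name `repair_prompt`, the song catalog, and the wording of the replies are all hypothetical; the study does not describe an implementation, and a real assistant would work from speech-recognition output rather than clean text.

```python
def repair_prompt(heard: str, catalog: list[str]) -> str:
    """Return a reply that tries to repair a partly understood song request.

    `heard` is the fragment of the title the device picked up (assumed to be
    plain text); `catalog` is a hypothetical list of known song titles.
    """
    heard_words = heard.lower().split()
    if not heard_words:
        # Nothing usable was heard: ask an open question instead of giving up.
        return "Play what?"

    # First look for a title that matches the request exactly.
    for title in catalog:
        if title.lower().split() == heard_words:
            return f"Playing {title}."

    # Otherwise look for a title that begins with the words that were heard,
    # and echo that part back so the child can supply the rest.
    for title in catalog:
        title_words = title.split()
        if [w.lower() for w in title_words[:len(heard_words)]] == heard_words:
            known = " ".join(title_words[:len(heard_words)])
            return f"{known} ... what comes next?"

    # No match at all: fall back to the open question.
    return "Play what?"


songs = ["Wheels on the Bus", "Twinkle Twinkle Little Star"]
print(repair_prompt("wheels on", songs))
print(repair_prompt("wheels on the bus", songs))
print(repair_prompt("", songs))
```

The point of the sketch is only that a partial match leads to a follow-up question, not to a generic error message.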

Such responses would be useful even among adults, Hiniker points out. Person-to-person conversation, at any age, is filled with little mistakes, and finding ways to repair such disfluencies should be the future of voice interfaces.

“AI is getting more sophisticated all the time, so it’s about how to design these technologies in the first place,” Hiniker says. “Instead of focusing on how to get the response completely right, how could we take a step toward a shared understanding?”

Hiniker has launched another study into how diverse, intergenerational families use smart speakers, and what communication needs emerge.

Source: University of Washington