Super tiny lag adds to the awkwardness of Zoom

Zoom support pages suggest that transmission lags under 150 milliseconds (less than one-fifth of a second) should lead to a fully satisfactory experience without any noticeable lag. New research suggests that's not quite true. (Credit: Getty Images)

Conversations on Zoom can be exhausting. New research shows that trying to catch subtle cues despite internet lag time may be why.

In conversation, the gap between one speaker finishing and the next starting averages about 200 milliseconds. Because that gap is so short, the listener has to comprehend the speaker, plan a response, and predict when to cut in, all at the same time, says Julie Boland, professor of psychology and linguistics at the University of Michigan.

Brainwaves, or neural oscillators, may automate part of this by syncing the two speakers on syllable rate, which helps with the timing.

“Oscillators can tolerate a certain amount of deviation (in syllable rate), without desyncing, which is necessary to handle the fuzzy rhythms of speech,” says Boland, the study’s lead author. “However, the variable electronic transmission delays in videoconferencing are probably sufficient to destabilize these oscillators.”

Boland and colleagues find evidence of this destabilization in the longer turn initiation times over Zoom.

“This is one factor that makes Zoom conversations more effortful and tiring than in-person conversations,” she says.

Zoom support pages suggest that transmission lags under 150 milliseconds (less than one-fifth of a second) should lead to a fully satisfactory experience without any noticeable lag. Boland’s study focuses on considerably shorter lags, well under this level, ranging from about 30 to 70 milliseconds, with more samples at the low end.

Transmission lag, she says, can’t get shorter than about 30 milliseconds, given that the electronic data have to travel a considerable distance (bouncing off a satellite). The variability in lag is related to internet traffic.

“Short lags cause problems because the period of a neural oscillator tracking speech rate would need to be in the range of 100-150 milliseconds,” Boland says.

The human voice already stretches that tolerance for variability, so adding even 30-50 milliseconds of transmission lag would be beyond the capacity of the proposed oscillator. So, people need to use other, less automatic cognitive mechanisms, she says.
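To put those figures side by side, here is a rough back-of-the-envelope sketch. It uses only the ranges quoted above, and expressing the lag as a share of one oscillator period is a simplification made for illustration, not a calculation taken from the study. The names in the code are hypothetical.

```python
# Illustrative arithmetic only; the numbers are the ranges quoted in the article.
# Treating lag as a simple fraction of one period is an assumption, not the study's model.
OSCILLATOR_PERIOD_MS = (100, 150)  # period range Boland cites for tracking speech rate
LAG_MS = (30, 50)                  # added transmission lag discussed above


def lag_fraction(lag_ms: float, period_ms: float) -> float:
    """Return the lag as a fraction of one oscillator period."""
    return lag_ms / period_ms


# Smallest lag against the longest period, and largest lag against the shortest.
best_case = lag_fraction(min(LAG_MS), max(OSCILLATOR_PERIOD_MS))   # 30 / 150
worst_case = lag_fraction(max(LAG_MS), min(OSCILLATOR_PERIOD_MS))  # 50 / 100

print(f"lag spans {best_case:.0%} to {worst_case:.0%} of one period")
```

On these numbers, even the shortest lag takes up about a fifth of a single cycle, which may help explain why delays far below Zoom's 150-millisecond guideline could still matter.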

Thus, video conferencing—as many have learned during the pandemic—can be less enjoyable and feel more awkward.

Boland says she’s been fascinated by the processing efficiency of conversation for several years. The experience of Zoom calls, which seemed to rob interactions of their rhythm and grace, piqued her interest in better understanding the effect on the brain and speech.

The findings appear in the Journal of Experimental Psychology: General.

Source: University of Michigan