Zoom disrupts the rhythm of conversation

November 22, 2021
Written By:
Jared Wadley

A man rubs his temples with a Zoom meeting in the background. Image credit: iStock

If you’ve felt exhausted or burned out after Zoom video conferencing for work or social life, you’re not alone.

The frustration and mental drain can be connected, in part, to trying to catch subtle conversational cues over Zoom despite internet lag time, according to a new University of Michigan study.

Conversations have a transition time between speakers averaging about 200 milliseconds. Because this is so fast, the listener has to comprehend the speaker, plan a response and predict when to cut in, all at the same time, said Julie Boland, professor of psychology and linguistics.

Brainwaves, or neural oscillators, may automate part of this by syncing the two speakers on syllable rate, which helps with the timing.

“Oscillators can tolerate a certain amount of deviation (in syllable rate), without desyncing, which is necessary to handle the fuzzy rhythms of speech,” said Boland, the study’s lead author. “However, the variable electronic transmission delays in videoconferencing are probably sufficient to destabilize these oscillators.”

Boland and colleagues find evidence of this destabilization in the longer turn initiation times over Zoom.

“This is one factor that makes Zoom conversations more effortful and tiring than in-person conversations,” she said.

Zoom support pages suggest that transmission lags under 150 milliseconds (less than 1/5 of a second) should lead to a fully satisfactory experience without any noticeable lag. Boland’s study focuses on considerably shorter lags, ranging from about 30 to 70 milliseconds, with more samples at the low end.

Transmission lag, she said, can’t get shorter than about 30 milliseconds, given that the electronic data have to travel a considerable distance (bouncing off a satellite). The variability in lag is related to internet traffic.

“Short lags cause problems because the period of a neural oscillator tracking speech rate would need to be in the range of 100-150 milliseconds,” Boland said.

The human voice already stretches that tolerance for variability, so adding even 30-50 milliseconds of transmission lag would be beyond the capacity of the proposed oscillator. So, people need to use other, less automatic cognitive mechanisms, she said.

Thus, video conferencing—as many have learned during the pandemic—can be less enjoyable and feel more awkward.

Boland said she’s been fascinated by the processing efficiency of conversation for several years. The effect of Zoom calls, which seemed to rob interactions of their rhythm and grace, piqued her interest in better understanding how the brain and speech were affected.

The study’s co-authors, Pedro Fonseca, Ilana Mermelstein and Myles Williamson, are undergraduates at U-M’s College of Literature, Science, and the Arts.

The findings appear in the current issue of the Journal of Experimental Psychology: General.

