A new method uses the camera on a person’s smartphone or computer to take their pulse and respiration signal from a real-time video of their face.
Telehealth has become a critical way for doctors to still provide health care while minimizing in-person contact during COVID-19.
But with phone or Zoom appointments, it’s harder for doctors to get important vital signs from a patient, such as their pulse or respiration rate, in real time.
With their new method, researchers are proposing a better system to measure these physiological signals. This system is less likely to be tripped up by different cameras, lighting conditions, or facial features, such as skin color.
“Machine learning is pretty good at classifying images. If you give it a series of photos of cats and then tell it to find cats in other images, it can do it. But for machine learning to be helpful in remote health sensing, we need a system that can identify the region of interest in a video that holds the strongest source of physiological information—pulse, for example—and then measure that over time,” says lead author Xin Liu, a doctoral student in the Paul G. Allen School of Computer Science & Engineering at the University of Washington.
“Every person is different,” Liu says. “So this system needs to be able to quickly adapt to each person’s unique physiological signature, and separate this from other variations, such as what they look like and what environment they are in.”
Try the researchers’ demo version that can detect a user’s heartbeat over time, which doctors can use to calculate heart rate.
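To illustrate how a series of detected heartbeats could be turned into a heart rate, here is a minimal sketch. It assumes the beats are already available as timestamps in seconds; the function name `heart_rate_bpm` and the sample values are hypothetical and are not taken from the researchers' demo.

```python
def heart_rate_bpm(beat_times_s):
    """Estimate heart rate (beats per minute) from beat timestamps in seconds."""
    if len(beat_times_s) < 2:
        raise ValueError("need at least two beats to measure an interval")
    # Time between consecutive beats (inter-beat intervals).
    intervals = [later - earlier
                 for earlier, later in zip(beat_times_s, beat_times_s[1:])]
    mean_interval = sum(intervals) / len(intervals)
    # 60 seconds per minute divided by the average seconds per beat.
    return 60.0 / mean_interval


# Hypothetical beats spaced 0.8 seconds apart, which is about 75 beats per minute.
example_beats = [0.0, 0.8, 1.6, 2.4, 3.2]
example_bpm = heart_rate_bpm(example_beats)
print(round(example_bpm, 1))
```

Averaging over several intervals smooths out small timing errors in any single detected beat.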
The team’s system is privacy preserving—it runs on the device instead of in the cloud—and uses machine learning to capture subtle changes in how light reflects off a person’s face, which is correlated with changing blood flow. Then it converts these changes into both pulse and respiration rate.
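The last step, turning a slowly varying brightness signal into a rate, can be sketched with a classical frequency analysis. This is not the researchers' machine learning model. It is a simple baseline that assumes a per-frame brightness trace has already been extracted from the face, and it uses a synthetic trace in place of real video. The function name, the frequency bands, and the sample values are illustrative assumptions.

```python
import numpy as np


def dominant_rate_per_minute(signal, fps, low_hz, high_hz):
    """Return the strongest periodic rate (cycles per minute) within a band."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()               # remove the constant brightness level
    x = x * np.hanning(len(x))     # taper the ends to reduce spectral leakage
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    band = (freqs >= low_hz) & (freqs <= high_hz)
    if not band.any():
        raise ValueError("signal too short for the requested band")
    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz


# Synthetic stand-in for a face brightness trace: 20 seconds at 30 frames/second,
# with an assumed 1.2 Hz pulse component and an assumed 0.25 Hz breathing component.
fps = 30.0
t = np.arange(600) / fps
rng = np.random.default_rng(0)
trace = (0.5 * np.sin(2 * np.pi * 1.2 * t)
         + 1.0 * np.sin(2 * np.pi * 0.25 * t)
         + 0.05 * rng.standard_normal(t.size))

# Assumed search bands: 0.7-4.0 Hz for pulse, 0.1-0.5 Hz for respiration.
pulse_bpm = dominant_rate_per_minute(trace, fps, 0.7, 4.0)
breaths_per_min = dominant_rate_per_minute(trace, fps, 0.1, 0.5)
print(pulse_bpm, breaths_per_min)
```

Searching separate frequency bands is what lets one signal yield both vital signs, because breathing is much slower than the heartbeat. A learned model like the one described here aims to stay accurate under motion and changing lighting, conditions in which a fixed baseline of this kind tends to break down.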
The first version of this system was trained with a dataset that contained both videos of people’s faces and “ground truth” information: each person’s pulse and respiration rate measured by standard instruments in the field. The system then used spatial and temporal information from the videos to calculate both vital signs. It outperformed similar machine learning systems on videos where subjects were moving and talking.
But while the system worked well on some datasets, it still struggled with others that contained different people, backgrounds, and lighting. This is a common problem known as “overfitting,” the researchers say.
They improved the system by having it produce a personalized machine learning model for each individual. Specifically, the personalized model helps find the areas of a video frame most likely to contain physiological features correlated with changing blood flow in a face across different contexts, such as different skin tones, lighting conditions, and environments. The system can then focus on those areas and measure the pulse and respiration rate.
While this new system outperforms its predecessor when given more challenging datasets, especially for people with darker skin tones, there’s still more work to do, the team says.
“We acknowledge that there is still a trend toward inferior performance when the subject’s skin type is darker,” Liu says. “This is in part because light reflects differently off of darker skin, resulting in a weaker signal for the camera to pick up. Our team is actively developing new methods to solve this limitation.”
The researchers are also working on a variety of collaborations with doctors to see how this system performs in the clinic.
“Any ability to sense pulse or respiration rate remotely provides new opportunities for remote patient care and telemedicine. This could include self-care, follow-up care, or triage, especially when someone doesn’t have convenient access to a clinic,” says senior author Shwetak Patel, a professor in both the Allen School and the electrical and computer engineering department.
“It’s exciting to see academic communities working on new algorithmic approaches to address this with devices that people have in their homes.”
This software is open-source and available on GitHub.
The researchers presented the first version of their system in December at the Neural Information Processing Systems conference. They will present the new findings April 8 at the ACM Conference on Health, Inference, and Learning.
Additional coauthors are from the University of Washington and Microsoft Research.
Funding for the work came from the Bill & Melinda Gates Foundation, Google, and the University of Washington.
Source: University of Washington