Cameras are all around us—on store ceilings, public transportation, and even car dashboards.
The recordings can be a powerful surveillance tool on the roads and in buildings, but it’s surprisingly hard to sift through vast amounts of visual data to find relevant information. Specifically, it’s been difficult to quickly identify and understand a person’s actions and behaviors as recorded sequentially by cameras in a variety of locations.
Now, electrical engineers have developed a way to automatically track people across moving and still cameras by using an algorithm that trains the networked cameras to learn one another’s differences. The cameras first identify a person in a video frame, then follow that same person across multiple camera views.
The cameras communicate
“Tracking humans automatically across cameras in a three-dimensional space is new,” says lead researcher Jenq-Neng Hwang, a professor of electrical engineering at the University of Washington. “As the cameras talk to each other, we are able to describe the real world in a more dynamic sense.”
Imagine a typical GPS display that maps the streets, buildings, and signs in a neighborhood as your car moves forward, then add humans to the picture. With the new technology, a car with a mounted camera could take video of the scene, then identify and track humans and overlay them into the virtual 3D map on your GPS screen.
The researchers are developing the system to work in real time, which could help pick out people crossing busy intersections or track a specific person who is evading police.
“Our idea is to enable the dynamic visualization of the realistic situation of humans walking on the road and sidewalks, so eventually people can see the animated version of the real-time dynamics of city streets on a platform like Google Earth,” Hwang says.
Over the past decade, Hwang’s research team has developed a way for video cameras, from the most basic models to high-end devices, to talk to each other as they record different parts of a common area.
Color, texture, angle
The problem with tracking a person across cameras with non-overlapping fields of view is that the person’s appearance can vary dramatically from one video to the next, because each camera produces different perspectives, angles, and color hues.
The researchers overcame this by building a link between the cameras. The cameras first record for a few minutes to gather training data: for each pair of cameras, the system systematically calculates the differences in color, texture, and angle for the people who walk into the frames. The process is fully unsupervised, requiring no human intervention.
After this calibration period, an algorithm automatically compensates for those differences and can pick out the same people across multiple camera views, effectively tracking them without needing to see their faces.
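The general idea behind this kind of cross-camera calibration can be illustrated with a simple color-transfer sketch. The snippet below is only an illustration, not the published system: it learns a brightness-transfer mapping between two cameras by matching cumulative histograms of training pixels, then applies that mapping to shrink the appearance gap before comparing color histograms. The function names and the synthetic camera data are hypothetical.

```python
import numpy as np

def learn_transfer(src, dst, bins=64):
    """Learn a monotone intensity mapping from camera A to camera B by
    matching cumulative histograms (a simple brightness-transfer function;
    the actual system's appearance model may differ)."""
    s_hist, _ = np.histogram(src, bins=bins, range=(0.0, 256.0))
    d_hist, _ = np.histogram(dst, bins=bins, range=(0.0, 256.0))
    s_cdf = np.cumsum(s_hist) / s_hist.sum()
    d_cdf = np.cumsum(d_hist) / d_hist.sum()
    # For each source bin, pick the destination bin whose CDF first
    # reaches the same quantile.
    bin_map = np.searchsorted(d_cdf, s_cdf, side="left").clip(0, bins - 1)
    centers = (np.arange(bins) + 0.5) * (256.0 / bins)

    def transfer(vals):
        idx = np.clip((np.asarray(vals) / (256.0 / bins)).astype(int), 0, bins - 1)
        return centers[bin_map[idx]]

    return transfer

def hist_distance(a, b, bins=64):
    """Total-variation distance between two normalized intensity histograms."""
    ha, _ = np.histogram(a, bins=bins, range=(0.0, 256.0))
    hb, _ = np.histogram(b, bins=bins, range=(0.0, 256.0))
    return 0.5 * np.abs(ha / ha.sum() - hb / hb.sum()).sum()

rng = np.random.default_rng(0)
# Synthetic "training" pixels: camera B renders everything ~40 units brighter.
cam_a_train = rng.normal(100, 20, 5000).clip(0, 255)
cam_b_train = cam_a_train + 40

transfer = learn_transfer(cam_a_train, cam_b_train)

# The same person observed in each camera after calibration.
person_a = rng.normal(120, 15, 2000).clip(0, 255)
person_b = (person_a + 40).clip(0, 255)

raw = hist_distance(person_a, person_b)
calibrated = hist_distance(transfer(person_a), person_b)
print(raw > calibrated)  # calibration should shrink the appearance gap
```

In a real deployment the matching would use richer features (texture descriptors, body-part layout) and would be learned per camera pair during the unsupervised training window, but the mechanism is the same: model the systematic differences once, then apply them to re-identify people.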
The research team has tested the ability of static and moving cameras to detect and track pedestrians on the University of Washington campus in multiple scenarios. In one experiment, graduate students mounted cameras in their cars to gather data, then applied the algorithms to successfully pick out humans and follow them in a 3D space.
They also installed the tracking system on cameras mounted on a robot and a flying drone, allowing both to follow a person even when obstacles temporarily blocked the person from view.
The linking technology can be used anywhere, as long as the cameras can talk over a wireless network and upload data to the cloud.
This detailed visual record could be useful for security and surveillance, whether monitoring for unusual behavior or tracking a moving suspect. It could also give store owners and business proprietors useful statistics about how customers move through their premises.
For example, a store owner could use a tracking system to watch a shopper’s movements in the store, taking note of his or her interests. Then, a coupon or deal for a particular product could be displayed on a nearby screen or pushed to the shopper’s phone—in an instant.
Leveraging the visual data produced by our physical actions and movements might, in fact, become the next way in which we receive marketing, advertisements, and even helpful tools for our everyday lives.
Inevitably, people will have privacy concerns, Hwang says, and the information extracted from cameras could be encrypted before it’s sent to the cloud.
“Cameras and recording won’t go away. We might as well take advantage of that fact and extract more useful information for the benefit of the community,” he adds.
Hwang and his research team presented their results last month in Qingdao, China, at the Intelligent Transportation Systems Conference sponsored by the Institute of Electrical and Electronics Engineers, or IEEE.
Coauthors are Kuan-Hui Lee, a doctoral student in electrical engineering, and Greg Okopal and James Pitton, engineers at the Applied Physics Laboratory.
The Electronics and Telecommunications Research Institute of Korea and the Applied Physics Laboratory funded the work.
Source: University of Washington