Name the emotion you want drone video to capture

A new model lets a drone shoot a video based on a desired emotion or viewer reaction.

It takes skill to fly a drone smoothly and without crashing. Once someone has mastered flying, there are still camera angles, panning speeds, trajectories, and flight paths to plan.

A team of researchers imagined that with all the sensors and processing power onboard a drone and embedded in its camera, there must be a better way to capture the perfect shot.

With the new model, the drone uses camera angles, speeds, and flight paths to generate a video that could be exciting, calm, enjoyable, or nerve-wracking—depending on what the filmmaker tells it to do.

“Sometimes you just want to tell the drone to make an exciting video,” says Rogerio Bonatti, a PhD candidate in Carnegie Mellon University’s Robotics Institute.

The team presented their paper on the work at the 2021 International Conference on Robotics and Automation this month. The presentation is available on YouTube.

“We are learning how to map semantics, like a word or emotion, to the motion of the camera,” says Bonatti, who worked with researchers at the University of Sao Paulo and Facebook AI Research on the project.

But before “Lights! Camera! Action!” the researchers needed hundreds of videos and thousands of viewers to capture data on what makes a video evoke a certain emotion or feeling. Bonatti and the team collected hundreds of diverse videos. Several thousand viewers then watched 12 pairs of videos and gave them scores based on how the videos made them feel.
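
As a rough illustration of how ratings like these might be turned into training data, the sketch below averages the scores each video received for each emotion. The record format, the clip names, and the numbers are all made up for the example; the article does not describe how the researchers actually stored or aggregated their ratings.

```python
from collections import defaultdict


def mean_scores(ratings):
    """Average the scores each video received for each emotion.

    ratings: iterable of (video_a, video_b, emotion, score_a, score_b),
    a hypothetical record of one viewer scoring one pair of videos.
    """
    totals = defaultdict(lambda: [0.0, 0])  # (video, emotion) -> [sum, count]
    for video_a, video_b, emotion, score_a, score_b in ratings:
        for video, score in ((video_a, score_a), (video_b, score_b)):
            entry = totals[(video, emotion)]
            entry[0] += score
            entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Made-up ratings, for illustration only.
ratings = [
    ("clip1", "clip2", "exciting", 4, 2),
    ("clip1", "clip3", "exciting", 5, 3),
    ("clip2", "clip3", "exciting", 1, 3),
]
scores = mean_scores(ratings)
```

With these invented numbers, clip1 ends up with the highest average "exciting" score and clip2 the lowest.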

The researchers used this data to train a model that directed the drone to mimic the cinematography corresponding to a particular emotion. If fast-moving, tight shots created excitement, the drone would use those elements to make an exciting video when the user requested it. The drone could also create videos that were calm, revealing, interesting, nervous, or enjoyable, and combine emotional characteristics, like interesting and calm, in the same video.
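
The idea of mapping a requested emotion to camera motion can be sketched with a simple lookup. This is not the researchers' model, which was trained on viewer data; it swaps in a nearest-match search over a handful of rated shots. The parameter names (`speed_mps`, `distance_m`), the scores, and the `pick_shot` function are all assumptions made for the example.

```python
import math

# Hypothetical rated shots: camera parameters paired with averaged
# viewer scores between 0 and 1. All values are invented.
RATED_SHOTS = [
    ({"speed_mps": 8.0, "distance_m": 3.0}, {"exciting": 0.9, "calm": 0.1}),
    ({"speed_mps": 1.0, "distance_m": 15.0}, {"exciting": 0.1, "calm": 0.9}),
    ({"speed_mps": 4.0, "distance_m": 8.0}, {"exciting": 0.5, "calm": 0.5}),
]


def pick_shot(target, rated_shots=RATED_SHOTS):
    """Return the parameters of the rated shot whose scores best match target.

    target: dict mapping emotion names to desired levels, e.g. {"calm": 1.0}.
    Only the emotions named in target are compared.
    """
    def distance(scores):
        return math.sqrt(
            sum((scores.get(emotion, 0.0) - level) ** 2
                for emotion, level in target.items())
        )

    params, _ = min(rated_shots, key=lambda shot: distance(shot[1]))
    return params


exciting_shot = pick_shot({"exciting": 1.0})
calm_shot = pick_shot({"calm": 1.0})
mixed_shot = pick_shot({"exciting": 0.5, "calm": 0.5})
```

In this toy data, asking for excitement returns the fast, close shot, asking for calm returns the slow, distant one, and a blended request returns the middle shot. A learned model would generalize beyond the rated examples instead of only picking among them.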

“We were trying to learn something incredibly subjective, and I was surprised that we obtained good quality data,” says Bonatti.

The team tested their model by creating sample videos, like a chase scene or someone dribbling a soccer ball, and asking viewers for feedback on how the videos felt. Bonatti says the videos intended to be exciting or calming actually felt that way to viewers, and that the team was also able to achieve different degrees of those emotions.

The team’s work aims to improve the interface between people and cameras, whether that be helping amateur filmmakers with drone cinematography or providing on-screen directions on a smartphone to capture the perfect shot.

“This opens this door to many other applications, even outside filming or photography,” Bonatti says. “We designed a model that maps emotions to robot behavior.”

Source: Carnegie Mellon University