Researchers have created a new method for keeping private the data that our many devices collect about how we use them.
The people who design hardware and software for smartphones, internet browsers, high-tech cars, and many other internet-enabled devices need to know how people use their products in order to make them better. But when faced with the request to send information about a computer error back to the developers, many of us are inclined to say “No,” just in case that information is too personal.
So researchers have developed a new system for aggregating these kinds of usage reports that emphasizes maintaining personal privacy.
“We have an increasing number of devices—in our lightbulbs, in our cars, in our toasters—that are collecting personal data and sending it back to the device’s manufacturer. More of these devices means more sensitive data floating around, so the problem of privacy becomes more important,” says Henry Corrigan-Gibbs, a graduate student in computer science at Stanford University who codeveloped the system. “This type of system is a way to collect aggregate usage statistics without collecting individual user data in the clear.”
The secret ingredient? Secret sharing
The system, called Prio, works by breaking up and obscuring individual information through a technique known as “secret sharing” and only allowing for the collection of aggregate reports. As a result, an individual’s information is never reported in any decipherable form.
Mozilla is currently testing Prio in a version of Firefox called Nightly, which includes other features Mozilla is still testing. On Nightly, Prio ran in parallel with the current remote data collection (telemetry) system for six weeks, gathering over three million data values. There was one glitch, but once it was fixed, Prio’s results exactly matched the results from the current system.
“This is a rare example of a new privacy technology that is getting deployed in the real world,” says Prio codeveloper Dan Boneh, a professor of computer science and of electrical engineering. “It is really exciting to see this put to use.”
Keep ’em separated
Secret sharing is a method for maintaining the security of data that involves breaking up a piece of information into specially formulated parts. That way, if someone gets hold of only one part, they learn nothing about the original piece of information.
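As a rough illustration of the idea, the sketch below splits a number into two additive shares. It is a minimal example under stated assumptions, not Prio's actual code: the modulus and the function names `split` and `reconstruct` are hypothetical choices made for this illustration.

```python
import secrets

# Assumption for this sketch: shares are integers modulo a fixed public number.
MODULUS = 2**61 - 1


def split(value, modulus=MODULUS):
    """Split value into two shares that add up to value modulo modulus."""
    # The first share is uniformly random and independent of the value.
    share_a = secrets.randbelow(modulus)
    # The second share is whatever is needed to make the sum come out right.
    share_b = (value - share_a) % modulus
    return share_a, share_b


def reconstruct(share_a, share_b, modulus=MODULUS):
    """Recombine both shares; either share alone is just a random number."""
    return (share_a + share_b) % modulus


share_a, share_b = split(1)  # 1 = "this user changed the homepage"
recovered = reconstruct(share_a, share_b)
```

Because the first share is drawn uniformly at random, and the second is that random number shifted by the value, each share on its own looks like noise. Only someone holding both can recover the original value.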
Prio uses secret sharing to break individual data points—such as whether you chose to change your browser homepage from the default setting—into secret shares and then sends those to two different servers. Even if an attacker is able to take over one of the two servers, the attacker still cannot recover any individual’s data point.
To produce the aggregate value of interest, the servers each sum up their shares and then exchange these sums. By combining the sums, the servers can learn the final aggregate statistic (for example, what percentage of people changed their browser homepage from the default) without leaking any other information about the individual pieces of information involved.
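The aggregation step can be sketched in a few lines. This is a simplified simulation that assumes additive shares modulo a public number, with both "servers" running in one process. The names `share_bit` and `aggregate` and the sample data are hypothetical and are not taken from Prio or Firefox.

```python
import secrets

# Assumed public modulus, far larger than any realistic number of users.
MODULUS = 2**61 - 1


def share_bit(bit):
    """Split a 0/1 answer into one share for each of the two servers."""
    share_a = secrets.randbelow(MODULUS)
    return share_a, (bit - share_a) % MODULUS


def aggregate(bits):
    """Simulate both servers: each sums only the shares it received."""
    server_a_sum = 0
    server_b_sum = 0
    for bit in bits:
        share_a, share_b = share_bit(bit)
        server_a_sum = (server_a_sum + share_a) % MODULUS
        server_b_sum = (server_b_sum + share_b) % MODULUS
    # The servers exchange only their running sums, never individual shares.
    return (server_a_sum + server_b_sum) % MODULUS


# Made-up data: 1 means the user changed the default homepage.
changed_homepage = [1, 0, 0, 1, 1, 0, 1, 0]
count = aggregate(changed_homepage)
percent = 100 * count / len(changed_homepage)
```

In this simulation the random parts of the shares cancel out when the two sums are combined, leaving only the total count. Neither server's own sum says anything about a particular user's answer.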
Prio can handle large amounts of data and, so long as the two servers never collude, the system reveals nothing other than aggregate statistics. The system can further enhance privacy by slightly perturbing the final result.
The researchers also developed a method whereby the device sending the data proves to the servers that a set of secret shares is well formed without revealing any information about the data that the shares encode. Without a proof of this sort, a single faulty or malicious participant could send a garbled set of shares to the servers, which would completely corrupt the final reports.
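To see why such a proof matters, the sketch below shows what happens when shares are accepted without any check. It assumes simple additive shares modulo a public number and does not implement Prio's proof system. The names `share` and `tally` and all values are hypothetical.

```python
import secrets

# Assumed public modulus for this illustration.
MODULUS = 2**61 - 1


def share(value):
    """Split any value into two additive shares, valid or not."""
    share_a = secrets.randbelow(MODULUS)
    return share_a, (value - share_a) % MODULUS


def tally(values):
    """Sum the shares on each simulated server, then combine the sums."""
    shares = [share(v) for v in values]
    sum_a = sum(a for a, _ in shares) % MODULUS
    sum_b = sum(b for _, b in shares) % MODULUS
    return (sum_a + sum_b) % MODULUS


honest_answers = [1, 0, 1, 0, 0]  # each honest user submits 0 or 1
honest_total = tally(honest_answers)

# One malicious user submits shares that encode 1,000,000 instead of 0 or 1.
# Each of those shares looks just as random as an honest one, so neither
# server can spot the problem by inspecting its own share.
corrupted_total = tally(honest_answers + [1_000_000])
```

Here five honest users produce a total of 2, while a single bogus submission pushes the total to 1,000,002. A well-formedness proof lets the servers reject such a submission before it is added, without seeing the value it encodes.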
100,000 Prio users
Currently, Mozilla is testing Prio using nonsensitive data it already collects and is running both servers itself. To fulfill the privacy-preserving potential of Prio, Mozilla would have to find a trustworthy third party to run the second server. Mozilla is also continuing its tests of Prio and will provide updates on its progress via its blog.
For their part, the researchers are excited about the potential of Prio for many different kinds of devices and data sharing. They also appreciate seeing their work in action.
“To me, this is the best example of why research is exciting. You get to study these things and you get to launch them into the real world and see them have impact,” says Corrigan-Gibbs. “This began as a fascinating theoretical problem about proof systems and zero knowledge. And then 18 months later, there are 100,000 people using it.”
The researchers presented a paper about Prio at the 14th USENIX Symposium on Networked Systems Design and Implementation.
Source: Stanford University