NORTHWESTERN (US) - A universal method can be used to accurately analyze a range of complex networks, be they social networks like Facebook, protein-protein interactions, or air transportation networks.
Roger Guimera and Marta Sales-Pardo, a husband-wife research team at Northwestern University, applied a mathematical and computational framework to five different networks and found that a single method could reliably analyze all of them.
The details of their algorithm, which can predict missing and spurious interactions in a system, appear in the Dec. 7 Early Edition of the Proceedings of the National Academy of Sciences.
“The way the flu spreads, for example, is based on an underlying network, and it’s important to understand the critical patterns,” says Guimera, a research assistant professor of chemical and biological engineering in the McCormick School of Engineering and Applied Science.
“Using available data, our method tries to find the best description of the network being analyzed, no matter what kind of network.”
In the study, Guimera and Sales-Pardo tested their method on a range of five known “true” networks: a karate club, a social network of dolphins, the neural network of the worm C. elegans, the air transportation network in Eastern Europe, and the metabolic network of E. coli.
These networks have between 34 nodes (members of a karate club) and 604 nodes (metabolites in a metabolic network).
“Our method separates wheat from chaff, the signal from the noise,” explains Sales-Pardo, also a research assistant professor of chemical and biological engineering.
“There are many ways to map nodes in a network, not just one. We consider all the possible ways. By taking the sum of them all, we can identify both missing and spurious connections.”
The central idea behind Guimera and Sales-Pardo’s method is that, even though each network has unique characteristics (depending on its functional needs and evolutionary history), all networks share a remarkable property: their nodes can be classified into groups, and the way nodes connect to one another depends on their group membership.
In a social network, for example, people can be grouped by age, occupation, political orientation and so on. The method proceeds by averaging over all possible groupings of the nodes, giving each grouping a weight that reflects its explanatory power.
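The weighted averaging over groupings can be illustrated with a minimal sketch. It assumes a simple block model in which the chance of a link depends only on the groups of the two nodes, and it scores each grouping by how well it explains the observed links. The weighting formula and the names set_partitions and link_reliabilities are illustrative assumptions rather than details given in this article, and brute-force enumeration of every grouping is feasible only for a handful of nodes.

```python
from itertools import combinations
from math import comb, exp, log


def set_partitions(n):
    """Yield every grouping of nodes 0..n-1 as a tuple of group labels."""
    def extend(labels, n_groups):
        if len(labels) == n:
            yield tuple(labels)
            return
        # A node may join any existing group or open one new group.
        for g in range(n_groups + 1):
            yield from extend(labels + [g], max(n_groups, g + 1))
    yield from extend([], 0)


def link_reliabilities(n, edges):
    """Average a per-grouping link estimate over all groupings (assumed weights)."""
    observed = {tuple(sorted(e)) for e in edges}
    pairs = list(combinations(range(n), 2))
    totals = dict.fromkeys(pairs, 0.0)
    norm = 0.0
    for labels in set_partitions(n):
        # Map each node pair to the (unordered) pair of groups it falls in.
        block = {p: tuple(sorted((labels[p[0]], labels[p[1]]))) for p in pairs}
        slots, links = {}, {}  # possible and observed links per group pair
        for p in pairs:
            b = block[p]
            slots[b] = slots.get(b, 0) + 1
            links[b] = links.get(b, 0) + (p in observed)
        # Assumed "explanatory power": groupings whose group pairs are
        # nearly all-linked or nearly all-unlinked get a larger weight.
        cost = sum(log(r + 1) + log(comb(r, links[b])) for b, r in slots.items())
        weight = exp(-cost)
        norm += weight
        for p in pairs:
            b = block[p]
            totals[p] += weight * (links[b] + 1) / (slots[b] + 2)
    return {p: total / norm for p, total in totals.items()}


# Hypothetical five-node network: a triangle with a two-link tail.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
ranked = sorted(link_reliabilities(5, edges).items(), key=lambda kv: -kv[1])
for pair, score in ranked:
    print(pair, round(score, 3))
```

In this sketch, an unobserved pair with a high score is a candidate missing link, and an observed pair with a low score is a candidate spurious one. For networks with hundreds of nodes the groupings would have to be sampled rather than enumerated.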
For each of the five true networks, the researchers introduced errors and applied their algorithm to the distorted network.
Each time, the algorithm produced a new network that reliably separated interactions likely to be spurious from those likely to be correct, without the aid of any additional information (such as the type of network or the number of errors).
Each new network reconstruction was closer to the original true network than the network containing errors and omissions.
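A test of this kind can be sketched in a few lines. The example below is an illustration under stated assumptions, not the researchers' protocol: the function names distort and ranking_accuracy, the random choice of which links to remove or add, and the pairwise ranking measure are all hypothetical choices made for this sketch.

```python
import random
from itertools import combinations


def distort(n, true_edges, n_remove, n_add, seed=0):
    """Return an error-laden copy of a network on nodes 0..n-1.

    n_remove true links are deleted (they become "missing") and
    n_add links that never existed are inserted ("spurious").
    """
    rng = random.Random(seed)
    true_set = {tuple(sorted(e)) for e in true_edges}
    non_edges = [p for p in combinations(range(n), 2) if p not in true_set]
    removed = set(rng.sample(sorted(true_set), n_remove))
    added = set(rng.sample(non_edges, n_add))
    observed = (true_set - removed) | added
    return observed, removed, added


def ranking_accuracy(scores, should_be_high, should_be_low):
    """Fraction of (high, low) pairs ranked in the right order; ties count half."""
    wins = 0.0
    for a in should_be_high:
        for b in should_be_low:
            if scores[a] > scores[b]:
                wins += 1.0
            elif scores[a] == scores[b]:
                wins += 0.5
    return wins / (len(should_be_high) * len(should_be_low))


# Hypothetical six-node "true" network: two triangles joined by one link.
true_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
observed, removed, added = distort(6, true_edges, n_remove=2, n_add=1)
print("observed:", sorted(observed))
print("missing:", sorted(removed), "spurious:", sorted(added))
```

Given any function that assigns a score to each pair of nodes in the distorted network, ranking_accuracy could compare the scores of the removed links against those of pairs that were never linked, or the scores of genuine links against those of the added ones. A value near 1 would mean the errors are being separated from the rest; 0.5 is what random guessing would give.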
A more accurate method of network analysis could help Facebook, for example, identify truly relevant connections. With 350 million Facebook users, mistakes can add up quickly.
Systems biology could benefit, too. The project to obtain a complete map of the millions of human protein-protein interactions has a projected cost of $1 billion but relies on techniques whose accuracies were estimated (in 2002) to be below 20 percent.
“The flexibility of our approach, along with its generality and its performance, will make it applicable to many areas where network data reliability is a source of concern,” the authors write.
The National Science Foundation and the National Institutes of Health supported the research.
Northwestern University news: www.northwestern.edu/newscenter/index.html