CORNELL (US)—Computer scientists have mapped the flow of news in the mainstream media and in the blogosphere. Their findings show a consistent rhythm as stories rise into prominence and fall off after a few days, with a “heartbeat” pattern of handoffs.
Jon Kleinberg, the Tisch University Professor of Computer Science at Cornell University, along with postdoctoral researcher Jure Leskovec and graduate student Lars Backstrom, used the Internet to track and analyze the news cycle over the three months leading up to the 2008 presidential election. The team looked at 1.6 million online news sites, including 20,000 mainstream media sites and blogs, for a total of 90 million articles.
“The movement of news to the Internet makes it possible to quantify something that was otherwise very hard to measure—the temporal dynamics of the news,” says Kleinberg.
In mainstream media, they found, a story rises to prominence slowly then dies quickly; in the blogosphere, stories rise in popularity very quickly but then stay around longer, as discussion goes back and forth. Eventually, almost every story is pushed aside by something newer.
“We want to understand the full news ecosystem, and online news is now an accurate enough reflection of the full ecosystem to make this possible. This is one [very early] step toward creating tools that would help people understand the news, where it’s coming from and how it’s arising from the confluence of many sources.”
Because quotes remain fairly consistent, even when an overall story may be presented in different ways by different writers, the researchers developed an algorithm that tracks quotations in news stories, identifying and grouping similar words and phrases into “phrase clusters.”
They then tracked the volume of posts in each phrase cluster over time and found major peaks in several areas.
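As a rough illustration of the idea, and not the researchers' actual algorithm, the grouping and counting steps can be sketched in a few lines of Python. The sketch assumes a simple word-overlap similarity from the standard library's difflib, a greedy assignment rule, and a 0.6 threshold; the sample quotes and day numbers are invented for the example.

```python
from collections import Counter
from difflib import SequenceMatcher


def similar(a, b, threshold=0.6):
    """Return True when two quotes share most of their words in order.

    The 0.6 threshold is an assumption chosen for this example.
    """
    words_a, words_b = a.lower().split(), b.lower().split()
    return SequenceMatcher(None, words_a, words_b).ratio() >= threshold


def cluster_quotes(quotes, threshold=0.6):
    """Greedily assign each (day, quote) pair to the first cluster whose
    representative phrase is similar enough; otherwise start a new cluster.

    Returns a list of (representative_phrase, Counter mapping day -> mentions).
    """
    clusters = []
    for day, quote in quotes:
        for representative, volume in clusters:
            if similar(representative, quote, threshold):
                volume[day] += 1
                break
        else:
            clusters.append((quote, Counter({day: 1})))
    return clusters


# Invented sample data: (day number, quoted phrase).
quotes = [
    (0, "you can put lipstick on a pig"),
    (1, "You can put lipstick on a pig but it is still a pig"),
    (1, "put lipstick on a pig"),
    (1, "the fundamentals of our economy are strong"),
]

clusters = cluster_quotes(quotes)
for phrase, volume in clusters:
    print(phrase, dict(volume))
```

Each cluster ends up with a per-day mention count, which is the kind of volume curve the peaks would be read from. A real system at the scale described would need a far more efficient and more careful matching step than this pairwise comparison.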
For example, in data from August and September, the Democratic and Republican conventions, the financial crisis, and discussions of a bailout plan all showed threads rising and falling on a more or less weekly basis. The greatest spike came the week of Sept. 12, with the much-debated “lipstick on a pig” comment.
The researchers also say their work suggests an answer to a longstanding question: Is the “news cycle” just a way to describe our perception of what’s going on in the media, or is it a real phenomenon that can be measured? They opt for the latter, and offer a mathematical explanation of how it works.
In the life of a news story, it seems, imitation is the sincerest form of flattery. As more sites carry a story, other sites become more likely to pick it up. But sooner or later, new stories push out the old.
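One way to picture the interplay of imitation and recency is a toy scoring rule, offered here only as a sketch and not as the researchers' model. It assumes that a site's chance of picking up a story is proportional to the story's current volume multiplied by a recency factor that halves every 24 hours; the function name, the half-life and the sample numbers are all hypothetical.

```python
def pickup_probabilities(threads, now, half_life_hours=24.0):
    """Toy model: weight = (volume + 1) * recency, normalized to sum to 1.

    threads maps a story name to (volume, start_hour).
    The exponential half-life form is an assumption for illustration.
    """
    weights = {}
    for name, (volume, start_hour) in threads.items():
        age = now - start_hour
        recency = 0.5 ** (age / half_life_hours)  # newer stories score higher
        weights[name] = (volume + 1) * recency    # widely carried stories score higher
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Two equally popular stories; one started 48 hours after the other.
threads = {"old story": (100, 0), "new story": (100, 48)}
probabilities = pickup_probabilities(threads, now=72)
print(probabilities)
```

With equal volume, the newer story receives the larger share, and with equal age, the more widely carried story does. Those are the two effects the paragraph above describes.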
The researchers also found that a rise in blogging activity on a particular story closely follows a rise in mainstream activity, with the blog peak arriving an average of 2.5 hours after the mainstream peak. Almost all stories started in the mainstream. Only 3.5 percent of the stories tracked first gained prominence in the blogosphere and then moved to the mainstream.
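The lag measurement itself is simple to express. The sketch below assumes hourly mention counts for each story in the two kinds of outlets and defines the lag as the difference between the hours of peak volume; the counts are invented for illustration and are not the study's data.

```python
from statistics import mean


def peak_hour(volume_by_hour):
    """Index of the hour with the highest mention count (first one on ties)."""
    return max(range(len(volume_by_hour)), key=volume_by_hour.__getitem__)


def peak_lag(mainstream, blogs):
    """Hours by which the blog peak trails the mainstream peak (negative if it leads)."""
    return peak_hour(blogs) - peak_hour(mainstream)


# Invented hourly counts for two stories: (mainstream, blogs).
stories = [
    ([1, 4, 9, 5, 2, 1], [0, 1, 3, 6, 8, 7]),
    ([2, 8, 3, 1, 0, 0], [0, 1, 2, 4, 9, 3]),
]

lags = [peak_lag(mainstream, blogs) for mainstream, blogs in stories]
average_lag = mean(lags)
print(lags, average_lag)
```

A negative lag under this definition would mark a story whose blog peak came first, the minority case the researchers describe.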
The mathematical model needs to be refined, the researchers say, and they suggest further study of how stories move between sites with opposing political orientation.
“It will be useful to further understand the roles different participants play in the process,” the researchers conclude, “as their collective behavior leads directly to the ways in which all of us experience news and its consequences.”
The research was presented at the Association for Computing Machinery Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD) conference, June 28 to July 1 in Paris.
Cornell University news: www.news.cornell.edu