Why Google’s rapid growth means faster search


Over the past 10 months, Google has increased the number of locations around the world from which it serves search queries by some 600 percent.

In effect, Google is repurposing existing infrastructure to change the physical way it processes web searches, according to a new study.

From October 2012 to late July 2013, the number of locations serving Google’s search infrastructure increased from a little less than 200 to a little more than 1,400, and the number of ISPs grew from just over 100 to more than 850.
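As a quick check on the "600 percent" figure, the arithmetic can be sketched as below. The inputs are the rounded endpoints quoted above (about 200 and about 1,400 locations, about 100 and about 850 ISPs), so the results are approximate; the ISP percentage is derived here from those rounded numbers and is not a figure stated by the study. The function name is only illustrative.

```python
def percent_increase(before: float, after: float) -> float:
    """Return the growth from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100


# Rounded endpoints taken from the article; the exact counts differ slightly.
locations_growth = percent_increase(200, 1400)
isp_growth = percent_increase(100, 850)

print(f"Locations: about {locations_growth:.0f} percent more")
print(f"ISPs: about {isp_growth:.0f} percent more")
```

A sevenfold rise in the number of locations corresponds to an increase of roughly 600 percent, which matches the figure quoted.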

The change in Google’s search infrastructure. (Credit: USC)

Most of this expansion reflects Google using client networks (such as Time Warner Cable) that it already relied on to host content like YouTube videos, and reusing them to relay, and speed up, user requests and responses for search and ads.

“Google already delivered YouTube videos from within these client networks,” says Matt Calder, a PhD student at the University of Southern California and the study’s lead author. “But they’ve abruptly expanded the way they use the networks, turning their content-hosting infrastructure into a search infrastructure as well.”

Google goes regional


Previously, if you submitted a search request to Google, your request would go directly to a Google data center.

Now, your search request will first go to the regional network, which relays it to the Google data center. While this might seem like it would make the search take longer by adding in another step, the process actually speeds up searches.

Data connections typically need to “warm up” to get to their top speed—the continuous connection between the client network and the Google data center eliminates some of that warming up lag time.

In addition, content is split up into tiny packets to be sent over the Internet, and some of the delay that you may experience is due to the occasional loss of some of those packets. With the client network acting as a middleman, lost packets can be spotted and replaced much more quickly.
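The "warm-up" effect can be illustrated with a toy model. This is a minimal sketch and not the study's method. It assumes a connection starts by sending 10 packets per round trip and doubles that each round, which is a common simplification of TCP slow start. It also assumes no packet loss, no server processing time, and an already-warm link between the regional network and the data center that carries the whole response in one round trip. The function names and the 20 ms, 80 ms, and 30-packet figures are hypothetical, chosen only for illustration.

```python
def slow_start_rounds(segments: int, initial_window: int = 10) -> int:
    """Round trips a cold connection needs to deliver `segments` packets,
    assuming the sending window doubles after every round trip."""
    rounds, sent, window = 0, 0, initial_window
    while sent < segments:
        sent += window
        window *= 2
        rounds += 1
    return rounds


def direct_latency_ms(segments: int, rtt_ms: float, initial_window: int = 10) -> float:
    """Client talks straight to the data center: one round trip to set up
    the connection, then the warm-up rounds, all over the long path."""
    return (1 + slow_start_rounds(segments, initial_window)) * rtt_ms


def relayed_latency_ms(segments: int, client_rtt_ms: float, backend_rtt_ms: float,
                       initial_window: int = 10) -> float:
    """Client talks to a nearby relay: setup and warm-up rounds happen over
    the short path, plus one round trip on the warm relay-to-data-center link."""
    rounds = slow_start_rounds(segments, initial_window)
    return client_rtt_ms + backend_rtt_ms + rounds * client_rtt_ms


# Hypothetical numbers: 20 ms to the regional network, 80 ms from there
# to the data center, and a 30-packet response.
direct = direct_latency_ms(30, rtt_ms=100)
relayed = relayed_latency_ms(30, client_rtt_ms=20, backend_rtt_ms=80)
print(f"direct: {direct} ms, relayed: {relayed} ms")
```

Under these assumptions the relayed path comes out well ahead, because every warm-up round trip is paid over the short hop rather than the long one. The same reasoning suggests why lost packets are replaced sooner: a retransmission costs roughly one round trip on the link where the loss occurred, and the link to the nearby relay is the shorter one.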

A technical report on the study will be presented at the SIGCOMM Internet Measurement Conference in Spain on October 24.

Good for web users and Google

Calder and colleagues developed a new method of tracking down and mapping servers, both identifying when they are in the same data center and estimating where that data center is.

The method also identifies the relationships between servers and clients, and the team just happened to be using it when Google made its move.

“Delayed web responses lead to decreased user engagement, fewer searches, and lost revenue,” says Ethan Katz-Bassett, assistant professor at USC Viterbi. “Google’s rapid expansion tackles major causes of slow transfers head-on.”

The strategy seems to have benefits for web users, ISPs, and Google, according to the team. Users have a better web browsing experience, ISPs lower their operational costs by keeping more traffic local, and Google is able to deliver its content to users more quickly.

Xun Fan, graduate student at USC Viterbi, noted that the team had not originally set out to document this growth.

“We had developed techniques to locate the servers, without requiring access to the users they serve, and it just so happened we exposed this rapid expansion,” Fan says.

Next, the team will attempt to quantify the performance gains from this strategy and will try to identify under-served regions.

The National Science Foundation, the US Department of Homeland Security Science and Technology Directorate, and the Air Force Research Laboratory funded the work.

Source: USC