June 5, 2012

How many servers does Google have?



This estimate was made by adding up the total available floor space at all of Google's data centers, combined with knowledge of how the data centers are constructed. I've also checked the numbers against Google's known energy consumption, and against various other snippets of detail revealed by Google themselves.

Satellite imagery: j2p2.com/google-data-center-floor-plans

Google doesn't publicly say how many servers they have. They keep the figure secret for competitive reasons. If Microsoft over-estimates and invests in more servers then they'll waste money - and this would be good for Google. Conversely, if Microsoft builds fewer servers then they won't match Google's processing power, and again, this would be good for Google. Nevertheless, from the limited amount of information that is available I've attempted to make a rough estimate.

First of all, here's some background on how Google's data centers are built and organised. Understanding this is crucial to making a good estimate.

--------------------------------------------------
Number and location of data centers

Google build and operate their own data centers. This wasn't always the case. In the early years they rented colocation space at third-party centers. Since the mid-2000s, however, they have been building their own. Google currently (as of January 2012) has eight operational data centers. There are six in the US and two in Europe. Two more are being built in Asia and one more in Europe. A twelfth is planned in Taiwan but construction hasn't yet received the go-ahead.

Initially the data center locations were kept secret. Google even purchased the land under a false company name. That approach didn't quite work, however. Information always leaked out via the local communities. So now Google openly publishes the info: google.com/about/datacenters/locations

Here are all 12 of Google's self-built data centers, listed by the year they became (or are due to become) operational:

2003 - Douglas County, Georgia, USA (container center 2005)
2006 - The Dalles, Oregon, USA
2008 - Lenoir, North Carolina, USA
2008 - Moncks Corner, South Carolina, USA
2008 - St. Ghislain, Belgium
2009 - Council Bluffs, Iowa, USA
2010 - Hamina, Finland
2011 - Mayes County, Oklahoma, USA

2012 - Profile Park, Dublin, Ireland (operational late 2012)
2013 - Jurong West, Singapore (operational early 2013)
2013 - Kowloon, Hong Kong (operational early 2013)
201? - Changhua Coastal Industrial Park, Taiwan (unconfirmed)

These are so-called “mega data centers” that contain hundreds of thousands of servers. It's possible that Google continues to rent smaller pockets of third-party colocation space, or has servers hidden away at Google offices around the world. There's online evidence, for example, that Google was still seeking colocation space as recently as 2008. Three of the mega data centers came online later that year, however, and that should have brought the total capacity up to requirements. It's reasonable to assume that Google now maintains all its servers exclusively at its own purpose-built centers - for reasons of security and operational efficiency.

--------------------------------------------------
Physical construction of data centers

Although the locations are public knowledge, the data center insides are still fairly secret. The public are not allowed in, there are no tours, and even Google employees have restricted access. Google have, however, revealed the general design principles.

The centers are based around mobile shipping containers. They use standard 40' intermodal containers which are ~12m long and ~2.5m wide. Each container holds 1,160 servers. The containers are lined up in rows inside a warehouse, and are stacked two high.

See the video Google released in 2009: Google container data center tour

Are all of Google's data centers now based on this container design? We don't know for sure, but assume that they are. It would be sensible to have a standardised system.

As for the servers themselves - they use cheap, low-performance, open-case machines. The machines only contain the minimal hardware required to do their job, namely: CPU, DRAM, disk, network adapter, and on-board battery-powered UPS. Exact up-to-date specifications are not known, but in 2009 an average server was thought to be a dual-processor, dual-core machine (i.e. 4 cores) with 16 GB RAM and 2 TB disk.

The containers are rigged to an external power supply and cooling system. Much of the space inside a warehouse is taken up with the cooling pipes and pumps. The cooling towers are generally external structures adjacent to the warehouse.

--------------------------------------------------
Counting servers based on data center floor space

This is by no means a precise method, but it gives us an indication. It works as follows.

First we determine the surface area occupied by each of Google's data center buildings. Sometimes this information is published. For example the data center at The Dalles is reported to be 66,000 m². The problem with this figure, however, is that we don't know whether it covers only the warehouse building itself or the whole plot of land including supporting buildings, car parks, and flower beds.

So, to be sure of getting the exact size of only the buildings, I took satellite images from Google Maps and used those to make measurements. Due to out-of-date imagery some of the data centers are not shown on Google Maps, but those that are missing can be found on Bing Maps instead.

Having retrieved the satellite imagery of the buildings I then superimposed rows of shipping containers drawn to scale. Care was taken to ensure the containers occupied approximately the same proportion of total warehouse surface area as seen in the video linked above. That is, well under 50% of the floor space, probably closer to 20%. An example of this superimposed imagery is attached to this post. It shows one of the warehouses in Douglas County, Georgia, USA.

All floor plan images: j2p2.com/google-data-center-floor-plans

Having counted how many container footprints fit inside each warehouse, I then doubled those figures. This is because I assume all containers are stacked two high. Quite a large assumption, but hopefully a fair one.
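The counting rule described here reduces to simple arithmetic. Below is a minimal sketch of it, not the tooling actually used for the floor plans. The function name is made up for illustration, and the input of 180 footprints is an assumption: it is back-calculated from the Douglas County total given further down, not a count read off the satellite imagery.

```python
SERVERS_PER_CONTAINER = 1160  # per-container figure quoted earlier in the post
STACK_HEIGHT = 2              # the post's assumption: containers stacked two high


def servers_from_footprints(footprints: int,
                            stack_height: int = STACK_HEIGHT,
                            servers_per_container: int = SERVERS_PER_CONTAINER) -> int:
    """Convert a count of container footprints on a floor plan into servers."""
    if footprints < 0:
        raise ValueError("footprints must be non-negative")
    return footprints * stack_height * servers_per_container


# Hypothetical input: 180 footprints (back-calculated, see the note above).
print(servers_from_footprints(180))                  # 417600
# Dropping the stacking assumption halves the estimate.
print(servers_from_footprints(180, stack_height=1))  # 208800
```

The second call shows how sensitive the result is to the two-high assumption: every figure in the totals scales linearly with it.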

It turns out that in general the centers house around 200,000 servers each. Douglas County is much larger at about twice that figure. Meanwhile Lenoir, Hamina, and Mayes County are smaller. Mayes County is due to be doubled in size during 2012. The sizes of the future data centers in Singapore and Hong Kong have not been measured. Instead I assume that they'll also host around 200,000 servers each.

This results in the following totals:

417,600 servers - Douglas County, Georgia, USA
204,160 servers - The Dalles, Oregon, USA
241,280 servers - Council Bluffs, Iowa, USA
139,200 servers - Lenoir, North Carolina, USA
250,560 servers - Moncks Corner, South Carolina, USA
296,960 servers - St. Ghislain, Belgium
116,000 servers - Hamina, Finland
125,280 servers - Mayes County, Oklahoma, USA

Sub-total: 1,791,040

Future data centers that'll be operational by early 2013:

46,400 servers - Profile Park, Dublin, Ireland
200,000 servers - Jurong West, Singapore (projected estimate)
200,000 servers - Kowloon, Hong Kong (projected estimate)
139,200 additional servers - Mayes County, Oklahoma, USA

Grand total: 2,376,640
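
For anyone wanting to re-run the sums, here is a short sketch that adds up the figures listed above and converts the measured totals back into whole containers at 1,160 servers each. The variable names are my own. The Singapore and Hong Kong entries are projections rather than measurements, so they are left out of the container check.

```python
SERVERS_PER_CONTAINER = 1160

# Operational centers, figures copied from the list above.
operational = {
    "Douglas County, Georgia, USA": 417_600,
    "The Dalles, Oregon, USA": 204_160,
    "Council Bluffs, Iowa, USA": 241_280,
    "Lenoir, North Carolina, USA": 139_200,
    "Moncks Corner, South Carolina, USA": 250_560,
    "St. Ghislain, Belgium": 296_960,
    "Hamina, Finland": 116_000,
    "Mayes County, Oklahoma, USA": 125_280,
}

# Capacity expected to be operational by early 2013.
planned = {
    "Profile Park, Dublin, Ireland": 46_400,
    "Jurong West, Singapore (projected)": 200_000,
    "Kowloon, Hong Kong (projected)": 200_000,
    "Mayes County, Oklahoma, USA (expansion)": 139_200,
}

subtotal = sum(operational.values())
grand_total = subtotal + sum(planned.values())

# Each measured total should divide evenly into containers.
for name, servers in operational.items():
    containers, remainder = divmod(servers, SERVERS_PER_CONTAINER)
    assert remainder == 0, name
    print(f"{name:<36} {servers:>9,} servers = {containers:>3} containers")

print(f"Sub-total:   {subtotal:,}")     # Sub-total:   1,791,040
print(f"Grand total: {grand_total:,}")  # Grand total: 2,376,640
```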

--------------------------------------------------
Technical details revealed by Google

A slide show published in 2009 by Google Fellow +Jeff Dean reveals lots of interesting numbers. In particular it mentions "Spanner", which is the storage and computation system used to span all of Google's data centers. This system is designed to support 1 to 10 million globally distributed servers.

Given that this information was published over two years ago, it's likely the number of servers is already well into that 1-to-10 million range. And this would match with the floor space estimation.

Slide show: www.odbms.org/download/dean-keynote-ladis2009.pdf

--------------------------------------------------
Counting servers based on energy consumption

Last year +Jonathan Koomey published a study of data center electricity use from 2005 to 2010. He calculated that the total worldwide use in 2010 was 198.8 billion kWh. In May of 2011 he was told by +David Jacobowitz (program manager on the Green Energy team at Google) that Google's total data center electricity use was less than 1% of that worldwide figure.

From those numbers, Koomey calculated that Google was operating ~900,000 servers in 2010. He does say, however, that this is only "educated guesswork". He factored in an estimate that Google's servers are 30% more energy efficient than conventional ones. It's possible that 30% understates their efficiency - Google does pride itself on energy efficiency - in which case the same energy budget would power more servers.

If we take Koomey's 2010 figure of 900,000 servers, and then add the Hamina center (opened late 2010) and the Mayes County center (opened 2011) that brings us to over a million servers. The number would be ~1,200,000 if we were to assume all data centers are the same size: 900,000 servers across the six centers running before Hamina is 150,000 per center, and eight centers of that size gives 1,200,000.
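
The energy cross-check can be written out as a rough sketch. It uses only the numbers quoted in this section, with two assumptions of mine labelled in the comments: the "less than 1%" share is treated as exactly 1%, so the energy and power figures are upper bounds, and the ~1,200,000 figure is read as Koomey's 900,000 servers spread evenly over the six centers that predate Hamina. The per-server wattage is derived here and does not come from Koomey's study.

```python
HOURS_PER_YEAR = 8760

worldwide_kwh_2010 = 198.8e9   # Koomey's worldwide data center electricity use
google_share_upper = 0.01      # ASSUMPTION: "less than 1%" taken as a 1% ceiling
koomey_servers_2010 = 900_000  # Koomey's "educated guesswork" for 2010

# Upper bound on Google's annual use, and the average draw that implies.
google_kwh_upper = worldwide_kwh_2010 * google_share_upper
avg_power_mw_upper = google_kwh_upper / HOURS_PER_YEAR / 1000  # kWh/h = kW, /1000 = MW

# Implied all-in draw per server (cooling and other overhead included).
watts_per_server_upper = avg_power_mw_upper * 1e6 / koomey_servers_2010

# ASSUMPTION: the 2010 figure covers the six centers running before Hamina.
centers_2010 = 6
centers_2012 = 8
servers_per_center = koomey_servers_2010 // centers_2010
scaled_2012 = servers_per_center * centers_2012

print(f"Annual use (upper bound): {google_kwh_upper / 1e9:.3f} billion kWh")
print(f"Average draw (upper bound): {avg_power_mw_upper:.0f} MW")
print(f"Per server (upper bound): {watts_per_server_upper:.0f} W")
print(f"Scaled to eight equal centers: {scaled_2012:,} servers")
```

On these assumptions the ceiling works out to roughly 227 MW of average draw, or about 250 W per server all-in.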

Koomey's study: www.koomey.com/post/8323374335

--------------------------------------------------
Summary

The figure of 1,791,040 servers is an estimate. It's probably wrong. But hopefully not too wrong. I'm pretty confident it's correct within an order of magnitude. I can't imagine Google has fewer than 180,000 servers or more than 18 million. This gives an idea of the scale of the Google platform.

--------------------------------------------------
References

YouTube videos:
Google container data center tour
Google Data Center Efficiency Best Practices. Part 1 - Intro & Measuring PUE
Continual improvements to Google data centers: ISO and OHSAS certifications
Google data center security

http://www.google.com/about/datacenters/
http://www.j2p2.com/google-data-center-floor-plans/

http://goo.gl/vkjWu - Google patent for container-based data centers
http://goo.gl/G4aMK - Standard container sizes
http://goo.gl/rfPMa - +Jeff Dean's slideshow about Google platform design
http://goo.gl/DcjJB - “In the Plex” book by +Steven Levy
http://goo.gl/JYXbx - +Jonathan Koomey's data center electricity use

Articles by +Rich Miller of Data Center Knowledge:
http://goo.gl/nfjvW
http://goo.gl/K5MDW
http://goo.gl/rGNy7

Original copy of this post:
https://plus.google.com/114250946512808775436/posts/VaQu9sNxJuY

Attached image below is one of Google's data warehouses in Douglas County, Georgia. Photo is from Google Maps, with an overlay showing the server container locations.