So how much data can the NSA's new datacenter actually store?

+NetApp's Larry Freeman sent me this surprising analysis earlier.
It debunks all the crazy talk about zettabytes of disk. Larry seriously knows his stuff, so this is well worth a read. I'd love to hear your thoughts... 

[TL;DR: It's "only" 2 EB.]

"[The press] has some conflicting information: [some] say the Utah Data Center is the 2nd largest in the world with 1.5 million sq feet. [Other sources] put the data center at 100,000 sq ft. ... The most telling statistic is the 65 Megawatt substation, which will limit the amount of racks that can be powered and cooled.

"Assuming that 40% of the 25,000 sq ft floor space in each of the 4 data halls would be used to house storage, 2,500 storage racks could be housed on a single floor (with accommodations for front and rear service areas).  Each rack could contain about 450 high capacity 4TB HDDs which would mean that 1,125,000 disk drives could be housed on a single data center floor, with 4.5 Exabytes of raw storage capacity. (1 Exabyte = 1 million Terabytes).
"HOWEVER, each storage rack consumes about 5 Kilowatts of power, meaning the storage equipment alone would require 12.5 Megawatts.  On the other hand, servers consume much more power per rack. Up to 35 Kilowatts.  Assuming an equivalent number of server racks (2,500), servers would eat up 87.5 Megawatts, for a total of 100 Megawatts.  Also, cooling this equipment would require another 100 Megawatts of power, making the 65 Megawatt power substation severely underpowered - and so far we’ve only populated a single floor.  Think that the NSA can simply replace all those HDDs with Flash SSDs to save power?  Think again, an 800GB SSD (3 watts) actually consumes more power per GB than a 4TB HDD (7.8 watts).
"So, in my opinion, what we’re looking at here is a fairly typical enterprise data center, albeit with monumental security measures – and a few thousand servers and a couple Exabytes of storage."

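Larry's arithmetic is easy to check. Here is a minimal sketch that simply replays the figures from the quote above; the variable names are mine, and the assumption that cooling needs as much power again as the equipment comes straight from his estimate.

```python
# Replays the back-of-envelope figures from the quoted analysis.
# All numbers are taken from the quote; the names are illustrative only.
STORAGE_RACKS = 2_500
DRIVES_PER_RACK = 450
TB_PER_DRIVE = 4
KW_PER_STORAGE_RACK = 5
SERVER_RACKS = 2_500          # "an equivalent number of server racks"
KW_PER_SERVER_RACK = 35
SUBSTATION_MW = 65

# Capacity: 1 exabyte = 1 million terabytes.
drives = STORAGE_RACKS * DRIVES_PER_RACK
raw_eb = drives * TB_PER_DRIVE / 1_000_000

# Power: racks * kW per rack, converted to megawatts.
storage_mw = STORAGE_RACKS * KW_PER_STORAGE_RACK / 1000
server_mw = SERVER_RACKS * KW_PER_SERVER_RACK / 1000
equipment_mw = storage_mw + server_mw
total_mw = equipment_mw * 2   # assumption from the quote: cooling doubles the load

# Watts per gigabyte for the SSD-versus-HDD comparison.
ssd_w_per_gb = 3 / 800        # 800 GB SSD at 3 W
hdd_w_per_gb = 7.8 / 4000     # 4 TB HDD at 7.8 W

print(f"{drives:,} drives, {raw_eb} EB raw")
print(f"equipment {equipment_mw} MW, with cooling {total_mw} MW, "
      f"substation {SUBSTATION_MW} MW")
print(f"SSD {ssd_w_per_gb * 1000:.2f} mW/GB vs HDD {hdd_w_per_gb * 1000:.2f} mW/GB")
```

On these inputs the fully populated floor would need roughly three times what the substation supplies, which is the point of the analysis.
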
More at
Is the assumption that all the equipment on this site is just for storage, or are the servers also being tasked with indexing, analyzing, and searching the data?

That is still a nice chunk of data. Could put a lot of MP3s in there :)
Relational compression. It is easier to link billions of people to a single "happy new year" token than to store that string billions of times come New Year. The same approach applies to all data: there is not that much unique data when you look at a large volume, so storage is never the real issue. The real issue is the ability to process and reference it.

But email, SMS, and phone calls don't take up as much as you think. Now if people did more video conferencing using sign language, then things get fun :).
The problem with that idea is that, as more and more people use crypto, the NSA will have to store the ciphertext until they can decrypt it. And an encrypted stream is inherently impossible to compress.
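A quick way to see both sides of this: a minimal sketch using Python's standard `zlib`, with `os.urandom` standing in for ciphertext. That stand-in is an assumption on my part (output from a sound cipher should look statistically random), and no real encryption is performed here.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, 9)) / len(data)


# Highly repetitive cleartext, like the same greeting sent many times over.
cleartext = b"happy new year " * 10_000

# Stand-in for ciphertext: random bytes of the same length (an assumption,
# not actual encrypted traffic).
ciphertext_like = os.urandom(len(cleartext))

text_ratio = compression_ratio(cleartext)
random_ratio = compression_ratio(ciphertext_like)

print(f"repetitive cleartext: {text_ratio:.4f} of original size")
print(f"random stand-in:      {random_ratio:.4f} of original size")
```

The repetitive text shrinks to a tiny fraction of its size, while the random bytes stay at about their original size, or slightly above it once the compressor's framing overhead is counted.
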
Typically what you see on the surface is like the visible portion of the iceberg.
+Richi Jennings Compression of an encrypted stream is possible, despite the entropy (I'll tell you more one day, but ways exist to rearrange the ciphertext into a form more amenable to compression; I wrote one many years ago).

That said, many forms of encryption use limited-size keys (and methods), and humans can be very predictable. Get 1000 people and ask them to pick an encryption password from, say, 10 billion possible combinations, and you would think the odds of two people picking the same one would be low, yet it happens, and more often than the odds suggest. This ignores that 5% of them would pick "god" or "jesus" as a password if you have no checks and let them.

As for the actual decryption, you don't even want to know what they have for rainbow tables; seriously scary. And that ignores the kinks and dents they have seeded in various encryption subsystems over the years. You could spend a lifetime talking about what random truly is and then still find that 256-bit encryption is flawed many years later, so the net effect is only 64-bit encryption with the rest workable. Personally I don't like encryption that uses even numbers; I'd take 255-bit over 256-bit. Call me silly, but time will show method in that madness.
If you have to store massive amounts of data which you do not need to access in real time, you'll add robotic tape libraries. So just counting HDDs is a bit too simple. That said, store the metadata on disk and the payload on tape, and if the metadata analysis says that you want to look at the payload, voila, it's available.

And, as long as most people communicate in cleartext, there's deduplication, compression, etc. The Internet Archive Wayback Machine contains almost 2 petabytes of data and is currently growing at a rate of 20 terabytes per month. That should give you an idea of the amount to add.

Last but not least: encryption is not the proper response. The proper response is to require democratic and public control (and limitations, of course) of the things happening. Any real democracy has to cope with adversaries, who benefit from this open control system. If it can't cope, it will be no better than these adversaries in the long run.
The answer is -- way too much.
To me, it looks like a large spleen.  Perhaps it hypertrophies when there is more data.
I thought they wanted to try and make it look a bit like a dick.
Why are they building this? 640k is enough for anyone.
As to cooling, I was at a very large data center last year and the cooling issue was fixed using a convection approach. What I can tell you is that, unlike a standard hot/cold aisle approach, this convection approach reduces the cooling requirements by around 40% to 50% compared with a standard data center.
+Hector Andem You'd better believe it. It is certainly true of many governmental buildings, especially those designed with datacentres or legacy Cold War considerations in mind. That is why many government construction projects dispose of excavated soil in a distributed manner, using covered removal lorries to help hide the subsoil layers being extracted: during the Cold War, satellite pictures were often used to inspect the soil being removed from a construction site to help work out how many sub-levels were actually being built. Hence the security around its removal, and then one of the security guards used during the site construction ends up slipping the details out during a pub chat.

So the people who do know are usually the very ones the secrecy is aimed at, which makes you laugh if not cry. But we the people don't know and honestly, do we even need to know? A room by any other name is still a room, to corrupt Shakespeare :).
If you look at the layout from a distance long enough after a few drinks, it looks like a killer wiener (hot dog sausage) with mobility tracks and a blood-sucking attachment at the front. Somebody had a laugh with that design, I bet :). Now I can't unsee it 8(