This article got me wondering: how is Amazon doing its Glacier storage?  People keep comparing it to tape, but, at $0.01/GB/month, couldn't it just be extra disk in its cluster?  Commodity disk is only about $0.05/GB, probably cheaper for Amazon, so with 3x replication they'd only need roughly a year of storage revenue to pay for the disk.  And that ignores that the disk would be needed anyway and that cold storage doesn't interfere with higher-paying work, so, by increasing utilization on hardware they already have to buy, this is, in some sense, pure profit -- similar to how customers coming to restaurants at non-peak times are pure profit, because the costs are in handling maximum peak load.
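As a rough back-of-the-envelope check on that payback claim (the numbers are the guesses from the paragraph above, not Amazon's actual costs):

price_per_gb_month = 0.01      # Glacier's price: $0.01/GB/month
disk_cost_per_gb = 0.05        # commodity disk, ~$0.05/GB (likely less at Amazon's scale)
replication_factor = 3         # assume three copies of every byte

hardware_cost_per_billable_gb = disk_cost_per_gb * replication_factor   # $0.15
payback_months = hardware_cost_per_billable_gb / price_per_gb_month     # 15 months

print(f"hardware cost per billable GB: ${hardware_cost_per_billable_gb:.2f}")
print(f"months of revenue to pay off the disk: {payback_months:.0f}")
# About 15 months at list disk prices, closer to a year if Amazon pays less per GB --
# and that still ignores that the disks would largely be there anyway.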

If that is what Amazon is doing here -- and I'm guessing, but I think they noticed that a lot of workloads need memory and maybe some rapid access to disk, that most disk sits empty, and that there are long stretches when disk is mostly idle, so they thought, let's sell that spare capacity in a way that doesn't interfere with real-time work -- I really love it.
From the linked article:

Earlier today Amazon Web Services announced Glacier, a low-cost, cloud-hosted, cold storage solution. Cold storage is a class of storage that is discussed infrequently and yet it is by far the largest storage class of them all. Ironically, the storage we usually talk about and the storage I’ve worked on for most of my life is the high-IOPS rate storage supporting mission critical databases. These systems today are best hosted on NAND flash and I’...
Ed Chi
 
Do you think the calculation makes sense after factoring in the management and data center costs of cooling and electricity, etc?
Greg Linden
 
Yep, because only the marginal costs matter there.  I think you need almost all the hardware powered up anyway, so it's only the very small electricity difference between an active, lightly loaded machine and an active, slightly more loaded machine, which is negligible.

In more detail, I'd think we're only talking about the potential for a few machines to be powered up and active at non-peak times that might otherwise have been idle or powered down.  Most of the machines would see utilization drop to single digits but would need to stay active to handle that light load, so you're really only paying the small difference in electricity between a machine at very low utilization and the same machine at moderate utilization, which is very cheap, basically negligible.  And, unless you're quite clever about reorganizing load all the time, I'd think there'd be very few machines that would otherwise be completely idle for long periods.  Even those only save a small fraction of power in idle mode (last I heard, 10-40%, depending on what is shut down and what latency you're willing to tolerate when you need to go active again), and fully powering down a machine is harder than one might think: last I heard, if there's any chance of needing the data immediately, you can't power down the machine, as boot-up time is too long, which means you can basically only power down machines holding nothing but extra replicas and backups, and that's not what most machines have on them.
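To put a rough number on "negligible," here is a quick back-of-the-envelope sketch; every figure in it (the extra wattage, the electricity rate, the drive count and size) is an assumption for illustration, not a measured or reported number:

extra_watts_per_server = 50        # assumed bump from lightly to moderately loaded
hours_per_month = 730
electricity_usd_per_kwh = 0.07     # assumed data-center electricity rate

drives_per_server = 12             # assumed 2012-era storage server: 12 x 2 TB drives
tb_per_drive = 2
replication_factor = 3

extra_kwh_per_month = extra_watts_per_server * hours_per_month / 1000.0     # ~36.5 kWh
extra_cost_per_month = extra_kwh_per_month * electricity_usd_per_kwh        # ~$2.56

billable_gb = drives_per_server * tb_per_drive * 1000 / replication_factor  # ~8,000 GB
marginal_cost_per_gb_month = extra_cost_per_month / billable_gb             # ~$0.0003

print(f"extra electricity per server: ${extra_cost_per_month:.2f}/month")
print(f"marginal cost per billable GB: ${marginal_cost_per_gb_month:.5f}/month")
# A few hundredths of a cent per GB-month, i.e. a few percent of the $0.01 price.

Even if these guesses are off by a factor of a few, the marginal electricity stays a small fraction of the $0.01/GB/month price.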

But, honestly, I'm not an expert here, just someone wondering how Amazon Glacier works and what its economics look like.  To me, because almost all access can be deferred to non-peak times, it looks plausible that it could all be done using just the spare capacity on the existing disks in their AWS fleet.  And that has interesting implications: big data companies could offer online storage and backup using excess capacity in their data centers, which matters especially for companies trying to compete in online backup or data storage with dedicated hardware.
Kuyper Hoffman
 
Word is it's separate gear: low-RPM drives that get spun down and only fired up on demand (hence the 3-5 hour retrieval promise).

Likely too that they schedule and batch both reads and writes and only access a few drives at a time. My further guess is that these low-power, and hence cooler, racks are interspersed between "hot" racks, lowering the aggregate power and heat footprint.

S3 is probably used (automagically) as the staging area for pre-delivery.
Greg Linden
 
+Kuyper Hoffman Good news for others if you're right, as that would mean Amazon has no competitive advantage.
Kuyper Hoffman
 
+Greg Linden  Really? I'd say that the competitive advantage is in the horrendously complex algorithms that decide how, when, and where to send spin-up/down requests to optimize the storage and recovery; of course, they already know a little thing or two about "just in time delivery" -- boxes, bytes, all the same in principle, especially when the timescales are in hours, not nanoseconds. The thought behind the low RPM is obviously low consumption/heat during access; furthermore, only a few "servers" (I would imagine some very cheap, low-CPU "blades" that each handle a few drives) are spun up at a time in a particular location (say, a rack), meaning there is little to no impact on power/cooling in that location. I'd even guess that they're picking a device on a PDU based on that PDU's load at the time, deferring requests for data that reside on a heavily used PDU as late as possible, then prioritizing those "running late" requests ahead of other requests when the clock expires.

At the same time, writes are probably sent only to banks that are already spun up for a scheduled read.

I think with careful scheduling, the power requirements could be minimized; this is where they're scoring heavily; cooling kills. At scale, anyway :)
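Purely as an illustration of the kind of deadline-driven, power-aware batching described above, here is a hypothetical sketch; the names, data structures, and thresholds are all invented and say nothing about how Amazon actually implements Glacier:

import time
from collections import defaultdict

RETRIEVAL_SLA_SECONDS = 5 * 3600   # the advertised 3-5 hour retrieval window
PDU_LOAD_THRESHOLD = 0.6           # don't spin drives up on a heavily loaded PDU
LATE_SLACK_SECONDS = 1800          # run regardless of load with 30 minutes to spare

class RetrievalRequest:
    def __init__(self, archive_id, drive_group, pdu, submitted_at):
        self.archive_id = archive_id
        self.drive_group = drive_group   # which bank of drives holds the data
        self.pdu = pdu                   # which power distribution unit feeds that bank
        self.deadline = submitted_at + RETRIEVAL_SLA_SECONDS

def schedule_round(pending, pdu_load, now=None):
    """Pick the batches of requests to serve in this scheduling round.

    A drive group is spun up if (a) its PDU is lightly loaded, so the spin-up is
    cheap, or (b) some request in the group is about to miss its deadline, in
    which case it runs regardless of PDU load.
    """
    now = time.time() if now is None else now
    by_group = defaultdict(list)
    for req in pending:
        by_group[req.drive_group].append(req)

    batches = []
    for group, reqs in by_group.items():
        pdu = reqs[0].pdu
        running_late = min(r.deadline for r in reqs) - now < LATE_SLACK_SECONDS
        if pdu_load.get(pdu, 0.0) < PDU_LOAD_THRESHOLD or running_late:
            batches.append((group, reqs))   # spin the group up and serve the whole batch
    return batches

Writes could then be appended to whichever drive groups are already spun up for a scheduled read, per the comment above.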
Ed Chi
 
Everything I have learned about this area says it can get insanely complex, with cooling being a much harder problem than most people realize.  Cooling comes with power management and density issues.

My alternative, simpler theory: they have a bunch of disks in big trays.  These trays are robot-controlled and function like tapes.  The trays get plugged into a machine only when they're needed, and the scheduling algorithm serializes the access requests.
Greg Linden
 
+Kuyper Hoffman Good points, and I'm being too strong when I say Amazon has no competitive advantage.  My point is mostly that competitors (e.g. Mozy) can't use idle disk capacity on an existing cluster, but competitors can and do use hardware and power-down strategies similar to what you (and +Ed Chi) described.  It would be a much stronger competitive advantage if Amazon were using something few others can get easily (like lots of empty disk on existing hardware that is basically free if used at non-peak times).
Ed Chi
 
I hope we get to learn how they implemented this soon.  I like the idea of them knowing when to power up the drives so that they can minimize energy usage and cost.
 
+Ed Chi they're notoriously tight lipped; I used to help them stay that way :)
Andrew Hitchcock
 
Greg, I had the same thought when this service was released. I imagine S3's main driver of hard disk purchases is IOPS and that they have a glut of bytes available. You'll notice that S3 underprices reads and writes. From an EC2 instance, I can read a 5 TB object from S3 for 1/10K of a penny: just the GET cost since bandwidth from EC2 is free and there are no per-byte charges on reads.
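Spelling that arithmetic out (the per-request GET price here is reconstructed from the "1/10K of a penny" figure, so treat it as approximate):

get_price = 0.01 / 10_000    # $0.01 per 10,000 GET requests -> $0.000001 per GET
transfer_per_gb = 0.0        # S3 -> EC2 transfer within the region is free
read_fee_per_gb = 0.0        # no per-byte charge on reads

object_size_gb = 5 * 1024    # a 5 TB object
cost = get_price + (transfer_per_gb + read_fee_per_gb) * object_size_gb

print(f"one 5 TB read from EC2: ${cost:.6f}")   # $0.000001, i.e. 1/10,000 of a cent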

I was wondering how long it would take for Amazon to remedy this. I expected them to introduce a per-byte read/write charge, but it looks like they went a different way and introduced it as a separate product. This lets existing S3 customers overpay for storage and underpay for requests, while those who are byte-price sensitive can transition to the new product. It is actually quite smart: Amazon keeps existing S3 customers happy by not upending the price model. Just look at how unhappy people were when Amazon first announced the per-request charge or when Google changed the App Engine prices.

Another reason I believed this was S3 under the hood: it provides the same durability SLAs, which makes me think it uses the same software and durability model. They can flag these objects as Glacier objects. The API can delay customer reads and writes to a more opportune moment in order to shave off the top of the demand curve and fill in the troughs. 

Of course, it would be neat if Amazon was using ultra-low power disks or massive robot disk libraries. That would align well with the rumored Facebook "sub zero" project: http://www.wired.com/wiredenterprise/2012/08/sub-zero/
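If that theory is right -- Glacier objects living in S3 with their reads and writes deferred to the troughs of the demand curve -- the gating logic might look very roughly like this; it is entirely speculative, and the function name and thresholds are invented:

RETRIEVAL_SLA_SECONDS = 5 * 3600   # the advertised 3-5 hour window
OFF_PEAK_UTILIZATION = 0.4         # only run cold work when fleet load is below this
LATE_SLACK_SECONDS = 1800          # ...unless the request is about to miss its SLA

def ready_to_run(request_age_seconds, fleet_utilization):
    """Serve a Glacier request if the fleet is in a trough, or if the SLA forces it."""
    in_trough = fleet_utilization < OFF_PEAK_UTILIZATION
    almost_late = request_age_seconds > RETRIEVAL_SLA_SECONDS - LATE_SLACK_SECONDS
    return in_trough or almost_late

# A two-hour-old request during a busy afternoon is held back...
print(ready_to_run(2 * 3600, fleet_utilization=0.8))            # False
# ...but the same request at 4.6 hours old runs regardless of load.
print(ready_to_run(int(4.6 * 3600), fleet_utilization=0.8))     # True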
Greg Linden
 
+Andrew Hitchcock My thoughts exactly: that they are using S3 for Glacier, but just delaying reads and writes to non-peak times, which means they are using excess capacity, which is nearly costless (as long as it never interferes with peak demand).  Great point on how Glacier corrects the S3 pricing for, e.g., backups and logs coming off EC2 boxes.  And thanks for the Wired article on Facebook's Sub Zero project; I hadn't seen that.