Processing a trillion cells per mouse click

Working at Internet-scale drives us to develop tools to handle the world’s largest datasets. We built Dremel (http://research.google.com/pubs/pub36632.html) to make big datasets look small, but that was just the beginning. Now we're proud to spotlight PowerDrill, a new data analysis tool that takes advantage of a pre-processing step to query huge datasets a couple of orders of magnitude faster than Dremel (http://www.wired.com/wiredenterprise/2012/08/google-trillion-pieces-of-data/).

PowerDrill is described in the research paper below, presented at last week’s Very Large Data Bases (VLDB) conference in Istanbul. Built by Google authors Alexander Hall, Olaf Bachmann, Robert Bussow, Silviu Ganceanu, and Marc Nunkesser, the tool can fire off around 20 queries with a single mouse click. Together those queries process 782 billion cells of data in less than 40 seconds, which is 10 to 100x faster than traditional column stores that do full scans of the data.


(Edited 9/5/12: updated links and clarified PowerDrill tool)
 
For some reason when I read the title, my brain processed it as "Processing a trillion mouse cells per click".  I wondered when Google had gotten into neuroscience.  :-)
 
If they keep at it they'll soon be able to handle queries on the U.S. debt.