Tariq Khokhar
366 followers
Tariq's posts

Post has attachment
I expect Neil Fantom is the only one more excited than me about this, but seriously, well done to the +World Bank data team on launching the OPEN DATA TIME MACHINE!

What is it? It's a tool that lets you analyze and visualize different vintages and revisions of a published time series. So when you hear phrases like "revised GDP numbers" or "revised inflation figures" you can now compare the different data side by side and see if any conclusions you initially drew need to be updated as well.

http://blogs.worldbank.org/opendata/2015-year-data-time-travel 
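
To make the idea of comparing vintages concrete, here is a rough Python sketch. It is not how the Time Machine itself works; the `revisions` helper and the numbers are made-up placeholders, purely for illustration. It lines up two vintages of the same series and reports the periods whose values were revised.

```python
def revisions(earlier, later):
    """Compare two vintages of one series, each given as {period: value}.

    Returns {period: (old_value, new_value, change)} for periods that
    appear in both vintages and whose values differ.
    """
    changed = {}
    for period in sorted(earlier.keys() & later.keys()):
        old, new = earlier[period], later[period]
        if old != new:
            changed[period] = (old, new, new - old)
    return changed


# Placeholder values only, not real World Bank figures.
vintage_a = {2010: 100.0, 2011: 104.0, 2012: 109.0}
vintage_b = {2010: 100.0, 2011: 105.5, 2012: 108.0, 2013: 112.0}

for period, (old, new, change) in revisions(vintage_a, vintage_b).items():
    print(period, old, new, change)
```

Periods that exist in only one vintage (2013 here) are skipped, since there is nothing to compare them against.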

Post has shared content
How do you define #BigData?

By December 2013, "meaningless buzzword" would be the most honest answer, but before the label "big data" became simply another way to publish more books, wow investors, and dupe customers, it actually was trying to make a real distinction.

My short, quippy answer has generally been "more data than I can fit on my phone" (i.e. more than 48GB serialized). +Tariq Khokhar told me on Twitter that he prefers "more data than I can pick up."

My longer, more considered answer is that big data is data that -- because of its size, not its nature -- we can't handle using conventional tools like relational databases (or hierarchical databases, RDF triple stores, etc.). Relational databases can easily store, analyze, query, and report on millions or tens of millions of rows, so millions of rows isn't "big data." At billions of rows, you're starting to strain the capacity of current RDBMS -- you might have to partition your data in complex ways and then merge the results, for example -- so I suspect that's when you start thinking about non-relational, distributed hashing approaches like Hadoop.

Once you make the jump to "big data" you're giving up some valuable stuff like internal consistency, but you're gaining the ability to scale. Depending on what you're trying to accomplish, the answers you get over a slightly inconsistent dataset of 10 billion nodes could be more interesting than the answers you get over a consistent dataset of 10 million nodes.

So that's my tentative answer, for today, anyway. What's yours?

Post has attachment
New data show the global child (under-five) mortality rate has dropped 47 percent since 1990: http://blogs.worldbank.org/opendata/global-child-mortality-rates-have-halved-1990-s-not-enough-meet-mdg-target

Post has attachment
Just wrote a little overview of accessing +World Bank data in Python, R, Ruby and Stata - keen to hear about any other language-specific libraries out there!
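
For anyone who wants to skip the libraries, here is a minimal Python sketch using only the standard library. It is not taken from the overview itself. It assumes the public World Bank API is served under `https://api.worldbank.org/v2`, that a JSON response is a two-element list of paging metadata followed by rows, and that the series is annual. The function names are my own, not part of any official client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: base URL of the public World Bank API.
API_ROOT = "https://api.worldbank.org/v2"


def indicator_url(country, indicator, start_year, end_year):
    """Build the request URL for one indicator over a range of years."""
    query = urlencode({
        "format": "json",
        "date": f"{start_year}:{end_year}",
        "per_page": 1000,
    })
    return f"{API_ROOT}/country/{country}/indicator/{indicator}?{query}"


def parse_observations(payload):
    """Turn a decoded response into {year: value}, dropping missing values.

    Assumes payload is [metadata, rows] and that each row carries a
    "date" (a year, for annual series) and a "value".
    """
    if not isinstance(payload, list) or len(payload) < 2 or payload[1] is None:
        return {}
    return {
        int(row["date"]): row["value"]
        for row in payload[1]
        if row["value"] is not None
    }


def fetch_indicator(country, indicator, start_year, end_year):
    """Download and parse one indicator (needs network access)."""
    url = indicator_url(country, indicator, start_year, end_year)
    with urlopen(url, timeout=30) as response:
        return parse_observations(json.load(response))
```

For example, `fetch_indicator("WLD", "SH.DYN.MORT", 1990, 2012)` should request the world under-five mortality series, assuming those codes are still current.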

Post has shared content
Hey Alex - I went through the same decision process a few months ago: I ended up with a 15" non-retina MBP. My reasons for going for it:

- hi-res matte (not glossy) display option which I find preferable to the retina for 95% of the coding, writing and a/v work I do. Movies look nicer on the glossy retina but that's about it for me.

- I want fast AND plentiful storage. I replaced the internal HDD with a (relatively) inexpensive Samsung 256GB SSD and swapped the optical drive with a caddy to house the 1TB HDD that came with it: result is a super-fast machine with heaps of storage.

- overall, with 16GB of third-party RAM and the SSD switcheroo + AppleCare, the non-retina machine comes out a few hundred bucks cheaper than the closest configured retina. As a bonus I could use my existing MagSafe power adapters.

Good luck!
OK, techie friends: if you were going to buy a new Apple laptop, what would it be -- and why? I want a 15" screen, so the Air is out. What's the general consensus on Retina vs non-Retina machines?
http://www.apple.com/why-mac/compare/notebooks.html

Post has attachment
"What Happens When Big Data Meets Official Statistics?" Live webstream today from the +World Bank at 1430 EST / 1930 GMT. #bigstats