Learn about approaches to batching, and the costs of unnecessary system complexity.
You got Mesos in my Distributed Filesystem!
I /really/ like Apache Ranger + Apache Knox for security. I think it's a better approach than Cloudera's Sentry, particularly when you only need LDAP integration and don't want to mess with Kerberos. There is still some work to do there, but it's off to a great start.
I think components need better integration with Ambari. For example, Ranger and Hue have to be installed manually and are not managed by Ambari. On larger Hadoop clusters that can be a real hassle. I'm told that Ambari integration for those components is coming in the near future (Q1 or Q2 2015), so I'm looking forward to that.
I've benchmarked Hive queries using both MR and Tez. There are clear performance advantages to using Tez, although Hive on Tez is still not nearly as fast as Impala. However, I have queries that won't run on Impala because of memory limits but run perfectly fine with Hive.
If you haven't tried Hortonworks HDP 2.2 yet, I recommend you give it a try. I think you'll be pleasantly surprised.
#hortonworks #hdp #hadoop
Namenode HA Reaches a Major Milestone | Hortonworks
We reached a significant milestone in HDFS: the Namenode HA branch was merged into the trunk. With this merge, HDFS trunk now supports HOT
Apache Launches Hadoop 1.0 - Linux and Open Source - News & Reviews
The Apache Software Foundation delivers Hadoop 1.0, the much-anticipated 1.0 version of the popular open-source platform for storing and pro
cloud computing: Data Scientists Should Be Design Thinkers
But as DJ Patil said in “Building Data Science Teams,” the best data scientists are not statisticians; they come from a wide range of scient
Google introduces Compute Engine, Google-scale Linux virtualization
New infrastructure-as-a-service joins the App Engine PaaS offering.
SWAT team throws flashbangs, raids wrong home due to open WiFi network
Whoops! Those anonymous Internet threats came from up the block.
Why Real-Time Analytics? [Free White Paper] | Infochimps Blog
Updated daily, Monday - Friday. Chock full of big data insights, news and tips straight from the Data Mine.