Profile

Brian O'Neill
296 followers | 227,021 views

Stream

Brian O'Neill

Cloud Formation on AWS for Cassandra + HPCC
If your primary objective is to set up a simple Cassandra cluster, then you probably want to start here: http://docs.datastax.com/en/cassandra/2.1/cassandra/install/installAMI.html However, if you have an existing AWS cluster to which you want to add Cassand...

Brian O'Neill

Spark SQL against Cassandra Example
Spark SQL is awesome. It allows you to query any Resilient Distributed Dataset (RDD) using SQL (including data stored in Cassandra!). The first thing to do is to create a SQLContext from your SparkContext. I'm using Java so... (sorry -- I'm still not hip eno...

Brian O'Neill

Data Locality w/ Cassandra : How to scan the local token range of a table...
I'm working on a mechanism that will allow HPCC to access data stored in Cassandra with data locality, leveraging the Java streaming capabilities from HPCC (more on this in a followup post). More specifically, we want to allow people to write functions in E...

Brian O'Neill

Holy momentum Batman! Spark and Cassandra (circa 2015) w/ Datastax Connector and Java
Over a year ago, I did a post on Spark and Cassandra. At the time, Calliope was your best bet. Since then, Spark has exploded in popularity. Check out this Google Trends chart. That's quite a hockey stick for Spark. Also notice their GitHub project, ...

Brian O'Neill

Delta Architectures: unifying the Lambda Architecture and leveraging Storm from Hadoop/REST
Recently, I've been asked by a bunch of people to go into more detail on the Druid/Storm integration that I wrote for our book: Storm Blueprints for Distributed Real-time Computation. Druid is great. Storm is great. And the two together appear to solve th...

Brian O'Neill

Our dream home, and a monthly mortgage rate calculator in ruby!
We just finished the construction of our dream home: Now that construction is complete, we are converting from a construction loan into a normal mortgage, refinancing to get the best rate.   We are doing all the normal trade-offs, and being the geek that I ...
Julia Sable: nice! what area is that?
Have him in circles
296 people
Robin Disque's profile photo
Dan Rolli's profile photo
Sam Geddio's profile photo
Alp Şehiç's profile photo
Paul Redman's profile photo
David K's profile photo
Daniel Loftus's profile photo
Video Learning Channel's profile photo
Lynn Bender's profile photo

Brian O'Neill

Amazon Echo : Syntax, Semantics, Intents and Goals: NLP over time.
So I caved. Even with all my Apple paraphernalia, I bought an Amazon Echo. I've had it for a little over a week, and I'm hooked. We use it to play music, check the weather, and set timers -- all of the out of the box functionality. You may think, "It's...

Brian O'Neill

Streaming data into HPCC using Java
High Performance Computing Cluster (HPCC) is a distributed processing framework akin to Hadoop, except that it runs programs written in its own Domain Specific Language (DSL) called Enterprise Control Language (ECL).   ECL is great, but occasionally you wil...

Brian O'Neill

Tuning Hadoop & Cassandra : Beware of vNodes, Splits and Pages
When running Hadoop jobs against Cassandra, you will want to be careful about a few parameters. Specifically, pay special attention to vNodes, Splits and Page Sizes. vNodes were introduced in Cassandra 1.2. vNodes allow a host to have multiple portions of...

Brian O'Neill

High-Performance Computing Clusters (HPCC) and Cassandra on OS X
Our new parent company, LexisNexis, has one of the world's largest public records databases: "...our comprehensive collection of more than 46 billion records from more than 10,000 diverse sources—including public, private, regulated, and derived data. You g...

Brian O'Neill

Getting started with Cassandra Development in Eclipse (BEWARE of jdk8_40_ea, NoClassDefFoundError: ExtendedPlatformComponent)
I finally got back around to getting my environment set up for Cassandra development. I ran into one snag, and a couple of things have changed, so I figured I would capture the experience here. Fork and Build: First, fork and clone from here: https://github.com/ap...

Brian O'Neill

Absolute Truths, Perspectives, and Parallelism
We have a product called VerifyRx.  Chances are, when you go to the pharmacy and hand over your prescription, our data is being used to verify that your doctor was eligible to write that prescription. We deliver this functionality as a web service.  And bec...
Basic Information
Gender
Male
Story
Introduction
Husband, Hacker, Hiker, and Kayaker. Fisherman and Father. 
Big Data Believer, Innovator, and Distributed Computing Fanatic.
DataStax MVP & Rebel Elite, InfoWorld's Technology Innovation Award Winner

Brian is CTO at Health Market Science (HMS) where he drives innovation and development of their Big Data platform focused on data management and analysis for the Healthcare space. The platform is powered by Storm and Cassandra and delivers real-time data management and analytics as a service.
Links
Contributor to