How to Save Billions with a $10K Competition... in 3 Months

This blog will show you how a $10,000 Kaggle competition drove HUGE value for Allstate -- the U.S.'s largest publicly held insurer. I'll then share with you how YOU can use Kaggle to solve your challenges.

As a reminder, Kaggle is a machine-learning, data-competition platform that has more than 45,000 machine-learning data specialists in its worldwide network who compete to analyze YOUR data and come up with insights and breakthrough algorithms. These 45,000 competing data scientists work during the day at some of the top Silicon Valley and global tech companies while, at night, driven by the challenge, they compete at Kaggle to demonstrate they are the best. As my friend Jeremy Howard, president and chief scientist at Kaggle, told me, "It turns out that these people are applying themselves every day to real-world problems, and desire to apply what they've learned to other problems. That's why they come to Kaggle, and on Kaggle they find dozens and dozens of real-world challenges that people are desperate to solve."

During my interview with Jeremy, I asked him to tell me about Kaggle's most impressive success story. "With little doubt, it's the competition we did with Allstate." So here's the deal: Allstate, founded in 1931, is one of the world's largest insurance companies with $32 billion in revenue and 70,000 employees. You can bet that this company, which lives and dies based on the quality of its data and algorithms, employs some of the most gifted actuarial specialists in the world. What I mean by this is that Allstate wants to be able to "know," based on your age, marital status, the kind of car you drive and where you live, exactly what the probability is that you will have an accident. This is how they set your rates and how the company makes its money. It's really that simple.

Allstate's formula for making a prediction based on the data is called a vehicle-risk assessment algorithm. The competition that Kaggle ran with Allstate asked teams to improve on the company's internally developed algorithm.

The prize offered was only $10,000 but, remarkably, somewhere around 600 data scientists, composing 300 teams, took part in the competition. "The $10,000 had little to do with it. That just makes it a bit of fun," continued Jeremy. "The reason the data scientists chose to compete was all about demonstrating their creativity. They wanted to reach the top of the leaderboard against other brilliant people in the field."

The results of this competition were nothing less than stunning. In the end, the outside experts (drawn from the global cognitive surplus) blew away Allstate's internal experts. The winners of the Kaggle competition demonstrated a 340 percent improvement in predictive accuracy over Allstate's best internal algorithm!

I can't imagine how much such an improvement is worth, but I have little doubt that this $10,000 purse will eventually drive hundreds of millions if not billions of dollars of additional profits. Jeremy, who spent 10 years in the insurance business himself, was shocked at the result. "I can tell you that Allstate's actuarial department is amongst the best in the world, and unless you are familiar with the insurance marketplace it's hard to understand how huge of an achievement this Kaggle competition represents."

So how can YOU leverage available Kaggle tools and techniques to solve your biggest problems at an affordable price? To make this easy, Kaggle has created a turnkey process through which you can engage the company to take you step by step from building a competition, to monitoring its progress, to implementing the winning algorithm into your business. 

Here are the four steps in what Jeremy calls Kaggle's analytics value chain:

Step #1: Identify the problem you're trying to solve (Kaggle Prospect): Kaggle can help you identify how the data that you've collected can help you improve your business. This first step is more of a consulting-based approach. "Throw away your preconceived ideas and think about what ways you can potentially transform your business by leveraging machine learning. You can use Kaggle Prospect for this," Jeremy said. "Kaggle Prospect is where you can run a competition among our Kaggle scientists asking them to come up with ideas. So you say: 'Here's our data, here's a snapshot of roughly how our business works,' and data scientists who actually understand the limitations of machine learning will write the proposals for the kind of Kaggle competitions they believe will be most fruitful." The clearer the competition is, the more likely it will find talent to compete. 

Step #2: Create the competition: In this second step, Kaggle will help you put together a data set that contains both outcomes and all of the things that might possibly be relevant. Depending on its size, and whether it's public or private, a Kaggle competition can cost between $20,000 and $200,000. Competitions usually run from 30 to 90 days, though some have run for just 24 hours. The prize money varies as well, from a few hundred to several hundred thousand or even millions of dollars. 

Step #3: Build a predictive model: At this step, the winner(s) of the competition will provide you with an algorithm that you can use to build a predictive model to drive your business.

Step #4: Implementation: Take that model and implement it in your product. It could be online, part of how you price -- however you wish or need to use the results, Jeremy said. 
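
For readers who want to see the mechanics behind Steps #2 and #3, here is a minimal sketch in Python of how a competition might rank entries: each team's predictions are compared against held-out outcomes, and the lowest error tops the leaderboard. Everything in it is an assumption made for illustration. The numbers are made up, the team names and function names are hypothetical, and mean absolute error is simply a common scoring choice; Jeremy did not say how accuracy was measured in the Allstate competition.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual outcomes and predictions."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def rank_submissions(actual, submissions):
    """Return (team, error) pairs sorted best-first (lowest error wins)."""
    scores = {
        team: mean_absolute_error(actual, predictions)
        for team, predictions in submissions.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Made-up held-out outcomes (for example, claims per policy).
# These are illustrative values, not Allstate data.
held_out_outcomes = [0, 1, 0, 2, 0]

# Hypothetical entries: the sponsor's own model plus two outside teams.
submissions = {
    "in_house_baseline": [1, 1, 1, 1, 1],
    "team_a": [0, 1, 0, 1, 0],
    "team_b": [0, 0, 0, 1, 1],
}

leaderboard = rank_submissions(held_out_outcomes, submissions)
for team, error in leaderboard:
    print(f"{team}: {error:.2f}")
```

The point of holding the outcomes back is that every team, including the in-house one, is judged on data it never saw, which is what makes a comparison against the sponsor's own algorithm meaningful.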

I closed my interview by asking Jeremy to predict what changes will take place in this field over the next five years. Specifically, what changes will spur entrepreneurs and large corporations to consider using machine-learning data competitions to a greater extent? Here's his answer:

1. The number of human beings connected to the internet will double. "There will be billions of new minds we can access," Jeremy said. 

2. The availability of people to work together will be greater.

3. There will be more and more handheld devices with greater capability.

4. The power of machine-learning algorithms will continue to improve.

5. More and more processes will become automated and simpler. "Machines will be able to recognize much more sophisticated patterns," Jeremy said. 

The result: data mining, analysis and predictive modeling on an even-greater scale. More important for large corporations, Jeremy said, is this: "I hope there will be a cultural change, that companies will move away from the 'not-invented-here-yet-syndrome.' That they will be prepared to accept that there is somebody in, say, Bolivia, who is better at analyzing their credit score and banking data than anybody on their highly paid banking team," he said. "After all, why would you not use the best minds in the world, wherever they are?"

In my next blog I'm going to lead you through the amazing adventure of how I managed to arrange for renowned, wheelchair-bound physicist Stephen Hawking to take a zero-gravity flight.

NOTE: As always, I would love your help in co-creating BOLD, and will happily acknowledge you as a "contributing author" for your input. Please share with me (and the community) in the comments below what you specifically found most interesting, what you disagree with and any similar stories or examples that reinforce this blog that I might use as examples in writing BOLD. Thank you!