How to Save Billions with a $10K Competition... in 3 Months

This blog will show you how a $10,000 Kaggle competition drove HUGE value for Allstate -- the U.S.'s largest publicly held insurer. I'll then share with you how YOU can use Kaggle to solve your challenges.

As a reminder, Kaggle is a machine-learning, data-competition platform that has more than 45,000 machine-learning data specialists in its worldwide network who compete to analyze YOUR data and come up with insights and breakthrough algorithms. These 45,000 competing data scientists work during the day at some of the top Silicon Valley and global tech companies while, at night, driven by the challenge, they compete at Kaggle to demonstrate they are the best. As my friend Jeremy Howard, president and chief scientist at Kaggle, told me, "It turns out that these people are applying themselves every day to real-world problems, and desire to apply what they've learned to other problems. That's why they come to Kaggle, and on Kaggle they find dozens and dozens of real-world challenges that people are desperate to solve."

During my interview with Jeremy, I asked him to tell me about Kaggle's most impressive success story. "With little doubt, it's the competition we did with Allstate." So here's the deal: Allstate, founded in 1931, is one of the world's largest insurance companies with $32 billion in revenue and 70,000 employees. You can bet that this company, which lives and dies based on the quality of its data and algorithms, employs some of the most gifted actuarial specialists in the world. What I mean by this is that Allstate wants to be able to "know," based on your age, marital status, the kind of car you drive and where you live, exactly what the probability is that you will have an accident. This is how they set your rates and how the company makes its money. It's really that simple.

Allstate's formula for making a prediction based on the data is called a vehicle-risk assessment algorithm. The competition that Kaggle ran with Allstate asked teams to improve on the company's internally developed algorithm.
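To make the idea of a risk formula concrete, here is a minimal sketch in Python of the simplest possible version: count how often drivers in each segment had an accident, then multiply that frequency by an average claim cost to get an expected cost per policy. Everything in it is an assumption made for illustration (the records, the segments, the $4,000 claim figure, and the function names); it is not Allstate's algorithm, which the interview does not describe.

```python
from collections import defaultdict

# Hypothetical policy records: (age_band, marital_status, had_accident)
RECORDS = [
    ("18-25", "single", 1),
    ("18-25", "single", 0),
    ("18-25", "single", 1),
    ("18-25", "married", 0),
    ("26-60", "single", 0),
    ("26-60", "single", 1),
    ("26-60", "married", 0),
    ("26-60", "married", 0),
    ("26-60", "married", 0),
    ("26-60", "married", 1),
]

AVG_CLAIM_COST = 4000.0  # assumed average cost of one claim, in dollars


def accident_rates(records):
    """Return the observed accident frequency for each (age_band, marital_status) segment."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [accidents, policies]
    for age_band, marital_status, had_accident in records:
        counts[(age_band, marital_status)][0] += had_accident
        counts[(age_band, marital_status)][1] += 1
    return {segment: accidents / policies
            for segment, (accidents, policies) in counts.items()}


def pure_premium(rate, avg_claim_cost):
    """Expected claim cost per policy: accident probability times average claim size."""
    return rate * avg_claim_cost


rates = accident_rates(RECORDS)
for segment, rate in sorted(rates.items()):
    print(segment, round(rate, 2), round(pure_premium(rate, AVG_CLAIM_COST), 2))
```

A real insurer works with far more variables and far more data, but the logic is the same: a better estimate of the accident probability translates directly into a better price.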

The prize offered was only $10,000 but, remarkably, somewhere around 600 data scientists, composing 300 teams, took part in the competition. "The $10,000 had little to do with it. That just makes it a bit of fun," continued Jeremy. "The reason the data scientists chose to compete was all about demonstrating their creativity. They wanted to reach the top of the leaderboard against other brilliant people in the field."

The results of this competition were nothing less than stunning. In the end, the outside experts (i.e., tapping into the global cognitive surplus) blew away Allstate's internal experts. The winners of the Kaggle competition demonstrated a 340 percent improvement in predictive accuracy over Allstate's best internal algorithm! 

I can't imagine how much such an improvement is worth, but I have little doubt that this $10,000 purse will eventually drive hundreds of millions if not billions of dollars of additional profits. Jeremy, who spent 10 years in the insurance business himself, was shocked at the result. "I can tell you that Allstate's actuarial department is amongst the best in the world, and unless you are familiar with the insurance marketplace it's hard to understand how huge of an achievement this Kaggle competition represents."

So how can YOU leverage available Kaggle tools and techniques to solve your biggest problems at an affordable price? To make this easy, Kaggle has created a turnkey process through which you can engage the company to take you step by step from building a competition, to monitoring its progress, to implementing the winning algorithm into your business. 

Here are the four steps in what Jeremy calls Kaggle's analytics value chain:

Step #1: Identify the problem you're trying to solve (Kaggle Prospect): Kaggle can help you identify how the data that you've collected can help you improve your business. This first step is more of a consulting-based approach. "Throw away your preconceived ideas and think about what ways you can potentially transform your business by leveraging machine learning. You can use Kaggle Prospect for this," Jeremy said. "Kaggle Prospect is where you can run a competition among our Kaggle scientists asking them to come up with ideas. So you say: 'Here's our data, here's a snapshot of roughly how our business works,' and data scientists who actually understand the limitations of machine learning will write the proposals for the kind of Kaggle competitions they believe will be most fruitful." The clearer the competition is, the more likely it will find talent to compete. 

Step #2: Create the competition: In this second step, Kaggle will help you put together a data set that contains both outcomes and all of the things that might possibly be relevant. Depending on its size, and whether it's public or private, a Kaggle competition can cost between $20,000 and $200,000. Competitions usually run from 30 to 90 days, though some have run for just 24 hours. The prize money varies as well, from a few hundred to several hundred thousand or even millions of dollars. 

Step #3: Build a predictive model: At this step, the winner(s) of the competition will provide you with an algorithm you can use to build a predictive model that drives your business.

Step #4: Implementation: Take that model and implement it in your product. It could be online, part of how you price -- however you wish or need to use the results, Jeremy said. 
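For readers who want to see the mechanics behind Steps #2 and #3, here is a rough Python sketch of how a competition leaderboard can rank entries: the sponsor holds back a set of known outcomes, each team submits predicted probabilities, and every submission is scored against the held-back answers. The scoring rule used here (the Brier score, a mean squared error on probabilities), the team names and all of the numbers are assumptions for illustration; the interview does not say which metric the Allstate competition used.

```python
def brier_score(predicted, actual):
    """Mean squared error between predicted probabilities and 0/1 outcomes (lower is better)."""
    if len(predicted) != len(actual):
        raise ValueError("prediction and outcome lists must have the same length")
    return sum((p - y) ** 2 for p, y in zip(predicted, actual)) / len(actual)


def leaderboard(submissions, held_out_outcomes):
    """Score every submission on the held-out outcomes and rank the best (lowest) first."""
    scored = [(name, brier_score(preds, held_out_outcomes))
              for name, preds in submissions.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical held-out outcomes: 1 = the policy had a claim, 0 = it did not.
HELD_OUT = [1, 0, 0, 1, 0, 0, 0, 1]

# Hypothetical submissions: one predicted claim probability per held-out policy.
SUBMISSIONS = {
    "internal_baseline": [0.375] * 8,  # always predicts the overall claim rate
    "team_a": [0.8, 0.2, 0.1, 0.7, 0.3, 0.2, 0.1, 0.6],
    "team_b": [0.6, 0.4, 0.3, 0.5, 0.4, 0.3, 0.4, 0.5],
}

ranking = leaderboard(SUBMISSIONS, HELD_OUT)
for position, (name, score) in enumerate(ranking, start=1):
    print(position, name, round(score, 4))
```

Scoring against outcomes the competitors never see is what keeps the ranking honest: a model can only climb by predicting well on cases it was not tuned on.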

I closed my interview by asking Jeremy to predict what changes will take place in this field over the next five years. Specifically, what changes will spur entrepreneurs and large corporations to make greater use of machine-learning data competitions? Here's his answer:

1. The number of human beings connected to the internet will double. "There will be billions of new minds we can access," Jeremy said. 

2. The availability of people to work together will be greater.

3. There will be more and more handheld devices with greater capability.

4. The power of machine-learning algorithms will continue to improve.

5. More and more processes will become automated and simpler. "Machines will be able to recognize much more sophisticated patterns," Jeremy said. 

The result: data mining, analysis and predictive modeling on an even greater scale. More important for large corporations, Jeremy said, is this: "I hope there will be a cultural change, that companies will move away from the 'not-invented-here-yet-syndrome.' That they will be prepared to accept that there is somebody in, say, Bolivia, who is better at analyzing their credit score and banking data than anybody on their highly paid banking team," he said. "After all, why would you not use the best minds in the world, wherever they are?"

In my next blog I'm going to lead you through the amazing adventure of how I managed to arrange for renowned, wheelchair-bound physicist Stephen Hawking to take a zero-gravity flight.

NOTE: As always, I would love your help in co-creating BOLD, and will happily acknowledge you as a "contributing author" for your input. Please share with me (and the community) in the comments below what you specifically found most interesting, what you disagree with and any similar stories or examples that reinforce this blog that I might use as examples in writing BOLD. Thank you!
I have no doubt there is so much opportunity out there waiting to be led to the Kaggle problem cruncher. I also have no doubt that, unfairly, those who can bring the horse to the trough to drink will reap more profit than the brainpower in the competition. That is, the people who know how to talk to the right ears in big companies, and can help them turn their thinking inside out and get something kaggled, will be able to make a splendid living (and feel good about it, even).
N. Berntsen, you have a point. How many man years of graduate education got reduced to $10,000? And what does that mean for graduate education? Specifically, why would anyone spend years in deep learning when the risk is increasing much faster than opportunity? Will deep learning, scholarship, devolve to a pursuit historically enjoyed by the privileged, or will it continue to be a viable career path for the middle class? Who can say?

On the other hand, perhaps we can make faster progress against cancer and aging-related diseases. Even better, use machines to help model an economy where people can be productively employed while increasing the value of the economy. Now that would be cool!

In any case, opportunity abounds at the present like no other time in history. And that's the coolest thing of all.
I think there are two big problems that should be addressed by ML.
One of them is the prediction of the best medical treatment in terms of outcomes, quality of life and costs.
Another one is a transportation model that can predict traffic jams and prevent them. It may significantly improve quality of life for a lot of people and save a lot of fuel.
I think IBM Watson is trying to address these issues, but why not create a challenge "Beat the Watson" (or "Revenge of humans")?
+David Doolin I share your concern about what will encourage deep thinking in the future, and it seems sad to me initially.

On the positive side, in an abundant future the encouragement could be the sheer thrill of it, so that sounds good.

On the negative side, things are happening faster and faster, and everything fights for your attention. That is a hard climate for the young to discover the pleasure of thinking. Furthermore, when more and more problems are tackled by machine "intelligence," it may no longer appear worthwhile to compete: you are outsmarted, and if you are not, then the thrill of being smarter than your human opponent has gone.

I am positive about it, though. I think we will still think important thoughts for the foreseeable future, and if deep thought becomes futile, we will adapt and seek other pleasures as biological beings. It is not easy to say what the "right thing" to do in the future is, but I believe that we will find new purposes naturally as our society continues to evolve.
- Houston, we have a problem...
- This is Houston. Uh, say again, please? 
- Houston, we have a problem.

- And… What is the problem? 
- The main problem is the definition of the problem.
- Ok, let’s start analyzing all the systems.
While it does have its positive aspects, with people working together to resolve problems, whether domestically or internationally, the winners of the Allstate competition should share in the profits to be generated as well. A little greedy of Allstate, don't you think?
+Ronald Greenfield If Allstate were to share the profits generated from the competition, don't you think that would decrease its likelihood of entering into the contest in the first place?
As someone who has taught thinking & creativity in organizations for many years across private, public & educational sectors (US, UK & Middle East mostly) I believe that domain specific deep knowledge will always be necessary in most fields of endeavor. We need experts. Adding training & finesse in generic thinking tools, along with astute facilitation, leverages the dynamic creative collaboration process whatever the field or focus. It's a fairly simple way to harness collective wisdom and get more done in less time. These days, skill-building programs can be rolled out in relatively short time frames without much difficulty.

Better yet, if we were to educate students from a young age how to think, learn & create more effectively, then these skills become an integral part of their personal and professional repertoire, regardless of their chosen field.

I'd love to see the algorithms applied to more qualitative issues, such as predicting development and use of 'talent' for future needs. Current quantitative methods seem to lack resilience.
I think that machine learning should be applied to language translation. If IBM's Watson could understand English well enough to compete in the Jeopardy game and win, then surely it won't be long before machines can understand the rest of the world's 7,000 languages equally well. Language is, after all, just another type of algorithm with a data set called words and a set of equations called grammar.
How can we use these ideas to correct the many wrong ideas that are currently widely supported?
1. That the fascist TPP would benefit humanity.
2. That HIV is the cause of AIDS.
3. That Fosamax is a cure for osteoporosis and bone fragility.
4. That statins are a good treatment for heart disease.
5. That global warming is largely caused by CO2.
6. That laetrile is useless against cancer.
7. That we have democracy in the United States.
8. That austerity is a good way out of recession.
9. That peer review prevents publication of bad science.

I'll pause there, but I am convinced that a million dollar solution to any of these problems would be money well spent. 

How can I become a more respected and effective conspiracy theorist?
I would add a sixth to the five changes that, according to Jeremy, will take place in this field over the next five years:
6. Human minds will become more and more specialized in this or that area and will need more cooperation with other minds to solve interdisciplinary problems.