I'm pretty excited about this post. Not because we've done so well on the predictions (more on that in a minute), but because we've been able to turn the predictions over to anyone who wants to give them a try.
When +Felipe Hoffa and I first started working on the Google I/O talk that was the background for this effort, we did so because we wanted to demystify machine learning. Many developers and technologists think that ML is 'hard' and so don't think about all of the ways that it can work for them. However, between the open source tools that are available and Google's cloud, it is now pretty easy to do a lot of things that look 'hard'. We applied those tools to something that both of us are passionate about (soccer) and used them to make some predictions.
So now we've packaged up the models that we've built in a way that lets anyone see exactly what we did and try them out themselves. If you have an idea, it should be easy to incorporate into your own model. There is lots of room for improvement.
It turns out that it is pretty easy to set this up -- the detailed instructions are in the post: just cut and paste a couple of command lines (you don't even need to modify them), then navigate to the IPython notebook in your web browser. That's all you need to do to start making predictions of your own.
And about that prediction accuracy. We've gone 13 for 14 so far, but I'd like to call out that this is a lot of luck. Soccer is just not predictable at that level of accuracy. This World Cup knockout stage was particularly surprising in that the favorites have all won (so far). (See Nate Silver's take on it here: http://fivethirtyeight.com/datalab/its-a-huge-upset-when-all-the-world-cup-favorites-win/). So please don't think that these models are some magic oracle that will tell you who will win upwards of 90% of the time. At best, they'll tell you who should win, or who would win more often if the game were played under the same conditions 100 times.
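To put a rough number on the luck involved, here is a minimal sketch that asks how often you would get at least 13 of 14 picks right if each pick were independently correct with some fixed probability. The per-game accuracies below are made-up illustrative values, not figures from our models, and `prob_at_least` is just a helper name for this example. Real matches are not independent coin flips with one shared probability, so treat this as a back-of-the-envelope check only.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability of at least k correct picks out of n games,
    assuming every pick is independently correct with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical per-game accuracies (assumptions for illustration only).
for p in (0.6, 0.7, 0.8):
    print(f"per-game accuracy {p:.0%}: P(at least 13 of 14) = {prob_at_least(13, 14, p):.3f}")
```

Even with a fairly generous assumed per-game accuracy, a 13-for-14 run comes out as an unlikely event, which is why a streak like this says more about luck than about the models.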
So give this a try and start predicting. I'd be happy to hear your results.