Predicting Baseball Game Outcomes
"It's tough to make predictions, especially about the future."
— Yogi Berra, Apocryphal
If you're following me on Twitter (h/t to you, @moneySpamBot2123!) or have been glancing at my periodic predictions here on G+, you'll have noticed that most of the time I'm so darn wrong.
But occasionally, I'm on the money. What's up with that?
Well, after much examination of the matter, it appears that the problem boils down to not accounting for the pitcher's abilities.
Overall, in a batter-pitcher interaction, roughly 40% of the outcome is determined by the pitcher. (Alright, well, 39.78%, but who's counting?)
So if we have a bad pitcher (e.g., Ervin Santana for the Minnesota Twins, returning from an 80-game ban), then the outcome will not resemble the prediction at all (FWIW, I predicted it would be Twins 5 vs Cincinnati 4; it turned out to be 4-17... I was close!).
I'm still thinking about how to adequately represent this, because a naive Markov chain won't do anymore. The random variation in the pitcher's ability is a real effect (as I study in this post). I'm going to have to sit down and think carefully about how to model these interactions.
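To make the problem concrete, here is a minimal Python sketch of one naive way a pitcher could enter a single batter-pitcher interaction. It is not a model I have settled on. It assumes (and this is purely an assumption) that the "roughly 40%" figure can be read as a linear mixing weight between a batter's outcome distribution and a pitcher's. The outcome categories, the function names `blend` and `sample_outcome`, and all the probabilities are made up for illustration.

```python
import random

# Share of the outcome attributed to the pitcher (the 39.78% quoted above).
# Treating it as a linear mixing weight is an assumption of this sketch.
PITCHER_WEIGHT = 0.3978


def blend(batter_probs, pitcher_probs, w=PITCHER_WEIGHT):
    """Linearly mix a batter's and a pitcher's outcome distributions."""
    return {
        outcome: (1 - w) * batter_probs[outcome] + w * pitcher_probs[outcome]
        for outcome in batter_probs
    }


def sample_outcome(probs, rng):
    """Draw one plate-appearance outcome from a distribution."""
    outcomes = list(probs)
    weights = [probs[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


# Hypothetical numbers, not real player statistics.
batter = {"out": 0.68, "walk": 0.08, "hit": 0.24}
bad_pitcher = {"out": 0.60, "walk": 0.12, "hit": 0.28}

mixed = blend(batter, bad_pitcher)
rng = random.Random(2015)  # fixed seed so repeated runs agree
draws = [sample_outcome(mixed, rng) for _ in range(10)]
```

A mixture of two probability distributions is still a probability distribution, so `mixed` can be dropped in wherever a batter-only distribution was used before. What it does not capture is the game-to-game random variation in the pitcher's ability, which is exactly the part that needs more thought.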
The other low-hanging fruit was determining the lineup. But this turns out to be fairly easy to predict, since it doesn't change too much from game to game.
http://pqnelson.github.io/2015/08/17/lineups-and-pitchers.html