
Nate Silver 2012, The Signal and the Noise:

# ch3

Baseball, uniquely among the major American sports, has always been played on fields with nonstandard dimensions. It's much easier to put up a high batting average in snug and boxy Fenway Park, whose contours are shaped by compact New England street grids, than in the cavernous environs of Dodger Stadium, which is surrounded by a moat of parking lot. By observing how players perform both at home and on the road, we can develop "park factors" to account for the degree of difficulty that a player faces. (For example, Fred Lynn, an MVP with the Red Sox during the 1970s, hit .347 over the course of his career at Fenway Park but just .264 at every other stadium.) Likewise, by observing what happens to players who switch from the National League to the American League, we can tell quite a bit about which league is better and account for the strength of a player's competition.

Olympic gymnasts peak in their teens; poets in their twenties; chess players in their thirties [11]; applied economists in their forties [12], and the average age of a Fortune 500 CEO is 55 [13]. A baseball player, James found, peaks at age twenty-seven. Of the fifty MVP winners between 1985 and 2009, 60 percent were between the ages of twenty-five and twenty-nine, and 20 percent were aged twenty-seven exactly. This is when the combination of physical attributes and mental attributes needed to play the game well seems to be in the best balance.

The players in the PECOTA list had generated 546 wins for their major-league teams through 2011 (figure 3-3). But the players in Baseball America's list did better, producing 630 wins. Although the scouts' judgment is sometimes flawed, they were adding plenty of value: their forecasts were about 15 percent better than ones that relied on statistics alone. That might not sound like a big difference, but it really adds up. Baseball teams are willing to pay about $4 million per win on the free-agent market [30]. The extra wins the scouts identified were thus worth a total of $336 million over this period.*
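
The arithmetic behind those figures can be checked with a short sketch. The numbers are the ones quoted in the passage; the variable names are only illustrative, and the "about 15 percent" is assumed here to mean the extra wins relative to the PECOTA total.

```python
# Figures quoted in the passage (wins through 2011, free-agent price per win).
pecota_wins = 546
baseball_america_wins = 630
dollars_per_win = 4_000_000

# Extra wins produced by the scouts' list over the stats-only list.
extra_wins = baseball_america_wins - pecota_wins

# Assumed reading of "about 15 percent better": extra wins relative to PECOTA's total.
relative_gain = extra_wins / pecota_wins

# Market value of the extra wins at the quoted price per win.
extra_value = extra_wins * dollars_per_win

print(f"extra wins: {extra_wins}")
print(f"relative gain: {relative_gain:.1%}")
print(f"extra value: ${extra_value:,}")
```

Under that reading, 84 extra wins at $4 million each gives the $336 million cited.
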
Although it would have been cool if the PECOTA list had gotten the better of the scouts, I didn't expect it to happen. As I wrote shortly after the lists were published [31]:
     As much fun as it is to play up the scouts-versus-stats angle, I don't expect the PECOTA rankings to be as accurate as . . . the rankings you might get from Baseball America. The fuel of any ranking system is information, and being able to look at both scouting and statistical information means that you have more fuel. The only way that a purely stat-based prospect list should be able to beat a hybrid list is if the biases introduced by the process are so strong that they overwhelm the benefit.
In other words, scouts use a hybrid approach. They have access to more information than statistics alone. Both the scouts and PECOTA can look at what a player's batting average or ERA was; an unbiased system like PECOTA is probably a little bit better at removing some of the noise from those numbers and placing them into context. Scouts, however, have access to a lot of information that PECOTA has no idea about. Rather than having to infer how hard a pitcher throws from his strikeout total, for instance, they can take out their radar guns and time his fastball velocity. Or they can use their stopwatches to see how fast he runs the bases.
This type of information gets one step closer to the root causes of what we are trying to predict. In the minors, a pitcher with a weak fastball can rack up a lot of strikeouts just by finding the strike zone and mixing up his pitches; most of the hitters he is facing aren't much good, so he may as well challenge them. In the major leagues, where the batters are capable of hitting even a ninety-eight-mile-per-hour fastball out of the park, the odds are against the soft-tosser. PECOTA will be fooled by these false positives while a good scout will not be. Conversely, a scout may be able to identify players who have major-league talent but who have yet to harness it.

But statheads can have their biases too. One of the most pernicious ones is to assume that if something cannot easily be quantified, it does not matter. In baseball, for instance, defense has long been much harder to measure than batting or pitching. In the mid-1990s, Beane's Oakland A's teams placed little emphasis on defense, and their outfield was manned by slow and bulky players, like Matt Stairs, who came out of the womb as designated hitters. As analysis of defense advanced, it became apparent that the A's defective defense was costing them as many as eight to ten wins per season [33], effectively taking them out of contention no matter how good their batting statistics were. Beane got the memo, and his more recent and successful teams have had relatively good defenses.

Statistics, indeed, have been a part of the fabric of baseball since the very beginning. The first newspaper box score, which included five categories of statistics for each player (runs, hits, putouts, assists, and errors), was published by Henry Chadwick in 1859 [38], twelve years before the first professional league was established, in 1871. Many of the Moneyball-era debates concerned not whether statistics should be used, but which ones should be taken into account. On-base percentage (OBP), for instance, as analysts like James had been pointing out for years, is more highly correlated with scoring runs (and winning games) than batting average, a finding which long went underappreciated by traditionalists within the industry [39].
...The further you get away from the majors (the more you are trying to predict a player's performance instead of measure it), the less useful statistics are. Statistics at the more advanced minor-league levels, like Double-A and Triple-A, have been shown to be almost as predictive as major-league numbers. But statistics at the lower minor-league levels are less reliable, and the numbers for college or high school players have very little predictive power.

Few professions, however, are as competitive as baseball. Among the thousands of professional baseball players, and the hundreds of thousands of amateurs, only 750 are able to play in the major leagues at any given time, and only a few dozen of those will be All-Stars. Sanders's job is to search for those exceptional individuals who defy the odds. He has to work nearly as hard at his job as the players do, and he is still out on the road almost every day in his late sixties.
But [the scout] Sanders provides the Dodgers with the most valuable kind of information: the kind of information that other people don't have.

As we've seen, baseball players do not become free agents until after six full seasons, which is usually not until they're at least thirty. As Bill James's analysis of the aging curve revealed, this often leads clubs to overspend on free agents; after all, their best years are usually behind them. But there is a flip side to this: before a player is thirty, he can provide tremendous value to his club. Moreover, baseball's economics are structured such that younger players can often be had for pennies on the dollar [42].
If a baseball team is viewed, as with any other business, from a standpoint of profits and losses, almost all the value is created by the scouting and development process. If a team's forecasting system is exceptionally good, perhaps it can pay $10 million a year for a player whose real value is $12 million. But if its scouting is really good, it might be paying the same player just $400,000. That is how you compete in a small market like Oakland.
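
A rough sketch of that profit-and-loss comparison, using only the dollar figures quoted above. The `surplus` helper is a hypothetical name introduced for illustration, and treating surplus as real value minus salary is an assumption about how the passage frames it.

```python
def surplus(real_value: int, salary: int) -> int:
    """Hypothetical helper: value a club gains beyond what it pays the player."""
    return real_value - salary

real_value = 12_000_000          # the player's real value, per the passage
free_agent_salary = 10_000_000   # price paid when forecasting is exceptionally good
young_player_salary = 400_000    # price paid when scouting finds the player early

forecasting_surplus = surplus(real_value, free_agent_salary)
scouting_surplus = surplus(real_value, young_player_salary)

print(f"forecasting surplus: ${forecasting_surplus:,}")
print(f"scouting surplus:    ${scouting_surplus:,}")
```

On these figures the scouted player returns several times the surplus of the well-forecast free agent, which is the gap the passage points to.
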

Indeed, the line between stats and scouting, and qualitative and quantitative information, has become very blurry in the baseball industry. Take, for example, the introduction of Pitch f/x, a system of three-dimensional cameras that have now been installed at every major-league stadium. Pitch f/x can measure not just how fast a pitch travels (that has been possible for years with radar guns) but how much it moves, horizontally and vertically, before reaching the plate. We can now say statistically, for instance, that Zack Greinke, a young pitcher with the Milwaukee Brewers who won the 2009 Cy Young Award as his league's best pitcher, has baseball's best slider [44], or that Mariano Rivera's cut fastball is really as good as reputed [45]. Traditionally, these things were considered to be in the domain of scouting; now they're another variable that can be placed into a projection system.
We're not far from a point where we might have a complete three-dimensional recording of everything that takes place on a baseball field. We'll soon be able to measure exactly how good a jump Jacoby Ellsbury gets on a fly ball hit over his head. We'll know exactly how fast Ichiro Suzuki rounds the bases, or exactly how quickly Yadier Molina gets the ball down to second base when he's trying to throw out an opposing base-stealer.
This new technology will not kill scouting any more than Moneyball did, but it may change its emphasis toward the things that are even harder to quantify and where the information is more exclusive, like a player's mental tools. Smart scouts like Sanders are already ahead of the curve.