Baseball Articles | Index
2006 Predictions -- Keeping Score
By Tom Tippett
When we release our annual Projection Disk in the spring, we give our customers a chance to get a head start on the baseball season. With projected statistics and ratings for over 1600 established big leaguers and top minor-league prospects, plus league schedules, park factors, team rosters, projected pitching rotations, bullpen assignments, lineups and depth charts, the Projection Disk gives them everything they need to play out the new season using the Diamond Mind Baseball simulation game.
It also gives us a chance to get a head start on the season. Ever since we created the first Projection Disk in 1998, we've been publishing our projected standings along with comments on the outlook for all 30 teams. Those projected standings are based on the average of a number of full-season simulations using the Projection Disk.
Of course, nobody really knows what's going to happen when the real season starts, but we're always curious to see how our projected results compare to the real thing. And we're equally interested in seeing how our projections stack up against the predictions made by other leading baseball experts and publications. This article takes a look at those preseason predictions and identifies the folks who were closest to hitting the mark in 2006. And because anyone can get lucky and pick the winners in one season, we also look at how everyone has done over a period of years.
In addition to projecting the order of finish, our simulations provide us with projected win-loss records, projected runs for and against, and the probability that each team will make the postseason by winning its division or grabbing the wild card.
Our favorite measure is standard error, a statistical formula that compares two sets of values -- actual versus predicted -- and determines how closely they match. In this case, we apply the standard error formula to the predicted and actual win totals for each team.
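The article doesn't spell out the exact formula, but a standard way to compute this kind of accuracy measure is a root-mean-square error over the predicted and actual win totals. As an illustrative sketch (the win totals below are hypothetical):

```python
import math

def rms_error(predicted_wins, actual_wins):
    """Root-mean-square error between predicted and actual win totals.

    A generic RMSE sketch; treat it as one reasonable reading of the
    "standard error" measure described in the text, not the exact
    Diamond Mind calculation.
    """
    squared_misses = [(p - a) ** 2 for p, a in zip(predicted_wins, actual_wins)]
    return math.sqrt(sum(squared_misses) / len(squared_misses))

# Hypothetical three-team example: misses of 2, 7, and 4 wins
print(rms_error([95, 88, 70], [97, 81, 74]))  # roughly 4.8 wins per team
```

Because the misses are squared before averaging, one badly blown projection hurts this score more than several small misses, which is part of why it rewards getting every team close.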
Unfortunately, most of the predictions that are published in major newspapers, magazines and web sites don't include wins and losses. Instead, they give the order of finish without indicating which races are expected to be hotly contested and which will be runaways. Some don't even bother to predict the order of finish, settling for the division winners and wild card teams.
As a result, we assign a score to each prediction based solely on order of finish within each division. We borrowed the scoring system from our friend Pete Palmer, co-author of Total Baseball and The Hidden Game of Baseball, who has been projecting team standings for more than 35 years.
Pete's scoring system subtracts each team's actual placement from its projected placement, squares this difference, and adds them up for all the teams. For example, if you predict a team will finish fourth and they finish second, that's a difference of two places. Square the result, and you get four points. Do this for every team and you get a total score. The lower the score, the more accurate your predictions.
We don't try to break ties. If, for example, two teams tie for first, we say that each team finished in 1.5th place for the purposes of figuring out how many places a prediction was off. Suppose a team was projected to finish third and they tied for first instead. That's a difference of 1.5 places. The square of 1.5 is 2.25, so that would be the point total for this team. That's why you'll see some fractional scores in the tables below.
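Putting the last two paragraphs together, the Palmer scoring system can be sketched in a few lines. The team labels here are hypothetical; tied teams are assigned the average of the places they share, as described above:

```python
def palmer_score(predicted_places, actual_places):
    """Sum of squared differences between predicted and actual finish.

    Lower is better. Actual places may be fractional when teams tie
    (e.g. two teams tied for first each count as finishing 1.5th).
    """
    return sum((predicted_places[team] - actual_places[team]) ** 2
               for team in predicted_places)

# Hypothetical five-team division: team A was picked third but tied
# for first, team B was picked first but tied for first (1.5th),
# and team C was picked second but finished third.
predicted = {"A": 3, "B": 1, "C": 2, "D": 4, "E": 5}
actual    = {"A": 1.5, "B": 1.5, "C": 3, "D": 4, "E": 5}
print(palmer_score(predicted, actual))  # 2.25 + 0.25 + 1.0 = 3.5
```

Note that a perfectly predicted division scores zero, and a single four-place miss (16 points) costs more than four one-place misses (4 points).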
Keeping things in perspective
That first year, we created a little database with our projected standings and those of fourteen national publications, and we were pleased to see that we ended the year with the best accuracy score among those fifteen forecasts. When we wrote up the results and posted them to our web site, however, we were very careful not to make any grand claims.
Over time, we expanded our database to include the predictions of prominent baseball writers from major newspapers and other publications. This is easier said than done because some publications and web sites change their approach from year to year. For example, we used to track the predictions of several ESPN.com writers and editors, but they limited their picks to division winners starting in 2003. So the number of entries in our database can rise and fall depending on what the various publications do and whether we were able to find those predictions in our spring survey.
In the sections below, we'll show you how various prognosticators ranked in 2006 and over a period of years, with the period varying in length depending on when we added that person or publication to our database. We don't make any claims of completeness here -- there are lots of other predictions that are not in our database -- but we think you'll find that our sample is an interesting one.
For several reasons, we want to emphasize that nobody should take these rankings too seriously.
First, this isn't the only scoring system one could use to rank these projections, of course. We've already mentioned standard error as a better metric when predicted wins and losses are available. In addition, a fellow named Gerry Hamilton runs an Annual Baseball Predictions (ABP) contest and assigns a score based on how many games each team finished out of their predicted place in the standings.
Second, because of publishing deadlines, the predictions in some spring baseball magazines are made long before spring training starts, others are prepared in early-to-mid March, and some are compiled just before opening day. Obviously, the longer you wait, the more information you have on player movement and injuries.
Third, many newspaper editors ask staff writers to make predictions so their readers have something to chew on for a couple of days. Some writers hate doing them but comply because their editors insist. Some do it even though their main beat is a different sport. Others may make off-the-wall picks just for grins or feel compelled to favor the hometown teams.
Rankings for 2006
For the second year in a row, we're much happier with our projections than you might think given our finish in the rankings below, which are based on the Palmer scoring system.
The standard error of our projected win totals matched our previous best, so our simulation methodology came as close to predicting the real standings as we ever have. That's the most comprehensive measure, so it's the one we pay most attention to, and we were thrilled with how things turned out.
We were also happy to see that our preseason simulations correctly identified five of the six division winners, and it would have been six for six if not for the Padres taking the NL West over the Dodgers on a tie-breaker.
Despite those successes, our Palmer score was dragged down by the Chicago Cubs, who finished second in our simulations but wound up in the basement after an injury-plagued real-life season.
In the ABP contest, the Diamond Mind projections finished 52nd out of 203 entries. For much of the season, we were near the top, but a couple of September meltdowns knocked us down at the end.
Here are the rankings for 2006:
Forecaster  Score
Charlie McCarthy, CBS SportsLine  34
Erin Brown, CBS SportsLine  34
Dallas Morning News  36
ABP consensus  37
ESPN the magazine  39
Scott Miller, CBS SportsLine  40
Chris Snow, Boston Globe  41
Poll of SABR members  43
Tony DeMarco, MSNBC.com  44
ESPN.com staff consensus  44
Jonah Keri, Baseball Prospectus  44
USA Today Sports Weekly  44
Baseball America  45
St. Louis Post-Dispatch  45
Los Angeles Times  46
Eric Mack, CBS SportsLine  47
CBS SportsLine  47
New York Times  47
Washington Post  47
Diamond Mind simulations  48
Baseball Prospectus, PECOTA  48
Sporting News annual, power poll  48
Sporting News magazine, power poll  49
Christina Kahrl, Baseball Prospectus  50
New York Daily News  50
Sports Illustrated  50
Gordon Edes, Boston Globe  51
ESPN.com power rankings  51
David Gonos, CBS SportsLine  51
Eric Karabell, ESPN.com  51
Dayne Perry, Baseball Prospectus  51
Seattle Times  51
Las Vegas over-under line  52
Athlon  53
Baseball Prospectus, consensus vote  54
Dan Fox, Baseball Prospectus  54
Will Carroll, Baseball Prospectus  55
2005 final standings  55
Dan Shaughnessy, Boston Globe  56
Joe Sheehan, Baseball Prospectus  56
Jackie McMullen, Boston Globe  57
Steve Goldman, Baseball Prospectus  58
Thomas Gorman, Baseball Prospectus  58
Jay Jaffe, Baseball Prospectus  58
MLB Yearbook  63
Street & Smith  63
Keith Woolner, Baseball Prospectus  64
Kevin Goldstein, Baseball Prospectus  66
Baseball Digest  67
Chicago Tribune  67
Team payroll (per USA Today)  68
Pete Palmer  70
Associated Press  73
Miami Herald  73
San Francisco Chronicle  73
Lindy's  75
Nate Silver, Baseball Prospectus  75
Bob Ryan, Boston Globe  80
Spring training results  121.5

The "Diamond Mind simulations" entry represents the average result of simulating the season 100 times. These simulations were done a couple of weeks before the season started.
There are a few other entries in this list that don't represent the views of an individual writer, analyst, or publication. If you predicted that the 2006 standings would be the same as in 2005, your score would have been 55. If you put together a set of standings based on the Las Vegas over-under line, you'd have racked up 52 points. If you thought the teams would finish in order from highest to lowest payroll, your score would have been 68.
The notion of strength in numbers has plenty of support here. For the second year in a row, the consensus of the participants in the Annual Baseball Predictions contest and the poll of SABR members both ranked very high.
And if you predicted that the regular season standings would match the 2006 spring training standings, your score would have been 121.5. As is the case every year, the spring training results were almost useless as a predictor of the real season.
Reviewing the season
Much more interesting than the rankings are the reasons behind the rankings. If we (or others) missed badly on a team, was that because the projection was suspect or because events (such as a rash of injuries or major trade-deadline moves) put a very different team on the field? Which teams were lucky or unlucky to finish where they did? If a team was a big surprise, were the indicators there to be seen back in March, or did they really come out of nowhere?
Let's take a tour through the divisions to see what we can learn.
AL East
Before the season, the expected order of finish was NY-Bos-Tor-Bal-TB, with about 70% of the predictions having the Yankees in first, one brave soul (Bob Ryan of the Boston Globe) picking the Blue Jays, and everyone else picking the Red Sox over New York. Thirteen predictions had Toronto ahead of Boston, but most thought the Jays hadn't improved enough to crack the top two. Two people actually had Toronto in fourth, behind Baltimore. About 70% had the Orioles fourth, with the other 30% believing they would finish behind Tampa Bay. Nobody picked Tampa Bay to finish higher than fourth.
In reality, the Yankees overcame some first-half adversity before cruising to another title, the Jays squeaked past the fading Red Sox to grab second place, and Tampa Bay fell apart, losing more games than any other team in baseball. Six of the predictions -- Scott Miller and Charlie McCarthy of CBS Sportsline, Lindy's, the New York Daily News, the Seattle Times, and the Sporting News annual -- nailed the division from top to bottom.
AL Central
For the second year in a row, this was a most interesting division. Our simulations had Minnesota as the favorite by a nose over Cleveland and Chicago, with Detroit close enough to be considered a legitimate dark horse candidate.
Only five others picked the Twins to win this race, with the vast majority of the mainstream media going with the defending champion White Sox and many in the sabermetrics crowd liking the Indians, who were the best team in baseball in 2005 by many statistical measures. Believe it or not, three of the forecasts had the Tigers in last place, behind the Royals. The consensus of the preseason picks expected a Chi-Cle-Min-Det-KC finish.
As it turned out, the Tigers had a terrific season. Even though our simulations put the Tigers in fourth place in what has become a most competitive division, we could see the makings of a big leap, and we said as much in our March write-up.
Of course, they were even better than that. They owned the best record in baseball for much of the season, and now they're playing in the World Series.
Now that I've seen how things turned out, I'm kicking myself for making some last minute adjustments to our projections. For much of the winter, we were seeing the Tigers finish a couple of games ahead of the White Sox in our simulations. We wondered why, and when we took a closer look, we decided to adjust the projections of three White Sox players.
In all three cases, the player had experienced a dramatic improvement in performance in 2005, and in every case, we could point to a tangible reason why more weight should be placed on their recent numbers and less on their (much weaker) earlier level of performance. That was enough to push the White Sox ahead of the Tigers for the simulation runs that we wrote about in March, and we felt very comfortable with those results at the time. Six months later, we now see that (a) all three of these players performed more in line with our original projections than the adjusted ones, (b) the Tigers did indeed finish ahead of the White Sox by a few games, and (c) we would have looked a little smarter had we trusted the original projections instead of looking for reasons to tweak them.
The Indians remain a mystery to me. We projected them to outscore their opponents by 91 runs. That run margin is normally good enough to produce 91 wins, but the team averaged only 88 in our simulations, perhaps pointing to a subtle weakness in the construction of the team. In the real 2006 season, Cleveland outscored its opponents by 88 runs, fourth-best in the AL and only three runs behind our projection. And yet, because of a second consecutive year of poor results in close games, the Indians finished with 12 fewer wins than expected given their run margin. If ever there was a young, talented team that needs to "learn how to win", this appears to be it.
Kudos to Joe Sheehan of Baseball Prospectus and Charlie McCarthy of CBS Sportsline for posting the best Palmer scores for this division.
AL West
Before the season, the AL West consensus was Oak-LA-Tex-Sea, but it was a close call at the top. Oakland had almost twice as many first-place votes as LA, but four forecasters had the A's in third place and two even had them in the basement. A very small minority (including us) had Texas second, but most put them third with Seattle in the cellar.
Once again, the Angels were the surprise of the division, though it wasn't enough to put them in the playoffs this time. In our simulations, they were offensively challenged, to say the least, and the reality wasn't much better. LA finished 11th in the league in scoring and outscored their opponents by only 34 runs. You don't often make a serious postseason run with a scoring difference like that.
But that, plus their ability to squeeze four more wins than usual out of that run margin, was enough to put the Angels comfortably in second place and give the Athletics a late-season scare. Oakland, for their part, fell far short of our projections for run margin but made up most of the difference by winning 7 more games than the pythagorean method would indicate. Statistically, Oakland and LA were pretty much the same, but the wins went Oakland's way for a change.
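The pythagorean method referenced here estimates a team's expected winning percentage from its runs scored and allowed. A common sketch uses Bill James's original exponent of 2, though analysts often prefer values near 1.83; the run totals in the example are hypothetical:

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2):
    """Expected wins implied by a team's run totals.

    Uses the classic pythagorean formula. The exponent of 2 is the
    traditional choice; lower exponents (around 1.83) are also common.
    """
    pct = runs_scored ** exponent / (
        runs_scored ** exponent + runs_allowed ** exponent)
    return pct * games

# Hypothetical team that outscores its opponents 800 to 700
print(round(pythagorean_wins(800, 700)))  # -> 92
```

Teams that beat this estimate by several games, as Oakland did in 2006, have usually been unusually good (or lucky) in close games, which is why such gaps tend not to persist from year to year.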
It's much easier to predict a four-team race than a five- or six-team race, so a number of forecasters -- too many to mention -- were dead on with the order of finish in this division.
NL East
Can you believe it? For once, the Braves didn't defy all indications to the contrary and take home another division title. This time, they underachieved relative to their run margin. We projected 85 wins on the basis of a +29 run margin. In the real season, their run margin was actually +44, yet they fell six games short of their pythagorean record and six games short of those 85 wins.
Back in March, few people were willing to bet against Atlanta, which was once again the consensus pick to continue their division-winning streak, though the Mets got a lot of support, too. Again, there was a split between the mainstream, which didn't seem willing to pick anyone other than Atlanta until the Braves actually failed, and the sabermetric community, which seemed to put more weight on the Braves' flaws. Overall, the forecasters saw an Atl-NY-Phi-Was-Flo finish, with the vast majority (but not everyone) picking the Marlins last and only one person (Jonah Keri of Baseball Prospectus) picking the Phillies to win the division.
We did pick the Mets to win this division, but only by a hair over Philly and Atlanta. The difference between that narrow victory and the real-life runaway was mostly on the offensive side of the ball, where Carlos Beltran bounced back from a disappointing 2005 campaign and Jose Reyes had his breakout season.
The Florida Marlins were the other big story in the division. Like most everyone else, we picked them to finish last, though there were times when they edged past the Nationals to grab fourth place in our simulations.
The Mets made sure the Marlins were never in the division race, but Florida did hang around the wildcard race for a long time when several young players -- Hanley Ramirez, Dan Uggla, Anibal Sanchez, and Josh Johnson in particular -- began performing at a high level from the moment they stepped on the big stage.
NL Central
In our spring predictions roundup, the Cardinals came closest to being a unanimous pick to win their division, falling only two votes short of perfection. Lindy's picked the Cubs first and St. Louis second, while Keith Woolner of Baseball Prospectus went with Milwaukee and Houston in the top two spots. The next three spots were closely bunched, with the Astros having a slight edge over the Brewers and the Cubs. The Reds were picked by a majority to finish last, but Pittsburgh got their share of last-place picks as well. As a result, this group saw a StL-Hou-Mil-Chi-Pit-Cin finish.
The Cubs were our big miss of the 2006 projection season, mainly because we made the optimistic assumption that Kerry Wood and Mark Prior would combine for 53 starts and 309 innings. In reality, they contributed only 13 starts and 64 innings. When they did pitch, Wood matched his projected level of performance, but Prior's ERA was north of seven.
Those two injuries, plus the one that sidelined slugger Derrek Lee for 2/3 of the year and reduced his power when he did play, cost the Cubs over 100 runs relative to our projections. Statistically, the real-life Cubs were a hair better than the Pirates, and had they been able to carry that edge to a fifth-place finish, our Palmer score would have dropped to 40, good enough for the top ten. But it wasn't to be, in part because they finished in a virtual tie with Atlanta for the league's worst winning percentage in one-run games.
Should we have seen this coming? The Lee injury came out of nowhere, but we probably should have been more cautious about Prior and Wood. Injured pitchers often fail to come all of the way back.
In our simulations, Milwaukee edged Houston by an average of one game, and so we picked the Brewers third and the Astros fourth. At the time, we had mixed feelings about these results. On the one hand, we had no idea where Roger Clemens would choose to play, or for how long. But we knew that if he chose to stay in Houston, that would be enough to push the Astros ahead of the Brewers. On the other hand, lots of smart people were picking the Brewers to make a big move on the division leaders, and we weren't, so we wondered whether we were missing something.
As it turned out, the Astros did get Roger back. He made 19 starts for Houston and performed exactly at the level we projected. Had we known in March that things would happen this way, our projection for the Astros would most likely have been 81 wins, not 78. The real-life Astros won 82 games, so we didn't miss by much.
One of the hallmarks of our projection methodology is the ability to identify overrated teams. When a real-life team overachieves for one reason or another, many in the mainstream media see that level of performance as real. As a result, they often assume they'll continue to perform at that level or improve from there. Given that Houston was coming off a World Series appearance, many saw them as serious contenders for the division title. In contrast, we saw them as a .500 team that happened to have a nice run in 2005, and that's pretty much how it turned out.
Bolstered by an improbable series of late-inning rallies, Cincinnati held second place for much of the season and was still in the race in the season's final week. The Reds won only three more games than we projected, and you can chalk that up to their strong record (27-20) in one-run games, something that is nearly impossible to project. Erin Brown of CBS Sportsline was the only person to pick the Reds higher than fifth. In fact, if the Cubs had finished ahead of Pittsburgh, she would have nailed the entire division.
In terms of our Palmer score, this division killed us, accounting for 26 of our 48 points. We were off by four places on the Cubs and two each on the Reds and Astros.
NL West
In the NL West, things appeared to be wide open before the year began. Four of the five teams were picked to win the division by at least one prognosticator. The Rockies were picked last by most, fourth by most of the rest, third once, and second once. In fact, every team was picked in every spot at least once, with only the Dodgers (never picked last) and the Rockies (never picked first) as exceptions. Overall, the consensus was LA-SF-SD-Ari-Col, which is how the teams finished in our simulations.
It's tempting to say that things worked out almost as the consensus anticipated, but that wouldn't be quite right. For one thing, the Padres were 11 games better than we projected them to be, and while roughly 1/3 of the projections had San Diego in first or second, the majority expected them to finish further down.
More importantly, the Giants were very lucky to finish third. They were only a half-game ahead of Arizona and Colorado, and could easily have dropped into a tie with those teams had it been necessary for them to travel to St. Louis to make up the season's only unplayed game. Furthermore, they had the division's worst run margin, an indication that they deserved to finish in the cellar.
Baseball America, Dayne Perry of Baseball Prospectus, and Gordon Edes and Chris Snow of the Boston Globe were closest to picking this division perfectly, but Chris gets the honors because he was the only one of the four who picked the Padres to finish on top.
Partly because of luck, and partly because of quirks in the scoring system, one season isn't nearly enough to tell you who has the best approach to predicting the coming season, so we'll devote the rest of this article to ranking the predictions on a multi-year basis.
Here are the rankings for those who were included in our sample every year. In the past, we have presented the year-by-year scores, but now that we're up to nine years of history, we're running out of room for that. If you're interested in the details, you can look up that information in past editions of this article, all of which are archived on the web site. Starting this year, we'll show only the nine-year average for each forecaster.
Forecaster  Avg
Las Vegas over-under line  47.0
In 1999, we added some writers from the Boston Globe.
Forecaster  Avg
Gordon Edes, Boston Globe  45.6
Las Vegas over-under line  46.4
The Diamond Mind simulations missed the mark by quite a bit in 2000. We added a new concept to our projection system that year, but we were unhappy with the results, and we took that out of the model before generating our projections in 2001. The results have been much better since. As you can see, the Las Vegas over-under line has been getting much better in recent years.
Forecaster  Avg
Las Vegas over-under line  46.1
Sporting News  46.8
Sports Illustrated  47.1
Gordon Edes, Boston Globe  48.1
Diamond Mind simulations  48.2
USA Today Sports Weekly  48.4
Baseball America  48.6
Athlon  49.4
Previous season standings  51.8
Dan Shaughnessy, Globe  57.4
Street & Smith  57.8
Pete Palmer  58.6
Bob Ryan, Boston Globe  61.4
Payroll ranking  69.3
MSNBC, Lindy's, the LA Times, and the Spring Training standings were added in 2001.
Forecaster  Avg
Tony DeMarco, MSNBC.com  42.8
Diamond Mind simulations  44.9
Las Vegas over-under line  45.3
USA Today Sports Weekly  46.8
Baseball America  47.8
Sports Illustrated  48.3
Sporting News  48.3
Lindy's  48.8
Los Angeles Times  49.3
Athlon  50.6
Previous season standings  51.1
Gordon Edes, Boston Globe  51.8
Street & Smith  57.8
Dan Shaughnessy, Globe  57.9
Pete Palmer  59.4
Bob Ryan, Boston Globe  61.9
Payroll ranking  66.2
Spring training results  107.6
Here's how things looked from 2002 to 2006.
Forecaster  Avg
Tony DeMarco, MSNBC.com  37.8
Las Vegas over-under line  41.2
Here's how things have looked from 2003 to 2006.
Forecaster  Avg
Tony DeMarco, MSNBC.com  38.8
Las Vegas over-under line  40.0
Here's how things have looked from 2004 to 2006.
Forecaster  Avg
ABP consensus  39.0
Tony DeMarco, MSNBC.com  40.3
New York Times  40.7
Las Vegas over-under line  43.3
SABR poll  44.3
ESPN.com  46.3
Sporting News  46.3
Dallas Morning News  46.7
ESPN the magazine  48.3
Diamond Mind simulations  49.0
Baseball Prospectus  49.7
Eric Mack, Sportsline  49.7
Previous season standings  50.7
USA Today Sports Weekly  51.3
Sports Illustrated  51.7
Baseball America  52.0
Jonah Keri, BP  52.0
Scott Miller, Sportsline  52.3
Los Angeles Times  53.3
Athlon  54.0
Gordon Edes, Boston Globe  56.0
Seattle Times  56.0
CBS Sportsline  56.3
Joe Sheehan, BP  56.7
Finally, here's how things have looked over the past two years.
It seems odd to see the Diamond Mind simulations below the midpoint in this table, partly because we've ranked near the top in all other multi-year periods, but mostly because our standard error (which measures how close we came to projecting the actual win totals for all thirty teams) has been better in this two-year period than in any other similar span since we started doing this in 1998.
That's quite a disconnect. According to the best tool for measuring such things, we've just had our best two-year run ever. Our long-term track record is extremely good, and the last two years have been better than our long-term average by a noticeable amount, so how is it that our Palmer scores haven't kept pace?
Some of this is an outgrowth of the recent increase in parity. In each of the last two years, we've projected several groups of teams to be clustered within a few games of each other. When that happens, one can be very close on the wins and still miss the standings by a place or two.
If you pick a team to finish last with 75 wins and they actually finish last with 55 wins, your Palmer score is great but your standard error is awful because you missed by 20 games. If you pick a team to finish fourth with 78 wins and they finish second with 82 wins, your Palmer score suffers but your standard error doesn't take much of a hit. I'll take the latter outcome any day of the week, and that's how things have gone for us the last couple of years.
Forecaster  Avg
ABP consensus  33.5
ESPN the magazine  35.5
Dallas Morning News  36.0
Tony DeMarco, MSNBC.com  40.5
Sporting News annual, power poll  40.5
ESPN.com staff consensus  41.5
Baseball Prospectus, PECOTA  41.5
New York Daily News  42.0
Scott Miller, CBS SportsLine  42.5
Poll of SABR members  42.5
Los Angeles Times  43.0
Chris Snow, Boston Globe  44.0
Jackie McMullen, Boston Globe  44.0
USA Today Sports Weekly  44.0
Jonah Keri, Baseball Prospectus  45.0
New York Times  46.0
Dayne Perry, Baseball Prospectus  47.0
St. Louis Post-Dispatch  47.0
Washington Post  47.0
Sports Illustrated  47.5
Baseball America  48.0
Gordon Edes, Boston Globe  48.0
Baseball Prospectus, consensus vote  48.5
CBS SportsLine  48.5
Will Carroll, Baseball Prospectus  48.5
Las Vegas over-under line  48.8
Jay Jaffe, Baseball Prospectus  49.0
Seattle Times  50.0
Eric Mack, CBS SportsLine  50.5
2005 final standings  52.0
Diamond Mind simulations  52.5
Christina Kahrl, Baseball Prospectus  54.5
Street & Smith  55.0
Pete Palmer  56.0
Bob Ryan, Boston Globe  56.5
Athlon  57.0
Baseball Digest  57.5
Lindy's  61.0
San Francisco Chronicle  61.5
Keith Woolner, Baseball Prospectus  62.0
Dan Shaughnessy, Boston Globe  62.5
Team payroll (per USA Today)  62.5
Chicago Tribune  63.8
Joe Sheehan, Baseball Prospectus  64.0
Nate Silver, Baseball Prospectus  66.0
Spring training results  121.0