2004 Predictions -- Keeping Score
By Tom Tippett
When we release our annual Projection Disk in the spring, we give our customers a chance to get a head start on the baseball season. With projected statistics and ratings for over 1600 established big leaguers and top minor-league prospects, plus league schedules, park factors, team rosters, projected pitching rotations, bullpen assignments, lineups and depth charts, the Projection Disk gives them everything they need to play out the new season using the Diamond Mind Baseball simulation game.
It also gives us a chance to get a head start on the season. Ever since we created the first Projection Disk in 1998, we've been publishing our projected standings along with comments on the outlook for all 30 teams. Those projected standings are based on the average of a number of full-season simulations using the Projection Disk.
Of course, nobody really knows what's going to happen when the real season starts, but we're always curious to see how our projected results compare to the real thing. And we're equally interested in seeing how our projections stack up against the predictions made by other leading baseball experts and publications. This article takes a look at those preseason predictions and identifies the folks who were closest to hitting the mark in 2004. And because anyone can get lucky and pick the winners in one season, we also look at how everyone has done over a period of years.
In addition to projecting the order of finish, our simulations provide us with projected win-loss records, projected runs for and against, and the probability that each team will make the postseason by winning its division or grabbing the wild card.
Unfortunately, most of the predictions that are published in major newspapers, magazines and web sites don't include projected win-loss records. Instead, they give the projected order of finish without indicating which races are expected to be hotly contested and which will be runaways. Some don't even bother to predict the order of finish, but settle instead for the division winners and wild card teams.
As a result, we do our best to assign a meaningful score to each prediction based solely on order of finish within each division. We borrowed the scoring system from our friend Pete Palmer, co-author of Total Baseball and The Hidden Game of Baseball, who has been projecting team standings for more than 35 years.
Pete's scoring system takes the difference between each team's projected and actual placement, squares that difference, and sums the squares for all the teams. For example, if you predict a team will finish fourth and they finish second, that's a difference of two places. Square the result, and you get four points. Do this for every team and you get a total score. The lower the score, the more accurate your predictions.
We don't try to break ties. If, for example, two teams tie for first, we say that each team finished in 1.5th place for the purposes of figuring out how many places a prediction was off. Suppose a team was projected to finish third and they tied for first instead. That's a difference of 1.5 places. The square of 1.5 is 2.25, so that would be the point total for this team. That's why you'll see some fractional scores in the tables below.
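The scoring rule and the tie-handling described above can be sketched in a few lines of Python. This is our own illustration, not code from Pete Palmer or Diamond Mind, and the team labels and win totals are made up:

```python
def placements(standings):
    """Map each team to its place in the actual standings.

    `standings` is a list of (team, wins) pairs for one division.
    Tied teams share the average of the places they occupy, so two
    teams tied for first each finish in "1.5th place".
    """
    ordered = sorted(standings, key=lambda tw: tw[1], reverse=True)
    places = {}
    i = 0
    while i < len(ordered):
        # Find the run of teams tied at this win total
        j = i
        while j < len(ordered) and ordered[j][1] == ordered[i][1]:
            j += 1
        avg_place = (i + 1 + j) / 2  # average of places i+1 .. j
        for team, _ in ordered[i:j]:
            places[team] = avg_place
        i = j
    return places

def score(predicted_order, actual_standings):
    """Sum of squared differences between predicted and actual places."""
    actual = placements(actual_standings)
    total = 0.0
    for place, team in enumerate(predicted_order, start=1):
        total += (place - actual[team]) ** 2
    return total

# A team picked third that ties for first costs (3 - 1.5)^2 = 2.25 points
actual = [("A", 95), ("B", 95), ("C", 88), ("D", 80), ("E", 70)]
print(score(["B", "C", "A", "D", "E"], actual))  # → 3.5
```

A full accuracy score is just this total summed over all six divisions.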
Keeping things in perspective
That first year, we created a little database with our projected standings and those of fourteen national publications, and we were pleased to see that we ended the year with the best accuracy score among those fifteen forecasts. When we wrote up the results and posted them to our web site, however, we were very careful not to make any grand claims.
Over time, we expanded our database to include the predictions of prominent baseball writers from major newspapers and other publications. This is easier said than done because some publications and web sites change their approach from year to year. For example, we used to track the predictions of several ESPN.com writers and editors, but they limited their picks to division winners in 2003. So the number of entries in our database can rise and fall depending on what the various publications do and whether we were able to find those predictions in our spring survey.
In the sections below, we'll show you how various prognosticators ranked in 2004 and over a period of years, with the period varying in length depending on when we added that person or publication to our database. We don't make any claims of completeness here -- there are lots of other predictions that are not in our database -- but we think you'll find that our sample is an interesting one.
For several reasons, we want to emphasize that nobody should take these rankings too seriously.
First, this isn't the only scoring system one could use to rank these predictions, of course. A fellow named Gerry Hamilton runs a predictions contest every year (see http://www.tidepool.com/~ggh1/index.html) and assigns a score based on how many games each team finished out of its predicted place in the standings. (We placed 22nd out of 195 predictions in his 2004 contest after finishing 4th in 2003.)
Second, because of publishing deadlines, the predictions in some spring baseball magazines are made long before spring training starts, others are prepared in early-to-mid March, and some are compiled just before opening day. Obviously, the longer you wait, the more information you have on player movement and injuries.
Third, many newspaper editors ask staff writers to make predictions so their readers have something to chew on for a couple of days. Some writers hate doing them but comply because their editors insist. Some do it even though their main beat is a different sport. Others may make off-the-wall picks just for grins or feel compelled to favor the hometown teams.
Rankings for 2004
It's interesting to see how everyone did this year, but it's even more interesting to look back to see how different people perceived the baseball world before the season started. We'll start by showing you the prediction rankings for the current season, then we'll follow that up with a review of each division race and how those races affected these rankings.
Forecaster                                Score
New York Times                               30
Las Vegas over-under line                  32.5
Tony DeMarco, MSNBC.com                      40
Diamond Mind simulations                     42
Bob Hohler, Boston Globe                     42
Joe Sheehan, Baseball Prospectus             42
Michael Wolverton, Baseball Prospectus       42
David Lipman, ESPN.com                       44
Michael Holley, Boston Globe                 46
Gary Huckabay, Baseball Prospectus           46
Team payroll (per USA Today)                 46
Poll of SABR members                         48
Athlon                                       48
Eric Mack, CBS SportsLine                    48
2003 final standings                         48
MLB Yearbook                                 50
Baseball Prospectus                          52
Nate Silver, Baseball Prospectus             52
Lindy's                                      52
Dan Shaughnessy, Boston Globe                52
ESPN.com power rankings                      56
Phil Rogers, ESPN.com                        56
Steve Mann                                   56
The Sporting News (Ken Rosenthal)            58
Rany Jazayerli, Baseball Prospectus          58
Charley McCarthey, CBS SportsLine            58
Baseball America                             60
Sports Illustrated                           60
Spring Training Yearbook                     60
Tristan Cockroft, CBS SportsLine             60
USA Today                                  61.5
Street & Smith                               62
Chris Kahrl, Baseball Prospectus             62
Miami Herald                                 64
Derek Zumsteg, Baseball Prospectus           64
USA Today Sports Weekly                      66
Jonah Keri, Baseball Prospectus              66
Pete Palmer                                  68
Dallas Morning News                          68
Seattle Times                                68
CBS SportsLine                               72
Gordon Edes, Boston Globe                    72
Scott Miller, CBS SportsLine                 72
ESPN the magazine (Peter Gammons)            74
Los Angeles Times                            74
Bob Ryan, Boston Globe                       76
Adam Reich, CBS SportsLine                   80
Spring training results                     134
The "Diamond Mind simulations" entry is the one representing the average result of simulating the season 100 times. These simulations were done about three weeks before the season started.
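Averaging many simulated seasons into one projected order of finish can be sketched as follows. This is a hypothetical illustration: `simulate_season` is a crude stand-in for a full game engine like Diamond Mind's (it ignores schedules and opponents), and the team strengths are invented:

```python
import random

def simulate_season(strengths, games=162, rng=random):
    """Return simulated win totals for one season.

    Each team's wins are drawn game by game from its underlying
    strength (a winning-percentage guess) -- a deliberately crude
    stand-in for a real game-level simulation.
    """
    return {team: sum(rng.random() < p for _ in range(games))
            for team, p in strengths.items()}

def projected_standings(strengths, n_sims=100, seed=42):
    """Average wins over n_sims seasons, sorted into an order of finish."""
    rng = random.Random(seed)
    totals = {team: 0 for team in strengths}
    for _ in range(n_sims):
        wins = simulate_season(strengths, rng=rng)
        for team, w in wins.items():
            totals[team] += w
    return sorted(((t, totals[t] / n_sims) for t in totals),
                  key=lambda tw: tw[1], reverse=True)

# Invented strengths for one five-team division
strengths = {"NYY": 0.62, "BOS": 0.60, "TOR": 0.50, "BAL": 0.48, "TB": 0.40}
for team, avg_wins in projected_standings(strengths):
    print(f"{team}: {avg_wins:.1f}")
```

Averaging over 100 seasons smooths out single-season luck, which is why close races (like Seattle and Anaheim below) can come down to a fraction of a win.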
There are a few other entries in this list that don't represent the views of a writer or a publication. If you predicted that the 2004 standings would be the same as in 2003, your score would have been 48. If you put together a set of standings based on the Las Vegas over-under line, you'd have racked up an impressively low total of 32.5 points. If you thought the teams would finish in order from highest to lowest payroll, your score would have been 46.
And if you predicted that the regular season standings would match the 2004 spring training standings, your score would have been 134. In other words, the spring training results were almost useless as a predictor of the real season, and that's been true for at least the past four years.
Reviewing the divisions
Much more interesting than the overall scores, in our opinion, are the details. Which teams were consistently under- or over-estimated? Which divisions contained the biggest surprises? Did anyone predict that certain teams would have a sudden change of fortune?
Leaving out the entries that don't represent writers or publications, here are some observations about how the others saw things last spring:
AL East. Everyone had either New York or Boston winning the division, with the Yankees being picked first four more times than the Red Sox. Other than Gary Huckabay, who picked Toronto second and Boston third, everyone had this as a two-team race. A good number of people picked Baltimore third ahead of Toronto, but four people picked the Orioles to finish last, too, so there was no clear consensus on the Orioles.
AL Central. The Kansas City Royals were the downfall for many this year. The young Royals led the division for much of the 2003 season before fading down the stretch, then added some veteran players during the winter. As a result, they were a trendy pick to win the division or finish second behind Minnesota. A good part of the reason our score was among the leaders in 2004 is that we identified the Royals as one of the teams most likely to disappoint. That was based largely on our simulation results, but also on the fact that the 2003 Royals didn't have the statistical foundation to justify their high placement. Surprisingly, seven predictions had Detroit finishing fourth, in every case because they thought the Indians would be even worse.
AL West. A year ago, our score was significantly improved because we chose to rank the Mariners ahead of the Angels when those two teams finished in a virtual tie for second in our simulations. This year, those teams were again neck and neck, with the Mariners averaging one more win but the Angels having a slightly better run margin. In a decision we'd love to have back, we gave the nod to Seattle. More than twice as many people chose Anaheim to win the division over Oakland, with three choosing the Mariners for first place. Everyone picked the Rangers to finish last, meaning that nobody in our survey got this division (or any other division) correct from top to bottom.
NL East. Before the season, the Phillies appeared to be loaded with talent, the Marlins were shedding payroll after winning the World Series, and the Braves seemed quite vulnerable. All three teams were selected by at least one person to win the division, with Philly being the choice about 80% of the time. Most predictions had a clear separation between the top three and the bottom two, but Montreal (five times) and New York (three times) snuck into third place on a few lists.
NL Central. Only two entries (Diamond Mind and Steve Mann) had the Cardinals finishing first in this division. The others seemed caught up in the hype surrounding the Cubs' young pitching (Prior, Wood, Zambrano) and the Astros' older pitching (Clemens, Pettitte). Just about every prediction had the Cubs and Astros duking it out for first with the Cardinals third. The picks for first place were almost evenly split between Chicago and Houston, with the Cubs having a very slight edge. There was some variation in the order of the bottom three teams, but nobody picked any of them to finish in the top half of the division.
NL West. Picking the Dodgers to finish at or near the top was a key to the better-scoring predictions this year, as was picking against the Diamondbacks. We were among those who thought Arizona would finish ahead of Los Angeles, but we were not alone. Approximately 2/3 of the predictions had Arizona beating the Dodgers, with thirteen people picking the D'backs to win the division outright. (In an example of the importance of timing, Arizona finished one game ahead of the Dodgers in our simulations, but the teams would have been reversed had we run them again after Milton Bradley was traded to LA.) It's clear that many people thought this division was wide open, as four of the five teams (everyone but the Rockies) were picked to finish first at least once.
Summing up. For the first time ever, not a single division was nailed by even a single predictor. Certain teams surprised a lot of people by overachieving (Texas, Los Angeles) or falling short (Arizona, Seattle, Kansas City, Toronto). As a result, the prediction scores were much higher this year than in 2003. A year ago, things went more in accordance with expectations.
Here are the rankings for those who were included in our sample every year. There's a new entry this year. We went back and ranked all of the teams based on their payroll as reported in USA Today in April, and we computed a standings score based on the "prediction" that teams would finish in order from highest to lowest payroll. As you can see, that doesn't seem to be a very good predictor.
Forecaster              2004   2003   2002   2001   2000   1999   1998  Total
Diamond Mind            42.0   28.0   40.0   54.5   68.0   42.0   44.5  319.0
Las Vegas over-under    32.5   30.0   46.0   65.5   51.5   48.0   52.0  325.5
Sports Illustrated      60.0   30.0   48.0   56.5   40.0   56.0   54.0  344.5
Steve Mann              56.0   48.0   60.0   38.5   58.0   54.0   44.0  358.5
Sports Weekly           66.0   38.0   42.0   46.5   58.0   51.5   60.0  362.0
Athlon                  48.0   36.0   38.0   67.5   42.0   72.0   72.0  375.5
Sporting News           58.0   44.0   54.0   52.5   38.0   78.0   54.0  378.5
Pete Palmer             68.0   56.0   50.0   70.5   54.0   40.0   58.0  396.5
Street & Smith          62.0   36.0   70.0   68.5   58.0   68.0   64.0  426.5
Previous season         48.0   42.0   48.0   64.5   56.0   70.0  100.0  428.5
Payroll ranking         46.0   64.0  102.0   60.0   88.0   72.0   44.0  476.0
In 1999, we added some writers from the Boston Globe.
Forecaster                   2004   2003   2002   2001   2000   1999  Total
Gordon Edes, Boston Globe    52.0   32.0   54.0   56.5   26.0   28.0  248.5
Las Vegas over-under line    32.5   30.0   46.0   65.5   51.5   48.0  273.5
Diamond Mind simulations     42.0   28.0   40.0   54.5   68.0   42.0  274.5
Sports Illustrated           60.0   30.0   48.0   56.5   40.0   56.0  290.5
USA Today Sports Weekly      66.0   38.0   42.0   46.5   58.0   51.5  302.0
Athlon                       48.0   36.0   38.0   67.5   42.0   72.0  303.5
Baseball America             60.0   28.0   48.0   54.5   54.0   70.0  314.5
Steve Mann                   56.0   48.0   60.0   38.5   58.0   54.0  314.5
Sporting News                58.0   44.0   54.0   52.5   38.0   78.0  324.5
Previous season standings    48.0   42.0   48.0   64.5   56.0   70.0  328.5
Dan Shaughnessy, Globe       52.0   56.0   70.0   44.5   54.0   58.0  334.5
Pete Palmer                  68.0   56.0   50.0   70.5   54.0   40.0  338.5
Bob Ryan, Boston Globe       76.0   40.0   58.0   84.5   58.0   40.0  356.5
Street & Smith               62.0   36.0   70.0   68.5   58.0   68.0  362.5
Payroll ranking              46.0   64.0  102.0   60.0   88.0   72.0  432.0
The Diamond Mind simulations missed the mark by quite a bit in 2000. We added a new concept to our projection system that year, but we were unhappy with the results, and we took that out of the model before generating our projections in 2001. The results have been much better since. As you can see, the Las Vegas over-under line has been getting much better in recent years.
Forecaster                   2004   2003   2002   2001   2000  Total
Las Vegas over-under line    32.5   30.0   46.0   65.5   51.5  225.5
Athlon                       48.0   36.0   38.0   67.5   42.0  231.5
Diamond Mind simulations     42.0   28.0   40.0   54.5   68.0  232.5
Sports Illustrated           60.0   30.0   48.0   56.5   40.0  234.5
Gordon Edes, Boston Globe    72.0   32.0   54.0   56.5   26.0  240.5
Baseball America             60.0   28.0   48.0   54.5   54.0  244.5
Sporting News                58.0   44.0   54.0   52.5   38.0  246.5
Previous season standings    48.0   42.0   48.0   64.5   56.0  248.5
USA Today Sports Weekly      66.0   38.0   42.0   46.5   58.0  250.5
Steve Mann                   56.0   48.0   60.0   38.5   58.0  260.5
Dan Shaughnessy, Globe       52.0   56.0   70.0   44.5   54.0  276.5
Street & Smith               62.0   36.0   70.0   68.5   58.0  294.5
Pete Palmer                  68.0   56.0   50.0   70.5   54.0  298.5
Bob Ryan, Boston Globe       76.0   40.0   58.0   84.5   58.0  316.5
Payroll ranking              46.0   64.0  102.0   60.0   88.0  360.0
Lindy's was a strong addition to our survey in 2001. We also added the San Francisco Chronicle that year, but they've been dropped from this list because we couldn't find their 2004 predictions. That paper ranked second from 2001 to 2003.
Forecaster                   2004   2003   2002   2001  Total
Diamond Mind simulations     42.0   28.0   40.0   54.5  164.5
Lindy's                      52.0   40.0   42.0   36.5  170.5
Las Vegas over-under line    32.5   30.0   46.0   65.5  174.0
Tony DeMarco, MSNBC.com      40.0   34.0   34.0   67.5  175.5
Athlon                       48.0   36.0   38.0   67.5  189.5
Baseball America             60.0   28.0   48.0   54.5  190.5
USA Today Sports Weekly      66.0   38.0   42.0   46.5  192.5
Sports Illustrated           60.0   30.0   48.0   56.5  194.5
Steve Mann                   56.0   48.0   60.0   38.5  202.5
Previous season standings    48.0   42.0   48.0   64.5  202.5
Sporting News                58.0   44.0   54.0   52.5  208.5
Los Angeles Times            74.0   18.0   44.0   73.5  209.5
Gordon Edes, Boston Globe    72.0   32.0   54.0   56.5  214.5
Dan Shaughnessy, Globe       52.0   56.0   70.0   44.5  222.5
Street & Smith               62.0   36.0   70.0   68.5  236.5
Pete Palmer                  68.0   56.0   50.0   70.5  244.5
Bob Ryan, Boston Globe       76.0   40.0   58.0   84.5  258.5
Payroll ranking              46.0   64.0  102.0   60.0  272.0
Spring training results     134.0   70.0   86.0  113.5  403.5
Here's how things looked from 2002 to 2004. The LA Times was unable to follow up on the excellent 2003 predictions that put the paper in the top spot in last year's two-season rankings.
Forecaster                   2004   2003   2002  Total
Tony DeMarco, MSNBC.com      40.0   34.0   34.0  108.0
Las Vegas over-under line    32.5   30.0   46.0  108.5
Diamond Mind simulations     42.0   28.0   40.0  110.0
Bob Hohler, Boston Globe     42.0   32.0   38.0  112.0
Athlon                       48.0   36.0   38.0  122.0
Lindy's                      52.0   40.0   42.0  134.0
Los Angeles Times            74.0   18.0   44.0  136.0
Baseball America             60.0   28.0   48.0  136.0
Sports Illustrated           60.0   30.0   48.0  138.0
Previous season standings    48.0   42.0   48.0  138.0
USA Today Sports Weekly      66.0   38.0   42.0  146.0
USA Today                    61.5   32.0   58.0  151.5
Sporting News                58.0   44.0   54.0  156.0
Gordon Edes, Boston Globe    72.0   32.0   54.0  158.0
Steve Mann                   56.0   48.0   60.0  164.0
Street & Smith               62.0   36.0   70.0  168.0
Bob Ryan, Boston Globe       76.0   40.0   58.0  174.0
Pete Palmer                  68.0   56.0   50.0  174.0
Dan Shaughnessy, Globe       52.0   56.0   70.0  178.0
Payroll ranking              46.0   64.0  102.0  212.0
Spring training results     134.0   70.0   86.0  290.0
Finally, here's how things have looked over the past two years.
Forecaster                   2004   2003  Total
Las Vegas over-under line    32.5   30.0   62.5
Diamond Mind simulations     42.0   28.0   70.0
Tony DeMarco, MSNBC.com      40.0   34.0   74.0
Bob Hohler, Boston Globe     42.0   32.0   74.0
Athlon                       48.0   36.0   84.0
Baseball America             60.0   28.0   88.0
Sports Illustrated           60.0   30.0   90.0
Previous season standings    48.0   42.0   90.0
MLB Yearbook                 50.0   40.0   90.0
Lindy's                      52.0   40.0   92.0
Los Angeles Times            74.0   18.0   92.0
Overall, we've been pretty happy with our results, and if there's one thing that stands out, it's our ability to identify over-rated teams.
In 2004, we saw the Royals as a 2003 overachiever that was unlikely to repeat, we projected the Blue Jays to finish below .500, and we didn't buy all of the hype surrounding the Cubs and Astros.
A year earlier, our simulations correctly indicated that the Mets were likely to finish at the bottom of their division again, the Angels were very unlikely to repeat their 2002 success, and the Dodgers wouldn't score enough runs to make a serious run at the NL West title.
Even so, we're always surprised by something that happens each year. We didn't anticipate the emergence of the Rangers and Dodgers in 2004 or the surprising finishes of the Marlins and Royals the year before. As a result, we have a bunch of test cases to study as we consider possible improvements to our projection system.
More than anything, this process -- projecting the season in March, watching the real thing for six months, and taking a look back after the season -- is highly educational for us. So we'll be back with our projected 2005 team standings in March.