2002 Predictions -- Keeping Score
By Tom Tippett
In the spring of 1998, we released our first annual Projection Disk, enabling you to play the coming season using Diamond Mind Baseball and over 1500 established big leaguers and top minor-league prospects.
Players on our Projection Disks are rated to perform in accordance with stats that we create using our projection system. That system produces expected statistics for all batters and pitchers based on a blend of major-league and minor-league stats from the past three years, adjusted for factors such as the level of competition (majors, AAA, AA), ballpark effects (including minor-league parks), league rules (DH vs non-DH), and age (young players are projected to improve while older players are projected to fade a little each year).
After the projected stats have been computed and the player ratings assigned, we set up a manager profile for each team with the starting rotation, bullpen assignments, lineups versus left- and right-handed pitchers, and roles for bench players (such as platoons, spot starters, and defensive replacements) based on our best assessment of how the players will be used in the coming season.
We then simulate the season many times, average the results to come up with our projected final standings for the season, and write an article describing the results and commenting on the outlook for every team.
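As a rough illustration of the averaging step, here is a minimal Python sketch. The team names and win totals are made up, and the function name projected_standings is hypothetical; it is not a description of the Diamond Mind software, only of the idea of averaging each team's wins across many simulated seasons and ranking teams by that average.

```python
from statistics import mean

# Hypothetical win totals for three made-up teams in four simulated seasons.
simulated_seasons = [
    {"Hawks": 92, "Owls": 88, "Crows": 70},
    {"Hawks": 85, "Owls": 90, "Crows": 75},
    {"Hawks": 95, "Owls": 84, "Crows": 72},
    {"Hawks": 88, "Owls": 86, "Crows": 79},
]

def projected_standings(seasons):
    """Average each team's wins over all seasons and sort from most to fewest."""
    teams = seasons[0].keys()
    avg_wins = {team: mean(season[team] for season in seasons) for team in teams}
    return sorted(avg_wins.items(), key=lambda item: item[1], reverse=True)

for team, wins in projected_standings(simulated_seasons):
    print(team, wins)
```

Note that the Owls win one of the four seasons outright even though the averages put the Hawks on top, which is the kind of detail that averaging hides.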
That first year, we were curious to see how our projected standings would stack up against the pre-season predictions in leading magazines such as Sports Illustrated, The Sporting News, Street and Smith, and USA Today Baseball Weekly.
To do those rankings, we needed a way to assign an accuracy score to each prediction. So we turned to our friend Pete Palmer, the co-author of Total Baseball and The Hidden Game of Baseball. Pete has been projecting team standings for more than 25 years, and he routinely collects predictions and ranks them at the end of the year.
Pete's rankings are based on a simple scoring system -- subtract each team's actual placement from their projected placement, square this difference, and add them up for all the teams. For example, if you predict a team will finish fourth and they finish second, that's a difference of two places. Square the result, and you get four points. Do this for every team and you get a total score. The lower the score, the more accurate your predictions.
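The scoring rule is easy to express in a few lines of Python. The sketch below uses a made-up five-team division and a hypothetical function name, accuracy_score; only the rule itself (sum of squared differences in placement) comes from the description above.

```python
def accuracy_score(predicted, actual):
    """Sum of squared differences between predicted and actual places."""
    return sum((predicted[team] - actual[team]) ** 2 for team in predicted)

# Hypothetical division: teams B and D were each missed by two places.
predicted = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}
actual = {"A": 1, "B": 4, "C": 3, "D": 2, "E": 5}

print(accuracy_score(predicted, actual))  # 4 + 4 = 8
```

A perfect forecast scores zero, and a single large miss costs more than several small ones because the differences are squared.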
We don't try to break ties and name one team as having finished ahead of the other. If, for example, two teams tie for first, we say that each team finished in 1.5th place for the purposes of figuring out how many places a prediction was off. If a team was projected to finish third and they tied for first instead, that's a difference of 1.5 places. The square of 1.5 is 2.25, so that would be the point total for this team. That's why you'll see some fractional scores in the tables below.
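Here is one way the tie handling could be sketched in Python. The team names and win totals are invented, the helper name places_with_ties is hypothetical, and ranking by raw wins (rather than winning percentage) is a simplifying assumption.

```python
def places_with_ties(wins):
    """Assign places by wins; tied teams share the average of the places they occupy."""
    ordered = sorted(wins, key=wins.get, reverse=True)
    places = {}
    start = 0
    while start < len(ordered):
        end = start
        # Extend the group while the next team has the same win total.
        while end + 1 < len(ordered) and wins[ordered[end + 1]] == wins[ordered[start]]:
            end += 1
        # Places are 1-based, so this group occupies places start+1 through end+1.
        shared_place = ((start + 1) + (end + 1)) / 2
        for team in ordered[start:end + 1]:
            places[team] = shared_place
        start = end + 1
    return places

final_wins = {"Hawks": 95, "Owls": 95, "Crows": 80}  # hypothetical final totals
actual = places_with_ties(final_wins)                # Hawks 1.5, Owls 1.5, Crows 3.0

# A team projected third that ties for first is off by 1.5 places.
points = (3 - actual["Hawks"]) ** 2
print(points)  # 2.25
```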
Keeping things in perspective
That first year, we created a little database with our projected standings and those of fourteen national publications, and we were pleased to see that we ended the year with the best accuracy score among those fifteen forecasts as measured by Pete's formula. When we wrote up the results and posted them to our web site, however, we were very careful not to make any grand claims, saying:
Over time, we expanded our database to include the predictions of prominent baseball writers from major newspapers and those of several ESPN.com staffers. In the sections below, we'll show you how these prognosticators ranked in 2002 and over a period of years, with the period varying in length depending on when we added that forecaster to our database. We don't make any claims of completeness here -- there are lots of other predictions that are not in our database -- but I think you'll find that our sample is an interesting one.
For several reasons, I want to emphasize that it's important that nobody take these rankings too seriously.
First, this isn't the only scoring system one could use to rank these projections, of course. The rankings at ESPN.com use the same approach but don't square the differences. A fellow named Gerry Hamilton runs a predictions contest every year (see http://www.tidepool.com/~ggh1/index.html) and assigns a score based on how many games each team finished out of their predicted place in the standings.
Second, it's not entirely fair to put all of these predictions into the same group. Because of publishing deadlines, the predictions in the spring baseball magazines are made long before spring training starts; others (including ours) are usually prepared in early-to-mid March, and some are published just before opening day. Obviously, the later you do them, the more information you have on player movement and injuries.
Third, many newspaper editors ask staff writers to make predictions so their readers have something to chew on for a couple of days. Some writers hate doing them but comply because their editors insist. Others may make off-the-wall picks just for grins. We don't have a reliable way to decide which are serious, so we include them all. But we do ask you to remember that some of these predictions may have been made in jest.
Finally, our projections are based on the average of many simulated seasons. That means that the normal ups and downs of a single season are smoothed out. For example, in 2002, we had the A's winning the AL West with 96 wins, but their win total in any one simulated season could have been much higher or lower, and they didn't always finish first. In two of the fifty seasons that we ran last March, Anaheim won that division even though the overall averages put them in last place.
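To make that last point concrete, the following Python sketch counts how often each team finishes first across a handful of simulated seasons. The data are invented and the function name first_place_counts is hypothetical; ties for first are ignored for simplicity.

```python
from collections import Counter

# Hypothetical win totals for three made-up teams in five simulated seasons.
seasons = [
    {"Hawks": 96, "Owls": 90, "Crows": 78},
    {"Hawks": 94, "Owls": 91, "Crows": 80},
    {"Hawks": 88, "Owls": 86, "Crows": 91},
    {"Hawks": 99, "Owls": 93, "Crows": 75},
    {"Hawks": 91, "Owls": 95, "Crows": 79},
]

def first_place_counts(seasons):
    """Count how many simulated seasons each team finished with the most wins."""
    counts = Counter()
    for season in seasons:
        winner = max(season, key=season.get)
        counts[winner] += 1
    return counts

print(first_place_counts(seasons))
```

In this toy data the Crows have the lowest average win total, yet they still finish first in one of the five seasons, much as Anaheim did in two of our fifty.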
Rankings for 2002
It's interesting to see how everyone did this year, of course, and those rankings are shown in the first table below.
More interesting, at least to my mind, is the process of looking back at those predictions and identifying the patterns that emerge. Which teams were consistently under- or over-estimated? Which divisions contained the biggest surprises? Did anyone predict that certain teams would have a sudden change of fortune? So we'll follow the table with a brief analysis of the six division races.
Forecaster                           Score
Tony DeMarco, MSNBC.com                 34
Athlon                                  38
Bob Hohler, Boston Globe                38
Peter Gammons, ESPN                     38
Diamond Mind simulations                40
Baseball Weekly                         42
Brandon Funston, ESPN.com               42
Eric Karabell, ESPN.com                 42
Jayson Stark, ESPN                      42
Lindy's                                 42
Danny Sheridan, USA Today               44
Los Angeles Times                       44
Zack Scott, Diamond Mind                44
Dallas Morning News                     46
Las Vegas over-under line               46
2001 final standings                    48
Andy Latack, ESPN.com                   48
Baseball America                        48
Chicago Tribune                         48
David Schoenfield, ESPN.com             48
Rob Neyer, ESPN.com                     48
Sean McAdam, ESPN.com                   48
Sports Illustrated                      48
Tim Kurkjian, ESPN.com                  48
Pete Palmer                             50
San Francisco Chronicle                 50
Alan Schwarz, ESPN.com                  54
The Sporting News (spring magazine)     54
Gordon Edes, Boston Globe               54
Matt Szefc, ESPN.com                    54
Phil Rogers, ESPN.com                   56
Spring Training Yearbook                56
Bob Ryan, Boston Globe                  58
USA Today                               58
Jim Caple, ESPN.com                     60
Steve Mann                              60
Bob Klapisch, ESPN.com                  62
Joe Sheehan, Baseball Prospectus        62
David Lipman, ESPN.com                  66
Dan Shaughnessy, Boston Globe           70
Street & Smith                          70
John Sickels, ESPN.com                  74
Michael Holley, Boston Globe            76
Rany Jazayerli, Baseball Prospectus     78
Spring training results                 86
The "Diamond Mind simulations" entry is the one representing the average result of simulating the season 50 times. These simulations were done about three weeks before the season started. Two weeks later, Zack Scott of Diamond Mind made his own predictions. These were based largely on the simulation results but also took his own hunches into account.
There are three entries in this list that don't represent the views of a writer or a publication. If you predicted that the 2002 standings would be the same as in 2001, your score would have been 48. If you put together a set of standings based on the Las Vegas over-under line, you'd have scored 46. And if you predicted that the regular season standings would match the 2002 spring training standings, your score would have been 86. In other words, for the second year in a row, the spring training results were almost useless as a predictor of the real season.
Reviewing the divisions
Much more interesting than the overall scores, in my opinion, are the details. Leaving out the entries that don't represent writers or publications, here are some observations about how the others saw things last spring:
AL East. The teams have finished in the same order five years in a row. Michael Holley of the Boston Globe was the only one to pick Boston ahead of New York. Three people from ESPN.com (David Lipman, Eric Karabell, and John Sickels) put Toronto in second. Holley and Dan Shaughnessy of the Globe were alone in having Baltimore third, a pick that looked mighty good before the Orioles collapsed in the last six weeks. Everybody else had Baltimore and Tampa Bay finishing fourth and fifth, with the O's getting the nod for fourth a little more often than not.
AL Central. The White Sox were the consensus pick as division winner, but not by a lot. Twenty-four predictions put the pale hose in first, fourteen correctly picked the Twins, and four thought Cleveland would hold on for one more year. Two intrepid souls (Jim Caple and Steve Mann) were brave enough to pick the Tigers for third, but everybody else had KC and Detroit bringing up the rear, with Detroit picked fourth about twice as often as KC. Three forecasters (Bob Hohler, John Sickels, and The Dallas Morning News) got the division right from top to bottom.
AL West. It won't come as a shock when I say that most people gave the division to Seattle after the Mariners' record-setting 2001 season. But it was far from unanimous, and about a quarter of them gave the division to Oakland, while Michael Holley picked the Rangers to finish first. Only three (The Sporting News, Gordon Edes, and Jayson Stark) picked Anaheim as high as second, and all three of them had Seattle first and Oakland third. Twenty-seven predictions put Anaheim in the basement. Needless to say, nobody was exactly right on this division.
NL East. This division was the undoing of many a predictor this year. Eleven predictors thought the Mets would end Atlanta's run at the top. Twenty-two more put them second, five had them third, two (Diamond Mind and Steve Mann) ranked them fourth, and one (Joe Sheehan) had them in the basement. By picking Florida first, Steve Mann was the only one to have a team other than Atlanta or New York atop the division. Montreal was picked last by everyone except Joe Sheehan and Pete Palmer, and neither of them had the Expos higher than fourth. Another division that nobody got right.
NL Central. For those who finished near the bottom of the rankings, if the Mets didn't get them, the Cubs did. All but three submissions had either St. Louis (27 times) or Houston (12 times) on top, but three (Dan Shaughnessy, Bob Ryan, and John Sickels) picked the Cubs. Ten others put Chicago second, with only two (Tony DeMarco and David Schoenfield) putting them as low as fourth. It seems surprising now, but nine predictions saw Milwaukee matching its fourth-place finish from a year earlier. Most, however, had Cincinnati rebounding to take that fourth spot. Pittsburgh was picked last on almost every list, but a few had them fifth, and the LA Times was the only one to correctly pick them fourth. Cincinnati was picked last on the only four submissions that didn't have Pittsburgh or Milwaukee in that slot. Nobody picked this division correctly from top to bottom.
By the way, when we ran our first batch of simulations back in March, Cincinnati finished one game ahead of Chicago. We had a few more minor adjustments to make, and while we were working on those adjustments, I was very worried that we'd end up picking the Cubs fourth and then spend six months watching them win the division. Now we can see that those early results would have made us look very smart. At the time, however, I was very relieved to see the Cubs edge Cincinnati by one game when we ran our fifty seasons for real.
NL West. For the second year in a row, we were among the most optimistic about the Rockies' chances. Seven predictions put the Rockies in third and everyone else had them lower than that. And Diamond Mind? In our simulations, they averaged 85 wins and edged Arizona by one game for second place. We were among the 25 forecasters who picked the Giants to win it, while only 15 saw the defending world champion Diamondbacks repeating atop the division. Two (Joe Sheehan and Rany Jazayerli) picked the Padres. Most predictions had the Dodgers finishing either third or fourth, but there were exceptions, as they were listed second three times (Bob Ryan, Sean McAdam, Phil Rogers) and in the basement twice (Rob Neyer and Rany Jazayerli). This division was picked correctly from top to bottom six times, by Tony DeMarco, Street and Smith, Athlon, Lindy's, Eric Karabell, and Steve Mann.
So, we had one division that was dead simple (24 predictions nailed the AL East), three that nobody got right, one with three correct predictions, and one with six. Based on the past five years, that seems about par for the course. The baseball world looked a bit different back in March, and a lot can change in six months.
Looking back over the past five years, here are the rankings for those who were included in our sample every year. Disappearing are Baseball Digest and Mazeroski magazines, neither of which was published in 2002. In what is most likely just a coincidence, those two were at the bottom of last year's four-year rankings:
Forecaster                      2002   2001   2000   1999   1998  Total
Diamond Mind simulations        40.0   54.5   68.0   42.0   44.5  249.0
Steve Mann                      60.0   38.5   58.0   54.0   44.0  254.5
Sports Illustrated              48.0   56.5   40.0   56.0   54.0  254.5
Baseball Weekly                 42.0   46.5   58.0   51.5   60.0  258.0
Las Vegas over-under line       46.0   65.5   51.5   48.0   52.0  263.0
Pete Palmer                     50.0   70.5   54.0   40.0   58.0  272.5
Sporting News                   54.0   52.5   38.0   78.0   54.0  276.5
Athlon                          38.0   67.5   42.0   72.0   72.0  291.5
Street & Smith                  70.0   68.5   58.0   68.0   64.0  328.5
Previous season standings       48.0   64.5   56.0   70.0  100.0  338.5
In 1999, we added some writers from the Boston Globe and ESPN.com, so the four-year totals include a few more names than did the previous table:
Forecaster                      2002   2001   2000   1999  Total
Gordon Edes, Boston Globe       54.0   56.5   26.0   28.0  164.5
Baseball Weekly                 42.0   46.5   58.0   51.5  198.0
David Schoenfield, ESPN.com     48.0   56.5   56.0   40.0  200.5
Sports Illustrated              48.0   56.5   40.0   56.0  200.5
Diamond Mind simulations        40.0   54.5   68.0   42.0  204.5
Rob Neyer, ESPN.com             48.0   66.5   48.0   44.0  206.5
Peter Gammons, ESPN.com         38.0   56.5   48.0   66.0  208.5
Steve Mann                      60.0   38.5   58.0   54.0  210.5
Las Vegas over-under line       46.0   65.5   51.5   48.0  211.0
Pete Palmer                     50.0   70.5   54.0   40.0  214.5
Rany Jazayerli, BP              78.0   62.5   46.0   30.0  216.5
Athlon                          38.0   67.5   42.0   72.0  219.5
Sporting News                   54.0   52.5   38.0   78.0  222.5
Baseball America                48.0   54.5   54.0   70.0  226.5
Dan Shaughnessy, Globe          70.0   44.5   54.0   58.0  226.5
Previous year standings         48.0   64.5   56.0   70.0  238.5
Bob Ryan, Boston Globe          58.0   84.5   58.0   40.0  240.5
John Sickels, ESPN.com          74.0   68.5   58.0   58.0  258.5
Bob Klapisch, ESPN.com          62.0   57.5   78.0   62.0  259.5
Street & Smith                  70.0   68.5   58.0   68.0  264.5
The Diamond Mind simulations missed the mark by quite a bit in 2000, so they rank lower here than in any of the other tables. We added a new concept to our projection system that year, but we were very unhappy with the results, and we took that out of the model before doing this again in 2001. The results have been much better since.
Forecaster                      2002   2001   2000  Total
Sean McAdam, ESPN.com           48.0   32.5   38.0  118.5
Gordon Edes, Boston Globe       54.0   56.5   26.0  136.5
Peter Gammons, ESPN.com         38.0   56.5   48.0  142.5
Sporting News                   54.0   52.5   38.0  144.5
Sports Illustrated              48.0   56.5   40.0  144.5
Baseball Weekly                 42.0   46.5   58.0  146.5
Athlon                          38.0   67.5   42.0  147.5
Baseball America                48.0   54.5   54.0  156.5
Steve Mann                      60.0   38.5   58.0  156.5
David Schoenfield, ESPN.com     48.0   56.5   56.0  160.5
Diamond Mind simulations        40.0   54.5   68.0  162.5
Rob Neyer, ESPN.com             48.0   66.5   48.0  162.5
Las Vegas over-under line       46.0   65.5   51.5  163.0
Dan Shaughnessy, Globe          70.0   44.5   54.0  168.5
Previous year standings         48.0   64.5   56.0  168.5
Pete Palmer                     50.0   70.5   54.0  174.5
Phil Rogers, ESPN.com           56.0   62.5   56.0  174.5
Matt Szefc, ESPN.com            54.0   56.5   68.0  178.5
Rany Jazayerli, BP              78.0   62.5   46.0  186.5
Street & Smith                  70.0   68.5   58.0  196.5
Bob Klapisch, ESPN.com          62.0   57.5   78.0  197.5
John Sickels, ESPN.com          74.0   68.5   58.0  200.5
Bob Ryan, Boston Globe          58.0   84.5   58.0  200.5
Finally, here's how things have looked in 2001-2002. As you can see, Lindy's has had two very good years in a row and has to be regarded as the top dog for the time being. Sean McAdam slipped a bit in 2002 after two extremely good years, but is still right at the top. This doesn't surprise me a bit; I have the good fortune of hearing Sean regularly on Boston's main sports radio station, and he really knows his stuff.
Forecaster                      2002   2001  Total
Lindy's                         42.0   36.5   78.5
Sean McAdam, ESPN.com           48.0   32.5   80.5
SF Chronicle                    50.0   36.5   86.5
Baseball Weekly                 42.0   46.5   88.5
Jayson Stark, ESPN.com          42.0   46.5   88.5
Diamond Mind simulations        40.0   54.5   94.5
Peter Gammons, ESPN.com         38.0   56.5   94.5
Steve Mann                      60.0   38.5   98.5
Tony DeMarco, MSNBC.com         34.0   67.5  101.5
Baseball America                48.0   54.5  102.5
Zack Scott, Diamond Mind        44.0   58.5  102.5
David Schoenfield, ESPN.com     48.0   56.5  104.5
Sports Illustrated              48.0   56.5  104.5
Athlon                          38.0   67.5  105.5
Sporting News                   54.0   52.5  106.5
Chicago Tribune                 48.0   62.5  110.5
Gordon Edes, Boston Globe       54.0   56.5  110.5
Matt Szefc, ESPN.com            54.0   56.5  110.5
Las Vegas over-under line       46.0   65.5  111.5
Previous year standings         48.0   64.5  112.5
Rob Neyer, ESPN.com             48.0   66.5  114.5
Dan Shaughnessy, Globe          70.0   44.5  114.5
Los Angeles Times               44.0   73.5  117.5
Phil Rogers, ESPN.com           56.0   62.5  118.5
Bob Klapisch, ESPN.com          62.0   57.5  119.5
Pete Palmer                     50.0   70.5  120.5
Alan Schwarz, ESPN.com          54.0   70.5  124.5
David Lipman, ESPN.com          66.0   64.5  130.5
Street & Smith                  70.0   68.5  138.5
Rany Jazayerli, BP              78.0   62.5  140.5
John Sickels, ESPN.com          74.0   68.5  142.5
Bob Ryan, Boston Globe          58.0   84.5  142.5
Except for the 2000 season, our approach to developing projections seems to be providing good results.
If there's any one thing that stands out, it's the system's ability to identify over-rated teams. In 2002, for example, our simulations indicated that (a) the Mets would have real trouble scoring runs even with the addition of guys like Mo Vaughn and Roberto Alomar, (b) the Seattle offense would come back to earth after a terrific 2001 season, and (c) the Cubs were likely to be battling the Reds for third and fourth place, not challenging for the division title.
On the other hand, we didn't anticipate the sudden emergence of some of the game's best bullpens. (I'm not sure anyone else saw this coming, either.) In our simulations, the relievers on the Angels, Twins, and Braves were nowhere near as good as they were in the real 2002 season.
And if there's a theme from the past five years, it's that our simulations occasionally project a team with great hitting and barely acceptable pitching to do very well, and it seems as if their real-life counterparts often blow a bunch of leads early in the year and then fall apart. The Rangers were this year's example, though a very tough division and key injuries on both sides of the ball were a major factor as well.
I wish we were better at projecting which young teams will continue to get better, as the Twins did this year, and which will backslide unexpectedly. Before the season, I wouldn't have been surprised to see the Marlins outperform the Twins, but it didn't work out that way.
Still, it's a lot of fun and a highly educational process for us. When we run the simulations in the spring, we always end up learning something. We always end up being surprised by some of the results. And we always end up with a bunch of things to watch for as the real season unfolds. So we'll take another run at it next spring and we'll report back after the season.