Playoff Upsets and the 1998 Atlanta Braves
By Tom Tippett and Tom Ruane

Baseball cliché: "Anything can happen in a short series."

Observation #1: So far this year, the team with the better regular-season record has won only three of the six playoff series.

Observation #2: In the 1990s, the team with the best regular-season record has won exactly zero World Series championships.

The 1998 post-season was another disappointment to Braves fans. As Baseball
Weekly pointed out recently, "it's No. 1 or Bust for the Braves,"
who, as one commentator after another has mentioned in the wake of their
latest failure against the Padres, have only a single championship to
show for nearly a decade of dominance. But have these expectations been
reasonable? After all, there were seven other teams in the playoffs this
year--what kind of chance did the Braves really have of winning the World
Series?

To answer that question, we'll take three steps:

1. Determine the likelihood that the stronger team would win a single game against the weaker team, based on their regular-season records.
2. Take these expected single-game winning percentages and see how they translate into expected series winning percentages, for series of varying lengths.
3. Look at how much harder it has become to win the World Series now that there are three rounds of post-season play.

Forecasting Single-Game Winning Percentages

The 1998 Yankees won 114 games this year and finished with a .704 winning percentage. If they played the same team every day, we could estimate their chances of winning the next game at .704, or 70.4%. But they don't play the same team every day, and their chances to win any given game clearly depend to some degree on the strength of their opponent.

So, if a .704 team plays a .410 team, what is the likelihood that they would win? What if they played a .600 team? What if a .550 team plays a .530 team?

For years, we've used a simple rule of thumb to estimate the expected winning percentage in any matchup: take the difference in the two teams' overall winning percentages and add it to .500. For example, a .704 team against a .450 team is .254 better, so we add that to .500 and come up with an expected winning percentage of .754 in that matchup.

The rationale is quite simple. If a team plays a balanced schedule, the combined winning percentage of its opponents in a season is near .500. It usually won't be exactly .500, but it won't be far off. In the 1998 AL, for example, the Yankees were 66 games over .500, so their opponents were a combined 66 games under .500, and their combined winning percentage was .484. That's one of the most extreme examples in baseball history. Most teams are nowhere near 66 games over or under .500, so for most teams, the assumption that their opponents played .500 ball as a group is a pretty good one.
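As a minimal sketch, the rule of thumb can be written as a one-line function. The name `rule_of_thumb` is ours, chosen for illustration, and the function does nothing more than the arithmetic described above.

```python
def rule_of_thumb(favorite_pct: float, opponent_pct: float) -> float:
    """Expected single-game winning percentage: .500 plus the gap
    between the two teams' season winning percentages."""
    return 0.500 + (favorite_pct - opponent_pct)


# The .704 team against a .450 team, as in the example in the text.
print(f"{rule_of_thumb(0.704, 0.450):.3f}")
```

Nothing in this function stops the result from exceeding 1.000 for absurdly lopsided matchups, but no pairing of real major-league teams comes close to that.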
So, if a team played .600 ball against a group of .500 teams, and they're now facing a .450 team, their expected winning percentage should go up. We surmised that it should go up by 50 points, since their opponent is 50 points worse than the average.

Until now, this rule of thumb has been nothing more than a theory. So we (specifically, Tom Ruane) wrote a program to analyze the results of every AL and NL game from 1901 to 1997. The program placed each team into one of twenty groups based upon its winning percentage for that season. All teams with winning percentages less than .330 went into group A; those with winning percentages between .330 and .350 went into group B, and so on up to the top group, which had all teams with winning percentages greater than .690. For each game, the program figured out what type of matchup it was (e.g. group C vs group F) and then added the game result to the totals for that matchup.

The following table shows the results from the perspective of the favored team. Instead of showing the groups by their letter code, the table shows them by the midpoint of the winning percentage range for the group (e.g. .340 for group B). To save space, the percentages are shown without the decimal point. For example, when teams in the .530 to .550 group (the row labeled "540") played teams in the .470 to .490 range (the column labeled "480"), they won 566 of every 1000 games.
    ------------------------------ Winning Percentage of the Worse Team ------------------------------
    Pct  <330  340  360  380  400  420  440  460  480  500  520  540  560  580  600  620  640  660  680 >690
    <330  554
    340   479  ---
    360   547  496  568
    380   568  591  526  516
    400   627  535  549  538  537
    420   567  570  552  509  531  567
    440   611  593  584  556  534  538  542
    460   650  605  604  582  523  522  505  496
    480   610  619  631  584  590  549  540  516  496
    500   677  660  627  591  601  565  573  533  523  498
    520   711  676  630  650  630  592  570  556  519  518  493
    540   705  695  670  646  614  608  593  569  566  528  517  510
    560   742  684  701  658  630  648  610  565  560  563  538  518  526
    580   728  710  679  668  659  668  633  619  598  558  558  537  495  507
    600   757  710  714  695  677  640  648  607  617  603  581  562  517  515  530
    620   803  727  697  723  686  669  656  637  611  596  611  567  557  550  518  514
    640   764  739  697  739  711  761  663  673  646  645  588  602  598  569  566  470  525
    660   769  774  712  762  785  710  666  673  636  663  626  602  639  596  519  534  682  ---
    680   812  812  754  724  750  687  703  689  700  682  729  645  607  542  569  575  ---  ---  591
    >690  838  777  837  772  773  778  723  795  710  710  577  649  664  576  571  545  682  500  591  ---
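The grouping program itself isn't reproduced in this article, but its logic might look something like the following sketch. The function names, the input format (one tuple per game), and the handling of teams that fall exactly on a group boundary are all assumptions made for illustration, and the two sample games are made up.

```python
from collections import defaultdict


def group_label(pct: float) -> str:
    """Map a season winning percentage to its group's midpoint label."""
    points = round(pct * 1000)           # work in whole "points" to avoid float edges
    if points < 330:
        return "<330"
    if points > 690:
        return ">690"
    # 20-point-wide groups: 330-349 -> "340", 350-369 -> "360", ...
    # (assumption: a team exactly on a boundary goes into the higher group)
    index = min((points - 330) // 20, 17)
    return str(340 + 20 * index)


def tally(games):
    """games: iterable of (pct_a, pct_b, a_won), one entry per game played.

    Returns {(favorite_group, underdog_group): [favorite_wins, games]}.
    """
    totals = defaultdict(lambda: [0, 0])
    for pct_a, pct_b, a_won in games:
        # View every game from the side of the team with the better record.
        if pct_a >= pct_b:
            fav, dog, fav_won = pct_a, pct_b, a_won
        else:
            fav, dog, fav_won = pct_b, pct_a, not a_won
        cell = totals[(group_label(fav), group_label(dog))]
        cell[0] += int(fav_won)
        cell[1] += 1
    return totals


# Two made-up games between a .540 team and a .480 team, split one apiece.
sample = [(0.540, 0.480, True), (0.480, 0.540, True)]
for matchup, (wins, played) in tally(sample).items():
    print(matchup, round(1000 * wins / played))
```

Each cell in the table is then the favorite's wins per 1000 games for that pair of groups.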
Let's test our rule of thumb by looking at a few rows in the table. Recall that the predicted winning percentage is .500 plus the difference in the winning percentages of the two teams. Here's how the predicted values compare to the observed values for three rows:

                  ------------------------ Opponent's Winning Percentage ------------------------
                  340  360  380  400  420  440  460  480  500  520  540  560  580  600  620  640
    520 row
      Predicted   680  660  640  620  600  580  560  540  520  500
      Observed    676  630  650  630  592  570  556  519  518  493
    580 row
      Predicted   740  720  700  680  660  640  620  600  580  560  540  520  500
      Observed    710  679  668  659  668  633  619  598  558  558  537  495  507
    640 row
      Predicted   800  780  760  740  720  700  680  660  640  620  600  580  560  540  520  500
      Observed    739  697  739  711  761  663  673  646  645  588  602  598  569  566  470  525

You can see that there is quite a bit of similarity between the predicted and observed values, and the similarity is strongest in the part of the table where the sample sizes are largest (the 460 to 540 range). At the extremes, there's more variation, which is to be expected with the smaller number of games in these buckets.

But there's something else going on here, too. The observed values are consistently lower than the predicted values when there's a very large gap in the winning percentages. The .640 teams consistently underperformed against the teams in the sub-.400 range. It seems quite possible that the strong teams have a habit of choosing these games to rest their stars, so the talent level of their starting lineups and starting pitchers might not be up to their normal standard. Or maybe they just come up a little flat against the weaker teams more often.

Because of our concerns about sample sizes, we ran the program again and compiled the data in a slightly different way. This time, we grouped each game into categories based solely on the difference in the two teams' winning percentages.
So, if a .590 team played a .500 team and a .540 team played a .450 team, both results would go into the category for matchups with a 90-point difference in winning percentage. Here are the results:

    Difference  Predicted  Observed
       .000       .500       .499
       .010       .510       .513
       .020       .520       .518
       .030       .530       .522
       .040       .540       .532
       .050       .550       .544
       .060       .560       .557
       .070       .570       .563
       .080       .580       .574
       .090       .590       .588
       .100       .600       .591
       .110       .610       .593
       .120       .620       .612
       .130       .630       .613
       .140       .640       .625
       .150       .650       .649
       .160       .660       .640
       .170       .670       .650
       .180       .680       .659
       .190       .690       .669
       .200       .700       .680
       .210       .710       .698
       .220       .720       .683
       .230       .730       .708
       .240       .740       .709

The actual results are consistently a few points below what was predicted by the rule of thumb, and the gap seems to grow as the matchup becomes more lopsided. Again, our guess is that the favorite tends to rest key players when they have the biggest advantage and that the weaker team might be more 'up' for these games.

Although the numbers aren't shown here, we also computed the expected winning percentage for each matchup when the favored team was at home and on the road. Historically, home teams have consistently won at a .540 pace (see The Hidden Game of Baseball by John Thorn and Pete Palmer for details), so we need to know where a game is being played if we're going to get the best read on the expected winning percentage for a single game.

Forecasting Series Winning Percentages

Now that we've come up with a way to make pretty good estimates of the expected winning percentage for a single game, we're ready to tackle the question of how those winning percentages translate into winning a series. To do this, we've used two methods -- a simple computer simulation and a more general approach that uses some basic work with probabilities -- to compute a theoretical series winning percentage for the favored team.
So, if you know that the favored team is expected to win individual games at a .550 clip, how often does that translate into a victory in a 3-game series, a 5-game series, and so on? What if the teams are more evenly matched, and the favored team should win games only at a .520 pace?

We wrote a program that plays series of various lengths with various winning percentages. To simulate one game, the program flips a two-sided 'coin', with the chance of a win or loss based on the assumed winning percentage for the favored team. To simulate one series, the program simply generates game results until one team wins the requisite number of games. To compute the chances that the favored team would win, the program generates a million series and counts the number of victories by the favorite.

We did this twice, with and without the home-field advantage. To save space, we present only the numbers without the home field factored in, since home field increases expected series wins only by a very small amount (1-2%). Here are the results, using a variety of assumed winning percentages and series lengths:

    Game       Series Length
    WPct      3      5      7
    .500    .501   .500   .500
    .510    .515   .519   .522
    .520    .530   .538   .544
    .530    .545   .557   .566
    .540    .560   .575   .587
    .550    .575   .593   .608
    .560    .589   .611   .629
    .570    .604   .629   .651
    .580    .618   .648   .671
    .590    .634   .665   .690
    .600    .648   .683   .710

Let's work through one example. Suppose we have a lopsided matchup, where the favored team is so much better that it should win games at a .590 clip. We refer to the second-last row in this table, which tells us that there's a 63.4% chance (.634) that the favored team would win a three-game series, a 66.5% chance that they'd win a five-game series, and a 69% chance they'd take a seven-game set. (It's not shown here, but their chances rise to 70% if they have home field in a 7-game series.)

Using this information, we can draw a couple of conclusions:

- Upsets are likely. You can bet that the baseball media would be talking about the upset of the century if a .410 team were to upset a .590 team in a 7-game series. But we just showed that a .590 team has only a 69% chance to win a best-of-seven series, meaning that an upset of this magnitude is almost a one-in-three occurrence. Upset, yes. Miracle, not quite.

- There's been talk this year about the need to extend the first round from five games to seven games to reduce the number of upsets. But the number of upsets wouldn't change all that much. The amount by which the expected series winning percentage rises when the series goes from five to seven games can be seen by comparing the third and fourth columns in this table. Our .590 team would see its chances increase by only 2.5 percentage points (from 66.5% to 69%) in the longer series, and the gap is even smaller with more evenly matched teams.

Limitations

This approach to computing series winning percentages assumes that the better team has the same chance to win in each game, subject only to changes in home-field advantage. There are two ways in which this assumption is not valid.

First, it stands to reason that the chances to win each game will go up and down with the relative ability of the starting pitchers. Suppose one team has one dominant pitcher and three average starters and the other team has four very good pitchers. (Do you see any resemblance to the 1998 Padres and Yankees here?) The first team will have a better shot in games started by the ace and will be less likely to win the others.

Second, it's quite possible that the victor in game one might have an edge in game two if they (a) wear out some of the relief pitchers on the losing team, (b) gain confidence, or (c) seize the momentum.

Having seen how little the home-field advantage affects the series-winning percentages, our belief is that these two factors won't change the overall results by enough to make a difference in our conclusions.
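For readers who want to experiment, here is a rough sketch of the kind of series simulation described under Forecasting Series Winning Percentages. It is not the original program: the function name, the `seed` parameter, and the smaller default number of trials are our own choices for illustration, and it ignores home field, just as the table above does.

```python
import random


def simulate_series(p, length, trials=100_000, seed=None):
    """Estimate how often a team that wins each game with probability p
    wins a best-of-`length` series, by playing `trials` simulated series."""
    rng = random.Random(seed)
    needed = length // 2 + 1             # 2 of 3, 3 of 5, 4 of 7
    favorite_series_wins = 0
    for _ in range(trials):
        fav = dog = 0
        # Flip the weighted coin until one side has enough wins.
        while fav < needed and dog < needed:
            if rng.random() < p:
                fav += 1
            else:
                dog += 1
        if fav == needed:
            favorite_series_wins += 1
    return favorite_series_wins / trials


# A .590 favorite in 3-, 5-, and 7-game series; results vary slightly per run.
for length in (3, 5, 7):
    print(length, round(simulate_series(0.590, length), 3))
```

With fewer trials than the million used for the table, expect the third decimal place to wobble from run to run.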
The 1998 Post-Season and the Atlanta Braves

We now live in an era when the top-ranked team must win three series to be crowned World Series champs. They're guaranteed to have the home-field advantage in the first two rounds, and there's a 50-50 chance they'll have it in the Series, too.

Building on the work we've done so far, we can estimate the probability of each of the 1998 post-season teams advancing through the playoffs and winning it all. Here are the records of the Division Series opponents (the team listed first had the home-field advantage):

    NY  114-48  .704   vs   TEX  88-74  .543
    CLE  89-73  .549   vs   BOS  92-70  .568
    ATL 106-56  .654   vs   CHI  90-73  .552
    HOU 102-60  .630   vs   SD   98-64  .605

There's a lot of ground to cover here, so we're going to gloss over the details and get to the meat of the subject. Using our ability to (a) estimate the game-winning percentages for each matchup, (b) adjust those percentages for the home-field advantage, and (c) translate that into a series-winning percentage, the chances that each of these teams would advance are as follows:

    NY  .760   vs   TEX .240
    CLE .485   vs   BOS .515
    ATL .682   vs   CHI .318
    HOU .551   vs   SD  .449

In the League Championship Series, we have to worry about two possible opponents for each team:

    NY  .764   vs   BOS .236
    NY  .805   vs   CLE .195
    CLE .540   vs   TEX .460
    TEX .477   vs   BOS .523
    ATL .554   vs   HOU .446
    ATL .608   vs   SD  .392
    HOU .669   vs   CHI .331
    SD  .608   vs   CHI .392

Note that the Yankees actually had a better chance of beating Boston than Texas, despite Boston's better record, because the LCS is a best-of-seven rather than a best-of-five series.

So what were the odds of each team reaching the series? Well, it's the sum of the odds of each of the two possible paths the team could have taken.
For example, the Yankees could have gone through Boston or Cleveland on their way to the series, making the odds of them reaching the series:

    .760 * ( ( .515 * .764 ) + ( .485 * .805 ) ) = .596

    Where:
      .760 = the odds of them reaching the second round
      .515 = the odds of Boston reaching the second round
      .764 = the odds of them beating Boston
      .485 = the odds of Cleveland reaching the second round
      .805 = the odds of them beating Cleveland

If you work out the odds for each of the eight teams, you get the following:

    NY  .596      ATL .394
    BOS .157      HOU .285
    CLE .135      SD  .207
    TEX .113      CHI .114

So not only were the Braves not a lock to win the World Series, they didn't even have a fifty-fifty chance of getting there. And, when we include the chances that each team would win the World Series, against whomever happened to make it from the other league, here are the final odds for this year's playoffs:

    NY  .394      ATL .200
    BOS .062      HOU .134
    CLE .049      SD  .086
    TEX .040      CHI .036

So the Braves had about a one-in-five shot of winning their second championship this decade.

Of course, this assumes that one league is not stronger than the other, that a .650 team in one league is as good as a .650 team in the other. Since we've now had an amateur draft for more than thirty years and free agency for over twenty years, it's reasonable to assume that the talent in the two leagues cannot deviate from equality by very much or for very long.

Going Back in History

We went back about fifty years and computed the probability of winning the World Series for each of the teams that qualified for post-season play since 1947.
Here are those chances, with the asterisk identifying the team that won it all:

    1997  BAL .179   CLE .064   SEA .091   NY  .150   ATL .246   HOU .057   SF  .103   FLA .110*
    1996  NY  .120*  CLE .236   TEX .103   BAL .078   ATL .171   STL .081   SD  .121   LA  .091
    1995  BOS .106   CLE .359   SEA .057   NY  .055   ATL .191*  CIN .134   LA  .056   COL .043
    1993  TOR .223*  CHI .209   PHI .230   ATL .337
    1992  TOR .229*  OAK .243   PIT .237   ATL .292
    1991  TOR .205   MIN .280*  PIT .298   ATL .218
    1990  BOS .130   OAK .419   PIT .246   CIN .205*
    1989  TOR .173   OAK .371*  CHI .246   SF  .210
    1988  BOS .124   OAK .378   NY  .299   LA  .199*
    1987  DET .372   MIN .151*  STL .283   SF  .194
    1986  BOS .210   CAL .157   NY  .427*  HOU .206
    1985  TOR .316   KC  .168*  STL .305   LA  .210
    1984  DET .430*  KC  .102   CHI .257   SD  .212
    1983  BAL .275*  CHI .335   PHI .185   LA  .205
    1982  MIL .290   CAL .239   STL .248*  ATL .223
    1981  NY  .111   MIL .137   OAK .215   KC  .047   MON .125   PHI .114   LA  .146*  HOU .105
    1980  NY  .358   KC  .231   PHI .223*  HOU .188
    1979  BAL .402   CAL .148   PIT .290*  CIN .160
    1978  NY  .339*  KC  .191   PHI .280   LA  .191
    1977  NY  .236*  KC  .294   PHI .267   LA  .203
    1976  NY  .271   KC  .142   PHI .269   CIN .318*
    1975  BOS .203   OAK .243   PIT .154   CIN .401*
    1974  BAL .218   OAK .187*  PIT .158   LA  .437
    1973  BAL .289   OAK .257*  NY  .114   CIN .341
    1972  DET .155   OAK .251*  PIT .274   CIN .320
    1971  BAL .314   OAK .294   PIT .246*  SF  .145
    1970  BAL .381*  MIN .190   PIT .114   CIN .316
    1969  BAL .401   MIN .203   NY  .245*  ATL .151
    1968  DET .555*  STL .445
    1967  BOS .392   STL .608*
    1966  BAL .523*  LA  .477
    1965  MIN .561   LA  .439*
    1964  NY  .555   STL .445*
    1963  NY  .477   LA  .523*
    1962  NY  .439*  SF  .561
    1961  NY  .646*  CIN .354
    1960  NY  .513   PIT .487*
    1959  CHI .608   LA  .392*
    1958  NY  .496*  MIL .504
    1957  NY  .554   MIL .446*
    1956  NY  .533*  BRO .467
    1955  NY  .477   BRO .523*
    1954  CLE .674   NY  .326*
    1953  NY  .467*  BRO .533
    1952  NY  .460*  BRO .540
    1951  NY  .540*  NY  .460
    1950  NY  .579*  PHI .421
    1949  NY  .504*  BRO .496
    1948  CLE .533*  BOS .467
    1947  NY  .554*  BRO .446

No team has ever entered the post-season with less of a chance to win it all than the 1998 Cubs.
A team with those odds (3.6%) could be expected to come home with the title once every 27 years. The Marlins were the longest shot that actually won, but longshots are becoming the norm. No champion since baseball went to the three-round format has had better than a one-in-five chance of capturing the flag.

Not surprisingly, the Yankees this season entered the post-season with the best odds since the A's of 1990. But, thanks to the three-round format, even this dominant Yankees team entered the post-season with less than a forty percent chance to win it all. By contrast, the biggest post-season underdog from 1962 to 1968, the 1967 Red Sox, had just about the same chance of winning a title (.392 to .394) as does this year's edition of the Yankees. In other words, it sure does increase your odds when you only have to beat one opponent instead of three.

So what do we get when we compare the expected championships to the ones each franchise actually got? Here is how some of the top mini-dynasties of the past fifty years have done in the post-season:

    Team    Period      ExpWS  Att   Avg  Wins     +/-
    NY  A   1947-1964   7.794   15  .520    10  +2.206
    OAK A   1971-1975   1.232    5  .246     3  +1.768
    MIN A   1987-1991    .431    2  .216     2  +1.569
    LA  N   1959-1966   1.831    4  .458     3  +1.169
    TOR A   1985-1993   1.146    5  .229     2  + .854
    PIT N   1970-1979   1.236    6  .206     2  + .764
    NY  A   1976-1981   1.315    5  .263     2  + .685
    STL N   1964-1968   1.498    3  .499     2  + .502
    LA  N   1974-1988   1.591    7  .227     2  + .409
    NY  A   1995-1998    .719    4  .180     1  + .281
    STL N   1982-1987    .836    3  .279     1  + .164
    CIN N   1970-1979   1.856    6  .309     2  + .144
    BAL A   1966-1974   2.126    6  .354     2  - .126
    KC  A   1976-1985   1.175    7  .168     1  - .175
    PHI N   1976-1983   1.338    6  .223     1  - .338
    OAK A   1988-1992   1.411    4  .353     1  - .411
    BOS A   1986-1990    .464    3  .155     0  - .464
    ATL N   1991-1998   1.655    7  .236     1  - .655
    CLE A   1995-1998    .708    4  .177     0  - .708
    PIT N   1990-1992    .781    3  .260     0  - .781
    MIN A   1965-1970    .954    3  .318     0  - .954
    BRK N   1947-1956   3.005    6  .501     1  -2.005

    Where:
      ExpWS is the expected World Series wins (sum of probabilities for each season they qualified)
      Att   is the number of attempts (trips to the post-season)
      Avg   is the average odds on each attempt
      Wins  is the number of World Series won by the team
      +/-   is Wins - ExpWS

So four other teams (two this decade alone) have been even bigger post-season disappointments than the Braves. Despite missing on six of their seven trips to the playoffs, Atlanta is only a little more than half a championship behind where they ought to be. If they manage to win a World Series in the next couple of years, they will have gotten as much out of their "dynasty" as anyone should have expected them to.

By the way, the 7.794 expected wins the Yankees amassed from 1947 to 1964 will almost certainly never be challenged again. That total works out to an average of .433 a year for the period (and that's including the years when they didn't win their pennant). The Yankees this year won an AL record 114 games, were overwhelming favorites to take each of their playoff series, and still had only a .394 chance of winning the series.

Appendix: An Approach Based on Probabilities

We can estimate the chance that the better team will win using four simple steps:
To reinforce this point, let's work through an example using the 1998 World Series teams. Having finished the season with 16 more wins than the Padres, the Yankees have a theoretical winning percentage of .600 in each game (.500 plus the gap of roughly .100 between their .704 and the Padres' .605). Without considering the home-field advantage, the Yankees should beat the Padres in a seven-game series 71% of the time. Here's how:

    Result  #ways  Probability        Value if P=.600
    ------  -----  ---------------    -----------------------------------
    4-0       1    P**4                1 * .600**4             = .130
    4-1       4    P**4 * (1-P)        4 * .600**4 * .400      = .207
    4-2      10    P**4 * (1-P)**2    10 * .600**4 * .400**2   = .207
    4-3      20    P**4 * (1-P)**3    20 * .600**4 * .400**3   = .166
                                                        Total    .710
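The table generalizes to any series length. The sketch below is our own illustration rather than code from the original study: the function name `series_win_prob` is hypothetical, and the "#ways" column is computed with a binomial coefficient, on the reasoning that the winner must take the final game while the earlier games can fall in any order.

```python
from math import comb


def series_win_prob(p: float, wins_needed: int = 4) -> float:
    """Chance that a team winning each game with probability p is the
    first to reach `wins_needed` wins (no home-field adjustment)."""
    total = 0.0
    for losses in range(wins_needed):
        # Number of orderings of the games before the clinching win.
        ways = comb(wins_needed - 1 + losses, losses)
        total += ways * p**wins_needed * (1 - p)**losses
    return total


# Best-of-3, best-of-5, and best-of-7 for a .600 favorite.
for length, needed in ((3, 2), (5, 3), (7, 4)):
    print(length, f"{series_win_prob(0.600, needed):.3f}")
```

For P=.600 the three results agree with the bottom row of the simulation table (.648, .683, .710), a useful cross-check between the two methods.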
The odds of winning change slightly if you add the home field advantage. If you're comfortable with math, you can add the home field factor to the equation without too much difficulty. If P is the chance that the favored team will win a game, and H is the home field advantage, then the chances for each team to win a given game are:

                     At home      On the road
    Favorite wins    P + H        P - H
    Underdog wins    1 - P + H    1 - P - H

Now the chances that a favorite with the home field advantage will win a seven-game series (in a format where the favored team is at home in games 1, 2, 6 and 7) can be computed as follows. The last column shows the actual numbers for the 1998 Yankees, assuming an expected winning percentage of .600 and a home field advantage of .040, which shows that the home-field advantage increases their chance of winning the series to 72.2%:

            Road                                                    Value if P=.600,
    Result  Wins  #ways  Probability                                H=.040
    ------  ----  -----  -----------------------------------------  ----------------
    4-0      2      1    (P+H)**2 * (P-H)**2                             .129
    4-1      2      2    (P+H)**2 * (P-H)**2 * (1-P+H)                   .113
    4-1      3      2    (P+H) * (P-H)**3 * (1-P-H)                      .081
    4-2      1      3    (P+H)**3 * (P-H) * (1-P+H)**2                   .085
    4-2      2      6    (P+H)**2 * (P-H)**2 * (1-P+H) * (1-P-H)         .122
    4-2      3      1    (P+H) * (P-H)**3 * (1-P-H)**2                   .015
    4-3      0      1    (P+H)**4 * (1-P+H)**3                           .014
    4-3      1      9    (P+H)**3 * (P-H) * (1-P+H)**2 * (1-P-H)         .092
    4-3      2      9    (P+H)**2 * (P-H)**2 * (1-P+H) * (1-P-H)**2      .066
    4-3      3      1    (P+H) * (P-H)**3 * (1-P-H)**3                   .005
                                                             Total       .722
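Rather than listing every row by hand, the same total can be reached by walking through the schedule one game at a time. This is a sketch under our own assumptions: the function name and the `schedule` string ("H" for a game in the favorite's park, "R" for a road game) are invented for illustration, with the default matching the home-in-games-1-2-6-7 format described above.

```python
from functools import lru_cache


def series_win_prob_home(p: float, h: float, schedule: str = "HHRRRHH") -> float:
    """Chance the favorite wins a series, where its single-game chance is
    p + h at home ("H") and p - h on the road ("R")."""
    needed = len(schedule) // 2 + 1

    @lru_cache(maxsize=None)
    def win_from(game: int, fav_wins: int, dog_wins: int) -> float:
        if fav_wins == needed:
            return 1.0
        if dog_wins == needed:
            return 0.0
        p_game = p + h if schedule[game] == "H" else p - h
        # Either the favorite wins this game or the underdog does.
        return (p_game * win_from(game + 1, fav_wins + 1, dog_wins)
                + (1 - p_game) * win_from(game + 1, fav_wins, dog_wins + 1))

    return win_from(0, 0, 0)


# The 1998 Yankees example: P = .600, H = .040, home in games 1, 2, 6 and 7.
print(f"{series_win_prob_home(0.600, 0.040):.3f}")
```

Setting `h` to zero recovers the no-home-field figure of .710, and a different schedule string covers other formats, such as a best-of-five.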
Copyright © 1998. Diamond Mind, Inc. All rights reserved.