May the best team win ... at least some of the time
By Tom Tippett
October 1, 2002
Baseball cliché: "Anything can happen in a short series."
Observation #1: Since 1990, the team with the best overall regular-season
record has won one World Series.
Observation #2: Since the extra round of playoffs was added in
1995, the team with the better regular season record has won 13 division
series and lost 13 of them, with two contested by teams with identical
records.
Observation #3: In league championship series play since 1990,
the team with the better regular season record (among the two teams in
the LCS) has won 12 times and lost 9 times, with one LCS involving teams
with the same record.
In other words, it's not easy to go all the way. And this isn't a recent
phenomenon. Since division play began in 1969, the team with the best
regular-season record in baseball has won the World Series only 8 times
in 32 tries. (Can you name these eight teams? Answer below.)
To put it another way, it's a fool's game to predict the winner of a
postseason tournament, even when one team has dominated the regular season.
But baseball is supposed to be fun, so we're going to have a little fun
with the numbers to see what we can learn about the chances of each of
this year's contenders.
To do that, we'll start by assessing each team's chances to win one game
against a given opponent. We'll use that information to estimate each
team's chances to win a series against that opponent. And we'll put those
figures together to estimate each team's chances of winning three consecutive
series.
Estimating one-game winning percentages
In the 1981 Bill James Baseball Abstract, Bill introduced the
log5 method to answer the question, "how often should team A be expected
to beat team B?" It took him several pages to describe and justify
the method, so we won't take the space to do all of that again here. Instead,
we'll just give the formula:
WPct = (A - A * B) / (A + B - 2 * A * B)
where A is team A's winning percentage and B is team B's winning percentage.
In other words, if you have a .600 team playing a .400 team, this method
shows that the better team can be expected to win 69.2% of the games between
these two teams:
WPct = (.600 - .600 * .400) / (.600 + .400 - 2 * .600 * .400) = .360 / .520 = .692
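The formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name `log5` is our own label, and the sketch does not guard against the degenerate cases where both teams are .000 or both are 1.000.

```python
def log5(a: float, b: float) -> float:
    """Expected winning percentage of team A against team B.

    a and b are the two teams' overall winning percentages (0 to 1).
    """
    return (a - a * b) / (a + b - 2 * a * b)


# The worked example from the text: a .600 team against a .400 team.
print(f"{log5(0.600, 0.400):.3f}")  # 0.692

# Two evenly matched teams come out at .500, whatever their record.
print(f"{log5(0.550, 0.550):.3f}")  # 0.500
```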
If you were to take A's winning percentage as a given (say .600) and
solve this equation for all possible values of B, you could determine
A's chances in games against any conceivable opponent. And if you graphed
those values, you'd see a curve, not a straight line.
But it's a gentle curve and the middle portion of that curve is very
close to a straight line. That makes it possible to substitute a simpler
straight-line formula that gives very similar results in the range of
.400 to .600:
WPct = .500 + A - B
For example, if A is .550, the log5 and straight-line methods produce
values that differ by no more than .001 whenever B is in the range of
.400 to .600. The further A gets away from .500, the bigger the differences,
but they are still manageable. If A is .600, the difference is as much
as .005 when B is close to .400 but is still within .002 for all values
of B from .440 to .630.
In other words, because almost all baseball teams fall into this range
of .400 and .600, and because the differences are smallest when A and
B are close to each other, the straight-line formula is a handy alternative
that works for the vast majority of matchups.
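For readers who want to see how the two formulas track each other for closely matched teams, here is a small sketch. The helper names (`log5`, `straight_line`, `gap`) are ours, and the sample values of B are chosen only for illustration.

```python
def log5(a: float, b: float) -> float:
    """log5 estimate of A's winning percentage against B."""
    return (a - a * b) / (a + b - 2 * a * b)


def straight_line(a: float, b: float) -> float:
    """Straight-line approximation: .500 plus the difference in records."""
    return 0.500 + a - b


def gap(a: float, b: float) -> float:
    """Absolute difference between the two estimates."""
    return abs(log5(a, b) - straight_line(a, b))


# Tabulate both estimates for a .550 team against nearby opponents.
for b in (0.450, 0.500, 0.550):
    print(f"B={b:.3f}  log5={log5(0.550, b):.3f}  "
          f"straight={straight_line(0.550, b):.3f}  gap={gap(0.550, b):.4f}")
```

The two methods agree exactly when the opponent is a .500 team, and the gap widens as the two records move apart.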
A few years ago, Tom Ruane wrote a program that looked at the result
of every AL and NL game from 1901 to 1997. The program placed each team
into one of twenty groups based upon their winning percentage for that
season. All teams with winning percentages less than .330 went into group
A; those with winning percentage between .330 and .350 went into group
B, and so on up to the top group, which had all teams with winning percentages
greater than .690. For each game, the program figured out what type of
matchup it was (e.g. group C vs group F) and then added the game result
to the totals for that matchup.
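The grouping step described above might look something like the sketch below. This is not Tom Ruane's program: the group letters A through T, the .020-wide bands, the handling of teams that sit exactly on a boundary, and the sample games are all assumptions made for illustration.

```python
from collections import defaultdict
import string


def group_for(wpct: float) -> str:
    """Map a season winning percentage to one of twenty groups, 'A'..'T'."""
    if wpct < 0.330:
        return "A"
    if wpct > 0.690:
        return "T"
    # Eighteen .020-wide bands cover .330 through .690 (groups B..S).
    # Assumption: a team exactly on a boundary goes into the higher band.
    band = min(int(round((wpct - 0.330) * 1000)) // 20, 17)
    return string.ascii_uppercase[1 + band]


def tally(games):
    """Accumulate results by matchup type.

    games: iterable of (winner_wpct, loser_wpct), one pair per game.
    Returns {(low_group, high_group): [wins by low group, wins by high group]}.
    Games between two teams in the same group are all counted in slot 0.
    """
    totals = defaultdict(lambda: [0, 0])
    for winner_wpct, loser_wpct in games:
        gw, gl = group_for(winner_wpct), group_for(loser_wpct)
        key = tuple(sorted((gw, gl)))
        totals[key][0 if key[0] == gw else 1] += 1
    return totals


# Hypothetical sample: a .600 team wins two of three from a .400 team.
sample = [(0.600, 0.400), (0.600, 0.400), (0.400, 0.600)]
for matchup, (low_wins, high_wins) in tally(sample).items():
    print(matchup, low_wins, high_wins)
```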
That study showed that these formulas are very accurate predictors of
the actual winning percentages in matchups involving these different groups.
If you read that article, you'll see that we focused on the straight-line
method, but it's not hard to see that the log5 method would have provided an
even better fit for the 1901-1997 results that we compiled. We'll use the log5
method for the remainder of this article.
Adding in the home-field advantage
Historically, home teams have consistently won at a .540 pace (see The
Hidden Game of Baseball by John Thorn and Pete Palmer), so we need
to know who's at home if we're going to get the best read on the expected
winning percentage for a single game. This year, the winning percentage
for home teams was .542, so we'll give the home team a .042 boost in its
projected winning percentage when we extend the log5 estimate to assess
each team's chances in a playoff series.
Estimating winning percentages for a 5-game or 7-game series
We can use these one-game winning percentages to assess the chances to
win a series by (a) adding in the home-field adjustment for each game,
(b) multiplying the single-game probabilities to get a probability that
a team will complete the series with a certain pattern of wins and losses,
(c) repeating this step for all possible patterns of wins and losses,
and (d) adding up the probabilities for all patterns that produce a series
win for the team.
For example, the probability of winning a five-game series is the sum
of the chances of sweeping, winning 3-1, or winning 3-2. There are ten
ways a team can be first to win three games:
Result   Patterns
------   --------
3-0      WWW
3-1      LWWW, WLWW, WWLW
3-2      LLWWW, LWLWW, LWWLW, WLLWW, WLWLW, WWLLW
For example, if a .600 team is playing a .400 team, we've already established
that it has a .692 chance to win each game on a neutral field. If games
one and two are at home and game three is on the road, its chances to sweep
the series in three games are:
(.692 + .042) * (.692 + .042) * (.692 - .042) = .350
or 35%. We can use similar logic to compute the probabilities for the
patterns that produce a 3-1 or 3-2 win, add them up, and presto, we have
the probability that the favored team will win the series one way or another.
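Steps (a) through (d) can be sketched as follows. The function names are ours, and the 2-2-1 home schedule (home for games one, two, and five) is an assumption: the text only says that games one and two are at home. The sketch also does not clamp probabilities, so it is meant for the ordinary range of team records.

```python
from itertools import combinations

HOME_EDGE = 0.042  # the home-field boost used in this article


def winning_patterns(wins_needed: int = 3) -> list:
    """All W/L sequences in which the team is first to reach wins_needed."""
    patterns = []
    for losses in range(wins_needed):
        n_before = wins_needed - 1 + losses  # games before the clinching win
        for loss_slots in combinations(range(n_before), losses):
            seq = ["L" if i in loss_slots else "W" for i in range(n_before)]
            patterns.append("".join(seq) + "W")
    return patterns


def pattern_prob(pattern: str, p_neutral: float, home_games) -> float:
    """Probability of one exact pattern, adjusting each game for venue."""
    prob = 1.0
    for result, at_home in zip(pattern, home_games):
        p_win = p_neutral + (HOME_EDGE if at_home else -HOME_EDGE)
        prob *= p_win if result == "W" else 1.0 - p_win
    return prob


def series_win_prob(p_neutral: float, home_games, wins_needed: int = 3) -> float:
    """Sum the probabilities of every pattern that wins the series."""
    return sum(pattern_prob(p, p_neutral, home_games)
               for p in winning_patterns(wins_needed))


# Assumed schedule: True means the team is at home for that game.
HOME_2_2_1 = (True, True, False, False, True)

print(len(winning_patterns(3)))                          # 10
print(f"{pattern_prob('WWW', 0.692, HOME_2_2_1):.3f}")   # 0.350
print(f"{series_win_prob(0.692, HOME_2_2_1):.3f}")
```

Passing `wins_needed=4` and a seven-game home schedule gives the same calculation for a best-of-seven series.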
Using this method, here are the results for this year's division series
matchups:
Matchup      Favorite
---------    ---------
ANA @ NY     NY   .574
MIN @ OAK    OAK  .617
SF @ ATL     ATL  .596
SL @ ARI     ARI  .528
In other words, this model says that the Yankees have a 57.4% chance
of beating the Angels in a five-game series when New York has the home-field
advantage, while Oakland has a bigger edge (61.7%) over the Twins.
We can move on to project the league championship series results. Of
course, we don't know yet who will win each of the first-round matchups,
so we'll need to do this for all possible outcomes of the first round:
Matchup      Favorite
---------    ---------
ANA @ OAK    OAK  .571
ANA @ MIN    ANA  .548
MIN @ NY     NY   .640
OAK @ NY     NY   .523
SF @ ARI     ARI  .547
SF @ SL      SL   .534
SL @ ATL     ATL  .587
ARI @ ATL    ATL  .573
Finally, there are sixteen possible matchups for the World Series, with
the AL champion having the home-field advantage no matter who makes it
that far:
       @ NY       @ OAK       @ MIN       @ ANA
       --------   --------    --------    --------
ATL    NY  .534   OAK .525    ATL .594    ATL .537
ARI    NY  .594   OAK .585    ARI .535    ANA .527
SL     NY  .607   OAK .598    SL  .521    ANA .541
SF     NY  .627   OAK .618    tie .500    ANA .561
Going all the way
We now have the series-winning probabilities for every possible matchup,
so we can put it all together and project the chances that each team will
go all the way, given who it might have to face at each step.
The Yankees, for example, have a probability of .574 to beat Anaheim
and advance to the ALCS. There's a .617 chance they'll face Oakland and
a .523 chance they would beat the A's in that series, so their chances
to go to the World Series through Oakland are .574 * .617 * .523 = .185.
But there's a .383 chance they'll face Minnesota and a .640 chance they'd
beat the Twins, so their chances to go to the World Series through Minnesota
are .574 * .383 * .640 = .141.
Add these two possibilities together and you get a probability of .326,
or about one chance in three, that Yankee Stadium will host game one of
the World Series.
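The Yankees' pennant arithmetic above can be written out as a short check. The variable names are ours; the probabilities are the ones quoted in the tables.

```python
# Series-win probabilities taken from the tables in this article.
p_ny_beats_ana = 0.574   # division series
p_oak_beats_min = 0.617  # the other AL division series
p_ny_beats_oak = 0.523   # possible ALCS matchup
p_ny_beats_min = 0.640   # the other possible ALCS matchup

# Two routes to the World Series: through Oakland or through Minnesota.
via_oak = p_ny_beats_ana * p_oak_beats_min * p_ny_beats_oak
via_min = p_ny_beats_ana * (1 - p_oak_beats_min) * p_ny_beats_min
p_pennant = via_oak + via_min

print(f"{via_oak:.3f} {via_min:.3f} {p_pennant:.3f}")  # 0.185 0.141 0.326
```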
We can repeat this process for the other seven teams and then extend
it to include the probability of winning the World Series. And when we
do that, we come up with the following (drumroll, please):
NY 19.0% chance to win World Series
OAK 18.3%
ATL 17.4%
ARI 11.1%
ANA 10.4%
SL 9.3%
SF 7.6%
MIN 6.9%
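For anyone who wants to retrace the whole calculation, the sketch below chains the three rounds together using the series probabilities from the tables above. The dictionary layout and function names are our own; this is one straightforward way to organize the arithmetic, not necessarily how the original figures were produced. Run with these inputs, it should land within rounding of the percentages listed.

```python
# Favorite's chance to win each series, keyed by (favorite, underdog),
# copied from the tables in this article.
DS = {("NY", "ANA"): .574, ("OAK", "MIN"): .617,
      ("ATL", "SF"): .596, ("ARI", "SL"): .528}
LCS = {("OAK", "ANA"): .571, ("ANA", "MIN"): .548, ("NY", "MIN"): .640,
       ("NY", "OAK"): .523, ("ARI", "SF"): .547, ("SL", "SF"): .534,
       ("ATL", "SL"): .587, ("ATL", "ARI"): .573}
WS = {("NY", "ATL"): .534, ("OAK", "ATL"): .525, ("ATL", "MIN"): .594,
      ("ATL", "ANA"): .537, ("NY", "ARI"): .594, ("OAK", "ARI"): .585,
      ("ARI", "MIN"): .535, ("ANA", "ARI"): .527, ("NY", "SL"): .607,
      ("OAK", "SL"): .598, ("SL", "MIN"): .521, ("ANA", "SL"): .541,
      ("NY", "SF"): .627, ("OAK", "SF"): .618, ("SF", "MIN"): .500,
      ("ANA", "SF"): .561}


def beats(table, a, b):
    """Chance that team a beats team b in the series described by table."""
    if (a, b) in table:
        return table[(a, b)]
    return 1.0 - table[(b, a)]


def pennant_odds(series_a, series_b):
    """Chance of each team in a league reaching the World Series.

    series_a and series_b are the league's two division-series pairings.
    """
    odds = {}
    for own, other in ((series_a, series_b), (series_b, series_a)):
        for team, first_opp in (own, own[::-1]):
            p_advance = beats(DS, team, first_opp)
            # Weight each possible LCS opponent by its chance of advancing.
            p_lcs = sum(beats(DS, opp, opp_rival) * beats(LCS, team, opp)
                        for opp, opp_rival in (other, other[::-1]))
            odds[team] = p_advance * p_lcs
    return odds


al = pennant_odds(("NY", "ANA"), ("OAK", "MIN"))
nl = pennant_odds(("ATL", "SF"), ("ARI", "SL"))

title_odds = {}
for al_team, p_al in al.items():
    for nl_team, p_nl in nl.items():
        p = beats(WS, al_team, nl_team)
        title_odds[al_team] = title_odds.get(al_team, 0.0) + p_al * p_nl * p
        title_odds[nl_team] = title_odds.get(nl_team, 0.0) + p_al * p_nl * (1 - p)

for team, p in sorted(title_odds.items(), key=lambda kv: -kv[1]):
    print(f"{team:4s}{p:.1%}")
```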
Aren't we missing something?
Actually, we're missing a lot of things.
This approach doesn't take into account the starting pitchers in each
game. If Randy Johnson and Curt Schilling can replicate what they did
last year, Arizona's chances increase. Schilling hasn't pitched well lately,
but he might be able to turn it on again when things really matter.
We're assuming the home-field advantage is the same for everyone, and
Minnesota fans can point to 1987 and 1991 as proof that their home-field
edge is bigger than most. Then again, all of this year's playoff teams
won between 50 and 55 games at home during the regular season, so nobody
stands out in this regard.
This method uses regular-season winning percentages as the basis for
all matchups. You could argue that other indicators, such as runs scored
minus runs allowed, might be a better gauge of team talent. Using run
differentials, the chances for Anaheim and San Francisco increase, mostly
at the expense of Minnesota. (Of course, if run differentials were paramount,
the Red Sox would still be playing, the A's would be booking tee times,
and the White Sox and Twins would be in a one-game playoff for the AL
Central title.)
The use of regular-season winning percentages also assumes that what
happened over a six-month period is indicative of how the teams stack
up right now. The unbalanced schedule skews things, with the teams in
the two West divisions having battled much harder to achieve their records.
Nobody would argue that Arizona is at full strength with Luis Gonzalez
on the sidelines for the duration. And anyone who has watched the Yankees
dial it up about three notches in almost every October since 1996 has
to consider the possibility that they could do that again.
One other thing. This method assumes that the probability of winning
one game in a series is independent of anything that has already happened
in previous series games. Any baseball fan knows this isn't true. One
team may wear out its bullpen more than the other. "Destiny"
or "momentum" may somehow favor one side or the other. Underdogs
who lose a couple of close games may subconsciously realize they're not
going to come back to win the series. Since 1989, there have been quite
a few more series sweeps than this model would predict, suggesting that
there are real effects that carry over from game to game.
Let's cut these guys a little slack
Much has been made of the fact that the Atlanta Braves have only one
World Series victory despite winning their division in every full season
since 1991. They have indeed come up a little short, but their postseason
record isn't as bad as you might think.
In seven cracks at the division series, Atlanta has won six times. That
includes a 5-for-5 showing when they entered that series with a better
record than their opponent, 1-for-1 as the underdog, and one loss (to
St. Louis in 2000) when the records were the same.
Atlanta has been in nine of the last ten National League Championship
Series. As the favorites, they won four times in seven tries, which is
about par for the course. As underdogs, they have one win in two tries.
That's not bad.
It's only in the World Series that the Braves have failed to achieve
their full potential, going 1-for-5 since 1991. As underdogs, they've
won one (over Cleveland in 1995) and lost one (to Minnesota in 1991).
As favorites, they were upset by the Blue Jays in 1992, the Yankees in
1996, and the Yankees again in 1999.
Overall, in 21 postseason series against very good teams, the Braves
have 12 series wins and 9 losses to their credit. As favorites, they are
9-6. As underdogs, they've gone 3-2, and they lost the one series in which
the records were even. That's not bad, not bad at all.
Of course, what stands out are those three World Series losses when they
were favored. But all it would take is one more run to the title to erase
a lot of those bad memories.
Consider this. If Atlanta does go all the way this year, that would give
them a 15-9 record in postseason series since 1991 and two World Series
wins in 11 trips to the postseason, with both wins coming since the third
round of postseason play made this journey so much more difficult. If
that happens, I hope they get the monkeys off their backs once and for
all.
But that's a very big IF.
The bottom line
Every one of the eight qualifiers for the 2002 postseason comes in with
some question marks. Oakland and Minnesota rank only 8th and 9th in the
AL in scoring, while Anaheim relied on timely hitting all year. All three
teams may struggle to score enough runs against good pitching. The Yankees
have the best offense by far, but their pitching hasn't been as good or
as healthy as they would like.
Atlanta's no-name bullpen must keep doing what it's been doing, and their
10th-ranked offense mustn't break down. Arizona has to make up for the
loss of Gonzalez and hope that Schilling gets going again. The Cardinals
have had a remarkable season given everything they've had to deal with,
and may be this year's team of destiny, but their starting rotation is
a very big question mark right now. For San Francisco, Barry Bonds must
come up big and he must get some help, while the Giants' pitching staff
needs to show that its #2 ranking in the NL isn't just an illusion created
by their home park.
The bottom line is that anything can happen this year, especially on
the NL side. That's not news, of course. The record over the past 32 years
is proof enough of that.
So even though the model described in this article leaves some things
out, it's still worth noting that it's much more likely that the top-seeded
Yankees won't win the whole thing than that they will. George
Steinbrenner may be able to buy enough talent to win the AL East title
every year, but it's not nearly as easy to buy three straight series wins
against good teams.
Trivia answer: Since divisional play began in 1969, the eight
teams that have won the World Series after posting the best regular-season
record are the 1970 Orioles, 1975-76 Reds, 1978 Yankees, 1984 Tigers,
1986 Mets, 1989 Athletics, and 1998 Yankees.
