Tuesday, September 30, 2014

Dodgers' chances against Cardinals in NLDS look good, mathematically

The long regular season is over, and it went more or less as I expected (more on that in the winter). Of course, the details and the heat of the moment are fun to watch and why I'm a fan, but when you take a step back you see it all more or less evened out and the teams with the most talent are pretty much the ones still playing in October.

The Dodgers get the Cardinals in the first round and on paper, the Cardinals are not that scary. The Cardinals have a mystique about them though. They knocked the Dodgers out last year and they managed to win 90 games despite outscoring their opponents by only 16 runs over the course of the season. So, they have mojo, will to win, intangible grit, etc.

Be that as it may, what they don't have is the talent of the Dodgers. I think the Dodgers have about a 68 percent chance of advancing to the NLCS.

How did I get to that number?  First off, let's look at the likely starting pitchers:


Game  LA Starter  STL Starter  Park
1                              LA
2                              LA
3     Ryu*                     STL
4                              STL
5     Kershaw*    Wainwright   LA


A simple model would take the first initial of each Cardinals starter as the outcome and predict the Cardinals going W-L-W-L-W. I have a somewhat more numerical approach.

Let's first take a look at the ERA- (the lower the number, the better) of each pitcher and the average number of innings they went per start this year. ERA- tells how many runs they allow relative to an average pitcher, with 100 being average, so Kershaw's 50 means he allows about half as many. This isn't the ERA- they put up in 2014 so much as an amalgam of ERA-, FIP-, and xFIP- that leads me to believe this is about how well they will perform in this series.


Pitcher     ERA-  IP/start
Kershaw     50    7.3
Greinke     80    6.3
Ryu         85    5.8
Haren       110   5.8
Wainwright  75    7.1
Lynn        90    6.2
Wacha       95    5.6
Lackey      105   6.0

When the starter comes out, the Dodgers' bullpen has an ERA- of 105 and the Cardinals' 100.

But the Dodgers and Cardinals are not average offenses, so we need to take that into account. I'll use wRC+ as a measure of how much better than average each offense is:


Team  wRC+  wRC+ vs. L  wRC+ vs. R
LA    111   106         112
STL   95    104         93

I used the overall wRC+ against the bullpens and the handedness splits against the starters.
Finally, I added a defense effect and a park factor effect. The Cardinals' defense saved about .185 runs per game compared to an average team while the Dodgers' defense cost an extra .074 or so. Dodger Stadium has a 96 park factor while Busch Stadium has a 98.

So in the end, to get how many runs each team is expected to score in each game, I take the NL average offense (3.95 runs/game) * the wRC+ split * the opposing starter's ERA- for the starter's share of the innings, plus the NL average * overall wRC+ * the opposing bullpen's ERA- for the bullpen's share. Then I add the defense runs and multiply the total by the park factor.

For example, in game 1, the Dodgers are expected to score 
((3.95 * ((1.12 * .75 * 7.1/8.5)  + (1.11 * 1 * 1.4/8.5))) -.185) * .96 = 3.2
while the Cardinals are expected to score
((3.95 * ((1.04 * .5 * 7.3/8.5)  + (.95 * 1.05 * 1.2/8.5))) +.074) * .96 = 2.3
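For anyone who wants to play with the inputs, here is a rough Python sketch of the same arithmetic. The function name, argument names, and the 8.5-inning constant pulled out of the example above are my own labels for illustration, not necessarily how the numbers in this post were actually produced.

```python
NL_AVG_RUNS = 3.95   # NL average runs per game, as quoted in the post
GAME_INNINGS = 8.5   # innings denominator used in the worked example


def expected_runs(wrc_vs_starter, starter_era_minus, starter_ip,
                  wrc_overall, bullpen_era_minus, defense_adj, park_factor):
    """Expected runs for one offense in one game.

    Index stats (wRC+, ERA-, park factor) are passed as ratios, e.g. 112 -> 1.12.
    defense_adj is the runs per game the opposing defense costs (+) or saves (-).
    """
    starter_share = starter_ip / GAME_INNINGS
    bullpen_share = 1 - starter_share
    # Offense quality times pitching quality, weighted by who is on the mound
    rate = (wrc_vs_starter * starter_era_minus * starter_share
            + wrc_overall * bullpen_era_minus * bullpen_share)
    return (NL_AVG_RUNS * rate + defense_adj) * park_factor


# Game 1 at Dodger Stadium: LA hitters vs. Wainwright, STL hitters vs. Kershaw
la_runs = expected_runs(1.12, 0.75, 7.1, 1.11, 1.00, -0.185, 0.96)
stl_runs = expected_runs(1.04, 0.50, 7.3, 0.95, 1.05, +0.074, 0.96)
print(round(la_runs, 1), round(stl_runs, 1))  # 3.2 2.3
```

Swapping in another starter's ERA- and innings per start from the table gives that game's estimate.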

Using the Pythagorean formula, 3.2^2 / (3.2^2 + 2.3^2) = 66% chance of the Dodgers winning game 1. You might recall the Dodgers have won 20 of the last 21 Kershaw starts! But Wainwright is pretty good too. He even started the all-star game #neverforget.
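The Pythagorean step is short enough to sketch in a couple of lines of Python. The function name is made up for illustration, and the `exponent` parameter is an assumption on my part so the commonly used alternatives to 2 can be tried; the post itself uses 2.

```python
def pythagorean_win_prob(runs_for, runs_against, exponent=2.0):
    """Win probability from expected runs scored and allowed."""
    rf = runs_for ** exponent
    ra = runs_against ** exponent
    return rf / (rf + ra)


# Game 1 expected runs from the example above
print(f"{pythagorean_win_prob(3.2, 2.3):.0%}")  # 66%
```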

We can do this for each game. I actually see the Dodgers as having the advantage in every game:

Game  LA runs  STL runs  LA WPct
1     3.2      2.3       66%
2     3.8      3.1       59%
3     4.0      3.6       55%
4     4.3      4.0       54%
5     3.2      2.3       66%

Determining the outcome of the series is then just a Markov chain process. The chance of the Dodgers winning games 1 and 2 is 66% * 59% = 39%. The chance of the Dodgers winning one and losing one is the chance they win the first and lose the second plus the chance they lose the first and win the second: 66% * 41% + 34% * 59% = 47%. And so on and so on, but 3 wins for either team ends the series.
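The branching described above can be sketched in Python by carrying the probability of every live series score forward one game at a time. This is only an illustration under my own naming (`series_distribution`, `la_probs`), using the rounded per-game percentages, so the results can differ by a point from figures computed with unrounded inputs.

```python
from collections import defaultdict


def series_distribution(game_probs, wins_needed=3):
    """Map each final (LA wins, STL wins) score to its probability."""
    states = {(0, 0): 1.0}        # series scores still in progress
    final = defaultdict(float)    # series scores where someone has clinched
    for p in game_probs:
        next_states = defaultdict(float)
        for (w, l), prob in states.items():
            # Branch on LA winning (probability p) or losing (1 - p) this game
            for nw, nl, q in ((w + 1, l, p), (w, l + 1, 1 - p)):
                target = final if wins_needed in (nw, nl) else next_states
                target[(nw, nl)] += prob * q
        states = next_states
    return dict(final)


la_probs = [0.66, 0.59, 0.55, 0.54, 0.66]  # LA win chance, games 1 through 5
dist = series_distribution(la_probs)
la_advances = sum(prob for (w, l), prob in dist.items() if w == 3)

for (w, l), prob in sorted(dist.items(), reverse=True):
    print(f"{w}-{l}: {prob:.0%}")
print(f"LA advances: {la_advances:.0%}")  # LA advances: 68%
```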

In the end, we get to the probability distribution of outcomes for the series from the Dodgers' point of view:

Result (LA-STL)  Probability
3-0              21%
3-1              23%
3-2              23%
2-3              12%
1-3              13%
0-3              6%

So the Dodgers' chance of winning the series is the sum of the three winning outcomes, 21% + 23% + 23%, which comes to 68% before rounding. The most likely outcome, technically, is a 3-1 series win for the Dodgers. Anything could happen of course, and I can't wait for Friday. But luck favors the prepared.

Photo credit: Bryce Edwards, Flickr