### Wednesday, March 30, 2005

## The Markov Value of the Stolen Base

I posted some thoughts on the value of the stolen base on Dodger Logs (a good site for the Dodger Sabermetric fans) a few days ago and got a positive response, so here is an edited and expanded version...

The following article on the Markov/stochastic process as applied to baseball is a good read for the non-math geeks.

http://www.harvardmagazine.com/on-line/050221.html

This transient-state quantification of baseball is an elegant way of modeling out-by-out/base-by-base sequences, and is helpful in determining the value of certain events, such as walks, stolen bases, sacrifice bunts, etc. I'll illustrate the value of the stolen base here.

Look at the green chart in the middle of the article. It gives you the historical "expected" run value from the 01 AL season depending on the out count and the runner situation. There are 8 possible runner situations (no one on, runner on 1st, runners on 1st and 2nd, etc.) for each of 0, 1, or 2 outs. So there are 24 possible out/runner combinations, or "states" in math-speak.

Let's consider a stolen base from 1st to 2nd with 0 outs. I'm taking numbers directly from that chart.

ExpectedRunValue(0 outs, runner on 1st) = 0.907

ExpectedRunValue(0 outs, runner on 2nd) = 1.138

Difference = 0.231

If the runner is successful, according to the data he has added 0.231 "expected" runs to his team.

If he's caught,

ExpectedRunValue(0 outs, runner on 1st) = 0.907

ExpectedRunValue(1 out, bases empty) = 0.294

Difference = 0.613

If the runner is caught, he has subtracted 0.613 "expected" runs from his team.
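For anyone who wants to poke at this on a computer, here is a minimal Python sketch of the state idea. The names (`STATES`, `RUN_VALUE`, `run_value_change`) are my own and not from the article, and the lookup table only holds the three chart values quoted above; the other 21 states are left out rather than guessed.

```python
from itertools import product

# A state is (outs, runners), where runners is a tuple of occupied bases.
RUNNER_SITUATIONS = [(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]
STATES = list(product(range(3), RUNNER_SITUATIONS))  # 3 out counts x 8 situations

# Only the three chart values quoted in this post; the rest are omitted.
RUN_VALUE = {
    (0, (1,)): 0.907,  # 0 outs, runner on 1st
    (0, (2,)): 1.138,  # 0 outs, runner on 2nd
    (1, ()): 0.294,    # 1 out, bases empty
}


def run_value_change(before, after):
    """Change in expected runs when the inning moves from `before` to `after`."""
    return RUN_VALUE[after] - RUN_VALUE[before]


start = (0, (1,))
gain = run_value_change(start, (0, (2,)))  # successful steal of 2nd
loss = run_value_change(start, (1, ()))    # caught stealing

print(len(STATES))     # 24
print(f"{gain:+.3f}")  # +0.231
print(f"{loss:+.3f}")  # -0.613
```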

So at what success rate does the runner actually help the team? Let's determine the breakeven point.

Let:

x= success rate

1-x = caught rate

0.231x = (1-x)*0.613

0.231x +0.613x = 0.613

0.844x = 0.613

x = 72.6 %
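The algebra above boils down to x = loss / (gain + loss). As a quick sketch (the function name `breakeven_rate` is just something I made up for illustration), in Python:

```python
def breakeven_rate(gain: float, loss: float) -> float:
    """Success rate x that solves gain * x = (1 - x) * loss."""
    if gain <= 0 or loss <= 0:
        raise ValueError("gain and loss must both be positive run values")
    return loss / (gain + loss)


# 0 outs, stealing 2nd: gain 0.231 if safe, loss 0.613 if caught.
rate = breakeven_rate(gain=0.231, loss=0.613)
print(f"{rate:.1%}")  # 72.6%
```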

With 0 outs, the breakeven point is 72.6% according to the data. With 1 and 2 outs, the breakeven point is similar: 71% with 1 out, and 68.9% with 2 outs. Basically you have to steal at a rate significantly better than 70% to 75% to add any relevant value to your team when stealing 2nd.

If the runner is stealing 3rd from 2nd with no one on 1st, the breakeven points are 80% with 0 outs, 75.2% with 1 out, and 88.7% with 2 outs - figures all higher than for stealing 2nd. They also fluctuate more, but consider that:

1) It's highly likely that a runner would score from 2nd with 0 outs - by an RBI hit in 3 chances or by 2 productive outs - so it might be wise to stay put instead of risking an out.

2) It's virtually pointless to steal 3rd with 2 outs. The runner would probably score from 2nd on an RBI hit anyway, so being on 3rd only helps on a freak occurrence such as a passed ball or a fielding error.

It makes sense that the best out count to steal 3rd is with 1 out, which gives the hitter an RBI opportunity with either a hit or a sacrifice. Still, it's risky according to the numbers.
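Here is a rough Python sketch that reproduces most of the breakeven points above in one pass. The gain/loss pairs are the chart differences I use later in this post for the Roberts calculation; the dictionary layout and names are my own. Stealing 3rd with 0 outs is left out because I haven't quoted its gain and loss anywhere in this post, and the "3rd" rows assume no one on 1st.

```python
# (gain if safe, loss if caught), in expected runs, as quoted in this post.
STEAL_VALUES = {
    ("2nd", 0): (0.231, 0.613),
    ("2nd", 1): (0.176, 0.430),
    ("2nd", 2): (0.108, 0.239),
    ("3rd", 1): (0.200, 0.606),  # no one on 1st
    ("3rd", 2): (0.044, 0.347),  # no one on 1st
}

# Breakeven success rate: loss / (gain + loss).
breakevens = {
    key: loss / (gain + loss) for key, (gain, loss) in STEAL_VALUES.items()
}

for (base, outs), rate in breakevens.items():
    print(f"steal {base}, {outs} out(s): {rate:.1%}")
```

The 1-out line for stealing 3rd comes out lowest of the steals of 3rd listed here, which matches the point above.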

So how valuable is a stolen base? Let's take Dave Roberts' 04 season with LA/BOS. He stole 38 bases while being caught 3 times, which is a phenomenal success rate of 92.7%. Here is the breakdown:

1) 1st to 2nd, 0 outs: 17 out of 18

2) 1st to 2nd, 1 out: 8 out of 9

3) 1st to 2nd, 2 outs: 8 out of 8

4) 2nd to 3rd w/ no one on 1st, 1 out: 1 out of 2

5) 2nd to 3rd w/ no one on 1st, 2 outs: 1 out of 1

6) 2nd to 3rd w/ a runner on 1st, 2 outs: 3 out of 3

I will use the values from the green chart even though the 01 AL season (from which the values are derived) has nothing to do with Roberts' 04 season - I figure there is a strong relationship regardless.

1) (17 * .231 expected runs ) - (1 * .613 expected runs) = 3.314 expected runs added

2) (8 * .176) - (1 * .430) = 0.978 expected runs added

3) (8 * .108) - (0 * .239) = 0.864 expected runs added

4) (1 * .200) - (1 * .606) = -0.406 expected runs (ouch!!)

5) (1 * .044) - (0 * .347) = 0.044 expected runs added

6) (3 * .036) - (0 * .486) = 0.108 expected runs added
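To double-check the arithmetic in the six lines above, here is a small Python sketch. The tuple layout and the names `ROBERTS_2004` and `net_runs` are mine; the numbers are the ones listed above.

```python
# (successes, times caught, gain per success, loss per caught stealing)
ROBERTS_2004 = [
    (17, 1, 0.231, 0.613),  # 1st to 2nd, 0 outs
    (8, 1, 0.176, 0.430),   # 1st to 2nd, 1 out
    (8, 0, 0.108, 0.239),   # 1st to 2nd, 2 outs
    (1, 1, 0.200, 0.606),   # 2nd to 3rd, no one on 1st, 1 out
    (1, 0, 0.044, 0.347),   # 2nd to 3rd, no one on 1st, 2 outs
    (3, 0, 0.036, 0.486),   # 2nd to 3rd, runner on 1st, 2 outs
]


def net_runs(safe, caught, gain, loss):
    """Expected runs added: credit each steal, debit each caught stealing."""
    return safe * gain - caught * loss


total = sum(net_runs(*row) for row in ROBERTS_2004)
steals = sum(row[0] for row in ROBERTS_2004)
attempts = steals + sum(row[1] for row in ROBERTS_2004)

print(f"{total:.3f} expected runs on {steals} of {attempts} attempts")
```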

Overall, Roberts "added" 4.902 runs with his nearly perfect base stealing ability. In other words,

**NOT A WHOLE LOT**. I would rather have 5 Shawn Green Specials (bases-empty HRs) than steal 38 out of 41 bases.

This may seem like a trivialization of the stolen base, but it's not. Speed is always good, whether stealing bases or chasing a fly ball. However, the following 2 conditions have to be met before a runner can steal a base:

1) You have to get on base

2) There can't be another runner in front of you

Add to this the inherent risk/reward factor (the aforementioned breakeven points) and the relatively low payoff, and I can see why the art of the stolen base is starting to become something of a rarity.

### Monday, March 28, 2005

## Baseball "Luck" and the Dodgers Offseason

Here's an interesting article on baseball. I'll relate this to some of the recent Dodger transactions.

www.hardballtimes.com/main/article/if-line-drives-could-speak

To summarize, there's a correlation between the % of line drives hit and AVG. This makes sense - if you hit more line drives into play, more of your ABs are likely to be hits. So if a hitter's batting average is unusually high compared to his line drive rate, then there is a good chance that some luck was involved (even if he's a fast runner, or has good power so the line drives are usually smoked, etc.). It would then follow that there is a good probability of a return to the norm - you can't be lucky forever. The same applies to a pitcher: if he allows more line drives, his AVG-Against will likely be high. If a pitcher is unusually out of step with this notion, there may be some luck involved.

In a similar argument (but more roundabout), an ERA lower than what defense-independent pitching stats (K, BB, HR) would suggest implies a bit of luck, too.

The article analyzes the 2004 season and lists the luckiest/unluckiest players. Six Dodgers from either the 04 or 05 roster are listed, as follows:

Jason Phillips was unlucky.

Lowe was extremely unlucky.

Alvarez was somewhat unlucky.

Ishii was lucky.

Lima was lucky.

Odalis was somewhat lucky (although his numbers were still good, lucky or not)

The article does not incorporate park effects or fielding efficiency. I would think part of the "luck" that Ishii, Lima, and Odalis enjoyed was due to the strong fielding lineup the Dodgers had in 04. If so, then Alvarez was not "somewhat unlucky" but "strongly unlucky" - the line drives fell for hits despite the fantastic glovework of Izturis and Co.

What I find significant is how this relates to the Dodgers offseason. Alvarez (unlucky) was re-signed, Lima (lucky) was not re-signed, Ishii (lucky) was traded for Phillips (unlucky), and Odalis (somewhat lucky) was re-signed at terms lower than many expected (way lower than Lowe). Coincidence? I think not. DePodesta obtained the undervalued (unlucky) players while shedding the overvalued (lucky) ones.

Food for thought.
