BASEBALL AS A MARKOV CHAIN

by Mark D. Pankin

A Markov chain is a type of mathematical model that is well suited to analyzing baseball, that is, to what Bill James calls sabermetrics. The concept of a Markov chain is not new, dating back to 1907, nor is the idea of applying it to baseball, which appeared in mathematical literature as early as 1960. In fact, it is not unusual to see sabermetric analysis that incorporates the fundamental ideas of a Markov chain without formally using the mathematical structure. This type of work typically employs situational analysis and studies the probabilities and effects on expected scoring of moving from one base runners and outs combination to another. An example is the calculation of break-even probabilities for attempting to steal a base. However, formal Markov chain analysis of baseball is not at all common and is rarely found outside of academic studies. The main reasons for this are 1) most sabermetricians have never heard of Markov chains, 2) obtaining sufficient data has been rather difficult, and 3) a computer is a virtual necessity for serious Markov chain analysis. Project Scoresheet has taken care of the second problem, and the advent of personal computers and advanced software brings the solution to the third problem within the means of many sabermetricians. This essay attempts to do something about the first problem, but none of the mathematical details are presented since that would be inappropriate here.

There are three main sections that follow. The first is a nontechnical description of Markov chains and applying them to baseball. The second section discusses the types of sabermetric analyses that can be performed using Markov chains. The third section describes some of the author's work with Project Scoresheet data to develop prototype Markov analytical capabilities, and presents examples, which are for illustrative purposes only because of the limited scope of the data, that demonstrate the workings of some of the Markov concepts.

A. What are Markov chains and what do they have to do with baseball?

From a mathematical point of view, a Markov chain describes a process that can be considered to be in exactly one of a number of "states" at any given time. A baseball half-inning (the half- will be left out for brevity in the rest of this paper) fits that description if the states are considered the various runners and outs situations. There are 24 such combinations, which are listed below using the notation (runners,outs):

        TABLE 1: RUNNERS AND OUTS COMBINATIONS

Runners:  0     1     2     3     12     13     23     123   (0 = none)

      0: (0,0) (1,0) (2,0) (3,0) (12,0) (13,0) (23,0) (123,0)
 Outs 1: (0,1) (1,1) (2,1) (3,1) (12,1) (13,1) (23,1) (123,1)
      2: (0,2) (1,2) (2,2) (3,2) (12,2) (13,2) (23,2) (123,2)

There also is a three out state, and to be technically correct there should be four three out states, corresponding to whether 0, 1, 2, or 3 runs scored on the play.
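As an illustration only, the states of Table 1 can be enumerated in a few lines of Python (a language chosen for this sketch, not one used in the work described here). The names RUNNERS, STATES, THREE_OUT, and situation_number are hypothetical labels introduced for the example. The numbering function follows the ordering that Table 3 uses later in this essay.

```python
from itertools import product

# Runner configurations in the column order of Table 1; "0" means bases empty.
RUNNERS = ["0", "1", "2", "3", "12", "13", "23", "123"]

# The 24 runners and outs states, written (runners, outs) as in Table 1,
# listed row by row (all 0-out states, then 1-out, then 2-out).
STATES = [(runners, outs) for outs, runners in product(range(3), RUNNERS)]

# Four three-out states, one for each number of runs scored on the final play.
THREE_OUT = [("3 out", runs) for runs in range(4)]


def situation_number(runners, outs):
    """Return the 1-24 situation number used in Table 3.

    Table 3 orders the situations by runners first and outs second,
    so (0,0) is 1, (0,1) is 2, ..., (123,2) is 24.
    """
    return RUNNERS.index(runners) * 3 + outs + 1
```

For instance, situation_number("23", 0) gives 19, matching the row labeled 19 (23,0) in Table 3.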

The heart of the Markov chain is the analysis of the transitions between the states. The key is the so-called transition matrix that contains the probabilities of moving from any state to any other state. Many of the transitions in baseball are impossible (e.g. the number of outs can never decrease) and have probability equal to zero. The other transitions have probabilities determined by the chances of various baseball events.
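A minimal sketch of how such probabilities might be estimated from play-by-play data is to count how often each transition occurred and divide by the number of times the starting state occurred. The function name and the input format (a list of start and end state pairs) are assumptions made for this example. Transitions that never occur are simply absent from the result, which corresponds to a probability of zero.

```python
from collections import defaultdict


def transition_probabilities(plays):
    """Estimate transition probabilities from observed plays.

    plays is an iterable of (start_state, end_state) pairs.
    Returns {start_state: {end_state: probability}}; each inner
    dictionary sums to 1.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for start, end in plays:
        counts[start][end] += 1

    probabilities = {}
    for start, ends in counts.items():
        total = sum(ends.values())
        probabilities[start] = {end: n / total for end, n in ends.items()}
    return probabilities
```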

For sabermetric purposes, it is useful to have more than one transition matrix. One necessary refinement is to distinguish between transitions ("plays") that change the batter and those that do not. For example, suppose an inning is in the state (1,0) [runner on first, none out] and after the play it is in the state (2,0) [runner on second, none out]. If the batter changed, then the runner scored, most likely on a double. However, if the batter did not change, then the runner on first advanced to second (SB, WP, PB, balk) and no run scored. One way of handling this is to define additional states that indicate whether or not the batter changed, but that proves to be mathematically cumbersome. A better method is to have one transition matrix for plays that change the batter and a second one for plays that do not. In fact, there is no reason to stop at two. When performing strategy analysis, it makes sense to distinguish transitions for which the strategy is in effect from the others. Along these lines, it seems reasonable to establish separate transition matrices for such things as pitchers batting, sacrifice bunts and attempts, stolen bases and caught stealings, intentional walks, and so forth.
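The (1,0) to (2,0) example can be checked with a simple bookkeeping rule: every runner and every batter who completes his turn must end up on base, out, or across the plate. The sketch below applies that rule. It is an illustration under stated assumptions rather than part of the tabulation described later: the function name is hypothetical, states are written as (runners, outs) pairs with the runners given as a string, and the rule covers only transitions among the 24 states of Table 1. For a transition to three outs the runs must be recorded separately, which is why four three out states are needed.

```python
def runs_on_play(start, end, batter_changed):
    """Runs scored on a transition between two (runners, outs) states.

    runners is a string such as "0" (bases empty), "1", "23", or "123".
    batter_changed says whether the play ended the batter's turn.
    """
    def on_base(runners):
        return 0 if runners == "0" else len(runners)

    (runners_before, outs_before), (runners_after, outs_after) = start, end
    new_men = 1 if batter_changed else 0
    runs = (on_base(runners_before) + outs_before + new_men) \
        - (on_base(runners_after) + outs_after)
    if runs < 0:
        raise ValueError("impossible transition")
    return runs


# The example from the text: (1,0) to (2,0).
print(runs_on_play(("1", 0), ("2", 0), batter_changed=True))   # 1 (e.g. a double)
print(runs_on_play(("1", 0), ("2", 0), batter_changed=False))  # 0 (e.g. a steal)
```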

Closely associated with a baseball Markov chain, though strictly speaking not part of it, is the runs after matrix. For each of the 24 runners and outs states listed in Table 1, this matrix records what percentage of the time a specific number of runs scored in the remainder of the inning. For example, after the (0,2) state, there may have been 0 runs 82% of the time, 1 run 14%, 2 runs 2%, etc. From this matrix, it is easy to compute the expected or average number of runs for the rest of the inning. These expected run values for each state are commonly used in situational and strategy analysis to compute break-even probabilities. Note that the expected number of runs after the (0,0) state, in which all innings begin, is the average number of runs per inning.
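The calculation of expected runs from one row of the runs after matrix is a weighted average, sketched below. The 82%, 14%, and 2% figures are the ones quoted above for the (0,2) state; the final 2% assigned to three runs is purely an assumption made so that the row sums to 1, and the function names are hypothetical.

```python
def mean_runs(distribution):
    """Average runs for the rest of the inning, given {runs: probability}."""
    return sum(runs * prob for runs, prob in distribution.items())


def prob_at_least_one_run(distribution):
    """Chance that one or more runs score in the rest of the inning."""
    return 1.0 - distribution.get(0, 0.0)


# 0, 1, and 2 runs use the percentages from the text's (0,2) example.
# The 3-run entry is an ASSUMED value that completes the row.
after_0_2 = {0: 0.82, 1: 0.14, 2: 0.02, 3: 0.02}

print(round(mean_runs(after_0_2), 2))              # 0.24
print(round(prob_at_least_one_run(after_0_2), 2))  # 0.18
```

The same row also yields the probability of scoring at least one run, which is the quantity used in the bunt and stolen base comparisons later in this essay.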

B. How can Markov chains be used in sabermetrics?

There is a rich mathematical theory of Markov chains, but most of it is not applicable to baseball. (The three out state is an "absorbing" state because once entered, it can't be left. Most of the theory concerns chains without absorbing states.) Perhaps the greatest benefits from considering an inning as a Markov chain come from being able to formulate a large number of complex calculations in terms of matrix notation. The use of matrix algebra, as opposed to keeping track of numerous cases and equations, can greatly simplify the entire analytical process. A spreadsheet program on a personal computer is a natural setting for sabermetric work in general, and these programs with their row and column organization lend themselves naturally to matrix manipulations. The latest versions have matrix multiplication and matrix inversion commands, which are a virtual necessity for Markov chain analysis. The Markov chain and matrix algebra formulation enables the consideration of a wider range of questions and even makes getting the answers easier.

One fascinating computation that can be performed is to compute from a transition matrix the average or expected number of runs after each of the 24 runners and outs combinations. In particular, the expected runs after the (0,0) state is the average scoring per inning, and in a sense 9 times this number is average scoring per game. The key to this analysis is to start from an "interesting" transition matrix.
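The idea behind this computation is that the expected runs after a state equal the probability-weighted sum, over every possible next state, of the runs scored on the play plus the expected runs after that next state, with the three out state worth zero. The sketch below solves those equations by repeated substitution instead of the matrix inversion used in the spreadsheet work described later; for a chain that always reaches three outs the two approaches give the same answer. The function name, the input format, and the toy probabilities are all assumptions made for illustration, not data from this study.

```python
def expected_runs(transitions, tol=1e-12, max_iter=100_000):
    """Expected runs for the rest of the inning from each state.

    transitions maps each state to a list of
    (next_state, probability, runs_scored_on_play) triples.
    Any next_state that is not a key of transitions is treated as
    the absorbing three out state, with zero runs to follow.
    """
    value = {state: 0.0 for state in transitions}
    for _ in range(max_iter):
        biggest_change = 0.0
        for state, moves in transitions.items():
            new = sum(p * (runs + value.get(nxt, 0.0)) for nxt, p, runs in moves)
            biggest_change = max(biggest_change, abs(new - value[state]))
            value[state] = new
        if biggest_change < tol:
            return value
    raise RuntimeError("expected runs did not converge")


# Toy chain (hypothetical): every batter homers with probability 0.1
# and otherwise makes an out, so the bases are always empty.
P_HR, P_OUT = 0.1, 0.9
toy = {
    ("0", outs): [(("0", outs), P_HR, 1), (("0", outs + 1), P_OUT, 0)]
    for outs in range(3)
}
values = expected_runs(toy)
print(round(values[("0", 0)], 3))  # 0.333 runs per inning
```

The toy answer can be checked by hand: each of the three outs is preceded on average by 0.1/0.9 home runs, so the inning is worth 3 x 0.1/0.9 = 0.333 runs.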

If the transition matrix contains only plays in which no strategy was involved (i.e. "hitting away"), then the values obtained are baselines against which strategies can be analyzed. One way of analyzing the strategy is to modify the transition matrix to include the strategy and then compute the expected runs again. For example, if the goal is to determine the effect on scoring of the actual stolen base attempts in a group of games, say for a whole league in a season, then the expected runs computation could be carried out starting from a transition matrix without any steal attempts and again starting from the same transition matrix augmented by the steal attempts. Because the expected runs following all situations are calculated, it would be possible to see if the actual strategies increased scoring in some cases, say after (1,2), but not after others, say (1,0), in addition to telling if overall scoring increased or decreased.

Another application of this idea is to evaluate the offensive performance of individual players. Suppose we have a transition matrix for one player by himself. That matrix could be obtained from collecting data on all his plate appearances (and base running events if desired), or it could be estimated from season or career statistics. Calculating the expected runs per inning from that matrix yields an estimate of how much scoring there would be if that player batted (and ran) all the time. Multiplying the runs per inning by nine produces an offensive run average for the player. The player's average could be compared to a league average computed from a transition matrix based on all players.

Markov chain techniques can be used to compare different batting orders. The basic idea is again to compute the expected runs per inning associated with each batting order being analyzed. In this case, the computations are more complicated because nine different transition matrices are involved and how often each player leads off an inning must be accounted for. However, the necessary calculations are feasible, and there are often simplifying assumptions that can be applied to answer specific questions.

A different use of the transition matrix is to study possible differences between ballparks, teams, playing surfaces, etc. The idea is to examine specific transitions that can shed light on the issue. For example, suppose the goal is to determine whether it is harder to score from second base on a single in an astroturf or grass park. Of course, the best way is to go through the play-by-plays from a large number of suitable games and collect the specific data. However, this may prove to be difficult, and the transition data if available can be of use. In this case, the idea is to examine transitions from a state with a runner on second to a state that was almost certainly reached by a single. This requires first base to be open so short singles can be distinguished from walks, which reduces the starting states to a runner on second or runners on second and third. The appropriate transitions are listed below:

            TABLE 2: SCORING FROM SECOND ON A SINGLE (I) 

                            End state after "single"
    Start State     Runner on 2nd scored       did not score 

    (2,0), (23,0)   (1,0), (0,1)               (13,0), (1,1)
    (2,1), (23,1)   (1,1), (0,2)               (13,1), (1,2)
    (2,2), (23,2)   (1,2)                      (13,2)

In addition, there may be a small number of transitions to three out states that result from singles, but there is no way to distinguish these from other transitions to the three out states. The number of such plays is probably too small to be meaningful in this analysis. A larger problem is that the transitions do not distinguish singles from other plays where the batter reaches first such as errors or fielder's choices. However, since the object is to compare two parks or playing surfaces, it is likely that the proportion of non-singles is similar for both, and the comparison will be valid.

By this point, you may be wondering if anyone would really go to the trouble of doing all of this. The answer is definitely yes. Most, but not all, of this type of work has been done by academic researchers. In some cases, simplifying assumptions were made or a reduced problem was studied. However, more complex analyses also have been performed. The description of the use of Project Scoresheet data that follows shows that elaborate computer facilities are no longer required for Markov chain baseball analysis.

C. Where does Project Scoresheet come in?

The author has been a Project Scoresheet inputter for the past two years. The inputters enter plays from the scoresheets into IBM PC and compatible personal computers using programs supplied by the project director. These programs write data files on floppy diskettes that contain all the information needed to reconstruct the games. The author input 37 Baltimore home games in 1985 and 74 Cincinnati home games in 1986 and kept copies of the data files. All the examples below are drawn from these games and as such form a limited and probably non-representative sample. Thus, any data or conclusions should not be considered to be representative of either league or of major league baseball.

The first step is the extraction from the data files of the information needed for the transition and runs after matrices: counts of how often each transition took place and how often specific numbers of runs scored after each situation. The author has written a program using the BASIC language for this task. The program keeps track of six different types of transitions: 1) non-pitchers hitting away, 2) pitchers hitting away, 3) intentional walks, 4) sacrifice bunts and attempts, 5) stolen bases and caught stealings, and 6) other transitions that do not change the batter (WP, PB, balk, etc.). In addition, these transitions and the runs after are separated into those for the home team, Baltimore or Cincinnati, and those for the visiting teams. The program defines a sacrifice bunt attempt to be any bunt with none out and men on base or any bunt by a pitcher with one out and men on base. Any credited sacrifice hit is counted as such. Because the data files are structured in a way that makes the determination of the score at any point in the game difficult, the definition of sacrifice is not dependent on the game score.
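The sacrifice bunt attempt rule just described can be written as a short predicate. The sketch below is in Python rather than the BASIC of the actual program, and the function name and argument names are hypothetical; the real data files are not laid out this way.

```python
def is_sacrifice_attempt(is_bunt, outs, runners_on, batter_is_pitcher,
                         credited_sacrifice=False):
    """Apply the sacrifice bunt attempt definition given in the text.

    A play counts if it was credited as a sacrifice hit, or if it was
    a bunt with men on base and either none out, or one out with a
    pitcher batting. The game score is deliberately not considered.
    """
    if credited_sacrifice:
        return True
    if not is_bunt or not runners_on:
        return False
    return outs == 0 or (outs == 1 and batter_is_pitcher)
```

Under this rule a bunt single by a non-pitcher with a man on first and none out is tabulated as a sacrifice attempt, while the same play with one out is not.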

The BASIC program writes disk files that can be read into the Lotus 1-2-3 spreadsheet program, which is used for the remainder of the analysis. Spreadsheets are an ideal way to manipulate transition data and perform the needed calculations. Moreover, release 2 of 1-2-3 has commands to multiply and invert matrices, which are almost a necessity for Markov chains and similar types of computations.

Table 3 below shows for each of the 24 runners and outs combinations 1) how many times each occurred in the observed (from the scoresheets) data, 2) the observed probability of scoring at least one run after the combination (useful for strategy analysis), 3) the observed average runs after, and 4) the average runs after computed from the Markov chain. For the NL (Cincinnati home games) data, two averages are shown, one which excludes the pitchers and one including the pitchers. These averages are calculated only from the hitting away transitions, while the observed averages reflect the effects of all plays. The probability of scoring at least one run is more difficult to calculate from the Markov chain formulation, and it has not been carried out at this time.

TABLE 3: OBSERVED AND THEORETICAL SITUATION DATA

37 BALTIMORE HOME GAMES IN 1985 (ALL TEAMS) 

                     OBSERVED       MARKOV
                   Prob.    Avg.     Avg.
Situation Number   of runs  runs     runs 

 1 (0,0)    691    0.288   0.530    0.536
 2 (0,1)    505    0.180   0.323    0.292
 3 (0,2)    393    0.079   0.125    0.117
 4 (1,0)    171    0.415   0.854    0.910
 5 (1,1)    216    0.282   0.583    0.559
 6 (1,2)    219    0.100   0.210    0.220
 7 (2,0)     46    0.696   1.435    1.253
 8 (2,1)     74    0.473   0.689    0.760
 9 (2,2)     89    0.270   0.382    0.366
10 (3,0)      4    1.000   2.250    1.557
11 (3,1)     24    0.667   1.333    1.088
12 (3,2)     37    0.297   0.351    0.374
13 (12,0)    42    0.667   1.524    1.380
14 (12,1)    74    0.446   1.014    0.923
15 (12,2)   101    0.277   0.554    0.507
16 (13,0)    16    0.875   1.875    1.736
17 (13,1)    42    0.786   1.524    1.409
18 (13,2)    48    0.292   0.542    0.509
19 (23,0)     8    0.500   0.875    1.820
20 (23,1)    32    0.531   1.156    1.309
21 (23,2)    28    0.143   0.357    0.251
22 (123,0)   16    0.875   1.938    2.118
23 (123,1)   30    0.567   1.133    1.447
24 (123,2)   40    0.250   0.600    0.480
           ----
           2946 

74 CINCINNATI HOME GAMES IN 1986 (ALL TEAMS)

                     OBSERVED          MARKOV
                   Prob.    Avg.     Avg. runs
Situation Number   of runs  runs    no pit.  w/pit.

 1 (0,0)   1397    0.295   0.515    0.570   0.527
 2 (0,1)   1000    0.162   0.259    0.297   0.270
 3 (0,2)    804    0.071   0.102    0.114   0.103
 4 (1,0)    369    0.472   0.900    1.017   0.955
 5 (1,1)    395    0.281   0.532    0.622   0.578
 6 (1,2)    408    0.145   0.243    0.281   0.254
 7 (2,0)    110    0.609   0.955    1.098   1.034
 8 (2,1)    202    0.411   0.678    0.632   0.599
 9 (2,2)    246    0.236   0.325    0.360   0.330
10 (3,0)     26    0.923   1.423    1.533   1.498
11 (3,1)     69    0.667   0.942    1.015   0.945
12 (3,2)    125    0.272   0.408    0.373   0.355
13 (12,0)    83    0.735   1.590    1.797   1.703
14 (12,1)   127    0.480   1.087    1.148   1.079
15 (12,2)   194    0.242   0.407    0.514   0.469
16 (13,0)    34    0.824   1.353    1.815   1.748
17 (13,1)    66    0.667   1.045    1.255   1.161
18 (13,2)    77    0.234   0.351    0.451   0.391
19 (23,0)    11    0.818   1.909    1.778   1.715
20 (23,1)    55    0.745   1.455    1.413   1.350
21 (23,2)    57    0.404   0.772    0.726   0.715
22 (123,0)   22    0.955   2.091    2.402   2.282
23 (123,1)   55    0.764   1.764    1.966   1.786
24 (123,2)   73    0.288   0.521    0.751   0.654
           ----
           6005

It should be noted that the Markov calculations for expected (average) runs after each situation assume all batters have the average transition probabilities. In actuality, each batter has a different transition matrix, which could be accounted for in the Markov calculation, but such a degree of complication is beyond the scope of the current effort. Also, the Markov calculations shown above exclude the effects of intentional walks, sacrifice bunts and attempts, stolen bases and attempts, and other plays that do not change the batter. This can be an advantage when analyzing strategies for the effect on expected runs because these calculations exclude the effects of some primary strategies. For these reasons and others, the observed average runs do not match the Markov calculations. The two are fairly close in many cases for the AL data, but the theoretical calculated values tend to be higher than the observed values for the NL data.

Table 4 summarizes the scoring from second on a single exercise described previously. In both cases, only non-pitcher hitting away transitions are counted.

             TABLE 4: SCORING FROM SECOND ON A SINGLE (II) 

           1985 BALTIMORE GAMES

Start Situation: (2,0)&(23,0)   (2,1)&(23,1)  (2,2)&(23,2)

End Situations: number
 --runner on second scores
                  (1,0):  4      (1,1):  4     (1,2):  10
                  (0,1):  1      (0,2):  1 

--runner on second does not score
                  (13,0): 4      (13,1): 12    (13,2): 2
                  (1,1):  0      (1,2):  0

Scoring percentage: 5/9 = .556   5/17 = .294   10/12 = .833 

           1986 CINCINNATI GAMES

Start Situation: (2,0)&(23,0)   (2,1)&(23,1)  (2,2)&(23,2)

End Situations: number
 --runner on second scores
                  (1,0):  6      (1,1):  12    (1,2):  26
                  (0,1):  1      (0,2):  0

--runner on second does not score
                  (13,0): 6      (13,1): 12    (13,2): 6
                  (1,1):  2      (1,2):  4

Scoring percentage: 7/15 = .467  12/28 = .429  26/32 = .813

Because the Orioles and Reds players are involved offensively or defensively in all plays, these transitions can't be considered to be a direct comparison between grass and astroturf. That being said, the above evidence does not support a conclusion that it is easier or harder to score from second on a single in either of the two parks. The only thing shown is the unsurprising observation that runners score from second far more frequently when there are two outs.
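The scoring percentages in Table 4 follow mechanically from the end state classification of Table 2. A minimal sketch, with hypothetical names and with the transition counts typed in from Table 4 for the no out Baltimore case and the one out Cincinnati case:

```python
# End states from Table 2, keyed by the number of outs in the start state.
SCORED_ENDS = {0: [("1", 0), ("0", 1)], 1: [("1", 1), ("0", 2)], 2: [("1", 2)]}
HELD_ENDS = {0: [("13", 0), ("1", 1)], 1: [("13", 1), ("1", 2)], 2: [("13", 2)]}


def scoring_from_second(end_counts, outs):
    """Return (times the runner scored, total "singles") for one start row.

    end_counts maps (runners, outs) end states to transition counts,
    pooled over the two start states with first base open.
    """
    scored = sum(end_counts.get(end, 0) for end in SCORED_ENDS[outs])
    held = sum(end_counts.get(end, 0) for end in HELD_ENDS[outs])
    return scored, scored + held


baltimore_no_outs = {("1", 0): 4, ("0", 1): 1, ("13", 0): 4, ("1", 1): 0}
cincinnati_one_out = {("1", 1): 12, ("0", 2): 0, ("13", 1): 12, ("1", 2): 4}

print(scoring_from_second(baltimore_no_outs, 0))   # (5, 9)
print(scoring_from_second(cincinnati_one_out, 1))  # (12, 28)
```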

Next, the thorny issue of the sacrifice bunt is considered. Because of data limitations the investigation is confined to situations with a runner on first only. For the AL, the no outs situation is the only one for a potential sac try, but in the NL, pitchers will often bunt with one out. These bunts should be evaluated against the objective of increasing the chances of scoring at least one run. In general, the sac bunt reduces overall scoring because it creates an out. One important point is that the probabilities of scoring (at least one run) used are drawn from the observed data, and hence they include the effects of all plays and strategies, including bunts and stolen base tries. That means these probabilities are not the best for the typical type of break-even analysis. Instead, the tables below compare the probabilities of scoring before the bunt, those shown in Table 3 for (1,0) and (1,1), with the probabilities resulting from the outcomes of the actual sacrifice bunt attempts.

             TABLE 5: SACRIFICE BUNT ATTEMPT ANALYSIS 

    BALTIMORE GAMES (bunts with runner on first, no outs)
Ending situation    Number    Percent     Scoring Probability

(0,2) [Double play]   1       .056           .079
(2,1) [Sac worked]   14       .778           .473
(12,0)[Batter safe]   3       .167           .667
                     -- 
                     18 

Average probability of scoring after bunt
           = (.056)(.079) + (.778)(.473) + (.167)(.667) = .483
Probability of scoring after (1,0)                      = .415

Net GAIN from sacrifice bunt attempt                    = .068 

    CINCINNATI GAMES (bunts with runner on first, no outs)
Ending situation    Number    Percent     Scoring Probability

(0,2) [Double play]   3       .077           .071
(1,1) [Sac failed]    4       .103           .281
(2,1) [Sac worked]   28       .718           .411
(12,0)[Batter safe]   4       .103           .735
                     --
                     39

Average probability of scoring after bunt
   = (.077)(.071) + (.103)(.281) + (.718)(.411) + (.103)(.735) = .404
Probability of scoring after (1,0)                      = .472

Net LOSS from sacrifice bunt attempt                    = .068

    CINCINNATI GAMES (bunts with runner on first, one out)
Ending situation    Number    Percent     Scoring Probability

3 out [Double play]   1       .063           .000
(1,2) [Sac failed]    3       .188           .145
(2,2) [Sac worked]    9       .563           .236
(3,1) [Runner scored, batter reached 3rd on error]
                      1       .063          1.000  (run scored on bunt)
(12,1)[Batter safe]   1       .063           .480
(23,1)[Batter safe, both advance extra base on error]
                      1       .063           .745
                     --
                     16

Average probability of scoring after bunt
   = (.063)(0) + (.188)(.145) + (.563)(.236) + (.063)(1)
     + (.063)(.480) + (.063)(.745)                      = .299
Probability of scoring after (1,1)                      = .281

Net GAIN from sacrifice bunt attempt                    = .018
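The weighted averages in Table 5 are straightforward to reproduce. The sketch below does so for the Baltimore bunts, weighting by the raw counts rather than the rounded percentages, so results for other rows may differ from the printed figures in the third decimal place. The function name is hypothetical.

```python
def scoring_probability_after_attempt(outcomes):
    """Count-weighted average of the scoring probabilities.

    outcomes is a list of (count, scoring_probability) pairs,
    one per ending situation.
    """
    total = sum(count for count, _ in outcomes)
    return sum(count * prob for count, prob in outcomes) / total


# Baltimore bunts with a runner on first and no outs (Table 5):
# double play to (0,2), sacrifice worked to (2,1), batter safe to (12,0).
baltimore_bunts = [(1, 0.079), (14, 0.473), (3, 0.667)]

after_bunt = scoring_probability_after_attempt(baltimore_bunts)
gain = after_bunt - 0.415  # 0.415 is the observed probability after (1,0)

print(round(after_bunt, 3), round(gain, 3))  # 0.483 0.068
```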

The number of bunt transitions is not nearly large enough to support any definitive conclusions, but it is still interesting to interpret the above data. The sacrifice bunt as practiced in the games sampled appears to have been a good play. There is a meaningful increase in the probability of scoring at least one run in the AL games, a small increase for the NL one out bunts, and a sizeable decrease for the NL no out bunts. However, all of the NL one out bunts and many if not most of the NL no out bunts are by pitchers. With a pitcher hitting away, the actual probability of scoring after (1,0) or (1,1) is much less than the values shown, which are based on all players. Thus, the NL comparisons are more in favor of the bunt, especially with pitchers batting, than shown, although the exact amount can't be quantified from the data available.

Some caveats are in order. First, the calculations are based on average values. Specific batters will differ from the average to some extent, so any conclusions must be applied with care. A second consideration is that some bunt transitions may not have been tabulated because the scoresheet may have failed to indicate a bunt. This does not affect credited sacrifices, but could affect both failed sacs and sac tries that result in singles. A similar, and perhaps more serious problem, is that the scoresheets do not indicate when a batter tried unsuccessfully to bunt until he had two strikes and then hit away, presumably at a disadvantage. Bunted third strikes are generally recorded and counted as sac try transitions, but there may be some cases where the bunt indication is missing on the scoresheet.

Most sabermetric analysis has denigrated the sacrifice bunt. Although not discussed here, it is almost certain that the sac bunt try decreases total scoring. Much of the published analysis supports the case that the bunt is not a good play to try to score one run. The above supports the opposite, even for non-pitchers, keeping in mind that the data used are obviously limited.

It is tempting to try to use Table 5 to compare bunting on grass and astroturf. It is true that the percentage of failed sac tries is higher for the Cincinnati data, but there is a possible explanation other than the playing surfaces. The NL bunts are mainly the efforts of pitchers, and the AL bunts, of course, are all by non-pitchers. In general, pitchers bunt in sacrifice situations whether or not they are good bunters. However, a non-pitcher who is a poor bunter is rarely asked to bunt. Thus, there is a good chance that the lower success rate in Cincinnati is due to pitchers. What is needed is to isolate the non-pitcher bunts in the NL data, but that was not done for this study.

The final analysis presented concerns the stolen base. Attempted steals are judged against two objectives, increasing the chances of scoring at least one run and increasing the expected number of runs. Because of the relatively small number of transitions, the calculations shown are confined to situations with a runner on first and no other runners. The expected runs shown in Table 6 are the theoretical values computed from the Markov chain for non-pitcher hitting away transitions, which are used because they are generally free of the effects of strategies and certainly are not influenced by stolen bases.

               TABLE 6: STOLEN BASE ATTEMPT ANALYSIS 

        BALTIMORE GAMES (runner on first, no outs)

 Ending                                Scoring       Expected
 Situation      Number    Percent      Probability   Runs

 (0,1) [CS]       3        .214         .180         0.292
 (2,0) [SB]      10        .714         .696         1.253
 (3,0) [SB&E]     1        .071        1.000         1.557
                 --                    -----         -----
                 14      Weighted Avg:  .607         1.069
                     Values for (1,0):  .415         0.910 

                Gain/loss from SB try:  .192         0.159

[The weighted averages are computed using the method shown in Table 5.] 

        BALTIMORE GAMES (runner on first, one out)

 Ending                                Scoring       Expected
 Situation      Number    Percent      Probability   Runs

 (0,2) [CS]       5        .333         .079         0.117
 (2,1) [SB]      10        .667         .473         0.760
                 --                    -----         -----
                 15      Weighted Avg:  .342         0.545
                     Values for (1,1):  .282         0.558

                Gain/loss from SB try:  .060        -0.013

        BALTIMORE GAMES (runner on first, two out)

 Ending                                Scoring       Expected
 Situation      Number    Percent      Probability   Runs

 3 out [CS]       7        .438         .000         0.000
 (2,2) [SB]       9        .562         .270         0.366
                 --                    -----         -----
                 16      Weighted Avg:  .152         0.205
                     Values for (1,2):  .100         0.219

                Gain/loss from SB try:  .052        -0.014

        CINCINNATI GAMES (runner on first, no outs)

 Ending                                Scoring       Expected
 Situation      Number    Percent      Probability   Runs

 (0,1) [CS]      11        .324         .162         0.297
 (2,0) [SB]      19        .559         .609         1.098
 (3,0) [SB&E]     4        .118         .923         1.533
                 --                    -----         -----
                 34      Weighted Avg:  .501         0.890
                     Values for (1,0):  .471         1.017

                Gain/loss from SB try:  .030        -0.127

        CINCINNATI GAMES (runner on first, one out)

 Ending                                Scoring       Expected
 Situation      Number    Percent      Probability   Runs

 (0,2) [CS]      18        .409         .071         0.114
 (2,1) [SB]      26        .591         .411         0.632
                 --                    -----         -----
                 44      Weighted Avg:  .272         0.419
                     Values for (1,1):  .281         0.621

                Gain/loss from SB try: -.009        -0.202

        CINCINNATI GAMES (runner on first, two out)

 Ending                                Scoring       Expected
 Situation      Number    Percent      Probability   Runs

 3 out [CS]      20        .282         .000         0.000
 (2,2) [SB]      46        .648         .236         0.360
 (3,2) [SB&E]     5        .070         .272         0.373
                 --                    -----         -----
                 71      Weighted Avg:  .172         0.259
                     Values for (1,2):  .144         0.280

                Gain/loss from SB try:  .028        -0.021

The scoring probability of 1.000, which means certainty, shown for the (3,0) Baltimore data is based on just four observations. The true probability is a little lower, but that is unlikely to affect the results of this study. One caveat: it is quite possible some plays that should have been scored as caught stealings were instead scored as pick-offs. The tabulation program counts pick-offs as other transitions that do not change the batter, not as stolen base attempts. Thus, the data in Table 6 may have a small bias in favor of the stolen base. With that in mind, readers are invited to draw their own conclusions from Table 6.
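One way to read Table 6 is through the break-even probabilities mentioned at the start of this essay: the success rate at which a steal attempt leaves the expected value unchanged. The sketch below is a simplification that assumes only two outcomes, a clean stolen base or a caught stealing, and so ignores the steal plus error plays shown in the table. The function name is hypothetical, and the input values are the Baltimore figures from Table 3.

```python
def break_even(value_now, value_success, value_failure):
    """Success rate p at which p*success + (1-p)*failure equals value_now."""
    return (value_now - value_failure) / (value_success - value_failure)


# Runner on first, no outs, Baltimore games.
# Markov expected runs: (1,0) now, (2,0) after a steal, (0,1) after a CS.
p_runs = break_even(0.910, 1.253, 0.292)
# Observed probability of scoring at least one run for the same states.
p_score = break_even(0.415, 0.696, 0.180)

print(round(p_runs, 3), round(p_score, 3))  # 0.643 0.455
```

Under these assumptions a runner would need to succeed roughly 64% of the time to maintain expected runs, but only about 46% of the time to maintain the chance of scoring at least one run. Given the small sample, these figures are illustrative only.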

While the calculations presented in the above tables may seem tedious and laborious, they were performed rather quickly from a few matrix multiplications that also produced additional information not shown above. Once the transition matrices have been set up in the spreadsheet environment, it is easier to do the computations than to write about them! This is a dramatic illustration of the analytical power that results from the combination of Markov chain techniques and Project Scoresheet data.
