Is There an Association Between Batting Side And Fielding Position Among Current Major League Baseball Players?

Matt Cable, '07

Buckingham Browne & Nichols School
Cambridge, MA
May 2005

Abstract

 

The goal of this study was to determine whether there is an association between batting side and fielding position among all active players in Major League Baseball. To answer this question, a one-way Chi-Square Goodness of Fit test was performed on six positions.

The data collected included the entire population of active Major League Baseball players, with specific exceptions. The groups of players excluded from the data were pitchers, designated hitters and switch-hitters. Overall, 342 Major League Baseball players were included in the data, a little more than half of all active Major League Baseball players. The recorded data consisted of each player's fielding position and batting side, right-handed or left-handed.

The results of the study showed that there is a very strong association between batting side and fielding position among current Major League Baseball players. There were six positions included in the data so the degrees of freedom for the test were five. The P-value for the test was .00000306, approximately zero.

There are several possible sources of error in this study. Among these are players who can play multiple positions, the exclusion of designated hitters, and players who are listed as switch-hitters but very rarely bat from one side. It would be reasonable to extrapolate from the data to comparable levels of play, but it would be unreasonable and dangerous to extrapolate to youth leagues and Little League.

The Study:

 

 

The research question: Is there an association between batting side (right-handed or left-handed) and fielding position among current players in Major League Baseball?

 

Sampling-

 

            For this study, no sampling or randomization was used. The entire population of eligible Major League Baseball players was included in the data. Players were included in the study if their name appeared on their official team website on the date of data collection, May 13, 2005. Injured players did appear on the team websites because they were officially still on the team roster; they had simply been placed on a special list, the Disabled List. Since the official position and batting side (right-handed, left-handed, or switch-hitter) were listed for injured players, they have been included in the data and treated like any healthy player.

            Although the data included all eligible Major League Baseball players, there are several groups of players that were designated as ineligible. One of these groups is designated hitters. Designated hitters are players who rarely, if ever, play a fielding position. In the American League, the designated hitter bats instead of the pitcher, but in the National League the pitcher must hit as well as pitch, so there are no designated hitters in the National League. Designated hitters were not included in the data for two reasons. The first is that they exist only in the American League, and the purpose of this study was to be able to extrapolate to at least the entire Major Leagues. The second is that many designated hitters can also play a fielding position, but the manager chooses to use them at designated hitter because they are the best extra hitter on the team. This is true on most teams, so only about five players in all of Major League Baseball would be listed officially as a designated hitter. A group this small does not satisfy the condition of the Goodness of Fit test that every expected count must be at least five, because the expected number of left-handed batters at designated hitter would be less than five.

            Another group of players not included in the data was pitchers. Pitchers, like designated hitters, bat in only one of the two leagues in Major League Baseball. A starting pitcher for a National League team would bat approximately 80 times a year, compared to about 600 times for an everyday hitter. The main problem with including pitchers in the data, aside from all of the American League pitchers who do not hit, is that relief pitchers in the Major Leagues rarely, if ever, get to hit. The reason is that the majority of the time a relief pitcher comes in to relieve a starting pitcher for one inning or less. If the reliever is scheduled to bat in the next inning, the manager almost always pinch-hits, which means he puts in an everyday hitter to replace the pitcher at the plate. The only way that a relief pitcher would get to hit is if the manager plans on pitching the reliever for an extended period of time, which happens very rarely. Therefore, since relief pitchers would have very few at-bats and only National League starting pitchers could be counted, it would be very hard to determine whether their position, pitcher, had any effect on their batting side.

            The last group of players not included in the data is switch-hitters. Switch-hitters are players who can bat both left-handed and right-handed. Since they have the luxury of being able to hit from either side, they almost always hit left-handed against right-handed pitchers and right-handed against left-handed pitchers. They do this to gain the advantage of seeing the ball for an extra fraction of a second as it comes across the pitcher's body. The data collection showed that about 50 players were listed as switch-hitters on their official team websites. These 50 players were not included in the data because Chi-Square tests are based on counts, so each player in the data must be classified as either a right-handed hitter or a left-handed hitter. It would be impossible to tell which of the two categories to place these 50 players in, so they were left out of the data entirely.

            The last detail about the data collection is that some players can play multiple positions. Although they may play games at other positions during the season, the official team websites list only one official position for each player. This is the position that was used for these players in the data.

            All data for this study was collected from official team websites, www.(team nickname).mlb.com. For example, the Red Sox official team website is www.redsox.mlb.com. These websites contain official, up-to-date 25-man rosters with each player's official position and whether he bats right-handed, left-handed or as a switch-hitter. The official team websites are sponsored by the teams, so they are reliable, accurate and up to date.

Data/Goodness of Fit Test:

 

Observed Data

 

Position       Left-handed batters / total players   Proportion of left-handed batters
Catcher        4/50                                  .080
First Base     23/41                                 .561
Second Base    10/43                                 .233
Shortstop      3/35                                  .086
Third Base     13/39                                 .333
Outfield       68/134                                .507

 

 

Combined proportion of left-handed batters in the data:

121/342 = .3538

 

 

Calculating Expected Values:

The null hypothesis for this study was that there is no association between fielding position and batting right-handed or left-handed. Therefore, under the null hypothesis, the expected count of left-handed batters at each position is the combined proportion, .3538, times the number of players recorded at that position.

For example, 50 Major League Catchers were recorded in the data. Since the combined proportion of left-handed hitters is .3538, the expected number of left-handed hitting catchers in the data is:

50×(.3538) = 17.69 expected left-handed hitting catchers.

            This same process was repeated for all six positions to calculate the expected values at each position.

 

Expected Data

Position       Expected proportion of left-handed batters   Expected count of left-handed batters
Catcher        .3538                                        17.69
First Base     .3538                                        14.5058
Second Base    .3538                                        15.2134
Shortstop      .3538                                        12.383
Third Base     .3538                                        13.7982
Outfield       .3538                                        47.4092
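
The expected counts can be reproduced with a short script. The following is only a sketch in Python, not part of the original study; the variable names are illustrative, and it assumes (as the table suggests) that the combined proportion was rounded to four decimal places before multiplying. It also checks the condition that every expected count is at least five.

```python
# Observed data from the study: position -> (left-handed batters, total players)
observed = {
    "Catcher": (4, 50),
    "First Base": (23, 41),
    "Second Base": (10, 43),
    "Shortstop": (3, 35),
    "Third Base": (13, 39),
    "Outfield": (68, 134),
}

total_left = sum(left for left, _ in observed.values())
total_players = sum(n for _, n in observed.values())

# Assumption: the combined proportion is rounded to four decimals first,
# which matches the .3538 used in the expected-count table.
combined = round(total_left / total_players, 4)

# Expected left-handed count at each position under the null hypothesis.
expected = {pos: combined * n for pos, (_, n) in observed.items()}

for pos, value in expected.items():
    print(f"{pos:12s} {value:8.4f}")

# Goodness of Fit condition: every expected count should be at least five.
all_large_enough = all(value >= 5 for value in expected.values())
print("All expected counts at least five:", all_large_enough)
```

Working from the raw counts rather than retyping the expected values makes it easy to see how the result would change if a player were moved from one position to another.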

 

 

Calculating Components of Chi-Squared

 

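            The formula for calculating the components of Chi-Square is:

$$\text{Component of } \chi^2 = \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

For example, the Chi-Square component of catchers would be calculated:

$$\frac{(4 - 17.69)^2}{17.69} \approx 10.59$$
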

This process was repeated for all six positions.

 

 

 

 

Components of Chi-Square

 

Position       Observed Value   Expected Value   Component of Chi-Square
Catcher        4                17.69            10.590
First Base     23               14.5058          4.974
Second Base    10               15.2134          1.787
Shortstop      3                12.383           7.110
Third Base     13               13.7982          .0462
Outfield       68               47.4092          8.943

 

 

 

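            The sum of the Chi-Square, the test statistic, is found by adding the components of Chi-Square for all six positions.

$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}} = 10.590 + 4.974 + 1.787 + 7.110 + .0462 + 8.943 = 33.4502$$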
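
As a check on the arithmetic, the components and their sum can be recomputed from the observed and expected counts. This is a minimal sketch in Python with illustrative variable names, not the method used in the study (which was done by hand and calculator). Because it does not round intermediate values, its results agree with the tabulated figures only to about two decimal places: the catcher component comes out near 10.594 and the total near 33.454.

```python
# Observed and expected counts of left-handed batters, taken from the tables.
observed_left = {
    "Catcher": 4,
    "First Base": 23,
    "Second Base": 10,
    "Shortstop": 3,
    "Third Base": 13,
    "Outfield": 68,
}
expected_left = {
    "Catcher": 17.69,
    "First Base": 14.5058,
    "Second Base": 15.2134,
    "Shortstop": 12.383,
    "Third Base": 13.7982,
    "Outfield": 47.4092,
}

# Component of Chi-Square for each position: (observed - expected)^2 / expected
components = {
    pos: (observed_left[pos] - expected_left[pos]) ** 2 / expected_left[pos]
    for pos in observed_left
}

# The test statistic is the sum of the six components.
chi_square = sum(components.values())

for pos, value in components.items():
    print(f"{pos:12s} {value:8.4f}")
print(f"{'Sum':12s} {chi_square:8.4f}")
```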

 

 

Goodness of Fit Test

 

            Degrees of Freedom- To find the P-value of a Chi-Square Goodness of Fit test, the degrees of freedom must first be calculated. The degrees of freedom for a Goodness of Fit test is equal to (number of categories) − 1. In this case, there are six positions, so the degrees of freedom for this data is five.


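Hypothesis:

H0: There is no association between batting side and fielding position among current Major League Baseball players.

Ha: There is an association between batting side and fielding position among current Major League Baseball players.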

           

Conditions- All expected counts must be greater than or equal to five

- This condition is met by the data.

           

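P-value- In a Goodness of Fit test, the P-value is the area under the Chi-Square curve to the right of the test statistic, where the test statistic is the sum of the Chi-Square. The curve used is the one with the test's degrees of freedom, and the mean of that curve is equal to its degrees of freedom. Since the test statistic for this study was 33.4502 and the degrees of freedom was five, the P-value can be calculated on a TI-83 Plus calculator using the χ²cdf function with a lower bound of 33.4502, a very large upper bound and 5 degrees of freedom. The result is a P-value of approximately 3.06E-6.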
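
The same right-tail area can be computed without a calculator. The sketch below is an illustration in Python, not the procedure used in the study; the function name `chi2_sf` is hypothetical. It relies on a standard recurrence for the Chi-Square upper tail with whole-number degrees of freedom, starting from the closed forms for one degree of freedom (the complementary error function) or two degrees of freedom (an exponential) and adding one term for each additional two degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Area to the right of x under the Chi-Square curve (integer df >= 1)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        # Closed form for 1 degree of freedom.
        tail = math.erfc(math.sqrt(half))
        k = 1
    else:
        # Closed form for 2 degrees of freedom.
        tail = math.exp(-half)
        k = 2
    # Step up two degrees of freedom at a time until df is reached.
    while k < df:
        tail += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return tail


p_value = chi2_sf(33.4502, 5)
print(f"P-value: {p_value:.3g}")
```

For the test statistic of 33.4502 with five degrees of freedom, this should agree with the calculator result of about 3.06E-6.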

 

Interpreting the P-value

            If samples had been taken to collect data for this study, then this P-value would indicate a probability of 3.06E-6 of getting data with this Chi-Square or greater by chance alone. However, this study did not involve taking any samples of players. Instead, all eligible players were included in the data. Therefore, it could be said that there is no chance at all of this result arising from sampling variation, because the entire population is included in the data.

            Instead of interpreting this P-value as if the data were a sample, it can be interpreted as nearly 0, so we can reject H0, the hypothesis that there is no association between batting side and fielding position among current Major League Baseball players. This data shows a very strong association between batting side and fielding position. This suggests that players at certain positions tend to be left-handed hitters and players at other positions tend to be right-handed hitters.

 

Follow-up Analysis

            The sum of the Chi-Square for this data is 33.4502, but if the data is looked at more closely, some of the positions have much higher components of Chi-Square than others.

            The highest component of Chi-Square was 10.590 for the catchers. In the data collected, only 4 out of 50 catchers batted purely left-handed, compared to the expected count of 17.69. Catcher was one position mentioned in the reasons for the alternative hypothesis as having an advantage to throwing right-handed. This may have led to the small number of purely left-handed hitters. However, there were more switch-hitting catchers than at any other position. Therefore, the figure of only 8% left-handed catchers may not accurately reflect the share of left-handed and right-handed at-bats taken by catchers.

            Another position that falls into the defensive advantage reasoning is shortstop. Only three shortstops recorded in the data batted purely left-handed. However, 13 third basemen batted purely left-handed and had the same defensive disadvantages as shortstops. In fact, the Chi-Square component for third basemen was almost zero, meaning that the observed and expected values were almost equal. The reason for the low number of left-handed hitting shortstops compared to third basemen may be that shortstop, in general, is perceived as a defensive position. Shortstops often take more pride in being excellent defenders than hitters, while third basemen usually are expected to hit home runs and generally have less difficult defensive plays. Therefore, it could be that many third basemen would rather have the advantage of hitting left-handed and shortstops would rather have the advantage of playing both offense and defense purely right-handed.

            The two positions with a proportion of left-handed hitters greater than .50 were outfield and first base. According to the Atlanta Journal-Constitution, approximately 11% of America's population is left-handed. Therefore, as projected by the reasons for the alternative hypothesis, this indicates that there is a significant advantage to batting left-handed. Outfield and first base are the only positions, excluding designated hitter, where throwing with either hand is of no defensive importance. Therefore, not only is the proportion of left-handed hitting outfielders much higher than at every position except first base, it also is more than four times the proportion of left-handed people in America's population, keeping in mind that not everyone who throws left-handed bats left-handed.

            Overall, catchers, outfielders, first basemen and shortstops contributed most of the Chi-Square. The observed values for outfield and first base were much higher than expected, and the observed values for catcher and shortstop were much lower than expected. Second base and third base had low components of Chi-Square, with observed values a little less than expected. Generally, the reasoning for the alternative hypothesis was correct because second base, third base, shortstop and catcher all had lower proportions of left-handed hitters than both outfield and first base.
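
The follow-up comparison above can be summarized in a few lines of code. This is a hedged sketch in Python with illustrative names, not part of the original analysis: for each position it reports the component of Chi-Square, that component's share of the total, and whether the observed count fell above or below the expected count.

```python
# (position, observed left-handed batters, expected left-handed batters)
rows = [
    ("Catcher", 4, 17.69),
    ("First Base", 23, 14.5058),
    ("Second Base", 10, 15.2134),
    ("Shortstop", 3, 12.383),
    ("Third Base", 13, 13.7982),
    ("Outfield", 68, 47.4092),
]

summary = []
for position, obs, exp in rows:
    component = (obs - exp) ** 2 / exp
    direction = "above" if obs > exp else "below"
    summary.append((position, component, direction))

total = sum(component for _, component, _ in summary)

# Largest contributors to the Chi-Square first.
summary.sort(key=lambda item: item[1], reverse=True)

for position, component, direction in summary:
    share = component / total
    print(f"{position:12s} {component:7.3f} {share:6.1%}  {direction} expected")
```

Sorting by component puts catcher, outfield, shortstop and first base at the top, which matches the positions singled out in the discussion.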

 

Weaknesses

            Although no sampling took place for this study, there were several possible sources of error. One of these revolves around what question this study was specifically trying to answer. The goal of the study was to see if there was an association between batting side and fielding position, which is what the collected data accurately showed. However, if the goal were to use the batting-side data to argue that players at certain positions bat from one side more often, then this data might not be entirely accurate or useful. The reason for this is switch-hitters. While many switch-hitters bat almost equally well from both sides of the plate, some are much more successful from one side. The only reason that they try to bat from both sides is to get the advantage of always facing an opposite-throwing pitcher. In other words, they can bat left-handed against right-handed pitchers and right-handed against left-handed pitchers. Therefore, switch-hitters vary in their actual level of hitting from each side, so some players listed as switch-hitters may really be right-handed or left-handed hitters who bat from both sides just to get the advantage.

            Furthermore, these switch-hitters sometimes do not bat from the opposite side because they realize that they are stronger hitters from one side than the other. For example, if a switch-hitter is facing a soft-throwing right-handed pitcher, he may bat right-handed because the advantage of batting left-handed is lessened if the pitcher does not throw hard. These players especially may have been more accurately placed in the side that they prefer to hit from rather than being left out of the data entirely.

The other main possible source of error in this study is players who can play multiple positions. The team websites are accurate sources of information about the players, including fielding position and batting side. These websites were specifically used in this study because they list only the main and most appropriate fielding position for each player. However, this method of collecting data does include some possible error. There are many players in Major League Baseball, especially infielders (first base, second base, third base and shortstop), who can play more than just the position at which they are officially listed. This means that several of the infielders who bat left-handed could be, and in some cases should be, counted at other infield positions. If just a few of the left-handed hitting players listed at third base had been listed at shortstop, where there were only three left-handed hitters in total, this would have a significant impact on the results. For the most part, catchers and outfielders do not play other positions, so this would not be a major source of error for those positions.

 

Extrapolation

            The rule for extrapolation with Chi-Square tests is that the collected data must come from a Simple Random Sample (SRS). In the case of this study, an SRS was not taken, because no samples were taken.

            If it were safe to extrapolate from the data, it would be reasonable to extrapolate to some levels of baseball but not to all levels. For example, similar data would be expected from the minor league players of Major League Baseball teams and from other professional baseball players. The results would likely be the same because the level of play is very high in professional baseball leagues and defense is taken very seriously. At this level of competition, the extra fraction of a second that it takes a left-handed thrower to throw across his body from the infield to first base would be critical. For these same reasons, it would be reasonable to extrapolate to college baseball and even some high school baseball teams.

            The areas of baseball that it would not be reasonable to extrapolate to with this data are Little League and other baseball teams with young and inexperienced players. Since these players are just starting to play baseball, they would be trying to find a fielding position to play. The results of this study would encourage left-handed batting kids not to try to become second basemen, third basemen, shortstops or catchers. However, this would only be true if these kids played at a very high level of baseball. Since the kids playing naturally right-handed infield positions are already not very precise and accurate, adding an extra fraction of a second to their throws across the infield would not be a problem in most cases.

 

 

Conclusion

The results of this study have shown that there is a very strong association between batting side and fielding position among eligible Major League Baseball players.

First, the reasoning behind excluding groups such as pitchers, designated hitters and switch-hitters was explained. The first calculation done was the combined proportion of left-handed hitting players, which was found to be .3538. This proportion was used to find the expected counts of left-handed hitting players at each individual position. From these observed and expected counts, the individual components of Chi-Square were calculated and totaled. From this total, along with five degrees of freedom (6 positions − 1 = 5), a Goodness of Fit test produced a P-value of 3.06E-6. This P-value could not be interpreted as the probability of finding these results by chance alone because no sampling was involved in the collecting of data. Instead, the interpretation was that there was a very strong association between batting side and fielding position.

A follow-up analysis confirmed the reasoning for the alternative hypothesis that second base, third base and shortstop would have a lower proportion of left-handed batters than outfielders and first basemen. The proportion of left-handed hitting outfielders and first basemen was also compared to the proportion of all left-handed Americans to confirm that it was a significant advantage to hit left-handed in Major League Baseball.

            Weaknesses in the study included the exclusion of switch-hitters and the treatment of players who play multiple positions. If the data had come from a Simple Random Sample, it would be reasonable to extrapolate to minor league baseball, college baseball and high school baseball, while other levels such as Little League baseball would probably produce significantly different results.