Saturday, March 22, 2008

Fantasy Baseball Abstract: Predicting HR or AVG, Which Is More Reliable?

Question: Which is predicted more reliably in fantasy baseball, batting average or home runs?

Methods: I felt the best way to answer this question was to measure how far a prediction fell from 100%. The projection is defined as 100%, and the reliability percentage is the total actual result divided by the total projected result. The difference between the two is defined as the Confidence Factor, or CF. The closer a CF is to zero, the more reliable the prediction. Equivalently, the closer the reliability percentage is to 100%, the more accurate the prediction.
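As a minimal sketch of the definition above, the calculation could be written in Python as follows. The function name and the sample totals are my own placeholders for illustration, not part of the original spreadsheet work.

```python
def confidence_factor(projected, actual):
    """Return (CF, reliability %) for a projected total and an actual total."""
    if projected <= 0:
        raise ValueError("projected total must be positive")
    reliability = 100.0 * actual / projected  # actual as a percent of projected
    cf = 100.0 - reliability                  # distance from a perfect 100%
    return cf, reliability


# Hypothetical totals, chosen only to illustrate the arithmetic.
cf, rel = confidence_factor(projected=200, actual=170)
print(f"CF = {cf:.1f}, reliability = {rel:.1f}%")  # CF = 15.0, reliability = 85.0%
```

Under this definition a group that beats its projection would produce a negative CF, so it is the distance from zero that matters.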

Players were broken into two categories: exceptional players and reliable players. Exceptional HR hitters are defined as those projected for 25 or more HRs. Exceptional AVG players are defined as those projected for a .300 or better average and 500 ABs. Reliable players are defined as those projected for 500 or more ABs.

Data: For HRs, exceptional players were those with a projected HR total of 25 or greater. Their full regular-season stats were counted regardless of injury. The 53 players projected to hit 25 or more HRs had a projected total of 1789 HRs and an actual total of 1375 HRs, for a CF of 23.1. Another way to say it is that the projection was 76.9% reliable. Three players, however (Joe Crede, Russ Branyan, and Morgan Ensberg), had fewer than 50% of their projected ABs due to injury or decreased playing time. Factoring out these three players, the projected HRs drop to 1700, and the actual total was 1349. This gives a CF of 20.6, or 79.4% reliability.

The results for reliable HRs are as follows. Of the 161 players that had 500+ projected ABs, a total of 3432 HRs were projected and 2735 were hit. This is a CF of 20.3, or a reliability rate of 79.7%. Thirteen players who had at least 500 ABs projected ended up with fewer than 50% of them due to injury or lessened playing time. These players were Rocco Baldelli, Russ Branyan, Jorge Cantu, Joe Crede, Chris Duffy, Morgan Ensberg, Andre Ethier, Shea Hillenbrand, Nick Johnson, Mark Kotsay, Juan Rivera (LAA), Ryan Shealy, and Chad Tracy. The remaining 149 players were projected to hit 3175 HRs but actually hit 2676. The CF for this group is 15.7, or a reliability rate of 84.3%.
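The filtering step described above (dropping anyone who finished with fewer than 50% of his projected ABs, then recomputing the CF) might look roughly like this in Python. The `PlayerLine` record, the `cf_for_group` helper, and the three sample players are all hypothetical. The numbers are made up for illustration and are not real 2007 projections or results.

```python
from dataclasses import dataclass


@dataclass
class PlayerLine:
    name: str
    proj_ab: int
    actual_ab: int
    proj_hr: int
    actual_hr: int


def cf_for_group(players, min_ab_share=None):
    """Return (CF, reliability %) for a group, optionally dropping players
    whose actual ABs fell below min_ab_share of their projected ABs."""
    if min_ab_share is not None:
        players = [p for p in players if p.actual_ab >= min_ab_share * p.proj_ab]
    projected = sum(p.proj_hr for p in players)
    actual = sum(p.actual_hr for p in players)
    reliability = 100.0 * actual / projected
    return 100.0 - reliability, reliability


# Made-up lines for illustration only.
sample = [
    PlayerLine("Player A", 550, 560, 30, 27),
    PlayerLine("Player B", 520, 200, 25, 8),   # under 50% of projected ABs
    PlayerLine("Player C", 600, 580, 20, 19),
]

print(cf_for_group(sample))                    # whole group
print(cf_for_group(sample, min_ab_share=0.5))  # Player B removed
```

In this toy group a single injured player moves the CF from about 28 to about 8, which is the same effect, in exaggerated form, as removing the injured players from the totals above.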

For BA, players in both categories were those with a projected AB total greater than 500. Exceptional players were defined as those with a projected average greater than .300.

Exceptional BA players were projected to hit .311, and they actually hit .299. This is a CF of 3.8, or 96.2% reliability. None of the players projected to hit above .300 with more than 500 ABs lost more than 50% of their projected ABs.

The reliable BA results were as follows. The projected average was .285, and the actual average was .275. That is a CF of 3.5 and a reliability rate of 96.5%. Twelve players who had 500 ABs projected reached fewer than half that amount. They are the thirteen listed above with the exception of Russ Branyan, who did not have 500 ABs projected. With those twelve removed, the remaining group’s projected BA was .286, and its actual BA was .281. This is a CF of 1.7, or an impressive reliability rate of 98.3%.
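To put the two categories side by side, here is a small Python sketch that recomputes the CF for the reliable group (players under 50% of projected ABs removed) from the figures reported above. The `cf` helper and the row labels are my own naming, and the batting averages are entered as decimals.

```python
def cf(projected, actual):
    """Confidence Factor: percent by which the actual fell short of the projection."""
    return 100.0 * (1 - actual / projected)


# (label, projected, actual) as reported in this post for the reliable group.
reported = [
    ("Reliable HR (total)", 3175, 2676),
    ("Reliable BA (average)", 0.286, 0.281),
]

rows = [(label, cf(projected, actual)) for label, projected, actual in reported]
for label, value in rows:
    print(f"{label:<22} CF = {value:.1f}")
```

This should give roughly 15.7 for HRs and 1.7 for BA, matching the figures above.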

Conclusion: Why such a great disparity? I believe that easily quantifiable numbers (such as last year’s ct% and gb/fb ratio) tend to favor the prediction of batting averages over home runs. Factors that cannot be predicted, such as wind, temperature, and humidity, affect how far a ball travels, and a difference of 3 to 5 feet can be the difference between a home run and an out. However, such small differences would not affect a non-HR hit as much. There is a greater probability (assumed, not proven) that a batted ball lands in the field of play and is NOT an out than that it is hit over the fence. It follows that a difference of feet or inches matters less to batting average than to HRs.

This is a game of inches, but the science isn’t specific enough to measure those inches. Instead we use quantifiable metrics, historical and implied trends, and some dumb luck to predict the future. But the sabermetric statistician’s crystal ball is a bit fuzzy when it comes to factors outside the game that affect what happens inside it. These factors, however, do not affect balls in play as much as balls hit over the wall.

And this looks only at data from the 2007 season. Every year I will add results to this, to further support or refute the theory that batting average is more reliably predicted than home runs.

Addendum I: While reviewing this abstract I noticed a potential flaw in my logic. Over the past few years I have shown that BA and ABs have a direct correlation. Players with more ABs are also much more reliable, which makes predictions for them easier to get right. Therefore I will also compute the CF and reliability rate for players projected for both 25+ HRs and 500 ABs.

46 players fell into both categories. They were projected to hit 1579 HRs but hit 1225. This gives a CF of 22.4, or a 77.6% reliability rate. Two players, Joe Crede and Morgan Ensberg, had fewer than half of their projected ABs. Excluding them, 1517 HRs were projected and 1209 were hit. This lowers the CF to 20.3 and raises the reliability rate to 79.7%.

Therefore the theory still holds true.

Addendum II: I noticed that I used an average for batting but totals for home runs. To avoid this possible inconsistency, I will use the average HRs per player for the players who qualified in Addendum I.

The average number of HRs projected per player was 34.3, and the actual average hit was 26.6. The CF for this was 22.4 and the reliability rate was 77.6%. Again factoring out Crede and Ensberg, the averages change to 34.5 projected and 27.5 hit. This gives a CF of 20.3, or a reliability rate of 79.7%.
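Because each average is just the group total divided by the same number of players, the reliability computed from averages should match the one computed from totals. A quick check in Python, using the 46-player totals from Addendum I (the variable names are my own):

```python
players = 46
proj_total, actual_total = 1579, 1225   # totals reported in Addendum I

proj_avg = proj_total / players         # about 34.3 HRs per player
actual_avg = actual_total / players     # about 26.6 HRs per player

rel_from_totals = 100.0 * actual_total / proj_total
rel_from_avgs = 100.0 * actual_avg / proj_avg

print(f"per-player averages: {proj_avg:.1f} projected, {actual_avg:.1f} actual")
print(f"reliability from totals:   {rel_from_totals:.1f}%")
print(f"reliability from averages: {rel_from_avgs:.1f}%")
```

Both routes come out near 77.6%, since the player count cancels out of the ratio.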

No marked difference is shown when using the average rather than the total.

Debate I: Perhaps, however, a CF of 20 for HRs is not as important as a CF of less than 4 for BA. The batting average category is normally won by a matter of points, while the home run category is rarely won by just a HR or two and is more often won by a margin of many home runs.

Table I: CF and Rel% of players with >50% projected ABs

Category           CF     Rel%
Exceptional HR     20.6   79.4
Reliable HR        15.7   84.3
Exceptional BA      3.8   96.2
Reliable BA         1.7   98.3

1 comment:

Unknown said...

Hi Doug,

You have a spelling error in the chart at the bottom. It should say Reliable BA after Exception(al) BA.

Dave