WelshShooter
Member
- Joined
- Jun 10, 2014
- Messages
- 491
Presenting the average of averages can easily make sense if the data sets compiled in the comparison set are independent but related.
Let’s say I want to know if Hornady brass is as good as Lapua brass. So I buy .30-06, 284 win, 7-08, and 6 creed brass.
I shoot 2 or 3 different brands of bullets from each cartridge in each set, over a couple of different powders. Now I have hundreds if not thousands of shots in this multifactorial matrix.
I can simplify presentation of the data set by presenting average SDs and average ESs. If a guy only compared the single ES for one cartridge loaded in the 2 types of brass, with one bullet, the data set would not be very conclusive. But presenting the two full matrices of bullet and powder combinations for each cartridge would be exhausting to read. Instead, a comparison of average ESs and average SDs might be very meaningful. Going a step further, taking the ES and SD of the ESs, and the ES and SD of the SDs, might be very meaningful - in a very concise and palatable results set.
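As a rough sketch of that summary, the Python below collapses a pile of shot strings into an average SD and average ES, plus the ES and SD of those per-string figures. The function names and the velocities are made up purely for illustration; each inner list stands for one cartridge/bullet/powder combination.

```python
from statistics import mean, stdev

def extreme_spread(values):
    """ES: largest value minus smallest value."""
    return max(values) - min(values)

def summarize_brand(strings):
    """Collapse many shot strings into average SD and average ES,
    plus the spread (ES and SD) of those per-string values."""
    sds = [stdev(s) for s in strings]           # sample SD of each string
    ess = [extreme_spread(s) for s in strings]  # ES of each string
    return {
        "avg_sd": mean(sds),
        "avg_es": mean(ess),
        "sd_of_sds": stdev(sds),
        "es_of_sds": extreme_spread(sds),
        "sd_of_ess": stdev(ess),
        "es_of_ess": extreme_spread(ess),
    }

# Hypothetical velocities in fps, NOT real test data.
brand_a = [
    [2710, 2718, 2705, 2714, 2709],
    [2850, 2861, 2855, 2848, 2857],
    [2620, 2626, 2631, 2618, 2624],
]
brand_b = [
    [2708, 2725, 2699, 2717, 2711],
    [2846, 2866, 2853, 2841, 2860],
    [2615, 2629, 2636, 2612, 2625],
]

for name, strings in (("Brand A", brand_a), ("Brand B", brand_b)):
    print(name, summarize_brand(strings))
```

Each brand ends up as six numbers instead of a full matrix, which is the concise results set described above.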
For the comparison shown above, you could perform a 2-sample t-test to check whether the mean values of the two distributions are significantly different (e.g. whether two bullets in one case brand produce different velocities, or whether the same bullet in different cases produces different velocities). Furthermore, you could also perform a 2-variance test to check whether the SDs are significantly different.
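To show what those two tests actually compute, here is a minimal sketch using only the Python standard library. It uses Welch's form of the 2-sample t-test (which does not assume equal variances) and a plain variance ratio (the F statistic) for the variance comparison; both are my choice for illustration, and the velocities are invented. It stops at the test statistics: to get p-values you would hand the same data to a stats package, for example SciPy's `scipy.stats.ttest_ind(a, b, equal_var=False)`.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's 2-sample t statistic and its approximate degrees of freedom."""
    va, vb = variance(a), variance(b)   # sample variances
    na, nb = len(a), len(b)
    se2 = va / na + vb / nb             # squared standard error of the difference
    t = (mean(a) - mean(b)) / sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

def variance_ratio(a, b):
    """F statistic: ratio of the two sample variances (SD squared)."""
    return variance(a) / variance(b)

# Hypothetical velocities in fps for the same load in two case brands.
case_a = [2710, 2718, 2705, 2714, 2709, 2712, 2707, 2716]
case_b = [2721, 2709, 2730, 2715, 2702, 2726, 2711, 2719]

t, df = welch_t(case_a, case_b)
print(f"t = {t:.3f}, df = {df:.1f}")
print(f"variance ratio F = {variance_ratio(case_a, case_b):.3f}")
```

A large absolute t suggests the mean velocities differ; an F far from 1 suggests the SDs differ. Whether either is "significant" depends on the p-value at your chosen threshold.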
However, if you want to see how well a load copes with different conditions etc., it would be worth comparing the min, median* and max ES/SD/mean velocity observed over time. If your min and median* are very similar but the max is much higher, then you can work out what conditions caused that dataset to be different. It could be that all preceding data were collected over winter and this is the first sample added over summer, or it could be down to a new batch of powder compared to the old batch that's been sitting on your shelf for years.
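A quick sketch of that min/median/max check in Python, with invented per-session ES values. The session labels and the "1.5 times the median" cut-off are arbitrary assumptions of mine, just to show how an odd session could be picked out for a closer look.

```python
from statistics import median

def min_median_max(values):
    """Return (min, median, max) of a list of numbers."""
    return min(values), median(values), max(values)

# Hypothetical ES (fps) recorded for one load at each range session.
es_by_session = {"Jan": 18, "Feb": 21, "Mar": 19, "Nov": 20, "Jul": 41}

lo, mid, hi = min_median_max(list(es_by_session.values()))
print(f"min = {lo}, median = {mid}, max = {hi}")

# Flag sessions well above the median (threshold is an arbitrary assumption).
flagged = [label for label, es in es_by_session.items() if es > 1.5 * mid]
print("sessions worth a closer look:", flagged)
```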
*When your data is skewed and non-normal in distribution, the median is more reliable to use than the mean. If you had a sample of n=10 with nine data points equal to 1 and one data point equal to 10, your average is 1.9 but your median is 1. The value 1.9 does not appear in your dataset and doesn't really tell you anything other than that your data is skewed, whereas the median of 1 at least tells you that your median is equal to your min, so your data is probably mostly grouped around this value.
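The n=10 example can be checked in a few lines of Python using the standard `statistics` module:

```python
from statistics import mean, median

# Nine data points equal to 1 and one equal to 10, as in the example above.
data = [1] * 9 + [10]

avg = mean(data)    # (9 * 1 + 10) / 10
mid = median(data)  # middle of the sorted values

print(f"mean = {avg}, median = {mid}, min = {min(data)}")
```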