Accuracy Evaluation

Status
Not open for further replies.

carnaby

Member
Joined
Feb 25, 2004
Messages
1,394
Location
Bellingham, WA
Cross posted at a couple forums. I'm trying to change the discussion on this topic to be more reasonable than the standard methods that I see at the range and on forums at the moment.

Accuracy evaluation through testing with 5-shot groups is dubious at best. I attempt to start to address that here:

http://www.bisonops.com/2019/08/17/rifle-ammunition-load-workup/

The long and short is that most 5-shot groups are not statistically different, though different loads may produce results that look convincing. I think a better approach is to establish a baseline and work from there, though I'm not sure of the best way to do this. Establishing a baseline is time consuming and not as satisfying as shooting a few 5-shot groups and drawing conclusions from those, but with small groups we're really just kidding ourselves. Better to commit to starting with 20-30 shots in a single group, or several 5-10 shot groups that are then merged together.
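
To make the point about 5-shot groups concrete, here is a minimal sketch, not the method from the linked article. It assumes impacts follow a circular normal distribution (same standard deviation on both axes), and the value sigma=0.3 is an arbitrary illustration, not measured data. It fires many simulated groups from one unchanging "load" and reports how much the extreme spread varies by chance alone.

```python
import itertools
import math
import random
import statistics

def extreme_spread(shots):
    """Largest center-to-center distance between any two shots in a group."""
    return max(math.dist(a, b) for a, b in itertools.combinations(shots, 2))

def simulate_group(n_shots, sigma, rng):
    """Draw impacts from a circular normal distribution centered on the aim point."""
    return [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(n_shots)]

def spread_summary(n_shots, sigma=0.3, n_groups=1000, seed=1):
    """Return the 5th percentile, median, and 95th percentile extreme spread."""
    rng = random.Random(seed)
    spreads = sorted(
        extreme_spread(simulate_group(n_shots, sigma, rng)) for _ in range(n_groups)
    )
    low = spreads[int(0.05 * n_groups)]
    high = spreads[int(0.95 * n_groups)]
    return low, statistics.median(spreads), high

# Same simulated load every time; only the number of shots per group changes.
for n in (5, 30):
    low, mid, high = spread_summary(n)
    print(f"{n}-shot groups: 5% {low:.2f}  median {mid:.2f}  95% {high:.2f}  "
          f"ratio {high / low:.1f}")
```

The ratio between the 95th and 5th percentile is the thing to look at: it shows how far apart two groups from the identical load can plausibly be, and it shrinks as the shot count grows.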

Then the question is how to best differentiate one load from another relative to this baseline.
 
So yeah, that's exactly what I'm getting at. There is zero statistical difference between any of the groups you show on your targets. They are in effect all the same and your load development hasn't really proven anything beyond the fact that your rifle is extremely accurate and tolerant of small changes in cartridge parameters. None of those groups has been shown to be different from any other in any scientifically meaningful way. It's still fun and feels satisfying, but should it? I can't square that anymore with my load development.
 
I don't believe that @Nature Boy's development for that test was for group size, but for external ballistic comparison through the chronograph.
Shooting them farther out could perhaps show a clearer trend.

Yes, statistically, a one hundred round group would be more accurate at diagnosing an acceptable load.

But, how many matches could those hundred rounds rob you of? More if the load shows no more promise.

Shooting a hundred mediocre at best rounds sounds like punishment.

For me, fun and satisfying are the end goals.

Have you a procedure to share? I am all ears! Next weekend is centerfire weekend and I have rifles that could shoot better. :)
 
1. If a load groups poorly with a single 3 to 5 shot group it's probably not going to do a lot better with a 10 or 100 shot group.

2. If a load groups well with a single 3 shot group you better shoot a few more groups to make sure it's really a good load. If it shoots a 5 shot group well it's likely a good load but still some more groups should be shot to confirm that.

3. In my opinion, shooting 10 or 100 shot groups in load development is an excellent way to wear out barrels and waste a lot of money on ammunition.
 
There is zero statistical difference between any of the groups you show on your targets. They are in effect all the same and your load development hasn't really proven anything

What you’re not appreciating is that the individual groups are not discrete data points. They are part of a larger analysis to get to an optimal load.

There are 4 steps to the method I use in the following sequence:

1. Powder charge. Compares 3 adjacent charge weights (9 shots total)
2. Seating depth. 3 shots per depth change x 9
3. Primer test. 5 shots per primer x 6
4. Validation at distance. 20 shots at 500 yards (my home range limit)

You could say there’s a 5th step. I shoot in F Class and how the load developed using the method outlined above performs in matches would be the final verification.

I’ve shot three 600 yard matches with this load. These are the results:

600 x35
597 x30
599 x35

So out of 180 shots for record at 600 yards, 176 were less than 1 MOA and 100 of those were less than 1/2 MOA.

I’d say that’s statistically significant, wouldn’t you?
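
The arithmetic behind those totals can be checked with a short sketch. It assumes 60 record shots per match (inferred from the "180 shots" figure) and 10 points maximum per shot, and it takes the post's reading that a 10 corresponds to under 1 MOA and an X to under 1/2 MOA.

```python
# (points, X count) for each of the three matches quoted in the post
matches = [(600, 35), (597, 30), (599, 35)]

SHOTS_PER_MATCH = 60       # assumption: inferred from "180 shots for record"
MAX_POINTS_PER_SHOT = 10   # assumption: a 10-ring target

total_shots = SHOTS_PER_MATCH * len(matches)
points_dropped = sum(
    SHOTS_PER_MATCH * MAX_POINTS_PER_SHOT - points for points, _ in matches
)
x_count = sum(x for _, x in matches)

# Every shot outside the 10-ring costs at least one point, so the number
# of tens is at least the shot count minus the points dropped.
min_tens = total_shots - points_dropped

print(total_shots, points_dropped, min_tens, x_count)  # 180 4 176 100
```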
 
Nature boy, certainly the 180 shots for record are statistically significant. What you have not shown is whether any of the loads used in development would have shot the same, worse, or better. My claim is that scientifically no claim can be made from your load development about whether or not any of those adjacent powder charges, seating depths, or primers would have made any difference to your scores in that match.

Demi-human, agreed that 100 shot groups are too costly in terms of time and components, and especially that a load that shows little promise should not be continued with. I think a known good and satisfactory load should be used to establish the baseline, with maybe 30 shots used for that purpose. I also think that chrony data for a small number of rounds, like 3 or 5 shots, is dubious for determining whether a velocity node has indeed been found. Again, the data is not statistically significant.

And so we're on the same page, here is the definition from https://en.wikipedia.org/wiki/Statistical_significance

In any experiment or observation that involves drawing a sample from a population, there is always the possibility that an observed effect would have occurred due to sampling error alone. But if the p-value of an observed effect is less than the significance level, an investigator may conclude that the effect reflects the characteristics of the whole population, thereby rejecting the null hypothesis.

Grumulkin, I agree that 100 shot groups don't help because as you imply, what's the point of wearing out a barrel and using up tons of components to get statistical significance when it robs shooting of any fun? But I think a single 30-shot group used to establish a baseline is not too burdensome or costly.

In my experience, small changes in powder charge and seating depth often don't result in definitive changes in group size when comparing individual 3-shot or 5-shot groups. The one time I do see an indisputable difference is when I try different bullets. When I'm shooting around say 0.75" 5-shot groups with say an 80 SMK and then switch to another bullet and can't get a single group below 1.5", then yeah, that load that produced the 1.5" and worse groups can be safely abandoned. But if I take that 80 SMK with 25.0 grains of powder X and shoot a single 0.75" 5-shot group and then shoot another single 5-shot group with 25.5 grains and get a 1.0" extreme spread... well not much can be said about this.
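
The 0.75" versus 1.0" example can be put to a rough test. The sketch below assumes a circular normal impact model and a single unchanging load, fires pairs of 5-shot groups, and counts how often the larger group is at least 1.0/0.75 times the smaller one. The function names are mine, not from any published tool.

```python
import itertools
import math
import random

def extreme_spread(shots):
    """Largest center-to-center distance between any two shots."""
    return max(math.dist(a, b) for a, b in itertools.combinations(shots, 2))

def simulate_group(n_shots, rng, sigma=1.0):
    """One group from a circular normal model; sigma only sets the scale."""
    return [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(n_shots)]

def ratio_exceed_fraction(n_shots=5, ratio=1.0 / 0.75, trials=5000, seed=2):
    """Fraction of same-load group pairs whose sizes differ by at least `ratio`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = extreme_spread(simulate_group(n_shots, rng))
        b = extreme_spread(simulate_group(n_shots, rng))
        if max(a, b) / min(a, b) >= ratio:
            hits += 1
    return hits / trials

print(f"same-load pairs differing by 0.75 vs 1.0 or more: "
      f"{ratio_exceed_fraction():.0%}")
```

If that fraction comes out far above the few percent one would normally demand before calling two loads different, a single pair of 5-shot groups like this says very little about either load.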

Have a look at the article I posted. I think this is the right approach. You need a baseline to work from, something that says with some mathematical certainty that my rifle with my baseline load shoots with a known and meaningful statistical metric, which for all practical purposes ought to just be the mean radius adjusted for sample size (see http://ballistipedia.com/index.php?title=Precision_Models for the math.)
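
For anyone who wants to compute the metric, here is a minimal sketch of mean radius for a baseline built from several targets. The coordinates are hypothetical, in MOA relative to each target's point of aim, so merging is just pooling the shots. The `rayleigh_sigma` helper uses the standard n-1 divisor for a circular normal model when the center is estimated from the same shots. It is not a reproduction of the correction factors on the Ballistipedia page.

```python
import math

def center(shots):
    """Mean point of impact (horizontal, vertical)."""
    n = len(shots)
    return sum(x for x, _ in shots) / n, sum(y for _, y in shots) / n

def mean_radius(shots):
    """Average distance of each shot from the group's own center."""
    cx, cy = center(shots)
    return sum(math.hypot(x - cx, y - cy) for x, y in shots) / len(shots)

def rayleigh_sigma(shots):
    """Per-axis sigma estimate: sqrt(sum(r^2) / (2 * (n - 1))).

    The quantity under the root is unbiased for sigma^2 under a circular
    normal model when the center is estimated from the same shots.
    """
    cx, cy = center(shots)
    sum_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in shots)
    return math.sqrt(sum_sq / (2 * (len(shots) - 1)))

# Hypothetical data: two 5-shot targets, same load, same point of aim.
target_1 = [(0.10, -0.65), (0.25, -0.80), (0.05, -0.55), (0.30, -0.70), (0.15, -0.90)]
target_2 = [(0.20, -0.60), (0.00, -0.75), (0.35, -0.85), (0.10, -0.50), (0.15, -0.70)]
baseline = target_1 + target_2  # merge by pooling aim-point-relative coordinates

print(f"zero offset (h, v): {center(baseline)}")
print(f"mean radius: {mean_radius(baseline):.3f} MOA")
print(f"model mean radius: {rayleigh_sigma(baseline) * math.sqrt(math.pi / 2):.3f} MOA")
```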

I think the load for starting the baseline can be established the old fashioned way - shoot a couple of 5-shot groups and if you like the results, keep going, or if you already have a favorite load, use that. I think a baseline can be established with maybe as few as 20 shots, but really 30 is needed, and extreme discipline when shooting so that you can omit shooter error as much as possible - i.e. wait 30 seconds minimum between shots to keep the barrel from warming up, and establish a good shooting routine.

I'm working with the author from Ballistipedia to establish procedures and evaluation tools so that load development can be more scientific. That's why I'm looking for feedback on the shooting forums. In the end, I want the results to be useful to precision shooters.
 
Please take no offense, but you just talked in a big circle to get back to finding a good load "the old fashioned way" and then comparing other loads to it.

Not to mention using benchrest targets to keep from obliterating the aiming point. (Something not alluded to in your article.)

Have you a way to deduce (transduce, rather) a good load from some other metric beside firing cartridges?

I understand what you are saying about end target statistics and grouping, but how do we get there with less load development and more meaningful shot impact evaluation?

How do we separate the chaff, if not through trial and error, then validation through large round count statistical analysis?
 
My claim is that scientifically no claim can be made from your load development about whether or not any of those adjacent powder charges, seating depths, or primers would have made any difference to your scores in that match.

Your claim is just that, a claim, without any scientific basis or data. It can’t be argued for or against. You put the burden on me to prove the validity of your “claim”. I’m not going to take the worst load in my analysis and go shoot a match with it. That would be silly and a waste of time.

I’m going to re-read your article. I would respectfully ask you to do the same with what I’ve stated above. I’m pretty confident in my approach as it has proven itself in 3 different rifles and cartridges. Not to mention, it’s based on data and facts.
 
Your claim is just that, a claim, without any scientific basis or data. It can’t be argued for or against. You put the burden on me to prove the validity of your “claim”.

Let me rephrase: no scientifically reasonable claim can be made about the difference in the accuracy between the various groups shown in your first image. In statistics jargon, you have not shown that the null hypothesis is false. I don't have to show that it is true, it is assumed true until the data can show that it is false.
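
One way to actually test that null hypothesis is a permutation test. The sketch below is my own illustration, not something from the article: it takes the radial distances of each shot from its group center for two loads and asks how often a random relabeling of the shots produces a difference in mean radius at least as large as the one observed. The radii are hypothetical, and treating them as exchangeable is an approximation.

```python
import random
import statistics

def permutation_p_value(radii_a, radii_b, n_perm=2000, seed=5):
    """Two-sided permutation p-value for a difference in mean radius."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(radii_a) - statistics.mean(radii_b))
    pooled = list(radii_a) + list(radii_b)
    n_a = len(radii_a)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # relabel shots at random under the null
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            count += 1
    # Add-one correction keeps the estimate away from an impossible zero.
    return (count + 1) / (n_perm + 1)

# Hypothetical radii in inches for two 5-shot groups.
load_a = [0.21, 0.35, 0.18, 0.40, 0.27]
load_b = [0.30, 0.45, 0.22, 0.52, 0.33]
print(f"p-value: {permutation_p_value(load_a, load_b):.2f}")
```

A p-value above the chosen significance level means the null hypothesis stands and the two loads have not been shown to differ.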

Here's an important starting point: do you agree that the point of impact of several bullets fired from a rifle at the same point of aim is described informally as a variable whose values depend on outcomes of a random phenomenon? Do you agree that the horizontal and vertical coordinates of the bullet holes in the group follow gaussian (i.e. normal) distributions that are described by a mean and standard deviation (whatever those happen to be for a given weapon and ammunition)? If we can't agree on this point, then we can't even start to have a discussion on the topic of accuracy and load development.

Here's a simulated target that I set up to match your first target in that other post. All I did was set the mean radius to 0.188 MOA, the vertical zero to -0.7 MOA, and the horizontal zero to 0.15 MOA then I let the random number generator do the rest.

Simulated-6-3-shot-groups-p188-mean-radius.jpg
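
A target like this can be generated with a few lines. The sketch below is one plausible way to do it and may differ from the generator actually used: it assumes a circular normal model, for which the mean radius equals sigma times sqrt(pi/2), and uses the three parameters stated above.

```python
import itertools
import math
import random

MEAN_RADIUS = 0.188   # MOA, from the post
H_ZERO = 0.15         # MOA, horizontal zero from the post
V_ZERO = -0.7         # MOA, vertical zero from the post

# Assumption: circular normal model, where mean radius = sigma * sqrt(pi / 2).
SIGMA = MEAN_RADIUS / math.sqrt(math.pi / 2)

def simulate_groups(n_groups=6, shots_per_group=3, seed=None):
    """Return n_groups lists of (horizontal, vertical) impacts in MOA."""
    rng = random.Random(seed)
    return [
        [(rng.gauss(H_ZERO, SIGMA), rng.gauss(V_ZERO, SIGMA))
         for _ in range(shots_per_group)]
        for _ in range(n_groups)
    ]

def extreme_spread(shots):
    """Largest center-to-center distance between any two shots."""
    return max(math.dist(a, b) for a, b in itertools.combinations(shots, 2))

# Every group comes from the same distribution; only chance separates them.
for i, group in enumerate(simulate_groups(seed=3), start=1):
    print(f"group {i}: extreme spread {extreme_spread(group):.2f} MOA")
```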

We know that these groups were all statistically identical. Suppose we thought that we had varied some aspect of the load that generated each group by a small amount, then we might be tempted to infer something about the accuracy of these different loads. But this cannot be done, the groups are equivalent. And they look a lot like the groups in your image:

NatureBoyTarget.jpg

So what is the difference that lets you draw conclusions about the relative accuracy of the loads you developed?
 
Please take no offense, but you just talked in a big circle to get back to finding a good load "the old fashioned way" and then comparing other loads to it.

Of course not, never will an offense be taken when arguing honestly. You have to start somewhere is all I'm saying, so if you have a load that shoots well and that you trust, why not use it for your baseline?

Not to mention using benchrest targets to keep from obliterating the aiming point. (Something not alluded to in your article.)

This is a challenge, but the solution is to shoot at a new target and then combine the results. Not so difficult really.

Have you a way to deduce (transduce, rather) a good load from some other metric beside firing cartridges?

I'm not 100% sure what you mean. How else would you know anything about a weapon's accuracy without firing cartridges?

I understand what you are saying about end target statistics and grouping, but how do we get there with less load development and more meaningful shot impact evaluation?

That's the million dollar question, let's think about that some more. I'll reply after I've thought about it.

How do we separate the chaff, if not through trial and error, then validation through large round count statistical analysis?

The brute force approach would work but like you imply, who wants to do that? Not me. I'll reply after some thought. But my first guess is to start with a 30 round baseline and then work from there.
 
So what is the difference that lets you draw conclusions about the relative accuracy of the loads you developed?

The point you don’t seem to want to understand is the load test is the first step in a series of steps, not the end result.....and now I’m repeating myself....which is usually a sign that it’s time for me to bow out of the conversation.
 
So what is the difference that lets you draw conclusions about the relative accuracy of the loads you developed?

I don't wish to put words in Nature Boy's mouth, but I have a broken shoulder and nothing to do, so...

Knowing he is loading for a mid range competition, and seeing that basically all the targets are roughly half minute, the only difference in the groups is the difference in velocity in each group.
At longer ranges this is what matters. Slow shots hit low, increasing group size. They also take longer to get there, drifting more in the wind and increasing group size.

I suspect you know all this. Not all shooting is raw accuracy. If I had a rifle that shot in the ones I still couldn't out shoot @Nature Boy, not even at a hundred. (Maybe Rimfire pistols at Fifty...;))


I'm not 100% sure what you mean. How else would you know anything about a weapon's accuracy without firing cartridges?

I thought you were getting at a way to find the first "good load" faster. I'm all for that. I'm here to learn.

But my first guess is to start with a 30 round baseline and then work from there.

Yes, but where do the thirty rounds come from? I understand having a good load and comparing a new one to it using a good process.

Talking on here about rimfire rifles and whether or not what you have is M.ore O.bviously A.ccurate, I know about statistically valid samples. (And dropped shots. :))
 
I think we all understand the importance of having more data points in any scientific experiment....but unless you have the means to reject the noise that is introduced by the human and the environment, all the data points in the world won't tell you the accuracy potential of your firearm.
 
...the only difference in the groups is the difference in velocity in each group.

And even these do not have enough rounds to give a good estimate of the mean and SD.

Yes, but where do the thirty rounds come from? I understand having a good load and comparing a new one to it using a good process.

Have a look at the linked article in my first post. Around 30 rounds the group starts to converge to give somewhat repeatable estimates of the mean radius and zero offsets.
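
The convergence can be sketched without firing a shot. The code below is not the analysis from the article: it assumes a circular normal impact model, simulates many groups of each size from one load, and reports how much the mean radius estimate scatters from group to group (standard deviation divided by mean).

```python
import math
import random
import statistics

def mean_radius(shots):
    """Average distance of each shot from the group's own center."""
    n = len(shots)
    cx = sum(x for x, _ in shots) / n
    cy = sum(y for _, y in shots) / n
    return sum(math.hypot(x - cx, y - cy) for x, y in shots) / n

def mean_radius_scatter(n_shots, sigma=1.0, trials=2000, seed=4):
    """Relative scatter (stdev / mean) of the mean radius across simulated groups."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        shots = [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
                 for _ in range(n_shots)]
        estimates.append(mean_radius(shots))
    return statistics.stdev(estimates) / statistics.mean(estimates)

for n in (3, 5, 10, 30):
    print(f"{n:2d} shots: mean radius varies by about "
          f"{mean_radius_scatter(n):.0%} from group to group")
```

The scatter falls roughly with the square root of the shot count, which is why the estimate settles down slowly and a few 3- or 5-shot groups leave so much uncertainty.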
 
I think we all understand the importance of having more data points in any scientific experiment....but unless you have the means to reject the noise that is introduced by the human and the environment, all the data points in the world won't tell you the accuracy potential of your firearm.

This is an excellent point. 100 yards is good because it rejects most of the environmental disturbance, especially if the winds are already low. Taking out the human factor is another matter, and largely something most of us are stuck with. In the end we're the ones behind the trigger, so we're estimating the rifle/ammunition/shooter system as a whole.
 