
If my statistical results do not fall within a "normal" curve, does that mean they are wrong? Why?

I assume that you are asking about results that either lie outside a confidence interval or, during a hypothesis test, fall in the critical region (the tail).


When creating a confidence interval, we start with a point estimate for the population parameter we are interested in. For example, if we want to know the average height, we might take the average from a random sample of sufficient size as a decent point estimate.


Understanding that the point estimate is not likely to exactly match the population parameter, we introduce a margin of error. This is added to and subtracted from the point estimate to create a confidence interval. The margin of error is the standard error of the estimate multiplied by a factor derived from the confidence level we want to achieve. (The higher the confidence level, the wider the interval will be.)
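As a rough illustration of that construction, here is a minimal Python sketch. The height data and the function name `mean_confidence_interval` are made up for the example, and it uses a normal (z) critical value for simplicity; for a sample this small, a t critical value would usually be the better choice.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Return (low, high) for the population mean using a normal (z) critical value."""
    n = len(sample)
    point_estimate = mean(sample)
    # Standard error of the sample mean.
    standard_error = stdev(sample) / math.sqrt(n)
    # Factor derived from the confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * standard_error
    return point_estimate - margin, point_estimate + margin

# Hypothetical sample of heights in centimetres (illustrative data only).
heights_cm = [172.1, 168.4, 175.0, 169.9, 171.3, 174.2, 166.8, 170.5, 173.6, 169.2]

low95, high95 = mean_confidence_interval(heights_cm, 0.95)
low99, high99 = mean_confidence_interval(heights_cm, 0.99)
print(f"95% interval: {low95:.2f} to {high95:.2f}")
print(f"99% interval: {low99:.2f} to {high99:.2f}")
```

The 99% interval comes out wider than the 95% one, because the only thing that changes between the two calls is the critical value.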


If a second sample gives a result outside this confidence interval, is the result "wrong"? Not necessarily. Suppose the interval was created with a 95% confidence level. This means the procedure produces an interval that contains the population parameter in about 95% of repeated samples. Roughly one out of every twenty samples will produce an interval that misses the true parameter, so a result outside the interval may simply reflect ordinary sampling variation.
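One way to see where the 95% figure comes from is to simulate many samples from a population whose mean we know and count how often the interval captures it. The sketch below does that; the population mean, standard deviation, sample size, and seed are all arbitrary assumptions chosen for the illustration.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def coverage_rate(true_mean=170.0, true_sd=8.0, n=50, trials=2000,
                  confidence=0.95, seed=1):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        center = mean(sample)
        margin = z * stdev(sample) / math.sqrt(n)
        if center - margin <= true_mean <= center + margin:
            hits += 1
    return hits / trials

rate = coverage_rate()
print(f"Intervals containing the true mean: {rate:.1%}")
```

The printed rate should land close to 95%, with the remaining intervals missing the true mean purely by chance.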


If we are doing a hypothesis test, in essence we are creating a confidence interval centered on the reported or accepted value of the parameter. Then we see if our sample statistic lies in that interval. If it does, we fail to reject the stated value: the sample is consistent with it, although that does not prove it is correct. If our sample statistic lies outside the interval (in the critical region), we have evidence that the stated population parameter is incorrect.
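The comparison described above can be sketched as a two-sided z-test. This is only one possible setup: it assumes the population standard deviation is known, and the claimed mean, sample means, and the function name `z_test` are hypothetical values picked for the example.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, claimed_mean, sd, n, alpha=0.05):
    """Two-sided z-test; returns (z, p_value, in_critical_region)."""
    standard_error = sd / math.sqrt(n)
    z = (sample_mean - claimed_mean) / standard_error
    # Boundary of the critical region for the chosen significance level.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, abs(z) > critical

# Claimed mean height of 170 cm, known sd of 8 cm, sample of 64 people.
z_far, p_far, reject_far = z_test(172.5, 170.0, 8.0, 64)    # z = 2.5
z_near, p_near, reject_near = z_test(171.0, 170.0, 8.0, 64)  # z = 1.0
print(z_far, round(p_far, 4), reject_far)
print(z_near, round(p_near, 4), reject_near)
```

The first sample mean falls in the critical region, so the claimed value is rejected; the second does not, so the test fails to reject it.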


In running a hypothesis test we run the risk of two types of error (assuming the samples are drawn correctly, and so on). A type I error is when we say that the purported parameter is incorrect when it is actually correct. We have a lot of control over this type of error, as its probability is equal to the significance level, which is one minus the confidence level. (I.e., at the 95% confidence level, the chance of a type I error is 5%.)
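To make the 5% figure concrete, the following sketch repeatedly samples from a population where the claimed mean really is correct and counts how often a two-sided z-test rejects it anyway. The population values, sample size, trial count, and seed are assumptions for the illustration, and the standard deviation is treated as known.

```python
import math
import random
from statistics import NormalDist, mean

def type_i_error_rate(claimed_mean=170.0, sd=8.0, n=40, trials=4000,
                      alpha=0.05, seed=7):
    """Fraction of tests that reject a claimed mean that is actually true."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sd / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        # The null hypothesis is true here: data come from the claimed mean.
        sample = [rng.gauss(claimed_mean, sd) for _ in range(n)]
        z = (mean(sample) - claimed_mean) / standard_error
        if abs(z) > critical:
            rejections += 1
    return rejections / trials

false_alarm_rate = type_i_error_rate()
print(f"Type I error rate: {false_alarm_rate:.1%}")
```

The rate should come out near 5%, matching the significance level chosen for the test.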


A type II error occurs when we fail to recognize an incorrect parameter. We can reduce the probability of this error by increasing the sample size or collecting additional samples.
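The effect of sample size on the type II error can be computed directly for a two-sided z-test with a known standard deviation. In this sketch the true mean (172), claimed mean (170), standard deviation (8), and the function name `type_ii_error` are all hypothetical choices for the example.

```python
import math
from statistics import NormalDist

def type_ii_error(true_mean, claimed_mean, sd, n, alpha=0.05):
    """Probability of failing to reject an incorrect claimed mean."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha / 2)
    # How far the z statistic is shifted, on average, when the claim is wrong.
    shift = (true_mean - claimed_mean) / (sd / math.sqrt(n))
    # Chance the statistic still lands between -critical and +critical.
    return std_normal.cdf(critical - shift) - std_normal.cdf(-critical - shift)

sample_sizes = [10, 40, 160]
betas = [type_ii_error(172.0, 170.0, 8.0, n) for n in sample_sizes]
for n, beta in zip(sample_sizes, betas):
    print(f"n = {n:3d}: type II error probability = {beta:.3f}")
```

With the same true and claimed means, the probability of missing the discrepancy shrinks as the sample grows.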


So when running a hypothesis test, the result never tells us the truth with certainty. We only have probabilities to work with.
