If my statistical results do not fall within a "normal" curve, does that mean they are wrong? Why?

I assume you are asking about results that either lie outside a confidence interval or, in a hypothesis test, fall in the critical region (the tail).


When creating a confidence interval we start with a point estimate for the population parameter we are interested in. For example, if we want to know the average height we might assume that the average from a random sample of sufficient size is a decent point estimate.


Understanding that the point estimate is not likely to exactly match the population parameter, we introduce a margin of error. This is added to and subtracted from the point estimate to create a confidence interval. The margin of error is the standard error of the estimate multiplied by a critical value derived from the confidence level we want to achieve. (The higher the confidence level, the wider the interval will be.)
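As a rough illustration of the construction just described, here is a minimal Python sketch. The function name `confidence_interval` and the height values are made up for this example, and it uses a normal (z) critical value for simplicity; for a sample this small a t critical value would normally be more appropriate.

```python
import math
from statistics import NormalDist, mean, stdev

def confidence_interval(sample, confidence=0.95):
    """Return (low, high) for the population mean, using a normal approximation."""
    n = len(sample)
    point_estimate = mean(sample)
    # Standard error of the sample mean.
    standard_error = stdev(sample) / math.sqrt(n)
    # Critical value: the central `confidence` share of a standard normal
    # distribution lies within +/- z_star.
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z_star * standard_error
    return point_estimate - margin, point_estimate + margin

# Hypothetical heights in centimetres, invented for illustration only.
heights_cm = [172.1, 168.4, 175.0, 169.9, 171.3, 174.2, 166.8, 170.5, 173.6, 169.2]

low95, high95 = confidence_interval(heights_cm, 0.95)
low99, high99 = confidence_interval(heights_cm, 0.99)
print(f"95% interval: {low95:.2f} to {high95:.2f}")
print(f"99% interval: {low99:.2f} to {high99:.2f}")
```

Both intervals are centered on the same point estimate; the 99% interval is wider because its critical value is larger.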


If a second sample gives a result outside this confidence interval, is the result "wrong"? Not necessarily. Suppose the interval was created with a 95% confidence level. That means the procedure used to build it captures the population parameter for about 95% of the samples it is applied to. Roughly one interval in every twenty will miss the true parameter purely by chance. A second sample's estimate can also fall outside the first interval without either being wrong, because both estimates vary from sample to sample.
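The "one in twenty" idea can be checked with a small simulation. The sketch below assumes a normally distributed population with a mean of 170 and a standard deviation of 8 (invented values), draws many samples, builds a 95% interval from each, and counts how often the interval contains the true mean. The function name `coverage_rate` is hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def coverage_rate(true_mean=170.0, true_sd=8.0, n=50,
                  trials=2000, confidence=0.95, seed=0):
    """Fraction of simulated confidence intervals that contain true_mean."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        margin = z_star * stdev(sample) / math.sqrt(n)
        # The interval covers the true mean when the estimate is
        # within one margin of error of it.
        if abs(mean(sample) - true_mean) <= margin:
            hits += 1
    return hits / trials

rate = coverage_rate()
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

The share should come out close to 0.95. The intervals that miss were built correctly; they simply came from unlucky samples.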


If we are doing a hypothesis test, in essence we are creating an interval centered on the reported or accepted value of the parameter. Then we see if our sample statistic lies in that interval. If it does, we fail to reject the stated value; this does not prove that the value is correct, only that our sample gives no strong evidence against it. If our sample statistic lies outside the interval (in the critical region), we have evidence that the stated population parameter is incorrect.
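As a sketch of that decision rule, the following Python function runs a two-sided z-test for a mean. It assumes the population standard deviation is known, and all the numbers (a claimed mean of 170, a standard deviation of 8, samples of 50) are invented for illustration. The function name `two_sided_z_test` is hypothetical.

```python
import math
from statistics import NormalDist

def two_sided_z_test(sample_mean, claimed_mean, sd, n, alpha=0.05):
    """Compare a sample mean with a claimed population mean."""
    standard_error = sd / math.sqrt(n)
    # How many standard errors the sample mean is from the claimed value.
    z = (sample_mean - claimed_mean) / standard_error
    # Boundary of the critical region for a two-sided test.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return {"z": z, "critical": z_crit, "p_value": p_value,
            "reject": abs(z) > z_crit}

# A sample mean far from the claimed value falls in the critical region.
result_far = two_sided_z_test(sample_mean=173.0, claimed_mean=170.0, sd=8.0, n=50)
# A sample mean close to the claimed value does not.
result_near = two_sided_z_test(sample_mean=171.0, claimed_mean=170.0, sd=8.0, n=50)
print(result_far["reject"], result_near["reject"])
```

A rejection only says that the sample would be unusual if the claimed value were true. It does not prove the claim false, and a non-rejection does not prove it true.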


In running a hypothesis test we run the risk of two types of error (assuming the samples are collected correctly, and so on). A type I error is when we say that the purported parameter is incorrect when it is actually correct. We have a lot of control over this type of error, as its probability is equal to the significance level, which is one minus the confidence level. (That is, at the 95% confidence level, the chance of a type I error is 5%.)
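To see the type I error rate in action, the sketch below simulates many samples from a population in which the claimed mean really is correct, and counts how often a two-sided z-test rejects it anyway. The population values and the function name `type_one_error_rate` are assumptions made for this example, and the population standard deviation is treated as known.

```python
import math
import random
from statistics import NormalDist, mean

def type_one_error_rate(claimed_mean=170.0, sd=8.0, n=50,
                        alpha=0.05, trials=4000, seed=1):
    """Share of tests that reject a claimed mean that is actually true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sd / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        # The samples come from a population whose mean IS the claimed mean.
        sample = [rng.gauss(claimed_mean, sd) for _ in range(n)]
        z = (mean(sample) - claimed_mean) / standard_error
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

false_alarm_rate = type_one_error_rate()
print(f"Observed type I error rate: {false_alarm_rate:.3f}")
```

With the test run at the 95% confidence level (a 5% significance level), the observed rate should sit near 0.05.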


A type II error occurs when we fail to recognize an incorrect parameter. We can reduce the probability of this error by increasing the sample size or by collecting additional samples.
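The effect of sample size on the type II error can be sketched with the standard formula for a two-sided z-test. The example assumes a claimed mean of 170, a true mean of 172, and a known standard deviation of 8; these values and the function name `type_two_error_probability` are invented for illustration.

```python
import math
from statistics import NormalDist

def type_two_error_probability(claimed_mean, true_mean, sd, n, alpha=0.05):
    """Probability of failing to reject a claimed mean that is actually wrong."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Distance between the true and claimed means, in standard errors.
    shift = (true_mean - claimed_mean) / (sd / math.sqrt(n))
    # Chance that the test statistic still lands between -z_crit and +z_crit.
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)

betas = {n: type_two_error_probability(170.0, 172.0, 8.0, n)
         for n in (25, 50, 100, 200)}
for n, beta in betas.items():
    print(f"n = {n:3d}: type II error probability = {beta:.3f}")
```

The probability falls as the sample size grows, because a larger sample shrinks the standard error and makes the gap between the true and claimed means easier to detect.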


So when running a hypothesis test, the true answer is never known with certainty. We only have probabilities to work with, which is why a result in the tail is not automatically "wrong."
