
Chi-squared Tests

What you observe vs what you expect


Let's start with the 42% of sick days that fell on a Monday or Friday. For simplicity, we'll assume this means 42 out of 100 (rather than 84 out of 200 or 420 out of 1000, etc.). This is the data that was observed. Using the laws of probability, we also know that (approximately) 40 out of 100 sick days should fall on Monday or Friday if sick days are random, because Monday and Friday make up 2 of the 5 days in a standard work week. This is the expected value.

What we want to do is test how far apart the "observed" and "expected" values are, right? So a logical first step is to subtract the expected value from the observed value, which tells us how different they are. We'll do this both for Monday or Friday sick days and for midweek sick days:

 

           observed   expected   difference
             (o)        (e)       (o - e)
Mon/Fri       42         40         +2
Midweek       58         60         -2

Then we want to know how important this difference is. Is it big compared to what we expected, or small? To compare the size of two numbers, you need to find a ratio; in other words, use division. Here the question is how big the difference is compared to the number you expected to get, so divide the difference (observed minus expected) by the expected value:

 

           observed   expected   difference   difference compared to expected
             (o)        (e)       (o - e)               (o - e)/e
Mon/Fri       42         40         +2                   +0.050
Midweek       58         60         -2                   -0.033
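
For readers who want to check the arithmetic, the two calculated columns can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `relative_differences` and the dictionary layout are illustrative choices, not part of the original lesson, and the counts are the 42/58 observed and 40/60 expected values used above.

```python
# Observed and expected sick-day counts out of 100 (values from the table).
observed = {"Mon/Fri": 42, "Midweek": 58}
expected = {"Mon/Fri": 40, "Midweek": 60}


def relative_differences(observed, expected):
    """Return {category: (o - e, (o - e) / e)} for each category.

    Hypothetical helper name, used here only for illustration.
    """
    result = {}
    for category, o in observed.items():
        e = expected[category]
        difference = o - e                 # the "difference" column
        result[category] = (difference, difference / e)  # difference compared to expected
    return result


rows = relative_differences(observed, expected)
for category, (difference, ratio) in rows.items():
    print(f"{category:8s} o-e = {difference:+d}   (o-e)/e = {ratio:+.3f}")
```

Dividing by the expected count, rather than the observed one, keeps the yardstick fixed: the expected value comes from the explanation being tested, so each deviation is measured against what that explanation predicts.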

 

The last column in the table ("difference compared to expected") shows the size of each deviation relative to what we expected. If we ignore the negative signs and simply add up the values in that column, we have a way of measuring the TOTAL deviation for all the data, in this case about 0.050 + 0.033 = 0.083. A big total deviation would mean that we probably have the wrong explanation, whereas a small total deviation would probably mean we're on the right track. Since we're trying to show that sick days are RANDOM, big deviations are bad for our case, while small deviations are good for our case.
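
The total described in this paragraph can be sketched in Python as follows. The function name `total_deviation` is a hypothetical label for this example, and "ignoring the negative signs" is implemented by taking absolute values. Note that this is only the running measure built up so far in the text, using the same 42/58 and 40/60 counts.

```python
def total_deviation(observed, expected):
    """Sum of |o - e| / e over all categories (negative signs ignored)."""
    return sum(abs(o - e) / e for o, e in zip(observed, expected))


observed = [42, 58]   # Mon/Fri, Midweek sick days actually seen
expected = [40, 60]   # what random sick days would predict

total = total_deviation(observed, expected)
print(round(total, 3))  # 0.083
```

A total near zero means the observed counts sit close to the expected ones; how large a total has to be before it counts as "big" is a separate question that this sketch does not answer.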