# What you observe vs. what you expect

Let's start with the 42% M/F sickdays. For simplicity, we'll assume this means 42 out of 100 (rather than 84 out of 200 or 420 out of 1000, etc.). That's the data that was observed. Using the laws of probability -- Monday and Friday are 2 of the 5 workdays, so a random sickday lands on them 2/5 of the time -- we also know that (approximately) 40 out of 100 sickdays should fall on M/F. That's the expected value.
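As a quick sanity check on that expected value, here is the arithmetic as a short Python sketch (assuming, as above, a five-day workweek where Mon/Fri are 2 of the 5 workdays):

```python
# If sickdays are random, each of the 5 workdays is equally likely,
# so Mon/Fri together should receive 2/5 of all sickdays.
total_sickdays = 100
expected_monfri = total_sickdays * 2 / 5   # 2 of 5 workdays
expected_midweek = total_sickdays * 3 / 5  # the other 3 workdays
print(expected_monfri, expected_midweek)   # → 40.0 60.0
```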

What we want to do is test how far apart the "observed" and "expected" answers are, right? So a logical first step is to subtract one from the other -- that tells us how different they are. We'll do this both for M/F sickdays and for midweek sickdays:

|         | observed (o) | expected (e) | difference (o - e) |
|---------|--------------|--------------|--------------------|
| Mon/Fri | 42           | 40           | +2                 |
| Midweek | 58           | 60           | -2                 |

Then we want to know how important this difference is. Is it big compared to what we expected, or small? To compare the size of two numbers, you need to find a ratio -- in other words, use division. You need to find out how big the difference is compared to the number you expected to get. So, divide the difference (between the observed and expected) by the expected value:

|         | observed (o) | expected (e) | difference (o - e) | difference compared to expected, (o - e)/e |
|---------|--------------|--------------|--------------------|--------------------------------------------|
| Mon/Fri | 42           | 40           | +2                 | +2/40 = +0.05                              |
| Midweek | 58           | 60           | -2                 | -2/60 = -0.033                             |
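The table's last two columns are just subtraction and division, so they can be reproduced in a few lines of Python (the counts are the ones from the table; nothing new is assumed):

```python
observed = {"Mon/Fri": 42, "Midweek": 58}
expected = {"Mon/Fri": 40, "Midweek": 60}

for day, o in observed.items():
    e = expected[day]
    diff = o - e          # difference: observed minus expected
    relative = diff / e   # difference compared to expected
    print(day, diff, round(relative, 3))
# → Mon/Fri 2 0.05
# → Midweek -2 -0.033
```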

The last column in the table shows the size of each deviation relative to what was expected. If we ignore the negative signs and add them up, we have a way of measuring the TOTAL deviation for all the data, in this case 0.05 + 0.033 = 0.083. A big total deviation would mean that we probably have the wrong explanation, whereas a small total deviation would probably mean we're on the right track. Since we're trying to show that sick days are RANDOM, big deviations are bad for our case, while small deviations are good for our case.
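The "ignore the signs and add them up" step can be sketched directly; the `abs()` call is what throws away the negative signs:

```python
observed = [42, 58]  # Mon/Fri, Midweek
expected = [40, 60]

# Sum of |o - e| / e over both categories: the TOTAL relative deviation.
total_deviation = sum(abs(o - e) / e for o, e in zip(observed, expected))
print(round(total_deviation, 3))  # → 0.083
```

A small total like this one is exactly the "good for our case" outcome described above.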