Detour stop #2: what's a Lookup Table?
So far we have a chi-square-calc, which has a p-value associated with it. This would be fine and dandy IF we actually knew what that p-value was. But we don't. In fact, finding the p-value for any given chi-square-calc would involve a complicated mathematical formula. Believe it or not, biologists are not actually big on complicated mathematical formulas (or formuli either). So instead we have a lookup table. Or as I like to say, a Magic Lookup Table, because for our purposes, it might as well have appeared magically.
What the lookup table tells you is, for your specific dataset, which chi-square-calc would correspond to p=0.05. This special number is called the "chi-square-crit", as in the critical value or threshold value of the chi-square-calc.
And how do you know that this chi-square-crit is the one and only chi-square-crit that fits your exact dataset? It turns out that you only need to know one thing about your dataset, which is how many rows are in the chi-square table. Below is the Magic Lookup Table. If your chi-square table had 2 rows (like ours did), then you look up the chi-square-crit under df = 1 (because 2-1 = 1; more about that on the next page).
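For readers who would rather see the lookup as code, here is a minimal Python sketch. The critical values are the standard published ones for p = 0.05, rounded to two decimals. The names `CHI_SQUARE_CRIT_P05`, `chi_square_crit`, and `is_significant` are made up for this illustration, and the chi-square-calc passed in at the end is a hypothetical number, not one computed from our data.

```python
# Standard chi-square critical values for p = 0.05, keyed by degrees of freedom.
CHI_SQUARE_CRIT_P05 = {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49, 5: 11.07}


def chi_square_crit(num_rows):
    """Look up chi-square-crit for a chi-square table with num_rows rows."""
    df = num_rows - 1  # degrees of freedom = rows - 1
    return CHI_SQUARE_CRIT_P05[df]


def is_significant(chi_square_calc, num_rows):
    """True when chi-square-calc exceeds the threshold, i.e. p < 0.05."""
    return chi_square_calc > chi_square_crit(num_rows)


print(chi_square_crit(2))      # 2 rows -> df = 1 -> 3.84
print(is_significant(4.2, 2))  # hypothetical chi-square-calc of 4.2 -> True
```

A bigger chi-square-calc goes with a smaller p-value, which is why the sketch treats anything above chi-square-crit as p < 0.05.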
When doing a chi-square goodness of fit test, there is one last wrinkle to iron out, called degrees of freedom.
When I told you that 42 out of 100 sick days were on Mondays or Fridays, you automatically knew that 58 had to be in the middle of the week, right? I was "free" to specify how many were on Monday/Friday, but then I was NOT "free" to decide how many were on non-Monday/Friday. So we say that, in this problem, there is only 1 degree of freedom.
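The same bookkeeping can be written as a tiny Python sketch (the function name `last_category_count` is hypothetical): once the total and all but one of the counts are fixed, the last count is forced.

```python
def last_category_count(total, free_counts):
    """Return the one count that is NOT free, given the total and the free counts."""
    return total - sum(free_counts)


free_counts = [42]  # sick days on Monday/Friday
print(last_category_count(100, free_counts))  # 58 sick days midweek
print(len(free_counts))                       # 1 degree of freedom
```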
Say you flip a coin 100 times. If we want to do a chi-square test to determine whether a coin is fair (lands equally on heads and tails), how many degrees of freedom would the test have?
- I need a hint ... : If I tell you the number of heads, do you also know the number of tails?
- ...another hint ... : How many variables are "free" to vary?
I think I have the answer: There are two variables here, the number of heads and the number of tails. But only 1 is free to vary: once I tell you how many heads there were, you know how many tails there were, or vice versa. So the test has 1 degree of freedom.
It is possible to do chi-square tests using more than 2 variables. For example, let's say I got data on how many sick days fell on EACH of the five weekdays:
| day | observed | expected |
| mon | 22 | 20 |
| tues | 19 | 20 |
| wed | 19 | 20 |
| thurs | 20 | 20 |
| fri | 20 | 20 |
We could do a chi-square test to check whether the distribution of sick days matched our expectations for ALL FIVE weekdays.
How many degrees of freedom would this test have?
- I need a hint ... : There are 5 weekdays. How many of those am I "free" to specify data for?
- ...another hint ... : If I knew that there were 20 sick days each on Monday through Thursday, is Friday still "free" to vary?
I think I have the answer: Once I know how many sick days occurred on 4 of the 5 days, the fifth day is no longer "free" to vary. Therefore there are only 4 degrees of freedom.
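Putting the pieces together for the five-day table, here is a rough Python sketch. It assumes chi-square-calc is computed in the usual way, as the sum of (observed - expected)^2 / expected over the rows, and it hard-codes 9.49 as the standard critical value for df = 4 at p = 0.05. The variable names are illustrative only.

```python
observed = [22, 19, 19, 20, 20]  # Mon..Fri, from the table above
expected = [20, 20, 20, 20, 20]

# chi-square-calc: sum of (observed - expected)^2 / expected (assumed formula)
chi_square_calc = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1  # 5 rows -> 4 degrees of freedom
CRIT_DF4_P05 = 9.49     # standard critical value for df = 4, p = 0.05

print(round(chi_square_calc, 2))       # 0.3
print(df)                              # 4
print(chi_square_calc > CRIT_DF4_P05)  # False
```

Since chi-square-calc falls well below the threshold, this sketch would not flag the five-day data as departing from expectations at p = 0.05.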
Copyright University of Maryland, 2007
You may link to this site for educational purposes.
Please do not copy without permission
requests/questions/feedback email: mathbench@umd.edu




