Detour stop #3: what's a "df"?
On the last page, I said you should look up the chi-square-crit under "number of rows minus one". Why? Because that number is the test's degrees of freedom, or "df" for short.
When I told you that 42 out of 100 sick days were on Mondays or Fridays, you automatically knew that 58 had to be in the middle of the week (100 - 42 = 58), right? I was "free" to specify how many were on Monday/Friday, but once the total was fixed I was NOT "free" to decide how many were on the other days. So we say that, in this problem, there is only 1 degree of freedom.
Say you flip a coin 100 times. If we want to do a chi-square test to determine whether a coin is fair (lands equally on heads and tails), how many degrees of freedom would the test have?
- I need a hint ... : If I tell you the number of heads, do you also know the number of tails?
- ...another hint ... : How many variables are "free" to vary?
I think I have the answer: There are two variables here -- number of
heads and number of tails. But only 1 is free to vary -- once I tell you
how many heads there were, you know how many tails there were,
or vice versa.
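If it helps to see the coin example worked through, here is a minimal Python sketch. The count of 58 heads is made up for illustration (it is not data from this page), and the helper names degrees_of_freedom and chi_square_statistic are my own, not from any library. It assumes the usual setup where the total number of flips is fixed in advance.

```python
def degrees_of_freedom(num_categories):
    """df for a chi-square test on one row of counts with a fixed total."""
    return num_categories - 1


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


total_flips = 100
heads = 58                   # hypothetical count, chosen only for illustration
tails = total_flips - heads  # NOT free to vary: fixed once heads is known

observed = [heads, tails]
expected = [total_flips / 2, total_flips / 2]  # a fair coin: 50 heads, 50 tails

df = degrees_of_freedom(len(observed))
stat = chi_square_statistic(observed, expected)

print("tails:", tails)
print("degrees of freedom:", df)
print("chi-square statistic:", round(stat, 2))
```

Notice that tails is computed, never chosen. That is the whole idea: two counts, but only one of them is free.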
It is possible to do chi-square tests using more than 2 variables. For example, let's say I got data on how many sick days fell on EACH of the five weekdays.
We could do a chi-square test to check whether the distribution of sick days matched our expectations for ALL FIVE weekdays.
How many degrees of freedom would this test have?
- I need a hint ... : There are 5 weekdays -- how many of those am I "free"
to specify data for?
- ...another hint ... : If I knew that there were 20 sick days each on Monday
through Thursday, is Friday still "free" to vary?
I think I have the answer: Once I know how many sick days occurred
on 4 of the 5 days, the fifth day is no longer "free" to vary.
Therefore there are only 4 degrees of freedom.
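The same reasoning can be sketched in a few lines of Python. This uses the numbers from the second hint (20 sick days each on Monday through Thursday) and assumes the total is still the 100 sick days from earlier; the variable names are my own.

```python
total_sick_days = 100  # assumed: the same total of 100 sick days as before

# The four days I am "free" to specify (values taken from the hint).
known_days = {"Monday": 20, "Tuesday": 20, "Wednesday": 20, "Thursday": 20}

# Friday is NOT free: it is whatever is left over from the fixed total.
friday = total_sick_days - sum(known_days.values())

num_categories = len(known_days) + 1  # four known days plus Friday
df = num_categories - 1               # "number of rows minus one"

print("Friday:", friday)
print("degrees of freedom:", df)
```

Whatever counts you put in for the first four days, Friday is forced by the total, so only 4 of the 5 counts are free.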
Copyright University of Maryland, 2007
You may link to this site for educational purposes.
Please do not copy without permission
requests/questions/feedback email: firstname.lastname@example.org