The Case of the Confusing Axes

In this module we'll talk about how to graph data, and particularly how to set up the graph.

Below you will find an easy checklist for making the perfect graph. On the following pages, we'll practice some of the skills you will need.

A Sample Graph: Eyesight and TV Viewing Distance

When I was growing up, the older folks frequently rained dire predictions on our heads about our viewing habits. Specifically, we sat too close to the color television, which was thought to have alien powers akin to Martians landing on the roof. Anyway, I could imagine my poor grandmother collecting the data for the graph below, hoping to prove her point and force us to sit farther from the Lawrence Welk show:

viewing distance (cm)    10  20  30  40  50  60  70  80  100  120  140
eyesight remaining (%)   75  80  85  88  92  95  97  98   98   98   98

Choose your x and y carefully.

Scientists like to say that the “independent” variable goes on the x-axis (the bottom, horizontal one) and the “dependent” variable goes on the y-axis (the left side, vertical one). This does not mean that the x variable is out partying while the y variable is whining about the x variable never being around -- that's co-dependence, which is a completely different kettle of fish.

the x and y axes

When you're talking about variables, “independent” means that the researcher (you, or someone else in a white coat) can pick any value they want for that variable. Using the TV viewing distance data, you can imagine the researcher putting little pieces of tape on the floor and positioning her small experimental subjects at just the right distance... whereas I can't think of any way the researcher can directly control how much eyesight the kids lost.

Personally, I'm a little dyslexic. I still mix up right and left, and I find the words “dependent” and “independent” a little confusing. So I think “cause” (the x variable) and “effect” (the y variable). This is not exactly right and will make some scientists apoplectic, but as a sort of memory aid, it might help you too. So,

•  The cause is how far the kid sits from the TV – the x variable
•  The effect is how much eyesight she loses – the y variable.

OK, now you try: for each pair, which is "x" and which is "y"?

Type "x" in front of the data that should go on the x axis, "y" in front of the data
for the y axis, and "n" for data that should NOT be used in a graph.

y - number of surviving fish in tank
x - water temperature in tank

Type "x" and "y" again...

x - grams of food fed per day
y - daily growth rate of mouse

OK, those two were easy. In both cases, the x variables (temperature of tank or amount of food given) are easily controlled by the experimenter. Sometimes the "x" variable can't actually be controlled but only "chosen" by the researcher -- this is especially the case when some version of "time" is the x variable.
However, be careful. Just because a variable includes time does not mean that it is automatically the x variable. Sometimes the amount of time a process takes is the EFFECT of a treatment, and then it's the "y" variable.

Type "x" and "y" again...

x - year
y - deer population size

Type "x" and "y" again...

y - time until total pain relief
x - dosage of Dilaudin (an opiate painkiller)

Did you get the Dilaudin example? In this case, the researcher is trying to find out how long it takes to get pain relief DEPENDING ON how much medication is given. The researcher can directly control (or choose) dosage, but not time.
Finally, data tables often contain many columns, so it's important to figure out which ones belong on a graph. Below, type "n" for data that will NOT be represented by either "x" or "y".

Type "x" and "y" again... and "n" for data that should NOT be used in a graph.

n - embryo identification number
y - size of embryo
x - time since conception

Type "x" and "y" again... and "n" for data that should NOT be used in a graph.

x - number of hours studied for the test
n - student name
n - teacher name
y - grade on test
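
If your data table lives in a computer instead of on paper, the same sorting applies. Here is a minimal sketch in Python, under some loud assumptions: the student and teacher names are made-up placeholders, the hours and grades are borrowed from the studying dataset at the end of this module, and `pick_columns` is just a name invented for illustration.

```python
# Each row is one student. The names are made-up placeholders; the hours
# and grades are taken from the studying dataset in this module.
table = [
    {"student": "Student A", "teacher": "Teacher 1", "hours": 0, "grade": 37},
    {"student": "Student B", "teacher": "Teacher 1", "hours": 1, "grade": 62},
    {"student": "Student C", "teacher": "Teacher 2", "hours": 2, "grade": 87},
]


def pick_columns(rows, x_name, y_name):
    """Return the x values and the y values; every other column is an "n"."""
    xs = [row[x_name] for row in rows]
    ys = [row[y_name] for row in rows]
    return xs, ys


# x is what the student can choose (hours); y is the effect (grade).
hours, grades = pick_columns(table, "hours", "grade")
print(hours)   # [0, 1, 2]
print(grades)  # [37, 62, 87]
```

The name columns are simply never asked for, which is the code version of typing "n" in front of them.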


Practice labelling the axes

Now that you know what goes on which axis, try labelling the axes. In general, your label should have two parts: what you measure, and how you measure it, or units. Units should be in parentheses. Like this: "distance (cm)" or "time (minutes)" or "eyesight left (%)".
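
If you build your labels on a computer, the two-part rule can be written down in a couple of lines. This is only a sketch in Python; the helper name `axis_label` is hypothetical, not part of any graphing program.

```python
def axis_label(quantity, units):
    """Build a label of the form 'what you measure (units)'."""
    # strip() removes stray spaces so the label comes out tidy
    return f"{quantity.strip()} ({units.strip()})"


print(axis_label("distance", "cm"))       # distance (cm)
print(axis_label("time", "minutes"))      # time (minutes)
print(axis_label(" eyesight left", "%"))  # eyesight left (%)
```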

Putting in the units allows your reader to interpret the graph. For example, if you are graphing the grade received on a test vs. the time spent studying, it will make a big difference whether the x-axis reads "time studying (hours)" or "time studying (minutes)"! Likewise, it's hard to interpret the graph on the left, because you don't know what the units are:

graph with no units correct graph

Below, I've written the "x" variable label, but you need to think of a possible "y" variable label. There are many possibilities, so think of one, then click "compare" to see whether you're on the right track. In some cases the units are not very obvious and might take some thinking.

Write a label for "y"...

     variable                         label
"x": grams of food fed per day        food per day (grams)
"y": daily growth rate of mouse       growth rate (grams per day)

Did you pick some unit other than grams/day for growth of mice? That's OK -- there are a lot of different ways to measure the same thing. BUT, that's also the reason that it's important to tell people HOW you measured your variables!

Write a label for "y"...

     variable                              label
"x": temperature in tank                   Water Temperature (degrees Celsius)
"y": number of surviving fish in tank      survival rate (number of individuals)

Write a label for "y"...

     variable             label
"x": year                 Time (year)
"y": number of deer       population size (number of deer)

Write a label for "y"...

     variable                           label
"x": dosage of Dilaudin                 Dosage of Dilaudin (mg)
"y": time to total relief of pain       time to pain relief (minutes)

Write a label for "y"...

     variable                          label
"x": time since conception             Time since conception (days)
"y": number of cells in embryo         embryo size (number of cells)

Write a label for "y"...

     variable                 label
"x": amount of studying       Time spent studying (hours)
"y": grade on test            test score (% correct)

Next step: ranging the axes.

Now that you have your axes, and labels for your axes, you need to think about what numbers go on the axes. The first step is to figure out the smallest and largest number that belongs on each axis. That's what "range" means -- find the range of the axis, or, find the biggest and smallest numbers that belong on the axis.

Obviously you know how to find the smallest and biggest numbers in a list. The tricky part is that these are often NOT the best numbers to use as the min and max of your axes. There are two reasons:

1. Almost always, the axes should start at 0, not at some other number. The reason is that starting someplace other than zero distorts the shape of the graph, usually making the rates of change look larger than they really are. (An exception is when the x-axis is showing time; for example you might start with the year 1990, rather than the year 0!) The graph on the left gives you a rather exaggerated idea of how much TV viewing can damage your eyesight!

graph with no units correct graph

2. You want to pick maximum numbers for "x" and "y" that are nice and "round". The highest "y" value in a dataset might be 97.9%, but it's a lot easier if your graph just goes up to 100. Look at the graph on the left -- it has very inconvenient maximum numbers on both axes. (Notice the bloodshot eyes!)

graph with no units correct graph
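
Both reasons can be checked with a little arithmetic. The sketch below (Python, with invented helper names `round_up_axis_max` and `fraction_of_axis_used`) uses the eyesight numbers from the TV table. It assumes you pick the rounding step yourself (25 here); "nice and round" is a judgement call, not a rule, so treat the step as an assumption.

```python
import math


def round_up_axis_max(data_max, step):
    """Round the largest data value up to the next multiple of `step`."""
    return math.ceil(data_max / step) * step


def fraction_of_axis_used(values, axis_min, axis_top):
    """How much of the axis height the change in the data takes up."""
    return (max(values) - min(values)) / (axis_top - axis_min)


# Eyesight remaining (%), from the TV viewing table.
eyesight = [75, 80, 85, 88, 92, 95, 97, 98, 98, 98, 98]

top = round_up_axis_max(max(eyesight), 25)  # 98 rounds up to 100
print(top)                                  # 100

# Axis from 0 to 100: the 23-point change fills under a quarter of the axis.
print(fraction_of_axis_used(eyesight, 0, top))   # 0.23
# Axis from 75 to 100: the same change fills almost the whole axis.
print(fraction_of_axis_used(eyesight, 75, top))  # 0.92
```

Same data, same change, but starting the axis at 75 makes the drop look four times as dramatic.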

Practice ranging the axes

On this screen there are several practice sets of data. For each set, you should think about what data is on the x-axis, and what data goes on the y-axis. Click the buttons to pick the axes, and then click the "check" button to see if you have the answers right (in some cases, the "right" answer is a matter of judgement, and the answers I give are the ones that I think make the most sense).

Decide min and max...

                                           range       min   max
"x": Water Temperature (degrees Celsius)   3 to 23     0     25
"y": Fish Survival (percent)               3 to 100%   0     100

Decide min and max...

                                   range           min   max
"x": Food per Day (grams)          0.1 to 0.9      0     1.0
"y": Growth Rate (grams per day)   0.01 to 0.088   0     0.1

Decide min and max...

                                                     range          min    max
"x": Time (year)                                     1995 to 2003   1995   2005
"y": Deer-Vehicle Collisions (number of incidents)   1244 to 2147   0      2200

Decide min and max...

                               range       min   max
"x": Dosage of Dilaudin (mg)   10 to 220   0     250
"y": Time to relief (min)      1 to 25     0     25

Decide min and max...

                                     range       min   max
"x": Time since conception (weeks)   8 to 32     0     35
"y": Size of Embryo (grams)          55 to 348   0     350

Decide min and max...

                                       range        min   max
"x": Time Spent Studying (hrs)         0 to 78      0     80
"y": Grade on Exam (percent correct)   37 to 111%   0     120

Next step: scale the axes

Maybe you think we're done now, but we're not! We still need to figure out how to spread the numbers out between the smallest and largest. And this is very important: we need to spread those numbers out evenly!

graph with no units correct graph

And just as importantly, we want to make it easy on ourselves later on. If one box on the graph paper stands for 50 units, that will make our job easy, but if one box on the graph paper stands for 42 units, that would make our job difficult, when it comes to locating the dots on the graph.

graph with no units graph with no units

So think about how to spread out the numbers easily and evenly.
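
To see why the size of a box matters, here is a small sketch in Python. The helper name `boxes_from_origin` and the value 125 are made up for illustration; the 50 and 42 units per box come from the example above.

```python
def boxes_from_origin(value, units_per_box):
    """Distance of a value from zero, measured in graph-paper boxes."""
    return value / units_per_box


value = 125  # an arbitrary example value, not taken from any dataset here

easy = boxes_from_origin(value, 50)
hard = boxes_from_origin(value, 42)

print(easy)            # 2.5 -> exactly halfway through the third box
print(round(hard, 2))  # 2.98 -> good luck finding that spot by eye
```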

You want to try to end up with about 4 to 6 labelled ticks. Any more than this and your graph will be slow to produce and hard to read.
So, if your axis ranges from 0 to 200, put labelled ticks every 50 units.
If your axis ranges from 0 to 500, put labelled ticks every 100 units.
If your axis ranges from 0 to 125, put labelled ticks every 25 units.
... and so on.
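
The rule of thumb above can be turned into a rough recipe. This Python sketch is one possible approach, not the only one: it assumes the axis starts at 0, works with whole numbers only, and treats 1, 2, 5 and 25 times a power of ten as the "easy" tick distances (my assumption). The function names are invented.

```python
def candidate_steps(limit):
    """Round step sizes (1, 2, 5, 25 times a power of ten) up to `limit`."""
    steps = set()
    power = 1
    while power <= limit:
        for multiplier in (1, 2, 5, 25):
            steps.add(multiplier * power)
        power *= 10
    return sorted(step for step in steps if step <= limit)


def pick_tick_step(axis_max, min_ticks=4, max_ticks=6):
    """Largest round step giving 4 to 6 labelled ticks on a 0-to-axis_max axis.

    Returns None when no candidate fits; in that case the axis maximum
    probably needs adjusting.
    """
    for step in reversed(candidate_steps(axis_max)):
        ticks = axis_max // step + 1  # count the tick at zero too
        if axis_max % step == 0 and min_ticks <= ticks <= max_ticks:
            return step
    return None


for top in (200, 500, 125):
    print(top, pick_tick_step(top))
# 200 50
# 500 100
# 125 25
```

Trying the largest steps first keeps the number of labelled ticks as small as the 4-to-6 window allows.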


Practice determining labelled tick distance

How far apart are labelled ticks ...

                                           range       min   max   tick distance
"x": Water Temperature (degrees Celsius)   3 to 23     0     25    5
"y": Fish Survival (percent)               3 to 100%   0     100   25

How far apart are labelled ticks ...

                                               range          min    max    tick distance
"x": Time (year)                               1995 to 2005   1995   2005   5
"y": Population Size (number of individuals)   1244 to 2147   0      2200   500

How far apart are labelled ticks ...

                                     range       min   max   tick distance
"x": Time since conception (weeks)   8 to 32     0     35    5
"y": Size of Embryo (grams)          55 to 348   0     350   50

How far apart are labelled ticks ...

                                   range           min   max   tick distance
"x": Food per Day (grams)          0.1 to 0.9      0     1     0.1
"y": Growth Rate (grams per day)   0.01 to 0.088   0     0.1   0.01

How far apart are labelled ticks ...

                               range       min   max   tick distance
"x": Dosage of Dilaudin (mg)   10 to 200   0     220   20
"y": Time to relief (min)      1 to 25     0     25    5

Hmm, in the Dilaudin dataset, the only number that works for "x" is 20, which means you'll have 12 labelled ticks. That would be enough to make me think of making the x axis go up to 250 instead of 220...
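
You can count the ticks without drawing anything. A quick sketch in Python (the helper name `labelled_ticks` is made up, and the 50-unit spacing for the 0 to 250 axis is my assumption, following the rule of thumb from earlier):

```python
def labelled_ticks(axis_max, step):
    """Evenly spaced tick values from 0 up to and including axis_max."""
    return list(range(0, axis_max + 1, step))


ticks_220 = labelled_ticks(220, 20)
ticks_250 = labelled_ticks(250, 50)

print(len(ticks_220), ticks_220)  # 12 ticks: 0, 20, 40, ... 220
print(len(ticks_250), ticks_250)  # 6 ticks: 0, 50, 100, 150, 200, 250
```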

How far apart are labelled ticks ...

                                       range        min   max   tick distance
"x": Time Spent Studying (hrs)         0 to 78      0     80    20
"y": Grade on Exam (percent correct)   37 to 111%   0     120   20

Practice scaling the axes

The preceding steps -- choosing x and y, labeling, deciding on min and max values, and deciding on tick distance -- are steps you need to do for any graph you make, whether you use a computer or sketch the graph by hand. (Often a computer program will do these steps for you, but often the program will get them wrong, too.)

If you are drawing a graph on paper, there is just one more complication. The graph paper has predetermined lines, and you need to make your graph fit on the page or half-page or whatever. You don't want an effect like the graph on the left!

graph with no units correct graph

 

I'm sure there's a formula for doing this, somewhere. But you can also manage by trial and error. On the graph above, let's say you need to approximately fill up a space which has 40 lines for the x-axis.

If you make each line 10 cm, you'll only use up 15 lines -- too few
If you make each line 2 cm, you'll need 75 lines -- too many (This is starting to sound like a nursery story...)
If you make each line 5 cm, you'll need 30 lines -- just right (or at least right enough).
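
The same trial and error can be sketched in Python. The helper names are invented, and the 150 cm axis length is inferred from the numbers above (15 lines at 10 cm per line), so treat it as an assumption:

```python
def lines_needed(axis_max, units_per_line):
    """Number of graph-paper lines the axis takes up at a given scale."""
    return axis_max / units_per_line


def best_scale(axis_max, available, candidates):
    """Pick the scale that fills the most lines without running off the paper."""
    fitting = [c for c in candidates if lines_needed(axis_max, c) <= available]
    # the smallest units-per-line that still fits uses up the most lines
    return min(fitting) if fitting else None


axis_max_cm = 150     # inferred: 15 lines at 10 cm per line
available_lines = 40  # lines available for the x-axis

for units_per_line in (10, 2, 5):
    print(units_per_line, lines_needed(axis_max_cm, units_per_line))
# 10 15.0
# 2 75.0
# 5 30.0

print(best_scale(axis_max_cm, available_lines, (10, 2, 5)))  # 5
```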


Finally, fill in the axes

Time to graph

Water Temp (Celsius)   3     6     9     12    15    19    23
Survival (percent)     100   100   72    34    22    17    3

Time to Graph...

Year           1995   '96    '97    '98    '99    2000   '01    '02    '03
# Collisions   1244   1776   1705   1774   1891   2033   2003   2127   2047

Time to Graph...

Weeks since conception   8     12    16    20    24    28    32
Weight of embryo (g)     55    108   160   201   244   295   348

Time to Graph...

Food (mg)         0.1    0.3     0.5   0.7    0.9
Growth (mg/day)   0.01   0.032   0.6   0.65   0.68

Time to Graph...

Dosage (mg)            10   50   100   150   200
Time to relief (min)   25   10   5     2     1

This one has an "outlier"...

Time Studying (hrs)   0    1    1.5   2    3    5     78
Grade (percent)       37   62   68    87   79   111   82


Final Words of Wisdom, and Some Graph Paper

And here is some graph paper which you can print and use for sketching graphs (for lab reports, you may need to hand in a full-page graph).

graph paper