Thursday, March 02, 2006

Misleading graphs

I want to call attention to a graph I published yesterday (source here) showing the relationship between asphalt use in Finland and asthma rates.

There is something very misleading, even wrong, about this graph: the asthma rate is plotted for 20 years longer than asphalt use. This causes several problems.

1. It tricks you into thinking that the asthma/asphalt correlation has gone on much longer than it really has.

2. It makes the correlation look much tighter. Cover up the area of the graph from 1990 on and the dip in asphalt use between 1980 and 1990 is much more prominent.

Why would you ever present data like this? I can think of no reason to continue graphing one variable and not the other, especially for a period taking up 1/4 of the whole graph. The relationship is a good one without the extra data, and padding the graph this way just blows your credibility in the eyes of a careful reader. I also can't help but wonder what asphalt use really was in 1995 and 2000. Maybe it plateaued?
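For anyone who wants to do this the honest way, here is a minimal sketch in Python of trimming both series to the years they share before plotting or correlating them. The function names are my own, and the numbers are made-up placeholders, not the Finnish data, which I don't have.

```python
from math import sqrt


def trim_to_overlap(a, b):
    """Keep only the years present in both series (dicts of year -> value)."""
    years = sorted(set(a) & set(b))
    return years, [a[y] for y in years], [b[y] for y in years]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Placeholder values for illustration only; these are NOT the real figures.
asphalt = {1960: 1.0, 1970: 2.0, 1980: 3.0, 1990: 2.5}
asthma = {1960: 0.5, 1970: 1.0, 1980: 1.5, 1990: 2.0, 2000: 3.0}

years, asphalt_vals, asthma_vals = trim_to_overlap(asphalt, asthma)
print(years)  # only the years where both variables were measured
print(pearson(asphalt_vals, asthma_vals))
```

The point is simply that the plot and the correlation both use the same window, so nothing on the page extends past the shorter series.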


At 12:36 PM, Blogger That Girl said...

Maybe because they couldn't get the data for asphalt for that time period?

At 1:29 PM, Blogger Stephen said...

I've not read the text. Is there any reason to believe asphalt has some causal relationship? One would expect more asphalt with population growth, right? Just because it is there in a graph doesn't mean it is true. That's like saying, my computer said so, therefore it's true. This concept predates computers:

On two occasions I have been asked [by members of Parliament!], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
--Charles Babbage

At 5:09 PM, Blogger Dr. Andy said...

If they didn't have the data, I think they should have just cut off the graph at the point the asphalt data ended.

Asphalt is just a marker for urbanization, which the authors hypothesize drives increased asthma. Note that the Y axis is the rate of asthma, not total cases. If you didn't read the original post, it discusses the whole idea rather than just the problems I have with the graphs.

Maybe there are some mistakes in the data collection, doctor?

