There’s a lot of buzz on the listservs about a newly launched website called Spurious Correlations. The site reports a new “correlation” each day. For example, the site points out that the correlation in Maine between eating margarine and divorce is over 99 percent. Does this mean that Maine residents who want to maintain their matrimonial bond need to switch right on over to butter? I mean, 99 percent seems pretty darn compelling, right? There are lots of other important correlations listed on the site, including:
Per capita consumption of mozzarella cheese and civil engineering doctorates awarded–95%
Honey producing bee colonies and the marriage rate in Vermont–93%
US domestic price of uranium with accidental poisoning by alcohol–97%
Spurious Correlations is a wonderful tool for demonstrating that oh so important axiom, “correlation is not causation”. This means that just because two things tend to happen together does not necessarily mean that one causes the other. They might have a third agent which is causing them to happen together or they might have no relationship to one another whatsoever outside of a random statistical similarity.
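If you like to see this sort of thing for yourself, here is a minimal Python sketch of how two unrelated numbers can end up with a sky-high correlation. Everything in it is made up for illustration: `series_a` and `series_b` are hypothetical yearly figures that each drift upward on their own, with no connection between them in the code. It is not how the Spurious Correlations site computes anything, just a toy version of the idea.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
years = range(10)

# Hypothetical, invented series: each one simply rises over time with its
# own independent noise. Neither one is computed from the other.
series_a = [100 + 5 * t + rng.gauss(0, 2) for t in years]
series_b = [20 + 3 * t + rng.gauss(0, 1) for t in years]

r = pearson(series_a, series_b)
print(f"correlation between two unrelated rising series: {r:.2f}")
```

The only thing these two series share is the passage of time, yet the correlation comes out very close to 1. Lots of things go up (or down) over a decade, which is a big part of why the site never runs out of material.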
I think that Spurious Correlations is a fascinating site. I’ve spent way too much time tooling around in there. But I also think it is an important tool for helping us understand our world, because so many of the people writing and talking about science, on websites and blogs, on television, in magazines and newspapers, get this relationship between correlation and causation so very wrong. I think in some cases the writers and speakers don’t understand the difference. But in other cases, I think the writers are very clear about the difference and simply report correlation as causation because it makes better headlines or sells more product. Take this blog post for example. I don’t have any proof that buying margarine causes a single divorce in Maine. But I imply that there might be a cause by asking the question in the headline: “Does Eating Margarine Cause Divorce?” It’s easy to see why I did that. “Per Capita Margarine Consumption in Maine Closely Correlates with Divorce Rate” just doesn’t have the same ring to it. But I think most people would agree that even though butter tastes a whole lot better, eating oleo is unlikely to be the cause of divorce. Either something else is going on to connect these two statistics, or they are completely unrelated. So the difference between correlation and causation here is pretty easy to spot.
But what about the correlation between the total number of computer science doctorates awarded and total arcade revenue? These two facts correlate at over 98 percent. And it would be pretty easy to formulate a theory about how these two facts are related. Maybe when there are more computer science students, it means there are more nerds who love to play arcade games. Maybe more computer science doctorates means there are more nerds qualified to design and implement great arcade games. With just the tiniest whiff of a potential relationship, our minds naturally leap to find ways that one of these facts could cause the other. But there remains the very distinct possibility that there is no causal relationship whatsoever between these two statistics.
I find this particularly relevant in our current national hysteria over obesity. It seems every week there is a new study claiming that this thing or that thing causes obesity. And everywhere you look you see “proof” that obesity causes this problem or that problem. But I think it is important for us to keep our wits about us and take a look at whether these studies can sufficiently demonstrate that two correlated facts have a causal relationship. For example, people are spending more time in front of computer screens than ever before. Some have suggested that increased screen time causes obesity. But do we know that is true? Or are these things simply happening at the same time? We also have more 24-hour gyms than in the previous century. Is it reasonable to suggest that the increase in 24-hour gyms causes obesity? Maybe dieting causes obesity, or exposure to certain plastics? Heck, based on the correlation, one could easily suggest that talking about obesity increases obesity levels! And how about the rise in medical insurance costs and the rise in obesity? Does a larger number of fat people cause higher insurance rates, or is there something else going on? The question of the rise in health insurance rates is detailed and complex, but how many people have simply jumped to the conclusion that the fatties are making their monthly premiums higher? How many of us take the time to understand that the only way we can prove that one thing causes another is through careful experimentation, where as many other variables as possible are ruled out and a causal agent is ultimately found?
So when you come across studies that demonstrate a relationship between, say, obesity and heart disease or obesity and cognitive function, I urge you not to just jump blindly onto the causation train. Ask yourself a few questions:
- Has this study adequately controlled for other causal factors? Has it controlled for diet, physical activity levels, socioeconomic status, access to good healthcare, education, etc.?
- Has this study identified a causal link that demonstrates why these two things are happening at the same time?
- Is it possible that these two statistics are simply randomly related with no causal relationship whatsoever?
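To make the first question a little more concrete, here is a small Python sketch of what “controlling for” a third factor can look like. All of the data is invented: `hidden` stands in for some unmeasured third agent, and `thing_a` and `thing_b` are two hypothetical measurements that each depend on it but never on each other. Real studies use far more careful methods than this; it is only meant to show the shape of the idea.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def residuals(ys, zs):
    """What is left of ys after removing a straight-line fit on zs."""
    n = len(ys)
    mean_y = sum(ys) / n
    mean_z = sum(zs) / n
    slope = (sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
             / sum((z - mean_z) ** 2 for z in zs))
    return [(y - mean_y) - slope * (z - mean_z) for y, z in zip(ys, zs)]

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical third agent that drives both measurements.
hidden = [rng.gauss(0, 1) for _ in range(2000)]
# Each measurement depends on the hidden factor plus its own noise.
thing_a = [h + rng.gauss(0, 0.5) for h in hidden]
thing_b = [h + rng.gauss(0, 0.5) for h in hidden]

raw_r = pearson(thing_a, thing_b)
controlled_r = pearson(residuals(thing_a, hidden), residuals(thing_b, hidden))

print(f"before controlling for the hidden factor: {raw_r:.2f}")
print(f"after controlling for the hidden factor:  {controlled_r:.2f}")
```

Before the hidden factor is accounted for, the two measurements look strongly related. Once its influence is subtracted out of both, the leftover correlation drops to roughly zero. Of course, this only works when the third factor has actually been measured, which is exactly why that first question is worth asking.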
That is not to say that correlation and causation never go together. All causal relationships are also correlations. But not all correlations contain causation. These are important facts to keep in mind the next time you read a headline screaming about the causes of obesity or harm caused by obesity–or the next time you decide to buy margarine in Maine.
Jeanette (AKA The Fat Chick)
P.S. Want to go on a virtual vacation? Ragen and I over at the Fit Fatties Forum are launching Virtual Vacations that allow you to exercise while virtually visiting some of the world’s most fabulous cities!