Correlation Versus Causation
It's important to help students understand that correlation does not imply causation.
By Donna Iadipaolo
When teaching students higher-level math skills, it’s important to make sure that they fully understand the concepts. For example, students commonly mix up the terms correlation and causation. Real-world examples can help them see the difference between the two. Here’s an example that might resonate with teenagers: studies have shown that smoking causes cancer, while smoking is merely correlated with alcoholism. The physical act of smoking causes cancer because of the chemicals in cigarettes and their smoke. The correlation, by contrast, is the observation that people who smoke are also more likely to abuse alcohol. Smoking does not cause alcoholism, but there is a relationship between the two.
Graphing the Results
To help your students better understand correlation, you can have them create scatter plots. The correlation between two variables can be judged by plotting the paired data on a scatter plot. A scatter plot can reveal three kinds of correlation: no correlation, positive correlation, and negative correlation. Place one variable on the y-axis (usually the dependent variable) and the other on the x-axis (usually the independent variable). Below is an example of how a scatter plot can be helpful:
- If you wanted to see whether there is a correlation between the number of minutes a basketball player plays in a game and the number of points that s/he scores, you might plot the time played on the x-axis and the points scored for each player on the y-axis.
- As it happens, there is a positive correlation between the amount of time a basketball player spends in a game and the number of points that s/he scores.
- Note that this correlation does not establish causation between the two values. That is, simply spending time in a basketball game does not cause a player to score a greater number of points.
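Students who want to put a number on what the scatter plot shows can compute the correlation coefficient. The sketch below is one way to do that in Python; the minutes and points are invented for illustration and are not real player statistics, and the function name `pearson_r` is simply a label chosen here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data, invented for illustration: one pair per player.
minutes_played = [10, 15, 20, 25, 30, 35]  # x-axis
points_scored = [4, 7, 9, 12, 13, 18]      # y-axis

r = pearson_r(minutes_played, points_scored)
print(f"r = {r:.2f}")
```

A value of r close to +1 indicates a strong positive correlation, a value close to -1 a strong negative correlation, and a value near 0 little or no linear correlation. For this made-up data r is about 0.99, but that number by itself says nothing about whether one variable causes the other.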
The Mantra Every Student Should Know
“Correlation does not imply causation” is a mantra I heard as an undergraduate engineering student at the University of Michigan. The concept is important enough that students benefit from researching it and writing about it. Doing so helps them solidify their understanding of correlation and causation through both the linguistic and the logical-mathematical intelligences.
Science provides another way to explore the concepts of correlation and causation. You can have students, for instance, read medical studies that link exposure to certain chemicals with various diseases. Students might delve into an analysis of the 3,800 individual chemicals in cigarettes, including those that cause cancer. Other students might want to conduct environmental studies, like examining the chemicals in drinking water, the air, and food. You can connect this concept to math by having students analyze the dosage and strength of a chemical compound and the length of exposure. Parts-per-million and parts-per-billion notations are common not only in environmental studies but also in science and engineering, so they should become second nature to students in a math class. This, in turn, connects to further exploration in the social sciences in relation to federal regulations. On the technology side, exploring the correlation coefficient and lines of best fit is essential. Here are some lessons that relate to these and other concepts.
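Before turning to those lessons, the parts-per-million and parts-per-billion arithmetic mentioned above can be sketched in a few lines of Python. The function names and the sample masses are assumptions made up for this illustration; the only facts relied on are that one part per million is one part in 1,000,000 and that 1 ppm equals 1,000 ppb.

```python
def parts_per_million(solute_mass, total_mass):
    """Concentration in ppm; both masses must be in the same unit."""
    if total_mass <= 0:
        raise ValueError("total_mass must be positive")
    return solute_mass / total_mass * 1_000_000


def ppm_to_ppb(ppm):
    """Convert parts per million to parts per billion (1 ppm = 1,000 ppb)."""
    return ppm * 1_000


# Hypothetical sample: 0.002 g of a contaminant in 1,000 g of water.
concentration_ppm = parts_per_million(0.002, 1000)
concentration_ppb = ppm_to_ppb(concentration_ppm)
print(f"{concentration_ppm:.1f} ppm = {concentration_ppb:.0f} ppb")
```

Students can change the two masses to match figures from a study they are reading and check their hand calculations against the result.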
Causation and Correlation Lesson Plans:
Students connect basketball to mathematics. They compare data to see how the number of points scored relates to the number of minutes played to determine correlations between the two sets of data.
Students compare and contrast linear and exponential models. Students plot data, find a regression line, and use the correlation coefficient to determine which model is a better fit.
Students investigate the line of best fit on a scatterplot. They identify the correlation as positive, negative, or none.
Students examine the concept of the correlation coefficient. Students use technology to determine positive, negative, or no relationship.
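For the lesson that compares linear and exponential models, one simple way to use the correlation coefficient is sketched below in Python. It rests on an assumption that is a classroom simplification rather than a rigorous model-selection method: an exponential relationship becomes a straight line when the y-values are replaced by their natural logarithms, so the model whose (possibly transformed) data lies closer to a line gets the larger |r|. The data is invented for illustration.

```python
from math import log, sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data, invented for illustration: y doubles at each step.
xs = [0, 1, 2, 3, 4, 5]
ys = [2, 4, 8, 16, 32, 64]

# Linear model: correlate x with y directly.
r_linear = pearson_r(xs, ys)

# Exponential model: correlate x with ln(y); all y-values must be positive.
r_exponential = pearson_r(xs, [log(y) for y in ys])

better = "exponential" if abs(r_exponential) > abs(r_linear) else "linear"
print(f"linear r = {r_linear:.3f}, exponential r = {r_exponential:.3f}")
print(f"better fit: {better}")
```

Because the sample data doubles at every step, the logarithms fall on a straight line and the exponential model wins. With real classroom data the two values are usually closer, which makes for a useful discussion of how much difference is meaningful.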