Correlation vs. Causation
Tools for Thinking Clearly
Two variables moving together does not mean one causes the other — a confusion that underlies much bad reasoning about health, policy, and everyday decisions. This topic covers the distinction between correlation and causation, confounding variables, the Bradford Hill criteria for evaluating causal claims, and the counter-intuitive puzzles of Simpson's paradox.
When Two Things Move Together
In 2012, a note in the New England Journal of Medicine reported something striking: in countries where people ate more chocolate per capita, there were more Nobel laureates per capita. The correlation was real. The data was genuine. And the conclusion that chocolate consumption causes intellectual achievement was, of course, absurd.
This is correlation vs. causation in its most visible form. But the confusion is far less obvious in the claims that actually shape decisions about health, politics, and public policy. When a newspaper reports that 'people who sleep fewer than seven hours a night have a higher rate of cardiovascular disease,' the implied message is causal: sleep deprivation damages the heart. The data may show only a correlation — a relationship between two measurements — without establishing that one causes the other.
What correlation actually means
A correlation describes a statistical relationship between two variables: as one changes, the other tends to change in a predictable way. A positive correlation means they move in the same direction (more of X is associated with more of Y). A negative correlation means they move in opposite directions. A correlation of zero means no linear relationship is detectable; the variables may still be related in a nonlinear way that the correlation coefficient does not capture.
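To make the three cases concrete, here is a minimal sketch that computes the Pearson correlation coefficient by hand. The function name `pearson_r` and the small data sets are illustrative assumptions, not part of the original material. The last case shows that a coefficient of zero does not rule out a dependence: y is fully determined by x, but the relationship is not linear.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalised covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

xs = [-2, -1, 0, 1, 2]                          # illustrative values only
print(pearson_r(xs, [2 * x + 1 for x in xs]))   # same direction: 1.0
print(pearson_r(xs, [-3 * x for x in xs]))      # opposite directions: -1.0
print(pearson_r(xs, [x * x for x in xs]))       # y = x squared, yet r is 0.0
```

The coefficient always lies between -1 and 1 and summarises only how well a straight line describes the data.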
Correlation is measurable, precise, and genuinely informative. It tells us that a pattern exists. What it cannot tell us, by itself, is why the pattern exists — whether X causes Y, Y causes X, or some third factor causes both.
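The "third factor" case can be simulated. The sketch below is a toy model under stated assumptions: a hypothetical hidden variable `z` drives both `x` and `y`, while `x` and `y` never influence each other. All names, the noise levels, and the sample size are invented for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 5000

# Hypothetical confounder z feeds into both x and y.
# Neither x nor y appears in the other's formula.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_raw = pearson_r(x, y)

# Assumption: in this toy model we know exactly how z enters,
# so subtracting it removes its influence completely.
x_without_z = [xi - zi for xi, zi in zip(x, z)]
y_without_z = [yi - zi for yi, zi in zip(y, z)]
r_adjusted = pearson_r(x_without_z, y_without_z)

print(f"correlation of x and y:             {r_raw:.2f}")
print(f"after removing the shared factor z: {r_adjusted:.2f}")
```

In this setup the raw correlation should come out at about 0.5 and the adjusted one close to zero, even though no causal link between `x` and `y` was ever built in. With real data the confounder is rarely known this precisely, so the adjustment step is much harder than the subtraction used here.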
What causation means
Causation means that changing one variable produces a change in another — that there is a mechanism by which X brings Y about. Establishing causation requires more than observing a pattern; it requires ruling out alternative explanations. This is considerably harder than it sounds.
The computer scientist and statistician Judea Pearl has described this distinction using what he calls the 'ladder of causation': the lowest rung involves seeing patterns and associations; the next involves doing (intervening to change something and observing the result); and the highest involves imagining counterfactuals — what would have happened if things had been different. Most data we encounter in everyday life sits on the lowest rung (Pearl & Mackenzie, 2018, pp. 28–51). Correlation lives there. Causation requires climbing higher.
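The gap between the first two rungs can be illustrated with a small simulation. This is a toy sketch, not Pearl's formal framework: the hidden factor `z`, the probabilities, and the function names are all assumptions made for the example. By construction, `x` has no effect on `y`. "Seeing" compares groups as they occur naturally; "doing" assigns `x` by coin flip, which cuts its link to `z`.

```python
import random

def mean(values):
    return sum(values) / len(values)

def outcome_gap(assign_x, rng, n=20000):
    """Mean y among units with x minus mean y among units without x.

    z is a hidden background factor that raises y.
    x never enters the formula for y.
    """
    y_with_x, y_without_x = [], []
    for _ in range(n):
        z = rng.random() < 0.5                      # hidden factor, present half the time
        x = assign_x(z, rng)                        # how x comes about
        y = (1.0 if z else 0.0) + rng.gauss(0, 1)   # y depends on z only
        (y_with_x if x else y_without_x).append(y)
    return mean(y_with_x) - mean(y_without_x)

def observe(z, rng):
    # Rung one, "seeing": x arises naturally and is more likely when z is present
    return rng.random() < (0.8 if z else 0.2)

def intervene(z, rng):
    # Rung two, "doing": x is set by a coin flip that ignores z
    return rng.random() < 0.5

rng = random.Random(1)   # fixed seed so the run is repeatable
seen_gap = outcome_gap(observe, rng)
done_gap = outcome_gap(intervene, rng)

print(f"gap in y when x is merely observed: {seen_gap:.2f}")
print(f"gap in y when x is assigned:        {done_gap:.2f}")
```

Under these assumptions the observed gap should be clearly positive (around 0.6) while the gap under assignment stays near zero. Random assignment is the idea behind a randomised controlled trial: it removes the path from `z` to `x`, so any remaining gap in `y` would have to come from `x` itself.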
Why this confusion is so persistent
The human mind is pattern-seeking by design. Identifying correlations in a noisy environment was, evolutionarily speaking, useful: if eating certain plants tended to precede illness in your community, avoiding them was prudent, even if the true cause of the illness was something else entirely. We are not naturally built to demand causal mechanisms before acting on patterns.
This tendency is amplified in the modern information environment. Headlines are written to suggest action ('Coffee reduces risk of diabetes'), not nuance ('Coffee consumption is correlated with lower rates of diabetes in some observational studies, though the mechanism is unclear and confounding factors have not been fully ruled out'). The former is readable. The latter is accurate.