We hear it all the time: correlation is not causation. But what does this really mean? In this video, we'll discuss what correlations are and why a correlation alone cannot establish causation, because of the third variable and bidirectionality problems.
First, let's introduce correlation. A correlation means that two variables are related. Variables are anything that can be measured: for example, how often we eat cheese and how often we get tangled up in our bed sheets. We can take measurements of each variable and see whether the two measures correlate, or change together. So here we have a correlation between cheese consumption and bed-sheet tangling resulting in death.
When we observe a relationship or correlation between variables, we are often inclined to conclude causation: that one variable, cheese consumption, causes the other variable, bed-sheet tangling, to change. Although a correlation may suggest causation, it cannot be the only evidence used to conclude causation. A correlation just implies an association.
One of the most powerful of these inclinations is the causal illusion: the belief that if one thing happens and then another thing happens, the first thing caused the other. Some scholars think that this causal illusion is at the heart of pseudoscience.
One reason we can't establish causation from correlated variables is that we have no way of knowing what caused what. This is called the bidirectionality problem. For example, based on the correlation alone, we don't know whether eating cheese causes us to tangle in our bed sheets, or whether tangling in our bed sheets results in the need to eat more cheese. To establish causation, we would have to run an experiment: manipulate one variable, for example cheese consumption, and observe the effects on the other variable, bed-sheet tangling.
Experimental tests allow us to assess not only whether our observed relationship is causal, but also the direction of the relationship.
Another reason we cannot establish causation from a correlation is the third variable problem. A correlation between two variables ignores other factors. Third variables like temperature or lifestyle factors may be causing both the increase in cheese consumption and the increase in bed-sheet tangling. That would explain the observed relationship while debunking the explanation that cheese consumption causes bed-sheet tangling or vice versa. The third variable problem highlights the importance of ruling out rival hypotheses: until we can rule out a third variable causing the change in our measured variables, we cannot conclude causation. Certain types of multivariate statistics account for additional variables to control for possible third variable problems. That being said, third variables relevant to the correlation may never have been measured, so we cannot be certain about all the factors that may also be associated with our variables of interest.
Despite how convincing they can look, correlations are just that: relationships between two variables. Sometimes correlations make no sense whatsoever. Cheese consumption and bed-sheet tangling increasing together may be mere coincidence, a spurious correlation, which can lead us to hypothesize a causal relationship between two variables when none exists. So we cannot conclude anything other than a relationship from a correlational finding.
So just teaching a simple thing like the causal illusion can allow people to see the world in a more critical light. Let's remember that the causal illusion leads to things like people believing that vaccines cause autism; it leads people to believe that homeopathy can cure cancer; it leads people to believe that other forms of alternative therapy produced a specific outcome when they did not in fact cause it.
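The third variable problem described above can be illustrated with a small simulation. The sketch below is not from the video: the data are synthetic, the names `temperature`, `cheese`, and `tangling` are made up for illustration, and partial correlation is used as one simple example of the multivariate statistics mentioned earlier. Both measures are generated from a shared third variable plus independent noise, so neither one influences the other.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_r(xs, ys, zs):
    """Correlation between xs and ys after controlling for zs."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate(n=5000, seed=0):
    """Generate synthetic data in which a third variable drives both measures."""
    rng = random.Random(seed)
    # Hypothetical third variable (assumed here to be temperature), arbitrary units.
    temperature = [rng.gauss(0, 1) for _ in range(n)]
    # Each measure is the third variable plus its own independent noise;
    # cheese never affects tangling, and tangling never affects cheese.
    cheese = [t + rng.gauss(0, 1) for t in temperature]
    tangling = [t + rng.gauss(0, 1) for t in temperature]
    return temperature, cheese, tangling


temperature, cheese, tangling = simulate()
raw = pearson_r(cheese, tangling)
controlled = partial_r(cheese, tangling, temperature)
print(f"raw correlation: {raw:.2f}")
print(f"controlling for temperature: {controlled:.2f}")
```

In this setup the raw correlation between the two measures should come out clearly positive, while the correlation after controlling for the third variable should sit close to zero. This only works because the third variable was measured; an unmeasured one could not be controlled for this way.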
That being said, correlational findings are an excellent source of ideas for generating hypotheses. Given the correlation we discussed between cheese consumption and bed-sheet tangling, what can you conclude? What would you hypothesize, and how would you go about testing this hypothesis?