Happy Sunday, everyone. The latest "high fat dairy is actually linked to better metabolic health, oh my what a paradox!" study got me thinking about causation versus correlation.

Everyone loves to say correlation isn’t causation. Or correlation doesn’t prove causation. Or some permutation of that same basic idea: an association between two variables cannot prove that one causes the other.

But there’s another piece to that: what does lack of correlation imply? Lack of correlation usually means a lack of causation. If there’s a strong correlation between two variables, it could be causation or it could be spurious. Both possibilities exist. If there is no correlation between two variables, causation is nigh on impossible. If high fat dairy causes heart disease, you will see a correlation.

What about mixed results? What if sometimes there’s a correlation and sometimes there isn’t? That suggests that causation is possible but that other factors may determine whether the causal relationship shows up. If high LDL always causes heart disease, you will always see a correlation. But you don’t. Sometimes it links up, sometimes it doesn’t. To me, that suggests that something else is involved. Either there are confounding variables that correlate with LDL but are the actual causes of the heart disease, or there are other variables that must be present for LDL to cause heart disease.

Anyway, I never see that side of it discussed, but it can really help you think about studies as they’re released. It gives me a solid foundation from which to begin analysis. Hopefully it helps you.

I’m trying to imagine a time where there was no correlation in the data but the one variable still caused the other. Can you think of one? If you do, let me know in the comment section of Weekly Link Love.
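For anyone who likes to tinker, here is a rough Python sketch of the "mixed results" idea above: an exposure that only raises an outcome when some other factor is present. Everything in it is made up for illustration. The names `simulate_study` and `cofactor_rate`, the effect sizes, and the noise levels are assumptions, not numbers from any real LDL or dairy study. It only shows how the same underlying cause could look strong in one population and nearly invisible in another.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient, no libraries needed."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_study(cofactor_rate, n=5000, seed=0):
    """Hypothetical study where exposure affects the outcome
    only in people who also carry a second factor (the cofactor)."""
    rng = random.Random(seed)
    exposure, outcome = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)                          # made-up exposure level
        has_cofactor = rng.random() < cofactor_rate  # is the second factor present?
        effect = x if has_cofactor else 0.0          # the effect needs the cofactor
        y = effect + rng.gauss(0, 1)                 # plus unrelated noise
        exposure.append(x)
        outcome.append(y)
    return pearson(exposure, outcome)


if __name__ == "__main__":
    # Same causal rule each time; only how common the cofactor is changes.
    for rate in (0.0, 0.1, 0.9):
        r = simulate_study(rate)
        print(f"cofactor in {rate:.0%} of people: r = {r:+.2f}")
```

In this toy setup, the correlation should come out noticeably stronger when the cofactor is common and hover near zero when it is absent, even though the rule connecting exposure and outcome never changes. That is one way "sometimes it links up, sometimes it doesn’t" could arise, under these stated assumptions.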