got into data science, there was a phrase that we’d all heard; everyone knows it, young and old: “Correlation doesn’t imply causation.” It is a catchy phrase. You’ve definitely said it once or twice, and you might even have nodded confidently when someone else said it. It comes up most often with datasets that have nothing to do with each other, where implying causation is funny and intriguing!

Here are two very interesting facts:

Countries that eat more pizza tend to have higher math scores.

The more sunglasses are sold, the more shark attacks occur.

Now, if that were all the information you had… what should you conclude? Does eating pizza make you better at math? Will buying a new pair of sunglasses cause a shark attack? Though it is funny to think about, the answer to those questions is “probably not”.

And yet, these are examples of something very real: correlation. The question worth asking now is: if correlation doesn’t imply causation, then what does it mean? That’s where things get fuzzy.…