Data Science Statistics Correlation vs Causality

Correlation Does Not Imply Causality

Correlation measures the numerical relationship between two variables.

A high correlation coefficient (close to 1) does not, on its own, allow us to conclude that there is an actual causal relationship between two variables.

A classic example:

During the summer, the sale of ice cream at a beach increases.

At the same time, drowning accidents also increase.

Does this mean that the increase in ice cream sales is a direct cause of the increase in drowning accidents?

The Beach Example in Python

Here, we constructed a fictional data set for you to try:

Example

import pandas as pd
import matplotlib.pyplot as plt

# Fictional data: both columns hold identical values,
# so they are perfectly correlated by construction
Drowning = pd.DataFrame(data={
    "Drowning_Accident": [20, 40, 60, 80, 100, 120, 140, 160, 180, 200],
    "Ice_Cream_Sale": [20, 40, 60, 80, 100, 120, 140, 160, 180, 200]
})

# Scatter plot of ice cream sales against drowning accidents
Drowning.plot(x="Ice_Cream_Sale", y="Drowning_Accident", kind="scatter")
plt.show()

# Pairwise correlation matrix of the two columns
correlation_beach = Drowning.corr()
print(correlation_beach)

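Output: a scatter plot in which all points lie on a straight line, and a correlation matrix in which every entry is 1.0, because the two columns contain identical values.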

Correlation vs Causality - The Beach Example

In other words: can we use ice cream sales to predict drowning accidents?

The answer is: probably not.

It is likely that these two variables are correlated without one causing the other. Both increase during the summer, so they rise and fall together.
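One way to see how two variables can be strongly correlated without one causing the other is to simulate a third factor that drives both. The sketch below is only an illustration under invented assumptions: the temperature variable, the coefficients, and the noise levels are all hypothetical and not taken from real data. It uses only the Python standard library, with a small hand-written pearson helper; pandas' DataFrame.corr() computes the same coefficient by default.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical daily temperatures (degrees Celsius) for 1000 days
temperature = [rng.uniform(10, 35) for _ in range(1000)]

# Both series depend on temperature plus independent noise.
# Neither series is computed from the other.
ice_cream_sale = [5 * t + rng.gauss(0, 10) for t in temperature]
drowning_accident = [0.5 * t + rng.gauss(0, 1.5) for t in temperature]

# Correlation across all days
r_all = pearson(ice_cream_sale, drowning_accident)
print("All days:", round(r_all, 2))

# Correlation restricted to days with similar temperature (24 to 26 degrees)
similar = [i for i, t in enumerate(temperature) if 24 <= t <= 26]
r_similar = pearson(
    [ice_cream_sale[i] for i in similar],
    [drowning_accident[i] for i in similar],
)
print("Days between 24 and 26 degrees:", round(r_similar, 2))
```

Across all days the coefficient is high, even though the simulated ice cream sales never enter the calculation of the simulated drowning accidents. Once the comparison is restricted to days with nearly the same temperature, most of the correlation disappears, which is what we would expect if temperature alone links the two series.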

What causes drowning then?

Unskilled swimmers

Waves

Cramp

Seizure disorders

Lack of supervision

Alcohol (mis)use

etc.

Let us reverse the argument:

Does a low correlation coefficient (close to zero) mean that a change in x does not affect y?

Back to the question:

Can we conclude that Average_Pulse does not affect Calorie_Burnage because of a low correlation coefficient?

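The answer is no. The correlation coefficient only captures linear relationships, so a low value does not rule out that one variable affects the other.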
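To illustrate why a coefficient near zero does not rule out dependence, here is a minimal sketch with made-up data in which y is fully determined by x, yet the Pearson correlation is essentially zero. The variables x and y are hypothetical and unrelated to Average_Pulse or Calorie_Burnage; the pearson helper is hand-written so the example needs no extra packages.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# x is symmetric around zero; y depends on x exactly, but not linearly
x = list(range(-10, 11))
y = [value ** 2 for value in x]

r = pearson(x, y)
print("Correlation between x and x squared:", round(r, 2))
```

Every change in x changes y here, but the increases on one side of zero cancel the decreases on the other, so the linear measure sees no relationship. A scatter plot would reveal the pattern immediately, which is a good reason to plot the data before trusting the coefficient.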

There is an important difference between correlation and causality:

Correlation is a number that measures how closely the data are related.

Causality is the conclusion that x causes y.

Tip: Always think critically about causality when making predictions!
