HTML tutorial
CSS3 tutorial
Bootstrap tutorial
JavaScript tutorial
JQuery tutorial
AngularJS tutorial
React tutorial
NodeJS tutorial
PHP tutorial
Python tutorial
Python3 tutorial
Django tutorial
Linux tutorial
Docker tutorial
Ruby tutorial
Java tutorial
C tutorial
C ++ tutorial
Perl tutorial
JSP tutorial
Lua tutorial
Scala tutorial
Go tutorial
ASP.NET tutorial
C # tutorial
Variance is another number that indicates how spread out the values are
Variance is another number that indicates how spread out the values are.
In fact, if you take the square root of the variance, you get the standard deviation. Or the other way around, if you multiply the standard deviation by itself, you get the variance!
We will first use the data set with 10 observations to give an example of how we can calculate the variance:
Duration | Average_Pulse | Max_Pulse | Calorie_Burnage | Hours_Work | Hours_Sleep |
---|---|---|---|---|---|
30 | 80 | 120 | 240 | 10 | 7 |
30 | 85 | 120 | 250 | 10 | 7 |
45 | 90 | 130 | 260 | 8 | 7 |
45 | 95 | 130 | 270 | 8 | 7 |
45 | 100 | 140 | 280 | 0 | 7 |
60 | 105 | 140 | 290 | 7 | 8 |
60 | 110 | 145 | 300 | 7 | 8 |
60 | 115 | 145 | 310 | 8 | 8 |
75 | 120 | 150 | 320 | 0 | 8 |
75 | 125 | 150 | 330 | 8 | 8 |
Tip: Variance is often represented by the symbol Sigma Square: σ^2
We want to find the variance of Average_Pulse.
1. Find the mean:
The mean is 102.5
2. Find the difference from the mean for each value:
3. Find the square value for each difference:
4. Sum the squared values and find the average:
The variance is 206.25.
We can use the var()
function from Numpy to find the
variance (remember that we now use the first data set with 10 observations):
import numpy as np
var = np.var(health_data)
print(var)
The output:
Here we calculate the variance for each column for the full data set:
import numpy as np
var_full = np.var(full_health_data)
print(var_full)
The output: