Hypothesis testing is a formal way of checking if a hypothesis about a population is true or not.
A hypothesis is a claim about a population parameter.
A hypothesis test is a formal procedure to check if a hypothesis is true or not.
Examples of claims that can be checked:
The average height of people in Denmark is more than 170 cm.
The share of left handed people in Australia is not 10%.
The average income of dentists is less than the average income of another profession, such as lawyers.
Hypothesis testing is based on making two different claims about a population parameter.
The null hypothesis (\(H_{0} \)) and the alternative hypothesis (\(H_{1}\)) are the claims.
The two claims need to be mutually exclusive, meaning only one of them can be true.
The alternative hypothesis is typically what we are trying to prove.
For example, we want to check the following claim:
"The average height of people in Denmark is more than 170 cm."
In this case, the parameter is the average height of people in Denmark (\(\mu\)).
The null and alternative hypotheses would be:
Null hypothesis: The average height of people in Denmark is 170 cm.
Alternative hypothesis: The average height of people in Denmark is more than 170 cm.
The claims are often expressed with symbols like this:
\(H_{0}\): \(\mu = 170 \: cm \)
\(H_{1}\): \(\mu > 170 \: cm \)
If the data supports the alternative hypothesis, we reject the null hypothesis and accept the alternative hypothesis.
If the data does not support the alternative hypothesis, we keep the null hypothesis.
Note: The alternative hypothesis is also referred to as \(H_{A}\).
The significance level (\(\alpha\)) is the uncertainty we accept when rejecting the null hypothesis in the hypothesis test.
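The significance level is the probability of rejecting the null hypothesis when it is actually true, usually stated as a percentage.
Typical significance levels are 10% (\(\alpha = 0.10\)), 5% (\(\alpha = 0.05\)), and 1% (\(\alpha = 0.01\)).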
A lower significance level means that the evidence in the data needs to be stronger to reject the null hypothesis.
There is no "correct" significance level - it only states the uncertainty of the conclusion.
Note: A 5% significance level means that if the null hypothesis is true, we expect to reject it by mistake in about 5 out of 100 tests.
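The meaning of the significance level can be checked with a small simulation. The sketch below is not part of the original chapter: it assumes a hypothetical population where the null hypothesis is true (mean 170, standard deviation 10), runs many right-tailed tests on random samples, and counts how often the null hypothesis is rejected anyway. All names and numbers are illustrative assumptions.

```python
import random
from statistics import NormalDist

def false_rejection_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Share of simulated tests that reject a null hypothesis that is true."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    mu0, sigma = 170.0, 10.0           # hypothetical population; H0 (mu = 170) is true
    critical = NormalDist().inv_cdf(1 - alpha)  # right-tailed critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu0) / (sigma / n ** 0.5)  # standardized test statistic
        if z > critical:
            rejections += 1
    return rejections / trials

rate = false_rejection_rate()
print(f"Rejected a true null hypothesis in {rate:.1%} of tests")
```

The printed share should land close to the chosen significance level, and lowering `alpha` lowers it accordingly.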
The test statistic is used to decide the outcome of the hypothesis test.
The test statistic is a standardized value calculated from the sample.
Standardization means converting a statistic to a scale where it follows a well-known probability distribution.
The type of probability distribution depends on the type of test.
Common examples are the standard normal distribution (Z) and the Student's t-distribution (T).
Note: You will learn how to calculate the test statistic for each type of test in the following chapters.
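As a rough preview of what standardization looks like, here is a minimal sketch for one specific case: a test of a mean where the population standard deviation is assumed to be known, so the statistic follows a standard normal distribution. The function name and the sample numbers are hypothetical, and other test types use different formulas.

```python
from math import sqrt

def z_statistic(sample_mean, mu0, sigma, n):
    """Distance between the sample mean and the claimed mean, in standard errors."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu0) / standard_error

# Hypothetical sample: 100 people with an average height of 172.3 cm,
# tested against H0: mu = 170 with an assumed sigma of 10 cm.
z = z_statistic(sample_mean=172.3, mu0=170.0, sigma=10.0, n=100)
print(f"z = {z:.2f}")
```

A value of 0 would mean the sample mean equals the claimed mean exactly; larger values mean the sample lies further from what the null hypothesis predicts.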
There are two main approaches used to reach the conclusion of a hypothesis test: the critical value approach and the p-value approach.
The Critical Value Approach
The critical value approach checks if the test statistic is in the rejection region.
The rejection region is an area of probability in the tails of the distribution.
The size of the rejection region is decided by the significance level (\(\alpha\)).
The value that separates the rejection region from the rest is called the critical value.
If the test statistic is inside this rejection region, the null hypothesis is rejected.
For example, in a right-tailed test with a test statistic of 2.3 and a critical value of 2 at a significance level of \(\alpha = 0.05\), the test statistic is inside the rejection region:
We reject the null hypothesis (\(H_{0}\)) at a 0.05 significance level (\(\alpha\)).
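The decision rule can be sketched in a few lines. This example assumes a right-tailed test whose statistic follows a standard normal distribution; under that assumption the critical value at \(\alpha = 0.05\) is about 1.645 rather than the round value of 2 used in the text. The function name is hypothetical.

```python
from statistics import NormalDist

def reject_right_tailed(test_statistic, alpha):
    """Return (reject?, critical value) for a right-tailed test on a standard normal."""
    # The rejection region is the upper tail with area alpha.
    critical_value = NormalDist().inv_cdf(1 - alpha)
    return test_statistic > critical_value, critical_value

reject, critical = reject_right_tailed(test_statistic=2.3, alpha=0.05)
print(f"critical value = {critical:.3f}, reject H0: {reject}")
```

For a left-tailed or two-tailed test the rejection region sits in a different tail or is split across both, so the comparison would change.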
The P-Value Approach
The p-value approach checks if the p-value of the test statistic is smaller than the significance level (\(\alpha\)).
The p-value is the probability, assuming the null hypothesis is true, of getting a test statistic at least as extreme as the one calculated from the sample. It is the area in the tail (or tails) of the distribution beyond the value of the test statistic.
If the p-value is smaller than the significance level, the null hypothesis is rejected.
The p-value directly tells us the lowest significance level where we can reject the null hypothesis.
For example, if the p-value is 0.03:
We reject the null hypothesis (\(H_{0} \)) at a 0.05 significance level (\(\alpha\))
We keep the null hypothesis (\(H_{0}\)) at a 0.01 significance level (\(\alpha\))
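The same comparison can be sketched in code. This example assumes a right-tailed test on a standard normal distribution and uses a hypothetical test statistic of 1.88, chosen because its p-value is roughly 0.03.

```python
from statistics import NormalDist

def right_tailed_p_value(z):
    """Area under the standard normal curve to the right of z."""
    return 1 - NormalDist().cdf(z)

p = right_tailed_p_value(1.88)  # hypothetical test statistic
print(f"p-value = {p:.3f}")

# Compare the p-value against two common significance levels.
decisions = {}
for alpha in (0.05, 0.01):
    decisions[alpha] = "reject H0" if p < alpha else "keep H0"
    print(f"alpha = {alpha}: {decisions[alpha]}")
```

Because the p-value is compared directly with \(\alpha\), one calculation gives the conclusion for any significance level.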
Note: The two approaches are only different in how they present the conclusion.
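The following steps are used for a hypothesis test: check the conditions, define the claims, decide the significance level, calculate the test statistic, and state the conclusion.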
One condition is that the sample is randomly selected from the population.
The other conditions depend on what type of parameter you are testing the hypothesis for.
Common parameters to test hypotheses about are proportions (for qualitative data) and mean values (for numerical data).
You will learn the steps for both types in the following pages.