Here we’ll go over the fundamental concepts of hypothesis testing. Generally we want to decide between two hypotheses. Let’s say we have two web pages, and we assume that each page’s click-through rate does not vary across time or users (that is, clicks are iid for each web page across users and time). We want to compare the click-through rates $p_1$ and $p_2$.
We have two hypotheses:
- $H_0$: the null hypothesis, which represents a standard assumption. In this case it could be that $p_1 = p_2$. That is, the two pages have the same click-through rate.
- $H_1$: the alternative hypothesis. This is that they are different, $p_1 \neq p_2$.
We have two types of errors.
- Type 1 error: we reject the null hypothesis when it is true. That is, we conclude that the click-through rates are different, even when they aren’t.
- Type 2 error: we fail to reject the null hypothesis when the alternative hypothesis is true. That is, we miss a real difference between the click-through rates.
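To make the two error types concrete, here is a small simulation sketch. The decision rule (reject when the two click counts differ by more than a fixed number), the rates, the sample size, and all function names are assumptions chosen purely for illustration; this is not a recommended test.

```python
import random


def simulate_clicks(rng, rate, n_users):
    """Count clicks among n_users iid visitors, each clicking with probability `rate`."""
    return sum(rng.random() < rate for _ in range(n_users))


def rejects_null(clicks_a, clicks_b, max_gap):
    """Hypothetical decision rule: reject the null when the counts differ by more than max_gap."""
    return abs(clicks_a - clicks_b) > max_gap


def error_rates(rate_a, rate_b_alt, n_users=500, max_gap=20, n_trials=1000, seed=0):
    """Estimate (Type 1 rate, Type 2 rate) of the rule above by repeated simulation."""
    rng = random.Random(seed)
    type1 = 0
    type2 = 0
    for _ in range(n_trials):
        # World where the null is true: both pages share the same rate.
        a = simulate_clicks(rng, rate_a, n_users)
        b = simulate_clicks(rng, rate_a, n_users)
        if rejects_null(a, b, max_gap):
            type1 += 1  # rejected a true null hypothesis

        # World where the alternative is true: the rates differ.
        a = simulate_clicks(rng, rate_a, n_users)
        b = simulate_clicks(rng, rate_b_alt, n_users)
        if not rejects_null(a, b, max_gap):
            type2 += 1  # failed to reject a false null hypothesis
    return type1 / n_trials, type2 / n_trials


# Illustrative values only: 10% vs 10% under the null, 10% vs 15% under the alternative.
type1_rate, type2_rate = error_rates(0.10, 0.15)
print(f"estimated Type 1 error rate: {type1_rate:.3f}")
print(f"estimated Type 2 error rate: {type2_rate:.3f}")
```

Raising `max_gap` in this sketch makes the rule reject less often, which lowers the Type 1 error rate at the cost of a higher Type 2 error rate.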
An important concept is the p-value, which is the probability of observing a result at least as extreme as the one we actually observed, given that the null hypothesis is true. A small p-value means the observed data would be unlikely if the null hypothesis were true. We select a threshold for the p-value (the significance level, commonly 0.05) and reject the null hypothesis if the p-value falls below it; otherwise we fail to reject it. Next time, we will describe how to actually carry out the test and decide whether or not to reject the null hypothesis.
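Leaving the formal test for next time, the definition of the p-value can be illustrated directly by simulation. The sketch below makes several assumptions that are not from the text above: "extreme" is measured by the absolute difference in observed click-through rates (a two-sided choice), the common rate under the null hypothesis is estimated by pooling both pages, and the function names and example counts are hypothetical.

```python
import random


def simulate_clicks(rng, rate, n_users):
    """Count clicks among n_users iid visitors, each clicking with probability `rate`."""
    return sum(rng.random() < rate for _ in range(n_users))


def simulated_p_value(clicks_a, n_a, clicks_b, n_b, n_sims=1000, seed=0):
    """Fraction of null-hypothesis simulations at least as extreme as the observed data."""
    rng = random.Random(seed)
    # |clicks_a/n_a - clicks_b/n_b| scaled by n_a*n_b, kept as an integer
    # so that ties are compared exactly rather than in floating point.
    observed_gap = abs(clicks_a * n_b - clicks_b * n_a)
    # Assumption: under the null both pages share one rate, estimated by pooling.
    pooled_rate = (clicks_a + clicks_b) / (n_a + n_b)

    at_least_as_extreme = 0
    for _ in range(n_sims):
        sim_a = simulate_clicks(rng, pooled_rate, n_a)
        sim_b = simulate_clicks(rng, pooled_rate, n_b)
        if abs(sim_a * n_b - sim_b * n_a) >= observed_gap:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_sims


# Made-up example data: 20 clicks out of 200 on one page, 40 out of 200 on the other.
p_value = simulated_p_value(clicks_a=20, n_a=200, clicks_b=40, n_b=200)
print(f"simulated p-value: {p_value:.3f}")
```

The returned value is only a Monte Carlo estimate, so it changes slightly with `seed` and becomes more stable as `n_sims` grows.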