Statistical significance concerns how likely it is that an observed difference between two groups would arise by chance alone (Warner, 2012). It is commonly assessed with a p-value, which is the probability of obtaining results at least as extreme as those observed if there were truly no difference between the groups. Researchers compare the p-value against a preset threshold (the significance level, or alpha), which is normally set to .05 or .01 (Warner, 2012). So, if a study produces a p-value less than .05 or .01 (depending on the set threshold), researchers report the results as statistically significant. In other words, if there were no real difference between the groups, results this extreme would be expected less than 5% (or 1%) of the time.
However, according to the American Statistical Association, p-values do not convey meaningful evidence of the size of an effect or the importance of a result (Wasserstein & Lazar, 2016; Yaddanapudi, 2016). In addition, smaller p-values do not imply the presence of larger effects, and larger p-values do not imply the lack of an effect (Sullivan & Feinn, 2012). Overall, p-values inform researchers whether an effect is likely to exist, while effect sizes provide researchers with a measure of the magnitude of the differences between groups. Larger effect sizes are more meaningful than smaller ones (Sullivan & Feinn, 2012; Wasserstein & Lazar, 2016; Yaddanapudi, 2016).
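The gap between a p-value and an effect size can be illustrated with a short calculation. The following is a minimal sketch, not a method taken from the cited sources: it assumes two equally sized groups with a known common standard deviation, uses a two-sided z-test as a simple stand-in for the usual t-test, and measures effect size with Cohen's d. The function names and all numbers are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def cohens_d(mean_diff, pooled_sd):
    """Effect size: the mean difference expressed in standard deviation units."""
    return mean_diff / pooled_sd


def two_sided_p(mean_diff, pooled_sd, n_per_group):
    """Two-sided z-test p-value for two groups of equal size.

    Assumes the common standard deviation is known (a simplification).
    """
    standard_error = pooled_sd * sqrt(2 / n_per_group)
    z = mean_diff / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    # Hypothetical study A: tiny effect, very large sample.
    print("A: d =", cohens_d(0.05, 1.0), "p =", two_sided_p(0.05, 1.0, 10_000))
    # Hypothetical study B: large effect, very small sample.
    print("B: d =", cohens_d(0.8, 1.0), "p =", two_sided_p(0.8, 1.0, 10))
```

Under these assumptions, study A falls below the .05 threshold even though its effect is very small (d = 0.05), while study B does not reach the threshold even though its effect is large (d = 0.8). The p-value reflects sample size as well as the size of the effect, which is why it cannot stand in for an effect size.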
In tests of statistical significance, researchers evaluate whether the evidence is strong enough to reject a null hypothesis (Warner, 2012). The null hypothesis is a theoretical assumption that there is no difference between two groups (Warner, 2012). For example, a researcher interested in testing an intervention for depression may hypothesize that the intervention will reduce levels of depression in the experimental group. However, the null hypothesis states that there will be no difference in levels of depression between those who receive the intervention and those who do not.
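One way to see what the null hypothesis claims is a permutation test: if the intervention makes no difference, then the group labels are arbitrary, and shuffling them should produce differences as large as the observed one fairly often. The sketch below assumes this approach purely for illustration (it is not drawn from the cited sources), and the depression scores in it are invented, hypothetical values rather than data from any study.

```python
import random
from statistics import fmean


def permutation_p_value(treated, control, n_permutations=5000, seed=0):
    """Share of random relabelings whose absolute mean difference is at
    least as large as the observed one (a two-sided permutation p-value)."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(fmean(treated) - fmean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign scores to groups at random
        diff = abs(fmean(pooled[:n_treated]) - fmean(pooled[n_treated:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations


if __name__ == "__main__":
    # Hypothetical depression scores (lower = fewer symptoms).
    intervention = [10, 11, 12, 9, 10, 11, 10, 12]
    no_intervention = [18, 19, 20, 17, 18, 19, 21, 20]
    print("p =", permutation_p_value(intervention, no_intervention))
```

A small p-value here means that random relabeling rarely reproduces a difference as large as the observed one, which counts as evidence against the null hypothesis. It does not, by itself, say how large or how important the difference is.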
Normally, the significance level used to determine whether the null hypothesis will be rejected is set in advance, with two different types of error in mind: Type I and Type II (Warner, 2012). A Type I error is rejecting the null hypothesis when the null hypothesis is true (Warner, 2012). A Type II error is failing to reject the null hypothesis when the null hypothesis is NOT true (Warner, 2012). Type I errors are more likely to occur when the significance level is set high, such as .10. Type II errors are more likely to occur when the significance level is set very low, such as .01. Generally, Type I errors are more deceptive because they suggest there are significant differences between groups when none exist. Researchers may use more relaxed levels of statistical significance because of the misconception that statistically significant results are more meaningful.
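The trade-off between the two error types can be made concrete with a small simulation. This is a minimal sketch under stated assumptions, not a procedure from Warner (2012): it generates many hypothetical studies with normally distributed scores, a known standard deviation of 1, 30 participants per group, and a two-sided z-test. The true difference of 0.5 used for the "null is false" case is an arbitrary illustrative value.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def simulate_p_values(true_diff, n_per_group=30, trials=2000, seed=1):
    """Return one two-sided z-test p-value per simulated study (known SD = 1)."""
    rng = random.Random(seed)
    standard_error = sqrt(2 / n_per_group)
    p_values = []
    for _ in range(trials):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_diff, 1.0) for _ in range(n_per_group)]
        z = (fmean(treated) - fmean(control)) / standard_error
        p_values.append(2 * (1 - NormalDist().cdf(abs(z))))
    return p_values


def rejection_rate(p_values, alpha):
    """Fraction of simulated studies that reject the null at level alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)


if __name__ == "__main__":
    null_true = simulate_p_values(true_diff=0.0)   # no real effect
    null_false = simulate_p_values(true_diff=0.5)  # a real effect exists
    for alpha in (0.10, 0.01):
        type_1 = rejection_rate(null_true, alpha)       # false positives
        type_2 = 1 - rejection_rate(null_false, alpha)  # false negatives
        print(f"alpha={alpha}: Type I rate={type_1:.3f}, Type II rate={type_2:.3f}")
```

In a simulation like this, the Type I rate tracks the chosen significance level, while the Type II rate moves in the opposite direction: a stricter threshold (.01) yields fewer false positives but more missed effects than a relaxed one (.10).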
Committing either error has the potential to cause harm to participants (Siegel, 2020). However, since Type II errors are false negatives (failing to reject the null hypothesis when it is NOT true), they may be more harmful to participants. For example, a person who actually has COVID-19 will not receive treatment if they inaccurately test negative. On the other hand, a person with a false positive (a Type I error: rejecting the null hypothesis when it is true) may receive unnecessary treatment. Most would argue that the false negative is more harmful because the person will believe they are virus-free, may refrain from taking proactive measures to reduce transmission, and will not receive appropriate, potentially lifesaving medical treatment (Siegel, 2020).
Siegel, E. (2020). This is how false positives and false negatives can bias COVID-19 testing. Forbes.
Sullivan, G. M., & Feinn, R. (2012). Using effect size - or why the p value is not enough. Journal of Graduate Medical Education, 4(3), 279-282.
Warner, R. M. (2012). Applied statistics: From bivariate through multivariate techniques (2nd ed.). Thousand Oaks, CA: Sage Publications.
Wasserstein, R., & Lazar, N. (2016). The ASA’s statement on p-values: Context, process, and purpose. The American Statistician, 70(2), 129-133.
Yaddanapudi, L. N. (2016). The American Statistical Association statement on p-values explained. Journal of Anaesthesiology Clinical Pharmacology, 32(4), 421-423.