In statistics, “power” refers to the ability of your study to detect an effect of a given size if that effect truly exists. When designing your study, you need to consider four essential factors:
1. Sample size: the number of units (e.g., patients) in your study, usually represented as “N.”
2. Effect size: the size of the effect you are interested in detecting (if you are looking for a large effect, you generally don’t need as big a sample as you would if you were looking for a small effect).
3. Alpha level: your significance threshold (it can be .001, .05, or .1). If your p value is at or above this level, you conclude that your result is not statistically significant.
4. Power: the probability that your study will detect an effect of the chosen size, assuming that effect truly exists.
How do you determine the power of your study? The above four parameters are interrelated, so if you have the values for any three of them, you can calculate the fourth. In practice, the alpha level is usually fixed (you generally choose among .001, .05, and .1), and a review of the literature will tell you roughly how large an effect you can reasonably expect (the effect size). So if you want your study to have good power, you will need to focus on sample size.
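To make this concrete, here is a minimal sketch in Python using statsmodels’ power tools. It assumes a two-sample t-test comparing two equal-sized groups, and the specific numbers (an effect size of 0.5, alpha of .05, target power of .80) are purely illustrative, not recommendations. Given any three of the four quantities, the fourth can be computed.

```python
# A minimal sketch using statsmodels' power calculations for a
# two-sample t-test; all numeric values below are illustrative.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# 1) Given sample size, effect size, and alpha, compute the power.
power = analysis.power(effect_size=0.5, nobs1=64, alpha=0.05)
print(f"Power with 64 subjects per group: {power:.2f}")  # ~0.80

# 2) Given effect size, alpha, and the desired power, solve for the
#    required sample size per group (the parameter left unspecified).
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"Required sample size per group: {n_per_group:.1f}")  # ~63.8
```

The same solve_power call can be used to solve for any one of the four quantities by leaving it unspecified and supplying the other three.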
Many prestigious journals, such as Nature, require you to justify your sample size to show that you have enough power. Nature also offers specific guidelines on which tests to use when your sample size is small. Others, like the British Journal of Surgery, want power calculations to be clearly stated in the manuscript. Still others, like Molecular Genetics and Metabolism, state explicitly that “[s]ubmitted manuscripts without a power calculation will be rejected and returned to authors without review.” And it’s not just medical and life science journals that are strict about statistical power: in its Reporting Standards for Research in Psychology, the American Psychological Association strongly recommends reporting a power analysis in the methods section of psychology papers.
It also helps to show your power calculations when applying for a grant, so that reviewers can gauge the robustness of your study.
You’ll notice that methodology has not been mentioned in the explanation above. This is because the power of a study is independent of its methodology. You can use the most rigorous designs, such as randomized clinical trials, and still have low statistical power (e.g., a sample size too small to detect the effects you have chosen to study). Such “underpowered” studies lack the power needed for their observed effects to be considered reliable and reproducible.
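The sketch below, a simulation not drawn from any particular study, illustrates what an underpowered design looks like: a true 0.5-standard-deviation difference exists between two groups, but with only 10 subjects per group (both numbers assumed for the example) a two-sample t-test detects it far less often than the conventional 80% target.

```python
# A simulation sketch of an underpowered design: a true 0.5-SD difference
# exists between the groups, but with only 10 subjects per group a
# two-sample t-test rarely detects it at alpha = .05.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n_per_group, true_effect, alpha, n_sims = 10, 0.5, 0.05, 10_000

significant = 0
for _ in range(n_sims):
    control = rng.normal(loc=0.0, scale=1.0, size=n_per_group)
    treated = rng.normal(loc=true_effect, scale=1.0, size=n_per_group)
    _, p_value = ttest_ind(control, treated)
    if p_value < alpha:
        significant += 1

# The fraction of simulated studies that detect the (real) effect is an
# estimate of the design's power -- here well below the usual 80% target.
print(f"Estimated power: {significant / n_sims:.2f}")
```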
Unfortunately, it’s very difficult to fix a lack of power after you have conducted your research. It’s therefore important to consult a statistician before you start data collection to check whether your study design has enough power.