Statistical Reporting Practices: What the Research Shows
A Review of Published Meta-Analyses on Effect Sizes, Power, and Chi-Square Testing
Type: Literature Review | Last Updated: January 30, 2026 | Sources: Peer-reviewed meta-analyses and systematic reviews
Overview
This review synthesizes findings from major meta-analyses examining statistical reporting practices in psychology and related fields. The evidence reveals persistent gaps between recommended practices and actual reporting, with implications for research reproducibility and interpretation.
Key Findings from the Literature
- 36%: median statistical power across psychology studies (Stanley, Carter & Doucouliagos, 2018; n = 200 meta-analyses)
- 8%: studies with adequate power to test their hypotheses (Stanley, Carter & Doucouliagos, 2018)
- 49%: articles reporting any effect size measure, 2005-2007 (Sun, Pan & Wang, 2010; n = 1,243 articles)
- 2.9%: studies reporting a power analysis (Fritz, Scherndl & Kühberger, 2012; n = 1,164 articles)
Statistical Power: A Persistent Problem
Jacob Cohen first documented the power problem in 1962, finding that abnormal psychology studies had only about 48% power to detect medium effects—essentially a coin flip. More than 50 years later, the situation has not meaningfully improved.
Szucs & Ioannidis (2017) analyzed 26,841 statistical records from 3,801 papers published in cognitive neuroscience and psychology journals (2011-2014). They found median power of:
- 12% for detecting small effects
- 44% for detecting medium effects
- 73% for detecting large effects
Stanley, Carter & Doucouliagos (2018) reviewed 200 meta-analyses spanning nearly 8,000 individual studies. They found:
- Median power: 36%
- Only 8% of studies were adequately powered
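Power figures like these come from comparing achieved sample sizes against the noncentral chi-square distribution. A minimal sketch of how power is computed for a chi-square test given Cohen's effect size w, using SciPy (the function name and example numbers are illustrative, not from any of the studies above):

```python
from scipy.stats import chi2, ncx2

def chi2_power(w, n, df, alpha=0.05):
    """Power of a chi-square test for Cohen's effect size w with n observations."""
    crit = chi2.ppf(1 - alpha, df)   # critical value under the null
    lam = n * w**2                   # noncentrality under the alternative
    return 1 - ncx2.cdf(crit, df, lam)

# A 2x2 test (df = 1), n = 50, medium effect (w = 0.3) -> power is roughly 0.56
print(round(chi2_power(0.3, 50, 1), 2))
```

Even at n = 50, a study testing a medium effect in a 2×2 design sits well below the conventional 80% power target.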
Power by Research Domain
| Domain | Median Power | Source |
|---|---|---|
| Neuroscience | 21% | Button et al. (2013) |
| Psychology (overall) | 36% | Stanley et al. (2018) |
| Applied Psychology | 52%* | Mone et al. (1996) |
| Intelligence Research | 12%** | Nuijten et al. (2020) |
*For medium effects. **For small effects, median N = 60.
Effect Size Reporting
The American Psychological Association has required effect size reporting since 2001, yet compliance remains inconsistent.
Sun, Pan & Wang (2010) reviewed 1,243 articles from 14 journals (2005-2007):
- 49% of articles reported any effect size
- Of those, only 57% interpreted the effect size
- AERA journals performed best at 73% reporting rate
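For the categorical analyses this review focuses on, the effect sizes in question are typically phi and Cramér's V. A minimal sketch of computing Cramér's V from a contingency table with SciPy (the helper name and example table are our own, not from Sun et al.):

```python
import numpy as np
from scipy.stats import chi2_contingency

def cramers_v(table):
    """Cramér's V for an r x c contingency table (equals phi for a 2x2 table)."""
    table = np.asarray(table)
    chi2_stat = chi2_contingency(table, correction=False)[0]  # no Yates correction
    n = table.sum()
    k = min(table.shape) - 1          # min(rows, cols) - 1
    return float(np.sqrt(chi2_stat / (n * k)))

# Hypothetical 2x2 table: V = sqrt(chi2 / n), about 0.333 here
print(round(cramers_v([[10, 20], [20, 10]]), 3))
```

Reporting V (or phi) alongside the chi-square statistic and p-value is what the APA guidelines described above require.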
Improvement Over Time
Fritz, Scherndl & Kühberger (2012) documented a growth rate in effect size reporting of approximately 2% per year between 1990 and 2007. More recent data suggest continued improvement, with some journals now achieving near-complete compliance:
| Period | Effect Size Reporting Rate | Source |
|---|---|---|
| 1990s | ~20-30% | Fritz et al. (2012) |
| 2005-2007 | 49% | Sun et al. (2010) |
| Post-2020 (Social/Personality) | 97% | Farmus et al. (2023) |
Chi-Square Test Assumptions
The chi-square test requires certain conditions to produce valid p-values. The most commonly cited rule comes from Cochran (1954):
Cochran's Rule: Avoid using the chi-square test when more than 20% of cells have expected frequencies less than 5, or when any cell has an expected frequency less than 1.
Source: Cochran WG (1954) Some methods for strengthening the common χ² tests. Biometrics 10(4): 417-451.
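Cochran's rule can be checked mechanically from the table of expected frequencies. A minimal sketch using SciPy (the helper name and example table are our own illustration):

```python
import numpy as np
from scipy.stats import chi2_contingency

def check_cochran(table):
    """Flag violations of Cochran's (1954) rule from the expected frequencies."""
    expected = chi2_contingency(np.asarray(table))[3]  # expected counts under H0
    pct_small = float((expected < 5).mean())           # share of cells with E < 5
    return {
        "min_expected": float(expected.min()),
        "pct_cells_below_5": pct_small,
        "cochran_ok": bool(pct_small <= 0.20 and expected.min() >= 1),
    }

# Hypothetical table where 1 of 4 cells (25%) has an expected count below 5,
# so the rule is violated
print(check_cochran([[2, 8], [10, 40]]))
```

When the check fails, the alternatives in the table below apply.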
When to Use Alternative Tests
| Condition | Recommended Test | Rationale |
|---|---|---|
| Any expected count < 5 (2×2 table) | Fisher's exact test | Computes an exact p-value |
| Total N < 20 | Fisher's exact test | Chi-square approximation unreliable |
| >20% of cells with E < 5 | Combine categories or Fisher's | Low expected counts inflate the Type I error rate |
| Paired/matched data | McNemar's test | Independence assumption violated |
Research by Camilli & Hopkins (1978) and others suggests that Yates' continuity correction is often overly conservative; modern practice therefore favors Fisher's exact test when the chi-square assumptions are not met.
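Running Fisher's exact test is a one-liner in SciPy. A short sketch on a hypothetical 2×2 table with small expected counts:

```python
from scipy.stats import fisher_exact

# Hypothetical 2x2 table with small expected counts,
# where the chi-square approximation is unreliable
table = [[3, 7], [9, 2]]
odds_ratio, p = fisher_exact(table, alternative="two-sided")
print(f"odds ratio = {odds_ratio:.3f}, exact p = {p:.4f}")
```

The test conditions on the table margins and computes the p-value from the hypergeometric distribution, so no minimum-expected-count assumption is needed.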
Implications for Researchers
Based on this review, we recommend:
- Conduct a priori power analysis — Only 2.9% of studies report doing this, yet it's essential for adequate sample sizes
- Always report effect sizes with confidence intervals — Required by APA since 2001, but still underreported
- Check chi-square assumptions — Use Fisher's exact test when expected frequencies are low
- Report exact p-values — Not just "p < .05" but the actual value
- Interpret effect size magnitude — A significant p-value with negligible effect size has limited practical value
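The first recommendation, an a priori power analysis, amounts to solving the chi-square power function for n. A sketch under stated assumptions (the helper below is our own illustration, not CrossTabs.com's implementation):

```python
from scipy.optimize import brentq
from scipy.stats import chi2, ncx2

def n_for_power(w, df, power=0.80, alpha=0.05):
    """Sample size giving the target power for a chi-square test of size alpha."""
    crit = chi2.ppf(1 - alpha, df)                      # rejection threshold under H0
    gap = lambda n: (1 - ncx2.cdf(crit, df, n * w**2)) - power
    return brentq(gap, 2, 1e6)                          # root where power(n) hits target

# Medium effect (w = 0.3), df = 1, 80% power at alpha = .05 -> n of roughly 87
print(round(n_for_power(0.3, 1)))
```

Planning for n before data collection, rather than rationalizing power afterward, is exactly the practice the 2.9% figure above shows to be rare.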
Tools for Better Practices
CrossTabs.com automatically calculates all recommended statistics:
- Effect sizes (Cramér's V, phi, odds ratio) with 95% confidence intervals
- Power analysis for sample size planning
- Automatic Fisher's exact test when assumptions are violated
- Expected frequency warnings
- APA-format output
Try CrossTabs.com Free →
References
- Button KS, Ioannidis JPA, Mokrysz C, et al. (2013). Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14, 365-376. doi:10.1038/nrn3475
- Cochran WG (1954). Some methods for strengthening the common χ² tests. Biometrics, 10(4), 417-451.
- Cohen J (1962). The statistical power of abnormal-social psychological research: A review. Journal of Abnormal and Social Psychology, 65(3), 145-153.
- Fritz A, Scherndl T, Kühberger A (2012). A comprehensive review of reporting practices in psychological journals. Theory & Psychology, 23, 98-122. doi:10.1177/0959354312436870
- Mone MA, Mueller GC, Mauland W (1996). The perceptions and usage of statistical power in applied psychology and management research. Personnel Psychology, 49(1), 103-120.
- Stanley TD, Carter EC, Doucouliagos H (2018). What meta-analyses reveal about the replicability of psychological research. Psychological Bulletin, 144(12), 1325-1346. doi:10.1037/bul0000169
- Sun S, Pan W, Wang LL (2010). A comprehensive review of effect size reporting and interpreting practices in academic journals in education and psychology. Journal of Educational Psychology, 102(4), 989-1004.
- Szucs D, Ioannidis JPA (2017). Empirical assessment of published effect sizes and power in the recent cognitive neuroscience and psychology literature. PLoS Biology, 15(3), e2000797. doi:10.1371/journal.pbio.2000797
How to Cite This Review
CrossTabs (2026). Statistical Reporting Practices: What the Research Shows. CrossTabs.com. Retrieved from https://crosstabs.com/pages/research-chi-square-practices-2026.html
About this review: This page synthesizes findings from peer-reviewed meta-analyses and systematic reviews. All statistics are sourced from published research. CrossTabs.com provides free statistical tools designed to support best practices in categorical data analysis.