Statistical Reporting Practices: What the Research Shows

A Review of Published Meta-Analyses on Effect Sizes, Power, and Chi-Square Testing

Type: Literature Review  |  Last Updated: January 30, 2026  |  Sources: Peer-reviewed meta-analyses and systematic reviews

Overview

This review synthesizes findings from major meta-analyses examining statistical reporting practices in psychology and related fields. The evidence reveals persistent gaps between recommended practices and actual reporting, with implications for research reproducibility and interpretation.

Key Findings from the Literature

Statistical Power: A Persistent Problem

Jacob Cohen first documented the power problem in 1962, finding that abnormal psychology studies had only about 48% power to detect medium effects—essentially a coin flip. More than 50 years later, the situation has not meaningfully improved.

Szucs & Ioannidis (2017) analyzed 26,841 statistical records from 3,801 papers published in cognitive neuroscience and psychology journals (2011-2014) and found that median statistical power fell well below the conventional 80% benchmark.
Source: PLoS Biology 15(3): e2000797. doi:10.1371/journal.pbio.2000797
Stanley, Carter & Doucouliagos (2018) reviewed 200 meta-analyses spanning nearly 8,000 individual studies and found a median statistical power of about 36%.
Source: Psychological Bulletin 144(12): 1325-1346. doi:10.1037/bul0000169

Power by Research Domain

Domain | Median Power | Source
Neuroscience | 21% | Button et al. (2013)
Psychology (overall) | 36% | Stanley et al. (2018)
Applied Psychology | 52%* | Mone et al. (1996)
Intelligence Research | 12%** | Nuijten et al. (2020)

*For medium effects. **For small effects, median N = 60.

Effect Size Reporting

The American Psychological Association has required effect size reporting since 2001, yet compliance remains inconsistent.

Sun, Pan & Wang (2010) reviewed 1,243 articles from 14 journals (2005-2007) and found that only 49% reported effect sizes.
Source: Journal of Educational Psychology 102(4): 989-1004.

Improvement Over Time

Fritz, Scherndl & Kühberger (2012) documented a growth rate in effect size reporting of approximately 2% per year between 1990 and 2007. More recent data suggests continued improvement, with some journals now achieving near-complete compliance:

Period | Effect Size Reporting Rate | Source
1990s | ~20-30% | Fritz et al. (2012)
2005-2007 | 49% | Sun et al. (2010)
Post-2020 (Social/Personality) | 97% | Farmus et al. (2023)
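Reporting an effect size alongside the chi-square statistic takes only a few lines of code. As an illustrative sketch (plain Python, no external libraries; the function name is our own), Cramér's V for an r × c contingency table can be computed directly from the observed counts:

```python
import math

def cramers_v(table):
    """Cramér's V effect size for an r x c contingency table (list of lists)."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    # Pearson chi-square statistic from observed vs. expected counts
    chi2 = sum(
        (table[i][j] - row_tot[i] * col_tot[j] / n) ** 2
        / (row_tot[i] * col_tot[j] / n)
        for i in range(len(table))
        for j in range(len(table[0]))
    )
    k = min(len(table), len(table[0]))  # smaller table dimension
    return math.sqrt(chi2 / (n * (k - 1)))

print(round(cramers_v([[30, 10], [20, 40]]), 3))  # → 0.408
```

For a 2×2 table, V reduces to the phi coefficient, so the same function covers the most common case.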

Chi-Square Test Assumptions

The chi-square test requires certain conditions to produce valid p-values. The most commonly cited rule comes from Cochran (1954):

Cochran's Rule: Avoid using the chi-square test when more than 20% of cells have expected frequencies less than 5, or when any cell has an expected frequency less than 1.
Source: Cochran WG (1954) Some methods for strengthening the common χ² tests. Biometrics 10(4): 417-451.
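Cochran's rule can be checked mechanically before running the test. The sketch below (plain Python; the function name is our own) computes the expected counts E = (row total × column total) / N and flags tables that violate either part of the rule:

```python
def cochran_check(table):
    """Check Cochran's (1954) rule on expected frequencies for a chi-square test.

    Returns (ok, expected), where ok is False if more than 20% of cells have
    expected counts below 5, or any cell has an expected count below 1.
    """
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    expected = [[r * c / n for c in col_tot] for r in row_tot]
    cells = [e for row in expected for e in row]
    frac_below_5 = sum(e < 5 for e in cells) / len(cells)
    ok = frac_below_5 <= 0.20 and min(cells) >= 1
    return ok, expected

# Small table: half the expected counts fall below 5, violating the rule
ok, exp = cochran_check([[2, 8], [5, 5]])
print(ok)  # → False: prefer Fisher's exact test here
```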

When to Use Alternative Tests

Condition | Recommended Test | Rationale
Any expected count < 5 (2×2 table) | Fisher's exact test | Computes an exact p-value
Total N < 20 | Fisher's exact test | Chi-square approximation unreliable
>20% of cells with E < 5 | Combine categories or Fisher's | Low counts inflate the Type I error rate
Paired/matched data | McNemar's test | Independence assumption violated
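For 2×2 tables that fail these checks, Fisher's exact test follows directly from the hypergeometric distribution. Here is a minimal two-sided implementation as an illustrative sketch (in practice, scipy.stats.fisher_exact does the same job):

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for a 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed one.
    """
    n = a + b + c + d
    row1, col1 = a + b, a + c
    denom = comb(n, col1)

    def p(x):  # probability of a table with x in the top-left cell
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    p_obs = p(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    return sum(p(x) for x in range(lo, hi + 1) if p(x) <= p_obs + 1e-12)

# Fisher's classic "lady tasting tea" table [[3, 1], [1, 3]]
print(round(fisher_exact_2x2(3, 1, 1, 3), 4))  # → 0.4857
```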

Research by Camilli & Hopkins (1978) and others suggests that Yates' continuity correction is overly conservative; modern practice favors Fisher's exact test when the assumptions are not met.
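The conservatism of the correction is easy to see numerically. The sketch below (plain Python; the function name is our own) computes the 2×2 chi-square statistic with and without Yates' correction, using the identity that for df = 1 the chi-square survival function equals erfc(sqrt(x / 2)):

```python
from math import erfc, sqrt

def chi2_2x2(table, yates=False):
    """Pearson chi-square for a 2x2 table, optionally with Yates' correction.

    Returns (statistic, p_value); for df = 1 the p-value is erfc(sqrt(stat / 2)),
    which equals the chi-square(1) survival function.
    """
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            e = row_tot[i] * col_tot[j] / n
            diff = abs(table[i][j] - e)
            if yates:
                diff = max(diff - 0.5, 0.0)  # continuity correction
            stat += diff ** 2 / e
    return stat, erfc(sqrt(stat / 2))

table = [[12, 5], [6, 12]]
print(chi2_2x2(table)[1] < chi2_2x2(table, yates=True)[1])  # → True
```

In this example the uncorrected p-value (about .028) falls below .05 while the corrected one (about .062) does not, which is exactly the conservatism at issue.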

Implications for Researchers

Based on this review, we recommend:

  1. Conduct a priori power analysis — Only 2.9% of studies report doing this, yet it's essential for adequate sample sizes
  2. Always report effect sizes with confidence intervals — Required by APA since 2001, but still underreported
  3. Check chi-square assumptions — Use Fisher's exact test when expected frequencies are low
  4. Report exact p-values — Not just "p < .05" but the actual value
  5. Interpret effect size magnitude — A significant p-value with negligible effect size has limited practical value
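For the common df = 1 case, an a priori power analysis (recommendation 1) needs nothing beyond the normal distribution: the chi-square statistic is then the square of a normal variable with mean w·sqrt(N). A minimal sketch under that assumption (function names are our own; dedicated tools such as G*Power or statsmodels cover the general case):

```python
from math import sqrt
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution

def chi2_power_df1(w, n, alpha=0.05):
    """Power of a df = 1 chi-square test at Cohen's effect size w and sample size n.

    For df = 1 the statistic is the square of a normal variable with mean
    w * sqrt(n), so power has a closed form via the normal CDF.
    """
    z_crit = _Z.inv_cdf(1 - alpha / 2)
    lam = w * sqrt(n)
    return _Z.cdf(lam - z_crit) + _Z.cdf(-lam - z_crit)

def n_for_power(w, target=0.80, alpha=0.05):
    """Smallest n that reaches the target power (simple upward search)."""
    n = 2
    while chi2_power_df1(w, n, alpha) < target:
        n += 1
    return n

print(n_for_power(0.3))  # → 88: sample size for 80% power at a medium effect
```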

Tools for Better Practices

CrossTabs.com automatically calculates the statistics recommended above.

Try CrossTabs.com Free →

References

  1. Button KS, Ioannidis JPA, Mokrysz C, et al. (2013). Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14, 365-376. doi:10.1038/nrn3475
  2. Cochran WG (1954). Some methods for strengthening the common χ² tests. Biometrics, 10(4), 417-451.
  3. Cohen J (1962). The statistical power of abnormal-social psychological research: A review. Journal of Abnormal and Social Psychology, 65(3), 145-153.
  4. Fritz A, Scherndl T, Kühberger A (2012). A comprehensive review of reporting practices in psychological journals. Theory & Psychology, 23, 98-122. doi:10.1177/0959354312436870
  5. Mone MA, Mueller GC, Mauland W (1996). The perceptions and usage of statistical power in applied psychology and management research. Personnel Psychology, 49(1), 103-120.
  6. Stanley TD, Carter EC, Doucouliagos H (2018). What meta-analyses reveal about the replicability of psychological research. Psychological Bulletin, 144(12), 1325-1346. doi:10.1037/bul0000169
  7. Sun S, Pan W, Wang LL (2010). A comprehensive review of effect size reporting and interpreting practices in academic journals in education and psychology. Journal of Educational Psychology, 102(4), 989-1004.
  8. Szucs D, Ioannidis JPA (2017). Empirical assessment of published effect sizes and power in the recent cognitive neuroscience and psychology literature. PLoS Biology, 15(3), e2000797. doi:10.1371/journal.pbio.2000797
How to Cite This Review
CrossTabs (2026). Statistical Reporting Practices: What the Research Shows. CrossTabs.com. Retrieved from https://crosstabs.com/pages/research-chi-square-practices-2026.html

About this review: This page synthesizes findings from peer-reviewed meta-analyses and systematic reviews. All statistics are sourced from published research. CrossTabs.com provides free statistical tools designed to support best practices in categorical data analysis.