How to check normality assumptions

Updated March 2026

Most parametric statistical tests — t-tests, ANOVA, Pearson correlation, linear regression — assume that the data (or the residuals) follow a normal distribution. Violating this assumption can lead to inaccurate p-values and unreliable confidence intervals.

This guide covers how to check normality, how to interpret the results, and what to do when your data isn't normal.

Why normality matters

Parametric tests calculate p-values based on the assumption that the sampling distribution of the test statistic is known. When data is normally distributed, these calculations are exact. When data is severely non-normal — especially with small samples — the p-values can be misleading.

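That said, normality is often misunderstood. The assumption applies to the residuals (equivalently, to the outcome within each group) and, through them, to the sampling distribution of the test statistic. It does not apply to the pooled raw data. With large samples (roughly n > 30 per group), the Central Limit Theorem makes the sampling distribution of the mean approximately normal, so parametric tests are robust to moderate departures from normality.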

Practical rule of thumb: With sample sizes above 30 per group, moderate skewness is usually fine for t-tests and ANOVA. With small samples (n < 15), normality matters more, and you should check carefully.
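
The Central Limit Theorem point can be illustrated with a small simulation. This is a minimal sketch on simulated data, assuming NumPy and SciPy are installed; the exponential distribution, the seed, and the replicate count are illustrative choices, not part of any recommended procedure.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=7)

N_PER_SAMPLE = 30     # the "roughly n > 30" rule of thumb
N_REPLICATES = 5000   # number of simulated studies (arbitrary)

# Each row is one simulated sample drawn from a strongly right-skewed
# (exponential) population.
raw = rng.exponential(scale=1.0, size=(N_REPLICATES, N_PER_SAMPLE))
sample_means = raw.mean(axis=1)

# Skewness of the raw observations vs. skewness of the sample means.
skew_raw = stats.skew(raw.ravel())
skew_means = stats.skew(sample_means)

print(f"Skewness of raw data:     {skew_raw:.2f}")
print(f"Skewness of sample means: {skew_means:.2f}")
```

The sample means should be much closer to symmetric than the raw observations, which is why tests based on means tolerate moderate skew at this sample size.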

Method 1: Shapiro-Wilk test

The Shapiro-Wilk test is the most widely recommended formal test for normality. It tests the null hypothesis that the data came from a normal distribution.

If p > .05: no significant evidence against normality (proceed with the parametric test; note that this does not prove the data is normal)
If p ≤ .05: evidence that the data departs from normality (consider a non-parametric alternative)

How to interpret Shapiro-Wilk results

Example output

W = 0.96, p = .42

A W statistic close to 1.0 indicates the data is approximately normal. Here, p = .42 is well above .05, so there's no evidence against normality.
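
As a rough illustration, the test can be run in Python with scipy.stats.shapiro. The sample below is simulated (it is not real data), and the .05 threshold simply follows the decision rule above.

```python
import numpy as np
from scipy import stats

# Hypothetical sample: 40 simulated measurements, for illustration only.
rng = np.random.default_rng(seed=0)
sample = rng.normal(loc=50.0, scale=10.0, size=40)

# shapiro returns the W statistic and the p-value.
w_stat, p_value = stats.shapiro(sample)
print(f"W = {w_stat:.2f}, p = {p_value:.3f}")

ALPHA = 0.05
if p_value > ALPHA:
    print("No significant evidence against normality.")
else:
    print("Evidence of departure from normality.")
```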

Limitations of Shapiro-Wilk

Overpowered with large samples. When n > 5,000, Shapiro-Wilk almost always rejects the null hypothesis, even for trivially small departures from normality that won't affect your analysis. In this case, rely on Q-Q plots instead.
Underpowered with small samples. When n < 10, the test may fail to detect genuine non-normality. Visual inspection (Q-Q plot) is especially important here.

Method 2: Q-Q plots (visual inspection)

A quantile-quantile (Q-Q) plot graphs the quantiles of your data against the theoretical quantiles of a normal distribution. If your data is normal, the points fall approximately along a straight diagonal line.

Reading a Q-Q plot

Points on the line → Data is approximately normal
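S-shaped curve → Tails are heavier (leptokurtic) or lighter (platykurtic) than those of a normal distribution
Banana shape curving upward (with sample quantiles on the vertical axis) → Right-skewed data
Banana shape curving downward (with sample quantiles on the vertical axis) → Left-skewed data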
Isolated points far from the line → Outliers

Q-Q plots are often more informative than formal tests because they show how the data departs from normality — which helps you choose the right response.
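
One possible way to build a Q-Q plot in Python is scipy.stats.probplot, which computes the plot coordinates and a fitted reference line. The sketch below uses simulated right-skewed data; plot_qq is a hypothetical helper name, and it assumes matplotlib is installed only if you actually call it.

```python
import numpy as np
from scipy import stats

# Simulated right-skewed data, for illustration only.
rng = np.random.default_rng(seed=1)
right_skewed = rng.exponential(scale=2.0, size=200)

# probplot returns the Q-Q coordinates plus a least-squares reference line:
# ((theoretical quantiles, ordered sample values), (slope, intercept, r)).
(theoretical_q, ordered_values), (slope, intercept, r) = stats.probplot(
    right_skewed, dist="norm"
)
print(f"Correlation between sample and theoretical quantiles: r = {r:.3f}")


def plot_qq(theoretical, ordered, line_slope, line_intercept):
    """Draw the Q-Q plot (hypothetical helper; requires matplotlib)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(theoretical, ordered, s=10)
    ax.plot(theoretical, line_slope * theoretical + line_intercept, color="red")
    ax.set_xlabel("Theoretical quantiles (normal)")
    ax.set_ylabel("Sample quantiles")
    return fig
```

Calling plot_qq(theoretical_q, ordered_values, slope, intercept) puts the sample quantiles on the vertical axis, so right-skewed data like this should curve upward away from the reference line.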

Method 3: Descriptive indicators

Skewness and kurtosis values provide numerical summaries of distribution shape:

Skewness: Values between −1 and +1 suggest approximate symmetry
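Kurtosis: Excess kurtosis (where a normal distribution scores 0) between −2 and +2 suggests approximate normality; some sources use a stricter range of −1 to +1. Check which definition your software reports, because non-excess kurtosis is 3 for a normal distribution.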

These are supplementary — use them alongside Shapiro-Wilk and Q-Q plots, not as the sole check.
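
A short sketch of how these values might be computed with SciPy, using simulated data. The fisher argument of scipy.stats.kurtosis switches between the two kurtosis definitions, which is a common source of confusion when comparing against published cutoffs.

```python
import numpy as np
from scipy import stats

# Simulated sample, for illustration only.
rng = np.random.default_rng(seed=2)
sample = rng.normal(size=500)

skewness = stats.skew(sample)

# fisher=True (the default) gives excess kurtosis: 0 for a normal distribution.
excess_kurtosis = stats.kurtosis(sample, fisher=True)

# fisher=False gives Pearson kurtosis: 3 for a normal distribution.
pearson_kurtosis = stats.kurtosis(sample, fisher=False)

print(f"Skewness:         {skewness:.2f}")
print(f"Excess kurtosis:  {excess_kurtosis:.2f}")
print(f"Pearson kurtosis: {pearson_kurtosis:.2f}")
```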

What to do when normality fails

When your data isn't normally distributed, you have several options:

Option 1: Switch to a non-parametric test

This is the most common and safest approach. Non-parametric tests don't assume normality:

Parametric test	Non-parametric alternative
Unpaired t-test	Mann-Whitney U test
Paired t-test	Wilcoxon signed-rank test
One-way ANOVA	Kruskal-Wallis H test
Repeated measures ANOVA	Friedman test
Pearson correlation	Spearman correlation
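
Each alternative in the table has a counterpart in scipy.stats. The sketch below runs them on simulated data; the group names, sizes, and distributions are made up for illustration, and in a real analysis you would run only the one test that matches your design.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=3)

# Three independent groups (simulated, skewed).
control = rng.exponential(scale=1.0, size=25)
treatment = rng.exponential(scale=1.5, size=25)
high_dose = rng.exponential(scale=2.0, size=25)

# Three repeated measurements on the same 25 subjects (simulated).
before = rng.exponential(scale=1.0, size=25)
after = before + rng.normal(loc=0.3, scale=0.5, size=25)
followup = before + rng.normal(loc=0.5, scale=0.5, size=25)

results = {
    # Unpaired t-test -> Mann-Whitney U
    "Mann-Whitney U": stats.mannwhitneyu(control, treatment, alternative="two-sided").pvalue,
    # Paired t-test -> Wilcoxon signed-rank
    "Wilcoxon signed-rank": stats.wilcoxon(before, after).pvalue,
    # One-way ANOVA -> Kruskal-Wallis H
    "Kruskal-Wallis H": stats.kruskal(control, treatment, high_dose).pvalue,
    # Repeated measures ANOVA -> Friedman (needs at least three conditions)
    "Friedman": stats.friedmanchisquare(before, after, followup).pvalue,
    # Pearson correlation -> Spearman
    "Spearman": stats.spearmanr(before, after).pvalue,
}

for test_name, p in results.items():
    print(f"{test_name}: p = {p:.3f}")
```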

Option 2: Transform the data

Log, square root, or inverse transformations can sometimes normalize skewed data. However, this changes the scale of your outcome, making interpretation harder. Modern practice increasingly favors non-parametric tests over transformations.
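
To see what a transformation does, you can compare skewness before and after. This is a minimal sketch on simulated log-normal data, chosen because a log transform is known to normalize it; real data will rarely respond this cleanly.

```python
import numpy as np
from scipy import stats

# Simulated right-skewed data, for illustration only.
rng = np.random.default_rng(seed=4)
raw = rng.lognormal(mean=0.0, sigma=1.0, size=300)

# The log transform requires strictly positive values.
logged = np.log(raw)

skew_raw = stats.skew(raw)
skew_log = stats.skew(logged)

print(f"Skewness before transform: {skew_raw:.2f}")
print(f"Skewness after log:        {skew_log:.2f}")
```

Any test run on the transformed values describes the log scale, so effect sizes and means need to be interpreted (or back-transformed) accordingly.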

Option 3: Proceed with caution (large samples)

If your sample is large (roughly n > 30 per group) and the departure from normality is moderate, parametric tests are often robust enough. Report that you checked normality, note the violation, and mention it in your limitations.

When to check normality

Check normality at the right level depending on your test:

t-tests: Check normality in each group separately (unpaired) or normality of the differences (paired)
ANOVA: Check normality of the residuals, or normality within each group
Linear regression: Check normality of the residuals (not the raw variables)
Pearson correlation: Check normality of both variables
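
For regression, checking the residuals might look like the sketch below. It fits a simple linear regression to simulated data with scipy.stats.linregress and then tests the residuals rather than x or y; the variable names and the data-generating model are assumptions made for the example.

```python
import numpy as np
from scipy import stats

# Simulated predictor and outcome, for illustration only.
rng = np.random.default_rng(seed=5)
x = rng.uniform(0.0, 10.0, size=60)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=60)

# Fit the line, then compute residuals = observed - predicted.
fit = stats.linregress(x, y)
residuals = y - (fit.intercept + fit.slope * x)

# Test the residuals, not the raw variables.
w_stat, p_value = stats.shapiro(residuals)
print(f"Residuals: W = {w_stat:.2f}, p = {p_value:.3f}")
```

Note that x here is uniformly distributed, not normal, which is fine: the assumption concerns only the residuals.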

Common mistake: Testing normality on the combined data instead of per group. If you have a treatment and control group, run Shapiro-Wilk on each group separately. The combined distribution may look non-normal even when each group is normal.
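
The sketch below illustrates this mistake with two simulated groups that are each drawn from a normal distribution but have very different means. The group names, means, and sizes are invented for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=6)

# Two simulated groups, each normal, with well-separated means.
groups = {
    "control": rng.normal(loc=0.0, scale=1.0, size=50),
    "treatment": rng.normal(loc=6.0, scale=1.0, size=50),
}

# Correct: test each group separately.
per_group_p = {name: stats.shapiro(values).pvalue for name, values in groups.items()}

# Mistake: test the pooled data, which is bimodal here.
combined = np.concatenate(list(groups.values()))
combined_p = stats.shapiro(combined).pvalue

for name, p in per_group_p.items():
    print(f"{name}: p = {p:.3f}")
print(f"combined: p = {combined_p:.3g}")
```

The pooled data has two peaks, so Shapiro-Wilk rejects normality for it even though each group was generated from a normal distribution.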

Reporting normality checks

In your methods section, briefly note how you checked assumptions:

APA example

Normality was assessed using the Shapiro-Wilk test and visual inspection of Q-Q plots. Both groups satisfied the normality assumption (W > 0.94, p > .05). Levene's test indicated no violation of homogeneity of variances, F(1, 48) = 0.82, p = .370.

When normality fails

The Shapiro-Wilk test indicated a departure from normality in the treatment group (W = 0.87, p = .012). Accordingly, a Mann-Whitney U test was used instead of an independent-samples t-test.

Join the beta to try this in GraphHelix — upload your data, and the AI will check normality (Shapiro-Wilk + Q-Q plots) automatically before every parametric test.
