Multivariate Normality Tests

Multivariate normality is a foundational assumption for methods such as MANOVA, principal component analysis, and linear discriminant analysis. In this tutorial, we’ll apply and interpret the five main tests implemented in the MVN package, guided by the recommendations of Korkmaz et al. (2014).


Example Data

We’ll use two numeric variables from the built-in iris dataset:

df <- iris[1:50, 1:2]
head(df)
  Sepal.Length Sepal.Width
1          5.1         3.5
2          4.9         3.0
3          4.7         3.2
4          4.6         3.1
5          5.0         3.6
6          5.4         3.9

Henze–Zirkler Test

The Henze–Zirkler test is recommended for its balanced control of Type I error and good power properties under moderate sample sizes; a p-value below 0.05 suggests departure from multivariate normality based on a log-normalized distance metric.

hz_result <- mvn(data = df, mvn_test = "hz")
summary(hz_result, select = "mvn")
           Test Statistic p.value      MVN
1 Henze-Zirkler     0.286   0.915 ✓ Normal

Royston Test

The Royston test aggregates transformed univariate Shapiro–Wilk statistics into a joint chi‐square test and is noted for reliable performance in small-to-moderate samples.

ro_result <- mvn(data = df, mvn_test = "royston")
summary(ro_result, select = "mvn")
     Test Statistic p.value      MVN
1 Royston     2.698   0.245 ✓ Normal

Mardia Test

Mardia’s skewness and kurtosis measures provide insight into specific aspects of distributional shape; examining both skewness and kurtosis p-values helps diagnose the nature of non-normality, though neither may alone signal departure when sample size is limited.

mardia_result <- mvn(data = df, mvn_test = "mardia")
summary(mardia_result, select = "mvn")
             Test Statistic p.value      MVN
1 Mardia Skewness     0.760   0.944 ✓ Normal
2 Mardia Kurtosis     0.093   0.926 ✓ Normal

Doornik–Hansen Test

The Doornik–Hansen approach applies transformations to approximate a chi‐square distribution of combined moment statistics.

dh_result <- mvn(data = df, mvn_test = "doornik_hansen")
summary(dh_result, select = "mvn")
            Test Statistic df p.value          MVN
1 Doornik-Hansen     11.57  4   0.021 ✗ Not normal

Energy Test

Based on a nonparametric energy distance metric, this test is sensitive to any general deviation from normality. It provides a robust alternative especially when data exhibit heavy tails or multimodality.

energy_result <- mvn(data = df, mvn_test = "energy")
summary(energy_result, select = "mvn")
         Test Statistic p.value      MVN
1 E-Statistic     0.527   0.783 ✓ Normal

Tip

No single test is universally best; Korkmaz et al. (2014) recommend combining multiple numerical tests with graphical diagnostics to make a more reliable decision on multivariate normality.


References

Korkmaz S, Goksuluk D, Zararsiz G. MVN: An R Package for Assessing Multivariate Normality. The R Journal. 2014;6(2):151–162. URL: https://journal.r-project.org/archive/2014-2/korkmaz-goksuluk-zararsiz.pdf