<- iris[1:50, 1:2]
df head(df)
Sepal.Length Sepal.Width
1 5.1 3.5
2 4.9 3.0
3 4.7 3.2
4 4.6 3.1
5 5.0 3.6
6 5.4 3.9
Multivariate normality is a foundational assumption for methods such as MANOVA, principal component analysis, and linear discriminant analysis. In this tutorial, we’ll apply and interpret the five main tests implemented in the MVN package, guided by the recommendations of Korkmaz et al. (2014).
The mvn()
function also supports the use of bootstrap resampling for estimating p-values. By setting bootstrap = TRUE
, you can obtain empirically derived p-values for the Mardia, Henze–Zirkler, Henze-Wagner, Royston and Doornik-Hansen tests. This is particularly useful when working with small sample sizes or when the assumptions underlying asymptotic distributions may not hold. The default setting is FALSE.
We’ll use two numeric variables from the built-in iris
dataset:
Sepal.Length Sepal.Width
1 5.1 3.5
2 4.9 3.0
3 4.7 3.2
4 4.6 3.1
5 5.0 3.6
6 5.4 3.9
The Henze–Zirkler test is recommended for its balanced control of Type I error and good power properties under moderate sample sizes; a p-value below 0.05 suggests departure from multivariate normality based on a log-normalized distance metric.
── Multivariate Normality Test Results ─────────────────────────────────────────
Test Statistic p.value Method MVN
1 Henze-Zirkler 0.286 0.915 asymptotic ✓ Normal
The Henze–Wagner test is a statistical method for assessing multivariate normality. It is based on a non-negative functional distance that quantifies the discrepancy between the empirical and theoretical characteristic functions. The test statistic follows an approximate asymptotic distribution under the null hypothesis of multivariate normality. It is particularly suited for datasets where the number of variables p is less than the number of observations n (i.e., p<n). A p-value below 0.05 provides evidence against the assumption of multivariate normality.
── Multivariate Normality Test Results ─────────────────────────────────────────
Test Statistic p.value Method MVN
1 Henze-Wagner 0.136 0.939 asymptotic ✓ Normal
The Royston test aggregates transformed univariate Shapiro–Wilk statistics into a joint chi‐square test and is noted for reliable performance in small-to-moderate samples.
── Multivariate Normality Test Results ─────────────────────────────────────────
Test Statistic p.value Method MVN
1 Royston 2.698 0.245 asymptotic ✓ Normal
Mardia’s skewness and kurtosis measures provide insight into specific aspects of distributional shape; examining both skewness and kurtosis p-values helps diagnose the nature of non-normality, though neither may alone signal departure when sample size is limited.
── Multivariate Normality Test Results ─────────────────────────────────────────
Test Statistic p.value Method MVN
1 Mardia Skewness 0.760 0.944 asymptotic ✓ Normal
2 Mardia Kurtosis 0.093 0.926 asymptotic ✓ Normal
The Doornik–Hansen approach applies transformations to approximate a chi‐square distribution of combined moment statistics.
── Multivariate Normality Test Results ─────────────────────────────────────────
Test Statistic df p.value Method MVN
1 Doornik-Hansen 11.57 4 0.021 asymptotic ✗ Not normal
Based on a nonparametric energy distance metric, this test is sensitive to any general deviation from normality. It provides a robust alternative especially when data exhibit heavy tails or multimodality.
── Multivariate Normality Test Results ─────────────────────────────────────────
Test Statistic p.value Method MVN
1 E-Statistic 0.527 0.783 bootstrap ✓ Normal
No single test is universally best; Korkmaz et al. (2014) recommend combining multiple numerical tests with graphical diagnostics to make a more reliable decision on multivariate normality.
Korkmaz S, Goksuluk D, Zararsiz G. MVN: An R Package for Assessing Multivariate Normality. The R Journal. 2014;6(2):151–162. URL: https://journal.r-project.org/archive/2014-2/korkmaz-goksuluk-zararsiz.pdf