Univariate Normality & Descriptive Statistics

Before running joint multivariate analyses, check that each variable is approximately normal on its own, and review descriptive statistics such as the mean, standard deviation, skewness, and kurtosis. Univariate normality of every variable is necessary, though not sufficient, for multivariate normality. In this section, we’ll first run univariate Anderson–Darling tests on each variable, then calculate key summary statistics.


Example Data

# Load the package:
library(MVN)

We’ll use two numeric variables (Sepal.Length and Sepal.Width) from the first 50 rows of the built-in iris dataset, which are the setosa observations:

df <- iris[1:50, 1:2]
head(df)
  Sepal.Length Sepal.Width
1          5.1         3.5
2          4.9         3.0
3          4.7         3.2
4          4.6         3.1
5          5.0         3.6
6          5.4         3.9

Univariate Normality Tests

If you already have an mvn object (e.g., from the Henze–Zirkler test), you can reuse it; otherwise create one as shown here, then extract the Anderson–Darling results for each variable:

# Run mvn (if not already run)
hz_result <- mvn(data = df, mvn_test = "hz", univariate_test = "AD")

# Extract univariate Anderson–Darling results
summary(hz_result, select = "uni")
              Test     Variable Statistic p.value Normality
1 Anderson-Darling Sepal.Length     0.408   0.335  ✓ Normal
2 Anderson-Darling  Sepal.Width     0.491   0.210  ✓ Normal

Sepal.Length
- Statistic = 0.408
- p-value = 0.335 → p > 0.05 → Normality assumption is not violated

Sepal.Width
- Statistic = 0.491
- p-value = 0.210 → p > 0.05 → Normality assumption is not violated

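Neither variable shows a significant deviation from a normal distribution at the 5% level based on the Anderson–Darling test. A non-significant result does not prove normality; it only means the test found no evidence against it in a sample of 50 observations.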
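
To see where the statistic and p-value come from, the Anderson–Darling test can be written out in base R. The sketch below is illustrative only: ad_normal() is a hypothetical helper, not an MVN function, and the piecewise p-value approximation is assumed to match the one used by the underlying implementation (it is the form used, to our understanding, by nortest::ad.test).

```r
# Anderson-Darling normality test in base R (illustrative sketch).
# ad_normal() is a hypothetical helper, not part of MVN.
ad_normal <- function(x) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  z <- (x - mean(x)) / sd(x)   # standardise with the sample mean and SD

  # A^2 = -n - (1/n) * sum_i (2i - 1) * [log F(z_(i)) + log(1 - F(z_(n+1-i)))]
  log_lower <- pnorm(z, log.p = TRUE)
  log_upper <- pnorm(z, lower.tail = FALSE, log.p = TRUE)
  a2 <- -n - mean((2 * seq_len(n) - 1) * (log_lower + rev(log_upper)))

  # Small-sample adjustment, then a piecewise p-value approximation
  # (assumed form; see the note above)
  a2_adj <- a2 * (1 + 0.75 / n + 2.25 / n^2)
  p <- if (a2_adj < 0.2) {
    1 - exp(-13.436 + 101.14 * a2_adj - 223.73 * a2_adj^2)
  } else if (a2_adj < 0.34) {
    1 - exp(-8.318 + 42.796 * a2_adj - 59.938 * a2_adj^2)
  } else if (a2_adj < 0.6) {
    exp(0.9177 - 4.279 * a2_adj - 1.38 * a2_adj^2)
  } else if (a2_adj < 10) {
    exp(1.2937 - 5.709 * a2_adj + 0.0186 * a2_adj^2)
  } else {
    3.7e-24
  }
  c(statistic = a2, p.value = p)
}

df <- iris[1:50, 1:2]
ad_by_hand <- t(sapply(df, ad_normal))   # one row per variable
round(ad_by_hand, 3)
```

If the assumptions hold, the rounded values should agree with the Statistic and p.value columns reported by summary() above.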


Tip

In the mvn() function, the default univariate normality test is “AD” (Anderson–Darling). However, you can choose alternative tests such as “SW” (Shapiro–Wilk), “SF” (Shapiro–Francia), “CVM” (Cramér–von Mises), or “Lillie” (Lilliefors).
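
As a brief illustration of switching tests, the sketch below runs Shapiro–Wilk in two ways: directly with base R's shapiro.test(), and through mvn() with univariate_test = "SW". The mvn() call assumes the same MVN version and argument names used elsewhere in this section, and it is guarded so the snippet still runs when MVN is not installed.

```r
df <- iris[1:50, 1:2]

# Cross-check with base R: shapiro.test() ships with the stats package
sw_base <- sapply(df, function(x) {
  res <- shapiro.test(x)
  c(W = unname(res$statistic), p.value = res$p.value)
})
round(t(sw_base), 3)

# The same test through mvn(), using the argument names shown in this section
# (assumes an MVN version that accepts mvn_test / univariate_test)
if (requireNamespace("MVN", quietly = TRUE)) {
  sw_result <- MVN::mvn(data = df, mvn_test = "hz", univariate_test = "SW")
  summary(sw_result, select = "uni")
}
```

Different tests weight the tails and the centre of the distribution differently, so their p-values will not match exactly; what matters is whether they lead to the same conclusion.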


Descriptive Statistics

Compute numerical summaries—mean, standard deviation, median, minimum, maximum, quartiles, skewness, and kurtosis—for each variable:

# Descriptive statistics for each variable
summary(hz_result, select = "descriptive")
      Variable  n  Mean Std.Dev Median Min Max 25th  75th  Skew Kurtosis
1 Sepal.Length 50 5.006   0.352    5.0 4.3 5.8  4.8 5.200 0.116    2.654
2  Sepal.Width 50 3.428   0.379    3.4 2.3 4.4  3.2 3.675 0.040    3.744
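
Both skewness values are close to 0, and both kurtosis values are close to 3, the value for a normal distribution on the non-excess scale, which is consistent with the Anderson–Darling results. The base R sketch below recomputes these shape statistics from central moments. It assumes the table uses simple moment estimators with divisor n; this is inferred from the printed values rather than confirmed by the source, and shape_stats() is a hypothetical helper, not an MVN function.

```r
df <- iris[1:50, 1:2]

# Moment-based shape statistics (hypothetical helper, not an MVN function)
shape_stats <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  m2 <- mean(d^2)   # central moments with divisor n (assumption, see above)
  m3 <- mean(d^3)
  m4 <- mean(d^4)
  c(mean     = mean(x),
    sd       = sd(x),
    skewness = m3 / m2^1.5,
    kurtosis = m4 / m2^2)   # non-excess scale: 3 for a normal distribution
}

desc_by_hand <- t(sapply(df, shape_stats))   # one row per variable
round(desc_by_hand, 3)
```

Subtracting 3 from the kurtosis column gives excess kurtosis, which is the scale some other packages report by default.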
