vartestn

Multiple-sample tests for equal variances

Description

vartestn(x) returns a summary table of statistics and a box plot for a Bartlett test of the null hypothesis that the columns of data matrix x come from normal distributions with the same variance. The alternative hypothesis is that not all columns of data have the same variance.

vartestn(x,Name,Value) returns a summary table of statistics and a box plot for a test of equal variances, with additional options specified by one or more name-value pair arguments. For example, you can specify a different type of hypothesis test or change the display settings for the test results.

vartestn(x,group) returns a summary table of statistics and a box plot for a Bartlett test of the null hypothesis that the data in each categorical group comes from normal distributions with the same variance. The alternative hypothesis is that not all groups have the same variance.

vartestn(x,group,Name,Value) returns a summary table of statistics and a box plot for a test of equal variances across groups, with additional options specified by one or more name-value pair arguments. For example, you can specify a different type of hypothesis test or change the display settings for the test results.

p = vartestn(___) also returns the p-value of the test, p, using any of the input arguments in the previous syntaxes.

[p,stats] = vartestn(___) also returns the structure stats containing information about the test statistic.

Examples

Load the sample data.

Test the null hypothesis that the variances are equal across the five columns of data in the students’ exam grades matrix, grades.

ans = 7.9086e-08

The low $p$-value, p = 7.9086e-08, indicates that vartestn rejects the null hypothesis that the variances are equal across all five columns, in favor of the alternative hypothesis that at least one column has a different variance.
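The commands that produce this result are not shown above. A minimal sketch follows, assuming the grades matrix comes from the examgrades sample data set that ships with the toolbox (the data set name is an assumption, since the source does not state it).

```matlab
% Assumption: the exam grades are stored in the examgrades sample data
% set, which defines a matrix named grades with one column per exam.
load examgrades

% Bartlett test (the default test type) across the columns of grades.
% Called without output arguments, vartestn displays the summary table
% and box plot, and the p-value is shown as ans.
vartestn(grades)
```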

Load the sample data.

Test the null hypothesis that the variances in miles per gallon (MPG) are equal across different model years.

vartestn(MPG,Model_Year)

ans = 0.8327

The high $p$-value, p = 0.83269, indicates that vartestn does not reject the null hypothesis that the variances in miles per gallon (MPG) are equal across different model years.

Load the sample data.

Use Levene’s test to test the null hypothesis that the variances in miles per gallon (MPG) are equal across different model years.

p = vartestn(MPG,Model_Year,'TestType','LeveneAbsolute')

p = 0.6320

The high $p$-value, p = 0.63195, indicates that vartestn does not reject the null hypothesis that the variances in miles per gallon (MPG) are equal across different model years.

Load the sample data.

Test the null hypothesis that the variances are equal across the five columns of data in the students’ exam grades matrix, grades, using the Brown-Forsythe test. Suppress the display of the summary table of statistics and the box plot.

p = 1.3121e-06
stats = struct with fields:
fstat: 8.4160
df: [4 595]

The small $p$-value, p = 1.3121e-06, indicates that vartestn rejects the null hypothesis that the variances are equal across all five columns, in favor of the alternative hypothesis that at least one column has a different variance.
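The call for this example is not shown above. One way to write it, following the description (Brown-Forsythe test, display suppressed, both outputs requested), is sketched below. The examgrades data set name is an assumption.

```matlab
% Assumption: grades comes from the examgrades sample data set.
load examgrades

% 'TestType','BrownForsythe' selects the Brown-Forsythe test;
% 'Display','off' suppresses the summary table and the box plot.
[p,stats] = vartestn(grades,'TestType','BrownForsythe','Display','off')
```

The degrees of freedom reported in stats.df, [4 595], correspond to k – 1 and N – k for k = 5 columns and N = 600 observations.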

Input Arguments

Sample data, specified as a matrix or column vector. If a grouping variable group is specified, then x must be a column vector. If a grouping variable is not specified, x must be a matrix. In either case, vartestn treats NaN values as missing values and ignores them.

Data Types: single | double

Grouping variable, specified as a categorical array, logical or numeric vector, character array, string array, or cell array of character vectors with one row for each element of x. Each unique value in a grouping variable defines a group. vartestn treats NaN values as missing values and ignores them.

For example, if Gender is a cell array of character vectors with values 'Male' and 'Female', you can use Gender as a grouping variable to test your data by gender.

Example: Gender

Data Types: categorical | single | double | logical | string | cell | char
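As an illustration of the grouping syntax, the sketch below builds a hypothetical data vector and a matching cell array of character vectors. The data, the seed, and the group sizes are invented for the example and do not come from any shipped data set.

```matlab
% Hypothetical data: 30 observations with unit standard deviation and
% 40 observations with standard deviation 2, stacked in a column vector.
rng(0)                                   % arbitrary seed for repeatability
x = [randn(30,1); 2*randn(40,1)];

% Grouping variable: one label per element of x.
Gender = [repmat({'Male'},30,1); repmat({'Female'},40,1)];

% Bartlett test of equal variances across the two groups, no display.
p = vartestn(x,Gender,'Display','off');
```

Because group must have one row for each element of x, the two arrays are built with the same number of rows.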

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'TestType','BrownForsythe','Display','off' specifies a Brown-Forsythe test and suppresses the summary table and box plot.

Display settings for test results, specified as the comma-separated pair consisting of 'Display' and one of the following.

• 'on': Display a box plot and table of summary statistics.

• 'off': Do not display a box plot and table of summary statistics.

Example: 'Display','off'

Type of hypothesis test to perform, specified as the comma-separated pair consisting of 'TestType' and one of the following.

• 'Bartlett': Bartlett’s test.

• 'LeveneQuadratic': Levene’s test computed by performing ANOVA on the squared deviations of the data values from their group means.

• 'LeveneAbsolute': Levene’s test computed by performing ANOVA on the absolute deviations of the data values from their group means.

• 'BrownForsythe': Brown-Forsythe test computed by performing ANOVA on the absolute deviations of the data values from the group medians.

• 'OBrien': O’Brien’s modification of Levene’s test with W = 0.5.

Example: 'TestType','OBrien'
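To compare the test types on the same data, one option is to loop over the 'TestType' values. The sketch below assumes that MPG and Model_Year come from the carsmall sample data set (the source does not name the data set), and the variable names tests and pvals are introduced only for this illustration.

```matlab
% Assumption: MPG and Model_Year are defined by the carsmall data set.
load carsmall

tests = {'Bartlett','LeveneQuadratic','LeveneAbsolute', ...
         'BrownForsythe','OBrien'};
pvals = zeros(1,numel(tests));

for i = 1:numel(tests)
    % Suppress the table and box plot; keep only the p-value.
    pvals(i) = vartestn(MPG,Model_Year, ...
        'TestType',tests{i},'Display','off');
    fprintf('%-16s p = %.4f\n',tests{i},pvals(i));
end
```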

Output Arguments

p-value of the test, returned as a scalar value in the range [0,1]. p is the probability of observing a test statistic as extreme as, or more extreme than, the observed value under the null hypothesis. Small values of p cast doubt on the validity of the null hypothesis.

Test statistics for the hypothesis test, returned as a structure containing:

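• chistat: Value of the test statistic. Tests based on an F statistic report it in a field named fstat instead, as in the Brown-Forsythe example above.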

• df: Degrees of freedom of the test.

Bartlett’s Test

Bartlett’s test is used to test whether multiple data samples have equal variances, against the alternative that at least two of the data samples do not have equal variances.

The test statistic is

$T=\frac{\left(N-k\right)\mathrm{ln}{s}_{p}{}^{2}-\sum _{i=1}^{k}\left({N}_{i}-1\right)\mathrm{ln}{s}_{i}{}^{2}}{1+\left(1/\left(3\left(k-1\right)\right)\right)\left(\left(\sum _{i=1}^{k}1/\left({N}_{i}-1\right)\right)-1/\left(N-k\right)\right)},$

where ${s}_{i}{}^{2}$ is the variance of the ith group, N is the total sample size, ${N}_{i}$ is the sample size of the ith group, k is the number of groups, and ${s}_{p}{}^{2}$ is the pooled variance. The pooled variance is defined as

${s}_{p}{}^{2}=\sum _{i=1}^{k}\left({N}_{i}-1\right){s}_{i}{}^{2}/\left(N-k\right).$

The test statistic has a chi-square distribution with k – 1 degrees of freedom under the null hypothesis.

Bartlett’s test is sensitive to departures from normality. If your data comes from a nonnormal distribution, Levene’s test could provide a more accurate result.
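The statistic T can be computed directly from the formulas above. The sketch below does so for hypothetical data with equal group sizes and no NaN values, and compares the resulting p-value with the one vartestn returns. The two should agree closely if vartestn implements the statistic as stated here. All variable names are introduced for illustration only.

```matlab
% Hypothetical data: 4 columns of 50 observations; column 3 has a
% larger spread than the others.
rng(1)                               % arbitrary seed for repeatability
X = randn(50,4);
X(:,3) = 2*X(:,3);

k   = size(X,2);                     % number of groups
Ni  = size(X,1)*ones(1,k);           % sample size of each group
N   = sum(Ni);                       % total sample size
s2  = var(X);                        % 1-by-k sample variances
sp2 = sum((Ni-1).*s2)/(N-k);         % pooled variance

% Numerator and denominator of Bartlett's statistic.
num = (N-k)*log(sp2) - sum((Ni-1).*log(s2));
den = 1 + (1/(3*(k-1)))*(sum(1./(Ni-1)) - 1/(N-k));
T   = num/den;

% Upper-tail probability of a chi-square with k-1 degrees of freedom.
pManual = chi2cdf(T,k-1,'upper');

% Reference value from vartestn (Bartlett is the default test).
p = vartestn(X,'Display','off');
```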

Levene, Brown-Forsythe, and O’Brien Tests

The Levene, Brown-Forsythe, and O’Brien tests are used to test if multiple data samples have equal variances, against the alternative that at least two of the data samples do not have equal variances.

The test statistic is

$W=\frac{\left(N-k\right)\sum _{i=1}^{k}{N}_{i}{\left({\overline{Z}}_{i.}-{\overline{Z}}_{..}\right)}^{2}}{\left(k-1\right)\sum _{i=1}^{k}\sum _{j=1}^{{N}_{i}}{\left({Z}_{ij}-{\overline{Z}}_{i.}\right)}^{2}},$

where N is the total sample size, ${N}_{i}$ is the sample size of the ith group, and k is the number of groups. Depending on the type of test specified with the TestType name-value pair argument, ${Z}_{ij}$ can have one of four definitions:

• If you specify LeveneAbsolute, vartestn uses ${Z}_{ij}=|{Y}_{ij}-{\overline{Y}}_{i.}|$, where ${\overline{Y}}_{i.}$ is the mean of the ith subgroup.

• If you specify LeveneQuadratic, vartestn uses ${Z}_{ij}={\left({Y}_{ij}-{\overline{Y}}_{i.}\right)}^{2}$, where ${\overline{Y}}_{i.}$ is the mean of the ith subgroup.

• If you specify BrownForsythe, vartestn uses ${Z}_{ij}=|{Y}_{ij}-{\tilde{Y}}_{i.}|$, where ${\tilde{Y}}_{i.}$ is the median of the ith subgroup.

• If you specify OBrien, vartestn uses

${Z}_{ij}=\frac{\left(0.5+{N}_{i}-2\right){N}_{i}{\left({Y}_{ij}-{\overline{Y}}_{i.}\right)}^{2}-0.5\left({N}_{i}-1\right){s}_{i}{}^{2}}{\left({N}_{i}-1\right)\left({N}_{i}-2\right)},$

where ${N}_{i}$ is the size of the ith group, ${\overline{Y}}_{i.}$ is its mean, and ${s}_{i}{}^{2}$ is its sample variance.

In all cases, the test statistic has an F-distribution with k – 1 numerator degrees of freedom and N – k denominator degrees of freedom.

The Levene, Brown-Forsythe, and O’Brien tests are less sensitive to departures from normality than Bartlett’s test, so they are useful alternatives if you suspect the samples come from nonnormal distributions.
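As a worked illustration of the statistic W, the sketch below computes the Brown-Forsythe variant by hand (absolute deviations from the group medians) for hypothetical data with equal group sizes and no NaN values, then compares the p-value with the one from vartestn. The result should match closely if vartestn follows the formulas as written here. The variable names are introduced for illustration only.

```matlab
% Hypothetical data: 4 columns of 50 observations; column 3 has a
% larger spread than the others.
rng(1)                               % arbitrary seed for repeatability
X = randn(50,4);
X(:,3) = 2*X(:,3);

[n,k] = size(X);                     % n observations in each of k groups
N = n*k;                             % total sample size

% Z_ij: absolute deviation of each value from its group median.
Z = abs(X - repmat(median(X),n,1));

Zi   = mean(Z);                      % group means of Z (1-by-k)
Zall = mean(Z(:));                   % grand mean of Z (equal group sizes)

% Between-group and within-group sums of squares of Z.
ssBetween = sum(n*(Zi - Zall).^2);
ssWithin  = sum(sum((Z - repmat(Zi,n,1)).^2));

W = ((N-k)*ssBetween)/((k-1)*ssWithin);

% Upper-tail probability of F with k-1 and N-k degrees of freedom.
pManual = fcdf(W,k-1,N-k,'upper');

% Reference values from vartestn.
[p,stats] = vartestn(X,'TestType','BrownForsythe','Display','off');
```

Replacing median(X) with mean(X) in the definition of Z gives the LeveneAbsolute variant.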