# fishertest

Fisher’s exact test

## Description

`h = fishertest(x)` returns a test decision for Fisher’s exact test of the null hypothesis that there are no nonrandom associations between the two categorical variables in `x`, against the alternative that there is a nonrandom association. The result `h` is `1` if the test rejects the null hypothesis at the 5% significance level, or `0` otherwise.

`[___] = fishertest(x,Name,Value)` returns a test decision using additional options specified by one or more name-value pair arguments. For example, you can change the significance level of the test or conduct a one-sided test.

## Examples

### Conduct Fisher's Exact Test

In a small survey, a researcher asked 17 individuals if they received a flu shot this year, and whether they caught the flu this winter. The results indicate that, of the nine people who did not receive a flu shot, three got the flu and six did not. Of the eight people who received a flu shot, one got the flu and seven did not.

Create a 2-by-2 contingency table containing the survey data. Row 1 contains data for the individuals who did not receive a flu shot, and row 2 contains data for the individuals who received a flu shot. Column 1 contains the number of individuals who got the flu, and column 2 contains the number of individuals who did not.

x = table([3;1],[6;7],'VariableNames',{'Flu','NoFlu'},'RowNames',{'NoShot','Shot'})

`x = `*2×2 table*

| | Flu | NoFlu |
|---|---|---|
| NoShot | 3 | 6 |
| Shot | 1 | 7 |

Use Fisher's exact test to determine if there is a nonrandom association between receiving a flu shot and getting the flu.

h = fishertest(x)

`h = `*logical*
0

The returned test decision `h = 0` indicates that `fishertest` does not reject the null hypothesis of no nonrandom association between the categorical variables at the default 5% significance level. Therefore, the test results provide no evidence that individuals who do not get a flu shot have different odds of getting the flu than those who got the flu shot.

### Conduct a One-Sided Fisher's Exact Test

In a small survey, a researcher asked 17 individuals if they received a flu shot this year, and whether they caught the flu. The results indicate that, of the nine people who did not receive a flu shot, three got the flu and six did not. Of the eight people who received a flu shot, one got the flu and seven did not.

x = [3,6;1,7];

Use a right-tailed Fisher's exact test to determine if the odds of getting the flu are higher for individuals who did not receive a flu shot than for individuals who did. Conduct the test at the 1% significance level.

[h,p,stats] = fishertest(x,'Tail','right','Alpha',0.01)

`h = `*logical*
0

p = 0.3353

`stats = `*struct with fields:*
OddsRatio: 3.5000
ConfidenceInterval: [0.1289 95.0408]

The returned test decision `h = 0` indicates that `fishertest` does not reject the null hypothesis of no nonrandom association between the categorical variables at the 1% significance level. Since this is a right-tailed hypothesis test, the results provide no evidence that individuals who do not get a flu shot have greater odds of getting the flu than those who got the flu shot.

### Generate a Contingency Table Using `crosstab`

Load the hospital data.

`load hospital`

The `hospital` dataset array contains data on 100 hospital patients, including last name, gender, age, weight, smoking status, and systolic and diastolic blood pressure measurements.

To determine if smoking status is independent of gender, use `crosstab` to create a 2-by-2 contingency table of smokers and nonsmokers, grouped by gender.

[tbl,chi2,p,labels] = crosstab(hospital.Sex,hospital.Smoker)

`tbl = `*2×2*
40 13
26 21

chi2 = 4.5083

p = 0.0337

`labels = `*2x2 cell*
{'Female'} {'0'}
{'Male' } {'1'}

The rows of the resulting contingency table `tbl` correspond to the patient's gender, with row 1 containing data for females and row 2 containing data for males. The columns correspond to the patient's smoking status, with column 1 containing data for nonsmokers and column 2 containing data for smokers. The returned result `chi2 = 4.5083` is the value of the chi-squared test statistic for a chi-squared test of independence. The returned value `p = 0.0337` is an approximate *p*-value based on the chi-squared distribution.

Use the contingency table generated by `crosstab` to perform Fisher's exact test on the data.

[h,p,stats] = fishertest(tbl)

`h = `*logical*
1

p = 0.0375

`stats = `*struct with fields:*
OddsRatio: 2.4852
ConfidenceInterval: [1.0624 5.8135]

The result `h = 1` indicates that `fishertest` rejects the null hypothesis of nonassociation between smoking status and gender at the 5% significance level. In other words, there is an association between gender and smoking status. The odds ratio indicates that the male patients have about 2.5 times greater odds of being smokers than the female patients.

The returned *p*-value of the test, `p = 0.0375`, is close to, but not exactly the same as, the result obtained by `crosstab`. This is because `fishertest` computes an exact *p*-value using the sample data, while `crosstab` uses a chi-squared approximation to compute the *p*-value.
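The odds ratio reported in `stats` can be checked by hand from the contingency table. The following is a minimal sketch that re-enters the counts from `tbl` as a literal matrix; the variable names `oddsFemale`, `oddsMale`, and `oddsRatio` are illustrative and are not part of `fishertest`.

```matlab
% Counts copied from the crosstab output in this example:
% rows = Female, Male; columns = nonsmoker, smoker.
tbl = [40 13; 26 21];

oddsFemale = tbl(1,2)/tbl(1,1);   % odds of smoking among females, 13/40
oddsMale   = tbl(2,2)/tbl(2,1);   % odds of smoking among males, 21/26

% Ratio of the two odds; algebraically equal to n11*n22/(n21*n12).
oddsRatio = oddsMale/oddsFemale;
fprintf('Odds ratio: %.4f\n',oddsRatio)
```

The printed value rounds to 2.4852, which agrees with the `OddsRatio` field shown in the example output.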

## Input Arguments

`x`

— Contingency table

2-by-2 matrix of nonnegative integer values | 2-by-2 table of nonnegative integer values

Contingency table, specified as a 2-by-2 matrix or table containing nonnegative integer values. A contingency table contains the frequency distribution of the variables in the sample data. You can use `crosstab` to generate a contingency table from sample data.

**Example: **`[4,0;0,4]`

**Data Types: **`single` | `double`

### Name-Value Arguments

Specify optional pairs of arguments as `Name1=Value1,...,NameN=ValueN`, where `Name` is the argument name and `Value` is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

*Before R2021a, use commas to separate each name and value, and enclose* `Name` *in quotes.*

**Example: **`'Alpha',0.01,'Tail','right'` specifies a right-tailed hypothesis test at the 1% significance level.

`Alpha`

— Significance level

0.05 (default) | scalar value in the range (0,1)

Significance level of the hypothesis test, specified as the comma-separated pair consisting of `'Alpha'` and a scalar value in the range (0,1).

**Example: **`'Alpha',0.01`

**Data Types: **`single` | `double`

`Tail`

— Type of alternative hypothesis

`'both'` (default) | `'right'` | `'left'`

Type of alternative hypothesis, specified as the comma-separated pair consisting of `'Tail'` and one of the following.

| Value | Description |
|---|---|
| `'both'` | Two-tailed test. The alternative hypothesis is that there is a nonrandom association between the two variables in `x`, and the odds ratio is not equal to 1. |
| `'right'` | Right-tailed test. The alternative hypothesis is that the odds ratio is greater than 1. |
| `'left'` | Left-tailed test. The alternative hypothesis is that the odds ratio is less than 1. |

**Example: **`'Tail','right'`
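As a brief illustration of the two ways to pass name-value arguments, the sketch below runs the same right-tailed test at the 1% level using both syntaxes. It reuses the flu-shot counts from the examples and assumes release R2021a or later for the `Name=Value` form; the output names `h1`, `p1`, `h2`, and `p2` are illustrative.

```matlab
x = [3,6;1,7];   % flu-shot contingency table from the examples

% R2021a and later: Name=Value syntax.
[h1,p1] = fishertest(x,Tail='right',Alpha=0.01);

% Any release: comma-separated pairs with the names in quotes.
[h2,p2] = fishertest(x,'Tail','right','Alpha',0.01);

% Both calls request the same test, so the outputs should agree.
sameResult = isequal(h1,h2) && isequal(p1,p2);
```

Because the order of the pairs does not matter, `Alpha` and `Tail` can be swapped in either call without changing the result.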

## Output Arguments

`h`

— Hypothesis test result

`1` | `0`

Hypothesis test result, returned as a logical value.

- If `h` is `1`, then `fishertest` rejects the null hypothesis at the `Alpha` significance level.
- If `h` is `0`, then `fishertest` fails to reject the null hypothesis at the `Alpha` significance level.

`p`

— *p*-value

scalar value in the range [0,1]

*p*-value of the test, returned as a scalar value in the range [0,1]. `p` is the probability of observing a test statistic as extreme as, or more extreme than, the observed value under the null hypothesis. Small values of `p` cast doubt on the validity of the null hypothesis.

`stats`

— Test data

structure

Test data, returned as a structure with the following fields:

- `OddsRatio`: A measure of association between the two variables.
- `ConfidenceInterval`: Asymptotic confidence interval for the odds ratio. If any of the cell frequencies in `x` are 0, then `fishertest` does not compute a confidence interval and instead displays `[-Inf Inf]`.
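The zero-frequency case can be seen with the contingency table `[4,0;0,4]`, which is the example value given for `x`. This is a small sketch of how one might inspect the `stats` output; the variable name `ci` is illustrative.

```matlab
% Contingency table with two zero cell frequencies.
x = [4,0;0,4];

[h,p,stats] = fishertest(x);

% With a zero cell, the documented behavior is that the
% confidence interval is reported as [-Inf Inf].
ci = stats.ConfidenceInterval
```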

## More About

### Fisher’s Exact Test

Fisher’s exact test is a nonparametric statistical test used to test the null hypothesis that no nonrandom associations exist between two categorical variables, against the alternative that there is a nonrandom association between the variables.

Fisher’s exact test provides an alternative to the chi-squared test for small samples, or samples with very uneven marginal distributions. Unlike the chi-squared test, Fisher’s exact test does not depend on large-sample distribution assumptions, and instead calculates an exact *p*-value based on the sample data. Although Fisher’s exact test is valid for samples of any size, it is not recommended for large samples because it is computationally intensive. If all of the frequency counts in the contingency table are greater than or equal to `1e7`, then `fishertest` errors. For contingency tables that contain large count values or are well-balanced, use `crosstab` or `chi2gof` instead.

`fishertest` accepts a 2-by-2 contingency table as input, and computes the *p*-value of the test as follows:

1. Calculate the sums for each row, column, and total number of observations in the contingency table.

2. Using a multivariate generalization of the hypergeometric probability function, calculate the conditional probability of observing the exact result in the contingency table if the null hypothesis were true, given its row and column sums. The conditional probability is

   $$P_{cutoff}=\frac{\left(R_{1}!R_{2}!\right)\left(C_{1}!C_{2}!\right)}{N!\prod_{i,j}n_{ij}!},$$

   where $$R_{1}$$ and $$R_{2}$$ are the row sums, $$C_{1}$$ and $$C_{2}$$ are the column sums, *N* is the total number of observations in the contingency table, and $$n_{ij}$$ is the value in the *i*th row and *j*th column of the table.

3. Find all possible matrices of nonnegative integers consistent with the row and column sums. For each matrix, calculate the associated conditional probability using the equation for $$P_{cutoff}$$.

4. Use these values to calculate the *p*-value of the test, based on the alternative hypothesis of interest.

   - For a two-sided test, sum all of the conditional probabilities less than or equal to $$P_{cutoff}$$ for the observed contingency table. This represents the probability of observing a result as extreme as, or more extreme than, the actual outcome if the null hypothesis were true. Small *p*-values cast doubt on the validity of the null hypothesis, in favor of the alternative hypothesis of association between the variables.
   - For a left-sided test, sum the conditional probabilities of all the matrices with a (1,1) cell frequency less than or equal to $$n_{11}$$ in the observed contingency table.
   - For a right-sided test, sum the conditional probabilities of all the matrices with a (1,1) cell frequency greater than or equal to $$n_{11}$$ in the observed contingency table.
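The steps above can be sketched in a few lines of MATLAB. This is an illustrative reimplementation under stated assumptions, not the code that `fishertest` runs internally. For a 2-by-2 table with fixed margins, every consistent matrix is determined by its (1,1) entry, so the sketch enumerates that entry only. It also writes the conditional probability in the equivalent binomial-coefficient form of the factorial formula, and it uses a small relative tolerance (`1e-7`, an assumption) when comparing floating-point probabilities. All variable names are hypothetical.

```matlab
x = [3,6;1,7];            % flu-shot contingency table from the examples

% Step 1: row sums, column sums, and total count.
R = sum(x,2);             % [R1; R2]
C = sum(x,1);             % [C1, C2]
N = sum(x(:));

% Step 3: all feasible values of the (1,1) cell given the margins.
k = max(0,C(1)-R(2)):min(R(1),C(1));

% Step 2: conditional probability of each table. This equals
% R1!R2!C1!C2!/(N!*prod(nij!)) written with binomial coefficients.
prob = arrayfun(@(a) nchoosek(R(1),a)*nchoosek(R(2),C(1)-a) ...
    /nchoosek(N,C(1)),k);

% Probability of the observed table (P_cutoff).
pCutoff = prob(k == x(1,1));

% Step 4: p-values for each alternative hypothesis.
pBoth  = sum(prob(prob <= pCutoff*(1+1e-7)));   % two-sided
pLeft  = sum(prob(k <= x(1,1)));                % left-sided
pRight = sum(prob(k >= x(1,1)));                % right-sided

fprintf('both: %.4f, left: %.4f, right: %.4f\n',pBoth,pLeft,pRight)
```

For this table the right-sided value is 798/2380, which rounds to 0.3353 and matches the *p*-value returned in the one-sided example.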

The odds ratio is

$$OR=\frac{{n}_{11}{n}_{22}}{{n}_{21}{n}_{12}}\text{\hspace{0.17em}}.$$

The null hypothesis of conditional independence is equivalent to the hypothesis that the odds ratio equals 1. The left-sided alternative is equivalent to an odds ratio less than 1, and the right-sided alternative is equivalent to an odds ratio greater than 1.

The asymptotic 100(1 – α)% confidence interval for the odds ratio is

$$CI=\left[\exp\left(L-\Phi^{-1}\left(1-\frac{\alpha}{2}\right)SE\right),\ \exp\left(L+\Phi^{-1}\left(1-\frac{\alpha}{2}\right)SE\right)\right],$$

where *L* is the log odds ratio, Φ^{-1}( • ) is the inverse of the standard normal cumulative distribution function, and *SE* is the standard error for the log odds ratio. If the 100(1 – α)% confidence interval does not contain the value 1, then the association is significant at the α significance level. If any of the four cell frequencies are 0, then `fishertest` does not compute the confidence interval and instead displays `[-Inf Inf]`.
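As a worked check of the odds ratio and its confidence interval, the sketch below recomputes both for the flu-shot table at α = 0.01. Two assumptions are made that are not stated in the text: the standard error of the log odds ratio is taken to be the square root of the sum of the reciprocal cell counts, and the normal quantile is evaluated at 1 − α/2. These choices were made because they reproduce the interval shown in the one-sided example. The variable names are illustrative, and `norminv` requires Statistics and Machine Learning Toolbox.

```matlab
x = [3,6;1,7];      % flu-shot contingency table from the examples
alpha = 0.01;       % significance level used in the one-sided example

% Odds ratio n11*n22/(n21*n12) and its logarithm.
oddsRatio = (x(1,1)*x(2,2))/(x(2,1)*x(1,2));
L = log(oddsRatio);

% Assumed standard error of the log odds ratio:
% sqrt(1/n11 + 1/n12 + 1/n21 + 1/n22).
SE = sqrt(sum(1./x(:)));

% Standard normal quantile at 1 - alpha/2 (assumption, see lead-in).
z = norminv(1 - alpha/2);

% Back-transform the interval for L to the odds-ratio scale.
CI = exp([L - z*SE, L + z*SE]);

fprintf('OR = %.4f, CI = [%.4f %.4f]\n',oddsRatio,CI(1),CI(2))
```

The result agrees with `OddsRatio: 3.5000` and, to the displayed precision, with `ConfidenceInterval: [0.1289 95.0408]` from the example. The interval contains 1, consistent with the test not rejecting the null hypothesis.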

`fishertest` only accepts 2-by-2 contingency tables as input. To test the independence of categorical variables with more than two levels, use the chi-squared test provided by `crosstab`.

## Version History

**Introduced in R2014b**
