# surrogateAssociation

Mean predictive measure of association for surrogate splits in classification tree

## Syntax

```
ma = surrogateAssociation(tree)
ma = surrogateAssociation(tree,N)
```

## Description

`ma = surrogateAssociation(tree)` returns a matrix of predictive measures of association for the predictors in `tree`.

`ma = surrogateAssociation(tree,N)` returns a matrix of predictive measures of association averaged over the nodes in vector `N`.

## Input Arguments

 `tree` A classification tree constructed with `fitctree`, or a compact classification tree constructed with `compact`.

 `N` Vector of node numbers in `tree`.

## Output Arguments

 `ma` `ma = surrogateAssociation(tree)` returns a `P`-by-`P` matrix, where `P` is the number of predictors in `tree`. `ma(i,j)` is the predictive measure of association between the optimal split on variable `i` and a surrogate split on variable `j`. For more details, see Algorithms.

 `ma = surrogateAssociation(tree,N)` returns a `P`-by-`P` matrix of the predictive measures of association between variables, averaged over the nodes in the vector `N`. `N` contains node numbers from `1` to `tree.NumNodes`.

## Examples

Load Fisher's iris data set.

`load fisheriris`

Grow a classification tree using `species` as the response. Specify that the tree use surrogate splits for missing values.

`tree = fitctree(meas,species,'surrogate','on');`

Find the mean predictive measure of association between the predictor variables.

`ma = surrogateAssociation(tree)`
```
ma = 4×4

    1.0000         0         0         0
         0    1.0000         0         0
    0.4633    0.2500    1.0000    0.5000
    0.2065    0.1413    0.4022    1.0000
```

Find the mean predictive measure of association averaged over the odd-numbered nodes in `tree`.

```
N = 1:2:tree.NumNodes;
ma = surrogateAssociation(tree,N)
```
```
ma = 4×4

    1.0000         0         0         0
         0    1.0000         0         0
    0.7600    0.5000    1.0000    1.0000
    0.4130    0.2826    0.8043    1.0000
```
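
One way to read such a matrix is to look, row by row, for the off-diagonal column with the largest value: for the optimal-split predictor `i`, that column is the predictor whose surrogate splits have the highest mean association. The following is a minimal sketch, not part of the original example. It hard-codes the values from the first output above (the call without `N`), and the names `offDiag`, `bestAssoc`, and `bestSurrogate` are illustrative only.

```matlab
% Values copied from the output of ma = surrogateAssociation(tree) above.
ma = [1.0000 0      0      0
      0      1.0000 0      0
      0.4633 0.2500 1.0000 0.5000
      0.2065 0.1413 0.4022 1.0000];

% Zero the diagonal so a predictor is not matched with itself.
offDiag = ma - diag(diag(ma));

% For each row i, find the column j with the largest mean association.
% Rows that are all zero have no surrogate association; max then
% returns index 1 for them, which carries no meaning.
[bestAssoc,bestSurrogate] = max(offDiag,[],2);

% Keep only the rows that have a positive association.
hasSurrogate = bestAssoc > 0;
[find(hasSurrogate) bestSurrogate(hasSurrogate) bestAssoc(hasSurrogate)]
```

With these values, row 3 pairs with column 1 (0.4633) and row 4 pairs with column 3 (0.4022).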

## Algorithms

Element `ma(i,j)` is the predictive measure of association averaged over surrogate splits on predictor `j` for which predictor `i` is the optimal split predictor. This average is computed by summing positive values of the predictive measure of association over optimal splits on predictor `i` and surrogate splits on predictor `j` and dividing by the total number of optimal splits on predictor `i`, including splits for which the predictive measure of association between predictors `i` and `j` is negative.
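
The averaging rule can be illustrated with a small numeric sketch. The association values below are made up for illustration and do not come from a fitted tree; `lambda`, `positiveSum`, and `numOptimalSplits` are hypothetical names, and the sketch assumes one association value per optimal split on predictor `i`.

```matlab
% Hypothetical association values between the optimal split on predictor i
% and the surrogate split on predictor j, one value per optimal split on i.
lambda = [0.8; 0.4; -0.2; 0.6];

% Only positive values contribute to the numerator.
positiveSum = sum(lambda(lambda > 0));

% The denominator counts every optimal split on predictor i,
% including the one whose association is negative.
numOptimalSplits = numel(lambda);

ma_ij = positiveSum/numOptimalSplits   % (0.8 + 0.4 + 0.6)/4 = 0.45
```

Because the negative value is dropped from the sum but still counted in the denominator, it lowers the average relative to averaging the positive values alone (1.8/3 = 0.6).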