squareform

Format distance matrix

Description

ZOut = squareform(yIn) converts yIn, a pairwise distance vector of length m(m–1)/2 for m observations, into ZOut, an m-by-m symmetric matrix with zeros along the diagonal.

The pairwise distances in yIn are arranged in the order (2,1), (3,1), ..., (m,1), (3,2), ..., (m,2), ..., (m,m–1). The pairwise distance between the ith and jth observations is in ZOut(i,j) and yIn((i-1)*(m-i/2)+j-i) for i < j.
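The index formula can be checked numerically against the matrix form. The following is a minimal sketch, assuming that pdist and squareform are available; the variable names (m, X, k, ok) and the choice of five observations are illustrative only.

```matlab
% Check the linear-index formula against squareform for every pair i < j.
rng('default')            % For reproducibility
m = 5;                    % Number of observations (illustrative choice)
X = rand(m,2);
y = pdist(X);             % Distance vector of length m*(m-1)/2
Z = squareform(y);        % m-by-m distance matrix

ok = true;
for i = 1:m-1
    for j = i+1:m
        k = (i-1)*(m-i/2) + j - i;   % Position of the (i,j) distance in y
        ok = ok && (y(k) == Z(i,j));
    end
end
disp(ok)
```

If the formula holds, ok remains true after the loop has visited every pair.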

yOut = squareform(ZIn) converts ZIn, a square, symmetric matrix with zeros along the diagonal, into yOut, a vector containing the ZIn elements below the diagonal.
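As a rough illustration of what "the elements below the diagonal" means, the sketch below extracts them manually with a logical mask and compares the result with squareform. The matrix values and the names mask and yManual are made up for the example.

```matlab
% Extract the below-diagonal elements manually and compare with squareform.
Z = [0 1 2; 1 0 3; 2 3 0];          % Small symmetric matrix with a zero diagonal
mask = tril(true(size(Z)),-1);      % Logical mask of the strict lower triangle
yManual = Z(mask)';                 % Elements in column order, as a row vector
y = squareform(Z);
isequal(y,yManual)
```

The mask visits the elements in the order (2,1), (3,1), (3,2), which matches the ordering described for the distance vector.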

ZOut = squareform(yIn,'tomatrix') forces squareform to treat yIn as a vector and converts yIn into a matrix.

yOut = squareform(ZIn,'tovector') forces squareform to treat ZIn as a matrix and converts ZIn into a vector. If ZIn is a scalar (1-by-1), then ZIn must be zero.

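The previous two syntaxes are useful when the input argument is a scalar, because a scalar can be read either as a one-element distance vector or as a 1-by-1 distance matrix. If you specify neither 'tomatrix' nor 'tovector' in that case, then the default is 'tomatrix'.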
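The scalar case can be sketched as follows. The input values 5 and 0 are arbitrary illustrations; the second call uses 0 because a 1-by-1 ZIn must be zero.

```matlab
% A scalar input is ambiguous: a one-element vector or a 1-by-1 matrix.
Z = squareform(5,'tomatrix');   % Treat 5 as a distance vector for m = 2
y = squareform(0,'tovector');   % Treat 0 as a 1-by-1 distance matrix (m = 1)
size(Z)                         % Size of the resulting distance matrix
isempty(y)                      % A single observation has no pairwise distances
```

With 'tomatrix', the single distance is placed in both off-diagonal positions of a 2-by-2 matrix. With 'tovector', a single observation has no pairs, so the result has no elements.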

Examples

Compute the Euclidean distance between pairs of observations, and convert the distance vector to a matrix using squareform.

Create a matrix with three observations and two variables.

rng('default') % For reproducibility
X = rand(3,2);

Compute the Euclidean distance.

D = pdist(X)
D = 1×3

0.2954    1.0670    0.9448

The pairwise distances are arranged in the order (2,1), (3,1), (3,2). You can easily locate the distance between observations i and j by using squareform.

Z = squareform(D)
Z = 3×3

     0    0.2954    1.0670
0.2954         0    0.9448
1.0670    0.9448         0

squareform returns a symmetric matrix where Z(i,j) corresponds to the pairwise distance between observations i and j. For example, you can find the distance between observations 2 and 3.

Z(2,3)
ans = 0.9448

Pass Z to the squareform function to reproduce the output of the pdist function.

y = squareform(Z)
y = 1×3

0.2954    1.0670    0.9448

The outputs y from squareform and D from pdist are the same.

Input Arguments

yIn: Input distance vector, specified as a numeric or logical vector of length m(m–1)/2, where m is the number of observations.

The pairwise distances in yIn are arranged in the order (2,1), (3,1), ..., (m,1), (3,2), ..., (m,2), ..., (m,m–1), i.e., the lower-left triangle of the m-by-m distance matrix in column order. The pairwise distance between observations i and j is in yIn((i-1)*(m-i/2)+j-i) for i < j.

You can create yIn by using the pdist function. m is the number of observations in the input data of pdist.
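Because the length of yIn is n = m(m–1)/2, the number of observations can be recovered from the vector alone by solving for m. The sketch below assumes yIn comes from pdist; the data size and the names n and m are illustrative.

```matlab
% Recover the number of observations m from the length of a distance vector.
rng('default')                    % For reproducibility
X = rand(4,3);                    % 4 observations, 3 variables (illustrative)
yIn = pdist(X);
n = numel(yIn);                   % n = m*(m-1)/2
m = (1 + sqrt(1 + 8*n))/2;        % Positive root of m^2 - m - 2*n = 0
ZOut = squareform(yIn);
size(ZOut)                        % Should agree with the recovered m
```

A length that does not yield an integer m from this formula cannot be a valid distance vector.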

Data Types: single | double | logical

ZIn: Input distance matrix, specified as a numeric or logical matrix. ZIn is an m-by-m symmetric matrix with zeros along the diagonal, where m is the number of observations. ZIn(i,j) denotes the distance between the ith and jth observations.

Data Types: single | double | logical

Output Arguments

yOut: Distance vector, returned as a numeric or logical vector of length m(m–1)/2, where m is the number of observations.

The pairwise distances in yOut are arranged in the order (2,1), (3,1), ..., (m,1), (3,2), ..., (m,2), ..., (m,m–1), i.e., the lower-left triangle of the m-by-m distance matrix in column order. The pairwise distance between observations i and j is in yOut((i-1)*(m-i/2)+j-i) for i < j.

yOut has the same format as the output from the pdist function.

ZOut: Distance matrix, returned as a numeric or logical matrix. ZOut is an m-by-m symmetric matrix with zeros along the diagonal, where m is the number of observations. ZOut(i,j) denotes the distance between the ith and jth observations.

Tips

• You can use squareform to format a vector or matrix that is similar to a distance vector or matrix, such as the correlation coefficient matrix (corrcoef).
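One possible way to apply this tip is sketched below. A correlation coefficient matrix has ones on its diagonal, while ZIn is described as having zeros there, so this sketch extracts the below-diagonal coefficients manually and uses squareform only in the vector-to-matrix direction. The data and the names R, r, Rz, and maxDiff are illustrative assumptions, not part of the squareform interface.

```matlab
% Store correlation coefficients in vector form and expand them with squareform.
rng('default')                        % For reproducibility
X = rand(10,4);                       % 10 observations, 4 variables (illustrative)
R = corrcoef(X);                      % 4-by-4 correlation matrix
p = size(R,1);
r = R(tril(true(p),-1))';             % Below-diagonal coefficients as a row vector
Rz = squareform(r);                   % Symmetric matrix with zeros on the diagonal
A = Rz + eye(p) - R;                  % Restore the unit diagonal and compare
maxDiff = max(abs(A(:)))
```

Adding the identity matrix restores the unit diagonal, so maxDiff should be zero up to round-off.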