Array Row Similarity/Comparison

Question

Tyler Smith on 16 Nov 2016

https://se.mathworks.com/matlabcentral/answers/312598-array-row-similarity-comparison

Commented: Tyler Smith on 17 Nov 2016

I want to compare the rows of two arrays to see which rows are most similar to one another, sort of like clustering. To be clear, I am not interested in the element-by-element differences between the numbers, but in how similar each row is as a whole. I would also like to see whether particular numbers in the SLP variable occur more often when a given number shows up at the same index in the 500z variable. Both variables have 49 rows and 20 columns, but the example variables below include only 3 rows and 5 columns.

SLP  = [1,3,4,2,3
        4,7,6,5,6
        1,4,3,3,2];
% MATLAB variable names cannot start with a digit, so 500z is stored as z500
z500 = [9,6,7,6,6
        7,5,7,6,8
        9,7,6,6,6];

An example of the output I would like:

1.) A measure of row similarity (perhaps a percentage of similarity or even a cluster number). The most similar rows in SLP are rows 1 and 3, so an example output could be a 3x3 matrix (SLP rows 1-3 going down and 500z rows 1-3 going across) with the percentage of similarity between each pair of rows. Or it could be in the form of clusters, e.g. rows 1 and 3 belonging to cluster 1 and row 2 belonging to cluster 2.

2.) Which numbers occur most frequently at the same index in the two variables. Looking at the sample variables, I would get: SLP 1 tends to occur with 500z 9, SLP 3 tends to occur with 500z 6, SLP 4 tends to occur with 500z 7, and so on. This could be output as simply as an array where column 1 is the SLP value of the pair and column 2 is the 500z value. It would be great to have a column 3 as well, saying how often the pair occurred.

I've been stuck on this for a while, so any help or suggestions on how best to approach this problem would be awesome! I am also fairly new to MATLAB, so my wording may not be the best.

Answer 1

Walter Roberson on 17 Nov 2016

https://se.mathworks.com/matlabcentral/answers/312598-array-row-similarity-comparison#answer_243593

One common measure of row similarity is the sum of squared differences between the rows.

The square root of the sum of squared differences is the Euclidean distance, so you can get a (dis)similarity measure by calling pdist() on the rows: the smaller the distance, the more similar the two rows. Note that pdist() is part of the Statistics and Machine Learning Toolbox.
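As a rough illustration of the pdist() suggestion, here is a minimal sketch on the sample SLP data from the question. It assumes the Statistics and Machine Learning Toolbox is installed (pdist and squareform live there); the variable names D, minDist, r and c are just illustrative.

```matlab
% Sample data from the question (3 rows, 5 columns)
SLP = [1,3,4,2,3; 4,7,6,5,6; 1,4,3,3,2];

% pdist returns the pairwise Euclidean distances between rows as a vector;
% squareform turns that vector into a symmetric 3x3 matrix with zeros
% on the diagonal.
D = squareform(pdist(SLP));

% Ignore each row's zero distance to itself before searching for the
% closest pair of rows.
D(logical(eye(size(D)))) = Inf;
[minDist, k] = min(D(:));
[r, c] = ind2sub(size(D), k);

fprintf('Closest rows: %d and %d (distance %.4f)\n', r, c, minDist);
```

On this sample the closest pair is rows 1 and 3, with a distance of 2, which matches the expectation stated in the question. To compare the rows of one array against the rows of another (SLP rows going down, 500z rows going across), pdist2 with the two arrays as its two arguments should give the corresponding rectangular distance matrix.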

Tyler Smith on 17 Nov 2016

Thanks, that definitely helps! I ended up using pdist2(SLP, SLP) to get the matrix I wanted for the first part of the question.
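The second part of the question (which values tend to occur at the same index in the two arrays) is not covered by the distance approach. One possible sketch, using only base MATLAB functions (unique, accumarray, sortrows), is shown below. The name z500 is an assumption standing in for the 500z variable, since MATLAB names cannot start with a digit; pairs, counts and result are likewise illustrative names.

```matlab
% Sample data from the question; z500 stands in for "500z"
SLP  = [1,3,4,2,3; 4,7,6,5,6; 1,4,3,3,2];
z500 = [9,6,7,6,6; 7,5,7,6,8; 9,7,6,6,6];

% Pair up the values that share the same index in the two arrays
pairs = [SLP(:), z500(:)];

% Find the distinct (SLP, z500) pairs; idx maps each original pair
% to its row in uniquePairs
[uniquePairs, ~, idx] = unique(pairs, 'rows');

% Count how many times each distinct pair occurs
counts = accumarray(idx, 1);

% Column 1: SLP value, column 2: z500 value, column 3: count,
% sorted so the most frequent pairs come first
result = sortrows([uniquePairs, counts], -3);
disp(result)
```

On the sample data the most frequent pair is SLP 3 with 500z 6 (4 occurrences), followed by SLP 4 with 500z 7 (3 occurrences). Applied to the full 49-by-20 arrays, the same code would produce one row per distinct pair.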
