Matching based on the first word
Hi,
I have two cell arrays, A and B, each containing company names. For example, A contains "Biotech Capital Corp" while B contains the same company listed differently, as "BIOTECH CAP CORP". Can I match A and B based on the first word of each string? That is, can I write code that reports that "Biotech Capital Corp" matches "BIOTECH CAP CORP" because both strings start with the word "biotech"?
Accepted Answer
Jan
on 10 Nov 2017
Edited: Jan
on 10 Nov 2017
A_list = {'Biotech Capital Corp', 'Apple something', 'tesla something else'};
B_list = {'Volvo Car AB', 'BIOTECH CAP CORP', 'TESLA by Elon Musk'};
% Extract the first word of each name
A1 = strtok(A_list, ' ');
B1 = strtok(B_list, ' ');
% Compare the first words, ignoring case
[LiA, LocB] = ismember(lower(A1), lower(B1)); % Or the other way around
% Or with CStrAinBP from the File Exchange ('i' ignores case):
[AInd, BInd] = CStrAinBP(strtok(A_list), strtok(B_list), 'i')
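To read the result of the ismember call, LiA flags which items of A_list found a partner and LocB gives the position of that partner in B_list. A minimal sketch of turning these into a list of matched pairs, assuming the same example data (the variable name matched is only for illustration):

```matlab
A_list = {'Biotech Capital Corp', 'Apple something', 'tesla something else'};
B_list = {'Volvo Car AB', 'BIOTECH CAP CORP', 'TESLA by Elon Musk'};

% Compare the lower-cased first words of both lists
[LiA, LocB] = ismember(lower(strtok(A_list)), lower(strtok(B_list)));

% LiA(k) is true if A_list{k} has a first-word match in B_list,
% LocB(k) is the index of that match in B_list (0 if there is none)
matched = [A_list(LiA); B_list(LocB(LiA))];   % one column per matched pair

% Print each pair on its own line
fprintf('%s  <->  %s\n', matched{:});
```

Note that ismember reports only one location per item, so if several names in B_list share the same first word, only one of them shows up in LocB.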
More Answers (2)
per isakson
on 10 Nov 2017
Edited: per isakson
on 10 Nov 2017
Yes, try this
match = cssm()
match =
0 1 0
0 0 0
0 0 1
The rows correspond to the items of A_list and the columns to the items of B_list, i.e.
- the first item in A_list matches the second item in B_list.
- the third item in A_list matches the third item in B_list.
where
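function match = cssm()
    A_list = {'Biotech Capital Corp', 'Apple something', 'tesla something else'};
    B_list = {'Volvo Car AB', 'BIOTECH CAP CORP', 'TESLA by Elon Musk'};
    match = false( length( A_list ), length( B_list ) );
    for jj = 1 : length( A_list )
        % First word of the jj-th item in A_list
        a1  = regexp( A_list{jj}, '\<\w+\>', 'once', 'match' );
        % Pattern that matches this word as a whole word
        xpr = sprintf( '\\<%s\\>', a1 );
        % Start positions of all case-insensitive matches in each item of B_list
        cac = regexpi( B_list, xpr );
        % Count a match only if the first occurrence starts the string;
        % pos(1) keeps the test scalar when the word occurs more than once
        match( jj, : ) = cellfun( @(pos) not(isempty(pos))&&pos(1)==1, cac );
    end
end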
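The same kind of match matrix can probably also be built without regular expressions. This is only a sketch: it assumes that the first word is whatever precedes the first whitespace (strtok), which differs from \w+ for names such as "AT&T", and the names rowA and colB are just for illustration.

```matlab
A_list = {'Biotech Capital Corp', 'Apple something', 'tesla something else'};
B_list = {'Volvo Car AB', 'BIOTECH CAP CORP', 'TESLA by Elon Musk'};

% Lower-cased first words of both lists
A1 = lower(strtok(A_list));
B1 = lower(strtok(B_list));

% match(jj, kk) is true if A_list{jj} and B_list{kk} share the first word
match = false(numel(A1), numel(B1));
for jj = 1:numel(A1)
    match(jj, :) = strcmp(A1{jj}, B1);
end

% Row and column indices of all matching pairs
[rowA, colB] = find(match);
```

Unlike ismember, the matrix keeps every pair, so an item of A_list that matches several items of B_list gets several true entries in its row.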