Find indices of multiple strings within another string
Show older comments
I am trying to efficiently find which strings (character vectors) match between two cell arrays.
One cell array contains ~1000 equations written as strings that I'm trying to parse by matching to strings in another array (100,000 items). I need to know the indices from the 100,000 items that are found within the ~1000 equations. There may be multiple of the 100,000 items found within each of the 1000 equations.
I'm currently implementing this as such:
Equations.Equation % this is a list of ~1000 equations, a cell array of character vectors
OutputData.DataName % list of ~100,000 possible strings I'm looking for in the equations (my variable names)
for ii = 1:length(Equations)
matches=cellfun(@(x) contains(Equations(ii).Equation,x),OutputData.DataName);
indices = find(matches);
% do some other stuff with the matches found, then move onto the next iteration of the loop
end
This is fairly slow. Is there a way to more efficiently find within Equations(ii).Equation which items within OutputData.DataName are found and the index of those items?
4 Comments
Paul
on 9 Apr 2022
Can a very small set of example data be provided for Equations and OutputData?
Nick Smith
on 9 Apr 2022
Something's not working with this example data and the code in the question. Is there a typo somewherer?
Equations.Equation = { '(123X + 123Y).^2'; ...
'500 + 456X + 123Z'; ...
'200 * abs(789Z * pi) + 123X'}
OutputData.DataName = {'123A'; '123B'; '123C'; '123X'; '123Y'; '123Z'; '456X'; '456Y'; '456Z'; '789X'; '789Y'; '789Z'};
for ii = 1:length(Equations)
matches=cellfun(@(x) contains(Equations(ii).Equation,x),OutputData.DataName);
indices = find(matches);
% do some other stuff with the matches found, then move onto the next iteration of the loop
end
It seems like Equations is actually a struct array:
Equations = struct('Equation',{ ...
'(123X + 123Y).^2'; ...
'500 + 456X + 123Z'; ...
'200 * abs(789Z * pi) + 123X'})
OutputData.DataName = {'123A'; '123B'; '123C'; '123X'; '123Y'; '123Z'; '456X'; '456Y'; '456Z'; '789X'; '789Y'; '789Z'};
for ii = 1:length(Equations)
matches = cellfun(@(x) contains(Equations(ii).Equation,x),OutputData.DataName).'
indices = find(matches)
% do some other stuff with the matches found, then move onto the next iteration of the loop
end
Accepted Answer
More Answers (0)
Categories
Find more on Loops and Conditional Statements in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!