Efficient way to use regexp and contains and matching

35 views (last 30 days)
Hello!
I am using the following code to match two cell array contents. But it takes way too long to process. Can anybody suggest a better way to code the same thing?
validVar={};
str = {'abc==';'bac[2]';'fuh[2]';'fgh'};
list={'abc(1)';'cde';'fgh'};
for x=1:numel(list)
expression = sprintf('%s..',list{x});
for y=1:numel(str)
if ~isempty(regexp(str{y},expression,'match')) || contains(str{y},list{x})
validVar=[validVar;list{x}];
end
end
end
Also, the result gives me validVar with 'fgh' only but I want 'abc(1)' in the list as well since it is a part of str{1}. Is there a way to match the entries of list with str in such a way that even if a part of list entry matches part of str entry then it should be listed under validVar.
  3 Comments
Tiasa Ghosh
Tiasa Ghosh on 7 Sep 2018
My mistake. I have edited the question with examples and more bugs. I am using regexp as one search pattern so that the expression match to be used can have extended parts and the condition turns true even if a part of the string matches with part of the input.
Greg
Greg on 8 Sep 2018
First, your performance is suffering because you're looping over both lists. Both regexp and contains will work on a vector with a scalar, removing one of the loops.
Second, if you know how to use regexp expertly (this is not a dig - regexp is extremely powerful but even more difficult to master), you could do all of your checking with one expression.
Finally, your requirements are very ill-formulated. What in the word does "... even if a part of list entry matches part of str entry" mean? A part of bac[2] matches a part of cde - the c character. I'm sure this isn't what you had in mind, so you need more explicit rules for validVar.

Sign in to comment.

Accepted Answer

Guillaume
Guillaume on 10 Sep 2018
Right now, your double loops can be simplified to:
validVar = {};
for x = 1:numel(list)
if ~isempty(cell2mat(regexp(str, [list{x}, '(..)?'], 'once'))) %match either list{x} or list{x} followed by any two characters.
validVar = [validVar; list{x}];
end
end
which should be a lot faster.
However, I don't think that's exactly what you want. I agree with greg that it's not really clear what it is exactly you want. We need a very clear rule of what patterns you want to match and not match with the regex.
  1 Comment
Tiasa Ghosh
Tiasa Ghosh on 12 Sep 2018
I realised where the question went vague and somehow the question isn't relevant to me now. Anyhow, thank you for your time and answer. Hope it helps somebody else in future . :)

Sign in to comment.

More Answers (0)

Categories

Find more on Characters and Strings in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!