Regexp with a list of keywords

Stephan Kolb
Stephan Kolb on 10 Sep 2019
Edited: Stephen23 on 10 Sep 2019
Hi everybody,
I'm aware regexp is a powerful tool, but I'm a newbie to this and I think my problem is an advanced one:
There is a list of keywords
keywords = {'alpha','basis','colic','druid','even'};
and I want regexp to find in a string all values after these keywords (followed by =).
For example:
str = 'basis=10,alpha=today,druid=none,even=odd,even=even';
gives
'today','10',[],'none',{'odd','even'}
Can you help me?

Accepted Answer

Stephen23
Stephen23 on 10 Sep 2019
Edited: Stephen23 on 10 Sep 2019
Using a simple lookaround assertion:
>> keywords = {'alpha','basis','colic','druid','even'};
>> str = 'basis=10,alpha=today,druid=none,even=odd,even=even';
>> fun = @(k)regexp(str,sprintf('(?<=%s=)\\w+',k),'match');
>> out = cellfun(fun,keywords,'uni',0);
>> out{:}
ans =
'today'
ans =
'10'
ans =
{}
ans =
'none'
ans =
'odd' 'even'
Note that each cell of out is itself a cell array, with varying sizes. If you want to unnest the scalar cells, as you indicate in your question, then try this:
>> idx = cellfun(@isscalar,out);
>> out(idx) = [out{idx}]
out =
'today' '10' {} 'none' {1x2 cell}
>> out{:}
ans =
today
ans =
10
ans =
{}
ans =
none
ans =
'odd' 'even'
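One caveat with the lookbehind pattern: it only checks that the keyword text directly precedes the value, so a longer key that ends in a keyword (for example a hypothetical 'uneven=') would also match. The sketch below is one possible way to guard against that, assuming pairs are separated by commas as in the question. The input string and the variable names loose, strict, pat and tok are made up for illustration.

```matlab
str = 'uneven=1,even=2';   % hypothetical input, not from the question
key = 'even';

% Original lookbehind pattern: also picks up the value after 'uneven='.
loose = regexp(str, sprintf('(?<=%s=)\\w+', key), 'match');

% Stricter variant: the key must follow the start of the text or a comma.
% regexptranslate escapes any regexp metacharacters in the keyword.
pat = ['(?:^|,)' regexptranslate('escape', key) '=(\w+)'];
tok = regexp(str, pat, 'tokens');

% Each token is wrapped in its own 1x1 cell; unwrap to a flat cell array.
strict = cellfun(@(t) t{1}, tok, 'UniformOutput', false);
```

Here loose contains both values, while strict contains only the value that belongs to the key 'even'.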
  2 Comments
Stephen23
Stephen23 on 10 Sep 2019
Stephan Kolb's "Answer" moved here:
Hi Stephen,
thank you very much for your quick and smart answer!!!
Do you think we can map your solution to string arrays, e.g.
str = ["basis=10","alpha=today","druid=none","even=odd","even=even"];
Thank you in advance,
Stephan
Stephen23
Stephen23 on 10 Sep 2019
Edited: Stephen23 on 10 Sep 2019
"Do you think we can map your solution to string arrays"
Sure: use a loop, or arrayfun, or concatenate the data into one character vector / scalar string, or fiddle around with the cell array outputs of regexp. Whichever works for you.
But if your data really are separated (and not in one character vector as you showed in your question), then I would probably just split them at each = character, compare the first parts using strcmp or the like, and then use accumarray or similar to group the values together.
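A minimal sketch of that split-and-compare idea, assuming every element of str contains exactly one = and that string arrays are available (R2017a or later for the double-quote syntax). A plain loop stands in here for the accumarray grouping; the names keys, vals and out are illustrative only.

```matlab
keywords = ["alpha","basis","colic","druid","even"];
str = ["basis=10","alpha=today","druid=none","even=odd","even=even"];

% Split each element at the first '=' into a key part and a value part.
keys = extractBefore(str, "=");
vals = extractAfter(str, "=");

% For each keyword, collect every value whose key matches exactly.
out = cell(size(keywords));
for k = 1:numel(keywords)
    out{k} = vals(strcmp(keys, keywords(k)));
end
```

Each cell of out is a string array: empty for a keyword that does not occur (colic) and with two elements for a keyword that occurs twice (even).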


More Answers (1)

Walter Roberson
Walter Roberson on 10 Sep 2019
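str = 'basis=10,alpha=today,druid=none,even=odd,even=even'; % from the question
keywords = {'alpha','basis','colic','druid','even'};
% Parse every name=value pair into a struct array with fields name and value.
kv = regexp(str, '(?<name>\w+)=(?<value>\w+)', 'names');
values = cell(1, length(keywords));
% idx holds the first matching position only, so a repeated key keeps its first value.
[found, idx] = ismember(keywords, {kv.name});
values(found) = {kv(idx(found)).value};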
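Because ismember reports only the first matching index, the code above keeps one value per keyword, so even yields 'odd' but not 'even'. If every value is wanted, as in the question, one possible variation is to compare the parsed names against each keyword with strcmp. This is a sketch rather than part of the original answer; names and vals are helper variables introduced here.

```matlab
str = 'basis=10,alpha=today,druid=none,even=odd,even=even';
keywords = {'alpha','basis','colic','druid','even'};

% Parse all name=value pairs into a struct array.
kv = regexp(str, '(?<name>\w+)=(?<value>\w+)', 'names');
names = {kv.name};
vals = {kv.value};

% For each keyword, keep every value whose name matches exactly.
values = cell(1, numel(keywords));
for k = 1:numel(keywords)
    values{k} = vals(strcmp(names, keywords{k}));
end
```

Each cell of values is then itself a cell array: empty for colic and holding two entries for even.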
  1 Comment
Stephan Kolb
Stephan Kolb on 10 Sep 2019
Consequently, when using a string array for str, it's better to use a string array for the list of keywords, too.
So we have:
keywords = ["alpha","basis","colic","druid","even"];
str = ["basis=10","alpha=today","druid=none","even=odd","even=even"];
Can we adapt your solution?
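One possible adaptation, sketched under the assumption that no key or value contains a comma: join the string array into a single character vector, parse it with the same named-token pattern, and convert the results back to string arrays for comparison. The names joined, names and vals are introduced for illustration only, and the loop collects every value per keyword rather than just the first.

```matlab
keywords = ["alpha","basis","colic","druid","even"];
str = ["basis=10","alpha=today","druid=none","even=odd","even=even"];

% Join the elements with commas and convert to a character vector.
joined = char(join(str, ","));

% Parse all name=value pairs, then convert names and values to string arrays.
kv = regexp(joined, '(?<name>\w+)=(?<value>\w+)', 'names');
names = string({kv.name});
vals = string({kv.value});

% For each keyword, collect every matching value as a string array.
values = cell(size(keywords));
for k = 1:numel(keywords)
    values{k} = vals(names == keywords(k));
end
```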
