Regexp with a list of keywords
Hi everybody,
I'm aware regexp is a powerful tool, but I'm a newbie to this and I think my problem is an advanced one:
There is a list of keywords
keywords = {'alpha','basis','colic','druid','even'};
and I want regexp to find in a string all values after these keywords (followed by =).
For example:
str = 'basis=10,alpha=today,druid=none,even=odd,even=even';
gives
'today','10',[],'none',{'odd','even'}
Can you help me?
Accepted Answer
Stephen23
on 10 Sep 2019
Edited: Stephen23
on 10 Sep 2019
Using a simple lookaround assertion:
>> keywords = {'alpha','basis','colic','druid','even'};
>> str = 'basis=10,alpha=today,druid=none,even=odd,even=even';
>> fun = @(k)regexp(str,sprintf('(?<=%s=)\\w+',k),'match');
>> out = cellfun(fun,keywords,'uni',0);
>> out{:}
ans =
'today'
ans =
'10'
ans =
{}
ans =
'none'
ans =
'odd' 'even'
Note that each cell of out is itself a cell array, with varying sizes. If you want to unnest the scalar cells, as you indicate in your question, then try this:
>> idx = cellfun(@isscalar,out);
>> out(idx) = [out{idx}]
out =
'today' '10' {} 'none' {1x2 cell}
>> out{:}
ans =
today
ans =
10
ans =
{}
ans =
none
ans =
'odd' 'even'
2 Comments
Stephen23
on 10 Sep 2019
Stephan Kolb's "Answer" moved here:
Hi Stephen,
thank you very much for your quick and smart answer!!!
Do you think, we can map your solution to string arrays, e.g.
str = ["basis=10","alpha=today","druid=none","even=odd","even=even"];
Thank you in advance,
Stephan
Stephen23
on 10 Sep 2019
Edited: Stephen23
on 10 Sep 2019
"Do you think, we can map your solution to string arrays"
Sure: use a loop or arrayfun, concatenate the data into one character vector or scalar string, or work with the cell array outputs of regexp directly. Whichever works for you.
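As a rough sketch of the concatenation option, assuming the values never contain a comma (so "," is a safe separator) and a release with string arrays and join: the string array is joined into one character vector and the same lookbehind expression from the accepted answer is reused. The variable name joined is only illustrative.

```matlab
keywords = {'alpha','basis','colic','druid','even'};
str = ["basis=10","alpha=today","druid=none","even=odd","even=even"];

% Join the elements with commas and convert to a character vector,
% so that regexp returns cell arrays of character vectors as before.
joined = char(join(str, ","));

% Same lookbehind as in the accepted answer: match \w+ right after "keyword=".
fun = @(k) regexp(joined, sprintf('(?<=%s=)\\w+', k), 'match');
out = cellfun(fun, keywords, 'UniformOutput', false);
```

Each cell of out is again a cell array, empty for keywords that do not occur.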
But if your data really are separated (and not in one character vector as you showed in your question), then I would probably just split them at each = character, compare the 1st parts using strcmp or the like, and then use accumarray or similar to group together.
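A minimal sketch of that split-and-compare idea, assuming every element contains an = character and a release that has extractBefore and extractAfter. It uses a plain loop with strcmp for the grouping rather than accumarray, purely to keep it short. The names keys, vals and hit are illustrative.

```matlab
keywords = {'alpha','basis','colic','druid','even'};
str = ["basis=10","alpha=today","druid=none","even=odd","even=even"];

% Split each element at its first '=' into a key part and a value part.
keys = extractBefore(str, "=");
vals = extractAfter(str, "=");

% For each keyword, collect the values whose key matches it exactly.
out = cell(size(keywords));
for k = 1:numel(keywords)
    hit = strcmp(keys, keywords{k});   % logical mask over the elements of str
    out{k} = cellstr(vals(hit));       % empty cell array if the keyword is absent
end
```

Because the keys are compared in full, a keyword such as even cannot accidentally match a longer key that merely ends in even.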
More Answers (1)
Walter Roberson
on 10 Sep 2019
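str = 'basis=10,alpha=today,druid=none,even=odd,even=even';
keywords = {'alpha','basis','colic','druid','even'};
% Parse every name=value pair into a struct array with fields name and value.
kv = regexp(str, '(?<name>\w+)=(?<value>\w+)', 'names');
values = cell(1, length(keywords));   % keywords without a match stay []
% ismember gives one index per keyword, so a repeated keyword keeps only one value.
[found, idx] = ismember(keywords, {kv.name});
values(found) = {kv(idx(found)).value};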
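Since ismember returns a single index per keyword, the code above keeps only one value when a keyword such as even occurs more than once. One possible variation, offered as a sketch and not as part of the original answer, keeps the named-token parsing but gathers all values per keyword with strcmp. The names names, vals and values are illustrative.

```matlab
str = 'basis=10,alpha=today,druid=none,even=odd,even=even';
keywords = {'alpha','basis','colic','druid','even'};

% Parse all name=value pairs into a struct array.
kv = regexp(str, '(?<name>\w+)=(?<value>\w+)', 'names');
names = {kv.name};
vals  = {kv.value};

% For each keyword, keep every value whose name matches it exactly.
values = cellfun(@(k) vals(strcmp(names, k)), keywords, 'UniformOutput', false);
```

Here each cell of values is itself a cell array, as in the accepted answer.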