How to extract a specific series of strings from a character array

Hi guys,
I have a massive character array (1x187253) that I imported into MATLAB. Here is a small sample:
NEW SCOMPONENT /JBRC200XT
DESC 'CARBON STEEL ORDINARY REDUCER JIS3452 BWD CON. 350Ax200A'
GTYP REDU
PARA 350 200 355.6 216.3 BWD $
330.2 0
END
NEW SCOMPONENT /JBRC200XV
DESC 'CARBON STEEL ORDINARY REDUCER JIS3452 BWD CON. 350Ax250A'
GTYP REDU
PARA 350 250 355.6 267.4 BWD $
330.2 0
END
What I want to do with this character array is extract all the 9-character codes that follow NEW SCOMPONENT (e.g. JBRC200XT and JBRC200XV in this case), as well as the text between the quotes on each DESC line (e.g. 'CARBON STEEL ORDINARY REDUCER... '), and place them side by side in a MATLAB table to be exported to Excel.
I know this should be possible, but I have been trying for the last few days and am stuck on the very first step of obtaining the codes JBRC200...
Thanks for your help in advance,
KR,
KMT.

Accepted Answer

TADA on 2 Nov 2018
Edited: TADA on 2 Nov 2018
Regular expressions are your friends
regexp(text, 'NEW SCOMPONENT\s*\/?(?<component>[\w]{9})\s+DESC\s*''(?<desc>[^'']+)', 'names');
The result is a struct array with two fields, component and desc, containing your strings; each element holds one match:
ans =
struct with fields:
component: 'JBRC200XT'
desc: 'CARBON STEEL ORDINARY REDUCER JIS3452 BWD CON. 350Ax200A'
I tried it on your sample, which I duplicated into a giant text file, and it runs very fast.
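To get from that struct array to the table and Excel file asked for in the question, something along these lines might work. This is only a sketch under a few assumptions: the variable name txt, the shortened two-component sample, and the output file name components.xlsx are made up for illustration, and it relies on struct2table and writetable being available in your MATLAB release.

```matlab
% Hypothetical sample standing in for the imported 1x187253 character array.
% (Named txt rather than text to avoid shadowing MATLAB's text function.)
txt = strjoin({ ...
    'NEW SCOMPONENT /JBRC200XT', ...
    'DESC ''CARBON STEEL ORDINARY REDUCER JIS3452 BWD CON. 350Ax200A''', ...
    'GTYP REDU', ...
    'END', ...
    'NEW SCOMPONENT /JBRC200XV', ...
    'DESC ''CARBON STEEL ORDINARY REDUCER JIS3452 BWD CON. 350Ax250A''', ...
    'GTYP REDU', ...
    'END'}, newline);

% Same pattern as in the answer; 'names' returns one struct element per match.
pattern = 'NEW SCOMPONENT\s*\/?(?<component>[\w]{9})\s+DESC\s*''(?<desc>[^'']+)';
names = regexp(txt, pattern, 'names');

% 'AsArray' makes each struct element one table row, even for a single match.
T = struct2table(names, 'AsArray', true);

% Assumed output location; change the file name and folder as needed.
outFile = fullfile(tempdir, 'components.xlsx');
writetable(T, outFile);
```

The table ends up with one row per component and the columns component and desc, so the Excel sheet has the codes and descriptions side by side.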
TADA on 2 Nov 2018
I'll leave the details to you, but you can use \d to match digits, and curly braces with comma-separated numbers to specify a range for the number of repeats (e.g. {2,4} means two to four repeats).
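As a rough illustration of \d combined with a repeat range, here is a small sketch. The variable name paraLine and the choice of what to match (three digits, a dot, then one or two digits) are assumptions for the example, not something taken from the original question.

```matlab
% One PARA line from the sample, used as hypothetical input.
paraLine = 'PARA 350 250 355.6 267.4 BWD';

% \d{3} matches exactly three digits, \. a literal dot,
% and \d{1,2} one or two digits after it.
decimals = regexp(paraLine, '\d{3}\.\d{1,2}', 'match');
% decimals is {'355.6', '267.4'}; the integers 350 and 250 have no dot, so they are skipped.
```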
You can find some decent regex testers online to help with planning the right regex. Some of them also have good modules for learning regex patterns.
You should know that regex engines differ slightly in their features between environments (MATLAB, Python, JavaScript, etc.), but the patterns are identical for the most part.
