Replacing certain text from a .txt file.

How can the following expressions be replaced? For example, expressions of the form:
FirstText [Text1@/ Text2@/ Text3] LastText
is to be replaced by
FirstText Text1 LastText; FirstText Text2 LastText; FirstText Text3 LastText

2 Comments

What if there were many words before FirstText and after LastText in the str defined above? I would still want the same output, which is FirstText Text1 LastText; FirstText Text2 LastText; FirstText Text3 LastText
What should you get with
FirstText [Text1@/ Text2@/ Text3] MiddleText1 [Text4@/ Text5] MiddleText2 [Text6@/ Text7@/ Text8@/ Text9] LastText


 Accepted Answer

Cedric on 13 Sep 2017
Edited: Cedric on 13 Sep 2017
If it is all you have to do, here is an example:
str = 'FirstText [Text1@/ Text2@/ Text3] LastText' ;
tokens = regexp( str, '(.*) \[([^@]+)@/ ([^@]+)@/ ([^\]]+)\] (.*)', 'tokens', 'once' ) ;
outStr = sprintf( '%s %s %s; %s %s %s; %s %s %s', tokens{[1,2,5,1,3,5,1,4,5]} ) ;
which outputs
outStr = FirstText Text1 LastText; FirstText Text2 LastText; FirstText Text3 LastText
Note that you don't need regular expressions for this:
tokens = strsplit( str, {' [', '@/ ', '] '} )
tokens =
1×5 cell array
'FirstText' 'Text1' 'Text2' 'Text3' 'LastText'
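The STRSPLIT result above can be turned back into the requested output for any number of alternatives, not just three. This is only a minimal sketch, assuming a single bracket per string, with text before and after it, separated by a space from `[` and `]`; the variable names `parts`, `alts`, and `variants` are made up for illustration.

```matlab
str = 'FirstText [Text1@/ Text2@/ Text3] LastText' ;

% Split into leading text, alternatives, and trailing text.
parts = strsplit( str, {' [', '@/ ', '] '} ) ;
first = parts{1} ;
last  = parts{end} ;
alts  = parts(2:end-1) ;           % any number of alternatives

% Build one variant per alternative, then join them with '; '.
variants = cellfun( @(a) sprintf( '%s %s %s', first, a, last ), alts, ...
                    'UniformOutput', false ) ;
outStr = strjoin( variants, '; ' ) ;
```

An entry with no leading or trailing text, or with several brackets, would not split correctly with these delimiters and needs a different approach.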
But what is the context? I suspect that you have to apply this to a more complex case. Could you give a real slice of what you have to process?

9 Comments

and me too
>> cac = regexp( str, '[^\w]+', 'split' );
>> cac
cac =
'FirstText' 'Text1' 'Text2' 'Text3' 'LastText'
>>
Actually, I have a big text file. The variable str you defined above can be anywhere in the text file, with words before and after it. The goal is to find those strings and replace them. I am assuming we have to loop through the txt file.
No, you likely won't have to loop, but you have to clarify the situation, or even better, attach a sample file. Can FirstText and LastText contain delimiters like @, [, or ]? Are there multiple [...] blocks?
If not, then both REGEXP and STRSPLIT (better in this simple case) approaches should work on your file content. Give it a try:
str = fileread( 'YourTextFile.txt' ) ;
and then apply either approach and see what you get. If it doesn't work, please attach some more realistic slice of your file, or even better the whole file.
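As a rough illustration of reading the file and applying the STRSPLIT approach to its content, here is a sketch under strong assumptions: one entry per line, exactly one bracket per entry, and text on both sides of the bracket. The sample content and the temporary file are hypothetical and only there to make the example self-contained; the real file layout has not been described yet.

```matlab
% Hypothetical sample content, written to a temporary file for the demo.
fname = [tempname, '.txt'] ;
fid = fopen( fname, 'w' ) ;
fprintf( fid, 'A [B@/ C] D\nE [F@/ G@/ H] J\n' ) ;
fclose( fid ) ;

content = fileread( fname ) ;
delete( fname ) ;

% Assumption: one entry per line.
entries = regexp( strtrim( content ), '\r?\n', 'split' ) ;

out = cell( size( entries ) ) ;
for k = 1 : numel( entries )
    parts = strsplit( entries{k}, {' [', '@/ ', '] '} ) ;
    % One variant per alternative found between the brackets.
    variants = cellfun( @(a) sprintf( '%s %s %s', parts{1}, a, parts{end} ), ...
                        parts(2:end-1), 'UniformOutput', false ) ;
    out{k} = strjoin( variants, '; ' ) ;
end
```

The loop here runs over entries, not over characters of the file, so it stays cheap even for a large file.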
Basically, the goal is to replace each expression that contains a [Text1@/ Text2@/ Text3@/ ... @/ Textk] block by a sequence of resolved expressions containing all possible variants. For example
LeadingText [Text1@/ Text2@/ Text3] TrailingText
is to be replaced by
LeadingText Text1 TrailingText; LeadingText Text2 TrailingText; LeadingText Text3 TrailingText
Another example is:
LeadingText [Text1@/ Text1] InBetweenText [Text2@/ Text2@] TrailingText
is to be replaced by
LeadingText Text1 InBetweenText Text2 TrailingText; LeadingText Text1 InBetweenText Text2 TrailingText; LeadingText Text1 InBetweenText Text2 TrailingText; LeadingText Text1 InBetweenText Text2 TrailingText
One Example might look like:
[the Flood@/ the story of creation]
Are there multiple entries like this in the file? Is it one entry per line? E.g.
Data.txt:
A [B@/ C] D
E [F@/ G@/ H@/ I] J
[K@/ L] M
N [O@/ P]
If not, what is the separator?
I think that your second example could be better understood if you were using unique text IDs, i.e. text1 to text4, as I suppose that all entries in each bracket are not equivalent.
Finally, what is the purpose? The general case with an arbitrary number of brackets and an arbitrary number of items in each bracket will require quite a bit of coding.
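For a single entry, the general case amounts to a Cartesian product over the brackets. The following is only a sketch of one possible way to do it, not a tested solution for the real file: it assumes the string holds exactly one entry with at least one bracket, that brackets are not nested, and that variants should be listed with the last bracket varying fastest. The sample string and all variable names are hypothetical.

```matlab
str = 'A [B@/ C] D [E@/ F@/ G] H' ;

% Content of each bracket (tokens) and the fixed text around them (split).
[blocks, fixed] = regexp( str, '\[([^\]]*)\]', 'tokens', 'split' ) ;
alts = cellfun( @(b) strsplit( b{1}, '@/ ' ), blocks, 'UniformOutput', false ) ;
nBlk = numel( alts ) ;

% All index combinations via ndgrid, one column per bracket.
ranges = cellfun( @(a) 1:numel( a ), alts, 'UniformOutput', false ) ;
grids  = cell( 1, nBlk ) ;
[grids{:}] = ndgrid( ranges{:} ) ;
combos = cell2mat( cellfun( @(g) g(:), grids, 'UniformOutput', false ) ) ;
combos = sortrows( combos ) ;      % last bracket varies fastest

variants = cell( 1, size( combos, 1 ) ) ;
for r = 1 : size( combos, 1 )
    chosen = cell( 1, nBlk ) ;
    for b = 1 : nBlk
        chosen{b} = alts{b}{combos(r,b)} ;
    end
    % Interleave fixed text and chosen alternatives (column-major order).
    merged = [fixed ; [chosen, {''}]] ;
    variants{r} = [merged{:}] ;
end
outStr = strjoin( variants, '; ' ) ;
```

With the sample string this yields 2 x 3 = 6 variants. The number of variants grows as the product of the bracket sizes, which is one reason the separator between entries matters.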
There are multiple entries like you mentioned in the file. It does not have to be one entry per line. There could be more than one.
What is the separator between these entries? Is it the period? If there is no separator, does that mean that you need to extract all elements of all [..] blocks in a file and use all possible combinations?
And again, what is the purpose? If you gave a little more information, you might get more answers. Solving the general case with a lot of blocks and a lot of elements per block may be quite a bit of work, so the better you define the context, the more likely you are to get help.
'(.*) \[([^@]+)@/ ([^@]+)@/ ([^\]]+)\] (.*)'
Be careful. Obviously the angry armadillo has entered your keyboard. [I've grown up in a time, when humor was not marked by smileys, but recognized by mind reading] Nevertheless, +1.
Cedric on 27 Sep 2017
Edited: Cedric on 27 Sep 2017
Well, I've grown up at the same time, but I got in trouble in the meantime making jokes with no smiley, so I am slowwwwly adapting ; <- this is the beginning of a smiley.


More Answers (0)

Asked:

am
on 13 Sep 2017

Edited:

on 27 Sep 2017