Search and Replace Text

Processing text data often involves finding and replacing substrings. Several functions find text and return different information: some confirm that the text exists, while others count occurrences, find starting indices, or extract substrings. These functions work on character vectors and string scalars, such as "yes", as well as character and string arrays, such as ["yes","no";"abc","xyz"]. In addition, you can use patterns to define rules for searching, such as matching one or more letters or digits.

Search for Text

To determine whether text is present, use a function that returns logical values, such as contains, startsWith, or endsWith. A logical value of 1 corresponds to true, and 0 corresponds to false.

txt = "she sells seashells by the seashore"; 
TF = contains(txt,"sea")
TF = logical
   1
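
The other logical functions follow the same calling form. As a minimal sketch (the variable names here are illustrative, not part of the original example), the following checks the start and end of the text, and shows that matching is case sensitive unless you set the IgnoreCase option.

```matlab
txt = "she sells seashells by the seashore";

% Check the beginning and end of the text
startsShe = startsWith(txt,"she");     % logical 1
endsShore = endsWith(txt,"shore");     % logical 1

% Matching is case sensitive by default
hasUpper = contains(txt,"SEA");                          % logical 0
hasUpperIgnore = contains(txt,"SEA","IgnoreCase",true);  % logical 1
```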

Calculate how many times the text occurs using the count function.

n = count(txt,"sea")
n = 2

To locate where the text occurs, use the strfind function, which returns starting indices.

idx = strfind(txt,"sea")
idx = 1×2

    11    28
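
The starting indices can be passed on to other functions. The sketch below, which uses illustrative variable names, extracts the first match by position and shows that strfind returns an empty array when the text does not occur, so isempty can serve as a test for "not found".

```matlab
txt = "she sells seashells by the seashore";

idx = strfind(txt,"sea");      % starting indices, [11 28]
len = strlength("sea");        % number of characters in the search text

% Extract the first match using start and end positions (both inclusive)
firstMatch = extractBetween(txt,idx(1),idx(1)+len-1);   % "sea"

% No match returns an empty array
noMatch = strfind(txt,"xyz");
notFound = isempty(noMatch);   % logical 1
```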

Find and extract text using extraction functions, such as extract, extractBetween, extractBefore, or extractAfter.

mid = extractBetween(txt,"sea","shore")
mid = 
"shells by the sea"

Optionally, include the boundary text.

mid = extractBetween(txt,"sea","shore","Boundaries","inclusive")
mid = 
"seashells by the seashore"

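The extractBefore and extractAfter functions take a single boundary. As a brief sketch with illustrative variable names, the following extracts the text before the first occurrence of " sells" and the text after "the ".

```matlab
txt = "she sells seashells by the seashore";

% Text that precedes the first occurrence of " sells"
first = extractBefore(txt," sells");   % "she"

% Text that follows the first occurrence of "the "
last = extractAfter(txt,"the ");       % "seashore"
```
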
Find Text in Arrays

The search and replacement functions can also find text in multi-element arrays. For example, look for color names in several song titles.

songs = ["Yellow Submarine"; 
         "Penny Lane";  
         "Blackbird"]; 

colors =["Red","Yellow","Blue","Black","White"]; 

TF = contains(songs,colors)
TF = 3x1 logical array

   1
   0
   1

To list the songs that contain color names, use the logical TF array as indices into the original songs array. This technique is called logical indexing.

colorful = songs(TF)
colorful = 2x1 string
    "Yellow Submarine"
    "Blackbird"

Use the replace function to replace text in songs that matches any element of colors with the string "Orange".

replace(songs,colors,"Orange")
ans = 3x1 string
    "Orange Submarine"
    "Penny Lane"
    "Orangebird"

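The replace function also accepts an array of replacements that is the same size as the array of search text, so that each match gets its own replacement. The sketch below assumes two illustrative arrays, oldColors and newColors, that do not appear in the example above.

```matlab
songs = ["Yellow Submarine";
         "Penny Lane";
         "Blackbird"];

% Each element of oldColors is replaced by the corresponding
% element of newColors (the two arrays must be the same size)
oldColors = ["Yellow","Black"];
newColors = ["Green","Blue"];

recolored = replace(songs,oldColors,newColors);
% recolored is ["Green Submarine"; "Penny Lane"; "Bluebird"]
```
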
Match Patterns

Since R2020b

In addition to searching for literal text, like “sea” or “yellow”, you can search for text that matches a pattern. There are many predefined patterns, such as digitsPattern to find numeric digits.

address = "123a Sesame Street, New York, NY 10128"; 
nums = extract(address,digitsPattern) 
nums = 2x1 string
    "123"
    "10128"

For additional precision in searches, you can combine patterns. For example, locate words that start with the character “S”. Use a string to specify the “S” character, and lettersPattern to find additional letters after that character.

pat = "S" + lettersPattern; 
StartWithS = extract(address,pat) 
StartWithS = 2x1 string
    "Sesame"
    "Street"

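Patterns can be used with the other search functions as well, not only extract. As a minimal sketch with illustrative variable names, the following counts the runs of digits in the address and uses digitsPattern(5), which matches exactly five digits, to pick out the ZIP code.

```matlab
address = "123a Sesame Street, New York, NY 10128";

% Count runs of one or more digits
numRuns = count(address,digitsPattern);        % 2

% Match exactly five consecutive digits
zipPat = digitsPattern(5);
hasZip = contains(address,zipPat);             % logical 1
zip = extract(address,zipPat);                 % "10128"
```
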
For more information, see Build Pattern Expressions.
