regexprep
Replace text using regular expression
Syntax
newStr = regexprep(str,expression,replace)
Description
newStr = regexprep(str,expression,replace) replaces the text in str that matches expression with the text described by replace. The regexprep function returns the updated text in newStr.
- If str is a single piece of text (either a character vector or a string scalar), then newStr is also a single piece of text of the same type. newStr is a single piece of text even when expression or replace is a cell array of character vectors or a string array. When expression is a cell array or a string array, regexprep applies the first expression to str, and then applies each subsequent expression to the preceding result.
- If str is a cell array or a string array, then newStr is a cell array or string array with the same dimensions as str. For each element of str, the regexprep function applies each expression in sequence.
- If there are no matches to expression, then newStr is equivalent to str.
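The rules above can be checked with a short sketch. The input strings and variable names here are illustrative assumptions, not part of the original page.

```matlab
% Two expressions applied in sequence to a single character vector.
% The second expression operates on the result of the first.
chained = regexprep('abc', {'a', 'b'}, {'b', 'c'});
% Step 1: 'abc' -> 'bbc'; step 2: 'bbc' -> 'ccc'

% A cell array input yields a cell array with the same dimensions.
names = {'cat', 'hat'; 'dog', 'bat'};
swapped = regexprep(names, 'at$', 'ow');

% With no match, the output is equivalent to the input.
unchanged = regexprep('abc', 'x+', 'y');
```

Here chained is 'ccc' rather than 'bcc', which shows that the second expression sees the output of the first.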
Examples
Update Text
Replace words that begin with M, end with y, and have at least one character between them.
str = 'My flowers may bloom in May';
expression = 'M(\w+)y';
replace = 'April';
newStr = regexprep(str,expression,replace)
newStr = 'My flowers may bloom in April'
Include Tokens in Replacement Text
Replace variations of the phrase 'walk up' by capturing the letters that follow 'walk' in a token.
str = 'I walk up, they walked up, we are walking up.';
expression = 'walk(\w*) up';
replace = 'ascend$1';
newStr = regexprep(str,expression,replace)
newStr = 'I ascend, they ascended, we are ascending.'
Include Dynamic Expression in Replacement Text
Replace lowercase letters at the beginning of sentences with their uppercase equivalents using the upper function.
str = 'here are two sentences. neither is capitalized.';
expression = '(^|\.)\s*.';
replace = '${upper($0)}';
newStr = regexprep(str,expression,replace)
newStr = 'Here are two sentences. Neither is capitalized.'
The regular expression matches single characters (.) that follow the beginning of the character vector (^) or a period (\.) and any whitespace (\s*). The replace expression calls the upper function for the currently matching character ($0).
Update Multiple Pieces of Text
Replace each occurrence of a double letter in a set of character vectors with the symbols '--'.
str = { ...
'Whose woods these are I think I know.' ; ...
'His house is in the village though;' ; ...
'He will not see me stopping here' ; ...
'To watch his woods fill up with snow.'};
expression = '(.)\1';
replace = '--';
newStr = regexprep(str,expression,replace)
newStr = 4x1 cell
{'Whose w--ds these are I think I know.'}
{'His house is in the vi--age though;' }
{'He wi-- not s-- me sto--ing here' }
{'To watch his w--ds fi-- up with snow.'}
Preserve Case in Original Text
Ignore letter case in the regular expression when finding matches, but mimic the letter case of the original text when updating.
str = 'My flowers may bloom in May';
expression = 'M(\w+)y';
replace = 'April';
newStr = regexprep(str,expression,replace,'preservecase')
newStr = 'My flowers april bloom in April'
Replace Zero-Length Matches
Insert text at the beginning of a character vector using the '^' operator, which returns a zero-length match, and the 'emptymatch' keyword.
str = 'abc';
expression = '^';
replace = '__';
newStr = regexprep(str,expression,replace,'emptymatch')
newStr = '__abc'
Input Arguments
str
— Text to update
character vector | cell array of character vectors | string array
Text to update, specified as a character vector, a cell array of character vectors, or a string array.
Data Types: char | cell | string
expression
— Regular expression
character vector | cell array of character vectors | string array
Regular expression, specified as a character vector, a cell array of character vectors, or a string array. Each expression can contain characters, metacharacters, operators, tokens, and flags that specify patterns to match in str.
The following tables describe the elements of regular expressions.
Metacharacters
Metacharacters represent letters, letter ranges, digits, and space characters. Use them to construct a generalized pattern of characters.
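Metacharacter | Description | Example |
---|---|---|
`.` | Any single character, including white space | `'..ain'` matches sequences of five consecutive characters that end with `'ain'`. |
`[c1c2c3]` | Any character contained within the square brackets. The following characters are treated literally: `$ \| . * + ?` and `-` when not used to indicate a range. | `'[rp.]ain'` matches `'rain'`, `'pain'`, or `'.ain'`. |
`[^c1c2c3]` | Any character not contained within the square brackets. The following characters are treated literally: `$ \| . * + ?` and `-` when not used to indicate a range. | `'[^*rp]ain'` matches four-character sequences that end in `'ain'`, except `'rain'`, `'pain'`, and `'*ain'`. |
`[c1-c2]` | Any character in the range of `c1` through `c2` | `'[A-G]'` matches a single character in the range of `A` through `G`. |
`\w` | Any alphabetic, numeric, or underscore character. For English character sets, `\w` is equivalent to `[a-zA-Z_0-9]` | `'\w*'` matches a run of alphabetic, numeric, or underscore characters. |
`\W` | Any character that is not alphabetic, numeric, or underscore. For English character sets, `\W` is equivalent to `[^a-zA-Z_0-9]` | `'\W*'` matches a run of characters that are not alphabetic, numeric, or underscore. |
`\s` | Any white-space character; equivalent to `[ \f\n\r\t\v]` | `'\w*n\s'` matches words that end with the letter `n`, followed by a white-space character. |
`\S` | Any non-white-space character; equivalent to `[^ \f\n\r\t\v]` | `'\d\S'` matches a numeric digit followed by any non-white-space character. |
`\d` | Any numeric digit; equivalent to `[0-9]` | `'\d*'` matches any number of consecutive digits. |
`\D` | Any nondigit character; equivalent to `[^0-9]` | `'\w*\D\>'` matches words that do not end with a numeric digit. |
`\oN` or `\o{N}` | Character of octal value `N` | `'\o{40}'` matches the space character, defined by octal `40`. |
`\xN` or `\x{N}` | Character of hexadecimal value `N` | `'\x2C'` matches the comma character, defined by hex `2C`. |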
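As a minimal sketch of how a few of these metacharacters behave in regexprep, consider the following. The sample text and variable names are assumptions chosen for illustration.

```matlab
str = 'Room 12, Desk 7';

% \d matches each numeric digit.
digitsMasked = regexprep(str, '\d', '#');

% A bracket expression matches any one of the listed characters.
noVowels = regexprep(str, '[aeiou]', '');

% \W matches anything that is not alphabetic, numeric, or underscore.
nonWord = regexprep(str, '\W', '_');

% A range inside brackets matches any character between its endpoints.
capsMasked = regexprep(str, '[A-Z]', '*');
```

Because regexprep replaces every match by default, each call rewrites all qualifying characters in a single pass.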
Character Representation
Operator | Description |
---|---|
`\a` | Alarm (beep) |
`\b` | Backspace |
`\f` | Form feed |
`\n` | New line |
`\r` | Carriage return |
`\t` | Horizontal tab |
`\v` | Vertical tab |
`\char` | Any character with special meaning in regular expressions that you want to match literally (for example, use `\\` to match a single backslash) |
Quantifiers
Quantifiers specify the number of times a pattern must occur in the matching text.
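Quantifier | Number of Times Expression Occurs | Example |
---|---|---|
`expr*` | 0 or more times consecutively. | `'\w*'` matches a word of any length. |
`expr?` | 0 times or 1 time. | `'\w*(\.m)?'` matches words that optionally end with the extension `.m`. |
`expr+` | 1 or more times consecutively. | `'\d+'` matches one or more consecutive digits. |
`expr{m,n}` | At least `m` times, but no more than `n` times consecutively. `{0,1}` is equivalent to `?`. | `'\S{4,8}'` matches between four and eight non-white-space characters. |
`expr{m,}` | At least `m` times consecutively. `{0,}` and `{1,}` are equivalent to `*` and `+`, respectively. | `'\w{3,}'` matches words of three or more characters. |
`expr{n}` | Exactly `n` times consecutively. Equivalent to `{n,n}`. | `'\d{4}'` matches four consecutive digits. |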
Quantifiers can appear in three modes, described in the following table. q represents any of the quantifiers in the previous table.
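Mode | Description | Example |
---|---|---|
`exprq` | Greedy expression: match as many characters as possible. | Given the text `'<tr><td><p>text</p></td>'`, the expression `'</?t.*>'` matches all characters between `<tr` and `/td>`. |
`exprq?` | Lazy expression: match as few characters as necessary. | Given the text `'<tr><td><p>text</p></td>'`, the expression `'</?t.*?>'` ends each match at the first closing angle bracket (`>`), matching `'<tr>'`, `'<td>'`, and `'</td>'`. |
`exprq+` | Possessive expression: match as much as possible, but do not rescan any portions of the text. | Given the text `'<tr><td><p>text</p></td>'`, the expression `'</?t.*+>'` does not return any matches, because the closing angle bracket is consumed by `.*` and is not rescanned. |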
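The three quantifier modes can be compared directly with regexprep. The following sketch uses an HTML-like sample string; the variable names are illustrative.

```matlab
str = '<tr><td><p>text</p></td>';

% Greedy: .* runs to the last '>' in the text, so there is one long match.
greedy = regexprep(str, '</?t.*>', '#');

% Lazy: .*? stops at the first '>' after each opening '<t' or '</t'.
lazy = regexprep(str, '</?t.*?>', '#');

% Possessive: .*+ consumes the final '>' and never gives it back,
% so the trailing '>' in the pattern cannot match.
possessive = regexprep(str, '</?t.*+>', '#');
```

The lazy form leaves '<p>text</p>' untouched because the pattern requires a t immediately after the optional slash.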
Grouping Operators
Grouping operators allow you to capture tokens, apply one operator to multiple elements, or disable backtracking in a specific group.
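Grouping Operator | Description | Example |
---|---|---|
`(expr)` | Group elements of the expression and capture tokens. | `'Joh?n\s(\w*)'` captures a token that contains the last name of any person with the first name `John` or `Jon`. |
`(?:expr)` | Group, but do not capture tokens. | `'(?:[aeiou][^aeiou]){2}'` matches two consecutive patterns of a vowel followed by a nonvowel, such as `'anon'`. Without grouping, `'[aeiou][^aeiou]{2}'` matches a vowel followed by two nonvowels. |
`(?>expr)` | Group atomically. Do not backtrack within the group to complete the match, and do not capture tokens. | `'A(?>.*)Z'` does not match `'AtoZ'`, although `'A(?:.*)Z'` does. With the atomic group, `Z` is consumed by `.*` and is not rescanned. |
`(expr1\|expr2)` | Match expression `expr1` or expression `expr2`. If there is a match with `expr1`, then `expr2` is ignored. You can include `?:` or `?>` after the opening parenthesis to suppress tokens or group atomically. | `'(let\|tel)\w+'` matches words that contain, but do not end with, `let` or `tel`. |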
Anchors
Anchors in the expression match the beginning or end of the input text or word.
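Anchor | Matches the... | Example |
---|---|---|
`^expr` | Beginning of the input text. | `'^M\w*'` matches a word starting with `M` at the beginning of the text. |
`expr$` | End of the input text. | `'\w*m$'` matches words ending with `m` at the end of the text. |
`\<expr` | Beginning of a word. | `'\<n\w*'` matches any words starting with `n`. |
`expr\>` | End of a word. | `'\w*e\>'` matches any words ending with `e`. |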
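A brief sketch of the four anchors in use with regexprep follows. The sample sentence and variable names are assumptions for illustration.

```matlab
str = 'never say never again';

% ^ restricts the match to the beginning of the input text.
atStart = regexprep(str, '^never', 'always');

% $ restricts the match to the end of the input text.
atEnd = regexprep(str, 'again$', 'more');

% \< matches only at the beginning of a word.
wordStart = regexprep(str, '\<n\w*', 'X');

% \> matches only at the end of a word.
wordEnd = regexprep(str, '\w*y\>', 'X');
```

Note that '\<n\w*' leaves 'again' alone even though it contains an n, because that n is not at a word boundary.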
Lookaround Assertions
Lookaround assertions look for patterns that immediately precede or follow the intended match, but are not part of the match.
The pointer remains at the current location, and characters that correspond to the test expression are not captured or discarded. Therefore, lookahead assertions can match overlapping character groups.
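Lookaround Assertion | Description | Example |
---|---|---|
`expr(?=test)` | Look ahead for characters that match `test`. | `'\w*(?=ing)'` matches terms that are followed by `ing`, such as `'Fly'` and `'fall'` in the input text `'Flying, not falling.'` |
`expr(?!test)` | Look ahead for characters that do not match `test`. | `'i(?!ng)'` matches instances of the letter `i` that are not followed by `ng`. |
`(?<=test)expr` | Look behind for characters that match `test`. | `'(?<=re)\w*'` matches terms that follow `'re'`, such as `'new'`, `'use'`, and `'cycle'` in the input text `'renew, reuse, recycle'`. |
`(?<!test)expr` | Look behind for characters that do not match `test`. | `'(?<!\d)(\d)(?!\d)'` matches single-digit numbers (digits that do not precede or follow other digits). |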
If you specify a lookahead assertion before an expression, the operation is equivalent to a logical AND.
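Operation | Description | Example |
---|---|---|
`(?=test)expr` | Match both `test` and `expr`. | `'(?=[a-z])[^aeiou]'` matches consonants. |
`(?!test)expr` | Match `expr` and do not match `test`. | `'(?![aeiou])[a-z]'` matches consonants. |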
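Because lookaround text is tested but not consumed, it survives the replacement. The sketch below illustrates this with regexprep; the sample inputs and variable names are assumptions.

```matlab
% Lookahead: replace the stem of each word, keeping the 'ing' that follows.
stems = regexprep('Flying, not falling.', '\w+(?=ing)', 'X');

% Negative lookahead: uppercase-mark each 'i' that is not followed by 'ng'.
marked = regexprep('ring in time', 'i(?!ng)', 'I');

% Lookbehind: replace the word characters that follow 're', keeping 're'.
afterRe = regexprep('renew, reuse, recycle', '(?<=re)\w+', 'X');

% Negative lookbehind plus negative lookahead: isolated single digits only.
singles = regexprep('a1 b22 c3', '(?<!\d)\d(?!\d)', '#');

% Lookahead used as a logical AND: lowercase letters that are not vowels.
consonants = regexprep('banana', '(?=[a-z])[^aeiou]', '*');
```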
Logical and Conditional Operators
Logical and conditional operators allow you to test the state of a given condition, and then use the outcome to determine which pattern, if any, to match next. These operators support logical OR, and if or if/else conditions.
Conditions can be tokens, lookaround operators, or dynamic expressions of the form (?@cmd). Dynamic expressions must return a logical or numeric value.
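Conditional Operator | Description | Example |
---|---|---|
`expr1\|expr2` | Match expression `expr1` or expression `expr2`. If there is a match with `expr1`, then `expr2` is ignored. | `'(let\|tel)\w+'` matches words that contain, but do not end with, `let` or `tel`. |
`(?(cond)expr)` | If condition `cond` is `true`, then match `expr`. | `'(?(?@ispc)[A-Z]:\\)'` matches a drive name, such as `C:\`, when run on a Windows system. |
`(?(cond)expr1\|expr2)` | If condition `cond` is `true`, then match `expr1`. Otherwise, match `expr2`. | `'Mr(s?)\..*?(?(1)her\|his) \w*'` matches text that includes `her` when the text begins with `Mrs`, or that includes `his` when the text begins with `Mr`. |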
Token Operators
Tokens are portions of the matched text that you define by enclosing part of the regular expression in parentheses. You can refer to a token by its sequence in the text (an ordinal token), or assign names to tokens for easier code maintenance and readable output.
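Ordinal Token Operator | Description | Example |
---|---|---|
`(expr)` | Capture in a token the characters that match the enclosed expression. | `'Joh?n\s(\w*)'` captures a token that contains the last name of any person with the first name `John` or `Jon`. |
`\N` | Match the `N`th token. | `'<(\w+).*>.*</\1>'` captures tokens for HTML tags, such as `'title'` from the text `'<title>Some text</title>'`. |
`(?(N)expr1\|expr2)` | If the `N`th token is found, then match `expr1`. Otherwise, match `expr2`. | `'Mr(s?)\..*?(?(1)her\|his) \w*'` matches text that includes `her` when the text begins with `Mrs`, or that includes `his` when the text begins with `Mr`. |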
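Named Token Operator | Description | Example |
---|---|---|
`(?<name>expr)` | Capture in a named token the characters that match the enclosed expression. | `'(?<month>\d+)-(?<day>\d+)-(?<yr>\d+)'` creates named tokens for the month, day, and year in an input date of the form `mm-dd-yy`. |
`\k<name>` | Match the token referred to by `name`. | `'<(?<tag>\w+).*>.*</\k<tag>>'` captures tokens for HTML tags, such as `'title'` from the text `'<title>Some text</title>'`. |
`(?(name)expr1\|expr2)` | If the named token is found, then match `expr1`. Otherwise, match `expr2`. | `'Mr(?<sex>s?)\..*?(?(sex)her\|his) \w*'` matches text that includes `her` when the text begins with `Mrs`, or that includes `his` when the text begins with `Mr`. |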
Note
If an expression has nested parentheses, MATLAB® captures tokens that correspond to the outermost set of parentheses. For example, given the search pattern '(and(y|rew))', MATLAB creates a token for 'andrew' but not for 'y' or 'rew'.
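As a small sketch of ordinal tokens in practice, the following uses tokens both in the replacement text ($N) and inside the expression itself (\N). The sample inputs and variable names are illustrative assumptions.

```matlab
% Reorder a date of the form mm-dd-yy by referring to tokens in the
% replacement text.
reordered = regexprep('12-25-24', '(\d+)-(\d+)-(\d+)', '$3/$1/$2');

% \1 inside the expression requires the closing tag to repeat whatever the
% first token captured, so the mismatched <i>...</b> pair is left alone.
unwrapped = regexprep('<b>bold</b> <i>it</b>', '<(\w+)>(.*?)</\1>', '[$2]');
```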
Dynamic Regular Expressions
Dynamic expressions allow you to execute a MATLAB command or a regular expression to determine the text to match.
The parentheses that enclose dynamic expressions do not create a capturing group.
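Operator | Description | Example |
---|---|---|
`(??expr)` | Parse `expr` and include the resulting term in the match expression. When parsed, `expr` must correspond to a complete, valid regular expression. Dynamic expressions that use the backslash escape character (`\`) require two backslashes: one for the initial parsing of `expr`, and one for the complete match. | `'^(\d+)((??\\w{$1}))'` determines how many characters to match by reading a digit at the beginning of the match. |
`(??@cmd)` | Execute the MATLAB command represented by `cmd`, and include the output returned by the command in the match expression. | `'(.{2,}).?(??@fliplr($1))'` finds palindromes that are at least four characters long, such as `'abba'`. |
`(?@cmd)` | Execute the MATLAB command represented by `cmd`, but discard any output the command returns. (Helpful for diagnosing regular expressions.) | `'\w*?(\w)(?@disp($1))\1\w*'` matches words that include double letters (such as `pp`), and displays intermediate results. |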
Within dynamic expressions, use the following operators to define replacement text.
Replacement Operator | Description |
---|---|
`$&` or `$0` | Portion of the input text that is currently a match |
`` $` `` | Portion of the input text that precedes the current match |
`$'` | Portion of the input text that follows the current match (use `$''` to represent `$'`) |
`$N` | `N`th token |
`$<name>` | Named token |
`${cmd}` | Output returned when MATLAB executes the command, `cmd` |
Comments
Characters | Description | Example |
---|---|---|
`(?#comment)` | Insert a comment in the regular expression. The comment text is ignored when matching the input. | `'(?# Initial digit)\<\d\w+'` includes a comment, and matches words that begin with a number. |
Search Flags
Search flags modify the behavior for matching expressions. An alternative to using a search flag within an expression is to pass an option input argument.
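Flag | Description |
---|---|
`(?-i)` | Match letter case (default for `regexp` and `regexprep`). |
`(?i)` | Do not match letter case (default for `regexpi`). |
`(?s)` | Match dot (`.`) in the pattern with any character (default). |
`(?-s)` | Match dot in the pattern with any character that is not a newline character. |
`(?-m)` | Match the `^` and `$` metacharacters at the beginning and end of text (default). |
`(?m)` | Match the `^` and `$` metacharacters at the beginning and end of a line. |
`(?-x)` | Include space characters and comments when matching (default). |
`(?x)` | Ignore space characters and comments when matching. Use `'\ '` and `'\#'` to match space and `#` characters. |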
The expression that the flag modifies can appear either after the parentheses, such as (?i)\w*, or inside the parentheses and separated from the flag with a colon (:), such as (?i:\w*). The latter syntax allows you to change the behavior for part of a larger expression.
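The two flag placements can be contrasted in a short sketch. The sample text and variable names below are assumptions for illustration.

```matlab
str = 'Hello hello HELLO';

% Default behavior for regexprep: letter case must match.
caseSensitive = regexprep(str, 'hello', 'bye');

% (?i) at the start makes the whole expression case insensitive.
caseInsensitive = regexprep(str, '(?i)hello', 'bye');

% (?i:...) limits case insensitivity to the enclosed part; the 'bc' that
% follows must still match letter case.
partial = regexprep('ABC abc', '(?i:a)bc', 'X');
```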
Data Types: char | cell | string
replace
— Replacement text
character vector | cell array of character vectors | string array
Replacement text, specified as a character vector, a cell array of character vectors, or a string array, as follows:
- If replace is a single character vector and expression is a cell array of character vectors, then regexprep uses the same replacement text for each expression.
- If replace is a cell array of N character vectors and expression is a single character vector, then regexprep attempts N matches and replacements.
- If both replace and expression are cell arrays of character vectors, then they must contain the same number of elements. regexprep pairs each replace element with its matching element in expression.
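A minimal sketch of the first and third cases follows, using made-up input text and variable names.

```matlab
str = 'one two';

% Both arguments are cell arrays of equal length: elements are paired.
paired = regexprep(str, {'one', 'two'}, {'1', '2'});

% A single replacement is shared by every expression in the cell array.
shared = regexprep(str, {'one', 'two'}, 'N');
```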
The replacement text can include regular characters, special characters (such as tabs or new lines), or replacement operators, as shown in the following tables.
Replacement Operator | Description |
---|---|
`$&` or `$0` | Portion of the input text that is currently a match |
`` $` `` | Portion of the input text that precedes the current match |
`$'` | Portion of the input text that follows the current match (use `$''` to represent `$'`) |
`$N` | `N`th token |
`$<name>` | Named token |
`${cmd}` | Output returned when MATLAB executes the command, `cmd` |
Operator | Description |
---|---|
`\a` | Alarm (beep) |
`\b` | Backspace |
`\f` | Form feed |
`\n` | New line |
`\r` | Carriage return |
`\t` | Horizontal tab |
`\v` | Vertical tab |
`\char` | Any character with special meaning in regular expressions that you want to match literally (for example, use `\\` to match a single backslash) |
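The following sketch exercises several replacement operators and one special character. The inputs and variable names are illustrative assumptions.

```matlab
% $0 inserts the current match into the replacement text.
wrapped = regexprep('abc', 'b', '[$0]');

% $1 and $2 insert the first and second tokens.
swappedPair = regexprep('key=value', '(\w+)=(\w+)', '$2=$1');

% ${cmd} inserts the output of a MATLAB command; here upper is applied to
% the first token of each word.
titled = regexprep('hello world', '(\w)(\w*)', '${upper($1)}$2');

% \t in the replacement text becomes a horizontal tab character.
tabbed = regexprep('a,b', ',', '\t');
```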
Data Types: char | cell | string
option
— Search or replacement option
'once' | N | 'warnings' | 'ignorecase' | 'preservecase' | 'emptymatch' | 'dotexceptnewline' | 'lineanchors' | ...
Search or replacement option, specified as a character vector or an integer value, as shown in the following table.
Options come in sets: one option that corresponds to the default behavior, and one or two options that allow you to override the default. Specify only one option from a set. Options can appear in any order.
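Default | Override | Description |
---|---|---|
`'all'` | `'once'` | Match and replace the expression as many times as possible (default), or only once. |
 | `N` | Replace only the `N`th occurrence of the match. |
`'nowarnings'` | `'warnings'` | Suppress warnings (default), or display them. |
`'matchcase'` | `'ignorecase'` | Match letter case (default), or ignore case while matching and replacing. |
 | `'preservecase'` | Ignore case while matching, but preserve the case of corresponding characters in the original text while replacing, provided the original text matches one of the letter-case structures that regexprep recognizes. If the original text does not match one of these structures, then the replacement text is all lowercase. |
`'noemptymatch'` | `'emptymatch'` | Ignore zero length matches (default), or include them. |
`'dotall'` | `'dotexceptnewline'` | Match dot with any character (default), or all except newline (`\n`). |
`'stringanchors'` | `'lineanchors'` | Apply `^` and `$` metacharacters to the beginning and end of the text (default), or to the beginning and end of a line. |
`'literalspacing'` | `'freespacing'` | Include space characters and comments when matching (default), or ignore them. With `'freespacing'`, use `'\ '` and `'\#'` to match space and `#` characters. |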
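As a rough sketch of how a few of these options change the result, consider the following. The sample inputs and variable names are assumptions, not from the original page.

```matlab
str = 'a a a';

% 'once' stops after the first replacement.
firstOnly = regexprep(str, 'a', 'b', 'once');

% An integer N replaces only the Nth occurrence of the match.
secondOnly = regexprep(str, 'a', 'b', 2);

% Options from different sets can be combined, in any order.
combined = regexprep('Abc abc ABC', 'abc', 'x', 'ignorecase', 'once');

% 'lineanchors' lets ^ match at the start of each line, not only the text.
twoLines = sprintf('line1\nline2');
perLine = regexprep(twoLines, '^line', 'L', 'lineanchors');
```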
Data Types: char | string
Output Arguments
newStr
— Updated text
character vector | cell array of character vectors | string array
Updated text, returned as a character vector, a cell array of character vectors, or a string array. The data type of newStr is the same as the data type of str.
Extended Capabilities
Tall Arrays
Calculate with arrays that have more rows than fit in memory.
The regexprep function fully supports tall arrays. For more information, see Tall Arrays.
Thread-Based Environment
Run code in the background using MATLAB® backgroundPool or accelerate code with Parallel Computing Toolbox™ ThreadPool.
This function fully supports thread-based environments. For more information, see Run MATLAB Functions in Thread-Based Environment.
Version History
Introduced before R2006a