How do you get up to the first n characters in a string?

I have an array of strings of variable length, s_arr. I want to limit each string to its first 3 characters. Some of the strings have fewer than 3 characters, in which case they should be left unchanged. I tried
arrayfun(@(s) s(1:3), s_arr)
but this gives an error when it finds a string with fewer than 3 elements. Why does MATLAB give an error and not just return the string unchanged? How can I achieve this goal?
  1 Comment
Stephen23 on 2 Feb 2024
You were very close:
s_arr = ["a";"ab";"abc";"abcd";"abcde"]
s_arr = 5×1 string array
"a" "ab" "abc" "abcd" "abcde"
cellfun(@(s) s(1:min(end,3)), s_arr, 'uni',0)
ans = 5×1 cell array
{'a' } {'ab' } {'abc'} {'abc'} {'abc'}
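A brief sketch of why the original arrayfun call errors, offered as an illustration rather than a definitive explanation: each element that arrayfun passes to the anonymous function is a string scalar, that is, a 1-by-1 string array, so s(1:3) asks for three array elements rather than three characters. Converting to a character vector makes per-character indexing possible. The variable names s and c are only for illustration.

```matlab
s = "abcd";              % string scalar: a 1-by-1 string array
numel(s)                 % 1, so s(1:3) would index past the end of the array
strlength(s)             % 4, the number of characters in the string
c = char(s);             % convert to a 1-by-4 character vector
c(1:min(end,3))          % 'abc', indexing now acts on characters
```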


Answers (5)

Dyuman Joshi on 2 Feb 2024
A simple approach using extractBetween:
s_arr = ["a";"ab";"abc";"abcd";"abcde";"abcdef";"xyzxyzxyz"]
s_arr = 7×1 string array
"a" "ab" "abc" "abcd" "abcde" "abcdef" "xyzxyzxyz"
n = 3; % number of characters to keep (the end position is inclusive)
extractBetween(s_arr, 1, min(n, strlength(s_arr)))
ans = 7×1 string array
"a" "ab" "abc" "abc" "abc" "abc" "xyz"

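As a small sketch of how the end positions are counted in the two functions used in this thread: extractBetween includes the character at the end position, whereas extractBefore stops just before the given position. The variable names s, n, a and b are only for illustration.

```matlab
s = "abcdef";
n = 3;                           % number of characters to keep
a = extractBetween(s, 1, n);     % end position is inclusive, gives "abc"
b = extractBefore(s, n + 1);     % stops before position n+1, gives "abc"
isequal(a, b)                    % true: both keep the first n characters
```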
Paul on 27 Oct 2021
Edited: Paul on 2 Feb 2024
If a string array:
s_arr = ["a";"ab";"abc";"abcd";"abcde"]
s_arr = 5×1 string array
"a" "ab" "abc" "abcd" "abcde"
s_arr(strlength(s_arr)>3) = extractBefore(s_arr(strlength(s_arr)>3),4)
s_arr = 5×1 string array
"a" "ab" "abc" "abc" "abc"
The exact same code works on a cell array of character vectors:
s_arr = cellstr(["a";"ab";"abc";"abcd";"abcde"])
s_arr = 5×1 cell array
{'a' } {'ab' } {'abc' } {'abcd' } {'abcde'}
s_arr(strlength(s_arr)>3) = extractBefore(s_arr(strlength(s_arr)>3),4)
s_arr = 5×1 cell array
{'a' } {'ab' } {'abc'} {'abc'} {'abc'}
  1 Comment
Stephen23 on 2 Feb 2024
+1 This gracefully handles the shorter elements. Stylistically I would define the index beforehand:
s_arr = ["a";"ab";"abc";"abcd";"abcde"]
s_arr = 5×1 string array
"a" "ab" "abc" "abcd" "abcde"
N = 3;
X = strlength(s_arr)>N;
s_arr(X) = extractBefore(s_arr(X),1+N)
s_arr = 5×1 string array
"a" "ab" "abc" "abc" "abc"



Image Analyst on 26 Oct 2021
% Character array
charArray = 'abcdefgh' % Single quotes
n = 3
out = charArray(1:n) % Another way
% String
str = "abcdefgh" % Double quotes
n = 3
charArray = char(str)
out = charArray(1:n)
charArray =
'abcdefgh'
n =
3
out =
'abc'
str =
"abcdefgh"
n =
3
charArray =
'abcdefgh'
out =
'abc'
  1 Comment
Image Analyst on 26 Oct 2021
For the case where n may be greater than or less than the length of the string (so more general and robust):
% Character array
charArray = 'ab' % Single quotes
n = 30
out = charArray(1:min(length(charArray), n)) % Another way
% String
str = "ab" % Double quotes
charArray = char(str)
out = charArray(1:min(length(charArray), n))
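One way the same min-based indexing might be applied to the whole string array from the question is sketched below. It assumes a round trip through a cell array of character vectors via cellstr and string, which is just one possible choice; the names c and out are illustrative.

```matlab
s_arr = ["a";"ab";"abc";"abcd";"abcde"];
n = 3;
c = cellstr(s_arr);                                   % cell array of character vectors
c = cellfun(@(v) v(1:min(length(v), n)), c, ...
            'UniformOutput', false);                  % truncate each character vector
out = string(c)                                       % convert back to a string array
```

Here out should contain "a", "ab", "abc", "abc", "abc".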



Ramanan on 2 Feb 2024
You can use the extractBefore function inside arrayfun instead.
pos = 4; % pos = no. of characters + 1
out = arrayfun(@(s) extractBefore(s,pos), s_arr);
  2 Comments
Stephen23 on 2 Feb 2024
Edited: Stephen23 on 2 Feb 2024
This fails due to the shorter elements:
s_arr = ["a";"ab";"abc";"abcd";"abcde"]
s_arr = 5×1 string array
"a" "ab" "abc" "abcd" "abcde"
pos = 4; % pos = no. of characters + 1
out = arrayfun(@(s) extractBefore(s,pos), s_arr)
Error using extractBefore
Numeric value exceeds the number of characters in element 1.

Error in solution>@(s)extractBefore(s,pos) (line 3)
out = arrayfun(@(s) extractBefore(s,pos), s_arr)
See Paul's answer from three years ago for one approach to handle this.
Ramanan on 2 Feb 2024
You can add the min function and use the length of each string inside arrayfun.
I didn't notice the earlier answer.
s_arr = ["a";"ab";"abc";"abcd";"abcde";"abcdef";"xyzxyzxyz"];
n = 3; % number of characters to keep
out = arrayfun(@(s) extractBefore(s, min(n, strlength(s)) + 1), s_arr)
out = 7×1 string array
"a" "ab" "abc" "abc" "abc" "abc" "xyz"
Also, I think Dyuman Joshi's answer is a simpler version.



Geoff on 12 Apr 2024
Learning about and using regex (regular expressions) has solved many string dilemmas for me. Here's a regex one-liner for this question:
regexprep(s_arr,"(.{3}).*","$1")
Explanation:
The function regexprep ("reg ex replace") is just a fancy text search-and-replace. In a nutshell, our strategy is to find the entire string and replace it with just its first {3} characters.
The first input s_arr is your string array. regexprep will operate on each element of the array.
The next two inputs are our search "(.{3}).*" and our replacement "$1". The regex syntax is described in the MATLAB documentation on regular expressions. Briefly:
  • The period . matches any character.
  • And .{3} matches any 3 consecutive characters.
  • And .* matches any number of characters -- i.e. the rest of the string.
So, we've searched for the entire string, represented in two parts: the first 3 characters (.{3}) and the rest .*. The surrounding parentheses in (.{3}) say we want to capture (i.e. remember) those characters. The $1 in our replacement string is what recalls those captured characters.
So, the entire string has been replaced by its first three characters. If the string has fewer than 3 characters, regexprep doesn't find any match in its search, so nothing is replaced.
Pros and cons:
Con: it's cryptic.
Pro: it doesn't require MATLAB to compute the length of the strings in the array.
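If the number of characters should be a variable rather than a hard-coded 3, the search pattern can be built with sprintf. This is a minimal sketch of that idea; the names N, pat and out are illustrative and not part of the answer above.

```matlab
s_arr = ["a";"ab";"abc";"abcd";"abcde"];
N = 3;                                % number of characters to keep
pat = sprintf("(.{%d}).*", N);        % builds the pattern "(.{3}).*"
out = regexprep(s_arr, pat, "$1")     % strings shorter than N are left unchanged
```

For this input, out should contain "a", "ab", "abc", "abc", "abc".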

Release

R2021b