split columns with separate elements

I have a column with last names and insertions. How can I split those? Not all the last names have the same number of elements.
a = bookings.child_firstname
b = split(a,' ')
Now I have the following error:
Error using split
Element 8 of the text contains 1 matches while the previous elements
have 0. All elements must contain the same number of matches.

4 Comments

a = bookings.child_firstname
is that returning a cell array of character vectors? A nonscalar string array?
split probably isn't the tool for the job here -- more likely a regular expression would be able to parse the variable strings, but would need to see the definition of what the string format can be and what you're trying to extract, specifically.
split is, as the error message indicates, unforgiving: every record must have the same number of delimiters. I've wished for more flexibility there quite a number of times, too. To use it on data that does not strictly follow a given pattern, you'll have to either wrap it in a try...catch...end construct to handle the missing cases, or count/locate the nonconforming strings and write code to deal specifically with what is found.
Better is to have a fixed, well-defined format that the data must follow and enforce it on creation/data entry.
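Both workarounds mentioned above can be sketched roughly as follows. The sample names are made up and only stand in for bookings.child_firstname; the variable names splitOk, nSpaces and needsCare are hypothetical, and the data is assumed to be a string array.

```matlab
% Hypothetical sample data standing in for bookings.child_firstname
a = ["Smith"; "de Vries"; "van der Berg"];

% Option 1: guard the call, since split errors on uneven delimiter counts
try
    b = split(a, " ");
    splitOk = true;
catch err
    splitOk = false;
    disp(err.message)
end

% Option 2: count the delimiters per element and flag the ones that
% differ from the most common count
nSpaces = count(a, " ");                 % one count per element of a
needsCare = nSpaces ~= mode(nSpaces);    % logical mask of irregular rows
```

The mask in needsCare can then be used to route the irregular entries to separate handling, which is only one of several reasonable ways to deal with them.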
regexp(a, '\s+', 'split')
would return a cell array with one entry per element of a, each entry itself being a cell array of character vectors (or a string array, if a is a string array rather than a cell array of character vectors).
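As a small illustration of that call, assuming a is a cell array of character vectors (the sample names below are invented):

```matlab
% Hypothetical sample data; a is assumed to be a cell array of char vectors
a = {'Smith'; 'de Vries'; 'van der Berg'};

% One cell per input element, each holding that element's pieces
parts = regexp(a, '\s+', 'split');

% Number of pieces per element, which is allowed to vary
nParts = cellfun(@numel, parts);
```

Because each element gets its own inner cell, the ragged result does not trigger the "same number of matches" error that split raises.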
One could also investigate the new(ish) pattern object/function that has "regexp for dummies" kind of functionality...
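A minimal sketch of the pattern-based route, assuming R2020b or later and a string array input. The sample names are made up, and applying split one element at a time through arrayfun is just one way to sidestep the equal-count requirement:

```matlab
% Hypothetical sample data standing in for the real column
a = ["Smith"; "de Vries"; "van der Berg"];

% whitespacePattern matches a run of one or more whitespace characters.
% Splitting each scalar string on its own avoids the requirement that
% all elements yield the same number of pieces.
parts = arrayfun(@(s) split(s, whitespacePattern), a, ...
                 'UniformOutput', false);
```

Each cell of parts then holds a column string array whose length depends on the name it came from.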

Answers (1)

I understand that you are running into an error with the split function because the elements do not all split into the same number of pieces.
The split function in MATLAB requires that all elements in the input text have the same number of matches. In your case, since the last names and insertions have varying lengths, the split function cannot be used directly.
To split last names and insertions that have varying lengths, you can use regular expressions and the regexp function in MATLAB. Regular expressions provide a flexible way to match patterns in text.
You can refer to the MATLAB documentation of split and regexp for details.
Here's an example of how you can split last names and insertions using regular expressions:
% Example input
lastNames = {'Smith', 'Johnson Jr.', 'Brown III', 'Davis'};

% Initialize arrays to store last names and insertions
splitLastNames = cell(size(lastNames));
insertions = cell(size(lastNames));

% Regular expression pattern to match last names and insertions
pattern = '^(.*?)\s+(.*?)$';

for i = 1:numel(lastNames)
    % Match the pattern using regular expressions
    matches = regexp(lastNames{i}, pattern, 'tokens');
    if isempty(matches)
        % No insertion found, assign the whole string as the last name
        splitLastNames{i} = lastNames{i};
        insertions{i} = '';
    else
        % Extract the last name and insertion from the matches
        splitLastNames{i} = matches{1}{1};
        insertions{i} = matches{1}{2};
    end
end

% Display the split last names and insertions
for i = 1:numel(lastNames)
    disp(['Last Name: ' splitLastNames{i} ', Insertion: ' insertions{i}]);
end
Last Name: Smith, Insertion:
Last Name: Johnson, Insertion: Jr.
Last Name: Brown, Insertion: III
Last Name: Davis, Insertion:
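In this example, the regular expression pattern ^(.*?)\s+(.*?)$ is used to match the last name and insertion. The ^ and $ symbols denote the start and end of the string, respectively. The (.*?) captures any characters lazily (i.e., as few as possible), and \s+ matches one or more whitespace characters. Because the first group is lazy, the string is split at the first run of whitespace, and everything after it ends up in the second group.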
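To see where the split lands for an entry with more than two words, here is a short sketch. The name 'van der Berg' is a made-up example, and the second pattern is not part of the answer above: it is an assumed variant for data where the insertion comes before the last name, so the split should happen at the last run of whitespace instead.

```matlab
name = 'van der Berg';   % hypothetical multi-word entry

% Lazy first group: splits at the FIRST run of whitespace
atFirst = regexp(name, '^(.*?)\s+(.*?)$', 'tokens', 'once');

% Greedy first group followed by \S+ (assumed variant):
% splits at the LAST run of whitespace
atLast = regexp(name, '^(.*)\s+(\S+)$', 'tokens', 'once');

% A single-word entry matches neither pattern, so the result is empty
noMatch = regexp('Smith', '^(.*)\s+(\S+)$', 'tokens', 'once');
```

Which of the two patterns is appropriate depends on how the column is laid out, so it is worth checking a few real entries before settling on one.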
I hope these suggestions help you resolve the issue you are facing.
Best regards
Chetan Verma

Release

R2022a

Asked:

on 10 Aug 2022

Edited:

on 7 Sep 2023
