Replace a missing string in a table
I want to replace all missing strings in a table with a string of my choice, say 'unknown'. I am on R2016a with no option to upgrade, so functions such as fillmissing are not available to me. For example:
dblVar = [NaN; 3; 7; 9];
cellstrVar = {'one'; 'three'; ''; 'nine'};
categoryVar = categorical({''; 'red'; 'yellow'; 'blue'});
A = table(dblVar, cellstrVar, categoryVar)
A =
    dblVar    cellstrVar    categoryVar
    ______    __________    ___________
    NaN       'one'         <undefined>
    3         'three'       red
    7         ''            yellow
    9         'nine'        blue
I would like to end up with this:
A =
    dblVar    cellstrVar    categoryVar
    ______    __________    ___________
    NaN       'one'         unknown
    3         'three'       red
    7         'unknown'     yellow
    9         'nine'        blue
Note that I also replaced the categorical <undefined> value; please cover that case in your answer if you can.
Is there a way to do this without changing A's structure in the process, e.g. converting it from a table to a cell array? I want to avoid the conversion because my table is large and converting it may cause memory issues.
Edit to add: the locations of the missing string values have to be identified as well, since there may be several such columns in the table.
Many thanks.
Accepted Answer
George
on 28 Sep 2016
This should work:
dblVar = [NaN; 3; 7; 9];
cellstrVar = {'one'; 'three'; ''; 'nine'};
categoryVar = categorical({''; 'red'; 'yellow'; 'blue'});
cellstrVar2 = {'four'; 'none'; '7'; ''};
A = table(dblVar, cellstrVar, categoryVar, cellstrVar2);
varNames = A.Properties.VariableNames;
for ii = 1:numel(varNames)
    % Cell array of strings: replace the empty strings.
    if iscellstr(A{1,varNames{ii}})
        undefloc = strcmp(A.(ii), '');
        A{undefloc, ii} = cellstr('unknown');
    end
    % Categorical: replace the <undefined> entries.
    if iscategorical(A{1, varNames{ii}})
        undefloc = isundefined(A{:,ii});
        A{undefloc, ii} = categorical(cellstr('unknown'));
    end
end
A =
    dblVar    cellstrVar    categoryVar    cellstrVar2
    ______    __________    ___________    ___________
    NaN       'one'         unknown        'four'
    3         'three'       red            'none'
    7         'unknown'     yellow         '7'
    9         'nine'        blue           'unknown'
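Inside the loop, undefloc is a logical index of the rows that held an empty string or an undefined categorical in the current variable, so you can use it to find where the replacements were made.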
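Since undefloc is overwritten on every pass of the loop, one way to keep all the locations is to fill a logical mask the size of the table before doing any replacement. This is only a sketch: missingLoc, rowIdx and colIdx are names made up for illustration, and it assumes "missing" means an empty string in a cellstr variable or <undefined> in a categorical.

```matlab
% Sketch: record where the missing text values are, one mask column per variable.
dblVar = [NaN; 3; 7; 9];
cellstrVar = {'one'; 'three'; ''; 'nine'};
categoryVar = categorical({''; 'red'; 'yellow'; 'blue'});
A = table(dblVar, cellstrVar, categoryVar);

% missingLoc is a hypothetical name; true marks a missing text value.
missingLoc = false(height(A), width(A));
for ii = 1:width(A)
    var = A.(ii);
    if iscellstr(var)
        missingLoc(:, ii) = strcmp(var, '');     % empty strings
    elseif iscategorical(var)
        missingLoc(:, ii) = isundefined(var);    % <undefined> entries
    end
end

% Row and column subscripts of every missing text value.
[rowIdx, colIdx] = find(missingLoc);
```

The mask costs one logical per table element, so it stays small next to the table itself, and any(missingLoc, 1) tells you which variables contained something missing.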
4 Comments
George
on 29 Sep 2016
There you go. If you do that and slam the cells and categoricals into cell arrays you can use the curly brace syntax on the cell array, rather than the variable name.
More Answers (1)
Peter Perkins
on 3 Oct 2016
George's loop seems fine to me, although you could tweak it a bit:
for name = varNames
    % name is a 1x1 cell here, so index into it to get the variable name
    var = A.(name{1});
    if iscellstr(var)
        var(strcmp(var,'')) = {'unknown'};
    elseif iscategorical(var)
        var(isundefined(var)) = 'unknown';
    end
    A.(name{1}) = var;
end
If you're willing to write a couple of small functions, you can do this:
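% 'uniform' makes varfun return a logical vector rather than a table
theCellStrs = varfun(@iscellstr,A,'OutputFormat','uniform');
A(:,theCellStrs) = varfun(@replaceEmptyString,A(:,theCellStrs));
% saved as replaceEmptyString.m
function c = replaceEmptyString(c)
    c(strcmp(c,'')) = {'unknown'};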
(and similarly for categorical) but varfun uses a loop underneath.
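For the categorical case, a minimal sketch along the same lines might look like the following. The helper name replaceUndefined is made up for illustration, and it assumes the categorical variables are not protected (ordinal categoricals are protected by default), since assigning 'unknown' has to add a new category.

```matlab
% Sketch only: replaceUndefined is a hypothetical helper, not from the thread.
dblVar = [NaN; 3; 7; 9];
categoryVar = categorical({''; 'red'; 'yellow'; 'blue'});
A = table(dblVar, categoryVar);

% Logical row vector marking the categorical variables.
theCats = varfun(@iscategorical, A, 'OutputFormat', 'uniform');
A(:,theCats) = varfun(@replaceUndefined, A(:,theCats));

% In R2016a, save this function in its own file, replaceUndefined.m;
% local functions at the end of a script need R2016b or later.
function c = replaceUndefined(c)
% Assigning a new name adds the category, assuming c is not protected.
c(isundefined(c)) = 'unknown';
end
```

As with the cellstr version, varfun still loops over the variables underneath, so this is a matter of style rather than speed.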