MATLAB Answers

cellfun vs. varfun applied to column of table

Rob Hurt on 20 May 2020
Commented: Rob Hurt on 20 May 2020
Hello,
I have a table (only showing one column here for demo purposes):
T = table({'147956, 154414'; '1, 7439'; '93053, 101815'; '50151, 54585; 827532, 828570; 5846728, 5848716'; '1063488, 1079019'},'VariableNames',{'indices'})
What I need to do is convert the character vector in each cell into a 1xn double array.
First I tried to use varfun to apply the str2double function to the indices variable:
T.indices = varfun(@str2double,T,'InputVariables','indices')
That ran without error, but every entry came back as a 1x1 double containing NaN.
I tried the same thing with str2num:
T.indices = varfun(@str2num,T,'InputVariables','indices')
But it gave me the error:
Applying the function 'str2num' to the variable 'indices' generated the following error:
Input must be a character vector or string scalar.
What ultimately worked was using cellfun to apply the str2num function:
T.indices = cellfun(@str2num,T.indices,'UniformOutput',false)
But I'm not sure why. What is the difference between passing "T,'InputVariables','indices'" into varfun and passing "T.indices" into cellfun? In what case would you use varfun? And is there any way to pass in my variable so that str2double returns the correct output?
Thanks in advance for any insights,
Rob


Accepted Answer

Tommy on 20 May 2020
Edited: Tommy on 20 May 2020
varfun passes entire variables from your table into your function. It calls your function once per table variable. The variable in your table is indices, a 5x1 cell array of character vectors:
>> T.indices
ans =
5×1 cell array
{'147956, 154414' }
{'1, 7439' }
{'93053, 101815' }
{'50151, 54585; 827532, 828570; 5846728, 5848716'}
{'1063488, 1079019' }
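This runs without error because str2double can handle cell array input. It returns one value per cell, and that value is NaN whenever the text in the cell cannot be read as a single number: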
T.indices = varfun(@str2double,T,'InputVariables','indices')
It is similar to:
>> str2double(T.indices)
ans =
NaN
NaN
NaN
NaN
NaN
You get an error here because str2num cannot accept cell arrays:
T.indices = varfun(@str2num,T,'InputVariables','indices')
It is similar to:
>> str2num(T.indices)
Error using str2num (line 35)
Input must be a character vector or string scalar.
cellfun, on the other hand, calls your function over and over again, once per cell in your cell array. cellfun passes the contents of each cell when calling your function. The contents of each cell in T.indices is a character vector, so str2num doesn't complain, and you don't get an error here:
T.indices = cellfun(@str2num,T.indices,'UniformOutput',false)
It is similar to:
>> {str2num(T.indices{1}); str2num(T.indices{2}); str2num(T.indices{3});...
str2num(T.indices{4}); str2num(T.indices{5})}
ans =
5×1 cell array
{1×2 double}
{1×2 double}
{1×2 double}
{3×2 double}
{1×2 double}
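The contrast between the two calls can be sketched with a function that only reports what it was handed. This is a minimal illustration on a small made-up table (T2 and its variable names txt and num are hypothetical, not from the question); 'OutputFormat','cell' is used so that varfun returns a cell array rather than a table.

```matlab
% Hypothetical two-variable table, used only for illustration
T2 = table({'1, 2'; '3, 4'; '5, 6'}, [10; 20; 30], ...
    'VariableNames', {'txt', 'num'});

% varfun calls @class once per table variable, passing the whole variable
perVariable = varfun(@class, T2, 'OutputFormat', 'cell');
% perVariable holds 'cell' and 'double': one result per variable

% cellfun calls @class once per cell, passing the contents of that cell
perCell = cellfun(@class, T2.txt, 'UniformOutput', false);
% perCell holds 'char' three times: one result per cell
```

The function given to varfun sees a cell array and a double column, while the function given to cellfun sees only character vectors.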
As for the different behavior of str2num and str2double, consider the following:
>> str2double('1,000')
ans =
1000
>> str2num('1,000')
ans =
1 0
From the documentation for str2double (roughly),
"If [input] is a character vector or string scalar, then [output] is a numeric scalar."
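Because str2double returns at most one number per piece of text, it reads the comma in '1,000' as a thousands separator, and it returns NaN for text such as '1, 7439' that it cannot read as a single number. str2num instead evaluates the text as a MATLAB expression, so commas and semicolons separate the elements and rows of a matrix.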
You could use something like sscanf, regexp, or strsplit to split each character vector and, if the pieces are still text, use str2double to convert them to doubles.
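One way the split-then-convert idea might look, as a rough sketch rather than the only option: it uses regexp with the 'split' option and then str2double on the pieces. The shortened table and the output variable name values are assumptions made for the example.

```matlab
% Shortened version of the table from the question
T = table({'147956, 154414'; '1, 7439'; '50151, 54585; 827532, 828570'}, ...
    'VariableNames', {'indices'});

% Split each character vector on commas or semicolons, ignoring the
% spaces around them; each result is a 1-by-n cell array of substrings
parts = cellfun(@(s) regexp(s, '\s*[,;]\s*', 'split'), T.indices, ...
    'UniformOutput', false);

% str2double accepts a cell array of character vectors and returns a
% numeric array of the same size, so each row becomes a 1-by-n double
T.values = cellfun(@str2double, parts, 'UniformOutput', false);
```

Unlike str2num, this does not evaluate the text as code, and any piece that is not a number shows up as NaN instead of causing the whole conversion to fail.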
(edit) reworded a few things

  3 Comments

Rob Hurt on 20 May 2020
Hi Tommy,
Thanks, this is great. Clearly I have some confusion on varfun. From reading the docs, especially "Apply Element-wise Function," I was under the impression that varfun could pass in each cell individually, because that was the result in that example. As a general rule, how do you know if something will be applied to the entire variable or the cell contents?
Thanks,
Rob
Tommy on 20 May 2020
In that example, the fact that the function is element-wise means that the function applies its operations to each element of the input separately. However, when the function is used with varfun, entire variables are still passed to the function at once. varfun does not pass each element in the table's variable individually.
The corresponding 'non-element-wise' function,
func = @(x) x^2;
would attempt a matrix power, multiplying the input by itself as a matrix. For the example table given in the docs it would fail, because each variable is a 5x1 array and x^2 requires a square matrix.
With varfun, if you want your function to apply to each element within your table's variables, you can write it to do so, but you need to keep in mind that the entire variable will be passed to it at once. The difference between the element-wise function func from the example in the docs and str2num is that func can properly handle an input which is a double array, but str2num can't handle input which is a cell array.
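As a sketch of "writing it to do so": the function handed to varfun can itself loop over the cells it receives. The wrapper name perCell and the shortened table are assumptions made for this example; the result is pulled out of the returned table by position so the code does not depend on the variable name varfun generates.

```matlab
% Shortened version of the table from the question
T = table({'147956, 154414'; '1, 7439'}, 'VariableNames', {'indices'});

% The wrapper receives the whole cell array at once and applies
% str2num to the contents of each cell
perCell = @(c) cellfun(@str2num, c, 'UniformOutput', false);

% varfun calls the wrapper once, with all of T.indices as its input
R = varfun(perCell, T, 'InputVariables', 'indices');

% R is a one-variable table; take its first variable by position
T.indices = R{:, 1};
```

For a single variable this is more roundabout than calling cellfun on T.indices directly. varfun becomes more useful when the same whole-column function should run over several variables at once.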
(edit) Also I reworded my answer a bit, I hope it wasn't misleading before!
Rob Hurt on 20 May 2020
Yes, that makes sense, thank you very much!


More Answers (1)

per isakson on 20 May 2020
Edited: per isakson on 20 May 2020
This works, but may contain more code than you hoped for.
%%
T = table({'147956, 154414'; '1, 7439'; '93053, 101815'; '50151, 54585; 827532, 828570; 5846728, 5848716'; '1063488, 1079019'},'VariableNames',{'indices'})
%%
T.num = varfun( @foo, T, 'InputVariables','indices' );
%%
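function out = foo( varargin )
    % varfun passes the whole 'indices' variable (a cell array) as varargin{1}
    len = length( varargin{1} );
    out = cell( len, 1 );
    for jj = 1 : len
        chr = varargin{1}{jj};
        % Split on commas and semicolons and read every field as a double
        num = textscan( chr, '%f', 'Delimiter',',;' );
        % textscan returns a column vector; store it as a row
        out{jj} = reshape( num{1}, 1,[] );
    end
end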
Adds a column to the table
>> T
T =
  5×2 table
                        indices                             num
                                                        foo_indices
    ________________________________________________    ____________
    '147956, 154414'                                    [1×2 double]
    '1, 7439'                                           [1×2 double]
    '93053, 101815'                                     [1×2 double]
    '50151, 54585; 827532, 828570; 5846728, 5848716'    [1×6 double]
    '1063488, 1079019'                                  [1×2 double]
>>
Check whether the order of the numbers in [1×6 double] is what you want.
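If the flat 1×6 order is not what you want, one possible variation keeps each "a, b" pair as its own row. This sketch assumes every entry holds an even number of values, and it swaps textscan for regexp plus str2double; the names vals and pairs are made up for the example.

```matlab
% Example entry taken from the table in the question
chr = '50151, 54585; 827532, 828570; 5846728, 5848716';

% Split on commas or semicolons (with optional surrounding spaces)
% and convert the pieces to a 1-by-6 row vector
vals = str2double(regexp(chr, '\s*[,;]\s*', 'split'));

% Assumption: the values come in pairs. Fill a 2-by-n matrix column by
% column, then transpose so that each pair becomes one row.
pairs = reshape(vals, 2, []).';
```

For this entry, pairs is a 3×2 matrix with one row per semicolon-separated pair, the same shape that str2num produced in the accepted answer.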
