cellfun vs. varfun applied to column of table

Question

Rob Hurt on 20 May 2020


https://se.mathworks.com/matlabcentral/answers/527269-cellfun-vs-varfun-applied-to-column-of-table

Commented: Rob Hurt on 20 May 2020

Accepted Answer: Tommy


Hello,

I have a table (only showing one column here for demo purposes):

T = table({'147956, 154414'; '1, 7439'; '93053, 101815'; '50151, 54585; 827532, 828570; 5846728, 5848716'; '1063488, 1079019'},'VariableNames',{'indices'})

What I need to do is to convert the 1x1 char arrays into 1xn double arrays.

First I tried to use varfun to apply the str2double function to the indices variable:

T.indices = varfun(@str2double,T,'InputVariables','indices')

That ran without error, but converted all of my 1x1 char arrays into 1x1 double arrays with NaN in each.

I tried the same thing with str2num:

T.indices = varfun(@str2num,T,'InputVariables','indices')

But it gave me the error:

Applying the function 'str2num' to the variable 'indices' generated the following error:

Input must be a character vector or string scalar.

What ultimately worked was using cellfun to apply the str2num function:

T.indices = cellfun(@str2num,T.indices,'UniformOutput',false)

But I'm not sure why. What is the difference between passing "T,'InputVariables','indices'" into varfun and passing "T.indices" into cellfun? In what case would you use varfun? And is there any way to pass in my variable so that str2double returns the correct output?

Thanks in advance for any insights,

Rob


Answer 1

Tommy on 20 May 2020


https://se.mathworks.com/matlabcentral/answers/527269-cellfun-vs-varfun-applied-to-column-of-table#answer_433999

Edited: Tommy on 20 May 2020


varfun passes entire variables from your table into your function. It calls your function once per table variable. The variable in your table is indices, a 5x1 cell array of character vectors:

>> T.indices
ans =
  5×1 cell array
    {'147956, 154414'                                }
    {'1, 7439'                                       }
    {'93053, 101815'                                 }
    {'50151, 54585; 827532, 828570; 5846728, 5848716'}
    {'1063488, 1079019'                              }

This runs without error because str2double can handle cell array input:

T.indices = varfun(@str2double,T,'InputVariables','indices')

It is similar to:

>> str2double(T.indices)
ans =
   NaN
   NaN
   NaN
   NaN
   NaN

You get an error here because str2num cannot accept cell arrays:

T.indices = varfun(@str2num,T,'InputVariables','indices')

It is similar to:

>> str2num(T.indices)
Error using str2num (line 35)
Input must be a character vector or string scalar.

cellfun, on the other hand, calls your function repeatedly, once per cell in your cell array, and passes the contents of that cell each time. Each cell in T.indices contains a character vector, so str2num doesn't complain, and you don't get an error here:

T.indices = cellfun(@str2num,T.indices,'UniformOutput',false)

It is similar to:

>> {str2num(T.indices{1}); str2num(T.indices{2}); str2num(T.indices{3});...
    str2num(T.indices{4}); str2num(T.indices{5})}
ans =
  5×1 cell array
    {1×2 double}
    {1×2 double}
    {1×2 double}
    {3×2 double}
    {1×2 double}
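One way to check this for yourself is to pass @class instead of a conversion function, so that each call reports what kind of input it received. This is only a small sketch on a shortened two-row version of the table; the names viaVarfun and viaCellfun are made up for the illustration.

```matlab
% Shortened version of the table from the question
T = table({'147956, 154414'; '1, 7439'}, 'VariableNames', {'indices'});

% varfun calls the function once, with the whole variable as input,
% so class reports that it was handed a cell array
viaVarfun = varfun(@class, T, 'InputVariables', 'indices', 'OutputFormat', 'cell');

% cellfun calls the function once per cell, with the contents of that cell,
% so class reports a char input for every row
viaCellfun = cellfun(@class, T.indices, 'UniformOutput', false);

disp(viaVarfun)
disp(viaCellfun)
```

The 'OutputFormat','cell' option is used here only so that the result of the single varfun call is easy to inspect.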

As for the different behavior of str2num and str2double, consider the following:

>> str2double('1,000')
ans =
        1000
>> str2num('1,000')
ans =
     1     0

From the documentation for str2double (roughly),

"If [input] is a character vector or string scalar, then [output] is a numeric scalar."

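str2double therefore always tries to read the whole character vector as a single number: it accepts the comma in '1,000' as a thousands separator, and it returns NaN for text such as '147956, 154414' that it cannot read as one number. str2num instead evaluates the text as a MATLAB expression, so commas separate elements and semicolons separate rows.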

You could use something like sscanf, regexp, strsplit, etc. to split each character vector into its individual numbers and, if still needed, use str2double to convert the pieces to double.
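As a rough sketch of the regexp route (not the only way to do it), the following pulls every run of digits out of each character vector and converts the pieces with str2double. It assumes the text holds only non-negative integers, and the helper names toRow, toPairs, asRows and asPairs are invented for this example.

```matlab
% Shortened version of the table from the question
T = table({'147956, 154414'; '1, 7439'; '50151, 54585; 827532, 828570'}, ...
    'VariableNames', {'indices'});

% Extract every run of digits, then convert the pieces to double.
% Assumption: the text contains only non-negative integers.
toRow = @(s) str2double(regexp(s, '\d+', 'match'));
asRows = cellfun(toRow, T.indices, 'UniformOutput', false);

% Optional: regroup each row vector into N-by-2 pairs,
% which matches the shape str2num produces for this data.
toPairs = @(v) reshape(v, 2, []).';
asPairs = cellfun(toPairs, asRows, 'UniformOutput', false);
```

Because str2double accepts a cell array of character vectors, it can convert all the matches from one row in a single call; cellfun is still what walks over the rows of the table variable.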

(edit) reworded a few things

3 Comments

Tommy on 20 May 2020

Edited: Tommy on 20 May 2020


In that example, the fact that the function is element-wise means that the function applies its operations to each element of the input separately. However, when the function is used with varfun, entire variables are still passed to the function at once. varfun does not pass each element in the table's variable individually.

The corresponding 'non-element-wise' function,

func = @(x) x^2;

would attempt to perform matrix multiplication, multiplying the input with itself. For the example table given in the docs, it would fail, seeing as each variable is a 5x1 array.

With varfun, if you want your function to apply to each element within your table's variables, you can write it to do so, but you need to keep in mind that the entire variable will be passed to it at once. The difference between the element-wise function func from the example in the docs and str2num is that func can properly handle an input which is a double array, but str2num can't handle input which is a cell array.
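To make the comparison concrete, here is a minimal sketch with a small numeric table. The table A and its variable names are made up for the illustration and are not the table from the docs.

```matlab
% Hypothetical table with two 3x1 numeric variables
A = table([1; 2; 3], [4; 5; 6], 'VariableNames', {'a', 'b'});

% Element-wise power: each call receives a whole 3x1 column
% and squares every element of it
squared = varfun(@(x) x.^2, A);

% Matrix power: each call still receives a whole 3x1 column,
% which is not square, so the call fails
matrixPowerFailed = false;
try
    varfun(@(x) x^2, A);
catch err
    matrixPowerFailed = true;
    disp(err.message)
end
```

In both cases varfun makes one call per variable; only what the function does with the column differs.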

(edit) Also I reworded my answer a bit, I hope it wasn't misleading before!

Rob Hurt on 20 May 2020

Yes, that makes sense, thank you very much!


Answer 2

per isakson on 20 May 2020


https://se.mathworks.com/matlabcentral/answers/527269-cellfun-vs-varfun-applied-to-column-of-table#answer_434019

Edited: per isakson on 20 May 2020


This works, but may contain more code than you hoped for.

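%% Build the example table
T = table({'147956, 154414'; '1, 7439'; '93053, 101815'; '50151, 54585; 827532, 828570; 5846728, 5848716'; '1063488, 1079019'},'VariableNames',{'indices'})
%% Apply foo to the whole 'indices' variable (varfun calls foo once)
T.num = varfun( @foo, T, 'InputVariables','indices' );
%%
function    out = foo( varargin )
    % varargin{1} is the entire cell array, not the contents of one cell
    len = length( varargin{1} );
    out = cell( len, 1 );
    for jj = 1 : len
       chr = varargin{1}{jj};
       % read every number, treating commas and semicolons as delimiters
       num = textscan( chr, '%f', 'Delimiter',',;' );
       out{jj} = reshape( num{1}, 1,[] );   % store as a row vector
    end
end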

This adds a column, num, to the table:

>> T
T =
  5×2 table
                        indices                             num     
                                                        foo_indices 
    ________________________________________________    ____________
    '147956, 154414'                                    [1×2 double]
    '1, 7439'                                           [1×2 double]
    '93053, 101815'                                     [1×2 double]
    '50151, 54585; 827532, 828570; 5846728, 5848716'    [1×6 double]
    '1063488, 1079019'                                  [1×2 double]
>> 

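Note that the fourth row becomes a flat 1×6 vector here rather than the 3×2 matrix that str2num returns. Check whether the order of the numbers in [1×6 double] is what you want.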
