Passing structs/objects to functions
broken_arrow
on 16 Apr 2021
Edited: Trym Gabrielsen
on 12 Mar 2022
When applying functions to structs, I often pass the entire struct as an input argument even if not all the variables within it are needed by the function in order to keep the input argument list short. AFAIK functions are JIT compiled on their first execution. That makes me wonder: Does the compiler realize which parts of the struct are actually needed and pass only these to the function? Or is the entire struct loaded into the function even if just one element is modified (which could potentially cause enormous overhead)? I would guess it's the former since with objects, the entire object is usually passed to each method:
function obj = whatever(obj,otherArgument1,etc)
% function code
end
Is that correct?
Accepted Answer
Bruno Luong
on 16 Apr 2021
Edited: Bruno Luong
on 16 Apr 2021
"Does the compiler realize which parts of the struct are actually needed and pass only these to the function? "
MATLAB doesn't pass function inputs by value the way C and C++ do; it passes the "address" of the input (structure), i.e. an mxArray pointer. So your question is not applicable and thus irrelevant.
5 Comments
Bruno Luong
on 16 Apr 2021
Edited: Bruno Luong
on 16 Apr 2021
"Or does it work like a path"
I don't know what you mean by "work like a path". Are you talking about nested structures?
"only the top level adress"
All MATLAB objects (all classes) are encapsulated in a data structure called "mxArray"; only their addresses are passed into the function.
More Answers (2)
Steven Lord
on 16 Apr 2021
Just because MATLAB passes a large array into a function doesn't mean it needs to make a copy of that array if you don't modify the input in the function:
tic
S = repmat(dir, 1000, 1000);
toc
whos S
tic
y = myfun(S);
toc
It took much less time to perform the operation in myfun than it did to create S, and if myfun were copying S you'd expect that time to be closer to the creation time. As a different example using a numeric array rather than a struct:
A = rand(4000, 4000);
tic
y = myfun2(A); % Does not modify A
toc
tic
y = myfun3(A); % Modifies A
toc
There's a limit to how big I can make A in a MATLAB Answers post, but try the code yourself in a desktop installation of MATLAB with a larger A if you want to see a larger difference in the times.
function y = myfun(S)
    y = fieldnames(S); % S is not modified and so not copied
end
function y = myfun2(A)
    y = A(42, 999); % A is only read
end
function y = myfun3(A)
    A = A + 1; % A is modified, so a copy is made
    y = A(42, 999);
end
James Tursa
on 19 Apr 2021
Edited: James Tursa
on 19 Apr 2021
MATLAB typically passes shared data copies of input arguments to functions. That means creating a separate mxArray header for the variable and then sharing the data pointers. For cell arrays and structs, that means that only one top level data pointer is copied. The individual addresses of all of the cell and field variables (perhaps hundreds or thousands of these) are not copied ... they are part of the "shared data" that is pointed to by the one top level data pointer. Passing cell or struct variables to functions takes about the same amount of overhead as passing a regular numeric variable. I.e., it takes about the same amount of effort to pass a struct with thousands of field elements as it does to pass a small 2D numeric matrix.
If you subsequently modify one of those input arguments inside the function, then MATLAB needs to make a deep copy of the variable (or deep copy of the cell or field you are modifying) first, which will take additional time and memory.
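The behaviour described above can be observed from the caller's side. This is only a minimal sketch: the struct `S`, its field names, and the function handles `readOnly` and `modifyB` are hypothetical names chosen for illustration. It shows the visible value semantics, not MATLAB's internal sharing, and the elapsed times will vary by machine and release.

```matlab
% Hypothetical struct with one large field and one small field.
S = struct('big', ones(2000), 'b', 0);

% A function that only reads its input. According to the explanation
% above, no deep copy should be needed for this call.
readOnly = @(s) size(s.big, 1);

% A function that returns a modified version of its input.
% setfield returns a new struct value; the caller's variable is untouched.
modifyB = @(s) setfield(s, 'b', 1);

tic
n = readOnly(S);
tRead = toc;

tic
T = modifyB(S);
tModify = toc;

fprintf('read-only call: %.6f s, modifying call: %.6f s\n', tRead, tModify);

% The caller's S keeps its old value of b; only the returned T is changed.
fprintf('S.b = %d, T.b = %d\n', S.b, T.b);
```

Because the modification happens on the function's own copy, the caller only sees the change if the function returns the struct and the caller assigns it back, as in `S = modifyB(S)`.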
3 Comments
Bruno Luong
on 7 Mar 2022
Edited: Bruno Luong
on 7 Mar 2022
@Trym Gabrielsen Interesting question. I don't know the answer and have been wondering about it myself.
Short answer: I use a lot of structures that capture my program "state" and pass them as inputs and outputs of my functions. The functions either change fields or add new fields to the structure. Performance-wise, I have not noticed any significant speed penalty.
However, for speed-critical parts I try to use simpler variable types extracted from the structure before the loop, then assign the result back after the loop terminates. I avoid assigning the fields multiple times when it is not needed. This also makes my code more readable and easier to maintain, so I don't see why I should do it differently.
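The pattern of extracting fields before a loop and writing them back once afterwards could look roughly like this. It is a minimal sketch under assumptions: the struct `state` and its fields `data`, `total`, and `count` are hypothetical, and the loop body is a placeholder for real work.

```matlab
% Hypothetical program state kept in a struct.
state = struct('data', (1:1000)', 'total', 0, 'count', 0);

% Extract what the loop needs into plain local variables.
data  = state.data;
total = state.total;

% The hot loop works on the simple variables only and never
% touches the struct.
for k = 1:numel(data)
    total = total + data(k)^2;
end

% Assign the results back to the struct once, after the loop.
state.total = total;
state.count = numel(data);

fprintf('total = %d, count = %d\n', state.total, state.count);
```

The struct is written only twice in total, instead of once per iteration.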
Long answer: the internal organization and data sharing are not documented, and TMW makes reverse engineering more and more difficult. I must admit I have not understood how it works for the last few releases. I just want to show how the different data pointers change internally in this short example, run on R2021b. It is up to you to draw a conclusion because, as I said, I don't completely understand how it works and why it works like that.
The code:
clear all
clc
foo
function foo
    format debug;
    s = struct('a', 'a', 'b', 'b');
    fprintf('\nbefore calling function\n'); fprintf('s\n'); PrintPtr(s); fprintf('s.a\n'); s.a, PrintPtr(s.a); fprintf('s.b\n'); s.b, PrintPtr(s.b);
    s = modifyb(s);
    fprintf('\nafter calling function\n'); fprintf('s\n'); PrintPtr(s); fprintf('s.a\n'); PrintPtr(s.a); fprintf('s.b\n'); PrintPtr(s.b);
    bar
end
%%
function s = modifyb(s)
    fprintf('\ninside function before modification\n'); fprintf('s\n'); PrintPtr(s); fprintf('s.a\n'); PrintPtr(s.a); fprintf('s.b\n'); PrintPtr(s.b);
    s.b = 'bb';
    fprintf('\ninside function after modification\n'); fprintf('s\n'); PrintPtr(s); fprintf('s.a\n'); PrintPtr(s.a); fprintf('s.b\n'); PrintPtr(s.b);
end
%%
function bar()
    s = struct('a', 'a', 'b', 'b');
    fprintf('\ninside function bar before modification\n'); fprintf('s\n'); PrintPtr(s); fprintf('s.a\n'); PrintPtr(s.a); fprintf('s.b\n'); PrintPtr(s.b);
    s.b = 'bb';
    fprintf('\ninside function bar after modification\n'); fprintf('s\n'); PrintPtr(s); fprintf('s.a\n'); PrintPtr(s.a); fprintf('s.b\n'); PrintPtr(s.b);
end
Mex file PrintPtr.c
// PrintPtr.c
// Gateway routine, simple function to print the mxArray address
// and the data pointer of a MATLAB object
// Save as PrintPtr.c and compile it in MATLAB using
// > mex -R2018a PrintPtr.c
// Bruno Luong: 06 Dec 2008
#include "mex.h"

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    // The input is read-only, so keep the const qualifier
    const mxArray *Var;

    // Check for proper inputs and output
    if (nrhs != 1)
        mexErrMsgTxt("PrintPtr needs one input");
    if (nlhs > 0)
        mexErrMsgTxt("PrintPtr does not return value");

    // Point at the input without copying it
    Var = prhs[0];

    // Print the addresses directly; %p expects a void pointer
    mexPrintf("mxArray = %px\n", (const void *)Var);
    mexPrintf("Data ptr = %px\n", (const void *)mxGetData(Var));
}
Result
before calling function
s
mxArray = 0000022F7C4B8640x
Data ptr = 0000022F14204360x
s.a
ans =
'a'
mxArray = 0000022F7C2C71C0x
Data ptr = 0000022F16401680x
s.b
ans =
'b'
mxArray = 0000022F7C4911E0x
Data ptr = 0000022F010D99C0x
inside function before modification
s
mxArray = 0000022F7C4B8640x
Data ptr = 0000022F14204360x
s.a
mxArray = 0000022F7C2C7160x
Data ptr = 0000022F16401360x
s.b
mxArray = 0000022F7C4910C0x
Data ptr = 0000022F010DAC80x
inside function after modification
s
mxArray = 0000022F7C4B8640x
Data ptr = 0000022F14204360x
s.a
mxArray = 0000022F7C2C6DA0x
Data ptr = 0000022F163FBC80x
s.b
mxArray = 0000022F7C4BB760x
Data ptr = 0000022F0DC140C0x
after calling function
s
mxArray = 0000022F7C4B8640x
Data ptr = 0000022F14204360x
s.a
mxArray = 0000022F7C490FA0x
Data ptr = 0000022F010DAA40x
s.b
mxArray = 0000022F7C4BB760x
Data ptr = 0000022F0DC140C0x
inside function bar before modification
s
mxArray = 0000022F7C4B99C0x
Data ptr = 0000022F142041C0x
s.a
mxArray = 0000022F7C2C7760x
Data ptr = 0000022F16401500x
s.b
mxArray = 0000022F7C490C40x
Data ptr = 0000022F010DB4A0x
inside function bar after modification
s
mxArray = 0000022F7C4B99C0x
Data ptr = 0000022F142041C0x
s.a
mxArray = 0000022F7C2C76A0x
Data ptr = 0000022F16400BA0x
s.b
mxArray = 0000022F7C4B9120x
Data ptr = 0000022F162CB400x
James Tursa
on 8 Mar 2022
Edited: James Tursa
on 8 Mar 2022
@Trym Gabrielsen "I am wondering if the entire struct B is copied, when you modify only one field, or is only that one field copied?"
Only that one field being modified is deep copied. All of the other fields remain as reference copies, i.e. only the top level mxArray variable address is copied. None of the data within these other fields is deep copied.
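One rough way to probe this from plain MATLAB code is to time a change to a small field against a change to a large field of a copied struct. This is only a sketch: the variable and field names are hypothetical, the timings depend on machine, release, and optimizations that are not documented, and a single `tic`/`toc` pair is a coarse measurement. If only the modified field is deep copied, the small-field change should not have to pay for the large field.

```matlab
% Hypothetical struct: one large field and one small field.
A = struct('big', ones(2000), 'small', 1);

% B and C start out as copies of A (value semantics).
B = A;
C = A;

% Modify only the small field of B.
tic
B.small = 2;
tSmall = toc;

% Modify one element of the large field of C.
tic
C.big(1) = 0;
tBig = toc;

fprintf('modify small field: %.6f s\n', tSmall);
fprintf('modify large field: %.6f s\n', tBig);

% A itself is unchanged in both cases.
fprintf('A.small = %d, A.big(1) = %d\n', A.small, A.big(1));
```

Repeating the measurement several times, or with a larger `big` field, gives a more reliable picture than a single run.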