Can a function know whether/which specific output is ignored (tilde operator)?

A function can know the number of requested outputs via nargout.
But can it know if a specific output is replaced with tilde operator (~) ?
[x,~,z] = foo();
function [a,b,c] = foo()
% how to know that the second output is not needed?
end

26 Comments

Yeah, I've thought on a few occasions it would be nice to be able to know -- it's rare in my experience the case comes up, but once or twice have had a function for which one output could be expensive to compute and the other(s) could be done without that one (or at least all of it) that would save time if it weren't needed.
In general, though, I'd think the solution would be to either refactor or pass the info; unless one is writing self-modifying code the particular use case is fixed so it isn't as though it changes from call to call.
To continue on dpb's reasoning: In a "gentle-downhill-slope-home" use-case you might be able to sort your a, b and c such that the expensive and avoidable output is the third output - then you can check how many outargs there are with nargout. Then you can avoid the expensive calculation if you only have 2 outputs.
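The output-ordering idea above can be sketched as follows. The function summaryStats and its outputs are hypothetical, and eig only stands in for whatever the expensive computation is; the point is just that the costly output sits last and is guarded by nargout.

```matlab
% Request only the two cheap outputs, then all three.
[s, m]    = summaryStats(magic(4));   % nargout == 2: expensive part skipped
[~, ~, e] = summaryStats(magic(4));   % nargout == 3: expensive part runs

function [total, avg, eigvals] = summaryStats(A)
% summaryStats is a hypothetical example, not code from this thread.
% Cheap outputs come first; the costly one is last.
total = sum(A(:));
avg   = mean(A(:));
if nargout >= 3
    eigvals = eig(A);   % placeholder for an expensive computation
end
end
```

Note that the second call still has nargout == 3 even though two outputs are discarded with ~, so this only saves time when the trailing outputs are left off entirely.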
thanks for these helpful thoughts !
Technically you could use dbstack to get the line number and file of the caller, read that line, and use regular expressions to detect which output contains a tilde, if any. That would be a relatively simple function that shouldn't cost a lot.
Another alternative is to add another input to the function that specifies the number of outputs.
sounds good. but what if the function is called from the command line?
The dbstack output can be used to detect that the command line was the caller and in that case, instead of reading the file you can read your command line history.
Note that dbstack is a debugging tool not intended for fast code. In fact it will be quite slow, as the MATLAB documentation also clearly states "Avoid functions that query the state of MATLAB such as inputname, which, whos, exist(var), and dbstack. Run-time introspection is computationally expensive."
Not only that, defining a regular expression that can reliably detect valid MATLAB output code (e.g. variables, comma-separated lists) and ignore all comments, line continuation ellipses, etc, is not a trivial task. Essentially you need to replicate the MATLAB parser rules, which so far I have not seen anyone do successfully. Your regular expression would have to handle this valid code:
>> C = cell(1,3);
>> D = cell(1,2);
>> [C{:},~,D{:}] = ndgrid(1:9); % 4th output unused
And this valid code too:
>> S = struct('X',{1,2,3});
>> [S.X,...Not,An,Output
~,Y,Z] = ndgrid(1:9); % 4th output unused
I would not recommend this approach if you are at all interested in robust or efficient code.
If you want to keep the outputs in the same order then the most efficient solution by far is to add an input flag, e.g. a logical vector or some scalar/vector that tells your function which outputs to calculate. All you need is some basic if or switch or similar to handle it. This will be efficient and robust.
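A minimal sketch of the input-flag approach, assuming a hypothetical 1x3 logical vector named wanted (one element per output) and placeholder computations:

```matlab
% The caller states explicitly that the second output is not needed.
wanted = [true false true];
[x, ~, z] = fooFlagged(wanted);

function [a, b, c] = fooFlagged(wanted)
% fooFlagged and the "wanted" argument are hypothetical illustrations.
if nargin < 1
    wanted = true(1, 3);      % default: compute everything
end
a = []; b = []; c = [];       % every output is defined, even when skipped
if wanted(1), a = 1; end
if wanted(2), b = 2; end      % placeholder for the expensive computation
if wanted(3), c = 3; end
end
```

Skipped outputs are returned as empty arrays so the function never errors on an unassigned output, whatever the caller requests.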
I agree that there are cleaner alternatives and I suggested them in my initial comment above but I wanted to try out the dbstack approach, anyway, for no other reason than to see if the idea worked.
I quickly wrote a function detectOutputSuppression() that can be called from within a function (just like nargout) and returns a logical vector identifying which outputs were suppressed by the caller using a tilde operator. If the caller did not request any outputs, the vector will be empty.
The problem actually wasn't difficult when the caller function is within an m-file. The dbstack tells you which line the caller function is on and since tilde-suppression requires outputs to be in square brackets and for outputs to be separated by a comma or space(s), the regular expression isn't that difficult. If the square brackets aren't used, there were no suppressed outputs and an empty vector is returned.
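For illustration only, the text-parsing step could look roughly like the sketch below. This is not the File Exchange implementation: naiveTildeMask is a hypothetical helper that works on a line of code passed in as text (the dbstack and file-reading steps are left out), and it assumes comma-separated outputs in a single pair of square brackets.

```matlab
mask1 = naiveTildeMask('[x,~,z] = foo();', 'foo');
mask2 = naiveTildeMask('[G{:},~,H{:}] = foo();', 'foo');

function mask = naiveTildeMask(codeLine, funcName)
% Hypothetical helper: returns true where an output slot is a bare tilde.
% It counts text tokens, not runtime outputs, so G{:} counts as one slot.
% Splitting on commas also breaks for indexed outputs such as a(1,2).
pat = ['\[([^\]]*)\]\s*=\s*' regexptranslate('escape', funcName) '\>'];
tok = regexp(codeLine, pat, 'tokens', 'once');
if isempty(tok)
    mask = logical([]);       % no bracketed output list found
    return
end
parts = strtrim(strsplit(tok{1}, ','));
mask  = strcmp(parts, '~');
end
```

Both calls return [0 1 0], although the second line may request any number of outputs depending on the sizes of G and H at runtime.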
When the caller is from the command window, the problem is a bit more difficult. This function only looks at the most recent command from the command history and if it doesn't have the square-bracket output structure or if it calls a different function, it pretty much gives up and indicates that it could not find a match.
The documentation is clear that there are limitations when the caller is from the command window. Otherwise, I tested it in a few contexts and didn't have any problems. I'd be glad to hear if others break it, though.
BTW, execution time is about 0.04 sec.
"The problem actually wasn't difficult when the caller function is within an m-file."
Except that it fails to work for many valid syntaxes, e.g.:
  • as the second, third, etc. command on one line,
  • line continuation ellipses,
  • any comma-separated list expansion,
  • ... etc
I tried it with this simple function:
function varargout = testfun()
[idx,txt] = detectOutputSuppression()
[varargout{1:nargout}] = ndgrid(1:2);
end
and then tested it (in all examples there are six outputs, the fourth is unused)
It failed with this simple comma-separated list expansion using cell arrays:
G = cell(1,3);
H = cell(1,2);
[G{:},~,H{:}] = testfun(); % 4th output unused
prints the incorrect and misleading indices (where are the six outputs?)
idx =
0 1 0
It gets confused by ellipses:
S = struct('X',{1,2,3});
[S.X,...Not,An,Output] = testfun()
~,Y,Z] = testfun(); % 4th output unused
prints the incorrect and misleading indices (where are the six outputs?):
idx =
0 0 0 0
This trivial syntax:
1,[A,B,C,~,E,F] = testfun(); % 4th output unused
does not recognise any outputs at all!:
idx =
[]
For a very limited set of circumstances it might work (inefficiently and rather fragile), but the documentation as it is currently written only mentions that "Function outputs must be separated by a comma." and no other restrictions.
I haven't read your code with enough attention yet and I'm on mobile, but it looks like the command line detection will fail for section evaluation. (note that this behavior has changed a few releases ago)
KISS.
If this is a big problem in terms of time-consumption, then write the function with an optional argument, for example "whichargouts" that directs the function execution. If the user doesn't supply the time-saving directive, no time-saving is made (too bad), if directives are given then time-savings are made. QD? Yup, but should work until a neat and robust automatic solution is available.
@Bjorn: You mean questionable design by QD? Because I don't think I agree. You can see the same design in action with the regexp function if you use the OutKeys (match/split/etc).
@Adam: you should move your comment to the answer section. It may not be perfect, but it is an actual answer. If OP can live with its limits it solves the problem.
@Rik: No I meant my "solution" was "quick and dirty", a simplistic attempt at avoiding the time-cost of calculating unrequested outputs with the additional cost of handling (both for programmer and function-user) additional input arguments...
Thanks for the feedback, guys.
You're right that it will not work if the command that invoked the function is not the first command within a line of code. That style is often frowned upon but it does exist, and sometimes for good reason.
However, there isn't anything wrong in the first example you shared. The documentation is clear, "The [output] row vector will be the same length as the number of requested outputs". [G{:},~,H{:}]=... should return [0 1 0] no matter how many outputs are possible. Matlab's nargout() function behaves in the same way.
The second example you shared with the ellipses is a problem and I have a relatively easy way to fix that, I think. Thanks for pointing that out.
There are definitely some limitations but it does seem to function well with the most common syntaxes / styles. The ones you pointed out are relatively less common but important to consider. Thanks again for stress testing it.
@Rik
Not sure if the section-evaluation comment referred to my function or not. Are you referring to running commands directly from the editor (highlight + F9)? That shouldn't pose a problem. Before moving it as an answer I'm going to clarify the requirements in the documentation.
much much thanks Adam
looks really practical and helpful
solves (most) of my need
suggestion for added functionality: might be helpful to also return the names of the external output variables.
as for the issues raised by Stephen: it would be very helpful if the function could detect when these issues occur and return a warning flag (say when it detects that it is called twice on the same line in the caller mfile).
@Adam: I was referring to %% sections and running with control+enter.
And if G is a cell with multiple elements, nargout will see it as one output per element:
G=cell(1,3);H=cell(1,5);
[G{:},~,H{:}]=callme;
function varargout=callme
disp(nargout)
varargout=cell(1,nargout);
[varargout{:}]=ndgrid(1:2);
end
"However, there isn't anything wrong in the first example you shared. The documentation is clear, "The [output] row vector will be the same length as the number of requested outputs". [G{:},~,H{:}]=... should return [0 1 0] no matter how many outputs are possible. Matlab's nargout() function behaves in the same way."
Aaaah, so if nargout behaves in the same way, then lets print nargout too and see how many outputs are requested:
function varargout = testfun1()
idx = detectOutputSuppression() % print incorrect indices
out = nargout() % print actual number of requested outputs
[varargout{1:nargout}] = ndgrid(1:2);
end
and now call it in a simple script:
G = cell(1,3);
H = cell(1,2);
[G{:},~,H{:}] = testfun1(); % 4th output unused
so we can see how many outputs are requested. I predict six, you predict three. What does nargout display?:
idx =
0 1 0
out =
6
If requesting three outputs (as you state) returns the data from six outputs (as shown by testing it) from a function which can provide unlimited outputs, why six? Why does nargout not return some other random number unrelated to how many outputs were actually requested, e.g. twenty-three? Or forty-two? Nothing inside my original function limited or specified the number of outputs, only the calling syntax requested a specific number of outputs (six in fact).
Lets try another simple demonstration, where we define a simple function with exactly six unique outputs:
function [A,B,C,D,E,F] = testfun2()
idx = detectOutputSuppression()
out = nargout()
A = 'one';
B = 'two';
C = 'three';
D = 'four';
E = 'five';
F = 'oh no, does this really mean six requested outputs?';
end
Do you agree that there is no way for us to get the character vectors 'four', 'five', and whatever that last one is, if only three outputs (as you state) are requested? So to get the char vector 'five' at least five outputs have to be requested, no? Lets try it: I predict six, you predict three:
G = cell(1,3);
H = cell(1,2);
[G{:},~,H{:}] = testfun2(); % 4th output unused
What does nargout display?:
idx =
0 1 0
out =
6
checking the returned values:
>> G{:}
ans =
one
ans =
two
ans =
three
>> H{:}
ans =
five
ans =
oh no, does this really mean six requested outputs?
>>
Yes, it does! And the fourth output is ignored, just as I wrote in my previous comment.
Or if using two cell arrays and one tilde is too confusing, lets simplify it down to one cell array (which following your logic would be one output, am I right?), which of course we can simply define using the standard comma-separated list syntax to request six outputs from the function:
Z = cell(1,6);
[Z{:}] = testfun2();
which prints this (your function incorrectly shows only one output, even though six were requested, as nargout shows):
idx =
0
out =
6
And checking how many outputs were actually returned by the function:
>> Z{:}
ans =
one
ans =
two
ans =
three
ans =
four
ans =
five
ans =
oh no, does this really mean six requested outputs?
Just for fun: I can think of a simple scenario where your function will (incorrectly) state that three outputs are requested, whereas in fact only two outputs are requested... and just to add to the fun, lets redefine the function again so that it will throw an error if more than two arguments are requested:
function [A,B] = testfun3() % Only two outputs! Three or more throws an error!
idx = detectOutputSuppression()
out = nargout()
A = 'one';
B = 'two';
end
now lets call it in a script:
X = cell(1,1);
Y = cell(1,1);
Z = cell(1,0);
[X{:},Y{:},Z{:}] = testfun3();
and which prints (no error here, so how does your function show three outputs?):
idx =
0 0 0
out =
2
and checking the values of the two requested outputs:
>> X{:}
ans =
one
>> Y{:}
ans =
two
>> Z{:}
>>
Note how your function clearly shows that testfun3 was called with three output arguments, yet testfun3 actually throws an error when it is called with three outputs! Just for sanity, we might as well confirm that calling it with two hard-coded outputs is okay, but three will throw an error:
>> [a,b] = testfun3()
idx =
0 0
out =
2
a =
one
b =
two
>> [a,b,c] = testfun3()
Error using testfun
Too many output arguments.
As I wrote earlier, in fact your function does not handle comma-separated lists (probably never will), and this limitation is not mentioned in its documentation. I did not even show any examples using structures... maybe tomorrow.
I suggest the function will return a warning flag if it encounters {:}, or a struct.field notation.
My readfile function fails sometimes. Those failures don't always return errors (some do, but not all). I did my best to document the expected behavior for every conceivable combination of file, release, runtime (Matlab or GNU Octave), and operating system. I'm fairly certain I'm not stating a claim in my doc that is false.
My point is: limitations are fine, as long as you are aware of them and properly document them. Either state which specific syntaxes you support, or don't make broad sweeping claims without equally broad testing. And as long as you documented you don't support comma separated lists, you don't even need a warning flag.
Reading through these many comments, I'd suggest just sorting the outputs according to difficulty of computation. The easy ones you will always compute, along with the outputs you know the user will always want to see. Then use nargout to decide what to skip. This is by far the easiest solution. I do use it in some of my codes.
An option I have sometimes used is to pass in a verbosity argument. This usually indicates how much crap I am willing to dump to the command window. 0 indicates nothing, 1 indicates what I might want to see when I wonder if there is a problem. 2 indicates I am debugging code, and I want to see all sorts of stuff, and I am willing to tolerate much command window output. Of course that does not apply here, but it is a useful idea in general.
For this problem, I would suggest input flags that tell which outputs will be used. While that is not perfect, it will work. You could use property/value pairs to indicate which outputs are of interest. A bit of a kludge, but it will work.
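One way the property/value idea might look, using inputParser. The function fooNamed and the option names 'ComputeA', 'ComputeB' and 'ComputeC' are made up for this sketch; any naming scheme would do.

```matlab
% Ask the function to skip the second output.
[a, ~, c] = fooNamed('ComputeB', false);

function [a, b, c] = fooNamed(varargin)
% fooNamed and its option names are hypothetical illustrations.
isFlag = @(v) islogical(v) && isscalar(v);
p = inputParser;
addParameter(p, 'ComputeA', true, isFlag);
addParameter(p, 'ComputeB', true, isFlag);
addParameter(p, 'ComputeC', true, isFlag);
parse(p, varargin{:});
opt = p.Results;

a = []; b = []; c = [];            % skipped outputs come back empty
if opt.ComputeA, a = 1; end
if opt.ComputeB, b = 2; end        % placeholder for the expensive part
if opt.ComputeC, c = 3; end
end
```

All options default to true, so existing callers that pass nothing still get every output.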
I saw a request to know the names of the outputs - this might get confusing.
[fitpoly.P,fitpoly.S,fitpoly.mu] = polyfit(rand(10,1),rand(10,1),3);
What are the names of the output variables? Or, since multiple outputs can all be stuffed into the same variable:
M = zeros(1,2);
[M(1),M(2)] = min(magic(3),[],'all','linear');
M
M =
1 4
Here is an old trick for disposing of the earlier outputs when only one output was wanted:
[ind,ind] = min(magic(3));
ind
ind =
2 1 3
That was often used in the old days before ~ became available to discard outputs into the bit bucket. But there is a lot of legacy code around that may still use it.
An example of code where it might be difficult to resolve the variable name is one that uses eval.
eval(['X',num2str(randi(1000,1,1)),' = min(magic(3));'])
A variable named X960 was created. In this case, the function min would arguably be able to know the name of the output variable, but I'd bet if we tried to be creative, we could find a counter-example.
In my eyes, I don't think a function should need to worry about where it puts its results. That is the duty of the caller to resolve.
Let us consider John D'Errico's example here, where, just like Adam Danz's submission, we only look at this one line:
[fitpoly.P,fitpoly.S,fitpoly.mu] = polyfit(rand(10,1),rand(10,1),3);
Question: how many outputs are requested?
Most people will incorrectly answer "three".
Adam Danz's submission will also return an index with three elements.
But in fact we don't know how many outputs are requested, because it depends on the prior existence and size of that structure, which, if just like Adam Danz's submission we only view that one line of code, we do not have enough information on to decide (and which in any case is only resolved at runtime, not during static code analysis). And that is the flaw of Adam Danz's submission: it assumes that viewing just one line of code (without any information of prior operations) is enough to know how many outputs are requested. In fact this does not work in a general case (e.g. comma-separated lists, eval, etc). Lets consider John D'Errico's example with a small change, we reduce the outputs down to this:
[S.X] = polyfit(rand(10,1),rand(10,1),3);
How many requested outputs are there? One? Three? Twenty-one? One million?
Answer: any of these, we simply don't know. Any static code analysis which only looks at this one line of code cannot tell us (but that is exactly what Adam Danz's function erroneously does, returning incorrect numbers of requested outputs without any warning, error, flag, or indication that the output may be incorrect).
If I then told you that five hundred lines prior this structure was defined:
S = struct('X',{1,2,3});
then a reasonable guess would be three requested outputs. But until runtime resolves this structure, we cannot be sure.
In exactly the same way, even John D'Errico's example is not actually "simple", cannot be resolved until runtime, and certainly cannot be reliably determined by simply looking at one line of code. In essence, counting requested outputs which use either {} or dot indexing within square brackets cannot be achieved using a regular expression.
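A small sketch of that runtime dependence: the same left-hand-side text requests a different number of outputs depending on the size of the structure array. reportNargout is a hypothetical helper that simply returns the nargout it sees in every output.

```matlab
S = struct('X', {1, 2, 3});
[S.X] = reportNargout();     % identical text on the left-hand side ...
n3 = S(1).X;

T = struct('X', {1, 2, 3, 4, 5});
[T.X] = reportNargout();     % ... but a different number of outputs
n5 = T(1).X;

function varargout = reportNargout()
% Hypothetical helper: every returned output holds the value of nargout.
varargout = repmat({nargout}, 1, max(nargout, 1));
end
```

Here n3 is 3 and n5 is 5, which no inspection of the single line [S.X] = ... could have predicted.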
"I suggest the function will return a warning flag if it encounters {:}, or a struct.field notation. "
Yes. Using MATLAB's warning would be useful. Note that the comma-separated list syntax can also include indices.
"My point is: limitations are fine, as long as you are aware of them and properly document them."
Yes.
"And as long as you documented you don't support comma separated lists, you don't even need a warning flag."
Hmmm... I think comma-separated lists are common enough that they need to be considered. Code written based on the principle "I cannot imagine doing X, nor do I do X in my own work, therefore no one else does X either" is not a good recipe for writing robust code, especially when said code is being heavily promoted as a general tool for general situations (but in fact isn't).
@Rik
Now I understand what you and Stephen are referring to regarding the outputs with cell indexing. That's definitely a problem with the quick-and-simple over-my-lunch-break function. Thanks for that demo, and the demos provided by Stephen Cobeldick.
Thanks for the suggestions. As I mentioned in my first comment and as others have mentioned, a better approach would be to use an input that specifies which outputs to compute. I'd also like to promote John D'Errico's suggestion of prioritizing the order of outputs.
thanks so much everyone for the thorough and careful assessment
I actually find Adam's detectOutputSuppression() quite helpful even if it doesn't resolve all cases. the function should really just let you know whether or not it was able to determine the outputs (namely when there was no use of dot or {} in the output line; and only one call to the function on the caller line).
Another more general approach is perhaps for detectOutputSuppression() to evaluate in the caller any curly brace or dot indexing of the output variables to see how many outputs they produce.
Thanks royk. I've carefully considered the feedback and will update the function soon.


 Accepted Answer

"Can a function know whether/which specific output is ignored?"
No. nargout tells you the number of output arguments specified in the caller, but that count includes outputs suppressed with a tilde (~). nargout(fun) tells you the number of outputs a specific function has to offer but that tells you nothing about the caller.
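A quick way to see that tilde outputs are counted, using a hypothetical helper countRequested that returns the value of nargout in each output:

```matlab
[a, ~, c] = countRequested();   % three outputs requested, one discarded
[p, q]    = countRequested();   % two outputs requested

function varargout = countRequested()
% Hypothetical helper: each output is the nargout seen inside the function.
varargout = repmat({nargout}, 1, max(nargout, 1));
end
```

The first call gives a equal to 3, not 2: from inside the function, a discarded output is indistinguishable from a used one.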
What are the alternatives?
1. Pass an input to the function that specifies which outputs you're requesting. The tricky part is to remember to change that input value if a different set of outputs is requested.
2. Prioritize your outputs and use nargout to determine which outputs to produce.
You can also try the new and improved (version 2.0.0) detectOutputSuppression() from the File Exchange. It reads the caller line and parses the outputs to determine if any are suppressed.
Example
The main() function calls myFunc() and suppresses the 2nd and 3rd outputs.
function main()
[mst(1), ~, ~, data] = myFunc();
function [a, b, c, d] = myFunc()
a = 1; b = 2; c = 3; d = 4;
isTilde = detectOutputSuppression(nargout);
Result: isTilde = [0 1 1 0].
Supported syntaxes
  • [a,~,~,d] = myFunc();
  • a = myFunc();
  • [a(1),a(2),~] = myFunc();
  • [~,~,c(1:20),~] = myFunc();
  • [~,b{3}] = myFunc();
  • [~,b{3,1,5}] = myFunc();
  • [~,~,~,g{1}{2}{4}(1)] = myFunc();
  • q=cell(1); [~,q{:}] = myFunc(); % because q is 1x1.
  • T=table(); [T.a,~] = myFunc();
  • [j,u,n,k]; [~,~,~,~] = myFunc(); [j,u,n,k]
  • [j,u,n,k], [~,~,~,~] = myFunc(), [j,u,n,k]
  • S=struct; [S.x,~,S.t(4)] = myFunc();
  • [v.a, v.b{1}, v.c(2,2), ~] = myFunc();
  • myFunc + 1;
  • y = [1, myFunc(), 1];
  • assert(~isempty(myFunc()))
  • [a,b,~,~] = myFunc(...
    inputs); % split across multiple lines after the function name
  • and more....
Syntaxes not supported
An error message is thrown when these syntaxes are detected.
  • Outputs not separated by commas: [a b c] = myFunc()
  • When myFunc() is not called from within an m-file or wasn't the most recent command called from the command window.
  • When myFunc() is called in debug mode but isn't the line currently paused or executed by MATLAB.
  • Output assignment to comma separated lists: x=cell(1,5); [x{:}] = myFunc();
  • Caller line is split into multiple lines before the inputs (a split after the function name is OK).
  • When myFunc() is wrapped in an anonymous function: f = @myFunc; f()
  • When the parser matches more than one [...]=myFunc per line of code.

2 Comments

"Supported syntaxes"
are not robust if they contain any comma-separated list (not just ones from a cell array, as Adam Danz wrote). In fact this also includes structures and strings.
This means that some syntaxes in the list are not possible to robustly detect, for example:
[v.a, v.b{1}, v.c(2,2), ~] = myFunc();
Some other syntaxes are also not robust because of MATLAB's weak typing:
table = @() struct('a',{1,2,3}); % could be any size.
...
T = table() % oops, not an instance of the table class.
T = 1×3 struct array with fields:
a
[T.a,~] = myFunc() % how many outputs?
On top of that we have the possibility of EVAL/ASSIGNIN/etc replacing or clearing any variable... which all just proves, that the number/ignored output arguments can only be determined at runtime.


More Answers (2)

I'll merge parts of two of my comments to provide a KISS-type suggestion:
In a "gentle-downhill-slope-home" use-case you might be able to sort your a, b and c such that the expensive and avoidable output is the third output - then you can check how many outargs there are with nargout. That way you can avoid the expensive calculation if you only have 2 outputs, this generalizes well in some cases.
If this is a big problem in terms of time-consumption, then re-write the function with an optional argument, call it something like whichargouts, that directs the function execution. You could make it a boolean array with 1 for requested outputs. If the user doesn't supply the time-saving directive, no time-saving is made (too bad), if directives are given then time-savings are made. This is a QD-solution to a programming-wise tricky problem, but should work until a neat and robust automatic solution is available.
HTH
You can do it with this FEX download,
but there are some caveats to its use, mentioned in the documentation.
[x,~,z,~] = foo()
function [a,b,c,d] = foo()
[a,b,c,d]=deal([]);
unused=find(outputnames=="~")';
if ~isempty(unused)
disp("These outputs are not used:"+mat2str(unused));
end
end
These outputs are not used:[2;4]
x =
[]
z =
[]

Release

R2020a

Asked:

on 17 Aug 2020

Answered:

on 1 Dec 2022
