You are now following this question
- You will see updates in your followed content feed.
- You may receive emails, depending on your communication preferences.
Groupby Date and Categorical Column on Tables
6 views (last 30 days)
Show older comments
Suppose that I have 5 columns in my table as Stock, Date, Open, High, Low, Close. Stock is a string column and Date is in Date format. Suppose that same stock can repeat on same day.
How can I count of unique stocks on each Day? That would be table with number of unique stock counts against each date.
Given that each stock can have mutiple observation on same day how can I group by both Date and Stock to get average Open price for each stock on each day.
23 Comments
Furqan Hashim
on 28 Aug 2020
Iam new to MATLAB I've used the function that you mentioned. I got an error
Undefined function or variable 'groupsummary'.
I from a Python background so what I can guess I need some module or library to work with the function you mentioned.
Apart from that I tried the following piece of code
mean = varfun(@mean, data,'GroupingVariables',{'Stock', 'Date'},...
'InputVariables','Open');
I got an error
Error using tabular/varfun (line 206)
Not enough input arguments.
Adam Danz
on 28 Aug 2020
Edited: Adam Danz
on 28 Aug 2020
I filled in your release info.
When I test your solution, it works well!
Perhaps you could show us a sample of your input using
head(data)
Avoid using mean as a variable name since it's a very commonly used function.
data = table(['A';'B';'A';'A';'B'], datetime('now')-days([1;2;1;2;2]), rand(5,1)*100, 'VariableNames', {'Stock','Date','Open'})
% data =
% 5×3 table
% Stock Date Open
% _____ ____________________ ______
% A 27-Aug-2020 16:27:51 81.472
% B 26-Aug-2020 16:27:51 90.579
% A 27-Aug-2020 16:27:51 12.699
% A 26-Aug-2020 16:27:51 91.338
% B 26-Aug-2020 16:27:51 63.236
mu = varfun(@mean, data,'GroupingVariables',{'Stock', 'Date'},'InputVariables','Open')
% mu =
% 3×4 table
% Stock Date GroupCount mean_Open
% _____ ____________________ __________ _________
% A 26-Aug-2020 16:27:51 1 91.338
% A 27-Aug-2020 16:27:51 2 47.086
% B 26-Aug-2020 16:27:51 2 76.908
I've also verified that the r2017b documentation contains the same inputs and name-value options.
https://www.mathworks.com/help/releases/R2017b/matlab/ref/varfun.html
Furqan Hashim
on 28 Aug 2020
Edited: Furqan Hashim
on 28 Aug 2020
I did exactly what you did. I am still facing the same error. see attached screenshot.
Furqan Hashim
on 29 Aug 2020
See the results below that you asked
>> which mean -all
C:\Program Files\MATLAB\R2017b\toolbox\matlab\datafun\mean.m
C:\Program Files\MATLAB\R2017b\toolbox\matlab\timefun\@duration\mean.m % duration method
C:\Program Files\MATLAB\R2017b\toolbox\matlab\timefun\@datetime\mean.m % datetime method
C:\Program Files\MATLAB\R2017b\toolbox\matlab\bigdata\@tall\mean.m % tall method
C:\Program Files\MATLAB\R2017b\toolbox\distcomp\parallel\@codistributed\mean.m % codistributed method
C:\Program Files\MATLAB\R2017b\toolbox\distcomp\gpu\@gpuArray\mean.m % gpuArray method
C:\Program Files\MATLAB\R2017b\toolbox\finance\ftseries\@fints\mean.m % fints method
C:\Program Files\MATLAB\R2017b\toolbox\mbc\mbctools\@sweepset\mean.m % sweepset method
C:\Program Files\MATLAB\R2017b\toolbox\stats\stats\@ProbDistUnivParam\mean.m % ProbDistUnivParam method
C:\Program Files\MATLAB\R2017b\toolbox\matlab\timeseries\@timeseries\mean.m % timeseries method
>> which('mean','in','varfun')
C:\Program Files\MATLAB\R2017b\toolbox\matlab\datafun\mean.m
Furqan Hashim
on 29 Aug 2020
I am able to group by with 1 variable
mu = varfun(@mean, data,'GroupingVariables','Stock','InputVariables','Open')
mu =
2×3 table
Stock GroupCount mean_Open
_____ __________ _________
A 3 60.182
B 2 45.716
Adam Danz
on 29 Aug 2020
Edited: Adam Danz
on 29 Aug 2020
Thanks for the providing the mean function info. It doesn't appear to be the problem.
I just tested the demo in r2016b and r2017b and it worked perfectly. The 2017b documentation clearly shows support for the cell array of GroupingVariables.
Furqan Hashim
on 29 Aug 2020
I am testing code on sample data that you generated using
data = table(['A';'B';'A';'A';'B'], datetime('now')-days([1;2;1;2;2]), rand(5,1)*100, 'VariableNames', {'Stock','Date','Open'})
Still getting the error.
Furqan Hashim
on 29 Aug 2020
Ok, I'll do that but 1 thing that amazes me is that when I open up varfun using
open varfun
A function named varfun is opened that contains only 183 lines of code.
But when I do click on line 206 in mentioned error it gives stops me from making any changes as the file is read-only
Furqan Hashim
on 29 Aug 2020
Edited: Furqan Hashim
on 29 Aug 2020
See the attached ouptut after placing break point on line 206.
The error pops up from 17b as the function that opens after clicking on line 206 of error is the function with 509 lines of code.
Furqan Hashim
on 29 Aug 2020
Edited: Furqan Hashim
on 29 Aug 2020
Up unitl break point everything is fine when I run next line I get an error
Error using tabular/varfun (line 206)
Not enough input arguments.
Adam Danz
on 29 Aug 2020
So, when you place the break point on line 206 within the function that contains 509 lines of code, after running the varfun line with our demo table, what are the values of groupvars input?
I got a little lost from the previous two comments. Was there supposed to be an attachment? What's the other varfun file that you were looking at -- is that a custom function?
Furqan Hashim
on 29 Aug 2020
See attached lines below which tells the ouput of group vars
>> mu = varfun(@mean, data,'GroupingVariables',{'Stock', 'Date'},'InputVariables','Open')
206 [group,grpNames,grpRowLoc] = a.table2gidx(groupVars); % leave out categories not present in data
K>> groupVars
groupVars =
1 2
Adam Danz
on 29 Aug 2020
Edited: Adam Danz
on 29 Aug 2020
I'm baffled. That's the output you should be getting and you didn't get the error that you get when you're not running debug mode. It makes no sense to me. Why would you get an error when you run the function normally but not get the error in debug mode? It might have something to do with the other varfun file you found which I still have questions about - is that a Matlab file? Nevertheless, when you call varfun it must be using the correct file since its breaks on line 206 within the 509-lined file, right?
Furqan Hashim
on 29 Aug 2020
Yes it is using the correct file the one with 509 lines of code.
I think groupVars is the input being given in line 206 so when groupVars is passed to a.table2gidx it would give error
Adam Danz
on 29 Aug 2020
Edited: Adam Danz
on 29 Aug 2020
Right, but that function only has 1 input and it's the correct input but the error message you shared was "Not enough input arguments", which doesn't occur on my end.
From this point I think your options are
- Install the updates to your current release and try again.
- Use a different machine / installation of matlab / or maybe even a newer release. If the updates didn't fix the problem, something seems to be wrong locally since none of these issues occur on my vs of r2017b update 9.
- Make a tech support request and point to this thread within the request so they can see what troubleshooting you've already done.
I would be interested in hearing the final solution and cause of the problem if you get to the bottom of this. Note that the error you reported in the answer below also doesn't match the error I got on the demo data in r2017b (update 9). So something's fishy.
Answers (1)
Mohammad Sami
on 29 Aug 2020
Perhaps you can try the alternative below.
if true
[G,tabout] = findgroups(data(:,{'Stock','Date'}));
tabout.avgOpen = splitapply(@mean,data.Open,G);
end
4 Comments
Furqan Hashim
on 29 Aug 2020
The above answer leads to this error.
Error using left (line 13)
if not cellstr, string can only be a single row or empty!
Error in istable (line 7)
if strcmp(left(name,1),'`') && strcmp(right(name,1),'`')
Error in findgroups (line 129)
if istable(tOutTemplate{i})
Adam Danz
on 29 Aug 2020
Edited: Adam Danz
on 29 Aug 2020
I'm have some doubts that you're using r2017b. That is not the error message you should be getting in 17b with Mohammad's code if you're running it on the demo table "data" that we've been using.
What's the first line of the output from the command below?
ver
For example, I'm testing this on "MATLAB Version: 9.3.0.948333 (R2017b) Update 9" and the error message does not match the error message you shared.
Adam Danz
on 29 Aug 2020
Edited: Adam Danz
on 29 Aug 2020
The only difference between your and my versions is the update #9. You could easily update your version for free by pressing the little bell icon in the upper right of the main window. I can't promise that will affect this problem but it's a good idea to get the updates anyway.
See Also
Categories
Find more on Startup and Shutdown in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!An Error Occurred
Unable to complete the action because of changes made to the page. Reload the page to see its updated state.
Select a Web Site
Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .
You can also select a web site from the following list
How to Get Best Site Performance
Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.
Americas
- América Latina (Español)
- Canada (English)
- United States (English)
Europe
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom(English)
Asia Pacific
- Australia (English)
- India (English)
- New Zealand (English)
- 中国
- 日本Japanese (日本語)
- 한국Korean (한국어)