MATLAB Answers

Count number of values within a range in tall array per column and assign count to new matrix

Juan Estrella-Martínez on 3 Jul 2020
Commented: Juan Estrella-Martínez on 6 Jul 2020 at 15:47
Hi everyone. A month ago I had this same question but for regular arrays. Now I am trying to extend my reasoning to tall arrays. Context:
What I would like to do is:
1. Count the number of elements that fall within a certain range on a per-column basis (say, elements between 0 and 0.5)
2. Assign that count to an element of a different matrix
3. Repeat steps 1 and 2 using a new range (say, elements between 0.5 and 1.0)
4. Repeat steps 1 through 3 for all columns
Example:
Count values between 0 and 0.5, and 0.5 and 1.0 in matrix A and assign the results to matrix B.
The answer I received at the time was:
A= [
0.83 0.02
NaN 0.69
0.7 0.72
0.3 0.32
NaN 0.8
0.02 0.04
0.56 NaN
0.78 NaN
0.01 0.03
0.67 NaN]
[ra, ca] = size(A)
ranges = [0, 0.5;
0.5, 1]
[rr, cr] = size(ranges);
B = zeros(rr, ca)
for row = 1 : rr
B(row, :) = sum(A > ranges(row, 1) & A <= ranges(row, 2), 1)
end
But what if matrix A is now a tall array? Here is what I've done that is not working:
%Previous few lines have code that creates an array on disk using matfile function
fds=fileDatastore('filename.mat','ReadFcn',@load,'FileExtensions','.mat');
A=tall(fds);
ranges_edge1=linspace(0,0.999,1000)';%Generate first edge of ranges
ranges_edge2=linspace(0.001,1,1000)';%Generate second edge of ranges
ranges = [ranges_edge1,ranges_edge2];%Generate matrix with edges
[row_ranges, cr] = size(ranges);%This step can probably be skipped because I know the size a priori
n=ColN;%Number of columns for my new matrix. It is the same number of columns in A.
B = zeros(row_ranges, n);%Generate empty matrix in which to store results
for row = 1 : row_ranges
B(row, :) = sum(A > ranges(row, 1) & A <= ranges(row, 2), 1);%Count results that fall between the edges and store in matrix B
end
%Next few lines manipulate B
At this point I get the error
The following error occurred converting from tall to double: Conversion to double from tall is not possible.
Error in %filename.m (%linenumber)
B(row, :) = sum(out > ranges(row, 1) & out <= ranges(row, 2), 1);%Count results that fall between the edges and store in matrix B
What can I do to continue? I hope someone can help. Cheers!
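The "Conversion to double from tall" message appears because each `sum(...)` on a tall array is itself an unevaluated tall result, and it cannot be assigned directly into the double matrix B. A minimal sketch of one way around this is to collect the tall results and evaluate them with `gather` before storing them. The names `tA` and `counts` are illustrative, and the in-memory `tall(A)` is only a stand-in for the on-disk data; the sketch also assumes the tall array is numeric, which may not hold for a tall array built from a fileDatastore with `@load`.

```matlab
% Small in-memory stand-in for the on-disk data (assumption: numeric matrix).
A = [0.83 0.02; NaN 0.69; 0.7 0.72; 0.3 0.32; NaN 0.8; ...
     0.02 0.04; 0.56 NaN; 0.78 NaN; 0.01 0.03; 0.67 NaN];
tA = tall(A);                        % tall array backed by in-memory data

ranges  = [0, 0.5; 0.5, 1];
nRanges = size(ranges, 1);
counts  = cell(nRanges, 1);          % holds the unevaluated tall results

for row = 1:nRanges
    % Each result is a 1-by-nCols tall array; nothing is computed yet.
    counts{row} = sum(tA > ranges(row, 1) & tA <= ranges(row, 2), 1);
end

% Evaluate all deferred results in one gather call, then stack them
% into an ordinary double matrix.
[counts{:}] = gather(counts{:});
B = vertcat(counts{:});
```

Gathering all rows in a single call, rather than calling `gather` inside the loop, lets MATLAB combine the passes over the data where it can.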


Answers (1)

KSSV on 3 Jul 2020
You can achieve this with a histogram. See the documentation for histcounts.
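As a rough illustration of this suggestion for a regular (in-memory) array, the sketch below applies `histcounts` one column at a time, using the example matrix from the question. The variable names `edges` and `nBins` are illustrative. Note one difference from the loop in the question: `histcounts` bins include their left edge and exclude their right edge (except the last bin, which includes both), whereas the original comparison used `>` and `<=`. The two agree here only because no value lies exactly on an edge.

```matlab
% Example matrix from the question; NaN entries are not counted in any bin.
A = [0.83 0.02; NaN 0.69; 0.7 0.72; 0.3 0.32; NaN 0.8; ...
     0.02 0.04; 0.56 NaN; 0.78 NaN; 0.01 0.03; 0.67 NaN];

edges = [0 0.5 1];                   % two bins: [0, 0.5) and [0.5, 1]
nBins = numel(edges) - 1;            % histcounts returns one count per bin
B = zeros(nBins, size(A, 2));

for col = 1:size(A, 2)
    % histcounts returns a row vector, so transpose it into the column of B.
    B(:, col) = histcounts(A(:, col), edges).';
end

disp(B)
```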


Juan Estrella-Martínez on 6 Jul 2020 at 13:45
Not working yet. I played around with histcounts and got it to work by column in a normal array, but not in a tall array. Here's what I've done:
%Previous few lines have code that creates an array on disk using matfile function
fds=fileDatastore('filename.mat','ReadFcn',@load,'FileExtensions','.mat');
A=tall(fds);
edges=linspace(0,1,1000)';
[height,~]=size(edges);
B=zeros(height-1,427);%histcounts returns one count per bin, i.e. numel(edges)-1 values
for a=1:427
N=histcounts(A(:,a),edges);
N=gather(N);
N=N';%Transpose might work in the line above, eliminating the need of this line.
B(:,a)=N;
end
%Next few lines manipulate B
Which results in the error
Evaluating tall expression using the Local MATLAB Session:
Error using tall/histcounts (line 24)
Argument 1 to HISTCOUNTS must be one of the following data types: numeric logical categorical.
Learn more about errors encountered during GATHER.
Error in %filename.m (%linenumber)
N=histcounts(A(:,a),edges);
Error in tall/gather (line 50)
[varargout{:}] = iGather(varargin{:});
Error in Tallninetyfive_ConfInt_RichardsCurve (line 44)
N=gather(N);
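The message above says the first argument to `histcounts` is not numeric, logical, or categorical. A likely cause (an assumption, not confirmed in this thread) is that a tall array built from a fileDatastore with `@load` holds the structs returned by `load` inside a tall cell array, rather than the numeric matrix itself. The sketch below shows one way to check the underlying type and to have the datastore return the matrix directly. The variable name `X`, the helper `readMatrix`, and the temporary file are all hypothetical, and the `'UniformRead'` option is assumed to be available in the installed release.

```matlab
% Write a small numeric matrix to a temporary MAT-file (stand-in for the real data).
X = rand(20, 3);
matFile = [tempname '.mat'];
save(matFile, 'X');

% Hypothetical reader: return the matrix itself, not the struct that load returns.
readMatrix = @(f) getfield(load(f, 'X'), 'X');

% UniformRead tells the datastore that each read can be vertically concatenated,
% so the tall array takes the type of the reader's output instead of cell.
fds = fileDatastore(matFile, 'ReadFcn', readMatrix, 'UniformRead', true);
tA  = tall(fds);

% Check what the tall array actually contains before calling histcounts on it.
underlying = gather(classUnderlying(tA));
disp(underlying)

% With a numeric tall array, per-column histcounts followed by gather works.
N = gather(histcounts(tA(:, 1), linspace(0, 1, 11)));
```

This reader still loads the whole variable from each file on every read, so it does not by itself help when a single variable is too large for memory; in that case the data would need to be split across several files or read in pieces.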
Juan Estrella-Martínez on 6 Jul 2020 at 15:47
I ended up using a workaround that avoids tall arrays altogether. The first section of my code generates a very large matrix (saved as a single variable in a .mat file) directly on disk. I thought I could easily work with this matrix using a fileDatastore and a tall array but, for now, that does not seem to be the case.
I decided to lower the resolution of my very large matrix so I could load one column into memory, work with it and load the next column until done.
I did change my code to use histcounts, which makes it tidier. Thanks!
I guess the original question is still unanswered.
