How to retrieve the indices of the values in each bin?

I have a histogram of the values in a vector A, spread over 30 bins. How can I get, for each bin, the indices of the values in A that fall into that bin?
Example:
A = [ 4 6 8 2 5 3 3]
bin number 4 contains [4 3 3]
so I want to have a vector B containing the indices, B = [1 6 7]

Answers (1)

Walter Roberson on 10 Oct 2021
The code could be slightly simpler if all of the bins were only one value wide.
A = randi(60, 1, 50)
A = 1×50
36 10 40 43 35 52 52 43 21 44 23 47 8 31 46 19 59 56 47 52 46 47 58 59 25 48 31 59 53 13
edges = [1:2:60, inf]
edges = 1×31
1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59
[counts, ~, bins] = histcounts(A, edges)
counts = 1×30
1 1 0 2 1 1 1 1 0 2 2 1 3 1 1 3 0 2 0 1 1 3 2 4 1 4 2 2 4 3
bins = 1×50
18 5 20 22 18 26 26 22 11 22 12 24 4 16 23 10 30 28 24 26 23 24 29 30 13 24 16 30 27 7
B = accumarray(bins(:), reshape(1:numel(bins), [], 1), [], @(V){V.'})
B = 30×1 cell array
{[ 48]} {[ 37]} {0×0 double} {[ 13 45]} {[ 2]} {[ 33]} {[ 30]} {[ 32]} {0×0 double} {[ 16 46]} {[ 9 50]} {[ 11]} {[25 31 43]} {[ 42]} {[ 39]} {[14 27 35]}
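The same approach can be tried on the small vector from the question. This is only a sketch: the edges below are hypothetical, chosen so that the values 4, 3 and 3 share one bin, so the bin numbering will not match the asker's 30-bin histogram. The sort() call is an addition of this sketch; accumarray is documented as not always preserving the order of the values it passes to the function when the subscripts are unsorted, so sorting makes the index order deterministic.

```matlab
A = [4 6 8 2 5 3 3];
edges = [2 3 5 7 9];                  % hypothetical edges: [2,3) [3,5) [5,7) [7,9]
[~, ~, bins] = histcounts(A, edges);  % bins(k) is the bin number of A(k)
idx = (1:numel(A)).';                 % positions in A, as a column vector
% Collect the positions that share a bin number into one cell per bin.
B = accumarray(bins(:), idx, [], @(V) {sort(V).'});
B{2}                                  % indices of the values in [3,5): 1 6 7
```

Here B{2} holds the indices of the values 4, 3 and 3, which is the vector the question asks for.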
  3 Comments
Walter Roberson on 28 Oct 2021
If accumarray() is giving you that error, then it implies that some value in your matrix A is less than the first value in your edges vector or greater than the last one. For example,
A = randi([-2 60], 1, 50)
A = 1×50
51 59 42 31 0 39 6 59 48 56 43 30 -2 8 25 45 48 2 59 0 49 57 0 17 9 16 11 56 46 50
edges = [1:2:60, inf]
edges = 1×31
1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59
[counts, ~, bins] = histcounts(A, edges)
counts = 1×30
2 0 2 1 2 1 0 3 1 0 1 1 1 0 2 1 0 1 0 2 3 1 4 3 2 3 1 3 1 3
bins = 1×50
26 30 21 16 0 20 3 30 24 28 22 15 0 4 13 23 24 1 30 0 25 29 0 9 5 8 6 28 23 25
B = accumarray(bins(:), reshape(1:numel(bins), [], 1), [], @(V){V.'})
Error using accumarray
First input SUBS must contain positive integer subscripts.
Notice that the 5th bin number is 0. It corresponds to the input value 0 in A, which lies below the first value in edges. (The error is not caused by A having negative or zero entries as such: it arises strictly because A has entries that are outside the range covered by the edges list.)
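A quick way to confirm this diagnosis is to look for zeros in the bin numbers before calling accumarray. The sketch below uses small made-up data rather than the random vector above; histcounts also reports bin 0 for NaN values, so the same check catches those.

```matlab
A = [5 -1 7 0 12];                    % made-up data: -1 and 0 lie below the first edge
edges = [1:2:11, inf];                % bins [1,3) [3,5) ... [9,11) [11,inf]
[~, ~, bins] = histcounts(A, edges);
outside = find(bins == 0);            % positions in A that fell into no bin
if ~isempty(outside)
    fprintf('%d value(s) outside the edges, at indices %s\n', ...
        numel(outside), mat2str(outside));
end
```

If outside is non-empty, those positions have to be dropped (or the edges widened) before the bin numbers can be used as subscripts.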
Walter Roberson on 28 Oct 2021
Note that in the following code, any value in A that is outside the range of the edges will not have its index appear anywhere in B.
A = randi([-2 60], 1, 50)
A = 1×50
58 21 4 42 56 -2 4 20 34 29 40 26 58 4 60 -2 29 6 -2 44 38 28 32 33 28 52 9 1 8 33
edges = [1:2:60, inf]
edges = 1×31
1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59
[counts, ~, bins] = histcounts(A, edges)
counts = 1×30
2 5 2 2 1 1 2 1 0 1 1 0 1 2 3 2 5 0 2 1 1 1 2 1 1 1 0 2 2 1
bins = 1×50
29 11 2 21 28 0 2 10 17 15 20 13 29 2 30 0 15 3 0 22 19 14 16 17 14 26 5 1 4 17
valididx = reshape(find(bins), [], 1);
B = accumarray( reshape(bins(valididx), [], 1), valididx, [], @(V){V.'})
B = 30×1 cell array
{[ 28 39]} {[3 7 14 31 37]} {[ 18 38]} {[ 29 47]} {[ 27]} {[ 46]} {[ 34 50]} {[ 42]} {0×0 double } {[ 8]} {[ 2]} {0×0 double } {[ 12]} {[ 22 25]} {[ 10 17 44]} {[ 23 40]}
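For comparison, a plain find() per bin gives the same grouping without accumarray. This is a different technique from the code above, shown only as a sketch on made-up data: it is usually slower for many bins, but it skips bin number 0 naturally, returns the indices in ascending order, and always produces one cell per bin. (The accumarray call with [] as the size argument sizes its output from the largest bin number that occurs, so trailing empty bins would be missing unless a size such as [numel(edges)-1, 1] is passed.)

```matlab
A = [58 21 4 -2 29 6];                % made-up data; -2 is below the first edge
edges = [1:2:60, inf];
[~, ~, bins] = histcounts(A, edges);  % out-of-range values get bin number 0
nbins = numel(edges) - 1;
% One cell per bin; find() never matches the 0 entries, so they are ignored.
B = arrayfun(@(k) find(bins == k), (1:nbins).', 'UniformOutput', false);
```

With this data, B{29} is 1, B{11} is 2, B{2} is 3, B{15} is 5 and B{3} is 6; index 4 (the value -2) appears in no cell.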
