How to extract multiple matrices from a big set (Large Matrix) of data?
Hello Matlab,
I need some help extracting some matrices from a bigger data set. My large 4D matrix has the dimensions 5*5*14680*30, that is, 30 sets of 14680 five-by-five matrices. I want to extract multiple 5*5 matrices from all 30 sets. The selection needs to be based on the matrix location (order); the matrices are stacked from 1 through 14680. For example, I want to extract the 5*5 matrices number 500, 550, 700, 755, and 793 out of the 14680 matrices, in each of the 30 sets.
I was thinking a loop and an "if" statement might do the job, because I need to extract 580 matrices, then 1170 of them. I know exactly where they are located, so I have a (580,1) vector containing the exact locations of the matrices I need. I would greatly appreciate some feedback. Thanks
6 Comments
per isakson on 14 Dec 2017 (Edited: per isakson on 14 Dec 2017)
"extract specific matrices based on their location/indice" Isn't that straightforward?
>> M(:,:,500,1)
ans =
       12476       12481       12486       12491       12496
       12477       12482       12487       12492       12497
       12478       12483       12488       12493       12498
       12479       12484       12489       12494       12499
       12480       12485       12490       12495       12500
"create a vector column(580*1)": a row vector is better if you want to use a for-loop, because a for-loop iterates over the columns of its index expression.
"where the code verifies the location": I don't understand this.
Answers (1)
per isakson on 14 Dec 2017 (Edited: per isakson on 15 Dec 2017)
Use this code to study for-loops and indexing. Put a breakpoint on the line that reads "17;".
RowID = [ 500; 503 ];  % example locations of the wanted 5x5 matrices
MNP130 = 1:(5*5*14680*30);  % a 1D double vector
MNP130 = reshape( MNP130, 5,5,14680,30);  % reshape to 4D
for kk = 1 : 30  % loop over all "sets"
    for  jj = reshape( RowID, 1,[] )  % reshape to row vector
        m5x5 = MNP130 (:,:,jj,kk);  % the jj-th 5x5 matrix of set kk
        17; % set break-point here
    end
end
Yes, m5x5 is overwritten many times.
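The reshape to a row vector in the loop above matters because a MATLAB for-loop iterates over the columns of its index expression. A minimal sketch of the difference, using a short hypothetical RowID (the values are borrowed from the question's example):

```matlab
% Hypothetical index vector; values borrowed from the question's example.
RowID = [500; 550; 700];          % column vector, 3x1

nCol = 0;
for jj = RowID                    % one pass only: jj is the whole 3x1 column
    nCol = nCol + 1;
end

nRow = 0;
for jj = reshape(RowID, 1, [])    % three passes: jj is a scalar each time
    nRow = nRow + 1;
end

fprintf('column: %d iteration(s), row: %d iteration(s)\n', nCol, nRow);
% prints: column: 1 iteration(s), row: 3 iteration(s)
```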
.
"done with a loop (from i=1:14680) in each set, check the location of the 5*5 matrix"
RowID = [ 500; 503 ];
for  jj = reshape( RowID, 1,[] )  % reshape to row vector
    m5x5 = MNP130 (:,:,jj,kk);
end
does the same job as
for  jj = 1 : 14680
    if ismember( jj, RowID )
        m5x5 = MNP130 (:,:,jj,kk);
    end
end
with kk from outer loop
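If the extracted matrices should be kept rather than overwritten, one option is to preallocate a result array and fill it inside the loop. The following is only a sketch under assumptions: M is a small stand-in for MNP130, and the names picked and n as well as the RowID values are made up for illustration.

```matlab
% Small stand-in for MNP130 so the sketch runs quickly (sizes are assumptions).
M = reshape(1:(5*5*20*3), 5, 5, 20, 3);
RowID = [4; 7; 15];                       % hypothetical locations

% Preallocate: one slot per requested matrix and per set.
picked = zeros(5, 5, numel(RowID), size(M, 4));
for kk = 1:size(M, 4)                     % loop over sets
    for n = 1:numel(RowID)                % loop over positions within RowID
        picked(:, :, n, kk) = M(:, :, RowID(n), kk);
    end
end
disp(size(picked))                        % 5 5 3 3
```

Looping over the position n instead of the value RowID(n) gives each extracted matrix a well-defined slot in the result.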
.
"extract it that 5*5 and store it aside in a new data set"
"a new data set": in what type of variable? Speed and memory speak for a double array. However, 4D may be a bit tricky to get one's head around.
new_data_set = squeeze( MNP130 (:,:,500,:) );
>> whos new_data_set
  Name              Size              Bytes  Class     Attributes
  new_data_set      5x5x30             6000  double
I believe this one-liner is the answer to your question. If so, maybe there is no need to calculate new_data_set in a separate step. MNP130 is not that big, only 88 MB, and the execution time of the one-liner is really short.
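The 88 MB figure follows from the array size, assuming the default double class at 8 bytes per element (nBytes is just an illustrative name):

```matlab
nElements = 5*5*14680*30;          % number of elements in MNP130
nBytes    = nElements * 8;         % a double occupies 8 bytes
fprintf('%d elements, %.1f MB\n', nElements, nBytes/1e6);
% prints: 11010000 elements, 88.1 MB
```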
.
5 Comments
per isakson on 18 Dec 2017
- "I did read on Debug and Examining Values as you suggested." Ok, but I suspect you didn't try to step through the code as I proposed. That's because you wrote "I tested this: [...] the output was one 5*5 matrix."
- "I believe that what I lacked was a better explanation of my expected/desired output" Yes, indeed. The question is how you plan to use those new_data_sets. However, I think it's better to stick with a 4D-array.
- "The reason I believe that using loop is necessary [...]" Neither a loop nor if-statements are necessary. MNP130(:,:,RowID,:) creates the "(5*5*580*30)" array.
- "never taken any coding courses" I guess you share that with many Matlab users. I'm one of them. An Algol course half a century ago doesn't count. The MathWorks targets engineers and scientists, who don't want to waste time on computer science. See MATLAB Fundamentals
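The point about indexing with the whole RowID vector at once can be tried on a small array. This is a sketch under assumptions: M stands in for MNP130 with a reduced size, and the RowID values are hypothetical.

```matlab
% Small stand-in for MNP130 (the real array is 5x5x14680x30).
M = reshape(1:(5*5*20*3), 5, 5, 20, 3);
RowID = [4; 7; 15];                 % hypothetical locations; row or column both work here

% Index the third dimension with the whole vector at once, no loop, no if.
subset = M(:, :, RowID, :);

disp(size(subset))                  % 5 5 3 3
disp(isequal(subset(:,:,2,:), M(:,:,7,:)))   % 1 (true)
```

With the real data and a 580-element RowID, the same expression would give a 5x5x580x30 array.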
Experiment with small data samples. Here is a function that creates an appropriate sample array.
>> M = sample4D;
>> subset = M(:,:,[3,4],:);
>> subset
subset(:,:,1,1) =
    13    13
    13    13
subset(:,:,2,1) =
    14    14
    14    14
subset(:,:,1,2) =
    23    23
    23    23
...
where
function  M = sample4D
    M = nan(2,2,5,3);
    for kk = 1 : 3
        for jj = 1 : 5
            M(:,:,jj,kk) = (kk*10+jj)*ones(2,2);
        end
    end
end
The indices [3,4] of M become the indices [1,2] of subset. To me that's a reason to stick with M.
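If a separate subset is kept anyway, the original location can be translated to a position in the subset with ismember. A minimal sketch, assuming a small stand-in array M and using the locations from the question's example (the names found and pos are illustrative):

```matlab
% Stand-in array: 2x2 matrices, 800 per set, 3 sets (sizes are assumptions).
M = reshape(1:(2*2*800*3), 2, 2, 800, 3);
RowID = [500 550 700 755 793];      % locations named in the question
subset = M(:, :, RowID, :);

% Where does original matrix number 700 sit inside subset?
[found, pos] = ismember(700, RowID);
fprintf('found = %d, position in subset = %d\n', found, pos);
% prints: found = 1, position in subset = 3

% The slice at that position matches the original slice.
disp(isequal(subset(:,:,pos,:), M(:,:,700,:)))   % 1 (true)
```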