Faster data import/export (xlsread or textscan)

1 view (last 30 days)
I am running a code that exports matrices (clusternoA, ClustersizeA etc) to an excel file with each column corresponding to each variable. These matrices are row vectors with variable length.
if true
resultsfile=[path1,'Clusterdistribution',num2str(im),'.xlsx'];
col_header1 = {'ClustersizeA','ClustersizeB','ClustersizeAB', '','ClusternoA','ClusternoB','ClusternoAB' };
col_header2 = {'micrometer','micrometer','micrometer', '','no per clust','no per clust','no per clust'};
xlswrite(resultsfile, col_header1, 'sheet1','A1');
xlswrite(resultsfile, col_header2, 'sheet1','A2');
xlswrite(resultsfile, clustersizeA', 'sheet1','A3');
xlswrite(resultsfile, clustersizeB', 'sheet1','B3');
xlswrite(resultsfile, clustersizeAB', 'sheet1','C3');
xlswrite(resultsfile, clusternoA', 'sheet1','E3');
xlswrite(resultsfile, clusternoB', 'sheet1','F3');
xlswrite(resultsfile, clusternoAB', 'sheet1','G3');
end
Once, excel file is generated, I am trying to import this data corresponding to each image (above data is for one image, importing data for all images into one matrix). Following Code works but its very slow probably because the size of arrays is not specified or xlsread. Is there a way to make this faster?
if true
clustersizeA =[];
clustersizeB =[];
clustersizeAB =[];
clusternoA=[];
clusternoB=[];
clusternoAB=[];
for im=5:7
Inputfile = ['Clusterdistribution',num2str(im),'.xlsx'];
sizeA = xlsread(Inputfile, 'sheet1', 'A:A');
sizeB = xlsread(Inputfile, 'sheet1', 'B:B');
sizeAB = xlsread(Inputfile, 'sheet1', 'C:C');
noA = xlsread(Inputfile, 'sheet1', 'E:E');
noB = xlsread(Inputfile, 'sheet1', 'F:F');
noAB = xlsread(Inputfile, 'sheet1', 'G:G');
clustersizeA=cat(1, clustersizeA,sizeA);
clustersizeB=cat(1, clustersizeB,sizeB);
clustersizeAB=cat(1, clustersizeAB,sizeAB);
clusternoA=cat(1, clusternoA, noA);
clusternoB=cat(1, clusternoB, noB);
clusternoAB=cat(1, clusternoAB, noAB);
end
end

Answers (1)

Rahul Goel
Rahul Goel on 30 Oct 2015
Hi Nitin,
For reading excel files, "xlsread" is mostly preferred. However, I would suggest to read the sheet into a variable in one go and then split that into different variables column-wise instead of calling xlsread for every column explicity. It will reduce the overhead of opening and closing the underlying COM communication used by xlsread for every single column. Also, your code should run faster in MATLAB R2015b using "xlsread" as R2015b onward, MATLAB does not close the excel COM instance every time it is done reading the from excel. It keeps it opened until you close the MATLAB so as to speed up the subsequent calls to xlsread/xlswrite.
Hope this helps.
  1 Comment
nitin arora
nitin arora on 30 Oct 2015
Thanks Rahul. Importing data into a single matrix is faster (9 vs 16 sec) but I got into another problem, that is I am getting lot of NaN(Not a number) in my arrays. When I use column A:A, it stops at last number whereas when using column A:G for import, the length of each column is set to the column with maximum values(max rows). How to solve this problem? Please advice.
if true
clustersizeA =[];
clustersizeB =[];
clustersizeAB =[];
clusternoA=[];
clusternoB=[];
clusternoAB=[];
for im=5:6
Inputfile = ['Clusterdistribution',num2str(im),'.xlsx'];
clusterdata = xlsread(Inputfile, 'sheet1', 'A:G');
sizeA = clusterdata (:,1);
sizeB = clusterdata (:,2);
sizeAB = clusterdata (:,3);
noA = clusterdata (:,5);
noB = clusterdata (:,6);
noAB = clusterdata (:,7);
clustersizeA=cat(1, clustersizeA,sizeA);
clustersizeB=cat(1, clustersizeB,sizeB);
clustersizeAB=cat(1, clustersizeAB,sizeAB);
clusternoA=cat(1, clusternoA, noA);
clusternoB=cat(1, clusternoB, noB);
clusternoAB=cat(1, clusternoAB, noAB);
end
end

Sign in to comment.

Categories

Find more on Large Files and Big Data in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!