How can I construct a dataset array from data on an excel worksheet with an unknown number of rows

Jon on 4 Apr 2013
I would like to construct a dataset array from data on an Excel worksheet. I know the location of the upper-left corner of the data and the number of columns, but not the number of rows. So I can't use, for example, ds = dataset('XLSFile',filename,'Sheet','mysheet','Range','G8:H25'), because I don't know where the lower-right corner is. I do know that the sheet is blank below the last row of data of interest and blank to the right of the last column of data of interest, with no entirely blank rows or columns in between. It would be nice if I could specify the range using only the upper-left corner, e.g. 'G8', but the dataset constructor does not accept this. I would appreciate any solutions you can suggest.

Accepted Answer

Peter Perkins on 5 Apr 2013
Jonathan, it's possible to use named ranges, as described on the XLSREAD reference page, but that may not solve your problem. I don't know of a way to specify a range like 'B5:' without a lower-right corner.
Hope this helps.
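
One possible workaround, sketched here under assumptions rather than taken from the answer above: read a deliberately oversized block with the third ("raw") output of xlsread, find the last row that has any content, and build the range string for the dataset constructor from that. The sample cell array below is a hypothetical stand-in for what xlsread might return, and the sketch assumes blank cells come back as NaN (or empty) in raw. The file name, the upper bound of 1000 rows, and the columns G:H are illustrative only.

```matlab
% Hypothetical stand-in for:
%   [~,~,raw] = xlsread(fileName,'mysheet','G8:H1000');
% (assumes blank worksheet cells are returned as NaN in raw)
raw = {'Time','Temp'; 1, 20.5; 2, 21.0; NaN, NaN; NaN, NaN};

% Treat a cell as blank if it is empty or holds a scalar NaN.
isBlank = cellfun(@(c) isempty(c) || ...
    (isnumeric(c) && isscalar(c) && isnan(c)), raw);

% Index, within the block, of the last row that has any content.
nRows = find(~all(isBlank,2), 1, 'last');

% Convert to a worksheet range; the block starts at the known corner G8.
firstRow = 8;
rangeStr = sprintf('G%d:H%d', firstRow, firstRow + nRows - 1);

% The range could then be passed on (requires Statistics Toolbox):
%   ds = dataset('XLSFile',fileName,'Sheet','mysheet','Range',rangeStr);
```

This relies on the sheet being blank below the last data row, as stated in the question. The oversized upper bound has to be chosen larger than any expected table.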
  1 Comment
Jon on 5 Apr 2013
I did notice the feature you refer to in the documentation, but I would prefer not to create all the named ranges. It is extra work, and as a manual operation it can introduce errors. Actually, if I just had the 'HeaderLines' property that is available when reading text files, I would be all set, as the only thing on the worksheet is a contiguous table of data with some header rows, with all other cells blank. It's just that I don't know in advance how many rows of data I have. It varies from worksheet to worksheet.


More Answers (1)

Kye Taylor on 4 Apr 2013
Just try
d = dataset('XLSFile',yourFileName,'Sheet','mysheet')
  3 Comments
Jon on 5 Apr 2013
Typically the first few rows of the spreadsheet have header data that I'm not interested in (for the purpose of making datasets, anyhow), and then there is a row that has the variable names. So I don't want to set 'ReadVarNames' to false, because I do want to read the names; I just don't want the constructor to assume that they are in the first non-empty row. The 'HeaderLines' property that is available when reading text files would be perfect, but unfortunately it is not available when reading Excel files.
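
A minimal sketch of how the 'HeaderLines' behaviour described above might be emulated for a worksheet, assuming the sheet is first read into a cell array with the third output of xlsread. The sample cell array is a hypothetical stand-in for that output, nHeaderLines is an illustrative variable name, and the data columns are assumed to be purely numeric.

```matlab
% Hypothetical stand-in for: [~,~,raw] = xlsread(fileName,'mysheet');
% Two unwanted header rows, then the variable names, then numeric data.
raw = {'Report A', NaN; 'Run 7', NaN; 'Time','Temp'; 1, 20.5; 2, 21.0};

nHeaderLines = 2;                     % rows to skip above the names row
body = raw(nHeaderLines+1:end, :);    % names row plus all data rows

varNames = body(1,:);                 % {'Time','Temp'}
data = cell2mat(body(2:end,:));       % numeric matrix of the data rows

% With the Statistics Toolbox, the trimmed block could be converted,
% taking the variable names from its first row (the default behaviour):
%   ds = cell2dataset(body);
```

Because the trimming happens on the cell array, the number of data rows never has to be known in advance; only the number of header rows does.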

