How to read Excel files with an unknown number of header rows?

Below is my code to read an Excel file using readtable:
T1 = readtable ('test.xlsx','PreserveVariableNames',true);
Headers = T1.Properties.VariableNames;
A = T1{:,1};
Here is the problem. My Excel file could have an unknown number of header rows (from 1 to 20). It seems that (a) the Headers are always the first row, and (b) the A values always start from the first all-numeric row.
What I need is the Row # of the A values. If I know there is only one header row, I know the first element of A starts from Row # 2. With unknown number of header rows, how do I derive that Row # info of A?
Thanks!
  6 Comments
Walter Roberson on 5 Oct 2022
Is there anything that is consistent between the files? Same number of variables with the same headers?
If there is a variable number of header lines, are the variable names always going to be the first row, and the variable units always the second row? Or are the variable names always the first row, but the variable units always the row before the data? Or is there a variable number of headers, all followed by names, then units, then data?
Leon on 6 Oct 2022
Thanks for the reply, Walter.
The first row is always the header row. The 2nd row can either be the data or the unit row.


Accepted Answer

Walter Roberson on 7 Oct 2022
I notice you are using R2019b. Starting with R2019a, you can use readcell. You would then test isnumeric(T{2,1}) on the returned cell array T to determine whether the second row is a header (units) row or numeric data. Then call cell2table() using the first row of the cell array as the VariableNames, and either 2:end or 3:end indexing for the content, depending on where the numerics start; if row 2 is not numeric, use its content to set the VariableUnits property.
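A minimal sketch of the steps described in this answer, assuming the header is in row 1 and row 2 is either a units row or the first data row. The file name test_units.xlsx and its contents are made up so the example is self-contained, and the cell array is named C here (T in the answer) so that T can hold the final table.

```matlab
% Build a small sample workbook (hypothetical contents, for illustration only).
sample = {'Depth','Temp'; 'm','degC'; 10,4.5; 20,4.1};
writecell(sample, 'test_units.xlsx');

% Read every cell, header rows included.
C = readcell('test_units.xlsx');

% Row 2 is treated as a units row when its first cell is not numeric.
hasUnits = ~isnumeric(C{2,1});
firstDataRow = 2 + hasUnits;    % data starts in row 2 or row 3

% Row 1 supplies the variable names; the rest becomes the table content.
T = cell2table(C(firstDataRow:end,:), 'VariableNames', C(1,:));

% Keep the units row, if there was one, as table metadata.
if hasUnits
    T.Properties.VariableUnits = C(2,:);
end
```

Here firstDataRow is also the worksheet row number of the first data value, provided the sheet's contents start in cell A1.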
  1 Comment
Leon on 7 Oct 2022
Many thanks for the recommended solution, Walter!
That will work most of the time, except that T{2,1} may not always be numeric even when it is part of the data. Sometimes it can be a string. My data does contain some strings, such as the Station ID, Cruise_name, Expedition code, etc.
Maybe the only way is to first identify a column that is always numeric?
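One way the idea in this comment could be sketched: choose a column that is known to hold only numbers in the data rows, then look for the first row in which that column is numeric. The file name, the sample contents, and the column index numCol are assumptions made for illustration; the row number found matches the worksheet row only if the sheet's contents start in cell A1.

```matlab
% Sample workbook whose first column is text even in the data rows
% (hypothetical contents, for illustration only).
sample = {'Station','Depth'; 'id','m'; 'A1',10; 'B2',20};
writecell(sample, 'test_mixed.xlsx');

C = readcell('test_mixed.xlsx');

% Assumption: column 2 is known to be numeric in every data row.
numCol = 2;

% Flag the rows in which the chosen column holds a numeric scalar.
isNum = cellfun(@(x) isnumeric(x) && isscalar(x), C(:,numCol));

% The first flagged row is where the data starts.
firstDataRow = find(isNum, 1, 'first');

% Values of the first column, taken from the first data row onward.
A = C(firstDataRow:end, 1);
```

Empty cells come back from readcell as missing values, which are not numeric, so blank cells in the header area of the chosen column are skipped as well.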


More Answers (1)

Shashwat Bajpai on 13 Feb 2020
The spreadsheetDatastore function can help with this, along with detectImportOptions.
You can also use the Import Tool in the MATLAB Toolstrip to select the rows required.
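A rough sketch of the detectImportOptions route, assuming its automatic detection finds the header and data rows correctly for the file at hand (it is heuristic, so the result is worth checking). For spreadsheets the returned options carry a DataRange property such as 'A3', from which the first data row number can be parsed. The file name and contents are made up for illustration.

```matlab
% Sample workbook (hypothetical contents, for illustration only).
writecell({'Depth','Temp'; 'm','degC'; 10,4.5; 20,4.1}, 'test_opts.xlsx');

% Let MATLAB detect where the variable names and the data are located.
opts = detectImportOptions('test_opts.xlsx');

% DataRange names the top-left cell of the data; extract its row number.
firstDataRow = str2double(regexp(opts.DataRange, '\d+', 'match', 'once'));

% Import using the detected layout.
T = readtable('test_opts.xlsx', opts);
```

If the detected row is wrong for a particular file, opts.DataRange and opts.VariableNamesRange can be set by hand before calling readtable.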
Hope this Helps!

Release

R2019b
