How do I use readtable to read in an Excel file whose first column has indenting that I want to preserve?

17 views (last 30 days)
I have an excel document with top-level information that is comprised of components which themselves are comprised of parts. It is used to run code which combines the parts to find the component, and combines the componenets to find the Top Level Data. Ignoring the actual data, which I am able to retrieve, the first column in the Excel file looks like this:
Top level data
Component 1
Part 1-1
Part 1-2
Part 1-3
Component 2
Part 2-1
Part 2-2
What happens with my readtable call,
Ex_MassProps=readtable('MyExcelFile.xlsx','Sheet', 'M1','VariableNamingRule', 'preserve'); % load the excel file, sheet called M1
is that Matlab strips all the leading spaces indenting the row names. I need that information. Is there a way to get Matlab to treat the first column elements as strings to retain the spaces? I use it to display the data the user cares about with the statement
disp(Ex_MassProps(:,1));
But more importantly, periodically new components and parts are added. I need to be able to detect the nesting so I add them up correctly.

Accepted Answer

Voss
Voss on 19 May 2022
opts = detectImportOptions('test.xlsx');
opts.VariableOptions(1).WhitespaceRule = 'preserve';
Ex_MassProps = readtable('test.xlsx',opts)
Ex_MassProps = 8×3 table
Var1 Var2 Var3 __________________ ____ ____ {'Top level data'} 1 9 {' Component 1' } 2 10 {' Part 1-1' } 3 11 {' Part 1-2' } 4 12 {' Part 1-3' } 5 13 {' Component 2' } 6 14 {' Part 2-1' } 7 15 {' Part 2-2' } 8 16
  11 Comments
Voss
Voss on 20 May 2022
You're welcome!
Rather than hidden character(s), maybe the creator of the sheet included some formatting options in those cells, using Excel. I say that because I believe that readtable with WhitespaceRule = 'preserve' would return a table with any whitespace characters from the cells still in it.
You can programmatically check for formatting options like this:
% full path to your file:
fn = fullfile(pwd(),'test.xlsx');
e = actxserver('Excel.Application');
wb = e.Workbooks.Open(fn); % fn must be the full path
ws = wb.Worksheets.Item('M1');
ws.Range('A1:A8').HorizontalAlignment
ws.Range('B1:B8').HorizontalAlignment
ws.Range('C1:C8').HorizontalAlignment
ws.Range('A1:A8').NumberFormat
ws.Range('B1:B8').NumberFormat
ws.Range('C1:C8').NumberFormat
e.Quit; % important: release the file and close the actxserver
If you run that code with the file attached here, which I've applied some formatting to, you should see that
  • Column A has HorizontalAlignment -4152, which corresponds to "Right (indent)" in Excel
  • Column B has HorizontalAlignment -4131, which corresponds to "Left (indent)" in Excel
  • Column C has HorizontalAlignment 1, which corresponds to "General" in Excel
  • Column B has NumberFormat '0.00', which corresponds to "Number" with 2 decimal places in Excel
  • Columns A and C have NumberFormat 'General', which corresponds to "General" in Excel
(Here's a link to the Microsoft documentation for HorizontalAlignment codes in Excel, and you can try to find other infomation you might need by looking around on there.)
I mention this approach because it may be useful - in general - to be able to get formatting information (or any other information) from the xlsx file this way. (I don't know how useful it is in this particular case, though, because it seems like you wanted to know the indentation level of the text in the first element of each row, and just knowing that the first column is right-aligned or whatever doesn't tell you anything about how indented some particular text is.)

Sign in to comment.

More Answers (0)

Products


Release

R2021a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!