Reading only Certain Columns from .CSV

I'm trying to read in the 1st and 3rd columns of data from a file.
I know this code shows the 2nd and 3rd:
M = csvread('filename.csv', 2, 2)
How do I skip the 2nd column and just display the 1st and 3rd?

Answers (1)

Walter Roberson
Walter Roberson on 20 Mar 2019
Edited: Walter Roberson on 20 Mar 2019
That cannot be done with csvread() or dlmread().
With textscan() you would use a format specification of '%f %*f %f %*[^\n]' . You would probably use cell2mat() around the result of the textscan() call.
With readtable() the way to proceed would be to use detectImportOptions (new as of R2016b). You might have to give the option 'ReadVariableNames', false . Assign the result to a variable, and set the SelectedVariableNames property of the object to [1 3]. Then use readtable() on the .csv file, passing in that options object.

18 Comments

Would you be able to provide some sample code? I'm not really too sure what you are referring to
filename = 'filename.csv';
[fid, msg] = fopen(filename, 'rt');
if fid < 0
error('Failed to open file "%s" because "%s"', filename, msg);
end
data = cell2mat( textscan(fid, '%f,%*f,%f%*[^\n]') );
fclose(fid);
%data is now a numeric table with two columns
or
filename = 'filename.csv';
opts = detectImportOptions(filename, 'ReadVariableNames', false);
opts.SelectedVariableNames = [1 3];
data_table = readtable(filename, opts);
%data_table is now a table object with two variables.
%data_table{:,:} would be a numeric array with two columns
I am getting the following errors with your second solution:
Error using detectImportOptions
'ReadVariableNames' is not a recognized parameter. For a list of valid name-value pair
arguments, see the documentation for detectImportOptions.
Error in detectImportOptions>getTypedParser/parsefcn (line 287)
p.parse(args{:});
Error in detectImportOptions>textArgs (line 319)
args = parser(otherArgs);
Error in detectImportOptions (line 219)
args = textArgs(p.Unmatched);
Error in FlexTest2 (line 2)
opts = detectImportOptions(lux, 'ReadVariableNames', false);
And with your first proposed solution I am getting an empty array:
data =
0×2 empty double matrix
Ah, ReadVariableNames was not a parameter in your R2016b release.
Please show a sample of the first 3 or 4 lines of your file.
epoc (ms)timestamp (-0400)elapsed (s)illuminance (lx)
15542357188562019-04-02T16.08.38.8560.0008.203
15542357208562019-04-02T16.08.40.8562.0002.323
15542357228552019-04-02T16.08.42.8553.9992.323
15542357248552019-04-02T16.08.44.8555.9992.323
15542357268542019-04-02T16.08.46.8547.9982.323
Perhaps you should attach the sample as a file, so that we can check for hidden characters such as tab characters.
What you have posted cannot be read by csvread() or xlsread().
In R2017a and later, what you posted could be approached with readtable() and a FixedWidthImportOptions https://www.mathworks.com/help/matlab/ref/matlab.io.text.fixedwidthimportoptions.html . However, you only have R2016b. For your options, see the discussion at https://www.mathworks.com/matlabcentral/answers/453137-sscanf-to-extract-numbers-from-string#answer_367985
See attached CSV file
t = readtable('BlackModelTest.csv');
times = datetime(t{:,1}/1000, 'convertfrom', 'posixtime');
elapsed = t{:,3};
My tests show that if you do
seconds(times - times(1))
then that exactly matches the elapsed data (column 3).
In order for the second column to match, we would have to assume that there is a 4 hour timezone difference
Hi Walter,
I'm sorry this is getting really confusing. The comment you just posted only outputs the timestamp.
All I want to output is an array with just the TIMESTAMP and LIGHT SENSOR READING (columns 2 and 4).
Can you simply expalin code to achieve this?
t = readtable('BlackModelTest.csv');
timestamp = datetime(t{:,1}/1000, 'convertfrom', 'posixtime');
light_sensor_reading = t{:,4};
In the sample file you provided, the second column was the text version of a timestamp, with the text being equivalent to the information that could be obtained by treating the first column as a POSIX time. There is, however, a timezone difference between the time obtained from the first column, and the text version of the time; the difference would be easiest to explain if the text version is local time in EDT (Eastern Daylight Time). If so, then you could use
t = readtable('BlackModelTest.csv');
timestamp = datetime(t{:,1}/1000, 'convertfrom', 'posixtime', 'TimeZone', 'UTC');
timestamp.TimeZone = 'America/New_York';
light_sensor_reading = t{:,4};
okay but how do I get ONE SINGLE ARRAY. WIth the timestamp and light sensor reading?
I currently have a seperated timestamp array and light sensor reading array. How do I combine the two into ONE ARRAY 2 COLUMNS?
final = [timestamp.TimeZone(:), light_sensor_reading(:)]
I am trying this but am getting error "Dimensions of matrices being concatenated are not consistent"
Your second column is text. How do you intend to represent that in a numeric array?
What numeric value do you want stored for '2019-04-03T14.01.57.690' ?
Why can't the first column be a string and the second column be a number?
perhaps
t(:,[2 4])
would work for your purposes. If not, then
[t{:.2}, num2cell(t{:,4})]
Hi walter,
This works well but for my second application I would prefer not a table. How do I get a matrix?
The table is giving me an error because I am trying to add a greater than operator to find light sensor values greater than 20.
Thanks!
What would it mean to use a greater than operator with those text timestamps ?
If you want to test the sensor values you can extract them from the table:
mask = t{:,4} > 20;
which could, for example, be used in the context:
t(t{:.4}>20, [2 4])

Sign in to comment.

Categories

Products

Release

R2016b

Tags

Asked:

on 20 Mar 2019

Commented:

on 3 Apr 2019

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!