Importing Column Vectors Sequentially

Question


Hi, I'm trying to import data from a large tab-delimited CSV with headers (0.8 GB).

I don't want to import everything, just a number of specific columns. That is, I would like to create separate column vectors for:

1. Cells D19:D234568

then

2. Cells F19:F234568

and so on.

Currently I'm doing this one by one because, even with 12 GB of RAM, I'm running out of memory.

There must be a simple way of doing this quickly, no? Once the vectors are loaded and saved as .mat they load up in seconds.

Cheers,

Alasdair

3 Comments

Alasdair Fulton on 7 Aug 2016

Hi,

Sorry for the confusion. To clarify, when I view it in the MATLAB import app, those are the cells I highlight manually before hitting "Import Selection". It's a tab-delimited text file.

per isakson on 7 Aug 2016


"tab delimited CSV with headers. (0.8 GB) .... even with 12GB ram I'm running out of memory" I find it hard to believe that a 0.8 GB text file should cause an out-of-memory error.

Is it numerical, text or mixed data?

Did you try something like this?

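fid = fopen( 'filename', 'r' );        % 'filename' is a placeholder for the data file
frm = '%*s%*s%*s%f%*s%f%*[^\n]';       % skip A-C, read D, skip E, read F, skip rest of line
cac = textscan( fid, frm, (234568-18), 'Headerlines',18, 'Delimiter','\t');
fclose( fid );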


Answer 1

dpb on 7 Aug 2016

Edited: dpb on 7 Aug 2016


"... in ... import app those are the cells ... tab delimited..."

In that case, use textscan and a format string set up to read the desired columns. This isn't particularly difficult to automate depending on the columns wanted...

cols={'D','F'};           % the list of wanted columns, in ascending order
fmt=[];                   % empty string to build format string into
prev=0;                   % index of the last column already accounted for
for i=1:length(cols)      % over the number of columns to read
  c=cols{i}-'A'+1;        % column letter to 1-based index (single letters A-Z only)
  fmt=[fmt repmat('%*f',1,c-prev-1) '%f'];  % skip the columns in between, read 1
  prev=c;
end
fmt=[fmt '%*[^\n]'];      % and then skip rest of line
fid=fopen('filename','r');
data=cell2mat(textscan(fid,fmt,'delimiter','\t', ...
                                'headerlines', 18, ...
                                'collectoutput',1));  % and read the file
fid=fclose(fid);
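If the wanted columns ever go past Z, the letter arithmetic needs a base-26 conversion. The following is only a sketch of how the format string could be built in that case; the column list {'D','F','AB'} is an assumed example, and like the code above it assumes every skipped column is numeric (use '%*s' for the skips instead if some are text).

```matlab
% Sketch: build a textscan format that reads only the listed
% spreadsheet-style columns and skips everything else.
cols = {'D','F','AB'};                  % wanted columns (assumed example list)
idx  = zeros(1,numel(cols));
for k = 1:numel(cols)
    letters = upper(cols{k}) - 'A' + 1; % 'AB' -> [1 2]
    idx(k)  = polyval(letters, 26);     % base-26 value: 1*26 + 2 = 28
end
idx   = sort(idx);                      % columns must be read left to right
skips = diff([0 idx]) - 1;              % columns to skip before each wanted one
fmt   = '';
for k = 1:numel(idx)
    fmt = [fmt repmat('%*f',1,skips(k)) '%f'];  % skip the gap, then read one
end
fmt = [fmt '%*[^\n]'];                  % ignore the rest of each line
```

The resulting fmt can be passed to textscan exactly as in the call above.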

There's a section, Large Text Files, linked from the doc for textscan that describes how to read a file in blocks if this still runs out of memory, although if the import tool can do it, the above will likely work, as I'd venture that's what it does as a first try anyway...
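For reference, reading in blocks could look roughly like the sketch below. It is not taken from the doc page; the small demo file, its two header lines, and the block size of 4 rows are assumptions chosen so the example runs on its own. For the file in the question, the header count would be 18 and the block size something much larger.

```matlab
% Demo input (assumption): a small tab-delimited file with 2 header lines
% and 8 numeric columns, standing in for the real 0.8 GB file.
fname = [tempname '.txt'];
fid = fopen(fname,'w');
fprintf(fid,'header line 1\nheader line 2\n');
for r = 1:10
    fprintf(fid,'%d\t%d\t%d\t%d\t%d\t%d\t%d\t%d\n', r*(1:8));
end
fclose(fid);

fmt       = '%*f%*f%*f%f%*f%f%*[^\n]';  % read columns D and F only
nHeader   = 2;                          % 18 for the file in the question
blockSize = 4;                          % rows per call; much larger for real data
opts      = {'Delimiter','\t','CollectOutput',true};

fid    = fopen(fname,'r');
blocks = {};                            % one N-by-2 matrix per block
c = textscan(fid, fmt, blockSize, opts{:}, 'HeaderLines', nHeader);
while ~isempty(c{1})
    blocks{end+1} = c{1};               % keep (or process/save) this block
    if feof(fid), break; end            % nothing left to read
    c = textscan(fid, fmt, blockSize, opts{:});  % resumes at current position
end
fclose(fid);
data = vertcat(blocks{:});              % column 1 is D, column 2 is F
delete(fname);                          % remove the demo file
```

Each block could just as well be reduced or written to a .mat file inside the loop instead of being kept in memory, which is the point of reading in pieces.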
