Importing specific rows of data from a text file
Hi,
I have sensor data in a very large text (.dat) file. Some of the relevant data in this file needs to be analyzed and plotted in MATLAB.
An example of the data:
C 0 0.001 -0.02 24.09 4.64 -100.00 -100.00
C 0 1.005 0.29 24.09 4.43 -100.00 -100.00
C 0 2.009 -0.34 24.09 8.26 -100.00 -100.00
C 0 3.014 -0.18 24.06 6.06 -100.00 -100.00
C 0 4.018 0.07 24.06 5.61 -100.00 -100.00
C 0 5.022 0.02 24.09 4.92 -100.00 -100.00
C 0 6.026 0.34 24.12 4.28 -100.00 -100.00
C 0 7.030 -0.46 24.09 8.37 -100.00 -100.00
C 0 8.034 -0.23 24.09 7.50 -100.00 -100.00
R 0 60.275 -0.157674 -0.006891 0.000000 ......
All I want to import into MATLAB and analyze are the rows that start with the letter 'R', which stands for Result. There is a pattern to the occurrence of the 'Result' rows in this big text file: an 'R' row occurs every 160 rows.
How can I import only these 'Result' rows into MATLAB, either interactively or programmatically? I would deeply appreciate a detailed answer, as I am at an intermediate level of MATLAB programming.
Thank you so much in advance! Pramit
1 Comment
per isakson
on 7 Jan 2015
"very large text (.dat) file"   How large is that? The size is important. Does the entire file fit in memory? The total time needed to read the file might be a problem.
Accepted Answer
per isakson
on 7 Jan 2015
Edited: per isakson
on 8 Jan 2015
If the entire file fits in memory, try this code
>> num = cssm()
num =
0 60.2750 -0.1577 -0.0069 0
0 60.2750 -0.1577 -0.0069 0
0 60.2750 -0.1577 -0.0069 0
0 60.2750 -0.1577 -0.0069 0
where
function out = cssm()
    % read the entire file to one cell array with one row per cell
    fid = fopen( 'cssm.txt', 'r' );
    cac = textscan( fid, '%s', 'Delimiter', '\n' );
    [~] = fclose( fid );
    % find rows which begin with 'R'.
    isR = cellfun( @(str) strncmp(strtrim(str),'R',1), cac{1} );
    % extract the rows beginning with 'R'
    rlt = cac{1}(isR);
    % join all rows with results to one long string separated by '\n'
    one_str = strjoin( rlt, '\n' );
    % parse the string.
    result = textscan( one_str, '%c%f%f%f%f%f', 'CollectOutput',true );
    % make sure that only results are included in the output
    assert( strcmp( unique(result{1}), 'R' ) ...
        , 'Non-result rows included in result' )
    out = result{2};
end
and where cssm.txt contains
C 0 6.026 0.34 24.12 4.28 -100.00 -100.00
C 0 7.030 -0.46 24.09 8.37 -100.00 -100.00
C 0 8.034 -0.23 24.09 7.50 -100.00 -100.00
R 0 60.275 -0.157674 -0.006891 0.000000
C 0 6.026 0.34 24.12 4.28 -100.00 -100.00
C 0 7.030 -0.46 24.09 8.37 -100.00 -100.00
C 0 8.034 -0.23 24.09 7.50 -100.00 -100.00
R 0 60.275 -0.157674 -0.006891 0.000000
C 0 6.026 0.34 24.12 4.28 -100.00 -100.00
C 0 7.030 -0.46 24.09 8.37 -100.00 -100.00
C 0 8.034 -0.23 24.09 7.50 -100.00 -100.00
R 0 60.275 -0.157674 -0.006891 0.000000
C 0 6.026 0.34 24.12 4.28 -100.00 -100.00
C 0 7.030 -0.46 24.09 8.37 -100.00 -100.00
C 0 8.034 -0.23 24.09 7.50 -100.00 -100.00
R 0 60.275 -0.157674 -0.006891 0.000000
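As an aside, strtrim and strncmp both accept cell arrays directly, so the row filter can also be written without cellfun. The following is only a sketch on an in-memory cell array: the variable names rows, rlt and num are illustrative, and it assumes every 'R' row carries exactly five numeric fields, as in the sample file above.

```matlab
% Rows as they might come out of textscan(..., '%s', 'Delimiter', '\n').
rows = { ...
    'C 0 6.026 0.34 24.12 4.28 -100.00 -100.00'
    'R 0 60.275 -0.157674 -0.006891 0.000000'
    'C 0 7.030 -0.46 24.09 8.37 -100.00 -100.00'
    '  R 0 60.275 -0.157674 -0.006891 0.000000' };

% Logical mask of the rows whose first non-blank character is 'R'.
isR = strncmp( strtrim(rows), 'R', 1 );

% Drop the leading 'R' label so that only numbers remain.
rlt = regexprep( rows(isR), '^\s*R', '' );

% Parse all numbers in one call; assumes five numeric fields per row.
num = sscanf( sprintf('%s\n', rlt{:}), '%f', [5 Inf] ).';
```

Here num has one row per 'R' line and one column per numeric field.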
 
... and an alternative, which is an order of magnitude faster
function out = faster()
    % read the entire file to one string
    str = fileread( 'cssm.txt' );
    % find start and end indices of all the "rows" beginning with 'R'
    % ('lineanchors' makes ^ match at the start of every line)
    xpr = '^R[^\n\r]*[\n\r]{0,2}';
    [ix1,ix2] = regexp( str, xpr, 'start', 'end', 'lineanchors' );
    % extract the "rows" beginning with 'R'
    isi = false(1,length(str));
    for ii = 1:length(ix1)
        isi(ix1(ii):ix2(ii))=true;
    end
    one_str = str(isi);
    % parse the string.
    result = textscan( one_str, '%c%f%f%f%f%f', 'CollectOutput',true );
    % make sure that only results are included in the output
    assert( strcmp( unique(result{1}), 'R' ) ...
        , 'Non-result rows included in result' )
    out = result{2};
end
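Both functions above read the whole file into memory. If the file is too large for that, one option is to stream it line by line and keep only the 'R' rows. This is a minimal sketch rather than a tuned solution: the temporary demo file, its contents, and the assumption of five numeric fields after the 'R' are illustrative and should be adapted to the real sensor file.

```matlab
% Write a small demo file (hypothetical contents modeled on the sample rows).
fname = [tempname '.txt'];
fid = fopen( fname, 'w' );
fprintf( fid, '%s\n', ...
    'C 0 7.030 -0.46 24.09 8.37 -100.00 -100.00', ...
    'R 0 60.275 -0.157674 -0.006891 0.000000', ...
    'C 0 8.034 -0.23 24.09 7.50 -100.00 -100.00', ...
    'R 0 60.275 -0.157674 -0.006891 0.000000' );
fclose( fid );

% Stream the file: only one line is held in memory at a time.
nFields = 5;                        % assumed number of numeric fields per R row
out = zeros( 0, nFields );
fid = fopen( fname, 'r' );
tline = fgetl( fid );
while ischar( tline )               % fgetl returns -1 at end of file
    tline = strtrim( tline );
    if strncmp( tline, 'R', 1 )
        vals = sscanf( tline(2:end), '%f', nFields );
        if numel( vals ) == nFields % skip malformed rows
            out(end+1,:) = vals.';  %#ok<AGROW>
        end
    end
    tline = fgetl( fid );
end
fclose( fid );
delete( fname );
```

Growing out one row at a time is acceptable when 'R' rows are rare (one in 160 here); if their number is known in advance, preallocating out would avoid the repeated resizing.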
More Answers (3)
Shoaibur Rahman
on 18 Dec 2014
I think the following code will serve your purpose. I assume that the text file is named textFile.txt and is saved in your working directory; otherwise, add the file path.
A few notes on the code to help you understand it (if you still have questions, please feel free to contact me):
- cellData is your text data in cell array form.
- The first for loop finds the row where your result data starts.
- The second set of for loops builds a matrix ResultData that contains all your result data, so you can use that matrix for further analyses. Each row of ResultData corresponds to one result row in the original text file, without the leading label R.
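filename = 'textFile.txt';   % add the full path if the file is not in the working directory
delimiter = ' ';
formatSpec = '%s%f%f%f%f%f%f%f%[^\n\r]';
fileID = fopen(filename,'r');
dataArray = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'MultipleDelimsAsOne', false, 'ReturnOnError', false);
fclose(fileID);
DataIndex = 2:8;
dataArray(DataIndex) = cellfun(@(x) num2cell(x), dataArray(DataIndex), 'UniformOutput', false);
cellData = [dataArray{1:end-1}];
% find the first row whose label is 'R'
RstartRow = [];
for k = 1:size(cellData,1)
    if strcmp(cellData{k,1}, 'R')
        RstartRow = k;
        break
    end
end
% result rows recur every 160 rows, starting from the first one
R_rows = RstartRow:160:size(cellData,1);
% copy the numeric columns of each result row into ResultData
ResultData = zeros(length(R_rows), size(cellData,2)-1);
for k = 1:length(R_rows)
    for kk = 2:size(cellData,2)
        ResultData(k,kk-1) = cellData{R_rows(k),kk};
    end
end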
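Selecting rows by a fixed stride relies on the 'R' rows recurring at exactly the same interval throughout the file. The sketch below, on made-up toy data with a stride of 4 instead of 160, shows one way to check that assumption and to select the rows by label instead. The names labels, values, stride and isPeriodic are illustrative only.

```matlab
% Toy data: a label column and a numeric block (hypothetical values).
labels = {'C';'C';'R';'C';'C';'C';'R';'C';'C';'C';'R'};
values = (1:numel(labels)).' * [1 10];   % 11-by-2 matrix, row k is [k 10*k]

% Select result rows by label rather than by position.
isR  = strcmp( labels, 'R' );
rIdx = find( isR );
ResultData = values(isR, :);

% Check whether the fixed-stride assumption actually holds.
stride = 4;                              % 160 in the original question
isPeriodic = all( diff(rIdx) == stride );

% Stride-based selection, valid only if isPeriodic is true.
strided = values(rIdx(1):stride:end, :);
```

If isPeriodic comes out false on real data, the label-based mask still returns every 'R' row, whereas the strided selection would silently pick up other rows.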
3 Comments
Shoaibur Rahman
on 28 Dec 2014
Hi,
Thank you. Let's go through this together. To do so, we first take a simple example:
out = cellfun(@mean, {1:10,1:5})
This computes the mean of each of the two vectors 1:10 and 1:5. Each output is a scalar of the same type, so 'UniformOutput' can stay true, which is the default.
Now, dataArray = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'MultipleDelimsAsOne', false, 'ReturnOnError', false); returns dataArray, whose cells differ in size and type (both cell and double).
To convert all of them into cells, we use the function handle @(x) num2cell(x), where x is a dummy variable that receives each element of dataArray(DataIndex).
We want to do this only for the data we are interested in, which is selected by dataArray(DataIndex).
Finally, because each output is nonscalar and may be of a different size, we set 'UniformOutput' to false.
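The difference between the two modes can be tried on a small example. This is a sketch with made-up data: the three-column dataArray below only imitates the shape of what textscan returns (one cell column of labels, two numeric columns) and is not taken from the real file.

```matlab
% Uniform output: every call returns a scalar, so the result is a numeric array.
m = cellfun( @mean, {1:10, 1:5} );

% Imitation of a textscan result: a cell column of labels and two double columns.
dataArray = { {'C';'R'}, [0;0], [0.001;60.275] };
idx = 2:3;                                   % the numeric columns

% With the default 'UniformOutput' = true this call fails, because
% num2cell returns a nonscalar cell array for each input.
try
    cellfun( @(x) num2cell(x), dataArray(idx) );
    uniformFailed = false;
catch
    uniformFailed = true;
end

% With 'UniformOutput' = false each result is wrapped in its own cell.
dataArray(idx) = cellfun( @(x) num2cell(x), dataArray(idx), 'UniformOutput', false );

% Now every column is a cell array, so they can be concatenated side by side.
cellData = [dataArray{:}];                   % 2-by-3 cell array
```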
Sean de Wolski
on 16 Dec 2014
You'll have to use textscan which provides an option for skipping rows. If you provide a small file (1000 rows or so) we can probably help out more.
0 Comments
Sudharsana Iyengar
on 18 Dec 2014
Edited: Sudharsana Iyengar
on 18 Dec 2014
I don't know if this will help. When I looked at your sensor data, the first column consisted of strings while the remaining columns consisted of numbers. You can do one of the following:
1) Open your data file in Excel, sort it in ascending or descending order, and pick out the values manually.
2) Replace the string values with their ASCII codes. Your data contained C and M. The ASCII code for C is 01000011, for M it is 01001101, and for R it is 01010010. After making this replacement, you can import your data into MATLAB.
The data will be stored as a matrix, with the first column holding the ASCII codes and the remaining 7 columns holding the numbers. Then you can use the following code.
Your data will be stored in the variable untitled.
% note: MATLAB reads a literal such as 01000011 as the decimal number 1000011,
% so the imported first column must contain exactly these digits
j=1;k=1;l=1;
for i=1:size(untitled,1)
    if untitled(i,1)==01000011   % rows labelled C
        B(j,:)=untitled(i,2:8);j=j+1; % store the remaining 7 columns in a separate variable
    end
    if untitled(i,1)==01001101   % rows labelled M
        C(k,:)=untitled(i,2:8);k=k+1;
    end
    if untitled(i,1)==01010010   % rows labelled R
        D(l,:)=untitled(i,2:8);l=l+1;
    end
end
This will create three variables, B, C and D, holding the C, M and R rows respectively. Let me know if this was helpful.
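The character codes mentioned above can be checked in MATLAB itself, and once the first column is numeric the rows can be picked out with logical indexing instead of a loop. This is a sketch on a small made-up matrix: data and its values are illustrative, and it stores the plain character codes (67, 82) rather than the binary-looking literals.

```matlab
% Character codes of C, M and R, and the 8-bit binary text for R.
codes = double( 'CMR' );
bitsR = dec2bin( double('R'), 8 );

% A literal with leading zeros is still a decimal number in MATLAB.
literalIsDecimal = (01010010 == 1010010);

% Toy matrix: first column holds a character code, the rest hold numbers.
data = [ double('C') 1 2
         double('R') 3 4
         double('C') 5 6
         double('R') 7 8 ];

% Logical indexing: keep the rows labelled R, drop the code column.
D = data( data(:,1) == double('R'), 2:end );
```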
0 Comments