Assumptions:
- the files have the same format and are "correct" - no checking needed
- the first character of all file names is "x"
- the concatenated file shall have one header
- the files may have trailing empty lines, which shall not be included in the concatenated file.
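function ccsm()
    % Concatenate all x*.csv files in the current directory into one file,
    % keeping only the header line of the first file.
    sad = dir( 'x*.csv' );
    fid_out = fopen( 'concatenated_files.csv', 'w' );
    for jj = 1 : length( sad )
        fid_in = fopen( sad(jj).name, 'r' );
        % read the whole file as one row of characters
        str = transpose( fread( fid_in, '*char' ) );
        fclose( fid_in );
        if not( jj == 1 )
            % drop the header line; the line break that follows it is kept
            % and separates this file from the previous one
            ix1 = regexp( str, '[\r\n]++', 'once' );
            str = str(ix1:end);
        end
        % strip trailing line breaks, if there are any
        ix2 = regexp( str, '[\r\n]++$', 'once' );
        if not( isempty( ix2 ) )
            str(ix2:end) = [];
        end
        % 'char' is the documented precision for fwrite ('*char' belongs to fread)
        fwrite( fid_out, str, 'char' );
    end
    % close the target file and any file left open by mistake
    fclose('all');
end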
where there are four csv-files in the current directory with the content
rowhead, var1, var2, var3
d1, 0.5377, 0.3188, 3.5784
d2, 1.8339, -1.3077, 2.7694
d3, -2.2588, -0.4336, -1.3499
d4, 0.8622, 0.3426, 3.0349
.
Description
A file is a row of bytes on some storage medium. The program decides how to interpret those bytes.
- get a list of the source files, which shall be concatenated
- create an empty target file to put the result
- loop over all source files
- open the current source file
- read the file, interpret the bytes as characters, put the result in the variable, str
- close the current source file
- we want to keep the header line of the first source file and remove it for the others
- line breaks are indicated by either the two bytes "\r\n" or the single byte "\n". ix1 is the start position of the first group of "\n" and "\r" (any number, in any order).
- keep the bytes from the position, ix1, to the end, i.e. strip off the header line.
- find the starting position, ix2, of the trailing group of "\n" and "\r"
- there might not be any line breaks at the end of the file
- strip off the line breaks at the end
- write the remaining row of characters to the target file.
- close the target file (and others, which might be open by mistake)
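The two regexp steps can be tried out without touching any files. The following is only a minimal sketch: the sample text and the variable name body are made up for illustration, and the character row stands in for the content that fread would return for one of the later source files.

```matlab
% Hypothetical stand-in for the content of one source file:
% a header line, two data lines and two trailing line breaks.
str = sprintf( 'rowhead, var1\r\nd1, 0.5\r\nd2, 1.8\r\n\r\n' );

% Start position of the first group of line-break characters,
% i.e. the position right after the header line.
ix1 = regexp( str, '[\r\n]++', 'once' );
body = str(ix1:end);    % header removed, leading line break kept

% Start position of the trailing group of line-break characters.
ix2 = regexp( body, '[\r\n]++$', 'once' );
if not( isempty( ix2 ) )
    body(ix2:end) = [];    % strip the trailing line breaks
end
```

After these steps body starts with the line break that followed the header and ends with the last data line, so it can be appended directly after the content of the previous file.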
Fill in the details with the help of the MATLAB documentation!