How to delete rows of characters in Text files?

Question

Lei on 1 Jan 2015

https://se.mathworks.com/matlabcentral/answers/168639-how-to-delete-rows-of-characters-in-text-files

Edited: per isakson on 9 Jan 2015

I am trying to read the data from lots of TXT files, but the files contain rows of characters as well as rows of numbers. How can I delete the rows with characters and create a new txt file with just the numerical data? An example of the txt data is as follows:

 *****************************************
* Log File Started 11:29:05 Wed Dec 31 2014
* Using PFC3D 4.00-182 (64-bit)
* Serial Number: 262-000-0000-00000
* By: 
*     
*****************************************
Fish>              
    1  1  1  1  1  1  1  1  1
    0  0  0  0  0  0  0  0  0
    0  0  0  0  0  0  0  0  0
    2  2  2  2  2  2  2  2  2
    0  0  0  0  0  0  0  0  0
Fish>          
*****************************************
* Log File Ended 11:29:07 Wed Dec 31 2014
*****************************************

I would like to delete all the headers and footers. I tried using the fgetl function, but only the headers were deleted.
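One way to drop both headers and footers with fgetl is to test every line rather than skip a fixed number of lines at the top. The following is only a minimal sketch under stated assumptions: the function name keep_numeric_lines and both file names are hypothetical, and a "numeric" line is assumed to contain nothing but digits, signs, decimal points, exponent letters and blanks.

```matlab
function keep_numeric_lines(in_name, out_name)
%KEEP_NUMERIC_LINES  Copy only the purely numeric lines of a text file.
%   Hypothetical helper; in_name and out_name are file names chosen by the caller.
    fin = fopen(in_name, 'r');
    assert(fin > 0, 'Cannot open input file %s', in_name);
    fout = fopen(out_name, 'w');
    assert(fout > 0, 'Cannot open output file %s', out_name);

    % A line is kept if it has at least one digit and otherwise only
    % digits, signs, dots, exponent letters and white space.
    num_line_xpr = '^[\s\d.+\-eE]*\d[\s\d.+\-eE]*$';

    tline = fgetl(fin);
    while ischar(tline)                 % fgetl returns -1 at end of file
        if ~isempty(regexp(tline, num_line_xpr, 'once'))
            fprintf(fout, '%s\n', tline);
        end
        tline = fgetl(fin);
    end
    fclose(fin);
    fclose(fout);
end
```

Called as keep_numeric_lines('cssm.txt', 'cssm_numeric.txt'), this would write only the rows of numbers to the new file; lines such as "Fish>" and the starred log lines fail the test and are skipped wherever they occur.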


Answer 1

Pourya Alinezhad on 1 Jan 2015

https://se.mathworks.com/matlabcentral/answers/168639-how-to-delete-rows-of-characters-in-text-files#answer_163767


Hi there, you can use the following lines of code:

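fid = fopen('txtfile.extension');   % replace with the name of your file
% nine numeric columns as in the example; skip the 8 header lines
textdata = textscan(fid, '%n%n%n%n%n%n%n%n%n', 'headerlines', 8);
fclose(fid);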

For footers, see here: http://nl.mathworks.com/matlabcentral/newsreader/view_thread/325668

1 Comment

Lei on 2 Jan 2015

In the referenced link, there is still only a method for removing the headers, no footers actually. Thank you for this anyway.


Answer 2

per isakson on 2 Jan 2015

https://se.mathworks.com/matlabcentral/answers/168639-how-to-delete-rows-of-characters-in-text-files#answer_163871

Edited: per isakson on 9 Jan 2015


There is no easy way to read blocks of numerical data that are embedded in text. That might not be quite true, as I just learned.

Here are three different functions, which read and parse the numerical block of the example file, cssm.txt, of the question.

cssm_1 is a straightforward use of textscan. There is no problem using it in this case because it is easy to determine the number of lines in the header and in the block of data, respectively. Matlab evolves gradually and it is easy to miss new behavior. With R2013a it is not necessary to set rows_of_data, the number of times the formatSpec is applied. "[...] and stops when it cannot match formatSpec to the data." is new in the documentation of R2014a.

cssm_2 is based on a different approach. The entire file is read into a string and regexp extracts the blocks of numerical data. str2num converts the blocks to numerical arrays. This function can handle many blocks.

cssm_3 covers the case where the beginning and end of the blocks of tabular data are indicated by special strings. In this case Fish> indicates both beginning and end. fileread reads the entire file into a string and regexp extracts the blocks between the beginning and end markers. textscan parses the blocks.

Run on R2013a

    >> num = read_block_demo(  )
    num(:,:,1) =
         1     1     1     1     1     1     1     1     1
         0     0     0     0     0     0     0     0     0
         0     0     0     0     0     0     0     0     0
         2     2     2     2     2     2     2     2     2
         0     0     0     0     0     0     0     0     0
    num(:,:,2) =
         1     1     1     1     1     1     1     1     1
         0     0     0     0     0     0     0     0     0
         0     0     0     0     0     0     0     0     0
         2     2     2     2     2     2     2     2     2
         0     0     0     0     0     0     0     0     0
    num(:,:,3) =
         1     1     1     1     1     1     1     1     1
         0     0     0     0     0     0     0     0     0
         0     0     0     0     0     0     0     0     0
         2     2     2     2     2     2     2     2     2
         0     0     0     0     0     0     0     0     0
    >>

where

    function    num = read_block_demo()
        filespec    = 'cssm.txt';
        data_frmt   = '%f%f%f%f%f%f%f%f%f';
        rows_of_data = 5;
        header_lines = 8;
        begin_xpr   = '\*{20,}\s+Fish>\s+';
        end_xpr     = '\s+Fish>\s+\*{20,}';
        num(:,:,1)  = cssm_1( filespec,data_frmt, rows_of_data, header_lines);
        num(:,:,2)  = cssm_2( filespec, 50 );
        num(:,:,3)  = cssm_3( filespec, data_frmt, begin_xpr, end_xpr );
        assert( all(all(num(:,:,2)==num(:,:,1)))            ...
            &&  all(all(num(:,:,3)==num(:,:,1)))            ...
            , 'The methods don''t return identical results' )
    end
    function    num = cssm_1(filespec, data_frmt, rows_of_data, header_lines )
        fid = fopen( filespec );
        cac = textscan( fid, data_frmt, rows_of_data    ...
                    ,   'Headerlines'   , header_lines  ...
                    ,   'CollectOutput' , true          );
        fclose( fid );
        num = cac{1};
    end
    function    num = cssm_2( filespec, block_size )
      cac = read_blocks_of_numerical_data( filespec, block_size );
      num = cac{1};
    end  
    function    num = cssm_3( filespec, data_frmt, begin_xpr, end_xpr )
      str = fileread( filespec );
      cac = regexp( str, ['(?<=',begin_xpr,').+(?=',end_xpr,')'], 'match' );
      cac = textscan( cac{1}, data_frmt, 'CollectOutput', true );
      num = cac{1};
    end  
    function out=read_blocks_of_numerical_data(filespec,block_size,delimiter )
    %   block_size  lower limit of number of characters in numerical block
    %
    %   Within a block all rows must have the same number of "columns". 
        narginchk( 2, 3 )
        buffer  = fileread( filespec );
        if nargin == 2 
            del_xpr = '[ ]+';
            trl_xpr = '[ ]*';
        else
            del_xpr = ['([ ]*',delimiter,'[ ]*)'];
            trl_xpr = ['([ ]*',delimiter,'?[ ]*)'];
        end
        num_xpr = '([+-]?(\d+(\.\d*)?)|(\.\d+))';
        sen_xpr = '([EeDd](\+|-)\d{1,3})?';  % optional scientific E notation
        num_xpr = [ num_xpr, sen_xpr ];
        nl_xpr  = '((\r\n)|\n)';
        row_xpr = cat( 2, '(^|', nl_xpr, ')[ ]*('  ...
                        , num_xpr, del_xpr, ')*'   ...
                        , num_xpr, trl_xpr, '(?='  ...
                        , nl_xpr,'|$)'             ); 
        blk_xpr = ['(',row_xpr,')+'];
        blocks  = regexp( buffer, blk_xpr, 'match' );
        is_long = cellfun( @(str) length(str)>=block_size, blocks );
        blocks(not(is_long)) = [];
        out = cell( 1, length( blocks ) ); 
        for jj = 1 : length( blocks )
            out{jj} = str2num( blocks{jj} );        
        end
    end
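The question also asks for a new txt file that contains just the numerical data. Once a block is available as a numeric array, writing it out could be sketched as below. This is an illustration, not part of the functions above: the small matrix stands in for num(:,:,1), and the output file name and the '%5g' column format are assumptions.

```matlab
% Stand-in for one numeric block returned by the functions above.
num = [1 1 1; 0 0 0; 2 2 2];

out_name = 'numeric_only.txt';            % hypothetical output file name
fid = fopen(out_name, 'w');
assert(fid > 0, 'Cannot open %s for writing', out_name);

% One '%5g' per column, then a newline, so each matrix row becomes one line.
frmt = [repmat('%5g', 1, size(num, 2)), '\n'];

% fprintf consumes its data column by column, hence the transpose.
fprintf(fid, frmt, num.');
fclose(fid);
```

The transpose matters: without it fprintf would walk down the columns of num and the rows in the file would come out as the columns of the matrix.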


I learned that textscan in this case handles the free text at the end of a file better than I thought it would.

@Lei, regarding "[...] there is still only a method for removing the headers, no footers actually": textscan actually removes (/ignores) the footer automagically in your example.
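The footer behavior can be tried without any file, since textscan also accepts text as its first argument. The snippet below is a small sketch with made-up three-column data; it assumes, as described above for R2013a and later, that textscan stops as soon as a line no longer matches the format.

```matlab
% Text that mimics the tail of the log file: data rows followed by a footer.
str = sprintf('%s\n'                                    ...
            , '    1  1  1'                             ...
            , '    0  0  0'                             ...
            , '    2  2  2'                             ...
            , 'Fish>'                                   ...
            , '*****************'                       ...
            , '* Log File Ended 11:29:07 Wed Dec 31 2014' );

% No row count is given; reading ends where 'Fish>' fails to match %f.
cac = textscan( str, '%f%f%f', 'CollectOutput', true );
num = cac{1};
disp( size( num ) )
```

When reading from a file instead of a string, the header still has to be skipped explicitly, for example with the 'Headerlines' option used in cssm_1.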
