Errors while reading binary data files
I am trying to read binary files with, for example, uint32 data entries. The function reads the data correctly up to a certain element and then returns unexpected values (which I am sure do not exist in the original file). In some cases these values blow up to very large numbers: for example, I was reading a uint32 data file with a maximum value of ~8000, and the maximum of the data read is ~4.127*10^9. The code I am using is shown below. (Note: the asterisk has no effect; I repeated this with different files and checked the data using other programming languages and software.)
function [X,Y,Z,Volume] = GetBin(filename,volumeSize,nOfBytes)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Reads a binary formatted file into a 3D MATLAB matrix.
%
% INPUT:
% filename: string, name of binary file for reading
%
% OUTPUT:
% X,Y,Z: integer, size of matrix in cartesian coordinates
% Volume: integer 3D matrix, voxel values (labels)
% Ahmed Zankoor , April 2021
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
file = "data\" + filename;
fid = fopen(file, 'rt');
if fid == -1
error('Cannot open file for reading: %s', file);
end
X = volumeSize(1);Y = volumeSize(2);Z = volumeSize(3);
% Read the binary file. Depending on nOfBytes, each entry is read as an
% 8-bit unsigned integer (uint8, 1 byte), a 16-bit unsigned integer (uint16, 2 bytes),
% or a 32-bit unsigned integer (uint32, 4 bytes).
if nOfBytes == 1
data = fread(fid,Inf,'*uint8');
data = uint8(data);
elseif nOfBytes == 2
data = fread(fid,Inf,'*uint16');
data = uint16(data);
elseif nOfBytes == 4
data = fread(fid,Inf,'*uint32');
data = uint32(data);
else
error('Unrecognized number of bytes per entry.')
end
fclose(fid);
if length(data)~= X*Y*Z
disp('Size of data does not match size of Volume.')
disp(['Size of data = ' num2str(length(data))])
disp(['Size of volume = ' num2str(X*Y*Z)])
end
Z = floor(length(data)/(X*Y));
data = data(1:X*Y*Z);
Volume = reshape(data,X,Y,Z);
end
For visualization, the image attached shows an example of the read data, where the top is correctly read data and then the mess below is because of the errors. I wonder if anyone knows why this may be happening?
Thank you.
Accepted Answer
Jan on 21 Apr 2021 (edited on 21 Apr 2021)
The problem is hidden here:
fid = fopen(file, 'rt');
This opens the file in text mode on Windows. In text mode some bytes are treated as control characters instead of data: for example, a carriage return followed by a line feed is collapsed to a single line feed, a CHAR(8) is converted to a backspace, which means that the preceding byte is deleted, and ^Z is interpreted as end of file. Therefore a file with arbitrary bytes can yield fewer characters after the import than the file has bytes on disk. Once a single byte is lost, every following uint32 is assembled from misaligned bytes, which explains the very large values.
The solution is easy and makes the code platform-independent: open the file in binary mode by omitting the 't':
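To illustrate why a single lost byte produces huge numbers, here is a minimal sketch that drops one byte by hand and reinterprets the rest as uint32. It does not reproduce the text-mode conversion itself; the values and the position of the dropped byte are made up for illustration, and it assumes a little-endian machine, since typecast uses the native byte order.

```matlab
% Three uint32 entries, all equal to 8000 (illustrative values only).
values = uint32([8000; 8000; 8000]);

% View the same memory as raw bytes: 4 bytes per entry, 12 bytes in total.
bytes = typecast(values, 'uint8');

% Simulate one byte being swallowed during the import (position chosen arbitrarily).
bytes(3) = [];

% Reassemble as many whole uint32 entries as the remaining bytes allow.
nWhole  = floor(numel(bytes) / 4);
shifted = typecast(bytes(1:4*nWhole), 'uint32');

disp(shifted)   % both entries are now far larger than 8000
```

After the missing byte, each entry borrows a byte from its neighbour, so small values turn into numbers on the order of 10^9.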
fid = fopen(file, 'r');
I prefer this for text files too, because the old DOS control characters are a common source of unexpected behaviour. The interpretation in text mode also costs runtime.
A hint: fread(fid, inf, '*uint8') already returns a UINT8 array, so you can omit lines such as:
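One way to check the fix is a round trip: write a small uint32 file that deliberately contains the problematic byte patterns, then read it back in binary mode. This is only a sketch; the temporary file name and the test values are assumptions, and the byte order is fixed to little-endian so that 2573 is stored as the bytes 13, 10, 0, 0 (a CR LF pair).

```matlab
% Test values chosen to contain "dangerous" bytes (illustrative only):
% 26 = ^Z, 2573 = bytes 13,10 (CR LF) in little-endian order, 8 = backspace.
original = uint32([1; 26; 2573; 8000; 8]);

% Write the values to a temporary file in binary mode.
tmpFile = [tempname '.bin'];
fid = fopen(tmpFile, 'w', 'ieee-le');
assert(fid ~= -1, 'Cannot open %s for writing.', tmpFile);
fwrite(fid, original, 'uint32');
fclose(fid);

% Read them back with 'r' (binary mode, no 't').
fid = fopen(tmpFile, 'r', 'ieee-le');
assert(fid ~= -1, 'Cannot open %s for reading.', tmpFile);
readBack = fread(fid, Inf, '*uint32');
fclose(fid);
delete(tmpFile);

fprintf('Entries written: %d, entries read: %d\n', numel(original), numel(readBack));
```

Replacing 'r' with 'rt' in the second fopen on Windows is a quick way to see whether the entry counts start to differ on a given setup.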
data = uint8(data);
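The effect of the asterisk can be checked with a short sketch like the following, which reads the same bytes once with '*uint8' and once with 'uint8' and compares the classes. The temporary file and its contents are made up for the example.

```matlab
% Write three bytes to a temporary file (illustrative data).
tmpFile = [tempname '.bin'];
fid = fopen(tmpFile, 'w');
fwrite(fid, [10 20 30], 'uint8');
fclose(fid);

fid = fopen(tmpFile, 'r');
asSource = fread(fid, Inf, '*uint8');   % asterisk: output keeps the source class
frewind(fid);                           % go back to the start of the file
asDouble = fread(fid, Inf, 'uint8');    % no asterisk: output is double
fclose(fid);
delete(tmpFile);

disp(class(asSource))
disp(class(asDouble))
```

The values are the same either way; only the class of the output differs, so the extra cast after a '*uint8' read is redundant.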