How to check a txt file is GBK format or UTF-8 format ?
12 views (last 30 days)
Show older comments
How to check a txt file is GBK format or UTF-8 format ?
0 Comments
Answers (1)
Shubham Dhanda
on 28 Jun 2023
Hi,
I understand that you want to find whether the encoding of the specified text file is GBK or UTF-8.
Below is the MATLAB code to check the encoding of a txt file:
% Specify the file path and name
filename = 'untitled.txt';
% Read the file as a binary stream
fid = fopen(filename, 'rb');
data = fread(fid);
fclose(fid);
% Check if the file is UTF-8 encoded
isUTF8 = isequal(data(1:3), [239; 187; 191]);
% Check if the file is GBK encoded
isGBK = false;
try
decodedText = native2unicode(data, 'GBK');
isGBK = true;
catch
% GBK decoding failed, indicating it's not GBK encoded
end
% Check the encoding
if isUTF8
disp('The file is in UTF-8 format.');
elseif isGBK
disp('The file is in GBK format.');
else
disp('The file encoding is not UTF-8 or GBK.');
% You can assume it is encoded in another format
end
Hope this helps.
0 Comments
See Also
Categories
Find more on Standard File Formats in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!