Work with Non-ASCII Characters in HDF5 Files
To enable sharing of HDF5 files across multiple locales, MATLAB® supports the use of non-ASCII characters in HDF5 files. This example shows you how to:
Create HDF5 files containing data set and attribute names that have non-ASCII characters using the high-level functions.
Create variable-length string data sets containing non-ASCII characters using the low-level functions.
Create Data Set and Attribute Names Containing Non-ASCII Characters
Create an HDF5 file containing a data set name and an attribute name that contain non-ASCII characters. To check that the names appear as expected, write data to the data set and display the file information.
Create a data set with a name (/数据集) that includes non-ASCII characters.
dsetName = ['/' char([25968 25454 38598])];
dsetDims = [5 2];
h5create('outfile.h5',['/grp1' dsetName],dsetDims,...
    'TextEncoding','UTF-8');
dataToWrite = rand(dsetDims);
h5write('outfile.h5',['/grp1' dsetName],dataToWrite);
Create an attribute name (屬性名稱) that includes non-ASCII characters and assign a value to the attribute.
attrName = char([23660 24615 21517 31281]);
h5writeatt('outfile.h5','/',attrName,'I am an attribute',...
    'TextEncoding','UTF-8');
Display information about the file and check if the attribute name and data set name appear correctly.
h5disp('outfile.h5')
HDF5 outfile.h5
Group '/'
    Attributes:
        '屬性名稱':  'I am an attribute'
    Group '/grp1'
        Dataset '数据集'
            Size:  5x2
            MaxSize:  5x2
            Datatype:   H5T_IEEE_F64LE (double)
            ChunkSize:  []
            Filters:  none
            FillValue:  0.000000
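The visual check above can also be done programmatically. The following is a minimal sketch, not part of the original example: it assumes that h5info accepts the same 'TextEncoding' name-value argument as h5create, and it writes to a temporary file (the tempname-based file name and the variables attrOK and dsetOK are illustrative) so that outfile.h5 is left untouched.

```matlab
% Sketch: confirm that non-ASCII names survive a write/read round trip.
% The temporary file name and the *OK variables are illustrative only.
fileName = [tempname '.h5'];
dsetName = ['/' char([25968 25454 38598])];      % '/数据集'
attrName = char([23660 24615 21517 31281]);      % '屬性名稱'

% Create the data set and the attribute with UTF-8 encoded names.
h5create(fileName,['/grp1' dsetName],[5 2],'TextEncoding','UTF-8');
h5writeatt(fileName,'/',attrName,'I am an attribute', ...
    'TextEncoding','UTF-8');

% Read the file structure back (assumes h5info supports 'TextEncoding').
info = h5info(fileName,'TextEncoding','UTF-8');

% Compare the stored names with the names that were written.
% h5info reports the data set name without its leading '/'.
attrOK = strcmp(info.Attributes(1).Name,attrName);
dsetOK = strcmp(info.Groups(1).Datasets(1).Name,dsetName(2:end));

delete(fileName);
```

If both attrOK and dsetOK are true, the names were stored and decoded as intended.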
Create Variable-Length String Data Containing Non-ASCII Characters
Create a variable-length string data set to store data containing non-ASCII characters using the low-level functions. Write the data to the data set. Check if the data is written correctly.
Create data containing non-ASCII characters.
dataToWrite = {char([12487 12540 12479]) 'hello' ...
    char([1605 1585 1581 1576 1575]); ...
    'world' char([1052 1080 1088]) ...
    char([954 972 963 956 959 962])};
disp(dataToWrite)
    'データ'    'hello'    'مرحبا'
    'world'     'Мир'      'κόσμος'
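To see why the character encoding matters for this data, it can help to compare the number of characters in each string with the number of bytes it occupies in UTF-8. This is an illustrative sketch that uses unicode2native and native2unicode; the variable names (words, numChars, numBytes, restored) are not part of the original example.

```matlab
% Sketch: character counts versus UTF-8 byte counts for sample strings.
words = {char([12487 12540 12479]), 'hello', char([1052 1080 1088])};

% Number of characters in each string.
numChars = cellfun(@numel,words);

% Number of bytes each string needs when encoded as UTF-8.
numBytes = cellfun(@(w) numel(unicode2native(w,'UTF-8')),words);

% Decoding the UTF-8 bytes restores the original characters.
restored = native2unicode(unicode2native(words{1},'UTF-8'),'UTF-8');
```

Here numChars is [3 5 3] and numBytes is [9 5 6]: the ASCII string needs one byte per character, while the Japanese and Cyrillic strings need three and two bytes per character. Byte length therefore varies from string to string even when the character counts match.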
To write this data to a file, create an HDF5 file, then define a group and a data set within that group.
Create the HDF5 file.
fileName = 'outfile.h5';
fileID = H5F.create(fileName,'H5F_ACC_TRUNC',...
    'H5P_DEFAULT', 'H5P_DEFAULT');
To create a group whose name contains non-ASCII characters, first configure the link creation property list so that link names are encoded as UTF-8.
lcplID = H5P.create('H5P_LINK_CREATE');
H5P.set_char_encoding(lcplID,H5ML.get_constant_value('H5T_CSET_UTF8'));
plist = 'H5P_DEFAULT';
Then, create the group (グループ).
grpName = char([12464 12523 12540 12503]);
grpID = H5G.create(fileID,grpName,lcplID,plist,plist);
Create a data set that contains variable-length string data with non-ASCII characters. First, configure its data type.
typeID = H5T.copy('H5T_C_S1');
H5T.set_size(typeID,'H5T_VARIABLE');
H5T.set_cset(typeID,H5ML.get_constant_value('H5T_CSET_UTF8'));
Now create the data set by specifying its name, data type, and dimensions.
dsetName = 'datasetUtf8';
dataDims = [2 3];
% The low-level functions expect dimensions in HDF5 (C) order,
% which is the reverse of the MATLAB order.
h5DataDims = fliplr(dataDims);
h5MaxDims = h5DataDims;
spaceID = H5S.create_simple(2,h5DataDims,h5MaxDims);
dsetID = H5D.create(grpID,dsetName,typeID,spaceID,...
    'H5P_DEFAULT','H5P_DEFAULT','H5P_DEFAULT');
Write the data to the data set.
H5D.write(dsetID,'H5ML_DEFAULT','H5S_ALL',...
    'H5S_ALL','H5P_DEFAULT',dataToWrite);
Read the data back.
dataRead = h5read('outfile.h5',['/' grpName '/' dsetName])
dataRead = 2×3 cell array
    {'データ'}    {'hello'}    {'مرحبا' }
    {'world' }    {'Мир'  }    {'κόσμος'}
Check if data in the file matches the written data.
isequal(dataRead,dataToWrite)
ans = logical
   1
Close the open identifiers.
H5D.close(dsetID);
H5S.close(spaceID);
H5T.close(typeID);
H5G.close(grpID);
H5P.close(lcplID);
H5F.close(fileID);
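The example above reads the data back with the high-level h5read function. As a complement, the following sketch performs the same round trip using only low-level functions, reading the data with H5D.read. It is a simplified variant under stated assumptions, not the original workflow: it writes to a temporary file, uses a 2-by-2 cell array, and places the data set directly under the root group so that no non-ASCII link names are involved. The variable roundTripOK is illustrative.

```matlab
% Sketch: low-level write and read of variable-length UTF-8 strings.
fileName = [tempname '.h5'];                     % illustrative temporary file
dataToWrite = {char([12487 12540 12479]) 'hello'; ...
               'world' char([1052 1080 1088])};  % 2-by-2 cell array

% Create the file and a variable-length UTF-8 string data type.
fileID = H5F.create(fileName,'H5F_ACC_TRUNC','H5P_DEFAULT','H5P_DEFAULT');
typeID = H5T.copy('H5T_C_S1');
H5T.set_size(typeID,'H5T_VARIABLE');
H5T.set_cset(typeID,H5ML.get_constant_value('H5T_CSET_UTF8'));

% Create the data space (dimensions flipped to HDF5 order) and the data set.
h5Dims = fliplr(size(dataToWrite));
spaceID = H5S.create_simple(2,h5Dims,h5Dims);
dsetID = H5D.create(fileID,'datasetUtf8',typeID,spaceID, ...
    'H5P_DEFAULT','H5P_DEFAULT','H5P_DEFAULT');

% Write the data, then release every identifier.
H5D.write(dsetID,'H5ML_DEFAULT','H5S_ALL','H5S_ALL','H5P_DEFAULT',dataToWrite);
H5D.close(dsetID);
H5S.close(spaceID);
H5T.close(typeID);
H5F.close(fileID);

% Reopen the file read-only and read the data set with H5D.read.
fileID = H5F.open(fileName,'H5F_ACC_RDONLY','H5P_DEFAULT');
dsetID = H5D.open(fileID,'/datasetUtf8');
dataRead = H5D.read(dsetID);
H5D.close(dsetID);
H5F.close(fileID);

% Compare what was read with what was written.
roundTripOK = isequal(dataRead,dataToWrite);
delete(fileName);
```

Closing the identifiers before reopening the file keeps the write and read phases independent, which makes it easier to tell whether a mismatch comes from writing or from reading.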
See Also
h5create | h5writeatt | h5info | h5disp