How to extract data from he5 files
I have downloaded MLS data in .he5 format and want to extract the data from the files. This is my first time working with HDF files. Could anyone suggest how to extract data from multiple .he5 files?
Any help would be appreciated.
I have attached four files as a zip archive.
Answers (1)
  Manish
 on 9 Oct 2024
Hi,
I understand that you want to extract the data from multiple .he5 files.
The code below reads every .he5 file in a specified directory, collects the datasets into a structure with valid field names, and saves the result to a single .mat file named 'all_datasets.mat'.
It recursively explores the groups and datasets within each .he5 file, so that all data is consolidated into one file for easy access.
To use it, set the 'fileDir' variable in the code below to the folder that contains your extracted files:
fileDir = 'C:\Users\Test_files'; % backslashes need no escaping in single-quoted MATLAB text
fileList = dir(fullfile(fileDir, '*.he5'));
allData = struct();
for k = 1:length(fileList)
    fileName = fullfile(fileDir, fileList(k).name);
    % Display the file being processed
    disp(['Processing file: ', fileList(k).name]);
    fileInfo = h5info(fileName);
    % Recursively explore groups and datasets
    allData = exploreGroups(fileName, fileInfo.Groups, allData, fileList(k).name);
end
% Save each top-level field of allData as a separate variable in one MAT-file
save('all_datasets.mat', '-struct', 'allData');
function allData = exploreGroups(fileName, groups, allData, baseFileName)
    for i = 1:length(groups)
        disp(['Group: ', groups(i).Name]);
        % Explore datasets within the current group
        for j = 1:length(groups(i).Datasets)
            % groups(i).Name is already the full group path (with leading '/')
            datasetName = [groups(i).Name, '/', groups(i).Datasets(j).Name];
            disp(['Dataset: ', datasetName]);
            try
                % Attempt to read the dataset
                data = h5read(fileName, datasetName);        
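                % Build valid field names from the file name and the full dataset
                % path, so same-named datasets in different groups are kept apart
                fileField = matlab.lang.makeValidName(baseFileName);
                dataField = matlab.lang.makeValidName(datasetName);
                % Add the data to the structure, nested per file
                allData.(fileField).(dataField) = data;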
            catch ME
                disp(['Error reading dataset: ', ME.message]);
            end
        end
        % Recursively explore subgroups
        if ~isempty(groups(i).Groups)
            allData = exploreGroups(fileName, groups(i).Groups, allData, baseFileName); % Recursive call
        end
    end
end
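To see how the group names returned by h5info combine into the dataset path that h5read expects, here is a minimal sketch on a throwaway file. The file layout and the names 'Demo' and 'Value' are made up for illustration and are not taken from the attached MLS files; only h5create, h5write, h5info and h5read are used.

```matlab
% Build a small temporary HDF5 file (hypothetical layout, not real MLS data)
demoFile = [tempname, '.he5'];
demoPath = '/HDFEOS/SWATHS/Demo/Data Fields/Value';
h5create(demoFile, demoPath, [1 3]);   % intermediate groups are created automatically
h5write(demoFile, demoPath, [1 2 3]);

% h5info returns nested Groups; each group's Name holds its full path
info = h5info(demoFile);
grp = info.Groups(1);
while ~isempty(grp.Groups)             % walk down to the innermost group
    grp = grp.Groups(1);
end

% Full dataset path = group path + '/' + dataset name
dsetPath = [grp.Name, '/', grp.Datasets(1).Name];
disp(dsetPath);

% Read the dataset back using that path
value = h5read(demoFile, dsetPath);
disp(value);

delete(demoFile);                      % clean up the temporary file
```

On your own files, running h5disp on a single .he5 file first is a quick way to check which groups and datasets it contains before extracting everything.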
Hope this helps!