h5create also creates an extendable (unlimited) dataset if Size == 0

According to the h5create() documentation, you can create a dataset with unlimited dimension(s) by specifying Inf for any of the Size dimensions.
However, when creating an empty dataset with Size == 0 (for a test case), I found that this also marks the dataset as extendable: I got an error message saying that I had forgotten to specify the ChunkSize, which is required for extendable datasets. This undocumented feature can be seen on line 177 of h5create.m:
% Setup Extendable.
options.Extendable = false(1,numel(options.Size));
options.Extendable(isinf(options.Size)) = true;
options.Extendable(options.Size == 0) = true; % <- line 177
It is not unreasonable to treat Size == 0 as an indicator that (more) data will follow later, which requires extendability and therefore a ChunkSize. If that is the intended behaviour, please update the documentation accordingly.
An alternative approach could be to let the presence of ChunkSize determine whether a dataset with Size == 0 is extendable. That would require a change to h5create as well, and might break existing code that relies on the current error behaviour.
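The behaviour described above can be reproduced with a short script. This is a minimal sketch, assuming a release in which h5create contains the lines quoted above (R2024b in this thread); the scratch file name, the dataset name '/empty', and the chunk size are arbitrary choices made for illustration.

```matlab
% Hypothetical scratch file; tempname only generates a unique name.
filename = [tempname '.h5'];

% Size contains a 0 and no ChunkSize is given: per the observation above,
% h5create treats that dimension as extendable and raises an error.
gotError = false;
try
    h5create(filename, '/empty', [0 10]);
catch err
    gotError = true;
    disp(err.message);
end

% Same Size, now with a ChunkSize: the call is expected to succeed.
h5create(filename, '/empty', [0 10], 'ChunkSize', [10 10]);

% Inspect the current and maximum extent of the new dataset.
info = h5info(filename, '/empty');
disp(info.Dataspace.Size);
disp(info.Dataspace.MaxSize);
```

Comparing Dataspace.Size with Dataspace.MaxSize shows whether the zero-sized dimension was in fact created as unlimited on a given release.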
  1 Comment
Abhipsa on 6 Jun 2025
I have also observed this behavior: when using h5create with Size == 0, MATLAB throws an error unless ChunkSize is explicitly specified.


Accepted Answer

Deepak on 16 Jun 2025
I understand that you are trying to create an initially empty HDF5 dataset with "h5create" in MATLAB by specifying Size == 0, and that this triggers an error unless "ChunkSize" is provided. This happens because "h5create" internally treats any zero-sized dimension as extendable, just as it does for "Inf", although this is not currently documented. As a result, a Size containing 0 requires a valid "ChunkSize", as for any extendable dataset.
While using a literal 0 in the "Size" argument may appear intuitive, the recommended and supported way to create an empty, extendable dataset is by using "Inf" in the corresponding dimension. For example,
h5create(filename, '/data', [Inf 10], 'ChunkSize', [10 10])
creates a dataset with an initial size of [0 10], allowing future expansion along the first dimension. This approach ensures compatibility and avoids ambiguity in "h5create" behavior.
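As a minimal sketch of how such a dataset can then be grown, the snippet below appends rows with "h5write" using its start and count arguments. The scratch file name and the random test data are assumptions made for illustration only.

```matlab
% Hypothetical scratch file; tempname only generates a unique name.
filename = [tempname '.h5'];

% Unlimited first dimension, fixed second dimension of 10 columns.
h5create(filename, '/data', [Inf 10], 'ChunkSize', [10 10]);

% Write the first 5 rows: start is the 1-based index of the block's
% first element, count is the size of the block being written.
h5write(filename, '/data', rand(5, 10), [1 1], [5 10]);

% Append 5 more rows directly below the first block.
h5write(filename, '/data', rand(5, 10), [6 1], [5 10]);

% The dataset should now span 10 rows along the unlimited dimension.
info = h5info(filename, '/data');
disp(info.Dataspace.Size);
```

Each write that reaches beyond the current extent enlarges the dataset along the unlimited dimension, so no separate resize call is needed in this pattern.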
Please find attached the documentation of "h5create" as reference:
I hope this helps.
  1 Comment
Ernst van der Pols on 18 Jun 2025
Thanks for confirming my observation. At least this topic now serves as an addendum to the documentation.



Release

R2024b
