I should also add: the problem remains even if I do the I/O for just the real variable, i.e. my file now is

From ncdump:

    netcdf ndb {
    dimensions:
            complex = 2 ;
            BS_K_linearized1 = 2025000000 ;
            BS_K_linearized2 = 781887360 ;
    variables:
            float BSE_RESONANT_LINEARIZED1(BS_K_linearized1, complex) ;
            float BSE_RESONANT_LINEARIZED2(BS_K_linearized1, complex) ;
            float BSE_RESONANT_LINEARIZED3(BS_K_linearized2, complex) ;
    }

From h5dump:

Variables (this is ok):

    DATASET "BSE_RESONANT_LINEARIZED1"
       SIZE 16200000000
       DATASPACE  SIMPLE { ( 2025000000, 2 ) / ( 2025000000, 2 ) }
    DATASET "BSE_RESONANT_LINEARIZED2"
       SIZE 16200000000
       DATASPACE  SIMPLE { ( 2025000000, 2 ) / ( 2025000000, 2 ) }
    DATASET "BSE_RESONANT_LINEARIZED3"
       SIZE 6255098880
       DATASPACE  SIMPLE { ( 781887360, 2 ) / ( 781887360, 2 ) }
Dimensions (this is bad): each dimension is stored as a vector whose length equals the dimension's own value:

    DATASET "BS_K_linearized1"
       SIZE 8100000000
       DATASPACE  SIMPLE { ( 2025000000 ) / ( 2025000000 ) }
    DATASET "BS_K_linearized2"
       SIZE 3127549440
       DATASPACE  SIMPLE { ( 781887360 ) / ( 781887360 ) }
    DATASET "complex"
       SIZE 8
       DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }

It seems to me that, in the HDF5 conversion, the value of the "netcdf dimension" is used as the size of the corresponding HDF5 dataset (?)
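For reference, the reported byte counts are consistent with 4-byte floats being allocated for the dimension datasets as well as for the variables:

    BSE_RESONANT_LINEARIZED1:  2,025,000,000 x 2 x 4 bytes = 16,200,000,000 bytes
    BS_K_linearized1:          2,025,000,000 x 4 bytes     =  8,100,000,000 bytes
    BS_K_linearized2:            781,887,360 x 4 bytes     =  3,127,549,440 bytes

i.e. each dimension really is materialized as a full-length array rather than stored as a scalar.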
The dimension is generated with the Fortran call

    ierr = nf90_def_dim(io_unit, dim_name, dim_value, dim_id)

D.
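For reference, a minimal sketch of this definition pattern through the Fortran 90 API (illustrative only: the dimension and variable names are taken from the ncdump output above, while the file name, create flags, and program structure are assumed):

    ! Minimal sketch, not the original code: create a NetCDF-4 file with one
    ! huge dimension and a float variable shaped as in the ncdump output.
    program def_sketch
      use netcdf
      implicit none
      integer :: ierr, ncid, dim_cplx, dim_lin1, varid

      ierr = nf90_create("ndb_sketch.nc", ior(NF90_CLOBBER, NF90_NETCDF4), ncid)

      ! nf90_def_dim only records the dimension length; it should not require
      ! a 2,025,000,000-element array to be written into the file.
      ierr = nf90_def_dim(ncid, "complex",          2,          dim_cplx)
      ierr = nf90_def_dim(ncid, "BS_K_linearized1", 2025000000, dim_lin1)

      ! In the Fortran API the dimension ids are listed fastest-varying first,
      ! so this corresponds to the CDL declaration
      !   float BSE_RESONANT_LINEARIZED1(BS_K_linearized1, complex) ;
      ierr = nf90_def_var(ncid, "BSE_RESONANT_LINEARIZED1", NF90_FLOAT, &
                          (/ dim_cplx, dim_lin1 /), varid)

      ierr = nf90_enddef(ncid)
      ierr = nf90_close(ncid)
    end program def_sketch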
On 15/05/20 11:12, Davide Sangalli wrote:

Dear all,
also moving to the latest version of the libraries, the problem remains:

    pkgname_netcdf=netcdf-c-4.7.4
    pkgname_netcdff=netcdf-fortran-4.5.2
    pkgname_pnetcdf=pnetcdf-1.12.1
    pkgname_hdf5=hdf5-1.12.0

Moreover, I noticed differences between running in serial and running in parallel. (I interrupted the two runs, so it may be that the I/O was not over.) Below, BS_K_linearized1 should just be a number (a dimension with netcdf).

SERIAL:

    DATASET "BS_K_linearized1" {
       DATATYPE  H5T_IEEE_F32BE
       DATASPACE  SIMPLE { ( 2025000000 ) / ( 2025000000 ) }
       STORAGE_LAYOUT {
          CONTIGUOUS
          SIZE 0
          OFFSET 18446744073709551615
       }
       FILTERS {
          NONE
       }
       FILLVALUE {
          FILL_TIME H5D_FILL_TIME_IFSET
          VALUE  H5D_FILL_VALUE_DEFAULT
       }
       ALLOCATION_TIME {
          H5D_ALLOC_TIME_LATE
       }
    }

PARALLEL:

    DATASET "BS_K_linearized1" {
       DATATYPE  H5T_IEEE_F32BE
       DATASPACE  SIMPLE { ( 2025000000 ) / ( 2025000000 ) }
       STORAGE_LAYOUT {
          CONTIGUOUS
          SIZE 8100000000
          OFFSET 2387
       }
       FILTERS {
          NONE
       }
       FILLVALUE {
          FILL_TIME H5D_FILL_TIME_IFSET
          VALUE  H5D_FILL_VALUE_DEFAULT
       }
       ALLOCATION_TIME {
          H5D_ALLOC_TIME_EARLY
       }
    }

I also tried to move to pnetcdf, but I have some issues for now.

Best,
D.

On 03/05/20 19:06, Dennis Heimbigner wrote:

One reason to use netcdf over HDF5 is the fact that the netcdf API is much simpler than the HDF5 API; the HDF5 API is some 6 times larger than the netcdf API.

On 5/3/2020 1:42 AM, Davide Sangalli wrote:

Thanks again. I'll have a look at pnetcdf.

Another reason why we moved towards HDF5 was that, as far as I know, it can exploit different levels of the memory hierarchy in HPC simulations. Could pnetcdf do that as well?

Besides that, I'd really like some hints on why netcdf could be better than HDF5, or vice versa. Please do your worst.

For NF90_UNLIMITED, we are already using it in time-dependent simulations in a way similar to the one you suggest. For the present case, instead, I'm just filling a huge complex matrix, and the interruption usually happens because there are limits on the simulation time, so I'd really need to check which elements were filled and which were not, without having any clue about the status.

Since you mentioned it: I'm very interested in the storage of sparse matrices. My huge matrix is indeed quite sparse. How does that work?

Best,
D
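A minimal sketch of such a record-based NF90_UNLIMITED pattern (purely illustrative; the file, dimension, and variable names and sizes here are assumed, not taken from the actual code or from the suggestion referred to above): each step is written as one record along the unlimited dimension, so an interrupted run simply leaves fewer records in the file.

    ! Illustrative sketch only: write one record per step along an unlimited
    ! dimension; a partially finished run still leaves a readable file whose
    ! record count shows how far the run got.
    program unlimited_sketch
      use netcdf
      implicit none
      integer, parameter :: n = 1000
      integer :: ierr, ncid, dim_n, dim_t, varid, step
      real    :: field(n)

      ierr = nf90_create("steps_sketch.nc", ior(NF90_CLOBBER, NF90_NETCDF4), ncid)
      ierr = nf90_def_dim(ncid, "n",    n,              dim_n)
      ierr = nf90_def_dim(ncid, "time", NF90_UNLIMITED, dim_t)
      ierr = nf90_def_var(ncid, "field", NF90_FLOAT, (/ dim_n, dim_t /), varid)
      ierr = nf90_enddef(ncid)

      do step = 1, 10
        field = real(step)   ! stand-in for the data of this step
        ierr = nf90_put_var(ncid, varid, field, &
                            start=(/ 1, step /), count=(/ n, 1 /))
      end do

      ierr = nf90_close(ncid)
    end program unlimited_sketch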
On Sun, May 3, 2020 at 12:48 AM +0200, "Wei-Keng Liao" <wkliao@xxxxxxxxxxxxxxxx> wrote:

Hi, Dave

Thanks for following up with the correct information about the dimension objects. I admit that I am not familiar with the NetCDF4 dimension representation in HDF5.

Wei-keng

> On May 2, 2020, at 5:28 PM, Dave Allured - NOAA Affiliate wrote:
>
> Wei-keng, thanks for the info on the latest release. Minor detail: I found that hidden dimension scales are still stored as arrays, but the arrays are left unpopulated. HDF5 stores these as sparse data, which means no wasted space for arrays that are never written.
>
> For Davide, I concur with Wei-keng that netcdf-C 4.7.4 is okay for your purpose and should not store wasted space. Version 4.7.3 behaves the same as 4.7.4.
>
> I wonder when they changed that; some time between your 4.4.1.1 and 4.7.3. Also, you used HDF5 1.8.18, I used 1.10.5. That should not make any difference here, but perhaps it does.
>
> > On Sat, May 2, 2020 at 1:01 PM Wei-Keng Liao wrote:
> >
> > If you used the latest NetCDF 4.7.4, the dimensions will be stored as scalars.
> >
> > Wei-keng
> >
> > > On May 2, 2020, at 1:42 PM, Davide Sangalli wrote:
> > >
> > > Yeah, but BS_K_linearized1 is just a dimension, how can it be 8 GB big?
> > > Same for BS_K_linearized2, how can it be 3 GB big?
> > > These two are just two numbers:
> > >   BS_K_linearized1 = 2,025,000,000
> > >   (it was chosen as a maximum variable size in my code to avoid overflowing the maximum allowed integer in standard precision)
> > >   BS_K_linearized2 = 781,887,360
> > >
> > > D.
> > >
> > > On 02/05/20 19:06, Wei-Keng Liao wrote:
> > > > The dump information shows there are actually 8 datasets in the file.
> > > > Below are the start offsets, sizes, and end offsets of the individual datasets.
> > > > There is not much padding space in between the datasets.
> > > > According to this, your file is expected to be of size 16 GB.
> > > >
> > > > dataset name                    start offset    size            end offset
> > > > BS_K_linearized1                2,379           8,100,000,000   8,100,002,379
> > > > BSE_RESONANT_COMPRESSED1_DONE   8,100,002,379   2,025,000,000   10,125,002,379
> > > > BSE_RESONANT_COMPRESSED2_DONE   10,125,006,475  2,025,000,000   12,150,006,475
> > > > BS_K_linearized2                12,150,006,475  3,127,549,440   15,277,555,915
> > > > BSE_RESONANT_COMPRESSED3_DONE   15,277,557,963  781,887,360     16,059,445,323
> > > > complex                         16,059,447,371  8               16,059,447,379
> > > > BS_K_compressed1                16,059,447,379  99,107,168      16,158,554,547
> > > > BSE_RESONANT_COMPRESSED1        16,158,554,547  198,214,336     16,356,768,883
> > > >
> > > > Wei-keng
> > > >
> > > > > On May 2, 2020, at 11:28 AM, Davide Sangalli wrote:
> > > > >
> > > > > h5dump -Hp ndb.BS_COMPRESS0.005000_Q1