On Tue, May 20, 2014 at 9:46 AM, Timothy Stitt <Timothy.Stitt.9@xxxxxx> wrote:

> I then checked how to use the
> NETCDF-4 format instead and made the change to my write routine.

Now that you've got netCDF-4, you can compress, which should really help.
But still...

> I've now
> got my NC file in NETCDF-4 format but I'm still seeing the 2X file storage
> increase compared to my original ASCII file. Can you see any other
> problems with my file structure based on the ncdump command below?
>
> netcdf plate {
> dimensions:
>     Record_Lines = 4 ;
>     Line_Symbols = 87 ;
>     Record_Number = UNLIMITED ; // (11474 currently)
> variables:
>     char Record(Record_Number, Record_Lines, Line_Symbols) ;
>         Record:_Storage = "chunked" ;
>         Record:_ChunkSizes = 1, 4, 87 ;

OK -- that does look like the old defaults. If I've got this right, your
chunks are 4*87=348 bytes -- that's pretty small. In some limited
experiments, I found you want chunks of at least KB size, and MB is
probably better. You might try:

1024, 4, 87

and see how it works.

-Chris

> // global attributes:
>     :_Format = "netCDF-4" ;
> }
>
> The file sizes are as follows:
>
> 2.2M May 13 16:03 plate.10000 (original ASCII file with 4*10000 lines -
> 10000 records, 4 lines per record)
> 4.5M May 20 12:38 plate.nc
>
> Thanks in advance for your help,
>
> Tim.
> ______________________________________________
> Tim Stitt PhD
> User Support Manager (CRC)
> Research Assistant Professor (Computer Science & Engineering)
> Room 108, Center for Research Computing, University of Notre Dame, IN 46556
> Email: tstitt@xxxxxx
>
>
> On 5/20/14, 11:43 AM, "Rob Latham" <robl@xxxxxxxxxxx> wrote:
>
> >On 05/19/2014 09:52 AM, Timothy Stitt wrote:
> >> Hi all,
> >>
> >> I've been trying to convert a large (40GB) ASCII text file (composed of
> >> multiple records of 4 line ASCII strings about 90 characters long) into
> >> NetCDF format.
> >> My plan was to rewrite the original serial code to use
> >> parallel NetCDF to have many MPI processes concurrently read records and
> >> process them in parallel.
> >>
> >> I was able to write some code to convert the ASCII records into
> >> [unlimited][4][90] NetCDF NC_CHAR arrays, which I was able to read
> >> concurrently via parallel NetCDF routines. My question is related to the
> >> size of the converted NetCDF file.
> >>
> >> I notice that the converted NetCDF file is always double the size of the
> >> ASCII file whereas I was hoping for it to be much reduced. I was
> >> therefore wondering if this is expected or is more due to my bad
> >> representation in NetCDF of the ASCII records? I am using
> >> nc_put_vara_text() to write my records. Maybe I need to introduce
> >> compression that I'm not doing already?
> >
> >Are you using the classic file format or the NetCDF-4 file format?
> >
> >Can you provide an ncdump -h of the new file?
> >
> >==rob
> >
> >>
> >> Thanks in advance for any advice you can provide.
> >>
> >> Regards,
> >>
> >> Tim.
> >>
> >>
> >> _______________________________________________
> >> netcdfgroup mailing list
> >> netcdfgroup@xxxxxxxxxxxxxxxx
> >> For list information or to unsubscribe, visit:
> >> http://www.unidata.ucar.edu/mailing_lists/
> >>
> >
> >--
> >Rob Latham
> >Mathematics and Computer Science Division
> >Argonne National Lab, IL USA

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@xxxxxxxx
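
The chunk-size arithmetic in the reply above can be checked with a small sketch. It is written in Python purely for illustration and does not call the netCDF library; the helper names (chunk_bytes, records_per_chunk) and the 1 MiB target are assumptions, not anything from the thread. The only inputs taken from the thread are the chunk shapes, the 4 x 87 record shape, the 11474 record count shown by ncdump, and the 1-byte size of a char element.

```python
from math import ceil, prod

# Size in bytes of one NC_CHAR element (1 byte per character).
CHAR_SIZE = 1


def chunk_bytes(chunk_shape, elem_size=CHAR_SIZE):
    """Return the uncompressed size in bytes of one chunk."""
    return prod(chunk_shape) * elem_size


def records_per_chunk(target_bytes, record_shape=(4, 87), elem_size=CHAR_SIZE):
    """Return how many records are needed for a chunk to reach target_bytes.

    record_shape is (Record_Lines, Line_Symbols) from the ncdump header.
    """
    per_record = prod(record_shape) * elem_size
    return max(1, ceil(target_bytes / per_record))


# Chunk shape reported by ncdump: one record per chunk.
default_chunk = chunk_bytes((1, 4, 87))

# Chunk shape suggested in the reply: 1024 records per chunk.
suggested_chunk = chunk_bytes((1024, 4, 87))

# Hypothetical target of 1 MiB per chunk (an assumption for illustration).
records_for_1mib = records_per_chunk(1024 * 1024)

# Uncompressed size of the whole Record variable at 11474 records.
raw_bytes = chunk_bytes((11474, 4, 87))

print(f"default chunk:    {default_chunk} bytes")
print(f"suggested chunk:  {suggested_chunk} bytes")
print(f"records for 1MiB: {records_for_1mib}")
print(f"raw variable:     {raw_bytes} bytes")
```

With one record per chunk, every 348-byte chunk has its own index and storage overhead in the file, which is one plausible contributor to the size difference. The fixed-width array itself also takes 348 bytes per record whatever the actual line lengths are, so compression is likely needed before the file can drop below the size of the ASCII original.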