Re: [netcdfgroup] NetCDF for parallel usage

Hi Rob & Ed,

I think that the machine I am using is not that bad. It was commissioned in
2012. Some basic info:

Performance:
  360 TFLOPS peak and 304 TFLOPS sustained on LINPACK
Hardware:
  HP blade system C7000 with BL460c Gen8 blades
  1,088 nodes with 300 GB disk per node (319 TB)
  2,176 Intel Xeon E5 2670 processors @ 2.6 GHz
  17,408 processor cores, 68 TB main memory
  FDR InfiniBand based fully non-blocking fat-tree topology
  2 PB high performance storage with Lustre parallel file system

----

Using netCDF configured for parallel applications, I did manage to write
data to a single netCDF file using 512 procs, but only after I reduced the
grid nodes per proc to about 20-30. When I raised the grid nodes per proc
to about 100, I got this error as well:

NetCDF: HDF error
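
For reference, here is a minimal sketch of how each rank's slab (start index
and count along each dimension) could be computed when all ranks write into
one shared file. The 800 x 800 x 800 grid and the 8 x 8 x 8 process layout
are assumptions, chosen only to match 512 procs with about 100 grid nodes per
proc in each direction. The function names are hypothetical and not part of
any netCDF API.

```python
# Sketch only: block decomposition of a 3D grid over a 3D process layout.
# The grid size, process layout, and function names are illustrative assumptions.

def split_1d(n_points, n_procs, coord):
    """Return (start, count) of the block owned by process `coord` along one axis.

    The first (n_points % n_procs) processes get one extra point, so the
    blocks tile the axis with no gaps or overlaps.
    """
    base, extra = divmod(n_points, n_procs)
    count = base + (1 if coord < extra else 0)
    start = coord * base + min(coord, extra)
    return start, count


def rank_to_coords(rank, layout):
    """Map a linear rank to per-axis coordinates, last axis varying fastest."""
    coords = []
    for n in reversed(layout):
        rank, c = divmod(rank, n)
        coords.append(c)
    return tuple(reversed(coords))


def slab_for_rank(rank, grid, layout):
    """Return (start, count) tuples, one entry per dimension, for this rank."""
    coords = rank_to_coords(rank, layout)
    pairs = [split_1d(n, p, c) for n, p, c in zip(grid, layout, coords)]
    start = tuple(s for s, _ in pairs)
    count = tuple(c for _, c in pairs)
    return start, count


if __name__ == "__main__":
    grid = (800, 800, 800)   # assumed global grid
    layout = (8, 8, 8)       # assumed layout: 8 * 8 * 8 = 512 procs
    for rank in (0, 1, 511):
        print(rank, slab_for_rank(rank, grid, layout))
```

The start values here are zero-based; a Fortran code would add 1 to each
before passing them as the start argument of a write call.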

----

There is another issue I need to share: while building netCDF-4 for
parallel usage, I encountered errors during 'make check' in these
scripts: run_par_test.sh, run_f77_par_test.sh and run_f90_par_test.sh

These were related to the mpiexec commands (an mpd.hosts issue). These errors
did not occur when I built netCDF for parallel use on my desktop.

----

Dumping output from each processor gave me the errors below. They do not
all appear together; which ones show up is somewhat random.

[proxy:0:13@cn0083] HYDT_bscu_wait_for_completion
(./tools/bootstrap/utils/bscu_wait.c:73): one of the processes terminated
badly; aborting
[proxy:0:13@cn0083] HYDT_bsci_wait_for_completion
(./tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for
completion
[proxy:0:13@cn0083] HYD_pmci_wait_for_childs_completion
(./pm/pmiserv/pmip_utils.c:1476): bootstrap server returned error waiting
for completion
[proxy:0:13@cn0083] main (./pm/pmiserv/pmip.c:392): error waiting for event
children completion

[mpiexec@cn0002] control_cb (./pm/pmiserv/pmiserv_cb.c:674): assert
(!closed) failed
[mpiexec@cn0002] HYDT_dmxu_poll_wait_for_event
(./tools/demux/demux_poll.c:77): callback returned error status
[mpiexec@cn0002] HYD_pmci_wait_for_completion
(./pm/pmiserv/pmiserv_pmci.c:388): error waiting for event
[mpiexec@cn0002] main (./ui/mpich/mpiexec.c:718): process manager error
waiting for completion

cn0137:b279:beba2700: 132021042 us(132021042 us!!!):  ACCEPT_RTU: rcv ERR,
rcnt=0 op=1 <- 10.1.1.136
cn1068:48c5:4b280700: 132013538 us(132013538 us!!!):  ACCEPT_RTU: rcv ERR,
rcnt=-1 op=1 <- 10.1.5.47
cn1075:dba3:f8d7700: 132099675 us(132099675 us!!!):  CONN_REQUEST: SOCKOPT
ERR Connection refused -> 10.1.1.51 16193 - RETRYING... 5
cn1075:dba3:f8d7700: 132099826 us(151 us):  CONN_REQUEST: SOCKOPT ERR
Connection refused -> 10.1.1.51 16193 - RETRYING...4
cn1075:dba3:f8d7700: 132099942 us(116 us):  CONN_REQUEST: SOCKOPT ERR
Connection refused -> 10.1.1.51 16193 - RETRYING...3
cn1075:dba3:f8d7700: 132100049 us(107 us):  CONN_REQUEST: SOCKOPT ERR
Connection refused -> 10.1.1.51 16193 - RETRYING...2
cn1075:dba3:f8d7700: 132100155 us(106 us):  CONN_REQUEST: SOCKOPT ERR
Connection refused -> 10.1.1.51 16193 - RETRYING...1
cn1075:dba3:f8d7700: 132100172 us(17 us): dapl_evd_conn_cb() unknown event
0x0
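
One commonly used way to reduce the number of files opened at once is to let
a subset of ranks collect data from their neighbours and do the writing. A
minimal sketch of the bookkeeping follows. The group size of 64 and the
function names are assumptions, and the actual gather and write steps (for
example via MPI) are not shown.

```python
# Sketch only: assign ranks to writer groups so that one rank per group
# opens a file, instead of every rank opening its own.
# The group size and function names are illustrative assumptions.

def writer_for(rank, group_size):
    """Return the rank that writes on behalf of `rank` (lowest rank in its group)."""
    return (rank // group_size) * group_size


def group_members(writer, group_size, n_ranks):
    """Return the ranks whose data `writer` collects; the last group may be smaller."""
    return list(range(writer, min(writer + group_size, n_ranks)))


def count_files(n_ranks, group_size):
    """Number of files opened when each group writes one file."""
    return -(-n_ranks // group_size)  # ceiling division


if __name__ == "__main__":
    n_ranks = 4000      # processor count estimated earlier in this thread
    group_size = 64     # assumed; would need tuning for the file system
    print("files:", count_files(n_ranks, group_size))
    print("writer for rank 130:", writer_for(130, group_size))
```

With these assumed numbers, the run would open a few dozen files rather than
one per rank; the right group size depends on the file system and would have
to be found by experiment.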

----

Rob, I guess I will need to look into the I/O methods you listed.

Thanks for your time,
Samrat.


On Fri, Oct 17, 2014 at 10:00 PM, Rob Latham <robl@xxxxxxxxxxx> wrote:

>
>
> On 10/17/2014 11:25 AM, Ed Hartnett wrote:
>
>> Unless things have changed since my day, it is possible to read pnetcdf
>> files with the netCDF library. It must be built with --enable-pnetcdf
>> and with-pnetcdf=/some/location, IIRC.
>>
>
> Ed!
>
> In this case, Samrat Rao was using pnetcdf to create CDF-5 (giant
> variable) formatted files.  To refresh your memory,  Argonne and
> Northwestern developed this file format with UCAR's signoff, with the
> understanding that we (ANL and NWU) would never expect UCAR to add support
> for it unless we did the work.  I took a stab at it a few years back and
> Wei-keng is taking a second crack at it right now.
>
> the classic file formats CDF-1 and CDF-2 are fully inter-operable between
> pnetcdf and netcdf.
> ==rob
>
>
>
>> On Fri, Oct 17, 2014 at 6:33 AM, Samrat Rao <samrat.rao@xxxxxxxxx
>> <mailto:samrat.rao@xxxxxxxxx>> wrote:
>>
>>     Hi,
>>
>>     I'm sorry for the late reply.
>>
>>     I have no classic/netcdf-3 datasets --- datasets are to be
>>     generated. All my codes are also new.
>>
>>     Initially I tried pnetcdf and wrote a few variables, but found
>>     that the format was CDF-5, which 'normal' netcdf would not read.
>>
>>     I also need to read some bits of netcdf data in Matlab, so I thought
>>     of sticking to the usual netcdf-4 compiled for parallel I/O. It is
>>     also likely that I will have to share my workload with others in my
>>     group and/or leave the code for future people to work on.
>>
>>     Does Matlab read CDF-5 files?
>>
>>     So I preferred the usual netcdf. Rob, I hope you are not annoyed.
>>
>>     But most of the above is for another day. Currently i am stuck
>>     elsewhere.
>>
>>     With a smaller number of processors, 216, the single netcdf file gets
>>     created (I create a single netcdf file for each variable), but for
>>     anything above that I get these errors:
>>     NetCDF: Bad chunk sizes.
>>     Not sure where these errors come from.
>>
>>     Then I switched to dumping output from each processor in simple
>>     binary. This works up to about 1500 processors. Above this number
>>     the code gets stuck and eventually aborts.
>>
>>     This issue is not new. My colleague too had problems with running
>>     his code on 1500+ procs.
>>
>>     Today I learned that opening a large number of files (each proc
>>     writes 1 file) can overwhelm the system. Solving this requires
>>     more than rudimentary techniques of writing, or an understanding of
>>     the system's inherent parameters/bottlenecks.
>>
>>     So netcdf is probably out of reach for now. I will try again once
>>     the simple binary write from each processor gets sorted out.
>>
>>     Does anyone have any suggestion?
>>
>>     Thanks,
>>     Samrat.
>>
>>
>>     On Thu, Oct 2, 2014 at 7:52 PM, Rob Latham <robl@xxxxxxxxxxx
>>     <mailto:robl@xxxxxxxxxxx>> wrote:
>>
>>
>>
>>         On 10/02/2014 01:24 AM, Samrat Rao wrote:
>>
>>             Thanks for your replies.
>>
>>             I estimate that I will require approx 4000 processors and
>>             a total grid resolution of 2.5 billion points for my F90
>>             code. So I need to think/understand which is better:
>>             parallel netCDF or the 'normal' one.
>>
>>
>>         There are a few specific nifty features in pnetcdf that can let
>>         you get really good performance, but 'normal' netCDF is a fine
>>         choice, too.
>>
>>             Right now I do not know how to use parallel-netCDF.
>>
>>
>>         It's almost as simple as replacing every 'nf' call with 'nfmpi'
>>         but you will be just fine if you stick with UCAR netCDF-4
>>
>>             Secondly, I hope that the netCDF-4 files created by either
>>             parallel netCDF or the 'normal' one are mutually compatible.
>>             For analysis I will be extracting data using the usual
>>             netCDF library, so if I use parallel-netCDF there should be
>>             no inter-compatibility issues.
>>
>>
>>         For truly large variables, parallel-netcdf introduced, with some
>>         consultation from the UCAR folks, a 'CDF-5' file format.  You
>>         have to request it explicitly, and then in that one case you
>>         would have a pnetcdf file that netcdf tools would not understand.
>>
>>         In all other cases, we work hard to keep pnetcdf and "classic"
>>         netcdf compatible.  UCAR NetCDF has the option of an HDF5-based
>>         backend -- and in fact it's not optional if you want parallel
>>         I/O with NetCDF-4 -- and that format is not compatible with
>>         parallel-netcdf.  By
>>         now, your analysis tools surely are updated to understand the
>>         new HDF5-based backend?
>>
>>         I suppose it's possible you've got some 6 year old analysis tool
>>         that does not understand NetCDF-4's HDF5-based file format.
>>         Parallel-netcdf would allow you to simulate with parallel i/o
>>         and produce a classic netCDF file.  But I would be shocked and a
>>         little bit angry if that was actually a good reason to use
>>         parallel-netcdf in 2014.
>>
>>
>>         ==rob
>>
>>
>>         --
>>         Rob Latham
>>         Mathematics and Computer Science Division
>>         Argonne National Lab, IL USA
>>
>>
>>
>>
>>     --
>>
>>     Samrat Rao
>>     Research Associate
>>     Engineering Mechanics Unit
>>     Jawaharlal Centre for Advanced Scientific Research
>>     Bangalore - 560064, India
>>
>>
>>
>>
> --
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA
>



-- 

Samrat Rao
Research Associate
Engineering Mechanics Unit
Jawaharlal Centre for Advanced Scientific Research
Bangalore - 560064, India