Hello!
I have implemented parallel netCDF-4 writing of a 4-dimensional (time, z, y, x)
variable in our application. It replaces POSIX-based shared I/O into a
binary file.
The data is written to the file correctly.
As a next step towards running the application at petaflop scale, I am
planning to implement compute-node-level I/O, where only one core on each
compute node takes part in writing the output. The application already knows
the start indices and counts for each core.
I have gathered this information on one core of each compute node, and I can
write the gathered data block by block in a for loop.
My question is: is it possible to write this whole gathered array from the I/O
core in a single call?
So far, my understanding from the documentation is that for the nc_put_vara_type
functions, the lengths of the start and count arrays have to match the number of
dimensions of the variable.
Is it possible to pass instead a start vector and a count vector that
hold multiple values, one set per block, and supply the complete
gathered array to a single nc_put_vara_type call?
As a side note, the MPI standard guarantees that MPI_Gather orders the
data by rank. So if I gather the start and count from each core
into a single vector, the entries will correspond to each other.
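
For concreteness, the loop-based approach described above could be sketched
roughly as follows. This is only an illustration under stated assumptions, not
the application's actual code: the data is assumed to be of type double, the
gathered start and count values are assumed to be stored as flat arrays with
4 entries per block in rank order, and the names block_writer_fn,
write_gathered_blocks, memory_variable and write_block_to_memory are
hypothetical. An in-memory array stands in for the netCDF variable so that the
sketch runs without netCDF or MPI; in the real application the callback would
presumably wrap a call such as nc_put_vara_double(ncid, varid, start, count, data).

```c
#include <assert.h>
#include <stddef.h>

#define NDIMS 4 /* (time, z, y, x), as in the post */

/* Hypothetical callback type: writes one block described by start/count.
 * In the real application this would wrap an nc_put_vara_<type> call. */
typedef int (*block_writer_fn)(const size_t start[NDIMS],
                               const size_t count[NDIMS],
                               const double *data, void *ctx);

/* Number of elements in one block. */
static size_t block_size(const size_t count[NDIMS])
{
    size_t n = 1;
    for (int d = 0; d < NDIMS; ++d)
        n *= count[d];
    return n;
}

/* Write nblocks gathered blocks one after another.
 * starts and counts are flat arrays of nblocks * NDIMS entries in rank
 * order; data holds the blocks back to back in the same order.
 * Returns 0 on success, or the first nonzero status from the callback. */
static int write_gathered_blocks(size_t nblocks,
                                 const size_t *starts,
                                 const size_t *counts,
                                 const double *data,
                                 block_writer_fn write_block, void *ctx)
{
    assert(write_block != NULL);
    size_t offset = 0; /* position of the current block in data */
    for (size_t b = 0; b < nblocks; ++b) {
        const size_t *start = starts + b * NDIMS;
        const size_t *count = counts + b * NDIMS;
        int status = write_block(start, count, data + offset, ctx);
        if (status != 0)
            return status;
        offset += block_size(count);
    }
    return 0;
}

/* Stand-in for the netCDF variable: a row-major 4-D array in memory. */
typedef struct {
    size_t dims[NDIMS];
    double *values;
} memory_variable;

/* Example callback: copies one block into the in-memory array, with the
 * last dimension varying fastest. Returns -1 if the block is out of range. */
static int write_block_to_memory(const size_t start[NDIMS],
                                 const size_t count[NDIMS],
                                 const double *data, void *ctx)
{
    memory_variable *var = (memory_variable *)ctx;
    for (int d = 0; d < NDIMS; ++d)
        if (start[d] + count[d] > var->dims[d])
            return -1;

    size_t i = 0;
    for (size_t t = 0; t < count[0]; ++t)
        for (size_t z = 0; z < count[1]; ++z)
            for (size_t y = 0; y < count[2]; ++y)
                for (size_t x = 0; x < count[3]; ++x) {
                    size_t idx = (((start[0] + t) * var->dims[1]
                                   + start[1] + z) * var->dims[2]
                                   + start[2] + y) * var->dims[3]
                                   + start[3] + x;
                    var->values[idx] = data[i++];
                }
    return 0;
}
```

The offset into the gathered buffer is advanced by the product of each block's
counts, which relies on the blocks being stored in the same rank order as
their start and count entries.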
Thank you and regards,
Ketan Kulkarni