Thanks Antonio! I'll definitely give this idea a shot.

Is there any performance hit if listing several thousand files in the catalog (as opposed to scanning the directory)?

Thanks again!

-kevin.

On 12/20/13 1:34 PM, "Antonio S. Cofiño" wrote:
Kevin,

To improve the JoinExisting aggregation, you can replace the inner scan element with an explicit list of the files you want to aggregate, and add the ncoords or the coordValue attribute to each netcdf element, as explained in the "Defining coordinates on a JoinExisting aggregation" section of the Aggregation document:

http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/ncml/v2.2/Aggregation.html

Be sure the aggregation cache is configured in the TDS config:

http://www.unidata.ucar.edu/software/thredds/current/tds/tds4.3/reference/ThreddsConfigXMLFile.html#AggregationCache

I hope this helps.

Regards

Antonio

--
Antonio S. Cofiño
Grupo de Meteorología de Santander
Dep. de Matemática Aplicada y Ciencias de la Computación
Universidad de Cantabria
http://www.meteo.unican.es

On 20/12/2013 19:28, Kevin Manross wrote:

Seasons Greetings!

I really wish we didn't have these restrictions on data, but that's what I'm dealing with, so please bear with me.

We have some large (33 TB, 840 GB, etc.) netCDF datasets that I am trying to aggregate.
Many are in "time series" layout (i.e., a single-parameter grid spread out across many time steps [files], such as u10/u10_RCPP_2004_11.nc, u10/u10_RCPP_2004_12.nc, etc.).

I initially tried a large nested aggregation such as:

    <dataset name="ds601.0-Agg" ID="ds601.0-AGG"
             urlPath="ds601.0/10/best" harvest="true">
      <metadata inherited="true">
        <serviceName>all</serviceName>
        <dataFormat>NetCDF</dataFormat>
        <dataType>GRID</dataType>
      </metadata>
      <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
        <!--attribute name="title" type="string" value="20th Century Simulation Yearly Timeseries-Parameter Aggregations"/-->
        <aggregation type="Union">
          <netcdf>
            <aggregation dimName="time" type="joinExisting">
              <scan location="/glade/p/rda/data/ds601.0/RCPP/1995_2005/glw/" suffix=".nc" subdirs="true"/>
            </aggregation>
          </netcdf>
          <netcdf>
            <aggregation dimName="time" type="joinExisting">
              <scan location="/glade/p/rda/data/ds601.0/RCPP/1995_2005/graupel/" suffix=".nc" subdirs="true"/>
            </aggregation>
          </netcdf>
          <netcdf>
            <aggregation dimName="time" type="joinExisting">
              <scan location="/glade/p/rda/data/ds601.0/RCPP/1995_2005/olr/" suffix=".nc" subdirs="true"/>
            </aggregation>
          </netcdf>
          <netcdf>
            <aggregation dimName="time" type="joinExisting">
              <scan location="/glade/p/rda/data/ds601.0/RCPP/1995_2005/psfc/" suffix=".nc" subdirs="true"/>
            </aggregation>
          </netcdf>
          ... ... ...
        </aggregation>
      </netcdf>
    </dataset>

This takes a long time to build the cache file, and on each revisit it goes through the process of rebuilding the file. Honestly, it is unusable this way from a user standpoint. However, everything works with the restrictions I have set up via Tomcat DataSourceRealm and webapps/thredds/WEB-INF/web.xml.

Mike McDonald had a really slick way to aggregate and cache the parameter timeseries files, and then build the union on demand (see his response to the thread '"Too Many Open Files" Error. Dataset too big?' on 28 October 2013).
So using his example, I reformatted my catalog as such:

    <dataset name="Full Aggregation of ds601.0" ID="ds601.0-AGG"
             urlPath="aggregations/ds601.0/10/best" harvest="true">
      <serviceName>all</serviceName>
      <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
        <aggregation type="Union">
          <netcdf location="dods://localhost:8080/thredds/dodsC/internal/ds601.0/101/glw"/>
          <netcdf location="dods://localhost:8080/thredds/dodsC/internal/ds601.0/102/graupel"/>
          <netcdf location="dods://localhost:8080/thredds/dodsC/internal/ds601.0/103/olr"/>
          <netcdf location="dods://localhost:8080/thredds/dodsC/internal/ds601.0/104/psfc"/>
          ... ... ...
        </aggregation>
      </netcdf>
    </dataset>

    <dataset name="internal/ds601.0 Aggregation (glw)" ID="internal/ds601.0/101/glw"
             urlPath="internal/ds601.0/101/glw" harvest="true">
      <serviceName>all</serviceName>
      <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
        <aggregation dimName="time" type="joinExisting">
          <scan location="/data/glade/p/rda/data/ds601.0/RCPP/1995_2005/glw/" suffix=".nc" subdirs="true"/>
        </aggregation>
      </netcdf>
    </dataset>

    <dataset name="internal/ds601.0 Aggregation (graupel)" ID="internal/ds601.0/102/graupel"
             urlPath="internal/ds601.0/102/graupel" harvest="true">
      <serviceName>all</serviceName>
      <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
        <aggregation dimName="time" type="joinExisting">
          <scan location="/data/glade/p/rda/data/ds601.0/RCPP/1995_2005/graupel/" suffix=".nc" subdirs="true"/>
        </aggregation>
      </netcdf>
    </dataset>

    <dataset name="internal/ds601.0 Aggregation (olr)" ID="internal/ds601.0/103/olr"
             urlPath="internal/ds601.0/103/olr" harvest="true">
      <serviceName>all</serviceName>
      <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
        <aggregation dimName="time" type="joinExisting">
          <scan location="/data/glade/p/rda/data/ds601.0/RCPP/1995_2005/olr/" suffix=".nc" subdirs="true"/>
        </aggregation>
      </netcdf>
    </dataset>

    <dataset name="internal/ds601.0 Aggregation (psfc)" ID="internal/ds601.0/104/psfc"
             urlPath="internal/ds601.0/104/psfc" harvest="true">
      <serviceName>all</serviceName>
      <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
        <aggregation dimName="time" type="joinExisting">
          <scan location="/data/glade/p/rda/data/ds601.0/RCPP/1995_2005/psfc/" suffix=".nc" subdirs="true"/>
        </aggregation>
      </netcdf>
    </dataset>
    ... ... ...

This sped things up immensely and the server is very responsive; however, I can't seem to get the authorization to work with the internal Union aggregation.

I have attempted a number of things, such as:

+ https://www.unidata.ucar.edu/software/thredds/current/tds/reference/RestrictedAccess.html - "2. Restrict by Dataset using TDS Catalog", for each joinExisting aggregation

+ Adding a valid username/password to the URL in the netcdf location value of the Union call:

    <aggregation type="Union">
      <netcdf location="dods://USERNAME:PASSWORD@localhost:8080/thredds/dodsC/internal/ds601.0/101/glw"/>

+ Trying the above with an http:// protocol

The only thing that seems to work is to leave the joinExisting aggregations unrestricted, but keep the restriction on the Union aggregation.

I would like to do any of the following:

1) Hide the joinExisting aggregations (links) from the web browser

2) Since the joinExisting aggregations are only needed to populate the Union aggregation "internally" to the TDS, somehow ease restrictions when called within the TDS on the localhost

3) Somehow authorize the joinExisting aggregations within the Union aggregation

4) Hear of an alternative way to efficiently aggregate the timeseries parameters and then combine those aggregated timeseries.

If this is simply not doable, that is also helpful information, and I'll leave the aggregated timeseries (joinExisting) unrestricted.

-kevin.
--
Kevin Manross
NCAR/CISL/Data Support Section
Phone: (303)-497-1218
Email: manross@xxxxxxxx
Web: http://rda.ucar.edu

_______________________________________________
thredds mailing list
thredds@xxxxxxxxxxxxxxxx
For list information or to unsubscribe, visit: http://www.unidata.ucar.edu/mailing_lists/
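
For reference, Antonio's suggestion of replacing the scan element with explicitly listed files could look roughly like the sketch below for one joinExisting aggregation. This is only an illustration under stated assumptions: the /path/to/u10/ directory and the ncoords values are placeholders (only the u10_RCPP_2004_11.nc and u10_RCPP_2004_12.nc file names appear in the thread), and each ncoords must match the real number of time steps in its file. The idea, per the Aggregation document, is that with ncoords given the files do not have to be opened just to size the time dimension.

```xml
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <aggregation dimName="time" type="joinExisting">
    <!-- Files listed explicitly instead of using <scan>.
         /path/to/u10/ is a placeholder directory, and the ncoords
         values are hypothetical: use the actual time-step count
         of each file. -->
    <netcdf location="/path/to/u10/u10_RCPP_2004_11.nc" ncoords="720"/>
    <netcdf location="/path/to/u10/u10_RCPP_2004_12.nc" ncoords="744"/>
    <!-- ... one netcdf element per remaining file, in time order ... -->
  </aggregation>
</netcdf>
```

With several thousand files, a list like this would most likely be generated by a script rather than written by hand.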