NOTICE: This version of the NSF Unidata web site (archive.unidata.ucar.edu) is no longer being updated.
Current content can be found at unidata.ucar.edu.
To learn about what's going on, see About the Archive Site.
Hi David:

This is a timely email; I'm just now scratching my head trying to understand why things are different in 4.6. I have a report that the cache is possibly not getting used on subsequent reads, but my tests have not reproduced that. So this is helpful.

Note that to get everything cached, you just need to make a request for the aggregation coordinate (usually time) values. You could do it with a DODS request, or open the file as a Grid (e.g. WMS, WCS, NCSS, or from ToolsUI, IDV, etc.), which will automatically request coordinates. A script to do so is easy enough, using wget or Python or whatever you like. Email support if you need an example.

One might ask: why doesn't 4.6 use the previous cached values? It does, but a change to the default behavior of DiskCache2 may be affecting this. The 4.3 default was to put all cache files into a single directory; the 4.6 default makes nested directories, because having thousands of files in a single directory is Considered Harmful. If you need to, you can control that behavior in threddsConfig.xml, but it is better to pay the price and redo the cache with the default. Email support if you need more details.

BTW, if you are installing on top of your old TDS, it might be advisable to take the opportunity to clear out your caches. Just go to your cache directory (the default is content/thredds/cache) and delete the entire directory, or, if you have the inclination, go and selectively delete stuff (but then you have to think hard). Then trigger a repopulation as above.

On Thu, May 21, 2015 at 9:32 AM, David Robertson <robertson@xxxxxxxxxxxxxxxxxx> wrote:

> Hello,
>
> I noticed that the way NcML aggregation cache XML files are created has
> changed in version 4.6.x. In previous versions, the cache XML file
> contained lines similar to:
>
>   <netcdf id='/data/roms/espresso/2009_da/avg/espresso_avg_1379.nc' ncoords='1' >
>     <cache varName='ocean_time' >1.191888E8 </cache>
>   </netcdf>
>
> from the start.
> With large datasets, this took a while (30 minutes plus, and sometimes
> crashing TDS) to generate the first time the dataset was accessed, but
> subsequent accesses were much faster. The new way more quickly generates
> the NcML cache without the cached joinExisting values:
>
>   <netcdf id='/data/roms/espresso/2009_da/avg/espresso_avg_1379.nc' ncoords='1' >
>   </netcdf>
>
> and fills in the "<cache varName='ocean_time' >1.191888E8 </cache>" lines
> as data from the corresponding file is requested. A side effect, in my case
> at least, is that even requests for small amounts of data are relatively
> slow. Presumably, this will be the case until all ocean_time cache values
> are filled in. Once all values were cached, response times dropped
> significantly: from 15 s to less than 1 s in my very limited tests (~1600
> files spanning 19,146 time records).
>
> For anyone experiencing the same side effect, you can populate the whole
> aggregation cache XML file with the <cache> lines by requesting all records
> of the joinExisting variable (or successive chunks for very large datasets)
> as a workaround.
>
> I can certainly see the reasoning and benefits of the new way of caching,
> but want to point out possible side effects and workarounds. Another
> workaround could be to use a combination of Python/Perl and NCO to generate
> the cache file (complete with cached joinExisting values) offline.
>
> Dave
>
> _______________________________________________
> thredds mailing list
> thredds@xxxxxxxxxxxxxxxx
> For list information or to unsubscribe, visit:
> http://www.unidata.ucar.edu/mailing_lists/
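[Editor's note] The reply mentions that the DiskCache2 directory layout can be controlled in threddsConfig.xml. As a rough, unverified illustration of where such a setting lives, the TDS configuration reference documents an AggregationCache element along these lines; the exact element and attribute names (in particular cachePathPolicy and its values) are recalled from the TDS documentation and should be checked against the reference for your TDS version before use:

```
<!-- Fragment of threddsConfig.xml (names to be verified against your
     TDS version's configuration reference) -->
<AggregationCache>
  <dir>/data/tds/cache/aggregation/</dir>
  <scour>24 hours</scour>
  <maxAge>90 days</maxAge>
</AggregationCache>
```

As the reply says, though, the recommended path is to leave the new nested-directory default in place and simply let the cache rebuild once.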
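[Editor's note] The "clear the cache, then trigger a repopulation" procedure described above can be sketched as a small Python script. This is a hypothetical sketch, not code from the thread: the server host and dataset path are placeholders you would replace with your own TDS catalog entries, and the helper names (dods_ascii_url, clear_aggregation_cache, repopulate) are invented for illustration. It uses the standard OPeNDAP ASCII form of a TDS dodsC URL (dataset + ".ascii?" + variable name) to request the aggregation coordinate values, which is what causes TDS to fill in the <cache varName=...> entries.

```python
# Hypothetical sketch of the workaround discussed above.
# Host, dataset path, and function names are assumptions; adjust to
# your own TDS installation and catalog.
import shutil
from urllib.request import urlopen


def dods_ascii_url(server, dataset_path, coord_var="ocean_time"):
    """Build the OPeNDAP ASCII request URL for a single variable
    served by the TDS 'dodsC' (OPeNDAP) service."""
    return f"{server}/thredds/dodsC/{dataset_path}.ascii?{coord_var}"


def clear_aggregation_cache(cache_dir="content/thredds/cache"):
    """Delete the TDS cache directory (default location per the email),
    forcing the aggregation cache to be rebuilt from scratch."""
    shutil.rmtree(cache_dir, ignore_errors=True)


def repopulate(server, dataset_path, coord_var="ocean_time"):
    """Request all coordinate values, so TDS reads every file in the
    joinExisting aggregation and writes the cached <cache> entries."""
    with urlopen(dods_ascii_url(server, dataset_path, coord_var)) as resp:
        return resp.read()


if __name__ == "__main__":
    # Placeholder host and dataset path, for illustration only:
    print(dods_ascii_url("http://myserver.edu", "roms/espresso/avg"))
```

For very large aggregations you could instead request the coordinate in successive chunks (e.g. `?ocean_time[0:999]`, `?ocean_time[1000:1999]`, ...) as David suggests, to avoid long single requests.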