I recently received a support question about LDM queue size (and why larger is desirable in some instances) and felt that others may have a similar question.

Disclaimer: I know some of you will disagree or have an analog that disproves what I articulate here. That's cool. Post it and discuss it with others; that is the point of this email list. I am disinclined to argue or refute anything said unless it is dangerous to the community. In other words, feel free to take it or leave it when I provide suggestions.

Here is the content of my reply to the support question:

One of the powers of *nix operating systems is the ability to performance-tune the OS and applications for a specific use. This also applies to LDM. The purpose of the LDM queue is to act as a network/processing "shock absorber" that helps mitigate bottlenecks; streaming directly from node to node with no buffering would be great in a perfect world, but we don't live in a perfect world.

There are three conditions for which a large queue size is important (although the first two are related, i.e., they concern what is done after data is received):

1) the node in question is feeding downstream nodes via REQUESTs;
2) the node in question is performing a lot of post-reception processing (pqact) that can challenge the OS kernel scheduler to keep up;
3) the node in question is a "top-tier" node that injects data onto the local LDM queue, i.e., via pqinsert or SBN (NOAAPort/NWWS) reception.

Taking the first instance: if your node is a terminal node, i.e., you are not feeding any downstream nodes, and processing is minimal (writing to a file system, for instance), there really is no need for a large queue. The queue-size calculation mostly needs to account for a full scan's worth of storage slots for the largest product you may expect, such as the netCDF GOES data tiles. A 500M queue would be plenty, and light and nimble.
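As a rough sketch of that sizing arithmetic, here is the back-of-envelope calculation in shell form; the product size and slot count are made-up illustrative values, not measurements from any particular feed:

```shell
#!/bin/sh
# Back-of-envelope queue sizing for a terminal node: hold one full
# scan's worth of slots of the largest product you expect to receive.
# Both numbers below are hypothetical; substitute your own feed's values.
largest_product_mb=16   # e.g. a large netCDF GOES tile
slots_per_scan=24       # products expected to arrive in one scan
queue_mb=$((largest_product_mb * slots_per_scan))
echo "suggested minimum queue: ${queue_mb}M"
```

With those example numbers the result lands comfortably under the 500M figure above, which is the point: a terminal node rarely needs more.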
Conversely, if your node is a relay *or* a "top-tier" node where products are injected via pqinsert or noaaportIngester (addressing the first and third instances), and you are feeding downstream LDM nodes, the desire may be to build the largest queue possible given the physical memory constraints of the machine or VM instance. The purpose of such a large queue is to build in as much resilience to a network outage or bottleneck as possible.

As with every condition I discuss here, there is always an exception. If you are using your LDM to receive NOAAPort, but that is it, i.e., you have noaaportIngester(s) running per PID you desire and you direct the output from pqact straight to a file system, then you don't need a lot of queue for that, either. I would suggest more than 500M, but not much, unless you are on a 666MHz Pentium III ;-)

Finally, the more complicated scenario of the three is the second one; you will need to pull on your big-person system administration pants and do some math and benchmarking. The reason this one is more complicated is that you have to weigh the need for the queue to be large enough to handle all the pqact-enabled or piped processing, yet small enough to leave resources for any LDM-external applications or services that are fed or piped the data, such as an RDBMS or a graphics-creation task.

External to LDM, tools such as top, w, and mpstat can help you track the memory and processing usage of an RDBMS engine, or of other post-processing that is not directly piped to by pqact. The use of incrontab, for instance, watches for files to close in a directory in order to trigger a job, and may not be directly related to an LDM session.
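For concreteness, the queue size on such a relay lives in the LDM registry. A minimal sketch, assuming the standard /queue/size registry path and an illustrative 8G value (pick yours from the machine's actual RAM):

```shell
# Set the queue size in the LDM registry (the 8G value is illustrative),
# then delete and recreate the queue so the new size takes effect.
# Run as the LDM user, with the LDM stopped.
regutil -s 8G /queue/size
ldmadmin delqueue
ldmadmin mkqueue
```

Note that resizing destroys the existing queue contents, so do this during a planned maintenance window on a relay.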
Internal to LDM, one can encapsulate a PIPE instance from the pqact configuration file in a script and fire the actual executable using the "time" built-in to measure its real and system processing statistics, then average these to determine the minimum queue size when compared against the output from pqmon. Your goal is a queue large enough to handle the potential processing bottleneck, yet one that leaves resources for other, non-LDM processing.

Finally, the issue of physical memory and LDM. I don't know anyone who compiles LDM without mmap(), so it is worth mentioning some basics about the queue and memory usage. In almost every modern LDM load that I know of, the size of the queue is limited by the physical memory of the hardware/VM instance, because the queue is mapped to memory for speed and rapid access. If you have set the queue to 32768M in registry.xml, you need to ensure that you have more than that in physical/VM-allocated memory, or you will set yourself up for swapping (if enabled) or faults while running. And that only applies if the queue is actually configured "on disk"; if you use a ramdisk/shm partition for your queue, then you will need more than twice the queue size indicated in registry.xml.

One final thing to consider: the larger the queue, the more time it takes to create and manage at the system level; that is a simple function of the number of bytes. The point of this is to illustrate that simply configuring the maximum queue size possible is not necessarily good practice. If you are running a pure "relay" node, i.e., you have a REQUEST to an upstream node and you are passing data to downstream nodes, with no graphical console and not much of anything else, then sure: I would be comfortable with an instance having 48G of physical/VM-allocated memory making the queue 40G if the queue is on disk, or 20G if the queue is in ramdisk/shm (which is done for speed, by the way).
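The wrapper idea above can be sketched as follows; the decoder path, pattern, and log location are hypothetical stand-ins for whatever your pqact entry actually pipes to:

```shell
#!/bin/sh
# Hypothetical wrapper installed as the target of a pqact PIPE action,
# e.g. (pattern and paths illustrative):
#   ANY  ^SOME_PATTERN  PIPE  /usr/local/ldm/bin/timed_decoder.sh
LOG=/usr/local/ldm/logs/decoder_timing.log
# "time" reports real/user/sys on stderr; append that to the log so
# the per-product cost can be averaged and compared with pqmon's
# age-of-oldest-product figures.
{ time /usr/local/ldm/decoders/my_decoder "$@" ; } 2>> "$LOG"
```

Run a day's worth of products through it, average the "real" column, and you have a defensible number for how long products must survive in the queue while processing catches up.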
If you are inserting a satellite-sourced stream such as NOAAPort, I would suggest that the same queue would require a minimum of 64G of physical/VM-allocated memory, mostly because such a machine is normally a "relay", except that instead of an upstream REQUEST, you are performing inserts. The additional physical memory in this instance accommodates the local buffering done by noaaportIngester.

If you are doing a lot of tasks while receiving data, pqmon will tell you the age of the oldest product in the queue, and you can measure that against the stats you acquire through system tools such as top, w, or mpstat. You want to avoid "strangling" your machine by not leaving enough resources external to LDM to do what you want to do.

One last thing I want to point out, as I know it is popular with some to put the queue on ramdisk/shm: although fast, it is not resilient. Unlike with the file system, once you reboot, that queue is gone. If you want some semblance of a persistent queue, and you do "nice" LDM shutdowns before rebooting, then when the machine comes back up you will have your original queue ready to roll. If you go the memory route, you will need to "mkqueue" upon every reboot. I am not saying one method is better than the other; I merely want to point out the caveat to using the faster "disk space in memory" paradigm.

*Stonie Cooper, PhD*
Software Engineer III
NSF Unidata Program Center
University Corporation for Atmospheric Research

*I acknowledge that the land I live and work on is the traditional home of The Chahiksichahiks (Pawnee), The Umoⁿhoⁿ (Omaha), and The Jiwere (Otoe).*
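The reboot caveat can be sketched as a boot-time check; the queue path is an illustrative ramdisk location, and the ldmadmin invocations assume a standard LDM installation:

```shell
#!/bin/sh
# Illustrative boot-time logic for a tmpfs/shm-backed queue: the queue
# file does not survive a reboot, so it must be recreated before the
# LDM starts.  (Path below is a hypothetical example.)
QUEUE=/dev/shm/ldm.pq
if [ ! -f "$QUEUE" ]; then
    # The in-memory queue vanished at reboot; rebuild an empty one.
    ldmadmin mkqueue
fi
ldmadmin start
```

With an on-disk queue and a clean `ldmadmin stop` before the reboot, the `mkqueue` branch is never needed; that is the trade-off the paragraph above describes.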