Hi Tom,

Thanks for the response. I'm an academic, let us do academic things. I putz here on my RHEL8 64bit laptop and LDM 6.13.11.

Testing `PIPE -close`
=====================

Let's create a fancy pants bash script, since python is 3x too slow.

    $ cat 10xengineer.sh
    # save data to the bit bucket
    cat > /dev/null
    # Wait a second, no 3600 seconds
    sleep 3600

Let's create a fancy pants pqact entry

    $ cat etc/fun.conf
    EXP ^(.*)$ PIPE -close sh 10xengineer.sh \1

So in desperation, I wrote a python script to inject LDM with unique product names so that each file fires off a script from pqact.

    $ cat bah.py
    import datetime
    import subprocess

    # Write a tiny payload to insert over and over
    with open('bah.txt', 'w') as fp:
        fp.write('hi')

    # Each insert uses the current timestamp as its product name
    for _ in range(1000):
        cmd = f"pqinsert -i -p '{datetime.datetime.now()}' bah.txt"
        subprocess.call(cmd, shell=True)

and so we fire up LDM and observe ye ole fire off first pqact entry process running :)

    sh 10xengineer.sh _BEGIN_

So our numbers below will have 1001 processes involved. Let's get going here and insert those 1000 products.

    $ python bah.py
    $

And so how many 10xengineer.sh processes do we have now?

    $ ps auxw | grep 10x | wc -l
    1001

Ah fun, I have DOS'd my LDM with bash scripts, hehe. Let's kill all those and try something else.

Testing with just `PIPE`
========================

So now we adjust the pqact to drop the `-close`, and so pqact should hold the file descriptor open until it closes it after that timeout that Steve mentioned.

    $ cat etc/fun.conf
    EXP ^(.*)$ PIPE sh 10xengineer.sh \1

and now inject those 1000 files into LDM and behold....

    $ ps auxw | grep 10x | wc -l
    1001

Whoa, that's interesting. Looking at the process's (pid 5092 is pqact) open file descriptors

    $ lsof -p 5092 | grep pipe | wc -l
    1001

So there I was wrong, there's no 32 limit anymore. We had better test FILE too in order to satisfy reviewer #3

Testing `FILE -close`
=====================

Our pqact.conf file looks like so now:

    $ cat etc/fun.conf
    EXP ^(.*)$ FILE -close /tmp/daryl/\1

and start our LDM up and see our _BEGIN_ fun again.
    $ ls /tmp/daryl
    _BEGIN_

and so we inject 1000 products and observe nothing, because there is a space in the LDM product name, hehe. So we adjust our python script to remove the spaces and we find:

    $ ls /tmp/daryl/ | wc -l
    1001

and the pqact process has no open file descriptors. So let's test without the close.

Testing `FILE`
==============

Our pqact now looks like:

    $ cat etc/fun.conf
    EXP ^(.*)$ FILE /tmp/daryl/\1
    $ rm -rf /tmp/daryl/

and we inject 1000 files again and observe they all got written

    $ ls /tmp/daryl/ | wc -l
    1001

and observe how many open file descriptors the pqact process has.

    $ lsof -p 14633 | grep daryl | wc -l
    1001

Again, so much for my naive "32 slots" life. Well, I learned something today!

daryl

--
/**
 * daryl herzmann
 * Systems Analyst III -- Iowa Environmental Mesonet
 * https://mesonet.agron.iastate.edu
 */

________________________________________
From: ldm-users <ldm-users-bounces@xxxxxxxxxxxxxxxx> on behalf of Tom Yoksas <yoksas@xxxxxxxx>
Sent: Thursday, April 23, 2020 3:03 PM
To: ldm-users@xxxxxxxxxxxxxxxx
Subject: [ldm-users] 20200423: Re: 20200423: Re: Efficiency of splitting pqacts

Hi Daryl,

On 4/23/20 1:36 PM, Herzmann, Daryl E [AGRON] wrote:
> I am sure Unidata will correct my ignorance / incorrect details, but
> my understanding is that an individual pqact process can only do 32
> "things" at one time, or there's 32 slots available for work.

A _long_ time ago, the LDM used to only keep open a maximum of 32 file descriptors. A less long, but still long, time ago, Steve changed that to use the system value for the number of open file descriptors.

Recently, we came to the conclusion that there being LOTS of open file descriptors was a major cause for the length of time it took to stop the LDM, at least on our publicly facing servers (lead.unidata.ucar.edu and atm.ucar.edu). Files opened by actions like those that append to an open file were simply not being closed, because current OSes allow for LOTS of open file descriptors.
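For anyone who wants to repeat the `lsof` counts above, or see what "the system value" is on their own machine, here is a minimal Python sketch. It assumes that value corresponds to the per-process `RLIMIT_NOFILE` limit, which this thread does not confirm, and it counts descriptors through Linux's `/proc/<pid>/fd`, so it is not portable. The function names `open_fd_limits` and `count_open_fds` are made up for illustration.

```python
import os
import resource


def open_fd_limits():
    """Return the (soft, hard) per-process limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def count_open_fds(pid="self"):
    """Count descriptors open in a process by listing /proc/<pid>/fd.

    Linux only. Returns None if /proc is unavailable or the pid is gone.
    For pid="self" the count includes the descriptor used to read the
    directory itself.
    """
    try:
        return len(os.listdir(f"/proc/{pid}/fd"))
    except (FileNotFoundError, NotADirectoryError):
        return None


if __name__ == "__main__":
    soft, hard = open_fd_limits()
    print(f"soft limit: {soft}, hard limit: {hard}")
    print(f"descriptors open in this process: {count_open_fds()}")
```

Passing a pqact pid (for example `count_open_fds(14633)`) should give a number comparable to the unfiltered `lsof -p` line count, though `lsof` also lists entries that are not descriptors, so the two will not match exactly. Reading another user's `/proc/<pid>/fd` requires suitable permissions.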
Steve's solution was to add code to the LDM that closes a file descriptor after a certain amount of time during which no writes have occurred on it. The best example of the kind of actions that I am referring to are ones for model output that write all model fields for a single model time step into a single file. In these kinds of actions (FILE with no -close flag), there is no way to know when all of the products to be written into the output file have been received, so the file descriptor stays open, and as I noted, current OSes allow for a LOT of open file descriptors.

re:
> Now, the above depends on the action. If you run `PIPE -close`,
> the slot can be used for another product even with the PIPEd process
> still running... This type of action can lead LDM to DOSing the server
> it is on as it will fire off as many PIPE'd processes that it can.

I'm not sure that this is the case, but Steve can certainly say yea/nay on this.

re:
> You old timers, like me, will recall the lock file fun Chiz wrote into
> the GIF generation script of NIDS data for this reason.
>
> If you are doing just FILE actions without a `-close`, there is some
> benefit to spreading out the pqact.conf file into multiple files to
> keep each pqact roughly touching 32 files each. For example with
> level2 data, dividing the radars into chunks like so:
>
> exec "pqact -p BZIP2/K[A-D] -f CRAFT /local/ldm/etc/pqact-craft.conf"
> exec "pqact -p BZIP2/K[E-H] -f CRAFT /local/ldm/etc/pqact-craft2.conf"
> exec "pqact -p BZIP2/K[I-K] -f CRAFT /local/ldm/etc/pqact-craft3.conf"
> exec "pqact -p BZIP2/K[L-O] -f CRAFT /local/ldm/etc/pqact-craft4.conf"
> exec "pqact -p BZIP2/K[P-R] -f CRAFT /local/ldm/etc/pqact-craft5.conf"
> exec "pqact -p BZIP2/K[S-Z] -f CRAFT /local/ldm/etc/pqact-craft6.conf"
> exec "pqact -p BZIP2/[A-J] -f CRAFT /local/ldm/etc/pqact-craft7.conf"
> exec "pqact -p BZIP2/[L-Z] -f CRAFT /local/ldm/etc/pqact-craft8.conf"
>
> Behold, another caveat here.
> While with the above, each pqact process has
> its own uniquely named file, this file can be the same file on the filesystem
> and managed with sym links. They need to be unique to the pqact process so
> that pqact can write its `.state` file to a unique location.

The question that Mike Z was asking was about the number of actions in the pattern-action file. If one uses the exact same pattern-action file for each 'pqact' instance, and that pattern-action file has a lot of actions, it will take longer for 'pqact' to work its way through the actions. This is true even if some/most of the actions are not executed because their extended regular expression doesn't match the Product ID for the product being acted upon. Of course, actions that don't match tend to be dealt with much faster than ones that do match.

re:
> You should consider the processes being run, how long their lifetime is,
> and your server's capacity. If you have a bunch of long running GEMPAK
> decoders that totals something less than 32 total, then just keep them
> in one file but perhaps isolate that pqact process to just those tasks.

I agree with the sentiment expressed here, but I would caution that the old 32 open file descriptor limit does not apply.

re:
> So hold tight until Unidata corrects my above as FUD :)

Just having fun on a stay at home day :-)

Cheers,

Tom

_______________________________
> From: ldm-users <ldm-users-bounces@xxxxxxxxxxxxxxxx> on behalf of Tom Yoksas
> <yoksas@xxxxxxxx>
> Sent: Thursday, April 23, 2020 2:15 PM
> To: ldm-users@xxxxxxxxxxxxxxxx
> Subject: [ldm-users] 20200423: Re: Efficiency of splitting pqacts
>
> Hi Mike,
>
> On 4/23/20 12:39 PM, Mike Zuranski wrote:
>> I'm wondering if there is a difference in speed/efficiency of the LDM,
>> or in system resource allocation, between grouping all my pqact
>> statements in one file vs. splitting them up into different pqact
>> files.
>
> Since all actions in an LDM pattern-action file are processed
> sequentially, there is a benefit to distributing actions in multiple
> pattern-action files that are each processed by a separate 'pqact'
> instance.
>
> re:
>> Does LDM do anything differently or is it a wash either way?
>
> No, each 'pqact' instance will work through the list of actions in
> the pattern-action file that it works in sequence. So, if one has
> a monolithic pattern-action file with, say 10K actions, it will take
> significantly longer than having 10 'pqact' instances operating
> on pattern-action files that each have 100 actions.
>
> re:
>> I vaguely remember this coming up at one point but I couldn't find any
>> documentation or old email threads about it. I'm mostly just asking out
>> of curiosity, I don't have a specific problem that I'm trying to solve
>> or anything. But if I were to redo my pqact organization I'm wondering
>> if there is a preferred methodology.
>
> The best rule of thumb is to have multiple 'pqact' instances operating
> on multiple pattern-action files when the list of actions to be
> performed is large, or when some of the actions are slow. There is no
> "best practice" for, say, having only N actions in a pattern-action
> file since the speed that the actions will be performed is a function
> of how fast/slow each action is. Sites invariably will need to do
> their own tuning to find the right balance of speed and use of
> resources (more 'pqact' instances will, of course, use more resources
> like CPU, RAM, etc.).
>
> Cheers,
>
> Tom
> --
> +----------------------------------------------------------------------+
> * Tom Yoksas                                      UCAR Unidata Program *
> * (303) 497-8642 (last resort)                           P.O. Box 3000 *
> * yoksas@xxxxxxxx                                    Boulder, CO 80307 *
> * Unidata WWW Service                     http://www.unidata.ucar.edu/ *
> +----------------------------------------------------------------------+
>
> _______________________________________________
> NOTE: All exchanges posted to Unidata maintained email lists are
> recorded in the Unidata inquiry tracking system and made publicly
> available through the web. Users who post to any of the lists we
> maintain are reminded to remove any personal information that they
> do not want to be made public.
>
>
> ldm-users mailing list
> ldm-users@xxxxxxxxxxxxxxxx
> For list information or to unsubscribe, visit:
> https://www.unidata.ucar.edu/mailing_lists/
>

--
+----------------------------------------------------------------------+
* Tom Yoksas                                      UCAR Unidata Program *
* (303) 497-8642 (last resort)                           P.O. Box 3000 *
* yoksas@xxxxxxxx                                    Boulder, CO 80307 *
* Unidata WWW Service                     http://www.unidata.ucar.edu/ *
+----------------------------------------------------------------------+
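
Tom's point that each 'pqact' instance works through its pattern-action file in sequence can be illustrated with a toy model. The sketch below is not LDM code: it uses Python's `re` module rather than the POSIX extended regular expressions pqact uses, the station patterns are invented, and `evaluate` and `split_round_robin` are hypothetical helper names. It only counts how many patterns one instance has to try per product when the same set of actions lives in one file versus being dealt out across several.

```python
import re


def evaluate(product_id, patterns):
    """Try every pattern in order against one product ID.

    Returns (number of patterns tried, list of patterns that matched).
    Every pattern is tried, matching or not, which is the sequential
    cost described in the thread.
    """
    matched = [p for p in patterns if re.search(p, product_id)]
    return len(patterns), matched


def split_round_robin(patterns, n):
    """Deal the patterns out to n hypothetical pattern-action files."""
    return [patterns[i::n] for i in range(n)]


if __name__ == "__main__":
    # Invented patterns: one action per made-up four-digit station ID.
    patterns = [f"^BZIP2/K{i:04d}$" for i in range(10_000)]
    product = "BZIP2/K1234"

    tried, matched = evaluate(product, patterns)
    print(f"one file: {tried} patterns tried, {len(matched)} matched")

    for idx, chunk in enumerate(split_round_robin(patterns, 10)):
        tried, matched = evaluate(product, chunk)
        print(f"file {idx}: {tried} patterns tried, {len(matched)} matched")
```

The total number of pattern checks across all instances is unchanged by the split; what shrinks is the work each instance does per product, and the instances run as separate processes. As Tom notes, that comes at the price of more CPU and RAM, so the right number of files is something each site has to tune.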