Re: [ldm-users] ldm data dir question

I'm also interested in the size of the product queue (look in
~ldm/etc/registry.xml for the queue size) vs. the amount of RAM available.
It sounds like you could be hammering system memory.
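
A rough sketch of that comparison, assuming a Linux host with /proc/meminfo
and the queue path from the error message quoted below. The pct_of and
meminfo_kb helper names are made up for illustration and are not part of LDM.

```shell
#!/bin/sh
# Sketch only: compare the LDM product-queue file size with system RAM.
# QUEUE defaults to the path shown in the LDM error message; pass a different
# path as the first argument if your registry points somewhere else.
QUEUE="${1:-/home/ldm/var/queues/ldm.pq}"

# pct_of PART TOTAL: print PART as a whole-number percentage of TOTAL.
# (Hypothetical helper name, not an LDM utility.)
pct_of() {
    awk -v p="$1" -v t="$2" 'BEGIN {
        if (t > 0) printf "%d\n", (p * 100) / t
        else print "n/a"
    }'
}

# meminfo_kb FIELD: print the value (in kB) of a /proc/meminfo field.
# Prints nothing if the field or the file is missing.
meminfo_kb() {
    awk -v f="$1:" '$1 == f { print $2 }' /proc/meminfo 2>/dev/null
}

if [ -f "$QUEUE" ]; then
    queue_kb=$(( $(wc -c < "$QUEUE") / 1024 ))
    total_kb=$(meminfo_kb MemTotal)
    avail_kb=$(meminfo_kb MemAvailable)
    echo "queue size:    ${queue_kb} kB"
    echo "RAM total:     ${total_kb:-unknown} kB"
    echo "RAM available: ${avail_kb:-unknown} kB"
    echo "queue is $(pct_of "$queue_kb" "${total_kb:-0}")% of total RAM"
else
    echo "no product queue found at $QUEUE"
fi
```

A queue that is close to, or larger than, total RAM would be consistent with
the memory-pressure theory; a small percentage points back at disk or inodes.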

gerry

On Fri, Apr 24, 2020 at 8:44 PM Mike Zuranski <zuranski.wx@xxxxxxxxx> wrote:

> Hi Jack,
>
> First thing I want to point out is (barring any symlink or similar
> shenanigans) your product queue is not under /home/ldm/var/data/.   As
> shown by LDM's error message, the product queue is the
> /home/ldm/var/queues/ldm.pq file.  That single file will house the entire
> queue, so you wouldn't see excessive files from that.
>
> That being said, the times I've had issues like yours with not being able
> to log in or issue commands, it was usually because of either a full root
> partition ("/"), full /tmp partition (unlikely that's relevant here, but
> just FYI), full memory, or full inodes on a partition.  I see Tom already
> asked about "df -h" output, and you already checked inodes and that appears
> fine.  But those have been some of my experiences as well.
>
> So what IS in /home/ldm/var/data ?  My guess is that's where LDM is saving
> data to, and that configuration would be found in your pqact file(s).  One
> thing you could try is running the following command to see what LDM will
> attempt to save in that directory (assuming your pqact file(s) are named
> "pqact..." and in that dir, otherwise adjust accordingly):  "grep var/data
> ~/etc/pqact* | grep -i file"  (without quotes)
>
> Side-note to the above:  By default, relative paths with the FILE action
> will start in the "/home/ldm" directory.  This is set in ~/etc/registry.xml
> under /pqact/datadir-path, and you can check it with "regutil
> /pqact/datadir-path" (without quotes).  If that points straight to your
> /home/ldm/var/data/ dir then THAT becomes the default starting point for
> relative paths (and it might make the above grep command come back empty).
>
> If there are actions to save data there they should (hopefully but not
> guaranteed to) be listed by that grep command, and that could point you
> where to look next.  If it comes back empty then maybe something's getting
> PIPEd to a script which is in turn saving data there, but that might be
> harder to track down.  Either way, it's hard to know without looking in
> that directory or your pqact(s) what might be happening, but hopefully this
> will yield a clue or two.  It's possible you're getting more than you think
> you're asking for, and it's leading to that directory filling up... and if
> that's on the root partition it could explain the log in / lock up issues.
>
> You also mentioned ldmadmin scour doesn't seem to be doing much.  Check
> ~/etc/scour.conf to see where it's doing actual scouring.  Maybe it's not
> looking in that data directory, or maybe it is letting files stay too long.
>
> I'd also be curious about the size of your product queue vs. the size of
> the partition it's on.  If it's able to get made and LDM starts at all it's
> probably fine, but it is worth paying attention to.  The size of the queue
> gets defined in ~/etc/registry.xml, then just compare "ls -lh
> /home/ldm/var/queues/ldm.pq" and "df -h" to see how much of the partition
> the queue is taking up.  I try to ensure the partition it's on stays at 75%
> full or less, though I don't think that's a hard-and-fast rule, just guidance.
>
> Some reference pages that may be useful to you if you haven't seen these
> already:
> https://www.unidata.ucar.edu/software/ldm/ldm-current/basics/ldmd.conf.html
> https://www.unidata.ucar.edu/software/ldm/ldm-current/basics/pqact.conf.html
> https://www.unidata.ucar.edu/software/ldm/ldm-current/basics/scour.conf.html
> https://www.unidata.ucar.edu/software/ldm/ldm-current/basics/LDM-registry.html
>
> Per your last email:
> > just to confirm... find and rm on the data dir won't mess up / confuse
> > the ldm queue stuff?
>
> It shouldn't.  Again, from what I've seen in your original email that's
> not where the queue is.  And even if it were, scour shouldn't touch it as
> long as it keeps updating (though rm -rf would).  I'd double-check
> ~/etc/registry.xml to verify the queue is housed elsewhere, but it sounds
> like you should be fine on this.
>
> Hope some of this helps you out,
>
> -Mike
>
> ======================
> Mike Zuranski
> Meteorology Support Analyst
> College of DuPage - Nexlab
> Weather.cod.edu <http://weather.cod.edu/>
> ======================
>
>
> On Fri, Apr 24, 2020 at 1:32 PM Jack Snodgrass <jack@xxxxxxxxxxxxxx>
> wrote:
>
>> having issues with our server ( centos7 ) that runs ldm... locking up. It
>> has happened 2 times in the last 3 weeks or so.
>> The server is pingable... so it's not totally dead.. but you can't get a
>> local or remote console to start. can't figure out if it is out of memory
>> or file handles or what.... it's like a ghost of itself.
>>
>> After rebooting... the  /home/ldm/var/data/ has around 350,000 files in
>> it.  I am not sure if that is 'ok' or a bit extra.
>>
>> We are running a
>>
>> ldmadmin scour
>>
>> command... via cron but I don't know what that is doing exactly or if
>> it's doing much.
>>
>> when I try and restart ldm it says:
>>
>> Checking the product-queue...
>> The writer-counter of the product-queue isn't zero.  Either a process
>> has the product-queue open for writing or the queue might be corrupt.
>> Terminate the process and recheck or use
>>     pqcat -l- -s -q /home/ldm/var/queues/ldm.pq && pqcheck -F -q
>>     /home/ldm/var/queues/ldm.pq
>> to validate the queue and set the writer-counter to zero.
>> LDM not started
>>
>>
>> In the past.... during testing and what not.. I've been able to run:
>> pqcat -l- -s -q /home/ldm/var/queues/ldm.pq && pqcheck -F -q
>> /home/ldm/var/queues/ldm.pq
>>
>> and ldm would start after that. This time.. with the 350K files or so..
>> that pqcat stuff fails.
>>
>> I am deleting older ( than a day ) files from the /home/ldm/var/data/
>> directory... going to see if
>>
>> pqcat -l- -s -q /home/ldm/var/queues/ldm.pq && pqcheck -F -q
>> /home/ldm/var/queues/ldm.pq
>>
>>
>> will work or if I have to rm -rf /home/ldm/var/data/ and start a new q.
>>
>>
>> If ldmadmin scour does not let us remove enough files from
>> /home/ldm/var/data/ can I use find and rm to remove files or do they have
>> to be removed using ldm to keep any queues or indexes in sync?
>>
>> - jack
>>
>> --
>> *jack* - Southlake Texas - http://mylinuxguy.net - *817-601-7338*
>> _______________________________________________
>> NOTE: All exchanges posted to Unidata maintained email lists are
>> recorded in the Unidata inquiry tracking system and made publicly
>> available through the web.  Users who post to any of the lists we
>> maintain are reminded to remove any personal information that they
>> do not want to be made public.
>>
>>
>> ldm-users mailing list
>> ldm-users@xxxxxxxxxxxxxxxx
>> For list information or to unsubscribe,  visit:
>> https://www.unidata.ucar.edu/mailing_lists/
>>


-- 
Gerry Creager
NSSL/CIMMS
405.325.6371
++++++++++++++++++++++
*The way to get started is to quit talking and begin doing.*
*   Walt Disney*