[LDM #BRX-675440]: Incomplete files on wheezy OS



George,

Thanks for the information.

Which of the two systems is missing the data-products?

> Hi,
> 
> It will be easier to give you this information instead of giving you
> access to the machines.  We may be able to if this is not enough.  Also
> attached is the pqact.conf file used by both machines.
> 
> Thanks,
> George
> 
> nikara.rap.ucar.edu (128.117.196.12)
> load average: 2.68, 2.63, 2.60
> nikara:~/cvs/third_party/open/apps/unisys_decoders/src/ucsat% ldmadmin config
> 
> hostname:              nikara.rap.ucar.edu
> os:                    Linux
> release:               3.2.0-4-amd64
> ldmhome:               /home/ldm
> LDM version:           6.11.5
> PATH:
> /home/ldm/ldm-6.11.5/bin:.:/home/ldm/bin:/home/ldm/util:/home/ldm/decoders:/home/ldm/rap/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/bin/X11
> LDM conf file:         /home/ldm/etc/ldmd.conf
> pqact(1) conf file:    /home/ldm/etc/pqact.conf
> scour(1) conf file:    /home/ldm/etc/scour.conf
> product queue:         /home/ldm/var/queues/ldm.pq
> queue size:            500M bytes
> queue slots:           default
> reconciliation mode:   do nothing
> pqsurf(1) path:        /home/ldm/var/queues/pqsurf.pq
> pqsurf(1) size:        2M
> IP address:            0.0.0.0
> port:                  388
> PID file:              /home/ldm/ldmd.pid
> Lock file:             /home/ldm/.ldmadmin.lck
> maximum clients:       256
> maximum latency:       3600
> time offset:           3600
> log file:              /home/ldm/var/logs/ldmd.log
> numlogs:               7
> log_rotate:            1
> netstat:               /bin/netstat -A inet -t -n
> top:                   /usr/bin/top -b -n 1
> metrics file:          /home/ldm/var/logs/metrics.txt
> metrics files:         /home/ldm/var/logs/metrics.txt*
> num_metrics:           4
> check time:            1
> delete info files:     0
> ntpdate(1):            /usr/sbin/ntpdate
> ntpdate(1) timeout:    5
> time servers:          ntp.ucsd.edu ntp1.cs.wisc.edu ntppub.tamu.edu otc1.psu.edu timeserver.unidata.ucar.edu
> time-offset limit:     10
> 
> REQUEST NIMAGE  "satz.*EAST-CONUS.*"    khufu.rap.ucar.edu
> REQUEST NIMAGE  "satz.*WEST-CONUS.*"    khufu.rap.ucar.edu
> 
> nikara:~/logs% egrep 'ERR|WARN' `ls -rt ldmd.log*` | tail -22
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_put(): write
> error: pid=12986, cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/3.9/20131009/3.9_20131009_2200)
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: [filel.c:305]
> Deleting failed PIPE entry: pid=12986, cmd="-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/3.9/20131009/3.9_20131009_2200"
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_prodput: trying
> again:   840869 20131009221042.784  NIMAGE 211512
> satz/ch2/GOES-15/3.9/20131009 2200/WEST-CONUS/4km/ TIGW04 KNES 092200
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pbuf_flush(): fd=26,
> cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/3.9/20131009/3.9_20131009_2200): Broken
> pipe
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_put(): write
> error: pid=12987, cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/3.9/20131009/3.9_20131009_2200)
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: [filel.c:305]
> Deleting failed PIPE entry: pid=12987, cmd="-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/3.9/20131009/3.9_20131009_2200"
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pbuf_flush(): fd=26,
> cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/WV/20131009/WV_20131009_2200): Broken pipe
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_sync():
> pid=12988, cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/WV/20131009/WV_20131009_2200)
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pbuf_flush(): fd=26,
> cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/IR/20131009/IR_20131009_2200): Broken pipe
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_put(): write
> error: pid=12989, cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/IR/20131009/IR_20131009_2200)
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: [filel.c:305]
> Deleting failed PIPE entry: pid=12989, cmd="-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/IR/20131009/IR_20131009_2200"
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_prodput: trying
> again:   819854 20131009221044.789  NIMAGE 211514
> satz/ch2/GOES-15/IR/20131009 2200/WEST-CONUS/4km/ TIGW02 KNES 092200
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pbuf_flush(): fd=26,
> cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/IR/20131009/IR_20131009_2200): Broken pipe
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_put(): write
> error: pid=12990, cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/IR/20131009/IR_20131009_2200)
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: [filel.c:305]
> Deleting failed PIPE entry: pid=12990, cmd="-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/IR/20131009/IR_20131009_2200"
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pbuf_flush(): fd=26,
> cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/13.3/20131009/13.3_20131009_2200):
> Broken pipe
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_put(): write
> error: pid=12991, cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/13.3/20131009/13.3_20131009_2200)
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: [filel.c:305]
> Deleting failed PIPE entry: pid=12991, cmd="-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/13.3/20131009/13.3_20131009_2200"
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_prodput: trying
> again:   673834 20131009221046.023  NIMAGE 211515
> satz/ch2/GOES-15/13.3/20131009 2200/WEST-CONUS/4km/ TIGW06 KNES 092200
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pbuf_flush(): fd=26,
> cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/13.3/20131009/13.3_20131009_2200):
> Broken pipe
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: pipe_put(): write
> error: pid=12992, cmd=(-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/13.3/20131009/13.3_20131009_2200)
> ldmd.log:Oct  9 22:10:49 nikara pqact[12142] ERROR: [filel.c:305]
> Deleting failed PIPE entry: pid=12992, cmd="-close ucsat -
> /d1/ldm/data/gini/WEST-CONUS/4km/13.3/20131009/13.3_20131009_2200"
> 
> 
> 
> khafra.rap.ucar.edu (128.117.196.23)
> load average: 4.94, 7.82, 7.39
> khafra:~% ldmadmin config
> 
> hostname:      khafra.rap.ucar.edu
> os:            Linux
> release:       2.6.32-5-amd64
> ldmhome:       /home/ldm
> bin path:      /home/ldm/bin
> conf file:     /home/ldm/etc/ldmd.conf
> log file:      /home/ldm/logs/ldmd.log
> numlogs:       7
> log_rotate:    1
> data path:     /home/ldm/data
> product queue: /home/ldm/data/ldm.pq
> queue size:    400M bytes
> queue slots:   default
> IP address:    all
> port:          388
> PID file:      /home/ldm/ldmd.pid
> LDMHOSTNAME:   khafra.rap.ucar.edu
> PATH:
> /home/ldm/bin:/bin:/usr/bin:/usr/sbin:/sbin:/usr/ucb:/usr/usb:/usr/etc:/etc:.:/home/ldm/bin:/home/ldm/util:/home/ldm/decoders:/home/ldm/rap/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/bin/X11:/rap/bin
> 
> REQUEST NIMAGE  "satz.*EAST-CONUS.*"    khufu.rap.ucar.edu
> REQUEST NIMAGE  "satz.*WEST-CONUS.*"    khufu.rap.ucar.edu
> 
> khafra:~/logs% egrep 'ERR|WARN' `ls -rt ldmd.log*` | tail -22
> ldmd.log:Oct  9 21:51:59 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 21:53:11 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 21:54:23 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 21:55:35 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 21:56:47 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 21:57:59 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 21:59:11 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:00:23 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:01:35 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:02:47 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:03:59 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:05:11 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:06:23 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:07:35 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:08:47 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:09:59 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:11:11 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:12:23 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:13:35 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:14:47 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:15:59 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out
> ldmd.log:Oct  9 22:17:11 khafra 204.227.127.47[9886] ERROR:
> Disconnecting due to LDM failure; Couldn't connect to LDM on
> 204.227.127.47 using either port 388 or portmapper; : RPC: Remote system
> error - Connection timed out

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: BRX-675440
Department: Support LDM
Priority: Normal
Status: Closed