Bug #616
service resource takes over puppetd port - 0.22.4
| Status: | Closed | Start date: | ||
|---|---|---|---|---|
| Priority: | High | Due date: | ||
| Assignee: | % Done: | 0% |
||
| Category: | - | |||
| Target version: | 0.25.0 | |||
| Affected Puppet version: | 0.25.4 | Branch: | ||
| Keywords: | ||||
| Votes: | 0 |
Description
Strange one: Now and again some puppetd clients stop responding on port 8139 if i log onto the client and do the following: NOTE: x.x.x.x = client1 ip, y.y.y.y = puppetmaster server, z.z.z.z = snmp polling device (graphing server)
client1 # netstat -anp | grep 8139 tcp 0 0 0.0.0.0:8139 0.0.0.0:* LISTEN 21419/snmpd tcp 0 0 x.x.x.x:8139 y.y.y.y:60796 SYN_RECV - tcp 0 0 x.x.x.x:8139 y.y.y.y:60778 SYN_RECV - tcp 0 0 x.x.x.x:8139 z.z.z.z:49714 SYN_RECV - tcp 1 0 x.x.x.x:8139 z.z.z.z:58764 CLOSE_WAIT - tcp 1 0 x.x.x.x:8139 z.z.z.z:55994 CLOSE_WAIT - tcp 1 0 x.x.x.x:8139 z.z.z.z:34520 CLOSE_WAIT - tcp 1 0 x.x.x.x:8139 z.z.z.z:37074 CLOSE_WAIT - tcp 1 0 x.x.x.x:8139 z.z.z.z:33729 CLOSE_WAIT - tcp 1 0 x.x.x.x:8139 z.z.z.z:42101 CLOSE_WAIT -
Even stranger:
client1#netstat -anp | grep 161 udp 0 0 0.0.0.0:161 0.0.0.0:* 21419/snmpd client1# /etc/init.d/snmpd restart Restarting network management services: snmpd. client1# netstat -anp | grep 8139 /etc/init.d/puppetd restart * Stopping puppetd...[fail] * Starting puppetd...[ ok ] client1#netstat -anp | grep 8139 tcp 0 0 0.0.0.0:8139 0.0.0.0:* LISTEN 27612/ruby1.8
Config for that snmp resource:
package { 'snmp': ensure => latest }
package { 'snmpd': ensure => latest }
configfile { "/etc/snmp/snmpd.conf": source => "/snmpd.conf", mode => 600 }
exec { "/etc/init.d/snmpd restart": subscribe => FILE[[etcsnmpsnmpdconf]], refreshonly => true }
logs when this happened showed very little:
Apr 27 11:21:09 syd-spool-02 puppetdr5392: Reopening log files Apr 27 11:21:10 syd-spool-02 puppetdr5392: Starting Puppet client version 0.22.3 Apr 27 11:21:17 syd-spool-02 puppetdr5392: Could not trigger run; already running Apr 27 11:21:18 syd-spool-02 puppetdr5392: Starting configuration run
History
Updated by Brendan Beveridge almost 5 years ago
I can confirm this still happens in 0.22.4 now
I have now tracked down how to replicate this on my systems (debian 3.1, ubuntu 6.06, ubuntu 6.10) puppetd —server $servername —waitforcert 60 —listen —debug if its starts a service in this: notice: //client.hostname.blah/template-class/client-ntp/Service[ntp-server]/ensure: ensure changed ‘stopped’ to ‘running’
if i leave it running its fine, ie puppetd is still listening on 8139 if i then stop puppetd from running in the foreground then run a netstat -anpt | grep 8139
instead of the puppet process in control of the port its now ntpd
my ntp service config is:
package { ‘ntp-server’: ensure => latest } service { ‘ntp-server’: ensure => true, status => “/bin/pidof ntpd”, enable => true } configfile { “/etc/ntp.conf”: source => “/clients/ntp/ntp.conf”, notify => service[‘ntp-server’] }
at first i thought it may have been something funky with using status => “/bin/pidof ntpd”
but i then tested this with a variety of services syslog-ng, am-utils etc and they just use: service { ‘servicename’: ensure => true, enable => true }
Updated by David Lutterkort almost 5 years ago
It sounds a lot like there’s a fork somewhere and the child process doesn’t close all the sockets it received from the parents. Haven’t looked at the code though.
Updated by Luke Kanies almost 5 years ago
- Status changed from 1 to 2
Yep. This is where the ObjectSpace loop looking for TCPServer instances was from, but I never got it to work apparently, it caused other problems.
Updated by Valentin Vidic over 4 years ago
It seems that puppet doesn’t set close-on-exec flag for opened files so they get inherited by the children. For example, lsof lists these open files for puppetd:
ruby 2383 root 0u CHR 3,0 1713 /dev/ttyp0 ruby 2383 root 1u CHR 3,0 1713 /dev/ttyp0 ruby 2383 root 2u CHR 3,0 1713 /dev/ttyp0 ruby 2383 root 3r FIFO 0,6 7841 pipe ruby 2383 root 4w FIFO 0,6 7841 pipe ruby 2383 root 5u unix 0xc76cb6c0 7842 socket ruby 2383 root 6u REG 8,1 20762 182340 /var/log/puppet/http.log ruby 2383 root 7u REG 8,1 20762 182340 /var/log/puppet/http.log ruby 2383 root 8u REG 8,1 20762 182340 /var/log/puppet/http.log ruby 2383 root 9u IPv4 7848 TCP *:8139 (LISTEN) ruby 2383 root 10r REG 8,1 598 49553 /etc/ssh/ssh_host_dsa_key.pub ruby 2383 root 11r REG 8,1 390 49550 /etc/ssh/ssh_host_rsa_key.pub
The same files are open in the ntpd started by puppetd:
ntpd 2436 ntp 0u CHR 1,3 753 /dev/null ntpd 2436 ntp 1u CHR 1,3 753 /dev/null ntpd 2436 ntp 2u CHR 1,3 753 /dev/null ntpd 2436 ntp 3r FIFO 0,6 7841 pipe ntpd 2436 ntp 4w FIFO 0,6 7841 pipe ntpd 2436 ntp 5u unix 0xc76cb860 7895 socket ntpd 2436 ntp 6u REG 8,1 20762 182340 /var/log/puppet/http.log ntpd 2436 ntp 7u REG 8,1 20762 182340 /var/log/puppet/http.log ntpd 2436 ntp 8u REG 8,1 20762 182340 /var/log/puppet/http.log ntpd 2436 ntp 9u IPv4 7848 TCP :8139 (LISTEN) ntpd 2436 ntp 10r REG 8,1 598 49553 /etc/ssh/ssh_host_dsa_key.pub ntpd 2436 ntp 11r REG 8,1 390 49550 /etc/ssh/ssh_host_rsa_key.pub ntpd 2436 ntp 12w REG 8,1 21975 182323 /var/log/ntpstats/peerstats.20070604 ntpd 2436 ntp 13w REG 8,1 596 182384 /var/log/ntpstats/loopstats.20070604 ntpd 2436 ntp 16u IPv4 7899 UDP :ntp ntpd 2436 ntp 17u IPv6 7900 UDP *:ntp ntpd 2436 ntp 18u IPv6 7901 UDP [::1]:ntp ntpd 2436 ntp 19u IPv6 7902 UDP [fe80::216:3eff:fe27:4eea]:ntp ntpd 2436 ntp 20u IPv4 7903 UDP front.local:ntp ntpd 2436 ntp 21u IPv4 7904 UDP dhcp132.nat11.irb.lo:ntp
You need to call io.fcntl(Fcntl::F_SETFD, Fcntl::FD_CLOEXEC) on every opened io object that has a file descriptor associated.
Updated by Valentin Vidic over 4 years ago
It seems that puppet doesn’t set close-on-exec flag for opened files so they get inherited by the children. For example, lsof lists these open files for puppetd:
ruby 2383 root 0u CHR 3,0 1713 /dev/ttyp0 ruby 2383 root 1u CHR 3,0 1713 /dev/ttyp0 ruby 2383 root 2u CHR 3,0 1713 /dev/ttyp0 ruby 2383 root 3r FIFO 0,6 7841 pipe ruby 2383 root 4w FIFO 0,6 7841 pipe ruby 2383 root 5u unix 0xc76cb6c0 7842 socket ruby 2383 root 6u REG 8,1 20762 182340 /var/log/puppet/http.log ruby 2383 root 7u REG 8,1 20762 182340 /var/log/puppet/http.log ruby 2383 root 8u REG 8,1 20762 182340 /var/log/puppet/http.log ruby 2383 root 9u IPv4 7848 TCP *:8139 (LISTEN) ruby 2383 root 10r REG 8,1 598 49553 /etc/ssh/ssh_host_dsa_key.pub ruby 2383 root 11r REG 8,1 390 49550 /etc/ssh/ssh_host_rsa_key.pub
The same files are open in the ntpd started by puppetd:
ntpd 2436 ntp 0u CHR 1,3 753 /dev/null ntpd 2436 ntp 1u CHR 1,3 753 /dev/null ntpd 2436 ntp 2u CHR 1,3 753 /dev/null ntpd 2436 ntp 3r FIFO 0,6 7841 pipe ntpd 2436 ntp 4w FIFO 0,6 7841 pipe ntpd 2436 ntp 5u unix 0xc76cb860 7895 socket ntpd 2436 ntp 6u REG 8,1 20762 182340 /var/log/puppet/http.log ntpd 2436 ntp 7u REG 8,1 20762 182340 /var/log/puppet/http.log ntpd 2436 ntp 8u REG 8,1 20762 182340 /var/log/puppet/http.log ntpd 2436 ntp 9u IPv4 7848 TCP *:8139 (LISTEN) ntpd 2436 ntp 10r REG 8,1 598 49553 /etc/ssh/ssh_host_dsa_key.pub ntpd 2436 ntp 11r REG 8,1 390 49550 /etc/ssh/ssh_host_rsa_key.pub ntpd 2436 ntp 12w REG 8,1 21975 182323 /var/log/ntpstats/peerstats.20070604 ntpd 2436 ntp 13w REG 8,1 596 182384 /var/log/ntpstats/loopstats.20070604 ntpd 2436 ntp 16u IPv4 7899 UDP *:ntp ntpd 2436 ntp 17u IPv6 7900 UDP *:ntp ntpd 2436 ntp 18u IPv6 7901 UDP [::1]:ntp ntpd 2436 ntp 19u IPv6 7902 UDP [fe80::216:3eff:fe27:4eea]:ntp ntpd 2436 ntp 20u IPv4 7903 UDP front.local:ntp ntpd 2436 ntp 21u IPv4 7904 UDP dhcp132.nat11.irb.lo:ntp
You need to call io.fcntl(Fcntl::F_SETFD, Fcntl::FD_CLOEXEC) on every opened io object that has a file descriptor associated.
Updated by Valentin Vidic over 4 years ago
I have attached a small patch that sets this flag on the socket descriptor. Similar change should be done for other open file descriptors.
It seems the socket problem is a bug in WEBrick, since it also tries to set FD_CLOEXEC but calls fcntl with the wrong parameters –
io.fcntl(Fcntl::FD_CLOEXEC, 1)
instead of
io.fcntl(Fcntl::F_SETFD, Fcntl::FD_CLOEXEC)
Updated by Luke Kanies over 4 years ago
- Status changed from 2 to Closed
- 7 set to fixed
Fixed in r2553, thanks to patches from vvidic.