Bug #1683

Settings should not replace current values until new values are available

Added by Oliver Hookins over 3 years ago. Updated over 3 years ago.

Status:Closed Start date:10/24/2008
Priority:High Due date:
Assignee:James Turnbull % Done:

0%

Category:settings
Target version:0.24.7
Affected Puppet version:0.24.6 Branch:
Keywords:
Votes: 0

Description

So far I haven’t been able to figure out what this is caused by so indulge me if I don’t have a complete picture of what is going on.

Background: * my manifests are working – I test them quite thoroughly on a staging puppetmaster * the staging puppetmaster never has any issues and is quite similar in environment to the live puppetmaster * this problem most often happens after I’ve rolled out some changes to the manifests, e.g. a new module or reasonably large change. * after the next run by the client puppetd on all servers, usually the issue goes away, so it is transient and not related to the manifests (as far as I can see).

On the puppetd side I usually get messages like this:

Fri Oct 24 18:02:09 +1100 2008 //Node[testclient]/managed/ntp/File[/etc/xinetd.d/time] (err): Failed to retrieve current state of resource: No such file or
directory - /var/puppet/yaml/node Could not describe /ntp/etc/xinetd.d/time: No such file or directory - /var/puppet/yaml/node at
/var/lib/puppet/modules/ntp/manifests/init.pp:164
Fri Oct 24 18:02:09 +1100 2008 //Node[testclient]/managed/ntp/Service[xinetd] (notice): Dependency file[/etc/xinetd.d/time] has 1 failures
Fri Oct 24 18:02:09 +1100 2008 //Node[testclient]/managed/ntp/Service[xinetd] (warning): Skipping because of failed dependencies
Fri Oct 24 18:02:21 +1100 2008 //Node[testclient]/managed/ntp/File[/etc/ntp.conf] (err): Failed to retrieve current state of resource: No such file or
directory - /var/puppet/yaml/node Could not describe /host/etc/ntp.conf: No such file or directory - /var/puppet/yaml/node at
/var/lib/puppet/modules/ntp/manifests/init.pp:84

I do NOT have /var/puppet/ configured as a path ANYWHERE in the client or server configs. I have double and triple checked this.

On the puppetmasterd side I was able to capture this when running briefly with —debug and —verbose from the console:

info: Caching node for client1
err: Could not call: No such file or directory - /var/puppet/yaml/node
err: Could not call: No such file or directory - /var/puppet/yaml/node
warning: Could not find facts for client2; you probably have a discrepancy between the node and fact names
info: Processing reports tagmail, store for client3
warning: Could not find facts for client4; you probably have a discrepancy between the node and fact names
warning: Could not find facts for client5; you probably have a discrepancy between the node and fact names
warning: Could not find facts for client6; you probably have a discrepancy between the node and fact names
info: Processing reports tagmail, store for client7
warning: Could not find facts for client7; you probably have a discrepancy between the node and fact names/usr/lib/ruby/1.8/sync.rb:57:in `Fail'
/usr/lib/ruby/1.8/sync.rb:63:in `Fail'
/usr/lib/ruby/1.8/sync.rb:183:in `sync_unlock'
/usr/lib/ruby/1.8/sync.rb:231:in `synchronize'
/usr/lib/ruby/site_ruby/1.8/puppet/util.rb:385:in `threadlock'
/usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:437:in `newmessage'
/usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:436:in `each'
/usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:436:in `newmessage'
/usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:505:in `initialize'
/usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:126:in `new'
/usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:126:in `create'
/usr/lib/ruby/site_ruby/1.8/puppet.rb:59:in `info'
/usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:152:in `send'
/usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:151:in `each'
/usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:151:in `send'
/usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:126:in `fork'
/usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:126:in `send'
/usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:121:in `process'
/usr/lib/ruby/site_ruby/1.8/puppet/network/handler/report.rb:66:in `process'
/usr/lib/ruby/site_ruby/1.8/puppet/network/handler/report.rb:59:in `each'
/usr/lib/ruby/site_ruby/1.8/puppet/network/handler/report.rb:59:in `process'
/usr/lib/ruby/site_ruby/1.8/puppet/network/handler/report.rb:33:in `report'
/usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `to_proc'
/usr/lib/ruby/site_ruby/1.8/puppet/network/xmlrpc/processor.rb:52:in `call'
/usr/lib/ruby/site_ruby/1.8/puppet/network/xmlrpc/processor.rb:52:in `protect_service'
/usr/lib/ruby/site_ruby/1.8/puppet/network/xmlrpc/processor.rb:85:in `setup_processor'
/usr/lib/ruby/1.8/xmlrpc/server.rb:336:in `call'
/usr/lib/ruby/1.8/xmlrpc/server.rb:336:in `dispatch'
/usr/lib/ruby/1.8/xmlrpc/server.rb:323:in `each'
/usr/lib/ruby/1.8/xmlrpc/server.rb:323:in `dispatch'
/usr/lib/ruby/1.8/xmlrpc/server.rb:366:in `call_method'
/usr/lib/ruby/1.8/xmlrpc/server.rb:378:in `handle'
/usr/lib/ruby/site_ruby/1.8/puppet/network/xmlrpc/processor.rb:44:in `process'
/usr/lib/ruby/site_ruby/1.8/puppet/network/http_server/mongrel.rb:111:in `process'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel/http_response.rb:65:in `start'
/usr/lib/ruby/site_ruby/1.8/puppet/network/http_server/mongrel.rb:108:in `process'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:159:in `process_client'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:158:in `each'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:158:in `process_client'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:285:in `run'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:285:in `initialize'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:285:in `new'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:285:in `run'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:268:in `initialize'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:268:in `new'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:268:in `run'
/usr/sbin/puppetmasterd:287
/usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:166:in `send': Could not send report emails via sendmail: Thread(#) not locked. (Puppet::Error)
        from /usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:126:in `fork'
        from /usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:126:in `send'
        from /usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:121:in `process'
        from /usr/lib/ruby/site_ruby/1.8/puppet/network/handler/report.rb:66:in `process'
        from /usr/lib/ruby/site_ruby/1.8/puppet/network/handler/report.rb:59:in `each'
        from /usr/lib/ruby/site_ruby/1.8/puppet/network/handler/report.rb:59:in `process'
        from /usr/lib/ruby/site_ruby/1.8/puppet/network/handler/report.rb:33:in `report'
        from /usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `to_proc'
         ... 20 levels...
        from /usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:268:in `initialize'
        from /usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:268:in `new'
        from /usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:268:in `run'
        from /usr/sbin/puppetmasterd:287

I’m sorry if I have the traceback and other output mixed up, I’m not sure what causes what. This traceback and error messages are printed ad infinitum as I have about 120 nodes connecting at about the same time. Hostnames have been changed to protect the innocent.

Any ideas?

0003-Lock-all-accesses-to-the-values-instance-variable.patch - Proposed patch to fix the bug. (20.5 kB) Matt Palmer, 11/04/2008 11:32 pm


Related issues

related to Puppet - Bug #1143: Sporadic error messages in puppetmaster log related to un... Closed

History

Updated by Luke Kanies over 3 years ago

  • Status changed from Unreviewed to Needs More Information
  • Priority changed from Normal to High

Wow. So it looks like you’re somehow losing all setting values when the manifests get reparsed? That’s frightening.

Could this be happening when you update puppet.conf, or are you sure it’s only when you update the manifests?

Updated by Oliver Hookins over 3 years ago

puppet.conf on the puppetmaster is very stable – it is rolled out as is from my svn repo by rsync so rarely if ever gets changed. puppet.conf on the clients is rolled out by puppet but again this rarely changes.

My repo rollout makefile doesn’t restart the puppetmaster as I assume it can pick up any changes made to the manifests automatically. A graceful restart of the Apache daemon in front of Mongrel is done though.

So in short, no I don’t believe it has anything to do with updating puppet.conf on either the clients or the master as this is generally not done. Is there any more I can do to get more debugging/trace output that would be useful?

Updated by Oliver Hookins over 3 years ago

Oh, if it makes any difference, both my staging and live puppetmasters are running on RHEL5 i386.

Updated by Luke Kanies over 3 years ago

As indicated, I think this is related to #1143, but that ticket shows the settings losing their values entirely, rather than your case which reverts to the default values.

I’m absolutely stumped, though — I have analyzed the code pretty thoroughly and have no idea how it could ever revert to either no values or default values; at least, I don’t see how it could happen without puppet.conf being involved in some way.

Try this: in puppetmasterd, after startup but before it backgrounds, freeze Puppet.settings. At some point, you should get an exception, and hopefully the stack trace will tell you who was trying to modify it. However, freezing might not quite be enough — try the code here: http://www.semintelligent.com/blog/articles/29/freezing-ruby-data-structures-recursively

Updated by Oliver Hookins over 3 years ago

OK, I did a simple Puppet.settings.freeze and it did provide a traceback, let me know if you want me to try again with the deep freeze.

In order to reproduce the issue I added a new node config. Here is the traceback and output from puppetmasterd after the next node connected:

notice: Reparsing /etc/puppet/puppet.conf
Tue Oct 28 17:10:48 +1100 2008: Read error: #
/usr/lib/ruby/site_ruby/1.8/puppet/util/settings.rb:107:in `clear'
/usr/lib/ruby/site_ruby/1.8/puppet/util/settings.rb:337:in `parse'
/usr/lib/ruby/site_ruby/1.8/puppet/util/settings.rb:506:in `reparse'
/usr/lib/ruby/1.8/sync.rb:229:in `synchronize'
/usr/lib/ruby/site_ruby/1.8/puppet/util/settings.rb:505:in `reparse'
/usr/lib/ruby/site_ruby/1.8/puppet/util/settings.rb:762:in `value'
/usr/lib/ruby/site_ruby/1.8/puppet/util/settings.rb:19:in `[]'
/usr/lib/ruby/site_ruby/1.8/puppet.rb:92:in `[]'
/usr/lib/ruby/site_ruby/1.8/puppet/network/http_server/mongrel.rb:127:in `client_info'
/usr/lib/ruby/site_ruby/1.8/puppet/network/http_server/mongrel.rb:105:in `process'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:159:in `process_client'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:158:in `each'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:158:in `process_client'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:285:in `run'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:285:in `initialize'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:285:in `new'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:285:in `run'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:268:in `initialize'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:268:in `new'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:268:in `run'
/usr/sbin/puppetmasterd:290
debug: Allowing authenticated client client1(127.0.0.1) access to fileserver.describe
info: mount[module]: allowing *.anchor.net.au access
info: mount[host]: allowing *.anchor.net.au access
info: mount[role]: allowing *.anchor.net.au access
info: mount[common]: allowing *.anchor.net.au access
info: mount[facts]: allowing *.anchor.net.au access
debug: No modules mount given; autocreating with default permissions
debug: No plugins mount given; autocreating with default permissions
warning: Could not find facts for client1; you probably have a discrepancy between the node and fact names
info: Caching node for client1
err: Could not call: No such file or directory - /var/puppet/yaml/node
debug: Allowing authenticated client client1(127.0.0.1) access to fileserver.list
warning: Could not find facts for client1; you probably have a discrepancy between the node and fact names
info: Caching node for client1
err: Could not call: No such file or directory - /var/puppet/yaml/node

… and so on.

Updated by Luke Kanies over 3 years ago

  • Subject changed from "Random" errors coming from puppetmasterd to Settings should not replace current values until new values are available
  • Category set to settings
  • Status changed from Needs More Information to Accepted
  • Assignee set to Luke Kanies
  • Target version set to 0.24.7

Awesome, thanks.

Puppet is convinced that your puppet.conf needs reparsing, which means that the timestamp on disk has changed. It looks like a value is being used while the file is being reparsed, apparently because while I’m using a mutex around the parsing, I’m not using one to retrieve a value, providing a race condition.

In looking at the code, I see two things I can do: Either wrap value retrieval in the mutex, or don’t replace the in-memory values until the whole file has been parsed. The latter is the more sensible option, I think.

Updated by Oliver Hookins over 3 years ago

Ah, ok that makes sense. My rollout procedure actually rsyncs over the puppet.conf from the staging server then uses sed to replace a few values for the live environment, so while the MD5 of the file doesn’t change, the mtime would.

Glad we could get to the bottom of this.

Updated by Matt Palmer over 3 years ago

luke wrote:

In looking at the code, I see two things I can do: Either wrap value retrieval in the mutex, or don’t replace the in-memory values until the whole file has been parsed. The latter is the more sensible option, I think.

More sensible, but not practical. There’s at least three instance variables in a Settings object that need updating, and I can’t see a way of doing that atomically, so all variable accesses will have to be locked anyway.

It might be easier to wrap all accesses to @@settings in the Puppet class and do the ol' switcheroo there.

Updated by Luke Kanies over 3 years ago

mpalmer wrote:

luke wrote:

In looking at the code, I see two things I can do: Either wrap value retrieval in the mutex, or don’t replace the in-memory values until the whole file has been parsed. The latter is the more sensible option, I think.

More sensible, but not practical. There’s at least three instance variables in a Settings object that need updating, and I can’t see a way of doing that atomically, so all variable accesses will have to be locked anyway.

It might be easier to wrap all accesses to @@settings in the Puppet class and do the ol' switcheroo there.

What are the three instance variables that need updating? Seems like you could just swap out the @values hash in the ‘parse’ method; what am I missing?

I’m less comfortable wrapping Puppet.settings, because I really think the responsibility should lie with the Settings class, and I don’t think you could wrap it successfully for cases where Puppet.settings[:foo] is used.

Updated by Matt Palmer over 3 years ago

luke wrote:

mpalmer wrote:

luke wrote:

In looking at the code, I see two things I can do: Either wrap value retrieval in the mutex, or don’t replace the in-memory values until the whole file has been parsed. The latter is the more sensible option, I think.

More sensible, but not practical. There’s at least three instance variables in a Settings object that need updating, and I can’t see a way of doing that atomically, so all variable accesses will have to be locked anyway.

It might be easier to wrap all accesses to @@settings in the Puppet class and do the ol' switcheroo there.

What are the three instance variables that need updating? Seems like you could just swap out the @values hash in the ‘parse’ method; what am I missing?

There’s also @cache that needs dealing with, and I can’t find the third variable now, but when I started adding locks everywhere there was definitely something else that needed love.

I’m less comfortable wrapping Puppet.settings, because I really think the responsibility should lie with the Settings class, and I don’t think you could wrap it successfully for cases where Puppet.settings[:foo] is used.

Yeah, it’s probably going to be at least as ugly going that way. It would have been an awfully minimal solution, though.

  • Matt

Updated by Luke Kanies over 3 years ago

mpalmer wrote:

luke wrote:

mpalmer wrote:

luke wrote:

In looking at the code, I see two things I can do: Either wrap value retrieval in the mutex, or don’t replace the in-memory values until the whole file has been parsed. The latter is the more sensible option, I think.

More sensible, but not practical. There’s at least three instance variables in a Settings object that need updating, and I can’t see a way of doing that atomically, so all variable accesses will have to be locked anyway.

It might be easier to wrap all accesses to @@settings in the Puppet class and do the ol' switcheroo there.

What are the three instance variables that need updating? Seems like you could just swap out the @values hash in the ‘parse’ method; what am I missing?

There’s also @cache that needs dealing with, and I can’t find the third variable now, but when I started adding locks everywhere there was definitely something else that needed love.

Actually, you shouldn’t have to touch @cache, at least until you replace the @values hash.

All I’m talking about is building up the replacement hash in a separate variable, and only reassigning @values once the whole parsing or whatever has succeeded. This is a technique I’ve used in many other parts of the system to avoid race conditions, including in the language parser. Once you replace @values, of course, you’d want to clear the cache.

Updated by Matt Palmer over 3 years ago

luke wrote:

mpalmer wrote:

luke wrote:

mpalmer wrote:

luke wrote:

In looking at the code, I see two things I can do: Either wrap value retrieval in the mutex, or don’t replace the in-memory values until the whole file has been parsed. The latter is the more sensible option, I think.

More sensible, but not practical. There’s at least three instance variables in a Settings object that need updating, and I can’t see a way of doing that atomically, so all variable accesses will have to be locked anyway.

It might be easier to wrap all accesses to @@settings in the Puppet class and do the ol' switcheroo there.

What are the three instance variables that need updating? Seems like you could just swap out the @values hash in the ‘parse’ method; what am I missing?

There’s also @cache that needs dealing with, and I can’t find the third variable now, but when I started adding locks everywhere there was definitely something else that needed love.

Actually, you shouldn’t have to touch @cache, at least until you replace the @values hash.

All I’m talking about is building up the replacement hash in a separate variable, and only reassigning @values once the whole parsing or whatever has succeeded. This is a technique I’ve used in many other parts of the system to avoid race conditions, including in the language parser. Once you replace @values, of course, you’d want to clear the cache.

I have this sneaking suspicion that “avoid” might actually mean “make less likely” rather than “make impossible”…

The problem with trying to replace @values and clear @cache without a mutex is that the race condition still exists, it’s just a smaller target. So when you’re updating @values you need to also clear @cache in the same lock — which means that we still need explicit locking on every access to @values. Since this bug is causing pain in the day job, I’ve whipped up the attached patch to do just that. It doesn’t break anything else, but it hasn’t been tested enough to have a reasonable guarantee that it definitely fixes the bug, since it can’t be reproduced with 100% reliability. The patch is a bit of a monster, as there’s a few bits of dodgy code I gutted while I was in there.

Also, I found the other place that (I thought) needed locking — @config. Turns out that @config is just diving back into the Settings object, so it kinda-sorta doesn’t need locking (although I get the feeling there’s still going to be pain and suffering down the line).

  • Matt

Updated by Oliver Hookins over 3 years ago

Try as I might, I can’t reproduce the problem in my staging environment so a safe testing of the patch was impossible. I did however deploy the patch on my live puppetmaster and the problem hasn’t recurred since the patch has been in place. I haven’t noticed anything additional breaking due to the patch.

Updated by Oliver Hookins over 3 years ago

Not sure if this is related still to this bug but I was trying to diagnose another error and got this traceback from the puppetmaster:

/usr/lib/ruby/1.8/sync.rb:57:in `Fail'
/usr/lib/ruby/1.8/sync.rb:63:in `Fail'
/usr/lib/ruby/1.8/sync.rb:183:in `sync_unlock'
/usr/lib/ruby/1.8/sync.rb:231:in `synchronize'
/usr/lib/ruby/site_ruby/1.8/puppet/util.rb:385:in `threadlock'
/usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:437:in `newmessage'
/usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:436:in `each'
/usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:436:in `newmessage'
/usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:505:in `initialize'
/usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:126:in `new'
/usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:126:in `create'
/usr/lib/ruby/site_ruby/1.8/puppet.rb:59:in `info'
/usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:152:in `send'
/usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:151:in `each'
/usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:151:in `send'
/usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:126:in `fork'
/usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:126:in `send'
/usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:121:in `process'
/usr/lib/ruby/site_ruby/1.8/puppet/network/handler/report.rb:66:in `process'
/usr/lib/ruby/site_ruby/1.8/puppet/network/handler/report.rb:59:in `each'
/usr/lib/ruby/site_ruby/1.8/puppet/network/handler/report.rb:59:in `process'
/usr/lib/ruby/site_ruby/1.8/puppet/network/handler/report.rb:33:in `report'
/usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `to_proc'
/usr/lib/ruby/site_ruby/1.8/puppet/network/xmlrpc/processor.rb:52:in `call'
/usr/lib/ruby/site_ruby/1.8/puppet/network/xmlrpc/processor.rb:52:in `protect_service'
/usr/lib/ruby/site_ruby/1.8/puppet/network/xmlrpc/processor.rb:85:in `setup_processor'
/usr/lib/ruby/1.8/xmlrpc/server.rb:336:in `call'
/usr/lib/ruby/1.8/xmlrpc/server.rb:336:in `dispatch'
/usr/lib/ruby/1.8/xmlrpc/server.rb:323:in `each'
/usr/lib/ruby/1.8/xmlrpc/server.rb:323:in `dispatch'
/usr/lib/ruby/1.8/xmlrpc/server.rb:366:in `call_method'
/usr/lib/ruby/1.8/xmlrpc/server.rb:378:in `handle'
/usr/lib/ruby/site_ruby/1.8/puppet/network/xmlrpc/processor.rb:44:in `process'
/usr/lib/ruby/site_ruby/1.8/puppet/network/http_server/mongrel.rb:111:in `process'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel/http_response.rb:65:in `start'
/usr/lib/ruby/site_ruby/1.8/puppet/network/http_server/mongrel.rb:108:in `process'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:159:in `process_client'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:158:in `each'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:158:in `process_client'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:285:in `run'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:285:in `initialize'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:285:in `new'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:285:in `run'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:268:in `initialize'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:268:in `new'
/usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:268:in `run'
/usr/sbin/puppetmasterd:287
debug: File[/etc/puppet/puppet.conf]: Adding default for backup/usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:166:in `send': Could not send report emails via sendmail: Thread(#) not locked. (Puppet::Error)
        from /usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:126:in `fork'
        from /usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:126:in `send'
        from /usr/lib/ruby/site_ruby/1.8/puppet/reports/tagmail.rb:121:in `process'
        from /usr/lib/ruby/site_ruby/1.8/puppet/network/handler/report.rb:66:in `process'
        from /usr/lib/ruby/site_ruby/1.8/puppet/network/handler/report.rb:59:in `each'
        from /usr/lib/ruby/site_ruby/1.8/puppet/network/handler/report.rb:59:in `process'
        from /usr/lib/ruby/site_ruby/1.8/puppet/network/handler/report.rb:33:in `report'
        from /usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `to_proc'
         ... 20 levels...
        from /usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:268:in `initialize'
        from /usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:268:in `new'
        from /usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/lib/mongrel.rb:268:in `run'
        from /usr/sbin/puppetmasterd:287

After it occurred the first time after quite a few clients had already connected then disconnected, it progressively happened more and more frequently in each of the puppetmasters I had running.

Updated by Luke Kanies over 3 years ago

ohookins wrote:

Not sure if this is related still to this bug but I was trying to diagnose another error and got this traceback from the puppetmaster:

[…]

After it occurred the first time after quite a few clients had already connected then disconnected, it progressively happened more and more frequently in each of the puppetmasters I had running.

That’s a different and already-filed bug.

Updated by Luke Kanies over 3 years ago

mpalmer wrote:

I have this sneaking suspicion that “avoid” might actually mean “make less likely” rather than “make impossible”…

The problem with trying to replace @values and clear @cache without a mutex is that the race condition still exists, it’s just a smaller target. So when you’re updating @values you need to also clear @cache in the same lock — which means that we still need explicit locking on every access to @values. Since this bug is causing pain in the day job, I’ve whipped up the attached patch to do just that. It doesn’t break anything else, but it hasn’t been tested enough to have a reasonable guarantee that it definitely fixes the bug, since it can’t be reproduced with 100% reliability. The patch is a bit of a monster, as there’s a few bits of dodgy code I gutted while I was in there.

Also, I found the other place that (I thought) needed locking — @config. Turns out that @config is just diving back into the Settings object, so it kinda-sorta doesn’t need locking (although I get the feeling there’s still going to be pain and suffering down the line).

  • Matt

The patch looks good to me.

I’d still like the atomic value replacement, though; I’ll hopefully get to it, but it should be easy to do with the mutex in this patch.

Updated by Luke Kanies over 3 years ago

  • Status changed from Accepted to Ready For Checkin
  • Assignee changed from Luke Kanies to James Turnbull

I’ve applied this patch with a small change in my tickets/0.24.x/1683 branch in my repo.

Updated by James Turnbull over 3 years ago

  • Status changed from Ready For Checkin to Closed

Pushed in commit:97a817706f7993044b69f148fe2ba74bbcb5d4a3 in branch 0.24.x

Also available in: Atom PDF