Bug #2299
Node YAML files being corrupted on puppet master server
| Status: | Duplicate | Start date: | 05/24/2009 |
|---|---|---|---|
| Priority: | Normal | Due date: | |
| Assignee: | | % Done: | 0% |
| Category: | transactions | | |
| Target version: | - | | |
| Affected Puppet version: | 0.24.8 | Branch: | |
| Keywords: | | | |
| Votes: | 0 | | |
Description
There appears to be a case where the node YAML file for a given host gets corrupted.
Error messages like the following start appearing in the client’s messages file:
May 25 08:24:32 s_local@CLIENT puppetd[29059]: [ID 702911 daemon.error] (//Node[default]/filesecurity/File[/etc/default/passwd]) Failed to retrieve current state of resource: Could not parse YAML data for node CLIENT.domain.com: syntax error on line 46, col 18: ` operatingsystem: Solaris' Could not describe /filesecurity/passwd-CLIENT: Could not parse YAML data for node CLIENT.domain.com: syntax error on line 46, col 18: ` operatingsystem: Solaris' at /etc//opt/csw/puppet/modules/filesecurity/manifests/init.pp:12
This issue was possibly triggered by circumstances outside the control of the Puppet Master (like a network failure), but it’s the way Puppet handles this issue that I’m more concerned about…
I have seen this happen a few times under different circumstances. Usually the corruption is minor, just an entry split across two lines. The solution I have used is to remove the file, which causes Puppet to rebuild it from the client the next time it is triggered.
It would be better to detect the failure and attempt a rebuild before alerting. Obviously it would be better still not to have the corruption in the first place, but given the existence of custom facts and other code-based add-ons, this is unlikely to be something that Puppet can prevent. It would also be preferable to retry only a set number of times; otherwise it could in theory get into an infinite loop…
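The behaviour requested above (detect the parse failure, rebuild, and give up after a fixed number of tries) could look roughly like the sketch below. It is only an illustration, not Puppet code: `load_yaml_with_rebuild`, its `max_attempts` parameter, and the rebuild block are all hypothetical names, and it assumes the caller knows how to regenerate the file.

```ruby
require "yaml"

# Hypothetical helper (not part of Puppet): load a cached YAML file and, if it
# cannot be read or parsed, delete it, let the caller rebuild it, and retry.
# The attempt counter bounds the retries so a file that keeps coming back
# corrupted cannot cause an infinite loop.
def load_yaml_with_rebuild(path, max_attempts = 3)
  attempts = 0
  begin
    attempts += 1
    YAML.load(File.read(path))
  rescue StandardError => detail
    # Out of attempts: only now surface the error to the caller.
    if attempts >= max_attempts
      raise "Could not parse YAML data in #{path} after #{attempts} attempts: #{detail}"
    end
    # Remove the damaged file and ask the caller (the block) to regenerate it.
    File.unlink(path) if File.exist?(path)
    yield path
    retry
  end
end
```

A caller would pass a block that regenerates the file, for example `load_yaml_with_rebuild(file) { |p| File.write(p, fresh_yaml) }`, where `fresh_yaml` stands in for whatever the rebuild produces. The error is raised only once the retry budget is used up, so a single transient corruption never reaches the logs as a failure.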
History
Updated by James Turnbull over 2 years ago
- Category set to transactions
- Status changed from Unreviewed to Needs More Information
- Assignee set to Luke Kanies
Luke – I think this is fixed in 0.25.0?
Updated by Luke Kanies over 2 years ago
- Status changed from Needs More Information to Duplicate
Updated by Kurt Keller over 1 year ago
For us this started happening in 0.24.8 as well, once we added more clients. I don’t have a fix for the root cause (the corrupted YAML file), but the following patch against puppet/indirector/yaml.rb helps us a lot. Instead of Puppet refusing to run until an admin removes the offending YAML file, the patch removes the file automatically.
--- yaml.rb.ori 2010-03-02 16:04:57.000000000 +0000
+++ yaml.rb.new 2010-11-09 16:48:30.000000000 +0000
@@ -17,9 +17,13 @@
             raise Puppet::Error, "Could not read YAML data for %s %s: %s" % [indirection.name, request.key, detail]
         end
         begin
-            return from_yaml(yaml)
+            # give errors a chance to be caught before anything is returned
+            yaml_hack = from_yaml(yaml)
+            # and return the result if no errors are found
+            return yaml_hack
         rescue => detail
-            raise Puppet::Error, "Could not parse YAML data for %s %s: %s" % [indirection.name, request.key, detail]
+            # simply remove the trashed YAML file; it will be recreated
+            File.unlink(file)
         end
     end
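
One detail of the patch may be worth a second look: `File.unlink` returns the number of files it removed, so when the `rescue` branch is the last thing evaluated, that integer becomes the method's return value rather than `nil`. The stand-alone sketch below shows the same remove-on-parse-failure idea with an explicit `nil`. It is an illustration only: `read_cached_yaml` is a hypothetical name, and it uses plain `YAML.load` instead of Puppet's `from_yaml`.

```ruby
require "yaml"

# Hypothetical stand-alone version of the patched logic (not Puppet's method):
# return the parsed data, or nil if the file is missing or unparseable.
def read_cached_yaml(file)
  return nil unless File.exist?(file)

  yaml = File.read(file)
  begin
    YAML.load(yaml)
  rescue StandardError => detail
    # Leave a trace of what happened, then remove the damaged file so that
    # it is recreated on the next run.
    warn "Removing unparseable YAML file #{file}: #{detail}"
    File.unlink(file)
    # File.unlink returns the count of removed files, so return nil explicitly
    # to signal "nothing cached" to the caller.
    nil
  end
end
```

Logging the removal keeps a record of how often the corruption occurs, which silent deletion would otherwise hide.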