Bug #2299

Node YAML files being corrupted on puppet master server

Added by Greg Boug over 2 years ago. Updated over 1 year ago.

Status: Duplicate
Start date: 05/24/2009
Priority: Normal
Due date:
Assignee: Luke Kanies
% Done: 0%
Category: transactions
Target version: -
Affected Puppet version: 0.24.8
Branch:
Keywords:
Votes: 0

Description

There appears to be a case where the node YAML file for a given host gets corrupted.

Error messages start appearing in the messages file of the client like these:

May 25 08:24:32 s_local@CLIENT puppetd[29059]: [ID 702911 daemon.error] (//Node[default]/filesecurity/File[/etc/default/passwd]) Failed to retrieve current state of resource: Could not parse YAML data for node CLIENT.domain.com: syntax error on line 46, col 18: `  operatingsystem: Solaris' Could not describe /filesecurity/passwd-CLIENT: Could not parse YAML data for node CLIENT.domain.com: syntax error on line 46, col 18: `  operatingsystem: Solaris' at /etc//opt/csw/puppet/modules/filesecurity/manifests/init.pp:12

This issue was possibly triggered by circumstances outside the control of the Puppet Master (like a network failure), but it’s the way Puppet handles the corruption that I’m more concerned about…

I have seen this happen a few times under different circumstances. Usually the corruption is just an entry split across two lines, not a huge corruption. The workaround I have used is to remove the file, which causes Puppet to rebuild it from the client the next time it is triggered.

What would be better would be to detect the failure and attempt to rebuild the file first before alerting. Obviously it would be better not to have the corruption in the first place, but given the existence of custom facts and other code-based add-ons, this is unlikely to be something that Puppet can stop from happening. It would also be preferable to have it retry only a set number of times; otherwise it could in theory get into an infinite loop…
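
To make the request concrete, here is a minimal sketch of "detect, rebuild, retry a bounded number of times, then alert". It is not Puppet code: `load_yaml_with_rebuild`, the `max_attempts` parameter and the rebuild block are all hypothetical names chosen for illustration, and the block stands in for whatever step would regenerate the node file from the client.

```ruby
require "yaml"

# Hypothetical helper, not part of Puppet: load a cached YAML file and,
# if it cannot be parsed, delete it, let the caller rebuild it, and try
# again. Gives up after max_attempts tries so it cannot loop forever.
def load_yaml_with_rebuild(path, max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    YAML.load_file(path)
  rescue StandardError => detail
    # Remove the unreadable file so a corrupted copy is never reused.
    File.unlink(path) if File.exist?(path)
    if attempts < max_attempts
      yield path # caller-supplied rebuild step (an assumption of this sketch)
      retry
    end
    # Only alert once every attempt has failed.
    raise "Could not parse YAML data in #{path} after #{attempts} attempts: #{detail}"
  end
end
```

A caller would pass the rebuild step as a block, for example `load_yaml_with_rebuild(file) { |f| regenerate_node_file(f) }`, where `regenerate_node_file` is again a placeholder. The attempt counter is what prevents the infinite loop mentioned above.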


Related issues

duplicates Puppet - Bug #2029: node yaml file can get corrupted Closed 02/26/2009

History

Updated by James Turnbull over 2 years ago

  • Category set to transactions
  • Status changed from Unreviewed to Needs More Information
  • Assignee set to Luke Kanies

Luke – I think this is fixed in 0.25.0?

Updated by Luke Kanies over 2 years ago

  • Status changed from Needs More Information to Duplicate

Updated by Kurt Keller over 1 year ago

For us this started happening in 0.24.8 as well, once we added more clients. I don’t have a fix for the root cause (the corrupted YAML file), but the following patch against puppet/indirector/yaml.rb helps us a lot. Previously puppet stopped running until an admin removed the offending YAML file; with this patch the file is removed automatically.

--- yaml.rb.ori 2010-03-02 16:04:57.000000000 +0000
+++ yaml.rb.new 2010-11-09 16:48:30.000000000 +0000
@@ -17,9 +17,13 @@
             raise Puppet::Error, "Could not read YAML data for %s %s: %s" % [indirection.name, request.key, detail]
         end
         begin
-            return from_yaml(yaml)
+            # give errors a chance to be caught before anything is returned
+            yaml_hack = from_yaml(yaml)
+            # and return the result if no errors are found
+            return yaml_hack
         rescue => detail
-            raise Puppet::Error, "Could not parse YAML data for %s %s: %s" % [indirection.name, request.key, detail]
+            # simply remove the trashed YAML file; it will be recreated
+            File.unlink(file)
         end
     end
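
One detail worth checking in the patch above: `File.unlink` returns the number of files it removed, so if the `begin` block is the last expression in the method (as the diff context suggests), the rescue path hands that integer back to the caller instead of `nil`. The standalone sketch below shows the same idea with an explicit `nil` and a warning. `find_cached_yaml` is a hypothetical name, and this is not the actual `find` method from puppet/indirector/yaml.rb.

```ruby
require "yaml"

# Standalone sketch of the patched behaviour; find_cached_yaml is a
# hypothetical name used only for illustration.
def find_cached_yaml(file)
  return nil unless File.exist?(file)

  begin
    YAML.load_file(file)
  rescue StandardError => detail
    # Leave a trace so silent removals can still be noticed and counted.
    warn "Removing unparseable YAML file #{file}: #{detail}"
    File.unlink(file)
    nil # report a cache miss rather than File.unlink's integer result
  end
end
```

Returning `nil` lets the caller treat a removed file the same way as a missing one, which matches the intent that the file "will be recreated".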
