Bug #2773
buglet in provider/service/launchd.rb
| Status: | Closed | Start date: | 11/03/2009 | |
|---|---|---|---|---|
| Priority: | High | Due date: | ||
| Assignee: | % Done: | 0% |
||
| Category: | service | |||
| Target version: | 2.7.10 | |||
| Affected Puppet version: | 0.25.1 | Branch: | https://github.com/puppetlabs/puppet/pull/324 | |
| Keywords: | osx launchd | |||
| Votes: | 0 |
Description
I added a persistent postfix plist to replace the run-on-demand one distributed with OSX. (Which causes our nagios mailqueue check to fail annoyingly because qmgr doesn’t run all the time). So upon copying the plist, process status should have been ‘enabled’ but ‘stopped’. But it wouldn’t start up by itself:
debug: Service[postfix](provider=launchd): Executing 'launchctl load /System/Library/LaunchDaemons/org.postfix.master.plist' err: //mail::nullclient/Service[postfix]/ensure: change from stopped to running failed: Unable to start service: org.postfix.master at path: /System/Library/LaunchDaemons/org.postfix.master.plist
I manually got it working with ‘launchctl load -w’ , but the conditional in the start() method was not being triggered. After resetting my system back to previous state and changing the check as below, I get the expected result:
debug: //mail::nullclient/Service[postfix]: Changing ensure debug: //mail::nullclient/Service[postfix]: 1 change(s) debug: Service[postfix](provider=launchd): Executing 'launchctl load -w /System/Library/LaunchDaemons/org.postfix.master.plist' notice: //mail::nullclient/Service[postfix]/ensure: ensure changed 'stopped' to 'running'
So it seems ‘enabled’ (in puppet’s view of things) but ‘stopped’ may still need -w to overcome a ‘disabled’ key in launchd’s little mind. What do you think?
--- a/lib/puppet/provider/service/launchd.rb
+++ b/lib/puppet/provider/service/launchd.rb
@@ -163,7 +163,7 @@ Puppet::Type.type(:service).provide :launchd, :parent => :base do
did_enable_job = false
cmds = []
cmds << :launchctl << :load
- if self.enabled? == :false # launchctl won't load disabled jobs
+ if self.enabled? == :false || self.status == :stopped # launchctl won't load disabled jobs
cmds << "-w"
did_enable_job = true
end
History
Updated by James Turnbull over 2 years ago
- Status changed from Unreviewed to Accepted
- Target version set to 0.25.2
Updated by Nigel Kersten over 2 years ago
Can you post the contents of the plist too please eric?
Was this on 10.5 or 10.6 ?
Updated by eric sorenson over 2 years ago
- File postfix-launchd.plist added
Ack, sorry I should have included this straight off. Attached.
It’s on 10.6, I’ll try to repro on 10.5 as well.
Updated by Nigel Kersten over 2 years ago
That would be awesome Eric. There are some differences in 10.5 and 10.6 with launchd that mean this might only be a 10.6 bug.
Updated by eric sorenson over 2 years ago
Leopard appears to behave correctly. Same plist, same manifest, no code changes, produced this on first run:
debug: Puppet::Type::Service::ProviderLaunchd: Executing '/bin/launchctl list' notice: //mail::nullclient/Service[postfix]: Triggering 'refresh' from 1 dependencies debug: Puppet::Type::Service::ProviderLaunchd: Executing '/bin/launchctl list' debug: Service[postfix](provider=launchd): Executing 'launchctl unload -w /System/Library/LaunchDaemons/org.postfix.master' debug: Service[postfix](provider=launchd): Executing 'launchctl load /System/Library/LaunchDaemons/org.postfix.master' debug: Finishing transaction 14181260 with 1 changes debug: Storing state
qmgr and pickup are running as expected at the end of it.
Updated by Nigel Kersten about 2 years ago
(sent via email to Eric while projects.reductivelabs was down)
There were a couple of things that struck me about your plist.
a) you’re replacing a system plist, this is a bad idea due to OS updates potentially stomping over it. Better to leave the system one disabled and make a new one. b) you should use your own reverse domain notation for the label so it’s a separate job to the system one. c) It should go into /Library/LaunchDaemons, not /System/Library/LaunchDaemons
Even so… I can’t seem to replicate this, although I messed up while testing and deleted the original postfix plist, so I need to go do some digging to retrieve it.
Did you possibly have the system version already running and then dropped your plist on top of it? That’s about the only scenario I can think of that might produce this behaviour that I haven’t tested.
The provider should cope with the actual job plist changing, but I can see there might be problems with certain kinds of changes…
Eric is hoping to get to this next week, and this weekend I’ll run through some more tests with the various scenarios.
Updated by Luke Kanies about 2 years ago
Any progress on this? Sounds like you guys haven’t quite decided on the ultimate answer.
Updated by Nigel Kersten about 2 years ago
- Target version changed from 0.25.2 to 4
I can’t reproduce it, but am not yet sure that there isn’t an issue somewhere.
I’ve switched the target version to unplanned until I get more info back or can reproduce the issue, and will make a target decision then.
Updated by Nigel Kersten about 2 years ago
To clarify, I feel like there might be a cluster of issues around replacing an existing plist, where the original job is loaded and has completely different characteristics to the new plist, but I haven’t had the time to go through all the possible permutations.
We may be able to do some fiddling around with getting launchctl info for a particular job (for 10.5 and higher) before we decide how to apply desired state, but it’s going to be rather complicated, so I’d like to find the common cases first.
Updated by eric sorenson about 2 years ago
I think Nigel’s update in #4 is right in that I am doing enough questionable things to make this an edge-case. His point about not replacing a system plist is a fair one, but rather disabling it and adding our own in the non-OS-controlled /Library/LaunchDaemons area is a good one.
I’d be OK with closing this as unable-to-repro or PEBCAK or whatever.
–=Eric
Updated by eric sorenson about 2 years ago
eric sorenson wrote:
I think Nigel’s update in Comment 4 is right in that I am doing enough questionable things to make this an edge-case.
I’ve spent some more time with this (and just clobbered my own update to the ticket! argh) and now I do think we ought to be more aggressive about using ‘launchctl load -w’ because, at least on snow leopard, my newly-added non-conflicting plists are not getting loaded otherwise. Using ‘-w’ doesn’t hurt anything if it’s already enabled AFAIK.
debug: Puppet::Type::Service::ProviderLaunchd: Executing '/bin/launchctl list' debug: //puppet-client::service/Service[puppetd]: Changing ensure debug: //puppet-client::service/Service[puppetd]: 1 change(s) debug: Service[puppetd](provider=launchd): Executing 'launchctl load /Library/LaunchDaemons/com.reductivelabs.puppet.plist' err: //puppet-client::service/Service[puppetd]/ensure: change from stopped to running failed: Unable to start service: com.reductivelabs.puppet at path: /Library/LaunchDaemons/com.reductivelabs.puppet.plist
Here’s the shell equivalents and the plist:
[root@leterel /usr/local/sbin]# launchctl load /Library/LaunchDaemons/com.reductivelabs.puppet.plist nothing found to load [root@leterel /usr/local/sbin]# echo $? 1 [root@leterel /usr/local/sbin]# launchctl load -w /Library/LaunchDaemons/com.reductivelabs.puppet.plist [root@leterel /usr/local/sbin]# cat /Library/LaunchDaemons/com.reductivelabs.puppet.plistEnvironmentVariables PATH /usr/local/bin:/usr/local/sbin:/bin:/usr/bin:/sbin:/usr/sbin Label com.reductivelabs.puppet OnDemand ProgramArguments puppetd --verbose --logdest=/var/puppet/log/puppetd.log --no-daemonize --color=false RunAtLoad StandardErrorPath /var/puppet/log/puppetd.err StandardOutPath /var/puppet/log/puppetd.out
Updated by Nigel Kersten about 2 years ago
I’m completely swamped at work at the moment Eric, and I’d be overjoyed to take a patch from you on this with tests :)
Updated by Nigel Kersten over 1 year ago
Eric, you sent a patch to fix this didn’t you?
Updated by eric sorenson over 1 year ago
I’ve just been using the one-line patch inline at the problem description with no ill effects – I haven’t pushed a real git commit.
Updated by Gary Larizza over 1 year ago
Eric,
Have you seen any ill-effects out of the change in launchd.rb? I’m seeing this pop up and am wondering if you’ve had good luck with that fix.
Updated by eric sorenson over 1 year ago
Gary Larizza wrote:
Eric,
Have you seen any ill-effects out of the change in launchd.rb? I’m seeing this pop up and am wondering if you’ve had good luck with that fix.
Nope it’s been fine.
Updated by James Turnbull 12 months ago
- Status changed from Accepted to Needs Decision
Updated by Nigel Kersten 6 months ago
- Status changed from Needs Decision to Tests Insufficient
- Priority changed from Normal to High
- Target version changed from 4 to 2.7.x
It looks to me like we need some tests around this?
Updated by James Turnbull 3 months ago
- Assignee changed from Nigel Kersten to Gary Larizza
Gary – any chance you can look at the tests for this in your copious spare time? :)
Updated by Gary Larizza 3 months ago
Sure – thanks for the reminder on it :) I just submitted a bunch of tests to launchd, and I guess I should probably work on it while it’s fresh in my head :)
Updated by Gary Larizza about 1 month ago
- Branch set to https://github.com/puppetlabs/puppet/pull/324
I updated the code and created spec tests to account for this. Eric, take a look and see if it meets the condition you’re encountering.
Updated by Daniel Pittman about 1 month ago
- Status changed from Tests Insufficient to Merged - Pending Release
- Target version changed from 2.7.x to 2.7.10
Updated by Michael Stahnke 23 days ago
- Status changed from Merged - Pending Release to Closed
released in 2.7.10rc1