Bug #1963
Failing to read /proc/mounts for selinux kills file downloads
| Status: | Closed | Start date: | 02/12/2009 | |
|---|---|---|---|---|
| Priority: | High | Due date: | ||
| Assignee: | % Done: | 0% |
||
| Category: | plumbing | |||
| Target version: | 0.25.0 | |||
| Affected Puppet version: | 0.25.0 | Branch: | ||
| Keywords: | selinux plugin timeout | |||
| Votes: | 1 |
Description
With selinux enabled (on fedora10, libselinux-ruby RPM installed, git checkout of 0.24.7) and “pluginsync=true” in puppet.conf, puppet timeouts on plugin retrieval.
Curiously, the problem occurs before puppet starts to connect to the puppetmaster. tcpdump shows no network access before/while the timeout. It seems to me puppet tries to access a file in /proc, which causes the timeout.
Same problem occurs with latest commit on branch 0.24.x Same problem occurs with redhat5 and Tom Payne’s libselinux-ruby-puppet.
This renders puppet unusable if you rely on plugins and you have an selinux-enabled system.
Here is the backtrace I obtain when running puppet with —debug:
[root@fedora10test puppet]# git checkout 0.24.7 HEAD is now at 8befc18... Updated to version 0.24.7 [root@fedora10test puppet]# RUBYLIB=/root/puppet/lib/ bin/puppetd -t --debug --environment marc debug: /File[/var/puppet/facts]/seluser: Found seluser default 'system_u' for /var/puppet/facts debug: /File[/var/puppet/facts]/selrole: Found selrole default 'object_r' for /var/puppet/facts debug: /File[/var/puppet/facts]/seltype: Found seltype default 'var_t' for /var/puppet/facts [...] debug: Finishing transaction 70197300339920 with 0 changes debug: Loaded state in 0.00 seconds debug: Puppet::Network::Client::File: defining fileserver.describe debug: Puppet::Network::Client::File: defining fileserver.list debug: Puppet::Network::Client::File: defining fileserver.retrieve debug: /File[/var/lib/puppet/lib]/seluser: Found seluser default 'system_u' for /var/lib/puppet/lib debug: /File[/var/lib/puppet/lib]/selrole: Found selrole default 'object_r' for /var/lib/puppet/lib debug: /File[/var/lib/puppet/lib]/seltype: Found seltype default 'var_lib_t' for /var/lib/puppet/lib debug: /File[/var/lib/puppet/lib]/selrange: Found selrange default 's0' for /var/lib/puppet/lib info: Retrieving plugins /usr/lib/ruby/1.8/timeout.rb:60:in `read_mounts' /root/puppet/lib/puppet/util/selinux.rb:177:in `find_fs' /root/puppet/lib/puppet/util/selinux.rb:203:in `selinux_label_support?' /root/puppet/lib/puppet/util/selinux.rb:49:in `get_selinux_default_context' /root/puppet/lib/puppet/type/file/selcontext.rb:37:in `retrieve_default_context' /root/puppet/lib/puppet/type/file/selcontext.rb:61:in `default' /root/puppet/lib/puppet/type.rb:651:in `setdefaults' /root/puppet/lib/puppet/type.rb:138:in `eachattr' /root/puppet/lib/puppet/type.rb:135:in `each' /root/puppet/lib/puppet/type.rb:135:in `eachattr' /root/puppet/lib/puppet/type.rb:641:in `setdefaults' /root/puppet/lib/puppet/type.rb:2362:in `initialize' /root/puppet/lib/puppet/type/file.rb:460:in `initialize' /root/puppet/lib/puppet/type.rb:1131:in `new' /root/puppet/lib/puppet/type.rb:1131:in `create' /root/puppet/lib/puppet/node/catalog.rb:200:in `create_resource' /root/puppet/lib/puppet/node/catalog.rb:186:in `create_implicit_resource' /root/puppet/lib/puppet/type/file.rb:643:in `newchild' /root/puppet/lib/puppet/type/file.rb:563:in `localrecurse' /root/puppet/lib/puppet/type/file.rb:558:in `each' /root/puppet/lib/puppet/type/file.rb:558:in `localrecurse' /root/puppet/lib/puppet/type/file.rb:712:in `recurse' /root/puppet/lib/puppet/type/file.rb:340:in `eval_generate' /root/puppet/lib/puppet/transaction.rb:193:in `eval_generate' /root/puppet/lib/puppet/transaction.rb:228:in `eval_resource' /root/puppet/lib/puppet/transaction.rb:310:in `evaluate' /root/puppet/lib/puppet/util.rb:425:in `thinmark' /usr/lib/ruby/1.8/benchmark.rb:308:in `realtime' /root/puppet/lib/puppet/util.rb:424:in `thinmark' /root/puppet/lib/puppet/transaction.rb:309:in `evaluate' /root/puppet/lib/puppet/transaction.rb:303:in `collect' /root/puppet/lib/puppet/transaction.rb:303:in `evaluate' /root/puppet/lib/puppet/node/catalog.rb:124:in `apply' /root/puppet/lib/puppet/network/client/master.rb:331:in `download' /root/puppet/lib/puppet/network/client/master.rb:330:in `download' /root/puppet/lib/puppet/network/client/master.rb:369:in `getplugins' /root/puppet/lib/puppet/network/client/master.rb:192:in `getplugins' /root/puppet/lib/puppet/network/client/master.rb:133:in `getconfig' /root/puppet/lib/puppet/network/client/master.rb:245:in `run' /root/puppet/lib/puppet/util.rb:425:in `thinmark' /usr/lib/ruby/1.8/benchmark.rb:308:in `realtime' /root/puppet/lib/puppet/util.rb:424:in `thinmark' /root/puppet/lib/puppet/network/client/master.rb:244:in `run' /usr/lib/ruby/1.8/sync.rb:229:in `synchronize' /root/puppet/lib/puppet/network/client/master.rb:237:in `run' bin/puppetd:417 err: Could not retrieve plugins: execution expired debug: /File[/var/puppet/facts]/seluser: Found seluser default 'system_u' for /var/puppet/facts debug: /File[/var/puppet/facts]/selrole: Found selrole default 'object_r' for /var/puppet/facts debug: /File[/var/puppet/facts]/seltype: Found seltype default 'var_t' for /var/puppet/facts debug: /File[/var/puppet/facts]/selrange: Found selrange default 's0' for /var/puppet/facts info: Retrieving facts debug: Calling fileserver.list [...]
The same, but under supervision of “strace -e open”:
open("/proc/self/task/10066/attr/current", O_RDONLY) = 6
open("/proc/mounts", O_RDONLY) = 6
open("/selinux/context", O_RDWR) = 6
debug: /File[/var/lib/puppet/lib]/seluser: Found seluser default 'system_u' for /var/lib/puppet/lib
open("/proc/self/task/10066/attr/current", O_RDONLY) = 6
open("/proc/mounts", O_RDONLY) = 6
open("/selinux/context", O_RDWR) = 6
debug: /File[/var/lib/puppet/lib]/selrole: Found selrole default 'object_r' for /var/lib/puppet/lib
open("/proc/self/task/10066/attr/current", O_RDONLY) = 6
open("/proc/mounts", O_RDONLY) = 6
open("/selinux/context", O_RDWR) = 6
debug: /File[/var/lib/puppet/lib]/seltype: Found seltype default 'var_lib_t' for /var/lib/puppet/lib
open("/proc/self/task/10066/attr/current", O_RDONLY) = 6
open("/proc/mounts", O_RDONLY) = 6
open("/selinux/context", O_RDWR) = 6
debug: /File[/var/lib/puppet/lib]/selrange: Found selrange default 's0' for /var/lib/puppet/lib
info: Retrieving plugins
Process 10089 attached (waiting for parent)
Process 10089 resumed (parent 10066 ready)
[pid 10066] open("/var/lib/puppet/lib", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 6
[pid 10066] open("/proc/self/task/10066/attr/current", O_RDONLY) = 6
[pid 10066] open("/proc/mounts", O_RDONLY) = 6
/usr/lib/ruby/1.8/timeout.rb:60:in `read_mounts'
/root/puppet/lib/puppet/util/selinux.rb:177:in `find_fs'
/root/puppet/lib/puppet/util/selinux.rb:177:in `find_fs'
/root/puppet/lib/puppet/util/selinux.rb:203:in `selinux_label_support?'
/root/puppet/lib/puppet/util/selinux.rb:49:in `get_selinux_default_context'
[...]
bin/puppetd:417
err: Could not retrieve plugins: execution expired
[pid 10066] open("/usr/lib/ruby/1.8/net/https.rb", O_RDONLY) = 6
[pid 10066] open("/var/lib/puppet/ssl/certs/ca.pem", O_RDONLY) = 6
[pid 10066] open("/var/lib/puppet/ssl/certs/fedora10test.pem", O_RDONLY) = 6
[pid 10066] open("/proc/self/task/10066/attr/current", O_RDONLY) = 6
[pid 10066] open("/proc/mounts", O_RDONLY) = 6
[pid 10066] open("/selinux/context", O_RDWR) = 6
debug: /File[/var/puppet/facts]/seluser: Found seluser default 'system_u' for /var/puppet/facts
[pid 10066] open("/proc/self/task/10066/attr/current", O_RDONLY) = 6
[pid 10066] open("/proc/mounts", O_RDONLY) = 6
[pid 10066] open("/selinux/context", O_RDWR) = 6
debug: /File[/var/puppet/facts]/selrole: Found selrole default 'object_r' for /var/puppet/facts
[pid 10066] open("/proc/self/task/10066/attr/current", O_RDONLY) = 6
[pid 10066] open("/proc/mounts", O_RDONLY) = 6
[pid 10066] open("/selinux/context", O_RDWR) = 6
debug: /File[/var/puppet/facts]/seltype: Found seltype default 'var_t' for /var/puppet/facts
[pid 10066] open("/proc/self/task/10066/attr/current", O_RDONLY) = 6
[pid 10066] open("/proc/mounts", O_RDONLY) = 6
[pid 10066] open("/selinux/context", O_RDWR) = 6
debug: /File[/var/puppet/facts]/selrange: Found selrange default 's0' for /var/puppet/facts
info: Retrieving facts
Related issues
History
Updated by James Turnbull almost 3 years ago
- Status changed from Unreviewed to Accepted
- Assignee set to Sean Millichamp
Sean? Any ideas? Anyone else?
Updated by James Turnbull almost 3 years ago
I should make it clear that I can’t replicate the problem.
Updated by Luke Kanies almost 3 years ago
- Subject changed from plugins fail to download when selinux ruby bindings are installed to Failing to read /proc/mounts for selinux kills file downloads
I thought I’d seen another case of /proc/mounts timing out when being read; I think it’s a misnomer to say that this is related to plugins or the ruby bindings – it’s the reading of that device or whatever that’s failing. I can’t imagine that either of those other details affect that timeout.
Maybe we just need better error handling for when /proc/mounts times out?
I’m renaming the ticket accordingly.
Updated by James Turnbull almost 3 years ago
- Assignee deleted (
Sean Millichamp)
Updated by Todd Zullinger almost 3 years ago
I can reproduce this, both on F10, and CentOS 5 (with libselinux-ruby installed). I think it’s a bit strange that the open of /proc/mounts only fails for the read done in lib/puppet/util/selinux.rb:read_mounts(), and then only for the plugin syncing. I don’t know if that’s a useful clue or not. There is also a “recent post”:http://groups.google.com/group/puppet-users/browse_thread/thread/e1bef6681ba6ad49/bbe5ca682d14100d to puppet-users that had a similar failure to read /proc/mounts.
Updated by Luke Kanies almost 3 years ago
tmz wrote:
I can reproduce this, both on F10, and CentOS 5 (with libselinux-ruby installed). I think it’s a bit strange that the open of /proc/mounts only fails for the read done in lib/puppet/util/selinux.rb:read_mounts(), and then only for the plugin syncing. I don’t know if that’s a useful clue or not. There is also a “recent post”:http://groups.google.com/group/puppet-users/browse_thread/thread/e1bef6681ba6ad49/bbe5ca682d14100d to puppet-users that had a similar failure to read /proc/mounts.
Are you using autofs, by chance? If you do a search for ‘/proc/mounts timeout’, you’ll see it crop up quite a bit.
Updated by Luke Kanies almost 3 years ago
This link seems pertinent, too:
http://www.nabble.com/Does-readFile-%22-proc-mounts%22-hang-for-you—td21587871.html
Updated by Frank Sweetser almost 3 years ago
- File puppet-read-mounts-nonblock.diff added
Actually, I had it fail when trying to get default selinux context for files, too, so it’s not just the plugin mount.
This patch to make reading /proc/mounts non-blocking fixes the symptoms for me.
Updated by Frank Sweetser almost 3 years ago
The mounts section of the proc man page confirms that select on /proc/mounts will only flag a file as readable when its contents change.
http://linux.die.net/man/5/proc
??Since kernel version 2.6.15, this file is pollable: after opening the file for reading, a change in this file (i.e., a file system mount or unmount) causes select(2) to mark the file descriptor as readable, and poll(2) and epoll_wait(2) mark the file as having an error condition.??
Updated by Todd Zullinger almost 3 years ago
Frank, that patch seems to do the trick here. I’ve tested it on Fedora 10 and CentOS 5 (with libselinux-ruby installed) and not had any failures reading /proc/mounts. Thanks!
Luke, I kept missing you on IRC the last few days. I did test your refactor/0.24.x/fewerfiles branch. More accurately, I applied these 3 commits on top of 0.24.7:
aa2bdf7... Adding a FileCollection and a lookup module for it. adb9097... Using the FileCollection where appropriate. fb7f29b... Adding a performance optimization to the FileCollection.
Unfortunately, that didn’t appear to have any affect on the /proc/mounts problem. If you still think it would help, I could build fresh from that branch. (I was trying to introduce as few changes as possible, and save some time so I’d have more hours to relax on the beach. ;)
Updated by James Turnbull almost 3 years ago
- Category set to plumbing
- Status changed from Accepted to Closed
Pushed in commit:0e467869f4d427a8c42e1c2ff6a0bb215f288b09 in branch 0.24.x
Updated by Todd Zullinger almost 3 years ago
- Status changed from Closed to Re-opened
- Target version changed from 0.24.8 to 0.25.0
- Affected Puppet version changed from 0.24.7 to 0.25.0
This seems to be back with 0.25.0beta1. In commit:a677e26 (Fixing all tests that were apparently broken in the 0.24.x merge), NONBLOCK was changed to File::NONBLOCK in lib/puppet/util/selinux.rb. Reverting this change corrects the problem, but I assume it will break one of the tests.
Updated by Todd Zullinger over 2 years ago
Todd Zullinger wrote:
This seems to be back with 0.25.0beta1. In commit:a677e26 (Fixing all tests that were apparently broken in the 0.24.x merge), NONBLOCK was changed to File::NONBLOCK in lib/puppet/util/selinux.rb. Reverting this change corrects the problem, but I assume it will break one of the tests.
Well, appearances were deceiving. I don’t think this was ever fixed, I think the original patch just hid the failure. As NONBLOCK isn’t defined anywhere, it’s use in 0.24.8 causes an ‘uninitialized constant’ error. But since it is used in a begin/rescue block, read_mounts() jusr returns nil rather than throwing that error. Now that the code uses the valid File::NONBLOCK flag there is no error, but the read of /proc/mounts hangs (on select(), according to strace).
Anyone have good ideas on how to properly debug and/or fix this? It still seems odd that this only seems to hang while retrieving plugins and not during any of the other places where read_mounts() is called.
Updated by James Turnbull over 2 years ago
- Status changed from Re-opened to Accepted
- Priority changed from Normal to High
Updated by Ricky Zhou over 2 years ago
Hey, Todd. As I mentioned on IRC, our bugs (#1963 and my last few comments at #1095) are starting to look like they could be related in that read_mounts is hanging on reading /proc/mounts. Here’s my strace output on puppetmasterd: http://ricky.fedorapeople.org/puppetmasterd.strace. I see a lot of selects on 7 timing out – does that match up with your findings?
Updated by Ricky Zhou over 2 years ago
By the way, Luke’s link looks very promising. The bugzilla report linked off of the trac ticket there is http://bugzilla.kernel.org/show_bug.cgi?id=11014. The test program in that bug does hang on /proc/mounts for me. Is it normal behavior in Ruby to do selects on files that you open? Is there perhaps some other interface for getting mount information?
Updated by Ricky Zhou over 2 years ago
Here is a stripped down test which reproduces the problem. I don’t know if Ruby upstream would consider this a bug or not:
#!/usr/bin/ruby
x = Thread.new {
puts 'Before open'
mountfh = File.open("/proc/mounts", File::NONBLOCK)
puts 'After open, before read'
out = mountfh.read
puts out
puts 'After read, before close'
mountfh.close
puts 'After close'
}
x.join
for what it’s worth, a similar python version does not seem to call select:
#!/usr/bin/python
import threading
def do_stuff():
print open('/proc/mounts').read()
x = threading.Thread(target=do_stuff)
x.start()
x.join()
Updated by Todd Zullinger over 2 years ago
Ricky, yes, the failures I see all occur when ruby tries to use select on /proc/mounts (which it only seems to do when retrieving the plugins). Perhaps there’s a simple way to work-around this with some creative refactoring? :)
This does seem like a ruby bug at its core.
Updated by Todd Zullinger over 2 years ago
FWIW, here is some strace output showing the difference between successful reads of /proc/mount and the failing one.
Clean open calls:
open("/proc/mounts", O_RDONLY|O_NONBLOCK) = 6
fcntl(6, F_GETFL) = 0x8800 (flags O_RDONLY|O_NONBLOCK|O_LARGEFILE)
fstat(6, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b2c3501e000
lseek(6, 0, SEEK_CUR) = 0
fstat(6, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
read(6, "rootfs / rootfs rw 0 0\n/dev/root"..., 4096) = 1018
read(6, "", 4096) = 0
close(6) = 0
munmap(0x2b2c3501e000, 4096) = 0
Failed open call:
open("/proc/mounts", O_RDONLY|O_NONBLOCK) = 6
fcntl(6, F_GETFL) = 0x8800 (flags O_RDONLY|O_NONBLOCK|O_LARGEFILE)
fstat(6, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaaaaaab000
lseek(6, 0, SEEK_CUR) = 0
fstat(6, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
gettimeofday({1246722448, 477419}, NULL) = 0
select(7, [6], [], [], {119, 893409}) = 0 (Timeout)
gettimeofday({1246722573, 883356}, NULL) = 0
select(7, [6], [], [], {0, 0}) = 0 (Timeout)
brk(0x1b230000) = 0x1b230000
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
time(NULL) = 1246722573
select(7, [6], [], [], {0, 0}) = 0 (Timeout)
Updated by Todd Zullinger over 2 years ago
Also potentially of interest is http://patchwork.kernel.org/patch/17058/. The patch there is regarding things in /sys, but one of the subsystem maintainers points out that /proc/mounts has the same problem.
Updated by Todd Zullinger over 2 years ago
Sorry for all the bug spam today. Following the kernel git tree, this issue appears to have been resolved in the linux-2.6 stable tree with commit 31b0709¹. I’ll try to test this locally and verify that it corrects the problem we’re seeing here. Then we’ll have to see whether this is going to make it into the RHEL 4/5 updates.
Perhaps read_mounts could be refactored to cache the output? On a very small puppetmaster I counted 454 open calls for /proc/mounts. If the data was cached and only updated when needed, it might save a bit of time even after the kernel fix makes its way to users systems. And until then, it could probably work-around this bug.
¹ http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6-stable.git;a=commitdiff;h=31b0709
Updated by Luke Kanies over 2 years ago
Todd Zullinger wrote:
Sorry for all the bug spam today. Following the kernel git tree, this issue appears to have been resolved in the linux-2.6 stable tree with commit 31b0709¹. I’ll try to test this locally and verify that it corrects the problem we’re seeing here. Then we’ll have to see whether this is going to make it into the RHEL 4/5 updates.
Perhaps read_mounts could be refactored to cache the output? On a very small puppetmaster I counted 454 open calls for /proc/mounts. If the data was cached and only updated when needed, it might save a bit of time even after the kernel fix makes its way to users systems. And until then, it could probably work-around this bug.
That’s, um, crazy. This is only used in Facter, right? How long does it take for that number of calls to happen?
We’re working on a caching system for Facter, for 2.0, I think, which should make this better.
Updated by Todd Zullinger over 2 years ago
The read_mounts call in question here is from puppet (in lib/puppet/util/selinux.rb). I don’t know what the best way to determine how many files this puppet configuration is managing, but here’s a rough idea:
# grep -c 'type: file$' /var/lib/puppet/localconfig.yaml 82
It doesn’t take any real noticeable amount of time for most of these calls, aside from the one that runs inside the download_plugins function and hangs, of course. But surely they do add up to something more than is really needed. To paraphrase an old statesman¹, “A few milliseconds here and a few milliseconds there, and pretty soon you’re talking about real time.” If it’s possible to refactor things to avoid many of these calls, it might kill several birds with one elegant stone.
¹ I’m reminded also of a quote of unknown origin that says, “A statesman is a dead politician. Lord knows, we need more statesmen.” ;)
Updated by Luke Kanies over 2 years ago
Todd Zullinger wrote:
It doesn’t take any real noticeable amount of time for most of these calls, aside from the one that runs inside the download_plugins function and hangs, of course. But surely they do add up to something more than is really needed. To paraphrase an old statesman¹, “A few milliseconds here and a few milliseconds there, and pretty soon you’re talking about real time.” If it’s possible to refactor things to avoid many of these calls, it might kill several birds with one elegant stone.
It should be quite easy to use the Puppet::Util::Cacher module to cache this data relatively briefly, especially if whoever’s using the data has access to a catalog. That should get rid of the problems with it sometimes working and sometimes not, also.
Updated by Ricky Zhou over 2 years ago
I think the read is guaranteed(?) to work before any threaded code starts, so if it gets cached then, everything should run fine. That really doesn’t leave any way to update the information at any point after though :–/.
Honestly, I think this may be the least behavior-changing hack to correct the issue until the underlying kernel bug is resolved.
diff --git a/lib/puppet/util/selinux.rb b/lib/puppet/util/selinux.rb
index 7be119c..5816ce2 100644
--- a/lib/puppet/util/selinux.rb
+++ b/lib/puppet/util/selinux.rb
@@ -153,7 +153,7 @@ module Puppet::Util::SELinux
# Internal helper function to read and parse /proc/mounts
def read_mounts
begin
- mountfh = File.open("/proc/mounts", File::NONBLOCK)
+ mountfh = IO.popen("/bin/cat /proc/mounts")
mounts = mountfh.read
mountfh.close
rescue
In some other testing, I found that the read_nonblock function does not trigger the select calls, so this seems to work for me as well with ruby-1.8.6.287-8.fc11.x86_64:
diff --git a/lib/puppet/util/selinux.rb b/lib/puppet/util/selinux.rb
index 7be119c..bb2fbd1 100644
--- a/lib/puppet/util/selinux.rb
+++ b/lib/puppet/util/selinux.rb
@@ -152,9 +152,13 @@ module Puppet::Util::SELinux
# Internal helper function to read and parse /proc/mounts
def read_mounts
+ mounts = ""
begin
- mountfh = File.open("/proc/mounts", File::NONBLOCK)
- mounts = mountfh.read
+ mountfh = File.open("/proc/mounts")
+ while true
+ mounts += mountfh.read_nonblock(1024)
+ end
+ rescue EOFError
mountfh.close
rescue
return nil
Updated by Luke Kanies over 2 years ago
Haven’t we accepted a final patch for this? I know we’ve gone through multiple reviews, but I seem to have lost the final state.
Updated by Ricky Zhou over 2 years ago
I don’t think so, we’ve discussed a couple of approaches to the issue at this point, and I’d personally lean towards the first suggestion in comment 25 with the addition of an explanatory comment (even though it’s kind of hack), since it’s pretty much a drop-in replacement with relatively few additional moving parts involved. Then again, I don’t know my way around Ruby at all, and I may very well be missing an obvious elegant way to solve the problem too :–)
Updated by Todd Zullinger over 2 years ago
I think the read_nonblock from comment 25 might be better, as long as both methods are sufficient to work-around the problem. The cost of forking a shell to run cat seems rather high to me.
# This is 1000 iterations of the IO.popen("/bin/cat /proc/mounts") method
$ time ./test1.rb
real 0m1.721s
user 0m0.287s
sys 0m1.273s
# This is 1000 iterations of the read_nonblock(1024) method
$ time ./test2.rb
real 0m0.165s
user 0m0.036s
sys 0m0.087s
I’ve not had a lot of time for testing in the last few days (rum and clear, 86°F ocean water have won out over computer time ;), but both methods do seem to work-around the problem (tested against 0.25.0beta2 on CentOS 5.3). A note in the code that this is a only there until the linux kernel bug is resolved would be a good idea. Some other testing and confirmation that it fixes the issue would be good, especially on RHEL and Fedora systems.
Updated by Ricky Zhou over 2 years ago
Todd Zullinger wrote:
I think the read_nonblock from comment 25 might be better, as long as both methods are sufficient to work-around the problem. The cost of forking a shell to run cat seems rather high to me.
[…]
I’ve not had a lot of time for testing in the last few days (rum and clear, 86°F ocean water have won out over computer time ;), but both methods do seem to work-around the problem (tested against 0.25.0beta2 on CentOS 5.3). A note in the code that this is a only there until the linux kernel bug is resolved would be a good idea. Some other testing and confirmation that it fixes the issue would be good, especially on RHEL and Fedora systems. Fair enough :–) I didn’t think about the time issues with spawning a new shell each time, especially with the number of times that this can get called – thanks a lot for the testing. The main reason I wasn’t too sure about the read_nonblock solution was that I could not find any definitive ruby documentation that guarantees that the function will never call select (perhaps this would be a good question to ask ruby developers?). I did test the patch I mentioned successfully on an F11 x86_64 VM, with kernel-2.6.29.5-191.fc11.x86_64 and ruby-1.8.6.287-8.fc11.x86_64 (I applied the patch on master).
Updated by Luke Kanies over 2 years ago
The read_nonblock option seems like the right one, then. Is the patch in a final form?
Updated by Ricky Zhou over 2 years ago
Luke Kanies wrote:
The read_nonblock option seems like the right one, then. Is the patch in a final form? Yup, although feel free to clean it up/make it more rubyish if you want. It’d also be nice to add a comment explaining why all this code is necessary. I’d be fine with you making those modifications when you commit, or if you’d like, I can post another patch with the comment added.
Updated by Todd Zullinger over 2 years ago
For convenience I’ve committed Ricky’s patch (with a comment about why this is needed) to the tickets/master/1963 branch at git://jet.mox.net/~tmz/puppet.git. (I’d mark this Ready for Checkin, but I either don’t have permissions to do that or I’m just too thick to see how. :)
Updated by James Turnbull over 2 years ago
- Status changed from Accepted to Ready For Checkin
- Assignee set to James Turnbull
Ready for Checkin.
Todd & Ricky – added you both to the project as developers – means you can change ticket status now.
Updated by James Turnbull over 2 years ago
- Status changed from Ready For Checkin to Closed
Pushed in commit:910a5e2f39d2fefab28768f2a2d3d02de7abd54d in branch master.