Bug #2824
Failing to account for bug in ruby GC
| Status: | Closed | Start date: | 11/17/2009 |
|---|---|---|---|
| Priority: | High | Due date: | |
| Assignee: | | % Done: | 0% |
| Category: | plumbing | | |
| Target version: | 0.25.2 | | |
| Affected Puppet version: | 0.25.1 | Branch: | http://github.com/MarkusQ/puppet/tree/ticket/0.25.x/2824 |
| Keywords: | memory leak | | |
| Votes: | 0 | | |
Description
There is a known (albeit poorly understood) bug in MRI’s GC/stack frame handling that can cause extreme memory leakage.
Ruby manages a number of special “global” variables that are actually scope-limited, transitory, thread-local state accessors: for example, $_ (the last line read with gets) and $~, $1, $2, … (the details of the last regular expression match).
These are normally stored in the stack frame of the routine that triggered them (so if you do “Bob’s your uncle” =~ /(.*)’s/ in a routine you will subsequently see $1 == “Bob” until the next match is performed, but the routine that called you will not see a change in $1 on return).
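The frame-local behaviour described above can be seen in a minimal sketch like the following. The method names `inner_match` and `outer` are made up for illustration and do not come from the Puppet code base or the attached files.

```ruby
# Hypothetical helper: performs its own match, which sets $1 in its own frame.
def inner_match
  "Bob's your uncle" =~ /(.*)'s/
  $1
end

# Hypothetical caller: performs a match first, then calls the helper.
def outer
  "Alice's turn" =~ /(.*)'s/
  inner = inner_match
  # $1 here still refers to the match performed in this method,
  # not the one performed inside inner_match.
  [inner, $1]
end

p outer  # prints ["Bob", "Alice"]
```

The caller's $1 is untouched by the callee's match, which is why these variables need somewhere frame-specific to live.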
A problem arises when a routine that does not have its own stack frame (because of a compiler optimization) performs an operation that creates one or more of these variables. In this case MRI dynamically creates a stack frame but (due to the bug) neither cleans it up on exit nor arranges for the GC to collect it later. This is worse than a normal memory leak (one caused by careless object reference management, for example) in that the orphaned chunks apparently cannot be moved. Thus the heap grows increasingly fragmented, causing even more memory to be wasted; hundreds of megabytes can be consumed in a few minutes, only to be released when the program exits.
The simplest characterization of the sort of routines that trigger the optimization is “routines with no local variable assignments”. Characterizing the operations that will trigger the bug is a bit harder (see the attached files for a few examples that explore the boundary).
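As a rough illustration of the shape being described (this is not the content of the attached files, and all names are hypothetical), the first method below has no local variable assignments yet performs a regular expression match. The second differs only by a dummy local assignment; that it would keep the routine out of the optimized path is an assumption drawn from the characterization above, not something verified here. On interpreters without the bug the two behave identically.

```ruby
# Hypothetical example: no local variable assignments, but the match
# creates $~ / $1, which is the combination the description points at.
def first_word_no_locals(line)
  $1 if line =~ /(\w+)/
end

# Same logic with a dummy local assignment. Assumption: under the
# characterization above, this routine no longer qualifies for the
# "no local variable assignments" optimization.
def first_word_with_local(line)
  _frame_anchor = nil
  $1 if line =~ /(\w+)/
end

puts first_word_no_locals("memory leak")
puts first_word_with_local("memory leak")
```

Both calls print the same word, so a change of this kind alters only how the interpreter sets up the frame, not what the routine returns.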
We have around ten routines that meet the definition, several of which can be shown to exhibit the problem.
The following files show cases that do (and do not) cause explosive memory consumption. They are all as simple as possible to demonstrate the point; their behavior under MRI 1.8.6 can be inferred from their names.
History
Updated by Markus Roberts about 2 years ago
- File regex_subscript_boom.rb added
- File simple_boom.rb added
- File simple_no_boom.rb added
- File tricky_boom.rb added
- File tricky_no_boom.rb added
Updated by Markus Roberts about 2 years ago
- Branch set to http://github.com/MarkusQ/puppet/tree/ticket/0.25.x/2824
Possible patch up at http://github.com/MarkusQ/puppet/tree/ticket/0.25.x/2824
Updated by Brice Figureau about 2 years ago
Markus Roberts wrote:
> There is a known (albeit poorly understood) bug in MRI’s GC/stack frame handling that can cause extreme memory leakage.

Which versions of MRI are affected? Do they plan to fix it? Do you have any pointers to the MRI bug report/ticket/whatever?
Updated by Markus Roberts about 2 years ago
- Status changed from Accepted to In Topic Branch Pending Review
The bug is known and apparently affects Ruby 1.8.2 through 1.8.7, though the biggest impact is in 1.8.5 and 1.8.6 (my understanding is that the bug was partially fixed in 1.8.7 and finally killed in 1.9).
English bug reports are scattered, anecdotal, and often only partially characterize the problem. See, for example:
- http://rubyforge.org/tracker/?group_id=426&atid=1698&func=detail&aid=19088
- http://groups.google.com/group/god-rb/browse_thread/thread/1cca2b7c4a581c2/f0f040d41d7c49ea
- http://stackoverflow.com/questions/181406/ruby-memory-management
Much more detailed information is available to people who read either Japanese or C. My summary above is based on Babel Fish translations, C-delving in the 1.8.4-1.8.7 source, and experimentation with small Ruby programs like the ones attached to test my understanding.
I’m marking this “Ready for testing”; the change from 0.25.x has no semantic impact, so the only thing to test for is memory usage.
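A memory check of this kind could be scripted roughly as follows. This is only a sketch: the helper names `rss_kb` and `rss_growth_kb` are hypothetical, and reading the resident set size through `ps -o rss= -p PID` is an assumption about the environment (a POSIX-like system with `ps` available). It measures whole-process growth around a block of work, so numbers will be noisy.

```ruby
# Resident set size of this process in kilobytes, or nil if it cannot
# be read (for example, when `ps` is not installed).
def rss_kb
  out = `ps -o rss= -p #{Process.pid}`.strip
  out.empty? ? nil : out.to_i
rescue SystemCallError
  nil
end

# Run the block `iterations` times and return the change in RSS
# (kilobytes), or nil if RSS could not be read.
def rss_growth_kb(iterations)
  before = rss_kb
  iterations.times { yield }
  GC.start
  after = rss_kb
  return nil if before.nil? || after.nil?
  after - before
end

growth = rss_growth_kb(100_000) { "Bob's your uncle" =~ /(.*)'s/ }
puts "RSS growth: #{growth.inspect} KB"
```

Running the same script against the code before and after the change, on an affected interpreter, would show whether growth differs; on interpreters without the bug the figure should stay small either way.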
Updated by Brice Figureau about 2 years ago
Markus Roberts wrote:
> The bug is known and apparently affects Ruby 1.8.2 through 1.8.7, though the biggest impact is in 1.8.5 and 1.8.6 (my understanding is that the bug was partially fixed in 1.8.7 and finally killed in 1.9).
>
> English bug reports are scattered, anecdotal, and often only partially characterize the problem. See, for example: http://rubyforge.org/tracker/?group_id=426&atid=1698&func=detail&aid=19088 http://groups.google.com/group/god-rb/browse_thread/thread/1cca2b7c4a581c2/f0f040d41d7c49ea http://stackoverflow.com/questions/181406/ruby-memory-management
>
> Much more detailed information is available to people who read either Japanese or C. My summary above is based on Babel Fish translations, C-delving in the 1.8.4-1.8.7 source, and experimentation with small Ruby programs like the ones attached to test my understanding.

That’s the issue with MRI: most of the developers are Japanese, which doesn’t help :-( Fortunately, we speak C fluently :-)

> I’m marking this “Ready for testing”; the change from 0.25.x has no semantic impact, so the only thing to test for is memory usage.

Yes, I read the patch and it’s mostly cosmetic changes with no impact. I’ll put it on one of my nodes ASAP to see.
Updated by James Turnbull about 2 years ago
- Status changed from In Topic Branch Pending Review to Closed
Pushed in commit:bd5dc649ad55fc4724cafad99852b825adfde182 in branch 0.25.x