Bug #1469
File recursion with a remote source should not recurse locally
| Status: | Closed | Start: | 07/30/2008 | |
|---|---|---|---|---|
| Priority: | Normal | Due date: | ||
| Assignee: | % Done: | 100% |
||
| Category: | file | |||
| Target version: | 0.25.0 | |||
| Affected version: | 0.24.8 | Branch: | ||
| Keywords: | file recursive slow | |||
| Votes: | 0 |
Description
Hi, I have the following manifests:
file {
"${homeroot}/$name":
source => [ "puppet:///files/users/$name/skel/", "puppet:///files/users/skel/"],
sourceselect => all,
checksum => mtime,
recurse => true, owner => $name, group => $name, replace => no,
require => [ User["$name"], Group["$name"] ];
}
that is used to copy various files to the home directory of a new user.
If the home directory of this user contains a deep hierarchy full of files, puppet takes ages and consume a lot of memory to finally copy over a handful of files.
My understanding of the issue is the following: * Puppet builds localrecurse first (see puppet/type/file.rb). This parts can take a long time, each local present file has to be stat'ed and a new (implicit) resource is created in the catalog. * then Puppet builds the sourcerecurse
If Puppet would build the sourcerecurse first, and prune the localrecurse in function of the availability of the sourced file/dir it would be way much faster. Moreover I don’t think people are using file{} to distribute more than a few files, but I bet they like me have more local files than distributed.
Also note, that there is another issue if filebucket and checksum are enabled in the aforementioned manifest: it will copy over all files, and filebucket the previous one, as seen in this log extract:
debug: Calling fileserver.list
debug: Calling fileserver.describe
debug: Calling fileserver.retrieve
debug: //all_users/Account[brice]/File[/home/brice/.zshrc]/source: Executing 'diff -u /home/brice/.zshrc /tmp/user/0/puppet-diffing20080704-15708-glfegc-0'
debug: //all_users/Account[brice]/File[/home/brice/.zshrc]: Changing checksum,source
debug: //all_users/Account[brice]/File[/home/brice/.zshrc]: 2 change(s)
debug: //all_users/Account[brice]/File[/home/brice/.zshrc]/checksum: Replacing /home/brice/.zshrc checksum {md5}1395a1f669e2c86443b142cc5ad10f55 with {mtime}Fri Jul 04 15:17:39 +0200 2008
notice: //all_users/Account[brice]/File[/home/brice/.zshrc]/checksum: checksum changed '{md5}1395a1f669e2c86443b142cc5ad10f55' to '{mtime}Fri Jul 04 15:17:39 +0200 2008'
debug: Calling puppetbucket.addfile
info: //all_users/Account[brice]/File[/home/brice/.zshrc]: Filebucketed to server with sum 1395a1f669e2c86443b142cc5ad10f55
debug: //all_users/Account[brice]/File[/home/brice/.zshrc]/checksum: Replacing /home/brice/.zshrc checksum {mtime}Fri Jul 04 15:17:39 +0200 2008 with {md5}1395a1f669e2c86443b142cc5ad10f55
notice: //all_users/Account[brice]/File[/home/brice/.zshrc]/source: replacing from source puppet:///files/users/brice/skel/.zshrc with contents {md5}1395a1f669e2c86443b142cc5ad10f55
This is because default file checksum is md5, but my file resource above said “mtime”. It then compares apples to oranges.
Related issues
| duplicated by Puppet - Feature #1700: Recursive file copies touching unmanaged files should be ... | Duplicate | 10/28/2008 |
History
Updated by James Turnbull about 2 years ago
- Status changed from Unreviewed to Accepted
- Assignee set to Luke Kanies
- Affected version changed from 0.24.4 to 0.25.0
Updated by Luke Kanies almost 2 years ago
- Subject changed from Recursive file resource with lots of local files is horribly slow. to File recursion with a remote source should not recurse locally
- Priority changed from Normal to High
I guess we need to change the behaviour here, but we need to provide backward compatibility, probably through some kind of recursion selection attribute:
file { "/etc/ssh":
source => "...",
recurse => true,
recurseselect =>
}
I don’t like that attribute name, but we need something, and at least for now, it should default to the current behaviour (probably with a deprecation warning).
Updated by Brice Figureau almost 2 years ago
luke wrote:
I guess we need to change the behaviour here, but we need to provide backward compatibility, probably through some kind of recursion selection attribute: […]
I don’t like that attribute name, but we need something, and at least for now, it should default to the current behaviour (probably with a deprecation warning).
Why not simply:
file { "/etc/ssh":
source => "...",
recurse => ,
}
With true = both With false = no recurse
Updated by Peter Meier almost 2 years ago
Why not simply: […] recurse => <remote|both|true|false>,
With true = both With false = no recurse
+1
Updated by Luke Kanies almost 2 years ago
masterzen wrote:
Why not simply: …
recurse => <remote|both|true|false>…With true = both With false = no recurse
This is a change in behaviour, but I’m not sure anyone’s actually using that behaviour. That is, you can say ‘recurse => 2’.
This means you’d either need to support multiple arguments to recurse (e.g., recurse => [remote, 2]) or change this behaviour. I expect it’s safe to change the behaviour, but….
Updated by Brice Figureau almost 2 years ago
luke wrote:
masterzen wrote:
Why not simply: …
recurse => <remote|both|true|false>…With true = both With false = no recurse
This is a change in behaviour, but I’m not sure anyone’s actually using that behaviour. That is, you can say ‘recurse => 2’.
Yes, I forgot this feature (which I used to workaround my original issue).
This means you’d either need to support multiple arguments to recurse (e.g.,
recurse => [remote, 2]) or change this behaviour. I expect it’s safe to change the behaviour, but….
I don’t really like the syntax you’re proposing. It is not really “typed”, and error prone (what happens if I write [2,remote] or [1,2]).
If we want to go this way, the following would IMHO be preferable:
... recurse =>, recurselimit => [0-9]+
Where:
recurse parameter: both => old behavior remote => new behavior (recurse on master, prune on local) true => same as both false => default (ie no-recurse) [0-9]+ => both, but limit recursion, give a warning to user that this syntax is deprecated and that they should move to recurselimit. recurselimit parameter: [0-9]+ => recursion limit, usable with both (or true) and remote. If recurse=false it is ignored. If recurse is a number, then an error should be raised.
This way, we preserve the old behavior by default and we warn people that the syntax is deprecated.
Does it make sense?
Updated by Luke Kanies almost 2 years ago
Heh, I didn’t like the syntax I was proposing. :)
Your proposal makes perfect sense, although I would use ‘depth’ instead of ‘recurselimit’.
Updated by Brice Figureau almost 2 years ago
luke wrote:
Heh, I didn’t like the syntax I was proposing. :)
Your proposal makes perfect sense, although I would use ‘depth’ instead of ‘recurselimit’.
I’m all for it.
Updated by Brice Figureau over 1 year ago
- Assignee changed from Luke Kanies to Brice Figureau
Since I’m working on it, it should be assigned to me :–)
Updated by Brice Figureau over 1 year ago
- Status changed from Accepted to Ready for Testing
- Assignee changed from Brice Figureau to James Turnbull
- Priority changed from High to Normal
- Target version set to 0.25.0
- % Done changed from 0 to 100
- 3 changed from Unknown to Easy
- Affected version changed from 0.25.0 to 0.24.8
Hi,
It took 5 months :–) but the patch is now available for master in my github repository (see tickets/1469): http://github.com/masterzen/puppet/tree/tickets/1469
Thanks,
Updated by James Turnbull over 1 year ago
- Status changed from Ready for Testing to Closed
Pushed in commit:“33d3624c91b236e237fe194d9df3d9fa324111c3” in branch master