Bug #2910

tidy failing to remove any matching files

Added by Peter Couvares over 2 years ago. Updated about 2 years ago.

Status:Closed Start date:12/09/2009
Priority:High Due date:
Assignee:Markus Roberts % Done:

0%

Category:tidy
Target version:0.25.5
Affected Puppet version:0.25.4 Branch:http://github.com/MarkusQ/puppet/tree/ticket/0.25.x/2910
Keywords:
Votes: 3

Description

I’m somehow unable to get tidy to work at all — here is what I’m specifying, right inside a node definition in nodes.pp:

     tidy { tidy_tmp:
          path => "/tmp/",
          age => "7d",
          backup => false,
          matches => [ "jna*.tmp" ],
     }

The rest of my node definition, and everything else in nodes.pp more generally, works fine — but matching files are not being tidied.

I can confirm many matching files on the client host in question with:

find /tmp -maxdepth 1 -name 'jna*.tmp' -atime +7 -print

I’ve tried the following in turn, with no luck: * removing the matches attribute altogether, * removing the age attribute altogether, * replacing the matches list with a simple string, * removing the explicit path attribute and using “/tmp” as the name, * and setting the type attribute to “mtime” instead of the default atime.

I’m out of ideas, no one on IRC had any other ideas, and so I now suspect a bug.

If I’m doing something foolish, however, I’d be grateful to know. Thanks.

History

Updated by Peter Couvares over 2 years ago

I should add that there are no errors (or output of any kind) related to tidy in the client’s log—unless I introduce an invalid attribute, in which case it is noted (which I mention simply to confirm that this isn’t due to some more fundamental failure to pull the manifest from the master).

Updated by Markus Roberts over 2 years ago

  • Status changed from Unreviewed to Rejected

Many issues in 0.24.7 were fixed in 0.24.8.

Please try with 0.24.8 and reopen this ticket with a revised version number if the problem persists.

(I’m suggesting 0.24.8 on the assumption that you are using 0.24.7 in production; f you are doing a new installation you may want to consider going straight to the 0.25 series).

Updated by Peter Couvares over 2 years ago

  • Status changed from Rejected to Re-opened
  • Affected Puppet version changed from 0.24.7 to 0.25.2rc3

Alas, I see the exact same behavior (or lack thereof) with puppet-0.25.2rc3 on all hosts.

Updated by Markus Roberts over 2 years ago

  • Status changed from Re-opened to Investigating

Given the timing, I think this is going to have to wait for 0.25.3; final determination within a few hours.

Updated by Cody Robertson over 2 years ago

Any update on this? I’m experiencing the same exact behavior 0.25.1 (still have to update, but it I rather wait until this is patched).

Updated by Cody Robertson about 2 years ago

  • Affected Puppet version changed from 0.25.2rc3 to 0.25.4

I’m having the same issue with 0.25.4 however I was able to get it by specifying recurse => true (not quite what I want, however it shall suffice for now).

Could someone confirm I’m not doing this incorrectly..

touch -t 200805101024 /some/path/error_log.test

find /some/path -name 'error_log.*'
/some/path/error_log.test
class someTidy {
    tidy { "/some/path":
        age          => "1s",
        recurse   => false,
        matches => [ "error_log.*" ]
    }
}

Doesn’t get triggered, however if I specify recurse as true it does. Is this intended behavior?

Updated by James Turnbull about 2 years ago

It is the intended behaviour I think – at least useful as a work-around.

Updated by Cody Robertson about 2 years ago

Alright if that is intended behavior the example in the documentation regarding the Tidy type would be inaccurate

tidy { "/tmp":
    age => "1w",
    recurse => false,
    matches => [ "[0-9]pub*.tmp", "*.temp", "tmpfile?" ]
}

Thanks for clarifying :)

Updated by Ashley Penney about 2 years ago

Well, I don’t know if this is how it’s supposed to work but it makes absolutely no sense from a user perspective – if I have a directory with a bunch of logfiles I shouldn’t have to ‘recurse’ if it’s a flat directory with some files in and no subdirectories. This just took me about 30 minutes to discover and solve with my first attempt to use tidy{}.

Updated by Cody Robertson about 2 years ago

Ashley Penney wrote:

Well, I don’t know if this is how it’s supposed to work but it makes absolutely no sense from a user perspective – if I have a directory with a bunch of logfiles I shouldn’t have to ‘recurse’ if it’s a flat directory with some files in and no subdirectories. This just took me about 30 minutes to discover and solve with my first attempt to use tidy{}.

I tend to agree, I don’t think this was intended behavior and or wasn’t thought out thoroughly before implementing. Luckily my use of Tidy right now it’s not a big issue however if I have files in sub-directories that match my regex and I don’t want them pruned it could be a very annoying / crucial issue.

Updated by James Turnbull about 2 years ago

  • Category set to tidy
  • Assignee set to James Turnbull
  • Target version set to 0.25.5

Okay I’ve had a look and it’s a bug. You are quite correct it should tidy files in a flat sub-directory without recurse. I’ll poke at the code some to work out what’s happening.

Updated by Markus Roberts about 2 years ago

  • Status changed from Investigating to Accepted

Updated by Markus Roberts about 2 years ago

  • Status changed from Accepted to Needs Decision

So I’m now thinking that this is “correct” behaviour and the documentation is wrong. To explain:

  • Peter confirmed that there were matching files by running:
find /tmp -maxdepth 1 -name 'jna*.tmp' -atime +7 -print
  • If we use the corresponding tidy, it works fine:
     tidy { tidy_tmp:
          path => "/tmp/",
          recurse => 1,
          matches => [ "jna*.tmp" ],
          age => "7d",
          backup => false
     }
  • What doesn’t work is omitting the recurse, or explicitly setting it to zero or false (as shown in the example), all of which are the equivalent of setting -maxdepth to 0 on the find—and give the results you would (or should) expect.

So I’d say tidy is working correctly in the “recurse => false” case (it is looking at the specified path, and noting else—this would be what you would want if you were trying to manage a specific file) but the example given for tidy is wrong.

However, this leaves open the case of the default value for recurse. Should it perhaps be 1 (only look at direct children of the stated path) or true (look at all children)? Or should it be required, so that no one can say they were led astray by an unstated assumption?

Updated by Peter Couvares about 2 years ago

Markus, thanks for investigating. Personally, I think the only sensible default is recurse=1. Given a directory and a matches value, everyone will expect tidy to scan it accordingly. Defaulting to zero is highly unintuitive.

That being said, I’m delighted to have a fix, so do what you will — as long as the documentation is accurate, a “bad” default is not the end of the world.

Updated by Markus Roberts about 2 years ago

Peter —

So how about having recursive default to 1 if (and only if?) matches is specified? Or having it be an error to specify “matches” with recursive false/0?

— Markus

Updated by Peter Couvares about 2 years ago

That would be better — but honestly, I think it’s exceedingly unlikely anyone would specify a directory, even without matches, and not expect it to tidy the immediate contents of that directory. Tidying up files in a directory (e.g., based on file modification time) is a very, very common use-case (/tmp being the obvious example).

In contrast, tidying a directory with recurse=0 makes almost no sense in any scenario. What could it mean? Either a noop, which by definition no one would intend to specify, or else a request to remove the directory itself, which seems highly unusual (and only makes sense if it’s empty — otherwise it’s still a noop).

I tend to be quite conservative in these sorts of matters, erring on the side of not deleting things — but in this case the feature is for deleting things and recurse=0 fails to accomplish that. I think users will want and expect tidy to recurse=1 on a directory (even without matches) and will be very surprised if it doesn’t. However, if you disagree, please throw an error if recurse is unspecified for a directory rather than making it a silent noop. Special-casing the matches attribute doesn’t make much sense since others like age might be used instead.

Regardless of what you decide, I appreciate the time and thought you’re putting into solving this. Thanks!

Updated by Markus Roberts about 2 years ago

So the difference between match and tests such as age is that, for a given target, match will either always be true or never be true, but the value doesn’t change. It would make sense to write something like:

tidy { "/my/occasional.log": age => '1w' }

with the intent that /my/occasional.log gets tidied after it sits for a week. But

tidy { "/my/occasional.log": matches => "*.txt" }

doesn’t make sense; /my/occasional.log will never match @.txt@, and so this is just a noop. (The converse case with matches => “@.log@” is only a little less silly).

At the moment I’m leaning towards requiring a non-default, non-false, non-zero value for recurse if matches is specified; rather than trying to guess and “do the right thing” when someone specifies something a meaningless setting, we should tell them and make them specify something meaningful.

— Markus

P.S. The killer argument about defaulting to “recurse => 1” is that we just push the problem into different edge cases; what if they specify “recurse => false” (as in the example text)? What if they specify “recurse => 1”? Do we bump it to 2? Or is 1 now a synonym for 0? And so on.

Updated by Markus Roberts about 2 years ago

  • Status changed from Needs Decision to In Topic Branch Pending Review
  • Assignee changed from James Turnbull to Markus Roberts
  • Branch set to http://github.com/MarkusQ/puppet/tree/ticket/0.25.x/2910

I’ve changed matches to require an explicit recurse, on the Principle of Least Surprise (see the attached branch) but in testing it I noticed something else; unless the path ends in “/” the recursion seems to be ignored. Thus:

     tidy { tidy_tmp:
          path => "/tmp/",
          recurse => 1,
          matches => [ "jna*.tmp" ],
          age => "7d",
          backup => false
     }

works but:

     tidy { tidy_tmp:
          path => "/tmp",
          recurse => 1,
          matches => [ "jna*.tmp" ],
          age => "7d",
          backup => false
     }

doesn’t. This seems wrong to me, but may serve some deeper purpose. Since the actual recursion is being done by the file serving code, I’m reluctant to just “fix” it without greater understanding of what’s going on.

Updated by Markus Roberts about 2 years ago

So the explanation for the problem I observed in comment 18 above is that I’m testing on a Mac and my /tmp is a symlink. If I specify “/tmp/” I get the directory but if I just specify “/tmp” I get the symlink (and it doesn’t recurse across it).

As there’s no “links” attribute for tidy this seems to be a limitation, though not as mysterious as it originally seemed. Specifically, it appears that tidy will never recurse across symlinks it encounters.

Updated by James Turnbull about 2 years ago

  • Status changed from In Topic Branch Pending Review to Closed

Pushed in commit:bcde541e4433aa46c9f922b01e34368a09abb7e8 in branch 0.25.x

Also available in: Atom PDF