Exclude certain filename patterns and directories from find

I'm trying to find all files that are not up to date compared to CVS. As our CVS structure is broken (it does not recurse well in some directories) I'm trying to work around it with find for now.
What I have now is this:
for i in `find . -not \( -name "*\.jpg" \) -path './bookshop/mediaimg' -prune -o -path '*/CVS*' -prune -o -path './files' -prune -o -path './images/cms' -prune -o -path './internal' -prune -o -path './limesurvey171plus_build5638' -prune -o -path './gallery2' -prune -o -print `; do cvs status "$i" |grep Status ; done &>~/output.txt
But somehow my attempt at excluding images (jpg in this case) does not work, they still show up.
Anyone have a suggestion on how to get them out of my results from find?

This fails for the usual reason that mixing boolean AND and OR without parentheses fails: AND binds more tightly than OR.
What you're saying here is really (in pseudo-code):
If (
File Not Named '*.jpg' AND
Path Matches './bookshop/mediaimg' AND
Prune OR
Path Matches '*/CVS*' AND
Prune OR
Path Matches './files' AND
Prune OR
Path Matches './images/cms' AND
Prune OR
Path Matches './internal' AND
Prune OR
Path Matches './limesurvey171plus_build5638' AND
Prune OR
Path Matches './gallery2' AND
Prune OR
Print
)
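Now, -print always returns true, and so does -prune, so none of the ANDs matter once any OR branch matches. The "Not Named '*.jpg'" test belongs only to the first branch: a .jpg file fails that branch, matches none of the other -path tests, and falls through to the final Print. Careful application of parentheses should yield the results you're after.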
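As a rough illustration of that grouping, here is a minimal sketch on a scratch tree. The directory and file names are made up, only two of the pruned paths are shown, and -type f is an added assumption so that only regular files are listed. The pruned paths go into one parenthesised OR list, and the -name test sits on the branch that prints:

```shell
# Hypothetical scratch tree; the directory names loosely mirror the question.
demo=$(mktemp -d)
mkdir -p "$demo/CVS" "$demo/files" "$demo/src"
touch "$demo/CVS/Entries" "$demo/files/a.txt" "$demo/src/photo.jpg" "$demo/src/page.php"

# Left of -o: anything matching one of the grouped paths is pruned.
# Right of -o: everything else is printed if it is a regular file
# whose name does not end in .jpg.
result=$(cd "$demo" && find . \( -path '*/CVS' -o -path './files' \) -prune \
    -o -type f ! -name '*.jpg' -print | LC_ALL=C sort)
echo "$result"   # prints ./src/page.php
```

The remaining -path tests from the question would be added to the same parenthesised list, each joined with -o.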

Related

Shell alias input search syntax for a negated find search

Mac OSX Bash Shell
I want to use find to identify anything (directories or files) which do not follow an input pattern.
This works fine:
find . -path /Users/Me/Library -prune -o \! \( -path '*.jpg' \)
However I want to have a general ability to do from a bash alias or function eg:
alias negate_find="find . -path /Users/Me/Library -prune -o \! \( -path ' "$1" ' \)"
To allow shell input of the search term (which may contain spaces). The current syntax does not work, returning unknown primary or operator on invocation. Grateful for assistance in what I am doing wrong.
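Separating the input parameter into its own string worked. The likely reason is that an alias cannot take arguments: "$1" is expanded when the alias is defined, and whatever you type after the alias name is simply appended to the end of the command. Here it is as a working shell function that is case invariant.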
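negate_find () {
    # $1 is a -path pattern such as '*.jpg'; quoting keeps embedded spaces intact.
    search="$1"
    # Only the all-lowercase and all-uppercase spellings are excluded, not mixed case.
    lowersearch=$(echo "$search" | awk '{print tolower($0)}')
    uppersearch=$(echo "$search" | awk '{print toupper($0)}')
    echo "search = $search"
    find . -path "$HOME/Library" -prune -o \! \( -path "$lowersearch" \) -a \! \( -path "$uppersearch" \)
}
export -f negate_find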
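If mixed-case spellings matter as well, the -ipath test (present in both GNU and BSD find) may be a simpler route than lowercasing and uppercasing the pattern. A minimal sketch, where the function name negate_find_i and the scratch files are invented for illustration and the Library pruning is left out for brevity:

```shell
# Hypothetical helper: list everything whose path does NOT match the
# pattern in $1, ignoring case.
negate_find_i() {
    find . ! -ipath "$1" -print
}

# Scratch directory with mixed-case extensions.
demo=$(mktemp -d)
touch "$demo/a.JPG" "$demo/b.jpg" "$demo/c.Jpg" "$demo/c.txt"

result=$(cd "$demo" && negate_find_i '*.jpg' | LC_ALL=C sort)
echo "$result"
```

Because the pattern is passed as a quoted function argument, it can contain spaces without further escaping.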

delete directories with find and exclude other directories

I'm attempting to delete some directories and I want to be able to exclude a directory called 'logs' from being deleted.
This is my basic find operation (without the exclusion):
# find . -type d |tail -10
./d20160124-1120-df8mfb/deployments
./d20160124-1120-df8mfb/releases
./d20160131-16993-vazqg5
./d20160131-16993-vazqg5/metadata
./d20160131-16993-vazqg5/deployments
./d20160131-16993-vazqg5/releases
./logs
./d20160203-27735-1tqbjh6
./d20160125-1120-1yccr9p
./d20160131-16993-1yf9lnc
I'm just tailing the output so that you have an idea of what's going on without taking up the whole page. :)
If I try to exclude the logs directory with the prune command I get back no results.
root#ops-manager:/tmp/tmp# find . -type d -prune -o -name 'logs' -print
root#ops-manager:/tmp#
What am I doing wrong?
Once I get this right, I'll tack on an -exec rm -rf {} \; command so I can delete those directories.
Any help here would be appreciated!
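-prune always evaluates to true. In your command, -type d -prune matches every directory, including the starting point ., so find never descends and the expression on the other side of -o is never evaluated for a directory. You need to change the order: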
find . -type d -name 'logs' -prune -o -print
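Note that with the reordered command the -print side also lists regular files. If only directories are wanted, one option is to move the -type d test to the printing branch. A small sketch on a made-up tree:

```shell
# Hypothetical scratch tree: one logs directory plus two deployment directories.
demo=$(mktemp -d)
mkdir -p "$demo/logs/old" "$demo/d1/releases" "$demo/d2"
touch "$demo/d1/notes.txt"

# Prune anything named logs; print every other directory.
result=$(cd "$demo" && find . -name logs -prune -o -type d -print | LC_ALL=C sort)
echo "$result"
```

The starting point . is still listed, so it would need to be filtered out before handing the list to rm.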

unix find and multiple commands to find all files, exclude a dir, and base results on time

Given a folder, I want to ignore one folder within it, find everything outside that folder (whether folders or files), and eventually delete it via a cron-style action on OS X.
On Mac OS X I will use launchd to run this. So far, I have this:
find /Users/me/Downloads -not \( -path /Users/me/Downloads/In\ Progress -prune \) -name "*" -Btime 1m
With -Btime 1m or -mtime 1m I get zero results; without it I get results I can exec to rm:
find /Users/me/Downloads -not \( -path /Users/me/Downloads/In\ Progress -prune \) -name "*"
/Users/me/Downloads
/Users/me/Downloads/.DS_Store
/Users/me/Downloads/test
/Users/me/Downloads/text.txt
Eventually my criteria will be 1 week, but for now, I use 1 minute, as that surely has passed.
cd ~/Downloads
$find . -mtime 1m
$
Or
cd ~/Downloads
$find . -mtime 1m -print
$find: *: unknown primary or operator
I believe I figured it out, but if someone could look over my shoulder, I certainly would appreciate it:
/usr/bin/find /Users/$USER/Downloads -not \( -path /Users/$USER/Downloads/In\ Progress -prune -o -type d \) -mtime +1s
That finds all files in ~/Downloads that are older than 1 second and are not inside the "In Progress" directory. At first, it was also locating the path /Users/$USER/Downloads as the first hit.
/usr/bin/cd ~/Downloads/
echo "You are in: " $(/bin/pwd)
/usr/bin/find /Users/$USER/Downloads -not \( -path /Users/$USER/Downloads/In\ Progress -prune -o -type d \) -mtime +1s
So, can anyone tell me why it is not picking up /Users/$USER/Downloads? I certainly don't mind, but I also don't like not understanding.
Finally, the last step would be to change -mtime to +1w and then append xargs rm -rf {} \; and I should be good to go? If I also wanted to write a little list of what find has found and >> it to a log file, where would be a good place to shove that?
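One possible shape for the final job, offered as a sketch rather than a tested launchd script. In the command above, directories (including the starting directory) drop out because -type d sits inside the negated group. The version below prunes the protected directory first and tests for regular files on the other branch. The scratch directory stands in for ~/Downloads, the log location is hypothetical, and -mtime +7 is used in place of the BSD-only +1w so the sketch also runs with GNU find:

```shell
dir=$(mktemp -d)     # stands in for ~/Downloads
log="$dir.log"       # hypothetical log file, kept outside the searched tree
mkdir -p "$dir/In Progress"
touch -t 200001010000 "$dir/old.txt" "$dir/In Progress/partial.bin"
touch "$dir/new.txt"

# Branch 1: do not descend into "In Progress".
# Branch 2: for regular files last modified more than 7 days ago,
# print the path (appended to the log) and then remove the file.
find "$dir" -path "$dir/In Progress" -prune \
    -o -type f -mtime +7 -print -exec rm -f {} \; >> "$log"
```

Putting -print before -exec means the log records each path just before it is removed, so no separate xargs step is needed.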

find -mindepth -prune conflict?

I have a dir structure like:
gameplay/ch1_of_182/ajax.googleapis.com/...
gameplay/ch1_of_182/platform.twitter.com/...
gameplay/ch1_of_182/privacy-policy.truste.com/...
gameplay/ch1_of_182/www.facebook.com/...
gameplay/ch1_of_182/www.gameplay.com/...
gameplay/ch1_of_182/www-mega.gameplay.com/...
gameplay/ch2_of_182/ajax.googleapis.com/...
gameplay/ch2_of_182/platform.twitter.com/...
gameplay/ch2_of_182/privacy-policy.truste.com/...
gameplay/ch2_of_182/www.facebook.com/...
gameplay/ch2_of_182/www.gameplay.com/...
gameplay/ch2_of_182/www-mega.gameplay.com/...
...
gameplay/ch182_of_182/ajax.googleapis.com/...
gameplay/ch182_of_182/platform.twitter.com/...
gameplay/ch182_of_182/privacy-policy.truste.com/...
gameplay/ch182_of_182/www.facebook.com/...
gameplay/ch182_of_182/www.gameplay.com/...
gameplay/ch182_of_182/www-mega.gameplay.com/...
created using wget. Now I want to delete all of the directories and all the files they contain except for the two directories with "gameplay.com" in their name.
I have been trying different variants of:
find . -mindepth 2 ! -path "*gameplay.com" -prune -exec rm -r {} \;
but without luck.
When run from the gameplay parent dir, the 4 of 6 directories in each of the 182 directories (actually only using 1 for testing purposes) that do not contain the "gameplay.com" name pattern are deleted along with all their contents, as desired. And although the 2 directories that do contain the "gameplay.com" name pattern are left undeleted, all the files contained within each are deleted, which is not good.
I thought the -prune option was supposed to let find know to basically ignore the directory specified, but I must be specifying it wrong, because that is not happening, leaving me with two empty directories in each of the 182 parent dirs.
I think there is a potential conflict between the -mindepth and -prune options, but because I am running the command from the gameplay parent directory, without -mindepth all 182 child directories, each containing two directories I want left intact, are deleted.
I guess I could write a for loop to cycle through each dir, but it seems to me find should be able to accomplish this task in one fell swoop, if someone doesn't mind shedding some light on how.
Logically, this is find . -mindepth 2 \( ! -path "*gameplay.com" \) -a \( -prune \) -a \( -exec rm -r {} \; \)
Since your -prune comes after the path check, it is never reached when the path check is false, which is exactly the case for the gameplay.com directories, so find descends into them. Try switching the order.
find . -mindepth 2 -prune ! -path "*gameplay.com" -exec rm -r {} \;
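To see the reordered command at work without touching real data, here is a small sketch on a scratch tree with one chapter directory. The file names inside the site directories are invented:

```shell
# Hypothetical miniature of the wget tree.
demo=$(mktemp -d)
mkdir -p "$demo/ch1_of_182/www.gameplay.com" "$demo/ch1_of_182/www.facebook.com"
touch "$demo/ch1_of_182/www.gameplay.com/index.html" \
      "$demo/ch1_of_182/www.facebook.com/like.js"

# -prune runs first for every depth-2 entry, so find never descends;
# only entries whose path does not end in gameplay.com reach rm -r.
(cd "$demo" && find . -mindepth 2 -prune ! -path '*gameplay.com' -exec rm -r {} \;)

remaining=$(cd "$demo" && find . -type f | LC_ALL=C sort)
echo "$remaining"   # prints ./ch1_of_182/www.gameplay.com/index.html
```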

grouping predicates in find

This part " ( -name *txt -o -name *html ) " confuses me in the code:
find $HOME \( -name \*txt -o -name \*html \) -print0 | xargs -0 grep -li vpn
Can someone explain the brackets and "-o"? Is "-o" a command or a parameter? I know the brackets are escaped by "\", but what are they for?
By default, the conditions in the find argument list are 'and'ed together. The -o operator means 'or'.
If you wrote:
find $HOME -name \*txt -o -name \*html -print0
then there is no output action associated with the file names that end with 'txt', so they would not be printed. By grouping the name options with parentheses, you get both the 'html' and 'txt' files.
Consider the example:
mkdir test-find
cd test-find
cp /dev/null file.txt
cp /dev/null file.html
The comments below have an interesting side-light on this. If the command was:
find . -name '*.txt' -o -name '*.html'
then, since no explicit action is specified for either alternative, the default -print (not -print0!) action is used for both alternatives and both files are listed. If a -print or other explicit action follows one of the alternatives (but not the other), only the alternative with the action takes effect.
find . -name '*.txt' -print -o -name '*.html'
This also suggests that you could have different actions for the different alternatives.
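A minimal sketch of that idea, using the same two empty files in a scratch directory. The "text:" and "html:" labels are arbitrary, chosen only to show which branch fired:

```shell
demo=$(mktemp -d)
touch "$demo/file.txt" "$demo/file.html"

# Each alternative carries its own -exec action.
labelled=$(cd "$demo" && find . -name '*.txt' -exec echo 'text:' {} \; \
    -o -name '*.html' -exec echo 'html:' {} \; | LC_ALL=C sort)
echo "$labelled"
```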
You could also apply other conditions, such as a modification time:
find . \( -name '*.txt' -o -name '*.html' \) -mtime +5 -print0
find . \( -name '*.txt' -mtime +5 -o -name '*.html' \) -print0
The first prints txt or html files older than 5 days (so it prints nothing for the example directory - the files are a few seconds old); the second prints txt files older than 5 days or html files of any age (so just file.html). And so on...
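The effect of the parentheses on an explicit action can be checked directly. This sketch uses -print instead of -print0 so the output is easy to compare, and recreates the two example files in a scratch directory:

```shell
demo=$(mktemp -d)
touch "$demo/file.txt" "$demo/file.html"

# Ungrouped: -print binds only to the second -name test.
ungrouped=$(cd "$demo" && find . -name '*.txt' -o -name '*.html' -print | LC_ALL=C sort)

# Grouped: -print applies to whichever alternative matched.
grouped=$(cd "$demo" && find . \( -name '*.txt' -o -name '*.html' \) -print | LC_ALL=C sort)

echo "ungrouped: $ungrouped"
echo "grouped:   $grouped"
```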
Thanks to DevSolar for his comments leading to this addition.
The "-o" means OR. I.e., name must end in "txt" or "html". The brackets just group the two conditions together.
The ( and ) provide a way to group search parameters for the find command. The -o is an "or" operator.
This find command will find all files ending in "txt" or "html" and pass those as arguments to the grep command.