Using semicolon (;) vs plus (+) with exec in find - find

Why is there a difference in output between using
find . -exec ls '{}' \+
and
find . -exec ls '{}' \;
I got:
$ find . -exec ls \{\} \+
./file1 ./file2
.:
file1 file2 testdir1
./testdir1:
testdir2
./testdir1/testdir2:
$ find . -exec ls \{\} \;
file1 file2 testdir1
testdir2
./file2
./file1

This might be best illustrated with an example. Let's say that find turns up these files:
file1
file2
file3
Using -exec with a semicolon (find . -exec ls '{}' \;) will execute ls once per file:
ls file1
ls file2
ls file3
But if you use a plus sign instead (find . -exec ls '{}' \+), as many filenames as possible are passed as arguments to a single command:
ls file1 file2 file3
The number of filenames is only limited by the system's maximum command line length. If the command exceeds this length, the command will be called multiple times.
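One way to see the difference is to count how often the command is started. This is only a minimal sketch, assuming a POSIX shell and a scratch directory from mktemp; the file names and the sh -c wrapper are illustrative and not part of the question.

```shell
# Scratch directory with three files (names are illustrative).
tmp=$(mktemp -d)
touch "$tmp/file1" "$tmp/file2" "$tmp/file3"

# The wrapper prints one line per invocation, reporting how many
# file names it received. The word "sh" after the script fills $0.
per_file=$(find "$tmp" -type f -exec sh -c 'echo "got $# name(s)"' sh {} \;)
batched=$(find "$tmp" -type f -exec sh -c 'echo "got $# name(s)"' sh {} +)

echo "semicolon:"; echo "$per_file"   # one line per file
echo "plus:";      echo "$batched"    # a single line for the whole batch

rm -rf "$tmp"
```

With only three files the plus form fits into a single invocation; with a very long list it may be split into several.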

All of the answers so far are correct. I offer this as a clearer (to me) demonstration of the behaviour that is described using echo rather than ls:
With a semicolon, the command echo is called once per file (or other filesystem object) found:
$ find . -name 'test*' -exec echo {} \;
./test.c
./test.cpp
./test.new
./test.php
./test.py
./test.sh
With a plus, the command echo is called as few times as possible (here, only once). Every file found is passed in as an argument.
$ find . -name 'test*' -exec echo {} \+
./test.c ./test.cpp ./test.new ./test.php ./test.py ./test.sh
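If find turns up a very large number of results, it splits them across several invocations so that each command line stays within the system's limit; the command being called must still cope with receiving many arguments at once.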

From man find:
-exec command ;
Execute command; true if 0 status is returned. All following
arguments to find are taken to be arguments to the command until
an argument consisting of ';' is encountered. The string '{}'
is replaced by the current file name being processed everywhere
it occurs in the arguments to the command, not just in arguments
where it is alone, as in some versions of find. Both of these
constructions might need to be escaped (with a '\') or quoted to
protect them from expansion by the shell. See the EXAMPLES
section for examples of the use of the '-exec' option. The
specified command is run once for each matched file.
The command is executed in the starting directory. There are
unavoidable security problems surrounding use of the -exec option;
you should use the -execdir option instead.
-exec command {} +
This variant of the -exec option runs the specified command on
the selected files, but the command line is built by appending
each selected file name at the end; the total number of
invocations of the command will be much less than the number of
matched files. The command line is built in much the same way
that xargs builds its command lines. Only one instance of '{}'
is allowed within the command. The command is executed in
the starting directory.
So, the way I understand it, \; executes a separate command for each file found by find, whereas \+ appends the files and executes a single command on all of them. The \ is an escape character, so it's:
ls testdir1; ls testdir2
vs
ls testdir1 testdir2
Doing the above in my shell mirrored the output in your question.
An example of when you would want to use \+:
Suppose two files, 1.tmp and 2.tmp:
1.tmp:
1
2
3
2.tmp:
0
2
3
With \;:
find *.tmp -exec diff {} \;
> diff: missing operand after `1.tmp'
> diff: Try `diff --help' for more information.
> diff: missing operand after `2.tmp'
> diff: Try `diff --help' for more information.
Whereas if you use \+ (to concatenate the results of find):
find *.tmp -exec diff {} \+
1c1
< 1
---
> 0
So in this case it's the difference between diff 1.tmp; diff 2.tmp and diff 1.tmp 2.tmp
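To try this without touching real files, here is a small sketch that recreates the two files in a scratch directory. It assumes a POSIX shell with mktemp and diff available; the variable names are mine.

```shell
tmp=$(mktemp -d)
printf '1\n2\n3\n' > "$tmp/1.tmp"
printf '0\n2\n3\n' > "$tmp/2.tmp"

# With + both names reach a single diff call: diff 1.tmp 2.tmp.
# diff exits non-zero when the files differ, hence the "|| true".
diff_out=$(cd "$tmp" && find *.tmp -exec diff {} + || true)
echo "$diff_out"

rm -rf "$tmp"
```

The shell expands *.tmp in sorted order before find starts, which is why 1.tmp ends up as the first operand of diff.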
There are cases where \; is appropriate and cases where \+ is the better choice. rm is one instance of the latter: if you are removing a large number of files, \+ will be much faster than \;.

find has special syntax. You use the {} as they are because they have meaning to find as the pathname of the found file and (most) shells don't interpret them otherwise. You need the backslash \; because the semicolon has meaning to the shell, which eats it up before find can get it. So what find wants to see AFTER the shell is done, in the argument list passed to the C program, is
"-exec", "rm", "{}", ";"
but you need \; on the command line to get a semicolon through the shell to the arguments.
You can get away with \{\} because the shell-quoted interpretation of \{\} is just {}. Similarly, you could use '{}'.
What you cannot do is use
-exec 'rm {} ;'
because the shell interprets that as one argument,
"-exec", "rm {} ;"
and rm {} ; isn't the name of a command. (At least unless someone is really screwing around.)
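If you want to check what the shell actually hands over, printf can stand in for find. This is just an illustration (nothing is searched or removed, since printf only prints its arguments); the variable names are made up.

```shell
# printf repeats its format for every argument, so each output line
# is exactly one argument as the called program would receive it.
good=$(printf '[%s]\n' find . -exec rm {} \;)
bad=$(printf '[%s]\n' find . -exec 'rm {} ;')

echo "$good"   # six lines, ending with [rm], [{}] and [;]
echo "$bad"    # four lines, ending with the single argument [rm {} ;]
```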
Update
the difference is between
$ ls file1
$ ls file2
and
$ ls file1 file2
The + is catenating the names onto a command line.

The difference between ; (semicolon) or + (plus sign) is how the arguments are passed into find's -exec/-execdir parameter. For example:
using ; will execute multiple commands (separately for each argument),
Example:
$ find /etc/rc* -exec echo Arg: {} ';'
Arg: /etc/rc.common
Arg: /etc/rc.common~previous
Arg: /etc/rc.local
Arg: /etc/rc.netboot
All following arguments to find are taken to be arguments to the command.
The string {} is replaced by the current file name being processed.
using + will execute the fewest possible commands (as the arguments are combined together). It's very similar to how the xargs command works: it uses as many arguments per command as possible while staying under the maximum command-line length.
Example:
$ find /etc/rc* -exec echo Arg: {} '+'
Arg: /etc/rc.common /etc/rc.common~previous /etc/rc.local /etc/rc.netboot
The command line is built by appending each selected file name at the end.
Only one instance of {} is allowed within the command.
See also:
man find
Using semicolon (;) vs plus (+) with exec in find at SO
Simple unix command, what is the {} and \; for at SO
What is meaning of {} + in find's -exec command? at Unix

We were trying to find files for housekeeping.
find . -exec echo {} \; ran overnight and in the end produced no result.
find . -exec echo {} \+ produced results and took only a few hours.
Hope this helps.

Related

Get current directory in find command and use in sed - one-liner

I'm using this to find files of a particular name in subdirectories, then editing some content:
find prod -type f -name "file.txt" -exec sed -i '' -e "s,^varname.*$, varname = \"$value\"," {} +
How can I get the name of the current directory (not the directory the script is executed in, rather the directory the file is found in) and insert it into the replace text? Something like:
find prod -type f -name "file.txt" -exec sed -i '' -e "s,^ varname.*$, varname = \"$value/$dirname\"," {} +
I'm hoping to keep it as a one-liner. My most recent attempt was this, but the replacement didn't work and I feel there must be a simpler syntax:
find prod -type f -name "file.txt" -exec sh -c '
for file do
dirname=${file%/*}
done' sed -i '' -e "s,^varname.*$, varname = \"$value/$dirname\"," {} +
Example:
value=bar
file.txt input:
varname = “foo”
file.txt output:
varname = “bar/directory_name”
You can do this with GNU awk in the same way:
The sed command you make use of can be replaced with:
$ awk --inplace -v v="$value" '(FNR==1){d=FILENAME;sub("/[^/]*$","",d)}/^varname/{$0="varname = "v"/"d}1'
So your find would read:
$ find prod -type f -name "file.txt" -exec awk --inplace -v v="$value" '(FNR==1){d=FILENAME;sub("/[^/]*$","",d)}/^varname/{$0="varname = "v"/"d}1' {} \;
This might work for you (GNU sed & parallel):
find prod -type f -name "file.txt" |
parallel -qa- --link sed -i 's#\(varname=\).*#\1"{2}{1//}"#' {1} ::: $value
We supply 2 sources to the parallel command. The first source is the list of files from the find command using the parallel option -a -. The second source is the variable $value, being only a single value it is linked to the first source using the parallel option --link. The sed command is quoted using the parallel option -q and normal regexp rules apply excepting that the values {2} and {1//} are first interpreted by parallel to represent the second source and the directory of the first source respectively.
N.B. To check the commands to parallel are as you desire, use the --dryrun option and check the output before running for real.
You need to use -execdir and spawn a shell:
find ... -execdir \
bash -c 'sed -i "" -e "s,^ varname.*$, varname = \"$value/${PWD}\"," "$1"' -- {} +
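-execdir runs sed in the parent folder of the file instead of the folder from where you run find. This allows you to use $PWD. Because the script is single-quoted, $value is expanded by the inner bash, so value must be exported first.
Further note: I'm calling bash with two arguments: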
-exec bash -c '... code ...' -- {}
                             ^^ ^^
I'm passing the -- as a placeholder. When called with -c, bash starts to index arguments at $0 instead of $1 ($0 would normally contain the script's name). That allows using $1 for the filename from {}, which is in my opinion more readable and understandable.
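The placeholder idea also works with a loop over all the names that {} + supplies. The following is only a sketch under assumptions: the prod/app1 and prod/app2 layout and the placeholder name find-sh are hypothetical, and it prints the directory instead of running sed so that it behaves the same with GNU and BSD tools.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/prod/app1" "$tmp/prod/app2"
touch "$tmp/prod/app1/file.txt" "$tmp/prod/app2/file.txt"

# "find-sh" is a hypothetical placeholder that lands in $0;
# the file names supplied by {} + start at $1.
listing=$(cd "$tmp" && find prod -type f -name file.txt -exec sh -c '
  for file do
    dir=${file%/*}                      # strip the last path component
    echo "$file is in ${dir##*/}"       # keep only the directory name
  done' find-sh {} + | sort)
echo "$listing"

rm -rf "$tmp"
```

The echo line is the place where a per-file command that needs the directory name would go.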

How to rename all the files (without for loop) in a single line command?

I want to rename all the files in my home directory (example abc), in the format (abc_bkp) without using any loops and it should be a single line command in unix (bash script).
If the directory contains nothing but files, this should do it:
ls | xargs -I {} mv {} {}_bkp
If it contains subdirectories, links, and other things you don't want to rename, you must filter the output of ls. Here is a crude way to do it; maybe someone can suggest a more elegant approach:
ls -l | grep ^- | cut -d' ' -f 13 | xargs -I {} mv {} {}_bkp
If you don't want to use loops, then I believe the best way is the find command. Try the following command as a dry run first; once you are satisfied with the results, remove echo from it to run it for real.
find -type f -or -type d | xargs -I % echo mv % %_bkp
-I: From the xargs man page:
-I replace-str
Replace occurrences of replace-str in the initial-arguments with names read from standard input. Also, unquoted blanks do not terminate input items; instead the separator is the newline character. Implies -x and -L 1.
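The replacement behaviour of -I can be checked without renaming anything by feeding xargs a fixed list. This is a sketch with made-up names (abc, notes.txt), not the contents of a real home directory.

```shell
# Dry run: echo shows the mv commands instead of running them.
# -I runs one command per input line and replaces every {} in it.
plan=$(printf '%s\n' abc notes.txt | xargs -I {} echo mv {} {}_bkp)
echo "$plan"
# mv abc abc_bkp
# mv notes.txt notes.txt_bkp
```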

sed not working properly with multiple input files

sed -i is creating a backup of all files in subdirectories before editing in place (as expected) but it's not actually editing files in subdirectories.
$ mkdir -p a/b
$ echo "A" > a/a.txt
$ echo "B" > a/b/b.txt
Now I have two text files, one in a and one in a subdirectory of a.
$ sed -i.bac "1s/^/PREPENDED /" a/**/*.txt
Backups are created for both:
$ find a
a
a/a.txt
a/a.txt.bac
a/b
a/b/b.txt
a/b/b.txt.bac
Only a.txt is edited:
$ cat a/a.txt
PREPENDED A
$ cat a/b/b.txt
B
I'm using ZSH (so I have globstar support) and I'm on Mac.
Why is this happening and how can I fix it?
It's happening because your sed invocation only has a single line 1, which happens to be in a.txt. If you want it to do it for each file then you need to invoke sed multiple times.
for f in a/**/*.txt
do
sed ... "$f"
done
Since you need to descend through several levels of directories, a single invocation of sed alone is not sufficient. However, using find you can accomplish what you want in a single line. If you are not familiar with find ... -exec '{}' \;, it is worth taking a few minutes to do a quick search on startpage.com. In your case, the following invocation works well:
find a -type f -name "*.txt" -exec sed -i.bac 's/^/PREPENDED /' '{}' \;
Here find searches directory a and all below for any file (-type f) matching *.txt, then for each file (indicated by '{}') -exec executes sed -i.bac 's/^/PREPENDED /' and lastly an escaped \; is given to indicate the end of the -exec command.
results:
$ ls -1 a
b
a.txt
a.txt.bac
$ ls -1 a/b
b.txt
b.txt.bac
$ cat a/a.txt
PREPENDED A
$ cat a/b/b.txt
PREPENDED B
As was correctly pointed out, with globstar set (shopt -s globstar) it is unnecessary to use find, as the following invocation of sed is sufficient:
sed -i.bac 's/^/PREPENDED /' a/**/*.txt
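For anyone who wants to keep the 1 address from the question, here is a sketch of the per-file approach in a scratch directory. It assumes a sed that accepts -i with an attached suffix (both GNU and BSD sed do); the variable names are mine.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/a/b"
echo "A" > "$tmp/a/a.txt"
echo "B" > "$tmp/a/b/b.txt"

# \; starts one sed per file, so the address 1 always refers to
# the first line of the file being edited.
find "$tmp/a" -type f -name '*.txt' -exec sed -i.bac '1s/^/PREPENDED /' {} \;

top=$(cat "$tmp/a/a.txt")
nested=$(cat "$tmp/a/b/b.txt")
echo "$top"; echo "$nested"

rm -rf "$tmp"
```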

Clarification of 'sed' usage

I just blindly followed a command from a tutorial to rename several folders at a time. Can anyone explain the meaning of "p;s" given as the argument to sed's -e option.
[root#LinuxD delsure]# ls
ar1 ar2 ar3 ar4 ar5 ar6 ar7
[root#LinuxD delsure]# find . -type d -name "ar*"|sed -e "p;s/ar/AR/g"|xargs -n2 mv
[root#LinuxD delsure]# ls
AR1 AR2 AR3 AR4 AR5 AR6 AR7
A sed script (the bit following the -e option) can contain multiple commands, separated by ;
The script in your example uses the p command to print the pattern space (i.e. the line just read from the input) followed by the s command to perform a substitution on the pattern space.
By default (unless the pattern space is cleared or the -n option is given to sed) after processing each line the current pattern space is printed again, so the result of the substitution will be printed.
Another way to write the same thing would be:
sed -e "p" -e "s/ar/AR/g"
This separates the commands into two scripts. Another way would be:
sed "p;s/ar/AR/g"
because if the only argument to sed is a script then the -e option is not needed.
The argument to the -e option is a script consisting of two commands. The first is p, which prints the unadulterated input, the second is a standard, global substitution. So for input ar1, this should output
ar1
AR1
The other part of this trick is the -n2 option on xargs, which forces it to only use two arguments at a time (instead of as many as it can handle, which would produce very different results).
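The pairing can be seen safely by putting echo in front of mv and feeding in a fixed list. A small sketch, with ar1 and ar2 typed in by hand rather than found by find:

```shell
# sed emits each name twice (original, then substituted);
# xargs -n2 turns every pair of lines into one "mv old new" command.
pairs=$(printf '%s\n' ar1 ar2 | sed -e 'p;s/ar/AR/g' | xargs -n2 echo mv)
echo "$pairs"
# mv ar1 AR1
# mv ar2 AR2
```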
One way in bash:
$ ls
ar6 ar7
$ find . -name 'ar*' | while IFS= read -r file; do echo mv "$file" "${file^^}"; done
mv ./ar6 ./AR6
mv ./ar7 ./AR7
Get rid of the "echo" when you're happy with the output.

unix find and replace text in dir and subdirs

I'm trying to change the name of "my-silly-home-page-name.html" to "index.html" in all documents within a given master directory and subdirs.
I saw this: Shell script - search and replace text in multiple files using a list of strings.
And this: How to change all occurrences of a word in all files in a directory
I have tried this:
grep -r "my-silly-home-page-name.html" .
This finds the lines on which the text exists, but now I would like to substitute 'my-silly-home-page-name' for 'index'.
How would I do this with sed or perl?
Or do I even need sed/perl?
Something like:
grep -r "my-silly-home-page-name.html" . | sed 's/$1/'index'/g'
?
Also; I am trying this with perl, and I try the following:
perl -i -p -e 's/my-silly-home-page-name\.html/index\.html/g' *
This works, but I get an error when perl encounters directories, saying "Can't do inplace edit: SOMEDIR-NAME is not a regular file, <> line N"
Thanks,
jml
find . -type f -exec \
perl -i -pe's/my-silly-home-page-name(?=\.html)/index/g' {} +
Or if your find doesn't support -exec +,
find . -type f -print0 | xargs -0 \
perl -i -pe's/my-silly-home-page-name(?=\.html)/index/g'
Both pass to Perl as arguments as many names at a time as possible. Both work with any file name, including those that contain newlines.
If you are on Windows and you are using a Windows build of Perl (as opposed to a cygwin build), -i won't work unless you also do a backup of the original. Change -i to -i.bak. You can then go and delete the backups using
find . -type f -name '*.bak' -delete
This should do the job:
find . -type f -print0 | xargs -0 sed -e 's/my-silly-home-page-name\.html/index\.html/g' -i
Basically it gathers recursively all the files from the given directory (. in the example) with find and runs sed with the same substitution command as in the perl command in the question through xargs.
Regarding the question about sed vs. perl, I'd say that you should use the one you're more comfortable with since I don't expect huge differences (the substitution command is the same one after all).
There are probably better ways to do this but you can use:
find . -name oldname.html |perl -e 'map { s/[\r\n]//g; $old = $_; s/oldname\.html$/newname.html/; rename $old,$_ } <>';
Fyi, grep searches for a pattern; find searches for files.