Two ways of entering a directory by inode number? - find

If I want to enter a directory by its inode number, why
cd $(find . -inum $inode_num)
works, but the following command does not work:
find . -inum $inode_num -exec cd {} \;
what's the difference between these two, and why is the 2nd one wrong?

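cd is not a program that can be executed; it is a shell built-in. It has to be, because a child process cannot change the current directory of its parent. find runs the -exec command in a child process, so even if a cd executable existed, the directory change would be lost as soon as that process exited. In the first form, cd is run by your shell itself and find only supplies the path.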
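A minimal sketch of that difference, assuming a POSIX-like shell. The scratch directory and the name target_dir are made up for the illustration:

```shell
# Work in a scratch directory so the example is self-contained.
scratch=$(mktemp -d)
mkdir "$scratch/target_dir"
cd "$scratch" || exit 1

# ls -di prints "<inode> <name>"; keep only the inode number.
inode_num=$(ls -di target_dir | awk '{print $1}')

# A cd executed by a child process (here: sh started by find) changes
# only the child's working directory, not ours.
before=$(pwd)
find . -inum "$inode_num" -exec sh -c 'cd "$1" && pwd' sh {} \;
after_exec=$(pwd)

# cd executed by the current shell, with find only supplying the path,
# does take effect.
cd "$(find . -inum "$inode_num" -type d)" || exit 1
after_cd=$(pwd)
```

Here after_exec is the same as before, while after_cd ends in target_dir.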

Related

Copying the files and SUBDIRECTORIES based on modification date?

It may be a duplicate question, but I could not find a solution. I want to copy the last 3 months' files AND subdirectories from one disk to another, but so far I have only found the following command. I really don't know how to copy the files by using -mtime. I'm new to Linux, please help me.
find . -mtime -90 -exec cp {} targetdir \;
But how do I copy directories with their subdirectories and files too? (Please do not suggest rsync; I don't have it on this instance.) Regards, S.
cp needs the recursive option (-R) to handle the subdirectories:
$ find testroot # shows some dirs and files
testroot
testroot/sub1
testroot/sub1/subtestfile
testroot/sub2
testroot/testf
$ find target # empty at this stage
target
$ find ./testroot/ -exec cp -R {} target/ \;
$ find target
target
target/sub1
target/sub1/subtestfile
target/sub2
target/subtestfile
target/testf
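If the goal is to copy only the recently modified files while keeping their relative paths, one possible sketch without rsync is to recreate each file's parent directory before copying it. The directory names, file names and the dest variable below are assumptions made for the illustration. Directories that contain no recent files are not created by this approach.

```shell
# Scratch source and destination trees for the illustration.
src=$(mktemp -d)
dest=$(mktemp -d)
mkdir -p "$src/sub1"
echo recent > "$src/sub1/recent.txt"
echo old > "$src/old.txt"
touch -t 202001010000 "$src/old.txt"   # backdate so -mtime -90 skips it

export dest                            # the child sh below reads $dest
cd "$src" || exit 1

# For every regular file modified in the last 90 days, create its parent
# directory under $dest and copy the file there, preserving timestamps.
find . -type f -mtime -90 -exec sh -c '
  for f do
    mkdir -p "$dest/$(dirname "$f")" && cp -p "$f" "$dest/$f"
  done
' sh {} +
```

Running find from inside the source directory keeps the paths relative, which is what lets the same path be reused under the destination.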

Using semicolon (;) vs plus (+) with exec in find

Why is there a difference in output between using
find . -exec ls '{}' \+
and
find . -exec ls '{}' \;
I got:
$ find . -exec ls \{\} \+
./file1 ./file2
.:
file1 file2 testdir1
./testdir1:
testdir2
./testdir1/testdir2:
$ find . -exec ls \{\} \;
file1 file2 testdir1
testdir2
./file2
./file1
This might be best illustrated with an example. Let's say that find turns up these files:
file1
file2
file3
Using -exec with a semicolon (find . -exec ls '{}' \;), will execute
ls file1
ls file2
ls file3
But if you use a plus sign instead (find . -exec ls '{}' \+), as many filenames as possible are passed as arguments to a single command:
ls file1 file2 file3
The number of filenames is only limited by the system's maximum command line length. If the command exceeds this length, the command will be called multiple times.
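A small sketch that makes the invocation counts visible, assuming a POSIX find that supports -exec ... {} +. Each invocation of the child sh prints how many arguments it received. The scratch directory and file names are made up:

```shell
scratch=$(mktemp -d)
touch "$scratch/file1" "$scratch/file2" "$scratch/file3"

# With \; the command runs once per file, so three lines are printed.
per_file_runs=$(find "$scratch" -type f -exec sh -c 'echo "$#"' sh {} \; | wc -l | tr -d ' ')

# With + the command runs once and receives all three names at once,
# so a single line containing the argument count is printed.
batched_args=$(find "$scratch" -type f -exec sh -c 'echo "$#"' sh {} +)

echo "runs with ';': $per_file_runs"
echo "arguments in the single '+' run: $batched_args"
```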
All of the answers so far are correct. I offer this as a clearer (to me) demonstration of the behaviour that is described using echo rather than ls:
With a semicolon, the command echo is called once per file (or other filesystem object) found:
$ find . -name 'test*' -exec echo {} \;
./test.c
./test.cpp
./test.new
./test.php
./test.py
./test.sh
With a plus, the command echo is called once only. Every file found is passed in as an argument.
$ find . -name 'test*' -exec echo {} \+
./test.c ./test.cpp ./test.new ./test.php ./test.py ./test.sh
If find turns up large numbers of results, you may find that the command being called chokes on the number of arguments.
From man find:
-exec command ;
Execute command; true if 0 status is returned. All following
arguments to find are taken to be arguments to the command until
an argument consisting of ';' is encountered. The string '{}'
is replaced by the current file name being processed everywhere
it occurs in the arguments to the command, not just in arguments
where it is alone, as in some versions of find. Both of these
constructions might need to be escaped (with a '\') or quoted to
protect them from expansion by the shell. See the EXAMPLES
section for examples of the use of the '-exec' option. The
specified command is run once for each matched file.
The command is executed in the starting directory. There are
unavoidable security problems surrounding use of the -exec option;
you should use the -execdir option instead.
-exec command {} +
This variant of the -exec option runs the specified command on
the selected files, but the command line is built by appending
each selected file name at the end; the total number of
invocations of the command will be much less than the number of
matched files. The command line is built in much the same way
that xargs builds its command lines. Only one instance of '{}'
is allowed within the command. The command is executed in
the starting directory.
So, the way I understand it, \; executes a separate command for each file found by find, whereas \+ appends the files and executes a single command on all of them. The \ is an escape character, so it's:
ls testdir1; ls testdir2
vs
ls testdir1 testdir2
Doing the above in my shell mirrored the output in your question.
example of when you would want to use \+
Suppose two files, 1.tmp and 2.tmp:
1.tmp:
1
2
3
2.tmp:
0
2
3
With \;:
find *.tmp -exec diff {} \;
> diff: missing operand after `1.tmp'
> diff: Try `diff --help' for more information.
> diff: missing operand after `2.tmp'
> diff: Try `diff --help' for more information.
Whereas if you use \+ (to concatenate the results of find):
find *.tmp -exec diff {} \+
1c1
< 1
---
> 0
So in this case it's the difference between diff 1.tmp; diff 2.tmp and diff 1.tmp 2.tmp
There are cases where \; is appropriate and others where \+ is the better choice. Using \+ with rm is one such instance: if you are removing a large number of files, performance (speed) will be far superior to \;.
find has special syntax. You use the {} as they are because they have meaning to find as the pathname of the found file and (most) shells don't interpret them otherwise. You need the backslash \; because the semicolon has meaning to the shell, which eats it up before find can get it. So what find wants to see AFTER the shell is done, in the argument list passed to the C program, is
"-exec", "rm", "{}", ";"
but you need \; on the command line to get a semicolon through the shell to the arguments.
You can get away with \{\} because the shell-quoted interpretation of \{\} is just {}. Similarly, you could use '{}'.
What you cannot do is use
-exec 'rm {} ;'
because the shell interprets that as one argument,
"-exec", "rm {} ;"
and rm {} ; isn't the name of a command. (At least unless someone is really screwing around.)
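One way to see what find would receive is to hand the same words to printf, which prints each argument it gets in angle brackets. This is only an illustration of shell quoting, and rm is never run here:

```shell
# Escaped and quoted forms reach the program as the same four arguments.
args_backslash=$(printf '<%s>' -exec rm {} \;)
args_quoted=$(printf '<%s>' -exec rm '{}' ';')

# Quoting the whole command collapses it into a single argument.
args_single=$(printf '<%s>' -exec 'rm {} ;')

echo "$args_backslash"   # <-exec><rm><{}><;>
echo "$args_quoted"      # <-exec><rm><{}><;>
echo "$args_single"      # <-exec><rm {} ;>
```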
Update
the difference is between
$ ls file1
$ ls file2
and
$ ls file1 file2
The + is catenating the names onto a command line.
The difference between ; (semicolon) or + (plus sign) is how the arguments are passed into find's -exec/-execdir parameter. For example:
using ; will execute multiple commands (separately for each argument),
Example:
$ find /etc/rc* -exec echo Arg: {} ';'
Arg: /etc/rc.common
Arg: /etc/rc.common~previous
Arg: /etc/rc.local
Arg: /etc/rc.netboot
All following arguments to find are taken to be arguments to the command.
The string {} is replaced by the current file name being processed.
using + will execute the fewest possible commands (as the arguments are combined together). It's very similar to how the xargs command works: it packs as many arguments into each command as possible without exceeding the system's maximum command-line length.
Example:
$ find /etc/rc* -exec echo Arg: {} '+'
Arg: /etc/rc.common /etc/rc.common~previous /etc/rc.local /etc/rc.netboot
The command line is built by appending each selected file name at the end.
Only one instance of {} is allowed within the command.
See also:
man find
Using semicolon (;) vs plus (+) with exec in find at SO
Simple unix command, what is the {} and \; for at SO
What is meaning of {} + in find's -exec command? at Unix
We were trying to find files for housekeeping.
find . -exec echo {} \; ran overnight and in the end produced no result.
find . -exec echo {} \+ produced results and only took a few hours.
Hope this helps.

How to use multiple files at once using bash

I have a perl script which is used to process some data files from a given directory. I have written the bash script below to look for the last updated file in the given directory and process that file.
cd $data_dir
find \( -type f -mtime -1 \) -exec ./script.pl {} \;
Sometimes a user copies multiple files to the data dir, and the earlier ones are skipped: the perl script processes only the last updated file. Can you please suggest how to fix this using a bash script?
Try
cd $data_dir
find \( -type f -mtime -1 \) -exec ./script.pl {} +
Note the termination of -exec with a + vs your \;
From the man page
-exec command {} +
This variant of the -exec action runs the specified command on the selected files, but the command line is built by appending each selected file name at the end;
Now that you'll have one or more file names passed into your perl script, you can alter your perl script to iterate over each passed in file name.
If I understood the question correctly, you need to process any files that were created or modified in a directory since the last time your script was run.
In my opinion find is not the right tool to determine those files, because it has no notion of which files it has already seen.
Using any of the -atime/-ctime/-mtime options will either produce duplicates if you run your script twice in the specified period, or miss some files if it is not executed at the right time. The timing intricacies of using these options for something like this are not easy to deal with.
I can propose a few alternatives:
a) Use three directories instead of one: incoming/ processing/ done/. Your users should only be allowed to put files in incoming/. You move any files in there to processing/ with a simple mv incoming/* processing/ before running your perl script. Then you move them from processing/ to done/ when it's finished.
In my opinion this is the simplest and best solution, and the one used by mail servers etc when dealing with this issue. If I were you and there were not any special circumstances preventing you from doing this, I'd stop reading here.
b) Have your finder script touch a special file (e.g. .timestamp, perhaps in a different directory, so that your users will not tamper with it) when it's done. This will allow your script to remember the last time it was run. Then use
find \( -cnewer .timestamp -o -newer .timestamp \) -type f -exec ./script.pl '{}' ';'
to run your perl script for each file. You should modify your perl script so that it can run repeatedly with a different file name each time. If you can modify it to accept multiple files in one go, you can also run it with
find \( -cnewer .timestamp -o -newer .timestamp \) -type f -exec ./script.pl '{}' +
which will minimise the number of ./script.pl processes. Take care to handle the first run of the find script, when the .timestamp file is missing. A good solution would be to simply ignore it by not using the -*newer options at all in that case. Also keep in mind that there is a race condition where files added after find was started but before touching the timestamp file will not be processed.
c) As a variation of (b), have your script update the timestamp with the time of the processed file that was created/modified most recently. This is tricky, because find cannot order its output on its own. You could use a wrapper around your perl script to handle this:
#!/bin/bash
# "$@" expands to all file names passed to this wrapper, one word each.
for i in "$@"; do
    find "$i" \( -cnewer .timestamp -o -newer .timestamp \) -exec touch -r '{}' .timestamp ';'
done
./script.pl "$@"
This will update the timestamp if it is called to process a file with a newer mtime or ctime, minimising (but not eliminating) the race condition. It is however somewhat awkward - unavoidable since bash's [[ -nt option seems to only check the mtime. It might be better if your perl script handled that on its own.
d) Have your script store each processed filename and its timestamps somewhere and then skip duplicates. That would allow you to just pass all files in the directory to it and let it sort out the mess. Kinda tricky though...
e) Since you are using Linux, you might want to have a look at inotify and the inotify-tools package, specifically the inotifywait tool. With a bit of scripting it would allow you to process files as they are added to the directory:
inotifywait -e MOVED_TO -e CLOSE_WRITE -m -r testd/ | grep --line-buffered -e MOVED_TO -e CLOSE_WRITE | while read d e f; do ./script.pl "$f"; done
This has no race conditions, as long as your users do not create/copy/move any directories rather than just files.
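Alternative (a) above can be sketched roughly as follows. The process_file function is a hypothetical stand-in for ./script.pl, and the scratch layout and file names are assumptions made for the illustration:

```shell
# Scratch layout: incoming/ -> processing/ -> done/
base=$(mktemp -d)
mkdir "$base/incoming" "$base/processing" "$base/done"
touch "$base/incoming/a.dat" "$base/incoming/b.dat"

# Hypothetical stand-in for ./script.pl: handles exactly one file.
process_file() {
  printf 'processed %s\n' "$(basename "$1")"
}

# Claim everything that is currently in incoming/.
for f in "$base"/incoming/*; do
  [ -e "$f" ] || continue          # glob matched nothing
  mv "$f" "$base/processing/"
done

# Process each claimed file and file it under done/ on success.
for f in "$base"/processing/*; do
  [ -e "$f" ] || continue
  if process_file "$f" >> "$base/run.log"; then
    mv "$f" "$base/done/"
  fi
done
```

Files that arrive in incoming/ while the second loop is running are simply picked up on the next run, so nothing is processed twice and nothing is skipped.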
The perl script will only execute against the file which find gives it. Perhaps you should remove the -mtime -1 option from the find command so that it picks up all the files in the directory?

tarring find results on hp-ux

$ find /tmp/a1
/tmp/a1
/tmp/a1/b2
/tmp/a1/b1
/tmp/a1/b1/x1
simply trying
find /tmp/a1 -exec tar -cvf dirall.tar {} \;
simply doesn't work
any help
The command specified for -exec is run once for each file found. As such, you're recreating dirall.tar every time the command is run. Instead, you should pipe the output of find to tar. Note that --null and -T are GNU tar options, so this assumes GNU tar is available on the machine:
find /tmp/a1 -print0 | tar --null -T- -cvf dirall.tar
Note that if you're simply using find to get a list of all the files under /tmp/a1 and not doing any sort of filtering, it's much simpler to use tar -cvf dirall.tar /tmp/a1.
You're one character away from the solution. The find command's exec option will execute the command for each file found, so you should replace -c with -r to put tar into append mode. Each time find invokes it, it'll tack on one more file:
rm -f dirall.tar
find /tmp/a1 -exec tar -rvf dirall.tar {} \;
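A variation on the append approach, sketched under the assumption that the local tar supports -r on a plain (uncompressed) archive and creates the archive if it does not exist yet. Directories are skipped because tar would otherwise recurse into them and store the same files again. The scratch tree mirrors the one in the question. Empty directories such as b2 are not archived by this variant.

```shell
# Rebuild a small tree like the one in the question.
scratch=$(mktemp -d)
mkdir -p "$scratch/a1/b1" "$scratch/a1/b2"
touch "$scratch/a1/b1/x1"
cd "$scratch" || exit 1

# Start from a clean slate, because -r appends to an existing archive.
rm -f dirall.tar

# Append only non-directories. With + tar is started as few times as
# possible, and append mode stays correct even if find has to split
# the list across several tar invocations.
find a1 ! -type d -exec tar -rf dirall.tar {} +

# List the archive contents to check the result.
listing=$(tar -tf dirall.tar)
echo "$listing"
```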
I'd think something like "find /tmp/a1 | xargs tar cvf foo.tar" would work. But make sure you have backups first!
Does HP-UX have cpio? It takes a list of files on stdin, and some versions can write output in tar format.

Deleting empty (zero-byte) files

What's the easiest/best way to find and remove empty (zero-byte) files using only tools native to Mac OS X?
Easy enough:
find . -type f -size 0 -exec rm -f '{}' +
To ignore any file having xattr content (assuming the MacOS find implementation):
find . -type f -size 0 '!' -xattr -exec rm -f '{}' +
That said, note that many xattrs are not particularly useful (for example, com.apple.quarantine exists on all downloaded files).
You can lower the potentially huge number of forks to run /bin/rm by:
find . -type f -size 0 -print0 | xargs -0 /bin/rm -f
The above command is very portable, running on most versions of Unix rather than just Linux boxes, and on versions of Unix going back for decades. For long file lists, several /bin/rm commands may be executed to keep the list from overrunning the command line length limit.
A similar effect can be achieved with less typing on more recent OSes, using a + in find to replace the most common use of xargs, in a style that still lends itself to other actions besides /bin/rm. In this case, find will handle splitting truly long file lists into separate /bin/rm commands. The {} is customarily quoted to keep the shell from doing anything to it; the quotes aren't always required, but the intricacies of shell quoting are too involved to cover here, so when in doubt, include the apostrophes:
find . -type f -size 0 -exec /bin/rm -f '{}' +
In Linux, briefer approaches are usually available using -delete. Note that recent find's -delete primary is directly implemented with unlink(2) and doesn't spawn a zillion /bin/rm commands, or even the few that xargs and + do. Mac OS find also has the -delete and -empty primaries.
find . -type f -empty -delete
To stomp empty (and newly-emptied) files - directories as well - many modern Linux hosts can use this efficient approach:
find . -empty -delete
find /path/to/stuff -empty
If that's the list of files you're looking for then make the command:
find /path/to/stuff -empty -exec rm {} \;
Be careful! There won't be any way to undo this!
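Since the deletion cannot be undone, one cautious pattern is to print the matches first and only then rerun the same expression with rm. A minimal sketch in a scratch directory, where the file names are made up:

```shell
scratch=$(mktemp -d)
: > "$scratch/empty.txt"                # zero-byte file
printf 'data\n' > "$scratch/full.txt"   # non-empty file

# Dry run: list what would be removed, without deleting anything.
would_delete=$(find "$scratch" -type f -size 0 -print)
echo "$would_delete"

# Same expression again, this time removing the matches.
find "$scratch" -type f -size 0 -exec rm -f {} +

remaining=$(ls "$scratch")
echo "$remaining"
```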
Use:
find . -type f -size 0b -exec rm {} ';'
with all the other possible variations to limit what gets deleted.
A very simple solution in case you want to do it inside ONE particular folder:
Go inside the folder, right click -> view -> as list.
Now you'll see all the files shown as a list. Click on "Size", which should be a column heading. This will sort all the files by size.
Finally, you will find all the files that have zero bytes at the end of the list. Just select those and delete them!