Deleting empty (zero-byte) files - command-line

What's the easiest/best way to find and remove empty (zero-byte) files using only tools native to Mac OS X?

Easy enough:
find . -type f -size 0 -exec rm -f '{}' +
To ignore any file having xattr content (assuming the MacOS find implementation):
find . -type f -size 0 '!' -xattr -exec rm -f '{}' +
That said, note that many xattrs are not particularly useful (for example, com.apple.quarantine exists on all downloaded files).
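If you'd rather inspect the attributes before deciding what to delete, macOS ships an xattr tool; something like this should list each empty file's attributes (a sketch, not a required step):
find . -type f -size 0 -exec xattr -l '{}' +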

You can reduce the potentially huge number of forked /bin/rm processes with:
find . -type f -size 0 -print0 | xargs -0 /bin/rm -f
The above command is very portable, running on most versions of Unix, not just Linux boxes, and on systems going back decades. For long file lists, xargs will run several /bin/rm commands to keep the argument list from overrunning the command-line length limit.
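As a sanity check, you can prepend echo so that xargs only prints the rm commands it would run:
find . -type f -size 0 -print0 | xargs -0 echo /bin/rm -f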
A similar effect can be achieved with less typing on more recent OSes, using a + terminator in find to replace the most common use of xargs, in a style that still lends itself to other actions besides /bin/rm. In this case, find will handle splitting truly long file lists into separate /bin/rm commands. The {} is customarily quoted to keep the shell from doing anything to it; the quotes aren't always required, but the intricacies of shell quoting are too involved to cover here, so when in doubt, include them:
find . -type f -size 0 -exec /bin/rm -f '{}' +
In Linux, briefer approaches are usually available using -delete. Note that the -delete primary in recent find implementations is built directly on unlink(2), so it doesn't spawn a zillion /bin/rm commands, or even the few that xargs and + do. Mac OS find also has the -delete and -empty primaries.
find . -type f -empty -delete
To stomp empty (and newly-emptied) files, and empty directories as well, many modern Linux hosts can use this efficient approach:
find . -empty -delete

find /path/to/stuff -empty
If that's the list of files you're looking for, then make the command:
find /path/to/stuff -empty -exec rm {} \;
Be careful! There won't be any way to undo this!
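If you want a safety net, find's -ok action behaves like -exec but asks for confirmation before each removal:
find /path/to/stuff -empty -ok rm {} \;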

Use:
find . -type f -size 0b -exec rm {} ';'
with all the other possible variations to limit what gets deleted.
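For instance (the patterns here are just illustrative):
find . -type f -size 0b -name '*.log' -exec rm {} ';'    # only empty .log files
find . -type f -size 0b -mtime +30 -exec rm {} ';'       # only empty files untouched for 30+ days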

A very simple solution in case you want to do it inside ONE particular folder:
Go inside the folder and choose View -> as List from the right-click menu.
Now you'll see all the files laid out in a list. Click on the "Size" column heading; this will sort all the files by size.
Finally, the zero-byte files will all be grouped at the end. Just select those and delete them!

Related

Delete multiple files in CentOS command

How can I delete all the files that end with *0x0.jpg in CentOS? I need to delete multiple files nested in folders and subfolders.
I assume you have a shell - try
find /mydirectory -type f -print | grep '0x0.jpg$' | xargs -n1 rm -f
There is probably a more elegant solution, but that should work.
However, I would put an echo in before rm on the first run to ensure that the right files are going to be removed.
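That dry run would look something like this:
find /mydirectory -type f -print | grep '0x0.jpg$' | xargs -n1 echo rm -f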
Ed Heal's answer works just fine, but neither the grep nor the xargs call is necessary. The following should work just as well and be a good bit more efficient for large numbers of files.
find /mydirectory -name '*0x0.jpg' -type f -exec rm -f {} +

clearcase create diff between parent and child branches

So the typical way I would create a diff log/patch between two branches in ClearCase would be to simply create two views and do a typical Unix diff. But I have to assume that there is a more ClearCase way (and also a '1-liner').
So, knowing how to get a list of all files that have been modified on a branch:
cleartool find . -type f -branch "brtype(<BRANCH_NAME>)" -print
and knowing how to get the diff formatted output for two separate files:
cleartool diff FILE FILE@@/main/PARENT_BRANCH_PATH/LATEST
So, does anyone see any issues with the following to get a diff for all files that have been changed in a branch?
cleartool find . -type f -branch "brtype(CHILD_BRANCH)" -exec 'cleartool diff -ser $CLEARCASE_PN `echo $CLEARCASE_XPN | sed "s/CHILD_BRANCH/LATEST/"` ' > diff.log
Any modifications and comments are greatly welcomed
thanks in advance!
update: any ideas on how to get this to be a Unix unified diff would also be greatly appreciated.
update2: So I think I have my solution, thanks go to VonC for sending me in the right directions:
cleartool find . -type f -branch "brtype(CHILD_BRANCH)" -exec 'cleartool get -to $CLEARCASE_PN.prev `echo $CLEARCASE_XPN | sed "s/CHILD_BRANCH/LATEST/"`; diff -u $CLEARCASE_PN.prev $CLEARCASE_PN; rm -f $CLEARCASE_PN.prev' > CHILD_BRANCH.diff
the output seems to work, I can read the file in via kompare without complaints.
The idea is sound.
I would simply make sure the $CLEARCASE_PN and $CLEARCASE_XPN are used with double quotes around them, to take into account potential spaces in the file path or file name (as illustrated in "How do I list ClearCase versions without the Fully-qualified version?").
cleartool find . -type f -branch "brtype(CHILD_BRANCH)" -exec 'cleartool diff -ser "$CLEARCASE_PN" `echo "$CLEARCASE_XPN" | sed "s/CHILD_BRANCH/LATEST/"` ' > diff.log
Using simple quotes for the -exec directive is a good idea, as explained in "CLEARCASE_XPN not parsed as variable in clearcase command".
However, cleartool diff, even with the -ser (-serial) option, doesn't produce exactly a Unix unified diff format (or Unified Format for short).
The -diff(_format) option is the closest, as I mention in "How would you measure inserted / changed / removed code lines (LoC)?"
The -diff_format option causes both the headers and differences to be reported in the style of the UNIX and Linux diff utility, writing a list of the changes necessary to convert the first file being compared into the second file.
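For example, comparing an element against its predecessor version in that style might look like this (file.c is a placeholder; -pred selects the predecessor version):
cleartool diff -diff_format -pred file.c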
One idea would be to not use cleartool diff, but use directly diff, since it can access in a dynamic view the right version through the extended pathname of the elements found.
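For instance, in a dynamic view a version-extended pathname is just another file, so plain diff can read it directly (the paths here are hypothetical):
diff -u "file.c@@/main/PARENT_BRANCH_PATH/LATEST" file.c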
The OP ckcin's solution is close to what I suggested with cleartool get:
cleartool find . -type f -branch "brtype(CHILD_BRANCH)" -exec 'cleartool get -to $CLEARCASE_PN.prev `echo $CLEARCASE_XPN | sed "s/CHILD_BRANCH/LATEST/"`; diff -u $CLEARCASE_PN.prev $CLEARCASE_PN; rm -f $CLEARCASE_PN.prev' > CHILD_BRANCH.diff
the output seems to work, I can read the file in via kompare without complaints.
In multiple line, for readability:
cleartool find . -type f -branch "brtype(CHILD_BRANCH)"
-exec 'cleartool get -to $CLEARCASE_PN.prev
`echo $CLEARCASE_XPN | sed "s/CHILD_BRANCH/LATEST/"`;
diff -u $CLEARCASE_PN.prev $CLEARCASE_PN;
rm -f $CLEARCASE_PN.prev' > CHILD_BRANCH.diff
(Note that $CLEARCASE_XPN and $CLEARCASE_PN are set by the cleartool find command; they're not variables you set yourself.)
Transferring the answer from VonC and einpoklum to Windows, I came up with the following. Create a separate batch file, which I called diffClearCase.bat; this eases up the command line significantly. It creates a separate tree for all modified files, which I personally liked, but the files and folders can be deleted afterwards.
@echo off
SET PLAINFILE=%1
SET PLAINDIR=%~dp1
SET CLEARCASE_FILE=%2
SET BRANCH_NAME=%3
SET SOURCE_DRIVE=T:
SET TARGET_TEMP_DIR=D:
SET DIFF_TARGET_FILE=D:\allPatch.diff
call set BASE_FILE=%%CLEARCASE_FILE:%BRANCH_NAME%=LATEST%%
call set TARGET_FILE=%%PLAINFILE:%SOURCE_DRIVE%=%TARGET_TEMP_DIR%%%
call set TARGET_DIR=%%PLAINDIR:%SOURCE_DRIVE%=%TARGET_TEMP_DIR%%%
echo Diffing file %PLAINFILE%
IF NOT EXIST %TARGET_DIR% mkdir %TARGET_DIR%
cleartool get -to %TARGET_FILE% %BASE_FILE%
diff -u %TARGET_FILE% %PLAINFILE% >> %DIFF_TARGET_FILE%
rem del /F/Q %TARGET_FILE%
And then I created a second bat file which simply takes the branch name as argument. In our case this directory contains multiple VOBs, so I iterate over them and do this per VOB.
@echo off
SET BRANCHNAME=%1
SET DIFF_TARGET_FILE=D:\allPatch.diff
SET SOURCE_DRIVE=T:
SET DIFF_TOOL=D:\Data\Scripts\diffClearCase.bat
IF EXIST %DIFF_TARGET_FILE% DEL /Q %DIFF_TARGET_FILE%
for /D %%V in ("%SOURCE_DRIVE%\*") DO (
echo Checking VOB %%V
cd %%V
cleartool find %%V -type f -branch "brtype(%BRANCHNAME%)" -exec "%DIFF_TOOL% \"%%CLEARCASE_PN%%\" \"%%CLEARCASE_XPN%%\" %BRANCHNAME%"
)

Need to find list of scripts that uses a certain file

I have a file named "performance". I need to know which scripts use this file.
I don't believe there is a straightforward way of listing files used by scripts. You will have to run grep in combination with find to check whether each script contains the name of the file that you want to check for. Knowing the exact name of the file will help. Using a word like performance might end up grepping files that use that word in comments.
find /path/ \( -name "*.sh" -o -name "*.pl" \) -type f -print0 | xargs -0 grep "performance"
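If you only want the names of the matching scripts rather than every matching line, grep's -l option does that:
find /path/ \( -name "*.sh" -o -name "*.pl" \) -type f -print0 | xargs -0 grep -l "performance"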
If you're on Linux, then you may install and configure auditd to watch for accesses to a particular file.
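A minimal sketch, assuming auditd is installed and running (the key name perf-watch is arbitrary):
auditctl -w /path/to/performance -p r -k perf-watch    # watch reads of the file
ausearch -k perf-watch                                 # later, see what opened it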
You can use the -r option to recursively grep through all sub-directories and find text. The syntax is as follows:
grep -r "performance" /dir/
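With GNU grep you can also restrict the recursive search to likely script files, for example:
grep -r --include='*.sh' --include='*.pl' -l "performance" /dir/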

unix - delete files only from directory

Say with a directory structure such as:
toplev/
    file2.txt
    file5.txt
    midlev/
        test.txt
        anotherdirec/
            other.dat
            myfile.txt
            furtherdown/
                morefiles.txt
        otherdirec/
            myfile4.txt
            file7.txt
How would you delete all files (not directories and not recursively) from 'anotherdirec'? In this example it would delete two files (other.dat, myfile.txt).
I have tried the below command from within the 'midlev' directory but it gives this error (find: bad option -maxdepth find: [-H | -L] path-list predicate-list):
find anotherdirec/ -type f -maxdepth 1
I'm running SunOS 5.10.
rm anotherdirec/*
should work for you.
Rob's answer (rm anotherdirec/*) will probably work, but it is a bit noisy: rm without -r refuses to remove the subdirectories the glob matches and prints an error for each. (As for your find attempt, the problem is that you are using a version of find that does not support the -maxdepth option.) If you want to avoid the error messages that 'rm anotherdirec/*' gives, you can just do:
for i in anotherdirec/*; do test -f $i && rm $i; done
Note that neither of these solutions will work if any of the files contain spaces or other special characters. You can put double quotes around $i if that is an issue.
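The whitespace-safe version would be:
for i in anotherdirec/*; do test -f "$i" && rm "$i"; done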
find is sensitive to the order of its options. Try this:
find anotherdirec/ -maxdepth 1 -type f -exec rm {} \;
rm toplev/midlev/anotherdirec/* if you want to delete only files.
rm -rf toplev/midlev/anotherdirec/* if you want to delete files and lower directories

How to use multiple files at once using bash

I have a Perl script which is used to process some data files from a given directory. I have written the bash script below to look for the last updated file in the given directory and process that file.
cd $data_dir
find \( -type f -mtime -1 \) -exec ./script.pl {} \;
Sometimes, users copy multiple files to the data dir, and hence all but the last are skipped: the Perl script executes only against the most recently updated file. Can you please suggest how to fix this using a bash script?
Try
cd $data_dir
find \( -type f -mtime -1 \) -exec ./script.pl {} +
Note the termination of -exec with a + vs your \;
From the man page
-exec command {} +
This variant of the -exec action runs the specified command on the selected files, but the command line is built by appending each selected file name at the end;
Now that you'll have one or more file names passed into your Perl script, you can alter your Perl script to iterate over each passed-in file name.
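A quick way to see the difference (echo stands in for script.pl here):
find . -type f -mtime -1 -exec echo args: {} \;   # one file per line: one invocation each
find . -type f -mtime -1 -exec echo args: {} +    # all files on one line: a single invocation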
If I understood the question correctly, you need to process any files that were created or modified in a directory since the last time your script was run.
In my opinion find is not the right tool to determine those files, because it has no notion of which files it has already seen.
Using any of the -atime/-ctime/-mtime options will either produce duplicates if you run your script twice in the specified period, or miss some files if it is not executed at the right time. The timing intricacies of using these options for something like this are not easy to deal with.
I can propose a few alternatives:
a) Use three directories instead of one: incoming/, processing/ and done/. Your users should only be allowed to put files in incoming/. You move any files in there to processing/ with a simple mv incoming/* processing/ before running your Perl script. Then you move them from processing/ to done/ when it's over.
In my opinion this is the simplest and best solution, and the one used by mail servers etc when dealing with this issue. If I were you and there were not any special circumstances preventing you from doing this, I'd stop reading here.
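A minimal sketch of that flow, assuming the three directories already exist:
cd "$data_dir"
mv incoming/* processing/ 2>/dev/null   # claim whatever has arrived
for f in processing/*; do
    [ -f "$f" ] || continue
    ./script.pl "$f" && mv "$f" done/   # archive only on success
done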
b) Have your finder script touch a special file (e.g. .timestamp, perhaps in a different directory, so that your users will not tamper with it) when it's done. This will allow your script to remember the last time it was run. Then use
find \( -cnewer .timestamp -o -newer .timestamp \) -type f -exec ./script.pl '{}' ';'
to run your perl script for each file. You should modify your perl script so that it can run repeatedly with a different file name each time. If you can modify it to accept multiple files in one go, you can also run it with
find \( -cnewer .timestamp -o -newer .timestamp \) -type f -exec ./script.pl '{}' +
which will minimise the number of ./script.pl processes. Take care to handle the first run of the find script, when the .timestamp file is missing. A good solution would be to simply ignore it by not using the -*newer options at all in that case. Also keep in mind that there is a race condition where files added after find was started but before touching the timestamp file will not be processed.
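Handling the missing .timestamp on the first run could look like this sketch:
if [ -e .timestamp ]; then
    find . \( -cnewer .timestamp -o -newer .timestamp \) -type f -exec ./script.pl '{}' +
else
    find . -type f -exec ./script.pl '{}' +    # first run: process everything
fi
touch .timestamp    # note the race window mentioned above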
c) As a variation of (b), have your script update the timestamp with the time of the processed file that was created/modified most recently. This is tricky, because find cannot order its output on its own. You could use a wrapper around your perl script to handle this:
#!/bin/bash
# Wrapper: bump .timestamp to match the newest ctime/mtime among the
# files being processed, then hand all of them to the real script.
for i in "$@"; do
    find "$i" \( -cnewer .timestamp -o -newer .timestamp \) -exec touch -r '{}' .timestamp ';'
done
./script.pl "$@"
This will update the timestamp if it is called to process a file with a newer mtime or ctime, minimising (but not eliminating) the race condition. It is, however, somewhat awkward, and unavoidably so, since bash's [[ -nt test seems to only check the mtime. It might be better if your Perl script handled that on its own.
d) Have your script store each processed filename and its timestamps somewhere and then skip duplicates. That would allow you to just pass all files in the directory to it and let it sort out the mess. Kinda tricky though...
e) Since you are using Linux, you might want to have a look at inotify and the inotify-tools package, specifically the inotifywait tool. With a bit of scripting it would allow you to process files as they are added to the directory:
inotifywait -e MOVED_TO -e CLOSE_WRITE -m -r testd/ | grep --line-buffered -e MOVED_TO -e CLOSE_WRITE | while read d e f; do ./script.pl "$d$f"; done
This has no race conditions, as long as your users only create/copy/move plain files, not directories.
The perl script will only execute against the file which find gives it. Perhaps you should remove the -mtime -1 option from the find command so that it picks up all the files in the directory?