ls: terminated by signal 13 when using xargs - find

I'm using the following command to delete the four largest files in a folder:
find "/var/www/site1/" -maxdepth 1 -type f | xargs ls -1S | head -n 4 | xargs -d '\n' rm -f
It works fine, but from time to time it throws a broken pipe error:
xargs: ls: terminated by signal 13

I ran across a similar issue and found this thread while searching for an answer:
Signal 13 means something is written to a pipe where nothing is read from anymore (e.g. see http://people.cs.pitt.edu/~alanjawi/cs449/code/shell/UnixSignals.htm ).
The point here is that the ls command as executed by xargs is still writing output when the following head command already got all the input it wants and closed its input-pipe. Thus it's safe to ignore, yet it's ugly. See also the accepted answer in https://superuser.com/questions/554855/how-can-i-fix-a-broken-pipe-error
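A quick way to see this signal in isolation (my own illustration, not part of the thread) is to run a pipeline whose reader quits early and inspect bash's PIPESTATUS; the writer's exit status of 141 is 128 + 13, i.e. death by SIGPIPE:

yes | head -n 1
echo "${PIPESTATUS[@]}"    # typically prints "141 0": yes was killed by SIGPIPE, head exited normally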

You are purposely terminating the pipeline with head -n 4, which creates the broken pipe because the reader exits before the writer has finished. Since this is expected by you, you can ignore the error by redirecting the stderr of the xargs that runs ls to /dev/null, which discards it:
find "/var/www/site1/" -maxdepth 1 -type f | xargs ls -1S | head -n 4
| xargs -d '\n' rm -f 2>/dev/null
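A variant that sidesteps the problem entirely (a sketch of my own, not from the original answer, assuming GNU find and sort): with no intermediate xargs there is nothing left to report that its child was killed, and filenames containing spaces are handled as well:

# size<TAB>path, sorted largest first; only breaks on filenames containing tabs or newlines
find /var/www/site1/ -maxdepth 1 -type f -printf '%s\t%p\n' \
    | sort -rn | head -n 4 | cut -f2- | xargs -d '\n' rm -f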

I got the same error, "terminated by signal 13", under different circumstances and other answers here helped me work out the fix. I'd like to expand on the nature of the problem:
corpy386 ~/gw/Release/5.1_v9/ClaimCenter $ find . -name '*.pcf' -not -name '*build*' | xargs grep -l ClaimSnapshotGeneralPanelSet | ( read f && echo $f && grep 'def=' $f )
./modules/configuration/build/idea/classes/web/pcf/claim/snapshot/default/ClaimSnapshotGeneralPanelSet.auto.pcf
def="AddressSnapshotInputSet(Snapshot.LossLocation, Snapshot)"
xargs: grep: terminated by signal 13
So here's the same error, and I'd get output for only a single file when I knew there were numerous files matching what I was looking for. The problem was that xargs was producing multiple lines of output and read was consuming only a single line before ending. xargs tried to write the rest of its results to the pipe, but the receiving end had already quit and gone home. Hence, signal 13: Broken Pipe.
The fix was to consume all of xargs's output by looping - change read f && do_some_things (which reads one time only) to while read f; do do_some_things; done.
corpy386 ~/gw/Release/5.1_v9/ClaimCenter $ find . -name '*.pcf' -not -name '*build*' | xargs grep -l ClaimSnapshotGeneralPanelSet | while read f; do echo $f; grep 'def=' $f; done
./modules/configuration/build/idea/classes/web/pcf/claim/snapshot/default/ClaimSnapshotGeneralPanelSet.auto.pcf
def="AddressSnapshotInputSet(Snapshot.LossLocation, Snapshot)"
./modules/configuration/build/idea/classes/web/pcf/claim/snapshot/default/ClaimSnapshotGeneralPanelSet.gl.pcf
def="AddressSnapshotInputSet(Snapshot.LossLocation, Snapshot)"
./modules/configuration/build/idea/classes/web/pcf/claim/snapshot/default/ClaimSnapshotGeneralPanelSet.Pr.pcf
def="ClaimSnapshotGeneralPRPanelSet(Claim, Snapshot)"
./modules/configuration/build/idea/classes/web/pcf/claim/snapshot/default/ClaimSnapshotGeneralPanelSet.Trav.pcf
def="AddressSnapshotInputSet(Snapshot.LossLocation, Snapshot)"
./modules/configuration/build/idea/classes/web/pcf/claim/snapshot/default/ClaimSnapshotGeneralPanelSet.wc.pcf
def="AddressSnapshotInputSet(Snapshot.LossLocation, Snapshot)"
./modules/configuration/build/idea/classes/web/pcf/claim/snapshot/default/ClaimSnapshotLossDetailsScreen.default.pcf
def="ClaimSnapshotGeneralPanelSet(Claim, SnapshotParam)"
./modules/configuration/config/web/pcf/claim/snapshot/default/ClaimSnapshotGeneralPanelSet.auto.pcf
def="AddressSnapshotInputSet(Snapshot.LossLocation, Snapshot)"
./modules/configuration/config/web/pcf/claim/snapshot/default/ClaimSnapshotGeneralPanelSet.gl.pcf
def="AddressSnapshotInputSet(Snapshot.LossLocation, Snapshot)"
./modules/configuration/config/web/pcf/claim/snapshot/default/ClaimSnapshotGeneralPanelSet.Pr.pcf
def="ClaimSnapshotGeneralPRPanelSet(Claim, Snapshot)"
./modules/configuration/config/web/pcf/claim/snapshot/default/ClaimSnapshotGeneralPanelSet.Trav.pcf
def="AddressSnapshotInputSet(Snapshot.LossLocation, Snapshot)"
./modules/configuration/config/web/pcf/claim/snapshot/default/ClaimSnapshotGeneralPanelSet.wc.pcf
def="AddressSnapshotInputSet(Snapshot.LossLocation, Snapshot)"
./modules/configuration/config/web/pcf/claim/snapshot/default/ClaimSnapshotLossDetailsScreen.default.pcf
def="ClaimSnapshotGeneralPanelSet(Claim, SnapshotParam)"
This isn't exactly the same situation as the OP's script (they wanted a part of the input and cut it off on purpose; I wanted the whole stream and cut it off by accident), but the shell semantics work out the same. Programs tend to be written to keep running until they have consumed all their input rather than testing whether their recipient is still listening.

Related

Issue with Sed no input file when Xgrep

I am trying to create a script which looks for files older than x days that contain a specific string, removes that string, and logs each file it has changed.
My way is probably not the best way, but I am new to this, so I am looking for some help. I have got to the stage where the script works, but it does not log the names of the files it has worked on.
The working script is:
find /home/test -mtime +5 -type f ! -size 0 | xargs grep -E -l '(abc_Pswd_[1-9])' | xargs -n1 sed -i '/abc_Pswd_[1-9].*/d'
I am trying to get the file names from the 2nd part of the script. I have tried a few things:
find /home/test -mtime +7 -type f ! -size 0 | xargs grep -E -l '(abc.1x.[1-9] )' > /home/test/tst.log| xargs -n1 sed -i '/abc_Pswd_[1-9].*/d'
This works in terms of logging the result but it exits with the error "sed: no input files"
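The problem is that > /home/test/tst.log sends all of grep's output to the log file, so the second xargs receives nothing on stdin and runs sed with no file arguments, hence "sed: no input files". One way to both log and keep feeding the pipeline (a sketch I'm adding here, reusing the pattern from the working script) is to insert tee:

find /home/test -mtime +7 -type f ! -size 0 \
    | xargs grep -E -l '(abc_Pswd_[1-9])' \
    | tee -a /home/test/tst.log \
    | xargs -n1 sed -i '/abc_Pswd_[1-9].*/d'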

follow logfile with tail and exec on event

I wonder if there is a simpler way to run tail -f or -F on a logfile and execute a command each time a specific keyword appears.
This is my working solution so far, but I don't like it for the following reasons:
I have to write new lines to the log file for each match to avoid an endless loop
tail does not follow the log exactly; it could miss some lines while the command is executing
I am unsure about the CPU usage caused by the high polling frequency
example:
#!/sbin/sh
while [ 1 ]
do
tail -n1 logfile.log | grep "some triggering text" && mount -v $stuff >> logfile.log
done
I tried the following, but grep won't return an exit status until the pipe breaks:
#!/sbin/sh
tail -f -n1 logfile.log | grep "some triggering text" && mount $stuff
I am running the script on Android, which is limited to
busybox ash
edit:
The problem is related to grep: grep won't give a return code until the last line. What I need is a return code for each line, maybe some kind of --follow option for grep, sed, or awk, or a user-defined function that works with tail --follow.
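One approach that stays within busybox ash (a sketch of my own, not an answer quoted from the thread) is to keep tail's pipe open and react to each line with a while read loop; since nothing is appended back to logfile.log here, the endless-loop workaround is no longer needed:

#!/sbin/sh
# -n 0 starts at the end of the file, so only newly written lines trigger the action;
# use -F instead of -f if your busybox build supports it and the log gets rotated
tail -f -n 0 logfile.log | while read -r line
do
    case "$line" in
        *"some triggering text"*) mount -v $stuff ;;
    esac
done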

how to get number of files older than 1 hour on ksh HP-UX

I need to get the number of files older than 1 hour in a certain folder on HP-UX. The following is the command I tried:
find . -type f -mmin +60 | wc -l
But it returns the following error in ksh:
find: bad option -mmin
What is an alternative way to get the number of files older than 1 hour?
I even tried the following command, which works in bash, but here it gives yet another error:
find . -type f -mtime +0.04 | wc -l
find: Error in processing the argument 0.04
find on HP-UX has no option for minutes; -mtime takes days as its argument.
You can create a test file, touch it with the desired time, and then compare with ! -newer[m]. For instance:
# onehourago=`date +"%m %d %H %M" | awk '{ onehourago=$3 - 1 ; if (onehourago<0) { onehourago=59 } printf("%.2d%.2d%.2d%.2d\n",$1,$2,onehourago,$4) }'`
# touch -t "$onehourago" testfile
# find . -type f ! -newer testfile | wc -l
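Note that the hand-rolled date arithmetic above does not handle the midnight wrap-around (when the hour goes negative it is set to 59, an invalid hour, and the day is never decremented). If perl happens to be installed (an assumption on my part), letting it compute the timestamp avoids that edge case:

# onehourago=`perl -e '@t = localtime(time - 3600); printf "%02d%02d%02d%02d", $t[4] + 1, $t[3], $t[2], $t[1]'`
# touch -t "$onehourago" testfile
# find . -type f ! -newer testfile | wc -l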

Perl / xargs terrible performance with xargs -n1/-i

I have a little perl one-liner I wrote:
find . -name '*.cpp' -print0 2>/dev/null | xargs -0 -i perl -ne 'if (/\+\+\S*[cC]ursor\S*/ && !/[!=]=\s*DB_NULL_CURSOR/) {print "$ARGV:$.\n $_\n";}' {}
In the directory I'm running this, the find portion returns 5802 results.
Now, I understand xargs -i (or -n1) is going to have a performance impact, but with -i:
find . -name '*.cpp' -print0 2> /dev/null 0.33s user 1.12s system 0% cpu 3:12.57 total
xargs -0 -i perl -ne {} 4.12s user 32.80s system 16% cpu 3:42.22 total
And without:
find . -name '*.cpp' -print0 2> /dev/null 0.27s user 1.22s system 95% cpu 1.556 total
xargs -0 perl -ne 0.62s user 0.69s system 61% cpu 2.117 total
Minutes vs. a couple of seconds (the order of testing was confirmed not to matter). The actual perl results are identical, other than the line numbers, which are obviously incorrect in the second case.
Behavior is identical in Cygwin/bash/perl5v26, and WSL Ubuntu 16.04/zsh/perl5v22. File system is NTFS in both cases. But...I'm kind of assuming the little one-liner I wrote must have some sort of bug in it and that stuff is irrelevant?
EDIT: It occurred to me that disabling sitecustomize.pl at startup with -f (an option I'd vaguely remembered seeing in perl --help) might help. It did not. Also, I'm aware that the performance impact of -i is going to be significant due to perl compiling the regex. This still seems out of control.
With -i, xargs will invoke a new process for every line it processes, so in your case it will be spinning up perl 5802 times, and doing this in series.
You could try running it in parallel:
You might be using xargs to invoke a compute intensive command for
every line of input. Wouldn’t it be nice if xargs allowed you to take
advantage of the multiple cores in your machine? That’s what -P is
for. It allows xargs to invoke the specified command multiple times in
parallel. You might use this for example to run multiple ffmpeg
encodes in parallel. However I’m just going to show you yet another
contrived example.
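Applied to the command from the question, that might look like the following (a sketch of my own, assuming GNU xargs; -P 4 runs up to four perl processes at a time while keeping the one-file-per-invocation behaviour of -i, so the per-file line numbers stay correct):

find . -name '*.cpp' -print0 2>/dev/null \
    | xargs -0 -P 4 -i perl -ne 'if (/\+\+\S*[cC]ursor\S*/ && !/[!=]=\s*DB_NULL_CURSOR/) {print "$ARGV:$.\n $_\n";}' {}

Note that with parallel invocations the output ordering is no longer deterministic.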
Or, on the other hand, you could use sed, which is much lighter-weight to spin up.
Okay, my fundamental misunderstanding was an assumption that the max command line length would be something in the 2000 range. So I was assuming a perl instance for every 20 files or so (being about 120 characters each). This was incredibly incorrect.
getconf ARG_MAX shows you the actual acceptable length. In my case:
2097152
So, I'm looking at 1 perl instance vs. 5802 instances. The only perl solution I can think of would be to remove -n and implement the loop by hand, explicitly closing each file.
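As an aside (my addition, not part of this answer): a lighter perl-only workaround is to keep -n and reset the line counter by closing ARGV at the end of each file, the idiom documented under perldoc -f eof:

find . -name '*.cpp' -print0 2>/dev/null \
    | xargs -0 perl -ne 'if (/\+\+\S*[cC]ursor\S*/ && !/[!=]=\s*DB_NULL_CURSOR/) {print "$ARGV:$.\n $_\n";} close ARGV if eof;'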
Better solutions, I think, are awk:
find . -name '*.cpp' 2>/dev/null -print0 | xargs -0 awk '{if (/\+\+\S*[cC]ursor\S*/ && !/[!=]=\s*DB_NULL_CURSOR/) {print FILENAME ":" FNR " " $0}}'
or grep:
find . -name '*.cpp' 2>/dev/null -print0 | xargs -0 grep -nE '\+\+\S*[cC]ursor\S*' | grep -v '[!=]=\s*DB_NULL_CURSOR'
Both of which execute in the 2 or 3 second range.

How can I traverse a directory tree using a bash or Perl script?

I am interested in getting into bash scripting and would like to know how you can traverse a unix directory tree and log the path of the file you are currently looking at if its contents match a regex.
It would go like this:
Traverse a large unix directory path file/folder structure.
If the current file's contents contain a string that matches one or more regular expressions,
Then append the file's full path to a results text file.
Bash or Perl scripts are fine, although I would prefer how you would do this using a bash script with grep, awk, etc commands.
find . -type f -print0 | xargs -0 grep -l -E 'some_regexp' > /tmp/list.of.files
Important parts:
-type f makes the find list only files
-print0 prints the files separated not by \n but by \0 - it is here to make sure it will work in case you have files with spaces in their names
xargs -0 - splits input on \0, and passes each element as argument to the command you provided (grep in this example)
The cool thing about using xargs is that if your directory contains a really large number of files, you can speed up the process by parallelizing it:
find . -type f -print0 | xargs -0 -P 5 -L 100 grep -l -E 'some_regexp' > /tmp/list.of.files
This will run grep in 5 separate copies, each scanning a different set of up to 100 files.
use find and grep
find . -exec grep -l -e 'myregex' {} \; >> outfile.txt
-l on the grep gets just the file name
-e on the grep specifies a regex
{} places each file found by the find command on the end of the grep command
>> outfile.txt appends to the text file
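A common refinement (my addition, not part of the original answer): ending the -exec with + instead of \; makes find pass many file names to each grep invocation instead of forking grep once per file, and -type f keeps grep from complaining about directories:

find . -type f -exec grep -l -e 'myregex' {} + >> outfile.txt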
grep -l -R <regex> <location> should do the job.
If you want to do this from within Perl, you can take the find commands that people suggested and turn them into a Perl script with find2perl:
If you have:
$ find ...
make that
$ find2perl ...
That outputs a Perl program that does the same thing. From there, if you need to do something that's easy in Perl but hard in shell, you just extend the Perl program.
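For example (my own illustration; find2perl shipped with older Perl releases and is available from CPAN as App::find2perl):

find2perl . -type f -name '*.txt' > traverse.pl   # emits a standalone Perl program built on File::Find
perl traverse.pl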
find /path -type f -name "*.txt" | awk '
{
    while ((getline line < $0) > 0) {
        if (line ~ /pattern/) {
            print $0 ":" line
            # do some other things here
        }
    }
    close($0)   # close each file so long file lists do not exhaust open file descriptors
}'
similar thread
find /path -type f -name "outfile.txt" | awk '
{
while((getline line<$0)>0){
if(line ~ /pattern/){
print $0":"line
}
}
}'