How best (idiomatically) to fail perl script (run with -n/-p) when input file not found?

$ perl -pe 1 foo && echo ok
Can't open foo: No such file or directory.
ok
I'd really like the perl script to fail when the file does not exist. What's the "proper" way to make -p or -n fail when the input file does not exist?

The -p switch is just a shortcut for wrapping your code (the argument following -e) in this loop:
LINE:
while (<>) {
... # your program goes here
} continue {
print or die "-p destination: $!\n";
}
(-n is the same but without the continue block.)
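If you want to confirm this on your own machine, the Deparse backend prints the loop that perl actually builds around the -e code; its output is essentially the while/continue construct above, give or take some version-to-version wording:
perl -MO=Deparse -pe 1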
The empty <> operator is equivalent to readline *ARGV, and that opens each argument in succession as a file to read from. There's no way to influence the error handling of that implicit open, but you can make the warning it emits fatal (note that this will also affect several warnings related to the -i switch):
perl -Mwarnings=FATAL,inplace -pe 1 foo && echo ok
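As a quick check (assuming foo still does not exist), the fatal warning turns the silent skip into a die, so the && branch no longer runs and the one-liner exits with a non-zero status that any calling script can test:
$ perl -Mwarnings=FATAL,inplace -pe 1 foo && echo ok
Can't open foo: No such file or directory.
$ echo $?    # non-zero, so the failure propagates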

Set a flag in the body of the loop, check the flag in the END block at the end of the one-liner.
perl -pe '$found = 1; ... ;END {die "No file found" unless $found}' -- file1 file2
Note that it only fails when no file was processed.
To report the problem when not all files have been found, you can use something like
perl -pe 'BEGIN{ $files = @ARGV } $found++ if eof; ... ;END {die "Some files not found" unless $files == $found}'
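A variant of the same idea (a sketch, not from the original answer) records every argument up front and crosses each one off as it is read, so the END block can name the arguments that were never opened. Like the eof-counting version, it also flags files that exist but are empty, since the loop body never runs for those:
perl -pe 'BEGIN { @missing{@ARGV} = () } delete $missing{$ARGV}; END { die "Not found: @{[sort keys %missing]}\n" if %missing }' -- file1 file2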

Related

Perl removing the files using system command returns success always

I have a script which takes filenames (with their full paths) as arguments and deletes them from the system.
Here is the code:
#!/usr/bin/perl
use strict; use warnings;
warn "No arguments/files names passed to the script: $!\n" unless @ARGV;
my $count = 0;
foreach (@ARGV) {
    my $cmd = "rm -rf $_";
    my $exit_code = system($cmd);
    if ($exit_code != 0) {
        print "Command $cmd failed with an exit code of $exit_code.\n";
        exit($exit_code >> 8);
    } else {
        print "Command $cmd successful!\n";
        $count++;
    }
}
print "Out of ".scalar(@ARGV)." file(s) ".$count." file(s) deleted\n";
I have two questions:
Here if I pass a dummy file, i.e. a file which doesn't exist, it gives me $exit_code as 0. How is that possible? Shouldn't it throw an exit code other than 0?
When I delete the files the Perl way, with unlink $_;, they don't get deleted. How can I forcefully delete them using unlink?
Here if I pass a dummy file, i.e. a file which doesn't exist, it gives me $exit_code as 0. How is that possible? Shouldn't it throw an exit code other than 0?
You are using rm with the -f option. From the man page of rm:
-f, --force
ignore nonexistent files and arguments, never prompt
With this option, as far as I know, you will always get a return code of 0 when trying to remove a file that does not exist.
When I delete the files the Perl way, with unlink $_;, they don't get deleted. How can I forcefully delete them using unlink?
There are lots of reasons a file will not delete: it may have been set immutable, the sticky bit may be set on the directory containing the files (and you are not their owner), or the user running your script may simply not have the necessary write permissions. The point is that none of this has anything to do with unlink. You have to have the proper permissions before removing a file by any method at all, whether it's rm or unlink.
I like to use rmtree from File::Path. No need to shell out at all to get a recursive delete.
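A minimal sketch of the question's loop done in pure Perl, with unlink for plain files and remove_tree from File::Path for directories; it is only an illustration of the error reporting you get for free ($! says why a deletion failed), not the original poster's script:
#!/usr/bin/perl
use strict;
use warnings;
use File::Path qw(remove_tree);

die "No arguments/file names passed to the script\n" unless @ARGV;

my $count = 0;
for my $path (@ARGV) {
    if (-d $path) {
        # recursive delete for directories
        remove_tree($path, { error => \my $errors });
        if (@$errors) {
            warn "Could not fully remove $path\n";
            next;
        }
    }
    elsif (!unlink $path) {
        # unlink returns the number of files removed; $! explains a failure
        warn "Could not unlink $path: $!\n";
        next;
    }
    $count++;
}
print "Out of ", scalar @ARGV, " argument(s), $count deleted\n";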
As BryanK already answered, 0 is the expected exit code with the -f option. When you run into these issues, test the command in the shell to see whether the problem is Perl (or whatever is calling it) or the command itself. The exit value of the command shows up in $? (the shell version, which is why Perl's variable has the same name):
$ rm -rf test_dir
$ echo $?
0
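For contrast, dropping -f makes rm complain and return a non-zero status (output and status here are from GNU coreutils rm; other implementations word it differently):
$ rm nonexistent_file
rm: cannot remove 'nonexistent_file': No such file or directory
$ echo $?
1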

Can I pass a string from perl back to the calling c-shell?

RHEL6
I have a c-shell script that runs a perl script. After dumping tons of stuff to stdout, it determines where (what dir) the parent shell should cd to when the perl script finishes. But that's a string, not an int, which is all I can pass back with exit().
Storing the name of the dir in a file which the c-shell script can read is what I have now. It works, but is not elegant. Is there a better way to do this ? Maybe a little chunk of memory that I can share with the perl script ?
Short:
Redirect Perl's streams and restore them at the end to print that info, which the shell script then captures
Or, print that info last, and have the shell script pass the output to the console while taking the last line
Or, use a named pipe (either shell) or specific file descriptors (not csh) for that one print
When the Perl script prints out that name, you can assign it to a variable in the shell script
#!/bin/csh
set DIR = `perl -e'print "dir_name"'`
while in bash
#!/bin/bash
DIR="$(perl -e'print "dir_name"')"
where $(...) is preferred for the command substitution.
But those other prints to the console from the Perl script then need to be handled
One way is to redirect all output in the Perl script other than that one print; this can be controlled by a command-line option (the filename to redirect to, which the shell script can then print out)
Or, take all of Perl's output and pass it on to the console, with the last line being the needed "return." This puts the burden on the Perl script to print that last (perhaps in an END block). The program's output can be printed from the shell script after it completes, or line by line as it is emitted.
Or, use a named pipe (both shells) or a specific file descriptor (bash only) to which the Perl script can print that information. In this case its streams go straight to the console.
The question explicitly mentions csh, so a csh version is given below. But I must repeat the old and worn fact that shell scripting is far better done in bash than in csh. I strongly recommend reconsidering.
bash
If you need the program's output on the console as it goes, take and print it line by line
#!/bin/bash
while read -r line; do
    echo "$line"
    DIR=$line
done < <(perl script.pl)
echo "$DIR"
Or, if you don't need output on the console before the script is finished
#!/bin/bash
mapfile -t lines < <(perl script.pl)
DIR="${lines[-1]}"
printf '%s\n' "${lines[@]}" # print script.pl's output
Or, use file descriptors for that particular print
F=$(mktemp) # safe filename
exec 3> "$F" # open fd 3 to write to it
exec 4< "$F" # open fd 4 to read from it
rm -f "$F" # remove file(name) for safety; opened fd's can still access
perl -E'$fd=shift; say "...normal prints to STDOUT...";
open(FH, ">&=$fd") or die $!;
say FH "dirname";
close FH
' 3
read dir_name <&4
exec 3>&- # close them
exec 4<&-
echo "$dir_name"
I couldn't get it to work with a single file descriptor for both reading and writing (exec 3<> ...); I think that's because the read can't rewind after the write, so separate descriptors are used.
With a Perl script (and not the demo one-liner above) pass the fd number as a command-line option. The script can then do this only if it's invoked with that option.
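A sketch of what that could look like in the standalone script; the --fd option name and the Getopt::Long handling are assumptions for illustration, not something prescribed by the answer:
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long;

GetOptions('fd=i' => \my $fd);    # invoked from the shell as: perl script.pl --fd 3

print "...normal prints to STDOUT...\n";

# Write the directory name to the extra descriptor only if the caller supplied one
if (defined $fd) {
    open my $fh, '>&=', $fd or die "Can't open fd $fd: $!";
    print $fh "dir_name\n";
    close $fh;
}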
Or, use a named pipe very similarly to how it's done for csh below. This is probably best here, if the manipulation of the program's STDOUT isn't to your liking.
csh
Iterate over the program's (completed) output line by line
#!/bin/csh
foreach line ( "`perl script.pl`" )
    echo "$line"
    set dir_name = "$line"
end
echo "Directory name: $dir_name"
or extract the last line first and then print the whole output
#!/bin/csh
set lines = ( "`perl script.pl`" )
set dir_name = $lines[$#lines]
# Print program's output
while ( $#lines )
    echo "$lines[1]"
    shift lines
end
or use a named pipe
set fifo_name = "/tmp/fifo$$" # or use mktemp
mkfifo "$fifo_name"
( perl script.pl --fifo $fifo_name [other args] & )
set dir_name = `cat "$fifo_name"`
rm -f $fifo_name
echo "dir name from FIFO: $dir_name"
The Perl command is run in the background since a FIFO blocks until it is both written and read. If the shell script were to wait for perl ... to complete, the Perl script would block while writing to the FIFO (since it isn't being read yet), so the shell would never get to read it; we would deadlock. It is also run in a subshell, with ( ), to avoid the informational prints about the background job.
The --fifo NAME command-line option is needed so that the Perl script knows which special file to use (and skips this if the option is not given).
For an in-line example, replace ( perl script ... ) with this one-liner, used above as well
( perl -E'$ff = shift; say qq(\t...normal prints to STDOUT...);
open FF, ">$ff" or die $!;
say FF "dir_name_$$";
close FF
' $fifo_name
& )
(broken over lines for readability)

perl script to add line of code only modifies one file

I have this:
perl -pi -e 'print "code I want to insert\n" if $. == 2' *.php
which puts the line code I want to insert on the second line of the file, which is what I need done to every single PHP file
If I run it in a directory with both PHP files and non-PHP files it does the right thing, but only to one PHP file. I thought *.php would apply it to all PHP files, but it doesn't do it.
How can I write it so it will modify every PHP file in a directory? Bonus if there is an easy way to do this recursively through all directories. I don't mind running the Perl script for each directory as there aren't that many, but don't want to hand edit every single file.
The problem is that the file handle ARGV that Perl uses to read the files passed on the command line is never explicitly closed, so the line number $. just keeps incrementing after the end of the first file and never goes back to one.
Fix this by closing ARGV when it has reached end of file. Perl will reopen it to read the next file in the list, and so reset $.
perl -i -pe 'print "code I want to insert\n" if $. == 2; close ARGV if eof' *.php
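For the recursive bonus in the question, the same fixed one-liner can be fed the file list by find (a sketch, assuming a find that supports the -exec ... + form):
find . -name '*.php' -exec perl -i -pe 'print "code I want to insert\n" if $. == 2; close ARGV if eof' {} +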
If you can use sed, this should work:
sed -si '2i\CODE YOU WANT TO INSERT' *.php
To do it recursively, you might try:
find -name '*.php' -execdir sed -si '2i\CODE YOU WANT TO INSERT' '{}' +
Using File::Find.
Note, I've included 3 sanity checks to verify that things are actually being processed the way that you want.
Initially the script will just print out the found files until you comment out the bare return.
Then the script will save backups unless you uncomment the unlink statement.
Finally, the script will only process a single file until you comment out the exit statement.
These three checks are just so you can verify that everything is working as you desire before editing a whole directory tree.
use strict;
use warnings;
use File::Find;
my $to_insert = "code I want to insert\n";
find(sub {
    return unless -f && /\.php$/;
    print "Edit $File::Find::name\n";
    return; # Comment out once satisfied with found files
    local $^I = '.bak';
    local @ARGV = $_;
    while (<>) {
        print $to_insert if $. == 2 && $_ ne $to_insert;
        print;
    }
    # unlink "$_$^I"; # Uncomment to delete backups once certain that first file is processed correctly.
    exit; # Comment out once certain that first file is processed correctly
}, '.');

Perl one-liner: how to reference the filename passed in when -ne or -pe commandline switches are used

In Perl, it's normally easy enough to get a reference to the commandline arguments. I just use $ARGV[0] for example to get the name of a file that was passed in as the first argument.
When using a Perl one-liner, however, it seems to no longer work. For example, here I want to print the name of the file that I'm iterating through if a certain string is found within it:
perl -ne 'print $ARGV[0] if(/needle/)' haystack.txt
This doesn't work, because ARGV doesn't get populated when the -n or -p switch is used. Is there a way around this?
What you are looking for is $ARGV. Quote from perlvar:
$ARGV
Contains the name of the current file when reading from <>.
So, your one-liner would become:
perl -ne 'print $ARGV if(/needle/)' haystack.txt
Though be aware that it will print once for each match. If you want a newline added to the print, you can use the -l option.
perl -lne 'print $ARGV if(/needle/)' haystack.txt
If you want it to print only once for each file, you can close the ARGV file handle and make it skip to the next file:
perl -lne 'if (/needle/) { print $ARGV; close ARGV }' haystack.txt haystack2.txt
As Peter Mortensen points out, $ARGV and $ARGV[0] are two different variables. $ARGV[0] refers to the first element of the array @ARGV, whereas $ARGV is a scalar which is a completely different variable.
You say that @ARGV is not populated when using the -p or -n switch, which is not true. The code that runs silently is something like:
while (@ARGV) {
    $ARGV = shift @ARGV;             # arguments are removed during runtime
    open ARGV, $ARGV or die $!;
    while (defined($_ = <ARGV>)) {   # long version of: while (<>) {
        # your code goes here
    } continue {                     # when using the -p switch
        print $_;                    # it includes a print statement
    }
}
Which in essence means that using $ARGV[0] will never show the real file name, because it is removed before it is accessed, and placed in $ARGV.
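A quick way to see the difference (assuming two throwaway one-line files a.txt and b.txt): $ARGV names the file currently being read, while @ARGV holds only the arguments not yet processed.
perl -lne 'print "current: $ARGV remaining: @ARGV"' a.txt b.txt
current: a.txt remaining: b.txt
current: b.txt remaining: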

How can I record changes made during in-place editing in Perl?

I've scripted up a simple ksh that calls a Perl program to find and replace in files.
The passed-in arg is the home directory:
perl -pi -e 's/find/replace/g' $1/*.html
It works great. However, I'd like to output all the changes to a log file. I've tried piping and redirecting and haven't been able to get it to work. Any ideas?
Thanks,
Glenn
Something like this to send all changes to STDERR:
perl -pi -e '$old = $_; s/find/replace/g and warn "${ARGV}[$.]: $old $_"; close ARGV if eof' $1/*.html
Updated: Fixed $. on multiple files.
You can print to STDERR and redirect just the STDERR output to a file as below:
perl -pi -e 'chomp($prev=$_);s/find/replace/g and print STDERR "$ARGV - $.: $prev -> $_"; close ARGV if eof' $1/*.html 2> logfile.txt
edit: added the filename, and fixed line number display when multiple input files are used
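If the log should come from the ksh wrapper mentioned in the question, the redirection can live there instead of on the one-liner; a sketch (the log file name is made up):
#!/bin/ksh
# $1 is the directory holding the .html files, as in the question
perl -pi -e 'chomp($prev=$_); s/find/replace/g and print STDERR "$ARGV - $.: $prev -> $_"; close ARGV if eof' "$1"/*.html 2> "$1/replace.log"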