Combine 'if' and 'if not' in Perl

I have got the following filter inside my httpd.conf:
ExtFilterDefine jsonfilter mode=output intype=application/json cmd="/usr/bin/perl -pe 's|^|qq(\,\") . valid . qq(\"\: ) . qq(\") . time() . \\x0D . qq(\") . qq(\\n)|e if ($==eof) && unless (-f q{/tmp/md5_filter.tmp})'"
But the way I used the && operator is not valid: I receive no output when I request the file. The filter should only run if the md5_filter.tmp file doesn't exist, and the s command should only add the timestamp at the end of the file (eof). Does somebody know what's wrong with my code?

There are a lot of strange things about your code.
/usr/bin/perl -pe 's|^|qq(\,\") . valid . qq(\"\: ) . qq(\") . time() . \\x0D . qq(\") . qq(\\n)|e if ($==eof) && unless (-f q{/tmp/md5_filter.tmp})'
perl -pe does not actually change the file. Not sure if that is your intent. If you do want to change the file, you need to add the -i switch. Note that this will alter your file each time you run it.
s| You have changed the delimiter of the substitution operator from / to |. This is usually done when you need to use slash / inside the pattern, and you want to avoid having to escape it \/. You however do not have slashes in your pattern, which makes this change unnecessary.
^ This denotes the start of a string. Your substitution will add from the beginning of the string. Since you want to add to the end of the file, I am not sure how you are thinking there.
qq(\,\") . valid . qq(\"\: ) . qq(\") . time() . \\x0D . qq(\") . qq(\\n) You seem to be under the impression that the concatenation operator . does something special. This statement can be written qq(,\"valid\": \") . time() . qq(\x0D\"\n) Probably. The quoting and escaping is somewhat tricky to figure out. (No need to escape , and :)
if ($==eof) is a weirdness beyond words. It means $= = eof, which assigns the return value of the eof() test to the predefined $= variable (The current page length). I feel pretty sure you didn't mean that, and only meant to check if (eof).
if ($==eof) && unless (-f q{/tmp/md5_filter.tmp}) -- This is actually two separate statements, and you cannot use post-fix if twice in a row. Nor do you need to. You would chain them into one, like this: if ( (eof) && (not -f q{...}) ).
What do we get if we combine all these corrections? Well, we get this:
/usr/bin/perl -pe' s/^/qq(,\"valid\": \") . time() . qq(\x0D\"\n)/e if ( (eof) && (not -f q{/tmp/md5_filter.tmp}) )'
But.... what are you doing with this statement? For each line in the file, you are checking if it is the end of the file. And if it is, you check for the existence of a file, and then add to the last line, from the start of the line, a new string.
I think what you want to do is wait until the last line of the file, and then print a line after it. You don't need a substitution to print, you just print. So we remove the substitution and replace it with a print. Since we want to print after the last line, we use an END block to execute code at the end.
/usr/bin/perl -pe 'END { unless (-f q{/tmp/md5_filter.tmp}) { print qq(\,\"valid\"\: \") . time() . qq(\x0D\"\n) } }'
The printed string may need to start with a newline qq(\n..., if the last line of the file does not end with a newline. If, for some reason, you actually meant to write the string at the beginning of the line, things get more complicated. Then you're probably better off using the first command.
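The same idea can be sketched as a small standalone script, which may be easier to quote correctly than a one-liner embedded in httpd.conf. This is only a sketch under assumptions: the helper names trailer and run_filter are made up for illustration, and the marker path is the one from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: build the trailer line, or return an empty string
# when the marker file exists (the single "not -f" condition).
sub trailer {
    my ($marker, $now) = @_;
    return '' if -f $marker;
    return qq(,"valid": "$now\x0D"\n);
}

# Copy every input line unchanged, then emit the trailer once, after the
# last line. This plays the role of the END block in the one-liner.
sub run_filter {
    my ($in, $out, $marker) = @_;
    while (my $line = <$in>) {
        print {$out} $line;
    }
    print {$out} trailer($marker, time());
    return;
}

# Used as a filter it would be called like this (path taken from the question):
# run_filter(\*STDIN, \*STDOUT, '/tmp/md5_filter.tmp');
```

Keeping the file test inside one helper avoids having to chain two statement modifiers, which is what broke the original command.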

Related

Interpreting & modifying Perl one-liner?

I have the following Perl 'one-liner' script (found it online, so not mine):
perl -lsne '
/$today.* \[([0-9.]+)\]:.+dovecot_(?:login|plain):([^\s]+).* for (.*)/
and $sender{$2}{r}+=scalar (split / /,$3)
and $sender{$2}{i}{$1}=1;
END {
foreach $sender(keys %sender){
printf"Recip=%05d Hosts=%03d Auth=%s\n",
$sender{$sender}{r},
scalar (keys %{$sender{$sender}{i}}),
$sender;
}
}
' -- -today=$(date +%F) /var/log/exim_mainlog | sort
I've been trying to understand its innards, because I would like to modify it to re-use its functionality.
Some questions I got:
What does the flag -lsne does? (From what I know, it's got to be, at least, 3 different flags in one)
Where does $sender gets its value from?
What about that (?:login|plain) segment, are they 'variables'? (I get that's ReGex, I'm just not familiarized with it)
What I'm trying to achieve:
Get the number of emails sent by each user in a SMTP relay periodically (cron job)
If there's an irregular number of emails (say, 500 in a 1-hour timespan), do something (like shutting of the service, or send a notification)
Why I'm trying to achieve this:
Lately, someone has been using my SMTP server to send spam, so I would like to monitor email activity so they stop abusing the SMTP relay resources. (Security-related suggestions are always welcomed, but out of topic for this question. Trying to focus on the script for now)
What I'm NOT trying to achieve:
To get the script done by third-parties. (Just try and point me in the right direction, maybe an example)
So, any suggestions, guidance,and friendly comments are welcomed. I understand this may be an out-of-topic question, yet I've been struggling with this for almost a week and my background with Perl is null.
Thanks in advance.
What does the flag -lsne does? (From what I know, it's got to be, at least, 3 different flags in one)
-l causes lines of input to be auto-chomped as they are read, and lines of output to have "\n" auto-appended.
-s enables switch parsing. This is what creates the variable $today, because the switch -today=$(date +%F) was passed after the -- on the command line.
-n surrounds the entire "one-liner" in a while (<>) { ... } loop, effectively reading every line of input (the named files, or standard input if none are given) and running the body of the one-liner on that line.
-e is the switch that tells perl to execute the following code from the command line, rather than running a file containing Perl code.
Where does $sender gets its value from?
I suspect you are confusing $sender with %sender; they are two separate variables. The scalar $sender is only the foreach loop variable in the END block, where it takes each key of %sender in turn. The code uses $sender{$2}{r} without ever declaring %sender. This is a feature of Perl called "auto-vivification". Basically, because we used $sender{$2}{r}, perl automatically created a variable %sender, added a key whose name is whatever is in $2, and set the value of that key in %sender to be a reference to a new hash. It then gave that new hash a key 'r' with a value of scalar (split / /,$3).
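A minimal sketch of auto-vivification, using made-up user names and counts rather than real log data. The helper over_limit is hypothetical; it only illustrates how the resulting structure could be checked against a threshold such as the 500 mentioned in the question.

```perl
use strict;
use warnings;

my %sender;    # starts out empty

# Assigning through two or three levels of keys creates the intermediate
# hash references on the fly.
$sender{'alice'}{r} += 3;                 # recipients counted so far
$sender{'alice'}{r} += 2;
$sender{'alice'}{i}{'192.0.2.10'} = 1;    # distinct client addresses
$sender{'bob'}{r}   += 1;

# Hypothetical helper: names of users whose recipient count exceeds $limit.
sub over_limit {
    my ($stats, $limit) = @_;
    my @hits = grep { ($stats->{$_}{r} // 0) > $limit } keys %$stats;
    return sort @hits;
}

print "$_ sent to $sender{$_}{r} recipients\n" for over_limit(\%sender, 4);
```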
What about that (?:login|plain) segment, are they 'variables'? (I get that's ReGex, I'm just not familiarized with it)
It's saying that this portion of the regular expression will match either 'login' or 'plain'. The ?: at the beginning tells Perl that these parentheses are used only for clustering, not capturing. In other words, the result of this portion of the pattern match will not be stored in the $1, $2, $3, etc variables.
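To see the difference between clustering and capturing, here is a small sketch with a made-up, shortened log fragment. The sub names are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# With (?:...), the alternation is only grouped, so the user name is $1.
sub auth_user {
    my ($line) = @_;
    return $line =~ /dovecot_(?:login|plain):(\S+)/ ? $1 : undef;
}

# With plain (...), the alternation itself becomes $1 and the user name
# moves to $2.
sub auth_method_and_user {
    my ($line) = @_;
    return $line =~ /dovecot_(login|plain):(\S+)/ ? ($1, $2) : ();
}

my $line = 'A=dovecot_plain:alice S=1234';
print 'user: ', auth_user($line), "\n";
print 'method and user: ', join(' ', auth_method_and_user($line)), "\n";
```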
-MO=Deparse is your friend for understanding one-liners (and one liners that wrap into five lines on your terminal):
$ perl -MO=Deparse -lsne '/$today.* \[([0-9.]+)\]:.+dovecot_( ...
BEGIN { $/ = "\n"; $\ = "\n"; }
LINE:
while ( defined($_ = <ARGV>) ) {
    chomp $_;
    $sender{$2}{'i'}{$1} = 1 if
        /$today.* \[([0-9.]+)\]:.+dovecot_(?:login|plain):([^\s]+).* for (.*)/
        and $sender{$2}{'r'} += scalar split(/ /, $3, 0);
    sub END {
        foreach $sender (keys %sender) {
            printf "Recip=%05d Hosts=%03d Auth=%s\n",
                $sender{$sender}{'r'},
                scalar keys %{$sender{$sender}{'i'};}, $sender;
        }
    }
}
-e syntax OK
[newlines and indentation added for clarity]
What does the flag -lsne does? (From what I know, it's got to be, at least, 3 different flags in one)
You can access a summary of the available perl command-line options by running '~$ perl -h' in the terminal. The command below filters that summary down to the specific options you were asking about.
~$ perl -h|perl -ne 'print if /^\s+(-l|-s|-n|-e)/'
-e program one line of program (several -e's allowed, omit programfile)
-l[octal] enable line ending processing, specifies line terminator
-n assume "while (<>) { ... }" loop around program
-s enable rudimentary parsing for switches after programfile
Two examples of the '-s' command line option in use.
~$ perl -se 'print "Todays date is $today\n"' -- -today=`date +%F`
Todays date is 2016-10-17
~$ perl -se 'print "The sky is $color.\n"' -- -color='blue'
The sky is blue.
For detailed explanations of those command line options read the online documentation below.
http://perldoc.perl.org/perlrun.html
Or run the command below from your terminal.
~$ perldoc perlrun
Unrelated to the questions of the OP, I'm aware that this is not a complete answer (added as much as I was able to at the moment), so if this post/answer violates any SO rules, the moderators should remove it. Thx.

Creating CSV of information extracted from filenames in a given format

I have a little script that lists paths to all files in a directory and all subdirectories and parses each path on the list with regex in Perl.
#!/bin/sh
find * -type f | while read j; do
echo $j | perl -n -e '/\/(\d{2})\/(\d{2})\/(\d+).*-([a-zA-Z]+)(?:_(\d{1}))?/ && print "\"0\";\"$1$2$3\";\"$4\";\"$5\";$fl\""' >> bss.csv
echo | readlink -f -n "$j" >>bss.csv
echo \">>bss.csv
done
Output:
"0";"13957";"4121113";"2";"/home/root/dir1/bss/164146/13/95/7___/000240216___Abc-4121113_2.jpg"
I am using the readlink from GNU coreutils: -n suppresses newline at the end, -f performs canonicalization by recursively following symlinks on the path.
Problem is, when input string did not pass regex I have only line with file path.
How can I add condition to check if regex passed - show path, else - no.
I broke my brain with various combinations, but didn't find any that work properly.
Description of solution
In Perl, use if (/…/) {…} else {…} instead of /…/ && …. Thus you can execute print if match is successful and some other code otherwise.
If this is not the problem and you only want to get rid of the readlink output and closing quote, you can call readlink from Perl using backticks.
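A minimal sketch of the if/else form, wrapped in a hypothetical helper csv_row. Note that the pattern here is an assumption, adjusted so that it matches the sample path from the question (digits after the last hyphen); it is not the asker's original regex.

```perl
use strict;
use warnings;

# Hypothetical helper: return one CSV row for a path, or undef when the
# path does not match, so the caller can decide whether to print anything.
sub csv_row {
    my ($path) = @_;
    if ($path =~ m{/(\d{2})/(\d{2})/(\d+)[^/]*/.*-(\d+)(?:_(\d))?\.}) {
        # $5 is undefined when the optional "_digit" suffix is absent.
        my @fields = (0, "$1$2$3", $4, $5 // '', $path);
        return join ';', map { qq{"$_"} } @fields;
    }
    else {
        return;
    }
}

my $sample = '/home/root/dir1/bss/164146/13/95/7___/000240216___Abc-4121113_2.jpg';
for my $path ($sample, '/tmp/readme.txt') {
    my $row = csv_row($path);
    print defined $row ? "$row\n" : "skipped: $path\n";
}
```

Because the path is only printed inside the matching branch, a non-matching file no longer leaves a lone path line in the output.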
Resulting code
I turned everything into a single Perl program, used File::Find instead of the find command, assumed that $fl at the end of the print in Perl is a leftover (and ignored it), and used Cwd::realpath() to find the canonical path of the file instead of readlink -f from GNU coreutils. If you still want to use readlink -f, feel free to change Cwd::realpath($_) to `readlink -f -n '$_'` (including the backticks!), but then it will not work for filenames containing a single quote.
You should call this script as ./script-name starting-directory > bss.csv. If you put it in the directory you are examining, the output would contain it too, along with the bss.csv.
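#!/usr/bin/perl
# Usage: ./$0 [<starting-directory>...]
use strict;
use warnings;
use File::Find;
use Cwd;
no warnings 'File::Find';

# Called by File::Find for every entry: $_ is the basename and
# $File::Find::name is the path including the starting directory.
sub handleFile {
    return if not -f;
    if ($File::Find::name =~ /\/(\d{2})\/(\d{2})\/(\d+).*-([a-zA-Z]+)(?:_(\d{1}))?/) {
        # Localize the output field separator and the record separator.
        local $, = ';';
        local $\ = "\n";
        # $5 is undefined when the optional "_digit" suffix is absent.
        print map "\"$_\"", 0, $1.$2.$3, $4, $5 // '', Cwd::realpath($_);
    } else {
        print STDERR "File $File::Find::name did not match\n";
    }
}
find(\&handleFile, @ARGV ? @ARGV : '.');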
For reference I also enclose polished version of the original program. It is calling readlink from Perl as I suggested above and really utilizes the -n option of Perl, avoiding the while read loop.
#!/bin/sh
find . -type f | perl -l -n -e 'm{/(\d{2})/(\d{2})/(\d+).*-([a-zA-Z]+)(?:_(\d{1}))?} && print qq{"0";"$1$2$3";"$4";"$5";"} . `readlink -f -n '\''$_'\''` . qq{"}' > bss.csv
Other remarks to the original code
The echo | before the readlink does nothing and should be removed. Readlink does not read its stdin.
Where does $fl at the end of print in Perl come from? I assume it is a leftover.
Use of generic quotes like qq{} and thoughtful use of delimiters (e.g. in regex matching and other quote-like operators) can save you from quoting hell. I already used this tip above: /…/ → m{…} and "…" → qq{…}. Thx, Slade! See perlop manpage for more info.
If I understand you, you want to capture the following parts of the filename:
/home/root/dir1/bss/164146/13/95/7___/000240216___Abc-4121113_2.jpg
                           ~~ ~~ ~                ~~~ ~~~~~~~ ~
                           1  2  3                4   5       6
But your perl regex doesn't do that. Let's break it apart for better understanding.
/\/(\d{2})\/(\d{2})\/(\d+).*-([a-zA-Z]+)(?:_(\d{1}))?/
Sliced into pieces, this would be...
\/(\d{2}) - a slash then two digits (with the digits captured)
\/(\d{2}) - another slash and two digits
\/(\d+) - one more slash and one or more digits
.*- - any run of characters until the final hyphen in the input string
([a-zA-Z]+) - one or more alpha characters
(?:_(\d{1}))? - an optional underscore followed by a single digit. The outer (?:...) only groups, so the inner (\d{1}) is still captured, as $5; the {1} is redundant.
If you step through your filename, you'll see that there is nothing here to handle the second last string of digits.
I'd do this using simpler tools. Sed, for example:
[ghoti@pc ~]$ s="/home/root/dir1/bss/164146/13/95/7___/000240216___Abc-4121113_2.jpg"
[ghoti@pc ~]$ echo "$s" | sed -rne 's/.*/"&"/;h;s:.*/([0-9]{2})/([0-9]{2})/([0-9]+)[^[a-zA-Z]]*[^-]+-([0-9]+)(_([0-9]+))?.*:"0";"\1\2\3";"\4";"\6":;G;s/\n/;/;p'
"0";"13957";"4121113";"2";"/home/root/dir1/bss/164146/13/95/7___/000240216___Abc-4121113_2.jpg"
[ghoti@pc ~]$
I'll break up the sed script for easier reading:
s/.*/"&"/; - Put quotes around the filename.
h; - Store the filename in Sed's "hold" space, for future use...
s: - Start the big substitution...
.*/([0-9]{2})/([0-9]{2})/([0-9]+)[^[a-zA-Z]]*[^-]+-([0-9]+)(_([0-9]+))?.* - This is the pattern we want to match for substitution. Similar to what you did in Perl, obviously, but using ERE instead of PCRE.
:"0";"\1\2\3";"\4";"\6":; - The replacement pattern, with \1, \2 and so on being replaced by the parenthesized subexpressions of the RE. Note that \5 is skipped in the replacement string, as that subexpression is only used for grouping in the match.
G; - Append the "hold" space to the pattern space
s/\n/;/; - and replace the newline between them with a semicolon.
p - Print the result.
Note that this solution, as is, assumes that all input lines match the pattern you're looking for. If that's not the case, then you may get unpredictable output, and should put some pattern matching into the script.

Perl: How to get filename when using <> construct?

Perl offers this very nice feature:
while ( <> )
{
# do something
}
...which allows the script to be used as script.pl <filename> as well as cat <filename> | script.pl.
Now, is there a way to determine if the script has been called in the former way, and if yes, what the filename was?
I know I knew this once, and I know I even used the construct, but I cannot remember where / how. And it proved very hard to search the 'net for this ("perl stdin filename"? No...).
Help, please?
The variable $ARGV holds the current file being processed.
$ echo hello1 > file1
$ echo hello2 > file2
$ echo hello3 > file3
$ perl -e 'while(<>){s/^/$ARGV:/; print;}' file*
file1:hello1
file2:hello2
file3:hello3
The I/O Operators section of perlop is very informative about this.
Essentially, the first time <> is executed, - is added to @ARGV if it started out empty. Opening - has the effect of cloning the STDIN file handle, and the variable $ARGV is set to the current element of @ARGV as it is processed.
Here's the full clip.
The null filehandle "<>" is special: it can be used to emulate the
behavior of sed and awk, and any other Unix filter program that takes a
list of filenames, doing the same to each line of input from all of
them. Input from "<>" comes either from standard input, or from each
file listed on the command line. Here's how it works: the first time
"<>" is evaluated, the @ARGV array is checked, and if it is empty,
$ARGV[0] is set to "-", which when opened gives you standard input. The
@ARGV array is then processed as a list of filenames. The loop
    while (<>) {
        ... # code for each line
    }
is equivalent to the following Perl-like pseudo code:
    unshift(@ARGV, '-') unless @ARGV;
    while ($ARGV = shift) {
        open(ARGV, $ARGV);
        while (<ARGV>) {
            ... # code for each line
        }
    }
except that it isn't so cumbersome to say, and will actually work. It
really does shift the @ARGV array and put the current filename into the
$ARGV variable. It also uses filehandle ARGV internally. "<>" is just
a synonym for "<ARGV>", which is magical. (The pseudo code above doesn't
work because it treats "<ARGV>" as non-magical.)
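To answer the "was I given a filename or a pipe?" part directly, one can compare $ARGV with "-". A minimal sketch, where the sub name tag_lines and the label <stdin> are made up for this illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: read the named files (or standard input when the
# list is empty) and tag every line with the place it came from.
sub tag_lines {
    local @ARGV = @_;    # <> takes its file list from @ARGV
    my @tagged;
    while (my $line = <>) {
        # $ARGV holds the current file name, or "-" for standard input.
        my $source = $ARGV eq '-' ? '<stdin>' : $ARGV;
        push @tagged, "$source: $line";
    }
    return @tagged;
}

# In a real script one would simply call: print tag_lines(@ARGV);
```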
If you care to know about when <> switches to a new file (e.g. in my case - I wanted to record the new filename and line number), then the eof() function documentation offers a trick:
# reset line numbering on each input file
while (<>) {
next if /^\s*#/; # skip comments
print "$.\t$_";
} continue {
close ARGV if eof; # Not eof()!
}
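Combining this trick with $ARGV gives a "file:line" prefix for every line, which is roughly the use case described above. This is a sketch only; the sub name numbered_lines is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: prefix each line with "file:line-number:", with the
# line number restarting at 1 for every input file.
sub numbered_lines {
    local @ARGV = @_;
    my @out;
    while (my $line = <>) {
        chomp $line;
        push @out, "$ARGV:$.:$line";
    }
    continue {
        # eof without parentheses tests the file that was just read;
        # closing ARGV at that point resets $. before the next file.
        close ARGV if eof;
    }
    return @out;
}
```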

Perl Concat String Truncates Beginning of Line

I am running into a strange issue in perl that I can't seem to find an answer for.
I have a small script that will parse data from an external source (be it file, website, etc). Once the data has been parsed, it will then save it to a CSV file. However, the issue is that when I am writing the file or printing the data to the screen, it seems to be truncating the beginning of the string. I am using strict and warnings and I am not seeing any errors.
Here is an example:
print "Name: " . $name . "\n";
print "Type: " . $type. "\n";
print "Price: " . $price . "\n";
print "Count: " . $count . "\n";
It will return the following:
John
Blue
7.99
5
If I attempt to do it this way:
print "$name,$type,$price,$count\n";
I get the following as a result:
,7.99,5
I tried the following to see where the issue begins and get the following:
print "$name\n";
print "$name,$type\n";
print "$name,$type,$price\n";
print "$name,$type,$price,$count\n";
Results:
John
John,Blue
,7.99
,7.99,5
I am still learning perl, but can't seem to find out (maybe due to lack of knowledge) of what is causing this. I tried debugging the script, but I did not see any special character in the price variable that would cause this.
The string in $price ends with a Carriage Return. This is causing your terminal to move the cursor to the start of the line, causing the first two fields to be overwritten by the ones that follow.
You are probably reading a Windows text file on a unix box. Convert the file (using dos2unix, for example), or use s/\s+\z//; instead of chomp;.
If the CR made it into the middle of a string, you could use s/\r//g;.
Per @Mat's suggestion, I ran the output through hexdump -C and found there was a carriage return (indicated by the hex value 0d). Using the code $price =~ s/\r//g; to remove the CR from the line of text fixed the problem.
Also, the input file was in Windows format, not Unix; running dos2unix on it fixed that.
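A small sketch of both steps: making the hidden carriage return visible, and stripping it. The sample values mimic the question; the sub names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: show control characters so a stray CR is visible.
sub visible {
    my ($text) = @_;
    $text =~ s/\r/\\r/g;
    $text =~ s/\n/\\n/g;
    return $text;
}

# Remove any trailing whitespace, including "\r\n" from Windows files.
sub strip_eol {
    my ($text) = @_;
    $text =~ s/\s+\z//;
    return $text;
}

my @raw = ('John', 'Blue', "7.99\r\n", "5\n");
print 'raw:   ', join(',', map { visible($_) } @raw), "\n";
print 'clean: ', join(',', map { strip_eol($_) } @raw), "\n";
```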

How can I eval environment variables in Perl?

I would like to evaluate an environment variable and set the result to a variable:
$x=eval($ENV{EDITOR});
print $x;
outputs:
/bin/vi
works fine.
If I set an environment variable QUOTE to \' and try the same thing:
$x=eval($ENV{QUOTE});
print $x;
outputs:
(nothing)
$@ set to: "Can't find a string terminator anywhere before ..."
I do not wish to simply set $x=$ENV{QUOTE}; as the eval is also used to call a script and return its last value (very handy), so I would like to stick with the eval(); Note that all of the Environment variables eval'ed in this manner are set by me in a different place so I am not concerned with malicious access to the environment variables eval-ed in this way.
Suggestions?
Well, of course it does nothing.
If your ENV variable contains text which is half code, but isn't, and you give the resulting string to something that evaluates that code as Perl, of course it's not going to work.
You only have 3 options:
Programmatically process the string so it doesn't have invalid syntax in it
Manually make sure your ENV variables are not rubbish
Find a solution not involving eval but gives the right result.
You may as well complain that
$x = '
Is not valid code, because that's essentially what's occurring.
Samples of Fixing the value of 'QUOTE' to work
# Bad.
QUOTE="'" perl -wWe 'print eval $ENV{QUOTE}; print "$@"'
# Can't find string terminator "'" anywhere before EOF at (eval 1) line 1.
# Bad.
QUOTE="\'" perl -wWe 'print eval $ENV{QUOTE}; print "$@"'
# Can't find string terminator "'" anywhere before EOF at (eval 1) line 1.
# Bad.
QUOTE="\\'" perl -wWe 'print eval $ENV{QUOTE}; print "$@"'
# Can't find string terminator "'" anywhere before EOF at (eval 1) line 1.
# Good
QUOTE="'\''" perl -wWe 'print eval $ENV{QUOTE}; print "$@"'
# '
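The same behaviour can be reproduced inside a script, without touching the environment. In this sketch, try_eval is a hypothetical helper that returns both the value and the error text left in $@:

```perl
use strict;
use warnings;

# Hypothetical helper: run a string as Perl code and report what happened.
sub try_eval {
    my ($text) = @_;
    my $value = eval $text;    # eval STRING: compile and run $text as Perl
    return ($value, $@);
}

for my $text (q{'}, q{'\''}) {
    my ($value, $error) = try_eval($text);
    if (defined $value) {
        print "code <$text> evaluated to <$value>\n";
    }
    else {
        print "code <$text> failed: $error";
    }
}
```

A lone quote is an unterminated string literal, so it fails to compile; the quoted form is a complete Perl string literal whose value is a single quote.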
Why are you eval'ing in the first place? Should you just say
my $x = $ENV{QUOTE};
print "$x\n";
The eval is executing the string in $ENV{QUOTE} as if it were Perl code, which I certainly hope it isn't. That is why \ disappears. If you were to check the $@ variable you would find an error message like
syntax error at (eval 1) line 2, at EOF
If your environment variables are going to contain code that Perl should be executing then you should look into the Safe module. It allows you to control what sort of code can execute in an eval so you don't accidentally wind up executing something like "use File::Find; find sub{unlink $File::Find::file}, '.'"
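A minimal sketch of what using Safe might look like, assuming the default operation mask is acceptable for your case. The wrapper restricted_eval is a made-up name; Safe->new and reval are the module's own interface.

```perl
use strict;
use warnings;
use Safe;

# Hypothetical wrapper: evaluate a string inside a fresh Safe compartment
# and return the value together with any error message.
sub restricted_eval {
    my ($code) = @_;
    my $compartment = Safe->new;              # default mask: no system(), etc.
    my $value = $compartment->reval($code);
    return ($value, $@);
}

my ($sum, undef)   = restricted_eval('2 + 3');
my (undef, $error) = restricted_eval('system("true")');
print "sum: $sum\n";
print "blocked: $error";
```

Plain arithmetic runs as usual, while the attempt to start a subprocess is refused by the compartment and reported through $@.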
Evaluating an environment value is very dangerous, and would generate errors if running under taint mode.
# purposely broken
QUOTE='`rm system`'
$x=eval($ENV{QUOTE});
print $x;
Now just imagine if this script was running with root access, and was changed to actually delete the file system.
Kent's answer, while technically correct, misses the point. The solution is not to use eval better, but to not use eval at all!
The crux of this problem seems to be in understanding what eval STRING does (there is eval BLOCK which is completely different despite having the same name). It takes a string and runs it as Perl code. 99.99% of the time this is unnecessary and dangerous and results in spaghetti code, and you absolutely should not be using it so early in your Perl programming career. You have found the gun in your dad's sock drawer. Having discovered that it can blow holes in things, you are now trying to use it to hang a poster. It's better to forget it exists; your code will be so much better for it.
$x = eval($ENV{EDITOR}); does not do what you think it does. I don't even have to know what you think it does, that you even used it there means you don't know. I also know that you're running with warnings off because Perl would have screamed at you for that. Why? Let's assume that EDITOR is set to /bin/vi. The above is equivalent to $x = /bin/vi which isn't even valid Perl code.
$ EDITOR=/bin/vi perl -we '$x=eval($ENV{EDITOR}); print $x'
Bareword found where operator expected at (eval 1) line 1, near "/bin/vi"
(Missing operator before vi?)
Unquoted string "vi" may clash with future reserved word at (eval 1) line 2.
Use of uninitialized value $x in print at -e line 1.
I'm not sure how you got it to work in the first place. I suspect you left something out of your example. Maybe tweaking EDITOR until it worked?
You don't have to do anything magical to read an environment variable. Just $x = $ENV{EDITOR}. Done. $x is now /bin/vi as you wanted. It's just the same as $x = $y. Same thing with QUOTE.
$ QUOTE=\' perl -wle '$x=$ENV{QUOTE}; print $x'
'
Done.
Now, I suspect what you really want to do is run that editor and use that quote in some shell command. Am I right?
Well, you could double-escape the QUOTE's value, I guess, since you know that it's going to be evaled.
Maybe what you want is not Perl's eval but to evaluate the environment variable as the shell would. For this, you want to use backticks.
$x = `$ENV{QUOTE}`