Use awk in Perl to parse everything between two strings - perl

I have a huge pile of log files constantly being updated on HP-UX server.I have created the Perl code to find out the name of log file in which the string i'm using resides inside.
Perl gets the file name using split and passes it into a variable.Using the userinput i create the start and stop strings as two variables.Such as:
my $ssh = Net::OpenSSH->new($host, user => $user,
master_opts => [ -o => 'NumberOfPasswordPrompts=1',
-o => 'PreferredAuthentications=keyboard-interactive,password'],
login_handler => \&login_handler);
$ssh-> error and die "Unable to connect" . $ssh->error;
my $output=$ssh->capture("grep .$userinput1. /app/bea/user_projects/domains/granite/om_ni.log*");
my $array = (split ":", $output)[0];
print "$array"."\n";
[EDIT]: As you guys requested,above is the beginning of how the $array got filled in.Below is where the awk sequence starts:
my $a= "INFO - $userinput1";print $a;
my $b= "INFO - ProcessNode terminated... [$userinput1]";print $b;
Using the awk as part of ssh capture command,it will search through the whole log file and capture every line between the string $a and string $b,then get everything inside another array.Such as:
my $output2=$ssh->capture("awk -v i=$array '$a,$b' i");
Here $array is where the log file's full path is held and it work completely fine as a passing variable.
I tried using the awk without -v parameter as well,didn't matter at all.
[EDIT 2]:this is the result of print "$array"."\n";
/app/bea/user_projects/domains/granite/om_ni.log.2
When I run the perl script,I get the result:
INFO - 28B26AD1-E959-4F5F-BD89-A7A6E601BE18INFO - ProcessNode terminated... [28B26AD1-E959-4F5F-BD89-A7A6E601BE18] syntax error The source line is 1.
The error context is
INFO - 28B26AD1-E959-4F5F-BD89-A7A6E601BE18,INFO - ProcessNode >>> terminated. <<< .. [28B26AD1-E959-4F5F-BD89-A7A6E601BE18]
awk: Quitting
The source line is 1.
Error pointing at the "terminate" word somehow but even when I use escape characters all over the strings,it just doesn't care and returns the same error.
Any help on this issue is highly appreciated.Thanks a lot in advance.

While I don't really know awk, the way you are invoking it does not appear to be correct. Here is the manual for awk on HP-UX.
The part in single quotes ($a,$b) should be the program. However, you are passing it two text strings, which are not even quoted to separate them. This is not a valid awk program; hence the syntax error.
I think what you want is something like '/$a/, /$b/' for the program (but again, I am not an awk expert).
Also, you are setting the filename to variable i, then using i in the place of the filename when you invoke the command. I don't know why you are doing this, and I don't think it will even work to use a variable in the filename. Just use $array (which you should rename to something like $file for clarity) in the filename position.
So your whole command should look something like:
"awk '/$a/,/$b/' $file"
In this single command, you are dealing with three different tools: Perl, SSH, and awk. This is very hard to debug, because if there is a problem, it can be hard to tell where the problem is. It is essential that you break down the task into smaller parts in order to get something like this working.
In this case, that means that you should manually SSH into the server and play around with awk until you get the command right. Only when you are sure that you have the awk command right should you try incorporating it into Perl. It will be much easier if you break down the task in that way.

Related

Interpreting & modifying Perl one-liner?

I have the following Perl 'one-liner' script (found it online, so not mine):
perl -lsne '
/$today.* \[([0-9.]+)\]:.+dovecot_(?:login|plain):([^\s]+).* for (.*)/
and $sender{$2}{r}+=scalar (split / /,$3)
and $sender{$2}{i}{$1}=1;
END {
foreach $sender(keys %sender){
printf"Recip=%05d Hosts=%03d Auth=%s\n",
$sender{$sender}{r},
scalar (keys %{$sender{$sender}{i}}),
$sender;
}
}
' -- -today=$(date +%F) /var/log/exim_mainlog | sort
I've been trying to understand its innards, because I would like to modify it to re-use its functionality.
Some questions I got:
What does the flag -lsne does? (From what I know, it's got to be, at least, 3 different flags in one)
Where does $sender gets its value from?
What about that (?:login|plain) segment, are they 'variables'? (I get that's ReGex, I'm just not familiarized with it)
What I'm trying to achieve:
Get the number of emails sent by each user in a SMTP relay periodically (cron job)
If there's an irregular number of emails (say, 500 in a 1-hour timespan), do something (like shutting of the service, or send a notification)
Why I'm trying to achieve this:
Lately, someone has been using my SMTP server to send spam, so I would like to monitor email activity so they stop abusing the SMTP relay resources. (Security-related suggestions are always welcomed, but out of topic for this question. Trying to focus on the script for now)
What I'm NOT trying to achieve:
To get the script done by third-parties. (Just try and point me in the right direction, maybe an example)
So, any suggestions, guidance,and friendly comments are welcomed. I understand this may be an out-of-topic question, yet I've been struggling with this for almost a week and my background with Perl is null.
Thanks in advance.
What does the flag -lsne does? (From what I know, it's got to be, at least, 3 different flags in one)
-l causes lines of input read in to be auto-chomped, and lines of
output printed out to have "\n" auto-appended
-s enables switch
parsing. This is what creates the variable $today, because a
command-line switch of --today=$(date +%F) was passed.
-n surrounds the entire "one-liner" in a while (<>) { ... } loop.
Effectively reading every line from standard input and running the
body of the one liner on that line
-e is the switch that tells
perl to execute the following code from the command line, rather
than running a file containing Perl code
Where does $sender gets its value from?
I suspect you are confusing $sender with %sender. The code uses $sender{$2}{r} without explicitly mentioning %sender. This is a function of Perl called "auto-vivification". Basically, because we used $sender{$2}{r}, perl automatically created a variable %sender, and added a key whose name is whatever is in $2, and set the value of that key in %sender to be a reference to a new hash. It then set that new hash to have a key 'r' and a value of scalar (split / /,$3)
What about that (?:login|plain) segment, are they 'variables'? (I get that's ReGex, I'm just not familiarized with it)
It's saying that this portion of the regular expression will match either 'login' or 'plain'. The ?: at the beginning tells Perl that these parentheses are used only for clustering, not capturing. In other words, the result of this portion of the pattern match will not be stored in the $1, $2, $3, etc variables.
-MO=Deparse is your friend for understanding one-liners (and one liners that wrap into five lines on your terminal):
$ perl -MO=Deparse -lsne '/$today.* \[([0-9.]+)\]:.+dovecot_( ...
BEGIN { $/ = "\n"; $\ = "\n"; }
LINE:
while ( defined($_ = <ARGV>) ) {
chomp $_;
$sender{$2}{'i'}{$1} = 1 if
/$today.* \[([0-9.]+)\]:.+dovecot_(?:login|plain):([^\s]+).* for (.*)/
and $sender{$2}{'r'} += scalar split(/ /, $3, 0);
sub END {
foreach $sender (keys %sender) {
printf "Recip=%05d Hosts=%03d Auth=%s\n",
$sender{$sender}{'r'},
scalar keys %{$sender{$sender}{'i'};}, $sender;
}
}
}
-e syntax OK
[newlines and indentation added for clarity]
What does the flag -lsne does? (From what I know, it's got to be, at least, 3 different flags in one)
You can access a summary of the available perl command line options by running '~$ perl -h' in the terminal. Below are filtered out the specific command line options you were asking about.
~$ perl -h|perl -ne 'print if /^\s+(-l|-s|-n|-e)/'
-e program one line of program (several -e's allowed, omit programfile)
-l[octal] enable line ending processing, specifies line terminator
-n assume "while (<>) { ... }" loop around program
-s enable rudimentary parsing for switches after programfile
Two examples of the '-s' command line option in use.
~$ perl -se 'print "Todays date is $today\n"' -- -today=`date +%F`
Todays date is 2016-10-17
~$ perl -se 'print "The sky is $color.\n"' -- -color='blue'
The sky is blue.
For detailed explanations of those command line options read the online documentation below.
http://perldoc.perl.org/perlrun.html
Or run the command below from your terminal.
~$ perldoc perlrun
Unrelated to the questions of the OP, I'm aware that this is not a complete answer (added as much as I was able to at the moment), so if this post/answer violates any SO rules, the moderators should remove it. Thx.

Perl open file from command line with wildcard

I am executing my script this way:
./script.pl -f files*
I looked at some other threads (like How can I open a file in Perl using a wildcard in the directory name?)
If i hard code the file name like it is written in this thread I get my desired result. If I take it from the command line it does not.
My options subroutine should save all the files I get this way in an array.
my #file;
sub Options{
my $i=0;
foreach my $opt (#ARGV){
switch ($opt){
case "-f" {
$i++;
### This part does not work:
#file= glob $ARGV[$i];
print Dumper("$ARGV[$i]"); #$VAR1 = 'files';
print Dumper(#file); #$VAR1 = 'files';
}
}
$i++;
}
}
It seems the execution is interpreted in advance and the wildcard (*) is dropped in the process.
Desired result: All files beginning with files are saved in an array, after execution from the command line.
I hope you get my problem. If not feel free to ask.
Thank you.
Well, first I'd suggest using a module to do args on command line:
Getopt::Long for example.
But otherwise your problem is simpler - your shell is expanding the 'file*' before perl gets it. (shell glob is getting there first).
If you do this with:
-f 'file*'
then it'll work properly. You should be able to see this - for example - if you just:
use Data::Dumper;
print Dumper \#ARGV;
I expect you'll see a much longer list than you thought.
However, I'd also point out - perl has a really nice feature you may be able to use (depending what you're doing with your files).
You can use <>, which automatically opens and reads all files specified on command line (in order).
Since your shell is already expanding the glob files* into a list of filenames, that's what the Perl program gets.
$ perl -E 'say #ARGV' files*
files1files2files3
There's no need to do that in Perl, if your shell can do it for you. If all you want is the filenames in an array, you already have #ARGV which contains those.

How to run a local program with user input in Perl

I'm trying to get user input from a web page written in Perl and send it to a local program (blastp), then display the results.
This is what I have right now:
(input code)
print $q->p, "Your database: $bd",
$q->p, "Your protein is: $prot",
$q->p, "Executing...";
print $q->p, system("blastp","-db $bd","-query $prot","-out results.out");
Now, I've done a little research, but I can't quite grasp how you're supposed to do things like this in Perl. I've tried opening a file, writing to it, and sending it over to blastp as an input, but I was unsuccessful.
For reference, this line produces a successful output file:
kold#sazabi ~/BLAST/pataa $ blastp -db pataa -query ../teste.fs -out results.out
I may need to force the bd to load from an absolute path, but that shouldn't be difficult.
edit: Yeah, the DBs have an environmental variable, that's fixed. Ok, all I need is to get the input into a file, pass it to the command, and then print the output file to the CGI page.
edit2: for clarification:
I am receiving user input in $prot, I want to pass it over to blastp in -query, have the program blastp execute, and then print out to the user the results.out file (or just have a link to it, since blastp can output in HTML)
EDIT:
All right, fixed everything I needed to fix. The big problem was me not seeing what was going wrong: I had to install Tiny:Capture and print out stderr, which was when I realized the environmental variable wasn't getting set correctly, so BLAST wasn't finding my databases. Thanks for all the help!
Write $prot to the file. Assuming you need to do it as-is without processing the text to split it or something:
For a fixed file name (may be problematic):
use File::Slurp;
write_file("../teste.fs", $prot, "\n") or print_error_to_web();
# Implement the latter to print error in nice HTML format
For a temp file (better):
my ($fh, $filename) = tempfile( $template, DIR => "..", CLEANUP => 1);
# You can also create temp directory which is even better, via tempdir()
print $fh "$prot\n";
close $fh;
Step 2: Run your command as you indicated:
my $rc = system("$BLASTP_PATH/blastp", "-db", "pataa"
,"-query", "../teste.fs", "-out", "results.out");
# Process $rc for errors
# Use qx[] instead of system() if you want to capture
# standard output of the command
Step 3: Read the output file in:
use File::Slurp;
my $out_file_text = read_file("results.out");
Send back to web server
print $q->p, $out_file_text;
The above code has multiple issues (e.g. you need better file/directory paths, more error handling etc...) but should start you on the right track.

Executing bash script, read its output and create html with Perl

I have a bash script which produces different integer values. When I run that script, the output looks like this:
12
34
34
67
6
This script runs on a Solaris server. In order to provide other users in the network with these values, I decided to write a Perl script which can:
run the bash file
read its output
build a tiny html page with a table in which the bash values are stored
Thats a hard job for me because I have almost no experience with Perl. I know I can use system to execute unix commands (and bash files) but I cannot get the output. I also heared about qx which sounds very useful for my case.
But I must admit I have no clue how do start... Could you give me a few hints how to solve that?
With a question like this it's a little hard to know where to begin.
The qx to which you are referring is a feature of Perl. The "q*" or "Quote and Quote-like Operators" are documented in the Perl "operators" man page (normally you'd use man perlop to read that on systems with a conventional installation of Perl).
Specifically qx is the "quoted-execution of a command" ... which is essentially an alternative form of the ` (back tick or "command substitution") operator in Perl.
In other words if you execute a command like:
perl -e '$foo = qx{ls}; print "\n###\n$foo\n###\n";'
... on a system with Perl installed then it should run Perl, which should evaluate (-e) the expression you've provided (quoted). In other words we're writing a small program right on the command line. This program starts by creating a variable whose contents will be a "scalar" (which is Perl terminology for a string or number). We're assigning (the =, or assignment, operator) the output which is captured by executing the ls command back to this variable ($foo). After that we're printing the contents of our variable (whatever the ls command would have printed) with ### lines preceding and following those contents..
A quirk of Perl's qx operator (and the various other q* operators) is that it allows you to delimit the command with just about any characters you like. For example perl -e '$bar = qx/pwd/;' would capture the output of the pwd command. When you use any of the characters that are normally used as delimiters around text parentheses, braces, brackets, etc) then the qx command will look for the appropriate matching delimiter. If you use any other punctuation (or non-alpha-numeric character?) then that same character will be the terminating delimiter as well. This later behavior is similar to, and was inspired by, a feature in "substitution" command from the old sed utility and ed line editors; while the matching of parentheses, braces, etc. are a Perl novelty.
So that's the basics of how to capture your shell script's output. To print the numbers in an HTML table you'd have to split the captured output into separate lines (saving them into a list or array) then print your HTML prologue (the <table> and <th> (header) tags, and so on) ... them loop over a series of <tr> rows, interpolating your numbers into <td>> (table data) containers) and then finally print your HTML epilogue (with the closing tags).
For that you'll want to read up on the Perl print function and about "interpolation" in Perl. That's a fairly complex topic.
This is all extremely crude and there are tools around which allow you to approach the generation of HTML at a much higher level. It's also rather dubious that you want to wrap the execution of your shell script in a Perl script since it seems likely that you could modify the shell script to directly output HTML (perhaps as an option controlled by a command line switch or environment variable) or that you could re-write the shell script in Perl. This could potentially eliminate the extra work of parsing the output (splitting it into lines and separating the values out of those lines into an array because you can capture the data directly into the array (or possibly print out your HTML rows) directly as you are generating them (however your existing shell script is doing that).
To capture the output of your bash file, you can use the backtick operator:
use strict;
my $e = `ls`;
print $e;
Many, many thanks to you! With your great help. I was able to build a perl script which does a big part of the job.
This is what I have created so far:
#!/usr/bin/perl -w
use strict;
use CGI qw(:standard);
#some variables
my $message = "please wait, loading data...\n";
#First build the web page
print header;
print start_html('Hello World');
print "<H1>we need love, peace and harmony</H1>\n";
print "<p>$message</p>\n";
#Establish a pipeline between the bash and my script.
my $bash_command = '/love/peace/harmony/./lovepeace.bash';
open(my $pipe, '-|', $bash_command) or die $!;
while (my $line = <$pipe>){
# Do something with each line.
print "<p>$line</p>\n";
}
#job done, now refresh page?
print end_html;
When I call that .pl script in my browser, everything works nice :-) But a few questions are still on my mind:
When I call this website, it is busy loading the values from the pipe. Since there are about 10 Values its rather
quick (2-4 seconds) But if I have 100+ Values the user has to wait a while. Since I cannot have a progress bar, I
should give an information to the user. Like:
"Loading data, please wait..."
And when the job is done, this message should say: "Job done" or something similar.
But how do I realize if the process is finnished?
can I reload the page if the job is done ?
Is there any chance of using my own stylesheet wihtin this perl-CGI
Regards,
JJ
Why only perl:
you can use awk for that in side your shell script itself.
I have done this earlier.
if you have the out put values in a variable then use the below method:
echo $SUBSCRIBERS|awk 'BEGIN {
print "<?xml version=\"1.0\" encoding=\"UTF-8\"?><GenTransactionHandler xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"><EntityToPublish>\n<Entitytype=\"C\" typeDesc=\"Subscriber level\"><TargetApplCode>UHUNLD</TargetApplCode><TrxName>GET_SUBSCR_DATA</TrxName>"
}
{for(i=1;i<NF+1;i++) printf("<value>%d</value>\n",$i)}
END{
print "</Entity>\n</EntityToPublish></GenTransactionHandler>"}' >XML_SUB_NUM`date +%Y%m%d%H%M%S`.xml
in $SUBSCRIBERS the values should eb tab separated.

Substituting text in a file within a Perl script

I am using webmin and I am trying to change some settings in a file. I am having problems if the person uses any weird characters that might trip up sed or Perl using the following code:
&execute_command("sed -i 's/^$Pref.*\$/$Pref \"$in{$Pref}\"/g' $DIR/pserver.prefs.cache");
Where execute_command is a webmin function to basically run a special system call. $pref is the preference name such as "SERVERNAME", "OPTION2", etc. and $in{Pref} is going to be the option I want set for the PREF. For example here is a typical pserver.prefs:
SERVERNAME "Test Name"
OWNERPASSWORD "Hd8sdH&3"
Therefore, if we wanted to change SERVERNAME to say Tes"t#&^"#'"##& and OWNERPASSWORD to *#(&'"#$"(')29 then they would be passed in as $in{Pref}. What is the easiest way to escape the $in{} variables so that they can work OK with sed, or better yet, what is a way I can convert my sed command to a strictly Perl command so that it doesn't have errors?
Update:
Awesome, now I'm just trying to get it to work with and I get this error:
**/bin/sh: -c: line 0: unexpected EOF while looking >for matching `"' /bin/sh: -c: line 1: syntax error: unexpected end of file**
This does not work:
my $Pref = "&*())(*&'''''^%$##!";
&execute_command("perl -pi -e 's/^SERVERNAME.*\$/SERVERNAME \"\Q$Pref\E\"/g' $DIR/pserver.prefs");
This does:
my $Pref = "&*())(*&^%$##!";
&execute_command("perl -pi -e 's/^SERVERNAME.*\$/SERVERNAME \"\Q$Pref\E\"/g' $DIR/pserver.prefs");
Perl's regex support includes the \Q and \E operators, which will cause it to avoid interpreting regex symbols within their scope, yet they allow variable interpolation.
This works:
$i = '(*&%)*$£(*';
if ($i =~ /\Q$i\E/){
print "matches!\n";
}
Without the \Q and \E, you'd get an error because of the regex symbols in $i.
The most trivial part is simply to stop executing a command as a single string. Get the shell out of it. Assuming your execute_command function just calls system under the covers, try:
execute_command(qw/perl -pi -e/, 's/^SERVERNAME.*$/SERVERNAME "\Q$Pref\E"/g', "$DIR/pserver.prefs");
That's better, but not perfect. After all, the user could put in something silly like "#[system qw:rm -rf /:]" and then silly things would happen. I think there are ways around this, too, but the most trivial might be to simply do the work inside your code. How to do that? Maybe starting with what perl is doing with the "-pi" flags might help. Let's take a peek:
$ perl -MO=Deparse -pi -e 's/^SERVERNAME.*$/SERVERNAME "\Qfoo\E"/'
BEGIN { $^I = ""; }
LINE: while (defined($_ = <ARGV>)) {
s/^SERVERNAME.*$/SERVERNAME "foo"/;
}
continue {
print $_;
}
Maybe you can do the same thing in your code? Not sure how easy that is to replicate, especially that $^I bit. Worst case scenario, read the file, write to a new file, delete the original file, rename the new file to the original name. That'll help get rid of all the exposures of passing dangerous junk around.