I have a suite of small Java app that all compiles/packages to <name-of-the-app>.jar and run on my server. Occasionally one of them will throw an exception, choke and die. I am trying to write a quick-n-dirty Perl script that will periodically poll to see if all of these executable JARs are still running, and if any of them are not, send me an email and notify me which one is dead.
To determine this manually, I have to run a ps -aef | grep <name-of-app> for each app I want to check. For example, to see if myapp.jar is running as a process, I run ps -aef | grep myapp, and look for a grep result that describes the JVM process representing myapp.jar. This manual checking is now getting tedious and is a prime candidate for automation!
I am trying to implement the code that checks to see whether a process is running or not. I'd like to make this a sub that accepts the name of the executable JAR and returns true or false:
sub isAppStillRunning($appName) {
# Somehow run "ps -aef | grep $appName"
# Somehow count the number of processes returned by the grep
# Since grep always returns itself, determine if (count - 1) == 1.
# If so, return true, otherwise, false.
}
I need to be able to pass the sub the name of an app, run my normal command, and count the number of results returned by grep. Since running a grep always results in at least one result (the grep command itself), I need logic that says if the (# of results - 1) is equal to 1, then we know the app is running.
I'm brand new to Perl and am having a tough time figuring out how to implement this logic. Here's my best attempt so far:
sub isAppStillRunning($appName) {
# Somehow run "ps -aef | grep $appName"
#grepResults = `ps -aef | grep $appName`;
# Somehow count the number of processes returned by the grep
$grepResultCount = length(#grepResults);
# Since grep always returns itself, determine if (count - 1) == 1.
# If so, return true, otherwise, false.
if(($grepResultCount - 1) == 1)
true
else
false
}
Then to call the method, from inside the same Perl script, I think I would just run:
&isAppStillRunning("myapp");
Any help with defining the sub and then calling it with the right app name is greatly appreciated. Thanks in advance!
It would be about a billion times easier to use the Proc::ProcessTable module from CPAN. Here's an example of what it might look like:
use strict;
use warnings;
use Proc::ProcessTable;
...
sub isAppStillRunning {
my $appname = shift;
my $pt = Proc::ProcessTable->new;
my #procs = grep { $_->fname =~ /$appname/ } #{ $pt->table };
if ( #procs ) {
# we've got at least one proc matching $appname. Hooray!
} else {
# everybody panic!
}
}
isAppStillRUnning( "myapp" );
Some notes to keep in mind:
Turn on strict and warnings. They are your friends.
You don't specify subroutine arguments with prototypes. (Prototypes in Perl do something completely different, which you don't want.) Use shift to get arguments off the #_ array.
Don't use & to call subroutines; just use its name.
An array evaluated in scalar context (including if its inside an if) gives you its size. length doesn't work on arrays.
Your sub is almost there, but the final if-else construct has to be corrected, and in some cases Perl idiom can make your life easier.
Perl Has Prototypes, But They Suck
sub isAppStillRunning($appName) {
will not work. Instead use
sub isAppStillRunning {
my ($appName) = #_;
The #_ array holds the arguments to the function.
Perl has some simple prototypes (the sub name(&$#) {...} syntax), but they are broken, and an advanced topic, so don't use them.
Perl Has Built-In Grep
`ps -aef | grep $appName`;
This returns one (1) string, possibly containing multiple lines. You could split the output at newlines, and grep manually, which is safer than interpolating variables:
my #lines = split /\n/ `ps -aef`;
my #grepped = grep /$appName/, #lines;
You could also use the open function to explicitly open a pipe to ps:
my #grepped = ();
open my $ps, '-|', 'ps -aef' or die "can't invocate ps";
while (<$ps>) {
push #grepped if /$appName/;
}
This is exactly equal, but better style. It reads all lines from the ps output and then pushes all lines with your $appName into the #grepped array.
Scalar vs. List Context
Perl has this unusual thing called "context". There is list context and scalar context. For example, subroutine calls take argument lists - so these lists (usually) have list context. Concatenating two strings is a scalar context, in contrast.
Arrays behave differently depending on their context. In list context, they evaluate to their elements, but in scalar context, they evaluate to the number of their elements. So there is no need to manually count elements (or use the length function that works on strings).
So we have:
my #array = (1, 2, 3);
print "Length: ", scalar(#array), "\n"; # prints "Length: 3\n"
print "Elems: ", #array, "\n"; # prints "Elems: 123\n";
print "LastIdx: ", $#array, "\n"; # prints "LastIdx: 2\n";
The last form, $#array, is the last index in the array. Unless you meddle with special variables, this is the same as #array - 1.
The scalar function forces scalar context.
Perl Has No Booleans
Perl has no boolean data type, and therefore no true or false keywords. Instead, all values are true, unless stated otherwise. False values are:
The empty string "", the number zero 0, the string zero "0", the special value undef, and some other oddities you won't run into.
So generally use 1 as true and 0 as false.
The if/else constructs require curly braces
So you probably meant:
if (TEST) {
return 1;
} else {
return 0;
}
which is the same as return TEST, where TEST is a condition.
The Ultimate reduction
Using these tricks, your sub could be written as short as
sub isAppStillRunning {
return scalar grep /$_[0]/, (split /\n/, `ps -aef`);
}
This returns the number of lines that contain your app name.
You could modify your routine like this:
sub isAppRunning {
my $appName = shift;
#grepResults = `ps -aef | grep $appName`;
my $items = 0;
for $item(#grepResults){
$items++;
}
return $items;
}
This will iterate over the #grepResults and allow you to inspect the $item if necessary.
Calling it like this should return the number of processes:
print(isAppRunning('myapp') . "\n");
Related
Details for the stateful behaviour for operators such as match (ie. m//g), stat (ie. stat _) and range (ie. //..//) are in the documentation. However is there a 'listing' of all operators or functions that exhibit stateful behaviour? The ones that come to mind are:
#ARGV/File/dir/glob: The current line/position is remembered
say while <>; #Read line by line by Line from ARGV files
say while <$fh>; #Read line by line by line from file handle
say while defined($_ = readdir($dh)); #Read an entry at a time from dir handle
say while <*>; #Read an entry at a time from file glob in current dir
say while <{a,b,c},{1,2}>; #Print combination glob one at a time ie (a1,b1,c1,a2,b2,c2)
#Regex: Global modifier/list context remembers previous matches
say while m/$re/g; #Print matches one at a time
my $_="hello"; say /hello/g; say /hello/gc; #Only prints hello once. Stateful continuation from last match position
#stat
stat _; #Returns stat array state from previous stat "filename";
#range: State remembers if the first condition is met
perl -n -e 'print if 1..10' myfile.txt; #Bistable flipflop state. Print lines 1 to 10
perl -n -e 'print if /startmatch/../endmatch/' myfile.text; #print lines between matches
#state variable: make your own
sub { state $myStateVar;}; #Your own state variable
Any info would be great. Thanks
There are three stateful operators.
glob in scalar context (including <> used as glob).
for my $pass (1..2) {
say glob("abc") // "[undef]";
}
Output
abc
[undef]
The flip-flop operator (.. and ... in scalar context).
for my $pass (1..2) {
$. = 5;
say scalar(5..6);
}
Output
1
2
state
for my $pass (1..2) {
state $x = "abc";
say $x;
$x = "def";
}
Output
abc
def
A lot of operators use the TARG mechanism which makes them technically stateful, but this is an optimization that's transparent to the user. The mechanism allows operators to remember the scalar they return so it can be reused by subsequent invocations.
perl -e'
use Devel::Peek qw( Dump );
for my $pass (1..2) {
my $x = "abc";
Dump(uc($x));
}
' 2>&1 | grep -P '^SV ='
Output
SV = PV(0x55d87231fea0) at 0x55d87237ce08
SV = PV(0x55d87231fea0) at 0x55d87237ce08
It's not a coincidence that both scalars are at the same address (0x55d87237ce08); it's the same scalar.
It was suggested that m//g in scalar context is stateful, but its result is strictly based on its inputs. The effect observed is the result of using pos($_), where $_ is an input.
local $_ = "abcdef";
for my $pass (1..2) {
pos($_) = 2;
last if !/./g;
say $&;
}
Output
c
c
It was suggested that each is stateful, but its result is strictly based on its input. The effect observed is the result of using an iterator that's part of the input.
my %h = ( a=>1, b=>2, c=>3 );
for my $pass (1..2) {
keys(%h); # `keys` in void context resets a hash's iterator.
say join " ", each(%h);
}
Output
c 3
c 3
keys and values also use this same iterator.
It was suggested that readline and readdir are stateful, but their result is strictly based on their input. The only reason you were getting different outputs is that the input (the file handle or directory handle) was different each time.
These functions (and many others) have side effects, and these side-effects including modifying their inputs. But I wouldn't call them stateful as there are significant differences between these and stateful operators.
It was suggested that stat is stateful, but its result is strictly based on its input (be it _ or something else).
This one is not the least bit stateful.
cperl has marked all not stateful ops marked as pure.
Which would be a p in the third column in https://github.com/perl11/cperl/blob/master/regen/opcodes
At runtime you can check that with PL_opargs[optype] & OA_PURE.
Also cperl supports the :pure attribute for functions, and the compiler should later mark them automatically, if worthwhile for the jit.
eval is slow when done on a string: The string first has to be parsed before it can be executed.
I am looking for a way to cache the parsing, so that I can reuse the parsed string for yet another eval. The next eval will be the same code, but will not eval to the same value, so I cannot simply cache the results.
From the description I am looking for ceval from Eval::Compile.
But I cannot use Eval::Compile, as that requires a C compiler for the platform, and it is not given that the user has a C compiler.
So can I do something similar to ceval in pure Perl?
Background
GNU Parallel lets the user give Perl expressions that will be eval'ed on every argument. Currently the Perl expressions are given as strings by the user and eval'ed for every argument. The Perl expressions remain unchanged for each argument. It is therefore a waste to recompile the expression as the recompilation will not change anything.
Profiling of the code shows that the eval is one of the bottlenecks.
Example
The user enters: $_ .= "foo" and s/a/b/g
A user's scripts are stored in $usereval1 and $usereval2.
The user gives 10000 random arguments (strings) stored in #arguments.
sub replace {
my ($script, $arg) = #_;
local $_;
$_ = $arg;
# This is where I would like to cache the parsed $script.
eval $script;
return $_;
}
for my $arg (#arguments) {
# Loads of indirect code (in the order of 1000 lines) that
# call subs calling subs calling subs that eventually does:
$replaced1 = replace($usereval1, $arg);
$replaced2 = replace($usereval2, $arg);
# Yet more code that does stuff with $replaced1 $replaced2
}
You can store a subroutine ref like this:
perl -lwe 'my $x = eval(q( sub { my $foo = shift; $foo*2; } )); print $x->(12);'
This prints 24. You can reuse the code without the need to recompile it.
saw the script (see below) but could not find more info about "-n".
my $numeric =0;
my $input = shift;
if ($input eq "-n") {
$numeric =1;
$input = shift;
}
my $output = shift;
open INPUT, $input or die $!;
open OUTPUT, ">$output" or die $!;
my #file = <INPUT>;
if ($numeric) {
#file = sort { $a <=> $b } #file;
} else {
#file = sort #file;
}
print OUTPUT #file;
The text explaining the script says the following "If the first thing we see on the command line after our program's name is the string -n, then we are doing a numeric sort."
Google search does not seem to recognize most "non-alphanumeric" symbols, so "-n" search yields nothing. The only other place I saw "-n"is in learning perl, where it says the following "the converted sed script can operate either with or without -n option". Not even sure if this is the same "-n" as in the script. Any idea where I can find out more info about the -n (although it may simply means a numeric string ?? nothing else more)
The -n used by this script is entirely unrelated to the -n flag used by perl. In other words, this:
perl -n script.pl
Is completely different from this:
perl script.pl -n
What you have is the second case. Take a look at the documentation for shift:
Shifts the first value of the array off and returns it, shortening the
array by 1 and moving everything down. If there are no elements in the
array, returns the undefined value. If ARRAY is omitted, shifts the #_
array within the lexical scope of subroutines and formats, and the
#ARGV array outside a subroutine and also within the lexical scopes
established by the eval STRING , BEGIN {} , INIT {} , CHECK {} ,
UNITCHECK {} , and END {} constructs.
That's a mouthfull, but what it's saying is that if we're not in a subroutine, and shift appears by itself, it's going to grab the first element of #ARGV. What's #ARGV? Let's look in perlvar, where all those weird variables are documented:
The array #ARGV contains the command-line arguments intended for the
script.
Note that those are the arguments for the script, not for perl. So if somebody executes your script with perl script.pl -n, then we can expect $ARGV[0] to be the string -n.
Looking at your code now, it's obvious what's going on:
my $input = shift;
if ($input eq "-n") {
$numeric =1;
$input = shift;
}
They use shift without an argument and outside a subroutine to grab the first element of #ARGV. If that's -n, the variable $numeric is set to 1. That variable controls how the script behaves. (The script then goes on to get the names of the input and output files out of #ARGV as well.)
Its a command line argument for this script itself. If the user invokes it with the name of the script followed by "-n" then that will tell the script how to behave.
I'm just starting in Perl and I'm quite enjoying it. I'm writing some basic functions, but what I really want to be able to do is to use those functions intelligently using console commands. For example, say I have a function adding two numbers. I'd want to be able to type in console "add 2, 4" and read the first word, then pass the two numbers as parameters in an "add" function. Essentially, I'm asking for help in creating some basic scripting using Perl ^^'.
I have some vague ideas about how I might do this in VB, but Perl, I have no idea where I'd start, or what functions would be useful to me. Is there something like VB.net's "Split" function where you can break down the contents of a scalar into an array? Is there a simple way to analyse one word at a time in a scalar, or iterate through a scalar until you hit a separator, for example?
I hope you can help, any suggestions are appreciated! Bear in mind, I'm no expert, I started Perl all of a few weeks ago, and I've only been doing VB.net half a year.
Thank you!
Edit: If you're not sure what to suggest and you know any simple/intuitive resources that might be of help, that would also be appreciated.
Its rather easy to make a script which dispatches to a command by name. Here is a simple example:
#!/usr/bin/env perl
use strict;
use warnings;
# take the command name off the #ARGV stack
my $command_name = shift;
# get a reference to the subroutine by name
my $command = __PACKAGE__->can($command_name) || die "Unknown command: $command_name\n";
# execute the command, using the rest of #ARGV as arguments
# and print the return with a trailing newline
print $command->(#ARGV);
print "\n";
sub add {
my ($x, $y) = #_;
return $x + $y;
}
sub subtract {
my ($x, $y) = #_;
return $x - $y;
}
This script (say its named myscript.pl) can be called like
$ ./myscript.pl add 2 3
or
$ ./myscript.pl subtract 2 3
Once you have played with that for a while, you might want to take it further and use a framework for this kind of thing. There are several available, like App::Cmd or you can take the logic shown above and modularize as you see fit.
You want to parse command line arguments. A space serves as the delimiter, so just do a ./add.pl 2 3 Something like this:
$num1=$ARGV[0];
$num2=$ARGV[1];
print $num1 + $num2;
will print 5
Here is a short implementation of a simple scripting language.
Each statement is exactly one line long, and has the following structure:
Statement = [<Var> =] <Command> [<Arg> ...]
# This is a regular grammar, so we don't need a complicated parser.
Tokens are seperated by whitespace. A command may take any number of arguments. These can either be the contents of variables $var, a string "foo", or a number (int or float).
As these are Perl scalars, there is no visible difference between strings and numbers.
Here is the preamble of the script:
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
strict and warnings are essential when learning Perl, else too much weird stuff would be possible. The use 5.010 is a minimum version, it also defines the say builtin (like a print but appends a newline).
Now we declare two global variables: The %env hash (table or dict) associates variable names with their values. %functions holds our builtin functions. The values are anonymous functions.
my %env;
my %functions = (
add => sub { $_[0] + $_[1] },
mul => sub { $_[0] * $_[1] },
say => sub { say $_[0] },
bye => sub { exit 0 },
);
Now comes our read-eval-loop (we don't print by default). The readline operator <> will read from the file specified as the first command line argument, or from STDIN if no filename is provided.
while (<>) {
next if /^\s*\#/; # jump comment lines
# parse the line. We get a destination $var, a $command, and any number of #args
my ($var, $command, #args) = parse($_);
# Execute the anonymous sub specified by $command with the #args
my $value = $functions{ $command }->(#args);
# Store the return value if a destination $var was specified
$env{ $var } = $value if defined $var;
}
That was fairly trivial. Now comes some parsing code. Perl “binds” regexes to strings with the =~ operator. Regexes may look like /foo/ or m/foo/. The /x flags allows us to include whitespace in our regex that doesn't match actual whitespace. The /g flag matches globally. This also enables the \G assertion. This is where the last successful match ended. The /c flag is important for this m//gc style parsing to consume one match at a time, and to prevent the position of the regex engine in out string to being reset.
sub parse {
my ($line) = #_; # get the $line, which is a argument
my ($var, $command, #args); # declare variables to be filled
# Test if this statement has a variable declaration
if ($line =~ m/\G\s* \$(\w+) \s*=\s* /xgc) {
$var = $1; # assign first capture if successful
}
# Parse the function of this statement.
if ($line =~ m/\G\s* (\w+) \s*/xgc) {
$command = $1;
# Test if the specified function exists in our %functions
if (not exists $functions{$command}) {
die "The command $command is not known\n";
}
} else {
die "Command required\n"; # Throw fatal exception on parse error.
}
# As long as our matches haven't consumed the whole string...
while (pos($line) < length($line)) {
# Try to match variables
if ($line =~ m/\G \$(\w+) \s*/xgc) {
die "The variable $1 does not exist\n" if not exists $env{$1};
push #args, $env{$1};
}
# Try to match strings
elsif ($line =~ m/\G "([^"]+)" \s*/xgc) {
push #args, $1;
}
# Try to match ints or floats
elsif ($line =~ m/\G (\d+ (?:\.\d+)? ) \s*/xgc) {
push #args, 0+$1;
}
# Throw error if nothing matched
else {
die "Didn't understand that line\n";
}
}
# return our -- now filled -- vars.
return $var, $command, #args;
}
Perl arrays can be handled like linked list: shift removes and returns the first element (pop does the same to the last element). push adds an element to the end, unshift to the beginning.
Out little programming language can execute simple programs like:
#!my_little_language
$a = mul 2 20
$b = add 0 2
$answer = add $a $b
say $answer
bye
If (1) our perl script is saved in my_little_language, set to be executable, and is in the system PATH, and (2) the above file in our little language saved as meaning_of_life.mll, and also set to be executable, then
$ ./meaning_of_life
should be able to run it.
Output is obviously 42. Note that our language doesn't yet have string manipulation or simple assignment to variables. Also, it would be nice to be able to call functions with the return value of other functions directly. This requires some sort of parens, or precedence mechanism. Also, the language requires better error reporting for batch processing (which it already supports).
The following perl code generates a warning in PerlCritic (by Activestate):
sub natural_sort {
my #sorted;
#sorted = grep {s/(^|\D)0+(\d)/$1$2/g,1} sort grep {s/(\d+)/sprintf"%06.6d",$1/ge,1} #_;
}
The warning generated is:
Don't modify $_ in list functions
More info about that warning here
I don't understand the warning because I don't think I'm modifying $_, although I suppose I must be.
Can someone explain it to me please?
Both of your greps are modifying $_ because you're using s//. For example, this:
grep {s/(^|\D)0+(\d)/$1$2/g,1}
is the same as this:
grep { $_ =~ s/(^|\D)0+(\d)/$1$2/g; 1 }
I think you'd be better off using map as you are not filtering anything with your greps, you're just using grep as an iterator:
sub natural_sort {
my $t;
return map { ($t = $_) =~ s/(^|\D)0+(\d)/$1$2/g; $t }
sort
map { ($t = $_) =~ s/(\d+)/sprintf"%06.6d",$1/ge; $t }
#_;
}
That should do the same thing and keep critic quiet. You might want to have a look at List::MoreUtils if you want some nicer list operators than plain map.
You are doing a substitution ( i.e. s/// ) in the grep, which modifies $_ i.e. the list being grepped.
This and other cases are explained in perldoc perlvar:
Here are the places where Perl will
assume $_ even if you don't use it:
The following functions:
abs, alarm, chomp, chop, chr, chroot,
cos, defined, eval, exp, glob, hex,
int, lc, lcfirst, length, log, lstat,
mkdir, oct, ord, pos, print,
quotemeta, readlink, readpipe, ref,
require, reverse (in scalar context
only), rmdir, sin, split (on its
second argument), sqrt, stat, study,
uc, ucfirst, unlink, unpack.
All file tests (-f , -d ) except for -t , which defaults to STDIN.
See -X
The pattern matching operations m//, s/// and tr/// (aka y///) when
used without an =~ operator.
The default iterator variable in a foreach loop if no other variable is
supplied.
The implicit iterator variable in the grep() and map() functions.
The implicit variable of given().
The default place to put an input record when a operation's result
is tested by itself as the sole
criterion of a while test. Outside a
while test, this will not happen.
Many people have correctly answered that the s operator is modifying $_, however in the soon to be released Perl 5.14.0 there will be the new r flag for the s operator (i.e. s///r) which rather than modify in-place will return the modified elements. Read more at The Effective Perler . You can use perlbrew to install this new version.
Edit: Perl 5.14 is now available! Announcement Announcement Delta
Here is the function suggested by mu (using map) but using this functionality:
use 5.14.0;
sub natural_sort {
return map { s/(^|\D)0+(\d)/$1$2/gr }
sort
map { s/(\d+)/sprintf"%06.6d",$1/gre }
#_;
}
The VERY important part that other answers have missed is that the line
grep {s/(\d+)/sprintf"%06.6d",$1/ge,1} #_;
Is actually modifying the arguments passed into the function, and not copies of them.
grep is a filtering command, and the value in $_ inside the code block is an alias to one of the values in #_. #_ in turn contains aliases to the arguments passed to the function, so when the s/// operator performs its substitution, the change is being made to the original argument. This is shown in the following example:
sub test {grep {s/a/b/g; 1} #_}
my #array = qw(cat bat sat);
my #new = test #array;
say "#new"; # prints "cbt bbt sbt" as it should
say "#array"; # prints "cbt bbt sbt" as well, which is probably an error
The behavior you are looking for (apply a function that modifies $_ to a copy of a list) has been encapsulated as the apply function in a number of modules. My module List::Gen contains such an implementation. apply is also fairly simple to write yourself:
sub apply (&#) {
my ($sub, #ret) = #_;
$sub->() for #ret;
wantarray ? #ret : pop #ret
}
With that, your code could be rewritten as:
sub natural_sort {
apply {s/(^|\D)0+(\d)/$1$2/g} sort apply {s/(\d+)/sprintf"%06.6d",$1/ge} #_
}
If your goal with the repeated substitutions is to perform a sort of the original data with a transient modification applied, you should look into a Perl idiom known as the Schwartzian transform which is a more efficient way of achieving that goal.