I have an HTTP response in a Perl object (HTTP::Tiny) that I'm trying to extract specific values from, but a positive match returns the entire object instead of just the values I want. How can I extract just the lines or values I want from the object?
for my $url (@urls) {
print $url."\n";
$response = $http->get("$url");
my ($res) = grep {/href/} $response->{content};
print $res;
}
grep in Perl isn't quite like the command-line program of the same name. It works over arrays rather than text lines. Try something like this instead:
my ($res) = grep {/href/} split(/\n/, $response->{content});
The split will turn the document you GET into an array of lines so that the grep will do what you expect.
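A minimal sketch of the same idea without a network call. The $content string is a made-up stand-in for $response->{content}; assigning to an array instead of ($res) keeps every matching line rather than only the first.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $response->{content}
my $content = "<html>\n<a href=\"/one\">one</a>\n<p>text</p>\n<a href=\"/two\">two</a>\n</html>\n";

# split turns the document into a list of lines; grep keeps the matching ones
my @links = grep { /href/ } split /\n/, $content;

print "$_\n" for @links;
```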
I have a variable $string and I want to print all the lines after I find a keyword in a line (including the line with the keyword).
$string=~ /apple /;
I'm using this regexp to find the keyword, but I do not know how to print the lines after it.
It's not really clear where your data is coming from. Let's assume it's a string containing newlines. Let's start by splitting it into an array.
my @string = split /\n/, $string;
We can then use the flip-flop operator to decide which lines to print. I'm using \0 as a regex that is very unlikely to match any string (so, effectively, it's always false).
for (@string) {
say if /apple / .. /\0/;
}
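Here is the flip-flop approach as a complete script, as a sketch only. The sample $string is invented, and the lines are collected into @after (a name chosen for this example) before printing.

```perl
use strict;
use warnings;

# Invented sample data
my $string = "kiwi\nred apple pie\nbanana\ncherry\n";
my @lines  = split /\n/, $string;

my @after;
for (@lines) {
    # True from the first line matching /apple / onward; /\0/ never ends the range here
    push @after, $_ if /apple / .. /\0/;
}

print "$_\n" for @after;
```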
Just keep a flag variable, set it to true when you see the string, print if the flag is true.
perl -ne 'print if $seen ||= /apple/'
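The same flag idea written as a script rather than a one-liner, as a rough sketch. The @lines data and the @out array are assumptions made for the example.

```perl
use strict;
use warnings;

# Invented sample data
my @lines = ("kiwi", "apple tart", "banana");

my $seen;    # stays false until the keyword has been seen once
my @out;
for (@lines) {
    # ||= only evaluates the match while $seen is still false
    push @out, $_ if $seen ||= /apple/;
}

print "$_\n" for @out;
```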
If your data is in a scalar variable, there are several methods you can use.
Recommended method
($matching) = $string=~ /([^\n]*apple.+)/s;
print "$matching\n";
And there is another way to do it
$string=~ /[^\n]*apple.+/s;
print $&; # prints the text that matched
If you are reading the data from a file, try the following
while (<$fh>)
{
if(/apple/)
{
print $_, <$fh>; # the matching line, then every remaining line
}
}
Or else try the following one liner
perl -ne 'print $_, <> and exit if /apple/' file.txt
Being a Perl newbie, please pardon me for asking this basic question.
I have a text file @server1 that holds a bunch of sentences (whitespace is the field separator) on many lines.
I needed to match lines with my keyword, remove the same lines, and extract only the last field, so I have tried with:
my @allmatchedlines;
open(output1, "ssh user1@server1 cat /tmp/myfile.txt |");
while(<output1>) {
chomp;
@allmatchedlines = $_ if /mysearch/;
}
close(output1);
my @uniqmatchedline = split(/ /, @allmatchedlines);
my $lastfield = $uniqmatchedline[-1];
print "$lastfield\n";
and it gives me the output showing:
1
I don't know why it's giving me just "1".
Could someone please explain why I'm getting "1" and how I can get the last field of the matched line correctly?
Thank you!
my @uniqmatchedline = split(/ /, @allmatchedlines);
You're getting "1" because split takes a scalar, not an array. An array in scalar context returns the number of elements.
You need to split on each individual line. Something like this:
my @uniqmatchedline = map { split(/ /, $_) } @allmatchedlines;
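A small sketch of both points: an array in scalar context yields its element count, and splitting has to happen per line. The sample lines are invented, and taking the last field of each line is one reading of what the question wants.

```perl
use strict;
use warnings;

# Invented sample of matched lines
my @allmatchedlines = ("foo mysearch one", "bar mysearch two");

# Scalar context: the array evaluates to its number of elements
my $count = @allmatchedlines;

# Split each line on its own and keep only its last field
my @lastfields = map { (split ' ', $_)[-1] } @allmatchedlines;

print "$count\n";
print "@lastfields\n";
```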
There are two issues with your code:
split is expecting a scalar value (string) to split on; if you are passing an array, it will convert the array to scalar (which is just the array length)
You did not have a way to remove same lines
To address these, the following code should work (not tested as no data):
my @allmatchedlines;
open(output1, "ssh user1\@server1 cat /tmp/myfile.txt |") or die "Cannot run ssh: $!";
while(<output1>) {
chomp;
push @allmatchedlines, $_ if /mysearch/; # collect every matching line
}
close(output1);
my %existing;
my @uniqmatchedline = grep !$existing{$_}++, @allmatchedlines; # this will return the unique lines
my @lastfields = map { ((split / /, $_)[-1]) . "\n" } @uniqmatchedline; # this maps the last field in each line into an array
print for @lastfields;
Apart from two errors in the code, I find the statement "remove the same lines and extract only the last field" unclear. Once duplicate matching lines are removed, there may still be multiple distinct sentences with the pattern.
Until a clarification comes, here is code that picks the last field from the last such sentence.
use warnings 'all';
use strict;
use List::MoreUtils qw(uniq);
my $file = '/tmp/myfile.txt';
my $cmd = "ssh user1\@server1 cat $file";
my @allmatchedlines;
open my $fh, '-|', $cmd or die "Error opening $cmd: $!";
while (<$fh>) {
chomp;
push @allmatchedlines, $_ if /mysearch/;
}
close $fh;
my @unique_matched_lines = uniq @allmatchedlines;
my $lastfield = ( split ' ', $unique_matched_lines[-1] )[-1];
print $lastfield, "\n";
I changed to the three-argument open, with error checking. Recall that open for a process involves a fork and returns the pid, so an "error" doesn't at all relate to what happened with the command itself. See open. Also note that @ under "..." indicates an array and thus needs to be escaped.
The (default) pattern ' ' used in split splits on any amount of whitespace. The regex / / turns off this behavior and splits on a single space. You most likely want to use ' '.
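To make the difference concrete, here is a short sketch with an invented line that has leading and repeated spaces. The variable names are chosen for this example.

```perl
use strict;
use warnings;

# Invented line with leading and repeated spaces
my $line = "  alpha   beta gamma";

# ' ' skips leading whitespace and splits on runs of whitespace
my @by_whitespace = split ' ', $line;

# / / splits on every single space, so empty fields appear
my @by_single_space = split / /, $line;

print scalar(@by_whitespace), " fields with ' '\n";
print scalar(@by_single_space), " fields with / /\n";
```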
For more comments please see the original post below.
The statement @allmatchedlines = $_ if /mysearch/; on every iteration assigns to the array, overwriting whatever has been in it. So you end up with only the last line that matched mysearch. You want push @allmatchedlines, $_ ... to get all those lines.
Also, as shown in the answer by Justin Schell, split needs a scalar so it is taking the length of @allmatchedlines – which is 1 as explained above. You should have
my @words_in_matched_lines = map { split } @allmatchedlines;
When all this is straightened out, you'll have words in the array @uniqmatchedline and if that is the intention then its name is misleading.
To get unique elements of the array you can use the module List::MoreUtils
use List::MoreUtils qw(uniq);
my @unique_elems = uniq @whole_array;
I have searched a fair bit and hope I'm not duplicating something someone has already asked. I have what amounts to a CSV that is specifically formatted (as required by a vendor). There are four values that are being delimited as follows:
"Name","Description","Tag","IPAddresses"
The list is quite long (and there are ~150 unique names--only 2 in the sample below) but it basically looks like this:
"2B_AppName-Environment","desc","tag","192.168.1.1"
"2B_AppName-Environment","desc","tag","192.168.22.155"
"2B_AppName-Environment","desc","tag","10.20.30.40"
"6G_ServerName-AltEnv","desc","tag","1.2.3.4"
"6G_ServerName-AltEnv","desc","tag","192.192.192.40"
"6G_ServerName-AltEnv","desc","tag","192.168.50.5"
I am hoping for a way in Perl (or sed/awk, etc.) to come up with the following:
"2B_AppName-Environment","desc","tag","192.168.1.1,192.168.22.155,10.20.30.40"
"6G_ServerName-AltEnv","desc","tag","1.2.3.4,192.192.192.40,192.168.50.5"
So basically, the resulting file will APPEND the duplicates to the first match -- there should only be one line per each app/server name with a list of comma-separated IP addresses just like what is shown above.
Note that the "Description" and "Tag" fields don't need to be considered in the duplication removal/append logic -- let's assume these are blank for the example to make things easier. Also, in the vendor-supplied list, the "Name" entries are all already sorted to be together.
This short Perl program should suit you. It expects the path to the input CSV file as a parameter on the command line and prints the result to STDOUT. It keeps track of the appearance of new name fields in the @names array so that it can print the output in the order that each name first appears, and it takes the values for desc and tag from the first occurrence of each unique name.
use strict;
use warnings;
use Text::CSV;
my $csv = Text::CSV->new({always_quote => 1, eol => "\n"});
my (@names, %data);
while (my $row = $csv->getline(*ARGV)) {
my $name = $row->[0];
if ($data{$name}) {
$data{$name}[3] .= ','.$row->[3];
}
else {
push @names, $name;
$data{$name} = $row;
}
}
for my $name (@names) {
$csv->print(*STDOUT, $data{$name});
}
output
"2B_AppName-Environment","desc","tag","192.168.1.1,192.168.22.155,10.20.30.40"
"6G_ServerName-AltEnv","desc","tag","1.2.3.4,192.192.192.40,192.168.50.5"
Update
Here's a version that ignores any record that doesn't have a valid IPv4 address in the fourth field. I've used Regexp::Common as it's the simplest way to get complex regex patterns right. It may need installing on your system.
use strict;
use warnings;
use Text::CSV;
use Regexp::Common;
my $csv = Text::CSV->new({always_quote => 1, eol => "\n"});
my (@names, %data);
while (my $row = $csv->getline(*ARGV)) {
my ($name, $address) = @{$row}[0,3];
next unless $address =~ $RE{net}{IPv4};
if ($data{$name}) {
$data{$name}[3] .= ','.$address;
}
else {
push @names, $name;
$data{$name} = $row;
}
}
for my $name (@names) {
$csv->print(*STDOUT, $data{$name});
}
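The grouping step used in both versions can be tried without Text::CSV installed. The sketch below assumes the rows have already been parsed into array references (@rows is invented sample data) and joins the output by hand purely for display; real CSV reading and writing should still go through a parser.

```perl
use strict;
use warnings;

# Invented rows, shaped like what a CSV parser would return
my @rows = (
    ['2B_AppName-Environment', 'desc', 'tag', '192.168.1.1'],
    ['2B_AppName-Environment', 'desc', 'tag', '10.20.30.40'],
    ['6G_ServerName-AltEnv',   'desc', 'tag', '1.2.3.4'],
    ['6G_ServerName-AltEnv',   'desc', 'tag', '192.168.50.5'],
);

my (@names, %data);
for my $row (@rows) {
    my $name = $row->[0];
    if ($data{$name}) {
        # Name seen before: append this address to the stored record
        $data{$name}[3] .= ',' . $row->[3];
    }
    else {
        # First occurrence: remember the order and keep a copy of the row
        push @names, $name;
        $data{$name} = [@$row];
    }
}

# Display only: quote each field and join with commas
for my $name (@names) {
    print join(',', map { qq("$_") } @{ $data{$name} }), "\n";
}
```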
I would advise you to use a CSV parser like Text::CSV for this type of problem.
Borodin has already pasted a good example of how to do this.
One approach that I'd advise you NOT to use is regular expressions.
The following one-liner demonstrates how one could do this, but this is a very fragile approach compared to an actual csv parser:
perl -0777 -ne '
while (m{^((.*)"[^"\n]*"\n(?:(?=\2).*\n)*)}mg) {
$s = $1;
$s =~ s/"\n.*"([^"\n]+)(?=")/,$1/g;
print $s
}' test.csv
Outputs:
"2B_AppName-Environment","desc","tag","192.168.1.1,192.168.22.155,10.20.30.40"
"6G_ServerName-AltEnv","desc","tag","1.2.3.4,192.192.192.40,192.168.50.5"
Explanation:
Switches:
-0777: Slurp the entire file
-n: Creates a while(<>){...} loop for each “line” in your input file.
-e: Tells perl to execute the code on command line.
Code:
while (m{^((.*)"[^"\n]*"\n(?:(?=\2).*\n)*)}mg): Separate text into matching sections.
$s =~ s/"\n.*"([^"\n]+)(?=")/,$1/g;: Join all ip addresses by a comma in matching sections.
print $s: Print the results.
I have a perl object that was returned to me whose data I can't seem to extract. If I run Data::Dumper->Dump on it as:
Data::Dumper->Dump($message_body)
I get:
$VAR1 = 'SBM Message
';
$VAR2 = '--SBD.Boundary.605592468
';
$VAR3 = 'Content-Type: text/plain;charset=US-ASCII
';
$VAR4 = 'Content-Disposition: inline
If I execute the line:
print $message_body;
I get:
ARRAY(0x9145668)
I would think this is an array. However, when I try to iterate through it, there only seems to be a single element. How do I extract each of the elements from this? By the way, this is basically the body of a mail message extracted using the MIME::Parser package. It was created using the following:
my $parser = new MIME::Parser;
my $entity = $parser->parse($in_fh); # Where $in_fh points to a mail message
$message_body = $entity->body;
Try the foreach loop below.
foreach my $item (@{$message_body})
{
print $item."\n";
}
$message_body is an ARRAY reference. Hence you need to dereference it and then iterate through each element using the foreach loop.
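A self-contained sketch of the dereferencing involved. The array reference here is built by hand from two of the dumped lines rather than by MIME::Parser, and $count and @lines are names chosen for the example.

```perl
use strict;
use warnings;

# Hand-built stand-in for what $entity->body returned in the question
my $message_body = ["SBM Message\n", "--SBD.Boundary.605592468\n"];

# ref tells you what kind of reference you are holding
print ref($message_body), "\n";

# @{...} dereferences the whole array; scalar context gives its size
my $count = scalar @{$message_body};

# Copy the elements so chomp does not modify the original data
my @lines = @{$message_body};
chomp @lines;
print "$_\n" for @lines;

# The arrow syntax reaches a single element
print $message_body->[0];
```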
Read:
http://perlmeme.org/howtos/using_perl/dereferencing.html and http://www.thegeekstuff.com/2010/06/perl-array-reference-examples/
Data::Dumper is only a poor man's choice to see the content.
To see all the gory internal details use Devel::Peek instead.
use Devel::Peek;
Dump $message_body;
I have a suite of small Java apps that each compile/package to <name-of-the-app>.jar and run on my server. Occasionally one of them will throw an exception, choke and die. I am trying to write a quick-n-dirty Perl script that will periodically poll to see if all of these executable JARs are still running, and if any of them are not, send me an email telling me which one is dead.
To determine this manually, I have to run a ps -aef | grep <name-of-app> for each app I want to check. For example, to see if myapp.jar is running as a process, I run ps -aef | grep myapp, and look for a grep result that describes the JVM process representing myapp.jar. This manual checking is now getting tedious and is a prime candidate for automation!
I am trying to implement the code that checks to see whether a process is running or not. I'd like to make this a sub that accepts the name of the executable JAR and returns true or false:
sub isAppStillRunning($appName) {
# Somehow run "ps -aef | grep $appName"
# Somehow count the number of processes returned by the grep
# Since grep always returns itself, determine if (count - 1) == 1.
# If so, return true, otherwise, false.
}
I need to be able to pass the sub the name of an app, run my normal command, and count the number of results returned by grep. Since running a grep always results in at least one result (the grep command itself), I need logic that says if the (# of results - 1) is equal to 1, then we know the app is running.
I'm brand new to Perl and am having a tough time figuring out how to implement this logic. Here's my best attempt so far:
sub isAppStillRunning($appName) {
# Somehow run "ps -aef | grep $appName"
@grepResults = `ps -aef | grep $appName`;
# Somehow count the number of processes returned by the grep
$grepResultCount = length(@grepResults);
# Since grep always returns itself, determine if (count - 1) == 1.
# If so, return true, otherwise, false.
if(($grepResultCount - 1) == 1)
true
else
false
}
Then to call the method, from inside the same Perl script, I think I would just run:
&isAppStillRunning("myapp");
Any help with defining the sub and then calling it with the right app name is greatly appreciated. Thanks in advance!
It would be about a billion times easier to use the Proc::ProcessTable module from CPAN. Here's an example of what it might look like:
use strict;
use warnings;
use Proc::ProcessTable;
...
sub isAppStillRunning {
my $appname = shift;
my $pt = Proc::ProcessTable->new;
my @procs = grep { $_->fname =~ /$appname/ } @{ $pt->table };
if ( @procs ) {
# we've got at least one proc matching $appname. Hooray!
} else {
# everybody panic!
}
}
isAppStillRunning( "myapp" );
Some notes to keep in mind:
Turn on strict and warnings. They are your friends.
You don't specify subroutine arguments with prototypes. (Prototypes in Perl do something completely different, which you don't want.) Use shift to get arguments off the @_ array.
Don't use & to call subroutines; just use its name.
An array evaluated in scalar context (including when it's inside an if) gives you its size. length doesn't work on arrays.
Your sub is almost there, but the final if-else construct has to be corrected, and in some cases Perl idiom can make your life easier.
Perl Has Prototypes, But They Suck
sub isAppStillRunning($appName) {
will not work. Instead use
sub isAppStillRunning {
my ($appName) = @_;
The @_ array holds the arguments to the function.
Perl has some simple prototypes (the sub name(&$@) {...} syntax), but they are broken, and an advanced topic, so don't use them.
Perl Has Built-In Grep
`ps -aef | grep $appName`;
In scalar context this returns one (1) string, possibly containing multiple lines; in list context it returns a list of lines. You could run ps alone, split the output at newlines, and grep in Perl, which is safer than interpolating variables into a shell command:
my @lines = split /\n/, `ps -aef`;
my @grepped = grep /$appName/, @lines;
You could also use the open function to explicitly open a pipe to ps:
my @grepped = ();
open my $ps, '-|', 'ps -aef' or die "can't invoke ps: $!";
while (<$ps>) {
push @grepped, $_ if /$appName/;
}
This is equivalent, but better style. It reads the ps output line by line and pushes every line containing your $appName onto the @grepped array.
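The filtering step can be tried without running ps at all. In this sketch @lines is invented sample output, and the \Q...\E quoting is an addition of mine that assumes the app name should be matched literally rather than as a regex; without it, the dot in a name like my.app matches any character.

```perl
use strict;
use warnings;

my $appName = 'my.app';

# Invented lines standing in for the output of ps -aef
my @lines = (
    'user 101 1 java -jar my.app.jar',
    'user 102 1 java -jar myXapp.jar',
    'user 103 1 sshd',
);

# Plain interpolation: '.' is a regex metacharacter here
my @loose = grep { /$appName/ } @lines;

# \Q...\E quotes metacharacters so the name is matched literally
my @exact = grep { /\Q$appName\E/ } @lines;

print scalar(@loose), " loose matches, ", scalar(@exact), " exact matches\n";
```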
Scalar vs. List Context
Perl has this unusual thing called "context". There is list context and scalar context. For example, subroutine calls take argument lists - so these lists (usually) have list context. Concatenating two strings is a scalar context, in contrast.
Arrays behave differently depending on their context. In list context, they evaluate to their elements, but in scalar context, they evaluate to the number of their elements. So there is no need to manually count elements (or use the length function that works on strings).
So we have:
my @array = (1, 2, 3);
print "Length: ", scalar(@array), "\n"; # prints "Length: 3\n"
print "Elems: ", @array, "\n"; # prints "Elems: 123\n";
print "LastIdx: ", $#array, "\n"; # prints "LastIdx: 2\n";
The last form, $#array, is the last index in the array. Unless you meddle with special variables, this is the same as @array - 1.
The scalar function forces scalar context.
Perl Has No Booleans
Perl has no boolean data type, and therefore no true or false keywords. Instead, all values are true, unless stated otherwise. False values are:
The empty string "", the number zero 0, the string zero "0", the special value undef, and some other oddities you won't run into.
So generally use 1 as true and 0 as false.
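A quick sketch for checking truthiness yourself. The list of sample values is my own choice; note that strings such as '0.0' and '00' are true even though they are numerically zero.

```perl
use strict;
use warnings;

# Sample values chosen for illustration
my @values = ('', 0, '0', undef, '0.0', '00', ' ', 'false', 1);

for my $v (@values) {
    my $label = defined $v ? "'$v'" : 'undef';
    print "$label is ", ($v ? 'true' : 'false'), "\n";
}

# Keep only the values Perl treats as true
my @true = grep { $_ } @values;
```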
The if/else constructs require curly braces
So you probably meant:
if (TEST) {
return 1;
} else {
return 0;
}
which is the same as return TEST, where TEST is a condition.
The Ultimate reduction
Using these tricks, your sub could be written as short as
sub isAppStillRunning {
return scalar grep /$_[0]/, (split /\n/, `ps -aef`);
}
This returns the number of lines that contain your app name.
You could modify your routine like this:
sub isAppRunning {
my $appName = shift;
my @grepResults = `ps -aef | grep $appName`;
my $items = 0;
for my $item (@grepResults) {
$items++;
}
return $items;
}
This will iterate over the @grepResults and allow you to inspect the $item if necessary.
Calling it like this should return the number of processes:
print(isAppRunning('myapp') . "\n");