grep word in text files using perl

I have text files A00010.txt, A00011.txt, A00012.txt ... A00099.txt in myfolder, which contain different entries like:
umxwtdn8vtnt_n
umxwtdtnt_nn8v
umxwt_ntdn8vtn
u8vtnt_nmxwtdn
utnt_nmxwtdn8v
My Perl code is:
#!/usr/bin/perl
use strict;
my $count = 10;
for ($count = 10; $count<= 99; $count++) {
my $result = `/bin/cat /myfolder/A000$count.txt | grep "umxwtdn8vtnt_n"`;
return $result;
}
print $result;
I am trying to get the value of $result, but it comes out empty.

Is /myfolder really in the root directory? (what do you see in ls /? Do you see myfolder?) It's very rare to add things in the root directory in a Unix system, and I don't think you are messing with /.
Also, you are calling return $result outside a subroutine (sub { }); if that is your real code, you should get a Perl runtime error.
If you are copying code fragments, please note that $result is declared with my inside the loop, so it no longer exists once that block (or the enclosing subroutine) ends.

Do you really need to use Perl?
If not:
find /myfolder -name "A000??.txt" | xargs grep -n "umxwtdn8vtnt_n"
This will find the pattern in your files and tell you on which line it occurs.
Would you like to know if the pattern is in one or more of your files? Then:
my $not_found = 1;
for (my $count = 10; $count<= 99; $count++) {
my $result = `grep "umxwtdn8vtnt_n" /myfolder/A000$count.txt`;
if ($result) {
print $result;
$not_found = 0; # error level 0 = no error = found
last;
}
}
exit $not_found; # error level 1 = error = not found
Still trying to understand your need... what about:
my $result;
for (my $count = 10; $count<= 99; $count++) {
# you should test that A000$count.txt actually exists here
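my $match = `grep "umxwtdn8vtnt_n" /myfolder/A000$count.txt`;
chomp $match; # grep output ends with a newline
if ($match eq "umxwtdn8vtnt_n") { # eq compares strings; == would compare them as numbers
print "found $match in A000${count}.txt\n";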
$result = $match;
last; # exit for loop
}
}
if ($result) {
# do something with it?
}
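The same search can also be done without shelling out to cat or grep. The following is only a minimal sketch, assuming the files sit in a directory passed on the command line (defaulting to the current one) and that a whole line must equal the wanted string; the sub name files_with_line and its parameters are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the paths of the files A00010.txt .. A00099.txt in $dir that contain
# a line exactly equal to $wanted. Both parameters are illustrative.
sub files_with_line {
    my ($dir, $wanted) = @_;
    my @hits;
    for my $count (10 .. 99) {
        my $path = "$dir/A000$count.txt";
        next unless -f $path;                  # skip numbers with no file
        open my $fh, '<', $path or die "Cannot open $path: $!";
        while (my $line = <$fh>) {
            chomp $line;
            if ($line eq $wanted) {
                push @hits, $path;
                last;                          # one match per file is enough
            }
        }
        close $fh;
    }
    return @hits;
}

# Usage: the directory defaults to the current one when no argument is given.
my $dir = @ARGV ? $ARGV[0] : '.';
print "$_\n" for files_with_line($dir, 'umxwtdn8vtnt_n');
```

Because the result is collected in @hits, which is declared outside the loop, it is still available after the loop ends, which is what the original code was missing.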


Perl - matching elements of two lists

I am trying to grab the list of files that Jenkins has updated between the last build and the latest build, and store it in a Perl array.
I also have a list of files and folders in the source code that are considered sensitive in terms of changes, like XXXX\yyy/., XXX/TTTT/FFF.txt, ... in FILE.TXT.
I want the script to tell me whether any of these sensitive files was among my changed files and, if so, list its name so that we can double-check the change with the development team before we trigger the build.
How should I achieve this, and how do I compare multiple files under one path against the changed file paths?
I have written the script below, which is called inside Jenkins with %workspace# as its argument.
It does not give any matching result.
use warnings;
use Array::Utils qw(:all);
$url = `curl -s "http://localhost:8080/job/Rev-number/lastStableBuild/" | findstr "started by"`;
$url =~ /([0-9]+)/;
system("cd $ARGV[1]");
@difffiles = `svn diff -r $1:HEAD --summarize`;
chomp @difffiles;
foreach $f (@difffiles) {
$f = substr( $f, 8 );
$f = "$f\n";
}
open FILE, '/path/to/file'
or die "Can't open file: $!\n";
@array = <FILE>;
@isect = intersect( @difffiles, @array );
print @isect;
I have managed to solve this issue using the Perl script below:
sub Verifysensitivefileschanges()
{
$count1=0;
@isect = intersect(@difffiles,@sensitive);
#print "ISECT=@isect\n";
foreach $t (@isect)
{
if (@isect) {print "Matching element found -- $t\n"; $count1=1;}
}
return $count1;
}
sub Verifysensitivedirschanges()
{
$count2=0;
foreach $g (@difffiles)
{
$dirs = dirname($g);
$filename = basename($g);
#print "$dirs\n";
foreach $j (@array)
{
if( "$j" =~ /\Q$dirs/)
{print "Sensitive path files changed under path $j and file name is $filename\n";$count2=1;}
}
}
return $count2;
}
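For comparison, the same check can be written with a plain hash lookup instead of Array::Utils. This is only a sketch under assumptions that are not in the original post: paths use forward slashes, a sensitive entry ending in "/" means "everything under this directory", and the sub name sensitive_changes and the sample data are invented for illustration.

```perl
use strict;
use warnings;

# Return the changed paths that are sensitive, either because they are listed
# exactly or because they live under a sensitive directory (entry ending in "/").
sub sensitive_changes {
    my ($changed, $sensitive) = @_;            # two array references
    my %is_sensitive = map { $_ => 1 } grep { !m{/\z} } @$sensitive;
    my @dirs         = grep { m{/\z} } @$sensitive;
    my @hits;
    for my $path (@$changed) {
        # index(...) == 0 means the path starts with that directory prefix
        if ($is_sensitive{$path} or grep { index($path, $_) == 0 } @dirs) {
            push @hits, $path;
        }
    }
    return @hits;
}

# Hypothetical sample data standing in for the svn diff output and FILE.TXT.
my @changed   = ('src/app/main.c', 'conf/secrets/db.txt', 'docs/readme.txt');
my @sensitive = ('conf/secrets/', 'src/app/main.c');
print "Sensitive change: $_\n" for sensitive_changes(\@changed, \@sensitive);
```

Note that both lists must be chomped first; a trailing newline on one side only is enough to make every comparison fail.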

Perl - Use of uninitialized value in string

I started teaching myself Perl, and with the help of some Googling, I was able to throw together a script that prints out the file extensions in a given directory. The code works well; however, it sometimes complains with the following warning:
Use of uninitialized value $exts[xx] in string eq at get_file_exts.plx
I tried to correct this by initializing my array as follows: my @exts = (); but this did not work as expected.
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
#Check for correct number of arguments
if(@ARGV != 1) {
print "ERROR: Incorrect syntax...\n";
print "Usage: perl get_file_exts.plx <Directory>\n";
exit 0;
}
#Search through directory
find({ wanted => \&process_file, no_chdir => 1 }, @ARGV);
my @exts;
sub process_file {
if (-f $_) {
#print "File: $_\n";
#Get extension
my ($ext) = $_ =~ /(\.[^.]+)$/;
#Add first extension
if(scalar @exts == 0) {
push(@exts, $ext);
}
#Loop through array
foreach my $index (0..$#exts) {
#Check for match
if($exts[$index] eq $ext) {
last;
}
if($index == $#exts) {
push(@exts, $ext);
}
}
} else {
#print "Searching $_\n";
}
}
#Sort array
@exts = sort(@exts);
#Print contents
print ("@exts", "\n");
You need to test whether you actually found an extension.
Also, you should not be indexing your array, and you do not need to special-case the first push. That is not the Perl way. Your subroutine should look like this:
sub process_file {
if (-f $_) {
#print "File: $_\n";
#Get extension
my ($ext) = $_ =~ /(\.[^.]+)$/;
# If we found an extension, and we have not seen it before, add it to @exts
if ($ext) {
#Loop through array to see if this is a new extension
my $newExt = 1;
for my $seenExt (@exts) {
#Check for match
if ($seenExt eq $ext) {
$newExt = 0;
last;
}
}
if ($newExt) {
push @exts,$ext;
}
}
}
}
But what you really want to do is use a hash table to record whether you have seen an extension:
# Move this before find(...); if you want to initialize it or you will clobber the
# contents
my %sawExt;
sub process_file {
if (-f $_) {
#print "File: $_\n";
# Get extension
my ($ext) = $_ =~ /(\.[^.]+)$/;
# If we have an extension, mark that we've seen it
$sawExt{$ext} = 1
if $ext;
}
}
# Print the extensions we've seen in sorted order
print join(' ',sort keys %sawExt) . "\n";
Or even
sub process_file {
if (-f $_ && $_ =~ /(\.[^.]+)$/) {
$sawExt{$1} = 1;
}
}
Or
sub process_file {
$sawExt{$1} = 1
if -f && /(\.[^.]+)$/;
}
Once you start thinking in Perl, this is the natural way to write it.
The warning is complaining about the content of $exts[xx], not @exts itself.
Actually $ext can be undef when the filename doesn't match your regexp, for instance README.
Try something like:
my ($ext) = $_ =~ /(\.[^.]+)$/ or return;
The main problem is that you aren't accounting for file names that don't contain a dot, so
my ($ext) = $_ =~ /(\.[^.]+)$/;
sets $ext to undef.
Despite the warning, processing continues by evaluating undef as the null string, failing to find that in @exts, and so percolating undef to the array as well.
The minimal change to get your code working is to replace
my ($ext) = $_ =~ /(\.[^.]+)$/;
with
return unless /(\.[^.]+)$/;
my $ext = $1;
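To see the undef case in isolation, here is a small sketch. The sub name extension_of and the sample file names are only illustrative; the regexp is the one from the question.

```perl
use strict;
use warnings;

# Return the extension (with its dot), or undef when the name has none.
sub extension_of {
    my ($name) = @_;
    # In list context a failed match returns an empty list, so $ext stays undef.
    my ($ext) = $name =~ /(\.[^.]+)$/;
    return $ext;
}

for my $name ('notes.txt', 'archive.tar.gz', 'README') {
    my $ext = extension_of($name);
    printf "%-15s %s\n", $name, defined $ext ? $ext : '(no extension)';
}
```

Testing with defined rather than plain truth keeps the check honest even for unusual values, although an extension captured by this regexp always starts with a dot and is therefore never false.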
But there are a couple of Perl lessons to be learned here. It used to be taught that good programs were well-commented programs. That was in the days of having to write efficient but incomprehensible code, but is no longer true. You should write code that is as clear as possible, and add comments only if you absolutely have to write something that isn't self-explanatory.
You should remember and use Perl idioms, and try to forget most C that you knew. For instance, Perl accepts the "here document" syntax, and it is common practice to use or and and as short-circuit operators. Your parameter check becomes
@ARGV or die <<END;
ERROR: Incorrect syntax...
Usage: perl get_file_exts.plx <Directory>
END
Perl allows for clear but concise programming. This is how I would have written your wanted subroutine:
sub process_file {
return unless -f and /(\.[^.]+)$/;
my $ext = $1;
foreach my $index (0 .. $#exts) {
return if $exts[$index] eq $ext;
}
push @exts, $ext;
}
Use exists on $exts[xx] before accessing it.
exists is deprecated for this use though, as @chrsblck pointed out:
Be aware that calling exists on array values is deprecated and likely
to be removed in a future version of Perl.
But you should be able to check that it is set (and not 0 or "") simply with:
if($exts[$index] && $exts[$index] eq $ext){
...
}

How to compare two directories and their files in perl

Fred here again with a little issue I'm having that I hope you guys can help me with.
I'm reviewing for midterms and going over an old file I found on here and I wanted to get it working. I can't find it on here anymore but I still have the source code so I'll make another question on it.
So here was his assignment:
Write a perl script that will compare two directories for differences in regular files. All regular files with the same names should be tested with the unix function /usr/bin/diff -q which will determine whether they are identical. A file in dir1 which does not have a similarly named file in dir2 will have its name printed after the string <<< while a file in dir2 without a corresponding dir1 entry will be prefixed with the string >>>. If two files have the same name but are different then the file name will be surrounded by > <.
Here is the script:
#!/usr/bin/perl -w
use File::Basename;
@files1 = `/usr/bin/find $ARGV[0] -print`;
chop @files1;
@files2 = `/usr/bin/find $ARGV[1] -print`;
chop @files2;
statement:
for ($i=1; @files1 >= $i; $i++) {
for ($x=1; @files2 >= $x; $x++) {
$file1 = basename($files1[$i]);
$file2 = basename($files2[$x]);
if ($file1 eq $file2) {
shift @files1;
shift @files2;
$result = `/usr/bin/diff -q $files1[$i] $files2[$x]`;
chop $result;
if ($result eq "Files $files1[$i] and $files2[$x] differ") {
print "< $file1 >\n";
next statement;
} else {
print "> $file1 <\n";
}
} else {
if ( !-e "$files1[$i]/$file2") { print ">>> $file2\n";}
unless ( -e "$files2[$x]/$file1") { print "<<< $file1\n";}
}
}
}
This is the output:
> file2 <
>>> file5
<<< file1
The output should be:
> file1 <
> file2 <
<<< file4
>>> file5
I already checked the files to make sure that they all match and such but still having problems. If anyone can help me out I would greatly appreciate it!
First off, always use these:
use strict;
use warnings;
They come with a short learning curve, but they more than make up for it in the long run.
Some notes:
You should use the File::Find module instead of using a system call.
You start your loops at array index 1. In perl, the first array index is 0. So you skip the first element.
Your loop condition is wrong. @files >= $x means you will iterate to 1 more than max index (normally). You want either $x < @files or $x <= $#files.
You should use chomp, which is a safer version of chop.
Altering the arrays you are iterating over is a sure way to cause yourself some confusion.
Why use if (! -e ...) and then unless (-e ...)? That surely just adds confusion.
And this part:
$file1 = basename($files1[$i]);
...
if ( !-e "$files1[$i]/$file2" )
Assuming @files1 contains file names and not just directories, this will never match anything. For example:
$file2 = basename("dir/bar.html");
$file1 = basename("foo/bar.html");
-e "foo/bar.html/bar.html"; # does not compute
I would recommend using hashes for the lookup, assuming you only want to match against identical file names and missing file names:
use strict;
use warnings;
use File::Find;
use List::MoreUtils qw(uniq);
my (%files1, %files2);
my ($dir1, $dir2) = @ARGV;
find( sub { -f and $files1{$_} = $File::Find::name }, $dir1); # "and" binds more loosely than "="
find( sub { -f and $files2{$_} = $File::Find::name }, $dir2);
my @all = uniq(keys %files1, keys %files2);
for my $file (@all) {
my $result;
if ($files1{$file} && $files2{$file}) { # file exists in both dirs
$result = qx(/usr/bin/diff -q $files1{$file} $files2{$file});
# ... etc
} elsif ($files1{$file}) { # file only exists in dir1
} else { # file only exists in dir2
}
}
In the find() subroutine, $_ represents the base name, and $File::Find::name the name including path (which is suitable for use with diff). The -f check will assert that you only include regular files in your hash.
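To fill in the "... etc" part, here is one possible complete version. It is a sketch rather than the assignment's required solution: it swaps /usr/bin/diff -q for the core File::Compare module, prints the report sorted by file name, and, like the hash approach above, assumes base names are unique within each directory tree. The sub names regular_files and compare_dirs are made up for illustration.

```perl
use strict;
use warnings;
use File::Find;
use File::Compare;   # core module; compare() returns 0 when contents match

# Map base name => full path for every regular file below $dir.
sub regular_files {
    my ($dir) = @_;
    my %found;
    find(sub { $found{$_} = $File::Find::name if -f }, $dir);
    return \%found;
}

# Build the report lines described in the assignment, sorted by file name.
sub compare_dirs {
    my ($dir1, $dir2) = @_;
    my ($files1, $files2) = (regular_files($dir1), regular_files($dir2));
    my %all = map { $_ => 1 } keys %$files1, keys %$files2;
    my @report;
    for my $name (sort keys %all) {
        if (exists $files1->{$name} and exists $files2->{$name}) {
            my $differ = compare($files1->{$name}, $files2->{$name});
            die "Cannot compare $name: $!" if $differ < 0;   # -1 signals an error
            push @report, "> $name <" if $differ;
        }
        elsif (exists $files1->{$name}) { push @report, "<<< $name" }   # only in dir1
        else                            { push @report, ">>> $name" }   # only in dir2
    }
    return @report;
}

# Usage: perl compare_dirs.pl dir1 dir2
if (@ARGV == 2) {
    print "$_\n" for compare_dirs(@ARGV);
}
```

Identical files produce no output, which matches the expected output shown in the question.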

Curl crashes when running under cron

I've got a really bizarre problem. I've googled this to death, and cannot for the life of me find an answer. I'm a bit of a newbie to programming (less than 2 years), so sorry if this is something obvious, or if I've not provided adequate detail.
The Problem is...
curl crashes when I call it for the 5th time in a while loop (when run from root's cron).
curl is fine when I run said while loop manually whilst logged in (about 50 iterations).
I run a bash script from cron
The bash script runs a perl script
The perl script calls curl within a while loop
On the 5th iteration of this while loop, curl is called and crashes (no output)
I'm running cron as root (crontab -u root /path/to/the/crontab/file)
I don't think it's environment based, as it runs fine 4 times
If I end the while loop at 4 iterations and start it again, it still fails, so I figure the problem is not with the while loop.
This exact script works fine on my old server running Ubuntu desktop (I'm now on Ubuntu server 10.04)
I think this is a problem between curl and cron.
The line of the crash looks like this (vars filled in)
$err = system("/usr/bin/curl -f -v -s -r \"36155357-36259993,36790101-37194555,53623979-53745261\" http://nomads.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gfs.2012040100/master/gfs.t00z.mastergrb2f21 -o /root/Desktop/getGFS_uploadGFS/GFS/windvect/gfs.t00z.mastergrb2f21.tmp");
I'm totally stumped right now; if anyone has any ideas it would be much appreciated. Below is the while loop (with the crash point highlighted near the bottom).
while ($fhr <= $hr1) {
if ($fhr <= 9) { $fhr="0$fhr"; }
$url = $URL;
$url =~ s/\$FHR/$fhr/g;
$url =~ s/\${FHR}/$fhr/g;
$file = $url;
$file =~ s/^.*\///;
#
# read the inventory
# $line[] = wgrib inventory, $start[] = start of record (column two of $line[])
#
if ($windows eq 'yes') {
$err = system("$curl -f -s $url$inv -o $OUTDIR/$file.tmp");
$err = $err >> 8;
if ($err) {
print STDERR "error code=$err, problem reading $url$inv\n";
sleep(10);
exit(8);
}
open (In, "$OUTDIR/$file.tmp");
}
else {
open (In, "$curl -f -s $url$inv |");
}
$n=0;
while (<In>) {
chomp;
$line[$n] = $_;
s/^[^:]*://;
s/:.*//;
$start[$n] = $_;
$n++;
}
close(In);
if ($n == 0) {
print STDERR "Problem reading file $url$inv\n";
sleep(10);
exit(8);
}
#
# find end of record: $last[]
#
$lastnum = $start[$n-1];
for ($i = 0; $i < $n; $i++) {
$num = $start[$i];
if ($num < $lastnum) {
$j = $i + 1;
while ($start[$j] == $num) { $j++; }
$last[$i] = $start[$j] - 1;
}
else {
$last[$i] = '';
}
}
if ($action eq 'inv') {
for ($i = 0; $i < $n; $i++) {
print "$line[$i]:range=$start[$i]-$last[$i]\n";
}
exit(0);
}
#
# make the range field for Curl
#
$range = '';
$lastfrom = '';
$lastto = '-100';
for ($i = 0; $i < $n; $i++) {
$_ = $line[$i];
if (/$LEVS/i && /$VARS/i) {
$from=$start[$i];
$to=$last[$i];
if ($lastto + 1 == $from) {
$lastto = $to;
}
elsif ($lastto ne $to) {
if ($lastfrom ne '') {
if ($range eq '') { $range = "$lastfrom-$lastto"; }
else { $range = "$range,$lastfrom-$lastto"; }
}
$lastfrom = $from;
$lastto = $to;
}
}
}
if ($lastfrom ne '') {
if ($range eq '') { $range="$lastfrom-$lastto"; }
else { $range="$range,$lastfrom-$lastto"; }
}
if ($range ne '') {
#################################################################################
########### THE BELOW LINE IS WHERE CURL IS CALLED AND IT CRASHES ###############
#################################################################################
$err = system("$curl -f -v -s -r \"$range\" $url$grb -o $OUTDIR/$file.tmp");
$err = $err >> 8;
if ($err != 0) {
print STDERR "error in getting file $err $url$grb\n";
sleep(20);
exit $err;
}
rename "$OUTDIR/$file.tmp", "$OUTDIR/$file";
$output = "$output $OUTDIR/$file";
}
else {
print "no matches (no download) for $file\n";
}
$fhr += $dhr;
}
Why do you want to shell out to curl? If it's just the range, that's easy:
use v5.10.1;
use Mojo::UserAgent;
say Mojo::UserAgent->new->get(
'http://www.example.com',
{ 'Range' => 'bytes=500-600' }
)->res->body;
There are also Perl bindings to libcurl: Net::Curl and WWW::Curl.
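If you do keep the system() call, it may help to look at the whole status value rather than only $err >> 8, since the low bits are where a terminating signal would be recorded. The following is a minimal sketch based on the status layout documented for system(); the sub name describe_status is made up, and a tiny perl child process stands in for the real curl command.

```perl
use strict;
use warnings;

# Turn the value returned by system() (the same as $?) into a readable message.
sub describe_status {
    my ($status) = @_;
    return "failed to start" if $status == -1;
    return sprintf("killed by signal %d", $status & 127) if $status & 127;
    return sprintf("exited with code %d", $status >> 8);
}

# Stand-in for the curl command: a child perl that exits with code 3.
my $status = system($^X, '-e', 'exit 3');
print describe_status($status), "\n";
```

Logging this message to STDERR from the cron job would at least show whether curl exited with an error code or was killed by a signal.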

Using a Perl hash

It's the first time I've used a hash in Perl, and I'm stuck on a weird problem. What I'm trying to do is, after I back up files in a directory, use a Perl program to check whether all the files appear in the log file. So I had the following code:
our (%missing_files) = (); # global definition on the top of the program
... do something ...
sub CheckTarResult {
my (@dir_list) = (); # dir list
my (@file_list) = (); # will be filled with all file names in one dir
my ($j) = "";
my ($k) = ""; # loop variable
my ($errors) = 0; # number of missing files
... do something ...
foreach $j (@dir_list) {
@file_list = `ls $j`;
foreach $k (@file_list) {
$result = `cat $logfile | grep $k`;
if ($result eq "") {
$errors++;
$missing_files{$j} = ${k};
}
}
@file_list = ();
}
... do something ...
my($dir) = "";
my($file) = "";
while ( ($dir, $file) = each(%missing_files) ) {
print $dir . " : " . $file;
}
I made an empty log file to do the test; the expected result should give me all files as missing, but somehow "missing_files" only stores the last missing file in each dir. The logic seems to be straightforward, so what am I missing here?
Edit:
I used the advice from @Borodin, and it worked. But in order to print the content of an array reference, we need to loop through elements in the array. The code after the change looks like the following:
... everything before is the same ...
push @{$missing_files{$j}}, ${k}; # put elements in dictionary
# in the print statement
while( ($dir, $file) = each(%missing_files) ) {
for $i ( 0 .. $#$file ) { # $#$file is the last index of the referenced array
print $dir . " : " . ${$file}[$i];
}
}
Perl hash values can contain only a single scalar. If you want to store a list of things then you must make that scalar an array reference. To do that, change the line
$missing_files{$j} = ${k};
to
push @{$missing_files{$j}}, ${k};
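Put together, building and printing such a hash of arrays could look like the sketch below. The directory and file names are hypothetical sample data standing in for the real directory scan.

```perl
use strict;
use warnings;

my %missing_files;

# Hypothetical (directory, file) pairs standing in for the real scan results.
my @missing = (
    ['/backup/a', 'one.txt'],
    ['/backup/a', 'two.txt'],
    ['/backup/b', 'three.txt'],
);

for my $pair (@missing) {
    my ($dir, $file) = @$pair;
    # The first push for a directory creates the array reference automatically.
    push @{ $missing_files{$dir} }, $file;
}

# Print every missing file, grouped by directory.
for my $dir (sort keys %missing_files) {
    print "$dir : $_\n" for @{ $missing_files{$dir} };
}
```

Iterating over @{ $missing_files{$dir} } directly avoids the index arithmetic with $#$file.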