Perl IF statement not matching variables in REGEX - perl

my $pointer = 0;
foreach (@new1)
{
    my $test = $_;
    foreach (@chk)
    {
        my $check = $_;
        chomp $check;
        delete($new1[$pointer]) if ($test =~ /^$check/i);
    }
    $pointer++;
}
The if statement never matches, even though many entries in the @new1 array do contain $check at the start of the array element (88 at least).
I am not sure the nested loop is causing the problem, because if I try this it also fails to match:
foreach (@chk)
{
    @final = (grep /^$_/, @new1);
}
@final is empty but I know at least 88 entries for $_ are in @new1.
I wrote this code on a machine running Windows ActivePerl 5.14.2 and the top code works. I then (using a copy of @new1) compare the two and remove any duplicates (also works on 5.14.2). I did try to negate the if match but that seemed to wipe out the @new1 array (so that I didn't need to do a hash compare).
When I try to run this code on a Linux RedHat box with Perl 5.8.0 it seems to struggle with the variable matching in the REGEX. If I hard code the REGEX with an example I know is in @new1 the match works and in the first code the entry is deleted (in the second one the value is inserted in @final).
The @chk array is a listing file on the web server and the @new1 array is created by opening two log files on the web server and then pushing one into the other.
I had even gone to the trouble of printing out $test and $check in each loop iteration and manually checking to see if any of the values did match, and some of them do.
It has had me baffled for days now and I have had to throw in the towel and ask for help. Any ideas?

As tested by user1568538, the solution was to replace
chomp $check;
with
$check =~ s/\r\n//g;
to remove Windows-style line endings from the variable.
Since chomp removes the contents of the input record separator $/ from the end of its argument, you could also change its value:
my $pointer = 0;
foreach (@new1)
{
    my $test = $_;
    foreach (@chk)
    {
        local $/ = "\r\n";
        my $check = $_;
        chomp $check;
        delete($new1[$pointer]) if ($test =~ /^$check/i);
    }
    $pointer++;
}
However, since $/ also affects other operations (such as reading from a file handle), it is safest to avoid changing $/ unless you are sure it is safe. Here I limit the change to the foreach loop where the chomp occurs.
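To see why the trailing carriage return matters, here is a minimal sketch with made-up data (the values in @chk and @new1 are assumptions, not the asker's files). On Linux, chomp with the default $/ removes only "\n", so a line from a Windows-written file keeps its "\r", and the pattern then demands a carriage return in the middle of the log entry.

```perl
use strict;
use warnings;

# Hypothetical sample data: @chk as read from a file with CRLF line endings.
my @chk  = ("alpha.html\r\n", "beta.html\r\n");
my @new1 = ("alpha.html 200 1024", "gamma.html 404 0", "beta.html 200 512");

my @kept;
ENTRY:
for my $entry (@new1) {
    for my $line (@chk) {
        # Copy before editing so @chk itself is left untouched.
        (my $check = $line) =~ s/\r?\n\z//;    # strips "\n" or "\r\n"
        # \Q...\E makes dots and other metacharacters match literally.
        next ENTRY if $entry =~ /^\Q$check\E/i;
    }
    push @kept, $entry;    # no entry of @chk matched
}

print "$_\n" for @kept;
```

Collecting the survivors into a new array is a design choice of this sketch, not part of the original fix: it avoids delete on array elements, which can leave undefined holes in the array.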

Not knowing what your input data looks like, using \Q might help:
if ($test =~ /^\Q$check/i);
See quotemeta.

It is not clear what you are trying to do. However, you may be trying to get only those elements for which there is no match, or vice versa. Adapt the code below to your needs:
#!/usr/bin/perl
use strict; use warnings;
my @item = qw(...); # your @new?
my @check = qw(...); # your @chk?
my @match;
my @nomatch;
ITEM:
foreach my $item (@item) {
    CHECK:
    foreach my $check (@check) {
        # if $check should not be interpreted as a pattern but as
        # literal characters, match against /^\Q$check\E/ instead
        if ($item =~ /^$check/) {
            push @match, $item;
            next ITEM; # there was a match, so this $item is burnt
                       # we don't need to test against other $checks.
        }
    }
    # there was no match, so let's store it:
    push @nomatch, $item;
}
print "matched $_\n" for @match;
print "didn't match $_\n" for @nomatch;
Your code is somewhat difficult to read. Let me tell you what this
foreach (@chk) {
    @final = (grep /^$_/, @new1);
}
does: It is roughly equivalent to
my @final = ();
foreach my $check (@chk) {
    @final = grep /^$check/, @new1;
}
which is equivalent to
my @final = ();
foreach my $check (@chk) {
    # @final = grep /^$check/, @new1;
    @final = ();
    foreach (@new1) {
        if (/^$check/) {
            push @final, $_;
        }
    }
}
So your @final array gets reset on every iteration of the outer loop, and ends up holding only the matches for the last entry of @chk, possibly none.
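If the goal is to collect every element of @new1 that starts with any entry of @chk, one option is to append to @final rather than assign to it. A minimal sketch, with sample data invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample data.
my @chk  = ('foo', 'bar');
my @new1 = ('foo 1', 'baz 2', 'bar 3', 'foo 4');

my @final;
for my $check (@chk) {
    # push appends, so matches from earlier checks are preserved.
    push @final, grep { /^\Q$check\E/ } @new1;
}

print "$_\n" for @final;
```

Note that the results come out grouped by check, not in the original order of @new1, and an element that matches two checks would be added twice.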

Related

How do I skip an iteration step in a while loop in Perl?

In Perl I'm trying to achieve this:
while ($row = <$fh>){
if the row contains the character >:
#do something AND then skip to the next line
else:
#continue to parse normally and do other things
You can skip to the next iteration of a loop with the next built-in. Since you are reading line by line, that's all you need to do.
For checking if the character is present, use a regular expression. That's done with the m// operator and =~ in Perl.
while ($row = <$fh>) {
if ( $row =~ m/>/ ) {
# do stuff ...
next;
}
# no need for else
# continue and do other stuff ...
}
Try this way:
while ($row = <$fh>)
{
if($row =~ />/)
{
#do something AND then skip to the next line
next;
}
#continue to parse normally and do other things
}

What am I not getting about foreach loops?

It was always my understanding that
foreach (@arr)
{
    ....
}
and
for(my $i=0; $i<@arr; $i++)
{
    .....
}
were functionally equivalent.
However, in all of my code, whenever I use a foreach loop I run into problems that get fixed when I change to a for loop. It always has to do with comparing the values of two things, usually with nested loops.
Here is an example:
for(my $i=0; $i<@files; $i++)
{
    my $sel;
    foreach (@selected)
    {
        if(files[$i] eq selected[$_])
        {
            $selected='selected';
        }
    }
    <option value=$Files[$i] $sel>$files[$i]</option>
}
The above code falls between select tags in a cgi program.
Basically I am editing the contents of a select box according to user specifications.
But after they add or delete choices I want the choices that were originally selected to remain selected.
The above code is supposed to accomplish this when reassembling the select on the next form. However, with the foreach version it only gets the first choice that's selected and skips the rest. If I switch it to a 3 part for loop, without changing anything else, it will work as intended.
This is only a recent example, so clearly I am missing something here, can anyone help me out?
Let's assume that @files is a list of filenames.
In the following code, $i is the array index (i.e. it's an integer):
for (my $i=0; $i<@files; $i++) { ... }
In the following code, $i is set to each array item in turn (i.e. it's a filename):
foreach my $i (@files) { ... }
So for example:
use strict;
use warnings;
my @files = (
    'foo.txt',
    'bar.txt',
    'baz.txt',
);
print "for...\n";
for (my $i=0; $i<@files; $i++) {
    print "\$i is $i.\n";
}
print "foreach...\n";
foreach my $i (@files) {
    print "\$i is $i.\n";
}
Produces the following output:
for...
$i is 0.
$i is 1.
$i is 2.
foreach...
$i is foo.txt.
$i is bar.txt.
$i is baz.txt.
foreach loops are generally preferred for looping through arrays to avoid accidental off-by-one errors caused by things like for (my $i=1;...;...) or for (my $i=0;$i<=@arr;...).
That said, for and foreach are actually implemented as synonyms in Perl, so the following script produces identical output to my previous example:
use strict;
use warnings;
my @files = (
    'foo.txt',
    'bar.txt',
    'baz.txt',
);
print "for...\n";
foreach (my $i=0; $i<@files; $i++) {
    print "\$i is $i.\n";
}
print "foreach...\n";
for my $i (@files) {
    print "\$i is $i.\n";
}
It is simply customary to refer to the second type of loop as a foreach loop, even if the source code uses the keyword for to perform the loop (as has become quite common).
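Applied to the select box in the question, the point is that the inner loop variable is already an element of @selected, so it should be compared directly rather than used as an index. A minimal sketch, assuming @files and @selected both hold plain filenames (the sample values and the @options array are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the CGI parameters.
my @files    = ('foo.txt', 'bar.txt', 'baz.txt');
my @selected = ('foo.txt', 'baz.txt');

my @options;
for my $file (@files) {
    my $sel = '';
    for my $chosen (@selected) {
        # $chosen is the element itself, so no index lookup is needed.
        $sel = ' selected' if $file eq $chosen;
    }
    push @options, qq{<option value="$file"$sel>$file</option>};
}

print "$_\n" for @options;
```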

If condition not matching inside a foreach loop in perl

I'm trying to match a string with an if statement inside a foreach loop, but it's not matching, although the same string is printed just before the if statement inside the foreach loop. Please help.
use Net::Telnet;
$ip='xx.xxx.xx.xx';
$ip_port='10002';
$port = new Net::Telnet->new( Host=>$ip,Port=>$ip_port,Dump_log=> "dump.log");
my @folder= $port->cmd("ls");
sleep(2);
$folders=@folder;
print "Number of folders are:$folders\n";
foreach my $folder(@folder)
{
    print "Folder before if is:$folder\n";
    if(($folder eq "acc") || ($folder eq "bda"))
    {
        # some code here.
    }
}
Your strings probably contain whitespace, such as a trailing newline. You can use something like chomp to remove it, or alternatively use a regex.
Try:
if ($folder =~ /^(acc|bda)/) {
# some code here
}
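If an exact comparison is wanted, another option is to trim the whitespace first and keep using eq. A minimal sketch, assuming the returned lines carry trailing newlines or carriage returns (the sample lines and the @wanted array are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical output lines, as a remote "ls" might return them.
my @folder = ("acc\n", "bda \r\n", "tmp\n");

my @wanted;
for my $raw (@folder) {
    # Trim both ends, including "\r" and "\n", on a copy of the line.
    (my $name = $raw) =~ s/^\s+|\s+$//g;
    push @wanted, $name if $name eq 'acc' || $name eq 'bda';
}

print "matched: $_\n" for @wanted;
```

Unlike the pattern /^(acc|bda)/, which is not anchored at the end, the eq comparison after trimming does not also accept longer names such as accounts.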

Perl - Use of uninitialized value in string

I started teaching myself Perl, and with the help of some Googling, I was able to throw together a script that would print out the file extensions in a given directory. The code works well, however, it will sometimes complain the following:
Use of uninitialized value $exts[xx] in string eq at get_file_exts.plx
I tried to correct this by initializing my array as follows: my @exts = (); but this did not work as expected.
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
#Check for correct number of arguments
if(@ARGV != 1) {
    print "ERROR: Incorrect syntax...\n";
    print "Usage: perl get_file_exts.plx <Directory>\n";
    exit 0;
}
#Search through directory
find({ wanted => \&process_file, no_chdir => 1 }, @ARGV);
my @exts;
sub process_file {
    if (-f $_) {
        #print "File: $_\n";
        #Get extension
        my ($ext) = $_ =~ /(\.[^.]+)$/;
        #Add first extension
        if(scalar @exts == 0) {
            push(@exts, $ext);
        }
        #Loop through array
        foreach my $index (0..$#exts) {
            #Check for match
            if($exts[$index] eq $ext) {
                last;
            }
            if($index == $#exts) {
                push(@exts, $ext);
            }
        }
    } else {
        #print "Searching $_\n";
    }
}
#Sort array
@exts = sort(@exts);
#Print contents
print ("@exts", "\n");
You need to test whether you found an extension.
Also, you should not be indexing your array, and you do not need a special case for the first push; that is not the Perl way. Your sub should start like this:
sub process_file {
    if (-f $_) {
        #print "File: $_\n";
        #Get extension
        my ($ext) = $_ =~ /(\.[^.]+)$/;
        # If we found an extension, and we have not seen it before, add it to @exts
        if ($ext) {
            #Loop through array to see if this is a new extension
            my $newExt = 1;
            for my $seenExt (@exts) {
                #Check for match
                if ($seenExt eq $ext) {
                    $newExt = 0;
                    last;
                }
            }
            if ($newExt) {
                push @exts, $ext;
            }
        }
    }
}
But what you really want to do is to use a hash table to record if you saw an extension
# Move this before find(...); if you want to initialize it or you will clobber the
# contents
my %sawExt;
sub process_file {
if (-f $_) {
#print "File: $_\n";
# Get extension
my ($ext) = $_ =~ /(\.[^.]+)$/;
# If we have an extension, mark that we've seen it
$sawExt{$ext} = 1
if $ext;
}
}
# Print the extensions we've seen in sorted order
print join(' ',sort keys %sawExt) . "\n";
Or even
sub process_file {
if (-f $_ && $_ =~ /(\.[^.]+)$/) {
$sawExt{$1} = 1;
}
}
Or
sub process_file {
$sawExt{$1} = 1
if -f && /(\.[^.]+)$/;
}
Once you start thinking in Perl, this is the natural way to write it.
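The hash approach can be tried without touching the filesystem. The following is only a sketch: the file names in @names are invented, and in the real script they would come from File::Find.

```perl
use strict;
use warnings;

# Hypothetical file names; in the real script these come from File::Find.
my @names = ('a.txt', 'README', 'b.pl', 'c.txt', 'archive.tar.gz');

my %sawExt;
for my $name (@names) {
    # Names without a dot fail the match, so nothing undefined is stored.
    $sawExt{$1} = 1 if $name =~ /(\.[^.]+)$/;
}

# Hash keys are unique, so each extension appears once.
my @exts = sort keys %sawExt;
print "@exts\n";
```

Because README has no dot, it never reaches the hash, which is what removes the "uninitialized value" warning.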
The warning is complaining about the content of $exts[xx], not @exts itself.
$ext can actually be undef when the filename doesn't match your regexp, for instance README.
Try:
my ($ext) = $_ =~ /(\.[^.]+)$/ or return;
The main problem is that you aren't accounting for file names that don't contain a dot, so
my ($ext) = $_ =~ /(\.[^.]+)$/;
sets $ext to undef.
Despite the warning, processing continues by evaluating undef as the null string, failing to find that in @exts, and so percolating undef to the array as well.
The minimal change to get your code working is to replace
my ($ext) = $_ =~ /(\.[^.]+)$/;
with
return unless /(\.[^.]+)$/;
my $ext = $1;
But there are a couple of Perl lessons to be learned here. It used to be taught that good programs were well-commented programs. That was in the days of having to write efficient but incomprehensible code, but is no longer true. You should write code that is as clear as possible, and add comments only if you absolutely have to write something that isn't self-explanatory.
You should remember and use Perl idioms, and try to forget most C that you knew. For instance, Perl accepts the "here document" syntax, and it is common practice to use or and and as short-circuit operators. Your parameter check becomes
@ARGV or die <<END;
ERROR: Incorrect syntax...
Usage: perl get_file_exts.plx <Directory>
END
Perl allows for clear but concise programming. This is how I would have written your wanted subroutine
sub process_file {
    return unless -f and /(\.[^.]+)$/;
    my $ext = $1;
    foreach my $index (0 .. $#exts) {
        return if $exts[$index] eq $ext;
    }
    push @exts, $ext;
}
Use exists on $exts[xx] before accessing it.
exists is deprecated though, as @chrsblck pointed out:
Be aware that calling exists on array values is deprecated and likely
to be removed in a future version of Perl.
But you should be able to check that it exists (and is not 0 or "") simply with:
if($exts[$index] && $exts[$index] eq $ext){
...
}

Perl search is only showing last result

I have two arrays, one with search terms and another which is multiple lines fetched from a file. I have a nested foreach statement and am searching for all combinations, but only the very last match is showing even though I know for a fact that there are many other matches! I have tried many different versions of the code but here is my last one:
open (MYFILE, 'searchTerms.txt');
open (MYFILE2, 'fileToSearchIn.xml');
@searchTerms = <MYFILE>;
@xml = <MYFILE2>;
close(MYFILE2);
close(MYFILE);
$results = "";
foreach $searchIn (@xml)
{
    foreach $searchFor (@searchTerms)
    {
        #print "searching for $searchFor in: $searchIn\n";
        if ($searchIn =~ m/$searchFor/)
        {
            $temp = "found in $searchIn \n while searching for: $searchFor ";
            $results = $results.$temp."\n";
            $temp = "";
        }
    }
}
print $results;
You should always use strict and use warnings at the start of your program, and declare all variables at the point of their first use using my. This applies especially when you are asking for help with your code as this measure can quickly reveal many simple mistakes.
As Raze2dust has said, it is important to remember that lines read from a file will have a trailing newline "\n" character. If you were checking for exact matches between a pair of lines then this wouldn't matter, but since it's not working for you I assume the strings in searchTerms.txt can appear anywhere in the lines of fileToSearchIn.xml. That means you need to chomp the strings from searchTerms.txt; lines from the other file can stay as they are.
Things like this are made a lot easier by using the File::Slurp module. It does all the file handling for you and will chomp any newlines from the input text if you ask.
I have changed your program to use this module so that you can see how it works.
use strict;
use warnings;
use File::Slurp;
my @searchTerms = read_file('searchTerms.txt', chomp => 1);
my @xml = read_file('fileToSearchIn.xml');
my @results;
foreach my $searchIn (@xml) {
    foreach my $searchFor (@searchTerms) {
        if ($searchIn =~ m/$searchFor/) {
            push @results, qq/Found in "$searchIn"\n while searching for "$searchFor"/;
        }
    }
}
print "$_\n" for @results;
Chomp your inputs to remove newline characters:
open (MYFILE, 'searchTerms.txt');
open (MYFILE2, 'fileToSearchIn.xml');
@searchTerms = <MYFILE>;
@xml = <MYFILE2>;
close(MYFILE2);
close(MYFILE);
$results = "";
foreach $searchIn (@xml)
{
    chomp($searchIn);
    foreach $searchFor (@searchTerms)
    {
        chomp($searchFor);
        #print "searching for $searchFor in: $searchIn\n";
        if ($searchIn =~ m/$searchFor/)
        {
            $temp = "found in $searchIn \n while searching for: $searchFor ";
            $results = $results.$temp."\n";
            $temp = "";
        }
    }
}
print $results;
Basically, you think you are searching for 'a', but the pattern is actually 'a\n', because that is how the line is read unless you use chomp. It matches only if 'a' is the last character of the line, because only then is it followed by a newline.
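The effect is easy to reproduce with in-memory data. In this sketch the contents of @searchTerms and @xml are invented, and the two hit arrays exist only to compare the unchomped and chomped patterns side by side:

```perl
use strict;
use warnings;

# Hypothetical data, as if read line by line from the two files.
my @searchTerms = ("foo\n", "bar\n");
my @xml         = ("<a>foo</a>\n", "<b>bar\n", "<c>baz</c>\n");

my (@raw_hits, @chomped_hits);
for my $line (@xml) {
    for my $term (@searchTerms) {
        # Unchomped: the pattern still ends in a newline, so the term
        # must be the last thing on the line to match.
        push @raw_hits, $line if $line =~ /$term/;

        # Chomped copy: the term can match anywhere in the line.
        (my $clean = $term) =~ s/\r?\n\z//;
        push @chomped_hits, $line if $line =~ /\Q$clean\E/;
    }
}

printf "unchomped: %d hit(s), chomped: %d hit(s)\n",
    scalar @raw_hits, scalar @chomped_hits;
```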