How to prematurely detect whether it's the last iteration during a while loop in Perl

Similar question (does not solve my problem): Is it possible to detect if the current while loop iteration is the last in perl?
The post above has an answer that detects the last iteration only when reading from a file.
In a while loop, is it possible to detect if the current iteration is the last one from a mysql query?
while( my($id, $name, $email) = $sth->fetchrow_array() )
{
if(this_is_last_iteration)
{
print "last iteration";
}
}

my $next_row = $sth->fetch();
while (my $row = $next_row) {
my ($id, $name, $email) = @$row;
$next_row = $sth->fetch();
if (!$next_row) {
print "last iteration";
}
...
}
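The look-ahead idea above can be tried without a database. This is a minimal sketch, assuming a hypothetical make_fetcher closure that stands in for $sth->fetch (it returns one array reference per call, then undef); the helper names and sample rows are made up for illustration.

```perl
use strict;
use warnings;

# Stand-in for a DBI statement handle: each call returns the next row
# as an array ref, then undef once the rows run out.
sub make_fetcher {
    my @rows = @_;
    return sub { return shift @rows };
}

# Walk the rows with one row of look-ahead and tag the final one.
sub label_rows {
    my ($fetch) = @_;
    my @labels;
    my $next_row = $fetch->();
    while (my $row = $next_row) {
        $next_row = $fetch->();          # peek at the following row
        my ($id, $name) = @$row;
        push @labels, defined $next_row
            ? "$id $name"
            : "$id $name (last iteration)";
    }
    return @labels;
}

my $fetch = make_fetcher([1, 'ann'], [2, 'bob'], [3, 'cy']);
print "$_\n" for label_rows($fetch);
```

Because the next row is fetched before the current one is processed, the loop body always knows whether another row follows, at the cost of holding two rows at once.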

You'll need to verify this compiles, but a rough outline is:
my($rows) = $sth->rows;
my($i) = 0;
while( my($id, $name, $email) = $sth->fetchrow_array() )
{
$i++;
if ($i == $rows)
{
print "last iteration";
}
}
If you give us some more context, there may be other options. For example, your print statement is the last thing in the while loop. If that matches reality, you could simply move this after the loop and do away with the counter.
A couple of commenters have correctly noted that $sth->rows does not always return the correct value for a SELECT statement. If you're using SELECT (which seems likely from your code), this could be an issue. You could run a COUNT query before the SELECT to get the number of rows, provided the data set does not change between the COUNT and the SELECT.

Although you can count rows to tell when you are on the last result from an SQL query, no, in the general case it is not possible to know in advance whether you're on the last iteration of a while loop.
Consider the following example:
while (rand() > 0.05) {
say "Is this the last iteration?";
}
There is no way to predict in advance what rand() will return, thus the code within the loop has no way of knowing whether it will iterate again until the next iteration starts.

You can keep a counter and compare it to the array length. I'm not familiar with Perl, but that's how I would do it in any other language.

Related

Dynamic Array Inside a Foreach Loop

First time poster and new to Perl so I'm a little stuck. I'm iterating through a collection of long file names with columns separated by variable amounts of whitespace for example:
0 19933 12/18/2013 18:00:12 filename1.someextention
1 11912 12/17/2013 18:00:12 filename2.someextention
2 19236 12/16/2013 18:00:12 filename3.someextention
These are generated by multiple servers so I am iterating through multiple collections. That mechanism is simple enough.
I'm focused solely on the date column and need to ensure the date is changing like the above example as that ensures the file is being created on a daily basis and only once. If the file is created more than once per day I need to do something like send an email to myself and move on to the next server collection. If the date changes from the first file to the second exit the loop as well.
My issue is I don't know how to keep the date element of the first file stored so that I can compare it to the next file's date going through the loop. I thought about keeping the element stored in an array inside the loop until the current collection is finished and then move onto the next collection but I don't know the correct way of doing so. Any help would be greatly appreciated. Also, if there is a more eloquent way please enlighten me since I am willing to learn and not just wanting someone to write my script for me.
@file = `command -h server -secFilePath $secFilePath analyzer -archive -list`;
@array = reverse(@file); # The output from the above command lists the oldest file first
foreach $item (@array) {
@first = split (/ +/, @item);
@firstvar = @first[2];
#if there is a way to save the first date in the @firstvar array and keep it until the date changes
if @firstvar == @first[2] { # This part isnt designed correctly I know. }
elsif @firstvar ne @first[2]; {
last;
}
}
One common technique is to use a hash, which is a data structure mapping key-value pairs. If you key by date, you can check if a given date has been encountered before.
If a date hasn't been encountered, it has no key in the hash.
If a date has been encountered, we insert 1 under that key to mark it.
my %dates;
foreach my $line (@array) {
my ($idx, $id, $date, $time, $filename) = split(/\s+/, $line);
if ($dates{$date}) {
#handle duplicate
} else {
$dates{$date} = 1;
#...
#code here will be executed only if the entry's date is unique
}
#...
#code here will be executed for each entry
}
Note that this will check each date against each other date. If for some reason you only want to check if two adjacent dates match, you could just cache the last $date and check against that.
In comments, OP mentioned they might rather perform that second check I mentioned. It's similar. Might look like this:
# We declare the variable OUTSIDE of the loop;
# it needs to be there so that it stays in scope between iterations.
my $last_date;
foreach my $line (@array) {
my ($idx, $id, $date, $time, $filename) = split(/\s+/, $line);
if (defined $last_date and $date eq $last_date) { # 'eq' is for string comparison; the defined check covers the first line
#handle duplicate
} else {
$last_date = $date;
#...
#code here will be executed only if the entry's date is unique
}
#...
#code here will be executed for each entry
}
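The adjacent-date check can be tried on its own. Here is a minimal sketch, assuming the listing format from the question; the sub name first_repeated_date is hypothetical, and the third sample line has been altered to repeat a date so that there is something to detect.

```perl
use strict;
use warnings;

# Return the first date that appears on two consecutive lines,
# or undef if every line has a different date from the one before.
sub first_repeated_date {
    my @lines = @_;
    my $last_date;
    for my $line (@lines) {
        # Columns: index, id, date, time, filename (whitespace separated).
        my (undef, undef, $date) = split ' ', $line;
        return $date if defined $last_date and $date eq $last_date;
        $last_date = $date;
    }
    return;
}

my @listing = (
    '0   19933  12/18/2013 18:00:12 filename1.someextention',
    '1 11912 12/17/2013    18:00:12 filename2.someextention',
    '2 19236 12/17/2013 18:00:12 filename3.someextention',
);
my $dup = first_repeated_date(@listing);
print defined $dup ? "created twice on $dup\n" : "one file per day\n";
```

Splitting on ' ' (a single space in quotes) is a special case in Perl that splits on any run of whitespace and ignores leading whitespace, which suits columns separated by variable amounts of space.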

Perl - Data comparison taking huge time

open(INFILE1,"INPUT.txt");
my $modfile = 'Data.txt';
open MODIFIED,'>',$modfile or die "Could not open $modfile : $!";
for (;;) {
my $line1 = <INFILE1>;
last if not defined $line1;
my $line2 = <INFILE1>;
last if not defined $line2;
my ($tablename1, $colname1,$sql1) = split(/\t/, $line1);
my ($tablename2, $colname2,$sql2) = split(/\t/, $line2);
if ($tablename1 eq $tablename2)
{
my $sth1 = $dbh->prepare($sql1);
$sth1->execute;
my $hash_ref1 = $sth1->fetchall_hashref('KEY');
my $sth2 = $dbh->prepare($sql2);
$sth2->execute;
my $hash_ref2 = $sth2->fetchall_hashref('KEY');
my @fieldname = split(/,/, $colname1);
my $colcnt=0;
my $rowcnt=0;
foreach $key1 ( keys(%{$hash_ref1}) )
{
foreach (@fieldname)
{
$colname =$_;
my $strvalue1='';
@val1 = $hash_ref1->{$key1}->{$colname};
if (defined @val1)
{
my @filtered = grep /@val1/, @metadata;
my $strvalue1 = substr(@filtered[0],index(@filtered[0],'||') + 2);
}
my $strvalue2='';
@val2 = $hash_ref2->{$key1}->{$colname};
if (defined @val2)
{
my @filtered = grep /@val2/, @metadata2;
my $strvalue2 = substr(@filtered[0],index(@filtered[0],'||') + 2);
}
if ($strvalue1 ne $strvalue2 )
{
$colcnt = $colcnt + 1;
print MODIFIED "$tablename1\t$colname\t$strvalue1\t$strvalue2\n";
}
}
}
if ($colcnt>0)
{
print "modified count is $colcnt\n";
}
%$hash_ref1 = ();
%$hash_ref2 = ();
}
The program reads an input file in which every line contains three strings separated by tabs. The first is the table name, the second is all the column names separated by commas, and the third contains the SQL to be run. As this utility compares data, there are two rows for every table name, one for each DB. So data needs to be picked from each respective DB and then compared column by column.
The SQL returns an ID in the result set, and if the value comes from the DB then it needs to be translated to a string by reading from an array (that array contains 100K records with key and value separated by ||).
Now I ran this for one set of tables which contains 18K records in each DB. There are 8 columns picked from the DB in each SQL. So for every record out of 18K, and then for every field in that record, i.e. 8, this script is taking a lot of time.
My question is whether someone can look and see if it can be improved so that it takes less time.
File contents sample
INPUT.TXT
TABLENAME COL1,COL2 select COL1,COL2 from TABLENAME where ......
TABLENAMEB COL1,COL2 select COL1,COL2 from TABLENAMEB where ......
Metadata array contains something like this(there are two i.e. for each db)
111||Code 1
222||Code 2
Please suggest
Your code does look a bit unusual, and could gain clarity from using subroutines vs. just using loops and conditionals. Here are a few other suggestions.
The excerpt
for (;;) {
my $line1 = <INFILE1>;
last if not defined $line1;
my $line2 = <INFILE1>;
last if not defined $line2;
...;
}
is overly complicated: Not everyone knows the C-ish for(;;) idiom. You have lots of code duplication. And aren't you actually saying loop while I can read two lines?
while (defined(my $line1 = <INFILE1>) and defined(my $line2 = <INFILE1>)) {
...;
}
Yes, that line is longer, but I think it's a bit more self-documenting.
Instead of doing
if ($tablename1 eq $tablename2) { the rest of the loop }
you could say
next unless $tablename1 eq $tablename2;
the rest of the loop;
and save a level of indentation. Better indentation means better readability, which makes it easier to write good code. And better code might perform better.
What are you doing at foreach $key1 (keys ...) — something tells me you didn't use strict! (Just a hint: lexical variables with my can perform slightly better than global variables)
Also, doing $colname = $_ inside a for-loop is a dumb thing, for the same reason.
for my $key1 (keys ...) {
...;
for my $colname (@fieldname) { ... }
}
my $strvalue1='';
@val1 = $hash_ref1->{$key1}->{$colname};
if (defined @val1)
{
my @filtered = grep /@val1/, @metadata;
my $strvalue1 = substr(@filtered[0],index(@filtered[0],'||') + 2);
}
I don't think this does what you think it does.
From the $hash_ref1 you retrieve a single element, then assign that element to an array (a collection of multiple values).
Then you call defined on this array. An array cannot be undefined, and what you are doing is quite deprecated. Calling the defined function on a collection returns info about the memory management, but does not indicate ① whether the array is empty or ② whether the first element in that array is defined.
Interpolating an array into a regex isn't likely to be useful: The elements of the array are joined with the value of $" (a single space by default), and the resulting string is treated as a regex. This will wreak havoc if there are metacharacters present.
When you only need the first value of a list, you can force list context, but assign to a single scalar like
my ($filtered) = produce_a_list;
This frees you from weird subscripts you don't need and that only slow you down.
Then you assign to a $strvalue1 variable you just declared. This shadows the outer $strvalue1. They are not the same variable. So after the if branch, you still have the empty string in $strvalue1.
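The shadowing effect described above is easy to reproduce in isolation. A minimal sketch, with variable names chosen only for illustration:

```perl
use strict;
use warnings;

my $value = '';              # outer variable
if (1) {
    my $value = 'inner';     # "my" creates a NEW variable that hides the outer one
}
print "after shadowing: '$value'\n";     # the outer variable is untouched

my $fixed = '';
if (1) {
    $fixed = 'assigned';     # no "my": this assigns to the outer variable
}
print "after assignment: '$fixed'\n";
```

Perl does not warn here, because a my declaration only triggers the "masks earlier declaration" warning when it repeats a name within the same scope.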
I would write this code like
my $val1 = $hash_ref1->{$key1}{$colname};
my $strvalue1 = defined $val1
? do {
my ($filtered) = grep /\Q$val1/, @metadata;
substr $filtered, 2 + index $filtered, '||'
} : '';
But this would be even cheaper if you pre-split @metadata into pairs and test for equality with the correct field. This would remove some of the bugs that are still lurking in that code.
$x = $x + 1 is commonly written $x++.
Emptying the hashrefs at the end of the iteration is unnecessary: the hashrefs are assigned a new value at the next iteration of the loop. Also, it is unnecessary to assist Perl's garbage collection for such simple tasks.
About the metadata: 100K records is a lot, so either put it in a database itself, or at the very least a hash. Especially for so many records, using a hash is a lot faster than looping through all entries and using slow regexes … aargh!
Create the hash from the file, once at the beginning of the program
my %metadata;
while (<METADATA>) {
chomp;
my ($key, $value) = split /\|\|/;
$metadata{$key} = $value; # assumes each key only has one value
}
Simply look up the key inside the loop
my $strvalue1 = defined $val1 ? $metadata{$val1} // '' : '';
That should be so much faster.
(Oh, and please consider using better names for variables. $strvalue1 doesn't tell me anything, except that it is a stringy value (d'oh). $val1 is even worse.)
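To make the hash-based lookup concrete, here is a minimal sketch that builds the table from in-memory records instead of a METADATA file handle. The sub names build_lookup and translate are hypothetical, and the two sample records are the ones shown in the question. The defined-or operator (//) needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Build the lookup table once from "key||value" records.
sub build_lookup {
    my %lookup;
    for my $record (@_) {
        my ($key, $value) = split /\|\|/, $record, 2;
        $lookup{$key} = $value;   # assumes each key has only one value
    }
    return \%lookup;
}

# Translate an ID to its label; undefined or unknown IDs become ''.
sub translate {
    my ($lookup, $id) = @_;
    return '' unless defined $id;
    return $lookup->{$id} // '';
}

my $metadata = build_lookup('111||Code 1', '222||Code 2');
print translate($metadata, 222), "\n";
```

Each translation is then a single hash access rather than a regex scan over every metadata record, which is where most of the time in the original loop is likely to go.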
This is not really an answer but it won't really fit well in a comment either so, until you provide some more information, here are some observations.
Inside your inner for loop, there is:
@val1 = $hash_ref1->{$key1}->{$colname};
Did you mean @val1 = @{ $hash_ref1->{$key1}->{$colname} };?
Later, you check if (defined @val1). What did you really want to check? As perldoc -f defined points out:
Use of "defined" on aggregates (hashes and arrays) is
deprecated. It used to report whether memory for that aggregate
had ever been allocated. This behavior may disappear in future
versions of Perl. You should instead use a simple test for size:
In your case, if (defined @val1) will always be true.
Then, you have my @filtered = grep /@val1/, @metadata; Where did @metadata come from? What did you actually intend to check?
Then you have my $strvalue1 = substr(@filtered[0],index(@filtered[0],'||') + 2);
There is some interesting stuff going on in there.
You will need to verbalize what you are actually trying to do.
I strongly suspect there is a single SQL query you can run that will give you what you want but we first need to know what you want.

Perl need the right grep operator to match value of variable

I want to see if I have repeated items in my array. There are over 16,000, so I will automate it.
There may be other ways but I started with this and, well, would like to finish it unless there is a straightforward command. What I am doing is shifting and pushing from one array into another and this way, check the destination array to see if it is "in array" (like there is such a command in PHP).
So, I got this sub routine and it works with literals, but it doesn't with variables. It is because of the 'eq' or whatever I should need. The 'sourcefile' will contain one or more of the words of the destination array.
# Here I just fetch my file
$listamails = <STDIN>;
# Remove the newlines filename
chomp $listamails;
# open the file, or exit
unless ( open(MAILS, $listamails) ) {
print "Cannot open file \"$listamails\"\n\n";
exit;
}
# Read the list of mails from the file, and store it
# into the array variable @sourcefile
@sourcefile = <MAILS>;
# Close the handle - we've read all the data into #sourcefile now.
close MAILS;
my @destination = ('hi', 'bye');
sub in_array
{
my ($destination,$search_for) = @_;
return grep {$search_for eq $_} @$destination;
}
for($i = 0; $i <=100; $i ++)
{
$elemento = shift @sourcefile;
if(in_array(\@destination, $elemento))
{
print "it is";
}
else
{
print "it aint there";
}
}
Well, if instead of including the $elemento in there I put a 'hi' it does work and also I have printed the value of $elemento which is also 'hi', but when I put the variable, it does not work, and that is because of the 'eq', but I don't know what else to put. If I put == it complains that 'hi' is not a numeric value.
When you want distinct values think hash.
my %seen;
@seen{ @array } = ();
if (keys %seen == @array) {
print "\@array has no duplicate values\n";
}
It's not clear what you want. If your first sentence is the only one that matters ("I want to see if I have repeated items in my array"), then you could use:
my %seen;
if (grep ++$seen{$_} >= 2, @array) {
say "Has duplicates";
}
You said you have a large array, so it might be faster to stop as soon as you find a duplicate.
my %seen;
for (@array) {
if (++$seen{$_} == 2) {
say "Has duplicates";
last;
}
}
By the way, when looking for duplicates in a large number of items, a strategy based on sorting is much faster than comparing every item against every other item. After sorting the items, all duplicates will be right next to each other, so to tell if something is a duplicate, all you have to do is compare it with the previous one:
@sorted = sort @sourcefile;
for (my $i = 1; $i < @sorted; ++$i) { # Start at 1 because we'll check the previous one
print "$sorted[$i] is a duplicate!\n" if $sorted[$i] eq $sorted[$i - 1];
}
This will print multiple dupe messages if there are multiple dupes, but you can clean it up.
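One possible clean-up is to report each duplicated value only once. This is a minimal sketch of that idea, not the answerer's own code; the sub name duplicates_sorted and the sample list are made up for illustration.

```perl
use strict;
use warnings;

# Return each value that occurs more than once, reported a single time.
sub duplicates_sorted {
    my @sorted = sort @_;
    my @dupes;
    for my $i (1 .. $#sorted) {
        # Only interested in items equal to their predecessor.
        next unless $sorted[$i] eq $sorted[$i - 1];
        # Skip values already recorded; they are adjacent after sorting.
        push @dupes, $sorted[$i]
            unless @dupes and $dupes[-1] eq $sorted[$i];
    }
    return @dupes;
}

print join(', ', duplicates_sorted(qw(b a c a b a))), "\n";
```

Because equal values sit next to each other after sorting, checking only the most recently recorded duplicate is enough to avoid repeats.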
As eugene y said, hashes are definitely the way to go here. Here's a direct translation of the code you posted to a hash-based method (with a little more Perlishness added along the way):
my @destination = ('hi', 'bye');
my %in_array = map { $_ => 1 } @destination;
for my $i (0 .. 100) {
$elemento = shift @sourcefile;
if(exists $in_array{$elemento})
{
print "it is";
}
else
{
print "it aint there";
}
}
Also, if you mean to check all elements of @sourcefile (as opposed to testing the first 101 elements) against @destination, you should replace the for line with
while (@sourcefile) {
Also also, don't forget to chomp any values read from a file! Lines read from a file have a linebreak at the end of them (the \r\n or \n mentioned in comments on the initial question), which will cause both eq and hash lookups to report that otherwise-matching values are different. This is, most likely, the reason why your code is failing to work correctly in the first place and changing to use sort or hashes won't fix that. First chomp your input to make it work, then use sort or hashes to make it efficient.
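The effect of the trailing newline can be seen in a few lines. A minimal sketch, assuming a line that reads "hi" followed by a newline, as a file read would return it:

```perl
use strict;
use warnings;

my %in_array = map { $_ => 1 } ('hi', 'bye');

my $line = "hi\n";     # what <MAILS> would return for a line containing "hi"
my $found_before = exists $in_array{$line} ? 1 : 0;

chomp $line;           # remove the trailing newline
my $found_after = exists $in_array{$line} ? 1 : 0;

print "before chomp: $found_before, after chomp: $found_after\n";
```

Note that chomp removes only the value of $/ (a newline by default), so a file with Windows line endings read on a Unix system may still leave a trailing \r; a substitution such as s/\r?\n\z// is one way to strip both.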

How to get rid of this infinite loop in my code? (Making a number guessing game program)

I have to create a number guessing game program and have written the code but after inputting my first number guess, the output turns into an infinite loop and keeps repeating forever, so I am forced to shut down my program. It seems to be an error with my "{}" but I can't figure out where the error is. I have to let the user guess 8 different times, and am stuck on the 1st guess result because it keeps repeating. Here is my code:
print "Welcome to the Perl Whole Number Guessing Game!\n Please enter a number between 1 and 100 and I will tell you if the number you're trying to guess is higher or lower than your guess. You have up to 8 chances to guess the number.\n";
my ($guess, $target, $counter); #my variables
$target = (int rand 100) +1; #must be between 1-100
$counter =1;
#1st guess:
print "Enter guess #$counter:";
chomp ($guess = <>);
while ($guess != $target)
{ if ($guess < $target)
{
print "Your guess, $guess, is too low. Try again.\n ";
}
else
{ print "Your guess, $guess, is too high. Try again.\n ";
}}
until ($guess ==$target)
{
print "Congratulations! You guessed the secret number ($target) in $counter tries!\n";
}
$counter ++;
...Then that exact code repeats 8 times until the last bit says this:
#8th and final guess:
print "Enter guess #$counter:";
chomp ($guess = <>);
while ($guess != $target)
{ if ($guess < $target)
{print "I'm sorry, you didn't guess the secret number, which was $target.\n";
}
else
{print "I'm sorry, you didn't guess the secret number, which was $target.\n";
}}
until ($guess ==$target)
{ print "Congratulations! You guessed the secret number ($target) in $counter tries!\n";
}
I just want to be able to have the code ask me to guess again all 8 times, without having the very first guess go on an infinite loop.
NOTE: I am in a beginning beginning Perl programming class and can't use any fancy, difficult code, basically the simplest solution is best and really the only thing I can kinda understand.
Thanks for any help!
Repeating code is something you should basically never do, except perhaps in very simple cases. Since you have a specific count, just use a for loop with a counter.
You should also always use use warnings; use strict;, because it will help you avoid simple mistakes and give you informative errors.
Also, your if statement will not detect correct guesses. You will need to use elsif (yes, no "e") to also check if the number is too high.
my $guesses = 8;
for my $counter (1 .. $guesses) {
print "Enter guess #$counter:";
chomp (my $guess = <>);
if ($guess < $target) {
print "Your guess, $guess, is too low. Try again.\n ";
} elsif ($guess > $target) {
print "Your guess, $guess, is too high. Try again.\n ";
} else {
print "Congratulations! You guessed the secret number ($target) in $counter tries!\n";
last; # stop asking once the number has been guessed
}
}
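The comparison logic can also be separated from the input handling, which makes it easy to check without typing guesses. This is a rough sketch under that assumption: the subs judge and play are hypothetical, and the guesses are passed in as a list rather than read from the keyboard.

```perl
use strict;
use warnings;

# Compare one guess with the target.
sub judge {
    my ($guess, $target) = @_;
    return $guess < $target ? 'too low'
         : $guess > $target ? 'too high'
         :                    'correct';
}

# Play with a prepared list of guesses. Returns the number of tries
# used, or 0 if the target was not found within $max_tries.
sub play {
    my ($target, $max_tries, @guesses) = @_;
    for my $counter (1 .. $max_tries) {
        my $guess = shift @guesses;
        last unless defined $guess;      # ran out of guesses
        return $counter if judge($guess, $target) eq 'correct';
    }
    return 0;
}

print play(34, 8, 50, 25, 34), "\n";
```

In the real game, the shift would be replaced by reading a line from the user inside the same for loop.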
You need to put your increment inside your while, first of all.
Repeating code is a terrible way of making a program. That's what loops are for.
You just need to break out of the loop after the 8th try.
Perl is not my language of choice, but in most languages, syntax like this will work:
while (($guess != $target) && ($counter <= 8))
{...}
while ($guess != $target)
means WHILE the user's guess is NOT the random number the computer has "thought of"...
DO what is enclosed in the brackets...
WHICH IS : to print a message (whether the guess was "low", "high" or correct)
Now, think about it... Let's say WE play this game... and I'm the one which has to guess YOUR number... Let's simulate it...
You think of a number (let's say : target = 34)
I'm making a guess (let's say : 45)
WHILE my guess is not correct, tell me "it's too high"
So, you get my point : the rest of the code will NEVER be executed. As you will simply keep telling me that I'm wrong, without letting make another guess...
So, why don't you do... something with that evil... WHILE?
I don't know perl, but logically you could put the whole thing in one loop
While (guess != target && wrongAnswer < 8)
{
}
If the answer is not equal to the target increase the wrongAnswer counter.
As the other posters have pointed out, you need to increment somewhere inside your while loop. Also, keep in mind that you don't need the following code block:
until ($guess ==$target) {
print "Congratulations! You guessed the secret number ($target) in $counter tries!\n";
}
If you reach this point in your code, the condition that $guess is equal to $target will already have been satisfied. So, checking that again is redundant. You can skip the until block and go right to the print statement after the while loop.
The most important remaining issue is that you're only asking for input once, before the while block begins. So, during the while block the user never gets the chance to guess a second time. The guess never changes and the loop will go on forever. To avoid code repetition, you could put the guess code into a subroutine:
sub guess {
print "Enter guess #$counter:";
chomp( $guess = <> );
}
Then call this subroutine once before you get to the while loop:
guess();
Then once just before the end of your while block:
guess();
That will reseed the $guess variable before the next pass at the while loop, which will break you out of your infinite loop. I would have given you an entire code sample, but as this is tagged as "homework", that would rob you of the benefit of working over these issues on your own. :)

Best way to prevent output of a duplicate item in Perl in realtime during a loop

I see a lot of 'related' questions showing up, but none I looked at answer this specific scenario.
During a while/for loop that parses a result set generated from a SQL select statement, what is the best way to prevent the next line from being outputted if the line before it contains the same field data (whether it be the 1st field or the xth field)?
For example, if two rows were:
('EML-E','jsmith@mail.com','John','Smith')
('EML-E','jsmith2@mail.com','John','Smith')
What is the best way to print only the first row based on the fact that 'EML-E' is the same in both rows?
Right now, I'm doing this:
Storing the first field (specific to my scenario) in a 2-element array (dupecatch[1])
Checking whether dupecatch[0] equals dupecatch[1] (duplicate - escape loop using 's')
After the row is processed, setting dupecatch[0] = dupecatch[1]
while ($DBS->SQLFetch() == *PLibdata::RET_OK)
{
$s=0; #s = 1 to escape out of inside loop
while ($i != $array_len and $s==0)
{
$rowfetch = $DBS->{Row}->GetCharValue($array_col[$i]);
if($i==0){$dupecatch[1] = $rowfetch;} #dupecatch prevents duplicate primary key field entries
if($dupecatch[0] ne $dupecatch[1])
{
dosomething($rowfetch);
}
else{$s++;}
$i++;
}
$i=0;
$dupecatch[0]=$dupecatch[1];
}
That is the standard way if you only care about consecutive duplicate items, but $dupecatch[0] is normally named $old and $dupecatch[1] is normally just the variable in question. You can tell the array is not a good fit because you only ever refer to its indices.
If you want to avoid all duplicates you can use a %seen hash:
my %seen;
while (defined (my $row = get_data())) {
next if $seen{$row->[0]}++; #skip all but the first instance of the key
do_stuff();
}
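The %seen idiom can be tried on the sample rows from the question. A minimal sketch, assuming the rows are already in memory as array references with the addresses written using @; the sub name first_per_key and the third row are made up for illustration.

```perl
use strict;
use warnings;

# Keep only the first row seen for each value of the first column.
sub first_per_key {
    my %seen;
    # $seen{key}++ is 0 (false) the first time, so !... keeps that row.
    return grep { !$seen{ $_->[0] }++ } @_;
}

my @rows = (
    ['EML-E', 'jsmith@mail.com',  'John', 'Smith'],
    ['EML-E', 'jsmith2@mail.com', 'John', 'Smith'],
    ['EML-F', 'adoe@mail.com',    'Ann',  'Doe'],    # hypothetical extra row
);
print join(',', @$_), "\n" for first_per_key(@rows);
```

Unlike the previous-value comparison, this also catches duplicates that are not adjacent, at the cost of keeping one hash entry per distinct key.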
I suggest using DISTINCT in your SQL statement. That's probably by far the easiest fix.