I am writing a small perl script, primarily to learn the language. Basically it has an action dispatch table. Based on the user input any one of the target actions will be invoked. Each action is a small, independent utility function (say print time), working on which allows me to explore and learn different aspects of perl.
I have run into a problem with the dispatch mechanism. The script runs in a continuous loop, each time taking a user request for an action. This input is compared against the regular expressions of each available action. If there is a match, that action is executed and it breaks out of the match loop to read user's next request. The problem I am facing is that, if I request for the same action twice, it is not matching the second time. If I print the dispatch table immediately after the match, the matched action entry seems to be missing. If I continuously request for same action, it works only on alternate invocations. If I avoid the "last LABEL", it works without any issues.
Perl version is 5.12.4 (on Fedora 15, 32 bit). Below is a simplified but complete example. I am still a beginner in perl. Please excuse if it doesn't meet the standards of a monk :) Kindly help in figuring out the issue with this code. Your help is much appreciated.
#!/bin/env perl
use strict ;
use warnings ;
use Text::ParseWords ;
my #Actions ;
my $Quit ;
sub main
{
# Dispatch table
# Each row has [syntax, help, {RegExp1 => Function1, RegExp2 => Function2,...}]
# There can be multiple RegExps depending on optional arguments in user input
#Actions =
(
['first <value>', 'Print first', {'first (.*)' => \&PrintFirst} ],
['second <value>', 'Print second', {'second (.*)' => \&PrintSecond} ],
['quit', 'Exits the script', {'quit' => \&Quit} ]
) ;
my $CommandLine ;
my #Captures ;
my $RegEx ;
my $Function ;
while(!$Quit)
{
# Get user input, repeat until there is some entry
while(!$CommandLine)
{
print "\$ " ;
my $argline = <STDIN> ;
my #args = shellwords($argline) ;
$CommandLine = join (" ", grep { length() } #args) ;
}
# Find and execute the action matching given input
# For each entry in the #Actions array
ExecAction: foreach my $Action (#Actions)
{
# For each RegExp for that action (only 1 per action in this example)
while (($RegEx, $Function) = each %{#$Action[2]})
{
if (#Captures = $CommandLine =~ m/^$RegEx$/i)
{
print "Match : $RegEx\n" ;
&$Function(#Captures) ;
last ExecAction ; # Works if this line is commented
}
}
}
$CommandLine = undef ;
}
}
sub PrintFirst { print "first $_[0]\n" ; }
sub PrintSecond { print "second $_[0]\n" ; }
sub Quit { $Quit = 1 ; }
main ;
You have stumbled across some subtle behavior of the each operator. By breaking out of the loop (with last ExecAction) before the each operator has exhausted the key-value pairs of %{#$Action[2]}, the next call to each %{#$Action[2]} will attempt to retrieve a different key-value pair. Since there isn't one (there is only one key-value pair defined for each element of the #Action data structure), each returns an empty list, and the contents of the while loop are skipped.
The simplest workaround is to reset the "iterator" associated with each before each use, say, by calling keys:
keys %{#$Action[2]};
while (($RegEx, $Function) = each %{#$Action[2]})
{
...
}
I think explicitly copying the hash to a temporary variable would work, too:
my %action_hash = %{#$Action[2]};
while (($RegEx, $Function) = each %action_hash) {
...
}
You need to reset the hash's iterator if you're going to break out of an each loop
$ perl -E'
%h=(a=>4,b=>5,c=>6,d=>7);
while (my ($k, $v) = each %h) { last if ++$i == 2 }
keys %h if $ARGV[0];
while (my ($k, $v) = each %h) { say "$k:$v"; }
' 0
b:5
d:7
$ perl -E'
%h=(a=>4,b=>5,c=>6,d=>7);
while (my ($k, $v) = each %h) { last if ++$i == 2 }
keys %h if $ARGV[0];
while (my ($k, $v) = each %h) { say "$k:$v"; }
' 1
c:6
a:4
b:5
d:7
each, keys
Related
I'm just starting in Perl and I'm quite enjoying it. I'm writing some basic functions, but what I really want to be able to do is to use those functions intelligently using console commands. For example, say I have a function adding two numbers. I'd want to be able to type in console "add 2, 4" and read the first word, then pass the two numbers as parameters in an "add" function. Essentially, I'm asking for help in creating some basic scripting using Perl ^^'.
I have some vague ideas about how I might do this in VB, but Perl, I have no idea where I'd start, or what functions would be useful to me. Is there something like VB.net's "Split" function where you can break down the contents of a scalar into an array? Is there a simple way to analyse one word at a time in a scalar, or iterate through a scalar until you hit a separator, for example?
I hope you can help, any suggestions are appreciated! Bear in mind, I'm no expert, I started Perl all of a few weeks ago, and I've only been doing VB.net half a year.
Thank you!
Edit: If you're not sure what to suggest and you know any simple/intuitive resources that might be of help, that would also be appreciated.
Its rather easy to make a script which dispatches to a command by name. Here is a simple example:
#!/usr/bin/env perl
use strict;
use warnings;
# take the command name off the #ARGV stack
my $command_name = shift;
# get a reference to the subroutine by name
my $command = __PACKAGE__->can($command_name) || die "Unknown command: $command_name\n";
# execute the command, using the rest of #ARGV as arguments
# and print the return with a trailing newline
print $command->(#ARGV);
print "\n";
sub add {
my ($x, $y) = #_;
return $x + $y;
}
sub subtract {
my ($x, $y) = #_;
return $x - $y;
}
This script (say its named myscript.pl) can be called like
$ ./myscript.pl add 2 3
or
$ ./myscript.pl subtract 2 3
Once you have played with that for a while, you might want to take it further and use a framework for this kind of thing. There are several available, like App::Cmd or you can take the logic shown above and modularize as you see fit.
You want to parse command line arguments. A space serves as the delimiter, so just do a ./add.pl 2 3 Something like this:
$num1=$ARGV[0];
$num2=$ARGV[1];
print $num1 + $num2;
will print 5
Here is a short implementation of a simple scripting language.
Each statement is exactly one line long, and has the following structure:
Statement = [<Var> =] <Command> [<Arg> ...]
# This is a regular grammar, so we don't need a complicated parser.
Tokens are seperated by whitespace. A command may take any number of arguments. These can either be the contents of variables $var, a string "foo", or a number (int or float).
As these are Perl scalars, there is no visible difference between strings and numbers.
Here is the preamble of the script:
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
strict and warnings are essential when learning Perl, else too much weird stuff would be possible. The use 5.010 is a minimum version, it also defines the say builtin (like a print but appends a newline).
Now we declare two global variables: The %env hash (table or dict) associates variable names with their values. %functions holds our builtin functions. The values are anonymous functions.
my %env;
my %functions = (
add => sub { $_[0] + $_[1] },
mul => sub { $_[0] * $_[1] },
say => sub { say $_[0] },
bye => sub { exit 0 },
);
Now comes our read-eval-loop (we don't print by default). The readline operator <> will read from the file specified as the first command line argument, or from STDIN if no filename is provided.
while (<>) {
next if /^\s*\#/; # jump comment lines
# parse the line. We get a destination $var, a $command, and any number of #args
my ($var, $command, #args) = parse($_);
# Execute the anonymous sub specified by $command with the #args
my $value = $functions{ $command }->(#args);
# Store the return value if a destination $var was specified
$env{ $var } = $value if defined $var;
}
That was fairly trivial. Now comes some parsing code. Perl “binds” regexes to strings with the =~ operator. Regexes may look like /foo/ or m/foo/. The /x flags allows us to include whitespace in our regex that doesn't match actual whitespace. The /g flag matches globally. This also enables the \G assertion. This is where the last successful match ended. The /c flag is important for this m//gc style parsing to consume one match at a time, and to prevent the position of the regex engine in out string to being reset.
sub parse {
my ($line) = #_; # get the $line, which is a argument
my ($var, $command, #args); # declare variables to be filled
# Test if this statement has a variable declaration
if ($line =~ m/\G\s* \$(\w+) \s*=\s* /xgc) {
$var = $1; # assign first capture if successful
}
# Parse the function of this statement.
if ($line =~ m/\G\s* (\w+) \s*/xgc) {
$command = $1;
# Test if the specified function exists in our %functions
if (not exists $functions{$command}) {
die "The command $command is not known\n";
}
} else {
die "Command required\n"; # Throw fatal exception on parse error.
}
# As long as our matches haven't consumed the whole string...
while (pos($line) < length($line)) {
# Try to match variables
if ($line =~ m/\G \$(\w+) \s*/xgc) {
die "The variable $1 does not exist\n" if not exists $env{$1};
push #args, $env{$1};
}
# Try to match strings
elsif ($line =~ m/\G "([^"]+)" \s*/xgc) {
push #args, $1;
}
# Try to match ints or floats
elsif ($line =~ m/\G (\d+ (?:\.\d+)? ) \s*/xgc) {
push #args, 0+$1;
}
# Throw error if nothing matched
else {
die "Didn't understand that line\n";
}
}
# return our -- now filled -- vars.
return $var, $command, #args;
}
Perl arrays can be handled like linked list: shift removes and returns the first element (pop does the same to the last element). push adds an element to the end, unshift to the beginning.
Out little programming language can execute simple programs like:
#!my_little_language
$a = mul 2 20
$b = add 0 2
$answer = add $a $b
say $answer
bye
If (1) our perl script is saved in my_little_language, set to be executable, and is in the system PATH, and (2) the above file in our little language saved as meaning_of_life.mll, and also set to be executable, then
$ ./meaning_of_life
should be able to run it.
Output is obviously 42. Note that our language doesn't yet have string manipulation or simple assignment to variables. Also, it would be nice to be able to call functions with the return value of other functions directly. This requires some sort of parens, or precedence mechanism. Also, the language requires better error reporting for batch processing (which it already supports).
I want to see if I have repeated items in my array, there are over 16.000 so will automate it
There may be other ways but I started with this and, well, would like to finish it unless there is a straightforward command. What I am doing is shifting and pushing from one array into another and this way, check the destination array to see if it is "in array" (like there is such a command in PHP).
So, I got this sub routine and it works with literals, but it doesn't with variables. It is because of the 'eq' or whatever I should need. The 'sourcefile' will contain one or more of the words of the destination array.
// Here I just fetch my file
$listamails = <STDIN>;
# Remove the newlines filename
chomp $listamails;
# open the file, or exit
unless ( open(MAILS, $listamails) ) {
print "Cannot open file \"$listamails\"\n\n";
exit;
}
# Read the list of mails from the file, and store it
# into the array variable #sourcefile
#sourcefile = <MAILS>;
# Close the handle - we've read all the data into #sourcefile now.
close MAILS;
my #destination = ('hi', 'bye');
sub in_array
{
my ($destination,$search_for) = #_;
return grep {$search_for eq $_} #$destination;
}
for($i = 0; $i <=100; $i ++)
{
$elemento = shift #sourcefile;
if(in_array(\#destination, $elemento))
{
print "it is";
}
else
{
print "it aint there";
}
}
Well, if instead of including the $elemento in there I put a 'hi' it does work and also I have printed the value of $elemento which is also 'hi', but when I put the variable, it does not work, and that is because of the 'eq', but I don't know what else to put. If I put == it complains that 'hi' is not a numeric value.
When you want distinct values think hash.
my %seen;
#seen{ #array } = ();
if (keys %seen == #array) {
print "\#array has no duplicate values\n";
}
It's not clear what you want. If your first sentence is the only one that matters ("I want to see if I have repeated items in my array"), then you could use:
my %seen;
if (grep ++$seen{$_} >= 2, #array) {
say "Has duplicates";
}
You said you have a large array, so it might be faster to stop as soon as you find a duplicate.
my %seen;
for (#array) {
if (++$seen{$_} == 2) {
say "Has duplicates";
last;
}
}
By the way, when looking for duplicates in a large number of items, it's much faster to use a strategy based on sorting. After sorting the items, all duplicates will be right next to each other, so to tell if something is a duplicate, all you have to do is compare it with the previous one:
#sorted = sort #sourcefile;
for (my $i = 1; $i < #sorted; ++$i) { # Start at 1 because we'll check the previous one
print "$sorted[$i] is a duplicate!\n" if $sorted[$i] eq $sorted[$i - 1];
}
This will print multiple dupe messages if there are multiple dupes, but you can clean it up.
As eugene y said, hashes are definitely the way to go here. Here's a direct translation of the code you posted to a hash-based method (with a little more Perlishness added along the way):
my #destination = ('hi', 'bye');
my %in_array = map { $_ => 1 } #destination;
for my $i (0 .. 100) {
$elemento = shift #sourcefile;
if(exists $in_array{$elemento})
{
print "it is";
}
else
{
print "it aint there";
}
}
Also, if you mean to check all elements of #sourcefile (as opposed to testing the first 101 elements) against #destination, you should replace the for line with
while (#sourcefile) {
Also also, don't forget to chomp any values read from a file! Lines read from a file have a linebreak at the end of them (the \r\n or \n mentioned in comments on the initial question), which will cause both eq and hash lookups to report that otherwise-matching values are different. This is, most likely, the reason why your code is failing to work correctly in the first place and changing to use sort or hashes won't fix that. First chomp your input to make it work, then use sort or hashes to make it efficient.
I'm working from a question I posted earlier (here), and trying to convert the answer to a sub so I can use it multiple times. Not sure that it's done right though. Can anyone provide a better or cleaner sub?
I have a good deal of experience programming, but my primary language is PHP. It's frustrating to know how to execute in one language, but not be able to do it in another.
sub search_for_key
{
my ($args) = #_;
foreach $row(#{$args->{search_ary}}){
print "#$row[0] : #$row[1]\n";
}
my $thiskey = NULL;
my #result = map { $args->{search_ary}[$_][0] } # Get the 0th column...
grep { #$args->{search_in} =~ /$args->{search_ary}[$_][1]/ } # ... of rows where the
0 .. $#array; # first row matches
$thiskey = #result;
print "\nReturning: " . $thiskey . "\n";
return $thiskey;
}
search_for_key({
'search_ary' => $ref_cam_make,
'search_in' => 'Canon EOS Rebel XSi'
});
---Edit---
From the answers so far, I've cobbled together the function below. I'm new to Perl, so I don't really understand much of the syntax. All I know is that it throws an error (Not an ARRAY reference at line 26.) about that grep line.
Since I seem to not have given enough info, I will also mention that:
I am calling this function like this (which may or may not be correct):
search_for_key({
'search_ary' => $ref_cam_make,
'search_in' => 'Canon EOS Rebel XSi'
});
And $ref_cam_make is an array I collect from a database table like this:
$ref_cam_make = $sth->fetchall_arrayref;
And it is in the structure like this (if I understood how to make the associative fetch work properly, I would like to use it like that instead of by numeric keys):
Reference Array
Associative
row[1][cam_make_id]: 13, row[1][name]: Sony
Numeric
row[1][0]: 13, row[1][1]: Sony
row[0][0]: 19, row[0][1]: Canon
row[2][0]: 25, row[2][1]: HP
sub search_for_key
{
my ($args) = #_;
foreach my $row(#{$args->{search_ary}}){
print "#$row[0] : #$row[1]\n";
}
print grep { $args->{search_in} =~ #$args->{search_ary}[$_][1] } #$args->{search_ary};
}
You are moving in the direction of a 2D array, where the [0] element is some sort of ID number and the [1] element is the camera make. Although reasonable in a quick-and-dirty way, such approaches quickly lead to unreadable code. Your project will be easier to maintain and evolve if you work with richer, more declarative data structures.
The example below uses hash references to represent the camera brands. An even nicer approach is to use objects. When you're ready to take that step, look into Moose.
use strict;
use warnings;
demo_search_feature();
sub demo_search_feature {
my #camera_brands = (
{ make => 'Canon', id => 19 },
{ make => 'Sony', id => 13 },
{ make => 'HP', id => 25 },
);
my #test_searches = (
"Sony's Cyber-shot DSC-S600",
"Canon cameras",
"Sony HPX-32",
);
for my $ts (#test_searches){
print $ts, "\n";
my #hits = find_hits($ts, \#camera_brands);
print ' => ', cb_stringify($_), "\n" for #hits;
}
}
sub cb_stringify {
my $cb = shift;
return sprintf 'id=%d make=%s', $cb->{id}, $cb->{make};
}
sub find_hits {
my ($search, $camera_brands) = #_;
return grep { $search =~ $_->{make} } #$camera_brands;
}
This whole sub is really confusing, and I'm a fairly regular perl user. Here are some blanket suggestions.
Do not create your own undef ever -- use undef then return at the bottom return $var // 'NULL'.
Do not ever do this: foreach $row, because foreach my $row is less prone to create problems. Localizing variables is good.
Do not needlessly concatenate, for it offends the style god: not this, print "\nReturning: " . $thiskey . "\n";, but print "\nReturning: $thiskey\n";, or if you don't need the first \n: say "Returning: $thiskey;" (5.10 only)
greping over 0 .. $#array; is categorically lame, just grep over the array: grep {} #{$foo[0]}, and with that code being so complex you almost certainly don't want grep (though I don't understand what you're doing to be honest.). Check out perldoc -q first -- in short grep doesn't stop until the end.
Lastly, do not assign an array to a scalar: $thiskey = #result; is an implicit $thiskey = scalar #result; (see perldoc -q scalar) for more info. What you probably want is to return the array reference. Something like this (which eliminates $thiskey)
printf "\nReturning: %s\n", join ', ', #result;
#result ? \#result : 'NULL';
If you're intending to return whether a match is found, this code should work (inefficiently). If you're intending to return the key, though, it won't -- the scalar value of #result (which is what you're getting when you say $thiskey = #result;) is the number of items in the list, not the first entry.
$thiskey = #result; should probably be changed to $thiskey = $result[0];, if you want mostly-equivalent functionality to the code you based this off of. Note that it won't account for multiple matches anymore, though, unless you return #result in its entirety, which kinda makes more sense anyway.
I have a small program:
#!/user/bin/perl
use strict;
system ("clear");
my($option, $path);
do
{
print "\tEnter the number of your chosen option:\n";
print "\n";
print "\tOption\t\tCommand\n";
print "\t======\t\t=======\n";
print "\t1\t\tDate\n";
print "\t2\t\tDirectory Listing\n";
print "\t3\t\tCalendar\n";
print "\t4\t\tVi Editor\n";
print "\t5\t\tCalculator\n";
print "\t6\t\tExit\n\n";
chomp($option=<STDIN>);
SWITCH:
{
($option =="1") and do
{
system(date);
last;
};
($option =="2") and do
{
print "Enter the path:"; ############################
chomp($path=<STDIN>); #This is giving me an error#
system(ls $path); ############################
last;
};
($option =="3") and do
{
system(cal);
last;
};
($option =="4") and do
{
system(vi);
last;
};
($option =="5") and do
{
system(bc);
last;
};
}
}while ($option!=6);
print "Goodbye!\n";
sleep 2;
First question: Can anyone help me how to write the proper command to create a directory listing in case 2.
Second Question: Why do I get a loop if I use
$date = `date`;
print "$date";
instead of
system(date);
You should be able to solve a lot of your problems by remembering to put quotes around literal arguments to system():
system("date");
system("ls $path");
and the same for most other places you call system() (your first call to system("clear") is correct).
It is a quirk of Perl that calling something like system(cal) works at all, because the unquoted cal is treated as a "bareword" by Perl, which happens to be roughly equivalent to a string when passed to a function such as system(). Relying on this behaviour would be terribly bad practice, and so you should always quote literal strings.
You could read the path like:
chomp($path=<STDIN>);
system("ls $path");
Not sure why you'd get the loop for $date =date;print "$date";. But I don't think there's a date function unless you're using a package for it. You can show a time like:
my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime;
$year += 1900;
$mon += 1;
printf "%04d-%02d-%02d %02d:%02d:%02d",
$year, $mday, $mon,
$hour, $min, $sec;
On most unix systems perl resides in /usr/bin, without the e in user, so you might consider double-checking the first line of your script.
Your immediate problems were caused by quoting issues and the lack of use warnings in your script.
It's also worth noting that menu-driven scripts like yours are ideal candidates for dispatch tables. A dispatch table is a technique for defining actions as data. The actions are Perl subroutines. The data is usually a set of key-value pairs that end up getting stored in a hash.
The keys to the hash are the choices made by the user (menu items 1-6 in your case).
The values in the hash are called code references. There are two ways to set up these code references: (1) Directly in the dispatch table, using anonymous subroutines; or (2) using the &\foo syntax, which would create a reference to a subroutine named foo.
The handy thing about this approach is that your menu() method can be reused -- simply with a different dispatch table and a different usage message.
This example is so small that the benefit of reuse might not seem compelling, but the general technique of having data -- in the form of a dispatch table -- control program behavior is powerful in many contexts.
# Always use both of these.
use strict;
use warnings;
sub dispatch_table {
return
1 => sub { system 'date' },
2 => \&ls_path,
3 => sub { system 'cal' },
4 => sub { system 'vi' },
5 => sub { system 'bc' },
6 => sub { print "Goodbye!\n"; sleep 2 },
;
}
sub ls_path {
print "\nEnter the path: ";
chomp(my $path=<STDIN>);
# Note quoting. To be super robust, you would
# need to escape apostrophes in the path.
system "ls '$path'";
}
sub usage_message {
return "Choose wisely:
Option Command
====== =======
1 Date
2 Directory Listing
3 Calendar
4 Vi Editor
5 Calculator
6 Exit
";
}
sub menu {
system 'clear';
my %dt = dispatch_table();
my $option;
print usage_message();
while (1){
print "> ";
chomp($option = <STDIN>);
last if exists $dt{$option};
}
$dt{$option}->();
}
menu();
I can not reproduce your loop with:
$date =date;print "$date";
I doubt that is exactly how you coded it since I get a compile error
with use strict;. If you can show a reduced code example which still illustrates the problem, we could help debug it further.
If you are trying to capture the output of an external command into a variable, you could use backticks or qx:
my $date = qx(date);
print "$date";
On a side note, whenever I see a series of print statements, I think here-doc:
print <<"EOF";
Enter the number of your chosen option:
Option Command
====== =======
1 Date
2 Directory Listing
etc...
EOF
A little easier to read and maintain, no?
Finally, it is also a good idea to use warnings;.
The first couple of suggests I have are, first like others have already suggested, use warnings is strongly encouraged. Older Perl interpreters may require you use the older form #!/usr/bin/perl -w as the first line of your Perl script. Second, there is a Switch module available, to make the switch statement look less ugly. I've also shown usage of subroutines to clean up the appearance of the program.
I've attached a alternative version of your script with some potential suggestions. Note it uses a slightly different alternative for switch. If available, I'd recommend using the Switch module. It includes a different way of printing the time, and of course fixes your problem with the system calls.
I hope that helps.
#!/usr/bin/perl
use strict;
use warnings; # otherwise /usr/bin/perl -w in first line
sub menu() {
print <<EOM;
Enter the number of your chosen option:
Option Command
====== =======
1 Date
2 Directory Listing
3 Calendar
4 Vi Editor
5 Calculator
6 Exit
EOM
}
sub showtime() {
my $time = localtime;
print $time,"\n";
}
sub listdir() {
my $path;
print "Enter the path: ";
chomp($path = <STDIN>);
system("ls $path");
print "\n";
}
system("clear");
my $option;
do {
menu();
chomp($option = <STDIN>);
# SWITCH:
for ($option) {
/1/ and do {
showtime();
};
/2/ and do {
listdir();
};
/3/ and do {
system("cal");
};
/4/ and do {
system("vi");
};
/5/ and do {
system("bc");
};
last;
}
} while ($option != 6);
print "Goodbye!\n";
sleep 2;
I'm dismayed. OK, so this was probably the most fun Perl bug I've ever found. Even today I'm learning new stuff about Perl. Essentially, the flip-flop operator .. which returns false until the left-hand-side returns true, and then true until the right-hand-side returns false keep global state (or that is what I assume.)
Can I reset it (perhaps this would be a good addition to Perl 4-esque hardly ever used reset())? Or, is there no way to use this operator safely?
I also don't see this (the global context bit) documented anywhere in perldoc perlop is this a mistake?
Code
use feature ':5.10';
use strict;
use warnings;
sub search {
my $arr = shift;
grep { !( /start/ .. /never_exist/ ) } #$arr;
}
my #foo = qw/foo bar start baz end quz quz/;
my #bar = qw/foo bar start baz end quz quz/;
say 'first shot - foo';
say for search \#foo;
say 'second shot - bar';
say for search \#bar;
Spoiler
$ perl test.pl
first shot
foo
bar
second shot
Can someone clarify what the issue with the documentation is? It clearly indicates:
Each ".." operator maintains its own boolean state.
There is some vagueness there about what "Each" means, but I don't think the documentation would be well served by a complex explanation.
Note that Perl's other iterators (each or scalar context glob) can lead to the same problems. Because the state for each is bound to a particular hash, not a particular bit of code,each can be reset by calling (even in void context) keys on the hash. But for glob or .., there is no reset mechanism available except by calling the iterator until it is reset. A sample glob bug:
sub globme {
print "globbing $_[0]:\n";
print "got: ".glob("{$_[0]}")."\n" for 1..2;
}
globme("a,b,c");
globme("d,e,f");
__END__
globbing a,b,c:
got: a
got: b
globbing d,e,f:
got: c
Use of uninitialized value in concatenation (.) or string at - line 3.
got:
For the overly curious, here are some examples where the same .. in the source is a different .. operator:
Separate closures:
sub make_closure {
my $x;
return sub {
$x if 0; # Look, ma, I'm a closure
scalar( $^O..!$^O ); # handy values of true..false that don't trigger ..'s implicit comparison to $.
}
}
print make_closure()->(), make_closure()->();
__END__
11
Comment out the $x if 0 line to see that non-closures have a single .. operation shared by all "copies", with the output being 12.
Threads:
use threads;
sub coderef { sub { scalar( $^O..!$^O ) } }
coderef()->();
print threads->create( coderef() )->join(), threads->create( coderef() )->join();
__END__
22
Threaded code starts with whatever the state of the .. had been before thread creation, but changes to its state in the thread are isolated from affecting anything else.
Recursion:
sub flopme {
my $recurse = $_[0];
flopme($recurse-1) if $recurse;
print " "x$recurse, scalar( $^O..!$^O ), "\n";
flopme($recurse-1) if $recurse;
}
flopme(2)
__END__
1
1
2
1
3
2
4
Each depth of recursion is a separate .. operator.
The trick is not use the same flip-flop so you have no state to worry about. Just make a generator function to give you a new subroutine with a new flip-flop that you only use once:
sub make_search {
my( $left, $right ) = #_;
sub {
grep { !( /\Q$left\E/ .. /\Q$right\E/ ) } #{$_[0]};
}
}
my $search_sub1 = make_search( 'start', 'never_existed' );
my $search_sub2 = make_search( 'start', 'never_existed' );
my #foo = qw/foo bar start baz end quz quz/;
my $count1 = $search_sub1->( \#foo );
my $count2 = $search_sub2->( \#foo );
print "count1 $count1 and count2 $count2\n";
I also write about this in Make exclusive flip-flop operators.
The "range operator" .. is documented in perlop under "Range Operators". Looking through the doucmentation, it appears that there isn't any way to reset the state of the .. operator. Each instance of the .. operator keeps its own state, which means there isn't any way to refer to the state of any particular .. operator.
It looks like it's designed for very small scripts such as:
if (101 .. 200) { print; }
The documentation states that this is short for
if ($. == 101 .. $. == 200) { print; }
Somehow the use of $. is implicit there (toolic points out in a comment that that's documented too). The idea seems to be that this loop runs once (until $. == 200) in a given instance of the Perl interpreter, and therefore you don't need to worry about resetting the state of the .. flip-flop.
This operator doesn't seem too useful in a more general reusable context, for the reasons you've identified.
A workaround/hack/cheat for your particular case is to append the end value to your array:
sub search {
my $arr = shift;
grep { !( /start/ .. /never_exist/ ) } #$arr, 'never_exist';
}
This will guarantee that the RHS of range operator will eventually be true.
Of course, this is in no way a general solution.
In my opinion, this behavior is not clearly documented. If you can construct a clear explanation, you could apply a patch to perlop.pod via perlbug.
I found this problem, and as far as I know there's no way to fix it. The upshot is - don't use the .. operator in functions, unless you are sure you are leaving it in the false state when you leave the function, otherwise the function may return different output for the same input (or exhibit different behaviour for the same input).
Each use of the .. operator maintains its own state. Like Alex Brown said, you need to leave it in the false state when you leave the function. Maybe you could do something like:
sub search {
my $arr = shift;
grep { !( /start/ || $_ eq "my magic reset string" ..
/never_exist/ || $_ eq "my magic reset string" ) }
(#$arr, "my magic reset string");
}