Why doesn't this run forever? - perl

I was looking at a rather inconclusive question about whether it is best to use for(;;) or while(1) when you want to make an infinite loop and I saw an interesting solution in C where you can #define "EVER" as a constant equal to ";;" and literally loop for(EVER).
I know defining an extra constant to do this is probably not the best programming practice but purely for educational purposes I wanted to see if this could be done with Perl as well.
I tried to make the Perl equivalent, but it only loops once and then exits the loop.
#!/usr/bin/perl -w
use strict;
use constant EVER => ';;';
for (EVER) {
print "FOREVER!\n";
}
Output:
FOREVER!
Why doesn't this work in perl?

C's pre-processor constants are very different from the constants in most languages.
A normal constant acts like a variable which you can only set once; it has a value which can be passed around in most of the places a variable can be, with some benefits from you and the compiler knowing it won't change. This is the type of constant that Perl's constant pragma gives you. When you pass the constant to the for operator, it just sees it as a string value, and behaves accordingly.
C, however, has a step which runs before the compiler even sees the code, called the pre-processor. This actually manipulates the text of your source code without knowing or caring what most of it means, so it can do all sorts of things that you couldn't do in the language itself. In the case of #define EVER ;;, you are telling the pre-processor to replace every occurrence of EVER with ;;, so that when the actual compiler runs, it only sees for(;;). You could go a step further and define the word forever as for(;;), and it would still work.
As mentioned by Andrew Medico in comments, the closest Perl has to a pre-processor is source filters, and indeed one of the examples in the manual is an emulation of #define. These are actually even more powerful than pre-processor macros, allowing people to write modules like Acme::Bleach (replaces your whole program with whitespace while maintaining functionality) and Lingua::Romana::Perligata (interprets programs written in grammatically correct Latin), as well as more sensible features such as adding keywords and syntax for class and method declarations.

It doesn't run forever because ';;' is an ordinary string, not a preprocessor macro (Perl doesn't have an equivalent of the C preprocessor). As such, for (';;') runs a single time, with $_ set to ';;' that one time.
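A minimal sketch of that behaviour, assuming nothing beyond core Perl: the constant is just a one-element list as far as for is concerned, so the body runs once with $_ aliased to the string. The @seen array is a name introduced here purely for illustration.

```perl
use strict;
use warnings;
use constant EVER => ';;';

# Record every value the loop sees (illustrative variable name).
my @seen;
for (EVER) {
    push @seen, $_;    # $_ holds the constant's value, the string ';;'
}

print scalar(@seen), " iteration(s), \$_ was '$seen[0]'\n";
```

The loop body executes exactly once, which matches the single FOREVER! line in the question.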

Andrew Medico mentioned in his comment that you could hack it together with a source filter.
I confirmed this, and here's an example.
use Filter::cpp;
#define EVER ;;
for (EVER) {
print "Forever!\n";
}
Output:
Forever!
Forever!
Forever!
... keeps going ...
I don't think I would recommend doing this, but it is possible.

This is not possible in Perl. However, you can define a subroutine named forever which takes a code block as a parameter and runs it again and again:
#!/usr/bin/perl
use warnings;
use strict;
sub forever (&) {
$_[0]->() while 1
}
forever {
print scalar localtime, "\n";
sleep 1;
};

Related

Safe.pm base_math as math calculator and new variables

I do not want to write my own recursive-descent math parser or think too deeply about grammar, so I am (re-)using the Perl module Safe.pm as an arithmetic calculator with variables. My task is to let one anonymous web user A type into a textfield a couple of math expressions, like:
**Input Formula:** $x= 2; $y=sqrt(2*$x+(25+$x)*$x); $z= log($y); ...
Ideally, this should only contain math expressions, but not generic Perl code. Later, I want to use it for web user B:
**Input Print:** you start with x=$x and end with z=$z . you don't know $a.
to <pre> text output that looks like this:
**Output Txt:** you start with x=2 and end with z=2.03 . you don't know $a.
(The fact that $a was not replaced is its own warning.) Ideally, I want to check that my web users have not only not tried to break in, but also have made no syntax errors.
My current Safe.pm-based implementation has drawbacks:
I want only math expressions in the first textfield. Alas, :base_math only extends Safe.pm beyond :base_core, so I have to live with the user having access to more than just math algebra expressions. For example, the web users could accidentally try to use a Perl reserved name, define subs, or do who knows what. Is there a better solution that picks off only the recursive descent math grammar parser? (and, subs like system() should not be permitted math functions!)
For the printing, I can just wrap a print "..." around the text and go another Safe eval, but this replaces $a with undef. What I really mean my code to do is to go through the table of newly added variables ($x, $y, and $z) and if they appear unescaped, then replace them; others should be ignored. I also have to watch carefully here that my guys are not working together to try to escape and type text like "; system("rm -rf *"); print ";, though Safe would catch this particular issue. More likely, A could try to inject some nasty JavaScript for B or who knows what.
Questions:
Is Safe.pm the right tool for the job? Perl seems like a heavy cannon here, but not having to reinvent the wheel is nice.
Can one further restrict Safe.pm to Perl's arithmetic only?
Is there a "new symbols" table that I can iterate over for substitution?
Safe.pm seems like a bad choice, because you're going to run the risk of overlooking some exploitable operation. I would suggest looking at a parsing tool, such as Marpa. It even has the beginnings of a calculator implementation which you could probably adapt to your purposes.

Correct way of variable declaration in Perl

I have a set of 3 or 4 separate Perl scripts that used to be part of a simple pipeline, but I am now trying to combine them in a single script for easier use (for now without subroutine functions). The thing is that several variables with the same name are defined in the different scripts. The workaround I found was to give different names to those variables, but it can start to become messy and probably it is not the correct way of doing so.
I know the concept of global and local variables, but I do not quite understand exactly how they work.
Are there any rules of thumb for dealing with these sorts of variables? Do you know any good documentation that can shed some light on variable scope, or have any advice on this?
Thanks.
EDITED: I already use "use warnings; use strict;" and declare variables with "my". The question might actually be more related to the definition of scoping blocks and how to get them to be independent from each other...
You are likely getting into trouble because of your use of global variables (which actually likely exist in package main). You should try to avoid the use of global variables.
And to do so, you should become familiar with the meaning of variable scope. Although somewhat dated, Coping with Scoping offers a good introduction to this topic. Also see this answer and the others to the question How to properly use Global variables in perl. (Short Answer: avoid them to the degree possible.)
The principle of variable scope and limiting use of global variables actually applies to nearly all programming languages. You should get in the habit of declaring variables as close as possible to the point where you are actually using them.
And finally, to save yourself from a lot of headaches, get in the habit of:
including use strict; and use warnings; at the top of every Perl source file, and
declaring variables with my within each of your subs (to limit the scope of those variables to the sub).
(See this PerlMonks article for more on this recommendation.)
I refer to this practice as "Perl programming with your seat belt on." :-)
The rule of thumb is to put your code in subroutines, each of them focused on a simple, well-defined part of the larger process. From this one decision flow many virtuous outcomes, including a natural solution to the variable scoping problem you asked about.
sub foo {
my $x = 99;
...
}
sub bar {
my $x = 1234; # Won't interfere with foo's $x.
...
}
If, for some reason, you really don't want to do that, you can wrap each section of the code in scoping blocks, and make sure you declare all variables with my (you should be doing the latter nearly always as a matter of common practice):
{
my $x = 99;
...
}
{
my $x = 1234; # Won't interfere with the other $x.
...
}
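As a runnable sketch of the scoping-block approach, the following assumes two former scripts that both used a variable called $count; the variable names and the @results array are made up for illustration. Each bare block gets its own lexical, and only data declared outside the blocks is shared.

```perl
use strict;
use warnings;

# Declared outside the blocks, so both stages can add to it.
my @results;

{
    my $count = 99;                       # visible only inside this block
    push @results, "stage one: $count";
}

{
    my $count = 1234;                     # a separate variable with the same name
    push @results, "stage two: $count";
}

print "$_\n" for @results;
```

Referring to $count after the second block would be a compile-time error under use strict, which is the point: nothing leaks from one stage to the next by accident.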

What compile time features does Perl provide that other languages don't?

Is Perl considered a general purpose programming language?
Reading about it on Wikipedia
Perl has a Turing-complete grammar because parsing can be affected by run-time code executed during the compile phase.[41] Therefore, Perl cannot be parsed by a straight Lex/Yacc lexer/parser combination. Instead, the interpreter implements its own lexer, which coordinates with a modified GNU bison parser to resolve ambiguities in the language.
It is often said that "Only perl can parse Perl," meaning that only the Perl interpreter (perl) can parse the Perl language (Perl), but even this is not, in general, true. Because the Perl interpreter can simulate a Turing machine during its compile phase, it would need to decide the Halting Problem in order to complete parsing in every case. It's a long-standing result that the Halting Problem is undecidable, and therefore not even perl can always parse Perl. Perl makes the unusual choice of giving the user access to its full programming power in its own compile phase. The cost in terms of theoretical purity is high, but practical inconvenience seems to be rare.
So, it says that though Perl has the Turing-complete badge, it is different from other languages because it gives "the user access to its full programming power in its own compile phase". What does that mean? What programming power does Perl provide me in the compile phase that others don't?
There are no features of Perl that do not appear in any other language. Lisp can do anything (Lisp is an example, here.). So perhaps we can narrow the question down to what are the features of Perl that make wide behavior swings an easy thing to do.
BEGIN blocks (END blocks, too.) which alter the behavior during compile. So I can write Perl code that changes the location of modules to be loaded.
Even the following code might have a different meaning.
use Frobnify;
Frobnify->new->initialize;
Because I could have changed where Frobnify loads from:
BEGIN {
    if ( [ localtime ]->[6] == 2 ) {
        s|^/var|/var/days/tuesday| foreach @INC;
    }
}
So on Tuesdays, I load /var/days/tuesday/perl/lib/Frobnify.pm
Source filters can programmatically edit the code that will be executed. (CAVEAT on source filters!) (Crudely and roughly equivalent to Lisp macros.)
Somewhat along with BEGIN blocks are @INC hooks. I can modify @INC at the beginning to change what gets loaded: I can put a subroutine at the front of the @INC array to load anything I want to load. The hook can receive a request to load Frobnify and respond to it by loading Defrobnify.pm.
Somewhat along with this is symbol manipulation. After loading Defrobnify.pm, I can do this:
*Frobnify:: = \*Defrobnify::;
Now Frobnify->new creates a Defrobnify object!
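The @INC hook idea above can be sketched as follows. This is an illustration under stated assumptions, not the answerer's code: the replacement source is an inline string invented for the example, and the name method exists only so there is something to print. A code reference in @INC is called with itself and the requested file name, and may return a filehandle from which the module source is read.

```perl
use strict;
use warnings;

BEGIN {
    # Put a hook at the front of @INC so it is consulted first.
    unshift @INC, sub {
        my ($self, $file) = @_;
        return unless $file eq 'Frobnify.pm';    # decline everything else

        # Hypothetical stand-in source for the requested module.
        my $source = "package Frobnify; sub new { bless {}, shift } "
                   . "sub name { 'stand-in' } 1;";
        open my $fh, '<', \$source
            or die "cannot open in-memory file: $!";
        return $fh;                               # perl compiles what it reads here
    };
}

use Frobnify;    # satisfied by the hook, not by a file on disk
print Frobnify->new->name, "\n";
```

Because the hook is installed in a BEGIN block, it is already in place when the later use Frobnify is compiled.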
Subroutine prototypes are a compile time feature that is more or less exclusive to Perl. Many of Perl's builtin functions impose special types of context on their arguments (scalar, list, reference, code-block, capture). Prototypes are a way of porting some of that functionality over to user defined subroutines.
For example, Perl allows you to effectively generate new syntactic constructs with the (&) prototype. This is used in modules like Try::Tiny to add try and catch keywords to the language:
try {
    die "foo";
} catch {
    warn "caught error: $_"; # not $@
};
This works because try and catch are declared as sub try (&;@) { ... }. The sub name {...} syntax is equivalent to BEGIN { *name = sub {...} } which means it has a compile time effect. In the case of try, the (&;@) prototype tells the compiler that any time it sees the identifier try, the first argument must be a bare block, and following the block is an optional list.
This is just one example of prototypes, and they are able to do many other things:
$ imposes scalar context on an argument
& imposes code context on an argument
@ imposes list context on an argument
% imposes list context (with an even number of elements)
* imposes glob context on the argument
\$ imposes scalar reference context
\@ imposes array reference context
... for the rest of the sigils
Due to their power (and absence in other languages) prototypes can be confusing and are best used in moderation. (like every other advanced feature of Perl).
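To make the (&) prototype concrete, here is a small sketch with a user-defined subroutine that accepts a bare block, much as map and grep do. The name apply_to_each is hypothetical and chosen only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: the (&@) prototype lets callers pass a bare block
# as the first argument, followed by a list.
sub apply_to_each (&@) {
    my ($code, @items) = @_;
    return map { $code->($_) } @items;
}

# No "sub" keyword and no comma after the block, thanks to the prototype.
my @doubled = apply_to_each { $_[0] * 2 } 1, 2, 3;
print "@doubled\n";
```

The subroutine has to be declared before the call is compiled, because the prototype changes how the call itself is parsed.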
The simple answer is that BEGIN blocks provide Turing-completeness:
BEGIN {
my $foo = turing_machine_simulator($program);
}
BEGIN blocks are executed as soon as the perl compiler sees them. This means that the compiler can be asked to do tasks of arbitrary complexity. Anything Perl can do, it can do during its compilation phase.
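A short sketch of that ordering, using an illustrative package variable @order: both BEGIN blocks run while the file is still being compiled, so they record their entries before the ordinary statement between them is executed.

```perl
use strict;
use warnings;

our @order;    # package variable, illustrative name

BEGIN { push @order, 'BEGIN block' }

push @order, 'first runtime statement';

BEGIN { push @order, 'second BEGIN block' }

print join(', ', @order), "\n";
```

The runtime statement ends up last in the list even though it appears second in the source.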

Why do Perl control statements require braces?

This may look like the recent question that asked why Perl doesn't allow one-liners to be "unblocked," but I found the answers to that question unsatisfactory because they either referred to the syntax documentation that says that braces are required, which I think is just begging the question, or ignored the question and simply gave braceless alternatives.
Why does Perl require braces for control statements like if and for? Put another way, why does Perl require blocks rather than statements, like some other popular languages allow?
One reason could be that some styles dictate that you should always use braces with control structures, even for one liners, in order to avoid breaking them later, e.g.:
if (condition)
myObject.doSomething();
else
myObject.doSomethingElse();
Then someone adds something more to the first part:
if (condition)
myObject.doSomething();
myObject.doSomethingMore(); // Syntax error next line
else
myObject.doSomethingElse();
Or worse:
if (condition)
myObject.doSomething();
else
myObject.doSomethingElse();
myObject.doSomethingMore(); // Compiles, but not what you wanted.
In Perl, these kinds of mistakes are not possible, because not using braces with control structures is always a syntax error. In effect, a style decision has been enforced at the language syntax level.
Whether that is any part of the real reason, only Larry's moustache knows.
One reason could be that some constructs would be ambiguous without braces:
foreach (@l) do_something unless $condition;
Does unless $condition apply to the whole thing or just the do_something statement?
Of course this could have been worked out with priority rules or something,
but it would have been yet another way to create confusing Perl code :-)
One problem with braceless if-else clauses is they can lead to syntactic ambiguity:
if (foo)
if (bar)
mumble;
else
tumble;
Given the above, under what condition is tumble executed? It could be interpreted as happening when !foo or foo && !bar. Adding braces clears up the ambiguity without dirtying the source too much. You could then go on to say that it's always a good idea to have the braces, so let's make the language require it and solve the endless C bickering over whether they should be used or not. Or, of course, you could address the problem by getting rid of the braces completely and using the indentation to indicate nesting. Both are ways of making clear, unambiguous code a natural thing rather than requiring special effort.
In Programming Perl (which Larry Wall co-authored), 3rd Edition, page 113, compound statements are defined in terms of expressions and blocks, not statements, and blocks have braces.
Note that unlike in C and Java,
[compound statements] are defined in
terms of BLOCKS, not statements.
This means that the braces are
required--no dangling statements
allowed.
I don't know if that answers your question but it seems like in this case he chose to favor a simple language structure instead of making exceptions.
Perhaps not directly relevant to your question about (presumably) Perl 5 and earlier, but…
In Perl 6, control structures do not require parentheses:
if $x { say '$x is true' }
for <foo bar baz> -> $s { say "[$s]" }
This would be horrendously ambiguous if the braces were also optional.
Isn't it that Perl allows you to skip the braces, but then you have to write statement before condition? i.e.
#!/usr/bin/perl
my $a = 1;
if ($a == 1) {
print "one\n";
}
# is equivalent to:
print "one\n" if ($a == 1);
"Okay, so normally, you need braces around blocks, but not if the block is only one statement long, except, of course, if your statement would be ambiguous in a way that would be ruled by precedence rules not like you want if you omitted the braces -- in this case, you could also imagine the use of parentheses, but that would be inconsistent, because it is a block after all -- this is of course dependent on the respective precedence of the involved operators. In any case, you don't need to put semicolons after closing braces -- it is even wrong if you end an if statement that is followed by an else statement -- except that you absolutely must put a semicolon at the end of a header file in C++ (or was it C?)."
Seriously, I am glad for every explicitness and uniformity in code.
Just guessing here, but "unblocked" loops/ifs/etc. tend to be places where subtle bugs are introduced during code maintenance, since a sloppy maintainer might try to add another line "inside the loop" without realizing that it's not really inside.
Of course, this is Perl we're talking about, so probably any argument that relies on maintainability is suspect... :)

Is it okay to use modules from within subroutines?

Recently I started playing with OO Perl, and I've been creating quite a bunch of new objects for a new project that I'm working on. I'm unfamiliar with any best practices regarding OO Perl, and we're in kind of a tight rush to get it done :P
I'm putting a lot of this kind of code into each of my functions:
sub funcx{
use ObjectX; # i don't declare this on top of the pm file
# but inside the function itself
my $obj = new ObjectX;
}
I was wondering if this will have any negative impact compared with putting the use ObjectX line at the top of the Perl module, outside of any function scope.
I was doing this so that I feel it's cleaner in case I need to shift the function around.
The other thing I have noticed is that when I run a test.pl script that tests my objects on the Unix server itself, it is slow as heck. But when the same code is run through CGI connected to an Apache server, the web page doesn't load as slowly.
Where to put use?
use occurs at compile time, so it doesn't matter where you put it. At least from a purely pragmatic, 'will it work', point of view. Because it happens at compile time use will always be executed, even if you put it in a conditional. Never do this: if( $foo eq 'foo' ) { use SomeModule }
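A quick way to see this for yourself, sketched with two core modules picked only for illustration: a use inside a branch that never runs still loads its module, whereas require is an ordinary runtime statement and does respect the condition.

```perl
use strict;
use warnings;

if (0) {
    use List::Util ();    # loaded at compile time, although this branch never runs
}
print "List::Util loaded: ",
    ( exists $INC{'List/Util.pm'} ? "yes" : "no" ), "\n";

my $want_wrap = 0;         # illustrative flag
if ($want_wrap) {
    require Text::Wrap;    # runtime: skipped because the condition is false
}
print "Text::Wrap loaded: ",
    ( exists $INC{'Text/Wrap.pm'} ? "yes" : "no" ), "\n";
```

If a module really does need to be loaded conditionally, require (followed by an explicit import call if needed) is the usual way to do it.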
In my experience, it is best to put all your use statements at the top of the file. It makes it easy to see what is being loaded and what your dependencies are.
Update:
As brian d foy points out, things compiled before the use statement will not be affected by it. So, the location can matter. For a typical module, location does not matter, however, if it does things that affect compilation (for example it imports functions that have prototypes), the location could matter.
Also, Chas Owens points out that it can affect compilation. Modules that are designed to alter compilation are called pragmas. Pragmas are, by convention, given names in all lower-case. These effects apply only within the scope where the module is used. Chas uses the integer pragma as an example in his answer. You can also disable a pragma or module over a limited scope with the keyword no.
use strict;
use warnings;
my $foo;
print $foo; # Generates a warning
{ no warnings 'uninitialized'; # turn off warnings for working with uninitialized values.
    print $foo; # No warning here
}
print $foo; # Generates a warning
Indirect object syntax
In your example code you have my $obj = new ObjectX;. This is called indirect object syntax, and it is best avoided as it can lead to obscure bugs. It is better to use this form:
my $obj = ObjectX->new;
Why is your test script slow on the server?
There is no way to tell with the info you have provided.
But the easy way to find out is to profile your code and see where the time is being consumed. NYTProf is another popular profiling tool you may want to check out.
Best practices
Check out Perl Best Practices, and the quick reference card. This page has a nice run down of Damian Conway's OOP advice from PBP.
Also, you may wish to consider using Moose. If the long script startup time is acceptable in your usage, then Moose is a huge win.
question 1
It depends on what the module does. If it has lexical effects, then it will only affect the scope it is used in:
my $x;
{
use integer;
$x = 5/2; #$x is now 2
}
my $y = 5/2; #$y is now 2.5
If it is a normal module then it makes no difference where you use it, but it is common to use all of those modules at the top of the program.
question 2
Things that can affect the speed of a program between machines
speed of the processor
version of modules installed (some modules have XS versions that are much faster)
version of Perl
number of entries in PERL5LIB
speed of the drive
daotoad and Chas. Owens already answered the part of your question pertaining to the position of use statements. Let me remark on something else here:
I was doing this so that I feel it's
cleaner in case I need to shift the
function around.
Personally, I find it much cleaner to have all the used modules in one place at the top of the file. You won't have to search for use statements to see what other modules are being used and a quick glance will tell you what is being used and even what is not being used.
Regarding your performance problem: with Apache and mod_perl the Perl interpreter will have to parse and compile your used modules only once. The next time the script is run, execution should be much faster. On the command line, however, a second run doesn't get this benefit.