What does the ( $$ ) do in this code? I have programmed Perl for a long time but never came across this syntax until recently, when I opened a very old Perl .plx file.
These lines prevent me from upgrading to a more modern Perl version:
sub help( $$ ){
}
The reason it affects me is that I get an error message stating that the help function was called before it was declared. Any idea how I can solve this without removing the ( $$ ) block?
They are called prototypes. This particular one says that the subroutine expects to be called with exactly two scalar arguments. Although prototypes are sometimes useful, mostly they are not.
Whether you can drop them depends on the rest of the code.
That's a function prototype, which is used to specify the number and types of arguments that the subroutine takes. See the documentation.
Since it's in the current documentation, I don't see why it's preventing you from upgrading.
Is the error you're getting "help() called too early to check prototype"? Here's the explanation from the perldiag documentation:
(W prototype) You've called a function that has a prototype before the parser saw a definition or declaration for it, and Perl could not check that the call conforms to the prototype. You need to either add an early prototype declaration for the subroutine in question, or move the subroutine definition ahead of the call to get proper prototype checking. Alternatively, if you are certain that you're calling the function correctly, you may put an ampersand before the name to avoid the warning. See perlsub.
It's a prototype. The $$ specifies that the help function expects two arguments and that they should each be evaluated in scalar context. Note that this does not mean that they are scalar values! Perl's prototypes aren't like prototypes in other languages. They allow you to define functions that behave like built-in functions: parentheses are optional and context is imposed on the arguments.
sub f($$) { print "@_\n" }
my @a = ('a' .. 'c');
f(@a, 'd'); # prints "3 d"
I'm guessing that the error message you're seeing is
help() called too early to check prototype
which means that Perl saw a call to the function before it saw the declaration of the function and knew about the prototype. This means that the prototype wasn't enforced and the call may not behave as expected.
my @a = ('a' .. 'c');
f(@a, 'd'); # prints "a b c d"
sub f($$) { print "@_\n" }
To fix the error you need to either move the subroutine definition before the call, or add a declaration before the call.
sub f($$); # forward declaration
my @a = ('a' .. 'c');
f(@a, 'd'); # prints "3 d"
sub f($$) { print "@_\n" }
All of this should have absolutely nothing to do with your ability to upgrade to a newer version of Perl.
I have to debug someone else's code and ran across sub declarations that look like this...
sub mysub($$$$) {
<code here>
}
...also...
sub mysub($$$;$) {
<code here>
}
What does the parenthesized list of '$' (with optional ';') mean?
I ran an experiment and it doesn't seem to care if I pass more or fewer args to a sub declared this way than there are '$' in the list. I was thinking that it might be used to disambiguate two different subs with the same name, differing only by the number of args passed to them (as defined by the ($$$$) vs ($$$) vs ($$) etc.). But that doesn't seem to be it.
That's a Perl subroutine prototype. It's an old-school way of letting the parser know how many arguments to demand. Unless you know what they are going to do for you, I suggest you avoid these for any new code. If you can avoid prototypes, avoid them. They don't gain you as much as you think. There's a newer but experimental way to do it better.
The elements after the ; are optional arguments. So, mysub($$$$) has four mandatory arguments, and mysub($$$;$) has three mandatory arguments and one optional argument.
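To make the mandatory/optional split concrete, here is a minimal sketch. The sub body (it just returns its argument count) is made up for illustration, and the call with too few arguments is wrapped in a string eval because the parser rejects it at compile time:

```perl
use strict;
use warnings;

# Hypothetical sub: three mandatory arguments, one optional.
sub mysub($$$;$) { return scalar @_ }

my $three = mysub(1, 2, 3);       # optional argument omitted
my $four  = mysub(1, 2, 3, 4);    # optional argument supplied

# Too few arguments is a compile-time error, so trap it with a string eval.
my $ok    = eval q{ mysub(1, 2); 1 };
my $error = $ok ? '' : $@;

print "three=$three four=$four\n";
print "prototype: ", prototype(\&mysub), "\n";
print "error: $error";
```

The built-in prototype function returns the prototype string, which is handy when you are debugging someone else's code and want to see what a sub was declared with.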
A little about parsing
Perl lets you be a bit loose about parentheses when you want to specify arguments, so these are the same:
print "Hello World\n";
print( "Hello World\n" );
This is one of Perl's philosophical points. When we can omit boilerplate, we should be able to.
Also, Perl lets you pass as many arguments as you like to a subroutine and you don't have to say anything about parameters ahead of time:
sub some_sub { ... }
some_sub( 1, 2, 3, 4 );
some_sub 1, 2, 3, 4; # same
This is another foundational idea of Perl: we have scalars and lists. Many things work on a list, and we don't care what's in it or how many elements it has.
But, some builtins take a definite number of arguments. The sin takes exactly one argument (but print takes zero to effectively infinity):
print sin 5, 'a'; # -0.958924274663138a (the trailing a is from 'a')
The rand takes zero or one:
print rand; # 0.331390818188996
print rand 10; # 4.23956650382937
But then, you can define your own subroutines. Prototypes are a way to mimic that same behavior you see in the builtins (which I think is kinda cool but also not as motivating for production situations).
I tend to use parens in argument lists because I find it's easier for people to see what I intend (although not always with print, I guess):
print sin(5), 'a';
There's one interesting use of prototypes that I like. You can make your own syntax that works like map and grep block forms:
map { ... } @array;
If you want to play around with that (but still not subject maintenance programmers to it), check out Object::Iterate for a demonstration of it.
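As a rough sketch of the block-form idea, assuming a made-up helper named apply_to_each (this is not taken from Object::Iterate): the & at the start of the prototype is what lets the caller pass a bare block with no sub keyword and no comma after it.

```perl
use strict;
use warnings;

# Hypothetical helper: & accepts a bare block as the first argument,
# @ slurps the remaining arguments as a list.
sub apply_to_each(&@) {
    my ($code, @items) = @_;
    return map { $code->($_) } @items;
}

# Called like map or grep: block first, then the list.
my @doubled = apply_to_each { $_[0] * 2 } 1, 2, 3;

print "@doubled\n";   # prints "2 4 6"
```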
Experimental signatures
Perl v5.20 introduced an experimental signatures feature where you can give names to parameters. All of these are required:
use v5.20;
use feature qw(signatures);
sub mysub ( $name, $address, $phone ) { ... }
If you wanted an optional parameter, you can give it a default value:
sub mysub ( $name, $address, $phone = undef ) { ... }
Since this is an experimental feature, it warns whenever you use it. You can turn it off though:
no warnings qw(experimental::signatures);
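Putting those pieces together, a small sketch might look like this. The sub name contact and the default value are invented for the example. Unlike prototypes, a signature checks the argument count at run time, so a block eval is enough to catch the failure:

```perl
use v5.20;
use warnings;
use feature qw(signatures);
no warnings qw(experimental::signatures);

# Hypothetical sub: two required parameters, one with a default.
sub contact ( $name, $address, $phone = 'unknown' ) {
    return "$name / $address / $phone";
}

my $with_default = contact( 'Ann', '1 Main St' );
my $explicit     = contact( 'Ann', '1 Main St', '555-0100' );

# Too few arguments dies when the call happens, not when it is compiled.
my $ok = eval { contact('Ann'); 1 };

print "$with_default\n";
print "$explicit\n";
print "too few arguments: ", ( $ok ? "accepted" : "rejected" ), "\n";
```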
This is interesting.
I ran an experiment and it doesn't seem to care if I pass more and fewer args to a sub declared this way than there are '$' in the list.
Because, of course, that's exactly what the code's author was trying to enforce.
There are two ways to circumvent the parameter counting that prototypes are supposed to enforce.
Call the subroutine as a method on an object ($my_obj->my_sub(...)) or on a class (MyClass->my_sub(...)).
Call the subroutine using the "old-style" ampersand syntax (&my_sub(...)).
From which we learn:
Don't use prototypes on subroutines that are intended to be used as methods.
Don't use the ampersand syntax for calling subroutines.
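A short sketch of the two bypasses, using a hypothetical count_args sub that simply reports how many arguments arrived:

```perl
use strict;
use warnings;

# Hypothetical sub with a two-scalar prototype.
sub count_args($$) { return scalar @_ }

my @list = ('a', 'b', 'c');

my $checked   = count_args(@list, 'd');    # prototype applied: @_ is (3, 'd')
my $ampersand = &count_args(@list, 'd');   # prototype ignored: @_ is ('a', 'b', 'c', 'd')
my $as_method = main->count_args(@list);   # prototype ignored: @_ is ('main', 'a', 'b', 'c')

print "checked=$checked ampersand=$ampersand as_method=$as_method\n";
# prints "checked=2 ampersand=4 as_method=4"
```

In the first call the prototype forces @list into scalar context, so the sub sees the element count rather than the elements. The other two calls flatten the array as usual.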
So I've been diving into Perl 6 and have been looking at interpreting another language using Perl 6's operator definitions. I understand that this could be done by parsing the code but I'm looking to push Perl 6's capabilities to see what it can do. Having this functionality would also make the parsing a lot easier
I'm trying to make a variable definition in a C-style format. (The language isn't important.)
Something like:
char foo;
Where the char represents the type and the foo is the variable name. From my understanding the char can be interpreted using an operator definition like so:
sub prefix:<char>($input) {
say $input;
}
Which calls a subroutine with the foo as $input. The idea from here would be to use foo as a string and hold its reference in a hash somewhere. The problem with this is that Perl 6 seems to see any bareword as a function call and will complain with "Undeclared routine" when it can't find it.
I've looked possibly everywhere for an answer to this and the only thing that makes me still think that this may be possible is the qw function from Perl 5 which is now < > in Perl 6. The < > is obviously an operator which leads me to believe that there is a subroutine defined somewhere that tells this operator how to work and how to deal with the bareword input.
So to my question:
Is there a way of accepting bareword input into a subroutine just like the < > operator does?
Cheers!
The best way to do that would be to create a Grammar that parses your language. If you additionally want it to run the DSL you have just created, combine it with Actions.
(Assume use strict; use warnings; throughout this question.)
I am exploring the usage of sub.
sub bb { print @_; }
bb 'a';
This works as expected. The parentheses are optional, as with many other functions such as print and open.
However, this causes a compilation error:
bb 'a';
sub bb { print @_; }
String found where operator expected at t13.pl line 4, near "bb 'a'"
(Do you need to predeclare bb?)
syntax error at t13.pl line 4, near "bb 'a'"
Execution of t13.pl aborted due to compilation errors.
But this does not:
bb('a');
sub bb { print @_; }
Similarly, a sub without args, such as:
special_print;
sub special_print { print $some_stuff }
Will cause this error:
Bareword "special_print" not allowed while "strict subs" in use at t13.pl line 6.
Execution of t13.pl aborted due to compilation errors.
Ways to avoid this particular error are:
Put & before the sub name, e.g. &special_print
Put empty parenthesis after sub name, e.g. special_print()
Predeclare special_print with sub special_print at the top of the script.
Call special_print after the sub declaration.
My question is, why this special treatment? If I can use a sub globally within the script, why can't I use it any way I want it? Is there a logic to sub being implemented this way?
ETA: I know how I can fix it. I want to know the logic behind this.
I think what you are missing is that Perl uses a strictly one-pass parser. It does not scan the file for subroutines, and then go back and compile the rest. Knowing this, the following describes how the one pass parse system works:
In Perl, the sub NAME syntax for declaring a subroutine is equivalent to the following:
sub name {...} === BEGIN {*name = sub {...}}
This means that the sub NAME syntax has a compile time effect. When Perl is parsing source code, it is working with a current set of declarations. By default, the set is the builtin functions. Since Perl already knows about these, it lets you omit the parentheses.
As soon as the compiler hits a BEGIN block, it compiles the inside of the block using the current rule set, and then immediately executes the block. If anything in that block changes the rule set (such as adding a subroutine to the current namespace), those new rules will be in effect for the remainder of the parse.
Without a predeclared rule, an identifier will be interpreted as follows:
bareword === 'bareword' # a string
bareword LIST === syntax error, missing ','
bareword() === &bareword() # runtime execution of &bareword
&bareword === &bareword # same
&bareword() === &bareword() # same
When using strict and warnings as you have stated, barewords will not be converted into strings, so the first example is a syntax error.
When predeclared with any of the following:
sub bareword;
use subs 'bareword';
sub bareword {...}
BEGIN {*bareword = sub {...}}
Then the identifier will be interpreted as follows:
bareword === &bareword() # compile time binding to &bareword
bareword LIST === &bareword(LIST) # same
bareword() === &bareword() # same
&bareword === &bareword # same
&bareword() === &bareword() # same
So in order for the first example to not be a syntax error, one of the preceding subroutine declarations must be seen first.
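For instance, a minimal sketch with a hypothetical sub named shout: the forward declaration is enough for the parser to accept the call without parentheses, even though the body appears later in the file.

```perl
use strict;
use warnings;

sub shout;                      # forward declaration: "shout" is now a known sub

my $result = shout 'hello';     # parsed as shout('hello')

sub shout { return uc $_[0] }   # the actual definition can come later

print "$result\n";              # prints "HELLO"
```

Removing the forward declaration turns the call line into the "String found where operator expected" syntax error from the question.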
As to the why behind all of this, Perl has a lot of legacy. One of the goals in developing Perl was complete backwards compatibility. A script that works in Perl 1 still works in Perl 5. Because of this, it is not possible to change the rules surrounding bareword parsing.
That said, you will be hard pressed to find a language that is more flexible in the ways it lets you call subroutines. This allows you to find the method that works best for you. In my own code, if I need to call a subroutine before it has been declared, I usually use name(...), but if that subroutine has a prototype, I will call it as &name(...) (and you will get a warning "subroutine called too early to check prototype" if you don't call it this way).
The best answer I can come up with is that's the way Perl is written. It's not a satisfying answer, but in the end, it's the truth. Perl 6 (if it ever comes out) won't have this limitation.
Perl has a lot of crud and cruft from five different versions of the language. Perl 4 and Perl 5 did some major changes which can cause problems with earlier programs written in a free flowing manner.
Because of the long history, and the various ways Perl has and can work, it can be difficult for Perl to understand what's going on. When you have this:
b $a, $c;
Perl has no way of knowing if b is a string and is simply a bareword (which was allowed in Perl 4) or if b is a function. If b is a function, it should be stored in the symbol table as the rest of the program is parsed. If b isn't a subroutine, you shouldn't put it in the symbol table.
When the Perl compiler sees this:
b($a, $c);
It doesn't know what the function b does, but it at least knows it's a function and can store it in the symbol table waiting for the definition to come later.
When you pre-declare your function, Perl can see this:
sub b; #Or use subs qw(b); will also work.
b $a, $c;
and know that b is a function. It might not know what the function does, but there's now a symbol table entry for b as a function.
One of the reasons for Perl 6 is to remove much of the baggage left from the older versions of Perl and to remove strange things like this.
By the way, never ever use Perl Prototypes to get around this limitation. Use use subs or predeclare a blank subroutine. Don't use prototypes.
Parentheses are optional only if the subroutine has been predeclared. This is documented in perlsub.
Perl needs to know at compile time whether the bareword is a subroutine name or a string literal. If you use parentheses, Perl will guess that it's a subroutine name. Otherwise you need to provide this information beforehand (e.g. using subs).
The reason is that Larry Wall is a linguist, not a computer scientist.
Computer scientist: The grammar of the language should be as simple & clear as possible.
Avoids complexity in the compiler
Eliminates sources of ambiguity
Larry Wall: People work differently from compilers. The language should serve the programmer, not the compiler. See also Larry Wall's outline of the three virtues of a programmer.
As per my sample code below, there are two styles to call a subroutine: subname and subname().
#!C:\Perl\bin\perl.exe
use strict;
use warnings;
use 5.010;
&marine(); # style 1
&marine; # style 2
sub marine {
state $n = 0; # private, persistent variable $n
$n += 1;
print "Hello, sailor number $n!\n";
}
Which one, &marine(); or &marine;, is the better choice if there are no arguments in the call?
In Learning Perl, where this example comes from, we're at the very beginning of showing you subroutines. We only tell you to use the & so that you, as the beginning Perler, don't run into a problem where you define a subroutine with the same name as a Perl built-in then wonder why it doesn't work. The & in front always calls your defined subroutine. Beginning students often create their own subroutine log to print a message because they are used to doing that in other technologies they use. In Perl, that's the math function builtin.
After you get used to using Perl and you know about the Perl built-ins (scan through perlfunc), drop the &. There's some special magic with & that you hardly ever need:
marine();
You can leave off the () if you've pre-declared the subroutine, but I normally leave the () there even for an empty argument list. It's a bit more robust since you're giving Perl the hint that the marine is a subroutine name. To me, I recognize that more quickly as a subroutine.
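The name clash with log can be sketched like this. The message format of the user-defined sub is invented for the example, and CORE::log is used for the built-in call so that nothing is ambiguous:

```perl
use strict;
use warnings;

# A user-defined sub that shares its name with the built-in log (natural logarithm).
sub log { return "LOG: $_[0]" }

my $user_defined = &log(1);        # & always picks the user-defined sub
my $builtin      = CORE::log(1);   # CORE:: always picks the built-in; ln(1) is 0

print "$user_defined\n";           # prints "LOG: 1"
print "$builtin\n";                # prints "0"
```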
The side effect of using & without parentheses is that the subroutine is invoked with @_. This program
sub g {
print "g: @_\n";
}
sub f {
&g(); # g()
&g; # g(@_)
g(); # g()
g; # g()
}
f(1,2,3);
produces this output:
g:
g: 1 2 3
g:
g:
It's good style to declare your subroutines first with the sub keyword, then call them. (Of course there are ways around it, but why make things more complicated than necessary?)
Do not use the & syntax unless you know exactly what it does to @_ and to subroutines declared with prototypes. It is terribly obscure, rarely needed and a source of bugs through unintended behaviour. Just leave it out. Perl::Critic aptly says about it:
Since Perl 5, the ampersand sigil is completely optional when invoking subroutines.
Now, given following these style hints, I prefer to call subroutines that require no parameters in style 1, that is to say marine();. The reasons are
visual consistency with subroutines that do require parameters
it cannot be confused with a different keyword.
As a general rule I recommend the following:
Unless you need the & because you're overriding a built-in function or you have no parameter list, omit it.
Always include the () as in marine().
I do both of these for code readability. The first rule makes it clear when I'm overriding internal Perl functions by making them distinct. The second makes it clear when I'm invoking functions.
Perl allows you to omit parentheses in your function call.
So you can call your function with arguments in two different ways:
your_function( arg1,arg2,arg3);
or
your_function arg1, arg2, arg3;
It's a matter of choice which form you prefer. For users with a C background, the former is more intuitive.
I personally use the former for functions defined by me and latter for built in functions like:
print "something" instead of print("something")
I have heard that people shouldn't be using & to call Perl subs, i.e:
function($a,$b,...);
# opposed to
&function($a,$b,...);
I know for one the argument list becomes optional, but what are some cases where it is appropriate to use the & and the cases where you should absolutely not be using it?
Also, how does the performance increase come into play here when omitting the &?
I'm a frequent abuser of &, but mostly because I'm doing weird interface stuff. If you don't need one of these situations, don't use the &. Most of these are just to access a subroutine definition, not call a subroutine. It's all in perlsub.
Taking a reference to a named subroutine. This is probably the only common situation for most Perlers:
my $sub = \&foo;
Similarly, assigning to a typeglob, which allows you to call the subroutine with a different name:
*bar = \&foo;
Checking that a subroutine is defined, as you might in test suites:
if( defined &foo ) { ... }
Removing a subroutine definition, which shouldn't be common:
undef &foo;
Providing a dispatcher subroutine whose only job is to choose the right subroutine to call. This is the only situation I use & to call a subroutine, and when I expect to call the dispatcher many, many times and need to squeeze a little performance out of the operation:
sub figure_it_out_for_me {
# all of these re-use the current @_
if( ...some condition... ) { &foo }
elsif( ...some other... ) { &bar }
else { &default }
}
To jump into another subroutine using the current argument stack (and replacing the current subroutine in the call stack), a not uncommon operation in dispatching, especially in AUTOLOAD:
goto &sub;
Call a subroutine that you've named after a Perl built-in. The & always gives you the user-defined one. That's why we teach it in Learning Perl. You don't really want to do that normally, but it's one of the features of &.
There are some places where you could use them, but there are better ways:
To call a subroutine with the same name as a Perl built-in. Just don't have subroutines with the same name as a Perl built-in. Check perlfunc to see the list of built-in names you shouldn't use.
To disable prototypes. If you don't know what that means or why you'd want it, don't use the &. Some black magic code might need it, but in those cases you probably know what you are doing.
To dereference and execute a subroutine reference. Just use the -> notation.
IMO, the only time there's any reason to use & is if you're obtaining or calling a coderef, like:
sub foo() {
print "hi\n";
}
my $x = \&foo;
&$x();
The main time that you can use it that you absolutely shouldn't in most circumstances is when calling a sub that has a prototype that specifies any non-default call behavior. What I mean by this is that some prototypes allow reinterpretation of the argument list, for example converting @array and %hash specifications to references. So the sub will be expecting those reinterpretations to have occurred, and unless you go to whatever lengths are necessary to mimic them by hand, the sub will get inputs wildly different from those it expects.
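Here is a small sketch of that failure mode, using a hypothetical count_items sub whose \@ prototype makes Perl pass a reference to the array instead of its elements:

```perl
use strict;
use warnings;

# Hypothetical sub: the \@ prototype turns the array argument into a reference.
sub count_items(\@) {
    my ($array_ref) = @_;
    return scalar @$array_ref;
}

my @items = (10, 20, 30);

my $normal = count_items(@items);   # receives \@items, returns 3

# With &, the prototype is skipped and @items is flattened, so the sub
# tries to dereference the plain value 10 and dies under strict refs.
my $ok    = eval { &count_items(@items); 1 };
my $error = $ok ? '' : $@;

print "normal=$normal\n";
print "error: $error";
```

Calling it as &count_items(\@items) would work, but then you are mimicking the prototype by hand, which is exactly the extra effort described above.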
I think mainly people are trying to tell you that you're still writing in Perl 4 style, and we have a much cleaner, nicer thing called Perl 5 now.
Regarding performance, there are various ways that Perl optimizes sub calls which & defeats, with one of the main ones being inlining of constants.
There is also one circumstance where using & provides a performance benefit: if you're forwarding a sub call with foo(@_). Using &foo is infinitesimally faster than foo(@_). I wouldn't recommend it unless you've definitively found by profiling that you need that micro-optimization.
The &subroutine() form disables prototype checking. This may or may not be what you want.
http://www.perl.com/doc/manual/html/pod/perlsub.html#Prototypes
Prototypes allow you to specify the numbers and types of your subroutine arguments, and have them checked at compile time. This can provide useful diagnostic assistance.
Prototypes don't apply to method calls, or calls made in the old-fashioned style using the & prefix.
The & is necessary to reference or dereference a subroutine or code reference
e.g.
sub foo {
# a subroutine
}
my $subref = \&foo; # take a reference to the subroutine
&$subref(@args); # make a subroutine call using the reference.
my $anon_func = sub { ... }; # anonymous code reference
&$anon_func(); # called like this
Prototypes aren't applicable to subroutine references either.
The &subroutine form is also used in the so-called magic goto form.
The expression goto &subroutine replaces the current calling context with a call to the named subroutine, using the current value of @_.
In essence, you can completely switch a call to one subroutine with a call to the named one. This is commonly seen in AUTOLOAD blocks, where a deferred subroutine call can be made, perhaps with some modification to @_, but it looks to the program entirely as if it was a call to the named sub.
e.g.
sub AUTOLOAD {
...
push @_, @extra_args; # add more arguments onto the parameter list
goto &subroutine; # switch to another subroutine, as if we were never here
}
Potentially this could be useful for tail call elimination, I suppose.
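One way to see the "as if we were never here" effect is to ask caller who the apparent caller is. This is only a sketch, and all the sub names are invented for the demonstration:

```perl
use strict;
use warnings;

# Reports the name of the sub that appears to have called target().
sub target { return (caller(1))[3] }

sub via_call { return target(@_) }   # ordinary call: adds a stack frame
sub via_goto { goto &target }        # replaces via_goto's frame with target's

sub outer_call { return via_call() }
sub outer_goto { return via_goto() }

my $seen_by_call = outer_call();
my $seen_by_goto = outer_goto();

print "$seen_by_call\n";   # prints "main::via_call"
print "$seen_by_goto\n";   # prints "main::outer_goto"
```

With the ordinary call, target sees via_call as its caller. With goto &target, the intermediate frame is gone and target appears to have been called directly from outer_goto.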
I've read the arguments against using '&', but I nearly always use it. It saves me too much time not to. I spend a very large fraction of my Perl coding time looking for what parts of the code call a particular function. With a leading &, I can search and find them instantly. Without a leading &, I get the function definition, comments, and debug statements, usually tripling the amount of code I have to inspect to find what I'm looking for.
The main thing not using '&' buys you is it lets you use function prototypes. But Perl function prototypes may create errors as often as they prevent them, because they will take your argument list and reinterpret it in ways you might not expect, so that your function call no longer passes the arguments that it literally says it does.