How can I smoke out undefined subroutines? - perl

I want to scan a code base to identify all instances of undefined subroutines that are not presently reachable.
As an example:
use strict;
use warnings;

my $flag = 0;
if ( $flag ) {
    undefined_sub();
}
When $flag evaluates to true, the following error is raised at run time:
Undefined subroutine &main::undefined_sub called at - line 6
I don't want to rely on errors raised at run time to identify undefined subroutines.
The strict and warnings pragmas don't help here. use strict 'subs' has no effect.
Even the following code snippet is silent
$ perl -Mstrict -we 'exit 0; undefined_sub()'

Perhaps the Subroutines::ProhibitCallsToUndeclaredSubs policy from Perl::Critic can help.
This Policy checks that every unqualified subroutine call has a matching subroutine declaration in the current file, or that it explicitly appears in the import list for one of the included modules.
This "policy" is a part of Perl::Critic::StricterSubs, which needs to be installed. There are a few more policies there. This is considered a severity 4 violation, so you can do
perlcritic -4
and parse the output for neither declared nor explicitly imported, or use
perlcritic -4 --single-policy ProhibitCallsToUndeclaredSubs
Some legitimate uses are still flagged, since it requires all subs to be imported explicitly.
This is a static analyzer, which I think should fit your purpose.

What you're asking for is in at least some sense impossible. Consider the following code snippet:
( rand()<0.5 ? *foo : *bar ) = sub { say "Hello World!" };
foo();
There is a 50% chance that this will run OK, and a 50% chance that it will give an "Undefined subroutine" error. The decision is made at runtime, so it's not possible to tell before that what it will be. This is of course a contrived case to demonstrate a point, but runtime (or compile-time) generation of subroutines is not that uncommon in real code. For an example, look at how Moose adds functions that create methods. Static source code analysis will never be able to fully analyze such code.
B::Lint is probably about as good as something pre-runtime can get.
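To make the point about runtime-generated subroutines concrete, here is a minimal sketch of accessor generation, loosely in the spirit of what the answer attributes to Moose. The field names and the get_ prefix are made up for illustration, and this is not how Moose itself is implemented.

```perl
use strict;
use warnings;

# Hypothetical field names, chosen only for this illustration.
my @fields = qw(name email);

for my $field (@fields) {
    no strict 'refs';
    # Install a sub called get_<field> into package main at run time.
    *{"main::get_$field"} = sub { my ($self) = @_; return $self->{$field} };
}

my $user = { name => 'Ann', email => 'ann@example.com' };

# No "sub get_name" appears anywhere in the source, yet this call works.
my $name = get_name($user);
print "$name\n";
```

A checker that only reads the source sees a call to get_name with no matching declaration, so it has to either flag a call that works or stay silent about calls that might not.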

To find calls to subs that aren't defined at compile time, you can use B::Lint as follows:
use List::Util qw( min );

sub defined_sub { }
sub defined_later;
sub undefined_sub;

defined_sub();
defined_later();
undefined_sub();
undeclared_sub();
min();
max(); # XXX Didn't import
List::Util::min();
List::Util::mac(); # XXX Typo!

sub defined_later { }
$ perl -MO=Lint,undefined-subs
Undefined subroutine 'undefined_sub' called at line 9
Nonexistent subroutine 'undeclared_sub' called at line 10
Nonexistent subroutine 'max' called at line 12
Nonexistent subroutine 'List::Util::mac' called at line 14
syntax OK
Note that this is just for sub calls. Method calls (such as Class->method and method Class) aren't checked. But you are asking about sub calls.
Note that foo $x is a valid method call (using the indirect method call syntax) meaning $x->foo if foo isn't a valid function or sub, so B::Lint won't catch that. But it will catch foo($x).


perl function definition fails with uninitialized value

My Perl chops are a little stale, so I'm probably missing something really obvious here, but I've added a small module to some older CGI code to refactor common functions. Here is an excerpt of the module with the part that is giving me problems:
package Common;
use strict;
use warnings;
use base 'Exporter';
our @EXPORT_OK = (&fail_with_error);
sub fail_with_error {
my ($errmsg, $textcolor) = @_;
my $output = printf("<p><font color=\"%s\">ERROR: %s </font>/<p>", $textcolor, $errmsg);
When I execute this module directly with perl (or when I just import the function in test code, without even calling it) what I get is an uninitialized value error for $errmsg and $textcolor like this:
$ perl
Use of uninitialized value $textcolor in printf at line 10.
Use of uninitialized value $errmsg in printf at line 10.
<p><font color="">ERROR: </font>/<p>1
It would seem that perl is giving the warning because it is executing the subroutine code literally, but the nature of a subroutine is that it is abstracted so different values can be passed in, correct? It would stand to reason that these shouldn't have to be populated to pass interpreter warnings, but nonetheless something is wrong.
I've searched around, but this error is very common because in most cases the variable really is uninitialized. I can't seem to find anything that applies to this type of case.
That's because you're accidentally populating @EXPORT_OK with a call to fail_with_error (&fail_with_error) instead of the function name. This calls fail_with_error with the arguments taken from the current @_, which happens to be empty, so naturally both variables are uninitialized (and your function doesn't get exported either). The correct assignment uses just the subroutine name:
our @EXPORT_OK = qw( fail_with_error );
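As a rough single-file sketch of the corrected module in use: the sprintf call (instead of printf) and the tidied closing tag are my own changes, not part of the original excerpt, and the explicit import call stands in for a use line because both packages share one file here.

```perl
package Common;
use strict;
use warnings;
use Exporter 'import';

# Names only; writing &fail_with_error here would call the sub instead.
our @EXPORT_OK = qw( fail_with_error );

sub fail_with_error {
    my ($errmsg, $textcolor) = @_;
    # sprintf returns the formatted string; printf prints it and returns 1.
    return sprintf '<p><font color="%s">ERROR: %s</font></p>', $textcolor, $errmsg;
}

package main;

# Stand-in for "use Common qw( fail_with_error );" when Common lives in its
# own file; here the import runs after @EXPORT_OK has been assigned.
Common->import('fail_with_error');

my $html = fail_with_error('disk full', 'red');
print "$html\n";
```

Loading or importing this version produces no warnings, because nothing calls the sub until the last two lines do so with real arguments.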

What's the difference between a Perl sub declaration with and without parentheses?

What is exactly the difference (if there is any) between:
sub foobar() {
# doing some stuff
}
and
sub foobar {
# doing some stuff
}
I see some of each, and the first syntax sometimes fails to compile.
By putting the () on the end of the subroutine name, you are giving it a prototype. A prototype gives Perl hints about the number and types of the arguments that you will be passing to the subroutine. See the Prototypes section of perlsub for details.
Specifically, () is the empty prototype which tells Perl that this subroutine takes no arguments and Perl will throw a compilation error if you then call this subroutine with arguments. Here's an example:
#!/usr/bin/perl

use strict;
use warnings;
use 5.010;

sub foo {
  say 'In foo';
}

sub bar() {
  say 'In bar';
}

foo();
bar();
foo(1);
bar(1);
The output from this is:
Too many arguments for main::bar at ./foobar line 18, near "1)"
Execution of ./foobar aborted due to compilation errors.
It's that last call to bar() (the one with the argument of 1) which causes this error.
It's worth noting that Perl's prototypes are generally less useful than people expect, and outside of a few specialised cases most Perl experts avoid them. I recommend that you do the same. Subroutine signatures, added as an experimental feature in Perl v5.20 and declared stable in v5.36, accomplish many of the goals that programmers from other languages would have expected from prototypes.
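For comparison, here is a small sketch of a signature, assuming Perl v5.20 or later. The sub name add is invented for the example. Unlike a prototype, the argument count is checked when the sub is called, not when the call is compiled.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';   # only needed before v5.36

# A signature names the parameters and fixes how many are expected.
sub add ($x, $y) { return $x + $y }

my $sum = add(2, 3);
print "sum: $sum\n";

# Calling with the wrong number of arguments dies at run time.
my $ok = eval { add(1); 1 };
print "arity error: $@" unless $ok;
```

Because the check happens at run time, the eval block can trap it, which a compile-time prototype error would not allow.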

perl -wle 'if (0) {no_such_func()}'
The above runs without errors, despite the -w, because no_such_func()
is never called.
How do I make Perl check all functions/modules I reference, even ones
I don't use?
In a more realistic case, some functions may only be called in special
cases, but I'd still like to make sure they exist.
EDIT: I installed perlcritic, but I think I'm still doing something wrong. I created this file:
#!/bin/perl -w
use strict;
if (0) {no_such_func();}
and perlcritic said it was fine ("source OK"). Surely static analysis could catch the non-existence of no_such_func()? The program also runs fine (and produces no output).
You can't do it because Perl doesn't see if functions exist until runtime. It can't. Consider a function that only gets evaled into existence:
eval 'sub foo { return $_[0]+1 }';
That line of code will create a subroutine at runtime.
Or consider that Perl can use symbolic references:
my $func = 'func';
$func = "no_such_" . $func;
&$func;
In that case, it's calling your no_such_func function, but you can't tell via static analysis.
BTW, if you want to find functions that are never referenced, at least via static analysis, then you can use a tool like Perl::Critic, which you can install from CPAN.
Hmm, this is difficult: When Perl parses a function call, it doesn't always know whether that function will exist. This is the case when a function is called before it's declared:
foo();
sub foo { say 42 }
Sometimes, the function may only be made available during runtime:
my $bar = sub { say 42 };
my $baz = sub { say "" };
*foo = rand > 0.5 ? $bar : $baz;
(I'd like to mention the “Only perl can parse Perl” meme at this point.)
I'm sure you could hack the perl internals to complain when the function can't be resolved before run time, but this is not really useful considering the use cases above.
You can force compile-time checks if you call all subs without parentheses. So the following will fail:
$ perl -e 'use strict; my $condvar; if ($condvar) {no_such_func}'
Bareword "no_such_func" not allowed while "strict subs" in use at -e line 1.
Execution of -e aborted due to compilation errors.
(However, it does not fail if you write if (0); it seems that perl's optimizer removes the whole block without checking any further.)
This has the consequence that you have to define all subroutines before using them. If you work like this, then sometimes "forward" declarations are necessary. But then it's possible that a forward declaration never gets a definition, which is another possible error case.
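The last case, a forward declaration that never gets a body, can at least be detected at run time with exists and defined. A minimal sketch, with sub names invented for the example:

```perl
use strict;
use warnings;

sub declared_only;                 # forward declaration, no body anywhere
sub fully_defined { return 1 }

# exists is true once a sub has been declared; defined also needs a body.
my %status = (
    declared_only   => [ exists(&declared_only)   ? 1 : 0, defined(&declared_only)   ? 1 : 0 ],
    fully_defined   => [ exists(&fully_defined)   ? 1 : 0, defined(&fully_defined)   ? 1 : 0 ],
    never_mentioned => [ exists(&never_mentioned) ? 1 : 0, defined(&never_mentioned) ? 1 : 0 ],
);

for my $name (sort keys %status) {
    printf "%-16s exists=%d defined=%d\n", $name, @{ $status{$name} };
}
```

Only declared_only reports exists without defined, which is the signature of a forward declaration left dangling.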

What do these two warnings about comments and prototypes mean in Perl?

I have the following code
#! /usr/bin/perl
use strict;
use warnings;
################### Start Main ####################
my @startupPrograms = qw(google-chrome thunderbird skype pidgin );
my @pagesToBeOpenedInChrome = qw(;
sub main() {
and I get following warning
[aniket@localhost TestCodes]$ ./
Possible attempt to put comments in qw() list at ./ line 8.
main::main() called too early to check prototype at ./ line 9.
Program works fine but I am not able to understand the warnings. What do they mean?
This warning:
Possible attempt to put comments in qw() list at ./ line 8.
Refers to this part of the specified line:
# ---^
The # sign starts a comment in Perl, and qw() has a few special warnings attached to it. It's nothing to worry about, but it does look like a redundant warning in this case. If you want to fix it, you can enclose the assignment in a block and use no warnings 'qw'. This is, however, somewhat clunky with a lexically scoped variable:
my @pages; # must be outside block
{
no warnings 'qw';
@pages = qw( .... );
}
I have some doubts about the usefulness of warnings 'qw', and in a small script you can just remove the pragma globally by adding no warnings 'qw' at the top of the script.
This warning:
main::main() called too early to check prototype at ./ line 9.
This has to do with the empty parentheses after your sub name. They denote that you wish to use prototypes with your subroutine, and that your sub should be called without args. Prototypes are used to make subroutines behave like built-ins, which is to say it's not something you really need to worry about, and you should in almost all cases ignore them. So just remove the empty parentheses.
If you really, truly wish to use prototypes, you need to put either a predeclaration or the sub declaration itself before the place you intend to use it. E.g.
sub main (); # predeclaration
sub main () {
In the first warning Perl complains about the hash in the quote operator:
my @foo = qw(foo bar #baz);
Here the hash is a part of the last URL and Perl thinks you maybe wanted to place a comment there. You can get rid of the warning by quoting the items explicitly:
my @foo = (
'first URL',
'second URL',
'and so on',
);
It’s also more readable IMHO; the qw(…) construct is a better fit for simpler lists only.
The second warning is a bit weird, because Perl obviously knows about the sub, otherwise it would not complain. Anyway, you can drop the () part in the sub definition, and everything will be OK:
sub main {
The () here does something else than you think, it’s not needed to define a simple sub. (It’s a sub prototype and most probably you don’t want to use it.) By the way, there’s no need at all to declare a main sub in Perl, just dump whatever code you need there instead of the sub definition.
Possible attempt to put comments in qw() list at ./ line 8.
This warning complains because you have a # in your quoted words list. The # starts a comment in Perl. The warning lets you know that you might have put a comment in there by mistake.
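A quick sketch showing that the warning is only advisory: inside qw() the # does not start a comment, it is just part of a word. The list contents here are placeholders.

```perl
use strict;
use warnings;

my @items;
{
    no warnings 'qw';              # silence the advisory warning for this block
    @items = qw(foo bar #baz);
}

# qw() splits on whitespace only, so "#baz" is kept as a third element.
my $count = scalar @items;
print "$count items, last is '$items[-1]'\n";
```

So the list in the question is built as intended; the warning only matters if a comment was really what the author meant to write.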
To remove the second warning
main::main() called too early to check prototype at ./
you can call the main sub as &main(), which bypasses prototype checking:
sub main() {

Why is parenthesis optional only after sub declaration?

(Assume use strict; use warnings; throughout this question.)
I am exploring the usage of sub.
sub bb { print @_; }
bb 'a';
This works as expected. The parentheses are optional, as with many other functions such as print, open etc.
However, this causes a compilation error:
bb 'a';
sub bb { print @_; }
String found where operator expected at line 4, near "bb 'a'"
(Do you need to predeclare bb?)
syntax error at line 4, near "bb 'a'"
Execution of aborted due to compilation errors.
But this does not:
bb('a');
sub bb { print @_; }
Similarly, a sub without args, such as:
special_print;
sub special_print { print $some_stuff }
Will cause this error:
Bareword "special_print" not allowed while "strict subs" in use at line 6.
Execution of aborted due to compilation errors.
Ways to alleviate this particular error are:
Put & before the sub name, e.g. &special_print
Put empty parentheses after the sub name, e.g. special_print()
Predeclare special_print with sub special_print; at the top of the script.
Call special_print after the sub declaration.
My question is, why this special treatment? If I can use a sub globally within the script, why can't I use it any way I want it? Is there a logic to sub being implemented this way?
ETA: I know how I can fix it. I want to know the logic behind this.
I think what you are missing is that Perl uses a strictly one-pass parser. It does not scan the file for subroutines, and then go back and compile the rest. Knowing this, the following describes how the one pass parse system works:
In Perl, the sub NAME syntax for declaring a subroutine is equivalent to the following:
sub name {...} === BEGIN {*name = sub {...}}
This means that the sub NAME syntax has a compile time effect. When Perl is parsing source code, it is working with a current set of declarations. By default, the set is the builtin functions. Since Perl already knows about these, it lets you omit the parentheses.
As soon as the compiler hits a BEGIN block, it compiles the inside of the block using the current rule set, and then immediately executes the block. If anything in that block changes the rule set (such as adding a subroutine to the current namespace), those new rules will be in effect for the remainder of the parse.
Without a predeclared rule, an identifier will be interpreted as follows:
bareword === 'bareword' # a string
bareword LIST === syntax error, missing ','
bareword() === &bareword() # runtime execution of &bareword
&bareword === &bareword # same
&bareword() === &bareword() # same
When using strict and warnings as you have stated, barewords will not be converted into strings, so the first example is a syntax error.
When predeclared with any of the following:
sub bareword;
use subs 'bareword';
sub bareword {...}
BEGIN {*bareword = sub {...}}
Then the identifier will be interpreted as follows:
bareword === &bareword() # compile time binding to &bareword
bareword LIST === &bareword(LIST) # same
bareword() === &bareword() # same
&bareword === &bareword # same
&bareword() === &bareword() # same
So in order for the first example to not be a syntax error, one of the preceding subroutine declarations must be seen first.
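A minimal sketch of the predeclared case, using a made-up sub called greet: the forward declaration is what lets the parenthesis-free call parse, even though the body appears later in the file.

```perl
use strict;
use warnings;

sub greet;                      # forward declaration: greet is now a known sub

my $msg = greet 'world';        # parses as greet('world')
print "$msg\n";

sub greet { return "hello, $_[0]" }
```

Removing the sub greet; line turns the call into the "String found where operator expected" syntax error from the question.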
As to the why behind all of this, Perl has a lot of legacy. One of the goals in developing Perl was complete backwards compatibility. A script that works in Perl 1 still works in Perl 5. Because of this, it is not possible to change the rules surrounding bareword parsing.
That said, you will be hard pressed to find a language that is more flexible in the ways it lets you call subroutines. This allows you to find the method that works best for you. In my own code, if I need to call a subroutine before it has been declared, I usually use name(...), but if that subroutine has a prototype, I will call it as &name(...) (and you will get a warning "subroutine called too early to check prototype" if you don't call it this way).
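The difference between name(...) and &name(...) for a prototyped sub can be sketched as follows. The sub no_args is hypothetical, and the plain call is wrapped in a string eval only so that its compile-time failure can be shown without aborting the script.

```perl
use strict;
use warnings;

sub no_args () { return scalar @_ }   # empty prototype: takes no arguments

# A plain call with an argument is rejected when the call is compiled.
my $plain = eval q{ no_args(1) };
print "plain call failed: $@" if $@;

# The &name(...) form skips prototype checking and passes the list through.
my $count = &no_args(1, 2, 3);
print "ampersand call saw $count arguments\n";
```

The same bypass is why calling a prototyped sub as &name(...) before its declaration avoids the "called too early to check prototype" warning.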
The best answer I can come up with is that's the way Perl is written. It's not a satisfying answer, but in the end, it's the truth. Perl 6 (if it ever comes out) won't have this limitation.
Perl has a lot of crud and cruft from five different versions of the language. Perl 4 and Perl 5 did some major changes which can cause problems with earlier programs written in a free flowing manner.
Because of the long history, and the various ways Perl has and can work, it can be difficult for Perl to understand what's going on. When you have this:
b $a, $c;
Perl has no way of knowing if b is a string and is simply a bareword (which was allowed in Perl 4) or if b is a function. If b is a function, it should be stored in the symbol table as the rest of the program is parsed. If b isn't a subroutine, you shouldn't put it in the symbol table.
When the Perl compiler sees this:
b($a, $c);
It doesn't know what the function b does, but it at least knows it's a function and can store it in the symbol table waiting for the definition to come later.
When you pre-declare your function, Perl can see this:
sub b; #Or use subs qw(b); will also work.
b $a, $c;
and know that b is a function. It might not know what the function does, but there's now a symbol table entry for b as a function.
One of the reasons for Perl 6 is to remove much of the baggage left from the older versions of Perl and to remove strange things like this.
By the way, never ever use Perl Prototypes to get around this limitation. Use use subs or predeclare a blank subroutine. Don't use prototypes.
Parentheses are optional only if the subroutine has been predeclared. This is documented in perlsub.
Perl needs to know at compile time whether the bareword is a subroutine name or a string literal. If you use parentheses, Perl will guess that it's a subroutine name. Otherwise you need to provide this information beforehand (e.g. with use subs).
The reason is that Larry Wall is a linguist, not a computer scientist.
Computer scientist: The grammar of the language should be as simple & clear as possible.
Avoids complexity in the compiler
Eliminates sources of ambiguity
Larry Wall: People work differently from compilers. The language should serve the programmer, not the compiler. See also Larry Wall's outline of the three virtues of a programmer.