Make Perl warn me about missing unused functions [duplicate] - perl

This question already has answers here:
How do I get perl -c to throw Undefined or Undeclared function errors?
(4 answers)
Closed 8 years ago.
perl -wle 'if (0) {no_such_func()}'
The above runs without errors, despite the -w, because no_such_func()
is never called.
How do I make Perl check all functions/modules I reference, even ones
I don't use?
In a more realistic case, some functions may only be called in special
cases, but I'd still like to make sure they exist.
EDIT: I installed perlcritic, but I think I'm still doing something wrong. I created this file:
#!/bin/perl -w
use strict;
if (0) {no_such_func();}
and perlcritic said it was fine ("source OK"). Surely static analysis could catch the non-existence of no_such_func()? The program also runs fine (and produces no output).

You can't do it because Perl doesn't see if functions exist until runtime. It can't. Consider a function that only gets evaled into existence:
eval 'sub foo { return $_[0]+1 }';
That line of code will create a subroutine at runtime.
Or consider that Perl can use symbolic references
my $func = 'func';
$func = "no_such_" . $func;
&$func;
In that case, it's calling your no_such_func function, but you can't tell via static analysis.
BTW, if you want to find functions that are never referenced, at least via static analysis, then you can use a tool like Perl::Critic. See http://perlcritic.com/ or install Perl::Critic from CPAN.

Hmm, this is difficult: When Perl parses a function call, it doesn't always know whether that function will exist. This is the case when a function is called before it's declared:
foo();
sub foo { say 42 }
Sometimes, the function may only be made available during runtime:
my $bar = sub { say 42 };
my $baz = sub { say "" };
*foo = rand > 0.5 ? $bar : $baz;
foo();
(I'd like to mention the “Only perl can parse Perl” meme at this point.)
I'm sure you could hack the perl internals to complain when the function can't be resolved before run time, but this is not really useful considering the use cases above.

You can force compile-time checks if you call all subs without parentheses. So the following will fail:
$ perl -e 'use strict; my $condvar; if ($condvar) {no_such_func}'
Bareword "no_such_func" not allowed while "strict subs" in use at -e line 1.
Execution of -e aborted due to compilation errors.
(However, it does not fail if you write if (0), it seems that perl's optimizer is removing the whole block without checking any further)
This has the consequence that you have to define all subroutines before using them. If you work like this, then sometimes "forward" declarations are necessary. But then it's possible that a forward declaration never gets a definition, which is another possible error case.

Related

How can I smoke out undefined subroutines?

I want to scan a code base to identify all instances of undefined subroutines that are not presently reachable.
As an example:
use strict;
use warnings;
my $flag = 0;
if ( $flag ) {
undefined_sub();
}
Observations
When $flag evaluates to true, the following warning is emitted:
Undefined subroutine &main::undefined_sub called at - line 6
I don't want to rely on warnings issued at run-time to identify undefined subroutines
The strict and warnings pragmas don't help here. use strict 'subs' has no effect.
Even the following code snippet is silent
$ perl -Mstrict -we 'exit 0; undefined_sub()'
Perhaps Subroutines::ProhibitCallsToUndeclaredSubs policy from Perl::Critic can help
This Policy checks that every unqualified subroutine call has a matching subroutine declaration in the current file, or that it explicitly appears in the import list for one of the included modules.
This "policy" is a part of Perl::Critic::StricterSubs, which needs to be installed. There are a few more policies there. This is considered a severity 4 violation, so you can do
perlcritic -4 script.pl
and parse the output for neither declared nor explicitly imported, or use
perlcritic -4 --single-policy ProhibitCallsToUndeclaredSubs script.pl
Some legitimate uses are still flagged, since it requires all subs to be imported explicitly.
This is a static analyzer, which I think should fit your purpose.
What you're asking for is in at least some sense impossible. Consider the following code snippet:
( rand()<0.5 ? *foo : *bar } = sub { say "Hello World!" };
foo();
There is a 50% chance that this will run OK, and a 50% chance that it will give an "Undefined subroutine" error. The decision is made at runtime, so it's not possible to tell before that what it will be. This is of course a contrived case to demonstrate a point, but runtime (or compile-time) generation of subroutines is not that uncommon in real code. For an example, look at how Moose adds functions that create methods. Static source code analysis will never be able to fully analyze such code.
B::Lint is probably about as good as something pre-runtime can get.
To find calls to subs that aren't defined at compile time, you can use B::Lint as follows:
a.pl:
use List::Util qw( min );
sub defined_sub { }
sub defined_later;
sub undefined_sub;
defined_sub();
defined_later();
undefined_sub();
undeclared_sub();
min();
max(); # XXX Didn't import
List::Util::max();
List::Util::mac(); # XXX Typo!
sub defined_later { }
Test:
$ perl -MO=Lint,undefined-subs a.pl
Undefined subroutine 'undefined_sub' called at a.pl line 9
Nonexistent subroutine 'undeclared_sub' called at a.pl line 10
Nonexistent subroutine 'max' called at a.pl line 12
Nonexistent subroutine 'List::Util::mac' called at a.pl line 14
a.pl syntax OK
Note that this is just for sub calls. Method calls (such as Class->method and method Class) aren't checked. But you are asking about sub calls.
Note that foo $x is a valid method call (using the indirect method call syntax) meaning $x->foo if foo isn't a valid function or sub, so B::Lint won't catch that. But it will catch foo($x).

identify a procedure and replace it with a different procedure

What I want to achieve:
###############CODE########
old_procedure(arg1, arg2);
#############CODE_END######
I have a huge code which has a old procedure in it. I want that the call to that old_procedure go to a call to a new procedure (new_procedure(arg1, arg2)) with the same arguments.
Now I know, the question seems pretty stupid but the trick is I am not allowed to change the code or the bad_function. So the only thing I can do it create a procedure externally which reads the code flow or something and then whenever it finds the bad_function, it replaces it with the new_function. They have a void type, so don't have to worry about the return values.
I am usng perl. If someone knows how to atleast start in this direction...please comment or answer. It would be nice if the new code can be done in perl or C, but other known languages are good too. C++, java.
EDIT: The code is written in shell script and perl. I cannot edit the code and I don't have location of the old_function, I mean I can find it...but its really tough. So I can use the package thing pointed out but if there is a way around it...so that I could parse the thread with that function and replace function calls. Please don't remove tags as I need suggestions from java, C++ experts also.
EDIT: #mirod
So I tried it out and your answer made a new subroutine and now there is no way of accessing the old one. I had created an variable which checks the value to decide which way to go( old_sub or new_sub)...is there a way to add the variable in the new code...which sends the control back to old_function if it is not set...
like:
use BadPackage; # sub is defined there
BEGIN
{ package BapPackage;
no warnings; # to avoid the "Subroutine bad_sub redefined" message
# check for the variable and send to old_sub if the var is not set
sub bad_sub
{ # good code
}
}
# Thanks #mirod
This is easier to do in Perl than in a lot of other languages, but that doesn't mean it's easy, and I don't know if it's what you want to hear. Here's a proof-of-concept:
Let's take some broken code:
# file name: Some/Package.pm
package Some::Package;
use base 'Exporter';
our #EXPORT = qw(forty_two nineteen);
sub forty_two { 19 }
sub nineteen { 19 }
1;
# file name: main.pl
use Some::Package;
print "forty-two plus nineteen is ", forty_two() + nineteen();
Running the program perl main.pl produces the output:
forty-two plus nineteen is 38
It is given that the files Some/Package.pm and main.pl are broken and immutable. How can we fix their behavior?
One way we can insert arbitrary code to a perl command is with the -M command-line switch. Let's make a repair module:
# file: MyRepairs.pm
CHECK {
no warnings 'redefine';
*forty_two = *Some::Package::forty_two = sub { 42 };
};
1;
Now running the program perl -MMyRepairs main.pl produces:
forty-two plus nineteen is 61
Our repair module uses a CHECK block to execute code in between the compile-time and run-time phase. We want our code to be the last code run at compile-time so it will overwrite some functions that have already been loaded. The -M command-line switch will run our code first, so the CHECK block delays execution of our repairs until all the other compile time code is run. See perlmod for more details.
This solution is fragile. It can't do much about modules loaded at run-time (with require ... or eval "use ..." (these are common) or subroutines defined in other CHECK blocks (these are rare).
If we assume the shell script that runs main.pl is also immutable (i.e., we're not allowed to change perl main.pl to perl -MMyRepairs main.pl), then we move up one level and pass the -MMyRepairs in the PERL5OPT environment variable:
PERL5OPT="-I/path/to/MyRepairs -MMyRepairs" bash the_immutable_script_that_calls_main_pl.sh
These are called automated refactoring tools and are common for other languages. For Perl though you may well be in a really bad way because parsing Perl to find all the references is going to be virtually impossible.
Where is the old procedure defined?
If it is defined in a package, you can switch to the package, after it has been used, and redefine the sub:
use BadPackage; # sub is defined there
BEGIN
{ package BapPackage;
no warnings; # to avoid the "Subroutine bad_sub redefined" message
sub bad_sub
{ # good code
}
}
If the code is in the same package but in a different file (loaded through a require), you can do the same thing without having to switch package.
if all the code is in the same file, then change it.
sed -i 's/old_procedure/new_procedure/g codefile
Is this what you mean?

Why use strict and warnings?

It seems to me that many of the questions in the Perl tag could be solved if people would use:
use strict;
use warnings;
I think some people consider these to be akin to training wheels, or unnecessary complications, which is clearly not true, since even very skilled Perl programmers use them.
It seems as though most people who are proficient in Perl always use these two pragmas, whereas those who would benefit most from using them seldom do. So, I thought it would be a good idea to have a question to link to when encouraging people to use strict and warnings.
So, why should a Perl developer use strict and warnings?
For starters, use strict; (and to a lesser extent, use warnings;) helps find typos in variable names. Even experienced programmers make such errors. A common case is forgetting to rename an instance of a variable when cleaning up or refactoring code.
Using use strict; use warnings; catches many errors sooner than they would be caught otherwise, which makes it easier to find the root causes of the errors. The root cause might be the need for an error or validation check, and that can happen regardless or programmer skill.
What's good about Perl warnings is that they are rarely spurious, so there's next to no cost to using them.
Related reading: Why use my?
Apparently use strict should (must) be used when you want to force Perl to code properly which could be forcing declarations, being explicit on strings and subs, i.e., barewords or using refs with caution. Note: if there are errors, use strict will abort the execution if used.
While use warnings; will help you find typing mistakes in program like you missed a semicolon, you used 'elseif' and not 'elsif', you are using deprecated syntax or function, whatever like that. Note: use warnings will only provide warnings and continue execution, i.e., it won't abort the execution...
Anyway, it would be better if we go into details, which I am specifying below
From perl.com (my favourite):
use strict 'vars';
which means that you must always declare variables before you use them.
If you don't declare you will probably get an error message for the undeclared variable:
Global symbol "$variablename" requires explicit package name at scriptname.pl line 3
This warning means Perl is not exactly clear about what the scope of the variable is. So you need to be explicit about your variables, which means either declaring them with my, so they are restricted to the current block, or referring to them with their fully qualified name (for ex: $MAIN::variablename).
So, a compile-time error is triggered if you attempt to access a variable that hasn't met at least one of the following criteria:
Predefined by Perl itself, such as #ARGV, %ENV, and all the global punctuation variables such as $. Or $_.
Declared with our (for a global) or my (for a lexical).
Imported from another package. (The use vars pragma fakes up an import, but use our instead.)
Fully qualified using its package name and the double-colon package separator.
use strict 'subs';
Consider two programs
# prog 1
$a = test_value;
print "First program: ", $a, "\n";
sub test_value { return "test passed"; }
Output: First program's result: test_value
# prog 2
sub test_value { return "test passed"; }
$a = test_value;
print "Second program: ", $a, "\n";
Output: Second program's result: test passed
In both cases we have a test_value() sub and we want to put its result into $a. And yet, when we run the two programs, we get two different results:
In the first program, at the point we get to $a = test_value;, Perl doesn't know of any test_value() sub, and test_value is interpreted as string 'test_value'. In the second program, the definition of test_value() comes before the $a = test_value; line. Perl thinks test_value as sub call.
The technical term for isolated words like test_value that might be subs and might be strings depending on context, by the way, is bareword. Perl's handling of barewords can be confusing, and it can cause bug in program.
The bug is what we encountered in our first program, Remember that Perl won't look forward to find test_value(), so since it hasn't already seen test_value(), it assumes that you want a string. So if you use strict subs;, it will cause this program to die with an error:
Bareword "test_value" not allowed while "strict subs" in use at
./a6-strictsubs.pl line 3.
Solution to this error would be
Use parentheses to make it clear you're calling a sub. If Perl sees $a = test_value();,
Declare your sub before you first use it
use strict;
sub test_value; # Declares that there's a test_value() coming later ...
my $a = test_value; # ...so Perl will know this line is okay.
.......
sub test_value { return "test_passed"; }
And If you mean to use it as a string, quote it.
So, This stricture makes Perl treat all barewords as syntax errors. A bareword is any bare name or identifier that has no other interpretation forced by context. (Context is often forced by a nearby keyword or token, or by predeclaration of the word in question.) So If you mean to use it as a string, quote it and If you mean to use it as a function call, predeclare it or use parentheses.
Barewords are dangerous because of this unpredictable behavior. use strict; (or use strict 'subs';) makes them predictable, because barewords that might cause strange behavior in the future will make your program die before they can wreak havoc
There's one place where it's OK to use barewords even when you've turned on strict subs: when you are assigning hash keys.
$hash{sample} = 6; # Same as $hash{'sample'} = 6
%other_hash = ( pie => 'apple' );
Barewords in hash keys are always interpreted as strings, so there is no ambiguity.
use strict 'refs';
This generates a run-time error if you use symbolic references, intentionally or otherwise.
A value that is not a hard reference is then treated as a symbolic reference. That is, the reference is interpreted as a string representing the name of a global variable.
use strict 'refs';
$ref = \$foo; # Store "real" (hard) reference.
print $$ref; # Dereferencing is ok.
$ref = "foo"; # Store name of global (package) variable.
print $$ref; # WRONG, run-time error under strict refs.
use warnings;
This lexically scoped pragma permits flexible control over Perl's built-in warnings, both those emitted by the compiler as well as those from the run-time system.
From perldiag:
So the majority of warning messages from the classifications below, i.e., W, D, and S can be controlled using the warnings pragma.
(W) A warning (optional)
(D) A deprecation (enabled by default)
(S) A severe warning (enabled by default)
I have listed some of warnings messages those occurs often below by classifications. For detailed info on them and others messages, refer to perldiag.
(W) A warning (optional):
Missing argument in %s
Missing argument to -%c
(Did you mean &%s instead?)
(Did you mean "local" instead of "our"?)
(Did you mean $ or # instead of %?)
'%s' is not a code reference
length() used on %s
Misplaced _ in number
(D) A deprecation (enabled by default):
defined(#array) is deprecated
defined(%hash) is deprecated
Deprecated use of my() in false conditional
$# is no longer supported
(S) A severe warning (enabled by default)
elseif should be elsif
%s found where operator expected
(Missing operator before %s?)
(Missing semicolon on previous line?)
%s never introduced
Operator or semicolon missing before %s
Precedence problem: open %s should be open(%s)
Prototype mismatch: %s vs %s
Warning: Use of "%s" without parentheses is ambiguous
Can't open %s: %s
These two pragmas can automatically identify bugs in your code.
I always use this in my code:
use strict;
use warnings FATAL => 'all';
FATAL makes the code die on warnings, just like strict does.
For additional information, see: Get stricter with use warnings FATAL => 'all';
Also... The strictures, according to Seuss
There's a good thread on perlmonks about this question.
The basic reason obviously is that strict and warnings massively help you catch mistakes and aid debugging.
Source: Different blogs
Use will export functions and variable names to the main namespace by
calling modules import() function.
A pragma is a module which influences some aspect of the compile time
or run time behavior of Perl. Pragmas give hints to the compiler.
Use warnings - Perl complains about variables used only once and improper conversions of strings into numbers. Trying to write to
files that are not opened. It happens at compile time. It is used to
control warnings.
Use strict - declare variables scope. It is used to set some kind of
discipline in the script. If barewords are used in the code they are
interpreted. All the variables should be given scope, like my, our or
local.
The "use strict" directive tells Perl to do extra checking during the compilation of your code. Using this directive will save you time debugging your Perl code because it finds common coding bugs that you might overlook otherwise.
Strict and warnings make sure your variables are not global.
It is much neater to be able to have variables unique for individual methods rather than having to keep track of each and every variable name.
$_, or no variable for certain functions, can also be useful to write more compact code quicker.
However, if you do not use strict and warnings, $_ becomes global!
use strict;
use warnings;
Strict and warnings are the mode for the Perl program. It is allowing the user to enter the code more liberally and more than that, that Perl code will become to look formal and its coding standard will be effective.
warnings means same like -w in the Perl shebang line, so it will provide you the warnings generated by the Perl program. It will display in the terminal.

Why doesn't Perl warn for the redeclaration of a my() variable in the same scope?

use strict;
my $world ="52";
my $in = "42" ;
my $world="42";
my $out = "good" ."good";
chop($out);
print $out;
Do not worry about the code.The question is that I used my $world in two different lines but compiler didn't give any error but if we consider C language's syntax then we will get the error because of the redeclaration of variable. Why don't perl gives any error for redeclaration. I have one more question: What is the size of a scalar variable ?
1/ Variable redeclaration is not an error. Had you included "use warnings" then you would get a warning.
2/ By "size of scalar variable" do you mean the amount of data that it can store? If that's the case, Perl imposes no arbitrary limits.
You seem to be posting a lot of rather simple questions very quickly. Have you considered reading "Learning Perl"?
The question is my $world i used it in two different lines but compiler said no error but to the c we get error as redclaration of variable but why not in perl.
Simply because Perl isn't C, and redefining a variable isn't an error condition.
It can be a cause of unexpected behaviour though and would be picked up if you had use warnings; (as has been suggested to you before).
What is the size of scalar variable ? is there any size?
Define 'size'. Bytes? Characters? Something else? You might be looking for length
Because Perl likes to be robust. If you had warnings turned on, you would have heard about it.
"my" variable $world masks earlier declaration in same scope at - line 7.
Although USUW (use strict; use warnings;) is a good development practice, so would be using autodie--if autodie worried about syntax warnings. But the following, concept is roughly the same, to make sure that you're not avoiding any warnings.
BEGIN { $SIG{__WARN__} = sub { die #_; }; }
The above code creates a signal handler for warnings that just dies instead. However, I think this is better for a beginner:
BEGIN {
$SIG{__WARN__}
= sub {
eval {
# take me out of the chain, to avoid recursion
delete $SIG{__WARN__};
# diag will install the warn handler we want to use.
eval 'use diagnostics;';
$SIG{__WARN__}->( #_ ); # invoke that handler
};
exit 1; # exit regardless of errors that might have cropped up.
};
}
Anywhere you want, you can tell perl that you are not interested in changing your code to issue a particular category of warnings (and diagnostics will tell you the category!) and if you explicitly tell perl no warnings 'misc', not only will it not warn you, but it will also not fire off the warning handler, which kills the program.
This will give you a more c-like feel--except that c has warnings too (so you could implement a lexical counter as well...oh well.)

Why is parenthesis optional only after sub declaration?

(Assume use strict; use warnings; throughout this question.)
I am exploring the usage of sub.
sub bb { print #_; }
bb 'a';
This works as expected. The parenthesis is optional, like with many other functions, like print, open etc.
However, this causes a compilation error:
bb 'a';
sub bb { print #_; }
String found where operator expected at t13.pl line 4, near "bb 'a'"
(Do you need to predeclare bb?)
syntax error at t13.pl line 4, near "bb 'a'"
Execution of t13.pl aborted due to compilation errors.
But this does not:
bb('a');
sub bb { print #_; }
Similarly, a sub without args, such as:
special_print;
my special_print { print $some_stuff }
Will cause this error:
Bareword "special_print" not allowed while "strict subs" in use at t13.pl line 6.
Execution of t13.pl aborted due to compilation errors.
Ways to alleviate this particular error is:
Put & before the sub name, e.g. &special_print
Put empty parenthesis after sub name, e.g. special_print()
Predeclare special_print with sub special_print at the top of the script.
Call special_print after the sub declaration.
My question is, why this special treatment? If I can use a sub globally within the script, why can't I use it any way I want it? Is there a logic to sub being implemented this way?
ETA: I know how I can fix it. I want to know the logic behind this.
I think what you are missing is that Perl uses a strictly one-pass parser. It does not scan the file for subroutines, and then go back and compile the rest. Knowing this, the following describes how the one pass parse system works:
In Perl, the sub NAME syntax for declaring a subroutine is equivalent to the following:
sub name {...} === BEGIN {*name = sub {...}}
This means that the sub NAME syntax has a compile time effect. When Perl is parsing source code, it is working with a current set of declarations. By default, the set is the builtin functions. Since Perl already knows about these, it lets you omit the parenthesis.
As soon as the compiler hits a BEGIN block, it compiles the inside of the block using the current rule set, and then immediately executes the block. If anything in that block changes the rule set (such as adding a subroutine to the current namespace), those new rules will be in effect for the remainder of the parse.
Without a predeclared rule, an identifier will be interpreted as follows:
bareword === 'bareword' # a string
bareword LIST === syntax error, missing ','
bareword() === &bareword() # runtime execution of &bareword
&bareword === &bareword # same
&bareword() === &bareword() # same
When using strict and warnings as you have stated, barewords will not be converted into strings, so the first example is a syntax error.
When predeclared with any of the following:
sub bareword;
use subs 'bareword';
sub bareword {...}
BEGIN {*bareword = sub {...}}
Then the identifier will be interpreted as follows:
bareword === &bareword() # compile time binding to &bareword
bareword LIST === &bareword(LIST) # same
bareword() === &bareword() # same
&bareword === &bareword # same
&bareword() === &bareword() # same
So in order for the first example to not be a syntax error, one of the preceding subroutine declarations must be seen first.
As to the why behind all of this, Perl has a lot of legacy. One of the goals in developing Perl was complete backwards compatibility. A script that works in Perl 1 still works in Perl 5. Because of this, it is not possible to change the rules surrounding bareword parsing.
That said, you will be hard pressed to find a language that is more flexible in the ways it lets you call subroutines. This allows you to find the method that works best for you. In my own code, if I need to call a subroutine before it has been declared, I usually use name(...), but if that subroutine has a prototype, I will call it as &name(...) (and you will get a warning "subroutine called too early to check prototype" if you don't call it this way).
The best answer I can come up with is that's the way Perl is written. It's not a satisfying answer, but in the end, it's the truth. Perl 6 (if it ever comes out) won't have this limitation.
Perl has a lot of crud and cruft from five different versions of the language. Perl 4 and Perl 5 did some major changes which can cause problems with earlier programs written in a free flowing manner.
Because of the long history, and the various ways Perl has and can work, it can be difficult for Perl to understand what's going on. When you have this:
b $a, $c;
Perl has no way of knowing if b is a string and is simply a bareword (which was allowed in Perl 4) or if b is a function. If b is a function, it should be stored in the symbol table as the rest of the program is parsed. If b isn't a subroutine, you shouldn't put it in the symbol table.
When the Perl compiler sees this:
b($a, $c);
It doesn't know what the function b does, but it at least knows it's a function and can store it in the symbol table waiting for the definition to come later.
When you pre-declare your function, Perl can see this:
sub b; #Or use subs qw(b); will also work.
b $a, $c;
and know that b is a function. It might not know what the function does, but there's now a symbol table entry for b as a function.
One of the reasons for Perl 6 is to remove much of the baggage left from the older versions of Perl and to remove strange things like this.
By the way, never ever use Perl Prototypes to get around this limitation. Use use subs or predeclare a blank subroutine. Don't use prototypes.
Parentheses are optional only if the subroutine has been predeclared. This is documented in perlsub.
Perl needs to know at compile time whether the bareword is a subroutine name or a string literal. If you use parentheses, Perl will guess that it's a subroutine name. Otherwise you need to provide this information beforehand (e.g. using subs).
The reason is that Larry Wall is a linguist, not a computer scientist.
Computer scientist: The grammar of the language should be as simple & clear as possible.
Avoids complexity in the compiler
Eliminates sources of ambiguity
Larry Wall: People work differently from compilers. The language should serve the programmer, not the compiler. See also Larry Wall's outline of the three virtues of a programmer.