I am playing around with Devel::Cover to see how well our test suite is actually testing our codebase. I run all of our tests with -MDevel::Cover; nothing seems to fail or crash, but the HTML output of the coverage table has entries like these for all of our modules:
The number of BEGINs listed seems to match the number of use Module::X statements in the source file, but it really clutters the HTML output. Is there any way to disable this feature? I don't see any mention of it in the tutorial or the GitHub issue tracker.
The reason for this is that "use" is "exactly equivalent to"
BEGIN { require Module; Module->import( LIST ); }
(See perldoc -f use.)
And then "BEGIN" is basically the same as "sub BEGIN" - you can put the "sub" there if you want to. See perldoc perlmod.
So what you really do have is a subroutine, and that is what Devel::Cover is reporting.
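As a small illustration (the module name Example and the sub foo are hypothetical), perl effectively sees three subroutines in the file below, the two implicit BEGINs from the use lines plus foo, which is roughly what the coverage table is counting:
package Example;
use strict;      # compiles to: BEGIN { require strict;   strict->import()   }
use warnings;    # compiles to: BEGIN { require warnings; warnings->import() }
sub foo { return 42 }
1;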
Like many parts of Devel::Cover, the details of perl's implementation, or at least the semantics, are leaking through. There is no way to stop this, though I would be amenable to changes in this area.
Related
If I have, for example, the following Perl script:
use strict;
use warnings;
print $x;
When I run this script, compilation fails with the error:
Global symbol "$x" requires explicit package name (did you forget to declare "my $x"?) at ...
Is it possible to write some Perl module which will be called when this error occurs, automatically fix the error, and continue compilation? (Even links to any relevant info are OK.)
# This code is incorrect.
# Here I'm just asking about such an ability.
# This code is only a very rough approximation of how it might look.
package AutoFix;
sub fix {
$main::x = 'You are defined now';
}
1;
So the following code would not fail, and would print You are defined now:
use strict;
use warnings;
use AutoFix;
print $x;
How much work would you like to do to create the code that could figure out what the fix should be? And will that amount of work be comparable to, or less than, the work required to examine the code by hand?
Now, I'm writing all of this having spent quite a bit of time trying to come up with a system to analyze CPAN installer output to figure out what went wrong (a major impetus for CPANPLUS, now relegated to history). It's easy to tell that something is not right, but beyond that is a lot of suffering.
In your example, you have an error about an undeclared variable. How does AutoFix know if that should be a package or a lexical variable? You can guess one or the other, but you actually have two big problems:
What is the intent of the code?
Does the code reflect the actual intent?
Determining the intent of the code is often very difficult for even an experienced human programmer to figure out (just read StackOverflow question comments). Code that compiles is often not correct code, in the sense that it doesn't achieve the desired outcome. Furthermore, does the programmer even understand the problem? Does the code the programmer wrote (incorrectly here) reflect the actual work the code should do? It's difficult for humans in code review to figure this out. Tools like Coverity can guess at problems they know about, but they aren't going to be able to correct the code.
But let's say that the programmer understands the problem. Have they correctly expressed that? The longer you've been programming, the more you lean toward "no", in general, in my experience.
This is completely different than the database constraint you mentioned. That's a narrowly targeted fix for an expected and allowed situation. Consider a different parallel: if the record has a New York area code but a Chicago address, should I fix the city? When I was a younger dumbass, I did a similar thing to a database. It was stupid because I thought I knew something I didn't, and everyone who understood the situation recognized it immediately. Even then, those sorts of constraints are how we model what we know about the world, not what the world actually is.
Now, to make AutoFix, you need to make something that can look at code, understand it, and figure out what it should do. You can make guesses, but you have no basis for playing the probabilities there.
Technical matters can't solve this. AutoFix can undo the work of pragmas such that some classes of errors don't show up, but so what? The program with an error just continues? How does that help anyone?
Not only that, compilers tend to complain when they realize they can't parse something, and what they complain about is often not the real problem. The first thing I teach people about debugging is to look at the statement immediately preceding the line number in the error message. Any error message you catch can have a virtually infinite number of causes.
Consider this code, which fails in the same way as your example (same error message) but for a completely different but common reason:
use strict;
use warnings;
my $x = 5,
print $x++;
How do you figure out what the fix should be? It's not about declaring $x.
So, you now have two cases, and you build them into your fixer. Then you encounter another case, so you build that in too. And you keep doing this until eventually you have a large dictionary of fixes. Maybe you get a bit crazy and do some machine learning (and wouldn't a corpus of bad code and resolutions be cool?).
But the program still can't continue. It has to start over, because it has to at least back up to where it should have done something but didn't. You can't merely restart the program, because you don't know if it's idempotent. Re-running the program might redo work it shouldn't, such as inserting duplicate rows into databases.
Having said all that, this sort of thing is related to static analysis and the refactoring browser. Adam Kennedy's Parse Perl Isolated (PPI) project was a first step toward understanding Perl code without compiling it, and then moving toward the Smalltalk ideal of understanding which parts of code represent the same thing. If you knew that two things named foo were the same thing, you could rearrange code dealing with foo. For example, if you renamed a method from bar to set_bar, you could immediately know which bars you should rename and which belonged to some other class.
Adam wrote Acme::BadExample and challenged anyone to get it to run. He wrote "any given piece of Perl source exists in bizarre pseudo-quantum-like state, in that it demonstrates both duality and indeterminism."
Jos Boumans stepped up and used some mind-bending Perl, which he then showed in Barely Legal XXX Perl, which I think he first presented in 2006. He was amazingly creative in his solutions, and in a way that I wouldn't want in production code.
Perl doesn't even know, by design, what type of thing will be in a variable, or even that the method you might call on it will exist. In fact, it defers so much to the runtime, trusting that things will be in place by the time you need them, that we often say "only perl can parse Perl". You literally need to be able to run Perl code to properly compile it, since BEGIN blocks can affect the parse. For example, a BEGIN can define a subroutine with a certain arity. How do you parse foo 5, 6? You have to know what has already been defined.
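Here is a minimal sketch of that problem (foo and the UNARY_FOO environment variable are made up for illustration): the parse of the last line depends on code that has to run first.
use strict;
use warnings;

BEGIN {
    # Pretend this decision depends on something only knowable while
    # the BEGIN block runs (configuration, environment, another module).
    if ( $ENV{UNARY_FOO} ) {
        eval 'sub foo ($)  { "one arg: @_" }';
    }
    else {
        eval 'sub foo ($$) { "two args: @_" }';
    }
    die $@ if $@;
}

# With a ($) prototype, foo is a named unary operator and this parses
# as (foo(5), 6); with ($$) it parses as foo(5, 6). You cannot know
# which without running the BEGIN block above.
my @result = (foo 5, 6);
print scalar(@result), " element(s): @result\n";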
Perl has other "action at a distance" features that make this even tougher. autodie redefines CORE features to add extra behavior, but you might not be able to see that in the code. You can set default regex flags (and I've seen plenty of big screw ups by people applying /isxm to entire files without checking).
As noted above, autofixing compile-time errors is not possible (or at best very hard to do).
Instead of fixing the compile-time error, try to solve your problem in a different way.
For example, your script uses the $x variable. If you already know that you will use it and want it to hold some value, e.g. You are defined now, you could use Exporter:
use strict;
use warnings;
use AutoFix qw/ $x /;
print $x;
And the AutoFix module would look like this:
package AutoFix;
require Exporter;
our @ISA = qw(Exporter);
our @EXPORT_OK = qw( $x $y $z ); # symbols to export on request
... # code which creates $x, $y and $z on request
1;
Good luck ;-)
In some code coverage tools you can "hide" certain lines of code from the coverage tool, so that those lines do not count towards the coverage totals. For example, some code might be run only in circumstances that are hard or impossible to test (such as certain hardware failures). Thus, you might get 100% coverage reported even though some code was not exercised.
Setting aside for the moment whether this is wise, is this sort of thing possible with Perl's Devel::Cover?
(Devel::Cover can ignore entire files, but I am interested in ignoring just a few lines in a single file.)
A lot of uncoverable code features have been implemented but they are not documented because I wasn't sure of the interface. However, it's been a few years since anything changed in that area.
Probably the easiest way to see how to use the features is to look at tests/uncoverable in the distribution (see https://github.com/pjcj/Devel--Cover/blob/master/test/uncoverable). If you can't or don't want to change your code you can use the .uncoverable file (see https://github.com/pjcj/Devel--Cover/blob/master/tests/.uncoverable) and the cover options as mentioned by toolic.
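For flavour, the annotations in that test file are comments naming the coverage criterion; the exact syntax and placement below are my assumption, so verify them against test/uncoverable before relying on them:
use strict;
use warnings;

sub read_all {
    my ($file) = @_;

    my $fh;
    unless ( open $fh, '<', $file ) {
        # uncoverable statement
        die "Cannot open $file: $!";   # only reached on an I/O failure
    }
    local $/;
    return scalar <$fh>;
}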
If you do this, be sure to use the basic_html report which will mark a construct as in error if you tag it as uncoverable but it gets executed anyway.
I really should get around to tidying everything up and documenting it.
According to the TODO file on CPAN, this capability is not currently supported, but the developers see it as a valuable addition:
Enhancements:
Marking of unreachable code - commandline tool and gui.
The cover script mentions promising options: -add_uncoverable_point and -delete_uncoverable_point.
I have to optimize an intranet written in Perl (about 3000 files). The first thing I want to do is enable warnings ("-w" or "use warnings;") so I can get rid of all those errors, and then try to implement "use strict;".
Is there a way of telling Perl to use warnings all the time (like the settings in php.ini for PHP), without the need to modify each script to add "-w" to its first line?
I even thought of making an alias for /usr/bin/perl, or moving it to another name and putting a simple wrapper script in its place just to add the -w flag (like a proxy).
How would you debug it?
Well…
You could set the PERL5OPT environment variable to hold -w. See the perlrun manpage for details. I hope you’ll consider tainting, too, with -T or maybe -t, for security tracking.
But I don’t envy you. Retrofitting code developed without the benefit of use warnings and use strict is usually a royal PITA.
I have something of a standard boiler-plate I use to start new Perl programs. But I haven’t given any thought to one for CGI programs, which would likely benefit from some tweaks against that boiler-plate.
Retrofitting warnings and strict is hard. I don't recommend a Big Bang approach, setting warnings (let alone strictures) on everything. You will be inundated with warnings to the point of uselessness.
You start by enabling warnings in the modules used by the scripts (there are some, aren't there?), rather than applying warnings to everything. Get the core clean, then get to work on the periphery, one unit at a time. So, in fact, I'd recommend having a simple (Perl) script that finds the first line that does not start with a hash and inserts use warnings; before it (and maybe use strict; too, since you're going to be dealing with one script at a time), so you can do the renovations one script at a time.
In other words, you will probably be best off actually editing each file as you're about to renovate it.
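A rough sketch of such a retrofit helper (the in-place editing and where the pragmas land are illustrative choices, not a prescription):
#!/usr/bin/perl
use strict;
use warnings;

my $file = shift or die "usage: $0 script.pl\n";

open my $in, '<', $file or die "Cannot read $file: $!";
my @lines = <$in>;
close $in;

# Skip the shebang and any leading comments or blank lines, then
# insert the pragmas just before the first line of real code.
my $i = 0;
$i++ while $i < @lines && $lines[$i] =~ /^\s*(?:#|$)/;
splice @lines, $i, 0, "use warnings;\n", "use strict;\n";

open my $out, '>', $file or die "Cannot write $file: $!";
print {$out} @lines;
close $out;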
I'd only use the blanket option to make a simple assessment of the scope of the problem: is it a complete and utter disaster, or merely a few peccadilloes in a few files. Sadly, if the code was developed without warnings and strict, it is more likely to be 'disaster' than 'minimal'.
You may find that your predecessors were prone to copy and paste and some erroneous idioms crop up repeatedly in copied code. Write a Perl script that fixes each one. I have a bunch of fix* scripts in my personal bin directory that deal with various changes - either fixing issues created by recalcitrant (or, more usually, simply long departed) colleagues or to accommodate my own changing standards.
You can set warnings and strictures for all Perl scripts by adding -Mwarnings -Mstrict to your PERL5OPT environment variable. See perlrun for details.
It is "common knowledge" that source filters are bad and should not be used in production code.
When answering a similar, but more specific, question I couldn't find any good references that explain clearly why filters are bad and when they can be safely used. I think now is the time to create one.
Why are source filters bad?
When is it OK to use a source filter?
Why source filters are bad:
Nothing but perl can parse Perl. (Source filters are fragile.)
When a source filter breaks pretty much anything can happen. (They can introduce subtle and very hard to find bugs.)
Source filters can break tools that work with source code. (PPI, refactoring, static analysis, etc.)
Source filters are mutually exclusive. (You can't use more than one at a time -- unless you're psychotic).
When they're okay:
You're experimenting.
You're writing throw-away code.
Your name is Damian and you must be allowed to program in Latin.
You're programming in Perl 6.
Only perl can parse Perl (see this example):
@result = (dothis $foo, $bar);
# Which of the following is it equivalent to?
@result = (dothis($foo), $bar);
@result = dothis($foo, $bar);
This kind of ambiguity makes it very hard to write source filters that always succeed and do the right thing. When things go wrong, debugging is awkward.
After crashing and burning a few times, I have developed the superstitious approach of never trying to write another source filter.
I do occasionally use Smart::Comments for debugging, though. When I do, I load the module on the command line:
$ perl -MSmart::Comments test.pl
so as to avoid any chance that it might remain enabled in production code.
See also: Perl Cannot Be Parsed: A Formal Proof
I don't like source filters because you can't tell what code is going to do just by reading it. Additionally, things that look like they aren't executable, such as comments, might magically be executable with the filter. You (or more likely your coworkers) could delete what you think isn't important and break things.
Having said that, if you are implementing your own little language that you want to turn into Perl, source filters might be the right tool. However, just don't call it Perl. :)
It's worth mentioning that Devel::Declare keywords (and starting with Perl 5.11.2, pluggable keywords) aren't source filters, and don't run afoul of the "only perl can parse Perl" problem. This is because they're run by the perl parser itself, they take what they need from the input, and then they return control to the very same parser.
For example, when you declare a method in MooseX::Declare like this:
method frob ($bubble, $bobble does coerce) {
... # complicated code
}
The word "method" invokes the method keyword parser, which uses its own grammar to get the method name and parse the method signature (which isn't Perl, but it doesn't need to be -- it just needs to be well-defined). Then it leaves perl to parse the method body as the body of a sub. Anything anywhere in your code that isn't between the word "method" and the end of a method signature doesn't get seen by the method parser at all, so it can't break your code, no matter how tricky you get.
The problem I see is the same problem you encounter with any C/C++ macro more complex than defining a constant: It degrades your ability to understand what the code is doing by looking at it, because you're not looking at the code that actually executes.
In theory, a source filter is no more dangerous than any other module, since you could easily write a module that redefines builtins or other constructs in "unexpected" ways. In practice, however, it is quite hard to write a source filter in a way where you can prove that it's not going to make a mistake. I tried my hand at writing a source filter that implements the Perl 6 feed operators in Perl 5 (Perl6::Feeds on CPAN). You can take a look at the regular expressions to see the acrobatics required simply to figure out the boundaries of expression scope. While the filter works, and provides a test bed to experiment with feeds, I wouldn't consider using it in a production environment without many, many more hours of testing.
Filter::Simple certainly comes in handy by dealing with 'the gory details of parsing quoted constructs', so I would be wary of any source filter that doesn't start there.
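As a minimal sketch of what that looks like (the Swapify module and its swap syntax are invented for this example), FILTER_ONLY lets you restrict a substitution to the code portions of the source, so a quoted string containing the word swap is left alone:
package Swapify;
use Filter::Simple;

# The substitution below only sees the code parts of the source;
# Filter::Simple keeps quoted constructs out of the way.
FILTER_ONLY
    code => sub {
        # Turn `swap $a, $b;` into a list assignment that swaps them.
        s/\bswap\s+(\$\w+)\s*,\s*(\$\w+)\s*;/($1, $2) = ($2, $1);/g;
    };

1;
And on the caller's side:
use strict;
use warnings;
use Swapify;

my ($left, $right) = (1, 2);
swap $left, $right;       # filtered into: ($left, $right) = ($right, $left);
print "$left $right\n";   # prints "2 1"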
In all, it really depends on the filter you are using and how broad a scope it tries to match against. If it is something simple like a C macro, then it's "probably" OK, but if it's something complicated then it's a judgement call. I personally can't wait to play around with Perl 6's macro system. Finally Lisp won't have anything on Perl :-)
There is a nice example here that shows what kind of trouble you can get into with source filters.
http://shadow.cat/blog/matt-s-trout/show-us-the-whole-code/
They used a module called Switch, which is based on source filters. And because of that, they were unable to find the source of an error message for days.
I am implementing a CLI tool using Perl.
What are the best practices we can follow here?
As a preface, I spent 3 years engineering and implementing a pretty complicated command line toolset in Perl for a major financial company. The ideas below are basically part of our team's design guidelines.
User Interface
Command line options: allow as many as possible to have default values.
NO positional parameters for any command that has more than 2 options.
Have readable option names. If the length of the command line is a concern for non-interactive calling (e.g. some unnamed legacy shells have short limits on command lines), provide short aliases - Getopt::Long allows that easily.
At the very least, print all options' default values in '-help' message.
Better yet, print all the options' "current" values (e.g. if a parameter and a value are supplied along with "-help", the help message will print the parameter's value from the command line). That way, people can assemble the command line for a complicated command and verify it by appending "-help" before actually running it (a short sketch follows after this list).
Follow Unix standard convention of exiting with non-zero return code if program terminated with errors.
If your program may produce useful (e.g. worth capturing/grepping/whatnot) output, make sure any error/diagnostic messages go to STDERR so they are easily separable.
Ideally, allow the user to specify input/output files via command line parameters, instead of forcing "<" / ">" redirects - this makes life MUCH simpler for people who need to build complicated pipes using your command. Ditto for error messages - have a logfile option.
If a command has side effects, having a "whatif/no_post" option is usually a Very Good Idea.
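A small sketch of the "-help shows current values" idea mentioned above (option names and defaults are made up):
use strict;
use warnings;
use Getopt::Long;

my %opt = ( retries => 3, timeout => 30, logfile => 'run.log' );   # defaults

GetOptions( \%opt, 'retries=i', 'timeout=i', 'logfile=s', 'help' )
    or die "Bad options\n";

if ( $opt{help} ) {
    print "Usage: $0 [-retries N] [-timeout N] [-logfile FILE]\n";
    print "Current option values (defaults overridden by command line):\n";
    printf "  -%-8s %s\n", $_, $opt{$_} for sort grep { $_ ne 'help' } keys %opt;
    exit 0;
}

# ... real work would go here ...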
Implementation
As noted previously, don't re-invent the wheel. Use standard command line parameter handling modules - MooseX::Getopt, or Getopt::Long
For Getopt::Long, assign all the parameters to a single hash as opposed to individual variables. A useful pattern is passing that CLI-args hash straight to object constructors.
Make sure your error messages are clear and informative... E.g. include "$!" in any IO-related error messages. It's worth spending an extra minute and two lines in your code to have separate "file not found" and "file not readable" errors, as opposed to spending 30 minutes in a production emergency because a non-readable-file error was misdiagnosed by Production Operations as "No input file" - this is a real-life example. (A short sketch follows after this list.)
Not really CLI-specific, but validate all parameters, ideally right after getting them.
CLI doesn't allow for a "front-end" validation like webapps do, so be super extra vigilant.
As discussed above, modularize the business logic. Among other reasons already listed, the number of times I have had to re-implement an existing CLI tool as a web app is vast - and it's not that difficult if the logic is already in a properly designed Perl module.
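A short sketch of the error-message and validate-early points above (the file and option names are invented):
use strict;
use warnings;

my %opt = ( input => 'data.csv', retries => 3 );

# Validate right after option parsing, with messages that name the exact
# problem: "not found" and "not readable" are different failures.
die "Input file '$opt{input}' does not exist\n"
    unless -e $opt{input};
die "Input file '$opt{input}' exists but is not readable (check permissions)\n"
    unless -r $opt{input};
die "Option -retries must be a positive integer, got '$opt{retries}'\n"
    unless $opt{retries} =~ /^\d+$/;

open my $fh, '<', $opt{input}
    or die "Cannot open '$opt{input}': $!\n";   # include $! in IO errors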
Interesting links
CLI Design Patterns - I think this is ESR's
I will try to add more bullets as I recall them.
Use POD to document your tool, and follow the guidelines for manpages; include at least the following sections: NAME, SYNOPSIS, DESCRIPTION, AUTHOR. Once you have proper POD you can generate a man page with pod2man, and view the documentation at the console with perldoc your-script.pl.
Use a module that handles command line options for you. I really like using Getopt::Long in conjunction with Pod::Usage; this way, invoking --help will display a nice help message.
Make sure that your script returns a proper exit value indicating whether it was successful or not.
Here's a small skeleton of a script that does all of these:
#!/usr/bin/perl
=head1 NAME

simple - simple program

=head1 SYNOPSIS

simple [OPTION]... FILE...

    -v, --verbose   use verbose mode
    --help          print this help message

Where I<FILE> is a file name.

Examples:

    simple /etc/passwd /dev/null

=head1 DESCRIPTION

This is a simple program.

=head1 AUTHOR

Me.

=cut
use strict;
use warnings;
use Getopt::Long qw(:config auto_help);
use Pod::Usage;
exit main();
sub main {
    # Argument parsing
    my $verbose;
    GetOptions(
        'verbose' => \$verbose,
    ) or pod2usage(1);
    pod2usage(1) unless @ARGV;

    my (@files) = @ARGV;
    foreach my $file (@files) {
        if (-e $file) {
            print "File $file exists\n" if $verbose;
        }
        else {
            print "File $file doesn't exist\n";
        }
    }

    return 0;
}
Some lessons I've learned:
1) Always use Getopt::Long
2) Provide help on usage via --help, ideally with examples of common scenarios. It helps people who don't know or have forgotten how to use the tool. (I.e., you in six months.)
3) Unless it's pretty obvious to the user why, don't go for long periods (>5s) without output to the user. Something like 'print "Row $row...\n" unless ($row % 1000)' goes a long way.
4) For long-running operations, allow the user to recover if possible (see the sketch after this list). It really sucks to get through 500k of a million, die, and have to start over again.
5) Separate the logic of what you're doing into modules and leave the actual .pl script as barebones as possible; parsing options, display help, invoking basic methods, etc. You're inevitably going to find something you want to reuse, and this makes it a heck of a lot easier.
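A rough sketch of the recovery idea in point 4 (the checkpoint file name and the loop body are illustrative):
use strict;
use warnings;

my $checkpoint = 'run.checkpoint';

# If a previous run died partway through, pick up after the last row
# it recorded instead of starting from scratch.
my $start = 0;
if ( open my $fh, '<', $checkpoint ) {
    $start = <$fh> // 0;
    chomp $start;
}

for my $row ( $start + 1 .. 1_000_000 ) {
    # ... the real per-row work goes here ...
    print "Row $row...\n" unless $row % 1000;

    # Record progress so dying at row 500_000 doesn't waste the rerun.
    open my $out, '>', $checkpoint or die "Cannot write $checkpoint: $!";
    print {$out} "$row\n";
    close $out;
}

unlink $checkpoint;   # finished cleanly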
The most important thing is to have standard options.
Don't try to be clever; simply be consistent with already existing tools.
How to achieve this is also important, but only comes second.
Actually, this is quite generic to all CLI interfaces.
There are a couple of modules on CPAN that will make writing CLI programs a lot easier:
App::CLI
App::Cmd
If your app is Moose-based, also have a look at MooseX::Getopt and MooseX::Runnable.
The following points aren't specific to Perl but I've found many Perl CL scripts to be deficient in these areas:
Use common command line options. To show the version number, implement -v or --version, not --ver. For recursive processing, use -r (or perhaps -R, although in my GNU/Linux experience -r is more common), not --rec. People will use your script if they can remember the parameters. It's easy to learn a new command if you can remember "it works like grep" or some other familiar utility.
Many command line tools process "things" (files or directories) within the "current directory". While this can be convenient make sure you also add command line options for explicitly identifying the files or directories to process. This makes it easier to put your utility in a pipeline without developers having to issue a bunch of cd commands and remember which directory they're in.
You should use Perl modules to make your code reusable and easy to understand.
You should have a look at Perl Best Practices.