Perl: Is it possible to dynamically fix a compile-time error?

If I have, for example, the following Perl script:
use strict;
use warnings;
print $x;
When I run this script, compilation fails with this error:
Global symbol "$x" requires explicit package name (did you forget to declare "my $x"?) at ...
Is it possible to write a Perl module that gets called when this error occurs, automatically fixes the error, and lets compilation continue? (Even links to any relevant information are OK.)
# This code is not correct; I am only asking about the ability
# to do something like this. It is a very rough approximation
# of how it might look.
package AutoFix;

sub fix {
    $main::x = 'You are defined now';
}

1;
So the following code would not fail, and would print You are defined now:
use strict;
use warnings;
use AutoFix;
print $x;

How much work would you like to do to create code that could figure out what the fix should be? And will that amount of work be comparable to, or less than, the work required to examine the code by hand?
Now, I'm writing all of this having spent quite a bit of time trying to come up with a system to analyze CPAN installer output to figure out what went wrong (a major impetus for CPANPLUS, now relegated to history). It's easy to tell that something is not right, but beyond that is a lot of suffering.
In your example, you have an error about an undeclared variable. How does AutoFix know whether that should be a package variable or a lexical variable? You can guess one or the other, but you actually have two big problems:
What is the intent of the code?
Does the code reflect the actual intent?
The intent of the code is often very difficult for even an experienced human programmer to figure out (just read StackOverflow question comments). Compiling code is often not correct code, in the sense that it doesn't achieve the desired outcome. Furthermore, does the programmer even understand the problem? Does the code the programmer wrote (incorrectly here) reflect the actual work the code should do? It's difficult for humans in code review to figure this out. Tools like Coverity can guess at problems they know about, but they aren't going to be able to correct the code.
But let's say that the programmer understands the problem. Have they correctly expressed that? The longer you've been programming, the more you lean toward "no", in general, in my experience.
This is completely different than the database constraint you mentioned. That's a narrowly targeted fix for an expected and allowed situation. Consider a different parallel: if the record has a New York area code but a Chicago address, should I fix the city? When I was a younger dumbass, I did a similar thing to a database. It was stupid because I thought I knew something I didn't, and everyone who understood the situation recognized it immediately. Even then, those sorts of constraints are how we model what we know about the world, not what the world actually is.
Now, to make AutoFix, you need to make something that can look at code, understand it, and figure out what it should do. You can make guesses, but you have no basis for playing the probabilities there.
Technical measures can't solve this. AutoFix can undo the work of pragmas such that some classes of errors don't show up, but so what? The program with an error just continues? How does that help anyone?
Not only that, compilers tend to complain when they realize they can't parse something. What they complain about is often not the problem. The first thing I teach people about debugging is that they need to look at the statement immediately preceding the line number in the error message. Any error message you catch can have a virtually infinite number of causes.
Consider this code, which fails in the same way as your example (same error message) but for a completely different but common reason:
use strict;
use warnings;
my $x = 5,    # comma, not semicolon: the statement continues below,
print $x++;   # so the new $x is not yet in scope on this line
How do you figure out what the fix should be? It's not about declaring $x.
So now you have two cases, and you build that into your fixer. Then you encounter another case, so you build that in too. And you keep doing this until eventually you have a large dictionary of fixes. Maybe you get a bit crazy and do some machine learning (and wouldn't a corpus of bad code and resolutions be cool?).
But the program still can't continue. It has to start over, because it has to at least back up to where it should have done something but didn't. You can't merely restart the program because you don't know if it's idempotent. Re-running the program might redo work it shouldn't, such as inserting duplicate rows into databases.
Having said all that, this sort of thing is related to static analysis and the refactoring browser. Adam Kennedy's Parse Perl Isolated (PPI) project was a first step toward understanding Perl code without compiling it, and then a move toward the Smalltalk ideal of understanding which parts of code represent the same thing. If you knew that two things named foo were the same thing, you could rearrange code dealing with foo. For example, if you renamed a method from bar to set_bar, you could immediately know which bars you should rename and which belonged to some other class.
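To give a flavor of that, here is a minimal sketch of PPI-style static analysis; the file name MyModule.pm is only a placeholder:
use strict;
use warnings;
use PPI;

# Parse the source as a document, without compiling or running it.
my $doc = PPI::Document->new('MyModule.pm')
    or die "could not parse file";

# List every named subroutine definition found in the file.
my $subs = $doc->find('PPI::Statement::Sub') || [];
print $_->name, "\n" for grep { $_->name } @$subs;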
Adam wrote Acme::BadExample and challenged anyone to get it to run. He wrote "any given piece of Perl source exists in bizarre pseudo-quantum-like state, in that it demonstrates both duality and indeterminism."
Jos Boumans stepped up and used some mind-bending Perl, which he then showed in Barely Legal XXX Perl, which I think he first presented in 2006. He was amazingly creative in his solutions, and in a way that I wouldn't want in production code.
Perl doesn't even know, by design, what type of thing will be in a variable, or even that the method you might call on it will exist. In fact, it defers so much to the runtime, trusting that things will be in place by the time you need them, that we often say "only perl can parse Perl". You literally need to be able to run Perl code to properly compile it, since BEGIN blocks can affect the parse. For example, a BEGIN block can define a subroutine with a certain arity. How do you parse foo 5, 6? You have to know what has already been defined.
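Here is a minimal sketch of that (the sub foo and its numbers are made up); the eval inside BEGIN stands in for any code that runs during compilation:
use strict;
use warnings;

BEGIN {
    # This runs during compilation and defines foo with a unary
    # prototype, which the parser sees before the call below.
    eval 'sub foo ($) { 10 * shift }';
    die $@ if $@;
}

my @list = (foo 5, 6);    # with the ($) prototype: (foo(5), 6)
print "@list\n";          # prints "50 6"; declared without a
                          # prototype, it would parse as foo(5, 6)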
Perl has other "action at a distance" features that make this even tougher. autodie redefines CORE features to add extra behavior, but you might not be able to see that at the call site. You can set default regex flags (and I've seen plenty of big screw-ups from people applying /isxm to entire files without checking).
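For instance, a small sketch of autodie's action at a distance (the file path is deliberately bogus):
use strict;
use warnings;
use autodie;    # CORE functions like open now throw exceptions on failure

# Nothing at this call site hints that the failure behavior changed:
# without autodie this returns false and carries on; with it, it dies.
open my $fh, '<', '/no/such/file';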

As noted above, automatically fixing a compile-time error is not possible (or at least very hard).
Instead of fixing the compile-time error, try to solve your problem a different way.
For example, in your script you use the $x variable. You probably know that you will use it and want it to hold some value, e.g. You are defined now, in which case you could use Exporter:
use strict;
use warnings;
use AutoFix qw/ $x /;
print $x;
And the AutoFix module will look like this:
package AutoFix;
require Exporter;
our @ISA       = qw(Exporter);
our @EXPORT_OK = qw( $x $y $z );    # symbols to export on request

our ( $x, $y, $z );                 # create the package variables
$x = 'You are defined now';         # ... and give $x its value

1;
Good luck ;-)

Related

Perl - Requires explicit package name

I have googled around and looked online, and I understand a few criteria would need to be met in order to get this function to work; however, I don't understand why it's able to work in the first place.
Context:
I have a Perl script that I want to integrate into a Perl module. The situation is that I'm new to the language, I'm a bit unsure of the difference, and I don't understand why this error is coming up in the first place.
The Perl module is this:
https://github.com/slic3r/Slic3r/blob/master/lib/Slic3r/Print/SupportMaterial.pm
I thought I could just add the script into the module and be done, but unfortunately that is not the case, due to the error message. From what I know so far as someone new to Perl, you either need to declare the variables with my ... or remove use strict. I am somewhat interested in the latter, since the script is working correctly. Does anyone have any help or tips?
Declaring the variables with my is the right approach. use strict does a number of things - forcing variable declaration is only one of them.
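For reference, a sketch of the three checks that use strict bundles:
use strict;           # shorthand for all three of:
use strict 'vars';    # variables must be declared (my, our, ...)
use strict 'refs';    # no symbolic references (e.g. ${$name})
use strict 'subs';    # no barewords treated as quoted strings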
No serious Perl programmer would consider writing code without use strict and use warnings. Removing them is a bad idea.

General check for missing semicolons

As a Perl beginner I sometimes get compilation errors and have to search a lot to find the cause. In the end it is just a missing semicolon at the end of a line. Some syntax errors from a missing semicolon are caught by Perl, but not in general. Is there a way to get this check?
edit:
I know about Perl::Critic but can't use it at the moment. And I don't know whether it checks for missing semicolons in general.
Because semicolons actually mean something in Perl and aren't just there for decoration, it's not possible for any tool (even the Perl interpreter itself) to know in every case whether you actually meant to leave off the semicolon or not. Thus, there's no general-case answer to your question; you'll just need to go through your code and make sure it's correct.
As mentioned in my comments, there are various tricks you can try with your editor to expedite the process of finding potentially-incorrect lines; you must, however, either examine and fix these by hand or risk introducing new problems.
The syntax check is perl -c, but since BEGIN blocks still execute, that's not much different from running the program outright. Due to Perl's flexible/undecidable syntax, one cannot generally do what you want. That's the downside of comfort and expressiveness.
Upgrade to the latest stable Perl; the parser's error messages have become better/more exact over the years and will correctly recognise many circumstances of a missing semicolon.
Rule of thumb that works for many parsers/other languages: if the error makes no sense, look a couple of lines before.
use diagnostics; usually gives you a nice hint, same as use warnings;. Try to keep a consistent coding style, check perlstyle.
Also you can use Perl::Critic online.
Also, as general advice, learn how to use packages and modules, try to group code into subs, and study the syntax of arrays, lists and hashes. A common mistake is forgetting the ; after an anonymous hashref assignment:
my $hashref = { a => 5, b => 10 };
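To see why this one is confusing, here is a sketch of the failure mode (error text paraphrased; the exact wording varies by Perl version):
use strict;
use warnings;

my $hashref = { a => 5, b => 10 }   # <-- the semicolon is missing here
my $count = 42;                     # ...but perl reports the error on
                                    # this line: syntax error near "} my"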

Should I use common::sense or just stick with `use strict` and `use warnings`?

I recently installed a module from CPAN and noticed one of its dependencies was common::sense, a module that offers to enable all the warnings you want, and none that you don't. From the module's synopsis:
use common::sense;
# supposed to be the same, with much lower memory usage, as:
#
# use strict qw(vars subs);
# use feature qw(say state switch);
# no warnings;
# use warnings qw(FATAL closed threads internal debugging pack substr malloc
#                 unopened portable prototype inplace io pipe unpack regexp
#                 deprecated exiting glob digit printf utf8 layer
#                 reserved parenthesis taint closure semicolon);
# no warnings qw(exec newline);
Save for undef warnings sometimes being a hassle, I've usually found the standard warnings to be good. Is it worth switching to common::sense instead of my normal use strict; use warnings;?
While I like the idea of reducing boiler-plate code, I am deeply suspicious of tools like Modern::Perl and common::sense.
The problem I have with modules like this is that they bundle up a group of behaviors and hide behind glib names with changeable meanings.
For example, Modern::Perl today consists of enabling some perl 5.10 features and using strict and warnings. But what happens when Perl 5.12 or 5.14 or 5.24 comes out with great new goodies, and the community discovers that we need to use the frobnitz pragma everywhere? Will Modern::Perl provide a consistent set of behaviors, or will it remain "modern"? If MP keeps up with the times, it will break existing systems that don't keep lock-step with its compiler requirements. It adds extra compatibility testing to upgrades. At least that's my reaction to MP. I'll be the first to admit that chromatic is about 10 times smarter than me and a better programmer as well--but I still disagree with his judgment on this issue.
common::sense has a name problem, too. Whose idea of common sense is involved? Will it change over time?
My preference would be for a module that makes it easy for me to create my own set of standard modules, and even create groups of related modules/pragmas for specific tasks (like date time manipulation, database interaction, html parsing, etc).
I like the idea of Toolkit, but it sucks for several reasons: it uses source filters, and the macro system is overly complex and fragile. I have the utmost respect for Damian Conway, and he produces brilliant code, but sometimes he goes a bit too far (at least for production use, experimentation is good).
I haven't lost enough time typing use strict; use warnings; to feel the need to create my own standard import module. If I felt a strong need for automatically loading a set of modules/pragmas, something similar to Toolkit that allows one to create standard feature groups would be ideal:
use My::Tools qw( standard datetime SQLite );
or
use My::Tools;
use My::Tools::DateTime;
use My::Tools::SQLite;
Toolkit comes very close to my ideal. Its fatal defects are a bummer.
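For what it's worth, a minimal sketch of how such a module could forward pragmas to its caller (My::Tools is the hypothetical name from the example above):
package My::Tools;
use strict;
use warnings;

sub import {
    # strict and warnings act on the lexical scope currently being
    # compiled, so calling their import here enables them for
    # whoever says "use My::Tools".
    strict->import;
    warnings->import;
}

1;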
As for whether the choice of pragmas makes sense, that's a matter of taste. I'd rather use the occasional no strict 'foo' or no warnings 'bar' in a block where I need the ability to do something that requires it, than disable the checks over my entire file. Plus, IMO, memory consumption is a red herring. YMMV.
update
It seems that there are many (how many?) different modules of this type floating around CPAN.
There is latest, which is no longer the latest. Demonstrates part of the naming problem.
Also, uni::perl, which adds enabling Unicode to the mix.
ToolSet offers a subset of Toolkit's abilities, but without source filters.
I'll include Moose here, since it automatically adds strict and warnings to the calling package.
And finally Acme::Very::Modern::Perl
The proliferation of these modules, and the potential for overlapping requirements, adds another issue.
What happens if you write code like:
use Moose;
use common::sense;
What pragmas are enabled with what options?
I would say stick with warnings and strict for two main reasons.
If other people are going to use or work with your code, they are (almost certainly) used to warnings and strict and their rules. Those represent a community norm that you and other people you work with can count on.
Even if this or that specific piece of code is just for you, you probably don't want to worry about remembering "Is this the project where I adhere to warnings and strict or the one where I hew to common::sense?" Moving back and forth between the two modes will just confuse you.
There is one bit nobody else seems to have picked up on, and that's FATAL in the warnings list.
So as of 2.0, use common::sense is more akin to:
use strict;
use warnings FATAL => 'all'; # but with the specific list of fatals instead of 'all' that is
This is a somewhat important and frequently overlooked feature of warnings that ramps the strictness a whole degree higher. Instead of undef string interpolation, or infinite recursion just warning you and then keeping on going despite the problem, it actually halts.
To me this is helpful, because in many cases undef string interpolation leads to further, more dangerous errors, which may go silently unnoticed, and failing and bailing is a good thing.
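A small sketch of the difference:
use strict;
use warnings FATAL => 'uninitialized';

my $name;
# Under plain warnings this prints "Use of uninitialized value ..."
# and keeps going; with the FATAL flavor it dies right here instead.
print "Hello, $name\n";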
I obviously have no common sense because I'm going more for Modern::Perl ;-)
The "lower memory usage" only works if you use no modules that load strict, feature, warnings, etc. and the "much" part is...not all that much.
Not everyone's idea of common sense is the same - in that respect it's anything but common.
Go with what you know. If you get undef warnings, chances are that your program or its input is incorrect.
Warnings are there for a reason. Anything that reduces them cannot be useful. (I always compile with gcc -Wall too...)
I have never had a warning that wasn't something dodgy/just plain wrong in my code. For me, it's always something technically allowed that I almost certainly don't want to do. I think the full suite of warnings is invaluable. If you find use strict + use warnings adequate for now, I don't see why you'd want to change to using a non-standard module which is then a dependency for every piece of code you write from here on out...
When it comes to warnings, I support the use of any module or built-in language feature that gives you the level of warnings that helps you make your code as solid and reliable as it can possibly be. An ignored warning is not helpful to anyone.
But if you're cozy with the standard warnings, stick with it. Coding to a stricter standard is great if you're used to it! I wouldn't recommend switching just for the memory savings. Only switch if the module helps you turn your code around quicker and with more confidence.
Many people argue in the comments about what happens if MP changes: it will break your code. While this can be a real threat, there are already MANY things that change over time and break code (sometimes after a deprecation cycle, sometimes not...).
Some other modules have changed their API and broken things, and nobody cares about them. E.g. Moose has at least two things that are deprecated now and probably will be forbidden in some future release.
Another example: years ago it was allowed to write
for $i qw(some words)
but now it is deprecated. And many others... And this is CORE language syntax.
Everybody survived. So I don't really understand why so many people argue against helper modules. When they change, there will (probably) be a sort of deprecation cycle... So, my view is:
if you write programs for yourself, use any module you want ;)
if you write a program for someone else, where others are going to maintain it, use minimal nonstandard "pragma-like" modules (common::sense, Modern::Perl, uni::perl, etc.)
in StackOverflow questions, you can safely use common::sense or Modern::Perl etc. - most of the users who will answer your questions know them. Everybody understands that it is easier to enable strict, warnings and features with one short line than with three...

Why are Perl source filters bad and when is it OK to use them?

It is "common knowledge" that source filters are bad and should not be used in production code.
When answering a similar, but more specific, question I couldn't find any good references that explain clearly why filters are bad and when they can be safely used. I think now is the time to create one.
Why are source filters bad?
When is it OK to use a source filter?
Why source filters are bad:
Nothing but perl can parse Perl. (Source filters are fragile.)
When a source filter breaks pretty much anything can happen. (They can introduce subtle and very hard to find bugs.)
Source filters can break tools that work with source code. (PPI, refactoring, static analysis, etc.)
Source filters are mutually exclusive. (You can't use more than one at a time -- unless you're psychotic).
When they're okay:
You're experimenting.
You're writing throw-away code.
Your name is Damian and you must be allowed to program in Latin.
You're programming in Perl 6.
Only perl can parse Perl (see this example):
@result = (dothis $foo, $bar);
# Which of the following is it equivalent to?
@result = (dothis($foo), $bar);
@result = dothis($foo, $bar);
This kind of ambiguity makes it very hard to write source filters that always succeed and do the right thing. When things go wrong, debugging is awkward.
After crashing and burning a few times, I have developed the superstitious approach of never trying to write another source filter.
I do occasionally use Smart::Comments for debugging, though. When I do, I load the module on the command line:
$ perl -MSmart::Comments test.pl
so as to avoid any chance that it might remain enabled in production code.
See also: Perl Cannot Be Parsed: A Formal Proof
I don't like source filters because you can't tell what code is going to do just by reading it. Additionally, things that look like they aren't executable, such as comments, might magically be executable with the filter. You (or more likely your coworkers) could delete what you think isn't important and break things.
Having said that, if you are implementing your own little language that you want to turn into Perl, source filters might be the right tool. However, just don't call it Perl. :)
It's worth mentioning that Devel::Declare keywords (and starting with Perl 5.11.2, pluggable keywords) aren't source filters, and don't run afoul of the "only perl can parse Perl" problem. This is because they're run by the perl parser itself, they take what they need from the input, and then they return control to the very same parser.
For example, when you declare a method in MooseX::Declare like this:
method frob ($bubble, $bobble does coerce) {
    ...   # complicated code
}
The word "method" invokes the method keyword parser, which uses its own grammar to get the method name and parse the method signature (which isn't Perl, but it doesn't need to be -- it just needs to be well-defined). Then it leaves perl to parse the method body as the body of a sub. Anything anywhere in your code that isn't between the word "method" and the end of a method signature doesn't get seen by the method parser at all, so it can't break your code, no matter how tricky you get.
The problem I see is the same problem you encounter with any C/C++ macro more complex than defining a constant: It degrades your ability to understand what the code is doing by looking at it, because you're not looking at the code that actually executes.
In theory, a source filter is no more dangerous than any other module, since you could easily write a module that redefines builtins or other constructs in "unexpected" ways. In practice, however, it is quite hard to write a source filter in a way where you can prove that it's not going to make a mistake. I tried my hand at writing a source filter that implements the Perl 6 feed operators in Perl 5 (Perl6::Feeds on CPAN). You can take a look at the regular expressions to see the acrobatics required simply to figure out the boundaries of expression scope. While the filter works, and provides a test bed to experiment with feeds, I wouldn't consider using it in a production environment without many, many more hours of testing.
Filter::Simple certainly comes in handy by dealing with 'the gory details of parsing quoted constructs', so I would be wary of any source filter that doesn't start there.
In all, it really depends on the filter you are using, and how broad a scope it tries to match against. If it is something simple like a C macro, then it's "probably" OK, but if it's something complicated then it's a judgement call. I personally can't wait to play around with Perl 6's macro system. Finally Lisp won't have anything on Perl :-)
There is a nice example here that shows what trouble you can get into with source filters:
http://shadow.cat/blog/matt-s-trout/show-us-the-whole-code/
They used a module called Switch, which is based on source filters. And because of that, they were unable to find the source of an error message for days.

Is it okay to use modules from within subroutines?

Recently I started playing with OO Perl and I've been creating quite a bunch of new objects for a new project that I'm working on. I'm unfamiliar with any best practices regarding OO Perl and we're kind of in a tight rush to get it done :P
So I'm putting a lot of this kind of code into each of my functions:
sub funcx {
    use ObjectX;   # I don't declare this at the top of the .pm file,
                   # but inside the function itself
    my $obj = new ObjectX;
}
I was wondering if this will cause any negative impact versus putting the use ObjectX line at the top of the Perl module, outside of any function scope.
I was doing this because I feel it's cleaner in case I need to shift the function around.
The other thing I have noticed is that when I run a test.pl script that tests my objects on the Unix server itself, it is slow as heck. But when the same code is run through CGI connected to an Apache server, the web page doesn't load as slowly.
Where to put use?
use occurs at compile time, so it doesn't matter where you put it. At least from a purely pragmatic, "will it work", point of view. Because it happens at compile time, use will always be executed, even if you put it in a conditional. Never do this: if( $foo eq 'foo' ) { use SomeModule }
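If you genuinely need conditional loading, here is a sketch of the runtime alternative (SomeModule is a stand-in name):
if ( $foo eq 'foo' ) {
    require SomeModule;    # require happens at runtime, so this really is conditional
    SomeModule->import;    # call import yourself if the module exports anything
}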
In my experience, it is best to put all your use statements at the top of the file. It makes it easy to see what is being loaded and what your dependencies are.
Update:
As brian d foy points out, things compiled before the use statement will not be affected by it, so the location can matter. For a typical module, location does not matter; however, if it does things that affect compilation (for example, it imports functions that have prototypes), the location could matter.
Also, Chas Owens points out that it can affect compilation. Modules that are designed to alter compilation are called pragmas. Pragmas are, by convention, given names in all lower-case. These effects apply only within the scope where the module is used. Chas uses the integer pragma as an example in his answer. You can also disable a pragma or module over a limited scope with the keyword no.
use strict;
use warnings;

my $foo;
print $foo;    # generates a warning

{
    no warnings 'uninitialized';   # turn off warnings about uninitialized values
    print $foo;                    # no warning here
}

print $foo;    # generates a warning again
Indirect object syntax
In your example code you have my $obj = new ObjectX;. This is called indirect object syntax, and it is best avoided as it can lead to obscure bugs. It is better to use this form:
my $obj = ObjectX->new;
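Roughly why the indirect form bites (a sketch; nothing here is specific to ObjectX):
# "new ObjectX" forces the parser to guess: is new a method called on
# the class ObjectX, or a sub named new that happens to be in scope?
# If a sub called new is visible, the guess can silently change.
# The arrow form never guesses:
my $obj  = ObjectX->new;                 # always the ObjectX::new method
my $obj2 = ObjectX->new( size => 42 );   # arguments pass as usual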
Why is your test script slow on the server?
There is no way to tell with the info you have provided.
But the easy way to find out is to profile your code and see where the time is being consumed. Devel::NYTProf is a popular profiling tool you may want to check out.
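Typical usage looks something like this, with test.pl being your test script:
$ perl -d:NYTProf test.pl    # run the script under the profiler
$ nytprofhtml                # turn the nytprof.out it wrote into an HTML report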
Best practices
Check out Perl Best Practices and the quick reference card. This page has a nice rundown of Damian Conway's OOP advice from PBP.
Also, you may wish to consider using Moose. If the long script startup time is acceptable in your usage, then Moose is a huge win.
question 1
It depends on what the module does. If it has lexical effects, then it will only affect the scope it is used in:
my $x;
{
    use integer;
    $x = 5/2;     # $x is now 2
}
my $y = 5/2;      # $y is now 2.5
If it is a normal module then it makes no difference where you use it, but it is common to use all of those modules at the top of the program.
question 2
Things that can affect the speed of a program between machines
speed of the processor
version of modules installed (some modules have XS versions that are much faster)
version of Perl
number of entries in PERL5LIB
speed of the drive
daotoad and Chas. Owens already answered the part of your question pertaining to the position of use statements. Let me remark on something else here:
I was doing this because I feel it's cleaner in case I need to shift the function around.
Personally, I find it much cleaner to have all the used modules in one place at the top of the file. You won't have to search for use statements to see what other modules are being used and a quick glance will tell you what is being used and even what is not being used.
Regarding your performance problem: with Apache and mod_perl the Perl interpreter will have to parse and compile your used modules only once. The next time the script is run, execution should be much faster. On the command line, however, a second run doesn't get this benefit.