Perl does not have a simple #include, OK, but WHY? - perl

Perl does not have a C style preprocessor level "include" function. That is how it is, and there are numerous sites that explain how to more or less emulate the same sort of behavior.
The one thing I couldn't find on any of these sites is any explanation for WHY perl does not have this functionality. Given that Perl often provides many different ways to accomplish the same thing, it is a curious omission.
Can somebody please explain why the decision was made to exclude this sort of functionality?

Perl already has require, do, eval and here documents among other things. It doesn't need a builtin preprocessor, if you need one that badly, there are filters. http://perldoc.perl.org/perlfilter.html
In general, nobody wants #include, even C and C++ programmers would mostly be happy to give it up in exchange for:
Faster compiles
Clean module system
#include is legacy, period. If a mainstream language designer announced tomorrow that they were adding #include to (your favorite language here) you'd probably see mass hysteria, laughing, and loss of confidence in that designer.
Language designers don't implement #include in any new language, there are simply better ways to do it. In general the trend is to attempt to achieve single pass lexing. Preprocessing requires you to incrementally expand #includes and potentially revisit the same characters repeatedly. It has been wrought with problems, and is one of the reasons that C++ is such dog to compile. It was ok in the 60s and 70s when memory and CPU were tiny and languages and problems were simpler, as were codebases. Nowadays, you want to be able to compile a "library" once, store its type metadata with it so the compiler can access it efficiently without rescanning it. That is what Microsoft does anyway with precompiled headers.
So what would #include be good for?
Modules ? No. See above. Modules are compiled once, export their metadata efficiently, they don't pollute the namespace of the clients, they don't recursively inject other includes, they can be distributed in binary form, among umpteen other advantages that I'm not even smart enough to think of.
Including macros ? No. Replace with constants, inlining and generic programming. All of which can be precompiled and expored from a module.
Splicing in generated code ? Better ways to do it anyway. See modules.
The only useful functionality for the preprocessor, IMO, is conditional compilation.
#ifdef _WIN32_
// do windowsy stuff
#else
#endif
Again, Perl can do this with do, eval or require as well.

Perl doesn't have or lack it any more than C does.
The C preprocessor was designed such that it and C need to know as little as possible about each other. There is no reason why you can't use it with Perl.
So why don't Perl programmers do it?
As codenhein explains, it's generally a bad idea to use an include mechanism with a compiler that don't know anything about each other, as it leaves you open to some crazy errors that neither can diagnose; the fact that C programmers are used to it doesn't change that.

Related

Is there runtime flow chart for Perl?

I am trying to better understand logic and flow of exceptions. So i got to state that i really feeled lack of understanding how Perl interpretes and runs programs, which phases are involved and what happens on every phase.
For example, I'd like to understand, when are binded STD* IO and when released, what is happening with $SIG{*} things, how they are depended with execepions, how program dies, etc. I'd like to have better insight of internals mechanics.
I am looking for links or books. I prefer some material which has also visual charts involved but this is not mandatory. I'd like to see some "big picture" of whole process, then i have already possibilities to dig further if i find it necessary.
I found Chapter 18th in Programming Perl gives overview of compiling phase and i try to work it trough, but i appreciate other good sources too.
Some alternative sources (there are not very many):
Mannning's Extending and Embedding Perl, which is the go-to reference on Perl's internals outside of the source
The chapter on the Perl internals in Advanced Perl Programming, which may be exactly what you want
Simon Cozens's Perl internals FAQ
Those may be more focused to what you're looking for. I'm not sure any of them explicitly spells out the interpreter's runtime execution order, though. The first one is a better "I want to work with this stuff" book; the second two are probably good introductory references.
Some of the questions you ask are not, as far as I know, explicitly documented - the I/O question being one I can't think of a good source for in particular. Exception handling is documented very well in Try::Tiny's documentation, and it's what we use for exceptions. Signal handling is messy, but perlipc documents it pretty well. With threads, you may be stuck with unsafe signals - I generally avoid threads in favor of multiple processes unless I must have shared memory.
You might start with these topics accessible via the perldoc program:
Internals and C Language Interface
perlembed Perl ways to embed perl in your C or C++ application
perldebguts Perl debugging guts and tips
perlxstut Perl XS tutorial
perlxs Perl XS application programming interface
perlxstypemap Perl XS C/Perl type conversion tools
perlclib Internal replacements for standard C library functions
perlguts Perl internal functions for those doing extensions
perlcall Perl calling conventions from C
perlmroapi Perl method resolution plugin interface
perlreapi Perl regular expression plugin interface
perlreguts Perl regular expression engine internals
perlapi Perl API listing (autogenerated)
perlintern Perl internal functions (autogenerated)
perliol C API for Perl's implementation of IO in Layers
perlapio Perl internal IO abstraction interface
perlhack Perl hackers guide
perlsource Guide to the Perl source tree
perlinterp Overview of the Perl interpreter source and how it works
perlhacktut Walk through the creation of a simple C code patch
perlhacktips Tips for Perl core C code hacking
perlpolicy Perl development policies
perlgit Using git with the Perl repository

How can i test Perl code for DRY (Don't Repeat Yourself)

For Python, we could use something like Python Code Clone Detector
But i just could not find anything for Perl.
With reference to DRY, Catalyst mentions that its build on DRY principle. and if it is i would imagine some tool might have been used to verify that claim.
Furthermore does Perl promote DRY or not ? I know for sure it promotes repeat Others by using CPAN.
You probably mean "Perl promotes 'do not repeat others' by providing CPAN", and that is certainly true.
However, DRY is more of a general programming principle (write many specialized, small functions that can be parametrized properly by their arguments instead of writing monolithic functions that "do it all") than a language feature. You can write DRY-compliant code in C++, Python, Perl, Ruby, C and most others. Some languages require more boilerplate, some less.
Perl definitely allows for small functions with few boilerplate by providing concise language constructs.
I don't know of tools detecting non-DRY code for Perl, though.

Does Common Lisp have the fastest PCRE implementation?

A friend claimed that Common Lisp has the fastest Perl-compatible regular expression library of any language, including Perl itself, because with an optimizing JIT compiler like SBCL, CL-PPCRE can compile each particular regex down to native assembly whereas other implementations, include Perl's, must generate bytecode and interpret it. In practice, especially for the common case where we try to match the same regex against many inputs or long input, the compilation overhead is more than justified.
Unfortunately, I can't find any benchmarks on this, and I don't know enough to run my own, so I turn to the hive mind. Can anyone evaluate this claim?
I have no benchmarks of my own to share, but perhaps your friend was referring to results concerning the portable regular expression library CL-PPCRE. The current web page no longer speaks of benchmarks, but courtesy of the Wayback Machine we can see that it used to show benchmark results where CL-PPCRE outperformed Perl 2-to-1. Benchmarking is a tricky business (especially for moving targets), which might explain why the current page is silent on the matter.
PCRE library has a JIT compiler module now, and it has a good performance compared to other regexes: http://sljit.sourceforge.net/regex_perf.html or http://blog.rburchell.com/2011/12/why-i-avoid-qregexp-in-qt-4-and-so.html

What is best practice as far as using perl-isms (idiomatic expressions) in Perl?

A couple of years back I participated in writing the best practices/coding style for our (fairly large and often Perl-using) company. It was done by a committee of "senior" Perl developers.
As anything done by consensus, it had parts which everyone disagreed with. Duh.
The part that rubbed wrong the most was a strong recommendation to NOT use many Perlisms (loosely defined as code idioms not present in, say C++ or Java), such as "Avoid using '... unless X;' constructs".
The main rationale posited for such rules as this one was that non-Perl developers would have much harder time with the Perl code base otherwise. The assumption here I guess is that Perl code jockeys are rarer breed overall - and among new hires to the company - than non-Perlers.
I was wondering whether SO has any good arguments to support or reject this logic... it is mostly academic curiosity at this point as the company's Perl coding standard is ossified and will never be revised again as far as I'm aware.
P.S. Just to be clear, the question is in the context I noted - the answer for an all-Perl smaller development shop is obviously a resounding "use Perl to its maximum capability".
I write code assuming that a competent Perl programmer will be reading it. I don't go out of my way to be clever, but I don't dumb it down either.
If you're writing code for people who don't know the language, you're going to miss most of the point of using that language. I often find that people want to outlaw Perlisms because they refuse to learn any more than they already know.
Since you say that you are in a small Perl shop, it should be pretty easy to ask the person who wrote the code what it means if you don't understand it. That sort of stuff should come up in code reviews and so on. Everyone continues to learn more about the language as you have periodic and regular chances to review the code. You shouldn't let too much time elapse without other eyeballs looking at someone's code. You certainly shouldn't wait until a week after they leave the company.
As for new hires, I'm always puzzled why anyone would think that you should sit them in front of a keyboard and turn them loose expecting productive work in a codebase they have never seen.
This isn't limited to Perl, either. It's a general programming issue. You should always be learning more about your tools. Most of the big shops I know have mini-bootcamps to bring developers up to speed on the codebase, including any bits of tricky code they may encounter.
I ask myself two simple questions:
Am I doing this because it's devilishly clever and/or shows off my extensive knowledge of Perl arcana?
Then it's a bad idea. But,
Am I doing this because it's idiomatic Perl and benefits from Perl's distinct advantages?
Then it's a good idea.
I see no justifiable reason to reject, say, string interpolation just because Java and C don't have it. unless is a funny one but I think having a subroutine start with the occasional
return undef unless <something>;
isn't so bad.
What sort of perlisms do you mean?
Good:
idiomatic for loops: for(1..5) {} or for( #foo ) {}
Scalar context evaluation of arrays: my $count = #items;
map, grep and sort: my %foo = map { $_->id => $_ } #objects;
OK if limited:
statement modifier control - trailing if, unless, etc.
Restrict to error trapping and early returns. die "Bad juju\n" unless $foo eq 'good juju';
As Schwern pointed out, another good use is conditional assignment of default values: my $foo = shift; $foo = 'blarg' unless defined $foo;. This usage is, IMO, cleaner than a my $foo = defined $_[0] ? shift : 'blarg';.
Reason to avoid: if you need to add additional behaviors to the check or an else, you have a big reformatting job. IMO, the hassle to redo a statement (even in a good editor) is more disruptive than typing several "unnecessary" blocks.
Prototypes - use only to create filtery functions like map. Prototypes are compiler hints not 'prototypes' in the sense of any other language.
Logical operators - standardize on when to use and and or vs. && and ||. All your code should be consistent. Best if you use a Perl::Critic policy to enforce.
Avoid:
Local variables. Dynamic scope is damn weird, and local is not the same as local anywhere else.
Package variables. Enables bad practices. If you think you need globally shared state, refactor. If you still need globally shared state, use a singleton.
Symbol table hackery
It must have been, as you say, a few years ago, because Damian Conway has 'cornered the market' in Perl standards with Perl Best Practices for the last few years.
I've worked in a similarly ossified environment - where we were not allowed to adopt the latest best practices, because that would be a change, and no one at a sufficiently high level in the corporate structure understood (or could be bothered to understand) Perl and sign off on moving in to the 21st Century.
A corporation that deploys a technology and retains it, but doesn't either buy in expertise or train up in house, is asking for trouble.
(I'd guess you're working in a highly change-controlled environment - financial perhaps?)
I agree with brian on this by the way.
I'd say Moose kills off 99.9% of Perl-isms, by convention, that shouldn't be used: symbol table hackery, reblessing objects, common blackbox violations: treating objects as arrays or hashes. The great thing, is it does all of this without taking the functionality hit of "not using it".
If the "perl-isms" you're really referring to are mutator form (warn "bad idea" unless $good_idea), unless, and until then I don't think you really have much of an argument because these "perlisms" don't seem to inhibit readability to either perl users, or non-perl users.
Pick up a copy of Effective Perl Programming: Ways to Write Better, More Idiomatic Perl (2nd Edition), and treat that as a guideline. It contains many of the better idioms and is packed with the little bits of information that will get you writing good Perl style Perl code, as opposed to C or Java (or whatever) style Perl code.

Why are Perl source filters bad and when is it OK to use them?

It is "common knowledge" that source filters are bad and should not be used in production code.
When answering a a similar, but more specific question I couldn't find any good references that explain clearly why filters are bad and when they can be safely used. I think now is time to create one.
Why are source filters bad?
When is it OK to use a source filter?
Why source filters are bad:
Nothing but perl can parse Perl. (Source filters are fragile.)
When a source filter breaks pretty much anything can happen. (They can introduce subtle and very hard to find bugs.)
Source filters can break tools that work with source code. (PPI, refactoring, static analysis, etc.)
Source filters are mutually exclusive. (You can't use more than one at a time -- unless you're psychotic).
When they're okay:
You're experimenting.
You're writing throw-away code.
Your name is Damian and you must be allowed to program in latin.
You're programming in Perl 6.
Only perl can parse Perl (see this example):
#result = (dothis $foo, $bar);
# Which of the following is it equivalent to?
#result = (dothis($foo), $bar);
#result = dothis($foo, $bar);
This kind of ambiguity makes it very hard to write source filters that always succeed and do the right thing. When things go wrong, debugging is awkward.
After crashing and burning a few times, I have developed the superstitious approach of never trying to write another source filter.
I do occasionally use Smart::Comments for debugging, though. When I do, I load the module on the command line:
$ perl -MSmart::Comments test.pl
so as to avoid any chance that it might remain enabled in production code.
See also: Perl Cannot Be Parsed: A Formal Proof
I don't like source filters because you can't tell what code is going to do just by reading it. Additionally, things that look like they aren't executable, such as comments, might magically be executable with the filter. You (or more likely your coworkers) could delete what you think isn't important and break things.
Having said that, if you are implementing your own little language that you want to turn into Perl, source filters might be the right tool. However, just don't call it Perl. :)
It's worth mentioning that Devel::Declare keywords (and starting with Perl 5.11.2, pluggable keywords) aren't source filters, and don't run afoul of the "only perl can parse Perl" problem. This is because they're run by the perl parser itself, they take what they need from the input, and then they return control to the very same parser.
For example, when you declare a method in MooseX::Declare like this:
method frob ($bubble, $bobble does coerce) {
... # complicated code
}
The word "method" invokes the method keyword parser, which uses its own grammar to get the method name and parse the method signature (which isn't Perl, but it doesn't need to be -- it just needs to be well-defined). Then it leaves perl to parse the method body as the body of a sub. Anything anywhere in your code that isn't between the word "method" and the end of a method signature doesn't get seen by the method parser at all, so it can't break your code, no matter how tricky you get.
The problem I see is the same problem you encounter with any C/C++ macro more complex than defining a constant: It degrades your ability to understand what the code is doing by looking at it, because you're not looking at the code that actually executes.
In theory, a source filter is no more dangerous than any other module, since you could easily write a module that redefines builtins or other constructs in "unexpected" ways. In practice however, it is quite hard to write a source filter in a way where you can prove that its not going to make a mistake. I tried my hand at writing a source filter that implements the perl6 feed operators in perl5 (Perl6::Feeds on cpan). You can take a look at the regular expressions to see the acrobatics required to simply figure out the boundaries of expression scope. While the filter works, and provides a test bed to experiment with feeds, I wouldn't consider using it in a production environment without many many more hours of testing.
Filter::Simple certainly comes in handy by dealing with 'the gory details of parsing quoted constructs', so I would be wary of any source filter that doesn't start there.
In all, it really depends on the filter you are using, and how broad a scope it tries to match against. If it is something simple like a c macro, then its "probably" ok, but if its something complicated then its a judgement call. I personally can't wait to play around with perl6's macro system. Finally lisp wont have anything on perl :-)
There is a nice example here that shows in what trouble you can get with source filters.
http://shadow.cat/blog/matt-s-trout/show-us-the-whole-code/
They used a module called Switch, which is based on source filters. And because of that, they were unable to find the source of an error message for days.