perl add options in pod documentation from hash in GetOptions - perl

This is naive question for which I struggled to find an elegant solution. I write perl scripts that as they mature, grow in the number of options passed to GetOptions. The important options for the script, I add on top as POD documentation, but I rely on giving meaningful names to the other options, and I don't bother to document them explicitly.
I would like to pass the hash in GetOptions somehow to the content printed by perldoc, so that the non-documented options is listed there. Any option?

perldoc parses pod. You'd have to write a script to modify your .pl's pod based on values obtained by running your .pl... Yeah, that doesn't sound like a good idea.
You might be interested in Getopt::Euclid, Docopt, Getopt::Auto or Getopt::AsDocumented. These take the opposite approach: You define the options in the documentation, and they parse the documentation to determine how to process the command line.

I ralize that your point of view is code first and document later the things that become stable. That is fair when you need to write twice docs and code. But nowadays that is not true. My recomendation for prototyping is write the manpage and use a module that create the code of the options for you.
My favorite in perl and python is docopt (and has implementations for many other languages). Here you can find the Perl Docopt
I asked about perl implementatios for docopt long time ago in stackoverflow and I was pointed to the docopt module and other options like Getopt::Euclid, Getopt::Auto and Getopt::AsDocumented
Related with your concern about not incrementing cpan dependencies,
in python Docopt is selfcontained but I think it is not true for perl implementation which depends on
boolean,
Class::Accessor::Lite,
List::MoreUtils,
List::Util,
parent,
Pod::Usage
and
Scalar::Util; and the dependencies of the dependencies....
If the perl Docopt implementation is not as small and selfcontained as the original Python then probably is time to give it wider usage and start to fill feature/implementation requests because, IMHO, the doctopt way of doing things is ONE of the many ways of doing the things right.
Finally here you have a talk about the why of docopt. It is about python but command line interfaces logic and UX are the same for all the languages: PyCon UK 2012: Create beautiful command-line interfaces with Python

Related

How do I change the default ONE_AT_A_TIME_HARD hash function in Perl 5.18?

I'm not really familiar with Perl, but I've been searching in the documentation and other sources without success for the last 2 days. In the documentation, it is written:
Perl v5.18 includes support for multiple hash functions, and changed the default (to ONE_AT_A_TIME_HARD), you can choose a different algorithm by defining a symbol at compile time. For a current list, consult the INSTALL document. Note that as of Perl v5.18 we can only recommend use of the default or SIPHASH. All the others are known to have security issues and are for research purposes only.
The thing is that neither in INSTALL document nor in other sources/sites etc. I can find how to define this symbol.
What I want to do is to change the default ONE_AT_A_TIME_HARD hash function to ONE_AT_A_TIME_OLD so I can simulate the old Perl 5.16 behavior.
This sounds like an XY problem. What are you trying to accomplish by forcibly downgrading the hash algorithm in perl to one that has known problems?
From comments:
I need to run a lot of test cases written in perl 5.16 whose functionality depends on the old hash implementation and it's quite impossible to change the code as the cases are hundreds.
Whew, that's bad news. Find those developers, and hit them around the head with a copy perldata:
Hashes are unordered collections of scalar values indexed by their associated string key.
Specifically - if this is a problem for you, it means your codebase treats hashes as ordered, when they aren't and never were. (It's just they were fairly consistent before 5.18 and more random after).
From perldelta:
When encountering these changes, the key to cleaning up from them is to accept that hashes are unordered collections and to act accordingly.
See: http://blog.booking.com/hardening-perls-hash-function.html
To answer your question - if you really must:
./Configure -DPERL_HASH_FUNC_ONE_AT_A_TIME_OLD -des && make && make test
But it's a very very bad idea, because as the INSTALL file in your perl source package points out:
Note that as of Perl 5.18 we can only recommend the use of default or SIPHASH. All the others are known to have security issues and are for research purposes only.
By building your perl this way you introduce a known security flaw for every perl program using it.
Note - ONE_AT_A_TIME_HARD is the new default, so this won't change how perl 5.18 works. You may mean PERL_HASH_FUNC_ONE_AT_A_TIME_OLD

Is there runtime flow chart for Perl?

I am trying to better understand logic and flow of exceptions. So i got to state that i really feeled lack of understanding how Perl interpretes and runs programs, which phases are involved and what happens on every phase.
For example, I'd like to understand, when are binded STD* IO and when released, what is happening with $SIG{*} things, how they are depended with execepions, how program dies, etc. I'd like to have better insight of internals mechanics.
I am looking for links or books. I prefer some material which has also visual charts involved but this is not mandatory. I'd like to see some "big picture" of whole process, then i have already possibilities to dig further if i find it necessary.
I found Chapter 18th in Programming Perl gives overview of compiling phase and i try to work it trough, but i appreciate other good sources too.
Some alternative sources (there are not very many):
Mannning's Extending and Embedding Perl, which is the go-to reference on Perl's internals outside of the source
The chapter on the Perl internals in Advanced Perl Programming, which may be exactly what you want
Simon Cozens's Perl internals FAQ
Those may be more focused to what you're looking for. I'm not sure any of them explicitly spells out the interpreter's runtime execution order, though. The first one is a better "I want to work with this stuff" book; the second two are probably good introductory references.
Some of the questions you ask are not, as far as I know, explicitly documented - the I/O question being one I can't think of a good source for in particular. Exception handling is documented very well in Try::Tiny's documentation, and it's what we use for exceptions. Signal handling is messy, but perlipc documents it pretty well. With threads, you may be stuck with unsafe signals - I generally avoid threads in favor of multiple processes unless I must have shared memory.
You might start with these topics accessible via the perldoc program:
Internals and C Language Interface
perlembed Perl ways to embed perl in your C or C++ application
perldebguts Perl debugging guts and tips
perlxstut Perl XS tutorial
perlxs Perl XS application programming interface
perlxstypemap Perl XS C/Perl type conversion tools
perlclib Internal replacements for standard C library functions
perlguts Perl internal functions for those doing extensions
perlcall Perl calling conventions from C
perlmroapi Perl method resolution plugin interface
perlreapi Perl regular expression plugin interface
perlreguts Perl regular expression engine internals
perlapi Perl API listing (autogenerated)
perlintern Perl internal functions (autogenerated)
perliol C API for Perl's implementation of IO in Layers
perlapio Perl internal IO abstraction interface
perlhack Perl hackers guide
perlsource Guide to the Perl source tree
perlinterp Overview of the Perl interpreter source and how it works
perlhacktut Walk through the creation of a simple C code patch
perlhacktips Tips for Perl core C code hacking
perlpolicy Perl development policies
perlgit Using git with the Perl repository

Is there any current review of statistical modules for Perl?

I would like to know which is the current status of the statistical modules in CPAN, does any one know any recent review or could comment about its likes/dislikes with those modules?
I have used the clasical: Statistics::Descriptive, Statistics::Distributions, and some others contained in Bundle::Math::Statistics
Some of the modules has not been updated for long time. I don't know if this is because they are rock solid or has been overtaken by better modules.
Does someone know any current review similar to this old one:
Using Perl for Statistics: Data Processing and Statistical Computing
NB (for the people that will suggest to use R ;-)):
All my code is mainly in perl, but I use R a lot for statistics and plotting. I usually prepare the dataframes with perl and write the R script in the perl modules as templates and save to a file and execute them from perl. But sometimes you have small data sets where efficiency is not an issue (well I am using perl insn't it ;-)) and you want to add some statistics and histograms to your report produced with perl.
PDL, the Perl Data Language is alive and thriving so its worth taking a look at that.
And I think the other stats modules you mention are OK. For eg. Statistics::Descriptive is up-to-date and has been used in answers to a few questions here on Stackoverflow.
NB. There is also a Perl to R bridge called Statistics::R which looks interesting.
/I3az/

Why are Perl source filters bad and when is it OK to use them?

It is "common knowledge" that source filters are bad and should not be used in production code.
When answering a a similar, but more specific question I couldn't find any good references that explain clearly why filters are bad and when they can be safely used. I think now is time to create one.
Why are source filters bad?
When is it OK to use a source filter?
Why source filters are bad:
Nothing but perl can parse Perl. (Source filters are fragile.)
When a source filter breaks pretty much anything can happen. (They can introduce subtle and very hard to find bugs.)
Source filters can break tools that work with source code. (PPI, refactoring, static analysis, etc.)
Source filters are mutually exclusive. (You can't use more than one at a time -- unless you're psychotic).
When they're okay:
You're experimenting.
You're writing throw-away code.
Your name is Damian and you must be allowed to program in latin.
You're programming in Perl 6.
Only perl can parse Perl (see this example):
#result = (dothis $foo, $bar);
# Which of the following is it equivalent to?
#result = (dothis($foo), $bar);
#result = dothis($foo, $bar);
This kind of ambiguity makes it very hard to write source filters that always succeed and do the right thing. When things go wrong, debugging is awkward.
After crashing and burning a few times, I have developed the superstitious approach of never trying to write another source filter.
I do occasionally use Smart::Comments for debugging, though. When I do, I load the module on the command line:
$ perl -MSmart::Comments test.pl
so as to avoid any chance that it might remain enabled in production code.
See also: Perl Cannot Be Parsed: A Formal Proof
I don't like source filters because you can't tell what code is going to do just by reading it. Additionally, things that look like they aren't executable, such as comments, might magically be executable with the filter. You (or more likely your coworkers) could delete what you think isn't important and break things.
Having said that, if you are implementing your own little language that you want to turn into Perl, source filters might be the right tool. However, just don't call it Perl. :)
It's worth mentioning that Devel::Declare keywords (and starting with Perl 5.11.2, pluggable keywords) aren't source filters, and don't run afoul of the "only perl can parse Perl" problem. This is because they're run by the perl parser itself, they take what they need from the input, and then they return control to the very same parser.
For example, when you declare a method in MooseX::Declare like this:
method frob ($bubble, $bobble does coerce) {
... # complicated code
}
The word "method" invokes the method keyword parser, which uses its own grammar to get the method name and parse the method signature (which isn't Perl, but it doesn't need to be -- it just needs to be well-defined). Then it leaves perl to parse the method body as the body of a sub. Anything anywhere in your code that isn't between the word "method" and the end of a method signature doesn't get seen by the method parser at all, so it can't break your code, no matter how tricky you get.
The problem I see is the same problem you encounter with any C/C++ macro more complex than defining a constant: It degrades your ability to understand what the code is doing by looking at it, because you're not looking at the code that actually executes.
In theory, a source filter is no more dangerous than any other module, since you could easily write a module that redefines builtins or other constructs in "unexpected" ways. In practice however, it is quite hard to write a source filter in a way where you can prove that its not going to make a mistake. I tried my hand at writing a source filter that implements the perl6 feed operators in perl5 (Perl6::Feeds on cpan). You can take a look at the regular expressions to see the acrobatics required to simply figure out the boundaries of expression scope. While the filter works, and provides a test bed to experiment with feeds, I wouldn't consider using it in a production environment without many many more hours of testing.
Filter::Simple certainly comes in handy by dealing with 'the gory details of parsing quoted constructs', so I would be wary of any source filter that doesn't start there.
In all, it really depends on the filter you are using, and how broad a scope it tries to match against. If it is something simple like a c macro, then its "probably" ok, but if its something complicated then its a judgement call. I personally can't wait to play around with perl6's macro system. Finally lisp wont have anything on perl :-)
There is a nice example here that shows in what trouble you can get with source filters.
http://shadow.cat/blog/matt-s-trout/show-us-the-whole-code/
They used a module called Switch, which is based on source filters. And because of that, they were unable to find the source of an error message for days.

What's the modern way of declaring which version of Perl to use?

When it come to saying what version of Perl we need for our scripts, we've got options, oh, brother, we've got options:
use 5.010;
use 5.010_001;
use 5.10.0;
use v5.10;
use v5.10.0;
All seem to work. perlcritic complains about all but the first two. (It's unfortunate that the v strings seem to have such flaws, since Perl 6 expects you to do use v6; for your Perl 6 scripts...)
So, what should we be doing to indicate that we want to use a particular version of perl?
There are really only two options: decimal numbers and v-strings. Which form to use depends in part on which versions of Perl you want to "support" with a meaningful error message instead of a syntax error. (The v-string syntax was added in Perl 5.6.) The accepted best practice -- which is what perlcritic enforces -- is to use decimal notation. You should specify the minimum version of Perl that's required for your script to behave properly. Normally that means declaring a dependency on language features added in a major release, such as using the say function added in 5.10. You should include the patch level if it's important for your script to behave properly. For example, some of my code specifies use 5.008001 because it depends on the fix for a bug that 5.8.0 had which was fixed in 5.8.1.
I just use something like 5.010_001. I've grow weary of dealing with version string problems for something that should be mind-numbingly simple.
Since I mostly deal with build systems, I have the constant struggle of Module::Build's internal version.pm which is out of sync with the version.pm on CPAN. I think that's mostly better now, but I have better things to think about.
The best practice should always be to do the thing that commands the least of your attention, and certainly not take more attention than the value it gives back. In my opinion, v-strings and dotted decimals were a huge distraction with no additional benefit, wasting a lot of valuable programmer time just to get back to the starting point.
I should also note that Perl::Critic has often pushed questionable practices for the higher purpose of reducing the ways that people do things. However, those practices often cause problems, make them un-best. This is one of those cases. A more realistic best practice is to not make Perl::Critic compliance your goal. Use it where it is useful, but in cases like this, don't waste mental time on it.
The "modern" way is to use the forms starting with v. However, that may not necessarily be what you really want to do.
Critic complains because older versions of Perl won't understand and play nicely with the forms that start with v. However, if your version of Perl supports it, v is nicer to read because you can say:
use v5.10.1;
... rather than ...
use 5.010_001;
So, in the documentation for use, the following workaround is offered:
use 5.006; use v5.6.1;
NB: I think the documenation is in error here, as the v is omitted from the example at perldoc use.
Since the versions of Perl that don't support the v syntax will fail at the first use, they won't get to the second more specific and readable one.