Perl 5.20 and the fate of smart matching and given-when


I just installed Perl 5.18, and I get a lot of warnings like this,
given is experimental at .\[...].pl line [...].
when is experimental at .\[...].pl line [...].
Smartmatch is experimental at C:/strawberry/perl/site/lib/[...] line [...].
Looking into these warnings -- which I've never heard mentioned anywhere -- I was only able to find them discussed in two places:
perldelta for 5.18, which mentions them only to say that the feature has been downgraded to experimental
this nntp.perl.org post
The perldelta still says the most about what's happening with those features. It's buried halfway down the pod:
Smart match, added in v5.10.0 and significantly revised in v5.10.1, has been a regular point of complaint. Although there are a number of ways in which it is useful, it has also proven problematic and confusing for both users and implementors of Perl. There have been a number of proposals on how to best address the problem. It is clear that smartmatch is almost certainly either going to change or go away in the future. Relying on its current behavior is not recommended. Warnings will now be issued when the parser sees ~~, given, or when.
I'm confused about how the most significant change in Perl in the past 10 years could be pulled. I've started using given, when, and smartmatch all over the place. Is there any more information about these features? How is anyone finding them "confusing"? How are these features likely to change? Is there a plan to implement these features with a module?

There are problems with the design of smart-matching. The decision of what any given TYPE ~~ TYPE should do is most often unobvious, inconsistent and/or disputed. The idea isn't to remove smart matching; it's to fix it.
Specifically, ~~ will be greatly simplified, as you can see in a proposal by the 5.18 pumpking. Decisions as to how two things should match will be done with helpers such as those that already exist in Smart::Match.
... ~~ any(...)
It is much more readable, much more flexible (fully extensible), and solves a number of problems (such as "When should X be considered a number, and when should it be considered a string?").
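To make the number-versus-string question concrete without relying on ~~ itself, here is a minimal sketch. It uses any from the core List::Util module (assuming a List::Util recent enough to provide it), not the Smart::Match helpers mentioned above, and the sample data is made up for illustration. The point is only that the comparison operator is spelled out by the programmer instead of being guessed from the operands.

```perl
use strict;
use warnings;
use List::Util qw(any);

# Made-up sample data for the illustration.
my @versions = ('1.0', '2.50');

# String comparison: '1' and '1.0' are different strings.
my $as_string = any { $_ eq '1' } @versions;

# Numeric comparison: 1 and 1.0 are the same number.
my $as_number = any { $_ == 1 } @versions;

print "string match: ", ($as_string ? "yes" : "no"), "\n";
print "number match: ", ($as_number ? "yes" : "no"), "\n";
```

The same list gives two different answers depending on which comparison is meant, which is the kind of decision a bare TYPE ~~ TYPE has to make on your behalf.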

Some insights might be gained by reading rjbs's proposed changes to smartmatch. He is the pumpking (Perl release manager) after all, so his comments and his view of the future are more relevant than most. There is also plenty of community comment on the matter; see here for instance. The 'experimental' status is in effect because, since things are likely to change in the future, it is responsible to inform users of that fact, even if we don't know what those changes will be.

Well, that's what's said in the description of the patch that downgraded this set of features to experimental:
The behavior of given/when/~~ are likely to change in perl 5.20.0:
either smart match will be removed or stripped down. In light of this,
users of these features should be warned. A category
"experimental::smartmatch" warning should be issued for these features
when they are used.
So while you can indeed turn these warnings off, with something like this (source):
no if $] >= 5.018, warnings => "experimental::smartmatch";
... it's just turning your eyes off the problem.
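One way to sidestep the warnings rather than silence them is to avoid the experimental constructs where a plain chain of tests will do. A minimal sketch, assuming a hypothetical classify sub that might otherwise have been written with given/when:

```perl
use strict;
use warnings;

# Hypothetical example: classify a value without given/when or ~~.
sub classify {
    my ($input) = @_;
    return 'empty'  if !defined($input) || $input eq '';
    return 'number' if $input =~ /\A-?\d+\z/;
    return 'word'   if $input =~ /\A\w+\z/;
    return 'other';
}

print classify($_), "\n" for '42', 'perl', 'two words';
```

Each test states explicitly whether it is a regex match or a string comparison, so nothing here depends on how smart matching ends up being redefined.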

Related

What parts of given/when are experimental?

Has the entire "switch" feature become experimental? Are there parts of it I can rely on using without future versions of Perl breaking my code? In general, what is the policy toward changing stable features to experimental?
Background
use feature "switch" has been in Perl since 5.10. From 5.10 to 5.14, perlsyn seems to indicate that this is a stable, supported feature.
Starting with 5.16, however, perlsyn begins to call it "an experimental switch feature" and gets a lot more confusing about what's considered experimental.
Parts of the documentation seem to indicate everything about the feature is experimental:
Under the "switch" feature, Perl gains the experimental keywords given, when, default, continue, and break.
There's even an entire section about the Experimental Details on given and when.
However, perlsyn also says that "The foreach is the non-experimental way to set a topicalizer" and gives an example that seems to imply that foreach/when is not experimental.
As far as I can tell, the "experimental" language came from commit c2f1e22, which references RT #90926, which still doesn't give much context, even when paired with RT #90906.

Has the entire "switch" feature become experimental?
No. It has always been.
Upd: Oh wow, maybe I'm wrong. I can't find a mention of that in 5.10.0 or .1. Maybe it wasn't? Or maybe they forgot to note it? Either way, it seems they messed up worse than I thought if so! But based on what I've seen since, I think the lesson was learned. (e.g. I still think values $ref is a bad idea, but at least it was marked experimental from day 1.)
Are there parts of it I can rely on using without future versions of Perl breaking my code?
Technically, no, although the devs are always careful when it comes to backwards compatibility.
In general, what is the policy toward changing stable features to experimental?
I don't see that ever happening. The deprecation process would be used instead.
Changes so far:
given is changing from creating a lexical $_ to localising $_ like foreach loops in 5.18 (or did it already happen in 5.16?).
5.10.1 saw some big changes in smart-matching*. Don't use (smart-matching in) 5.10.0.
Possible future changes:
The behaviour of smart-matching* is still a hot topic.
* — True, this is a feature distinct from given-when, but it's the same or closely related in most people's minds.

How to test/classify CPAN modules for utf8 correctness

Here is an excellent question, and the wonderful tchrist's answer with 7+24+52 pieces of advice and comments on how to make a Perl program utf8 safe.
But there are 19k CPAN modules. What can be done to tell the "good" ones from the "bad" ones (from the utf8 point of view)?
For example, File::Slurp: if you read a file with
#use strict encoding warnings utf8 autodie... etc....
my $str = read_file($file, binmode => ':utf8');
you will get different results depending on the command line switches, and perl -CSDA will not work. Sad. (Yes, I know that Encode::decode("utf8", read_file($file, binmode => ':raw')); will help, but it's sad anyway.)
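For comparison, the explicit route (read raw bytes, then decode) can be sketched with core modules only. This is a sketch under assumptions, not a statement about how File::Slurp works internally: slurp_utf8 is a hypothetical helper name, and the temporary file exists only to make the demo self-contained.

```perl
use strict;
use warnings;
use Encode qw(decode);
use File::Temp qw(tempfile);

# Hypothetical helper: read raw bytes, then decode them explicitly,
# so the result does not depend on -C switches or default layers.
sub slurp_utf8 {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $bytes = do { local $/; <$fh> };
    close $fh;
    return decode('UTF-8', $bytes, Encode::FB_CROAK());   # die on malformed input
}

# Demo: "é" is two bytes in UTF-8 but one character once decoded.
my ($out, $file) = tempfile(UNLINK => 1);
binmode $out, ':raw';
print {$out} "caf\xC3\xA9\n";
close $out;

my $text = slurp_utf8($file);
print length($text), "\n";   # 5 characters: c, a, f, é, newline
```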
My questions:
Is there any preferred way to test/classify which CPAN modules are utf8 safe/ready/correct?
Is there some Test::something already written for utf8 testing?
Is there something like Perl::Critic for utf8 that will check the module source for possible utf8 incorrectness? (Manually checking sources for 7+24+52 things is not what I would call the "easy way" of programming.)
Or any other way? :)
I understand that many CPAN modules simply do not need to know about utf8. But there are a zillion others that should.
Please, don't misunderstand me. I love the Perl language. I know that Perl has extremely powerful utf8 capabilities (especially 5.14). The above was not meant as a critique of Perl, but I (and probably some others too) need to know which CPAN modules are OK, and how to classify them.
When you develop using several CPAN modules and everything goes well initially, but in final testing you find that some modules do not support utf8 and therefore part of your work is useless, that can really cause a bit of disillusionment. :(
Edit:
I understand that all the complicated things around Unicode have two roots:
Unicode itself - as tchrist excellently analyzed some of the problematic points
Perl - it simply can't break all working modules, live servers etc., so it needs to maintain backward compatibility.
My only hope: perl6. It is a totally new and different language and does not need to maintain any backward compatibility. So I hope that in perl6 some things that are not possible in perl5 will be the default, and all utf8 things will be much more intuitive.
But, back to modules: @daxim said: "Authors won't even reveal whether their module is taint-safe, and this feature exists for decades!" - and this is a catastrophe. Maybe (a big maybe, and honestly I have no idea how to do it) we have arrived at the time when much, much harder restrictions need to be put on CPAN submissions.
On one side I'm really very happy with the volunteer work of CPAN authors. On the other side, publishing source code is not only a free speech "right"; it should obey some rules too.
I understand that it is nearly impossible to make any "revolution", but we probably need some "evolution". Maybe flag any CPAN module that is not utf8 safe. Flag all that are not taint safe. Flag (like here on SO) modules that do not meet minimal coding standards and remove them. Maybe I'm an idealist and/or naive. :)

Chill, the situation is less dire than you're thinking. No one except tchrist operates on this level of Unicode correctness, also see Aristotle's recent commentary. As with all things, you get 80% of the way with 20% of the effort. This base effort, namely getting the topic of character encoding right, is well documented; and jrockway repeats it in his answer in that thread.
Replies to your specific questions:
No, there isn't. There is no concerted effort to collect this information in a central place. The Perl 5 wiki could be used to document problematic modules, Juerd already discusses some in uniadvice. I would really like to see a statement from each module author in their documentation that "this module DTRT w.r.t. encoding", but I don't see it happening. Authors won't even reveal whether their module is taint-safe, and this feature exists for decades!
encoding::warnings can be used to smoke out unintended upgrades. I mention it in the work-flow of Checklist for going the Unicode way with Perl.
You can't do that with Perl::Critic or static analysis. I see no other way than knowledgeable people poking at the module with pointy characters until it falls apart (or not), like mirod just commented.

What is best practice as far as using perl-isms (idiomatic expressions) in Perl?

A couple of years back I participated in writing the best practices/coding style for our (fairly large and often Perl-using) company. It was done by a committee of "senior" Perl developers.
As anything done by consensus, it had parts which everyone disagreed with. Duh.
The part that rubbed people the wrong way the most was a strong recommendation to NOT use many Perlisms (loosely defined as code idioms not present in, say, C++ or Java), such as "Avoid using '... unless X;' constructs".
The main rationale posited for rules such as this one was that non-Perl developers would otherwise have a much harder time with the Perl code base. The assumption here, I guess, is that Perl code jockeys are a rarer breed overall - and among new hires to the company - than non-Perlers.
I was wondering whether SO has any good arguments to support or reject this logic... it is mostly academic curiosity at this point as the company's Perl coding standard is ossified and will never be revised again as far as I'm aware.
P.S. Just to be clear, the question is in the context I noted - the answer for an all-Perl smaller development shop is obviously a resounding "use Perl to its maximum capability".

I write code assuming that a competent Perl programmer will be reading it. I don't go out of my way to be clever, but I don't dumb it down either.
If you're writing code for people who don't know the language, you're going to miss most of the point of using that language. I often find that people want to outlaw Perlisms because they refuse to learn any more than they already know.
Since you say that you are in a small Perl shop, it should be pretty easy to ask the person who wrote the code what it means if you don't understand it. That sort of stuff should come up in code reviews and so on. Everyone continues to learn more about the language as you have periodic and regular chances to review the code. You shouldn't let too much time elapse without other eyeballs looking at someone's code. You certainly shouldn't wait until a week after they leave the company.
As for new hires, I'm always puzzled why anyone would think that you should sit them in front of a keyboard and turn them loose expecting productive work in a codebase they have never seen.
This isn't limited to Perl, either. It's a general programming issue. You should always be learning more about your tools. Most of the big shops I know have mini-bootcamps to bring developers up to speed on the codebase, including any bits of tricky code they may encounter.

I ask myself two simple questions:
Am I doing this because it's devilishly clever and/or shows off my extensive knowledge of Perl arcana?
Then it's a bad idea. But,
Am I doing this because it's idiomatic Perl and benefits from Perl's distinct advantages?
Then it's a good idea.
I see no justifiable reason to reject, say, string interpolation just because Java and C don't have it. unless is a funny one but I think having a subroutine start with the occasional
return undef unless <something>;
isn't so bad.

What sort of perlisms do you mean?
Good:
idiomatic for loops: for(1..5) {} or for( @foo ) {}
Scalar context evaluation of arrays: my $count = @items;
map, grep and sort: my %foo = map { $_->id => $_ } @objects;
OK if limited:
statement modifier control - trailing if, unless, etc.
Restrict to error trapping and early returns. die "Bad juju\n" unless $foo eq 'good juju';
As Schwern pointed out, another good use is conditional assignment of default values: my $foo = shift; $foo = 'blarg' unless defined $foo;. This usage is, IMO, cleaner than a my $foo = defined $_[0] ? shift : 'blarg';.
Reason to avoid: if you need to add additional behaviors to the check or an else, you have a big reformatting job. IMO, the hassle to redo a statement (even in a good editor) is more disruptive than typing several "unnecessary" blocks.
Prototypes - use only to create filtery functions like map. Prototypes are compiler hints not 'prototypes' in the sense of any other language.
Logical operators - standardize on when to use and and or vs. && and ||. All your code should be consistent. Best if you use a Perl::Critic policy to enforce.
Avoid:
Local variables. Dynamic scope is damn weird, and local is not the same as local anywhere else.
Package variables. Enables bad practices. If you think you need globally shared state, refactor. If you still need globally shared state, use a singleton.
Symbol table hackery
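A few of the idioms from the "Good" and "OK if limited" lists above, pulled together in one small sketch. The greet sub and the sample data are hypothetical and only there to give the idioms something to operate on:

```perl
use strict;
use warnings;

# Hypothetical helper showing the default-value and error-trapping idioms.
sub greet {
    my $name = shift;
    $name = 'world' unless defined $name;         # default value
    die "Bad name\n" unless $name =~ /\A\w+\z/;   # error trapping
    return "Hello, $name!";
}

my @items = qw(apple banana cherry);

my $count     = @items;                            # array in scalar context
my %length_of = map { $_ => length $_ } @items;    # build a lookup hash with map
my @long      = grep { $length_of{$_} > 5 } @items;

print greet(), "\n";
print "$count items, long ones: @long\n";
```

Note that each trailing unless guards a single statement; as soon as one of them needs an else, rewriting it as a full block is the cleaner move, which is the reformatting cost mentioned above.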

It must have been, as you say, a few years ago, because Damian Conway has 'cornered the market' in Perl standards with Perl Best Practices for the last few years.
I've worked in a similarly ossified environment - where we were not allowed to adopt the latest best practices, because that would be a change, and no one at a sufficiently high level in the corporate structure understood (or could be bothered to understand) Perl and sign off on moving in to the 21st Century.
A corporation that deploys a technology and retains it, but doesn't either buy in expertise or train up in house, is asking for trouble.
(I'd guess you're working in a highly change-controlled environment - financial perhaps?)
I agree with brian on this by the way.

I'd say Moose kills off, by convention, 99.9% of the Perl-isms that shouldn't be used: symbol table hackery, reblessing objects, and common black-box violations such as treating objects as arrays or hashes. The great thing is that it does all of this without taking the functionality hit of "not using it".
If the "perl-isms" you're really referring to are the statement-modifier form (warn "bad idea" unless $good_idea), unless, and until, then I don't think you really have much of an argument, because these "perlisms" don't seem to inhibit readability for either Perl users or non-Perl users.

Pick up a copy of Effective Perl Programming: Ways to Write Better, More Idiomatic Perl (2nd Edition), and treat that as a guideline. It contains many of the better idioms and is packed with the little bits of information that will get you writing good Perl style Perl code, as opposed to C or Java (or whatever) style Perl code.

Should I use common::sense or just stick with `use strict` and `use warnings`?

I recently installed a module from CPAN and noticed one of its dependencies was common::sense, a module that offers to enable all the warnings you want, and none that you don't. From the module's synopsis:
use common::sense;
# supposed to be the same, with much lower memory usage, as:
#
# use strict qw(vars subs);
# use feature qw(say state switch);
# no warnings;
# use warnings qw(FATAL closed threads internal debugging pack substr malloc
# unopened portable prototype inplace io pipe unpack regexp
# deprecated exiting glob digit printf utf8 layer
# reserved parenthesis taint closure semicolon);
# no warnings qw(exec newline);
Save for undef warnings sometimes being a hassle, I've usually found the standard warnings to be good. Is it worth switching to common::sense instead of my normal use strict; use warnings;?

While I like the idea of reducing boiler-plate code, I am deeply suspicious of tools like Modern::Perl and common::sense.
The problem I have with modules like this is that they bundle up a group of behaviors and hide them behind glib names with changeable meanings.
For example, Modern::Perl today consists of enabling some perl 5.10 features and using strict and warnings. But what happens when Perl 5.12 or 5.14 or 5.24 come out with great new goodies, and the community discovers that we need to use the frobnitz pragma everywhere? Will Modern::Perl provide a consistent set of behaviors or will it remain "Modern"? If MP keeps with the times, it will break existing systems that don't keep lock-step with its compiler requirements. It adds extra compatibility testing to upgrades. At least that's my reaction to MP. I'll be the first to admit that chromatic is about 10 times smarter than me and a better programmer as well--but I still disagree with his judgment on this issue.
common::sense has a name problem, too. Whose idea of common sense is involved? Will it change over time?
My preference would be for a module that makes it easy for me to create my own set of standard modules, and even create groups of related modules/pragmas for specific tasks (like date time manipulation, database interaction, html parsing, etc).
I like the idea of Toolkit, but it sucks for several reasons: it uses source filters, and the macro system is overly complex and fragile. I have the utmost respect for Damian Conway, and he produces brilliant code, but sometimes he goes a bit too far (at least for production use, experimentation is good).
I haven't lost enough time typing use strict; use warnings; to feel the need to create my own standard import module. If I felt a strong need for automatically loading a set of modules/pragmas, something similar to Toolkit that allows one to create standard feature groups would be ideal:
use My::Tools qw( standard datetime SQLite );
or
use My::Tools;
use My::Tools::DateTime;
use My::Tools::SQLite;
Toolkit comes very close to my ideal. Its fatal defects are a bummer.
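For what it's worth, the core of such a module needs no source filters. A minimal sketch, assuming the hypothetical My::Tools name from above and squeezed into a single file for demonstration (normally the package would live in My/Tools.pm): the import method simply calls the pragmas' own import methods, which then take effect in the scope that is currently being compiled, i.e. the caller's.

```perl
# My::Tools is a hypothetical name; it would normally live in My/Tools.pm.
BEGIN {
    package My::Tools;
    use strict;
    use warnings;

    # Called at compile time by "use My::Tools"; the pragmas take
    # effect in the scope being compiled, which is the caller's.
    sub import {
        strict->import;
        warnings->import;
    }

    $INC{'My/Tools.pm'} = __FILE__;   # mark as loaded for this one-file demo
}

use My::Tools;

# String eval inherits the surrounding pragmas, so under strict
# an undeclared variable makes this fail to compile.
my $compiled = eval q{ $undeclared = 1; 1 };
print $compiled ? "strict is off\n" : "strict is on\n";
```

Groups such as datetime or SQLite would additionally have to load and import other modules into the caller, which is where the real design work (and the risk of surprises) starts.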
As for whether the choice of pragmas makes sense, that's a matter of taste. I'd rather use the occasional no strict 'foo' or no warnings 'bar' in a block where I need the ability to do something that requires it, than disable the checks over my entire file. Plus, IMO, memory consumption is a red herring. YMMV.
update
It seems that there are many (how many?) different modules of this type floating around CPAN.
There is latest, which is no longer the latest. Demonstrates part of the naming problem.
Also, uni::perl which adds enabling unicode part of the mix.
ToolSet offers a subset of Toolkit's abilities, but without source filters.
I'll include Moose here, since it automatically adds strict and warnings to the calling package.
And finally Acme::Very::Modern::Perl
The proliferation of these modules and the potential for overlapping requirements, adds another issue.
What happens if you write code like:
use Moose;
use common::sense;
What pragmas are enabled with what options?

I would say stick with warnings and strict for two main reasons.
If other people are going to use or work with your code, they are (almost certainly) used to warnings and strict and their rules. Those represent a community norm that you and other people you work with can count on.
Even if this or that specific piece of code is just for you, you probably don't want to worry about remembering "Is this the project where I adhere to warnings and strict or the one where I hew to common::sense?" Moving back and forth between the two modes will just confuse you.

There is one bit nobody else seems to have picked up on, and that's FATAL in the warnings list.
So as of 2.0, use common::sense is more akin to:
use strict;
use warnings FATAL => 'all'; # but with the specific list of fatals instead of 'all' that is
This is a somewhat important and frequently overlooked feature of warnings that ramps the strictness a whole degree higher. Instead of undef string interpolation, or infinite recursion just warning you and then keeping on going despite the problem, it actually halts.
To me this is helpful, because in many cases, undef string interpolation leads to further more dangerous errors, which may go silently unnoticed, and failing and bailing is a good thing.
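A minimal sketch of the difference, using only the core warnings pragma with a single category made fatal (this does not use common::sense itself, and the describe sub is made up for the illustration):

```perl
use strict;
use warnings;

# Made-up example sub: interpolating undef is fatal inside it.
sub describe {
    my ($name) = @_;
    use warnings FATAL => 'uninitialized';   # lexically scoped to this sub
    return "name: $name";
}

my $ok  = eval { describe('perl') };
my $bad = eval { describe(undef) };

print "first call returned: $ok\n";
print defined $bad ? "second call continued\n" : "second call died: $@";
```

With plain use warnings the second call would return "name: " and merely print a warning; with the category made fatal, the error surfaces at the point where the undef value is first used.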

I obviously have no common sense because I'm going more for Modern::Perl ;-)

The "lower memory usage" only works if you use no modules that load strict, feature, warnings, etc. and the "much" part is...not all that much.

Not everyone's idea of common sense is the same - in that respect it's anything but common.
Go with what you know. If you get undef warnings, chances are that your program or its input is incorrect.
Warnings are there for a reason. Anything that reduces them cannot be useful. (I always compile with gcc -Wall too...)

I have never had a warning that wasn't something dodgy/just plain wrong in my code. For me, it's always something technically allowed that I almost certainly don't want to do. I think the full suite of warnings is invaluable. If you find use strict + use warnings adequate for now, I don't see why you'd want to change to using a non-standard module which is then a dependency for every piece of code you write from here on out...

When it comes to warnings, I support the use of any module or built-in language feature that gives you the level of warnings that helps you make your code as solid and reliable as it can possibly be. An ignored warning is not helpful to anyone.
But if you're cozy with the standard warnings, stick with it. Coding to a stricter standard is great if you're used to it! I wouldn't recommend switching just for the memory savings. Only switch if the module helps you turn your code around quicker and with more confidence.

Many people argue in the comments that if MP changes, it will break your code. While this can be a real threat, there are already MANY things that change over time and break code (sometimes after a deprecation cycle, sometimes not...).
Some other modules have changed their API and so broken things, and nobody cares about them. E.g. Moose has at least two things that are deprecated now and will probably be forbidden in some future release.
Another example: years ago it was allowed to write
for $i qw(some words)
Now it is deprecated. And many others... And this is CORE language syntax.
Everybody survived. So I don't really understand why many people argue against helper modules. When they are going to change, there will (probably) be some sort of deprecation cycle... So, my view is:
if you write programs for yourself, use any module you want ;)
if you write a program for someone else, where others are going to maintain it, use as few nonstandard "pragma-like" modules (common::sense, Modern::Perl, uni::perl etc.) as possible
in Stack Overflow questions, you can safely use common::sense or Modern::Perl etc. - most of the users who will answer your questions know them. Everybody understands that it is easier to write use 5.010; to enable strict, warnings and features with 10 chars than with 3 lines...
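A caveat on that last point, to the best of my reading of perldoc -f use: use 5.010; on its own only enables the 5.10 feature bundle (say, state, switch). strict is implied from use 5.012; onward, and warnings only from use v5.36;. A minimal sketch that probes the strict part with string evals (the variable names are made up, and it needs to run on perl 5.12 or later):

```perl
# Deliberately no "use strict" at file level, so only "use VERSION" decides.

# Under 5.010 an undeclared package variable is still accepted.
my $with_5_010 = eval q{ use 5.010; $undeclared_a = 1; 1 };

# Under 5.012 strict vars is implied, so the same code fails to compile.
my $with_5_012 = eval q{ use 5.012; $undeclared_b = 1; 1 };

print "use 5.010: ", ($with_5_010 ? "compiles" : "rejected"), "\n";
print "use 5.012: ", ($with_5_012 ? "compiles" : "rejected"), "\n";
```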

What's the modern way of declaring which version of Perl to use?

When it comes to saying what version of Perl we need for our scripts, we've got options, oh, brother, we've got options:
use 5.010;
use 5.010_001;
use 5.10.0;
use v5.10;
use v5.10.0;
All seem to work. perlcritic complains about all but the first two. (It's unfortunate that the v strings seem to have such flaws, since Perl 6 expects you to do use v6; for your Perl 6 scripts...)
So, what should we be doing to indicate that we want to use a particular version of perl?

There are really only two options: decimal numbers and v-strings. Which form to use depends in part on which versions of Perl you want to "support" with a meaningful error message instead of a syntax error. (The v-string syntax was added in Perl 5.6.) The accepted best practice -- which is what perlcritic enforces -- is to use decimal notation. You should specify the minimum version of Perl that's required for your script to behave properly. Normally that means declaring a dependency on language features added in a major release, such as using the say function added in 5.10. You should include the patch level if it's important for your script to behave properly. For example, some of my code specifies use 5.008001 because it depends on the fix for a bug that 5.8.0 had which was fixed in 5.8.1.
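As a small illustration of the decimal notation: each of the minor and patch numbers takes three digits, so 5.8.1 becomes 5.008001 and 5.10.1 becomes 5.010001. The sketch below uses a hypothetical helper, decimal_version, purely to show the mapping; it is not part of any module.

```perl
use 5.008001;    # decimal form of 5.8.1
use strict;
use warnings;

# Hypothetical helper: turn major.minor.patch into the decimal form,
# using three digits each for the minor and patch numbers.
sub decimal_version {
    my ($major, $minor, $patch) = @_;
    return sprintf '%d.%03d%03d', $major, $minor, $patch || 0;
}

print decimal_version(5, 10, 1), "\n";   # 5.010001
print decimal_version(5, 8, 1),  "\n";   # 5.008001
print "running under $]\n";              # $] reports the running perl in the same style
```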

I just use something like 5.010_001. I've grown weary of dealing with version string problems for something that should be mind-numbingly simple.
Since I mostly deal with build systems, I have the constant struggle of Module::Build's internal version.pm which is out of sync with the version.pm on CPAN. I think that's mostly better now, but I have better things to think about.
The best practice should always be to do the thing that commands the least of your attention, and certainly not take more attention than the value it gives back. In my opinion, v-strings and dotted decimals were a huge distraction with no additional benefit, wasting a lot of valuable programmer time just to get back to the starting point.
I should also note that Perl::Critic has often pushed questionable practices for the higher purpose of reducing the number of ways that people do things. However, those practices often cause problems, making them un-best. This is one of those cases. A more realistic best practice is to not make Perl::Critic compliance your goal. Use it where it is useful, but in cases like this, don't waste mental time on it.

The "modern" way is to use the forms starting with v. However, that may not necessarily be what you really want to do.
Critic complains because older versions of Perl won't understand and play nicely with the forms that start with v. However, if your version of Perl supports it, v is nicer to read because you can say:
use v5.10.1;
... rather than ...
use 5.010_001;
So, in the documentation for use, the following workaround is offered:
use 5.006; use v5.6.1;
NB: I think the documentation is in error here, as the v is omitted from the example at perldoc use.
Since the versions of Perl that don't support the v syntax will fail at the first use, they won't get to the second more specific and readable one.
