function overloading by return type - perl

Perl is one of such language which supports the function overloading by return type.
Simple example of this is wantarray().
Few nice modules are available in CPAN which extends this wantarray() and provide overloading for many other return types. These modules are Contextual::Return and Want. Unfortunately, I can not use these modules as both of these fails the perl critic with perl version 5.8.9 (I can not upgrade this perl version).
So, I am thinking to write my own module like Contextual::Return and Want but with very minimal. I tried to understand Contextual::Return and Want modules code, but I am not an expert.
I need function overloading for return types BOOL, OBJREF, LIST, SCALAR only.
Please help me by providing some guidelines, how can I start for it.

Modules that play with Perl's syntax in the way that Contextual::Return and Want do are pretty much bound to fall foul of Perl::Critic. In this case the main transgressions are occasionally disabling strict and using subroutine prototypes, which are minimal.
I personally believe it is a foolish rule that insists that all code must pass an arbitrary set of tests with no exceptions, but I also think that any code that behaves very differently depending on the context in which it is called is likely to be badly designed and difficult to understand and maintain. It is rare to see even wantarray used, as Perl generally does the right thing without you having to explain.
I think you may have come across a module that looks interesting to use, and want to incorporate it into your code somehow. Can you change my mind by showing an example of a subroutine that would require the comprehensive context checking that you describe?

Related

How can I tell if a Perl module is actually used in my program?

I have been on a "cleaning spree" lately at work, doing a lot of touch-up stuff that should have been done awhile ago. One thing I have been doing is deleted modules that were imported into files and never used, or they were used at one point but not anymore. To do this I have just been deleting an import and running the program's test file. Which gets really, really tedious.
Is there any programmatic way of doing this? Short of me writing a program myself to do it.
Short answer, you can't.
Longer possibly more useful answer, you won't find a general purpose tool that will tell you with 100% certainty whether the module you're purging will actually be used. But you may be able to build a special purpose tool to help you with the manual search that you're currently doing on your codebase. Maybe try a wrapper around your test suite that removes the use statements for you and ignores any error messages except messages that say Undefined subroutine &__PACKAGE__::foo and other messages that occur when accessing missing features of any module. The wrapper could then automatically perform a dumb source scan on the codebase of the module being purged to see if the missing subroutine foo (or other feature) might be defined in the unwanted module.
You can supplement this with Devel::Cover to determine which parts of your code don't have tests so you can manually inspect those areas and maybe get insight into whether they are using code from the module you're trying to purge.
Due to the halting problem you can't statically determine whether any program, of sufficient complexity, will exit or not. This applies to your problem because the "last" instruction of your program might be the one that uses the module you're purging. And since it is impossible to determine what the last instruction is, or if it will ever be executed, it is impossible to statically determine if that module will be used. Further, in a dynamic language, which can extend the program during it's run, analysis of the source or even the post-compile symbol tables would only tell you what was calling the unwanted module just before run-time (whatever that means).
Because of this you won't find a general purpose tool that works for all programs. However, if you are positive that your code doesn't use certain run-time features of Perl you might be able to write a tool suited to your program that can determine if code from the module you're purging will actually be executed.
You might create alternative versions of the modules in question, which have only an AUTOLOAD method (and import, see comment) in it. Make this AUTOLOAD method croak on use. Put this module first into the include path.
You might refine this method by making AUTOLOAD only log the usage and then load the real module and forward the original function call. You could also have a subroutine first in #INC which creates the fake module on the fly if necessary.
Of course you need a good test coverage to detect even rare uses.
This concept is definitely not perfect, but it might work with lots of modules and simplify the testing.

Moose OOP or Standard Perl?

I'm going into writing some crawlers for a web-site, the idea is that the site will use some back-end Perl scripts to fetch data from other sites, my design (in a very abstract way..) will be to write a package, lets say:
package MyApp::Crawler::SiteName
where site name will be a module / package for crawling specific sites, I will obviously will have other packages that will be shared across different modules, but that not relevant here.
anyway, making long short, my question is: Why (or why not...) should I prefer Moose over Standard OO Perl?
While I disagree with Flimzy's introduction ("I've not used Moose, but I have used this thing that uses Moose"), I agree with his premise.
Use what you feel you can produce the best results with. If the (or a) goal is to learn how to effectively use Moose then use Moose. If the goal is to produce good code and learning Moose would be a distraction to that, then don't use Moose.
Your question however is open-ended (as others have pointed out). There is no answer that will be universally accepted as true, otherwise Moose's adoption rate would be much higher and I wouldn't be answering this. I can really only explain why I choose to use Moose every time I start a new project.
As Sid quotes from the Moose documentation. Moose's core goal is to be a cleaner, standardized way of doing what Object Oriented Perl programmers have been doing since Perl 5.0 was released. It provides shortcuts to make doing the right thing simpler than doing the wrong thing. Something that is, in my opinion, lacking in standard Perl. It provides new tools to make abstracting your problem into smaller more easily solved problems simpler, and it provides a robust introspection and meta-programming API that tries to normalize the beastiary that is hacking Perl internals from Perl space (ie what I used to refer to as Symbol Table Witchery).
I've found that my natural sense of how much code is "too much" has been reduced by 66% since I started using Moose[^1]. I've found that I more easily follow good design principles such as encapsulation and information hiding, since Moose provides tools to make it easier than not. Because Moose automatically generates much of the boiler-plate that I normally would have to write out (such as accessor methods, delegate methods, and other such things) I find that it's easier to quickly get up to speed on what I was doing six months ago. I also find myself writing far less tricky code just to save myself a few keystrokes than I would have a few years ago.
It is possible to write clean, robust, elegant Object Oriented Perl that doesn't use Moose[^2]. In my experience it requires more effort and self control. I've found that in those occasions where the project demands I can't use Moose, my regular Object Oriented code has benefitted from the habits I have picked up from Moose. I think about it the same way I would think about writing it with Moose and then type three times as much code as I write down what I expect Moose would generate for me[^3].
So I use Moose because I find it makes me better programmer, and because of it I write better programs. If you don't find this to be true for you too, then Moose isn't the right answer.
[^1]: I used to start to think about my design when I reached ~300 lines of code in a module. Now I start getting uneasy at ~100 lines.
[^2]: Miyagawa's code in Twiggy is an excellent example that I was reading just today.
[^3]: This isn't universally true. There are several stories going around about people who write less maintainable, horrific code by going overboard with the tools that Moose provides. Bad programmers can write bad code anywhere.
You find the answer why to use Moose in the Documentation of it.
The main goal of Moose is to make Perl 5 Object Oriented programming easier, more consistent, and less tedious. With Moose you can think more about what you want to do and less about the mechanics of OOP.
From my experience and probably others will tell you the same. Moose reduces your code size extremly, it has a lot of features and just standard features like validation, forcing values on creation of a object, lazy validation, default values etc. are just so easy and readable that you will never want to miss Moose.
Use Moose. This is from something I wrote last night (using Mouse in this case). It should be pretty obvious what it does, what it validates, and what it sets up. Now imagine writing the equivalent raw OO. It's not super hard but it does start to get much harder to read and not just the code proper but the intent which can be the most important part when reading code you haven't seen before or in awhile.
has "io" =>
is => "ro",
isa => "FileHandle",
required => 1,
handles => [qw( sysread )],
trigger => sub { binmode +shift->{io}, ":bytes" },
;
I wrote a big test class last year that also used the handles functionality to redispatch a ton of methods to the underlying Selenium/WWWMech object. Disappearing this sort of boilerplate can really help readability and maintenance.
I've never used Moose, but I've used Catalyst, and have extensive experience with OO Perl and non-OO Perl. And my experience tells me that the best method to use is the method you're most comfortable using.
For me, that method has become "anything except Catalyst" :) (That's not to say that people who love and swear by Catalyst are wrong--it's just my taste).
If you already have the core of a crawler app that you can build on, use whatever it's written in. If you're starting from scratch, use whatever you have the most experience in--unless this is your chance to branch out and try something new, then by all means, accomplish your task while learning something new at the same time.
I think this is just another instance of "which language is best?" which can never be answered, except by the individual.
When I learned about objects in Perl, first thing I thought was why it's so complicated when Perl is usually trying to keep things simple.
With Moose I see that uncomplicated OOP is possible in Perl. It sort of bring OOP of Perl back to manageable level.
(yes, I admit, I don't like perl's object design.)
Although this was asked 10 years ago, much has changed in the Perl world. I think most people now see that Moose didn't deliver all people thought it might. There are several derivative projects, lots of MooseX shims, and so on. That much churn is the code smell of something that's close to useful but not quite there. Even Stevan Little, the original author, was working on it's replacement, Moxie, but it never really went anywhere. I don't think anyone was ever completely satisfied with Moose.
Currently, the future is probably not Moose, irrespective of your estimation of that framework. I have a chapter for Moose in Intermediate Perl, but I'd remove it in the next edition. That's not a decision from quality, just on-the-ground truth from what's happening in that space.
Ovid is working on Corrina, a similar but also different framework that tries to solve some of the same problems. Ovid outlines various things he wants to improve, so you can use that as a basis for your own decision. No matter what you think about either Moose or Corrina, I think the community is now transitioning out of its Moose phase.
You may want to listen to How Moose made me a bad OO programmer, a short talk from Tadeusz Sośnierz at PerlCon 2019.
Even if your skeptical, I'd say try Moose in a small project before you commit to reconfiguring a large codebase. Then, try it in a medium project. I tend to think that frameworks like Moose look appealing in the sorts of examples that fit on a page, but don't show their weaknesses until the problems get much more complex. Don't go all in until you know a little more.
You should at least know the basics of Moose, though, because you are likely to run into other code that uses it.
The Perl 5 Porters may even include this one in core perl, but a full delivery may happen in the five to ten year range, and even then you'd have to insist on a supported Perl. Most people I encounter are about three to five years behind on Perl versions. If you control your perl version, that's not a problem. Not everyone does though. Holding out for Corrina may not be a good short-term plan.

How can I represent sets in Perl?

I would like to represent a set in Perl. What I usually do is using a hash with some dummy value, e.g.:
my %hash=();
$hash{"element1"}=1;
$hash{"element5"}=1;
Then use if (defined $hash{$element_name}) to decide whether an element is in the set.
Is this a common practice? Any suggestions on improving this?
Also, should I use defined or exists?
Thank you
Yes, building hash sets that way is a common idiom. Note that:
my #keys = qw/a b c d/;
my %hash;
#hash{#keys} = ();
is preferable to using 1 as the value because undef takes up significantly less space. This also forces you to uses exists (which is the right choice anyway).
Use one of the many Set modules on CPAN. Judging from your example, Set::Light or Set::Scalar seem appropriate.
I can defend this advice with the usual arguments pro CPAN (disregarding possible synergy effects).
How can we know that look-up is all that is needed, both now and in the future? Experience teaches that even the simplest programs expand and sprawl. Using a module would anticipate that.
An API is much nicer for maintenance, or people who need to read and understand the code in general, than an ad-hoc implementation as it allows to think about partial problems at different levels of abstraction.
Related to that, if it turns out that the overhead is undesirable, it is easy to go from a module to a simple by removing indirections or paring data structures and source code. But on the other hand, if one would need more features, it is moderately more difficult to achieve the other way around.
CPAN modules are already tested and to some extent thoroughly debugged, perhaps also the API underwent improvement steps over the time, whereas with ad-hoc, programmers usually implement the first design that comes to mind.
Rarely it turns out that picking a module at the beginning is the wrong choice.
That's how I've always done it. I would tend to use exists rather than defined but they should both work in this context.

Why would you want to export symbols in Perl?

It seems strange to me that Perl would allow a package to export symbols into another package's namespace. The exporting package doesn't know if the using package already defined a symbol by the same name, and it certainly can't guarantee that it's the only package exporting a symbol by that name.
A very common problem caused by this is using CGI and LWP::Simple at the same time. Both packages export head() and cause an error. I know, it's easy enough to work around, but that's not the point. You shouldn't have to employ work arounds to use two practically-core Perl libraries.
As far as I can see, the only reason to do this is laziness. You save some key strokes by not typing Foo:: or using an object interface, but is it really worth it?
The practice of exporting all the functions from a module by default is not the recommended one for Perl. You should only export functions if you have a good reason. The recommended practice is to use EXPORT_OK so that the user has to type the name of the required function, like:
use My::Module 'my_function';
Modules from way back when, like LWP::Simple and CGI, were written before this recommendation came in, and it's hard now to alter them to not export things since it would break existing software. I guess the recommendation came about through people noticing problems like that.
Anyway Perl's object-oriented objects or whatever it's called doesn't require you to export anything, and you don't not have to say $foo->, so that part of your question is wrong.
Exporting is a feature. Like every other feature in any language, it can cause problems if you (ab)use it too frequently, or where you shouldn't. It's good when used wisely and bad otherwise, just like any other feature.
Back in the day when there weren't many modules around, it didn't seem like such a bad thing to export things by default. With 15,000 packages on CPAN, however, there are bound to be conflicts and that's unfortunate. However, fixing the modules now might break existing code. Whenever you make a poor interface choice and release it to the public, you're committed to it even if you don't like it.
So, it sucks, but that's the way it is, and there are ways around it.
The exporting package doesn't know if the using package already defined a symbol by the same name, and it certainly can't guarantee that it's the only package exporting a symbol by that name.
If you wanted to, I imagine your import() routine could check, but the default Exporter.pm routine doesn't check (and probably shouldn't, because it's going to get used a lot, and always checking if a name is defined will cause a major slowdown (and if you found a conflict, what is Exporter expected to do?)).
As far as I can see, the only reason to do this is laziness. You save some key strokes by not typing Foo:: or using an object interface, but is it really worth it?
Foo:: isn't so bad, but consider My::Company::Private::Special::Namespace:: - your code will look much cleaner if we just export a few things. Not everything can (or should) be in a top-level namespace.
The exporting mechanism should be used when it makes code cleaner. When namespaces collide, it shouldn't be used, because it obviously isn't making code cleaner, but otherwise I'm a fan of exporting things when requested.
It's not just laziness, and it's not just old modules. Take Moose, "the post-modern object system", and Rose::DB::Object, the object interface to a popular ORM. Both import the meta method into the useing package's symbol table in order to provide features in that module.
It's not really any different than the problem of multiply inheriting from modules that each provide a method of the same name, except that the order of your parentage would decide which version of that method would get called (or you could define your own overridden version that somehow manually folded the features of both parents together).
Personally I'd love to be able to combine Rose::DB::Object with Moose, but it's not that big a deal to work around: one can make a Moose-derived class that “has a” Rose::DB::Object-derived object within it, rather than one that “is a” (i.e., inherits from) Rose::DB::Object.
One of the beautiful things about Perl's "open" packages is that if you aren't crazy about the way a module author designed something, you can change it.
package LWPS;
require LWP::Simple;
for my $sub (#LWP::Simple::EXPORT, #LWP::Simple::EXPORT_OK) {
no strict 'refs';
*$sub = sub {shift; goto &{'LWP::Simple::' . $sub}};
}
package main;
my $page = LWPS->get('http://...');
of course in this case, LWP::Simple::get() would probably be better.

How do you do Design by Contract in Perl?

I'm investigating using DbC in our Perl projects, and I'm trying to find the best way to verify contracts in the source (e.g. checking pre/post conditions, invariants, etc.)
Class::Contract was written by Damian Conway and is now maintained by C. Garret Goebel, but it looks like it hasn't been touched in over 8 years.
It looks like what I want to use is Moose, as it seems as though it might offer functionality that could be used for DbC, but I was wondering if anyone had any resources (articles, etc.) on how to go about this, or if there are any helpful modules out there that I haven't been able to find.
Is anyone doing DbC with Perl? Should I just "jump in" to Moose and see what I can get it to do for me?
Moose gives you a lot of the tools (if not all the sugar) to do DbC. Specifically, you can use the before, after and around method hooks (here's some examples) to perform whatever assertions you might want to make on arguments and return values.
As an alternative to "roll your own DbC" you could use a module like MooseX::Method::Signatures or MooseX::Method to take care of validating parameters passed to a subroutine. These modules don't handle the "post" or "invariant" validations that DbC typically provides, however.
EDIT: Motivated by this question, I've hacked together MooseX::Contract and uploaded it to the CPAN. I'd be curious to get feedback on the API as I've never really used DbC first-hand.
Moose is an excellent oo system for perl, and I heartily recommend it for anyone coding objects in perl. You can specify "subtypes" for your class members that will be enforced when set by accessors or constructors (the same system can be used with the Moose::Methods package for functions). If you are coding more than one liners, use Moose;
As for doing DbC, well, might not be the best fit for perl5. It's going to be hard in a language that offers you very few guarantees. Personally, in a lot of dynamic languages, but especially perl, I tend to make my guiding philosophy DRY, and test-driven development.
I would also recommend using Moose.
However as an "alternative" take a look at Sub::Contract.
To quote the author....
Sub::Contract offers a pragmatic way to implement parts of the programming by contract paradigm in Perl.
Sub::Contract is not a design-by-contract framework.
Sub::Contract aims at making it very easy to constrain subroutines input arguments and return values in order to emulate strong typing at runtime.
If you don't need class invariants, I've found the following Perl Hacks book recommendation to be a good solution for some programs. See Smart::Comments.