Why would I use Perl anonymous subroutines instead of a named one? inspired me to think about the merit of:
Storing anonymous subs in arrays, hashes and scalars.
It's a pretty cool concept, but is it practical in any way? Is there any reason why I'd have to use anonymous subs/sub references stored in some sort of data structure? Or perhaps a situation where it will be convenient?
I understand why anonymous subs are required in certain contexts such as dealing with shared variables (when an anonymous sub is declared inside another sub), but unless I'm missing something, I just don't see the point of using any sort of function reference. It seems like we should just call the functions outright and the code would look much nicer/more organized.
Please tell me I'm wrong. I'd love to have a good reason to use these things.
Thanks in advance.
A dispatch table is useful for dynamically determining steps to take based on some value:
my %disp = (
foo => sub { 'foo' },
bar => sub { 'bar' },
);
my $cmd = get_cmd_somehow();
if (defined $disp{$cmd}) {
$disp{$cmd}->(@args)
} else {
die "I don't know how to handle $cmd"
}
(Method dispatch via ->can($method) is conceptually similar, but more flexible and the details are hidden.)
Anonymous functions and lexical closures have many other uses; perhaps look deeper into "higher-order functions." (Think about map()/grep(), for example.)
Object-oriented methods are very much akin to anonymous subroutines. Polymorphism means that an object's methods can change without the calling code having to do lookups manually to see what routine to run. And that's VERY useful.
Also, think about perl's sort. Why set up a named routine just for a simple sort method? Ditto map and grep.
As well, iterators are very useful. Also, think about storing a routine that can be resolved later, rather than only being able to store a static value.
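To illustrate the iterator point, here is a minimal sketch. The name make_counter and its arguments are made up for this example; the idea is only that the returned anonymous sub keeps its own private state between calls.

```perl
use strict;
use warnings;

# make_counter() is a hypothetical helper: it returns an iterator that
# yields $start, $start + $step, ... one value per call.
sub make_counter {
    my ($start, $step) = @_;
    my $next = $start;
    return sub {
        my $current = $next;
        $next += $step;
        return $current;
    };
}

my $evens       = make_counter(0, 2);
my @first_three = map { $evens->() } 1 .. 3;
print "@first_three\n";    # prints "0 2 4"
```

Each call to make_counter produces an independent iterator, which is hard to get from a single named sub without resorting to globals.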
In the end, if you don't want to store anonymous routines (or even references to routines) that's your business. But having the option is way better than not having it.
Perl is a bit too forgiving: If you pass extra arguments to subs they are simply ignored.
To avoid this I would like to use prototypes to make sure each sub is given the correct amount of arguments.
This works OK as long as I declare the prototype before using it:
sub mysub($);
sub mysub2($);
mysub(8);
mysub(8,2); # Complain here
sub mysub($) {
mysub2($_[0]);
}
sub mysub2($) {
if($_[0] == 1) {
mysub(2);
}
print $_[0];
}
But I really hate splitting this up. I would much rather that Perl read the full file to see if there are declarations further down. So I would like to write something like:
use prototypes_further_down; # This does not work
mysub(8);
mysub(8,2); # Complain here
sub mysub($) {
mysub2($_[0]);
}
sub mysub2($) {
if($_[0] == 1) {
mysub(2);
}
print $_[0];
}
Can I somehow ask Perl to do that?
To avoid this I would like to use prototypes to make sure each sub is given the correct amount of arguments.
No, you would not. Despite the similarity in name, Perl prototypes are not your father's function prototypes. Quoting The Problem with Prototypes (emphasis mine),
Perl 5's prototypes serve two purposes. First, they're hints to the parser to change the way it parses subroutines and their arguments. Second, they change the way Perl 5 handles arguments to those subroutines when it executes them. A common novice mistake is to assume that they serve the same language purpose as subroutine signatures in other languages. This is not true.
In addition to them not having the same intended purpose, bypassing prototypes is trivial, so they provide no actual protection against someone who deliberately wishes to call your code in (what you believe to be) the "wrong" way. As perldoc perlsub tells us,
The function declaration must be visible at compile time. The prototype affects only interpretation of new-style calls to the function, where new-style is defined as not using the & character. In other words, if you call it like a built-in function, then it behaves like a built-in function. If you call it like an old-fashioned subroutine, then it behaves like an old-fashioned subroutine. It naturally falls out from this rule that prototypes have no influence on subroutine references like \&foo or on indirect subroutine calls like &{$subref} or $subref->().
Method calls are not influenced by prototypes either, because the function to be called is indeterminate at compile time, since the exact code called depends on inheritance.
Even if you could get it to complain about mysub(8,2), &mysub(8,2) or $subref = \&mysub; $subref->(8,2) or (if mysub were an object method inside package MyModule) $o = MyModule->new; $o->mysub(8,2) would work without complaint.
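A small sketch of that bypass, assuming a one-argument mysub like the one in the question (returning the argument count is my own addition, only there to make the effect visible):

```perl
use strict;
use warnings;

# One-argument prototype, as in the question. Returning the argument
# count is purely illustrative.
sub mysub($) { return scalar @_ }

my $subref = \&mysub;

my $via_ampersand = &mysub(8, 2);       # prototype is not consulted
my $via_reference = $subref->(8, 2);    # prototype is not consulted

print "$via_ampersand $via_reference\n";    # prints "2 2"
```

Both calls compile and run without complaint, even though each passes two arguments to a sub prototyped for one.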
If you want to validate how your subs are called using core Perl (prior to 5.20), then you need to perform the validation yourself within the body of the sub. Perl 5.20 and newer have a Signatures extension to sub declarations ("experimental" at the time of this writing), which may work for your purposes, but I've never used it myself, so I can't speak to its effectiveness or limitations. There are also many CPAN modules available for handling this sort of thing, which you can find by searching for things like "signature" or "prototype".
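As a rough sketch of the "validate it yourself" approach: mysub comes from the question, while the exact check, the error message, and the doubled return value are assumptions made for illustration.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Check the argument count by hand at the top of the sub.
sub mysub {
    croak "mysub() expects exactly 1 argument, got " . scalar(@_)
        unless @_ == 1;
    my ($n) = @_;
    return $n * 2;    # placeholder body
}

print mysub(8), "\n";    # prints 16

# The bad call is caught at run time, not at compile time.
my $ok = eval { mysub(8, 2); 1 };
print "rejected: $@" unless $ok;
```

Unlike a prototype, this check also fires for &mysub(8, 2) and for calls through a code reference, because it runs inside the sub itself.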
Regardless of your chosen approach, you will not be able to get compile-time errors about incorrect function signatures unless you define those signatures before they are used. In cases such as your example, where two subs mutually call each other, this can be accomplished by using a forward declaration to establish its signature in advance:
sub mysub($); # Forward declaration
sub mysub2 { mysub(8) }
sub mysub($) { mysub2('infinite loops ftw!') } # Complete version of the code
Inspired a little by: https://stackoverflow.com/questions/30977789/why-is-c-not-a-functional-programming-language
I found: Higher Order Perl
It made me wonder about the assertion that Perl is a functional programming language. Now, I appreciate that functional programming is a technique (much like object oriented).
However I've found a list of what makes a functional programming language:
First Class functions
Higher Order Functions
Lexical Closures
Pattern Matching
Single Assignment
Lazy Evaluation
Garbage Collection
Type Inference
Tail Call Optimization
List Comprehensions
Monadic effects
Now some of these I'm quite familiar with:
Garbage collection, for example, is Perl reference counting and releasing memory when no longer required.
Lexical closures are even part of the FAQ: What is a closure? - there's probably a better article here: http://www.perl.com/pub/2002/05/29/closure.html
But I start to get a bit fuzzy on some of these - List Comprehensions, for example - I think that's referring to map/grep (List::Util and reduce?)
Is anyone able to help me fill in the blanks here? Which of the above can Perl do easily (and is there an easy example) and are there examples where it falls down?
Useful things that are relevant:
Perl monks rant about functional programming
Higher Order Perl
C2.com functional programming definitions
First Class functions
In computer science, a programming language is said to have first-class functions if it treats functions as first-class citizens. Specifically, this means the language supports passing functions as arguments to other functions, returning them as the values from other functions, and assigning them to variables or storing them in data structures.
So in Perl:
my $print_something = sub { print "Something\n" };
sub do_something {
my ($function) = @_;
$function->();
}
do_something($print_something);
Verdict: Natively supported
Higher Order Functions
In mathematics and computer science, a higher-order function (also functional form, functional or functor) is a function that does at least one of the following:
takes one or more functions as an input
outputs a function
With reference to this post on perlmonks:
In Perl terminology, we often refer to them as callbacks, factories, and functions that return code refs (usually closures).
Verdict: Natively supported
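A minimal sketch of a function that does both things at once, taking functions as input and returning a function. The name compose and the two small subs are made up for this example.

```perl
use strict;
use warnings;

# compose() is a hypothetical helper: it takes two code refs and
# returns a new one that applies $g first and then $f.
sub compose {
    my ($f, $g) = @_;
    return sub { $f->( $g->(@_) ) };
}

my $double    = sub { $_[0] * 2 };
my $increment = sub { $_[0] + 1 };

my $double_then_increment = compose($increment, $double);
my $result = $double_then_increment->(5);
print "$result\n";    # prints 11
```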
Lexical Closures
Within the perl FAQ we have questions regarding What is a closure?:
Closure is a computer science term with a precise but hard-to-explain meaning. Usually, closures are implemented in Perl as anonymous subroutines with lasting references to lexical variables outside their own scopes. These lexicals magically refer to the variables that were around when the subroutine was defined (deep binding).
Closures are most often used in programming languages where you can have the return value of a function be itself a function, as you can in Perl.
This is explained perhaps a little more clearly in the article: Achieving Closure
sub make_hello_printer {
my $message = "Hello, world!";
return sub { print $message; }
}
my $print_hello = make_hello_printer();
$print_hello->()
Verdict: Natively supported
Pattern Matching
In the context of pure functional languages and of this page, Pattern Matching is a dispatch mechanism: choosing which variant of a function is the correct one to call. Inspired by standard mathematical notations.
Dispatch tables are the closest approximation - essentially a hash of either anonymous subs or code refs.
use strict;
use warnings;
sub do_it {
print join( ":", @_ );
}
my $dispatch = {
'onething' => sub { print @_; },
'another_thing' => \&do_it,
};
$dispatch->{'onething'}->("fish");
Because it's just a hash, you can add code references and anonymous subroutines too. (Note - not entirely dissimilar to Object Oriented programming)
Verdict: Workaround
Single Assignment
Any assignment that changes an existing value (e.g. x := x + 1) is disallowed in purely functional languages. In functional programming, assignment is discouraged in favor of single assignment, also called initialization. Single assignment is an example of name binding and differs from assignment as described in this article in that it can only be done once, usually when the variable is created; no subsequent reassignment is allowed.
I'm not sure perl really does this. The closest approximation might be references/anonymous subs or perhaps constant.
Verdict: Not Supported
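For what it's worth, here is a sketch of the constant approximation mentioned above. LIMIT is a made-up name, and this only covers values fixed at compile time, so it is a long way from true single assignment.

```perl
use strict;
use warnings;
use constant LIMIT => 10;    # LIMIT is a hypothetical name

print "limit plus one: ", LIMIT + 1, "\n";    # prints "limit plus one: 11"

# Assigning to a constant is rejected when the code is compiled,
# so the string eval below fails and returns undef.
my $reassigned = eval 'LIMIT = 11; 1';
print "reassignment rejected\n" unless $reassigned;
```

Ordinary my variables remain freely reassignable, which is why the verdict above still seems fair.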
Lazy Evaluation
Waiting until the last possible moment to evaluate an expression, especially for the purpose of optimizing an algorithm that may not use the value of the expression.
Examples of lazy evaluation techniques in Perl 5?
And again, coming back to Higher Order Perl (I'm not affiliated with this book, honest - it just seems to be one of the key texts on the subject).
The core concept here seems to be - create a 'linked list' in perl (using object oriented techniques) but embed a code reference at your 'end marker' that evaluates if you ever get that far.
Verdict: Workaround
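A rough sketch of that idea, using plain array refs rather than objects to keep it short. The names integers_from and take are made up here, and this is not claimed to be the book's exact implementation.

```perl
use strict;
use warnings;

# A stream node is [ $head, $tail ], where $tail is a code ref (a
# "promise") that builds the next node only when it is called.
sub integers_from {
    my ($n) = @_;
    return [ $n, sub { integers_from($n + 1) } ];
}

# Collect the first $count values, forcing one tail per step.
sub take {
    my ($count, $stream) = @_;
    my @values;
    while ($count-- > 0 && $stream) {
        push @values, $stream->[0];
        $stream = $stream->[1]->();
    }
    return @values;
}

my @first_five = take(5, integers_from(1));
print "@first_five\n";    # prints "1 2 3 4 5"
```

The list is conceptually infinite, but only the nodes that take actually walks over are ever built.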
Garbage Collection
"GarbageCollection (GC), also known as automatic memory management, is the automatic recycling of heap memory."
Perl does this via reference counting, and releasing things when they are no longer referenced. Note that this can have implications for certain things that you're (probably!) more likely to encounter when functional programming.
Specifically - circular references which are covered in perldoc perlref
Verdict: Native support
Type Inference
TypeInference is the analysis of a program to infer the types of some or all expressions, usually at CompileTime
Perl does implicitly cast values back and forth as it needs to. Usually this works well enough that you don't need to mess with it. Occasionally you need to 'force' the process, by making an explicit numeric or string operation. Canonically, this is either by adding 0, or concatenating an empty string.
You can make a scalar hold different values in numeric and string contexts by using dualvars
Verdict: Native support
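A short sketch of the conversions described above. The variable names and the 404 example are my own; dualvar comes from the core module Scalar::Util.

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

my $text   = "42";
my $number = $text + 0;       # force numeric context by adding 0
my $string = $number . "";    # force string context by concatenating

print "sum: ", $number + 1, "\n";    # prints "sum: 43"

# A dualvar carries separate numeric and string values.
my $status = dualvar(404, "Not Found");
printf "%d %s\n", $status, $status;  # prints "404 Not Found"
```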
Tail Call Optimization
Tail-call optimization (or tail-call merging or tail-call elimination) is a generalization of TailRecursion: If the last thing a routine does before it returns is call another routine, rather than doing a jump-and-add-stack-frame immediately followed by a pop-stack-frame-and-return-to-caller, it should be safe to simply jump to the start of the second routine, letting it re-use the first routine's stack frame (environment).
Why is Perl so afraid of "deep recursion"?
It'll work, but it'll warn if your recursion depth is >100. You can disable this by adding:
no warnings 'recursion';
But obviously - you need to be slightly cautious about recursion depth and memory footprint.
As far as I can tell, there isn't any particular optimisation and if you want to do something like this in an efficient fashion, you may need to (effectively) unroll your recursives and iterate instead.
Tailcalls are supported by perl. Either see the goto &sub notation, or see the neater syntax for it provided by Sub::Call::Tail
Verdict: Native
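A minimal sketch of the goto &sub form. sum_to and its accumulator argument are made up for this example; the technique is to replace @_ and then jump, so the current call frame is reused instead of a new one being stacked on top.

```perl
use strict;
use warnings;

# sum_to() is a hypothetical accumulator-style function that adds up
# the integers from 1 to $n.
sub sum_to {
    my ($n, $acc) = @_;
    $acc //= 0;
    return $acc if $n == 0;
    @_ = ($n - 1, $acc + $n);    # arguments for the "next call"
    goto &sum_to;                # replaces the current call, no new frame
}

my $total = sum_to(100_000);
print "$total\n";    # prints 5000050000
```

This runs without the "Deep recursion" warning, although goto &sub is not especially fast, so a plain loop may still be the better choice in hot code.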
List Comprehensions
List comprehensions are a feature of many modern FunctionalProgrammingLanguages. Subject to certain rules, they provide a succinct notation for GeneratingElements? in a list.
A list comprehension is SyntacticSugar for a combination of applications of the functions concat, map and filter
Perl has map, grep, reduce.
It also copes with expansion of ranges and repetitions:
my @letters = ( "a" .. "z" );
So you can:
my %letters = map { $_ => 1 } ( "a" .. "z" );
Verdict: Native (List::Util is a core module)
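As a sketch of how map and grep combine into something comprehension-like, with reduce from List::Util folding the result. The variable names are illustrative only.

```perl
use strict;
use warnings;
use List::Util qw(reduce);

# Roughly "x * x for every x in 1..10 where x is even".
my @even_squares = map { $_ * $_ } grep { $_ % 2 == 0 } 1 .. 10;
print "@even_squares\n";    # prints "4 16 36 64 100"

# reduce folds the list into a single value, here a sum.
my $total = reduce { $a + $b } @even_squares;
print "$total\n";           # prints 220
```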
Monadic effects
... nope, still having trouble with these. It's either much simpler or much more complex than I can grok.
If anyone's got anything more, please chip in or edit this post or ... something. I'm still a bit sketchy on some of the concepts involved, so this post is more of a starting point.
Really nice topic. I wanted to write an article titled something like "the camel is functional". Let me contribute some code.
Perl also supports anonymous functions, like this:
sub check_config {
my ( $class, $obj ) = @_;
my $separator = ' > ';
# Build message from class namespace.
my $message = join $separator, ( split '::', $class );
# Use provided object $obj or
# create an instance of class with defaults, provided by configuration.
my $object = $obj || $class->new;
# Return a Function.
return sub {
my $attribute = shift;
# Compare attribute with configuration,
# just to ensure it is read from there.
is $object->config->{$attribute},
# Call attribute accessor so it is read from config,
# and validated by type checking.
$object->$attribute,
# Build message with attribute.
join $separator, ( $message, $attribute );
}
}
sub check_config_attributes {
my ( $class, $obj ) = @_;
return sub {
my $attributes = shift;
check_config( $class, $obj )->($_) for (@$attributes);
}
}
Sorry to bother the community with this, but unfortunately I have to code in Perl :'(. This is about some OO Perl code that I want to understand, but I am failing to put all the pieces together.
The following is a template of code that represents somehow what I am currently looking at. The following is the class MyClass:
package Namespace::MyClass;
sub new($)
{
my ($class) = @_;
my $self = { };
bless ($self, $class);
}
sub init($$)
{
my ($self, $param1) = @_;
$self->{whatever} = [$param1, $param1, $param1]; # an array ref; a plain list would store only its last element
}
and then the following is a script.pl that supposedly uses the class:
#!/path/to/your/perl
require Namespace::MyClass;
my $myClass = new Namespace::MyClass();
$myClass->init("data_for_param1");
There may be errors, but I am more interested in having the following questions answered than in having my possibly wrong code corrected:
Questions group 1 : "$" in a sub definition means I need to supply one parameter, right? If so, why does new ask for one and I do not supply it? Has this to do with the call in the script using () or something similar to how Python works (self is implied)?
Question group 2 : is it for the same reason that the init subroutine (here a method) declares that it expects two parameters? If so, does the blessing in some way imply that a self is always passed to all the functions in the module?
I ask this because I saw that in non blessed modules one $ = one parameter.
Thank you for your time.
QG1:
Prototypes (like "$") mean exactly nothing in Method calls.
Method calls are not influenced by prototypes either, because the function to be called is indeterminate at compile time, since the exact code called depends on inheritance.
Most experienced Perl folk avoid prototypes entirely unless they are trying to imitate a built-in function. Some PHBs inexperienced in Perl mandate their use under the mistaken idea that they work like prototypes in other languages.
The 1st parameter of a Method call is the Object (Blessed Ref) or Class Name (String) that called the Method. In the case of your new Method that would be 'Namespace::MyClass'.
Word to the wise: Also avoid indirect Method calls. Rewrite your line using the direct Method call as follows: my $myClass = Namespace::MyClass->new;
QG2:
Your init method is getting $myClass as its 1st parameter because it is what 'called' the method. The 2nd parameter is from the parameter list. Blessing binds the name of the Class to the Reference, so that when a method call is seen, Perl knows in which class to start the search for the correct sub. If the correct sub is not immediately found, the search continues in the classes named in the class's @ISA array.
Don't use prototypes! They don't do what you think they do.
Prototypes in Perl are mainly used to allow functions to be defined without the use of parentheses or to allow for functions that take array references to use the array name like pop or push do. Otherwise, prototypes can cause more trouble and heartbreak than experienced by most soap opera characters.
Is what you actually want to do to validate parameters? If so, then that is not the purpose of prototypes. You could try using signatures, but they are new and still experimental. Some consider the lack of a stable signatures feature to be a flaw of Perl. The alternatives are CPAN modules and writing code in your subs/methods that explicitly validates the params.
I keep getting :: confused with -> when calling subroutines from modules. I know that :: is more related to paths and where the module/subroutine is and -> is used for objects, but I don't really understand why I can seemingly interchange both and it not come up with immediate errors.
I have perl modules which are part of a larger package, e.g. FullProgram::Part1
I'm just about getting to grips with modules, but still am on wobbly grounds when it comes to Perl objects, but I've been accidentally doing this:
FullProgram::Part1::subroutine1();
instead of
FullProgram::Part1->subroutine1();
so when I've been passing a hash ref to subroutine1 and been careful about using $class/$self to deal with the object reference and accidentally use :: I end up pulling my hair out wondering why my hash ref seems to disappear. I have learnt my lesson, but would really like an explanation of the difference. I have read the perldocs and various websites on these but I haven't seen any comparisons between the two (quite hard to google...)
All help appreciated - always good to understand what I'm doing!
There's no inherent difference between a vanilla sub and one that's a method. It's all in how you call it.
Class::foo('a');
This will call Class::foo. If Class::foo doesn't exist, the inheritance tree will not be checked. Class::foo will be passed only the provided arguments ('a').
It's roughly the same as: my $sub = \&Class::foo; $sub->('a');
Class->foo('a');
This will call Class::foo, or foo in one of its base classes if Class::foo doesn't exist. The invocant (what's on the left of the ->) will be passed as an argument.
It's roughly the same as: my $sub = Class->can('foo'); $sub->('Class', 'a');
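A small sketch showing both behaviours side by side. The packages Animal and Dog and the sub speak are hypothetical, invented only to have an inherited sub to call.

```perl
use strict;
use warnings;

package Animal;
sub speak { return "speak(" . join(", ", @_) . ")" }

package Dog;
our @ISA = ('Animal');    # Dog inherits from Animal and defines no subs

package main;

# Method call: found via @ISA, and the invocant is passed first.
my $as_method = Dog->speak('a');
print "$as_method\n";      # prints "speak(Dog, a)"

# Function call: only the given arguments are passed.
my $as_function = Animal::speak('a');
print "$as_function\n";    # prints "speak(a)"

# Dog::speak does not exist, and a plain function call ignores @ISA.
my $ok = eval { Dog::speak('a'); 1 };
print "function call failed: $@" unless $ok;
```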
FullProgram::Part1::subroutine1();
calls the subroutine subroutine1 of the package FullProgram::Part1 with an empty parameter list while
FullProgram::Part1->subroutine1();
calls the same subroutine with the package name as the first argument (note that it gets a little bit more complex when you're subclassing). This syntax is used by constructor methods that need the class name for building objects of subclasses like
sub new {
my ($class, @args) = @_;
...
return bless $thing, $class;
}
FYI: in Perl OO you see $object->method(@args) which calls Class::method with the object (a blessed reference) as the first argument instead of the package/class name. In a method like this, the subroutine could work like this:
sub method {
my ($self, $foo, $bar) = @_;
$self->do_something_with($bar);
# ...
}
which will call the subroutine do_something_with with the object as first argument again followed by the value of $bar which was the second list element you originally passed to method in @args. That way the object itself doesn't get lost.
For more information about how the inheritance tree becomes important when calling methods, please see ikegami's answer!
Use both!
use Module::Two;
Module::Two::->class_method();
Note that this works but also protects you against an ambiguity there; the simple
Module::Two->class_method();
will be interpreted as:
Module::Two()->class_method();
(calling the subroutine Two in Module and trying to call class_method on its return value - likely resulting in a runtime error or calling a class or instance method in some completely different class) if there happens to be a sub Two in Module - something that you shouldn't depend on one way or the other, since it's not any of your code's business what is in Module.
Historically, Perl did not have any OO, and functions from packages were called with the FullProgram::Part1::subroutine1(); syntax, or even earlier with the FullProgram'Part1'subroutine1(); syntax (deprecated).
Later, OOP was implemented with the -> operator, but not much actually changed. FullProgram::Part1->subroutine1(); calls subroutine1, and FullProgram::Part1 is passed as the first parameter. You can see this in use when you create an object: my $cgi = CGI->new(). Now, when you call a method on this object, the left-hand side is also passed as the first parameter to the function: $cgi->param(''). That's how param gets the object it was called on (usually named $self). That's it. -> is a hack for OOP. As a result, Perl does not have classes (packages work as them) but does have objects ("objects" are a hack too: they are blessed references).
Off-topic: you can also call it with the my $cgi = new CGI; syntax. This is the same as CGI->new. It is the same when you say print STDOUT "text\n";. Yeah, that is just calling IO::Handle::print().
Many years ago I remember a fellow programmer counselling this:
new Some::Class; # bad! (but why?)
Some::Class->new(); # good!
Sadly now I cannot remember the/his reason why. :( Both forms will work correctly even if the constructor does not actually exist in the Some::Class module but instead is inherited from a parent somewhere.
Neither of these forms is the same as Some::Class::new(), which will not pass the name of the class as the first parameter to the constructor -- so this form is always incorrect.
Even if the two forms are equivalent, I find Some::Class->new() to be much more clear, as it follows the standard convention for calling a method on a module, and in perl, the 'new' method is not special - a constructor could be called anything, and new() could do anything (although of course we generally expect it to be a constructor).
Using new Some::Class is called "indirect" method invocation, and it's bad because it introduces some ambiguity into the syntax.
One reason it can fail is if you have an array or hash of objects. You might expect
dosomethingwith $hashref->{obj}
to be equal to
$hashref->{obj}->dosomethingwith();
but it actually parses as:
$hashref->dosomethingwith->{obj}
which probably isn't what you wanted.
Another problem is if there happens to be a function in your package with the same name as a method you're trying to call. For example, what if some module that you use'd exported a function called dosomethingwith? In that case, dosomethingwith $object is ambiguous, and can result in puzzling bugs.
Using the -> syntax exclusively eliminates these problems, because the method and what you want the method to operate upon are always clear to the compiler.
See Indirect Object Syntax in the perlobj documentation for an explanation of its pitfalls. freido's answer covers one of them (although I tend to avoid that with explicit parens around my function calls).
Larry once joked that it was there to make the C++ programmers feel happy about new, and although people will tell you not to ever use it, you're probably doing it all the time. Consider this:
print FH "Some message";
Have you ever wondered why there is no comma after the filehandle? And why there's no comma after the class name in the indirect object notation? That's what's going on here. You could rewrite that as a method call on print:
FH->print( "Some message" );
You may have experienced some weirdness in print if you do it wrong. Putting a comma after the explicit file handle turns it into an argument:
print FH, "some message"; # GLOB(0xDEADBEEF)some message
Sadly, we have this goofiness in Perl. Not everything that got into the syntax was the best idea, but that's what happens when you pull from so many sources for inspiration. Some of the ideas have to be the bad ones.
The indirect object syntax is frowned upon, for good reasons, but that's got nothing to do with constructors. You're almost never going to have a new() function in the calling package. Rather, you should use Package->new() for two other (better?) reasons:
As you said, all other class methods take the form Package->method(), so consistency is a Good Thing
If you're supplying arguments to the constructor, or you're taking the result of the constructor and immediately calling methods on it (if e.g. you don't care about keeping the object around), it's simpler to say e.g.
$foo = Foo->new(type => 'bar', style => 'baz');
Bar->new->do_stuff;
than
$foo = new Foo(type => 'bar', style => 'baz');
(new Bar)->do_stuff;
Another problem is that new Some::Class is resolved at run time. If there is an error and your testing never branches to this statement, you will not know about it until it happens in production. It is better to use Some::Class->new unless you are doing dynamic programming.