What is the difference between new Some::Class and Some::Class->new() in Perl? - perl

Many years ago I remember a fellow programmer counselling this:
new Some::Class; # bad! (but why?)
Some::Class->new(); # good!
Sadly now I cannot remember the/his reason why. :( Both forms will work correctly even if the constructor does not actually exist in the Some::Class module but instead is inherited from a parent somewhere.
Neither of these forms is the same as Some::Class::new(), which will not pass the name of the class as the first parameter to the constructor -- so this form is always incorrect.
Even if the two forms are equivalent, I find Some::Class->new() to be much more clear, as it follows the standard convention for calling a method on a module, and in perl, the 'new' method is not special - a constructor could be called anything, and new() could do anything (although of course we generally expect it to be a constructor).

Using new Some::Class is called "indirect" method invocation, and it's bad because it introduces some ambiguity into the syntax.
One reason it can fail is if you have an array or hash of objects. You might expect
dosomethingwith $hashref->{obj}
to be equal to
$hashref->{obj}->dosomethingwith();
but it actually parses as:
$hashref->dosomethingwith->{obj}
which probably isn't what you wanted.
Another problem is if there happens to be a function in your package with the same name as a method you're trying to call. For example, what if some module that you use'd exported a function called dosomethingwith? In that case, dosomethingwith $object is ambiguous, and can result in puzzling bugs.
Using the -> syntax exclusively eliminates these problems, because the method and what you want the method to operate upon are always clear to the compiler.
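The name clash described above can be reproduced in a few lines. This is only a minimal sketch with hypothetical names (Counter, bump), and it assumes a perl where the indirect feature is still enabled, which is the default unless the code asks for use v5.36 or later.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class, for illustration only

sub new  { my ($class) = @_; return bless { n => 0 }, $class }
sub bump { my ($self)  = @_; return 'method ' . ++$self->{n} }

package main;

# A same-named function in the calling package, standing in for
# something a use'd module might have exported.
sub bump { return 'function' }

my $counter = Counter->new;

my $indirect = bump $counter;     # parsed as main::bump($counter)
my $direct   = $counter->bump;    # always the method

print "indirect: $indirect\n";    # indirect: function
print "direct:   $direct\n";      # direct:   method 1
```

The indirect form silently picks the function because a sub called bump is already known in the current package; the arrow form never consults it.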

See Indirect Object Syntax in the perlobj documentation for an explanation of its pitfalls. freido's answer covers one of them (although I tend to avoid that with explicit parens around my function calls).
Larry once joked that it was there to make C++ programmers feel happy about new, and although people will tell you not to ever use it, you're probably doing it all the time. Consider this:
print FH "Some message";
Have you ever wondered why there was no comma after the filehandle? And there's no comma after the class name in the indirect object notation? That's what's going on here. You could rewrite that as a method call on print:
FH->print( "Some message" );
You may have experienced some weirdness in print if you do it wrong. Putting a comma after the explicit file handle turns it into an argument:
print FH, "some message"; # GLOB(0xDEADBEEF)some message
Sadly, we have this goofiness in Perl. Not everything that got into the syntax was the best idea, but that's what happens when you pull from so many sources for inspiration. Some of the ideas have to be the bad ones.
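To see the two spellings of print side by side, here is a small sketch. It writes to an in-memory buffer instead of a real file (an assumption made only to keep the example self-contained), and uses a lexical $fh rather than the bareword FH.

```perl
use strict;
use warnings;
use IO::Handle;    # makes the method form available on older perls

# Write to an in-memory buffer so the sketch needs no real file.
my $buffer = '';
open my $fh, '>', \$buffer or die "Cannot open in-memory handle: $!";

print $fh "indirect\n";      # filehandle in the indirect object slot, no comma
print {$fh} "block\n";       # same thing, with the handle wrapped in a block
$fh->print("method\n");      # explicit method call on the handle

close $fh or die "Cannot close handle: $!";
print $buffer;               # prints the three lines in order
```

All three statements end up writing to the same handle; only the first two rely on the indirect object slot.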

The indirect object syntax is frowned upon, for good reasons, but that's got nothing to do with constructors. You're almost never going to have a new() function in the calling package. Rather, you should use Package->new() for two other (better?) reasons:
As you said, all other class methods take the form Package->method(), so consistency is a Good Thing
If you're supplying arguments to the constructor, or you're taking the result of the constructor and immediately calling methods on it (if e.g. you don't care about keeping the object around), it's simpler to say e.g.
$foo = Foo->new(type => 'bar', style => 'baz');
Bar->new->do_stuff;
than
$foo = new Foo(type => 'bar', style => 'baz');
(new Bar)->do_stuff;

Another problem is that new Some::Class happens at run time. If there is an error and your testing never branches to this statement, you never know about it until it happens in production. It is better to use Some::Class->new unless you are doing dynamic programming.

Related

Override constructor in Perl Class::Accessor

Premise
The question is difficult to understand, due to a misunderstanding of the OO logic in perl by me, the OP. The comments can be useful to understand it.
Original question
The Class::Accessor module is extremely convenient for me, but I can't find any documentation about how to write a constructor such that I can, for instance, derive the values for a field out of some computation.
The closest thing I can think of, with the given documentation, is passing through a "sort of" override:
package FooHack;
...
use Class::Accessor 'antlers';
has something => ( is => 'ro' );
# Methods here...
package Foo; # Foo is a plain module, not a class.
sub new {
my $macguffin = &crunch_chop_summon;
FooHack->new({something => $macguffin });
}
This kinda works, except my $f = Foo->new(); say ref $f will yield FooHack instead of Foo.
So my questions are:
Is my idea good enough or do you see some possible issues with it? Or maybe some improvements?
Is there a better way of doing the same thing?
Edit:
This is NOT an actual override. Foo is nowhere a class. It's just a plain module declaring a sub new. Plus, the module FooHack is not an external module. It is defined within the very same file.
The module Foo pretends to be a class in that it follows the convention of having a sub new, while new is actually just a function which calls the real constructor, FooHack->new and passes some initialization value for it.
TL;DR use Moose instead of use Class::Accessor will help a lot, and you shouldn't have to change your has definitions
As I wrote in my comment, you are asking Class::Accessor — a module that easily creates accessor methods — to provide the full quorum of object-oriented features
I also think your thoughts about object-oriented inheritance are confused. I don't see anything wrong in what you have written, but having Foo as a subclass of FooHack is wrong thinking, and confused me as well as probably many others
There should be a Foo base class and potentially multiple subclasses, like FooHack, FooPlus, FooOnHorseback etc.
Well, it's a little bit nasty to tamper with a module like that - if it's not doing what you want, then normally you'd just write a new one.
You can take a class and extend it to make a new class - that's what you'd normally do if the class in question doesn't do what you need. Applying a hack to override the constructor is subverting the expectations of the class maintainer, and the road to brittle code, which is why you can't do it easily.
That said - normally as part of a constructor you'll call bless to instantiate the object into a class. By convention, this is done into the current class, using the new method. But there's no real reason you can't:
my $self = {};
bless ( $self, 'Foo' );
Just bear in mind that if your constructor doesn't do things this object is expecting to have happened, then you might break things.
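If the goal is only a constructor that derives a field from some computation, a plain-Perl sketch could look like the following. It does not use Class::Accessor at all; _compute_something is a hypothetical stand-in for the question's crunch_chop_summon, and the accessor is written by hand.

```perl
use strict;
use warnings;

package Foo;

# Hypothetical stand-in for the question's crunch_chop_summon.
sub _compute_something { return 6 * 7 }

sub new {
    my ($class, %args) = @_;

    # Derive the field first, then let explicit arguments override it.
    my $self = { something => _compute_something(), %args };

    # Bless into the class the method was called on, so ref() reports Foo
    # (or a subclass) rather than some helper package.
    return bless $self, $class;
}

# Read-only accessor, written by hand.
sub something { return $_[0]{something} }

package main;

my $f = Foo->new;
print ref($f), "\n";        # Foo
print $f->something, "\n";  # 42
```

Because the constructor blesses into $class, ref $f yields Foo, which is the behaviour the question was missing.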

Basic Object Oriented subfunction definition and use in Perl

Sorry to bother the community for this but I have unfortunately to code in Perl :'(. It is about an OO perl code I want to understand but I am failing to put all the pieces together.
The following is a template of code that represents somehow what I am currently looking at. The following is the class MyClass:
package Namespace::MyClass;
sub new($)
{
my ($class) = @_;
$self = { };
bless ($self, $class);
}
sub init($$)
{
my ($self, $param1) = @_;
$self->{whatever} = ($param1, $param1, $param1);
}
and then the following is a script.pl that supposedly uses the class:
#!/path/to/your/perl
require Namespace::MyClass;
my myClass = new Namespace::MyClass()
myClass->init("data_for_param1");
There may be errors, but I am more interested in having the following questions answered than in having my possibly wrong code corrected:
Questions group 1 : "$" in a sub definition means I need to supply one parameter, right? If so, why does new ask for one and I do not supply it? Has this to do with the call in the script using () or something similar to how Python works (self is implied)?
Question group 2 : is it for the same reason that the init subroutine (here a method) declares that it expects two parameters? If so, does the blessing in some way imply that a self is always passed to all the functions in the module?
I ask this because I saw that in non blessed modules one $ = one parameter.
Thank you for your time.
QG1:
Prototypes (like "$") mean exactly nothing in Method calls.
Method calls are not influenced by prototypes either, because the function to be called is indeterminate at compile time, since the exact code called depends on inheritance.
Most experienced Perl folk avoid prototypes entirely unless they are trying to imitate a built-in function. Some PHBs inexperienced in Perl mandate their use under the mistaken idea that they work like prototypes in other languages.
The 1st parameter of a Method call is the Object (Blessed Ref) or Class Name (String) that called the Method. In the case of your new Method that would be 'Namespace::MyClass'.
Word to the wise: Also avoid indirect Method calls. Rewrite your line using the direct Method call as follows: my $myClass = Namespace::MyClass->new;
QG2:
Your init method is getting $myClass as its 1st parameter because it is what 'called' the method. The 2nd parameter is from the parameter list. Blessing binds the name of the Class to the Reference, so that when a method call is seen, Perl knows the class in which to start the search for the correct sub. If the correct sub is not immediately found, the search continues in the classes named in the class's @ISA array.
Don't use prototypes! They don't do what you think they do.
Prototypes in Perl are mainly used to allow functions to be defined without the use of parentheses or to allow for functions that take array references to use the array name like pop or push do. Otherwise, prototypes can cause more trouble and heartbreak than experienced by most soap opera characters.
Is what you actually want to do to validate parameters? If so, that is not the purpose of prototypes. You could try using signatures, but they were new and still experimental at the time of writing (the feature became stable in Perl 5.36). Some consider the long lack of a stable signatures feature to be a flaw of Perl. The alternatives are CPAN modules and writing code in your subs/methods that explicitly validates the params.
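The claim that prototypes are ignored on method calls is easy to check. The sketch below uses a hypothetical package Demo with a ($) prototype and calls the same sub both ways; the function call is compiled inside a string eval only so that the compile-time error can be captured.

```perl
use strict;
use warnings;

package Demo;    # hypothetical package name

# The ($) prototype claims "exactly one scalar argument".
sub count_args($) { return scalar @_ }

package main;

# Method call: the prototype is ignored and the class name is prepended.
my $as_method = Demo->count_args(1, 2, 3);

# Plain function call: the prototype is enforced at compile time, so the
# call is compiled inside a string eval to capture the error.
my $compiled = eval q{ Demo::count_args(1, 2, 3); 1 };
my $error    = $@;

print "method call saw $as_method arguments\n";    # method call saw 4 arguments
print "function call was rejected: $error" unless $compiled;
```

The method call receives four arguments (the class name plus three), while the function call is refused with a "Too many arguments" error before it ever runs.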

Anonymous subroutines/subroutine references stored in data structures

Why would I use Perl anonymous subroutines instead of a named one? inspired me to think about the merit of:
Storing anonymous subs in arrays, hashes and scalars.
It's a pretty cool concept, but is it practical in any way? Is there any reason why I'd have to use anonymous subs/sub references stored in some sort of data structure? Or perhaps a situation where it will be convenient?
I understand why anonymous subs are required in certain contexts such as dealing with shared variables (when an anonymous sub is declared inside another sub), but unless I'm missing something, I just don't see the point of using any sort of function reference. It seems like we should just call the functions outright and the code would look much nicer/more organized.
Please tell me I'm wrong. I'd love to have a good reason to use these things.
Thanks in advance.
A dispatch table is useful for dynamically determining steps to take based on some value:
my %disp = (
foo => sub { 'foo' },
bar => sub { 'bar' },
);
my $cmd = get_cmd_somehow();
if (defined $disp{$cmd}) {
$disp{$cmd}->(@args)
} else {
die "I don't know how to handle $cmd"
}
(Method dispatch via ->can($method) is conceptually similar, but more flexible and the details are hidden.)
Anonymous functions and lexical closures have many other uses; perhaps look deeper into "higher-order functions." (Think about map()/grep(), for example.)
Object-oriented methods are very much akin to anonymous subroutines. Polymorphism means that an object's methods can change without the calling code having to do lookups manually to see what routine to run. And that's VERY useful.
Also, think about perl's sort. Why set up a named routine just for a simple sort method? Ditto map and grep.
As well, iterators are very useful. Also, think about storing a routine that can be resolved later, rather than only being able to store a static value.
In the end, if you don't want to store anonymous routines (or even references to routines) that's your business. But having the option is way better than not having it.
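Two of the uses mentioned above, iterators and sort routines chosen at run time, can be sketched as follows. The names make_iterator and %sort_by are made up for this illustration.

```perl
use strict;
use warnings;

# Returns an iterator: a closure that remembers its own position.
sub make_iterator {
    my @items = @_;
    my $index = 0;
    return sub {
        return if $index >= @items;    # exhausted: returns undef in scalar context
        return $items[ $index++ ];
    };
}

# Comparators stored in a hash and picked by name at run time.
my %sort_by = (
    length => sub { length($a) <=> length($b) or $a cmp $b },
    alpha  => sub { $a cmp $b },
);

my @words  = qw(pear fig banana);
my $by_len = $sort_by{length};
my @sorted = sort $by_len @words;

my $next = make_iterator(@sorted);
my @seen;
while ( defined( my $word = $next->() ) ) {
    push @seen, $word;
}

print "@seen\n";    # fig pear banana
```

Neither the iterator state nor the choice of comparator could be expressed with a plain named sub call without extra globals or if/else chains.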

Why do '::' and '->' work (sort of) interchangeably when calling methods from Perl modules?

I keep getting :: confused with -> when calling subroutines from modules. I know that :: is more related to paths and where the module/subroutine is and -> is used for objects, but I don't really understand why I can seemingly interchange both and it not come up with immediate errors.
I have perl modules which are part of a larger package, e.g. FullProgram::Part1
I'm just about getting to grips with modules, but still am on wobbly grounds when it comes to Perl objects, but I've been accidentally doing this:
FullProgram::Part1::subroutine1();
instead of
FullProgram::Part1->subroutine1();
so when I've been passing a hash ref to subroutine1 and been careful about using $class/$self to deal with the object reference and accidentally use :: I end up pulling my hair out wondering why my hash ref seems to disappear. I have learnt my lesson, but would really like an explanation of the difference. I have read the perldocs and various websites on these but I haven't seen any comparisons between the two (quite hard to google...)
All help appreciated - always good to understand what I'm doing!
There's no inherent difference between a vanilla sub and one that's a method. It's all in how you call it.
Class::foo('a');
This will call Class::foo. If Class::foo doesn't exist, the inheritance tree will not be checked. Class::foo will be passed only the provided arguments ('a').
It's roughly the same as: my $sub = \&Class::foo; $sub->('a');
Class->foo('a');
This will call Class::foo, or foo in one of its base classes if Class::foo doesn't exist. The invocant (what's on the left of the ->) will be passed as an argument.
It's roughly the same as: my $sub = Class->can('foo'); $sub->('Class', 'a');
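A small runnable sketch of the two call styles, using hypothetical classes Base and Derived where only Base defines foo:

```perl
use strict;
use warnings;

package Base;       # hypothetical classes, for illustration only
sub foo { return join ',', @_ }

package Derived;
our @ISA = ('Base');    # Derived inherits foo from Base

package main;

my $method   = Derived->foo('a');                       # 'Derived,a'
my $via_can  = Derived->can('foo')->('Derived', 'a');   # the same call, spelled out
my $function = Base::foo('a');                          # 'a'

# Derived::foo does not exist, and a plain function call never consults @ISA.
my $found = eval { Derived::foo('a'); 1 };
my $error = $@;

print "method:   $method\n";      # method:   Derived,a
print "function: $function\n";    # function: a
print "Derived::foo('a') failed: $error" unless $found;
```

The method call finds foo through inheritance and passes the invocant; the fully qualified function call does neither.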
FullProgram::Part1::subroutine1();
calls the subroutine subroutine1 of the package FullProgram::Part1 with an empty parameter list while
FullProgram::Part1->subroutine1();
calls the same subroutine with the package name as the first argument (note that it gets a little bit more complex when you're subclassing). This syntax is used by constructor methods that need the class name for building objects of subclasses like
sub new {
my ($class, @args) = @_;
...
return bless $thing, $class;
}
FYI: in Perl OO you see $object->method(@args) which calls Class::method with the object (a blessed reference) as the first argument instead of the package/class name. In a method like this, the subroutine could work like this:
sub method {
my ($self, $foo, $bar) = @_;
$self->do_something_with($bar);
# ...
}
which will call the subroutine do_something_with with the object as first argument again followed by the value of $bar which was the second list element you originally passed to method in @args. That way the object itself doesn't get lost.
For more information about how the inheritance tree becomes important when calling methods, please see ikegami's answer!
Use both!
use Module::Two;
Module::Two::->class_method();
Note that this works but also protects you against an ambiguity there; the simple
Module::Two->class_method();
will be interpreted as:
Module::Two()->class_method();
(calling the subroutine Two in Module and trying to call class_method on its return value - likely resulting in a runtime error or calling a class or instance method in some completely different class) if there happens to be a sub Two in Module - something that you shouldn't depend on one way or the other, since it's not any of your code's business what is in Module.
Historically, Perl did not have any OO, and functions from packages were called with the FullProgram::Part1::subroutine1(); syntax, or even earlier with the FullProgram'Part1'subroutine1(); syntax (deprecated).
Later, OOP was implemented with the -> operator, but not much actually changed. FullProgram::Part1->subroutine1(); calls subroutine1, and FullProgram::Part1 is passed as the first parameter. You can see this in use when you create an object: my $cgi = CGI->new(). Now, when you call a method on this object, the left-hand side is also passed as the first parameter to the function: $cgi->param(''). That's how param gets the object it was called on (usually named $self). That's it: -> is a hack for OOP. As a result, Perl does not have classes (packages work as them) but does have objects (though "objects" are a hack too: they are blessed references).
Off-topic: you can also call it with the my $cgi = new CGI; syntax. This is the same as CGI->new. The same applies when you say print STDOUT "text\n";. Yes, that is just calling IO::Handle::print().

Simulating aspects of static-typing in a duck-typed language

In my current job I'm building a suite of Perl scripts that depend heavily on objects. (using Perl's bless() on a Hash to get as close to OO as possible)
Now, for lack of a better way of putting this, most programmers at my company aren't very smart. Worse, they don't like reading documentation and seem to have a problem understanding other people's code. Cowboy coding is the game here. Whenever they encounter a problem and try to fix it, they come up with a horrendous solution that actually solves nothing and usually makes it worse.
This results in me, frankly, not trusting them with code written in duck typed language. As an example, I see too many problems with them not getting an explicit error for misusing objects. For instance, if type A has member foo, and they do something like, instance->goo, they aren't going to see the problem immediately. It will return a null/undefined value, and they will probably waste an hour finding the cause. Then end up changing something else because they didn't properly identify the original problem.
So I'm brainstorming for a way to keep my scripting language (its rapid development is an advantage) but give an explicit error message when an object isn't used properly. I realize that since there isn't a compile stage or static typing, the error will have to be at run time. I'm fine with this, so long as the user gets a very explicit notice saying "this object doesn't have X"
As part of my solution, I don't want it to be required that they check if a method/variable exists before trying to use it.
Even though my work is in Perl, I think this can be language agnostic.
If you have any shot of adding modules to use, try Moose. It provides pretty much all the features you'd want in a modern programming environment, and more. It does type checking, excellent inheritance, has introspection capabilities, and with MooseX::Declare, one of the nicest interfaces for Perl classes out there. Take a look:
use MooseX::Declare;
class BankAccount {
has 'balance' => ( isa => 'Num', is => 'rw', default => 0 );
method deposit (Num $amount) {
$self->balance( $self->balance + $amount );
}
method withdraw (Num $amount) {
my $current_balance = $self->balance();
( $current_balance >= $amount )
|| confess "Account overdrawn";
$self->balance( $current_balance - $amount );
}
}
class CheckingAccount extends BankAccount {
has 'overdraft_account' => ( isa => 'BankAccount', is => 'rw' );
before withdraw (Num $amount) {
my $overdraft_amount = $amount - $self->balance();
if ( $self->overdraft_account && $overdraft_amount > 0 ) {
$self->overdraft_account->withdraw($overdraft_amount);
$self->deposit($overdraft_amount);
}
}
}
I think it's pretty cool, myself. :) It's a layer over Perl's object system, so it works with stuff you already have (basically.)
With Moose, you can create subtypes really easily, so you can make sure your input is valid. Lazy programmers agree: with so little that has to be done to make subtypes work in Moose, it's easier to do them than not! (from Cookbook 4)
subtype 'USState'
=> as Str
=> where {
( exists $STATES->{code2state}{ uc($_) }
|| exists $STATES->{state2code}{ uc($_) } );
};
And Tada, the USState is now a type you can use! No fuss, no muss, and just a small amount of code. It'll throw an error if it's not right, and all the consumers of your class have to do is pass a scalar with that string in it. If it's fine (which it should be...right? :) ) They use it like normal, and your class is protected from garbage. How nice is that!
Moose has tons of awesome stuff like this.
Trust me. Check it out. :)
In Perl,
make it required that use strict and use warnings are on in 100% of the code
You can try to make almost-private member variables by creating closures. A very good example is the "Private Member Variables, Sort of" section in http://www.usenix.org/publications/login/1998-10/perl.html . They are not 100% private, but it is fairly un-obvious how to access them unless you really know what you're doing (which requires reading your code and doing research to find out how).
If you don't want to use closures, the following approach works somewhat well:
Make all of your object member variables (aka object hash keys in Perl) wrapped in accessors. There are ways to do this efficiently from coding standards POV. One of the least safe is Class::Accessor::Fast. I'm sure Moose has better ways but I'm not that familiar with Moose.
Make sure to "hide" actual member variables in private-convention names, e.g. $object->{'__private__var1'} would be the member variable, and $object->var1() would be a getter/setter accessor.
NOTE: For the last, Class::Accessor::Fast is bad since its member variables share names with accessors. But you can have very easy builders that work just like Class::Accessor::Fast and create key values such as $obj->{'__private__foo'} for "foo".
This won't prevent them shooting themselves in the foot, but WILL make it a lot harder to do so.
In your case, if they use $obj->goo or $obj->goo(), they WOULD get a runtime error, at least in Perl.
They could of course go out of their way to do $obj->{'__private__goo'}, but if they do the gonzo cowboy crap due to sheer laziness, the latter is a lot more work than doing the correct $obj->foo().
You can also have a scan of code-base which detects $object->{"_ type strings, though from your description that might not work as a deterrent that much.
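The difference between reaching into the hash and going through an accessor can be seen in a small sketch. The class name Account and the field foo are hypothetical, and the accessor is hand-written here rather than generated by a module.

```perl
use strict;
use warnings;

package Account;    # hypothetical class

sub new {
    my ($class, %args) = @_;
    return bless { __private__foo => $args{foo} }, $class;
}

# Combined getter/setter for the "foo" member.
sub foo {
    my $self = shift;
    $self->{__private__foo} = shift if @_;
    return $self->{__private__foo};
}

package main;

my $obj = Account->new( foo => 42 );

my $silent = $obj->{goo};             # typo on a hash key: undef, no error
my $called = eval { $obj->goo; 1 };   # typo on a method name: dies at run time
my $error  = $@;

print "hash typo gave undef silently\n" unless defined $silent;
print "method typo died: $error" unless $called;
```

The mistyped method call fails loudly with a message that names both the method and the package, which is the explicit notice the question asks for.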
You can use Class::InsideOut or Object::InsideOut which give you true data privacy. Rather than storing data in a blessed hash reference, a blessed scalar reference is used as a key to lexical data hashes. Long story short, if your co-workers try $obj->{member} they'll get a run time error. There's nothing in $obj for them to grab at and no easy way to get at the data except through accessors.
Here is a discussion of the inside-out technique and various implementations.
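For readers who have not seen the technique, here is a hand-rolled sketch of an inside-out class using only core modules. It is not how Class::InsideOut or Object::InsideOut are implemented; the class Person and its name field are made up, and concerns such as threads and serialization are ignored.

```perl
use strict;
use warnings;

package Person;    # hypothetical class
use Scalar::Util qw(refaddr);

# The data lives in a lexical hash keyed by object address,
# not inside the object itself.
my %name_of;

sub new {
    my ($class, %args) = @_;
    my $self = bless \do { my $anon_scalar }, $class;
    $name_of{ refaddr $self } = $args{name};
    return $self;
}

sub name { my ($self) = @_; return $name_of{ refaddr $self } }

# Remove the entry when the object goes away, so the hash does not leak.
sub DESTROY { my ($self) = @_; delete $name_of{ refaddr $self } }

package main;

my $person = Person->new( name => 'Ada' );

my $peeked = eval { my $value = $person->{name}; 1 };   # not a hash: dies
my $error  = $@;

print $person->name, "\n";               # Ada
print "direct access died: $error" unless $peeked;
```

Since the object is a blessed scalar reference, treating it as a hash is a run-time error rather than a silent undef.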