How can I stop or allow a perl object to be used as a hashref? - perl

I have a Perl class which is a based on a blessed hashref ( https://github.com/kylemhall/Koha/blob/master/Koha/Object.pm )
This is a community based project, with many developers of varying skill.
What I've seen is some developers accidentally using our objects as hashrefs. The actual data is not stored in the blessed hashref, but in a dbic object that is stored in the hashref ( in $self->{_result} ). When the dev tries something like $object->{id} perl doesn't complain, it just returns undef, as would be expected.
What I'd like to do is to either
A) Make the script explode with an error when this happens
B) Allow the use of the hashref syntax for setting / getting values in the dbic objects stored in $self->{_result}
I tried using:
use overload '%{}' => \&get_hashref;
but when I do this, get_hashref is called any time a regular method is called! This makes sense in a way, since the object itself is a hashref. I'm sure this has something to do with the Perl internals for blessed hashrefs as objects.
Is what I'm trying to accomplish even possible?

I suggest using a scalar-based or array-based object instead of a hash-based object. This is a cheap (efficient) solution since it simply causes the offender to run afoul of existing type checks.
For example, the object produced by the following is simply a reference to the actual object. Just use $$self instead of $self in the methods.
$ perl -e'
sub new {
my $class = shift;
my $self = bless(\{}, $class);
# $$self->{...} = ...;
return $self;
}
my $o = __PACKAGE__->new();
my $id = $o->{id};
'
Not a HASH reference at -e line 9.

One way to do this, is by using "inside-out objects", where the object is just a blessed simple scalar, and the data is stored separately.
See, for instance, Class::STD

Related

Query in Perl Subroutines

I've to use perl as a part of our internship, I've come across this piece of code and could not understand what this may mean.
$val->ReadSim($first_sim, \&DataProcessing);
In the script, subroutine DataProcessing is defined, but could not find ReadSim. I tried searching in our infrastructure but was not able to. This was given to me to understand a week ago and I can't ask the guide without losing credits...
Please help...
What you're seeing is not a mere subroutine, but a method on some object called $val.
I take it you might see something on top of the program like this:
use Foo::Bar; # Some Perl module
This Perl module will contain the method ReadSim. Somewhere in your code, you probably see something like this:
my $val = Foo::Bar->new; # If the people who wrote this followed standards...
This defines $val as an object of Foo::Bar. If you look in package Foo::Bar, you'd see something like this:
#! Foo/Bar.pm
package Foo::Bar;
use strict; # Because I'm an optimist
use warnings;
...
sub new {
my $class = shift;
...
my $self = {};
...
bless $self, $class;
...
return $self; # May simply be bless {}, $class;
}
Then further down, you'll see:
sub ReadSim {
my $self = shift;
...
}
The $self = {} is a reference to a Perl hash. This is how most objects are defined. That's pretty much all the constructor does. It defines a reference to something, then blesses it as that object type. Then, the methods are merely subroutines that take the defined object and manipulate that.
$val-> ReadSim(...);
is equivalent to:
Foo::Bar::ReadSim( $val, ... );
So much for your introduction to Object Oriented Perl by Fire. You still have a question about what does ReadSim mean.
If all is right in the world, the developer of that module should have created built in Perl documentation called POD. First, determine the type of object $val is. Look where $val is defined (Something like my $val = Foo::Bar->new(...);). The Foo::Bar is the class that $val is a member of. You can do this from the command line:
$ perldoc Foo::Bar
And, if you're lucky, you'll see the documentation for Foo::Bar printed out. If you're really, really lucky, you will also see the what ReadSim also does.
And, if you're not so lucky, you'll have to do some digging. You can do this:
$ perldoc -l Foo::Bar
/usr/perl/lib/perl5/5.12/Foo/Bar.pm
This will print out the location of where the Perl Module resides on your system. For example, in this case, the module's code is in /usr/perl/lib/perl5/5.12/Foo/Bar.pm. Now, you can use an editor on this file to read it, and search for sub ReadSim and find out what that subroutine ... I mean method does.
One final thing. If you're new to Perl, you may want to look at a few tutorials that come with Perl. One is the Perl Reference Tutorial. This tutorial will tell you about references. In standard Perl, there are three different types of variables: scalar, hashes, and arrays. To create more complex data structures, you can create hashes of hashes or hashes of arrays, or arrays of arrays, etc. This tutorial will teach you about how to do this.
Once you understand references, you should read the tutorial on Perl Object Oriented Programming. Object Oriented Perl uses references to create a simulated world object oriented programming design. (I say simulated because some people will argue that Object Oriented Perl isn't really object oriented since you don't have things like private methods and variables. To me, if you can think in terms of objects and methods as you program, it's object oriented).

Why do '::' and '->' work (sort of) interchangeably when calling methods from Perl modules?

I keep getting :: confused with -> when calling subroutines from modules. I know that :: is more related to paths and where the module/subroutine is and -> is used for objects, but I don't really understand why I can seemingly interchange both and it not come up with immediate errors.
I have perl modules which are part of a larger package, e.g. FullProgram::Part1
I'm just about getting to grips with modules, but still am on wobbly grounds when it comes to Perl objects, but I've been accidentally doing this:
FullProgram::Part1::subroutine1();
instead of
FullProgram::Part1->subroutine1();
so when I've been passing a hash ref to subroutine1 and been careful about using $class/$self to deal with the object reference and accidentally use :: I end up pulling my hair out wondering why my hash ref seems to disappear. I have learnt my lesson, but would really like an explanation of the difference. I have read the perldocs and various websites on these but I haven't seen any comparisons between the two (quite hard to google...)
All help appreciated - always good to understand what I'm doing!
There's no inherent difference between a vanilla sub and one's that's a method. It's all in how you call it.
Class::foo('a');
This will call Class::foo. If Class::foo doesn't exist, the inheritance tree will not be checked. Class::foo will be passed only the provided arguments ('a').
It's roughly the same as: my $sub = \&Class::foo; $sub->('a');
Class->foo('a');
This will call Class::foo, or foo in one of its base classes if Class::foo doesn't exist. The invocant (what's on the left of the ->) will be passed as an argument.
It's roughly the same as: my $sub = Class->can('foo'); $sub->('Class', 'a');
FullProgram::Part1::subroutine1();
calls the subroutine subroutine1 of the package FullProgram::Part1 with an empty parameter list while
FullProgram::Part1->subroutine1();
calls the same subroutine with the package name as the first argument (note that it gets a little bit more complex when you're subclassing). This syntax is used by constructor methods that need the class name for building objects of subclasses like
sub new {
my ($class, #args) = #_;
...
return bless $thing, $class;
}
FYI: in Perl OO you see $object->method(#args) which calls Class::method with the object (a blessed reference) as the first argument instead of the package/class name. In a method like this, the subroutine could work like this:
sub method {
my ($self, $foo, $bar) = #_;
$self->do_something_with($bar);
# ...
}
which will call the subroutine do_something_with with the object as first argument again followed by the value of $bar which was the second list element you originally passed to method in #args. That way the object itself doesn't get lost.
For more informations about how the inheritance tree becomes important when calling methods, please see ikegami's answer!
Use both!
use Module::Two;
Module::Two::->class_method();
Note that this works but also protects you against an ambiguity there; the simple
Module::Two->class_method();
will be interpreted as:
Module::Two()->class_method();
(calling the subroutine Two in Module and trying to call class_method on its return value - likely resulting in a runtime error or calling a class or instance method in some completely different class) if there happens to be a sub Two in Module - something that you shouldn't depend on one way or the other, since it's not any of your code's business what is in Module.
Historically, Perl dont had any OO. And functions from packages called with FullProgram::Part1::subroutine1(); sytax. Or even before with FullProgram'Part1'subroutine1(); syntax(deprecated).
Later, they implemented OOP with -> sign, but dont changed too much actually. FullProgram::Part1->subroutine1(); calls subroutine1 and FullProgram::Part1 goes as 1st parameter. you can see usage of this when you create an object: my $cgi = CGI->new(). Now, when you call a method from this object, left part also goes as first parameter to function: $cgi->param(''). Thats how param gets object he called from (usually named $self). Thats it. -> is hack for OOP. So as a result Perl does not have classes(packages work as them) but does have objects("objects" hacks too - they are blessed scalars).
Offtop: Also you can call with my $cgi = new CGI; syntax. This is same as CGI->new. Same when you say print STDOUT "text\n";. Yeah, just just calling IOHandle::print().

Can one pass Perl hash references between processes?

I have an ActiveState PerlCtrl project. I'd like to know if it's possible to have a hash in the COM DLL, pass it's ref out to the calling process as a string (e.g. "HASH(0x2345)") and then pass that string back into the COM DLL and somehow bless it back into pointing to the relevant hash.
Getting the hashref seems easy enough, using return "" . \%Graph; and I have tried things like $Graph = shift; $Graph = bless {%$Graph}; but they don't seem to achieve what I'm after. The %Graph hash is at least global to the module.
The testing code (VBScript):
set o = CreateObject("Project.BOGLE.1")
x = o.new_graph()
wscript.echo x
x = o.add_vertex(x, "foo")
If these are different processes, you will need to either serialize the content of the hash or persistently store it in a disk file. To do the former, see Storable or Data::Dumper; for the latter, it depends whether it's a hash of simple scalars or something more complex.
If it is the same perl interpreter in the same process, you can keep some global variable like %main::hashes;
set $main::hashes{\%Graph} = \%Graph before passing the stringified reference back to the calling process, then later use it to look up the actual hash reference.
Don't do this, though: http://perlmonks.org/?node_id=379395.
No, you can't reliably pass hash references between processes.

Why shouldn't I use UNIVERSAL::isa?

According to this
http://perldoc.perl.org/UNIVERSAL.html
I shouldn't use UNIVERSAL::isa() and should instead use $obj->isa() or CLASS->isa().
This means that to find out if something is a reference in the first place and then is reference to this class I have to do
eval { $poss->isa("Class") }
and check $# and all that gumph, or else
use Scalar::Util 'blessed';
blessed $ref && $ref->isa($class);
My question is why? What's wrong with UNIVERSAL::isa called like that? It's much cleaner for things like:
my $self = shift if UNIVERSAL::isa($_[0], __PACKAGE__)
To see whether this function is being called on the object or not. And is there a nice clean alternative that doesn't get cumbersome with ampersands and potentially long lines?
The primary problem is that if you call UNIVERSAL::isa directly, you are bypassing any classes that have overloaded isa. If those classes rely on the overloaded behavior (which they probably do or else they would not have overridden it), then this is a problem. If you invoke isa directly on your blessed object, then the correct isa method will be called in either case (overloaded if it exists, UNIVERSAL:: if not).
The second problem is that UNIVERSAL::isa will only perform the test you want on a blessed reference just like every other use of isa. It has different behavior for non-blessed references and simple scalars. So your example that doesn't check whether $ref is blessed is not doing the right thing, you're ignoring an error condition and using UNIVERSAL's alternate behavior. In certain circumstances this can cause subtle errors (for example, if your variable contains the name of a class).
Consider:
use CGI;
my $a = CGI->new();
my $b = "CGI";
print UNIVERSAL::isa($a,"CGI"); # prints 1, $a is a CGI object.
print UNIVERSAL::isa($b,"CGI"); # Also prints 1!! Uh-oh!!
So, in summary, don't use UNIVERSAL::isa... Do the extra error check and invoke isa on your object directly.
See the docs for UNIVERSAL::isa and UNIVERSAL::can for why you shouldn't do it.
In a nutshell, there are important modules with a genuine need to override 'isa' (such as Test::MockObject), and if you call it as a function, you break this.
I have to say, my $self = shift if UNIVERSAL::isa($_[0], __PACKAGE__) doesn't look terribly clean to me - anti-Perl advocates would be complaining about line noise. :)
To directly answer your question, the answer is at the bottom of the page you linked to, namely that if a package defines an isa method, then calling UNIVERSAL::isa directly will not call the package isa method. This is very unintuitive behaviour from an object-orientation point of view.
The rest of this post is just more questions about why you're doing this in the first place.
In code like the above, in what cases would that specific isa test fail? i.e., if it's a method, in which case would the first argument not be the package class or an instance thereof?
I ask this because I wonder if there is a legitimate reason why you would want to test whether the first argument is an object in the first place. i.e., are you just trying to catch people saying FooBar::method instead of FooBar->method or $foobar->method? I guess Perl isn't designed for that sort of coddling, and if people mistakenly use FooBar::method they'll find out soon enough.
Your mileage may vary.
Everyone else has told you why you don't want to use UNIVERSAL::isa, because it breaks when things overload isa. If they've gone to all the habit of overloading that very special method, you certainly want to respect it. Sure, you could do this by writing:
if (eval { $foo->isa("thing") }) {
# Do thingish things
}
because eval guarantees to return false if it throws an exception, and the last value otherwise. But that looks awful, and you shouldn't need to write your code in funny ways because the language wants you to. What we really want is to write just:
if ( $foo->isa("thing") ) {
# Do thingish things
}
To do that, we'd have to make sure that $foo is always an object. But $foo could be a string, a number, a reference, an undefined value, or all sorts of weird stuff. What a shame Perl can't make everything a first class object.
Oh, wait, it can...
use autobox; # Everything is now a first class object.
use CGI; # Because I know you have it installed.
my $x = 5;
my $y = CGI->new;
print "\$x is a CGI object\n" if $x->isa('CGI'); # This isn't printed.
print "\$y is a CGI object\n" if $y->isa('CGI'); # This is!
You can grab autobox from the CPAN. You can also use it with lexical scope, so everything can be a first class object just for the files or blocks where you want to use ->isa() without all the extra headaches. It also does a lot more than what I've covered in this simple example.
Assuming your example of what you want to be able to do is within an object method, you're being unnecessarily paranoid. The first passed item will always be either a reference to an object of the appropriate class (or a subclass) or it will be the name of the class (or a subclass). It will never be a reference of any other type, unless the method has been deliberately called as a function. You can, therefore, safely just use ref to distinguish between the two cases.
if (ref $_[0]) {
my $self = shift;
# called on instance, so do instancey things
} else {
my $class = shift;
# called as a class/static method, so do classy things
}
Right. It does a wrong thing for classes that overload isa. Just use the following idiom:
if (eval { $obj->isa($class) }) {
It is easily understood and commonly accepted.
Update for 2020: Perl v5.32 has the class infix operator, isa, which handles any sort of thing on the lefthand side. If the $something is not an object, you get back false with no blowup.
use v5.32;
if( $something isa 'Animal' ) { ... }

How can I determine the type of a blessed reference in Perl?

In Perl, an object is just a reference to any of the basic Perl data types that has been blessed into a particular class. When you use the ref() function on an unblessed reference, you are told what data type the reference points to. However, when you call ref() on a blessed reference, you are returned the name of the package that reference has been blessed into.
I want to know the actual underlying type of the blessed reference. How can I determine this?
Scalar::Util::reftype() is the cleanest solution. The Scalar::Util module was added to the Perl core in version 5.7 but is available for older versions (5.004 or later) from CPAN.
You can also probe with UNIVERSAL::isa():
$x->isa('HASH') # if $x is known to be an object
UNIVERSAL::isa($x, 'HASH') # if $x might not be an object or reference
Obviously, you'd also have to check for ARRAY and SCALAR types. The UNIVERSAL module (which serves as the base class for all objects) has been part of the core since Perl 5.003.
Another way -- easy but a little dirty -- is to stringify the reference. Assuming that the class hasn't overloaded stringification you'll get back something resembling Class=HASH(0x1234ABCD), which you can parse to extract the underlying data type:
my $type = ($object =~ /=(.+)\(0x[0-9a-f]+\)$/i);
You probably shouldn't do this. The underlying type of an object is an implementation detail you shouldn't mess with. Why would you want to know this?
And my first thought on this was: "Objects in Perl are always hash refs, so what the hack?"
But, Scalar::Util::reftype is the answer. Thanks for putting the question here.
Here is a code snippet to prove this.. (in case it is of any use to anyone).
$> perl -e 'use strict; use warnings "all";
my $x = [1]; bless ($x, "ABC::Def");
use Data::Dumper; print Dumper $x;
print ref($x) . "\n";
use Scalar::Util "reftype"; print reftype($x) . "\n"'`
Output:
$VAR1 = bless( [
1
], 'ABC::Def' );
ABC::Def
ARRAY