Basic OOP concept in Perl - perl

I'm learning the OOP concept in Perl where the syntax is different from Java, which I used to learn the OOP concept.
I have an example to declare a class Person, but I'm a bit confused.
The code is as follows
package Person;
sub new {
my $class = shift;
my $self = {
_firstName => shift,
_lastName => shift,
_ssn => shift,
};
# Print all the values just for clarification.
print "First Name is $self->{_firstName}\n";
print "Last Name is $self->{_lastName}\n";
print "SSN is $self->{_ssn}\n";
bless $self, $class;
return $self;
}
From the example above, is my $self a scalar variable or a hash variable?
As far as I know, hash variables in Perl are declared with % while $ is used for scalar variables.
And what is the use of the bless function? Why does it return $self?

From the example above, is my $self a scalar variable or a hash variable?
$self is a scalar variable. You can tell that because it starts with a $.
(Parenthetical update: In the comments, brian points out that this rule is flawed. And he's right of course - as he usually is. The $ at the start of $self shows that it's a scalar value, not a scalar variable. But if you take the $ at the start together with the lack of look-up brackets - like [...] or {...} - following the variable name, then you can be sure this is a scalar variable.)
But your complete line of code is this:
my $self = {
_firstName => shift,
_lastName => shift,
_ssn => shift,
};
Here, the { ... } is an "anonymous hash constructor". It creates a hash and returns a reference to it. References are always scalar values, so they can be stored in scalar variables (that's one of the major reasons for their existence).
So $self is still a scalar variable, but it contains a reference to a hash.
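The distinction can be checked directly. The following is a minimal sketch with made-up values (the names in the hash are illustrative, not taken from the question); ref reports what kind of thing the scalar points to.

```perl
use strict;
use warnings;

# { ... } builds an anonymous hash and returns a reference to it.
my $self = {
    _firstName => 'Ada',        # illustrative values only
    _lastName  => 'Lovelace',
};

print ref($self), "\n";                     # HASH: the scalar holds a hash reference
print $self->{_firstName}, "\n";            # the arrow looks up a key through the reference
print join(',', sort keys %$self), "\n";    # %$self dereferences to the whole hash
```

So the $ sigil describes the container (one scalar), while ref describes what that scalar refers to.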
And what is the use of the bless function?
The call to bless() effectively assigns a type to the $self variable. Perl needs to know the type of your object in order to know which symbol table to search for the object's methods. When I'm running classes in this, I like to say that bless() is like writing the type of the object on a post-it note and slapping it on the object's forehead - so, later on, people can know what the type is.
Why does it return $self?
You will call this method something like this:
my $person = Person->new(...); # passing various parameters
The new() method needs to return the newly created object in order that you can store it in a variable and manipulate it in some way later on.
But a Perl subroutine returns the value of the last expression in the subroutine, and bless() returns the "blessed" object, so it would be fine to end the subroutine with the previous line:
bless $self, $class;
But it's traditional to be more explicit about return values, so most people would add the return line. It makes for better documented code.
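As a rough illustration of both points, here is a small sketch. The first_name accessor and the named-argument style are assumptions added for the example; they are not part of the original Person class. The constructor ends with bless and relies on the implicit return value.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

package Person;

sub new {
    my ($class, %args) = @_;
    my $self = { _firstName => $args{firstName} };
    # bless returns the reference it was given, so as the last
    # expression it is also the return value of new().
    bless $self, $class;
}

# Hypothetical accessor, added only to show method lookup working.
sub first_name { $_[0]{_firstName} }

package main;

my $plain  = { _firstName => 'Ada' };            # an ordinary hash reference
my $person = Person->new(firstName => 'Ada');    # a blessed hash reference

my $status = defined blessed($plain) ? 'blessed' : 'not blessed';
print "$status\n";                  # not blessed
print blessed($person), "\n";       # Person
print $person->first_name, "\n";    # Ada
```

Only the blessed reference can be used as a method invocant; calling $plain->first_name would die, because Perl has no class to search.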

$self is a hash reference, so it's considered as a scalar in most manipulations.
bless is used to say that this hash reference is an object of the given class (which will be Person here unless you have child classes), so that you can call functions of this class using the $object->function notation.
And the return $self is necessary so you can actually retrieve the created object when you call new!

Related

overloading a Perl object to redirect hash access to custom routine

A while ago, I wrote a routine that parses a given string and returns a record in the form of a hash (field => value, field2 => value2). Great, except requirements have changed and I now need to return more data and offer getter methods to get at this data. So, I adjusted the routine to return a Record object which stores this same hash in a data attribute.
However, this will break legacy code that expects a hash so that it can get at the data using $record->{field}. With the new Record object, the path to this data is now $record->{data}->{field} or $record->getByShortName('field').
My idea was to overload the object's FETCH method and return the corresponding field. However, this does not seem to work. It looks like FETCH is never called.
I'm looking for three pieces of advice:
How can I overload my object correctly so that hash access attempts are redirected to a custom method of the object?
Is this an advisable way of working or will there be a massive speed penalty?
Are there better methods to keep backward compatibility in my case?
Here's an MVE:
Record.pm
package Record;
use strict;
use warnings;
use Data::Dumper;
use overload fallback => 1, '%{}' => \&access_hash;
sub new {
my ($class, %args) = @_;
my %fields = (answer => 42, question => 21);
$args{fields} = \%fields;
return bless { %args }, $class;
}
sub access_hash {
my ($self) = shift;
return $self; # cannot return $self->{fields} because that would recurse ad infinitum
}
sub FETCH {
print(Dumper(@_)); # does not print anything; is this method not being called?
}
test.pl
use Record;
my $inst = Record->new();
print($inst->{answer}."\n");
print($inst->{question}."\n");
Record is a blessed hash reference, so if you overload the %{} operator, you will have trouble accessing the fields of the underlying hash.
The overload authors thought about this, and provided the overloading pragma as a way to disable overloading for this and some other use cases.
use overload '%{}' => \&access_hash;
...
sub access_hash {
no overloading '%{}';
my ($self) = shift;
return $self->{fields};
}
Prior to Perl 5.10, the workaround was to disable overloading by temporarily reblessing your object to something that wouldn't activate your overloaded operators.
sub access_hash {
my ($self) = shift;
my $orig_ref = ref($self);
bless $self, "#$%^&*()";
my $fields = $self->{fields};
bless $self, $orig_ref;
return $fields;
}
You don't necessarily need a dedicated constructor for Perl objects. You can define your Record class, and then simply return bless $hashref, 'Record'; where you are now doing return $hashref;. All code that operates directly on the hashref will continue to work, but you will also be able to call methods on it.
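A minimal sketch of that last suggestion follows. The parse_record routine and its key=value;key=value input format are stand-ins invented for the example, since the real parsing routine is not shown; getByShortName is named in the question, but its body here is assumed.

```perl
use strict;
use warnings;

package Record;

# Getter named in the question; this body is an assumption.
sub getByShortName {
    my ($self, $name) = @_;
    return $self->{$name};
}

package main;

# Stand-in for the existing parsing routine (hypothetical input format).
sub parse_record {
    my ($string) = @_;
    my %fields;
    for my $pair (split /;/, $string) {
        my ($key, $value) = split /=/, $pair, 2;
        $fields{$key} = $value;
    }
    return bless \%fields, 'Record';    # previously: return \%fields;
}

my $record = parse_record('answer=42;question=21');
print $record->{answer}, "\n";                      # legacy hash access: 42
print $record->getByShortName('question'), "\n";    # new method access: 21
```

Because the fields stay at the top level of the blessed hash, no overloading is needed and legacy callers see the same structure as before. Any extra data would have to live under keys that cannot collide with field names.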

Matching optional query parameters using the variable name

I want to create a hash of optional query parameters that are sometimes passed to my subroutine. Sometimes a query parameter called welcome is passed in, and the value is either 1 or 0.
If that variable exists, I want to add it to a hash.
I've created a configuration value called OPTIONAL_URL_PARAMS which is a list of expected parameter names that can be passed in:
use constant OPTIONAL_URL_PARAMS => ("welcome")
So far I have:
my $tempParams = {};
if ( $optionalParam ) {
foreach my $param (@{&OPTIONAL_URL_PARAMS}) {
if ($optionalParam eq $self->{$param}) {
$tempParams->{$param} = $optionalParam;
$tempParams->{$param} =~ s/\s+//g; # strip whitespace
}
}
}
But this tries to use the value of $self->{$param} instead of its name. i.e. I want welcome to match welcome, but it's trying to match 1 instead.
I know when it comes to keys in a hash you can use keys %hash, but is there a way you can do this with regular variables?
Edit
My subroutine is being called indirectly:
my $url_handler = URL::Handler->new($param, $optionalParam);
sub new {
my $class = shift;
my $args = @_;
my $self = {
param => $args{param},
optionalParams => $args{optionalParam}
};
}
If $optionalParam's variable name is 'welcome', then I want to try and map it to the constant welcome.
This is not an answer any more, but I cannot remove it yet as there is still a discussion going on to clarify the question.
foreach my $param (@{&OPTIONAL_URL_PARAMS}) {
# ...
}
The return value of OPTIONAL_URL_PARAMS (you already got an error here and that's why you have the &, that should have told you something...) is simply a list, not an array ref. Actually at this point it should throw an error because you cannot use 1 as an array reference.
Edit
In Perl, when you pass arguments to a subroutine, all the values are flattened into a single list. Specifically, if you are passing parameters to a sub, the sub doesn't know the names of the variables you originally used. It only knows their values. Therefore, if you want names as well as values, you have to pass them separately. An easy way is using a hash. E.g., in new():
my $class = shift;
my $param = shift; # The first, required parameter
my %therest = (@_); # Optional parameters, if any
Then you can say URL::Handler->new($param, 'welcome' => 1), and $therest{welcome} will have the value 1. You can also say
URL::Handler->new($param, 'welcome'=>1, 'another'=>42);
and %therest will have the two keys welcome and another.
See also some further discussion of passing whole hashes as parameters
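Putting the pieces together, one possible shape for the constructor is sketched below. It assumes the caller passes optional parameters as name => value pairs, and that OPTIONAL_URL_PARAMS acts as a whitelist; the 'home' and 'unknown' arguments are invented for the demonstration.

```perl
use strict;
use warnings;

package URL::Handler;

# A list constant; iterate over it directly rather than dereferencing it.
use constant OPTIONAL_URL_PARAMS => ('welcome');

sub new {
    my $class   = shift;
    my $param   = shift;    # the first, required parameter
    my %therest = @_;       # optional name => value pairs

    my %optional;
    for my $name (OPTIONAL_URL_PARAMS) {
        next unless exists $therest{$name};            # match on the name, not the value
        (my $value = $therest{$name}) =~ s/\s+//g;     # strip whitespace
        $optional{$name} = $value;
    }

    return bless { param => $param, optionalParams => \%optional }, $class;
}

package main;

my $handler = URL::Handler->new('home', welcome => ' 1 ', unknown => 'x');
print join(',', sort keys %{ $handler->{optionalParams} }), "\n";    # welcome
print $handler->{optionalParams}{welcome}, "\n";                     # 1
```

Names that are not in the constant list are silently ignored here; whether to ignore or reject them is a design choice the question does not settle.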
Original
This also probably doesn't answer the question!
Some thoughts on the code from your comment.
my $url_handler = URL::Handler->new($param, $optionalParam);
sub new {
my $class = shift; # OK; refers to URL::Handler
my $args = @_; # Problematic: now $args is the _number_ of args passed (array @_ evaluated in scalar context).
my $self = {
# There is no %args hash, so the next two lines are not doing what you expect.
# I actually don't know enough perl to know what they do! :)
param => $args{param},
optionalParams => $args{optionalParam}
};
}
Some thoughts:
use strict; and use warnings; at the top of your source file, if you haven't yet.
I can think of no languages other than Algol 60 that support this idea. It goes against the idea of encapsulation, and prevents you from using an array or hash element, a function call, a constant, or an expression as the actual parameter to a call
Variable names are purely for the convenience of the programmer and have no part in the functionality of any running program. If you wished so, you could write your code using a single array @memory and have "variables" $memory[0], $memory[1] etc. But you would be bypassing the most useful part of compiler technology that allows us to relate readable text to memory locations. It is best to consider those names to be lost once the program is running
The called code should be interested only in the values passed, and it would be a nightmare if the name of a variable passed as an actual parameter were significant within the subroutine
If you were able to access the name of a variable passed as a parameter, what do you suppose would be provided to subroutine stats if the call looked like this
stats( ( $base[$i] + 0.6 + sum(@offset{qw/ x y /}) + sum(@aa) ) / @aa )
In summary, it cannot be done in general. If you need to associate a value with a name then you should probably be looking at hashes
Your code
my $url_handler = URL::Handler->new($param, $optionalParam);
sub new {
my $class = shift;
my $args = @_;
my $self = {
param => $args{param},
optionalParams => $args{optionalParam}
};
}
has a number of problems
You correctly shift the name of the class from parameter array @_, but then set my $args = @_, which sets $args to the number of elements remaining in @_. But the value of $args is irrelevant because you never use it again
You then set $self to a new anonymous hash, which is created with two elements, using the values from hash %args. But %args doesn't exist, so the value of both elements will be undef. Had you put use strict and use warnings 'all' in place you would have been alerted to this
The keys that you're using to access this non-existent hash are param and optionalParam, and I think it's more than a coincidence that they match the names of the actual parameters of the call to new
While Perl is unusual in that it allows programmatic access to its symbol tables, it is an arcane and unrecommended method. Those names are essentially hidden from the program and the programmer and while modules like Exporter must manipulate symbol tables to do their job, any such behaviour inside base-level software is very much to be avoided
Finally, you never use $self again after defining it. You should be blessing it into a class according to the $class variable (which contains the string URL::Handler) and returning it from the constructor
I hope this helps
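For reference, a constructor with those problems addressed might look like the sketch below. It keeps the positional call from the question; the argument values 'home' and 1 are invented for the demonstration.

```perl
use strict;
use warnings;

package URL::Handler;

sub new {
    my $class = shift;                   # 'URL::Handler'
    my ($param, $optionalParam) = @_;    # list assignment copies the values, not the count

    my $self = {
        param          => $param,
        optionalParams => $optionalParam,
    };

    return bless $self, $class;          # bless the hash and return the object
}

package main;

my $url_handler = URL::Handler->new('home', 1);
print ref($url_handler), "\n";                 # URL::Handler
print $url_handler->{param}, "\n";             # home
print $url_handler->{optionalParams}, "\n";    # 1
```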

Perl dereferencing a subroutine

I have come across code with the following syntax:
$a -> mysub($b);
And after looking into it I am still struggling to figure out what it means. Any help would be greatly appreciated, thanks!
What you have encountered is object oriented perl.
It's documented in perlobj. The principle is fairly simple though - an object is a sort of super-hash which, as well as data, also includes built-in code.
The advantage of this is that your data structure 'knows what to do' with its contents. At a basic level, that's just validating data - so you can make a hash that rejects "incorrect" input.
But it allows you to do considerably more complicated things. The real point of it is encapsulation, such that I can write a module, and you can make use of it without really having to care what's going on inside it - only the mechanisms for driving it.
So a really basic example might look like this:
#!/usr/bin/env perl
use strict;
use warnings;
package MyObject;
#define new object
sub new {
my ($class) = @_;
my $self = {};
$self->{count} = 0;
bless( $self, $class );
return $self;
}
#method within the object
sub mysub {
my ( $self, $new_count ) = @_;
$self->{count} += $new_count;
print "Internal counter: ", $self->{count}, "\n";
}
package main;
#create a new instance of `MyObject`.
my $obj = MyObject->new();
#call the method,
$obj->mysub(10);
$obj->mysub(10);
We define a "class", which is a description of how the object 'works'. In this class, we set up a subroutine called mysub - but because it's in a class, we refer to it as a "method" - that is, a subroutine that is specifically tied to an object.
We create a new instance of the object (basically the same as my %newhash) and then call the methods within it. If you create multiple objects, they each hold their own internal state, just the same as it would if you created separate hashes.
Also: don't use $a and $b as variable names. Single-letter variable names are poor style in general, and these two in particular are special variables used by sort.
That's a method call. $a is the invocant (a class name or an object), mysub is the method name, and $b is an argument. You should proceed to read perlootut which explains all of this.
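To make the "invocant plus arguments" description concrete, here is a small sketch. The Counter class is hypothetical and exists only for illustration.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class

sub new   { my ($class) = @_; return bless { count => 0 }, $class }
sub mysub { my ($self, $n) = @_; $self->{count} += $n; return $self->{count} }

package main;

my $obj = Counter->new;

# The arrow looks up mysub in the object's class and passes the
# invocant as the first argument.
my $first = $obj->mysub(10);

# Roughly the same call written as a plain function call
# (this form skips the inheritance lookup).
my $second = Counter::mysub($obj, 10);

print "$first $second\n";    # 10 20
```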

perl constructor keyword 'new'

I am new to Perl, am currently learning object-oriented Perl, and came across writing a constructor.
It looks like when using new for the name of the subroutine the first parameter will be the package name.
Must the constructor use the keyword new? Or is it that, because we call the new subroutine through the package name, the first parameter passed in is the package name?
packagename->new;
And when the subroutine has another name, will the first parameter be a reference to an object? Or is it that, because the subroutine is called through a reference to the object, the first parameter passed in is that reference?
$objRef->subroutine;
NB: All examples below are simplified for instructional purposes.
On Methods
Yes, you are correct. The first argument to your new function, if invoked as a method, will be the thing you invoked it against.
There are two “flavors” of invoking a method, but the result is the same either way. One flavor relies upon an operator, the binary -> operator. The other flavor relies on ordering of arguments, the way bitransitive verbs work in English. Most people use the dative/bitransitive style only with built-ins — and perhaps with constructors, but seldom anything else.
Under most (but not quite all) circumstances, these first two are equivalent:
1. Dative Invocation of Methods
This is the positional one, the one that uses word-order to determine what’s going on.
use Some::Package;
my $obj1 = new Some::Package NAME => "fred";
Notice we use no method arrow there: there is no -> as written. This is what Perl itself uses with many of its own functions, like
printf STDERR "%-20s: %5d\n", $name, $number;
Which just about everyone prefers to the equivalent:
STDERR->printf("%-20s: %5d\n", $name, $number);
However, these days that sort of dative invocation is used almost exclusively for built-ins, because people keep getting things confused.
2. Arrow Invocation of Methods
The arrow invocation is for the most part clearer and cleaner, and less likely to get you tangled up in the weeds of Perl parsing oddities. Note I said less likely; I did not say that it was free of all infelicities. But let’s just pretend so for the purposes of this answer.
use Some::Package;
my $obj2 = Some::Package->new(NAME => "fred");
At run time, barring any fancy oddities or inheritance matters, the actual function call would be
Some::Package::new("Some::Package", "NAME", "fred");
For example, if you were in the Perl debugger and did a stack dump, it would have something like the previous line in its call chain.
Since invoking a method always prefixes the parameter list with invocant, all functions that will be invoked as methods must account for that “extra” first argument. This is very easily done:
package Some::Package;
sub new {
my($classname, @arguments) = @_;
my $obj = { @arguments };
bless $obj, $classname;
return $obj;
}
This is just an extremely simplified example of the two most frequent ways to call constructors, and what happens on the inside. In actual production code, the constructor would be much more careful.
Methods and Indirection
Sometimes you don’t know the class name or the method name at compile time, so you need to use a variable to hold one or the other, or both. Indirection in programming is something different from indirect objects in natural language. Indirection just means you have a variable that contains something else, so you use the variable to get at its contents.
print 3.14; # print a number directly
$var = 3.14; # or indirectly
print $var;
We can use variables to hold other things involved in method invocation than merely the method’s arguments.
3. Arrow Invocation with Indirected Method Name:
If you don’t know the method name, then you can put its name in a variable. Only try this with arrow invocation, not with dative invocation.
use Some::Package;
my $action = (rand(2) < 1) ? "new" : "old";
my $obj = Some::Package->$action(NAME => "fido");
Here the method name itself is unknown until run-time.
4. Arrow Invocation with Indirected Class Name:
Here we use a variable to contain the name of the class we want to use.
my $class = (rand(2) < 1)
? "Fancy::Class"
: "Simple::Class";
my $obj3 = $class->new(NAME => "fred");
Now we randomly pick one class or another.
You can actually use dative invocation this way, too:
my $obj3 = new $class NAME => "fred";
But that isn’t usually done with user methods. It does sometimes happen with built-ins, though.
my $fh = ($error_count == 0) ? *STDOUT : *STDERR;
printf $fh "Error count: %d.\n", $error_count;
That’s because trying to use an expression in the dative slot isn’t going to work in general without a block around it; it can otherwise only be a simple scalar variable, not even a single element from an array or hash.
printf { ($error_count == 0) ? *STDOUT : *STDERR } "Error count: %d.\n", $error_count;
Or more simply:
print { $fh{$filename} } "Some data.\n";
Which is pretty darned ugly.
Let the invoker beware
Note that this doesn’t work perfectly. A literal in the dative object slot works differently than a variable does there. For example, with literal filehandles:
print STDERR;
means
print STDERR $_;
but if you use indirect filehandles, like this:
print $fh;
That actually means
print STDOUT $fh;
which is unlikely to mean what you wanted, which was probably this:
print $fh $_;
aka
$fh->print($_);
Advanced Usage: Dual-Nature Methods
The thing about the method invocation arrow -> is that it is agnostic about whether its left-hand operand is a string representing a class name or a blessed reference representing an object instance.
Of course, nothing formally requires that $class contain a package name. It may be either, and if so, it is up to the method itself to do the right thing.
use Some::Class;
my $class = "Some::Class";
my $obj = $class->new(NAME => "Orlando");
my $invocant = (rand(2) < 1) ? $class : $obj;
$invocant->any_debug(1);
That requires a pretty fancy any_debug method, one that does something different depending on whether its invocant was blessed or not:
package Some::Class;
use Scalar::Util qw(blessed);
sub new {
my($classname, @arguments) = @_;
my $obj = { @arguments };
bless $obj, $classname;
return $obj;
}
sub any_debug {
my($invocant, $value) = @_;
if (blessed($invocant)) {
$invocant->obj_debug($value);
} else {
$invocant->class_debug($value);
}
}
sub obj_debug {
my($self, $value) = @_;
$self->{DEBUG} = $value;
}
my $Global_Debug;
sub class_debug {
my($classname, $value) = @_;
$Global_Debug = $value;
}
However, this is a rather advanced and subtle technique, one applicable in only a few uncommon situations. It is not recommended for most situations, as it can be confusing if not handled properly — and perhaps even if it is.
It is not the first parameter to new, but indirect object syntax:
perl -MO=Deparse -e 'my $o = new X 1, 2'
which gets parsed as
my $o = 'X'->new(1, 2);
From perldoc,
Perl supports another method invocation syntax called "indirect object" notation. This syntax is called "indirect" because the method comes before the object it is being invoked on.
That being said, new is not some kind of reserved word for constructor invocation; it is just the name of the method/constructor itself, and Perl does not enforce that name (e.g. DBI uses connect as its constructor).
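A short sketch of both halves of the question follows. The class and the method names create and copy are hypothetical, chosen to show that the name is arbitrary and that the first argument is whatever stands to the left of the arrow.

```perl
use strict;
use warnings;

package Some::Package;    # hypothetical class

# Any method that returns a blessed reference can serve as a constructor.
sub create {
    my ($classname, %args) = @_;    # invoked on a class name
    return bless { %args }, $classname;
}

# Invoked on an object, so the first argument is the object itself.
sub copy {
    my ($self) = @_;
    return bless { %$self }, ref $self;
}

package main;

my $fred = Some::Package->create(NAME => 'fred');
my $twin = $fred->copy;

print ref($fred), ' ', $twin->{NAME}, "\n";    # Some::Package fred
```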

What are the differences between parameter inputting mechanisms in Perl?

While reading a downloaded Perl module, I found several ways of defining input parameters, which are listed as following. What are the differences between them?
sub new{
my $class = shift;
my $self = {@_};
bless $self, $class;
}
sub count1{
my ($self, $lab1) = @_;
}
sub new1{
my ($class, $lab1) = @_;
my $self = {};
bless $self, $class;
}
sub setpath{
my $self = shift;
}
When a subroutine is called, the passed parameters are put into a special array @_. One can consume this array by shifting values out (my $foo = shift), by direct array assignment (my ($foo,$bar)=@_;), or even by using the values directly from the array ($_[0]).
Why one versus the others? Direct array assignment is the most standard and common. Sometimes the shift way is used when there are optional trailing values. Direct array usage is discouraged except in a few small niches: wrapper functions that call other functions, especially inside of objects; functions that wrap other functions and modify the inputs; and the special form goto &func, which immediately replaces the current subroutine call and calls func on the current value of @_.
# use shift for optional trailing values
use v5.10;
my $foo = shift;
my $bar = shift // 'default bar value';
my $baz = shift // 'default baz value';
#obj method to call related non-object function.
sub bar { my $self = shift; _bar(@_) }
sub longname { shortname(@_) }
sub get { return $_[0]->$_[1]; }
#1 and #3 are examples of associating an object with a class (Object Oriented Perl).
In #2, @_ is the list of parameters passed to the function, so $self and $lab1 get the values of the first 2 passed parameters.
In #4, shift() is a built in Perl subroutine that takes an array as an argument, then returns and deletes the first item in that array. If it has no argument, it is executed implicitly on @_. So $self gets the value of the first passed parameter.
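The three styles can be compared side by side. This is a minimal sketch; the subroutine names are invented for the example.

```perl
use strict;
use warnings;

# Style 1: shift arguments off @_ one at a time.
sub with_shift {
    my $first  = shift;                 # shift defaults to @_ inside a sub
    my $second = shift // 'default';    # the // operator needs Perl 5.10 or later
    return "$first-$second";
}

# Style 2: copy everything out with one list assignment.
sub with_list {
    my ($first, $second) = @_;
    return "$first-$second";
}

# Style 3: use @_ directly; its elements alias the caller's variables.
sub add_one_in_place {
    $_[0]++;
    return;
}

print with_shift('a'), "\n";        # a-default
print with_list('a', 'b'), "\n";    # a-b

my $n = 41;
add_one_in_place($n);
print "$n\n";                       # 42
```

The aliasing shown in the third style is one reason direct use of @_ is usually avoided: it lets a subroutine modify the caller's variables, which is easy to do by accident.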