What is "stringification" in Perl? - perl

In the documentation for the CPAN module DateTime I found the following:
Once you set the formatter, the
overloaded stringification method will
use the formatter.
It seems there is some Perl concept called "stringification" that I somehow missed. Googling has not clarified it much. What is this "stringification"?

"stringification" happens any time that perl needs to convert a value into a string. This could be to print it, to concatenate it with another string, to apply a regex to it, or to use any of the other string manipulation functions in Perl.
say $obj;
say "object is: $obj";
if ($obj =~ /xyz/) {...}
say join ', ' => $obj, $obj2, $obj3;
if (length $obj > 10) {...}
$hash{$obj}++;
...
Normally, objects will stringify to something like Some::Package=HASH(0x467fbc) where perl is printing the package it is blessed into, and the type and address of the reference.
Some modules choose to override this behavior. In Perl, this is done with the overload pragma. Here is an example of an object that when stringified produces its sum:
{package Sum;
use List::Util ();
sub new {my $class = shift; bless [@_] => $class}
use overload fallback => 1,
'""' => sub {List::Util::sum @{$_[0]}};
sub add {push @{$_[0]}, @_[1 .. $#_]}
}
my $sum = Sum->new(1 .. 10);
say ref $sum; # prints 'Sum'
say $sum; # prints '55'
$sum->add(100, 1000);
say $sum; # prints '1155'
There are several other ifications that overload lets you define:
'bool' Boolification The value in boolean context `if ($obj) {...}`
'""' Stringification The value in string context `say $obj; length $obj`
'0+' Numification The value in numeric context `say $obj + 1;`
'qr' Regexification The value when used as a regex `if ($str =~ /$obj/)`
Objects can even behave as different types:
'${}' Scalarification The value as a scalar ref `say $$obj`
'@{}' Arrayification The value as an array ref `say for @$obj;`
'%{}' Hashification The value as a hash ref `say for keys %$obj;`
'&{}' Codeification The value as a code ref `say $obj->(1, 2, 3);`
'*{}' Globification The value as a glob ref `say *$obj;`
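As a rough illustration of how several of these conversions can live on one object, here is a minimal sketch. The Temperature class and its degrees field are made up for the example; only the overload keys come from the table above.

```perl
use strict;
use warnings;
use feature 'say';

{
    package Temperature;    # hypothetical class, for illustration only

    use overload
        '""'   => sub { sprintf '%.1f C', $_[0]{degrees} },   # string context
        '0+'   => sub { $_[0]{degrees} },                      # numeric context
        'bool' => sub { 1 },                                   # always true, even at 0 degrees
        fallback => 1;

    sub new {
        my ($class, $degrees) = @_;
        return bless { degrees => $degrees }, $class;
    }
}

my $t = Temperature->new(0);

my $as_string = "$t";                     # uses '""'
my $as_number = $t + 5;                   # no '+' overload, so fallback goes through '0+'
my $as_bool   = $t ? 'true' : 'false';    # uses 'bool'

say $as_string;
say $as_number;
say $as_bool;
```

Without the 'bool' entry, fallback would derive truth from one of the other conversions, so it is defined explicitly here to keep a zero-degree object true.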

Stringification methods are called when an object is used in a context where a string is expected. The method describes how to represent the object as a string. So for instance, if you say print object; then since print is expecting a string, it's actually passing the result of the stringify method to print.

Just adding to the above answer, to draw an analogy with java ...
It is much like Object.toString() in Java: present by default, but it can be overridden when required.

Related

Subroutine arguments as key-value pairs without a temp variable

In Perl, I've always liked the key-value pair style of argument passing,
fruit( apples => red );
I do this a lot:
sub fruit {
my %args = @_;
$args{apples}
}
Purely for compactness and having more than one way to do it, is there a way to either:
access the key-value pairs without assigning @_ to a hash? I.e. in a single statement?
have the subroutine's arguments automatically become a hash reference, perhaps via a subroutine prototype?
Without:
assigning to a temp variable my %args = @_;
having the caller pass by reference i.e. fruit({ apples => red }); purely for aesthetics
Attempted
${%{\@_}}{apples}
Trying to reference @_, interpret that as a hash ref, and access a value by key.
But I get an error that it's not a hash reference. (Which it isn't ^.^ ) I'm thinking of C where you can cast pointers, amongst other things, and avoid explicit reassignment.
I also tried subroutine prototypes
sub fruit (%) { ... }
...but the arguments get collapsed into @_ as usual.
You can't perform a hash lookup (${...}{...}) without having a hash. But you could create an anonymous hash.
my $apples = ${ { @_ } }{apples};
my $oranges = ${ { @_ } }{oranges};
You could also use the simpler post-dereference syntax
my $apples = { @_ }->{apples};
my $oranges = { @_ }->{oranges};
That would be very inefficient though. You'd be creating a new hash for each parameter. That's why a named hash is normally used.
my %args = @_;
my $apples = $args{apples};
my $oranges = $args{oranges};
An alternative, however, would be to use a hash slice.
my ($apples, $oranges) = @{ { @_ } }{qw( apples oranges )};
The following is the postfix-dereference version, but it's only available by default in 5.24+:
my ($apples, $oranges) = { @_ }->@{qw( apples oranges )};
It's available in 5.20+ if you use the following:
use feature qw( postderef );
no warnings qw( experimental::postderef );
If you're more concerned about compactness than efficiency, you can do it this way:
sub fruit {
print( +{@_}->{apples}, "\n" );
my $y = {@_}->{pears};
print("$y\n");
}
fruit(apples => 'red', pears => 'green');
The reason +{@_}->{apples} was used instead of {@_}->{apples} is that, without the + (or some other means of disambiguation), the leading brace conflicts with the print BLOCK LIST syntax of print.

Why can't I initialize the member variable inside the new?

I am trying to understand OO in Perl. I made the following trivial class:
#/usr/bin/perl
package Tools::Util;
use strict;
use warnings;
my $var;
sub new {
my ($class, $arg) = @_;
my $small_class = {
var => $arg,
};
return bless $small_class;
}
sub print_object {
print "var = $var\n"; #this is line 20
}
1;
And this is a test script:
#!/usr/bin/perl
use strict;
use warnings;
use Tools::Util;
my $test_object = new Tools::Util("Some sentence");
$test_object->print_object();
use Data::Dumper;
print Dumper($test_object);
The result I get is:
Use of uninitialized value $var in concatenation (.) or string at Tools/Util.pm line 20.
var =
$VAR1 = bless( {
'var' => 'Some sentence'
}, 'Tools::Util' );
I cannot understand this. I thought that objects in Perl are hashes, so I could access/initialize the member variables using the same names without a $. Why is $var not initialized in this case, when the hash that I dump contains the value?
How should I use/initialize/handle member variables and what am I misunderstanding here?
$var is a file-scoped lexical variable (in effect a class variable). Nothing ever assigns to it in your example, so it is undefined.
You probably want:
sub print_object {
my $self = shift;
print "var = $self->{var}\n";
}
Perl doesn't handle object methods in quite the same way that you're used to.
Are you familiar with the implicit this argument that many object-oriented languages use? If not, now would be a great time to read up on it.
Here's a five-second introduction that glosses over the details:
//pretend C++
//this function signature
MyClass::MyFunction(int x);
//is actually more like the following
MyClass::MyFunction(MyClass this, int x);
When you access instance members of the class, my_var is equivalent to this.my_var.
In Perl, you get to do this manually! The variable $var is not equivalent to $self->{var}.
Your blessed object is actually a hash reference, and can be accessed as such. When you call $test_object->print_object(), the sub gets the value of $test_object as its first argument. Most Perl programmers handle this like so:
sub my_method {
my $self = shift; # shift first argument off of @_
print $self->{field};
}
With that in mind, you should probably rewrite your print_object sub to match mpapec's answer (the version above that starts with my $self = shift and prints $self->{var}).
Further reading: perlsub, perlobj
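Putting the pieces from these answers together, a corrected version of the class might look like the sketch below. The describe method name is hypothetical (it returns the line rather than printing it, so the caller decides what to do), and the two-argument bless is a choice made here, not something the question used.

```perl
use strict;
use warnings;

{
    package Tools::Util;

    sub new {
        my ($class, $arg) = @_;
        my $self = { var => $arg };    # per-object data lives in the hash
        return bless $self, $class;    # two-argument bless
    }

    # Hypothetical method name; reads the field through the invocant.
    sub describe {
        my $self = shift;              # the object the method was called on
        return "var = $self->{var}";
    }
}

my $test_object = Tools::Util->new("Some sentence");
print $test_object->describe, "\n";
```

Note that there is no package-level my $var at all: the only copy of the value is the one stored in the blessed hash.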

Is it possible to safely access data in a nested data structure like Template Toolkit does?

Is there a module that provides functionality like Template Toolkit does when accessing a deeply nested data structure? I want to pull out something like $a = $hash{first}[0]{second}{third}[3] without having to test each part of the structure to see if it conforms to what I expect. If %hash = {} I want $a = undef, not produce an error.
Perl will do exactly what you described
This feature is called autovivification, which means that container objects spring into existence as soon as you use them. This holds as long as you don't violate any precedent you set yourself.
For example, trying to dereference something as a hash when you have already used it as an array reference is an error. More generally, if the value is defined, it can only be dereferenced as a particular type if it contains a reference to that type.
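One side effect is worth seeing in practice: because the containers spring into existence, even a read-only lookup leaves new entries behind in the top-level hash. A small sketch, assuming an initially empty %hash:

```perl
use strict;
use warnings;

my %hash;

# Reading a deep element of an empty hash yields undef without an error...
my $value = $hash{first}[0]{second}{third}[3];

# ...but the lookup created the containers it had to walk through.
my $first_exists = exists $hash{first} ? 1 : 0;

print defined $value ? "value is defined\n" : "value is undef\n";
print "first exists: $first_exists\n";
```

If leaving those empty containers behind matters for your data, check each level with exists before descending, or use one of the module-based approaches mentioned in this thread.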
If you want protection against misuse as well, you can wrap the nested lookup in an eval block:
my $x = eval{ $hash{first}[0]{second}{third}[3] };
This will return undef if the eval fails. Note that this is NOT a string eval, which would be written eval '....';. In block form, Perl's eval is like the try {...} construct in other languages.
To determine whether the eval failed or the value in that position really is undef, test whether the special variable $@ is true. If so, the eval failed, and the reason will be in $@. That would be written:
my $x = eval{ $hash{first}[0]{second}{third}[3] };
if (!$x and $@) { die "nested dereference failed: $@" }
Or you can use the module Try::Tiny which abstracts away the implementation details and protects against a few edge cases:
use Try::Tiny;
my $x;
try {
$x = $hash{first}[0]{second}{third}[3];
} catch {
die "nested dereference failed: $_";
};
Your error likely comes from the wrong level of indirection, not from a missing value.
Note that {} produces a scalar reference to a hash, not a hash. So the variable should be defined as $hash = {}, not %hash = {}. You then access the elements as $hash->{first}, not $hash{first}, and so on. If you define the hash properly and try something like $hash->{first}->[0]->{second}->{third}->[3], you will get exactly undef, as you wanted, with no errors.
Note: always use strict!
Check out Data::Diver.
You can access an arbitrary nested structure by key name (it doesn't matter if a layer is a hash or array). The Dive() subroutine will return an empty list if there is an error or it will return a matching value.
use strict;
use warnings;
use Data::Diver qw( Dive );
my $a = Dive( \%hash, 'first', 0, 'second', 'third', 3 );
if( defined $a ) {
print "Got '$a'.\n";
}
else {
print "Got no match.\n";
}
Something like this?
use strict;
use warnings;
my %hash;
my $elem = _eval( '$hash{first}[0]{second}{third}[3]' );
sub _eval {return (eval shift) // undef}
Of course you might as well do:
my $elem = eval {$hash{first}[0]{second}{third}[3] // undef};

Why does Perl's strict not let me pass a parameter hash?

I have a Perl subroutine where I would like to pass parameters as a hash
(the aim is to include a css depending on the parameter 'iconsize').
I am using the call:
get_function_bar_begin('iconsize' => '32');
for the subroutine get_function_bar_begin:
use strict;
...
sub get_function_bar_begin
{
my $self = shift;
my %template_params = %{ shift || {} };
return $self->render_template('global/bars/tmpl_incl_function_bar_begin.html',%template_params);
}
Why does this yield the error message:
Error executing run mode 'start': undef error - Can't use string ("iconsize") as a HASH ref while "strict refs" in use at CheckBar.pm at line 334
Am I doing something wrong here?
Is there another way to submit my data ('iconsize') as a hash?
(I am still new to Perl)
EDIT: Solution which worked for me. I didn't change the call, but my function:
sub get_function_bar_begin
{
my $self = shift;
my $paramref = shift;
my %params = (ref($paramref) eq 'HASH') ? %$paramref : ();
my $iconsize = $params{'iconsize'} || '';
return $self->render_template('global/bars/tmpl_incl_function_bar_begin.html',
{
'iconsize' => $iconsize,
}
);
}
You are using the hash-dereferencing operator ( %{ } ) on the first argument of your parameter list. But that argument is not a hash reference, it's just the string 'iconsize'. You can do what you want by one of two ways:
Pass an anonymous hash reference:
get_function_bar_begin( { 'iconsize' => '32' } );
Or continue to pass a normal list, as you are right now, and change your function accordingly:
sub get_function_bar_begin {
my $self = shift;
my %template_params = @_;
}
Notice in this version that we simply assign the argument list directly to the hash (after extracting $self). This works because a list of name => value pairs is just syntactic sugar for a normal list.
I prefer the second method, since there's no particularly good reason to construct an anonymous hashref and then dereference it right away.
There's also some good information on how this works in this post: Object-Oriented Perl constructor syntax.
You're violating strict refs by trying to use the string iconsize as a hash reference.
I think you just want:
my( $self, %template_params ) = @_;
The first argument will go into $self and the rest create the hash by taking pairs of items from the rest of @_.
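To make that concrete, here is a self-contained sketch of the list-based approach. Bar::Renderer and its trivial constructor are hypothetical stand-ins for the asker's class, and the returned string stands in for the render_template call, which is not reproduced here.

```perl
use strict;
use warnings;

{
    package Bar::Renderer;    # hypothetical stand-in for the asker's class

    sub new { my $class = shift; return bless {}, $class }

    sub get_function_bar_begin {
        my ( $self, %template_params ) = @_;    # invocant first, then key/value pairs
        my $iconsize = $template_params{iconsize} // '';    # default when not passed
        return "iconsize=$iconsize";            # stand-in for render_template(...)
    }
}

my $r       = Bar::Renderer->new;
my $with    = $r->get_function_bar_begin( iconsize => '32' );
my $without = $r->get_function_bar_begin();
print "$with\n$without\n";
```

The call site keeps the plain key => value style, and strict refs has nothing to complain about because nothing is ever dereferenced.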
Passing hash with parameters as list
You need to use the @_ variable instead of shift, like this:
my %template_params = @_; ## convert key => value pairs into hash
There is a difference between hashes and references to hashes in Perl. When you pass 'iconsize' => '32' as parameters, Perl sees a plain list, which can then be interpreted as a hash.
Passing hash with parameters as hash reference
But when you write %{ shift || {} }, Perl expects the second parameter to be a hash reference. In that case you can fix it in the following way:
get_function_bar_begin({ 'iconsize' => '32' }); ## make anonymous hash for params
The problem is this line:
get_function_bar_begin('iconsize' => '32');
This does not pass a hash reference, as you seem to think, but a hash, which appears as a list to the callee. So when you do %{ shift }, you're only shifting the key 'iconsize', not the entire list. The solution is actually to make the second line of your function simpler:
my %template_params = @_;

Object-Oriented Perl constructor syntax and named parameters

I'm a little confused about what is going on in Perl constructors. I found these two examples in perldoc perlbot.
package Foo;
#In Perl, the constructor is just a subroutine called new.
sub new {
#I don't get what this line does at all, but I always see it. Do I need it?
my $type = shift;
#I'm turning the array of inputs into a hash, called parameters.
my %params = @_;
#I'm making a new hash called $self to store my instance variables?
my $self = {};
#I'm adding two values to the instance variables called "High" and "Low".
#But I'm not sure how $params{'High'} has any meaning, since it was an
#array, and I turned it into a hash.
$self->{'High'} = $params{'High'};
$self->{'Low'} = $params{'Low'};
#Even though I read the page on bless, I still don't get what it does.
bless $self, $type;
}
And another example is:
package Bar;
sub new {
my $type = shift;
#I still don't see how I can just turn an array into a hash and expect things
#to work out for me.
my %params = @_;
my $self = [];
#Exactly where did params{'Left'} and params{'Right'} come from?
$self->[0] = $params{'Left'};
$self->[1] = $params{'Right'};
#and again with the bless.
bless $self, $type;
}
And here is the script that uses these objects:
package main;
$a = Foo->new( 'High' => 42, 'Low' => 11 );
print "High=$a->{'High'}\n";
print "Low=$a->{'Low'}\n";
$b = Bar->new( 'Left' => 78, 'Right' => 40 );
print "Left=$b->[0]\n";
print "Right=$b->[1]\n";
I've injected the questions/confusion that I've been having into the code as comments.
To answer the main thrust of your question, since a hash can be initialized as a list of key => value pairs, you can send such a list to a function and then assign @_ to a hash. This is the standard way of doing named parameters in Perl.
For example,
sub foo {
my %stuff = @_;
...
}
foo( beer => 'good', vodka => 'great' );
This will result in %stuff in subroutine foo having a hash with two keys, beer and vodka, and the corresponding values.
Now, in OO Perl, there are some additional wrinkles. Whenever you use the arrow (->) operator to call a method, whatever was on the left side of the arrow is stuck onto the beginning of the @_ array.
So if you say Foo->new( 1, 2, 3 );
Then inside your constructor, @_ will look like this: ( 'Foo', 1, 2, 3 ).
So we use shift, which without an argument operates on @_ implicitly, to get that first item out of @_, and assign it to $type. After that, @_ has just our name/value pairs left, and we can assign it directly to a hash for convenience.
We then use that $type value for bless. All bless does is take a reference (in your first example a hash ref) and say "this reference is associated with a particular package." Alakazzam, you have an object.
Remember that $type contains the string 'Foo', which is the name of our package. If you don't specify a second argument to bless, it will use the name of the current package, which will also work in this example but will not work for inherited constructors.
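The difference between the one-argument and two-argument forms only shows up once a subclass inherits the constructor. A minimal sketch, with all four class names invented for the example:

```perl
use strict;
use warnings;

{
    package Base::OneArg;    # hypothetical: constructor ignores the invocant
    sub new { my $class = shift; my $self = {}; return bless $self }
}
{
    package Base::TwoArg;    # hypothetical: constructor honours the invocant
    sub new { my $class = shift; my $self = {}; return bless $self, $class }
}
{
    package Child::OneArg;
    our @ISA = ('Base::OneArg');    # inherits new() unchanged
}
{
    package Child::TwoArg;
    our @ISA = ('Base::TwoArg');    # inherits new() unchanged
}

my $one = ref( Child::OneArg->new );    # class the object was really blessed into
my $two = ref( Child::TwoArg->new );

print "one-argument bless gave: $one\n";
print "two-argument bless gave: $two\n";
```

With one argument the object ends up in the package where new was compiled (the base class), so methods defined only in the child would not be found on it.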
.1. In Perl, the constructor is just a subroutine called new.
Yes, by convention new is a constructor. It may also perform initialization or not. new should return an object on success or throw an exception (die/croak) if an error has occurred that prevents object creation.
You can name your constructor anything you like, have as many constructors as you like, and even build bless objects into any name space you desire (not that this is a good idea).
.2. I don't get what my $type = shift; does at all, but I always see it. Do I need it?
shift with no arguments takes an argument off the head of @_, and the line assigns it to $type. The -> operator passes the invocant (left hand side) as the first argument to the subroutine. So this line gets the class name from the argument list. And, yes, you do need it.
.3. How does an array of inputs become the %params hash? my %params = @_;
Assignment into a hash is done in list context, with pairs of list items grouped as key/value pairs. So %foo = (1, 2, 3, 4); creates a hash such that $foo{1} == 2 and $foo{3} == 4. This is typically done to create named parameters for a subroutine. If the sub is passed an odd number of arguments, a warning will be generated if warnings are enabled.
.4. What does my $self = {}; do?
This line creates an anonymous hash reference and assigns it to the lexically scoped variable $self. The hash reference will store the data for the object. Typically, the keys in the hash have a one-to-one mapping to the object attributes. So if class Foo has attributes 'size' and 'color', if you inspect the contents of a Foo object, you will see something like $foo = { size => 'm', color => 'black' };.
.5. Given $self->{'High'} = $params{'High'}; where does $params{'High'} come from?
This code relies on the arguments passed to new. If new was called like Foo->new( High => 46 ), then the hash created as per question 3 will have a value for the key High (46). In this case it is equivalent to saying $self->{High} = 46. But if the method is called like Foo->new() then no value will be available, and we have $self->{High} = undef.
.6. What does bless do?
bless takes a reference and associates it with a particular package, so that you can use it to make method calls. With one argument, the reference is associated with the current package. With two arguments, the second argument specifies the package to associate the reference with. It is best to always use the two-argument form, so that your constructors can be inherited by a subclass and still function properly.
Finally, I'll rewrite your hash-based object constructor as I would write it using classical OO Perl.
package Foo;
use strict;
use warnings;
use Carp qw(croak);
sub new {
my $class = shift;
croak "Illegal parameter list has odd number of values"
if @_ % 2;
my %params = @_;
my $self = {};
bless $self, $class;
# This could be abstracted out into a method call if you
# expect to need to override this check.
for my $required (qw{ name rank serial_number }) {
croak "Required parameter '$required' not passed to '$class' constructor"
unless exists $params{$required};
}
# initialize all attributes by passing arguments to accessor methods.
for my $attrib ( keys %params ) {
croak "Invalid parameter '$attrib' passed to '$class' constructor"
unless $self->can( $attrib );
$self->$attrib( $params{$attrib} );
}
return $self;
}
Your question is not about OO Perl. You are confused about data structures.
A hash can be initialized using a list or array:
my @x = ('High' => 42, 'Low' => 11);
my %h = @x;
use Data::Dumper;
print Dumper \%h;
$VAR1 = {
'Low' => 11,
'High' => 42
};
When you invoke a method on a blessed reference, the reference is prepended to the argument list the method receives:
#!/usr/bin/perl
package My::Mod;
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Indent = 0;
sub new { bless [] => shift }
sub frobnicate { Dumper(\@_) }
package main;
use strict;
use warnings;
my $x = My::Mod->new;
# invoke instance method
print $x->frobnicate('High' => 42, 'Low' => 11);
# invoke class method
print My::Mod->frobnicate('High' => 42, 'Low' => 11);
# call sub frobnicate in package My::Mod
print My::Mod::frobnicate('High' => 42, 'Low' => 11);
Output:
$VAR1 = [bless( [], 'My::Mod' ),'High',42,'Low',11];
$VAR1 = ['My::Mod','High',42,'Low',11];
$VAR1 = ['High',42,'Low',11];
Some points that haven't been dealt with yet:
In Perl, the constructor is just a
subroutine called new.
Not quite. Calling the constructor new is just a convention. You can call it anything you like. There is nothing special about that name from perl's point of view.
bless $self, $type;
Both of your examples don't return the result of bless explicitly. I hope that you know that they do so implicitly anyway.
If you assign an array to a hash, perl treats alternating elements in the array as keys and values. Your array looks like
my @array = (key1, val1, key2, val2, key3, val3, ...);
When you assign that to %hash, you get
my %hash = @array;
# %hash = ( key1 => val1, key2 => val2, key3 => val3, ...);
Which is another way of saying that in perl list/hash construction syntax, "," and "=>" mean the same thing (apart from "=>" automatically quoting a bareword on its left).
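A quick way to convince yourself is to build the same hash both ways and compare. This is only a sketch; the key names are borrowed from the question and the comparison logic is ad hoc.

```perl
use strict;
use warnings;

# The same four-element list, written with commas and with fat commas.
my %with_commas = ( 'High', 42, 'Low', 11 );
my %with_arrows = ( High => 42, Low => 11 );    # => also quotes the bareword on its left

# Count keys that are missing or whose value differs between the two hashes.
my $mismatches = grep {
    !exists $with_arrows{$_} || $with_arrows{$_} != $with_commas{$_}
} keys %with_commas;

my $same_size = scalar( keys %with_commas ) == scalar( keys %with_arrows );
my $same      = ( $same_size && $mismatches == 0 ) ? 1 : 0;

print $same ? "identical\n" : "different\n";
```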
In Perl, all arguments to subroutines are passed via the predefined array @_.
The shift removes and returns the first item from the @_ array. In Perl OO, this is the method invocant -- typically a class name for constructors and an object for other methods.
Hashes flatten to and can be initialized by lists. It's a common trick to emulate named arguments to subroutines. e.g.
Employee->new(name => 'Fred Flintstone', occupation => 'quarry worker');
Ignoring the class name (which is shifted off) the odd elements become hash keys and the even elements become the corresponding values.
The my $self = {} creates a new hash reference to hold the instance data. The bless function is what turns the normal hash reference $self into an object. All it does is add some metadata that identifies the reference as belonging to the class.
Yes, I know that I'm being a bit of a necromancer here, but...
While all of these answers are excellent, I thought I'd mention Moose. Moose makes constructors easy: package Foo; use Moose; automatically provides a constructor called new (although the name "new" can be overridden if you'd like), but it doesn't take away any configurability if you need it.
Once I looked through the documentation for Moose (which is pretty good overall, and there are a lot more tutorial snippets around if you google appropriately), I never looked back.