Moose traits for multidimensional data structures - perl

Moving the handling of an internal variable from calls on the variable to calls on the object is easy using the Attribute::Native::Trait handlers. However, how do you deal with multidimensional data structures? I can't think of any way to handle something like the below without making the stash an arrayref of My::Stash::Attribute objects, which in turn contain an arrayref of My::Stash::Subattribute objects, which contain an arrayref of My::Stash::Instance objects. This involves a lot of munging and coercing the data at each level down the stack as I sort things out.
Yes, I can store the items as a flat array and then grep it on every read, but when reads are frequent and make up most calls, grepping a large list on every read is a lot more processing than indexing the items internally in the way needed.
Is there a MooseX extension that can handle this sort of thing via handlers creating methods, instead of just treating the read accessor as the hashref it is and modifying it in place? Or am I just best off forgetting about doing things like this via method call and just doing it as-is?
use strict;
use warnings;
use 5.010;
package My::Stash;
use Moose;
has '_stash' => (is => 'ro', isa => 'HashRef', default => sub { {} });
sub add_item {
    my $self = shift;
    my ($item) = @_;
    push @{$self->_stash->{$item->{property}}{$item->{sub}}}, $item;
}
sub get_items {
    my $self = shift;
    my ($property, $subproperty) = @_;
    return @{$self->_stash->{$property}{$subproperty}};
}
package main;
use Data::Printer;
my $stash = My::Stash->new();
for my $property (qw/foo bar baz/) {
    for my $subproperty (qw/fazz fuzz/) {
        for my $instance (1 .. 2) {
            $stash->add_item({ property => $property, sub => $subproperty, instance => $instance })
        }
    }
}
p($_) for $stash->get_items(qw/baz fuzz/);

These are very esoteric:
sub add_item {
    my $self = shift;
    my ($item) = @_;
    push @{$self->_stash->{$item->{property}}{$item->{sub}}}, $item;
}
So add_item takes a hashref item and pushes it onto an array in the stash, indexed by the item's own property and sub keys.
sub get_items {
    my $self = shift;
    my ($property, $subproperty) = @_;
    return @{$self->_stash->{$property}{$subproperty}};
}
Conversely, get_items takes two arguments, a $property and a $subproperty, and retrieves the elements of the corresponding array in the hash of hashes (HoH).
So here are the concerns into making it MooseX:
There is no way in a non-Magic hash to insist that only hashes are values -- this would be required for predictable behavior on the trait. As in your example, what would you expect if _stash->{$property} resolved to a scalar?
add_item has its depth hardcoded to property and sub.
Returning arrays is bad: it requires all of the elements to be pushed onto the stack (return refs instead).
Now firstly, I don't see why a regular Moose Hash trait couldn't accept array refs for both the setter and getter.
->set( [qw/ key1 key2/], 'foo' )
->get( [qw/ key1 key2/] )
This would certainly make your job easier, if your destination wasn't an array:
sub add_item {
    my ( $self, $hash ) = @_;
    # the question's items store the second-level key under 'sub'
    $self->set( [ $hash->{property}, $hash->{sub} ], $hash );
}
# get items works as is, just pass in an `ArrayRef`
# ie, `->get([$property, $subproperty])`
When the destination is an array rather than a hash slot, I assume you'd just have to build that into a totally different helper in the trait, push_to_array_or_create([$property, $subproperty], $value). I'd still just retrieve it with the fictional get helper specified above. auto_deref type functionality is a pretty bad idea.
In short, ask a core developer what they would think about extending set and get in this context to accept ArrayRefs as keys and act appropriately. I can't imagine there is a useful default for ArrayRef keys (I don't think regular stringification would be too useful).
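Until something like that exists in a trait, the key-path idea can be sketched in plain Perl. The helper names deep_get and deep_push below are hypothetical (they are not part of Moose or any MooseX module), and the sketch assumes every intermediate level is a hash and the final slot is an array:

```perl
use strict;
use warnings;

# Hypothetical helpers (not part of Moose): address a nested hash by a key path.

# Return the value stored under the key path, or nothing if a level is missing.
# Intermediate hashes are never created by a read.
sub deep_get {
    my ($hash, $path) = @_;
    my $node = $hash;
    for my $key (@$path) {
        return unless ref($node) eq 'HASH' && exists $node->{$key};
        $node = $node->{$key};
    }
    return $node;
}

# Push a value onto the array at the key path, creating levels as needed.
sub deep_push {
    my ($hash, $path, $value) = @_;
    my @keys = @$path;
    my $last = pop @keys;
    my $node = $hash;
    for my $key (@keys) {
        $node->{$key} //= {};
        die "Key '$key' does not hold a hash\n" unless ref($node->{$key}) eq 'HASH';
        $node = $node->{$key};
    }
    push @{ $node->{$last} }, $value;
    return;
}

my %stash;
deep_push(\%stash, [qw/baz fuzz/], { instance => $_ }) for 1 .. 2;
my $items = deep_get(\%stash, [qw/baz fuzz/]);   # array ref holding two items
my $none  = deep_get(\%stash, [qw/foo fazz/]);   # undef, and %stash gains no keys
```

Returning the array reference rather than a flattened list follows the "return refs" concern raised above; the path depth is no longer hardcoded to two levels.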

Related

Why is the Hashref passed to the Net::Ping constructor, set to an empty hashref after Net::Ping->new($args)?

What am I missing here?
When passing arguments to Net::Ping like this, then $args and $args_copy will both be set to an empty hashref after initializing the constructor Net::Ping->new($args).
use strict;
use warnings;
use Data::Dumper qw(Dumper);
use Net::Ping;
sub _ping {
    my ($args) = @_;
    my $p = Net::Ping->new($args);
    $p->close();
}
my $args = { proto => 'udp' };
my $args_copy = $args;
print Dumper $args; # $VAR1 = { 'proto' => 'udp' }
print Dumper $args_copy; # $VAR1 = { 'proto' => 'udp' }
_ping($args);
print Dumper $args; # $VAR1 = {}
print Dumper $args_copy; # $VAR1 = {}
I see the same behavior on both Strawberry Perl and WSL2 running Ubuntu 20.04.4 LTS with Perl v5.30.0.
This is interesting, a class (constructor) deleting caller's data.
The shown code passes a reference to the Net::Ping constructor and that data gets cleared, and right in the constructor (see below).
To avoid having $args cleared, if that is a problem, pass its copy instead
_ping( { %$args } );
This first de-references the hash and then constructs an anonymous hash reference with it,† and passes that. So $args is safe.
The constructor new uses data from @_ directly (without making local copies), and as it then goes through the keys it also deletes them, I presume for convenience in further processing. (I find that scary, I admit.)
Since a reference is passed to new the data in the calling code can get changed.‡
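The effect can be reproduced without Net::Ping. The sub consume_args below is a hypothetical stand-in that deletes keys as it reads them, as the constructor is described to do; it is not the module's actual code:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a constructor that consumes its argument hash.
# It works on the caller's hash through the reference and deletes each key
# after reading it.
sub consume_args {
    my ($args) = @_;
    my %seen;
    for my $key (keys %$args) {
        $seen{$key} = delete $args->{$key};
    }
    return \%seen;
}

my $args = { proto => 'udp', timeout => 5 };

consume_args({ %$args });            # pass a shallow copy
my $left_after_copy = keys %$args;   # 2: the caller's hash is untouched

consume_args($args);                 # pass the reference itself
my $left_after_ref = keys %$args;    # 0: the caller's hash was emptied
```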
† When copying a hash (or array) with a complex data structure in it -- when its values themselves contain references -- we need to make a deep copy. One way is to use Storable for it
use Storable qw(dclone);
my $deep_copy = dclone $complex_data_structure;
Here that would mean _ping( dclone $args );. It seems that new can only take a reference to a flat hash (or scalars) so this wouldn't be necessary.
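A small sketch of the difference between the shallow copy { %$args } and a deep copy, using a made-up nested structure (the key names are only for illustration):

```perl
use strict;
use warnings;
use Storable qw(dclone);

my $original = { name => 'outer', inner => { count => 1 } };

my $shallow = { %$original };      # copies the top-level keys only
my $deep    = dclone($original);   # recursively copies nested data too

$shallow->{name}         = 'changed';   # does not affect $original
$shallow->{inner}{count} = 2;           # DOES affect $original: inner hash is shared
$deep->{inner}{count}    = 99;          # does not affect $original

print "$original->{name} $original->{inner}{count}\n";   # outer 2
```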
‡ When a sub works directly with the references it gets then it can change data in the caller
sub sub_takes_ref {
    my ($ref_data) = @_;
    for my $k (keys %$ref_data) {
        $ref_data->{$k} = ...; # !!! data in caller changed !!!
    }
}
...
my $data = { ... }; # a hashref
sub_takes_ref( $data );
However, if a local copy of arguments is made in the sub then caller's data cannot be changed
use Storable qw(dclone); # for deep copy below
sub sub_takes_ref {
    my ($ref_data) = @_;
    my $local_copy_of_data = dclone $ref_data;
    for my $k (keys %$local_copy_of_data) {
        $local_copy_of_data->{$k} = ...; # data in caller safe
    }
}
(Just remember to not touch $ref_data but to use the local copy.)
This way of changing data in the caller is of course useful when the sub is meant to work on data structures with large amounts of data, since this way they don't have to be copied. But when that is not the purpose of the sub then we need to be careful, or just make a local copy to be safe.

Copy on Write for References

Perl currently supports Copy on Write (CoW) for scalar variables; however, it doesn't appear to have anything for hashrefs and arrayrefs.
Perl does, however, have subroutines that modify variable internals, like weaken, so I'm guessing that a solution might exist.
I have a situation where I return a large structure from a package that keeps this structure as internal state. I want to ensure that if either the returned reference or the internal reference (which are both currently the same reference) is modified, I end up with a Copy-on-write situation: the data the reference points to is copied and modified, and the reference used to modify the data is updated to point to the new data.
package SomePackage;
use Moose;
has some_large_internal_variable_ref => (
    'is' => 'rw',
    'isa' => 'HashRef',
);
sub some_operation {
    my ($self) = @_;
    $self->some_large_internal_variable_ref({
        # create some large result that is different every time
    });
}
sub get_result {
    my ($self) = @_;
    return $self->some_large_internal_variable_ref;
}
1;
use strict;
use warnings;
use SomePackage;
use Test::More;
# Situation 1 where the internally stored reference is modified
# This will pass!
my $package = SomePackage->new();
$package->some_operation();
my $result1 = $package->get_result();
$package->some_operation();
my $result2 = $package->get_result();
isnt($result1, $result2, "These two references should no longer be the same");
# Situation 2 where the externally stored reference is modified
# This will fail
$package = SomePackage->new();
$package->some_operation();
$result1 = $package->get_result();
$result1->{foo} = "bar";
$result2 = $package->get_result();
isnt($result1, $result2, "These two references should no longer be the same");
done_testing;
I'm trying to avoid a situation where I have to clone the values on the get_result return as this would result in a situation where memory usage is doubled.
I'm hoping there is something like weaken that I can call on the variable to indicate that any modification should trigger Copy on Write behaviour.
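Why Situation 2 fails can be seen without Moose. In this minimal sketch, $internal is a made-up stand-in for the stored attribute value; both results are copies of one reference, so they address the same hash:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $internal = { big => 'data' };   # stands in for the stored attribute value
my $result1  = $internal;           # what get_result hands out
$result1->{foo} = 'bar';            # modifies the one shared hash
my $result2  = $internal;           # a later get_result call

my $same_hash = refaddr($result1) == refaddr($result2);   # true: same address
my $leaked    = exists $internal->{foo};                  # true: change is visible inside
```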

Multiple data members in a perl class

I am new to perl and still learning oop in perl. I usually code in C, C++. It is required to bless an object to notify perl to search for methods in that package first. That's what bless does. And then every function call made with the help of -> passes the instance itself as the first parameter. Now I have a doubt about writing the constructor for a new object. A constructor would normally look like:
sub new {
my %hash = {};
return bless {%hash}; #will automatically take this package as the class
}
Now I want to have two data members in my class so I can do something like this:
sub new {
    my %hash = {};
    $hash->{"table_header"} = shift @_; #add element to hash
    $hash->{"body_content"} = shift @_;
    return bless {%hash}; #will automatically take this package as the class
}
My question is whether this is the only possible way. Can't we have multiple data members like in C and C++, or do we have to use strings like "table_header" and "body_content"?
EDIT:
In C or C++ we can directly reference the data member(assume its public for now). Here there is one extra reference which has to be made. I wanted to know if there is any way we can have a C like object.
sub new {
    my $table_header = shift @_;
    my $body_content = shift @_;
    #bless somehow
}
Hope this clears some confusion.
There are modules that make OOP in Perl easier. The most important is Moose:
use strict; use warnings;
package SomeObject;
use Moose; # this is now a Moose class
# declare some members. Note that everything is "public"
has table_header => (
is => 'ro', # read-only access
);
has body_content => (
is => 'rw', # read-write access
);
# a "new" method is autogenerated
# some method that uses these fields.
# Note that the members can only be accessed via methods.
# This guards against typos that can't be easily caught with hashes.
sub display {
    my ($self) = @_;
    my $underline = "=" x (length $self->table_header);
    return $self->table_header . "\n" . $underline . "\n\n" . $self->body_content . "\n";
}
package main;
# the "new" takes keyword arguments
my $instance = SomeObject->new(
table_header => "This is a header",
body_content => "Some body content",
);
$instance->body_content("Different content"); # set a member
print $instance->display;
# This is a header
# ================
#
# Different content
If you get to know Moose, you will find an object system that is far more flexible than that in Java or C++, as it takes ideas from Perl6 and the Common Lisp Object System. Of course, this is fairly ugly, but it works well in practice.
Because of the way Perl OOP works, it isn't possible to have the instance members accessible as variables on their own. Well, almost. There is the experimental mop module which does exactly that.
use strict; use warnings;
use mop;
class SomeObject {
# Instance variables start with $!..., and behave like ordinary variables
# If you make them externally accessible with "is ro" or "is rw", then
# appropriate accessor methods are additionally generated.
# a private member with public read-only accessor,
# which has to be initialized in the constructor.
has $!table_header is ro = die 'Please specify a "table_header"!';
# a private member with public read-write accessor,
# which is optional.
has $!body_content is rw = "";
# new is autogenerated, as in Moose
method display() {
# arguments are handled automatically, so we could also do $self->table_header.
my $underline = "=" x (length $!table_header);
return "$!table_header\n$underline\n\n$!body_content\n";
}
}
# as seen in Moose
my $instance = SomeObject->new(
table_header => "This is a header",
body_content => "Some body content",
);
$instance->body_content("Different content"); # set a member, as in Moose
print $instance->display;
# This is a header
# ================
#
# Different content
Although it has pretty syntax, don't use mop right now for serious projects and stick to Moose instead. If Moose is too heavyweight for you, then you might enjoy lighter alternatives like Mouse or Moo (these three object systems are mostly compatible with each other).
You are getting confused between hashes and hash references. You are also forgetting that the first parameter to any method is the object reference or the name of the package. Perl constructors are inherited like any other method, so you must bless the new object into the correct package for polymorphism to work properly. This code is what you intended
sub new {
    my $package = shift;
    my %self;
    $self{table_header} = shift;
    $self{body_content} = shift;
    bless \%self, $package;
}
I am not clear what you mean by “directly reference the data member”, but if you hoped that you could avoid writing $self everywhere so that every variable was implicitly an element of the hash then you cannot. Perl is far more flexible than most languages, and can use any blessed reference as an object instance. It is most common to use a hash, but occasionally a reference to an array, a scalar, or even a file handle is more appropriate. The cost of this flexibility is specifying exactly when you are referring to a member of the blessed hash. I don't see that it's too great a burden.
You can always write your code more concisely. The method above can be written
sub new {
    my $package = shift;
    my %self;
    @self{qw/ table_header body_content /} = @_;
    bless \%self, $package;
}
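For completeness, here is a sketch of how such a constructor might sit in a small class with hand-written accessors. The class name My::Table and the accessor style are assumptions for illustration, not something the question specified:

```perl
use strict;
use warnings;

package My::Table;   # hypothetical class name

sub new {
    my $package = shift;
    my %self;
    # Hash slice: assigns the positional arguments to the two fields in order.
    @self{qw/ table_header body_content /} = @_;
    return bless \%self, $package;
}

# Read-only accessor.
sub table_header { $_[0]{table_header} }

# Read-write accessor: sets the field only when an argument is given.
sub body_content {
    my $self = shift;
    $self->{body_content} = shift if @_;
    return $self->{body_content};
}

package main;

my $table = My::Table->new('This is a header', 'Some body content');
$table->body_content('Different content');
print $table->table_header, "\n", $table->body_content, "\n";
```

The accessors keep the string keys in one place, so a typo in a field name shows up as a missing method instead of a silently created hash entry.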

Perl: referencing/blessing question

The idea is to implement a class that gets a list of [arrays, Thread::Conveyor queues and other stuff] in a TIEHASH constructor,
use AbstractHash;
tie(%DATA, 'AbstractHash', \@a1, \@a2, \$tcq);
What is the correct way to pass object references (like the mentioned Thread::Conveyor objects) and array references into the constructor, so that it can access the objects? Are there any cases when a passed object should be blessed?
As far as I can tell, objects are not objects unless they're bless-ed.
That said, the constructor argument would simply be an arrayref of Thread::Conveyor objects:
my $data = AbstractHash->tie ( \@a1, \@a2, \$tcq );
where the constructor is defined in the AbstractHash package:
sub tie {
    my $class = shift; # Implicit variable, don't forget
    my $data = {
        someArray => +shift,
        queues => +shift,
        someValue => +shift,
    };
    # $data starts life as a plain hashref, make it an 'AbstractHash'
    bless $data, $class; # $data is now a blessed hashref
    return $data; # AbstractHash object returned
}
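When the built-in tie function is used as in the question, Perl calls the class's TIEHASH method with the class name followed by the extra arguments. A minimal sketch, assuming a hypothetical class My::TiedView and a plain hashref standing in for a queue object (an object is already a reference, so it can be passed as-is):

```perl
use strict;
use warnings;

package My::TiedView;   # hypothetical class; the question's is AbstractHash

# tie %h, 'My::TiedView', LIST ends up calling My::TiedView->TIEHASH(LIST).
sub TIEHASH {
    my ($class, @sources) = @_;
    # Keep the references as given; the referents are shared, not copied.
    return bless { sources => \@sources, data => {} }, $class;
}

sub STORE { my ($self, $key, $value) = @_; $self->{data}{$key} = $value }
sub FETCH { my ($self, $key) = @_; return $self->{data}{$key} }

# Hypothetical extra method: how many sources the constructor received.
sub source_count { scalar @{ $_[0]{sources} } }

package main;

my @a1    = (1, 2);
my @a2    = (3);
my $queue = { name => 'stand-in for a queue object' };

my %DATA;
my $obj = tie %DATA, 'My::TiedView', \@a1, \@a2, $queue;   # tie returns the object

$DATA{answer} = 42;    # goes through STORE
push @a1, 'later';     # visible through the stored reference
```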

Object-Oriented Perl constructor syntax and named parameters

I'm a little confused about what is going on in Perl constructors. I found these two examples in perldoc perlbot.
package Foo;
#In Perl, the constructor is just a subroutine called new.
sub new {
    #I don't get what this line does at all, but I always see it. Do I need it?
    my $type = shift;
    #I'm turning the array of inputs into a hash, called parameters.
    my %params = @_;
    #I'm making a new hash called $self to store my instance variables?
    my $self = {};
    #I'm adding two values to the instance variables called "High" and "Low".
    #But I'm not sure how $params{'High'} has any meaning, since it was an
    #array, and I turned it into a hash.
    $self->{'High'} = $params{'High'};
    $self->{'Low'} = $params{'Low'};
    #Even though I read the page on bless, I still don't get what it does.
    bless $self, $type;
}
And another example is:
package Bar;
sub new {
    my $type = shift;
    #I still don't see how I can just turn an array into a hash and expect things
    #to work out for me.
    my %params = @_;
    my $self = [];
    #Exactly where did params{'Left'} and params{'Right'} come from?
    $self->[0] = $params{'Left'};
    $self->[1] = $params{'Right'};
    #and again with the bless.
    bless $self, $type;
}
And here is the script that uses these objects:
package main;
$a = Foo->new( 'High' => 42, 'Low' => 11 );
print "High=$a->{'High'}\n";
print "Low=$a->{'Low'}\n";
$b = Bar->new( 'Left' => 78, 'Right' => 40 );
print "Left=$b->[0]\n";
print "Right=$b->[1]\n";
I've injected the questions/confusion that I've been having into the code as comments.
To answer the main thrust of your question, since a hash can be initialized as a list of key => value pairs, you can send such a list to a function and then assign @_ to a hash. This is the standard way of doing named parameters in Perl.
For example,
sub foo {
    my %stuff = @_;
    ...
}
foo( beer => 'good', vodka => 'great' );
This will result in %stuff in subroutine foo having a hash with two keys, beer and vodka, and the corresponding values.
Now, in OO Perl, there are some additional wrinkles. Whenever you use the arrow (->) operator to call a method, whatever was on the left side of the arrow is stuck onto the beginning of the @_ array.
So if you say Foo->new( 1, 2, 3 );
Then inside your constructor, @_ will look like this: ( 'Foo', 1, 2, 3 ).
So we use shift, which without an argument operates on @_ implicitly, to get that first item out of @_, and assign it to $type. After that, @_ has just our name/value pairs left, and we can assign it directly to a hash for convenience.
We then use that $type value for bless. All bless does is take a reference (in your first example a hash ref) and say "this reference is associated with a particular package." Alakazzam, you have an object.
Remember that $type contains the string 'Foo', which is the name of our package. If you don't specify a second argument to bless, it will use the name of the current package, which will also work in this example but will not work for inherited constructors.
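The inheritance point can be checked with a small sketch. All four package names below are made up for the comparison:

```perl
use strict;
use warnings;

package Base::OneArg;
sub new {
    my $type = shift;
    my $self = {};
    return bless $self;          # one argument: always the current package
}

package Base::TwoArg;
sub new {
    my $type = shift;
    my $self = {};
    return bless $self, $type;   # two arguments: honours the invocant
}

package Child::OneArg;
our @ISA = ('Base::OneArg');     # inherits new

package Child::TwoArg;
our @ISA = ('Base::TwoArg');     # inherits new

package main;

my $one = Child::OneArg->new;
my $two = Child::TwoArg->new;
print ref($one), "\n";   # Base::OneArg
print ref($two), "\n";   # Child::TwoArg
```

With the one-argument form the child class silently gets a parent-class object, so any methods the child overrides would never be found.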
.1. In Perl, the constructor is just a subroutine called new.
Yes, by convention new is a constructor. It may also perform initialization or not. new should return an object on success or throw an exception (die/croak) if an error has occurred that prevents object creation.
You can name your constructor anything you like, have as many constructors as you like, and even build bless objects into any name space you desire (not that this is a good idea).
.2. I don't get what my $type = shift; does at all, but I always see it. Do I need it?
shift with no arguments takes an argument off the head of @_ and assigns it to $type. The -> operator passes the invocant (left hand side) as the first argument to the subroutine. So this line gets the class name from the argument list. And, yes, you do need it.
.3. How does an array of inputs become the %params hash? my %params = @_;
Assignment into a hash is done in list context, with pairs of list items being grouped into key/value pairs. So %foo = (1, 2, 3, 4); creates a hash such that $foo{1} == 2 and $foo{3} == 4. This is typically done to create named parameters for a subroutine. If the sub is passed an odd number of arguments, a warning will be generated if warnings are enabled.
.4. What does my $self = {}; do?
This line creates an anonymous hash reference and assigns it to the lexically scoped variable $self. The hash reference will store the data for the object. Typically, the keys in the hash have a one-to-one mapping to the object attributes. So if class Foo has attributes 'size' and 'color', if you inspect the contents of a Foo object, you will see something like $foo = { size => 'm', color => 'black' };.
.5. Given $self->{'High'} = $params{'High'}; where does $params{'High'} come from?
This code relies on the arguments passed to new. If new was called like Foo->new( High => 46 ), then the hash created as per question 3 will have a value for the key High (46). In this case it is equivalent to saying $self->{High} = 46. But if the method is called like Foo->new() then no value will be available, and we have $self->{High} = undef.
.6. What does bless do?
bless takes a reference and associates it with a particular package, so that you can use it to make method calls. With one argument, the reference is associated with the current package. With two arguments, the second argument specifies the package to associate the reference with. It is best to always use the two argument form, so that your constructors can be inherited by a sub class and still function properly.
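Points 3 and 5 can be tried outside of any class. In this sketch, describe is a hypothetical sub that takes named parameters as a flat key/value list:

```perl
use strict;
use warnings;

# A flat list assigned to a hash is grouped into key/value pairs.
my %foo = (1, 2, 3, 4);          # $foo{1} == 2 and $foo{3} == 4

# Hypothetical sub taking named parameters.
sub describe {
    my %params = @_;
    # A key that was not passed is simply absent, so its value reads as undef.
    my $high = exists $params{High} ? $params{High} : 'not given';
    my $low  = exists $params{Low}  ? $params{Low}  : 'not given';
    return "High=$high Low=$low";
}

my $both = describe(High => 42, Low => 11);   # "High=42 Low=11"
my $one  = describe(High => 46);              # "High=46 Low=not given"
```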
Finally, I'll rewrite your hash based object constructor as I would write it using classical OO Perl.
package Foo;
use strict;
use warnings;
use Carp qw(croak);
sub new {
    my $class = shift;
    croak "Illegal parameter list has odd number of values"
        if @_ % 2;
    my %params = @_;
    my $self = {};
    bless $self, $class;
    # This could be abstracted out into a method call if you
    # expect to need to override this check.
    for my $required (qw{ name rank serial_number }) {
        croak "Required parameter '$required' not passed to '$class' constructor"
            unless exists $params{$required};
    }
    # initialize all attributes by passing arguments to accessor methods.
    for my $attrib ( keys %params ) {
        croak "Invalid parameter '$attrib' passed to '$class' constructor"
            unless $self->can( $attrib );
        $self->$attrib( $params{$attrib} );
    }
    return $self;
}
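The constructor above relies on accessor methods that are not shown. One way to supply them, sketched here under the assumption that simple read-write accessors are enough, is to generate them in a loop; the class name My::Soldier and the sample values are made up:

```perl
use strict;
use warnings;

package My::Soldier;   # hypothetical class name
use Carp qw(croak);

# Generate one simple read-write accessor per attribute.
for my $attrib (qw( name rank serial_number )) {
    no strict 'refs';
    *{ __PACKAGE__ . "::$attrib" } = sub {
        my $self = shift;
        $self->{$attrib} = shift if @_;
        return $self->{$attrib};
    };
}

sub new {
    my $class = shift;
    croak "Illegal parameter list has odd number of values" if @_ % 2;
    my %params = @_;
    my $self = bless {}, $class;

    for my $required (qw( name rank serial_number )) {
        croak "Required parameter '$required' not passed to '$class' constructor"
            unless exists $params{$required};
    }
    for my $attrib (keys %params) {
        croak "Invalid parameter '$attrib' passed to '$class' constructor"
            unless $self->can($attrib);
        $self->$attrib($params{$attrib});
    }
    return $self;
}

package main;

my $ok = My::Soldier->new(name => 'Pat', rank => 'Private', serial_number => 7);

# Leaving out a required parameter makes the constructor croak.
my $error = eval { My::Soldier->new(name => 'Pat', rank => 'Private'); 1 }
    ? '' : $@;
```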
Your question is not about OO Perl. You are confused about data structures.
A hash can be initialized using a list or array:
my @x = ('High' => 42, 'Low' => 11);
my %h = @x;
use Data::Dumper;
print Dumper \%h;
$VAR1 = {
'Low' => 11,
'High' => 42
};
When you invoke a method on a blessed reference, the reference is prepended to the argument list the method receives:
#!/usr/bin/perl
package My::Mod;
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Indent = 0;
sub new { bless [] => shift }
sub frobnicate { Dumper(\@_) }
package main;
use strict;
use warnings;
my $x = My::Mod->new;
# invoke instance method
print $x->frobnicate('High' => 42, 'Low' => 11);
# invoke class method
print My::Mod->frobnicate('High' => 42, 'Low' => 11);
# call sub frobnicate in package My::Mod
print My::Mod::frobnicate('High' => 42, 'Low' => 11);
Output:
$VAR1 = [bless( [], 'My::Mod' ),'High',42,'Low',11];
$VAR1 = ['My::Mod','High',42,'Low',11];
$VAR1 = ['High',42,'Low',11];
Some points that haven't been dealt with yet:
In Perl, the constructor is just a subroutine called new.
Not quite. Calling the constructor new is just a convention. You can call it anything you like. There is nothing special about that name from perl's point of view.
bless $self, $type;
Both of your examples don't return the result of bless explicitly. I hope that you know that they do so implicitly anyway.
If you assign an array to a hash, perl treats alternating elements in the array as keys and values. Your array is looked at like
my @array = (key1, val1, key2, val2, key3, val3, ...);
When you assign that to %hash, you get
my %hash = @array;
# %hash = ( key1 => val1, key2 => val2, key3 => val3, ...);
Which is another way of saying that in perl list/hash construction syntax, "," and "=>" mean the same thing (apart from "=>" also quoting a bareword on its left).
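A quick way to convince yourself, using the values from the question:

```perl
use strict;
use warnings;

my %with_commas = ('High', 42, 'Low', 11);
my %with_arrows = (High => 42, Low => 11);   # => quotes the bareword on its left

# Flatten the second hash back into a list, in sorted key order.
my @flat = map { ($_, $with_arrows{$_}) } sort keys %with_arrows;

print "@flat\n";   # High 42 Low 11
```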
In Perl, all arguments to subroutines are passed via the predefined array @_.
The shift removes and returns the first item from the @_ array. In Perl OO, this is the method invocant -- typically a class name for constructors and an object for other methods.
Hashes flatten to and can be initialized by lists. It's a common trick to emulate named arguments to subroutines. e.g.
Employee->new(name => 'Fred Flintstone', occupation => 'quarry worker');
Ignoring the class name (which is shifted off) the odd elements become hash keys and the even elements become the corresponding values.
The my $self = {} creates a new hash reference to hold the instance data. The bless function is what turns the normal hash reference $self into an object. All it does is add some metadata that identifies the reference as belonging to the class.
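What bless changes can be inspected with ref and the core Scalar::Util functions. A small sketch, with My::Employee as a made-up class:

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

package My::Employee;   # hypothetical class
sub new  { my $class = shift; my $self = { @_ }; return bless $self, $class }
sub name { $_[0]{name} }

package main;

my $plain  = { name => 'Fred Flintstone' };
my $object = My::Employee->new(name => 'Fred Flintstone');

print ref($plain), "\n";        # HASH
print ref($object), "\n";       # My::Employee
print reftype($object), "\n";   # HASH: the underlying data is still a hash
print defined(blessed($plain)) ? 'object' : 'plain', "\n";   # plain
print $object->name, "\n";      # Fred Flintstone
```

The blessed reference still behaves as a hash reference; what it gains is the ability to resolve method calls such as name through its package.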
Yes, I know that I'm being a bit of a necromancer here, but...
While all of these answers are excellent, I thought I'd mention Moose. Moose makes constructors easy (package Foo;use Moose; automatically provides a constructor called new (although the name "new" can be overridden if you'd like)) but doesn't take away any configurability if you need it.
Once I looked through the documentation for Moose (which is pretty good overall, and there are a lot more tutorial snippets around if you google appropriately), I never looked back.