Creating a hash that is read-only outside of a module, but read/write inside - hash

I am creating a module that has some fairly heavily nested hashes. The hash needs to be semi-regularly modified by the module, which unfortunately rules out using Map.
Generally, a branch of the nested hash will be returned to users of the module [1], and the simplest thing to do is to just return that nested hash, e.g.:
return %data{$branch}{$subbranch}
# ↪︎ %(subsubbranch1 => ... , subsubbranch2 => ... )
However, the nature of containers like arrays or hashes is that while you can make them read-only, the key/values can still be modified. The module users though should not actually modify those values for a number of reasons. Coercing to Map won't help, because if any of the values are also containers, they too will be modifiable.
My first thought was to subclass Hash (or otherwise make a custom Associative), but autovivification by default still goes to Hash. That, however, can be easily solved by overriding both AT-KEY and ASSIGN-KEY so that the AT-KEY returns an instance of the subclass if the key doesn't already exist:
class ProtectedHash is Hash {
has %!hash = ();
method EXISTS-KEY ($key) { %!hash{$key}:exists }
method ASSIGN-KEY ($key, \value) { %!hash{$key} = value }
method AT-KEY ($key) {
%!hash{$key} := ProtectedHash.new unless %!hash{$key}:exists;
%!hash{$key};
}
}
What I'd like to do is to fail if the ASSIGN-KEY (or the autovivification part of AT-KEY) is called from outside my module. I thought about using something like $?MODULE but that would be set at compile time and always be true. It looks like I can shimmy off of Backtrace a bit and check for the name of the file that called, but how consistent can I assume the call trace to those two functions?
For example, for ASSIGN-KEY I've got:
method ASSIGN-KEY ($key, \value) {
my #trace = Backtrace.new.list[3..*];
# The first three can be ignored:
# 0: code at ...Backtrace.pm6
# 1: method new at ...Backtrace.pm6
# 2: method AT-KEY at ...ThisFile.pm6
if/unless ??? {
%!hash{$key} = value
}
}
AT-KEY is normally called by the sub postcircumfix<{ }> (in which case #trace[0] can be ignored, and trace[1] would be the one of interest) but could also be, albeit rarely, called directly, in which case trace[0] is where I'd want to verify the file name.
Are there any other common ways in which AT-KEY or ASSIGN-KEY might be called? Or should check those two steps account for 99.9% of calls to those methods? [2]
[1] There are only a few subx4 branches that a user might want to manipulate, and so I figure it's best to provide them with the necessarily-slower .Hash method for when they really need it than to assume they always need a manipulable container. At times these may be called enough (particularly via a get-branch($foo){$subbranch}{$subsubbranch} pattern), that the addition overhead in creating a deepclone of the Hash becomes decently consequential.
[2] I'm not too concerned about preventing ANY access (although I'm certainly curious if that's possible purely via subclassing), because I'm sure that a fairly industrious coder could always figure something out, but I'd like to catch the most common ones as a way of saying "Can't touch this!" (cue the 90's music…) and provide an Awesome error message.

It's probably easier to achieve this by returning something wrapping the original Array or Hash, or alternatively using but to do a shallow copy and mix in to it (which means you retain the original type).
We can declare a role like this:
role Can'tTouchThis {
method AT-KEY(|) {
untouchable callsame
}
method ASSIGN-KEY(|) {
die "Cannot assign to this";
}
method AT-POS(|) {
untouchable callsame
}
method ASSIGN-POS(|) {
die "Cannot assign to this";
}
}
Where the sub untouchable is defined as:
multi untouchable(Positional \p) {
p but Can'tTouchThis
}
multi untouchable(Associative \a) {
a but Can'tTouchThis
}
multi untouchable(\o) {
o
}
Thus handling nested data structures by - on access - creating a read-only facade to those too.
Here's an example and some test cases to illustrate the effect:
class Example {
has %!foo = a => [ 1, 2, [ 3, 4] ], b => { c => { d => 42, e => 19 }, f => 100 };
method get($sym) {
untouchable %!foo{$sym}
}
}
given Example.new {
use Test;
# Positional cases
is .get('a')[0], 1;
is .get('a')[2][1], 4;
dies-ok { .get('a')[1] = 42 };
is .get('a')[1], 2;
# Associative cases
is .get('b')<c><d>, 42;
dies-ok { .get('b')<f> = 99 };
dies-ok { .get('b')<c><d> = 99 };
is .get('b')<f>, 100;
is .get('b')<c><d>, 42;
# Auto-viv also doesn't work
dies-ok { .get('a')[4]<a> = 99 };
dies-ok { .get('a')[4][0] = 99 };
}
Remove the untouchable call in the get method to see the majority of the tests here fail due to lack of protection.

The solution I ultimately employed served my needs, and I'm posting it here for those who may encounter similar situations. (The answer with role mixing unfortunately doesn't survive binding)
My ultimate approach was to worry the most about unintended editing. To protect against this, I created an Associative-type class called DB-Item that internally has a hash. The AT-KEY method returns the item from the hash if it exists, but ASSIGN-KEY and BIND-KEY simply immediately fail with an appropriate error message. The only other method is ADD-TO-DATABASE. That method handles adds leafs/branches depending on what it's passed (and in general end users should be wary of using all caps methods directly). Since branches can be of different lengths, this also greatly simplifies the initial DB creation:
class DB-Item does Associative {
has %!hash = ();
my $epitaph = "Modification of the database is not a good idea:\n" ~
" - Use .clone if you want to get a editable branch.\n" ~
" - If you really know what you're doing, use .ADD-TO-DATABASE";
method ADD-TO-DATABASE (*#branch) {
if #branch == 2 {
%!hash{#branch.head} = #branch.tail
}else{
%!hash{#branch.head} = DB-Item.new;
%!hash{#branch.head}.ADD-TO-DATABASE(#branch[1..*]);
}
}
method ASSIGN-KEY(|) is hidden-from-backtrace { die $epitaph }
method BIND-KEY(|) is hidden-from-backtrace { die $epitaph }
method EXISTS-KEY($key) { %!hash{$key}:exists }
method AT-KEY($key) { %!hash{$key}:exists ?? %!hash{$key} !! Nil }
method clone { ... }
}

Related

Return a base-class object from a derived-class object

I'm aware that Perl is not statically typed when I want to apply this mechanism to a Perl object of a derived class:
Say I have a base class B and a derived class D inheriting from B.
Also I have an object $obj that holds a D object.
A function Bf() is expecting a parameter of type B.
Obviously (by the rules of polymorphism) I can pass $obj to Bf() like Bf($obj), but unlike to a static-typed language Bf() will see the whole D object (and not just the elements of B).
Is there a (rather clean and simple) solution for this problem in Perl? The solution should "hide" the attributes (and methods) a B does not have from D in Bf(), not restricting modifications of the original B (which is D actually).
Adult Programmers only (added 2020-03-06)
OK, people wanted a more concrete description.
Unfortunately (as pointed out) the original program is highly complex and uses reflection-like mechanisms to generate getters, setters and formatters automatically, to I really can't give a minimum working example here, because it would not be minimal.
First I have a class MessageHandler that handle messages (no surprise!).
Then I have a function log_message($$$) that expects (among others) a MessageHandler object as first argument.
Then I have this hierarchy of classes (it's much more complex in reality):
MessageHandler
ControlMessageHandler (ISA: MessageHandler)
ControlMessageResponseHandler (ISA: ControlMessageHandler)
Now if log_message wants a MessageHandler I can pass a ControlMessageResponseHandler as it conforms to MessageHandler.
But doing so exposes all the attributes of ControlMessageResponseHandler to log_message that are non-existent in MessageHandler.
The danger is that log_message might (by mistake) access an attribute of ControlMessageResponseHandler that is not present in MessageHandler. To prevent errors I'd like to prevent that, or at least get some warning (like I would get in a statically-typed language as Eiffel).
Dirty Details inside
Just in case it matters, I'll sketch how my array objects are built (a lot of extra code would be needed for a working example):
First the array indices are allocated automatically like this:
use constant I_VERBOSITY => IS_NEXT->(); # verbosity level
use constant I_TAG => IS_NEXT->(); # additional tag
use constant I_TAG_STACK => IS_NEXT->(); # tag stack
use constant I_MSG_DEBUG => IS_NEXT->(); # handler for debug messages
...
use constant I_LAST => IS_LAST->(); # last index (must be last)
I_LAST is needed for inheritance.
The attributes are defines like this:
use constant ATTRIBUTES => (
['verbosity', I_VERBOSITY, undef],
['tag', I_TAG, \&Class::_format_string],
['tag_stack', I_TAG_STACK, undef],
['msg_debug', I_MSG_DEBUG, \&Class::_format_code],
...
);
The definition contains a hint how to format each attribute.
This information is used to set up formatters to format each attribute like this:
use constant FORMATTERS =>
(map { Class::_attribute_string($_->[0], $_->[1], undef, $_->[2]) }
ATTRIBUTES); # attribute formatters
Getters and setters are automatically defined like this:
BEGIN {
foreach (ATTRIBUTES) {
Class::_assign_gs_ai(__PACKAGE__, $_->[0], $_->[1]);
}
}
The constructor would use the following lines:
my $self = [];
$#$self = I_LAST;
$self->[I_VERBOSITY] = $verbosity;
...
And finally my object print routine goes like this:
sub as_string($)
{
my $self = shift;
my $a_sep = ', ';
return join($a_sep, map { $_->($self, $a_sep) } FORMATTERS);
}
With inheritance it looks like this:
sub as_string($)
{
my $self = shift;
my $a_sep = ', ';
return join($a_sep, $self->SUPER::as_string(),
map { $_->($self, $a_sep) } FORMATTERS);
}
I'm not sure what your problem is, although I think you took the long way to say "I have a function that expects a B object, and I want to pass it a D object."
If you only want objects of a certain exact type, don't accept anything else:
use Carp qw(croak);
sub Bf {
croak "Bad object! I only like B" unless ref $_[0] eq 'B';
...
}
But, that's a bad idea. A derived class should be just as good as the base class. The clean solution is to not care what type you get.
sub Bf {
croak "Bad object! Doesn't respond to foo!" unless $_[0]->can('foo');
...
}
Since this Bf method works with the base class, why would it look for something in some derived class it didn't know about? If the derived class has changed the interface and no longer acts like its parent, then maybe it's isn't a good fit for inheritance. There are many problems like this that are solved by a different architecture.
I think you'll have to come up with a concrete example where the derived class wouldn't work.
It sounds like for some reason you need your D object to behave like a B object, but at the same time not like a D object. As the existing answers and comments indicate, it's a very common to use a sub-class where the base class is expected, and most algorithms shouldn't care whether what you actually passed is D or B. The only reason I can think of why you would want otherwise is that D overrides (redefines) some methods in an incompatible way, and you want the methods from B instead.
package Dog;
sub new {
my ($class, %args) = #_;
return bless \%args, $class;
}
sub bark { print "Bark!\n"; }
package Dingo;
use parent 'Dog';
sub bark { print "...\n"; }
package main;
my $dingo = Dingo->new;
$dingo->bark; # "..."
(n.b., I've left off the recommended use strict; and use warnings; for terseness, they should be used in all packages)
You may be aware from reading perldoc perlootut and perldoc perlobj that an object in Perl is just a blessed reference of some sort; in the example above, we use a hash reference. If you are trying to get the "attributes" that only exist in B, I think you would have to write some sort of translation method. But, if you care about the methods that exist in B, all you have to do is re-bless it into the parent class.
my $dingo = Dingo->new;
$dingo->bark; # "..."
bless $dingo, "Dog";
$dingo->bark; # "Bark!"
Note that bless does not return a new reference, but modifies that reference in-place; if you want it to behave like a Dingo again, you have to bless it back.
Perhaps more conveniently you can define a method to create a copy for you and bless it into the appropriate class:
package Dog;
sub as_dog {
my ($self) = #_;
# The {} below create a shallow copy, i.e., a new reference
return bless { %{$self} }, __PACKAGE__;
}
#...
package main;
my $dingo = Dingo->new;
$dingo->bark; # ...
$dingo->as_dog->bark; # Bark!
$dingo->bark; # ...
While there doesn't seem to be a perfect solution, temporary "re-blessing" the object seems to get quite close to what is asked for:
sub Bf($) # expects a "B" object (or descendant of "B" (like "D"))
{
my $B = shift;
my $type = ref($B); # save original type
die "unexpected type $type" unless ($B->isa('B'));
bless $B, 'B'; # restrict to "B"'s features
$B->whatever(...);
#...
bless $B, $type; # restore original type
}

Mixing roles into callables

Theoretically, you can mix in a role into an object in runtime. So I am trying to do this with a function:
my &random-f = -> $arg { "Just $arg" };
say random-f("boo");
role Argable {
method argh() {
self.CALL-ME( "argh" );
}
}
&random-f does Argable;
say random-f.argh;
Within the role, I use self to refer to the already defined function, and CALL-ME to actually call the function within the role. However, this results in the following error:
Too few positionals passed; expected 1 argument but got 0
in block <unit> at self-call-me.p6 line 5
I really have no idea who's expecting 1 argument. Theoretically, it should be the CALL-ME function, but who knows. Eliminating the self. yields a different error: CALL-ME used at line 11. Adding does Callable to Argable (after putting self back) results in the same error. Can this be done? Any idea of how?
There's two things incorrect in your code:
say random-f.argh; # *call* random-f and then call .argh on the result
You want to call .argh on the Callable so:
say &random-f.argh;
Secondly, you should just be able to call self: you can tweak this in the signature of the .argh method:
method argh(&self:) {
So the final code becomes:
my &random-f = -> $arg { "Just $arg" };
say random-f("boo");
role Argable {
method argh(&self:) {
self( "argh" );
}
}
&random-f does Argable;
say &random-f.argh;

How to check whether an attribute of an object has a value?

I get an error such as "can't call method 'xxxx' on an undefined value" when attempting to check if an object has been created (by the perl module Bio::Perl).
Is there a general way of checking if an attribute has a value or not? I would have liked to do something like:
if ($the_object->the_attribute) {
But as long as the attribute is "undef", calling the method will only give me the error message. I have not been able to find a solution to this problem - which is real, because the object is created by the Bio::Perl module, and some attributes may or may not be set. Maybe I should add that I am not particularly perl-objects-savvy.
edit:
Below is a relevant part of my code. The get_sequence() function is in the Bio::Perl module. On line 13, how can I make sure that there is a value (sequence in this case) before checking the length of it?
my #msgs;
my #sequence_objects;
my $maxlength = 0;
for ( #contigs ) {
my $seq_obj;
try {
$seq_obj = get_sequence( 'genbank', $_ );
}
catch Bio::Root::Exception with {
push #msgs, "Nothing found for $_ ";
};
if ( $seq_obj ) {
my $seq_length = length( $seq_obj->seq );
if ( $seq_length > $maxlength ) {
$maxlength = $seq_length;
}
push #sequence_objects, $seq_obj;
}
}
...
if ($the_object->the_attribute) {
This checks if the return value of the method the_attribute is true. True means that it's not 0, the empty string q{} or undef.
But you said you want to know whether the object exists.
Let's go over some basics first.
# | this variable contains an object
# | this arrow -> tells Perl to call the method on the obj
# | | this is a method that is called on $the_object
# | | |
if ($the_object->the_attribute) {
# ( )
# the if checks the return value of the expression between those parenthesis
It looks like you're confusing a few things.
First, your $the_object is supposed to be an object. It probably came from a call like this:
my $the_object = Some::Class->new;
Or maybe it was returned from some other function call. Maybe some other object returned it.
my $the_object = $something_else->some_property_that_be_another_obj
Now the_attribute is a method (that's like a function) that returns a specific piece of data in your object. Depending on the implementation of the class (the building plan of the object), if that attribute is not set (initialized), it might either just return undef, or some other value.
But the error message you are seeing is not related to the_attribute. If it was, you'd just not call the code in the block. The if check would catch it, and decide to go to else, or do nothing if there is no else.
Your error message says you are trying to call a method on something that is undef. We know you are calling the the_attribute accessor method on $the_object. So $the_object is undef.
The easiest way to check if something has a true value is to just put it in an if. But you already seem to know that.
if ($obj) {
# there is some non-false value in $obj
}
You've now checked that $obj is something that is true. So it could be an object. So you could now call your method.
if ($obj && $obj->the_attribute) { ... }
This will check the true-ness of $obj and only continue if there is something in $obj. If not, it will never call the right hand side of the && and you will not get an error.
But if you want to know whether $obj is an object that has a method, you can use can. Remember that attributes are just accessor methods to values stored inside the object.
if ($obj->can('the_attribute')) {
# $obj has a method the_attribute
}
But that can blow up if $obj is not there.
If you're not sure that $obj is really an object, you can use the Safe::Isa module. It provides a method $_call_if_object1 that you can use to safely call your method on your maybe-object.
$maybe_an_object->$_call_if_object(method_name => #args);
Your call would translate to.
my $the_attribute = $obj->$_call_if_object('the_attribute');
if ($the_attribute) {
# there is a value in the_attribute
}
The same way you can use $_isa and $_can from Safe::Isa.
1) Yes, the method starts with a $, it's really a variable. If you want to learn more about how and why this works, watch the talk You did what? by mst.

Perl OOP attribute manipulation best practice

Assume the following code:
package Thing;
sub new {
my $this=shift;
bless {#_},$this;
}
sub name {
my $this=shift;
if (#_) {
$this->{_name}=shift;
}
return $this->{_name};
}
Now assume we've instantiated an object thusly:
my $o=Thing->new();
$o->name('Harold');
Good enough. We could also instantiate the same thing more quickly with either of the following:
my $o=Thing->new(_name=>'Harold'); # poor form
my $o=Thing->new()->name('Harold');
To be sure, I allowed attributes to be passed in the constructor to allow "friendly" classes to create objects more completely. It could also allow for a clone-type operator with the following code:
my $o=Thing->new(%$otherthing); # will clone attrs if not deeper than 1 level
This is all well and good. I understand the need for hiding attributes behind methods to allow for validation, etc.
$o->name; # returns 'Harold'
$o->name('Fred'); # sets name to 'Fred' and returns 'Fred'
But what this doesn't allow is easy manipulation of the attribute based on itself, such as:
$o->{_name}=~s/old/ry/; # name is now 'Harry', but this "exposes" the attribute
One alternative is to do the following:
# Cumbersome, not syntactically sweet
my $n=$o->name;
$n=~s/old/ry/;
$o->name($n);
Another potential is the following method:
sub Name :lvalue { # note the capital 'N', not the same as name
my $this=shift;
return $this->{_name};
}
Now I can do the following:
$o->Name=~s/old/ry/;
So my question is this... is the above "kosher"? Or is it bad form to expose the attribute that way? I mean, doing that takes away any validation that might be found in the 'name' method. For example, if the 'name' method enforced a capital first letter and lowercase letters thereafter, the 'Name' (capital 'N') bypasses that and forces the user of the class to police herself in the use of it.
So, if the 'Name' lvalue method isn't exactly "kosher" are there any established ways to do such things?
I have considered (but get dizzy considering) things like tied scalars as attributes. To be sure, it may be the way to go.
Also, are there perhaps overloads that may help?
Or should I create replacement methods in the vein of (if it would even work):
sub replace_name {
my $this=shift;
my $repl=shift;
my $new=shift;
$this->{_name}=~s/$repl/$new/;
}
...
$o->replace_name(qr/old/,'ry');
Thanks in advance... and note, I am not very experienced in Perl's brand of OOP, even though I am fairly well-versed in OOP itself.
Additional info:
I guess I could get really creative with my interface... here's an idea I tinkered with, but I guess it shows that there really are no bounds:
sub name {
my $this=shift;
if (#_) {
my $first=shift;
if (ref($first) eq 'Regexp') {
my $second=shift;
$this->{_name}=~s/$first/$second/;
}
else {
$this->{_name}=$first;
}
}
return $this->{_name};
}
Now, I can either set the name attribute with
$o->name('Fred');
or I can manipulate it with
$o->name(qr/old/,'ry'); # name is now Harry
This still doesn't allow stuff like $o->name.=' Jr.'; but that's not too tough to add. Heck, I could allow calllback functions to be passed in, couldn't I?
Your first code example is abolutely fine. This is a standard method to write accessors. Of course this can get ugly when doing a substitution, the best solution might be:
$o->name($o->name =~ s/old/ry/r);
The /r flag returns the result of the substitution. Equivalently:
$o->name(do { (my $t = $o->name) =~ s/old/ry/; $t });
Well yes, this 2nd solution is admittedly ugly. But I am assuming that accessing the fields is a more common operation than setting them.
Depending on your personal style preferences, you could have two different methods for getting and setting, e.g. name and set_name. (I do not think get_ prefixes are a good idea – 4 unneccessary characters).
If substituting parts of the name is a central aspect of your class, then encapsulating this in a special substitute_name method sounds like a good idea. Otherwise this is just unneccessary ballast, and a bad tradeoff for avoiding occasional syntactic pain.
I do not advise you to use lvalue methods, as these are experimental.
I would rather not see (and debug) some “clever” code that returns tied scalars. This would work, but feels a bit too fragile for me to be comfortable with such solutions.
Operator overloading does not help with writing accessors. Especially assignment cannot be overloaded in Perl.
Writing accessors is boring, especially when they do no validation. There are modules that can handle autogeneration for us, e.g. Class::Accessor. This adds generic accessors get and set to your class, plus specific accessors as requested. E.g.
package Thing;
use Class::Accessor 'antlers'; # use the Moose-ish syntax
has name => (is => 'rw'); # declare a read-write attribute
# new is autogenerated. Achtung: this takes a hashref
Then:
Thing->new({ name => 'Harold'});
# or
Thing->new->name('Harold');
# or any of the other permutations.
If you want a modern object system for Perl, there is a row of compatible implementations. The most feature-rich of these is Moose, and allows you to add validation, type constraints, default values, etc. to your attributes. E.g.
package Thing;
use Moose; # this is now a Moose class
has first_name => (
is => 'rw',
isa => 'Str',
required => 1, # must be given in constructor
trigger => \&_update_name, # run this sub after attribute is set
);
has last_name => (
is => 'rw',
isa => 'Str',
required => 1, # must be given in constructor
trigger => \&_update_name,
);
has name => (
is => 'ro', # readonly
writer => '_set_name', # but private setter
);
sub _update_name {
my $self = shift;
$self->_set_name(join ' ', $self->first_name, $self->last_name);
}
# accessors are normal Moose methods, which we can modify
before first_name => sub {
my $self = shift;
if (#_ and $_[0] !~ /^\pU/) {
Carp::croak "First name must begin with uppercase letter";
}
};
The purpose of class interface is to prevent users from directly manipulating your data. What you want to do is cool, but not a good idea.
In fact, I design my classes, so even the class itself doesn't know it's own structure:
package Thingy;
sub new {
my $class = shift;
my $name = shift;
my $self = {};
bless, $self, $class;
$self->name($name);
return $self;
}
sub name {
my $self = shift;
my $name = shift;
my $attribute = "GLUNKENSPEC";
if ( defined $name ) {
$self->{$attribute} = $name;
}
return $self->{$attribute};
}
You can see by my new constructor that I could pass it a name for my Thingy. However, my constructor doesn't know how I store my name. Instead, it merely uses my name method to set the name. As you can see by my name method, it stores the name in an unusual way, but my constructor doesn't need to know or care.
If you want to manipulate the name, you have to work at it (as you showed):
my $name = $thingy->name;
$name =~ s/old/ry/;
$thingy->name( $name );
In fact, a lot of Perl developers use inside out classes just to prevent this direct object manipulation.
What if you want to be able to directly manipulate a class by passing in a regular expression? You have to write a method to do this:
sub mod_name {
my $self = shift;
my $pattern = shift;
my $replacement = shift;
if ( not defined $replacement ) {
croak qq(Some basic error checking: Need pattern and replacement string);
}
my $name = $self->name; # Using my name method for my class
if ( not defined $name ) {
croak qq(Cannot modify name: Name is not yet set.);
}
$name = s/$pattern/$replacement/;
return $self->name($name);
}
Now, the developer can do this:
my $thingy->new( "Harold" );
$thingy->mod_name( "old", "new" );
say $thingy->name; # Says "Harry"
Whatever time or effort you save by allowing for direct object manipulation is offset by the magnitude of extra effort it will take to maintain your program. Most methods don't take more than a few minutes to create. If I suddenly got an hankering to manipulate my object in a new and surprising way, it's easy enough to create a new method to do this.
1. No. I don't actually use random nonsense words to protect my class. This is purely for demo purposes to show that even my constructor doesn't have to know how methods actually store their data.
I understand the need for hiding attributes behind methods to allow for validation, etc.
Validation is not the only reason, although it is the only one you refer to. I mention this because another is that encapsulation like this leaves the implementation open. For example, if you have a class which needs to have a string "name" which can be get and set, you could just expose a member, name. However, if you instead use get()/set() subroutines, how "name" is stored and represented internally doesn't matter.
That can be very significant if you write bunches of code with uses the class and then suddenly realize that although the user may be accessing "name" as a string, it would be much better stored some other way (for whatever reason). If the user was accessing the string directly, as a member field, you now either have to compensate for this by including code that will change name when the real whatever is changed and...but wait, how can you then compensate for the client code that changed name...
You can't. You're stuck. You now have to go back and change all the code that uses the class -- if you can. I'm sure anyone who has done enough OOP has run into this situation in one form or another.
No doubt you've read all this before, but I'm bringing it up again because there are a few points (perhaps I've misunderstood you) where you seem to outline strategies for changing "name" based on your knowledge of the implementation, and not what was intended to be the API. That is very tempting in perl because of course there is no access control -- everything is essential public -- but it is still a very very bad practice for the reason just described.
That doesn't mean, of course, that you can't simply commit to exposing "name" as a string. That's a decision and it won't be the same in all cases. However, in this particular case, if what you are particularly concerned with is a simple way to transform "name", IMO you might as well stick with a get/set method. This:
# Cumbersome, not syntactically sweet
Maybe true (although someone else might say it is simple and straightforward), but your primary concern should not be syntactic sweetness, and neither should speed of execution. They can be concerns, but your primary concern has to be design, because no matter how sweet and fast your stuff is, if it is badly designed, it will all come down around you in time.
Remember, "Premature optimization is the root of all evil" (Knuth).
So my question is this... is the above "kosher"? Or is it bad form to expose the attribute that way?
It boils down to: Will this continue to work if the internals change? If the answer is yes, you can do many other things including but not limited to validation.)
The answer is yes. This can be done by having the method return a magical value.
{
package Lvalue;
sub TIESCALAR { my $class = shift; bless({ #_ }, $class) }
sub FETCH { my $self = shift; my $m = $self->{getter}; $self->{obj}->$m(#_) }
sub STORE { my $self = shift; my $m = $self->{setter}; $self->{obj}->$m(#_) }
}
sub new { my $class = shift; bless({}, $class) }
sub get_name {
my ($self) = #_;
return $self->{_name};
}
sub set_name {
my ($self, $val) = #_;
die "Invalid name" if !length($val);
$self->{_name} = $val;
}
sub name :lvalue {
my ($self) = #_;
tie my $rv, 'Lvalue', obj=>$self, getter=>'get_name', setter=>'set_name';
return $rv;
}
my $o = __PACKAGE__->new();
$o->name = 'abc';
print $o->name, "\n"; # abc
$o->name = ''; # Invalid name

How do I implement a dispatch table in a Perl OO module?

I want to put some subs that are within an OO package into an array - also within the package - to use as a dispatch table. Something like this
package Blah::Blah;
use fields 'tests';
sub new {
my($class )= #_;
my $self = fields::new($class);
$self->{'tests'} = [
$self->_sub1
,$self->_sub2
];
return $self;
}
_sub1 { ... };
_sub2 { ... };
I'm not entirely sure on the syntax for this?
$self->{'tests'} = [
$self->_sub1
,$self->_sub2
];
or
$self->{'tests'} = [
\&{$self->_sub1}
,\&{$self->_sub2}
];
or
$self->{'tests'} = [
\&{_sub1}
,\&{_sub2}
];
I don't seem to be able to get this to work within an OO package, whereas it's quite straightforward in a procedural fashion, and I haven't found any examples for OO.
Any help is much appreciated,
Iain
Your friend is can. It returns a reference to the subroutine if it exists, null otherwise. It even does it correctly walking up the OO chain.
$self->{tests} = [
$self->can('_sub1'),
$self->can('_sub2'),
];
# later
for $tn (0..$#{$self->{tests}}) {
ok defined $self->{tests}[$tn], "Function $tn is available.";
}
# and later
my $ref = $self->{tests}[0];
$self->$ref(#args1);
$ref = $self->{tests}[1];
$self->$ref(#args2);
Or, thanks to this question (which happens to be a variation of this question), you can call it directly:
$self->${\$self->{tests}[0]}(#args1);
$self->${\$self->{tests}[1]}(#args1);
Note that the \ gives us a reference to a the subref, which then gets dereferenced by the ${} after $self->. Whew!
To solve the timeliness issue brain d foy mentions, an alternative would be to simply make the {test} a subroutine itself, that returns a ref, and then you could get it at exactly the time you need it:
sub tests {
return [
$self->can('_sub1'),
$self->can('_sub2')
];
}
and then use it:
for $tn (0..$#{$self->tests()}) {
...
}
Of course, if you have to iterate over the refs anyway, you might as well just go straight for passing the reference out:
for my $ref (0..$#{$self->tests()}) {
$self->$ref(#args);
}
Although Robert P's answer might work for you, it has the problem of fixing the dispatch very early in the process. I tend to resolve the methods as late as I can, so I would leave the things in the tests array as method names until you want to use them:
$self->{tests} = [
qw( _sub1 _sub2 )
];
The strength of a dynamic language is that you can wait as long as you like to decide what's going to happen.
When you want to run them, you can go through the same process that Robert already noted. I'd add an interface to it though:
foreach my $method_name ( $obj->get_test_methods )
{
$obj->$method_name();
}
That might even be better as not tying the test to an existing method name:
foreach my $method_name ( $obj->get_test_methods )
{
$obj->run_test_named( $method_name );
}
That run_test_named could then be your dispatcher, and it can be very flexible:
sub run_test_named
{
my( $self, $name ) = #_;
# do anything you want, like in Robert's answer
}
Some things you might want to do:
Run a method on an object
Pass the object as an argument to something else
Temporarily override a test
Do nothing
etc, etc
When you separate what you decide to do from its implementation, you have a lot more freedom. Not only that, the next time you call the same test name, you can do something different.
use lib Alpha;
my $foo = Alpha::Foo->new; # indirect object syntax is deprecated
$foo->bar();
my %disp_table = ( bar => sub { $foo->bar() } );
$disp_table{bar}->(); # call it
You need a closure because you want to turn a method call into an ordinary subroutine call, so you have to capture the object you're calling the method on.
There are a few ways to do this. Your third approach is closest. That will store a reference to the two subs in the array. Then when you want to call them, you have to be sure to pass them an object as their first argument.
Is there a reason you are using the use fields construct?
if you want to create self contained test subs, you could do it this way:
$$self{test} = [
map {
my $code = $self->can($_); # retrieve a reference to the method
sub { # construct a closure that will call it
unshift #_, $self; # while passing $self as the first arg
goto &$code; # goto jumps to the method, to keep 'caller' working
}
} qw/_sub1 _sub2/
];
and then to call them
for (#{ $$self{test} }) {
eval {$_->(args for the test); 1} or die $#;
}