Moo, lazy attributes, and default/coerce invocation - perl

My Moo based class has both lazy & non-lazy attributes which have both default and coerce subs. If I don't initialize the attributes I'm finding that both default and coerce subs are called for the normal attribute, but only default is called for the lazy attribute. That seems inconsistent. Here's sample code:
package Foo;
use Moo;
has nrml => ( is => 'ro',
default => sub { print "nrml default\n" },
coerce => sub { print "nrml coerce\n" }
);
has lazy => ( is => 'ro',
lazy => 1,
default => sub { print "lazy default\n" },
coerce => sub { print "lazy coerce\n" }
);
my $q = Foo->new( );
$q->lazy;
The output is:
nrml default
nrml coerce
lazy default
I only expect coerce to run if I provide a value in the constructor. More importantly I expect the same sequence of execution (either default or default and coerce) from both lazy and normal attributes.
So, are my expectations off, is this a bug, or what? Thanks!

Current status: fix shipped in 009014
One of those two is a bug.
In fact, thinking about it, one could argue either way about whether coercions -should- be fired on defaults but since Moose does do so, and since coercions are structural (unlike type checks, which are often used for assertion-like things and should always pass except in the presence of a bug), I think it falls that way.
... in fact, the problem is that Method::Generate::Accessor when it fires _use_default always wraps it in _generate_simple_set, when it's _generate_set that provides the isa+coerce+trigger wrapping - and I'm fairly sure that Moose fires all three when it's applying a default, so we need to too.
It's not an entirely trivial fix to make though, because I didn't parameterise _generate_set to take a value indicating how to generate the value to set. I'll try and sort it out tomorrow since I'm planning to cut a release then.
If you want support for Moo from the developers, please contact bugs-Moo@rt.cpan.org or join #web-simple on irc.perl.org - it's sheer luck that somebody on the IRC channel saw this question and asked about it :)

That would qualify as a bug to me. Either the value from default is expected to be of the right type, or it's not. Having and enforcing the expectation only half of the time makes no sense.
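A small check of the behaviour discussed above, as a sketch only: it assumes a Moo release that includes the fix, and the class name Demo and the trimming coercion are made up for illustration. Instead of printing from the subs, the coercion changes the value, so you can see whether it ran on the default.

```perl
package Demo;
use Moo;

# Hypothetical coercion: strip leading and trailing whitespace.
my $trim = sub { my ($value) = @_; $value =~ s/^\s+|\s+$//g; return $value };

has nrml => (
    is      => 'ro',
    default => sub { '  padded  ' },
    coerce  => $trim,
);

has lazy => (
    is      => 'ro',
    lazy    => 1,
    default => sub { '  padded  ' },
    coerce  => $trim,
);

package main;

my $obj = Demo->new;

# If coerce fires on the default for both attributes, both values are trimmed.
printf "nrml=[%s] lazy=[%s]\n", $obj->nrml, $obj->lazy;
```

If the lazy value still shows its padding, the installed Moo predates the fix.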


Why is the "lazy_build" feature in Moose discouraged?

The documentation for the lazy_build feature in Moose has this to say:
Note that use of this feature is strongly discouraged. Some documentation used to encourage use of this feature as a best practice, but we have changed our minds.
However, it does not explain what the reasoning for this is, and either my google-fu is terrible or there is no public explanation for why this is discouraged.
What's the problem with lazy_build that makes it discouraged today?
This is in Moose::Manual::BestPractices:
Avoid lazy_build
As described above, you rarely actually need a clearer or a predicate. lazy_build adds both to your public API, which exposes you to use cases that you must now test for. It's much better to avoid adding them until you really need them - use explicit lazy and builder options instead.
So what it's saying is that instead of using the property:
has attribute => (
...,
lazy_build => 1, # implies lazy, a builder named _build_attribute, a clearer and a predicate
);
You should instead be more explicit:
has attribute => (
...,
lazy => 1,
builder => '_build_attribute',
);
As that doesn't implicitly add clearer and predicate methods.
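To see the difference in the public API, here is a minimal sketch comparing the two declarations. The class names WithLazyBuild and Explicit are invented for the example; the generated method names follow Moose's documented lazy_build naming (clear_attribute, has_attribute).

```perl
package WithLazyBuild;
use Moose;

has attribute => (is => 'ro', lazy_build => 1);
sub _build_attribute { 42 }

package Explicit;
use Moose;

has attribute => (is => 'ro', lazy => 1, builder => '_build_attribute');
sub _build_attribute { 42 }

package main;

# Report which of the extra methods each class ended up with.
for my $class (qw(WithLazyBuild Explicit)) {
    for my $method (qw(clear_attribute has_attribute)) {
        printf "%-13s %-15s %s\n", $class, $method,
            $class->can($method) ? 'yes' : 'no';
    }
}
```

Both classes build the value lazily; only the lazy_build version also exposes the clearer and predicate.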

make object instance immutable

I want to be able to instantiate a Moose based object add to it until I serialize it and then I want to make it unchangeable. How can/should I go about doing this?
I would make two classes and a common Role:
package Thing;
use Moose::Role;
has some_attrib => (isa => 'AnotherThing');
### Behaviour (the important stuff) goes here
package ImmutableThing;
use Moose;
with 'Thing';
has '+some_attrib' => (is => 'ro');
sub finalize { shift }
package MutableThing;
use Moose;
with 'Thing';
has '+some_attrib' => (is => 'rw');
sub finalize {
    my $self = shift;
    # Thing is a role and cannot be instantiated, so build the read-only class
    ImmutableThing->new({some_attrib => $self->some_attrib});
}
I'm not sure that having mutable and immutable forms of the same class is necessarily a good idea though. I tend to try and think about build time and operation time as two distinct phases with different interfaces.
I would be more inclined to write a Parameter Collector (I've capitalised it like it's a pattern, but I've not seen it in the literature) that has an interface optimised to gathering the info needed to create a Thing, and the Thing Itself, which is the object that's used by the rest of the program.
I don't know of (and can't easily find) any modules to do this on CPAN which is surprising but explains why you are asking :-)
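A rough sketch of the Parameter Collector idea in plain Perl, under stated assumptions: ThingCollector and its set/finalize methods are hypothetical names, and locking the finished object's hash with the core Hash::Util module is one possible way to refuse later writes, not something the answer above prescribes.

```perl
use strict;
use warnings;

package Thing;
use Hash::Util qw(lock_hash);

sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    bless $self, $class;
    lock_hash(%$self);    # later writes to the hash now die
    return $self;
}

sub some_attrib { $_[0]{some_attrib} }    # reader only, no writer

package ThingCollector;

sub new { my ($class) = @_; return bless { params => {} }, $class }

# Gather parameters during the build phase; returns $self for chaining.
sub set {
    my ($self, $key, $value) = @_;
    $self->{params}{$key} = $value;
    return $self;
}

# Hand the collected parameters to the real object.
sub finalize {
    my ($self) = @_;
    return Thing->new(%{ $self->{params} });
}

package main;

my $thing = ThingCollector->new
    ->set(some_attrib => 'first')
    ->set(some_attrib => 'final')
    ->finalize;

print $thing->some_attrib, "\n";

my $ok = eval { $thing->{some_attrib} = 'changed'; 1 };
print $ok ? "modified\n" : "write refused\n";
```

The collector can be changed as often as needed; the Thing it produces is only ever read.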
A "before" modifier over all your attributes is the obvious way to go about it. I'm sure there's a suitable meta-programming way to get a list of all attribute accessors and apply the modifier, but I'd be tempted to explicitly list them all with a big comment.
Have you considered whether you have one class or two here (Thingy, LockedThingy)? Two classes would let you encapsulate the meta cleverness if you're that way inclined.
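The "before" modifier approach could look roughly like this. It is a sketch with made-up names (Thingy's name and size attributes, a locked flag, a finalize method), and it takes the explicit-list route rather than the meta-programming one.

```perl
package Thingy;
use Moose;

has [qw(name size)] => (is => 'rw');

# Flag flipped by finalize(); the private writer keeps it out of the public API.
has locked => (is => 'ro', writer => '_set_locked', default => 0);

# EXPLICIT LIST of every read-write accessor: keep in sync with the attributes above.
before [qw(name size)] => sub {
    my ($self, @args) = @_;
    # Arguments mean the accessor is being used as a setter.
    die "Thingy is locked\n" if @args && $self->locked;
};

sub finalize {
    my ($self) = @_;
    $self->_set_locked(1);
    return $self;
}

package main;

my $thingy = Thingy->new(name => 'draft');
$thingy->size(3);          # fine, not locked yet
$thingy->finalize;

my $changed = eval { $thingy->size(4); 1 };
print $changed ? "changed\n" : "refused: $@";
print "size is still ", $thingy->size, "\n";
```

Reads keep working after finalize because the modifier only dies when arguments are passed.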

Moose traits on type unions

In Moose v1.x, I used to be able to do this:
package Class;
use Test::More tests => 1;
use Moose;
use MooseX::Types::Moose qw/Undef Str/;
eval {
has 'trait_boom' => (
is => 'rw'
, isa => Str | Undef
, default => ''
, traits => ['String']
);
};
ok ( !$@, "Created attr trait_boom, a type union of Str and Undef\n$@" );
However, it no longer works with Moose 2.x. I assume this is a bug. Why did Moose break backwards compatibility? Is there another way to get this job done? I want that to be either Undef or a Str. I do not want to coerce Undef to an empty string though.
I'm only asking here because apparently magnet is broke
17:43 [perl2] -!- ERROR Closing Link: 64.200.109.13 (Banned)
I would guess this was changed in Moose 2.0300, Fri, Sep 23, 2011:
The ->is_subtype_of and ->is_a_type_of methods have changed their behavior
for union types. Previously, they returned true if any of their member
types returned true for a given type. Now, all of the member types must
return true. RT #67731. (Dave Rolsky)
Have you tried Maybe[Str] instead of Str | Undef?
As we told you on MagNet right after I reported you for ban evasion, this is not a bug. The trait's methods should never have worked against the value Undef, so allowing this behavior to work in 1.x was the bug. Moose has always optimized for correct behavior and never promised bug compat between versions.
You will either need to write your own traits or write the methods by hand to deal with this situation.
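Writing the methods by hand might look like the following sketch. The class name Document, the attribute text and the method append_text are invented for illustration, and what an append should do when the value is undef is a policy choice: here it simply stores the suffix, which is an assumption of this example rather than anything Moose dictates.

```perl
package Document;
use Moose;

# Either a string or undef, with no coercion and no native trait.
has text => (is => 'rw', isa => 'Maybe[Str]');

# Hand-written stand-in for the String trait's append method.
sub append_text {
    my ($self, $suffix) = @_;
    my $current = $self->text;
    # Assumption: appending to undef just stores the suffix.
    $self->text(defined $current ? $current . $suffix : $suffix);
    return $self->text;
}

package main;

my $doc = Document->new;
print defined $doc->text ? "defined\n" : "undef\n";
$doc->append_text('foo');
$doc->append_text('bar');
print $doc->text, "\n";
```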

Pluggable/dynamic data processing/munging/transforming perl module?

Cross-posted from perlmonks:
I have to clean up some gross, ancient code at $work,
and before I try to make a new module I'd love to use an existing one if anyone knows of something appropriate.
At runtime I am parsing a file to determine what processing I need to do on a set of data.
If I were to write a module I would try to do it more generically (non-DBI-specific), but my exact use case is this:
I read a SQL file to determine the query to run against the database.
I parse comments at the top and determine that
column A needs to have a s/// applied,
column B needs to be transformed to look like a date of given format,
column C gets a sort of tr///.
Additionally things can be chained so that column D might s///, then say if it isn't 1 or 2, set it to 3.
So when fetching from the db the program applies the various (possibly stacked) transformations before returning the data.
Currently the code is a disgustingly large and difficult series of if clauses
processing hideously difficult to read or maintain arrays of instructions.
So what I'm imagining is perhaps an object that will parse those lines
(and additionally expose a functional interface),
stack up the list of processors to apply,
then be able to execute it on a passed piece of data.
Optionally there could be a name/category option,
so that one object could be used dynamically to stack processors only for the given name/category/column.
A traditionally contrived example:
$obj = $module->new();
$obj->parse("-- greeting:gsub: /hi/hello"); # don't say "hi"
$obj->parse("-- numbers:gsub: /\D//"); # digits only
$obj->parse("-- numbers:exchange: 1,2,3 one,two,three"); # then spell out the numbers
$obj->parse("-- when:date: %Y-%m-%d 08:00:00"); # format like a date, force to 8am
$obj->stack(action => 'gsub', name => 'when', format => '/1995/1996/'); # my company does not recognize the year 1995.
$cleaned = $obj->apply({greeting => "good morning", numbers => "t2", when => "2010116"});
Each processor (gsub, date, exchange) would be a separate subroutine.
Plugins could be defined to add more by name.
$obj->define("chew", \&CookieMonster::chew);
$obj->parse("column:chew: 3x"); # chew the column 3 times
So the obvious first question is, does anybody know of a module out there that I could use?
About the only thing I was able to find so far is Hash::Transform,
but since I would be determining which processing to do dynamically at runtime
I would always end up using the "complex" option and I'd still have to build the parser/stacker.
Is anybody aware of any similar modules or even a mildly related module that I might want to utilize/wrap?
If there's nothing generic out there for public consumption (surely mine is not the only one in the darkpan),
does anybody have any advice for things to keep in mind or interface suggestions or even other possible uses
besides munging the return of data from DBI, Text::CSV, etc?
If I end up writing a new module, does anybody have namespace suggestions?
I think something under Data:: is probably appropriate...
the word "pluggable" keeps coming to mind because my use case reminds me of PAM,
but I really don't have any good ideas...
Data::Processor::Pluggable ?
Data::Munging::Configurable ?
I::Chew::Data ?
First I'd try to place as much of the formatting as possible in the SQL queries.
Things like date format etc. definitely should be handled in SQL.
Off the top of my head, one module I know that could be used for your purpose is Data::FormValidator. Although it is mainly aimed at validating CGI parameters, it has the functionality you need: you can define filters and constraints and chain them in various ways. That doesn't mean there are no other modules for your purpose; I just don't know of any.
Or you can do something like what you already hinted at. You could define some sort of command classes and chain them on the various data inputs. I'd do something along these lines:
package MyDataProcessor;
use Moose;
has 'Transformations' => (
traits => ['Array'],
is => 'rw',
isa => 'ArrayRef[MyTransformer]',
handles => {
add_transformer => 'push',
}
);
has 'input' => (is => 'rw', isa => 'Str');
sub apply_transforms { }
package MyRegexTransformer;
use Moose;
extends 'MyTransformer';
has 'Regex' => (is => 'rw', isa => 'Str');
has 'Replacement' => (is => 'rw', isa => 'Str');
sub transform { }
# some other transformers
#
# somewhere else
#
#
my $processor = MyDataProcessor->new(input => 'Hello transform me');
my $tr = MyRegexTransformer->new(Regex => 'Hello', Replacement => 'Hi');
$processor->add_transformer($tr);
#...
$processor->apply_transforms;
I'm not aware of any data transform CPAN modules, so I've had to roll my own for work. It was significantly more complicated than this, but operated under a similar principle; it was basically a poor man's implementation of Informatica-style ETL sans the fancy GUI... the configuration was Perl hashes (Perl instead of XML since it allowed me to implement certain complex rules as subroutine references).
As far as namespace, I'd go for Data::Transform::*
Thanks to everyone for their thoughts.
The short version:
After trying to adapt a few existing modules I ended up abstracting my own: Sub::Chain.
It needs some work, but is doing what I need so far.
The long version:
(an excerpt from the POD)
=head1 RATIONALE
This module started out as Data::Transform::Named,
a named wrapper (like Sub::Chain::Named) around
Data::Transform (and specifically Data::Transform::Map).
As the module was nearly finished I realized I was using very little
of Data::Transform (and its documentation suggested that
I probably wouldn't want to use the only part that I was using).
I also found that the output was not always what I expected.
I decided that it seemed reasonable according to the likely purpose
of Data::Transform, and this module simply needed to be different.
So I attempted to think more abstractly
and realized that the essence of the module was not tied to
data transformation, but merely the succession of simple subroutine calls.
I then found and considered Sub::Pipeline
but needed to be able to use the same
named subroutine with different arguments in a single chain,
so it seemed easier to me to stick with the code I had written
and just rename it and abstract it a bit further.
I also looked into Rule::Engine which was beginning development
at the time I was searching.
However, like Data::Transform, it seemed more complex than what I needed.
When I saw that Rule::Engine was using [the very excellent] Moose
I decided to pass since I was doing work on a number of very old machines
with old distros and old perls and constrained resources.
Again, it just seemed to be much more than what I was looking for.
=cut
As for the "parse" method in my original idea/example,
I haven't found that to be necessary, and am currently using syntax like
$chain->append($sub, \@arguments, \%options)
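The underlying idea, a succession of simple subroutine calls stacked per column name, can be sketched in plain Perl. This is not Sub::Chain's implementation or API; the registry, the stack and apply functions, and the default_unless processor are all hypothetical names for illustration.

```perl
use strict;
use warnings;

# Hypothetical registry of named processors; each takes the value first.
my %processor = (
    gsub => sub {
        my ($value, $pattern, $replacement) = @_;
        $value =~ s/$pattern/$replacement/g;
        return $value;
    },
    default_unless => sub {
        my ($value, $allowed, $fallback) = @_;
        return (grep { $_ eq $value } @$allowed) ? $value : $fallback;
    },
);

my %chain_for;    # column name => list of [coderef, extra arguments]

# Append a processor (by name) to the chain for one column.
sub stack {
    my ($column, $action, @args) = @_;
    my $code = $processor{$action} or die "unknown processor '$action'\n";
    push @{ $chain_for{$column} }, [ $code, @args ];
    return;
}

# Run every stacked processor, in order, over a copy of the row.
sub apply {
    my ($row) = @_;
    my %out = %$row;
    for my $column (keys %out) {
        for my $step (@{ $chain_for{$column} || [] }) {
            my ($code, @args) = @$step;
            $out{$column} = $code->($out{$column}, @args);
        }
    }
    return \%out;
}

stack(greeting => gsub => qr/\bhi\b/, 'hello');    # don't say "hi"
stack(numbers  => gsub => qr/\D/, '');             # digits only
stack(numbers  => default_unless => [1, 2], 3);    # then: if not 1 or 2, set to 3

my $cleaned = apply({ greeting => 'hi there', numbers => 't7', when => '2010-01-16' });
print "$_: $cleaned->{$_}\n" for sort keys %$cleaned;
```

Columns with no stacked processors pass through unchanged, and the same named processor can appear several times in one chain with different arguments.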

Perl - Calling subclass constructor from superclass (OO)

This may turn out to be an embarrassingly stupid question, but better than potentially creating embarrassingly stupid code. :-) This is an OO design question, really.
Let's say I have an object class 'Foos' that represents a set of dynamic configuration elements, which are obtained by querying a command on disk, 'mycrazyfoos -getconfig'. Let's say that there are two categories of behavior that I want 'Foos' objects to have:
Existing ones: query the ones that exist in the command output I just mentioned ('/usr/bin/mycrazyfoos -getconfig'). Make modifications to existing ones via shelling out commands.
Create new ones that don't exist; new 'crazyfoos', using a complex set of /usr/bin/mycrazyfoos commands and parameters. Here I'm not really just querying, but actually running a bunch of system() commands. Effecting changes.
Here's my class structure:
Foos.pm
package Foos, which has a new({name => 'myfooname'}) constructor that takes a 'crazyfoo NAME' and then queries the existence of that NAME to see if it already exists (by shelling out and running the mycrazyfoos command above). If that crazyfoo already exists, return a Foos::Existing object. Any changes to this object require shelling out, running commands and getting confirmation that everything ran okay.
If this is the way to go, then the new() constructor needs to have a test to see which subclass constructor to use (if that even makes sense in this context). Here are the subclasses:
Foos/Existing.pm
As mentioned above, this is for when a Foos object already exists.
Foos/Pending.pm
This is an object that will be created if, in the above, the 'crazyfoo NAME' doesn't actually exist. In this case, the new() constructor above will be checked for additional parameters, and it will go ahead and, when called using ->create() shell out using system() and create a new object... possibly returning an 'Existing' one...
OR
As I type this out, I am realizing it is perhaps better to have a single:
(an alternative arrangement)
Foos class, that has a
->new() that takes just a name
->create() that takes additional creation parameters
->delete(), ->change() and other params that affect ones that exist; that will have to just be checked dynamically.
So here we are, two main directions to go with this. I'm curious which would be the more intelligent way to go.
In general it's a mistake (design-wise, not syntax-wise) for the new method to return anything but a new object. If you want to sometimes return an existing object, call that method something else, e.g. new_from_cache().
I also find it odd that you're splitting up this functionality (constructing a new object, and returning an existing one) not just into separate namespaces, but also different objects. So in general, you're closer with your second approach, but you can still have the main constructor (new) handle a variety of arguments:
package Foos;
use strict;
use warnings;
sub new
{
my ($class, %args) = @_;
if ($args{name})
{
# handle the name => value option
}
if ($args{some_other_option})
{
# ...
}
my $this = {
# fill in any fields you need...
};
return bless $this, $class;
}
sub new_from_cache
{
my ($class, %args) = @_;
# check if the object already exists...
# if not, create a new object
return $class->new(%args);
}
Note: I don't want to complicate things while you're still learning, but you may also want to look at Moose, which takes care of a lot of the gory details of construction for you, and the definition of attributes and their accessors.
It is generally speaking a bad idea for a superclass to know about its subclasses, a principle which extends to construction.[1] If you need to decide at runtime what kind of object to create (and you do), create a fourth class to have just that job. This is one kind of "factory".
Having said that in answer to your nominal question, your problem as described does not seem to call for subclassing. In particular, you apparently are going to be treating the different classes of Foos differently depending on which concrete class they belong to. All you're really asking for is a unified way to instantiate two separate classes of objects.
So how's this suggestion[3]: Make Foos::Existing and Foos::Pending two separate and unrelated classes and provide (in Foos) a method that returns the appropriate one. Don't call it new; you're not making a new Foos.
If you want to unify the interfaces so that clients don't have to know which kind they're talking about, then we can talk subclassing (or better yet, delegation to a lazily-created and -updated Foos::Handle).
[1]: Explaining why this is true is a subject hefty enough for a book[2], but the short answer is that it creates a dependency cycle between the subclass (which depends on its superclass by definition) and the superclass (which is being made to depend on its subclass by a poor design decision).
[2]: Lakos, John. (1996). Large-scale C++ Software Design. Addison-Wesley.
[3]: Not a recommendation, since I can't get a good enough handle on your requirements to be sure I'm not shooting fish in a dark ocean.
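A bare-bones sketch of the "fourth class" idea, with assumptions flagged: Foos::Factory, its foo_for method and the exists_check callback are hypothetical names, and the callback stands in for the real lookup that would shell out to the mycrazyfoos command.

```perl
use strict;
use warnings;

package Foos::Existing;
sub new { my ($class, %args) = @_; my $self = { %args }; return bless $self, $class }

package Foos::Pending;
sub new { my ($class, %args) = @_; my $self = { %args }; return bless $self, $class }

# The factory's only job is to decide which class to instantiate,
# so neither class has to know about the other.
package Foos::Factory;

sub foo_for {
    my ($class, %args) = @_;
    # Hypothetical callback: returns true if the named foo already exists.
    my $exists = delete $args{exists_check} or die "exists_check is required\n";
    my $target = $exists->($args{name}) ? 'Foos::Existing' : 'Foos::Pending';
    return $target->new(%args);
}

package main;

my %known = (alpha => 1);    # pretend 'alpha' is already configured
my $check = sub { exists $known{ $_[0] } };

my $old_foo = Foos::Factory->foo_for(name => 'alpha', exists_check => $check);
my $new_foo = Foos::Factory->foo_for(name => 'beta',  exists_check => $check);

print ref($old_foo), "\n";    # Foos::Existing
print ref($new_foo), "\n";    # Foos::Pending
```

Passing the existence check in as a code reference also makes the decision logic testable without running any external command.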
It is also a factory pattern (bad in Perl) if the object's constructor will return an instance blessed into more than one package.
I would create something like this. If the name exists then is_created is set to 1; otherwise it is set to 0. I would merge ::Pending and ::Existing together, and if the object isn't created, just put that into the default for _object; the check happens lazily. Also, Foo->delete() and Foo->change() will defer to the instance in _object.
package Foo;
use Moose;
has 'name' => ( is => 'ro', isa => 'Str', required => 1 );
has 'is_created' => (
is => 'ro'
, isa => 'Bool'
, init_arg => undef
, default => sub {
stuff_if_exists ? 1 : 0
}
);
has '_object' => (
isa => 'Object'
, is => 'ro'
, lazy => 1
, init_arg => undef
, default => sub {
my $self = shift;
$self->is_created
? Foo->new
: Bar->new
}
, handles => [qw/delete change/]
);
Interesting answers! I am digesting it as I try out different things in code.
Well, I have another variation of the same question -- the same question, mind you, just a different problem to the same class:subclass creation issue!
This time:
This code is an interface to a command line that has a number of different complex options. I told you about /usr/bin/mycrazyfoos before, right? Well, what if I told you that that binary changes based on versions, and sometimes it completely changes its underlying options. And that this class we're writing, it has to be able to account for all of these things. The goal (or perhaps idea) is to do: (perhaps called FROM the Foos class we were discussing above):
Foos::Commandline, which has as subclasses different versions of the underlying '/usr/bin/mycrazyfoos' command.
Example:
my $fcommandobj = new Foos::Commandline;
my @raw_output_list = $fcommandobj->getlist();
my $result_dance = $fcommandobj->dance();
where 'getlist' and 'dance' are version-dependent. I thought about doing this:
package Foos::Commandline;
sub new {
#Figure out some clever way to decide what version user has
# (automagically)
# And call appropriate subclass? Wait, you all are telling me this is bad OO:
# if v1.0.1 (new Foos::Commandline::v1.0.1.....
# else if v1.2 (new Foos::Commandline::v1.2....
#etc
}
then
package Foos::Commandline::v1.0.1;
sub getlist ( eval... system ("/usr/bin/mycrazyfoos", "-getlistbaby"
# etc etc
and (different .pm files, in subdir of Foos/Commandline)
package Foos::Commandline::v1.2;
sub getlist ( eval... system ("/usr/bin/mycrazyfoos", "-getlistohyeahrightheh"
#etc
Make sense? I expressed in code what I'd like to do, but it just doesn't feel right, particularly in light of what was discussed in the above responses. What DOES feel right is that there should be a generic interface / superclass to Commandline... and that different versions should be able to override it. Right? Would appreciate a suggestion or two on that. Gracias.
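One way the generic-superclass idea could be laid out, as a sketch only: the package names V1_0_1 and V1_2 (dots are not valid in Perl package names), the Factory class, and the getlist_flag/getlist_command methods are all hypothetical. The two flags are the ones from the pseudocode above. Version detection is left out; the version string is simply passed in, and the command is built rather than executed.

```perl
use strict;
use warnings;

# Generic interface: shared behaviour lives here, version details are overridden.
package Foos::Commandline;

sub new    { my ($class) = @_; return bless {}, $class }
sub binary { '/usr/bin/mycrazyfoos' }

sub getlist_flag {
    my ($self) = @_;
    die ref($self) . " must implement getlist_flag\n";
}

# Build the argument list that would be handed to system().
sub getlist_command {
    my ($self) = @_;
    return ($self->binary, $self->getlist_flag);
}

package Foos::Commandline::V1_0_1;
our @ISA = ('Foos::Commandline');
sub getlist_flag { '-getlistbaby' }

package Foos::Commandline::V1_2;
our @ISA = ('Foos::Commandline');
sub getlist_flag { '-getlistohyeahrightheh' }

# Separate class that maps a version string to an implementation,
# so the superclass never has to know its subclasses.
package Foos::Commandline::Factory;

my %class_for = (
    '1.0.1' => 'Foos::Commandline::V1_0_1',
    '1.2'   => 'Foos::Commandline::V1_2',
);

sub for_version {
    my ($class, $version) = @_;
    my $impl = $class_for{$version} or die "unsupported version '$version'\n";
    return $impl->new;
}

package main;

my $cmd = Foos::Commandline::Factory->for_version('1.2');
my @command = $cmd->getlist_command;
print "@command\n";
```

Callers only ever see the Foos::Commandline interface; supporting a new version of the binary means adding one subclass and one entry in the factory's table.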