Perl hash referencing itself at declaration - perl

When declaring a Perl hash, I'm wondering if it's possible to use a value that was assigned earlier in the declaration.
I'd like to do the equivalent of this, all in one shot:
my %H = (something => generateString());
$H{foo} = $H{something} . "/FOO",
$H{bar} = $H{something} . "/BAR",
I can imagine something like this:
my %H = (
something => generateString(),
foo => $_{something} . "/FOO",
bar => $_{something} . "/BAR",
);
EDIT: To be clear, I don't care about an actual reference to $H{something} (i.e. changing $H{something} later shouldn't affect $H{foo}). I'd just like to get its value into the string concatenations.

You seem to think there are two assignment operators in
%H = ( a=>1, b=>$H{a} );
There isn't. Keep in mind the above is identical to
%H = ( 'a', 1, 'b', $H{a} );
There's one assignment operator, and before you can perform the assignment, you need to know what is going to be assignment.
What I'm saying is that the real problem with %H = ( a=>1, b=>$H{a} ); isn't one of scope; the real problem is that nothing been assigned to %H when you do $H{a}[1]. As such, $_{a} makes more sense than $H{a}.
The solution is simple:
my $something = generateString();
my %H = (
something => $something,
foo => "$something/FOO",
bar => "$something/BAR",
);
%H hasn't even been created yet!

my %H = (something => generateString());
%H = (%H,
foo => $H{something} ."/FOO",
bar => $H{something} ."/BAR",
);
seems reasonable, but if you want it in one shot at ANY cost,
use strict;
use warnings;
%{ $_->{hash} } = (
something => $_->{thing},
foo => "$_->{thing}/FOO",
bar => "$_->{thing}/BAR",
)
for {hash => \my %H, thing => generateString()};
which can be translated to more verbose version,
my %H;
local $_ = {hash => \%H, thing => generateString()};
%{ $_->{hash} } = (
something => $_->{thing},
foo => "$_->{thing}/FOO",
bar => "$_->{thing}/BAR",
);

As ikegami says, there is only one assignment operation; it is executed after the entire right hand side is evaluated.
A couple alternatives:
my %H = map {;
'something' => $_,
'foo' => "$_/FOO",
'bar' => "$_/BAR",
} generateString();
and
use Memoize;
memoize('generateString');
my %H = (
'something' => scalar(generateString()),
'foo' => generateString() . '/FOO',
'bar' => generateString() . '/BAR',
);
(scalar needed because otherwise, memoize will make separate calls for list and scalar context.)

An alternative is to do the definition as part of the assignments:
my %H = (
something => ( local our $str=generateString() ),
foo => qq{$str/FOO},
bar => qq{$str/BAR}
);
There are two things to consider:
Notice the use of local our instead of my
I advise against this. For many reasons; if not for best coding practices, but for bugs that may be introduced.
I feel that defining a variable outside the hash definition (as shown in Ikegami's answer) is the best way to go for many reasons, but I realize that's also not the question being asked. The answer to the question is that a hash cannot be referenced before it is created and during creation it is not exposed within the constructor.

Related

Force coercion in Moose

I want to modify an attribute's value every time it is set, no matter if it is done within constructor or by a 'writer'(i don't use 'builder' or 'default' in that case). Basically the attribute(not necessary 'Str' type) is passed to the constructor and in some cases I want to modify its value after that, but in every scenario I want to do some regexp on it (for example).
My first approach was to use a BUILDARGS and around method, both of would use the same regex function, but then I wonder about coercion. The only problem is I don't know how to create a subtype/type definition that will force coercion no matter what.
For example:
package Foo;
use Moose::Util::TypeConstraints;
subtype 'Foo::bar' => as 'Str';
coerce 'Foo::bar'
=> from 'Str'
=> via {
$_ =~ s/some_stuff//g;
$_ =~ s/other_stuff//g;
$_ =~ s/some_other_stuff//g;
};
has 'bar' => (isa => 'Foo:bar', coerce => 1);
I don't want to define subtype/type with 'where' clause like
subtype 'Foo::bar' => as 'Str' => where {$_ !~ /some_stuff/ && $_ !~ /other_stuff/ && ... };
because it seems tedious to me.
Edit: I'm looking for a comprehensive solution I could use not only with 'Str' type attributes but also 'ArrayRef', 'HashRef' etc.
Sounds like you want a trigger.
package Foo;
use Moose;
has 'bar' => (
is => 'rw',
isa => 'Str',
trigger => sub {
my ( $self, $value, $old_value ) = #_;
say 'in trigger';
# prevent infinite loop
return if $old_value && $old_value eq $value;
my $original_value = $value;
$value =~ s/some_stuff//g;
$value =~ s/other_stuff//g;
$value =~ s/some_other_stuff//g;
# prevent infinite loop
return if $value eq $original_value;
say '... setting new value';
$self->bar($value);
},
);
package main;
my $foo = Foo->new( bar => 'foo some_stuff and other_stuff and some_more_stuff' );
say $foo->bar;
I want to modify an attribute's value every time it is set, no matter
if it is done within constructor or by a 'writer'(i don't use
'builder' or 'default' in that case)
This is exactly what a trigger does. The doc says almost verbatim what you asked for:
NOTE: Triggers will only fire when you assign to the attribute, either in the constructor, or using the writer. Default and built values will not cause the trigger to be fired.
Edit: There was a bug in the inifite loop detection. It now works and will stop on the second invocation. I left debug output in to demonstrate.
in trigger
... setting new value
in trigger
foo and and some_more_stuff

Moose's attribute vs simple sub?

How to decide - what is the recommended way for the next code fragment?
I have a Moose-based module, where some data is a simple HashRef.
It is possible to write - as a Mooseish HashRef, like:
package Some;
has 'data' => (
isa => 'HashRef',
builder => '_build_href',
init_arg => undef,
lazy => 1,
);
sub _build-href {
my $href;
$href = { a=>'a', b=>'b'}; #some code what builds a href
return $href;
}
vs
sub data {
my $href;
$href = { a=>'a', b=>'b'}; #some code what builds a href
return $href;
}
What is the difference? I'm asking because when calling:
my $obj = Some->new;
my $href = $obj->data;
In both case I get a correct HashRef. So when is it recommended to use a Moose-ish has construction (which is longer) vs a simple data sub?
PS: probably this question is so simple for an average perl programmer, but please, keep in mind, I'm still only learning perl.
If you have an attribute, then whoever is constructing the object can set the hashref in the constructor:
my $obj = Some->new(data => { a => 'c', b => 'd' });
(Though in your example, you've used init_arg => undef which would disable that ability.)
Also, in the case of the attribute, the builder is only run once per object while with a standard method, the method might be called multiple times. If building the hashref is "expensive", that may be an important concern.
Another difference you'll notice is with this:
use Data::Dumper;
my $obj = Some->new;
$obj->data->{c} = 123;
print Dumper( $obj->data );

What are anonymous hashes in perl?

$hash = { 'Man' => 'Bill',
'Woman' => 'Mary,
'Dog' => 'Ben'
};
What exactly do Perl's “anonymous hashes” do?
It is a reference to a hash that can be stored in a scalar variable. It is exactly like a regular hash, except that the curly brackets {...} creates a reference to a hash.
Note the usage of different parentheses in these examples:
%hash = ( foo => "bar" ); # regular hash
$hash = { foo => "bar" }; # reference to anonymous (unnamed) hash
$href = \%hash; # reference to named hash %hash
This is useful to be able to do, if you for example want to pass a hash as an argument to a subroutine:
foo(\%hash, $arg1, $arg2);
sub foo {
my ($hash, #args) = #_;
...
}
And it is a way to create a multilevel hash:
my %hash = ( foo => { bar => "baz" } ); # $hash{foo}{bar} is now "baz"
You use an anonymous hash when you need reference to a hash and a named hash is inconvenient or unnecessary. For instance, if you wanted to pass a hash to a subroutine, you could write
my %hash = (a => 1, b => 2);
mysub(\%hash);
but if there is no need to access the hash through its name %hash you could equivalently write
mysub( {a => 1, b => 2} );
This comes in handy wherever you need a reference to a hash, and particularly when you are building nested data structures. Instead of
my %person1 = ( age => 34, position => 'captain' );
my %person2 = ( age => 28, position => 'boatswain' );
my %person3 = ( age => 18, position => 'cabin boy' );
my %crew = (
bill => \%person1,
ben => \%person2,
weed => \%person3,
);
you can write just
my %crew = (
bill => { age => 34, position => 'captain' },
ben => { age => 28, position => 'boatswain' },
weed => { age => 18, position => 'cabin boy' },
);
and to add a member,
$crew{jess} = { age => 4, position => "ship's cat" };
is a lot neater than
my %newperson = ( age => 4, position => "ship's cat" );
$crew{jess} = \%newperson;
and of course, even if a hash is created with a name, if its reference is passed elsewhere then there may be no way of using that original name, so it must be treated as anonymous. For instance in
my $crew_member = $crew{bill}
$crew_member is now effectively a reference to an anonymous hash, regardless of how the data was originally constructed. Even if the data is (in some scope) still accessible as %person1 there is no general way of knowing that, and the data can be accessed only by its reference.
It's quite simple. They allow you to write
push #hashes, { ... };
f(config => { ... });
instead of
my %hash = ( ... );
push #hashes, \%hash;
my %config = ( ... );
f(config => \%config);
(If you want to know the purpose of references, that's another story entirely.)
Anything "anonymous" is a data structure that used in a way where it does not get a name.
Your question has confused everyone else on this page, because your example shows you giving a name to the hash you created, thus it is no longer anonymous.
For example - if you have a subroutine and you want to return a hash, you could write this code:-
return {'hello'=>123};
since it has no name there - it is anonymous. Read on to unwind the extra confusion other people have added on this page by introducing references, which are not the same thing.
This is another anonymous hash (an empty one):
{}
This is an anonymous hash with something in it:
{'foo'=>123}
This is an anonymous (empty) array:
[]
This is an anonymous array with something in it:
['foo',123]
Most of the time when people use these things, they are really trying to magically put them inside of other data structures, without the bother of giving them a waste-of-time temporary name when they do this.
For example - you might want to have a hash in the middle of an array!
#array=(1,2,{foo=>3});
that array has 3 elements - the last element is a hash! ($array[2]->{foo} is 3)
perl -e '#array=(1,2,{foo=>1});use Data::Dumper;print Data::Dumper->Dump([\#array],["\#array"]);'
$#array = [
1,
2,
{
'foo' => 1
}
];
Sometimes you want to don't want to pass around an entire data structure, instead, you just want to use a pointer or reference to the data structure. In perl, you can do this by adding a "\" in front of a variable;
%hashcopy=%myhash; # this duplicates the hash
$myhash{test}=2; # does not affect %hashcopy
$hashpointer=\%myhash; # this gives us a different way to access the same hash
$hashpointer->{test}=2;# changes %myhash
$$hashpointer{test}=2; # identical to above (notice double $$)
If you're crazy, you can even have references to anonymous hashes:
perl -e 'print [],\[],{},\{}'
ARRAY(0x10eed48)REF(0x110b7a8)HASH(0x10eee38)REF(0x110b808)
and sometimes perl is clever enough to know you really meant reference, even when you didn't specifically say so, like my first "return" example:
perl -e 'sub tst{ return {foo=>bar}; }; $var=&tst();use Data::Dumper;print Data::Dumper->Dump([\$var],["\$var"]);'
$var = \{
'foo' => 'bar'
};
or:-
perl -e 'sub tst{ return {foo=>bar}; }; $var=&tst(); print "$$var{foo}\n$var->{foo}\n"'
bar
bar

How can I make all lazy Moose features be built?

I have a bunch of lazy features in a Moose object.
Some of the builders require some time to finish.
I would like to nvoke all the builders (the dump the "bomplete" object).
Can I make all the lazy features be built at once, or must I call each feature manually to cause it builder to run?
If you want to have "lazy" attributes with builders, but ensure that their values are constructed before new returns, the usual thing to do is to call the accessors in BUILD.
sub BUILD {
my ($self) = #_;
$self->foo;
$self->bar;
}
is enough to get the job done, but it's probably best to add a comment as well explaining this apparently useless code to someone who doesn't know the idiom.
Maybe you could use the meta class to get list of 'lazy' attributes. For example:
package Test;
use Moose;
has ['attr1', 'attr2'] => ( is => 'rw', lazy_build => 1);
has ['attr3', 'attr4'] => ( is => 'rw',);
sub BUILD {
my $self = shift;
my $meta = $self->meta;
foreach my $attribute_name ( sort $meta->get_attribute_list ) {
my $attribute = $meta->get_attribute($attribute_name);
if ( $attribute->has_builder ) {
my $code = $self->can($attribute_name);
$self->$code;
}
}
}
sub _build_attr1 { 1 }
sub _build_attr2 { 1 }
I've had this exact requirement several times in the past, and today I actually had to do it from the metaclass, which meant no BUILD tweaking allowed. Anyway I felt it would be good to share since it basically does exactly what ether mentioned:
'It would allow marking attributes "this is lazy, because it depends
on other attribute values to be built, but I want it to be poked
before construction finishes."'
However, derp derp I have no idea how to make a CPAN module so here's some codes:
https://gist.github.com/TiMBuS/5787018
Put the above into Late.pm and then you can use it like so:
package Thing;
use Moose;
use Late;
has 'foo' => (
is => 'ro',
default => sub {print "setting foo to 10\n"; 10},
);
has 'bar' => (
is => 'ro',
default => sub {print 'late bar being set to ', $_[0]->foo*2, "\n"; $_[0]->foo*2},
late => 1,
);
#If you want..
__PACKAGE__->meta->make_immutable;
1;
package main;
Thing->new();
#`bar` will be initialized to 20 right now, and always after `foo`.
#You can even set `foo` to 'lazy' or 'late' and it will still work.

How can I reduce duplication in constants?

I have this Perl script with many defined constants of configuration files. For example:
use constant {
LOG_DIR => "/var/log/",
LOG_FILENAME => "/var/log/file1.log",
LOG4PERL_CONF_FILE => "/etc/app1/log4perl.conf",
CONF_FILE1 => "/etc/app1/config1.xml",
CONF_FILE2 => "/etc/app1/config2.xml",
CONF_FILE3 => "/etc/app1/config3.xml",
CONF_FILE4 => "/etc/app1/config4.xml",
CONF_FILE5 => "/etc/app1/config5.xml",
};
I want to reduce duplication of "/etc/app1" and "/var/log" , but using variables does not work. Also using previously defined constants does not work in the same "use constant block". For example:
use constant {
LOG_DIR => "/var/log/",
FILE_FILENAME => LOG_DIR . "file1.log"
};
does not work.
Using separate "use constant" blocks does workaround this problem, but that adds a lot of unneeded code.
What is the correct way to do this?
Thank you.
Using separate "use constant" blocks
does workaround this problem, but that
adds a lot of unneeded code.
Does it really?
use constant BASE_PATH => "/etc/app1";
use constant {
LOG4PERL_CONF_FILE => BASE_PATH . "/log4perl.conf",
CONF_FILE1 => BASE_PATH . "/config1.xml",
CONF_FILE2 => BASE_PATH . "/config2.xml",
CONF_FILE3 => BASE_PATH . "/config3.xml",
CONF_FILE4 => BASE_PATH . "/config4.xml",
CONF_FILE5 => BASE_PATH . "/config5.xml",
};
I don't see a lot of problems with this. You have specified the base path in one point only, thereby respecting the DRY principle. If you assign BASE_PATH with an environment variable:
use constant BASE_PATH => $ENV{MY_BASE_PATH} || "/etc/app1";
... you then have a cheap way of reconfiguring your constant without having to edit your code. What's there to not like about this?
If you really want to cut down the repetitive "BASE_PATH . " concatenation, you could add a bit of machinery to install the constants yourself and factor that away:
use strict;
use warnings;
use constant BASE_PATH => $ENV{MY_PATH} || '/etc/apps';
BEGIN {
my %conf = (
FILE1 => "/config1.xml",
FILE2 => "/config2.xml",
);
for my $constant (keys %conf) {
no strict 'refs';
*{__PACKAGE__ . "::CONF_$constant"}
= sub () {BASE_PATH . "$conf{$constant}"};
}
}
print "Config is ", CONF_FILE1, ".\n";
But at this point I think the balance has swung away from Correct to Nasty :) For a start, you can no longer grep for CONF_FILE1 and see where it is defined.
I'd probably write it like this:
use Readonly;
Readonly my $LOG_DIR => "/var/log";
Readonly my $LOG_FILENAME => "$LOG_DIR/file1.log";
Readonly my $ETC => '/etc/app1';
Readonly my $LOG4PERL_CONF_FILE => "$ETC/log4perl.con";
# hash because we don't have an index '0'
Readonly my %CONF_FILES => map { $_ => "$ETC/config$_.xml" } 1 .. 5;
However, that's still a lot of code, but it does remove the duplication and that's a win.
Why are your logfiles numeric? If they start with 0, an array is a better choice than a hash. If they're named, they're more descriptive.
use constant +{
map { sprintf $_, '/var/log' } (
LOG_DIR => "%s/",
LOG_FILENAME => "%s/file1.log",
),
map { sprintf $_, '/etc/app1' } (
LOG4PERL_CONF_FILE => "%s/log4perl.conf",
CONF_FILE1 => "%s/config1.xml",
CONF_FILE2 => "%s/config2.xml",
CONF_FILE3 => "%s/config3.xml",
CONF_FILE4 => "%s/config4.xml",
CONF_FILE5 => "%s/config5.xml",
),
};
That's not going to work, sadly. The reason for this is that you are using functions ('constants') before they are defined. You evaluate them before the call to constant->import.
Using variables doesn't work because use statements are evaluated at compile time. Assigning to variables is only done at runtime, so they won't be defined yet.
The only solution I can give is to split it into multiple use constant statements. In this case, two statements will do (one for LOG_DIR and CONF_DIR, another for the rest).
Depending on what you are doing, you might not want constants at all. Mostly, I write stuff that other people use to get their stuff done, so I solve this problem in a way that gives other programmers flexibility. I make these things into methods:
sub base_log_dir { '...' }
sub get_log_file
{
my( $self, $number ) = #_;
my $log_file = catfile(
$self->base_log_dir,
sprintf "foo%03d", $number
);
}
By doing it this way, I can easily extend or override things.
Doing this loses the value of constant folding though, so you have to think about how important that is to you.