perl how to reference hash itself


This might seem to be an odd thing to do, but how do I reference a hash while 'inside' the hash itself? Here's what I'm trying to do:
I have a hash of hashes with a sub at the end, like:
my $h = { A => [...], B => [...], ..., EXPAND => sub { ... } };
I'm looking to implement EXPAND so that it checks whether the key C is present in this hash and, if so, inserts another key-value pair D.
So my question is, how do I pass the reference to this hash to the sub, without using the variable name of the hash? I expect to need to do this to a few hashes and I don't want to keep having to change the sub to reference the name of the hash it's currently in.

What you've got there is some nested array references, not hashes. Let's assume you actually meant that you have something like this:
my $h = { A => {...}, B => {...}, ..., EXPAND() };
In that case, you can't reference $h from within its own definition, because $h does not exist until the expression is completely evaluated.
If you're content to make it two lines, then you can do this:
my $h = { A=> {...}, B => {...} };
$h = { %$h, EXPAND( $h ) };

The general solution is to write a function that, given a hash and a function to expand that hash, returns that hash with the expansion function added to it. We can close over the hash in the expansion function so that the hash's name doesn't need to be mentioned in it. That looks like this:
use strict;
use warnings;
use 5.010;
sub add_expander {
    my ($expanding_hash, $expander_sub) = @_;
    my $result = { %$expanding_hash };
    $result->{EXPAND} = sub { $expander_sub->($result) };
    return $result;
}
my $h = add_expander(
    {
        A => 5,
        B => 6,
    },
    sub {
        my ($hash) = @_;
        my ($maxkey) = sort { $b cmp $a } grep { $_ ne 'EXPAND' } keys %$hash;
        my $newkey = chr(ord($maxkey) + 1);
        $hash->{$newkey} = 'BOO!';
    }
);
use Data::Dumper;
say Dumper $h;
$h->{EXPAND}->();
say Dumper $h;
Notice that we are creating $h but that the add_expander call contains no mention of $h. Instead, the sub passed into the call expects the hash it is meant to expand as its first argument. Running add_expander on the hash and the sub creates a closure that remembers which hash the expander is associated with, and stores that closure in the hash.
This solution assumes that what should happen when a hash is expanded can vary by subject hash, so add_expander takes an arbitrary sub. If you don't need that degree of freedom, you can incorporate the expansion sub into add_expander.

The hash being built (potentially) happens after EXPAND() runs. I would probably use something like this:
$h = EXPAND( { A=>... } )
Where EXPAND(...) returns the modified hashref or a clone if the original needs to remain intact.
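A minimal sketch of that approach, assuming the rule from the question (add key D when key C exists). The name expand and the value 'added' are placeholders, not anything from the original post. It works on a shallow copy, so the hash passed in stays intact:

```perl
use strict;
use warnings;

# Hypothetical helper: returns a shallow copy of the hash,
# with key D added when key C is present.
sub expand {
    my ($h) = @_;
    my %copy = %$h;                          # shallow copy; nested refs are shared
    $copy{D} = 'added' if exists $copy{C};   # 'added' is a placeholder value
    return \%copy;
}

my $with_c    = expand( { A => [1], C => [3] } );
my $without_c = expand( { A => [1], B => [2] } );

print join( ',', sort keys %$with_c ),    "\n";   # A,C,D
print join( ',', sort keys %$without_c ), "\n";   # A,B
```

Because the sub receives the hash as an argument, it never needs to know the name of the variable that holds it.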

Related

Subroutine arguments as key-value pairs without a temp variable

In Perl, I've always liked the key-value pair style of argument passing,
fruit( apples => red );
I do this a lot:
sub fruit {
my %args = @_;
$args{apples}
}
Purely for compactness and having more than one way to do it, is there a way to either:
access the key-value pairs without assigning @_ to a hash? I.e. in a single statement?
have the subroutine's arguments automatically become a hash reference, perhaps via a subroutine prototype?
Without:
assigning to a temp variable my %args = @_;
having the caller pass by reference i.e. fruit({ apples => red }); purely for aesthetics
Attempted
${%{\@_}}{apples}
Trying to reference @_, interpret that as a hash ref, and access a value by key.
But I get an error that it's not a hash reference. (Which it isn't ^.^ ) I'm thinking of C where you can cast pointers, amongst other things, and avoid explicit reassignment.
I also tried subroutine prototypes
sub fruit (%) { ... }
...but the arguments get collapsed into @_ as usual.

You can't perform a hash lookup (${...}{...}) without having a hash. But you could create an anonymous hash.
my $apples = ${ { @_ } }{apples};
my $oranges = ${ { @_ } }{oranges};
You could also use the simpler arrow syntax
my $apples = { @_ }->{apples};
my $oranges = { @_ }->{oranges};
That would be very inefficient though. You'd be creating a new hash for each parameter. That's why a named hash is normally used.
my %args = @_;
my $apples = $args{apples};
my $oranges = $args{oranges};
An alternative, however, would be to use a hash slice.
my ($apples, $oranges) = @{ { @_ } }{qw( apples oranges )};
The following is the postfix-dereference version, which is available without extra pragmas only in Perl 5.24+:
my ($apples, $oranges) = { @_ }->@{qw( apples oranges )};
It's available in 5.20+ if you use the following:
use feature qw( postderef );
no warnings qw( experimental::postderef );

If you're more concerned about compactness than efficiency, you can do it this way:
sub fruit {
print( +{@_}->{apples}, "\n" );
my $y = {@_}->{pears};
print("$y\n");
}
fruit(apples => 'red', pears => 'green');
The reason +{@_}->{apples} was used instead of {@_}->{apples} is that, without the leading + (or some other means of disambiguation), the braces conflict with the print BLOCK LIST syntax.

How do I pass a hash to subroutine?

Need help figuring out how to do this. My code:
my %hash;
$hash{'1'}= {'Make' => 'Toyota','Color' => 'Red',};
$hash{'2'}= {'Make' => 'Ford','Color' => 'Blue',};
$hash{'3'}= {'Make' => 'Honda','Color' => 'Yellow',};
&printInfo(%hash);
sub printInfo{
my (%hash) = %_;
foreach my $key (keys %_{
my $a = $_{$key}{'Make'};
my $b = $_{$key}{'Color'};
print "$a $b\n";
}
}

The easy way, which may lead to problems as the code evolves, is to assign the argument array @_ (which contains all key-value pairs as a flat, even-length list) to %hash, which rebuilds the hash from those pairs. So your code would look like this:
sub printInfo {
my %hash = @_;
...
}
The better way would be to pass the hash as reference to the subroutine. This way you could still pass more parameters to your subroutine.
printInfo(\%hash);
sub printInfo {
my %hash = %{$_[0]};
...
}
An introduction to using references in Perl can be found in perlreftut.

You're so very, very close. There is no %_ for passing hashes; arguments are always passed in @_. Luckily, a hash can be assigned from a list, so
sub printInfo {
my %hash = @_;
...
}
will make it work!
Also note, using the & in front of the subroutine call has been, in most cases, unnecessary since at least Perl 5.000. You can call Perl subroutines just like in other languages these days, with just the name and arguments. (As @mob points out in the comments, there are some instances where this is still necessary; see perlsub to understand this more, if interested.)

The best way to pass hashes and arrays is by reference. A reference is simply a way to talk about a complex data structure as a single data point -- something that can be stored in a scalar variable (like $foo).
Read up on references, so you understand how to create a reference and dereference a reference in order to get your original data back.
The very basics: You precede your data structure with a backslash to get the reference to that structure.
my $hash_ref = \%hash;
my $array_ref = \@array;
my $scalar_ref = \$scalar; #Legal, but doesn't do much for you...
A reference is a memory location of the original structure (plus a clue about the structure):
print "$hash_ref\n";
Will print something like:
HASH(0x7f9b0a843708)
To get the reference back into a usable format, you dereference it by putting the correct sigil in front:
my %new_hash = %{ $hash_ref };
You should learn about using references since this is the way you can create extremely complex data structures in Perl, and how Object Oriented Perl works.
Let's say you want to pass three hashes to your subroutine. Here are the three hashes:
my %hash1 = ( this => 1, that => 2, the => 3, other => 4 );
my %hash2 = ( tom => 10, dick => 20, harry => 30 );
my %hash3 = ( no => 100, man => 200, is => 300, an => 400, island => 500 );
I'll create the references for them
my $hash_ref1 = \%hash1;
my $hash_ref2 = \%hash2;
my $hash_ref3 = \%hash3;
And now just pass the references:
mysub ( $hash_ref1, $hash_ref2, $hash_ref3 );
The references are scalar data, so there's no problem passing them to my subroutine:
sub mysub {
my $sub_hash_ref1 = shift;
my $sub_hash_ref2 = shift;
my $sub_hash_ref3 = shift;
Now, I just dereference them, and my subroutine can use them.
my %sub_hash1 = %{ $sub_hash_ref1 };
my %sub_hash2 = %{ $sub_hash_ref2 };
my %sub_hash3 = %{ $sub_hash_ref3 };
You can see what a reference is a reference to by using the ref command:
my $ref_type = ref $sub_hash_ref1; # $ref_type is now equal to "HASH"
This is useful if you want to make sure you're being passed the correct type of data structure.
use Carp; # provides croak
sub mysub {
    my $hash_ref = shift;
    if ( ref $hash_ref ne "HASH" ) {
        croak qq(You need to pass in a hash reference);
    }
    ...
}
Also note that these are memory references, so modifying the data through the reference will modify the original hash:
my %hash = (this => 1, is => 2, a => 3, test => 4);
print "$hash{test}\n"; # Printing "4" as expected
mysub( \%hash ); # Passing the reference
print "$hash{test}\n"; # This is printing "foo". See subroutine:
sub mysub {
    my $hash_ref = shift;
    $hash_ref->{test} = "foo"; # This is modifying the original hash!
}
This can be good -- it allows you to modify data passed to the subroutine, or bad -- it allows you to unintentionally modify the caller's data.
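If the caller's data must stay untouched, one option is to work on a copy inside the subroutine. The following is only a sketch: mysub_copy is a made-up name, and the copy is shallow, so nested references would still be shared with the original.

```perl
use strict;
use warnings;

# Hypothetical variant of mysub: works on a private shallow copy,
# so the caller's hash is left untouched.
sub mysub_copy {
    my ($hash_ref) = @_;
    my %copy = %{$hash_ref};     # copies top-level keys and values only
    $copy{test} = "foo";
    return \%copy;
}

my %hash = ( this => 1, is => 2, a => 3, test => 4 );
my $changed = mysub_copy( \%hash );

print "$hash{test}\n";         # 4
print "$changed->{test}\n";    # foo
```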

I believe you want
my %hash;
$hash{'1'}= {'Make' => 'Toyota','Color' => 'Red',};
$hash{'2'}= {'Make' => 'Ford','Color' => 'Blue',};
$hash{'3'}= {'Make' => 'Honda','Color' => 'Yellow',};
printInfo(%hash);
sub printInfo{
my %hash = @_;
foreach my $key (keys %hash){
my $a = $hash{$key}{'Make'};
my $b = $hash{$key}{'Color'};
print "$a $b\n";
}
}
In the line printInfo(%hash), the %hash is expanded to a list of alternating keys and values.
In printInfo, @_ holds this list; assigning it to %hash recreates the keys with their corresponding values from the alternating elements of the list.

You can pass them as
The argument list: do_hash_thing( %hash )
A reference to the hash in the argument list: do_hash_thing( @args_before, \%hash, @args_after )
A reference by prototype, working like keys and other hash operators.
The list works like so:
sub do_hash_thing {
my %hash = @_;
...
}
do_hash_thing( %hash );
This also allows you to "stream" hash arguments as well:
do_hash_thing( %hash_1, %hash_2, parameter => 'green', other => 'pair' );
By reference works like this:
sub do_hash_thing {
my $hash_ref = shift;
...
}
do_hash_thing( \%hash, @other_args );
Here by prototype (\%@). The prototype makes perl look for a hash in the first argument and pass it by reference.
sub do_hash_thing (\%@) {
my $hash_ref = shift;
...
}
do_hash_thing( %hash => qw(other args) );
# OR
do_hash_thing %hash => qw(other args);
Caveat: prototypes don't work on methods.

Why does Perl's strict not let me pass a parameter hash?

I have a Perl subroutine to which I would like to pass parameters as a hash
(the aim is to include a css depending on the parameter 'iconsize').
I am using the call:
get_function_bar_begin('iconsize' => '32');
for the subroutine get_function_bar_begin:
use strict;
...
sub get_function_bar_begin
{
my $self = shift;
my %template_params = %{ shift || {} };
return $self->render_template('global/bars/tmpl_incl_function_bar_begin.html',%template_params);
}
Why does this yield the error message:
Error executing run mode 'start': undef error - Can't use string ("iconsize") as a HASH ref while "strict refs" in use at CheckBar.pm at line 334
Am I doing something wrong here?
Is there another way to submit my data ('iconsize') as a hash?
(I am still new to Perl)
EDIT: Solution which worked for me. I didn't change the call, but my function:
sub get_function_bar_begin
{
my $self = shift;
my $paramref = shift;
my %params = (ref($paramref) eq 'HASH') ? %$paramref : ();
my $iconsize = $params{'iconsize'} || '';
return $self->render_template('global/bars/tmpl_incl_function_bar_begin.html',
{
'iconsize' => $iconsize,
}
);
}

You are using the hash-dereferencing operator ( %{ } ) on the first argument of your parameter list. But that argument is not a hash reference, it's just the string 'iconsize'. You can do what you want by one of two ways:
Pass an anonymous hash reference:
get_function_bar_begin( { 'iconsize' => '32' } );
Or continue to pass a normal list, as you are right now, and change your function accordingly:
sub get_function_bar_begin {
my $self = shift;
my %template_params = @_;
}
Notice in this version that we simply assign the argument list directly to the hash (after extracting $self). This works because a list of name => value pairs is just syntactic sugar for a normal list.
I prefer the second method, since there's no particularly good reason to construct an anonymous hashref and then dereference it right away.
There's also some good information on how this works in this post: Object-Oriented Perl constructor syntax.

You're violating strict refs by trying to use the string iconsize as a hash reference.
I think you just want:
my( $self, %template_params ) = @_;
The first argument will go into $self and the rest create the hash by taking pairs of items from the rest of @_.

Passing hash with parameters as list
You need to use the @_ variable instead of shift. Like this:
my %template_params = @_; ## convert key => value pairs into hash
There is a difference between hashes and references to hashes in Perl. When you pass 'iconsize' => '32' as a parameter, Perl sees a plain list, which can be interpreted as a hash.
Passing hash with parameters as hash reference
But when you try %{ shift || {} }, Perl expects the second parameter to be a hash reference. In this case you can fix it in the following way:
get_function_bar_begin({ 'iconsize' => '32' }); ## make anonymous hash for params

The problem is this line:
get_function_bar_begin('iconsize' => '32');
This does not pass a hash reference, as you seem to think, but a hash, which appears as a list to the callee. So when you do %{ shift }, you're only shifting the key 'iconsize', not the entire list. The solution is actually to make the second line of your function simpler:
my %template_params = @_;
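If a function should tolerate both calling styles discussed in these answers, the remaining arguments can be normalised first. This is a sketch only: args_to_hash is a hypothetical helper, not part of the asker's code or of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: turns either a single hash reference or a
# flat key => value list into a plain list of pairs.
sub args_to_hash {
    my @args = @_;
    return %{ $args[0] } if @args == 1 && ref( $args[0] ) eq 'HASH';
    die "Odd number of arguments\n" if @args % 2;
    return @args;
}

my %from_list = args_to_hash( iconsize => '32' );
my %from_ref  = args_to_hash( { iconsize => '32' } );

print "$from_list{iconsize} $from_ref{iconsize}\n";   # 32 32
```

Inside a method, it would be called after removing $self, for example my %template_params = args_to_hash(@_);.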

How can I cleanly turn a nested Perl hash into a non-nested one?

Assume a nested hash structure %old_hash ..
my %old_hash;
$old_hash{"foo"}{"bar"}{"zonk"} = "hello";
.. which we want to "flatten" (sorry if that's the wrong terminology!) to a non-nested hash using the sub &flatten(...) so that ..
my %h = &flatten(\%old_hash);
die unless($h{"zonk"} eq "hello");
The following definition of &flatten(...) does the trick:
sub flatten {
my $hashref = shift;
my %hash;
my %i = %{$hashref};
foreach my $ii (keys(%i)) {
my %j = %{$i{$ii}};
foreach my $jj (keys(%j)) {
my %k = %{$j{$jj}};
foreach my $kk (keys(%k)) {
my $value = $k{$kk};
$hash{$kk} = $value;
}
}
}
return %hash;
}
While the code given works it is not very readable or clean.
My question is two-fold:
In what ways does the given code not correspond to modern Perl best practices? Be harsh! :-)
How would you clean it up?

Your method is not best practices because it doesn't scale. What if the nested hash is six, ten levels deep? The repetition should tell you that a recursive routine is probably what you need.
sub flatten {
my ($in, $out) = @_;
for my $key (keys %$in) {
my $value = $in->{$key};
if ( defined $value && ref $value eq 'HASH' ) {
flatten($value, $out);
}
else {
$out->{$key} = $value;
}
}
}
Alternatively, good modern Perl style is to use CPAN wherever possible. Data::Traverse would do what you need:
use Data::Traverse;
sub flatten {
my %hash = @_;
my %flattened;
traverse { $flattened{$a} = $b } \%hash;
return %flattened;
}
As a final note, it is usually more efficient to pass hashes by reference to avoid them being expanded out into lists and then turned into hashes again.
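For reference, here is how the recursive version from this answer might be called. It takes two hash references, the input and an output hash that it fills in place. This is a sketch: the sample data comes from the question, and the explicit return is an addition.

```perl
use strict;
use warnings;

# Same recursive idea as in the answer: copy every non-hash leaf into %$out.
sub flatten {
    my ($in, $out) = @_;
    for my $key (keys %$in) {
        my $value = $in->{$key};
        if ( ref($value) eq 'HASH' ) {
            flatten($value, $out);     # descend into nested hashes
        }
        else {
            $out->{$key} = $value;     # leaf: store under its own key
        }
    }
    return;
}

my %old_hash;
$old_hash{foo}{bar}{zonk} = "hello";

my %flat;
flatten( \%old_hash, \%flat );
print "$flat{zonk}\n";    # hello
```

Leaves that share a key at different depths overwrite each other, and which one survives depends on hash ordering.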

First, I would use perl -c to make sure it compiles cleanly, which it does not. So, I'd add a trailing } to make it compile.
Then, I'd run it through perltidy to improve the code layout (indentation, etc.).
Then, I'd run perlcritic (in "harsh" mode) to automatically tell me what it thinks are bad practices. It complains that:
Subroutine does not end with "return"
Update: the OP essentially changed every line of code after I posted my Answer above, but I believe it still applies. It's not easy shooting at a moving target :)

There are a few problems with your approach that you need to figure out. First off, what happens in the event that there are two leaf nodes with the same key? Does the second clobber the first, is the second ignored, should the output contain a list of them? Here is one approach. First we construct a flat list of key value pairs using a recursive function to deal with other hash depths:
my %data = (
foo => {bar => {baz => 'hello'}},
fizz => {buzz => {bing => 'world'}},
fad => {bad => {baz => 'clobber'}},
);
sub flatten {
my $hash = shift;
map {
my $value = $$hash{$_};
ref $value eq 'HASH'
? flatten($value)
: ($_ => $value)
} keys %$hash
}
print join( ", " => flatten \%data), "\n";
# baz, clobber, bing, world, baz, hello
my %flat = flatten \%data;
print join( ", " => %flat ), "\n";
# baz, hello, bing, world # lost (baz => clobber)
A fix could be something like this, which will create a hash of array refs containing all the values:
sub merge {
my %out;
while (@_) {
my ($key, $value) = splice @_, 0, 2;
push @{ $out{$key} }, $value
}
%out
}
my %better_flat = merge flatten \%data;
In production code, it would be faster to pass references between the functions, but I have omitted that here for clarity.

Is it your intent to end up with a copy of the original hash or just a reordered result?
Your code starts with one hash (the original hash that is used by reference) and makes two copies %i and %hash.
The statement my %i=%{$hashref} is not necessary. You are copying the entire hash to a new hash. In either case (whether you want a copy or not) you can use references to the original hash.
You are also losing data if your hash in the hash has the same value as the parent hash. Is this intended?

Traversing a multi-dimensional hash in Perl

If you have a hash (or reference to a hash) in Perl with many dimensions and you want to iterate across all values, what's the best way to do it? In other words, if we have
$f->{$x}{$y}, I want something like
foreach ($x, $y) (deep_keys %{$f})
{
}
instead of
foreach $x (keys %$f)
{
foreach $y (keys %{$f->{$x}})
{
}
}

Stage one: don't reinvent the wheel :)
A quick search on CPAN throws up the incredibly useful Data::Walk. Define a subroutine to process each node, and you're sorted
use Data::Walk;
my $data = { ... }; # some complex hash/array mess
sub process {
print "current node $_\n";
}
walk \&process, $data;
And Bob's your uncle. Note that if you want to pass it a hash to walk, you'll need to pass a reference to it (see perldoc perlref), as follows (otherwise it'll try and process your hash keys as well!):
walk \&process, \%hash;
For a more comprehensive solution (but harder to find at first glance in CPAN), use Data::Visitor::Callback or its parent module - this has the advantage of giving you finer control of what you do, and (just for extra street cred) is written using Moose.

Here's an option. This works for arbitrarily deep hashes:
sub deep_keys_foreach
{
    my ($hashref, $code, $args) = @_;
    while (my ($k, $v) = each(%$hashref)) {
        my @newargs = defined($args) ? @$args : ();
        push(@newargs, $k);
        if (ref($v) eq 'HASH') {
            deep_keys_foreach($v, $code, \@newargs);
        }
        else {
            $code->(@newargs);
        }
    }
}
deep_keys_foreach($f, sub {
    my ($k1, $k2) = @_;
    print "inside deep_keys, k1=$k1, k2=$k2\n";
});

This sounds to me as if Data::Diver or Data::Visitor are good approaches for you.

Keep in mind that Perl lists and hashes do not have dimensions and so cannot be multidimensional. What you can have is a hash item that is set to reference another hash or list. This can be used to create fake multidimensional structures.
Once you realize this, things become easy. For example:
sub f($) {
my $x = shift;
if( ref $x eq 'HASH' ) {
foreach( values %$x ) {
f($_);
}
} elsif( ref $x eq 'ARRAY' ) {
foreach( @$x ) {
f($_);
}
}
}
Add whatever else needs to be done besides traversing the structure, of course.
One nifty way to do what you need is to pass a code reference to be called from inside f. By using sub prototyping you could even make the calls look like Perl's grep and map functions.
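As a rough sketch of that last idea, the walker below takes a block as its first argument through a (&$) prototype, so it can be called like grep or map. The name walk_leaves and the sample data are made up for illustration. The recursive calls use the &name(...) form, which bypasses the prototype and avoids the "called too early to check prototype" warning inside the sub's own body.

```perl
use strict;
use warnings;

# Hypothetical walker: calls the block once for every non-reference leaf.
# The (&$) prototype lets callers write it like grep or map.
sub walk_leaves (&$) {
    my ($code, $x) = @_;
    if ( ref($x) eq 'HASH' ) {
        &walk_leaves($code, $_) for values %$x;
    }
    elsif ( ref($x) eq 'ARRAY' ) {
        &walk_leaves($code, $_) for @$x;
    }
    else {
        $code->($x);               # leaf value is passed as $_[0]
    }
    return;
}

my $data = { a => [ 1, 2 ], b => { c => 3 } };
my $sum = 0;
walk_leaves { $sum += $_[0] } $data;
print "$sum\n";    # 6
```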

You can also fudge multi-dimensional arrays if you always have all of the key values, or you just don't need to access the individual levels as separate arrays:
$arr{"foo",1} = "one";
$arr{"bar",2} = "two";
while(($key, $value) = each(%arr))
{
@keyValues = split($;, $key);
print "key = [", join(",", @keyValues), "] : value = [", $value, "]\n";
}
This uses the subscript separator "$;" as the separator for multiple values in the key.

There's no way to get the semantics you describe because foreach iterates over a list one element at a time. You'd have to have deep_keys return a LoL (list of lists) instead. Even that doesn't work in the general case of an arbitrary data structure. There could be varying levels of sub-hashes, some of the levels could be ARRAY refs, etc.
The Perlish way of doing this would be to write a function that can walk an arbitrary data structure and apply a callback at each "leaf" (that is, non-reference value). bmdhacks' answer is a starting point. The exact function would vary depending on what you wanted to do at each level. It's pretty straightforward if all you care about is the leaf values. Things get more complicated if you care about the keys, indices, etc. that got you to the leaf.

It's easy enough if all you want to do is operate on values, but if you want to operate on keys, you need specifications of how levels will be recoverable.
a. For instance, you could specify keys as "$level1_key.$level2_key.$level3_key"--or any separator, representing the levels.
b. Or you could have a list of keys.
I recommend the latter.
The level is the number of elements in @$key_stack
and the most local key is $key_stack->[-1].
The path can be reconstructed by: join( '.', @$key_stack )
Code:
use constant EMPTY_ARRAY => [];
use strict;
use Scalar::Util qw<reftype>;
sub deep_keys (\%) {
    sub deeper_keys {
        my ( $key_ref, $hash_ref ) = @_;
        # reftype returns undef for non-references, hence the // ''
        return [ $key_ref, $hash_ref ] if ( reftype( $hash_ref ) // '' ) ne 'HASH';
        my @results;
        while ( my ( $key, $value ) = each %$hash_ref ) {
            my $k = [ @{ $key_ref || EMPTY_ARRAY }, $key ];
            push @results, deeper_keys( $k, $value );
        }
        return @results;
    }
    return deeper_keys( undef, shift );
}
foreach my $kv_pair ( deep_keys %$f ) {
    my ( $key_stack, $value ) = @$kv_pair; # each result is [ key stack, leaf value ]
    ...
}
This has been tested in Perl 5.10.

If you are working with tree data going more than two levels deep, and you find yourself wanting to walk that tree, you should first consider that you are going to make a lot of extra work for yourself if you plan on reimplementing everything you need to do manually on hashes of hashes of hashes when there are a lot of good alternatives available (search CPAN for "Tree").
Not knowing what your data requirements actually are, I'm going to blindly point you at a tutorial for Tree::DAG_Node to get you started.
That said, Axeman is correct, a hashwalk is most easily done with recursion. Here's an example to get you started if you feel you absolutely must solve your problem with hashes of hashes of hashes:
#!/usr/bin/perl
use strict;
use warnings;
my %hash = (
"toplevel-1" =>
{
"sublevel1a" => "value-1a",
"sublevel1b" => "value-1b"
},
"toplevel-2" =>
{
"sublevel1c" =>
{
"value-1c.1" => "replacement-1c.1",
"value-1c.2" => "replacement-1c.2"
},
"sublevel1d" => "value-1d"
}
);
hashwalk( \%hash );
sub hashwalk
{
my ($element) = @_;
if( ref($element) =~ /HASH/ )
{
foreach my $key (keys %$element)
{
print $key," => \n";
hashwalk($$element{$key});
}
}
else
{
print $element,"\n";
}
}
It will output:
toplevel-2 =>
sublevel1d =>
value-1d
sublevel1c =>
value-1c.2 =>
replacement-1c.2
value-1c.1 =>
replacement-1c.1
toplevel-1 =>
sublevel1a =>
value-1a
sublevel1b =>
value-1b
Note that you CAN NOT predict in what order the hash elements will be traversed unless you tie the hash via Tie::IxHash or similar — again, if you're going to go through that much work, I recommend a tree module.
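If a repeatable order is enough and insertion order does not matter, sorting the keys at each level is a lighter alternative to tying the hash. This is a sketch of that swapped-in technique, not of Tie::IxHash; sorted_walk and the trimmed sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical variant of hashwalk: collects "path => value" lines,
# visiting keys in sorted order at each level.
sub sorted_walk {
    my ($element, $path, $lines) = @_;
    if ( ref($element) eq 'HASH' ) {
        for my $key ( sort keys %$element ) {
            sorted_walk( $element->{$key}, [ @$path, $key ], $lines );
        }
    }
    else {
        push @$lines, join( '.', @$path ) . " => $element";
    }
    return $lines;
}

my %hash = (
    "toplevel-2" => { "sublevel1d" => "value-1d" },
    "toplevel-1" => { "sublevel1b" => "value-1b", "sublevel1a" => "value-1a" },
);

print "$_\n" for @{ sorted_walk( \%hash, [], [] ) };
# toplevel-1.sublevel1a => value-1a
# toplevel-1.sublevel1b => value-1b
# toplevel-2.sublevel1d => value-1d
```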