Is %$var dereferencing a Perl hash? - perl

I'm sending a subroutine a hash, and fetching it with my($arg_ref) = @_;
But what exactly is %$arg_ref? Is %$ dereferencing the hash?

$arg_ref is a scalar since it uses the $ sigil. Presumably, it holds a hash reference. So yes, %$arg_ref dereferences that hash reference. Another way to write it is %{$arg_ref}. This makes the intent of the code a bit clearer, though more verbose.
To quote from perldata(1):
Scalar values are always named with '$', even when referring
to a scalar that is part of an array or a hash. The '$'
symbol works semantically like the English word "the" in
that it indicates a single value is expected.
$days # the simple scalar value "days"
$days[28] # the 29th element of array @days
$days{'Feb'} # the 'Feb' value from hash %days
$#days # the last index of array @days
So your example would be:
%$arg_ref # hash dereferenced from the value "arg_ref"
my($arg_ref) = @_; grabs the first item in the function's argument list and places it in a local variable called $arg_ref. The caller is responsible for passing a hash reference. A more canonical way to write that is:
my $arg_ref = shift;
To create a hash reference you could start with a hash:
some_sub(\%hash);
Or you can create it with an anonymous hash reference:
some_sub({pi => 3.14, C => 4}); # Pi is a gross approximation.

Instead of dereferencing the entire hash like that, you can grab individual items with
$arg_ref->{key}
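To put the pieces above side by side, here is a minimal sketch. The sub name describe and the sample data are made up for illustration; only the dereferencing syntax comes from the answers above.

```perl
use strict;
use warnings;

# Hypothetical sub: expects a single hash reference as its argument.
sub describe {
    my ($arg_ref) = @_;

    my %copy  = %$arg_ref;          # dereference the whole hash (shallow copy)
    my $count = keys %{$arg_ref};   # same dereference, braces spelled out
    my $pi    = $arg_ref->{pi};     # fetch one value without copying

    return ($count, $pi, \%copy);
}

my %hash = (pi => 3.14, C => 4);
my ($count, $pi, $copy_ref) = describe(\%hash);

# Changing the copy leaves the caller's hash alone.
$copy_ref->{pi} = 3;
print "$count keys, pi is $pi, original pi is still $hash{pi}\n";
```

Note that %$arg_ref produces a copy, while $arg_ref->{key} reads (or writes) the caller's hash directly.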

A good brief introduction to references (creating them and using them) in Perl is perldoc perlreftut. You can also read it online (or get it as a pdf). (It talks more about references in complex data structures than in terms of passing in and out of subroutines, but the syntax is the same.)

my %hash = ( fred => 'wilma',
barney => 'betty');
my $hashref = \%hash;
my $freds_wife = $hashref->{fred};
my %hash_copy = %$hashref; # or %{$hashref} as noted above.
So, what's the point of the syntax flexibility? Let's try this:
my %flintstones = ( fred => { wife => 'wilma',
kids => ['pebbles'],
pets => ['dino'],
},
barney => { }, # etc ...
);
Actually for deep data structures like this it's often more convenient to start with a ref:
my $flintstones = { fred => { wife => 'Wilma',
kids => ['Pebbles'],
pets => ['Dino'],
},
};
OK, so fred gets a new pet, 'Velociraptor'
push @{$flintstones->{fred}->{pets}}, 'Velociraptor';
How many pets does Fred have?
scalar @{ $flintstones->{fred}->{pets} }
Let's feed them ...
for my $pet ( @{ $flintstones->{fred}->{pets} } ) {
feed($pet)
}
and so on. The curly-bracket soup can look a bit daunting at first, but it becomes quite easy to deal with them in the end, so long as you're consistent in the way that you deal with them.
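As a runnable sketch of the same operations (the data mirrors the example above; the variable names $pet_count and @pets are added here for illustration):

```perl
use strict;
use warnings;

my $flintstones = {
    fred => { wife => 'Wilma', kids => ['Pebbles'], pets => ['Dino'] },
};

# The arrow between two adjacent subscripts is optional.
push @{ $flintstones->{fred}{pets} }, 'Velociraptor';

# Dereference the inner array: count it, then copy it out.
my $pet_count = scalar @{ $flintstones->{fred}{pets} };
my @pets      = @{ $flintstones->{fred}{pets} };

print "Fred has $pet_count pets: @pets\n";
```

The @{ ... } block wraps whatever expression yields the array reference, which is what keeps the "soup" consistent however deep the structure goes.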

Since this construct is evidently being used to pass a hash reference as a set of named arguments to a sub, it should also be noted that this
sub foo {
my ($arg_ref) = @_;
# do something with $arg_ref->{keys}
}
may be overkill as opposed to just unpacking @_
sub bar {
my ($a, $b, $c) = @_;
return $c / ( $a * $b );
}
Depending on how complex the argument list is.

Related

Unable to properly merge hashes

I currently have the following
# $dog, $cat, $rat are all hash refs
my %rethash = ('success' => 'Your Cool');
my %ref ={ 'dog' => $dog, 'cat' => $cat, 'mouse' => $rat,
'chicken' => '' };
my $perlobj = ( \%ref,\%rethash );
When $perlobj is dumped this is the result
$VAR1 = {
'success' => 'Your Cool'
};
However when warnings are enabled I get the following message
Useless use of reference constructor in void context at ..
I realize there is something terribly wrong with how %ref is assigned using {}. What is wrong with this code? I can't seem to get rid of this warning.
EDIT:
Ok, I think I figured out what's going on:
my $perlobj = ( \%ref,\%rethash );
This does not merge the hashes; it makes $perlobj a reference to %rethash. This is obvious after reading your responses.
What RobEarl is saying is correct. I'll give an explanation of that and add some more stuff.
Your variable name %ref and the fact that you are using {} kinda implies you want a reference here.
Let's take a look what value we will have in %ref. Consider this example.
use strict; use warnings;
use Data::Printer;
my %foo = { key => 'value' };
p %foo;
This will throw a warning Reference found where even-sized list expected on my Perl 5.20.2. The output will be:
{
HASH(0x7e33c0) undef
}
It's a hash with a hashref as the key and undef as a value. HASH(0x7e33c0) is what you get when you look at a hash reference without dereferencing it. (The {} are there because Data::Printer converts the hash to a hashref).
Back to your code, the correct sigil for a reference is $. It does not matter what kind of reference it is. The reference is always a scalar (a pointer to the place in memory where the hash/array/something is stored).
my $ref = {
dog => $dog,
cat => $cat,
mouse => $rat,
chicken => '', # maybe this should be undef?
};
Now you've got a hashref with the values of $dog, $cat, $rat and an empty string.
Now you're assigning a variable named $perlobj, which implies it's an object. Instead you are assigning a scalar variable (the $ makes it a scalar) with a list. If you do that, Perl will only assign the right-most value to the variable.
my $foo = (1, 2, 3); # $foo will be 3 and there's a warning
You are assigning a list of two references. The first one is disregarded and only \%rethash gets assigned. That works because conveniently, $perlobj is a scalar, and references are also scalars. So now $perlobj is a reference to %rethash. That's why your Data::Dumper output looks like %rethash.
I'm not sure what you want to do, so I cannot really help you with that. I suggest you read up on some stuff.
perlreftut is useful to learn how references work
If you want to do Object Oriented Programming, check out Moose
It might also be useful to just go get a book to learn a bit more about basic Perl. Learning Perl by Randal L. Schwartz and Beginning Perl by Curtis Poe are both very good for that
You are taking a list of hash references and assigning them to a scalar
my $perlobj = ( \%ref, \%rethash ); # same as $perlobj = \%rethash
Instead you want to take a reference to a merger of hashes
my $perlobj = { %ref, %rethash };
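A small sketch of that merge, assuming two ordinary hashes (the names %animals and $merged and the sample values are invented for illustration):

```perl
use strict;
use warnings;

my %animals = (dog => 'Rex', cat => 'Tom');
my %rethash = (success => 'ok', cat => 'Felix');

# Both hashes flatten into one key/value list inside the { }.
# When a key appears twice, the later pair wins.
my $merged = { %animals, %rethash };

print join(', ', map { "$_=$merged->{$_}" } sort keys %$merged), "\n";
```

The merge is shallow: if a value is itself a reference, the merged hash shares it with the original rather than copying it.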

Need help understanding portion of script (globs and references)

I was reviewing this question, esp the response from Mr Eric Strom, and had a question regarding a portion of the more "magical" element within. Please review the linked question for the context as I'm only trying to understand the inner portion of this block:
for (qw($SCALAR @ARRAY %HASH)) {
my ($sigil, $type) = /(.)(.+)/;
if (my $ref = *$glob{$type}) {
$vars{$sigil.$name} = /\$/ ? $$ref : $ref
}
}
So, it loops over three words, breaking each into two vars, $sigil and $type. The if {} block is what I am not understanding. I suspect the portion inside the ( .. ) is getting a symbolic reference to the content within $glob{$type}... there must be some "magic" (some esoteric element of the underlying mechanism that I don't yet understand) relied upon there to determine the type of the "pointed-to" data?
The next line is also partly baffling. Appears to me that we are assigning to the vars hash, but what is the rhs doing? We did not assign to $_ in the last operation ($ref was assigned), so what is being compared to in the /\$/ block? My guess is that, if we are dealing with a scalar (though I fail to discern how we are), we deref the $ref var and store it directly in the hash, otherwise, we store the reference.
So, just looking for a little tale of what is going on in these three lines. Many thanks!
You have hit upon one of the most arcane parts of the Perl language, and I can best explain by referring you to Symbol Tables and Typeglobs from brian d foy's excellent Mastering Perl. Note also that there are further references to the relevant sections of Perl's own documentation at the bottom of the page, the most relevant of which is Typeglobs and Filehandles in perldata.
Essentially, the way perl symbol tables work is that every package has a "stash" -- a "symbol table hash" -- whose name is the same as the package but with a pair of trailing colons. So the stash for the default package main is called %main::. If you run this simple program
perl -E"say for keys %main::"
you will see all the familiar built-in identifiers.
The values for the stash elements are references to typeglobs, which again are hashes but have keys that correspond to the different data types, SCALAR, ARRAY, HASH, CODE etc. and values that are references to the data item with that type and identifier.
Suppose you define a scalar variable $xx, or more fully, $main::xx
our $xx = 99;
Now the stash for the main package is %main::, and the typeglob for all data items with the identifier xx is referenced by $main::{xx} so, because the sigil for typeglobs is a star * in the same way that scalar identifiers have a dollar $, we can dereference this as *{$main::{xx}}. To get the reference to the scalar variable that has the identifier xx, this typeglob can be indexed with the SCALAR string, giving *{$main::{xx}}{SCALAR}. Once more, this is a reference to the variable we're after, so to collect its value it needs dereferencing once again, and if you write
say ${*{$main::{xx}}{SCALAR}};
then you will see 99.
That may look a little complex when written in a single statement, but it is fairly straightforward when split up. The code in your question has the variable $glob set to a reference to a typeglob, which corresponds to this with respect to $main::xx
my $type = 'SCALAR';
my $glob = $main::{xx};
my $ref = *$glob{$type};
Now if we say $ref we get SCALAR(0x1d12d94) or similar, which is a reference to $main::xx as before, and printing $$ref will show 99 as expected.
The subsequent assignment to %vars is straightforward Perl, and I don't think you should have any problem understanding that once you get the principle that a package's symbol table is a stash of typeglobs, or really just a hash of hashes.
The elements of the iteration are strings. Since we don't have a lexical variable at the top of the loop, the element variable is $_. And it retains that value throughout the loop. Only one of those strings has a literal dollar sign, so we're telling the difference between '$SCALAR' and the other cases.
So what it is doing is getting 3 slots out of a package-level typeglob (sometimes shortened, with a little ambiguity to "glob"). *g{SCALAR}, *g{ARRAY} and *g{HASH}. The glob stores a hash and an array as a reference, so we simply store the reference into the hash. But, the glob stores a scalar as a reference to a scalar, and so needs to be dereferenced, to be stored as just a scalar.
So if you had a glob *a and in your package you had:
our $a = 'boo';
our @a = ( 1, 2, 3 );
our %a = ( One => 1, Two => 2 );
The resulting hash would be:
{ '$a' => 'boo'
, '%a' => { One => 1, Two => 2 }
, '@a' => [ 1, 2, 3 ]
};
Meanwhile the glob can be thought to look like this:
a =>
{ SCALAR => \'boo'
, ARRAY => [ 1, 2, 3 ]
, HASH => { One => 1, Two => 2 }
, CODE => undef
, IO => undef
, GLOB => undef
};
So to specifically answer your question.
if (my $ref = *$glob{$type}) {
$vars{$sigil.$name} = /\$/ ? $$ref : $ref
}
An unused slot yields undef (the one exception is SCALAR, which always gives back a scalar reference). Thus $ref is assigned either a reference, which is true, or undef, which is false. So if we have a reference, we store the contents of that glob slot in the %vars hash under the key $sigil . $name: the reference itself if it is a "container type" (array or hash), or the dereferenced value if it is a scalar.
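The slot lookup can be tried in isolation. This is only a sketch: the package variables $xx and @xx are invented for the demonstration, and the glob reference is taken with \*main::xx rather than through the stash as in the linked code.

```perl
use strict;
use warnings;

our $xx = 99;
our @xx = (1, 2, 3);

my $glob = \*main::xx;    # reference to the typeglob for the name "xx"

my %vars;
for my $type (qw(SCALAR ARRAY HASH CODE)) {
    # Unused ARRAY, HASH and CODE slots give undef, so they are skipped.
    my $ref = *{$glob}{$type} or next;

    # Scalars are stored by value, containers by reference.
    $vars{$type} = $type eq 'SCALAR' ? $$ref : $ref;
}

print "scalar: $vars{SCALAR}, array: @{ $vars{ARRAY} }\n";
```

Here the test on $type plays the role that the /\$/ match against $_ plays in the original loop.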

perl: can a hash entry reference be used to delete from the hash?

Here's some perl pseudo code for what I'm asking:
my %x;
# initialize %x
my $ref = whateverSyntacticSugarIsNeeded( $x{this}{hash}{is}{deep} );
# ...
# make use of $ref multiple times
# ...
delete $ref; # ideally, this would delete $x{this}{hash}{is}{deep}
... where the idea is to avoid use of $x{this}{hash}{is}{deep} more than is absolutely necessary.
I'm fairly sure this isn't possible and the least uses possible is 2 (the initial ref/copy of the value, then to delete the key/value pair from %x). However, if I'm mistaken then feel free to correct me.
It's not clear what you want exactly. If
%x = ( very => { deep => { hash => { is => "here" } } } );
and you assign
$y = $x{very}{deep}{hash}{is}
then it's like writing
$y = 'here'
so you can't delete $y. You can, though,
$z = $x{very}{deep}{hash};
delete $z->{is};
The easiest thing is just to reference the hash one level up.
This is especially true if you give the variable a semantically appropriate name (something other than ref):
my %x;
# initialize %x
my $ref = $x{this}{hash}{is};
# ...
# make use of $ref multiple times
# ...
delete $ref->{deep};
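A runnable sketch of this approach, with made-up contents for %x and the hypothetical name $parent for the one-level-up reference:

```perl
use strict;
use warnings;

my %x = (this => { hash => { is => { deep => 'value', other => 1 } } });

my $parent = $x{this}{hash}{is};   # reference to the hash one level up

my $value = $parent->{deep};       # read through the short name
delete $parent->{deep};            # removes $x{this}{hash}{is}{deep}

my $gone = !exists $x{this}{hash}{is}{deep};
print $gone ? "deleted\n" : "still there\n";
```

The full path is written once; every later use, including the delete, goes through $parent.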
Perl tracks whether or not a piece of memory is in use by keeping a count of the references to it. If I did this:
my $ref->{this}->{is}->{a}->{deep} = "hash";
undef $ref; # Note I use "undef" and not "delete"
I free up all of that memory. All of the hash references and the actual scalar that those hash references point to. That's because I have no further way of accessing that memory.
If I do something a bit simpler:
my %hash = ( one => 1, two => 2, ref => { three => 3, four => 4 }, five => 5 );
Note that $hash{ref} is a reference to another hash. If I did this:
my $ref = $hash{ref};
I now have two variables that can access that piece of memory. Doing this:
delete $hash{ref};
does not free up that memory because $ref still points to it. However, that hash reference is no longer in my %hash hash.
If I didn't delete $hash{ref}, but did this:
$ref->{seven} = 7;
I have changed %hash because $ref and $hash{ref} point to the same piece of memory: That same hash reference. Doing this:
delete $hash{ref}->{four};
or
delete $ref->{four};
will both delete a particular entry in that hash reference.
We don't have to do something that complex either:
my %hash = ( one => 1, two => 2, three => 3 );
my $ref = \%hash; #Creating a reference to that hash
delete $ref->{three};
This will delete $hash{three} since both are pointing to the same hash in memory. However,
undef $ref;
will not undefine $hash too.
I hope this covers your question. As long as something still refers to a memory location, Perl does not free it. And if a reference points to a data structure, changing the structure through that reference also changes what you see through the original array or hash variable.

Subroutine that returns hash - breaks it into separate variables

I have a subroutine that returns a hash. Last lines of the subroutine:
print Dumper(\%fileDetails);
return %fileDetails;
in this case the dumper prints:
$VAR1 = {
'somthing' => 0,
'somthingelse' => 7.68016712043654,
'else' => 'burst'
}
But when I try to dump it calling the subroutine with this line:
print Dumper(\fileDetailsSub($files[$i]));
the dumper prints:
$VAR1 = \'somthing';
$VAR2 = \0;
$VAR3 = \'somthingelse';
$VAR4 = \7.68016712043654;
$VAR5 = \'else';
$VAR6 = \'burst';
Once the hash is broken, I can't use it anymore.
Why does it happen? And how can I preserve the proper structure on subroutine return?
Thanks,
Mark.
There's no such thing as returning a hash in Perl.
Subroutines take lists as their arguments and they can return lists as their result. Note that a list is a very different creature from an array.
When you write
return %fileDetails;
This is equivalent to:
return ( 'something', 0, 'somethingelse', 7.68016712043654, 'else', 'burst' );
When you invoke the subroutine and get that list back, one thing you can do is assign it to a new hash:
my %result = fileDetailsSub();
That works because a hash can be initialized with a list of key-value pairs. (Remember that (foo => 42, bar => 43 ) is the same thing as ('foo', 42, 'bar', 43).)
Now, when you use the backslash reference operator on a hash, as in \%fileDetails, you get a hash reference, which is a scalar that points to a hash.
Similarly, if you write \@array, you get an array reference.
But when you use the reference operator on a list, you don't get a reference to a list: lists are ephemeral, not variables, so they can't be referenced. Instead, the reference operator distributes over list items, so
\( 'foo', 'bar', 'baz' );
makes a new list:
( \'foo', \'bar', \'baz' );
(In this case we get a list full of scalar references.) And this is what you're seeing when you try to Dumper the results of your subroutine: a reference operator distributed over the list of items returned from your sub.
So, one solution is to assign the result list to an actual hash variable before using Dumper. Another is to return a hash reference (what you're Dumpering anyway) from the sub:
return \%fileDetails;
...
my $details_ref = fileDetailsSub();
print Dumper( $details_ref );
# access it like this:
my $elem = $details_ref->{something};
my %copy = %{ $details_ref };
For more fun, see:
perldoc perlreftut - the Perl reference tutorial, and
perldoc perlref - the Perl reference reference.
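The three behaviours described in this answer can be compared in a short sketch. The subs details_list and details_ref and their data are hypothetical stand-ins for the asker's fileDetailsSub.

```perl
use strict;
use warnings;

# Two hypothetical variants of the same sub.
sub details_list { my %d = (type => 'burst', size => 7); return %d;  }
sub details_ref  { my %d = (type => 'burst', size => 7); return \%d; }

my @refs = \details_list();   # backslash distributes over the returned list
my %hash = details_list();    # the flat key/value list rebuilds a hash
my $href = details_ref();     # one scalar holding a hash reference

printf "%d scalar refs, type=%s, size=%s\n",
    scalar(@refs), $hash{type}, $href->{size};
```

Returning the reference avoids flattening and copying the hash, at the cost of callers having to dereference with ->.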
Why not return a reference to the hash instead?
return \%fileDetails;
As long as it is a lexical variable, it will not complicate things with other uses of the subroutine. I.e.:
sub fileDetails {
my %fileDetails;
... # assign stuff
return \%fileDetails;
}
When the execution leaves the subroutine, the variable goes out of scope, but the data contained in memory remains.
The reason the Dumper output looks like that is that you are feeding it a referenced list. Subroutines cannot return arrays or hashes, they can only return lists of scalars. What you are doing is something like this:
print Dumper \(qw(something 0 somethingelse 7.123 else burst));
Perl functions cannot return hashes, only lists. A return %foo statement will flatten out %foo into a list and return the flattened list. To get the return value to be interpreted as a hash, you can assign it to a named hash
%new_hash = fileDetailsSub(...);
print Dumper(\%new_hash);
or cast it (not sure if that is the best word for it) with a %{{...}} sequence of operations:
print Dumper( \%{ {fileDetailsSub(...)} } );
Another approach, as TLP points out, is to return a hash reference from your function.
You can't return a hash directly, but perl can automatically convert between hashes and lists as needed. So perl is converting that into a list, and you are capturing it as a list. i.e.
Dumper( filedetail() ) # list
my %fd = filedetail(); Dumper( \%fd ); #hash
In list context, Perl does not distinguish between a hash and a list of key/value pairs. That is, if a subroutine returns a hash, what it really returns is a list of (key1, value1, key2, value2...). Fortunately, that works both ways; if you take such a list and assign it to a hash, you get a faithful copy of the original:
my %fileDetailsCopy = subroutineName();
But if it wouldn't break other code, it would probably make more sense to have the sub return a reference to the hash instead, as TLP said.

How to manipulate a hash-ref with Perl?

Take a look at this code. After hours of trial and error, I finally got a solution. But I have no idea why it works, and to be quite honest, Perl is throwing me for a loop here.
use Data::Diff 'Diff';
use Data::Dumper;
my $out = Diff(\@comparr,\@grabarr);
my @uniq_a;
@temp = ();
my $x = @$out{uniq_a};
foreach my $y (@$x) {
@temp = ();
foreach my $z (@$y) {
push(@temp, $z);
}
push(@uniq_a, [$temp[0], $temp[1], $temp[2], $temp[3]]);
}
Why is it that the only way I can access the elements of the $out array is to pass a hash key into a scalar which has been cast as an array using a for loop? my $x = @$out{uniq_a}; I'm totally confused. I'd really appreciate anyone who can explain what's going on here so I'll know for the future. Thanks in advance.
$out is a hash reference, and you use the dereferencing operator ->{...} to access members of the hash that it refers to, like
$out->{uniq_a}
What you have stumbled on is Perl's hash slice notation, where you use the @ sigil in front of the name of a hash to conveniently extract a list of values from that hash. For example:
%foo = ( a => 123, b => 456, c => 789 );
$foo = { a => 123, b => 456, c => 789 };
print @foo{"b","c"}; # the list (456, 789)
print @$foo{"c","a"}; # the list (789, 123)
Using hash slice notation with a single element inside the braces, as you do, is not the typical usage and gives you the results you want by accident: a slice assigned to a scalar yields its last value, which here is also its only one.
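To see the difference between an element lookup and a slice, here is a small sketch reusing the $foo data from above (the variable names $one, @some, @same and $last are added for illustration):

```perl
use strict;
use warnings;

my $foo = { a => 123, b => 456, c => 789 };

my $one  = $foo->{a};           # single element: the usual arrow form
my @some = @$foo{qw(b c)};      # hash slice: a list of values
my @same = @{$foo}{qw(b c)};    # the same slice, braces spelled out
my $last = @$foo{qw(b c)};      # slice assigned to a scalar: last value only

print "$one | @some | $last\n";
```

With a single key the "last value" is the only value, which is why the one-key slice in the question happened to behave like $out->{uniq_a}.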
The Diff function returns a hash reference. You are accessing the element of this hash that has key uniq_a by extracting a one-element slice of the hash, instead of the correct $out->{uniq_a}. Your code should look like this
my $out = Diff(\@comparr, \@grabarr);
my @uniq_a;
my $uniq_a = $out->{uniq_a};
for my $list (@$uniq_a) {
my @temp = @$list;
push @uniq_a, [ @temp[0..3] ];
}
In the documentation for Data::Diff it states:
The value returned is always a hash reference and the hash will have
one or more of the following hash keys: type, same, diff, diff_a,
diff_b, uniq_a and uniq_b
So $out is a reference and you have to access the values through the mentioned keys.