How are hash references created uniquely each time in Perl?

my %hash1 = ( a => 1, b => 2, c => 3 );
my %hash2 = ( a => 1, b => 2, c => 3 );
my $hash_ref1 = \%hash1;
my $hash_ref2 = \%hash2;
How does the Perl compiler create two distinct hash references in memory even though the key-value pairs are the same for both hashes?

Quite simply, my creates a new variable when executed.
Maybe you think ( ... ) creates the hash. It does not. The parens are simply there to change precedence, like in mathematics. a => 1, b => 2, c => 3 simply puts 6 scalars on the stack, to be assigned to the hash created by my.
my %h; is analogous to Hash h = new Hash(); in another language.
use Data::Printer;
my @a; # Creates an array.
for ( 1..3 ) {
    my %h = ( id => "x$_" ); # Creates a hash (each time).
    push @a, \%h;
}
p @a;
Output:
[
[0] {
id "x1"
},
[1] {
id "x2"
},
[2] {
id "x3"
}
]
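To check that each pass through the loop really yields a distinct hash, one can compare the addresses behind the references. The following is only a minimal sketch: it relies on refaddr from the core Scalar::Util module, and the variable names are illustrative rather than taken from the question.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my @refs;
for my $n ( 1 .. 3 ) {
    my %h = ( id => 'same' );    # identical contents on every iteration
    push @refs, \%h;
}

# Collect the distinct addresses behind the three references.
my %seen     = map { refaddr($_) => 1 } @refs;
my $distinct = keys %seen;
print "distinct hashes: $distinct\n";

# Changing one hash leaves the others untouched.
$refs[0]{id} = 'changed';
print "second id: $refs[1]{id}\n";
```

Even though the contents are identical, the three references point at three separate hashes, so a change made through one is not visible through the others.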
Internally, Perl has a number of optimizations to avoid having to create and destroy so many variables. The above actually describes the behaviour you should be observing rather than what actually happens.
Internal details:
my actually creates the variable at compile time. When executed, it pushes a special instruction on the stack. When the scope is exited, this special instruction causes the variable to be cleared (in the same manner that $s=undef;, @a=(); and %h=(); would) rather than destroyed. If the reference count indicates the variable is still being used, a new scalar/array/hash is created instead. Yes, that means that my causes variables to be created on scope exit.
sub f {
# When compiled, creates a scalar.
# When executed, stacks instruction to clear $x.
my $x = shift;
# After copying $x on the stack,
# this simply clears $x instead of destroying it.
return $x;
}
sub g {
# When compiled, creates a scalar.
# When executed, stacks instruction to clear $y.
my $y = shift;
# After creating a reference to $y on the stack,
# this creates a new scalar for $y since
# that scalar is still being referenced.
# The old scalar for $y will get destroyed
# once all remaining references to it are released.
return \$y;
}
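The observable effect of the behaviour described for g can be checked from the outside. This is a small sketch, assuming only the sub g shown above; it does not inspect Perl's internals, it merely confirms that two calls hand back references to two different scalars.

```perl
use strict;
use warnings;

sub g {
    my $y = shift;
    return \$y;    # the caller keeps this scalar alive
}

my $r1 = g(1);
my $r2 = g(2);

# References are numerically equal only when they point at the same variable.
my $same_scalar = ( $r1 == $r2 ) ? 1 : 0;
print "first: $$r1, second: $$r2, same scalar: $same_scalar\n";
```

Because $r1 still refers to the scalar from the first call, the second call has to work with a fresh one, and the first value is not overwritten.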

Related

Why does perl insert an undef value into my hash?

Let me start off with a simple minimal example:
use strict;
use warnings;
use Data::Dumper;
my %hash;
$hash{count} = 4;
$hash{elems}[$_] = {} for (1..$hash{count});
print Dumper \%hash;
Here is the result (reformatted):
$VAR1 = {
'count' => 4,
'elems' => [undef, {}, {}, {}, {}]
};
I do not understand why the first element of $hash{elems} became undef.
I know there are probably easier ways to do what I am doing, but I am creating these empty hashes so that I can later do my $e = $hash{elems}[$i] and continue to use $e to interact with the element, e.g. continue the horror of nested structures with $e->{subelems}[0] = 100.
Array indices start at 0 in Perl (and in most programming languages for that matter).
In the 1st iteration of $hash{elems}[$_] = {} for (1..$hash{count});, $_ is 1, and you thus put {} at index 1 of $hash{elems}.
Since you didn't put anything at index 0 of $hash{elems}, it contains undef.
To remedy this, you could use push instead of assigning to specific indices:
push @{$hash{elems}}, {} for 1 .. $hash{count};
push adds items at the end of its first argument. Initially, $hash{elems} is empty, so the end is the 1st index (0).
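As a quick check of the push approach, the sketch below builds the same structure and counts the undefined slots. The variable names $length and $undefined are introduced here for illustration only.

```perl
use strict;
use warnings;

my %hash = ( count => 4 );

# push appends at the end, so the first element lands at index 0.
push @{ $hash{elems} }, {} for 1 .. $hash{count};

my $length    = scalar @{ $hash{elems} };
my $undefined = grep { !defined } @{ $hash{elems} };
print "length: $length, undefined slots: $undefined\n";
```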
Some tips:
The parentheses are not needed in for (1..$hash{count}): for 1 .. $hash{count} works just as well and looks a bit lighter.
You could initialize your hash when you declare it:
my %hash = (
count => 4,
elems => [ map { {} } 1 .. 4 ]
);
Initializing elems with an arrayref of hashrefs is often useless, thanks to autovivification. Simply doing $hash{elems}[0]{some_key} = 42 will create an arrayref in $hash{elems}, a hashref at index 0 in this array, containing the key some_key with value 42.
In some cases though, your initialization could make sense. For instance, if you want to pass $hash{elems} (but not $hash) to a function (same thing if you want to pass $hash{elems}[..] to a function without passing $hash{elems}).
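The autovivification mentioned in the tips can be seen directly. This is a minimal sketch that starts from an empty hash and performs only the single assignment from the text.

```perl
use strict;
use warnings;

my %hash;
$hash{elems}[0]{some_key} = 42;    # no prior initialization of elems

print ref( $hash{elems} ),       "\n";    # type of the autovivified container
print ref( $hash{elems}[0] ),    "\n";    # type of the element at index 0
print $hash{elems}[0]{some_key}, "\n";
```

The first print reports an array reference and the second a hash reference, both created on the fly by the assignment.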

Perl auto-vivification on assignment

Does Perl really autovivify a key when a nonexistent key is assigned to a variable?
I have this code :
my $variable = $self->{database}->{'my_key'}[0];
The variable $self->{database}->{'my_key'}[0] is undefined in my hash, but if I print a Dumper after the assignment, I'm surprised that the my_key is created.
I know the functionality for this case :
use Data::Dumper;
my $array;
$array->[3] = 'Buster'; # autovivification
print Dumper( $array );
This will give me the results :
$VAR1 = [
undef,
undef,
undef,
'Buster'
];
But I never expected it to work the other way around, where:
my $weird_autovivification = $array->[3];
will also vivify $array->[3].
But never expected to work the other way around, where: my $weird_autovivification = $array->[3]; will also vivify $array->[3].
That's not how it works.
$ perl -MData::Dumper -E'$foo=$array->[3]; say Dumper $array'
$VAR1 = [];
Executing that code has turned $array into an array reference (where, previously, it would have been undefined), but it hasn't set $array->[3] to anything.
If we add another level of look-up, we get slightly different behaviour:
$ perl -MData::Dumper -E'$foo=$array->[0][3]; say Dumper $array'
$VAR1 = [
[]
];
Here, Perl has created $array->[0] and set it to a reference to an empty array, but it hasn't affected $array->[0][3].
In general, as you're going through a chain of look-ups in a complex data structure, Perl will autovivify all but the last link in the chain. When you think about it, that makes a lot of sense. Perl needs to autovivify one link in the chain so that it can check the existence of the next one.
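Applied to the expression from the question, the rule can be verified with exists. This is a sketch under the assumption that $self starts out as an empty hash reference; the helper variables are illustrative.

```perl
use strict;
use warnings;

my $self     = {};
my $variable = $self->{database}->{'my_key'}[0];    # a read, not a write

# Every link except the last one has been created by the read.
my $has_database = exists $self->{database}         ? 1 : 0;
my $has_my_key   = exists $self->{database}{my_key} ? 1 : 0;
my $elements     = scalar @{ $self->{database}{my_key} };

print "database: $has_database, my_key: $has_my_key, elements: $elements\n";
```

The intermediate hash and array exist afterwards, but the array is still empty, so element 0 itself was never created.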
Does Perl really autovivify a key when a nonexistent key is assigned to a variable?
Perl autovivifies variables (including array elements and hash values) when they are dereferenced.
$foo->{bar} [ $foo dereferenced as a hash ] ≡ ( $foo //= {} )->{bar}
$foo->[3] [ $foo dereferenced as an array ] ≡ ( $foo //= [] )->[3]
$$foo [ $foo dereferenced as a scalar ] ≡ ${ $foo //= do { \my $anon } }
etc
This means that
$self->{database}->{'my_key'}[0]
autovivifies
$self (to a hash ref if undefined)
$self->{database} (to a hash ref if undefined)
$self->{database}->{'my_key'} (to an array ref if undefined)
but not
$self->{database}->{'my_key'}[0] (since it wasn't dereferenced)
But never expected to work the other way around, where: my $weird_autovivification = $array->[3]; will also vivify $array->[3].
Not quite. It autovivifies $array, the variable being dereferenced. Nothing was assigned to $array->[3] since it wasn't dereferenced.
Tip: The autovivification pragma can be used to control when autovivification occurs.
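Where installing that CPAN module is not an option, a similar effect can be had by checking each level by hand before dereferencing it. The sketch below is one possible approach, not the pragma itself; first_entry is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: read $self->{database}{my_key}[0] without
# creating any of the intermediate containers.
sub first_entry {
    my ($self) = @_;
    return unless ref($self) eq 'HASH';

    my $database = $self->{database};    # plain lookup, nothing dereferenced yet
    return unless ref($database) eq 'HASH';

    my $list = $database->{my_key};
    return unless ref($list) eq 'ARRAY';

    return $list->[0];
}

my $self  = {};
my $value = first_entry($self);
my $keys  = keys %$self;
print "keys created by the read: $keys\n";
```

Each lookup is only dereferenced further once it is known to hold a reference of the right kind, so the read leaves $self untouched.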

perl: can a hash entry reference be used to delete from the hash?

Here's some perl pseudo code for what I'm asking:
my %x;
# initialize %x
my $ref = whateverSyntacticSugarIsNeeded( $x{this}{hash}{is}{deep} );
# ...
# make use of $ref multiple times
# ...
delete $ref; # ideally, this would delete $x{this}{hash}{is}{deep}
... where the idea is to avoid use of $x{this}{hash}{is}{deep} more than is absolutely necessary.
I'm fairly sure this isn't possible and the least uses possible is 2 (the initial ref/copy of the value, then to delete the key/value pair from %x). However, if I'm mistaken then feel free to correct me.
It's not clear what you want exactly. If
%x = ( very => { deep => { hash => { is => "here" } } } );
and you assign
$y = $x{very}{deep}{hash}{is}
then it's like writing
$y = 'here'
so you can't delete $y. You can, though,
$z = $x{very}{deep}{hash};
delete $z->{is};
The easiest thing is just to reference the hash one level up.
This is especially true if you give the variable a semantically appropriate name (something other than ref):
my %x;
# initialize %x
my $ref = $x{this}{hash}{is};
# ...
# make use of $ref multiple times
# ...
delete $ref->{deep};
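A runnable version of this idea might look as follows. It is a sketch with made-up data; $parent stands in for whatever semantically appropriate name fits the real structure.

```perl
use strict;
use warnings;

my %x = ( this => { hash => { is => { deep => 'value' } } } );

my $parent = $x{this}{hash}{is};    # reference to the hash one level up
my $value  = $parent->{deep};       # use the entry as often as needed
delete $parent->{deep};             # removes $x{this}{hash}{is}{deep}

my $still_there = exists $x{this}{hash}{is}{deep} ? 1 : 0;
print "value: $value, still there: $still_there\n";
```

The long path is written once, when $parent is taken; both the reads and the delete go through the short name.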
Perl tracks whether a piece of memory is in use by keeping a count of the references to it. If I did this:
my $ref;
$ref->{this}->{is}->{a}->{deep} = "hash";
undef $ref; # Note I use "undef" and not "delete"
I free up all of that memory. All of the hash references and the actual scalar that those hash references point to. That's because I have no further way of accessing that memory.
If I do something a bit simpler:
my %hash = ( one => 1, two => 2, ref => { three => 3, four => 4 }, five => 5 );
Note that $hash{ref} is a reference to another hash. If I did this:
my $ref = $hash{ref};
I now have two variables that can access that piece of memory. Doing this:
delete $hash{ref};
does not free up that memory because $ref still points to it. However, that hash reference is no longer in my %hash hash.
If I didn't delete $hash{ref}, but did this:
$ref->{seven} = 7;
I have changed %hash because $ref and $hash{ref} point to the same piece of memory: That same hash reference. Doing this:
delete $hash{ref}->{four};
or
delete $ref->{four};
will both delete a particular entry in that hash reference.
We don't have to do something that complex either:
my %hash = ( one => 1, two => 2, three => 3 );
my $ref = \%hash; #Creating a reference to that hash
delete $ref->{three};
This will delete $hash{three} since both are pointing to the same hash in memory. However,
undef $ref;
will not undefine $hash too.
I hope this covers your question. As long as there is another way to refer to a memory location, Perl does not free it. And if a reference points to a data structure, changes made through the reference also change the array or hash variable that holds that structure.
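The point at which the memory is actually released can be observed with a weak reference, which does not count towards the reference count and becomes undef once the data is freed. This is a sketch using weaken from the core Scalar::Util module; $watcher is a name introduced here for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my %hash = ( one => 1, ref => { three => 3, four => 4 } );
my $ref  = $hash{ref};

my $watcher = $hash{ref};
weaken($watcher);    # observes the inner hash without keeping it alive

delete $hash{ref};
my $alive_after_delete = defined $watcher ? 1 : 0;    # $ref still holds it

undef $ref;
my $alive_after_undef = defined $watcher ? 1 : 0;     # nothing holds it now

print "after delete: $alive_after_delete, after undef: $alive_after_undef\n";
```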

How can I find out if a hash has an odd number of elements in assignment?

How could I find out if this hash has an odd number of elements?
my %hash = ( 1, 2, 3, 4, 5 );
Ok, I should have written more information.
sub routine {
my ( $first, $hash_ref ) = @_;
if ( $hash_ref refers to a hash with odd numbers of elements ) {
warn "Second argument refers to a hash with odd numbers of elements.\nFalling back to default values";
$hash_ref = { option1 => 'office', option2 => 34, option3 => 'fast' };
}
...
...
}
routine( [ 'one', 'two', 'three' ], { option1 =>, option2 => undef, option3 => 'fast' } );
Well, I suppose there is some terminological confusion in the question that should be clarified.
A hash in Perl always has the same number of keys and values, because it's fundamentally an engine to store some values by their keys. In other words, a key-value pair should be considered a single element here.
But I guess that's not what was really asked. I suppose the OP tried to build a hash from a list (not an array: the difference is subtle, but it's still there), and got the warning.
So the point is to check the number of elements in the list which will be assigned to a hash. It can be done as simply as:
my @list = ( ... there goes a list ... );
print @list % 2; # 1 if the list had an odd number of elements, 0 otherwise
Notice that the % operator imposes scalar context on the array variable: it's simple and elegant.
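Put to use, the check might guard the assignment like this. It is a minimal sketch with a concrete five-element list standing in for the elided one; the decision to skip the assignment is just one possible reaction.

```perl
use strict;
use warnings;

my @list = ( 1, 2, 3, 4, 5 );

# An array in numeric context yields its element count.
my $is_odd = @list % 2;

my %hash;
if ($is_odd) {
    print "odd number of elements, not assigning\n";
}
else {
    %hash = @list;
}
my $pairs = keys %hash;
```

Checking before the assignment avoids the warning entirely, since the odd list never reaches the hash.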
UPDATE: As I see it now, the problem is slightly different. OK, let's talk about the example given, simplifying it a bit.
my $anhash = {
option1 =>,
option2 => undef,
option3 => 'fast'
};
See, => is just syntactic sugar; this assignment could easily be rewritten as...
my $anhash = {
'option1', , 'option2', undef, 'option3', 'fast'
};
The point is that a missing value after the first comma and undef are not the same, as lists (any lists) are flattened automatically in Perl. undef can be a normal element of any list, but an empty space will simply be ignored.
Note that the warning you care about (if use warnings is in effect) will be raised before your procedure is called, if it's called with an invalid hash wrapped in a reference. So whoever caused it should deal with it in their own code: fail early, as they say.
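The difference is easy to see by counting elements. The sketch below assigns both variants to arrays (so that no hash warning gets in the way); the array names are illustrative.

```perl
use strict;
use warnings;

# Nothing between the two commas: the empty slot contributes no element.
my @with_gap = ( 'option1', , 'option2', undef, 'option3', 'fast' );

# An explicit undef is a real element of the list.
my @with_undef = ( 'option1', undef, 'option2', undef, 'option3', 'fast' );

my $gap_count   = scalar @with_gap;
my $undef_count = scalar @with_undef;
print "with gap: $gap_count, with undef: $undef_count\n";
```

The first list comes out one element short, which is what makes the later hash construction odd.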
You want to use named arguments, but set some default values for missing ones? Use this technique:
sub test_sub {
my ($args_ref) = @_;
my $default_args_ref = {
option1 => 'xxx',
option2 => 'yyy',
};
$args_ref = { %$default_args_ref, %$args_ref, };
}
Then your test_sub might be called like this...
test_sub( { option1 => 'zzz' } );
... or even ...
test_sub( {} );
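To see the merge at work, here is a self-contained variant that returns the merged options so they can be inspected. It is a sketch of the same technique; the name with_defaults and the fallback to an empty hash when no argument is given are assumptions added here.

```perl
use strict;
use warnings;

# Hypothetical variant of test_sub that returns the merged options.
sub with_defaults {
    my ($args_ref) = @_;
    my %defaults = ( option1 => 'xxx', option2 => 'yyy' );

    # Later pairs win, so caller-supplied values override the defaults.
    my %merged = ( %defaults, %{ $args_ref || {} } );
    return \%merged;
}

my $some = with_defaults( { option1 => 'zzz' } );
my $none = with_defaults( {} );

print "$some->{option1} $some->{option2}\n";
print "$none->{option1} $none->{option2}\n";
```

The order inside the merged list matters: putting the defaults first is what lets the caller's values take precedence.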
The simple answer is: You get a warning about it:
Odd number of elements in hash assignment at...
Assuming you have not been foolish and turned warnings off.
The hard answer is, once assignment to the hash has been done (and warning issued), it is not odd anymore. So you can't.
my %hash = (1,2,3,4,5);
use Data::Dumper;
print Dumper \%hash;
$VAR1 = {
'1' => 2,
'3' => 4,
'5' => undef
};
As you can see, undef has been inserted in the empty spot. Now, you can check for undefined values and pretend that any existing undefined value means the assignment had an odd number of elements. However, should an undefined value be a valid value in your hash, you're in trouble.
perl -lwe '
sub isodd { my $count = @_ = grep defined, @_; return ($count % 2) };
%a=(a=>1,2);
print isodd(%a);'
Odd number of elements in hash assignment at -e line 1.
1
In this one-liner, the function isodd counts the defined arguments and returns whether the number of arguments is odd or not. But as you can see, it still gives the warning.
You can use a $SIG{__WARN__} handler to "trap" the warning raised when a hash assignment is incorrect.
use strict ;
use warnings ;
my $odd_hash_length = 0 ;
{
local $SIG{__WARN__} = sub {
my $msg = shift ;
if ($msg =~ m{Odd number of elements in hash assignment at}) {
$odd_hash_length = 1 ;
}
} ;
my %hash = (1, 2, 3, 4, 5) ;
}
# Now do what you want based on $odd_hash_length
if ($odd_hash_length) {
die "the hash had an odd hash length assignment...aborting\n" ;
} else {
print "the hash was initialized correctly\n";
}
See also How to capture and save warnings in Perl.

Is %$var dereferencing a Perl hash?

I'm sending a subroutine a hash, and fetching it with my($arg_ref) = @_;
But what exactly is %$arg_ref? Is %$ dereferencing the hash?
$arg_ref is a scalar since it uses the $ sigil. Presumably, it holds a hash reference. So yes, %$arg_ref dereferences that hash reference. Another way to write it is %{$arg_ref}. This makes the intent of the code a bit clearer, though more verbose.
To quote from perldata(1):
Scalar values are always named with '$', even when referring
to a scalar that is part of an array or a hash. The '$'
symbol works semantically like the English word "the" in
that it indicates a single value is expected.
$days # the simple scalar value "days"
$days[28] # the 29th element of array @days
$days{'Feb'} # the 'Feb' value from hash %days
$#days # the last index of array @days
So your example would be:
%$arg_ref # hash dereferenced from the value "arg_ref"
my($arg_ref) = @_; grabs the first item in the function's argument stack and places it in a local variable called $arg_ref. The caller is responsible for passing a hash reference. A more canonical way to write that is:
my $arg_ref = shift;
To create a hash reference you could start with a hash:
some_sub(\%hash);
Or you can create it with an anonymous hash reference:
some_sub({pi => 3.14, C => 4}); # Pi is a gross approximation.
Instead of dereferencing the entire hash like that, you can grab individual items with
$arg_ref->{key}
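The three forms discussed here (%$arg_ref, %{$arg_ref} and $arg_ref->{key}) can be seen side by side in a small sub. This is a sketch; describe is a hypothetical name, and the data is the anonymous hash from the example above.

```perl
use strict;
use warnings;

# Hypothetical sub that takes a hash reference as its only argument.
sub describe {
    my ($arg_ref) = @_;

    my %copy = %$arg_ref;                   # dereference the whole hash (a shallow copy)
    my @keys = sort keys %{$arg_ref};       # same dereference, braces spelled out

    # Individual values are reached with the arrow syntax.
    return join ',', map { "$_=$arg_ref->{$_}" } @keys;
}

my $summary = describe( { pi => 3.14, C => 4 } );
print "$summary\n";
```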
A good brief introduction to references (creating them and using them) in Perl is perldoc perlreftut. You can also read it online (or get it as a pdf). (It talks more about references in complex data structures than in terms of passing in and out of subroutines, but the syntax is the same.)
my %hash = ( fred => 'wilma',
barney => 'betty');
my $hashref = \%hash;
my $freds_wife = $hashref->{fred};
my %hash_copy = %$hashref; # or %{$hashref} as noted above.
So, what's the point of the syntax flexibility? Let's try this:
my %flintstones = ( fred   => { wife => 'wilma',
                                kids => ['pebbles'],
                                pets => ['dino'],
                              },
                    barney => { }, # etc ...
                  );
Actually for deep data structures like this it's often more convenient to start with a ref:
my $flintstones = { fred => { wife => 'Wilma',
kids => ['Pebbles'],
pets => ['Dino'],
},
};
OK, so fred gets a new pet, 'Velociraptor'
push @{$flintstones->{fred}->{pets}}, 'Velociraptor';
How many pets does Fred have?
scalar @{ $flintstones->{fred}->{pets} }
Let's feed them ...
for my $pet ( @{ $flintstones->{fred}->{pets} } ) {
feed($pet)
}
and so on. The curly-bracket soup can look a bit daunting at first, but it becomes quite easy to deal with them in the end, so long as you're consistent in the way that you deal with them.
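Put together, the steps above can be run as one script. This is a sketch: the feed sub is not defined anywhere in the answer, so a stand-in that merely records the pet's name is assumed here.

```perl
use strict;
use warnings;

my $flintstones = {
    fred => {
        wife => 'Wilma',
        kids => ['Pebbles'],
        pets => ['Dino'],
    },
};

# Stand-in for the feed() call in the text; it only records who was fed.
my @fed;
sub feed { my ($pet) = @_; push @fed, $pet; return }

push @{ $flintstones->{fred}->{pets} }, 'Velociraptor';

my $pet_count = scalar @{ $flintstones->{fred}->{pets} };
print "Fred has $pet_count pets\n";

for my $pet ( @{ $flintstones->{fred}->{pets} } ) {
    feed($pet);
}
print "fed: @fed\n";
```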
Since it's fairly clear that this construct is being used to pass a hash reference as a list of named arguments to a sub, it should also be noted that this
sub foo {
my ($arg_ref) = @_;
# do something with $arg_ref->{keys}
}
may be overkill as opposed to just unpacking @_
sub bar {
my ($a, $b, $c) = @_;
return $c / ( $a * $b );
}
Depending on how complex the argument list is.