how does this data structure work? - perl

I have to do some debugging on an existing script without having a lot of knowledge about Perl.
This script uses data types like these to store all the fields from a file:
${$LineRefs->{FIELD_NAME}}
I've been trying to figure out how to find all possible fields separately by iterating over this scalar/hash/array or whatever it may be but I have no clue how.
Could anyone please point me in the right direction?

It's certainly very odd
$LineRefs is a reference to a hash which has an element with key FIELD_NAME whose value is a reference to a scalar
Like this
use v5.14;
my $LineRefs = {
FIELD_NAME => \99,
};
print ${ $LineRefs->{FIELD_NAME} }, "\n";
output
99
References to hashes and arrays are common because they allow a large data structure to be represented by a single scalar. But references to scalars are far less useful because they just replace a scalar by another scalar
I'm sorry, and thanks @glennjackman: I read the question too hastily and assumed it was about why a hash element was being dereferenced as a scalar
I've been trying to figure out how to find all possible fields separately by iterating over this scalar/hash/array or whatever it may be but I have no clue how
You're dealing with a hash, which is like an array but indexed by strings (keys) instead of integers (indexes)
You can use keys, values, or each to iterate over a hash
You can print all the keys and their values like this. Since your variable $LineRefs is a hash reference you need to dereference it as %$LineRefs
for my $key ( keys %$LineRefs ) {
my $value = $LineRefs->{$key};
print "$key => $value\n";
}
If your hash values really are references to scalars then you will see things like SCALAR(0x640448) printed for the values
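If the values really are scalar references, as the `${ ... }` in the question suggests, the loop can dereference each one before printing. This is only a minimal sketch: the field names and values are made up, and the `ref` check is an assumption for the case where some values are plain scalars.

```perl
use strict;
use warnings;

# Hypothetical data: every value is a reference to a scalar.
my $LineRefs = {
    FIELD_NAME  => \99,
    OTHER_FIELD => \'abc',
};

my @lines;
for my $key ( sort keys %$LineRefs ) {
    my $ref = $LineRefs->{$key};
    # Dereference only when the value is a scalar reference.
    my $value = ref($ref) eq 'SCALAR' ? $$ref : $ref;
    push @lines, "$key => $value";
}
print "$_\n" for @lines;
```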

Related

Create a Perl hash with an array as the key

How can I put an array (like the tuple in the following example) into a hash in Perl?
%h=();
@a=(1,1);
$h{@a}=1 or $h{\@a}=1??
I tried with an array reference, but it does not work. How do I make it work? Essentially, I want to de-duplicate by hashing (among other things).
Regular hashes can only have string keys, so you'd need to create some kind of hashing function for your arrays. A simple way would be to simply join your array elements, e.g.
$h{join('-', @a)} = \@a; # A nice readable separator
$h{join($;, @a)} = \@a; # A less likely, configurable separator ("\034")
But that approach (using a sentinel value) requires that you pick a character that won't be found in the keys. The following doesn't suffer from that problem:
$h{pack('(j/a*)*', @a)} = \@a;
Alternatively, check out Hash::MultiKey which can take a more complex key.
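For the de-duplication the question asks about, the joined key can be used to remember which tuples have already been seen. A rough sketch, assuming the elements never contain the `$;` separator; the sample tuples are invented:

```perl
use strict;
use warnings;

my @tuples = ( [1, 1], [1, 2], [1, 1], [2, 1] );   # made-up sample data

my %seen;
my @unique;
for my $t (@tuples) {
    my $key = join( $;, @$t );            # string key built from the elements
    push @unique, $t unless $seen{$key}++;  # keep only the first occurrence
}
print join( ' ', map { "(@$_)" } @unique ), "\n";
```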
I tried with array reference, but it does not work
Funny that, page 361 of the (new) Camel book has a paragraph title:
References Don't Work As Hash Keys
So yes, you proved the Camel book right. It then goes on to tell you how to fix it, using Tie::RefHash.
I guess you should buy the book.
(By the way, (1,1) might be called a tuple in Python, but it is called a list in Perl).
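A small sketch of what Tie::RefHash does, with invented data. Note that it compares keys by reference identity, so two different arrays with equal contents remain two separate keys, which means it probably does not help with de-duplication on its own.

```perl
use strict;
use warnings;
use Tie::RefHash;

tie my %h, 'Tie::RefHash';

my @a = (1, 1);
my @b = (1, 1);          # same contents, but a different array
$h{\@a} = 'first';
$h{\@b} = 'second';

my $count = keys %h;     # keys are distinguished by reference, not contents
for my $key ( keys %h ) {
    # The keys come back as real references and can be dereferenced.
    print "@$key => $h{$key}\n";
}
```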
To remove duplicates in the array using hashes:
my %hash;
@hash{@array} = @array;
my @unique = keys %hash;
Alternatively, you can use map to create the hash:
my %hash = map {$_ => 1} @array;
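Both versions return the unique values in hash order, which is effectively arbitrary. If the original order matters, a common alternative (not part of the answer above) is to filter with grep and a "seen" hash; the sample data here is made up.

```perl
use strict;
use warnings;

my @array = qw(b a b c a);   # made-up sample data

my %seen;
# Keep an element only the first time it is encountered.
my @unique = grep { !$seen{$_}++ } @array;
print "@unique\n";
```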

Grabbing a list from a multi-dimensional hash in Perl

In Programming Perl (the book) I read that I can create a dictionary where the entries hold an array as follows:
$wife{"Jacob"} = ["Leah", "Rachel", "Bilhah", "Zilpah"];
Say that I want to grab the contents of $wife{"Jacob"} in a list. How can I do that?
If I try:
$key = "Jacob";
say $wife{$key};
I get:
ARRAY (0x56d5df8)
which makes me believe that I am getting a reference, and not the actual list.
See
perllol,
perldsc and
perlreftut
for information on using complex data structures and references.
Essentially, a hash can only have scalars as values, but references are scalars. Therefore, you are saving an arrayref inside the hash, and have to dereference it to get an array.
To dereference a reference, use the @{...} syntax.
say @{$wife{Jacob}};
or
say "@{$wife{Jacob}}"; # print elements with spaces in between
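A short sketch of the usual ways to get at the data behind such a reference, using the hash from the question; the variable names `@list`, `$first` and `$count` are just for illustration.

```perl
use strict;
use warnings;

my %wife;
$wife{Jacob} = [ "Leah", "Rachel", "Bilhah", "Zilpah" ];

my @list  = @{ $wife{Jacob} };   # copy the whole list out of the reference
my $first = $wife{Jacob}[0];     # a single element
my $count = @{ $wife{Jacob} };   # scalar context: number of elements

print "$first is one of $count: @list\n";
```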
I guess by this time you know that
$ refers to a scalar
and @ refers to an array.
Since the value for that key is a reference to an array, you should dereference it with
say @{ $wife{$key} };
instead of
say $wife{$key};

Is there any advantage to using keys @array instead of 0 .. $#array?

I was quite surprised to find that the keys function happily works with arrays:
keys HASH
keys ARRAY
keys EXPR
Returns a list consisting of all the keys of the named hash, or the
indices of an array. (In scalar context, returns the number of keys or
indices.)
Is there any benefit in using keys @array instead of 0 .. $#array with respect to memory usage, speed, etc., or are the reasons for this functionality more of a historic origin?
Seeing that keys @array holds up to $[ modification, I'm guessing it's historic:
$ perl -Mstrict -wE 'local $[=4; my @array="a".."z"; say join ",", keys @array;'
Use of assignment to $[ is deprecated at -e line 1.
4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
Mark has it partly right, I think. What he's missing is that each now works on an array, and, like each with hashes, each with arrays returns two items on each call. Where each %hash returns key and value, each @array also returns key (index) and value.
while (my ($idx, $val) = each @array)
{
if ($idx > 0 && $array[$idx-1] eq $val)
{
print "Duplicate indexes: ", $idx-1, "/", $idx, "\n";
}
}
Thanks to Zaid for asking, and jmcnamara for bringing it up on perlmonks' CB. I didn't see this before - I've often looped through an array and wanted to know what index I'm at. This is waaaay better than manually manipulating some $i variable created outside of a loop and incremented inside, as I expect that continue, redo, etc., will survive this better.
So, because we can now use each on arrays, we need to be able to reset that iterator, and thus we have keys.
The link you provided actually has one important reason you might use/not use keys:
As a side effect, calling keys() resets the internal iterator of the HASH or ARRAY (see each). In particular, calling keys() in void context resets the iterator with no other overhead.
That would cause each to reset to the beginning of the array. Using keys and each with arrays might be important if they ever natively support sparse arrays as a real data-type.
All that said, with so many array-aware language constructs like foreach and join in perl, I can't remember the last time I used 0..$#array.
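The iterator reset described above can be seen in a small sketch. It assumes a Perl recent enough to accept each and keys on arrays (5.12 or later), and the sample data is invented.

```perl
use strict;
use warnings;

my @array = qw(a b b c);   # made-up sample data
my @seen;

# Leave the loop early, so the array's internal iterator stops mid-way.
while ( my ( $idx, $val ) = each @array ) {
    push @seen, $idx;
    last if $idx == 1;
}

keys @array;   # void context: resets the iterator

# Thanks to the reset, this loop starts again at index 0.
while ( my ( $idx, $val ) = each @array ) {
    push @seen, $idx;
}
print "@seen\n";
```

Without the `keys @array;` line, the second loop would be expected to pick up at index 2, where the first one stopped.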
I actually think you've answered your own question: it returns the valid indices of the array, no matter what value you've set for $[. So from a generality point of view (especially for library usage), it's more preferred.
The version of Perl I have (5.10.1) doesn't support using keys with arrays, so it can't be for historic reasons.
Well, in your example you are putting them in a list. So, in list context,
keys @array evaluates to all the indices of the array,
whereas 0 .. $#array produces the same list as a range, which is also handy for array slicing: instead of @array[0 .. $#array] you can write @array[0 .. (some specific index)].

How can I access the last Perl hash key without using a temporary array?

How can I access the last element of keys in a hash without having to create a temporary array?
I know that hashes are unordered. However, there are applications (like mine), in which my keys can be ordered using a simple sort call on the hash keys. Hope I've explained why I wanted this. The barney/elmo example is a bad choice, I admit, but it does have its applications.
Consider the following:
my %hash = ( barney => 'dinosaur', elmo => 'monster' );
my @array = sort keys %hash;
print $array[$#array];
#prints "elmo"
Any ideas on how to do this without calling on a temp (@array in this case)?
print( (keys %hash)[-1]);
Note that the extra parens are necessary to prevent syntax confusion with print's param list.
You can also use this evil trick to force it into scalar context and do away with the extra parens:
print ~~(keys %hash)[-1];
Generally, assuming you want last, sorted alphabetically, it's simple:
use List::Util qw( maxstr );
print maxstr(keys %hash);
If you'd prefer not to use a module (which I don't see a valid reason for, but there are people who like to make it harder):
print( (sort keys %hash)[-1] );
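Both approaches can be checked side by side with the hash from the question. This is only a sketch; `$by_sort` and `$by_maxstr` are illustrative names.

```perl
use strict;
use warnings;
use List::Util qw( maxstr );

my %hash = ( barney => 'dinosaur', elmo => 'monster' );

my $by_sort   = ( sort keys %hash )[-1];   # sorts every key, keeps the last one
my $by_maxstr = maxstr( keys %hash );      # single pass, no full sort

print "$by_sort $by_maxstr\n";
```

Neither version needs a named temporary array, and maxstr avoids sorting the whole key list just to read one element.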
Hashes are unordered, so there is no such thing as the "last element." The functions for iterating over a hash (keys, values, and each) have an order, but it's not anything that you should rely on.
Technically speaking, hashes have a "hash order" which is what the iterators use. Hash order is dependent on the hashing algorithm, which can change (and has) between different versions of Perl. Moreover, as of version 5.8.1 Perl contains hash randomization features that can change the hashing algorithm in order to prevent certain types of attacks.
In general, if you care about order you should be using an array instead.
According to perldoc perldata:
Hashes are unordered collections of
scalar values indexed by their
associated string key.
Since hashes are unordered, sorry: there is no "last" element.
To make everyone else's point more clear, a hash's keys in Perl will have the same order every time you call keys, values, or each within the same process's lifetime, assuming the hash has not been modified. From perlfunc:
The keys are returned in an apparently random order. The actual random order is subject to change in future versions of perl, but it is guaranteed to be the same order as either the values or each function produces (given that the hash has not been modified). Since Perl 5.8.1 the ordering is different even between different runs of Perl for security reasons (see "Algorithmic Complexity Attacks" in perlsec).
$h{'11c'} = 'C';
$h{'b'} = 'B';
$h{'e22'} = 'E';
$h{'aaaaa'} = 'AAAA';
for (keys %h){
$a = \$h{$_} and $b = $_ if $a < \$h{$_};
}
print "$b\n";
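But be careful: this compares the memory addresses of the hash values, which is not guaranteed to reflect the order in which they were inserted.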
Hashes are unordered, so the last element of your hash might just happen to be elmo.

How can I sort Perl hashes whose values are array references?

Hey I was just wondering if there is a cool "one liner" that would sort my hash holding array references. So I have a bunch of key/values in my hash something like:
$DataBase{$key} = \@value;
However I would like to sort the hash by the array[0] element. Then loop through 'em. I had this to begin with:
foreach my $key (sort {$DataBase{$a} cmp $DataBase{$b} } keys %DataBase)
But that obviously just sorts my hash by the pointer value of the array. It doesn't exactly have to be "one line" but I was hoping for a solution that didn't involve reconstructing the hash.
foreach my $key (sort {$DataBase{$a}->[0] cmp $DataBase{$b}->[0] } keys %DataBase)
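A runnable sketch of that sort, with made-up data standing in for the real %DataBase:

```perl
use strict;
use warnings;

# Hypothetical contents: each value is an array reference.
my %DataBase = (
    x => [ 'pear',  1 ],
    y => [ 'apple', 2 ],
    z => [ 'mango', 3 ],
);

# Order the keys by the first element of each referenced array.
my @order = sort { $DataBase{$a}[0] cmp $DataBase{$b}[0] } keys %DataBase;

for my $key (@order) {
    print "$key: @{ $DataBase{$key} }\n";
}
```

The hash itself is never rebuilt; only the list of keys is sorted. If the first elements are numbers, `<=>` would be the appropriate comparison instead of `cmp`.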
For the record (you probably come from a C background), Perl does not have pointers, but references:
Perl [...] allows you to create
anonymous data structures, and
supports a fundamental data type
called a "reference," loosely
equivalent to a C pointer. Just as C
pointers can point to data as well as
procedures, Perl's references can
refer to conventional data types
(scalars, arrays, and hashes) and
other entities such as subroutines,
typeglobs, and filehandles. Unlike C,
they don't let you peek and poke at
raw memory locations.
Similar, but not the same.
C.
I think you are asking the same basic question as "How can I sort a hash-of-hashes by key in Perl?". My answer, which is in the Perl FAQ, shows you how to sort a hash any way that you like.