Create a Perl hash with an array as the key - perl

How can I put an array (like the tuple in the following example) into a hash in Perl?
%h=();
@a=(1,1);
$h{@a}=1 or $h{\@a}=1??
I tried with an array reference, but it does not work. How do I make it work? Essentially, I want to de-duplicate by hashing (among other things).

Regular hashes can only have string keys, so you'd need to create some kind of hashing function for your arrays. A simple way would be to simply join your array elements, e.g.
$h{join('-', @a)} = \@a; # A nice readable separator
$h{join($;, @a)} = \@a; # A less likely, configurable separator ("\034")
But that approach (using a sentinel value) requires that you pick a character that won't be found in the keys. The following doesn't suffer from that problem:
$h{pack('(j/a*)*', @a)} = \@a;
Alternatively, check out Hash::MultiKey which can take a more complex key.
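As a rough sketch of how the join approach could be applied to the de-duplication the question asks about (the sample data and the names @tuples, %seen, and @unique are made up for illustration, and it assumes no element contains the separator):

```perl
use strict;
use warnings;

# Hypothetical sample data: two-element "tuples", some of them repeated.
my @tuples = ([1, 1], [1, 2], [1, 1], [2, 1], [1, 2]);

my %seen;
my @unique;
for my $tuple (@tuples) {
    # Build a string key from the elements; $; defaults to "\034".
    my $key = join($;, @$tuple);
    # Keep only the first array seen for each distinct key.
    push @unique, $tuple unless $seen{$key}++;
}

print join(' ', map { '(' . join(',', @$_) . ')' } @unique), "\n";
# prints: (1,1) (1,2) (2,1)
```

The hash only ever sees strings; the array references are kept separately in @unique.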

I tried with array reference, but it does not work
Funny that, page 361 of the (new) Camel book has a paragraph title:
References Don't Work As Hash Keys
So yes, you proved the Camel book right. It then goes on to tell you how to fix it, using Tie::RefHash.
I guess you should buy the book.
(By the way, (1,1) might be called a tuple in Python, but it is called a list in Perl).

To remove duplicates in the array using hashes:
my %hash;
@hash{@array} = @array;
my @unique = keys %hash;
Alternatively, you can use map to create the hash:
my %hash = map {$_ => 1} @array;
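For what it's worth, here is a small runnable sketch of the hash-slice version, next to a common alternative that is not part of the answer above: a %seen hash with grep, which keeps the first occurrence of each value in its original order. The sample data is made up.

```perl
use strict;
use warnings;

my @array = qw(b a b c a);   # hypothetical sample data

# Hash-slice version: unique values, but keys come back in no particular order.
my %hash;
@hash{@array} = @array;
my @unique = sort keys %hash;   # sorted only to make the output stable

# Alternative: keep the first occurrence of each value, in original order.
my %seen;
my @ordered = grep { !$seen{$_}++ } @array;

print "@unique\n";    # a b c
print "@ordered\n";   # b a c
```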

Related

Perl assign array elements to hash user defined key

Below is code in which I need help.
#!/usr/bin/perl -w
use strict;
use Data::Dumper;
my @arrayElements = ('Array Functions');
print join(", ", @arrayElements);
### Output => Array Functions
my %hashElements = ();
I want to assign the content of @arrayElements to $hashElements{Item}
I may be missing some core concepts or going about this the wrong way; I have been struggling with it for a while.
You seem to be missing some core concepts of Perl (or programming in general). If you are learning Perl through a book or online tutorial, I suggest you re-read the chapters on arrays and hashes.
Let's look at the things involved here. You have:
@arrayElements, which is an array. It contains a list with one element, the string 'Array Functions'.
%hashElements, which is a hash. It's empty.
$hashElements{Item}, which is a scalar value. You want to set this.
You say you want $hashElements{Item} to have the value 'Array Functions', which you have as the first element in your array @arrayElements.
$hashElements{Item} = $arrayElements[0];
And that's it. Both $hashElements{Item} and $arrayElements[0] are scalar values. That's why their sigil (the sign at the front) changes from an @ (for array) or % (for hash) to a $. You can distinguish whether the value came from a hash or an array by the brackets used to access the elements. [] is for arrays, and {} is for hashes.
You cannot do the following though.
$hashElements{Item} = @arrayElements;
Because $hashElements{Item} is a scalar, the thing on the right hand side of the assignment will be treated in scalar context. An array in scalar context gets converted to the number of elements in the array, so this would assign 1. That's not what you want.
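A minimal sketch of the three assignments side by side may help. The hash keys first, count, and all are invented for the example, and storing a reference with \ is an assumption about what might be wanted if the whole array should end up in the hash:

```perl
use strict;
use warnings;

my @array_elements = ('Array Functions');
my %hash_elements;

$hash_elements{first} = $array_elements[0];   # the string 'Array Functions'
$hash_elements{count} = @array_elements;      # scalar context: the element count, 1
$hash_elements{all}   = \@array_elements;     # a reference to the whole array

print "$hash_elements{first}\n";      # Array Functions
print "$hash_elements{count}\n";      # 1
print "@{ $hash_elements{all} }\n";   # Array Functions
```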
You should really read up more about this, and also pick better names for your variables. Your example is very confusing. In general, we don't do $CamelCase for variable names in Perl, but instead use $snake_case, which is easier to read and type.
Take a look at the following resources to learn more about the concepts I've mentioned above.
Perl Maven, perldata, perldsc

Is there any advantage to using keys @array instead of 0 .. $#array?

I was quite surprised to find that the keys function happily works with arrays:
keys HASH
keys ARRAY
keys EXPR
Returns a list consisting of all the keys of the named hash, or the
indices of an array. (In scalar context, returns the number of keys or
indices.)
Is there any benefit in using keys @array instead of 0 .. $#array with respect to memory usage, speed, etc., or are the reasons for this functionality more of a historic origin?
Seeing that keys @array holds up to $[ modification, I'm guessing it's historic:
$ perl -Mstrict -wE 'local $[=4; my @array="a".."z"; say join ",", keys @array;'
Use of assignment to $[ is deprecated at -e line 1.
4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
Mark has it partly right, I think. What he's missing is that each now works on an array, and, like the each with hashes, each with arrays returns two items on each call. Where each %hash returns key and value, each @array also returns key (index) and value.
while (my ($idx, $val) = each @array)
{
    if ($idx > 0 && $array[$idx-1] eq $val)
    {
        print "Duplicate indexes: ", $idx-1, "/", $idx, "\n";
    }
}
Thanks to Zaid for asking, and jmcnamara for bringing it up on perlmonks' CB. I didn't see this before - I've often looped through an array and wanted to know what index I'm at. This is waaaay better than manually manipulating some $i variable created outside of a loop and incremented inside, as I expect that continue, redo, etc., will survive this better.
So, because we can now use each on arrays, we need to be able to reset that iterator, and thus we have keys.
The link you provided actually has one important reason you might use/not use keys:
As a side effect, calling keys() resets the internal iterator of the HASH or ARRAY (see each). In particular, calling keys() in void context resets the iterator with no other overhead.
That would cause each to reset to the beginning of the array. Using keys and each with arrays might be important if they ever natively support sparse arrays as a real data-type.
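To illustrate the reset, here is a small sketch with made-up data. It assumes Perl 5.12 or later, where each and keys accept arrays:

```perl
use strict;
use warnings;
use 5.012;   # each/keys on arrays need Perl 5.12 or later

my @array = qw(a b c d);   # hypothetical sample data

# Stop early, leaving the array's internal iterator part-way through.
while (my ($idx, $val) = each @array) {
    last if $val eq 'b';
}

keys @array;   # void context: resets the iterator and does nothing else

# Because of the reset, this loop starts again at index 0.
my @pairs;
while (my ($idx, $val) = each @array) {
    push @pairs, "$idx=$val";
}
print "@pairs\n";   # 0=a 1=b 2=c 3=d
```

Without the keys call, the second loop would pick up where the first one stopped.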
All that said, with so many array-aware language constructs like foreach and join in perl, I can't remember the last time I used 0..$#array.
I actually think you've answered your own question: it returns the valid indices of the array, no matter what value you've set for $[. So from a generality point of view (especially for library usage), it's more preferred.
The version of Perl I have (5.10.1) doesn't support using keys with arrays, so it can't be for historic reasons.
Well, in your example you are using them in list context. In list context,
keys @array will be replaced with all the indices of the array,
whereas 0 .. $#array produces the same list as a range. With a range you can also stop early, so instead of the slice @array[0 .. $#array] you can write @array[0 .. (some specific index)].

What is the difference between `$this`, `@that`, and `%those` in Perl?

What is the difference between $this, @that, and %those in Perl?
A useful mnemonic for Perl sigils is:
$calar
@rray
%ash
Matt Trout wrote a great comment on blog.fogus.me about Perl sigils which I think is useful so have pasted below:
Actually, perl sigils don’t denote variable type – they denote conjugation – $ is ‘the’, @ is
‘these’, % is ‘map of’ or so – variable type is denoted via [] or {}. You can see this with:
my $foo = 'foo';
my @foo = ('zero', 'one', 'two');
my $second_foo = $foo[1];
my @first_and_third_foos = @foo[0,2];
my %foo = (key1 => 'value1', key2 => 'value2', key3 => 'value3');
my $key2_foo = $foo{key2};
my ($key1_foo, $key3_foo) = @foo{'key1','key3'};
so looking at the sigil when skimming perl code tells you what you’re going to -get- rather
than what you’re operating on, pretty much.
This is, admittedly, really confusing until you get used to it, but once you -are- used to it
it can be an extremely useful tool for absorbing information while skimming code.
You’re still perfectly entitled to hate it, of course, but it’s an interesting concept and I
figure you might prefer to hate what’s -actually- going on rather than what you thought was
going on :)
$this is a scalar value, it holds 1 item like apple
@that is an array of values, it holds several like ("apple", "orange", "pear")
%those is a hash of values, it holds key value pairs like ("apple" => "red", "orange" => "orange", "pear" => "yellow")
See perlintro for more on Perl variable types.
Perl's inventor was a linguist, and he sought to make Perl like a "natural language".
From this post:
Disambiguation by number, case and word order
Part of the reason a language can get away with certain local ambiguities is that other ambiguities are suppressed by various mechanisms. English uses number and word order, with vestiges of a case system in the pronouns: "The man looked at the men, and they looked back at him." It's perfectly clear in that sentence who is doing what to whom. Similarly, Perl has number markers on its nouns; that is, $dog is one pooch, and @dog is (potentially) many. So $ and @ are a little like "this" and "these" in English. [emphasis added]
People often try to tie sigils to variable types, but they are only loosely related. It's a topic we hit very hard in Learning Perl and Effective Perl Programming because it's much easier to understand Perl when you understand sigils.
Many people forget that variables and data are actually separate things. Variables can store data, but you don't need variables to use data.
The $ denotes a single scalar value (not necessarily a scalar variable):
$scalar_var
$array[1]
$hash{key}
The @ denotes multiple values. That could be the array as a whole, a slice, or a dereference:
@array;
@array[1,2]
@hash{qw(key1 key2)}
@{ func_returning_array_ref() };
The % denotes pairs (keys and values), which might be a hash variable or a dereference:
%hash
%$hash_ref
Under Perl v5.20, the % can now denote a key/value slice of either a hash or array:
%array[ @indices ]; # returns pairs of indices and elements
%hash{ @keys }; # returns pairs of key-values for those keys
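A short sketch of both key/value slices, with made-up data (it needs Perl 5.20 or later). The results are copied into new hashes just to make them easy to print:

```perl
use strict;
use warnings;
use 5.020;   # key/value slices need Perl 5.20 or later

my @array = qw(zero one two three);
my %hash  = (a => 1, b => 2, c => 3);

my %from_array = %array[1, 3];     # (1 => 'one', 3 => 'three')
my %from_hash  = %hash{qw(a c)};   # (a => 1, c => 3)

print join(', ', map { "$_=$from_array{$_}" } sort keys %from_array), "\n";   # 1=one, 3=three
print join(', ', map { "$_=$from_hash{$_}" }  sort keys %from_hash),  "\n";   # a=1, c=3
```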
You might want to look at the perlintro and perlsyn documents in order to really get started with understanding Perl (i.e., Read The Flipping Manual). :-)
That said:
$this is a scalar, which can store a number (int or float), a string, or a reference (see below);
@that is an array, which can store an ordered list of scalars (see above). You can add a scalar to an array with the push or unshift functions (see perlfunc), and you can use a parentheses-bounded comma-separated list of scalar literals or variables to create an array literal (i.e., my @array = ($a, $b, 6, "seven");)
%those is a hash, which is an associative array. Hashes have key-value pairs of entries, such that you can access the value of a hash by supplying its key. Hash literals can also be specified much like lists, except that every odd entry is a key and every even one is a value. You can also use a => character instead of a comma to separate a key and a value. (i.e., my %ordinals = ("one" => "first", "two" => "second");)
Normally, when you pass arrays or hashes to subroutine calls, the individual lists are flattened into one long list. This is sometimes desirable, sometimes not. In the latter case, you can use references to pass a reference to an entire list as a single scalar argument. The syntax and semantics of references are tricky, though, and fall beyond the scope of this answer. If you want to check it out, though, see perlref.
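Without going into the details of perlref, a minimal sketch of the difference might look like this. The sub names count_each and count_flat are hypothetical helpers made up for the example:

```perl
use strict;
use warnings;

# Hypothetical helper: takes two array references, so the two lists stay separate.
sub count_each {
    my ($first, $second) = @_;
    return (scalar @$first, scalar @$second);
}

# Without references, both arrays are flattened into one long list in @_.
sub count_flat {
    return scalar @_;
}

my @x = (1, 2, 3);
my @y = (4, 5);

my @separate = count_each(\@x, \@y);   # (3, 2)
my $flat     = count_flat(@x, @y);     # 5

print "@separate\n";   # 3 2
print "$flat\n";       # 5
```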

How can I access the last Perl hash key without using a temporary array?

How can I access the last element of keys in a hash without having to create a temporary array?
I know that hashes are unordered. However, there are applications (like mine), in which my keys can be ordered using a simple sort call on the hash keys. Hope I've explained why I wanted this. The barney/elmo example is a bad choice, I admit, but it does have its applications.
Consider the following:
my %hash = ( barney => 'dinosaur', elmo => 'monster' );
my @array = sort keys %hash;
print $array[$#array];
#prints "elmo"
Any ideas on how to do this without calling on a temp (@array in this case)?
print( (keys %hash)[-1]);
Note that the extra parens are necessary to prevent syntax confusion with print's param list.
You can also use this evil trick to force it into scalar context and do away with the extra parens:
print ~~(keys %hash)[-1];
Generally, assuming you want last, sorted alphabetically, it's simple:
use List::Util qw( maxstr );
print maxstr(keys %hash);
If you'd prefer not to use a module (which I don't see a valid reason for, but there are people who like to make it harder):
print( (sort keys %hash)[-1] );
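Put together as a runnable sketch using the question's hash (the variable names $via_sort and $via_maxstr are made up), both approaches give the same key:

```perl
use strict;
use warnings;
use List::Util qw( maxstr );

my %hash = ( barney => 'dinosaur', elmo => 'monster' );

my $via_sort   = (sort keys %hash)[-1];   # sorts every key, keeps the last one
my $via_maxstr = maxstr(keys %hash);      # one pass over the keys, no sort

print "$via_sort $via_maxstr\n";   # elmo elmo
```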
Hashes are unordered, so there is no such thing as the "last element." The functions for iterating over a hash (keys, values, and each) have an order, but it's not anything that you should rely on.
Technically speaking, hashes have a "hash order" which is what the iterators use. Hash order is dependent on the hashing algorithm, which can change (and has) between different versions of Perl. Moreover, as of version 5.8.1 Perl contains hash randomization features that can change the hashing algorithm in order to prevent certain types of attacks.
In general, if you care about order you should be using an array instead.
According to perldoc perldata:
Hashes are unordered collections of
scalar values indexed by their
associated string key.
Hashes are unordered, so, sorry, there is no "last" element.
To make everyone else's point more clear, a hash's keys in Perl will have the same order every time you call keys, values, or each within the same process's lifetime, assuming the hash has not been modified. From perlfunc:
The keys are returned in an apparently random order. The actual random order is subject to change in future versions of perl, but it is guaranteed to be the same order as either the values or each function produces (given that the hash has not been modified). Since Perl 5.8.1 the ordering is different even between different runs of Perl for security reasons (see "Algorithmic Complexity Attacks" in perlsec).
$h{'11c'} = 'C';
$h{'b'} = 'B';
$h{'e22'} = 'E';
$h{'aaaaa'} = 'AAAA';
for (keys %h){
$a = \$h{$_} and $b = $_ if $a < \$h{$_};
}
print "$b\n";
But be careful, for the obvious reasons!
Hashes are unordered, so the last element of your hash might just happen to be elmo.

How can I sort Perl hashes whose values are array references?

Hey I was just wondering if there is a cool "one liner" that would sort my hash holding array references. So I have a bunch of key/values in my hash something like:
$DataBase{$key} = \@value;
However I would like to sort the hash by the array[0] element. Then loop through 'em. I had this to begin with:
foreach my $key (sort {$DataBase{$a} cmp $DataBase{$b} } keys %DataBase)
But that obviously just sorts my hash by the pointer value of the array. It doesn't exactly have to be "one line" but I was hoping for a solution that didn't involve reconstructing the hash.
foreach my $key (sort {$DataBase{$a}->[0] cmp $DataBase{$b}->[0] } keys %DataBase)
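As a self-contained sketch of that loop (the contents of %DataBase and the @order array are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical data: each value is a reference to an array.
my %DataBase = (
    k1 => ['pear',  10],
    k2 => ['apple', 20],
    k3 => ['mango', 30],
);

my @order;
# Compare the first element of each referenced array, not the references themselves.
foreach my $key (sort { $DataBase{$a}->[0] cmp $DataBase{$b}->[0] } keys %DataBase) {
    push @order, $key;
    print "$key: $DataBase{$key}->[0]\n";
}
# prints k2: apple, then k3: mango, then k1: pear
```

The hash itself is untouched; only the list of keys is sorted.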
For the record (you probably come from a C background), Perl does not have pointers, but references:
Perl [...] allows you to create
anonymous data structures, and
supports a fundamental data type
called a "reference," loosely
equivalent to a C pointer. Just as C
pointers can point to data as well as
procedures, Perl's references can
refer to conventional data types
(scalars, arrays, and hashes) and
other entities such as subroutines,
typeglobs, and filehandles. Unlike C,
they don't let you peek and poke at
raw memory locations.
Similar, but not the same.
C.
I think you are asking the same basic question as How can I sort a hash-of-hashes by key in Perl?. My answer, which is in the Perl FAQ, shows you how to sort a hash any way that you like.