concept of hash in perl [duplicate] - perl

I would like to properly understand hashes in Perl. I've had to use Perl intermittently for quite some time and mostly whenever I need to do it, it's mostly related to text processing.
And everytime, I have to deal with hashes, it gets messed up. I find the syntax very cryptic for hashes
A good explanation of hashes and hash references, their differences, when they are required etc. would be much appreciated.

A simple hash is close to an array. Their initializations even look similar. First the array:
#last_name = (
"Ward", "Cleaver",
"Fred", "Flintstone",
"Archie", "Bunker"
);
Now let's represent the same information with a hash (aka associative array):
%last_name = (
"Ward", "Cleaver",
"Fred", "Flintstone",
"Archie", "Bunker"
);
Although they have the same name, the array #last_name and the hash %last_name are completely independent.
With the array, if we want to know Archie's last name, we have to perform a linear search:
my $lname;
for (my $i = 0; $i < #last_name; $i += 2) {
$lname = $last_name[$i+1] if $last_name[$i] eq "Archie";
}
print "Archie $lname\n";
With the hash, it's much more direct syntactically:
print "Archie $last_name{Archie}\n";
Say we want to represent information with only slightly richer structure:
Cleaver (last name)
Ward (first name)
June (spouse's first name)
Flintstone
Fred
Wilma
Bunker
Archie
Edith
Before references came along, flat key-value hashes were about the best we could do, but references allow
my %personal_info = (
"Cleaver", {
"FIRST", "Ward",
"SPOUSE", "June",
},
"Flintstone", {
"FIRST", "Fred",
"SPOUSE", "Wilma",
},
"Bunker", {
"FIRST", "Archie",
"SPOUSE", "Edith",
},
);
Internally, the keys and values of %personal_info are all scalars, but the values are a special kind of scalar: hash references, created with {}. The references allow us to simulate "multi-dimensional" hashes. For example, we can get to Wilma via
$personal_info{Flintstone}->{SPOUSE}
Note that Perl allows us to omit arrows between subscripts, so the above is equivalent to
$personal_info{Flintstone}{SPOUSE}
That's a lot of typing if you want to know more about Fred, so you might grab a reference as sort of a cursor:
$fred = $personal_info{Flintstone};
print "Fred's wife is $fred->{SPOUSE}\n";
Because $fred in the snippet above is a hashref, the arrow is necessary. If you leave it out but wisely enabled use strict to help you catch these sorts of errors, the compiler will complain:
Global symbol "%fred" requires explicit package name at ...
Perl references are similar to pointers in C and C++, but they can never be null. Pointers in C and C++ require dereferencing and so do references in Perl.
C and C++ function parameters have pass-by-value semantics: they're just copies, so modifications don't get back to the caller. If you want to see the changes, you have to pass a pointer. You can get this effect with references in Perl:
sub add_barney {
my($personal_info) = #_;
$personal_info->{Rubble} = {
FIRST => "Barney",
SPOUSE => "Betty",
};
}
add_barney \%personal_info;
Without the backslash, add_barney would have gotten a copy that's thrown away as soon as the sub returns.
Note also the use of the "fat comma" (=>) above. It autoquotes the string on its left and makes hash initializations less syntactically noisy.

The following demonstrates how you can use a hash and a hash reference:
my %hash = (
toy => 'aeroplane',
colour => 'blue',
);
print "I have an ", $hash{toy}, " which is coloured ", $hash{colour}, "\n";
my $hashref = \%hash;
print "I have an ", $hashref->{toy}, " which is coloured ", $hashref->{colour}, "\n";
Also see perldoc perldsc.

A hash is a basic data type in Perl.
It uses keys to access its contents.
A hash ref is an abbreviation to a
reference to a hash. References are
scalars, that is simple values. It is
a scalar value that contains
essentially, a pointer to the actual
hash itself.
Link: difference between hash and hash ref in perl - Ubuntu Forums
A difference is also in the syntax for deleting. Like C, perl works like this for Hashes:
delete $hash{$key};
and for Hash References
delete $hash_ref->{$key};
The Perl Hash Howto is a great resource to understand Hashes versus Hash with Hash References
There is also another link here that has more information on perl and references.

See perldoc perlreftut which is also accessible on your own computer's command line.
A reference is a scalar value that refers to an entire array or an entire hash (or to just about anything else). Names are one kind of reference that you're already familiar with. Think of the President of the United States: a messy, inconvenient bag of blood and bones. But to talk about him, or to represent him in a computer program, all you need is the easy, convenient scalar string "Barack Obama".
References in Perl are like names for arrays and hashes. They're Perl's private, internal names, so you can be sure they're unambiguous. Unlike "Barack Obama", a reference only refers to one thing, and you always know what it refers to. If you have a reference to an array, you can recover the entire array from it. If you have a reference to a hash, you can recover the entire hash. But the reference is still an easy, compact scalar value.

Related

Perl $$var -- two dollar signs as a sigil?

This seems like something I should be able to easily Google, Bing, or DDG, but I'm coming up completely empty on everything.
Basically, I'm tasked with (in part) rewriting a set of old Perl scripts. Not being a Perl programmer myself, there's definitely a learning curve in reading these scripts to figure out what they're doing, but I've hit an absolute brick wall with lines like this one:
$comment = $$LOC{'DESCRIPTION'};
Near as I can tell, the right-hand side is a dictionary, or more precisely it's getting the value referenced by the key 'DESCRIPTION' in said dictionary. But what's with the "extra" dollar sign in front of it?
It looks suspiciously like PHP's variable variables, but after scouring search engines, assorted StackExchange sites, and perlvar, I can't find any indication that Perl even has such a feature, let alone that this is how it's invoked. The most I've turned up is that '$' is not a valid character in a variable name, so I know $$LOC is not merely calling up a variable that just happens to be named $LOC.
What is this extra dollar sign doing here?
This is an alternative form of dereferencing. Could (and honestly probably should) be written as:
$LOC->{'DESCRIPTION'};
You'll sometimes see the same with other all basic data types:
my $str = 'a string';
my %hash = (a => 1, b => 2);
my #array = (1..4);
my $strref = \$str;
my $hashref = \%hash;
my $arrayref = \#array;
print "String value is $$strref\n";
print "Hash values are either $$hashref{a} or $hashref->{b}\n";
print "Array values are accessed by either $$arrayref[0] or $arrayref->[2]\n";
Dereferencing is treating a scalar as the 'address' at which you can locate another variable (scalar, array, or hash) and its value.
In this case, $LOC{DESCRIPTION} is being interpreted as the address of another scalar, which is read into $comment.
In Perl,
my $str = 'some val';
my $ref = \$str;
my $val = ${$ref};
my $same_val = $$ref;

How do I print a hash name in Perl?

Kind of a simple question, but it has me stumped and google just lead me astray. All I want to do is print out the name of a hash. For example:
&my_sub(\%hash_named_bill);
&my_sub(\%hash_named_frank);
sub my_sub{
my $passed_in_hash = shift;
# do great stuff with the hash here
print "You just did great stuff with: ". (insert hash name here);
}
The part I don't know is how to get the stuff in the parenthesis(insert...). For a nested hash you can just use the "keys" tag to get the hash names (if you want to call them that). I can't figure out how to get the entire hash name though, it seems like it really is just another key.
As #hackattack said in the comment, the technical answer to your questions can be found in an answer to Get variable name as string in Perl
However, you should consider whether you are doing the right thing?
If you somehow need to know the name of the hash, you most likely would solve the problem better if you stash those multiple hashes into a hash-of-hashes with names being the keys (which you should be familiar with as you alluded to the approach in your question).
$hash_named_bill{name} = "bill";
$hash_named_frank{name} = "frank";
&my_sub(\%hash_named_bill);
&my_sub(\%hash_named_frank);
sub my_sub{
my $passed_in_hash = shift;
# do great stuff with the hash here
print "You just did great stuff with: ". $passed_in_hash->{name};
}
You can use a name to refer to a hash, but hashes themselves don't have names. For example, consider the following:
*foo = {};
*bar = \%foo;
$foo{x} = 3;
$bar{y} = 4;
Keeping in mind the hash contains (x=>3, y=>4): Is the hash nameless? named 'foo'? named 'bar'? All of the above? None of the above?
The best you can do is approximate an answer using PadWalker. I recommend against using it or similar (i.e. anything that finds a name) in production!
A hash is just a piece of memory, to which a name (or more than one) can be associated.
If you want to print the name of the variable, that's not very straightforward (see haccattack comment), and doesn't smell very well (are you sure you really need that?)
You can also (if this fits your scenario) consider "soft (or symbolic) references" :
%hash1 = ( x => 101, y => 501);
%hash2 = ( x => 102, y => 502);
my_sub("hash1");
#my_sub(\%hash1); # won't work
my_sub("hash2");
sub my_sub {
my $hashname = shift;
print "hash name: $hashname\n";
print $hashname->{x} . "\n";
}
Here you are passing to the function the name of the variable, instead of a (hard) reference to it. Notice that, in Perl, this feels equivalent at the time of dereferencing it (try uncommenting my_sub(\%hash1);), though it's quite a different thing.

What is the difference between `$this`, `#that`, and `%those` in Perl?

What is the difference between $this, #that, and %those in Perl?
A useful mnemonic for Perl sigils are:
$calar
#rray
%ash
Matt Trout wrote a great comment on blog.fogus.me about Perl sigils which I think is useful so have pasted below:
Actually, perl sigils don’t denote variable type – they denote conjugation – $ is ‘the’, # is
‘these’, % is ‘map of’ or so – variable type is denoted via [] or {}. You can see this with:
my $foo = 'foo';
my #foo = ('zero', 'one', 'two');
my $second_foo = $foo[1];
my #first_and_third_foos = #foo[0,2];
my %foo = (key1 => 'value1', key2 => 'value2', key3 => 'value3');
my $key2_foo = $foo{key2};
my ($key1_foo, $key3_foo) = #foo{'key1','key3'};
so looking at the sigil when skimming perl code tells you what you’re going to -get- rather
than what you’re operating on, pretty much.
This is, admittedly, really confusing until you get used to it, but once you -are- used to it
it can be an extremely useful tool for absorbing information while skimming code.
You’re still perfectly entitled to hate it, of course, but it’s an interesting concept and I
figure you might prefer to hate what’s -actually- going on rather than what you thought was
going on :)
$this is a scalar value, it holds 1 item like apple
#that is an array of values, it holds several like ("apple", "orange", "pear")
%those is a hash of values, it holds key value pairs like ("apple" => "red", "orange" => "orange", "pear" => "yellow")
See perlintro for more on Perl variable types.
Perl's inventor was a linguist, and he sought to make Perl like a "natural language".
From this post:
Disambiguation by number, case and word order
Part of the reason a language can get away with certain local ambiguities is that other ambiguities are suppressed by various mechanisms. English uses number and word order, with vestiges of a case system in the pronouns: "The man looked at the men, and they looked back at him." It's perfectly clear in that sentence who is doing what to whom. Similarly, Perl has number markers on its nouns; that is, $dog is one pooch, and #dog is (potentially) many. So $ and # are a little like "this" and "these" in English. [emphasis added]
People often try to tie sigils to variable types, but they are only loosely related. It's a topic we hit very hard in Learning Perl and Effective Perl Programming because it's much easier to understand Perl when you understand sigils.
Many people forget that variables and data are actually separate things. Variables can store data, but you don't need variables to use data.
The $ denotes a single scalar value (not necessarily a scalar variable):
$scalar_var
$array[1]
$hash{key}
The # denotes multiple values. That could be the array as a whole, a slice, or a dereference:
#array;
#array[1,2]
#hash{qw(key1 key2)}
#{ func_returning_array_ref };
The % denotes pairs (keys and values), which might be a hash variable or a dereference:
%hash
%$hash_ref
Under Perl v5.20, the % can now denote a key/value slice or either a hash or array:
%array[ #indices ]; # returns pairs of indices and elements
%hash{ #keys }; # returns pairs of key-values for those keys
You might want to look at the perlintro and perlsyn documents in order to really get started with understanding Perl (i.e., Read The Flipping Manual). :-)
That said:
$this is a scalar, which can store a number (int or float), a string, or a reference (see below);
#that is an array, which can store an ordered list of scalars (see above). You can add a scalar to an array with the push or unshift functions (see perlfunc), and you can use a parentheses-bounded comma-separated list of scalar literals or variables to create an array literal (i.e., my #array = ($a, $b, 6, "seven");)
%those is a hash, which is an associative array. Hashes have key-value pairs of entries, such that you can access the value of a hash by supplying its key. Hash literals can also be specified much like lists, except that every odd entry is a key and every even one is a value. You can also use a => character instead of a comma to separate a key and a value. (i.e., my %ordinals = ("one" => "first", "two" => "second");)
Normally, when you pass arrays or hashes to subroutine calls, the individual lists are flattened into one long list. This is sometimes desirable, sometimes not. In the latter case, you can use references to pass a reference to an entire list as a single scalar argument. The syntax and semantics of references are tricky, though, and fall beyond the scope of this answer. If you want to check it out, though, see perlref.

What's the difference between a hash and hash reference in Perl?

I would like to properly understand hashes in Perl. I've had to use Perl intermittently for quite some time and mostly whenever I need to do it, it's mostly related to text processing.
And everytime, I have to deal with hashes, it gets messed up. I find the syntax very cryptic for hashes
A good explanation of hashes and hash references, their differences, when they are required etc. would be much appreciated.
A simple hash is close to an array. Their initializations even look similar. First the array:
#last_name = (
"Ward", "Cleaver",
"Fred", "Flintstone",
"Archie", "Bunker"
);
Now let's represent the same information with a hash (aka associative array):
%last_name = (
"Ward", "Cleaver",
"Fred", "Flintstone",
"Archie", "Bunker"
);
Although they have the same name, the array #last_name and the hash %last_name are completely independent.
With the array, if we want to know Archie's last name, we have to perform a linear search:
my $lname;
for (my $i = 0; $i < #last_name; $i += 2) {
$lname = $last_name[$i+1] if $last_name[$i] eq "Archie";
}
print "Archie $lname\n";
With the hash, it's much more direct syntactically:
print "Archie $last_name{Archie}\n";
Say we want to represent information with only slightly richer structure:
Cleaver (last name)
Ward (first name)
June (spouse's first name)
Flintstone
Fred
Wilma
Bunker
Archie
Edith
Before references came along, flat key-value hashes were about the best we could do, but references allow
my %personal_info = (
"Cleaver", {
"FIRST", "Ward",
"SPOUSE", "June",
},
"Flintstone", {
"FIRST", "Fred",
"SPOUSE", "Wilma",
},
"Bunker", {
"FIRST", "Archie",
"SPOUSE", "Edith",
},
);
Internally, the keys and values of %personal_info are all scalars, but the values are a special kind of scalar: hash references, created with {}. The references allow us to simulate "multi-dimensional" hashes. For example, we can get to Wilma via
$personal_info{Flintstone}->{SPOUSE}
Note that Perl allows us to omit arrows between subscripts, so the above is equivalent to
$personal_info{Flintstone}{SPOUSE}
That's a lot of typing if you want to know more about Fred, so you might grab a reference as sort of a cursor:
$fred = $personal_info{Flintstone};
print "Fred's wife is $fred->{SPOUSE}\n";
Because $fred in the snippet above is a hashref, the arrow is necessary. If you leave it out but wisely enabled use strict to help you catch these sorts of errors, the compiler will complain:
Global symbol "%fred" requires explicit package name at ...
Perl references are similar to pointers in C and C++, but they can never be null. Pointers in C and C++ require dereferencing and so do references in Perl.
C and C++ function parameters have pass-by-value semantics: they're just copies, so modifications don't get back to the caller. If you want to see the changes, you have to pass a pointer. You can get this effect with references in Perl:
sub add_barney {
my($personal_info) = #_;
$personal_info->{Rubble} = {
FIRST => "Barney",
SPOUSE => "Betty",
};
}
add_barney \%personal_info;
Without the backslash, add_barney would have gotten a copy that's thrown away as soon as the sub returns.
Note also the use of the "fat comma" (=>) above. It autoquotes the string on its left and makes hash initializations less syntactically noisy.
The following demonstrates how you can use a hash and a hash reference:
my %hash = (
toy => 'aeroplane',
colour => 'blue',
);
print "I have an ", $hash{toy}, " which is coloured ", $hash{colour}, "\n";
my $hashref = \%hash;
print "I have an ", $hashref->{toy}, " which is coloured ", $hashref->{colour}, "\n";
Also see perldoc perldsc.
A hash is a basic data type in Perl.
It uses keys to access its contents.
A hash ref is an abbreviation to a
reference to a hash. References are
scalars, that is simple values. It is
a scalar value that contains
essentially, a pointer to the actual
hash itself.
Link: difference between hash and hash ref in perl - Ubuntu Forums
A difference is also in the syntax for deleting. Like C, perl works like this for Hashes:
delete $hash{$key};
and for Hash References
delete $hash_ref->{$key};
The Perl Hash Howto is a great resource to understand Hashes versus Hash with Hash References
There is also another link here that has more information on perl and references.
See perldoc perlreftut which is also accessible on your own computer's command line.
A reference is a scalar value that refers to an entire array or an entire hash (or to just about anything else). Names are one kind of reference that you're already familiar with. Think of the President of the United States: a messy, inconvenient bag of blood and bones. But to talk about him, or to represent him in a computer program, all you need is the easy, convenient scalar string "Barack Obama".
References in Perl are like names for arrays and hashes. They're Perl's private, internal names, so you can be sure they're unambiguous. Unlike "Barack Obama", a reference only refers to one thing, and you always know what it refers to. If you have a reference to an array, you can recover the entire array from it. If you have a reference to a hash, you can recover the entire hash. But the reference is still an easy, compact scalar value.

How can I access the last Perl hash key without using a temporary array?

How can I access the last element of keys in a hash without having to create a temporary array?
I know that hashes are unordered. However, there are applications (like mine), in which my keys can be ordered using a simple sort call on the hash keys. Hope I've explained why I wanted this. The barney/elmo example is a bad choice, I admit, but it does have its applications.
Consider the following:
my %hash = ( barney => 'dinosaur', elmo => 'monster' );
my #array = sort keys %hash;
print $array[$#{$hash}];
#prints "elmo"
Any ideas on how to do this without calling on a temp (#array in this case)?
print( (keys %hash)[-1]);
Note that the extra parens are necessary to prevent syntax confusion with print's param list.
You can also use this evil trick to force it into scalar context and do away with the extra parens:
print ~~(keys %hash)[-1];
Generally, assuming you want last, sorted alphabetically, it's simple:
use List::Util qw( maxstr );
print maxstr(keys %hash);
If you'd prefer not to use module (which I don't see valid reason for, but there are people who like to make it harder):
print( (sort keys %hash)[-1] );
Hashes are unordered, so there is no such thing as the "last element." The functions for iterating over a hash (keys, values, and each) have an order, but it's not anything that you should rely on.
Technically speaking, hashes have a "hash order" which is what the iterators use. Hash order is dependent on the hashing algorithm, which can change (and has) between different versions of Perl. Moreover, as of version 5.8.1 Perl contains hash randomization features that can change the hashing algorithm in order to prevent certain types of attacks.
In general, if you care about order you should be using an array instead.
According to perldoc perldata:
Hashes are unordered collections of
scalar values indexed by their
associated string key.
Since hash are unordered. So, sorry. There are no "last" element.
To make everyone else's point more clear, a hash's keys in Perl will have the same order every time you call keys, values, or each within the same process's lifetime, assuming the hash has not been modified. From perlfunc:
The keys are returned in an apparently random order. The actual random order is subject to change in future versions of perl, but it is guaranteed to be the same order as either the values or each function produces (given that the hash has not been modified). Since Perl 5.8.1 the ordering is different even between different runs of Perl for security reasons (see "Algorithmic Complexity Attacks" in perlsec).
$h{'11c'} = 'C';
$h{'b'} = 'B';
$h{'e22'} = 'E';
$h{'aaaaa'} = 'AAAA';
for (keys %h){
$a = \$h{$_} and $b = $_ if $a < \$h{$_};
}
print "$b\n";
!
but be carefull due to the obvious causes
Hashes are unordered element. so might be your hash last element is elmo /