Perl: simple foreach on hash hands mixed results? [duplicate] - perl

Based on ActivePerl 5.8:
#!C:\Perl\bin\perl.exe
use strict;
use warnings;
# declare a new hash
my %some_hash;
%some_hash = ("foo", 35, "bar", 12.4, 2.5, "hello",
"wilma", 1.72e30, "betty", "bye\n");
my @any_array;
@any_array = %some_hash;
print %some_hash;
print "\n";
print @any_array;
print "\n";
print $any_array[0];
print "\n";
print $any_array[1];
print "\n";
print $any_array[2];
print "\n";
print $any_array[3];
print "\n";
print $any_array[4];
print "\n";
print $any_array[5];
print "\n";
print $any_array[6];
print "\n";
print $any_array[7];
print "\n";
print $any_array[8];
print "\n";
print $any_array[9];
The output looks like this:
D:\learning\perl>test.pl
bettybye
bar12.4wilma1.72e+030foo352.5hello
bettybye
bar12.4wilma1.72e+030foo352.5hello
betty
bye
bar
12.4
wilma
1.72e+030
foo
35
2.5
hello
D:\learning\perl>
What decided the order in which the elements were printed in my sample code?
Is there any rule to follow when printing a hash of mixed values (strings and numbers) in Perl? Thank you.
bar12.4wilma1.72e+030foo352.5hello
[Updated]
With your help, I updated the code as below.
#!C:\Perl\bin\perl.exe
use strict;
use warnings;
# declare a new hash
my %some_hash;
%some_hash = ("foo", 35, "bar", 12.4, 2.5, "hello",
"wilma", 1.72e30, "betty", "bye");
my @any_array;
@any_array = %some_hash;
print %some_hash;
print "\n";
print "\n";
print @any_array;
print "\n";
print "\n";
my @keys;
@keys = keys %some_hash;
for my $k (sort @keys)
{
print $k, $some_hash{$k};
}
output
D:\learning\perl>test.pl
bettybyebar12.4wilma1.72e+030foo352.5hello
bettybyebar12.4wilma1.72e+030foo352.5hello
2.5hellobar12.4bettybyefoo35wilma1.72e+030
D:\learning\perl>
Finally, after calling the keys and sort functions, the hash keys are printed in sorted order, as below:
2.5hellobar12.4bettybyefoo35wilma1.72e+030

Elements of a hash are printed out in their internal order, which cannot be relied upon and will change as elements are added and removed. If you need all of the elements of a hash in some sort of order, sort the keys, and use that list to index the hash.
If you are looking for a structure that holds its elements in order, either use an array, or use one of the ordered hashes on CPAN.
The only ordering you can rely upon from a list context hash expansion is that key => value pairs will be together.

From perldoc -f keys:
The keys of a hash are returned in an apparently random order. The actual random order is subject to change in future versions of Perl, but it is guaranteed to be the same order as either the values or each function produces (given that the hash has not been modified). Since Perl 5.8.1 the ordering is different even between different runs of Perl for security reasons (see Algorithmic Complexity Attacks in perlsec).
...
Perl has never guaranteed any ordering of the hash keys, and the ordering has already changed several times during the lifetime of Perl 5. Also, the ordering of hash keys has always been, and continues to be, affected by the insertion order.
Also note that while the order of the hash elements might be randomised, this "pseudoordering" should not be used for applications like shuffling a list randomly (use List::Util::shuffle() for that, see List::Util, a standard core module since Perl 5.8.0; or the CPAN module Algorithm::Numerical::Shuffle), or for generating permutations (use e.g. the CPAN modules Algorithm::Permute or Algorithm::FastPermute), or for any cryptographic applications.
Note: since you are evaluating a hash in list context, you are at least guaranteed that each key is followed by its corresponding value; e.g. you will never see an output of a 4 b 3 c 2 d 1.
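To illustrate the guarantee quoted above, here is a minimal sketch (the hash contents and variable names are made up for the example). It checks that keys and values walk an unmodified hash in the same order, whatever that order happens to be on a given run.

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only for illustration.
my %h = ( foo => 35, bar => 12.4, wilma => 1.72e30 );

my @k = keys %h;      # order is unspecified...
my @v = values %h;    # ...but matches keys() while the hash is unmodified

# Check that position $i in @k and @v refers to the same hash entry.
my $aligned = 1;
for my $i ( 0 .. $#k ) {
    $aligned = 0 unless $h{ $k[$i] } == $v[$i];
}

print $aligned ? "keys and values are aligned\n" : "order mismatch\n";
```

The order itself may differ from run to run; only the pairing between the two lists is something you can lean on.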

I went over your code and made some notes that I think you will find helpful.
use strict;
use warnings;
# declare a new hash and initialize it at the same time
my %some_hash = (
foo => 35, # use the fat-comma or '=>' operator, it quotes the left side
bar => 12.4,
2.5 => "hello",
wilma => 1.72e30,
betty => "bye", # perl ignores trailing commas,
# the final comma makes adding items to the end of the list less bug prone.
);
my @any_array = %some_hash; # Hash is expanded into a list of key/value pairs.
print "$_ => $some_hash{$_}\n"
for keys %some_hash;
print "\n\n", # You can print multiple newlines in one string.
"@any_array\n\n"; # print takes a list of things to print.
# In print @foo; @foo is expanded into a list of items to print.
# There is no separator between the members of @foo in the output.
# However print "@foo"; interpolates @foo into a string.
# It inserts spaces between the members of the arrays.
# This is the block form of 'for'
for my $k (sort keys %some_hash)
{
# Interpolating the variables into a string makes it easier to read the output.
print "$k => $some_hash{$k}\n";
}
Hashes provide unordered access to data by a string key.
Arrays provide access to ordered data. Random access is available by using a numerical index.
If you need to preserve the order of a group of values, use an array. If you need to look up members of the group by an associated name, use a hash.
If you need to do both, you can use both structures together:
# Keep an array of sorted hash keys.
my @sorted_items = qw( first second third fourth );
# Store the actual data in the hash.
my %item;
@item{ @sorted_items } = 1..4; # This is called a hash slice.
# It allows you to access a list of hash elements.
# This can be a very powerful way to work with hashes.
# random access
print "third => $item{third}\n";
# When you need to access the data in order, iterate over
# the array of sorted hash keys. Use the keys to access the
# data in the hash.
# ordered access
for my $name ( @sorted_items ) {
print "$name => $item{$name}\n";
}
Looking at your code samples, I see a couple of things you might want to work on:
How looping structures like for and while can be used to reduce repeated code.
How to use variable interpolation.
BTW, I am glad to see you working on basics and improving your code quality. This investment of time will pay off. Keep up the good work.

The elements are (almost certainly) printed out in the order they appear (internally) in the hash table itself -- i.e. based on the hash values of their keys.
The general rule to follow is to use something other than a hash table if you care much about the order.

Hashes are not (necessarily) retrieved in a sorted manner. If you want them sorted, you have to do it yourself:
use strict;
use warnings;
my %hash = ("a" => 1, "b" => 2, "c" => 3, "d" => 4);
for my $i (sort keys %hash) {
print "$i -> $hash{$i}\n";
}
You retrieve all the keys from a hash by using keys and you then sort them using sort. Yeah, I know, that crazy Larry Wall guy, who would've ever thought of calling them that? :-)
This outputs:
a -> 1
b -> 2
c -> 3
d -> 4

For most practical purposes, the order in which a hash table (not just Perl hash variables, but hash tables in general) stores its elements can be considered random.
In reality, depending on the hashing implementation, the order may actually be deterministic. (i.e., If you run the program multiple times putting the same items into the hash table in the same order each time, they'll be stored in the same order each time.) I know that Perl hashes used to have this characteristic, but I'm not sure about current versions. In any case, hash key order is not a reliable source of randomness to use in cases where randomness is desirable.
Short version, then:
Don't use a hash if you care about the order (or lack of order). If you want a fixed order, it will be effectively random and if you want a random order, it will be effectively fixed.

A hash defines no ordering properties. The order in which things come out will be unpredictable.

And if you are crazy and have no duplicate values in your hash, and you need the values sorted, you can call reverse on it.
my %hash = ("a" => 1, "b" => 2, "c" => 3, "d" => 4);
my %reverse_hash = reverse %hash;
print $_ for sort keys %reverse_hash;
The caveat is that the values must be unique: duplicates will overwrite one another and only one of them will survive.
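If the values are not unique, a possible alternative is to sort the keys by their values instead of reversing the hash. The following is only a sketch with made-up data; the tie-break on key name is an assumption added so the result is predictable.

```perl
use strict;
use warnings;

# Hypothetical data with a duplicate value (a and c are both 3).
my %hash = ( a => 3, b => 1, c => 3, d => 2 );

# Sort keys numerically by value; fall back to the key name on ties.
my @by_value = sort { $hash{$a} <=> $hash{$b} or $a cmp $b } keys %hash;

print "$_ => $hash{$_}\n" for @by_value;
```

Because nothing is inverted, no entry is lost when two keys share a value.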

Related

Is it possible to push a key-value pair directly to hash in perl?

I know pushing is only possible for arrays, not hashes. But it would be much more convenient to be able to push a key-value pair directly to a hash (and I am still surprised it is not possible in Perl). I have an example:
#!/usr/bin/perl -w
#superior words begin first, example of that word follow
my @ar = qw[Animals,dog Money,pound Jobs,doctor Food];
my %hash;
my $bool = 1;
sub marine{
my $ar = shift if $bool;
for(@$ar){
my @ar2 = split /,/, $_;
push %hash, ($ar2[0] => $ar2[1]);
}
}
marine(\@ar);
print "$_\n" for keys %hash;
Here I have an array in which each element holds two words separated by a comma. I would like to make a hash from it, making the first word a key and the second a value (and if the value is missing, as with the last word Food, then no value at all, simply undef). How can I do this in Perl?
Output:
Possible attempt to separate words with commas at ./a line 4.
Experimental push on scalar is now forbidden at ./a line 12, near ");"
Execution of ./a aborted due to compilation errors.
I might be oversimplifying things here, but why not simply assign to the hash rather than trying to push into it?
That is, replace this unsupported expression:
push %hash, ($ar2[0] => $ar2[1]);
With:
$hash{$ar2[0]} = $ar2[1];
If I incorporate this in your code, and then dump the resulting hash at the end, I get:
$VAR1 = {
'Food' => undef,
'Money' => 'pound',
'Animals' => 'dog',
'Jobs' => 'doctor'
};
Split inside map and assign directly to a hash like so:
my @ar = qw[Animals,dog Money,pound Jobs,doctor Food];
my %hash_new = map {
my @a = split /,/, $_, 2;
@a == 2 ? @a : (@a, undef)
} @ar;
Note that this can also handle the case with more than one comma delimiter (hence splitting into a max of 2 elements). This can also handle the case with no commas, such as Food - in this case, the list with the single element plus the undef is returned.
If you need to push multiple key/value pairs to (another) hash, or merge hashes, you can assign a list of hashes like so:
%hash = (%hash_old, %hash_new);
Note that the same keys in the old hash will be overwritten by the new hash.
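As a small illustration of the merge described above (the data is invented for the example), the sketch below shows that later pairs win, and also shows a hash slice assignment as one possible in-place alternative.

```perl
use strict;
use warnings;

# Hypothetical data for illustration only.
my %hash_old = ( Animals => 'cat', Money => 'pound' );
my %hash_new = ( Animals => 'dog', Food  => undef );

# Later pairs win, so values from %hash_new replace those from %hash_old.
my %merged = ( %hash_old, %hash_new );

# In-place alternative: a hash slice assignment adds or overwrites keys
# without rebuilding the whole hash.
@hash_old{ keys %hash_new } = values %hash_new;

print join( ", ", map { "$_=" . ( $merged{$_} // 'undef' ) } sort keys %merged ), "\n";
```

The slice form relies on keys and values returning entries in the same order for an unmodified hash. The `//` operator used for printing needs Perl 5.10 or later.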
We can assign this array to a hash and Perl will automatically treat the values in the array as key-value pairs. The odd elements (first, third, fifth) become the keys and the even elements (second, fourth, sixth) become the corresponding values. See https://perlmaven.com/creating-hash-from-an-array
use strict;
use warnings;
use Data::Dumper qw(Dumper);
my @ar;
my %hash;
#The code in the enclosing block has warnings enabled,
#but the inner block has disabled (misc and qw) related warnings.
{
#You specified an odd number of elements to initialize a hash, which is odd,
#because hashes come in key/value pairs.
no warnings 'misc';
#If your code has use warnings turned on, as it should, then you'll get a warning about
#Possible attempt to separate words with commas
no warnings 'qw';
@ar = qw[Animals,dog Money,pound Jobs,doctor Food];
# join the content of array with comma => Animals,dog,Money,pound,Jobs,doctor,Food
# split the content using comma and assign to hash
# split function returns the list in list context, or the size of the list in scalar context.
%hash = split(",", (join(",", @ar)));
}
print Dumper(\%hash);
Output
$VAR1 = {
'Animals' => 'dog',
'Money' => 'pound',
'Jobs' => 'doctor',
'Food' => undef
};

I don't understand @unordered{@array} in Perl

I'm an inexperienced Perl programmer. I'm reading the book Beginning Perl by Curtis “Ovid” Poe and have a problem with this code.
#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;
my @array = ( 3, 4, 1, 4, 7, 7, 4, 1, 3, 8 );
my %unordered;
@unordered{@array} = undef;
foreach my $key (keys %unordered) {
print "Unordered: $key\n";
}
@unordered{@array} = undef;
What does this code mean?
Could anyone explain it to me?
@unordered{@array} = undef
This is called a hash slice. It acts on multiple keys of the hash at once (the keys are given by @array), and in this particular case it leaves each of those keys with the value undef.
I assume it is to show you some things about the keys of hashes.
You have the @array and then create entries in your hash %unordered with the values from your array as keys and no values. This happens in this line:
@unordered{@array} = undef;
You then iterate through all the keys of that hash and print them. So you can see what keys there are and in what order they are in the hash.
You will probably notice that every key exists only once, despite several values appearing more than once in the array. This is because the keys of a hash are always unique.
You can also see that the keys come out in no particular order, as the keys of a hash may be ordered in any way the implementation likes.
Yes, I understand everything you're explaining to me. Thank you for your time.
But why do we first initialize an array with numbers, then create an empty hash, and then create entries in the hash %unordered with the values from the array as keys and no values?
In the for loop we use a "%" (%unordered). We change the array "@array" into the hash "%unordered", and then we use %unordered again in the for loop.
It looks a bit strange.
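One plausible reason for the pattern, which is an assumption here and not something the book excerpt states, is de-duplication: hash keys are unique, so loading the array into a hash slice drops the repeats. A minimal sketch, with the variable names @unique_sorted, %seen and @unique_in_order invented for the example:

```perl
use strict;
use warnings;

my @array = ( 3, 4, 1, 4, 7, 7, 4, 1, 3, 8 );

# Hash slice: every distinct element becomes a key (all values are undef).
my %unordered;
@unordered{@array} = ();

# The keys come back in no particular order, so sort them if order matters.
my @unique_sorted = sort { $a <=> $b } keys %unordered;

# Alternative that keeps first-seen order: remember what was already seen.
my %seen;
my @unique_in_order = grep { !$seen{$_}++ } @array;

print "sorted:     @unique_sorted\n";
print "first-seen: @unique_in_order\n";
```

The grep form is handy when the original order of the array carries meaning, since the hash alone does not preserve it.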

Confusion about proper usage of dereference in Perl

I noticed the other day that - while altering values in a hash - that when you dereference a hash in Perl, you actually are making a copy of that hash. To confirm I wrote this quick little script:
#! perl
use warnings;
use strict;
my %h = ();
my $hRef = \%h;
my %h2 = %{$hRef};
my $h2Ref = \%h2;
if($hRef eq $h2Ref) {
print "\n\tThey're the same $hRef $h2Ref";
}
else {
print "\n\tThey're NOT the same $hRef $h2Ref";
}
print "\n\n";
The output:
They're NOT the same HASH(0x10ff6848) HASH(0x10fede18)
This leads me to realize that there could be spots in some of my scripts where they aren't behaving as expected. Why is it even like this in the first place? If you're passing or returning a hash, it would be more natural to assume that dereferencing the hash would allow me to alter the values of the hash being dereferenced. Instead I'm just making copies all over the place without any real need/reason to beyond making syntax a little more obvious.
I realize the fact that I hadn't even noticed this until now shows its probably not that big of a deal (in terms of the need to go fix in all of my scripts - but important going forward). I think its going to be pretty rare to see noticeable performance differences out of this, but that doesn't alter the fact that I'm still confused.
Is this by design in perl? Is there some explicit reason I don't know about for this; or is this just known and you - as the programmer - expected to know and write scripts accordingly?
The problem is that you are making a copy of the hash to work with in this line:
my %h2 = %{$hRef};
And that is understandable, since many posts here on SO use that idiom to make a local name for a hash, without explaining that it is actually making a copy.
In Perl, a hash is a plural value, just like an array. This means that in list context (such as you get when assigning to a hash) the aggregate is taken apart into a list of its contents. This list of pairs is then assembled into a new hash as shown.
What you want to do is work with the reference directly.
for (keys %$hRef) {...}
for (values %$href) {...}
my $x = $href->{some_key};
# or
my $x = $$href{some_key};
$$href{new_key} = 'new_value';
When working with a normal hash, you have the sigil which is either a % when talking about the entire hash, a $ when talking about a single element, and @ when talking about a slice. Each of these sigils is then followed by an identifier.
%hash # whole hash
$hash{key} # element
@hash{qw(a b)} # slice
To work with a reference named $href simply replace the string hash in the above code with $href. In other words, $href is the complete name of the identifier:
%$href # whole hash
$$href{key} # element
@$href{qw(a b)} # slice
Each of these could be written in a more verbose form as:
%{$href}
${$href}{key}
@{$href}{qw(a b)}
Which is again a substitution of the string '$href' for 'hash' as the name of the identifier.
%{hash}
${hash}{key}
@{hash}{qw(a b)}
You can also use a dereferencing arrow when working with an element:
$hash->{key} # exactly the same as $$hash{key}
But I prefer the doubled sigil syntax since it is similar to the whole aggregate and slice syntax, as well as the normal non-reference syntax.
So to sum up, any time you write something like this:
my @array = @$array_ref;
my %hash = %$hash_ref;
You will be making a copy of the first level of each aggregate. When using the dereferencing syntax directly, you will be working on the actual values, and not a copy.
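The difference between copying and working through the reference can be seen in a short sketch. The hash contents and variable names below are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical data: one plain value and one nested hash reference.
my %orig = ( count => 1, nested => { n => 1 } );
my $ref  = \%orig;

my %copy = %$ref;        # list assignment: builds a new, separate hash
$copy{count} = 2;        # changes only the copy
$ref->{count} = 10;      # changes %orig through the reference

# The copy is shallow: both hashes hold the same inner hash reference.
$copy{nested}{n} = 99;

print "orig count:  $orig{count}\n";
print "copy count:  $copy{count}\n";
print "orig nested: $orig{nested}{n}\n";
```

Only the first level is duplicated, so a change to the nested hash made through the copy is still visible from the original.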
If you want a REAL local name for a hash, but want to work on the same hash, you can use the local keyword to create an alias.
sub some_sub {
my $hash_ref = shift;
our %hash; # declare a lexical name for the global %{__PACKAGE__::hash}
local *hash = \%$hash_ref;
# install the hash ref into the glob
# the `\%` bit ensures we have a hash ref
# use %hash here, all changes will be made to $hash_ref
} # local unwinds here, restoring the global to its previous value if any
That is the pure Perl way of aliasing. If you want to use a my variable to hold the alias, you can use the module Data::Alias
You are confusing the actions of dereferencing, which does not inherently create a copy, and using a hash in list context and assigning that list, which does. $hashref->{'a'} is a dereference, but most certainly does affect the original hash. This is true for $#$arrayref or values(%$hashref) also.
Without the assignment, just the list context %$hashref is a mixed beast; the resulting list contains copies of the hash keys but aliases to the actual hash values. You can see this in action:
$ perl -wle'$x={"a".."f"}; for (%$x) { $_=chr(ord($_)+10) }; print %$x'
epcnal
vs.
$ perl -wle'$x={"a".."f"}; %y=%$x; for (%y) { $_=chr(ord($_)+10) }; print %$x; print %y'
efcdab
epcnal
but %$hashref isn't acting any differently than %hash here.
No, dereferencing does not create a copy of the referent. It's my that creates a new variable.
$ perl -E'
my %h1; my $h1 = \%h1;
my %h2; my $h2 = \%h2;
say $h1;
say $h2;
say $h1 == $h2 ?1:0;
'
HASH(0x83b62e0)
HASH(0x83b6340)
0
$ perl -E'
my %h;
my $h1 = \%h;
my $h2 = \%h;
say $h1;
say $h2;
say $h1 == $h2 ?1:0;
'
HASH(0x9eae2d8)
HASH(0x9eae2d8)
1
No, $#{$someArrayHashRef} does not create a new array.
If perl did what you suggest, then variables would get aliased very easily, which would be far more confusing. As it is, you can alias variables with globbing, but you need to do so explicitly.

What's the safest way to iterate through the keys of a Perl hash?

If I have a Perl hash with a bunch of (key, value) pairs, what is the preferred method of iterating through all the keys? I have heard that using each may in some way have unintended side effects. So, is that true, and is one of the two following methods best, or is there a better way?
# Method 1
while (my ($key, $value) = each(%hash)) {
# Something
}
# Method 2
foreach my $key (keys(%hash)) {
# Something
}
The rule of thumb is to use the function most suited to your needs.
If you just want the keys and do not plan to ever read any of the values, use keys():
foreach my $key (keys %hash) { ... }
If you just want the values, use values():
foreach my $val (values %hash) { ... }
If you need the keys and the values, use each():
keys %hash; # reset the internal iterator so a prior each() doesn't affect the loop
while(my($k, $v) = each %hash) { ... }
If you plan to change the keys of the hash in any way except for deleting the current key during the iteration, then you must not use each(). For example, this code to create a new set of uppercase keys with doubled values works fine using keys():
%h = (a => 1, b => 2);
foreach my $k (keys %h)
{
$h{uc $k} = $h{$k} * 2;
}
producing the expected resulting hash:
(a => 1, A => 2, b => 2, B => 4)
But using each() to do the same thing:
%h = (a => 1, b => 2);
keys %h;
while(my($k, $v) = each %h)
{
$h{uc $k} = $h{$k} * 2; # BAD IDEA!
}
produces incorrect results in hard-to-predict ways. For example:
(a => 1, A => 2, b => 2, B => 8)
This, however, is safe:
keys %h;
while(my($k, $v) = each %h)
{
if(...)
{
delete $h{$k}; # This is safe
}
}
All of this is described in the perl documentation:
% perldoc -f keys
% perldoc -f each
One thing you should be aware of when using each is that it has
the side effect of adding "state" to your hash (the hash has to remember
what the "next" key is). When using code like the snippets posted above,
which iterate over the whole hash in one go, this is usually not a
problem. However, you will run into hard to track down problems (I speak from
experience ;), when using each together with statements like
last or return to exit from the while ... each loop before you
have processed all keys.
In this case, the hash will remember which keys it has already returned, and
when you use each on it the next time (maybe in a totally unrelated piece of
code), it will continue at this position.
Example:
my %hash = ( foo => 1, bar => 2, baz => 3, quux => 4 );
# find key 'baz'
while ( my ($k, $v) = each %hash ) {
print "found key $k\n";
last if $k eq 'baz'; # found it!
}
# later ...
print "the hash contains:\n";
# iterate over all keys:
while ( my ($k, $v) = each %hash ) {
print "$k => $v\n";
}
This prints:
found key bar
found key baz
the hash contains:
quux => 4
foo => 1
What happened to keys "bar" and "baz"? They're still there, but the
second each starts where the first one left off, and stops when it reaches the end of the hash, so we never see them in the second loop.
The place where each can cause you problems is that it's a true, non-scoped iterator. By way of example:
while ( my ($key,$val) = each %a_hash ) {
print "$key => $val\n";
last if $val; #exits loop when $val is true
}
# but "each" hasn't reset!!
while ( my ($key,$val) = each %a_hash ) {
# continues where the last loop left off
print "$key => $val\n";
}
If you need to be sure that each gets all the keys and values, you need to make sure you use keys or values first (as that resets the iterator). See the documentation for each.
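A minimal sketch of that reset, using invented data and counter names: the first loop starts from a half-used iterator and misses a pair, while the second calls keys in void context first and sees everything.

```perl
use strict;
use warnings;

# Hypothetical data for illustration only.
my %hash = ( foo => 1, bar => 2, baz => 3, quux => 4 );

# Consume one pair and stop early, leaving the iterator mid-way.
my ($k1) = each %hash;

my $without_reset = 0;
while ( my ( $k, $v ) = each %hash ) {
    $without_reset++;    # only the remaining pairs are visited
}

# Running each() to exhaustion above restarted the iterator, so advance it again.
my ($k2) = each %hash;

keys %hash;              # void-context call: resets the iterator

my $with_reset = 0;
while ( my ( $k, $v ) = each %hash ) {
    $with_reset++;       # every pair is visited
}

print "without reset: $without_reset, with reset: $with_reset\n";
```

Putting the reset immediately before the while loop keeps the loop independent of whatever earlier code did with the same hash.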
Using the each syntax will prevent the entire set of keys from being generated at once. This can be important if you're using a hash tied to a database with millions of rows. You don't want to generate the entire list of keys all at once and exhaust your physical memory. In this case each serves as an iterator whereas keys actually generates the entire array before the loop starts.
So, the only place "each" is of real use is when the hash is very large (compared to the memory available). That is only likely to happen when the hash doesn't live in memory itself, unless you're programming a handheld data collection device or something else with little memory.
If memory is not an issue, the map or keys paradigm is usually the more prevalent and easier to read.
A few miscellaneous thoughts on this topic:
There is nothing unsafe about any of the hash iterators themselves. What is unsafe is modifying the keys of a hash while you're iterating over it. (It's perfectly safe to modify the values.) The only potential side-effect I can think of is that values returns aliases which means that modifying them will modify the contents of the hash. This is by design but may not be what you want in some circumstances.
John's accepted answer is good with one exception: the documentation is clear that it is not safe to add keys while iterating over a hash. It may work for some data sets but will fail for others depending on the hash order.
As already noted, it is safe to delete the last key returned by each. This is not true for keys as each is an iterator while keys returns a list.
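The aliasing behaviour of values mentioned above can be seen in a short sketch (data invented for the example): writing to the loop variable over values changes the hash, while doing the same over keys does not.

```perl
use strict;
use warnings;

# Hypothetical data for illustration only.
my %h = ( a => 1, b => 2 );

# values() yields aliases, so assigning to $_ writes back into the hash.
$_ *= 2 for values %h;

# keys() yields copies, so this loop leaves the hash keys untouched.
for my $k ( keys %h ) { $k = uc $k }

print join( ", ", map { "$_ => $h{$_}" } sort keys %h ), "\n";
```

This is convenient for in-place updates, but it also means a loop over values that modifies $_ by accident will silently change the hash.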
I always use method 2 as well. The only benefit of using each is if you're just reading (rather than re-assigning) the value of the hash entry, you're not constantly de-referencing the hash.
I may get bitten by this one but I think that it's personal preference. I can't find any reference in the docs to each() being different than keys() or values() (other than the obvious "they return different things" answer). In fact the docs state that they use the same iterator and that they all return actual list values instead of copies of them, and that modifying the hash while iterating over it using any call is bad.
All that said, I almost always use keys() because to me it is usually more self documenting to access the key's value via the hash itself. I occasionally use values() when the value is a reference to a large structure and the key to the hash was already stored in the structure, at which point the key is redundant and I don't need it. I think I've used each() 2 times in 10 years of Perl programming and it was probably the wrong choice both times =)
I usually use keys and I can't think of the last time I used or read a use of each.
Don't forget about map, depending on what you're doing in the loop!
map { print "$_ => $hash{$_}\n" } keys %hash;
I would say:
Use whatever's easiest to read/understand for most people (so keys, usually, I'd argue).
Use whatever you decide on consistently throughout the whole code base.
This gives two major advantages:
It's easier to spot "common" code so you can refactor it into functions/methods.
It's easier for future developers to maintain.
I don't think it's more expensive to use keys over each, so no need for two different constructs for the same thing in your code.