Initialize hash elements to 0 - perl

I declare a hash in Perl by doing this:
my %hash = ();
I go on adding elements to the hashes. Sometimes, $hash{$x} is not defined, meaning it probably is null. So when I try to print it, I do not get anything. I expect to see a 0 in case that entry $x is not defined. Can someone tell me how do I do this? How do I initialize hash elements to 0 initially?

Instead of trying to set a default value, you can print a default value when you encounter an undefined value by using the defined-or operator, // (works for Perl 5.10 and higher).
In this example, when you print your hash elements, you either print the element, or if it is not defined, 0:
use 5.010;
say $hash{$x} // 0;
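To see why // is a better fit here than ||, a small sketch may help. The sample hash and the @report array are made up for illustration; nothing here comes from the asker's program.

```perl
use strict;
use warnings;

# Sample data, assumed for illustration: one ordinary value, one empty
# string, one explicit undef; key 'd' is absent altogether.
my %hash = (a => 5, b => '', c => undef);

my @report;
for my $x (qw(a b c d)) {
    my $dor = $hash{$x} // 0;   # falls back only when the value is undef
    my $or  = $hash{$x} || 0;   # falls back on any false value
    push @report, sprintf "%s: defined-or=[%s] logical-or=[%s]", $x, $dor, $or;
}
print "$_\n" for @report;
```

The two operators only disagree for key b: the empty string is defined, so // keeps it, while || replaces it with 0.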

There is no such thing as a default value for a hash key that has not been set.
The usual way to handle this is to test whether the value is defined, with the defined function (see perldoc -f defined):
[~]=> perl -e '%x = (a=>1, c=>5); for (a..d) { print "$_ => " . (defined $x{$_} ? $x{$_} :0) . "\n" }'
a => 1
b => 0
c => 5
d => 0

Note that one doesn't test if a key is defined, but if a key exists using the exists key word. One tests to see if the value is defined with the defined keyword.
print "Exists\n" if exists $hash{$key};
print "Defined\n" if defined $hash{$key};
print "True\n" if $hash{$key};
The first tests if the key exists,
the 2nd if the value is defined and
the 3rd if the value returns a true value.
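The three tests can be combined to tell the possible states of an entry apart. This is only a sketch: describe_entry is a hypothetical helper name and the sample hash is invented for the example.

```perl
use strict;
use warnings;

# Classify a hash entry; describe_entry is a hypothetical helper name.
sub describe_entry {
    my ($hash, $key) = @_;
    return 'missing'   unless exists  $hash->{$key};   # no such key
    return 'undefined' unless defined $hash->{$key};   # key present, value undef
    return $hash->{$key} ? 'true' : 'false';           # defined value, truth test
}

my %hash = (a => 1, b => 0, c => undef);
print "$_: ", describe_entry(\%hash, $_), "\n" for qw(a b c d);
```

The order of the checks matters: a value of 0 is defined but false, and a key can exist while holding undef.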

It is possible to initialise the hash.
my %hash = ();
@hash{@keys} = (0) x @keys;
This creates a hash, then uses a hash slice to give it an element for each key in @keys, assigning each one a value from a list of 0s that is as long as @keys.
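A runnable version of the slice assignment might look like the following. The key names are assumed for illustration, and this only helps when the full set of keys is known up front.

```perl
use strict;
use warnings;

my @keys = qw(apple banana cherry);   # example keys, assumed for illustration
my %hash;
@hash{@keys} = (0) x @keys;           # hash slice: one 0 per key

$hash{apple}++;                       # counters now start from 0
print "$_ => $hash{$_}\n" for sort keys %hash;

# A key that was never initialised is still undef, so a fallback is needed.
print "durian => ", $hash{durian} // 0, "\n";
```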

Related

Explanation of Perl's syntax from module MoreUtils.pm

I am seeking an explanation of the syntax of the uniq and firstidx functions from the module MoreUtils.pm.
I already know other ways to get the unique elements of an array that contains duplicates and to find the first matching index in an array, shown below:
## remove duplicate elements ##
my @arr = qw (2 4 2 8 3 4 6);
my @uniq = ();
my %hash = ();
@uniq = grep {!$hash{$_}++ } @arr;
### first index ###
@arr = qw (Java ooperl Ruby cgiperl Python);
my ($index) = grep {$arr[$_] =~ /perl/} 0..$#arr;
Can anybody please explain the second line of the sub uniq function below, which uses map and the ternary operator, from MoreUtils.pm:
map {$h{$_}++ == 0 ? $_ : () } @_;
and also
the &@ passed to the firstidx function and the line below in the body of the function:
local *_ = \$_[$i];
What I understand is that a subroutine reference is passed to firstidx, but a more detailed explanation would be much appreciated.
Thanks.
Your second question was answered in the comments.
Your first question asks about map {$h{$_}++ == 0 ? $_ : () } #; from List::MoreUtils. In recent versions, it's actually in List::MoreUtils::PP (for Pure Perl) since many of the subroutines are also implemented in C and XS. Here's the current version of the Pure Perl uniq:
sub uniq (@)
{
my %seen = ();
my $k;
my $seen_undef;
grep { defined $_ ? not $seen{ $k = $_ }++ : not $seen_undef++ } @_;
}
This has the same map technique although it's using grep instead. The grep goes through all of the elements in @_ and has to return either true or false for each of them. The elements which evaluate to true end up in the output list. The code then wants to make an element evaluate to true the first time it sees it and false the rest of the times.
In this code it handles undef separately. If the current element is not undef, it does the first branch of the conditional operator and the second branch otherwise. Now let's look at the branches.
The defined case adds an element to a hash. No one left code comments about the use of $k but it probably has something to do with not disturbing $_. That $k becomes the key for the hash:
not $seen{ $k = $_ }++
If that is the first time that key has been encountered the value of the hash is undef. That post-increment does its work after the value is used so hold off on thinking about that for a moment. The low-precedence not sees the value of $seen{$k}, which is undef. The not turns the false value of undef into true. That true indicates that the grep has seen $_ for the first time. It becomes part of the output list. Then the ++ does its work and increments the undef value to 1. On all subsequent encounters with the same value the hash value will be true. The not will turn the true value into false and that element won't be in the output list.
The map you show implements the grep. It returns an element when the condition is true and returns no elements when it is false:
map {$h{$_}++ == 0 ? $_ : () } @_;
For each element it adds it as the key in the hash and compares the value to 0. The first time an element is seen that value is undef. In numeric context an undef is 0. So, the == returns true and the first branch of the conditional operator fires, returning $_ to the output list. The ++ then increments the hash value from undef to 1. The next time it encounters the same value the hash value is not 0 and the second branch of the conditional operator returns the empty list. That adds no elements to the output list.
Newer versions of List::MoreUtils don't use the construct any more, but as Сухой27 explained,
map { CONDITION ? $_ : () } LIST
is just a fancy alternative to
grep { CONDITION } LIST
I don't think there's any overarching reason the author chose map for this implementation, and in fact it was simplified to grep in later versions of List::MoreUtils.
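To check the equivalence yourself, here is a minimal sketch that puts both forms side by side. The names uniq_map and uniq_grep are made up for this example and are not the module's own code.

```perl
use strict;
use warnings;

# uniq_map and uniq_grep are illustrative names, not the module's own code.
sub uniq_map  { my %h; return map  { $h{$_}++ == 0 ? $_ : () } @_ }
sub uniq_grep { my %h; return grep { !$h{$_}++ } @_ }

my @arr = qw(2 4 2 8 3 4 6);
print join(' ', uniq_map(@arr)),  "\n";   # 2 4 8 3 6
print join(' ', uniq_grep(@arr)), "\n";   # 2 4 8 3 6
```

Both keep the first occurrence of each element and preserve the original order.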
The firstidx syntax is firstidx BLOCK LIST. Like the builtin map and grep, it is specified that the code in BLOCK will operate on the variable $_, and that the code is allowed to make changes to $_. So in the firstidx implementation, it is not sufficient to set $_ to each value in LIST. Rather, $_ must be aliased to each element of LIST so that a change in $_ inside BLOCK also results in a change in the element in the LIST. This is accomplished by manipulating the symbol table:
local *_ = \$scalar # make $_ an alias of $scalar
And you use local so that when firstidx is done, we haven't clobbered any useful information that was previously in the $_ variable.
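Putting the pieces together, a function in this style could be sketched roughly as below. my_firstidx is a hypothetical stand-in written for this explanation, not a copy of the List::MoreUtils source; the (&@) prototype is what lets it be called with a bare block followed by a list.

```perl
use strict;
use warnings;

# my_firstidx is a hypothetical stand-in, not the List::MoreUtils source.
sub my_firstidx (&@) {
    my $code = shift;                # the BLOCK arrives as a code reference
    for my $i (0 .. $#_) {
        local *_ = \$_[$i];          # alias $_ to the caller's element
        return $i if $code->();
    }
    return -1;                       # nothing matched
}

my @arr = qw(Java ooperl Ruby cgiperl Python);
my $idx = my_firstidx { /perl/ } @arr;
print "$idx\n";                      # 1

# Because $_ is an alias, changes made in the block reach the original array.
my @nums = (1, 2, 3);
my_firstidx { $_ *= 10; 0 } @nums;   # block never matches, but edits each element
print "@nums\n";                     # 10 20 30
```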

What does this code do in Perl: keys(%$hash) ...?

print "Enter the hash \n";
$hash=<STDIN>;chop($hash);
@keys = keys (%$hash);
@values = values (%$hash);
Since Google ignores special characters, there was no way I could find out what the "%$hash" thing does and how this is supposed to work.
keys(%$hash) returns the keys of the hash referenced by the value in $hash. A hash is a type of associative array, which (more or less) means an array that's indexed by strings (called "keys") instead of by numbers.
In this particular case, $hash contains a string. When one uses a string as a reference, dereferencing it accesses the package variable whose name matches the string (a symbolic reference).
If the full program is
%FOO = ( a=>1, b=>2 );
%BAR = ( c=>3, d=>4 );
print "Enter the hash \n";
$hash=<STDIN>;chop($hash);
@keys = keys(%$hash);
Then,
@keys will contain a and b if the user enters FOO.
@keys will contain c and d if the user enters BAR.
@keys will contain E2BIG, EACCES, EADDRINUSE and many more if the user enters !.
@keys can contain paths if the user enters INC.
@keys will be empty for most other values.
(The keys are returned in an arbitrary order.)
The last three cases are surely unintentional. This is why the posted code is awful code. This is what the code should have been:
use strict; # Always use these as they
use warnings 'all'; # find/prevent numerous errors.
my %FOO = ( a=>1, b=>2 );
my %BAR = ( c=>3, d=>4 );
my %inputs = ( FOO => \%FOO, BAR => \%BAR );
print "Enter the name of a hash: ";
my $hash_name = <STDIN>;
chomp($hash_name);
my $hash = $inputs{$hash_name}
or die("No such hash\n");
my @keys = keys(%$hash);
...
keys() returns the keys of the specified hash. In the code you wrote, the name of the hash to look at (and extract the keys and values of) is being specified via STDIN, which is really bizarre behavior.
The code you posted is nonsensical, but what it should be doing is dereferencing a hash reference, provided that you have a valid hash reference stored in your scalar $hash (which you don't).
For example:
use strict;
use warnings;
use Data::Dump;
my $href = {
foo => 'bar',
bat => 'baz',
};
dd(keys(%$href)); # ("bat", "foo")
dd(values(%$href)); # ("baz", "bar")
The keys() function will return a list consisting of all the keys of the hash.
The returned values are copies of the original keys in the hash, so
modifying them will not affect the original hash.
The values() function does the exact same thing, except with the values of the hash (obviously).
So long as a given hash is unmodified you may rely on keys, values and
each to repeatedly return the same order as each other.
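The ordering guarantee can be checked with a short sketch. The hash contents are sample data, and $mismatches is just a name chosen for this example.

```perl
use strict;
use warnings;

my $href = { foo => 'bar', bat => 'baz', qux => 'quux' };  # sample data

my @k = keys %$href;
my @v = values %$href;

# The order is arbitrary, but position $i of @k pairs with position $i of @v.
my $mismatches = grep { $href->{ $k[$_] } ne $v[$_] } 0 .. $#k;
print "mismatches: $mismatches\n";   # mismatches: 0
```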
For more help with references, see perlreftut, perlref, and maybe perldsc if you're feeling adventurous.

Perl "Not an ARRAY reference" error

I'll be glad if someone can enlighten me as to my mistake:
my %mymap;
@mymap{"balloon"} = {1,2,3};
print $mymap{"balloon"}[0] . "\n";
$mymap{'balloon'} is a hash reference, not an array reference. The expression {1,2,3} creates an anonymous hash:
{
'1' => 2,
'3' => undef
}
You assigned it to a slice of %mymap corresponding to the list of keys: ('balloon'). Since the key list was 1 item and the value list was one item, you did the same thing as
$mymap{'balloon'} = { 1 => 2, 3 => undef };
If you had used strict and warnings it would have clued you in to your error. I got:
Scalar value @mymap{"balloon"} better written as $mymap{"balloon"} at - line 3.
Odd number of elements in anonymous hash at - line 3.
If you had put 'use strict; use warnings;' at the top of your code, you would probably have had better error messages.
What you're doing is creating a hash called mymap. A hash stores data as key => value pairs.
You're then trying to assign a list of values to the key balloon. Your small code snippet had two issues: 1. you did not address the mymap hash correctly (use $mymap{...} for a single element), 2. if you want to store a list, you should use square brackets to create an array reference:
my %mymap;
$mymap{"balloon"} = [1,2,3];
print $mymap{"balloon"}[0] . "\n";
this prints '1'.
You can also just use an array:
my @balloon = (1,2,3);
print $balloon[0] . "\n";
Well, first off, always use strict; use warnings;. If you had, it might have told you about what is wrong here.
Here's what you do in your program:
my %mymap; # declare hash %mymap
@mymap{"balloon"} = {1,2,3}; # assign to a one-key hash slice of %mymap;
# {1,2,3} builds an anonymous hash reference, not an array
print $mymap{"balloon"}[0] . "\n"; # try to index that hash reference as an array
For it to do what you expect, do:
my %mymap = ( 'balloon' => [ 1,2,3 ] );
print $mymap{'balloon'}[0];
Okay, a few things...
%mymap is a hash. $mymap{"balloon"} is a scalar--namely, the value of the hash %mymap corresponding to the key "balloon". @mymap{"balloon"} is an attempt at what's called a hash slice--basically, you can use these to assign a bunch of values to a bunch of keys at once: @hash{@keys}=@values.
So, if you want to assign an array reference to $mymap{"balloon"}, you'd need something like:
$mymap{"balloon"}=[1,2,3];
To access the elements, you can use -> like so:
$mymap{"balloon"}->[0] #equals 1
$mymap{"balloon"}->[1] #equals 2
$mymap{"balloon"}->[2] #equals 3
Or, you can omit the arrows: $mymap{"balloon"}[0], etc.
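If you're ever unsure which kind of reference you stored, ref will tell you. A small sketch follows; the kite key is invented for contrast and is not from the question.

```perl
use strict;
use warnings;

my %mymap;
$mymap{balloon} = [1, 2, 3];          # square brackets: array reference
$mymap{kite}    = { 1 => 2 };         # curly braces: hash reference (kite is a made-up key)

print ref($mymap{balloon}), "\n";     # ARRAY
print ref($mymap{kite}), "\n";        # HASH
print $mymap{balloon}[0], "\n";       # 1
print scalar @{ $mymap{balloon} }, "\n";   # 3 (number of elements)
```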

perl prevent key value duplicates

I'm looping through a file line by line, it has key->value pair that I'm then outputting to xml. How can I do a check to make sure I haven't already outputted this key/value pair?
In C# I would do it so easy by inserting into dictionary then just using .Contains(), any tips in perl?
Perl has the defined and exists keywords that operate on hash elements.
$hash{'foo'} = 'bar';
print defined $hash{'foo'}; # prints 1
print exists $hash{'foo'}; # prints 1
For most purposes, they do the same thing. The one subtle difference is when the hash value is the special "undefined" value:
$hash{'baz'} = undef;
print defined $hash{'baz'}; # doesn't print 1
print exists $hash{'baz'}; # prints 1
You can do the same thing using a perl hash.
my %seen;
while (my $line = <$filehandle>)
{
next if ($seen{$line});
print $line;
$seen{$line} = 1;
}
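Applied to key/value pairs, the same idea could look like the sketch below. The "key->value" line format and the sample lines are assumptions, since the question doesn't show the real file; the XML output step is left out.

```perl
use strict;
use warnings;

# Input lines are assumed to look like "key->value"; that format and the
# sample data are illustrative only.
my @lines = ("a->1\n", "b->2\n", "a->1\n", "a->3\n");

my %seen;
my @unique;
for my $line (@lines) {
    chomp(my $pair = $line);     # work on a copy without the newline
    next if $seen{$pair}++;      # true from the second sighting onward
    push @unique, $pair;         # first sighting: keep it
}
print "$_\n" for @unique;
```

Here the whole pair is the hash key, so a->1 and a->3 count as different entries. Keying on just the key part would instead drop every later value for a.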

How to clear a Perl hash

Let's say we define an anonymous hash like this:
my $hash = {};
And then use the hash afterwards. Then it's time to empty or clear the hash for
reuse. After some Google searching, I found:
%{$hash} = ()
and:
undef %{$hash}
Both will serve my needs. What's the difference between the two? Are they both identical ways to empty a hash?
%$hash_ref = (); makes more sense than undef-ing the hash. Undef-ing the hash says that you're done with the hash. Assigning an empty list says you just want an empty hash.
Yes, for this purpose they are equivalent. Both remove any existing keys and values from the table and leave the hash empty.
See perldoc -f undef:
undef EXPR
undef Undefines the value of EXPR, which must be an lvalue. Use only
on a scalar value, an array (using "@"), a hash (using "%"), a
subroutine (using "&"), or a typeglob (using "*")...
Examples:
undef $foo;
undef $bar{'blurfl'}; # Compare to: delete $bar{'blurfl'};
undef @ary;
undef %hash;
However, you should not use undef to remove the value of anything except a scalar. For other variable types, set it to the "empty" version of that type -- e.g. for arrays or hashes, @foo = (); %bar = ();
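A quick sketch shows that both forms empty the hash in place, so any other reference to it sees the change and the hash can be reused afterwards. The $same variable and the sample keys are made up for this example.

```perl
use strict;
use warnings;

my $hash = { a => 1, b => 2 };
my $same = $hash;                 # second reference to the same hash

%$hash = ();                      # empty it in place
print scalar(keys %$same), "\n";  # 0

$hash->{c} = 3;
undef %$hash;                     # also leaves an empty, reusable hash
print scalar(keys %$same), "\n";  # 0

$hash->{d} = 4;
print join(',', keys %$same), "\n";   # d
```

Note that $hash = {} would be different: it points $hash at a brand new hash and leaves the old one untouched for anyone else holding a reference.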