Can a Hash have duplicate keys or values - perl

Can a Hash have duplicate keys or values?

It can have duplicate values, but not duplicate keys.

For both hashes and arrays, only one scalar can be stored at a given key. ("Keys are unique.") If they weren't, you couldn't do
$h{a} = 1;
$h{a} = 2;
$val = $h{a}; # 2
$a[4] = 1;
$a[4] = 2;
$val = $a[4]; # 2
If you wanted to associate multiple values with a key, you could place a reference to an array (or hash) at that key, and add the value to that array (or hash).
for my $n (4,5,6,10) {
    if ($n % 2) {
        push @{ $nums{odd} }, $n;
    } else {
        push @{ $nums{even} }, $n;
    }
}
say join ', ', @{ $nums{even} };
See perllol for more on this.
As for values, multiple elements can have the same value in both hashes and arrays.
$counts{a} = 3;
$counts{b} = 3;
$counts[5] = 3;
$counts[6] = 3;

Assuming you are talking about a "%hash", then:
Duplicate keys are not allowed.
Duplicate values are allowed.
This is easy to reason about because a hash maps a particular Key to a particular Value; the Value plays no part in the look-up and is thus independent of other Values.

Please try running this code; it executes without errors.
I hope this is what you were asking!
#!/usr/bin/perl
use strict;
use warnings;
my %hash = ('a' => 1, 'a' => 2, 'b' => 4 );
print values %hash, "\n\n";
print keys %hash, "\n\n";
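To see what happens to the repeated key, it can help to inspect the hash after the assignment. This is only a minimal sketch (the variable $key_count is illustrative): when the same key appears twice in the list, the later pair replaces the earlier one, so only two keys remain.

```perl
use strict;
use warnings;

# The key 'a' appears twice in the list; the later value replaces the earlier.
my %hash = ( 'a' => 1, 'a' => 2, 'b' => 4 );

# keys in scalar context gives the number of keys.
my $key_count = keys %hash;
print "number of keys: $key_count\n";             # number of keys: 2

# Sort the keys so the output order is predictable.
print "$_ => $hash{$_}\n" for sort keys %hash;    # a => 2, then b => 4
```

So the assignment is accepted, but the hash never actually holds two entries for 'a'.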

You can try the Hash::MultiKey module from CPAN.
(I used Data::Dumper to show exactly what the hash looks like; it is not necessary here.)
use Data::Dumper;
use Hash::MultiKey;
tie my %multi_hash, 'Hash::MultiKey';
$multi_hash{['foo', 'foo', 'baz']} = "some_data";
for (keys %multi_hash) {
    print @$_, "\n";
}
print Dumper \%multi_hash;
And the output should be:
foofoobaz
$VAR1 = {
'ARRAY(0x98b6978)' => 'some_data'
};
So technically speaking, Hash::MultiKey lets you use a reference as a hash key.

Yes, a hash can effectively have duplicate keys if you make each one unique with a trailing sequence number, as I demonstrate below...
Key example: BirthDate|LastNameFirst4Chars|FirstNameInitial|IncNbr
"1959-12-19|Will|K|1" ... "1959-12-19|Will|K|74".
Note: This might be a useful Key for record look ups if someone did not remember their Social Security Nbr
#-- CODE SNIPPET:
@Offsets=(); #-- we will build an array of Flat File record "byte offsets" to random access
#-- for all records matching this ALT KEY with DUPS
for ($i=1; $i<=99; $i++) {
    $KEY=$BirthDate . "|" . $LastNameFirst4Chars . "|" . $FirstNameInitial . "|" . $i;
    if (exists $Hash{$KEY}) {
        push @Offsets, $Hash{$KEY}; #-- add another hash VALUE to the end of the array
    }
}

Related

Accessing a multi-dimensional hash using strings

I have a large multi-dimensional hash which is an import of a JSON structure.
my %bighash;
There is an element in %bighash called:
$bighash{'core'}{'dates'}{'year'} = 2019.
I have a separate string variable containing core.dates.year, which I would like to use to extract 2019 from %bighash.
I've written this code:
my @keys = split(/\./, 'core.dates.year');
my %hash = ();
my $hash_ref = \%hash;
for my $key ( @keys ){
    $hash_ref->{$key} = {};
    $hash_ref = $hash_ref->{$key};
}
which when I execute:
say Dumper \%hash;
outputs:
$VAR1 = {
'core' => {
'dates' => {
'year' => {}
}
}
};
All good so far. But what I now want to do is say:
print $bighash{\%hash};
Which I want to return 2019. But nothing is being returned, or I'm seeing an error about "Use of uninitialized value within %bighash in concatenation (.) or string at script.pl line 1371, line 17 (#1)"...
Can someone point me into what is going on?
My project involves embedding strings in an external file which is then replaced with actual values from %bighash so it's just string interpolation.
Thanks!
Can someone point me into what is going on [when I use $bighash{\%hash}]?
Hash keys are strings, and the stringification of \%hash is something like HASH(0x655178). The only element in %bighash has core (not HASH(0x655178)) as its key, so the hash lookup returns undef.
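A small sketch of that stringification, using stand-in data for %bighash and %hash (the variables $as_key and $found are illustrative only):

```perl
use strict;
use warnings;

my %bighash = ( core => { dates => { year => 2019 } } );
my %hash    = ( core => { dates => { year => {} } } );

# Used as a hash key, a reference becomes a string such as "HASH(0x55d0c8...)".
my $as_key = "" . \%hash;
print "key actually looked up: $as_key\n";

# No element of %bighash has that string as its key.
my $found = exists $bighash{$as_key} ? 1 : 0;
print "found: $found\n";    # found: 0
```

The nested keys inside %hash are never consulted; only the address-like string is used for the lookup.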
Useful tools:
sub dive_val :lvalue { my $p = \shift; $p = \( $$p->{$_} ) for @_; $$p } # For setting
sub dive { my $r = shift; $r = $r->{$_} for @_; $r } # For getting
dive_val(\%hash, split /\./, 'core.dates.year') = 2019;
say dive(\%hash, split /\./, 'core.dates.year');
Hash::Fold would seem to be helpful here. You can "flatten" your hash and then access everything with a single key.
use Hash::Fold 'flatten';
my $flathash = flatten(\%bighash, delimiter => '.');
print $flathash->{"core.dates.year"};
There are no multi-dimensional hashes in Perl. Hashes are key/value pairs. Your understanding of Perl data structures is incomplete.
Re-imagine your data structure as follows
my %bighash = (
core => {
dates => {
year => 2019,
},
},
);
There is a difference between the round parentheses () and the curly braces {}. The % sigil on the variable name indicates that it's a hash, that is a set of unordered key/value pairs. The round () are a list. Inside that list are two scalar values, i.e. a key and a value. The value is a reference to another, anonymous, hash. That's why it has curly {}.
Each of those levels is a separate, distinct data structure.
This rewrite of your code is similar to what ikegami wrote in his answer, but less efficient and more verbose.
my @keys = split( /\./, 'core.dates.year' );
my $value = \%bighash;
for my $key (@keys) {
    $value = $value->{$key};
}
print $value;
It drills down step by step into the structure and eventually gives you the final value.
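The loop above assumes that every key in the path exists and that each intermediate value is a hash reference. Here is a hedged variant that returns undef as soon as the path cannot be followed. The helper name lookup_path is made up for this sketch and does not come from any module.

```perl
use strict;
use warnings;

# lookup_path is an illustrative helper, not a library function.
sub lookup_path {
    my ( $data, $path ) = @_;
    my $value = $data;
    for my $key ( split /\./, $path ) {
        # Stop if we can no longer descend or the key is absent.
        return undef unless ref($value) eq 'HASH' && exists $value->{$key};
        $value = $value->{$key};
    }
    return $value;
}

my %bighash = ( core => { dates => { year => 2019 } } );

my $year    = lookup_path( \%bighash, 'core.dates.year' );
my $missing = lookup_path( \%bighash, 'core.nope.year' );

print "year: $year\n";    # year: 2019
print "missing is ", ( defined $missing ? "defined" : "undef" ), "\n";    # missing is undef
```

Checking with exists before descending also avoids autovivifying empty hashes in %bighash for paths that are not there.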

How to copy a nested hash

How can I copy a multi-level nested hash (say, %A) to another hash (say, %B)? I want to make sure that the new hash does not contain the same references (pointers) as the original hash (%A).
If I change anything in the original hash (%A), it should not change
anything in the new hash (%B).
I want a generic way to do it. I know I can do it by reassigning by value
for each level of keys (like %{ $b{kb} } = %a;).
But there should be a solution that works irrespective of the number of key levels (hash of hash of hash of ... hash of hash).
PROBLEM EXAMPLE
use Data::Dumper;
my %a=(q=>{
q1=>1,
q2=>2,
},
w=>2);
my %b;
my %c;
%{ $b{kb} } = %a;
print "\%b=[".Data::Dumper::Dumper (%b)."] ";
%{ $c{kc} } = %a; # $b{kb} = \%a;
print "\n\%c=[".Data::Dumper::Dumper (%c)."] ";
# CHANGE THE VALUE OF KEY IN ORIGINAL HASH %a
$a{q}{q1} = 2; # $c{kc} = \%a;
print "\n\%b=[".Data::Dumper::Dumper (%b)."] ";
print "\n\%c=[".Data::Dumper::Dumper (%c)."] ";
Appreciate your help
What you want is commonly known as a "deep copy", whereas the assignment operator does a "shallow copy".
use Storable qw( dclone );
my $copy = dclone($src);
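A short sketch of dclone applied to data shaped like the hash in the question. Storable ships with Perl; the names %src and %copy are illustrative. After the clone, changing a nested value in the original leaves the copy untouched.

```perl
use strict;
use warnings;
use Storable qw( dclone );

my %src = ( q => { q1 => 1, q2 => 2 }, w => 2 );

# dclone takes and returns a reference, so dereference to get a hash back.
my %copy = %{ dclone( \%src ) };

# Change a nested value in the original only.
$src{q}{q1} = 99;

print "original: $src{q}{q1}\n";    # original: 99
print "copy: $copy{q}{q1}\n";       # copy: 1
```

With a plain assignment (%copy = %src) both hashes would share the same inner hash under q, and the copy would show 99 as well.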

Is there a simple way to validate a hash of hash element comparison?

Is there a simple way to validate a hash of hash element comparison?
I need to compare a Perl hash of hash element, $Table{$key1}{$key2}{K1}{Value}, against all the other elements in the hash.
The third key runs from K1 to Kn; I want to compare those elements, and the other keys are the same.
if ($Table{$key1}{$key2}{K1}{Value} eq $Table{$key1}{$key2}{K2}{Value}
eq $Table{$key1}{$key2}{K3}{Value} )
{
#do whatever
}
Something like this may work:
use List::MoreUtils 'all';
my @keys = map "K$_", 1..10;
print "All keys equal"
    if all { $Table{$key1}{$key2}{$keys[1]}{Value} eq $Table{$key1}{$key2}{$_}{Value} } @keys;
I would use Data::Dumper to help with a task like this, especially for a more general problem (where the third key is more arbitrary than 'K1'...'Kn'). Use Data::Dumper to stringify the data structures and then compare the strings.
use Data::Dumper;
# this line is needed to assure that hashes with the same keys output
# those keys in the same order.
$Data::Dumper::Sortkeys = 1;
my $string1 = Data::Dumper->Dump([ $Table{$key1}{$key2}{k1} ]);
for (my $n = 2; exists($Table{$key1}{$key2}{"k$n"}); $n++) {
    my $string_n = Data::Dumper->Dump([ $Table{$key1}{$key2}{"k$n"} ]);
    if ($string1 ne $string_n) {
        warn "key 'k$n' is different from 'k1'";
    }
}
This can be used for the more general case where $Table{$key1}{$key2}{k7}{value} itself contains a complex data structure. When a difference is detected, though, it doesn't give you much help figuring out where that difference is.
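A self-contained sketch of the stringify-and-compare idea, using made-up data under the keys k1 to k3. The helper name same_structure is hypothetical and is not part of Data::Dumper.

```perl
use strict;
use warnings;
use Data::Dumper;

# Sort hash keys so that equal structures produce equal strings.
$Data::Dumper::Sortkeys = 1;

# same_structure is an illustrative helper, not part of Data::Dumper.
sub same_structure {
    my ( $left, $right ) = @_;
    return Dumper($left) eq Dumper($right);
}

my %table = (
    k1 => { Value => [ 1, 2 ], note => 'x' },
    k2 => { note => 'x', Value => [ 1, 2 ] },    # same content, written in another order
    k3 => { Value => [ 1, 3 ], note => 'x' },
);

my $k1_k2 = same_structure( $table{k1}, $table{k2} ) ? 1 : 0;
my $k1_k3 = same_structure( $table{k1}, $table{k3} ) ? 1 : 0;

print "k1 vs k2: $k1_k2\n";    # k1 vs k2: 1
print "k1 vs k3: $k1_k3\n";    # k1 vs k3: 0
```

One thing to keep in mind with this approach: Data::Dumper may print the number 1 and the string '1' differently, so values that compare equal with eq can still produce different dumps.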
A fairly complex structure. You should be looking into using object oriented programming techniques. That would greatly simplify your programming and the handling of these complex structures.
First of all, let's simplify a bit. When you say:
$Table{$key1}{$key2}{k1}{value}
Do you really mean:
my $value = $Table{$key1}->{$key2}->{k1};
or
my $actual_value = $Table{$key1}->{$key2}->{k1}->{Value};
I'm going to assume the first one. If I'm wrong, let me know, and I'll update my answer.
Let's simplify:
my %hash = %{$Table{$key1}->{$key2}};
Now, we're just dealing with a hash. There are two techniques you can use:
Sort the keys of this hash by value, then if two keys have the same value, they will be next to each other in the sorted list, making it easy to detect duplicates. The advantage is that all the duplicate keys would be printed together. The disadvantage is that this is a sort which takes time and resources.
Reverse the hash, so it's keyed by value and the value of that key is the key. If a key already exists, we know the other key has a duplicate value. This is faster than the first technique because no sorting is involved. However, duplicates will be detected, but not printed together.
Here's the first technique:
my %hash = %{$Table{$key1}->{$key2}};
my $previous_value;
my $previous_key;
foreach my $key (sort {$hash{$a} cmp $hash{$b}} keys %hash) {
if (defined $previous_key and $previous_value eq $hash{$key}) {
print "\$hash{$key} is a duplicate of \$hash{$previous_key}\n";
}
$previous_value = $hash{$key};
$previous_key = $key;
}
And the second:
my %hash = %{$Table{$key1}->{$key2}};
my %reverse_hash;
foreach my $key (keys %hash) {
my $value = $hash{$key};
if (exists $reverse_hash{$value}) {
print "\$hash{$reverse_hash{$value}} has the same value as \$hash{$key}\n";
}
else {
$reverse_hash{$value} = $key;
}
}
An alternative approach is to write a utility function that checks whether some function returns the same value for every key:
sub AllSame (&\%) {
    my ($c, $h) = @_;
    my @k = keys %$h;
    my $ref;
    $ref = $c->() for $h->{shift @k};
    $ref ne $c->() and return for @$h{@k};
    return 1
}
print "OK\n" if AllSame {$_->{Value}} %{$Table{$key1}{$key2}};
But if you start thinking this way, you may find the following approach much more generic (the recommended way):
sub AllSame (@) {
    my $ref = shift;
    $ref ne $_ and return for @_;
    return 1
}
print "OK\n" if AllSame map {$_->{Value}} values %{$Table{$key1}{$key2}};
If the mapping operation is expensive, you can write a lazy counterpart:
sub AllSameMap (&@) {
    my $c = shift;
    my $ref;
    $ref = $c->() for shift;
    $ref ne $c->() and return for @_;
    return 1
}
print "OK\n" if AllSameMap {$_->{Value}} values %{$Table{$key1}{$key2}};
If you want only some subset of keys, you can use hash slice syntax, e.g.:
print "OK\n" if AllSame map {$_->{Value}} @{$Table{$key1}{$key2}}{map "K$_", 1..10};

Inverting a Hash's Key and Values in Perl

I would like to make the value the key, and the key the value. What is the best way to go about doing this?
Adapted from http://www.dreamincode.net/forums/topic/46400-swap-hash-values/:
Assuming your hash is stored in %hash:
while (($key, $value) = each %hash) {
$hash2{$value}=$key;
}
%hash=%hash2;
It seems a much more elegant solution can be achieved with reverse (http://www.misc-perl-info.com/perl-hashes.html#reverseph):
%nhash = reverse %hash;
Note that with reverse, duplicate values will be overwritten.
Use reverse:
use Data::Dumper;
my %hash = ('month', 'may', 'year', '2011');
print Dumper \%hash;
%hash = reverse %hash;
print Dumper \%hash;
As mentioned, the simplest is
my %inverse = reverse %original;
It "fails" if multiple elements have the same value. You could create an HoA to handle that situation.
my %inverse;
push @{ $inverse{ $original{$_} } }, $_ for keys %original;
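To make the difference concrete, here is a small sketch with invented fruit/colour data. It is only an illustration: plain reverse keeps one key per value, while the hash of arrays keeps them all.

```perl
use strict;
use warnings;

my %original = ( apple => 'red', cherry => 'red', banana => 'yellow' );

# Plain reverse keeps only one key per value; which one survives is not predictable.
my %lossy = reverse %original;

# Hash of arrays: every original key is kept under its value.
# Sorting the keys here just makes the order inside each array predictable.
my %inverse;
push @{ $inverse{ $original{$_} } }, $_ for sort keys %original;

print scalar( keys %lossy ), " keys after reverse\n";    # 2 keys after reverse
print "$_ => @{ $inverse{$_} }\n" for sort keys %inverse;
# red => apple cherry
# yellow => banana
```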
So you want to reverse keys & vals in a hash? So use reverse... ;)
%hash2 = reverse %hash;
Reversing (k1 => v1, k2 => v2) yields (v2 => k2, v1 => k1), and that is what you want. ;)
my %orig_hash = (...);
my %new_hash;
%new_hash = map { $orig_hash{$_} => $_ } keys(%orig_hash);
The map-over-keys solution is more flexible. What if your value is not a simple value?
my %forward;
my %reverse;
# %forward is built such that each key maps to a value that is a hash ref:
# { a => 'something', b => 'something else' }
%reverse = map { join(',', @{ $forward{$_} }{qw(a b)}) => $_ } keys %forward;
Here is a way to do it using Hash::MultiValue.
use experimental qw(postderef);
sub invert {
use Hash::MultiValue;
my $mvh = Hash::MultiValue->from_mixed(shift);
my $inverted;
$mvh->each( sub { push $inverted->{ $_[1] }->#* , $_[0] } ) ;
return $inverted;
}
To test this we can try the following:
my %test_hash = (
q => [qw/1 2 3 4/],
w => [qw/4 6 5 7/],
e => ["8"],
r => ["9"],
t => ["10"],
y => ["11"],
);
my $wow = invert(\%test_hash);
my $wow2 = invert($wow);
use DDP;
print "\n \%test_hash:\n\n" ;
p %test_hash;
print "\n \%test_hash inverted as:\n\n" ;
p $wow ;
# We need to sort the contents of the multi-value array reference
# for the is_deeply() comparison:
map {
    $test_hash{$_} = [ sort { $a cmp $b || $a <=> $b } @{ $test_hash{$_} } ]
} keys %test_hash ;
map {
    $wow2->{$_} = [ sort { $a cmp $b || $a <=> $b } @{ $wow2->{$_} } ]
} keys %$wow2 ;
use Test::More ;
is_deeply(\%test_hash, $wow2, "double inverted hash == original");
done_testing;
Addendum
Note that in order to pass the gimmicky test here, the invert() function relies on %test_hash having array references as values. To work around this if your hash values are not array references, you can "coerce" the regular/mixed hash into a multi-value hash that Hash::MultiValue can then bless into an object. However, this approach means even single values will appear as array references:
for ( keys %test_hash ) {
if ( ref $test_hash{$_} ne 'ARRAY' ) {
$test_hash{$_} = [ $test_hash{$_} ]
}
}
which is longhand for:
ref($_) or $_ = [ $_ ] for values %test_hash ;
This would only be needed to get the "round trip" test to pass.
Assuming all your values are simple and unique strings, here is one more easy way to do it.
%hash = ( ... );
@newhash{values %hash} = (keys %hash);
This is called a hash slice. Since you're using %newhash to produce a list of keys, you change the % to a @.
Unlike the reverse() method, this will insert the new keys and values in the same order as they were in the original hash. keys and values always return their values in the same order (as does each).
If you need more control over it, like sorting it so that duplicate values get the desired key, use two hash slices.
%hash = ( ... );
@newhash{ @hash{sort keys %hash} } = (sort keys %hash);
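A runnable sketch of the single-slice version, with invented data and assuming the values are unique strings as stated above:

```perl
use strict;
use warnings;

my %hash = ( one => 1, two => 2, three => 3 );

# keys and values return elements in matching order, so the slice pairs
# each old value (the new key) with its old key (the new value).
my %newhash;
@newhash{ values %hash } = keys %hash;

print "$_ => $newhash{$_}\n" for sort keys %newhash;
# 1 => one
# 2 => two
# 3 => three
```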

What's the best practice for Perl hashes with array values?

What is the best practice to solve this?
if (... )
{
    push (@{$hash{'key'}}, @array ) ;
}
else
{
    $hash{'key'} ="";
}
Is it bad practice to store an array under a key in one case and just an empty string ("") in the other?
I'm not sure I understand your question, but I'll answer it literally as asked for now...
my @array = (1, 2, 3, 4);
my $arrayRef = \@array; # alternatively: my $arrayRef = [1, 2, 3, 4];
my %hash;
$hash{'key'} = $arrayRef; # or again: $hash{'key'} = [1, 2, 3, 4]; or $hash{'key'} = \@array;
The crux of the problem is that arrays or hashes take scalar values... so you need to take a reference to your array or hash and use that as the value.
See perlref and perlreftut for more information.
EDIT: Yes, you can add empty strings as values for some keys and references (to arrays, hashes, scalars, or typeglobs/filehandles) for other keys. Either way, they're all still scalars.
You'll want to look at the ref function for figuring out how to disambiguate between the reference types and normal scalars.
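For instance, a rough sketch of telling the cases apart with ref. The helper name describe and the sample data are invented for illustration.

```perl
use strict;
use warnings;

my %hash = (
    list  => [ 1, 2, 3 ],
    table => { a => 1 },
    plain => "",
);

# describe is an illustrative helper; ref returns '' for non-references.
sub describe {
    my ($value) = @_;
    my $type = ref $value;
    return $type eq 'ARRAY' ? 'array reference'
         : $type eq 'HASH'  ? 'hash reference'
         : $type eq ''      ? 'plain scalar'
         :                    "other reference ($type)";
}

print "$_: ", describe( $hash{$_} ), "\n" for sort keys %hash;
# list: array reference
# plain: plain scalar
# table: hash reference
```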
It's probably simpler to use explicit array references:
my $arr_ref = \@array;
$hash{'key'} = $arr_ref;
Actually, doing the above and using push result in the same data structure:
my @array = qw/ one two three four five /;
my $arr_ref = \@array;
my %hash;
my %hash2;
$hash{'key'} = $arr_ref;
print Dumper \%hash;
push @{$hash2{'key'}}, @array;
print Dumper \%hash2;
This gives:
$VAR1 = {
'key' => [
'one',
'two',
'three',
'four',
'five'
]
};
$VAR1 = {
'key' => [
'one',
'two',
'three',
'four',
'five'
]
};
Using explicit array references uses fewer characters and is easier to read than the push @{$hash{'key'}}, @array construct, IMO.
Edit: For your else{} block, it's probably less than ideal to assign an empty string. It would be a lot easier to just skip the if-else construct and, later on when you're accessing values in the hash, do an if( defined( $hash{'key'} ) ) check. That's a lot closer to standard Perl idiom, and you don't waste memory storing empty strings in your hash.
If you store empty strings instead, you'll have to use ref() to find out what kind of data you have in your value, and that is less clear than just doing a defined-ness check.
I'm not sure what your goal is, but there are several things to consider.
First, if you are going to store an array, do you want to store a reference to the original value or a copy of the original values? In either case, I prefer to avoid the dereferencing syntax and take references when I can:
$hash{key} = \@array; # just a reference
use Clone qw( clone ); # or a similar module
$hash{key} = clone( \@array );
Next, do you want to add to the values that exist already, even if it's a single value? If you are going to have array values, I'd make all the values arrays even if you have a single element. Then you don't have to decide what to do and you remove a special case:
$hash{key} = [] unless defined $hash{key};
push @{ $hash{key} }, @values;
That might be your "best practice" answer, which is often the technique that removes as many special cases and extra logic as possible. When I do this sort of thing in a module, I typically have an add_value method that encapsulates this magic where I don't have to see it or type it more than once.
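As a rough sketch of that idea, here is a hypothetical add_value written as a plain sub rather than a method. The name, the signature and the return value are assumptions made for illustration and are not taken from any module.

```perl
use strict;
use warnings;

# add_value is an illustrative helper: it appends to the array stored at
# $key, creating the array on first use.
sub add_value {
    my ( $hash, $key, @values ) = @_;
    $hash->{$key} = [] unless defined $hash->{$key};
    push @{ $hash->{$key} }, @values;
    return scalar @{ $hash->{$key} };    # number of values now stored
}

my %hash;
add_value( \%hash, 'key', 'one' );
my $count = add_value( \%hash, 'key', 'two', 'three' );

print "count: $count\n";              # count: 3
print "values: @{ $hash{key} }\n";    # values: one two three
```

Callers never need to know whether the key already existed, which is the special case the paragraph above is trying to remove.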
If you already have a non-reference value in the hash key, that's easy to fix too:
if( defined $hash{key} and ! ref $hash{key} ) {
$hash{key} = [ $hash{key} ];
}
If you already have non-array reference values that you want to be in the array, you do something similar. Maybe you want an anonymous hash to be one of the array elements:
if( defined $hash{key} and ref $hash{key} eq ref {} ) {
$hash{key} = [ $hash{key} ];
}
Dealing with the revised notation:
if (... )
{
    push (@{$hash{'key'}}, @array);
}
else
{
    $hash{'key'} = "";
}
we can immediately tell that you are not following the standard advice that protects novices (and experts!) from their own mistakes. You're using a symbolic reference, which is not a good idea.
use strict;
use warnings;
my %hash = ( key => "value" );
my @array = ( 1, "abc", 2 );
my @value = ( 22, 23, 24 );
push(@{$hash{'key'}}, @array);
foreach my $key (sort keys %hash) { print "$key = $hash{$key}\n"; }
foreach my $value (@array) { print "array $value\n"; }
foreach my $value (@value) { print "value $value\n"; }
This does not run:
Can't use string ("value") as an ARRAY ref while "strict refs" in use at xx.pl line 8.
I'm not sure I can work out what you were trying to achieve. Even if you remove the 'use strict;' line, the code shown does not detect a change from the push operation.
use warnings;
my %hash = ( key => "value" );
my @array = ( 1, "abc", 2 );
my @value = ( 22, 23, 24 );
push @{$hash{'key'}}, @array;
foreach my $key (sort keys %hash) { print "$key = $hash{$key}\n"; }
foreach my $value (@array) { print "array $value\n"; }
foreach my $value (@value) { print "value $value\n"; }
foreach my $value (@{$hash{'key'}}) { print "h_key $value\n"; }
push @value, @array;
foreach my $key (sort keys %hash) { print "$key = $hash{$key}\n"; }
foreach my $value (@array) { print "array $value\n"; }
foreach my $value (@value) { print "value $value\n"; }
Output:
key = value
array 1
array abc
array 2
value 22
value 23
value 24
h_key 1
h_key abc
h_key 2
key = value
array 1
array abc
array 2
value 22
value 23
value 24
value 1
value abc
value 2
I'm not sure what is going on there.
If your problem is how to replace an empty string value you had stored before with an array onto which you can push your values, this might be the best way to do it:
if ( ... ) {
    my $r = \$hash{ $key }; # $hash{ $key } autoviv-ed
    $$r = [] unless ref $$r;
    push @$$r, @values;
}
else {
    $hash{ $key } = "";
}
I avoid multiple hash look-ups by saving a copy of the auto-vivified slot.
Note the code relies on a scalar or an array being the entire universe of things stored in %hash.
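A minimal runnable sketch of that slot-reference trick, with the condition dropped and sample data invented for illustration:

```perl
use strict;
use warnings;

my %hash   = ( key => "" );    # an empty string stored earlier
my @values = ( 1, 2 );

my $r = \$hash{key};           # reference to the hash slot itself
$$r = [] unless ref $$r;       # swap the non-reference value for an empty array
push @$$r, @values;

print "values: @{ $hash{key} }\n";    # values: 1 2
```

Because $r points at the slot inside %hash, assigning through $$r changes the hash element directly.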