Find duplicates in a hash, store grouped in a new hash - perl

I have the following hash, and I need to find the duplicates between the top most hash values 6 and 4. I've tried a few solutions to no avail, and am not too familiar with Perl syntax to make it work.
The Hash I Have
$VAR1 = {
'6' => [ '1000', '2000', '4000' ],
'4' => [ '1000', '2000', '3000' ]
};
The Hash I Need
$VAR1 = {
'6' => ['4000'],
'4' => ['3000'],
'Both' => ['1000','2000']
}

Find all common elements, e.g. by deduplicating with a hash.
Find all elements that are not common.
Given two arrays #x, #y, this would mean:
use List::MoreUtils 'uniq';
# find all common elements
my %common;
$common{$_}++ for uniq(#x), uniq(#y); # count all elements
$common{$_} == 2 or delete $common{$_} for keys %common;
# remove entries from #x, #y that are common:
#x = grep { not $common{$_} } #x;
#y = grep { not $common{$_} } #y;
# Put the common strings in an array:
my #common = keys %common;
Now all that is left is to do a bit of dereferencing and such, but that should be fairly trivial.

No need for other modules. perl hashes are really good for finding uniq or common values
my %both;
# count the number of times any element was seen in 4 and 6
$both{$_}++ for (#{$VAR1->{4}}, #{$VAR1->{6}});
for (keys %both) {
# if the count is one the element isn't in both 4 and 6
delete $both{$_} if( $both{$_} == 1 );
}
$VAR1->{Both} = [keys %both];

Related

How to print specific key in an array (Perl) [duplicate]

This question already has answers here:
Simple hash search by value
(5 answers)
Closed 5 years ago.
I recently started learning Perl, so I'm not too familiar with the functions and syntax.
If I have a Perl array and some variables,
#!/usr/bin/perl
use strict;
use warnings;
my #numbers = (a =>1, b=> 2, c => 3, d =>4, e => 5);
my $x;
my $range = 5;
$x = int(rand($range));
print "$x";
to generate a random number between 1-5, how can I get the program to print the actual key (a, b, c, etc.) instead of just the number (1, 2, 3, 4, 5)?
It seems that you want to do a reverse lookup, key-by-value, opposite to what we get from a hash. Since a hash is a list you can reverse it and use the resulting hash to look up by number.
A couple of corrections: you need a hash variable (not an array), and you need to add 1 to your rand integer generator so to have the desired 1..5 range
use warnings;
use strict;
use feature 'say';
my %numbers = (a => 1, b => 2, c => 3, d => 4, e => 5);
my %lookup_by_number = reverse %numbers; # values need be unique
my $range = 5;
my $x = int(rand $range) + 1;
say $lookup_by_number{$x};
Without reversing the hash you'd need to iterate the hash %numbers over values, testing each against $x so to find its key.
If there are same values for various keys in your original hash then you have to do it by hand since reverse-ing would attempt to create a hash with duplicate keys, in which case only the last one assigned remains. So you'd lose some values. One way
my #at_num = grep { $x == $numbers{$_} } keys %numbers;
as in the post that this was marked as duplicate of.
But then you should build a data structure for reverse lookup so to not search through the list every time information is needed. This can be a hash where keys are the list of unique numbers while their values are then array references (arrayrefs) with corresponding keys from the original hash
use warnings;
use strict;
my %num = (a => 1, b => 2, c => 1, d => 3, e => 2); # with duplicate values
my %lookup_by_num;
foreach my $key (keys %num) {
push #{ $lookup_by_num{$num{$key}} }, $key;
}
say "$_ => [ #{$lookup_by_num{$_}} ]" for keys %lookup_by_num;
This prints
1 => [ c a ]
3 => [ d ]
2 => [ e b ]
A nice way to display complex data structures is via Data::Dumper, or Data::Dump (or others).
The expression #{ $lookup_by_num{ $num{$key} } } extracts the value of %lookup_by_num for the key $num{$key}and dereferences it #{ ... }, so that it can then push the $key to it. The critical part of this is that the first time it encounters $num{$key} it autovivifies the arrayref and its corresponding key. See this post with its references for details.
There's many ways to do it. For example, declare "numbers" as a hash rather than an array. Note that the keys come first in each key-value pair, and here you want to use your random int as the key:
my %numbers = ( 0 => 'a', 1 => 'b', 2 => 'c', 3 => 'd', 4 => 'e' );
Then you can look up the "key" as you call it using:
my $key = $numbers{$x};
Note that rand( $x ); returns a number greater than or equal to zero and less than $x. So if you want integers in the range 1-5, you must add 1 in your code: at the moment you'll get 0-4, not 1-5.
Firstly, arrays don't have keys (well, they kind of do, but they're integers and not the values you want). So I think you want a hash, not an array.
my %numbers = (a =>1, b=> 2, c => 3, d =>4, e => 5);
And if you want to get the letter, given the integer then you need the reverse of this hash:
my %rev_numbers = %numbers;
Note that reversing a hash like this only works if the values in your original hash are unique (because reversing a hash makes the values into keys and hash keys are always unique).
Then, you can just look up an integer in your %rev_hash to get its associated letter.
my $integer = 3;
say $rev_numbers{$integer}; # prints 'c'

How to alter the hash values with new values in perl?

How can i reset the hash values in perl
use warnings;
use strict;
my %hash = qw(one 1 two 2 three 3 four 4);
my #key = keys(%hash);
my #avz = (9..12);
my %vzm;
print "Original hash and keys : ",%hash,"\n";
for(my $i = 0; $i<=scalar #avz; $i++){
my #new = "$key[$i] $avz[$i] ";
push(%vzm , #new);
}
print "modified hash and keys",%vzm,"\n";
I tried to alter the keys of original hash with another keys. How can i do it
This program give the error is:
Original hash and keys : three3one1two2four4
Not an ARRAY reference at key.pl line 10.
I expect the output is
Original hash and keys : three3one1two2four4
modified hash and keys : three11one9two10four12
How can i do it
Ok, first off - you're doing something nasty in your code:
You're trying to take an ordered data structure - an array - and push it into a keyed data structure, which has no particular ordering defined.
This isn't going to work very well - it technically works, because internally perl treats arrays and hashes similarly.
But for example your first assignment - what you're actually getting is:
my %hash = (
one => 1,
two => 2,
three => 3,
four => 4
);
You can access the keys (in no particular order) via keys(). And the values via values(). But to try and treat it like an array is undefined behaviour.
To add elements to your array:
$hash{'nine'} = 9;
To delete elements from your array:
delete ( $hash{'one'} );
You can iterate on keys or values - and combined with sort even do them in some sort of order. (Just bear in mind for sorting alphanumeric numbers you'll have a custom sort job).
foreach my $key ( sort keys %hash ) {
print "$key => $hash{$key}\n";
}
(Note - this is sorting by alphanumeric string, so gives:
four => 4
one => 1
three => 3
two => 2
If you want to sort by value:
foreach my $key ( sort { $hash{$a} <=> $hash{$b} } keys %hash ) {
print "$key => $hash{$key}\n";
}
And so you'll get:
one => 1
two => 2
three => 3
four => 4
So the real question remains - what are you actually trying to accomplish? The point of a hash is to give you an unordered mini-database of key-value pairs. Treating one like an array doesn't make an awful lot of sense. Either you're iterating hash elements in arbitrary order, or you're applying a specific sort to it - but one where you're relying on getting elements in a particular order is a bad plan - it may work, but it's not guaranteed to work, and that makes for bad code.
You have to keep the order of the keys in some array, or take it from original list
my #tmp = qw(one 1 two 2 three 3 four 4);
my %hash = #tmp;
# 'one', 'two', ..
my #key = #tmp[ grep !($_%2), 0 .. $#tmp ];
# ..
for my $i (0 .. $#avz) {
$vzm{ $key[$i] } = $avz[$i];
}
or using hash slice as more perlish approach,
#vzm{ #key } = #avz;
You can't do what you want (replace the values for keys in the hash in the order they originally were added) without keeping track of that order separately, since the hash doesn't have any particular order. In other words, this:
my #key = keys(%hash);
needs to be this:
my #key = ( 'one', 'two', 'three', 'four' );
Once you have that, you can just assign the values all at once with a hash slice:
my %vzm;
#vzm{#key} = #avz;
To create a hash element, you use assignment to $var{$key}.
for (my $i = 0; $i < scalar #avz; $i++) {
$vzm{$key[$i]} = $avz[$i];
}
Note also that the loop condition should be <, not <=. List/array indexes end at scalar #avz - 1.

How to access a customize hash structure in Perl?

I have a custom perl hash data structure . The sturcture is like bellow:
%myhash = (
1 => {
'scf1' => [
1,3,0,4,6,7,8,
],
'sef2' => [
10,15,20,30,
]
},
2 => {
'scf1' => [
10,3,0,41,6,47,81,
],
'scf3' => [
1,66,0,123,4,1,2435,33445,1
]
},
);
How I can access this kind of perl structure.
I'm afraid your code ... is showing signs that you are misunderstanding what hashes do, and how they work. Specifically, when you're referencing #{$myhash} - this is NOT the same as the %myhash that you undef.
Likewise - what's going on with #features? It looks like you're trying to build an array of arrays, but doing so by iterating through fetchrow_array and then pushing. Multidimensional arrays are sometimes the right tool for the job, but it is unclear why it would be suitable for what you're doing. (After all, you don't use it for anything else in this piece of code).
You've also got $line[2] - which is also not doing what you might think - it does NOT refer to $line, it's the second element of a list called #line - which doesn't exist.
You are also trying to process is list of database entries, and set it '-1' if it's undef.
We need some more detail about what data you're getting out of your database - $sth -> fetchrow_array() could be anything. However, I'd strongly suggest that what you want to do is name each of the fields as you go. I'd suggest you DON'T want to be using $line there, because it's ... well, wrong. You're iterating columns in the row you've just fetched.
Which field in your fetched array are the keys to your hash? It looks like you're trying to key on 'field 5' 'field 7' and trying to insert values of 'field 1' and 'field 2'. Is that correct?
Oh, and turn on use strict; use warnings whilst you're at it.
get the inner array:
my #array = #{$hash{1}->{'scf1'}};
# is same as
# my $array_ref = $hash{1}->{'scf1'};
# my #array = #{$array_ref};
# then you can
my $some_thing = $array[0];
or get one element:
$hash{1}->{'scf1'}->[0];
From your Data::Dumper dump, I see that you have a hash called %myhash. Each element in that hash contains a reference to another hash. And, each element in that inner hash contains a reference to an array.
Let's take your Data::Dumper, and restate it like this:
$myhash{1}->{sff1} = [1, 3, 0, 4, 6, 7, 8];
$myhash{1}->{sef2} = [10, 15, 20, 30];
$myhash{2}->{scf1} = [10, 3, 0, 41, 6, 47, 81];
$myhash{2}->{scf3} = [1, 66, 0, 123, 4, 2435, 33445, 1];
Same thing. It's just a bit more compact.
To print this out, we'll need to loop through each of these layers of references:
#
# First loop: The outer hash which is a plain normal hash
#
for my $outer_key ( sort keys %myhash ) {
#
# Each element in that hash points to another hash reference. Dereference
#
my %inner_hash = %{ $myhash{$outer_key} };
for my $inner_key ( sort keys %inner_hash ) {
#
# Finally, this is our array reference in the inner hash. Let's dereference and print
#
print "\$myhash{$outer_key}->{$inner_key}: ";
my #array = #{ $myhash{$outer_key}->{$inner_key} };
for my $value ( #array ) {
print "$value";
}
print "\n";
}
}

How can I find out if a hash has an odd number of elements in assignment?

How could I find out if this hash has an odd number of elements?
my %hash = ( 1, 2, 3, 4, 5 );
Ok, I should have written more information.
sub routine {
my ( $first, $hash_ref ) = #_;
if ( $hash_ref refers to a hash with odd numbers of elements ) {
"Second argument refers to a hash with odd numbers of elements.\nFalling back to default values";
$hash_ref = { option1 => 'office', option2 => 34, option3 => 'fast' };
}
...
...
}
routine( [ 'one', 'two', 'three' ], { option1 =>, option2 => undef, option3 => 'fast' );
Well, I suppose there is some terminological confusion in the question that should be clarified.
A hash in Perl always has the same number of keys and values - because it's fundamentally an engine to store some values by their keys. I mean, key-value pair should be considered as a single element here. )
But I guess that's not what was asked really. ) I suppose the OP tried to build a hash from a list (not an array - the difference is subtle, but it's still there), and got the warning.
So the point is to check the number of elements in the list which will be assigned to a hash. It can be done as simple as ...
my #list = ( ... there goes a list ... );
print #list % 2; # 1 if the list had an odd number of elements, 0 otherwise
Notice that % operator imposes the scalar context on the list variable: it's simple and elegant. )
UPDATE as I see, the problem is slightly different. Ok, let's talk about the example given, simplifying it a bit.
my $anhash = {
option1 =>,
option2 => undef,
option3 => 'fast'
};
See, => is just a syntax sugar; this assignment could be easily rewritten as...
my $anhash = {
'option1', , 'option2', undef, 'option3', 'fast'
};
The point is that missing value after the first comma and undef are not the same, as lists (any lists) are flattened automatically in Perl. undef can be a normal element of any list, but empty space will be just ignored.
Take note the warning you care about (if use warnings is set) will be raised before your procedure is called, if it's called with an invalid hash wrapped in reference. So whoever caused this should deal with it by himself, looking at his own code: fail early, they say. )
You want to use named arguments, but set some default values for missing ones? Use this technique:
sub test_sub {
my ($args_ref) = #_;
my $default_args_ref = {
option1 => 'xxx',
option2 => 'yyy',
};
$args_ref = { %$default_args_ref, %$args_ref, };
}
Then your test_sub might be called like this...
test_sub { option1 => 'zzz' };
... or even ...
test_sub {};
The simple answer is: You get a warning about it:
Odd number of elements in hash assignment at...
Assuming you have not been foolish and turned warnings off.
The hard answer is, once assignment to the hash has been done (and warning issued), it is not odd anymore. So you can't.
my %hash = (1,2,3,4,5);
use Data::Dumper;
print Dumper \%hash;
$VAR1 = {
'1' => 2,
'3' => 4,
'5' => undef
};
As you can see, undef has been inserted in the empty spot. Now, you can check for undefined values and pretend that any existing undefined values constitutes an odd number of elements in the hash. However, should an undefined value be a valid value in your hash, you're in trouble.
perl -lwe '
sub isodd { my $count = #_ = grep defined, #_; return ($count % 2) };
%a=(a=>1,2);
print isodd(%a);'
Odd number of elements in hash assignment at -e line 1.
1
In this one-liner, the function isodd counts the defined arguments and returns whether the amount of arguments is odd or not. But as you can see, it still gives the warning.
You can use the __WARN__ signal to "trap" for when a hash assignment is incorrect.
use strict ;
use warnings ;
my $odd_hash_length = 0 ;
{
local $SIG{__WARN__} = sub {
my $msg = shift ;
if ($msg =~ m{Odd number of elements in hash assignment at}) {
$odd_hash_length = 1 ;
}
} ;
my %hash = (1, 2, 3, 4, 5) ;
}
# Now do what you want based on $odd_hash_length
if ($odd_hash_length) {
die "the hash had an odd hash length assignment...aborting\n" ;
} else {
print "the hash was initialized correctly\n";
}
See also How to capture and save warnings in Perl.

Perl hash value without key name

Is it possible in Perl to access a value of a hash, if it has just one key, without using key value?
Let's say, %h has just 'key_name' => 'value'.
Can I access the 'value' only via $h->{key_name}?
Or, is possible to access this 'value' without key name?
The values builtin function for hashes will return a list of all the hash values. You can use this to get or set any values with aliasing list constructs such as foreach, map, and grep:
for my $value (values %hash) {
say $value; # prints the value
$value++; # adds one to the value
}
Or you can store the values in an array:
my #vals = values %hash;
The order of the returned values is effectively random, but it will be the same order as the corresponding keys function.
Hashes themselves are lists, so you can access any odd element of the hash in list context to get at the value, but this method is less efficient since the whole hash needs to be taken apart to form the list, not just the values.
The techniques above work with hashes of any size. If you only have one key / value pair:
my %hash = qw(foo bar);
Then they reduce to:
{my ($x) = values %hash; say $x} # bar
{my (undef, $x) = %hash; say $x} # bar
{my $x = (values %hash)[0]; say $x} # bar
{my $x = (%hash)[1]; say $x} # bar
There are many ways to do this. For example:
my %h=("key_name"=>"value"); print values(%h)
or
my %h=("key_name"=>"value"); print( (%h)[1])
But in my opinion that doesn't look pretty...
You've got two options here - you can either optimize for space, or optimize for time. If you need to get the key from that value and you don't care about how long it takes, you can iterate over each entry in the associative array:
while(($key, $value) = each(%h))
{
if($value eq 'value')
{
return $key;
}
}
But if you don't mind having two copies, the most time-efficient solution is to hold a backwards and forwards associative array -- that is: %h_by_name and %h_by_value.
In this case, if you have multiple keys with the same value, your %h_by_value should contain an array. That is if:
%h_by_name = (
"a" => "1",
"b" => "1",
"c" => "1",
"d" => 2"
);
Then you would want to construct your %h_by_value such that it was:
%h_by_value = (
"1" => [ "a", "b", "c" ],
"2" => [ "d" ]
);