I have a hash of arrays (HoA). I have been processing the values of this HoA using $arrayrefs. However, now I need to retrieve the $key based on the $arrayrefs.
my %a = ( 1 => "ONE" ,
2 => "TWO" ,
3 => " Three", );
my %aa = ( 4 => [ 'ONE' , 'TWO', 'THREE'],
5 => ['one' , 'two', 'three'],
6 => ['more', 'dos', 'some'],
);
my #array = ('ONE' , 'TWO', 'THREE');
my $array_ref = \#array;
# returns the $key where the $value is 'ONE'
my ($any_match) = grep { $a{$_} eq 'ONE' } keys %a;
print $any_match."\n"; # this returns '1', as expected.. Good!
my ($match) = grep { $aa{$_} eq #$array_ref } keys %aa;
print $match."\n"; # <--- error: says that match is uninitialized
In the last print statement, I would like it to return 4. Does anyone know how to do this?
You can't compare arrays with eq. A simple solution is to turn both arrays into strings and comparing the strings using eq:
my ($match) = grep { join("", #{$aa{$_}}) eq join("", #$array_ref) } keys %aa;
For comparing arrays you could also utilize one of many modules from CPAN, e.g. Array::Compare, List::Compare, etc.
Always use strict; use warnings;. Add use v5.10; since Perl's (v5.10+) smart matching will be used to compare arrays. Do the following:
my ($match) = grep { #{$aa{$_}} ~~ #$array_ref } keys %aa;
The smart operator ~~ is used here to compare the arrays.
Related
I have an array of "words" (strings), which consist of letters from an "alphabet" with user-defined sequence. E.g my "alphabet" starts with "ʔ ʕ b g d", so a list of "words" (bʔd ʔbg ʕʔb bʕd) after sort by_my_alphabet should be ʔbd ʕʔb bʔd bʕd.
sort by_my_alphabet (bʔd ʔbg ʕʔb bʕd) # gives ʔbd ʕʔb bʔd bʕd
Is there a way to make a simple subroutine by_my_alphabet with $a and $b to solve this problem?
Simple, and very fast because it doesn't use a compare callback, but it needs to scan the entire string:
use utf8;
my #my_chr = split //, "ʔʕbgd";
my %my_ord = map { $my_chr[$_] => $_ } 0..$#my_chr;
my #sorted =
map { join '', #my_chr[ unpack 'W*', $_ ] } # "\x00\x01\x02\x03\x04" ⇒ "ʔʕbgd"
sort
map { pack 'W*', #my_ord{ split //, $_ } } # "ʔʕbgd" ⇒ "\x00\x01\x02\x03\x04"
#unsorted;
Optimized for long strings since it only scans a string up until a difference is found:
use utf8;
use List::Util qw( min );
my #my_chr = split //, "ʔʕbgd";
my %my_ord = map { $my_chr[$_] => $_ } 0..$#my_chr;
sub my_cmp($$) {
for ( 0 .. ( min map length($_), #_ ) - 1 ) {
my $cmp = $my_ord{substr($_[0], $_, 1)} <=> $my_ord{substr($_[1], $_, 1)};
return $cmp if $cmp;
}
return length($_[0]) <=> length($_[1]);
}
my #sorted = sort my_cmp #unsorted;
Both should be faster than Sobrique's. Theirs uses a compare callback, and it scans the entire strings being compared.
Yes.
sort can take any function that returns a relative sort position. All you need is a function that correctly looks up the 'sort value' of a string for comparing.
So all you need to do here is define a 'relative weight' of your extra letters, and then compare the two.
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my #sort_order = qw ( B C A D );
my #array_to_sort = qw ( A B C D A B C D AB BB CCC ABC );
my $count = 0;
my %position_of;
$position_of{$_} = $count++ for #sort_order;
print Dumper \%position_of;
sub sort_by_pos {
my #a = split //, $a;
my #b = split //, $b;
#iterate one letter at a time, using 'shift' to take it off the front
#of the array.
while ( #a and #b ) {
my $result = $position_of{shift #a} <=> $position_of{shift #b};
#result is 'true' if it's "-1" or "1" which indicates relative position.
# 0 is false, and that'll cause the next loop iteration to test the next
#letter-pair
return $result if $result;
}
#return a value based on remaining length - longest 'string' will sort last;
#That's so "AAA" comparing with "AA" comparison actually work,
return scalar #a <=> scalar #b;
}
my #new = sort { sort_by_pos } #array_to_sort;
print Dumper \#new;
Bit of a simple case, but it sorts our array into:
$VAR1 = [
'B',
'B',
'BB',
'C',
'C',
'CCC',
'A',
'A',
'AB',
'ABC',
'D',
'D'
];
I have a Perl script that parses an Excel file and does the following : It counts for each value in column A, the number of elements it has in column B, the script looks like this :
use strict;
use warnings;
use Spreadsheet::XLSX;
use Data::Dumper;
use List::Util qw( sum );
my $col1 = 0;
my %hash;
my $excel = Spreadsheet::XLSX->new('inout_chartdata_ronald.xlsx');
my $sheet = ${ $excel->{Worksheet} }[0];
$sheet->{MaxRow} ||= $sheet->{MinRow};
my $count = 0;
# Iterate through each row
foreach my $row ( $sheet->{MinRow}+1 .. $sheet->{MaxRow} ) {
# The cell in column 1
my $cell = $sheet->{Cells}[$row][$col1];
if ($cell) {
# The adjacent cell in column 2
my $adjacentCell = $sheet->{Cells}[$row][ $col1 + 1 ];
# Use a hash of hashes
$hash{ $cell->{Val} }{ $adjacentCell->{Val} }++;
}
}
print "\n", Dumper \%hash;
The output looks like this :
$VAR1 = {
'13' => {
'klm' => 1,
'hij' => 2,
'lkm' => 4,
},
'12' => {
'abc' => 2,
'efg' => 2
}
};
This works great, my question is : How can I access the elements of this output $VAR1 in order to do : for value 13, klm + hij = 3 and get a final output like this :
$VAR1 = {
'13' => {
'somename' => 3,
'lkm' => 4,
},
'12' => {
'abc' => 2,
'efg' => 2
}
};
So basically what I want to do is loop through my final hash of hashes and access its specific elements based on a unique key and finally do their sum.
Any help would be appreciated.
Thanks
I used #do_sum to indicate what changes you want to make. The new key is hardcoded in the script. Note that the new key is not created if no key exists in the subhash (the $found flag).
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
my %hash = (
'13' => {
'klm' => 1,
'hij' => 2,
'lkm' => 4,
},
'12' => {
'abc' => 2,
'efg' => 2
}
);
my #do_sum = qw(klm hij);
for my $num (keys %hash) {
my $found;
my $sum = 0;
for my $key (#do_sum) {
next unless exists $hash{$num}{$key};
$sum += $hash{$num}{$key};
delete $hash{$num}{$key};
$found = 1;
}
$hash{$num}{somename} = $sum if $found;
}
print Dumper \%hash;
It sounds like you need to learn about Perl References, and maybe Perl Objects which are just a nice way to deal with references.
As you know, Perl has three basic data-structures:
Scalars ($foo)
Arrays (#foo)
Hashes (%foo)
The problem is that these data structures can only contain scalar data. That is, each element in an array can hold a single value or each key in a hash can hold a single value.
In your case %hash is a Hash where each entry in the hash references another hash. For example:
Your %hash has an entry in it with a key of 13. This doesn't contain a scalar value, but a references to another hash with three keys in it: klm, hij, and lkm. YOu can reference this via this syntax:
${ hash{13} }{klm} = 1
${ hash{13} }{hij} = 2
${ hash{13} }{lkm} = 4
The curly braces may or may not be necessary. However, %{ hash{13} } references that hash contained in $hash{13}, so I can now reference the keys of that hash. You can imagine this getting more complex as you talk about hashes of hashes of arrays of hashes of arrays. Fortunately, Perl includes an easier syntax:
$hash{13}->{klm} = 1
%hash{13}->{hij} = 2
%hash{13}->{lkm} = 4
Read up about hashes and how to manipulate them. After you get comfortable with this, you can start working on learning about Object Oriented Perl which handles references in a safer manner.
I would like to make the value the key, and the key the value. What is the best way to go about doing this?
Adapted from http://www.dreamincode.net/forums/topic/46400-swap-hash-values/:
Assuming your hash is stored in $hash:
while (($key, $value) = each %hash) {
$hash2{$value}=$key;
}
%hash=%hash2;
Seems like much more elegant solution can be achieved with reverse (http://www.misc-perl-info.com/perl-hashes.html#reverseph):
%nhash = reverse %hash;
Note that with reverse, duplicate values will be overwritten.
Use reverse:
use Data::Dumper;
my %hash = ('month', 'may', 'year', '2011');
print Dumper \%hash;
%hash = reverse %hash;
print Dumper \%hash;
As mentioned, the simplest is
my %inverse = reverse %original;
It "fails" if multiple elements have the same value. You could create an HoA to handle that situation.
my %inverse;
push #{ $inverse{ $original{$_} } }, $_ for keys %original;
So you want reverse keys & vals in a hash? So use reverse... ;)
%hash2 = reverse %hash;
reverting (k1 => v1, k2 => v2) - yield (v2=>k2, v1=>k1) - and that is what you want. ;)
my %orig_hash = (...);
my %new_hash;
%new_hash = map { $orig_hash{$_} => $_ } keys(%orig_hash);
The map-over-keys solution is more flexible. What if your value is not a simple value?
my %forward;
my %reverse;
#forward is built such that each key maps to a value that is a hash ref:
#{ a => 'something', b=> 'something else'}
%reverse = map { join(',', #{$_}{qw(a b)}) => $_ } keys %forward;
Here is a way to do it using Hash::MultiValue.
use experimental qw(postderef);
sub invert {
use Hash::MultiValue;
my $mvh = Hash::MultiValue->from_mixed(shift);
my $inverted;
$mvh->each( sub { push $inverted->{ $_[1] }->#* , $_[0] } ) ;
return $inverted;
}
To test this we can try the following:
my %test_hash = (
q => [qw/1 2 3 4/],
w => [qw/4 6 5 7/],
e => ["8"],
r => ["9"],
t => ["10"],
y => ["11"],
);
my $wow = invert(\%test_hash);
my $wow2 = invert($wow);
use DDP;
print "\n \%test_hash:\n\n" ;
p %test_hash;
print "\n \%test_hash inverted as:\n\n" ;
p $wow ;
# We need to sort the contents of the multi-value array reference
# for the is_deeply() comparison:
map {
$test_hash{$_} = [ sort { $a cmp $b || $a <=> $b } #{ $test_hash{$_} } ]
} keys %test_hash ;
map {
$wow2->{$_} = [ sort { $a cmp $b || $a <=> $b } #{ $wow2->{$_} } ]
} keys %$wow2 ;
use Test::More ;
is_deeply(\%test_hash, $wow2, "double inverted hash == original");
done_testing;
Addendum
Note that in order to pass the gimmicky test here, the invert() function relies on %test_hash having array references as values. To work around this if your hash values are not array references, you can "coerce" the regular/mixed hash into a multi-value hash thatHash::MultiValue can then bless into an object. However, this approach means even single values will appear as array references:
for ( keys %test_hash ) {
if ( ref $test_hash{$_} ne 'ARRAY' ) {
$test_hash{$_} = [ $test_hash{$_} ]
}
}
which is longhand for:
ref($_) or $_ = [ $_ ] for values %test_hash ;
This would only be needed to get the "round trip" test to pass.
Assuming all your values are simple and unique strings, here is one more easy way to do it.
%hash = ( ... );
#newhash{values %hash} = (keys %hash);
This is called a hash slice. Since you're using %newhash to produce a list of keys, you change the % to a #.
Unlike the reverse() method, this will insert the new keys and values in the same order as they were in the original hash. keys and values always return their values in the same order (as does each).
If you need more control over it, like sorting it so that duplicate values get the desired key, use two hash slices.
%hash = ( ... );
#newhash{ #hash{sort keys %hash} } = (sort keys %hash);
What is the best practise to solve this?
if (... )
{
push (#{$hash{'key'}}, #array ) ;
}
else
{
$hash{'key'} ="";
}
Is that bad practise for storing one element is array or one is just double quote in hash?
I'm not sure I understand your question, but I'll answer it literally as asked for now...
my #array = (1, 2, 3, 4);
my $arrayRef = \#array; # alternatively: my $arrayRef = [1, 2, 3, 4];
my %hash;
$hash{'key'} = $arrayRef; # or again: $hash{'key'} = [1, 2, 3, 4]; or $hash{'key'} = \#array;
The crux of the problem is that arrays or hashes take scalar values... so you need to take a reference to your array or hash and use that as the value.
See perlref and perlreftut for more information.
EDIT: Yes, you can add empty strings as values for some keys and references (to arrays or hashes, or even scalars, typeglobs/filehandles, or other scalars. Either way) for other keys. They're all still scalars.
You'll want to look at the ref function for figuring out how to disambiguate between the reference types and normal scalars.
It's probably simpler to use explicit array references:
my $arr_ref = \#array;
$hash{'key'} = $arr_ref;
Actually, doing the above and using push result in the same data structure:
my #array = qw/ one two three four five /;
my $arr_ref = \#array;
my %hash;
my %hash2;
$hash{'key'} = $arr_ref;
print Dumper \%hash;
push #{$hash2{'key'}}, #array;
print Dumper \%hash2;
This gives:
$VAR1 = {
'key' => [
'one',
'two',
'three',
'four',
'five'
]
};
$VAR1 = {
'key' => [
'one',
'two',
'three',
'four',
'five'
]
};
Using explicit array references uses fewer characters and is easier to read than the push #{$hash{'key'}}, #array construct, IMO.
Edit: For your else{} block, it's probably less than ideal to assign an empty string. It would be a lot easier to just skip the if-else construct and, later on when you're accessing values in the hash, to do a if( defined( $hash{'key'} ) ) check. That's a lot closer to standard Perl idiom, and you don't waste memory storing empty strings in your hash.
Instead, you'll have to use ref() to find out what kind of data you have in your value, and that is less clear than just doing a defined-ness check.
I'm not sure what your goal is, but there are several things to consider.
First, if you are going to store an array, do you want to store a reference to the original value or a copy of the original values? In either case, I prefer to avoid the dereferencing syntax and take references when I can:
$hash{key} = \#array; # just a reference
use Clone; # or a similar module
$hash{key} = clone( \#array );
Next, do you want to add to the values that exist already, even if it's a single value? If you are going to have array values, I'd make all the values arrays even if you have a single element. Then you don't have to decide what to do and you remove a special case:
$hash{key} = [] unless defined $hash{key};
push #{ $hash{key} }, #values;
That might be your "best practice" answer, which is often the technique that removes as many special cases and extra logic as possible. When I do this sort of thing in a module, I typically have a add_value method that encapsulates this magic where I don't have to see it or type it more than once.
If you already have a non-reference value in the hash key, that's easy to fix too:
if( defined $hash{key} and ! ref $hash{key} ) {
$hash{key} = [ $hash{key} ];
}
If you already have non-array reference values that you want to be in the array, you do something similar. Maybe you want an anonymous hash to be one of the array elements:
if( defined $hash{key} and ref $hash{key} eq ref {} ) {
$hash{key} = [ $hash{key} ];
}
Dealing with the revised notation:
if (... )
{
push (#{$hash{'key'}}, #array);
}
else
{
$hash{'key'} = "";
}
we can immediately tell that you are not following the standard advice that protects novices (and experts!) from their own mistakes. You're using a symbolic reference, which is not a good idea.
use strict;
use warnings;
my %hash = ( key => "value" );
my #array = ( 1, "abc", 2 );
my #value = ( 22, 23, 24 );
push(#{$hash{'key'}}, #array);
foreach my $key (sort keys %hash) { print "$key = $hash{$key}\n"; }
foreach my $value (#array) { print "array $value\n"; }
foreach my $value (#value) { print "value $value\n"; }
This does not run:
Can't use string ("value") as an ARRAY ref while "strict refs" in use at xx.pl line 8.
I'm not sure I can work out what you were trying to achieve. Even if you remove the 'use strict;' warning, the code shown does not detect a change from the push operation.
use warnings;
my %hash = ( key => "value" );
my #array = ( 1, "abc", 2 );
my #value = ( 22, 23, 24 );
push #{$hash{'key'}}, #array;
foreach my $key (sort keys %hash) { print "$key = $hash{$key}\n"; }
foreach my $value (#array) { print "array $value\n"; }
foreach my $value (#value) { print "value $value\n"; }
foreach my $value (#{$hash{'key'}}) { print "h_key $value\n"; }
push #value, #array;
foreach my $key (sort keys %hash) { print "$key = $hash{$key}\n"; }
foreach my $value (#array) { print "array $value\n"; }
foreach my $value (#value) { print "value $value\n"; }
Output:
key = value
array 1
array abc
array 2
value 22
value 23
value 24
h_key 1
h_key abc
h_key 2
key = value
array 1
array abc
array 2
value 22
value 23
value 24
value 1
value abc
value 2
I'm not sure what is going on there.
If your problem is how do you replace a empty string value you had stored before with an array onto which you can push your values, this might be the best way to do it:
if ( ... ) {
my $r = \$hash{ $key }; # $hash{ $key } autoviv-ed
$$r = [] unless ref $$r;
push #$$r, #values;
}
else {
$hash{ $key } = "";
}
I avoid multiple hash look-ups by saving a copy of the auto-vivified slot.
Note the code relies on a scalar or an array being the entire universe of things stored in %hash.
If I have a hash in Perl that contains complete and sequential integer mappings (ie, all keys from from 0 to n are mapped to something, no keys outside of this), is there a means of converting this to an Array?
I know I could iterate over the key/value pairs and place them into a new array, but something tells me there should be a built-in means of doing this.
You can extract all the values from a hash with the values function:
my #vals = values %hash;
If you want them in a particular order, then you can put the keys in the desired order and then take a hash slice from that:
my #sorted_vals = #hash{sort { $a <=> $b } keys %hash};
If your original data source is a hash:
# first find the max key value, if you don't already know it:
use List::Util 'max';
my $maxkey = max keys %hash;
# get all the values, in order
my #array = #hash{0 .. $maxkey};
Or if your original data source is a hashref:
my $maxkey = max keys %$hashref;
my #array = #{$hashref}{0 .. $maxkey};
This is easy to test using this example:
my %hash;
#hash{0 .. 9} = ('a' .. 'j');
# insert code from above, and then print the result...
use Data::Dumper;
print Dumper(\%hash);
print Dumper(\#array);
$VAR1 = {
'6' => 'g',
'3' => 'd',
'7' => 'h',
'9' => 'j',
'2' => 'c',
'8' => 'i',
'1' => 'b',
'4' => 'e',
'0' => 'a',
'5' => 'f'
};
$VAR1 = [
'a',
'b',
'c',
'd',
'e',
'f',
'g',
'h',
'i',
'j'
];
OK, this is not very "built in" but works. It's also IMHO preferrable to any solution involving "sort" as it's faster.
map { $array[$_] = $hash{$_} } keys %hash; # Or use foreach instead of map
Otherwise, less efficient:
my #array = map { $hash{$_} } sort { $a<=>$b } keys %hash;
Perl does not provide a built-in to solve your problem.
If you know that the keys cover a particular range 0..N, you can leverage that fact:
my $n = keys(%hash) - 1;
my #keys_and_values = map { $_ => $hash{$_} } 0 .. $n;
my #just_values = #hash{0 .. $n};
This will leave keys not defined in %hashed_keys as undef:
# if we're being nitpicky about when and how much memory
# is allocated for the array (for run-time optimization):
my #keys_arr = (undef) x scalar %hashed_keys;
#keys_arr[(keys %hashed_keys)] =
#hashed_keys{(keys %hashed_keys)};
And, if you're using references:
#{$keys_arr}[(keys %{$hashed_keys})] =
#{$hashed_keys}{(keys %{$hashed_keys})};
Or, more dangerously, as it assumes what you said is true (it may not always be true … Just sayin'!):
#keys_arr = #hashed_keys{(sort {$a <=> $b} keys %hashed_keys)};
But this is sort of beside the point. If they were integer-indexed to begin with, why are they in a hash now?
As DVK said, there is no built in way, but this will do the trick:
my #array = map {$hash{$_}} sort {$a <=> $b} keys %hash;
or this:
my #array;
keys %hash;
while (my ($k, $v) = each %hash) {
$array[$k] = $v
}
benchmark to see which is faster, my guess would be the second.
#a = #h{sort { $a <=> $b } keys %h};
Combining FM's and Ether's answers allows one to avoid defining an otherwise unnecessary scalar:
my #array = #hash{ 0 .. $#{[ keys %hash ]} };
The neat thing is that unlike with the scalar approach, $# works above even in the unlikely event that the default index of the first element, $[, is non-zero.
Of course, that would mean writing something silly and obfuscated like so:
my #array = #hash{ $[ .. $#{[ keys %hash ]} }; # Not recommended
But then there is always the remote chance that someone needs it somewhere (wince)...
$Hash_value =
{
'54' => 'abc',
'55' => 'def',
'56' => 'test',
};
while (my ($key,$value) = each %{$Hash_value})
{
print "\n $key > $value";
}
We can write a while as below:
$j =0;
while(($a1,$b1)=each(%hash1)){
$arr[$j][0] = $a1;
($arr[$j][1],$arr[$j][2],$arr[$j][3],$arr[$j][4],$arr[$j][5],$arr[$j][6]) = values($b1);
$j++;
}
$a1 contains the key and
$b1 contains the values
In the above example i have Hash of array and the array contains 6 elements.
An easy way is to do #array = %hash
For example,
my %hash = (
"0" => "zero",
"1" => "one",
"2" => "two",
"3" => "three",
"4" => "four",
"5" => "five",
"6" => "six",
"7" => "seven",
"8" => "eight",
"9" => "nine",
"10" => "ten",
);
my #array = %hash;
print "#array"; would produce the following output,
3 three 9 nine 5 five 8 eight 2 two 4 four 1 one 10 ten 7 seven 0 zero
6 six