Following on from a similar question I asked (Change first key of multi-dimensional Hash in perl), I have a multi-dimensional hash in Perl and would like to change MULTIPLE first keys to a chosen value. For example, I have the hash
my %Hash1;
$Hash1{1}{12}=1;
$Hash1{1}{10}=1;
$Hash1{2}{31}=1;
$Hash1{3}{52}=1;
$Hash1{3}{58}=1;
$Hash1{4}{82}=1;
$Hash1{4}{154}=1;
Now I want to replace the values 3 and 4 in the first key with the value 300. After this I would get:
$Hash1{1}{12}=1;
$Hash1{1}{10}=1;
$Hash1{2}{31}=1;
$Hash1{300}{52}=1;
$Hash1{300}{58}=1;
$Hash1{300}{82}=1;
$Hash1{300}{154}=1;
I know I could create a new hash by scanning the original hash and doing the following:
my %Hash2;
foreach my $key1 (sort keys %Hash1) {
foreach my $key2 (keys %{ $Hash1{$key1} }) {
if($key1==3 || $key1==4){
$Hash2{300}{$key2}=1;
} else {
$Hash2{$key1}{$key2}=1;
}
}
}
But is there a quicker way?
$Hash1{300} = {%{$Hash1{3}},%{$Hash1{4}}};
delete $Hash1{3};
delete $Hash1{4};
If you need to replace many keys, the following function may help.
Call it as replace_first_keys( \%Hash1, [ 3, 4 ], 300 ); the three parameters are a reference to the hash to modify, a reference to an array of keys to replace, and the replacement key.
use List::Util;
# REPLACE FIRST KEYS OF $hash LISTED IN @$replace WITH THE KEY $replacement
sub replace_first_keys {
my ( $hash, $replace, $replacement ) = @_;
unshift @$replace, $replacement if exists $hash->{$replacement};
$hash->{$replacement} = {
map { %{ delete $hash->{$_} } }
grep { ( exists $hash->{$_} ) && ( ref $hash->{$_} eq 'HASH' ) }
( List::Util::uniq @$replace )
};
$hash;
}
It also tries to handle the following situations sensibly:
replace_first_keys( \%Hash1, [ 3, 4 ], 2 ); (replacement key exists, older values are overwritten on conflict)
replace_first_keys( \%Hash1, [ 3, 4 ], 4 ); (replacement also present in replace list)
Use this script if you wish to test this.
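For a quick check without the linked script, here is a minimal sketch of the same merge written as a plain loop. The helper name merge_first_keys is hypothetical (it is not the function from the answer above), and the sample data is the %Hash1 from the question.

```perl
use strict;
use warnings;

# Sample data from the question.
my %Hash1;
$Hash1{1}{$_} = 1 for 12, 10;
$Hash1{2}{31} = 1;
$Hash1{3}{$_} = 1 for 52, 58;
$Hash1{4}{$_} = 1 for 82, 154;

# Hypothetical helper: move the inner hashes stored under the keys in
# @$old so that they all live under the single first key $new.
sub merge_first_keys {
    my ( $hash, $old, $new ) = @_;
    for my $key (@$old) {
        next if $key eq $new;                          # nothing to move
        next unless ref( $hash->{$key} ) eq 'HASH';    # skip missing or non-hash entries
        my $inner = delete $hash->{$key};
        # A hash slice copies every inner key/value pair in one statement.
        @{ $hash->{$new} }{ keys %$inner } = values %$inner;
    }
    return $hash;
}

merge_first_keys( \%Hash1, [ 3, 4 ], 300 );

for my $key1 ( sort { $a <=> $b } keys %Hash1 ) {
    my @inner = sort { $a <=> $b } keys %{ $Hash1{$key1} };
    print "$key1: @inner\n";
}
# 1: 10 12
# 2: 31
# 300: 52 58 82 154
```

As with the function above, values already stored under the replacement key are overwritten when an inner key collides.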
I have a code block that I use many times with slight variations, and I am trying to turn it into a subroutine.
This code block completes configuration templates (router interface, VRF, other network settings).
It does so by looking up data in a hash data structure (called %config_hash) that is built by ingesting an Excel file. The data being looked up lives in different areas of the hash for different templates.
An example of the current working code is this:
my @temp_source_template = @{ clone ($source_template{$switch_int_template}) };
my %regex_replacements=(); ## hash for holding regex search and replace values, keys are !name! (look in template files) values taken from DCAP
my @regex_key =(); ## temp array used when more than one !name! is on a line
my $find_string='';
foreach my $line (@temp_source_template){
my (@regex_key) = ( $line =~ /(\!.*?\!)/g ); ## match needs to be non greedy thus .*? not .*
foreach my $hash_refs (@regex_key){
my $lookup = $hash_refs =~ s/!//gri; ## remove ! from !name! so lookup can be done in DCAP file hash
my $excel_lookup = $lookup =~ s/_/ /gri;
$regex_replacements{$hash_refs} = $config_hash{'Vlan'}{$inner}{$excel_lookup}; ## look up DCAP file hash and write value to regex hash
if (undef eq $regex_replacements{$hash_refs}){
$regex_replacements{$hash_refs} = $config_hash{'Switch'}{$outer}{$excel_lookup};
}
if (undef eq $regex_replacements{$hash_refs}){
$regex_replacements{$hash_refs} = $config_hash{'VRF'}{$middle}{$excel_lookup};
}
$find_string= $find_string . $hash_refs . '|' ;
}
}
So this creates a hash (regex_replacements) that contains values to look up (hash keys in regex_replacements) and values to replace those with (values in regex_replacements). It also builds a string to be used in a regular expression ($find_string). Different templates will have different hash lookup "paths" (e.g. $config_hash{'Switch'}{$outer}{$excel_lookup}) or use them in different orders (effectively a most specific match).
For completeness, here is the code block that does the regex replacements:
foreach my $line (@temp_source_template){
my (@line_array) = split /(![A-Za-z_]*!)/, $line;
foreach my $chunk (@line_array){
my $temp_chunk = $chunk;
$chunk =~ s/($find_string)/$regex_replacements{$1}/gi;
if (!($chunk)){
$chunk = $temp_chunk;
}
}
$line = join ("", @line_array);
if ($line =~ /\!.*\!/){
print {$log} " ERROR line has unmatched variables deleting line \"$line\"\n";
$line ="";
}
}
So I did some searching and I found this:
Perl: How to turn array into nested hash keys
Which is almost exactly what I want, but I can't get it to work because my variable reference is a hash and its hash variable reference is just "REF", so I get errors for trying to use a hash as a reference.
So I won't post what I have tried, as I don't really understand the magic of that link.
But what I am doing is passing the following to the sub:
my @temp_source_template = @{ clone ($source_template{$test}) };
my @search_array = ( ['VRF' ,$middle] , ['Switch' ,$outer]);
my ($find_string, $completed_template) = dynamic_regex_replace_fine_string_gen(\%source_config,\@temp_source_template, \@search_array);
and I want the sub to return $find_string and the regex_replacements hash ref. Note that inside the sub I need to append the value of $excel_lookup to the end of each element of @search_array.
The bit that I don't understand is how to build the variable-depth hash lookup.
You could try Data::Diver; it provides simple access to elements of deeply nested structures.
For example:
use feature qw(say);
use strict;
use warnings;
use Data::Diver qw(Dive);
my $hash = { a => { b => 1, c => { d => 2 }}};
my $keys = [[ 'a', 'b'], ['a','c','d']];
lookup_keys( $hash, $keys );
sub lookup_keys {
my ( $hash, $keys ) = @_;
for my $key ( @$keys ) {
my $value = Dive( $hash, @$key );
say $value;
}
}
Output:
1
2
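If installing a CPAN module is not an option, the same kind of lookup can be sketched in plain Perl. The helpers dive and first_match below are hypothetical names, and the sample %config_hash contents are made up to match the shape described in the question: each search path gets the final key appended, and the first path that yields a defined value wins.

```perl
use strict;
use warnings;

# Hypothetical helper: follow @path through nested hash refs and return
# undef as soon as a level is missing (nothing is autovivified).
sub dive {
    my ( $data, @path ) = @_;
    for my $key (@path) {
        return undef unless ref($data) eq 'HASH' && exists $data->{$key};
        $data = $data->{$key};
    }
    return $data;
}

# Hypothetical helper: try each search path in order with $leaf appended
# and return the first defined value (most specific path listed first).
sub first_match {
    my ( $data, $search, $leaf ) = @_;
    for my $path (@$search) {
        my $value = dive( $data, @$path, $leaf );
        return $value if defined $value;
    }
    return undef;
}

# Made-up sample data in the shape the question describes.
my %config_hash = (
    VRF    => { red => { 'VRF Name'    => 'RED',         'Description' => 'from VRF' } },
    Switch => { sw1 => { 'Description' => 'from Switch', 'Mgmt IP'     => '10.0.0.1' } },
);
my @search_array = ( [ 'VRF', 'red' ], [ 'Switch', 'sw1' ] );

print first_match( \%config_hash, \@search_array, 'Description' ), "\n";    # from VRF
print first_match( \%config_hash, \@search_array, 'Mgmt IP' ),     "\n";    # 10.0.0.1
```

Changing the order of the entries in @search_array changes which area of the hash takes priority.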
See Also:
Creating hash of hash dynamically in perl
Read config hash-like data into perl hash
Is there an efficient way to see, if a hash key assignment resulted in adding a new item or in modifying an existing one? Something similar in behavior to the add function in this Bloom's filter implementation.
In the construct below two lookups are performed: once explicitly with exists and another time implicitly during the assignment. The first lookup is thus logically redundant.
my %hash;
my $key;
...
my $existed = exists $hash{$key};
$hash{$key} = 1;
By "item", I think you mean "key".
If the value is meaningless, you can use the following:
my $dup = $hash{$key}++;
If the value is meaningful, you can use the following:
my $dup = exists($hash{$key});
$hash{$key} = $val;
If the value is meaningful but always defined, you can use the following:
my $ref = \$hash{$key};
my $dup = defined($$ref);
$$ref = $val;
By the way, the first snippet can easily be extended to filter out duplicates from a list.
my %seen;
my @unique = grep !$seen{$_}++, @list;
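The reference-based variant can be wrapped in a small helper that behaves like an add function. This is only a sketch: the name add is hypothetical, and it assumes stored values are always defined, as stated for that variant.

```perl
use strict;
use warnings;

# Hypothetical helper: store $val under $key and report whether the key
# already held a defined value. The hash element is looked up only once.
sub add {
    my ( $hash, $key, $val ) = @_;
    my $ref = \$hash->{$key};    # reference to the element; creates it if missing
    my $dup = defined $$ref;
    $$ref = $val;
    return $dup ? 1 : 0;
}

my %hash;
print add( \%hash, 'x', 10 ), "\n";    # 0: a new key was added
print add( \%hash, 'x', 20 ), "\n";    # 1: an existing key was modified
print "$hash{x}\n";                    # 20
```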
I am trying to use an existing Perl program, which includes the following function of GetItems. The way to call this function is listed in the following.
I have several questions for this program:
What does foreach my $ref (@_) aim to do? I think @_ should be related to the parameters passed, but not quite sure.
In my @items = sort { $a <=> $b } keys %items; the "items" on the left side should be different from the "items" on the right side? Why do they use the same name?
What does $items{$items[$i]} = $i + 1; aim to do? Looks like it just sets up the value for the hash $items sequentially.
$items = GetItems($classes, $pVectors, $nVectors, $uVectors);
######################################
sub GetItems
######################################
{
my $classes = shift;
my %items = ();
foreach my $ref (@_)
{
foreach my $id (keys %$ref)
{
foreach my $cui (keys %{$ref->{$id}}) { $items{$cui} = 1 }
}
}
my @items = sort { $a <=> $b } keys %items;
open(VAL, "> $classes.items");
for my $i (0 .. $#items)
{
print VAL "$items[$i]\n";
$items{$items[$i]} = $i + 1;
}
close VAL;
return \%items;
}
When you enter a function, @_ starts out as an array of (aliases to) all the parameters passed into the function; but the my $classes = shift removes the first element of @_ and stores it in the variable $classes, so the foreach my $ref (@_) iterates over all the remaining parameters, storing (aliases to) them one at a time in $ref.
Scalars, hashes, and arrays are all distinguished by the syntax, so they're allowed to have the same name. You can have a $foo, a @foo, and a %foo all at the same time, and they don't have to have any relationship to each other. (This, together with the fact that $foo[0] refers to @foo and $foo{'a'} refers to %foo, causes a lot of confusion for newcomers to the language; you're not alone.)
Exactly. It sets each element of %items to a distinct integer ranging from one to the number of elements, proceeding in numeric (!) order by key.
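The first two points can be tried out with a short sketch. The variable names and the show_rest helper are made up for illustration.

```perl
use strict;
use warnings;

# Three unrelated variables that happen to share the name "foo".
my $foo = 'scalar';
my @foo = ( 'first', 'second' );
my %foo = ( a => 'hash value' );

print "$foo\n";       # the scalar $foo
print "$foo[0]\n";    # element 0 of the array @foo
print "$foo{a}\n";    # value for key 'a' in the hash %foo

# shift removes the first argument, so the loop sees only the rest.
sub show_rest {
    my $first = shift;
    my @rest;
    push @rest, $_ for @_;
    return "$first: @rest";
}

print show_rest( 'classes', 'p', 'n', 'u' ), "\n";    # classes: p n u
```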
foreach my $ref (@_) loops through each hash reference passed as a parameter to GetItems. If the call looks like this:
$items = GetItems($classes, $pVectors, $nVectors, $uVectors);
then the loop processes the hash refs in $pVectors, $nVectors, and $uVectors.
@items and %items are COMPLETELY DIFFERENT VARIABLES!! @items is an array variable and %items is a hash variable.
$items{$items[$i]} = $i + 1 does exactly as you say. It sets the value of the %items hash whose key is $items[$i] to $i+1.
Here is a (nearly) line-by-line description of what is happening in the subroutine.
Define a sub named GetItems.
sub GetItems {
Store the first value in the default array @_, and remove it from the array.
my $classes = shift;
Create a new hash named %items.
my %items;
Loop over the remaining values given to the subroutine, setting $ref to the value on each iteration.
for my $ref (@_){
This code assumes that the previous line set $ref to a hash ref. It loops over the unsorted keys of the hash referenced by $ref, storing the key in $id.
for my $id (keys %$ref){
Using the key ($id) given by the previous line, loop over the keys of the hash ref at that position in $ref, storing each key in $cui.
for my $cui (keys %{$ref->{$id}}) {
Set the value of %items at key $cui to 1.
$items{$cui} = 1;
End of the loops on the previous lines.
}
}
}
Store a list of the keys of %items, sorted by numeric value, in @items.
my @items = sort { $a <=> $b } keys %items;
Open the file named by $classes with .items appended to it. This uses the old-style two arg form of open. It also ignores the return value of open, so it continues on to the next line even on error. It stores the file handle in the global *VAL{IO}.
open(VAL, "> $classes.items");
Loop over a list of indexes of @items.
for my $i (0 .. $#items){
Print the value at that index on its own line to *VAL{IO}.
print VAL "$items[$i]\n";
Use that same value as a key into %items (it is one of its keys), and set the corresponding value to the index plus one.
$items{$items[$i]} = $i + 1;
End of loop.
}
Close the file handle *VAL{IO}.
close VAL;
Return a reference to the hash %items.
return \%items;
End of subroutine.
}
I have several questions for this program:
What does foreach my $ref (@_) aim to do? I think @_ should be related to the parameters passed, but not quite sure.
Yes, you are correct. When you pass parameters into a subroutine, they are automatically placed in the @_ array. The foreach my $ref (@_) begins a loop. This loop will be repeated for each item in the @_ array, and each time, $ref will be set to the next item in the array. See Perldoc's perlsyn (Perl Syntax) section about for loops and foreach loops. Also look at the General Variables section of Perldoc's perlvar (Perl Variables) for information about special variables like @_.
Now, the line my $classes = shift; removes the first item from @_ and puts it into the variable $classes. Thus, the foreach loop will be repeated three times: $ref will be set first to the value of $pVectors, then $nVectors, and finally $uVectors.
By the way, these aren't ordinary scalar values. In Perl, you can have what is called a reference, which points to the memory location of the data structure you're referencing. For example, I have five students, and each student has a series of tests they've taken. I want to store all the values of each test in a hash keyed by the student's ID.
Normally, each entry in the hash can only contain a single item. However, what if this item refers to an array that contains the student's grades?
Here's the list of student #100's grades:
@grades = (100, 93, 89, 95, 74);
And here's how I set Student 100's entry in my hash:
$student{100} = \@grades;
Now, I can talk about the first grade of the year for Student #100 as $student{100}[0]. See perlreftut, Mark's very short tutorial about references, in the Perldoc.
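The student example can be put together as a small runnable sketch. The extra grade pushed at the end is made up, to show that the hash entry and the array share the same data.

```perl
use strict;
use warnings;

my %student;
my @grades = ( 100, 93, 89, 95, 74 );

# Store a reference to the array, not a copy of it.
$student{100} = \@grades;

print $student{100}[0], "\n";             # 100 (the first grade)
print scalar @{ $student{100} }, "\n";    # 5 (number of grades)

# Because only a reference was stored, a change to @grades is visible
# through the hash entry as well.
push @grades, 88;
print $student{100}[-1], "\n";            # 88
```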
In my @items = sort { $a <=> $b } keys %items; the "items" on the left side should be different from the "items" on the right side? Why do they use the same name?
In Perl, you have three major types of variables: arrays, hashes (also called associative arrays), and scalars. In Perl, it is perfectly legal for variables of different types to have the same name. Thus, you can have $var, %var, and @var in your program, and they'll be treated as completely separate variables¹.
This is usually a bad thing to do and is highly discouraged. It gets worse when you think of the individual values: $var refers to the scalar while $var[3] refers to the array, and $var{3} refers to the hash. Yes, it can be very, very confusing.
In this particular case, he has a hash called %items, and he's converting the keys of this hash into an array sorted in numeric order:
my @items = sort { $a <=> $b } keys %items;
Note that this is not the same as the shorter form:
my @items = sort keys %items;
The shorter form sorts the keys as strings, so 10 would come before 9.
See the Perldocs on the sort function and the keys function.
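To see how the numeric comparison block and the default sort treat numeric keys, here is a small sketch with made-up values:

```perl
use strict;
use warnings;

my @keys = ( 10, 9, 100, 2 );

# Numeric comparison, as used in GetItems.
my @numeric = sort { $a <=> $b } @keys;

# Default sort compares the values as strings.
my @string = sort @keys;

print "@numeric\n";    # 2 9 10 100
print "@string\n";     # 10 100 2 9
```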
What does $items{$items[$i]} = $i + 1; aim to do? Looks like it just sets up the value for the hash $items sequentially.
Let's look at the entire loop:
foreach my $i (0 .. $#items)
{
print VAL "$items[$i]\n";
$items{$items[$i]} = $i + 1;
}
The subroutine goes through this loop once for each item in the @items array. This is the sorted list of keys of the %items hash. The $#items means the largest index in the @items array. For example, if @items = ("foo", "bar", "foobar"), then $#items would be 2 because the last item in this array is $items[2], which equals foobar.
This way, he's hitting the index of each entry in @items. (REMEMBER: This is different from %items!).
The next line is a bit tricky:
$items{$items[$i]} = $i + 1;
Remember that $items{...} refers to the %items hash, not the @items array! He's assigning new values in the %items hash. Each item in the @items array is used as a key, and the value is set to the index of that item plus 1. Let's assume that:
@items = ("foo", "bar", "foobar")
In the end, he's doing this:
$items{foo} = 1;
$items{bar} = 2;
$items{foobar} = 3;
¹ Well, this isn't 100% true. Perl stores each variable in a kind of hash structure. In memory, $var, @var, and %var will be stored in the same hash entry in memory, but in positions related to each variable type. 99.9999% of the time, this matters not one bit. As far as you are concerned, these are three completely different variables.
However, there are a few rare occasions where a programmer will take advantage of this when they futz directly with memory in Perl.
I want to show you how I would write that subroutine.
But first, I want to show you some of the steps of how, and why, I changed the code.
Reduce the number of for loops:
First off this loop doesn't need to set the value of $items{$cui} to anything in particular. It also doesn't have to be a loop at all.
foreach my $cui (keys %{$ref->{$id}}) { $items{$cui} = 1 }
This does practically the same thing. The only real difference is it sets them all to undef instead.
@items{ keys %{$ref->{$id}} } = ();
If you really need to set the values to 1, you can use the repetition operator. Note that (1) x @keys returns a list of 1's with the same number of elements as @keys.
my @keys = keys %{$ref->{$id}};
@items{ @keys } = (1) x @keys;
If you are going to have to loop over a very large number of elements then a for loop may be a good idea, but only if you have to set the value to something other than undef. Since we are only using the loop variable once, to do something simple; I would use this code:
$items{$_} = 1 for keys %{$ref->{$id}};
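The hash slice variants can be compared in a short sketch. The %inner hash is made-up data standing in for %{$ref->{$id}}.

```perl
use strict;
use warnings;

my %inner = ( 7 => 1, 3 => 1, 12 => 1 );    # stand-in for %{ $ref->{$id} }
my %items;

# Slice assignment from an empty list: the keys are created, the values are undef.
@items{ keys %inner } = ();
print join( ',', sort { $a <=> $b } keys %items ), "\n";    # 3,7,12
print defined( $items{7} ) ? "defined" : "undef", "\n";     # undef

# The repetition operator supplies one 1 per key.
my @keys = keys %inner;
@items{@keys} = (1) x @keys;
print $items{7}, "\n";                                      # 1
```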
Swap keys with values:
On the line before that we see:
foreach my $id (keys %$ref){
In case you didn't notice $id was used only once, and that was for getting the associated value.
That means we can use values and get rid of the %{$ref->{$id}} syntax.
for my $hash (values %$ref){
@items{ keys %$hash } = ();
}
( $hash isn't a good name, but I don't know what it represents. )
3 arg open:
It isn't recommended to use the two argument form of open, or to blindly use the bareword style of filehandles.
open(VAL, "> $classes.items");
As an aside, did you know there is also a one argument form of open. I don't really recommend it though, it's mostly there for backward compatibility.
our $VAL = "> $classes.items";
open(VAL);
The recommended way to do it is with 3 arguments.
open my $val, '>', "$classes.items";
There may be some rare edge cases where you need/want to use the two argument version though.
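Here is a sketch of the three-argument form with an explicit error check instead of autodie. The temporary directory and the base name in $classes are assumptions made so that the example does not touch real files.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write into a scratch directory that is removed when the program exits.
my $dir     = tempdir( CLEANUP => 1 );
my $classes = "$dir/demo";    # hypothetical base name

# Three-argument open with a lexical filehandle and an error check.
open( my $val, '>', "$classes.items" )
    or die "Cannot open $classes.items for writing: $!";
print {$val} "$_\n" for 1 .. 3;
close($val) or die "Cannot close $classes.items: $!";

# Read the file back to confirm what was written.
open( my $in, '<', "$classes.items" )
    or die "Cannot open $classes.items for reading: $!";
my @lines = <$in>;
close $in;

print scalar(@lines), " lines\n";    # 3 lines
```

Keeping the mode in its own argument means a file name that starts with > or ends with | is never interpreted as part of the mode.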
Put it all together:
sub GetItems {
# this will cause open and close to die on error (in this subroutine only)
use autodie;
my $classes = shift;
my %items;
for my $vector_hash (@_){
# use values so that we don't have to use $vector_hash->{$id}
for my $hash (values %$vector_hash){
# create the keys in %items
@items{keys %$hash} = ();
}
}
# This assumes that the keys of %items are numbers
my @items = sort { $a <=> $b } keys %items;
# using 3 arg open
open my $output, '>', "$classes.items";
my $index; # = 0;
for my $item (@items){
print {$output} $item, "\n";
$items{$item} = ++$index; # 1...
}
close $output;
return \%items;
}
Another option for that last for loop.
for my $index ( 1..@items ){
my $item = $items[$index-1];
print {$output} $item, "\n";
$items{$item} = $index;
}
If your version of Perl is 5.12 or newer, you could write that last for loop like this:
while( my($index,$item) = each @items ){
print {$output} $item, "\n";
$items{$item} = $index + 1;
}
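To try the numbering logic without writing a file, here is a sketch of a cut-down variant. The name number_items and the sample vectors are hypothetical, and the slice assignment at the end replaces the loop only because there is no output to print here.

```perl
use strict;
use warnings;

# Hypothetical variant of GetItems that leaves out the file output.
sub number_items {
    my %items;
    for my $vector_hash (@_) {
        for my $hash ( values %$vector_hash ) {
            @items{ keys %$hash } = ();    # collect every inner key
        }
    }

    # This assumes that the keys of %items are numbers.
    my @items = sort { $a <=> $b } keys %items;

    # Number the sorted keys 1, 2, 3, ... in one slice assignment.
    @items{@items} = 1 .. @items;
    return \%items;
}

# Made-up sample data in the shape GetItems expects.
my $pVectors = { doc1 => { 30 => 1, 5 => 1 }, doc2 => { 5 => 1 } };
my $nVectors = { doc3 => { 12 => 1 } };

my $items = number_items( $pVectors, $nVectors );
print "$_ => $items->{$_}\n" for sort { $a <=> $b } keys %$items;
# 5 => 1
# 12 => 2
# 30 => 3
```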
Can a Hash have duplicate keys or values?
It can have duplicate values, but not duplicate keys.
For both hashes and arrays, only one scalar can be stored at a given key. ("Keys are unique.") If they weren't, you couldn't do
$h{a} = 1;
$h{a} = 2;
$val = $h{a}; # 2
$a[4] = 1;
$a[4] = 2;
$val = $a[4]; # 2
If you wanted to associate multiple values with a key, you could place a reference to an array (or hash) at that key, and add the value to that array (or hash).
for my $n (4,5,6,10) {
if ($n % 2) {
push @{ $nums{odd} }, $n;
} else {
push @{ $nums{even} }, $n;
}
}
say join ', ', @{ $nums{even} };
See perllol for more on this.
As for values, multiple elements can have the same value in both hashes and arrays.
$counts{a} = 3;
$counts{b} = 3;
$counts[5] = 3;
$counts[6] = 3;
Assuming you are talking about a %hash, then:
Duplicate keys are not allowed.
Duplicate values are allowed.
This is easy to reason about because a hash is a mapping from a particular key to a particular value, where the value plays no part in the look-up and is thus independent of other values.
Please try and run this code, it executes without errors.
I hope this is what you were asking!
#!/usr/bin/perl
use strict;
use warnings;
my %hash = ('a' => 1, 'a' => 2, 'b' => 4 );
print values %hash, "\n\n";
print keys %hash, "\n\n";
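The script runs without errors because a repeated key in the list is not rejected; the later pair simply overwrites the earlier one. A short sketch that makes the result explicit:

```perl
use strict;
use warnings;

# The key 'a' appears twice in the list; the later pair wins.
my %hash = ( 'a' => 1, 'a' => 2, 'b' => 4 );

print scalar( keys %hash ), "\n";    # 2 (two keys, not three)
print $hash{a}, "\n";                # 2
```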
You can try to use Hash::MultiKey module from CPAN.
(I used Data::Dumper to show exactly what the hash looks like; it is not necessary here.)
use Data::Dumper;
use Hash::MultiKey;
tie my %multi_hash, 'Hash::MultiKey';
$multi_hash{['foo', 'foo', 'baz']} = "some_data";
for (keys %multi_hash) {
print @$_,"\n";
};
print Dumper\%multi_hash;
And the output should be:
foofoobaz
$VAR1 = {
'ARRAY(0x98b6978)' => 'some_data'
};
So technically speaking, Hash::MultiKey lets you use a reference as a hash key.
Yes a hash can have duplicate keys as I demonstrate below...
Key example: BirthDate|LastNameFirst4Chars|FirstNameInitial|IncNbr
"1959-12-19|Will|K|1" ... "1959-12-19|Will|K|74".
Note: This might be a useful Key for record look ups if someone did not remember their Social Security Nbr
#-- CODE SNIPPET:
@Offsets=(); #-- we will build an array of Flat File record "byte offsets" to random access
#-- for all records matching this ALT KEY with DUPS
for ($i=1; $i<=99; $i++) {
$KEY=$BirthDate . "|" . $LastNameFirst4Chars . "|" . $FirstNameInitial . "|" . $i;
if (exists $Hash{$KEY}) {
push @Offsets, $Hash{$KEY}; #-- add another hash VALUE to the end of the array
}
}
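An alternative that avoids probing sequence numbers 1 to 99 is to store an array reference under the alternate key without the counter, so every offset for that key lives in one place. This is only a sketch; the records and offsets below are made up.

```perl
use strict;
use warnings;

my %Hash;

# Made-up records: [ alternate key without a sequence number, byte offset ].
my @records = (
    [ '1959-12-19|Will|K', 0 ],
    [ '1960-01-02|Smit|J', 120 ],
    [ '1959-12-19|Will|K', 240 ],
);

# Each key maps to an array ref that collects all offsets for that key.
push @{ $Hash{ $_->[0] } }, $_->[1] for @records;

# A single lookup returns every offset for the key (empty list if none).
my @Offsets = @{ $Hash{'1959-12-19|Will|K'} || [] };
print "@Offsets\n";    # 0 240
```

The hash still has unique keys; the duplicates are held in the array that each key points to.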