Searching first level keys in a multidimensional hash Perl - perl

I have the value of a multi-dimensional hash in Perl.
Its structure is
$hash{$key}{$field}{$date} = $value;
Given that I have both the correct value and the field, what is the fastest way to find its key, given that the value itself is unique (it has a 1-1 relationship with the key)?
EDIT:
I have added a third level, which is the date.
Not all dates have a value, but when a value exists it is the same for every date.
To simplify: if it has a value, it is "A"; otherwise it is blank.
Regards,
InnZaayynn

The organization of your data is not suited to do a fast search. You're going to have to iterate through the entire hash. If you're going to perform multiple searches, it's best if you generate the "inverse" hash so you only need to iterate through the entire hash once instead of once per search.
If you are performing multiple searches, and they're not all for the same field, generate the inverse hash as follows:
my %key_by_field_and_value;
for my $key (keys(%hash)) {
my $hash_for_key = $hash{$key};
for my $field (keys(%$hash_for_key)) {
my $hash_for_key_and_field = $hash_for_key->{$field};
defined( my $date = get_any_one_key($hash_for_key_and_field) )
or next;
length( my $value = $hash_for_key_and_field->{$date} )
or next;
$key_by_field_and_value{$field}{$value} = $key;
}
}
Then, a search becomes
my $field = ...;
my $target_value = ...;
if (defined(
my $target_key =
do { no autovivification; $key_by_field_and_value{$field}{$target_value} }
)) {
...
}
If you are performing multiple searches, and they're all for the same field, generate the inverse hash as follows:
my $field = ...;
my %key_by_value;
for my $key (keys(%hash)) {
my $hash_for_key = $hash{$key};
defined( my $hash_for_key_and_field = $hash_for_key->{$field} )
or next;
defined( my $date = get_any_one_key($hash_for_key_and_field) )
or next;
length( my $value = $hash_for_key_and_field->{$date} )
or next;
$key_by_value{$value} = $key;
}
Then, a search becomes
my $target_value = ...;
if (defined( my $target_key = $key_by_value{$target_value} )) {
...
}
If you're just going to search once, you'll have to search the entire hash.
my $field = ...;
my $target_value = ...;
my $target_key;
for my $key (keys(%hash)) {
my $hash_for_key = $hash{$key};
defined( my $hash_for_key_and_field = $hash_for_key->{$field} )
or next;
defined( my $date = get_any_one_key($hash_for_key_and_field) )
or next;
length( my $value = $hash_for_key_and_field->{$date} )
or next;
if ($value eq $target_value) {
$target_key = $key;
last;
}
}
if (defined($target_key)) {
...
}
All of the above solutions use this efficient version of my ($key) = keys(%$h);:
sub get_any_one_key {
my ($h) = @_;
my $key = each(%$h);
keys(%$h); # Reset iterator
return $key;
}
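As a minimal sketch of the multi-field inverse hash described above, the following builds the index once and then answers lookups from it. The sample keys, fields, dates and the helper name find_key are made up for illustration, and the plain my ($date) = keys(...) form is used instead of get_any_one_key to keep the example short.

```perl
use strict;
use warnings;

# Hypothetical sample data in the $hash{$key}{$field}{$date} = $value layout.
my %hash = (
    k1 => { colour => { '2020-01-01' => 'red', '2020-01-02' => 'red' },
            size   => { '2020-01-01' => 'L' } },
    k2 => { colour => { '2020-01-01' => 'blue' },
            size   => {} },                     # a field with no dates at all
);

# Build the inverse index once: field -> value -> key.
my %key_by_field_and_value;
for my $key (keys %hash) {
    for my $field (keys %{ $hash{$key} }) {
        my $by_date = $hash{$key}{$field};
        my ($date) = keys %$by_date;            # any one date will do
        next unless defined $date;
        my $value = $by_date->{$date};
        next unless defined $value && length $value;
        $key_by_field_and_value{$field}{$value} = $key;
    }
}

# Look up a key without autovivifying an entry for an unknown field.
sub find_key {
    my ($field, $value) = @_;
    return undef unless exists $key_by_field_and_value{$field};
    return $key_by_field_and_value{$field}{$value};
}

print find_key('colour', 'blue'), "\n";   # prints "k2"
```

The explicit exists check does the same job as no autovivification in the answer, without needing the extra module.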

Related

Split string into a hash of hashes (perl)

At the moment I'm a little confused.
I am looking for a way to turn a string with an arbitrary number of words (separated by slashes) into a nested hash.
These "strings" are output from a text database.
Given is for example
"office/1/hardware/mouse/count/200"
the next one can be longer or shorter..
This must be created from it:
{
office {
1{
hardware {
mouse {
count => 200
}
}
}
}
}
Any idea ?
Work backwards. Split the string. Use the last two elements to make the inner-most hash. While more words exist, make each one the key of a new hash, with the inner hash as its value.
my $s = "office/1/hardware/mouse/count/200";
my @word = split(/\//, $s);
# Bottom level taken explicitly
my $val = pop @word;
my $key = pop @word;
my $h = { $key => $val };
# Loop on the array itself so that a word such as "0" does not end the loop early
while ( @word )
{
    $key = pop @word;
    $h = { $key => $h };
}
Simple recursive function should do
use strict;
use warnings;
use Data::Dumper;
sub foo {
my $str = shift;
my ($key, $rest) = split m|/|, $str, 2;
if (defined $rest) {
return { $key => foo($rest) };
} else {
return $key;
}
}
my $hash = foo("foo/bar/baz/2");
print Dumper $hash;
Gives output
$VAR1 = {
'foo' => {
'bar' => {
'baz' => '2'
}
}
};
But like I said in the comment: What do you intend to use this for? It is not a terribly useful structure.
If there are many lines to be read into a single hash and the lines have a variable number of fields, you have big problems and the other two answers will clobber data by either smashing sibling keys or overwriting final values. I'm supposing this because there is no rational reason to convert a single line into a hash.
You will have to walk down the hash with each field. This will also give you the most control over the process.
our $hash = {};
our $eolmark = "\000";
while (my $line = <...>) {
    chomp $line;
    my @fields = split /\//, $line;
    my $count = @fields;
    my $h = $hash;
    my $i = 0;
    map { (++$i == $count) ?
        ($h->{$_}{$eolmark} = 1) :
        ($h = $h->{$_} ||= {});
    } @fields;
}
$h->{$_}{$eolmark} = 1: You need the special "end of line" key so that you can recognize the end of a record and still permit longer records to coexist. If you had the two records foo/bar/baz and foo/bar/baz/quux, the second would otherwise overwrite the final value of the first.
$h = $h->{$_} ||= {}: This statement is a very handy idiom to both create and populate a cache in one step and then take a shortcut reference to it. Never do a hash lookup more than once.
HTH
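As a rough, self-contained sketch of the "walk down the hash" approach with an end-of-record marker, the snippet below inserts several records into one tree. The helper name add_record and the sample records are assumptions made for illustration, not part of the answer above.

```perl
use strict;
use warnings;

my $EOL = "\000";    # marker key meaning "a record ends here"

# Insert one slash-separated record into the tree, creating levels as needed.
sub add_record {
    my ($tree, $line) = @_;
    my $node = $tree;
    # Descend one level per word, creating the level if it does not exist yet.
    $node = $node->{$_} ||= {} for split m{/}, $line;
    $node->{$EOL} = 1;
    return;
}

my %tree;
add_record(\%tree, $_) for 'foo/bar/baz', 'foo/bar/baz/quux', 'foo/zip';

# Both the shorter and the longer record are still present.
print "baz ends a record\n"  if $tree{foo}{bar}{baz}{$EOL};
print "quux ends a record\n" if $tree{foo}{bar}{baz}{quux}{$EOL};
```

Because every word becomes a hash level and the marker sits one level below the last word, a record that is a prefix of another never clobbers it.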

How do i push a value from a hash onto an array of a hash?

I have a %hashmap and an array @values.
In my code the %hashmap is being created like this: $hashmap{$key}="$name";
After the %hashmap is created I need to take its value and add it to the same %hashmap, but under a different key. The new entry looks like this:
@hashvalues=($name,$type,$Statement,\@parents,\@children)
$hashmap{$newkey}=\@hashvalues;
I want to push the $name from $hashmap{$key} into the \@children of the $hashmap{$newkey}
This is my code so far :
# first i check if the $hashmap exists so i know i update it
if(exists$hashmap{$name}){
my $auxiliary=\@{$hashmap{$name}};
push(@children,@$auxiliary);
}
my @hashvalues=($name,$type,$Statement,\@parents,\@children);
$hashmap{$name}=\@hashvalues;
The %hash i want to push it is created here , there is no other record of it :
if ($parent ne @$hashvalues2[0]) {
$hashmap{$parent}="$child";
}
The value i am interested to store and push is $child here .
Here is the place where the same %hash will be created again, but with the fields name, type etc. (not empty fields! They all have a value assigned earlier):
@hashvalues = ($name, $type, $Statement, \@parents, \@children)
$hashmap{$newkey} = \@hashvalues;
I want to see if the %hash was created before this point @hashvalues=($name..
So i check it with this code :
if (exists$hashmap{$name}) { Do Code... }
If there was a record of it, I want to update the %hash by pushing the value $child into the \@parents of the @hashvalues, so that when the %hash with type, name, etc. is made, it has the $child value from the previous version of it.
Here is the order of the code :
if (exists$hashmap{$name}) { Do Code; }
my @hashvalues = ($name, $type, $Statement, \@parents, \@children);
$hashmap{$name} = \@hashvalues;
if ($parent ne @$hashvalues2[0]) {
$hashmap{$parent} = "$child";
}
Here is the whole code :
@FileStatements - an array of Statements
$Statement - a larger string where I collect all my data from
And I fill the @hashvalues with all the data I collect
my $FROMduplicate="";
my $JOINduplicate="";
foreach my $Statement (@FileStatements) {
    if ($Statement!~m/create/i) {
        next;
    }
    if ($Statement=~m/create user |^GRANT |^spool /gim) {
        next;
    }
    my $name="";
    my $type="";
    my $content="";#FileStatements
    my @parents=();
    my @children=();
    my $duplicate="";
    # print $Statement."\n";
    #NAME--------------------------------------------
    my $catch = (split(/ view | trigger | table | synonym | procedure | role /i, $Statement))[1];
    $catch =~ s/^\s+//;
    $name = (split(/\s+/, $catch))[0];
    if ($name=~m/undef/gi){next;}
    #DEBUG #print "$name\n";
    #TYPE--------------------------------------------
    if( $Statement=~m/^create or replace \w+ /i) {
        my $tmp = (split(/ replace /i, $Statement))[1];
        $tmp =~ s/^\s+//;
        $type = (split(/\s+/, $tmp))[0];
    }
    else{
        my $tmp = (split(/^create /i, $Statement))[1];
        $tmp =~ s/^\s+//;
        $type = (split(/\s+/, $tmp))[0];
    }
    if ($type=~m/undef| undef |\s+undef\s+|\s+undef,/) {
        next;
    }
    #print "$type\n";
    #CONTENT-----------------------------------------
    #PARENTS-----------------------------------------
    my @froms = split(/ from\s+/i, $Statement);
    my @joins = split(/ join /i, $Statement);
    foreach my $i (1..@froms-1) {
        #print Writer1 "$froms[$i]"."\n\n";
        my $from = (split(/ where |select | left | left | right | as /i, $froms[$i])) [0];
        $from=~s/^\s+//;
        $from=~s/\(+//;
        my @Spaces = split(/, | , /,$from);
        foreach my $x (0..@Spaces-1) {
            my $SpaceFrom = (split(/ /,$Spaces[$x])) [0];
            $SpaceFrom=~s/;//;
            $SpaceFrom=~s/\)+//;
            # print Writer1 $SpaceFrom."\n\n";
            if ($SpaceFrom eq $FROMduplicate) {
                next;
            }
            push(@parents,$SpaceFrom);
            $FROMduplicate=$SpaceFrom;
        }
    }
    foreach my $x (1..@joins-1){
        #print "$joins[$i]"."\n\n";
        my $join = (split(/ on /i,$joins[$x])) [0];
        $join = (split(/ /i,$joins[$x])) [0];
        #print Writer "\n\n".$join."\n\n";
        if ($join eq $JOINduplicate) {
            next;
        }
        push(@parents,$join);
        $JOINduplicate=$join;
    }
    @parents = do { my %seen; grep { !$seen{$_}++ } @parents };
    #check hash for existence
    if(exists$hashmap{$name}){
        push(@{$hashmap[3]},@parents);
        push(@{$hashmap[0]},$name);
        push(@{$hashmap[1]},$type);
        push(@{$hashmap[2]},$Statement);
    }
    my @hashvalues=($name,$type,$Statement,\@parents,\@children);
    $hashmap{$name}=\@hashvalues;
    # push(@children,$hashmap{$name}) if( exists$hashmap{$name})
}
}
Your question is far from clear, but I think I can answer this question out of context
I want to push the $name from $hashmap{$key} into the \@children of the $hashmap{$newkey}
I assume you have something like this in place already
my %hashmap;
my ( $name, $type, $Statement, @parents, @children );
my @hashvalues = ( $name, $type, $Statement, \@parents, \@children );
$hashmap{$newkey} = \@hashvalues;
Remember that the identifiers name, type, Statement etc. have vanished, and these five values are simply elements of an array
The $name from $hashmap{$key} is the first element of the array, so it is
$hashmap{$key}[0]
The @children of the $hashmap{$newkey} is the fifth element of the array, or
$hashmap{$newkey}[4]
To push the first into the second, you need
push @{ $hashmap{$newkey}[4] }, $hashmap{$key}[0]
You should also use something more meaningful than hashmap for your identifier. The % says that the variable is a hash (there's no such thing as a Perl hash map) and you should use the name to describe the nature of its contents
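To make the indexing concrete, here is a small runnable sketch. It assumes that both entries already hold the five-element array layout; the keys view_a and table_b and the statement strings are invented for illustration.

```perl
use strict;
use warnings;

my %hashmap;

# Hypothetical records laid out as [ name, type, statement, \@parents, \@children ].
$hashmap{view_a}  = [ 'view_a',  'view',  'create view view_a ...',   [], [] ];
$hashmap{table_b} = [ 'table_b', 'table', 'create table table_b ...', [], [] ];

# Push the name stored under one key onto the children list of another.
push @{ $hashmap{table_b}[4] }, $hashmap{view_a}[0];

print "children of table_b: @{ $hashmap{table_b}[4] }\n";   # prints "children of table_b: view_a"
```

Note that in the question's own code $hashmap{$parent} initially holds a plain string rather than an array reference, so $hashmap{$key}[0] only makes sense once that entry has been replaced by the array form.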

compare multiple hashes for common keys merge values

I have a working bit of code here where I am comparing the keys of six hashes together to find the ones that are common amongst all of them. I then combine the values from each hash into one value in a new hash. What I would like to do is make this scaleable. I would like to be able to easily go from comparing 3 hashes to 100 without having to go back into my code and altering it. Any thoughts on how I would achieve this? The rest of the code already works well for different input amounts, but this is the one part that has me stuck.
my $comparison = List::Compare->new([keys %{$posHashes[0]}], [keys %{$posHashes[1]}], [keys %{$posHashes[2]}], [keys %{$posHashes[3]}], [keys %{$posHashes[4]}], [keys %{$posHashes[5]}]);
my %comboHash;
for ($comparison->get_intersection) {
$comboHash{$_} = ($posHashes[0]{$_} . $posHashes[1]{$_} . $posHashes[2]{$_} . $posHashes[3]{$_} . $posHashes[4]{$_} . $posHashes[5]{$_});
}
my %all;
for my $posHash (@posHashes) {
    for my $key (keys(%$posHash)) {
        push @{ $all{$key} }, $posHash->{$key};
    }
}
my %comboHash;
for my $key (keys(%all)) {
    next if @{ $all{$key} } != @posHashes;
    $comboHash{$key} = join('', @{ $all{$key} });
}
Just make a subroutine and pass it hash references
my $combination = combine(@posHashes);
sub combine {
    my @hashes = @_;
    my @keys;
    for my $href (@hashes) {
        push @keys, keys %$href;
    }
    # Insert intersection code here..
    # .....
    my %combo;
    for my $href (@hashes) {
        for my $key (@intersection) {
            $combo{$key} .= $href->{$key};
        }
    }
    return \%combo;
}
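The intersection step that the sketch above leaves as a placeholder could be filled in along these lines. This is only one possible approach (counting how many hashes each key occurs in); the helper name common_keys and the sample data are assumptions.

```perl
use strict;
use warnings;

# Return the keys that occur in every one of the given hash references.
sub common_keys {
    my @hashes = @_;
    my %count;
    for my $href (@hashes) {
        $count{$_}++ for keys %$href;    # each hash contributes a key at most once
    }
    return grep { $count{$_} == @hashes } keys %count;
}

# Hypothetical input standing in for @posHashes.
my @posHashes = ( { a => 1, b => 2 }, { a => 3, c => 4 }, { a => 5, b => 6 } );
my @intersection = common_keys(@posHashes);
print "@intersection\n";    # prints "a"
```

Since a key can appear only once per hash, a count equal to the number of hashes means the key is present in all of them, however many hashes are passed in.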
Create a subroutine:
sub combine_hashes {
    my %result = ();
    my @hashes = @_;
    my $first = shift @hashes;
    for my $element (keys %$first) {
        my $count = 0;
        for my $href (@hashes) {
            $count += (grep {$_ eq $element} (keys %$href));
        }
        if ($count > $#hashes) {
            $result{$element} = $first->{$element};
            $result{$element} .= $_->{$element} for @hashes;
        }
    }
    \%result;
}
and call it by:
my %h = %{combine_hashes(\%h1, \%h2, \%h3)};
...or as:
my %h = %{combine_hashes(@posHashes)};
There is a pretty straightforward solution:
sub merge {
    my $first = shift;
    my @hashes = @_;
    my %result;
    KEY:
    for my $key (keys %$first) {
        my $accu = $first->{$key};
        for my $hash (@hashes) {
            next KEY unless exists $hash->{$key};
            $accu .= $hash->{$key};
        }
        $result{$key} = $accu;
    }
    return \%result;
}
You have to call it with references to hashes, and it also returns a hash reference, e.g.:
my $comboHashRef = merge(@posHashes);

How do you access information in a hash reference that has been passed to a sub-routine?

I am trying to use hash references to pass information to sub-routines. Pseudo code:
sub output_detail {
Here I want to be able to access each record by the key name (ex. "first", "second", etc)
}
sub output_records {
I want to use a foreach to pass each record's hash reference to another sub-routine
that handles each record.
foreach $key ( sort( keys %something) ) {
output_detail(something);
}
}
%records = ();
while (($recnum, $first, $second, $third) = db_read($handle)) {
my %rec = ("first"=>$first, "second"=>$second, "third"=>$third);
my $id = $recnum;
$records{$id} = \%rec;
}
output_records(\%records);
I'm not sure how to de-reference the hashes when passed to a sub-routine.
Any ideas would be very helpful.
Thanks
Use -> to access keys of a hash ref. So, your argument to output_records will come through as a scalar hash ref.
sub output_records {
my $records = shift;
my $first = $records->{"first"};
}
See perlreftut for more info.
sub output_detail {
my $hash = shift;
my $value = $$hash{some_key};
}
sub output_records {
my $hash = shift;
foreach my $key (sort keys %$hash) {
output_detail($hash, $key);
# or just pass `$$hash{$key}` if you only need the value
}
}
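Putting the pieces together, a minimal sketch of both sub-routines might look like this. It assumes each record is a hash reference with first, second and third keys, stored under its record number; the sample data stands in for whatever db_read() returns, and the subs return strings rather than printing so the result is easy to check.

```perl
use strict;
use warnings;

# Format one record; $rec is a reference to a hash with first/second/third.
sub output_detail {
    my ($id, $rec) = @_;
    return "$id: $rec->{first} $rec->{second} $rec->{third}";
}

# Walk all records in key order and hand each inner hash ref to output_detail.
sub output_records {
    my ($records) = @_;
    return map { output_detail($_, $records->{$_}) } sort keys %$records;
}

# Hypothetical data standing in for the rows returned by db_read().
my %records = (
    1 => { first => 'a', second => 'b', third => 'c' },
    2 => { first => 'd', second => 'e', third => 'f' },
);

print "$_\n" for output_records(\%records);
```

The outer reference is dereferenced with %$records to list its keys, and each inner reference with the arrow form $rec->{first}.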

array to hash in perl

I have a source list from which I am picking random items to populate a destination list. The items in the list have a particular format. For example:
item1{'name'}
item1{'date'}
etc and many more fields.
while inserting into the destination list I check for unique names on items and insert it into that list. For this I have to traverse the entire destination list to check if an item with a given name exists and if not insert it.
I thought it would be nice if I make the destination list as hash instead of a list again so that I can look up for the item faster and efficiently. I am new to Perl and am not getting how to do this. Anybody, Please help me on how to insert an item, find for a particular item name, and delete an item in hash?
How can I make both the name and date as key and the entire item as value?
my %hash;
Insert an item $V with a key $K?
$hash{$K} = $V
Find for a particular name / key $K?
if (exists $hash{$K}) {
print "it is in there with value '$hash{$K}'\n";
} else {
print "it is NOT in there\n"
}
Delete a particular name / key?
delete $hash{$K}
Make name and date as key and entire item as value?
Easy Way: Just string everything together
set: $hash{ "$name:$date" } = "$name:$date:$field1:$field2"
get: my ($name2,$date2,$field1,$field2) = split ':', $hash{ "$name:$date" }
del: delete $hash{ "$name:$date" }
Harder Way: Store as a hash in the hash (google "perl object")
set:
my %temp;
$temp{"name"} = $name;
$temp{"date"} = $date;
$temp{"field1"} = $field1;
$temp{"field2"} = $field2;
$hash{"$name:$date"} = \%temp;
get:
my $find = exists $hash{"$name:$date"} ? $hash{"$name:$date"} : undef;
if (defined $find) { # i.e. it was found
printf "field 1 is %s\n", $find->{"field1"}
} else {
print "Not found\n";
}
delete:
delete $hash{"$name:$date"}
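The set, get and delete steps for the hash-in-a-hash variant can be combined into one small runnable sketch. The helper names item_key and add_item, and the sample item, are hypothetical; the duplicate check mirrors the "insert only if the name is not there yet" requirement from the question.

```perl
use strict;
use warnings;

my %hash;

# Build the composite key in one place so set/get/delete always agree.
sub item_key {
    my ($name, $date) = @_;
    return "$name:$date";
}

# Insert only if no item with this name and date is stored yet.
sub add_item {
    my ($item) = @_;
    my $k = item_key($item->{name}, $item->{date});
    return 0 if exists $hash{$k};
    $hash{$k} = $item;
    return 1;
}

add_item({ name => 'item1', date => '2010-01-01', field1 => 'x' });
add_item({ name => 'item1', date => '2010-01-01', field1 => 'y' });   # duplicate, ignored

my $found = $hash{ item_key('item1', '2010-01-01') };
print "field 1 is $found->{field1}\n" if $found;    # prints "field 1 is x"

delete $hash{ item_key('item1', '2010-01-01') };
print "items left: ", scalar(keys %hash), "\n";     # prints "items left: 0"
```

One caveat with a joined key: if a name can itself contain the separator character, two different name/date pairs could map to the same key, so pick a separator that cannot occur in the data.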
It is not easy to understand what you are asking because you do not describe the input and the desired outputs specifically.
My best guess is something along the lines of:
#!/usr/bin/perl
use strict; use warnings;
my @list = (
    q(item1{'name'}),
    q(item1{'date'}),
);
my %lookup;
for my $entry ( @list ) {
    # Braces are escaped so they are always treated as literal characters
    my ($name, $attrib) = $entry =~ /([^{]+)\{'([^']+)'\}/;
    $lookup{ $name }{ $attrib } = $entry;
}
for my $entry ( keys %lookup ) {
    my %entry = %{ $lookup{$entry} };
    print "@entry{keys %entry}\n"
}
use YAML;
print Dump \%lookup;
Output:
item1{'date'} item1{'name'}
---
item1:
date: "item1{'date'}"
name: "item1{'name'}"
If you know which items you are going to need, and in what order you'll need them as keys, then re-parsing the key is of questionable value. I prefer to store them in levels.
$hash{ $h->{name} }{ $h->{date} } = $h;
# ... OR ...
$hash{ $h->{date} }{ $h->{name} } = $h;
foreach my $name ( sort keys %hash ) {
my $name_hash = $hash{$name};
foreach my $date ( keys %$name_hash ) {
print "\$hash{$name}{$date} => " . Dumper( $name_hash->{$date} ) . "\n";
}
}
For arbitrary levels, you may want a traversal function
sub traverse_hash (&@) {
    my ( $block, $hash_ref, $path ) = @_;
    $path = [] unless $path;
    my ( @res, @results );
    my $want = wantarray;
    my $want_something = defined $want;
    # Iterate over the keys only, not over the flattened key/value list
    foreach my $key ( keys %$hash_ref ) {
        my $l_path = [ @$path, $key ];
        my $value = $hash_ref->{$key};
        if ( ref( $value ) eq 'HASH' ) {
            @res = traverse_hash( $block, $value, $l_path );
            push @results, @res if $want_something && @res;
        }
        elsif ( $want_something ) {
            @res = $block->( $l_path, $value );
            push @results, @res if @res;
        }
        else {
            # Pass the full path including the current key
            $block->( $l_path, $value );
        }
    }
    return unless $want_something;
    return $want ? @results : { @results };
}
So this does the same thing as above:
traverse_hash {
    my ( $key_path, $value ) = @_;
    print( '$hash{' . join( '}{', @$key_path ) . '} => ' . Dumper( $value ));
    ();
} \%hash
;
Perl Solution
#!/usr/bin/perl -w
use strict;
use Data::Dumper;
sub main{
my %hash;
my @keys = qw(firstname lastname age); # hash's keys
# fname lname age
# --------|--------|-----
my @arr = ( [ 'foo1', 'bar1', '1' ],
[ 'foo2', 'bar2', '2' ],
[ 'foo3', 'bar3', '3' ]
);
# test if array set up correctly
print "\$arr[1][1] : $arr[1][1] \n"; # bar2
# loads the multidimensional array into the hash
for my $row (0..$#arr){
for my $col ( 0..$#{$arr[$row]} ){
my $itemnum = "item" . ($row+1); # using the item# format you used
$hash{$itemnum}->{$keys[$col]} = $arr[$row][$col];
}
}
# manually add a 4th item
$hash{item4} = {"firstname", "foo", "lastname", "bar", "age", "35"};
# How to Retrieve
# -----------------------
# single item pull
print "item1->firstname : $hash{item1}->{firstname} \n"; # foo1
print "item3->age : $hash{item3}->{age} \n"; # 3
# whole line 1
{ local $, = " ";
print "full line :" , %{$hash{item2}} , "\n"; # firstname foo2 lastname bar2 age 2
}
# whole line 2
foreach my $key (sort keys %{$hash{item2}}){
print "$key : $hash{item2}{$key} \n";
}
# Clearer description
#print "Hash:\n", Dumper %hash;
}
main();
This should be used in addition to the accepted answer. Your question was a little vague on the array to hash requirement, perhaps this is the model you are looking for?