Perl: how to loop over a hash

I've found a lot of different answers to this question, and none seem to work (?!)
Here's what I have:
my %FORM = ["a"=>"0AD", "b"=>"johnny manziel", "c"=>"lincoln"];
#my @k = keys (%FORM);
#for my $iter (@k) { print "$iter\n"; }
#for my $key (keys %FORM) {
# print "\t";
# print $FORM{$key};
# print "\n";
#}
while ( ($key, $value) = each %FORM )
{
print "key: $key, value: $FORM{$key}\n";
}
typical output:
./testprinthash.pl
key: ARRAY(0x13a2998), value:
I always get an array instead of a key value

You want to use parentheses ( ) when assigning to a hash, not square brackets [ ].
my %FORM = ("a"=>"0AD", "b"=>"johnny manziel", "c"=>"lincoln");
The [ ] create an ARRAY reference, which is not what you want: the hash ends up with that one reference, stringified as ARRAY(0x...), as its only key.
Check
http://perldoc.perl.org/perlref.html
http://perldoc.perl.org/perlreftut.html
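To see the difference concretely, here is a minimal sketch. The variable names %wrong and %right are made up for illustration, and the warning category being silenced is only there to keep the output quiet.

```perl
use strict;
use warnings;

# Square brackets build an anonymous array and return ONE reference to it.
# Assigning that single scalar to a hash makes it the only key
# (stringified as "ARRAY(0x...)") with an undefined value.
my %wrong;
{
    no warnings 'misc';    # silence "Reference found where even-sized list expected"
    %wrong = [ "a" => "0AD", "b" => "johnny manziel", "c" => "lincoln" ];
}

# Parentheses build a plain list of key/value pairs.
my %right = ( "a" => "0AD", "b" => "johnny manziel", "c" => "lincoln" );

printf "wrong has %d key(s), right has %d key(s)\n",
    scalar( keys %wrong ), scalar( keys %right );

# Sorting the keys gives a stable order; hash order itself is not defined.
for my $key ( sort keys %right ) {
    print "key: $key, value: $right{$key}\n";
}
```

With the parenthesized version, any of the loops from the question (keys, or while with each) prints the three pairs.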

Related

Handling Nested Delimiters in perl

use strict;
use warnings;
my %result_hash = ();
my %final_hash = ();
Compare_results();
foreach my $key (sort keys %result_hash ){
print "$key \n";
print "$result_hash{$key} \n";
}
sub Compare_results
{
while ( <DATA> )
{
my($instance,$values) = split /\:/, $_;
$result_hash{$instance} = $values;
}
}
__DATA__
1:7802315095\d\d,7802315098\d\d;7802025001\d\d,7802025002\d\d,7802025003\d\ d,7802025004\d\d,7802025005\d\d,7802025006\d\d,7802025007\d\d
2:7802315095\d\d,7802025002\d\d,7802025003\d\d,7802025004\d\d,7802025005\d\d,7802025006\d\d,7802025007\d\d
Output
1
7802315095\d\d,7802315098\d\d;7802025001\d\d,7802025002\d\d,7802025003\d\d,7802025004\d\d,7802025005\d\d,7802025006\d\d,7802025007\d\d
2
7802315095\d\d,7802025002\d\d,7802025003\d\d,7802025004\d\d,7802025005\d\d,7802025006\d\d,7802025007\d\d
I am trying to fetch the value of each key and then split the comma-separated value from the result hash. If I find a semicolon in any value, I want to store the left and right values under separate hash keys.
Something like below
1. # split the value of result_hash{$key} again by , and see whether any chunk is separated by ;
2. # every chunk without ; and the value on the left of ; should be stored in
@{$final_hash{"eto"}} = ['7802315095\d\d','7802315098\d\d','7802025002\d\d','7802025003\d\d','7802025004\d\d','7802025005\d\d','7802025006\d\d','7802025007\d\d'] ;
3. # anything found on the right side of ; has to be stored in
@{$final_hash{"pro"}} = ['7802025001\d\d'] ;
Is there a way that I can handle everything in the subroutine? Can I make the code simpler?
Update :
I tried splitting the string in a single shot, but it's just picking the values with the semicolon and ignoring everything else
foreach my $key (sort keys %result_hash ){
# print "$key \n";
# print "$result_hash{$key} \n";
my ($o,$t) = split(/,|;/, $result_hash{$key});
print "Left : $o \n";
print "Left : $t \n";
#push @{$final_hash{"eto"}}, $o;
#push @{$final_hash{"pro"}} ,$t;
}
}
My updated code after help
sub Compare_results
{
open my $fh, '<', 'Data_File.txt' or die $!;
# split by colon and further split by , and ; if any (done in insert_array)
my %result_hash = map { chomp; split ':', $_ } <$fh> ;
foreach ( sort { $a <=> $b } (keys %result_hash) )
{
($_ < 21)
? insert_array($result_hash{$_}, "west")
: insert_array($result_hash{$_}, "east");
}
}
sub insert_array
{
my ($val,$key) = @_;
foreach my $field (split ',', $val)
{
$field =~ s/^\s+|\s+$//g; # / turn off editor coloring
if ($field !~ /;/) {
push @{ $file_data{"pto"}{$key} }, $field ;
}
else {
my ($left, $right) = split ';', $field;
push @{$file_data{"pto"}{$key}}, $left if($left ne '') ;
push @{$file_data{"ero"}{$key}}, $right if($right ne '') ;
}
}
}
Thanks
Update: added a two-pass regex at the end.
Just proceed systematically, analyze the string step by step. The fact that you need consecutive splits and a particular separation rule makes it unwieldy to do in one shot. Better have a clear method than a monster statement.
use warnings 'all';
use strict;
use feature 'say';
my (%result_hash, %final_hash);
Compare_results();
say "$_ => $result_hash{$_}" for sort keys %result_hash;
say '---';
say "$_ => [ @{$final_hash{$_}} ]" for sort keys %final_hash;
sub Compare_results
{
%result_hash = map { chomp; split ':', $_ } <DATA>;
my (@eto, @pro);
foreach my $val (values %result_hash)
{
foreach my $field (split ',', $val)
{
if ($field !~ /;/) { push @eto, $field }
else {
my ($left, $right) = split ';', $field;
push @eto, $left;
push @pro, $right;
}
}
}
$final_hash{eto} = \@eto;
$final_hash{pro} = \@pro;
return 1; # but add checks above
}
There are some inefficiencies here, and no error checking, but the method is straightforward. If your input is anything but smallish, please change the above to process line by line, which you clearly know how to do. It prints
1 => ... (what you have in the question)
---
eto => [ 7802315095\d\d 7802315098\d\d 7802025002\d\d 7802025003\d\ d ...
pro => [ 7802025001\d\d ]
Note that your data does have one loose \d\ d.
We don't need to build the whole hash %result_hash for this; we only need to pick the part of the line after the colon. I left the hash in since it is declared global, so you may want to have it around. If it in fact isn't needed on its own, this simplifies to
sub Compare_results {
my (@eto, @pro);
while (<DATA>) {
my ($val) = /:(.*)/;
foreach my $field (split ',', $val)
# ... same
}
# assign to %final_hash, return from sub
}
Thanks to ikegami for comments.
Just for curiosity's sake, here it is in two passes with regex
sub compare_rx {
my @data = map { (split ':', $_)[1] } <DATA>;
$final_hash{eto} = [ map { /([^,;]+)/g } @data ];
$final_hash{pro} = [ map { /;([^,;]+)/g } @data ];
return 1;
}
This picks all characters that are not , or ; using the negated character class [^,;], so each match runs up to the next , or ; (or the end of the string), going left to right. It does this globally, /g, so it keeps going through the string, collecting every field. Note that this first pass therefore also collects the field to the right of a ; so filter that out if eto must exclude it. The second pass then cheats a bit, picking all [^,;] that are right of ;. The map is used to do this for all lines of data.
If %result_hash is needed, build it instead of @data, then pull the values from it with my @values = values %result_hash and feed the map with @values.
Or, broken line by line (again, you can build %result_hash instead of taking $data directly)
my (@eto, @pro);
while (<DATA>) {
my ($data) = /:(.*)/;
push @eto, $data =~ /([^,;]+)/g;
push @pro, $data =~ /;([^,;]+)/g;
}
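For comparison, here is a minimal sketch of the split-based approach packaged as a sub that returns two array references instead of filling a global hash. The name split_fields and the shortened sample data (A, B;C and so on) are made up for illustration; real lines would carry the longer patterns from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: takes raw "id:field,field;field,..." lines and
# returns two array references (plain/left-hand fields, right-hand fields).
sub split_fields {
    my @lines = @_;
    my ( @eto, @pro );
    for my $line (@lines) {
        chomp( my $copy = $line );
        my ( undef, $val ) = split /:/, $copy, 2;    # drop the "id:" prefix
        next unless defined $val;
        for my $field ( split /,/, $val ) {
            my ( $left, $right ) = split /;/, $field, 2;
            push @eto, $left  if defined $left  && length $left;
            push @pro, $right if defined $right && length $right;
        }
    }
    return ( \@eto, \@pro );
}

my ( $eto, $pro ) = split_fields( "1:A,B;C,D\n", "2:A,E\n" );
print "eto => [ @$eto ]\n";    # eto => [ A B D A E ]
print "pro => [ @$pro ]\n";    # pro => [ C ]
```

Returning references keeps the two lists separate for the caller; assigning them into %final_hash afterwards is a one-liner per key.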

Argument "*" isn't numeric in array element

I want to make a hash of arrays from a file that looks like:
xx500173:56QWER 45 A rtt34 34C
...
I would like to have a unique "key" (e.g. column1_column2)
#!/usr/bin/perl
use warnings;
use strict;
my $seq;
while(<>){
chomp;
my @line = split(/\s+/, $_);
my $key = $line[0] . "_" . $line[1]; #try to make a unique key for each entry
map { $seq->{ $_->[$key] } = [@$_[0..4]] } [ split/\s+/ ];
}
foreach my $s (keys %{$seq} ) {
print $s,": ",join( "\t", @{ $seq->{$s}} ) . "\n";
}
but I get the following error:
Argument "xx500173:56QWER_45" isn't numeric in array element
Does it matter if the key is numeric or a string?
An index to an array [] should be numeric, but $key is not numeric. Assuming you want all the white-space-separated tokens as elements of your array:
use warnings;
use strict;
my $seq;
while (<DATA>) {
chomp;
my @line = split;
my $key = $line[0] . "_" . $line[1]; #try to make a unique key for each entry
$seq->{$key} = [ @line ];
}
foreach my $s ( keys %{$seq} ) {
print $s, ": ", join( "\t", @{ $seq->{$s} } ) . "\n";
}
__DATA__
xx500173:56QWER 45 A rtt34 34C
Outputs:
xx500173:56QWER_45: xx500173:56QWER 45 A rtt34 34C
You have confused yourself with the line
map { $seq->{ $_->[$key] } = [@$_[0..4]] } [ split/\s+/ ];
which is wrong because
map is an operator for translating one list into another by performing the same operation on every element of the input list, but you are ignoring the returned value
The input list is only one item long - the array reference returned by [ split/\s+/ ]
What you have written is the same as
$_ = [ split /\s+/ ];
$seq->{ $_->[$key] } = [ @$_[0..4] ];
and the problem is that $_->[$key] tries to index the anonymous array using the string $key, which is clearly wrong.
All you need here is
$seq->{$key} = [ @line[0..4] ];
and your complete program should look like this
#!/usr/bin/perl
use strict;
use warnings;
my $seq;
while ( <> ) {
chomp;
my @line = split;
$seq->{"$line[0]_$line[1]"} = [ @line[0..4] ];
}
for my $s ( keys %{$seq} ) {
printf "%s: %s\n", $s, join("\t", @{ $seq->{$s} } );
}
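The same idea can be sketched as a small sub that works on lines passed in directly, which makes it easy to try without an input file. The name index_lines is hypothetical, and skipping lines with fewer than two columns is an assumption about how short lines should be handled.

```perl
use strict;
use warnings;

# Hypothetical helper: build a hash of arrays from whitespace-separated
# lines, keyed by "column1_column2".
sub index_lines {
    my @lines = @_;
    my %seq;
    for my $line (@lines) {
        my @cols = split ' ', $line;    # like bare split: trims and splits on whitespace
        next if @cols < 2;              # assumption: skip lines too short to form a key
        my $key = join '_', @cols[ 0, 1 ];
        $seq{$key} = [@cols];           # copy so each key owns its own array
    }
    return \%seq;
}

my $seq = index_lines("xx500173:56QWER 45 A rtt34 34C\n");
for my $key ( sort keys %$seq ) {
    print "$key: ", join( "\t", @{ $seq->{$key} } ), "\n";
}
```

The key is a string used inside { }, so it can contain any characters; only [ ] subscripts need to be numeric.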

perl loop hash using each keyword on foreach loop

I tried to loop through a hash using each in a foreach loop. It looks like $k and $v are not updated. Can anyone explain?
%hash = (a=>5,b=>6);
foreach( my ($k,$v) = each %hash){
print "\neach loop : $k $v\n";
}
output :
each loop : a 5
each loop : a 5
foreach takes a list of values, and executes its loop body once per value, assigning some variable ($_ if otherwise unspecified) each time:
foreach ( 1, 2, 3 ) {
print "The value is $_\n";
}
In your case, you gave it a list of two things, being the first key and value taken from the hash. Additionally, you also assigned those to two new variables, $k and $v. Thus, your loop executed twice, with those variables remaining constant throughout.
A better way to iterate keys and values from a hash is to iterate on the list of keys, taking a value each time:
foreach my $key ( keys %hash ) {
my $value = $hash{$key};
...
}
Alternatively, you might enjoy the pairs function from List::Util (the key and value accessors shown here need version 1.39 or later):
foreach my $pair ( pairs %hash ) {
my $key = $pair->key;
my $value = $pair->value;
}
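A runnable sketch of the pairs approach, assuming the installed List::Util is recent enough to export pairs. It unpacks each pair as a two-element array reference, which also works on versions that predate the key and value accessors.

```perl
use strict;
use warnings;
use List::Util qw(pairs);

my %hash = ( a => 5, b => 6 );

my @seen;
for my $pair ( pairs %hash ) {
    my ( $key, $value ) = @$pair;    # each pair is a two-element array reference
    push @seen, "$key=$value";
}

# Sort before printing because hash order is not defined.
print join( ", ", sort @seen ), "\n";    # a=5, b=6
```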
Use the while loop.
#!/usr/bin/perl
use strict;
use warnings;
my %hash = (a=>5,b=>6);
while (my ($key, $value) = each %hash) {
print "Key is $key, value is $value\n";
}
Also see: Is perl's each function worth using?
You need to do while instead of foreach:
my %hash = (a=>5,b=>6);
while( my ($k,$v) = each %hash){
print "\neach loop : $k $v\n";
}
However, each() has gotchas that you need to be aware of, so I prefer just using keys instead, like this:
for my $k (keys %hash) { my $v = $hash{$k}; }
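One of those gotchas can be shown in a few lines: every hash has a single internal iterator, so a loop that stops early leaves it part-way through, and the next each loop resumes from there. This is only a sketch; the counter names are made up for illustration.

```perl
use strict;
use warnings;

my %hash = ( a => 1, b => 2, c => 3 );

# Pull one pair and stop, as an early "last" out of a while/each loop would.
my ($first_key) = each %hash;

# The next each loop resumes where the iterator stopped, so one pair is missed.
my $resumed = 0;
while ( my ( $k, $v ) = each %hash ) { $resumed++ }

# Leave the iterator part-way again, then reset it with keys in void context.
my ($again) = each %hash;
keys %hash;
my $after_reset = 0;
while ( my ( $k, $v ) = each %hash ) { $after_reset++ }

print "resumed loop saw $resumed pairs, after reset saw $after_reset\n";
# resumed loop saw 2 pairs, after reset saw 3
```

Iterating over keys %hash avoids the problem because it builds the full key list up front.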

perl return values from sub get mixed up

I'm trying to get 2 different hashes from a sub in perl. The hashes get mixed up in the output of the sub. Here is my simplified code:
#!/usr/bin/perl
use strict;
use warnings;
sub sub1 {
my (%h1, %h2);
$h1{'1a'}++;
$h1{'1b'}++;
$h2{'2a'}++;
$h2{'2b'}++;
while ( (my $key, my $value) = each %h1 ){
print "key: $key, value: $value\n";
}
print "\n";
return (%h1, %h2);
}
my (%r1, %r2) = sub1();
while ( (my $key, my $value) = each %r1 ){
print "key: $key, value: $value\n";
}
output:
key: 1b, value: 1
key: 1a, value: 1
key: 1b, value: 1
key: 2a, value: 1
key: 2b, value: 1
key: 1a, value: 1
Why is this happening? How can I correct it? Thanks.
Perl flattens the two hashes into a single list, and %r1 takes all of it, leaving %r2 empty. This will always happen unless you return references from your subroutine.
sub sub1 {
my (%h1, %h2);
$h1{'1a'}++;
$h1{'1b'}++;
$h2{'2a'}++;
$h2{'2b'}++;
while ( (my $key, my $value) = each %h1 ){
print "key: $key, value: $value\n";
}
print "\n";
return (\%h1, \%h2); # \(backslash) creates a hashref
}
Then you need to store those references in scalar variables:
my ($r1, $r2) = sub1(); # scalar variables with references to %h1 and %h2
# use %{ } to dereference $r1 as a hash
while ( (my $key, my $value) = each %{ $r1 } ){
print "key: $key, value: $value\n";
}
# prints
# key: 1b, value: 1
# key: 1a, value: 1
# key: 1b, value: 1
# key: 1a, value: 1
You probably should read up on references.
Perl subroutines can take a list of parameters and can return a list. For example, if I do this:
array_sub ( @a, @b );
sub array_sub {
return print join ( ": ", @_ ) . "\n";
}
What you will notice is that the two arrays passed will get merged into a single list of parameters without any way to tell where one list began and another ended.
Similar thing happens with hashes:
array_sub ( %a, %b );
sub array_sub {
return print join ( ": ", @_ ) . "\n";
}
This will merge the two hashes (and their keys) together into a single list that is passed to the subroutine.
To get around this limitation, you can use references which basically are scalar values that point to a memory location where your actual array or hash lives:
array_sub ( \@a, \@b );
sub array_sub {
my $ref_a = shift;
my $ref_b = shift;
my @sub_a = @{ $ref_a };
my @sub_b = @{ $ref_b };
}
In the above, I put a backslash in front of the arrays to get the reference to the array. To dereference them (i.e. turn them back into arrays) I put @{...} around the reference.
You have to do something similar in your code:
my ($r1, $r2) = sub1(); # returns references
my %r1 = %{ $r1 }; # dereference as a hash
my %r2 = %{ $r2 }; # dereference as a hash
And in your subroutine:
sub sub1 {
...
return \%h1, \%h2;
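Putting the pieces together, a complete minimal sketch might look like this. The sub name two_hashes is made up for illustration; the keys mirror the ones in the question.

```perl
use strict;
use warnings;

# Return two hash references so the caller can tell the hashes apart.
sub two_hashes {
    my ( %h1, %h2 );
    $h1{$_}++ for qw(1a 1b);
    $h2{$_}++ for qw(2a 2b);
    return ( \%h1, \%h2 );
}

my ( $r1, $r2 ) = two_hashes();

# Sorted keys give a stable order; the arrow reads a value through the reference.
for my $key ( sort keys %$r1 ) {
    print "r1 key: $key, value: $r1->{$key}\n";
}
for my $key ( sort keys %$r2 ) {
    print "r2 key: $key, value: $r2->{$key}\n";
}

# Copy into a real hash only if a separate copy is needed.
my %copy = %$r1;
```

Working through the references directly avoids copying; the %copy line is only needed when the caller wants to modify the data without touching the original.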

Traversing Array of Hashes in perl

I have a structure as below:
my $var1 = [{a=>"B", c=>"D"}, {E=>"F", G=>"H"}];
Now I want to traverse the first hash and the elements in it.. How can I do it?
When I run Dumper on $var1 it shows an array, and on @var1 it shows a hash.
You iterate over the array as you would with any other array, and you'll get hash references. Then iterate over the keys of each hash as you would with a plain hash reference.
Something like:
foreach my $hash (@{$var1}) {
foreach my $key (keys %{$hash}) {
print $key, " -> ", $hash->{$key}, "\n";
}
}
First off, a note on your variable declaration: barewords to the left of => are quoted automatically, so it passes Perl's strict mode as written; the example below quotes the keys explicitly anyway.
With that in mind, a complete annotated example is given below.
use strict;
my $test = [{'a'=>'B','c'=>'D'},{'E'=>'F','G'=>'H'}];
# Note the @{ $test }
# This says "treat this scalar reference as a list".
foreach my $elem ( @{ $test } ){
# At this point $elem is a scalar reference to one of the anonymous
# hashes
#
# Same trick, except this time, we're asking Perl
# to treat the $elem reference as a reference to hash
#
# Hence, we can just call keys on it and iterate
foreach my $key ( keys %{ $elem } ){
# Finally, another bit of useful syntax for scalar references
# The arrow (->) syntax dereferences $elem for us, with no %{ } needed
print "Key -> $key = Value " . $elem->{$key} . "\n";
}
}
C:\wamp\bin\perl\bin\PERL_2~1\BASIC_~1\REVISION>type traverse.pl
my $var1=[{a=>"B", c=>"D"},{E=>"F", G=>"H"}];
foreach my $var (@{$var1}) {
foreach my $key (keys(%$var)) {
print $key, "=>", $var->{$key}, "\n";
}
print "\n";
}
C:\wamp\bin\perl\bin\PERL_2~1\BASIC_~1\REVISION>traverse.pl
c=>D
a=>B
G=>H
E=>F
$var1 = [] is a reference to an anonymous array.
Using the @ sigil before it, as in @$var1, gives you access to the array it references. So, analogous to foreach (@arr) {...}, you would write foreach (@{$var1}) {...}.
Now, the elements in the array that you have provided, @{$var1}, are anonymous (meaning not named) too, but they are anonymous hashes. So, just as with the arrayref, we use %{$hash_reference} to get access to the hash referenced by $hash_reference. Here, $hash_reference is $var.
After accessing the hash using %{$var}, it becomes easy to access the keys of the hash using keys(%$var) or keys(%{$var}). Since the result returned is a list of keys, we can use it in foreach (keys(%{$var})) {...}.
We access the scalar value inside an anonymous hash by using a key like $hash_reference->{$keyname}, that's all the code did.
In case your array contained anonymous hashes of arrays like :
$var1=[ { akey=>["b", "c"], mkey=>["n", "o"]} ];
then, this is how you will access the array values:
C:\wamp\bin\perl\bin\PERL_2012\BASIC_PERL\REVISION>type traverse.pl
my $var1=[ {akey=>["b", "c"], mkey=>["n", "o"]} ];
foreach my $var (@{$var1}) {
foreach my $key (keys(%$var)) {
foreach my $elem (@{ $var->{$key} }) {
print "$key=>$elem,";
}
print "\n...\n";
}
print "\n";
}
C:\wamp\bin\perl\bin\PERL_2012\BASIC_PERL\REVISION>traverse.pl
mkey=>n,mkey=>o,
...
akey=>b,akey=>c,
...
Practice it more and regularly, it will soon become easy for you to break complex structures into such combinations. This is how I created a large parser for another software, it is full of answers to your questions :)
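As a compact variation on the nested loops above, the traversal can be wrapped in a sub that returns strings instead of printing, with keys sorted so the order is repeatable. The name flatten and the "index.key=element" output format are assumptions made for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: flatten an array of hashes of arrays into
# "index.key=element" strings, with keys sorted for a stable order.
sub flatten {
    my ($aoh) = @_;
    my @out;
    for my $i ( 0 .. $#{$aoh} ) {
        my $hash = $aoh->[$i];                 # hash reference at position $i
        for my $key ( sort keys %{$hash} ) {
            # @{ $hash->{$key} } is the array stored under $key
            push @out, "$i.$key=$_" for @{ $hash->{$key} };
        }
    }
    return @out;
}

my $var1 = [ { akey => [ "b", "c" ], mkey => [ "n", "o" ] } ];
print "$_\n" for flatten($var1);
```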
With one peek at amon's up-voted comment above (thanks, amon!) I was able to write this little ditty:
#!/usr/bin/perl
# Given an array of hashes, print out the keys and values of each hash.
use strict; use warnings;
use Data::Dump qw(dump);
my $var1=[{A=>"B",C=>"D"},{E=>"F",G=>"H"}];
my $count = 0;
# @{$var1} is the array of hash references pointed to by $var1
foreach my $href (@{$var1})
{
print "\nArray index ", $count++, "\n";
print "=============\n";
# %{$href} is the hash pointed to by $href
foreach my $key (keys %{$href})
{
# $href->{$key} ( ALT: $$href{$key} ) is the value
# corresponding to $key in the hash pointed to by
# $href
# print $key, " => ", $href->{$key}, "\n";
print $key, " => ", $$href{$key}, "\n";
}
}
print "\nCompare with dump():\n";
dump ($var1);
print "\nJust the first hash (index 0):\n";
# $var1->[0] ( ALT: $$var1[0] ) is the first hash reference (index 0)
# in @{$var1}
# dump ($var1->[0]);
dump ($$var1[0]);
#print "\nJust the value of key A: \"", $var1->[0]->{A}, "\"\n";
#print "\nJust the value of key A: \"", $var1->[0]{A}, "\"\n";
print "\nJust the value of key A: \"", $$var1[0]{A}, "\"\n"