Adressing a hash of hashes with an array - perl

This is my problem:
I have a file-system like data-structure:
%fs = (
"home" => {
"test.file" => {
type => "file",
owner => 1000,
content => "Hello World!",
},
},
"etc" => {
"passwd" => {
type => "file",
owner => 0,
content => "testuser:testusershash",
},
"conf" => {
"test.file" => {
type => "file",
owner => 1000,
content => "Hello World!",
},
},
},
);
Now, to get the content of /etc/conf/test.file I need $fs{"etc"}{"conf"}{"test.file"}{"content"}, but my input is an array and looks like this: ("etc","conf","test.file").
So, because the length of the input is varied, I don't know how to access the values of the hash. Any ideas?

You can use a loop. In each step, you proceed one level deeper into the structure.
my #path = qw/etc conf test.file/;
my %result = %fs;
while (#path) {
%result = %{ $result{shift #path} };
}
print $result{content};
You can also use Data::Diver.

my #a = ("etc","conf","test.file");
my $h = \%fs;
while (my $v = shift #a) {
$h = $h->{$v};
}
print $h->{type};

Same logic as what others given, but uses foreach
#keys = qw(etc conf test.file content);
$r = \%fs ;
$r = $r->{$_} foreach (#keys);
print $r;

$pname = '/etc/conf/test.file';
#names = split '/', $pname;
$fh = \%fs;
for (#names) {
$fh = $fh->{"$_"} if $_;
}
print $fh->{'content'};

Path::Class accepts an array. It also gives you an object with helper methods and handles cross platform slash issues.
https://metacpan.org/module/Path::Class

You can just build the hash element expression and call eval. This is tidier if it is wrapped in a subroutine
my #path = qw/ etc conf test.file /;
print hash_at(\%fs, \#path)->{content}, "\n";
sub hash_at {
my ($hash, $path) = #_;
$path = sprintf q($hash->{'%s'}), join q('}{'), #$path;
return eval $path;
}

Related

executing a function within an array within a hash in perl

I have a Perl data structurte like so
%myhash = (
k1 => v1,
kArray => [
{
name => "anonymous hash",
...
},
\&funcThatReturnsHash,
{
name => "another anonymous hash",
...
}
]
);
Elsewhere I iterate through the list in kArray which contains a bunch of hashes. I would like to either process the actual hash OR the hash returned by the function.
foreach my $elem( #{myhash{kArray}} ) {
if (ref($elem) == "CODE") {
%thisHash = &$elem;
}
else {
%thisHash = %$elem;
}
...
}
However ref ($elem) is always scalar or undefined. I tried func, &func, \&func, \%{&func}, in %myhash to no effect.
how do I extract the hash within the function in the main body?
Apart from the code sample you give being invalid Perl, the main problems seem to be that you are using == to compare strings instead of eq, and you are assigning a hash reference to a hash variable %thishash. I assure you that ref $elem never returns SCALAR with the data you show
It would help you enormously if you followed the common advice to use strict and use warnings at the top of your code
This will work for you
for my $elem ( #{ $myhash{kArray} } ) {
my $this_hash;
if ( ref $elem eq 'CODE' ) {
$this_hash = $elem->();
}
else {
$this_hash = $elem;
}
# Do stuff with $this_hash
}
or you could just use a map like this
use strict;
use warnings;
use 5.010;
use Data::Dump;
my %myhash = (
k1 => v1,
kArray => [
{
name => "anonymous hash",
},
\&funcThatReturnsHash,
{
name => "another anonymous hash",
}
]
);
for my $hash ( map { ref eq 'CODE' ? $_->() : $_ } #{ $myhash{kArray} } ) {
say $hash->{name};
}
sub funcThatReturnsHash {
{ name => 'a third anonymous hash' };
}
output
anonymous hash
a third anonymous hash
another anonymous hash
If you turn on strict and warnings, you'll see that:
foreach my $elem(#{mynahs{kArray}}) {
Isn't valid. You need at the very least a $ before mynahs.
But given something like this - your approach works - here's an example using map to 'run' the code references:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
sub gimme_hash {
return { 'fish' => 'paste' };
}
my $stuff =
[ { 'anon1' => 'value' },
\&gimme_hash,
{ 'anon2' => 'anothervalue' }, ];
my $newstuff = [ map { ref $_ eq "CODE" ? $_->() : $_ } #$stuff ];
print Dumper $newstuff;
Turns that hash into:
$VAR1 = [
{
'anon1' => 'value'
},
{
'fish' => 'paste'
},
{
'anon2' => 'anothervalue'
}
];
But your approach does work:
foreach my $element ( #$stuff ) {
my %myhash;
if ( ref $element eq "CODE" ) {
%myhash = %{$element -> ()};
}
else {
%myhash = %$element;
}
print Dumper \%myhash;
}
Gives:
$VAR1 = {
'anon1' => 'value'
};
$VAR1 = {
'fish' => 'paste'
};
$VAR1 = {
'anon2' => 'anothervalue'
};

Equivalent of "shift" for a hash to create a $class->next() method

I almost feel like saying "it's me again!".
Anyway, here we go.
I like using while $object->next() style constructs. They appeal to me and seem "neat".
Now, when the thing I'm iterating over is an array, it's straightforward ("shift #ary or return undef")
sub next {
my ( $self, $args ) = #_;
my $next = shift #{ $self->{list_of_things} } or return undef;
my ( $car, $engine_size, $color )
= split( /\Q$opts->{fieldsep}/, $next );
$self->car = $host;
$self->engine_size = $engine_size;
$self->color = $color;
}
In this example I use AUTOLOAD to create the getters and setters and then have those instance variables available in my object during the while loop.
I'd like to do something similar but with the "list_of_things" being a %hash.
Here's a non-OO example that doesn't make it into the first iteration. Any ideas why?
(The total "list_of_things" is not that big - maybe 100 entries - so to do a keys(%{$hash}) every time doesn't seem too wasteful to me).
use strict;
use warnings;
use Data::Dumper;
my $list_of_things = {
volvo => {
color => "red",
engine_size => 2000,
},
bmw => {
color => "black",
engine_size => 2500,
},
mini => {
color => "british racing green",
engine_size => 1200,
}
};
sub next {
my $args = $_;
my #list = keys( %{$list_of_things} );
return undef if scalar #list == "0";
my $next = $list_of_things->{ $list[0] };
delete $list_of_things->{ $list[0] };
return $next;
}
while ( next()) {
print Dumper $_;
print scalar keys %{ $list_of_things }
}
Is there a better way of doing this? Am I doing something crazy?
EDIT:
I tried Ikegami's suggestion. Of course, Ikegami's example works flawlessly. When I try and abstract a little, so that all that is exposed to the object is a next->() method, I get the same "perl-going-to-100%-cpu" problem as in my original example.
Here's a non-OO example:
use Data::Dumper qw( Dumper );
sub make_list_iter {
my #list = #_;
return sub { #list ? shift(#list) : () };
}
sub next {
make_list_iter( keys %$hash );
}
my $hash = { ... };
while ( my ($k) = next->() ) {
print Dumper $hash->{$k};
}
It does not seem to get past the first step of the while() loop.
I am obviously missing something here...
If you don't want to rely on the hash's builtin iterator (used by each, keys and values), there's nothing stopping you from making your own.
use Data::Dumper qw( Dumper );
sub make_list_iter {
my #list = #_;
return sub { #list ? shift(#list) : () };
}
my $list_of_things = { ... };
my $i = make_list_iter(keys %$list_of_things);
while (my ($k) = $i->()) {
local $Data::Dumper::Terse = 1;
local $Data::Dumper::Indent = 0;
say "$k: " . Dumper($list_of_things->{$k});
}
The each operator is a builtin that iterates over hashes. It returns undef when it runs out of elements to return. So you could so something like
package SomeObject;
# creates new object instance
sub new {
my $class = shift;
return bless { hash_of_things => { #_ } }, $class
}
sub next {
my $self = shift;
my ($key,$value) = each %{ $self->{hash_of_things} };
return $key; # or return $value
}
Calling keys on the hash will reset the each iterator. It's good to know this so you can reset it on purpose:
sub reset {
my $self = shift;
keys %{ $self->{hash_of_things} }
}
and so you can avoid resetting it on accident.
The section on tie'ing hashes in perltie also has an example like this.
Here's how List::Gen could be used to create an iterator from a list:
use strict;
use warnings;
use List::Gen 'makegen';
my #list_of_things = ( # This structure is more suitable IMO
{
make => 'volvo',
color => 'red',
engine_size => 2000,
},
{
make => 'bmw',
color => 'black',
engine_size => 2500,
},
{
make => 'mini',
color => 'british racing green',
engine_size => 1200,
}
);
my $cars = makegen #list_of_things;
print $_->{make}, "\n" while $cars->next;
Well, if you don't need $list_of_things for later, you can always do something like
while(keys %$list_of_things)
{
my $temp=(sort keys %$list_of_things)[0];
print "key: $temp, value array: " . join(",",#{$list_of_things->{$temp}}) . "\n";
delete $list_of_things->{$temp};
}
And if you do need it, you can always assign it to a temporary hash reference and perform the same while loop on it.

Perl Working On Two Hash References

I would like to compare the values of two hash references.
The data dumper of my first hash is this:
$VAR1 = {
'42-MG-BA' => [
{
'chromosome' => '19',
'position' => '35770059',
'genotype' => 'TC'
},
{
'chromosome' => '2',
'position' => '68019584',
'genotype' => 'G'
},
{
'chromosome' => '16',
'position' => '9561557',
'genotype' => 'G'
},
And the second hash is similar to this but with more hashes in the array. I would like to compare the genotype of my first and second hash if the position and the choromosome matches.
map {print "$_= $cave_snp_list->{$_}->[0]->{chromosome}\n"}sort keys %$cave_snp_list;
map {print "$_= $geno_seq_list->{$_}->[0]->{chromosome}\n"}sort keys %$geno_seq_list;
I could do that for the first array of the hashes.
Could you help me in how to work for all the arrays?
This is my actual code in full
#!/software/bin/perl
use strict;
use warnings;
use Getopt::Long;
use Benchmark;
use Config::Config qw(Sequenom.ini);
useDatabase::Conn;
use Data::Dumper;
GetOptions("sam=s" => \my $sample);
my $geno_seq_list = getseqgenotypes($sample);
my $cave_snp_list = getcavemansnpfile($sample);
#print Dumper($geno_seq_list);
print scalar %$geno_seq_list, "\n";
foreach my $sam (keys %{$geno_seq_list}) {
my $seq_used = $geno_seq_list->{$sam};
my $cave_used = $cave_snp_list->{$sam};
print scalar(#$geno_seq_list->{$_}) if sort keys %$geno_seq_list, "\n";
print scalar(#$cave_used), "\n";
#foreach my $seq2com (# {$seq_used } ){
# foreach my $cave2com( # {$cave_used} ){
# print $seq2com->{chromosome},":" ,$cave2com->{chromosome},"\n";
# }
#}
map { print "$_= $cave_snp_list->{$_}->[0]->{chromosome}\n" } sort keys %$cave_snp_list;
map { print "$_= $geno_seq_list->{$_}->[0]->{chromosome}\n" } sort keys %$geno_seq_list;
}
sub getseqgenotypes {
my $snpconn;
my $gen_list = {};
$snpconn = Database::Conn->new('live');
$snpconn->addConnection(DBI->connect('dbi:Oracle:pssd.world', 'sn', 'ss', { RaiseError => 1, AutoCommit => 0 }),
'pssd');
#my $conn2 =Database::Conn->new('live');
#$conn2->addConnection(DBI->connect('dbi:Oracle:COSI.world','nst_owner','nst_owner', {RaiseError =>1 , AutoCommit=>0}),'nst');
my $id_ind = $snpconn->execute('snp::Sequenom::getIdIndforExomeSample', $sample);
my $genotype = $snpconn->executeArrRef('snp::Sequenom::getGenotypeCallsPosition', $id_ind);
foreach my $geno (#{$genotype}) {
push #{ $gen_list->{ $geno->[1] } }, {
chromosome => $geno->[2],
position => $geno->[3],
genotype => $geno->[4],
};
}
return ($gen_list);
} #end of sub getseqgenotypes
sub getcavemansnpfile {
my $nstconn;
my $caveman_list = {};
$nstconn = Database::Conn->new('live');
$nstconn->addConnection(
DBI->connect('dbi:Oracle:CANP.world', 'nst_owner', 'NST_OWNER', { RaiseError => 1, AutoCommit => 0 }), 'nst');
my $id_sample = $nstconn->execute('nst::Caveman::getSampleid', $sample);
#print "IDSample: $id_sample\n";
my $file_location = $nstconn->execute('nst::Caveman::getCaveManSNPSFile', $id_sample);
open(SNPFILE, "<$file_location") || die "Error: Cannot open the file $file_location:$!\n";
while (<SNPFILE>) {
chomp;
next if /^>/;
my #data = split;
my ($nor_geno, $tumor_geno) = split /\//, $data[5];
# array of hash
push #{ $caveman_list->{$sample} }, {
chromosome => $data[0],
position => $data[1],
genotype => $nor_geno,
};
} #end of while loop
close(SNPFILE);
return ($caveman_list);
}
The problem that I see is that you're constructing a tree for generic storage of data, when what you want is a graph, specific to the task. While you are constructing the record, you could also be constructing the part that groups data together. Below is just one example.
my %genotype_for;
my $record
= { chromosome => $data[0]
, position => $data[1]
, genotype => $nor_geno
};
push #{ $gen_list->{ $geno->[1] } }, $record;
# $genotype_for{ position }{ chromosome }{ name of array } = genotype code
$genotype_for{ $data[1] }{ $data[0] }{ $sample } = $nor_geno;
...
return ( $caveman_list, \%genotype_for );
In the main line, you receive them like so:
my ( $cave_snp_list, $geno_lookup ) = getcavemansnpfile( $sample );
This approach at least allows you to locate similar position and chromosome values. If you're going to do much with this, I might suggest an OO approach.
Update
Assuming that you wouldn't have to store the label, we could change the lookup to
$genotype_for{ $data[1] }{ $data[0] } = $nor_geno;
And then the comparison could be written:
foreach my $pos ( keys %$small_lookup ) {
next unless _HASH( my $sh = $small_lookup->{ $pos } )
and _HASH( my $lh = $large_lookup->{ $pos } )
;
foreach my $chrom ( keys %$sh ) {
next unless my $sc = $sh->{ $chrom }
and my $lc = $lh->{ $chrom }
;
print "$sc:$sc";
}
}
However, if you had limited use for the larger list, you could construct the specific case
and pass that in as a filter when creating the longer list.
Thus, in whichever loop creates the longer list, you could just go
...
next unless $sample{ $position }{ $chromosome };
my $record
= { chromosome => $chromosome
, position => $position
, genotype => $genotype
};
...

How to loop json results in Perl

I am trying to output JSON from a perl script that accesses a mysql database.
How I can I loop through my query returns and turn that into JSON using the JSON module?
When I do this all I get is 1 return
while($query_handle->fetch()) {
$jsonStructure->{event};
$jsonStructure->{event}->{evid} = $evid;
$jsonStructure->{event}->{component} = $component;
$jsonStructure->{event}->{firstTime} = $firstTime;
$jsonStructure->{event}->{lastTime} = $lastTime;
$jsonStructure->{event}->{count} = $count;
$jsonStructure->{event}->{summary} = $summary;
$jsonStructure->{event}->{severity} = $severity;
}
Basically I have many events and don't know how to say event[0]...
Thank You
I think what you're looking for is this:
push #{ $jsonStructure->{events} }, {
evid => $evid,
component => $component,
...,
};
although even that is probably overkill, because you can probably do something like:
while (my $row = $dbh->fetchrow_hashref) {
push #{ $jsonStructure->{events} }, $row;
}
if all of the column names in the DB are the same as the field names you want in the JSON, and you want all columns, or:
my #keys = qw(evid component firstTime ...);
while (my $row = $dbh->fetchrow_hashref) {
my %hash;
#hash{#keys} = #$row{#keys};
push #{ $jsonStructure->{events} }, \%hash;
}
if you only want some columns, or:
# DB colname => JSON field name
my %mapping = (
event_id => 'evid',
component => 'component',
first_time => 'firstTime',
...,
);
while (my $row = $dbh->fetchrow_hashref) {
my %hash;
#hash{ values %mapping } = #$row{ keys %mapping };
push #{ $jsonStructure->{events} }, \%hash;
}
for a completely arbitrary mapping. Power of Perl and all that. :)

Parse::RecDescent - getting information from it

I'm working with the Parse::RecDescent parser in Perl, and I seem to have the most terrible time getting information from it. The information readily available online does not seem to have non-trivial examples.
Here is the code:
event_function: object_list ':' event_list ';'
<defer:
{ #item is a special character with Parse::Recdescent.
print Dumper($item{object_list});
$return = $item[1];
}
>
| object_list ':' ';'
<defer:
{
print Dumper($item{object_list});
$return = $item[1];
}
>
Here is the output
PS W:\developers\paulnathan\rd_dir> perl parser.pl testfile
$VAR1 = 4;
$VAR1 = 8;
PS W:\developers\paulnathan\rd_dir>
The input file parses correctly.
stuff, stuff2: pre-operation event = {foo1, foo2};
It should be outputting a hash keyed by "stuff", "stuff2".
Thoughts?
edit:
object_list :
object ',' object_list
<defer:
{
my $retval = ();
$retval = ::merge_hash_refs($item[1], $item[3]);
$return = $retval;
}
>
| object
<defer:
{
#print Dumper($item{object});
$return = $item{object};
}
>
object :
'/' /[a-z0-9_][a-z0-9_]*/ '/' '...'
<defer:
{
$::objects->{$item[2]} = "stuff";
$return = $::objects;
}
>
| /[a-z0-9_][a-z0-9_]*/
<defer:
{
$::objects->{$item[1]} = "stuff";
$return = $::objects;
}
>
edit2:
Merge_hash_refs, just in case. :-)
#takes two hash references.
sub merge_hash_refs {
my($ref1, $ref2) = #_;
my $retref = ();
while( my ($k, $v) = each %$ref1 ) {
$retref->{$k} = $v;
}
while( my ($k, $v) = each %$ref2 ) {
$retref->{$k} = $v;
}
return $retref;
}
If you add a use strict to your script you'll get the fatal error Can't use string ("1") as a HASH ref while "strict refs" in use at [the call to merge_hash_refs]. It appears that the closures created by the <defer> directives are causing the contents of #item to be the ones when the production matched instead of the hashrefs eventually returned by the subrules. Removing the <defer> directives gives me this output:
$VAR1 = {
'stuff2' => 'stuff',
'stuff' => 'stuff'
};
Of course, this has the side effect that $::object is updated by successful object productions even if the higher level rules fail (including backtracking). I'd write it this way:
use strict;
use warnings;
use Parse::RecDescent;
use Data::Dumper;
my $parser = Parse::RecDescent->new(<<'EOT');
event_function: object_list ':' event_list(?) ';'
{
$return = $item[1];
}
object_list : <leftop: object ',' object>
{
$return = { map { %$_ } #{$item[1]} };
}
object :
'/' /[a-z0-9_][a-z0-9_]*/ '/' '...'
{
$return = { $item[2] => 'stuff' };
}
| /[a-z0-9_][a-z0-9_]*/
{
$return = { $item[1] => 'stuff' };
}
# stub, don't know what this should be
event_list : /[^;]+/
EOT
my %object;
while (<DATA>) {
my $x = $parser->event_function($_);
next unless $x;
# merge objects into master list
while (my ($k, $v) = each %$x) {
$object{$k} = $v;
}
}
print Dumper \%object;
__DATA__
stuff, stuff2: pre-operation event = {foo1, foo2};
stuff3, stuff4: ;
The output is:
$VAR1 = {
'stuff2' => 'stuff',
'stuff3' => 'stuff',
'stuff' => 'stuff',
'stuff4' => 'stuff'
};
Probably not an answer to your question, but when you start an each() loop through a hash, if each() had previously been used on the hash it just starts from wherever the iterator was pointing. To be safe, put a void-context keys() (e.g. keys(%$ref1);) before the while loop to reset the iterator. Older versions of Data::Dumper had a cute little bug of leaving the iterator pointing just after the last element sometimes, making the hash appear to be empty to an unsafe while(...each...) loop :)