odd number of elements in anonymous hash - perl

I'm trying to understand this Perl code...
If there is one stream it works, if there are 2 or more streams it warns with odd number of elements in anonymous hash. It seems to return an array in that case. How do I add the array elements correctly to #streams? It appears to add correctly for the HASH case in the if clause. Is the else clause bunk?
my $x = $viewedProjectDataObj->{streams};
if (ref($x) eq 'HASH') {
push(#streams, $x->{id});
} elsif (ref($x) eq 'ARRAY') {
print "$x\n";
print "#$x\n";
my #array = #$x;
foreach my $obj (#array) {
print "in $obj\n";
print Dumper( $obj);
push(#streams, ($obj->{id}) );
}
}
print "streamcount " . #streams % 2;
print Dumper(#streams);
my $stream_defect_filter_spec = {
'streamIdList' => #streams,
'includeDefectInstances' => 'true',
'includeHistory' => 'true',
};
my #streamDefects = $WS->get_stream_defects($defectProxy, \#cids, $stream_defect_filter_spec);
print Dumper(#streamDefects);
I'm adding the next lines...
if ($defectSummary->{owner} eq "Various") {
foreach (#streamDefects) {
if (exists($_->{owner})) {
$defectSummary->{owner} = $_->{owner};
last;
}
}
}
my $diref = $streamDefects[0]->{defectInstances};
if ($diref) {
my $defectInstance;
if (ref($diref) eq 'HASH') {
$defectInstance = $diref;
} elsif (ref($diref) eq 'ARRAY') {
$defectInstance = #{$diref}[0];
} else {
die "Unable to handle $diref (".ref($diref).")";
}
It now errors with
Web API returned error code S:Server: calling getStreamDefects: No stream found
for name null.
$VAR1 = -1;
me
Can't use string ("-1") as a HASH ref while "strict refs" in use at xyz-handler.pl line 317.
some Dumper output
$VAR1 = {
'streamIdList' => [
{
'name' => 'asdfasdfadsfasdfa'
},
{
'name' => 'cpp-62bad47d63cfb25e76b29a4801c61d8d'
}
],
'includeDefectInstances' => 'true',
'includeHistory' => 'true'
};

The list assigned to a hash is a set of key/value pairs, which is why the number of elements must be even.
Because the => operator is little more than a comma, and the #streams array is flattened in the list, this
my $stream_defect_filter_spec = {
'streamIdList' => #streams,
'includeDefectInstances' => 'true',
'includeHistory' => 'true',
};
is equivalent to this
my $stream_defect_filter_spec = {
'streamIdList' => $streams[0],
$streams[1] => $streams[2],
$streams[3] => $streams[4],
...
'includeDefectInstances' => 'true',
'includeHistory' => 'true',
};
so I hope you can see that you will get the warning if you have an even number of elements in the array.
To fix things you need the value of the hash element to be an array reference, which is a scalar and won't upset the scheme of things
my $stream_defect_filter_spec = {
'streamIdList' => \#streams,
'includeDefectInstances' => 'true',
'includeHistory' => 'true',
};
that way you can access the array elements as
$stream_defect_filter_spec->{streamIdList}[0]
etc.
And by the way you can tidy up your code substantially by letting map do what it's good at:
if (ref $x eq 'HASH') {
push #streams, $x->{id};
}
elsif (ref $x eq 'ARRAY') {
push #streams, map $_->{id}, #$x;
}

The assignment in:
my $stream_defect_filter_spec = {
'streamIdList' => #streams, # <---- THIS ONE
'includeDefectInstances' => 'true',
'includeHistory' => 'true',
};
is not correct, you get hash keys from the 1 3 5th ... array element.
You probably want assign a reference to array, not the array itself:
'streamIdList' => \#streams,
example for the unwanted (as in your code):
use strict;
use warnings;
use Data::Dump;
my #z = qw(a b c x y z);
dd \#z;
my $q = {
'aa' => #z,
};
dd $q;
unwanted result:
["a", "b", "c", "x", "y", "z"]
Odd number of elements in anonymous hash at a line 12.
{ aa => "a", b => "c", x => "y", z => undef }
^-here
Example of assign a reference
use strict;
use warnings;
use Data::Dump;
my #z = qw(a b c x y z);
dd \#z;
my $q = {
'aa' => \#z,
};
dd $q;
produces:
["a", "b", "c", "x", "y", "z"]
{ aa => ["a", "b", "c", "x", "y", "z"] }
The difference is clearly visible.

Related

executing a function within an array within a hash in perl

I have a Perl data structurte like so
%myhash = (
k1 => v1,
kArray => [
{
name => "anonymous hash",
...
},
\&funcThatReturnsHash,
{
name => "another anonymous hash",
...
}
]
);
Elsewhere I iterate through the list in kArray which contains a bunch of hashes. I would like to either process the actual hash OR the hash returned by the function.
foreach my $elem( #{myhash{kArray}} ) {
if (ref($elem) == "CODE") {
%thisHash = &$elem;
}
else {
%thisHash = %$elem;
}
...
}
However ref ($elem) is always scalar or undefined. I tried func, &func, \&func, \%{&func}, in %myhash to no effect.
how do I extract the hash within the function in the main body?
Apart from the code sample you give being invalid Perl, the main problems seem to be that you are using == to compare strings instead of eq, and you are assigning a hash reference to a hash variable %thishash. I assure you that ref $elem never returns SCALAR with the data you show
It would help you enormously if you followed the common advice to use strict and use warnings at the top of your code
This will work for you
for my $elem ( #{ $myhash{kArray} } ) {
my $this_hash;
if ( ref $elem eq 'CODE' ) {
$this_hash = $elem->();
}
else {
$this_hash = $elem;
}
# Do stuff with $this_hash
}
or you could just use a map like this
use strict;
use warnings;
use 5.010;
use Data::Dump;
my %myhash = (
k1 => v1,
kArray => [
{
name => "anonymous hash",
},
\&funcThatReturnsHash,
{
name => "another anonymous hash",
}
]
);
for my $hash ( map { ref eq 'CODE' ? $_->() : $_ } #{ $myhash{kArray} } ) {
say $hash->{name};
}
sub funcThatReturnsHash {
{ name => 'a third anonymous hash' };
}
output
anonymous hash
a third anonymous hash
another anonymous hash
If you turn on strict and warnings, you'll see that:
foreach my $elem(#{mynahs{kArray}}) {
Isn't valid. You need at the very least a $ before mynahs.
But given something like this - your approach works - here's an example using map to 'run' the code references:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
sub gimme_hash {
return { 'fish' => 'paste' };
}
my $stuff =
[ { 'anon1' => 'value' },
\&gimme_hash,
{ 'anon2' => 'anothervalue' }, ];
my $newstuff = [ map { ref $_ eq "CODE" ? $_->() : $_ } #$stuff ];
print Dumper $newstuff;
Turns that hash into:
$VAR1 = [
{
'anon1' => 'value'
},
{
'fish' => 'paste'
},
{
'anon2' => 'anothervalue'
}
];
But your approach does work:
foreach my $element ( #$stuff ) {
my %myhash;
if ( ref $element eq "CODE" ) {
%myhash = %{$element -> ()};
}
else {
%myhash = %$element;
}
print Dumper \%myhash;
}
Gives:
$VAR1 = {
'anon1' => 'value'
};
$VAR1 = {
'fish' => 'paste'
};
$VAR1 = {
'anon2' => 'anothervalue'
};

Converting HoA to HoH with counting

Have this code:
use 5.020;
use warnings;
use Data::Dumper;
my %h = (
k1 => [qw(aa1 aa2 aa1)],
k2 => [qw(ab1 ab2 ab3)],
k3 => [qw(ac1 ac1 ac1)],
);
my %h2;
for my $k (keys %h) {
$h2{$k}{$_}++ for (#{$h{$k}});
}
say Dumper \%h2;
produces:
$VAR1 = {
'k1' => {
'aa2' => 1,
'aa1' => 2
},
'k3' => {
'ac1' => 3
},
'k2' => {
'ab1' => 1,
'ab3' => 1,
'ab2' => 1
}
};
Is possible to write the above code with "another way"? (e.g. simpler or more compact)?
Honestly, I don't like the number of times $h2{$k} is evaluated.
my %h2;
for my $k (keys %h) {
my $src = $h{$k};
my $dst = $h2{$k} = {};
++$dst->{$_} for #$src;
}
A subroutine can help make the intent more obvious. Maybe.
sub counts { my %c; ++$c{$_} for #_; \%c }
$h2{$_} = counts(#{ $h{$_} }) for keys %h;
That can be simplified if you do the change in-place.
sub counts { my %c; ++$c{$_} for #_; \%c }
$_ = counts(#$_) for values %h;

Perl adding Lines into a Multi-Dimensional Hash

Hello I want to split a Line and add the Values in to a multi dimensional Hash. This is how the Lines look like:
__DATA__
49839382;Test1;bgsae;npvxs
49839384;Test2;bgsae;npvxs
49839387;Test3;bgsae;npvxs
So what I am doing now is:
my %prefix = map { chomp; split ';' } <DATA>;
But now I can only access Test1 with:
print $prefix{"49839382"}
But how can I also add the bgsae to the Hash so I can access is with
$prefix{"49839382"}{"Test1"}
Thank you for your help.
What structure are you trying to build?
use Data::Dumper;
my %prefix = map { chomp (my #fields = split /;/); $fields[0] => { #fields[1 .. $#fields] } } <DATA>;
print Dumper \%prefix;
Output:
$VAR1 = {
'49839384' => {
'Test2' => 'bgsae',
'npvxs' => undef
},
'49839382' => {
'Test1' => 'bgsae',
'npvxs' => undef
},
'49839387' => {
'npvxs' => undef,
'Test3' => 'bgsae'
}
};
Or do you need a deeper hash?
my %prefix;
for (<DATA>) {
chomp;
my $ref = \%prefix;
for (split /;/) {
warn "[$_]";
$ref->{$_} = {};
$ref = $ref->{$_};
}
}
Returns:
$VAR1 = {
'49839384' => {
'Test2' => {
'bgsae' => {
'npvxs' => {}
}
}
},
'49839382' => {
'Test1' => {
'bgsae' => {
'npvxs' => {}
}
}
},
'49839387' => {
'Test3' => {
'bgsae' => {
'npvxs' => {}
}
}
}
};
I don't know what you need the data for, but at a guess you want something more like this.
It builds a hash of arrays, using the first field as the key for the data, and the remaining three in an array for the value. So you can access the test number as $data{'49839382'}[0] etc.
use strict;
use warnings;
my %data = map {
chomp;
my #fields = split /;/;
shift #fields => \#fields;
} <DATA>;
use Data::Dumper;
print Data::Dumper->Dump([\%data], ['*data']);
__DATA__
49839382;Test1;bgsae;npvxs
49839384;Test2;bgsae;npvxs
49839387;Test3;bgsae;npvxs
output
%data = (
'49839384' => [
'Test2',
'bgsae',
'npvxs'
],
'49839382' => [
'Test1',
'bgsae',
'npvxs'
],
'49839387' => [
'Test3',
'bgsae',
'npvxs'
]
);

not able to access hash of hash of array values

I have written the following code in Perl. The code is reading a pdb file and getting some values. Ignore the top part of the code,where everything is working perfect.
Problem is in the sub-routine part, where I try to store arrays in the hash3 with model as key another key position
the array values can be accessed inside the if condition using this :
$hash3{$model}{$coordinates}[1].
but when I go out of all foreach loop and try to access the elements I only get one value.
Please look at the end foreach loop and tell me is it the wrong way to access the hash values.
The pdb file I am using can be downloaded from this link http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=1NZS
#!/usr/bin/perl
open(IN,$ARGV[0]);
my #phosphosites;
my $model=1;
my %hash3;
while(<IN>)
{
#findmod(#line);
#finddist;
#findfreq;
if((/^MODRES/) && (/PHOSPHO/))
{
#line=split;
push(#phosphosites, $line[2]);
#print "$line[4]";
}
foreach $elements (#phosphosites){
if(/^HETATM\s+\d+\s+CA\s+$i/)
{
#line1=split;
#print "$line1[5]";
#print "$line1[6] $line1[7] $line1[8]\n";
push(#phosphositesnum, $line1[5]);
}
}
$pos=$line1[5];
#findspatial(\#line,\#line1);
}
my #ori_data=removeDuplicates(#phosphositesnum);
sub removeDuplicates {
my %seen = ();
my #vals = ();
foreach my $i (#_) {
unless ($seen{$i}) {
push #vals, $i;
$seen{$i} = 1;
}
}
return #vals;
}
$a=(#phosphosites);
print "$a\n";
print "#phosphosites\n";
print "#ori_data\n";
close(IN);
open(IN1,$ARGV[0]);
my (#data)=<IN1>;
spatial(\#ori_data);
sub spatial {
my #spatial_array1=#{$_[0]};
foreach $coordinates(#spatial_array1)
{
$model=1;
{foreach $data1(#data){
if($data1=~ m/^HETATM\s+\d+\s+CA\s+[A-Z]*\s+[A-Z]*\s+$coordinates/)
{
#cordivals=split(/\s+/,$data1);
push #{ $sphash{$model} },[$cordivals[6], $cordivals[7], $cordivals[8]];
$hash3{$model}{$coordinates}= \#cordivals;
#print "$model $coordinates $hash3{$model}{$coordinates}[6] $hash3{$model}{$coordinates}[7] $hash3{$model}{$coordinates}[8]\n";
#print "$model $sphash{$model}[$i][0] $sphash{$model}[$i][1] $sphash{$model}[$i][2]\n";
}
elsif($data1=~ m/^ENDMDL/)
{
$model++;
}
#print "$model $coordinates $hash3{$model}{$coordinates}[6] $hash3{$model}{$coordinates}[7] $hash3{$model}{$coordinates}[8]\n";
}
}
}
#foreach $z1 (sort keys %hash3)
# {
# foreach $z2(#spatial_array1){
# print "$z1 $z2";
# print "$hash3{$z1}{$z2}[6]\n";
# print "$z2\n";
# }
# }
}
After using the Data::Dumper option it is giving me this kind of output
$VAR1 = {
'11' => {
'334' => [
'HETATM',
'115',
'CA',
'SEP',
'A',
'343',
'-0.201',
'-2.884',
'1.022',
'1.00',
'99.99',
'C'
],
'342' => $VAR1->{'11'}{'334'},
'338' => $VAR1->{'11'}{'334'},
'335' => $VAR1->{'11'}{'334'},
'340' => $VAR1->{'11'}{'334'},
'343' => $VAR1->{'11'}{'334'},
'336' => $VAR1->{'11'}{'334'}
},
'7' => {
'334' => $VAR1->{'11'}{'334'},
'342' => $VAR1->{'11'}{'334'},
'338' => $VAR1->{'11'}{'334'},
'335' => $VAR1->{'11'}{'334'},
'340' => $VAR1->{'11'}{'334'},
'343' => $VAR1->{'11'}{'334'},
'336' => $VAR1->{'11'}{'334'}
},
'2' => {
'334' => $VAR1->{'11'}{'334'},
'342' => $VAR1->{'11'}{'334'},
...
Change:
#cordivals=split(/\s+/,$data1);
to:
my #cordivals=split(/\s+/,$data1);
What seems to be happening is that all the hash elements contain references to the same array variable, because you're not making the variable local to that iteration.
In general, you should use my with all variables.

Perl Working On Two Hash References

I would like to compare the values of two hash references.
The data dumper of my first hash is this:
$VAR1 = {
'42-MG-BA' => [
{
'chromosome' => '19',
'position' => '35770059',
'genotype' => 'TC'
},
{
'chromosome' => '2',
'position' => '68019584',
'genotype' => 'G'
},
{
'chromosome' => '16',
'position' => '9561557',
'genotype' => 'G'
},
And the second hash is similar to this but with more hashes in the array. I would like to compare the genotype of my first and second hash if the position and the choromosome matches.
map {print "$_= $cave_snp_list->{$_}->[0]->{chromosome}\n"}sort keys %$cave_snp_list;
map {print "$_= $geno_seq_list->{$_}->[0]->{chromosome}\n"}sort keys %$geno_seq_list;
I could do that for the first array of the hashes.
Could you help me in how to work for all the arrays?
This is my actual code in full
#!/software/bin/perl
use strict;
use warnings;
use Getopt::Long;
use Benchmark;
use Config::Config qw(Sequenom.ini);
useDatabase::Conn;
use Data::Dumper;
GetOptions("sam=s" => \my $sample);
my $geno_seq_list = getseqgenotypes($sample);
my $cave_snp_list = getcavemansnpfile($sample);
#print Dumper($geno_seq_list);
print scalar %$geno_seq_list, "\n";
foreach my $sam (keys %{$geno_seq_list}) {
my $seq_used = $geno_seq_list->{$sam};
my $cave_used = $cave_snp_list->{$sam};
print scalar(#$geno_seq_list->{$_}) if sort keys %$geno_seq_list, "\n";
print scalar(#$cave_used), "\n";
#foreach my $seq2com (# {$seq_used } ){
# foreach my $cave2com( # {$cave_used} ){
# print $seq2com->{chromosome},":" ,$cave2com->{chromosome},"\n";
# }
#}
map { print "$_= $cave_snp_list->{$_}->[0]->{chromosome}\n" } sort keys %$cave_snp_list;
map { print "$_= $geno_seq_list->{$_}->[0]->{chromosome}\n" } sort keys %$geno_seq_list;
}
sub getseqgenotypes {
my $snpconn;
my $gen_list = {};
$snpconn = Database::Conn->new('live');
$snpconn->addConnection(DBI->connect('dbi:Oracle:pssd.world', 'sn', 'ss', { RaiseError => 1, AutoCommit => 0 }),
'pssd');
#my $conn2 =Database::Conn->new('live');
#$conn2->addConnection(DBI->connect('dbi:Oracle:COSI.world','nst_owner','nst_owner', {RaiseError =>1 , AutoCommit=>0}),'nst');
my $id_ind = $snpconn->execute('snp::Sequenom::getIdIndforExomeSample', $sample);
my $genotype = $snpconn->executeArrRef('snp::Sequenom::getGenotypeCallsPosition', $id_ind);
foreach my $geno (#{$genotype}) {
push #{ $gen_list->{ $geno->[1] } }, {
chromosome => $geno->[2],
position => $geno->[3],
genotype => $geno->[4],
};
}
return ($gen_list);
} #end of sub getseqgenotypes
sub getcavemansnpfile {
my $nstconn;
my $caveman_list = {};
$nstconn = Database::Conn->new('live');
$nstconn->addConnection(
DBI->connect('dbi:Oracle:CANP.world', 'nst_owner', 'NST_OWNER', { RaiseError => 1, AutoCommit => 0 }), 'nst');
my $id_sample = $nstconn->execute('nst::Caveman::getSampleid', $sample);
#print "IDSample: $id_sample\n";
my $file_location = $nstconn->execute('nst::Caveman::getCaveManSNPSFile', $id_sample);
open(SNPFILE, "<$file_location") || die "Error: Cannot open the file $file_location:$!\n";
while (<SNPFILE>) {
chomp;
next if /^>/;
my #data = split;
my ($nor_geno, $tumor_geno) = split /\//, $data[5];
# array of hash
push #{ $caveman_list->{$sample} }, {
chromosome => $data[0],
position => $data[1],
genotype => $nor_geno,
};
} #end of while loop
close(SNPFILE);
return ($caveman_list);
}
The problem that I see is that you're constructing a tree for generic storage of data, when what you want is a graph, specific to the task. While you are constructing the record, you could also be constructing the part that groups data together. Below is just one example.
my %genotype_for;
my $record
= { chromosome => $data[0]
, position => $data[1]
, genotype => $nor_geno
};
push #{ $gen_list->{ $geno->[1] } }, $record;
# $genotype_for{ position }{ chromosome }{ name of array } = genotype code
$genotype_for{ $data[1] }{ $data[0] }{ $sample } = $nor_geno;
...
return ( $caveman_list, \%genotype_for );
In the main line, you receive them like so:
my ( $cave_snp_list, $geno_lookup ) = getcavemansnpfile( $sample );
This approach at least allows you to locate similar position and chromosome values. If you're going to do much with this, I might suggest an OO approach.
Update
Assuming that you wouldn't have to store the label, we could change the lookup to
$genotype_for{ $data[1] }{ $data[0] } = $nor_geno;
And then the comparison could be written:
foreach my $pos ( keys %$small_lookup ) {
next unless _HASH( my $sh = $small_lookup->{ $pos } )
and _HASH( my $lh = $large_lookup->{ $pos } )
;
foreach my $chrom ( keys %$sh ) {
next unless my $sc = $sh->{ $chrom }
and my $lc = $lh->{ $chrom }
;
print "$sc:$sc";
}
}
However, if you had limited use for the larger list, you could construct the specific case
and pass that in as a filter when creating the longer list.
Thus, in whichever loop creates the longer list, you could just go
...
next unless $sample{ $position }{ $chromosome };
my $record
= { chromosome => $chromosome
, position => $position
, genotype => $genotype
};
...