How to loop json results in Perl - perl

I am trying to output JSON from a perl script that accesses a mysql database.
How I can I loop through my query returns and turn that into JSON using the JSON module?
When I do this all I get is 1 return
while($query_handle->fetch()) {
$jsonStructure->{event};
$jsonStructure->{event}->{evid} = $evid;
$jsonStructure->{event}->{component} = $component;
$jsonStructure->{event}->{firstTime} = $firstTime;
$jsonStructure->{event}->{lastTime} = $lastTime;
$jsonStructure->{event}->{count} = $count;
$jsonStructure->{event}->{summary} = $summary;
$jsonStructure->{event}->{severity} = $severity;
}
Basically I have many events and don't know how to say event[0]...
Thank You

I think what you're looking for is this:
push #{ $jsonStructure->{events} }, {
evid => $evid,
component => $component,
...,
};
although even that is probably overkill, because you can probably do something like:
while (my $row = $dbh->fetchrow_hashref) {
push #{ $jsonStructure->{events} }, $row;
}
if all of the column names in the DB are the same as the field names you want in the JSON, and you want all columns, or:
my #keys = qw(evid component firstTime ...);
while (my $row = $dbh->fetchrow_hashref) {
my %hash;
#hash{#keys} = #$row{#keys};
push #{ $jsonStructure->{events} }, \%hash;
}
if you only want some columns, or:
# DB colname => JSON field name
my %mapping = (
event_id => 'evid',
component => 'component',
first_time => 'firstTime',
...,
);
while (my $row = $dbh->fetchrow_hashref) {
my %hash;
#hash{ values %mapping } = #$row{ keys %mapping };
push #{ $jsonStructure->{events} }, \%hash;
}
for a completely arbitrary mapping. Power of Perl and all that. :)

Related

Printing Hash of Hash into a Matrix Table in Perl

I have a data structure like this:
#!/usr/bin/perl -w
my $hash = {
'abTcells' => {
'mesenteric_lymph_node' => {
'Itm2a' => '664.661',
'Gm16452' => '18.1425',
'Sergef' => '142.8205'
},
'spleen' => {
'Itm2a' => '58.07155',
'Dhx9' => '815.2795',
'Ssu72' => '292.889'
}
}
};
What I want to do is to print it out into this format:
mesenteric_lymph_node spleen
Itm2a 664.661 58.07155
Gm16452 18.1425 NA
Sergef 142.8205 NA
Dhx9 NA 815.2795
Ssu72 NA 292.889
What's the way to do it.
I'm currently stuck with the following code https://eval.in/44207
foreach my $ct (keys %{$hash}) {
print "$ct\n\n";
my %hash2 = %{$hash->{$ct}};
foreach my $ts (keys %hash2) {
print "$ts\n";
my %hash3 = %{$hash2{$ts}};
foreach my $gn (keys %hash3) {
print "$gn $hash3{$gn}\n";
}
}
}
Use Text::Table for output. Beautify to taste.
#!/usr/bin/env perl
use strict;
use warnings;
use Text::Table;
my $hash = {
'abTcells' => {
'mesenteric_lymph_node' => {
'Itm2a' => '664.661',
'Gm16452' => '18.1425',
'Sergef' => '142.8205'
},
'spleen' => {
'Itm2a' => '58.07155',
'Dhx9' => '815.2795',
'Ssu72' => '292.889'
}
}
};
my $struct = $hash->{abTcells};
my #cols = sort keys %{ $struct };
my #rows = sort keys %{ { map {
my $x = $_;
map { $_ => undef }
keys %{ $struct->{$x} }
} #cols } };
my $tb = Text::Table->new('', #cols);
for my $r (#rows) {
$tb->add($r, map $struct->{$_}{$r} // 'NA', #cols);
}
print $tb;
Output:
mesenteric_lymph_node spleen
Dhx9 NA 815.2795
Gm16452 18.1425 NA
Itm2a 664.661 58.07155
Sergef 142.8205 NA
Ssu72 NA 292.889
Now, the order of the rows above is different than the one you show because I wanted it to be consistent. If you know the set of all possible rows, then you can specify another order obviously.
First thing would be to separate out the two hashes:
my %lymph_node = %{ $hash->{abTcells}->{mesenteric_lymph_node} };
my %spleen = %{ $hash->{abTcells}->{spleen} };
Now, you have two separate hashes that contains the data you want.
What we need is a list of all the keys. Let's make a third hash that contains your keys.
my %keys;
map { $keys{$_} = 1; } keys %lymph_node, keys %spleen;
Now, we can go through all your keys and print the value for each of the two hashes. If one of the hashes doesn't have the data, we'll set it to NA:
for my $value ( sort keys %keys ) {
my $spleen_value;
my $lymph_nodes_value;
$spleen_value = exists $spleen{$value} ? $spleen{$value} : "NA";
$lymph_node_value = exists $lymph_node{$value} ? $lymph_node{$value} : "NA";
printf "%-20.20s %-9.5f %-9.5f\n", $key, $lymph_node_value, $spleen_value;
}
The printf statement is a nice way to tabularize data. You'll have to create the headings yourself. The ... ? ... : ... statement is an abbreviated if/then/else If the statement before the ? is true, then the value is the value between the ? and the :. Else, the value is the value after the :.
Both of your inner hashes have the same keys, So do a foreach on one of the hashes to get the key, and then print both.

Adressing a hash of hashes with an array

This is my problem:
I have a file-system like data-structure:
%fs = (
"home" => {
"test.file" => {
type => "file",
owner => 1000,
content => "Hello World!",
},
},
"etc" => {
"passwd" => {
type => "file",
owner => 0,
content => "testuser:testusershash",
},
"conf" => {
"test.file" => {
type => "file",
owner => 1000,
content => "Hello World!",
},
},
},
);
Now, to get the content of /etc/conf/test.file I need $fs{"etc"}{"conf"}{"test.file"}{"content"}, but my input is an array and looks like this: ("etc","conf","test.file").
So, because the length of the input is varied, I don't know how to access the values of the hash. Any ideas?
You can use a loop. In each step, you proceed one level deeper into the structure.
my #path = qw/etc conf test.file/;
my %result = %fs;
while (#path) {
%result = %{ $result{shift #path} };
}
print $result{content};
You can also use Data::Diver.
my #a = ("etc","conf","test.file");
my $h = \%fs;
while (my $v = shift #a) {
$h = $h->{$v};
}
print $h->{type};
Same logic as what others given, but uses foreach
#keys = qw(etc conf test.file content);
$r = \%fs ;
$r = $r->{$_} foreach (#keys);
print $r;
$pname = '/etc/conf/test.file';
#names = split '/', $pname;
$fh = \%fs;
for (#names) {
$fh = $fh->{"$_"} if $_;
}
print $fh->{'content'};
Path::Class accepts an array. It also gives you an object with helper methods and handles cross platform slash issues.
https://metacpan.org/module/Path::Class
You can just build the hash element expression and call eval. This is tidier if it is wrapped in a subroutine
my #path = qw/ etc conf test.file /;
print hash_at(\%fs, \#path)->{content}, "\n";
sub hash_at {
my ($hash, $path) = #_;
$path = sprintf q($hash->{'%s'}), join q('}{'), #$path;
return eval $path;
}

Delete value from Perl hash of arrays of hashes

I'm trying to delete values from a hash of arrays of hashes that I created with the following code:
while ((my $Genotype1, my $Fitness1) = each (%Normalisedfithash)) {
while ((my $Parent1A, my $TallyP1) = each(%P1Tallyhash)) {
my $ParentTally = 0;
my $SecondParent = {
Parent2 => $Parent1A,
Tally => $ParentTally,
};
push #{ $StoredParentshash{$Genotype1}}, $SecondParent;
I have been trying to delete values from %StoredParentshash where Tally is zero. (I have further code which updates Tally, but some are not updated and I want them removed from the hash).
I have written the following:
for my $Parent (keys %StoredParentshash) {
my $aref1 = $StoredParentshash{$Parent};
for my $hashref1 (#$aref1) {
my $Tally = $hashref1->{'Tally'};
if ($Tally == 0){
delete $hashref1->{'Tally'};
delete $hashref1->{'Parent2'};
}
}
}
This code sort of deletes the data, but when I use Data::Dumper the structure I get back looks like this:
'7412' => [
{},
{
'Tally' => 1,
'Parent2' => '2136'
},
{},
{},
{},
How can I completely remove the keys where the Tally is zero rather than being left with {}?
Thanks!
The code that you say has generated the data structure is faulty, as it is missing two closing braces.
You must show either your actual code with balanced { .. } or a dump of %StoredParentshash before we can help you properly.
If Tally and Parent2 are the only keys in the SecondParent hashes, then you should write something like
for my $children (values %StoredParentshash) {
#$children = grep $_->{Tally} != 0, #$children;
}
Your data looks like:
my %StoredParentshash = (
key1 => [
{
Tally => ...,
...
},
...
],
...
);
And you want to delete some of the array elements. Generally, I use grep for that.
#array = grep keep_condition(), #array;
Here is no exception.
for my $array (values(%StoredParentshash)) {
#$array = grep $_->{Tally}, #$array;
}
And to delete any arrays that are now empty:
for my $key (keys(%StoredParentshash)) {
delete $StoredParentshash{$key} if !#{ $StoredParentshash{$key} };
}
Or combined:
for my $key (keys(%StoredParentshash)) {
my $array = $StoredParentshash{$key};
#$array = grep $_->{Tally}, #$array;
delete $StoredParentshash{$key} if !#$array;
}

Perl Working On Two Hash References

I would like to compare the values of two hash references.
The data dumper of my first hash is this:
$VAR1 = {
'42-MG-BA' => [
{
'chromosome' => '19',
'position' => '35770059',
'genotype' => 'TC'
},
{
'chromosome' => '2',
'position' => '68019584',
'genotype' => 'G'
},
{
'chromosome' => '16',
'position' => '9561557',
'genotype' => 'G'
},
And the second hash is similar to this but with more hashes in the array. I would like to compare the genotype of my first and second hash if the position and the choromosome matches.
map {print "$_= $cave_snp_list->{$_}->[0]->{chromosome}\n"}sort keys %$cave_snp_list;
map {print "$_= $geno_seq_list->{$_}->[0]->{chromosome}\n"}sort keys %$geno_seq_list;
I could do that for the first array of the hashes.
Could you help me in how to work for all the arrays?
This is my actual code in full
#!/software/bin/perl
use strict;
use warnings;
use Getopt::Long;
use Benchmark;
use Config::Config qw(Sequenom.ini);
useDatabase::Conn;
use Data::Dumper;
GetOptions("sam=s" => \my $sample);
my $geno_seq_list = getseqgenotypes($sample);
my $cave_snp_list = getcavemansnpfile($sample);
#print Dumper($geno_seq_list);
print scalar %$geno_seq_list, "\n";
foreach my $sam (keys %{$geno_seq_list}) {
my $seq_used = $geno_seq_list->{$sam};
my $cave_used = $cave_snp_list->{$sam};
print scalar(#$geno_seq_list->{$_}) if sort keys %$geno_seq_list, "\n";
print scalar(#$cave_used), "\n";
#foreach my $seq2com (# {$seq_used } ){
# foreach my $cave2com( # {$cave_used} ){
# print $seq2com->{chromosome},":" ,$cave2com->{chromosome},"\n";
# }
#}
map { print "$_= $cave_snp_list->{$_}->[0]->{chromosome}\n" } sort keys %$cave_snp_list;
map { print "$_= $geno_seq_list->{$_}->[0]->{chromosome}\n" } sort keys %$geno_seq_list;
}
sub getseqgenotypes {
my $snpconn;
my $gen_list = {};
$snpconn = Database::Conn->new('live');
$snpconn->addConnection(DBI->connect('dbi:Oracle:pssd.world', 'sn', 'ss', { RaiseError => 1, AutoCommit => 0 }),
'pssd');
#my $conn2 =Database::Conn->new('live');
#$conn2->addConnection(DBI->connect('dbi:Oracle:COSI.world','nst_owner','nst_owner', {RaiseError =>1 , AutoCommit=>0}),'nst');
my $id_ind = $snpconn->execute('snp::Sequenom::getIdIndforExomeSample', $sample);
my $genotype = $snpconn->executeArrRef('snp::Sequenom::getGenotypeCallsPosition', $id_ind);
foreach my $geno (#{$genotype}) {
push #{ $gen_list->{ $geno->[1] } }, {
chromosome => $geno->[2],
position => $geno->[3],
genotype => $geno->[4],
};
}
return ($gen_list);
} #end of sub getseqgenotypes
sub getcavemansnpfile {
my $nstconn;
my $caveman_list = {};
$nstconn = Database::Conn->new('live');
$nstconn->addConnection(
DBI->connect('dbi:Oracle:CANP.world', 'nst_owner', 'NST_OWNER', { RaiseError => 1, AutoCommit => 0 }), 'nst');
my $id_sample = $nstconn->execute('nst::Caveman::getSampleid', $sample);
#print "IDSample: $id_sample\n";
my $file_location = $nstconn->execute('nst::Caveman::getCaveManSNPSFile', $id_sample);
open(SNPFILE, "<$file_location") || die "Error: Cannot open the file $file_location:$!\n";
while (<SNPFILE>) {
chomp;
next if /^>/;
my #data = split;
my ($nor_geno, $tumor_geno) = split /\//, $data[5];
# array of hash
push #{ $caveman_list->{$sample} }, {
chromosome => $data[0],
position => $data[1],
genotype => $nor_geno,
};
} #end of while loop
close(SNPFILE);
return ($caveman_list);
}
The problem that I see is that you're constructing a tree for generic storage of data, when what you want is a graph, specific to the task. While you are constructing the record, you could also be constructing the part that groups data together. Below is just one example.
my %genotype_for;
my $record
= { chromosome => $data[0]
, position => $data[1]
, genotype => $nor_geno
};
push #{ $gen_list->{ $geno->[1] } }, $record;
# $genotype_for{ position }{ chromosome }{ name of array } = genotype code
$genotype_for{ $data[1] }{ $data[0] }{ $sample } = $nor_geno;
...
return ( $caveman_list, \%genotype_for );
In the main line, you receive them like so:
my ( $cave_snp_list, $geno_lookup ) = getcavemansnpfile( $sample );
This approach at least allows you to locate similar position and chromosome values. If you're going to do much with this, I might suggest an OO approach.
Update
Assuming that you wouldn't have to store the label, we could change the lookup to
$genotype_for{ $data[1] }{ $data[0] } = $nor_geno;
And then the comparison could be written:
foreach my $pos ( keys %$small_lookup ) {
next unless _HASH( my $sh = $small_lookup->{ $pos } )
and _HASH( my $lh = $large_lookup->{ $pos } )
;
foreach my $chrom ( keys %$sh ) {
next unless my $sc = $sh->{ $chrom }
and my $lc = $lh->{ $chrom }
;
print "$sc:$sc";
}
}
However, if you had limited use for the larger list, you could construct the specific case
and pass that in as a filter when creating the longer list.
Thus, in whichever loop creates the longer list, you could just go
...
next unless $sample{ $position }{ $chromosome };
my $record
= { chromosome => $chromosome
, position => $position
, genotype => $genotype
};
...

Mapping values with Column header and row header

I have some files with below data.
sample File 1:
sitename1,2009-07-19,"A1",11975,17.23
sitename1,2009-07-19,"A2",11,0.02
sitename1,2009-07-20,"A1",2000,17.23
sitename1,2009-07-20,"A2",538,0.02
I want to map the values in column 4 with column 2 and 3 as shown below.
Output required.
Site,Type,2009-07-19,2009-07-20
sitename1,"A1",11975,2000
sitename1,"A2",11,538
Here is what I have tried so far:
#! /usr/bin/perl -w
use strict;
use warnings;
my $column_header=["Site,Type"];
my $position={};
my $last_position=0;
my $current_event=[];
my $events=[];
while (<STDIN>) {
my ($site,$date,$type,$value,$percent) = split /[,\n]/, $_;
my $event_key = $date;
if (not defined $position->{$event_key}) {
$last_position+=1;
$position->{$event_key}=$last_position;
push #$column_header,$event_key;
}
my $pos = $position->{$event_key};
if (defined $current_event->[$pos]) {
dumpEvent();
}
if (not defined $current_event->[0]) {
$current_event->[0]="$site,$type";
}
$current_event->[$pos]=$value;
}
dumpEvent();
my $order = [];
for (my $scan=0; $scan<scalar(#$column_header); $scan++) {
push #$order,$scan;
}
printLine($column_header);
map { printLine($_) } #$events;
sub printLine {
my $record=shift;
my #result=();
foreach my $offset (#$order) {
if (defined $record->[$offset]) {
push #result,$record->[$offset];
} else {
push #result,"";
}
}
print join(",",#result)."\n";
}
sub dumpEvent {
return unless defined $current_event->[0];
push #$events,$current_event;
$current_event=[];
}
The output i am getting is as below.
*Site,Type,2009-07-19,2009-07-20*
sitename1,"A1",11975,
sitename1,"A2",11,
sitename1,"A1",,14620
sitename1,"A2",,538
If I understand you correctly (and I have to admit I'm only guessing), you have several types of things at different dates and a value for each. Thus you need a data structure like this hash for each site:
$foo = {
site => 'sitename1',
type => 'A1',
dates => [
{
date => '2009-07-19',
value => 11975,
},
{
date => '2009-07-20',
value => 538,
},
],
};
Is that even close?
The folowing code produces the expected result and makes "some" sense. I don't know if it makes real sense.
my %dates;
my %SiteType;
while (<DATA>) {
chomp;
my ($site,$date,$type,$value,$percent) = split /,/;
$dates{$date} = '1';
push #{$SiteType{"$site,$type"}}, $value ;
};
print 'Site,Type,', join(',', sort keys %dates), "\n";
foreach ( sort keys %SiteType) {
print $_, ',', join(',', #{$SiteType{$_}}), "\n";
};