How to group hash of numbers that are equals - perl

I'm kinda new to programming and Perl... I would say I could do that in C like language, but I've got no idea how to do that in Perl.
To simplify things...
my $sql = [
{ ID => 1 },
{ ID => 2 },
{ ID => 2 },
{ ID => 2 },
{ ID => 3 }
];
my $number = 0;
foreach( #{ $sql } ) {
my #array = ();
if ( $number != $_->{ ID } ) {
push #array, $number;
if( $number ) {
push #array, '(' . $_->{ ID } . ')'; #how should I cycle that ?
}
}
$number = $_->{ ID };
print("#array");
}
I somehow need to get an output like this 1(2)(2)(2)3. I don't want to work with real code here, so I made a simple example how I use it now. I would be really grateful for help.
This code gives me 01 (2)2 (3) which is wrong. I've been thinking about it about for multiple hours so my results are getting worse and worse because I'm stuck and looping through the same mistakes.

Welcome to Perl and programming in general.
One straightforward way to do this is to count the number of times each value appears. If only once then print it without parentheses. If more than once then print it with parentheses and repeat according to the count for that value. I'm using the 'x' operator to repeat a string. '*' x 3 would give you '***' for example.
use strict;
use warnings;
my $sql = [
{ ID => 1 },
{ ID => 2 },
{ ID => 2 },
{ ID => 2 },
{ ID => 3 }
];
my $number = 0;
my %count_values;
foreach( #{ $sql } ) {
# make the number the key, count how many times it appears.
$count_values{$_->{ID}}++;
}
my $output_string = '';
foreach my $key_number ( sort {$a<=>$b} keys %count_values ){
# if just one occurance then don't add parentheses.
if ($count_values{$key_number} == 1){
$output_string .= $key_number;
} else {
$output_string .= "($key_number)" x $count_values{$key_number};
}
}
print $output_string;

Related

Perl - "Complex" Data Structure

I'm trying to get a workable data structure that I can pull the element values from in a sensible fashion. Just having great difficulty working with the data once its in the structure. This is how the struct is built:
sub hopCompare
{
my %count;
my %master;
my $index = 0;
foreach my $objPath (#latest) #get Path object out of master array
{
my #path = #{$objPath->_getHopList()}; #dereferencing
my $iter = 0;
foreach my $hop (#path)
{
++$count{$hop}->{FREQ};
$count{$hop}->{INDEX} = $index;
$count{$hop}->{NODE} = $hop;
$index++;
}
$index = 0;
}
foreach my $element( keys %count )
{
if (defined($count{$element}->{NODE}))
{
my $curr = $count{$element}->{INDEX};
my $freq = $count{$element}->{FREQ};
if (($freq > 1) || ($count{$element}->{INDEX} =~ /[0-1]/))
{
push #{ $master{$curr} }, {$count{$element}->{NODE}, {FREQ => $count{$element}->{FREQ}}};
}
print "$element = $count{$element}\n";
print "$element Index = $count{$element}->{INDEX}\n";
}
}
print "\n Master contains: \n" . Dumper (%master);
if (%master){return %master;} else {die "NO FINAL HOPS MATCHED";}
}
Producing this structure:
%Master contains:
$VAR1 = '4';
$VAR2 = [
{
'1.1.1.2' => {
'FREQ' => 2
}
}
];
$VAR3 = '1';
$VAR4 = [
{
'1.1.1.9' => {
'FREQ' => 5
}
},
{
'1.1.1.8' => {
'FREQ' => 1
}
}
];
{truncated}
Although ideally the structure should look like this but I had even less joy trying to pull data out at sub identifyNode:
$VAR1 = {
'1' => [
{
'1.1.1.9' => {
'FREQ' => 5
}
},
{
'1.1.5.8' => {
'FREQ' => 1
}
}
],
Then to get back at the data in another method I'm using:
sub identifyNode
{
my %hops = %{$_[0]};
my $i = 0;
foreach my $h ( keys %hops ) #The HOP-INDEX is the key
{
print "\n\$h looks like \n" . Dumper ($hops{$h});
my %host = %{ $hops{$h}[0] }; #Push the first HASH in INDEX to the %host HASH
foreach my $hip (keys %host)
{
my $corelink = `corelinks $hip`;
my ($node) = $corelink =~ /([a-z0-9-]+),[a-z0-9-\/]+,$hip/s;
print "\n\t\t\tHostname is $node\n";
}
$i++;
}
}
This then generates:
$h looks like
$VAR1 = [
{
'1.1.1.2' => {
'FREQ' => 2
}
}
];
Hostname is blabla-bla-a1
$h looks like
$VAR1 = [
{
'1.1.1.9' => {
'FREQ' => 5
}
},
{
'1.1.1.8' => {
'FREQ' => 1
}
}
];
Hostname is somew-some-a1
So for each hash in $h only the topmost host gets evaluated and hostname returned. This is because it is told to do so by the [0] in line:
my %host = %{ $hops{$h}[0] };
I've played around with different data structures and de-referencing the structure a multitude of ways and this is the only halfway house I've found...
(The IPs have been obfuscated so are not consistent in my examples)
Thanks for your advice it got me halfway there. It works now (in still somewhat a convoluted fashion!) :
sub identifyNode
{
my %hops = %{$_[0]};
my $i = 0;
my #fin_nodes;
my $hindex;
foreach my $h ( keys %hops ) #The HOP-INDEX is the key
{
$hindex = $h;
foreach my $e (#{$hops{$h}}) #first part of solution credit Zdim
{
my #host = %{ $e }; #second part of solution
my $hip = $host[0];
my $corelink = `corelinks $hip`;
my ($node) = $corelink =~ /([a-z0-9-]+),[a-z0-9-\/]+,$hip/s;
print "\n\t\t\tHostname is $node\n";
push (#fin_nodes, [$node, $hindex, $e->{$hip}->{FREQ}]);
}
$i++;
}
return (\#fin_nodes);
}
Am I brave enough to add the data as a hash to #fin_nodes.. hmm

Printing Hash of Hash into a Matrix Table in Perl

I have a data structure like this:
#!/usr/bin/perl -w
my $hash = {
'abTcells' => {
'mesenteric_lymph_node' => {
'Itm2a' => '664.661',
'Gm16452' => '18.1425',
'Sergef' => '142.8205'
},
'spleen' => {
'Itm2a' => '58.07155',
'Dhx9' => '815.2795',
'Ssu72' => '292.889'
}
}
};
What I want to do is to print it out into this format:
mesenteric_lymph_node spleen
Itm2a 664.661 58.07155
Gm16452 18.1425 NA
Sergef 142.8205 NA
Dhx9 NA 815.2795
Ssu72 NA 292.889
What's the way to do it.
I'm currently stuck with the following code https://eval.in/44207
foreach my $ct (keys %{$hash}) {
print "$ct\n\n";
my %hash2 = %{$hash->{$ct}};
foreach my $ts (keys %hash2) {
print "$ts\n";
my %hash3 = %{$hash2{$ts}};
foreach my $gn (keys %hash3) {
print "$gn $hash3{$gn}\n";
}
}
}
Use Text::Table for output. Beautify to taste.
#!/usr/bin/env perl
use strict;
use warnings;
use Text::Table;
my $hash = {
'abTcells' => {
'mesenteric_lymph_node' => {
'Itm2a' => '664.661',
'Gm16452' => '18.1425',
'Sergef' => '142.8205'
},
'spleen' => {
'Itm2a' => '58.07155',
'Dhx9' => '815.2795',
'Ssu72' => '292.889'
}
}
};
my $struct = $hash->{abTcells};
my #cols = sort keys %{ $struct };
my #rows = sort keys %{ { map {
my $x = $_;
map { $_ => undef }
keys %{ $struct->{$x} }
} #cols } };
my $tb = Text::Table->new('', #cols);
for my $r (#rows) {
$tb->add($r, map $struct->{$_}{$r} // 'NA', #cols);
}
print $tb;
Output:
mesenteric_lymph_node spleen
Dhx9 NA 815.2795
Gm16452 18.1425 NA
Itm2a 664.661 58.07155
Sergef 142.8205 NA
Ssu72 NA 292.889
Now, the order of the rows above is different than the one you show because I wanted it to be consistent. If you know the set of all possible rows, then you can specify another order obviously.
First thing would be to separate out the two hashes:
my %lymph_node = %{ $hash->{abTcells}->{mesenteric_lymph_node} };
my %spleen = %{ $hash->{abTcells}->{spleen} };
Now, you have two separate hashes that contains the data you want.
What we need is a list of all the keys. Let's make a third hash that contains your keys.
my %keys;
map { $keys{$_} = 1; } keys %lymph_node, keys %spleen;
Now, we can go through all your keys and print the value for each of the two hashes. If one of the hashes doesn't have the data, we'll set it to NA:
for my $value ( sort keys %keys ) {
my $spleen_value;
my $lymph_nodes_value;
$spleen_value = exists $spleen{$value} ? $spleen{$value} : "NA";
$lymph_node_value = exists $lymph_node{$value} ? $lymph_node{$value} : "NA";
printf "%-20.20s %-9.5f %-9.5f\n", $key, $lymph_node_value, $spleen_value;
}
The printf statement is a nice way to tabularize data. You'll have to create the headings yourself. The ... ? ... : ... statement is an abbreviated if/then/else If the statement before the ? is true, then the value is the value between the ? and the :. Else, the value is the value after the :.
Both of your inner hashes have the same keys, So do a foreach on one of the hashes to get the key, and then print both.

Delete value from Perl hash of arrays of hashes

I'm trying to delete values from a hash of arrays of hashes that I created with the following code:
while ((my $Genotype1, my $Fitness1) = each (%Normalisedfithash)) {
while ((my $Parent1A, my $TallyP1) = each(%P1Tallyhash)) {
my $ParentTally = 0;
my $SecondParent = {
Parent2 => $Parent1A,
Tally => $ParentTally,
};
push #{ $StoredParentshash{$Genotype1}}, $SecondParent;
I have been trying to delete values from %StoredParentshash where Tally is zero. (I have further code which updates Tally, but some are not updated and I want them removed from the hash).
I have written the following:
for my $Parent (keys %StoredParentshash) {
my $aref1 = $StoredParentshash{$Parent};
for my $hashref1 (#$aref1) {
my $Tally = $hashref1->{'Tally'};
if ($Tally == 0){
delete $hashref1->{'Tally'};
delete $hashref1->{'Parent2'};
}
}
}
This code sort of deletes the data, but when I use Data::Dumper the structure I get back looks like this:
'7412' => [
{},
{
'Tally' => 1,
'Parent2' => '2136'
},
{},
{},
{},
How can I completely remove the keys where the Tally is zero rather than being left with {}?
Thanks!
The code that you say has generated the data structure is faulty, as it is missing two closing braces.
You must show either your actual code with balanced { .. } or a dump of %StoredParentshash before we can help you properly.
If Tally and Parent2 are the only keys in the SecondParent hashes, then you should write something like
for my $children (values %StoredParentshash) {
#$children = grep $_->{Tally} != 0, #$children;
}
Your data looks like:
my %StoredParentshash = (
key1 => [
{
Tally => ...,
...
},
...
],
...
);
And you want to delete some of the array elements. Generally, I use grep for that.
#array = grep keep_condition(), #array;
Here is no exception.
for my $array (values(%StoredParentshash)) {
#$array = grep $_->{Tally}, #$array;
}
And to delete any arrays that are now empty:
for my $key (keys(%StoredParentshash)) {
delete $StoredParentshash{$key} if !#{ $StoredParentshash{$key} };
}
Or combined:
for my $key (keys(%StoredParentshash)) {
my $array = $StoredParentshash{$key};
#$array = grep $_->{Tally}, #$array;
delete $StoredParentshash{$key} if !#$array;
}

Iterating over a complex data structure

I have what appears to be a hash of a hash of an array of hashes. I'm trying to pull some values out and I'm stumped (this is way deeper than I would go with a structure. It looks like this.....
%htest = (
8569 => {
4587 => [
{
date=> "2011-01-15",
approved=> 1,
},
{
date=> "2011-01-12",
approved=> 1,
},
],
1254 => [
{
date=> "2011-01-12",
approved=> "",
},
{
date=> "",
approved=> 1,
},
],
},
);
Trying to iterate over this thing is giving me a massive headache. I'm trying to access the number of elements under the second hash value (4587 and 1254). The number of those elements where approved="1" and the number of elements where the date contains a value.
If I could iterate over them I sure I could shove what I need into a less complex structure but so far I'm at a loss.
I got this far...
while (my ($id, $surveyhash) = each %{ $htest{'8569'} } ){
print "$enumid = $subhash\n";
print Dumper $subhash."\n";
}
That gives me the "4587" and "1254" but trying to do a dumper on $subhash just gives me....
4587 = ARRAY(0x9a9ffb0)
$VAR1 = 'ARRAY(0x9a9ffb0)
';
1254 = ARRAY(0x9a91788)
$VAR1 = 'ARRAY(0x9a91788)
';
Any idea how to iterate over this monstrosity?
Janie
Your structure has typos, you need commas between the innermost hashes and a parenthesis at the end (rather than a curly bracket)
Once you fix it you can use something like this:
my $approved = 0, my $date_has_value = 0;
while ( my ($k,$vref) = each %htest ) {
while ( my ($k,$v) = each %$vref ) {
# Now you're inside the inner hash, so there will be 2 iterations
# with $k 4587 and 1254
foreach my $item (#$v) {
# Now each $item is a reference to the innermost hashes
$approved++ if $item->{approved} == 1;
$date_has_value++ if $item->{date};
}
}
}
Here's a fairly explicit iteration that should help get you started
my ($num_approved, $num_date) = (0, 0);
# outer hash
while (my ($ka, $va) = each %htest)
{
# inner hash
while (my ($kb, $vb) = each %{$va})
{
# each hash inside the array
foreach my $h (#{$vb})
{
$num_approved += ${$h}{"approved"} == 1;
$num_date += length(${$h}{"date"}) > 0;
}
}
}
Counting match cases can be done with "scalar grep".
my ($approved, $date_has_value) = (0, 0);
for my $v1 (values %htest) {
for my $v2 (values %$v1) {
$approved += scalar grep { $$_{approved} eq '1' } #$v2;
$date_has_value += scalar grep { $$_{date} ne '' } #$v2;
}
}

Mapping values with Column header and row header

I have some files with below data.
sample File 1:
sitename1,2009-07-19,"A1",11975,17.23
sitename1,2009-07-19,"A2",11,0.02
sitename1,2009-07-20,"A1",2000,17.23
sitename1,2009-07-20,"A2",538,0.02
I want to map the values in column 4 with column 2 and 3 as shown below.
Output required.
Site,Type,2009-07-19,2009-07-20
sitename1,"A1",11975,2000
sitename1,"A2",11,538
Here is what I have tried so far:
#! /usr/bin/perl -w
use strict;
use warnings;
my $column_header=["Site,Type"];
my $position={};
my $last_position=0;
my $current_event=[];
my $events=[];
while (<STDIN>) {
my ($site,$date,$type,$value,$percent) = split /[,\n]/, $_;
my $event_key = $date;
if (not defined $position->{$event_key}) {
$last_position+=1;
$position->{$event_key}=$last_position;
push #$column_header,$event_key;
}
my $pos = $position->{$event_key};
if (defined $current_event->[$pos]) {
dumpEvent();
}
if (not defined $current_event->[0]) {
$current_event->[0]="$site,$type";
}
$current_event->[$pos]=$value;
}
dumpEvent();
my $order = [];
for (my $scan=0; $scan<scalar(#$column_header); $scan++) {
push #$order,$scan;
}
printLine($column_header);
map { printLine($_) } #$events;
sub printLine {
my $record=shift;
my #result=();
foreach my $offset (#$order) {
if (defined $record->[$offset]) {
push #result,$record->[$offset];
} else {
push #result,"";
}
}
print join(",",#result)."\n";
}
sub dumpEvent {
return unless defined $current_event->[0];
push #$events,$current_event;
$current_event=[];
}
The output i am getting is as below.
*Site,Type,2009-07-19,2009-07-20*
sitename1,"A1",11975,
sitename1,"A2",11,
sitename1,"A1",,14620
sitename1,"A2",,538
If I understand you correctly (and I have to admit I'm only guessing), you have several types of things at different dates and a value for each. Thus you need a data structure like this hash for each site:
$foo = {
site => 'sitename1',
type => 'A1',
dates => [
{
date => '2009-07-19',
value => 11975,
},
{
date => '2009-07-20',
value => 538,
},
],
};
Is that even close?
The folowing code produces the expected result and makes "some" sense. I don't know if it makes real sense.
my %dates;
my %SiteType;
while (<DATA>) {
chomp;
my ($site,$date,$type,$value,$percent) = split /,/;
$dates{$date} = '1';
push #{$SiteType{"$site,$type"}}, $value ;
};
print 'Site,Type,', join(',', sort keys %dates), "\n";
foreach ( sort keys %SiteType) {
print $_, ',', join(',', #{$SiteType{$_}}), "\n";
};