I have the following code
use strict;
use warnings;
use Data::Dumper;
my $s = "12 A P1
23 B P5
24 C P2
15 D P1
06 E P5 ";
my $hash = {};
my @a = split(/\n/, $s);
foreach (@a)
{
my $c = (split)[2];
my $d = (split)[1];
my $e = (split)[0];
push(@{$hash->{$c}}, $d);
}
print Dumper($hash );
I am getting the output
$VAR1 = {
'P5' => [
'B',
'E'
],
'P2' => [
'C'
],
'P1' => [
'A',
'D'
]
};
But I want the output like
$VAR1 = {
'P5' => {
'E' => '06',
'B' => '23'
},
'P2' => {
'C' => '24'
},
'P1' => {
'A' => '12',
'D' => '15'
}
};
Please help.
You need to use a hash if you want a hash as output.
There is no need to split three times and use subscripts; just split once and assign all the variables in one go. There is also no need to initialize a scalar as an empty hash; Perl will autovivify it for you.
I renamed the variables for increased readability.
my $string = "12 A P1
23 B P5
24 C P2
15 D P1
06 E P5 ";
my $hash;
my @lines = split(/\n/, $string);
foreach (@lines)
{
my ($value, $key2, $key) = split;
$hash->{$key}{$key2} = $value;
}
print Dumper($hash );
Be aware that if you have multiple values with the same keys, they will overwrite each other. In that case, you'd need to push the values onto an array instead:
push @{$hash->{$key}{$key2}}, $value;
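To make the overwrite caveat concrete, here is a minimal sketch. It assumes an extra input line (`99 A P1`, which is not in the original data) that repeats the P1/A pair, and `build_index` is a hypothetical helper name chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: builds $index->{$key}{$key2} = [ values... ]
# so that repeated key pairs accumulate instead of overwriting.
sub build_index {
    my ($string) = @_;
    my %index;
    for my $line ( split /\n/, $string ) {
        my ( $value, $key2, $key ) = split ' ', $line;
        next unless defined $key;    # skip blank or short lines
        push @{ $index{$key}{$key2} }, $value;
    }
    return \%index;
}

# Assumed sample input: "99 A P1" duplicates the P1/A pair on purpose
my $index = build_index("12 A P1\n99 A P1\n23 B P5\n");
print join( ',', @{ $index->{P1}{A} } ), "\n";    # prints 12,99
```

With plain assignment instead of `push`, only the last value seen for P1/A would survive.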
Well it's not that different from what you have. Just replace the push with a hash-assign (thank you auto-vivification):
foreach (@a)
{
my ($e, $d, $c) = split;
$hash->{$c}->{$d} = $e;
}
Additionally I have re-arranged the "split" so that it's just called once per line.
Related
For example
From this:
UserA|Single|Girly|200|500
UserA|Single|Boyish|200|200
UserA|Double|Girly|100|200
UserA|Multiple|Boyish|200|400
UserA|Double|Girly|250|150
UserA|Single|Boyish|150|150
To this:
UserA|Single|Girly|200|500
UserA|Single|Boyish|350|350
UserA|Double|Girly|350|350
UserA|Multiple|Boyish|200|400
How should I code this to sum the values of the lines that share the same keys?
You get the basic idea from my example.
Thanks!
use strict;
use warnings;
my %hash;
while(my $line = <DATA>)
{
chomp $line;
if ($line =~ /(.*\|.*\|.*)\|(\d*)\|(\d*)/)
{
# you want to group them by the first 3 attributes, therefore:
# $1 will hold UserA|Single|Girly
# $2 will hold the first value
# $3 will hold the second value
$hash{$1}{'first'} += $2;
$hash{$1}{'second'} += $3;
}
}
use Data::Dumper;
print Dumper %hash;
__DATA__
UserA|Single|Girly|200|500
UserA|Single|Boyish|200|200
UserA|Double|Girly|100|200
UserA|Multiple|Boyish|200|400
UserA|Double|Girly|250|150
UserA|Single|Boyish|150|150
The result looks like this:
$VAR1 = 'UserA|Multiple|Boyish';
$VAR2 = {
'first' => '200',
'second' => '400'
};
$VAR3 = 'UserA|Double|Girly';
$VAR4 = {
'first' => '350',
'second' => '350'
};
$VAR5 = 'UserA|Single|Boyish';
$VAR6 = {
'first' => '350',
'second' => '350'
};
$VAR7 = 'UserA|Single|Girly';
$VAR8 = {
'first' => '200',
'second' => '500'
};
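If you want the result back in the original pipe-delimited format rather than as a Dumper listing, something along these lines might work. This is only a sketch: `sum_by_key` is a hypothetical name, and it assumes every line has exactly five `|`-separated fields with numbers in the last two.

```perl
use strict;
use warnings;

# Hypothetical helper: sums the last two fields per "User|Type|Style" key
# and returns the summed lines in the order the keys were first seen.
sub sum_by_key {
    my @lines = @_;
    my ( @order, %sum );
    for my $line (@lines) {
        my @f = split /\|/, $line;
        next unless @f == 5;                  # assumption: five fields per line
        my $key = join '|', @f[ 0 .. 2 ];
        push @order, $key unless exists $sum{$key};
        $sum{$key}[0] += $f[3];
        $sum{$key}[1] += $f[4];
    }
    return map { join '|', $_, @{ $sum{$_} } } @order;
}

my @result = sum_by_key(
    'UserA|Single|Girly|200|500',
    'UserA|Single|Boyish|200|200',
    'UserA|Double|Girly|100|200',
    'UserA|Multiple|Boyish|200|400',
    'UserA|Double|Girly|250|150',
    'UserA|Single|Boyish|150|150',
);
print "$_\n" for @result;
```

The extra `@order` array is there because a plain hash does not remember insertion order.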
As a follow up to my previous post here!
I tested the algorithm with nested hash references:
Algorithm:
use strict;
use warnings;
&expand_references2(['a','b',{c=>123},'d']);
sub expand_references2 {
my $indenting = -1;
my $inner; $inner = sub {
my $ref = $_[0];
my $key = $_[1];
$indenting++;
if(ref $ref eq 'ARRAY'){
print ' ' x $indenting;
printf("%s\n",($key) ? $key : '');
$inner->($_) for @{$ref};
}elsif(ref $ref eq 'HASH'){
print ' ' x $indenting;
printf("%s\n",($key) ? $key : '');
for my $k(sort keys %{$ref}){
$inner->($ref->{$k},$k);
}
}else{
if($key){
print ' ' x $indenting,$key,' => ',$ref,"\n";
}else{
print ' ' x $indenting,$ref,"\n";
}
}
$indenting--;
};
$inner->($_) for @_;
}
In some cases, the indentation and the newline character do not display as expected:
Example1:
expand_references2(hash=>{
d1=>{a=>123,
b=>234},
d2=>[1,2,3],
d3=>'hello'});
Output:
Hash
<newline> # not required
d1
a => 123
b => 234
d2
1
2
3
d3 => hello
Instead I would prefer an output something like this:
Hash
d1
a => 123
b => 234
d2
1
2
3
d3 => hello
OR
Hash
d1
a => 123
b => 234
d2
1
2
3
d3 => hello
Example2:
expand_references2(['a','b',{c=>123},'d']);
output:
a
b
c=>123 # indentation not required
d
Any guidance on how to achieve the above two scenarios, or on how to get the indentation right without extra newlines?
Appreciate any help.
Thanks
I'd use a somewhat different approach:
sub prindent {
my( $ref, $ind ) = @_;
if( ref( $ref ) eq 'HASH' ){
for my $key (sort keys %{$ref}){
print ' ' x $ind, $key;
my $val = $ref->{$key};
if( ref( $val ) ){
print "\n";
prindent( $val, $ind + 1 );
} else {
print " => $val\n";
}
}
} elsif( ref( $ref ) eq 'ARRAY' ){
for my $el ( @{$ref} ){
if( ref( $el ) ){
prindent( $el, $ind + 1 );
} else {
print ' ' x $ind, "$el\n";
}
}
}
}
sub prindent2 {
my( $key, $val ) = @_;
if( defined $val ){
print "$key\n";
prindent( $val, 1 );
} else {
prindent( $key, 0 );
}
}
This produces:
hash
d1
a => 123
b => 234
d2
1
2
3
d3 => hello
a
b
c => 123
d
You may not like the output for multidimensional arrays: all elements are in one column.
I have a hash that contains a sub-hash. I want to extract that sub-hash separately and create an array from it.
The hash looks like this:
'a1' => '1',
'a2' => '2',
'Def' => [
'd' => 'x',
'e' => 'y'
]
I need to make a separate hash for 'Def' and print only 'Def' as an array.
It's hard to tell from your question exactly what you are trying to achieve, but my interpretation is that you want to extract the anonymous hash assigned to def and store it in another hash, then print that hash as an array. I have also included examples that print just the keys or just the values of the hash.
use strict;
use Data::Dumper;
my %first_hash = (
a1 => '1',
a2 => '2',
def => {
d => 'x',
e => 'y'
}
);
my %second_hash = %{$first_hash{'def'}};
my @full_array = %second_hash;
my @keys_array = keys %second_hash;
my @values_array = values %second_hash;
print Dumper (\%first_hash);
print Dumper (\%second_hash);
print "full array: ", join(' ',@full_array), "\n";
print "keys array: ", join(' ',@keys_array), "\n";
print "values array: ", join(' ',@values_array), "\n";
OUTPUT
$VAR1 = {
'a2' => '2',
'def' => {
'e' => 'y',
'd' => 'x'
},
'a1' => '1'
};
$VAR1 = {
'e' => 'y',
'd' => 'x'
};
full array: e y d x
keys array: e d
values array: y x
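Note that Perl does not guarantee hash order, so the "full array" above can come out as `d x e y` on another run. If the order matters, a sketch like the following (sorting the keys and using a hash slice; the variable names are my own) gives a stable result and avoids copying the sub-hash:

```perl
use strict;
use warnings;

my %first_hash = ( a1 => '1', a2 => '2', def => { d => 'x', e => 'y' } );

# Take a reference to the sub-hash instead of copying it
my $def = $first_hash{def};

my @keys   = sort keys %$def;                      # stable, sorted key order
my @values = @{$def}{@keys};                       # hash slice, same order as @keys
my @pairs  = map { ( $_ => $def->{$_} ) } @keys;   # flattened key/value list

print "keys: @keys\n";        # prints keys: d e
print "values: @values\n";    # prints values: x y
print "pairs: @pairs\n";      # prints pairs: d x e y
```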
Below you'll find the answer.
print "@{$a{'Def'}}";
The original Perl array is sorted and looks like this:
Original ARRAY:
ccc-->2
ccc-->5
abc-->3
abc-->7
cb-->6
and I would like to have the following result:
FINAL ARRAY:
ccc-->7
abc-->10
cb-->6
Question:
Can you please create a subroutine for that?
This is the original subroutine that I used:
sub read_final_dev_file {
$dfcnt=0;
$DEVICE_ANZSUMZW=0;
$DEVICE_ANZSUM=0;
open(DATA,"$log_dir1/ALLDEVSORT.$log_file_ext1") || die ("Cannot Open Logfile: $log_dir1/$log_DEV_name.$log_file_ext1 !!!!");
@lines = <DATA>;
close(DATA);
chomp(@lines); # strip the trailing newline from each line
foreach $logline (@lines) {
if ($logline =~ /(.*)-->(.*)/) {
$DEVICE_CODE[$dfcnt] = $1;
$DEVICE_ANZAHL[$dfcnt] = $2;
print "DEVICE_final = $DEVICE_CODE[$dfcnt], D_ANZAHL_final = $DEVICE_ANZAHL[$dfcnt]\n";
if ($dfcnt > 0 ) {
if ( $DEVICE_CODE[$dfcnt] eq $DEVICE_CODE[$dfcnt-1] ) {
$DEVICE_ANZSUM = $DEVICE_ANZAHL[$dfcnt] + $DEVICE_ANZAHL[$dfcnt-1];
$DEVICE_ANZSUMZW = $DEVICE_ANZSUM++;
#$DEVICE_ANZSUM = $DEVICE_ANZAHL[$dfcnt]++;
#print "DEVICE_ANZAHL = $DEVICE_ANZAHL[$dfcnt],DEVICE_ANZAHL -1 = $DEVICE_ANZAHL[$dfcnt-1]\n";
print "DEVICE_eq = $DEVICE_CODE[$dfcnt], D_ANZAHL_eq = $DEVICE_ANZAHL[$dfcnt],DEVANZSUM = $DEVICE_ANZSUM,COUNT = $dfcnt\n";
}#end if
if ( $DEVICE_CODE[$dfcnt] ne $DEVICE_CODE[$dfcnt-1] ) {
#$DEVICE_ANZSUM=0;
#splice(@data3,$dfcnt+2,1) if ($DEVICE_ANZSUM > 1);
push (@data3,$DEVICE_ANZSUMZW) if ($DEVICE_ANZSUM > 1);
push (@data3,$DEVICE_ANZAHL[$dfcnt]) if ($DEVICE_ANZSUM == 0);
if ( $DEVICE_CODE[$dfcnt] ne $DEVICE_CODE[$dfcnt-1] ) {
$DEVICE_ANZSUM=0;
}
print "DEVICE_ne = $DEVICE_CODE[$dfcnt], D_ANZAHL_ne = $DEVICE_ANZAHL[$dfcnt], DEVANZSUM = $DEVICE_ANZSUM\n";
}#end if
}#end if $dfcnt
$dfcnt++;
}#end if logline
}#end for
print "@labels3\n";
print "@data3\n";
}#end sub read_final_dev_file
Probably not the best way, but this is what came to mind after seeing LeoNerd's answer, since I don't have CPAN access in production and never have modules lying around:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my #input = (
[ ccc => 2 ],
[ ccc => 5 ],
[ abc => 3 ],
[ abc => 7 ],
[ cb => 6 ],
);
my %output;
$output{$_->[0]} += $_->[1] for @input;
print Dumper \%output;
my #output = map { [ $_ => $output{$_} ] } keys(%output);
print Dumper \@output;
Output:
$VAR1 = {
'abc' => 10,
'cb' => 6,
'ccc' => 7
};
$VAR1 = [
['abc', 10],
['cb', 6],
['ccc', 7],
];
You could use List::UtilsBy::partition_by to group the original list into partitions, by the first string:
use List::UtilsBy qw( partition_by );
my #input = (
[ ccc => 2 ],
[ ccc => 5 ],
[ abc => 3 ],
[ abc => 7 ],
[ cb => 6 ],
);
my %sets = partition_by { $_->[0] } @input;
Now you have a hash, keyed by the leading strings, whose values are all the ARRAY refs with that key first. You can now sum the values within them, by mapping over $_->[1] which contains the numbers:
use List::Util qw( sum );
my %totals;
foreach my $key ( keys %sets ) {
$totals{$key} = sum map { $_->[1] } @{ $sets{$key} };
}
If you're inclined towards code of a more compact and functional-looking nature, you could instead use the new pairmap here; making the whole thing expressible in one line:
use List::UtilsBy qw( partition_by );
use List::Util qw( pairmap sum );
my %totals = pairmap { $a => sum map { $_->[1] } @$b }
partition_by { $_->[0] } @input;
Edit: I should add that even though you stated in your original question that the array was sorted, this solution doesn't require it sorted. It will happily take the input in any order.
You can simplify your subroutine a lot by using a hash to track the counts instead of an array. The following uses an array @devices to track the order and a hash %device_counts to track the counts:
my #devices;
my %device_counts;
while (<DATA>) { # Read one line at a time from DATA
if (/(.*)-->(.*)/) { # This won't extract newlines so no need to chomp
if (!exists $device_counts{$1}) {
push #devices, $1; # Add to the array the first time we encounter a device
}
$device_counts{$1} += $2; # Add to the count for this device
}
}
for my $device (@devices) {
printf "%s-->%s\n", $device, $device_counts{$device};
}
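Since the question asked for a subroutine, the same idea could be wrapped up roughly like this. It is a sketch only: `sum_devices` is a hypothetical name, and it assumes the lines are passed in as a list and that the count after `-->` is a non-negative integer.

```perl
use strict;
use warnings;

# Hypothetical subroutine: takes "code-->count" lines and returns one
# summed "code-->total" line per device, in first-seen order.
sub sum_devices {
    my @lines = @_;
    my ( @devices, %device_counts );
    for my $line (@lines) {
        next unless $line =~ /^(.*)-->(\d+)/;    # skip lines that do not match
        my ( $device, $count ) = ( $1, $2 );
        push @devices, $device unless exists $device_counts{$device};
        $device_counts{$device} += $count;
    }
    return map { join '-->', $_, $device_counts{$_} } @devices;
}

print "$_\n" for sum_devices( 'ccc-->2', 'ccc-->5', 'abc-->3', 'abc-->7', 'cb-->6' );
```

Like the loop above, this does not depend on the input being sorted.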
I have a three-column Excel file, which has the following pattern:
12 A P1
23 B P5
24 C P2
15 D P1
06 E P5
The structure underlying this data set is that,
P1 contains A and D; A corresponds to 12 and D corresponds to 15
P2 contains C; C corresponds to 24
P5 contains B and E; B corresponds to 23 and E corresponds to 06
I want to represent this kind of structure in a hashed structure i.e., use P1 as a key to point to a hash, and A is used as the key for this second level hash. Is there a way to implement this in Perl?
Spreadsheet::ParseExcel can be used to parse .xls files. Below is a sample program that builds the desired data structure.
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
use Spreadsheet::ParseExcel;
my $parser = Spreadsheet::ParseExcel->new;
my $workbook = $parser->parse( shift or die "Please provide a file\n" );
my $worksheet = $workbook->worksheet(0);
my %data;
my ( $row_min, $row_max ) = $worksheet->row_range();
for my $row ( $row_min .. $row_max ) {
my $value = $worksheet->get_cell( $row, 0 )->value;
my $key = $worksheet->get_cell( $row, 1 )->value;
my $super_key = $worksheet->get_cell( $row, 2 )->value;
$data{$super_key}->{$key} = $value;
}
print Dumper \%data;
Output
$VAR1 = {
'P5' => {
'E' => '06',
'B' => '23'
},
'P2' => {
'C' => '24'
},
'P1' => {
'A' => '12',
'D' => '15'
}
};
I have had to process data in spreadsheets in the past. If you are dealing with a small number of Excel files, export them manually to CSV files using spreadsheet software such as Excel. Then parse the CSV file and store the cell values in a hash of hashes in Perl:
#!/usr/bin/env perl
use warnings;
use strict;
use Data::Dumper::Simple;
my $file = "";
my @row = ();
my $rowidx = 1;
my %hh = (); # hash of hashes
open( INFILE, "input.csv" ) or die("Can not open input file: $!");
while ( $file = <INFILE> ) {
@row = parse($file);
chomp(@row);
$hh{ $row[2] }{ $row[1] } = $row[0];
#warn Dumper %hh; # debug
$rowidx++;
}
close(INFILE);
warn Dumper %hh;
exit;
sub parse {
my @newrow = ();
my $columns = shift; # read next row
push( @newrow, $+ ) while $columns =~ m{"([^\"\\]*(?:\\.[^\"\\]*)*)",?|([^,]+),?|,}gx; # parse and store columns to array
push( @newrow, undef ) if substr( $columns, -1, 1 ) eq ',';
return @newrow;
}
Running this gives
$ more input.csv
12,A,P1
23,B,P5
24,C,P2
15,D,P1
06,E,P5
$ ./ReadCSV.pl input.csv
%hh = (
'P5' => {
'E' => '06',
'B' => '23'
},
'P2' => {
'C' => '24'
},
'P1' => {
'A' => '12',
'D' => '15'
}
);
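If the exported CSV is as simple as the sample above (no quoted fields and no embedded commas, which is an assumption on my part), a plain `split` may be enough. The sketch below uses a hypothetical helper `csv_to_hoh`, a lexical filehandle, and an in-memory string in place of `input.csv` so that it runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: reads simple CSV (no quoted fields, no embedded commas)
# from a filehandle and returns a reference to a hash of hashes.
sub csv_to_hoh {
    my ($fh) = @_;
    my %hh;
    while ( my $line = <$fh> ) {
        chomp $line;
        my ( $value, $key, $super_key ) = split /,/, $line;
        next unless defined $super_key;    # skip blank or short lines
        $hh{$super_key}{$key} = $value;
    }
    return \%hh;
}

# In-memory stand-in for input.csv
my $csv = "12,A,P1\n23,B,P5\n24,C,P2\n15,D,P1\n06,E,P5\n";
open my $fh, '<', \$csv or die "Cannot open in-memory file: $!";
my $hh = csv_to_hoh($fh);
close $fh;

print "P5/E = $hh->{P5}{E}\n";    # prints P5/E = 06
```

For real-world CSV with quoting, the regex-based `parse` above or a dedicated CSV module is the safer choice.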
There's the Spreadsheet::ParseExcel module, which does a pretty good job of parsing a regular *.xls spreadsheet.
Fortunately, there's an extension called Spreadsheet::XLSX that works with Spreadsheet::ParseExcel to read *.xlsx spreadsheets as well. The methods used in Spreadsheet::ParseExcel work with both *.xls and *.xlsx files if you also have Spreadsheet::XLSX installed.
What version of Excel are the files formatted in?
I have had a very good experience with reading from (and writing to) .xls files using the modules Spreadsheet::ParseExcel (Spreadsheet::WriteExcel for output)
Unfortunately, I did this 4 years ago and the .xlsx format was not as prevalent, so I can't speak for those.