Slicing an array into a hash value - Perl

What I was trying to do was combine elements[1..3] into a single array and then make the hash out of that, then sort by keys and print out the whole thing.
#!/usr/bin/perl
my %hash ;
while ( <> ) {
@elements = split /,/, $_;
@slice_elements = @elements[1..3] ;
if ($elements[0] ne '' ) {
$hash{ $elements[0] } = $slice_elements[0];
}
}
foreach $key (sort keys %hash ) {
print "$key; $hash{$key}\n";
}
This is what I get when I print this out -
casper_mint@casper-mint-dell /tmp $ /tmp/dke /tmp/File1.csv
060001.926941; TOT
060002.029434; RTP
060002.029568; RTP
060002.126895; UL
060002.229327; RDS/A
060002.312512; EON
060002.429382; RTP
060002.585408; BCS
060002.629333; LYG
060002.712240; HBC
This is what I want from the elements of the array: element[0] as the key and elements[1..3] as the value
060001.926941,TOT,86.26,86.48
060002.029434,RTP,310.0,310.66
060002.029568,RTP,310.0,310.74
060002.126895,UL,34.06,34.14
060002.229327,RDS/A,84.47,84.72
060002.312512,EON,56.88,57.04
060002.429382,RTP,310.08,310.77
060002.585408,BCS,58.96,59.06
060002.629333,LYG,46.13,46.41
060002.712240,HBC,93.06,93.23

Always include use strict; and use warnings; at the top of EVERY perl script.
What you need is to create a new anonymous array [ ] as the value to your hash. Then join the values when displaying the results:
#!/usr/bin/perl
use strict;
use warnings;
my %hash;
while (<>) {
chomp;
my @elements = split /,/, $_;
if ($elements[0] ne '' ) {
$hash{ $elements[0] } = [@elements[1..3]];
}
}
foreach my $key (sort keys %hash ) {
print join(',', $key, @{$hash{$key}}) . "\n";
}
Of course, if your data really is fixed width like that, and you're not actually doing anything with the values, there actually is no need to split and join. The following would do the same thing:
use strict;
use warnings;
print sort <>;
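As a rough illustration of the anonymous-array approach, here is a minimal sketch that runs on in-memory lines instead of a file. The helper name `index_rows` and the two sample rows passed to it are assumptions made for the example; only the slice-into-`[ ]` step comes from the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: build key => [fields 1..3] from CSV-like lines.
sub index_rows {
    my @lines = @_;
    my %hash;
    for my $line (@lines) {
        chomp $line;
        my @elements = split /,/, $line;
        next if !@elements || $elements[0] eq '';
        # [ ... ] creates a new anonymous array; the hash stores a reference to it.
        $hash{ $elements[0] } = [ @elements[1..3] ];
    }
    return \%hash;
}

my $rows = index_rows(
    "060002.029434,RTP,310.0,310.66\n",
    "060001.926941,TOT,86.26,86.48\n",
);
for my $key (sort keys %$rows) {
    # Dereference the stored array to get the three fields back.
    print join(',', $key, @{ $rows->{$key} }), "\n";
}
# prints:
# 060001.926941,TOT,86.26,86.48
# 060002.029434,RTP,310.0,310.66
```

Storing `$slice_elements[0]` keeps only the first field, which is why the question's output showed just one value per key.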

Related

Perl, Split string into Key:Value pairs for hash with lowercase keys without temporary array

Given a string of Key:Value pairs, I want to create a lookup hash but with lowercase values for the keys. I can do so with this code
my $a="KEY1|Value1|kEy2|Value2|KeY3|Value3";
my @a = split '\|', $a;
my %b = map { $a[$_] = ( !($_ % 2) ? lc($a[$_]) : $a[$_]) } 0 .. $#a ;
The resulting Hash would look like this Dumper output:
$VAR1 = {
'key3' => 'Value3',
'key2' => 'Value2',
'key1' => 'Value1'
};
Would it be possible to directly create hash %b without using temporary array @a or is there a more efficient way to achieve the same result?
Edit: I forgot to mention that I cannot use external modules for this. It needs to be basic Perl.

You can use pairmap from List::Util to do this without an intermediate array at all.
use strict;
use warnings;
use List::Util 1.29 'pairmap';
my $str="KEY1|Value1|kEy2|Value2|KeY3|Value3";
my %hash = pairmap { lc($a) => $b } split /\|/, $str;
Note: you should never use $a or $b outside of sort (or List::Util pair function) blocks. They are special global variables for sort, and just declaring my $a in a scope can break all sorts (and List::Util pair functions) in that scope. An easy solution is to immediately replace them with $x and $y whenever you find yourself starting to use them as example variables.

Since the key-value pair has to be around the | you can use a regex
my $v = "KEY1|Value1|kEy2|Value2|KeY3|Value3";
my %h = split /\|/, $v =~ s/([^|]+) \| ([^|]+)/lc($1).q(|).$2/xger;

use strict;
use warnings;
use Data::Dumper;
my $i;
my %hash = map { $i++ % 2 ? $_ : lc } split(/\|/, 'KEY1|Value1|kEy2|Value2|KeY3|Value3');
print Dumper(\%hash);
Output:
$VAR1 = {
'key1' => 'Value1',
'key2' => 'Value2',
'key3' => 'Value3'
};

For fun, here are two additional approaches.
A cheaper one than the original (since the elements are aliased rather than copied into @_):
my %hash = sub { map { $_ % 2 ? $_[$_] : lc($_[$_]) } 0..$#_ }->( ... );
A more expensive one than the original:
my %hash = ...;
@hash{ map lc, keys(%hash) } = delete( @hash{ keys(%hash) } );
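To make the first of these two approaches concrete, here is a small sketch. It assumes the elided argument list is a `split` of the question's sample string, and it wraps the expression in a hypothetical `lc_key_hash` function so it can be called directly; neither detail is from the answer itself.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the anonymous-sub approach.
sub lc_key_hash {
    my ($str) = @_;
    # The anonymous sub receives the split results through its own @_.
    # Even indexes are keys (lowercased), odd indexes are values.
    my %hash = sub {
        map { $_ % 2 ? $_[$_] : lc( $_[$_] ) } 0 .. $#_
    }->( split /\|/, $str );
    return \%hash;
}

my $lookup = lc_key_hash('KEY1|Value1|kEy2|Value2|KeY3|Value3');
print "$_ => $lookup->{$_}\n" for sort keys %$lookup;
# prints:
# key1 => Value1
# key2 => Value2
# key3 => Value3
```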

More possible solutions using regexes to do all the work, but not very pretty unless you really like regex:
use strict;
use warnings;
my $str="KEY1|Value1|kEy2|Value2|KeY3|Value3";
my %hash;
my $copy = $str;
$hash{lc $1} = $2 while $copy =~ s/^([^|]*)\|([^|]*)\|?//;
use strict;
use warnings;
my $str="KEY1|Value1|kEy2|Value2|KeY3|Value3";
my %hash;
$hash{lc $1} = $2 while $str =~ m/\G([^|]*)\|([^|]*)\|?/g;
use strict;
use warnings;
my $str="KEY1|Value1|kEy2|Value2|KeY3|Value3";
my %hash = map { my ($k, $v) = split /\|/, $_, 2; (lc($k) => $v) }
$str =~ m/([^|]*\|[^|]*)\|?/g;

Here's a solution that avoids mutating the input string, constructing a new string of the same length as the input string, or creating an intermediate array in memory.
The solution here changes the split into looping over a match statement.
#! /usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my $a="KEY1|Value1|kEy2|Value2|KeY3|Value3";
sub normalize_alist_opt {
my ($input) = @_;
my %c;
my $last_key;
while ($input =~ m/([^|]*(\||\z)?)/g) {
my $s = $1;
next unless $s ne '';
$s =~ s/\|\z//g;
if (defined $last_key) {
$c{ lc($last_key) } = $s;
$last_key = undef;
} else {
$last_key = $s;
}
}
return \%c;
}
print Dumper(normalize_alist_opt($a));
A potential solution that operates over the split directly. Perl might recognize and optimize the special case. Although based on discussions here and here, I'm not sure.
sub normalize_alist {
my ($input) = @_;
my %c;
my $last_key;
foreach my $s (split /\|/, $input) {
if (defined $last_key) {
$c{ lc($last_key) } = $s;
$last_key = undef;
} else {
$last_key = $s;
}
}
return \%c;
}

Perl: Printing out the file where a word occurs

I am trying to write a small program that takes file(s) from the command line and prints out the number of occurrences of each word across all files, and in which file it occurs. The first part, finding the number of occurrences of a word, seems to work well.
However, I am struggling with the second part, namely, finding in which file (i.e. file name) the word occurs. I am thinking of using an array that stores the word but don’t know if this is the best way, or what is the best way.
This is the code I have so far and seems to work well for the part that counts the number of times a word occurs in given file(s):
use strict;
use warnings;
my %count;
while (<>) {
my $casefoldstr = lc $_;
foreach my $str ($casefoldstr =~ /\w+/g) {
$count{$str}++;
}
}
foreach my $str (sort keys %count) {
printf "$str $count{$str}:\n";
}

The filename is accessible through $ARGV.
You can use this to build a nested hash with the filename and word as keys:
use strict;
use warnings;
use List::Util 'sum';
my %count;
while (<>) {
$count{$_}{$ARGV}++ for map lc, /\w+/g;
}
foreach my $word ( keys %count ) {
my @files = keys %{ $count{$word} }; # All files containing lc $word
print "Total word count for '$word': ", sum( @{ $count{$word} }{@files} ), "\n";
for my $file ( @files ) {
print "$count{$word}{$file} counts of '$word' detected in '$file'\n";
}
}
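The nested-hash idea can be tried without real input files. The sketch below writes two temporary files and points `<>` at them through a localized `@ARGV`. The helper names `write_file` and `count_words_by_file`, the file names, and the sample text are all assumptions for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: write $text to $path (only used to set up the demo).
sub write_file {
    my ($path, $text) = @_;
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} $text;
    close $fh or die "Cannot close $path: $!";
    return $path;
}

# Hypothetical helper: returns { word => { filename => count } }.
# Call it with at least one path; with an empty @ARGV, <> reads STDIN.
sub count_words_by_file {
    local @ARGV = @_;                 # <> reads the files listed in @ARGV
    my %count;
    while ( my $line = <> ) {
        # $ARGV holds the name of the file currently being read.
        $count{ lc $_ }{$ARGV}++ for $line =~ /\w+/g;
    }
    return \%count;
}

my $dir   = tempdir( CLEANUP => 1 );
my @files = (
    write_file( "$dir/a.txt", "The cat\nthe dog\n" ),
    write_file( "$dir/b.txt", "A cat\n" ),
);
my $counts = count_words_by_file(@files);
for my $word ( sort keys %$counts ) {
    for my $file ( sort keys %{ $counts->{$word} } ) {
        print "$word: $counts->{$word}{$file} in $file\n";
    }
}
```

Keying the inner hash by `$ARGV` means a word's file list and its per-file counts come from the same structure, so no separate array of file names is needed.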

Using an array seems reasonable, if you don't visit any file more than once - then you can always just check the last value stored in the array. Otherwise, use a hash.
#!/usr/bin/perl
use warnings;
use strict;
my %count;
my %in_file;
while (<>) {
my $casefoldstr = lc;
for my $str ($casefoldstr =~ /\w+/g) {
++$count{$str};
push @{ $in_file{$str} }, $ARGV
unless ref $in_file{$str} && $in_file{$str}[-1] eq $ARGV;
}
}
foreach my $str (sort keys %count) {
print "$str $count{$str}: @{ $in_file{$str} }\n";
}

unique value in hash while merging their keys

I have a file with tab separated columns like this:
TR1"\t"P0C134
TR2"\t"P0C133
TR2"\t"P0C136
I split these into two arrays (one for each column) and then convert them into hashes, but I want to remove the duplicate keys (here it's TR2) while merging their right-column values, giving something like TR2=>P0C133,P0C136. How can this be done? Is there a function for it in Perl?
for($i=0;$i<=scalar @s_arr;$i++)
{
if($s_arr[$i] eq $s_arr[$i+1])
{ push(@temp,$idx_arr[$i]); }
else
{
if(@temp eq "")
{ $s_hash{$s_arr[$i]}=$idx_arr[$i]; }
else
{
$idx_str=join(",",@temp);
$s_hash{$s_arr[$i]}=$idx_str;
@temp="";
}
}
}
This is the code I've written, where @s_arr stores the left-column values and @idx_arr stores the right-column values.

You can avoid using two arrays and perform what you want in one fell swoop treating the left-side value as the hash key and making it an array reference, then pushing the right-side values that correlate with that key onto that aref:
use warnings;
use strict;
use Data::Dumper;
my %hash;
while (<DATA>){
my ($key, $val) = split;
push @{ $hash{$key} }, $val;
}
print Dumper \%hash;
__DATA__
TR1 P0C134
TR2 P0C133
TR2 P0C136
Output:
$VAR1 = {
'TR1' => [
'P0C134'
],
'TR2' => [
'P0C133',
'P0C136'
]
};
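To get from that structure to the `TR2=>P0C133,P0C136` form the question asks for, each array can be joined when printing. This is a minimal sketch; the helper name `group_by_key` is an assumption, and the input is passed as in-memory lines rather than read from `DATA`.

```perl
use strict;
use warnings;

# Hypothetical helper: group the second column by the first.
sub group_by_key {
    my @lines = @_;
    my %hash;
    for my $line (@lines) {
        my ($key, $val) = split ' ', $line;
        next unless defined $val;        # skip blank or one-column lines
        push @{ $hash{$key} }, $val;     # autovivifies the array ref
    }
    return \%hash;
}

my $groups = group_by_key("TR1\tP0C134\n", "TR2\tP0C133\n", "TR2\tP0C136\n");
for my $key (sort keys %$groups) {
    print "$key=>", join(',', @{ $groups->{$key} }), "\n";
}
# prints:
# TR1=>P0C134
# TR2=>P0C133,P0C136
```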

If you want output in that same form, use a hash of hashes.
#!/usr/bin/perl
use warnings;
use strict;
my @arr = <DATA>;
my %hash;
foreach (@arr)
{
my ($k,$v) = split(/\s+/,$_);
chomp $v;
$hash{$k}{$v}++;
}
foreach my $key1 (keys %hash)
{
print "$key1=>";
foreach my $key2 (keys %{ $hash{$key1} })
{
print "$key2,";
}
print "\n";
}
__DATA__
TR1 P0C134
TR2 P0C133
TR2 P0C136
Output is:
TR2=>P0C136,P0C133,
TR1=>P0C134,

A simple variable count inside array

After working with this code, I am stuck at what I think is a simple error, yet I need outside eyes to see what is wrong.
I used the unpack function to divide the input into the following array.
@extract =
------MMMMMMMMMMMMMMMMMMMMMMMMMM-M-MMMMMMMM
------SSSSSSSSSSSSSSSSSSSSSSSSSS-S-SSSSSDTA
------TIIIIIIIIIIIIITIIIVVIIIIII-I-IIIIITTT
Apparently, after unpacking into the array, when I try to go into the while loop, @extract shows up completely empty. Any idea as to why this is happening?
print @extract; #<-----------Prints input
my $sum = 0;
my %counter = ();
while (my $column = @extract) {
print @extract; #<------- This extract is completely empty. Should be input
for (my $aa = (split ('', $column))){
$counter{$aa}++;
delete $counter{'-'}; # Don't count -
}
# Sort keys by count descending
my @keys = (sort {$counter{$b} <=> $counter{$a}} keys %counter) [0]; #gives highest letter
for my $key (@keys) {
$sum += $counter{$key};
print OUTPUT "$key $counter{$key} ";

Each line is an array element, correct? I don't see in your code where you are checking the individual characters.
Assuming the input that you have shown is a 3 element array containing the line as a string:
#!/usr/bin/perl
use strict;
use warnings;
my @entries;
while(my $line = shift(@extract)){
my %hash;
for my $char(split('', $line)){
if($char =~ /[a-zA-Z]/) { $hash{$char}++ }
}
my $high;
for my $key (keys %hash) {
if(!defined($high)){ $high = $key }
elsif($hash{$high} < $hash{$key}){
$high = $key
}
}
push @entries, {$high => $hash{$high}};
}
Note that this empties @extract. If you don't want that, you'd have to use a for loop like the one below:
for my $i (0 .. $#extract){
#my %hash etc...
}
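A non-destructive version might look like the sketch below. The helper `most_common_letter` and the three short sample rows are made up for the example. It also sidesteps the problem in the question's loop, where `my $column = @extract` evaluates the array in scalar context and so assigns the element count rather than a row.

```perl
use strict;
use warnings;

# Hypothetical helper: return the most frequent letter in a string and its count.
# Ties are broken alphabetically; an empty list is returned if there are no letters.
sub most_common_letter {
    my ($line) = @_;
    my %count;
    $count{$_}++ for grep { /[A-Za-z]/ } split //, $line;
    my ($top) = sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
    return defined $top ? ($top, $count{$top}) : ();
}

my @extract = ('--MMM-MS', '--SSDTA-', '-TIIVIT-');   # made-up sample rows
for my $line (@extract) {          # iterates without modifying @extract
    my ($letter, $n) = most_common_letter($line);
    print "$letter $n\n";
}
print scalar(@extract), " rows still in \@extract\n";
# prints:
# M 4
# S 2
# I 3
# 3 rows still in @extract
```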
EDIT:
Changed it so that only the highest number is actually kept

An approach using reduce from List::Util.
#!/usr/bin/perl
use strict;
use warnings;
use List::Util 'reduce';
my @extract = qw/
------MMMMMMMMMMMMMMMMMMMMMMMMMM-M-MMMMMMMM
------SSSSSSSSSSSSSSSSSSSSSSSSSS-S-SSSSSDTA
------TIIIIIIIIIIIIITIIIVVIIIIII-I-IIIIITTT
/;
for (@extract) {
my %count;
tr/a-zA-Z//cd;
for (split //) {
$count{$_}++;
}
my $max = reduce { $count{$a} > $count{$b} ? $a : $b } keys %count;
print "$max $count{$max}\n";
}

How to get array values in hash using map function in Perl

I have an array of elements whose fields are joined with #. I want to put them in a hash, with the first field of each element as the key and the remaining fields as the value, after splitting each element on #.
But it is not working.
Ex:
my @arr = qw(9093#AT#BP 8111#BR 7456#VD#AP 7786#WS#ER 9431#BP ); # thousands of entries
What I want is
$hash{9093} = [AT,BP];
$hash{8111} = [BR]; and so on
How can we accomplish this using the map function? I could use a for loop, but I would prefer to use map.

my %hash = map { my ($k, @v) = split /#/; $k => \@v } @arr;
For comparison, the corresponding foreach loop follows:
my %hash;
for (@arr) {
my ($k, @v) = split /#/;
$hash{$k} = \@v;
}
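A quick way to check the resulting structure is to print each key with its dereferenced array. In this sketch the wrapper name `parse_entries` is an assumption, and only three of the question's sample entries are used.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the map expression.
sub parse_entries {
    # First field becomes the key; the rest are stored as an array reference.
    my %hash = map { my ($k, @v) = split /#/; ($k => \@v) } @_;
    return \%hash;
}

# Quoted strings rather than qw(), which warns about '#' under "use warnings".
my $parsed = parse_entries('9093#AT#BP', '8111#BR', '7456#VD#AP');
for my $key (sort keys %$parsed) {
    print "$key: [", join(',', @{ $parsed->{$key} }), "]\n";
}
# prints:
# 7456: [VD,AP]
# 8111: [BR]
# 9093: [AT,BP]
```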

Use split to split on '#', taking the first chunk as the key, and keeping the rest in an array. Then create a hash using the keys and references to the arrays.
use Data::Dumper;
my @arr = qw( 9093#AT#BP 8111#BR 7456#VD#AP 7786#WS#ER 9431#BP );
my %hash = map {
my ($key, @vals) = split '#', $_;
$key => \@vals;
} @arr;
print Dumper \%hash;

No effort shown in your question, but I am on a code freeze so I'll bite :)
I think that a for loop would be more idiomatic Perl here: process the elements one by one, split on #, and then assign into your hash:
use strict;
use warnings;
use Data::Dumper;
my @arr = qw(9093#AT#BP 8111#BR 7456#VD#AP 7786#WS#ER 9431#BP );
my %h;
for my $elem ( @arr ) {
my ($key, @vals) = split /#/, $elem;
$h{$key} = \@vals;
}
print Dumper \%h;

That is easy:
%s = (map {split(/#/, $_, 2)} @arr);
Testing it:
$ cat 1.pl
my @arr = qw(9093#AT#BP 8111#BR 7456#VD#AP 7786#WS#ER 9431#BP );
%s = (map {split(/#/, $_, 2)} @arr);
foreach my $key ( keys %s )
{
print "key: $key, value: $s{$key}\n";
}
$ perl 1.pl
key: 7456, value: VD#AP
key: 8111, value: BR
key: 7786, value: WS#ER
key: 9431, value: BP
key: 9093, value: AT#BP

use strict;
use warnings;
use Data::Dumper;
my @arr = ('9093#AT#BP', '8111#BR', '7456#VD#AP', '7786#WS#ER', '9431#BP' );
my %h = map { map { splice(@$_, 0, 1), $_ } [ split /#/ ] } @arr;
print Dumper \%h;