convert array to hash using grep and map in perl - perl

I have an array as follows:
@array = ('a:b','c:d','e:f:g','h:j');
How can I convert this into the following using grep and map?
%hash = (a=>1,b=>1,c=>1,d=>1,e=>1,f=>1,h=>1,j=>1);
I've tried:
my @arr;
foreach (@array) {
my @a = split ':', $_;
push @arr, @a;
}
%hash = map {$_=>1} @arr;
but I am getting all the values; I should get only the first two values from each element.

It's very easy:
%hash = map {$_=>1} grep { defined $_ } map { (split /:/, $_)[0..1] } @array;
So, you split each array element on the ":" delimiter and take only the first two values of each result; then grep keeps the defined values and passes them to the outer map, which makes the key-value pairs.
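A minimal sketch of why the grep is there, assuming an input element that has no colon at all (the 'k' entry is made up for illustration and is not in the question's data). The slice [0..1] then returns undef for the missing second field, and grep drops it before the hash is built:

```perl
use strict;
use warnings;

# 'k' has no colon; it is a made-up element added for illustration.
my @array = ('a:b', 'c:d', 'e:f:g', 'h:j', 'k');

my %hash = map  { $_ => 1 }
           grep { defined $_ }               # drop the undef produced by 'k'
           map  { (split /:/, $_)[0..1] }    # keep at most two fields per element
           @array;

print join(',', sort keys %hash), "\n";
```

Without the grep, the undef would be used as a hash key and Perl would warn about an uninitialized value.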

You have to ignore everything except first two elements after split,
my @arr;
foreach (@array){
my @a = split ':', $_;
push @arr, @a[0,1];
}
my %hash = map {$_=>1} @arr;
Using map,
my %hash =
map { $_ => 1 }
map { (split /:/)[0,1] }
@array;

I think this should work, though it is not elegant enough. I use a temporary array to hold the result of split and return the first two elements.
my %hash = map { $_ => 1 } map { my @t = split ':', $_; $t[0], $t[1] } @array;

This filters out g key
my %hash = map { map { $_ => 1; } (split /:/)[0,1]; } @array;

Related

Handling Nested Delimiters in perl

use strict;
use warnings;
my %result_hash = ();
my %final_hash = ();
Compare_results();
foreach my $key (sort keys %result_hash ){
print "$key \n";
print "$result_hash{$key} \n";
}
sub Compare_results
{
while ( <DATA> )
{
my($instance,$values) = split /\:/, $_;
$result_hash{$instance} = $values;
}
}
__DATA__
1:7802315095\d\d,7802315098\d\d;7802025001\d\d,7802025002\d\d,7802025003\d\ d,7802025004\d\d,7802025005\d\d,7802025006\d\d,7802025007\d\d
2:7802315095\d\d,7802025002\d\d,7802025003\d\d,7802025004\d\d,7802025005\d\d,7802025006\d\d,7802025007\d\d
Output
1
7802315095\d\d,7802315098\d\d;7802025001\d\d,7802025002\d\d,7802025003\d\d,7802025004\d\d,7802025005\d\d,7802025006\d\d,7802025007\d\d
2
7802315095\d\d,7802025002\d\d,7802025003\d\d,7802025004\d\d,7802025005\d\d,7802025006\d\d,7802025007\d\d
I am trying to fetch the value of each key and split the comma-separated value from the result hash again. If I find a semicolon in any value, I want to store the left and right values under separate hash keys.
Something like below
1. # split the value of $result_hash{$key} again by , and see whether any chunk is separated by ;
2. # every chunk without ; and the value on the left of ; should be stored in
@{$final_hash{"eto"}} = ['7802315095\d\d','7802315098\d\d','7802025002\d\d','7802025003\d\d','7802025004\d\d','7802025005\d\d','7802025006\d\d','7802025007\d\d'] ;
3. # anything found on the right side of ; has to be stored in
@{$final_hash{"pro"}} = ['7802025001\d\d'] ;
Is there a way that I can handle everything in the subroutine? Can I make the code simpler?
Update :
I tried splitting the string in a single shot, but it's just picking the values with the semicolon and ignoring everything else
foreach my $key (sort keys %result_hash ){
# print "$key \n";
# print "$result_hash{$key} \n";
my ($o,$t) = split(/,|;/, $result_hash{$key});
print "Left : $o \n";
print "Left : $t \n";
#push @{$final_hash{"eto"}}, $o;
#push @{$final_hash{"pro"}} ,$t;
}
}
My updated code after help
sub Compare_results
{
open my $fh, '<', 'Data_File.txt' or die $!;
# split by colon and further split by , and ; if any (done in insert_array)
my %result_hash = map { chomp; split ':', $_ } <$fh> ;
foreach ( sort { $a <=> $b } (keys %result_hash) )
{
($_ < 21)
? insert_array($result_hash{$_}, "west")
: insert_array($result_hash{$_}, "east");
}
}
sub insert_array
{
my ($val,$key) = @_;
foreach my $field (split ',', $val)
{
$field =~ s/^\s+|\s+$//g; # / turn off editor coloring
if ($field !~ /;/) {
push @{ $file_data{"pto"}{$key} }, $field ;
}
else {
my ($left, $right) = split ';', $field;
push @{$file_data{"pto"}{$key}}, $left if($left ne '') ;
push @{$file_data{"ero"}{$key}}, $right if($right ne '') ;
}
}
}
Thanks
Update: added a two-pass regex at the end
Just proceed systematically, analyze the string step by step. The fact that you need consecutive splits and a particular separation rule makes it unwieldy to do in one shot. Better have a clear method than a monster statement.
use warnings 'all';
use strict;
use feature 'say';
my (%result_hash, %final_hash);
Compare_results();
say "$_ => $result_hash{$_}" for sort keys %result_hash;
say '---';
say "$_ => [ @{$final_hash{$_}} ]" for sort keys %final_hash;
sub Compare_results
{
%result_hash = map { chomp; split ':', $_ } <DATA>;
my (@eto, @pro);
foreach my $val (values %result_hash)
{
foreach my $field (split ',', $val)
{
if ($field !~ /;/) { push @eto, $field }
else {
my ($left, $right) = split ';', $field;
push @eto, $left;
push @pro, $right;
}
}
}
$final_hash{eto} = \@eto;
$final_hash{pro} = \@pro;
return 1; # but add checks above
}
There are some inefficiencies here, and no error checking, but the method is straightforward. If your input is anything but smallish, please change the above to process line by line, which you clearly know how to do. It prints
1 => ... (what you have in the question)
---
eto => [ 7802315095\d\d 7802315098\d\d 7802025002\d\d 7802025003\d\ d ...
pro => [ 7802025001\d\d ]
Note that your data does have one loose \d\ d.
We don't need to build the whole hash %result_hash for this but only need to pick the part of the line after :. I left the hash in since it is declared global so you may want to have it around. If it in fact isn't needed on its own this simplifies
sub Compare_results {
my (@eto, @pro);
while (<DATA>) {
my ($val) = /:(.*)/;
foreach my $field (split ',', $val)
# ... same
}
# assign to %final_hash, return from sub
}
Thanks to ikegami for comments.
Just for curiosity's sake, here it is in two passes with regex
sub compare_rx {
my @data = map { (split ':', $_)[1] } <DATA>;
$final_hash{eto} = [ map { /([^,;]+)/g } @data ];
$final_hash{pro} = [ map { /;([^,;]+)/g } @data ];
return 1;
}
This picks all characters which are not , or ; using the negated character class [^,;], so each match runs up to the next separator of either kind, going left to right. It does this globally, /g, so it keeps going through the string, collecting all fields that are "left of" , or ;. Then it cheats a bit, picking all [^,;] that are right of ;. The map is used to do this for all lines of data.
If %result_hash is needed, build it instead of @data, then pull the values from it with my @values = values %result_hash and feed the map with @values.
Or, broken line by line (again, you can build %result_hash instead of taking $data directly)
my (@eto, @pro);
while (<DATA>) {
my ($data) = /:(.*)/;
push @eto, $data =~ /([^,;]+)/g;
push @pro, $data =~ /;([^,;]+)/g;
}
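To see what the two patterns described above actually collect, here is a small sketch run on a single shortened value. The string and its field names (A1, A2, B1, A3) are placeholders, not the question's data:

```perl
use strict;
use warnings;

# Shortened sample value; the field names are placeholders.
my $data = 'A1,A2;B1,A3';

my @eto = $data =~ /([^,;]+)/g;     # every run of non-separator characters
my @pro = $data =~ /;([^,;]+)/g;    # only the runs that directly follow a ';'

print "eto: @eto\n";
print "pro: @pro\n";
```

Note that, as written, the first pattern matches every field, so the field after the semicolon (B1 here) ends up in both lists. If it should appear only under pro, the explicit split-based loop is the safer choice.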

unique value in hash while merging their keys

I have a file with tab separated columns like this:
TR1"\t"P0C134
TR2"\t"P0C133
TR2"\t"P0C136
Now I split these into two arrays (one for each column's values) and then convert them into hashes, but I want to remove the duplicates (here it's TR2) while merging their right-column values, something like TR2=>P0C133,P0C136. How is this possible? Is there a function to do it in Perl?
for($i=0;$i<=scalar @s_arr;$i++)
{
if($s_arr[$i] eq $s_arr[$i+1])
{ push(@temp,$idx_arr[$i]); }
else
{
if(@temp eq "")
{ $s_hash{$s_arr[$i]}=$idx_arr[$i]; }
else
{
$idx_str=join(",",@temp);
$s_hash{$s_arr[$i]}=$idx_str;
@temp="";
}
}
}
This is the code I've written, where @s_arr stores the left-column values and @idx_arr stores the right-column values.
You can avoid using two arrays and perform what you want in one fell swoop treating the left-side value as the hash key and making it an array reference, then pushing the right-side values that correlate with that key onto that aref:
use warnings;
use strict;
use Data::Dumper;
my %hash;
while (<DATA>){
my ($key, $val) = split;
push @{ $hash{$key} }, $val;
}
print Dumper \%hash;
__DATA__
TR1 P0C134
TR2 P0C133
TR2 P0C136
Output:
$VAR1 = {
'TR1' => [
'P0C134'
],
'TR2' => [
'P0C133',
'P0C136'
]
};
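If the comma-separated form from the question (TR2=>P0C133,P0C136) is what is needed in the end, one possible follow-up is to join each array reference. This is only a sketch; the hash is written out literally here so that the snippet runs on its own, and @lines is a name introduced for illustration:

```perl
use strict;
use warnings;

# Same hash of arrays as built above, written out literally.
my %hash = (
    TR1 => ['P0C134'],
    TR2 => ['P0C133', 'P0C136'],
);

# Flatten each array reference into one comma-separated string per key.
my @lines;
for my $key (sort keys %hash) {
    push @lines, $key . '=>' . join(',', @{ $hash{$key} });
}

print "$_\n" for @lines;
```

Values keep the order in which they were pushed, so the merged string follows the order of the input file.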
If you want output in the format you described, use a hash of hashes.
#!/usr/bin/perl
use warnings;
use strict;
my @arr = <DATA>;
my %hash;
foreach (@arr)
{
my ($k,$v) = split(/\s+/,$_);
chomp $v;
$hash{$k}{$v}++;
}
foreach my $key1 (keys %hash)
{
print "$key1=>";
foreach my $key2 (keys %{ $hash{$key1} })
{
print "$key2,";
}
print "\n";
}
__DATA__
TR1 P0C134
TR2 P0C133
TR2 P0C136
Output is:
TR2=>P0C136,P0C133,
TR1=>P0C134,

Perl sort array by pattern match

I would like to sort this array based on the value after the comma
my @coords;
$coords[0] = "33.7645539, -84.3585973";
$coords[1] = "33.7683870, -84.3559850";
$coords[2] = "33.7687753, -84.3541355";
foreach my $coord (@sorted_coords) {
print "$coord\n";
}
Output:
33.7687753, -84.3541355
33.7683870, -84.3559850
33.7645539, -84.3585973
I've thought about using map, grep, and capture groups as the list input for sort, but I haven't gotten very far:
my @sorted_coords = sort { $a <=> $b } map {$_ =~ /, (-*\d+\.\d+)/} @unique_coords;
It is easy to give in to the temptation to use a fancy implementation instead of something straightforward and clear. Unless the data set is huge, the speed advantage of using a transform is negligible, and it comes at the cost of much reduced legibility.
A standard sort block is all that's necessary here
use strict;
use warnings;
my @coords = (
"33.7645539, -84.3585973",
"33.7683870, -84.3559850",
"33.7687753, -84.3541355",
);
my @sorted_coords = sort {
my ($aa, $bb) = map { (split)[1] } $a, $b;
$bb <=> $aa;
} @coords;
print "$_\n" for @sorted_coords;
output
33.7687753, -84.3541355
33.7683870, -84.3559850
33.7645539, -84.3585973
Update
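If you prefer, the second field may be extracted from the input records using a regex instead. Replacing the map statement with something like this
my ($aa, $bb) = map /.*\s(\S+)/, $a, $b;
will work fine. (The whitespace in the pattern matters: with a bare /.*(\S+)/ the greedy .* leaves only the final character for the capture.)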
Looks like you could use a Schwartzian transform. You had the right idea:
my @coords;
$coords[1] = "33.7683870, -84.3559850";
$coords[2] = "33.7687753, -84.3541355";
$coords[0] = "33.7645539, -84.3585973";
my @sorted_coords = map { $_->[0] } # 3. extract the first element
sort { $b->[1] <=> $a->[1] } # 2. sort on the second
# element, descending
map { [ $_, /,\s*(\S+)$/ ] } # 1. create list of array refs
@coords;
foreach my $coord (@sorted_coords) {
print "$coord\n";
}
Edit: Adding Joshua's suggestion:
my @sorted_coords = map { join ', ', @$_ }
sort { $b->[1] <=> $a->[1] }
map { [ split /, / ] }
@coords;
It seems easier to look at and more descriptive than my original example.

How to get array values in hash using map function in Perl

I have an array of elements joined with #, which I wish to put in a hash: the first element of each entry as the key and the rest as the value, after splitting the entry on #.
But it is not happening.
Ex:
my @arr = qw(9093#AT#BP 8111#BR 7456#VD#AP 7786#WS#ER 9431#BP ); # thousands of entries
What I want is
$hash{9093} = [AT,BP];
$hash{8111} = [BR]; and so on
How can we accomplish this using the map function? Otherwise I need to use a for loop, but I wish to use map.
my %hash = map { my ($k, @v) = split /#/; $k => \@v } @arr;
For comparison, the corresponding foreach loop follows:
my %hash;
for (@arr) {
my ($k, @v) = split /#/;
$hash{$k} = \@v;
}
Use split to split on '#', taking the first chunk as the key, and keeping the rest in an array. Then create a hash using the keys and references to the arrays.
use Data::Dumper;
my @arr = qw( 9093#AT#BP 8111#BR 7456#VD#AP 7786#WS#ER 9431#BP );
my %hash = map {
my ($key, @vals) = split '#', $_;
$key => \@vals;
} @arr;
print Dumper \%hash;
No effort shown in your question, but I am on a code freeze so I'll bite :)
I think that a for loop would be more idiomatic Perl here: process the elements one by one, split on #, and then assign into your hash:
use strict;
use warnings;
use Data::Dumper;
my @arr = qw(9093#AT#BP 8111#BR 7456#VD#AP 7786#WS#ER 9431#BP );
my %h;
for my $elem ( @arr ) {
my ($key, @vals) = split /#/, $elem;
$h{$key} = \@vals;
}
print Dumper \%h;
That is easy:
%s = (map {split(/#/, $_, 2)} @arr);
Testing it:
$ cat 1.pl
my @arr = qw(9093#AT#BP 8111#BR 7456#VD#AP 7786#WS#ER 9431#BP );
%s = (map {split(/#/, $_, 2)} @arr);
foreach my $key ( keys %s )
{
print "key: $key, value: $s{$key}\n";
}
$ perl 1.pl
key: 7456, value: VD#AP
key: 8111, value: BR
key: 7786, value: WS#ER
key: 9431, value: BP
key: 9093, value: AT#BP
use strict;
use warnings;
use Data::Dumper;
my @arr = ('9093#AT#BP', '8111#BR', '7456#VD#AP', '7786#WS#ER', '9431#BP' );
my %h = map { map { splice(@$_, 0, 1), $_ } [ split /#/ ] } @arr;
print Dumper \%h;

perl: getting value out of a hash using map

It seems like I should be able to do this with map, but the actual details elude me.
I have a list of strings in an array, and either zero or one of them may have a hash value.
So instead of doing:
foreach $str ( @strings ) {
$val = $hash{$str} if $hash{$str};
}
Can this be replaced with a one-liner using map?
@values = grep { $_ } @hash{@strings};
to account for the fact that you only want true values.
Change this to
@values = grep { defined } @hash{@strings};
if you want to skip undefined values.
Sure, it'd be:
map { $val = $hash{$_} } @strings;
That is, each value of @strings is set in $_ in turn (instead of $str as in your foreach).
Of course, this doesn't do much, since you're not doing anything with the value of $val in your loop, and we aren't capturing the list returned by map.
If you're just trying to generate a list of values, that'd be:
@values = map { $hash{$_} } @strings;
But it's more concise to use a hash slice:
@values = @hash{@strings};
EDIT: As pointed out in the comments, if it's possible that @strings contains values that aren't keys in your hash, then @values will get undefs in those positions. If that's not what you want, see Hynek's answer for a solution.
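A small sketch of the difference the EDIT describes, using made-up data in which only one of the three strings is a key in the hash (the names @slice and @present are introduced here for illustration):

```perl
use strict;
use warnings;

# Hypothetical data: only 'b' is a key in the hash.
my %hash    = (b => 42);
my @strings = qw(a b c);

my @slice   = @hash{@strings};                     # one slot per string, undef where the key is missing
my @present = grep { defined } @hash{@strings};    # only the defined values survive

print scalar(@slice), " slots, ", scalar(@present), " defined\n";
```

The plain slice always returns as many elements as there are strings, so any later code has to cope with the undefs; filtering with grep shortens the list instead.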
I usually do it this way:
@values = map { exists $hash{$_} ? $hash{$_} : () } @strings;
but I don't see anything wrong with this way:
push @values, $hash{$_} for grep exists $hash{$_}, @strings;
or
@values = @hash{grep exists $hash{$_}, @strings};
map { defined $hash{$_} && ( $val = $hash{$_})} @strings;