I am trying to create a hash from the raw data read from a file.
The values for each hash element will be a list of lists. These inner lists are parsed from the file and need to be kept as a key => ((list 1), (list 2), ......, (list n)) in hash for further processing.
Final data that is expected in the hash will be something like:
%hash = {
'key 1' => ((A, B, C), (1, 2, 3), (Q, R, F)),
'key 2' => ((X, Y, Z), (P, Q, R)),
'key 3' => ((1.0, M, N), (R, S, T), (4, 7, 9)),
......,
'key n' => ((5, M, 8), (J, K, L), (1, 3, 4))
}
I wanted to keep them as a hash for easier lookup and to catch duplicate keys.
my %hash;
my @array = ();
my @inner_array = ();
open (my $FH, '<', $input_file) or die "Could not open : $!\n";
while (my $line = <$FH>) {
    chomp $line;
    ## Lines making up $key and @inner_array
    ## e.g. $key = 'key 1' and
    ## @inner_array = (A, B, C)
    ## @inner_array = (1, 2, 3)
    if (exists $hash{$key}) { # We have seen this key before
        @array = $hash{$key}; # Get the existing array
        push(@array, @inner_array); # Append new inner list
        $hash{$key} = @array; # Replace the original list
    } else { # Seeing the key for the first time
        @array = (); # Create empty list
        push (@array, @inner_list); # Append new inner list
        $hash{$key} = @array; # Replace the original list
    }
}
close $FH;
print dumper %hash;
When executed on a sample file of 10 lines, I get output like this:
$VAR1 = {
'key 1' => 2,
'key 2' => 2,
'key 3' => 2
};
Rather than seeing an array of arrays, I am getting the scalar value 2 as the value of each hash element. Please suggest what I am doing wrong.
((A, B, C), (1, 2, 3), (Q, R, F)) is equivalent to (A, B, C, 1, 2, 3, Q, R, F) because lists are flattened in Perl. Hash values must be scalars, so you need to use an array reference:
my %hash = ( key => [ [ 'A', 'B', 'C' ], [ 1, 2, 3 ], [ 'Q', 'R', 'F' ] ] ...
Note the square brackets for array references.
Also note the round parenthesis at the beginning: using { creates a hash reference, which you probably don't want to assign to a hash. It would've created a hash of a single key like HASH(0x5653cc6cc1e0) with undefined value.
Using warnings should tell you so:
$ perl -MData::Dumper -wE 'my %h = {x=>1}; say Dumper \%h'
Reference found where even-sized list expected at -e line 1.
$VAR1 = {
'HASH(0x557d282e41e0)' => undef
};
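Applied to the loop in the question, the array-reference approach could look like the sketch below. The input format (a key, a tab, then comma-separated values) and the in-memory sample data are assumptions made for illustration, since the question does not show the real file layout.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical input format (an assumption, not taken from the question):
# "key<TAB>comma-separated values", one inner list per line.
my $raw = join "\n",
    "key 1\tA,B,C",
    "key 1\t1,2,3",
    "key 2\tX,Y,Z",
    "";

# Read from an in-memory string so the example is self-contained.
open my $fh, '<', \$raw or die "Could not open in-memory data: $!\n";

my %hash;
while (my $line = <$fh>) {
    chomp $line;
    next unless length $line;
    my ($key, $values) = split /\t/, $line, 2;
    my @inner_array = split /,/, $values;

    # Autovivification creates the outer array on first use, so no
    # exists() check is needed; [ @inner_array ] stores a fresh copy.
    push @{ $hash{$key} }, [ @inner_array ];
}
close $fh;

print Dumper \%hash;
```

An individual element can then be reached with two subscripts, for example $hash{'key 1'}[1][2] for the third item of the second inner list.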
This does not answer your question directly, but it explains your output. Assigning an array to a scalar evaluates the array in scalar context, which yields its length. For example:
my @ar = qw(1 2 3 4);
my $x = @ar;
# $x is 4 (the number of elements in the array)
Use a reference to hold the data in your hash, as follows:
use Data::Dumper;
my %hash;
my @array = (1,2,3,1);
@{$hash{"key"}} = @array;
print Dumper \%hash;
You also need to understand list flattening in Perl. Consider:
@ar = ((1,2,3),(4,5,6),(3,4,5));
print $ar[0];
# prints 1, not (1,2,3), because the nested lists are flattened
To keep the inner lists separate, store them as array references:
my @ar = ( [1,3,4] , [5,4,2] );
print @{$ar[0]};
# prints the elements 1, 3, 4
How can I print the values of an array? I have tried several ways, but I am unable to get the required values out of the arrays:
@array;
Dumper output is as below :
$VAR1 = [
'a',
'b',
'c'
];
$VAR1 = [
'd',
'e',
'f'
];
$VAR1 = [
'g',
'h',
'i'
];
$VAR1 = [
'j',
'k',
'l'
];
for my $value (@array) {
    my $ip = $value->[0];
    DEBUG("DEBUG '$ip\n'");
}
I am getting the output below, which means that for each inner array I only get the first value.
a
d
g
j
I have tried several approaches:
First option:
my $size = @array;
for ($n=0; $n < $size; $n++) {
    my $value=$array[$n];
    DEBUG( "DEBUG: Element is as $value" );
}
Second option:
for my $value (@array) {
    my $ip = $value->[$_];
    DEBUG("DEBUG Element is '$ip\n'");
}
What is the best way to do this?
You have an array of array references. In your first example you loop over only the outer array and print the first (index 0) element of each inner array. Unless you use an automatic dumper, you need to loop over both levels.
for my $value (@array) {
    for my $ip (@$value) {
        DEBUG("DEBUG '$ip\n'");
    }
}
You want to dereference here so you need to do something like:
my @array_of_arrays = ([qw/a b c/], [qw/d e f/ ], [qw/i j k/]);
for my $anon_array (@array_of_arrays) { say for @{$anon_array} }
Or using your variable names:
use strict;
use warnings;
my @array = ([qw/a b c/], [qw/d e f/], [qw/i j k/]);
for my $ip (@array) {
    print join "", @{$ip} , "\n"; # or "say"
}
Since there are anonymous arrays involved, I have focused on dereferencing (using PBP style!) instead of nested loops, but "say for" is really a loop in disguise.
Cheers.
I want to get the corresponding values from a second list, based on the indexes of the unique values in my first list, using Perl.
For example:
@list1=('a','b','c','a','d');
@list2=('e','f','g','a','i');
I want to create two new lists without the duplicate values:
@new_list1=('a','b','c','d');
@new_list2=('e','f','g','i');
How can I do this?
To get the unique values from one list I can use:
my %temp_hash = map { $_, 0 } @list1;
my @uniq_array = keys %temp_hash;
print "@uniq_array\n";
But how do I get the values at the corresponding indexes from the other list?
Thanks in advance.
EDIT:
It should choose among duplicates based on some condition, not simply keep the first occurrence.
For example:
@list1=('a','b','c','a','d');
@list2=('e','f','g','a','i');
and
@list1=('a','b','c','a','d');
@list2=('a','f','g','e','i');
should give the same result:
@new_list1=('a','b','c','d');
@new_list2=('e','f','g','i');
In this example, the condition could be to exclude the entry where both lists hold the same value. In the first case there are two occurrences of 'a': the first 'a' corresponds to 'e' and the second 'a' corresponds to 'a', so the second one should be removed, not the first.
When dealing with parallel arrays, one deals with indexes. To find the list of indexes, one can start with the following common method of finding unique values:
my %seen;
my @uniq = grep !$seen{$_}++, @dups;
and expand it to take a list of indexes as input:
my %seen;
my @indexes = grep !$seen{ $list1[$_] }++, 0..$#list1;
In your updated question, you need something more complex:
my %options;
for (0..$#list1) {
    push @{ $options{ $list1[$_] } }, $_;
}
my %seen;
my @indexes;
for (0..$#list1) {
    next if $seen{ $list1[$_] }++;   # skip values already handled
    my @options = @{ $options{ $list1[$_] } };
    my $option = pick(\@list1, \@list2, \@options) // $options[0];
    push @indexes, $option;
}
Then all you have to do is extract the desired elements:
my @new_list1 = @list1[ @indexes ];
my @new_list2 = @list2[ @indexes ];
The pick function for the possible selection algorithm you describe is:
sub pick {
    my ($list1, $list2, $options) = @_;
    return ( grep $list1->[$_] ne $list2->[$_], @$options )[0];
}
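Put together, the index-based approach can be sketched as one runnable script. This is a minimal sketch rather than a definitive implementation: it assumes the selection rule from the edit to the question (prefer the index where the two lists differ), assumes Perl 5.10 or later for the // operator, and uses the second sample input from the question.

```perl
use strict;
use warnings;

my @list1 = ('a', 'b', 'c', 'a', 'd');
my @list2 = ('a', 'f', 'g', 'e', 'i');

# Among the candidate indexes, prefer the first one whose two values differ.
sub pick {
    my ($list1, $list2, $options) = @_;
    return ( grep { $list1->[$_] ne $list2->[$_] } @$options )[0];
}

# Group the indexes by the value they hold in @list1.
my %options;
push @{ $options{ $list1[$_] } }, $_ for 0 .. $#list1;

my (%seen, @indexes);
for my $i (0 .. $#list1) {
    next if $seen{ $list1[$i] }++;            # handle each value only once
    my @options = @{ $options{ $list1[$i] } };
    # Fall back to the first candidate when pick() finds nothing suitable.
    push @indexes, pick(\@list1, \@list2, \@options) // $options[0];
}

# Array slices extract the chosen positions from both lists in parallel.
my @new_list1 = @list1[@indexes];
my @new_list2 = @list2[@indexes];

print "@new_list1\n";   # a b c d
print "@new_list2\n";   # e f g i
```

Note that the chosen indexes here are 3, 1, 2, 4, so the output order follows the chosen occurrences rather than ascending index order; sort @indexes numerically first if the original ordering matters.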
use strict;
use warnings;
use Data::Dumper;
my @list1=('a','b','c','a','d');
my @list2=('e','f','g','a','i');
my %hash1 = ();
my %hash2 = ();
$hash1{$_}++ for @list1;
my @uniq_list1 = sort keys %hash1;
for (@list2)
{
    next if defined ($hash1{$_});
    $hash2{$_}++;
}
my @uniq_list2 = sort keys %hash2;
print Dumper(\@uniq_list1);
print Dumper(\@uniq_list2);
Output:
$VAR1 = [
'a',
'b',
'c',
'd'
];
$VAR1 = [
'e',
'f',
'g',
'i'
];
I just read
How can I generate all permutations of an array in Perl?
http://www.perlmonks.org/?node_id=503904
and
https://metacpan.org/module/Algorithm::Permute
I want to create all possible combinations of a user-defined length from the values in an array.
PerlMonks did it like this:
@a= glob "{a,b,c,d,e,1,2,3,4,5}"x 2;
for(@a){print "$_ "}
This works fine, but instead of "{a,b,c,d,e,1,2,3,4,5}" I would like to use an array.
I tried this:
@a= glob @my_array x $userinput ;
for(@a){print "$_ "}
but it didn't work. How can I do that? Or how can I limit the length of the permutations with Algorithm::Permute?
Simply generate the string from the array:
my @array = ( 'a' .. 'e', 1 .. 5 );
my $stringified = join ',', @array;
my @a = glob "{$stringified}" x 2;
say 0+@a; # Prints '100'
say join ', ', @a; # 'aa, ab, ac, ad ... 53, 54, 55'
One could also use a CPAN module. Like List::Gen:
use List::Gen 'cartesian';
my @permutations = cartesian { join '', @_ } map [ $_ ], ( 'a'..'e', 1..5 ) ;
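Returning to the glob approach, the length can come from a variable instead of the literal 2. The sketch below reuses the question's names @my_array and $userinput with made-up values, and it assumes the elements contain no whitespace, commas, braces, or other glob metacharacters, since those would change how the pattern is expanded.

```perl
use strict;
use warnings;

my @my_array  = ('a' .. 'c');   # sample values, chosen for illustration
my $userinput = 2;              # hypothetical: the length requested by the user

# Build one "{a,b,c}" alternation, then repeat it $userinput times so that
# glob expands every combination of that length.
my $pattern = '{' . join(',', @my_array) . '}';
my @a = glob($pattern x $userinput);

print scalar(@a), "\n";   # 9
print "@a\n";             # aa ab ac ba bb bc ca cb cc
```

The number of results grows as the array size raised to the power of $userinput, so large inputs produce very long lists.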
I have the following code.
Here I am matching words that contain all of the vowels:
if ( /(a)+/ and /(e)+/ and /(i)+/ and /(o)+/ and /(u)+/ )
{
print "$1#$2#$3#$4#$5\n";
$number++;
}
I am trying to get all the matched patterns using grouping, but I only get the match from the last expression, that is, the fifth expression in the if condition. I know it gives only one pattern because it reflects the last pattern match in the if condition. I want to get all the matched patterns, however. Can anyone help me with this problem?
It is not quite clear what you want to do. Here are some thoughts.
Are you trying to count the number of vowels? In which case, tr will do the job:
my $count = tr/aeiou// ;
printf("string:%-20s count:%d\n" , $_ , $count ) ;
output :
string:book count:2
string:stackoverflow count:4
Or extract the vowels
my @array = / ( [aeiou] ) /xg ;
print Dumper \@array ;
Output from "stackoverflow question"
$VAR1 = [
'a',
'o',
'e',
'o',
'u',
'e',
'i',
'o'
];
Or extract sequences of vowels
my @array = / ( [aeiou]+ ) /xg ;
print Dumper \@array ;
Output from "stackoverflow question"
$VAR1 = [
'a',
'o',
'e',
'o',
'ue',
'io'
];
You could use
sub match_all {
    my($s,@patterns) = @_;
    my @matches = grep @$_ >= 1,
                  map [$s =~ /$_/g] => @patterns;
    wantarray ? @matches : \@matches;
}
to create an array of non-empty matches.
For example:
my $string = "aaa e iiii oo uuuuu aa";
my @matches = match_all $string, map qr/$_+/ => qw/ a e i o u /;
if (@matches == 5) {
    print "[", join("][", @$_), "]\n"
        for @matches;
}
else {
    my $es = @matches == 1 ? "" : "es";
    print scalar(@matches), " match$es\n";
}
Output:
[aaa][aa]
[e]
[iiii]
[oo]
[uuuuu]
An input of, say, "aaa iiii oo uuuuu aa" produces
4 matches
You have 5 patterns with one capture group () each, not 1 pattern with 5 groups.
(a)+ matches a run of one or more a's (a, aa, aaa, aaaa, etc.). The match is only that run of a's, not the whole word containing it.
Your if (...) is true only if $_ contains at least one of each of 'a', 'e', 'i', 'o', 'u'.
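If the goal is one pattern with five groups, so that $1 through $5 are all set by a single match, lookaheads are one possible way to do it. The sketch below is only an illustration of that idea, not necessarily what the question intended; the sample words are made up, and each group captures the first run of its vowel.

```perl
use strict;
use warnings;

my @lines;
for my $word (qw(education book sequoia)) {
    # One pattern, five capture groups: each lookahead locates a run of one
    # vowel anywhere in the word without consuming any characters.
    if ($word =~ /^(?=.*?(a+))(?=.*?(e+))(?=.*?(i+))(?=.*?(o+))(?=.*?(u+))/) {
        push @lines, "$word => $1#$2#$3#$4#$5";
    }
}

print "$_\n" for @lines;
# education => a#e#i#o#u
# sequoia => a#e#i#o#u
```

Because every lookahead starts from the beginning of the string, the vowels may appear in any order in the word; "book" is skipped because it lacks four of the five vowels.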