Create a hash of array: Displaying array reference - perl

Below is my code (just playing with hashes), where I want to create a hash of arrays (each key maps to an array). But the output also shows array references. Why are these array references displayed?
#!/usr/bin/perl
my @result = (0,0,0);
my @operator = ('AP', 'MP', 'UP');
my %operator_res;
for ( $i = 0; $i <= $#operator; $i++ ) {
    if ( $i == 2 ) {
        @result = (4,5,6);
    } elsif ( $i == 1 ) {
        @result = (1,2,3);
    }
    @{$operator_res{$operator[$i]}} = @result;
}
foreach $keys (%operator_res) {
    print "$keys:";
    #print "@{$operator_res{$keys}}\n";
    print "$operator_res{$keys}[0], $operator_res{$keys}[1], $operator_res{$keys}[2]\n";
}
Output is
UP:4, 5, 6
ARRAY(0x17212e70):, , Why is this array reference printing?
AP:0, 0, 0
ARRAY(0x17212e00):, ,
MP:1, 2, 3
ARRAY(0x17212e20):, ,

foreach $keys (%operator_res)
should be
foreach $keys (keys %operator_res)

Your foreach loop iterates over each element of %operator_res, not just over the keys. As ikagim already answered, you have to use keys to get only the keys of the hash.
If you dump %operator_res with Data::Dumper (in list context, as your loop sees it), the output is:
$VAR1 = 'UP';
$VAR2 = [
4,
5,
6
];
$VAR3 = 'AP';
$VAR4 = [
0,
0,
0
];
$VAR5 = 'MP';
$VAR6 = [
1,
2,
3
];
As you see, you will always get two iterations per element: one for the key and one for the array ref.
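To make the two-iterations-per-entry behaviour concrete, here is a minimal sketch. The hash contents are copied from the question; the names @flat and @lines are only illustrative. It shows that a hash in list context flattens to alternating keys and values, while keys returns the keys alone:

```perl
use strict;
use warnings;

my %operator_res = (
    AP => [0, 0, 0],
    MP => [1, 2, 3],
    UP => [4, 5, 6],
);

# A hash in list context flattens to key, value, key, value, ...
my @flat = %operator_res;
print scalar(@flat), " items in list context\n";    # 6: three keys plus three array refs

# keys returns only the keys; sort makes the order predictable.
my @lines;
for my $key (sort keys %operator_res) {
    push @lines, "$key: " . join(', ', @{ $operator_res{$key} });
}
print "$_\n" for @lines;
```

Iterating over keys %operator_res is what removes the ARRAY(0x...) lines from the output.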

A hash value in Perl must be a scalar. To simulate multidimensional hashes, use values that are references to hashes or arrays.
The line
@{$operator_res{$operator[$i]}} = @result;
in your question is equivalent to
$operator_res{ $operator[$i] } = [ @result ];
That is, the value associated with the key $operator[$i] at the time is a reference to a new array whose contents are the same as @result.
For many examples, read the perllol documentation.

You could use Data::Dumper to print out your data in a well formatted way:
use Data::Dumper;
print Dumper(\%operator_res);
Q: Why is this array reference printing?
A: Because of this line: print "$keys:";

remove an array from AOA perl

I have an array of array that looks like this -
$VAR1 = [
    'sid_R.ba',
    'PS20TGB2YM13',
    'SID_r.BA',
    'ARS',
    'XBUE'
];
$VAR2 = [
    'sddff.pk',
    'PQ10XD06K800',
    'SDDFF.PK',
    'USD',
    'PINX'
];
$VAR3 = [
    'NULL',
    'NULL',
    'NULL',
    '.',
    'XNAS'
];
$VAR4 = [
    'NULL',
    'NULL',
    'NULL',
    '.',
    'XNAS'
];
$VAR5 = [
    'NULL',
    'NULL',
    'NULL',
    'EUR',
    'OTCX'
];
$VAR6 = [
    'sid.ba',
    'PS20TGB1TN17',
    'SID.BA',
    'ARS',
    'XBUE'
];
I want to remove the complete block (array ref) if any of its elements is NULL.
My code generates the array, so I tried deleting inside a for loop, but then the array indexes shift while the loop is still running.
I also don't know in advance the order or the length of the array.
Please, I need a generic solution.
Please help.
Thanks
You seem to have an array like
my @AoA = (
[1, 2, 3],
[4, 5, 6],
[7, 8, "NULL"],
[9, 10],
);
You want to select all child arrays that do not contain "NULL". Easy: Just use nested grep:
my @AoA_sans_NULL = grep {
    not grep { $_ eq "NULL" } @$_
} @AoA;
The grep { CONDITION } @array selects all elements from @array where the CONDITION evaluates to true.
The grep { $_ eq "NULL" } @$_ counts the number of "NULL"s in the inner array. If this is zero, our condition is true, else, we don't want to keep that sub-array.
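For a version that can be run as is, here is a small sketch of the nested grep applied to rows modelled on the question's data. The rows are shortened and chosen only for illustration:

```perl
use strict;
use warnings;

# Sample rows modelled on the question's data (shortened).
my @AoA = (
    [ 'sid_R.ba', 'PS20TGB2YM13', 'ARS', 'XBUE' ],
    [ 'NULL',     'NULL',         '.',   'XNAS' ],
    [ 'NULL',     'NULL',         'EUR', 'OTCX' ],
    [ 'sid.ba',   'PS20TGB1TN17', 'ARS', 'XBUE' ],
);

# The outer grep keeps a row only when the inner grep finds no "NULL" in it.
my @AoA_sans_NULL = grep {
    not grep { $_ eq 'NULL' } @$_
} @AoA;

print join(' ', @$_), "\n" for @AoA_sans_NULL;
```

Because grep builds a new list, the original @AoA is left untouched and no index bookkeeping is needed.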
use List::MoreUtils qw(none);
my @filtered = grep {
    none { $_ eq "NULL" } @$_;
} @array;
Does this do what you want?
my @new_array = grep { scalar(grep { $_ eq 'NULL' } @{$_}) == 0 } @old_array;
Old school:
my @filtered = ();
ARRAY_LOOP:
for my $array ( @AoA ){
    ITEM_LOOP:
    for my $item ( @$array ){
        next ARRAY_LOOP if $item eq 'NULL';
    } # end ITEM_LOOP
    push @filtered, $array;
} # end ARRAY_LOOP
This code will be slower than the others, but an in-place solution might be useful if the data-set is very large.
use List::MoreUtils qw(any);
for(my $i = 0; $i < @AoA; $i ++) {
    splice @AoA, $i --, 1
        if any { $_ eq "NULL" } @{ $AoA[$i] };
}
A non-grep of a grep solution:
use feature 'say';
use Data::Dumper;
my @array = ...; # Array of Arrays
for my $array_index ( reverse 0 .. $#array ) {
    my @inner_array = @{ $array[$array_index] };
    if ( grep /^NULL$/, @inner_array ) {
        splice @array, $array_index, 1;
    }
}
say Dumper @array;
The splice command removes the entire subarray. I don't need to create @inner_array; I could have used the dereferenced @{ $array[$array_index] } directly in the if statement, but I like going for clarity.
The only gotcha is that you have to go through your array of arrays backwards. If you go from the first element to the last and remove element 2, all the later elements have their indexes decremented. If I first remove element 4, elements 0 to 3 don't change their index.
It's not as elegant as the grep of a grep solutions, but it's a lot easier to maintain. Imagine someone who has to go through your program six months from now trying to figure out what:
grep { not grep { $_ eq "NULL" } @$_ } @array;
is doing.
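The backwards-iteration gotcha is easy to reproduce. The sketch below uses a made-up four-row array and a hypothetical helper has_null (neither is from the answers above) to compare a naive forward pass with the reverse pass:

```perl
use strict;
use warnings;

# Hypothetical helper: does this row contain the string "NULL"?
sub has_null { return scalar grep { $_ eq 'NULL' } @{ $_[0] } }

my @rows = ( ['a'], ['NULL'], ['NULL'], ['b'] );

# Forward pass without adjusting the index: after removing index 1,
# the second NULL row slides into index 1 and is never examined.
my @forward = @rows;
for ( my $i = 0; $i < @forward; $i++ ) {
    splice @forward, $i, 1 if has_null( $forward[$i] );
}

# Reverse pass: removals only shift elements that were already visited.
my @backward = @rows;
for my $i ( reverse 0 .. $#backward ) {
    splice @backward, $i, 1 if has_null( $backward[$i] );
}

print scalar(@forward),  " rows left after the forward pass\n";
print scalar(@backward), " rows left after the reverse pass\n";
```

The forward pass leaves one NULL row behind, which is why the in-place answers either walk the indexes in reverse or decrement the index after each splice.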

Perl - splice() issues

I'm having trouble using the Perl splice() function. Below you will see that I first identify the indexes of the two strings that I am looking for and then perform splice() using the indexes to get the desired array.
My code is as follows:
my @a = qw(foo bar bazz elements in between hello bazz johnny bl aba);
my $z = 0;
for (my $i = 0; $i < @a; $i++)
{
    next unless $a[$i] =~ /bazz/;
    if( $z eq 0 )
    {
        $z++;
        $first = $i;
    }
    else
    {
        $second = $i;
    }
    my @b = splice(@a,$first,$second);
    print Dumper(@b);
}
And the result of the print is as follows:
$VAR1 = 'bazz';
$VAR2 = 'elements';
$VAR3 = 'in';
$VAR4 = 'between';
$VAR5 = 'hello';
$VAR6 = 'bazz';
$VAR7 = 'johnny';
I was under the impression that splice takes the chunk in between the given limits, inclusive of course. I don't understand why element 'johnny' would be there. Shouldn't the list stop at the second 'bazz' ?
Thank you for any pointers on this issue.
The third argument to splice (the second one after the array) is the length of the chunk to remove, not the index of its last element.
splice takes the arguments as
splice @ARRAY, $OFFSET, $LENGTH, @REPLACE_LIST;
It removes $LENGTH elements from @ARRAY starting at index $OFFSET and replaces them with the given list (or just deletes them when the list is empty or omitted).
It seems you want an array slice instead:
my @b = @a[$first .. $second];
print Dumper \@b;
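As a side-by-side sketch of the two operations, using the question's data (the names @slice, @copy and @removed are only illustrative): the slice takes two indexes, while splice needs the length, here computed as $second - $first + 1.

```perl
use strict;
use warnings;

my @a = qw(foo bar bazz elements in between hello bazz johnny bl aba);

# Indexes of the elements matching /bazz/ (2 and 7 for this data).
my ( $first, $second ) = grep { $a[$_] =~ /bazz/ } 0 .. $#a;

# Array slice: takes indexes, both ends inclusive, and leaves @a untouched.
my @slice = @a[ $first .. $second ];

# splice: takes an offset and a LENGTH, and removes the elements from the array.
my @copy    = @a;
my @removed = splice @copy, $first, $second - $first + 1;

print "slice:   @slice\n";
print "removed: @removed\n";
print "left:    @copy\n";
```

Both calls return the same six elements; the difference is that splice also shortens the array it was given.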

Building and printing a multidimensional list in Perl without looping

The top answer in this post: How can I create a multidimensional array in Perl? suggests building a multi-dimensional array as follows:
my @array = ();
foreach my $i ( 0 .. 10 ) {
    foreach my $j ( 0 .. 10 ) {
        push @{ $array[$i] }, $j;
    }
}
I am wondering if there is a way of building the array more compactly and avoiding the nested loop, e.g. using something like:
my @array = ();
my @other_array = (0 ... 10);
foreach my $i ( 0 .. 10 ) {
    $array[$i] = @other_array; # This does not work in Perl
}
Does Perl support any syntax like that for building multi-dimensional arrays without nested looping?
Similarly, is there a way to print the multidimensional array without (nested) looping?
There is more than one way to do it:
Generating
push accepts LISTs
my @array;
push @{$array[$_]}, 0 .. 10 for 0 .. 10;
Alternative syntax:
my @array;
push @array, [ 0 .. 10 ] for 0 .. 10;
map eye-candy
my @array = map { [ 0 .. 10 ] } 0 .. 10;
Alternative syntax:
my @array = map [ 0 .. 10 ], 0 .. 10;
Printing
With minimal looping
print "@$_\n" for @array;
On Perl 5.10+
use feature 'say';
say "@$_" for @array;
With more formatting control
print join( ', ', @$_ ), "\n" for @array; # "0, 1, 2, ... 9, 10"
"No loops" (The loop is hidden from you)
use Data::Dump 'dd';
dd @array;
Data::Dumper
use Data::Dumper;
print Dumper \@array;
Have a look at perldoc perllol for more details
You are close, you need a reference to the other array
my @array; # don't need the empty list
my @other_array = (0 ... 10);
foreach my $i ( 0 .. 10 ) {
    # pick ONE of the following assignments
    $array[$i] = \@other_array;
    # or without a connection to the original
    $array[$i] = [ @other_array ];
    # or for a slice
    $array[$i] = [ @other_array[1..$#other_array] ];
}
You can also make anonymous (unnamed) array reference directly using square braces [] around a list.
my @array;
foreach my $i ( 0 .. 10 ) {
    $array[$i] = [0..10];
}
Edit: printing is probably easiest using the postfix for
print "@$_\n" for @array;
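The phrase "without a connection to the original" matters more than it may look. As a rough illustration (the names @shared and @copied are made up for this sketch), storing \@other_array makes every row point at the same array, while [ @other_array ] gives each row its own copy:

```perl
use strict;
use warnings;

my @other_array = ( 0 .. 3 );

my ( @shared, @copied );
for my $i ( 0 .. 2 ) {
    $shared[$i] = \@other_array;       # reference to the one original array
    $copied[$i] = [ @other_array ];    # new anonymous array with copied values
}

# Change the first element of the first row in each structure.
$shared[0][0] = 'x';
$copied[0][0] = 'y';

print "shared: @$_\n" for @shared;    # every row shows the change
print "copied: @$_\n" for @copied;    # only the first row changed
```

If the rows are meant to be edited independently later, the copying form is usually the one to choose.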
For numerical multidimensional arrays, you can use PDL. It has several constructors for different use cases. The one analogous to the above would be xvals. Note that PDL objects overload printing, so you can just print them out.
use PDL;
my $pdl = xvals(11, 11);
print $pdl;

What's the best practise for Perl hashes with array values?

What is the best practise to solve this?
if (... )
{
    push (@{$hash{'key'}}, @array ) ;
}
else
{
    $hash{'key'} ="";
}
Is it bad practice to store an array in a hash value in one case and just an empty string ("") in the other?
I'm not sure I understand your question, but I'll answer it literally as asked for now...
my @array = (1, 2, 3, 4);
my $arrayRef = \@array; # alternatively: my $arrayRef = [1, 2, 3, 4];
my %hash;
$hash{'key'} = $arrayRef; # or again: $hash{'key'} = [1, 2, 3, 4]; or $hash{'key'} = \@array;
The crux of the problem is that arrays or hashes take scalar values... so you need to take a reference to your array or hash and use that as the value.
See perlref and perlreftut for more information.
EDIT: Yes, you can add empty strings as values for some keys and references (to arrays, hashes, scalars, typeglobs/filehandles, or anything else) for other keys. They're all still scalars.
You'll want to look at the ref function for figuring out how to disambiguate between the reference types and normal scalars.
It's probably simpler to use explicit array references:
my $arr_ref = \@array;
$hash{'key'} = $arr_ref;
Actually, doing the above and using push result in the same data structure (although push copies the elements into a new array, while the reference points at @array itself):
use Data::Dumper;
my @array = qw/ one two three four five /;
my $arr_ref = \@array;
my %hash;
my %hash2;
$hash{'key'} = $arr_ref;
print Dumper \%hash;
push @{$hash2{'key'}}, @array;
print Dumper \%hash2;
This gives:
$VAR1 = {
'key' => [
'one',
'two',
'three',
'four',
'five'
]
};
$VAR1 = {
'key' => [
'one',
'two',
'three',
'four',
'five'
]
};
Using explicit array references uses fewer characters and is easier to read than the push @{$hash{'key'}}, @array construct, IMO.
Edit: For your else{} block, it's probably less than ideal to assign an empty string. It would be a lot easier to skip the if-else construct and, later on when you're accessing values in the hash, do an if( defined( $hash{'key'} ) ) check. That's a lot closer to standard Perl idiom, and you don't waste memory storing empty strings in your hash.
Otherwise, you'll have to use ref() to find out what kind of data you have in your value, and that is less clear than just doing a defined-ness check.
I'm not sure what your goal is, but there are several things to consider.
First, if you are going to store an array, do you want to store a reference to the original value or a copy of the original values? In either case, I prefer to avoid the dereferencing syntax and take references when I can:
$hash{key} = \@array; # just a reference
use Clone 'clone'; # or a similar module
$hash{key} = clone( \@array );
Next, do you want to add to the values that exist already, even if it's a single value? If you are going to have array values, I'd make all the values arrays even if you have a single element. Then you don't have to decide what to do and you remove a special case:
$hash{key} = [] unless defined $hash{key};
push @{ $hash{key} }, @values;
That might be your "best practice" answer, which is often the technique that removes as many special cases and extra logic as possible. When I do this sort of thing in a module, I typically have an add_value method that encapsulates this magic where I don't have to see it or type it more than once.
If you already have a non-reference value in the hash key, that's easy to fix too:
if( defined $hash{key} and ! ref $hash{key} ) {
$hash{key} = [ $hash{key} ];
}
If you already have non-array reference values that you want to be in the array, you do something similar. Maybe you want an anonymous hash to be one of the array elements:
if( defined $hash{key} and ref $hash{key} eq ref {} ) {
$hash{key} = [ $hash{key} ];
}
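Putting those pieces together, an add_value helper could look roughly like this. Only the name add_value comes from the answer above; the signature and body are assumptions made for this sketch, written as a plain function rather than a method:

```perl
use strict;
use warnings;

# Hypothetical helper: guarantees that $hash->{$key} is an array reference
# and then appends the new values to it.
sub add_value {
    my ( $hash, $key, @values ) = @_;

    if ( !defined $hash->{$key} ) {
        $hash->{$key} = [];                    # nothing stored yet
    }
    elsif ( ref( $hash->{$key} ) ne 'ARRAY' ) {
        $hash->{$key} = [ $hash->{$key} ];     # wrap a plain scalar or other reference
    }
    push @{ $hash->{$key} }, @values;
    return;
}

my %hash = ( single => 'one' );
add_value( \%hash, 'single', 'two', 'three' );
add_value( \%hash, 'fresh',  1, 2 );

print "$_: @{ $hash{$_} }\n" for sort keys %hash;
```

With every value normalised to an array reference, the reading code never has to check ref() before dereferencing.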
Dealing with the revised notation:
if (... )
{
    push (@{$hash{'key'}}, @array);
}
else
{
$hash{'key'} = "";
}
we can immediately tell that you are not following the standard advice that protects novices (and experts!) from their own mistakes. You're using a symbolic reference, which is not a good idea.
use strict;
use warnings;
my %hash = ( key => "value" );
my @array = ( 1, "abc", 2 );
my @value = ( 22, 23, 24 );
push(@{$hash{'key'}}, @array);
foreach my $key (sort keys %hash) { print "$key = $hash{$key}\n"; }
foreach my $value (@array) { print "array $value\n"; }
foreach my $value (@value) { print "value $value\n"; }
This does not run:
Can't use string ("value") as an ARRAY ref while "strict refs" in use at xx.pl line 8.
I'm not sure I can work out what you were trying to achieve. Even if you remove the 'use strict;' line, the hash value itself shows no change after the push operation.
use warnings;
my %hash = ( key => "value" );
my @array = ( 1, "abc", 2 );
my @value = ( 22, 23, 24 );
push @{$hash{'key'}}, @array;
foreach my $key (sort keys %hash) { print "$key = $hash{$key}\n"; }
foreach my $value (@array) { print "array $value\n"; }
foreach my $value (@value) { print "value $value\n"; }
foreach my $value (@{$hash{'key'}}) { print "h_key $value\n"; }
push @value, @array;
foreach my $key (sort keys %hash) { print "$key = $hash{$key}\n"; }
foreach my $value (@array) { print "array $value\n"; }
foreach my $value (@value) { print "value $value\n"; }
Output:
key = value
array 1
array abc
array 2
value 22
value 23
value 24
h_key 1
h_key abc
h_key 2
key = value
array 1
array abc
array 2
value 22
value 23
value 24
value 1
value abc
value 2
I'm not sure what is going on there.
If your problem is how to replace an empty string value you had stored before with an array onto which you can push your values, this might be the best way to do it:
if ( ... ) {
    my $r = \$hash{ $key }; # $hash{ $key } autoviv-ed
    $$r = [] unless ref $$r;
    push @$$r, @values;
}
else {
    $hash{ $key } = "";
}
I avoid multiple hash look-ups by saving a copy of the auto-vivified slot.
Note the code relies on a scalar or an array being the entire universe of things stored in %hash.

Difference of Two Arrays Using Perl

I have two arrays. I need to check and see if the elements of one appear in the other one.
Is there a more efficient way to do it than nested loops? I have a few thousand elements in each and need to run the program frequently.
Another way to do it is to use Array::Utils
use Array::Utils qw(:all);
my @a = qw( a b c d );
my @b = qw( c d e f );
# symmetric difference
my @diff = array_diff(@a, @b);
# intersection
my @isect = intersect(@a, @b);
# unique union
my @unique = unique(@a, @b);
# check if arrays contain same members
if ( !array_diff(@a, @b) ) {
    # do something
}
# get items from array @a that are not in array @b
my @minus = array_minus( @a, @b );
perlfaq4 to the rescue:
How do I compute the difference of two arrays? How do I compute the intersection of two arrays?
Use a hash. Here's code to do both and more. It assumes that each element is unique in a given array:
@union = @intersection = @difference = ();
%count = ();
foreach $element (@array1, @array2) { $count{$element}++ }
foreach $element (keys %count) {
    push @union, $element;
    push @{ $count{$element} > 1 ? \@intersection : \@difference }, $element;
}
If you properly declare your variables, the code looks more like the following:
my %count;
for my $element (@array1, @array2) { $count{$element}++ }
my ( @union, @intersection, @difference );
for my $element (keys %count) {
    push @union, $element;
    push @{ $count{$element} > 1 ? \@intersection : \@difference }, $element;
}
You need to provide a lot more context. There are more efficient ways of doing that, ranging from:
Go outside of Perl and use the shell (sort + comm).
Map one array into a Perl hash and then loop over the other one, checking hash membership. This has linear complexity ("M+N": basically one pass over each array) as opposed to the nested loop, which has "M*N" complexity.
Example:
my %second = map {$_=>1} @second;
my @only_in_first = grep { !$second{$_} } @first;
# use a foreach loop with `last` instead of "grep"
# if you only want yes/no answer instead of full list
Use a Perl module that does the previous bullet point for you (List::Compare was mentioned in comments).
Do it based on timestamps of when elements were added if the volume is very large and you need to re-compare often. A few thousand elements is not really big enough, but I recently had to diff 100k sized lists.
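The hash-membership approach, including the early-exit yes/no variant mentioned in the comment, might be sketched like this. The sample data and the names %in_second and $all_present are invented for the illustration:

```perl
use strict;
use warnings;

my @first  = qw( a b c d );
my @second = qw( c d e f );

# Build the lookup table once: one hash key per element of @second.
my %in_second = map { $_ => 1 } @second;

# Full list of elements that appear only in @first.
my @only_in_first = grep { !$in_second{$_} } @first;

# Yes/no answer: stop at the first element of @first that @second lacks.
my $all_present = 1;
for my $element (@first) {
    if ( !$in_second{$element} ) {
        $all_present = 0;
        last;
    }
}

print "only in first: @only_in_first\n";
print "all of \@first found in \@second: ", ( $all_present ? 'yes' : 'no' ), "\n";
```

Each array is walked once, so the work grows with M+N rather than M*N.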
You can try Array::Utils, and it makes it look nice and simple, but it's not doing any powerful magic on the back end. Here's the array_diff code:
sub array_diff(\@\@) {
    my %e = map { $_ => undef } @{$_[1]};
    return @{[ ( grep { (exists $e{$_}) ? ( delete $e{$_} ) : ( 1 ) } @{ $_[0] } ), keys %e ] };
}
Since Array::Utils isn't a standard module, you need to ask yourself if it's worth the effort to install and maintain this module. Otherwise, it's pretty close to DVK's answer.
There are certain things you must watch out for, and you have to define what you want to do in that particular case. Let's say:
@array1 = qw(1 1 2 2 3 3 4 4 5 5);
@array2 = qw(1 2 3 4 5);
Are these arrays the same? Or, are they different? They have the same values, but there are duplicates in @array1 and not @array2.
What about this?
@array1 = qw( 1 1 2 3 4 5 );
@array2 = qw( 1 1 2 3 4 5 );
I would say that these arrays are the same, but Array::Utils::array_diff begs to differ. This is because Array::Utils assumes that there are no duplicate entries.
And, even the Perl FAQ pointed out by mob also says that "It assumes that each element is unique in a given array." Is this an assumption you can make?
No matter what, hashes are the answer. It's easy and quick to look up a hash. The problem is what do you want to do with unique values.
Here's a solid solution that assumes duplicates don't matter:
sub array_diff {
    my @array1 = @{ shift() };
    my @array2 = @{ shift() };
    my %array1_hash;
    my %array2_hash;
    # Create a hash entry for each element in @array1
    for my $element ( @array1 ) {
        $array1_hash{$element} = 1;
    }
    # Same for @array2: This time, use map instead of a loop
    map { $array2_hash{$_} = 1 } @array2;
    for my $entry ( @array2 ) {
        if ( not $array1_hash{$entry} ) {
            return 1; # Entry in @array2 but not @array1: Differ
        }
    }
    if ( keys %array1_hash != keys %array2_hash ) {
        return 1; # Arrays differ
    }
    else {
        return 0; # Arrays contain the same elements
    }
}
If duplicates do matter, you'll need a way to count them. Here's using map not just to create a hash keyed by each element in the array, but also to count the duplicates in the array:
my %array1_hash;
my %array2_hash;
map { $array1_hash{$_} += 1 } @array1;
map { $array2_hash{$_} += 1 } @array2;
Now, you can go through each hash and verify that not only do the keys exist, but that their entries match
for my $key ( keys %array1_hash ) {
if ( not exists $array2_hash{$key}
or $array1_hash{$key} != $array2_hash{$key} ) {
return 1; #Arrays differ
}
}
You will only exit the for loop if all of the entries in %array1_hash match their corresponding entries in %array2_hash. Now, you have to show that all of the entries in %array2_hash also match their entries in %array1_hash, and that %array2_hash doesn't have more entries. Fortunately, we can do what we did before:
if ( keys %array2_hash != keys %array1_hash ) {
return 1; #Arrays have a different number of keys: Don't match
}
else {
return; #Arrays have the same keys: They do match
}
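The counting fragments above can be assembled into one function. This is only a sketch of that idea: the name arrays_differ and the choice to pass array references are assumptions, not part of the original answer.

```perl
use strict;
use warnings;

# Returns 1 if the two arrays differ and 0 if they hold the same elements
# the same number of times (order is ignored).
sub arrays_differ {
    my ( $array1, $array2 ) = @_;

    my ( %count1, %count2 );
    $count1{$_}++ for @$array1;
    $count2{$_}++ for @$array2;

    # Different numbers of distinct elements: cannot match.
    return 1 if keys %count1 != keys %count2;

    for my $key ( keys %count1 ) {
        return 1 if !exists $count2{$key} or $count1{$key} != $count2{$key};
    }
    return 0;
}

print arrays_differ( [qw(1 1 2 3)], [qw(3 2 1 1)] ), "\n";    # 0: same elements, same counts
print arrays_differ( [qw(1 1 2 3)], [qw(1 2 3)]   ), "\n";    # 1: the duplicate counts differ
```

Checking the number of distinct keys first means a single pass over %count1 is enough to cover both directions.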
You can use this to get the difference between two arrays:
#!/usr/bin/perl -w
use strict;
my @list1 = (1, 2, 3, 4, 5);
my @list2 = (2, 3, 4);
my %diff;
@diff{ @list1 } = undef;
delete @diff{ @list2 };
# keys %diff now holds the elements of @list1 that are not in @list2 (1 and 5)
You want to compare each element of @x against the element of the same index in @y, right? This will do it.
print "Index: $_ => \@x: $x[$_], \@y: $y[$_]\n"
    for grep { $x[$_] != $y[$_] } 0 .. $#x;
...or...
foreach( 0 .. $#x ) {
    print "Index: $_ => \@x: $x[$_], \@y: $y[$_]\n" if $x[$_] != $y[$_];
}
Which you choose kind of depends on whether you're more interested in keeping a list of indices to the dissimilar elements, or simply interested in processing the mismatches one by one. The grep version is handy for getting the list of mismatches.
An n + n log n algorithm, if you are sure that the elements are unique within each array (they are used as hash keys):
my %count = ();
foreach my $element (@array1, @array2) {
    $count{$element}++;
}
my @difference = grep { $count{$_} == 1 } keys %count;
my @intersect = grep { $count{$_} == 2 } keys %count;
my @union = keys %count;
So if I'm not sure of uniqueness and want to check whether the elements of array1 are present in array2:
my %count = ();
foreach (@array1) {
    $count{$_} = 1;
}
foreach (@array2) {
    $count{$_} = 2 if $count{$_};
}
# N log N
if (grep { $_ == 1 } values %count) {
    return 'Some element of array1 does not appear in array2';
} else {
    return 'All elements of array1 are in array2';
}
# N + N log N
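my @a = (1,2,3);
my @b = (2,3,1);
# The parentheses end grep's list at @a; note that smartmatch (~~) is experimental and deprecated in newer Perl releases.
print "Equal" if (grep { $_ ~~ @b } @a) == @b;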
Not elegant, but easy to understand:
#!/usr/local/bin/perl
use strict;
my $file1 = shift or die("need file1");
my $file2 = shift or die("need file2");
my @file1lines = split/\n/,`cat $file1`;
my @file2lines = split/\n/,`cat $file2`;
my %lines;
foreach my $file1line(@file1lines){
    $lines{$file1line} |= 1; # bit 1: seen in file1 (|= so repeated lines are not miscounted)
}
foreach my $file2line(@file2lines){
    $lines{$file2line} |= 2; # bit 2: seen in file2
}
while(my($key,$value)=each%lines){
    if($value == 1){
        print "$key is in only $file1\n";
    }elsif($value == 2){
        print "$key is in only $file2\n";
    }elsif($value == 3){
        print "$key is in both $file1 and $file2\n";
    }
}
exit;
__END__
Try List::Compare. It provides the usual set operations on arrays, such as intersection, union and difference.