Why is 1 assigned to a hash element?

Could any one explain what this statement does in Perl
$type{$_->{brand}} = 1;
I could understand that the hash %type has a key brand holding the reference to another hash brand and 1 is assigned to it
what does it mean??!!! when it is assigned as 1?
package SillyFunction;
sub group_products {
    my $products = shift;
    my %brand_type = ();
    my $grouped_products = [];
    foreach (@{$products}) {
        $brand_type{ $_->{brand} } ||= {};
        $brand_type{ $_->{brand} }->{ $_->{type} } = 1;
    }
    foreach (sort keys %brand_type) {
        my $brand = $_;
        foreach (sort keys %{ $brand_type{$brand} }) {
            push(@{$grouped_products}, { brand => $brand, type => $_ });
        }
    }
    $grouped_products;
}
1;

The code
$type{$_->{brand}} = 1;
means:
We have a variable of the type hash, named %type.
The topic variable $_ contains a reference to a hash.
We access the entry called brand in the hash referenced by $_. We remember this value.
We access the entry with the name we just remembered in the hash named %type.
Hash elements are lvalues, i.e. something can be assigned to them.
We assign the number 1 into the hash slot we just accessed.
Points to note:
In Perl, a hash is a data structure. Other languages know this as an associative array. It maps strings to scalar values.
A hash function calculates a characteristic number for a given string. The hash data structure uses such a function internally, and in a way inaccessible from Perl. Hash functions are also important in cryptography.
The = operator assigns the thing on the right to the thing on the left.
That line of code has not a single keyword, only variables (%type, $_), constants ('brand', 1) and operators ({...}, ->, =, ;).
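The steps above can be tried out on a small sketch. The sample products are hypothetical. The value 1 carries no meaning of its own; it is a common way to use a hash as a set, so that each brand ends up as a key exactly once.

```perl
use strict;
use warnings;

# Hypothetical sample data: each product is a hash reference.
my @products = (
    { brand => 'Acme',   type => 'anvil'  },
    { brand => 'Acme',   type => 'rocket' },
    { brand => 'Globex', type => 'laser'  },
);

my %type;
for (@products) {
    # $_->{brand} yields e.g. 'Acme'; that string becomes a key of %type.
    # The value 1 is only a "seen" flag; a repeated brand overwrites it.
    $type{ $_->{brand} } = 1;
}

my @brands = sort keys %type;
print "@brands\n";    # prints "Acme Globex"
```

Afterwards, keys %type is the list of distinct brands, and $type{Acme} is a cheap membership test.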
Here is the code you posted in your comment, annotated with comments:
# Declare a namespace "SillyFunction".
# This affects the full names of the subroutines, and some variables.
package SillyFunction;
# Declare a sub that takes one parameter.
sub group_products {
    my $products = shift;
    my %brand_type = (); # %brand_type is an empty hash.
    my $grouped_products = []; # $grouped_products is a reference to an array
    # loop through the products.
    # The @{...} "dereferences" an arrayref to an ordinary array
    # The current item is in the topic variable $_
    foreach (@{$products}) {
        # All the items in $products are references to hashes.
        # The hashes have keys "brand" and "type".
        # If the entry of %brand_type with the name of $_->{brand} is false,
        # then we assign an empty hashref.
        # This is stupid (see discussion below)
        $brand_type{$_->{brand}} ||= {};
        # We access the entry named $_->{brand}.
        # We use that value as a hashref, and access the entry $_->{type} in there.
        # We then assign the value 1 to that slot.
        $brand_type{$_->{brand}}->{$_->{type}} = 1;
    }
    # We get the names of all entries of %brand_type with the keys function
    # We sort the names alphabetically.
    # The current key is in $_
    foreach (sort keys %brand_type) {
        # We assign the current key to the $brand variable.
        # This is stupid.
        my $brand = $_;
        # We get all the keys of the hash referenced by $brand_type{$brand}
        # And sort that again.
        # The current key is in $_
        foreach (sort keys %{$brand_type{$brand}}) {
            # We dereference the ordinary array from the arrayref $grouped_products.
            # We add a hashref to the end that contains entries for brand and type
            push(@{$grouped_products}, { brand => $brand, type => $_});
        }
    }
    # We implicitly return the arrayref containing all brands and types.
    $grouped_products;
}
# We return a true value to signal perl that this module loaded all right.
1;
What does this code do? It takes all products (a product is a hashref containing a field for brand and type), and sorts them primarily by brand, secondarily by type, in alphabetic, ascending order.
While doing so, the author produced horrible code. Here is what could have gone better:
He uses an arrayref instead of an array. It would have been easier to just use an array, and return a reference to that:
my @grouped_products;
push @grouped_products, ...;
return \@grouped_products; # reference operator \
At some point, a hashref is assigned. This is unnecessary, as Perl autovivifies undefined values that you use as a hash or array reference. That complete line is useless. Also, it is only assigned if that value is false. What the author probably wanted is to assign if that value is undefined. The defined-or operator // could have been used here (available since Perl 5.10).
A hash of hashes is built. This is wasteful. A hash of arrays would have been better.
If one loops over values with for or foreach, the current item doesn't have to be assigned to the cryptic $_. Instead, a loop variable can be specified: foreach my $foo (@bar). The default behaviour of foreach is similar to foreach local $_ (@bar).
Implicit returns are bad.
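Two of the points above, autovivification and the difference between ||= and //=, are easy to check in isolation. This is a small sketch with made-up hash names, not part of the original module:

```perl
use strict;
use warnings;

my %brand_type;

# No "||= {}" needed: using the undefined slot as a hash reference
# makes Perl create the inner hash on the fly (autovivification).
$brand_type{Acme}{anvil} = 1;

# //= assigns only if the slot is undefined (Perl 5.10 or later).
my %defined_or = (zero => 0);
$defined_or{zero}  //= 5;    # stays 0, because 0 is defined
$defined_or{unset} //= 5;    # becomes 5, because the slot was undefined

# ||= assigns whenever the slot is false, which includes 0 and ''.
my %logical_or = (zero => 0);
$logical_or{zero} ||= 5;     # becomes 5, because 0 is false

my $kind = ref $brand_type{Acme};
print "$kind $defined_or{zero} $logical_or{zero}\n";    # prints "HASH 0 5"
```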
Here is a piece of code that implements the same subroutine, but more perlish — remember, we just wanted to sort the products (assuming they already are unique)
sub group_products {
    my ($products) = @_;
    my @grouped =
        # sort by brand. If that is a draw, sort by type.
        sort { $a->{brand} cmp $b->{brand} or $a->{type} cmp $b->{type} }
        map { +{%$_} } # make a copy.
        @$products; # easy dereference
    return \@grouped;
}
Explanation: This code is largely self-documenting. The sort function takes a block that has to return a number: Either negative for “$a is smaller than $b”, zero for “$a and $b are equal”, or positive for “$a is larger than $b”.
The cmp operator compares the operands lexicographically. If the brands are different, then we don't have to compare the types. If the brands are the same, then the first cmp returns 0, which is a false value. Therefore, the second comparison (type) is executed, and that value returned. This is a standard Perl idiom for sorting by primary and secondary key.
The sort and map cascade executes from right/bottom to left/top.
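To see the primary/secondary key idiom at work, here is a short sketch on hypothetical records that start out in the wrong order:

```perl
use strict;
use warnings;

# Hypothetical records, deliberately out of order.
my @products = (
    { brand => 'Globex', type => 'laser'  },
    { brand => 'Acme',   type => 'rocket' },
    { brand => 'Acme',   type => 'anvil'  },
);

my @sorted = sort {
    $a->{brand} cmp $b->{brand}    # primary key
        or
    $a->{type} cmp $b->{type}      # consulted only when the brands tie
} @products;

my @labels = map { "$_->{brand}/$_->{type}" } @sorted;
print join(', ', @labels), "\n";
# prints "Acme/anvil, Acme/rocket, Globex/laser"
```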
If the uniqueness is not guaranteed, something like this would work better:
use List::MoreUtils qw/uniq/;
sub group_products {
    my ($products) = @_;
    my %grouping;
    push @{ $grouping{ $_->{brand} } }, $_->{type} for @$products;
    my @grouped;
    for my $brand (sort keys %grouping) {
        # The explicit block stops sort from parsing uniq as its comparison sub.
        push @grouped, +{brand => $brand, type => $_}
            for sort { $a cmp $b } uniq(@{ $grouping{$brand} });
    }
    return \@grouped;
}
Explanation: We define a %grouping hash (to be filled). For each product, we add the type of that product to the arrayref of the appropriate brand in the grouping hash. That is, we collect all types for each brand. We define an array of all grouped products (to be filled). We iterate through all brands in alphabetical order, and then iterate through all unique types of that brand in alphabetical order. For each of these brand/type combinations, we add a new hashref to the grouped products. The uniq function is imported from the excellent List::MoreUtils module. We return a reference to the array of grouped products.
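A usage sketch of that approach follows. So that it runs without installing anything, it defines a small local uniq as a stand-in for List::MoreUtils::uniq; that stand-in and the sample data are assumptions made for this example.

```perl
use strict;
use warnings;

# Local stand-in for List::MoreUtils::uniq (keeps first occurrences).
sub uniq { my %seen; return grep { !$seen{$_}++ } @_ }

sub group_products {
    my ($products) = @_;
    my %grouping;
    push @{ $grouping{ $_->{brand} } }, $_->{type} for @$products;
    my @grouped;
    for my $brand (sort keys %grouping) {
        push @grouped, +{ brand => $brand, type => $_ }
            for sort { $a cmp $b } uniq(@{ $grouping{$brand} });
    }
    return \@grouped;
}

# Hypothetical input with one duplicate brand/type pair.
my $grouped = group_products([
    { brand => 'B', type => 'y' },
    { brand => 'A', type => 'x' },
    { brand => 'B', type => 'y' },
    { brand => 'B', type => 'w' },
]);

my $summary = join ' ', map { "$_->{brand}/$_->{type}" } @$grouped;
print "$summary\n";    # prints "A/x B/w B/y"
```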

Related

How to store values of hash in array in perl?

I have a hash with duplicate values and unique keys. I have to store the keys in arrays of size 5; if there are more keys, a new array should be created to hold them.
The keys stored in one array should all have the same value.
Note: I have to read those values from an Excel sheet and generate a C source file.
Ex:
%hash = (a=>1,b=>2,c=>1,d=>1,e=>3,f=>4,g=>4,h=>1,i=>1,j=>1);
output in c file:
datatype arr1[]={a,c,d,h,i};
datatype arr2[]={j};
datatype arr3[]={b};
datatype arr4[]={e};
datatype arr5[]={f,g};

So you need to find keys that have the same values?
So we need to invert the hash, while handling the fact that the original values are not unique. So instead of just transforming 'key' => 'value' pairs into 'value' => 'key', we need to store the keys in arrays.
my %hash = ...;
my %transposed;
for my $key (keys %hash) {
    my $value = $hash{$key};
    $transposed{$value} = [] unless defined $transposed{$value};
    push @{ $transposed{$value} }, $key;
}
Then you have a hash of arrays, where each key is a value in the original hash and the elements of each array are the original keys. The next step is to iterate over the keys and split each list into lines of 5 elements:
use feature 'say';    # say needs Perl 5.10 or later
for my $key (sort keys %transposed) {
    while (@{ $transposed{$key} }) {
        my @list = splice @{ $transposed{$key} }, 0, 5;
        say join ", ", @list;
    }
}
The main part is the while loop, which iterates as long as there are elements in the current list; in each iteration, splice removes and returns up to 5 elements from the list. Adding the exact C code is left as an exercise for the interested reader... :-)
You might need to read up on references: http://perldoc.perl.org/perlreftut.html
The line setting a hash value to a reference to an empty array is not necessary, as Perl will automatically create an arrayref when you try to push a value onto it. I have included it to make it clearer what is going on.
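One possible take on that exercise, as a sketch only: it reuses the sample hash from the question, keeps "datatype" as the placeholder the question used, and assumes the arrays should be numbered in order of the original values.

```perl
use strict;
use warnings;

my %hash = (a=>1, b=>2, c=>1, d=>1, e=>3, f=>4, g=>4, h=>1, i=>1, j=>1);

# Invert: each original value maps to the sorted list of keys that had it.
my %transposed;
push @{ $transposed{ $hash{$_} } }, $_ for sort keys %hash;

my @lines;
my $n = 0;
for my $value (sort { $a <=> $b } keys %transposed) {
    my @keys = @{ $transposed{$value} };    # copy, so splice leaves %transposed intact
    # splice removes up to 5 keys per pass; the loop ends when none are left.
    while (my @chunk = splice @keys, 0, 5) {
        $n++;
        push @lines, sprintf 'datatype arr%d[]={%s};', $n, join(',', @chunk);
    }
}
print "$_\n" for @lines;
```

For the sample data this produces five lines, starting with datatype arr1[]={a,c,d,h,i}; and datatype arr2[]={j};.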

Does iterating over a hash reference require implicitly copying it in perl?

Let's say I have a large hash and I want to iterate over its contents. The standard idiom would be something like this:
while (($key, $value) = each(%{$hash_ref})) {
    # do something
}
However, if I understand my perl correctly this is actually doing two things. First the
%{$hash_ref}
is translating the ref into list context. Thus returning something like
(key1, value1, key2, value2, key3, value3 etc)
which will be stored in my stack's memory. Then the each method will run, eating the first two values in memory (key1 & value1) and returning them to my while loop to process.
If my understanding of this is right that means that I have effectively copied my entire hash into my stack's memory only to iterate over the new copy, which could be expensive for a large hash, due to the expense of iterating over the array twice, but also due to potential cache misses if both hashes can't be held in memory at once. It seems pretty inefficient. I'm wondering if this is what really happens, or if I'm either misunderstanding the actual behavior or the compiler optimizes away the inefficiency for me?
Follow up questions, assuming I am correct about the standard behavior.
Is there a syntax to avoid copying of the hash by iterating over its values in the original hash? If not for a hash is there one for the simpler array?
Does this mean that in the above example I could get inconsistent values between the copy of my hash and my actual hash if I modify the hash_ref content within my loop; resulting in $value having a different value than $hash_ref->{$key}?

No, the syntax you quote does not create a copy.
This expression:
%{$hash_ref}
is exactly equivalent to:
%$hash_ref
and assuming the $hash_ref scalar variable does indeed contain a reference to a hash, then adding the % on the front is simply 'dereferencing' the reference - i.e. it resolves to a value that represents the underlying hash (the thing that $hash_ref was pointing to).
If you look at the documentation for the each function, you'll see that it expects a hash as an argument. Putting the % on the front is how you provide a hash when what you have is a hashref.
If you wrote your own subroutine and passed a hash to it like this:
my_sub(%$hash_ref);
then on some level you could say that the hash had been 'copied', since inside the subroutine the special @_ array would contain a list of all the key/value pairs from the hash. However even in that case, the elements of @_ are actually aliases for the keys and values. You'd only actually get a copy if you did something like: my @args = @_.
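A minimal sketch of that difference, using two hypothetical subs: one writes through @_ and so changes the caller's hash values, the other copies @_ first and leaves the caller's data alone.

```perl
use strict;
use warnings;

# Writing through @_ reaches the caller's values, because they are aliases.
sub shout_in_place { $_ = uc $_ for @_; return }

# Copying @_ first means only the local copy changes.
sub shout_copy { my @args = @_; $_ = uc $_ for @args; return @args }

my %h = (abc => 'def');

shout_copy(%h);
my $after_copy = $h{abc};     # still 'def'

shout_in_place(%h);
my $after_alias = $h{abc};    # now 'DEF'; the key 'abc' itself is unchanged

print "$after_copy $after_alias\n";
```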
Perl's builtin each function is declared with the prototype '+' which effectively coerces a hash (or array) argument into a reference to the underlying data structure.
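As an aside, Perl 5.14 added an experimental feature that let the each function take a reference to a hash directly. So instead of:
($key, $value) = each(%{$hash_ref})
you could write:
($key, $value) = each($hash_ref)
That feature was removed again in Perl 5.24, so the explicit dereference is the form that works everywhere.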

No copy is created by each (though you do copy the returned values into $key and $value through assignment). The hash itself is passed to each.
each is a little special. It supports the following syntaxes:
each HASH
each ARRAY
As you can see, it doesn't accept an arbitrary expression. (That would be each EXPR or each LIST). The reason for that is to allow each(%foo) to pass the hash %foo itself to each rather than evaluating it in list context. each can do that because it's an operator, and operators can have their own parsing rules. However, you can do something similar with the \% prototype.
use Data::Dumper;
sub f { print(Dumper(@_)); }
sub g(\%) { print(Dumper(@_)); } # Similar to each
my %h = (a=>1, b=>2);
f(%h); # Evaluates %h in list context.
print("\n");
g(%h); # Passes a reference to %h.
Output:
$VAR1 = 'a'; # 4 args, the keys and values of the hash
$VAR2 = 1;
$VAR3 = 'b';
$VAR4 = 2;
$VAR1 = { # 1 arg, a reference to the hash
'a' => 1,
'b' => 2
};
%{$h_ref} is the same as %h, so all of the above applies to %{$h_ref} too.
Note that the hash isn't copied even if it is flattened. The keys are "copied", but the values are returned directly.
use Data::Dumper;
my %h = (abc=>"def", ghi=>"jkl");
print(Dumper(\%h));
$_ = uc($_) for %h;
print(Dumper(\%h));
Output:
$VAR1 = {
'abc' => 'def',
'ghi' => 'jkl'
};
$VAR1 = {
'abc' => 'DEF',
'ghi' => 'JKL'
};

Confused by local variables and methods in Perl

I've got the following Perl code:
my $wantedips;
# loop through the interfaces
foreach (@$interfaces) {
    # local variable called $vlan
    my $vlan = $_->{vlan};
    # local variable called $cidr
    my $cidr = $_->{ip} ."/".$nnm->bits();
    # I dont understand this next bit.
    # As a rubyist, it looks like a method called $cidr is being called on $wantedips
    # But $cidr is already defined as a local variable.
    # Why the spooky syntax? Why is $cidr passed as a method to $wantedips?
    # what does ->{} do in PERL? Is it some kind of hash syntax?
    $wantedips->{$cidr} = $vlan;
    # break if condition true
    next if ($ips->{$cidr} == $vlan);
    # etc
}
The part I don't get is in my comments. Why is $cidr passed to $wantedips, when both are clearly defined as local variables? I'm a rubyist and this is really confusing. I can only guess that $xyz->{$abc}="hello" creates a hash of some sort like so:
$xyz => {
$abc => "hello"
}
I'm new to Perl as you can probably tell.

I don't understand why you are comfortable with
my $vlan = $_->{vlan}
but then
$wantedips->{$cidr} = $vlan
gives you trouble? Both use the same syntax to access hash elements using a hash reference.
The indirection operator -> is used to apply keys, indices, or parameters to a reference value, so you access elements of a hash by its reference with
$href->{vlan}
elements of an array by its reference with
$aref->[42]
and call a code reference with
$cref->(1, 2, 3)
As a convenience, and to make code cleaner, you can remove the indirection operator from the sequences ]->[ and }->{ (and any mixture of brackets and braces). So if you have a nested data structure you can write
my $name = $system->{$ip_address}{name}[2]
instead of
my $name = $system->{$ip_address}->{name}->[2]
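A short sketch showing that both spellings reach the same element. The structure and the address are invented for the example:

```perl
use strict;
use warnings;

# Hypothetical nested structure: hash ref -> hash ref -> array ref.
my $ip_address = '10.0.0.1';
my $system = {
    $ip_address => { name => [ 'alpha', 'beta', 'gamma' ] },
};

my $long  = $system->{$ip_address}->{name}->[2];
my $short = $system->{$ip_address}{name}[2];
print "$long $short\n";    # prints "gamma gamma"

# The first arrow cannot be dropped: $system{...} would refer to a
# separate hash variable named %system, not to the reference $system.
```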

#I dont understand this next bit.
$wantedips->{$cidr} = $vlan;
$wantedips is a scalar, specifically it is a hashref (a reference to a hash).
The arrow gets something from inside the reference.
{"keyname"} is how to access a particular key in a hash.
->{"keyname"} is how you access a particular key in a hash ref
$cidr is also a scalar, in this case it is a string.
->{$cidr} accesses a key from a hash ref when the key name is stored in a string.
So to put it all together:
$wantedips->{$cidr} = $vlan; means "Assign the value of $vlan to the key described by the string stored in $cidr on the hash referenced by $wantedips."
I can only guess that $xyz->{$abc}="hello" creates a hash of some sort like.
Let's break this down to a step by step example that strips out the loops and other bits not directly associated with the code in question.
# Create a hash
my %hash;
# Make it a hashref
my $xyz = \%hash;
# (Those two steps could be done as: my $xyz = {})
# Create a string
my $abc = "Hello";
# Use them together
$xyz->{$abc} = "world";
# Look at the result:
use Data::Dump;
Data::Dump::ddx($xyz);
# Result: { Hello => "world" }

Multidimension array in perl

I am working on a short script in which two to three variables are linked with each other.
Example:
my @batch;
my @case;
my @type = {
    back => "sticker",
    front => "no sticker",
};
for (my $i=0; $i<$#batch; $i++{
    for (my $j=0; $j<$#batch; $j++{
        if ($batch[$i]=="health" && $case[$i]$j]=="pain"){
            $type[$i][$j]->back = "checked";
        }
    }
}
In this short code I want to use @type as $type[$i][$j]->back and $type[$i][$j]->front, but I am getting an error that the array reference is not defined. Can anyone help me fix this?

Perl two-dimensional arrays are just arrays of arrays: each element of the top level array contains a (reference to) another array. The best reference for this is perldoc perlreftut
From what I can understand, you want an array of arrays of hashes. $type[$i][$j]->back and $type[$i][$j]->front are method calls in Perl, and what you want is $type[$i][$j]{back} and $type[$i][$j]{front}.
use strict;
use warnings;
my @batch;
my @case;
# Populate @batch and @case
my @type;
for my $i (0 .. $#batch) {
    # $batch[$i] is a plain string, so the inner bound comes from the row $case[$i]
    for my $j (0 .. $#{ $case[$i] } ) {
        if ($batch[$i] eq 'health' and $case[$i][$j] eq 'pain') {
            $type[$i][$j]{back} = 'checked';
        }
    }
}
But I am very worried about your design. @type will be full of undefined elements, with only occasional ones set to checked. A proper fix depends entirely on what you need to do with @type once you have built it.
I hope this helps

Perl doesn't have multidimensional variables. To emulate multidimensional arrays, you can use what are called references. A reference is a way of referring to the memory location of another Perl structure such as an array or hash.
References allows you to build up more complex structures. For example, you could have an array and instead of each element in the array having a distinct value, it could point to another array. Using this, I can treat my array of arrays as a two dimensional array. But it's not a two dimensional array.
In a two dimensional array, each column ($j) has the same length. That's guaranteed. In Perl, what you have is each row ($i), pointing to a different array of columns ($j), and each of those column arrays could have a different number of elements (or even none at all! That inner array $j may not even be defined!).
Therefore, I have to check each column and see exactly how many values it might have:
for my $i ( 0..$#array ) {
    if ( ref $array[$i] ne "ARRAY" ) {
        die qq(There is no sub array for \$array[$i]!\n);
    }
    my @temp_j_array = @{ $array[$i] }; # This is how you dereference a reference
    for my $j ( 0..$#temp_j_array ) {
        # Here be dragons...
    }
}
Note that I have to see exactly how many columns are in my inner ($j) array before I can go through it.
By the way, notice how I use .. to index my arrays. It's a lot cleaner than using that three part for loop which is very error prone. For example, should you check $i < $#array or $i <= $#array? See the difference?
Since you're already dealing with a very complex structure (an array of arrays), I'm going to make it even more complex: (An array of arrays of hashes). This added complexity allows me to get rid of three separate variables. Instead of trying to keep @batch, @case and @type in sync with each other, I can make these keys to my inner most hash:
my @structure = ... # Some sort of structure...
for my $i ( 0..$#structure ) {
    my @temp = @{ $structure[$i] }; # This is a reference to an array. Dereference it.
    for my $j ( 0..$#temp ) {
        if ( $structure[$i]->[$j]->{batch} eq "health"
                and $structure[$i]->[$j]->{case} eq "pain" ) {
            $structure[$i]->[$j]->{back} = "checked";
        }
    }
}
This is a very common way to use Perl references to build more complex data structures:
my %employees; # Keyed by employee number:
$employees{1001}->{NAME} = "Bob";
$employees{1001}->{JOB} = "Yes man";
$employees{1002}->{NAME} = "Susan";
$employees{1002}->{JOB} = "sycophant";
You had some syntax errors, and were using the numeric comparison operator (==) where the string comparison operator (eq) is needed.
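To make the array of arrays of hashes concrete, here is a runnable sketch with invented data. The rows have different lengths on purpose, and the inner bound is taken with $#{ ... } instead of a temporary copy, which is one possible variation.

```perl
use strict;
use warnings;

# Hypothetical data: rows of different lengths, each cell a hash reference.
my @structure = (
    [ { batch => 'health', case => 'pain' }, { batch => 'health', case => 'none' } ],
    [ { batch => 'sport',  case => 'pain' } ],
);

for my $i (0 .. $#structure) {
    for my $j (0 .. $#{ $structure[$i] }) {    # last index of row $i
        my $cell = $structure[$i][$j];         # a hash reference
        $cell->{back} = 'checked'
            if $cell->{batch} eq 'health' and $cell->{case} eq 'pain';
    }
}

my $state = defined $structure[0][0]{back} ? 'checked' : 'unchecked';
print "$state\n";    # prints "checked"
```

Only the first cell matches both conditions, so it is the only one that gains a back entry.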

Bad index while coercing array into hash in Perl

Below is the code
my $results = $session->array_of_hash_for_cursor("check_if_receipts_exist", 0, @params);
next if( scalar @{$results} <= 0 );
$logger->info("Retrieved number of records: ".scalar @$results);
foreach my $row( sort { $results->{$b}->{epoch_received_date} cmp $results->{$a}->{epoch_received_date} } keys %{$results} )
{
    //logic
}
'check_if_receipts_exist' is a SQL query which returns some results. When I try to execute this, I get the following error:
Bad index while coercing array into hash
I am new to Perl. Can someone please point out the mistake I am making?

Is $results a hash reference or an array reference?
In some places you are using it like an array reference:
scalar @{$results}
and in other places you are using it like a hash reference:
$results->{$b}->{...}
keys %{$results}
It can't be both (at least not without some heavy overload magic).
If I can infer from the name of the function that sets $results that it should be a reference to a list of hash references, then a few tweaks will set it right:
Using @{$results} is correct; this expression is "an array of hash references"
The last argument to sort should be a list, but the correct list to pass is @{$results}, not keys %{$results}.
Then the parameters $a and $b inside the sort function will be members of @{$results}, that is, they will be hash references. So the comparison to make is
$b->{epoch_received_date} cmp $a->{epoch_received_date}
and not
$results->{$b}->{...} cmp $results->{$a}->{...}
All together:
my $results = $session->array_of_hash_for_cursor(
    "check_if_receipts_exist", 0, @params);
next if !@$results;
$logger->info("Retrieved number of records: ".@$results);
for my $row (
    sort {
        $b->{epoch_received_date}
            cmp
        $a->{epoch_received_date}
    } @$results
) {
    # logic
}
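The same sort can be tried without a database. In the sketch below the rows are invented stand-ins for what array_of_hash_for_cursor returns, and it assumes epoch_received_date holds numeric epoch seconds; under that assumption <=> is the safer comparison, because cmp compares the values as strings.

```perl
use strict;
use warnings;

# Hypothetical rows: an array reference of hash references.
my $results = [
    { id => 1, epoch_received_date => 999_999_999 },
    { id => 2, epoch_received_date => 1_500_000_000 },
    { id => 3, epoch_received_date => 1_200_000_000 },
];

# Newest first. <=> compares numerically; cmp would rank the string
# "999999999" above "1500000000" because "9" sorts after "1".
my @newest_first =
    sort { $b->{epoch_received_date} <=> $a->{epoch_received_date} } @$results;

my @ids = map { $_->{id} } @newest_first;
print "@ids\n";    # prints "2 3 1"
```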
