Having difficulty in understanding a piece of Perl code


I am studying an existing Perl program, which includes the following line of code:
@{$labels->{$set}->{"train"}->{"negative"}} = (@{$labels->{$set}->{"train"}->{"negative"}}, splice(@shuffled, 0, 2000));
I am confused about how to read this piece of code.

It is not valid Perl as written:
@{$labels->{$set}->{"train"}->{"negative"}} = (@{$labels->{$set}->{"train"}->{"negative"}};
^
Syntax error - open parenthesis without a matching close parenthesis
If you ignore the open parenthesis, then the LHS and the RHS expressions are identical; it assigns an array value to itself. The arrows -> and {} notations mostly mean that you're dealing with an array ref at the end of a hash reference to a hash reference to a hash reference, or thereabouts. It is a nasty piece of code to have to understand, at best, but the structure may make more sense in the bigger context of the whole program (and the whole program will be considerably bigger, so I don't recommend posting it here).
Double check your copy'n'paste. If that is actually in the Perl script, then it cannot compile, much less execute, so you'll have to work out how and why that line is not operative.
The revised expression has the RHS:
(@{$labels->{$set}->{"train"}->{"negative"}}, splice(@shuffled, 0, 2000));
The parentheses group a list; the first term is the contents of the original array; the second term is the result of splice applied to the array @shuffled. So, the splice removes the first 2000 elements (offset 0, length 2000) from the array @shuffled and returns them, and the expression as a whole adds the removed elements to the end of the array identified by the complex expression on the LHS.
It would probably be more efficiently written as:
push @{$labels->{$set}->{"train"}->{"negative"}}, splice(@shuffled, 0, 2000);
It's also more economical on the typing, and a lot more economical on the brain cells.
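A minimal sketch of the push form follows. The contents of $labels and @shuffled are made up for illustration (the real program's data is not shown), and the count is 3 rather than 2000 to keep the output short:

```perl
use strict;
use warnings;

# Hypothetical data: the real program's $labels and @shuffled are not shown.
my $set      = 'set1';
my $labels   = { $set => { train => { negative => [ 'n1', 'n2' ] } } };
my @shuffled = ( 'a' .. 'f' );

# splice removes the first 3 elements of @shuffled and returns them;
# push appends them to the array behind the nested reference.
push @{ $labels->{$set}{train}{negative} }, splice( @shuffled, 0, 3 );

print "negative: @{ $labels->{$set}{train}{negative} }\n";   # negative: n1 n2 a b c
print "shuffled: @shuffled\n";                               # shuffled: d e f
```

The arrows between adjacent subscripts are optional, so $labels->{$set}{train}{negative} names the same array reference as the fully arrowed form.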

The statement:
@{$labels->{$set}->{"train"}->{"negative"}} =
(@{$labels->{$set}->{"train"}->{"negative"}}, splice(@shuffled, 0, 2000));
does a number of things at once. It can also be written, slightly more verbosely:
my @array = @{$labels->{$set}->{"train"}->{"negative"}};
my @values = @shuffled[0..1999]; # get the first 2000 values
splice @shuffled, 0, 2000; # delete the values after use
@array = (@array, @values); # add the values to the array
@{$labels->{$set}->{"train"}->{"negative"}} = @array;
As you'll notice, the LENGTH argument of splice is a count of elements to remove, not the index of the last element, which is why the array slice above ends at index 1999 rather than 2000.
As Jonathan pointed out, it is much simpler to use push.
push @{$labels->{$set}{"train"}{"negative"}}, splice(@shuffled, 0, 2000);
Documentation: splice

$labels is a reference to a hash of hashes with three depths (HoHoH). The lookup $labels->{$set}->{"train"}->{"negative"} returns an array reference.
Hope that helps a bit ..

Related

Raku: Trouble Accessing Value of a Multidimensional Hash

I am having issues accessing the value of a 2-dimensional hash. From what I can tell online, it should be something like: %myHash{"key1"}{"key2"} #Returns value
However, I am getting the error: "Type Array does not support associative indexing."
Here's a Minimal Reproducible Example.
my %hash = key1-dim1 => key1-dim2 => 42, key2-dim1 => [42, 42];
say %hash{'key1-dim1'}{'key1-dim2'}; # 42
say %hash{'key2-dim1'}{'foo bar'}; # Type Array does not support associative indexing.
Here's another reproducible example, but longer:
my @tracks = 'Foo Bar', 'Foo Baz';
my %counts;
for @tracks -> $title {
    $_ = $title;
    my @words = split(/\s/, $_);
    if (@words.elems > 1) {
        my $i = 0;
        while (@words.elems - $i > 1) {
            my %wordHash = ();
            %wordHash.push: (@words[$i + 1] => 1);
            %counts.push: (@words[$i] => %wordHash);
            say %counts{@words[$i]}{@words[$i+1]}; #===============CRASHES HERE================
            say %counts.kv;
            $i = $i + 1;
        }
    }
}
In my code above, the problem line where the 2-d hash value is accessed will work once in the first iteration of the for-loop. However, it always crashes with that error on the second time through. I've tried replacing the array references in the curly braces with static key values in case something was weird with those, but that did not affect the result. I can't seem to find what exactly is going wrong by searching online.
I'm very new to raku, so I apologize if it's something that should be obvious.

After you push a second element onto the same key of the Hash, the value stored under that key becomes an array. You can see this best by printing the Hash before the crash:
say "counts: " ~ %counts.raku;
#first time: counts: {:aaa(${:aaa(1)})}
#second time: counts: {:aaa($[{:aaa(1)}, {:aaa(1)}])}
The square brackets are indicating an array.
Maybe BagHash already does some of the work for you. See also raku sets without borders
my @tracks = 'aa1 aa2 aa2 aa3', 'bb1 bb2', 'cc1';
for @tracks -> $title {
    my $n = BagHash.new: $title.words;
    $n.raku.say;
}
#("aa2"=>2,"aa1"=>1,"aa3"=>1).BagHash
#("bb1"=>1,"bb2"=>1).BagHash
#("cc1"=>1).BagHash

Let me first explain the minimal example:
my %hash = key1-dim1 => key1-dim2 => 42,
key2-dim1 => [42, 42];
say %hash{'key1-dim1'}{'key1-dim2'}; # 42
say %hash{'key2-dim1'}{'key2-dim2'}; # Type Array does not support associative indexing.
The problem is that the value associated with key2-dim1 isn't itself a hash but is instead an Array. Arrays (and all other Positionals) only support indexing by position -- by integer. They don't support indexing by association -- by string or object key.
Hopefully that explains that bit. See also a search of SO using the [raku] tag plus 'Type Array does not support associative indexing'.
Your longer example throws an error at this line -- not immediately, but eventually:
say %counts{...}{...}; # Type Array does not support associative indexing.
The hash %counts is constructed by the previous line:
%counts.push: ...
Excerpting the doc for Hash.push:
If a key already exists in the hash ... old and new value are both placed into an Array
Example:
my %h = a => 1;
%h.push: (a => 1); # a => [1,1]
Now consider that the following code would have the same effect as the example from the doc:
my %h;
say %h.push: (a => 1); # {a => 1}
say %h.push: (a => 1); # {a => [1,1]}
Note how the first .push of a => 1 results in a 1 value for the a key of the %h hash, while the second .push of the same pair results in a [1,1] value for the a key.
A similar thing is going on in your code.
In your code, you're pushing the value %wordHash into the @words[$i] key of the %counts hash.
The first time you do this the resulting value associated with the @words[$i] key in %counts is just the value you pushed -- %wordHash. This is just like the first push of 1 above resulting in the value associated with the a key, from the push, being 1.
And because %wordHash is itself a hash, you can associatively index into it. So %counts{...}{...} works.
But the second time you push a value to the same %counts key (i.e. when @words[$i] is a word/string/key that is already held by %counts), the value associated with that key is no longer %wordHash but [%wordHash, %wordHash].
And you clearly do get such a second time in your code, if the @tracks you are feeding in have titles that begin with the same word. (I think the same is true even if the duplication isn't the first word but instead later ones. But I'm too confused by your code to be sure what the exact broken combinations are. And it's too late at night for me to try to understand it, especially given that it doesn't seem important anyway.)
So when your code then evaluates %counts{@words[$i]}{@words[$i+1]}, it is the same as [%wordHash, %wordHash]{...}. Which doesn't make sense, so you get the error you see.
Hopefully the foregoing has been helpful.
But I must say I'm both confused by your code, and intrigued as to what you're actually trying to accomplish.
I get that you're just learning Raku, and that what you've gotten from this SO might already be enough for you, but Raku has a range of nice high level hash like data types and functionality, and if you describe what you're aiming at we might be able to help with more than just clearing up Raku wrinkles that you and we have been dealing with thus far.
Regardless, welcome to SO and Raku. :)

Well, this one was kind of funny and surprising. You can't go wrong if you follow the other question, however, here's a modified version of your program:
my @tracks = ['love is love','love is in the air', 'love love love'];
my %counts;
for @tracks -> $title {
    $_ = $title;
    my @words = split(/\s/, $_);
    if (@words.elems > 1) {
        my $i = 0;
        while (@words.elems - $i > 1) {
            my %wordHash = ();
            %wordHash{@words[$i + 1]} = 1;
            %counts{@words[$i]} = %wordHash;
            say %counts{@words[$i]}{@words[$i+1]}; # The buck stops here
            say %counts.kv;
            $i = $i + 1;
        }
    }
}
Please check the line where it crashed before. Can you spot the difference? It was kind of a (un)lucky thing that you used i as a loop variable... i is a complex number in Raku. So it was crashing because it couldn't use complex numbers to index an array. You simply had dropped the $.
You can use sigilless variables in Raku, as long as they're not i, or e, or any of the other constants that are already defined.
I've also made a couple of changes to better reflect the fact that you're building a Hash and not an array of Pairs, as Lukas Valle said.

Initialize empty Array of Hashes of a given length - one-liner

I'd like to pre-initialize the elements of an array of hashes so that when it comes time to filling in the data, I don't need to check for the existence of various members and initialize them every loop. If I can, I'd like to pre-initialize the general form of the datastructure which should look like this:
$num_sections = 4;
# Some magic to initialize $VAR1 to this:
$VAR1 = {
'sections' => [
{},
{},
{},
{}
]
};
I would like this to work,
$ph->{sections} ||= [({} x $num_sections )];
but it results in
$VAR1 = {
'sections' => 'HASH(0x21b6110)HASH(0x21b6110)HASH(0x21b6110)HASH(0x21b6110)'
};
And no amount of playing with the () list context and {} empty hash reference seem to make it work.
This works but it's not quite a one-liner
unless ($ph->{sections})
{
push @{ $ph->{sections}}, {} foreach (1..$num_sections);
}
There's probably some perl magic that I can use to add the unless to the end, but I haven't quite figured it out.
I feel I'm so close, but I just can't quite get it.
Update Oleg points out that this probably isn't necessary at all. See comments below.

If the left-hand side of x is not in parens, x repeats the string on its LHS and returns the concatenation of those strings.
If the left-hand side of x is in parens, x repeats the value on its LHS and returns the copies.
This latter approach is closer to what you want, but it's still wrong as you'll end up with multiple references to a single hash. You want to create not only new references, but new hashes as well. For that, you can use the following:
$ph->{sections} ||= [ map { +{} } 1..$num_sections ];
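The difference between the two approaches can be checked with a short sketch. The variable names @shared and @distinct and the key name are assumptions for illustration only:

```perl
use strict;
use warnings;

my $num_sections = 4;

# ({}) x N copies one reference N times: every slot points at the same hash.
my @shared = ( {} ) x $num_sections;
$shared[0]{name} = 'intro';
my $leaked = exists $shared[1]{name} ? 1 : 0;       # 1: the write shows up in slot 1 too

# map runs the block once per element, so each slot gets a fresh hash.
my @distinct = map { +{} } 1 .. $num_sections;
$distinct[0]{name} = 'intro';
my $isolated = exists $distinct[1]{name} ? 0 : 1;   # 1: slot 1 is untouched

print "shared leaked: $leaked, distinct isolated: $isolated\n";
```

The unary plus in +{} tells the parser that the braces are an anonymous hash constructor rather than a nested block.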

Why does the goatse operator work?

The difference between arrays and lists and between list and scalar context have been discussed in the Perl community quite a bit this last year (and every year, really). I have read over articles from chromatic and friedo, as well as this recommended monks node. I'm trying now to understand the goatse operator, documented in perlsecret.
Here is some code I used to study it:
# right side gets scalar context, so commas return rightmost item
$string = qw(stuff junk things);
say $string; # things
# right side gets list context, so middle is list assigned in scalar context
$string = () = qw(stuff junk things);
say $string; # 3
# right side gets list context, so creates a list, assigns an item to $string2, and
# evaluates list in scalar context to assign to $string
$string = ($string2) = qw(stuff junk things);
say $string; # 3
say $string2; # stuff
I think I have come far enough to understand all of the list and scalar context workings. The comma operator in scalar context returns its right side, so the first example simply assigns the last item in the comma expression (without any commas) to $string. In the other examples, the assignment of the comma expression to a list puts it in list context, so a list is created, and lists evaluated in scalar context return their size.
There are 2 parts that I don't understand.
First, lists are supposed to be immutable. This is stressed repeatedly by friedo. I guess that assignment via = from list to list distributes assignments from items in one list to items in the other list, which is why in the third example $string2 gets 'stuff', and why we can unpack @_ via list assignment. However, I don't understand how assignment to (), an empty list, could possibly work. With my current understanding, since lists are immutable, the size of the list would remain 0, and then assigning the size to $string in examples 2 and 3 would give it the value 0. Are lists not actually immutable?
Second, I've read numerous times that lists don't actually exist in scalar context. But the explanation of the goatse operator is that it is a list assignment in scalar context. Is this not a counter-example to the statement that lists don't exist in scalar context? Or is something else going on here?
Update: After understanding the answer, I think an extra pair of parentheses helps to conceptualize how it works:
$string = ( () = qw(stuff junk things) );
Inside the parens, the = is an assignment to an 'aggregate', and so is a list assignment operator (which is different from the scalar assignment operator, and which should not be confused with "list context"; list and scalar assignment can happen in either list or scalar context). () does not change in any way. = has a return value in Perl, and the result of the list assignment is assigned to $string via the left =. Assignment to $string gives scalar context to the RHS (everything in the parens), and in scalar context the returned value of the list assignment operator is the number of items in the RHS.
You can put the RHS list assignment into list context instead:
($string) = ( () = qw(stuff junk things) );
According to perlop list assignment in list context returns the list of assigned lvalues, which here is empty since there is nothing to be assigned to in (). So here $string would be undef.

You misunderstand. Lists evaluated in scalar context do not get their size. In fact, it is all but impossible to have a list in scalar context. Here, you have a scalar assignment with two operands, a scalar variable on the left, and a list assignment on the right (given scalar context by the scalar assignment). List assignments in scalar context evaluate to the number of items on the right of the assignment.
So, in:
1 $foo
2 =
3 ()
4 =
5 ('bar')
2, a scalar assignment, gives 1 and 4 scalar context.
4, a list assignment, gives 3 and 5 list context, but nevertheless is itself in scalar context and returns appropriately.
(When = is a list assignment or a scalar assignment is determined purely from the surrounding syntax; if the left operand is a hash, array, hash slice, array slice, or in parentheses, it is a list assignment, otherwise it is a scalar assignment.)
This treatment of list assignments in scalar context makes possible code like:
while ( my ($key, $value) = each %hash ) {
where list-context each is an iterator that returns (in list context) one key and value for each call, and an empty list when done, giving the while a 0 and terminating the loop.

It helps to remember that in Perl, assignment is an expression, and that you should be thinking about the value of the expression (the value of the assignment operator), not "the value of a list".
The value of the expression qw(a b) is ('a', 'b') in list context and 'b' in scalar context, but the value of the expression (() = qw(a b)) is () in list context and 2 in scalar context. The values of (@a = qw(a b)) follow the same pattern. This is because pp_aassign, the list assignment operator, chooses to return a count in scalar context:
else if (gimme == G_SCALAR) {
dTARGET;
SP = firstrelem;
SETi(lastrelem - firstrelem + 1);
}
(pp_hot.c line 1257; line numbers are subject to change, but it's near the end of PP(pp_aassign).)
Then, apart from the value of the assignment operator is the side-effect of the assignment operator. The side-effect of list assignment is to copy values from its right side to its left side. If the right side runs out of values first, the remaining elements of the left side get undef; if the left side runs out of values first, the remaining elements of the right side aren't copied. When given a LHS of (), the list assignment doesn't copy anything anywhere at all. But the value of the assignment itself is still the number of elements in the RHS, as shown by the code snippet.
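The distinction between the value of a list assignment and its side effect can be observed directly. This is only a sketch; the variable names are chosen for illustration:

```perl
use strict;
use warnings;

# List assignment in scalar context yields the number of RHS elements,
# even though the empty LHS list receives nothing.
my $count = () = qw(stuff junk things);

# With one lvalue on the left, the first element is copied; the count is still 3.
my $first;
my $count2 = ( ($first) = qw(stuff junk things) );

# In list context the assignment returns its LHS lvalues, here none,
# so $none stays undefined.
my ($none) = ( () = qw(stuff junk things) );

print "count=$count count2=$count2 first=$first none=",
    defined $none ? $none : 'undef', "\n";
# prints: count=3 count2=3 first=stuff none=undef
```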

First, "list" is an ambiguous term. Even the question uses it to refer to two different things. I suspect you might be doing this without realizing it, and that this is a significant part of the cause of your confusion.
I shall use "a list value" to denote what an operator returns in list context. In contrast, "the list operator" refers to the operator EXPR,EXPR,EXPR,... also known as "the comma operator"[1].
Second, you should read Scalar vs List Assignment Operator.
I guess that assignment via = from list to list distributes assignments from items in one list to items in the other list, which is why in the second example $string2 gets 'stuff', and why we can unpack @_ via list assignment.
Correct.
I've read numerous times that lists don't actually exist in scalar context.
That wording is very ambiguous. You seem to be talking about list values (which are found in memory), but scalar context only exists in the code (where operators are found).
A list/comma operator can be evaluated in scalar context.
A list value can't be returned in scalar context.
Scalar context is a context in which an operator can be evaluated.
An operator evaluated in scalar context cannot return a list. It must return a scalar. Loosely speaking, you could say a list can't be returned in scalar context.
On the other hand, a list/comma operator can be evaluated in scalar context. e.g. scalar(4,5,6). Every operator can be evaluated in any context (though it's not necessarily useful to do so).
But the explanation of the goatse operator is that it is a list assignment in scalar context.
It includes one, yes.
List values and list assignment operators are two different things. One's a value. The other is a piece of code.
A list assignment operator, like every other operator, can be evaluated in scalar context. A list assignment operator in scalar context returns the number of scalars returned by its RHS.
So if you evaluate () = qw(a b c) in scalar context, it will return three, since qw() placed three scalars on the stack.
However, I don't understand how assignment to (), an empty list, could possibly work.
Just like the assignment ($x,$y) = qw(stuff junk things) ignores the third element returned by the RHS, () = qw(stuff junk things) ignores all elements returned by the RHS.
With my current understanding, since lists are immutable, the size of the list would remain 0
Saying "the size of the list would remain zero" for ()=qw(a b c) is like saying "the value of the scalar will remain 4" for 4+5.
For starters, there's the question of which list you're talking about. The LHS returned one, the RHS returned one, and the assignment operator might return one.
The list value returned by the LHS will be 0 in length.
The list value returned by the RHS will be 3 in length.
In scalar context, the list assignment operator returns the number of scalars returned by RHS (3).
In list context, the list assignment operator returns the scalars returned by LHS as lvalues (empty list).
lists are supposed to be immutable.
If you're thinking in terms of list mutability, you took a wrong turn somewhere.[2]
Notes:
[1] The docs call EXPR,EXPR,EXPR,... two instances of a binary operator, but it's easier to understand as a single N-ary operator, and it's actually implemented as a single N-ary operator. Even in scalar context.
[2] It's not actually true, but let's not go any further down this wrong turn.

Hash value is not re-initialized when loop is terminated with 'last' keyword

Consider the following nested loops:
my %deleted_documents_names = map { $_ => 1 }
$self->{MANUAL}->get_deleted_documents();
while($sth->fetch){
.....
.....
.....
while(my ($key, $value) = each(%deleted_documents_names)){
if($document_name eq $key){
$del_status=1;
last;
}
}
if($del_status==1){next;}
.....
.....
.....
.....
}
Now, I take a sample case where three values (A,B,C) will be compared against two values (B,C).
First scan:
A compared to B
A compared to C
Second scan:
B compared to B
Loop is terminated.
Third scan:
C is compared with C.
In this case, C should be compared first with B, being first value, but this comparison is skipped, and it only scans from the next element after the one that was found equal. If I remove last termination condition and let the loop run for total number of scans, then it works all fine, but I need to find out why in this case, $key refers to the next compared value and not to the first value once loop is restarted after getting terminated with last keyword.
Any help will be appreciated.

Use
keys %deleted_documents_names ; # Reset the "each" iterator.
See keys.
But, why are you iterating over the hash? Why don't you just
if (exists $deleted_documents_names{$document_name}) {

each() is a function that returns key-value pairs from a hash until it reaches the end. It is not aware of the scope it was called in, and doesn't know anything about your while loop logic. See the documentation here.
It can be reset by calling keys %hash or values %hash.
Update: however, as Choroba points out, you don't really need this loop. Your loop and accompanying logic could be replaced by this:
next if (exists $deleted_documents_names{$document_name});
(Hashes are designed with a structure that allows a key to be quickly found. In fact, this structure is what gives them the name "hashes". So doing it this way will be much more efficient than looping through all elements and testing each one).
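A small sketch of both points, using a made-up %deleted hash and name list in place of the question's data: the each iterator keeps its position after an early exit until keys resets it, and exists avoids the loop altogether.

```perl
use strict;
use warnings;

my %deleted = map { $_ => 1 } qw(B C);   # hypothetical deleted names

# Consume one pair and stop, as an early "last" would leave the iterator.
my ($k1) = each %deleted;
my $rest = 0;
while ( my ($k) = each %deleted ) { $rest++ }   # visits only the remaining pair

# Same situation again, but reset the iterator before looping.
my ($k2) = each %deleted;
keys %deleted;                                   # void-context keys resets the iterator
my $full = 0;
while ( my ($k) = each %deleted ) { $full++ }   # visits both pairs

# Direct lookup: no iterator state involved at all.
my @kept = grep { !exists $deleted{$_} } qw(A B C);

print "rest=$rest full=$full kept=@kept\n";      # rest=1 full=2 kept=A
```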

How can I print the first to the fifth from last array elements in Perl?

I'm running the following code and I'm attempting to print the first element in the @rainbow array through the fifth-from-last element in the @rainbow array. This code works for any positive indices within the bounds of the array, but not for negative ones:
@rainbow = ("a".."z");
@slice = @rainbow[1..-5];
print "@slice\n";

You want
my @slice = @rainbow[0 .. $#rainbow - 5];
Be careful, 1 is the second element, not the first.

The .. operator forms a range from the left to right value - if the right is greater than or equal to the left. Also, in Perl, array indexing starts at zero.
How about this?
@slice = @rainbow[0..$#rainbow-5];
$#array gives you the index of the last element in the array.

From the first two sentences for the range operator, documented in perlop:
Binary ".." is the range operator, which is really two different operators depending on the context. In list context, it returns a list of values counting (up by ones) from the left value to the right value. If the left value is greater than the right value then it returns the empty list.
When the code doesn't work, decompose it to see what's happening. For instance, you would try the range operator to see what it produced:
my @indices = 1 .. -5;
print "Indices are [@indices]\n";
When you got an empty list and realized that there is something going on that you don't understand, check the documentation for whatever you are trying to do to check it's doing what you think it should be doing. :)
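Putting the pieces together, a short sketch (variable names chosen for illustration) shows the empty range, the slice built from $#rainbow, and what a single negative index refers to:

```perl
use strict;
use warnings;

my @rainbow = ( 'a' .. 'z' );

my @empty = 1 .. -5;                         # left > right: empty list
my @slice = @rainbow[ 0 .. $#rainbow - 5 ];  # indices 0 through 20
my $neg5  = $rainbow[-5];                    # a single negative index counts from the end

print scalar(@empty), " indices in 1 .. -5\n";              # 0 indices in 1 .. -5
print "slice ends at $slice[-1]; \$rainbow[-5] is $neg5\n"; # slice ends at u; $rainbow[-5] is v
```

Note that index -5 is the same element as index $#rainbow - 4, so if the slice is meant to include that element, the upper bound would be $#rainbow - 4 instead.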
