perl adds dummy elements to array after inquiry - perl

As the title says, Perl adds dummy elements to arrays after inquiries about non-existent elements. The array size grows after the inquiry. Here is an illustration of the behaviour:
my $rarr;
$rarr->[0][0] = 'S';
$rarr->[0][1] = 'MD';
$rarr->[1][0] = 'S';
$rarr->[1][1] = 'PRP';
my $crulesref;
$crulesref->[0] = $rarr;
check_rule('aa', 0);
if($rarr->[3][0] == 'M'){ # just check a not existing element
print "m\n";
}
check_rule('bb', 0);
if($rarr->[5][0] == 'M'){ # again: just check a not existing element
print "m\n";
}
check_rule('cc', 0);
sub check_rule($$)
{
my ($strg,$ix) = @_;
my $aref = $crulesref->[$ix];
my $rule_size = @$aref;
{print "-----$strg aref:$aref rs:$rule_size aref:'@$aref'\n";
for(my $t1 = 0; $t1 <$rule_size; $t1++){
print "t1:$t1 0:$aref->[$t1][0] 1:$aref->[$t1][1]\n";
}
}
}
The result of the run is:
en@en-desktop ~/dtest/perl/forditas/utf8_v1/forditas/test1 $ perl v15.pl
-----aa aref:ARRAY(0x90ed8c8) rs:2 aref:'ARRAY(0x9106cac) ARRAY(0x9106d24)'
t1:0 0:S 1:MD
t1:1 0:S 1:PRP
m <-------------- finds the non existing
-----bb aref:ARRAY(0x90ed8c8) rs:4 aref:'ARRAY(0x9106cac) ARRAY(0x9106d24) ARRAY(0x9107508)'
t1:0 0:S 1:MD
t1:1 0:S 1:PRP
t1:2 0: 1: <-- undesired dummy due to inquiry
t1:3 0: 1: <-- undesired dummy due to inquiry
m <-------------- finds the non existing
-----cc aref:ARRAY(0x90ed8c8) rs:6 aref:'ARRAY(0x9106cac) ARRAY(0x9106d24) ARRAY(0x9107904) ARRAY(0x9107508) ARRAY(0x910e860)'
t1:0 0:S 1:MD
t1:1 0:S 1:PRP
t1:2 0: 1: <-- undesired dummy due to inquiry
t1:3 0: 1: <-- undesired dummy due to inquiry
t1:4 0: 1: <-- undesired dummy due to inquiry
t1:5 0: 1: <-- undesired dummy due to inquiry
Is there no way to avoid this other than checking, before each inquiry, whether the inquired element exists? I am trying to increase speed, and these checks slow the code down and make it harder to read.
Thanks in advance for useful hints.

This is autovivification that you are seeing. If you access the memory of $ref->[3][0] even with just a check:
if ($ref->[3][0] eq 'M' )
Then first $ref->[3] must exist before its element number zero can be checked, so it is created via autovivification. You need to first check if $ref->[3] exists or is defined to avoid creating it.
if (defined($ref->[3]) && $ref->[3][0] eq 'M')
Also, you should always use:
use strict;
use warnings;
Then you would see the warnings
Argument "M" isn't numeric in numeric eq (==) at ...
Use of uninitialized value in numeric eq (==) at ...
The if-clause gives a false positive here because the string 'M' is converted to a number (0) because of the context imposed by the numeric equality operator ==. The LHS value is undef, which is also converted to a number (0), which is why the expression evaluates to true.
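The difference between the guarded and the unguarded check can be seen in a minimal sketch like the one below. The data mirrors the question's $rarr; the names $size_guarded and $size_unguarded are made up for illustration.

```perl
use strict;
use warnings;

# Same shape as the question's data: two rows, two columns each.
my $rarr = [ [ 'S', 'MD' ], [ 'S', 'PRP' ] ];

# Guarded check: row 3 is tested before it is dereferenced,
# so the && short-circuits and nothing is created.
if (defined $rarr->[3] && $rarr->[3][0] eq 'M') {
    print "m\n";
}
my $size_guarded = @$rarr;

# Unguarded check: dereferencing the missing row 5 autovivifies it
# as an empty array, which extends the outer array.
{
    no warnings 'uninitialized';   # the missing value compares as ''
    if ($rarr->[5][0] eq 'M') {
        print "m\n";
    }
}
my $size_unguarded = @$rarr;

print "guarded: $size_guarded, unguarded: $size_unguarded\n";
```

This prints "guarded: 2, unguarded: 6". Because eq is used instead of ==, neither check produces the false positive from the question. If installing modules is an option, the CPAN pragma autovivification (no autovivification;) may be worth a look as an alternative to writing the guard by hand.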

Related

Raku: Trouble Accessing Value of a Multidimensional Hash

I am having issues accessing the value of a 2-dimensional hash. From what I can tell online, it should be something like: %myHash{"key1"}{"key2"} #Returns value
However, I am getting the error: "Type Array does not support associative indexing."
Here's a Minimal Reproducible Example.
my %hash = key1-dim1 => key1-dim2 => 42, key2-dim1 => [42, 42];
say %hash{'key1-dim1'}{'key1-dim2'}; # 42
say %hash{'key2-dim1'}{'foo bar'}; # Type Array does not support associative indexing.
Here's another reproducible example, but longer:
my @tracks = 'Foo Bar', 'Foo Baz';
my %counts;
for @tracks -> $title {
$_ = $title;
my @words = split(/\s/, $_);
if (@words.elems > 1) {
my $i = 0;
while (@words.elems - $i > 1) {
my %wordHash = ();
%wordHash.push: (@words[$i + 1] => 1);
%counts.push: (@words[$i] => %wordHash);
say %counts{@words[$i]}{@words[$i+1]}; #===============CRASHES HERE================
say %counts.kv;
$i = $i + 1;
}
}
}
In my code above, the problem line where the 2-d hash value is accessed will work once in the first iteration of the for-loop. However, it always crashes with that error on the second time through. I've tried replacing the array references in the curly braces with static key values in case something was weird with those, but that did not affect the result. I can't seem to find what exactly is going wrong by searching online.
I'm very new to raku, so I apologize if it's something that should be obvious.
After a second element is pushed to the same key of the Hash, the value stored there becomes an array. You can see this best by printing the Hash before the crash:
say "counts: " ~ %counts.raku;
#first time: counts: {:aaa(${:aaa(1)})}
#second time: counts: {:aaa($[{:aaa(1)}, {:aaa(1)}])}
The square brackets are indicating an array.
Maybe BagHash already does some of the work for you. See also "raku sets without borders".
my @tracks = 'aa1 aa2 aa2 aa3', 'bb1 bb2', 'cc1';
for @tracks -> $title {
my $n = BagHash.new: $title.words;
$n.raku.say;
}
#("aa2"=>2,"aa1"=>1,"aa3"=>1).BagHash
#("bb1"=>1,"bb2"=>1).BagHash
#("cc1"=>1).BagHash
Let me first explain the minimal example:
my %hash = key1-dim1 => key1-dim2 => 42,
key2-dim1 => [42, 42];
say %hash{'key1-dim1'}{'key1-dim2'}; # 42
say %hash{'key2-dim1'}{'key2-dim2'}; # Type Array does not support associative indexing.
The problem is that the value associated with key2-dim1 isn't itself a hash but is instead an Array. Arrays (and all other Positionals) only support indexing by position -- by integer. They don't support indexing by association -- by string or object key.
Hopefully that explains that bit. See also a search of SO using the [raku] tag plus 'Type Array does not support associative indexing'.
Your longer example throws an error at this line -- not immediately, but eventually:
say %counts{...}{...}; # Type Array does not support associative indexing.
The hash %counts is constructed by the previous line:
%counts.push: ...
Excerpting the doc for Hash.push:
If a key already exists in the hash ... old and new value are both placed into an Array
Example:
my %h = a => 1;
%h.push: (a => 1); # a => [1,1]
Now consider that the following code would have the same effect as the example from the doc:
my %h;
say %h.push: (a => 1); # {a => 1}
say %h.push: (a => 1); # {a => [1,1]}
Note how the first .push of a => 1 results in a 1 value for the a key of the %h hash, while the second .push of the same pair results in a [1,1] value for the a key.
A similar thing is going on in your code.
In your code, you're pushing the value %wordHash into the @words[$i] key of the %counts hash.
The first time you do this the resulting value associated with the @words[$i] key in %counts is just the value you pushed -- %wordHash. This is just like the first push of 1 above resulting in the value associated with the a key, from the push, being 1.
And because %wordHash is itself a hash, you can associatively index into it. So %counts{...}{...} works.
But the second time you push a value to the same %counts key (i.e. when the key is %counts{@words[$i]}, with @words[$i] set to a word/string/key that is already held by %counts), the value associated with that key will end up being not %wordHash but [%wordHash, %wordHash].
And you clearly do get such a second time in your code, if the @tracks you are feeding in have titles that begin with the same word. (I think the same is true even if the duplication isn't the first word but instead later ones. But I'm too confused by your code to be sure what the exact broken combinations are. And it's too late at night for me to try to understand it, especially given that it doesn't seem important anyway.)
So when your code then evaluates %counts{@words[$i]}{@words[$i+1]}, it is the same as [%wordHash, %wordHash]{...}. That doesn't make sense, so you get the error you see.
Hopefully the foregoing has been helpful.
But I must say I'm both confused by your code, and intrigued as to what you're actually trying to accomplish.
I get that you're just learning Raku, and that what you've gotten from this SO might already be enough for you, but Raku has a range of nice high level hash like data types and functionality, and if you describe what you're aiming at we might be able to help with more than just clearing up Raku wrinkles that you and we have been dealing with thus far.
Regardless, welcome to SO and Raku. :)
Well, this one was kind of funny and surprising. You can't go wrong if you follow the other question, however, here's a modified version of your program:
my @tracks = ['love is love','love is in the air', 'love love love'];
my %counts;
for @tracks -> $title {
$_ = $title;
my @words = split(/\s/, $_);
if (@words.elems > 1) {
my $i = 0;
while (@words.elems - $i > 1) {
my %wordHash = ();
%wordHash{@words[$i + 1]} = 1;
%counts{@words[$i]} = %wordHash;
say %counts{@words[$i]}{@words[$i+1]}; # The buck stops here
say %counts.kv;
$i = $i + 1;
}
}
}
Please check the line where it crashed before. Can you spot the difference? It was kind of a (un)lucky thing that you used i as a loop variable... i is a complex number in Raku. So it was crashing because it couldn't use complex numbers to index an array. You simply had dropped the $.
You can use sigilless variables in Raku, as long as they're not i, or e, or any of the other constants that are already defined.
I've also made a couple of changes to better reflect the fact that you're building a Hash and not an array of Pairs, as Lukas Valle said.

What does !$var mean in Perl?

I got some code. It works, but I don't understand this part: !$dump_done...
my $dump_done = 0;
foreach my $line(keys %results){
if ($results{$line} == 1 and !$dump_done) {
print Dump($post);
$dump_done = 1;
}
}
! is the Logical NOT operator. It will return the negation of $dump_done. If $dump_done contains 0, the negation will give you 1:
my $dump_done = 0;
print !$dump_done; # Prints 1
This works because in Perl the values 0, '0', the empty string and undef are considered false, and every other value is considered true.
You can try out this snippet:
if (5) {
print "Hello"; # Will be executed.
}
The ! character stands for NOT in most programming languages; it is the negation.
If the value of your variable $dump_done is still zero, testing $dump_done returns FALSE (0). If you negate this expression, you get a TRUE expression (!= 0).
See Truth and Falsehood
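As a small sketch of how the flag from the question behaves, the loop below records what would have been dumped instead of calling Dump. The %results contents and the @dumped and @false_values arrays are made up for illustration.

```perl
use strict;
use warnings;

my %results   = (line1 => 1, line2 => 1, line3 => 0);
my $dump_done = 0;
my @dumped;                      # records what would have been dumped

foreach my $line (sort keys %results) {
    # True only while the flag is still false, so the body runs once.
    if ($results{$line} == 1 and !$dump_done) {
        push @dumped, $line;
        $dump_done = 1;
    }
}
print "dumped: @dumped\n";

# Which of these values does ! turn into true?
my @false_values = grep { !$_ } (0, '0', '', 1, 'abc', '0.0', '00');
print "false values: ", scalar(@false_values), "\n";
```

Only line1 is recorded, even though line2 also has the value 1, because the flag is set after the first hit. Of the seven test values only 0, '0' and '' are false; the strings '0.0' and '00' are true.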

What do Perl functions that return Boolean actually return

The Perl defined function (and many others) returns "a Boolean value".
Given that Perl doesn't actually have a Boolean type (and uses values like 1 for true, and 0 or undef for false), does the Perl language specify exactly what is returned for a Boolean value? For example, would defined(undef) return 0 or undef, and is it subject to change?
In almost all cases (i.e. unless there's a reason to do otherwise), Perl returns one of two statically allocated scalars: &PL_sv_yes (for true) and &PL_sv_no (for false). This is them in detail:
>perl -MDevel::Peek -e"Dump 1==1"
SV = PVNV(0x749be4) at 0x3180b8
REFCNT = 2147483644
FLAGS = (PADTMP,IOK,NOK,POK,READONLY,pIOK,pNOK,pPOK)
IV = 1
NV = 1
PV = 0x742dfc "1"\0
CUR = 1
LEN = 12
>perl -MDevel::Peek -e"Dump 1==0"
SV = PVNV(0x7e9bcc) at 0x4980a8
REFCNT = 2147483647
FLAGS = (PADTMP,IOK,NOK,POK,READONLY,pIOK,pNOK,pPOK)
IV = 0
NV = 0
PV = 0x7e3f0c ""\0
CUR = 0
LEN = 12
yes is a triple var (IOK, NOK and POK). It contains a signed integer (IV) equal to 1, a floating point number (NV) equal to 1, and a string (PV) equal to 1.
no is also a triple var (IOK, NOK and POK). It contains a signed integer (IV) equal to 0, a floating point number (NV) equal to 0, and an empty string (PV). This means it stringifies to the empty string, and it numifies to 0. It is neither equivalent to an empty string
>perl -wE"say 0+(1==0);"
0
>perl -wE"say 0+'';"
Argument "" isn't numeric in addition (+) at -e line 1.
0
nor to 0
>perl -wE"say ''.(1==0);"
>perl -wE"say ''.0;"
0
There's no guarantee that this will always remain the case. And there's no reason to rely on this. If you need specific values, you can use something like
my $formatted = $result ? '1' : '0';
They return a special false value that is "" in string context but 0 in numeric context (without a non-numeric warning). The true value isn't so special, since it's 1 in either context. defined() does not return undef.
(You can create similar values yourself with e.g. Scalar::Util::dualvar(0,"").)
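A minimal sketch of what this answer describes, assuming only the core module Scalar::Util; the variable names are illustrative:

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

my $unset;
my $false = defined($unset);      # the built-in false value

my $is_defined = defined($false) ? 'yes' : 'no';  # false is not undef
my $as_string  = "$false";        # empty string
my $as_number  = 0 + $false;      # 0, and no "isn't numeric" warning

# A hand-made value with the same two faces.
my $fake     = dualvar(0, '');
my $fake_str = "$fake";
my $fake_num = 0 + $fake;

print "defined: $is_defined, string: '$as_string', number: $as_number\n";
print "dualvar string: '$fake_str', number: $fake_num\n";
```

Both the built-in false value and the dualvar stringify to '' and numify to 0 without a warning.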
Since that's the official man page I'd say that its exact return value is not specified. If the Perl documentation talks about a Boolean value then it almost always talks about evaluating said value in a Boolean context: if (defined ...) or print while <> etc. In such contexts several values evaluate to false: 0, undef, "" (empty strings), even strings equalling "0".
All other values evaluate to true in a Boolean context, including the infamous example "0 but true".
As the documentation is that vague I would not ever rely on defined() returning any specific value for the undefined case. However, you'll always be OK if you simply use defined() in a Boolean context without comparing it to a specific value.
OK: print "yes\n" if defined($var)
Not portable/future proof: print "yes\n" if defined($var) eq '' or something similar
It probably won't ever change, but perl does not specify the exact boolean value that defined(...) returns.
When using Boolean values good code should not depend on the actual value used for true and false.
Example:
# not so great code:
my $bool = 0; #
...
if (some condition) {
$bool = 1;
}
if ($bool == 1) { ... }
# better code:
my $bool; # default value is undef which is false
$bool = some condition;
if ($bool) { ... }
99.9% of the time there is no reason to care about the value used for the boolean.
That said, there are some cases when it is better to use an explicit 0 or 1 instead of the boolean-ness of a value. Example:
sub foo {
my $object = shift;
...
my $bool = $object;
...
return $bool;
}
The intent is that foo() is called with either a reference or undef and should return false if $object is not defined. The problem is that if $object is defined, foo() will return the object itself and thus create another reference to the object, and this may interfere with its garbage collection. So it would be better to use an explicit boolean value here, i.e.:
my $bool = $object ? 1 : 0;
So be careful about using a reference itself to represent its truthiness (i.e. its defined-ness) because of the potential for creating unwanted references to the reference.
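The effect on object lifetime can be made visible with a destructor. This is only a sketch: the Tracked class, the two helper subs and the counter variables are hypothetical names invented for the demonstration.

```perl
use strict;
use warnings;

my $destroyed = 0;                 # counts DESTROY calls

package Tracked;
sub new     { return bless {}, shift }
sub DESTROY { $destroyed++ }

package main;

sub returns_object { my $object = shift; return $object }
sub returns_flag   { my $object = shift; return $object ? 1 : 0 }

my $first = Tracked->new;
my $kept  = returns_object($first);   # second reference to the object
undef $first;
my $count_while_kept = $destroyed;    # object is still alive

undef $kept;
my $count_after_release = $destroyed; # now it has been destroyed

my $second = Tracked->new;
my $flag   = returns_flag($second);   # plain 1, no extra reference
undef $second;
my $count_with_flag = $destroyed;     # destroyed immediately

print "$count_while_kept $count_after_release $count_with_flag\n";
```

This prints "0 1 2": the first object survives until the returned "boolean" is released, while the second is freed as soon as its only reference goes away.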

How can I set a default value for a Perl variable?

I am completely new to Perl. I needed to use an external module HTTP::BrowserDetect. I was testing some code and tried to get the name of the OS from os_string method. So, I simply initialized the object and created a variable to store the value returned.
my $ua = HTTP::BrowserDetect->new($user_agent);
my $os_name = $ua->os_string();
print "$user_agent $os_name\n";
There are some user agents that are not browser user agents, so they won't get any value from os_string. I am getting the warning Use of uninitialized value $os_name in concatenation (.) or string
How do I handle such cases, where $os_name is not initialized because the method os_string returns undef (this is what I think happens from reading the module source code)? I guess there should be a way to give a default string, e.g. No OS, in these cases.
Please note: the original answer's approach ( $var = EXPRESSION || $default_value ) would produce the default value for ANY "Perl false values" returned from the expression, e.g. if the expression is an empty string, you will use the default value instead.
In your particular example it's probably the right thing to do (an OS name should not be an empty string), but in the general case it can be a bug if what you actually wanted was to use the default value only in place of undef.
If you only want to avoid undef values, but not empty string or zeros, you can do:
Perl 5.10 and later: use the "defined-or operator" (//):
my $os_name = $ua->os_string() // 'No OS';
Perl 5.8 and earlier: use a conditional operator because defined-or was not yet available:
my $os_string = $ua->os_string(); # Cache the value to optimize a bit
my $os_name = (defined $os_string) ? $os_string : 'No OS';
# ... or, un-optimized version (sometimes more readable but rarely)
my $os_name = (defined $ua->os_string()) ? $ua->os_string() : 'No OS';
A much more in-depth look at the topic (including details about //) is in brian d foy's post here: http://www.effectiveperlprogramming.com/2010/10/set-default-values-with-the-defined-or-operator/
my $os_name = $ua->os_string() || 'No OS';
If $ua->os_string() is falsy (ie: undef, zero, or the empty string), then the second part of the || expression will be evaluated (and will be the value of the expression).
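To compare the two operators side by side, here is a short sketch that needs Perl 5.10 or later for //. The helper subs with_or and with_defined_or are hypothetical wrappers, and plain values stand in for what os_string might return.

```perl
use strict;
use warnings;

# Hypothetical helpers wrapping the two operators.
sub with_or         { my ($value) = @_; return $value || 'No OS' }
sub with_defined_or { my ($value) = @_; return $value // 'No OS' }

my @inputs = (undef, '', 0, 'Linux');
my @or_results         = map { with_or($_) } @inputs;
my @defined_or_results = map { with_defined_or($_) } @inputs;

print join('|', @or_results), "\n";
print join('|', @defined_or_results), "\n";
```

The first line printed is "No OS|No OS|No OS|Linux" and the second is "No OS||0|Linux": || replaces every false value, while // replaces only undef.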
Are you looking for defined?

Secondary Order in Heap::Simple

How do I define a secondary ordering to the Heap::Simple interface in Perl?
The documentation states that the constructor takes a code reference to define
the order, so you can specify any sort method you like:
my $heap = Heap::Simple->new(order => \&sort_method);
Every time two keys need to be compared, the given code reference will be called like:
$less = $code_reference->($key1, $key2);
This should return a true value if $key1 is smaller than $key2 and a false
value otherwise. $code_reference should imply a total order relation, so it
needs to be transitive.
By "secondary ordering" I assume you mean that a second comparison is used if
the first one shows the values to be equal. Let's say the first comparison is
of values found via the "method1" method, and the second comparison is of
values from "method2". So, if by method1 the values are different, return
that result, and otherwise fall back to method2:
sub sort_method
{
my ($val1, $val2) = @_;
my $result = ($val1->method1 <=> $val2->method1)
||
($val1->method2 <=> $val2->method2);
return 1 if $result == -1;
}
If method1 and method2 return strings instead of numeric values, simply use
the cmp operator instead of <=>. You can use anything you like, as long
as the operator returns the right values. Most sort functions like using the
values -1, 0 and 1 to indicate whether value1 is less than, equal to, or
greater than value2, but this module likes 1 to mean val1 < val2, so after
gathering the -1, 0, 1 result, one then returns 1 if the result is -1 (where
value1 is less than value2).
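The same idea can be tried out without installing the module. The sketch below uses plain hash references with made-up priority and name fields instead of method1 and method2, and checks the comparator with the built-in sort; a sub like less_than could then be passed as order => \&less_than, as in the constructor call quoted above.

```perl
use strict;
use warnings;

# Returns true if $left sorts strictly before $right:
# first by priority (numeric), then by name (string) as the tie-breaker.
sub less_than {
    my ($left, $right) = @_;
    my $result = ($left->{priority} <=> $right->{priority})
              || ($left->{name}     cmp $right->{name});
    return $result == -1 ? 1 : 0;
}

my @records = (
    { priority => 2, name => 'b' },
    { priority => 1, name => 'z' },
    { priority => 2, name => 'a' },
    { priority => 1, name => 'c' },
);

# Check the ordering with the built-in sort, which wants -1/0/1.
my @sorted = sort {
    less_than($a, $b) ? -1 : less_than($b, $a) ? 1 : 0
} @records;

my $order = join ' ', map { "$_->{priority}:$_->{name}" } @sorted;
print "$order\n";
```

This prints "1:c 1:z 2:a 2:b": records with equal priority are ordered by name.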
First of all, you write a function that takes two of the objects you want to put in the heap, and returns a true value if the first one is smaller than the second, and false otherwise.
Then supply that as a coderef to Heap::Simple.
The example from the Heap::Simple docs is as follows:
use Heap::Simple;
sub more { return $_[0] > $_[1] }
my $heap = Heap::Simple->new(order => \&more);
$heap->insert(8, 3, 14, -1, 3);
print $heap->extract_top, " " for 1..$heap->count;
print "\n";
# Will print: 14 8 3 3 -1