Error: "Cannot copy to ARRAY in scalar assignment" - perl

The situation is that I am in a sub trying to use the caller's $a and $b, similar to what sort does. Most of the time it works, but while running unit tests I noticed in one case, inside one package, it did not work. I use caller to get the package of the caller and then set their $a and $b as shown in the following simplified demo. The problem does occur by simply calling this whatever function in the package where I discovered this.
sub whatever { # defined in main package
my $pkg = caller;
no strict 'refs';
${"${pkg}::a"} = "";
${"${pkg}::b"} = "";
}
I attempted to create a minimal package/class to reproduce the problem but the problem does not occur in this context, but I'm posting my attempt below anyway as an indication of the general context:
package WhateverClass {
sub new {
my ($class, $something, $something2) = #_;
my $this = {};
bless $this, $class;
$this->{something_} = $something;
$this->{something2_} = $something2;
return $this;
}
sub test { # bug reproduction
main::whatever();
}
}
my $obj = WhateverClass->new(1.0, 2.0);
$obj->test;
The error message is,
Error: "Cannot copy to ARRAY in scalar assignment"
And it is triggered by the exact line:
${"${pkg}::a"} = "";
Having narrowed it down to this "whatever" function, I tried putting a variety of things on the right side of that assignment, including arrayrefs, arrays, strings as shown, numbers, as well as undef. The ones that it accepts are only undef, integer values, and floating point values. It does not accept arrayrefs, hashrefs, or strings. In my case, in the original code that exposed this problem, the things being passed were object references, or blessed hashrefs, and those assignments fail as you'd expect if any hashref or arrayref assignment fails.
Even more strangely, under the Perl debugger the problem doesn't occur, but if I run normally it does.
Google searching for this turns up nothing matching this exact error and very little that is even close. So first question is what does this error message even mean? Second question is obviously how to move forward.
I'm using Perl 5.20.3 on Linux, but I also tried the latest 5.22 on a Windows machine and saw the same behavior.

I found the fix for me had to do with the fact that the package that was the only location where any problems were seen with use of this code had at one point in its lifetime prior to the problem appearing used List::MoreUtils pairwise. Some but not all of the problem behavior was reproducible in the debugger, and I tracked down where the "buggy" behavior began and it was after calling pairwise. I switched this call to use one of my own routines that does essentially the same thing and the problem is completely gone (thank God). This was quite a disturbing issue. As #ikegami noted, "buggy XS code" may be the problem, and my routine is pure Perl without any XS code.

Related

Is the use of an uninitialized variable undefined behavior?

I don't know if "undefined behavior" means something in Perl but I would like to know if using not initialized variables in Perl may provoke unwanted behaviors.
Let's consider the following script:
use strict;
use warnings FATAL => 'all';
use P4;
my $P4;
sub get {
return $P4 if $P4;
# ...connection to Perforce server and initialization of $P4 with a P4 object...
return $P4;
}
sub disconnect {
$P4 = $P4->Disconnect() if $P4;
}
sub getFixes {
my $change = shift;
my $p4 = get();
return $p4->Run( "fixes", "-c", $change );
}
Here, the variable $P4, which is meant to store a P4 object after a connection to a Perforce server, is not initialized at the beginning of the script. However, whatever the function which is called first (get, disconnect or getFixes), the variable will be initialized before being used.
Is there any risk to do that? Should I explicitly initialized the $P4 variable at the beginning of the script?
Just a couple of straight-up answers to basic questions asked.
if "undefined behavior" means something in Perl
Yes, there is such a notion in Perl, and documentation warns of it (way less frequently than in C). See some examples in footnote †. On the other hand, at many places in documentation one finds a discussion ending with
... So don't do that.
It often comes up for things that would confuse the interpreter and could result in strange and possibly unpredictable behavior. These are sometimes typical "undefined behavior" even as they are not directly called as such.
The main question is of how uninitialized variables relate, per the title and
if using not initialized variables in Perl may provoke unwanted behaviors
This does not generally result in "undefined behavior" but it may of course lead to trouble and one mostly gets a warning for it. Unless the variable is legitimately getting initialized in such "use" of course. For example,
my $x;
my $z = $x + 3;
will draw a warning for the use of $x but not for $z (if warnings are on!). Note that this still succeeds as $x gets initialized to 0. (But in what is shown in the question the code will abort at that point, due to the FATAL.)
The code shown in the question seems fine in this sense, since as you say
the variable will be initialized before being used
Testing for truth against an uninitialized variable is fine since once it is declared it is equipped with the value undef, admissible (and false) in such tests.
See the first few paragraphs in Declarations in perlsyn for a summary of sorts on when one does or doesn't need a variable to be defined.
† A list of some behaviors specifically labeled as "undefined" in docs
Calling sort in scalar context
In list context, this sorts the LIST and returns the sorted list value. In scalar context, the behaviour of sort is undefined.
Length too great in truncate
The behavior is undefined if LENGTH is greater than the length of the file.
Using flags for sysopen which are incompatible (nonsensical)
The behavior of O_TRUNC with O_RDONLY is undefined.
Sending signals to a process-list with kill, where one can use negative signal or process number to send to a process group
If both the SIGNAL and the PROCESS are negative, the results are undefined. A warning may be produced in a future version.
From Auto-increment and Auto-decrement (perlop)
... modifying a variable twice in the same statement will lead to undefined behavior.
Iterating with each, tricky as it may be anyway, isn't well behaved if hash is inserted into
If you add or delete a hash's elements while iterating over it, the effect on the iterator is unspecified; for example, entries may be skipped or duplicated--so don't do that. It is always safe to delete the item most recently returned by each, ...
This draws a runtime warning (F), described in perldiag
Use of each() on hash after insertion without resetting hash iterator results in undefined behavior.
Statement modifier (perlsyn) used on my
The behaviour of a my, state, or our modified with a statement modifier conditional or loop construct (for example, my $x if ...) is undefined.
Some of these seem a little underwhelming (predictable), given what UB can mean. Thanks to ikegami for comments. A part of this list is found in this question.
Pried from docs current at the time of this posting (v5.32.1)
A variable declared with my is initialized with undef. There is no undefined behaviour here.
This is documented in perldoc persub:
If no initializer is given for a particular variable, it is created with the undefined value.
However, the curious construct my $x if $condition does have undefined behaviour. Never do that.
my initializes scalars to undef, and arrays and hashes to empty.
Your code is fine, though I would take a different approach to destruction.
Option 1: Provide destructor through wrapping
use Object::Destroyer qw( );
use P4 qw( );
my $P4;
sub get {
return $P4 ||= do {
my $p4 = P4->new();
$p4->SetClient(...);
$p4->SetPort(...);
$p4->SetPassword(...);
$p4->Connect()
or die("Failed to connect to Perforce Server" );
Object::Destroyer->new($p4, 'Disconnect')
};
}
# No disconnect sub
Option 2: Provide destructor through monkey-patching
use P4 qw( );
BEGIN {
my $old_DESTROY = P4->can('DESTROY');
my $new_DESTROY = sub {
my $self = shift;
$self->Disconnect();
$old_DESTROY->($self) if $old_DESTROY;
};
no warnings qw( redefined );
*P4::DESTROY = $new_DESTROY;
}
my $P4;
sub get {
return $P4 ||= do {
my $p4 = P4->new();
$p4->SetClient(...);
$p4->SetPort(...);
$p4->SetPassword(...);
$p4->Connect()
or die("Failed to connect to Perforce Server" );
$p4
};
}
# No disconnect sub

clean way to pass argument to perl subroutine to child routine as they are without storing to variable first?

I want to pass the arguments of my routine to a subroutine as they are, possibly while adding a new argument. To give an example imagine something like this
sub log($$$){
my ($message, $log_location, $log_level) = #_;
#logs message
}
sub log_debug($$){
my ($message, $log_location) = #_;
log($message, $log_location, DEBUG_LOG_LEVEL);
}
That syntax works fine above, but requires my saving the #_ to intermediate variables to do it. I'm wondering if there is a simple clean syntax for doing so without using the intermediate variables. Something like
log(#_, DEBUG_LOG_LEVEL);
which gets an error about my "not having enough variables", but I think would otherwise work. Can this be done easily without warning?
You don't have to copy the elements of #_. You could use them directly as follows:
sub log_debug($$) {
log($_[0], $_[1], DEBUG_LOG_LEVEL);
}
Prefixing the call with & causes the prototype to be ignored, so you could also use the following:
sub log_debug($$) {
&log(#_, DEBUG_LOG_LEVEL);
}
If you eliminate the arguments (including ()), the callee will use the parent's #_ instead of creating a new one. This following is slightly more efficient than the previous solution:
sub log_debug($$) {
push #_, DEBUG_LOG_LEVEL;
&log;
}
Finally, if log obtains a stack trace, you might want to remove log_debug from the trace. This is slightly less efficient.
sub log_debug($$) {
push #_, DEBUG_LOG_LEVEL;
goto &log;
}
Note that none of four solutions I posted make a copy of the arguments.
Sure, just skip the prototypes
sub log {
my ($message, $log_location, $log_level) = #_;
#logs message
}
sub log_debug {
log(#_, DEBUG_LOG_LEVEL);
}
You can skip prototype checking by calling your subroutine with an &:
&log(#_, DEBUG_LOG_LEVEL);
According to perlsub:
Not only does the & form make the argument list optional, it also
disables any prototype checking on arguments you do provide. This is
partly for historical reasons, and partly for having a convenient way
to cheat if you know what you're doing.

Why do '::' and '->' work (sort of) interchangeably when calling methods from Perl modules?

I keep getting :: confused with -> when calling subroutines from modules. I know that :: is more related to paths and where the module/subroutine is and -> is used for objects, but I don't really understand why I can seemingly interchange both and it not come up with immediate errors.
I have perl modules which are part of a larger package, e.g. FullProgram::Part1
I'm just about getting to grips with modules, but still am on wobbly grounds when it comes to Perl objects, but I've been accidentally doing this:
FullProgram::Part1::subroutine1();
instead of
FullProgram::Part1->subroutine1();
so when I've been passing a hash ref to subroutine1 and been careful about using $class/$self to deal with the object reference and accidentally use :: I end up pulling my hair out wondering why my hash ref seems to disappear. I have learnt my lesson, but would really like an explanation of the difference. I have read the perldocs and various websites on these but I haven't seen any comparisons between the two (quite hard to google...)
All help appreciated - always good to understand what I'm doing!
There's no inherent difference between a vanilla sub and one's that's a method. It's all in how you call it.
Class::foo('a');
This will call Class::foo. If Class::foo doesn't exist, the inheritance tree will not be checked. Class::foo will be passed only the provided arguments ('a').
It's roughly the same as: my $sub = \&Class::foo; $sub->('a');
Class->foo('a');
This will call Class::foo, or foo in one of its base classes if Class::foo doesn't exist. The invocant (what's on the left of the ->) will be passed as an argument.
It's roughly the same as: my $sub = Class->can('foo'); $sub->('Class', 'a');
FullProgram::Part1::subroutine1();
calls the subroutine subroutine1 of the package FullProgram::Part1 with an empty parameter list while
FullProgram::Part1->subroutine1();
calls the same subroutine with the package name as the first argument (note that it gets a little bit more complex when you're subclassing). This syntax is used by constructor methods that need the class name for building objects of subclasses like
sub new {
my ($class, #args) = #_;
...
return bless $thing, $class;
}
FYI: in Perl OO you see $object->method(#args) which calls Class::method with the object (a blessed reference) as the first argument instead of the package/class name. In a method like this, the subroutine could work like this:
sub method {
my ($self, $foo, $bar) = #_;
$self->do_something_with($bar);
# ...
}
which will call the subroutine do_something_with with the object as first argument again followed by the value of $bar which was the second list element you originally passed to method in #args. That way the object itself doesn't get lost.
For more informations about how the inheritance tree becomes important when calling methods, please see ikegami's answer!
Use both!
use Module::Two;
Module::Two::->class_method();
Note that this works but also protects you against an ambiguity there; the simple
Module::Two->class_method();
will be interpreted as:
Module::Two()->class_method();
(calling the subroutine Two in Module and trying to call class_method on its return value - likely resulting in a runtime error or calling a class or instance method in some completely different class) if there happens to be a sub Two in Module - something that you shouldn't depend on one way or the other, since it's not any of your code's business what is in Module.
Historically, Perl dont had any OO. And functions from packages called with FullProgram::Part1::subroutine1(); sytax. Or even before with FullProgram'Part1'subroutine1(); syntax(deprecated).
Later, they implemented OOP with -> sign, but dont changed too much actually. FullProgram::Part1->subroutine1(); calls subroutine1 and FullProgram::Part1 goes as 1st parameter. you can see usage of this when you create an object: my $cgi = CGI->new(). Now, when you call a method from this object, left part also goes as first parameter to function: $cgi->param(''). Thats how param gets object he called from (usually named $self). Thats it. -> is hack for OOP. So as a result Perl does not have classes(packages work as them) but does have objects("objects" hacks too - they are blessed scalars).
Offtop: Also you can call with my $cgi = new CGI; syntax. This is same as CGI->new. Same when you say print STDOUT "text\n";. Yeah, just just calling IOHandle::print().

How does this call to a subroutine in a Perl module work?

I recently saw some Perl code that confused me. I took out all of the extra parts to see how it was working, but I still don't understand why it works.
Basically, I created this dummy "module" (TTT.pm):
use strict;
use warnings;
package TTT;
sub new {
my $class = shift;
return bless {'Test' => 'Test'}, $class;
}
sub acquire {
my $tt = new TTT();
return $tt;
}
1;
Then I created this script to use the module (ttt.pl):
#!/usr/bin/perl
use strict;
use warnings;
use TTT;
our $VERSION = 1;
my $tt = acquire TTT;
print $tt->{Test};
The line that confuses me, that I thought would not work, is:
my $tt = acquire TTT;
I thought it would not work since the "acquire" sub was never exported. But it does work.
I was confused by the "TTT" after the call to acquire, so I removed that, leaving the line like this:
my $tt = acquire;
And it complained of a bareword, like I thought it would. I added parens, like this:
my $tt = acquire();
And it complained that there wasn't a main::acquire, like I thought it would.
I'm used to the subroutines being available to the object, or subroutines being exported, but I've never seen a subroutine get called with the package name on the end. I don't even know how to search for this on Google.
So my question is, How does adding the package name after the subroutine call work? I've never seen anything like that before, and it probably isn't good practice, but can someone explain what Perl is doing?
Thanks!
You are using the indirect object syntax that Perl allows (but in modern code is discouraged). Basically, if a name is not predeclared, it can be placed in front of an object (or class name) separated with a space.
So the line acquire TTT actually means TTT->acquire. If you actually had a subroutine named acquire in scope, it would instead be interpreted as aquire(TTT) which is can lead to ambiguity (hence why it is discouraged).
You should also update the new TTT(); line in the method to read TTT->new;
It's the indirect object syntax for method calls, which lets you put the method name before the object name.
As the documentation there shows, it's best avoided because it's unwieldy and it can break in unpredictable ways, for example if there is an imported or defined subroutine named acquire — but it used to be more common than it is today, and so you will find it pretty often in old code and docs.

Why shouldn't I use UNIVERSAL::isa?

According to this
http://perldoc.perl.org/UNIVERSAL.html
I shouldn't use UNIVERSAL::isa() and should instead use $obj->isa() or CLASS->isa().
This means that to find out if something is a reference in the first place and then is reference to this class I have to do
eval { $poss->isa("Class") }
and check $# and all that gumph, or else
use Scalar::Util 'blessed';
blessed $ref && $ref->isa($class);
My question is why? What's wrong with UNIVERSAL::isa called like that? It's much cleaner for things like:
my $self = shift if UNIVERSAL::isa($_[0], __PACKAGE__)
To see whether this function is being called on the object or not. And is there a nice clean alternative that doesn't get cumbersome with ampersands and potentially long lines?
The primary problem is that if you call UNIVERSAL::isa directly, you are bypassing any classes that have overloaded isa. If those classes rely on the overloaded behavior (which they probably do or else they would not have overridden it), then this is a problem. If you invoke isa directly on your blessed object, then the correct isa method will be called in either case (overloaded if it exists, UNIVERSAL:: if not).
The second problem is that UNIVERSAL::isa will only perform the test you want on a blessed reference just like every other use of isa. It has different behavior for non-blessed references and simple scalars. So your example that doesn't check whether $ref is blessed is not doing the right thing, you're ignoring an error condition and using UNIVERSAL's alternate behavior. In certain circumstances this can cause subtle errors (for example, if your variable contains the name of a class).
Consider:
use CGI;
my $a = CGI->new();
my $b = "CGI";
print UNIVERSAL::isa($a,"CGI"); # prints 1, $a is a CGI object.
print UNIVERSAL::isa($b,"CGI"); # Also prints 1!! Uh-oh!!
So, in summary, don't use UNIVERSAL::isa... Do the extra error check and invoke isa on your object directly.
See the docs for UNIVERSAL::isa and UNIVERSAL::can for why you shouldn't do it.
In a nutshell, there are important modules with a genuine need to override 'isa' (such as Test::MockObject), and if you call it as a function, you break this.
I have to say, my $self = shift if UNIVERSAL::isa($_[0], __PACKAGE__) doesn't look terribly clean to me - anti-Perl advocates would be complaining about line noise. :)
To directly answer your question, the answer is at the bottom of the page you linked to, namely that if a package defines an isa method, then calling UNIVERSAL::isa directly will not call the package isa method. This is very unintuitive behaviour from an object-orientation point of view.
The rest of this post is just more questions about why you're doing this in the first place.
In code like the above, in what cases would that specific isa test fail? i.e., if it's a method, in which case would the first argument not be the package class or an instance thereof?
I ask this because I wonder if there is a legitimate reason why you would want to test whether the first argument is an object in the first place. i.e., are you just trying to catch people saying FooBar::method instead of FooBar->method or $foobar->method? I guess Perl isn't designed for that sort of coddling, and if people mistakenly use FooBar::method they'll find out soon enough.
Your mileage may vary.
Everyone else has told you why you don't want to use UNIVERSAL::isa, because it breaks when things overload isa. If they've gone to all the habit of overloading that very special method, you certainly want to respect it. Sure, you could do this by writing:
if (eval { $foo->isa("thing") }) {
# Do thingish things
}
because eval guarantees to return false if it throws an exception, and the last value otherwise. But that looks awful, and you shouldn't need to write your code in funny ways because the language wants you to. What we really want is to write just:
if ( $foo->isa("thing") ) {
# Do thingish things
}
To do that, we'd have to make sure that $foo is always an object. But $foo could be a string, a number, a reference, an undefined value, or all sorts of weird stuff. What a shame Perl can't make everything a first class object.
Oh, wait, it can...
use autobox; # Everything is now a first class object.
use CGI; # Because I know you have it installed.
my $x = 5;
my $y = CGI->new;
print "\$x is a CGI object\n" if $x->isa('CGI'); # This isn't printed.
print "\$y is a CGI object\n" if $y->isa('CGI'); # This is!
You can grab autobox from the CPAN. You can also use it with lexical scope, so everything can be a first class object just for the files or blocks where you want to use ->isa() without all the extra headaches. It also does a lot more than what I've covered in this simple example.
Assuming your example of what you want to be able to do is within an object method, you're being unnecessarily paranoid. The first passed item will always be either a reference to an object of the appropriate class (or a subclass) or it will be the name of the class (or a subclass). It will never be a reference of any other type, unless the method has been deliberately called as a function. You can, therefore, safely just use ref to distinguish between the two cases.
if (ref $_[0]) {
my $self = shift;
# called on instance, so do instancey things
} else {
my $class = shift;
# called as a class/static method, so do classy things
}
Right. It does a wrong thing for classes that overload isa. Just use the following idiom:
if (eval { $obj->isa($class) }) {
It is easily understood and commonly accepted.
Update for 2020: Perl v5.32 has the class infix operator, isa, which handles any sort of thing on the lefthand side. If the $something is not an object, you get back false with no blowup.
use v5.32;
if( $something isa 'Animal' ) { ... }