returning a lazily-computed scalar, in perl - perl

I'm trying to add some functionality to our code base by using tied scalars.
We have a function which is specified to return scalars. I thought I could add some features to the system by tie-ing these scalars before returning them, but it looks like the FETCH method is called just before the return, which results in an untied scalar being returned.
Is there any way around this?
I really want to keep the subroutine's interface (returning scalars) intact if it's at all possible.
use strict;
use warnings;
main();
sub GetThing{
my $thing;
tie $thing, 'mything', #_;
return $thing;
}
sub main {
my %m;
$m{pre} = GetThing('Fred');
print "1\n";
print $m{pre};
print "2\n";
print $m{pre};
print "3\n";
}
package mything;
require Tie::Scalar;
my #ISA = qw(Tie::StdScalar);
sub TIESCALAR {
my $class = shift;
bless {
name => shift || 'noname',
}, $class;
}
sub FETCH {
my $self = shift;
print "ACCESS ALERT!\n";
return " NAME: '$self->{name}'\n";
}
Desired output:
1
ACCESS ALERT!
NAME: 'Fred'
2
ACCESS ALERT!
NAME: 'Fred'
3
I can get the desired output by returning a reference, and dereferencing on each access, but that ruins our established interface, and makes it more confusing for our users.
--Buck

As DVK said, tie applies to containers, so isn't useful for returned values.
For that, you use overloading. An example (not all the possible overloaded operations are supplied; see http://perldoc.perl.org/overload.html#Minimal-set-of-overloaded-operations):
use strict;
use warnings;
main();
sub GetThing{
my $thing;
$thing = "mything"->new(#_);
return $thing;
}
sub main {
my %m;
$m{pre} = GetThing('Fred');
print "1\n";
print $m{pre};
print "2\n";
print $m{pre};
print "3\n";
}
package mything;
use overload 'fallback' => 1, '""' => 'FETCH';
sub new {
my $class = shift;
bless {
name => shift || 'noname',
}, $class;
}
sub FETCH {
my $self = shift;
print "ACCESS ALERT!\n";
return " NAME: '$self->{name}'\n";
}

As mentioned in other answers, tie applies to containers, and not to values, so there is no way to assign a tied variable to another variable and retain the tied properties.
Since assignment is out, you need to pass the container into the GetThing routine. You can do this by reference as follows:
use strict;
use warnings;
main();
sub GetThing{
tie ${$_[1]}, 'mything', $_[0];
}
sub main {
my %m;
GetThing('Fred' => \$m{pre});
print "1\n";
print $m{pre};
print "2\n";
print $m{pre};
print "3\n";
}
package mything;
require Tie::Scalar;
my #ISA = qw(Tie::StdScalar);
sub TIESCALAR {
my $class = shift;
bless {
name => shift || 'noname',
}, $class;
}
sub FETCH {
my $self = shift;
print "ACCESS ALERT!\n";
return " NAME: '$self->{name}'\n";
}
which produces the correct output.
However, if you want to retain the assignment, you will need to use overloading, which applies to values (actually to objects, but they themselves are values). Without more detail on your intended purpose it is hard to give a complete answer, but this will meet your stated requirements:
use strict;
use warnings;
main();
sub GetThing{
return mything->new( shift );
}
sub main {
my %m;
$m{pre} = GetThing('Fred');
print "1\n";
print $m{pre};
print "2\n";
print $m{pre};
print "3\n";
}
package mything;
sub new {
my $class = shift;
bless {
name => shift || 'noname',
}, $class;
}
use overload '""' => sub { # '""' means to overload stringification
my $self = shift;
print "ACCESS ALERT!\n";
return " NAME: '$self->{name}'\n";
};
Both ties and overloads can get complicated, so read through all of the documentation if anything is not clear.

First, the exact method of doing what you are proposing seems technically impossible:
Tied variables have the tie attached to the variable itself, not to its value.
In Perl, subroutine's return values are returned by value, meaning you take the value passed to return, access it (in you case, accessing the tied variable and calling FETCH in the process) - and then copy that value! Which means that what the caller gets is a scalar VALUE, not a scalar variable (tied or untied).
Your confusion, in short, seems to stem from mixing together variables (locations in program's symbol table) and values stored in those variables.
Second, you were somewhat unclear as to what exactly you are trying to achieve, so it's hard to propose how to achieve what you want. But assuming, based on your description, that you wanted to call some method upon subroutine's return (possibly passing it the return value), you CAN do that.
To do so, you need to employ what fancy people call aspect programming. The politically (and technically) correct way of doing it in Perl is by using Moose.
However, you can DIY it, by basically replacing the original method with a wrapper method.
The exact mechanics of both Moose and DIY approaches can be seen in the first two answers to the following SO question, so I won't copy/paste them here, hope you don't mind:
Simulating aspects of static-typing in a duck-typed language

If you're feeling adventurous, you could also use the Scalar::Defer module which provides a general-purpose mechanism for a scalar variable to compute a value lazily, either once or on each access.

Related

Catching undefined value accessing in Perl

How can I catch access to member variables?
$Class1->{Class2}
If the Class2 field doesn't exist, is is possible to catch this from an internal function?
You can, but you probably shouldn't. The problem here is - if you access a variable within a class directly... then you just can. You can prevent this with a couple of workarounds - and this is where things like Moose come in.
And there's a couple of slightly hacky tricks like inside-out objects (which I think aren't common practice any more - Perl Best Practice advocated them some years back) or using anonymous hashes to hold state.
But failing that - why not use an accessor, and auto-generate one using 'AUTOLOAD'.
#!/usr/bin/env perl
package MyClass;
use strict;
use warnings;
use vars '$AUTOLOAD';
sub AUTOLOAD {
my ( $self ) = #_;
my $subname = $AUTOLOAD =~ s/.*:://r;
if ( $self -> {$subname} ) {
return $self -> {$subname};
}
warn "Sub called $subname was called\n";
return "$subname";
}
sub new {
my ( $class ) = #_;
my $self = {};
bless $self, $class;
}
package main;
use strict;
use warnings;
my $object = MyClass -> new;
$object -> {var} = "fleeg";
print "Undef fiddle was: ", $object -> fiddle,"\n";
print "But 'var' was: ", $object -> var,"\n";
This has the same problem, in that changing method names might cause things to break. However it has the advantage that you can handle 'invalid' method calls however you like.
But really - explicit 'get' and 'set' methods are better choices for most use-cases.
You do this by providing proper getter/setter methods that wrap around your class/instance variables. The internals should never be accessed directly, particularly from outside of the class itself (it's wise to not do so within the class either, except for the actual method that maintains that specific attribute. Here's a very basic example:
use warnings;
use strict;
package A;
sub new {
my ($class, %args) = #_;
my $self = bless {}, $class;
$self->x($args{x});
$self->y($args{y});
return $self;
}
sub x {
my ($self, $x) = #_;
$self->{x} = $x if defined $x;
return $self->{x} // 1;
}
sub y {
my ($self, $y) = #_;
$self->{y} = $y if defined $y;
return $self->{y} // 2;
}
package main;
my $obj = A->new(x => 5, y => 3);
print $obj->x ."\n";
print $obj->y ."\n";
Now, you could just as easily do print $obj->{x}, but that's where your problem is. What happens when the code is much more complicated than this, and for some reason you want to change the x attribute name to foo, but retain the x() method? $obj->{x} will now be undef as its never set.
Always use the provided methods for accessing attributes of a class/object. Encapsulation such as this is a staple of OO programming.

Manipulate Perl object by reference in subroutine

I have a Perl program and packages Worker and Log.
The Worker does almost all calculations, and I want to pass an object by reference to the Worker subroutine, as well as some other parameters (scalar and an array). I have seen examples like this and this.
They handle this by putting #_ in subs, then manipulating the object. I also found a way to manipulate them by using the index, like #{$_[i]}. Problem is, when I try the code like so, I get an error:
Can't call method "write" on unblessed reference at ...
Code snippets below.
Main:
use strict;
use warnings;
use Log;
use Worker;
my $log = Log->new();
my $worker = Worker->new();
my $scalar = "SomeURLhere";
my #array = ('red','blue','white');
# I do some stuff with $log object
#...
# Now I want to pass data to the Worker
$worker->subFromWorker($scalar, \$log, \#array);
Worker:
use strict;
use warnings;
package Worker;
sub new {
my $class = shift;
my $self = {};
bless $self, $class;
return $self;
}
sub subFromWorker{
my ($self) = shift;
my $scalar = $_[0];
#my ($log) = $_[1];
my #array = #{$_[2]};
foreach my $item (#array){
print $item;
}
$_[1]->write("The items from url $scalar are printed.");
#Same thing happens if I use $log here
}
In C#, this is handled in a different way - you can send a parameter to a method by value or by reference, and then do what you want in a specialized method (method is pre-written to handle parameters by reference or value). I thought that in Perl sending using \parameter will send the reference.
Objects are references. References are scalar values.
If you want to pass arrays or hashes into a subroutine then you usually want to pass references to them - because Perl parameter passing works far better with scalar values.
But $log is already a reference to your object. Therefore you don't need to take a reference to it. You end up passing a reference to a reference. So when you copy that parameter into $log inside your subroutine you have an extra, unnecessary, level of references.
The fix is to just pass the $log scalar into the subroutine.
$worker->subFromWorker($scalar, $log, \#array); # $log, not \$log
Everything else will then work fine.
You have read about the issues that prevent your program from working, but there are a few other things you should be aware of
Perl lexical identifiers and subroutine/method names consist of alphanumerics and underscore. Capital letters are reserved for global identifiers, such as package names like Worker and Log.
Packages that you use or require should end with the statement 1; so as to return a true value when they are imported, otherwise your program may fail to compile.
If a subroutine that you are writing happens to be a method, then it is clearest to start it by shifting off the $self parameter and making a copy of the rest:
my $self = shift;
my ($p1, $p2, $p3) = #_;
It is rare to use elements of #_ directly unless you're desperate for the minimal speed bonus
It is usually best to work directly with an array reference rather than copying the array, especially if it may be large.
Here is how I would code your program and associated modules:
program.pl
use strict;
use warnings;
use Worker;
use Log;
my $log = Log->new;
my $worker = Worker->new;
my $scalar = 'SomeURLhere';
my #array = qw/ red blue white /;
$worker->worker_method($scalar, $log, \#array);
Worker.pm
use strict;
use warnings;
package Worker;
sub new {
my $class = shift;
my $self = {};
bless $self, $class;
return $self;
}
sub worker_method {
my $self = shift;
my ($scalar, $log, $array) = #_;
foreach my $item (#$array) {
print $item, "\n";
}
$log->write("The items from URL $scalar are printed.");
}
1;
Log.pm
use strict;
use warnings;
package Log;
sub new {
my $class = shift;
bless {}, $class;
}
sub write {
my $self = shift;
my ($text) = #_;
print "Logging: $text\n"
}
1;
Output
red
blue
white
Logging: The items from URL SomeURLhere are printed.
A more common pattern is to use List assignment to unpack #_ into multiple variables all at once:
sub subFromWorker {
my ($self, $scalar, $log_ref, $array) = #_;
...
}
In reference to your specific problem:
my $log = Log->new();
$log is already a reference to your object, using \$log creates a reference to that reference which is not probably not what you want. You can handle this two ways:
only pass $log:
$worker->subFromWorker($scalar, $log, \#array);
dereference $log in subFromWorker before calling functions on it:
$$log_ref->write('...');

Locally change an attribute of a class in Perl

I have come across an odd problem in one of my Perl scripts. I have a Perl object. Within a certain scope I want one of the objects attributes to be changed, but I want the attribute to be restored to it's old value after it leaves the scope.
Example:
my $object = Object->new('name' => 'Bob');
{
# I know this doesn't work, but it is the best way
# I can represent what I amd trying to do.
local $object->name('Lenny');
# Prints "Lenny"
print $object->name();
}
# Prints "Bob"
print $object->name();
Is there a way to achieve something like this?
This might not be as much encapsulation as you were asking for, but you can local-ize an attribute of a hash. This outputs "CarlLennyCarl"
sub Object::new { bless { _name => $_[1] }, $_[0] } }
sub Object::name { $_[0]->{_name} }
my $obj = Object->new("Carl");
print $obj->name;
{
local $obj->{_name} = "Lenny";
print $obj->name;
}
print $obj->name;
You could also local-ize the entire method. This also outputs "CarlLennyCarl":
sub Object::new { bless { _name => $_[1] }, $_[0] } }
sub Object::name { $_[0]->{_name} }
my $obj = Object->new("Carl");
print $obj->name;
{
local *Object::name = sub { "Lenny" };
print $obj->name;
}
print $obj->name;
I was completely misunderstanding what was occurring there. You cannot use local on subroutine calls, that is the issue you are having.
Lets use a code example from one that I know works and try to explain what eval is actually doing.
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use Cwd;
print getcwd() . "\n";
eval{
local #INC = ('/tmp');
require 'test.pl';
print 'local: ' . Dumper(\#INC);
};
print Dumper(\#INC);
That works because I am modifying a variable, not calling on another subroutine to modify my variable.
In order for it to work as you are expecting, you would have to create a deep copy of the object to modify in local scope or something of the sort. (which I'm pretty sure is what is occurring in the first place)
local creates scope for the given brackets, eval, OR file (your problem there)
If you were able to access the elements directly without the method call (bad practice IMHO) you would likely be able to localize the scope of that element in the object.
Example:
name.pm:
package name;
use strict;
use warnings;
{
sub new {
my ($class,$name) = #_;
my $self = bless {}, $class;
$self->{'name'} = $name if defined $name;
return $self;
}
sub name
{
my ($self,$name) = #_;
$self->{'name'} = $name if defined $name;
return $self->{'name'};
}
}
index.pl:
#!/usr/bin/perl -w
use strict;
use warnings FATAL => 'all';
use name;
my $obj = name->new('test');
print $obj->{'name'} . "\n";
{
local $obj->{'name'} = 'test2';
print $obj->{'name'} . "\n";
}
print $obj->{'name'} . "\n";

Should a subroutine always return explicitly?

If perlcritic says "having no returns in a sub is wrong", what is the alternative if they really aren't needed?
I've developed two apparently bad habits:
I explicitly assign variables to the '$main::' namespace.
I then play with those variables in subs.
For example, I might do..
#!/usr/bin/perl
use strict;
use warnings;
#main::array = (1,4,2,6,1,8,5,5,2);
&sort_array;
&push_array;
&pop_array;
sub sort_array{
#main::array = sort #main::array;
for (#main::array){
print "$_\n";
}
}
sub push_array{
for ( 1 .. 9 ){
push #main::array, $_;
}
}
sub pop_array {
for ( 1 .. 3 ){
pop #main::array;
}
}
I don't do this all the time. But in the above, it makes sense, because I can segregate the operations, not have to worry about passing values back and forth and it generally looks tidy to me.
But as I said, perl critic says its wrong - because there's no return..
So, is anyone able to interpret what I'm trying to do and suggest a better way of approaching this style of coding in perl? eg. am I sort of doing OOP?
In short - yes, you're basically doing OO, but in a way that's going to confuse everyone.
The danger of doing subs like that is that you're acting at a distance. It's a bad coding style to have to look somewhere else entirely for what might be breaking your code.
This is generally why 'globals' are to be avoided wherever possible.
For a short script, it doesn't matter too much.
Regarding return values - Perl returns the result of the last expression by default. (See: return)
(In the absence of an explicit return, a subroutine, eval, or do FILE automatically returns the value of the last expression evaluated.)
The reason Perl critic flags it is:
Require all subroutines to terminate explicitly with one of the following: return, carp, croak, die, exec, exit, goto, or throw.
Subroutines without explicit return statements at their ends can be confusing. It can be challenging to deduce what the return value will be.
Furthermore, if the programmer did not mean for there to be a significant return value, and omits a return statement, some of the subroutine's inner data can leak to the outside.
Perlcritic isn't always right though - if there's good reason for doing what you're doing, then turn it off. Just as long as you've thought about it and are aware of the risks an consequences.
Personally I think it's better style to explicitly return something, even if it is just return;.
Anyway, redrafting your code in a (crude) OO fashion:
#!/usr/bin/perl
use strict;
use warnings;
package MyArray;
my $default_array = [ 1,4,2,6,1,8,5,5,2 ];
sub new {
my ( $class ) = #_;
my $self = {};
$self -> {myarray} = $default_array;
bless ( $self, $class );
return $self;
}
sub get_array {
my ( $self ) = #_;
return ( $self -> {myarray} );
}
sub sort_array{
my ( $self ) = #_;
#{ $self -> {myarray} } = sort ( #{ $self -> {myarray} } );
for ( #{ $self -> {myarray} } ) {
print $_,"\n";
}
return 1;
}
sub push_array{
my ( $self ) = #_;
for ( 1 .. 9 ){
push #{$self -> {myarray}}, $_;
}
return 1;
}
sub pop_array {
my ( $self ) = #_;
for ( 1 .. 3 ){
pop #{$self -> {myarray}};
}
return 1;
}
1;
And then call it with:
#!/usr/bin/perl
use strict;
use warnings;
use MyArray;
my $array = MyArray -> new();
print "Started:\n";
print join (",", #{ $array -> get_array()} ),"\n";
print "Reshuffling:\n";
$array -> sort_array();
$array -> push_array();
$array -> pop_array();
print "Finished:\n";
print join (",", #{ $array -> get_array()} ),"\n";
It can probably be tidied up a bit, but hopefully this illustrates - within your object, you've got an internal 'array' which you then 'do stuff with' by making your calls.
Result is much the same (I think I've replicated the logic, but don't trust that entirely!) but you have a self contained thing going on.
If the function doesn't mean to return anything, there's no need to use return!
No, you don't use any aspects of OO (encapsulation, polymorphism, etc). What you are doing is called procedural programming. Nothing wrong with that. All my work for nuclear power plants was written in that style.
The problem is using #main::array, and I'm not talking about the fact that you could abbreviate that to #::array. Fully-qualified names escape strict checks, so they are far, far more error-prone. Mistyped var name won't get caught as easily, and it's easy to have two pieces of code collide by using the same variable name.
If you're just using one file, you can use my #array, but I presume you are using #main::array because you are accessing it from multiple files/modules. I suggest placing our #array in a module, and exporting it.
package MyData;
use Exporter qw( import );
our #EXPORT = qw( #array );
our #array;
1;
Having some kind of hint in the variable name (such as a prefix or suffix) indicating this is a variable used across many modules would be nice.
By the way, if you wanted do create an object, it would look like
package MyArray;
sub new {
my $class = shift;
my $self = bless({}, $class);
$self->{array} = [ #_ ];
return $self;
}
sub get_elements {
my ($self) = #_;
return #{ $self->{array} };
}
sub sort {
my ($self) = #_;
#{ $self->{array} } = sort #{ $self->{array} };
}
sub push {
my $self = shift;
push #{ $self->{array} }, #_;
}
sub pop {
my ($self, $n) = #_;
return splice(#{ $self->{array} }, 0, $n//1);
}
my $array = MyArray->new(1,4,2,6,1,8,5,5,2);
$array->sort;
print("$_\n") for $array->get_elements();
$array->push_array(1..9);
$array->pop_array(3);
I improved your interface a bit. (Sorting shouldn't print. Would be nice to push different things and to pop other than three elements.)

How can I call methods on a tied variable?

I've just started to learn about tie. I have a class named Link which I would like to do the following thing:
if fetched, return the link's address
if stored, store the new address
be able to call methods on it
So far, my code is :
package Link;
sub FETCH {
my $this = shift;
return $this->{"site"};
}
sub STORE {
my ($self,$site) = #_;
$self->{"site"} = $site;
}
sub print_method {
my $self = shift;
print $self->{"site"};
}
sub TIESCALAR {
my $class = shift;
my $link = shift;
my $this = {};
bless($this,$class);
$this->{"site"} = $link;
return $this;
}
1;
And the code I'm using to check the functionality is:
use Link;
tie my $var,"Link","http://somesite.com";
$var->print_method;
When ran, the script will terminate with the following error:
Can't call method "print_method" without a package or object reference at tietest.pl line 4..
If I understand its message correctly, $var->print_method resolves to some string upon which the method print_method is called. How could I benefit from tie, but also use the variable as an object?
EDIT: after experimenting a bit,I found out that if I return $self on fetch , I can call the methods , however , fetch won't return the address .
EDIT 2:the perl monks supplied me the solution : tied . tied will return a reference to the object VARIABLE .
By combining tied with my methods , I can accomplish everything I wanted .
Tie is the wrong tool for this job. You use ties when you want the same interface as normal data types but want to customize how the operations do their work. Since you want to access and store a string just like a scalar already does, tie doesn't do anything for you.
It looks like you want the URI module, or a subclass of it, and perhaps some overloading.
If you really need to do this, you need to use the right variable. The tie hooks up the variable you specify to the class you specify, but it's still a normal scalar (and not a reference). You have to use the object it returns if you want to call methods:
my $secret_object = tie my($normal_scalar), 'Tie::Class', #args;
$secret_object->print_method;
You can also get the secret object if you only have the tied scalar:
my $secret_object = tied $normal_scalar;
I have an entire chapter on tie in Mastering Perl.
I suggest making a normal Perl object and then overloading stringification. You lose the ability to store a value through assignment, but retain the ability to get the value out by printing the object. Once you start wanting to call methods directly, an object is probably what you want.
package Link;
use strict;
use Carp;
use overload
(
'""' => sub { shift->site },
fallback => 1,
);
sub new
{
my $class = shift;
my $self = bless {}, $class;
if(#_)
{
if(#_ == 1)
{
$self->{'site'} = shift;
}
else { croak "$class->new() expects a single URL argument" }
}
return $self;
}
sub site
{
my $self = shift;
$self->{'site'} = shift if(#_);
return $self->{'site'};
}
sub print_method
{
my $self = shift;
print $self->site, "\n";
}
1;
Example usage:
use Link;
my $link = Link->new('http://somesite.com');
print $link, "\n"; # http://somesite.com
$link->print_method; # http://somesite.com
If you really, really want assignment to work too, you can combine a normal object with overloaded stringification (Link, above) with tie:
package LinkTie;
use strict;
use Link;
sub FETCH
{
my $this = shift;
return $this->{'link'};
}
sub STORE
{
my($self, $site) = #_;
$self->{'link'}->site($site);
return $site;
}
# XXX: You could generalize this delegation with Class::Delegation or similar
sub print_method
{
my $self = shift;
print $self->{'link'}->print_method;
}
sub TIESCALAR
{
my $class = shift;
my $self = bless {}, $class;
$self->{'link'} = Link->new(#_);
return $self;
}
1;
Example usage:
tie my $link,'LinkTie','http://somesite.com';
print $link, "\n"; # http://somesite.com
$link->print_method; # http://somesite.com
$link = 'http://othersite.com';
print $link, "\n"; # http://othersite.com
$link->print_method; # http://othersite.com
This is all quite hideous and a long way to go just to get the dubious ability to assign to something that you can also call methods on and also print as-is. A standard URI object with stringification is probably a better bet.