Controlling order of object destruction - perl

My web application uses a Log module to record various events. The log object is initialized by being passed a CGI::Session object containing various information about the user. This information gets copied to the log object's data fields. Owing to the volume of activity on the site and the fact that a single visit to the site can result in multiple loggable events, the log module currently caches all events in memory, then actually writes them to the log file in the DESTROY function. However, this results in the session parameters being frozen at the time the log object is initialized, which occurs at the beginning of the request.
Recently, some new parameters were required to be logged that a) would be stored in the session object, and b) needed to be logged with their final, instead of initial, value (and could potentially change during execution). My initial idea was to instead store a reference to the session object in the log object, but as the DESTROY function is typically called during global destruction, I have no guarantee that the session object will still be defined when the log is destroyed. Is there a way to guarantee that the CGI::Session object won't be destroyed before my log, hopefully without having to add an explicit destructor call to each page of the app?
# old
package Log;

use Data::Dumper;

sub new
{
    my $class   = shift;
    my $session = shift;    # CGI::Session
    my $self = {
        session => { customer_id => $session->param('customer_id') },
        events  => [],
    };
    return bless $self, $class;
}

sub log_event
{
    my $self  = shift;
    my $event = shift;
    push @{ $self->{'events'} },
        { event_type => $event->{'type'}, timestamp => $event->{'timestamp'} };
}

sub DESTROY
{
    my $self = shift;
    if (scalar @{ $self->{'events'} })
    {
        open my $log_fh, '>>', '/tmp/log' or return;
        print $log_fh Dumper({ session => $self->{'session'}, events => $self->{'events'} });
        close $log_fh;
    }
}
# new
package Log;

use Data::Dumper;

sub new
{
    my $class   = shift;
    my $session = shift;    # CGI::Session
    my $self = { session => $session, events => [] };
    return bless $self, $class;
}

sub log_event
{
    my $self  = shift;
    my $event = shift;
    push @{ $self->{'events'} },
        { event_type => $event->{'type'}, timestamp => $event->{'timestamp'} };
}

sub DESTROY
{
    my $self = shift;
    if (scalar @{ $self->{'events'} })
    {
        open my $log_fh, '>>', '/tmp/log' or return;
        # The session parameters are read here, at destruction time,
        # so they reflect their final values.
        print $log_fh Dumper({
            session => { customer_id => $self->{'session'}->param('customer_id') },
            events  => $self->{'events'},
        });
        close $log_fh;
    }
}

Perl uses reference-counting to govern object destruction[1]. This means that under normal circumstances, if object A references object B, object A will be destroyed before object B.
This fails if the object survives until global destruction. There are two circumstances where this happens:
Reference cycles. If two objects directly or indirectly reference each other, the order in which the objects involved in the cycle are destroyed is unpredictable.
Global variables. The order in which objects referenced by package variables (and therefore objects directly or indirectly referenced by them) are destroyed is unpredictable (though Perl does try to do the right thing).
So if the log holds a reference to a session object (as it appears you are doing), the log will be destroyed first (within the limits I mentioned above).
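This destruction order is easy to observe with two toy classes whose DESTROY methods record when they run; a minimal sketch (hypothetical Session and Log stand-ins, not CGI::Session):

```perl
#!/usr/bin/perl
use strict;
use warnings;

our @order;    # records destruction order for the demonstration

package Session;
sub new     { return bless {}, shift }
sub DESTROY { push @main::order, 'session' }

package Log;
sub new {
    my ($class, $session) = @_;
    # Holding the session in a field keeps its refcount above zero
    # for as long as the log object exists.
    return bless { session => $session }, $class;
}
sub DESTROY { push @main::order, 'log' }

package main;
{
    my $session = Session->new;
    my $log     = Log->new($session);
}
print "@order\n";    # "log session": the log is destroyed first
```

Even if the `$session` lexical is released first, the object it points to survives until the log's field releases it, so the log is always destroyed before the session.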
If the objects are in package variables instead of lexical (my) variables, and if you don't want to change that, you could use the following in the main program:
use Sub::ScopeFinalizer qw( scope_finalizer );

# Here or wherever.
our $log;
our $session;

{
    # The lexicals within these curlies will get destroyed before
    # global destruction. This will lead to the code ref provided to
    # scope_finalizer getting called before global destruction.
    my $guard = scope_finalizer {
        # The order doesn't matter because the log object
        # holds a reference to the session object.
        $log     = undef;
        $session = undef;
    };

    # ... Main program here ...
}
As long as the log holds a reference to the session, this will ensure that the objects are destroyed in the correct order. Even if the program dies or exits.
You can tell if objects of a class are surviving until global destruction by adding the following code to the program and examining the order of the output:
DESTROY { warn($_[0]->id . " destroyed.\n"); } # In the class
END { warn("Global destruction."); }
In this post, I'm using the word "object" loosely. I don't just mean instances of classes, but also scalars, arrays, hashes, subs and other values and variables.
Similarly, "referencing" refers not just to a Perl reference, but to other forms of referencing. For example, arrays and hashes reference their elements, subs reference the variables they capture, and so on.
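A closure capturing a lexical is one such implicit reference: the captured variable stays alive as long as the sub does. A small sketch:

```perl
use strict;
use warnings;

my $counter_sub;
{
    my $count = 0;
    # The anonymous sub captures $count, so the variable survives
    # the end of this block even though it goes out of lexical scope.
    $counter_sub = sub { return ++$count };
}
print $counter_sub->(), "\n";    # 1
print $counter_sub->(), "\n";    # 2
```

The `$count` scalar is only destroyed once `$counter_sub` itself is released.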

Related

How can I ensure my method calls use the right method name on the right object?

I am working on a program which makes multiple attempts at processing, storing to a new log each time it tries (several other steps before/after).
use strict;

for (my $i = 0; $i < 3; $i++)
{
    my $loggerObject = new MyLoggerObject(tag => $i);
    #.. do a bunch of other things ..
    Process($loggerObject, $i);
    #.. do a bunch of other things ..
}

sub Process
{
    my ($logger, $thingToLog) = @_;
    sub Logger { $logger->Print($_[0]); }
    Logger("Processing $thingToLog");
}

package MyLoggerObject;

sub new
{
    my $package = shift;
    my %hash = (@_);
    my $self = \%hash;
    return bless $self, $package;
}

sub Print
{
    my $self = shift;
    my $value = shift;
    print "Entering into log ".$self->{tag}.": $value\n";
}

1;
To avoid having to do a bunch of $self->{logger}->Print() and risk misspelling Print, I tried to collapse them into the local subroutine as seen above. However, when I run this I get:
perl PerlLocalMethod.pl
Entering into log 0: Processing 0
Entering into log 0: Processing 1
Entering into log 0: Processing 2
instead of:
perl PerlLocalMethod.pl
Entering into log 0: Processing 0
Entering into log 1: Processing 1
Entering into log 1: Processing 2
I am presuming the problem is that the Logger method is 'compiled' the first time I call the Process method with the object reference I used on the first call but not afterwards.
If I did $logger->Print(), misspelled Print, and hit a code path I can't reliably test (this is for an embedded system and I can't force every error condition), it would kill the script with an undefined-method error. I suppose I could use AUTOLOAD within the logger and log any bad method calls, but I'd like to know any other recommendations on how to make sure my Logger() calls are reliable and use the correct object.
In Perl, subroutines are compiled during compile time. Embedding a named subroutine declaration into a subroutine doesn't do what one would expect and isn't recommended.
If you are afraid of typos, write tests. See Test::More on how to do it. Use mocking if you can't instantiate system specific classes on a dev machine. Or use shorter names, like P.
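A minimal sketch of such a test, using a stripped-down stand-in for the MyLoggerObject class from the question (can_ok catches a misspelled method name at test time instead of on a hard-to-reach code path):

```perl
use strict;
use warnings;
use Test::More tests => 2;

package MyLoggerObject;
sub new   { my $class = shift; return bless { @_ }, $class }
sub Print { my ($self, $value) = @_; return "log $self->{tag}: $value" }

package main;

my $logger = MyLoggerObject->new(tag => 0);

# can_ok fails loudly if the method name is misspelled, instead of
# dying at runtime deep inside the embedded system.
can_ok($logger, 'Print');
is($logger->Print('hello'), 'log 0: hello', 'Print formats the message');
```

Here `Print` returns the formatted line rather than printing it, purely so the test can check the result; that return value is an assumption of this sketch, not part of the original class.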
You can declare the Logger in the highest scope as a closure over $logger that you would need to declare there, too:
my $logger;
sub Logger { $logger->Print($_[0]) }
But it's confusing and can lead to code harder to maintain if there are many variables and subroutines like that.
If you had used use warnings in your code you would have seen the message:
Variable "$logger" will not stay shared at logger line 24.
Which would have alerted you to the problem (moral: always use strict and use warnings).
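Another fix worth mentioning: build Logger as an anonymous sub inside Process, so a fresh code ref (closing over the current $logger) is created on every call. A sketch using a cut-down version of the question's class:

```perl
use strict;
use warnings;

package MyLoggerObject;
sub new   { my $class = shift; return bless { @_ }, $class }
sub Print {
    my ($self, $value) = @_;
    my $line = "Entering into log $self->{tag}: $value";
    print "$line\n";
    return $line;
}

package main;

sub Process {
    my ($logger, $thingToLog) = @_;
    # A code ref is created fresh on every call to Process, so it
    # always closes over *this* call's $logger -- no "will not stay
    # shared" warning, no stale object.
    my $Logger = sub { return $logger->Print($_[0]) };
    return $Logger->("Processing $thingToLog");
}

my @lines;
for my $i (0 .. 2) {
    my $loggerObject = MyLoggerObject->new(tag => $i);
    push @lines, Process($loggerObject, $i);
}
# @lines now holds "Entering into log 0: Processing 0",
# "Entering into log 1: Processing 1", "Entering into log 2: Processing 2"
```

Unlike a nested named sub, the anonymous sub is a run-time value, so each Process call binds its own $logger.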
I'm not entirely sure why you need so many levels of subroutines in order to do your logging, but it seems to me that all of your subroutines which take the $logger object as their first parameter should probably be methods on the MyLoggerObject (which should probably be called MyLoggerClass, as it's a class, not an object).
If you do that, then you end up with this code (which seems to do what you want):
use strict;
use warnings;

for my $i (0 .. 2) {
    my $loggerObject = MyLoggerClass->new(tag => $i);
    #.. do a bunch of other things ..
    $loggerObject->Process($i);
    #.. do a bunch of other things ..
}

package MyLoggerClass;

sub new {
    my $package = shift;
    my $self = { @_ };
    return bless $self, $package;
}

sub Process {
    my $self = shift;
    my ($thingToLog) = @_;
    $self->Logger("Processing $thingToLog");
}

sub Logger {
    my $self = shift;
    $self->Print($_[0]);
}

sub Print {
    my $self = shift;
    my ($value) = @_;
    print "Entering into log $self->{tag}: $value\n";
}

1;
Oh, and notice that I moved away from the indirect object notation call (new Class(...)) to the slightly safer Class->new(...). The style you used will work in the vast majority of cases, but when it doesn't you'll waste days trying to fix the problem.
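To illustrate the difference: indirect notation makes the parser guess what you meant, while the arrow form is unambiguous. A small sketch with a hypothetical Counter class:

```perl
use strict;
use warnings;

package Counter;
sub new { my ($class, %args) = @_; return bless { %args }, $class }
sub id  { return $_[0]{id} }

package main;

# Indirect notation: Perl must *guess* that "new Counter(...)" means
# Counter->new(...). If a subroutine named Counter were in scope,
# the same line would instead parse as new(Counter(...)).
my $c1 = new Counter(id => 1);    # works here, but fragile

# Arrow notation is unambiguous regardless of what else is in scope.
my $c2 = Counter->new(id => 2);

print $c1->id, " ", $c2->id, "\n";    # 1 2
```

Both calls happen to work in this file, which is exactly why the indirect form is dangerous: it keeps working right up until an unrelated sub with a clashing name appears.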
As already explained above, using lexically scoped variables inside a nested named subroutine like this does not work.
If you have to "duct-tape" the problem, you could use global variables (our instead of my).
sub Process
{
    our ($logger, $thingToLog) = @_;
    sub Logger { $logger->Print($_[0]); }
    Logger("Processing $thingToLog");
}
But be aware that $logger and $thingToLog are now global variables accessible outside this function.

Completely destroy all traces of an object in Perl

I am writing a Perl script that gets data from a changing outside source. As data is added or removed from this outside source I would like the data structures in my Perl script to mirror the changes. I was able to do this by creating objects that store the data in a similar fashion and synchronizing the data in those objects whenever I try to access the data in my Perl script.
This works great, and gets any new data that is added to this external source; however a problem arises when any data is removed. If any data is removed I want all existing references to it in my Perl script to be destroyed as well so the user can no longer attempt to access the data without raising an error.
The only way I could think of was to undefine the internal reference to the data as soon as it is determined that the data no longer exists. It appears, however, that when a reference is undefined, the data stored at the location the reference points to is not removed; only the variable holding the reference is cleared.
Here is a test script demonstrating my problem:
#!/usr/bin/perl
# Always use these
use strict;
use warnings;
####################################################################################################
# Package to create an object
package Object;
use Moose;
# Define attributes
has 'name' => (is => 'ro', isa => 'Str', required => 1);
####################################################################################################
# Package to test with
package test;
# Create an object
my $object1 = Object->new('name' => 'Test Object');
print 'OBJECT1 NAME: '.$object1->name()."\n";
# Create another reference to the object
my $object2 = $object1;
# Print the name
print 'OBJECT2 NAME: '.$object2->name()."\n";
# Print both references
print "\n";
print "OBJ : $object1\n";
print "OBJ2 : $object2\n";
print "\n";
# Undefine the reference to object2
undef $object2;
# Try to print both names
print 'OBJECT1 NAME: '.$object1->name()."\n";
print 'OBJECT2 NAME: '.$object2->name()."\n";
Output:
How can I completely destroy all traces of an object so that any attempt to access its data will result in an error?
EDIT:
Here is a different example that may explain better what I am trying to achieve.
Say I have a file object:
my $file = File->new();
Now I want to get the text in that file
my $text = $file->text();
Now I get it again (for some unknown reason)
my $text2 = $file->text();
I want to be able to modify $text and have it directly affect the contents of $text2, as well as change the actual text attribute of the file.
I'm basically trying to tie the variables together so if one changes they all change. Also if one is deleted they would all be deleted.
This would also mean that if the text attribute is changed, $text and $text2 would change with it to reflect the new value.
Could this be done using an alias of some sort?
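For the aliasing asked about here, the usual Perl tool is a reference: if the accessor hands out a reference to the attribute's scalar, every holder reads and writes the same storage. A minimal sketch with a hypothetical SharedFile class (not the asker's File):

```perl
use strict;
use warnings;

package SharedFile;    # hypothetical stand-in for the File class
sub new      { my $class = shift; return bless { text => 'original' }, $class }
sub text_ref { return \$_[0]{text} }    # hand out a ref to the one scalar

package main;

my $file  = SharedFile->new;
my $text  = $file->text_ref;
my $text2 = $file->text_ref;

$$text = 'changed';                  # write through one alias...
print $$text2, "\n";                 # ...and the other sees it: "changed"
print ${ $file->text_ref }, "\n";    # the attribute itself changed too
```

The cost is that callers must dereference with `$$text`; true variable aliasing without the sigil change would need something like a tied scalar.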
Perl uses reference counting to "retire" data.
What your program is doing is as follows:
Create an object and assign a reference to $object1
Copy that reference to $object2 and increment the object's reference count
Change the value of $object2 and decrement the object's reference count.
In the end you still have a hold of the object via $object1, and its reference count will not drop below 1 while you keep hold of it. Technically, as your program shuts down, the $object1 variable will be destroyed as it goes out of scope, at which point the object's reference count drops to 0 and perl will look for and try to call its DESTROY method.
If you truly want to "see" an item destroyed, you may want to look into defining a DESTROY method that prints out a message upon object destruction. That way you can "let go" of your last reference and still see its destruction.
You can't free an object that's still being used. Have the object itself keep track of whether it's still valid or not.
use Carp qw(croak);

sub new {
    my ($class, $data) = @_;
    my $self = bless({}, $class);
    $self->{valid} = 1;
    $self->{data}  = $data;
    return $self;
}

sub delete {
    my $self = shift;
    undef(%$self);
}

sub data {
    my $self = shift;
    croak("Invalid object") if !$self->{valid};
    $self->{data} = $_[0] if @_;
    return $self->{data};
}
Example:
my $o1 = Class->new('big_complex_data');
my $o2 = $o1;
say $o1->data();
say $o2->data();
$o1->delete();
say $o2->data();
Output:
big_complex_data
big_complex_data
Invalid object at a.pl line 39.

How to access object features in Perl from within the same package

I'm making a Perl module and I am still getting to grips with how Perl deals with objects.
This is the new sub that I wrote to create an object and I have no problem updating elements:
sub new {
    my $class = shift;
    my $self = {
        name => undef,
    };
    bless($self, $class);
    return $self;
}

sub get_name {
    my $self = shift;
    $self->{name} = 'Eve';
    return $self->{name};
}
I can use the object fine when I call the module and access it from another file, but I want to use the data in the object at other areas in the module code.
So I have no problem doing this:
my $new_object = new ProgramTest; # ProgramTest being the module/package
my $name = $new_object->get_name();
But I want to use the $self elements in a 'module-internal' method which is never accessed by an outside script. So I want to have something like this:
sub get_variables {
return (name); # I don't know how to get the name here
# (I plan to have other variables, too)
}
I am probably missing something obvious (I'm sure I'll kick myself when I see the solution), so any help appreciated!
I want this so that the rest of the module can use the variables (without changing) as there are conditions that rely on their values.
There's no such thing as internal/private methods in Perl objects. Common practice is to start any methods which should not be used publicly with an underscore, but this is not enforced in any way. Also have a look at Moose - it takes a lot of the hassle out of OO Perl.
With regards to your question, the below shows how one module method can call another, with both having access to the object data. Again, I would really recommend you use Moose!
sub publicSub {
    my ( $self ) = @_;
    return $self->_privateSub();
}

sub _privateSub {
    my ( $self ) = @_;
    return $self->{name};
}
I think you want class variables. They are global to a class, and all instances of the class (i.e. all the objects you created) can see them. "Global" in this case means that they are at the outermost lexical scope of the package, so all its subs can see them.
package ProgramTest;

my $everyone_can_see_this = 1;    # lexical scope, but 'global' to the package

sub new {
    my $class = shift;
    my $self = {
        name => undef,
    };
    bless($self, $class);
    return $self;
}

sub get_var {
    my $self = shift;
    return ++$everyone_can_see_this;
}

package Main;

use feature 'say';

my $o1 = ProgramTest->new;
my $o2 = ProgramTest->new;
say $o1->get_var;
say $o2->get_var;
say $o1->get_var;

__END__
2
3
4
But I don't see why you would want to do that. It doesn't make sense (unless you want an object-counter). Don't use it for config values, or you cannot really have objects for different purposes of the same class.
Maybe you want something else. If so, please try to rephrase your question.

How to create a row object that contains related objects in DBIx::Class?

When you create a Row object in DBIx::Class you can pass related objects as values, e.g.
my $author = $authors_rs->find(1);
my $book = $books_rs->create({ author => $author, title => 'title' });
However, if you later use the author accessor, the object is retrieved again from the database. Is it possible to create an object so that the associated object can be accessed without the additional query?
How about just copying what you want from $author into some plain old Perl variables?
If you want to copy a whole structure, the Clone module may help (I don't have experience with that module; I just found it online).
I'm not sure that I'm understanding you correctly, but if I am, perhaps you want to look into the prefetch functionality to automatically be ready for calling those other related row objects.
For example, in Galileo, when listing all pages (articles), I use this mechanism to get the related author objects for each page object.
OK, so if the point is to store one object in another, you might want to inject some extra data into the object.
UNTESTED:
## some initial checks (run these only once)

# check that the method name is available
die "Cannot use method 'extra_data'" if $book->can('extra_data');

# check that the reftype is a hash
require Scalar::Util;
die "Incorrect underlying type" unless Scalar::Util::reftype($book) eq 'HASH';

# check that the key is available
die "Key unavailable" if exists $book->{'my_extra_data'};

{
    no strict 'refs';
    # create a simple accessor for a hash stored in an object
    *{ ref($book) . '::extra_data' } = sub {
        my $self = shift;
        # return all extra data if called without args
        return $self->{my_extra_data} unless @_;
        my $key = shift;
        if (@_) {
            $self->{my_extra_data}{$key} = shift;
        }
        return $self->{my_extra_data}{$key};
    };
}
$book->extra_data( author => $author );
#then later
my $stored_author = $book->extra_data('author');

Moose traits for multidimensional data structures

Breaking out the handling of an internal variable from calls on the variable into calls on the object is easy using the native attribute trait handlers (Moose::Meta::Attribute::Native). However, how do you deal with multidimensional data structures? I can't think of any way to handle something like the below without making the stash an arrayref of My::Stash::Attribute objects, which in turn contain an arrayref of My::Stash::Subattribute objects, which contain an arrayref of My::Stash::Instance objects. That involves a lot of munging and coercing of the data at each level down the stack as I sort things out.
Yes, I could store the items as a flat array and then grep it on every read, but in a situation where reads are frequent and most calls are reads, grepping a large array on every read is a lot of processing compared to just indexing the items internally the way I need them.
Is there a MooseX extension that can handle this sort of thing via handlers creating methods, instead of just treating the read accessor as the hashref it is and modifying it in place? Or am I just best off forgetting about doing things like this via method call and just doing it as-is?
use strict;
use warnings;
use 5.010;

package My::Stash;

use Moose;

has '_stash' => (is => 'ro', isa => 'HashRef', default => sub { {} });

sub add_item {
    my $self = shift;
    my ($item) = @_;
    push @{$self->_stash->{$item->{property}}{$item->{sub}}}, $item;
}

sub get_items {
    my $self = shift;
    my ($property, $subproperty) = @_;
    return @{$self->_stash->{$property}{$subproperty}};
}

package main;

use Data::Printer;

my $stash = My::Stash->new();
for my $property (qw/foo bar baz/) {
    for my $subproperty (qw/fazz fuzz/) {
        for my $instance (1 .. 2) {
            $stash->add_item({ property => $property, sub => $subproperty, instance => $instance });
        }
    }
}
p($_) for $stash->get_items(qw/baz fuzz/);
These are very esoteric:
sub add_item {
    my $self = shift;
    my ($item) = @_;
    push @{$self->_stash->{$item->{property}}{$item->{sub}}}, $item;
}
So add_item takes a hashref $item and pushes it onto an array in the stash, indexed by the item's own property and sub keys.
sub get_items {
    my $self = shift;
    my ($property, $subproperty) = @_;
    return @{$self->_stash->{$property}{$subproperty}};
}
Conversely, get_items takes two arguments, a $property and a $subproperty, and retrieves the appropriate array of elements from the hash-of-hashes.
So here are the concerns in making this a MooseX extension:
There is no way in a non-magic hash to insist that only hashes are values -- this would be required for predictable behavior of the trait. As in your example, what would you expect if _stash->{$property} resolved to a scalar?
add_item has its depth hardcoded to property and sub.
Returning arrays is bad, as it requires all of the elements to be pushed onto the stack (return refs instead).
Now firstly, I don't see why a regular Moose Hash trait couldn't accept array refs for both the setter and getter.
->set( [qw/ key1 key2/], 'foo' )
->get( [qw/ key1 key2/] )
This would certainly make your job easier, if your destination wasn't an array:
sub add_item {
    my ( $self, $hash ) = @_;
    $self->set( [ $hash->{property}, $hash->{subproperty} ], $hash );
}
# get items works as is, just pass in an `ArrayRef`
# ie, `->get([$property, $subproperty])`
When it comes to having the destination be an array rather than a hash slot, I assume you'd just have to build that into a totally different helper in the trait, e.g. push_to_array_or_create([$property, $subproperty], $value). I'd still just retrieve it with the fictional get helper specified above. auto_deref-type functionality is a pretty bad idea.
In short, ask a core developer what they would think about extending set and get in this context to accept ArrayRefs as keys and act appropriately. I can't imagine there is a useful default for ArrayRef keys (I don't think regular stringification would be too useful).