Signal handler in XS module - perl

I have a very simple constructor and destructor for a C-based library. I need to catch signals such as TERM so that the destructor is still called when perl is forcibly terminated, e.g. with killall perl.
In fact, I need a local $SIG{TERM} to be installed for each object, written inside the XS constructor. Is it possible?
Thanks!

Not exactly. A signal is sent to and trapped by a process, not by individual objects.
Another approach might be to maintain a global list of all the objects that must be cleaned up, and install a single signal handler that will clean up each object on that list.
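A minimal pure-Perl sketch of that registry idea follows. The Resource class, its release method and the @released log are hypothetical names invented for illustration; in a real XS module, release would call the C library's cleanup function instead of pushing to a log. The registry holds weak references so that it does not keep objects alive by itself.

```perl
use strict;
use warnings;

package Resource {
    use Scalar::Util qw(weaken refaddr);

    our %registry;   # refaddr => weak reference to every live object
    our @released;   # demo-only log of what has been cleaned up

    sub new {
        my ($class, $name) = @_;
        my $self = bless { name => $name, open => 1 }, $class;
        $registry{ refaddr $self } = $self;
        weaken $registry{ refaddr $self };   # the registry must not own the object
        return $self;
    }

    # Idempotent cleanup; stands in for the C library's destructor call.
    sub release {
        my ($self) = @_;
        return unless $self->{open};
        $self->{open} = 0;
        push @released, $self->{name};
    }

    sub DESTROY {
        my ($self) = @_;
        $self->release;
        delete $registry{ refaddr $self };
    }

    # Clean up every object that is still alive.
    sub release_all {
        $_->release for grep { defined } values %registry;
    }
}

# One process-wide handler instead of one handler per object.
$SIG{TERM} = sub { Resource->release_all; exit 1 };

my $first  = Resource->new('first');
my $second = Resource->new('second');
```

Note that SIGKILL cannot be trapped at all, so this only helps with catchable signals such as TERM, INT or HUP.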

Related

How to disable perl's garbage collector?

I have a short-lived (fast to complete) perl script that nevertheless uses enough memory to trigger the garbage collector. In turn, collection takes longer than the rest of the processing.
Is there a way to disable garbage collection and let the OS do it when the script exits?
Edit
The GC pauses the script in the middle of it, not at the very end. KILLing it does not help.
Two approaches to exiting your program quickly without executing global destruction:
POSIX::_exit
This is identical to the C function _exit(). It exits the program immediately, which means, among other things, that buffered I/O is not flushed.
Kill your main process with SIGKILL.
kill 'KILL', $$;
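To see the effect of the first approach, here is a small sketch that runs the same code in two child processes, one ending with a normal exit and one with POSIX::_exit. The Noisy class and the run_child helper are made up for this demonstration; the destructor uses syswrite so that buffering does not blur the result.

```perl
use strict;
use warnings;
use POSIX ();

package Noisy {
    sub new     { my ($class, $fh) = @_; return bless { fh => $fh }, $class }
    sub DESTROY { my ($self) = @_; syswrite($self->{fh}, "destroyed\n") }
}

# Fork a child that keeps an object alive in a package variable,
# then report whatever its destructor wrote before the child ended.
sub run_child {
    my ($use_fast_exit) = @_;
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork() // die "fork failed: $!";
    if ($pid == 0) {
        close $reader;
        our $keep = Noisy->new($writer);   # lives until global destruction
        POSIX::_exit(0) if $use_fast_exit; # skips END blocks and destructors
        exit 0;                            # normal exit: global destruction runs
    }
    close $writer;
    my $output = do { local $/; <$reader> } // '';
    waitpid $pid, 0;
    return $output;
}

my $normal = run_child(0);   # destructor ran in the child
my $fast   = run_child(1);   # destructor was skipped
print "normal exit wrote: ", length($normal), " bytes\n";
print "POSIX::_exit wrote: ", length($fast), " bytes\n";
```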
As mentioned in comments, garbage collection in Perl is a refcounting mechanism, and is triggered by the value no longer being referenced by anything (whether a variable it is stored in which may go out of scope or be assigned a different value, an operation it's part of, a subroutine call stack it's being passed around in, or an actual reference).
So to prevent a value from being cleaned up until program exit, the easiest way is to do the opposite of the conventional memory-conscious wisdom: reference the value from the global stash.
our $foo = \$something_to_keep_alive;
Alternatively, you can (ab)use the fact that circular references will prevent refcounts from decrementing until global destruction.
$something->{self} = $something;
This will cause the value to reference itself, even if done through another layer, until one of the references in the cycle is weakened, removed, or global destruction is reached. And again, certainly something to be avoided in normal circumstances, as it is a by-design memory leak.
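The following sketch illustrates the refcounting behaviour described above. Tracked is a hypothetical class whose destructor records its name, so you can see which values are freed when they go out of scope and which are kept alive by a cycle.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

our @destroyed;   # names of objects whose DESTROY has run so far

package Tracked {
    sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
    sub DESTROY { my ($self) = @_; push @main::destroyed, $self->{name} }
}

{
    my $plain = Tracked->new('plain');
}   # refcount drops to zero here, so DESTROY runs immediately

{
    my $cyclic = Tracked->new('cyclic');
    $cyclic->{self} = $cyclic;   # the cycle keeps the refcount above zero
}   # nothing is freed here; this object lives until global destruction

{
    my $weakened = Tracked->new('weakened');
    $weakened->{self} = $weakened;
    weaken($weakened->{self});   # a weak reference does not count
}   # freed normally again

print "destroyed so far: @destroyed\n";
```

At this point only 'plain' and 'weakened' have been destroyed; 'cyclic' is cleaned up when the interpreter shuts down.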

What's up with CHECK and INIT blocks?

I have a circular dependency problem with Perl modules: say package X uses Y and wants to hold a static reference to a Y instance, and package Y uses X and wants to hold a static reference to an X instance.
Simply saying our $x_instance = new X will give Can't locate object method "new" in the module that was not loaded first.
I figured something like
our $x_instance;
INIT { $x_instance = new X }
would make sense, so I read everything about the specially named blocks.
Well, this works in a simple test I made, but in my real application it systematically shows Too late to run INIT block. The same happens with CHECK blocks.
The only explanation I found was from Perl Monks and I'm afraid I couldn't make much sense of it.
Does someone have an explanation of how Perl goes about executing CHECK and INIT blocks that goes beyond what is in perlmod, and would help me understand why my blocks are sometimes executed and sometimes not?
By the way, I just want to understand this—I am not specifically asking a solution to my original circular dependency problem, as I have a workaround that I am reasonably happy about:
our $x_instance;
sub get_x_instance {
    $x_instance //= new X;
    return $x_instance;
}
INIT blocks are executed immediately before the run time phase is started in the order the compiler encountered them during the compilation phase.
If you use require (or do) at run time to compile a Perl file that includes an INIT block, then the block won't be executed, because the point at which INIT blocks run has already passed.
It is rare that there is a real reason to use require in preference to use.
Despite your confidence, there must be a place where you are attempting to load a module at run time that contains an INIT block. I suggest you install and use Carp::Always so that the Too late to run INIT block message is accompanied by a stack backtrace that will help you find the erroneous call.
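A small sketch that reproduces the warning without needing a second file: a string eval at run time compiles code containing an INIT block, which is the same situation as a require or do executed after the program has started running. The variable names are arbitrary.

```perl
use strict;
use warnings;

our @log;        # which INIT blocks actually ran
my  @warnings;   # warnings captured during the late compilation

# Compiled together with the main program, so it is queued
# and runs just before the run phase begins.
INIT { push @log, 'INIT seen at compile time' }

{
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    # The run phase has already started. Compiling an INIT block now
    # (as a run-time require or do would) is too late: perl warns
    # and never executes the block.
    eval q{ INIT { push @main::log, 'INIT seen at run time' } 1 } or die $@;
}

print "ran: $_\n" for @log;
print "warned: $_" for @warnings;
```

Wrapping the late load in a BEGIN block, or replacing the require with use, moves the compilation back into the compile phase so the INIT block is queued in time.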

What is the architecture behind Scratch programming blocks?

I need to build a mini version of the programming blocks that are used in Scratch, and later in Snap! and OpenBlocks.
The code in all of them is big and hard to follow, especially in Scratch, which is written in some kind of subset of Smalltalk, which I don't know.
Where can I find the algorithm they all use to parse the blocks and transform them into a set of instructions that act on something, such as animations or games as in Scratch?
I am really interested in the algorithm or architecture behind the concept of programming blocks.
This is going to be just a really general explanation, and it's up to you to work out specifics.
Defining a block
There is a Block class that all blocks inherit from. They get initialized with their label (name), shape, and a reference to the method. When they are run/called, the associated method is passed the current context (sprite) and the arguments.
Exact implementations differ among versions. For example, in Scratch 1.x, methods took arguments corresponding to the block's arguments, and the context (this or self) is the sprite. In 2.0, they are passed a single argument containing all of the block's arguments and context. Snap! seems to follow the 1.x method.
Stack (command) blocks do not return anything; reporter blocks do.
Interpreting
The interpreter works somewhat like this. Each block contains a reference to the next one, and any subroutines (reporter blocks in arguments; command blocks in a C-slot).
First, all arguments are resolved. Reporters are called, and their return value stored. This is done recursively for lots of Reporter blocks inside each other.
Then, the command itself is executed. Ideally this is a simple command (e.g. move). The method is called, the Stage is updated.
Continue with the next block.
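The block definition and the interpreter loop described above can be sketched roughly as follows. This is an illustration in Perl, not code from Scratch or Snap!; the Block class, its field names and the evaluate and run_script helpers are all assumptions made for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

package Block {
    sub new {
        my ($class, %args) = @_;
        return bless {
            label  => $args{label},        # name shown on the block
            shape  => $args{shape},        # 'command' or 'reporter'
            method => $args{method},       # code ref that implements the block
            args   => $args{args} // [],   # literals or nested reporter blocks
            next   => undef,               # block stacked underneath this one
        }, $class;
    }

    # Attach the following block and return it, so calls can be chained.
    sub then { my ($self, $next) = @_; $self->{next} = $next; return $next }
}

# Resolve all arguments first (recursively for nested reporters),
# then call the block's method with the sprite as context.
sub evaluate {
    my ($sprite, $block) = @_;
    my @values = map {
        blessed($_) && $_->isa('Block') ? evaluate($sprite, $_) : $_
    } @{ $block->{args} };
    return $block->{method}->($sprite, @values);
}

# Run a stack of command blocks from top to bottom.
sub run_script {
    my ($sprite, $block) = @_;
    while ($block) {
        evaluate($sprite, $block);
        $block = $block->{next};
    }
    return;
}

my $sprite = { x => 0 };
my $move   = sub { my ($s, $steps) = @_; $s->{x} += $steps; return };
my $add    = sub { my ($s, $left, $right) = @_; return $left + $right };

my $sum    = Block->new(label => 'add',  shape => 'reporter', method => $add,  args => [2, 3]);
my $first  = Block->new(label => 'move', shape => 'command',  method => $move, args => [$sum]);
my $second = Block->new(label => 'move', shape => 'command',  method => $move, args => [10]);
$first->then($second);

run_script($sprite, $first);
print "x is now $sprite->{x}\n";   # 5 from the first move, 10 from the second
```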
C blocks
C blocks have a slightly different procedure. These are the if <> style, and the repeat <> ones. In addition to their ordinary arguments, they reference their "miniscript" subroutine.
For a simple if/else C block, just execute the subroutine normally if applicable.
When dealing with loops though, you have to make sure to thread properly, and wait for other scripts.
Events
Keypress/click events can be dealt with easily enough. Just execute them on keypress/click.
Something like broadcasts can be done by executing the hat when the broadcast stack is run.
Other events you'll have to work out on your own.
Wait blocks
This, along with threading, is the most confusing part of the interpretation to me. Basically, you need to figure out when to continue with the script. Perhaps set a timer to execute after the time, but you still need to thread properly.
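One possible way to handle waiting and threading together is a cooperative scheduler: every script is a thread, each thread runs at most one block per tick, and a wait block simply tells the scheduler when the thread may continue. The sketch below assumes a hypothetical convention in which each step is a code ref returning the number of ticks to wait (0 for ordinary blocks); it is not how Scratch itself is implemented.

```perl
use strict;
use warnings;

# A thread is a list of steps plus a program counter and a wake-up tick.
sub make_thread {
    my ($name, @steps) = @_;
    return { name => $name, steps => [@steps], pc => 0, wake_at => 0 };
}

# Round-robin over all threads, one step per thread per tick,
# skipping threads that are still waiting. Returns "tick:name" entries.
sub run_threads {
    my (@threads) = @_;
    my @trace;
    my $tick = 0;
    while (grep { $_->{pc} < @{ $_->{steps} } } @threads) {
        for my $thread (@threads) {
            next if $thread->{pc} >= @{ $thread->{steps} };   # finished
            next if $thread->{wake_at} > $tick;               # still waiting
            my $step  = $thread->{steps}[ $thread->{pc}++ ];
            my $delay = $step->() // 0;
            push @trace, "$tick:$thread->{name}";
            $thread->{wake_at} = $tick + $delay;
        }
        $tick++;
    }
    return @trace;
}

my $ordinary = sub { return 0 };   # e.g. a move block
my $wait_two = sub { return 2 };   # a wait block: resume two ticks later

my @trace = run_threads(
    make_thread('A', $ordinary, $wait_two, $ordinary),
    make_thread('B', $ordinary, $ordinary, $ordinary),
);
print "@trace\n";
```

While thread A is waiting, thread B keeps running; A resumes two ticks after its wait block ran.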
I hope this helps!

Catching runtime errors in Perl and converting to exceptions

Perl currently implements $SIG{__DIE__} in such a way that it will catch any error that occurs, even inside eval blocks. This has a really useful property that you can halt the code at the exact point where the error occurs, collect a stack trace of the actual error, wrap this up in an object, and then call die manually with this object as the parameter.
This abuse of $SIG{__DIE__} is deprecated. Officially, you are supposed to replace $SIG{__DIE__} with *CORE::GLOBAL::die. However, these two are NOT remotely equivalent. *CORE::GLOBAL::die is NOT called when a runtime error occurs! All it does is replace explicit calls to die().
I am not interested in replacing die.
I am specifically interested in catching runtime errors.
I need to ensure that any runtime error, in any function, at any depth, in any module, causes Perl to pass control to me so that I can collect the stack trace and rethrow. This needs to work inside an eval block -- one or more enclosing eval blocks may want to catch the exception, but the runtime error could be in a function without an enclosing eval, inside any module, from anywhere.
$SIG{__DIE__} supports this perfectly—and has served me faithfully for a couple of years or more—but the Powers that Be™ warn that this fantastic facility may be snatched away at any time, and I don't want a nasty surprise one day down the line.
Ideally, for Perl itself, they could create a new signal $SIG{__RTMERR__} for this purpose (switching signal is easy enough, for me anyway, as it's only hooked in one place). Unfortunately, my persuasive powers wouldn't lead an alcoholic to crack open a bottle, so assuming this will not happen, how exactly is one supposed to achieve this aim of catching runtime errors cleanly?
(For example, another answer here recommends Carp::Always, which … also hooks DIE!)
Just do it. I've done it. Probably everyone who's aware of this hook has done it.
It's Perl; it's still compatible going back decades. I interpret "deprecated" here to mean "please don't use this if you don't need it, ew, gross". But you do need it, and seem to understand the implications, so imo go for it. I seriously doubt an irreplaceable language feature is going away any time soon.
And release your work on CPAN so the next dev doesn't need to reinvent this yet again. :)
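For reference, a minimal sketch of the pattern under discussion: a $SIG{__DIE__} hook that wraps plain string errors, including runtime errors raised by perl itself, in an object carrying a stack trace, and rethrows it so that an enclosing eval can catch it. TracedError is a hypothetical class for this example, and Carp::longmess is used to build the trace.

```perl
use strict;
use warnings;
use Carp ();

package TracedError {
    sub new     { my ($class, %args) = @_; return bless {%args}, $class }
    sub message { $_[0]{message} }
    sub trace   { $_[0]{trace} }
}

$SIG{__DIE__} = sub {
    my ($error) = @_;
    die $error if ref $error;   # already an exception object: pass it through
    # The hook is disabled while it runs, so this die does not recurse.
    die TracedError->new(
        message => $error,
        trace   => Carp::longmess('runtime error'),
    );
};

# No explicit die anywhere: the error comes from perl itself.
sub divide { my ($numerator, $denominator) = @_; return $numerator / $denominator }

my $caught;
eval { divide(1, 0); 1 } or $caught = $@;

print "class:   ", ref($caught), "\n";
print "message: ", $caught->message;
```

The hook also fires for errors inside eval blocks in third-party modules, so any code that inspects $@ as a string will see the object instead.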

Is the only way to disconnect WWW::Mechanize::Firefox from mozrepl destruction of the objects?

As the title says, I'm trying to write a Perl daemon which, being long-running, should be sane about resource usage.
None of the examples or documentation I've seen mention a way to disconnect a session.
The best documentation on the topic I can find is in WWW::Mechanize::Firefox::Troubleshooting, where it's suggested that the object (and connection?) is kept alive until global destruction.
In short, I've seen no 'disconnect' function, and wonder if I'm missing something.
Disconnection seems to be handled via destructors. Perl uses special DESTROY methods for this. It is not advisable to call this method manually.
You need to decrease the refcount of your $mech object in order to get it destroyed automatically. This happens when the variable drops out of scope, in the Global Destruction Phase at the end of the process, or (in the case of objects), by assigning something different to your variable, e.g.
$mech = undef;
To completely deallocate any variable, you can also
undef $mech; # which btw is the answer provided in the FAQ you linked
The differences are subtle, and irrelevant in this case.
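The destruction rules can be observed with a stand-in class. FakeSession below is hypothetical and only mimics an object that disconnects in its destructor; it does not use WWW::Mechanize::Firefox or talk to mozrepl.

```perl
use strict;
use warnings;

package FakeSession {
    our @events;   # log of connects and disconnects, in order

    sub new {
        my ($class, $name) = @_;
        push @events, "connect $name";
        return bless { name => $name }, $class;
    }

    sub DESTROY {
        my ($self) = @_;
        push @events, "disconnect $self->{name}";
    }
}

{
    my $scoped = FakeSession->new('scoped');
}   # variable goes out of scope: DESTROY runs here

my $assigned = FakeSession->new('assigned');
$assigned = undef;       # overwriting the only reference destroys the object

my $undefed = FakeSession->new('undefed');
undef $undefed;          # same outcome for the object

my $shared = FakeSession->new('shared');
my $alias  = $shared;    # a second reference to the same object
undef $shared;           # no disconnect yet: $alias still refers to it
push @FakeSession::events, 'still shared';
undef $alias;            # last reference gone: now it disconnects

print "$_\n" for @FakeSession::events;
```

The 'shared' case is the one to watch for in a daemon: the session stays open for as long as any other variable, closure or data structure still holds a reference to the object.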