Perl script not updating when disk code changes are made

I have a Perl package called mypackage.pm (that I have on disk).
I have a script called test.pl.
Inside my test.pl I have the following statement:
use mypackage;
Now why, when I make changes inside mypackage, are those changes NOT reflected when running my test.pl script?

When you start a Perl program, the perl compiler reads all the modules referenced in the program from disk, compiles them, and stores the resulting opcodes in memory. It also remembers which files it has already read (in %INC) so it does not read them again.
There is a difference in when those files are read, but it is likely not significant here. If you use a module, it is loaded at compile time, i.e. when the program starts. If you require a module, that can happen inside conditionals, and the file is read when that code is executed at run time. That might be while the program starts up, or much later, or even never. Perl then switches back to compile time for that file to compile it, and then returns to run time.
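To see this caching in action, here is a small sketch that dumps %INC (File::Spec is just an arbitrary core module for illustration):

```perl
use strict;
use warnings;
use File::Spec;                       # any module will do

# %INC maps module "file names" to the paths they were loaded from
print "$_ => $INC{$_}\n" for sort keys %INC;

# A second require of an already-loaded module is a no-op:
# Perl finds File/Spec.pm in %INC and does not touch the disk again
require File::Spec;
```

This is why editing mypackage.pm has no effect on a running program: the compiled opcodes are already in memory and %INC prevents a re-read.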
If you want to re-read a module that you use, you will typically have to restart your program.
Having said that, there are some black magic implementations that allow you to re-load a module that has changed on disk. Unless you are building a daemon with insane start-up time and high throughput there is probably no need to use that.
One of these modules is Module::Reload. It's been around for a while and has recently seen some changes. Its See Also section points to further implementations, namely Module::Reload::Selective and the again pragma.
I have used neither of these and can't say if they work or how.


How can I have one perl script call another perl script and get the return results?
I have perl Script B, which does a lot of database work, prints out nothing, and simply exits with a 0 or a 3.
So I would like perl Script A to call Script B and get its results. But when I call:
my $result = system("perl importOrig.pl filename=$filename");
or
my $result = system("/usr/bin/perl /var/www/cgi-bin/importOrig.pl filename=$filename");
I get back a -1, and Script B is never called.
I have debugged Script B, and when called manually there are no glitches.
So obviously I am making an error in my call above, and not sure what it is.
There are many things to consider.
Zeroth, there's the perlipc docs for InterProcess Communication. What's the value in the error variable $!?
First, use $^X, which is the path to the perl you are executing. Since subprocesses inherit your environment, you want to use the same perl so it doesn't confuse itself with PERL5LIB and so on.
system("$^X /var/www/cgi-bin/importOrig.pl filename=$filename")
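Also, decode what system actually returned: -1 means the child could not be started at all, and anything else needs unpacking from $?. A sketch (the script path and filename are assumptions taken from the question):

```perl
use strict;
use warnings;

my $filename = 'orders.csv';          # assumption, for illustration

# List form bypasses the shell; $^X is the perl currently running
my $rc = system($^X, '/var/www/cgi-bin/importOrig.pl', "filename=$filename");

if ($rc == -1) {
    # the command could not be started: bad path, permissions, ...
    warn "failed to execute: $!\n";
}
elsif ($? & 127) {
    warn sprintf "child died with signal %d\n", $? & 127;
}
else {
    my $exit = $? >> 8;               # Script B's exit code (0 or 3)
    print "child exited with $exit\n";
}
```

With a -1 return, $! is the place to look: it will name the underlying error (no such file, permission denied, and so on).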
Second, CGI programs tend to expect particular environment variables to be set, such as REQUEST_METHOD. Calling them as normal command-line programs often leaves out those things. Try running the program from the command line to see how it complains. Check that it gets the environment it wants. You might also check the permissions of the program to see if you (or whatever user runs the calling program) are allowed to read it (or its directory, etc). You say there are no glitches, so maybe that's not your particular problem. But, do the two environments match in all the ways they should?
Third, consider making the second program a modulino. You could run it normally as a script from the command line, but you could also load it as a Perl library and use its features directly. This obviates all the IPC stuff. You could even fork so that stuff runs concurrently.
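A minimal sketch of the modulino idea, assuming importOrig.pl is rewritten along these lines (the package name and return values are assumptions matching the question's exit codes):

```perl
package ImportOrig;    # hypothetical modulino version of importOrig.pl

use strict;
use warnings;

sub run {
    my ($class, %args) = @_;
    # ... the database work goes here, using $args{filename} ...
    return 0;          # or 3 on failure, matching the old exit codes
}

# No caller() means we were run as a script, so act like one.
# When loaded as a library with require, this line does nothing.
__PACKAGE__->run(@ARGV) unless caller;

1;
```

Script A could then do `require './importOrig.pl'; my $status = ImportOrig->run(filename => $filename);` instead of spawning a whole new process.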

Compiling my own modules?

I developed my own module (package) for example MyUtils.pm.
It is a file located in the same folder as the main.cgi that uses it. I then load the module with use MyUtils;
I think that is a bit slow, and I suppose there's a better way.
Is it possible to "compile the module" and include it in perl core or something like that?
If yes, I think it will load and run "faster".
Don't worry too much about it. The overhead of loading a Perl module is quite low.
If your application is getting enough traffic that this overhead does become significant, it's time to stop using CGI — the overhead of starting the Perl interpreter becomes a problem on its own. Consider switching your site to use something like FastCGI (with CGI::Fast or Plack::Handler::FCGI), or the mod_perl Apache module (possibly in conjunction with ModPerl::Registry to run CGI scripts directly, or with Plack::Handler::Apache2). Any of these will allow multiple consecutive requests to be handled by a single process, bypassing the module loading process entirely.
I think you are looking for B::Bytecode.
DESCRIPTION
Compiles a Perl script into a bytecode format that could be loaded later by the ByteLoader module and executed as a regular Perl script. This saves time for the optree parsing and compilation and space for the sourcecode in memory.

Cleanup huge Perl Codebase

I am currently working on a roughly 15 years old web application.
It contains mainly CGI perl scripts with HTML::Template templates.
It has over 12 000 files and roughly 260 MB of total code. I estimate that no more than 1500 perl scripts are needed and I want to get rid of all the unused code.
There are practically no tests written for the code.
My questions are:
Are you aware of any CPAN module that can help me get a list of only used and required modules?
What would be your approach if you'd want to get rid of all the extra code?
I was thinking of the following approaches:
try to override the use and require perl builtins with ones that output the loaded file name in a specific location
override the warnings and/or strict modules import function and output the file name in the specific location
study the Devel::Cover perl module and take the same approach and analyze the code when doing manual testing instead of automated tests
replace the perl executable with a custom one, which will log each name of file it reads (I don't know how to do that yet)
some creative use of lsof (?!?)
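The first approach above, overriding the builtins, can be sketched by installing a sub in the CORE::GLOBAL namespace before anything else is compiled. This logs to memory for demonstration; the real thing would append to a file. Only code compiled after the override is caught:

```perl
use strict;
use warnings;

our @LOADED;

BEGIN {
    # Every subsequent use/require in the program goes through this sub
    *CORE::GLOBAL::require = sub {
        my ($file) = @_;
        # version checks (require 5.010) come through here too;
        # skip anything that is not a plain file name string
        push @LOADED, $file unless ref $file;
        CORE::require $file;
    };
}

require Text::Wrap;    # intercepted and logged
print "$_\n" for @LOADED;
```

In a real deployment this BEGIN block would live in a tiny module loaded first by every CGI script (e.g. via PERL5OPT=-Mmytracer), appending to a shared log instead of an array.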
Devel::Modlist may give you what you need, but I have never used it.
The few times I have needed to do something like this I have opted for the more brute-force approach of inspecting %INC at the end of the program.
END {
    open my $log_fh, ...;
    print $log_fh "$_\n" for sort keys %INC;
}
As a first approximation, I would simply run
egrep -r '\<(use|require)\>' /path/to/source/*
Then spend a couple of days cleaning up the output from that. That will give you a list of all of the modules used or required.
You might also be able to play around with @INC to exclude certain library paths.
If you're trying to determine execution path, you might be able to run the code through the debugger with 'trace' (i.e. 't' in the debugger) turned on, then redirect the output to a text file for further analysis. I know that this is difficult when running CGI...
Assuming the relevant timestamps are turned on, you could check access times on the various script files - that should rule out any top-level script files that aren't being used.
Might be worth adding some instrumentation to CGI.pm to log the current script-name ($0) to see what's happening.
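A sketch of that instrumentation idea: each script appends its own name ($0) to a shared log at startup, e.g. from a small module they all already load. The log path here is an assumption:

```perl
use strict;
use warnings;

# Hypothetical shared log recording which top-level scripts actually run
my $log = '/tmp/scripts-used.log';
open my $fh, '>>', $log or die "open $log: $!";
print {$fh} scalar(localtime) . " $0\n";
close $fh;
```

After a few weeks of production traffic, the set of names in that log is a good first cut at the list of scripts worth keeping.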

Delete file on exit

Maybe I'm wrong, but I am convinced there is some facility provided by UNIX and by the C standard library to get the OS to delete a file once a process exits. But I can't remember what it's called (or maybe I imagined it). In my particular case I would like to access this functionality from perl.
Java has the deleteOnExit function but I understand the deletion is done by the JVM as opposed to the OS which means that if the JVM exits uncleanly (e.g. power failure) then the file will never get deleted.
But my understanding is that with the facility I am looking for (if it exists), the OS looks after the file's deletion, presumably doing some cleanup work on OS start in the case of power failure etc., and certainly doing cleanup in the case that the process exits uncleanly.
A very very simple solution to this (that only works on *nix systems) is to:
Create and open the file (keep the file handle around)
Immediately call unlink on the file
Proceed as normal using the file handle, and exit when you feel like it
Then when your program is complete, the file descriptor is closed and the file is truly deleted. This will even work if the program crashes.
Of course this only works within the context of a single script (i.e. other scripts won't be able to directly manipulate the file, although you COULD pass them the file descriptor).
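The recipe above, sketched in Perl (*nix only; the path is arbitrary):

```perl
use strict;
use warnings;

# Create, open, and immediately unlink; the kernel keeps the data
# reachable through the handle until the last descriptor is closed
my $path = "/tmp/scratch.$$";
open my $fh, '+>', $path or die "open: $!";
unlink $path or die "unlink: $!";

# The name is already gone, but the handle still works
print {$fh} "temporary data\n";
seek $fh, 0, 0;
print scalar <$fh>;                  # prints "temporary data"

# When $fh is closed (or the process dies, however messily),
# the disk space is reclaimed by the kernel
```

Because the directory entry disappears before any work is done, even a kill -9 or a crash leaves nothing behind for anyone to clean up.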
If you are looking for something that the OS may automatically take care of on restart after power failure, an END block isn't enough, you need to create the file where the OS is expecting a temporary file. And once you are doing that, you should just use one of the File::Temp routines (which even offer the option of opening and immediately unlinking the file for you, if you want).
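A minimal File::Temp sketch covering both styles:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# An object from new() lives in the system temp directory and is
# deleted automatically when it goes out of scope
my $tmp = File::Temp->new;
print {$tmp} "scratch data\n";
print $tmp->filename, "\n";

# Called in scalar context, tempfile() returns only a handle and
# arranges for the file to be removed automatically when it is
# closed (open-then-unlink style, on systems that support it)
my $fh = tempfile();
print {$fh} "more scratch\n";
```

Because the file lives in the standard temp location, OS-level tmp cleaners can also sweep up anything left over after a power failure.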
You're looking for atexit(). In Perl this is usually done with END blocks. Java and Perl provide their own because they want to be portable to systems that don't follow the relevant standards (in this case C90).
That said, on Unix the common convention is to open a file and then unlink it; the kernel will delete it when the last reference (which is to say, your file descriptor) is closed. You almost always want to open for read+write.
I think you are looking for a function called tmpfile(), which creates a file when called and deletes it upon close.
You could do your work in an END block.

How can I force unload a Perl module?

Hi I am using a perl script written by another person who is no longer in the company.
If I run the script standalone, then the outputs are as expected. But when I call the script from another program repeatedly, the output is wrong except for the first time.
I suspect some variables are not initialised properly. When it is called standalone, each time it exits and all the variable values are initialised to defaults. But when called from another perl script, the modules and the variable values are probably carried over to the next call of the script.
Is there any way to flush out the called script from memory before I call it next time?
I tried enabling warning and it was throwing up 1000s of lines of warning...!
EDIT: How I am calling the other script:
The code looks like this:
do "processing.pl";
...
...
...
process(params); #A function in processing.pl
...
...
...
If you want to force the module to be reloaded, delete its entry from %INC and then reload it.
For example:
sub reload_module {
    delete $INC{'Your/Silly/Module.pm'};
    require Your::Silly::Module;
    Your::Silly::Module->import;
}
Note that if this module relies on globals in other modules being set, those may need to be reloaded as well. There's no easy way to know without taking a peek at the code.
Hi I am using a perl script written by another person who is no longer in the company.
I tried enabling warning and it was throwing up 1000s of lines of warning...!
There's your problem right there. The script was not written properly, and should be rewritten.
Ask yourself this question: if it has 1000s of warnings when you enable strict checking, how can you be sure that it is doing the right thing? How can you be sure that it is not clobbering files, trashing data sets, making a mess of your filesystem? Chances are it is doing all of these things, either deliberately or accidentally.
I wouldn't trust running an error-filled script written by someone no longer with the company. I'd rewrite it and be sure that it was doing what I needed it to do.
Unloading a module is a more difficult task than simply removing the %INC entry of the module. Take a look at Class::Unload from CPAN.
If you don't want to rewrite/fix the script, I suggest calling the script via exec() or one of its varieties. While it is not very elegant to do, it will definitely fix your problem.
Are you sure that you need to reload the module? By using do, you are reading the source every time and executing it. What happens if you change that to require, which will only read and evaluate the source once?
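The do-versus-require difference can be demonstrated with a self-contained sketch (the "library" here is a throwaway temp file, just for illustration):

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a tiny library to a temp file so the demo is self-contained:
# it just bumps a counter in the caller's package
my ($fh, $lib) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} '$main::count++;' . "\n1;\n";
close $fh;

our $count = 0;

do $lib;          # reads and runs the file
do $lib;          # reads and runs it again
print "after do: $count\n";        # after do: 2

require $lib;     # runs it once more and records $lib in %INC
require $lib;     # no-op: %INC already lists it
print "after require: $count\n";   # after require: 3
```

So with do, every call re-executes the file (re-running any top-level initialisation); with require, only the first call does, which is why switching may change the behaviour the questioner is seeing.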
Another possibility (just thinking aloud here) could be to do with the local directory: are they running from the same place? That probably wouldn't explain the first call working, though.
Another option is to use system('doprocessing.pl');. Lazily, we do this with a few scripts to force re-initialisation of a number of classes/variables etc., and to force the log files to rotate properly.
edit: I have just re-read your question, and it would appear that you are not calling it like this.