How can I force unload a Perl module? - perl

Hi I am using a perl script written by another person who is no longer in the company.
If I run the script as a stand alone, then the output are as expected. But when I call the script from another code repeatedly, the output is wrong except for the first time.
I suspect some variables are not initialised properly. When it is called standalone, each time it exits and all the variable values are initialised to defaults. But when called from another perl script, the modules and the variable values are probably carried over to the next call of the script.
Is there any way to flush out the called script from memory before I call it next time?
I tried enabling warning and it was throwing up 1000s of lines of warning...!
EDIT: How I am calling the other script:
The code looks like this:
do "processing.pl";
...
...
...
process(params); #A function in processing.pl
...
...
...

If you want to force the module to be reloaded, delete its entry from %INC and then reload it.
For example:
sub reload_module {
delete $INC{'Your/Silly/Module.pm'};
require Your::Silly::Module;
Your::Silly::Module->import;
}
Note that if this module relies on globals in other modules being set, those may need to be reloaded as well. There's no easy way to know without taking a peak at the code.

Hi I am using a perl script written by another person who is no longer in the company.
I tried enabling warning and it was throwing up 1000s of lines of warning...!
There's your problem right there. The script was not written properly, and should be rewritten.
Ask yourself this question: if it has 1000s of warnings when you enable strict checking, how can you be sure that it is doing the right thing? How can you be sure that it is not clobbering files, trashing data sets, making a mess of your filesystem? Chances are it is doing all of these things, either deliberately or accidentally.
I wouldn't trust running an error-filled script written by someone no longer with the company. I'd rewrite it and be sure that it was doing what I needed it to do.

Unloading a module is a more difficult task than simply removing the %INC entry of the module. Take a look at Class::Unload from CPAN.

If you don't want to rewrite/fix the script, I suggest calling the script via exec() or one of its varieties. While it is not very elegant to do, it will definitely fix your problem.

Are you sure that you need to reload the module? By using do, you are reading the source every time and executing it. What happens if you change that to require, which will only read and evaluate the source once?

Another possibility (just thinking aloud here) could be to do with the local directory? Are they running from the same place. Probably wouldn't work the first time though.
Another option is to use system ('doprocessing.pl');. Lazily, we do this with a few scripts to force re-initialisation of a number of classes/variables etc. And to force the log files to rotate properly.
edit: I have just re-read your question, and it would appear that you are not calling it like this.

Related

Powershell Remove-Variable cmdlet, do I need to call it at the end of each function/scriptblock?

This is a generic question, no code.
Not sure if I need to remove local variables as I thought it should be done by the Powershell Engine.
I had a script to gather info from WMI and used a lot of local variables. The output was messed up when running multiple times, but it got fixed after I clean up all local variables at the end of function/scriptblock.
Any thoughts/idea would be appreciated.
The trouble do not come from the fact that you do not remove your vars, but by at least from two beginers errors (or done by lazy developpers like me, supposing I'am a developper).
we forget to itialize our vars before using them.
we do not test every returns of our functions or CmdLets calls.
Once these two things done (code multipled by two at least) you can restart your script without cleaning anything except the processed datas.
For me scripting is most of the time done on a corner of a table even not push in a source repository.
So start scripting and ask yourself fewer questions.

Perl: system command returns before it is done

I'm using a a few system() commands in my perl script that is running on Linux.
The commands I run with the system() function output their data to a log which I then parse to decide what to do next.
I noticed that sometimes it looks like the code that parses the log file (which comes after the system() function) doesn't use the final log.
For example, I search for a "test pass" phrase in the log file - and the script doesn't find it even though when I open the log it is there.
Another example - I try to delete the folder where the log was placed but it doesn't let me because it's "Not empty". When I try to delete it manually it is deleted with errors.
(These examples happen every now and then, but most of the time they don't)
It looks like some kind of "timing" problem to me. How can I solve it?
If you want to be safe and are on Linux, call system sync; after every command that writes to the disk before reading from the disk. That will force the OS to write everything still buffered to the filesystem and only return afterwards. Thus you can be sure that when it is finished, everything you wrote to files now actually has arrived there.
But be aware, that may be overkill in many situations. There are reasons for those buffers and manually calling sync constantly is most likely not the fastest way of achieving things. See also http://linux.die.net/man/8/sync
If, for example, you have something else that you could do between the writing and the reading, like some calculations or whatever, that would likely be enough and you would not waste time by telling the OS that you know better how and when it has to do it's jobs. ^^
But if perfect efficiency is not your main concern (and you should not be using Perl if it was), system sync; after everything that modifies files and before accessing those files is probably okay and safe.

Perl: Smartest and fastest way for (dividing) large script?

I have a Perl Script, that has (as of now) 10 subs and growing..
Each sub is making a different LWP-Call and is working with a variable I set in the first sub.
As each sub takes some time, I'm looking for a smart way to make the script run faster.
What would you recommend?
Should I:
Put each sub into a seperate script?
Call the subs (except the first) at once?
Use a different solution (that I don't know of)?
Thanks in advance!
EDIT:
The script is fetching some xml- and soap-data from the web, then extracting the necesary parts.
You probably want check the LWP::Parallel module.
ParallelUserAgent is an extension to the existing libwww module. It
allows you to take a list of URLs (it currently supports HTTP, FTP,
and FILE URLs. HTTPS might work, too) and connect to all of them in
parallel, then wait for the results to come in.
You dont have to split up the file if you dont want to. Just take a list of things to run on STDIN, then execute your script multiple times putting each process in the background or use cron if its an on going thing.
I would recommend using the Operating System's processes until you have a good reason not to do so anymore.
I've had success using Coro and there is already a module to do LWP requests.
http://metacpan.org/pod/LWP::Protocol::Coro::http
Coro is cooperative routines so not multithreaded. There is also an AnyEvent variation

How to split long Perl code into several files without too much manual editing?

How do I split a long Perl script into two or more different files that can all access the same variables - without having to rename all shared variables from e.g. $count to $::count (or $main::count which is the same)?
In other words, what's the best and simplest way to split the Perl script into several files without having to import a lot of variables/functions and/or do a lot of manual editing.
I assume it has something to do with making the code part of the same package/scope/namespace, but my experiments so far have failed.
I am not sure it makes a difference, but the script is used for web/CGI purposes and will be running under mod_perl.
EDIT - Background:
I kind of knew I would get that response. The reason I want to split up the file is the following:
Currently I have a single very old and very long Perl file. I know it is not following Perl best practices but it works.
The problem is, I need to distribute the data files it uses between different web servers, first of all for performance reasons. There will be one "master" server and one or several "slaves".
About 20% of the mentioned Perl file contains shared functions, 40% has the code need to run on the master server and 40% on the slave servers. Therefore, I would like to split the code into three files: 1. shared, 2. master-only, 3. slave-only. On the master server, 1 and 2 will be loaded, on the slaves, 1 and 3 will be loaded.
I assume this approach would use less process RAM and, more importantly, I would minimize the risk of not splitting the code correctly (e.g. a slave process calling a master data file). I don't see a great need for modularization, as the system works and the code does not need a lot of changes or exchanges with other projects.
EDIT 2 - Solution:
Found the solution I was looking for here:
http://www.perlmonks.org/?node_id=95813
In cases where the main package is in ownership of the variable, the
actual word 'main' can be ommitted to yield something like: $::var
It is possible to get around having to fully qualify variable names
when strict is in use. Applying a simply use vars to your script, with
the variable names as it arguments will get around explicit package
names.
Actually, I ended up repeating the our ($count, etc...) statement for the needed variables instead of use vars ();
Do let me know if I am missing something vital - apart from not going with modules! :)
#Axeman, Thanks, I will accept your answer, both for your effort and for sending me in the right direction.
Unless you put different package statements in their files, they will all be treated as if they had package main; at the top. So assuming that the scripts use package variables, you shouldn't have to do anything. If you have declared them with my (that is, if they are lexically scoped variables) then you would have to make sure that all references to the variables are in the same file.
But splitting scripts up for length is a rotten substitute for modularization. Yes, modularization helps keep code length down, but modularization if the proper way to keep code length down--for all the reasons that you would want to keep code-length down, modularization does it best.
If chopping the files by length could really work for you, then you could create a script like this:
do '/path/to/bin/part1.pl';
do '/path/to/bin/part2.pl';
do '/path/to/bin/part3.pl';
...
But I kind of suspect that if the organization of this code is as bad as you're--sort of--indicating, it might suffer from some of the same re-inventing the wheel that I've seen in Perl-ignorant scripts. Just offhand (I might be wrong) but I'm thinking you would be surprised how much could be chopped from the length by simply substituting better-tested Perl library idioms than for-looping and while-ing everything.

Finding a Perl memory leak

SOLVED see Edit 2
Hello,
I've been writing a Perl program to handle automatic upgrading of local (proprietary) programs (for the company I work for).
Basically, it runs via cron, and unfortunately has a memory leak (or something similar). The problem is that the leak only happens when I'm not looking (aka when run via cron, not via command line).
My code does not contain any circular (or other) references, so the commonly cited tools will not help me (Devel::Cycle, Devel::Peek).
How would I go about figuring out what is using so much memory that the kernel kills it?
Basically, the code SFTPs into a server (using ```sftp...`` `), calls OpenSSL to verify the file, and then SFTPs more if more files are needed, and installs them (untars them).
I have seen delays (~15 sec) before the first SFTP session, but it has never used so much memory as to be killed (in my presence).
If I can't sort this out, I'll need to re-write in a different language, and that will take precious time.
Edit: The following message is printed out by the kernel which led me to believe it was a memory leak:
[100023.123] Out of memory: kill process 9568 (update.pl) score 325406 or a child
[100023.123] Killed Process 9568 (update.pl)
I don't believe it is an issue with cron because of the stalling (for ~15 sec, sometimes) when running it via the command-line. Also, there are no environmental variables used (at least by what I've written, maybe underlying things do?)
Edit 2: I found the issue myself, with help from the below comment by mobrule (in response to this question). It turns out that the script was called from a crontab of a user (non-root) just once a day and that (non-root privs) caused a special infinite loop situation.
Sorry guys, I feel kinda stupid for not finding this before, but thanks.
mobrule, if you submit your comment as an answer, I will accept it as it lead to me finding the problem.
End Edits
Thanks,
Brian
P.S. I may be able to post small snippets of code, but not the whole thing due to company policy.
You could try using Devel::Size to profile some of your objects. e.g. in the main:: scope (the .pl file itself), do something like this:
use Devel::Size qw(total_size);
foreach my $varname (qw(varname1 varname2 ))
{
print "size used for variable $varname: " . total_size($$varname) . "\n";
}
Compare the actual size used to what you think is a reasonable value for each object. Something suspicious might pop out immediately (e.g. a cache that is massively bloated beyond anything that sounds reasonable).
Other things to try:
Eliminate bits of functionality one at a time to see if suddenly things get a lot better; I'd start with the use of any external libraries
Is the bad behaviour localized to just one particular machine, or one particular operating system? Move the program to other systems to see how its behaviour changes.
(In a separate installation) try upgrading to the latest Perl (5.10.1), and also upgrade all your CPAN modules
How do you know that it's a memory leak? I can think of many other reasons why the OS would kill a program.
The first question I would ask is "Does this program always work correctly from the command line?". If the answer is "No" then I'd fix these issues first.
On the other hand if the answer is "Yes", I would investigate all the differences between having the program executed under cron and from the command line to find out why it is misbehaving.
If it is run by cron, that shouldn't it die after iteration? If that is the case, hard for me to see how a memory leak would be a big deal...
Are you sure it is the script itself, and not the child processes that are using the memory? Perhaps it ends up creating a real lot of ssh sessions , instead of doing a bunch of stuff in one session?