Can I use dtrace on OS X 10.5 to determine which of my perl subs is causing the most memory allocation? - perl

We have a pretty big perl codebase.
Some processes that run for multiple hours (ETL jobs) suddenly started consuming a lot more RAM than usual. Analysis of the changes in the relevant release is a slow and frustrating process. I am hoping to identify the culprit using more automated analysis.
Our live environment is perl 5.14 on Debian squeeze.
I have access to lots of OS X 10.5 machines, though. Dtrace and perl seem to play together nicely on this platform. Using dtrace on Linux seems to require a bit more work. I am hoping that memory allocation patterns will be similar between our live system and a dev OS X system - or at least similar enough to help me find the origin of this new memory use.
This slide deck:
https://dgl.cx/2011/01/dtrace-and-perl
shows how to use dtrace to show the number of calls to malloc by perl sub. I am interested in tracking the total amount of memory that perl allocates while executing each sub over the lifetime of a process.
Any ideas on how this can be done?

There's no single way to do this, and doing it on a sub-by-sub basis isn't always the best way to examine memory usage. I'm going to recommend a set of tools that you can use, some work on the program as a whole, others allow you to examine a single section of your code or a single variable.
You might want to consider using Valgrind. There's even a Perl module called Test::Valgrind that will help set up a suppression file for your Perl build, and then check for memory leaks or errors in your script.
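As a rough sketch (assuming valgrind and Test::Valgrind are installed, and with run_etl_job() standing in for whatever part of your code you want exercised), a leak-checking test following the module's synopsis can be as small as this:
use Test::More;
eval 'use Test::Valgrind';     # when it loads, the rest of the test is re-run under valgrind
plan skip_all => 'Test::Valgrind is required for this test' if $@;
run_etl_job();                 # stand-in for the code you want checked for leaks or errors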
There's also Devel::Size which does exactly what you asked for, but on a variable-by-variable basis rather than a sub-by-sub basis.
You can use Devel::Cycle to search for inadvertent circular memory references in complex data structures. While a circular reference doesn't mean that you're wasting memory as you use the object, circular references prevent anything in the chain from being freed until the cycle is broken.
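A quick sketch of how Devel::Cycle is used (find_cycle prints any reference cycles it finds in the structure you hand it):
use Devel::Cycle;
my $node = { name => 'root' };
$node->{self} = $node;     # deliberately create a cycle
find_cycle($node);         # prints the chain of references forming the cycle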
Devel::Leak is a little bit more arcane than the rest, but it basically will allow you to get full information on any SVs that are created and not destroyed between two points in your program's execution. If you check this across a sub call, you'll know any new memory that that subroutine allocated.
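A hedged sketch of that bracketing technique (do_suspect_work() is a placeholder for the sub you're checking):
use Devel::Leak;
my $handle;
my $before = Devel::Leak::NoteSV($handle);   # snapshot all SVs currently alive
do_suspect_work();                           # placeholder: the code under scrutiny
my $after  = Devel::Leak::CheckSV($handle);  # reports SVs created since NoteSV
print "SV count went from $before to $after\n";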
You may also want to read the perldebguts section of the Perl manual.
I can't really help more because every codebase is going to wind up being different. Test::Valgrind will work great for some codebases and terribly on others. If you are going to try it, I recommend you use the latest version of Valgrind available and Perl >= 5.10, as Perl 5.8 and Valgrind historically didn't get along too well.

You might want to look at Memory::Usage and Devel::Size.
To check the whole process or sub:
use Memory::Usage;
my $mu = Memory::Usage->new();
# Record amount of memory used by current process
$mu->record('starting work');
# Do the thing you want to measure
$object->something_memory_intensive();
# Record amount in use afterwards
$mu->record('after something_memory_intensive()');
# Spit out a report
$mu->dump();
Or to check specific variables:
use Devel::Size qw(size total_size);
my $size = size("A string");
my @foo = (1, 2, 3, 4, 5);
my $other_size = size(\@foo);
my $foo = {
    a => [1, 2, 3],
    b => {a => [1, 3, 4]}
};
my $total_size = total_size($foo);

The answer to the question is 'yes'. Dtrace can be used to analyze memory usage in a perl process.
This snippet of code:
https://github.com/astletron/perl-dtrace-malloc/blob/master/perl-malloc-total-bytes-by-sub.d
tracks how memory use increases between the call and return of every sub in a program. As an added bonus, dtrace seems to sort the output for you (at least on OS X). Cool.
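For anyone following along: I'd expect to run a script like that against an already-running process with something along these lines (hedged; check the script's own header for the exact usage it expects):
sudo dtrace -s perl-malloc-total-bytes-by-sub.d -p <pid-of-your-perl-process>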
Thanks to all that chimed in. I answered this one myself as the question is really specific to dtrace/perl.

You could write a simple debug module based on Devel::CallTrace that prints the sub entered as well as the current memory size of the current process. (Using /proc or whatever.)
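A minimal sketch of that idea, modelled on how Devel::CallTrace hooks sub calls (the module name and the /proc path are assumptions; /proc makes it Linux-specific, which matches the live environment). Save it as Devel/SubMemTrace.pm somewhere in @INC and run perl -d:SubMemTrace yourscript.pl:
package Devel::SubMemTrace;
# Current resident set size in kB, read from /proc (Linux-specific).
sub rss_kb {
    open my $fh, '<', "/proc/$$/status" or return -1;
    while (my $line = <$fh>) {
        return $1 if $line =~ /^VmRSS:\s+(\d+)/;
    }
    return -1;
}
package DB;
sub DB { }    # line hook required by the debugger interface; unused here
sub sub {
    # $DB::sub holds the fully qualified name of the sub about to be called
    print STDERR "$DB::sub\t", Devel::SubMemTrace::rss_kb(), " kB\n";
    no strict 'refs';
    &$DB::sub;    # dispatch to the real sub, preserving @_, context and return value
}
1;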

Related

How to identify places accumulating memory use in a Perl script?

In my Perl script, occupied memory accumulates at a high rate as it runs. I have tried clearing suspect variables as soon as they are no longer needed, but the problem cannot be fixed. Is there any method to monitor the change in memory occupation before and after executing a block of code?
I have recently had to troubleshoot an out-of-memory situation in one of my programs. While I do not claim to be an expert in this matter by any means, I'm going to share my findings in the hope that it will benefit someone.
1. High, but stable, memory usage
First, you should ensure that you do not just have a case of high, but stable, memory usage. If memory usage is stable, even if your process does not fit in available memory, the discussion below won't be of much help. Here are some notes worth reading in Perl's documentation here and here, in this SO question, in this PerlMonks discussion. There is an interesting analysis here if you're familiar with Perl internals. A lot of deep information is to be found in Tim Bunce's presentation. You should be aware that Perl may not return memory to the system even if you undef stuff. Finally, there's this opinion from a Perl developer that you shouldn't worry too much about memory usage.
2. Steadily growing memory usage
In case memory usage steadily grows, this may eventually cause an out-of-memory situation. My problem turned out to be a case of circular references. According to this answer on StackOverflow, circular references are a common source of memory leaks in Perl. The underlying reason is that Perl uses a reference counting mechanism and cannot release circularly referenced memory until program exit. (Note: I haven't been able to find a more up-to-date statement of that last claim in Perl's documentation.)
You can use Scalar::Util::weaken to 'weaken' a circular reference chain (see also http://perlmaven.com/eliminate-circular-reference-memory-leak-using-weaken).
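A minimal example of breaking a cycle with weaken:
use Scalar::Util qw(weaken);
my $parent = { name => 'parent' };
my $child  = { name => 'child' };
$parent->{child} = $child;      # parent -> child
$child->{parent} = $parent;     # child -> parent: a reference cycle
weaken($child->{parent});       # make the back-reference weak
# When $parent and $child go out of scope, both hashes can now be freed.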
3. Further reading
Tim Bunce's presentation (slides here); also in this blog post
http://www.perlmonks.org/?node_id=472366
Perl memory usage profiling and leak detection?
and of course the link given by @mpapec: http://perlmaven.com/how-much-memory-does-the-perl-application-use
4. Tools
on Unix, you could do system("ps -p $$ -o vsz,rsz,sz,size"). Caution: as explained in Tim Bunce's presentation, you'll want to track VSIZE rather than RSS (a small helper along these lines is sketched after this list)
How to find the amount of physical memory occupied by a hash in Perl?
https://metacpan.org/pod/Devel::Size
and a more recent take by Tim Bunce, which adds the possibility of estimating the total interpreter memory size: https://metacpan.org/pod/Devel::SizeMe
in test scripts, you can use https://metacpan.org/pod/Test::LeakTrace and https://metacpan.org/pod/Test::Memory::Cycle; an example here
https://metacpan.org/pod/Devel::InterpreterSize
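For ad-hoc instrumentation, a small helper along the lines of the ps suggestion in the list above (assumes a Unix ps that understands -o vsz=; run_suspect_block() is a placeholder):
# Return the interpreter's current virtual size in kB via ps.
sub vsz_kb {
    my ($vsz) = `ps -p $$ -o vsz=` =~ /(\d+)/;
    return $vsz;
}
my $before = vsz_kb();
run_suspect_block();    # placeholder for the code under test
printf "block grew the process by %d kB\n", vsz_kb() - $before;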

Which tool should I use for finding out my memory allocation in Perl?

I've slurped in a big file using File::Slurp but given the size of the file I can see that I must have it in memory twice or perhaps it's getting inflated by being turned into 16 bit unicode. How can I best diagnose that sort of a problem in Perl?
The file I pulled in is 800mb in size and my perl process that's analysing that data has roughly 1.6gb allocated at runtime.
I realise that I may be wrong about my reason for the problem but I'm not sure the most efficient way to prove/disprove my theory.
Update:
I have eliminated dodgy character encoding from the list of suspects. It looks like I'm copying the variable at some point, I just can't figure out where.
Update 2:
I have now done some more investigation and discovered that it's actually just getting the data from File::Slurp that's causing the problem. I had a look through the documentation and discovered that I can get it to return a scalar_ref, i.e.
my $data = read_file($file, binmode => ':raw', scalar_ref => 1);
Then I don't get the inflation of my memory. Which makes some sense and is the most logical thing to do when getting the data in my situation.
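For completeness, this is roughly how I then work with the scalar ref (process_line() is a placeholder for the real per-line handling):
use File::Slurp qw(read_file);
my $data = read_file($file, binmode => ':raw', scalar_ref => 1);
print "read ", length($$data), " bytes\n";
# Open an in-memory filehandle on the slurped string so it can be walked
# line by line without copying the 800mb into a second variable.
open my $fh, '<', $data or die "can't open in-memory handle: $!";
while (my $line = <$fh>) {
    process_line($line);    # placeholder
}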
The information about looking at what variables exist etc. has been generally helpful though, thanks.
Maybe Devel::DumpSizes and/or Devel::Size can help out? I think the former would be more useful in your case.
Devel::DumpSizes - Dump the name and size in bytes (in increasing order) of variables that are available at a given point in a script.
Devel::Size - Perl extension for finding the memory usage of Perl variables
Here are some generic resources on memory issues in Perl:
http://perl.active-venture.com/pod/perldebguts-perlmemory.html
Perl memory usage profiling and leak detection?
How can I find memory leaks in long-running Perl program?
As for your own suggestion, the simplest way to disprove it would be to write a simple Perl program that:
1. Creates a big (100M) file of plain text, probably by just outputting the same string in a loop into a file, or by running the dd command via a system() call for binary files
2. Reads the file in using standard Perl open()/@a = <>;
3. Measures memory consumption.
Then repeat steps 2-3 for your 800M file.
That will tell you if the issue is File::Slurp, some weird logic in your program, or some specific content in the file (e.g. non-ASCII data, although I'd be surprised if that turned out to be the reason). A rough sketch of such a comparison follows.
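Something like this (file name and sizes are arbitrary; repeat with File::Slurp's read_file in place of the plain readline to compare):
sub vsz_kb { my ($v) = `ps -p $$ -o vsz=` =~ /(\d+)/; return $v }
# Step 1: build a ~100M plain-text file.
open my $out, '>', 'bigfile.txt' or die $!;
print {$out} ('x' x 99 . "\n") x 1_000_000;
close $out;
# Steps 2-3: read it back and measure how much the process grew.
my $before = vsz_kb();
my @a = do { open my $in, '<', 'bigfile.txt' or die $!; <$in> };
printf "%d lines read, process grew by %d kB\n", scalar @a, vsz_kb() - $before;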

How can I profile a subroutine without using modules?

I'm tempted to relabel this question 'Look at this brick. What type of house does it belong to?'
Here's the situation: I've effectively been asked to profile some subroutines having access to neither profilers (even Devel::DProf) nor Time::HiRes. The purpose of this exercise is to 'locate' bottlenecks.
At the moment, I'm sprinkling print statements at the beginning and end of each sub that log entries and exits to a file, along with the result of the time function. Not ideal, but it's the best I can do given the circumstances. At the very least it'll allow me to see how many times each sub is called.
The code is running under Unix. The closest thing I see to my need is perlfaq8, but that doesn't seem to help (I don't know how to make a syscall, and am wondering if it'll affect the code timing unpredictably).
Not your typical everyday SO question...
This technique should work.
Basically, the idea is if you run Perl with the -d flag, it goes into the debugger. Then, when you run the program, ctrl-Break or ctrl-C should cause it to pause in the middle of whatever it is doing. Then you can type T to show the stack, and examine any other variables if you like, before continuing it.
Do this about 10 or 20 times. Any line of code (or any function, if you prefer) costing a significant percent of time will appear on that percent of stack samples, roughly, so you will not miss it.
For example, if a line of code (typically a function call) costs 20% of time, and you pause the program 20 times, you will see that line on 4 stack samples, give or take 1.8 samples. The amount of time that could be saved if you could avoid executing that line, or execute it a lot less, is a 20% reduction in overall execution time.
Then you can repeat it to find more problems.
You said the purpose is to 'locate' bottlenecks. This method does exactly that. Measuring function execution time is only a very indirect way to do that.
As far as syscall, there's a pretty good example in this post: http://www.cpan.org/scripts/date_and_time/gettimeofday
I think it's clear enough even for someone who never used syscall before (like myself :)
May I ask what the specifics of "having no access" are?
It's usually possible to get access to CPAN modules, even in cases where installing them in a central location is not in the cards. Is there a problem with downloading the module? Installing it in your home directory? Using software with the module included?
If one of those is a hang-up it can probably be fixed... if it's some company policy, that's priceless :(
Well, you can write your own profiler. It's not as bad as it sounds. A profiler is just a very special-case debugger. You want to read the perldebguts man page for some good first-cut code to get started if you must write your own.
What you want, and what your boss wants, though he or she may not know it, is to use Devel::NYTProf to do a really good job of profiling your code, and getting the job done instead of having to wait for you to partially duplicate the functions of it while learning how it is done.
The comment you made about "personal use" doesn't make sense. You're doing a job for work, and the work needs to get done, and you need (or your manager needs to get you) the resources to do that work. "Personal use" doesn't seem to enter into it.
Is it a question of someone else refusing to sign off on the module to have it installed on the machine running the software to be measured? Is it a licensing question? Is it not being allowed to install arbitrary software on a production machine (understandable, but there's got to be some way the software's tested before it goes live - I hope - profile it there)?
What is the reason that a well-known module from a trustworthy source can't be used? Have you made the money case to your manager that more money will be spent coding a new, less-functional, profiler from scratch than finding a way to use one that is both good and already available?
For each subroutine, create a wrapper around it which reports the time in some format you can export to something like R, a database, Excel or similar (CSV would be a good choice). Add something like this to your code. If you are using a Perl older than 5.7 (when Time::HiRes was first added to the core), use syscall as mentioned above instead of Time::HiRes's functions below.
INIT {
    sub wrap_sub {
        no strict 'refs';
        my $sub = shift;
        my $subref = *{$sub}{CODE};
        return sub {
            local *__ANON__ = "wrapped_$sub";
            my @return;
            my $fsecs = Time::HiRes::gettimeofday();
            print STDERR "$sub,$fsecs,";
            if (wantarray) {
                @return = eval { $subref->(@_) };
            } else {
                $return[0] = eval { $subref->(@_) };
            }
            die $@ if $@;
            $fsecs = Time::HiRes::gettimeofday();
            print STDERR "$fsecs\n";
            return wantarray ? @return : $return[0];
        };
    }
    require Time::HiRes;
    my @subs = qw{the subs you want to profile};
    no strict 'refs';
    no warnings 'redefine';
    foreach my $sub (@subs) {
        *{$sub} = wrap_sub($sub);
    }
}
Replace 'subs you want to profile' with the subs you need profiled, and use an open()ed file handle instead of STDERR if you need to, bearing in mind you can get the results of the run separate from the output of the script (on Unix, with the Bourne, Korn and Bash shells), like this:
perl ./myscript.pl 2>myscript.profile
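If it helps, here is a rough sketch of summarising that CSV afterwards (the file name matches the redirection above):
# Sum wall-clock time per sub from the "name,start,end" lines produced above.
my %total;
open my $fh, '<', 'myscript.profile' or die "can't read profile: $!";
while (<$fh>) {
    chomp;
    my ($name, $start, $end) = split /,/;
    next unless defined $end;
    $total{$name} += $end - $start;
}
printf "%-50s %10.6f\n", $_, $total{$_}
    for sort { $total{$b} <=> $total{$a} } keys %total;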

How can I access the ref count of a Perl hash?

I'm trying to enable the garbage collector of my script to do a better job. There's a ton of memory that it should be able to reclaim, but something is stopping it.
I've used Devel::Cycle a bit and that's allowed me to get closer but I'm not quite there.
How do I find out the current reference count for a Perl hash (the storage for my objects)?
Is there a way to track who is holding a reference to an object? Perhaps a sort of Tie that says, whenever someone points at this object, remember who that someone is.
You are looking for Devel::Refcount.
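It exports a refcount function; a quick sketch:
use Devel::Refcount qw(refcount);
my %store = (id => 42);
my $ref   = \%store;
my $copy  = $ref;
# Prints how many references currently point at the hash itself.
print refcount($ref), "\n";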
If you are worried about returning unused memory to the OS, you should know that is not possible in general. The memory footprint of your Perl program will be proportional to the largest allocation during the lifetime of your program.
See How can I make my Perl program take less memory? in the Perl FAQ list as well as Mini-Tutorial: Perl's Memory Management (as pointed out by @Evan Carroll in the comments).

Finding a Perl memory leak

SOLVED see Edit 2
Hello,
I've been writing a Perl program to handle automatic upgrading of local (proprietary) programs (for the company I work for).
Basically, it runs via cron, and unfortunately has a memory leak (or something similar). The problem is that the leak only happens when I'm not looking (aka when run via cron, not via command line).
My code does not contain any circular (or other) references, so the commonly cited tools will not help me (Devel::Cycle, Devel::Peek).
How would I go about figuring out what is using so much memory that the kernel kills it?
Basically, the code SFTPs into a server (using backticks: `sftp ...`), calls OpenSSL to verify the file, then SFTPs more files if more are needed, and installs them (untars them).
I have seen delays (~15 sec) before the first SFTP session, but it has never used so much memory as to be killed (in my presence).
If I can't sort this out, I'll need to re-write in a different language, and that will take precious time.
Edit: The following message is printed out by the kernel which led me to believe it was a memory leak:
[100023.123] Out of memory: kill process 9568 (update.pl) score 325406 or a child
[100023.123] Killed Process 9568 (update.pl)
I don't believe it is an issue with cron because of the stalling (for ~15 sec, sometimes) when running it via the command-line. Also, there are no environmental variables used (at least by what I've written, maybe underlying things do?)
Edit 2: I found the issue myself, with help from the below comment by mobrule (in response to this question). It turns out that the script was called from a crontab of a user (non-root) just once a day and that (non-root privs) caused a special infinite loop situation.
Sorry guys, I feel kinda stupid for not finding this before, but thanks.
mobrule, if you submit your comment as an answer, I will accept it, as it led to me finding the problem.
End Edits
Thanks,
Brian
P.S. I may be able to post small snippets of code, but not the whole thing due to company policy.
You could try using Devel::Size to profile some of your objects. e.g. in the main:: scope (the .pl file itself), do something like this:
use Devel::Size qw(total_size);
no strict 'refs';   # the variables are looked up by name (symbolic references to package variables)
foreach my $varname (qw(varname1 varname2))
{
    print "size used for variable $varname: " . total_size($$varname) . "\n";
}
Compare the actual size used to what you think is a reasonable value for each object. Something suspicious might pop out immediately (e.g. a cache that is massively bloated beyond anything that sounds reasonable).
Other things to try:
Eliminate bits of functionality one at a time to see if suddenly things get a lot better; I'd start with the use of any external libraries
Is the bad behaviour localized to just one particular machine, or one particular operating system? Move the program to other systems to see how its behaviour changes.
(In a separate installation) try upgrading to the latest Perl (5.10.1), and also upgrade all your CPAN modules
How do you know that it's a memory leak? I can think of many other reasons why the OS would kill a program.
The first question I would ask is "Does this program always work correctly from the command line?". If the answer is "No" then I'd fix these issues first.
On the other hand if the answer is "Yes", I would investigate all the differences between having the program executed under cron and from the command line to find out why it is misbehaving.
If it is run by cron, then shouldn't it die after each iteration? If that is the case, it's hard for me to see how a memory leak would be a big deal...
Are you sure it is the script itself, and not the child processes, that is using the memory? Perhaps it ends up creating a whole lot of ssh sessions, instead of doing a bunch of stuff in one session?