Why is Devel::LeakTrace leaking memory? - perl

I am trying to learn more about how to detect memory leaks in Perl.
I have this program:
p.pl:
#! /usr/bin/env perl
use Devel::LeakTrace;
my $foo;
$foo = \$foo;
Output:
leaked SV(0xac2df8e0) from ./p.pl line 5
leaked SV(0xac2df288) from ./p.pl line 5
Why is this leaking two scalars (and not just one)?
Then I ran it through valgrind. First I created a debugging version of perl:
$ perlbrew install perl-5.30.0 --as=5.30.0-D3L -DDEBUGGING \
-Doptimize=-g3 -Accflags="-DDEBUG_LEAKING_SCALARS"
$ perlbrew use 5.30.0-D3L
$ cpanm Devel::LeakTrace
Then I ran valgrind setting PERL_DESTRUCT_LEVEL=2 as recommended in perlhacktips:
$ PERL_DESTRUCT_LEVEL=2 valgrind --leak-check=yes perl p.pl
==12479== Memcheck, a memory error detector
==12479== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==12479== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==12479== Command: perl p.pl
==12479==
leaked SV(0x4c27320) from p.pl line 5
leaked SV(0x4c26cc8) from p.pl line 5
==12479==
==12479== HEAP SUMMARY:
==12479== in use at exit: 105,396 bytes in 26 blocks
==12479== total heap usage: 14,005 allocs, 13,979 frees, 3,011,508 bytes allocated
==12479==
==12479== 16 bytes in 1 blocks are definitely lost in loss record 5 of 21
==12479== at 0x483874F: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==12479== by 0x484851A: note_changes (LeakTrace.xs:80)
==12479== by 0x48488E3: XS_Devel__LeakTrace_hook_runops (LeakTrace.xs:126)
==12479== by 0x32F0A2: Perl_pp_entersub (pp_hot.c:5237)
==12479== by 0x2C0C50: Perl_runops_debug (dump.c:2537)
==12479== by 0x1A2FD9: Perl_call_sv (perl.c:3043)
==12479== by 0x1ACEE3: Perl_call_list (perl.c:5084)
==12479== by 0x181233: S_process_special_blocks (op.c:10471)
==12479== by 0x180989: Perl_newATTRSUB_x (op.c:10397)
==12479== by 0x220D6C: Perl_yyparse (perly.y:295)
==12479== by 0x3EE46B: S_doeval_compile (pp_ctl.c:3502)
==12479== by 0x3F4F87: S_require_file (pp_ctl.c:4322)
==12479==
==12479== LEAK SUMMARY:
==12479== definitely lost: 16 bytes in 1 blocks
==12479== indirectly lost: 0 bytes in 0 blocks
==12479== possibly lost: 0 bytes in 0 blocks
==12479== still reachable: 105,380 bytes in 25 blocks
==12479== suppressed: 0 bytes in 0 blocks
==12479== Reachable blocks (those to which a pointer was found) are not shown.
==12479== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==12479==
==12479== For counts of detected and suppressed errors, rerun with: -v
==12479== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
So 16 bytes are lost. However, if I comment out the line use Devel::LeakTrace in p.pl and run valgrind again, the output is:
==12880== Memcheck, a memory error detector
==12880== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==12880== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==12880== Command: perl p.pl
==12880==
==12880==
==12880== HEAP SUMMARY:
==12880== in use at exit: 0 bytes in 0 blocks
==12880== total heap usage: 1,770 allocs, 1,770 frees, 244,188 bytes allocated
==12880==
==12880== All heap blocks were freed -- no leaks are possible
==12880==
==12880== For counts of detected and suppressed errors, rerun with: -v
==12880== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
So the question is: Why is Devel::LeakTrace causing a memory leak?

It seems there are even more memory leaks than valgrind reported.
Each time a new SV is created, Devel::LeakTrace records the current file name and line number in a 16-byte structure called when:
typedef struct {
char *file;
int line;
} when;
These blocks are allocated with malloc() at line 80 of LeakTrace.xs, but the module never seems to free them. So the more scalars are created, the more memory leaks.
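To see where the 16 bytes come from, the struct layout can be reproduced with Python's ctypes. This is only an illustrative sketch: the class name When is a stand-in for the C typedef, and the result assumes a typical 64-bit platform (8-byte pointer, 4-byte int, 4 bytes of trailing padding).

```python
import ctypes

# Mirror of the C struct: { char *file; int line; }
class When(ctypes.Structure):
    _fields_ = [
        ("file", ctypes.c_char_p),  # pointer: 8 bytes on a 64-bit build
        ("line", ctypes.c_int),     # int: 4 bytes, followed by padding
    ]

pointer_size = ctypes.sizeof(ctypes.c_void_p)
struct_size = ctypes.sizeof(When)
print(pointer_size, struct_size)
```

On a 32-bit build the same struct would be smaller, so the per-block leak size that valgrind reports depends on the platform.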
Some background information
The module tries to determine leaked SVs from the END{} phaser. At this point all SVs allocated by the main program should have gone out of scope and had their reference count decremented to zero, which destroys them. However, if for some reason the reference count does not reach zero, the scalar is not destroyed and is not freed from perl's internal memory management pool. In this case the module considers the scalar leaked.
Note that this is not the same as leaked memory as seen from the operating system's memory pool, handled by e.g. malloc(). When perl exits it will still free any leaked scalars (from its internal memory pool) back to the system's memory pool.
This means that the module is not meant to detect leaked system memory. For that, we can use e.g. valgrind.
The module hooks into the perl runops loop, and for each OP of type OP_NEXTSTATE it scans all arenas and all SVs in them for new SVs (that is, SVs that have been introduced since the previous OP_NEXTSTATE).
For the sample program p.pl in my question I counted 31 arenas, and each arena had space for 71 SVs. Almost all of these SVs were in use at run time (approximately 2150 of them). The module keeps each of these SVs in a hash called used, with the key equal to the address of the SV and the value equal to the when block (see above) that describes where the scalar was allocated. For each OP_NEXTSTATE, it can then scan all SVs and check whether any of them are not present in the used hash.
The used hash is not a perl hash (I guess this was to avoid any conflicts with the allocated SVs that the module tries to keep track of); instead the module uses GLib hash tables.
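The bookkeeping described above can be modeled in a few lines of Python. This is a simplified sketch, not the module's code: plain dicts stand in for the GLib hash tables, the integer addresses are made up, and the function name note_changes is borrowed from the XS source only for orientation.

```python
def note_changes(used, live, where):
    """Return a new address -> origin map for the SVs alive right now.

    used:  map from the previous scan (address -> (file, line))
    live:  addresses found by walking the arenas during this scan
    where: (file, line) of the statement that just finished
    """
    new_used = {}
    for addr in live:
        # SVs seen before keep their recorded origin; new ones get `where`.
        new_used[addr] = used.get(addr, where)
    return new_used

# Baseline taken when the hook is installed: origin has no file name.
used = note_changes({}, {0x10, 0x20}, (None, 0))
# A statement on line 5 creates a new SV at (hypothetical) address 0x30.
used = note_changes(used, {0x10, 0x20, 0x30}, ("p.pl", 5))
# Later 0x20 is freed, so it drops out of the map.
used = note_changes(used, {0x10, 0x30}, ("p.pl", 6))

# At END time, everything with a known file name is reported as leaked.
leaks = {addr: w for addr, w in used.items() if w[0] is not None}
print(leaks)
```

SVs that existed before the hook was installed carry no file name and are skipped, which mirrors the w->file check in print_me.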
Patch
To keep track of the allocated when blocks, I added a new GLib hash table called when_hash. After the module has printed the leaked scalars, the when blocks can then be freed by walking all keys in when_hash.
I also found that the module did not free the used hash. As far as I can see, it should call the GLib function g_hash_table_destroy() to release it from the END{} block. Here is the patch:
LeakTrace.xs (patched):
#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include <glib.h>
typedef struct {
char *file;
int line;
} when;
/* a few globals, never mind the mess for now */
GHashTable *used = NULL;
GHashTable *new_used = NULL;
/* cargo from Devel::Leak - wander the arena, see what SVs live */
typedef long used_proc _((void *,SV *,long));
/* PATCH: fix memory leaks */
/***************************/
GHashTable *when_hash = NULL; /* store the allocated when blocks here */
static int have_run_end_hook = 0; /* indicator to runops that we are done */
static runops_proc_t save_orig_run_ops; /* original runops function */
/* Called from END{}, i.e. from show_used() after having printed the leaks.
* Free memory allocated for the when blocks */
static
void
free_when_block(gpointer key, gpointer value, gpointer user_data) {
free(key);
}
static
void
do_cleanup() {
/* this line was missing from the original show_used() */
if (used) {
g_hash_table_destroy( used );
used = NULL; /* do not leave a dangling pointer behind */
}
if (when_hash) {
g_hash_table_foreach( when_hash, free_when_block, NULL );
g_hash_table_destroy( when_hash ); /* only destroy a table that exists */
when_hash = NULL;
}
PL_runops = save_orig_run_ops;
have_run_end_hook = 1;
}
/* END PATCH: fix memory leaks */
/*******************************/
static
long int
sv_apply_to_used(void *p, used_proc *proc, long n) {
SV *sva;
for (sva = PL_sv_arenaroot; sva; sva = (SV *) SvANY(sva)) {
SV *sv = sva + 1;
SV *svend = &sva[SvREFCNT(sva)];
while (sv < svend) {
if (SvTYPE(sv) != SVTYPEMASK) {
n = (*proc) (p, sv, n);
}
++sv;
}
}
return n;
}
/* end Devel::Leak cargo */
static
long
note_used(void *p, SV* sv, long n) {
when *old = NULL;
if (used && (old = g_hash_table_lookup( used, sv ))) {
g_hash_table_insert(new_used, sv, old);
return n;
}
g_hash_table_insert(new_used, sv, p);
return 1;
}
static
void
print_me(gpointer key, gpointer value, gpointer user_data) {
when *w = value;
char *type;
switch SvTYPE((SV*)key) {
case SVt_PVAV: type = "AV"; break;
case SVt_PVHV: type = "HV"; break;
case SVt_PVCV: type = "CV"; break;
case SVt_RV: type = "RV"; break;
case SVt_PVGV: type = "GV"; break;
default: type = "SV";
}
if (w->file) {
fprintf(stderr, "leaked %s(0x%x) from %s line %d\n",
type, key, w->file, w->line);
}
}
static
int
note_changes( char *file, int line ) {
static when *w = NULL;
int ret = 0; /* initialized: the original returned an indeterminate value */
/* PATCH */
if (have_run_end_hook) return 0; /* do not enter after clean up is complete */
/* if (!w) w = malloc(sizeof(when)); */
if (!w) {
w = malloc(sizeof(when));
if (!when_hash) {
/* store pointer to allocated blocks here */
when_hash = g_hash_table_new( NULL, NULL );
}
g_hash_table_insert(when_hash, w, NULL); /* store address to w */
}
/* END PATCH */
w->line = line;
w->file = file;
new_used = g_hash_table_new( NULL, NULL );
if (sv_apply_to_used( w, note_used, 0 )) w = NULL;
if (used) g_hash_table_destroy( used );
used = new_used;
return ret;
}
/* Now this bit of cargo is a derived from Devel::Caller */
static
int
runops_leakcheck(pTHX) {
char *lastfile = 0;
int lastline = 0;
IV last_count = 0;
while ((PL_op = CALL_FPTR(PL_op->op_ppaddr)(aTHX))) {
PERL_ASYNC_CHECK();
if (PL_op->op_type == OP_NEXTSTATE) {
if (PL_sv_count != last_count) {
note_changes( lastfile, lastline );
last_count = PL_sv_count;
}
lastfile = CopFILE(cCOP);
lastline = CopLINE(cCOP);
}
}
note_changes( lastfile, lastline );
TAINT_NOT;
return 0;
}
MODULE = Devel::LeakTrace PACKAGE = Devel::LeakTrace
PROTOTYPES: ENABLE
void
hook_runops()
PPCODE:
{
note_changes(NULL, 0);
save_orig_run_ops = PL_runops; /* remember the original so do_cleanup() can restore it */
PL_runops = runops_leakcheck;
}
void
reset_counters()
PPCODE:
{
if (used) g_hash_table_destroy( used );
used = NULL;
note_changes(NULL, 0);
}
void
show_used()
CODE:
{
if (used) g_hash_table_foreach( used, print_me, NULL );
/* PATCH */
do_cleanup(); /* released allocated memory, restore original runops */
/* END PATCH */
}
Testing the patch
$ wget https://www.cpan.org/modules/by-module/Devel/Devel-LeakTrace-0.06.tar.gz
$ tar zxvf Devel-LeakTrace-0.06.tar.gz
$ cd Devel-LeakTrace-0.06
$ perlbrew use 5.30.0-D3L
# replace lib/Devel/LeakTrace.xs with my patch
$ perl Makefile.PL
$ make
$ make install # <- installs the patch
# cd to test folder, then
$ PERL_DESTRUCT_LEVEL=2 valgrind --leak-check=yes perl p.pl
==25019== Memcheck, a memory error detector
==25019== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==25019== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==25019== Command: perl p.pl
==25019==
leaked SV(0x4c26cd8) from p.pl line 5
leaked SV(0x4c27330) from p.pl line 5
==25019==
==25019== HEAP SUMMARY:
==25019== in use at exit: 23,324 bytes in 18 blocks
==25019== total heap usage: 13,968 allocs, 13,950 frees, 2,847,004 bytes allocated
==25019==
==25019== LEAK SUMMARY:
==25019== definitely lost: 0 bytes in 0 blocks
==25019== indirectly lost: 0 bytes in 0 blocks
==25019== possibly lost: 0 bytes in 0 blocks
==25019== still reachable: 23,324 bytes in 18 blocks
==25019== suppressed: 0 bytes in 0 blocks
==25019== Reachable blocks (those to which a pointer was found) are not shown.
==25019== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==25019==
==25019== For counts of detected and suppressed errors, rerun with: -v
==25019== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

First, valgrind reports 16 bytes of leaked memory even in a script that contains nothing beyond the use Devel::LeakTrace line, so that leak is independent of the fourth and fifth lines. From your link,
NOTE 3: There are known memory leaks when there are compile-time errors
within eval or require, seeing S_doeval in the call stack is a good sign
of these. Fixing these leaks is non-trivial, unfortunately, but they must be fixed
eventually.
Since I see the line by 0x3F18E5: S_doeval_compile (pp_ctl.c:3502), and a similar line in your example, I would say that this is why Devel::LeakTrace causes an apparent memory leak.
Second, regarding the original script, Devel::LeakTrace is simply reporting the leak caused by (at least) a circular reference at the fifth line. You can see this by using weaken from Scalar::Util:
#! /usr/bin/env perl
use Devel::LeakTrace;
use Scalar::Util;
my $foo;
$foo = \$foo;
Scalar::Util::weaken($foo);
Then perl p.pl will not report any leak. My guess is that the first script reports two leaks because, in addition to creating a circular reference, perl is losing a pointer at $foo = \$foo. There is some magic I do not understand that occurs when you weaken $foo and apparently fixes both issues. You can see this by tweaking the original script:
#! /usr/bin/env perl
use Devel::LeakTrace;
my $foo;
my $bar = \$foo;
$foo = $bar;
The resulting $foo should be identical; we have just created $bar to hold the reference. However, in this case the script reports only one leak.
So, in summary, I would say that 1) Devel::LeakTrace has a bug that shows up as a memory leak in valgrind independently of the code, and 2) perl is creating a circular reference and losing a pointer in the original script, which is why Devel::LeakTrace reports two leaks.

Related

"Program too large" threshold greater than actual instruction count

I've written a couple production BPF agents, but my approach is very iterative until I please the verifier and can move on. I've reached my limit again.
Here's a program that works if I have one fewer && condition -- and breaks otherwise. The confusing part is that the warning implies that 103 insns is greater-than at most 4096 insns. There's obviously something I'm misunderstanding about how this is all strung together.
My ultimate goal is to do logging based on a process' environment -- so alternative approaches are welcome. :)
Error:
$ sudo python foo.py
bpf: Argument list too long. Program too large (103 insns), at most 4096 insns
Failed to load BPF program b'tracepoint__sched__sched_process_exec': Argument list too long
BPF Source:
#include <linux/mm_types.h>
#include <linux/sched.h>
#include <linux/version.h>
int tracepoint__sched__sched_process_exec(
struct tracepoint__sched__sched_process_exec* args
) {
struct task_struct* task = (typeof(task))bpf_get_current_task();
const struct mm_struct* mm = task->mm;
unsigned long env_start = mm->env_start;
unsigned long env_end = mm->env_end;
// Read up to 512 environment variables -- only way I could find to "limit"
// the loop to satisfy the verifier.
char var[12];
for (int n = 0; n < 512; n++) {
int result = bpf_probe_read_str(&var, sizeof var, (void*)env_start);
if (result <= 0) {
break;
}
env_start += result;
if (
var[0] == 'H' &&
var[1] == 'I' &&
var[2] == 'S' &&
var[3] == 'T' &&
var[4] == 'S' &&
var[5] == 'I' &&
var[6] == 'Z' &&
var[7] == 'E'
) {
bpf_trace_printk("Got it: %s\n", var);
break;
}
}
return 0;
}
Basic loader program for reproducing:
#!/usr/bin/env python3
import sys
from bcc import BPF
if __name__ == '__main__':
source = open("./foo.c").read()
try:
BPF(text=source.encode("utf-8")).trace_print()
except Exception as e:
error = str(e)
sys.exit(error)
bpf: Argument list too long. Program too large (103 insns), at most 4096 insns
Looking at the error message, my guess would be that your program has 103 instructions and it's rejected because it's too complex. That is, the verifier gave up before analyzing all instructions on all paths.
On Linux 5.15 with a privileged user, the verifier gives up after reading 1 million instructions (the complexity limit). Since it has to analyze all paths through the program, a program with a small number of instructions can have a very high complexity. That's particularly the case when you have loops and many conditions, as is your case.
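A rough way to see how a short program can become very complex is to count control-flow paths in a deliberately naive model. The sketch below assumes the loop is fully unrolled and that the verifier explores every path with no state pruning, which the real verifier does not do; the function name count_paths and the model itself are illustrative assumptions, and the real limit counts instructions processed rather than paths.

```python
def count_paths(iterations, conditions=8):
    """Count control-flow paths through `iterations` unrolled loop bodies.

    Naive model: each iteration can leave the loop in two ways (the read
    failed, or every comparison matched), or continue to the next
    iteration by failing at any one of the `conditions` comparisons.
    """
    paths = 1  # falling out of the loop after the last iteration
    for _ in range(iterations):
        paths = 2 + conditions * paths
    return paths

# Find how many iterations it takes for the path count to pass one million.
n = 0
while count_paths(n) <= 1_000_000:
    n += 1
print(n, count_paths(n))
```

Even under this crude model, a handful of iterations is enough to exceed a million paths, long before the 512-iteration bound in the loop is reached.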
Why is the error message confusing? This error message is coming from libbpf.c:
if (ret < 0 && errno == E2BIG) {
fprintf(stderr,
"bpf: %s. Program %s too large (%u insns), at most %d insns\n\n",
strerror(errno), attr->name, insns_cnt, BPF_MAXINSNS);
return -1;
}
Since the bpf(2) syscall returns E2BIG both when the program is too large and when its complexity is too high, libbpf prints the same error message for both cases, always with at most 4096 instructions. I'm confident upstream would accept a patch to improve that error message.

getting illegal instructions when vectorized code writes to PCI

I am writing a program that writes to a device's range of HW registers. I am using mmap to map the HW addresses to virtual address (user space). I tested the result from the mmap and it is OK. I implemented a copy of a buffer into the device:
void bufferCopy(void *dest, void *src, const size_t size) {
uint8_t *pdest = static_cast<uint8_t *>(dest);
uint8_t *psrc = static_cast<uint8_t *>(src);
size_t iters = 0, tailBytes = 0;
/* iterate 64bit */
iters = (size / sizeof(uint64_t));
for (size_t index = 0; index < iters; ++index) {
*(reinterpret_cast<uint64_t *>(pdest)) =
*(reinterpret_cast<uint64_t *>(psrc));
pdest += sizeof(uint64_t);
psrc += sizeof(uint64_t);
}
.
.
.
But when running it on QEMU I get an illegal instruction exception. When I debugged it, I found that it crashes on the following instruction (below is the asm of the main loop):
movdqu (%rsi,%rax,1),%xmm0
movups %xmm0,(%rdi,%rax,1) <----- this instruction crashes ...
add $0x10,%rax
cmp %rax,%r9
jne 0x7ffff7eca1e0 <_ZN12_GLOBAL__N_110bufferCopyEPvS0_m+64>
Any ideas why? My guess is that you can only write to PCI in 32/64-bit units.
The compiler doesn't know my limitations, so it optimizes my code and creates a vectorized loop (each iteration loads 128 bits and stores 128 bits). Does this make sense? Can I write to PCI with vector instructions?
Also, whether this is a missing feature in QEMU, a bug, or just a recommendation, how can I prevent the compiler from generating those vector instructions?

How to benefit from heap tagging by DLL?

How do I use and benefit from the GFlags setting Enable heap tagging by DLL?
I know how to activate the setting for a process, but I did not find useful information in the output of !heap -t in WinDbg. I was expecting some output like this:
0:000> !heap -t
Index Address Allocated by
1: 005c0000 MyDll.dll
2: 006b0000 AnotherDll.dll
so that I can identify which heap was created by which DLL and then e.g. identify the source of a memory leak.
Is this a misunderstanding of the term "heap tagging by DLL" or do I need some more commands to get to the desired result?
My research so far:
I googled for a tutorial on this topic, but I couldn't find a detailed description.
I read WinDbg's .hh !heap, but it's not explained there in detail either. Tag is only used in !heap -b.
Again, a very late answer.
To benefit from heap tagging you first need to create a tag in your code.
As far as I know (that is, up to XP SP3) there were no documented APIs to create a tag.
(I haven't mucked with the heap since then, so I am not aware of the latest APIs in OS versions > Vista. The heap manager was rewritten, so many of the "features" I post below might have been corrected or improved, or the bugs removed.)
In XP SP3 you can use the undocumented RtlCreateTagHeap to create a new tag in either the process heap or a private heap.
After you create the tag you need to set the global flags 8000 | 800:
htg - Enable heap tagging
htd - Enable heap tagging by DLL
Theoretically, all allocs and frees must then get tagged.
In practice, only allocations > 512 kB get tagged in XP SP3 with these basic steps.
It is either a bug or a feature that limits tagging to allocations and frees > 512 kB.
HeapAlloc goes through ZwAllocateVirtualMemory for allocations > 512 kB in a 32-bit process; refer to the HeapCreate / HeapAlloc documentation on MSDN.
As a debugging aid you can patch ntdll.dll on the fly to enable tagging for all allocations and frees.
Below is sample code that demonstrates the tagging and how to view it all in WinDbg.
Compile using cl /Zi /analyze /W4 <src> /link /RELEASE
Use WinDbg to execute the app and watch tagging with the !heap * -t command.
#include <windows.h>
#include <stdio.h>
// Heap tags are kind of broken, or they are intentionally given only to
// allocations > 512 kB. Allocations > 512 kB go through the VirtualAlloc
// route for a heap created with maxsize set to 0.
// Uncomment ALLOCSIZE 0xfdfd2 and recompile to watch tagging increase by
// 100%. With ALLOCSIZE 0xfdfd1 only the 50 allocs and frees that are
// > 512 kB will be tagged. These magic numbers are related to the comment
// in the HeapCreate documentation stating that slightly less than 512 kB
// will be allocated for a 32-bit process.
// Tagging can be dramatically increased by patching ntdll when stopped
// on the system breakpoint: patch 7c94b8a4 (XP SP3 ntdll.dll).
// Use the command below in WinDbg to find the offset of the pattern
// (the command must be on a single line, no line breaks):
// .foreach /pS 4 /ps 4 ( place { !grep -i -e call -c
// "# call*RtlpUpdateTagEntry 7c900000 l?20000" } ) { ub place }
// The instruction we are searching for to patch is
// 7c94b8a1 81e3ff0fffff and ebx,0FFFF0FFFh
// Patch 0f to 00 at the system breakpoint with eb 7c94b8a1+3 00
#define BUFFERSIZE 100
#define ALLOCSIZE 0xfdfd1
//#define ALLOCSIZE 0xfdfd2
typedef int ( __stdcall *g_RtlCreateTagHeap) (
HANDLE hHeap ,
void * unknown,
wchar_t * BaseString,
wchar_t * TagString
);
void HeapTagwithHeapAllocPrivate()
{
PCHAR pch[BUFFERSIZE] = {};
HANDLE hHeap = 0;
ULONG tag1 = 0;
ULONG tag2 = 0;
ULONG tag3 = 0;
ULONG tag4 = 0;
ULONG tag5 = 0;
g_RtlCreateTagHeap RtlCreateTagHeap = 0;
HMODULE hMod = LoadLibrary("ntdll.dll");
if(hMod)
{
RtlCreateTagHeap = (g_RtlCreateTagHeap)
GetProcAddress( hMod,"RtlCreateTagHeap");
}
if (hHeap == 0)
{
hHeap = HeapCreate(0,0,0);
if (RtlCreateTagHeap != NULL)
{
tag1 = RtlCreateTagHeap (hHeap,0,L"HeapTag!",L"MyTag1");
tag2 = RtlCreateTagHeap (hHeap,0,L"HeapTag!",L"MyTag2");
tag3 = RtlCreateTagHeap (hHeap,0,L"HeapTag!",L"MyTag3");
tag4 = RtlCreateTagHeap (hHeap,0,L"HeapTag!",L"MyTag4");
}
}
HANDLE DefHeap = GetProcessHeap();
if ( (RtlCreateTagHeap != NULL) && (DefHeap != NULL ))
{
tag5 = RtlCreateTagHeap (DefHeap,0,L"HeapTag!",L"MyTag5");
for ( int i = 0; i < BUFFERSIZE ; i++ )
{
pch[i]= (PCHAR) HeapAlloc( DefHeap,HEAP_ZERO_MEMORY| tag5, 1 );
HeapFree(DefHeap,NULL,pch[i]);
}
}
if(hHeap)
{
for ( int i = 0; i < BUFFERSIZE ; i++ )
{
pch[i]= (PCHAR) HeapAlloc( hHeap,HEAP_ZERO_MEMORY| tag1, 1 );
//lets leak all allocs patch ntdll to see the tagging details
//HeapFree(hHeap,NULL,pch[i]);
}
for ( int i = 0; i < BUFFERSIZE ; i++ )
{
pch[i]= (PCHAR) HeapAlloc( hHeap,HEAP_ZERO_MEMORY| tag2, 100 );
// lets leak 40% allocs patch ntdll to see the tagging details
if(i >= 40)
HeapFree(hHeap,NULL,pch[i]);
}
// slightly less than 512 kb no tagging
for ( int i = 0; i < BUFFERSIZE / 2 ; i++ )
{
pch[i]= (PCHAR) HeapAlloc(
hHeap,HEAP_ZERO_MEMORY| tag3, ALLOCSIZE / 2 );
}
// > 512 kb default tagging
for ( int i = BUFFERSIZE / 2; i < BUFFERSIZE ; i++ )
{
pch[i]= (PCHAR) HeapAlloc(
hHeap,HEAP_ZERO_MEMORY | tag4 ,ALLOCSIZE );
}
for (int i =0 ; i < BUFFERSIZE ; i++)
{
HeapFree(hHeap,NULL,pch[i]);
}
}
}
void _cdecl main()
{
HeapTagwithHeapAllocPrivate();
}
The compiled exe is to be run with WinDbg as below.
DEFAULT execution and inspection
Only 50 tags will be visible; all of them are > 512 kB allocations.
heaptag:\>cdb -c "g;!heap * -t;q" newheaptag.exe | grep Tag
Tag Name Allocs Frees Diff Allocated
Tag Name Allocs Frees Diff Allocated
Tag Name Allocs Frees Diff Allocated
0004: HeapTag!MyTag4 50 50 0 0
Patching ntdll at the system breakpoint should make all tags visible.
eb = write byte
Patch and run the exe; on exit, inspect the heaps with tags:
heaptag:\>cdb -c "eb 7c94b8a1+3 00;g;!heap * -t;q" newheaptag.exe | grep Tag
Tag Name Allocs Frees Diff Allocated
0012: HeapTag!MyTag5 100 100 0 0 <-our tag in process heap
Tag Name Allocs Frees Diff Allocated
Tag Name Allocs Frees Diff Allocated
0001: HeapTag!MyTag1 100 0 100 3200 <--- leak all
0002: HeapTag!MyTag2 100 60 40 5120 <--- leak 40 %
0003: HeapTag!MyTag3 50 50 0 0 <--- clean < 512 kB
0004: HeapTag!MyTag4 50 50 0 0 <----clean > 512 kB

Perl: Devel::Gladiator module and memory management

I have a perl script that needs to run in the background constantly. It consists of several .pm module files and a main .pl file. The program periodically gathers some data, does some computation, and finally updates the result recorded in a file.
All the critical data structures are declared in the .pl file with our, and no package variables are declared in any .pm file.
I used the function arena_table() in the Devel::Gladiator module to produce some information about the arena in the main loop, and found that the number of SVs of type SCALAR and GLOB increases slowly, resulting in a gradual increase in memory usage.
The output of arena_table (I reformatted it, omitting the titles; after a long enough period, only the first two numbers keep increasing):
2013-05-17#11:24:34 36235 3924 3661 3642 3376 2401 720 201 27 23 18 13 13 10 2 2 1 1 1 1 1 1 1 1 1 1
After running for some time:
2013-05-17#12:05:10 50702 46169 36910 4151 3995 3924 2401 720 274 201 26 23 18 13 13 10 2 2 1 1 1 1 1 1 1 1 1
The main loop is something like:
our %hash1 = ();
our %hash2 = ();
# and some more package variables ...
# all are hashes
do {
my $nowtime = time();
collect_data($nowtime);
if (calculate() == 1) {
update();
}
sleep 1;
get_mem_objects(); # calls arena_table()
} while (1);
Except for get_mem_objects, the functions operate on the global hashes declared with our. In update, the program does some log rotation; the code looks like this:
sub rotate_history() {
my $i = $main::HISTORY{'count'};
if ($i == $main::CONFIG{'times'}{'total'}) {
for ($i--; $i >= 1; $i--) {
$main::HISTORY{'data'}{$i} = dclone($main::HISTORY{'data'}{$i-1});
}
} else {
for (; $i >= 1; $i--) {
$main::HISTORY{'data'}{$i} = dclone($main::HISTORY{'data'}{$i-1});
}
}
$main::HISTORY{'data'}{'0'} = dclone(\%main::COLLECT);
if ($main::HISTORY{'count'} < $main::CONFIG{'times'}{'total'}) {
$main::HISTORY{'count'}++;
}
}
If I comment out the calls to this function, then in the final report given by Devel::Gladiator only the number of SVs of type SCALAR keeps increasing; the number of GLOBs eventually reaches a stable state. I suspect that dclone may be causing the problem here.
My questions are:
What exactly does the information given by that module mean? The statements in the perldoc are a little vague for a perl newbie like me.
And what are the common techniques for lowering the memory usage of long-running perl scripts?
I know that package variables are stored in the arena, but what about lexical variables? How is the memory consumed by them managed?

Why system doesn't return main's value?

[root# test]$ cat return10.c
#include <stdio.h>
int main(int argc, char *argv[]){
return 10;
}
[root# test]$ perl -e 'print system("/path_to_return10")'
2560
I was expecting 10 but got 2560. Why?
See $? in perldoc perlvar.
You got 10 * 256 (return value = 10) + 0 * 128 (there was no core dump) + 0 (process wasn't killed by signal).
As specified in the documentation for the system function in perl (http://perldoc.perl.org/functions/system.html):
The return value is the exit status of the program as returned by the
wait call. To get the actual exit value, shift right by eight (see
below).
Indeed: 2560 >> 8 = 10
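The same decoding can be written out as a small sketch, here in Python purely to show the bit arithmetic. The function name decode_wait_status is made up for illustration; the layout (exit code in the high byte, signal number in the low 7 bits, core dump flag in bit 7) follows the breakdown given above.

```python
def decode_wait_status(status):
    """Split a 16-bit wait status into (exit_code, signal, core_dumped)."""
    exit_code = (status >> 8) & 0xFF   # high byte: value returned by main
    signal = status & 0x7F             # low 7 bits: terminating signal, if any
    core_dumped = bool(status & 0x80)  # bit 7: set when a core dump occurred
    return exit_code, signal, core_dumped

result = decode_wait_status(2560)
print(result)  # (10, 0, False)
```

In Perl the equivalent expressions are $? >> 8, $? & 127 and $? & 128.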