Finding out the call site from hex representation - windbg

I'm trying to analyse a crash dump of the MS BizTalk service, which is constantly consuming 100% CPU (and I assume that's because of our code :) ). I have a couple of dumps and the stack trace of the busiest threads looks similar - the only problem is that the top of the stack seems to be missing symbols. It looks like this:
0x642`810b2fd0
So, the question is - how can I find out the module/function from this address? (or at least the module, so that I know what symbol file is missing).

lm in WinDbg dumps the list of modules. In your case WinDbg does not find any module that occupies this address -- otherwise it would have printed +. Some of the libraries generate code dynamically; in this case the body of the function will be placed in the heap and won't have any symbols or even a module associated with it. I know MCF at some point did this.
I suggest you try to analyze the frames at the top of the stack that have symbols and try to find out what they might be doing.

Wish I could help more, but the only thing I can suggest is reading this cheat sheet of WinDbg commands. There is one command wt which has a list of params which could help with getting module information about that call site.
Let me know if this is any use for you.


How to pinpoint where in the program a crash happened using .dmp and WinDbg?

I have a huge application (made in PowerBuilder) that crashes every once in a while, so it is hard to reproduce this error. We have it set up so that when a crash like this occurs, we receive a .dmp file.
I used WinDbg to analyze my .dmp file with the command !analyze -v. From this I can deduce that the error that occurred was an Access Violation C0000005. Based on the [0] and [1] parameters, it attempted to dereference a null pointer.
WinDbg also showed me STACK_TEXT consisting of around 30 lines, but I am not sure how to read it. From what I have seen I need to use some sort of symbols.
First line of my STACK_TEXT is this:
00000000`00efca10 00000000`75d7fa46 : 00000000`10df1ae0 00000000`0dd62828 00000000`04970000 00000000`10e00388 : pbvm!ob_get_runtime_class+0xad
From this, my goal is to analyze this file to figure out where exactly in the program this error happened or which function it was in. Is this something I will be able to find after further analyzing the stack trace?
How can I pinpoint where in the program a crash happened using .dmp and WinDbg so I can fix my code?
If you analyze a crash dump with !analyze -v, the lines after STACK_TEXT are the stack trace. The output is equivalent to kb, given you set the correct thread and context.
The output of kb is
Child EBP
Return address
First 4 values on the stack
Symbol
The backticks ` tell you that you are running in 64 bit and they split the 64 bit values in the middle.
On 32 bit, the first 4 parameters on the stack were often equivalent to the first 4 parameters to the function, depending on the calling convention.
On 64 bit, the stack is not so relevant any more, because with the 64 bit calling convention, parameters are passed via registers. Therefore you can probably ignore those values.
The interesting part is the symbol like pbvm!ob_get_runtime_class+0xad.
In front of ! is the module name, typically a DLL or EXE name. Look for something that you built. After the ! and before the + is a method name. After the + is the offset in bytes from the beginning of the function.
As long as you don't have functions with thousands of lines of code, that number should be small, like < 0x200. If the number is larger than that, it typically means that you don't have correct symbols. In that case, the method name is no longer reliable, since it's probably just the last known (the last exported) method name and a faaaar way from there, so don't trust it.
In case of pbvm!ob_get_runtime_class+0xad, pbvm is the DLL name, ob_get_runtime_class is the method name and +0xad is the offset within the method where the instruction pointer is.
To me (not knowing anything about PowerBuilder) PBVM sounds like the PowerBuilder DLL implementation for Virtual Memory. So that's not your code, it's the code compiled by Sybase. You'd need to look further down the call stack to find the culprit code in your DLL.
After reading Wikipedia, it seems that PowerBuilder does not necessarily compile to native code, but to intermediate P-Code instead. In this case you're probably out of luck, since your code is never really on the call stack and you need a special debugger or a WinDbg extension (which might not exist, like for Java). Run it with the -pbdebug command line switch or compile it to native code and let it crash again.
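As an added example (not code from the answer above), a small Python parser for 64-bit kb / STACK_TEXT lines: it splits each frame into child SP, return address and module!function+offset, treats a bare address as "no module found", and applies the answer's 0x200 rule of thumb for untrustworthy symbols. The sample stack is constructed; only its first line is the question's own frame.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Constructed sample of 64-bit "kb" / STACK_TEXT output: the first frame is the question's
# line, the other two are made up to show an oversized offset and a bare address.
SAMPLE_STACK = """\
00000000`00efca10 00000000`75d7fa46 : 00000000`10df1ae0 00000000`0dd62828 00000000`04970000 00000000`10e00388 : pbvm!ob_get_runtime_class+0xad
00000000`00efca50 00000000`75d80123 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : pbvm!ob_call_method+0x1f3c
00000000`00efcaa0 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x642`810b2fd0
"""

# Child SP, return address, four stack values, then the symbol (or a bare address).
FRAME_RE = re.compile(
    r"^(?P<child>[0-9a-f`]+)\s+(?P<ret>[0-9a-f`]+)\s*:\s*"
    r"(?:[0-9a-f`]+\s+){3}[0-9a-f`]+\s*:\s*(?P<sym>\S+)$", re.I)
# module!function+0xoffset; the "+0x..." part is optional.
SYMBOL_RE = re.compile(r"^(?P<module>[^!]+)!(?P<func>[^+]+)(?:\+0x(?P<off>[0-9a-f]+))?$", re.I)
# Rule of thumb from the answer: above this, the name is probably just the nearest export.
MAX_TRUSTED_OFFSET = 0x200

@dataclass
class Frame:
    child_sp: int
    ret_addr: int
    module: Optional[str]   # None when WinDbg printed a bare address (no module found)
    function: Optional[str]
    offset: Optional[int]

    def symbol_trusted(self) -> bool:
        return self.module is not None and self.offset <= MAX_TRUSTED_OFFSET

def parse_addr(text: str) -> int:
    return int(text.replace("`", ""), 16)   # backticks only split 64-bit values visually

def parse_frame(line: str) -> Optional[Frame]:
    m = FRAME_RE.match(line.strip())
    if not m:
        return None
    module = func = offset = None
    s = SYMBOL_RE.match(m.group("sym"))
    if s:
        module, func = s.group("module"), s.group("func")
        offset = int(s.group("off"), 16) if s.group("off") else 0
    return Frame(parse_addr(m.group("child")), parse_addr(m.group("ret")), module, func, offset)

def parse_stack(text: str) -> List[Frame]:
    return [f for f in map(parse_frame, text.splitlines()) if f]   # skips non-frame lines
```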

Execute Commands in the Linux Commandline [Lazarus / Free Pascal]

I have a problem. I want to execute some commands in the command line of Linux. I tested TProcess (so I am using Lazarus), but now when I start the program, the program does nothing.
Here is my Code:
uses [...], unix, process;
[...]
var
  LE_Path: TLabeledEdit;
  Pro1: TProcess;   // was not declared in the posted snippet
[...]
Pro1 := TProcess.Create(nil);
Pro1.CommandLine := 'sudo open' + LE_Path.Text;   // note: no space between "open" and the path
Pro1.Options := Pro1.Options;   // Options were set here earlier (omitted)
Pro1.Execute;
Pro1.Free;
With this program, I want to open files with sudo (the program is running in the user interface).
Sorry for my bad English and for mistakes in the question: I am using StackOverflow for the first time.
I guess the solution was a missing space char?
Change
Pro1.CommandLine:=(('sudo open'+LE_Path.Text));
to
Pro1.CommandLine:=(('sudo open '+LE_Path.Text));
# ----------------------------^--- added this space char.
But if you're a beginner programmer, my other comments are still worth considering:
trying to use sudo in your first bit of code may be adding a whole extra set of problems. SO... Get something easier to work first, maybe
/bin/ls -l /path/to/some/dir/that/has/only/a/few/files.
find out how to print a statement that will be executed. This is the most basic form of debugging and any language should support that.
Your English communicated your problem well enough, and by including sample code and a reasonable (not perfect) problem description "we" were able to help you. In general, a good question contains the fewest number of steps to re-create the problem. OR, if you're trying to manipulate data,
a. small sample input,
b. sample output from that same input
c. your "best" code you have tried
d. your current output
e. your thoughts about why it is not working
AND comments to indicate generally other things you have tried.

Need help debugging a minidump with WinDbg

I've read a lot of similar questions, but I can't seem to find an answer to exactly what my problem is.
I've got a set of minidumps from a 32-bit application that was running on 64-bit Windows 2008. The 32-bit Visual Studio on my 32-Bit Vista Business wouldn't touch them at all, so I've been trying to open them in WinDbg.
I don't have the EXACT corresponding .pdb files (we only started saving them AFTER this particular release), but I have .pdbs built by the same machine with the same code. I also have access to the exact executable that created the minidumps.
I found a nifty little application called ChkMatch that can make .pdbs match an executable... the only difference (according to ChkMatch) was age, so I matched my newer .pdbs to the original executable.
However, when I load it in WinDbg, it still says that it is a "mismatched pdb" then, since I had set .symopts+0x40 it tries to load them anyway. I then get the warning:
*** WARNING: Unable to verify checksum for myexe.exe
I ran !lmi myexe and saw that, indeed, the checksum of the executable was in fact zero. From poking around a bit, I've found that the executable should have been built with the /release flag to have a checksum. That's all well and good, but I can't exactly go back in time and rebuild (if I did though, I'd definitely save the original .pdbs :-P ).
Is there anything I can do here? Seems a little ridiculous I can't make things match here at least enough to get a call stack.
you don't need the checksum to get a call stack - this warning can be safely ignored.
to get the stack you need to issue the stack command (any variant of k).
if the minidumps are any good (i.e. describe an actual fault), you should first try the auto analysis !analyze -v which will get you started.
come back when you have exhausted your expertise :o)
If you're working with minidumps then you have to set your image path (Ctrl+I) to point to a location with the images in the dump. The trouble with minidumps is that they don't contain any code or data from the executables on the target, so you have to supply them yourself.
-scott

WinDbg, display Symbol Server paths of loaded modules (even if the symbols did not load)?

Is there a way from WinDbg, without using the DbgEng API, to display the symbol server paths (i.e. PdbSig70 and PdbAge) for all loaded modules?
I know that
lml
does this for the modules whose symbols have loaded. I would like to know these paths for the symbols that did not load so as to diagnose the problem. Anyone know if this is possible without having to utilize the DbgEng API?
edited:
I also realize that you can use
!sym noisy
to get error messages about symbols loading. While this does have helpful output it is interleaved with other output that I want and is not simple and clear like 'lml'
!sym noisy and !sym quiet can turn on additional output for symbol loading, i.e.:
!sym noisy
.reload <dll>
X <some symbol in that DLL to cause a load>
!sym quiet
When the debugger attempts to load the PDB you will see every path that it tries to load and whether PDBs weren't found or were rejected.
To my knowledge there's no ready solution in windbg.
Your options would be to either write a nifty script or an extension dependent on where you're the fittest.
It is pretty doable within windbg as a script. The information you're after is described in the PE debug directory.
Here's a link to the c++ sample code that goes into detail on extracting useful information (like the name of the symbol file in your case). Adapting it to windbg script should be no sweat.
Here's another useful pointer with tons of information on automating windbg. In particular, it talks about ways of passing arguments to windbg scripts (which is useful in your case as well, to have a common debug info extraction code which you can invoke from within the loaded modules iteration loop).
You can use the command
lme
to show the modules that did not have any symbols loaded.
http://ntcoder.com/bab/tag/lme/

Finding a Perl memory leak

SOLVED see Edit 2
Hello,
I've been writing a Perl program to handle automatic upgrading of local (proprietary) programs (for the company I work for).
Basically, it runs via cron, and unfortunately has a memory leak (or something similar). The problem is that the leak only happens when I'm not looking (aka when run via cron, not via command line).
My code does not contain any circular (or other) references, so the commonly cited tools will not help me (Devel::Cycle, Devel::Peek).
How would I go about figuring out what is using so much memory that the kernel kills it?
Basically, the code SFTPs into a server (using backticks, `sftp ...`), calls OpenSSL to verify the file, and then SFTPs more if more files are needed, and installs them (untars them).
I have seen delays (~15 sec) before the first SFTP session, but it has never used so much memory as to be killed (in my presence).
If I can't sort this out, I'll need to re-write in a different language, and that will take precious time.
Edit: The following message is printed out by the kernel which led me to believe it was a memory leak:
[100023.123] Out of memory: kill process 9568 (update.pl) score 325406 or a child
[100023.123] Killed Process 9568 (update.pl)
I don't believe it is an issue with cron because of the stalling (for ~15 sec, sometimes) when running it via the command-line. Also, there are no environmental variables used (at least by what I've written, maybe underlying things do?)
Edit 2: I found the issue myself, with help from the below comment by mobrule (in response to this question). It turns out that the script was called from a crontab of a user (non-root) just once a day and that (non-root privs) caused a special infinite loop situation.
Sorry guys, I feel kinda stupid for not finding this before, but thanks.
mobrule, if you submit your comment as an answer, I will accept it as it led to me finding the problem.
End Edits
Thanks,
Brian
P.S. I may be able to post small snippets of code, but not the whole thing due to company policy.
You could try using Devel::Size to profile some of your objects. e.g. in the main:: scope (the .pl file itself), do something like this:
use Devel::Size qw(total_size);
no strict 'refs';   # needed: $$varname below is a symbolic reference to the named variable
foreach my $varname (qw(varname1 varname2)) {
    print "size used for variable $varname: " . total_size($$varname) . "\n";
}
Compare the actual size used to what you think is a reasonable value for each object. Something suspicious might pop out immediately (e.g. a cache that is massively bloated beyond anything that sounds reasonable).
Other things to try:
Eliminate bits of functionality one at a time to see if suddenly things get a lot better; I'd start with the use of any external libraries
Is the bad behaviour localized to just one particular machine, or one particular operating system? Move the program to other systems to see how its behaviour changes.
(In a separate installation) try upgrading to the latest Perl (5.10.1), and also upgrade all your CPAN modules
How do you know that it's a memory leak? I can think of many other reasons why the OS would kill a program.
The first question I would ask is "Does this program always work correctly from the command line?". If the answer is "No" then I'd fix these issues first.
On the other hand if the answer is "Yes", I would investigate all the differences between having the program executed under cron and from the command line to find out why it is misbehaving.
If it is run by cron, shouldn't it die after each iteration? If that is the case, it's hard for me to see how a memory leak would be a big deal...
Are you sure it is the script itself, and not the child processes, that is using the memory? Perhaps it ends up creating a real lot of ssh sessions, instead of doing a bunch of stuff in one session?