How to avoid ChildEBP and RetAddr in WinDbg call stack? - windbg

If one outputs a call stack in WinDbg using the k command, the output includes two columns, ChildEBP and RetAddr, at the beginning:
ChildEBP RetAddr
0151d9c8 55c59339 KERNELBASE!RaiseException+0x48
0151da08 00e15b3a msvcr120!_CxxThrowException+0x5b [f:\dd\vctools\crt\crtw32\eh\throw.cpp # 152]
...
I have many crash dumps which I would like to cluster by the similarity of their call stacks, and the ChildEBP and RetAddr addresses are preventing me from doing it: they differ even if the call stacks are actually the same, just because the DLLs were loaded at different addresses.
It is clear that such things can be removed by some simple text processing, but maybe there is some command in WinDbg which allows showing call stacks without ChildEBP and RetAddr, like this:
KERNELBASE!RaiseException+0x48
msvcr120!_CxxThrowException+0x5b [f:\dd\vctools\crt\crtw32\eh\throw.cpp # 152]
...
?

As mentioned by Sean Cline in the comments already, kc displays a clean stack.
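For stacks that were already saved as text with k, the "simple text processing" mentioned in the question could look roughly like the sketch below. It is only an illustration: the function name strip_address_columns and the regular expression are assumptions made here, and the pattern only expects two leading hex columns (8 digits, or 8 digits, a backtick and 8 more digits on 64 bit).

```python
import re

# Two leading hex columns (ChildEBP/Child-SP and RetAddr); the optional
# backtick part covers the 64-bit notation such as 00000000`00efca10.
_ADDR_COLUMNS = re.compile(r"^(?:[0-9a-fA-F]{8}(?:`[0-9a-fA-F]{8})?\s+){2}")


def strip_address_columns(stack_text):
    """Return the frames of a saved `k` listing without the two address columns."""
    frames = []
    for line in stack_text.splitlines():
        line = line.strip()
        if not line or line.startswith("Child"):
            continue  # skip blank lines and the column header row
        frames.append(_ADDR_COLUMNS.sub("", line))
    return frames


sample = r"""ChildEBP RetAddr
0151d9c8 55c59339 KERNELBASE!RaiseException+0x48
0151da08 00e15b3a msvcr120!_CxxThrowException+0x5b [f:\dd\vctools\crt\crtw32\eh\throw.cpp # 152]"""

for frame in strip_address_columns(sample):
    print(frame)
```

When the dump can still be opened in WinDbg, kc remains the simpler route; the script is only useful for logs that already exist.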

Related

How to pinpoint where in the program a crash happened using .dmp and WinDbg?

I have a huge application (made in PowerBuilder) that crashes every once in a while, so it is hard to reproduce this error. We have it set up so that when a crash like this occurs, we receive a .dmp file.
I used WinDbg to analyze my .dmp file with the command !analyze -v. From this I can deduce that the error that occurred was an Access Violation C0000005. Based on the [0] and [1] parameters, it attempted to dereference a null pointer.
WinDbg also showed me STACK_TEXT consisting of around 30 lines, but I am not sure how to read it. From what I have seen I need to use some sort of symbols.
First line of my STACK_TEXT is this:
00000000`00efca10 00000000`75d7fa46 : 00000000`10df1ae0 00000000`0dd62828 00000000`04970000 00000000`10e00388 : pbvm!ob_get_runtime_class+0xad
From this, my goal is to analyze this file to figure out where exactly in the program this error happened or which function it was in. Is this something I will be able to find after further analyzing the stack trace?
How can I pinpoint where in the program a crash happened using .dmp and WinDbg so I can fix my code?
If you analyze a crash dump with !analyze -v, the lines after STACK_TEXT are the stack trace. The output is equivalent to kb, provided you have set the correct thread and context.
The output of kb consists of these columns, in order: child EBP, return address, the first 4 values on the stack, and the symbol.
The backticks ` tell you that you are running in 64 bit and they split the 64 bit values in the middle.
On 32 bit, the first 4 parameters on the stack were often equivalent to the first 4 parameters to the function, depending on the calling convention.
On 64 bit, the stack is not so relevant any more, because with the 64 bit calling convention, parameters are passed via registers. Therefore you can probably ignore those values.
The interesting part is the symbol like pbvm!ob_get_runtime_class+0xad.
In front of ! is the module name, typically a DLL or EXE name. Look for something that you built. After the ! and before the + is a method name. After the + is the offset in bytes from the beginning of the function.
As long as you don't have functions with thousands of lines of code, that number should be small, like < 0x200. If the number is larger than that, it typically means that you don't have correct symbols. In that case, the method name is no longer reliable, since it's probably just the last known (the last exported) method name and the actual code is far away from it, so don't trust it.
In case of pbvm!ob_get_runtime_class+0xad, pbvm is the DLL name, ob_get_runtime_class is the method name and +0xad is the offset within the method where the instruction pointer is.
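The reading rule above (module before the !, method between ! and +, offset after the +, and distrust of large offsets) can be sketched in a few lines of Python. The function name parse_symbol and the dictionary layout are assumptions made for this illustration, and the 0x200 threshold is just the rule of thumb from the text, not a hard limit.

```python
import re

# module!function+0xoffset, where the offset part is optional.
_FRAME = re.compile(
    r"^(?P<module>[^!\s]+)!(?P<function>[^+\s]*)(?:\+0x(?P<offset>[0-9a-fA-F]+))?$"
)


def parse_symbol(symbol, max_plausible_offset=0x200):
    """Split a WinDbg symbol into module, function and offset.

    Returns None if the text does not look like module!function+0xNN.
    'suspicious' is True when the offset exceeds the rule-of-thumb limit,
    which hints at missing or wrong symbols.
    """
    match = _FRAME.match(symbol.strip())
    if match is None:
        return None
    offset = int(match.group("offset"), 16) if match.group("offset") else 0
    return {
        "module": match.group("module"),
        "function": match.group("function"),
        "offset": offset,
        "suspicious": offset > max_plausible_offset,
    }


print(parse_symbol("pbvm!ob_get_runtime_class+0xad"))
```

The pattern is deliberately simple and will not handle every decorated C++ name (for example names containing spaces or a + sign).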
To me (not knowing anything about PowerBuilder), PBVM sounds like the PowerBuilder virtual machine DLL. So that's not your code, it's code compiled by Sybase. You'd need to look further down the call stack to find the culprit code in your DLL.
After reading Wikipedia, it seems that PowerBuilder does not necessarily compile to native code, but to intermediate P-Code instead. In this case you're probably out of luck, since your code is never really on the call stack and you need a special debugger or a WinDbg extension (which might not exist, like for Java). Run it with the -pbdebug command line switch or compile it to native code and let it crash again.

Function address not in the range of memory-mapped files for the lisp image process

After defining and disassembling the function fn, I can see that the function (or code component) resides at memory address 0x53675216. But I don't see said memory address to be in the range of memory-mapped files attributed to the lisp image process (I'm using SBCL).
Am I missing something about how the internals of a process work?
FWIW My goal was to dump the entire memory of the process and inspect some of the memory. But if I can't even get at a function that I defined, what's the point?
Please post the actual text rather than a picture of the text.
/proc/<pid>/map_files is not the right thing to look at: instead look at /proc/<pid>/maps which shows you all the memory maps.
In my case if I define & compile foo on SBCL on x64 / Linux as:
(defun foo ())
Then (disassemble 'foo) looks like:
; disassembly for foo
; Size: 21 bytes. Origin: #x53624A7C ; foo
[...]
And I can use my hexdump-thing function from this answer to check this:
> (hexdump-thing #'foo)
lowtags: 1011
function: 0000000053624A6B : 0000000053624A60
So the actual address of the object is #x0000000053624A60, which is compatible with what disassemble is saying.
So then if I look at /proc/<pid>/maps I see, among all the other lines, two lines like:
52a00000-533f8000 rwxp 016a8000 fd:00 2758077 /local/environments/sbcl/lib/sbcl/sbcl.core
533f8000-5ac00000 rwxp 00000000 00:00 0
The fields in this file are address, permissions, offset, device, inode, file. You can see that the range that includes the function's address is not mapped to any file (note that p means 'copy on write' so the range mapped to the core file will never be written back to the core file).
The function's definition lives somewhere in this anonymous range of memory.
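The same lookup can be automated. The sketch below parses lines in the /proc/<pid>/maps format and reports which mapping contains a given address; the helper names parse_maps_line and find_mapping are made up for this example, and the sample text is just the two lines quoted above.

```python
def parse_maps_line(line):
    """Parse one line of /proc/<pid>/maps into a dictionary."""
    fields = line.split(None, 5)
    start, end = (int(part, 16) for part in fields[0].split("-"))
    return {
        "start": start,
        "end": end,
        "perms": fields[1],
        "offset": int(fields[2], 16),
        "dev": fields[3],
        "inode": int(fields[4]),
        # Anonymous mappings have no sixth field.
        "path": fields[5].strip() if len(fields) > 5 else None,
    }


def find_mapping(maps_text, address):
    """Return the mapping whose [start, end) range contains address, or None."""
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        mapping = parse_maps_line(line)
        if mapping["start"] <= address < mapping["end"]:
            return mapping
    return None


sample = """52a00000-533f8000 rwxp 016a8000 fd:00 2758077 /local/environments/sbcl/lib/sbcl/sbcl.core
533f8000-5ac00000 rwxp 00000000 00:00 0"""

hit = find_mapping(sample, 0x53624A60)
print(hit["path"])  # None means the range is anonymous, not backed by a file
```

On a live Linux system the text would come from reading /proc/<pid>/maps for the SBCL process instead of the sample string.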
As a note: if you want to investigate the memory of the implementation, do it from within the implementation, which will be hugely easier than trying to investigate it from outside. SBCL has lots of support for this sort of thing, although you have to find some of them by grovelling around in the source. After all this sort of thing is exactly what a garbage collector has to do.

Display callstack without method names

In WinDbg, I can get the callstack using the k command. For DLLs without symbols, it displays an incorrect method name and a large offset, e.g.
0018f9f0 77641148 syncSourceDll_x86!CreateTimerSyncBridge+0xc76a
Since I don't have symbols, I have to give this information to the developer of the DLL. I don't know who will work on the bug or how much debugging knowledge they have. I want to keep the developer from thinking that the problem is in the CreateTimerSyncBridge() method.
Is there a way to get the callstack without method names, just with offsets?
At the moment I'm using the following workaround:
0:000> ? syncSourceDll_x86!CreateTimerSyncBridge+0xc76a
Evaluate expression: 1834469050 = 6d57c6ba
0:000> ? syncSourceDll_x86
Evaluate expression: 1834287104 = 6d550000
0:000> ? 6d57c6ba-6d550000
Evaluate expression: 181946 = 0002c6ba
So I can modify the callstack manually to
0018f9f0 77641148 syncSourceDll_x86!+0x2c6ba
But that's really hard to do for a lot of frames in a lot of threads.
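The subtraction in this workaround is easy to script once the module base and the frame address are known. A minimal sketch, assuming both values were already obtained from the debugger (for example with the ? expressions shown above); the function name module_relative is invented here:

```python
def module_relative(module_name, module_base, address):
    """Format an address as module!+0xoffset relative to the module base."""
    offset = address - module_base
    if offset < 0:
        raise ValueError("address lies below the module base")
    return f"{module_name}!+0x{offset:x}"


# Values taken from the WinDbg session above.
print(module_relative("syncSourceDll_x86", 0x6D550000, 0x6D57C6BA))
```

This does not collect the addresses from WinDbg; it only replaces the manual hex arithmetic for each frame.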
You can specify that the symbols must match exactly using a stricter evaluation, either by starting windbg with command line parameter -ses or issuing the command:
.symopt +0x400
This option is off by default in the debugger. If you wish to reset it, just remove the option:
.symopt -0x400
See the msdn docs: https://msdn.microsoft.com/en-us/library/windows/hardware/ff558827(v=vs.85).aspx#symopt_exact_symbols

Set ba (break on access) breakpoint in managed code programmatically

For investigating managed heap corruption I would like to use ba (break on access) breakpoints. Can I use them in managed code? If yes, how can I set them programmatically?
UPDATE: It would also be okay to set them in WinDbg (-> set ba for every object of type XY)
Breakpoints set by the 'ba' command are called processor or hardware breakpoints.
First the good news
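It is easy to set a hardware breakpoint. You need to load one of the processor's debug address registers (DR0, DR1, DR2 or DR3) with the address of the data, and set the fields in the debug control register DR7 that select the size of the memory region and the type of access. The instruction (in x64 assembler) looks like:
MOV DR0, rax
Obviously you will have to somehow execute this assembler instruction from your language of choice, or use interop to C++ and inline assembly, but this is easier than, for example, setting a software breakpoint. Keep in mind that MOV to a debug register is a privileged instruction, so user-mode code cannot run it directly; on Windows, a user-mode program sets the debug registers through the thread context (SetThreadContext with CONTEXT_DEBUG_REGISTERS).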
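To make the DR7 part more concrete, here is a small sketch that computes the DR7 bits for one breakpoint slot, following the x86 debug register layout (local enable bits at positions 0, 2, 4, 6; a 2-bit R/W field and a 2-bit LEN field per slot starting at bit 16). It only does the arithmetic and does not touch any register; the function name dr7_bits and the string keys are assumptions made for this example.

```python
# R/W field encodings: when the breakpoint should trigger.
ACCESS = {"execute": 0b00, "write": 0b01, "readwrite": 0b11}
# LEN field encodings: size of the watched region in bytes.
LENGTH = {1: 0b00, 2: 0b01, 4: 0b11, 8: 0b10}


def dr7_bits(slot, access, size):
    """Return the DR7 bits that enable breakpoint `slot` (0..3) for this thread.

    Execute breakpoints must use size 1 (LEN = 00).
    """
    if slot not in range(4):
        raise ValueError("only DR0..DR3 are available")
    if access == "execute" and size != 1:
        raise ValueError("execute breakpoints require size 1")
    value = 1 << (slot * 2)                      # Ln: local enable bit
    value |= ACCESS[access] << (16 + slot * 4)   # R/Wn: access type
    value |= LENGTH[size] << (18 + slot * 4)     # LENn: region size
    return value


# Break on writes to a 4-byte location whose address is in DR0.
print(hex(dr7_bits(0, "write", 4)))
```

Values for several slots can be combined with a bitwise OR before being written into the thread context.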
Now the bad news
First of all, on SMP machines you will have to do this for all processors that can touch your code. This is probably solvable if you configure processor affinity for your process, or do the debugging on a single-processor machine. Second, there are only 4 debug address registers on the Intel architecture. If you try setting processor breakpoints with WinDbg, after the 4th it will complain Too many data breakpoints for thread N after you hit g.
I assume the whole purpose you are asking about automation is because there are too many objects to set breakpoints by hand. Since you are limited to 4 ba breakpoints anyways, there is not much point in automating this.

WinDbg: print to another window

My WinDbg command window always gets polluted by useless AFW-traces... Therefore I would like to print my own stuff in another window. How can I do that?
No way that I'm aware of. You can use .ofilter to filter the output though, which may be sufficient.