CDB unable to get callstack from crash dump, but Visual Studio can - windbg

I'm trying to write a program to automate getting the call stack from crash dumps. It runs cdb.exe:
cdb.exe -i "{binaries path}" -y "{binaries path}" -srcpath "{source files path}" -z "{dmp file path}" -lines
I then feed some commands to CDB's standard input:
.symfix+ c:\\symcache
.ecxr
k
q
For many dumps this succeeds in printing the call stack, however some dumps do not work. The dumps that don't work output this error:
Unable to load image C:\Windows\System32\igdumd32.dll, Win32 error 0n2
*** ERROR: Module load completed but symbols could not be loaded for igdumd32.dll
However, Visual Studio is able to figure out the call stack just fine. In the Visual Studio call stack, igdumd32.dll is at the bottom of the stack:
igdumd32.dll!0c70c570()
[Frames below may be incorrect and/or missing, no symbols loaded for igdumd32.dll]
I'm not sure if the symbol not loading is the problem or not, but I can't figure out why CDB can't get the call stack while Visual Studio can.
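For reference, the automation described above might look roughly like the following Python sketch. The names build_cdb_args and run_cdb are hypothetical and the paths are placeholders supplied by the caller; only the cdb switches and the debugger commands are taken from the question.

```python
import subprocess

# Debugger commands from the question, sent to cdb on standard input.
CDB_COMMANDS = [
    r".symfix+ c:\\symcache",
    ".ecxr",
    "k",
    "q",
]


def build_cdb_args(binaries, sources, dump, cdb="cdb.exe"):
    """Return the argument list for the cdb invocation shown in the question."""
    return [cdb, "-i", binaries, "-y", binaries,
            "-srcpath", sources, "-z", dump, "-lines"]


def run_cdb(binaries, sources, dump):
    """Run cdb on one dump and return everything it printed."""
    proc = subprocess.Popen(
        build_cdb_args(binaries, sources, dump),
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # keep error lines in order with the stack
        universal_newlines=True,
    )
    # communicate() writes the command script, closes stdin and drains the
    # output, so a long stack listing cannot block on a full pipe.
    output, _ = proc.communicate("\n".join(CDB_COMMANDS) + "\n")
    return output
```

Merging stderr into stdout is a design choice here, so that messages such as "Unable to load image" appear next to the frames they relate to.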

"Frames below may be incorrect and/or missing"
Sometimes this means exactly what it says, annoyingly. I have no idea what VS does to get around it. So far the best workaround I have for cdb is to run kd instead of k (the cdb command, not kd the kernel debugger program!) to get the raw stack data, then discard the junk lines between the useful ones.
You'll probably want to pass a number of lines (in hex) after kd so that the output is long enough to contain the whole call stack.
e.g.
kd 200
Note that this doesn't work for dumps generated from 64-bit processes, because kd uses the wrong word size (as far as I can tell this is a bug). I'm still looking for a clean way to work around this. For a single thread you should be OK with something like:
dps @esp L200
This reads the stack through the esp register, which is not portable but works for me. You may need a different register (for example rsp in a 64-bit process).
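Discarding the junk lines can be scripted. The following Python sketch keeps only the raw stack lines whose value resolved to a module!symbol name. The function name symbolic_frames is hypothetical, and the assumed line shape (address, value, optional symbol) is my reading of typical dps output rather than a documented format, so adjust the pattern to what your debugger version prints.

```python
import re

# Assumed line shape: "<address>  <value> <module>!<symbol>[+0x<offset>]".
# 64-bit addresses contain a backtick, e.g. 00000000`0012f5a4.
_DPS_LINE = re.compile(
    r"^\s*([0-9a-fA-F`]+)\s+([0-9a-fA-F`]+)\s+(\S+!\S.*)$"
)


def symbolic_frames(dps_output):
    """Return (stack_address, symbol) pairs for lines that name a symbol."""
    frames = []
    for line in dps_output.splitlines():
        match = _DPS_LINE.match(line)
        if match:
            frames.append((match.group(1), match.group(3).strip()))
    return frames
```

Values annotated only as module+offset (modules without symbols) are dropped by this pattern. Not every remaining line is a real return address either, since stale values on the stack can also resolve to symbols, so treat the result as a hint rather than an exact call stack.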

Related

Can't get ebpf program jitted output using bpftool

When I run sudo bpftool prog show I get the following output
39: socket_filter name bpfprog1 tag e29cda32ba011d7f gpl
loaded_at 2019-09-08T14:21:57+0200 uid 1000
xlated 248B jited 169B memlock 4096B map_ids 30
but if I try to get the program's jitted output with the following command
sudo bpftool prog dump jited tag e29cda32ba011d7f
I get an error message, as reported below:
Error: can't get prog info (3): Bad address
QUESTION: what am I doing wrong? XD
You are most likely using a bpftool version compiled from Linux 4.20 or older, and are hitting a bug that was fixed in version 5.0. Update bpftool, and dumping programs by tag should work again.
As a side note, I usually use program IDs or pinned paths, as I find them more convenient for retrieving the program I want. Depending on your use case, tags might make sense, especially if you often load the same programs without changes (so the tags are guaranteed to stay the same) and do not have them pinned.

Python 3 Popen calling rrdtool hangs indefinitely

I am trying to use Python's Popen() to retrieve graph data from multiple rrd files. Because of the complexity of the app in which the following piece of code is used, I rely on the rrdtool graph parameter -Z to handle missing files for me:
#!/bin/python3
import subprocess
cmd = '/opt/rrdtool/bin/rrdtool graph - -a JSONTIME -Z --width 924 --start 1486428000 --end 1486471200 DEF:foo1=ch1.rrd:flows:MAX DEF:foo2=ch2.rrd:flows:MAX AREA:foo1#000:"ch1" AREA:foo2#606060:"ch2":STACK'
path = '/data/live/pokus/rrd/channels/'
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, cwd=path, shell=True)
p.wait()
# compare by value; "is not 0" tests object identity
if p.returncode != 0:
    print("Error")
else:
    print(p.stdout.read().decode(encoding="utf-8"))
The code above works as expected when both files ch1.rrd and ch2.rrd are present. When one of them is missing, the whole thing hangs indefinitely until I kill the rrdtool process manually from htop. Python then detects the nonzero return code and reports an error.
Using shell=False and shlex.split() on cmd does not help.
When I execute the same command from bash, even with the missing files rrdtool does the job as expected.
Unfortunately I can't use the rrdtool bindings for Python, and I am also stuck on Python 3.4.5. The rrdtool version is 1.6.0.
I would be glad for any idea how to overcome this. I would prefer a solution that does not involve testing from Python whether the files exist and that keeps the -Z parameter in the rrdtool command. Using a timeout on p.wait() isn't a viable solution either.
Thanks in advance
Ok, I found the solution.
The reason Python (namely p.wait) hung was that rrdtool did not know the minimum step size (parameter -S), which resulted in a step size of two seconds. With that step size, rrdtool's output was large enough to fill the OS pipe buffer, and that deadlocked p.wait: rrdtool was blocked writing to a full pipe while Python was waiting for it to exit without reading. According to the Python docs, Popen.communicate is the way to go.
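A minimal sketch of that change, assuming the same cmd and path variables as in the question (the helper name run_and_capture is hypothetical). It uses only Popen and communicate, both of which are available on Python 3.4:

```python
import subprocess


def run_and_capture(cmd, cwd=None):
    """Run cmd through the shell and return (returncode, decoded stdout)."""
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, cwd=cwd, shell=True)
    # communicate() reads stdout until EOF and only then waits for the
    # process to exit, so the child never blocks on a full pipe buffer.
    out, _ = p.communicate()
    return p.returncode, out.decode("utf-8")
```

With this helper, the question's script would call run_and_capture(cmd, cwd=path) and check the returned code instead of calling p.wait() followed by p.stdout.read().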

Call stack is shown as hex when using WinDbg to analyze a crash dump

When I use the command below to show the stack, I just get hex addresses, even though the module is loaded (checked with the command lm m xx):
0:014> k
Child-SP RetAddr Call Site
00000000`88f9b0e0 00000000`305e8a60 0x36f038d
00000000`88f9b0e8 00000000`305e8a60 0x305e8a60
Can anybody tell me why?
This is normal for .NET applications, for example. The intermediate code is part of the assembly / DLL, which you can see with lm.
However, the intermediate code never gets executed itself. It is processed by the JIT compiler at runtime. The JIT compiler allocates some memory (outside of the DLL) and emits assembler code there.
Since that part of memory is not related to the DLL immediately, WinDbg shows it as hex addresses only.
To work with .NET, load the SOS extension and use commands like
.loadby sos clr
!dumpstack
!clrstack
or SOSEX with commands like
.load <full path to>\sosex.dll
!mk

Catch only second chance exception with windbg

I need to debug a program running on Windows.
It sometimes crashes with a "memory access violation".
Using WinDbg (using an IDE is not possible), I attached to the running process (it is a requirement that the program must not stop).
The command line is
windbg -g -p <pid>
The problem is that I now catch all first chance exceptions, but I am only interested in second chance exceptions (I do not care which type of exception).
How can I set up WinDbg to catch only second chance exceptions?
WinDbg will catch second chance exceptions by default, so you just need to turn off the first chance exceptions. Doing this for a single type of exception is simple:
0:000> sxd av
0:000> *** Check the setting
0:000> .shell -ci "sx" find "av"
See "set all exceptions" for how to set all exception types to second-chance only.
Since running those commands at debug time does not seem to be an option, you can also configure a Workspace that has first-chance exception handling disabled and then reuse that workspace. For understanding the concept of Workspaces, the MSDN article "Uncovering how Workspaces work" was really helpful. It is a set of experiments that you should do yourself once.
With that background knowledge, attach to any process
0:000> .foreach(exc {sx}) {.catch{sxd ${exc}}}
0:000> *** perhaps some other useful workspace relevant commands here
0:000> *** e.g. .symfix seems useful
0:000> *** File / Save Workspace As ...
0:000> *** Enter a name, e.g. myworkspace
0:000> q
Restart WinDbg with the -W myworkspace command line switch. Attach to any process. Check whether your settings have been applied (e.g. sx, .sympath). If everything is fine, you can start debugging.

How to get SLC.pdb to analyze memory dump

I am using windbg 6.12.0002.633 X86 on Windows Vista to analyze memory dumps for memory leaks.
I'm trying to use the command dumpheap -stat to determine the quantities of objects in the heap.
Unfortunately, I'm getting the error *** ERROR: Symbol file could not be found. Defaulted to export symbols for SLC.dll. I have activated !sym noisy to show where the error comes from, and the file SLC.pdb is just not available on the symbol server.
I have googled the file but haven't found such a downloadable file.
The last line in the log output says: Couldn't resolve error at "mpheap -stat".
I can't proceed debugging because I'm getting this error permanently.
Does anyone know where I can get an SLC.pdb file, or another way to work around this problem?
Writing
dumpheap -stat
will result in
Couldn't resolve error at 'mpheap -stat'
However, this will work:
!dumpheap -stat
Note the exclamation mark !
Your error message seems a little incomplete. The !dumpheap command is part of the SOS extension used to debug managed .NET code under WinDbg. Is that what you're trying to do? You should be able to use the command even without correct PDB files for all modules.
How did you load SOS? Can you use any other SOS commands?