Windbg native call stack trace does not make sense - windbg

I have a simple test program causing an infinite wait on lock.
public class SyncBlock
{
}
class Program
{
public static SyncBlock sync = new SyncBlock();
private static void ThreadProc()
{
try
{
Monitor.Enter(sync);
}
catch (Exception)
{
//Monitor.Exit(sync);
Console.WriteLine("3rd party code threw an exception");
}
}
static void Main(string[] args)
{
Thread newThread = new Thread(ThreadProc);
newThread.Start();
Console.WriteLine("Acquiring lock");
Monitor.Enter(sync);
Console.WriteLine("Releasing lock");
Monitor.Exit(sync);
}
}
So the main thread is basically get locked when it tries to do Monitor.Enter(sync). If I looked at !clrStack on main thread, its output basically show it which make sense but when I try to see native side of stack, I am expecting to see some Wait on single/multiple object type of call but I don't see it. Can anyone explain it. Thanks
0:000> !CLRStack
PDB symbol for mscorwks.dll not loaded
OS Thread Id: 0x1e8 (0)
ESP EIP
0012f0a8 77455e74 [GCFrame: 0012f0a8]
0012f178 77455e74 [HelperMethodFrame_1OBJ: 0012f178] System.Threading.Monitor.Enter (System.Object)
0012f1d0 00a40177 ConsoleApplication1.Program.Main(System.String[])
0012f400 70fc1b4c [GCFrame: 0012f400]
0:000> kb
ChildEBP RetAddr Args to Child
WARNING: Stack unwind information not available. Following frames may be wrong.
0012eeb4 710afb92 0012ee68 002d6280 00000000 ntdll!KiFastSystemCallRet
0012ef1c 710af7c3 00000001 002d6280 00000000 mscorwks!StrongNameFreeBuffer+0x1b1f2
0012ef3c 710af8cc 00000001 002d6280 00000000 mscorwks!StrongNameFreeBuffer+0x1ae23
0012efc0 710af961 00000001 002d6280 00000000 mscorwks!StrongNameFreeBuffer+0x1af2c
0012f010 710afae1 00000001 002d6280 00000000 mscorwks!StrongNameFreeBuffer+0x1afc1
0012f06c 70fdc5ae ffffffff 00000001 00000000 mscorwks!StrongNameFreeBuffer+0x1b141
0012f080 710df68a ffffffff 00000001 00000000 mscorwks!LogHelp_NoGuiOnAssert+0x10562
0012f10c 710b1154 002aad90 ffffffff 002aad90 mscorwks!StrongNameFreeBuffer+0x4acea
0012f128 710b10d8 42b8b47d 00000000 002aad90 mscorwks!StrongNameFreeBuffer+0x1c7b4
0012f1e0 70fc1b4c 0012f1f0 0012f230 0012f270 mscorwks!StrongNameFreeBuffer+0x1c738
0012f1f0 70fd2219 0012f2c0 00000000 0012f290 mscorwks+0x1b4c
0012f270 70fe6591 0012f2c0 00000000 0012f290 mscorwks!LogHelp_NoGuiOnAssert+0x61cd
0012f3ac 70fe65c4 0023c038 0012f478 0012f444 mscorwks!CoUninitializeEE+0x2ead
0012f3c8 70fe65e2 0023c038 0012f478 0012f444 mscorwks!CoUninitializeEE+0x2ee0
0012f3e0 7103389d 0012f444 42b8b0f1 00000000 mscorwks!CoUninitializeEE+0x2efe
0012f544 710337bd 002332e0 00000001 0012f580 mscorwks!GetPrivateContextsPerfCounters+0xf546
0012f7ac 71033d0d 00000000 42b8b9c9 00000001 mscorwks!GetPrivateContextsPerfCounters+0xf466
0012fc7c 71033ef7 00ce0000 00000000 42b8979 mscorwks!GetPrivateContextsPerfCounters+0xf9b6
0012fccc 71033e27 00ce0000 42b8b8a1 00000000 mscorwks!CorExeMain+0x168
* ERROR: Symbol file could not be found. Defaulted to export symbols for C:\Windows\Microsoft.NET\Framework\v4.0.30319\mscoreei.dll -
0012fd14 71cf55ab 71033d8f 0012fd30 71f37f16 mscorwks!CorExeMain+0x98
* ERROR: Symbol file could not be found. Defaulted to export symbols for C:\Windows\system32\mscoree.dll -
0012fd20 71f37f16 00000000 71cf0000 0012fd44 mscoreei!CorExeMain+0x38
0012fd30 71f34de3 00000000 7723d0e9 7ffd8000 mscoree!CreateConfigStream+0x13f
0012fd44 774319bb 7ffd8000 084952f9 00000000 mscoree!CorExeMain+0x8
0012fd84 7743198e 71f34ddb 7ffd8000 00000000 ntdll!RtlInitializeExceptionChain+0x63
0012fd9c 00000000 71f34ddb 7ffd8000 00000000 ntdll!RtlInitializeExceptionChain+0x36

You have to point windbg to the microsoft windows symbols server to get a good stack trace.
type in the following in your windbg command window:
.sympath srv*c:\websymbols*http://msdl.microsoft.com/download/symbols
Also see this:
Using microsoft symbol server to get symbols
Also, to answer your original question about how to debug this, here is the cookbook:
0:000> !clrstack
OS Thread Id: 0x1358 (0)
ESP EIP
0012f328 7c90e514 [GCFrame: 0012f328]
0012f3f8 7c90e514 [HelperMethodFrame_1OBJ: 0012f3f8] System.Threading.Monitor.Enter(System.Object)
0012f450 00d10177 Program.Main(System.String[])
0012f688 79e71b4c [GCFrame: 0012f688]
In your original program, the background thread was started first. So, it acquired the lock. However it exited without releasing the lock. After that your main thread tried to acquire the lock and it is stuck because the lock is already owned.
How do you find out who owns it? First do a !threads followed by !syncblk.
0:000> !threads
ThreadCount: 3
UnstartedThread: 0
BackgroundThread: 1
PendingThread: 0
DeadThread: 1
Hosted Runtime: no
PreEmptive GC Alloc Lock
ID OSID ThreadOBJ State GC Context Domain Count APT Exception
0 1 1358 0014bb00 200a020 Enabled 00000000:00000000 001540d0 0 MTA
2 2 1360 0015e320 b220 Enabled 00000000:00000000 001540d0 0 MTA (Finalizer)
XXXX 3 0 00175a98 9820 Enabled 00000000:00000000 001540d0 1 Ukn
0:000> !syncblk
Index SyncBlock MonitorHeld Recursion Owning Thread Info SyncBlock Owner
2 0017903c 3 1 00175a98 0 XXX 013503cc SyncBlock
-----------------------------
Total 2
CCW 0
RCW 0
ComClassFactory 0
Free 0
As you can see, !syncblk says that the owining thread object is 00175a98. From the !threads output, you can see that thread object 00175a98 is the dead thread that exited while owning the lock.
Hope this helps.

Related

SRIO interface u-boot. Cannot read IO memory

I try running a srio interface on my p2020 custom board. I plug a FPGA board with srio firmware to SRIO1 and configure SRIO as a host.
In uboot_config
#define CONFIG_SRIO1 /* SRIO port 1 */
#define CONFIG_SYS_SRIO1_MEM_VIRT 0xC0000000
#define CONFIG_SYS_SRIO1_MEM_BUS 0xC0000000
#define CONFIG_SYS_SRIO1_MEM_PHYS CONFIG_SYS_SRIO1_MEM_BUS
#define CONFIG_SYS_SRIO1_MEM_SIZE 0x10000000 /* 256M */
in tlb.c
SET_TLB_ENTRY(1, CONFIG_SYS_SRIO1_MEM_VIRT, CONFIG_SYS_SRIO1_MEM_PHYS,
MAS3_SX | MAS3_SW | MAS3_SR,
MAS2_I | MAS2_G,
0, 3, BOOKE_PAGESZ_256M, 1),
Try to read srio memory from u-boot
=> md.l 0xc0000000
c0000000:
p2020 is stucked.
I can watch a read request and read response on FPGA board.
Why I can't read a srio memory?
I set a gpio 'marks' for each Interrupt vector in start.S. When I try to read Srio memory, uboot is stacked. An interrupt doesn't occur. I cannot determine the cause of the error.
I tried to read the SRIO1 memory from linux
# devmem 0xc0000000 32
Disabling lock debugging due to kernel taint
Machine check in kernel mode.
Caused by (from MCSR=10008): Bus - Read Data Bus Error
Oops: Machine check, sig: 7 [#1]
SMP NR_CPUS=2 P2020 DS
Modules linked in:
CPU: 1 PID: 1578 Comm: devmem Tainted: G M 4.9.34 #28
task: eb161a80 task.stack: ef0ca000
NIP: 1000b5fc LR: 1000b510 CTR: c02e1108
MSR: 0202d000 <VEC,CE,EE,PR,ME> CR: 40000242 XER: 20000000
DEAR: b7e79000 ESR: 00000000
GPR00: 40000242 bfab4250 b7e81470 b7e79000 1000b510 40000242 b7e79000 b7d88444
GPR08: 0202d000 00000000 b7e79000 bfab4250 ef0ca000 100c8126 00000000 00000000
GPR16: 00000000 00000000 100a3560 100c0000 100c3fc5 00000000 100c0000 00000003
GPR24: 100c225c 100c0000 00000000 00001000 bfab4554 00000000 b7e79000 00000020
NIP [1000b5fc] 0x1000b5fc
LR [1000b510] 0x1000b510
Oops: Machine check, sig: 7 [#1]
arch/powerpc/kernel/traps.c +731
Machine check exception usually means a hardware problem. I connected the SRIO1 port to a SRIO2 of my p2020 (SRIO2 address starts at 0xd0000000)
# devmem 0xc0000000
0x00710002
# devmem 0xd0000000
0x00710002
It's work! I think, the problem in the FPGA board.

The meaning of "Wait Start TickCount" and "Ticks" in dump file

When I use WinDBG to analyse a kernel model dump file, I can get the information of certain thread. But there are some items that confuse me.
So please help me understand the meaning of the following keywords. Thank you.
Wait Start TickCount
Ticks
UserTime
KernelTime
Here is one example.
THREAD b6b48908 Cid 1038.10b0 Teb: 7ffac000 Win32Thread: fd517868 WAIT: (WrUserRequest) UserMode Non-Alertable
b5700630 SynchronizationEvent
IRP List:
b6ae6ab8: (0006,01d8) Flags: 00060000 Mdl: 00000000
Not impersonating
DeviceMap 95bd9310
Owning Process b5614788 Image: iexplore.exe
Attached Process N/A Image: N/A
Wait Start TickCount 27465609 Ticks: 109779 (0:00:28:32.563)
Context Switch Count 38627
UserTime 00:00:00.717
KernelTime 00:00:00.421
Win32 Start Address 0x6a6439a0
Stack Init b8b7ded0 Current b8b7d8e0 Base b8b7e000 Limit b8b7b000 Call 0
Priority 11 BasePriority 8 UnusualBoost 0 ForegroundBoost 2 IoPriority 2 PagePriority 5
ChildEBP RetAddr Args to Child
b8b7d8f8 8328aefd b6b48908 8333d008 83339e20 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])
b8b7d930 83289d57 b5700630 b6b48908 b6b489ec nt!KiSwapThread+0x266
b8b7d958 83285af4 b6b48908 b6b489c8 00000000 nt!KiCommitThreadWait+0x1df
b8b7dad4 94bac293 00000001 b8b7db0c 00000001 nt!KeWaitForMultipleObjects+0x535
b8b7db44 94bac06c 000025ff 00000000 00000001 win32k!xxxRealSleepThread+0x20b (FPO: [SEH])
b8b7db60 94ba90b4 000025ff 00000000 00000001 win32k!xxxSleepThread+0x2d (FPO: [3,0,0])
b8b7dbb8 94bac685 b8b7dbe8 000025ff 00000000 win32k!xxxRealInternalGetMessage+0x4b2 (FPO: [SEH])
b8b7dc1c 83249dc6 0295c7dc 00000000 00000000 win32k!NtUserGetMessage+0x4d (FPO: [SEH])
b8b7dc1c 77366bf4 0295c7dc 00000000 00000000 nt!KiSystemServicePostCall (FPO: [0,3] TrapFrame # b8b7dc34)
0295c790 00000000 00000000 00000000 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])
Wait Start TickCount is the computer internal time representation of when the Thread started waiting, i.e. when it changed from state "running" to state "waiting".
Ticks is the difference from Wait Start TickCount to now. These values may affect thread scheduling (together with others, such as the priorities).
Usertime is the amount of time the thread had a call stack with user mode functions on top.
Kerneltime is the amount of time the thread had a call stack with kernel mode functions on top. This should correspond to the values displayed by !runaway in user mode debugging. Both times do not include waiting time, just the actual running time when the thread was really executing CPU instructions.

How to find the source of an 'Access Violation'

To put in a nutshell, I have a C# application doing lots of mciSendString calls ( via dllimport ) to control wav files playback ( essentially open, play, pause, stop, status, close ). And after running for a while, the application crashes without notice with an 'access violation'.
Even though I'm running the app from my vs2012, the exception is not caught by visual studio. Even with the 'force break on an exception' option, I've had no luck in debugging this from the vs2012. So instead I've setup WER to generate me crash dumps and I am using windbg with psscor2.dll plugin to debug it.
Then in sequence, using the following commands, this is what I get ( shorten to the essential for readability purposes ) :
$>.ecxr
eax=00000001 ebx=00000000 ecx=00000401 edx=00000000 esi=049725b8 edi=00000002
eip=4e88159e esp=0a4efa38 ebp=0a4efa54 iopl=0 nv up ei pl nz ac pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010216
<Unloaded_mciwave.dll>+0x159e:
4e88159e ?? ???
$>~*kb
# 19 Id: 105c.28cc Suspend: 1 Teb: 7ef06000
Unfrozen
user32!NtUserGetMessage+0x15
user32!GetMessageA+0xa1
winmm!mciwindow+0x102
kernel32!BaseThreadInitThunk+0xe
ntdll!__RtlUserThreadStart+0x70
ntdll!_RtlUserThreadStart+0x1b
# 30 Id: 105c.15f8 Suspend: 0 Teb: 7ef1b000 Unfrozen
ntdll!ZwWaitForMultipleObjects+0x15
KERNELBASE!WaitForMultipleObjectsEx+0x100
kernel32!WaitForMultipleObjectsExImplementation+0xe0
kernel32!WaitForMultipleObjects+0x18
kernel32!WerpReportFaultInternal+0x186
kernel32!WerpReportFault+0x70
kernel32!BasepReportFault+0x20
kernel32!UnhandledExceptionFilter+0x1af
ntdll!__RtlUserThreadStart+0x62
ntdll!_EH4_CallFilterFunc+0x12
ntdll!_except_handler4+0x8e
ntdll!ExecuteHandler2+0x26
ntdll!ExecuteHandler+0x24
ntdll!RtlDispatchException+0x127
ntdll!KiUserExceptionDispatcher+0xf
WARNING: Frame IP not in any known module. Following frames may be wrong.
<Unloaded_mciwave.dll>+0x159e
# 31 Id: 105c.2310 Suspend: 1 Teb: 7ef00000 Unfrozen
user32!NtUserGetMessage+0x15
user32!GetMessageW+0x33
mciwave!TaskBlock+0x1d
mciwave!PlayFile+0xcb
mciwave!mwTask+0x98
winmm!mmStartTask+0x22
kernel32!BaseThreadInitThunk+0xe
ntdll!__RtlUserThreadStart+0x70
ntdll!_RtlUserThreadStart+0x1b:
$>!analyze -v
FAULTING_IP:
mciwave_4e880000!TaskBlock+1d
4e88159e ?? ???
EXCEPTION_RECORD: ffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 4e88159e (mciwave_4e880000!TaskBlock+0x0000001d)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 00000008
Parameter[1]: 4e88159e
Attempt to execute non-executable address 4e88159e
PROCESS_NAME: Titan.vshost.exe
ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s.
EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s.
EXCEPTION_PARAMETER1: 00000008
EXCEPTION_PARAMETER2: 4e88159e
WRITE_ADDRESS: 4e88159e
FOLLOWUP_IP:
mciwave_4e880000!TaskBlock+1d
4e88159e ?? ???
MOD_LIST: <ANALYSIS/>
NTGLOBALFLAG: 0
APPLICATION_VERIFIER_FLAGS: 0
MANAGED_STACK: !dumpstack -EE
OS Thread Id: 0x15f8 (30)
====> Exception cxr#a4ef750
FAULTING_THREAD: 000015f8
BUGCHECK_STR: APPLICATION_FAULT_SOFTWARE_NX_FAULT_CODE_WRONG_SYMBOLS
PRIMARY_PROBLEM_CLASS: SOFTWARE_NX_FAULT_CODE
DEFAULT_BUCKET_ID: SOFTWARE_NX_FAULT_CODE
LAST_CONTROL_TRANSFER: from 4e881999 to 4e88159e
STACK_TEXT:
0a4efa54 4e881999 0a4efa88 078db198 078db1a4 mciwave_4e880000!TaskBlock+0x1d
0a4efa68 74370ae5 00038edc 00000000 00000000 mciwave_4e880000!mwTask+0x45
0a4efa88 7670338a 078db198 0a4efad4 76f99f72 winmm!mmStartTask+0x22
0a4efa94 76f99f72 078db198 79f84a28 00000000 kernel32!BaseThreadInitThunk+0xe
0a4efad4 76f99f45 74370ac3 078db198 00000000 ntdll!__RtlUserThreadStart+0x70
0a4efaec 00000000 74370ac3 078db198 00000000 ntdll!_RtlUserThreadStart+0x1b
SYMBOL_STACK_INDEX: 0
SYMBOL_NAME: mciwave!TaskBlock+1d
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: mciwave_4e880000
IMAGE_NAME: mciwave.dll
DEBUG_FLR_IMAGE_TIMESTAMP: 4a5bcb4a
STACK_COMMAND: ~30s; .ecxr ; kb
FAILURE_BUCKET_ID: SOFTWARE_NX_FAULT_CODE_c0000005_mciwave.dll!TaskBlock
BUCKET_ID: APPLICATION_FAULT_SOFTWARE_NX_FAULT_CODE_WRONG_SYMBOLS_mciwave!TaskBlock+1d
Followup: MachineOwner
---------
Exception seems to happen in thread #30 in Unloaded_mciwave.dll, but I have no idea how to push further the debugging.. How can I get a better idea of what's going ??
How can I get what's happening between these two lines ?
ntdll!KiUserExceptionDispatcher+0xf
--> WARNING: Frame IP not in any known module. Following frames may be wrong.
<Unloaded_mciwave.dll>+0x159e
Thank you for your help in advance.
You should get more details by reloading the DLL in the Debugger.
For that you need to do:
lmvm mciwave.dll
start end module name
Unloaded modules:
e6510000 e6548000 mciwave.dll
Timestamp: Fri Oct 14 12:00:00 2011 (4E98E6E2)
Checksum: 0003E937
ImageSize: 00038000
You need to set up the Symbol and Exe-Path so the debugger can find the DLL and PDB (which shouldn't be a problem if you have it in your machine). Then you can do
.reload mciwave.dll=e6510000,00038000
DBGHELP: <path>\mciwave.dll - OK
Now if you do !analyze -v again, it should give you the correct call stack.

Access violation while running app via windbg

My application get access violation sometimes.
I runned application through windbg, and it stopped in the following function .
also tried _vscprintf instead of vsnprintf, and the result was same.
I 'm newbie about windbg.
Any help will be appreciated.
int tsk_sprintf_2(char** str, const char* format, va_list* ap)
{
int len = 0;
va_list ap2;
ap2 = *ap;
len = vsnprintf(0, 0, format, *ap); /*-> access violation in this point! */
*str = (char*)calloc(1, len+1);
vsnprintf(*str, len, format, ap2);
va_end(ap2);
return len;
}
==> the following are the result from windbg
MANAGED_STACK: !dumpstack -EE
OS Thread Id: 0x5b8 (22)
Current frame:
ChildEBP RetAddr Caller, Callee
PRIMARY_PROBLEM_CLASS: WRONG_SYMBOLS
BUGCHECK_STR: APPLICATION_FAULT_WRONG_SYMBOLS
LAST_CONTROL_TRANSFER: from 1026d3d8 to 102e14cf
STACK_TEXT:
WARNING: Stack unwind information not available. Following frames may be wrong.
1d3cde7c 1026d3d8 1d3cdea8 0898eeeb 00000000 MSVCR100D!vcwprintf_s_l+0x52ef
1d3cded0 1026d46c 00000000 00000000 0898ee88 MSVCR100D!vsnprintf_l+0x158
1d3cdeec 0834d927 00000000 00000000 0898ee88 MSVCR100D!vsnprintf+0x1c
1d3cdfe8 1002891e 1d3ce0d0 0898ee88 1d3ce1e4 tinySAK!tsk_sprintf_2+0x57
1d3ce0f0 10028b77 09a16fe8 0898ee88 00000000 tinyWRAP!debug_xxx_cb+0x6e
1d3ce1ec 088b697b 09a16fe8 0898ee88 00000444 tinyWRAP!DDebugCallback::debug_info_cb+0x37
1d3cffb4 7c80b713 1cd10f90 1d2cfb44 7c947d9a tinyNET!tnet_transport_mainthread+0x1adb
1d3cffec 00000000 088a2aff 1cd10f90 00000000 KERNEL32!GetModuleFileNameA+0x1b4
SYMBOL_STACK_INDEX: 0
SYMBOL_NAME: msvcr100d!vcwprintf_s_l+52ef
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: MSVCR100D
IMAGE_NAME: MSVCR100D.dll
STACK_COMMAND: ~22s ; kb
BUCKET_ID: WRONG_SYMBOLS
FAILURE_BUCKET_ID: WRONG_SYMBOLS_c0000005_MSVCR100D.dll!vcwprintf_s_l
WATSON_STAGEONE_URL:
Followup: MachineOwner
---------
route.
You're attempting to print into a NULL pointer: len = vsnprintf(0, 0, format, *ap);; of course, it will crash. Send a valid address of output buffer as the first parameter and valid length as second.

Understanding Objective c enum declaration

From iPhone UIControl
UIControlEventAllTouchEvents = 0x00000FFF,
UIControlEventAllEditingEvents = 0x000F0000,
UIControlEventApplicationReserved = 0x0F000000,
UIControlEventSystemReserved = 0xF0000000,
UIControlEventAllEvents = 0xFFFFFFFF
Now I assume the UIControlEventApplication is the 'range' I can use to specify custom control events, but I have no idea how to do it properly. Only if I assign 0xF0000000 the control event will correctly fire. If I assign anything else (0xF0000001) the control event fires when it's not supposed to.
Some clarification:
enum {
UIBPMPickerControlEventBeginUpdate = 0x0F000000,
UIBPMPickerControlEventEndUpdate = // Which value do I use here?
};
My assumption of it being a range is based on the docs. Which say:
I assume this because the docs say: A range of control-event values available for application use.
Could anyone help me understand the type of enum declaration used in UIControl?
I would think 0x0F000000 is the 4 bits you have at your disposal for creating your own control events.
0x0F000000 = 00001111 00000000 00000000 00000000
So any combination of:
0x00000001<<27 = 00001000 00000000 00000000 00000000
0x00000001<<26 = 00000100 00000000 00000000 00000000
0x00000001<<25 = 00000010 00000000 00000000 00000000
0x00000001<<24 = 00000001 00000000 00000000 00000000
You can of course OR these together to create new ones:
0x00000001<<24 | 0x00000001<<25 = 00000011 00000000 00000000 00000000
So in your example:
enum {
UIBPMPickerControlEventBeginUpdate = 0x00000001<<24,
UIBPMPickerControlEventEndUpdate = 0x00000001<<25, ...
};
To use the enums you just do bitwise operations:
UIControlEventAllEditingEvents | UIControlEventApplicationReserved | UIControlEventApplicationReserved