I am having a problem getting WinDbg to use the PDB files for my .NET DLL files.
The hang dump I am looking at is from a production build, but I have PDB files from a debug build of the same code.
I set the symbol path to include a local folder and the Microsoft symbol server.
C:\websymbols\foo;srv*c:\websymbols*http://msdl.microsoft.com/download/symbols
I put all my PDB files in C:\websymbols\foo. Yet, the managed stack listings do not contain any method names.
Doing a reload, .reload /f, tells me:
DBGHELP: No debug info for FOO.dll. Searching for dbg file
SYMSRV: c:\websymbols\foo\FOO.dbg\49B7F17C10000\FOO.dbg not found
SYMSRV: c:\websymbols\FOO.dbg\49B7F17C10000\FOO.dbg not found
SYMSRV: http://msdl.microsoft.com/download/symbols/FOO.dbg/49B7F17C10000/FOO.dbg not found
DBGHELP: .\FOO.dbg - file not found
DBGHELP: .\dll\FOO.dbg - path not found
DBGHELP: .\symbols\dll\FOO.dbg - path not found
DBGHELP: FOO.dll missing debug info. Searching for pdb anyway
DBGHELP: Can't use symbol server for FOO.pdb - no header information available
DBGHELP: FOO.pdb - file not found
*** WARNING: Unable to verify checksum for FOO.dll
*** ERROR: Module load completed but symbols could not be loaded for FOO.dll
DBGHELP: FOO - no symbols loaded
When I attach WinDbg to the service in a test environment, managed stacks show up fine with method names. But when I dump the memory and analyze the DMP file locally, I don't see the names in the managed stacks. What might I be doing wrong?
You need the exact same PDB files. Debug symbols will not work with a retail dump. And you need the PDB file from exactly the same build.
Whenever you release bits into the wild, your build team should store the private PDB files for reference in case you have to stare at a dump six months later...
There's not much you can do about it now. As John Robbins says:
"The most important thing all developers need to know: PDB files are as important as source code! ... I've been to countless companies to help them debug those bugs costing hundreds of thousands of dollars and nobody can find the PDB files for the build running on a production server. Without the matching PDB files you just made your debugging challenge nearly impossible."
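To make "exactly the same build" a bit more concrete: as far as I know, the FOO.dbg\49B7F17C10000 component in the .reload output is the module's link timestamp (8 hex digits) followed by its image size in hex, and PDB files are matched by a GUID and age recorded in the DLL rather than by name. A minimal Python sketch that splits such a key, assuming that layout (split_image_key is a name I made up):

```python
from datetime import datetime, timezone

def split_image_key(key: str):
    """Split a symbol-server key for a PE image or .dbg file.

    Assumption: the key is 8 hex digits of TimeDateStamp followed by
    SizeOfImage in hex, with no padding on the size.
    """
    if len(key) <= 8:
        raise ValueError("key too short to hold a timestamp and a size")
    timestamp = int(key[:8], 16)
    size_of_image = int(key[8:], 16)
    # TimeDateStamp is seconds since the Unix epoch, in UTC.
    built = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return timestamp, size_of_image, built

timestamp, size, built = split_image_key("49B7F17C10000")
print(f"linked {built:%Y-%m-%d} UTC, image size 0x{size:X}")
```

A debug rebuild of the same source gets a different timestamp (and a different PDB GUID), which is why the debugger refuses to pair it with the production DLL.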
You can try your luck with an evil tool called ChkMatch, which fools VS into accepting whatever PDB you throw at it. Just know that the chances of getting any meaningful stacks are near zero: the PE layout is extremely sensitive to code changes, and technically even two builds of identical source are not guaranteed to give the same PE.
[edit:] Sorry, just noticed you use WinDbg. In that case, as Remus says, .reload /f /i can achieve the same trick (with the same risks).
OK, I asked the wrong question. I don't even need symbols for the .NET code (as Remus pointed out). So this is not the answer to my question, but it is the solution to my problem, which seems to be related to the .NET build on the machine WinDbg is running on.
I get meaningful stack information when .chain tells me this:
C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\sos: image 2.0.50727.**1433**, API 1.0.0, built Tue Oct 23 20:41:30 2007
(The same as on the server the dump was taken on.)
I don't get any information other than addresses from !clrstack when run on machines where .chain tells me:
C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\sos: image 2.0.50727.**3053**, API 1.0.0, built Fri Jul 25 07:08:38 2008
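The comparison I did by eye can be scripted. A rough Python sketch, assuming the .chain line format shown above (sos_build is a name I made up, not part of any tool):

```python
import re

def sos_build(chain_line: str):
    """Extract the SOS image version from one line of WinDbg .chain output."""
    cleaned = chain_line.replace("*", "")  # drop any emphasis markers
    match = re.search(r"image (\d+)\.(\d+)\.(\d+)\.(\d+)", cleaned)
    if match is None:
        raise ValueError("no image version found in line")
    return tuple(int(part) for part in match.groups())

# The two lines quoted above: the dump machine and the analysis machine.
server = sos_build(r"C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\sos: "
                   "image 2.0.50727.1433, API 1.0.0, built Tue Oct 23 20:41:30 2007")
local = sos_build(r"C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\sos: "
                  "image 2.0.50727.3053, API 1.0.0, built Fri Jul 25 07:08:38 2008")

if server != local:
    print("SOS build differs from the machine the dump was taken on:", server, local)
```

If the builds differ, !clrstack output may be limited to addresses, as described above.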
I've created a library on myget (part of ci), and I'm trying to push the symbol sources to symbolsource.org (this is a great service, and I love the idea). This is my first attempt. I've been using the instructions found on the myget site: http://docs.myget.org/docs/reference/symbolsource, but there are some gaps.
Here are the steps I go through. First, I create a nuspec file, and I use "nuget pack -symbol xxx" to create the X.symbols.nupkg and X.nupkg files. This works just fine. I then push them individually to myget and symbolsource. I used the nuget pkg explorer to examine the contents, and they look as I would expect (the src, pdb, and dll show up in the symbols). After doing the push, I can log into symbolsource and I see my packages up there using the instructions found on the myget page.
I used the following command to push to symbolsource:
nuget push X.symbols.nupkg $ApiKey -Source http://nuget.gw.SymbolSource.org/MyGet/rootdotnet/
I then configure visual studio as instructed: make sure to turn off "enable just my code" and also to turn on symbol servers. I then add to the list of symbol servers the following URL:
http://srv.SymbolSource.org/pdb/MyGet/gwatts/XXXXX
Where XXXXX is a GUID I read off the SymbolSource "Your Account"/"Authentication" "Visual Studio" table entry (myget wasn't at all clear this is what I was supposed to do).
I then try to debug. When I hit something in that library, I get the "No Symbols Loaded" page in VS2012. Under details, there is a dump of VS2012's attempt to find the pdb file. I see the following:
C:\Users\Gordon\Documents\Code\HVQCDCorrelationStudy\CalcSimpleCorrelationTestNumbers\bin\x86\Debug\LINQToTTreeLib.pdb: Cannot find or open the PDB file.
c:\TeamCity\buildAgent\work\44463130cd7383cb\LINQToTTree\LINQToTTreeLib\obj\x86\Release\LINQToTTreeLib.pdb: Cannot find or open the PDB file.
C:\WINDOWS\LINQToTTreeLib.pdb: Cannot find or open the PDB file.
C:\WINDOWS\symbols\dll\LINQToTTreeLib.pdb: Cannot find or open the PDB file.
C:\WINDOWS\dll\LINQToTTreeLib.pdb: Cannot find or open the PDB file.
C:\Users\Gordon\AppData\Local\Temp\SymbolCache\LINQToTTreeLib.pdb\9c883e0fa93245c99efd2b92dbfc6dfc1\LINQToTTreeLib.pdb: Cannot find or open the PDB file.
C:\Users\Gordon\AppData\Local\Temp\SymbolCache\MicrosoftPublicSymbols\LINQToTTreeLib.pdb\9c883e0fa93245c99efd2b92dbfc6dfc1\LINQToTTreeLib.pdb: Cannot find or open the PDB file.
C:\Users\Gordon\Documents\Code\HVQCDCorrelationStudy\LINQToTTreeLib.pdb: Cannot find or open the PDB file.
SYMSRV: C:\Users\Gordon\AppData\Local\Temp\SymbolCache\LINQToTTreeLib.pdb\9C883E0FA93245C99EFD2B92DBFC6DFC1\LINQToTTreeLib.pdb not found
SYMSRV: http://srv.SymbolSource.org/pdb/MyGet/gwatts/XXXXX/LINQToTTreeLib.pdb/9C883E0FA93245C99EFD2B92DBFC6DFC1/LINQToTTreeLib.pdb not found
http://srv.SymbolSource.org/pdb/MyGet/gwatts/XXXXX: Symbols not found on symbol server.
SYMSRV: C:\Users\Gordon\AppData\Local\Temp\SymbolCache\LINQToTTreeLib.pdb\9C883E0FA93245C99EFD2B92DBFC6DFC1\LINQToTTreeLib.pdb not found
SYMSRV: http://msdl.microsoft.com/download/symbols/LINQToTTreeLib.pdb/9C883E0FA93245C99EFD2B92DBFC6DFC1/LINQToTTreeLib.pdb not found
http://msdl.microsoft.com/download/symbols: Symbols not found on symbol server.
In short, it looks like it correctly contacts symbolsource.org. But something is failing up there. The 9C883E0FA93245C99EFD2B92DBFC6DFC1 is obviously a hash. I have no idea what hash SymbolSource assigned to that library, though I'd love to figure it out, as that might be a first step to understanding what is going on.
Basically, I don't know how to proceed with debugging at this point. Any help would be appreciated!
Update: As mentioned in the answers below, build something small that can be tested. I've done that, and it works just fine. In doing that I discovered there are some debugging tools up on SymbolSource.org - specifically, when you look at a package in your feed, you can find the "Compilations" link. Click on it. It should show a line for each build type you've uploaded. My packages have nothing associated with that - so I've messed up my nuspec file somehow for symbol generation.
Try to isolate a reproducible scenario (rule out as many other factors as you can). It sounds like your Visual Studio setup is correct, so I suspect package or compilation issues (e.g. symbols and sources out of sync). Feel free to contact MyGet support for further assistance.
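On the "hash" in the question: as far as I know, that component of a symbol server path is not assigned by the server. It is the PDB's GUID (32 hex digits) followed by its age in hex, both recorded in the DLL at build time. A minimal Python sketch under that assumption; split_pdb_key and symbol_url are illustrative names, not part of any tool:

```python
import uuid

def split_pdb_key(key: str):
    """Split a symbol-server PDB key into (GUID, age).

    Assumption: 32 hex digits of PDB GUID followed by the age in hex.
    """
    if len(key) <= 32:
        raise ValueError("key too short to hold a GUID and an age")
    guid = uuid.UUID(hex=key[:32])
    age = int(key[32:], 16)
    return guid, age

def symbol_url(server: str, pdb_name: str, key: str) -> str:
    """Build the path a debugger requests: <server>/<pdb>/<key>/<pdb>."""
    return f"{server.rstrip('/')}/{pdb_name}/{key}/{pdb_name}"

guid, age = split_pdb_key("9C883E0FA93245C99EFD2B92DBFC6DFC1")
url = symbol_url("http://srv.SymbolSource.org/pdb/MyGet/gwatts/XXXXX",
                 "LINQToTTreeLib.pdb", "9C883E0FA93245C99EFD2B92DBFC6DFC1")
print(guid, age)
print(url)
```

If the PDB inside the uploaded symbols package was produced by a different compilation than the DLL being debugged, its GUID will differ and the request above will come back "not found".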
The answer, it turns out, is a slice of humble pie. On my build server there was an environment variable conflict. The result was that local build scripts built a symbols file just fine, while the build server built one without PDBs in it. Without PDBs, of course, the source server was not able to do very much.
One thing I did learn on the way is the NuGet PackageExplorer (https://npe.codeplex.com/). What you can do is use it to load up the NuGet symbols package. Then use the plug-in manager to load in the SymbolSource plug-in (you'll have to use the marketplace, but it is all free). This utility would have caught the problem in my packages had I submitted the proper ones to it (my local packages passed with flying colors).
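A quick check along the same lines can be scripted, since a .nupkg is an ordinary zip archive. This is only a sketch: pdb_entries is a hypothetical helper and the demo builds an in-memory stand-in rather than reading a real package.

```python
import io
import zipfile

def pdb_entries(package) -> list:
    """Return the names of the .pdb entries inside a .nupkg (a zip archive).

    `package` may be a file path or a file-like object.
    """
    with zipfile.ZipFile(package) as archive:
        return [name for name in archive.namelist()
                if name.lower().endswith(".pdb")]

# Demo with an in-memory stand-in for X.symbols.nupkg (contents are made up).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("lib/net40/X.dll", b"")
    archive.writestr("lib/net40/X.pdb", b"")
    archive.writestr("src/X/Class1.cs", b"")
buffer.seek(0)

found = pdb_entries(buffer)
print(found if found else "no PDB files in the package")
```

Running this against the package the build server actually produced, before pushing it, would have shown the empty list straight away.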
I'm trying to learn about reading dump files, so I made my small app crash and created a dump for that process from Task Manager.
I tried to open the .dmp file from both VS10 and windbg.exe, and got an error that the symbol files are missing. I specified the path of the symbol files as the directory where the .pdb files are located:
..\Visual Studio 2010\Projects\CachedQueryTester\CachedQueryTester\bin\Debug
but I still get the same error in both VS10 and windbg.exe.
Any idea?
You may also need symbols from Microsoft. Try entering
0:000> .symfix
in WinDbg.
With your configuration, you should not have to specify any debug symbol path, because the path of your symbols is stored in the executable. To be sure, you can open a Visual Studio Command Prompt and type
dumpbin CachedQueryTester.exe /HEADERS
In the output, you should have a 'Debug Directories' entry containing the full path of the pdb.
If this is not the case, check that you have enabled PDB generation (Configuration Properties / Linker / Debugging / Generate debug info).
You can also ask WinDbg where it looks for symbols. To do this, open your dump file in WinDbg, type '!sym noisy' and reload the symbols (.reload /u, then .reload and kb). It will print each location it searches.
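If you want to script the dumpbin check, something like the following Python sketch could pull the PDB reference out of the /HEADERS output. It assumes the debug directory line looks like "Format: RSDS, {GUID}, age, path", which is how I remember dumpbin printing it; the sample text, GUID and path below are invented for illustration.

```python
import re

# Assumed shape of the CodeView entry in `dumpbin /HEADERS` output.
RSDS_LINE = re.compile(
    r"Format:\s*RSDS,\s*\{(?P<guid>[0-9A-Fa-f-]+)\},\s*(?P<age>[0-9A-Fa-f]+),\s*(?P<path>.+)$",
    re.MULTILINE,
)

def pdb_reference(headers_text: str):
    """Return (guid, age, pdb_path) from dumpbin output, or None if absent."""
    match = RSDS_LINE.search(headers_text)
    if match is None:
        return None
    return match.group("guid"), match.group("age"), match.group("path").strip()

# Hypothetical excerpt; a real run would capture dumpbin's stdout instead.
sample = r"""
  Debug Directories

        Time Type        Size      RVA  Pointer
    -------- ------- -------- -------- --------
    4D5A9B3C cv            5C 00002A40     1C40    Format: RSDS, {0F1E2D3C-4B5A-6978-8796-A5B4C3D2E1F0}, 1, C:\build\CachedQueryTester\obj\Debug\CachedQueryTester.pdb
"""

reference = pdb_reference(sample)
print(reference if reference else "no PDB reference: was debug info generated?")
```

A None result would point at the build settings rather than at the debugger's symbol path.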
My exe depends on ntdll, user32 and kernel32. I save local copies of these DLLs and change the first letter of each name to "V".
I then edit the exe's import DLL name from kernel32.dll to vernel32.dll. The application works fine, loading vernel32.dll from local space.
Next I edit the exe's import DLL name from ntdll to vtdll. The process loads vtdll from local space, runs its code and throws a _stackhash exception on vtdll instructions.
I need this so that my application can bundle all its Windows dependencies. Does anybody have any idea why ntdll can't be run from local space?
No! You cannot try to replace ntdll. It is mapped by the kernel into every single process, probably before any of your code is even loaded. It has an intricate connection with the kernel. It knows all the correct system call numbers. Try using ntdll from NT 5.1 and it will crash on NT 6.1. ntdll hosts the system call entry and exit code. The kernel-user callback dispatcher code. The thread start function which the kernel knows the address of. The user exception dispatcher. The user APC handler. I could go on, but I won't.
I don't see why you're trying to "bundle" these DLLs with your program. There is no way a Windows install won't have these DLLs. And that's ZERO chance for ntdll.dll since I don't see how without the session manager and CSR you are going to run your program in the first place.
I don't think "bundling" system DLLs is a good idea.
First of all, it is illegal to redistribute these DLLs together with your application. Second, you should understand that a DLL can create global objects, so using two copies of the same DLL (vtdll.dll and ntdll.dll) cannot work. You didn't write how you modified the imports of the DLLs. If you did it on disk, that is illegal, and moreover it breaks the signature of the files (open the file properties of any of the DLLs and look at the "Digital Signatures" tab).
If you do want to experiment with different copies of system DLLs, you are better off using DLL redirection (see http://msdn.microsoft.com/en-us/library/ms682600.aspx) by creating a file named myapp.exe.local, where myapp.exe is the name of your application. You may need to delete some entries from HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\KnownDLLs to do this. Be aware that your computer will run slowly after this, and I recommend doing such experiments inside a virtual machine that you can easily restore if it no longer boots.
Thanks for the information. It helped me research this further.
I am not bundling the DLLs for my own application. I am doing it for existing applications, to provide a cross-platform independence solution for Windows.
I tried the DLL redirection technique you posted with all applications.
It works well with all DLLs except ntdll.dll and user32.dll.
User32.dll:
It loads user32.dll from local space only and not kernel space. I confirmed it. But on executing its instructions, it results in a null address access exception (c0000005) with fault module name StackHash_5964.
ntdll:
On startup, the application loads ntdll from system32 and then loads ntdll again from local space, which may cause the error you described (global object sharing violation).
This happens only for ntdll and not for user32.dll.
Is there any way to make ntdll load only once (only from local space) and to avoid the errors caused by user32.dll in local space?
I tried the references sent by you and here are the results.
User32.dll
I could not build user32.dll with the functions below.
IsThreadDesktopComposited = user33.IsThreadDesktopComposited,
User32InitializeImmEntry = user33.User32InitializeImmEntry
It produces a linker error (unresolved external symbol "IsThreadDesktopComposited").
Hence I left out 100 such functions of the 800 functions in user32.dll, and the DLL finally built.
I then placed the DLL in local space along with user33.dll. On running the application, it reports that the procedure entry points of the 100 missing functions are not found.
Ntdll.dll
I tried removing the KnownDLLs entries, but they are inaccessible for modify or delete operations; I can only read them. I am the admin and ran regedit as administrator.
Is it possible to do such implementations for ntdll or user32.dll?
I realize I keep coming back with the same question.
Thanks for all your help.
But if you have any other approaches or suggestions, I would be grateful.
This seems so trivial, yet I can't get it to work..
I have an msi.dll wrapper (named Interop.WindowsInstaller.dll) which I need to sign. The way to do it is by signing it upon import (this specific case is even documented in MSDN: http://msdn.microsoft.com/en-us/library/zec56a0w.aspx).
BUT - no matter how I do it (w/ or w/o a keyfile, w/ or w/o adding "/delaysign"), the generated assembly's size is always 36,864 bytes and when viewing the DLL's properties there is no "Digital Signatures" tab (needless to say - the DLL is NOT signed).
What am I missing here?? (... HELP!...)
[Note: Eventually I got a hint from Karel Zikmund on this thread, which helped me solve the mystery. I'll paste my reply here - for the greater good].
So, I used the following line to sign-upon-import the assembly:
tlbimp C:\WINDOWS\system32\msi.dll /out:Interop.WindowsInstaller.dll /keyfile:MyKey.snk
I then copied the file to the appropriate location and built the project, but each time the build failed on the following error: Assembly generation failed -- Referenced assembly 'Interop.WindowsInstaller' does not have a strong name.
I thought the problem was with the tlbimp line, but after reading Karel Zikmund's reply and verifying that the DLL is strong-named (using sn -vf Interop.WindowsInstaller) I found out the problem.
Adding a reference to the "Microsoft Windows Installer Object Library" COM object actually planted a code block into the .csproj file.
I didn't realize it, but this block caused the DLL file to be regenerated from scratch each time the project was built. The regenerated file, of course, was no longer strong-named.
The way I resolved it was to remove the reference to "Microsoft Windows Installer Object Library" from the project, and add a direct file reference to the imported (and already signed) Interop.WindowsInstaller.dll file.
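To spot this kind of block before it bites, you could scan the project file for COM references. The Python sketch below assumes the planted block was a COMReference item in the usual MSBuild 2003 namespace, which is what Visual Studio normally writes for a COM reference; the sample project is made up and com_references is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Namespace used by classic (non-SDK) .csproj files.
MSBUILD_NS = "{http://schemas.microsoft.com/developer/msbuild/2003}"

def com_references(csproj_xml: str) -> list:
    """Return the Include names of all COMReference items in a .csproj."""
    root = ET.fromstring(csproj_xml)
    return [item.get("Include") for item in root.iter(MSBUILD_NS + "COMReference")]

# Made-up project fragment resembling what adding the COM object produces.
sample = """<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <ItemGroup>
    <COMReference Include="WindowsInstaller">
      <WrapperTool>tlbimp</WrapperTool>
    </COMReference>
    <Reference Include="System" />
  </ItemGroup>
</Project>"""

names = com_references(sample)
print(names)
```

Any name it prints is an interop assembly the build will regenerate, so a pre-signed copy of that DLL would be overwritten.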
I'm trying to get my ad hoc build distributed but have started experiencing problems. It used to work up until around a week ago, but now iTunes gives an 0x8008017 error when I try to sync.
I've narrowed it down using the iPhone Configuration Utility: the error seems to come from a failed code sign. I've run codesign -vvvv myApp.app and the output lists a load of missing resources from my Help documents (in my app's Resources folder). Each missing resource name begins with ._, so for my index page:
01 - Index.html
the codesign is also expecting: ._01 - Index.html
It also has the existing file listed (as it should) but fails because all ._files are not included in the app.
I've looked through my projects directory and can't find any files beginning with ._ so am not sure where the codesigner is getting these filenames from, but they are included every build, after a clean or an Xcode restart.
All the resources that are causing problems are all recently updated files that I copied over the old resources at the beginning of the week; might this be something to do with it?
Any help appreciated
Make sure you do one of these:
copy those files with an Xcode Copy Files phase, which should Do The Right Thing by default, or
exclude resource forks and ._* files if you copy through a script, or
make sure you build on HFS volumes (where ._* files are not generated for resource forks).
Sounds like your partition type is generating resource-fork files which are also being signed as separate files in the bundle, rather than as part of the original files (which is bad); and then, they're also not getting copied (if you use Finder zipping, they'll be removed and set aside in a different portion of the Zip file, IIRC), again bad. Avoid having them in the bundle, so they don't get signed and you don't have to wade through this mess :)
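If you go the script route, stripping the ._* files before signing can be as simple as the Python sketch below. It is only an illustration: remove_appledouble is a name I made up, and the demo runs on a throwaway directory rather than a real .app bundle.

```python
import os
import tempfile

def remove_appledouble(root: str) -> list:
    """Delete every file whose name starts with '._' under root."""
    removed = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if name.startswith("._"):
                path = os.path.join(folder, name)
                os.remove(path)
                removed.append(path)
    return removed

# Demo on a temporary directory standing in for the Resources folder.
with tempfile.TemporaryDirectory() as resources:
    for name in ("01 - Index.html", "._01 - Index.html"):
        open(os.path.join(resources, name), "w").close()
    removed = remove_appledouble(resources)
    remaining = sorted(os.listdir(resources))

print(remaining)
```

Run it on the copied resources before the code signing step, so the signature never records the ._ entries in the first place.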