Can one use libSegFault.so to get backtraces for SIGABRT? - stack-trace

The magic incantation
LD_PRELOAD=/lib/libSegFault.so someapp
runs someapp with libSegFault.so providing backtrace information on a SIGSEGV as described in many different places.
Other than using signal(7)-like approaches to cause SIGABRT to invoke the SIGSEGV handler, is there some way to get libSegFault to provide backtrace information for assert(3) failures?

env SEGFAULT_SIGNALS="abrt segv" LD_PRELOAD=/lib/libSegFault.so someapp
Note that the actual path to the preload library may differ. On my machine, I'd use
env SEGFAULT_SIGNALS="abrt segv" LD_PRELOAD=/lib/x86_64-linux-gnu/libSegFault.so some-64bit-app
or
env SEGFAULT_SIGNALS="abrt segv" LD_PRELOAD=/lib/i386-linux-gnu/libSegFault.so some-32bit-app
depending whether the application I was running was compiled 64-bit or 32-bit. (You can use file to check.)
The source tells us there are three environment variables that define how libSegFault.so behaves:
SEGFAULT_SIGNALS: The list of signals that cause a stack trace.
The default is SIGSEGV. A defined but empty SEGFAULT_SIGNALS means no signals cause a stack trace.
The supported values are segv, ill, abrt, fpe, bus on systems that have the SIGBUS signal, stkflt on systems that have the SIGSTKFLT signal, and all for all of these.
SEGFAULT_USE_ALTSTACK: If defined in the environment, libSegFault.so uses an altenate stack for the stack trace signals.
This may come in handy if you are debugging stack corruption.
SEGFAULT_OUTPUT_NAME: If defined in the environment, the stack trace is written to this file instead of standard error.
To be honest, I found these initially by examining the library with strings /lib/libSegFault.so | sed -e '/[^0-9A-Z_]/ d'. All standard libraries (libSegFault.so having become a part of GNU C library) are tunable via environment variables, so using something like that command to dump any strings that look like environment variable names is a quick way to find stuff to search for. Doing a web search on "SEGFAULT_SIGNALS" "SEGFAULT_OUTPUT_NAME" produces a number of useful links; seeing that it was part of the GNU C library nowadays, I went to the source git archives, found the actual source file for the library, and posted my answer.

In a similar vein, the glibc exception handler writes a stack dump to /dev/console on heap corruption errors.
If you are running your executable in a non tty (i.e. a systemd process or other detached process), the crash output goes to /dev/null, which is not so useful.
There is an undocumented feature to redirect the output to /dev/stderr. Set the following environment variable:
export LIBC_FATAL_STDERR_=1
This can be used in conjunction with libSegFault.so for maximal forensics.
It is also worth mentioning that this might give you two stack traces if you also enable backtraces for SIGABRT, as the glibc first does a stack trace, then signals SIGABRT ... and then libSegFault gives a second stack trace.

Related

Take kernel dump on-demand from user-space without kernel debugging (Windows)

What would be the simplest and most portable way (in the sense of only having to copy a few files to the target machine, like procdump is) to generate a kernel dump that has handle information?
procdump has the -mk option which generates a limited dump file pertaining to the specified process. It is reported in WinDbg as:
Mini Kernel Dump File: Only registers and stack trace are available. Most of the commands I try (!handle, !process 0 0) fail to read the data.
Seems that officially, windbg and kd would generate dumps (which would require kernel debugging).
A weird solution I found is using livekd with -ml: Generate live dump using native support (Windows 8.1 and above only).. livekd still looks for kd.exe, but does not use it :) so I can trick it with an empty file, and does not require kernel debugging. Any idea how that works?
LiveKD uses the undocumented NtSystemDebugControl API to capture the memory dump. While you can easily find information about that API online the easiest thing to do is just use LiveKD.

failed using cuda-gdb to launch program with CUPTI calls

I'm having this weird issue: I have a program that uses CUPTI callbackAPI to monitor the kernels in the program. It runs well when it's directly launched; but when I put it under cuda-gdb and run, it failed with the following error:
error: function cuptiSubscribe(&subscriber, CUpti_CallbackFunc)my_callback, NULL) failed with error CUPTI_ERROR_NOT_INITIALIZED
I've tried all examples in CUPTI/samples and concluded that programs that use callbackAPI and activityAPI will fail under cuda-gdb. (They are all well-behaved without cuda-gdb) But the fail reason differs:
If I have calls from activityAPI, then once run it under cuda-gdb, it'll hang for a minute then exit with error:
The CUDA driver has hit an internal error. Error code: 0x100ff00000001c Further execution or debugging is unreliable. Please ensure that your temporary directory is mounted with write and exec permissions.
If I have calls from callbackAPI like my own program, then it'll fail out much sooner with the same error:
CUPTI_ERROR_NOT_INITIALIZED
Any experience on this kinda issue? I really appreciate that!
According to NVIDIA forum posting here and also referred to here, the CUDA "tools" must be used uniquely. These tools include:
CUPTI
any profiler
cuda-memcheck
a debugger
Only one of these can be "in use" on a code at a time. It should be fairly easy for developers to use a profiler, or cuda-memcheck, or a debugger independently, but a possible takeaway for those using CUPTI, who also wish to be able to use another CUDA "tool" on the same code, would be to provide a coding method to be able to disable CUPTI use in their application, when they wish to use another tool.

Unable to obtain Proper Stack Trace after running the dump file with WinDbg

We are having exception thrown on Production that causes the w3wp process to crash. To figure out the faulted code, we configured the Debug Diag that is creating dump file when exception occur. Then we are trying to run the dump file with WinDbg to obtain the Stack Trace to figure out the faulted code but this is what we are experiencing after opening the dump file and running the required commands.
As you can see in the image above, it's not giving the stack trace after running the commands, I'm not sure what I'm missing
UPDATE
After running a command twice as suggested in the comments, I'm able to get the stack trace. But seems like there is no faulted code pointed out in the stack instead there is a long list of underlying framework in the stack. Below is the snapshot for the start of stack. Not sure how to identify the error. Any suggestion or I may need to open separate Question for this?

Profiling foswiki with NYTProf results in incomplete profile data

I've an foswiki installation which is really slow (~ 60 seconds for a uncached page). I've tried to profile the installation with NYTProf, according to http://foswiki.org/Support/NYTProfDebugging with the following command:
> sudo -u www-data NYTPROF="file=/tmp/nytprof.out:addpid=1:endatexit=1" perl -wTd:NYTProf view -topic Some.Topic -username MyUsername
The script fails with an exit code 141 when I run it with profiler. If I run it without profiler (remote d:NYTProf) it exits successful and producing output.
After the profiling I've gotten a bunch of profile files in my /tmp directory:
nytprof.out.[841-1860]
But when I try to merge these files, I've get an error for the first file:
> nytprofmerge nytprof.out.*
Profile data incomplete, inflate error -5 ((null)) at end of input file, perhaps the process didn't exit cleanly or the file has been truncated (refer to TROUBLESHOOTING in the documentation)
I can merge the files without the first file, but the results are useless and shows only 87 calls to Foswiki::Sandbox::CORE:open and that's it.
Do I have any chance got get an valid profiling result? Or is there an other tool, that I can use in this case?
I'm not sure why you can't get NYTProfiler to work, we've used it to figure out some performance issues in Foswiki 2.0.2, which have been partially addressed in Foswiki 2.0.3. There are a couple of issues going on, but one major cause is our conversion to UNICODE internally, and some Perl regex issues in perl versions before 5.20. https://rt.perl.org/Public/Bug/Display.html?id=66852
Foswiki 2.0.3 made the following performance updates:
Changed some heavily called internal functions from regular expressions, to index()
Changed EditRowPlugin to generate less html that requires processing by regular expressions in the rendering module.
Made some other improvements to reduce excessive re-reading of topics.
If 2.0.3 doesn't significantly help, Check to see if the problem pages have large tables in them. If so, you might try disabling the EditRowPlugin and use EditTablePlugin.
Other than that, you might try our official support channel #foswiki on IRC, http://irclogs.foswiki.org/
The script fails with an exit code 141 when I run it with profiler.
That suggests the process received a SIGPIPE signal. The sigexit option may help.
If I run it without profiler ... it exits successful and producing output.
You're using sudo so permissions might be an issue, but that's just a guess. You'll need to dig deeper to confirm if a SIGPIPE is being received and why.
I'm not familiar with foswiki. Perhaps someone in that community could be more helpful.

Using Eclipse To Study Large Code Base, and Trace Method Calls

I am starting to work with a very large code base (a large webapp), and want to be able to see the method calls in order to understand how the requests are served. So, I want to use Eclipse to trace method calls for any request that comes in. I'm not sure, but I think that the best way of doing this is through remote debugger; so, I have already created a remote debugger. Now, my question is the following:
How can I configure the debugger such that as soon as a request comes in, the debugger would pause, and allow me to control its step through.
Is there a better way of tracing method calls (for the purpose of studying the code), or using a debugger is really the best method?
Use the trace module, and exclude framework or other uninteresting directories.
-m trace --file=/tmp/trace.log -t --ignore-dir=/home/unifield-server
You can then tail the trace file as it runs.
$ tail -f trace.log
You can also periodically clear it or append a marker for later analysis
$ echo > trace.log
$ echo 'about to press save button' >> trace.log