Correlate stack trace with source code in MATLAB .mexa64 file - matlab

I have custom C bindings called from Matlab and sometimes I get a segmentation fault. How can I identify which statement in my source code is producing the SEGFAULT?
My C function is called Pairing in the source file Pairing.c
Stack Trace (from fault):
[ 0] 0x00007fff6bc76d00 Pairing.mexa64+00015616
[ 1] 0x00007fff6bc74330 Pairing.mexa64+00004912 mexFunction+00001862
[ 2] 0x00007fffe2b4f213 MATLAB/R2020a/bin/glnxa64/libmex.so+00582163
The result of nm -a Pairing.mexa64 | grep ' N ' is
0000000000000000 N .debug_abbrev
0000000000000000 N .debug_aranges
0000000000000000 N .debug_info
0000000000000000 N .debug_line
0000000000000000 N .debug_str

Here is my trick (it has worked every time for me): run this in a terminal window
matlab -nojvm -nosplash -r 'my_script' -D"valgrind --error-limit=no --tool=memcheck -v --log-file=valgrind.log"
Preferably run this under Linux or Mac, but you can also do this in Windows using cygwin64/msys2. You need to install valgrind before use. Once it writes the log to valgrind.log, open that file in a text editor to see all the memory errors valgrind captured.
For CUDA code, you can replace the valgrind command and its parameters with cuda-memcheck, which does something similar for the GPU.
Keep your test script my_script.m very simple: for example, load a .mat file and then call your mex function immediately, to avoid lengthy overhead.

The way I solved it was following these steps
1) Use objdump -d Pairing.mexa64 > Pairing_obj.
2) Translate the decimal offset 00015616 from the stack trace to hex: 0x3d00.
3) Find address 0x3d00 in the disassembly and recognize which source statement produced that assembly.
4) Realize this is the first time a certain variable is dereferenced.
I am still looking for an easier way to do this.
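Since nm shows .debug_line and .debug_info sections in Pairing.mexa64, step 2 and 3 can probably be shortened by handing the hex offset to addr2line instead of reading the disassembly. A minimal sketch in Python, assuming the offsets printed by MATLAB are decimal (as in step 2) and that they correspond directly to addresses in the .mexa64 file; the helper names frame_offsets and addr2line_command are made up for illustration:

```python
import re

# Matches "<module>.mexa64+<decimal offset>" in a MATLAB crash stack trace.
FRAME_RE = re.compile(r"(\S+\.mexa64)\+(\d+)")

def frame_offsets(trace):
    """Return (module, hex_offset) pairs for every .mexa64 frame in the trace."""
    return [(m.group(1), hex(int(m.group(2), 10))) for m in FRAME_RE.finditer(trace)]

def addr2line_command(module, hex_offset):
    """Build an addr2line invocation: -f prints the function name, -e names the binary."""
    return ["addr2line", "-f", "-e", module, hex_offset]

TRACE = """\
[ 0] 0x00007fff6bc76d00 Pairing.mexa64+00015616
[ 1] 0x00007fff6bc74330 Pairing.mexa64+00004912 mexFunction+00001862
[ 2] 0x00007fffe2b4f213 MATLAB/R2020a/bin/glnxa64/libmex.so+00582163
"""

if __name__ == "__main__":
    for module, offset in frame_offsets(TRACE):
        # Print the command to run by hand in the directory holding the binary.
        print(" ".join(addr2line_command(module, offset)))
```

Running the printed commands should give a file and line number only if the mex file was built with -g and the source paths are still valid; otherwise addr2line prints ??:0 and the objdump route remains the fallback.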

Related

Usage of %line directive in NASM with dwarf debug format does not result in expected line numbers

I'm compiling the following code with NASM (nasm -g -F dwarf -f elf64 test.asm && gcc -g3 test.o).
global main
section .text
main:
%line 1 test.txt
PUSH 1337
%line 2 test.txt
PUSH 1338
%line 3 test.txt
PUSH 1339
%line 8 test.txt
POP RAX
%line 9 test.txt
POP RAX
%line 10 test.txt
POP RAX
RET
I would expect this to add the lines 1, 2, 3, 8, 9 and 10 to the dwarf data, however when I explore the file (using DWARF explorer, readelf or own code) I instead get the following lines:
test.txt 2 0x1130 (PUSH 1337)
test.txt 3 0x1135 (PUSH 1338)
test.txt 4 0x113a (PUSH 1339)
test.txt 9 0x113f (POP RAX)
test.txt 10 0x1140 (POP RAX)
test.txt 11 0x1141 (POP RAX)
test.txt 13 0x1142 (RET)
Every line number is one higher than what I provided in the assembly and in addition there is an extra line #13 situated at the ret statement. Could anyone explain what is going on here, and what I should do to get the expected result?
PUSH 1337 is on the line after the line you made line 1. So it seems %line sets the number of this line, and NASM's normal mechanism for counting line numbers continues to operate as normal. (Unlike GAS's deprecated .line which sets the line number of the next line.)
The RET is 2 lines after the last POP RAX, so that makes sense.
According to the manual, the full syntax includes an optional parameter to control line-number incrementing; apparently mmm defaults to 1:
%line nnn[+mmm] filename
So possibly (untested)
%line 1+0 test.txt
NASM -g is designed to make debug info for the asm source itself, not for the line numbers of some higher-level source file that was compiled to a .asm NASM file. The manual says it's intended for use with a macro-preprocessor (other than NASM's built-in macros), where it would make sense to have source line numbers increment.
But if you want to hack up that functionality, if +0 doesn't work, I guess you could keep resetting the %line before every instruction, with the same line number repeatedly for ones in a block that all came from the same higher-level source line.
And use the number before the one you want NASM to use. So I guess use %line 0 test.txt if you want the instruction on the next line to be reported as line 1 of test.txt, because 0 is the number before 1.
(Assuming NASM supports using 0 as a line number, and rewinding the line number to have the same line twice.)
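If the .asm file is produced by a tool, the workaround above (a %line directive before every instruction, using the number before the one you want) could be generated mechanically. A minimal sketch in Python, assuming the generator already has (source line, instruction) pairs; the function name emit_with_line_directives is hypothetical, and like the suggestion itself this has not been tested against NASM:

```python
def emit_with_line_directives(instructions, filename):
    """Return NASM source text with a %line directive before each instruction.

    instructions: iterable of (wanted_line_number, instruction_text) pairs.
    Each directive uses wanted_line_number - 1, on the assumption (from the
    observed output) that the line after the directive is reported as one higher.
    """
    out = []
    for wanted_line, text in instructions:
        out.append(f"%line {wanted_line - 1} {filename}")
        out.append(text)
    return "\n".join(out)

if __name__ == "__main__":
    listing = [(1, "PUSH 1337"), (2, "PUSH 1338"), (8, "POP RAX")]
    print(emit_with_line_directives(listing, "test.txt"))
```

Instructions that come from the same higher-level source line simply repeat the same number in the input list.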
I don't know of NASM directives equivalent in design to GAS's .loc which is intended for generating debug info for a C or other high-level source which compiled to a .s.

How to prevent WinDbg output from being truncated when it has too many rows?

If a WinDbg command produces a very large output, such as 100k rows, WinDbg ends up displaying only a few thousand of them and the rest are truncated. How can I prevent the output from being truncated, or write all of the output rows to a local file so that none are lost? The "Write Window Text to File" option does not help.
Not sure if it would help, but .logopen and .logclose commands might be helpful in this case (respectively open and close a log file which keeps a copy of the events and commands from the Debugger Command window).
See also Keeping a Log File in WinDbg.
Sometimes simply redirecting the output works, especially when running cdb and quitting after executing just one command:
cdb -c "tc 100;q" calc >> foo.txt
You should have 100 calls; let's check:
grep -c !.*: foo.txt
256
Let's check how many sysenter instructions were executed and what the syscall indexes were:
grep sysenter -B 4 foo.txt | grep eax | awk "{print $1}"
eax=000000ea
eax=0000014d
eax=000000fb
We can also use the output while the commands are still running, even for an indefinite amount of time, without file-locking issues, if .logopen / .logclose isn't an option.
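The same redirection can be scripted when several debugger runs need to be captured. A minimal sketch in Python, assuming cdb is on the PATH; the helper names build_cdb_command and run_and_save are made up for illustration, and only the -c option already used above is relied on:

```python
import subprocess

def build_cdb_command(debugger_commands, target):
    """Build the argv for: cdb -c "<cmd1>;<cmd2>;...;q" <target>."""
    script = ";".join(list(debugger_commands) + ["q"])
    return ["cdb", "-c", script, target]

def run_and_save(argv, out_path):
    """Run argv, appending everything it prints to out_path; return the exit code."""
    with open(out_path, "ab") as out:
        completed = subprocess.run(argv, stdout=out, stderr=subprocess.STDOUT)
    return completed.returncode

if __name__ == "__main__":
    # Equivalent to: cdb -c "tc 100;q" calc >> foo.txt
    run_and_save(build_cdb_command(["tc 100"], "calc"), "foo.txt")
```

Because the output goes straight to the file, nothing passes through the debugger window's scrollback.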
Try opening an additional command window with Ctrl+N and running the command with the long output in it.

Core dump: how to determine version of crashed application

I need to strictly bind every core file generated by system to certain bin version of crashed application. I can specify core-name pattern in sysctl.conf:kernel.core_pattern, but there is no way to put bin version here.
How can I put the version of crashed program into core file (revision number) or any other way to determine version of crashed bin?
I'm using qmake VERSION variable in .pro file, which contains revision number from SVN. Its available by QCoreApplication::applicationVersion(), in my every bin by flag --version.
Assuming your app can get far enough to print out its version number without a core dump, you can write a small program (python would probably be easiest) that is invoked by a core dump. The program would read stdin, dump it to a file, then rename the file based on the version number.
From man 5 core:
Piping core dumps to a program
Since kernel 2.6.19, Linux supports an alternate syntax for the /proc/sys/kernel/core_pattern file. If the first character of this file is a pipe symbol (|), then the remainder of the line is interpreted as a program to be executed. Instead of being written to a disk file, the core dump is given as standard input to the program. Note the following points:
* The program must be specified using an absolute pathname (or a pathname relative to the root directory, /), and must immediately follow the '|' character.
* The process created to run the program runs as user and group root.
* Command-line arguments can be supplied to the program (since Linux 2.6.24), delimited by white space (up to a total line length of 128 bytes).
* The command-line arguments can include any of the % specifiers listed above. For example, to pass the PID of the process that is being dumped, specify %p in an argument.
If you call your script /usr/local/bin/dumper, then
echo "|/usr/local/bin/dumper %E" > /proc/sys/kernel/core_pattern
The dumper should copy stdin to a file, then try to run the program named on its command line to extract a version number and use that to rename the file.
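Something like this might work (I haven't tried it, so use it at your own risk):
#!/usr/bin/python3
# Core-dump helper: the kernel pipes the core to stdin and passes %E as argv[1].
import os
import shutil
import subprocess
import sys

CORE_FNAME = "/tmp/core"

# Copy the core dump from stdin (binary data) to a file.
with open(CORE_FNAME, "wb") as f:
    shutil.copyfileobj(sys.stdin.buffer, f)

# %E is the executable's path with '/' replaced by '!'; undo that.
pname = sys.argv[1].replace("!", "/")
# Ask the binary for its version; use the last word of the first output line.
out = subprocess.check_output([pname, "--version"], text=True)
version = out.split("\n")[0].split()[-1]
os.rename(CORE_FNAME, CORE_FNAME + version)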
The really big risk of doing this is recursive core dumps that may crash your system. Be sure to use ulimit to only allow core dumps from processes that can print out their own versions without core dumping.
It would be a good idea to change the script to re-run the program to get the version info only if it is the program you are expecting.
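One way to do that check is an allow-list of binaries that are known to answer --version safely. A minimal sketch in Python, assuming the dumper receives %E as its argument; the path /usr/local/bin/myapp and the names EXPECTED_BINARIES, decode_core_pattern_path and should_query_version are placeholders, not part of any real setup:

```python
import os

# Hypothetical allow-list: binaries known to print their version without crashing.
EXPECTED_BINARIES = {"/usr/local/bin/myapp"}

def decode_core_pattern_path(arg):
    """Undo the %E encoding, which replaces '/' with '!' in the executable path."""
    return arg.replace("!", "/")

def should_query_version(arg, expected=EXPECTED_BINARIES):
    """Return True only if the crashed executable is one we expect to re-run."""
    return os.path.normpath(decode_core_pattern_path(arg)) in expected

if __name__ == "__main__":
    print(should_query_version("!usr!local!bin!myapp"))
```

The dumper would call should_query_version(sys.argv[1]) before running the binary with --version, and otherwise leave the core under a generic name.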

Using SAS and mkdir to create a directory structure in windows

I want to create a directory structure in Windows from within SAS. Preferably using a method that will allow me to specify a UNC naming convention such as:
\\computername\downloads\x\y\z
I have seen many examples for SAS on the web using the DOS mkdir command called via %sysexec() or the X command. The nice thing about the mkdir command is that it will create any intermediate folders if they don't already exist. I successfully tested the commands below from the prompt and they behaved as expected (quoting does not seem to matter as I have no spaces in my path names):
mkdir \\computername\downloads\x\y\z
mkdir d:\y
mkdir d:\y\y
mkdir "d:\z"
mkdir "d:\z\z"
mkdir \\computername\downloads\z\z\z
mkdir "\\computername\downloads\z\z\z"
The following run fine from SAS:
x mkdir d:\x;
x 'mkdir d:\y';
x 'mkdir "d:\z"';
x mkdir \\computername\downloads\x;
x 'mkdir \\computername\downloads\y';
But these do not work when run from SAS, e.g.:
x mkdir d:\x\x;
x 'mkdir d:\y\y';
x 'mkdir "d:\z\z"';
x mkdir \\computername\downloads\x\y\z ;
x 'mkdir "\\computername\downloads\z"';
** OR **;
%sysexec mkdir "\\computername\downloads\x\y\z ";
** OR **;
filename mkdir pipe "mkdir \\computername\downloads\x\y\z";
data _null_;
input mkdir;
put infile;
run;
It does not work. Not only this but the window closes immediately even though I have options xwait specified so there is no opportunity to see any ERROR messages. I have tried all methods with both the UNC path and a drive letter path, ie. D:\downloads\x\y\z.
If I look at the error messages being returned by the OS:
%put %sysfunc(sysrc()) %sysfunc(sysmsg());
I get the following:
-20006 WARNING: Physical file does not exist, d:\downloads\x\x\x.
Looking at the documentation for the mkdir command it appears that it only supports creating intermediate folders when 'command extensions' are enabled. This can be achieved with adding the /E:ON to cmd.exe. I've tried changing my code to use:
cmd.exe /c /E:ON mkdir "\\computername\downloads\x\y\z"
And still no luck!
Can anyone tell me why everyone else on the internet seems to be able to get this working from within SAS except for me? Again, it works fine from a DOS prompt - just not from within SAS.
I'd prefer an answer that specifically address this issue (I know there are other solutions that use multiple steps or dcreate()).
I'm running WinXP 32Bit, SAS 9.3 TS1M2. Thanks.
Here is a trick that uses the LIBNAME statement to make a directory
options dlcreatedir;
libname newdir "/u/sascrh/brand_new_folder";
I believe this is more reliable than an X statement.
Source: SAS trick: get the LIBNAME statement to create folders for you
You need to use the mkdir option -p which will create all the sub folders
i.e.
x mkdir -p "c:\newdirectory\level 1\level 2";
I'm on WinXP as well, using SAS 9.3 TS1M1. The following works for me as advertised:
122 options noxwait;
123 data _null_;
124 rc = system('mkdir \\W98052442n3m1\public\x\y\z');
125 put rc=;
126 run;
rc=0
NOTE: DATA statement used (Total process time):
real time 1.68 seconds
cpu time 0.03 seconds
That's my actual log file; "public" is a Windows shared folder on that network PC and the entire path was created. Perhaps using the SYSTEM function did the trick. I never ever use the X command myself.
You need to quote your x commands, e.g.
x 'mkdir "c:\this\that\something else"' ;
Also, I've never had a problem using UNC paths, e.g.
x "\\server.domain\share\runthis.exe" ;
This seems to work just fine, with the DOS window remaining open. You may need the XSYNC option. I am using 9.3 TS1M1 64-bit under VMware on a Mac:
options xwait xsync;
x mkdir c:\newdirectory;

Calling a software from Matlab

The command prompt works well in all respects: running the software as well as generating reports and output files. To generate an output file containing the desired result, we have to run the executable of the report program, which uses a parameter file. For example, from the command prompt it would look like this:
"path\report.exe" -f Report.rwd -o Report.rwo
The output file is Report.rwo, this file will contain the variable exported.
Now to implement this in Matlab, below is a small script giving a gist of what I am trying to achieve. It calls the software for each run and extracts the data.
for nr=1:NREAL
dlmwrite('file.INC',file(:,nr),'delimiter','\n'); % Writes the data file for each run
system('"path\file.dat"'); % calls software
system('"path\Report.rwd" -o "path\Report.rwo"'); % calls report
[a,b]=textread('"path\Report.rwo".rwo','%f\t%f'); % Reads the data and store it in the variable b
end
So I have two problems:
1) When I run this script in Matlab, it does not generate the output file Report.rwo. Consequently, it gives an error at the line containing the 'textread' function because the file is absent.
2) Every time Matlab calls a report (.rwd file), it prompts me to hit enter or type 'q' to quit. With hundreds of files to run, I would be prompted to hit enter for every file. The following line causes the prompt:
system('"path\Report.rwd" -o "path\Report.rwo"'); % Calls report
OLDER EDIT: There are 2 updates to my problem, as follows:
Update 1: It seems that part 2 of my problem above has been resolved by Jacob. It is working fine for one run. However, the final outcome will be confirmed only when I am able to run my whole program, which involves running hundreds of files.
Update 2: I can run the software and generate the output file from the command prompt as follows:
"path\mx200810.exe" -f file.dat
This command reads the report parameter file and generates the output file:
"path\report.exe" -f Report.rwd -o Report.rwo
LATEST EDIT:
1) I am able to run the software, avoid the prompt to hit the return key and generate the output file using Matlab through the following commands:
system('report.exe /f Report.rwd /o Report.rwo')
system('mx200810.exe -f file.dat')
However, I was able to do it only after copying my required .exe and .dll files in the same folder where I have my .dat file. So I am running the .m file through the same folder where I have all these files.
2) However there is still one error in Matlab's command window which says this:
"...STOP: Unable to open the following file as data file:
'file.dat'
Check path name for spaces, special character or a total length greater than 256 characters
Cannot find data file named 'file.dat'
Date and Time of End of Run: .....
ans = 0"
Strings enclosed in " .. " are invalid in MATLAB (at least in older releases; double-quoted string literals only arrived in R2017a), so I do not know how your system calls can even work.
Replace all " with ' and then update your question and include the command line arguments (e.g.-f file.dat) inside the quotes as below:
%# Calls software
system('"path\mx200810.exe" -f file.dat');
%# Calls report
system('"path\report.exe" -f Report.rwd -o Report.rwo');
Update:
Here's a cheap trick to solve your second problem (type q to terminate the program):
%# Calls software
system('"path\mx200810.exe" -f "path\file.dat" < "C:\inp.txt"');
%# Calls report
system('"path\report.exe" -f "path\Report.rwd" -o "path\Report.rwo" < "C:\inp.txt"');
Create a file (e.g. C:\inp.txt) which contains the letter q followed by the return character. You can create this by opening Notepad, typing q, hitting the return key and saving it as C:\inp.txt. This will serve as the "input" report.exe seems to need.
Change all the system calls in your code so that the input from the text file we just made is piped into it. I've included the modified calls above (scroll to the end to see the difference).
Use both outputs of system to get the exit status of the run and any text output it produced.
cmd_line = '"path\report.exe" -f Report.rwd -o Report.rwo';
[status, result] = system(cmd_line);
Continue your script depending on the status variable; stop if it is nonzero.
if (status)
error('Error running report.exe')
end
[a,b]=textread(...
If your parameters are variable you can generate the command line string in MATLAB using string concatenation or SPRINTF function.