I write CUDA code that I call from MATLAB MEX files. I am not using any of MATLAB's GPU libraries or capabilities; my code is just CUDA code that accepts C-type variables. I only use MEX to convert from mw types to C types, and then call independent, self-written CUDA code.
The problem is that sometimes, especially during development, CUDA fails (because I made a mistake). Most CUDA calls are surrounded by a call to gpuErrchk(cudaDoSomething(...)), defined as:
// Uses MATLAB functions but you get the idea.
#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true)
{
    if (code != cudaSuccess)
    {
        mexPrintf("GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
        if (abort)
        {
            //cudaDeviceReset(); //This does not make MATLAB release it
            mexErrMsgIdAndTxt("MEX:myfun", ".");
        }
    }
}
While this works as expected, giving errors such as
GPUassert: an illegal memory access was encountered somefile.cu 208
In most cases MATLAB does not release the GPU afterwards, meaning that even if I change the code and recompile, the next call results in the error:
GPUassert: all CUDA-capable devices are busy or unavailable
somefile.cu first_cuda_line
The only way to clear this error is to restart MATLAB, which is annoying and hinders the development/testing process. This is not what happens when I develop in, say, Visual Studio.
I have tried calling cudaDeviceReset() both before and after the error is raised, but to no avail.
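For concreteness, this is roughly what those attempts looked like (a minimal sketch; in practice neither placement made MATLAB release the GPU):

#include "mex.h"
#include <cuda_runtime.h>

inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true)
{
    if (code != cudaSuccess)
    {
        mexPrintf("GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
        if (abort)
        {
            cudaDeviceReset();                 // attempt 1: reset right before raising the MEX error
            mexErrMsgIdAndTxt("MEX:myfun", ".");
        }
    }
}

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    cudaDeviceReset();                         // attempt 2: reset at the start of the next call
    // ... convert mw types to C types and call the self-written CUDA code here ...
}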
What can I do/try to make MATLAB release the GPU after a GPU runtime error?
I'm trying to create my own servo.write block in Simulink for Arduino Due deployment (and External Mode). Before you ask why, given that there is one available in the Simulink Arduino Support Package: my final goal is to create a block that uses the Arduino servo.writeMicroseconds function (there is no out-of-the-box block for that one), but first I want to try something simple to debug, to see if I can get it to work.
I've been using this guide (https://www.mathworks.com/matlabcentral/fileexchange/39354-device-drivers) and one of the working examples in it as a template, and started modifying it (originally it implemented a digital output driver). I took the LCT approach.
The original digitalio_arduino.cpp/.h files from the guide (the example with digital read/write) were the files I modified, as they worked without any issues out of the box. Step by step, I made the following modifications:
Remove the DIO read (leave only write) from the CPP and H files
Change StartFcnSpec to digitalIOSetup and make changes in the H file so the port is always in OUTPUT mode
Include the Servo.h library in the CPP file and create a Servo object
Up to this point all edits went fine: no compile errors, all header files were detected by Simulink, and the LED kept blinking as it should, so the code actually worked (I ran it in External Mode).
But as soon as I made the final modification and replaced pinMode() with myservo.attach() and digitalWrite() with myservo.write() (of course, I changed the data type in the writeDigitalPin function from boolean to uint8_T), the code, despite compiling and building without any issue, didn't work at all. The specified servo pin was completely dead, as if it had never even been initialised. Changing the value on the S-Function input didn't yield any results.
Of course, if I replace the custom block with the built-in Servo Write block from the Hardware Support Package, everything works fine, so it's not a hardware issue.
I'm completely out of ideas about what could be wrong, especially since there are no errors, so there is not even a hint of where to look.
Here is the LCT *.m script used to generate the S-Function:
def = legacy_code('initialize');
def.SFunctionName = 'dout_sfun';
def.OutputFcnSpec = 'void NO_OP(uint8 p1, uint8 u1)';
def.StartFcnSpec = 'void NO_OP(uint8 p1)';
legacy_code('sfcn_cmex_generate', def);
legacy_code('compile', def, '-DNO_OP=//')
def.SourceFiles = {fullfile(pwd,'..','src','digitalio_arduino.cpp')};
def.HeaderFiles = {'digitalio_arduino.h'};
def.IncPaths = {fullfile(pwd,'..','src'), 'C:\ProgramData\MATLAB\SupportPackages\R2021b\aIDE\libraries\Servo\src'};
def.OutputFcnSpec = 'void writeDigitalPin(uint8 p1, uint8 u1)';
def.StartFcnSpec = 'void digitalIOSetup(uint8 p1)';
legacy_code('sfcn_cmex_generate', def);
legacy_code('sfcn_tlc_generate', def);
legacy_code('rtwmakecfg_generate',def);
legacy_code('slblock_generate',def);
Here is the digitalio_arduino.cpp file:
#include <Arduino.h>
#include <Servo.h>
#include "digitalio_arduino.h"
Servo myservo;
// Digital I/O initialization
extern "C" void digitalIOSetup(uint8_T pin)
{
    //pinMode(pin, OUTPUT);
    myservo.attach(pin);
}

// Write a logic value to pin
extern "C" void writeDigitalPin(uint8_T pin, uint8_T val)
{
    //digitalWrite(pin, val);
    myservo.write(val);
}
// [EOF]
And here is the digitalio_arduino.h file:
#ifndef _DIGITALIO_ARDUINO_H_
#define _DIGITALIO_ARDUINO_H_
#include "rtwtypes.h"
#ifdef __cplusplus
extern "C" {
#endif
void digitalIOSetup(uint8_T pin);
void writeDigitalPin(uint8_T pin, uint8_T val);
#ifdef __cplusplus
}
#endif
#endif //_DIGITALIO_ARDUINO_H_
As I mentioned, I've been using a working example as a reference, and I modified it step by step to see whether there is a point at which some error suddenly comes up, but everything compiles and still does not work :/
I wondered whether there might be an issue with the Servo.h library or the Servo object, and did some tinkering with these. For example, I removed the Servo myservo; line to see if anything happens and, as expected, I started receiving errors that Servo is not defined. If I did not include Servo.h at all, or forgot to add the IncPath to Servo.h, I got compile errors about Servo not being a supported symbol or Servo.h not being found. So the code actually seems to be "working" in a way; it seems to have everything it needs to work :/
I also looked at the MathWorks implementation of the Servo Write block, MWServoReadWrite, to see how the Arduino API is used there and, no surprise, it's used in the same way I've been trying to use it: they include Servo.h and use servo.attach() and servo.write() to control the servo pin. And that's it. Yet for them it works, and for me it does not :/
When I inspect the generated C code that runs on the Arduino (with my custom S-Function block in it), all the functions seem to be placed exactly where they are supposed to be, and they receive the correct arguments. I expected to at least find a hint in there, e.g. missing code or anything else.
Here's a brief summary of what happened: basically, I'm trying to rewrite a MATLAB program as a C# program, where both of them use a function implemented in a Visual Studio-compiled library.
In MATLAB:
val = MyDLL.MyFunc(int32(a),int32(b),int32(c),int32(d));
Where:
MyDLL= actxserver('MyLib.MyClass');
And in C#, it directly calls the function:
var obj = new MyLib.MyClass();
var val = obj.MyFunc((int)a, (int)b, (int)c, (int)d);
However, in the C# program an exception is raised from MyFunc, while in the MATLAB program nothing happens. Since they are essentially calling the same function from the same library, could the reason be that MATLAB doesn't actually respond to exceptions raised from an external library?
Any advice would be helpful, thanks!
P.S.: All arguments match each other, and both programs are definitely calling the very same function.
I fiddled with this the whole day, so I thought I might let everyone benefit from my experience; please see my answer below.
I first had a problem running a compiled MEX file within Matlab, because Matlab complained that it couldn't open the shared library libarmadillo. I solved this using the environment variables LD_LIBRARY_PATH and LD_RUN_PATH (DYLD_LIBRARY_PATH and DYLD_RUN_PATH on OS X).
The problem remained, however, that a simple test file would segfault at runtime even though the exact same code compiled and ran fine outside Matlab (not MEX'd).
The segfault seems to be caused by the fact that Matlab uses 64-bit integers (long long, or int64_t) in its bundled LAPACK and BLAS libraries. Armadillo, on the other hand, uses 32-bit integers (a regular int on a 64-bit platform, or int32_t) by default.
There are two solutions: the first involves forcing Matlab to link against the system's libraries instead (which use ints); the second involves changing Armadillo's config file to enable long longs with BLAS. I tend to think the first is more reliable, because there is no black-box effect, but it's also more troublesome, because you need to manually install your BLAS and LAPACK libraries and remember their paths.
Both solutions required that I stop using Armadillo's shared libraries and link/include the sources manually.
To do this, simply install LAPACK and BLAS on your system (if they are not already there; on Ubuntu that's libblas-dev and liblapack-dev), and copy Armadillo's entire includes directory somewhere sensible, for example $HOME/.local/arma.
Solution 1: linking to the system's libraries
From the matlab console, set the environment variables BLAS_VERSION and LAPACK_VERSION to point to your system's libraries. In my case (Ubuntu 14.04, Matlab R2014b):
setenv('BLAS_VERSION','/usr/lib/libblas.so');
setenv('LAPACK_VERSION','/usr/lib/liblapack.so');
You can then compile normally:
mex -compatibleArrayDims -outdir +mx -L/home/john/.local/arma -llapack -lblas -I/home/john/.local/arma test_arma.cpp
or if you define the flag ARMA_64BIT_WORD in includes/armadillo_bits/config.hpp, you can drop the option -compatibleArrayDims.
Solution 2: changing Armadillo's config
The second solution involves uncommenting the flag ARMA_BLAS_LONG_LONG in Armadillo's config file, includes/armadillo_bits/config.hpp. Matlab will still link against its bundled LAPACK and BLAS libraries, but this time Armadillo won't segfault because it's using the right word size. As before, you can also uncomment ARMA_64BIT_WORD if you want to drop -compatibleArrayDims.
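For reference, this is roughly the part of includes/armadillo_bits/config.hpp to edit (a sketch; the exact wording and position of these lines vary between Armadillo versions). Defining the flag on the mex command line with -DARMA_BLAS_LONG_LONG, as in the command below, has the same effect as uncommenting it here.

// before (both flags commented out):
// #define ARMA_BLAS_LONG_LONG
// #define ARMA_64BIT_WORD

// after: Armadillo now matches Matlab's 64-bit BLAS/LAPACK interface
#define ARMA_BLAS_LONG_LONG
#define ARMA_64BIT_WORD   // optional: lets you drop -compatibleArrayDims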
Compiled with
mex -larmadillo -DARMA_BLAS_LONG_LONG armaMex_demo2.cpp
(In Matlab)
armaMex_demo2(rand(1))
works without segfault.
However, compiled with
mex -larmadillo armaMex_demo2.cpp
(In Matlab)
armaMex_demo2(rand(1))
causes a segfault.
Here, armaMex_demo2.cpp is
/* ******************************************************************* */
// armaMex_demo2.cpp: Modified from armaMex_demo.cpp copyright Conrad Sanderson and George Yammine.
/* ******************************************************************* */
// Demonstration of how to connect Armadillo with Matlab mex functions.
// Version 0.2
//
// Copyright (C) 2014 George Yammine
// Copyright (C) 2014 Conrad Sanderson
//
// This Source Code Form is subject to the terms of the Mozilla Public
// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at http://mozilla.org/MPL/2.0/.
/////////////////////////////
#include "armaMex.hpp"
void
mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    /*
       Input:  X (real matrix)
       Output: eigenvalues of X*X.T
    */
    if (nrhs != 1)
        mexErrMsgTxt("Incorrect number of input arguments.");

    // Check the matrix is real
    if ((mxGetClassID(prhs[0]) != mxDOUBLE_CLASS) || mxIsComplex(prhs[0]))
        mexErrMsgTxt("Input must be double and not complex.");

    // Create matrix X from the first argument.
    arma::mat X = armaGetPr(prhs[0], true);

    // Run an arma function (eig_sym)
    arma::vec eigvals(X.n_rows);
    if (not arma::eig_sym(eigvals, X*X.t()))
        mexErrMsgTxt("arma::eig_sym failed.");

    // Return the result to Matlab
    plhs[0] = armaCreateMxMatrix(eigvals.n_elem, 1);
    armaSetPr(plhs[0], eigvals);
    return;
}
I'm using the MATLAB Engine C interface on OS X. I noticed that if engEvalString() is given an incomplete MATLAB input such as
engEvalString(ep, "x=[1 2");
or
engEvalString(ep, "for i=1:10");
then the function simply never returns. The quickest way to test this is using the engdemo.c example which will prompt for a piece of MATLAB code and evaluate it (i.e. you can type anything).
My application lets the user enter arbitrary MATLAB input and evaluate it, so I can't easily protect against incomplete input. Is there a workaround? Is there a way to prevent engEvalString() from hanging in this situation or is there a way to check an arbitrary piece of code for correctness/completeness before I actually pass it to MATLAB?
As you noted, it seems this bug is specific to Mac and/or Linux (I couldn't reproduce it on my Windows machine). As a workaround, wrap the calls in eval, evalc, or evalin:
engEvalString(ep, "eval('x = [1,2')")
Furthermore, an undocumented feature of those functions is that they take a second input that is evaluated in case an error occurs in the first one. For example:
ERR_FLAG = false;
eval('x = [1,2', 'x=nan; ERR_FLAG=true;')
You can trap errors that way by querying the value of a global error flag, and still avoid the bug above...
This was confirmed by support to be a bug in the MATLAB Engine interface on OS X (it's not present in Windows). Workarounds are possible by using the MATLAB functions eval, evalc, or similar. Instead of directly passing the code to engEvalString(), wrap it in these first.
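On the C/C++ side, the wrapping can be done with a small helper before calling engEvalString (a minimal sketch; evalWrapped is a hypothetical helper name, and escaping single quotes is the only transformation applied):

#include <string>
#include "engine.h"

// Wrap arbitrary user input in evalc('...') so that incomplete statements
// raise a MATLAB error instead of hanging the engine call.
static int evalWrapped(Engine *ep, const std::string &userCode)
{
    std::string escaped;
    for (char c : userCode) {
        escaped += c;
        if (c == '\'') escaped += '\'';   // double single quotes inside the MATLAB string literal
    }
    std::string cmd = "evalc('" + escaped + "')";
    return engEvalString(ep, cmd.c_str());
}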
I am interfacing Python with a C++ library using Cython. I need a callback function that the C++ code can call. I also need to pass a reference to a specific Python object to this function. This is all quite clear from the callback demo.
However, I get various errors when the callback is called from a C++ thread (pthread):
Pass a function pointer and a class/object (as void*) to C++
Store the pointers within C++
Start a new thread (pthread) running a loop
Call the function using the stored function pointer and pass back the class pointer (void*)
In Python: cast the void* back to the class/object
Call a method of the above class/object (error)
Steps 2 and 3 are in C++; a sketch of that side is shown below.
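Roughly, the C++ side looks like this (a minimal sketch of steps 2-4; RegisterCallback, StartWorker, and the loop contents are illustrative names, not my actual hardware code):

#include <pthread.h>

typedef int (*CollisionCallback)(float *angles, void *user_data);

static CollisionCallback g_callback  = 0;   // step 2: stored function pointer
static void             *g_user_data = 0;   // step 2: stored Python object (opaque)

extern "C" void RegisterCallback(CollisionCallback cb, void *user_data)
{
    g_callback  = cb;
    g_user_data = user_data;
}

static void *WorkerLoop(void *)
{
    float angles[6] = {0};
    for (;;) {
        // ... talk to the hardware and fill `angles` ...
        if (g_callback)
            g_callback(angles, g_user_data);  // step 4: call back into Python
    }
    return 0;
}

extern "C" void StartWorker()
{
    pthread_t tid;
    pthread_create(&tid, 0, WorkerLoop, 0);   // step 3: start the pthread
}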
The errors are mostly segmentation faults, but sometimes I get complaints about some low-level Python calls.
I have the exact same code where I create the thread in Python and call the callback directly, and that works OK, so the callback itself is working.
I do need a separate thread running in C++, since this thread communicates with hardware and occasionally calls my callback.
I have also triple-checked all the pointers that are being passed around. They point to valid locations.
I suspect there are some problems when using Cython classes from a C++ thread..?
I am using Python 2.6.6 on Ubuntu.
So my questions are:
Can I manipulate Python objects from a non-Python thread?
If not, is there a way to make the thread (pthread) Python-compatible?
This is the minimal callback that already causes problems when called from a C++ thread:
cdef int CheckCollision(float* Angles, void* user_data):
    self = <CollisionDetector>user_data
    return self.__sizeof__()  # <====== Error
No, you must not manipulate Python objects without acquiring the GIL first. You must use PyGILState_Ensure() + PyGILState_Release() (see the PyGILState_Ensure documentation).
You can also ensure that your object will not be deleted by Python by explicitly taking a reference with Py_INCREF() and releasing it with Py_DECREF() when you are no longer using it. If you pass the callback to your C++ code and no reference is held anymore, your Python object may be freed and the crash happens (see the Py_INCREF documentation).
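In C/C++ (using the CPython C API directly), the callback-side pattern looks roughly like this (a sketch, not your code; the method name "check" and the return convention are made up for illustration):

#include <Python.h>

/* Called from the foreign C++ pthread. */
static int CheckCollision(float *angles, void *user_data)
{
    PyGILState_STATE gstate = PyGILState_Ensure();     /* acquire the GIL on this thread */
    PyObject *detector = (PyObject *)user_data;

    Py_INCREF(detector);                               /* keep the object alive while we use it */
    PyObject *result = PyObject_CallMethod(detector, (char *)"check", (char *)"(f)", angles[0]);

    int collided = 0;
    if (result != NULL) {
        collided = PyObject_IsTrue(result);
        Py_DECREF(result);
    } else {
        PyErr_Print();                                 /* report the Python error, don't propagate it */
    }
    Py_DECREF(detector);

    PyGILState_Release(gstate);                        /* release the GIL before returning */
    return collided;
}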
Disclaimer: I'm not saying this is your issue, just giving you tips to track down the bug :)