I am learning to write MEX files in MATLAB. I have written this simple code:
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]){
double *outData, *inData;
if(nrhs!=2) mexErrMsgTxt("Missing input data.");
inData = mxGetDoubles(prhs[0]);
outData = mxGetDoubles(plhs[0]);
outData[0] = inData[0]+inData[1];
}
But when I try to run it, MATLAB crashes. The problem is the last line. Do you have any suggestions as to why?
Thank you
plhs[0] (i.e. the pointer to the left-hand side of your function call) is the output.
This output variable is not allocated in memory; you just have a pointer to it. So you cannot write to it (nor read from it) without creating it first.
So you would need something like
const int ndims = 1; // or whatever dims you want
const mwSize dims[]={1}; // or whatever size you want
// create memory/variable
plhs[0] = mxCreateNumericArray(ndims ,dims,mxDOUBLE_CLASS,mxREAL);
// now it exists
outData = mxGetDoubles(plhs[0]);
However, note that if the input is not an array with at least two elements, inData[1] does not exist. Reading it is an out-of-bounds access (undefined behavior in C), which can crash MATLAB. So it's generally good practice to check the length of the array before accessing it.
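The length check suggested above can be sketched in plain C, so that it compiles without MATLAB. The function name add_first_two and its return convention are illustrative assumptions, not part of the MEX API. In a MEX file the element count would typically come from mxGetNumberOfElements(prhs[0]), and the failure branch would call mexErrMsgTxt.

```c
#include <stddef.h>

/* Adds the first two elements of `in` and stores the sum in `*out`.
 * Returns 0 on success and -1 if the arguments are unusable, so the
 * caller never reads in[1] from a buffer that is too short. */
int add_first_two(const double *in, size_t count, double *out)
{
    if (in == NULL || out == NULL || count < 2)
        return -1;              /* refuse instead of reading out of bounds */
    *out = in[0] + in[1];
    return 0;
}
```

The point of the design is that the size is checked before any element is touched, so a short input produces an error code rather than an invalid read.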
Is there any way to return a variable number of outputs from a mex function?
One might pack them into a cell, but I wondered whether there is a way to have them expanded directly in the output list. Something like
a = mymex(arg1, arg2);
[a, b, c] = mymex(arg1, arg2, 'all');
Of course you can, just like any other MATLAB function:
test_mex.cpp
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
for (int i=0; i<nlhs; i++) {
plhs[i] = mxCreateDoubleScalar(i);
}
}
MATLAB
>> [a,b,c] = test_mex()
a =
0
b =
1
c =
2
You will have to determine how many arguments to return depending on the inputs/outputs, or issue an error if the function is called incorrectly (not enough inputs, too many outputs, etc.). The example above just accepts any number of output arguments (including zero).
Take comma-separated lists, which allow us to call functions in interesting ways:
>> out = cell(1,20);
>> [out{:}] = test_mex()
out =
Columns 1 through 11
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]
Columns 12 through 20
[11] [12] [13] [14] [15] [16] [17] [18] [19]
This is like calling the function with 20 output variables:
>> [x1,x2,...,x20] = test_mex()
EDIT:
Just to clarify, MEX-functions act like regular M-functions defined with variable number of inputs and outputs (think function varargout = mymex(varargin)), and the same rules apply; it is up to you to manage access to inputs and create necessary outputs.
For example the previous code can be written as a regular M-function called the same way as before:
function varargout = test_fcn(varargin)
for i=1:nargout
varargout{i} = i-1;
end
end
The difference is that in MEX-files you could crash MATLAB if you try to access something out-of-range, or try to write output beyond what is actually allocated.
Take this example:
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
for (int i=0; i<30; i++) { // <--- note 30 as value
plhs[i] = mxCreateDoubleScalar(i);
}
}
Calling the above as:
>> test_mex % careful!
will most likely cause an immediate MATLAB crash, while the same thing done in M-code will just create unnecessary extra output variables that are destroyed after the call.
As @chappjc explained in his answer, for MEX-files you are always guaranteed to have space for at least one output (plhs is an array of mxArray* of length 1 at a minimum, so it's safe to assign plhs[0] no matter what). If the caller specifies a LHS variable then the output goes into it, otherwise the output is assigned to the special ans variable (of course you can still assign nothing, in case of zero outputs).
The opposite case of not assigning enough output variables is fortunately caught by MATLAB, and throws a regular catch-able error with ID MATLAB:unassignedOutputs (in both MEX and M-functions).
Also, accessing out-of-range inputs will cause an access violation (MATLAB will inform you of that with a big scary dialog, and the prompt will turn into a "please restart MATLAB" message). Doing the same in regular M-code will just throw a regular error, "Index exceeds matrix dimensions.", nothing serious!
As you can see, it is very easy for things to go wrong in MEX world (in an unrecoverable way), so you have to pay special attention to validating input/output arguments.
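The bound on output slots described above can be captured in a small helper. This is a plain C sketch with hypothetical names (writable_outputs, outputs_to_assign) that are not part of the MEX API. It only encodes the rule that plhs has room for nlhs outputs, or for one output when nlhs is 0.

```c
/* Number of plhs slots a MEX function may safely assign: MATLAB provides
 * room for at least one output (used for `ans`) even when nlhs is 0. */
int writable_outputs(int nlhs)
{
    return nlhs > 0 ? nlhs : 1;
}

/* Clamps the number of results a function has available to the number of
 * slots the caller provided, so a loop never writes past the end of plhs. */
int outputs_to_assign(int nlhs, int available_results)
{
    int room = writable_outputs(nlhs);
    return available_results < room ? available_results : room;
}
```

With such a bound, the loop from the crashing example would run to outputs_to_assign(nlhs, 30) instead of a hard-coded 30.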
The syntax for calling a MEX function is identical to that of any other MATLAB function. However, internal to the MEX function, the number of in/out arguments used is determined by the first and third arguments to mexFunction (usually named nlhs and nrhs, although any names work).
Declaration of mexFunction in mex.h (around line 141 in R2014b):
/*
* mexFunction is the user-defined C routine that is called upon invocation
* of a MEX-function.
*/
void mexFunction(
int nlhs, /* number of expected outputs */
mxArray *plhs[], /* array of pointers to output arguments */
int nrhs, /* number of inputs */
const mxArray *prhs[] /* array of pointers to input arguments */
);
This is not unlike the syntax for the standard main functions for C/C++ command line executables, but there are notable differences. Unlike the native command line version (int main(int argc, const char* argv[])), the count and pointer array do not include the name of the function (argv[0] is usually the name of the executable program file), and with mexFunction there are parameters for output arguments as well as input.
The naming for nlhs and nrhs should be clear. HINT: In mathematics, an equation has a left hand side and a right hand side.
An important but easily overlooked quality of mexFunction I/O handling pertains to MATLAB's special ans variable, the place where function outputs (often) go if you don't assign to a variable. In a MEX file, you need to remember the following when checking nlhs. If nlhs=0, you can and must still write to plhs[0] for ans to be used. It's documented, just barely, in Data Flow in MEX-Files. Consider what happens with the following code:
// test_nlhs.cpp
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
for (int i=0; i<nlhs; ++i)
plhs[i] = mxCreateDoubleScalar(i);
}
Outputs:
>> test_nlhs
>> [a,b] = test_nlhs
a =
0
b =
1
There's no output to ans because the logic with nlhs prevents it from assigning to plhs[0]. Now this code:
// test_nlhs.cpp
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
if (nlhs==0) {
mexPrintf("Returning special value to ""ans"".\n");
plhs[0] = mxCreateDoubleScalar(-1);
} else {
mexPrintf("Returning to specified workspace variables.\n");
for (int i=0; i<nlhs; ++i)
plhs[i] = mxCreateDoubleScalar(i);
}
}
Outputs
>> test_nlhs
Returning special value to ans.
ans =
-1
>> [a,b] = test_nlhs
Returning to specified workspace variables.
a =
0
b =
1
It certainly doesn't have to be a special value, and it should usually be the same as the first output argument, but I'm illustrating how you recognize the calling syntax.
I have found a really tricky problem, which I can not seem to fix easily. In short, I would like to return from a mex file an array, which has been passed as mex function input. You could trivially do this:
void mexFunction(int nargout, mxArray *pargout [ ], int nargin, const mxArray *pargin[])
{
pargout[0] = pargin[0];
}
But this is not what I need. I would like to get the raw pointer from pargin[0], process it internally, and return a freshly created mex array by setting the corresponding data pointer. Like that:
#include <mex.h>
void mexFunction(int nargout, mxArray *pargout [ ], int nargin, const mxArray *pargin[])
{
mxArray *outp;
double *data;
int m, n;
/* get input array */
data = mxGetData(pargin[0]);
m = mxGetM(pargin[0]);
n = mxGetN(pargin[0]);
/* copy pointer to output array */
outp = mxCreateNumericMatrix(0,0,mxDOUBLE_CLASS,mxREAL);
mxSetM(outp, m);
mxSetN(outp, n);
mxSetData(outp, data);
/* segfaults with or without the below line */
mexMakeMemoryPersistent(data);
pargout[0] = outp;
}
It doesn't work. I get a segfault, if not immediately, then after a few calls. I believe nothing is said about such a scenario in the documentation. The only requirement is that the data pointer has been allocated using mxCalloc, which it obviously has. Hence, I would assume this code is legal.
I need to do this, because I am parsing a complicated MATLAB structure into my internal C data structures. I process the data, some of the data gets re-allocated, some doesn't. I would like to transparently return the output structure, without thinking when I have to simply copy an mxArray (first code snippet), and when I actually have to create it.
Please help!
EDIT
After further looking and discussing with Amro, it seems that even my first code snippet is unsupported and can cause MATLAB crashes in certain situations, e.g., when passing structure fields or cell elements to such mex function:
>> a.field = [1 2 3];
>> b = pargin_to_pargout(a.field); % ok - works and assigns [1 2 3] to b
>> pargin_to_pargout(a.field); % bad - segfault
It seems I will have to go down the 'undocumented MATLAB' road and use mxCreateSharedDataCopy and mxUnshareArray.
You should use mxDuplicateArray, that's the documented way:
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
plhs[0] = mxDuplicateArray(prhs[0]);
}
While undocumented, the MEX API function mxCreateSharedDataCopy was given as a solution by MathWorks, now apparently disavowed, for creating a shared-data copy of an mxArray. MathWorks even provided an example in their solution, mxsharedcopy.c.
As described in that removed MathWorks Solution (1-6NU359), the function can be used to clone the mxArray header. The difference between plhs[0] = prhs[0]; and plhs[0] = mxCreateSharedDataCopy(prhs[0]); is that the first version just copies the mxArray* (a pointer) and hence does not create a new mxArray container (at least not until mexFunction returns and MATLAB works its magic), whereas the second creates a new container and increments the data's reference count in both mxArrays.
Why might this be a problem? If you use plhs[0] = prhs[0]; and make no further modification to plhs[0] before returning from mexFunction, all is well and you will have a shared data copy thanks to MATLAB. However, if after the above assignment you modify plhs[0] in the MEX function, the change will be seen in prhs[0] as well, since it refers to the same data buffer. On the other hand, when explicitly generating a shared copy (with mxCreateSharedDataCopy) there are two different mxArray objects, and a change to one array's data will trigger a copy operation resulting in two completely independent arrays. Also, direct assignment can cause segmentation faults in some cases.
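The distinction between aliasing a pointer and making a shared-data copy can be illustrated with a toy reference-counted array in plain C. This is only a model under stated assumptions: the Array type and the array_* functions are hypothetical and say nothing about how MATLAB actually lays out an mxArray. The sketch shows that a plain alias writes straight through to the original, while a shared copy gets its own buffer before it is modified.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an array header that points at a reference-counted buffer. */
typedef struct {
    double *data;   /* data buffer, possibly shared with other headers */
    size_t  len;    /* number of elements */
    int    *refs;   /* how many headers point at `data` */
} Array;

/* Creates an array with its own zero-filled buffer (reference count 1).
 * Allocation failures are not handled, to keep the sketch short. */
Array array_create(size_t len)
{
    Array a;
    a.data = calloc(len, sizeof *a.data);
    a.len = len;
    a.refs = malloc(sizeof *a.refs);
    *a.refs = 1;
    return a;
}

/* Shared copy: a new header that points at the same buffer. */
Array array_share(const Array *src)
{
    Array copy = *src;
    ++*copy.refs;
    return copy;
}

/* Gives `a` a private buffer if it is currently shared (copy on write). */
void array_unshare(Array *a)
{
    if (*a->refs == 1)
        return;
    double *own = malloc(a->len * sizeof *own);
    memcpy(own, a->data, a->len * sizeof *own);
    --*a->refs;
    a->data = own;
    a->refs = malloc(sizeof *a->refs);
    *a->refs = 1;
}

/* Drops this header's reference and frees the buffer with the last one. */
void array_destroy(Array *a)
{
    if (--*a->refs == 0) {
        free(a->data);
        free(a->refs);
    }
    a->data = NULL;
    a->refs = NULL;
}
```

In this model, copying the struct by value (Array alias = a;) plays the role of plhs[0] = prhs[0];, because nothing records that a second user exists. array_share plays the role of a shared-data copy, and destroying one header leaves the other valid because the buffer is freed only with the last reference.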
Modified MathWorks Example
Start with an example using a modified mxsharedcopy.c from the MathWorks solution referenced above. The first important step is to provide the prototype for the mxCreateSharedDataCopy function:
/* Add this declaration because it does not exist in the "mex.h" header */
extern mxArray *mxCreateSharedDataCopy(const mxArray *pr);
As the comment states, this is not in mex.h, so you have to declare this yourself.
The next part of the mxsharedcopy.c creates new mxArrays in the following ways:
A deep copy via mxDuplicateArray:
copy1 = mxDuplicateArray(prhs[0]);
A shared copy via mxCreateSharedDataCopy:
copy2 = mxCreateSharedDataCopy(copy1);
Direct copy of the mxArray*, added by me:
copy0 = prhs[0]; // OK, but don't modify copy0 inside mexFunction!
Then it prints the address of the data buffer (pr) for each mxArray and their first values. Here is the output of the modified mxsharedcopy(x) for x=ones(1e3);:
prhs[0] = 72145590, mxGetPr = 18F90060, value = 1.000000
copy0 = 72145590, mxGetPr = 18F90060, value = 1.000000
copy1 = 721BF120, mxGetPr = 19740060, value = 1.000000
copy2 = 721BD4B0, mxGetPr = 19740060, value = 1.000000
What happened:
As expected, comparing prhs[0] and copy0 we have not created anything new except another pointer to the same mxArray.
Comparing prhs[0] and copy1, notice that mxDuplicateArray created a new mxArray at address 721BF120, and copied the data into a new buffer at 19740060.
copy2 has a different address (mxArray*) from copy1, meaning it is also a different mxArray not just the same one pointed to by different variables, but they both share the same data at address 19740060.
The question reduces to: Is it safe to return in plhs[0] either of copy0 or copy2 (from simple pointer copy or mxCreateSharedDataCopy, respectively) or is it necessary to use mxDuplicateArray, which actually copies the data? We can show that mxCreateSharedDataCopy would work by destroying copy1 and verifying that copy2 is still valid:
mxDestroyArray(copy1);
copy2val0 = *mxGetPr(copy2); /* no crash! */
Applying Shared-Data Copy to Input
Back to the question. Take this a step further than the MathWorks example and return a shared-data copy of the input. Just do:
if (nlhs>0) plhs[0] = mxCreateSharedDataCopy(prhs[0]);
Hold your breath!
>> format debug
>> x=ones(1,2)
x =
Structure address = 9aff820 % mxArray*
m = 1
n = 2
pr = 2bcc8500 % double*
pi = 0
1 1
>> xDup = mxsharedcopy(x)
xDup =
Structure address = 9afe2b0 % mxArray* (different)
m = 1
n = 2
pr = 2bcc8500 % double* (same)
pi = 0
1 1
>> clear x
>> xDup % hold your breath!
xDup =
Structure address = 9afe2b0
m = 1
n = 2
pr = 2bcc8500 % double* (still same!)
pi = 0
1 1
Now for a temporary input (without format debug):
>> tempDup = mxsharedcopy(2*ones(1e3));
>> tempDup(1)
ans =
2
Interestingly, if I test without mxCreateSharedDataCopy (i.e. with just plhs[0] = prhs[0];), MATLAB doesn't crash but the output variable never materializes:
>> tempDup = mxsharedcopy(2*ones(1e3)) % no semi-colon
>> whos tempDup
>> tempDup(1)
Undefined function 'tempDup' for input arguments of type 'double'.
R2013b, Windows, 64-bit.
mxsharedcopy.cpp (modified C++ version):
#include "mex.h"
/* Add this declaration because it does not exist in the "mex.h" header */
extern "C" mxArray *mxCreateSharedDataCopy(const mxArray *pr);
extern "C" bool mxUnshareArray(const mxArray *pr, const bool noDeepCopy); // true if not successful
void mexFunction(int nlhs,mxArray *plhs[],int nrhs,const mxArray *prhs[])
{
mxArray *copy1(NULL), *copy2(NULL), *copy0(NULL);
//(void) plhs; /* Unused parameter */
/* Check for proper number of input and output arguments */
if (nrhs != 1)
mexErrMsgTxt("One input argument required.");
if (nlhs > 1)
mexErrMsgTxt("Too many output arguments.");
copy0 = const_cast<mxArray*>(prhs[0]); // ADDED
/* First make a regular deep copy of the input array */
copy1 = mxDuplicateArray(prhs[0]);
/* Then make a shared copy of the new array */
copy2 = mxCreateSharedDataCopy(copy1);
/* Print some information about the arrays */
// mexPrintf("Created shared data copy, and regular deep copy\n");
mexPrintf("prhs[0] = %X, mxGetPr = %X, value = %lf\n",prhs[0],mxGetPr(prhs[0]),*mxGetPr(prhs[0]));
mexPrintf("copy0 = %X, mxGetPr = %X, value = %lf\n",copy0,mxGetPr(copy0),*mxGetPr(copy0));
mexPrintf("copy1 = %X, mxGetPr = %X, value = %lf\n",copy1,mxGetPr(copy1),*mxGetPr(copy1));
mexPrintf("copy2 = %X, mxGetPr = %X, value = %lf\n",copy2,mxGetPr(copy2),*mxGetPr(copy2));
/* TEST: Destroy the first copy */
//mxDestroyArray(copy1);
//copy1 = NULL;
//mexPrintf("\nFreed copy1\n");
/* RESULT: copy2 will still be valid */
//mexPrintf("copy2 = %X, mxGetPr = %X, value = %lf\n",copy2,mxGetPr(copy2),*mxGetPr(copy2));
if (nlhs>0) plhs[0] = mxCreateSharedDataCopy(prhs[0]);
//if (nlhs>0) plhs[0] = const_cast<mxArray*>(prhs[0]);
}
We've been interfacing with a library created from the Matlab Compiler. Our problem is related to an array returned from the library.
Once we're finished with the array, we'd like to free the memory, however, doing this causes occasional segmentation faults.
Here is the MATLAB library (bugtest.m):
function x = bugtest(y)
x = y.^2;
Here is the command we used to build it (creating libbugtest.so and libbugtest.h):
mcc -v -W lib:libbugtest -T link:lib bugtest.m
Here is our C test program (bug_destroyarray.c):
#include <stdio.h>
#include <stdlib.h>
#include "mclmcrrt.h"
#include "libbugtest.h"
#define TESTS 15000
int main(int argc, char **argv)
{
const char *opts[] = {"-nojvm", "-singleCompThread"};
mclInitializeApplication(opts, 2);
libbugtestInitialize();
mxArray *output;
mxArray *input;
double *data;
bool result;
int count;
for (count = 0; count < TESTS; count++) {
input = mxCreateDoubleMatrix(4, 1, mxREAL);
data = mxGetPr(input); data[0] = 0.5; data[1] = 0.2; data[2] = 0.2; data[3] = 0.1;
output = NULL;
result = mlfBugtest(1, &output, input);
if (result) {
/* HERE IS THE PROBLEMATIC LINE */
/*mxDestroyArray(output);*/
}
mxDestroyArray(input);
}
libbugtestTerminate();
mclTerminateApplication();
}
Here is how we compile the C program (creating bug_destroyarray):
mbuild -v bug_destroyarray.c libbugtest.so
We believe that mxDestroyArray(output) is problematic.
We run the following to test crashing:
On each of the 32 cluster nodes.
Run bug_destroyarray.
Monitor output for segmentation faults.
Roughly 10% of the time there is a crash on at least one node. If crashes are independent across the 32 nodes, that corresponds to a crash rate of roughly 0.3% per run.
When we take out that problematic line we are unable to cause it to crash.
However memory usage gradually increases when this line is not included.
From the research we've done, it seems we are not supposed to destroy the returned array. If that is the case, how do we avoid leaking memory?
Thanks.
Okay, I know this is a little old now, but in case it helps clarify things for anyone passing by ...
Amro provides the most pertinent information, but to expand upon it, IF you don't call the mxDestroyArray function as things stand, then you WILL leak memory, because you've set output to NULL and so the mlf function won't try to call mxDestroyArray. The corollary of this is that if you've called mxDestroyArray AND then try to call the mlf function AND output is NOT NULL, then the mlf function WILL try to call mxDestroyArray on output. The question then is to what does output point? It's a bit of a dark corner what happens to output after passing it to mxDestroyArray. I'd say it's an unwarranted assumption that it's set to NULL; it's certainly not documented that mxDestroyArray sets its argument to NULL. Therefore, I suspect what is happening is that in between your call to mxDestroyArray and the code re-executing the mlf function, something else has been allocated the memory pointed to by output and so your mlf function tries to free memory belonging to something else. Voila, seg fault. And of course this will only happen if that memory has been reallocated. Sometimes you'll get lucky, sometimes not.
The golden rule is if you're calling mxDestroyArray yourself for something that is going to be re-used, set the pointer to NULL immediately afterwards. You only really need to destroy stuff at the end of your function anyway, because you can safely re-use output variables in mlf calls.
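The golden rule above can be sketched in plain C with free standing in for mxDestroyArray. The helper name destroy_and_clear is a hypothetical illustration. With an mxArray the same pattern would read mxDestroyArray(output); output = NULL;.

```c
#include <stdlib.h>

/* Frees *p and resets it to NULL, so that a later "destroy if not NULL"
 * check cannot act on a stale address. Safe to call more than once. */
void destroy_and_clear(double **p)
{
    if (p == NULL)
        return;
    free(*p);       /* free(NULL) is a no-op */
    *p = NULL;
}
```

Because the pointer is cleared in the same call that releases the memory, there is no window in which the variable still holds an address that may already belong to something else.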
Guy
A few notes:
I don't see singleCompThread in the list of allowed options for mclInitializeApplication.
The recommended way to compile your C program is to dynamically link against the compiled library:
mbuild -v -I. bug_destroyarray.c -L. -lbugtest
At the top of your C program, just include the generated header file, it will include other headers in turn. From looking at the generated header, it has:
#pragma implementation "mclmcrrt.h"
#include "mclmcrrt.h"
I don't know the exact meaning of this pragma line, but maybe it matters with GCC compilers.
The fact that both mlx/mlf generated functions return booleans is undocumented. But looking at the header files, both signatures do indeed return a bool:
extern bool mlxBugtest(int nlhs, mxArray *plhs[], int nrhs, mxArray *prhs[]);
extern bool mlfBugtest(int nargout, mxArray** x, mxArray* y);
I tried your code and it works just fine with no segfaults. As I don't have access to a cluster of computers, my testing was only done on my local machine (WinXP with R2013a).
I had to remove both MCR initialization options for it to work (specifically the nojvm caused a runtime error). Below is the full code with slight modifications. It took around 10 seconds to run:
#include <stdio.h>
#include <stdlib.h>
#include "libbugtest.h"
#define TESTS 15000
int main()
{
mxArray *output, *input;
double *data;
int count;
bool result;
if( !mclInitializeApplication(NULL,0) ) {
fprintf(stderr, "Could not initialize the application.\n");
return EXIT_FAILURE;
}
if ( !libbugtestInitialize() ) {
fprintf(stderr, "Could not initialize the library.\n");
return EXIT_FAILURE;
}
for (count = 0; count < TESTS; count++) {
input = mxCreateDoubleMatrix(4, 1, mxREAL);
data = mxGetPr(input);
data[0] = 0.5; data[1] = 0.2; data[2] = 0.2; data[3] = 0.1;
output = NULL;
result = mlfBugtest(1, &output, input);
if (!result) {
fprintf(stderr, "call failed on count=%d\n", count);
return EXIT_FAILURE;
}
mxDestroyArray(output); output = NULL;
mxDestroyArray(input); input = NULL;
}
libbugtestTerminate();
mclTerminateApplication();
return EXIT_SUCCESS;
}
Also, the compilation step is a bit different on Windows, since we statically link against the import lib (which inserts a stub to dynamically load the DLL at run time):
mbuild -v -I. bug_destroyarray.c libbugtest.lib
Thanks for the detailed reply Amro.
We tried changing our compilation steps to the recommended ones, with no success.
The following fixed our seg-faulting problem:
Do not set output = NULL at each iteration, instead do it once outside of the loop.
Do not call mxDestroyArray(output) inside the loop, reference: here.
Our misunderstanding was that (it seems) you are supposed to reuse mxArray pointers which you pass to MATLAB functions. It makes things slightly cumbersome on our side as we need to be careful reusing this pointer.
However, memory is completely stable, and we've not had a crash since.
Hey there,
I don't really understand how to access data passed via arguments in matlab to a mex-function. Assuming I have the 'default' gateway function
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[] )
And now I get the pointer to the first input argument:
double* data_in;
data_in = mxGetPr(prhs[0]);
Each of the following lines, on its own, makes MATLAB crash:
mexPrintf("%d", *data_in);
mexPrintf("%d", data_in[1]);
But why can't I access the data like that when data_in obviously is a pointer to the first argument?
When do I need to declare the pointer as double* and when as mxArray*? Sometimes I see something like that: mxArray *arr = mxCreateDoubleMatrix(n,m,mxREAL);!?
Thanks a lot in advance!
data_in is a pointer to double, so the format specifier must be %f rather than %d. You need something like
mexPrintf("%f", data_in[0]);
This assumes the caller passed a vector or matrix of size > 0.
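Both points above (use a conversion that matches double, and make sure the element exists) can be sketched in plain C. The helper name format_element is a hypothetical illustration, and snprintf stands in for mexPrintf so that the sketch compiles without MATLAB.

```c
#include <stdio.h>
#include <stddef.h>

/* Writes element `idx` of `data` into `buf` using the %f conversion, which
 * matches the double argument. Returns the number of characters written,
 * or -1 if idx is out of range, so the array is never read past its end. */
int format_element(char *buf, size_t bufsize,
                   const double *data, size_t count, size_t idx)
{
    if (data == NULL || idx >= count)
        return -1;
    return snprintf(buf, bufsize, "%f", data[idx]);
}
```

In a MEX file the count would typically come from mxGetNumberOfElements(prhs[0]), checked after confirming that nrhs is at least 1.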
More generally, you can get the number of rows and columns of the matrix/vector passed to the MEX function with
size_t n = mxGetN(array); /* number of columns */
size_t m = mxGetM(array); /* number of rows */
Regarding mxArray:
MATLAB packs its matrices (complex and real) in an mxArray structure. mxCreateDoubleMatrix returns a pointer to such a structure. To actually access the data you need to use mxGetPr() for the real part and mxGetPi() for the imaginary part.
These return pointers to the allocated double[] arrays, which you can use to access (read and write) the elements of the matrix.
A very convenient way of handling dimensions of mxArrays is to introduce a function like the following.
#include <cstddef>
#include <cstdarg>
#include "mex.h"
/* Returns true if mx_array has exactly n_dim dimensions with the given sizes.
   Every size must be passed as a size_t, e.g. (size_t)3, because va_arg
   reads each one back as a size_t. */
bool mxCheckDimensions(const mxArray* mx_array, size_t n_dim, ...) {
    va_list ap; /* varargs list traverser */
    const mwSize *actual = mxGetDimensions(mx_array);
    /* a different number of dimensions can never match */
    bool retval = ((size_t) mxGetNumberOfDimensions(mx_array) == n_dim);
    va_start(ap, n_dim);
    for (size_t i = 0; retval && i < n_dim; i++) {
        size_t expected = va_arg(ap, size_t);
        if ((size_t) actual[i] != expected)
            retval = false;
    }
    va_end(ap);
    return retval;
}
In this way you can check that an mxArray* p is a double array of size, say, 1-by-3 using
double* pDouble = NULL;
if (mxIsDouble(p)) {
    /* cast the variadic sizes so they are read back correctly as size_t */
    if (mxCheckDimensions(p, 2, (size_t)1, (size_t)3)) {
        pDouble = (double*) mxGetData(p);
        // Do whatever
    }
}
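A non-variadic alternative avoids the argument-promotion pitfall of variadic functions, where an int literal can be read back as a wider type. The sketch below is plain C with a hypothetical name, dims_match. In a MEX file the actual sizes and their count would come from mxGetDimensions and mxGetNumberOfDimensions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when `actual` and `expected` describe the same shape:
 * the same number of dimensions and the same size in each one. */
bool dims_match(const size_t *actual, size_t n_actual,
                const size_t *expected, size_t n_expected)
{
    if (n_actual != n_expected)
        return false;
    for (size_t i = 0; i < n_actual; i++) {
        if (actual[i] != expected[i])
            return false;
    }
    return true;
}
```

Passing the expected shape as a typed array lets the compiler check the element type, at the cost of declaring a small array at each call site.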