Extracting data from a matlab struct in mex - matlab

I'm following this example but I'm not sure what I missed. Specifically, I have this struct in MATLAB:
a = struct; a.one = 1.0; a.two = 2.0; a.three = 3.0; a.four = 4.0;
And this is my test code in MEX ---
First, I wanted to make sure that I'm passing in the right thing, so I did this check:
int nfields = mxGetNumberOfFields(prhs[0]);
mexPrintf("nfields =%i \n\n", nfields);
And it does yield 4, since I have four fields.
However, when I tried to extract the value in field three:
tmp = mxGetField(prhs[0], 0, "three");
mexPrintf("data =%f \n\n", (double *)mxGetData(tmp) );
It returns data =1.000000. I'm not sure what I did wrong. My logic is that I want to get the first element (hence index is 0) of the field three, so I expected data =3.00000.
Can I get a pointer or a hint?

EDITED
Ok, since you didn't provide your full code but you are working on a test, let's try to make a new one from scratch.
On Matlab side, use the following code:
a.one = 1;
a.two = 2;
a.three = 3;
a.four = 4;
read_struct(a);
Now, create and compile the MEX read_struct function as follows:
#include "mex.h"
void read_struct(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
if (nrhs != 1)
mexErrMsgTxt("One input argument required.");
/* Let's check if the input is a struct... */
if (!mxIsStruct(prhs[0]))
mexErrMsgTxt("The input must be a structure.");
int ne = mxGetNumberOfElements(prhs[0]);
int nf = mxGetNumberOfFields(prhs[0]);
mexPrintf("The structure contains %i elements and %i fields.\n", ne, nf);
mwIndex i;
mwIndex j;
mxArray *mxValue;
double *value;
for (i = 0; i < nf; ++i)
{
for (j = 0; j < ne; ++j)
{
mxValue = mxGetFieldByNumber(prhs[0], j, i);
value = mxGetPr(mxValue);
mexPrintf("Field %s(%d) = %.1f\n", mxGetFieldNameByNumber(prhs[0],i), j, value[0]);
}
}
return;
}
Does this correctly prints your structure?

Related

Cairo: Draw an array of pixels

I'm just starting to explore Cairo, but right now I really want to use it for something very simple.
I have a very low-tech bitmap, i.e., a 3*X*Y array of numbers. I'd like to use Cairo to make this into a bitmap and write to a file. I'm looking through tutorials and I'm not seeing a way to use it for comparatively low-level functions like this.
I don't think I need guidance on how to use the tool once I know what the tool is.
I didn't actually test this, but the following should give you lots of useful hints:
#include <cairo.h>
#include <stdint.h>
#define WIDTH 42
#define HEIGHT 42
uint8_t data[WIDTH][HEIGHT][3];
cairo_surface_t* convert()
{
cairo_surface_t *result;
unsigned char *current_row;
int stride;
result = cairo_image_surface_create(CAIRO_FORMAT_RGB24, WIDTH, HEIGHT);
if (cairo_surface_status(result) != CAIRO_STATUS_SUCCESS)
return result;
cairo_surface_flush(result);
current_row = cairo_image_surface_get_data(result);
stride = cairo_image_surface_get_stride(result);
for (int y = 0; y < HEIGHT; y++) {
uint32_t *row = (void *) current_row;
for (int x = 0; x < WIDTH; x++) {
uint32_t r = data[x][y][0];
uint32_t g = data[x][y][1];
uint32_t b = data[x][y][2];
row[x] = (r << 16) | (g << 8) | b;
}
current_row += stride;
}
cairo_surface_mark_dirty(result);
return result;
}

cula use of culaSgels - wrong argument?

I am trying to use the culaSgels function in order to solve Ax=B.
I modified the systemSolve example of the cula package.
void culaFloatExample()
{
int N=2;
int NRHS = 2;
int i,j;
double cula_time,start_time,end_time;
culaStatus status;
culaFloat* A = NULL;
culaFloat* B = NULL;
culaFloat* X = NULL;
culaFloat one = 1.0f;
culaFloat thresh = 1e-6f;
culaFloat diff;
printf("Allocating Matrices\n");
A = (culaFloat*)malloc(N*N*sizeof(culaFloat));
B = (culaFloat*)malloc(N*N*sizeof(culaFloat));
X = (culaFloat*)malloc(N*N*sizeof(culaFloat));
if(!A || !B )
exit(EXIT_FAILURE);
printf("Initializing CULA\n");
status = culaInitialize();
checkStatus(status);
// Set A
A[0]=1;
A[1]=2;
A[2]=3;
A[3]=4;
// Set B
B[0]=5;
B[1]=6;
B[2]=2;
B[3]=3;
printf("Calling culaSgels\n");
// Run CULA's version
start_time = getHighResolutionTime();
status = culaSgels('N',N,N, NRHS, A, N, A, N);
end_time = getHighResolutionTime();
cula_time = end_time - start_time;
checkStatus(status);
printf("Verifying Result\n");
for(i = 0; i < N; ++i){
for (j=0;j<N;j++)
{
diff = X[i+j*N] - B[i+j*N];
if(diff < 0.0f)
diff = -diff;
if(diff > thresh)
printf("\nResult check failed: X[%d]=%f B[%d]=%f\n", i, X[i+j*N],i, B[i+j*N]);
printf("\nResults:X= %f \t B= %f:\n",X[i+j*N],B[i+j*N]);
}
}
printRuntime(cula_time);
printf("Shutting down CULA\n\n");
culaShutdown();
free(A);
free(B);
}
I am using culaSgels('N',N,N, NRHS, A, N, A, N); to solve the system but :
1) The results show me that every element of X=0 , but B is right.
Also , it shows me the
Result check failed message
2) Studying the reference manual ,it says that one argument before the last argument (the A I have) ,should be the matrix B stored columnwised,but if I use "B" instead of "A" as parameter ,then I am not getting the correct B matrix.
Ok,code needs 3 things to work.
1) Change A to B ,so culaSgels('N',N,N, NRHS, A, N, B, N);
(I misunderstood that at exit B contains the solution)
2) Because CULA uses column major change A,B matrices accordingly.
3) Change to :
B = (culaFloat*)malloc(N*NRHS*sizeof(culaFloat));
X = (culaFloat*)malloc(N*NRHS*sizeof(culaFloat));
(use NHRS and not N which is the same in this example)
Thanks!

CUDA class with multidimensional pointers

I have been struggling with this class implementation now for quite a while and hope someone can help me with it.
class Material_Properties_Class_device
{
public:
int max_variables;
Logical * table_prop;
Table_Class ** prop_table;
};
The implementation for the pointers looks like this
Material_Properties_Class **d_material_prop = new Material_Properties_Class* [4];
Logical *table_prop;
for (int k = 1; k <= 3; k++ )
{
cutilSafeCall(cudaMalloc((void**)&(d_material_prop[k]),sizeof(Material_Properties_Class)));
cutilSafeCall(cudaMemcpy(d_material_prop[k], material_prop[k], sizeof(Material_Properties_Class ), cudaMemcpyHostToDevice));
}
for( int i = 1; i <= 3; i++ )
{
cutilSafeCall(cudaMalloc((void**)&(table_prop), sizeof(Logical)));
cudaMemcpy(&(d_material_prop[i]->table_prop), &(table_prop), sizeof(Logical*),cudaMemcpyHostToDevice);
cudaMemcpy(table_prop, material_prop[i]->table_prop, sizeof(Logical),cudaMemcpyHostToDevice);
}
cutilSafeCall(cudaMalloc((void ***)&material_prop_device, (4) * sizeof(Material_Properties_Class *)));
cutilSafeCall(cudaMemcpy(material_prop_device, d_material_prop, (4) * sizeof(Material_Properties_Class *), cudaMemcpyHostToDevice));
This implementation works but it can't get it working for the **prop_table.
I assume it must somehow follow the same principle but I just can't get my head around it.
I have already tried
Table_Class_device **prop_table = new Table_Class_device*[3];
and insert another loop inside the second for loop
for (int k = 1; k <= 3; k++ )
{
cutilSafeCall(cudaMalloc((void**)&(prop_table[k]), sizeof(Table_Class)));
cutilSafeCall(cudaMemcpy( prop_table[k], material_prop[i]->prop_table[k], sizeof( Table_Class *), cudaMemcpyHostToDevice));
}
Help would be much appriciated
some magic. May be it'll help
struct fading_coefficient
{
double* frequency_array;
double* temperature_array;
int frequency_size;
int temperature_size;
double** fading_coefficients;
};
struct fading_coefficient* cuda_fading_coefficient;
double* frequency_array = NULL;
double* temperature_array = NULL;
double** fading_coefficients = NULL;
double** fading_coefficients1 = (double **)malloc(fading_coefficient->frequency_size * sizeof(double *));
cudaMalloc((void**)&frequency_array,fading_coefficient->frequency_size *sizeof(double));
cudaMemcpy( frequency_array, fading_coefficient->frequency_array, fading_coefficient->frequency_size *sizeof(double), cudaMemcpyHostToDevice );
free(fading_coefficient->frequency_array);
cudaMalloc((void**)&temperature_array,fading_coefficient->temperature_size *sizeof(double));
cudaMemcpy( temperature_array, fading_coefficient->temperature_array, fading_coefficient->temperature_size *sizeof(double), cudaMemcpyHostToDevice );
free(fading_coefficient->temperature_array);
cudaMalloc((void***)&fading_coefficients,fading_coefficient->temperature_size *sizeof(double*));
for (int i = 0; i < fading_coefficient->temperature_size; i++)
{
cudaMalloc((void**)&(fading_coefficients1[i]),fading_coefficient->frequency_size *sizeof(double));
cudaMemcpy( fading_coefficients1[i], fading_coefficient->fading_coefficients[i], fading_coefficient->frequency_size *sizeof(double), cudaMemcpyHostToDevice );
free(fading_coefficient->fading_coefficients[i]);
}
cudaMemcpy(fading_coefficients, fading_coefficients1, fading_coefficient->temperature_size *sizeof(double*), cudaMemcpyHostToDevice );
fading_coefficient->frequency_array = frequency_array;
fading_coefficient->temperature_array = temperature_array;
fading_coefficient->fading_coefficients = fading_coefficients;
cudaMalloc((void**)&cuda_fading_coefficient,sizeof(struct fading_coefficient));
cudaMemcpy( cuda_fading_coefficient, fading_coefficient, sizeof(struct fading_coefficient), cudaMemcpyHostToDevice );
This question comes up frequently. Multidimensional pointers are especially challenging.
If possible, it's recommended that you flatten multidimensional pointer usage (**) to single-dimensional pointer usage (*), and as you've seen, even that is somewhat cumbersome.
The single-dimensional case (*) is further described here. Although you seem to have already figured it out.
If you really want to handle the 2 dimensional (**) case, look here.
An example implementation for 3 dimensional case (***) is here. ("madness!")
Working with 2 and 3 dimensions this way is quite difficult. Thus the recommendation to flatten.

FFTW with MEX and MATLAB argument issues

I wrote the following C/MEX code using the FFTW library to control the number of threads used for a FFT computation from MATLAB. The code works great (complex FFT forward and backward) with the FFTW_ESTIMATE argument in the planner although it is slower than MATLAB. But, when I switch to the FFTW_MEASURE argument to tune up the FFTW planner, it turns out that applying one FFT forward and then one FFT backward does not return the initial image. Instead, the image is scaled by a factor. Using FFTW_PATIENT gives me an even worse result with null matrices.
My code is as follows:
Matlab functions:
FFT forward:
function Y = fftNmx(X,NumCPU)
if nargin < 2
NumCPU = maxNumCompThreads;
disp('Warning: Use the max maxNumCompThreads');
end
Y = FFTN_mx(X,NumCPU)./numel(X);
FFT backward:
function Y = ifftNmx(X,NumCPU)
if nargin < 2
NumCPU = maxNumCompThreads;
disp('Warning: Use the max maxNumCompThreads');
end
Y = iFFTN_mx(X,NumCPU);
Mex functions:
FFT forward:
# include <string.h>
# include <stdlib.h>
# include <stdio.h>
# include <mex.h>
# include <matrix.h>
# include <math.h>
# include </home/nicolas/Code/C/lib/include/fftw3.h>
char *Wisfile = NULL;
char *Wistemplate = "%s/.fftwis";
#define WISLEN 8
void set_wisfile(void)
{
char *home;
if (Wisfile) return;
home = getenv("HOME");
Wisfile = (char *)malloc(strlen(home) + WISLEN + 1);
sprintf(Wisfile, Wistemplate, home);
}
fftw_plan CreatePlan(int NumDims, int N[], double *XReal, double *XImag, double *YReal, double *YImag)
{
fftw_plan Plan;
fftw_iodim Dim[NumDims];
int k, NumEl;
FILE *wisdom;
for(k = 0, NumEl = 1; k < NumDims; k++)
{
Dim[NumDims - k - 1].n = N[k];
Dim[NumDims - k - 1].is = Dim[NumDims - k - 1].os = (k == 0) ? 1 : (N[k-1] * Dim[NumDims-k].is);
NumEl *= N[k];
}
/* Import the wisdom. */
set_wisfile();
wisdom = fopen(Wisfile, "r");
if (wisdom) {
fftw_import_wisdom_from_file(wisdom);
fclose(wisdom);
}
if(!(Plan = fftw_plan_guru_split_dft(NumDims, Dim, 0, NULL, XReal, XImag, YReal, YImag, FFTW_MEASURE *(or FFTW_ESTIMATE respectively)* )))
mexErrMsgTxt("FFTW3 failed to create plan.");
/* Save the wisdom. */
wisdom = fopen(Wisfile, "w");
if (wisdom) {
fftw_export_wisdom_to_file(wisdom);
fclose(wisdom);
}
return Plan;
}
void mexFunction( int nlhs, mxArray *plhs[],
int nrhs, const mxArray *prhs[] )
{
#define B_OUT plhs[0]
int k, numCPU, NumDims;
const mwSize *N;
double *pr, *pi, *pr2, *pi2;
static long MatLeng = 0;
fftw_iodim Dim[NumDims];
fftw_plan PlanForward;
int NumEl = 1;
int *N2;
if (nrhs != 2) {
mexErrMsgIdAndTxt( "MATLAB:FFT2mx:invalidNumInputs",
"Two input argument required.");
}
if (!mxIsDouble(prhs[0])) {
mexErrMsgIdAndTxt( "MATLAB:FFT2mx:invalidNumInputs",
"Array must be double");
}
numCPU = (int) mxGetScalar(prhs[1]);
if (numCPU > 8) {
mexErrMsgIdAndTxt( "MATLAB:FFT2mx:invalidNumInputs",
"NumOfThreads < 8 requested");
}
if (!mxIsComplex(prhs[0])) {
mexErrMsgIdAndTxt( "MATLAB:FFT2mx:invalidNumInputs",
"Array must be complex");
}
NumDims = mxGetNumberOfDimensions(prhs[0]);
N = mxGetDimensions(prhs[0]);
N2 = (int*) mxMalloc( sizeof(int) * NumDims);
for(k=0;k<NumDims;k++) {
NumEl *= NumEl * N[k];
N2[k] = N[k];
}
pr = (double *) mxGetPr(prhs[0]);
pi = (double *) mxGetPi(prhs[0]);
//B_OUT = mxCreateNumericArray(NumDims, N, mxDOUBLE_CLASS, mxCOMPLEX);
B_OUT = mxCreateNumericMatrix(0, 0, mxDOUBLE_CLASS, mxCOMPLEX);
mxSetDimensions(B_OUT , N, NumDims);
mxSetData(B_OUT , (double* ) mxMalloc( sizeof(double) * mxGetNumberOfElements(prhs[0]) ));
mxSetImagData(B_OUT , (double* ) mxMalloc( sizeof(double) * mxGetNumberOfElements(prhs[0]) ));
pr2 = (double* ) mxGetPr(B_OUT);
pi2 = (double* ) mxGetPi(B_OUT);
fftw_init_threads();
fftw_plan_with_nthreads(numCPU);
PlanForward = CreatePlan(NumDims, N2, pr, pi, pr2, pi2);
fftw_execute_split_dft(PlanForward, pr, pi, pr2, pi2);
fftw_destroy_plan(PlanForward);
fftw_cleanup_threads();
}
FFT backward:
This MEX function differs from the above only in switching pointers pr <-> pi, and pr2 <-> pi2 in the CreatePlan function and in the execution of the plan, as suggested in the FFTW documentation.
If I run
A = imread('cameraman.tif');
>> A = double(A) + i*double(A);
>> B = fftNmx(A,8);
>> C = ifftNmx(B,8);
>> figure,imagesc(real(C))
with the FFTW_MEASURE and FFTW_ESTIMATE arguments respectively I get this result.
I wonder if this is due to an error in my code or in the library. I tried different thing around the wisdom, saving not saving. Using the wisdom produce by the FFTW standalone tool to produce wisdom. I haven't seen any improvement. Can anyone suggest why this is happening?
Additional information:
I compile the MEX code using static libraries:
mex FFTN_Meas_mx.cpp /home/nicolas/Code/C/lib/lib/libfftw3.a /home/nicolas/Code/C/lib/lib/libfftw3_threads.a -lm
The FFTW library hasn't been compiled with:
./configure CFLAGS="-fPIC" --prefix=/home/nicolas/Code/C/lib --enable-sse2 --enable-threads --&& make && make install
I tried different flags without success. I am using MATLAB 2011b on a Linux 64-bit station (AMD opteron quad core).
FFTW computes not normalized transform, see here:
http://www.fftw.org/doc/What-FFTW-Really-Computes.html
Roughly speaking, when you perform direct transform followed by inverse one, you get
back the input (plus round-off errors) multiplied by the length of your data.
When you create a plan using flags other than FFTW_ESTIMATE, your input is overwritten:
http://www.fftw.org/doc/Planner-Flags.html

Bottoms-up mergesort problems!

I am having problems with bottoms-up mergesort. I have problems sorting/merging. Current code includes:
public void mergeSort(long[] a, int len) {
long[] temp = new long[a.length];
int length = 1;
while (length < len) {
mergepass(a, temp, length, len);
length *= 2;
}
}
public void mergepass(long[] a, long[] temp, int blocksize, int len) {
int k = 0;
int i = 1;
while(i <= (len/blocksize)){
if(blocksize == 1){break;}
int min = a.length;
for(int j = 0; j < blocksize; j++){
if(a[i*j] < min){
temp[k++] = a[i*j];
count++;
}
else{
temp[k++] = a[(i*j)+1];
count++;
}
}
for(int n = 0; n < this.a.length; n++){
a[n] = temp[n];
}
}
}
Obvious problems:
i is never incremented.
At no point do you compare two elements in the array. (Is that what if(a[i*j] < min) is supposed to be doing? I can't tell.)
Why are you multiplying i and j?
What's this.a.length?
Style problems:
mergeSort() takes len as an argument, even though arrays have an implicit length. To make matters worse, the function also uses a.length and length.
Generally poor variable names.
Nitpicks:
If you're going to make a second array of the same size, it is common to make one the "source" and the other the "destination" and swap them between passes, instead of sorting into a temporary array and copying them back again.