Reduce size of vector in MEX-file - matlab

Given a typed vector like this
matlab::data::ArrayFactory Factory;
matlab::data::TypedArray<double> BigArray = Factory.createArray<double>({420, 1});
How can I shrink BigArray size without (re)allocations? All I want is to set its internal length-dimension to a value smaller than 420.

If you are willing to use the C API instead of the C++ API, you can call mxSetM or mxSetN on the mxArray object to reduce its dimensions. Neither call reallocates the underlying data.
int M = 420;
int N = 1;
mxArray *BigArray = mxCreateNumericMatrix(M, N, mxDOUBLE_CLASS, mxREAL);
mxSetM(BigArray, M - 4);


Return Maximum Amount of Sequential Numbers in a Row that Meet a Condition (MATLAB)

I have a large matrix of random values (e.g. 200,000 x 6,000) between 0-1 named 'allGSR.'
I used the following code to create a logical array (?) where 1 represents numbers less than .05
sig = (allGSR < .05);
What I'd like to do is to return an array of size 1 x 200,000 called maxSIG where each row represents the MAXIMUM number of sequential ones. So for example, if in row 1, columns 3-6 are ones, that is 4 ones in a row and if columns 100-109 are ones that is 10 ones in a row and if that is the maximum number of ones in a row I would like the first column of maxSIG to be the value '10.'
I have been doing this with for loops, if statements, and counters; this is ugly and tedious and was wondering if there is an easier or more efficient way.
Thank you for any insight.
EDIT: Whoops, should probably share the loop.
EDIT 2: So I just wrote out what my basic code is with a smaller (100 x 6,000) matrix. This code should run. Sorry for the inconvenience.
GSR = 6000;
samples = 100;
allGSR = zeros(samples, GSR);
for x = 1:samples
    y = rand(GSR, 1)'; %Transpose so it's 1x6000 and not 6000x1
    allGSR(x,:) = y;
end
countSIG = zeros(samples,1);
abovethreshold = (allGSR < .05); %.05 can be replaced by whatever
for z = 1:samples
    count = 0;
    holdArray = zeros(1,GSR);
    for a = 1:GSR
        if abovethreshold(z,a) == true
            count = count + 1;
        else
            count = 0; %run is broken, so restart the counter
        end
        holdArray(1,a) = count;
    end
    maxrun = max(holdArray);
    countSIG(z,1) = maxrun;
end
Here's one approach using diff, find & accumarray -
append_col = zeros(size(abovethreshold,1),1);
df = diff([append_col abovethreshold append_col],[],2).'; %//'
[R1,C1] = find(df==1);
[R2,C2] = find(df==-1);
out = zeros(samples,1);
out(1:max(C1)) = accumarray(C1,R2 - R1,[],@max);
In the code posted above, we are creating a fat array with abovethreshold and then transposing it. From a performance point of view, the transpose operation might not be the best thing to do. So, rather than transposing the array itself, we can skip the transpose and reorder the indices returned by find, like so -
append_col = zeros(size(abovethreshold,1),1);
df = diff([append_col abovethreshold append_col],[],2); %//'
[R1,C1] = find(df==1);
[R2,C2] = find(df==-1);
[~,idx1] = sort(R1);
[~,idx2] = sort(R2);
out = zeros(samples,1);
out(1:max(R1)) = accumarray(R1(idx1),C2(idx2) - C1(idx1),[],@max);
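To make the diff idea concrete, here is a small single-row sketch in plain C (not MATLAB, and not part of the answer above). It pads the row with a zero on each side, treats a difference of +1 as the start of a run and -1 as its end, and keeps the largest end minus start. The function name longest_run_by_edges is made up for illustration.

```c
#include <stddef.h>

/* Longest run of nonzero entries in row[0..n-1], found from transitions.
   The row is treated as if it had a zero on each side, so a run touching
   either edge still gets both a start and an end. */
size_t longest_run_by_edges(const unsigned char *row, size_t n)
{
    size_t best = 0;
    size_t start = 0;   /* index where the current run began */
    int prev = 0;       /* left padding */

    for (size_t i = 0; i <= n; i++) {
        int cur = (i < n) ? (row[i] != 0) : 0;  /* right padding at i == n */
        int d = cur - prev;                     /* same role as diff() */
        if (d == 1) {
            start = i;                          /* rising edge: run starts */
        } else if (d == -1) {
            if (i - start > best)               /* falling edge: run ends */
                best = i - start;
        }
        prev = cur;
    }
    return best;
}
```

For the row 0 0 1 1 1 1 0 1 1 0 the rising edges are at indices 2 and 7 and the falling edges at 6 and 9, so the run lengths are 4 and 2.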
If you're worried about memory allocation and speed on huge arrays, I'd just implement your same basic algorithm in C++. Put the code below in a file such as myfunction.cpp and compile it with mex -largeArrayDims myfunction.cpp.
You can then call from matlab with counts = myfunction(allGSR, .05);
I haven't tested this beyond that it compiles.
#include "mex.h"
#include "matrix.h"

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
    if(nrhs != 2)
        mexErrMsgTxt("Invalid number of inputs. Should be 2 input arguments.");
    if(nlhs != 1)
        mexErrMsgTxt("Invalid number of outputs. Should be 1 output argument.");
    if(!mxIsDouble(prhs[0]) || !mxIsDouble(prhs[1]))
        mexErrMsgTxt("First two arguments are not doubles");

    const mxArray *input_array = prhs[0];
    const mxArray *threshold_array = prhs[1];
    size_t input_rows = mxGetM(input_array);
    size_t input_cols = mxGetN(input_array);
    size_t threshold_rows = mxGetM(threshold_array);
    size_t threshold_cols = mxGetN(threshold_array);
    if(threshold_rows != 1 || threshold_cols != 1)
        mexErrMsgTxt("threshold array should be a scalar");

    mxArray *output_array = mxCreateDoubleMatrix(1, input_rows, mxREAL);
    double *output_data = mxGetPr(output_array);
    double *input_data = mxGetPr(input_array);
    double threshold = *mxGetPr(threshold_array);

    for(size_t z = 0; z < input_rows; z++) {
        int count = 0;
        int max_count = 0;
        for(size_t a = 0; a < input_cols; a++) {
            /* MATLAB stores matrices column-major: element (z, a) */
            if(input_data[z + a * input_rows] < threshold) {
                count++;  /* extend the current run */
            } else {
                if(count > max_count)
                    max_count = count;
                count = 0;
            }
        }
        /* a run may extend to the end of the row */
        if(count > max_count)
            max_count = count;
        output_data[z] = max_count;
    }
    plhs[0] = output_array;
}
I'm not sure whether you want to check for values above or below the threshold. Either way, change the comparison in input_data[z + a * input_rows] < threshold to whatever operator you need.
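Since the MEX file above was only compiled and not run, it may help to check the counting loop on its own, without MATLAB. The following is a minimal plain-C sketch of the same idea under the same column-major layout assumption. The function name max_run_below and its signature are made up for this example and are not part of any MATLAB API.

```c
#include <stddef.h>

/* For each row of a column-major rows x cols matrix, store in out[row]
   the longest run of consecutive entries that are below `threshold`. */
void max_run_below(const double *data, size_t rows, size_t cols,
                   double threshold, double *out)
{
    for (size_t z = 0; z < rows; z++) {
        size_t count = 0, max_count = 0;
        for (size_t a = 0; a < cols; a++) {
            if (data[z + a * rows] < threshold) {
                count++;                    /* extend the current run */
                if (count > max_count)
                    max_count = count;
            } else {
                count = 0;                  /* run broken */
            }
        }
        out[z] = (double)max_count;
    }
}
```

Updating max_count inside the run, rather than only when a run breaks, removes the need for the extra check after the inner loop.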
Here's a one-liner, albeit slow since cellfun is a loop:
maxSIG=cellfun(@(x) max(getfield(regionprops(x),'Area')),mat2cell(allGSR,ones(6000,1),100));
The Image Processing Toolbox function regionprops identifies connected groups of 1's in a logical matrix. By operating on each row of your matrix, and returning specifically the Area property, we get the length of each connected segment of 1's in each row. The max function picks out the length in each row you're looking for.
Note the mat2cell call is necessary to split allGSR into a cell matrix of rows, so that cellfun can be called.

How to do Weighted Averaging of n consecutive values in an Array

I have a 900×1 vector of values (in MATLAB). Every 9 consecutive values should be averaged, without overlap, to give a 100×1 vector of values. The problem is that the averaging should be weighted based on a weighting vector of [1 2 1;2 4 2;1 2 1]. Is there an efficient way to do that averaging? I've heard about the conv function in MATLAB; is it helpful?
conv works by sliding a kernel through your data. But in your case, you need the mask to be jumping through your data, so I don't think conv will work for you.
If you want to use existing MATLAB function, you can do this (I have to assume your weighting matrix has only one dimension) :
kernel = [1;2;1;2;4;2;1;2;1];
in_matrix = reshape(in_matrix, 9, 100);
base = sum(kernel);
out_matrix = bsxfun(@times, in_matrix, kernel);
result = sum(out_matrix,1)/base;
I don't know if there is any clever way to speed this up. bsxfun allows singleton expansion, but maybe not dimension reduction.
A faster way would be to use mex. Open a new file in editor, paste the following code and save file as weighted_average.c.
#include "mex.h"

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    double *in_matrix, *kernel, *out_matrix, base;
    size_t niter;
    size_t nrows_data, nrows_kernel;

    /* Get number of elements along first dimension of each input. */
    nrows_kernel = mxGetM(prhs[1]);
    nrows_data = mxGetM(prhs[0]);

    /* Create output matrix (mxCreateDoubleMatrix zero-initializes it) */
    plhs[0] = mxCreateDoubleMatrix((mwSize)nrows_data/nrows_kernel,1,mxREAL);

    /* Get a pointer to the real data */
    in_matrix = mxGetPr(prhs[0]);
    kernel = mxGetPr(prhs[1]);
    out_matrix = mxGetPr(plhs[0]);

    /* Sum the elements in weighting array */
    base = 0;
    for (size_t i = 0; i < nrows_kernel; i += 1)
        base += kernel[i];

    /* Perform calculation: one weighted average per block */
    niter = nrows_data/nrows_kernel;
    for (size_t i = 0; i < niter; i += 1) {
        for (size_t j = 0; j < nrows_kernel; j += 1)
            out_matrix[i] += in_matrix[i*nrows_kernel+j]*kernel[j];
        out_matrix[i] /= base;
    }
}
Then, in the Command Window, type:
mex weighted_average.c
To use it:
result = weighted_average(input, kernel);
Note that both input and kernel have to be M x 1 column vectors. On my computer, the first method took 0.0012 seconds and the second method took 0.00007 seconds, which is roughly 17 times faster.
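To check the arithmetic outside MATLAB, here is a standalone C sketch of the same block-wise weighted average. The function name weighted_block_average and its return convention are made up for this example. Unlike the MEX file, it rejects inputs whose length is not a multiple of the kernel length, which is an added assumption about what you would want.

```c
#include <stddef.h>

/* Average consecutive, non-overlapping blocks of length klen taken from
   in[0..n-1], weighting the entries of each block by kernel[0..klen-1].
   Writes n / klen values to out and returns that count, or returns 0 if
   the sizes do not fit or the weights sum to zero. */
size_t weighted_block_average(const double *in, size_t n,
                              const double *kernel, size_t klen,
                              double *out)
{
    if (klen == 0 || n % klen != 0)
        return 0;

    double base = 0.0;                  /* sum of the weights */
    for (size_t j = 0; j < klen; j++)
        base += kernel[j];
    if (base == 0.0)
        return 0;

    size_t nblocks = n / klen;
    for (size_t i = 0; i < nblocks; i++) {
        double acc = 0.0;
        for (size_t j = 0; j < klen; j++)
            acc += in[i * klen + j] * kernel[j];
        out[i] = acc / base;
    }
    return nblocks;
}
```

With the kernel 1 2 1 2 4 2 1 2 1 (sum 16), a block of nine 3s averages to 3, and a block that is zero except for an 8 in the middle position gives 8*4/16 = 2.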

How Do I use mexCallMATLAB to Call a User-defined function?

I am trying to parallelize a section of my MATLAB code using OpenMP in a mex file. The section of the MATLAB code that I want to parallelize is:
for i = 1 : n
    D(:, i) = CALC(A, B(:,i), C(i));
end
I have written this in order to parallelize it:
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    size_t r,n,i,G;
    double *A, *B, *C, *D;
    int nthreads;
    nthreads = 4;
    A = mxGetPr(prhs[0]); /* first input matrix */
    B = mxGetPr(prhs[1]); /* second input matrix */
    C = mxGetPr(prhs[2]); /* third input matrix */
    /* dimensions of input matrices */
    r = mxGetN(prhs[0]);
    n = mxGetN(prhs[1]);
    plhs[0] = mxCreateDoubleMatrix(r,n, mxREAL);
    D = mxGetPr(plhs[0]);
    #pragma omp parallel for schedule (dynamic, G)
    /* MATLAB loop that I want to run here */
    for i = 1 : n
        D(:, i) = CALC(A, B(:,i), C(i));
    end
}
CALC is a MATLAB function I have written. My challenge is how to use mexCallMATLAB to call the CALC function from the mex file so that it executes in parallel inside my mex file and returns the elements of each column of D (i.e. D(:, i)) back to my MATLAB code.
Sorry for the lengthy question. Any help I can get on this will be highly appreciated.
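You need to use multiple MATLAB processes to be able to run multiple calls in parallel. mexCallMATLAB is not thread-safe, so it cannot be called from inside an OpenMP parallel region. The easiest way would be to use the Parallel Computing Toolbox and use parfor instead of the for loop.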
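If CALC is simple enough to rewrite in C, another option is to keep the OpenMP loop and call a C function instead of going back into MATLAB. The sketch below only shows the shape of that approach: calc_column is a hypothetical stand-in that computes c * (A * b) for a square column-major A, since the real CALC is not shown in the question, and calc_all_columns is likewise a made-up name.

```c
#include <stddef.h>

/* Hypothetical stand-in for CALC: out = c * (A * b), where A is an
   r x r column-major matrix. Replace the body with the real computation. */
static void calc_column(const double *A, const double *b, double c,
                        size_t r, double *out)
{
    for (size_t row = 0; row < r; row++) {
        double acc = 0.0;
        for (size_t k = 0; k < r; k++)
            acc += A[row + k * r] * b[k];
        out[row] = c * acc;
    }
}

/* Fill D (r x n, column-major), one column per iteration. Each column is
   independent of the others, so iterations can run on different threads.
   When OpenMP is not enabled the loop simply runs serially. */
void calc_all_columns(const double *A, const double *B, const double *C,
                      size_t r, size_t n, double *D)
{
    long i;  /* signed index, accepted by every OpenMP version */
#ifdef _OPENMP
#pragma omp parallel for schedule(dynamic)
#endif
    for (i = 0; i < (long)n; i++)
        calc_column(A, B + (size_t)i * r, C[i], r, D + (size_t)i * r);
}
```

Inside mexFunction you would get A, B, C and D with mxGetPr as in the question and call calc_all_columns once. No MEX API function is called from inside the parallel loop.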

Generating Doubles With XORShift Generator

So I am using the Wikipedia entry of XORShift Generators to make a PRNG. My code is as follows.
#include <stdint.h>

uint32_t xor128(void) {
    static uint32_t x = 123456789;
    static uint32_t y = 362436069;
    static uint32_t z = 521288629;
    static uint32_t w = 88675123;
    uint32_t t;
    t = x ^ (x << 11);
    x = y; y = z; z = w;
    return w = w ^ (w >> 19) ^ t ^ (t >> 8);
}
My question is, how can I use this to generate double numbers between [0, 1)?
Thanks for any help.
Just divide the returned uint32_t by the maximum uint32_t (cast as a double). This does have an approximately one in four-billion chance of being 1, though. You could put in a test for the maximum and discard it if you wish.
Assuming you want a uniform distribution, and aren't too picky about randomising all of the bits for extremely small numbers:
double xor128d(void) {
    return xor128() / 4294967296.0;
}
Since xor128() cannot return 4294967296, the result cannot be exactly 1.0 -- however, if you returned a float, it might still be rounded up to 1.0f.
If you try to add more bits to fill the whole mantissa then you'll face the same rounding headache for doubles.
Do you want the whole mantissa randomised for all possible values? That's a little harder.
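For reference, here is a self-contained sketch of both variants. xor128 and xor128d are the functions from this thread. xor128d53 is a made-up name for one common way to use more bits: take exactly 53 bits from two draws (27 + 26) and scale by 2^-53. Because a 53-bit integer fits exactly in a double, that value cannot round up to 1.0. It still does not randomise every mantissa bit for very small results, so it does not answer the harder question above.

```c
#include <stdint.h>

/* xor128 from the question: xorshift with 128 bits of state. */
uint32_t xor128(void)
{
    static uint32_t x = 123456789, y = 362436069,
                    z = 521288629, w = 88675123;
    uint32_t t = x ^ (x << 11);
    x = y; y = z; z = w;
    return w = w ^ (w >> 19) ^ t ^ (t >> 8);
}

/* 32 random bits scaled by 2^-32: a value in [0, 1). */
double xor128d(void)
{
    return xor128() / 4294967296.0;     /* 2^32 */
}

/* 53 random bits built from two draws, scaled by 2^-53: a value in [0, 1). */
double xor128d53(void)
{
    uint64_t hi = xor128() >> 5;        /* keep the top 27 bits */
    uint64_t lo = xor128() >> 6;        /* keep the top 26 bits */
    return (double)((hi << 26) | lo) / 9007199254740992.0;  /* 2^53 */
}
```

The largest integer xor128d53 can produce before scaling is 2^53 - 1, so the largest result is 1 - 2^-53.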

How do I feed a 2-dimensional array into a kernel with pycuda?

I have created a numpy array of float32s with shape (64, 128), and I want to send it to the GPU. How do I do that? What arguments should my kernel function accept? float** myArray?
I have tried directly sending the array as it is to the GPU, but pycuda complains that objects are being accessed...
Two dimensional arrays in numpy/PyCUDA are stored in pitched linear memory in row major order by default. So you only need to have a kernel something like this:
__global__ void kernel(float* a, int lda, ...)
{
    int r0 = threadIdx.y + blockDim.y * blockIdx.y;
    int r1 = threadIdx.x + blockDim.x * blockIdx.x;
    float val = a[r0 + r1*lda];
}
to access a numpy ndarray or PyCUDA gpuarray passed by reference to the kernel from Python.
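In other words, the kernel takes a plain float*, not a float**. Here is the same index arithmetic as a host-side C sketch, assuming a C-contiguous (row-major, unpadded) array so that lda is just the number of columns, 128 for a (64, 128) array. The helper name get2d is made up for illustration. In the kernel above, r1 plays the role of the row and r0 the column.

```c
#include <stddef.h>

/* Element (row, col) of a row-major 2D array stored in one flat buffer.
   lda is the number of elements from the start of one row to the next. */
float get2d(const float *a, size_t lda, size_t row, size_t col)
{
    return a[row * lda + col];
}
```

For a (64, 128) array, element (2, 5) therefore sits at flat offset 2*128 + 5 = 261.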