Return Maximum Number of Sequential Values in a Row that Meet a Condition (MATLAB)

I have a large matrix of random values (e.g. 200,000 x 6,000) between 0-1, named 'allGSR'.
I used the following code to create a logical array where 1 represents numbers less than .05:
sig = (allGSR < .05);
What I'd like to do is to return an array of size 1 x 200,000 called maxSIG, where each element holds the MAXIMUM number of sequential ones in the corresponding row. So for example, if in row 1 columns 3-6 are ones, that is 4 ones in a row, and if columns 100-109 are ones, that is 10 ones in a row; if that is the maximum run of ones in the row, I would like the first element of maxSIG to be the value 10.
I have been doing this with for loops, if statements, and counters; this is ugly and tedious, and I was wondering if there is an easier or more efficient way.
Thank you for any insight.
EDIT: Whoops, should probably share the loop.
EDIT 2: So I just wrote out what my basic code is with a smaller (100 x 6,000) matrix. This code should run. Sorry for the inconvenience.
GSR = 6000;
samples = 100;
allGSR = zeros(samples, GSR);
for x = 1:samples
    y = rand(GSR, 1)'; %Transpose so it's 1x6000 and not 6000x1
    allGSR(x,:) = y;
end
countSIG = zeros(samples,1);
abovethreshold = (allGSR < .05); %.05 can be replaced by whatever
for z = 1:samples
    count = 0;
    holdArray = zeros(1,GSR);
    for a = 1:GSR
        if abovethreshold(z,a) == true
            count = count + 1;
        else
            count = 0;
        end
        holdArray(1,a) = count;
    end
    maxrun = max(holdArray);
    countSIG(z,1) = maxrun;
end

Here's one approach using diff, find & accumarray -
append_col = zeros(size(abovethreshold,1),1);
df = diff([append_col abovethreshold append_col],[],2).'; % pad each row with zeros, diff, then transpose
[R1,C1] = find(df==1);  % run starts
[R2,C2] = find(df==-1); % run ends
out = zeros(samples,1);
out(1:max(C1)) = accumarray(C1,R2 - R1,[],@max);
In the code posted above, we create a fat array by padding abovethreshold with zero columns and then transpose it. From a performance point of view, the transpose operation might not be the best thing to do. So instead we can leave the array as it is and move the bookkeeping around it, like so -
append_col = zeros(size(abovethreshold,1),1);
df = diff([append_col abovethreshold append_col],[],2); % no transpose this time
[R1,C1] = find(df==1);  % run starts
[R2,C2] = find(df==-1); % run ends
[~,idx1] = sort(R1);    % group starts by row
[~,idx2] = sort(R2);    % group ends by row
out = zeros(samples,1);
out(1:max(R1)) = accumarray(R1(idx1),C2(idx2) - C1(idx1),[],@max);
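As a quick sanity check (a sketch reusing the variables defined in the question), either vectorized version should reproduce the loop result exactly:
% countSIG comes from the loop version in the question
assert(isequal(out, countSIG)) % should pass for any random allGSR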

If you're worried about memory allocation, speed, etc. on huge arrays, I'd just do your same basic algorithm in C++. Throw this in something like a myfunction.cpp file and compile with mex -largeArrayDims myfunction.cpp.
You can then call it from MATLAB with counts = myfunction(allGSR, .05);
I haven't tested this beyond that it compiles.
#include "mex.h"
#include "matrix.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
if(nrhs != 2)
mexErrMsgTxt("Invalid number of inputs. Shoudl be 2 input argument.");
if(nlhs != 1)
mexErrMsgTxt("Invalid number of outputs. Should be 1 output arguments.");
if(!mxIsDouble(prhs[0]) || !mxIsDouble(prhs[1]))
mexErrMsgTxt("First two arguments are not doubles");
const mxArray *input_array = prhs[0];
const mxArray *threshold_array = prhs[1];
size_t input_rows = mxGetM(input_array);
size_t input_cols = mxGetN(input_array);
size_t threshold_rows = mxGetM(threshold_array);
size_t threshold_cols = mxGetN(threshold_array);
if(threshold_rows != 1 || threshold_cols != 1)
mexErrMsgTxt("threshold array should be a scalar");
mxArray *output_array = mxCreateDoubleMatrix(1, input_rows, mxREAL);
double *output_data = mxGetPr(output_array);
double *input_data = mxGetPr(input_array);
double threshold = *mxGetPr(threshold_array);
for(int z = 0; z < input_rows; z++) {
int count = 0;
int max_count = 0;
for(int a = 0; a < input_cols; a++) {
if(input_data[z + a * input_rows] < threshold) {
count++;
} else {
if(count > max_count)
max_count = count;
count = 0;
}
}
if(count > max_count)
max_count = count;
output_data[z] = max_count;
}
plhs[0] = output_array;
}
I'm not sure if you want to check for above or below the threshold? Whatever you do, you'd change the input_data[z + a * input_rows] < threshold comparison to whatever operator you want.
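If you want a quick correctness check against the MATLAB loop from the question (a sketch, assuming allGSR and countSIG from the question are in the workspace):
counts = myfunction(allGSR, .05);
isequal(counts(:), countSIG(:)) % should return true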

Here's a short cellfun approach, albeit slow since cellfun is a loop:
sig_rows = mat2cell(allGSR < .05, ones(size(allGSR,1),1), size(allGSR,2));
maxSIG = cellfun(@(s) max([0, s.Area]), cellfun(@(x) regionprops(x,'Area'), sig_rows, 'UniformOutput', false));
The Image Processing Toolbox function regionprops identifies connected groups of 1's in a logical matrix. By operating on each row of your matrix, and returning specifically the Area property, we get the length of each connected segment of 1's in that row. The max function then picks out the longest run in each row, which is what you're looking for.
Note the mat2cell call is necessary to split allGSR into a cell matrix of rows, so that cellfun can be called.
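To see what regionprops returns here, a minimal illustration on a single logical row (assuming the Image Processing Toolbox is installed):
row = logical([0 1 1 1 0 1 1 0]);
stats = regionprops(row, 'Area'); % two connected runs, with Areas 3 and 2
maxRun = max([stats.Area])        % returns 3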

Related

How to do Weighted Averaging of n consecutive values in an Array

I have a 900×1 vector of values (in MATLAB). Each 9 consecutive values should be averaged, without overlap, resulting in a 100×1 vector of values. The problem is that the averaging should be weighted based on a weighting vector of [1 2 1; 2 4 2; 1 2 1]. Is there any efficient way to do that averaging? I've heard about the conv function in MATLAB; is it helpful?
conv works by sliding a kernel through your data. But in your case, you need the mask to be jumping through your data, so I don't think conv will work for you.
If you want to use existing MATLAB functions, you can do this (I have to assume your weighting matrix is meant as a one-dimensional kernel):
kernel = [1;2;1;2;4;2;1;2;1];
in_matrix = reshape(in_matrix, 9, 100); % one window per column
base = sum(kernel);                     % normalization factor
out_matrix = bsxfun(@times, in_matrix, kernel);
result = sum(out_matrix,1)/base;
I don't know if there is any clever way to speed this up. bsxfun allows singleton expansion, but maybe not dimension reduction.
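As a side note, on MATLAB R2016b or newer, implicit expansion makes the bsxfun call unnecessary (same result):
out_matrix = in_matrix .* kernel; % implicit expansion, R2016b+
result = sum(out_matrix,1)/base;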
A faster way would be to use mex. Open a new file in the editor, paste the following code, and save the file as weighted_average.c.
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
double *in_matrix, *kernel, *out_matrix, base;
int niter;
size_t nrows_data, nrows_kernel;
/* Get number of element along first dimension of input matrix. */
nrows_kernel = mxGetM(prhs[1]);
nrows_data = mxGetM(prhs[0]);
/* Create output matrix*/
plhs[0] = mxCreateDoubleMatrix((mwSize)nrows_data/nrows_kernel,1,mxREAL);
/* Get a pointer to the real data */
in_matrix = mxGetPr(prhs[0]);
kernel = mxGetPr(prhs[1]);
out_matrix = mxGetPr(plhs[0]);
/* Sum the elements in weighting array */
base = 0;
for (int i = 0; i < nrows_kernel; i +=1)
{
base += kernel[i];
}
/* Perform calculation */
niter = nrows_data/nrows_kernel;
for (int i = 0; i < niter ; i += 1)
{
for (int j = 0; j < nrows_kernel; j += 1)
{
out_matrix[i] += in_matrix[i*nrows_kernel+j]*kernel[j];
}
out_matrix[i] /= base;
}
}
Then, in the command window, type
mex weighted_average.c
To use it:
result = weighted_average(input, kernel);
Note that both input and kernel have to be M×1 vectors. On my computer, the first method took 0.0012 seconds. The second method took 0.00007 seconds, more than an order of magnitude faster than the first method.
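If you want to confirm the two methods agree before trusting the timings, a quick check along these lines (an untested sketch) should do:
input  = rand(900,1);
kernel = [1;2;1;2;4;2;1;2;1];
r1 = (sum(bsxfun(@times, reshape(input,9,100), kernel), 1) / sum(kernel)).';
r2 = weighted_average(input, kernel);
max(abs(r1 - r2)) % should be on the order of eps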

How to implement parallel-for in a 4 level nested for loop block

I have to calculate the std and mean of a large data set with respect to quite a few models. The final loop block is nested to four levels.
This is what it looks like:
count = 1;
alpha = 0.5;
%%%Below if each individual block is to be posterior'd and then average taken
c = 1;
for i = 1:numel(writers) %no. of writers
    for j = 1:numel(test_feats{i}) %no. of images
        for k = 1:numel(gmm) %no. of models
            for n = 1:size(test_feats{i}{j},1)
                [~, scores(c)] = posterior(gmm{k}, test_feats{i}{j}(n,:));
                c = c + 1;
            end
            c = 1;
            index_kek = find(abs(scores-mean(scores)) > alpha*std(scores));
            avg = mean(scores(index_kek)); %using std instead of mean... because of... reasons
            NLL(count) = avg;
            count = count + 1;
        end
        count = 1; %reset count
        NLL_scores{i}(j,:) = NLL;
    end
    fprintf('***score for model_%d done***\n', i)
end
It works and gives the desired result, but it takes 3 days to give me the final calculation, even on my i7 processor. During processing the Task Manager tells me that only 20% of the CPU is being used, so I would rather put more load on the CPU to get the result faster.
Going by the official help, if I want to make the outermost loop a parfor while keeping the rest as regular for loops, all I have to do is use integer limits rather than function calls such as size or numel.
So making these changes the above code will become:
count = 1;
alpha = 0.5;
%%%Below if each individual block is to be posterior'd and then average taken
c = 1;
num_writers = numel(writers);
num_images = numel(test_feats{1});
num_models = numel(gmm);
num_feats = size(test_feats{1}{1},1);
parfor i = 1:num_writers %no. of writers
    for j = 1:num_images %no. of images
        for k = 1:num_models %no. of models
            for n = 1:num_feats
                [~, scores(c)] = posterior(gmm{k}, test_feats{i}{j}(n,:));
                c = c + 1;
            end
            c = 1;
            index_kek = find(abs(scores-mean(scores)) > alpha*std(scores));
            avg = mean(scores(index_kek)); %using std instead of mean... because of... reasons
            NLL(count) = avg;
            count = count + 1;
        end
        count = 1; %reset count
        NLL_scores{i}(j,:) = NLL;
    end
    fprintf('***score for model_%d done***\n', i)
end
Is this the most optimum way to implement parfor in my case? Can it be improved or optimized further?
I can't test this in MATLAB right now, but it should be close to a working solution. It has a reduced number of loops and changes a few implementation details, but overall it might perform just as fast as your earlier code, or even slower.
If gmm and test_feats take lots of memory, then it is important that parfor is able to determine which pieces of data need to be delivered to which workers. The IDE should warn you if inefficient memory access is detected. This modification is especially useful if num_writers is much less than the number of cores in your CPU, or if it is only slightly larger (like 5 writers for 4 cores would take about as long as 8 writers).
[i_writer, i_image, i_model] = ndgrid(1:num_writers, 1:num_images, 1:num_models);
idx_combined = [i_writer(:) i_image(:) i_model(:)];
n_combined = size(idx_combined, 1);
NLL_scores = zeros(n_combined, 1);
parfor i_for = 1:n_combined
    i = idx_combined(i_for, 1);
    j = idx_combined(i_for, 2);
    k = idx_combined(i_for, 3);
    % pre-allocate
    scores = zeros(num_feats, 1);
    for i_feat = 1:num_feats
        [~, scores(i_feat)] = posterior(gmm{k}, test_feats{i}{j}(i_feat,:));
    end
    % "find" is redundant here and performs a bit slower, might be insignificant though
    index_kek = abs(scores - mean(scores)) > alpha * std(scores);
    NLL_scores(i_for) = mean(scores(index_kek));
end
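One detail not shown above: parfor needs a parallel pool of workers. Recent MATLAB versions start one automatically on the first parfor, but you can control it explicitly (the worker count below is just an illustration):
pool = gcp('nocreate'); % get the current pool without starting one
if isempty(pool)
    parpool(4);         % start a pool with 4 workers (example value)
end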

How can I convert my CPU code for the dot product of two matrices to GPU in MATLAB

I want to take a weighted sum of two matrices using gpuArray to be fast. For example, my code on the CPU is given below:
mat1 = rand(19,19);
mat2= rand(19,19);
Receptive_fieldsize = [4,3];
overlap = 1;
Output = GetweightedSum(mat1,mat2, Receptive_fieldsize,overlap); %this will output a 6x6 matrix
whereas my function body is:
function Output = GetweightedSum(mat1,mat2, RF,overlap)
gap = RF(1) - overlap;
size_mat = size(mat1);
output_size = [6,6];
Output = zeros(output_size); % preallocate
for u = 1:output_size(1)
    for v = 1:output_size(2)
        min_u = (u - 1) * gap + 1;
        max_u = (u - 1) * gap + RF(1);
        min_v = (v - 1) * gap + 1;
        max_v = (v - 1) * gap + RF(2);
        input1 = mat1(min_u:max_u, min_v:max_v);
        input2 = mat2(min_u:max_u, min_v:max_v);
        Output(u,v) = sum(sum(input1 .* input2));
    end
end
How can I convert it to a GPU function? Can I do it directly, or can I use a for loop in GPU code? I am totally new to GPU computing, so I don't know anything about it. I would be thankful if someone could guide me, or change the above code into a GPU function as a reference, so that I may learn from it.
Regards
See if the code and the comments alongside it make sense to you -
function Output = GetweightedSumGPU(mat1,mat2, RF,overlap)
%// Create parameters
gap = RF(1) - overlap;
output_size=[6,6];
sz1 = output_size(1);
sz2 = output_size(2);
nrows = size(mat1,1); %// get number of rows in mat1
%// Copy data to GPU
gmat1 = gpuArray(mat1);
gmat2 = gpuArray(mat2);
start_row_ind = gpuArray([1:RF(1)]'); %// starting row indices for each block
col_offset = gpuArray([0:RF(2)-1]*nrows); %// column offset for each block
%// Linear indices for each block
ind = bsxfun(@plus,start_row_ind,col_offset);
%// Linear indices along rows and columns respectively
ind_rows = bsxfun(@plus,ind(:),[0:sz1-1]*gap);
ind_rows_cols = bsxfun(@plus,ind_rows,permute([0:sz2-1]*gap*nrows,[1 3 2]));
%// Elementwise multiplication, summing and gathering back result to CPU
Output = gather(reshape(sum(gmat1(ind_rows_cols).*gmat2(ind_rows_cols),1),sz1,sz2));
return;
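As a sanity check (a minimal sketch, assuming both functions are on the path), you can compare the GPU result against the original CPU loop:
mat1 = rand(19,19);
mat2 = rand(19,19);
out_cpu = GetweightedSum(mat1, mat2, [4,3], 1);
out_gpu = GetweightedSumGPU(mat1, mat2, [4,3], 1);
max(abs(out_cpu(:) - out_gpu(:))) % should be ~0, up to floating-point error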

Implementation of Resilient Propagation

Currently I am trying to implement Resilient Propagation for my network. I'm doing this based on the encog implementation, but there is one thing I don't understand:
The documentation for RPROP and iRPROP+ says when change > 0: weightChange = -sign(gradient) * delta
The source code in lines 298 and 366 does not have a minus!
Since I assume both are in some case correct: Why is there a difference between the two?
And concerning the gradient: I'm using tanh as the activation in the output layer. Is this the correct calculation of the gradient?
gradientOutput = (1 - lastOutput[j] * lastOutput[j]) * (target[j] - lastOutput[j]);
After re-reading the relevant papers and looking it up in a textbook, I think the documentation of encog is not correct at this point. Why don't you just try it out by temporarily adding the minus signs in the source code? If you use the same initial weights, you should receive exactly the same results, if the documentation were correct. But in the end it just matters how you use the weightUpdate variable. If the author of the documentation is used to subtracting the weightUpdate from the weights instead of adding it, this will work.
Edit: I revisited the part about the gradient calculation in my original answer.
First, here is a brief explanation on how you can imagine the gradient for the weights in your output layer. First, you calculate the error between your outputs and the target values.
What you are now trying to do is to "blame" those neurons in the previous layer, which were active. Imagine the output neuron saying "Well, I have an error here, who is responsible?". Responsible are the neurons of the previous layer. Depending on the output being too small or too large compared to the target value, it will increase or decrease the weights to each of the neurons in the previous layers depending on how active they have been.
x is the activation of a neuron in the hidden layer.
o is the activation of the output neuron.
φ is the activation function of the output neuron, φ' its derivative.
Edit2: Corrected the part below. Added matrix style computation of backpropagation.
The error at each output neuron j is:
(1) δ_out,j = φ'(o_j) · (t_j − o_j)
The gradient for the weight connecting the hidden neuron i with the output neuron j is:
(2) grad_i,j = x_i · δ_out,j
The backpropagated error at each hidden neuron i, given the weights w, is:
(3) δ_hid,i = φ'(x_i) · Σ_j w_i,j · δ_out,j
By repeatedly applying formulas (2) and (3), you can backpropagate all the way to the input layer.
Written in loops, regarding one training sample:
The error at each output neuron j is:
for (int j = 0; j < numOutNeurons; j++) {
    errorOut[j] = activationDerivative(o[j]) * (t[j] - o[j]);
}
The gradient for the weight connecting the hidden neuron i with the output neuron j:
for (int i = 0; i < numHidNeurons; i++) {
    for (int j = 0; j < numOutNeurons; j++) {
        grad[i][j] = x[i] * errorOut[j];
    }
}
The backpropagated error at each hidden neuron i (note the sum over the output neurons):
for (int i = 0; i < numHidNeurons; i++) {
    errorHid[i] = 0;
    for (int j = 0; j < numOutNeurons; j++) {
        errorHid[i] += weights[i][j] * errorOut[j];
    }
    errorHid[i] *= activationDerivative(x[i]);
}
In fully connected multilayer perceptrons, without convolution or anything like that, you can use standard matrix operations, which is a lot faster.
Assuming each of your samples is a row in your input matrix and the columns are its attributes, you can propagate the input through your network like this:
activations[0] = input;
for (int i = 0; i < numWeightMatrices; i++) {
    activations[i+1] = activations[i].dot(weightMatrices[i]);
    activations[i+1] = activationFunction(activations[i+1]);
}
Backpropagation then becomes:
n = numWeightMatrices;
error = activationDerivative(activations[n]) * (target - activations[n]);
for (int l = n-1; l >= 0; l--) {
    gradient[l] = activations[l].transposed().dot(error);
    if (l > 0) {
        error = error.dot(weightMatrices[l].transposed());
        error = activationDerivative(activations[l]) * error;
    }
}
I omitted the bias neuron in the above explanations. In the literature it is recommended to model the bias neuron as an additional column in each activation matrix which is always 1.0. You will need to deal with some slice assigns. When using the matrix backpropagation loop, do not forget to set the error at the position of the bias to 0 before each step! With the gradients in place, the resilient propagation update for a single weight can look like this:
private float resilientPropagation(int i, int j) {
    float gradientSignChange = sign(prevGradient[i][j] * gradient[i][j]);
    float delta = 0;
    if (gradientSignChange > 0) {
        float change = Math.min((prevChange[i][j] * increaseFactor), maxDelta);
        delta = sign(gradient[i][j]) * change;
        prevChange[i][j] = change;
        prevGradient[i][j] = gradient[i][j];
    }
    else if (gradientSignChange < 0) {
        float change = Math.max((prevChange[i][j] * decreaseFactor), minDelta);
        prevChange[i][j] = change;
        delta = -prevDelta[i][j];
        prevGradient[i][j] = 0;
    }
    else if (gradientSignChange == 0) {
        float change = prevChange[i][j];
        delta = sign(gradient[i][j]) * change;
        prevGradient[i][j] = gradient[i][j];
    }
    prevDelta[i][j] = delta;
    return delta;
}
The gradient and the weight update for each connection are then:
gradient[i][j] = error[j] * layerInput[i];
weights[i][j] = weights[i][j] + resilientPropagation(i, j);

Random number without repeating

I have an NSMutableArray with 50 elements. I need to pick elements randomly without any repetition. Can you suggest some sample code?
Create a local mutable-array copy of the main array; after getting a random index, process the object at that index and then remove it from the local copy. Repeat until the local array is empty.
Here is a sample to get a random int less than 1000:
int y = arc4random() % 1000;
(arc4random_uniform(1000) does the same without the modulo bias.) To stay without duplicates, just check before inserting.
I have assumed you want to generate numbers. This is the answer I have used for generating M random numbers out of N. Though it doesn't add them to an NSMutableArray, I'm sure you can adapt this code as required.
#include <assert.h>
#include <stdlib.h>

#define M 10
#define N 100

unsigned char is_used[N] = { 0 }; /* flags */
int vektor[M]; /* holds the M picked numbers */
int in, im;

im = 0;
for (in = N - M; in < N && im < M; ++in) {
    int r = rand() % (in + 1); /* generate a random number 'r' in [0..in] */
    if (is_used[r])
        /* we already have 'r' */
        r = in; /* use 'in' instead of the generated number */
    assert(!is_used[r]);
    vektor[im++] = r + 1; /* +1 since your range begins from 1 */
    is_used[r] = 1;
}
assert(im == M);
Why the above works is not immediately obvious, but it does: this is Floyd's sampling algorithm, and exactly M numbers from the [1..N] range will be picked with uniform distribution.
Note that for large N you can use a search-based structure to store the "already used" numbers, thus getting a nice O(M log M) algorithm with an O(M) memory requirement.