How to speed up code in MATLAB? - matlab

Below is my code for forward propagation in a neural network, and I want to speed it up. The for loops take most of the time. Can anybody help me rewrite the code to make it faster, for example by vectorizing it as MATLAB suggests?
In this code I take a 4x4 receptive field from the 19x19 input at each step, then multiply each pixel by the corresponding weight (net.w{layer_no}, which is also 19x19) and sum the products. You could also call it a dot product of the two patches. I did not take the dot product of the two small matrices directly because the boundaries need to be checked. The result is a 6x6 output, which is saved in output at the end. I am not an experienced coder, so I did as much as I could. Can anybody guide me on how to speed it up? It takes a lot of time compared to OpenCV. Thanks in advance. Regards
receptiveSize = 4;
overlap= 1;
inhibatory = 0;
gap = receptiveSize-overlap;
UpperLayerSize = size(net.b{layer_no}); % 6x6
Curr_layerSize = size(net.w{layer_no}); % 19x19
for u=1:UpperLayerSize(1)-1
for v=1:UpperLayerSize(2)-1
summed_value=0;
min_u = (u - 1) * gap + 1;
max_u = (u - 1) * gap + receptiveSize;
min_v = (v - 1) * gap + 1;
max_v = (v - 1) * gap + receptiveSize;
for i = min_u : max_u
for j = min_v : max_v
if(i>Curr_layerSize(1) || j>Curr_layerSize(2))
continue;
end
if(i<1 || j<1)
continue;
end
summed_value = summed_value + input{layer_no}.images(i,j,sample_ind) * net.w{layer_no}(i,j);
end
end
summed_value = summed_value + net.b{layer_no}(u,v);
input{layer_no+1}.images(u,v,sample_ind) = summed_value;
end
end
temp = activate_Mat(input{layer_no+1}.images(:,:,sample_ind),net.AF{layer_no});
output{layer_no}.images(:,:,sample_ind) = temp(:,:);

How about replacing the inner loops (the loops over i and j) with something like:
ii = max( 1, min_u ) : min( max_u, Curr_layerSize(1) );
jj = max( 1, min_v ) : min( max_v, Curr_layerSize(2) );
input{layer_no+1}.images(u,v,sample_ind) = ...
reshape( input{layer_no}.images(ii,jj,sample_ind), 1, [] ) * ...
reshape( net.w{layer_no}(ii,jj), [], 1 ) + ...
net.b{layer_no}(u,v); %// should this term be added rather than multiplied?
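Going one step further, the two outer loops can probably be removed as well. The following is a minimal sketch, assuming the receptive fields tile the input exactly (a 19x19 input, a 4x4 field, and a gap of 3 give 6x6 fields), so no boundary clipping is needed. It multiplies the image and weights element by element, takes every 4x4 box sum with conv2, and keeps every gap-th sum. The names img, w, and b are stand-ins for input{layer_no}.images(:,:,sample_ind), net.w{layer_no}, and net.b{layer_no}, filled with random data for illustration. Note that it fills all 6x6 outputs, whereas the loops in the question stop at UpperLayerSize-1.

```matlab
receptiveSize = 4;
overlap = 1;
gap = receptiveSize - overlap;

% Stand-in data (assumed shapes: 19x19 input and weights, 6x6 bias)
img = rand(19, 19);
w   = rand(19, 19);
b   = rand(6, 6);

% Element-wise products, then the sum over every 4x4 window
P = img .* w;
boxSums = conv2(P, ones(receptiveSize), 'valid');   % 16x16

% Keep only the windows that start at rows/columns 1, 4, 7, ...
vecOut = boxSums(1:gap:end, 1:gap:end) + b;          % 6x6

% Reference: the loop formulation, for comparison only
loopOut = zeros(size(b));
for u = 1:size(b, 1)
    for v = 1:size(b, 2)
        ii = (u - 1) * gap + (1:receptiveSize);
        jj = (v - 1) * gap + (1:receptiveSize);
        loopOut(u, v) = sum(sum(P(ii, jj))) + b(u, v);
    end
end
maxDiff = max(abs(vecOut(:) - loopOut(:)));
```

If the fields do not tile the input exactly, zero-padding P on the bottom and right before the conv2 call should give the same effect as the boundary checks, since the skipped pixels contribute nothing to the sum.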

Related

MATLAB function for image filtering

I'm looking to implement my own Matlab function that can be used to compute image filtering with a 3x3 kernel.
It has to be like this: function [ output_args ] = fFilter( img, mask )
where img is the original image and mask is a kernel (for example B = [1,1,1;1,4,1;1,1,1]).
I'm not supposed to use any in-built functions from Image Processing Toolbox.
I have to use this
where:
s is an image after filter
p is an image before filter
M is a kernel
and N is 1 if sum(sum(M)) == 0 else N = sum(sum(M))
I'm new to MATLAB and this is like black magic for me -_-
This should do the work (Wasn't verified):
function [ mO ] = ImageFilter( mI, mMask )
%IMAGEFILTER Correlate image mI with kernel mMask (replicate boundary).
% The mask is normalized by its sum when that sum is nonzero.
numRows = size(mI, 1);
numCols = size(mI, 2);
% Assuming Odd number of Rows / Columns
maskRadius = floor(size(mMask, 1) / 2);
sumMask = sum(mMask(:));
if(sumMask ~= 0)
mMask(:) = mMask / sumMask;
end
mO = zeros([numRows, numCols]);
for jj = 1:numCols
for ii = 1:numRows
for kk = -maskRadius:maskRadius
nn = kk + maskRadius + 1; %<! Mask Index (1-based, so kk = -maskRadius maps to 1)
colIdx = min(max(1, jj + kk), numCols); %<! Replicate Boundary
for ll = -maskRadius:maskRadius
mm = ll + maskRadius + 1; %<! Mask Index (1-based, so ll = -maskRadius maps to 1)
rowIdx = min(max(1, ii + ll), numRows); %<! Replicate Boundary
mO(ii, jj) = mO(ii, jj) + (mMask(mm, nn) * mI(rowIdx, colIdx));
end
end
end
end
end
The above is classic Correlation (Image Filtering) with Replicate Boundary Condition.
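The same correlation can probably be written without loops using conv2, which is part of core MATLAB rather than the Image Processing Toolbox. This is a minimal sketch under the assumptions of the answer above (odd-sized square mask, replicate boundary, normalization by the mask sum unless it is zero); the example image magic(5) is only a placeholder.

```matlab
mI = magic(5);                       % placeholder image
mMask = [1 1 1; 1 4 1; 1 1 1];       % example kernel from the question
r = floor(size(mMask, 1) / 2);       % mask radius

% Replicate-pad by clamping the indices to the valid range
rowIdx = min(max((1 - r):(size(mI, 1) + r), 1), size(mI, 1));
colIdx = min(max((1 - r):(size(mI, 2) + r), 1), size(mI, 2));
mPad = mI(rowIdx, colIdx);

% N is the mask sum, or 1 when the sum is zero
N = sum(mMask(:));
if N == 0
    N = 1;
end

% conv2 flips the kernel, so rotate it by 180 degrees to get correlation
mO = conv2(mPad, rot90(mMask, 2), 'valid') / N;
```

The 'valid' option trims the padded border again, so mO has the same size as mI.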

How to reduce the time consumed by the for loop?

I am trying to implement a simple pixel level center-surround image enhancement. Center-surround technique makes use of statistics between the center pixel of the window and the surrounding neighborhood as a means to decide what enhancement needs to be done. In the code given below I have compared the center pixel with average of the surrounding information and based on that I switch between two cases to enhance the contrast. The code that I have written is as follows:
im = normalize8(im,1); %to set the range of pixel from 0-255
s1 = floor(K1/2); %K1 is the size of the window for surround
M = 1000; %is a constant value
out1 = padarray(im,[s1,s1],'symmetric');
out1 = CE(out1,s1,M);
out = (out1(s1+1:end-s1,s1+1:end-s1));
out = normalize8(out,0); %to set the range of pixel from 0-1
function [out] = CE(out,s,M)
B = 255;
out1 = out;
for i = s+1 : size(out,1) - s
for j = s+1 : size(out,2) - s
temp = out(i-s:i+s,j-s:j+s);
Yij = out1(i,j);
Sij = (1/(2*s+1)^2)*sum(sum(temp));
if (Yij>=Sij)
Aij = A(Yij-Sij,M);
out1(i,j) = ((B + Aij)*Yij)/(Aij+Yij);
else
Aij = A(Sij-Yij,M);
out1(i,j) = (Aij*Yij)/(Aij+B-Yij);
end
end
end
out = out1;
function [Ax] = A(x,M)
if x == 0
Ax = M;
else
Ax = M/x;
end
The code does the following things:
1) Normalize the image to 0-255 range and pad it with additional elements to perform windowing operation.
2) Calls the function CE.
3) In the function CE obtain the windowed image(temp).
4) Find the average of the window (Sij).
5) Compare the center of the window (Yij) with the average value (Sij).
6) Based on the result of comparison perform one of the two enhancement operation.
7) Finally set the range back to 0-1.
I have to run this for multiple window sizes (K1, K2, K3, etc.) and the images are of size 1728*2034. When the window size is 100, the run time becomes very long.
Can I use vectorization at some stage to reduce the time for loops?
The profiler result (for window size 21) is as follows:
The profiler result (for window size 100) is as follows:
I have changed the code of my function and have written it without the sub-function. The code is as follows:
function [out] = CE(out,s,M)
B = 255;
Aij = zeros(1,2);
out1 = out;
n_factor = (1/(2*s+1)^2);
for i = s+1 : size(out,1) - s
for j = s+1 : size(out,2) - s
temp = out(i-s:i+s,j-s:j+s);
Yij = out1(i,j);
Sij = n_factor*sum(sum(temp));
if Yij-Sij == 0
Aij(1) = M;
Aij(2) = M;
else
Aij(1) = M/(Yij-Sij);
Aij(2) = M/(Sij-Yij);
end
if (Yij>=Sij)
out1(i,j) = ((B + Aij(1))*Yij)/(Aij(1)+Yij);
else
out1(i,j) = (Aij(2)*Yij)/(Aij(2)+B-Yij);
end
end
end
out = out1;
There is a slight improvement in speed, from 93 sec to 88 sec. Suggestions for any other improvements to my code are welcome.
I have tried to incorporate the suggestions given to replace sliding window with convolution and then vectorize the rest of it. The code below is my implementation and I'm not getting the result expected.
function [out_im] = CE_conv(im,s,M)
B = 255;
temp = ones(2*s,2*s);
temp = temp ./ numel(temp);
out1 = conv2(im,temp,'same');
out_im = im;
Aij = im-out1; %same as Yij-Sij
Aij1 = out1-im; %same as Sij-Yij
Mij = Aij;
Mij(Aij>0) = M./Aij(Aij>0); % if Yij>Sij Mij = M/Yij-Sij;
Mij(Aij<0) = M./Aij1(Aij<0); % if Yij<Sij Mij = M/Sij-Yij;
Mij(Aij==0) = M; % if Yij-Sij == 0 Mij = M;
out_im(Aij>=0) = ((B + Mij(Aij>=0)).*im(Aij>=0))./(Mij(Aij>=0)+im(Aij>=0));
out_im(Aij<0) = (Mij(Aij<0).*im(Aij<0))./ (Mij(Aij<0)+B-im(Aij<0));
I am not able to figure out where I'm going wrong.
A detailed explanation of what I'm trying to implement is given in the following paper:
Vonikakis, Vassilios, and Ioannis Andreadis. "Multi-scale image contrast enhancement." In Control, Automation, Robotics and Vision, 2008. ICARCV 2008. 10th International Conference on, pp. 856-861. IEEE, 2008.
I've tried to see if I could get those times down by processing with colfilt and nlfilter, since both are usually much faster than for-loops for sliding window image processing.
Both worked fine for relatively small windows. For an image of 2048x2048 pixels and a window of 10x10, the solution with colfilt takes about 5 seconds (on my personal computer). With a window of 21x21 the time jumped to 27 seconds, but that is still an improvement on the times reported in the question. Unfortunately I don't have enough memory to run colfilt with windows of 100x100, but the solution with nlfilter works, though it takes about 120 seconds.
Here is the code.
Solution with colfilt:
function outval = enhancematrix(inputmatrix,M,B)
%Inputmatrix is a 2D matrix or column vector, outval is a 1D row vector.
% If inputmatrix is made of integers...
inputmatrix = double(inputmatrix);
%1. Compute S and Y
normFactor = 1 / size(inputmatrix,1); % Each column holds one w*w neighbourhood, so this is 1/w^2.
S = normFactor*sum(inputmatrix,1); % Sum over the columns.
Y = inputmatrix(ceil(size(inputmatrix,1)/2),:); % Center row.
% So far we have all S and Y, one value per column.
%2. Compute A(abs(Y-S))
A = Afunc(abs(S-Y),M);
% And all A: one value per column.
%3. The tricky part. If Y(i) >= S(i) use the first formula, otherwise the second.
doPositive = (Y >= S);
doNegative = ~doPositive;
outval = zeros(1,size(inputmatrix,2));
outval(doPositive) = ((B + A(doPositive)) .* Y(doPositive)) ./ (A(doPositive) + Y(doPositive));
outval(doNegative) = (A(doNegative) .* Y(doNegative)) ./ (A(doNegative) + B - Y(doNegative));
end
function out = Afunc(x,M)
% Input x is a row vector. Output is another row vector.
out = x;
out(x == 0) = M;
out(x ~= 0) = M./x(x ~= 0);
end
And to call it, simply do:
M = 1000; B = 255; enhancenow = @(x) enhancematrix(x,M,B);
w = 21 % windowsize
result = colfilt(inputImage,[w w],'sliding',enhancenow);
Solution with nlfilter:
function outval = enhanceimagecontrast(neighbourhood,M,B)
%1. Compute S and Y
normFactor = 1 / numel(neighbourhood); % 1/w^2 for a w-by-w neighbourhood
S = normFactor*sum(neighbourhood(:));
Y = neighbourhood(ceil(size(neighbourhood,1)/2),ceil(size(neighbourhood,2)/2));
%2. Compute A(abs(Y-S))
test = (Y>=S);
A = Afunc(abs(Y-S),M);
%3. Return outval
if test
outval = ((B + A) * Y) / (A + Y);
else
outval = (A * Y) / (A + B - Y);
end
function aval = Afunc(x,M)
if (x == 0)
aval = M;
else
aval = M/x;
end
And to call it, simply do:
M = 1000; B = 255; enhancenow = @(x) enhanceimagecontrast(x,M,B);
w = 21 % windowsize
result = nlfilter(inputImage,[w w], enhancenow);
I didn't spend much time checking that everything is 100% correct, but I did see some nice contrast enhancement (hair looks particularly nice).
This answer is the implementation that was suggested by Peter. I debugged it and am presenting the final working version of the fast implementation.
function [out_im] = CE_conv(im,s,M)
B = 255;
im = double(im); % avoid integer arithmetic when im is uint8 (e.g. straight from imread)
im = ( im - min(im(:)) ) ./ ( max(im(:)) - min(im(:)) )*255;
h = ones(s,s)./(s*s);
out1 = imfilter(im,h,'conv');
out_im = im;
Aij = im-out1; %same as Yij-Sij
Aij1 = out1-im; %same as Sij-Yij
Mij = Aij;
Mij(Aij>0) = M./Aij(Aij>0); % if Yij>Sij Mij = M/(Yij-Sij);
Mij(Aij<0) = M./Aij1(Aij<0); % if Yij<Sij Mij = M/(Sij-Yij);
Mij(Aij==0) = M; % if Yij-Sij == 0 Mij = M;
out_im(Aij>=0) = ((B + Mij(Aij>=0)).*im(Aij>=0))./(Mij(Aij>=0)+im(Aij>=0));
out_im(Aij<0) = (Mij(Aij<0).*im(Aij<0))./ (Mij(Aij<0)+B-im(Aij<0));
out_im = ( out_im - min(out_im(:)) ) ./ ( max(out_im(:)) - min(out_im(:)) );
To call this use the following code
I = imread('pout.tif');
w_size = 51;
M = 4000;
output = CE_conv(I(:,:,1),w_size,M);
The output for the 'pout.tif' image is given below
With this implementation, the execution time for a bigger image and a 100*100 block size is around 5 seconds.
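The speed-up comes from replacing the per-pixel window average Sij with a single box filter. As a small check of that idea, the sketch below compares the loop average over a (2s+1)x(2s+1) window with conv2 and a normalized box kernel of the same size. The image is random placeholder data, and only the interior pixels are compared, so border handling is left out. Borders are where the versions above may differ: as far as I know imfilter zero-pads by default, while the loop version in the question pads symmetrically.

```matlab
s = 2;                         % window half-width (assumed value)
k = 2*s + 1;                   % full window size, odd by construction
im = rand(12, 15) * 255;       % placeholder image

% Box-filter mean for all interior pixels in one call
h = ones(k) / k^2;
Sconv = conv2(im, h, 'valid');

% Reference: explicit window average, as in the loop version of CE
Sloop = zeros(size(im) - 2*s);
for i = s+1 : size(im, 1) - s
    for j = s+1 : size(im, 2) - s
        win = im(i-s:i+s, j-s:j+s);
        Sloop(i-s, j-s) = sum(win(:)) / k^2;
    end
end
maxErr = max(abs(Sconv(:) - Sloop(:)));
```

An even-sized kernel such as ones(2*s,2*s) has no center pixel, so it cannot reproduce this window exactly.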

Recursively divide a square field - Matlab crashes

I am working with simulation of wireless sensor networks in matlab.
I have a 200*200 field in which 100 sensor nodes have been placed randomly. Each node has an associated load value. I have to place charging stations in this field. I am trying to divide the square recursively until I reach a sub-square small enough that only one charging station is placed in it. Here is the code I wrote to divide the square recursively and count the number of stations that can be placed in a sub-square:
%Inputs to the function
%numstations - No. of stations to be placed = 10
%boundCoords - A 2*2 matrix with min and max coordinates of square . e.g [0 0;200 200]
% sensors - A 100*3 matrix for nodes with 1st column as randomly generated 100 x-coordinates,
%second column as randomly generated 100 y-coordinates,
%third column as corresponding load of each node (can be random)
function stationPoss = deploy(numStations, boundCoords)
global sensors;
centerCoord = mean(boundCoords, 1);
numSensors = size(sensors, 1);
sumQuadLoad = zeros(1, 4);
for i = 1:numSensors
if sensors(i, 1) < boundCoords(2, 1) && sensors(i, 2) < boundCoords(2, 2)...
&& sensors(i, 1) > boundCoords(1, 1) && sensors(i, 2) > boundCoords(1, 2)
isIn34Quads = sensors(i, 1) > centerCoord(1); % N
isIn24Quads = sensors(i, 2) > centerCoord(2);
biQuadIndex = [isIn34Quads, isIn24Quads];
quadIndex = bi2de(biQuadIndex) + 1;
sumQuadLoad(quadIndex) = sumQuadLoad(quadIndex) + sensors(i, 3);
end
end
if numStations == 1
[maxQuadLoad, quad] = max(sumQuadLoad); %#ok<ASGLU>
delta = (centerCoord - boundCoords(1, :)) .* de2bi(quad - 1);
assoQuadCoords = [boundCoords(1, :); centerCoord] + repmat(delta, 2, 1);
stationPoss = mean(assoQuadCoords, 1);
else
sumLoad = sum(sumQuadLoad);
quadNumStations = zeros(1, 4);
for i = 1:3
if sumQuadLoad(i) == 0
quadNumStations(i) = 0;
else
quadNumStations(i) = floor(numStations * sumQuadLoad(i) / sumLoad);
end
end
quadNumStations(4) = numStations - sum(quadNumStations);
stationPoss = zeros(numStations, 2);
for i = 1:4
delta = (centerCoord - boundCoords(1, :)) .* de2bi(i - 1);
newBoundCoords = [boundCoords(1, :); centerCoord] + repmat(delta, 2, 1);
if quadNumStations(i) ~= 0
indexRange = sum(quadNumStations(1:i-1)) + (1:quadNumStations(i));
stationPoss(indexRange, :) = deploy(quadNumStations(i), newBoundCoords);
end
end
end
The problem is while trying to run this code with numStations=2 it works fine and with numStations=3 it sometimes crashes. For numStation > 3 it almost always crashes.
I tried to come up with a non-recursive way to write this function but wasn't able to.
Will anyone please help me to figure out the crash problem or in writing non recursive solution to the above function. I have already tried increasing the recursion limit.
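This is not a confirmed diagnosis of the crash, but one part that may be worth isolating is the quadrant bookkeeping. The sketch below computes the per-quadrant loads and one quadrant's bounds without bi2de and de2bi, using fixed two-bit arithmetic instead. As far as I know, de2bi(1) returns a single bit unless a width is requested (de2bi(i-1, 2)), which would shift both coordinates for the second quadrant. The five sensors are made-up illustration data.

```matlab
% Made-up sensors: x, y, load
sensors = [10 10 1; 150 20 2; 30 170 3; 160 180 4; 120 130 5];
boundCoords = [0 0; 200 200];
centerCoord = mean(boundCoords, 1);

% Sensors strictly inside the current square
inside = all(bsxfun(@gt, sensors(:, 1:2), boundCoords(1, :)), 2) & ...
         all(bsxfun(@lt, sensors(:, 1:2), boundCoords(2, :)), 2);
xy = sensors(inside, 1:2);

% Quadrant index 1..4: bit 0 = right half, bit 1 = upper half
isRight = xy(:, 1) > centerCoord(1);
isUpper = xy(:, 2) > centerCoord(2);
quadIndex = 1 + isRight + 2 * isUpper;
sumQuadLoad = accumarray(quadIndex, sensors(inside, 3), [4 1])';

% Bounds of quadrant q, always using two bits for the offset
q = 2;
bits = [mod(q - 1, 2), floor((q - 1) / 2)];
delta = (centerCoord - boundCoords(1, :)) .* bits;
quadBounds = [boundCoords(1, :); centerCoord] + repmat(delta, 2, 1);
```

Separately, if every station ends up assigned to the same quadrant at each level (for example when sumLoad is zero), the recursion never reaches numStations == 1, so a depth or minimum-size guard might be needed.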

Implement those functions using matlab

I have an array of samples of an ECG signal (1250x1 double); let us call it "a".
I need to implement 4 functions that represent features used to characterize the signal: Energy, 4th Power, Nonlinear Energy and Curve Length.
I managed to implement Energy and 4th Power:
for i=1:1250
energy = sum(a.^2,i);
power4th = sum(a.^4,i);
end
This produces 2 arrays (energy and power4th).
How can I produce the other 2 arrays? Let us call them NonLE and CL.
Use vectorization instead of for loops to compute all 4 of the formulas you need:
% generate some random numbers
a = rand(1000,1);
Energy = sum(a.^2);
Power4 = sum(a.^4);
NLEnergy = sum(-a(3:end).*a(1:end-2) + a(2:end-1).^2); % all three slices have length numel(a)-2
CurveLength = sum(a(2:end) - a(1:end-1));
The dot in front of an operator (as in .^ and .*) makes it act element by element on a vector.
Actually, I think you can implement your formulas without a for loop by using matrix multiplication: multiplying a row vector of ones by a column vector sums its elements. Try the code below:
len = 1250;
a = randi(10, len, 1); % // You didn't give your vector, so I generated a random a.
Energy = ones(1, len) * (a.^2);
power4th = ones(1, len) * (a.^4);
NonLE = ones(1, len - 2) * ( -a(3:end) .* a(1:end-2) ) + ones(1, len - 1) * ( a(2:end).^2 );
CL = ones(1, len - 1) * ( a(2:end) - a(1:end-1) );
You don't really need a for loop for 3 of them:
energy = sum(a.^2);
power_4th = sum(a.^4);
curve_length = sum(diff(a));
For the last one, you can do something like:
nonLE = 0;
for k = 3 : length(a)
nonLE = nonLE + a(k - 1)^2 - a(k) * a(k - 2);
end
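The loop for the last feature can also be vectorized by slicing the vector three times with matching lengths. This is a small sketch on a made-up six-sample vector, assuming the nonlinear energy is the sum of a(k-1)^2 - a(k)*a(k-2) as in the loop just above.

```matlab
a = [1; 4; 9; 16; 25; 36];     % made-up samples

% Loop version, as in the answer above
nonLE_loop = 0;
for k = 3 : length(a)
    nonLE_loop = nonLE_loop + a(k - 1)^2 - a(k) * a(k - 2);
end

% Vectorized version: middle, right, and left neighbours, each of length n-2
nonLE_vec = sum(a(2:end-1).^2 - a(3:end) .* a(1:end-2));
```

Both give 104 for this vector (7 + 17 + 31 + 49).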

Sum of Absolute differences between images in Matlab

I want to implement the sum of absolute differences (SAD) in MATLAB to establish a similarity metric between one video frame and the 5 frames on either side of it (i.e. past and future frames). I only need the SAD value for the co-located pixel in each frame, rather than a full search routine.
Obviously I could implement this as nested loops such as:
bs = 2; % block size
for (z_i = -bs:1:bs)
for (z_j = -bs:1:bs)
I1(1+bs:end-bs,1+bs:end-bs) = F1(1+bs+z_i:end-bs+z_i, 1+bs+z_j:end-bs+z_j);
I2(1+bs:end-bs,1+bs:end-bs) = F2(1+bs+z_i:end-bs+z_i, 1+bs+z_j:end-bs+z_j);
sad(:,:) = sad(:,:) + abs( I1(:,:) - I2(:,:));
end
end
However, I'm wondering whether there is a more efficient way of doing this. At the very least, I guess I should define the above code snippet as a function?
Any recommendations would be gratefully accepted!
You should use the im2col command in MATLAB; it lets you do this in a vectorized manner.
Just arrange each neighborhood in a column (for each frame).
Put them in a 3D matrix and apply your operation along the 3rd dimension.
Code Snippet
I used Wikipedia's definition of "Sum of Absolute Differences".
The demo script:
```
% Sum of Absolute Differences Demo
numRows = 10;
numCols = 10;
refBlockRadius = 1;
refBlockLength = (2 * refBlockRadius) + 1;
mImgSrc = randi([0, 255], [numRows, numCols]);
mRefBlock = randi([0, 255], [refBlockLength, refBlockLength]);
mSumAbsDiff = SumAbsoluteDifferences(mImgSrc, mRefBlock);
```
The Function SumAbsoluteDifferences:
```
function [ mSumAbsDiff ] = SumAbsoluteDifferences( mInputImage, mRefBlock )
%SUMABSOLUTEDIFFERENCES SAD between each sliding block of mInputImage and mRefBlock.
% Assumes mRefBlock is square with an odd side length; borders are replicated.
numRows = size(mInputImage, 1);
numCols = size(mInputImage, 2);
blockLength = size(mRefBlock, 1);
blockRadius = (blockLength - 1) / 2;
mInputImagePadded = padarray(mInputImage, [blockRadius, blockRadius], 'replicate', 'both');
mBlockCol = im2col(mInputImagePadded, [blockLength, blockLength], 'sliding');
mSumAbsDiff = sum(abs(bsxfun(@minus, mBlockCol, mRefBlock(:))));
mSumAbsDiff = col2im(mSumAbsDiff, [blockLength, blockLength], [(numRows + blockLength - 1), (numCols + blockLength - 1)]);
end
```
Enjoy...
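For the co-located case in the question (two frames, the same block position in both), the nested loops add up |F1 - F2| over a (2*bs+1)x(2*bs+1) neighbourhood, which is a box sum of the absolute difference image. The sketch below expresses that with a single conv2 call and compares it with the loops on the interior pixels. The frames are random placeholders, and the conversion to double is an assumption that guards against uint8 saturation.

```matlab
bs = 2;                                   % block radius, as in the question
F1 = randi([0 255], 12, 14);              % placeholder frames
F2 = randi([0 255], 12, 14);

% Absolute difference image, then a box sum over each block
D = abs(double(F1) - double(F2));
sadVec = conv2(D, ones(2*bs + 1), 'valid');

% Reference: shift-and-accumulate loops, interior pixels only
sadLoop = zeros(size(F1) - 2*bs);
for z_i = -bs:bs
    for z_j = -bs:bs
        I1 = F1(1+bs+z_i:end-bs+z_i, 1+bs+z_j:end-bs+z_j);
        I2 = F2(1+bs+z_i:end-bs+z_i, 1+bs+z_j:end-bs+z_j);
        sadLoop = sadLoop + abs(I1 - I2);
    end
end
maxDiff = max(abs(sadVec(:) - sadLoop(:)));
```

Repeating this for each of the neighbouring frames gives one SAD map per frame pair.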