i am currently doing a case study on the improved performance of a separable filter vs that of a square filter. I understand the mathematics behind the time complexity difference, however i have run into a problem with the real world implementation.
so basically what i have done is write a loop which implements my filter image function given by:
function imOut = FilterImage(imIn, kernel, boundFill, outputSize)
VkernelOffset = floor(size(kernel,1)/2);
HkernelOffset = floor(size(kernel,2)/2);
imIn = padarray(imIn, [VkernelOffset HkernelOffset], boundFill);
imInPadded = padarray(imIn, [VkernelOffset HkernelOffset], boundFill);
imOut = zeros(size(imIn));
kernelVector = reshape(kernel,1, []);
kernelVector3D = repmat(kernelVector, 1, 1, size(imIn,3));
for row = 1:size(imIn,1)
Vwindow = row + size(kernel,1)-1;
for column = 1:size(imIn,2)
Hwindow = column + size(kernel,2)-1;
imInWindowVector = reshape( ...
imInPadded(row:Vwindow, column:Hwindow, :),1,[],size(imIn,3));
imOut(row,column, :) = sum((imInWindowVector.*kernelVector3D),2);
end
end
ouputSize = lower(outputSize);
if strcmp(outputSize, 'same')
imOut = imOut((1+VkernelOffset):(size(imOut,1)-VkernelOffset), ...
(1+HkernelOffset):(size(imOut,2)-HkernelOffset), : );
elseif strcmp(outputSize, 'valid')
imOut = imOut((1+VkernelOffset*2):(size(imOut,1)-VkernelOffset*2), ...
(1+HkernelOffset*2):(size(imOut,2)-HkernelOffset*2), : );
end
end
I wrote another script which carries out the following two sets of commands on a 740x976 greyscale image and logs their processing time:
for n = 1:25
dim(n) = 6*n + 1;
h=fspecial('gaussian',dim(n), 4);
tic;
Im = FilterImage(I,h,0,'full');
tM(n) = toc;
h1 = fspecial('gaussian', [dim(n) 1], 4);
h2 = fspecial('gaussian', [1 dim(n)], 4);
tic;
It = FilterImage(I,h1,0,'full');
Is = FilterImage(It,h2,0,'full');
tS(n) = toc;
end
after plotting the respective time required i get the following result:
My problem is, Why is the separable method slower up to kernel matrices of size 49x49, and only shows improved speed from kernel sizes of 55x55 upwards, is something wrong with my image filter code?
p.s. the image filter code was designed for 3D images to take into account colour depth, however for the speed test i am using a greyscale image converted to double using im2double.
p.s.2 so as mentioned below, for comparison i carried out the same process using MATLAB's native conv2 function, and the results where as you'd expect, and also incredibly faster...
thanks
It seems like an optimization error.
I'd use the function conv2 instead.
Let's write a sample code:
mOutputImage = conv2((vFilterCoeff.' * vFilterCoeff), mInputImage);
mOutputImageSep = conv2(vFilterCoeff, vFilterCoeff.', mInputImage);
Try those in a loop where the length of vFilterCoeff (Row Vector!!!) is getting bigger.
update us what are the result now.
Related
Well basically the question says it all, my intuition tells me that a call to minmax should take less time than calling a min and then a max.
Is there some optimization I prevent Matlab carrying out in the following code?
minmax:
function minmax_vals = minmaxtest()
buffSize = 1000000;
A = rand(128,buffSize);
windowSize = 100;
minmax_vals = zeros(128,buffSize/windowSize*2);
for i=1:(buffSize/windowSize)
minmax_vals(:,(2*i-1):(2*i)) = minmax(A(:,((i-1)*windowSize+1):(i*windowSize)));
end
end
separate min-max:
function minmax_vals = minmaxtest()
buffSize = 1000000;
A = rand(128,buffSize);
windowSize = 100;
minmax_vals = zeros(128,buffSize/windowSize*2);
for i=1:(buffSize/windowSize)
minmax_vals(:,(2*i-1)) = min(A(:,((i-1)*windowSize+1):(i*windowSize)),[],2);
minmax_vals(:,(2*i)) = max(A(:,((i-1)*windowSize+1):(i*windowSize)),[],2);
end
end
Summary
You can see the overhead because minmax isn't completely obfuscated. Simply type
edit minmax
And you will see the function!
It appears that there is a data-type conversion to nntype.data('format',x,'Data');, which will not be the case for min or max and could be costly. This is for use with MATLAB's neural networking (nn) tools as minmax belongs to that toolbox.
In short, min and max are lower-level, compiled functions (hence they are fully obfuscated), which don't require functionality from the nn toolbox.
Benchmark
Here is a slightly more isolated benchmark, without your windowing and using timeit instead of the profiler. I've also included timings for just the data conversion used in minmax! The test gets the min and max of each row in a large matrix, see here the output plot and code below...
It appears that there is a linear relationship between number of rows and time taken (as expected for a linear operator), but the coefficient is much greater for the combined minmax relationship, with the separate operations being approximately 10x quicker. Also you can clearly see that data conversion takes more time that the min then max version alone!
function benchie()
K = zeros(10, 3);
for k = 1:10
n = 2^k;
A = rand(n, 200);
Arow = zeros(1,200);
m = zeros(n,2);
f1 = #()minmaxtest(A,m);
K(k,1) = timeit(f1);
f2 = #()minthenmaxtest(A,m);
K(k,2) = timeit(f2);
f3 = #()dataconversiontest(A, Arow);
K(k,3) = timeit(f3);
end
figure; hold on; plot(2.^(1:10), K(:,1)); plot(2.^(1:10), K(:,2)); plot(2.^(1:10), K(:,3));
end
function minmaxtest(A,m)
for ii = 1:size(A,1)
m(ii, 1:2) = minmax(A(ii,:));
end
end
function dataconversiontest(A, Arow)
for ii = 1:size(A,1)
Arow = nntype.data('format', A(ii,:), 'Data');;
end
end
function minthenmaxtest(A,m)
for ii = 1:size(A,1)
m(ii, 1) = min(A(ii,:));
m(ii, 2) = max(A(ii,:));
end
end
After having some basics understanding of GPML toolbox , I written my first code using these tools. I have a data matrix namely data consist of two array values of total size 1000. I want to use this matrix to estimate the GP value using GPML toolbox. I have written my code as follows :
x = data(1:200,1); %training inputs
Y = data(1:201,2); %, training targets
Ys = data(201:400,2);
Xs = data(201:400,1); %possibly test cases
covfunc = {#covSE, 3};
ell = 1/4; sf = 1;
hyp.cov = log([ell; sf]);
likfunc = #likGauss;
sn = 0.1;
hyp.lik = log(sn);
[ymu ys2 fmu fs2] = gp(hyp, #infExact, [], covfunc, likfunc,X,Y,Xs,Ys);
plot(Xs, fmu);
But when I am running this code getting error 'After having some basics understanding of GPML toolbox , I written my first code using these tools. I have a data matrix namely data consist of two array values of total size 1000. I want to use this matrix to estimate the GP value using GPML toolbox. I have written my code as follows :
x = data(1:200,1); %training inputs
Y = data(1:201,2); %, training targets
Ys = data(201:400,2);
Xs = data(201:400,1); %possibly test cases
covfunc = {#covSE, 3};
ell = 1/4; sf = 1;
hyp.cov = log([ell; sf]);
likfunc = #likGauss;
sn = 0.1;
hyp.lik = log(sn);
[ymu ys2 fmu fs2] = gp(hyp, #infExact, [], covfunc, likfunc,X,Y,Xs,Ys);
plot(Xs, fmu);
But when I am running this code getting:
Error using covMaha (line 58) Parameter mode is either 'eye', 'iso',
'ard', 'proj', 'fact', or 'vlen'
Please if possible help me to figure out where I am making mistake ?
I know this is way late, but I just ran into this myself. The way to fix it is to change
covfunc = {#covSE, 3};
to something like
covfunc = {#covSE, 'iso'};
It doesn't have to be 'iso', it can be any of the options listed in the error message. Just make sure your hyperparameters are set correctly for the specific mode you choose. This is detailed more in the covMaha.m file in GPML.
I'm trying to estimate the (unknown) original datapoints that went into calculating a (known) moving average. However, I do know some of the original datapoints, and I'm not sure how to use that information.
I am using the method given in the answers here: https://stats.stackexchange.com/questions/67907/extract-data-points-from-moving-average, but in MATLAB (my code below). This method works quite well for large numbers of data points (>1000), but less well with fewer data points, as you'd expect.
window = 3;
datapoints = 150;
data = 3*rand(1,datapoints)+50;
moving_averages = [];
for i = window:size(data,2)
moving_averages(i) = mean(data(i+1-window:i));
end
length = size(moving_averages,2)+(window-1);
a = (tril(ones(length,length),window-1) - tril(ones(length,length),-1))/window;
a = a(1:length-(window-1),:);
ai = pinv(a);
daily = mtimes(ai,moving_averages');
x = 1:size(data,2);
figure(1)
hold on
plot(x,data,'Color','b');
plot(x(window:end),moving_averages(window:end),'Linewidth',2,'Color','r');
plot(x,daily(window:end),'Color','g');
hold off
axis([0 size(x,2) min(daily(window:end))-1 max(daily(window:end))+1])
legend('original data','moving average','back-calculated')
Now, say I know a smattering of the original data points. I'm having trouble figuring how might I use that information to more accurately calculate the rest. Thank you for any assistance.
You should be able to calculate the original data exactly if you at any time can exactly determine one window's worth of data, i.e. in this case n-1 samples in a window of length n. (In your case) if you know A,B and (A+B+C)/3, you can solve now and know C. Now when you have (B+C+D)/3 (your moving average) you can exactly solve for D. Rinse and repeat. This logic works going backwards too.
Here is an example with the same idea:
% the actual vector of values
a = cumsum(rand(150,1) - 0.5);
% compute moving average
win = 3; % sliding window length
idx = hankel(1:win, win:numel(a));
m = mean(a(idx));
% coefficient matrix: m(i) = sum(a(i:i+win-1))/win
A = repmat([ones(1,win) zeros(1,numel(a)-win)], numel(a)-win+1, 1);
for i=2:size(A,1)
A(i,:) = circshift(A(i-1,:), [0 1]);
end
A = A / win;
% solve linear system
%x = A \ m(:);
x = pinv(A) * m(:);
% plot and compare
subplot(211), plot(1:numel(a),a, 1:numel(m),m)
legend({'original','moving average'})
title(sprintf('length = %d, window = %d',numel(a),win))
subplot(212), plot(1:numel(a),a, 1:numel(a),x)
legend({'original','reconstructed'})
title(sprintf('error = %f',norm(x(:)-a(:))))
You can see the reconstruction error is very small, even using the data sizes in your example (150 samples with a 3-samples moving average).
numberOfTrials = 10;
numberOfSizes = 6;
sizesArray = zeros(numberOfSizes, 1);
randomMAveragesArray = zeros(numberOfSizes, 1);
for i=1:numberOfSizes
N = 2^i;
%x=rand(N,1);
randomMTimesArray = zeros(numberOfTrials, 1);
for j=1:numberOfTrials
tic;
for k=1:N^3
x = .323452345e-999 * .98989898989889e-953;
end
randomMTimesArray(j) = toc;
end
sizesArray(i) = N;
end
randomMPolyfit = polyfit(log10(sizesArray), log10(randomMAveragesArray), 1);
randomMSlope = randomMPolyfit(1);
That is my Matlab script. I was originally timing a NxN random matrix using '\' to solve. The runtime on this is O(n^3). But my slope for the log graph was always about 1.8.
My understanding from this is that the timing results are O(n^k) where k is the slope from the log/log graph. So therefore the slope I should get should be around 3.
The code I posted above I have made an arbitrary loop that is N^3 with a floating point operation to test if this works.
However with the for loop I'm getting a slope of 2.5.
Why is this?
Since the O() behavior is asymptotic, you sometimes cannot see the behavior for small values of N. For example, if I set numberOfSizes = 9 and discard the first 3 points for the polynomial fit, the slope is much closer to 3:
randomMPolyfit = polyfit(log10(sizesArray(4:end)), log10(randomMAveragesArray(4:end)), 1);
randomMSlope = randomMPolyfit(1)
randomMSlope =
2.91911869082081
If you plot the timing array this behavior is clearer.
I have a set of three vectors (stored into a 3xN matrix) which are 'entangled' (e.g. some value in the second row should be in the third row and vice versa). This 'entanglement' is based on looking at the figure in which alpha2 is plotted. To separate the vector I use a difference based approach where I calculate the difference of one value with respect the three next values (e.g. comparing (1,i) with (:,i+1)). Then I take the minimum and store that. The method works to separate two of the three vectors, but not for the last.
I was wondering if you guys can share your ideas with me how to solve this problem (if possible). I have added my coded below.
Thanks in advance!
Problem in figures:
clear all; close all; clc;
%%
alpha2 = [-23.32 -23.05 -22.24 -20.91 -19.06 -16.70 -13.83 -10.49 -6.70;
-0.46 -0.33 0.19 2.38 5.44 9.36 14.15 19.80 26.32;
-1.58 -1.13 0.06 0.70 1.61 2.78 4.23 5.99 8.09];
%%% Original
figure()
hold on
plot(alpha2(1,:))
plot(alpha2(2,:))
plot(alpha2(3,:))
%%% Store start values
store1(1,1) = alpha2(1,1);
store2(1,1) = alpha2(2,1);
store3(1,1) = alpha2(3,1);
for i=1:size(alpha2,2)-1
for j=1:size(alpha2,1)
Alpha1(j,i) = abs(store1(1,i)-alpha2(j,i+1));
Alpha2(j,i) = abs(store2(1,i)-alpha2(j,i+1));
Alpha3(j,i) = abs(store3(1,i)-alpha2(j,i+1));
[~, I] = min(Alpha1(:,i));
store1(1,i+1) = alpha2(I,i+1);
[~, I] = min(Alpha2(:,i));
store2(1,i+1) = alpha2(I,i+1);
[~, I] = min(Alpha3(:,i));
store3(1,i+1) = alpha2(I,i+1);
end
end
%%% Plot to see if separation worked
figure()
hold on
plot(store1)
plot(store2)
plot(store3)
Solution using extrapolation via polyfit:
The idea is pretty simple: Iterate over all positions i and use polyfit to fit polynomials of degree d to the d+1 values from F(:,i-(d+1)) up to F(:,i). Use those polynomials to extrapolate the function values F(:,i+1). Then compute the permutation of the real values F(:,i+1) that fits those extrapolations best. This should work quite well, if there are only a few functions involved. There is certainly some room for improvement, but for your simple setting it should suffice.
function F = untangle(F, maxExtrapolationDegree)
%// UNTANGLE(F) untangles the functions F(i,:) via extrapolation.
if nargin<2
maxExtrapolationDegree = 4;
end
extrapolate = #(f) polyval(polyfit(1:length(f),f,length(f)-1),length(f)+1);
extrapolateAll = #(F) cellfun(extrapolate, num2cell(F,2));
fitCriterion = #(X,Y) norm(X(:)-Y(:),1);
nFuncs = size(F,1);
nPoints = size(F,2);
swaps = perms(1:nFuncs);
errorOfFit = zeros(1,size(swaps,1));
for i = 1:nPoints-1
nextValues = extrapolateAll(F(:,max(1,i-(maxExtrapolationDegree+1)):i));
for j = 1:size(swaps,1)
errorOfFit(j) = fitCriterion(nextValues, F(swaps(j,:),i+1));
end
[~,j_bestSwap] = min(errorOfFit);
F(:,i+1) = F(swaps(j_bestSwap,:),i+1);
end
Initial solution: (not that pretty - Skip this part)
This is a similar solution that tries to minimize the sum of the derivatives up to some degree of the vector valued function F = #(j) alpha2(:,j). It does so by stepping through the positions i and checks all possible permutations of the coordinates of i to get a minimal seminorm of the function F(1:i).
(I'm actually wondering right now if there is any canonical mathematical way to define the seminorm so we get our expected results... I initially was going for the H^1 and H^2 seminorms, but they didn't quite work...)
function F = untangle(F)
nFuncs = size(F,1);
nPoints = size(F,2);
seminorm = #(x,i) sum(sum(abs(diff(x(:,1:i),1,2)))) + ...
sum(sum(abs(diff(x(:,1:i),2,2)))) + ...
sum(sum(abs(diff(x(:,1:i),3,2)))) + ...
sum(sum(abs(diff(x(:,1:i),4,2))));
doSwap = #(x,swap,i) [x(:,1:i-1), x(swap,i:end)];
swaps = perms(1:nFuncs);
normOfSwap = zeros(1,size(swaps,1));
for i = 2:nPoints
for j = 1:size(swaps,1)
normOfSwap(j) = seminorm(doSwap(F,swaps(j,:),i),i);
end
[~,j_bestSwap] = min(normOfSwap);
F = doSwap(F,swaps(j_bestSwap,:),i);
end
Usage:
The command alpha2 = untangle(alpha2); will untangle your functions:
It should even work for more complicated data, like these shuffled sine-waves:
nPoints = 100;
nFuncs = 5;
t = linspace(0, 2*pi, nPoints);
F = bsxfun(#(a,b) sin(a*b), (1:nFuncs).', t);
for i = 1:nPoints
F(:,i) = F(randperm(nFuncs),i);
end
Remark: I guess if you already know that your functions will be quadratic or some other special form, RANSAC would be a better idea for larger number of functions. This could also be useful if the functions are not given with the same x-value spacing.