Gradient descent for linear regression takes too long to converge

Gradient descent for linear regression takes too long to converge - matlab

I began to study machine learning and stuck on one issue.
My implementation of this method (both in MATLAB and C++) converge in 1 500 000 iterations, and I can not understand why. I found the method implementation in Python, and the algorithm converged in 2000 iterations. Under converged I understand that he gave almost the same answer as obviously the correct method.
Preliminary data are not processed in any way. Can you explain me please, is this normal number of iterations, or I just made a mistake in the algorithm?
The cost function used and its partial derivatives:
MATLAB code
%y=t0+t1*x
learningRate = 0.0001;
curT0 = 0;
curT1 = 0;
i = 1;
while (i < 1500000)
derT0 = 0;
derT1 = 0;
for j=1:1:N
derT0 = derT0 + (-1/N)*(Y(j) - (curT1*X(j) + curT0));
derT1 = derT1 + (-1/N)*X(j)*(Y(j) - (curT1*X(j) + curT0));
end
curT0 = curT0 - (learningRate*derT0);
curT1 = curT1 - (learningRate*derT1);
%sprintf('Iteration %d, t0=%f, t1=%f',i,curT0,curT1)
i = i+1;
end
P.S. I tried to increase the "learning Rate" variable but in this case, the algorithm diverges and comes in huge numbers.

The gradient descent depends on three things:
the initial solution [curT0,curT1] (the starting point from which to begin the search)
the learning rate (how big a step to take in the direction of the gradient)
the number of iterations
If you start far off and make small steps it will take you many iterations to reach the solution. If the steps are too big, you can miss the solution by stepping over it. Also the search could get stuck in local minima depending on where you started from in the search space..
Also you can specify other stopping criteria like a tolerance (a threshold which if crossed, stops the iterations). Your current code will always loop the maximum number of iterations (1500000).

Related

Optimize matlab for loop for big data

I want to calculate the Euclidean distance between two images using the Hyperbolic Tangent (Sigmoid) kernel. Please follow this link where I have discussed the same problem using Gaussian Kernel in detail.
If x=(i,j) & y=(i1,j1) are any two pixels in our image then for hyperbolic tangent kernel, my H(x,y) will be defined as:
H(i,j) = tanh(alpha*(x'*y) + c)
where alpha and c are parameters and x' is the transpose of x. Parameter alpha can be taken as 1/N where N is my image dimension(8192 x 200 in my case) and c can take any value according to the problem. More detailed description about Hyperbolic Tangent kernel can be found here.
To achieve my goal & keeping the running time under consideration, I have written the below MATLAB script.
gray1=zeros(8192,200);
gray2=zeros(8192,200);
s1 = 8192;
s2 = 200;
alpha = s1*s2;
perms = combvec(1:s2,1:s1);
perms = [perms(2,:);perms(1,:)]';
perms1 = perms;
gray1(4096,100) = 10;
gray2(10,100) = 10;
img_diff = gray1 - gray2;
display('Calculation of Sigmoid Kernel started');
for i = 1:length(perms1)
kernel = sum(bsxfun(#times,perms,perms1(i,:))');
kernel1 = tanh((1/alpha)*kernel + 1)';
g_temp(i) = img_diff(:)'*kernel1;
end
temp = g_temp*img_diff(:);
ans = sqrt(temp);
In spite of my all efforts I couldn't vectorize it further so as to decrease its running cost. Currently, it is taking around 29 hours to complete which is too much for me as I want to run it for various different images. I want to give it a completely vectorized form using intrinsic MATLAB functions as it was done by #dan-man in the case of Gaussian Kernel. With his help the Gaussian Version was taking 1-2 secs to complete. I tried my best to use the same conv2fft function in this case also but it seems difficult to find a way to achieve that.
Can someone please help me to remove that one extra for loop so as to get the running cost of algorithm in the same proportion as that of the Gaussian version of same problem.
Thanks in advance.

Get rid of the nasty loop with matrix-multiplication -
g_temp = img_diff(:).'*tanh((1/alpha)*(perms*perms.')+1)

With my times in my PC for just 50 iterations, the code takes 2.07s
Just changing the bsxfun line to
kernel = sum(bsxfun(#times,perms,perms1(i,:)),2)';
as the warning suggests you can get it to 1.65s
If you use the Neural Network toolbox and substitute tanh by tansig , the time goes to 1.44s
If you write your own tanhas
kernel1= (2./(1+exp(-2.*((1/alpha)*kernel + 1)))-1)';
the time goes to 1.28s
Just these changes would mean improvement from 29h to 18h
And remember to preallocate!
g_temp=zeros(length(perms1),1);

Optimizing Fourier Series Fitting Function Matlab

I am trying to iterate through a set of samples that seems to show periodic changes. I need continuously apply the fit function to get the fourier series coefficients, the regression has to be n samples in the past (in my case, around 30). The problem is, my code is extremely slow! It will take like 1 hour to do this for a set of 50,000 samples. Is there any way to optimize this? What am I doing wrong?
Here's my code:
function[coefnames,coef] = fourier_regression(vect_waves,n)
j = 1;
coef = zeros(length(vect_waves)-n,10);
for i=n+1:length(vect_waves)
take_fourier = vect_waves(i-n+1:i);
x = 1:n;
f = fit(x,take_fourier,'fourier4');
current_coef = coeffvalues(f);
coef(j,1:length(current_coef)) = current_coef;
j = j + 1;
end
coefnames = coeffnames(f);
end
When I call [coefnames,coef] = fourier_regression(VECTOR,30); This takes forever to compute. Is there any way to fix it? What's wrong with my code?
Note: I have a intel i7 5500 U cpu, 16GB RAM, and using Matlab 2015a.

As I am not familiar with your application, I am not sure whether it is possible to vectorize the code to improve performance. However, I have a couple of other tips.
One thing you should consider is preallocation of arrays. In this case, you should preallocate at least the array coef since I believe you do know its size before starting the loop.
Another thing I suggest is to profile your code. This will provide information on what parts of your code are consuming the most time, helping you focus your effort on improving those parts' performance.

Iteration for convergence in Matlab without using a while loop

I have to iterate a process where I have an initial guess for the Mach number (M0). This initial guess will give me another guess for the Mach number by using two equations (Mn). Eventually, i want to iterate this process untill the error between M0 and Mn is small. I have the following piece of code and it actually works well with a while loop.
However, I am afraid that the while loop will take many iterations and computational time for certain inputs since this will be part of a bigger code which most likely will give unfeasible inputs for the while loop.
Therefore my question is the following. How can I iterate this process within Matlab without consulting a while loop? The code that I am implementing now is the following:
%% Input
gamma = 1.4;
theta = atan(0.315);
cpi = -0.732;
%% Loop
M0 = 0.2; %initial guess
Err = 100;
iterations = 0;
while Err > 0.5E-3
B = (1-(M0^2)*(1-M0*cpi))^0.5;
Mn = (((gamma+1)/2) * ((B+((1-cpi)^0.5)*sec(theta)-1)^2/(B^2 + (tan(theta))^2)) - ((gamma-1)/2) )^-0.5;
Err = abs(M0 - Mn);
M0 = Mn;
iterations=iterations+1;
end
disp(iterations) disp(Mn)
Many thanks

Since M0 is calculated in each iteration and you have trigonometric functions, you cannot use another way than iteration structures (i.e. while).
If you had a specific increase or decrease at M0, then you could initialize a vector of M0 and do vector calculations for B and Err.
But, with sec and tan this is not possible.
Another wat would be to use the parallel processing. But, since you change the M0 at each iteration then you cannot use the parfor loop.
As for a for loop, in MATLAB you need an array for for "command" argument (e.g. 1:10 or 1:length(x) or i = A, where A = 1:10 or A = [1:10;11:20]). Since you evaluate a condition and depending on the result of the evaluation you judge if you continue the execution or not, it seems that the while loop (or do while in another language) is the only way to go.

I think you need to clarify the issue. If it the issue you want to solve is that some inputs take a long time to calculate, it is not the while loop that takes the time, it is the execution of the code multiple times that causes it. Any method that loops through will be restricted by the time the block of code takes to execute multiplied by the number of iterations required to converge.
You can introduce something to stop at a certain number of iterationtions, conceptually:
While ((err > tolerance) && (numIterations < limit))
If you want an answer which does not require iterating over the code, this is akin to finding a closed form solution, and I suspect this does not exist.
Edit to add: by not exist I mean in a practical form which can be implemented in a more efficient way then iterating to a solution.

K-means Stopping Criteria in Matlab?

Im implementing the k-means algorithm on matlab without using the k-means built-in function, The stopping criteria is when the new centroids doesn't change by new iterations, but i cannot implement it in matlab , can anybody help?
Thanks

Setting no change as a stopping criteria is a bad idea. There are a few main reasons you shouldn't use a 0 change condition
even for a well behaved function the difference between 0 change and a very small change (say 1e-5 perhaps)could be 1000+ iterations, so you are wasting time trying to get them to be exactly the same. Especially because computers usually keep far more digits than we are interested in. IF you only need 1 digit accuracy, why wait for the computer to find an answer within 1e-31?
computers have floating point errors everywhere. Try doing some easily reversible matrix operations like a = rand(3,3); b = a*a*inv(a); a-b theoretically this should be 0 but you will see it isn't. So these errors alone could prevent your program from ever stopping
dithering. lets say we have a 1d k means problem with 3 numbers and we want to split them into 2 groups. One iteration the grouping can be a,b vs c. the next iteration could be a vs b,c the next could be a,b vs c the next.... This is of course a simplified example, but there can be instances where a few data points can dither between clusters, and you will end up with a never ending algorithm. Since those few points are reassigned, the change will never be 0
the solution is to use a delta threshold. basically you subtract the current values from the previous and if they are less than a threshold you are done. This on its own is powerful, but as with any loop, you need a backup escape plan. And that is setting a max_iterations variable. Look at matlabs documentation for kmeans, even they have a MaxIter variable (default is 100) so even if your kmeans doesn't converge, at least it wont run endlessly. Something like this might work
%problem specific
max_iter = 100;
%choose a small number appropriate to your problem
thresh = 1e-3;
%ensures it runs the first time
delta_mu = thresh + 1;
num_iter = 0;
%do your kmeans in the loop
while (delta_mu > thresh && num_iter < max_iter)
%save these right away
old_mu = curr_mu;
%calculate new means and variances, this is the standard kmeans iteration
%then store the values in a variable called curr_mu
curr_mu = newly_calculate_values;
%use the two norm to find the delta as a single number. no matter what
%the original dimensionality of mu was. If old_mu -new_mu was
% 0 the norm is still 0. so it behaves well as a distance measure.
delta_mu = norm(old_mu - curr_mu,2);
num_ter = num_iter + 1;
end
edit
if you don't know the 2 norm is essentially the euclidean distance

vectorizing loops in Matlab - performance issues

This question is related to these two:
Introduction to vectorizing in MATLAB - any good tutorials?
filter that uses elements from two arrays at the same time
Basing on the tutorials I read, I was trying to vectorize some procedure that takes really a lot of time.
I've rewritten this:
function B = bfltGray(A,w,sigma_r)
dim = size(A);
B = zeros(dim);
for i = 1:dim(1)
for j = 1:dim(2)
% Extract local region.
iMin = max(i-w,1);
iMax = min(i+w,dim(1));
jMin = max(j-w,1);
jMax = min(j+w,dim(2));
I = A(iMin:iMax,jMin:jMax);
% Compute Gaussian intensity weights.
F = exp(-0.5*(abs(I-A(i,j))/sigma_r).^2);
B(i,j) = sum(F(:).*I(:))/sum(F(:));
end
end
into this:
function B = rngVect(A, w, sigma)
W = 2*w+1;
I = padarray(A, [w,w],'symmetric');
I = im2col(I, [W,W]);
H = exp(-0.5*(abs(I-repmat(A(:)', size(I,1),1))/sigma).^2);
B = reshape(sum(H.*I,1)./sum(H,1), size(A, 1), []);
Where
A is a matrix 512x512
w is half of the window size, usually equal 5
sigma is a parameter in range [0 1] (usually one of: 0.1, 0.2 or 0.3)
So the I matrix would have 512x512x121 = 31719424 elements
But this version seems to be as slow as the first one, but in addition it uses a lot of memory and sometimes causes memory problems.
I suppose I've made something wrong. Probably some logic mistake regarding vectorizing. Well, in fact I'm not surprised - this method creates really big matrices and probably the computations are proportionally longer.
I have also tried to write it using nlfilter (similar to the second solution given by Jonas) but it seems to be hard since I use Matlab 6.5 (R13) (there are no sophisticated function handles available).
So once again, I'm asking not for ready solution, but for some ideas that would help me to solve this in reasonable time. Maybe you will point me what I did wrong.
Edit:
As Mikhail suggested, the results of profiling are as follows:
65% of time was spent in the line H= exp(...)
25% of time was used by im2col

How big are I and H (i.e. numel(I)*8 bytes)? If you start paging, then the performance of your second solution is going to be affected very badly.
To test whether you really have a problem due to too large arrays, you can try and measure the speed of the calculation using tic and toc for arrays A of increasing size. If the execution time increases faster than by the square of the size of A, or if the execution time jumps at some size of A, you can try and split the padded I into a number of sub-arrays and perform the calculations like that.
Otherwise, I don't see any obvious places where you could be losing lots of time. Well, maybe you could skip the reshape, by replacing B with A in your function (saves a little memory as well), and writing
A(:) = sum(H.*I,1)./sum(H,1);
You may also want to look into upgrading to a more recent version of Matlab - they've worked hard on improving performance.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Gradient descent for linear regression takes too long to converge - matlab

Related

Optimize matlab for loop for big data

Optimizing Fourier Series Fitting Function Matlab

Iteration for convergence in Matlab without using a while loop

K-means Stopping Criteria in Matlab?

vectorizing loops in Matlab - performance issues

Categories

Resources