Stochastic gradient descent algorithm in MATLAB - matlab

I'm trying to implement stochastic gradient descent in MATLAB, but I'm going wrong somewhere. I think that maybe the way I am checking for convergence is incorrect (I wasn't quite sure how to update the estimator with each iteration), but I'm not sure. I've been trying just to fit basic linear data, but I'm getting results that are pretty far off and I'm hoping to get some help. Would someone be able to point out where I'm going wrong, and why this isn't working correctly?
Thanks!
Here is the data set up and general code:
clear all;
close all;
clc
N_features = 2;
d = 100;
m = 100;
X_train = 10*rand(d,1);
X_test = 10*rand(d,1);
X_train = [ones(d,1) X_train];
X_test = [ones(d,1) X_test];
y_train = 5 + X_train(:,2) + 0.5*randn(d,1);
y_test = 5 + X_test(:,2) + 0.5*randn(d,1);
gamma = 0.01; %learning rate
[sgd_est_train,sgd_est_test,SSE_train,SSE_test,w] = stoch_grad(d,m,N_features,X_train,y_train,X_test,y_test,gamma);
figure(1)
plot(X_train(:,2),sgd_est_train,'ro',X_train(:,2),y_train,'go')
figure(2)
plot(X_test(:,2),sgd_est_test,'bo',X_test(:,2),y_test,'go')
and the function that actually implements the SGD is:
% stochastic gradient descent
function [sgd_est_train,sgd_est_test,SSE_train,SSE_test,w] = stoch_grad(d,m,N_features,X_train,y_train,X_test,y_test,gamma)
epsilon = 0.01; %convergence criterion
max_iter = 10000;
w0 = zeros(N_features,1); %initial guess
w = zeros(N_features,1); %for convenience
x = zeros(d,1);
z = zeros(d,1);
for jj=1:max_iter;
for kk=1:d;
x = X_train(kk,:)';
z = gamma*((w0'*x-y_train(kk))*x);
w = w0 - z;
end
if norm(w0-w,2)<epsilon
break;
else
w0 = w;
end
end
sgd_est_test = zeros(m,1);
sgd_est_train = zeros(d,1);
for ll=1:m;
sgd_est_test(ll,1) = w'*X_test(ll,:)';
end
for ii=1:d;
sgd_est_train(ii,1) = w'*X_train(ii,:)';
end
SSE_test = sum((sgd_est_test - y_test).^2);
SSE_train = sum((sgd_est_train - y_train).^2);
end

I tried lowering the learning rate at 0.001 and resulted to this:
Which tells me that your algorithm produces an estimation of the form y=ax instead of y=ax + b (for some reason ignores the constant term) and also you need to lower the learning rate in order to converge.

Related

Solving Coupled partial differential equations of stiff nature using MATLAB

I want to solve coupled partial differential equations of first order, which are of stiff nature. I have coded in MATLAB to solve this pde's, I have used Method of line to convert PDE into ODE, and i have used beam and warmings(second order upwind) method to discritize the spatial derivative. The discretization method is total variation diminishing(TVD) to eliminate the oscillation. But rather using TVD and ode15s solver to integrate resultant stiff ode's the resultant plot is oscillatory(not smooth). What should i do to eliminate this oscillation and get correct results.
I have attached my MATLAB code.. please see it and suggest some improvement.
∂y(1)/∂t=-0.1 ∂y(1)/∂x + (0.5*e^(15*(y(2)⁄(1+y(2))))*(1- y(1))
∂y(2)/∂t=-0.1 ∂y(2)/∂x - (0.4*e^(15*(y(2)⁄(1+y(2))))*(1- y(1))-0.4
Initial condition: at t = 0 y(1)= y(2)=0
Boundary condition: y(1)= y(2) = 0 at x=0
I have attached my MATLAB code.. please see it and suggest some improvement.
function brussode(N)
if nargin<1
N = 149;
end
tspan = [0 10];
m = 0.00035
t = (1:N)/(N+1)*m;
y0 = [repmat(0,1,N); repmat(0,1,N)];
p = 0.5
q = 0.4
options = odeset('Vectorized','on','JPattern',jpattern(N));
[t,y] = ode15s(#f,tspan,y0,options);
a = size(y,2)
u = y(:,1:2:end);
x = (1:N)/(N+1);
figure;
%surf(x,t(end,:),u);
plot(x,u(end,:))
xlabel('space');
ylabel('solution');
zlabel('solution u');
%--------------------------------------------------------------
%Nested function -- N is provided by the outer function.
%
function dydt = f(t,y)
%Derivative function
dydt = zeros(2*N,size(y,2)); %preallocate dy/dt
x = (1:N)/(N+1);
% Evaluate the 2 components of the function at one edge of the grid
% (with edge conditions).
i = 1;
%y(1,:) = 0;
%y(2,:) = 0;
dydt(i,:) = -0.1*(N+1)*(y(i+2,:)-0)+ (0.01/2)*m*((N+1).^3)*(y(i+2,:)-0) + p*exp(15*(0/(1+0)))*(1-0);
dydt(i+1,:) = -0.1*(N+1)*(y(i+3,:)-0)+ (0.01/2)*m*((N+1).^3)*(y(i+3,:)-0) - q*exp(15*(0/(1+0)))*(1-0)+0.25;
i = 3;
%y(1,:) = 0;
%y(2,:) = 0;
dydt(i,:) = -0.1*(N+1)*(y(i+2,:)-y(i,:)) + (0.01/2)*m*((N+1).^3)*(y(i+3,:)-y(i,:)) + p*exp(15*(y(i+1,:)/(1+y(i+1,:))))*(1-y(i,:));
dydt(i+1,:) = -0.1*(N+1)*(y(i+3,:)-y(i+1,:)) + (0.01/2)*m*((N+1).^3)*(y(i+3,:)-y(i,:)) - q*exp(15*(y(i+1,:)/(1+y(i+1,:))))*(1-y(i,:))+0.25;
%Evaluate the 2 components of the function at all interior grid
%points.
i = 5:2:2*N;
%y(1,:) = 0;
% y(2,:) = 0;
dydt(i,:) = (-0.1/2)*(N+1)*(3*y(i,:)-4*y(i-2,:)+y(i-4,:)) +(0.01/2)*m*((N+1).^3)*(y(i,:)-2*y(i-2,:)+y(i-4,:))+ p*exp(15*(y(i+1,:)/(1+y(i+1,:))))*(1-y(i,:));
dydt(i+1,:) = (-0.1/2)*(N+1)*(3*y(i+1,:)-4*y(i-1,:)+y(i-3,:))+(0.01/2)*m*((N+1).^3)*(y(i+1,:)-2*y(i-1,:)+y(i-3,:)) - q*exp(15*(y(i+1,:)/(1+y(i+1,:))))*(1-y(i,:))+0.25;
end
%-------------------------------------------------------------
end %brussode
%-------------------------------------------------------------
% Subfunction -- the sparsity pattern
%
function S = jpattern(N)
% Jacobian sparsity patter
B = ones(2*N,5);
B(2:2:2*N,2) = zeros(N,1);
B(1:2:2*N-1,4) = zeros(N,1);
S = spdiags(B,-2:2,2*N,2*N);
end
%-------------------------------------------------------------

solving a simple advection equation (1D)

I am trying to solve a 1D advection equation in Matlab as described in this paper, equations (55)-(57). I am making use of the central difference in equaton (59).
I would ultimately like to get something like figure (2) in the paper, which is the result of solving the advection equation for an advection velocity e(1-k) with k=1, i.e. a stationary wave. However, my code keeps diverging. Here is what I have so far:
%initial parameters
e = 1.0;
k = 0.5;
N = 120;
lx = 120;
%initialization of sine
for x=1:lx
if(x<3*lx/4+1 && x>lx/4+1)
phi(x) = sin(2*pi*(x-1-lx/4)/lx);
else
phi(x) = 0.0;
end
end
%advection loop
for t=1:N
gradPhi = 0.5*(+circshift(phi, [0,-1]) - circshift(phi, [0,+1]));
phiBar = phi + 0.5*k*e*gradPhi;
phiOutbar = circshift(phiBar, [0,-1]);
gradPhi = 0.5*(+circshift(phiOutbar, [0,-1]) - circshift(phiOutbar, [0,+1]));
phi = phiOutbar + 0.5*k*e*gradPhi;
end
plot(phi)
Where is the error in my simple code?
You haven't included the time step dt in your update equation. The term in equation (55) says k * e * dt / 2.
This is making your update really unstable and leading to divergence. For stability, you need CFL of 1, and yours is currently around 120. Try updating your code this way:
dt = 1/120;
%advection loop
for t=1:N/dt
gradPhi = 0.5*(+circshift(phi, [0,-1]) - circshift(phi, [0,+1]));
phiBar = phi + 0.5*dt*k*e*gradPhi;
phiOutbar = circshift(phiBar, [0,-1]);
gradPhi = 0.5*(+circshift(phiOutbar, [0,-1]) - circshift(phiOutbar, [0,+1]));
phi = phiOutbar + 0.5*dt*k*e*gradPhi;
end

Interpolation using polyfit (Matlab)

My script is supposed to run Runge-Kutta and then interpolate around the tops using polyfit to calculate the max values of the tops. I seem to get the x-values of the max points correct but the y-values are off for some reason. Have sat with it for 3 days now. The problem should be In the last for-loop when I calculate py?
Function:
function funk = FU(t,u)
L0 = 1;
C = 1*10^-6;
funk = [u(2); 2.*u(1).*u(2).^2./(1+u(1).^2) - u(1).*(1+u(1).^2)./(L0.*C)];
Program:
%Runge kutta
clear all
close all
clc
clf
%Given values
U0 = [240 1200 2400];
L0 = 1;
C = 1*10^-6;
T = 0.003;
h = 0.000001;
W = [];
% Runge-Kutta 4
for i = 1:3
u0 = [0;U0(i)];
u = u0;
U = u;
tt = 0:h:T;
for t=tt(1:end-1)
k1 = FU(t,u);
k2 = FU(t+0.5*h,u+0.5*h*k1);
k3 = FU((t+0.5*h),(u+0.5*h*k2));
k4 = FU((t+h),(u+k3*h));
u = u + (1/6)*(k1+2*k2+2*k3+k4)*h;
U = [U u];
end
W = [W;U];
end
I1 = W(1,:); I2 = W(3,:); I3 = W(5,:);
dI1 = W(2,:); dI2 = W(4,:); dI3 = W(6,:);
I = [I1; I2; I3];
dI = [dI1; dI2; dI3];
%Plot of the currents
figure (1)
plot(tt,I1,'r',tt,I2,'b',tt,I3,'g')
hold on
legend('U0 = 240','U0 = 1200','U0 = 2400')
BB = [];
d = 2;
px = [];
py = [];
format short
for l = 1:3
[H,Index(l)]=max(I(l,:));
Area=[(Index(l)-2:Index(l)+2)*h];
p = polyfit(Area,I(Index(l)-2:Index(l)+2),4);
rotp(1,:) = roots([4*p(1),3*p(2),2*p(3),p(4)]);
B = rotp(1,2);
BB = [BB B];
Imax(l,:)=p(1).*B.^4+p(2).*B.^3+p(3).*B.^2+p(4).*B+p(5);
Tsv(i)=4*rotp(1,l);
%px1 = linspace(h*(Index(l)-d-1),h*(Index(l)+d-2));
px1 = BB;
py1 = polyval(p,px1(1,l));
px = [px px1];
py = [py py1];
end
% Plots the max points
figure(1)
plot(px1(1),py(1),'b*-',px1(2),py(2),'b*-',px1(3),py(3),'b*-')
hold on
disp(Imax)
Your polyfit line should read:
p = polyfit(Area,I(l, Index(l)-2:Index(l)+2),4);
More interestingly, take note of the warnings you get about poor conditioning of that polynomial (I presume you're seeing these). Why? Partly because of numerical precision (your numbers are very small, scaled around 10^-6) and partly because you're asking for a 4th-order fit to five points (which is singular). To do this "better", use more input points (more than 5), or a lower-order polynomial fit (quadratic is usually plenty), and (probably) rescale before you use the polyfit tool.
Having said that, in practice this problem is often solved using three points and a quadratic fit, because it's computationally cheap and gives very nearly the same answers as more complex approaches, but you didn't get that from me (with noiseless data like this, it doesn't much matter anyway).

Finding instantaneous frequency

I want to find instantaneous frequency over a window of a signal. I am taking a small part of the signal and trying to find the instantaneous frequency in that window. But that frequency is not matching with the actual frequency of the signal. Below is my code.
close all; clear all; clc
fs = 25000;
T = 0.5;
t = 0:1/fs:T;
n = length(t);
m = floor(n/4);
f1 = 100;
z1 = cos(2*pi*f1*t(1:m));
f1 = 200;
z2 = cos(2*pi*f1*t(1:m));
f1 = 300;
z3 = cos(2*pi*f1*t(1:m));
f1 = 400;
z4 = cos(2*pi*f1*t(1:(n-3*m)));
z = [z1 z2 z3 z4];
window = 100;
wStart = 1;
wEnd = wStart + window;
freqs = [];
while wEnd < length(z)
x = z(wStart:wEnd);
y = t(wStart:wEnd);
h=hilbert(x);
unrolled_phase = unwrap(angle(h));
dx = diff(unrolled_phase);
dy = diff(y);
p = dx./dy;
inst_freq = p/(2*pi) + 2*pi;
freqs = [freqs inst_freq];
wStart = wEnd;
wEnd = wStart + window;
end
The plot of 'freqs' is this:
I think there is something wrong with the way I am calculating the frequencies but I am not sure. Can anyone please help? I need to get the frequencies only as 100,200,300 and 400 as evident from the code. All I want to do is find the beginning point(time) of each frequency
The first, theoretical answer, is that a signal is really a combination of several frequencies. Also, if you are decomposing a signal into ranges of 100 Hz with bins of 100Hz, 200Hz, 300Hz, etc. then you will NOT be able to recreate the signal into exactly the same signal that was transformed. That is, an inverse transform can only use 4 freqs, and will combine in a way such that the input signal and recovered signal do NOT sound the same!

MATLAB: One Step Ahead Neural Network Timeseries Forecast

Intro: I'm using MATLAB's Neural Network Toolbox in an attempt to forecast time series one step into the future. Currently I'm just trying to forecast a simple sinusoidal function, but hopefully I will be able to move on to something a bit more complex after I obtain satisfactory results.
Problem: Everything seems to work fine, however the predicted forecast tends to be lagged by one period. Neural network forecasting isn't much use if it just outputs the series delayed by one unit of time, right?
Code:
t = -50:0.2:100;
noise = rand(1,length(t));
y = sin(t)+1/2*sin(t+pi/3);
split = floor(0.9*length(t));
forperiod = length(t)-split;
numinputs = 5;
forecasted = [];
msg = '';
for j = 1:forperiod
fprintf(repmat('\b',1,numel(msg)));
msg = sprintf('forecasting iteration %g/%g...\n',j,forperiod);
fprintf('%s',msg);
estdata = y(1:split+j-1);
estdatalen = size(estdata,2);
signal = estdata;
last = signal(end);
[signal,low,high] = preprocess(signal'); % pre-process
signal = signal';
inputs = signal(rowshiftmat(length(signal),numinputs));
targets = signal(numinputs+1:end);
%% NARNET METHOD
feedbackDelays = 1:4;
hiddenLayerSize = 10;
net = narnet(feedbackDelays,[hiddenLayerSize hiddenLayerSize]);
net.inputs{1}.processFcns = {'removeconstantrows','mapminmax'};
signalcells = mat2cell(signal,[1],ones(1,length(signal)));
[inputs,inputStates,layerStates,targets] = preparets(net,{},{},signalcells);
net.trainParam.showWindow = false;
net.trainparam.showCommandLine = false;
net.trainFcn = 'trainlm'; % Levenberg-Marquardt
net.performFcn = 'mse'; % Mean squared error
[net,tr] = train(net,inputs,targets,inputStates,layerStates);
next = net(inputs(end),inputStates,layerStates);
next = postprocess(next{1}, low, high); % post-process
next = (next+1)*last;
forecasted = [forecasted next];
end
figure(1);
plot(1:forperiod, forecasted, 'b', 1:forperiod, y(end-forperiod+1:end), 'r');
grid on;
Note:
The function 'preprocess' simply converts the data into logged % differences and 'postprocess' converts the logged % differences back for plotting. (Check EDIT for preprocess and postprocess code)
Results:
BLUE: Forecasted Values
RED: Actual Values
Can anyone tell me what I'm doing wrong here? Or perhaps recommend another method to achieve the desired results (lagless prediction of sinusoidal function, and eventually more chaotic timeseries)? Your help is very much appreciated.
EDIT:
It's been a few days now and I hope everyone has enjoyed their weekend. Since no solutions have emerged I've decided to post the code for the helper functions 'postprocess.m', 'preprocess.m', and their helper function 'normalize.m'. Maybe this will help get the ball rollin.
postprocess.m:
function data = postprocess(x, low, high)
% denormalize
logdata = (x+1)/2*(high-low)+low;
% inverse log data
sign = logdata./abs(logdata);
data = sign.*(exp(abs(logdata))-1);
end
preprocess.m:
function [y, low, high] = preprocess(x)
% differencing
diffs = diff(x);
% calc % changes
chngs = diffs./x(1:end-1,:);
% log data
sign = chngs./abs(chngs);
logdata = sign.*log(abs(chngs)+1);
% normalize logrets
high = max(max(logdata));
low = min(min(logdata));
y=[];
for i = 1:size(logdata,2)
y = [y normalize(logdata(:,i), -1, 1)];
end
end
normalize.m:
function Y = normalize(X,low,high)
%NORMALIZE Linear normalization of X between low and high values.
if length(X) <= 1
error('Length of X input vector must be greater than 1.');
end
mi = min(X);
ma = max(X);
Y = (X-mi)/(ma-mi)*(high-low)+low;
end
I didn't check you code, but made a similar test to predict sin() with NN. The result seems reasonable, without a lag. I think, your bug is somewhere in synchronization of predicted values with actual values.
Here is the code:
%% init & params
t = (-50 : 0.2 : 100)';
y = sin(t) + 0.5 * sin(t + pi / 3);
sigma = 0.2;
n_lags = 12;
hidden_layer_size = 15;
%% create net
net = fitnet(hidden_layer_size);
%% train
noise = sigma * randn(size(t));
y_train = y + noise;
out = circshift(y_train, -1);
out(end) = nan;
in = lagged_input(y_train, n_lags);
net = train(net, in', out');
%% test
noise = sigma * randn(size(t)); % new noise
y_test = y + noise;
in_test = lagged_input(y_test, n_lags);
out_test = net(in_test')';
y_test_predicted = circshift(out_test, 1); % sync with actual value
y_test_predicted(1) = nan;
%% plot
figure,
plot(t, [y, y_test, y_test_predicted], 'linewidth', 2);
grid minor; legend('orig', 'noised', 'predicted')
and the lagged_input() function:
function in = lagged_input(in, n_lags)
for k = 2 : n_lags
in = cat(2, in, circshift(in(:, end), 1));
in(1, k) = nan;
end
end