Time Series Forecasting Using Deep Learning in MATLAB

I am using the time series forecasting sample from MathWorks at https://uk.mathworks.com/help/nnet/examples/time-series-forecasting-using-deep-learning.html
The output at the above-mentioned web address is: [forecast plot from the example]
I only changed the dataset and ran the algorithm. Surprisingly, the algorithm does not work well with my dataset and generates a straight line as the forecast, as follows: [flat forecast plot]
I am really confused and cannot understand the reason behind this. I may need to tune parameters in the algorithm that I am not aware of. The code I am using is:
%% Load Data
%data = chickenpox_dataset;
%data = [data{:}];
data = xlsread('data.xlsx');
data = data';
%% Divide Data: Training and Testing
numTimeStepsTrain = floor(0.7*numel(data));
XTrain = data(1:numTimeStepsTrain);
YTrain = data(2:numTimeStepsTrain+1);
XTest = data(numTimeStepsTrain+1:end-1);
YTest = data(numTimeStepsTrain+2:end);
%% Standardize Data
mu = mean(XTrain);
sig = std(XTrain);
XTrain = (XTrain - mu) / sig;
YTrain = (YTrain - mu) / sig;
XTest = (XTest - mu) / sig;
%% Define LSTM Network
inputSize = 1;
numResponses = 1;
numHiddenUnits = 500;
layers = [ ...
    sequenceInputLayer(inputSize)
    lstmLayer(numHiddenUnits)
    fullyConnectedLayer(numResponses)
    regressionLayer];
%% Training Options
opts = trainingOptions('adam', ...
    'MaxEpochs',500, ...
    'GradientThreshold',1, ...
    'InitialLearnRate',0.005, ...
    'LearnRateSchedule','piecewise', ...
    'LearnRateDropPeriod',125, ...
    'LearnRateDropFactor',0.2, ...
    'Verbose',0, ...
    'Plots','training-progress');
%% Train Network
net = trainNetwork(XTrain,YTrain,layers,opts);
%% Forecast Future Time Steps
net = predictAndUpdateState(net,XTrain);
[net,YPred] = predictAndUpdateState(net,YTrain(end));
numTimeStepsTest = numel(XTest);
for i = 2:numTimeStepsTest
    [net,YPred(1,i)] = predictAndUpdateState(net,YPred(i-1));
end
%% Unstandardize the predictions using mu and sig calculated earlier.
YPred = sig*YPred + mu;
%% RMSE and MAE Calculation
rmse = sqrt(mean((YPred-YTest).^2))
MAE = mae(YPred-YTest)
%% Plot results
figure
plot(data(1:numTimeStepsTrain))
hold on
idx = numTimeStepsTrain:(numTimeStepsTrain+numTimeStepsTest);
plot(idx,[data(numTimeStepsTrain) YPred],'.-')
hold off
xlabel("Month")
ylabel("Cases")
title("Forecast")
legend(["Observed" "Forecast"])
%% Compare the forecasted values with the test data
figure
subplot(2,1,1)
plot(YTest)
hold on
plot(YPred,'.-')
hold off
legend(["Observed" "Forecast"])
ylabel("Cases")
title("Forecast")
subplot(2,1,2)
stem(YPred - YTest)
xlabel("Month")
ylabel("Error")
title("RMSE = " + rmse)
And the data.xlsx is in: https://www.dropbox.com/s/vv1apug7iqlocu1/data.xlsx?dl=1

You want to find temporal patterns in the data. MATLAB's example data looks like a sine wave with noise, a very clear pattern. Your data is far from showing a clear pattern, and it needs preprocessing. I would start by removing the slow drifts; a high-pass or band-pass filter of some sort makes sense. Here is a simple line just for a quick view of your data without the slow frequencies:
T=readtable('data.xlsx','readvariablenames',0);
figure; plot(T.Var1-smoothdata(T.Var1,'movmean',200))
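If that view looks more stationary, one simple way to use it in the pipeline above (a minimal sketch, untested on your data; the 200-sample window and variable names are assumptions) is to train the LSTM on the detrended residual and add the trend back when comparing against the raw series:
T = readtable('data.xlsx','ReadVariableNames',false);
raw = T.Var1';                            % row vector, like the original script
trend = smoothdata(raw,'movmean',200);    % slow drift (200-sample window is a guess; tune it)
data = raw - trend;                       % detrended series to feed the LSTM
% ... then run the same split / standardize / train / forecast steps as above ...
% and add the trend back when comparing forecasts with the raw series:
% YPredRaw = YPred + trend(numTimeStepsTrain+2:end);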

Related

Transfer Learning for Regression in Matlab

I am trying to implement a model that takes an image as the input and gives a vector of 26 numbers. I am using VGG-16 at this time through the following MATLAB code:
analyzeNetwork(net);
NUM_OUTPUT = 26;
layers = net.Layers;
%output = fullyConnectedLayer(NUM_OUTPUT, ...
% 'Name','output_layer', ...
% 'WeightLearnRateFactor',10, ...
% 'BiasLearnRateFactor',10);
layers = [
    layers(1:38)
    fullyConnectedLayer(NUM_OUTPUT)
    regressionLayer];
%layers(1:67) = freezeWeights(layers(1:67));
miniBatchSize = 5;
validationFrequency = floor(numel(YTrain)/miniBatchSize);
options = trainingOptions('sgdm',...
    'InitialLearnRate',0.001, ...
    'ValidationData',{XValidation,YValidation},...
    'Plots','training-progress',...
    'Verbose',false);
net = trainNetwork(XTrain,YTrain,layers,options);
YPred = predict(net,XValidation);
predictionError = YValidation - YPred;
thr = 10;
numCorrect = sum(abs(predictionError) < thr);
numImagesValidation = numel(YValidation);
accuracy = numCorrect/numImagesValidation;
rmse = sqrt(mean(predictionError.^2));
The shape of XTrain and YTrain are as follows:
XTrain: 224 224 3 140
YTrain: 26 140
By running the code above (it is part of the code, not the whole of it) I get the following error:
Error using trainNetwork (line 170)
Number of observations in X and Y disagree.
I would appreciate it if somebody could help me figure out what the problem is, because as far as I know the number of samples in both is equal, and there is no necessity for the rest of the dimensions to be equal.
Transpose YTrain to be 140x26.
Name your new layers, and wrap them in a layerGraph.
Regression can easily go unstable, so decrease the learning rate or increase the batch size if you get some NaNs.
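For example, the transpose fix is a one-liner (assuming YTrain and YValidation are both stored with observations along the second dimension, as in the question):
YTrain = YTrain';           % now 140x26: one row per observation
YValidation = YValidation'; % assumed to have the same layout
The rest of the suggested changes could look like this: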
net = vgg16; % analyzeNetwork(net);
LAYERS_FREEZE_UNTIL = 35;
LAYERS_COPY_UNTIL = 38;
NUM_TRAIN_SAMPLES = size(YTrain,1);
NUM_OUTPUT = size(YTrain,2);
% freezeWeights is the helper shipped with MATLAB's transfer learning
% examples; it sets the layers' learn rate factors to 0.
my_layers = layerGraph([
    freezeWeights(net.Layers(1:LAYERS_FREEZE_UNTIL))
    net.Layers(LAYERS_FREEZE_UNTIL+1:LAYERS_COPY_UNTIL)
    fullyConnectedLayer(NUM_OUTPUT*2,'Name','my_fc1')
    fullyConnectedLayer(NUM_OUTPUT,'Name','my_fc2')
    regressionLayer('Name','my_regr')
    ]);
% figure; plot(my_layers), ylim([0.5,6.5])
% analyzeNetwork(my_layers);
MINI_BATCH_SIZE = 16;
options = trainingOptions('sgdm', ...
    'MiniBatchSize',MINI_BATCH_SIZE, ...
    'MaxEpochs',20, ...
    'InitialLearnRate',1e-4, ...
    'Shuffle','every-epoch', ...
    'ValidationData',{XValidation,YValidation}, ...
    'ValidationFrequency',floor(NUM_TRAIN_SAMPLES/MINI_BATCH_SIZE), ...
    'Verbose',true, ...
    'Plots','training-progress');
my_net = trainNetwork(XTrain,YTrain,my_layers,options);
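After training, the evaluation from the question should work unchanged, since predict returns one row of 26 values per validation image (a sketch, assuming the transposed targets):
YPred = predict(my_net,XValidation);
predictionError = YValidation - YPred;    % one row of residuals per validation image
rmse = sqrt(mean(predictionError(:).^2));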

Repeated classification accuracies in a loop always the same

I have pretty simple code for binary classification (see below). When I re-run this in MATLAB (just by manually pressing the "run" button), each run gives me slightly different accuracies for each of the 14 subjects. However, if I loop over my code nrPermute times, every iteration of the loop gives me EXACTLY the same accuracy for the respective subject. Why is that? In the first version, mean(accuracy) differs between runs, whereas in the second version it is always the same across iterations. Both versions are below.
Code where only one 10-fold crossvalidation is done for each subject:
%% SVM-Classification
nrFolds = 10; %number of folds of crossvalidation, 10 is standard
kernel = 'linear'; % 'linear', 'rbf' or 'polynomial'
C = 1;
solver = 'L1QP';
cvFolds = crossvalind('Kfold', labels, nrFolds);
for k = 1:14
    for i = 1:nrFolds % iterate through each fold
        testIdx = (cvFolds == i); % indices of test instances
        trainIdx = ~testIdx; % indices of training instances
        % train the SVM
        cl = fitcsvm(features(trainIdx,:), ...
            labels(trainIdx),'KernelFunction',kernel,'Standardize',true,...
            'BoxConstraint',C,'ClassNames',[0,1],'Solver',solver);
        [label,scores] = predict(cl, features(testIdx,:));
        eq = sum(label==labels(testIdx));
        accuracy(i) = eq/numel(labels(testIdx));
    end
    crossValAcc(k) = mean(accuracy);
end
Code where each 10-fold crossvalidation is repeated nrPermute times:
%% SVM-Classification
nrFolds = 10; %number of folds of crossvalidation, 10 is standard
kernel = 'linear'; % 'linear', 'rbf' or 'polynomial'
C = 1;
solver = 'L1QP';
cvFolds = crossvalind('Kfold', labels, nrFolds);
nrPermute = 5;
for k = 1:14
    for p = 1:nrPermute
        for i = 1:nrFolds % iterate through each fold
            testIdx = (cvFolds == i); % indices of test instances
            trainIdx = ~testIdx; % indices of training instances
            % train the SVM
            cl = fitcsvm(features(trainIdx,:), ...
                labels(trainIdx),'KernelFunction',kernel,'Standardize',true,...
                'BoxConstraint',C,'ClassNames',[0,1],'Solver',solver);
            [label,scores] = predict(cl, features(testIdx,:));
            eq = sum(label==labels(testIdx));
            accuracy(i) = eq/numel(labels(testIdx));
        end
        accSubj(p) = mean(accuracy); % accuracy of each permutation
    end
    crossValAcc(k) = mean(accSubj);
end
In case it is useful for someone else as well, I figured it out: the call cvFolds = crossvalind('Kfold', labels, nrFolds); should be inside the permutation loop, so that the distribution into folds is re-shuffled on every permutation!
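In other words, the fix is just moving one line (a minimal sketch; the train/test code stays exactly as above):
for k = 1:14
    for p = 1:nrPermute
        cvFolds = crossvalind('Kfold', labels, nrFolds); % re-shuffle folds on every permutation
        for i = 1:nrFolds
            % ... train and evaluate each fold exactly as above ...
        end
        accSubj(p) = mean(accuracy);
    end
    crossValAcc(k) = mean(accSubj);
end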

Test / Match images in an SVM classifier

I have extracted HOG features from a set of 20 images. I did classification with the following code, but I have issues testing the classifier. What am I doing wrong? Please help.
%% Load Images
imgFolder1 = fullfile('C:\Users\Engineering\Desktop\Finn\NEW');
imgFolder2 = fullfile('C:\Users\Engineering\Desktop\Finn\NEW2');
training = imageSet(imgFolder1);
test = imageSet(imgFolder2);
%% Extract and display Histogram of Oriented Gradient Features for single Note
[hogFeature, visualization]= ...
extractHOGFeatures(read(training,1));
figure;
subplot(2,1,1);imshow(read(training,1));title('Input Face');
subplot(2,1,2);plot(visualization);title('HoG Feature');
%% Extract HOG Features for training set
trainingFeatures = [];
trainingLabel = [];
for i = 1:training.Count
    [hogFeature, visualization] = ...
        extractHOGFeatures(read(training,i));
    trainingFeatures = [trainingFeatures;hogFeature];
end
%%
labels = repmat(training.Description, i);
trainingLabel = [trainingLabel; labels];
%% I am not sure about this line for noteIndex:
noteIndex = trainingLabel, i
%% Create 20 class classifier using fitcecoc
noteClassifier = fitcecoc(trainingFeatures,trainingLabel);
%% Test Images from Test Set
note = 1;
queryImage = read(test(note),1);
queryFeatures = extractHOGFeatures(queryImage);
noteLabel = predict(noteClassifier,queryFeatures);
% Map back to training set to find identity
booleanIndex = strcmp(noteLabel, noteIndex);
integerIndex = find(booleanIndex);
subplot(1,2,1);imshow(queryImage);title('Query Face');
subplot(1,2,2);imshow(read(training(integerIndex),1));title('Matched Class');
%% Test First 5 notes from Test Set
figure;
figureNum = 1;
for note=1
    for k = 1:test(note).Count
        queryImage = read(test(note),k);
        queryFeatures = extractHOGFeatures(queryImage);
        noteLabel = predict(noteClassifier,queryFeatures);
        % Map back to training set to find identity
        booleanIndex = strcmp(noteLabel, noteIndex);
        integerIndex = find(booleanIndex);
        subplot(4,4,figureNum);imshow(imresize(queryImage,3));title('Query Note');
        subplot(4,4,figureNum+1);imshow(imresize(read(training(integerIndex),1),3));title('Matched Class');
        figureNum = figureNum+2;
    end
    figure;
    figureNum = 1;
end
%% I get these errors:
%% "This method is not supported for arrays of imageSet objects."
%% and
%% Error in: subplot(1,2,2);imshow(read(training(integerIndex),1));title('Matched Class');
%% So it displays the query image, but the match image is like a graph plot with no image.

MATLAB: One Step Ahead Neural Network Timeseries Forecast

Intro: I'm using MATLAB's Neural Network Toolbox in an attempt to forecast time series one step into the future. Currently I'm just trying to forecast a simple sinusoidal function, but hopefully I will be able to move on to something a bit more complex after I obtain satisfactory results.
Problem: Everything seems to work fine, however the predicted forecast tends to be lagged by one period. Neural network forecasting isn't much use if it just outputs the series delayed by one unit of time, right?
Code:
t = -50:0.2:100;
noise = rand(1,length(t));
y = sin(t)+1/2*sin(t+pi/3);
split = floor(0.9*length(t));
forperiod = length(t)-split;
numinputs = 5;
forecasted = [];
msg = '';
for j = 1:forperiod
    fprintf(repmat('\b',1,numel(msg)));
    msg = sprintf('forecasting iteration %g/%g...\n',j,forperiod);
    fprintf('%s',msg);
    estdata = y(1:split+j-1);
    estdatalen = size(estdata,2);
    signal = estdata;
    last = signal(end);
    [signal,low,high] = preprocess(signal'); % pre-process
    signal = signal';
    inputs = signal(rowshiftmat(length(signal),numinputs));
    targets = signal(numinputs+1:end);
    %% NARNET METHOD
    feedbackDelays = 1:4;
    hiddenLayerSize = 10;
    net = narnet(feedbackDelays,[hiddenLayerSize hiddenLayerSize]);
    net.inputs{1}.processFcns = {'removeconstantrows','mapminmax'};
    signalcells = mat2cell(signal,[1],ones(1,length(signal)));
    [inputs,inputStates,layerStates,targets] = preparets(net,{},{},signalcells);
    net.trainParam.showWindow = false;
    net.trainParam.showCommandLine = false;
    net.trainFcn = 'trainlm'; % Levenberg-Marquardt
    net.performFcn = 'mse'; % Mean squared error
    [net,tr] = train(net,inputs,targets,inputStates,layerStates);
    next = net(inputs(end),inputStates,layerStates);
    next = postprocess(next{1}, low, high); % post-process
    next = (next+1)*last;
    forecasted = [forecasted next];
end
figure(1);
plot(1:forperiod, forecasted, 'b', 1:forperiod, y(end-forperiod+1:end), 'r');
grid on;
Note:
The function 'preprocess' simply converts the data into logged % differences and 'postprocess' converts the logged % differences back for plotting. (Check EDIT for preprocess and postprocess code)
Results: [plot: blue = forecasted values, red = actual values]
Can anyone tell me what I'm doing wrong here? Or perhaps recommend another method to achieve the desired results (lagless prediction of sinusoidal function, and eventually more chaotic timeseries)? Your help is very much appreciated.
EDIT:
It's been a few days now and I hope everyone has enjoyed their weekend. Since no solutions have emerged, I've decided to post the code for the helper functions 'postprocess.m', 'preprocess.m', and their helper function 'normalize.m'. Maybe this will help get the ball rolling.
postprocess.m:
function data = postprocess(x, low, high)
    % denormalize
    logdata = (x+1)/2*(high-low)+low;
    % inverse log data
    sign = logdata./abs(logdata);
    data = sign.*(exp(abs(logdata))-1);
end
preprocess.m:
function [y, low, high] = preprocess(x)
    % differencing
    diffs = diff(x);
    % calc % changes
    chngs = diffs./x(1:end-1,:);
    % log data
    sign = chngs./abs(chngs);
    logdata = sign.*log(abs(chngs)+1);
    % normalize logrets
    high = max(max(logdata));
    low = min(min(logdata));
    y = [];
    for i = 1:size(logdata,2)
        y = [y normalize(logdata(:,i), -1, 1)];
    end
end
normalize.m:
function Y = normalize(X,low,high)
    %NORMALIZE Linear normalization of X between low and high values.
    if length(X) <= 1
        error('Length of X input vector must be greater than 1.');
    end
    mi = min(X);
    ma = max(X);
    Y = (X-mi)/(ma-mi)*(high-low)+low;
end
I didn't check your code, but I made a similar test predicting sin() with a NN. The result seems reasonable, without a lag. I think your bug is somewhere in the synchronization of predicted values with actual values.
Here is the code:
%% init & params
t = (-50 : 0.2 : 100)';
y = sin(t) + 0.5 * sin(t + pi / 3);
sigma = 0.2;
n_lags = 12;
hidden_layer_size = 15;
%% create net
net = fitnet(hidden_layer_size);
%% train
noise = sigma * randn(size(t));
y_train = y + noise;
out = circshift(y_train, -1);
out(end) = nan;
in = lagged_input(y_train, n_lags);
net = train(net, in', out');
%% test
noise = sigma * randn(size(t)); % new noise
y_test = y + noise;
in_test = lagged_input(y_test, n_lags);
out_test = net(in_test')';
y_test_predicted = circshift(out_test, 1); % sync with actual value
y_test_predicted(1) = nan;
%% plot
figure,
plot(t, [y, y_test, y_test_predicted], 'linewidth', 2);
grid minor; legend('orig', 'noised', 'predicted')
and the lagged_input() function:
function in = lagged_input(in, n_lags)
    for k = 2 : n_lags
        in = cat(2, in, circshift(in(:, end), 1));
        in(1, k) = nan;
    end
end

Equation that compute a Neural Network in Matlab

I created a neural network in MATLAB. This is the script:
load dati.mat;
inputs=dati(:,1:8)';
targets=dati(:,9)';
hiddenLayerSize = 10;
net = patternnet(hiddenLayerSize);
net.inputs{1}.processFcns = {'removeconstantrows','mapminmax', 'mapstd','processpca'};
net.outputs{2}.processFcns = {'removeconstantrows','mapminmax', 'mapstd','processpca'};
net = struct(net);
net.inputs{1}.processParams{2}.ymin = 0;
net.inputs{1}.processParams{4}.maxfrac = 0.02;
net.outputs{2}.processParams{4}.maxfrac = 0.02;
net.outputs{2}.processParams{2}.ymin = 0;
net = network(net);
net.divideFcn = 'divideind';
net.divideMode = 'sample'; % Divide up every sample
net.divideParam.trainInd = 1:428;
net.divideParam.valInd = 429:520;
net.divideParam.testInd = 521:612;
net.trainFcn = 'trainscg'; % Scaled conjugate gradient backpropagation
net.performFcn = 'mse'; % Mean squared error
net.plotFcns = {'plotperform','plottrainstate','ploterrhist', 'plotregression', 'plotconfusion', 'plotroc'};
net=init(net);
net.trainParam.max_fail=20;
[net,tr] = train(net,inputs,targets);
outputs = net(inputs);
errors = gsubtract(targets,outputs);
performance = perform(net,targets,outputs)
Now I want to save the weights and biases of the network and write the equation.
I had saved the weights and biases:
W1=net.IW{1,1};
W2=net.LW{2,1};
b1=net.b{1,1};
b2=net.b{2,1};
So, I've done the data preprocessing and I wrote the following equation
max_range=0;
[y,ps]=removeconstantrows(input, max_range);
ymin=0;
ymax=1;
[y,ps2]=mapminmax(y,ymin,ymax);
ymean=0;
ystd=1;
y=mapstd(y,ymean,ystd);
maxfrac=0.02;
y=processpca(y,maxfrac);
in=y';
uscita=tansig(W2*(tansig(W1*in+b1))+b2);
But with the same input, input=[1:8], I get different results. Why? What's wrong?
Help me please! It's important!
I use MATLAB R2010b.
It looks like you are pre-processing the inputs but not post-processing the outputs. Post-processing uses the "reverse" processing form. (Targets are pre-processed, so outputs are reverse-processed.)
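As a sketch of what that could look like with the process functions from the script (keeping the two tansig layers from the question; the ps_* settings structs are assumptions captured when pre-processing the targets, and the reverse calls run in the opposite order of the forward ones):
% Forward-process the targets once, keeping the settings structs:
[t,ps_rcr] = removeconstantrows(targets);
[t,ps_mmm] = mapminmax(t,0,1);
[t,ps_std] = mapstd(t,0,1);
[t,ps_pca] = processpca(t,0.02);
% Simulate the network in the processed space:
a = tansig(W2*tansig(W1*in+b1)+b2);
% Reverse-process the output, in the opposite order:
a = processpca('reverse',a,ps_pca);
a = mapstd('reverse',a,ps_std);
a = mapminmax('reverse',a,ps_mmm);
uscita = removeconstantrows('reverse',a,ps_rcr);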
This equation
uscita=tansig(W2*(tansig(W1*in+b1))+b2);
is wrong. Why do you write two tansig? You have 10 neurons; you should write it 10 times or use for i=1:10;