I have a multiclass classification task, and I have tried to use 'trainSoftmaxLayer' in Matlab, but it's a CPU implementation version, and is slow. So I tried to read the documentation for a GPU option, like 'trainSoftmaxLayer('useGPU', 'yes')' in traditional neural network, but there isn't any related options.
Finally, the problem is sovled by hacking the source code of trainSoftmaxLayer.m, which is provided by MATLAB. We can write our own GPU-enabled softmax layer like this:
function [net] = trainClassifier(x, t, use_gpu, showWindow)
net = network;
% define topology
net.numInputs = 1;
net.numLayers = 1;
net.biasConnect = 1;
net.inputConnect(1, 1) = 1;
net.outputConnect = 1;
% set values for labels
net.name = 'Softmax Classifier with GPU Option';
net.layers{1}.name = 'Softmax Layer';
% define transfer function
net.layers{1}.transferFcn = 'softmax';
% set parameters
net.performFcn = 'crossentropy';
net.trainFcn = 'trainscg';
net.trainParam.epochs = 1000;
net.trainParam.showWindow = showWindow;
net.divideFcn = 'dividetrain';
if use_gpu == 1
net = train(net, x, full(t), 'useGPU', 'yes');
else
net = train(net, x, full(t));
end
end
Related
I'm trying to use k-fold cross-validation with the patternnet neural network.
inputs1 is a feature vector and targets1 is label vector from 'iris_dataset'. And xtrain, xtest, ytrain, and ytest are training & testing features and labels respectively after splitting using the cvpartition function.
The steps are as follows:
1.First of all, a Pattern Recognition Network (patternnet) is Created.
In the first and second scripts: net = patternnet;
2.After dividing data into k-folds using cvpartition, two training & testing features and labels are created (k=10).
In the first and second scripts: fold = cvpartition(targets_Vec, 'kfold', kfold);
3.Then, the configure command is used to configure the network object and also initializes the weights and biases of the network;
In the first script: net = configure(net, xtrain', dummyvar(ytrain)'); % xtrain and ytrain are features and labels from step (2).
or
In the second script: net = configure(net, inputs1, targets1); % inputs1 and targets1 are features and labels before splitting up.
4.After initializing the parameters and hyper-parameters, the network is trained using the training data (by the train() function).
In the first script: [net, tr] = train(net, xtrain', dummyvar(ytrain)'); % xtrain and ytrain are features and labels from step (2).
or
In the second script: [net, tr] = train(net, inputs1, targets1); % inputs1 and targets1 are features and labels before splitting up.
5.And finally, the targets are estimated using the trained network (by the net() function).
In the first script: pred = net(xtest'); % testing features from step (2).
or
In the second script: pred = net(inputs1);
Since the training & testing features are separated using cvpartition, so the network should be trained using training features and its labels and then it should be tested by testing features (new data).
Although, the train() function is used for training the network, but it splits its own input (training data and labels from step (2)) into training, validation and testing data while the original testing data from step (2) remains unused.
Therefore, I need a function that is used training features and labels from step 2 (train and validation) for learning and also another function to classifying the new data (testing features from step 2).
After searching, I wrote two scripts, I think the first one isn't correct but I don't sure the second one is also incorrect or not? How to solve it?
The first script:
clc; close all; clearvars;
load iris_dataset
max_iter = 10;
kfold = 10;
[inputs, targets] = iris_dataset;
inputs = inputs';
targets = targets';
targets_Vec= [];
for j = 1 : size(targets, 1)
if max(targets(j, 1:3) == 1) && find(targets(j, 1:3))==1
targets_Vec = [targets_Vec; 1];
elseif max(targets(j, 1:3) == 1) && find(targets(j, 1:3))==2
targets_Vec = [targets_Vec; 2];
elseif max(targets(j, 1:3) == 1) && find(targets(j, 1:3))==3
targets_Vec = [targets_Vec; 3];
end
end
net = patternnet; ... Create a Pattern Recognition Network
rng('default');
... Divide data into k-folds
fold = cvpartition(targets_Vec, 'kfold', kfold);
... Pre
pred2 = []; ytest2 = []; Afold = zeros(kfold,1);
... Neural network start
for k = 1 : kfold
... Call index of training & testing sets
trainIdx = fold.training(k); testIdx = fold.test(k);
... Call training & testing features and labels
xtrain = inputs(trainIdx,:); ytrain = targets_Vec(trainIdx);
xtest = inputs(testIdx,:); ytest = targets_Vec(testIdx);
... configure
net = configure(net, xtrain', dummyvar(ytrain)');
... Initialize neural network
net.layers{1}.name='Hidden Layer 1';
net.layers{2}.name='Output Layer';
net.layers{1}.size = 20;
net.layers{1}.transferFcn = 'tansig';
net.trainFcn = 'trainscg';
net.performFcn = 'crossentropy';
... Choose Input and Output Pre/Post-Processing Functions
net.input.processFcns = {'removeconstantrows','mapminmax'};
net.output.processFcns = {'removeconstantrows','mapminmax'};
... Train the Network
[net, tr] = train(net, xtrain', dummyvar(ytrain)');
... Estimate the targets using the trained network.(Test)
pred = net(xtest');
... Confusion matrix
[c, cm] = confusion(dummyvar(ytest)',pred);
... Get accuracy for each fold
Afold(k) = 100*sum(diag(cm))/sum(cm(:));
... Store temporary result for each fold
pred2 = [pred2(1:end,:), pred];
ytest2 = [ytest2(1:end); ytest];
end
... Overall confusion matrix
[~,confmat] = confusion(dummyvar(ytest2)', pred2);
confmat=transpose(confmat);
... Average accuracy over k-folds
acc = mean(Afold);
... Store results
NN.fold = Afold;
NN.acc = acc;
NN.con = confmat;
fprintf('\n Final classification Accuracy (NN): %g %%',acc);
The second script:
clc; close all; clearvars;
load iris_dataset
max_iter = 10;
kfold = 10;
[inputs1, targets1] = iris_dataset;
inputs = inputs1';
targets = targets1';
targets_Vec= [];
for j = 1 : size(targets, 1)
if max(targets(j, 1:3) == 1) && find(targets(j, 1:3))==1
targets_Vec = [targets_Vec; 1];
elseif max(targets(j, 1:3) == 1) && find(targets(j, 1:3))==2
targets_Vec = [targets_Vec; 2];
elseif max(targets(j, 1:3) == 1) && find(targets(j, 1:3))==3
targets_Vec = [targets_Vec; 3];
end
end
net = patternnet; ... Create a Pattern Recognition Network
rng('default');
... Divide data into k-folds
fold = cvpartition(targets_Vec, 'kfold', kfold);
... Pre
pred2 = []; ytest2 = []; Afold = zeros(kfold,1);
... Neural network start
for k = 1 : kfold
... Call index of training & testing sets
trainIdx = fold.training(k); testIdx = fold.test(k);
... Call training & testing features and labels
xtrain = inputs(trainIdx,:); ytrain = targets_Vec(trainIdx);
xtest = inputs(testIdx,:); ytest = targets_Vec(testIdx);
... configure
net = configure(net, inputs1, targets1);
trInd = find(trainIdx); tstInd = find(testIdx);
net.divideFcn = 'divideind';
net.divideParam.trainInd = trInd;
net.divideParam.testInd = tstInd;
... Initialize neural network
net.layers{1}.name='Hidden Layer 1';
net.layers{2}.name='Output Layer';
net.layers{1}.size = 20;
net.layers{1}.transferFcn = 'tansig';
net.trainFcn = 'trainscg';
net.performFcn = 'crossentropy';
... Choose Input and Output Pre/Post-Processing Functions
net.input.processFcns = {'removeconstantrows','mapminmax'};
net.output.processFcns = {'removeconstantrows','mapminmax'};
... Train the Network
[net, tr] = train(net, inputs1, targets1);
pred = net(inputs1); ... Estimate the targets using the trained network (Test)
... Confusion matrix
[c, cm] = confusion(targets1, pred);
y = net(inputs1);
e = gsubtract(targets1, y);
performance = perform(net, targets1, y);
tind = vec2ind(targets1);
yind = vec2ind(y);
percentErrors = sum(tind ~= yind)/numel(tind);
... Recalculate Training, Validation and Test Performance
trainTargets = targets1 .* tr.trainMask{1};
% valTargets = targets1 .* tr.valMask{1};
testTargets = targets1 .* tr.testMask{1};
trainPerformance = perform(net, trainTargets, y);
% valPerformance = perform(net, valTargets, y);
testPerformance = perform(net, testTargets, y);
test_Fold(k) = testPerformance;
end
test_Fold_mean = mean(test_Fold);
acc = 100*(1-test_Fold_mean);
fprintf('\n Final classification Accuracy (NN): %g %%',acc);
I am using the time series forecasting sample from MathWorks in https://uk.mathworks.com/help/nnet/examples/time-series-forecasting-using-deep-learning.html
The output in the above-mentioned web-address is:
I only changed the dataset and ran the algorithm. Surprisingly, the algorithm is not working good with my dataset and generates a line as forecast as follows:
I am really confused and I cannot understand the reason behind that. I might be need to tune parameters in the algorithm that I am not aware on that. The code I am using is:
%% Load Data
%data = chickenpox_dataset;
%data = [data{:}];
data = xlsread('data.xlsx');
data = data';
%% Divide Data: Training and Testing
numTimeStepsTrain = floor(0.7*numel(data));
XTrain = data(1:numTimeStepsTrain);
YTrain = data(2:numTimeStepsTrain+1);
XTest = data(numTimeStepsTrain+1:end-1);
YTest = data(numTimeStepsTrain+2:end);
%% Standardize Data
mu = mean(XTrain);
sig = std(XTrain);
XTrain = (XTrain - mu) / sig;
YTrain = (YTrain - mu) / sig;
XTest = (XTest - mu) / sig;
%% Define LSTM Network
inputSize = 1;
numResponses = 1;
numHiddenUnits = 500;
layers = [ ...
sequenceInputLayer(inputSize)
lstmLayer(numHiddenUnits)
fullyConnectedLayer(numResponses)
regressionLayer];
%% Training Options
opts = trainingOptions('adam', ...
'MaxEpochs',500, ...
'GradientThreshold',1, ...
'InitialLearnRate',0.005, ...
'LearnRateSchedule','piecewise', ...
'LearnRateDropPeriod',125, ...
'LearnRateDropFactor',0.2, ...
'Verbose',0, ...
'Plots','training-progress');
%% Train Network
net = trainNetwork(XTrain,YTrain,layers,opts);
%% Forecast Future Time Steps
net = predictAndUpdateState(net,XTrain);
[net,YPred] = predictAndUpdateState(net,YTrain(end));
numTimeStepsTest = numel(XTest);
for i = 2:numTimeStepsTest
[net,YPred(1,i)] = predictAndUpdateState(net,YPred(i-1));
end
%% Unstandardize the predictions using mu and sig calculated earlier.
YPred = sig*YPred + mu;
%% RMSE and MAE Calculation
rmse = sqrt(mean((YPred-YTest).^2))
MAE = mae(YPred-YTest)
%% Plot results
figure
plot(data(1:numTimeStepsTrain))
hold on
idx = numTimeStepsTrain:(numTimeStepsTrain+numTimeStepsTest);
plot(idx,[data(numTimeStepsTrain) YPred],'.-')
hold off
xlabel("Month")
ylabel("Cases")
title("Forecast")
legend(["Observed" "Forecast"])
%% Compare the forecasted values with the test data
figure
subplot(2,1,1)
plot(YTest)
hold on
plot(YPred,'.-')
hold off
legend(["Observed" "Forecast"])
ylabel("Cases")
title("Forecast")
subplot(2,1,2)
stem(YPred - YTest)
xlabel("Month")
ylabel("Error")
title("RMSE = " + rmse)
And the data.xlsx is in: https://www.dropbox.com/s/vv1apug7iqlocu1/data.xlsx?dl=1
You want to find temporal patterns in the data. Matlab's data looks like a sine-wave with noise, a very clear pattern. Your data is far from showing a clear pattern. Your data needs preprocessing. I would start by removing the slow drifts. A high-pass, or band-pass filter of some sort makes sense. Here is a simple line just for a quick view of your data without the slow frequencies:
T=readtable('data.xlsx','readvariablenames',0);
figure; plot(T.Var1-smoothdata(T.Var1,'movmean',200))
My dataset is huge. Let X be input training data which is 6X140000 and T be targets, which are 3X140000.
net = patternnet(10);
% Set divide parameters
net.divideFcn = 'divideind';
net.divideParam.trainInd = loc_Train;
net.divideParam.testInd = loc_Test;
net.divideParam.valInd = loc_Valid;
net.trainFcn = 'trainscg';
% Set training parameters
net.trainParam.epochs = 1000;
net.trainParam.max_fail = 20;
net.trainParam.min_grad = 1e-20;
net.trainParam.goal = 1e-10; % Set a very small value
% Set network performance functions
net.performFcn = 'crossentropy';
net.performParam.regularization = 0.02;
net.performParam.normalization = 'none';
net.trainParam.showWindow = 0;
net.trainParam.showCommandLine = 1;
After I have setup my network, I run the following code to train my network.
[net, tr] = train(net, X, T);
The command line shows:
Calculation mode: MEX Training Pattern Recognition Neural Network
with TRAINSCG.
Epoch 0/1000, Time 0.001, Performance 0.0061672/1e-10,
Gradient 0.00065207/1e-20, Validation Checks 0/20
Epoch 20/1000, Time 2.214, Performance 0.0060292/1e-10, Gradient 6.3997e-05/1e-20, Validation Checks 20/20
Training with TRAINSCG completed: Validation
stop.
The tr object, which is the training record, holds information such as testing indices. however, tr.testInd returns empty.
Intro: I'm using MATLAB's Neural Network Toolbox in an attempt to forecast time series one step into the future. Currently I'm just trying to forecast a simple sinusoidal function, but hopefully I will be able to move on to something a bit more complex after I obtain satisfactory results.
Problem: Everything seems to work fine, however the predicted forecast tends to be lagged by one period. Neural network forecasting isn't much use if it just outputs the series delayed by one unit of time, right?
Code:
t = -50:0.2:100;
noise = rand(1,length(t));
y = sin(t)+1/2*sin(t+pi/3);
split = floor(0.9*length(t));
forperiod = length(t)-split;
numinputs = 5;
forecasted = [];
msg = '';
for j = 1:forperiod
fprintf(repmat('\b',1,numel(msg)));
msg = sprintf('forecasting iteration %g/%g...\n',j,forperiod);
fprintf('%s',msg);
estdata = y(1:split+j-1);
estdatalen = size(estdata,2);
signal = estdata;
last = signal(end);
[signal,low,high] = preprocess(signal'); % pre-process
signal = signal';
inputs = signal(rowshiftmat(length(signal),numinputs));
targets = signal(numinputs+1:end);
%% NARNET METHOD
feedbackDelays = 1:4;
hiddenLayerSize = 10;
net = narnet(feedbackDelays,[hiddenLayerSize hiddenLayerSize]);
net.inputs{1}.processFcns = {'removeconstantrows','mapminmax'};
signalcells = mat2cell(signal,[1],ones(1,length(signal)));
[inputs,inputStates,layerStates,targets] = preparets(net,{},{},signalcells);
net.trainParam.showWindow = false;
net.trainparam.showCommandLine = false;
net.trainFcn = 'trainlm'; % Levenberg-Marquardt
net.performFcn = 'mse'; % Mean squared error
[net,tr] = train(net,inputs,targets,inputStates,layerStates);
next = net(inputs(end),inputStates,layerStates);
next = postprocess(next{1}, low, high); % post-process
next = (next+1)*last;
forecasted = [forecasted next];
end
figure(1);
plot(1:forperiod, forecasted, 'b', 1:forperiod, y(end-forperiod+1:end), 'r');
grid on;
Note:
The function 'preprocess' simply converts the data into logged % differences and 'postprocess' converts the logged % differences back for plotting. (Check EDIT for preprocess and postprocess code)
Results:
BLUE: Forecasted Values
RED: Actual Values
Can anyone tell me what I'm doing wrong here? Or perhaps recommend another method to achieve the desired results (lagless prediction of sinusoidal function, and eventually more chaotic timeseries)? Your help is very much appreciated.
EDIT:
It's been a few days now and I hope everyone has enjoyed their weekend. Since no solutions have emerged I've decided to post the code for the helper functions 'postprocess.m', 'preprocess.m', and their helper function 'normalize.m'. Maybe this will help get the ball rollin.
postprocess.m:
function data = postprocess(x, low, high)
% denormalize
logdata = (x+1)/2*(high-low)+low;
% inverse log data
sign = logdata./abs(logdata);
data = sign.*(exp(abs(logdata))-1);
end
preprocess.m:
function [y, low, high] = preprocess(x)
% differencing
diffs = diff(x);
% calc % changes
chngs = diffs./x(1:end-1,:);
% log data
sign = chngs./abs(chngs);
logdata = sign.*log(abs(chngs)+1);
% normalize logrets
high = max(max(logdata));
low = min(min(logdata));
y=[];
for i = 1:size(logdata,2)
y = [y normalize(logdata(:,i), -1, 1)];
end
end
normalize.m:
function Y = normalize(X,low,high)
%NORMALIZE Linear normalization of X between low and high values.
if length(X) <= 1
error('Length of X input vector must be greater than 1.');
end
mi = min(X);
ma = max(X);
Y = (X-mi)/(ma-mi)*(high-low)+low;
end
I didn't check you code, but made a similar test to predict sin() with NN. The result seems reasonable, without a lag. I think, your bug is somewhere in synchronization of predicted values with actual values.
Here is the code:
%% init & params
t = (-50 : 0.2 : 100)';
y = sin(t) + 0.5 * sin(t + pi / 3);
sigma = 0.2;
n_lags = 12;
hidden_layer_size = 15;
%% create net
net = fitnet(hidden_layer_size);
%% train
noise = sigma * randn(size(t));
y_train = y + noise;
out = circshift(y_train, -1);
out(end) = nan;
in = lagged_input(y_train, n_lags);
net = train(net, in', out');
%% test
noise = sigma * randn(size(t)); % new noise
y_test = y + noise;
in_test = lagged_input(y_test, n_lags);
out_test = net(in_test')';
y_test_predicted = circshift(out_test, 1); % sync with actual value
y_test_predicted(1) = nan;
%% plot
figure,
plot(t, [y, y_test, y_test_predicted], 'linewidth', 2);
grid minor; legend('orig', 'noised', 'predicted')
and the lagged_input() function:
function in = lagged_input(in, n_lags)
for k = 2 : n_lags
in = cat(2, in, circshift(in(:, end), 1));
in(1, k) = nan;
end
end
I created a neural network matlab. This is the script:
load dati.mat;
inputs=dati(:,1:8)';
targets=dati(:,9)';
hiddenLayerSize = 10;
net = patternnet(hiddenLayerSize);
net.inputs{1}.processFcns = {'removeconstantrows','mapminmax', 'mapstd','processpca'};
net.outputs{2}.processFcns = {'removeconstantrows','mapminmax', 'mapstd','processpca'};
net = struct(net);
net.inputs{1}.processParams{2}.ymin = 0;
net.inputs{1}.processParams{4}.maxfrac = 0.02;
net.outputs{2}.processParams{4}.maxfrac = 0.02;
net.outputs{2}.processParams{2}.ymin = 0;
net = network(net);
net.divideFcn = 'divideind';
net.divideMode = 'sample'; % Divide up every sample
net.divideParam.trainInd = 1:428;
net.divideParam.valInd = 429:520;
net.divideParam.testInd = 521:612;
net.trainFcn = 'trainscg'; % Scaled conjugate gradient backpropagation
net.performFcn = 'mse'; % Mean squared error
net.plotFcns = {'plotperform','plottrainstate','ploterrhist', 'plotregression', 'plotconfusion', 'plotroc'};
net=init(net);
net.trainParam.max_fail=20;
[net,tr] = train(net,inputs,targets);
outputs = net(inputs);
errors = gsubtract(targets,outputs);
performance = perform(net,targets,outputs)
Now I want to save the weights and biases of the network and write the equation.
I had saved the weights and biases:
W1=net.IW{1,1};
W2=net.LW{2,1};
b1=net.b{1,1};
b2=net.b{2,1};
So, I've done the data preprocessing and I wrote the following equation
max_range=0;
[y,ps]=removeconstantrows(input, max_range);
ymin=0;
ymax=1;
[y,ps2]=mapminmax(y,ymin,ymax);
ymean=0;
ystd=1;
y=mapstd(x,ymean,ystd);
maxfrac=0.02;
y=processpca(y,maxfrac);
in=y';
uscita=tansig(W2*(tansig(W1*in+b1))+b2);
But with the same input input=[1:8] I get different results. why? What's wrong?
Help me please! It's important!
I use Matlab R2010B
It looks like you are pre-processing the inputs but not post-processing the outputs. Post processing uses the "reverse" processing form. (Targets are pre-processed, so outputs are reverse processed).
This equation
uscita=tansig(W2*(tansig(W1*in+b1))+b2);
is wrong. Why do you write two tansig? You have 10 nerouns you should write it 10 times or use for i=1:10;