Learning vector quantization doesn't work well in MATLAB

I want to use learning vector quantization (LVQ) to classify F_CK data with 7 classes.
When I use an MLP, the error is about 15%, but when I use LVQ, the error is about 75% :(
I see that LVQ classifies one class very well but doesn't classify the other classes.
My code:
data = load('F_CK+');
x = data.X';
y_data = data.Y';
t = ind2vec(y_data);
net = lvqnet(4,0.1,'learnlv2');
net.divideFcn = 'dividerand';
net.divideMode = 'sample';
net.divideParam.trainRatio = 85/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 0/100;
net.trainParam.epochs = 15;
net = train(net, x, t);
y = net(x);
classes = vec2ind(y);
figure, plotconfusion(t,y);
[Confusion matrix of my result (F_CK)]
Can anyone help me understand why this network only classifies one class, and what my mistake is?
Dataset links:
https://dl.dropboxusercontent.com/u/100069389/File/Stackoverflow/F_CK.rar
https://mega.nz/#!J8ES1DRS!NZwDsD0FFojeZiI-OpORzxGLbMp9rx0XKsfOvGDOaR0

I don't know what my mistake was, but I did two things that improved the classification accuracy:
1. Normalize the data between -1 and 1.
2. Increase the number of subclasses / LVQ neurons to 64, to cover all of the image classes.
As far as I remember, an LVQ network should be more accurate than an MLP, but my accuracy with LVQ only increased to 80%. A short sketch of these changes is shown below.
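A minimal sketch of those two changes, assuming the same F_CK variables as in the question; mapminmax handles the [-1, 1] scaling and lvqnet gets 64 competitive neurons, while the remaining setup follows the original code:
data = load('F_CK+');
x = mapminmax(data.X', -1, 1);       % 1) scale every feature row to [-1, 1]
t = ind2vec(data.Y');                % one-hot targets for the 7 classes
net = lvqnet(64, 0.1, 'learnlv2');   % 2) 64 subclass neurons instead of 4
net.trainParam.epochs = 15;
net = train(net, x, t);
y = net(x);
figure, plotconfusion(t, y);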


K-fold cross validation modification to generated ANN code?

My data set is basically a matrix of 3 variables (input) and a matrix of 1 variable (target). There are 50 data sets in total for each of these (basically 50 samples of f(x,y,z) = t).
I have only done the ANN training using the GUI, never really with a script/code.
My most simple objective now is to split the data manually for each train-test run, so I can just painstakingly run the neural network 5 times, but I'm not even sure how to manually select one range of the data set for use in training and another for testing.
Here's the full exported script from MATLAB. The point of focus is shown below the wall of code.
% Solve an Input-Output Fitting problem with a Neural Network
% Script generated by NFTOOL
% Created Mon Jul 17 02:39:31 SGT 2017
%
% This script assumes these variables are defined:
%
% DEinp - input data.
% DEcgl - target data.
inputs = DEinp;
targets = DEcgl;
% Create a Fitting Network
hiddenLayerSize = 10;
net = fitnet(hiddenLayerSize);
% Choose Input and Output Pre/Post-Processing Functions
% For a list of all processing functions type: help nnprocess
net.inputs{1}.processFcns = {'removeconstantrows','mapminmax'};
net.outputs{2}.processFcns = {'removeconstantrows','mapminmax'};
% Setup Division of Data for Training, Validation, Testing
% For a list of all data division functions type: help nndivide
net.divideMode = 'sample'; % Divide up every sample
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
% For help on training function 'trainlm' type: help trainlm
% For a list of all training functions type: help nntrain
net.trainFcn = 'trainlm'; % Levenberg-Marquardt
% Choose a Performance Function
% For a list of all performance functions type: help nnperformance
net.performFcn = 'mse'; % Mean squared error
% Choose Plot Functions
% For a list of all plot functions type: help nnplot
net.plotFcns = {'plotperform','plottrainstate','ploterrhist', ...
'plotregression', 'plotfit'};
% Train the Network
[net,tr] = train(net,inputs,targets);
% Test the Network
outputs = net(inputs);
errors = gsubtract(targets,outputs);
performance = perform(net,targets,outputs)
% Recalculate Training, Validation and Test Performance
trainTargets = targets .* tr.trainMask{1};
valTargets = targets .* tr.valMask{1};
testTargets = targets .* tr.testMask{1};
trainPerformance = perform(net,trainTargets,outputs)
valPerformance = perform(net,valTargets,outputs)
testPerformance = perform(net,testTargets,outputs)
% View the Network
view(net)
% Plots
% Uncomment these lines to enable various plots.
%figure, plotperform(tr)
%figure, plottrainstate(tr)
%figure, plotfit(net,inputs,targets)
%figure, plotregression(targets,outputs)
%figure, ploterrhist(errors)
I figured that all I needed to do was mess with the net.divideMode section, but I really have no idea how to change the syntax to complete my objective.
Network Parameters
The process of splitting the data into training, validation and test sets happens in the section that you identified. I'm just going to break down each of the lines. Starting with:
% Setup Division of Data for Training, Validation, Testing
% For a list of all data division functions type: help nndivide
net.divideMode = 'sample'; % Divide up every sample
The divideMode property is well documented in Neural Network Object Properties:
net.divideMode
This property defines the target data dimensions which
to divide up when the data division function is called. Its default
value is 'sample' for static networks and 'time' for dynamic networks.
It may also be set to 'sampletime' to divide targets by both sample
and timestep, 'all' to divide up targets by every scalar value, or
'none' to not divide up data at all (in which case all data is used
for training, none for validation or testing).
So your network is a static network, and the data is divided up sample by sample. This will remain the same for your cross-validation; what you are interested in manipulating is the training, validation, and test splits.
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
Okay, the variable names here seem promising, but you want a little more control than just choosing the ratio sizes.
Again, the Neural Network Object Properties documentation points us towards more information:
net.divideParam
This property defines the parameters and values of the current
data-division function. To get a description of what each field means,
type the following command:
help(net.divideFcn)
This will print out information about how your dataset is partitioned into training, validation, and test splits. In your current configuration, the message reads
dividerand Partition indices into three sets using random indices.
[trainInd,valInd,testInd] = dividerand(Q,trainRatio,valRatio,testRatio) takes a number of
samples Q and divides up the sample indices 1:Q between training,
validation and test indices.
dividerand randomly assigns sample indices to the three sets according to the three ratios.
(...)
See also divideblock, divideind, divideint, dividetrain.
Since you want more control of the partitions, you should check out these additional options.
I think the most promising is divideind. This option allows you to specify the indices for each partition. You can calculate the indices for each fold in your k-fold cross validation and reassign the partitions in each iteration using this option.
To set this parameter, replace the net.divideParam lines above with something like,
net.divideFcn = 'divideind';
net.divideParam.Q = length(targets); %This is the total number of instances in your data
net.divideParam.trainInd = your_train_ind;
net.divideParam.valInd = your_val_ind;
net.divideParam.testInd = your_test_ind;
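For example, with the 50 samples mentioned in the question, holding out the first 10 for testing and training on the rest (with no validation split) would look something like this; the concrete index ranges are only illustrative:
net.divideFcn = 'divideind';
net.divideParam.trainInd = 11:50;   % 40 samples used for training
net.divideParam.valInd = [];        % no validation set in this fold
net.divideParam.testInd = 1:10;     % 10 samples held out for testing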
K-folds
Last detail, how to select the indices? First, a quick review on k-fold cross-validation.
The data is split into k equally sized subsamples.
In each iteration of cross-validation, we train on k-1 of the subsamples and test on the remaining subsample, rotating to a new testing subsample each time.
An implementation sketch might look like this:
k = 5; % As an example, let's let k = 5
sample_size = length(targets)/k;
% Make a vector of all the indices of your data from 1 to the total number of instances
indices = 1:length(targets);
% Optional: randomize the sample order
indices = randperm(length(targets));
% Iterate in steps of sample_size, one fold per iteration
for ii = 1:sample_size:length(targets)
    % Grab one subsample of indices for testing
    your_test_ind = indices(ii:ii + sample_size - 1);
    % Everything else goes to training
    your_train_ind = indices([1:ii - 1, ii + sample_size:end]);
    % Train and test your network here!
end
This is just an implementation sketch: it assumes the number of samples divides evenly by k and it doesn't build a validation split, but it should be enough to get you started. A fuller sketch that plugs these fold indices back into the network training follows below.
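Putting the pieces together, one hedged way to run the whole cross-validation might look like the following. It reuses the fitnet setup from your script and the divideind option from above; the fold bookkeeping is only illustrative and assumes length(targets) is divisible by k:
k = 5;
indices = randperm(length(targets));              % random fold assignment
sample_size = length(targets)/k;
testPerf = zeros(1, k);
for fold = 1:k
    ii = (fold - 1)*sample_size + 1;
    your_test_ind  = indices(ii:ii + sample_size - 1);
    your_train_ind = indices([1:ii - 1, ii + sample_size:end]);
    net = fitnet(10);                             % same architecture as your script
    net.divideFcn = 'divideind';                  % explicit indices for this fold
    net.divideParam.trainInd = your_train_ind;
    net.divideParam.valInd = [];                  % no validation split in this sketch
    net.divideParam.testInd = your_test_ind;
    [net, tr] = train(net, inputs, targets);
    testOut = net(inputs(:, your_test_ind));
    testPerf(fold) = perform(net, targets(:, your_test_ind), testOut);
end
mean(testPerf)                                    % average test performance over the k folds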

How to increase accuracy in SVM training and classification in Matlab?

I am training an SVM with several images; this is my first project with SVM. I am extracting features with HOG feature extraction and labelling the training features 1 if the location is on the horizon line and 0 if it is on the background. I have 74 images for training and 7 images for testing. Unfortunately, I can't get above 50 percent accuracy. I have changed image sizes and played with the cell sizes in the feature extraction, but it does not change much. What else can I try? And what is the ideal dataset size, i.e. how many images for training and for testing? For example, in one image it predicts everything correctly, and in the next image everything wrong.
This is how I am calculating accuracy:
%%%%% Evaluation
% Testing Data
hfsTest = vertcat(dataset.HorizonFeatsTest{:});
bfsTest = vertcat(dataset.BgFeatsTest{:});
test_data = [hfsTest;bfsTest];
% Labels
hlabelTest = ones(size(hfsTest,1),1);
blabelTest = zeros(size(bfsTest,1),1);
test_label = [hlabelTest;blabelTest];
Predict_label = vertcat(results.predicted_label{:});
acc = numel(find(Predict_label==test_label))/length(test_label);
disp(['Accuracy ', num2str(acc)]);
%done
% Training Data
hfs = vertcat(dataset.HorizonFeats{:});
bfs = vertcat(dataset.BgFeats{:});
train_data = [hfs;bfs];
% Labels
hlabel = ones(size(hfs,1),1);
blabel = zeros(size(bfs,1),1);
train_label = [hlabel;blabel];
%%%
% do training ...
svmModel = svmtrain(train_data, train_label,'BoxConstraint',2e-1);
and I have used Predict_label_image = svmclassify(svmModel, image_feats); for testing.
You need to do a lot of tuning. Here in the documentation you have all the hyperparameters you can play with. I'd start with an RBF kernel and try [0.01, 0.1, 1, 10] for BoxConstraint.
I'm afraid you can't expect an SVM to work well if you don't try different hyperparameter configurations; a small grid search like the sketch below is usually enough to get started.
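A minimal grid-search sketch in that spirit, reusing the train_data/train_label and test_data/test_label variables from the question and the same svmtrain/svmclassify functions; the candidate values are just a starting point, not a recommendation:
bestAcc = 0;
for C = [0.01 0.1 1 10]                          % candidate BoxConstraint values
    for sigma = [0.5 1 2 4]                      % candidate RBF kernel widths
        model = svmtrain(train_data, train_label, ...
            'kernel_function', 'rbf', ...
            'rbf_sigma', sigma, ...
            'boxconstraint', C);
        pred = svmclassify(model, test_data);
        acc = mean(pred == test_label);
        if acc > bestAcc
            bestAcc = acc;                       % keep the best configuration so far
            bestModel = model;
        end
    end
end
disp(['Best accuracy ', num2str(bestAcc)]);
Strictly speaking, the hyperparameters should be chosen on a held-out validation split (or by cross-validation) rather than on the 7 test images, otherwise the reported accuracy will be optimistic.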

Rainfall pattern Recognition / Classification using Neural Network

I am working on an assignment titled "Pattern Recognition of Rainfall using ANN". For training and validation purposes, I have time-series rainfall data from 1976 to 2006 for two meteorological stations. I have arranged the data in Excel, from where I import it and convert it into matrices. The input matrix is 480 × 5, with the 5 values per sample representing Year, Month, Longitude, Latitude and Elevation; the output matrix contains the mean rainfall (mm) of that month, with size 480 × 1.
For example, the input parameters are
1976 1 31.5400 74.2200 214
and the output is
23.9000
I have used patternnet and my code is below:
net = patternnet(10);
% Setup Division of Data for Training, Validation, Testing
% For a list of all data division functions type: help nndivide
net.divideFcn = 'dividerand'; % Divide data randomly
net.divideMode = 'sample'; % Divide up every sample
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
load Input
load Target
P = transpose(Input);
T = transpose(Target);
x=P;
y=T;
[net,tr] = train(net,x,y);
nntraintool
view(net)
% The network outputs will be in the range 0 to 1,
% so we can use vec2ind function to get the class indices as the position of the highest element in each output vector.
testX = x(:,tr.testInd);
testT = y(:,tr.testInd);
testY = net(testX);
testIndices = vec2ind(testY)
% the overall percentages of correct and incorrect classification.
[c,cm] = confusion(testT,testY)
fprintf('Percentage Correct Classification : %f%%\n', 100*(1-c));
fprintf('Percentage Incorrect Classification : %f%%\n', 100*c);
% View the Network
view(net)
% Plots
% Uncomment these lines to enable various plots.
%figure, plotperform(tr)
%figure, plottrainstate(tr)
%figure, plotconfusion(targets,outputs)
%figure, ploterrhist(errors)
I transpose the input and output values, but the results are not satisfactory:
1. Percentage Correct Classification = 85%
2. The best validation performance graph values lie on the same axis for all epochs
3. The regression value is 27.36
4. The ROC line is at 0 in training, validation and testing
Are the above forms of the inputs and targets correct and valid for accurate and precise results? Am I using the correct approach / code for this purpose? Any suggestions please.
Thanks for your time!

How to apply Back propagation for 3 class classification task in matlab 2012a?

I want to solve a classification problem with 3 classes using a multilayer neural network with the backpropagation algorithm. I'm using MATLAB 2012a. I'm facing trouble with the newff function. I want to build a network with one hidden layer, and there will be 3 neurons in the output layer, one for each class. Please advise me with an example.
Here is my code
clc
%parameters
nodesInHL=7;
nodesInOutput=3;
iteration=1000;
HLtranfer='tansig';
outputTranser='tansig';
trainFunc='traingd';
learnRate=0.05;
performanceFunc='mse';
%rand('seed',0);
%randn('seed',0);
rng('shuffle');
net=newff(trainX,trainY,[nodesInHL],{HLtranfer,outputTranser},trainFunc,'learngd',performanceFunc);
net=init(net);
%setting parameters
net.trainParam.epochs=iteration;
net.trainParam.lr=learnRate;
%training
[net,tr]=train(net,trainX,trainY);
Thanks.
The newff function was made obsolete. The recommended function is feedforwardnet, or in your case (classification), use patternnet.
You could also use the GUI of nprtool, which provides a wizard-like tool that guides you step-by-step to build your network. It even allows for code generation at the end of the experiment.
Here is an example:
%# load sample dataset
%# simpleclassInputs: 2x1000 matrix (1000 points of 2-dimensions)
%# simpleclassTargets: 4x1000 matrix (4 possible classes)
load simpleclass_dataset
%# create ANN of one hidden layer with 7 nodes
net = patternnet(7);
%# set params
net.trainFcn = 'traingd'; %# training function
net.trainParam.epochs = 1000; %# max number of iterations
net.trainParam.lr = 0.05; %# learning rate
net.performFcn = 'mse'; %# mean-squared error function
net.divideFcn = 'dividerand'; %# how to divide data
net.divideParam.trainRatio = 70/100; %# training set
net.divideParam.valRatio = 15/100; %# validation set
net.divideParam.testRatio = 15/100; %# testing set
%# training
net = init(net);
[net,tr] = train(net, simpleclassInputs, simpleclassTargets);
%# testing
y_hat = net(simpleclassInputs);
perf = perform(net, simpleclassTargets, y_hat);
err = gsubtract(simpleclassTargets, y_hat);
view(net)
Note that the network will automatically set the number of nodes in the output layer (based on the number of rows in the target class matrix).
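For your own 3-class problem, the main extra step is getting the targets into that one-of-N form (one row per class, one column per sample). A minimal hedged sketch, assuming trainY is a 1-by-N vector of integer class labels 1, 2 or 3; if your targets are already 3-by-N, skip the ind2vec step:
trainT = full(ind2vec(trainY));   % 3-by-N one-of-N target matrix
net = patternnet(7);              % 7 hidden nodes, as in your newff version
net.trainFcn = 'traingd';
net.trainParam.epochs = 1000;
net.trainParam.lr = 0.05;
[net, tr] = train(net, trainX, trainT);
pred = vec2ind(net(trainX));      % back to integer class labels 1..3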

neural network on matlab performance problem

I'm using this code to build a NN in order to train my network to give me classifications on images:
net = newff(p,t,15,{},'traingd');
net.divideParam.trainRatio = 70/100; % Adjust as desired
net.divideParam.valRatio = 15/100; % Adjust as desired
net.divideParam.testRatio = 15/100; % Adjust as desired
net.trainParam.epochs = 10000;
net.trainParam.goal = 0.01;
net.trainParam.show = 25;
net.trainParam.time = inf;
net.trainParam.min_grad = 1e-10;
net.trainParam.max_fail = 10;
net.trainParam.sigma = 5.0e-5;
net.trainParam.lambda = 5.0e-7;
net.trainParam.mu_max = 1e-20;
net.trainParam.lr = 0.001;
% Train and Apply Network
[net,tr] = train(net,p,t);
outputs = sim(net,p);
% Create P.
% Plot
plotperf(tr)
plotfit(net,p,t)
plotregression(t,outputs)
But my performance never goes below 0.5. I tried to do PCA on the data, but I think something is not right in the code? Is it possible to change the initial value of the performance that shows on the nntraintool?
Thank you,
Paulo
It's hard to say without having your data, but from my experience with neural nets, only one of a few things can possibly be happening:
1. You don't have enough hidden nodes to represent your data.
2. Your time step is too high.
3. Your error space is complicated due to your data and you're reaching lots of local minima. This is a similar but slightly different way of saying 1.
4. Your data is degenerate, in that you have training samples with different labels but exactly the same features.
If 1, then increase the number of hidden nodes.
If 2, decrease the time step.
If 3, you can try initializing better, perhaps with Nguyen-Widrow initialization (this used to be done by the function initnw).
If 4, figure out why your data is like this and fix it.
Thanks to @sazary for pointing out some details about initnw being the default when you create a new network with newff or newcf; a sketch of making that initialization explicit is shown below.
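If you want to apply point 3 explicitly, here is a minimal sketch of forcing Nguyen-Widrow initialization before training; as noted above, initnw is usually already the default for networks created with newff or newcf, so this just makes the choice explicit:
net = newff(p, t, 15, {}, 'traingd');   % same architecture as in the question
net.initFcn = 'initlay';                % initialize the network layer by layer
net.layers{1}.initFcn = 'initnw';       % Nguyen-Widrow for the hidden layer
net.layers{2}.initFcn = 'initnw';       % and for the output layer
net = init(net);                        % re-initialize weights and biases
[net, tr] = train(net, p, t);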