SVM multiclassification with MATLAB R2015a - matlab

I am trying to use the MATLAB R2015a classification toolbox on data with 4 classes. I imported my dataset and selected a Gaussian kernel to train my classifier. This is my dataset:
myData = [9.36 0;8.72 0;9.13 0;7.38 0;8.02 0;12.15 1;11.02 1;11.61 1;
12.31 1;15.23 1;52.92 2;54.49 2;48.82 2;52.00 2;49.79 2;22.46 3;30.38 3;
21.98 3;24.46 3;26.08 3];
Then I exported the trained classifier to my workspace so I could use it on new test data, but when I try to use it in the workspace this message appears:
Variables have been created in the base workspace.
To use the exported classifier trainedClassifier to make predictions on new data, T, use
yfit = predict(trainedClassifier, T{:,trainedClassifier.PredictorNames})
If your new data contains any integer variables, then preprocess the data to doubles like this:
X = table2array(varfun(@double, T(:,trainedClassifier.PredictorNames)));
yfit = predict(trainedClassifier, X)
I don't understand exactly what this means. What are T and yfit?
How can I test my new data with this classifier?

The problem is that you are trying to predict the classes of data stored in a cell array. Import the data as a table first:
Home > Import > file name > Import (choose Table in the imported data options). You can then call the classifier on that table by name.

yfit = a vector of predicted class labels, one for each row of the predictor data in the table T.
T = Sample data, specified as a table. Each row of T corresponds to one observation, and each column corresponds to one predictor variable. Optionally, T can contain additional columns for the response variable and observation weights. T must contain all of the predictors used to train the classifier (trainedClassifier here). Multi-column variables and cell arrays other than cell arrays of strings are not allowed.
Test data: example (this assumes newdataset provides a predictor matrix X and a two-class label vector Y)
load newdataset
rng(1); % For reproducibility
CVSVMModel = fitcsvm(X,Y,'Holdout',0.15,'ClassNames',{'classname1','classname2'},...
'Standardize',true);
CompactSVMModel = CVSVMModel.Trained{1}; % Extract trained, compact classifier
testInds = test(CVSVMModel.Partition); % Extract the test indices
XTest = X(testInds,:);
yfit = predict(CompactSVMModel,XTest); % Predicted labels for the held-out rows
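Note that fitcsvm handles two classes only, so for the four classes in the question a multiclass wrapper is needed. The following is a minimal sketch, not necessarily what the app exports: it assumes the Statistics and Machine Learning Toolbox functions fitcecoc and templateSVM are available, and the predictor name 'value' and the new observations are made up for illustration. It also shows what T and yfit look like in the pattern from the export message.

```matlab
% Data from the question: column 1 is the predictor, column 2 the class (0-3).
myData = [9.36 0;8.72 0;9.13 0;7.38 0;8.02 0;12.15 1;11.02 1;11.61 1; ...
          12.31 1;15.23 1;52.92 2;54.49 2;48.82 2;52.00 2;49.79 2;22.46 3;30.38 3; ...
          21.98 3;24.46 3;26.08 3];
X = myData(:,1);
Y = myData(:,2);

% Multiclass model built from binary Gaussian-kernel SVM learners.
t = templateSVM('KernelFunction','gaussian','Standardize',true);
Mdl = fitcecoc(X, Y, 'Learners', t, 'PredictorNames', {'value'}); % 'value' is a hypothetical name

% T: a table of new observations whose column names match the predictor names.
newValues = [8.5; 13.0; 50.0; 25.0]; % hypothetical new data
T = array2table(newValues, 'VariableNames', Mdl.PredictorNames);

% yfit: one predicted class label per row of T.
yfit = predict(Mdl, T{:, Mdl.PredictorNames});
```

With a classifier exported from the app, the same last line applies with trainedClassifier in place of Mdl.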

Matlab: Change variable resolution and names for viewing regression trees

Using treeMine = fitctree(....) I can generate a decision tree, but the tree is very big, which makes it hard to read when displayed with view(treeMine,'Mode','Graph').
My question is therefore whether it is possible to change the variable names x1-x9 to something human-readable, and whether I can force the numbers to be shown in engineering notation, e.g. 10e3.
Does anybody know how this can be done?
Minimal Example
A minimal example can be built from MATLAB's own car example:
load carsmall
idxNaN = isnan(MPG + Weight);
X = Weight(~idxNaN);
Y = MPG(~idxNaN);
n = numel(X);
rng(1) % For reproducibility
idxTrn = false(n,1);
idxTrn(randsample(n,round(0.5*n))) = true; % Training set logical indices
idxVal = idxTrn == false; % Validation set logical indices
Mdl = fitrtree(X(idxTrn),Y(idxTrn));
view(Mdl,'Mode','graph')
How do you then specify the value resolution and the variable name?
About the names: it's a bit of a poor example because you use only one predictor (weight), but you can change the name with the 'PredictorNames' name-value pair, e.g.
Mdl = fitrtree(X(idxTrn),Y(idxTrn),'PredictorNames',{'weight'});
If you were to use more predictors you just have to add more elements to the cell array, e.g.
'PredictorNames',{'weight','age','women'}
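As a rough sketch of the multi-predictor case, assuming the carsmall variables Weight and Horsepower are used as the two predictors (that choice is only for illustration):

```matlab
load carsmall
X = [Weight Horsepower];            % one column per predictor
Y = MPG;
keep = ~isnan(Y);                   % drop rows with a missing response
% The names must be listed in the same order as the columns of X.
Mdl = fitrtree(X(keep,:), Y(keep), 'PredictorNames', {'Weight','Horsepower'});
view(Mdl, 'Mode', 'graph')          % split nodes now show Weight / Horsepower instead of x1 / x2
```

The names are stored in Mdl.PredictorNames, so they can be checked after training.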
I don't know about the numbers, though.

How do I actually execute a saved TensorFlow model?

Tensorflow newbie here. I'm trying to build an RNN. My input data is a set of vector instances of size instance_size representing the (x,y) positions of a set of particles at each time step. (Since the instances already have semantic content, they do not require an embedding.) The goal is to learn to predict the positions of the particles at the next step.
Following the RNN tutorial and slightly adapting the included RNN code, I create a model more or less like this (omitting some details):
inputs = self._input_data = tf.placeholder(tf.float32, [batch_size, num_steps, instance_size])
self._targets = tf.placeholder(tf.float32, [batch_size, num_steps, instance_size])
with tf.variable_scope("lstm_cell", reuse=True):
    lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(hidden_size, forget_bias=0.0)
if is_training and config.keep_prob < 1:
    lstm_cell = tf.nn.rnn_cell.DropoutWrapper(
        lstm_cell, output_keep_prob=config.keep_prob)
cell = tf.nn.rnn_cell.MultiRNNCell([lstm_cell] * config.num_layers)
self._initial_state = cell.zero_state(batch_size, tf.float32)
from tensorflow.models.rnn import rnn
inputs = [tf.squeeze(input_, [1])
          for input_ in tf.split(1, num_steps, inputs)]
outputs, state = rnn.rnn(cell, inputs, initial_state=self._initial_state)
output = tf.reshape(tf.concat(1, outputs), [-1, hidden_size])
softmax_w = tf.get_variable("softmax_w", [hidden_size, instance_size])
softmax_b = tf.get_variable("softmax_b", [instance_size])
logits = tf.matmul(output, softmax_w) + softmax_b
loss = position_squared_error_loss(
    tf.reshape(logits, [-1]),
    tf.reshape(self._targets, [-1]),
)
self._cost = cost = tf.reduce_sum(loss) / batch_size
self._final_state = state
Then I create a saver = tf.train.Saver(), iterate over the data to train it using the given run_epoch() method, and write out the parameters with saver.save(). So far, so good.
But how do I actually use the trained model? The tutorial stops at this point. From the docs on tf.train.Saver.restore(), in order to read back in the variables, I need to either set up exactly the same graph I was running when I saved the variables out, or selectively restore particular variables. Either way, that means my new model will require inputs of size batch_size x num_steps x instance_size. However, all I want now is to do a single forward pass through the model on an input of size num_steps x instance_size and read out a single instance_size-sized result (the prediction for the next time step); in other words, I want to create a model that accepts a different-size tensor than the one I trained on. I can kludge it by passing the existing model my intended data batch_size times, but that doesn't seem like a best practice. What's the best way to do this?
You have to create a new graph that has the same structure but with batch_size = 1, then import the saved variables with tf.train.Saver.restore(). This works because the shapes of the saved variables (the LSTM weights, softmax_w and softmax_b) depend on hidden_size and instance_size, not on batch_size. You can take a look at how they define multiple models with different batch sizes in ptb_word_lm.py: https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/rnn/ptb/ptb_word_lm.py
So you can have, for instance, a separate file where you instantiate the graph with the batch_size that you want and then restore the saved variables. After that you can execute your graph.

Matlab Neural Network for Classes - Unseen Data

Say I create a neural network to separate classes:
X1; %Some data in Class 1 100x2
X2; %Some data in Class 2 100x2
classInput = [X1;X2];
negative = zeros(N, 1);
positive = ones(N,1);
classTarget = [positive negative; negative positive];
net = feedforwardnet(20);
net = configure(net, classInput, classTarget);
net = train(net, classInput, classTarget);
%output of training data
output = net(classInput);
I can plot the classes and they are correctly separated:
figure();
hold on
style = {'ro' 'bx'};
for i=1:(2*N)
plot(classInput(i,1),classInput(i,2), style{round(output(i,1))+1});
end
However, how can I apply the network that's just been trained to unseen data? There must be a model which is generated by the network that can be applied to new data?
EDIT: Using sim:
Once the network is trained, if I use sim on the training data:
[Z,Xf,Af] = sim(net,classInput);
The result is as expected. But this only seems to work if the input is the same size. If, for example, I want to evaluate an individual data point:
[Z1,Xf,Af] = sim(net,[1,2]);
then size(Z) == size(Z1), which clearly doesn't make sense. Surely I can evaluate a single data point?
I'm the OP.
I had assumed that the rows of the input matrices were the data samples and the columns were the "categories"; it is the other way around. Transposing the matrices before passing them to the train() function fixes this.
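A minimal sketch of the transposed layout, assuming randomly generated stand-ins for X1 and X2 (the real data is not shown in the question) and the Neural Network Toolbox function feedforwardnet:

```matlab
N = 100;
rng(0);                                 % for reproducibility
X1 = randn(N,2) + 2;                    % hypothetical class 1 samples, N x 2
X2 = randn(N,2) - 2;                    % hypothetical class 2 samples, N x 2

% Columns are samples, rows are features/outputs.
classInput  = [X1; X2]';                                        % 2 x 2N
classTarget = [ones(1,N) zeros(1,N); zeros(1,N) ones(1,N)];     % 2 x 2N

net = feedforwardnet(20);
net.trainParam.showWindow = false;      % suppress the training GUI
net = train(net, classInput, classTarget);

% A single unseen point is passed as one column.
newPoint = [1; 2];
scores = net(newPoint);                 % 2 x 1, one score per class
[~, predictedClass] = max(scores);      % index of the larger score
```

With this orientation the output has one column per input sample, so a single point gives a single column of class scores.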

Libsvm dummy labels interferring with prediction

I'm trying to simulate out-of-sample prediction with a binary classifier using libsvm in MATLAB. My target variable (i.e. my label) is binary (-1, +1). In my test set there are series for which I don't know the label, so I created a new dummy label for these observations (the label is 747). I found that this 747 label shows up in my predicted_label_test vector (see code below). Does that mean the prediction I get is influenced by the labels of the data in the test set, which is exactly what I'm supposed to predict? The mistake may be in the way I use the libsvm read and write functions, but I can't find it. Many thanks!
%%%%%%%%%% GET DATA FROM THE CSV FILE AND CONVERT THEM TO LIBSVM
addpath('C:\libsvm1\matlab'); % add the libsvm MATLAB interface to the path
ALLDATA = csvread('DATACSV.csv'); % read a csv file
labels = ALLDATA(:, 1); % labels are included in the first column of data
labels_sparse = sparse (labels); %? needed
features = ALLDATA(:, 4:end); % features start at 4th column
features_sparse = sparse(features); % features must be in a sparse matrix
libsvmwrite('TTT.train', labels_sparse, features_sparse); % write the file to libsvm format
[label_vector, predictors_matrix] = libsvmread('C:\libsvm1\matlab\TTT.train'); % read the file that was recorded in Libsvm format
%%%%% DEFINE VECTOR AND MATRIX SIZE
label_vector_train = label_vector (1:143,:);
predictors_matrix_train = predictors_matrix (1:143,:);
label_vector_test = label_vector (144:193,:);
predictors_matrix_test = predictors_matrix (144:193,:);
%PREDICTION
param = ['-q -c 2 -g 3'];
bestModel = svmtrain(label_vector_test, predictors_matrix_test, param);
[predicted_label_test, accuracy, prob_values] = svmpredict(label_vector_test, predictors_matrix_test, bestModel);
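You are training the SVM model on the test data, when you should train it on the training data. Because the model was fitted on the test rows, it learned the dummy label 747 as a class, which is why 747 appears in the predictions. This line: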
bestModel = svmtrain(label_vector_test, predictors_matrix_test, param);
should be:
bestModel = svmtrain(label_vector_train, predictors_matrix_train, param);
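A minimal sketch of the corrected flow on synthetic data, assuming the libsvm MATLAB interface is on the path ahead of any toolbox function that is also called svmtrain; the data and the split are made up for illustration:

```matlab
rng(0);                                          % for reproducibility
X = sparse([randn(100,2) + 1.5; randn(100,2) - 1.5]);  % hypothetical features
y = [ones(100,1); -ones(100,1)];                 % binary labels (+1 / -1)

idx      = randperm(200);
trainIdx = idx(1:150);
testIdx  = idx(151:end);

% Fit on the training rows only.
model = svmtrain(y(trainIdx), X(trainIdx,:), '-q -c 2 -g 3');

% The label argument of svmpredict is only used to compute accuracy,
% so placeholder values do not change the predicted labels.
dummyLabels = zeros(numel(testIdx), 1);
[predicted, ~, ~] = svmpredict(dummyLabels, X(testIdx,:), model);
```

Here predicted can only contain labels seen during training, and the reported accuracy is meaningless when placeholder labels are supplied.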

How to use classifiers

I want to use SVM, kNN and AdaBoost classifiers on my data features. I wrote code that calculates the frame differences and then the features (eigenvalues, strain energy, potential energy, ...), and builds an array of size [number of frames, number of features]. I tried to use the SVM like this:
Features = data; % Features array [40, 5]
class = ones(numFrames-1, 1); % numFrames=41
class(1:(fix(numFrames/2))) = -1;
SVMstruct = svmtrain(Features, class, 'Kernel_Function', 'rbf');
newclass = svmclassify(SVMstruct, [40 5]); %Test data
I got an error:
The number of columns in TEST and training data must be equal.
%classperf(cp,newclass); % performance of the classifier given by cp
What is the reason for this error? And how do I use further classifiers with this feature set?
I can infer the following from the error you are getting.
There is no error in svmtrain, which means size(Features) = [40 5]. The error is in the last line. See the syntax of svmclassify: you must pass test data that has the same number of features/columns as the training data (5 in your case). Instead you are passing the size itself, [40 5], which is a 1-by-2 vector and so has only two columns. Pass the actual test set of n rows and 5 columns. The last line should be
newclass= svmclassify(SVMstruct,testData); %where size(testData)=[n 5], n indicates how many test samples you have.
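For the other classifiers mentioned in the question, one possible sketch is below. It assumes a Statistics and Machine Learning Toolbox release that provides fitcsvm, fitcknn and fitensemble (rather than the older svmtrain/svmclassify pair), and it uses random stand-in data because the real features are not shown:

```matlab
rng(0);                                              % for reproducibility
Features = [randn(20,5) - 1; randn(20,5) + 1];       % hypothetical 40 x 5 feature array
labels   = [-ones(20,1); ones(20,1)];                % one label per frame
testData = randn(10,5);                              % hypothetical test set, n x 5

% Every classifier is trained on the same 5 columns and
% must be given test data with those same 5 columns.
svmModel = fitcsvm(Features, labels, 'KernelFunction', 'rbf');
knnModel = fitcknn(Features, labels, 'NumNeighbors', 5);
adaModel = fitensemble(Features, labels, 'AdaBoostM1', 100, 'Tree');

svmLabels = predict(svmModel, testData);
knnLabels = predict(knnModel, testData);
adaLabels = predict(adaModel, testData);
```

Each predict call returns one label per row of testData, so the three label vectors can be compared directly.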