How do I use Weka's MLP output prediction model in Matlab?

I'm trying to do prediction in Matlab using the output of Weka's single-layer MLP.
In my case I have a single hidden layer with 100 nodes and 200 features.
I'm running Weka 3.7.10, and the options for weka.classifiers.functions.MultilayerPerceptron are
-L 0.3 -M 0.2 -N 500 -V 0 -S 0 -E 20 -H a
Node 0 is Linear in my case, not Sigmoid.
Nodes 1-100 are all Sigmoid. I didn't use the -C or -I option, so by default Weka normalizes the data.
When I try to compute the predicted value in Matlab, I don't get the same value as in Weka.
In the following code:
featvals is my feature vector (stored as 200 rows).
featweightsall holds the 200 feature weights for each of the 100 nodes that Weka provides.
nodeweights holds the 100 node weights that Weka provides.
nodethresh holds the 101 thresholds that Weka provides (rows 2-101 contain the thresholds for Nodes 1-100, and row 1 contains the threshold for Node 0).
featvalsnorm = interp1([min(featvals) max(featvals)],[-1 1],featvals,'linear');
featvalsnorm2 = (featvals - min(featvals))/(max(featvals)-min(featvals));
for j = 1:100
featweights = featweightsall( ((j-1)*200+1):(j*200));
x = sum(featvalsnorm.*featweights) + nodethresh(j+1);
nodenorm(j) = 1/(1+exp(-x));
end
predvalnorm = sum(nodenorm.*nodeweights)+nodethresh(1);
predval = interp1([-1 1],[min(featvals) max(featvals)],predvalnorm,'linear');
What's wrong with the code? Am I supposed to un-normalize in the loop, or apply the
sigmoid to Node 0? I've tried many of these combinations, and nothing's working. Any help would be appreciated!
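One possible cause, offered as a sketch rather than a confirmed diagnosis: Weka's MultilayerPerceptron is understood to scale each attribute to [-1, 1] using that attribute's minimum and maximum over the training set, and to scale a numeric class the same way. The code above instead takes min and max across the 200 features of a single instance. The names attrMin, attrMax, classMin and classMax below are hypothetical and would have to be computed from the training data; the toy sizes (3 features, 2 hidden nodes) and all numbers are made up for illustration.

```matlab
% Hypothetical training-set statistics (assumptions, not Weka output).
attrMin  = [0; 10; -5];    % per-attribute minimum over the training set
attrMax  = [1; 20;  5];    % per-attribute maximum over the training set
classMin = 100;            % minimum of the numeric class in the training set
classMax = 200;            % maximum of the numeric class in the training set

% Toy model: 3 features, 2 sigmoid hidden nodes, 1 linear output node.
featvals   = [0.5; 20; -5];                  % one instance to score
W          = [0.1 -0.2 0.3; 0.4 0.5 -0.6];   % hidden weights, one row per node
hiddenBias = [0.05; -0.05];                  % thresholds of the hidden nodes
outW       = [0.7; -0.8];                    % weights into the linear output node
outBias    = 0.1;                            % threshold of the output node

% Scale each attribute to [-1, 1] with its own training-set range.
featnorm = 2 * (featvals - attrMin) ./ (attrMax - attrMin) - 1;

% Forward pass: sigmoid hidden layer, then a linear output node.
hidden   = 1 ./ (1 + exp(-(W * featnorm + hiddenBias)));
prednorm = outW' * hidden + outBias;

% Map the normalized output back to the range of the class attribute.
predval = (prednorm + 1) / 2 * (classMax - classMin) + classMin;
```

If this assumption about the normalization holds, the output is un-normalized once, after the linear output node, and no sigmoid is applied to Node 0.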

Related

Predicting same values for entire test Set in MATLAB using LIBSVM

I am using Support Vector Regression (SVR) from the libsvm package to predict outputs. Kernel: RBF
Train set size : 729x40
Test set size : 137x40
The predictions on the train set look fine when measured against the ground truth, but the predictions on the test set are all the same value.
After checking the related posts, I normalized the data and tried values of gamma from 10 to 100000, but the problem persists.
trainGT=games(((games(:,46)>=2010) & (games(:,46)<2015) & (games(:,1)~=8)),43);
featuresTrain=lastGame(games,true,1);
testGT=games((games(:,46)>=2015 & (games(:,1)~=8)),43);
featureTest=lastGame(games,false,1);
model = svmtrain(trainGT, featuresTrain, '-s 4 -t 2 -c 10 -g 10');
w = (model.sv_coef' * full(model.SVs));
b = -model.rho;
predictionsTrain = svmpredict(trainGT, featuresTrain,model);
predictionsTest = svmpredict(zeros(length(testGT),1), featureTest, model);
My output is as follows:
optimization finished, #iter = 1777
epsilon = 0.630588
obj = -19555.036253, rho = -17.470386
nSV = 681, nBSV = 118
Mean squared error = 305.214 (regression)
Squared correlation coefficient = -1.#IND (regression)
All my predictionsTest values are 17.4704 (which equals -rho from the output above). Can someone please help me with this? Thanks.
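A plausible reading of this symptom, sketched under assumptions: the SVR decision value is sum(sv_coef .* K(SV, x)) - rho, so if every RBF kernel value between a test point and the support vectors is practically zero, each prediction collapses to -rho. A large gamma makes exactly that happen for any point that is not very close to a support vector. The two feature vectors below are made up for illustration.

```matlab
% Two hypothetical feature vectors (made-up numbers).
u = [12 3 40];
v = [15 1 38];

% Squared Euclidean distance: 9 + 4 + 4 = 17.
d2 = sum((u - v).^2);

% RBF kernel exp(-gamma * |u - v|^2) for two choices of gamma.
kLargeGamma = exp(-10   * d2);   % gamma = 10: effectively zero
kSmallGamma = exp(-0.01 * d2);   % gamma = 0.01: clearly non-zero

fprintf('gamma = 10:   K = %g\n', kLargeGamma);
fprintf('gamma = 0.01: K = %g\n', kSmallGamma);
```

Since the gammas tried so far were all 10 or larger, it may be worth trying much smaller values (for example around 1/num_features, the libsvm default) after scaling the features.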

CIFAR-10 pixelwise training with libSVM matlab

Training on the 50000 training images, with feature vectors of 32x32x3 = 3072 dimensions, makes my computer hang. Is there a workaround I'm missing for using libSVM efficiently for multiclass SVM classification? After a full day, the SVM is still training the first class in a one-vs-all framework.
*I am aware that raw pixel values are a poor basis for classification, but I still want to run this as a lower-bound benchmark for a study.
Code:
clc;close all;clear all;
addpath(genpath('./libsvm-3.21'));
addpath(genpath('./liblinear-2.1'));
%load all images:
M1 = load('../Data/cifar-10-batches-mat/data_batch_1.mat');
M2 = load('../Data/cifar-10-batches-mat/data_batch_2.mat');
M3 = load('../Data/cifar-10-batches-mat/data_batch_3.mat');
M4 = load('../Data/cifar-10-batches-mat/data_batch_4.mat');
M5 = load('../Data/cifar-10-batches-mat/data_batch_5.mat');
M = [M1.data; M2.data; M3.data; M4.data; M5.data];
M_labels = [M1.labels; M2.labels; M3.labels; M4.labels; M5.labels];
M_labels_double = double(M_labels);
M_double = double(M)/255.0;
%M_double is the dataset of [50000x3072]
%M_labels_double are the labels and has size of [50000x1]
model=cell(10,1);
for i=1:10
model{i} = svmtrain(double(M_labels_double==i),M_double,'-t 0 -c 1 -g 0.2 -b 1 -m 4000');
end

Why does `libsvm` in Matlab give me all-1 predictions?

I use svm in R and Matlab with the same dataset.
My R code works fine and gives me reasonable predictions.
matdat <- readMat(con = "data.mat")
svm.model <- svm(x = matdat$normalize.X, y = matdat$Yt)
pred <- predict(svm.model, newdata = matdat$normalize.X)
pred <- sapply(pred, function(x){ifelse(x > 0, 1, -1)})
sum(pred == matdat$Yt)/length(matdat$Yt)
But my Matlab code predicts 1 for every sample of the training data.
load('data.mat')
model2 = svmtrain(Yt, normalize_X,'-s 3 -c 1 -t 2 -p 0.1');
[predicted_label,accuracy, decision_values] = svmpredict(Yt, normalize_X, model2);
I have checked the default parameters of svm{e1071}, which in my opinion agree with the Matlab version.
I use the e1071 package, version 1.6-7, in R, and the latest libsvm from the official page.
So, how can I find the reason? Any ideas?
==== update====
Before feeding the data to libsvm in Matlab, I now apply mapstd to normalize it, which R does automatically. With that, I get the same trained model in both R and Matlab.
In Matlab you use the -s 3 option which is regression, not classification.
As a starting point, don't assume anything about default parameters, just specify parameters explicitly in both R and Matlab.
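The standardization mentioned in the update can also be written out by hand, which makes the orientation explicit. This is a minimal sketch that assumes samples are in rows and features in columns, as libsvm expects; the matrix X is made up, and in practice mu and sigma would come from the training data and be reused on the test data.

```matlab
% Toy data: 4 samples (rows), 2 features (columns).
X = [1 10; 2 20; 3 30; 4 40];

% Column-wise mean and standard deviation.
mu    = mean(X, 1);
sigma = std(X, 0, 1);

% Z-score each column; bsxfun keeps this working on older Matlab releases.
Xz = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);

disp(mean(Xz, 1));    % approximately 0 for every column
disp(std(Xz, 0, 1));  % approximately 1 for every column
```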

matlab libsvm: unable to predict

I am using libsvm on Matlab. I want to build a model and use this model for prediction.
It is weird that the return values of svmpredict ([predict_label, accuracy_all, prob_values]) are empty. Here is my simple code:
svm_model = svmtrain([train_label],[train],'-t 2, -c 100 -q');
[predict_label, accuracy_all, prob_values] = svmpredict(testlabels,testdata,svm_model,'-q, -b 1');
[predict_label, accuracy_all, prob_values] are all 0x0 matrices. Matlab also prints the following usage message:
Usage: [predicted_label, accuracy, decision_values/prob_estimates] = svmpredict(testing_label_vector, testing_instance_matrix, model, 'libsvm_options')
Parameters:
model: SVM model structure from svmtrain.
libsvm_options:
-b probability_estimates: whether to predict probability estimates, 0 or 1 (default 0); one-class SVM not supported yet
Returns:
predicted_label: SVM prediction output vector.
accuracy: a vector with accuracy, mean squared error, squared correlation coefficient.
prob_estimates: If selected, probability estimate vector.
Can anyone help me?
What is q in the SVM model? Where is its value?
There are two parameters in the SVM that need to be well defined, c and g. You have set c to 100, but there is no value for q (or perhaps it should be g, which is gamma).
This is what you need:
cmd = ['-t 2 -c ',num2str(C), ' -g ',num2str(gamma) ];
model = svmtrain2(trainClass, trainData, cmd);
[predClass, acc, decVals] = svmpredict(testClass, testData, model);
Also, I think svmtrain should be renamed svmtrain2 to avoid confusion with Matlab's own svmtrain function.
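Separately from the choice of parameters, note that the option strings in the question contain commas ('-t 2, -c 100 -q' and '-q, -b 1'), while the libsvm usage text shows options separated by spaces only. A small sketch of building such a string with sprintf follows; C and gammaVal are hypothetical values chosen for illustration.

```matlab
% Hypothetical parameter values.
C        = 100;
gammaVal = 0.5;   % not named 'gamma' to avoid shadowing Matlab's gamma function

% Options are separated by spaces only, with no commas.
trainOpts = sprintf('-t 2 -c %g -g %g -q', C, gammaVal);
disp(trainOpts);
```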

LIBSVM data preparation

I am doing a project on Image processing in Matlab and wish to implement LIBSVM for supervised learning.
I am encountering a problem in data preparation.
I have the data in CSV format, and I tried to convert it into libsvm format using the instructions in the LIBSVM FAQ:
matlab> SPECTF = csvread('SPECTF.train'); % read a csv file
matlab> labels = SPECTF(:, 1); % labels from the 1st column
matlab> features = SPECTF(:, 2:end);
matlab> features_sparse = sparse(features); % features must be in a sparse matrix
matlab> libsvmwrite('SPECTFlibsvm.train', labels, features_sparse);
I get the data in the following form:
3.0012 1:2.1122 2:0.9088 ......
[value 1] [index 1]:[value 2] [index 2]:[value 3]
That is, the first value has no index, and the value following index 1 is value 2.
From what I have read, the data should be in the following format:
[label] [index 1]:[value 1] [index 2]:[value 2]......
[label] [index 1]:[value 1] [index 2]:[value 2]......
I need help to make this right.
Also, if anyone could give me a clue about how to assign labels, it would be really helpful.
Thanking you in advance,
Sidra
You don't have to write data to a file, you can instead use the Matlab interface to LIBSVM. This interface consists of two functions, svmtrain and svmpredict. Each function prints a help text if called without arguments:
Usage: model = svmtrain(training_label_vector, training_instance_matrix, 'libsvm_options');
libsvm_options:
-s svm_type : set type of SVM (default 0)
0 -- C-SVC
1 -- nu-SVC
2 -- one-class SVM
3 -- epsilon-SVR
4 -- nu-SVR
-t kernel_type : set type of kernel function (default 2)
0 -- linear: u'*v
1 -- polynomial: (gamma*u'*v + coef0)^degree
2 -- radial basis function: exp(-gamma*|u-v|^2)
3 -- sigmoid: tanh(gamma*u'*v + coef0)
4 -- precomputed kernel (kernel values in training_instance_matrix)
-d degree : set degree in kernel function (default 3)
-g gamma : set gamma in kernel function (default 1/num_features)
-r coef0 : set coef0 in kernel function (default 0)
-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
-m cachesize : set cache memory size in MB (default 100)
-e epsilon : set tolerance of termination criterion (default 0.001)
-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
-b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)
-v n : n-fold cross validation mode
-q : quiet mode (no outputs)
and
Usage: [predicted_label, accuracy, decision_values/prob_estimates] = svmpredict(testing_label_vector, testing_instance_matrix, model, 'libsvm_options')
Parameters:
model: SVM model structure from svmtrain.
libsvm_options:
-b probability_estimates: whether to predict probability estimates, 0 or 1 (default 0); one-class SVM not supported yet
Returns:
predicted_label: SVM prediction output vector.
accuracy: a vector with accuracy, mean squared error, squared correlation coefficient.
prob_estimates: If selected, probability estimate vector.
Example code for training a linear SVM on a data set of four points with three features:
training_label_vector = [1 ; 1 ; -1 ; -1];
training_instance_matrix = [1 2 3 ; 3 4 5 ; 5 6 7; 7 8 9];
model = svmtrain(training_label_vector, training_instance_matrix, '-t 0');
Applying the resulting model to test data
testing_instance_matrix = [9 5 1; 2 9 5];
predicted_label = svmpredict(nan(2, 1), testing_instance_matrix, model)
results in
predicted_label =
-1
-1
You can also pass the true testing_label_vector to svmpredict so that it directly computes the accuracy; here I replaced the true labels with NaNs.
Please note that there is also a function svmtrain in Matlab's Statistics Toolbox which is incompatible with the one from LIBSVM – make sure you call the correct one.
As @A.Donda answers, you don't have to convert the data to 'libsvm' format if you can do the training and prediction in Matlab.
When you want to do the training and prediction with the standalone tools on Windows or Linux, you have to put the data in 'libsvm' format.
Judging from your output, I think your CSV file has no label column, so the first feature was taken as the label. You should put the label in front of the features on every line of the data.
matlab> SPECTF = csvread('SPECTF.train'); % read a csv file
matlab> features = SPECTF(:, :); % because there are no labels in your csv file
matlab> labels = [??];% to add the label as your plan
matlab> features_sparse = sparse(features); % features must be in a sparse matrix
matlab> libsvmwrite('SPECTFlibsvm.train', labels, features_sparse);
You should tell us more about your data so we can help you with the labels. BTW, labels are usually assigned by the user up front, and you can use any integer as the label for each kind of data.
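To make the target layout concrete, here is a small sketch that formats a single instance as one line of libsvm text, with the label first and then index:value pairs for the non-zero features. The label and feature values are made up, and this only illustrates the format; it is not how libsvmwrite is implemented.

```matlab
% One hypothetical instance: a label chosen by the user plus three features.
label    = 1;
features = [2.1122 0 0.9088];

% libsvm's sparse text format lists only the non-zero features.
idx = find(features ~= 0);

% sprintf walks the 2-by-n matrix column by column: index, value, index, value, ...
pairs = sprintf(' %d:%g', [idx; features(idx)]);
row   = sprintf('%g%s', label, pairs);

disp(row);   % 1 1:2.1122 3:0.9088
```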