Calculation of Accuracy for Each Class for Multi class classification - matlab

I am working on Multi-Class classification problem, where I have designed
SVM classifier
for
31 different classes
After the classification process I got the predicted labels now I want to calculate the accuracy rate for each distinct class. Although I have calculated the overall accuracy of model but I want more detail.
Can some one provide me Matlab code for this purpose or guide me to calculate the desired result.

Related

Poor performance for SVM for unbalanced dataset- how to improve?

Consider a dataset A which has examples for training in a binary classification problem. I have used SVM and applied the weighted method (in MATLAB) since the dataset is highly imbalanced. I have applied weights as inversely proportional to the frequency of data in each class. This is done on training using the command
fitcsvm(trainA, trainTarg , ...
'KernelFunction', 'RBF', 'KernelScale', 'auto', ...
'BoxConstraint', C,'Weight',weightTrain );
I have used 10 folds cross-validation for training and learned the hyperparameter as well. so, inside CV the dataset A is split into train (trainA) and validation sets (valA). After training is over and outside the CV loop, I get the confusion matrix on A:
80025 1
0 140
where the first row is for the majority class and the second row is for the minority class. There is only 1 false positive (FP) and all minority class examples have been correctly classified giving true positive (TP) = 140.
PROBLEM: Then, I run the trained model on a new unseen test data set B which was never seen during training. This is the confusion matrix for testing on B .
50075 0
100 0
As can be seen, the minority class has not been classified at all, hence the purpose of weights has failed. Although, there is no FP the SVM fails to capture the minority class examples.
I have not applied any weights or balancing method such as sampling (SMOTE, RUSBoost etc) on B. What could be wrong and how to overcome this problem?
Class misclassification weights could be set instead of sample weights!
You can set the class weights based on the following example.
Mis-classification weight for class A(n-records; dominant) into class B (m-records; minority class) can be n/m.
Mis-classification weight For class B as class A can be set as 1 or m/n based on the severity, which you want to impose on the learning
c=[0 2.2;1 0];
mod=fitcsvm(X,Y,'Cost',c)
According to documentation:
For two-class learning, if you specify a cost matrix, then the
software updates the prior probabilities by incorporating the
penalties described in the cost matrix. Consequently, the cost matrix
resets to the default. For more details on the relationships and
algorithmic behavior of BoxConstraint, Cost, Prior, Standardize, and
Weights, see Algorithms.
Area Under Curve (AUC) is usually used to measure performance of models that applied on unbalanced data. It is also good to plot ROC curve to visually get more insights. Using only confusion matrix for such models may lead to misinterpretation.
perfcurve from the Statistics and Machine Learning Toolbox provides both functionalities.

Random component on fitcsvm/predict

I have a train dataset and a test dataset, and I train a SVM with fitcsvm in MATLAB. Then, I proceed to test the trained model with predict. I'm always using the same datasets, but I keep getting different AUCs for the same model, which makes me wonder where in the process is there a random component. Note that
I'm aware of the fact that formally there isn't such thing as ROC curve or AUC and
I'm not asking for the statistical background of the SVM problem. It is relative to the matlab implementation of the training/test algorithm. I expected to have the same results because the training algorithm is, afaik, a deterministic process.

How to change a binary KNN classifier to be a SVM classifier?

I am classifying gender using a KNN classifier.
I want to add an SVM classifier instead of KNN classifier with the same labels of 0 and 1 (0 for women and 1 for men)
I have a matrix of test examples, sample, a matrix of training examples, training, and a vector with the labels for the training examples group. I want class, a vector of the labels for the test examples.
class = knnclassify(sample, training, group);
if class==1
x='Male';
else
x='Female';
end
How can I change this code to find class using an SVM?
To train an SVM, you will need the Statistics and Machine Learning Toolbox.
The biggest difference between the knnclassify and using an SVM classifier is that training and classifying new labels will be two separate steps.
1. Train your SVM : fitcsvm
This step teaches the classifier how to distinguish between your two classes. It is learning a linear separator (or a weighted combination of the features) which has the largest margin between positive and negative examples. All the examples you give it need to have ground truth labels.
SVM's have many tunable parameters that you can adjust during the training step. There are several good tutorials in the Matlab documentation which describe the differences, but for the most basic version, you can just use your training examples
model = fitcsvm(training,group);
This model will be used in the next step.
2. Classify new examples : predict
To classify your new example, run
class = predict(sample, model);
Notes:
Using your model, you can also run cross-fold validation, useful for accuracy analysis.
cvModel = crossval(model);
classError = kfoldLoss(cvModel);
You can also save your model, like any other Matlab variable for future use.
save('model.m', 'model');
knnclassify comes from the bioinformatics toolbox. In the Statistics and Machine Learning Toolbox, there is also a KNN model which you train with fitcknn and classify with predict. The benefit is that you can reuse your KNN model with several sets of data, compare cross-validation results, and save it for future use.

how to combine two models in matlab for classification?

Hi All I made a NN model for classification and give me what I want also I made KNN which gives me higher accuracy but in my model I want to combine both so both gives me higher accuracy so how I can do that in matlab? ( I am not professional in matlab but now I can understand the code and make the network and knn )

h2o random forest calculating MSE for multinomial classification

Why is h2o.randomforest calculating MSE on Out of bag sample and while training for a multinomail classification problem?
I have done binary classification also using h2o.randomforest, there it used to calculate AUC on out of bag sample and while training but for multi classification random forest is calculating MSE which seems suspicious. Please see this screenshot.
My target variable was a factor containing 4 factor levels model1, model2, model3 and model4. In the screenshot you would also a confusion matrix for these factors.
Can someone please explain this behaviour?
Both binomial and multinomial classification display MSE, so you will see it in the Scoring History table for both models (highlighted training_MSE column).
H2O does not evaluate a multinomial AUC. A few evaluation methods exist, but there is not yet a single widely adopted method. The pROC package discusses the method of Hand and Till, but mentions that it cannot be plotted and results rarely tested. Log loss and classification error are still available, specific to classification, as each has standard methods of evaluation in a multinomial context.
There is a confusion matrix comparing your 4 factor levels, as you highlighted. Can you clarify what more you are expecting? If you were looking for four individual confusion matrices, the four-column table contains enough information that they could be computed.