Why is the Jaccard similarity score the same as the accuracy score in binary classification? - classification

In the sklearn documentation (https://scikit-learn.org/0.15/modules/generated/sklearn.metrics.jaccard_similarity_score.html), it says:
In binary and multiclass classification, this function is equivalent to the accuracy_score. It differs in the multilabel classification problem.
However, according to the Jaccard index article on Wikipedia (https://en.wikipedia.org/wiki/Jaccard_index),
it clearly differs from the accuracy score ((TP+TN)/(TP+FP+FN+TN)).
Could someone explain which is correct, and why?

The 0.15 documentation is outdated. Check the stable version, which no longer states that metrics.jaccard_score is the same as accuracy for binary classification.
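A small sketch with made-up labels shows the difference in the current API: jaccard_score counts only true positives in the numerator and excludes true negatives from the denominator, so it generally differs from accuracy even in the binary case.

```python
# Compare accuracy and Jaccard on a toy binary problem.
from sklearn.metrics import accuracy_score, jaccard_score

y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]

# accuracy = (TP + TN) / total = (2 + 2) / 6
acc = accuracy_score(y_true, y_pred)
# Jaccard = TP / (TP + FP + FN) = 2 / (2 + 1 + 1); TN are ignored
jac = jaccard_score(y_true, y_pred)
print(acc, jac)  # 0.666... vs 0.5
```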

Related

Matlab fitcsvm Feature Coefficients

I'm running a series of SVM classifiers for a binary classification problem, and am getting very nice results in terms of classification accuracy.
The next step of my analysis is to understand how the different features contribute to the classification. According to the documentation, Matlab's fitcsvm function returns a class, SVMModel, which has a field called "Beta", defined as:
Numeric vector of trained classifier coefficients from the primal linear problem. Beta has length equal to the number of predictors (i.e., size(SVMModel.X,2)).
I'm not quite sure how to interpret these values. I assume higher values represent a greater contribution of a given feature to the support vector? What do negative weights mean? Are these weights somehow analogous to beta parameters in a linear regression model?
Thanks for any help and suggestions.
----UPDATE 3/5/15----
In looking closer at the equations describing the linear SVM, I'm pretty sure Beta must correspond to w in the primal form.
The only other parameter is b, which is just the offset.
Given that, and given this explanation, it seems that taking the square or absolute value of the coefficients provides a metric of relative importance of each feature.
As I understand it, this interpretation only holds for the linear binary SVM problem.
Does that all seem reasonable to people?
Intuitively, one can think of the absolute value of a feature weight as a measure of its importance. However, this is not true in the general case, because the weights represent how much a marginal change in the feature value would affect the output, which means they depend on the feature's scale. For instance, if we have a feature for "age" measured in years and we change it to months, the corresponding coefficient will be divided by 12, but clearly it doesn't mean that age is less important now!
The solution is to scale the data (which is usually a good practice anyway).
If the data is scaled your intuition is correct and in fact, there is a feature selection method that does just that: choosing the features with the highest absolute weight. See http://jmlr.csail.mit.edu/proceedings/papers/v3/chang08a/chang08a.pdf
Note that this is correct only for linear SVMs.
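The idea above can be sketched in Python with scikit-learn's LinearSVC (the data and model here are assumptions for illustration, not the asker's fitcsvm setup): with standardized features, |coef_| plays the role of |w| from the primal form and ranks feature importance.

```python
# Sketch: rank features of a linear SVM by |w| after scaling.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
# Only the first feature drives the label; the other two are noise.
y = (X[:, 0] > 0).astype(int)

X_scaled = StandardScaler().fit_transform(X)
clf = LinearSVC(C=1.0).fit(X_scaled, y)

# Absolute weights, analogous to |w| in the primal problem.
importance = np.abs(clf.coef_.ravel())
print(importance)  # first entry dominates
```

The sign of a weight only tells you which class a feature pushes toward; the magnitude (after scaling) is what reflects importance.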

Binary classification with KNN

I post here because I don't know how to improve the performance of my binary KNN.
The problem is that I have 99.8% Specificity and only 82% Sensitivity, but I'd rather have more Sensitivity than Specificity.
I'm new to this field, and I've only been working on this for 1 month.
In my study I used an anomaly detector trained on only one class, and in that case, in order to raise the Sensitivity of the kNN classifier, I increased the value of the threshold...
Now that I have to compare my anomaly detector with a two-class classifier, it seems that KNN works better in the first case... The geometric mean of Sensitivity and Specificity (√(Se·Sp)) is 0.95 for the one-class classifier, but only 0.91 for the two-class one because of the low Sensitivity. I expected the exact opposite...
Can anyone help me?
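One option, sketched below on assumed synthetic data: a two-class KNN also has a tunable threshold. Instead of the default majority vote (threshold 0.5 on the fraction of positive neighbors), lower the threshold on predict_proba to trade Specificity for Sensitivity, analogous to the one-class case.

```python
# Sketch: raise KNN sensitivity by lowering the neighbor-vote threshold.
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
X_neg = rng.normal(0.0, 1.0, size=(300, 2))   # majority class
X_pos = rng.normal(1.5, 1.0, size=(60, 2))    # minority (positive) class
X = np.vstack([X_neg, X_pos])
y = np.array([0] * 300 + [1] * 60)

knn = KNeighborsClassifier(n_neighbors=15).fit(X, y)
proba = knn.predict_proba(X)[:, 1]  # fraction of positive neighbors

results = {}
for thresh in (0.5, 0.3):
    y_pred = (proba >= thresh).astype(int)
    tn, fp, fn, tp = confusion_matrix(y, y_pred).ravel()
    se, sp = tp / (tp + fn), tn / (tn + fp)
    results[thresh] = (se, sp)
    print(f"threshold={thresh}: Se={se:.2f} Sp={sp:.2f} "
          f"gmean={np.sqrt(se * sp):.2f}")
```

Lowering the threshold can only add positive predictions, so Sensitivity never decreases and Specificity never increases; pick the threshold that maximizes your geometric mean.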

Support Vector Machine vs K Nearest Neighbours

I have a data set to classify. Using the KNN algorithm I get an accuracy of 90%, whereas with SVM I am only able to get around 70%. Isn't SVM supposed to be better than KNN? I know this might be a stupid question, but what parameters for SVM would give results approximately as good as the KNN algorithm? I am using the libsvm package in Matlab R2008.
kNN and SVM represent different approaches to learning. Each approach implies a different model for the underlying data.
SVM assumes there exists a hyperplane separating the data points (quite a restrictive assumption), while kNN attempts to approximate the underlying distribution of the data in a non-parametric fashion (a crude approximation of a Parzen-window estimator).
You'll have to look at the specifics of your scenario to make a better decision as to what algorithm and configuration are best used.
It really depends on the dataset you are using. If you have something like the first row of this image (http://scikit-learn.org/stable/_images/plot_classifier_comparison_1.png), kNN will work really well and linear SVM really badly.
If you want SVM to perform better, you can use a kernel-based SVM like the one in the picture (it uses an RBF kernel).
If you are using scikit-learn for python you can play a bit with code here to see how to use the Kernel SVM http://scikit-learn.org/stable/modules/svm.html
kNN basically says "if you're close to coordinate x, then the classification will be similar to observed outcomes at x." In SVM, a close analog would be using a high-dimensional kernel with a "small" bandwidth parameter, since this will cause SVM to overfit more. That is, SVM will be closer to "if you're close to coordinate x, then the classification will be similar to those observed at x."
I recommend that you start with a Gaussian kernel and check the results for different parameters. From my own experience (which is, of course, focused on certain types of datasets, so your mileage may vary), tuned SVM outperforms tuned kNN.
Questions for you:
1) How are you selecting k in kNN?
2) What parameters have you tried for SVM?
3) Are you measuring accuracy in-sample or out-of-sample?
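To make the comparison concrete, here is a hedged sketch in Python (the dataset is synthetic; in Matlab/libsvm the corresponding knobs are the -c and -g options) that tunes an RBF-kernel SVM by cross-validated grid search over C and gamma, then compares it with kNN out-of-sample:

```python
# Sketch: tune an RBF SVM and compare with kNN on held-out data.
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.25, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
svm = GridSearchCV(SVC(kernel="rbf"),
                   {"C": [0.1, 1, 10], "gamma": [0.1, 1, 10]},
                   cv=5).fit(X_tr, y_tr)

print("kNN test accuracy:", knn.score(X_te, y_te))
print("SVM test accuracy:", svm.score(X_te, y_te))
print("best SVM params:", svm.best_params_)
```

An untuned RBF SVM (default C and gamma) can easily lose to kNN on data like this; after the grid search the two are typically comparable.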

polyphase sample rate conversion with non-integer factor

I am not sure if this is the right place to ask this question, but I have designed a sample rate conversion filter h[n], with matlab's filterbuilder for interpolation factor I=5, and decimation factor D=9. Since D>I, matlab will design a filter with cutoff frequency pi/D.
Then I converted the designed filter h[n] into I=5 polyphase filters using Matlab's polyphase() method. However, I noticed that the coefficients of each separate polyphase filter do not sum to 1. Hence, I cannot compute valid interpolated sample points. How is this possible? Am I missing something?
See my post at: dsp.stackexchange.com for an answer to this question, and a guide on how to design a sample rate converter with non-integer factor.
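The effect can be reproduced in Python (a sketch with an assumed unity-gain FIR design, not the asker's filterbuilder output): a lowpass with unity DC gain splits into I polyphase branches whose coefficients each sum to about 1/I, not 1; multiplying the taps by I is what restores interpolated samples at the original amplitude.

```python
# Sketch: polyphase decomposition of an I=5, D=9 rate-conversion filter.
import numpy as np
from scipy.signal import firwin

I, D = 5, 9
# Unity-DC-gain lowpass with cutoff pi/D (firwin normalizes 1 = Nyquist).
h = firwin(numtaps=120, cutoff=1.0 / D)

# Polyphase decomposition: branch p takes every I-th tap starting at p.
phases = [h[p::I] for p in range(I)]
print([round(ph.sum(), 3) for ph in phases])       # each close to 1/I
print(round(sum(ph.sum() for ph in phases), 3))    # total DC gain ~ 1
```

Scaling h by I (gain I in the passband) makes each branch sum close to 1, which is the usual convention for interpolation filters.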

my model is predicting all positive class to negative class in libsvm toolbox matlab

I created a classifier using libsvm toolbox in Matlab. It is classifying all positive class data as negative class and vice versa. I got good result while doing cross validation but while testing some data I am finding that classifier is working in wrong way. I can't seem to figure out where the problem lies.
Can anybody please help me on this matter.
This was a "feature" of prior versions of libsvm when the first training example's (binary) label was -1. The easiest solution is to get the latest version (> 3.17).
See here for more details: http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html#f430
Suppose you have 500 training instances, 250 positive and 250 negative. In the testing set, instances with the same features as the positives will get predicted as positive. But when you supplied testing labels to LIBSVM (you have to provide testing labels so that LIBSVM can calculate accuracy; they are obviously not used in the prediction algorithm), you provided exactly reversed labels by mistake. So you have the impression that your predicted labels came out exactly opposite. Note that even a random classifier would have 50% accuracy on a binary classification problem.
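The reversed-labels effect is easy to demonstrate (a sketch on synthetic data; sklearn's SVC wraps libsvm): if the test labels are accidentally flipped, the reported accuracy becomes exactly 1 minus the true accuracy, so a good classifier looks like it predicts everything backwards.

```python
# Sketch: accidentally flipped test labels invert the reported accuracy.
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # linearly separable labels

clf = SVC(kernel="linear").fit(X[:150], y[:150])
y_pred = clf.predict(X[150:])

acc = accuracy_score(y[150:], y_pred)
acc_flipped = accuracy_score(1 - y[150:], y_pred)  # labels supplied reversed
print(acc, acc_flipped)  # acc_flipped is exactly 1 - acc
```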