Is it possible for an ensemble classifier to return a bimodal vote?

Majority and plurality voting in an ensemble classifier return the modal prediction of the base classifiers. With an ensemble of four or more classifiers, is it possible to get a bimodal or trimodal vote? If so, what will the ensemble return as the predicted class label?

I think it depends on the classifiers that you are trying to ensemble. If every classifier has the same set of possible labels/classes, then 'reducing' the outputs of 4 or more classifiers to 2-3 votes can be done with some function/layer. A common example of such a layer is a fully connected neural network layer, which is readily available in many popular libraries (e.g. TensorFlow, PyTorch, Caffe...).
If the classifiers produce different types of output, you may need to 'normalize' the outputs so that they have the same type and the same range of values. These normalized outputs can then be used to produce bimodal or trimodal votes, as explained above.
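To make the tie case from the question concrete, here is a minimal sketch of plurality voting over four base classifiers. The function name `plurality_vote` and the tie-break rule (pick the smallest label in sort order) are assumptions for illustration; scikit-learn's `VotingClassifier` with hard voting documents a similar ascending-sort-order tie-break, but other libraries may behave differently.

```python
from collections import Counter

def plurality_vote(predictions):
    """Return (winner, modes) for a list of base-classifier predictions."""
    counts = Counter(predictions)
    top = max(counts.values())
    # Every label that reaches the top count is a mode of the vote.
    modes = sorted(label for label, n in counts.items() if n == top)
    # Assumed tie-break: take the smallest label in sort order.
    return modes[0], modes

# Four classifiers split 2-2, so the vote is bimodal.
winner, modes = plurality_vote(["cat", "dog", "cat", "dog"])
print(winner, modes)  # cat ['cat', 'dog']
```

With an even split the vote has two modes, and the returned label depends entirely on the tie-break rule, so it is worth checking which rule your library applies.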

Related

How to combine two classification models in MATLAB?

I am trying to detect faces using MATLAB's built-in Viola-Jones face detection. Is there any way to combine two classification models, such as "FrontalFaceCART" and "ProfileFace", into one in order to get a better result?
Thank you.
You can't combine models. That makes no sense in a classification task, since every classifier is different (it works differently, i.e. it uses a different algorithm, and may also be trained differently).
According to the classification model(s) help (which can be found here), your two classifiers work as follows:
FrontalFaceCART is a model composed of weak classifiers, based on classification and regression tree analysis
ProfileFace is composed of weak classifiers, based on a decision stump
More information can be found in the link provided, but you can easily see that their inner behaviour is rather different, so you can't mix or combine them.
It's like (in machine learning) mixing a Support Vector Machine with a K-Nearest Neighbour classifier: the former uses separating hyperplanes, whereas the latter is based simply on distances.
You can, however, train several models in parallel (i.e. independently) and choose the one that suits you best (e.g. smaller error rate/higher accuracy). You basically create as many different classifiers as you like, give them the same training set, evaluate the accuracy (and/or other metrics) of each, and choose the best model.
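The selection step described above can be sketched as follows. This is only an illustration: the two threshold "models" are hypothetical stand-ins for trained classifiers, and scoring on a held-out set rather than on the training set is an added assumption.

```python
def accuracy(model, samples, labels):
    """Fraction of samples for which the model predicts the right label."""
    correct = sum(1 for x, y in zip(samples, labels) if model(x) == y)
    return correct / len(labels)

def pick_best(models, samples, labels):
    """Score every candidate on the same data and return the best one."""
    scores = {name: accuracy(m, samples, labels) for name, m in models.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical candidates: two simple threshold rules on one feature.
models = {
    "threshold_0": lambda x: int(x > 0),
    "threshold_2": lambda x: int(x > 2),
}
held_out_samples = [-1, 1, 3, 4]   # assumed held-out evaluation set
held_out_labels = [0, 0, 1, 1]

best, scores = pick_best(models, held_out_samples, held_out_labels)
print(best, scores)
```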
One option is to make a hierarchical classifier. So in a first step you use the frontal face classifier (assuming that most pictures are frontal faces). If the classifier fails, you try with the profile classifier.
I did that with a dataset of faces and it improved my overall classification accuracy. Furthermore, if you have some a priori information, you can use it. In my case the faces were usually in the upper-middle part of the picture.
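A minimal sketch of that fallback scheme is shown below. The `frontal` and `profile` functions are hypothetical placeholders for the two detectors (here they just read precomputed boxes from a dictionary); in practice each would wrap a real detector call.

```python
def detect_with_fallback(image, detectors):
    """Try each detector in order; return (name, boxes) of the first that finds anything."""
    for name, detect in detectors:
        boxes = detect(image)
        if boxes:
            return name, boxes
    return None, []

# Hypothetical stand-ins for the frontal and profile face detectors.
def frontal(image):
    return image.get("frontal", [])

def profile(image):
    return image.get("profile", [])

# Frontal first, since most pictures are assumed to show frontal faces.
cascade = [("frontal", frontal), ("profile", profile)]
result = detect_with_fallback({"profile": [(10, 20, 64, 64)]}, cascade)
print(result)  # ('profile', [(10, 20, 64, 64)])
```

The order of the list encodes the prior belief about which pose is more common, so the cheaper or more likely detector runs first.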
To improve performance beyond what the two MATLAB classifiers you are using can give, you would need to change your technique (and probably your programming language). This is the best method so far: FaceNet.

Self-organizing maps and learning vector quantization

Self-organizing maps (SOMs) are better suited to clustering (dimensionality reduction) than to classification. However, SOMs are used with learning vector quantization (LVQ) for fine-tuning. LVQ is a supervised learning method, so to use SOMs with LVQ, LVQ must be given a labelled training data set. Since SOMs only do clustering, not classification, and therefore cannot produce labelled data, how can a SOM be used as an input for LVQ?
Does LVQ fine-tune the clusters in the SOM?
Before it is used in LVQ, should the SOM output be put through another classification algorithm that classifies the inputs, so that these labelled inputs can be used in LVQ?
To be clear, supervised learning differs from unsupervised learning in that the target values are known in the former.
Therefore, the output of a supervised model is a prediction.
The output of an unsupervised model, by contrast, is a label whose meaning we don't yet know. That is why, after clustering, each of the new labels has to be profiled.
That said, you could label the dataset using an unsupervised learning technique such as a SOM. Then you should profile each class to make sure you understand what it means.
At this point, you can follow two different paths depending on your final objective:
1. use this new variable as a means of dimensionality reduction
2. use the dataset, extended with the additional variable representing the class, as labelled data that you then try to predict using LVQ
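For the second path, a minimal sketch of the basic LVQ1 update is given below: the nearest prototype moves toward a sample with the same label and away from a sample with a different label. The initial prototypes are assumed to come from SOM units that were given labels after profiling; the data, learning rate, and function names are illustrative only.

```python
def nearest(prototypes, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(range(len(prototypes)),
               key=lambda i: sum((p - v) ** 2 for p, v in zip(prototypes[i][0], x)))

def lvq1_fit(prototypes, data, lr=0.1, epochs=20):
    """prototypes and data are lists of (vector, label) pairs."""
    protos = [(list(vec), label) for vec, label in prototypes]
    for _ in range(epochs):
        for x, y in data:
            vec, label = protos[nearest(protos, x)]
            # Attract the prototype if the labels match, repel it otherwise.
            sign = 1.0 if label == y else -1.0
            for d in range(len(vec)):
                vec[d] += sign * lr * (x[d] - vec[d])
    return protos

def lvq_predict(protos, x):
    return protos[nearest(protos, x)][1]

# Assumed starting point: two SOM units labelled "a" and "b" after profiling.
som_prototypes = [([1.0, 1.0], "a"), ([4.0, 4.0], "b")]
data = [([0, 0], "a"), ([0, 1], "a"), ([1, 0], "a"),
        ([5, 5], "b"), ([5, 4], "b"), ([4, 5], "b")]

protos = lvq1_fit(som_prototypes, data)
print(lvq_predict(protos, [0.5, 0.5]), lvq_predict(protos, [4.5, 4.5]))
```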
Hope this can be useful!

neural network with multiplicative probability factor

I'm developing a project for university. I have to create a classifier for a disease. The data set I have contains several inputs (symptoms), and each of them is associated with a multiplicative probability factor (e.g. if a patient has symptom A, the probability of having the disease doubles).
So, how can I build this type of classifier? Is there a type of neural network or some other tool for this?
Thanks in advance
You should specify how much labeled data and unlabeled data you have.
Let's assume you have only labeled data. Then you could use neural networks, but IMHO an SVM or a random forest is the best technique for a first try.
Note that if you use machine learning techniques, your prior information about the symptoms (the multiplicative coefficients) is not used, because the labels are used instead. If you want to use these coefficients, it is no longer machine learning.
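If you do want to use the coefficients directly, one possible rule-based calculation is sketched below. It rests on strong assumptions that are not in the question: the factors are applied to the odds rather than to the probability itself (so the result stays between 0 and 1), the symptoms are treated as independent, and the prior of 0.1 is an invented value.

```python
def posterior_probability(prior, factors):
    """Combine a prior probability with multiplicative factors applied to the odds."""
    odds = prior / (1.0 - prior)
    for f in factors:
        # Assumption: each present symptom multiplies the odds independently.
        odds *= f
    return odds / (1.0 + odds)

# Hypothetical patient: prior 0.1, symptoms with factors 2.0 and 1.5 present.
p = posterior_probability(0.1, [2.0, 1.5])
print(round(p, 3))  # 0.25
```

Multiplying probabilities directly would push the result above 1 once enough symptoms are present, which is why the sketch works on odds instead.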
You can also use a neural network for this. Linking symptom A to a higher chance of disease B is exactly what a neural network should be able to learn, by adjusting the connection weights from input A (symptom A) to disease B. The network can pick up such a classification rule provided your training set contains enough data. I also suggest trying two different approaches: 1. one neural network with N outputs (N = number of diseases to classify); 2. a separate neural network for each disease.

How to improve predictor importance in decision tree ensemble (using TreeBagger class in Matlab)

I'm trying to train a classifier (specifically, a decision forest) using the Matlab 'TreeBagger' class.
I noticed from the online documentation for TreeBagger that there are a couple of methods/properties that can be used to see how important each feature is for distinguishing between classes of data points.
The two I found were the ComputeOOBVarImp property and the ClassificationTree.predictorImportance method. Using the latter on a decision forest/bagged ensemble of trees that I'd built, I found that many features had zero importance for the classifier.
Is there anything I can do with the TreeBagger class, or in conjunction with it, so that my trees use weak learners/splitting criteria that aren't just bounds on single input features but linear combinations of these features, in order to improve the 'information gain' at each node split?
I suppose this comes down to dimensionality reduction of the data, which I have no experience of doing in Matlab.
Thanks.

SVM LibSVM Ignore Feature 1,3,5 when Predicting

This question is about LibSVM, or SVMs in general.
I wonder whether it is possible to classify feature vectors of different lengths with the same SVM model.
Let's say we train the SVM with about 1000 Instances of the following Feature Vector:
[feature1 feature2 feature3 feature4 feature5]
Now I want to predict a test vector that has the same length of 5.
If the probability I receive is too poor, I want to check the first subset of my test vector, containing columns 2-5. In other words, I want to drop feature 1.
My question is: is it possible to tell the SVM to check only features 2-5 for prediction (e.g. with weights), or do I have to train different SVM models, one for 5 features, another for 4 features, and so on?
Thanks in advance...
marcus
You can always remove features from your test points by fiddling with the file, but I strongly recommend against such an approach. An SVM model is valid only when all features are present. If you are using the linear kernel, simply setting a given feature to 0 will implicitly cause it to be ignored (though you should not do this). With other kernels, this is very much a no-no.
Using a different set of features for prediction than the set you used for training is not a good approach.
I strongly suggest training a new model for the subset of features you wish to use in prediction.
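The difference between the linear kernel and other kernels mentioned above can be checked numerically. The sketch below uses hand-written formulas rather than LibSVM itself, and the weights, bias, support vector, and gamma are made-up values. With a linear decision function, zeroing feature 1 gives the same value as dropping its weight; with an RBF kernel, the zeroed feature still enters the distance to each support vector.

```python
import math

def linear_decision(w, b, x):
    """Linear SVM decision value: w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def rbf_kernel(u, v, gamma):
    """RBF kernel value: exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - c) ** 2 for a, c in zip(u, v)))

w, b = [0.8, -0.5, 0.3], 0.1   # hypothetical linear model
x = [2.0, 1.0, 4.0]
x_zeroed = [0.0, 1.0, 4.0]     # feature 1 set to 0

# Linear kernel: zeroing feature 1 equals dropping its weight entirely.
same = math.isclose(linear_decision(w, b, x_zeroed),
                    linear_decision(w[1:], b, x[1:]))
print(same)  # True

# RBF kernel: the zeroed feature still adds (0 - sv[0])**2 to the distance.
sv = [1.5, 1.0, 4.0]           # hypothetical support vector
k_zeroed = rbf_kernel(x_zeroed, sv, 0.5)
k_dropped = rbf_kernel(x[1:], sv[1:], 0.5)
print(k_zeroed, k_dropped)
```

Even in the linear case the remaining weights were fitted with feature 1 present, so the resulting model is not the one you would get by retraining on features 2-5.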