Different accuracy for HOG, SIFT and DENSE SIFT descriptor for recognition - matlab

Hey guys,
I'm really confused about the results I'm getting for recognition.
When I use HOG, I get different recognition accuracy depending on the parameters I use. I can understand that.
Maybe it's because the training images have different viewpoints and HOG can't handle this, so the maximum accuracy I got is only 40%.
Then I used SIFT and got a better result, around 70%.
And with dense SIFT, the maximum I got is only 38%.
I don't know why this happens, because dense SIFT should be better than SIFT.
So I tried using PCA to get the principal features from each descriptor, and then I combined these principal features to do recognition. But the result was even worse: only 30%.
PCA HOG (150, 4), PCA SIFT (150, 3) => PCA combination (50, 7)
Why does this happen? Why does dense SIFT give the worst result, and why do I get an even worse result when I combine all those principal components (from HOG, SIFT and dense SIFT)?
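To be concrete, this is roughly what I mean by combining the principal features (a simplified sketch; the variable names are made up and random data stands in for my real descriptor matrices, one row per image, just so it runs on its own):

% Rough sketch of "combining principal features"
hog_feats  = rand(150, 81);                            % placeholder for real HOG features
sift_feats = rand(150, 128);                           % placeholder for real SIFT features

[~, hog_pca]  = pca(hog_feats,  'NumComponents', 4);   % 150 x 4
[~, sift_pca] = pca(sift_feats, 'NumComponents', 3);   % 150 x 3

combined = [hog_pca, sift_pca];                        % 150 x 7, one row per image
% combined is then fed to the classifier instead of the raw descriptors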
For now I'm doing everything on a sample of the images. I have 4000 images, but at the moment I'm only using 150.
I'm using this small sample to try different parameters, and once I've found the best parameters I'll use them on the whole training set.
Will that give the same result (compared with the small sample)?

Maybe it's because the training images have different viewpoints and HOG can't handle this
The same holds for dense SIFT: if you have different viewpoints, dense methods are not suitable for this case. Try Hessian-Affine as the detector and SIFT as the descriptor; I expect you will get better performance than with plain SIFT.
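For example, a minimal sketch with VLFeat's vl_covdet (assuming VLFeat is installed and on the MATLAB path; the file name is hypothetical):

I = single(rgb2gray(imread('some_training_image.jpg')));   % vl_covdet expects a single, grayscale image

[frames, descrs] = vl_covdet(I, ...
    'Method', 'HessianLaplace', ...     % Hessian-based detector
    'EstimateAffineShape', true, ...    % affine adaptation, i.e. a Hessian-Affine-style detector
    'Descriptor', 'SIFT');              % describe each affine frame with SIFT

% descrs is 128 x N; plug it in wherever your sparse SIFT descriptors went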

When to use PCA for dimensionality reduction?

I am using the Matlab Classification Learner app to test different classifiers over a training set (size = 700). My response variable is a categorical label with 5 possible values. I have 7 numerical features and 2 categorical ones. I found a Cubic SVM to have the highest accuracy of 83%. But the performance goes down considerably when I enable PCA with 95% explained variance (accuracy = 40.5%). I am a student and this is the first time I am using PCA.
Why do I see such a result?
Could it be because of a small / unbalanced data set?
When is it useful to apply PCA? When we say "reduce dimensionality", is there a minimum number of features (dimensionality) in the original set?
Any help is appreciated. Thanks in advance!
I want to share my opinion.
A training set of 700 means your data is < 1k.
I'm even surprised that the SVM reaches 83%.
Even the MNIST dataset is considered small (60,000 training / 10,000 test images). Your data is much, much smaller.
You are trying to make your already small data even smaller with PCA, so what is left for the SVM to learn? Hardly any discriminating information remains.
If I were you, I would also test a random forest classifier; it might even perform better (see the sketch below).
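A minimal sketch with MATLAB's TreeBagger; dummy data (700 samples, 9 predictors, 5 classes) stands in for your real features so the sketch runs on its own, and the settings are just a starting point:

rng(1);                                            % reproducibility
X = rand(700, 9);                                  % placeholder feature matrix
Y = randi(5, 700, 1);                              % placeholder labels (5 classes)

forest = TreeBagger(200, X, Y, ...                 % 200 bagged classification trees
    'Method', 'classification', ...
    'OOBPrediction', 'on');                        % keep out-of-bag estimates
% (if some predictors are categorical, also pass 'CategoricalPredictors', idx)

oobErr = oobError(forest);
fprintf('Out-of-bag accuracy: %.1f%%\n', 100 * (1 - oobErr(end)));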
Even if you balance your data, it is still a small dataset.
I believe SMOTE will not improve the result. If your data consists of images, you could use something like Keras's ImageDataGenerator to replicate/augment your data, though I'm not sure MATLAB contains an ImageDataGenerator equivalent.
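For what it's worth, MATLAB's Deep Learning Toolbox does offer something similar through imageDataAugmenter and augmentedImageDatastore; a minimal sketch, assuming the images live in labelled subfolders of a hypothetical folder 'data/train':

aug = imageDataAugmenter( ...
    'RandRotation',     [-10 10], ...
    'RandXTranslation', [-5 5], ...
    'RandYTranslation', [-5 5], ...
    'RandXReflection',  true);

imds = imageDatastore('data/train', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');

augimds = augmentedImageDatastore([128 128], imds, 'DataAugmentation', aug);
% augimds yields randomly perturbed copies of the images at training time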
You would use PCA when you have lots of features (components). The individual features do not all directly affect the accuracy, but they are the components of your data.
For instance, let's consider handwritten digit classification data.
Looking at such digit images, can we say that every pixel directly affects the accuracy?
The answer is no: the mostly-black border pixels are not important for the accuracy, so we use PCA to remove them.
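A quick way to see this idea in MATLAB (a sketch; synthetic images stand in for real 28x28 digits, with the 4-pixel border forced to black, as it almost always is in MNIST):

imgs = rand(28, 28, 1000);
imgs([1:4 25:28], :, :) = 0;              % black rows at top/bottom
imgs(:, [1:4 25:28], :) = 0;              % black columns at left/right
X = reshape(imgs, [], 1000)';             % 1000 x 784, one image per row

pixelVar = var(X);
fprintf('%d of 784 pixels are constant (zero variance)\n', sum(pixelVar == 0));

[~, score, ~, ~, explained] = pca(X);     % PCA spends no components on constant pixels
k = find(cumsum(explained) >= 95, 1);
fprintf('%d components keep 95%% of the variance\n', k);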
If you want a detailed explanation with a Python example, check out my other answer.

niftynet multi-class 3D segmentation with dense vnet

Neural network newbie here. I've been testing NiftyNet and achieved decent single-class 3D segmentation predictions on my own MRI data set with dense_vnet. However, I ran out of luck when I tried to add a second label. The network seems to spot the correct organs but can't get rid of additional artifacts, as if it cannot get out of a local minimum or doesn't have enough degrees of freedom. One of the better-looking prediction slices does show some correct labels, but also additional noise.
Why would a single-class segmentation work better than a multi-class segmentation? Is it even reasonable to expect good multi-class 3D segmentation results out of DenseVnet? If yes, is there a specific approach to improve the results?
P.S.
Niftynet's site refers to stackoverflow for general questions.
Apparently, DenseVnet does handle multi-class segmentation okay. They have provided a ready-made model with a Dice loss extension. It worked with my MRI data without any pre-processing, even though it was designed for CT images and Hounsfield units.

Convolutional Neural Network for image detection/classification

So here is the setup: I have a set of images (labelled train and test) and I want to train a conv net that tells me whether or not a specific object is present in an image.
To do this, I followed the TensorFlow tutorial on MNIST and trained a simple conv net on crops of the area of interest (the object), using images of size 128x128. The architecture is as follows: three successive blocks, each consisting of 2 conv layers and 1 max-pool down-sampling layer, followed by one fully connected softmax layer (with two classes, 0 and 1, for whether the object is present or not).
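For reference, a rough sketch of that topology written with MATLAB's Deep Learning Toolbox, just to make the structure concrete (the actual implementation is in TensorFlow; 3x3 filters and grayscale input are assumptions, and the channel widths are the 32/64/128/1024 quoted below):

layers = [
    imageInputLayer([128 128 1])                           % 128x128 grayscale crops assumed
    convolution2dLayer(3, 32, 'Padding', 'same'); reluLayer
    convolution2dLayer(3, 32, 'Padding', 'same'); reluLayer
    maxPooling2dLayer(2, 'Stride', 2)
    convolution2dLayer(3, 64, 'Padding', 'same'); reluLayer
    convolution2dLayer(3, 64, 'Padding', 'same'); reluLayer
    maxPooling2dLayer(2, 'Stride', 2)
    convolution2dLayer(3, 128, 'Padding', 'same'); reluLayer
    convolution2dLayer(3, 128, 'Padding', 'same'); reluLayer
    maxPooling2dLayer(2, 'Stride', 2)
    fullyConnectedLayer(1024); reluLayer
    fullyConnectedLayer(2)                                 % two classes: object present / absent
    softmaxLayer
    classificationLayer];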
I implemented it using TensorFlow, and this works quite well, but since I have enough computing power I was wondering how I could increase the complexity of the classifier:
- adding more layers?
- adding more channels at each layer? (currently 32, 64, 128, and 1024 for the fully connected layer)
- anything else?
But the most important part is that I now want to detect this same object in larger images (roughly 600x600, whereas the object should be around 100x100).
I was wondering how I could use the previously trained "small" network for small images in order to pretrain a larger network on the large images. One option would be to classify the image with a sliding window of size 128x128 and scan the whole image (sketched below), but if possible I would like to try training a whole network on it.
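The sliding-window option, roughly, is what I have in mind here (a minimal sketch; classifyPatch is a dummy stand-in for the trained network's prediction, and the file name is made up):

img = imread('large_image.png');          % ~600x600 input image (hypothetical file)
win = 128;                                % window size the classifier expects
stride = 32;                              % step between windows
classifyPatch = @(p) mean(p(:)) > 0.5;    % dummy scorer; replace with the CNN

[h, w, ~] = size(img);
detections = [];                          % [row, col] of windows flagged positive
for r = 1:stride:(h - win + 1)
    for c = 1:stride:(w - win + 1)
        patch = im2double(img(r:r+win-1, c:c+win-1, :));
        if classifyPatch(patch)
            detections(end+1, :) = [r, c]; %#ok<AGROW>
        end
    end
end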
Any suggestion on how to proceed? Or an article/resource tackling this kind of problem? (I am really new to deep learning, so sorry if this is a stupid question...)
Thanks !
I suggest that you continue reading on the field overall. Your search keys include CNN, image classification, neural net, AlexNet, GoogleNet, and ResNet. This will return many articles, on-line classes and lectures, and other materials to help you learn about classification with neural nets.
Don't just add layers or filters: the complexity of the topology (net design) must be fitted to the task; a net that's too complex will over-fit the training data. The one you've been using is probably LeNet; the three I cite above are for the ImageNet image classification contest.
Since you are working on images, I would suggest using a pretrained image classification network (like VGG, AlexNet, etc.) and fine-tuning it on your 128x128 image data. In my experience, unless you have a very large data set, a fine-tuned network will give higher accuracy and also save training time. After building a good image classifier on your data set, you can use any popular algorithm to generate region proposals from the image. Then take each region proposal and pass it to the classification network, checking whether the network classifies it as positive or negative. If it is classified as positive, your object is most probably present in that region; otherwise it is not. If the classifier marks a lot of region proposals as containing the object, you can use a non-maximum suppression algorithm to reduce the number of positive proposals.
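For the last step, a short sketch of non-maximum suppression using MATLAB's Computer Vision Toolbox (selectStrongestBbox); the boxes and scores below are dummy values standing in for the classifier's positive proposals:

boxes  = [10 20 128 128; 15 25 128 128; 300 300 128 128];   % [x y w h], dummy proposals
scores = [0.90; 0.85; 0.95];                                 % classifier confidences

[keptBoxes, keptScores] = selectStrongestBbox(boxes, scores, ...
    'OverlapThreshold', 0.5);         % drops near-duplicate overlapping proposals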

PCA on SIFT descriptors and Fisher Vectors

I was reading this particular paper http://www.robots.ox.ac.uk/~vgg/publications/2011/Chatfield11/chatfield11.pdf and I find the Fisher Vector with GMM vocabulary approach very interesting and I would like to test it myself.
However, it is totally unclear (to me) how they apply the PCA dimensionality reduction to the data. I mean, do they build the feature space first and then perform PCA on it, or do they perform PCA on each image's SIFT descriptors and then build the feature space?
Is this supposed to be done for both the training and test sets? To me the answer is 'obviously yes', but it is not stated clearly.
I was thinking of building the feature space from the training set and then running PCA on it. Then I could use the PCA coefficients obtained from the training set to reduce each image's SIFT descriptors before they are encoded into a Fisher Vector for later classification, whether it is a test or a train image.
EDIT 1:
Simplistic example:
[coef, reduced_feat_space] = pca(Feat_Space', 'NumComponents', 80);
and then (for both test and train images)
reduced_test_img = test_img * coef; (and then choose the first 80 dimensions of reduced_test_img)
What do you think? Cheers
It looks to me like they do SIFT first and then do PCA. The article states in section 2.1: "The local descriptors are fixed in all experiments to be SIFT descriptors..."
It also says in the introduction: "the following three steps: (i) extraction of local image features (e.g., SIFT descriptors), (ii) encoding of the local features in an image descriptor (e.g., a histogram of the quantized local features), and (iii) classification ... Recently several authors have focused on improving the second component". So it looks to me that the dimensionality reduction occurs after SIFT, and the paper is simply discussing a few different ways of doing this and the performance of each.
I would also guess (as you did) that you have to run it on both sets of images. Otherwise you would be using two different metrics to classify the images; it really is like comparing apples to oranges. Comparing a reduced-dimensional representation to the full one (even for the exact same image) will show some variation. In fact, that is the whole premise of PCA: you give up some smaller features (usually) for computational efficiency. The real question with PCA, or any dimensionality reduction algorithm, is how much information you can give up and still reliably classify/segment different data sets.
And as a last point, you have to treat both image sets the same way, because your end goal is to use the Fisher Vector for classification of both test and training images. Now imagine you decided that training images don't get PCA and test images do. If I then give you some image X, what would you do with it? How could you treat one set of images differently from another BEFORE you've classified them? Using the same technique on both sets means you'd process my image X and then decide where to put it.
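In MATLAB terms, a minimal sketch of "fit the transform on the training descriptors, apply the same transform to both sets" could look like this (variable names are assumed, not from the paper; random data stands in for the real N x 128 SIFT matrices so it runs on its own):

train_descrs = rand(5000, 128);                 % placeholder training SIFT descriptors
test_descrs  = rand(1000, 128);                 % placeholder test SIFT descriptors

mu    = mean(train_descrs, 1);                  % mean estimated on training data only
coeff = pca(train_descrs, 'NumComponents', 80); % PCA basis fitted on training data only

train_reduced = bsxfun(@minus, train_descrs, mu) * coeff;   % N_train x 80
test_reduced  = bsxfun(@minus, test_descrs,  mu) * coeff;   % N_test x 80, same mean and basis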
Anyway, I hope that helped and wasn't too rant-like. Good luck :-)

Matlab and Support Vector Machines: Why doesn't the implementation of PCA give good prediction results?

I have a training dataset with 60,000 images and a testing dataset with 10,000 images. Each image represents an integer from 0 to 9. My goal was to use libsvm, a library for Support Vector Machines, to learn the digits from the training dataset and use the resulting classifier to predict the images of the testing dataset.
Each image is 28x28, which means that it has 784 pixels, or features. Even though the features seem numerous, it took only 5-10 minutes to run the SVM application and learn the training dataset. The testing results were very good, giving me a 93% success rate.
I decided to try and use PCA from matlab in order to reduce the amount of features while at the same time not losing too much information.
[coeff scores latent] = princomp(train_images,'econ');
I played with latent a little bit and found that keeping only the first 90 components results in about 10% information loss, so I decided to use only the first 90.
In the above code, train_images is an array of size [60000x784].
From this code I get the scores, and from the scores I simply took the number of features I wanted, so for the training images I ended up with an array of size [60000x90].
Question 1: What's the correct way to project the testing dataset to the coefficients => coeff?
I tried using the following:
test_images = test_images' * coeff;
Note that test_images is accordingly an array of size [784x10000], while coeff is an array of size [784x784].
Then from that I again took only the 90 features by doing the following:
test_images = test_images(:,(1:number_of_features))';
which seemed to be correct. However, after running the training and then the prediction, I got a 60% success rate, which is way lower than the success rate I got when I didn't use any PCA at all.
Question 2: Why did I get such low results?
After PCA I scaled the data as always, which I guess is the correct thing to do. Not scaling is generally a bad idea according to the libsvm website, so I don't think that's the issue here.
Thank you in advance
Regarding your first question, I believe MarkV has already provided you with an answer.
As for the second question: PCA indeed preserves most of the variance of your data, but that does not necessarily mean it retains 90% of the information in your data. Sometimes the information required for successful classification is actually located in the 10% you knocked off. A good example of this can be found here, especially figure 1 there.
So, if you get nice results with the full set of features, why reduce the dimensionality?
You might want to try and play with different principal components. What happens if you take components 91:180? That might be an interesting experiment...
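A minimal sketch of that experiment (pca is the current replacement for princomp; variable names follow the question, with the test images arranged one per row, and the test set centred with the training mean before projecting):

[coeff, score, ~, ~, ~, mu] = pca(train_images);        % mu is the training mean
range = 91:180;                                         % the components to try

train_feats = score(:, range);                          % [60000 x 90]
test_feats  = bsxfun(@minus, test_images, mu) * coeff(:, range);   % [10000 x 90]

% scale both sets consistently, then train and predict with libsvm as before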