Background Information
I'm trying to apply Bag of Words on SURF/BRISK features as an experiment on the Cats/Dogs dataset. I've extracted all the features into a vector.
Issue:
When I feed the vectors into kmeans(points, numPts*0.04), it says:
Undefined function 'isnan' for input arguments of type 'BRISKPoints'
The problem here is that BRISKPoints is a MATLAB object, not a numeric matrix, so you cannot run k-means on it directly. What should go into k-means is the output of extractFeatures. Note that extractFeatures can return either SURF or FREAK descriptors, depending on the type of the input points or the value of the 'Method' parameter. You can use k-means to cluster SURF descriptors, which are plain numeric vectors, but not FREAK descriptors, which are bit strings encapsulated in a binaryFeatures object.
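As a minimal sketch of that pipeline, assuming images is a cell array of grayscale images (the variable names here are illustrative, not taken from your code):

allDescriptors = [];
for i = 1:numel(images)
    points = detectSURFFeatures(images{i});                              % detection step
    descriptors = extractFeatures(images{i}, points, 'Method', 'SURF');  % numeric matrix, one row per point
    allDescriptors = [allDescriptors; descriptors];                      % stack descriptors from all images
end
numClusters = round(0.04 * size(allDescriptors, 1));
[clusterIdx, vocabulary] = kmeans(allDescriptors, numClusters);          % works: plain numeric input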
By the way, as of R2014b there is built-in support for bag-of-words image classification in the Computer Vision System Toolbox. Please see this example.
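In case it helps, this is roughly what the built-in workflow looks like in recent releases (a hedged sketch: the folder name and split ratio are placeholders, and the original R2014b example used imageSet rather than imageDatastore):

imds = imageDatastore('catsDogs', 'IncludeSubfolders', true, 'LabelSource', 'foldernames');
[trainSet, testSet] = splitEachLabel(imds, 0.8, 'randomized');
bag = bagOfFeatures(trainSet);                             % builds the SURF-based visual vocabulary
classifier = trainImageCategoryClassifier(trainSet, bag);  % bag-of-words classifier
confMat = evaluate(classifier, testSet);                   % confusion matrix on held-out images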
Related
I am trying to cluster a dataset using an encoder, and since I am new to this field I can't tell how to do it. My main issue is how to define the loss function, since the dataset is unlabeled; up to now, everything I have seen in the literature defines the loss function as the distance between the desired output and the predicted output. My question is: since I don't have a desired output, how should I implement this?
You can use an autoencoder to pre-train your convolutional layers, as described in my question here about using a convolutional autoencoder for images.
As you can see from the code, the optimizer is Adam, with accuracy and the Dice coefficient as metrics; I think you can use accuracy only, since the Dice coefficient is image-specific.
I'm not sure how this will work for you, because you haven't explained how you plan to transform your bibliography lists into vectors; perhaps you could create a list of bibliography IDs sorted by the cosine distance between them.
For example, for each reference in your dataset you could build a vector of cosine distances to each item in the bibliography list above, and use that as the input to the autoencoder.
Once the encoder is trained, you can remove the decoder part from the model and use the encoder output as the input to one of the unsupervised clustering algorithms, for example k-means. You can find details about them here.
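Since the rest of this thread is MATLAB-based, here is a rough MATLAB illustration of the same pipeline (the answer above assumes a Keras-style setup; X is assumed to be a features-by-samples matrix of your reference vectors, and the sizes below are arbitrary choices):

hiddenSize = 32;                              % size of the learned representation
autoenc = trainAutoencoder(X, hiddenSize, 'MaxEpochs', 200);
Z = encode(autoenc, X);                       % keep only the encoder output (hiddenSize-by-numSamples)
k = 10;                                       % number of clusters, picked for illustration
idx = kmeans(Z', k);                          % cluster the learned representation, not the raw data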
I have computed colour descriptors for a dataset of images and generated a 152×320 matrix (152 samples and 320 features). I would like to use PCA to reduce the dimensionality of my image descriptor space. I know that I could do this with MATLAB's built-in pca function, but as I have just started learning about this concept I would like to implement the MATLAB code without the built-in function so I can clearly understand how it works. I tried to find out how to do that online, but all I could find was either the general concept of PCA or implementations using the built-in functions without a clear explanation of how they work. Could anyone give me step-by-step instructions, or a link that explains a simple way to implement PCA for dimensionality reduction? The reason I'm so confused is that there are so many uses for PCA and methods to implement it, and the more I read about it the more confused I get.
PCA basically takes the dominant eigenvectors of the data (or, better yet, the projection of the data onto the dominant eigenvectors of the covariance matrix).
What you can do is use the SVD (Singular Value Decomposition).
To imitate MATLAB's pca() function, here is what you should do:
1. Center all features (each column of your data should have zero mean).
2. Apply the svd() function to your data.
3. Use the columns of the V matrix as the vectors to project your data onto. Choose how many columns to use according to the dimension you'd like the reduced data to have.
4. The projected data is now your dimensionality-reduced data, as in the sketch below.
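A minimal sketch of those steps, mirroring pca() on your 152×320 matrix X (rows are samples; the number of components kept, k, is your choice):

Xc = bsxfun(@minus, X, mean(X, 1));   % 1) center each feature (column) at zero mean
[U, S, V] = svd(Xc, 'econ');          % 2) economy-size SVD of the centered data
k = 10;                               %    how many components to keep
coeff = V(:, 1:k);                    % 3) principal directions (columns of V)
score = Xc * coeff;                   % 4) projected data: 152-by-k reduced representation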
I have a problem with the bagOfFeatures function implemented in the MATLAB Computer Vision System Toolbox.
I'm doing a study on classifying different types of images. First of all, I'm trying to use bagOfFeatures with different custom extractors. I want to divide my work into two branches: first detect SURF points, then extract three different types of descriptors from them, for example SURF, BRISK and FREAK. In my custom extractor I use the following line:
features = extractFeatures(grayImage, multiscaleGridPoints, 'Upright', true, 'Method', 'SURF');
It always needs the SURF method to work, but I need to be able to get different types of descriptors.
Can I use the bagOfFeatures function from the Computer Vision System Toolbox to do this, or does it only support SURF feature extraction?
Unfortunately, you cannot use BRISK or FREAK with MATLAB's implementation of bag-of-features, because the bag-of-features algorithm uses K-means clustering to create the "visual words". The problem is that BRISK and FREAK descriptors are binary bit strings, and you cannot cluster them with K-means, which only works on real-valued vectors.
You can certainly use different kinds of interest point detectors with MATLAB's framework. However, you are limited to descriptors which are real-valued vectors. So SURF and SIFT will work, but BRISK and FREAK will not. If you absolutely must use BRISK or FREAK, you will have to implement your own bag of features. There are several methods for clustering binary descriptors, but I do not know how well any of them work in the context of bag-of-features.
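If you stay with real-valued descriptors, a hedged sketch of a custom extractor for bagOfFeatures could look like this (the function and variable names are illustrative; it detects BRISK points but describes them with SURF so the output stays numeric):

function [features, featureMetrics] = surfOnBriskExtractor(img)
    if size(img, 3) == 3
        img = rgb2gray(img);                                   % bagOfFeatures may pass RGB images
    end
    points = detectBRISKFeatures(img);                         % any detector is fine
    [features, validPoints] = extractFeatures(img, points, 'Method', 'SURF', 'Upright', true);
    featureMetrics = validPoints.Metric;                       % strengths used to select features
end

% Usage, assuming imds is an imageDatastore/imageSet of your images:
% bag = bagOfFeatures(imds, 'CustomExtractor', @surfOnBriskExtractor);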
I am new to digital image processing and have to simulate a Fourier descriptor program that is affine invariant. I want to know the prerequisites required to understand this program; my reference is Digital Image Processing Using MATLAB by Gonzalez. I have seen a question on this site regarding the same program, but I am not able to understand the program or the solution. The question says:
"I am using Gonzalez frdescp function to get Fourier descriptors of a boundary. I use this code, and I get two totally different sets of numbers describing two identical but different in scale shapes.
So what is wrong?"
Can somebody help me with the prerequisites needed to understand this program, and help me further?
Let me give this a try, as I will have to use English and not mathematical notation. First, this is the documentation of the frdescp function shown here. frdescp takes one argument, which is an n-by-2 matrix of numbers. What are these numbers? This requires some understanding of the mathematical foundation of Fourier descriptors. The assumption, before computing the Fourier descriptors, is that you have a contour of the object and some points on that contour. So for example a contour is shown in this picture:
You see that black line in the image? That is where you pick a list of points going clockwise along the contour. Let's call this list {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}. Now that we have these points, we are ready to compute the Fourier descriptors of this contour.

The complex Fourier descriptor implemented in this MATLAB function requires the numbers to be in the complex domain, so you have to convert the numbers in the list to complex numbers. This is easy, as you can transform a pair of real numbers (x, y) in 2D to x + iy in the complex plane; however, the MATLAB function already does this for you. But now you know what the n-by-2 matrix is for: it is just the list of xs and ys on the contour.

After you have this, the MATLAB function takes the discrete Fourier transform and you get the descriptors. The benefit of this descriptor business is that, once the coefficients are suitably normalized, it is invariant under certain geometric transformations such as translation, rotation and scaling. I hope this was helpful.
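The "two totally different sets of numbers" in the quoted question are actually expected: the raw DFT coefficients change with scale until you normalize them. A minimal sketch of the usual normalization (this is the idea, not Gonzalez's exact frdescp code; b is assumed to be an n-by-2 matrix of ordered boundary points):

z = b(:, 1) + 1i * b(:, 2);   % boundary points as complex numbers
Z = fft(z);                   % raw Fourier descriptors
Z(1) = 0;                     % drop the DC term -> translation invariance
Z = Z / abs(Z(2));            % divide by one coefficient's magnitude -> scale invariance
descriptors = abs(Z);         % keep magnitudes only -> rotation/starting-point invariance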
I am studying Support Vector Machines (SVM) by reading a lot of material. However, it seems that most of it focuses on how to classify the input 2D data by mapping it using several kernels such as linear, polynomial, RBF / Gaussian, etc.
My first question is, can SVM handle high-dimensional (n-D) input data?
According to what I found, the answer is YES!
If my understanding is correct,
(1) n-D input data will be constructed in a Hilbert (feature) space, then
(2) those data will be simplified by using some approach (such as PCA?) to combine them / project them back to a 2D plane, so that
(3) the kernel methods can map them into an appropriate shape such that a line or curve can separate them into distinct groups.
This means most of the guides/tutorials focus on step (3). But some toolboxes I've checked cannot plot the data if it has more than 2 dimensions. How can the data be projected to 2D afterwards?
If there is no projection of data, how can they classify it?
My second question is: is my understanding correct?
My first question is: can SVM handle high-dimensional (n-D) input data?
Yes. I have dealt with data where n > 2500 when using LIBSVM software: http://www.csie.ntu.edu.tw/~cjlin/libsvm/. I used linear and RBF kernels.
My second question is: is my understanding correct?
I'm not entirely sure on what you mean here, so I'll try to comment on what you said most recently. I believe your intuition is generally correct. Data is "constructed" in some n-dimensional space, and a hyperplane of dimension n-1 is used to classify the data into two groups. However, by using kernel methods, it's possible to generate this information using linear methods and not consume all the memory of your computer.
I'm not sure if you've seen this already, but if you haven't, you may be interested in some of the information in this paper: http://pyml.sourceforge.net/doc/howto.pdf. I've copied and pasted a part of the text that may appeal to your thoughts:
A kernel method is an algorithm that depends on the data only through dot-products. When this is the case, the dot product can be replaced by a kernel function which computes a dot product in some possibly high dimensional feature space. This has two advantages: First, the ability to generate non-linear decision boundaries using methods designed for linear classifiers. Second, the use of kernel functions allows the user to apply a classifier to data that have no obvious fixed-dimensional vector space representation. The prime example of such data in bioinformatics are sequence, either DNA or protein, and protein structure.
It would also help if you could explain what "guides" you are referring to. I don't think I've ever had to project data onto a 2-D plane before, and it doesn't make sense to do so anyway for data with a very large number of dimensions (or "features", as they are called in LIBSVM). Using suitable kernel methods should be enough to classify such data.
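To make the "no projection needed" point concrete, here is a hedged sketch using MATLAB's built-in fitcsvm on synthetic high-dimensional data (LIBSVM's MATLAB interface works the same way conceptually; everything below is purely illustrative):

X = randn(200, 2500);                          % 200 samples, 2500 features
Y = sign(X(:, 1) + 0.5 * randn(200, 1));       % synthetic binary labels
model = fitcsvm(X, Y, 'KernelFunction', 'rbf', 'KernelScale', 'auto');
cvmodel = crossval(model, 'KFold', 5);         % cross-validation; no plotting or 2-D projection involved
cvLoss = kfoldLoss(cvmodel)                    % estimated misclassification rate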