Using the MNIST database for digit recognition - matlab

I am trying to use the MNIST database to recognize handwritten digits. What I have so far is a binary matrix that represents the digit; the algorithm is written in MATLAB. I would love some help getting started with using the MNIST database to recognize the digit from the binary image.
Thanks.

If you are using MATLAB and already have the binary images, you now need to:
1) Extract features from the images (you have many choices). For example, you can start with the raw pixels: convert each image matrix into a row vector.
Create a feature matrix from all these row vectors. Each row will be an "instance" in your feature matrix.
(Use part of the data for training and the rest for testing.)
2) Select and try different classifiers. Try, for example, an SVM (Support Vector Machine). The most basic way is to use the svmtrain and svmclassify functions; their usage is simple and well explained in MATLAB's help. Note that these two functions handle only two classes and were removed in later MATLAB releases; fitcsvm (two classes) and fitcecoc (multiclass, which the ten digits require) replace them.
3) Test different partitions of the data.
4) Experiment with other features and classifiers.
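As a rough sketch of steps 1 to 3, assuming the Statistics and Machine Learning Toolbox is available: the random images and labels below are hypothetical stand-ins for your own data, and fitcecoc is used here in place of svmtrain because digits form a ten-class problem.

```matlab
% Hypothetical stand-in data: 200 random 28x28 binary images with labels 0-9.
% Replace these with your own binary images and their true digit labels.
rng(0);                                  % reproducible random numbers
numImages = 200;
images = rand(28, 28, numImages) > 0.5;  % logical 28x28xN array
labels = randi([0 9], numImages, 1);     % N-by-1 digit labels

% Step 1: raw-pixel features, one row vector per image (N-by-784).
X = double(reshape(images, [], numImages))';

% Hold out 30% of the instances for testing.
idx = randperm(numImages);
numTest = round(0.3 * numImages);
testIdx = idx(1:numTest);
trainIdx = idx(numTest+1:end);

% Step 2: multiclass SVM (by default, one binary SVM per pair of classes).
model = fitcecoc(X(trainIdx, :), labels(trainIdx));

% Step 3: classify the held-out images and measure accuracy.
predicted = predict(model, X(testIdx, :));
accuracy = mean(predicted == labels(testIdx));
fprintf('Test accuracy: %.2f\n', accuracy);
```

With random stand-in data the accuracy will be near chance; the point is only the shape of the pipeline. Changing the split fraction or the random seed is one simple way to test different partitions.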

Related

How to use multiple labels as targets in Neural Net Pattern Recognition Toolbox?

I am trying to use the Neural Net Pattern Recognition toolbox in MATLAB to recognize different classes in my dataset. I have a 21392 x 4 table: columns 1-3 are the predictors I would like to use, and the 4th column holds the labels, with 14 different categories (strings like Angry, Sad, Happy, Neutral etc.). It seems that the Neural Net Pattern Recognition toolbox, unlike the MATLAB Classification Learner toolbox, doesn't allow me to import the table and automatically extract the predictors and responses from it. Moreover, I am unable to specify the inputs and targets to the neural network manually, as the table isn't showing up in the options.
I looked into examples like the Iris Dataset, Wine Dataset, Cancer Dataset etc., but all of them have only 2-3 output classes (encoded in binary like 000, 010, 011 etc.), and their labels are not strings, unlike mine (Angry, Sad, Happy, Neutral etc., 14 different classes in total). I would like to know how I can use my table as input to the neural network pattern recognition toolbox, or otherwise any way in which I can extract the data from my table and use it in the toolbox. I am new to the toolbox, so any help in this regard would be highly appreciated. Thanks!
The first step in using the Neural Net Pattern Recognition Toolbox is to get the data out of the table and into numeric arrays, as neural networks work only with numeric arrays, not with other datatypes directly. Because the fourth column holds text, converting the whole table at once with table2array does not yield a numeric array, so extract the predictors and the labels separately. Considering the table as my_table:
inputs = table2array(my_table(:, 1:3)); % 21392x3 numeric predictors
labels = my_table{:, 4}; % 21392x1 labels
The inputs and labels then need to be transposed, because the toolbox expects the data in column format (each column is one datapoint and each row is one feature):
inputs = inputs'; %(now of dimensions 3x21392)
labels = labels'; %(now of dimensions 1x21392)
The string-type (categorical) labels can be converted to numeric targets with one-hot encoding, using categorical followed by ind2vec:
my_table_vector = ind2vec(double(categorical(labels)));
Now my_table_vector (the final targets) and inputs (the final input predictors) can be fed to the neural network and used for classification/prediction of the target labels.
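Here is a small end-to-end sketch of that preparation, assuming the Deep Learning Toolbox (for ind2vec) is installed. The six-row table and its variable names (f1, f2, f3, label) are made up for illustration and stand in for the real 21392 x 4 table.

```matlab
% Hypothetical stand-in for the 21392x4 table: three numeric predictor
% columns and one column of text labels.
emotion = {'Angry'; 'Sad'; 'Happy'; 'Neutral'; 'Sad'; 'Happy'};
my_table = table([1;2;3;4;5;6], [6;5;4;3;2;1], [0;1;0;1;0;1], emotion, ...
    'VariableNames', {'f1', 'f2', 'f3', 'label'});

% Predictors: numeric columns only, transposed so each column is one sample.
inputs = table2array(my_table(:, 1:3))';      % 3x6

% Labels: text -> categorical -> class index -> one-hot columns.
classes = categorical(my_table.label);
classIdx = double(classes)';                  % 1x6 indices into categories(classes)
targets = full(ind2vec(classIdx));            % 4x6, one row per class

% Row k of targets corresponds to classNames{k}.
classNames = categories(classes);
```

Keeping classNames around makes it possible to map a predicted row index back to the original text label later. Once inputs and targets exist in the workspace, they should be selectable in the app, or they can be passed to train with a network created by patternnet.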

The size of the generated confusion matrix using confusionmat function is not right, why?

I am working on a traffic sign recognition code in MATLAB using Belgian Traffic Sign Dataset. This dataset can be found here.
The dataset consists of training data and test data (or evaluation data).
I resized the given images and extracted HOG features using the VL_HOG function from the VLFeat library.
Then, I trained a multi-class SVM using all of the signs inside the training dataset. There are 62 categories (i.e. different types of traffic signs) and 4577 frames inside the training set.
I used the fitcecoc function to obtain the classifier.
After training the multi-class SVM, I want to test the classifier's performance on the test data, so I used the predict and confusionmat functions, respectively.
For some reason, the size of the returned confusion matrix is 53 by 53 instead of 62 by 62.
Why is the size of the confusion matrix not the same as the number of categories?
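Some of the folders inside the testing dataset are empty, so those categories never appear among the true labels (or the predicted ones). confusionmat builds the matrix only from the labels it actually sees, which causes MATLAB to skip those rows and columns in the confusion matrix.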
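If a full-size matrix is wanted anyway, one option is the 'Order' argument of confusionmat, which lets you list every class explicitly. A minimal sketch with made-up labels (5 classes instead of 62, with classes 2 and 4 playing the role of the empty folders):

```matlab
% Hypothetical example with 5 known classes; classes 2 and 4 have no test
% samples and are never predicted.
yTrue = [1; 1; 3; 3; 5; 5];
yPred = [1; 3; 3; 3; 5; 1];

% Default: only labels that actually occur are used, giving a 3-by-3 matrix.
C = confusionmat(yTrue, yPred);

% Listing every class explicitly keeps the empty ones as zero rows/columns,
% giving a 5-by-5 matrix.
allClasses = (1:5)';
Cfull = confusionmat(yTrue, yPred, 'Order', allClasses);

disp(size(C));
disp(size(Cfull));
```

For the traffic sign data, allClasses would be the list of all 62 category labels, in whatever type the labels are stored.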

Vlfeat Matlab SVM

I'm trying to build an image processing application. The purpose is to take a thermal image and decide whether or not it contains a human.
My plan was to use Matlab (actually Octave). For that I'm trying to use the VLFeat package, and I'm really confused about how I should use this library.
I'm trying to use the SVM trainer after extracting HOG features, but I couldn't figure out how to test the data.
After I have trained the SVM, how do I test a new image?
*If there are better solutions, I'm open to suggestions.
From the first paragraph of the link you provided:
(...) W'*X(:,i)+B has the same sign of LABELS(i) for all i.
So W'*X(:,i)+B is the value assigned to the feature vector X(:,i), and for any given feature vector x you want to test, you just evaluate W.' * x+B.
EDIT: A feature vector x for some test data is generated in the same way as for the training data, using your feature extraction method. To classify this vector, evaluate the linear function given by the SVM to get the classification "value" c=W.' * x+B. Then take the sign of c as the classification into one class or the other.
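A tiny sketch of that evaluation, using only base Matlab/Octave operations. The values of W, B and Xtest below are made up for illustration; in practice W and B come from the SVM trainer and Xtest from your HOG extraction.

```matlab
% Hypothetical weights and offset, standing in for the outputs of the
% SVM trainer: W is D-by-1 and B is a scalar.
W = [1; -1];
B = 0.5;

% Test feature vectors, one per column (D-by-N), produced by the same
% feature extraction as the training data.
Xtest = [2 0;
         0 2];

% Score every test vector with the linear function W'*x + B.
scores = W.' * Xtest + B;      % 1-by-N

% The sign of the score is the predicted class (+1 or -1).
predicted = sign(scores);
disp(predicted);
```

Scoring all test vectors in one matrix product avoids a loop; a score of exactly zero lies on the decision boundary and would need a tie-breaking rule of your choice.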

How to train SVM in matlab for character recognition?

I'm a final-year student working on my major project. My project is basically to extract text from a natural scene, recognize it, and then display it in a notepad etc.
I have already extracted the text from the images and have also obtained 85 features for each extracted character.
However, for the recognition part, I have no clue how to train or use an SVM (support vector machine) in Matlab so that I can get a match.
Please help me out, as this is turning out to be painstakingly difficult.
If you're happy with using an existing SVM implementation, then you should either use the Bioinformatics Toolbox svmtrain, or download the Matlab version of libsvm. If you want to implement an SVM yourself, then you should understand SVM theory, and you can use quadprog to solve the appropriate optimisation problem.
With your data, you will need an N-by-85 feature matrix, where N is the number of characters, and an N-by-1 array of 'true labels' which you provide manually. Depending on which tool you use to train the SVM, the parameters to svmtrain are slightly different - check the documentation.
If you want to evaluate your SVM to show that it works, you may need to organise your data so that you can estimate the generalization error of the classifier - see cross-validation.
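As an illustration of the data layout and of cross-validation, here is a minimal sketch. It assumes the Statistics and Machine Learning Toolbox and uses its fitcecoc function rather than svmtrain or libsvm, since character recognition has more than two classes. The random features and the three class names are hypothetical placeholders for your 85 real features and hand-assigned labels.

```matlab
% Hypothetical stand-in data: N characters, 85 features each, 3 classes.
rng(0);
N = 150;
features = randn(N, 85);                       % N-by-85 feature matrix
trueLabels = repmat({'A'; 'B'; 'C'}, N/3, 1);  % N-by-1 labels assigned by hand

% Multiclass SVM trained on all the data.
model = fitcecoc(features, trueLabels);

% 5-fold cross-validation estimates the error on unseen characters.
cvModel = crossval(model, 'KFold', 5);
cvError = kfoldLoss(cvModel);                  % fraction misclassified
fprintf('Cross-validated error: %.2f\n', cvError);
```

Since the stand-in features are pure noise, the reported error will be close to chance level; with real features it gives an estimate of how often an unseen character would be misread.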

svm classification

I am a beginner in MATLAB and am doing my programming project in digital image processing, i.e. magnetic resonance image classification using wavelet features + SVM + PCA + ANN. I ran the example SVM classification from the MATLAB toolbox and modified it to fit my requirements. I am facing problems in storing more than one feature in an input vector and in giving new input to the SVM. Please help.
Simply pass the multidimensional feature data to the svmtrain(Training, Group) function as the Training parameter (Training can be a matrix in which each row is one observation and each column represents a separate feature). After that, use svmclassify(SVMStruct, Sample) to classify the test data.
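The same layout carries over to fitcsvm and predict, which supersede svmtrain and svmclassify in newer MATLAB releases. A minimal sketch under that assumption, with made-up data (3 features per observation, 2 classes) in place of the wavelet features:

```matlab
% Hypothetical stand-in data: 40 observations, 3 features, 2 classes.
rng(0);
Training = [randn(20, 3) + 2; randn(20, 3) - 2];   % rows = observations
Group = [ones(20, 1); -ones(20, 1)];               % one label per row

% Train a binary SVM on the multi-feature data.
svmModel = fitcsvm(Training, Group);

% Classify new inputs: each row of Sample must have the same number of
% columns (features) as Training.
Sample = [2 2 2; -2 -2 -2];
predictedGroup = predict(svmModel, Sample);
disp(predictedGroup');
```

Adding another feature just means adding another column to both Training and Sample; the calls themselves do not change.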