Matching PCA output to corresponding coordinates - matlab

I have a dataset of around 800 images showing the view from a car driving in circles, each with corresponding coordinates and a changing background. The goal is to train a neural network to predict the position from the images. I reshaped the images so that the original 160x320 pixels become a 1x51200 vector, which I can feed to my NN more easily. However, because this is a rather large dimension, I applied PCA to reduce it, and the PCA indeed worked well: I could keep only the 100 eigenvectors with the highest eigenvalues and still retain 90-95 % of the total variance.

But now comes my obstacle: I have these 100 components, still reconstructable and visualizable, but I don't know exactly which coordinates they correspond to. I can't just take the first 100 coordinates, because the eigenvectors were obviously computed from different timesteps across the progression of the images. I need this information so that my NN can match images and coordinates while learning, and check its progress while testing. I read a similar question where the answers stated that it's not possible to extract indices from a PCA output, but I'm pretty sure other people must already have faced similar obstacles?

Here is what you have to be able to do with PCA to go from the compact back to the original representation (NumPy-style pseudocode):
# dataset.shape = (800, 51200)
M = pca(dataset)                      # M.shape = (100, 51200)
original = dataset[k, :]              # one vector corresponding to one image in your dataset
compressed_dataset = dataset @ M.T    # compressed_dataset.shape = (800, 100)
# if you have a compressed representation of an image you restore it with
restored = compressed_dataset[k, :] @ M   # restored.shape = (1, 51200)
# here you expect all_close(restored, original)
Notice that the indices are stored apart from the data vectors: PCA only transforms the columns (features), never the rows. Row k of compressed_dataset still corresponds to image k, so coordinate k still belongs to it.
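In MATLAB the same idea looks like this; a minimal sketch, assuming your images are stacked as rows in an 800 x 51200 matrix called dataset and your positions in an 800 x 2 matrix called coords (both names are placeholders):

[coeff, score] = pca(dataset);   % rows of score stay in the same order as rows of dataset
features = score(:, 1:100);      % 800 x 100 compact representation
% row k of features still describes image k, so coords(k, :) is still its position
X_train = features;              % network inputs
Y_train = coords;                % network targets, aligned row by row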

Related

Principal component analysis in matlab?

I have a training set of size size(X_Training) = 122 x 125937, where 122 is the number of features and 125937 is the sample size.
From my limited understanding, PCA is useful when you want to reduce the dimensionality of the features, meaning I should reduce 122 to a smaller number.
But when I run, in MATLAB:
X_new = pca(X_Training)
I get a matrix of size 125937x121. I am really confused, because this seems to change not only the features but also the sample size? This is a big problem for me, because I still have the target vector Y_Training that I want to use for my neural network.
Any help? Did I misinterpret the results? I only want to reduce the number of features.
Firstly, the documentation of the PCA function is useful: https://www.mathworks.com/help/stats/pca.html. It mentions that the rows are the samples while the columns are the features. This means you need to transpose your matrix first.
Secondly, you need to specify the number of dimensions to reduce to a priori. The PCA function does not do that for you automatically. Therefore, in addition to extracting the principal coefficients for each component, you also need to extract the scores as well. Once you have this, you simply subset into the scores and perform the reprojection into the reduced space.
In other words:
n_components = 10; % Change to however you see fit.
[coeff, score] = pca(X_Training.');
X_reduce = score(:, 1:n_components);
X_reduce will be the dimensionality-reduced feature set, with the number of columns equal to the number of reduced features. Also notice that the number of training examples does not change, as we expect. If you want the features to lie along the rows instead of the columns after the reduction, transpose this output matrix as well before you proceed.
Finally, if you want to automatically determine the number of features to reduce to, one method is to compute the variance explained by each principal component, then accumulate the values from the first component up to the point where some threshold is exceeded. Usually 95% is used.
Therefore, you need to provide additional output variables to capture these:
[coeff, score, latent, tsquared, explained, mu] = pca(X_Training.');
I'll let you go through the documentation to understand the other variables, but the one you're looking at is the explained variable. What you should do is find the point where the total variance explained exceeds 95%:
[~,n_components] = max(cumsum(explained) >= 95);
Finally, if you want to perform a reconstruction and see how well the reconstruction into the original feature space performs from the reduced feature, you need to perform a reprojection into the original space:
X_reconstruct = bsxfun(@plus, score(:, 1:n_components) * coeff(:, 1:n_components).', mu);
mu is the mean of each feature, stored as a row vector. You need to add this vector across all examples, so broadcasting is required, and that's why bsxfun is used. In MATLAB R2016b or newer, implicit expansion does this automatically with the plain addition operator:
X_reconstruct = score(:, 1:n_components) * coeff(:, 1:n_components).' + mu;
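Putting it all together, a sketch of the full pipeline might look like this (using find(...,1) as an equivalent way to locate the 95 % point; X_Training and Y_Training as in the question):

[coeff, score, ~, ~, explained, mu] = pca(X_Training.');   % samples in rows after transposing
n_components = find(cumsum(explained) >= 95, 1);           % smallest number of components covering 95 %
X_reduce = score(:, 1:n_components);                       % 125937 x n_components
% Y_Training is untouched: sample i of X_reduce still pairs with Y_Training(i)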

KNN Classifier for simple digit recognition

Actually, I have an assignment where I am required to recognize individual decimal digits as part of a text-recognition process. I am given a set of JPEG images of digits, each 160 x 160 pixels. After checking some resources here I managed to write the code below, but:
1) I am not sure whether reading the images and resizing them into matrices is the right way to hold them.
2) Suppose I have 30 training images for the digits 0-9 (three images per digit) and 10 test images, each showing a single digit. How do I calculate the distance between every test and training image in a loop? My Euclidean-distance code below gives an output of zero.
3) How do I calculate accuracy using a confusion matrix?
% number of train data
Train = 30;
% number of test data
Test = 10;
% to store my images
tData = uint8(zeros(160,160,30));
tTest = uint8(zeros(160,160,10));

for k = 1:Test
    s1 = 'im-';
    s2 = num2str(k);
    t = strcat('testy/im-', num2str(k), '.jpg');
    im = rgb2gray(imread(t));
    I = imresize(im, [160 160]);
    tTest(:,:,k) = I;
    % case testing if it belongs to zero
    for l = 1:3
        ss1 = 'zero-';
        ss2 = num2str(l);
        t1 = strcat('data/zero-', num2str(l), '.jpg');
        im1 = rgb2gray(imread(t1));
        I1 = imresize(im1, [160 160]);
        tData(:,:,l) = I1;
        % Euclidean distance
        distance = sqrt(sum(bsxfun(@minus, tData(:,:,k), tTest(:,:,l)).^2, 2));
        [d,index] = sort(distance);
        % k=3
        % index_close(l) = index(l:3);
        % x_close = I(index_close,:);
    end
end
First of all, I think 10 test images are not enough.
Just use the function below, where data_train is your training data and lab_train holds its labels; resize your images to smaller sizes!
I think the default distance measure is Euclidean, but you can choose other metrics, such as the city-block method:
Class = knnclassify(data_test, data_train, lab_train, 11);
fprintf('11-NN Accuracy: %f\n', sum(Class == lab_test')/length(lab_test));
Class = knnclassify(data_test, data_train, lab_train, 1, 'cityblock');
fprintf('1-NN Accuracy (cityblock): %f\n', sum(Class == lab_test')/length(lab_test));
OK, now you have the overall accuracy, but this is not a good measure; it is better to calculate the accuracy separately for each class and then take their mean.
You can calculate the accuracy of a specific class (id) like this:
idLocations = (lab_test == id);
NumberOfId = sum(idLocations);
NumberOfCorrect = sum(lab_test(idLocations) == Class(idLocations));
NumberOfCorrect / NumberOfId % accuracy for class id
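A short sketch of averaging the per-class accuracies (assuming digit labels 0-9 in lab_test, as above):

classAcc = zeros(1, 10);
for id = 0:9
    idLocations = (lab_test == id);
    classAcc(id+1) = sum(lab_test(idLocations) == Class(idLocations)) / sum(idLocations);
end
meanAccuracy = mean(classAcc); % average of the per-class accuracies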
To address your questions directly:
1) Image resizing does affect the accuracy of the whole process.
Ans: As you mentioned in your question, your images are already 160 x 160, so imresize will not change them. But if an image is too small, say 60 x 60, imresize performs interpolation to increase the spatial dimensions, which may affect the structure and shape of the digit. To tackle this kind of variability, your training data should have many more samples (at least 50 per class), and some pre-processing should be applied to the data, such as de-skewing the digit images.
2) Euclidean distance is a good measure but not the best for this kind of problem: its distribution is spherical, so it may give the same distance for two different digits. Also, if you are working in MATLAB, beware of variable casting; you are taking a difference, so both variables should be double. This may be one reason for the wrong distance calculation. Furthermore, in the line distance = sqrt(sum(bsxfun(@minus, tData(:,:,k), tTest(:,:,l)).^2, 2)); you are summing the matrix column-wise, so the output is a row vector (1 x 160) holding one sum per column. I think it should be: distance = sqrt(sum(sum(bsxfun(@minus, tData(:,:,k), tTest(:,:,l)).^2, 2))); I have just added one more sum to get the sum of the squared differences over the whole matrix; try it and see whether it helps.
3) To check the accuracy of your classifier precisely, you need a large training dataset. The confusion matrix is created during cross-validation, where you split your training data into training samples and testing samples, so you know the true class of every sample. Now perform the classification and prepare a num_classes x num_classes matrix (10 x 10 in your case), where rows correspond to actual classes and columns to predictions. Take a sample from the test set and predict its class: if your classifier predicts 5 and the sample's actual class is also 5, add 1 to confusion_matrix(5,5); if it predicts 3, add 1 to confusion_matrix(5,3). Finally, sum the diagonal elements of the confusion matrix and divide by the total number of test samples; the result is the accuracy of your classifier, as sketched below.
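As a sketch, the recipe from point 3 in MATLAB might look like this (assuming Class holds the predicted labels and lab_test the true labels, both as digits 0-9; the +1 shift maps them to valid matrix indices):

num_classes = 10;
confusion_matrix = zeros(num_classes, num_classes);
for s = 1:numel(lab_test)
    actual    = lab_test(s) + 1;   % row = actual class
    predicted = Class(s) + 1;      % column = predicted class
    confusion_matrix(actual, predicted) = confusion_matrix(actual, predicted) + 1;
end
accuracy = sum(diag(confusion_matrix)) / numel(lab_test);  % diagonal = correct predictions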
P.S. Try to have at least 50 samples per class, and during cross-validation split the training data in a 90:10 ratio, where 90% of the samples are used for training and the remaining 10% for testing the classifier.
Hope this helps.
Feel free to share your thoughts.
Thank you

Implementing Naive Bayes Nearest Neighbor (NBNN) in MATLAB

I posted this question on the CV SO a few days ago, but it has gone basically unobserved by the forum. I'm trying to implement NBNN in MATLAB to do image classification on the CIFAR-10 image dataset. The algorithm is pretty simple, and I'm confident in its correctness; however, I'm getting terrible accuracy rates of 22-28%. I'm unsure why, and I'm hoping someone with experience in image classification or the algorithm could point me in the right direction. The algorithm learns from SIFT image descriptors, which could be one of the reasons why it's under-performing. MATLAB only has a SURF feature detector, but from what I've read, SURF and SIFT are basically equivalent. I've been using the features (nDescriptors x 64) obtained from the function below to train my model.
points = detectSURFFeatures(rgb2gray(image));
[features,valid_points] = extractFeatures(rgb2gray(image),points);
Another possible issue with the CIFAR dataset and this approach is the small size of the images. Each image is 32 x 32 pixels, which, I believe, makes feature detection very difficult. I've played around with the different octave settings in detectSURFFeatures(), but nothing has brought my accuracy above 28%.
The annotated code for the approach is below. It's difficult to understand but still might be helpful.
Hopefully someone can help me out.
predLabel = []; % accumulates one predicted label per test image
for i = 3001:4000 % Train on the first 3000 instances and test on the remaining 1000.
    closeness = [];
    for j = 1:10 % loop over the 10 categories
        class = train(trainLabel==j,:); % Pull out all descriptors that belong to class j
        descriptor = test(test_id==i,:); % Pull out all descriptors for image i
        [idx,dist] = knnsearch(class,descriptor,'K',1); % Distance from each descriptor to the closest labeled descriptor in class j.
        total = sum(dist); % sum up distances
        closeness = [closeness,total]; % append to the vector of image-to-class distances.
    end
    [val,cat] = min(closeness); % Choose the class that resulted in the lowest summed distance.
    predLabel = [predLabel;cat]; % Append to the vector of predicted labels.
    i % display progress
end
If your images are 32x32 pixels, then trying to detect interest points is not a good idea. As you have observed, you would get very few features, if any. Upsampling the images is one option. Another option is to use a global descriptor like HOG (extractHOGFeatures).
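For example, a rough sketch of switching to HOG for 32 x 32 images (the cell size and the trainImages variable, assumed here to be a 32 x 32 x 3 x N array, are illustrative choices, not part of the answer):

% one fixed-length global descriptor per image, instead of a variable
% number of interest-point descriptors
sample = extractHOGFeatures(rgb2gray(trainImages(:,:,:,1)), 'CellSize', [8 8]);
numImages = size(trainImages, 4);
hogFeatures = zeros(numImages, numel(sample));
for i = 1:numImages
    hogFeatures(i,:) = extractHOGFeatures(rgb2gray(trainImages(:,:,:,i)), 'CellSize', [8 8]);
end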

computing PCA matrix for set of sift descriptors

I want to compute a general PCA matrix for a dataset, which I will use to reduce the dimensionality of SIFT descriptors. I have already found some algorithms to compute it, but I couldn't find a way to compute it in MATLAB.
Can someone help me?
[coeff, score] = princomp(X)
is the right thing to do, but knowing how to use it is a little tricky.
My understanding is that you did something like:
sift_image = sift_fun(img)
which gives you a binary image: sift_feature?
(Even if not binary, this still works.)
Inputs, formulating X:
To use princomp/pca, formulate X so that each column is a numel(sift_image) x 1 vector (i.e. sift_image(:)).
Do this for all your images and line them up as columns in X. So X will be numel(sift_image) x num_images.
If your images aren't the same size (e.g. pixel dimensions different, more or less of a scene in the images), then you'll need to bring them into some common space, which is a whole different problem.
Unless your stuff is binary, you'll probably want to de-mean/normalize X, both in the column direction (i.e. normalizing each individual image) and row direction (de-meaning the whole dataset).
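A sketch of that normalization, with X as num_pixels x num_images as described above (unit-length columns is one common choice for per-image normalization):

X = bsxfun(@rdivide, X, sqrt(sum(X.^2, 1)));   % normalize each image (column) to unit length
X = bsxfun(@minus, X, mean(X, 2));             % de-mean each pixel (row) across the dataset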
Outputs
score is the set of eigenvectors: it will be num_pixels x num_images.
To get, say the first eigen vector back into an image shape, do:
first_component = reshape(score(:,1),size(im));
And so on for the rest of the components. There are as many components as input images.
Each row of coeff is the set of num_images (equal to num_components) weights that can be applied to generate each input image, i.e.
input_image_1 = reshape(score * coeff(1,:).', size(original_im));
where input_image_1 has the correct, original shape,
coeff(1,:).' is a vector (num_images x 1), and
score is num_pixels x num_images.
(Disclaimer: I may have the columns/rows mixed up, but the descriptions are correct.)
Does that help?
If you have access to Statistics Toolbox, you can use the command princomp, or in recent versions the command pca.
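For the descriptor use case specifically, a minimal sketch with pca (assuming descriptors is an N x 128 matrix of SIFT descriptors stacked row-wise; descriptors, new_descriptors, and d are placeholder names):

d = 32;                                       % target dimensionality
[coeff, ~, ~, ~, ~, mu] = pca(descriptors);   % learn the projection on the pooled descriptors
P = coeff(:, 1:d);                            % 128 x d projection matrix
% reduce any other set of descriptors with the same learned matrix
reduced = bsxfun(@minus, new_descriptors, mu) * P;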

How can I use the princomp function of Matlab in the following case?

I have 10 images (18x18). I save these images inside an array named images[324][10], where 324 is the number of pixels in an image and 10 is the total number of images.
I would like to use these images for a neural network, but 324 is a big number to give as an input, so I would like to decrease this number while retaining as much information as possible.
I heard that you can do this with the princomp function, which implements PCA.
The problem is that I haven't found any example of how to use this function, especially for my case.
If I run
[COEFF, SCORE, latent] = princomp(images);
it runs fine but how can I then get the array newimages[number_of_desired_features][10]?
PCA could be a right choice here (but not the only one). You should be aware, though, that PCA does not reduce the number of your input data features automatically. I recommend reading this tutorial: http://arxiv.org/pdf/1404.1100v1.pdf - it is the one I used to understand PCA, and it is really good for beginners.
Getting back to your question: an image is a vector in a 324-dimensional space. In this space, the first basis vector is the one with a white pixel in the top-left corner, the next one has the next pixel white and all the others black, and so on. This is probably not the best basis vector set to represent the image data. PCA computes new basis vectors (the COEFF matrix - the new vectors expressed as values in the old vector space) and the new image vector values (the SCORE matrix). At that point you have not lost ANY data at all (no feature number reduction yet). But you could stop using some of the new basis vectors, because they are probably connected with noise, not the data itself. It is all described in detail in the tutorial.
images = rand(10,324);
[COEFF, SCORE] = princomp(images);
reconstructed_images = SCORE / COEFF + repmat(mean(images,1), 10, 1);
images - reconstructed_images
% as you see there are almost only zeros - the non-zero values are effects
% of small numerical errors; this is possible because you are only switching
% between the sets of basis vectors used to represent the data

SCORE(:, 100:324) = 0;
% we remove features 100 to 324, leaving only the first 99
% obviously, you could take only the non-zero part of the matrix and use it
% somewhere else, like for your neural network

reconstructed_images_with_reduced_features = SCORE / COEFF + repmat(mean(images,1), 10, 1);
images - reconstructed_images_with_reduced_features
% there are fewer features, but the reconstruction is still pretty good
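To answer the original question directly: the newimages array you asked for is just the first columns of SCORE, transposed back to your features x images layout (a sketch; with only 10 mean-centered images, at most 9 components are non-zero anyway):

k = 9;                         % number_of_desired_features
newimages = SCORE(:, 1:k).';   % k x 10: one column of reduced features per image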