I want to do PCA and then MDA (Multiple Discriminative Analysis) in order to reduce the dimensions of the dataset from 99^2 to 49 (face recognition).
My first step was reducing dimensions from 99^2 to 50 by PCA. Now I want to use MDA to reduce from c to c-1 -> from 50 to 49.
I've tried this code but I get complex values in the 'Answer', which is wrong.
% calculate PCA
mat_mean = mean(trainData(:));
normalized_train = trainData - mat_mean;
A = normalized_train/std(normalized_train(:));
S1 = A * A';
[V,Z] = eigs(S2,50);
Wpca = A'*V*Z;
% calculate MDA
[Sb,Sw] = scattermat(Wpca);
Sb1=Wpca*Sb*Wpca';
Sw1=Wpca*Sw*Wpca';
[Answer,ready1] = eigs(Sb1,Sw1,49);
Any suggestions what am I doing wrong?
The reason is that "eigs" calculates the eigenvalues of the matrix, which includes SQRT in it... and I have negative values in Sb,Sw
Related
I want to get the probability to get a value X higher than x_i, which means the cumulative distribution functions CDF. P(X>=x_i).
I've tried to do it in Matlab with this code.
Let's assume the data is in the column vector p1.
xp1 = linspace(min(p1), max(p1)); %range of bins
histp1 = histc(p1(:), xp1); %histogram od data
probp1 = histp1/sum(histp1); %PDF (probability distribution function)
`figure;plot(probp1, 'o') `
Now I want to calculate the CDF,
sorncount = flipud(histp1);
cumsump1 = cumsum(sorncount);
normcumsump1 = cumsump1/max(cumsump1);
cdf = flipud(normcumsump1);
figure;plot(xp1, cdf, 'ok');
I'm wondering whether anyone can help me to know if I'm ok or am I doing something wrong?
Your code works correctly, but is a bit more complicated than it could be. Since probp1 has been normalized to have sum equal to 1, the maximum of its cumulative sum is guaranteed to be 1, so there is no need to divide by this maximum. This shortens the code a bit:
xp1 = linspace(min(p1), max(p1)); %range of bins
histp1 = histc(p1(:), xp1); %count for each bin
probp1 = histp1/sum(histp1); %PDF (probability distribution function)
cdf = flipud(cumsum(flipud(histp1))); %CDF (unconventional, of P(X>=a) kind)
As Raab70 noted, most of the time CDF is understood as P(X<=a), in which case you don't need flipud: taking cumsum(histp1) is all that's needed.
Also, I would probably use histp1(end:-1:1) instead of flipud(histp1), so that the vector is flipped no matter if it's a row or column.
I have a matrix 64x64x32x90 which stands for pixels at x,y,z, at time t.
I have a reference signal 1x90 which stands for the behavior I expect for a pixel at some point (x,y,z).
I am constructing a new image of the correlation between each pixel versus my reference.
load('DATA.mat');
ON = ones(1,10);
OFF = zeros(1,10);
taskRef = [OFF ON OFF ON OFF ON OFF ON OFF];
corrImage = zeros(64,64,36);
for i=1:64,
for j=1:63,
for k=1:36
signal = squeeze(DATA(i,j,k,:));
coef = corrcoef(signal',taskRef);
corrImage(i,j,k) = coef(2);
end
end
end
My process is too slow. Is there a way to get rid of my loops or adjust the code to have a better runtime?
Reshape your data so that its first three dimensions are collapsed into one (so now there are 64*64*32 rows and 90 columns).
Then use pdist2 (with 'correlation' option) to compute the correlation of each row with the expected pattern.
Finally, reshape result into the desired shape.
DATA2 = reshape(DATA, [],90);
corrImage = 1 - pdist2(DATA2, taskRef, 'correlation');
corrImage = reshape(corrImage, 64,64,32);
i am trying to calculate the inverse fourier transform of the vector XRECW. for some reason i get a vector of NANs.
please help!!
t = -2:1/100:2;
x = ((2/5)*sin(5*pi*t))./((1/25)-t.^2);
w = -20*pi:0.01*pi:20*pi;
Hw = (exp(j*pi.*(w./(10*pi)))./(sinc(w./(10*pi)))).*(heaviside(w+5*pi)-heaviside(w-5*pi));%low pass filter
xzohw = 0;
for q=1:20:400
xzohw = xzohw + x(q).*(2./w).*sin(0.1.*w).*exp(-j.*w*0.2*((q-1)/20)+0.5);%calculating fourier transform of xzoh
end
xzohw = abs(xzohw);
xrecw = abs(xzohw.*Hw);%filtering the fourier transform high frequencies
xrect=0;
for q=1:401
xrect(q) = (1/(2*pi)).*trapz(xrecw.*exp(j*w*t(q))); %inverse fourier transform
end
xrect = abs(xrect);
plot(t,xrect)
Here's a direct answer to your question of "why" there is a nan. If you run your code, the Nan comes from dividing by zero in line 7 for computing xzohw. Notice that w contains zero:
>> find(w==0)
ans =
2001
and you can see in line 7 that you divide by the elements of w with the (2./w) factor.
A quick fix (although it is not a guarantee that your code will do what you want) is to avoid including 0 in w by using a step which avoids zero. Since pi is certainly not divisible by 100, you can try taking steps in .01 increments:
w = -20*pi:0.01:20*pi;
Using this, your code produces a plot which might resemble what you're looking for. In order to do better, we might need more details on exactly what you're trying to do, or what these variables represent.
Hope this helps!
I am trying to implement Naive Bayes Classifier using a dataset published by UCI machine learning team. I am new to machine learning and trying to understand techniques to use for my work related problems, so I thought it's better to get the theory understood first.
I am using pima dataset (Link to Data - UCI-ML), and my goal is to build Naive Bayes Univariate Gaussian Classifier for K class problem (Data is only there for K=2). I have done splitting data, and calculate the mean for each class, standard deviation, priors for each class, but after this I am kind of stuck because I am not sure what and how I should be doing after this. I have a feeling that I should be calculating posterior probability,
Here is my code, I am using percent as a vector, because I want to see the behavior as I increase the training data size from 80:20 split. Basically if you pass [10 20 30 40] it will take that percentage from 80:20 split, and use 10% of 80% as training.
function[classMean] = naivebayes(file, iter, percent)
dm = load(file);
for i=1:iter
idx = randperm(size(dm.data,1))
%Using same idx for data and labels
shuffledMatrix_data = dm.data(idx,:);
shuffledMatrix_label = dm.labels(idx,:);
percent_data_80 = round((0.8) * length(shuffledMatrix_data));
%Doing 80-20 split
train = shuffledMatrix_data(1:percent_data_80,:);
test = shuffledMatrix_data(percent_data_80+1:length(shuffledMatrix_data),:);
train_labels = shuffledMatrix_label(1:percent_data_80,:)
test_labels = shuffledMatrix_data(percent_data_80+1:length(shuffledMatrix_data),:);
%Getting the array of percents
for pRows = 1:length(percent)
percentOfRows = round((percent(pRows)/100) * length(train));
new_train = train(1:percentOfRows,:)
new_trin_label = shuffledMatrix_label(1:percentOfRows)
%get unique labels in training
numClasses = size(unique(new_trin_label),1)
classMean = zeros(numClasses,size(new_train,2));
for kclass=1:numClasses
classMean(kclass,:) = mean(new_train(new_trin_label == kclass,:))
std(new_train(new_trin_label == kclass,:))
priorClassforK = length(new_train(new_trin_label == kclass))/length(new_train)
priorClassforK_1 = 1 - priorClassforK
end
end
end
end
First, compute the probability of evey class label based on frequency counts. For a given sample of data and a given class in your data set, you compute the probability of evey feature. After that, multiply the conditional probability for all features in the sample by each other and by the probability of the considered class label. Finally, compare values of all class labels and you choose the label of the class with the maximum probability (Bayes classification rule).
For computing conditonal probability, you can simply use the Normal distribution function.
I'm currently working on classifying images with different image-descriptors. Since they have their own metrics, I am using precomputed kernels. So given these NxN kernel-matrices (for a total of N images) i want to train and test a SVM. I'm not very experienced using SVMs though.
What confuses me though is how to enter the input for training. Using a subset of the kernel MxM (M being the number of training images), trains the SVM with M features. However, if I understood it correctly this limits me to use test-data with similar amounts of features. Trying to use sub-kernel of size MxN, causes infinite loops during training, consequently, using more features when testing gives poor results.
This results in using equal sized training and test-sets giving reasonable results. But if i only would want to classify, say one image, or train with a given amount of images for each class and test with the rest, this doesn't work at all.
How can i remove the dependency between number of training images and features, so i can test with any number of images?
I'm using libsvm for MATLAB, the kernels are distance-matrices ranging between [0,1].
You seem to already have figured out the problem... According to the README file included in the MATLAB package:
To use precomputed kernel, you must include sample serial number as
the first column of the training and testing data.
Let me illustrate with an example:
%# read dataset
[dataClass, data] = libsvmread('./heart_scale');
%# split into train/test datasets
trainData = data(1:150,:);
testData = data(151:270,:);
trainClass = dataClass(1:150,:);
testClass = dataClass(151:270,:);
numTrain = size(trainData,1);
numTest = size(testData,1);
%# radial basis function: exp(-gamma*|u-v|^2)
sigma = 2e-3;
rbfKernel = #(X,Y) exp(-sigma .* pdist2(X,Y,'euclidean').^2);
%# compute kernel matrices between every pairs of (train,train) and
%# (test,train) instances and include sample serial number as first column
K = [ (1:numTrain)' , rbfKernel(trainData,trainData) ];
KK = [ (1:numTest)' , rbfKernel(testData,trainData) ];
%# train and test
model = svmtrain(trainClass, K, '-t 4');
[predClass, acc, decVals] = svmpredict(testClass, KK, model);
%# confusion matrix
C = confusionmat(testClass,predClass)
The output:
*
optimization finished, #iter = 70
nu = 0.933333
obj = -117.027620, rho = 0.183062
nSV = 140, nBSV = 140
Total nSV = 140
Accuracy = 85.8333% (103/120) (classification)
C =
65 5
12 38