Hidden Markov model classifying a sequence in Matlab

I'm very new to machine learning. I've read about the hidden Markov model support in Matlab's Statistics Toolbox, and I want to use it to classify a given sequence of signals. I have 3D coordinates in a matrix P, i.e. [501x3], and I want to train a model based on that. Every complete trajectory ends on a specific set of points, i.e. at (0,0,0), where it achieves its target.
What is the appropriate pseudocode/approach for my scenario?
My pseudocode:
the 501x3 matrix P is the emission matrix, where each coordinate is a state
a random NxN transition matrix (but I'm confused about its values)
generate a test sequence using the function hmmgenerate
train using hmmtrain(sequence, old_transition, old_emission)
give the final transition and emission matrices to hmmdecode with an unknown sequence to get the probability (also confusing)
EDIT 1:
In a nutshell, I want to classify 10 classes of trajectories, each of size [501x3], with an HMM. I want to sample 50 rows, i.e. [50x3], from each trajectory in order to build the model. I also have murphyk's HMM toolbox available for such sequences.

Here is a general outline of the approach to classifying d-dimensional sequences using hidden Markov models:
1) Training:
For each class k:
prepare an HMM model. This includes initializing the following:
a transition matrix: Q-by-Q matrix, where Q is the number of states
a vector of prior probabilities: Q-by-1 vector
the emission model: in your case the observations are 3D points, so you could use a multivariate normal distribution (with a specified mean vector and covariance matrix) or a Gaussian mixture model (a bunch of MVN distributions combined using mixture coefficients)
after properly initializing the above parameters, you train the HMM model, feeding it the set of sequences belonging to this class (EM algorithm).
2) Prediction
Next, to classify a new sequence X:
you compute the log-likelihood of the sequence under each model, log P(X|model_k)
then you pick the class that gave the highest likelihood; this is the class prediction (see the sketch below).
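For example, with Kevin Murphy's HMM toolbox, the prediction step might look like the following sketch (mhmm_logprob is the toolbox's log-likelihood routine; X is an assumed d-by-T test sequence, and models{k} is an assumed cell array holding each class's trained parameters):
% Hedged sketch: pick the class whose trained HMM assigns X the highest log-likelihood
numClasses = 10;
ll = -inf(numClasses, 1);
for k = 1:numClasses
    m = models{k};   % assumed struct of trained parameters for class k
    ll(k) = mhmm_logprob(X, m.prior, m.transmat, m.mu, m.Sigma, m.mixmat);
end
[~, predictedClass] = max(ll);   % class prediction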
As I mentioned in the comments, the Statistics Toolbox only implements discrete-observation HMM models, so you will have to find another library or implement the code yourself. Kevin Murphy's toolboxes (HMM toolbox, BNT, PMTK3) are popular choices in this domain.
Here are some answers I posted in the past using Kevin Murphy's toolboxes:
Issue in training hidden markov model and usage for classification
Simple example/use-case for a BNT gaussian_CPD
The above answers address somewhat different tasks from what you are trying to do here, but they are a good place to start.

The task is to build and train a hidden Markov model with the following components, using murphyk's HMM toolbox:
O = number of coefficients in an observation vector
Q = number of states
T = number of vectors in a sequence
nex = number of sequences
M = number of mixtures
Demo Code (from murphyk's toolbox):
O = 8;   % number of coefficients in a vector
T = 420; % number of vectors in a sequence
nex = 1; % number of sequences
M = 1;   % number of mixtures
Q = 6;   % number of states
data = randn(O, T, nex);   % synthetic training data, O-by-T-by-nex

% initial guess of parameters
prior0 = normalise(rand(Q,1));
transmat0 = mk_stochastic(rand(Q,Q));

if 0   % manual initialization (disabled branch)
    Sigma0 = repmat(eye(O), [1 1 Q M]);
    % initialize each mean to a random data point
    indices = randperm(T*nex);
    mu0 = reshape(data(:,indices(1:(Q*M))), [O Q M]);
    mixmat0 = mk_stochastic(rand(Q,M));
else   % initialization via a Gaussian-mixture fit
    [mu0, Sigma0] = mixgauss_init(Q*M, data, 'full');
    mu0 = reshape(mu0, [O Q M]);
    Sigma0 = reshape(Sigma0, [O O Q M]);
    mixmat0 = mk_stochastic(rand(Q,M));
end

% EM training, then the log-likelihood of a sequence under the trained model
[LL, prior1, transmat1, mu1, Sigma1, mixmat1] = ...
    mhmm_em(data, prior0, transmat0, mu0, Sigma0, mixmat0, 'max_iter', 5);
loglik = mhmm_logprob(data, prior1, transmat1, mu1, Sigma1, mixmat1);
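To adapt this to the trajectories in the question, a hedged mapping of the dimensions would be O = 3 (the x, y, z coordinates), T = 50 (sampled rows per trajectory), and nex = the number of example trajectories available for a class; trajectories below is an assumed cell array of [50x3] samples:
% Hedged adaptation (assumed names): one model per trajectory class
O = 3;      % coordinates per observation (x, y, z)
T = 50;     % sampled points per trajectory
nex = 5;    % example trajectories for this class (assumed count)
data = zeros(O, T, nex);
for i = 1:nex
    data(:,:,i) = trajectories{i}';   % trajectories{i}: assumed [50x3] matrix
end
% ...then initialise prior0/transmat0/mu0/Sigma0/mixmat0 and call mhmm_em
% exactly as in the demo above, once per class.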

Related

Multiclass Logistic Regression ROC Curves in MATLAB

I have 7 classes within my training examples (labeled 1-7). I'm running logistic regression and I want to create my ROC curve for each of my classes.
To train my model and make a prediction, I have the following code:
Theta = zeros(k, n+1); % initialize theta
[Theta, costs] = gradientDescent(Theta, @(t)(CostFunc(t, X, Y, lambda)), ...
    @(t)(DerivOfCostFunc(t, X, Y, lambda)), alpha, iter_num);
%Make prediction with trained model
[scores,prediction] = predict(Theta, X_test); %X_test is the design matrix (ones on the first col)
Within the predict script, I have
scores = g(X*all_theta'); %this is the sigmoid function
[p_max, IndexOfMax]=max(scores, [], 2);
prediction = IndexOfMax;
Note that scores is an m-by-k matrix, where m is the number of training examples and k is the number of classes. prediction is an m-by-1 vector with values from 1 to 7, based on the predicted class.
To create the ROC curve, for class 3 for example:
classNum=3;
for i=1:size(scores,1)
temp=scores(i,:);
diffscore(i,:)=temp(classNum)-max([temp(:,1:classNum-1),temp(:,classNum+1:end)]);
end
I did this last part because I read that I had to treat class 3 as positive and all the other classes as negative.
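For reference, a vectorized equivalent of that loop (the same one-vs-rest margin, computed without a loop):
otherScores = scores;
otherScores(:, classNum) = -Inf;   % exclude the positive class from the max
diffscore = scores(:, classNum) - max(otherScores, [], 2);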
At last, I made my curve with the following code:
[xROC,yROC,~,auc] = perfcurve(y_test,diffscore,classNum);
%y_test contains my true labels, m by 1 column vector
However, when running the ROC curve for each of my classes, I get the same plot for all of them, each with an AUC of 1. Based on some analysis, I know this is not correct, but I can't figure out where in the code I went wrong. Is there additional code I should add, or do I need to modify any of my existing code?

MATLAB fitcSVM weight vector

I am training a linear SVM classifier with the fitcsvm function in MATLAB:
cvFolds = crossvalind('Kfold', labels, nrFolds);
for i = 1:nrFolds                 % iterate through each fold
    testIdx = (cvFolds == i);     % indices of test instances
    trainIdx = ~testIdx;          % indices of training instances
    cl = fitcsvm(features(trainIdx,:), ...
        labels(trainIdx), 'KernelFunction', kernel, 'Standardize', true, ...
        'BoxConstraint', C, 'ClassNames', [0,1], 'Solver', solver);
    [labelPred, scores] = predict(cl, features(testIdx,:));
    eq = sum(labelPred == labels(testIdx));
    accuracy(i) = eq / numel(labels(testIdx));
end
As visible from this part of the code, the trained SVM model is stored in cl. Checking the model parameters in cl, I do not see which parameter corresponds to the classifier weights, i.e. the parameters of a linear classifier that reflect the importance of each feature. Which parameter represents the classification weights? The MATLAB documentation says "The vector β contains the coefficients that define an orthogonal vector to the hyperplane" - does cl.Beta hence represent the classification weights?
As you can see in the documentation, the equation of the hyperplane in fitcsvm is
f(x) = x'β + b = 0
which matches the familiar form
f(x) = w*x + b = 0 (or f(x) = x*w + b = 0)
So β is equal to w, the weight vector.
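As a quick check (a sketch; Beta, Bias, Mu and Sigma are documented ClassificationSVM properties):
w = cl.Beta;   % d-by-1 weight vector, one coefficient per feature
b = cl.Bias;   % scalar bias
% Note: with 'Standardize',true these apply to the standardized features;
% the model's Mu and Sigma properties hold the standardization parameters.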

Fitting Gaussian Mixture Model

I have six bivariate normal distributions and I want to combine them as a Gaussian mixture model. I calculated the mean and covariance matrices below. When I sample random data (mvnrnd) for the given distribution parameters, gmdistribution.fit gives different results for different sample sizes. In other words, random sample sizes n=50 and n=1000 converge to different Gaussian distributions. My underlying data contains 30 samples for each cluster. So what is the best way to fit a Gaussian mixture model to my data? Any ideas?
mu1=[log(0.29090) log(0.0038)]
mu2=[log(0.4017) log(0.0053)]
mu3=[log(0.4477) log(0.0051)]
mu4=[log(0.5396) log(0.0072)]
mu5=[log(0.6881) log(0.0090)]
mu6=[log(0.8091) log(0.0099)]
cov1=[0.052 0.0011;0.0011 0.044]
cov2=[0.054 0.0010;0.0010 0.078]
cov3=[0.126 0.011;0.011 0.23]
cov4=[0.092 0.0061;0.0061 0.12]
cov5=[0.113 0.0092;0.0092 0.14]
cov6=[0.1047 0.0217;0.0217 0.35]
X = [mvnrnd(mu1,cov1,50);mvnrnd(mu2,cov2,50);mvnrnd(mu3,cov3,50);mvnrnd(mu4,cov4,50);mvnrnd(mu5,cov5,50);mvnrnd(mu6,cov6,50)];
scatter(X(:,1),X(:,2),'g')
options = statset('MaxIter',200,'Display','final','TolFun',1e-6)
obj = gmdistribution.fit(X,6,'Options',options)
hold on
ezcontour(@(x,y) pdf(obj,[x y]), [-2.5 1], [-7 -2.5], 300);
hold off
ezsurfc(@(x,y) pdf(obj,[x y]))
x = -2.5:0.1:1.5; y = -7.0:0.1:-3; n = length(x); gaussPDF = zeros(n,n);
for i = 1:n
    for j = 1:n
        gaussPDF(i,j) = pdf(obj, [x(i) y(j)]);
    end
end
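One hedged way to stabilize the EM fit on small samples is to seed gmdistribution.fit with the known parameters via its documented 'Start' option (a sketch using the variables defined above):
S.mu = [mu1; mu2; mu3; mu4; mu5; mu6];                % 6-by-2 initial means
S.Sigma = cat(3, cov1, cov2, cov3, cov4, cov5, cov6); % 2-by-2-by-6 covariances
S.PComponents = ones(1,6)/6;                          % equal mixing proportions
obj = gmdistribution.fit(X, 6, 'Start', S, 'Options', options);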

MATLAB weighted resampling

I'm writing a particle filter localization algorithm as part of an exercise to locate a plane flying over mountains.
From my understanding, the steps to this are:
- make a bunch of random guesses
- filter out unlikely guesses (using Gaussian hypothesis testing and some known information about the problem)
- shift filtered points by how much the plane moved in that step
- resample, weighted by shifted points
What I'm having trouble with is the resampling bit - how could I perform a weighted resampling in MATLAB?
Please let me know if there's anything I should clarify!! Thanks!
Firstly, you should look into the SIR (Sequential Importance Sampling Re-sampling) particle filter [PF] (also known by the name Sequential Monte Carlo methods).
I recommend the book by Arnaud Doucet & Neil Gordon, "Sequential Monte Carlo Methods in Practice". It contains practically the state of the art when it comes to particle filters and describes implementations of the various flavors of the PF.
The SIR-PF has the following steps:
Prediction: based on your state equations and the previous particle population, propagate the particles to the next discrete time instance, i.e. x(t+1) = f(x(t), w(t)), where x is a vector of n states and for each state you have N realisations (particles), e.g. x ~ [N x n].
Correction: based on your measurement equations, which should be of the form y(t+1) = g(x(t+1), v(t)), where x(t+1) is your state population of particles. You calculate the error e(t) = y(t+1) - y_m(t+1) and weight the population according to a likelihood function, which can be, but does not necessarily have to be, a normal distribution. You will now have a set of weights; e.g. if you have m "sensors" you will have a weighting matrix W = [N x m], or in the simple case an [N x 1] vector of weights (remember to normalise the weights).
Re-sampling (conditional): this step should be gated by a condition, to avoid the pitfall of particle degeneracy (which you should look into). The common condition is to compute the "effective particle population size", i.e. one over the sum of the squared (normalised) weights, Neff = 1/(w1^2 + w2^2 + ... + wN^2). If Neff < 0.85*N, then resample.
Re-sampling: calculate the CDF of the (normalised) weights vector, i.e. P = cumsum(W), and draw random samples r from a uniform distribution; select the first particle for which P(w) >= r. Repeat this until you have N realisations of the CDF. This samples more frequently from the particles that have higher weights and less frequently from those that do not, effectively condensing your particle population. You then create a new set of uniform weights, i.e. wN = 1/N.
function [weights, X_update] = Standardised_Resample(P, X)
% P: particle weights (N-by-1), X: particle states (N-by-1)
P = P./sum(P);          % ensure particle weights are normalised
Neff = 1/sum(P.^2);     % effective particle population size
N = size(P,1);
if Neff < 0.85*N        % resample only when degeneracy is detected
    X_update = zeros(N,1);
    L = cumsum(P);      % CDF of the weights
    for i = 1:N
        X_update(i) = X(find(rand <= L, 1));   % inverse-CDF draw
    end
    weights = ones(N,1)./N;   % reset to uniform weights
else
    weights = P;
    X_update = X;
end
end
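A minimal usage sketch (assumed synthetic weights and states):
N = 1000;
X = randn(N,1);   % particle states
w = rand(N,1);    % unnormalised weights
[w1, X1] = Standardised_Resample(w, X);   % w1 is uniform if resampling fired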
Estimation: XEst = W(t+1)'*x(t+1), i.e. the weighted sum of the particles produces the estimate for the states at time t+1.
Rinse and Repeat for time t+2 etc.
Note: x(0|0) is a population of N samples drawn from ~N(x(0), Q(0|0)), where x(0) is an estimate of the initial conditions [IC] and Q(0|0) is an estimate of the variance (uncertainty) of your IC guess (see the initialization sketch below).
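A hedged sketch of that initialization for a scalar state (x0 and Q0 are assumed IC estimates):
N = 1000;                               % particle population size
x0 = 0; Q0 = 1;                         % assumed IC estimate and its variance
particles = x0 + sqrt(Q0)*randn(N,1);   % N draws from ~N(x0, Q0)
weights = ones(N,1)/N;                  % start with uniform weights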

MatLab: Fisher Linear Discriminant K > 2

I am trying to implement Fisher's linear discriminant function in MATLAB for K (classes) > 2, and I am not really sure of the algorithm for the K > 2 scenario. I know MATLAB has built-in functions, but I want to implement this without using them.
It would be great if someone could explain the algorithm.
Here is some sample pseudocode:
N = number of cases
c = number of classes
d = number of features
Priors = vector of prior probabilities for each case per class
Targets = target labels for each case per class
dimension of Data = Features x Cases (d x N)
Get target labels for each data point:
T = Targets(:,Cases); % Target labels for each case
Calculate the mean vector per class and the common covariance matrix:
classifier.u = [mean(Data(:,(T(1,:)==1)),2), mean(Data(:,(T(2,:)==1)),2), ..., mean(Data(:,(T(c,:)==1)),2)]; % d-by-c matrix of class means
classifier.invCV = inv(cov(Data')); % inverse of the common covariance matrix
Get discriminant value using class mean vectors and common covariance matrix:
A1=classifier.u;
B1=classifier.invCV;
D = A1'*B1*Data-0.5*(A1'*B1.*A1')*ones(d,N)+log(Priors(:,Cases));
The function will produce c discriminant values; each case is then assigned to the class with the largest discriminant value.
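A minimal runnable sketch of the pseudocode above, on assumed synthetic data:
% Minimal sketch (assumed synthetic data): linear discriminants for c classes
d = 2; c = 3; N = 300;                     % features, classes, cases
labels = randi(c, 1, N);                   % class label per case
Data = randn(d, N) + labels;               % d-by-N data with class-shifted means
u = zeros(d, c);
for k = 1:c
    u(:,k) = mean(Data(:, labels==k), 2);  % class mean vectors
end
invCV = inv(cov(Data'));                   % inverse of the common covariance
Priors = histcounts(labels, 1:c+1)'/N;     % c-by-1 prior probabilities
D = u'*invCV*Data - 0.5*sum((u'*invCV).*u', 2) + log(Priors);  % c-by-N scores
[~, predicted] = max(D, [], 1);            % assign each case to the max score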