I have six bivariate normal distributions and I want to combine them as a Gaussian mixture model. I calculated the mean and covariance matrices below. When I sample random data (mvnrnd) for the given distribution parameters, gmdistribution.fit gives different results for different sample sizes. In other words, random sampling sizes n=50 and n=1000 converge to different Gaussian distributions. My underlying data contains 30 samples for each cluster. So what is the best way to fit a Gaussian mixture model to my data? Any ideas?
mu1=[log(0.29090) log(0.0038)]
mu2=[log(0.4017) log(0.0053)]
mu3=[log(0.4477) log(0.0051)]
mu4=[log(0.5396) log(0.0072)]
mu5=[log(0.6881) log(0.0090)]
mu6=[log(0.8091) log(0.0099)]
cov1=[0.052 0.0011;0.0011 0.044]
cov2=[0.054 0.0010;0.0010 0.078]
cov3=[0.126 0.011;0.011 0.23]
cov4=[0.092 0.0061;0.0061 0.12]
cov5=[0.113 0.0092;0.0092 0.14]
cov6=[0.1047 0.0217;0.0217 0.35]
X = [mvnrnd(mu1,cov1,50);mvnrnd(mu2,cov2,50);mvnrnd(mu3,cov3,50);mvnrnd(mu4,cov4,50);mvnrnd(mu5,cov5,50);mvnrnd(mu6,cov6,50)];
scatter(X(:,1),X(:,2),'g')
options = statset('MaxIter',200,'Display','final','TolFun',1e-6)
obj = gmdistribution.fit(X,6,'Options',options)
hold on
ezcontour(@(x,y)pdf(obj,[x y]),[-2.5 1],[-7 -2.5],300);
hold off
ezsurfc(@(x,y) pdf(obj,[x y]))
x = -2.5:0.1:1.5; y = -7.0:0.1:-3;
nx = length(x); ny = length(y);
gaussPDF = zeros(nx,ny);   % preallocate the result
for i = 1:nx
    for j = 1:ny
        gaussPDF(i,j) = pdf(obj,[x(i) y(j)]);
    end
end
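If the loop is only meant to evaluate the fitted mixture pdf over the whole x-y grid, a vectorized sketch (an alternative to the loop, not part of the original code) would be:
[X1, X2] = meshgrid(x, y);   % note: rows correspond to y, columns to x
gridPDF = reshape(pdf(obj, [X1(:) X2(:)]), size(X1));
surf(X1, X2, gridPDF);       % quick visual check of the fitted density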
I am trying to fit a Poisson function to a histogram in MATLAB: the example calls for using hist() (which is deprecated), so I want to use histogram() instead, especially as you cannot seem to normalize a hist(). I then want to overlay a Poisson function on it using poisspdf() or any other standard function (preferably no toolboxes!). The histogram is probability-scaled, which is where the issue with the Poisson function comes from, AFAIK.
clear
clc
lambda = 5;
range = 1000;
rangeVec = 1:range;
randomData = poissrnd(lambda, 1, range);
histoFigure = histogram(randomData, 'Normalization', 'probability');
hold on
poissonFunction = poisspdf(randomData, lambda);
poissonFunction2 = poisspdf(histoFigure, lambda);
plot(poissonFunction)
plot(poissonFunction2)
I have tried multiple methods of creating the Poisson function and plotting it, and neither of them seems to work: the values of this function are not consistent with the histogram values, as they differ by several decimal places.
This is what the image should look like;
however, currently I can only get the bar graph to show up correctly.
You're not specifying the x-data of your curve, so the sample index is used, and since you have 1000 samples you get the ugly plot. The x-data you should use is randomData. Using
plot(randomData, poissonFunction)
will lead to lines between different samples, because the samples follow each other randomly. To take each sample only once, you can use unique. It is important that the x and y values stay connected to each other, so it's best to put randomData and poissonFunction in 1 matrix, and then use unique:
d = [randomData;poissonFunction].'; % make 1000x2 matrix to find unique among rows
d = unique(d,'rows');
You can use d to plot the data.
Full code:
clear
clc
lambda = 5;
range = 1000;
rangeVec = 1:range;
randomData = poissrnd(lambda, 1, range);
histoFigure = histogram(randomData, 'Normalization', 'probability');
hold on
poissonFunction = poisspdf(randomData, lambda);
d = [randomData; poissonFunction].';
d = unique(d, 'rows');
plot(d(:,1), d(:,2))
The result is the histogram with the Poisson pmf plotted on top of it.
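As an alternative sketch (assuming you only need the theoretical pmf, which is defined on the integers, rather than values at the sampled points):
k = 0:max(randomData);               % integer support covered by the samples
plot(k, poisspdf(k, lambda), 'r-o')  % theoretical Poisson pmf over the histogram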
I want to generate simulated values that follow the distribution of my empirical data. I also have to keep in mind the skewness and kurtosis of the distribution, and these have to be reflected in the simulated values.
My empirical values are past stock returns (not a standard normal distribution).
Is there an existing package that will do this for me? All the packages I see online only match the first two moments.
What you're describing is using the method of moments to define a distribution. Such methods have generally fallen out of favor in statistics. However, you can check out pearsrnd, which may work fine depending on your data.
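For completeness, a minimal sketch of that route, assuming your empirical returns are in a vector y and the Statistics Toolbox functions skewness and kurtosis are available (the sample size 1000 is arbitrary):
mu    = mean(y);        % sample mean
sigma = std(y);         % sample standard deviation
skew  = skewness(y);    % sample skewness
kurt  = kurtosis(y);    % sample kurtosis (normal = 3, as pearsrnd expects)
r = pearsrnd(mu, sigma, skew, kurt, 1000, 1);   % 1000 simulated returns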
Instead, I'd suggest directly finding the empirical CDF for the data using ecdf and use that in conjunction with inverse sampling to generate random variates. Here's a basic function that will do that:
function r=empiricalrnd(y,varargin)
%EMPIRICALRND Random values from an empirical distribution
% EMPIRICALRND(Y) returns a single random value from distribution described by the data
% in the vector Y.
%
% EMPIRICALRND(Y,M), EMPIRICALRND(Y,M,N), EMPIRICALRND(Y,[M,N]), etc. return random arrays.
[f,x] = ecdf(y);
r = rand(varargin{:});
r = interp1(f,x,r,'linear','extrap');
You can play with the options for interp1 if you like. And here's a quick test:
% Generate demo data for 100 samples of log-normal distribution
mu = 1;
sig = 1;
m = 1e2;
rng(1); % Set seed to make repeatable
y = lognrnd(mu,sig,m,1);
% Generate 1000 random variates from data
n = 1e3;
r = empiricalrnd(y,n,1);
% Plot CDFs
ecdf(y);
hold on;
ecdf(r);
x = 0:0.1:50;
plot(x,logncdf(x,mu,sig),'g');
legend('Empirical CDF','CDF of sampled data','Actual CDF');
I've made a GMModel using fitgmdist. The idea is to fit two Gaussian distributions to the data and use them to predict labels. How can I determine whether a future data point fits into one of those distributions? Am I misunderstanding the purpose of a GMModel?
clear;
load C:\Users\Daniel\Downloads\data1 data;
% Mixed Gaussian
GMModel = fitgmdist(data(:, 1:4),2)
Produces
GMModel =
Gaussian mixture distribution with 2 components in 4 dimensions
Component 1:
Mixing proportion: 0.509709
Mean: 2.3254 -2.5373 3.9288 0.4863
Component 2:
Mixing proportion: 0.490291
Mean: 2.5161 -2.6390 0.8930 0.4833
Edit:
clear;
load C:\Users\Daniel\Downloads\data1 data;
% Mixed Gaussian
GMModel = fitgmdist(data(:, 1:4),2);
P = posterior(GMModel, data(:, 1:4));
X = round(P)
blah = X(:, 1)
dah = data(:, 5)
Y = max(mean(blah == dah), mean(~blah == dah))
I don't understand why you round the posterior values. Here is what I would do after fitting a mixture model.
P = posterior(GMModel, data(:, 1:4));
[~,Y] = max(P,[],2);
Now Y contains the labels, i.e. the index of the Gaussian component each data point belongs to according to the maximum a posteriori (MAP) rule. One important thing to do is to align the labels before evaluating the classification error, since relabeling can happen: Gaussian component 1 in the true labeling might be component 2 in the clustering produced, and so on. That may be why you are getting accuracy varying from 51% to 95%, in addition to other subtle problems.
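To address the original question about future data points: once the model is fitted, a new observation can be assigned to a component in the same way. A minimal sketch (newPoint is hypothetical; it just needs to be a 1-by-4 row in the same feature space):
newPoint = data(1, 1:4);               % hypothetical new observation
pNew = posterior(GMModel, newPoint);   % posterior probability of each component
[~, labelNew] = max(pNew, [], 2);      % MAP component assignment
% equivalently: labelNew = cluster(GMModel, newPoint);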
I'm very new to machine learning. I've read about MATLAB's Statistics Toolbox for hidden Markov models, and I want to classify a given sequence of signals using it. I have 3D coordinates in a matrix P, i.e. [501x3], and I want to train a model based on that. Every complete trajectory ends on a specific set of points, i.e. at (0,0,0), where it achieves its target.
What is the appropriate pseudocode/approach for my scenario?
My pseudocode:
the 501x3 matrix P is the emission matrix, where each coordinate is a state
random NxN transition matrix values (but I'm confused about this)
generate a test sequence using the function hmmgenerate
train using hmmtrain(sequence, old_transition, old_emission)
give the final transition and emission matrices to hmmdecode with an unknown sequence to obtain the probability (also confusing)
EDIT 1:
In a nutshell, I want to classify 10 classes of trajectories, each of size [501x3], with an HMM. I want to sample 50 rows, i.e. [50x3], from each trajectory in order to build the model. I have murphyk's HMM toolbox for such random sequences.
Here is a general outline of the approach to classifying d-dimensional sequences using hidden Markov models:
1) Training:
For each class k:
prepare an HMM model. This includes initializing the following:
a transition matrix: Q-by-Q matrix, where Q is the number of states
a vector of prior probabilities: Q-by-1 vector
the emission model: in your case the observations are 3D points, so you could use a multivariate normal distribution (with a specified mean vector and covariance matrix) or a Gaussian mixture model (a set of MVN distributions combined using mixture coefficients)
after properly initializing the above parameters, you train the HMM model, feeding it the set of sequences belonging to this class (using the EM algorithm).
2) Prediction
Next, to classify a new sequence X:
you compute the log-likelihood of the sequence under each model: log P(X|model_k)
then you pick the class that gave the highest probability. This is the class prediction.
As I mentioned in the comments, the Statistics Toolbox only implements discrete-observation HMM models, so you will have to find other libraries or implement the code yourself. Kevin Murphy's toolboxes (HMM toolbox, BNT, PMTK3) are popular choices in this domain.
Here are some answers I posted in the past using Kevin Murphy's toolboxes:
Issue in training hidden markov model and usage for classification
Simple example/use-case for a BNT gaussian_CPD
The above answers are somewhat different from what you are trying to do here, but it's a good place to start.
The task is to build and train a hidden Markov model with the following components, specifically using murphyk's HMM toolbox:
O = number of coefficients in an observation vector
Q = number of states
T = number of vectors in a sequence
nex = number of sequences
M = number of mixtures
Demo Code (from murphyk's toolbox):
O = 8; %Number of coefficients in a vector
T = 420; %Number of vectors in a sequence
nex = 1; %Number of sequences
M = 1; %Number of mixtures
Q = 6; %Number of states
data = randn(O,T,nex);
% initial guess of parameters
prior0 = normalise(rand(Q,1));
transmat0 = mk_stochastic(rand(Q,Q));
if 0   % this branch is skipped; the else branch below (mixgauss_init) is used
Sigma0 = repmat(eye(O), [1 1 Q M]);
% Initialize each mean to a random data point
indices = randperm(T*nex);
mu0 = reshape(data(:,indices(1:(Q*M))), [O Q M]);
mixmat0 = mk_stochastic(rand(Q,M));
else
[mu0, Sigma0] = mixgauss_init(Q*M, data, 'full');
mu0 = reshape(mu0, [O Q M]);
Sigma0 = reshape(Sigma0, [O O Q M]);
mixmat0 = mk_stochastic(rand(Q,M));
end
[LL, prior1, transmat1, mu1, Sigma1, mixmat1] = ...
mhmm_em(data, prior0, transmat0, mu0, Sigma0, mixmat0, 'max_iter', 5);
loglik = mhmm_logprob(data, prior1, transmat1, mu1, Sigma1, mixmat1);
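Following the training/prediction outline above, here is a minimal sketch of the classification step, assuming one model per class has been trained with mhmm_em and stored in a hypothetical struct array models (fields prior, transmat, mu, Sigma, mixmat), and testData is a new O-by-T sequence:
nClasses = numel(models);
loglik = zeros(1, nClasses);
for k = 1:nClasses
    % log-likelihood of the test sequence under the k-th class model
    loglik(k) = mhmm_logprob(testData, models(k).prior, models(k).transmat, ...
                             models(k).mu, models(k).Sigma, models(k).mixmat);
end
[~, predictedClass] = max(loglik);   % pick the class with the highest log-likelihood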
I have the following stochastic model describing the evolution of a process (Y) in space and time. Ds and Dt are the domains in space (2D, with x and y axes) and time (1D, with t axis). This model is usually known as a mixed-effects model or a components-of-variation model.
I am currently generating Y as follows:
%# Time parameters
T=1:1:20; % input
nT=numel(T);
%# Grid and model parameters
nRow=100;
nCol=100;
[Grid.Nx,Grid.Ny,Grid.Nt] = meshgrid(1:1:nCol,1:1:nRow,T);
xPower=0.1;
tPower=1;
noisePower=1;
detConstant=1;
deterministic_mu = detConstant.*(((Grid.Nt).^tPower)./((Grid.Nx).^xPower));
beta_s = randn(nRow,nCol); % mean-zero random effect representing location specific variability common to all times
gammaTemp = randn(nT,1);
for t = 1:nT
gamma_t(:,:,t) = repmat(gammaTemp(t),nRow,nCol); % mean-zero random effect representing time specific variability common to all locations
end
var=0.1;% noise has variance = 0.1
for t=1:nT
kappa_st(:,:,t) = sqrt(var)*randn(nRow,nCol);
end
for t=1:nT
Y(:,:,t) = deterministic_mu(:,:,t) + beta_s + gamma_t(:,:,t) + kappa_st(:,:,t);
end
My questions are:
How do I produce delta in the expression for Y, and what is the difference between kappa and delta?
Can you explain, with some illustration in MATLAB, whether I am producing Y correctly?
Please let me know if you need some more information/explanation. Thanks.
First, I rewrote your code to make it a bit more efficient. I see you generate linearly-spaced grids for x,y and t and carry out the computation for all points in this grid. This approach has severe limitations on the maximum attainable grid resolution, since the 3D grid (and all variables defined with it) can consume an awfully large amount of memory if the resolution goes up. If the model you're implementing will grow in complexity and size (it often does), I'd suggest you throw this all into a function accepting matrix/vector inputs for s and t, which will be a bit more flexible in this regard -- processing "blocks" of data that will otherwise not fit in memory will be a lot easier that way.
Then, I generated the delta_st term with rand instead of randn, since the noise should be "white". Now I'm very unsure about that last one, and I didn't have time to read through the paper you linked to -- can you tell me on what pages I can find the relevant sections for delta_st?
Now, the code:
%# Time parameters
T = 1:1:20; % input
nT = numel(T);
%# Grid and model parameters
nRow = 100;
nCol = 100;
% noise has variance = 0.1
var = 0.1;
xPower = 0.1;
tPower = 1;
noisePower = 1;
detConstant = 1;
[Grid.Nx,Grid.Ny,Grid.Nt] = meshgrid(1:nCol,1:nRow,T);
% deterministic mean
deterministic_mu = detConstant .* Grid.Nt.^tPower ./ Grid.Nx.^xPower;
% mean-zero random effect representing location specific
% variability common to all times
beta_s = repmat(randn(nRow,nCol), [1 1 nT]);
% mean-zero random effect representing time specific
% variability common to all locations
gamma_t = bsxfun(@times, ones(nRow,nCol,nT), randn(1, 1, nT));
% mean zero random effect capturing the spatio-temporal
% interaction not found in the larger-scale deterministic mu
kappa_st = sqrt(var)*randn(nRow,nCol,nT);
% mean zero random effect representing the micro-scale
% spatio-temporal variability that is modelled by white
% noise (i.i.d. at different time steps) in Ds·Dt
delta_st = noisePower * (rand(nRow,nCol,nT)-0.5);
% Final result:
Y = deterministic_mu + beta_s + gamma_t + kappa_st + delta_st;
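As suggested above, the deterministic part can be wrapped in a function that accepts matrix or vector coordinate inputs, so blocks of points can be processed without building the full 3D grid. A minimal sketch (the name evaluateMu and its argument list are hypothetical; save it as evaluateMu.m):
function mu = evaluateMu(x, t, detConstant, xPower, tPower)
%EVALUATEMU Deterministic mean for arbitrary (x,t) points, evaluated element-wise.
    mu = detConstant .* t.^tPower ./ x.^xPower;
end
It can then be called on any subset of grid points, e.g. evaluateMu(Grid.Nx(idx), Grid.Nt(idx), detConstant, xPower, tPower).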
Your implementation samples beta, gamma and kappa as if they are white (i.e. their values at each (x,y,t) are independent). The descriptions of the terms suggest that this is not meant to be the case. It looks like delta is supposed to capture the white noise, while the other terms capture the correlations over their respective domains, e.g. there is a non-zero correlation between gamma(t_1) and gamma(t_1+1).
If you wish to model gamma as a stationary Gaussian Markov process with variance var_g and one-step covariance cor_g between gamma(t) and gamma(t+1), you can use something like
gamma_t = nan( nT, 1 );
gamma_t(1) = sqrt(var_g)*randn();
K_g = cor_g/var_g;
K_w = sqrt( (1-K_g^2)*var_g );
for t = 2:nT,
gamma_t(t) = K_g*gamma_t(t-1) + K_w*randn();
end
gamma_t = reshape( gamma_t, [ 1 1 nT ] );
The formulas I've used for the gains K_g and K_w in the above code (and for the initialization of gamma_t(1)) produce the desired stationary variance \sigma^2_0 = var_g and one-step covariance \sigma^2_1 = K_g \sigma^2_0 = cor_g.
Note that the implementation above assumes that later you will sum the terms using bsxfun to do the "repmat" for you:
Y = bsxfun( @plus, deterministic_mu + kappa_st + delta_st, beta_s );
Y = bsxfun( @plus, Y, gamma_t );
Note that I haven't tested the above code, so you should confirm by sampling that it does actually produce a zero-mean process with the specified variance and covariance between adjacent samples. To sample beta, the same procedure can be extended into two dimensions; the principles are essentially the same. I suspect kappa should be similarly modeled as a Gaussian Markov process, but in all three dimensions and with a lower variance, to represent higher-order effects not captured in mu, beta and gamma.
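One quick way to do that sampling check is to simulate many independent copies of the gamma_t chain above and compare the empirical moments with var_g and cor_g (a sketch; nSim and nStep are arbitrary):
nSim  = 1e5;                      % number of independent chains
nStep = 200;                      % long enough to reach stationarity
G = zeros(nSim, nStep);
G(:,1) = sqrt(var_g)*randn(nSim,1);
for t = 2:nStep
    G(:,t) = K_g*G(:,t-1) + K_w*randn(nSim,1);
end
empVar = var(G(:,end));                 % should be close to var_g
empCov = mean(G(:,end-1).*G(:,end));    % should be close to cor_g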
Delta is supposed to be zero mean stationary white noise. Assuming it to be Gaussian with variance noisePower one would sample it using
delta_st = sqrt(noisePower)*randn( [ nRows nCols nT ] );