Matlab - LDA "The pooled covariance matrix of TRAINING must be positive definite." - matlab

Can someone help me out with this problem? I have been trying to figure this out for a long time.
I have a Training_Set: <1530*270400 double>
and Test_Set: <4794*270400 double>
I am using Linear discriminant analysis method
class = classify(Test_Set,Training_Set,train_label,'linear')
Error using classify (line 228)
The pooled covariance matrix of TRAINING must be positive definite.

In order for the pooled covariance matrix of TRAINING to be positive definite, you must at the very least have more observations than variables in Training_Set (the error refers to the training data, not to Test_Set). In your case, you have many more variables (270400) than observations (1530), so the covariance estimate is necessarily singular. You can try dimension reduction before classifying.
I have answered a very similar question here: Matlab bug with linear discriminant analysis
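The dimension reduction suggested above could look like the following minimal sketch. It assumes the Statistics and Machine Learning Toolbox (pca, classify); the synthetic data, the variable names and the choice of k = 10 components are illustrative assumptions, not values from the question.

```matlab
% Hypothetical stand-in data: far more variables (500) than observations (60).
rng(0);
X_train = randn(60, 500);
labels  = [ones(30, 1); 2 * ones(30, 1)];
X_train(labels == 2, :) = X_train(labels == 2, :) + 0.5;  % shift class 2
X_test  = randn(20, 500);

% Fit PCA on the training data only; mu is the training column mean.
[coeff, score, ~, ~, ~, mu] = pca(X_train);

k = 10;                                   % assumed number of components to keep
Z_train = score(:, 1:k);
% Project the test data with the training mean and training loadings.
Z_test  = bsxfun(@minus, X_test, mu) * coeff(:, 1:k);

% With 60 observations and 10 variables the pooled covariance is invertible.
predicted = classify(Z_test, Z_train, labels, 'linear');
```

The number of components has to stay well below the number of training observations; in practice it would be chosen from the explained variance or by cross-validation.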

Related

Positive Covariance but why not fit in OLS?

Could you please help explain why this happens? I tested a pair of variables for their covariance and got a positive result, but when I put them together in a linear regression, the p-value is not significant.
Thank you very much
I am running the linear regression in Excel: the output summary is not significant, yet the covariance of the paired data is positive.

How can I extract features from a set of matrices and vectors to be used in machine learning in MATLAB

I have a task where I need to train a machine learning model to predict a set of outputs from multiple inputs. My inputs are 1000 iterations of a set of 3x1 vectors, a set of 3x3 covariance matrices and a set of scalars, while my output is just a set of scalars. I cannot use the Regression Learner app because these inputs need to have the same dimensions. Any idea on how to unify them?
One possible way to solve this is to flatten the covariance matrix into a vector. Once you have done that, you can construct a 1000xN matrix, where 1000 is the number of samples in your dataset and N is the number of features. For example, if your features consist of a 3x1 vector, a 3x3 covariance matrix and, let's say, 5 other scalars, N could be 3+3*3+5=17. You then use this matrix to train an arbitrary model, such as a linear regressor or a more advanced model like a tree.
When training machine learning models it is important to understand your data and exploit its structure to help the learning algorithms. For example we could use the fact that a covariance matrix is symmetric and positive semi-definite and thus lives in a closed convex cone. Symmetry of the matrix implies that it lives in a subspace of the set of all 3x3 matrices. In fact the dimension of the space of 3x3 symmetric matrices is only 6. You can use that knowledge to reduce redundancy in your data.
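As a rough illustration of the flattening and of the symmetry argument, the sketch below keeps only the 6 upper-triangular entries of each covariance matrix. The array layout (vecs, covs, scalars) and the single scalar per sample are assumptions made for the example.

```matlab
% Hypothetical inputs: 1000 samples of a 3x1 vector, a 3x3 covariance, one scalar.
rng(0);
numSamples = 1000;
vecs    = randn(3, 1, numSamples);
covs    = zeros(3, 3, numSamples);
for i = 1:numSamples
    A = randn(3);
    covs(:, :, i) = A * A';               % symmetric positive semi-definite
end
scalars = randn(numSamples, 1);

% A symmetric 3x3 matrix has only 6 distinct entries: keep the upper triangle.
mask = triu(true(3));

features = zeros(numSamples, 3 + 6 + 1);  % one row per sample
for i = 1:numSamples
    C = covs(:, :, i);
    features(i, :) = [vecs(:, 1, i)', C(mask)', scalars(i)];
end
```

The resulting 1000x10 matrix can be passed to a regression model together with the output vector.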

Matlab's Princomp command- is it non-negative definite or not?

In Matlab, while trying to do PCA, is there a difference when using the princomp command for positive vs. negative data values? Is it a non-negative definite command? Thank you!
The condition of non-negative definiteness in PCA does not refer to the data (that wouldn't make sense) but to the covariance matrix estimated from the data. princomp estimates the covariance matrix internally from your data, and such an estimate is always guaranteed to be non-negative definite, no matter what your data look like.
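A quick numerical check of this claim, as a sketch with made-up data: even when every data value is negative, the eigenvalues of the sample covariance are non-negative up to floating-point rounding. The tolerance used here is an assumption.

```matlab
% Hypothetical data set in which every value is negative.
rng(1);
X = -abs(randn(100, 4));

C = cov(X);                 % sample covariance estimated from the data
minEig = min(eig(C));       % smallest eigenvalue of the symmetric estimate

tol = 1e-12;                % assumed rounding tolerance
isNonNegativeDefinite = minEig > -tol;
```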

Matlab: Determinant of VarianceCovariance matrix

When solving the log likelihood expression for autoregressive models, I came across the variance covariance matrix Gamma given under slide 9, Parameter estimation, of a time series tutorial. Now, in order to use fminsearch to maximize the likelihood function expression, I need to express the likelihood function where the variance covariance matrix arises. Can somebody please show with an example how I can implement (determinant of Gamma)^-1/2? Any other example apart from an autoregressive model will also do.
How about 1/sqrt(det(Gamma)) for (determinant of Gamma)^-1/2, and inv(Gamma) for the inverse?
But if you do not want to implement it yourself you can look at yulewalkerarestimator
UPD: For estimation of autocovariance matrix use xcov
also, this topic is a bit more explained here
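For larger matrices, det and inv can overflow, underflow or lose accuracy, so a likelihood is often evaluated through a Cholesky factor instead. The sketch below shows that alternative for a small example matrix; the values of Gamma and x are made up for illustration.

```matlab
% Hypothetical 2x2 covariance matrix and data vector.
Gamma = [2 0.5; 0.5 1];
x     = [1; -1];

% Cholesky factorisation: Gamma = R' * R with R upper triangular.
R = chol(Gamma);

% log(det(Gamma)) = 2 * sum(log(diag(R))), computed without forming det.
logDet = 2 * sum(log(diag(R)));

% (determinant of Gamma)^(-1/2), recovered from the log-determinant.
detInvSqrt = exp(-0.5 * logDet);

% Quadratic form x' * inv(Gamma) * x, using a linear solve instead of inv.
quadForm = x' * (Gamma \ x);
```

Since fminsearch minimizes, the usual approach is to pass it the negative log-likelihood, in which these terms appear as 0.5*logDet and 0.5*quadForm.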

Matlab - bug with linear discriminant analysis

I run
Y_testing_obtained = classify(X_testing, X_training, Y_training);
and the error I get is
Error using ==> classify at 246
The pooled covariance matrix of TRAINING must be positive definite.
X_training is 1550 x 5 matrix. Can you please tell me what this error means, i.e. why is it appearing, and how to work around it?
Thanks
Explanation: When you run the function classify without specifying the type of discriminant function (as you did), Matlab uses Linear Discriminant Analysis (LDA). Without going into too much detail on LDA, the algorithm needs to calculate the pooled covariance matrix of X_training in order to solve an optimisation problem, and this matrix has to be positive definite (see Wikipedia: Positive-definite matrix). The underlying assumption is that your data is represented by a multivariate probability distribution, which always has a positive definite covariance matrix unless one or more variables are exact linear combinations of the others.
To solve your problem: It is possible that one of your variables is a linear combination of the others. You can try selecting a sensible subset of your variables, or perform Principal Component Analysis (PCA) on the training data and then classify using the first few principal components. Or, you could specify the type of discriminant function and choose one of the two naive Bayes classifiers, for example:
Y_testing_obtained = classify(X_testing, X_training, Y_training, 'diaglinear');
As a side note, you also need to have more observations (rows) than variables (columns), but in your case this is not the problem as you seem to have 1550 observations and 5 variables.
Finally, you can also have a look at the answers posted to a similar question on the Matlab forum.
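One way to check for the linear dependence described above is to center each class and look at the rank of the result, since that is what the pooled covariance is built from. This is only a sketch: the synthetic data with a deliberately dependent fifth column and the variable names are assumptions.

```matlab
% Hypothetical training data: column 5 is an exact combination of columns 1 and 2.
rng(2);
X = randn(50, 4);
X(:, 5) = X(:, 1) + X(:, 2);
y = [ones(25, 1); 2 * ones(25, 1)];

% Center each class separately, as the pooled covariance does.
Xc = X;
for g = unique(y)'
    idx = (y == g);
    Xc(idx, :) = bsxfun(@minus, X(idx, :), mean(X(idx, :), 1));
end

% A rank below the number of columns means the pooled covariance is singular.
r = rank(Xc);
isDeficient = r < size(X, 2);

% 'diaglinear' uses only the diagonal of the covariance, so it still runs.
predicted = classify(X, X, y, 'diaglinear');
```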
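Try regularizing the covariance estimate, for example with the cvshrink function in Matlab, which cross-validates the regularization of a discriminant analysis classifier.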
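A minimal sketch of a regularized discriminant, assuming a MATLAB release that provides fitcdiscr with its 'Gamma' option: the fixed value 0.5 is an arbitrary assumption for illustration, and cvshrink would be the tool for choosing it by cross-validation.

```matlab
% Hypothetical data with an exactly dependent column, so plain LDA is singular.
rng(3);
X = randn(40, 3);
X(:, 4) = X(:, 1) + X(:, 2);
y = [ones(20, 1); 2 * ones(20, 1)];
X(y == 2, :) = X(y == 2, :) + 1;          % shift class 2 so the classes differ

% Gamma blends the pooled covariance with its diagonal:
% (1 - Gamma) * Sigma + Gamma * diag(diag(Sigma)), which is invertible here.
mdl = fitcdiscr(X, y, 'DiscrimType', 'linear', 'Gamma', 0.5);

predicted = predict(mdl, X);
```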