Matlab libsvm - how to find the w coefficients

How can I find the vector w, i.e. the normal (perpendicular) to the separating hyperplane?

This is how I did it. If I remember correctly, it comes from how the dual form of the SVM optimisation works out.
model = svmtrain(...);
w = (model.sv_coef' * full(model.SVs));
And the bias is (I don't really remember why it's negative):
bias = -model.rho;
Then, to do the classification (for a linear SVM), for an N-by-M dataset features with N instances and M features:
predictions = sign(features * w' + bias);
If the kernel is not linear, then this won't give you the right answer.
For more information, see "How could I generate the primal variable w of linear SVM?" in the libsvm documentation.
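Putting it together, a minimal end-to-end sketch (assuming the libsvm Matlab interface and a linear kernel selected with '-t 0'; labels and features are illustrative names). Note that libsvm orders classes by first occurrence, so if model.Label(1) is -1 the decision values come out sign-flipped:
model = svmtrain(labels, features, '-t 0');   % linear kernel
w     = model.sv_coef' * full(model.SVs);     % 1-by-M primal weight vector
bias  = -model.rho;                           % scalar offset
predictions = sign(features * w' + bias);     % should match svmpredict
if model.Label(1) == -1
    predictions = -predictions;               % undo libsvm's label-order sign flip
end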

Related

Manual prediction of Gaussian Regression SVM in Matlab

I trained an SVM using the Regression Learner of Matlab with a Gaussian kernel. The learning worked really well and the RSE is small.
Now, I exported the model back to the Matlab workspace (trainedModel) and I can use the predict function to get estimates for new values. However, I would like to implement the prediction function manually, because I need to port it to a different programming language, so I cannot rely on Matlab's predict function. Therefore, following the MATLAB documentation, I implemented the following equation:
f(x) = sum_{n=1..N} alpha_n * G(x_n, x) + bias
with the Gaussian kernel
G(x_j, x) = exp(-||x_j - x||^2)
This is my code for a [0.5 1 50] input:
bias = trainedModel.RegressionSVM.Bias;
alpha = trainedModel.RegressionSVM.Alpha;
SV = trainedModel.RegressionSVM.SupportVectors;
Mu = trainedModel.RegressionSVM.Mu;
Sg = trainedModel.RegressionSVM.Sigma;
input = ([0.5 1 50] - Mu) ./ Sg;
sum = bias;
for n = 1:length(alpha)
    G = exp(-norm(SV(n,:)' - input)^2);
    sum = sum + alpha(n) .* G;
end
disp(sum)
(Note that alpha is already the difference of the Lagrangian multipliers according to the documentation)
However, the predicted results are completely wrong. I think something is wrong with G because the values are very small (on the order of 10^(-25)), but I cannot figure out the error.
The mistake was subtle. The culprit is the transposition of the SV row: SV(n,:)' is a column vector, so subtracting the row vector input triggers implicit expansion and produces a matrix instead of a vector, and norm then silently returns the matrix 2-norm, which hides the error. Therefore, changing the following line:
G = exp(-norm((SV(n,:)'-input))^2);
to
G = exp(-norm((SV(n,:)-input))^2);
solved the problem.
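For reference, here is a compact, corrected version of the whole prediction (a sketch assuming the same trainedModel, standardization enabled in the Regression Learner, and a kernel scale of 1 as in the code above; the vectorized kernel needs R2016b+ for implicit expansion):
bias  = trainedModel.RegressionSVM.Bias;
alpha = trainedModel.RegressionSVM.Alpha;
SV    = trainedModel.RegressionSVM.SupportVectors;
Mu    = trainedModel.RegressionSVM.Mu;
Sg    = trainedModel.RegressionSVM.Sigma;
x = ([0.5 1 50] - Mu) ./ Sg;       % standardize the query point
sqdist = sum((SV - x).^2, 2);      % squared distances to all support vectors
f = bias + alpha' * exp(-sqdist);  % predicted response
disp(f)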

Linear Regression in MATLAB without fitlm

I am tasked to perform a prediction analysis. This requires performing a linear regression on several (~10) predictor variables and coming up with a coefficient for each predictor plus a constant term.
So the final equation will be of the form y = c + c1*x1 + c2*x2 + c3*x3 + ...
Now, I know there is the fitlm function in MATLAB, available with the Statistics and Machine Learning Toolbox, but at this point I don't know whether we will be purchasing it. How do I perform the linear regression without it?
You can use the closed form solution of linear least squares.
C = inv(X'*X) * X'*y
In the above, make the first column of X all ones, and let the following columns be x1, x2, ... (one row per observation). C will then contain the corresponding coefficients; the first entry of C is the constant c. (In practice, prefer C = (X'*X)\(X'*y), or simply C = X\y, over an explicit inv.)
From: https://www.mathworks.com/help/matlab/data_analysis/linear-regression.html
You can write your predictor variables as a matrix X using X = [ones(length(x1),1), x1, x2, x3, ..., xn] (with each xi a column vector), formulate the response variables Y as the equation Y = X*B, and solve for the regression coefficients with mldivide: B = X\Y.
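A minimal end-to-end sketch (the two predictors and the synthetic data are purely illustrative):
% Synthetic example with two predictors; replace with your own column vectors
n  = 100;
x1 = rand(n,1);
x2 = rand(n,1);
y  = 3 + 2*x1 - 0.5*x2 + 0.01*randn(n,1);
X = [ones(n,1), x1, x2];   % design matrix: intercept column first
B = X \ y;                 % least-squares coefficients [c; c1; c2]
yhat = X * B;              % fitted values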

How to reduce dimensions of Gaussian Mixture Model parameters

Assuming I have already built a Gaussian mixture model using the fitgmdist function and want to map the multivariate distributions into a subspace of smaller dimension without having to recreate the model, how do I go about it?
In MATLAB terms, I have a GMM, gmm_goal, with gmm_goal.NumComponents = K and gmm_goal.NumVariables = N and want to reduce N to a number n < N.
If code isn't available, an explanation or mathematical derivation will do.
The parameters of the Gaussian mixture model affected by the transformation into a subspace are the means and covariances of the Gaussian components that form the GMM.
Assuming a linear transformation of your data points x:
y = A*x + b
By linearity of expectation (and the standard transformation rule for covariance), we can calculate the new means and covariances in the subspace from the old ones:
mean_new = A*mean + b
cov_new = A*cov*A'
Note that the offset b shifts the mean but drops out of the covariance.
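As a concrete sketch (assuming full covariance matrices, i.e. gmm_goal.Sigma stored as N-by-N-by-K, and a projection matrix A of size n-by-N, for example the top n principal directions; gmdistribution rebuilds the reduced model):
% A: n-by-N projection, b: n-by-1 offset (zeros(n,1) for a pure projection)
K = gmm_goal.NumComponents;
mu_new = (A * gmm_goal.mu.' + b).';   % K-by-n projected means
Sigma_new = zeros(n, n, K);
for k = 1:K
    Sigma_new(:,:,k) = A * gmm_goal.Sigma(:,:,k) * A.';
end
gmm_small = gmdistribution(mu_new, Sigma_new, gmm_goal.ComponentProportion);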

Adaptive Linear regression

Let's say I have a set of samples from a non-stationary stochastic process with a uniform probability distribution (Gaussian). I need an adaptive linear regression over the set of samples; basically, I want the best-fit line to behave a certain way. I have a separate signal, and I know the best-fit line of the form Y = M*x + B will have a slope M proportional to that other signal. So I need the optimization problem to minimize the distance to the points while constraining the slope to be proportional to the other signal. What's the simplest machine-learning/stats approach to use for this problem?
If I understand your question correctly, you can just use normal regression, or a gradient-descent-type algorithm, but instead of having the degrees of freedom be M and B, you use a proportionality constant k on the known signal's slope and a separate intercept B2.
I.e., given the known signal:
Y1 = M1*x + B1
Y2 = k*M1*x + B2
Solve for k and B2 such that the mean squared error between the fit and the data is minimized.
In theory, this constraint is intrinsic anyway: if you solved the unconstrained linear problem first, k would come out as M2/M1.
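A minimal least-squares sketch of this idea (M1, the data, and the noise level are illustrative):
M1 = 2.0;                        % known slope of the other signal
x  = (0:0.1:10).';
y2 = 0.75*M1*x + 1.3 + 0.1*randn(size(x));   % synthetic observations
D = [M1*x, ones(size(x))];       % design matrix: slope tied to M1
theta = D \ y2;                  % least-squares solution [k; B2]
k  = theta(1);
B2 = theta(2);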

Fast scaling of Gaussian Kernel by the Covariance of the Inputs

I am currently fiddling with multivariate kernel density estimation for estimating the probability density functions (PDF) of hydrological data sets using Matlab. I am most familiar with kernel density estimation using Gaussian kernels as outlined in Sharma (2000 and 2014), where the kernel bandwidths are set using the Gaussian Reference Rule (GRR). The GRR is written as follows (Sharma, 2000):
lambda_ref = (4/(d+2))^(1/(d+4)) * n^(-1/(d+4))
where lambda_ref is the GRR bandwidth of the kernel, n is the sample size, and d is the dimension of the data set we are using for density estimation. To estimate the multivariate density of our data set X we use the following formula (Sharma, 2000):
f(x) = (1/n) * sum_{i=1..n} 1/((2*pi)^(d/2) * lambda^d * det(S)^(1/2)) * exp(-(x - x_i)' * S^(-1) * (x - x_i) / (2*lambda^2))
where lambda is the same as lambda_ref above, S is the sample covariance of X, and det() stands for determinant.
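In Matlab, with n and d as defined above, the reference bandwidth is a one-liner:
lambda_ref = (4/(d+2))^(1/(d+4)) * n^(-1/(d+4));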
My question is: I understand that there are many "fast" methods for calculating the Gaussian kernel function represented by the exp() term, such as the method proposed here (using Matlab): http://mrmartin.net/?p=218. Since I will be working with data sets that are quite large in sample size (1,000-10,000), I am looking for fast code. Does anyone know how I can write fast code for the second equation that takes the inverse of the sample covariance matrix (S^-1) into account?
I greatly appreciate any help that can be provided on this issue. Thank you!
Note(s):
I understand that there is Matlab code for calculating the second equation, found as a sub-function in: http://www.mathworks.com/matlabcentral/fileexchange/29039-mutual-information-2-variablle/content/MutualInfo.m. However, this code has a bottleneck in how it calculates the kernel matrix.
References:
[1] A. Sharma, Seasonal to interannual rainfall probabilistic forecasts for improved water supply management: Part 3 — A nonparametric probabilistic forecast model, Journal of Hydrology, Volume 239, Issues 1–4, 20 December 2000, Pages 249-258, ISSN 0022-1694, http://dx.doi.org/10.1016/S0022-1694(00)00348-6.
[2] Sharma, A., and R. Mehrotra (2014), An information theoretic alternative to model a natural system using observational information alone, Water Resour. Res., 50, 650–660, doi:10.1002/2013WR013845.
I have found code that I am able to modify for my purposes. The original code is listed at the following link: http://www.kernel-methods.net/matlab/kernels/rbf.m.
Code
function K = rbf(coord,sig)
%function K = rbf(coord,sig)
%
% Computes an rbf kernel matrix from the input coordinates
%
%INPUTS
% coord = a matrix containing all samples as rows
% sig = sigma, the kernel width; squared distances are divided by
% squared sig in the exponent
%
%OUTPUTS
% K = the rbf kernel matrix, K(i,j) = exp(-||x_i - x_j||^2/(2*sig^2))
%
% For more info, see www.kernel-methods.net
%
%Author: Tijl De Bie, february 2003. Adapted: october 2004 (for speedup).
n = size(coord,1);
K = coord*coord'/sig^2;   % scaled inner products x_i'*x_j/sig^2
d = diag(K);              % scaled squared norms ||x_i||^2/sig^2
K = K - ones(n,1)*d'/2;   % entry (i,j) loses ||x_j||^2/(2*sig^2)
K = K - d*ones(1,n)/2;    % ... and ||x_i||^2/(2*sig^2)
K = exp(K);               % K(i,j) = exp(-||x_i - x_j||^2/(2*sig^2))
Modified code incorporating sample covariance scaling:
xcov  = cov(x.');              % sample covariance of the data (samples in columns of x)
invxc = pinv(xcov);            % (pseudo-)inverse of the sample covariance S
coord = x.';                   % samples as rows
sig   = sigma;                 % kernel bandwidth (lambda above)
n = size(coord,1);
K = coord*invxc*coord'/sig^2;  % Mahalanobis inner products x_i'*S^-1*x_j/sig^2
d = diag(K);
K = K - ones(n,1)*d'/2;
K = K - d*ones(1,n)/2;         % now K(i,j) = -(x_i-x_j)'*S^-1*(x_i-x_j)/(2*sig^2)
K = exp(K);                    % kernel matrix
I hope this helps someone else looking into the same problem.
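To turn the kernel matrix into actual density estimates at the sample points, each row of K still needs to be averaged and normalized by the Gaussian constant from the second equation above. A minimal sketch, reusing the variables from the modified code (sig plays the role of lambda, xcov the role of S):
d2 = size(coord,2);                                   % dimension d of the data
normconst = (2*pi)^(d2/2) * sig^d2 * sqrt(det(xcov)); % Gaussian normalization constant
pdf_est = sum(K,2) / (n * normconst);                 % density estimate at each sample point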