Standard error of regression coefficients without computing the inverse

I want to compute the standard error of regression coefficients:
$$y = X\theta + e$$
where $$\theta$$ is the vector of regression coefficients. I know the formula is:
$$V(\theta) = \sigma^2 (X^T X)^{-1}$$
How can I calculate the above formula without computing the inverse?

Actually, the answer is to use the residual bootstrap: refit the model on residual-resampled responses many times and take the standard deviation of the refitted coefficients.
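For concreteness, here is a minimal MATLAB sketch of that procedure (all names are illustrative; X is assumed to be the n-by-p design matrix and y the n-by-1 response):
theta_hat = X \ y;                   % least-squares fit via QR, no explicit inverse
res = y - X*theta_hat;               % residuals of the original fit
n = size(X, 1);
B = 1000;                            % number of bootstrap replicates
thetas = zeros(numel(theta_hat), B);
for b = 1:B
    idx = randi(n, n, 1);                % resample residuals with replacement
    y_star = X*theta_hat + res(idx);     % synthetic response
    thetas(:, b) = X \ y_star;           % refit on the bootstrap sample
end
se_theta = std(thetas, 0, 2);        % bootstrap standard errors of the coefficients
The backslash operator solves the least-squares problem by QR factorization, so $$(X^T X)^{-1}$$ is never formed.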

Related

Linear Regression in MATLAB without fitlm

I am tasked to perform a prediction analysis. This requires performing a linear regression on several (~10) predictor variables and coming up with a coefficient for each of them plus a constant.
So the final equation will be of the form y = c + c1*x1 + c2*x2 + c3*x3 + ...
Now I know you can use the fitlm function in MATLAB, which ships with the Statistics and Machine Learning Toolbox, but at this point I don't know whether we will be purchasing it. How do I perform linear regression without it?
You can use the closed-form solution of linear least squares:
C = inv(X'*X)*X'*y
In the above, make the first column of X all ones, and the following columns x1, x2, ...
C will contain the corresponding constants; the first entry in C is c.
From: https://www.mathworks.com/help/matlab/data_analysis/linear-regression.html
You can write your predictor variables as a matrix X using X = [ones(length(x1),1),x1,x2,x3,...,xn], formulate the response variables Y as the equation Y = XB, and solve the system with mldivide as B = X\Y to find your regression coefficients (mldivide does not form the inverse explicitly).
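Putting both suggestions together, a minimal sketch (assuming x1, x2, x3 and y are equal-length column vectors) would be:
X = [ones(length(x1), 1), x1, x2, x3];  % column of ones gives the constant term
B = X \ y;                              % least-squares solve via mldivide
c  = B(1);                              % the constant c
c1 = B(2); c2 = B(3); c3 = B(4);        % coefficients of x1, x2, x3
y_hat = X*B;                            % fitted values
None of this requires the Statistics and Machine Learning Toolbox.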

Adaptive Linear regression

Let's say I have a set of samples from a non-stationary stochastic process with a uniform probability distribution (Gaussian). I need an adaptive linear regression over the set of samples, i.e. I want the 'best-fit' line to behave in a certain way. I have a separate signal, and I know the 'best-fit' line of the form Y = Mx + B will have a slope M proportional to that other signal. So I need the optimization problem to minimize the distance to the points while constraining the slope to be proportional to the other signal. What's the simplest machine learning/stats approach for this problem?
If I understand your question correctly, you can just use ordinary regression, or a gradient-descent-type algorithm, but instead of having the degrees of freedom be M and B, use a proportionality constant k applied to the known slope, together with a separate intercept B2.
i.e., with the known signal:
Y1 = M1*x + B1
Y2 = k*M1*x + B2
Solve for k and B2 such that the mean squared error over the (x, y) samples is minimised.
In theory this constraint is intrinsic anyway: if you solved the unconstrained linear problem first, k would come out as M2/M1.
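A minimal least-squares sketch of this idea (assuming x and Y2 are column vectors of samples and M1 is the known reference slope):
A = [M1*x, ones(length(x), 1)];  % one column per unknown: k and B2
coef = A \ Y2;                   % least-squares solution for [k; B2]
k  = coef(1);
B2 = coef(2);
Y2_fit = A*coef;                 % fitted line with slope k*M1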

Discrete surface integral with cumsum

I have a matrix z(x,y).
This is an NxN arbitrary pdf constructed from a kernel density estimate (i.e. it is not a standard pdf and has no closed-form expression). It is multivariate, cannot be separated, and is discrete data.
I want to construct an NxN matrix F(x,y) that is the two-dimensional cumulative distribution function of this pdf, so that I can then randomly sample from F(x,y) = P(X <= x, Y <= y).
Analytically, I think the CDF of a multivariate pdf is its surface integral.
What I have tried is using the cumsum function to calculate the surface integral, and I tested this on a multivariate normal against the analytical solution; there seems to be some discrepancy between the two:
% multivariate parameters
delta = 100;
mu = [1 1];
Sigma = [0.25 .3; .3 1];
x1 = linspace(-2,4,delta); x2 = linspace(-2,4,delta);
[X1,X2] = meshgrid(x1,x2);
% Calculate Normal multivariate pdf
F = mvnpdf([X1(:) X2(:)],mu,Sigma);
F = reshape(F,length(x2),length(x1));
% My attempt at a numerical surface integral
FN = cumsum(cumsum(F,1),2);
% Normalise the CDF
FN = FN./max(max(FN));
X = [X1(:) X2(:)];
% Analytic solution to a multivariate normal pdf
p = mvncdf(X,mu,Sigma);
p = reshape(p,delta,delta);
% Highlight the difference
dif = p - FN;
err = max(abs(dif(:)));  % max pointwise error; avoids shadowing the built-in error function
%% Plot
figure(1)
surf(x1,x2,F);
caxis([min(F(:))-.5*range(F(:)),max(F(:))]);
xlabel('x1'); ylabel('x2'); zlabel('Probability Density');
figure(2)
surf(X1,X2,FN);
xlabel('x1'); ylabel('x2');
figure(3);
surf(X1,X2,p);
xlabel('x1'); ylabel('x2');
figure(5)
surf(X1,X2,dif)
xlabel('x1'); ylabel('x2');
The error seems to be particularly in the transition region, which is the most important part.
Does anyone have a better solution to this problem, or see what I'm doing wrong?
Any help would be much appreciated!
EDIT: This is the desired outcome of the cumulative integration. The reason this function is of value to me is that when you randomly generate samples from it on the closed interval [0,1], the more heavily weighted (i.e. more likely) values appear more often; in this way the samples converge on the expected value(s) (in the case of multiple peaks), which is the desired outcome for algorithms such as particle filters, neural networks, etc.
Think of the 1-dimensional case first. You have a function represented by a vector F and want to numerically integrate. cumsum(F) will do that, but it uses a poor form of numerical integration. Namely, it treats F as a step function. You could instead do a more accurate numerical integration using the Trapezoidal rule or Simpson's rule.
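A quick 1-D illustration (a sketch; the grid and test pdf are arbitrary):
x = linspace(-4, 4, 100).';
f = exp(-x.^2/2)/sqrt(2*pi);         % standard normal pdf as a test case
I_step = cumsum(f)*(x(2) - x(1));    % rectangle (step-function) rule
I_trap = cumtrapz(x, f);             % trapezoidal rule; noticeably closer to the true CDF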
The 2-dimensional case is no different. Your use of cumsum(cumsum(F,1),2) is again treating F as a step function, and the numerical errors resulting from that assumption only get worse as the number of dimensions of integration increases. There exist 2-dimensional analogues of the Trapezoidal rule and Simpson's rule. Since there's a bit too much math to repeat here, take a look here:
http://onestopgate.com/gate-study-material/mathematics/numerical-analysis/numerical-integration/2d-trapezoidal.asp
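In MATLAB, cumtrapz gives you the cumulative trapezoidal rule directly; a sketch using the variables from the question's code:
FN = cumtrapz(x2, cumtrapz(x1, F, 2), 1);  % trapezoidal rule along x1, then along x2
FN = FN./FN(end, end);                     % normalise so the CDF reaches 1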
You DO NOT need to compute the 2-dimensional integral of the probability density function in order to sample from the distribution. If you are computing the 2-d integral, you are going about the problem incorrectly.
Here are two ways to approach the sampling problem.
(1) You write that you have a kernel density estimate. A kernel density estimate is a special case of a mixture density. Any mixture density can be sampled by first selecting one kernel (whether differently or equally weighted, the same procedure applies), and then sampling from that kernel; this applies in any number of dimensions. Typically the kernels are some relatively simple distribution, such as a Gaussian, so that sampling from them is easy (a concrete sketch follows at the end of this answer).
(2) Any joint density P(X, Y) is equal to P(X | Y) P(Y) (and equivalently P(Y | X) P(X)). Therefore you can sample from P(Y) (or P(X)) and then from P(X | Y). In order to sample from P(X | Y), you will need to integrate P(X, Y) along a line Y = y (where y is the sampled value of Y), but (this is crucial) you only need to integrate along that line; you don't need to integrate over all values of X and Y.
If you tell us more about your problem, I can help with the details.
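To make approach (1) concrete, here is a minimal sketch for a Gaussian kernel density estimate with equal weights (centers is assumed to be an m-by-2 matrix of kernel centers, typically the original data points, and h the kernel bandwidth):
m = size(centers, 1);
nSamples = 1000;
idx = randi(m, nSamples, 1);                       % step 1: pick a kernel uniformly
samples = centers(idx, :) + h*randn(nSamples, 2);  % step 2: sample from that Gaussian kernel
For unequally weighted kernels, replace the uniform draw with a draw proportional to the kernel weights.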

Spectral Entropy and Spectral Energy of a vector in Matlab

I am going to use spectral energy and spectral entropy as features for window-based time-series data. However, I'm a bit confused about the formulas being used for them online, especially about the spectral entropy.
I used entropy from Matlab, but that doesn't work with time-series data; it just gives me zero for everything. http://www.mathworks.nl/help/images/ref/entropy.html
Test window Entropy result for this version = 0
Then I used this version.
http://www.mathworks.com/matlabcentral/fileexchange/28692-entropy
Test window Entropy result for this version = 4.3219
I also tried -sum(p.*log2(p)) after applying imhist to a data window (p = imhist(aw1(:));). Got this from an online help page.
Test window Entropy result for this version = 0.0369
All of them reported different values.
For spectral energy, I am using the sum of the squared magnitudes of the FFT coefficients:
sum(abs(fft(data_window)).^2)
Can anybody suggest which is the correct version?
For Spectral Entropy the steps are:
Compute the power spectral density (PSD)
Normalize the PSD so that it sums to one
Calculate the entropy $$-\sum P \log_2(P)$$, where P is the normalized PSD
% PSD estimate (no outer sum here, or the spectrum collapses to a scalar)
P = abs(fft(data_window)).^2;
% Normalisation
d = P(:);
d = d/sum(d);
% Entropy calculation (the 1e-12 guards against log2(0) in empty bins)
logd = log2(d + 1e-12);
Entropy(inc) = -sum(d.*logd)/log2(length(d));  % inc indexes the current window
I have calculated the spectral entropy of a signal following the same steps, but I did not add that 1e-12. Why did you add it? You can use sum(d) instead of sum(d + 1e-12).
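Putting the corrected steps together, a self-contained sketch (saved as spectral_features.m; dropping zero bins is an alternative to the 1e-12 guard discussed above):
function [E, H] = spectral_features(data_window)
    psd = abs(fft(data_window)).^2;  % power spectral density estimate
    E = sum(psd);                    % spectral energy
    p = psd/sum(psd);                % normalise the PSD to a probability vector
    p = p(p > 0);                    % avoid log2(0) without the 1e-12 offset
    H = -sum(p.*log2(p))/log2(numel(psd));  % normalised spectral entropy in [0, 1]
end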

How do I compute the Inverse gaussian distribution from given CDF?

I want to compute the parameters mu and lambda of the inverse Gaussian distribution given the CDF.
By 'given the CDF' I mean that I am given the data AND the (estimated) quantiles for the data, i.e.:
Quantile - Value
0.01 - 10
0.5 - 12
0.7 - 13
Now I want to fit an inverse Gaussian distribution to this data so that I can, for example, look up the quantile for the value 11 based on my distribution.
How can I find out the values mu and lambda?
The only solution I can think of is using Gradient descent to find the best mu and lambda using RMSE as an error measure.
Isn't there a better solution?
Comment: Matlab's MLE algorithm is not an option, since it does not use the quantile data.
As all you really want to do is estimate the quantiles of the distribution at unknown values, and you have a lot of data points, you can simply interpolate the values you want to look up:
quantile_estimate = interp1(values, quantiles, value_of_interest);
Following #mpiktas's suggestion here, I implemented a gradient descent algorithm for estimating my mu and lambda:
Make an initial guess using MLE.
Learn mu and lambda using gradient descent with RMSE as the error measure.
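A gradient-free stand-in for step 2 is fminsearch from base MATLAB (a sketch, not the original gradient descent; the inverse Gaussian CDF below is the standard closed form, written with erfc so no toolbox is needed, and the parameters are optimised on the log scale to keep them positive):
vals = [10; 12; 13];                            % values from the question
qnts = [0.01; 0.5; 0.7];                        % corresponding quantiles
Phi = @(z) 0.5*erfc(-z/sqrt(2));                % standard normal CDF via erfc
igcdf = @(x, mu, lam) Phi(sqrt(lam./x).*(x/mu - 1)) ...
      + exp(2*lam/mu).*Phi(-sqrt(lam./x).*(x/mu + 1));
rmse = @(p) sqrt(mean((igcdf(vals, exp(p(1)), exp(p(2))) - qnts).^2));
p_hat = fminsearch(rmse, log([12, 1]));         % initial guess, e.g. from MLE
mu_hat = exp(p_hat(1)); lambda_hat = exp(p_hat(2));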
The following article explains in detail how to compute quantiles (the inverse CDF) for the inverse Gaussian distribution:
Giner, G. and Smyth, G. K. (2016). statmod: probability calculations for the inverse Gaussian distribution. The R Journal. http://arxiv.org/abs/1603.06687
Code for the R language is contained in the R package statmod available from CRAN. For example:
> library(statmod)
> qinvgauss(0.01, lower.tail=FALSE)
[1] 4.98
computes the 0.01 upper tail quantile of the standard IG distribution.