I'm trying to do Poisson regression with overdispersed data, so I believe I should be using Huber-White robust standard errors.
However, I don't see any option for that in glmfit, and from what I understand robustfit is only for linear regression. Am I correct about this? Is there a way to do Poisson regression with robust standard errors in MATLAB?
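Even without a built-in option, the sandwich estimator can be assembled by hand from glmfit's output. A minimal sketch, assuming a Poisson GLM with the default log link; X (predictors, without a constant column) and y (counts, as a column vector) are placeholders for your data:

    b  = glmfit(X, y, 'poisson');        % default log link; glmfit adds the intercept
    Xd = [ones(size(X,1),1), X];         % design matrix glmfit actually used
    mu = exp(Xd * b);                    % fitted means
    A  = Xd' * (mu .* Xd);               % "bread": X' diag(mu) X
    B  = Xd' * ((y - mu).^2 .* Xd);      % "meat": X' diag((y-mu).^2) X
    V  = A \ B / A;                      % sandwich covariance A^-1 B A^-1
    seRobust = sqrt(diag(V));            % Huber-White standard errors

With genuine overdispersion, these should come out noticeably larger than the model-based standard errors glmfit reports.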
I am estimating a regression in MATLAB by hand.
I've got the standard error and the coefficient; is there any quick way to get the p-value in MATLAB? Any tips or tricks would be appreciated.
P-values usually come from a statistical test of significance; for a regression coefficient, that test is typically a t-test.
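A minimal sketch of that t-test, assuming you already have the coefficient b, its standard error se, and the residual degrees of freedom dof (observations minus estimated parameters); tcdf requires the Statistics and Machine Learning Toolbox:

    t = b / se;                  % t-statistic for H0: coefficient = 0
    p = 2 * tcdf(-abs(t), dof);  % two-sided p-value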
I am trying to calculate a linear regression on some data using MATLAB's fitlm tool. With ordinary least squares (OLS) I get fairly low R-squared values (~0.2-0.5) and occasionally even unrealistic results, whereas with robust regression (specifically the 'talwar' option) I get much better results (R2 ~ 0.7-0.8).
I am no statistician, so my question is: is there any reason I should not believe that the robust results are better?
Here is an example of some of the data. The data shown produces an R2 of 0.56 for OLS and 0.72 for robust.
One reason you're going to get notable differences in R2 values is that the Talwar estimator handles outliers differently. Talwar subdivides your data set into segments and computes averages for each of those segments.
Taken from the abstract of Talwar's paper (https://www.jstor.org/stable/2285386?seq=1#page_scan_tab_contents):
"Estimates of the parameters of a linear model are usually obtained by the method of ordinary least-squares (OLS), which is sensitive to large values of the additive error term... we obtain a simple, consistent and asymptotically normal initial estimate of the coefficients, which protects the analyst from large values of εi which are often hard to detect using OLS on a model with many regressors."
Whether Talwar or OLS is better depends on your knowledge of the measurement process (namely, how the outliers can be explained). If appropriate, pruning the data with a Q-test to remove outliers (see http://education.mrsec.wisc.edu/research/topic_guides/outlier_handout.pdf) should minimise the differences in R2 you see between Talwar and OLS.
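As an easy check, you can fit both models side by side. A minimal sketch, with x and y standing in for your data; bear in mind that R2 from a robust fit is not strictly comparable to OLS R2, since the robust fit downweights exactly the points that hurt it:

    mdlOLS    = fitlm(x, y);                          % ordinary least squares
    mdlTalwar = fitlm(x, y, 'RobustOpts', 'talwar');  % Talwar weight function
    fprintf('OLS R2:    %.2f\n', mdlOLS.Rsquared.Ordinary);
    fprintf('Talwar R2: %.2f\n', mdlTalwar.Rsquared.Ordinary);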
Yes, of course. The idea of robust regression is very broad, and there are many different types of robust estimators; thus there are situations where one robust method performs better than the others.
I am trying to implement Bayesian optimisation using Gaussian process regression, and I want to try a multiple-output GP first.
There is plenty of software that implements GPs, such as the fitrgp function in MATLAB and the ooDACE toolbox.
But I haven't found any available software that implements the so-called multiple-output GP, that is, a Gaussian process model that predicts vector-valued functions.
So, are there any software packages implementing multiple-output Gaussian processes that I can use directly?
I am not sure my answer will help you, as you seem to be looking for MATLAB libraries.
However, you can do co-kriging in R with gstat. See http://www.css.cornell.edu/faculty/dgr2/teach/R/R_ck.pdf or https://github.com/cran/gstat/blob/master/demo/cokriging.R for more details about usage.
The lack of tools for cokriging is partly due to the relative difficulty of using it. You need more assumptions than for simple kriging: in particular, you must model the dependence between the cokriged outputs via a cross-covariance function (https://stsda.kaust.edu.sa/Documents/2012.AGS.JASA.pdf). The covariance matrix is much bigger, and you still need to make sure that it is positive definite, which can become quite hard depending on your covariance functions...
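If you do want to stay in MATLAB, a common (if crude) workaround is to fit one independent fitrgp model per output dimension, which simply ignores the cross-output correlations that cokriging would model. A minimal sketch, with X (n-by-d inputs), Y (n-by-m outputs) and Xnew (query points) as placeholders:

    m      = size(Y, 2);                 % number of outputs
    models = cell(m, 1);
    for k = 1:m                          % one independent GP per output
        models{k} = fitrgp(X, Y(:,k), 'KernelFunction', 'ardsquaredexponential');
    end
    Ypred = zeros(size(Xnew,1), m);      % predictions at the new inputs
    for k = 1:m
        Ypred(:,k) = predict(models{k}, Xnew);
    end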
Is there a way to turn off pivoting when computing the inverse of a tridiagonal matrix in MATLAB? I'm trying to see whether a problem I'm having with solving a tridiagonal system comes from not pivoting, and I could test that simply in MATLAB by solving the same system with pivoting turned off. Any help is appreciated!
The documentation for mldivide doesn't list any way to set low-level options like that.
I'd imagine that is because automatic pivoting is not only desired but expected from most tools these days.
For a tridiagonal matrix that is full, MATLAB will use its Hessenberg solver (which I imagine is akin to this flow), and for a sparse tridiagonal matrix it will use a tridiagonal solver. In both cases, partial pivoting may be used to ensure an accurate solution of the system.
To get around the fact that MATLAB doesn't have a toggle for pivoting, you could implement your own tridiagonal solver (see above link) without pivoting, as sketched below, and see how the solution is affected.
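A minimal sketch of such a solver, the classic Thomas algorithm with no pivoting; a, b and c are the sub-, main and super-diagonals (a(1) and c(n) unused) and d is the right-hand side:

    function x = trisolve_nopivot(a, b, c, d)
    % Thomas algorithm: solves a tridiagonal system WITHOUT row pivoting,
    % for comparison against mldivide. It breaks down if an eliminated
    % main-diagonal entry becomes (near) zero -- which is the point here.
    n = numel(d);
    for i = 2:n                          % forward elimination
        w    = a(i) / b(i-1);            % no row exchanges
        b(i) = b(i) - w * c(i-1);
        d(i) = d(i) - w * d(i-1);
    end
    x = zeros(n, 1);
    x(n) = d(n) / b(n);                  % back substitution
    for i = n-1:-1:1
        x(i) = (d(i) - c(i) * x(i+1)) / b(i);
    end
    end

Comparing x against A \ d for the same system should show directly whether pivoting is what changes your answer.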
I have a problem where I am fitting a high-order polynomial to (not very) noisy data using linear least squares. Currently I'm using polynomial orders around 15-25, which work surprisingly well: the dependence is very nearly linear, but the accuracy of modelling the 'very nearly' is critical. I'm using MATLAB's polyfit() function and (obviously) normalising the x-data. This generally works fine, but I have come across an issue with some recent datasets: the fitted polynomial has extrema within the x-data interval. For the application I'm working on this is a no-no. The polynomial model must have no stationary points over the x-interval.
So I need to add a constraint to the least-squares problem: the derivative of the fitted polynomial must be strictly positive over a known x-range (or strictly negative; which it is depends on the data, but a simple linear fit will quickly tell me). I have had a quick look at the available Optimisation Toolbox functions, but I admit I'm at a loss as to how to go about this. Does anyone have any suggestions?
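One standard way to handle this is to discretise the constraint: sample the derivative on a dense grid over the x-interval and hand the resulting linear inequalities to lsqlin (Optimisation Toolbox). A minimal sketch, assuming x and y are column vectors and using polyfit's coefficient ordering (highest power first); the order n, grid size and tolerance are illustrative choices:

    n   = 15;                                 % polynomial order
    xs  = (x - mean(x)) / std(x);             % normalise as polyfit's mu output does
    V   = xs .^ (n:-1:0);                     % Vandermonde matrix, p(1) -> x^n
    xg  = linspace(min(xs), max(xs), 200)';   % dense constraint grid
    D   = [(n:-1:1) .* (xg .^ (n-1:-1:0)), zeros(numel(xg), 1)];  % rows give p'(xg)
    tol = 1e-6;                               % margin standing in for 'strictly' positive
    % lsqlin solves min ||V*p - y||^2 s.t. A*p <= b; p'(xg) >= tol becomes -D*p <= -tol
    p   = lsqlin(V, y, -D, -tol * ones(numel(xg), 1));

Flip the sign of the constraint for the strictly decreasing case, and remember to apply the same (mean, std) normalisation to new x-values before evaluating the fitted polynomial.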
[I appreciate there are probably better models than polynomials for this data, but in the short term it isn't feasible to change the form of the model]
[A closing note: I have finally got the go-ahead to replace this awful polynomial model! I am going to adopt a nonparametric approach, spline smoothing, using the excellent SPLINEFIT code by Jonas Lundgren. This has the advantage that I'm already using a spline model in the end-user application, so I already have C# code available to evaluate a spline model]
You could use cftool with its option to exclude data points.