Poor fit with nlinfit - matlab

I'm having fitting issues with nlinfit. I can't seem to figure out how to improve the fit. Decreasing TolX or TolFun has not changed the value in coeffs.
model = @(a,x) 1./(1 + a*x.^2);
model0 = [1e13];
opts = statset('TolX', 1e-25, 'TolFun', 1e-25);
coeffs = nlinfit(freqData, noiseData, model, model0, opts);
Here's my fit.
http://i.imgur.com/v1dkd4X.png

It seems like you are dealing with very small numbers, so there might be a floating-point precision issue. Why not transform the expression to a different form, fit, and then apply the inverse transform?
For example:
Take 1/model as the transformation. Now you have just a simple polynomial fit,
model_new = @(a,x) 1 + a*x.^2;
where you can use polyfit and polyval, then take 1/result ...
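A minimal sketch of the transform, fit, inverse-transform idea, assuming the data really follow 1./(1 + a*x.^2) with no amplitude factor. The data and a_true are made up for illustration. Because the constant term is fixed at 1 and there is no linear term, the sketch solves for the single coefficient with a backslash least-squares solve instead of a full polyfit. Taking the reciprocal also changes how noise is weighted, so on noisy data the result need not match a direct nlinfit.

```matlab
% Hypothetical noise-free data standing in for freqData / noiseData.
a_true = 1e-4;
x = linspace(10, 200, 50)';
y = 1 ./ (1 + a_true * x.^2);

% Transform: 1/y = 1 + a*x^2, so (1/y - 1) is linear in a.
z = 1 ./ y - 1;

% Least-squares slope through the origin (single unknown a).
a_est = (x.^2) \ z;

% Inverse transform: evaluate the fitted curve in the original units.
y_fit = 1 ./ (1 + a_est * x.^2);
```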

I fit simulated data that looks similar to yours, without scaling:
The trick is to inspect your data: your signal amplitude drops from ~1.5 to ~1.0 between x~40 and x~150. Yet if you inspect the function, it's clear that for positive a its value cannot exceed 1, so it cannot model the data properly.
This data is better fit by including an initial amplitude:
model_new = @(a,x) a(1)./(1 + a(2)*x.^2);
Looking at the fitting function plotted onto your data it looks like you also include a scaling parameter somewhere.
Including an amplitude parameter improves on the original function, but is not necessarily safe: your data is noisy, and not dropping by much, so you can expect your uncertainties (and correlations) to be large.
Scaling the data before fitting probably would not really help here, since you don't have data down to x=0 and don't know what an appropriate scaling factor should be.
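A sketch of the two-parameter fit on simulated data, with a look at the parameter uncertainties and their correlation. The x range, the "true" parameters, and the noise level are assumptions chosen only to resemble the description above, not values taken from the question. nlinfit and nlparci require the Statistics Toolbox, which the question already uses.

```matlab
rng(0);                                   % reproducible noise
x = linspace(40, 150, 100)';              % hypothetical frequency axis
aTrue = [1.5, 2e-5];                      % hypothetical amplitude and curvature
y = aTrue(1) ./ (1 + aTrue(2)*x.^2) + 0.01*randn(size(x));

% Model with an amplitude a(1) in addition to the original parameter a(2).
model = @(a,x) a(1) ./ (1 + a(2)*x.^2);
a0 = [1, 1e-5];                           % rough initial guess

% Fit, keeping the residuals and the parameter covariance estimate.
[aFit, resid, ~, covB] = nlinfit(x, y, model, a0);

% 95% confidence intervals and the correlation between the two parameters.
ci = nlparci(aFit, resid, 'covar', covB);
corrAB = covB(1,2) / sqrt(covB(1,1) * covB(2,2));
```

Wide intervals in ci or a correlation close to +/-1 would indicate that the data do not constrain the two parameters independently.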

Related

Optimization problem - multivariable linear array, fitting data to data

I have a multivariable linear optimization problem that I could use some guidance with on finding an optimal function/code method (Matlab). My problem is as follows:
I have a set of observed data, I'll call this d(i), which is a 5000x1 vector (# of rows may change).
I have 10 - 100 sets of simulated data, the number of sets is a number I decide on. Each of these sets is also a 5000x1 vector (again, # of rows may change). I'll call these c1(i), c2(i), etc.
I would like to fit the simulated data sets to the observed data set with this equation:
sf1*c1(i) + sf2*c2(i) + sf3*c3(i) + sf4*c4(i) + ... = d(i) + error
In this equation, I would like to solve for all of the scale factors (sf) (non-negative constants) and the error. I am assuming I need to set initial values for all the scale factors for this problem to work. I have looked into things like lsqnonneg, but I am unclear on whether that function can solve or optimize for this many variables per equation.
See above - I have also manually input the values of some scale factors and I can get a pretty good fit to the data by hand, but this is impractical for large quantities of simulated data sets.
Did you try looking at https://www.mathworks.com/help/stats/linear-regression.html?s_tid=CRUX_lftnav ?
Instead of keeping c1, c2, ..., c100 as separate vectors, concatenate them as columns into a single 5000x100 matrix, say A = [c1, c2, ..., c100]; this will make life easier.
Then look, for example, at ridge regression:
sf = ridge(d,A,k);
where k is the regularization parameter, which can be found by (generalized) cross-validation:
[U,s,V] = svd(A,'econ');
k = gcv(U,diag(s),d,'tsvd');
see the function gcv here https://www.mathworks.com/matlabcentral/fileexchange/52-regtools
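Since the question asks for non-negative scale factors, here is a sketch using lsqnonneg from base MATLAB, which solves min ||A*sf - d|| subject to sf >= 0 and needs no initial values. The matrix A, the "true" factors, and the noise level are synthetic stand-ins for the simulated and observed data sets, not values from the question.

```matlab
rng(0);                                   % reproducible synthetic data
n = 5000;                                 % samples per data set
p = 10;                                   % number of simulated data sets

% Columns of A play the role of c1, c2, ..., c10.
A = rand(n, p);

% Hypothetical non-negative scale factors (some exactly zero).
sfTrue = [0.5; 0; 2; 0; 1; 0; 0; 3; 0; 0.25];

% "Observed" data: linear combination plus a small error term.
d = A * sfTrue + 0.01 * randn(n, 1);

% Non-negative least squares for all scale factors at once.
sf = lsqnonneg(A, d);

% The error term from the question is the residual.
err = d - A * sf;
```

The same call works for 100 columns; the cost grows with the number of columns, but nothing in the interface limits it to a few variables.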

Should I perform data centering before applying SVD?

I have to use SVD in Matlab to obtain a reduced version of my data.
I've read that the function svds(X,k) performs the SVD and returns the k largest singular values and the corresponding singular vectors. There is no mention in the documentation of whether the data have to be normalized.
By normalization I mean both subtraction of the mean value and division by the standard deviation.
When I implemented PCA, I used to normalize in this way. But I know that it is not needed when using the MATLAB function pca(), because it computes the covariance matrix using cov(), which implicitly performs the normalization.
So, the question is this. I need the projection matrix useful to reduce my n-dim data to k-dim ones by SVD. Should I normalize the training data (and therefore apply the same normalization to new data before projecting it) or not?
Thanks
Essentially, the answer is yes, you should typically perform normalization. The reason is that features can have very different scalings, and we typically do not want to take scaling into account when considering the uniqueness of features.
Suppose we have two features x and y, both with variance 1, but where x has a mean of 1 and y has a mean of 1000. Then the matrix of samples will look like
n = 500; % samples
x = 1 + randn(n,1);
y = 1000 + randn(n,1);
svd([x,y])
But the problem with this is that the scale of y (without normalizing) essentially washes out the small variations in x. Specifically, if we just examine the singular values of [x,y], we might be inclined to say that x is a linear factor of y (since one of the singular values is much smaller than the other). But actually, we know that that is not the case since x was generated independently.
In fact, you will often find that you only see the "real" data in a signal once you remove the mean. At the extreme end, you could imagine that we have some feature
z = 1e6 + sin(t)
Now if somebody just gave you those numbers, you might look at the sequence
z = 1000000.84, 1000000.91, 1000000.14, ...
and just think, "that signal is boring, it basically is just 1e6 plus some round off terms...". But once we remove the mean, we see the signal for what it actually is... a very interesting and specific one indeed. So long story short, you should always remove the means and scale.
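The effect described above can be checked numerically. This sketch reuses the x and y from the example and compares the singular values before and after centering and scaling; the variable names are illustrative, and bsxfun is used so that it also runs on releases without implicit expansion.

```matlab
rng(0);                                   % reproducible samples
n = 500;
x = 1 + randn(n, 1);
y = 1000 + randn(n, 1);
X = [x, y];

% Raw data: the large mean of y dominates the first singular value.
sRaw = svd(X);

% Remove the column means and divide by the column standard deviations.
Xc = bsxfun(@minus, X, mean(X, 1));
Xn = bsxfun(@rdivide, Xc, std(X, 0, 1));
sNorm = svd(Xn);

% Ratio of largest to smallest singular value in each case.
ratioRaw  = sRaw(1)  / sRaw(2);
ratioNorm = sNorm(1) / sNorm(2);
```

A very large ratioRaw next to a ratioNorm close to 1 reflects that the two features are in fact independent and equally variable.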
It really depends on what you want to do with your data. Centering and scaling can be helpful to obtain principal components that are representative of the shape of the variations in the data, irrespective of the scaling. I would say it is mostly needed if you want to further use the principal components themselves, particularly if you want to visualize them. It can also help during classification, since your scores will then be normalized, which may help your classifier. However, it depends on the application, since in some applications the energy also carries useful information that one should not discard. There is no general answer!
Now you write that all you need is "the projection matrix useful to reduce my n-dim data to k-dim ones by SVD". In this case, no need to center or scale anything:
[U,~] = svd(TrainingData);          % columns of TrainingData are samples
ReducedData = U(:,1:k)'*TestData;   % project onto the first k left singular vectors
will do the job. svds may be worth considering when your TrainingData is huge (in both dimensions) so that svd is too slow (if it is huge in only one dimension, just apply svd to the Gram matrix).
It depends!!!
A common use in signal processing where it makes no sense to normalize is noise reduction via dimensionality reduction in correlated signals, where all the features are contaminated with random Gaussian noise of the same variance. In that case, if the magnitude of a certain feature is twice as large, its SNR is also approximately twice as large, so normalizing the features makes no sense: it would just make the parts with the worse SNR larger and the parts with the good SNR smaller. You also don't need to subtract the mean in that case (as in PCA); the mean (or DC) isn't different from any other frequency.

Differentiating a Centred and Scaled Polyfit Fit

I have some data which I wish to model in order to be able to get relatively accurate values in the same range as the data.
To do this I used polyfit to fit a 6th-order polynomial. Because of my x-axis values, it suggested that I centre and scale the data to get a more accurate fit, which I did.
However, now I want to find the derivative of this function in order to model the velocity of my model.
But I am not sure how the polyder function interacts with the scaled and fitted polyfit which I have produced. (I don't want to use the unscaled model as this is not very accurate).
Here is some code which reproduces my problem. I attempted to rescale the x values before putting them into the fit for the derivative, but this still did not fix the problem.
x = 0:100;
y = 2*x.^2 + x + 1;
Fit = polyfit(x,y,2);
[ScaledFit,s,mu] = polyfit(x,y,2);
Deriv = polyder(Fit);
ScaledDeriv = polyder(ScaledFit);
plot(x,polyval(Deriv,x),'b.');
hold on
plot(x,polyval(ScaledDeriv,(x-mu(1))/mu(2)),'r.');
Here I have chosen a simple polynomial so that I could fit it accurately and produce the actual derivative.
Any help would be greatly appreciated thanks.
I am using Matlab R2014a BTW.
Edit.
Just been playing about with it, and dividing the resulting points for the differential by the standard deviation mu(2) gave a very close result, with errors in the range -3e-13 to about 5e-13.
polyval(ScaledDeriv,(x-mu(1))/mu(2))/mu(2);
Not sure quite why this is the case, is there another more elegant way to solve this?
Edit2. Sorry for another edit but again was mucking around and found that for a large sample x = 1:1000; the deviation became much bigger up to 10. I am not sure if this is due to a bad polyfit even though it is centred and scaled or due to the funny way the derivative is plotted.
Thanks for your time
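A simple application of the chain rule gives
$$\frac{dp}{dx} = \frac{dp}{d\hat{x}} \, \frac{d\hat{x}}{dx},$$
where $p$ is the fitted polynomial and $\hat{x}$ is the centred and scaled variable. Since by definition
$$\hat{x} = \frac{x - \mu_1}{\mu_2},$$
it follows that
$$\frac{dp}{dx} = \frac{1}{\mu_2} \, \frac{dp}{d\hat{x}},$$
which is exactly what you have verified numerically.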
The lack of accuracy for large samples is due to the global, rather than local, polynomial fit that you have performed. I would suggest that you try to fit your data with splines, and obtain the derivative with fnder(). Another option is to apply the polyfit() function locally, i.e. to a small moving set of points, and then apply polyder() to all the fitted polynomials.
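The chain-rule factor can be checked on the question's own test polynomial, whose exact derivative is 4*x + 1. This is a sketch only; the names xhat, dydx, and maxErr are introduced here for illustration.

```matlab
x = 0:100;
y = 2*x.^2 + x + 1;

% Centred and scaled fit: mu(1) is mean(x), mu(2) is std(x).
[ScaledFit, ~, mu] = polyfit(x, y, 2);

% Derivative with respect to the scaled variable xhat.
ScaledDeriv = polyder(ScaledFit);
xhat = (x - mu(1)) / mu(2);

% Chain rule: dp/dx = (dp/dxhat) / mu(2).
dydx = polyval(ScaledDeriv, xhat) / mu(2);

% Compare against the exact derivative of 2*x^2 + x + 1.
maxErr = max(abs(dydx - (4*x + 1)));
```

Equivalently, the division by mu(2) can be folded into the coefficients once, as ScaledDeriv / mu(2), before calling polyval.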

Matlab gmdistribution.fit only finding one peak

I have an array named Area, which contains a set of values.
The histogram of the array looks like this
The bin width is 60 in this case. I'd like to fit two gaussians to the two peaks here (even if it won't be a great fit).
So I used:
options = statset('Display','final');
obj = gmdistribution.fit(Area,2,'Options',options);
gausspdf = pdf(obj, xaxis);
A = sum(gausspdf);
gausspdf = gausspdf/A;
But when I try to plot the two fitted Gaussians, the resulting curve looks like this:
I'm quite confused, as there should be two peaks appearing in the plot?
The gmdistribution.fit method fits data according to the maximum-likelihood criterion; that is, it tries to find the parameters which maximize the likelihood given the data. It will not necessarily fit what you see or expect visually. Still, there is the possibility that the algorithm converged to a "bad" local optimum. You can try to set the initial conditions according to what you want to get, practically 'helping' the algorithm to converge to the desired result. You do this using the Start option of the fit method, which lets you give it either an initial guess, in which case you should try to estimate the parameters from the histogram, or an initial component index for each data sample. See the documentation for more details.
I think that your peaks are too close and the function can't distinguish them. So maybe you should change the options for gmdistribution, or apply a non-linear function to your data first to get more widely separated peaks in the histogram.
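A sketch of the Start option using an initial component index per sample. The data are simulated, and the threshold of 500 is a made-up value that would in practice be read off the valley between the two histogram peaks. The sketch calls fitgmdist, assuming a release that provides it (Statistics Toolbox, R2014a or later); gmdistribution.fit takes the same 'Start' argument in older releases.

```matlab
rng(0);                                   % reproducible samples
% Hypothetical bimodal data standing in for the Area array.
Area = [300 + 60*randn(400, 1); 700 + 80*randn(300, 1)];

% Initial component index for each sample: 1 below the valley, 2 above.
threshold = 500;                          % assumed valley position
startIdx = 1 + (Area > threshold);

% Fit two Gaussians, starting from that initial partition.
gm = fitgmdist(Area, 2, 'Start', startIdx);

% Evaluate the mixture density on a column vector of x values.
xaxis = linspace(min(Area), max(Area), 200)';
gausspdf = pdf(gm, xaxis);

% Fitted component means, sorted for easy comparison.
muSorted = sort(gm.mu);
```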

How to fit low frequencies in octave using invfreqs?

I am trying to find a transfer function from frequency response data using invfreqs in octave.
In principle it works, the problem is that the resulting transfer function is always fitting the highest frequencies, low frequencies are badly matched.
Trying to weight the fit-errors versus frequency doesn't work. Am I doing something wrong?
Hg = 10.^(mg/20).*exp(i*pg*pi/180);
wt(fgrps>1500) = 0;
m = 44;
n = 52;
[Bg,Ag] = invfreqs(Hg,fgrps,m,n,wt);
This is the result I get:
The result is more or less the same for different orders of the numerator and denominator polynomials. High frequencies are matched well, low frequencies are matched badly.
What can I do about this?
Thank you very much in advance!
Kind regards
Stefan
My first guess is that this comes from the scaling. The frequency domain is most often shown on a logarithmic scale (for convenience, and judging from your plot this applies to you too). The function is therefore not fit the way you'd "imagine", but rather scaled and then fit. On a logarithmic scale, higher values are represented more often, so your fit is better there.
So what you should do is find out what scaling is applied, and try a linear frequency scaling. Bear in mind that this is also not a "good" idea, so try to find a frequency vector that covers the range where you need the fit to be close, and fit with that.