[Figure: problem setup showing several known bases and the measured distances from an unknown point to each of them]
My initial idea is to use Euclidean distances, but I do not understand how to solve this task. Does anyone have any hints on how to approach it?
The least-squares error is simply the sum of the squared differences between the distances computed from the estimated point to the bases and the measured distances.
Now you can run Levenberg-Marquardt to minimize this objective function.
Hint: for a good initial approximation, you can solve the problem for two bases, as there is a closed-form solution. There will be two solutions, and you can discriminate using the distance to the third base.
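As a hedged illustration of that objective in Matlab (the base coordinates B, measured ranges r, and initial guess p0 below are made-up placeholders), lsqnonlin minimises the sum of squared residuals, and its options also let you select the Levenberg-Marquardt algorithm:
B = [0 0; 10 0; 0 10];            % hypothetical base positions, one per row
r = [5; 7; 8];                    % hypothetical measured distances to the unknown point
% residual vector: distance from candidate point p to each base, minus the measurement
res = @(p) sqrt(sum((B - repmat(p(:)', size(B,1), 1)).^2, 2)) - r;
p0 = [1 1];                       % e.g. taken from the two-base closed-form solution
p_est = lsqnonlin(res, p0);       % minimises sum(res(p).^2)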
I am fitting data to a system of nonlinear ODEs to estimate model parameters using Matlab's lsqcurvefit.
The fit depends heavily on the initial guesses that I use for lsqcurvefit.
For example, if I use x0 = 5 as an initial guess I get a residual norm of 30, whereas if I choose x0 = 5.2 I get a residual norm of 1.5.
1) What does the residual norm (resnorm) in Matlab represent? Is it the sum of the squared errors? Is there a way to decide what range of values for resnorm is acceptable?
2) When the fit depends so much on the initial guess, is there a way to deal with this? How would I know whether a better fit could be obtained from a different initial guess?
3) When using lsqcurvefit, is it required to check whether the residuals are normally distributed?
lsqcurvefit fits your data in the least-squares sense. Thus it all comes down to the minimisation, and since your model is nonlinear you have no guarantee that the minimum found is the global minimum, nor that it is unique.
E.g. consider the function sin(x): which x-value minimises it? All of x = 2*pi*n + 3/2*pi for n = 0, 1, 2, ... do, but a numerical method can only return one solution, and which one it returns will depend on your initial guess.
To elaborate further: the simplest (in my opinion) minimisation algorithm is steepest descent. It uses the idea, known from calculus, that the direction of steepest descent is minus the gradient. So it evaluates the gradient at the current point, takes a step in the negative gradient direction (scaled by some step size), and repeats until the step/derivative is sufficiently small.
However, even for the function cos(3*pi*x)/x on [0.5, inf), which does have a unique global minimum near x = 1, you only find it if your initial guess lies between roughly 0.7 and 1.3. All other guesses converge to their respective local minima.
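To make that concrete, here is a minimal steepest-descent sketch on that function (the fixed step size and tolerance are arbitrary choices for illustration); try x = 0.9 versus x = 1.5 as the starting point and you land in different minima:
f  = @(x) cos(3*pi*x)./x;
df = @(x) (-3*pi*sin(3*pi*x).*x - cos(3*pi*x))./x.^2;   % derivative of f
x    = 0.9;        % 0.9 converges near x = 1; 1.5 finds a local minimum near 5/3
step = 1e-3;       % fixed step size, for simplicity
for k = 1:10000
    g = df(x);
    if abs(g) < 1e-8, break; end
    x = x - step*g;                % move against the gradient
end
fprintf('converged to x = %.4f, f(x) = %.4f\n', x, f(x));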
With this we can answer your questions:
1) resnorm is the squared 2-norm of the residual, i.e. the sum of squared errors sum((fun(x,xdata) - ydata).^2). What would it mean for a specific value of that norm to be acceptable? The algorithm is looking for a minimum; if you are already at a minimum, what would it mean to continue the search?
2) Not in a (pseudo-)exact sense. What is typically done is to use your knowledge of the problem to come up with a sensible initial guess. If this is not possible, you simply have to make repeated random initial guesses and keep the best, as in the sketch after this list.
3) It depends on what you want to do. If you want to run statistical tests that rely on the residuals being normally distributed, then YES. If you are solely interested in fitting the function with the lowest residual norm, then NO.
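For question 2, a hedged multi-start sketch (model, xdata, ydata and the guess range are placeholders for whatever you pass to lsqcurvefit):
best_resnorm = Inf;
for k = 1:50
    x0 = 10*rand;                                    % random guess in a plausible range
    [x, resnorm] = lsqcurvefit(model, x0, xdata, ydata);
    if resnorm < best_resnorm
        best_x = x;
        best_resnorm = resnorm;
    end
end
You still have no guarantee of finding the global minimum, but the best of many starts is usually a far better candidate than a single arbitrary guess.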
I'm trying to use a naive Bayes classifier to classify my dataset. My questions are:
1- Usually when we try to calculate the likelihood we use the formula:
P(c|x) = P(c|x1) * P(c|x2) * ... * P(c|xn) * P(c). But some examples say that, in order to avoid getting very small results, we should use P(c|x) = exp(log P(c|x1) + log P(c|x2) + ... + log P(c|xn) + log P(c)). Can anyone explain the difference between these two formulas? Are they both used to calculate the "likelihood", or is the second one used to calculate something called "information gain"?
2- In some cases, when we try to classify our datasets, some joint counts are zero. Some people use the "Laplace smoothing" technique in order to avoid these zero joints. Doesn't this technique influence the accuracy of our classification?
Thanks in advance for your time. I'm new to this algorithm and trying to learn more about it. Are there any recommended papers I should read? Thanks a lot.
I'll take a stab at your first question. I think the equation you are ultimately driving towards is:
log P(c|x) = log P(c|x1) + log P(c|x2) + ... + log P(c)
If so, the examples are pointing out that in many statistical calculations, it's often easier to work with the logarithm of a distribution function, as opposed to the distribution function itself.
Practically speaking, it's related to the fact that many statistical distributions involve an exponential function. For example, you can find where the maximum of a Gaussian density K*exp(-s_0*(x-x_0)^2) occurs by solving the mathematically simpler problem (if we're going through the whole formal process of taking derivatives and finding roots) of finding where the maximum of its logarithm, log(K) - s_0*(x-x_0)^2, occurs.
This leads to many places where "take the logarithm of both sides" is a standard step in an optimization calculation.
Also, computationally, when you are optimizing likelihood functions that may involve many multiplicative terms, adding logarithms of small floating-point numbers is less likely to cause numerical problems than multiplying those small numbers together.
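A hedged sketch of that numerical point in Matlab (the per-feature probabilities are made up):
p = 1e-5 * ones(1, 100);     % 100 hypothetical per-feature likelihoods
prod(p)                      % underflows to 0 in double precision (true value is 1e-500)
sum(log(p))                  % stays finite: 100*log(1e-5) is about -1151.3
% compare classes by their log-scores and exponentiate only if you really need to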
So I understand that with an SVM, ||w|| is the norm of the normal vector to the hyperplane. I'm wondering, though, whether this ||w|| ever changes in LibSVM. I ask because I want to find the distance from certain vectors to the hyperplane. The problem is that MATLAB LibSVM doesn't natively do this. It does, however, give decision values that are related to the distance and to ||w||.
tldr; ||w||--->LibSVM-->is this value constant?
Obviously not. The norm of the normal to the hyperplane is the whole point of the SVM optimization; I would say it is the only thing truly "variable" in the SVM. What the model is doing is trying to minimize this term, so it cannot be constant.
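For what it's worth, a hedged sketch of how the distance can be recovered with the LibSVM Matlab interface, assuming a linear kernel (labels and data are placeholders; SVs, sv_coef and rho are the model fields the wrapper exposes):
model = svmtrain(labels, sparse(data), '-t 0');       % linear kernel
w = full(model.SVs' * model.sv_coef);                 % w = sum_i alpha_i * y_i * x_i
b = -model.rho;
decision  = data*w + b;                               % proportional to the reported decision values
distances = decision / norm(w);                       % signed distances to the hyperplane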
I have two data sets (t, y1) and (t, y2). These data sets look visually the same, but there is some time delay or magnitude shift between them. I want to find the similarity between the two curves (a score of 1 for approximately similar curves and 0 for dissimilar curves). Some curves seem different because of oscillation in the data, so I am searching for a method to find the similarity between the curves. I already tried Matlab's gradient command to find the slope of each curve at each time step and compared the slopes, but that did not give satisfactory results. Can anybody suggest a method for finding the similarity between the curves?
Thanks in Advance
This answer assumes your y1 and y2 are signals rather than curves. The latter I would try to parametrise with POLYFIT.
If they really look the same, but are shifted in time (and not wrapped around) then you can:
y1n = y1/norm(y1);                 % normalise out the magnitude
y2n = y2/norm(y2);
normratio = norm(y1)/norm(y2);     % relative magnitude of the two signals
[c, lags] = xcorr(y1n, y2n);       % cross-correlation, i.e. convolution with one signal reversed
[val, idx] = max(c);
shift = lags(idx);                 % estimated time shift in samples
shift will indicate the time shift (in samples) and normratio the difference in magnitude.
Both can be used as features for your similarity metric. I assume, however, that your signals actually vary by more than just a time shift or magnitude, in which case some sort of signal parametrisation may be a better choice, followed by building a metric on those parameters.
Without knowing anything about your data I would first try with AR (assuming things as typical as FFT or PRINCOMP won't work).
For time-series similarity measurement, one traditional solution is DTW (Dynamic Time Warping).
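A hedged, toolbox-free sketch of the DTW distance between two 1-D series (save as dtw_dist.m; smaller values mean more similar):
function D = dtw_dist(a, b)
% classic O(na*nb) dynamic-programming DTW with absolute-difference cost
na = numel(a); nb = numel(b);
C = inf(na+1, nb+1);
C(1,1) = 0;
for i = 1:na
    for j = 1:nb
        cost = abs(a(i) - b(j));
        C(i+1, j+1) = cost + min([C(i, j+1), C(i+1, j), C(i, j)]);
    end
end
D = C(end, end);
end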
Kolmogorov-Smirnov test (kstest2 function in Matlab)
Chi-square test
To measure similarity there is a measure called MIC (maximal information coefficient). It quantifies the information shared between two datasets or curves.
The dv and dc distance in the following paper may solve your problem.
http://bioinformatics.oxfordjournals.org/content/27/22/3135.full
I have a problem where I am fitting a high-order polynomial to (not very) noisy data using linear least squares. Currently I'm using polynomial orders around 15-25, which work surprisingly well: the dependence is very nearly linear, but the accuracy of modelling the 'very nearly' is critical. I'm using Matlab's polyfit() function and (obviously) normalising the x-data. This generally works fine, but I have come across an issue with some recent datasets: the fitted polynomial has extrema within the x-data interval. For the application I'm working on this is a no-no: the polynomial model must have no stationary points over the x-interval.
So I need to add a constraint to the least-squares problem: the derivative of the fitted polynomial must be strictly positive over a known x-range (or strictly negative - this depends on the data but a simple linear fit will quickly tell me which it is.) I have had a quick look at the available optimisation toolbox functions, but I admit I'm at a loss to know how to go about this. Does anyone have any suggestions?
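To make the constraint concrete, here is my hedged sketch of how I imagine the problem could be posed with lsqlin from the Optimization Toolbox (x, y, the degree n, the constraint grid and the margin 1e-6 are all placeholders):
n  = 15;
xs = (x - mean(x)) / std(x);                       % normalise as polyfit would
C  = bsxfun(@power, xs(:), n:-1:0);                % Vandermonde matrix, polyfit coefficient order
d  = y(:);
xg = linspace(min(xs), max(xs), 200)';             % grid on which the derivative is constrained
D  = bsxfun(@times, bsxfun(@power, xg, n-1:-1:0), n:-1:1);
D  = [D, zeros(numel(xg), 1)];                     % constant term contributes nothing to p'
% enforce p'(xg) >= 1e-6, i.e. -D*p <= -1e-6, while minimising ||C*p - d||
p  = lsqlin(C, d, -D, -1e-6*ones(numel(xg), 1));
Conditioning at degree 15-25 will be just as delicate as with polyfit, so the normalisation matters; flipping the sign of the constraint handles the strictly decreasing case.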
[I appreciate there are probably better models than polynomials for this data, but in the short term it isn't feasible to change the form of the model]
[A closing note: I have finally got the go-ahead to replace this awful polynomial model! I am going to adopt a nonparametric approach, spline smoothing, using the excellent SPLINEFIT code by Jonas Lundgren. This has the advantage that I'm already using a spline model in the end-user application, so I already have C# code available to evaluate a spline model]
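For reference, my hedged sketch of the usage I expect, assuming the File Exchange SPLINEFIT interface (the number of pieces is illustrative):
pp   = splinefit(x, y, 10);       % piecewise-cubic least-squares spline with 10 pieces
yhat = ppval(pp, x);              % evaluate the fitted spline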
You could use cftool and use the exclude data points option.