Least-squares minimization within threshold in MATLAB - matlab

The cvx suite for MATLAB can solve the (seemingly innocent) optimization problem below, but it is rather slow for the large, full matrices I'm working with. I'm hoping this is because using cvx is overkill, and that the problem actually has an analytic solution, or that a clever use of some built-in MATLAB functions can more quickly do the job.
Background: It is well-known that both x1=A\b and x2=pinv(A)*b solve the least-squares problem:
minimize norm(A*x-b)
with the distinction that norm(x2)<=norm(x1). In fact, x2 is the minimum-norm solution to the problem, so norm(x2)<=norm(x) for all possible solutions x.
Defining D=norm(A*x2-b), (equivalently D=norm(A*x1-b)), then x2 solves the problem
minimize norm(x)
subject to
norm(A*x-b) == D
Problem: I'd like to find the solution to:
minimize norm(x)
subject to
norm(A*x-b) <= D+threshold
In words, I don't need norm(A*x-b) to be as small as possible, just within a certain tolerance. I want the minimum-norm solution x that gets A*x within D+threshold of b.
I haven't been able to find an analytic solution to the problem (like using the pseudoinverse in the classic least-squares problem) on the web or by hand. I've been searching things like "least squares with nonlinear constraint" and "least squares with threshold".
Any insights would be greatly appreciated, but I suppose my real question is:
What is the fastest way to solve this "thresholded" least-squares problem in MATLAB?

Interesting question. I do not know the answer to your exact question, but a working solution is presented below.
Recap
Define res(x) := norm(Ax - b).
As you state, x2 minimizes res(x). In the overdetermined case (typically A having more rows than col's), x2 is the unique minimum. In the underdetermined case, it is joined by infinitely many others*. However, among all of these, x2 is the unique one that minimizes norm(x).
To summarize, x2 minimizes (1) res(x) and (2) norm(x), and it does so in that order of priority. In fact, this characterizes (fully determines) x2.
The limit characterization
But, another characterization of x2 is
x2 := limit_{e-->0} x_e
where
x_e := argmin_{x} J(x;e)
where
J(x;e) := res(x) + e * norm(x)
It can be shown that
x_e = (A A' + e I)^{-1} A' b (eqn a)
It should be appreciated that this characterization of x2 is quite magical. The limits exists even if (A A')^{-1} does not. And the limit somehow preserves priority (2) from above.
Using e>0
Of course, for finite (but small) e, x_e will not minimize res(x)(instead it minimzes J(x;e)). In your terminology, the difference is the threshold. I will rename it to
gap := res(x_e) - min_{x} res(x).
Decreasing the value of e is guaranteed to decrease the value of the gap. Reaching a specific gap value (i.e. the threshold) is therefore easy to achieve by tuning e.**
This type of modification (adding norm(x) to the res(x) minimization problem) is known as regularization in statistics litterature, and is generally considered a good idea for stability (numerically and with respect to parameter values).
*: Note that x1 and x2 only differ in the underdetermined case
**:It does not even require any heavy computations, because the inverse in (eqn a) is easily computed for any (positive) value of e if the SVD of A has already been computed.

Related

How to solve that recurence by itrative method ....?

I have tried but faild to find big o of
T(n)=3t(n/3)+n/lg(n)
Can anyone please give me a solution
A good way to visualize this problem is a Recursive Tree Diagram.
The first three levels of the tree are drawn here.
For the first level, we have a total work of n / lg n
For the second level, we have 3 calls of (n/3) / lg (n/3). Summing these calls gives a total work of n / lg (n/3) at this level.
For the third level, we have 9 calls of (n/9) / lg (n/9). Summing these calls gives a total work of n / lg (n/9) or n / lg (n/3^2) at this level.
The recursive calls will continue until we have a call of T(1). This condition is met at n/3^k = 1 or k = log3(n)
So now we have a simple summation of all of the levels equal to this.
n is a constant and can be pulled out of the equation. Expanding the summation then gives this equation.
We can simplify this summation as shown here.
In Big Theta, the summation can be simplified to Θ(n*(1+log(logn)) as the summation is a harmonic series.
Simplifying further, we have Θ(n+n*log(logn)) and due to the rules of Big Theta, we can simplify to a final Big Theta of Θ(nlog(logn)) as the first n will grow at a slower rate and doesn't really matter in our final equation.
But wait! You asked for Big O. Thankfully, Big Theta provides both an upper and lower bound for the problem. Big O only provides an upper bound. What does this mean for us? Big Theta actually is Big O, but not the other way around allowing you to say that the recurrence relation is O(nlog(logn)).
Hope this helps!

Matlab: Solve for a single variable in a linear system of equations

I have a linear system of about 2000 sparse equations in Matlab. For my final result, I only really need the value of one of the variables: the other values are irrelevant. While there is no real problem in simply solving the equations and extracting the correct variable, I was wondering whether there was a faster way or Matlab command. For example, as soon as the required variable is calculated, the program could in principle stop running.
Is there anyone who knows whether this is at all possible, or if it would just be easier to keep solving the entire system?
Most of the computation time is spent inverting the matrix, if we can find a way to avoid completely inverting the matrix then we may be able to improve the computation time. Lets assume I'm only interested in the solution for the last variable x(N). Using the standard method we compute
x = A\b;
res = x(N);
Assuming A is full rank, we can instead use LU decomposition of the augmented matrix [A b] to get x(N) which looks like this
[~,U] = lu([A b]);
res = U(end,end-1)/U(end,end);
This is essentially performing Gaussian elimination and then solving for x(N) using back-substitution.
We can extend this to find any value of x by swapping the columns of A before LU decomposition,
x_index = 123; % the index of the solution we are interested in
A(:,[x_index,end]) = A(:,[end,x_index]);
[~,U] = lu([A b]);
res = U(end,end)/U(end,end-1);
Bench-marking performance in MATLAB2017a with 10,000 random 200 dimensional systems we get a slight speed-up
Total time direct method : 4.5401s
Total time LU method : 3.9149s
Note that you may experience some precision issues if A isn't well conditioned.
Also, this approach doesn't take advantage of the sparsity of A. In my experiments even with 2000x2000 sparse matrices everything significantly slowed down and the LU method is significantly slower. That said full matrix representation only requires about 30MB which shouldn't be a problem on most computers.
If you have access to theory manuals on NASTRAN, I believe (from memory) there is coverage of partial solutions of linear systems. Also try looking for iterative or tri diagonal solvers for A*x = b. On this page, review the pqr solution answer by Shantachhani. Another reference.

Fit a quadratic function of two variables (Practitioner Black Scholes Deterministic Volatility Functions)

I am attempting to fit the parameters of a deterministic volatility function for use in the practitioner Black Scholes model.
The formula for which I want to estimate the "a" parameters is:
sig = a0 + a1*K + a2*K^2 + a3*T + a4*T^2 + a5*KT
Where sig, K and T are known; I have multiple observations of K, T and sig combinations but only want a single set of "a" parameters.
How might I go about this? My google searches and own attempts all failed, unfortunately.
Thank you!
The function lsqcurvefit allows you to define the function that you want to fit. It should be straight forward from there on.
http://se.mathworks.com/help/optim/ug/lsqcurvefit.html
Some Mathematics
Notation stuff: index your observations by i and add an error term.
sig_i = a0 + a1*K_i + a2*K_i^2 + a3*T_i + a4*T_i^2 + a5*KT_i + e_i
Something probably not insane to do would be to minimize the square of the error term:
minimize (over a) \sum_i e_i^2
The solution to least squares is a simple linear algebra problem. (See https://stats.stackexchange.com/questions/186196/understanding-linear-algebra-in-ordinary-least-squares-derivation/186289#186289 for a solution if you really care.) (Further note: e_i is a linear function of a. I'm not sure why you would need lsqcurvefit as another answer suggested?)
Matlab Code for OLS (Ordinary Least Squares)
Assuming sig, K, T, and KT are n by 1 vectors
y = sig;
X = [ones(length(sig),1), K, K.^2, T, T.^2, KT];
a = X \ y; %basically computes a = inv(X'*X)*(X'*y) but in a better way
This an ordinary least squares regression of y on X.
Further Ideas
Depending on the distribution of your error terms, correlated error etc... regular OLS may be inefficient or possibly even inappropriate... I'm not familiar with the details of this problem to know. You may want to check what people do.
Eg. a technique that's less sensitive to big outliers is to minimize the absolute value of the error.
minimize (over a) \sum_i |a_i|
If you have a good, statistical model of how the data is generated you could do maximum likelihood estimation. Anyway... this rapidly devolve into a multi-quarter, statistics class.

What is the fastest method for solving exp(ax)-ax+c=0 for x in matlab

What is the least computational time consuming way to solve in Matlab the equation:
exp(ax)-ax+c=0
where a and c are constants and x is the value I'm trying to find?
Currently I am using the in built solver function, and I know the solution is single valued, but it is just taking longer than I would like.
Just wanting something to run more quickly is insufficient for that to happen.
And, sorry, but if fzero is not fast enough then you won't do much better for a general root finding tool.
If you aren't using fzero, then why not? After all, that IS the built-in solver you did not name. (BE EXPLICIT! Otherwise we must guess.) Perhaps you are using solve, from the symbolic toolbox. It will be more slow, since it is a symbolic tool.
Having said the above, I might point out that you might be able to improve by recognizing that this is really a problem with a single parameter, c. That is, transform the problem to solving
exp(y) - y + c = 0
where
y = ax
Once you know the value of y, divide by a to get x.
Of course, this way of looking at the problem makes it obvious that you have made an incorrect statement, that the solution is single valued. There are TWO solutions for any negative value of c less than -1. When c = -1, the solution is unique, and for c greater than -1, no solutions exist in real numbers. (If you allow complex results, then there will be solutions there too.)
So if you MUST solve the above problem frequently and fzero was inadequate, then I would consider a spline model, where I had precomputed solutions to the problem for a sufficient number of distinct values of c. Interpolate that spline model to get a predicted value of y for any c.
If I needed more accuracy, I might take a single Newton step from that point.
In the event that you can use the Lambert W function, then solve actually does give us a solution for the general problem. (As you see, I am just guessing what you are trying to solve this with, and what are your goals. Explicit questions help the person trying to help you.)
solve('exp(y) - y + c')
ans =
c - lambertw(0, -exp(c))
The zero first argument to lambertw yields the negative solution. In fact, we can use lambertw to give us both the positive and negative real solutions for any c no larger than -1.
X = #(c) c - lambertw([0 -1],-exp(c));
X(-1.1)
ans =
-0.48318 0.41622
X(-2)
ans =
-1.8414 1.1462
Solving your system symbolically
syms a c x;
fx0 = solve(exp(a*x)-a*x+c==0,x)
which results in
fx0 =
(c - lambertw(0, -exp(c)))/a
As #woodchips pointed out, the Lambert W function has two primary branches, W0 and W−1. The solution given is with respect to the upper (or principal) branch, denoted W0, your equation actually has an infinite number of complex solutions for Wk (the W0 and W−1 solutions are real if c is in [−∞, 0]). In Matlab, lambertw is only implemented for symbolic inputs and thus is very slow method of solving your equation if you're interested in numerical (double precision) solutions.
If you wish to solve such equations numerically in an efficient manner, you might look at Corless, et al. 1996. But, as long as your parameter c is in [−∞, 0], i.e., -exp(c) in [−1/e, 0] and you're interested in the W0 branch, you can use the Matlab code that I wrote to answer a similar question at Math.StackExchange. This code should be much much more efficient that using a naïve approach with fzero.
If your values of c are not in [−∞, 0] or you want the solution corresponding to a different branch, then your solution may be complex-valued and you won't be able to use the simple code I linked to above. In that case, you can more fully implement the function by reading the Corless, et al. 1996 paper or you can try converting the Lambert W to a Wright ω function: W0(z) = ω(log(z)), W−1(z) = ω(log(z)−2πi). In your case, using Matlab's wrightOmega, the W0 branch corresponds to:
fx0 =
(c - wrightOmega(log(-exp(c))))/a
and the W−1 branch to:
fxm1 =
(c - wrightOmega(log(-exp(c))-2*sym(pi)*1i))/a
If c is real, then the above reduces to
fx0 =
(c - wrightOmega(c+sym(pi)*1i))/a
and
fxm1 =
(c - wrightOmega(c-sym(pi)*1i))/a
Matlab's wrightOmega function is also symbolic only, but I have written a double precision implementation (based on Lawrence, et al. 2012) that you can find on my GitHub here and that is 3+ orders of magnitude faster than evaluating the function symbolically. As your problem is technically in terms of a Lambert W, it may be more efficient, and possibly more numerically accurate, to implement that more complicated function for the regime of interest (this is due to the log transformation and the extra evaluation of a complex log). But feel free to test.

How to overcome singularities in numerical integration (in Matlab or Mathematica)

I want to numerically integrate the following:
where
and a, b and β are constants which for simplicity, can all be set to 1.
Neither Matlab using dblquad, nor Mathematica using NIntegrate can deal with the singularity created by the denominator. Since it's a double integral, I can't specify where the singularity is in Mathematica.
I'm sure that it is not infinite since this integral is based in perturbation theory and without the
has been found before (just not by me so I don't know how it's done).
Any ideas?
(1) It would be helpful if you provide the explicit code you use. That way others (read: me) need not code it up separately.
(2) If the integral exists, it has to be zero. This is because you negate the n(y)-n(x) factor when you swap x and y but keep the rest the same. Yet the integration range symmetry means that amounts to just renaming your variables, hence it must stay the same.
(3) Here is some code that shows it will be zero, at least if we zero out the singular part and a small band around it.
a = 1;
b = 1;
beta = 1;
eps[x_] := 2*(a-b*Cos[x])
n[x_] := 1/(1+Exp[beta*eps[x]])
delta = .001;
pw[x_,y_] := Piecewise[{{1,Abs[Abs[x]-Abs[y]]>delta}}, 0]
We add 1 to the integrand just to avoid accuracy issues with results that are near zero.
NIntegrate[1+Cos[(x+y)/2]^2*(n[x]-n[y])/(eps[x]-eps[y])^2*pw[Cos[x],Cos[y]],
{x,-Pi,Pi}, {y,-Pi,Pi}] / (4*Pi^2)
I get the result below.
NIntegrate::slwcon:
Numerical integration converging too slowly; suspect one of the following:
singularity, value of the integration is 0, highly oscillatory integrand,
or WorkingPrecision too small.
NIntegrate::eincr:
The global error of the strategy GlobalAdaptive has increased more than
2000 times. The global error is expected to decrease monotonically after a
number of integrand evaluations. Suspect one of the following: the
working precision is insufficient for the specified precision goal; the
integrand is highly oscillatory or it is not a (piecewise) smooth
function; or the true value of the integral is 0. Increasing the value of
the GlobalAdaptive option MaxErrorIncreases might lead to a convergent
numerical integration. NIntegrate obtained 39.4791 and 0.459541
for the integral and error estimates.
Out[24]= 1.00002
This is a good indication that the unadulterated result will be zero.
(4) Substituting cx for cos(x) and cy for cos(y), and removing extraneous factors for purposes of convergence assessment, gives the expression below.
((1 + E^(2*(1 - cx)))^(-1) - (1 + E^(2*(1 - cy)))^(-1))/
(2*(1 - cx) - 2*(1 - cy))^2
A series expansion in cy, centered at cx, indicates a pole of order 1. So it does appear to be a singular integral.
Daniel Lichtblau
The integral looks like a Cauchy Principal Value type integral (i.e. it has a strong singularity). That's why you can't apply standard quadrature techniques.
Have you tried PrincipalValue->True in Mathematica's Integrate?
In addition to Daniel's observation about integrating an odd integrand over a symmetric range (so that symmetry indicates the result should be zero), you can also do this to understand its convergence better (I'll use latex, writing this out with pen and paper should make it easier to read; it took a lot longer to write than to do, it's not that complicated):
First, epsilon(x)-\epsilon(y)\propto\cos(y)-\cos(x)=2\sin(\xi_+)\sin(\xi_-) where I have defined \xi_\pm=(x\pm y)/2 (so I've rotated the axes by pi/4). The region of integration then is \xi_+ between \pi/\sqrt{2} and -\pi/\sqrt{2} and \xi_- between \pm(\pi/\sqrt{2}-\xi_-). Then the integrand takes the form \frac{1}{\sin^2(\xi_-)\sin^2(\xi_+)} times terms with no divergences. So, evidently, there are second-order poles, and this isn't convergent as presented.
Perhaps you could email the persons who obtained an answer with the cos term and ask what precisely it is they did. Perhaps there's a physical regularisation procedure being employed. Or you could have given more information on the physical origin of this (some sort of second order perturbation theory for some sort of bosonic system?), had that not been off-topic here...
May be I am missing something here, but the integrand
f[x,y]=Cos^2[(x+y)/2]*(n[x]-n[y])/(eps[x]-eps[y]) with n[x]=1/(1+Exp[Beta*eps[x]]) and eps[x]=2(a-b*Cos[x]) is indeed a symmetric function in x and y: f[x,-y]= f[-x,y]=f[x,y].
Therefore its integral over any domain [-u,u]x[-v,v] is zero. No numerical integration seems to be needed here. The result is just zero.