I have only a little programming experience, so I'm pretty sure I didn't code the problem in the optimal way and would be happy to hear any hints.
I have two parameters: the dimension of the problem n and an N x N matrix of constraints B, where N = 2n. In my case B is symmetric and has only positive values. I need to solve the following problem:
maximize   (1/n) * sum_{i=1}^{n} (x_i - x_{n+i})
subject to x_i - x_j <= B(i,j)   for all i ~= j
That is, I need to maximize a certain average of the distances subject to constraints on the pairwise distances given by B(i,j).
The way I'm doing it now is a call to linprog(-f,A,b) where
f = ones([1,n])/n;
f = [f -f]
and
b = reshape(B',numel(B),[])
and A is defined as follows
A = zeros([N^2,N]);
for i = 1:N
    for j = 1:N
        if i ~= j
            % row (i-1)*N + j encodes the constraint x(i) - x(j) <= B(i,j)
            A((i-1)*N + j,i) = 1;
            A((i-1)*N + j,j) = -1;
        end
    end
end
However, when n = 500, even the simple construction of A takes quite some time, not to mention how long the solution of the linear program takes. Any hints are highly appreciated, and please feel free to retag.
First of all, try constructing A like so:
AI = eye(N);
AV = ones(N, 1);
A = kron(AI, AV) - kron(AV, AI);
I think it should run by at least an order of magnitude faster than the way you're creating it.
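Since each row of A has at most two nonzero entries, you may also want to build it as a sparse matrix: for n = 500 we have N = 1000, so a dense A is 10^6-by-10^3 and needs roughly 8 GB as doubles, while the sparse version takes only a few megabytes. A sketch, reusing the f, b and N defined above (linprog accepts sparse constraint matrices):
% Sketch: sparse construction of the same constraint matrix
Av = sparse(ones(N,1));
A  = kron(speye(N), Av) - kron(Av, speye(N));
x  = linprog(-f, A, b);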
In addition to creating your problem matrix in a more efficient way, you may want to look into using glpk with the glpkmex interface for MATLAB. I've found that my solution times can decrease substantially. You may see another order of magnitude decrease depending on the problem size.
If you are an academic you can get either CPLEX or Gurobi licenses for free, which should net you further decreases in solution time without a lot of fiddling around with solver parameters. This may be necessary with a problem of the size you describe.
I have a linear system of about 2000 sparse equations in Matlab. For my final result, I only really need the value of one of the variables: the other values are irrelevant. While there is no real problem in simply solving the equations and extracting the correct variable, I was wondering whether there was a faster way or Matlab command. For example, as soon as the required variable is calculated, the program could in principle stop running.
Is there anyone who knows whether this is at all possible, or if it would just be easier to keep solving the entire system?
Most of the computation time is spent factorizing the matrix and back-substituting for all of the unknowns; if we can avoid part of that work, we may be able to improve the computation time. Let's assume I'm only interested in the solution for the last variable x(N). Using the standard method we compute
x = A\b;
res = x(N);
Assuming A is full rank, we can instead use an LU decomposition of the augmented matrix [A b] to get x(N), which looks like this:
[~,U] = lu([A b]);
res = U(end,end)/U(end,end-1);
This is essentially performing Gaussian elimination and then solving for x(N) using back-substitution.
We can extend this to find any value of x by swapping the columns of A before LU decomposition,
x_index = 123; % the index of the solution we are interested in
A(:,[x_index,end]) = A(:,[end,x_index]);
[~,U] = lu([A b]);
res = U(end,end)/U(end,end-1);
Benchmarking performance in MATLAB R2017a with 10,000 random 200-dimensional systems, we get a slight speed-up:
Total time direct method : 4.5401s
Total time LU method : 3.9149s
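A benchmark along these lines can be set up as follows (a rough sketch; exact timings will vary by machine):
n = 200;  trials = 1e4;
t_direct = 0;  t_lu = 0;
for t = 1:trials
    A = rand(n);  b = rand(n,1);
    tic;  x = A\b;            res1 = x(end);                   t_direct = t_direct + toc;
    tic;  [~,U] = lu([A b]);  res2 = U(end,end)/U(end,end-1);  t_lu     = t_lu + toc;
end
fprintf('Total time direct method : %.4fs\n', t_direct);
fprintf('Total time LU method     : %.4fs\n', t_lu);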
Note that you may experience some precision issues if A isn't well conditioned.
Also, this approach doesn't take advantage of the sparsity of A. In my experiments, even with 2000x2000 sparse matrices everything slowed down considerably, and the LU method was significantly slower than the direct solve. That said, the full (dense) matrix representation only requires about 30 MB, which shouldn't be a problem on most computers.
If you have access to the theory manuals for NASTRAN, I believe (from memory) there is coverage of partial solutions of linear systems. Also try looking for iterative or tridiagonal solvers for A*x = b. On this page, review the pqr solution answer by Shantachhani. Another reference.
I am calculating the solution of a constrained linear least-squares problem as follows:
lb = zeros(7,1);
ub = ones(7,1);
for i = 1:size(b,2)
x(:,i) = lsqlin(C,b(:,i),[],[],[],[],lb,ub);
end
where C is m x 7 and b is m x n. n is quite large, leading to a slow computation time. Is there any way to speed up this procedure and get rid of the slow for loop? I am using lsqlin instead of pinv or \ because I need to constrain my solution to lie between 0 and 1 (lb and ub).
The for loop is not necessarily the reason for any slowness: you're not pre-allocating, and lsqlin is probably printing a lot of output on each iteration. However, you may be able to speed this up by turning your C matrix into a sparse block-diagonal matrix, C2, with n identical blocks (see here). This solves all n problems in one go. If the new C2 is not sparse, you may use a lot more memory and the computation may take much longer than with the for loop.
n = size(b,2);
C2 = kron(speye(n),C);
b2 = b(:);
lb2 = repmat(lb,n,1); % or zeros(7*n,1);
ub2 = repmat(ub,n,1); % or ones(7*n,1);
opts = optimoptions(@lsqlin,'Algorithm','interior-point','Display','off');
x = lsqlin(C2,b2,[],[],[],[],lb2,ub2,[],opts);
Using optimoptions, I've specified the algorithm and set 'Display' to 'off' to make sure any outputs and warnings don't slow down the calculations.
On my machine this is 6–10 times faster than using a for loop (with proper pre-allocation and options set). This approach assumes that the sparse C2 matrix with m*n*7 non-zero elements can fit in memory. If not, a for-loop-based approach will be the only option (other than writing your own specialized version of lsqlin or taking advantage of any other sparseness in the problem).
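For completeness, here is a sketch of the pre-allocated loop variant with output suppressed (assuming C, b, lb and ub as defined in the question), for the case where C2 does not fit in memory:
opts = optimoptions(@lsqlin,'Algorithm','interior-point','Display','off');
n = size(b,2);
x = zeros(7, n);                    % pre-allocate the solution matrix
for i = 1:n
    x(:,i) = lsqlin(C, b(:,i), [], [], [], [], lb, ub, [], opts);
end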
Does Matlab do a full matrix multiplication when a matrix multiplication is given as an argument to the trace function?
For example, in the code below, does A*B actually happen, or are the columns of B dotted with the rows of A, then summed? Or does something else happen?
A = [2,2;2,2];
B = eye(2);
f = trace(A*B);
Yes, MATLAB calculates the product, but you can avoid it!
First, let's see what MATLAB does if you do f = trace(A*B):
I think the picture from my Performance Monitor says it all, really. The first bump is when I created a large A = 2*ones(n), the second, very small bump is the creation of B = eye(n), and the last bump is where f = trace(A*B) is calculated.
Now, let's see what you get if you do it manually:
If you do it manually, you can save a lot of memory, and it's much faster.
tic
n = 6e3;
A = rand(n);
B = rand(n);
f = trace(A*B);
toc
pause(10)
tic
C(n) = 0;                      % pre-allocate
for ii = 1:n
    C(ii) = A(ii,:)*B(:,ii);   % ii-th diagonal element of A*B
end
g = sum(C);
toc
abs(f-g) < 1e-10
Elapsed time is 11.982804 seconds.
Elapsed time is 0.540285 seconds.
ans =
1
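As an aside, the loop can be avoided entirely: trace(A*B) is the sum over i and k of A(i,k)*B(k,i), i.e. the sum of the elementwise product of A and B.', so the product A*B never needs to be formed. A quick sketch:
h = sum(sum(A .* B.'));   % equals trace(A*B) without forming A*B
abs(f - h) < 1e-10        % should again return 1 (true)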
Now, as you asked about in the comments: "Is this still true if you use it in a function where optimization can kick in?"
This depends on what you mean here, but as a quick example:
Calculating x = inv(A)*b, i.e. solving the linear system A*x = b, can be done in a few different ways. If you do:
x = A\b;
MATLAB will choose an algorithm that's best suited for your particular matrix/vector. There are many different alternatives here, depending on the structure of the matrix: is it triangular, Hermitian, sparse...? Often it's an upper/lower triangular (LU) factorization. I can pretty much guarantee that you can't write code in MATLAB that outperforms MATLAB's built-in functions here.
However, if you calculate the same thing this way:
x = inv(A)*b;
MATLAB will actually calculate the inverse of A and then multiply it by b, even though the inverse is not stored in the workspace afterwards. This is much slower, and can also be inaccurate. (In the A\b approach, MATLAB will, if necessary, create a permutation matrix to ensure numerical stability.)
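A quick way to see the difference yourself is a rough timing sketch with an arbitrary well-conditioned test matrix (the sizes here are placeholders):
n = 3e3;
A = rand(n) + n*eye(n);    % diagonally dominant, hence well conditioned
b = rand(n,1);
tic, x1 = A\b;       toc   % backslash: factorize and solve
tic, x2 = inv(A)*b;  toc   % explicit inverse: noticeably slower
norm(x1 - x2)              % should be tiny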
Recently I have run into a quadratically constrained quadratic programming (QCQP) problem in my research. I have found something useful in the MATLAB Optimization Toolbox, namely the fmincon function (general nonlinear optimization with nonlinear constraints). It uses an interior-point algorithm to solve my problem, which contains 8 variables, 1 quadratic equality constraint and 1 quadratic inequality constraint. fmincon, with or without the Hessian and gradient, provides quite a good solution; the only thing I am not satisfied with is the efficiency, since I need to call it something like a million times in my main code. I need to find something more specific to QCQP so that the efficiency may be improved. I have found a lot of information from netlib and the wiki, but I cannot judge which one I should use, and it would be tedious to try them one by one, so I really need some suggestions. By the way, I am mostly programming in MATLAB for this problem, but suitable C/Fortran code would also be useful.
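For concreteness, a minimal sketch of the kind of setup I am using, with random placeholder data in place of my actual problem:
n  = 8;
H  = rand(n);  H  = H*H';    f  = -rand(n,1);   % objective 0.5*x'*H*x + f'*x
Q1 = rand(n);  Q1 = Q1*Q1';  g1 = -rand(n,1);   % quadratic inequality data
Q2 = rand(n);  Q2 = Q2*Q2';  g2 = -rand(n,1);   % quadratic equality data
obj   = @(x) 0.5*x'*H*x + f'*x;
nonlc = @(x) deal(0.5*x'*Q1*x + g1'*x, ...      % c(x)   <= 0
                  0.5*x'*Q2*x + g2'*x - 1);     % ceq(x) == 0
opts  = optimoptions('fmincon','Algorithm','interior-point','Display','off');
x0    = zeros(n,1);
x     = fmincon(obj, x0, [], [], [], [], [], [], nonlc, opts);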
-Yan
An alternative is to use CVX, available here, which works nicely for QCQPs (amongst many other types of problems). Here is a code snippet which solves a QCQP:
close all; clear; clc
n = 10;
H = rand(n); H = H*H'; % make spsd
f = -rand(n,1);
Q = rand(n); Q = Q*Q'; % make spsd
g = -rand(n,1);
cvx_begin
    variable x(n)
    0.5*x'*Q*x + g'*x <= 0
    x >= 0
    minimize(0.5*x'*H*x + f'*x)
cvx_end
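After cvx_end, the numeric solution is available in x and the optimal objective value in cvx_optval.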
I checked the answers about Lagrange interpolation, but I couldn't find one suitable for my question. I'm trying to use Lagrange interpolation for a surface with MATLAB. Let's say I have x and y vectors and f = f(x,y). I want to interpolate this function f. I think what I did is mathematically correct:
function q = laginterp(x,y,f,ff)
n = length(x);
m = length(y);
v = zeros(size(ff));
for k = 1:n
    for l = 1:m
        w1 = ones(size(ff));
        w2 = ones(size(ff));
        for j = [1:k-1 k+1:n]
            w1 = (x-x(j))./(x(k)-x(j)).*w1;
        end
        for j = [1:l-1 l+1:m]
            w2 = (y-y(j))./(y(l)-y(j)).*w2;
        end
        v = v + w1.*w2.*f(k,l);
    end
end
q = v;
That is my function, and then I want to evaluate it for any given x, y and f, like:
x= 0:4;
y = [-6 -3 -1 6];
f=[2 9 4 25 50];
v = laginterp(x,y,f,ff);
plot3(x,y,'o',f,q,'-')
I'm always grateful for any help!
Lagrange interpolation is essentially NEVER a good choice for interpolation. Yes, it is used in the first chapter of many texts that discuss interpolation. Does that make it good? No. That just makes it convenient, a good way to INTRODUCE ideas of interpolation, and sometimes to prove some simple results.
A serious problem is that a user decides to try this miserable excuse for an interpolation method, and finds that lo and behold, it does work for 2 or 3 points. Wow, look at that! So the obvious continuation is to use it on their real data sets with 137 points, or 10000 data points or more, some of which points are usually replicates. What happened? Why does my code not give good results? Or, maybe they will just blindly assume that it did work, and then publish a paper containing meaningless results.
Yes, there is a Lagrange tool on the File Exchange. Yes, it even probably got some good reviews, written by first year students who had no real idea what they were looking at, and who sadly have no concept of numerical analysis. Don't use it.
If you need an interpolation tool in MATLAB, you could start with griddata or TriScatteredInterp. These will yield quite reasonable results. Other methods are radial basis function interpolations, of which there is also a tool on the FEX, and a wide variety of splines, my personal favorite. Note that ANY interpolation, used blindly without understanding or appreciation of the pitfalls can and will produce meaningless results. But this is true of almost any numerical method.
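To make that concrete, here is a minimal griddata sketch for scattered (x, y, f) samples; the sample data below are placeholders:
x = 4*rand(50,1);  y = 12*rand(50,1) - 6;       % placeholder sample sites
f = sin(x) + cos(y);                            % placeholder values f(x,y)
[xq, yq] = meshgrid(linspace(0,4,40), linspace(-6,6,40));
fq = griddata(x, y, f, xq, yq);                 % default 'linear' method
surf(xq, yq, fq); hold on
plot3(x, y, f, 'o')                             % overlay the original samples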
This doesn't address your question directly, but there's a Lagrange interpolation function on the Matlab File Exchange which seems pretty popular.