Construct a Matlab array through nested loops - matlab

Suppose I have three matrices A_1, A_2, A_3 each of dimension mxn, where m and n are large. These matrices contain strictly positive numbers.
I want to construct a matrix check of dimension mx1 such that, for each i=1,...,m:
check(i)=1 if there exists j,k such that
A_1(i,1)+A_2(j,1)+A_3(k,1)<=quantile(A_1(i,2:end)+A_2(j,2:end)+A_3(k,3:end), 0.95)
In my case m is large (m=10^5) and n=500. Therefore, I would like your help to find an efficient way to do this.
Below I reproduce an example. I impose m smaller than in reality and report my incomplete and probably inefficient attempt to construct check.
clear
rng default
m=4;
n=500;
A_1=betarnd(1,2,m,n);
A_2=betarnd(1,2,m,n);
A_3=betarnd(1,2,m,n);
check=zeros(m,1);
for i=1:m
for j=1:m
for k=1:m
if A_1(i,1)+A_2(j,1)+A_3(k,1)<=quantile(A_1(i,2:end)+A_2(j,2:end)+A_3(k,2:end), 0.95)
check(i)=1;
STOP TO LOOP OVER j AND k, MOVE TO THE NEXT i (INCOMPLETE!)
else
KEEP SEARCHING FOR j,k SUCH THAT THE CONDITION IS SATISFIED (INCOMPLETE!)
end
end
end
end

Given a scalar x and a vector v the expression x <=quantile (v, .95) can be written as sum( x > v) < Q where Q = .95 * numel(v) *.
Also A_1 can be splitted before the loop to avoid extra indexing.
Moreover the most inner loop can be removed in favor of vectorization.
Af_1 = A_1(:,1);
Af_2 = A_2(:,1);
Af_3 = A_3(:,1);
As_1 = A_1(:,2:end);
As_2 = A_2(:,2:end);
As_3 = A_3(:,2:end);
Q = .95 * (n -1);
for i=1:m
for j=1:m
if any (sum (Af_1(i) + Af_2(j) + Af_3 > As_1(i,:) + As_2(j,:) + As_3, 2) < Q)
check(i) = 1;
break;
end
end
end
More optimization can be achieved by rearranging the expressions involved in the inequality and pre-computation:
lhs = A_3(:,1) - A_3(:,2:end);
lhsi = A_1(:,1) - A_1(:,2:end);
rhsj = A_2(:,2:end) - A_2(:,1);
Q = .95 * (n - 1);
for i=1:m
LHS = lhs + lhsi(i,:);
for j=1:m
if any (sum (LHS > rhsj(j,:), 2) < Q)
check(i) = 1;
break;
end
end
end
Note that because of the method that is used in the computation of quantile you may get a slightly different result.

Option 1:
Because all numbers are positive, you can do some optimizations. 95 percentile will be only higher if you add A1 to the mix - if you find the j and k of greatest 95 percentile of A2+A3 on the right side compared to the sum of the first 2 elements, you can simply take that for every i.
maxDif = -inf;
for j = 1 : m
for k = 1 : m
newDif = quantile(A_2..., 0.95) - A_2(j,1)-A_3(k,1);
maxDif = max(newDif, maxDif);
end
end
If even that is too slow, you can first get maxDifA2 and maxDifA3, then estimate that maxDif will be for those particular j and k values and calculate it.
Now, for some numbers you will get that maxDif > A_1, then the check is 1. For some numbers you will get that maxDif + quantile(A1, 0.95) < A_1, here check is 0 (if you estimated maxDif by separate calculation of A2 and A3 this isn't true!). For some (most?) you will unfortunately get values in between and this won't be helpful at all. Then what remains is option 2 (it is also more straightforward):
Option 2:
You could save some time if you can save summation A_2+A_3 on the right side, as that calculation repeats for every different i, but that requires A LOT of memory. But quantile is the more expensive operation anyway, so you aren't saving a lot of time. Something along the lines of
for j = 1 : m
for k = 1 : m
A23R(j,k,:) = A2(j,:)+A3(k,:); % Unlikely to fit in memory.
end
end
Then you can perform your loops, using A23R and avoiding to repeat that sum for every i.

Related

MATLAB - Secant method produces NaN

I'm writing a secant method in MATLAB, which I want to iterate through exactly n times.
function y = secantmeth(f,xn_2,xn_1,n)
xn = (xn_2*f(xn_1) - xn_1*f(xn_2))/(f(xn_1) - f(xn_2));
k = 0;
while (k < n)
k = k + 1;
xn_2 = xn_1;
xn_1 = xn;
xn = (xn_2*f(xn_1) - xn_1*f(xn_2))/(f(xn_1) - f(xn_2));
end
y = xn;
end
I believe the method works for small values of n, but even something like n = 9 produces NaN. My guess is that the quantity f(xn_1) - f(xn_2) is approximately zero, which causes this error. How can I prevent this?
Examples:
Input 1
eqn = #(x)(x^2 + x -9)
secantmeth(eqn,2,3,5)
Input 2
eqn = #(x)(x^2 + x - 9)
secantmeth(eqn, 2, 3, 9)
Output 1
2.7321
Output 2
NaN
The value for xn will be NaN when xn_2 and xn_1 are exactly equal, which results in a 0/0 condition. You need to have an additional check in your while loop condition to see if xn_1 and x_n are equal (or, better yet, within some small tolerance of one another), thus suggesting that the loop has converged on a solution and can't iterate any further:
...
while (k < n) && (xn_1 ~= xn)
k = k + 1;
xn_2 = xn_1;
xn_1 = xn;
xn = (xn_2*f(xn_1) - xn_1*f(xn_2))/(f(xn_1) - f(xn_2));
end
...
As Ander mentions in a comment, you could then continue with a different method after your while loop if you want to try and get a more accurate approximation:
...
if (xn_1 == xn) % Previous loop couldn't iterate any further
% Try some new method
end
...
And again, I would suggest reading through this question to understand some of the pitfalls of floating-point comparison (i.e. == and ~= aren't usually the best operators to use for floating-point numbers).

Implementing Simplex Method infinite loop

I am trying to implement a simplex algorithm following the rules I was given at my optimization course. The problem is
min c'*x s.t.
Ax = b
x >= 0
All vectors are assumes to be columns, ' denotes the transpose. The algorithm should also return the solution to dual LP. The rules to follow are:
Here, A_J denotes columns from A with indices in J and x_J, x_K denotes elements of vector x with indices in J or K respectively. Vector a_s is column s of matrix A.
Now I do not understand how this algorithm takes care of condition x >= 0, but I decided to give it a try and follow it step by step. I used Matlab for this and got the following code.
X = zeros(n, 1);
Y = zeros(m, 1);
% i. Choose starting basis J and K = {1,2,...,n} \ J
J = [4 5 6] % for our problem
K = setdiff(1:n, J)
% this while is for goto
while 1
% ii. Solve system A_J*\bar{x}_J = b.
xbar = A(:,J) \ b
% iii. Calculate value of criterion function with respect to current x_J.
fval = c(J)' * xbar
% iv. Calculate dual solution y from A_J^T*y = c_J.
y = A(:,J)' \ c(J)
% v. Calculate \bar{c}^T = c_K^T - u^T A_K. If \bar{c}^T >= 0, we have
% found the optimal solution. If not, select the smallest s \in K, such
% that c_s < 0. Variable x_s enters basis.
cbar = c(K)' - c(J)' * inv(A(:,J)) * A(:,K)
cbar = cbar'
tmp = findnegative(cbar)
if tmp == -1 % we have found the optimal solution since cbar >= 0
X(J) = xbar;
Y = y;
FVAL = fval;
return
end
s = findnegative(c, K) %x_s enters basis
% vi. Solve system A_J*\bar{a} = a_s. If \bar{a} <= 0, then the problem is
% unbounded.
abar = A(:,J) \ A(:,s)
if findpositive(abar) == -1 % we failed to find positive number
disp('The problem is unbounded.')
return;
end
% vii. Calculate v = \bar{x}_J / \bar{a} and find the smallest rho \in J,
% such that v_rho > 0. Variable x_rho exits basis.
v = xbar ./ abar
rho = J(findpositive(v))
% viii. Update J and K and goto ii.
J = setdiff(J, rho)
J = union(J, s)
K = setdiff(K, s)
K = union(K, rho)
end
Functions findpositive(x) and findnegative(x, S) return the first index of positive or negative value in x. S is the set of indices, over which we look at. If S is omitted, whole vector is checked. Semicolons are omitted for debugging purposes.
The problem I tested this code on is
c = [-3 -1 -3 zeros(1,3)];
A = [2 1 1; 1 2 3; 2 2 1];
A = [A eye(3)];
b = [2; 5; 6];
The reason for zeros(1,3) and eye(3) is that the problem is inequalities and we need slack variables. I have set starting basis to [4 5 6] because the notes say that starting basis should be set to slack variables.
Now, what happens during execution is that on first run of while, variable with index 1 enters basis (in Matlab, indices go from 1 on) and 4 exits it and that is reasonable. On the second run, 2 enters the basis (since it is the smallest index such that c(idx) < 0 and 1 leaves it. But now on the next iteration, 1 enters basis again and I understand why it enters, because it is the smallest index, such that c(idx) < 0. But here the looping starts. I assume that should not have happened, but following the rules I cannot see how to prevent this.
I guess that there has to be something wrong with my interpretation of the notes but I just cannot see where I am wrong. I also remember that when we solved LP on the paper, we were updating our subjective function on each go, since when a variable entered basis, we removed it from the subjective function and expressed that variable in subj. function with the expression from one of the equalities, but I assume that is different algorithm.
Any remarks or help will be highly appreciated.
The problem has been solved. Turned out that the point 7 in the notes was wrong. Instead, point 7 should be

Composite Simpson's Rule

I have this code for the Composite Simpson's Rule. However, I have been fiddling with it for quite a while and I can't seem to get it to work.
How can I fix this algorithm?
function out = Sc2(func,a,b,N)
% Sc(func,a,b,N)
% This function calculates the integral of func on the interval [a,b]
% using the Composite Simpson's rule with N subintervals.
x=linspace(a,b,N+1);
% Partition [a,b] into N subintervals
fx=func(x);
h=(b-a)/(2*N);
%define for odd and even sums
sum_even = 0;
for i = 1:N-1
x(i) = a + (2*i-2)*h;
sum_even = sum_even + func(x(i));
end
sum_odd = 0;
for i = 1:N+1
x(i) = a + (2*i-1)*h;
sum_odd = sum_odd + func(x(i));
end
% Define the length of a subinterval
out=(h/3)*(fx(1)+ 2*sum_even + 4*sum_odd +fx(end));
% Apply the composite Simpsons rule
end
Well for one thing, your h definition is wrong. h stands for the step size of each interval you want to estimate. You are unnecessarily dividing by 2. Remove that 2 in your h definition. You also are evaluating your function at the values of n not x. You should probably remove this statement because you end up not using this in the end.
Also, you are summing from 1 to N+1 or from 1 to N-1 for either the odd or even values, This is incorrect. Remember, you are choosing every other value in an odd interval, or even interval, so this should really be looping from 1 to N/2 - 1. To escape figuring out what to multiply i with, just skip this and make your loop go in steps of 2. That's besides the point though.
I would recommend that you don't loop over and add up the values for the odd and even intervals that way. You can easily do that by specifying the odd or even values of x and just applying a sum. I would use the colon operator and specify a step size of 2 to exactly determine which values of x for odd or even you want to apply to the overall sum.
You also are declaring x to be your n-point interval, yet you are overwriting those values in your loops. You actually don't need that x declaration in your code in that case.
As such, here's a modified version of your function with the optimizations I have in mind:
function out = Sc2(func, a, b, N)
h = (b – a) / N; %// Width of each interval
odd = 1 : 2 : n-1; %// Define odd interval
xodd = a + h*odd; %// Create odd x values
even = 2 : 2 : n-2; %// Create even interval
xeven = a + h*even; % Create even x values
%// Return area
out = (h/3)*(func(a) + 4*sum(func(xodd)) + 2*sum(func(xeven))+ func(b));
However, if you want to get your code working, you simply have to change your for loop iteration limits as well as your value of h. You also have to remove some lines of code, and change some variable names. Therefore:
function out = Sc2(func,a,b,N)
% Sc(func,a,b,N)
% This function calculates the integral of func on the interval [a,b]
% using the Composite Simpson's rule with N subintervals.
%// Define width of each segment
h = (b - a) / N; %// Change
%//define for odd and even sums
sum_even = 0;
for i = 2 : 2 : N-2 %// Change
x = a + i*h; %// Change
sum_even = sum_even + func(x);
end
sum_odd = 0;
for i = 1 : 2 : N-1 %// Change
x = a + i*h %// Change
sum_odd = sum_odd + func(x);
end
%// Output area
out = (h / 3)*(func(a) + 2*sum_even + 4*sum_odd + func(b)); %// Change
end

MatLab: Matrix with one peak and rest decreasing

I'm trying to create a matrix such that if I define a random number between 0 and 1 and a random location in the matrix, I want all the values around that to "diffuse" out. Here's sort of an example:
0.214 0.432 0.531 0.631 0.593 0.642
0.389 0.467 0.587 0.723 0.654 0.689
0.421 0.523 0.743 0.812 0.765 0.754
0.543 0.612 0.732 0.843 0.889 0.743
0.322 0.543 0.661 0.732 0.643 0.694
0.221 0.321 0.492 0.643 0.521 0.598
if you notice, there's a peak at (4,5) = 0.889 and all the other numbers decrease as they move away from that peak.
I can't figure out a nice way to generate a code that does this. Any thoughts? I need to be able to generate this type of matrix with random peaks and a random rate of decrease...
Without knowing what other constraints you want to implement:
Come up with a function z = f(x,y) whose peak value is at (x0,y0) == (0,0) and whose values range between [0,1]. As an example, the PDF for the Normal distribution with mu = 0 and sigma = 1/sqrt(2*pi) has a peak at x == 0 of 1.0, and whose lower bound is zero. Similarly, a bivariate normal PDF with mu = {0,0} and determinate(sigma) == [1/(2*pi)]^2 will have similar characteristics.
Any mathematical function may have its domain shifted: f(x-x0, y-y0)
Your code will look something like this:
someFunction = #(x,y) theFunctionYouPicked(x,y);
[x0,y0,peak] = %{ you supply these values %};
myFunction = #(x,y) peak * someFunction(x - x0, y - y0);
[dimX,dimY] = %{ you supply these values %};
mymatrix = bsxfun( myFunction, 0:dimX, (0:dimY)' );
You can read more about bsxfun here; however, here's an example of how it works:
bsxfun( blah, [a b c], [d e f]' )
That should give the following matrix (or its transpose ... I don't have matlab in front of me):
[blah(a,d) blah(a,e) blah(a,f);
blah(b,d) blah(b,e) blah(b,f);
blah(c,d) blah(c,e) blah(c,f)]
Get a toy example working, then you can tinker with it to be more flexible. If the function dictating how it decreases is random (with the constraint that points closer to (x0,y0) are larger than more distant points), it won't be an issue to make a procedural function instead of using strictly mathematical ones.
In response to your answer:
Your equation could be thought of as a model for gravity where an object instantaneously induces a force on another mass, then stops exerting force. Following that logic, it could be modified to a naive vector formulation like this:
% v1 & v2 are vectors that point from the two peak points to the point [ii,jj]
theMatrix(ii,jj) = norm( (r1 / norm( v1 )) * v1 / norm( v1 ) ...
+ (r2 / norm( v2 )) * v2 / norm( v2 ) ...
);
The most extreme type of corner case you'll run into is one where v1 & v2 point in the same direction as in the following row:
[ . . A X1 X2 . . ]
... where you want a value for A w/respect to X1 & X2. Using the above expression it'll boil down to A = X1 / norm(v1) + X2 / norm(v2), which will definitely exceed the peak value at X1 because norm(v1) == 1. You could certainly do some dirty stuff to Band-Aid it, but personally I'd start looking for a different function.
Along those lines, if you used Newton's Law of Universal Gravitation with a few modifications:
You wouldn't need an analogue for G, so you could just assume G == 1
Treat each of the points in the matrix as having mass m2 == 1, so the equation reduces to: F_12 == -1 * (m1 / r^2) * RHAT_12
Sum the "force" vectors and calculate the norm to get each value
... you'll still run into the same problem. The corner case I laid out above would boil down to A = X1/norm(v1)^2 + X2/norm(v2)^2 == X1 + X2/4. Since it's inversely proportional to the square of the distances, it'd be easier to Band-Aid than the linear one, but I wouldn't recommend it.
Similarly, if you use polynomials it won't scale well; you can design one that won't ever exceed your chosen peaks, but there wouldn't be a lower bound.
You could use the logistic function to help with this:
1 / (1 + E^(-c*x))
Here's an example of using the logistic function on a degree 4 polynomial with peaks at points 2 & 4; you'll note I gave the polynomial a scaling factor to pull the polynomial down to relatively small values so calculated values aren't so close together.
I ended up creating a code that wraps the way I want based on a dimension, which I provide. Here's the code:
dims = 100;
A = zeros(dims);
b = floor(1+dims*rand(1));
c = floor(1+dims*rand(1));
d = rand(1);
x1 = c;
y1 = b;
A(x1,y1) = d;
for i = 1:dims
for j = i
k = 1-j;
while k <= j
if x1-j>0 && y1+k>0 && y1+k <= dims
if A(x1-j,y1+k) == 0
A(x1-j,y1+k) = eqn(d,x1-j,y1+k,x1,y1);
end
end
k = k+1;
end
end
for k = i
j = 1-k;
while j<=k
if x1+j>0 && y1+k>0 && y1+k <= dims && x1+j <= dims
if A(x1+j,y1+k)==0
A(x1+j, y1+k) = eqn(d,x1+j,y1+k,x1,y1);
end
end
j = j+1;
end
end
for j = i
k = 1-j;
while k<=j
if x1+j>0 && y1-k>0 && x1+j <= dims && y1-k<= dims
if A(x1+j,y1-k) == 0
A(x1+j,y1-k) = eqn(d,x1+j,y1-k,x1,y1);
end
end
k=k+1;
end
end
for k = i
j = 1-k;
while j<=k
if x1-j>0 && y1-k>0 && x1-j <= dims && y1-k<= dims
if A(x1-j,y1-k)==0
A(x1-j,y1-k) = eqn(d,x1-j,y1-k,x1,y1);
end
end
j = j+1;
end
end
end
colormap('hot');
imagesc(A);
colorbar;
If you notice, the code calls a function (I called it eqn), which provided the information for how to changes the values in each cell. The function that I settled on is d/distance (distance being computed using the standard distance formula).
It seems to work pretty well. I'm now just trying to develop a good way to have multiple peaks in the same square without one peak completely overwriting the other.

Iterating over all integer vectors summing up to a certain value in MATLAB?

I would like to find a clean way so that I can iterate over all the vectors of positive integers of length, say n (called x), such that sum(x) == 100 in MATLAB.
I know it is an exponentially complex task. If the length is sufficiently small, say 2-3 I can do it by a for loop (I know it is very inefficient) but how about longer vectors?
Thanks in advance,
Here is a quick and dirty method that uses recursion. The idea is that to generate all vectors of length k that sum to n, you first generate vectors of length k-1 that sum to n-i for each i=1..n, and then add an extra i to the end of each of these.
You could speed this up by pre-allocating x in each loop.
Note that the size of the output is (n + k - 1 choose n) rows and k columns.
function x = genperms(n, k)
if k == 1
x = n;
elseif n == 0
x = zeros(1,k);
else
x = zeros(0, k);
for i = 0:n
y = genperms(n-i,k-1);
y(:,end+1) = i;
x = [x; y];
end
end
Edit
As alluded to in the comments, this will run into memory issues for large n and k. A streaming solution is preferable, which generates the outputs one at a time. In a non-strict language like Haskell this is very simple -
genperms n k
| k == 1 = return [n]
| n == 0 = return (replicate k 0)
| otherwise = [i:y | i <- [0..n], y <- genperms (n-i) (k-1)]
viz.
>> mapM_ print $ take 10 $ genperms 100 30
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,100]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,99]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,98]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,97]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,96]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,95]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,94]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7,93]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,92]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,9,91]
which runs virtually instantaneously - no memory issues to worry about.
In Python you could achieve something nearly as simple using generators and the yield keyword. In Matlab it is certainly possible, but I leave the translation up to you!
This is one possible method to generate all vectors at once (will give memory problems for moderately large n):
s = 10; %// desired sum
n = 3; %// number of digits
vectors = cell(1,n);
[vectors{:}] = ndgrid(0:s); %// I assume by "integer" you mean non-negative int
vectors = cell2mat(cellfun(#(c) reshape(c,1,[]), vectors, 'uni', 0).');
vectors = vectors(:,sum(vectors)==s); %// each column is a vector
Now you can iterate over those vectors:
for vector = vectors %// take one column at each iteration
%// do stuff with the vector
end
To avoid memory problems it is better to generate each vector as needed, instead of generating all of them initially. The following approach iterates over all possible n-vectors in one for loop (regardless of n), rejecting those vectors whose sum is not the desired value:
s = 10; %// desired sum
n = 3;; %// number of digits
for number = 0: s^n-1
vector = dec2base(number,s).'-'0'; %// column vector of n rows
if sum(vector) ~= s
continue %// reject that vector
end
%// do stuff with the vector
end