MATLAB: Indexing a large matrix for Monte Carlo Simulation

MATLAB: Indexing a large matrix for Monte Carlo Simulation - matlab

I'm trying to index a large matrix in MATLAB that contains numbers monotonically increasing across rows, and across columns, i.e. if the matrix is called A, for every (i,j), A(i+1,j) > A(i,j) and A(i,j+1) > A(i,j).
I need to create a random number n and compare it with the values of the matrix A, to see where that random number should be placed in the matrix A. In other words, the value of n may not equal any of the contents of the matrix, but it may lie in between any two rows and any two columns, and that determines a "bin" that identifies its position in A. Once I find this position, I increment the corresponding index in a new matrix of the same size as A.
The problem is that I want to do this 1,000,000 times. I need to create a random number a million times and do the index-checking for each of these numbers. It's a Monte Carlo Simulation of a million photons coming from a point landing on a screen; the matrix A consists of angles in spherical coordinates, and the random number is the solid angle of each incident photon.
My code so far goes something like this (I haven't copy-pasted it here because the details aren't important):
for k = 1:1000000
n = rand(1,1)*pi;
for i = length(A(:,1))
for j = length(A(1,:))
if (n > A(i-1,j)) && (n < A(i+1,j)) && (n > A(i,j-1)) && (n < A(i,j+1))
new_img(i,j) = new_img(i,j) + 1; % new_img defined previously as zeros
end
end
end
end
The "if" statement is just checking to find the indices of A that form the bounds of n.
This works perfectly fine, but it takes ridiculously long, especially since my matrix A is an image of dimensions 11856 x 11000. is there a quicker / cleverer / easier way of doing this?
Thanks in advance.

You can get rid of the inner loops by performing the calculation on all elements of A at once. Also, you can create the random numbers all at once, instead of one at a time. Note that the outermost pixels of new_img can never be different from zero.
randomNumbers = rand(1,1000000)*pi;
new_img = zeros(size(A));
tmp_img = zeros(size(A)-2);
for r = randomNumbers
tmp_img = tmp_img + A(:,1:end-2)<r & A(:,3:end)>r & A(1:end-1,:)<r & A(3:end,:)>r;
end
new_img(2:end-1,2:end-1) = tmp_img;
/aside: If the arrays were smaller, I'd have used bsxfun for the comparison, but with the array sizes in the OP, the approach would run out of memory.

Are the values in A bin edges? Ie does A specify a grid? If this is the case then you can QUICKLY populate A using hist3.
Here is an example:
numRand = 1e
n = randi(100,1e6,1);
nMatrix = [floor(data./10), mod(data,10)];
edges = {0:1:9, 0:10:99};
A = hist3(dataMat, edges);
If your A doesn't specify a grid, then you should create all of your random values once and sort them. Then iterate through those values.
Because you know that n(i) >= n(i-1) you don't have to check bins that were too small for n(i-1). This is a very easy way to optimize away most redundant checks.

Here is a snippet that should help a lot in the inner loop, it finds the location of the greatest point that is smaller than your value.
idx1 = A<value
idx2 = A(idx1) == max(A(idx1))
if you want to find the exact location you can wrap it with a find.

Related

Mean value of each column of a matrix

I have a 64 X 64 matrix that I need to find the column-wise mean values for.
However, instead of dividing by the total number of elements in each column (i.e. 64), I need to divide by the total number of non-zeros in the matrix.
I managed to get it to work for a single column as shown below. For reference, the function that generates my matrix is titled fmu2(i,j).
q = 0;
for i = 1:64
if fmu2(i,1) ~= 0;
q = q + 1;
end
end
for i = 1:64
mv = (1/q).*sum(fmu2(i,1));
end
This works for generating the "mean" value of the first column. However, I'm having trouble looping this procedure so that I will get the mean for each column. I tried doing a nested for loop, but it just calculated the mean for the entire 64 X 64 matrix instead of one column at a time. Here's what I tried:
q = 0;
for i = 1:64
for j = 1:64
if fmu2(i,j) ~= 0;
q = q +1;
end
end
end
for i = 1:64
for j = 1:64
mv = (1/q).*sum(fmu2(i,j));
end
end
Like I said, this just gave me one value for the entire matrix instead of 64 individual "means" for each column. Any help would be appreciated.

For one thing, do not call the function that generates your matrix in each iteration of a loop. This is extremely inefficient and will cause major problems if your function is complex enough to have side effects. Store the return value in a variable once, and refer to that variable from then on.
Secondly, you do not need any loops here at all. The total number of nonzeros is given by the nnz function (short for number of non-zeros). The sum function accepts an optional dimension argument, so you can just tell it to sum along the columns instead of along the rows or the whole matrix.
m = fmu2(i,1)
averages = sum(m, 1) / nnz(m)
averages will be a 64-element array with an average for each column, since sum(m, 1) is a 64 element sum along each column and nnz(m) is a scalar.
One of the great things about MATLAB is that it provides vectorized implementations of just about everything. If you do it right, you should almost never have to use an explicit loop to do any mathematical operations at all.

If you want the column-wise mean of non-zero elements you can do the following
m = randi([0,5], 5, 5); % some data
avg = sum(m,1) ./ sum(m~=0,1);
This is a column-wise sum of values, divided by the column-wise number of elements not equal to 0. The result is a row vector where each element is the average of the corresponding column in m.
Note this is very flexible, you could use any condition in place of ~=0.

Optimize nested for loop for calculating xcorr of matrix rows

I have 2 nested loops which do the following:
Get two rows of a matrix
Check if indices meet a condition or not
If they do: calculate xcorr between the two rows and put it into new vector
Find the index of the maximum value of sub vector and replace element of LAG matrix with this value
I dont know how I can speed this code up by vectorizing or otherwise.
b=size(data,1);
F=size(data,2);
LAG= zeros(b,b);
for i=1:b
for j=1:b
if j>i
x=data(i,:);
y=data(j,:);
d=xcorr(x,y);
d=d(:,F:(2*F)-1);
[M,I] = max(d);
LAG(i,j)=I-1;
d=xcorr(y,x);
d=d(:,F:(2*F)-1);
[M,I] = max(d);
LAG(j,i)=I-1;
end
end
end

First, a note on floating point precision...
You mention in a comment that your data contains the integers 0, 1, and 2. You would therefore expect a cross-correlation to give integer results. However, since the calculation is being done in double-precision, there appears to be some floating-point error introduced. This error can cause the results to be ever so slightly larger or smaller than integer values.
Since your calculations involve looking for the location of the maxima, then you could get slightly different results if there are repeated maximal integer values with added precision errors. For example, let's say you expect the value 10 to be the maximum and appear in indices 2 and 4 of a vector d. You might calculate d one way and get d(2) = 10 and d(4) = 10.00000000000001, with some added precision error. The maximum would therefore be located in index 4. If you use a different method to calculate d, you might get d(2) = 10 and d(4) = 9.99999999999999, with the error going in the opposite direction, causing the maximum to be located in index 2.
The solution? Round your cross-correlation data first:
d = round(xcorr(x, y));
This will eliminate the floating-point errors and give you the integer results you expect.
Now, on to the actual solutions...
Solution 1: Non-loop option
You can pass a matrix to xcorr and it will perform the cross-correlation for every pairwise combination of columns. Using this, you can forego your loops altogether like so:
d = round(xcorr(data.'));
[~, I] = max(d(F:(2*F)-1,:), [], 1);
LAG = reshape(I-1, b, b).';
Solution 2: Improved loop option
There are limits to how large data can be for the above solution, since it will produce large intermediate and output variables that can exceed the maximum array size available. In such a case for loops may be unavoidable, but you can improve upon the for-loop solution above. Specifically, you can compute the cross-correlation once for a pair (x, y), then just flip the result for the pair (y, x):
% Loop over rows:
for row = 1:b
% Loop over upper matrix triangle:
for col = (row+1):b
% Cross-correlation for upper triangle:
d = round(xcorr(data(row, :), data(col, :)));
[~, I] = max(d(:, F:(2*F)-1));
LAG(row, col) = I-1;
% Cross-correlation for lower triangle:
d = fliplr(d);
[~, I] = max(d(:, F:(2*F)-1));
LAG(col, row) = I-1;
end
end

Multiply two vectors with dimensions increasing along time

I have two vectors (called A and B) with length N. Then I need to multiply both of them, but as an "integration" process. Which means I have to multiply first A(1)*B(1), then A(1:2)*B(1:2), until A(1:N)*B(1:N). The result of multiplying booth vector is a number, since B is a column vector. I've done it with a for loop:
for k = 1:N
C(k) = A(1:k) * B(1:k).';
end
But I wanted to ask you if this is the best solution or there is any other option more time-efficient, since N is very large (about 110,000)

C = cumsum(A.*B)
does the same thing without for loop. As EBH suggested in the comments if you are not sure whether A and B have same orientation, then use
C = cumsum(A(:).*B(:))

How to count the number of overlapping blocks in an Image

I have a 512X512 size of image and I have made 4x4 overlapping blocks for the entire image.How can i count the number of overlapping blocks and save it in an Array in matlab.
I have done like below for 4x4 overlapping blocks. Now how to count the no of blocks and store it using an Array.
[e f] = size(outImg);
l=0;
for i=2:e-2
for j=2:f-2
H =double(outImg((i-1:i+2),(j-1:j+2)));
eval(['out_' num2str(l) '=H']);
l=l+1
end;
end;

From what I understand the question, you want to know how many blocks of 4x4 can fit in the image, and then store them.
Calculating the number of blocks is trivial, in the code that you give as example, l is the number of element counted. Of course, that its value is deterministic (determined by f and e). No need to loop over them to get the value of the count.
count = (f-3)*(e-3);
If you want to save the values in an array (assuming that you mean here a matrix and not a cell array) you need to decide how to represent it, you can store it as a 4D e-3 x f-3 x 4 x 4 matrix (as #Steffen suggested), or as a 3D 4 x 4 x count matrix, I think that the later is more intuitive. In any case you should assign the memory for the matrix in advance and not on the fly:
[e f] = size(outImg);
count = (f-3)*(e-3);
outMat = zeros(4,4,count); % assign the memory for the matrix
l = 0;
for i=2:e-2
for j=2:f-2
l = l + 1;
outMat(:,:,l) = double(outImg((i-1:i+2),(j-1:j+2)));
end;
end;
The number of blocks is stored as both count and l, but calculating count in advance allows to assign the needed memory in advance, the i block is stored as outMat(:,:,i).
An implementation using the 4D matrix would be:
[e f] = size(outImg);
count = (f-3)*(e-3);
outMat = zeros((f-3),(e-3),4,4); % assign the memory for the matrix
for i=2:e-2
for j=2:f-2
outMat(i,j,:,:) = double(outImg((i-1:i+2),(j-1:j+2)));
end;
end;
In this case, l isn't needed and each block (indexed i,j) is located at outMat(i,j,:,:)
Regarding cell array vs. a matrix, since a matrix requires a continuous place in the memory, you may want to consider using a cell array instead of a matrix. A 512x512x4 matrix of doubles requires (assuming 8 Byte representation) 8MB (512*512*8*4 = 8*1024*1024). If the dimensions were bigger, or if you are strapped for (continuous) memory a cell array may be a better solution. You can read more about the difference at Difference between cell and matrix in matlab?.
The implementation would be very similar.
[e f] = size(outImg);
count = (f-3)*(e-3);
outArray = cell(1,count);
l = 0;
for i=2:e-2
for j=2:f-2
l = l + 1;
outArray{1,l} = double(outImg((i-1:i+2),(j-1:j+2)));
end;
end;

The answer is very simple. Each loop iteration will access 1 overlapping block in your image. All you have to do is count how many times the loop iterates, which is ((e-2) - 2 + 1) x ((f - 2) - 2 + 1) = (e - 3) x (f - 3). There's no need to keep a loop iteration variable.
Minor note. Under no circumstances should you use eval unless absolutely necessary. The MATLAB gods will smite any offenders swiftly. That code to assign a new variable to each increment of l is absolutely unnecessary. If you were to have 10000 overlapping blocks, you would have 10000 variables. You can just simply take a look at l at the end and this would tell you how many overlapping blocks you have.
Remove that line of code. WE HATES IT. IT BURNS US.
See this post by Loren Shure for more details on why using eval is bad.

For the 1-D case, the correct formula to estimate the total number of overlapping blocks is:
(#Of_blocks) = (#Image_width - #Block_width) / (#Block_width - #Block_overlap) + 1
If the result is not integer you just take the floor of #Of_blocks, this means that the whole interval can't be covered using that setup.

How can I generate a binary matrix with specific patterns?

I have a binary matrix of size m-by-n. Given below is a sample binary matrix (the real matrix is much larger):
1010001
1011011
1111000
0100100
Given p = m*n, I have 2^p possible matrix configurations. I would like to get some patterns which satisfy certain rules. For example:
I want not less than k cells in the jth column as zero
I want the sum of cell values of the ith row greater than a given number Ai
I want at least g cells in a column continuously as one
etc....
How can I get such patterns satisfying these constraints strictly without sequentially checking all the 2^p combinations?
In my case, p can be a number like 2400, giving approximately 2.96476e+722 possible combinations.

Instead of iterating over all 2^p combinations, one way you could generate such binary matrices is by performing repeated row- and column-wise operations based on the given constraints you have. As an example, I'll post some code that will generate a matrix based on the three constraints you have listed above:
A minimum number of zeroes per column
A minimum sum for each row
A minimum sequential length of ones per column
Initializations:
First start by initializing a few parameters:
nRows = 10; % Row size of matrix
nColumns = 10; % Column size of matrix
minZeroes = 5; % Constraint 1 (for columns)
minRowSum = 5; % Constraint 2 (for rows)
minLengthOnes = 3; % Constraint 3 (for columns)
Helper functions:
Next, create a couple of functions for generating column vectors that match constraints 1 and 3 from above:
function vector = make_column
vector = [false(minZeroes,1); true(nRows-minZeroes,1)]; % Create vector
[vector,maxLength] = randomize_column(vector); % Randomize order
while maxLength < minLengthOnes, % Loop while constraint 3 is not met
[vector,maxLength] = randomize_column(vector); % Randomize order
end
end
function [vector,maxLength] = randomize_column(vector)
vector = vector(randperm(nRows)); % Randomize order
edges = diff([false; vector; false]); % Find rising and falling edges
maxLength = max(find(edges == -1)-find(edges == 1)); % Find longest
% sequence of ones
end
The function make_column will first create a logical column vector with the minimum number of 0 elements and the remaining elements set to 1 (using the functions TRUE and FALSE). This vector will undergo random reordering of its elements until it contains a sequence of ones greater than or equal to the desired minimum length of ones. This is done using the randomize_column function. The vector is randomly reordered using the RANDPERM function to generate a random index order. The edges where the sequence switches between 0 and 1 are detected using the DIFF function. The indices of the edges are then used to find the length of the longest sequence of ones (using FIND and MAX).
Generate matrix columns:
With the above two functions we can now generate an initial binary matrix that will at least satisfy constraints 1 and 3:
binMat = false(nRows,nColumns); % Initialize matrix
for iColumn = 1:nColumns,
binMat(:,iColumn) = make_column; % Create each column
end
Satisfy the row sum constraint:
Of course, now we have to ensure that constraint 2 is satisfied. We can sum across each row using the SUM function:
rowSum = sum(binMat,2);
If any elements of rowSum are less than the minimum row sum we want, we will have to adjust some column values to compensate. There are a number of different ways you could go about modifying column values. I'll give one example here:
while any(rowSum < minRowSum), % Loop while constraint 2 is not met
[minValue,rowIndex] = min(rowSum); % Find row with lowest sum
zeroIndex = find(~binMat(rowIndex,:)); % Find zeroes in that row
randIndex = round(1+rand.*(numel(zeroIndex)-1));
columnIndex = zeroIndex(randIndex); % Choose a zero at random
column = binMat(:,columnIndex);
while ~column(rowIndex), % Loop until zero changes to one
column = make_column; % Make new column vector
end
binMat(:,columnIndex) = column; % Update binary matrix
rowSum = sum(binMat,2); % Update row sum vector
end
This code will loop until all the row sums are greater than or equal to the minimum sum we want. First, the index of the row with the smallest sum (rowIndex) is found using MIN. Next, the indices of the zeroes in that row are found and one of them is randomly chosen as the index of a column to modify (columnIndex). Using make_column, a new column vector is continuously generated until the 0 in the given row becomes a 1. That column in the binary matrix is then updated and the new row sum is computed.
Summary:
For a relatively small 10-by-10 binary matrix, and the given constraints, the above code usually completes in no more than a few seconds. With more constraints, things will of course get more complicated. Depending on how you choose your constraints, there may be no possible solution (for example, setting minRowSum to 6 will cause the above code to never converge to a solution).
Hopefully this will give you a starting point to begin generating the sorts of matrices you want using vectorized operations.

If you have enough constraints, exploring all possible matrices could be attempted:
// Explore all possibilities starting at POSITION (0..P-1)
explore(int position)
{
// Check if one or more constraints can't be verified anymore with
// all values currently set.
invalid = ...;
if (invalid) return;
// Do we have a solution?
if (position >= p)
{
// print the matrix
return;
}
// Set one more value and continue exploring
for (int value=0;value<2;value++)
{ matrix[position] = value; explore(position+1); }
}
If the number of constraints is low, this approach will take too much time.
In this case, for the kind of constraints you gave as examples, simulated annealing may be a good solution.
You must design an energy function, high when all constraints are met. That would be something like that:
Generate a random matrix
Compute energy E0
Change one cell
Compute energy E1
If E1>E0, or E0-E1 is smaller than f(temperature), keep it, otherwise reverse the move
Update temperature, and goto 2 unless stop criterion is reached

If all the contraints relate to columns (as is the case in the question), then you can find all possible valid columns and check that each column in the matrix is in this set. (i.e. when you consider each column independently, you reduce the number of possibilities a lot.)

I might be way off here, but I remember doing something similar once with some genetic algorithm.

Check out pseudo boolean constraints (also called 0-1 integer programming).

This is virtually impossible if your constraint set is complex enough. You might try to use a stochastic optimizer, like simulated annealing, particle swarm optimization, or a genetic algorithm to find a feasible solution.
However, if you can generate one (non-random) solution to such a problem, then often you can generate others by random permutations made to the existing solution.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

MATLAB: Indexing a large matrix for Monte Carlo Simulation - matlab

Here is a snippet that should help a lot in the inner loop, it finds the location of the greatest point that is smaller than your value. idx1 = A<value idx2 = A(idx1) == max(A(idx1)) if you want to find the exact location you can wrap it with a find.

Related

Mean value of each column of a matrix

Optimize nested for loop for calculating xcorr of matrix rows

Multiply two vectors with dimensions increasing along time

How to count the number of overlapping blocks in an Image

How can I generate a binary matrix with specific patterns?

Categories

Resources