How to vectorize Matlab Code with mvnpdf in? - matlab

I have some working code in matlab, and speed is vital. I have vectorized/optimized many parts of it, and the profiler now tells me that the most time is spent a short piece of code. For this,
I have some parameter sets for a multi-variate normal
distribution.
I then have to get the value from the corresponding PDF at some point
pos,
and multiply it by some other value stored in a vector.
I have produced a minimal working example below:
num_params = 1000;
prob_dist_params = repmat({ [1, 2], [10, 1; 1, 5] }, num_params, 1);
saved_nu = rand( num_params, 1 );
saved_pos = rand( num_params, 2 );
saved_total = 0;
tic()
for param_counter = 1:size(prob_dist_params)
% Evaluate the PDF at specified points
pdf_vals = mvnpdf( saved_pos(param_counter,:), prob_dist_params{param_counter,1}, prob_dist_params{param_counter, 2} );
saved_total = saved_total + saved_nu(param_counter)*pdf_vals;
end % End of looping over parameters
toc()
I am aware that prob_dist_params are all the same in this case, but in my code we have each element of this different depending on a few things upstream. I call this particular piece of code many tens of thousands of time in my full program, so am wondering if there is anything at all I can do to vectorize this loop, or failing that, speed it up at all? I do not know how to do so with the inclusion of a mvnpdf() function.

Yes you can, however, I don't think it will give you a huge performance boost. You will have to reshape your mu's and sigma's.
Checking the doc of mvnpdf(X,mu,sigma), you see that you will have to provide X and mu as n-by-d numeric matrix and sigma as d-by-d-by-n.
In your case, d is 2 and n is 1000. You have to split the cell array in two matrices, and reshape as follows:
prob_dist_mu = cell2mat(prob_dist_params(:,1));
prob_dist_sigma = cell2mat(permute(prob_dist_params(:,2),[3 2 1]));
With permute, I make the first dimension of the cell array the third dimension, so cell2mat will result in a 2-by-2-by-1000 matrix. Alternatively you can define them as follows,
prob_dist_mu = repmat([1 2], [num_params 1]);
prob_dist_sigma = repmat([10, 1; 1, 5], [1 1 num_params]);
Now call mvnpdf with
pdf_vals = mvnpdf(saved_pos, prob_dist_mu, prob_dist_sigma);
saved_total = saved_nu.'*pdf_vals; % simple dot product

Related

Matlab: Vectorizing 4 nested for loops

So, I need to vectorize some for loops into a single line. I understand how vectorize one and two for-loops, but am really struggling to do more than that. Essentially, I am computing a "blur" matrix M2 of size (n-2)x(m-2) of an original matrix M of size nxm, where s = size(M):
for x = 0:1
for y = 0:1
m = zeros(1, 9);
k = 1;
for i = 1:(s(1) - 1)
for j = 1:(s(2) - 1)
m(1, k) = M(i+x,j+y);
k = k+1;
end
end
M2(x+1,y+1) = mean(m);
end
end
This is the closest I've gotten:
for x=0:1
for y=0:1
M2(x+1, y+1) = mean(mean(M((x+1):(3+x),(y+1):(3+y))))
end
end
To get any closer to a one-line solution, it seems like there has to be some kind of "communication" where I assign two variables (x,y) to index over M2 and index over M; I just don't see how it can be done otherwise, but I am assured there is a solution.
Is there a reason why you are not using MATLAB's convolution function to help you do this? You are performing a blur with a 3 x 3 averaging kernel with overlapping neighbourhoods. This is exactly what convolution is doing. You can perform this using conv2:
M2 = conv2(M, ones(3) / 9, 'valid');
The 'valid' flag ensures that you return a size(M) - 2 matrix in both dimensions as you have requested.
In your code, you have hardcoded this for a 4 x 4 matrix. To double-check to see if we have the right results, let's generate a random 4 x 4 matrix:
rng(123);
M = rand(4, 4);
s = size(M);
If we run this with your code, we get:
>> M2
M2 =
0.5054 0.4707
0.5130 0.5276
Doing this with conv2:
>> M2 = conv2(M, ones(3) / 9, 'valid')
M2 =
0.5054 0.4707
0.5130 0.5276
However, if you want to do this from first principles, the overlapping of the pixel neighbourhoods is very difficult to escape using loops. The two for loop approach you have is good enough and it tackles the problem appropriately. I would make the size of the input instead of being hard coded. Therefore, write a function that does something like this:
function M2 = blur_fp(M)
s = size(M);
M2 = zeros(s(1) - 2, s(2) - 2);
for ii = 2 : s(1) - 1
for jj = 2 : s(2) - 1
p = M(ii - 1 : ii + 1, jj - 1 : jj + 1);
M2(ii - 1, jj - 1) = mean(p(:));
end
end
The first line of code defines the function, which we will call blur_fp. The next couple lines of code determine the size of the input matrix as well as initialising a blank matrix to store out output. We then loop through each pixel location in the matrix that is possible without the kernel going outside of the boundaries of the image, we grab a 3 x 3 neighbourhood with each pixel location serving as the centre, we then unroll the matrix into a single column vector, find the average and store it in the appropriate output. For small kernels and relatively large matrices, this should perform OK.
To take this a little further, you can use user Divakar's im2col_sliding function which takes overlapping neighbourhoods and unrolls them into columns. Therefore, each column represents a neighbourhood which you can then blur the input using vector-matrix multiplication. You would then use reshape to reshape the result back into a matrix:
T = im2col_sliding(M, [3 3]);
V = ones(1, 9) / 9;
s = size(M);
M2 = reshape(V * T, s(1) - 2, s(2) - 2);
This unfortunately cannot be done in a single line unless you use built-in functions. I'm not sure what your intention is, but hopefully the gamut of approaches you have seen here have given you some insight on how to do this efficiently. BTW, using loops for small matrices (i.e. 4 x 4) may be better in efficiency. You will start to notice performance changes when you increase the size of the input... then again, I would argue that using loops are competitive as of R2015b when the JIT has significantly improved.

How to neatly sum values of histogram bins, given a value matrix and an associated category association matrix (bin association)

Consider the example following below, where I have a 10x10 matrix, say A, of random values in some range, say [-5, 5]. I quantize the values of A into 8 categories, 1, ..., 8, such that an additional 10x10 matrix, say qA, describes the category association for each number in A. Finally, I produce the sums of all values assigned to each category. My question regards this final step.
myRange = 5; % values in open interval [-myRange, myRange]
A = myRange*(2*rand(10) - 1);
qA = uencode(A, 3, myRange)+1;
% (+) create "histogram" of sum of values assigned to each bin
myHistogram = zeros(8,1);
for i = 1:numel(A)
myHistogram(qA(i)) = myHistogram(qA(i)) + A(i);
end
bar(myHistogram)
Question: Is the some neater way of doing this, specifically the counting step (+) above? (Some better alternative than explicitly iterating over each element in the matrix A?).
Just as I was about to finish up and post my question I found a satisfying answer to it, however not here on SO. As self-answering is encouraged, I'll post the Q+A instead of aborting this Q posting.
Hence, based on the following Matlab Central thread, one neater solution is as follows:
myRange = 5; % values in open interval [-myRange, myRange]
A = myRange*(2*rand(10) - 1);
qA = uencode(A, 3, myRange)+1;
% or, if you dont have Signal Processing Toolbox required for 'uencode'
% [~, ~, qA] = histcounts(A, -myRange:myRange/4:myRange);
% (+) create "histogram" of sum of values assigned to each bin
myHistogram = accumarray(qA(:), A(:), [8 1])
Possibly there's alternative/even better ways to do this, performing quantizing and bin value summation in same step?

Efficient aggregation of high dimensional arrays

I have a 3 dimensional (or higher) array that I want to aggregate by another vector. The specific application is to take daily observations of spatial data and average them to get monthly values. So, I have an array with dimensions <Lat, Lon, Day> and I want to create an array with dimensions <Lat, Lon, Month>.
Here is a mock example of what I want. Currently, I can get the correct output using a loop, but in practice, my data is very large, so I was hoping for a more efficient solution than the second loop:
% Make the mock data
A = [1 2 3; 4 5 6];
X = zeros(2, 3, 9);
for j = 1:9
X(:, :, j) = A;
A = A + 1;
end
% Aggregate the X values in groups of 3 -- This is the part I would like help on
T = [1 1 1 2 2 2 3 3 3];
X_agg = zeros(2, 3, 3);
for i = 1:3
X_agg(:,:,i) = mean(X(:,:,T==i),3);
end
In 2 dimensions, I would use accumarray, but that does not accept higher dimension inputs.
Before getting to your answer let's first rewrite your code in a more general way:
ag = 3; % or agg_size
X_agg = zeros(size(X)./[1 1 ag]);
for i = 1:ag
X_agg(:,:,i) = mean(X(:,:,(i-1)*ag+1:i*ag), 3);
end
To avoid using the for loop one idea is to reshape your X matrix to something that you can use the mean function directly on.
splited_X = reshape(X(:), [size(X_agg), ag]);
So now splited_X(:,:,:,i) is the i-th part
that contains all the matrices that should be aggregated which is X(:,:,(i-1)*ag+1:i*ag)) (like above)
Now you just need to find the mean in the 3rd dimension of splited_X:
temp = mean(splited_X, 3);
However this results in a 4D matrix (where its 3rd dimension size is 1). You can again turn it into 3D matrix using reshape function:
X_agg = reshape(temp, size(X_agg))
I have not tried it to see how much more efficient it is, but it should do better for large matrices since it doesn't use for loops.

Creating a vector with random sampling of two vectors in matlab

How does one create a vector that is composed of a random sampling of two other vectors?
For example
Vector 1 [1, 3, 4, 7], Vector 2 [2, 5, 6, 8]
Random Vector [random draw from vector 1 or 2 (value 1 or 2), random draw from vector 1 or 2 (value 3 or 5)... etc]
Finally, how can one ask matlab to repeat this process n times to draw a distribution of results?
Thank you,
There are many ways you could do this. One possibility is:
tmp=round(rand(size(vector1)))
res = tmp.*vector1 + (1-tmp).*vector2
To get one mixed sample, you may use the idea of the following code snippet (not the optimal one, but maybe clear enough):
a = [1, 3, 4, 7];
b = [2, 5, 6, 8];
selector = randn(size(a));
sample = a.*(selector>0) + b.*(selector<=0);
For n samples put the above code in a for loop:
for k=1:n
% Sample code (without initial "samplee" assignments)
% Here do stuff with the sample
end;
More generally, if X is a matrix and for each row you want to take a sample from a column chosen at random, you can do this with a loop:
y = zeros(size(X,1),1);
for ii = 1:size(X,1)
y(ii) = X(ii,ceil(rand*size(X,2)));
end
You can avoid the loop using clever indexing via sub2ind:
idx_n = ceil(rand(size(X,1),1)*size(X,2));
idx = sub2ind(size(X),(1:size(X,1))',idx_n);
y = X(idx);
If I understand your question, you are choosing two random numbers. First you decide whether to select vector 1 or vector 2; next you pick an element from the chosen vector.
The following code takes advantage of the fact that vector1 and vector2 are the same length:
N = 1000;
sampleMatrix = [vector1 vector2];
M = numel(sampleMatrix);
randIndex = ceil(rand(1,N)*M); % N random numbers from 1 to M
randomNumbers = sampleMatrix(randIndex); % sample N times from the matrix
You can then display the result with, for instance
figure; hist(randomNumbers); % draw a histogram of numbers drawn
When vector1 and vector2 have different elements, you run into a problem. If you concatenate them, you will end up picking elements from the longer vector more often. One way around this is to create random samplings from both arrays, then choose between them:
M1 = numel(vector1);
M2 = numel(vector2);
r1 = ceil(rand(1,N)*M1);
r2 = ceil(rand(1,N)*M2);
randMat = [vector1(r1(:)) vector2(r2(:))]; % two columns, now pick one or the other
randPick = ceil(rand(1,N)*2);
randomNumbers = [randMat(randPick==1, 1); randMat(randPick==2, 2)];
On re-reading, maybe you just want to pick "element 1 from either 1 or 2", then "element 2 from either 1 or 2", etc for all the elements of the vector. In that case, do
N=numel(vector1);
randPick = ceil(rand(1,N)*2);
randMat=[vector1(:) vector2(:)];
randomNumbers = [randMat(randPick==1, 1); randMat(randPick==2, 2)];
This problem can be solved using the function datasample.
Combine both vectors into one and apply the function. I like this approach more than the handcrafted versions in the other answers. It gives you much more flexibility in choosing what you actually want, while being a one-liner.

Update only one matrix element for iterative computation

I have a 3x3 matrix, A. I also compute a value, g, as the maximum eigen value of A. I am trying to change the element A(3,3) = 0 for all values from zero to one in 0.10 increments and then update g for each of the values. I'd like all of the other matrix elements to remain the same.
I thought a for loop would be the way to do this, but I do not know how to update only one element in a matrix without storing this update as one increasingly larger matrix. If I call the element at A(3,3) = p (thereby creating a new matrix Atry) I am able (below) to get all of the values from 0 to 1 that I desired. I do not know how to update Atry to get all of the values of g that I desire. The state of the code now will give me the same value of g for all iterations, as expected, as I do not know how to to update Atry with the different values of p to then compute the values for g.
Any suggestions on how to do this or suggestions for jargon or phrases for me to web search would be appreciated.
A = [1 1 1; 2 2 2; 3 3 0];
g = max(eig(A));
% This below is what I attempted to achieve my solution
clear all
p(1) = 0;
Atry = [1 1 1; 2 2 2; 3 3 p];
g(1) = max(eig(Atry));
for i=1:100;
p(i+1) = p(i)+ 0.01;
% this makes a one giant matrix, not many
%Atry(:,i+1) = Atry(:,i);
g(i+1) = max(eig(Atry));
end
This will also accomplish what you want to do:
A = #(x) [1 1 1; 2 2 2; 3 3 x];
p = 0:0.01:1;
g = arrayfun(#(x) eigs(A(x),1), p);
Breakdown:
Define A as an anonymous function. This means that the command A(x) will return your matrix A with the (3,3) element equal to x.
Define all steps you want to take in vector p
Then "loop" through all elements in p by using arrayfun instead of an actual loop.
The function looped over by arrayfun is not max(eig(A)) but eigs(A,1), i.e., the 1 largest eigenvalue. The result will be the same, but the algorithm used by eigs is more suited for your type of problem -- instead of computing all eigenvalues and then only using the maximum one, you only compute the maximum one. Needless to say, this is much faster.
First, you say 0.1 increments in the text of your question, but your code suggests you are actually interested in 0.01 increments? I'm going to operate under the assumption you mean 0.01 increments.
Now, with that out of the way, let me state what I believe you are after given my interpretation of your question. You want to iterate over the matrix A, where for each iteration you increase A(3, 3) by 0.01. Given that you want all values from 0 to 1, this implies 101 iterations. For each iteration, you want to calculate the maximum eigenvalue of A, and store all these eigenvalues in some vector (which I will call gVec). If this is correct, then I believe you just want the following:
% Specify the "Current" A
CurA = [1 1 1; 2 2 2; 3 3 0];
% Pre-allocate the values we want to iterate over for element (3, 3)
A33Vec = (0:0.01:1)';
% Pre-allocate a vector to store the maximum eigenvalues
gVec = NaN * ones(length(A33Vec), 1);
% Loop over A33Vec
for i = 1:1:length(A33Vec)
% Obtain the version of A that we want for the current i
CurA(3, 3) = A33Vec(i);
% Obtain the maximum eigen value of the current A, and store in gVec
gVec(i, 1) = max(eig(CurA));
end
EDIT: Probably best to paste this code into your matlab editor. The stack-overflow automatic text highlighting hasn't done it any favors :-)
EDIT: Go with Rody's solution (+1) - it is much better!