Discrete probability distribution calculation in Matlab - matlab

I am given n independent discrete probabilities P(x1), ..., P(xn); each is, for example, the probability that some event X occurs.
I want general code that answers the question: with what probability do exactly k of these events occur at the same time, for k = 0, ..., n?
For example:
Given: three probabilities P(A), P(B), P(C) that cars A, B and C each park. The question would be: with what probability do no cars, one car, two cars, or three cars park?
The answer for, e.g., exactly two cars parking at the same time would be:
P(A,B,~C) = P(A)*P(B)*(1-P(C))
P(A,~B,C) = P(A)*(1-P(B))*P(C)
P(~A,B,C) = (1-P(A))*P(B)*P(C)
P(2 of 3) = P(A,B,~C) + P(A,~B,C) + P(~A,B,C)
I have written the code for all possibilities, but the more values I get, of course the slower it gets due to more possible combinations.
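% probability: Column vector with probabilities P1, P2, ... PN
% result: Vector with results as stated above.
% Note: a 1 in the shift vector marks an event that does NOT occur
% (abs(1-P) = 1-P), so result(i+1) is the probability that exactly i
% events do not occur, i.e. that exactly N-i events occur.
anzahl_werte = numel(probability); % number of values
% All possibilities:
result(1) = prod(probability);
shift_vector = zeros(anzahl_werte,1);
for i = 1:anzahl_werte
% Set one more entry of the shift vector to 1
shift_vector(i) = 1;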
% Compute all unique permutations of the shift_vector
mult_vectors = uperm(shift_vector);
% Init Result Vector
prob_vector = zeros(length(mult_vectors(:,1)), 1);
% Calc Single Probabilities
for k = 1:length(mult_vectors(:,1))
prob_vector(k) = prod(abs(mult_vectors(k,:)'-probability));
end
% Sum of this Vector for one probability.
result(i+1) = sum(prob_vector);
end
end
%%%%% Calculate Permutations
function p = uperm(a)
[u, ~, J] = unique(a);
p = u(up(J, length(a)));
end % uperm
function p = up(J, n)
ktab = histc(J,1:max(J));
l = n;
p = zeros(1, n);
s = 1;
for i=1:length(ktab)
k = ktab(i);
c = nchoosek(1:l, k);
m = size(c,1);
[t, ~] = find(~p.');
t = reshape(t, [], s);
c = t(c,:)';
s = s*m;
r = repmat((1:s)',[1 k]);
q = accumarray([r(:) c(:)], i, [s n]);
p = repmat(p, [m 1]) + q;
l = l - k;
end
end
%%%%% Calculate Permutations End
Does anybody know a way to speed up this function? Or does MATLAB perhaps have a built-in function for this?
I found the name of the calculation:
Poisson binomial distribution

How about this?
probability = [.3 .2 .4 .7];
n = numel(probability);
combs = dec2bin(0:2^n-1).'-'0'; %'// each column is a combination of n values,
%// where each value is either 0 or 1. A 1 value will represent an event
%// that happens; a 0 value will represent an event that doesn't happen.
result = NaN(1,n+1); %// preallocate
for k = 0:n; %// number of events that happen
ind = sum(combs,1)==k; %// combinations with exactly k 1's
result(k+1) = sum(prod(...
bsxfun(@times, probability(:), combs(:,ind)) + ... %// events that happen
bsxfun(@times, 1-probability(:), ~combs(:,ind)) )); %// don't happen
end
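The enumeration above visits all 2^n combinations. As an alternative sketch (not part of the original answer), the same distribution can be built by multiplying the polynomials (1-p) + p*x one event at a time, which is what conv does for coefficient vectors; the cost then grows roughly with n^2 instead of 2^n. The variable names are illustrative, and result(k+1) is assumed to hold the probability that exactly k events happen, as in the answer above.

```matlab
probability = [.3 .2 .4 .7];

% Start with "zero events considered": probability 1 of zero events happening.
result = 1;
for p = probability(:).'
    % Each event either does not happen (1-p) or happens (p);
    % convolving shifts probability mass from "k events" to "k+1 events".
    result = conv(result, [1-p, p]);
end

% result(k+1) is the probability that exactly k of the n events happen.
disp(result)
```

The entries of result sum to 1, and result(1) and result(end) equal prod(1-probability) and prod(probability) respectively, which gives a quick sanity check.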

Related

Vectorize with Matlab Meshgrid in Chebfun

I am trying to use meshgrid in Matlab together with Chebfun to get rid of double for loops. I first define a quasi-matrix of N functions,
%Define functions of type Chebfun
N = 10; %number of functions
x = chebfun('x', [0 8]); %Domain
psi = [];
for i = 1:N
psi = [psi sin(i.*pi.*x./8)];
end
A sample calculation would be to compute the double sum $\sum_{i=1}^{10} \sum_{j=1}^{10} \psi_i \psi_j$, where $\psi_i$ denotes psi(:,i). I can achieve this using two for loops in Matlab,
h = 0;
for i = 1:N
for j = 1:N
h = h + psi(:,i).*psi(:,j);
end
end
I then tried to use meshgrid to vectorize in the following way:
[i j] = meshgrid(1:N,1:N);
h = psi(:,i).*psi(:,j);
I get the error "Column index must be a vector of integers". How can I overcome this issue so that I can get rid of my double for loops and make my code a bit more efficient?
BTW, Chebfun is not part of native MATLAB and you have to download it in order to run your code: http://www.chebfun.org/. However, that shouldn't affect how I answer your question.
Basically, psi is an N-column matrix and you want to add up the products of all pairs of columns of psi. You have the right idea with meshgrid, but you should unroll the 2D index matrices for both i and j into single vectors. Use these to build two matrices with N^2 columns each, where the k-th column of each is the column of psi selected by the k-th entry of the corresponding index vector. Then multiply the two matrices element-wise and sum across the columns of each row. BTW, I'm going to use ii and jj for the outputs of meshgrid instead of i and j. By default those names denote the imaginary unit in MATLAB, and I don't want to shadow them unintentionally.
Something like this:
%// Your code
N = 10; %number of functions
x = chebfun('x', [0 8]); %Domain
psi = [];
for i = 1:N
psi = [psi sin(i.*pi.*x./8)];
end
%// New code
[ii,jj] = meshgrid(1:N, 1:N);
%// Create two matrices and sum
matrixA = psi(:, ii(:));
matrixB = psi(:, jj(:));
h = sum(matrixA.*matrixB, 2);
If you want to do away with the temporary variables, you can do it in one statement after calling meshgrid:
h = sum(psi(:, ii(:)).*psi(:, jj(:)), 2);
I don't have Chebfun installed, but we can verify that this calculates what we need with a simple example:
rng(123);
N = 10;
psi = randi(20, N, N);
Running this code with the above more efficient solution gives us:
>> h
h =
8100
17161
10816
12100
14641
9216
10000
8649
9025
11664
Also, running the above double for loop code also gives us:
>> h
h =
8100
17161
10816
12100
14641
9216
10000
8649
9025
11664
If you want to be absolutely sure, we can have both codes run with the outputs as separate variables, then check if they're equal:
%// Setup
rng(123);
N = 10;
psi = randi(20, N, N);
%// Old code
h = 0;
for i = 1:N
for j = 1:N
h = h + psi(:,i).*psi(:,j);
end
end
%// New code
[ii,jj] = meshgrid(1:N, 1:N);
hnew = sum(psi(:, ii(:)).*psi(:, jj(:)), 2);
%// Check for equality
eql = isequal(h, hnew);
eql checks if both variables are equal, and we do get them as such:
>> eql
eql =
1
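One observation on the numbers above: every entry of h is a perfect square (8100 = 90^2, 17161 = 131^2, ...). That is because the double sum factorizes, sum_{i,j} psi_i*psi_j = (sum_i psi_i)^2, so for this particular sample calculation the pairwise products are not needed at all. The sketch below checks this on the same random numeric matrix; whether sum(psi, 2) behaves the same way on a Chebfun quasi-matrix is an assumption that has not been verified here.

```matlab
rng(123);
N = 10;
psi = randi(20, N, N);

% Pairwise version from the answer above
[ii, jj] = meshgrid(1:N, 1:N);
h_pairs = sum(psi(:, ii(:)) .* psi(:, jj(:)), 2);

% Factorized version: square of the row sums
h_square = sum(psi, 2).^2;

% Integer-valued doubles of this size are exact, so isequal is safe here
same = isequal(h_pairs, h_square);
```

This only applies when the summand is a plain product of the two columns; for a general function of i and j the meshgrid indexing is still the way to go.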

Multiplying a vector times the inverse of a matrix in Matlab

I have a problem multiplying a vector times the inverse of a matrix in Matlab. The code I am using is the following:
% Final Time
T = 0.1;
% Number of grid cells
N=20;
%N=40;
L=20;
% Delta x
dx=1/N
% define cell centers
%x = 0+dx*0.5:dx:1-0.5*dx;
x = linspace(-L/2, L/2, N)';
%define number of time steps
NTime = 100; % NB! Stability condition: with NTime = 50 one would get a completely wrong answer because lambda > 0.5
%NTime = 30;
%NTime = 10;
%NTime = 20;
%NTime = 4*21;
%NTime = 4*19;
% Time step dt
dt = T/NTime
% Define vectors that are useful for handling the different cells
J = 1:N; % number the cells of the domain
J1 = 2:N-1; % the interior cells
J2 = 1:N-1; % numbering of the cell interfaces
%define vector for initial data
u0 = zeros(1,N);
L = x<0.5;
u0(L) = 0;
u0(~L) = 1;
plot(x,u0,'-r')
grid on
hold on
% define vector for solution
u = zeros(1,N);
u_old = zeros(1,N);
% useful quantity for the discrete scheme
r = dt/dx^2
mu = dt/dx;
% calculate the numerical solution u by going through a loop of NTime number
% of time steps
A=zeros(N,N);
alpha(1)=A(1,1);
d(1)=alpha(1);
b(1)=0;
c(1)=b(1);
gamma(1,2)=A(1,2);
% initial state
u_old = u0;
pause
for j = 2:NTime
A(j,j)=1+2*r;
A(j,j-1)=-(1/dx^2);
A(j,j+1)=-(1/dx^2);
u=u_old./A;
% plotting
plot(x,u,'-')
xlabel('X')
ylabel('P(X)')
hold on
grid on
% update "u_old" before you move forward to the next time level
u_old = u;
pause
end
hold off
The error message I get is:
Matrix dimensions must agree.
Error in Implicit_new (line 72)
u=u_old./A;
My question is therefore how it is possible to perform u=u_old*[A^(-1)] in Matlab?
David
As knedlsepp said, v./A is element-wise division, which is not what you want here. You can use either
v/A, provided that v is a row vector whose length equals the number of columns of A; the result is a row vector equal to v*inv(A), or
A\v, provided that v is a column vector whose length equals the number of rows of A; the result is a column vector equal to inv(A)*v.
The two are related by a transpose: v/A is the transpose of A'\v'. In your code u_old is a row vector, so u = u_old/A computes u_old*A^(-1).
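A small check of these identities may help. The matrix and vector below are made-up values chosen only for illustration (A is deliberately non-symmetric so that v/A and A\v' give different numbers).

```matlab
A = [4 1; 2 3];   % hypothetical invertible, non-symmetric matrix
v = [1 2];        % row vector

x_right = v / A;      % solves x_right * A = v, i.e. v * inv(A)
x_left  = A \ v';     % solves A * x_left = v', i.e. inv(A) * v'
x_trans = (A' \ v')'; % same values as v / A, obtained via backslash

res_right = norm(x_right * A - v);
res_left  = norm(A * x_left - v');
gap       = norm(x_right - x_trans);
```

Both residuals and gap are zero up to rounding, while x_right and x_left' differ because A is not symmetric. Using the slash operators is generally preferable to forming inv(A) explicitly.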

Finding optimal weight factor for SOR

I am using the SOR method and need to find the optimal weight factor. I think a good way to go about this is to run my SOR code with a number of omegas from 0 to 2, then store the number of iterations for each of these. Then I can see which iteration count is the lowest and which omega it corresponds to. Being a novice programmer, however, I am unsure how to go about this.
Here is my SOR code:
function [x, l] = SORtest(A, b, x0, TOL,w)
[m n] = size(A); % assigning m and n to number of rows and columns of A
l = 0; % counter variable
x = zeros(n,1); % initializing the solution vector (sized to match A instead of hard-coding 3 entries)
max_iter = 200;
while (l < max_iter) % loop until max # of iters.
l = l + 1; % increasing counter variable
for i=1:m % looping through rows of A
sum1 = 0; sum2 = 0; % introducing sum1 and sum2
for j=1:i-1 % looping through columns left of the diagonal
sum1 = sum1 + A(i,j)*x(j); % computing sum using the already updated values in x
end
for j=i+1:n % looping through columns right of the diagonal
sum2 = sum2 + A(i,j)*x0(j); % computing sum using the previous iterate x0
end
x(i) =(1-w)*x0(i) + w*(-sum1-sum2+b(i))/A(i,i); % assigning elements to the solution vector
end
if abs(norm(x) - norm(x0)) < TOL % checking tolerance
break
end
x0 = x; % assigning x to x0 before relooping
end
end
That's pretty easy to do. Simply loop through values of w and record the total number of iterations needed at each w. After each call to the function, check whether that count is lower than the current minimum number of iterations. If it is, update the stored solution. Once we have iterated over all w, the result is the solution vector that took the fewest iterations to converge. Bear in mind that SOR requires 0 < w < 2, so we can't include w = 0 or w = 2 in the range. As such, do something like this:
omega_vec = 0.01:0.01:1.99;
final_x = x0;
best_w = NaN; % omega that needed the fewest iterations so far
min_iter = Inf;
for w = omega_vec
[x, iter] = SORtest(A, b, x0, TOL, w);
if iter < min_iter
min_iter = iter;
final_x = x;
best_w = w;
end
end
The loop checks whether the number of iterations at each w is less than the current minimum. If it is, it records that count, the corresponding w (in best_w), and the solution vector. After the loop, final_x holds the solution vector obtained with the fewest iterations over all w, and best_w holds the omega that produced it.
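As a cross-check for such a sweep, there is a classical closed-form result: for consistently ordered matrices (tridiagonal matrices are the usual example) whose Jacobi iteration matrix has real eigenvalues and spectral radius rho_J < 1, the optimal factor is w_opt = 2/(1 + sqrt(1 - rho_J^2)). The sketch below evaluates it for a made-up 3x3 tridiagonal matrix; whether your A satisfies these assumptions is something you would need to check, and for larger problems computing all eigenvalues with eig may be too expensive.

```matlab
A = [4 -1 0; -1 4 -1; 0 -1 4];   % hypothetical tridiagonal test matrix

D = diag(diag(A));               % diagonal part of A
B_J = eye(size(A)) - D \ A;      % Jacobi iteration matrix I - inv(D)*A
rho_J = max(abs(eig(B_J)));      % its spectral radius

% Closed-form optimum, valid only under the assumptions stated above
w_opt = 2 / (1 + sqrt(1 - rho_J^2));
```

If the assumptions hold, the omega found by the brute-force sweep should land close to w_opt, up to the 0.01 grid spacing and the flatness of the iteration count near the optimum.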

How can I (efficiently) compute a moving average of a vector?

I've got a vector and I want to calculate the moving average of it (using a window of width 5).
For instance, if the vector in question is [1,2,3,4,5,6,7,8], then
the first entry of the resulting vector should be the sum of all entries in [1,2,3,4,5] (i.e. 15);
the second entry of the resulting vector should be the sum of all entries in [2,3,4,5,6] (i.e. 20);
etc.
In the end, the resulting vector should be [15,20,25,30]. How can I do that?
The conv function is right up your alley:
>> x = 1:8;
>> y = conv(x, ones(1,5), 'valid')
y =
15 20 25 30
Benchmark
Three answers, three different methods... Here is a quick benchmark (different input sizes, fixed window width of 5) using timeit; feel free to poke holes in it (in the comments) if you think it needs to be refined.
conv emerges as the fastest approach; it's about twice as fast as coin's approach (using filter), and about four times as fast as Luis Mendo's approach (using cumsum).
Here is another benchmark (fixed input size of 1e4, different window widths). Here, Luis Mendo's cumsum approach emerges as the clear winner, because its complexity is primarily governed by the length of the input and is insensitive to the width of the window.
Conclusion
To summarize, you should
use the conv approach if your window is relatively small,
use the cumsum approach if your window is relatively large.
Code (for benchmarks)
function benchmark
w = 5; % moving average window width
u = ones(1, w);
n = logspace(2,6,60); % vector of input sizes for benchmark
tf = zeros(size(n)); % preallocation of time vectors before the loop
tg = tf;
th = tf;
for k = 1 : numel(n)
x = rand(1, round(n(k))); % generate random row vector
% Luis Mendo's approach (cumsum)
f = @() luisMendo(w, x);
tf(k) = timeit(f);
% coin's approach (filter)
g = @() coin(w, u, x);
tg(k) = timeit(g);
% Jubobs's approach (conv)
h = @() jubobs(u, x);
th(k) = timeit(h);
end
figure
hold on
plot(n, tf, 'bo')
plot(n, tg, 'ro')
plot(n, th, 'mo')
hold off
xlabel('input size')
ylabel('time (s)')
legend('cumsum', 'filter', 'conv')
end
function y = luisMendo(w,x)
cs = cumsum(x);
y(1,numel(x)-w+1) = 0; %// hackish way to preallocate result
y(1) = cs(w);
y(2:end) = cs(w+1:end) - cs(1:end-w);
end
function y = coin(w,u,x)
y = filter(u, 1, x);
y = y(w:end);
end
function y = jubobs(u,x)
y = conv(x, u, 'valid');
end
function benchmark2
w = round(logspace(1,3,31)); % vector of moving average window widths
n = 1e4; % fixed input size for benchmark
tf = zeros(size(w)); % preallocation of time vectors before the loop
tg = tf;
th = tf;
for k = 1 : numel(w)
u = ones(1, w(k));
x = rand(1, n); % generate random row vector
% Luis Mendo's approach (cumsum)
f = @() luisMendo(w(k), x);
tf(k) = timeit(f);
% coin's approach (filter)
g = @() coin(w(k), u, x);
tg(k) = timeit(g);
% Jubobs's approach (conv)
h = @() jubobs(u, x);
th(k) = timeit(h);
end
figure
hold on
plot(w, tf, 'bo')
plot(w, tg, 'ro')
plot(w, th, 'mo')
hold off
xlabel('window size')
ylabel('time (s)')
legend('cumsum', 'filter', 'conv')
end
function y = luisMendo(w,x)
cs = cumsum(x);
y(1,numel(x)-w+1) = 0; %// hackish way to preallocate result
y(1) = cs(w);
y(2:end) = cs(w+1:end) - cs(1:end-w);
end
function y = coin(w,u,x)
y = filter(u, 1, x);
y = y(w:end);
end
function y = jubobs(u,x)
y = conv(x, u, 'valid');
end
Another possibility is to use cumsum. This approach probably requires fewer operations than conv does:
x = 1:8
n = 5;
cs = cumsum(x);
result = cs(n:end) - [0 cs(1:end-n)];
To save a little time, you can replace the last line by
%// clear result
result(1,numel(x)-n+1) = 0; %// hackish way to preallocate result
result(1) = cs(n);
result(2:end) = cs(n+1:end) - cs(1:end-n);
If you want to preserve the size of your input vector, I suggest using filter
>> x = 1:8;
>> y = filter(ones(1,5), 1, x)
y =
1 3 6 10 15 20 25 30
>> y = y(5:end)
y =
15 20 25 30
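For completeness, newer MATLAB releases (R2016a and later, as far as I know) also ship movsum and movmean, which compute these windowed quantities directly. The sketch below assumes those functions are available in your release; the 'Endpoints', 'discard' option drops the partial windows at the edges so that the output matches the 'valid' convolution. No claim is made here about how their speed compares with the benchmarks above.

```matlab
x = 1:8;
w = 5;   % window width

y_sum  = movsum(x, w, 'Endpoints', 'discard');   % sums over full windows only
y_mean = movmean(x, w, 'Endpoints', 'discard');  % the corresponding averages

% Reference: the conv approach from the first answer
y_conv = conv(x, ones(1, w), 'valid');
```

Note that the question asks for window sums although it is titled "moving average"; dividing any of the sum-based results by w, or using movmean, gives the actual averages.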

Vectorizing MATLAB function

I have a double summation over m = 1:M and n = 1:N for a polar point with coordinates rho, phi, z:
I have written a vectorized version of it:
N = 10;
M = 10;
n = 1:N;
m = 1:M;
rho = 1;
phi = 1;
z = 1;
summ = cos (n*z) * besselj(m'-1, n*rho) * cos(m*phi)';
Now I need to rewrite this function to accept vectors (columns) of coordinates rho, phi, z. I tried arrayfun, cellfun, and a simple for loop, but they are too slow for me. I know about "MATLAB array manipulation tips and tricks", but as a MATLAB beginner I can't understand repmat and other functions.
Can anybody suggest a vectorized solution?
I think your code is already well vectorized (for n and m). If you want this function to also accept an array of rho/phi/z values, I suggest you simply process the values in a for-loop, as I doubt any further vectorization will bring significant improvements (plus the code will be harder to read).
Having said that, in the code below, I tried to vectorize the part where you compute the various components being multiplied {row N} * { matrix N*M } * {col M} = {scalar}, by making a single call to the BESSELJ and COS functions (I place each of the row/matrix/column in the third dimension). Their multiplication is still done in a loop (ARRAYFUN to be exact):
%# parameters
N = 10; M = 10;
n = 1:N; m = 1:M;
num = 50;
rho = 1:num; phi = 1:num; z = 1:num;
%# straightforward FOR-loop
tic
result1 = zeros(1,num);
for i=1:num
result1(i) = cos(n*z(i)) * besselj(m'-1, n*rho(i)) * cos(m*phi(i))';
end
toc
%# vectorized computation of the components
tic
a = cos( bsxfun(@times, n, permute(z(:),[3 2 1])) );
b = besselj(m'-1, reshape(bsxfun(@times,n,rho(:))',[],1)'); %'
b = permute(reshape(b',[length(m) length(n) length(rho)]), [2 1 3]); %'
c = cos( bsxfun(@times, m, permute(phi(:),[3 2 1])) );
result2 = arrayfun(@(i) a(:,:,i)*b(:,:,i)*c(:,:,i)', 1:num); %'
toc
%# make sure the two results are the same
assert( isequal(result1,result2) )
I did another benchmark using the TIMEIT function (which gives fairer timings). The result agrees with the previous one:
0.0062407 # elapsed time (seconds) for my solution
0.015677 # elapsed time (seconds) for the FOR-loop solution
Note that as you increase the size of the input vectors, the two methods will start to have similar timings (the FOR-loop even wins on some occasions)
You need to create two matrices, say m_ and n_ so that by selecting element i,j of each matrix you get the desired index for both m and n.
Most MATLAB functions accept matrices and vectors and compute their results element by element. So to produce a double sum, you compute all elements of the sum in parallel by f(m_, n_) and sum them.
In your case (note that the .* operator performs element-wise multiplication of matrices)
N = 10;
M = 10;
n = 1:N;
m = 1:M;
rho = 1;
phi = 1;
z = 1;
% N rows x M columns for each matrix
% n_ - all columns are identical
% m_ - all rows are identical
n_ = repmat(n', 1, M);
m_ = repmat(m , N, 1);
element_nm = cos (n_*z) .* besselj(m_-1, n_*rho) .* cos(m_*phi);
sum_all = sum( element_nm(:) );
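The same index-matrix idea can be extended to column vectors of coordinates by adding a third index for the point number. The sketch below uses ndgrid for that and follows the element_nm expression above, i.e. it sums cos(n*z) * J_{m-1}(n*rho) * cos(m*phi) over n and m for each point. The coordinate values are made up for illustration, and the intermediate array has N*M*P elements, so for many points this may use too much memory and a loop over chunks of points could be the better trade-off.

```matlab
N = 10; M = 10;
rho = [1; 0.5; 2.0];   % hypothetical sample coordinates (column vectors)
phi = [1; 0.3; 1.2];
z   = [1; 2.0; 0.7];
P = numel(rho);

% nn, mm, pp are N-by-M-by-P index arrays; z(pp) etc. pick the coordinate
% of point pp at every (n, m) position.
[nn, mm, pp] = ndgrid(1:N, 1:M, 1:P);
terms = cos(nn .* z(pp)) .* besselj(mm - 1, nn .* rho(pp)) .* cos(mm .* phi(pp));
result = squeeze(sum(sum(terms, 1), 2));   % P-by-1, one double sum per point

% Reference: the 2D approach from above, applied point by point
[n_, m_] = ndgrid(1:N, 1:M);
ref = zeros(P, 1);
for p = 1:P
    t = cos(n_ * z(p)) .* besselj(m_ - 1, n_ * rho(p)) .* cos(m_ * phi(p));
    ref(p) = sum(t(:));
end
max_diff = max(abs(result - ref));
```

The two results should agree up to rounding, since both evaluate the same terms and only the order of summation differs.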