I have the following code for an 8-dimensional empirical copula. It creates an 8-D array, but I only need its diagonal, which is named EC in this code. Since this code is very slow, is there any way I can get "EC" without computing all of "ecop"?
function EC = ecopula8d(x)
[m n] = size(x);
y = sort(x);
for r=1:m
for q=1:m
for p=1:m
for o=1:m
for l=1:m
for k=1:m
for j=1:m
for i=1:m
ecop(i,j,k,l,o,p,q,r) = sum( (x(:,1)<=y(i,1)).*(x(:,2)<=y(j,2)).*(x(:,3)<=y(k,3)).*(x(:,4)<=y(l,4))...
.*(x(:,5)<=y(o,5)).*(x(:,6)<=y(p,6)).*(x(:,7)<=y(q,7)).*(x(:,8)<=y(r,8)) )/(m+1);
end
end
end
end
end
end
end
end
for i=1:m
EC(i,1)=ecop(i,i,i,i,i,i,i,i);
end
I haven't checked whether your initial computation is correct (i.e. compared your implementation with the formula in the Wikipedia article), but your code should be equivalent to:
[m n] = size(x);
y = sort(x);
for i = 1:m
EC(i,1) = sum(all(bsxfun(@le, x, y(i,:)), 2), 1)/(m+1);
end
You can employ a completely vectorized solution with bsxfun -
EC = squeeze(sum(all(bsxfun(@le,x,permute(y,[3 2 1])),2),1))/(m+1)
The magic here happens with the use of permute enabling us to go full throttle on vectorization.
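To make the role of permute concrete, here is a minimal sketch on a small random input (the 6-by-3 size and the fixed seed are arbitrary assumptions for illustration). It checks the one-liner against the row-by-row loop and shows the shape of the intermediate comparison array.

```matlab
% Small check of the permute-based one-liner against the row-by-row loop.
% Assumption: x is m-by-n with observations in rows; the size is arbitrary.
rng(0);                          % fixed seed, only for repeatability
x = rand(6,3);
[m, n] = size(x);
y = sort(x);                     % column-wise sort, as in the question

% permute(y,[3 2 1]) is 1-by-n-by-m, so cmp is m-by-n-by-m:
% cmp(s,c,i) is true when x(s,c) <= y(i,c).
cmp = bsxfun(@le, x, permute(y,[3 2 1]));
EC_vec = squeeze(sum(all(cmp,2),1))/(m+1);

% Reference: one threshold row at a time.
EC_loop = zeros(m,1);
for i = 1:m
    EC_loop(i) = sum(all(bsxfun(@le, x, y(i,:)), 2))/(m+1);
end

disp(max(abs(EC_vec - EC_loop)))
```

Note that the intermediate logical array has m*n*m elements, so the fully vectorized form trades memory for speed as m grows.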
Here's a benchmarking test to compare this approach and the other partially vectorized approach with bsxfun on runtime efficiency -
x = rand(2000,2000);
y = sort(x);
m = size(x,1);
%// Warm up tic/toc.
for k = 1:100000
tic(); elapsed = toc();
end
disp('----------- With completely vectorized solution')
tic
EC1 = squeeze(sum(all(bsxfun(@le,x,permute(y,[3 2 1])),2),1))/(m+1);
toc, clear EC1
disp('----------- With partial vectorized solution')
tic
for i = 1:m
EC2(i,1) = sum(all(bsxfun(@le, x, y(i,:)), 2), 1)/(m+1);
end
toc
The runtimes thus obtained were -
----------- With completely vectorized solution
Elapsed time is 2.883594 seconds.
----------- With partial vectorized solution
Elapsed time is 4.752508 seconds.
One can pre-allocate for the other partially vectorized approach -
EC2 = zeros(m,1);
for i = 1:m
EC2(i,1) = sum(all(bsxfun(@le, x, y(i,:)), 2), 1)/(m+1);
end
The runtimes thus obtained weren't that different though -
----------- With completely vectorized solution
Elapsed time is 2.963835 seconds.
----------- With partial vectorized solution
Elapsed time is 4.620455 seconds.
One of the approaches I would use is to convert the N-D array into a square 2-D matrix (if possible) and then simply extract the diagonal terms, as they should be equal in both cases:
EC=diag(reshape(ecop,size1,size2));
I would suggest trying Python, because NumPy has a really nice and efficient linear algebra package for dealing with N-D arrays. MATLAB is pretty slow in adding to and updating its libraries.
Here is the original code:
K = zeros(N*N)
for a=1:N
for i=1:I
for j=1:J
M = kron(X(:,:,a).',Y(:,:,a,i,j));
%A function that essentially adds M to K.
end
end
end
The goal is to vectorize the Kronecker product calls. My intuition is to think of X and Y as containers of matrices (for reference, the slices of X and Y being fed to kron are square matrices of order 7x7). Under this container scheme, X appears as a 1-D container and Y as a 3-D container. My next guess was to reshape Y into a 2-D container, or better yet a 1-D container, and then do element-wise multiplication of X and Y. My questions are: how would I do this reshaping in a way that preserves the trace of M, and can MATLAB handle this container idea at all, or do the containers need to be reshaped further to expose the inner matrix elements?
Approach #1: Matrix multiplication with 6D permute
% Get sizes
[m1,m2,~] = size(X);
[n1,n2,N,I,J] = size(Y); % I and J are used in the reshape below
% Lose the third dim from X and Y with matrix-multiplication
parte1 = reshape(permute(Y,[1,2,4,5,3]),[],N)*reshape(X,[],N).';
% Rearrange the leftover dims to bring kron format
parte2 = reshape(parte1,[n1,n2,I,J,m1,m2]);
% Lose dims corresponding to last two dims coming in from Y corresponding
% to the iterative summation as suggested in the question
out = reshape(permute(sum(sum(parte2,3),4),[1,6,2,5,3,4]),m1*n1,m2*n2)
Approach #2: Simple 7D permute
% Get sizes
[m1,m2,~] = size(X);
[n1,n2,N,n4,n5] = size(Y);
% Perform kron format elementwise multiplication between the first two dims
% of X and Y, keeping the third dim aligned and "pushing out" leftover dims
% from Y to the back
mults = bsxfun(@times,permute(X,[4,2,5,1,3]),permute(Y,[1,6,2,7,3,4,5]));
% Lose the two dims with summation reduction for final output
out = sum(reshape(mults,m1*n1,m2*n2,[]),3);
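To see why this permute pattern reproduces the kron layout, here is a minimal single-pair sketch. P and Q are hypothetical stand-ins for one slice of X and one slice of Y; their sizes are arbitrary assumptions chosen to be non-square so that index mix-ups would show up.

```matlab
% Single-pair check: kron(P.', Q) built with permute + bsxfun.
% P and Q are illustrative stand-ins for X(:,:,a) and Y(:,:,a,i,j).
P = rand(3,2);
Q = rand(4,5);
[m1, m2] = size(P);
[n1, n2] = size(Q);

% P is sent to dims 4 and 2, so T(r,p,s,q) = P(q,p)*Q(r,s),
% which is exactly the element ordering kron(P.',Q) uses.
T = bsxfun(@times, permute(P,[4,2,3,1]), permute(Q,[1,3,2,4]));
K1 = reshape(T, n1*m2, n2*m1);

D = K1 - kron(P.', Q);
disp(max(abs(D(:))))
```

The answer's version uses the same placement of dimensions, with the shared third dimension and the two summed dimensions pushed to the back.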
Verification
Here's a setup for running the original and the proposed approaches -
% Setup inputs
X = rand(10,10,10);
Y = rand(10,10,10,10,10);
% Original approach
[n1,n2,N,I,J] = size(Y);
K = zeros(100);
for a=1:N
for i=1:I
for j=1:J
M = kron(X(:,:,a).',Y(:,:,a,i,j));
K = K + M;
end
end
end
% Approach #2 from above (7D permute with bsxfun), stored in out1
[m1,m2,~] = size(X);
[n1,n2,N,I,J] = size(Y);
mults = bsxfun(@times,permute(X,[4,2,5,1,3]),permute(Y,[1,6,2,7,3,4,5]));
out1 = sum(reshape(mults,m1*n1,m2*n2,[]),3);
% Approach #1 from above (matrix multiplication), stored in out2
[m1,m2,~] = size(X);
[n1,n2,N,I,J] = size(Y);
parte1 = reshape(permute(Y,[1,2,4,5,3]),[],N)*reshape(X,[],N).';
parte2 = reshape(parte1,[n1,n2,I,J,m1,m2]);
out2 = reshape(permute(sum(sum(parte2,3),4),[1,6,2,5,3,4]),m1*n1,m2*n2);
After running, we see the max. absolute deviation with the proposed approaches against the original one -
>> error_app1 = max(abs(K(:)-out1(:)))
error_app1 =
1.1369e-12
>> error_app2 = max(abs(K(:)-out2(:)))
error_app2 =
1.1937e-12
Values look good to me!
Benchmarking
Timing these three approaches using the same big dataset as used for verification, we get something like this -
----------------------------- With Loop
Elapsed time is 1.541443 seconds.
----------------------------- With BSXFUN
Elapsed time is 1.283935 seconds.
----------------------------- With MATRIX-MULTIPLICATION
Elapsed time is 0.164312 seconds.
It seems matrix multiplication does fairly well for datasets of this size!
I have three nested for loops and I would like, if possible, to vectorize the two inner ones.
for t=1:size(datesdaily1)
for i=1:size(secids,1)
sum=0;
if inc(t,i)==1
for j=1:size(secids,1)
if inc(t,j)==1
sum=sum+weig1(t,j)*sqrt(Rates(t,j))*rhoneutral(i,j);
end
end
b(t,i)=sqrt(Rates(t,i))*sum/MRates(t,1);
end
end
end
Any idea how to accomplish that? Here 'weig1', 'inc' and 'Rates' are (size(datesdaily1) by size(secids,1)) matrices and 'rhoneutral' is a (size(secids,1) by size(secids,1)) matrix.
I tried but I was not able to figure out how to do it ...
Actual full code:
for t=1:size(datesdaily1)
rho=NaN(size(secids,1),size(secids,1));
aux=datesdaily1(t,1);
windowlenght=252;
index=find(datesdaily==aux);
auxret=dailyret(index-windowlenght+1:index,:);
numerator=0;
denominator=0;
auxret(:,any(isnan(auxret))) = NaN;
rho = corr(auxret, 'rows','pairwise');
rho1 = 1 - rho;
w = weig1(t,:) .* sqrt(Rates(t,:));
x = w.' * w;
y = x .* rho;
z = x .* rho1;
numerator = numerator + nansum(nansum(y));
denominator = denominator + nansum(nansum(z));
if not(denominator==0)
alpha(t,1)=-(MRates(t,1)-numerator)/denominator;
%Stocks included
inc(t,:)=not(isnan(weig1(t,:).*diag(rho)'.*Rates(t,:)));
rhoneutral=rho-alpha(t,1).*(1-rho);
for i=1:size(secids,1)
sum=0;
if inc(t,i)==1
for j=1:size(secids,1)
if inc(t,j)==1
sum=sum+weig1(t,j)*sqrt(Rates(t,j))*rhoneutral(i,j);
end
end
bet(t,i)=sqrt(Rates(t,i))*sum/MRates(t,1);
end
end
check(t,1)=nansum(weig1(t,:).*bet(t,:));
end
end
One vectorized approach using fast matrix multiplication in MATLAB -
%// Mask of valid calculations
mask = inc==1
%// Store square root of Rates which seem to be used rather than Rates itself
sqRates = sqrt(Rates)
%// Use mask to set invalid positions in weig1 and sqRates to zeros
weig1masked = weig1.*mask
sqRates = sqRates.*mask
%// Perform the sum calculations using matrix multiplication.
%// This is where the magic happens!!
sum_vals = (weig1masked.*sqRates)*rhoneutral' %//'
%// Perform the outermost loop calculations for the final output
b_vect = bsxfun(@rdivide,sum_vals.*sqRates,MRates)
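As a sanity check, the matrix-multiplication version can be compared with the original double loop on small random inputs. This is only a sketch: the sizes M and N are arbitrary assumptions, and the accumulator is named s (instead of sum, as in the question) so the built-in function is not shadowed.

```matlab
% Compare the loop from the question with the matrix-multiplication version.
% Assumption: small random inputs without NaNs; M and N are arbitrary.
M = 5; N = 4;
weig1 = rand(M,N);
inc = rand(M,N) > 0.5;
Rates = rand(M,N);
rhoneutral = rand(N,N);
MRates = rand(M,1);

% Loop reference (entries with inc == 0 stay at zero)
b = zeros(M,N);
for t = 1:M
    for i = 1:N
        if inc(t,i)
            s = 0;
            for j = 1:N
                if inc(t,j)
                    s = s + weig1(t,j)*sqrt(Rates(t,j))*rhoneutral(i,j);
                end
            end
            b(t,i) = sqrt(Rates(t,i))*s/MRates(t,1);
        end
    end
end

% Vectorized version, same steps as above
mask = inc == 1;
sqRates = sqrt(Rates).*mask;
sum_vals = ((weig1.*mask).*sqRates)*rhoneutral.';
b_vect = bsxfun(@rdivide, sum_vals.*sqRates, MRates);

disp(max(abs(b(:) - b_vect(:))))
```

One caveat: masking by multiplication does not remove NaNs, because NaN*0 is NaN. If weig1 or Rates can hold NaN at excluded positions, setting those entries to zero with logical indexing (for example weig1(~mask) = 0) before multiplying is one option.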
Benchmarking
Here's a benchmark test specially dedicated to @Dmitry Grigoryev for the doubts put on vectorization for performance -
M = 200;
N = 200;
weig1 = rand(M,N);
inc = rand(M,N)>0.5;
Rates = rand(M,N);
rhoneutral = rand(N,N);
MRates = rand(M,1);
disp('--------------------------- With Original Approach')
tic
%// Code from the original approach
toc
disp('--------------------------- With DmitryGrigoryev Approach')
tic
%// Code from the DmitryGrigoryev's solution
toc
disp('--------------------------- With Much-Hated Vectorized Approach')
tic
%// Proposed matrix-multiplication approach in this solution
toc
Runtimes -
--------------------------- With Original Approach
Elapsed time is 0.104084 seconds.
--------------------------- With DmitryGrigoryev Approach
Elapsed time is 3.562170 seconds.
--------------------------- With Much-Hated Vectorized Approach
Elapsed time is 0.002058 seconds.
Posting runtimes for bigger data sizes might just be too embarrassing for loopy approaches. Way to go, vectorization!!
I am using Matlab R2014a.
I have a 3-dimensional M x N x M matrix A. I would like a vectorized way to extract a 2 dimensional matrix B from it, such that for each i,j I have
B(i,j)=A(i,j,g(i,j))
where g is a 2-dimensional index matrix of size M x N, i.e. with integral values in {1,2,...,M}.
The context is that I am representing a function f(k,z,k') as a 3-dimensional matrix A, the function g(k,z) as a 2-dimensional matrix, and I would like to compute the function
h(k,z)=f(k,z,g(k,z))
This seems like a simple and common thing to try to do but I really can't find anything online. Thank you so much to whoever can help!
My first thought was to try something like B = A(:,:,g) or B=A(g) but neither of these works, unsurprisingly. Is there something similar?
You can employ the best tool for vectorization, bsxfun here -
B = A(bsxfun(@plus,[1:M]',M*(0:N-1)) + M*N*(g-1))
Explanation: Breaking it down to two steps
Step #1: Calculate the indices corresponding to the first two dimensions (rows and columns) of A -
bsxfun(@plus,[1:M]',M*(0:N-1))
Step #2: Add the offset needed to include the dim-3 indices being supplied by g and index into A with those indices to get our desired output -
A(bsxfun(@plus,[1:M]',M*(0:N-1)) + M*N*(g-1))
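The two steps can be traced on a tiny example. The sketch below (the sizes M = 3 and N = 4 are arbitrary assumptions) builds the linear indices by hand and compares them with what sub2ind returns for the same subscripts.

```matlab
% Worked example of the linear-index arithmetic on a tiny array.
% Assumption: A is M-by-N-by-M and g holds integers in 1..M.
M = 3; N = 4;
A = rand(M,N,M);
g = randi(M,M,N);

% Step 1: linear index of element (i,j,1) is i + M*(j-1)
base = bsxfun(@plus, (1:M).', M*(0:N-1));

% Step 2: moving to page g(i,j) adds M*N*(g(i,j)-1)
idx = base + M*N*(g-1);
B = A(idx);

% Cross-check against subscripts converted by sub2ind
[ii, jj] = ndgrid(1:M, 1:N);
same_idx = isequal(idx, sub2ind([M N M], ii, jj, g));
disp(same_idx)
```

Here base is simply reshape(1:M*N, M, N), since MATLAB stores arrays in column-major order.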
Benchmarking
Here's a quick benchmark test to compare this bsxfun based approach against the ndgrid + sub2ind based solution as presented in Luis's solution with M and N as 100.
The benchmarking code using tic-toc would look something like this -
M = 100;
N = 100;
A = rand(M,N,M);
g = randi(M,M,N);
num_runs = 5000; %// Number of iterations to run each approach
%// Warm up tic/toc.
for k = 1:50000
tic(); elapsed = toc();
end
disp('-------------------- With BSXFUN')
tic
for iter = 1:num_runs
B1 = A(bsxfun(@plus,[1:M]',M*(0:N-1)) + M*N*(g-1)); %//'
end
toc, clear B1
disp('-------------------- With NDGRID + SUB2IND')
tic
for iter = 1:num_runs
[ii, jj] = ndgrid(1:M, 1:N);
B2 = A(sub2ind([M N M], ii, jj, g));
end
toc
Here are the runtime results -
-------------------- With BSXFUN
Elapsed time is 2.090230 seconds.
-------------------- With NDGRID + SUB2IND
Elapsed time is 4.133219 seconds.
Conclusions
As you can see, the bsxfun-based approach works really well: it is fully vectorized and performs well too.
Why is bsxfun better here -
bsxfun replicates the offset elements and adds them, both on the fly.
In the other solution, ndgrid internally makes two function calls to repmat, thus incurring function call overhead. At the next step, sub2ind spends time adding the offsets to get the linear indices, bringing in further function call overhead.
Try using sub2ind. This assumes g is defined as an MxN matrix with possible values 1, ..., M:
[ii, jj] = ndgrid(1:M, 1:N);
B = A(sub2ind([M N M], ii, jj, g));
Let's say I have two matrices A and B
A = rand(4,5,3);
B = rand(4,5,6)
I want to apply the function 'corr2' to calculate the correlation coefficients.
corr2(A(:,:,1),B(:,:,1))
corr2(A(:,:,1),B(:,:,2))
corr2(A(:,:,1),B(:,:,3))
...
corr2(A(:,:,1),B(:,:,6))
...
corr2(A(:,:,2),B(:,:,1))
corr2(A(:,:,2),B(:,:,2))
...
corr2(A(:,:,3),B(:,:,6))
How to avoid using loops to create such a vectorization?
Hacked into the m-file for corr2 to create a customized vectorized version for working with 3D arrays. Proposed here are two approaches with bsxfun (of course!)
Approach #1
szA = size(A);
szB = size(B);
a1 = bsxfun(@minus,A,mean(mean(A)));
b1 = bsxfun(@minus,B,mean(mean(B)));
sa1 = sum(sum(a1.*a1));
sb1 = sum(sum(b1.*b1));
v1 = reshape(b1,[],szB(3)).'*reshape(a1,[],szA(3));
v2 = sqrt(sb1(:)*sa1(:).');
corr3_out = v1./v2; %// desired output
corr3_out stores corr2 results between all 3D slices of A and B.
Thus, for A = rand(4,5,3), B = rand(4,5,6), we would have corr3_out as a 6x3 array.
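A quick way to gain confidence in this is to compare against the 2-D correlation coefficient written out slice by slice. The sketch below spells out the formula by hand instead of calling corr2, so it runs without the Image Processing Toolbox; the names ref, a and b are illustrative only.

```matlab
% Check the vectorized result against a per-pair reference computation.
% Assumption: the reference spells out the 2-D correlation coefficient
% directly, so corr2 itself is not required.
A = rand(4,5,3);
B = rand(4,5,6);
szA = size(A);
szB = size(B);

% Vectorized version (Approach #1)
a1 = bsxfun(@minus, A, mean(mean(A)));
b1 = bsxfun(@minus, B, mean(mean(B)));
sa1 = sum(sum(a1.*a1));
sb1 = sum(sum(b1.*b1));
v1 = reshape(b1,[],szB(3)).'*reshape(a1,[],szA(3));
v2 = sqrt(sb1(:)*sa1(:).');
corr3_out = v1./v2;

% Reference: one slice pair at a time
ref = zeros(szB(3), szA(3));
for ii = 1:szA(3)
    a = A(:,:,ii);  a = a - mean(a(:));
    for jj = 1:szB(3)
        b = B(:,:,jj);  b = b - mean(b(:));
        ref(jj,ii) = sum(a(:).*b(:)) / sqrt(sum(a(:).^2)*sum(b(:).^2));
    end
end

disp(max(abs(corr3_out(:) - ref(:))))
```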
Approach #2
Slightly different approach to save on few calls to sum and mean by using reshape instead -
szA = size(A);
szB = size(B);
dim12 = szA(1)*szA(2);
a1 = bsxfun(@minus,A,mean(reshape(A,dim12,1,[])));
b1 = bsxfun(@minus,B,mean(reshape(B,dim12,1,[])));
v1 = reshape(b1,[],szB(3)).'*reshape(a1,[],szA(3));
v2 = sqrt(sum(reshape(b1.*b1,dim12,[])).'*sum(reshape(a1.*a1,dim12,[])));
corr3_out = v1./v2; %// desired output
Benchmarking
Benchmark code -
%// Create random input arrays
N = 55; %// datasize scaling factor
A = rand(4*N,5*N,3*N);
B = rand(4*N,5*N,6*N);
%// Warm up tic/toc
for k = 1:50000
tic(); elapsed = toc();
end
%// Run vectorized and loopy approach codes on the input arrays
%// 1. Vectorized approach
%//... solution code (Approach #2) posted earlier
%// clear variables used
%// 2. Loopy approach
tic
s_A=size(A,3);
s_B=size(B,3);
out1 = zeros(s_B,s_A);
for ii=1:s_A
for jj=1:s_B
out1(jj,ii)=corr2(A(:,:,ii),B(:,:,jj));
end
end
toc
Results -
-------------------------- With BSXFUN vectorized solution
Elapsed time is 1.231230 seconds.
-------------------------- With loopy approach
Elapsed time is 139.934719 seconds.
MATLAB-JIT lovers show some love here! :)
Some examples, yet none is better than loops. As Divakar says in a comment below this is not a vectorized solution.
CODE:
A = rand(4,5,1000);
B = rand(4,5,200);
s_A=size(A,3);
s_B=size(B,3);
%%% option 1
tic
corr_AB=cell2mat(arrayfun(@(indx1) arrayfun(@(indx2) corr2(A(:,:,indx1),B(:,:,indx2)),1:s_B),1:s_A,'UniformOutput',false));
toc
%%% option 2
tic
indx1=repmat(1:s_A,s_B,1);
indx1=indx1(:);
indx2=repmat(1:s_B,1,s_A);
indx2=indx2(:);
indx=[indx1,indx2];
corr_AB=arrayfun(@(i) corr2(A(:,:,indx(i,1)),B(:,:,indx(i,2))),1:size(indx,1));
toc
%%% option 3
tic
a=1;
for i=1:s_A
for j=1:s_B
corr_AB(a)=corr2(A(:,:,i),B(:,:,j));
a=a+1;
end
end
toc
OUTPUT:
Elapsed time is 9.655696 seconds.
Elapsed time is 9.398979 seconds.
Elapsed time is 8.489744 seconds.
I have many points and I want to build a distance matrix, i.e. the distance from every point to every other point, but I want to avoid loops because they take too much time.
Is there a better way to build this matrix?
This is my loop. For a set setl of size 10000x3, this method takes a lot of time :(
for i=1:size(setl,1)
for j=1:size(setl,1)
dist = sqrt((xl(i)-xl(j))^2+(yl(i)-yl(j))^2+...
(zl(i)-zl(j))^2);
distanceMatrix(i,j) = dist;
end
end
How about using some linear algebra? The distance of two points can be computed from the inner product of their position vectors,
$$D(x, y) = \lVert y - x \rVert = \sqrt{x^T x + y^T y - 2\, x^T y},$$
and the inner product for all pairs of points can be obtained through a simple matrix operation.
x = [xl(:)'; yl(:)'; zl(:)'];
IP = x' * x;
d = sqrt(bsxfun(@plus, diag(IP), diag(IP)') - 2 * IP);
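The formula comes from expanding the squared norm, (y - x)'(y - x) = x'x + y'y - 2x'y. Here is a small sketch that checks the inner-product version against direct coordinate differences (n = 50 is an arbitrary assumption). The max(..., 0) clamp is an addition that is not in the answer above: round-off can leave tiny negative values where the true squared distance is zero, and sqrt would then return complex numbers.

```matlab
% Inner-product distance matrix checked against direct differences.
% Assumption: n is arbitrary; the clamp with max(...,0) is an added guard.
n = 50;
xl = rand(n,1); yl = rand(n,1); zl = rand(n,1);

x = [xl(:)'; yl(:)'; zl(:)'];            % 3-by-n, one point per column
IP = x' * x;                             % all pairwise inner products
sq = bsxfun(@plus, diag(IP), diag(IP)') - 2*IP;
d = sqrt(max(sq, 0));                    % clamp round-off negatives

% Reference: subtract coordinates explicitly
s = x.';
ref = sqrt(sum(bsxfun(@minus, permute(s,[1 3 2]), permute(s,[3 1 2])).^2, 3));

disp(max(abs(d(:) - ref(:))))
```

The two results agree only up to round-off, and the relative error of the inner-product form is largest for points that are very close together.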
For 10000 points, I get the following timing results:
ahmad's loop + shoelzer's preallocation: 7.8 seconds
Dan's vectorized indices: 5.3 seconds
Mohsen's bsxfun: 1.5 seconds
my solution: 1.3 seconds
You can use bsxfun which is generally a faster solution:
s = [xl(:) yl(:) zl(:)];
d = sqrt(sum(bsxfun(@minus, permute(s, [1 3 2]), permute(s, [3 1 2])).^2,3));
You can do this fully vectorized like so:
n = numel(xl);
[X, Y] = meshgrid(1:n,1:n);
Ix = X(:)
Iy = Y(:)
reshape(sqrt((xl(Ix)-xl(Iy)).^2+(yl(Ix)-yl(Iy)).^2+(zl(Ix)-zl(Iy)).^2), n, n);
If you look at Ix and Iy (try it for like a 3x3 dataset), they make every combination of linear indexes possible for each of your matrices. Now you can just do each subtraction in one shot!
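For instance, with n = 3 the index vectors look like this (a minimal sketch; they are transposed to rows here only to make them easier to read, and pairs is an illustrative name):

```matlab
% The index pairs produced by meshgrid for n = 3.
n = 3;
[X, Y] = meshgrid(1:n, 1:n);
Ix = X(:).';        % 1 1 1 2 2 2 3 3 3
Iy = Y(:).';        % 1 2 3 1 2 3 1 2 3
pairs = [Ix; Iy];   % each column is one (Ix, Iy) combination
disp(pairs)
```

Every one of the n^2 combinations appears exactly once, which is why a single element-wise expression followed by reshape(..., n, n) fills the whole matrix.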
However, mixing the suggestions of shoelzer and Jost will give you an almost identical performance boost:
n = 50;
xl = rand(n,1);
yl = rand(n,1);
zl = rand(n,1);
tic
for t = 1:100
distanceMatrix = zeros(n); %// Preallocation
for i=1:n
for j=min(i+1,n):n %// Taking advantage of symmetry
distanceMatrix(i,j) = sqrt((xl(i)-xl(j))^2+(yl(i)-yl(j))^2+(zl(i)-zl(j))^2);
end
end
d1 = distanceMatrix + distanceMatrix'; %'
end
toc
%// Vectorized solution that creates linear indices using meshgrid
tic
for t = 1:100
[X, Y] = meshgrid(1:n,1:n);
Ix = X(:);
Iy = Y(:);
d2 = reshape(sqrt((xl(Ix)-xl(Iy)).^2+(yl(Ix)-yl(Iy)).^2+(zl(Ix)-zl(Iy)).^2), n, n);
end
toc
Returns:
Elapsed time is 0.023332 seconds.
Elapsed time is 0.024454 seconds.
But if I change n to 500 then I get
Elapsed time is 1.227956 seconds.
Elapsed time is 2.030925 seconds.
Which just goes to show that you should always benchmark solutions in MATLAB before writing off loops as slow! In this case, depending on the size of your problem, loops could be significantly faster.
Be sure to preallocate distanceMatrix. Your loops will run much, much faster and vectorization probably isn't needed. Even if you do it, there may not be any further speed increase.
Recent versions of MATLAB (since R2016b) support implicit broadcasting (see also the note in the bsxfun() documentation).
Hence a fast way to compute the matrix of squared distances is (apply sqrt to the result for Euclidean distances):
function [ mDistMat ] = CalcDistanceMatrix( mA, mB )
mDistMat = sum(mA .^ 2).' - (2 * mA.' * mB) + sum(mB .^ 2);
end
Here the points are stored along the columns of each set.
In your case mA = mB.
Have a look at my Calculate Distance Matrix project.
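A short usage sketch of the same expression, assuming MATLAB R2016b or later for implicit expansion. The set sizes are arbitrary, and the names mSq and mDist are illustrative only. As in the inner-product variant earlier in the thread, the expression yields squared distances, so a clamped square root is applied here to get Euclidean distances.

```matlab
% Implicit-expansion distance matrix between two point sets.
% Assumptions: R2016b or later; points are stored as columns.
mA = rand(3,5);     % 5 points in 3-D
mB = rand(3,7);     % 7 points in 3-D

% (5-by-1) - (5-by-7) + (1-by-7) expands to 5-by-7 squared distances
mSq = sum(mA .^ 2).' - (2 * mA.' * mB) + sum(mB .^ 2);
mDist = sqrt(max(mSq, 0));   % clamp round-off negatives before sqrt

% Spot check one entry against norm
disp(abs(mDist(2,3) - norm(mA(:,2) - mB(:,3))))
```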