I have two matrices, A (2x1600x3) and B (2x1600). I want to run xcorr on each row of A against the corresponding row of B and store the results in a matrix. At present I am using the following code.
for ii=1:3
    for jj=1:2
        X(ii,jj)=max((xcorr(A(jj,:,ii),B(jj,:))));
    end
end
Because of the two nested for loops, this takes a long time and slows down my entire program, which already has a for loop of its own. How can I do this without the two for loops and still store the output in a matrix?
Meanwhile, I have also tried the above code with cellfun for a single column of the output matrix.
`cellfun(@(x) max(xcorr(x, B(1,:))), A, 'UniformOutput', false);`
In my observation, the for loop is much faster than cellfun.
Execution times:
For loop: 2.4 secs for two columns of the output matrix.
Cellfun: 2.6 secs for one column of the output matrix.
You can do this easily using fft. Cross-correlation is very similar to convolution:
% Compute the size of the cross-correlation.
N = size(A,2) + size(B,2) - 1;
% Do correlation using FFT. We have to flip one of the inputs.
% If A and B are both real, you might want to add the 'symmetric' flag to ifft
Y = ifft(fft(A,N,2) .* fft(flip(B,2),N,2), N, 2);
% Squeeze out the second dimension and transpose so it matches your size and shape.
Y = squeeze(max(Y, [], 2))'
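As a quick sanity check of the identity this relies on: for real-valued vectors, the cross-correlation over all lags equals the convolution of one input with the flipped other input. The following is only a minimal sketch with made-up test vectors `a` and `b` (they are not from the question), comparing the FFT route against `conv` from base MATLAB:

```matlab
% Hypothetical small real-valued test vectors (illustration only).
a = [1 2 3 4];
b = [2 0 -1 1];

% The full linear correlation has numel(a) + numel(b) - 1 samples.
N = numel(a) + numel(b) - 1;

% FFT route: zero-pad both inputs to N, flip b, multiply the spectra.
yFft = ifft(fft(a, N) .* fft(flip(b), N));

% Direct route: correlation of real signals as convolution with flipped b.
yConv = conv(a, flip(b));

% The two should agree up to floating-point round-off.
maxErr = max(abs(yFft - yConv));
```

Zero-padding to N matters here: without it the FFT product gives a circular correlation, and the lags wrap around.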
I have a 943x1682 matrix and I want to find the two most similar vectors in it. So I want to see the cosine distance from each vector in the matrix to every other vector in the matrix, excluding each vector's comparison with itself; if that cannot be excluded, I can just ignore those entries.
I made this loop to try to calculate this, so I can get a 1682x1682 matrix, with each cell corresponding to the similarity between i and j. However when I run this, it takes forever to run, and when I try to open the resulting matrix in my workspace, it says:
Cannot display summaries of variables with more than 524288 elements.
Is there an easier way to do this or am I doing something wrong?
Cross posted on MATLAB Answers. Repeating answer here:
Use a standard matrix multiply to get the dot products. MATLAB is very fast at standard matrix multiplies. And then normalize the result. E.g.,
AA = A' * A; % the column dot products via a standard matrix multiply
Anorm = sqrt(diag(AA)); % the norms of the columns
Adist = AA ./ (Anorm .* Anorm.'); % normalize the column dot products into cosine similarities
Then pick off the maximum value for your answer, disregarding the diagonal. E.g.,
n = size(A,2); % the number of columns
Adist(1:n+1:end) = -inf; % disregard the diagonal (column compared to itself)
[~,x] = max(Adist(:)); % find the linear index of the largest cosine similarity
[col1,col2] = ind2sub(size(Adist),x); % convert linear index into the original columns
Then col1 and col2 are the column numbers of the most similar columns, using cosine similarity as the measure.
You can normalise the columns of the matrix first, then the cosine similarity equation simplifies to a matrix multiplication:
aNorm = normc(A);
cosSim = aNorm' * aNorm;
Generally, matrix multiplication is more performant than looping. In a quick test, with N = 1000, the looping code takes ~7 seconds and the matrix multiplication code ~0.5 seconds.
The resultant matrix may still be too large to open in your workspace. You could copy individual rows or columns into a temporary variable and view those, or do a contour plot (heat-map) of the matrix to get a visual representation.
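For readers without the toolbox that provides normc, the same normalization can be written with base MATLAB functions. This is a minimal sketch on a small made-up matrix (the data and the variable names `idx`, `col1`, `col2` are only for illustration), assuming R2016b or later for implicit expansion and that no column is all zeros:

```matlab
% Hypothetical small data matrix: 4 samples (rows), 3 vectors (columns).
A = [1 2 0; 0 1 1; 2 4 1; 1 2 3];

% Scale every column to unit Euclidean length (implicit expansion).
aNorm = A ./ sqrt(sum(A.^2, 1));

% Cosine similarity between every pair of columns.
cosSim = aNorm' * aNorm;

% Exclude self-comparisons, then locate the most similar pair.
n = size(A, 2);
cosSim(1:n+1:end) = -inf;
[~, idx] = max(cosSim(:));
[col1, col2] = ind2sub(size(cosSim), idx);
```

In this toy matrix the second column is close to twice the first, so columns 1 and 2 come out as the most similar pair. An all-zero column would produce NaN entries and would need to be handled separately.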
Given two random variables X and Y, where X=(x1,...,xn) and Y=(y1,...,yn), in an n x 2 matrix A, so A=[X Y], I need to perform the following operation:
median((x-median(x))(y-median(y)))
I'm trying to obtain an estimator of the covariance matrix using the median instead of the mean, for an n x t matrix where t is the number of random variables and n the length of the data set.
So far, I have written the following code:
for i=1:n
    for j=1:n
        a1=median(A(:,i));
        a2=median(A(:,j));
        SMM(i,j)=median(((A(:,i)-a1(ones(t,1),:)).*(A(:,j)-a2(ones(t,1),:))));
    end
end
However, theoretically I should obtain a symmetric semidefinite (positive or negative) matrix, and that's not the case with this code.
Am I making any mistake in the code formulation?
Various points:
For each of your columns of A (x, y), the median (a1, a2) doesn't change. You should compute these outside the loops.
The loops go over n, rather than t, which are the variables and the indices to the output matrix.
I would first subtract the median from each column, to avoid repeatedly doing the same computations:
A = A - median(A,1); % be explicit about which dimension to take the median over!
Next, we'd loop over the t-by-t output elements of the covariance matrix, and compute each of the elements:
t = size(A,2);
SMM = zeros(t,t); % always preallocate output arrays before a loop
for j=1:t
    for i=1:t
        SMM(i,j) = median(A(:,i).*A(:,j));
    end
end
The loop can likely be vectorized, but that leads to a large intermediate matrix, which also slows the code down. So it might not be worth the effort to vectorize. Only try it if this code is too slow!
It should also be possible to run the inner loop from i=j:t, to skip computing the redundant half of the symmetric matrix, instead copying over the previously computed values.
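The half-matrix idea could look roughly like this. It is only a sketch on a small made-up data set (the values in `A` are hypothetical), assuming R2016b or later for the implicit expansion in the median subtraction:

```matlab
% Hypothetical data: n = 5 observations (rows), t = 3 variables (columns).
A = [1 2 0; 4 1 3; 2 5 1; 7 3 2; 3 4 6];

A = A - median(A, 1);          % center each column on its median
t = size(A, 2);
SMM = zeros(t, t);             % preallocate the output

for j = 1:t
    for i = j:t                % lower triangle (including the diagonal) only
        SMM(i, j) = median(A(:, i) .* A(:, j));
        SMM(j, i) = SMM(i, j); % mirror into the upper triangle
    end
end
```

Because each value is copied rather than recomputed, the result is symmetric by construction.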
Please correct me if something is unclear in this question. I have two matrices, pop and ben, each with 3 dimensions. Call these dimensions c, t, w. I want to repeat the exact same process described below for every index along the c dimension, without using a for loop, as that is slow. For the discussion below, fix a value of c to explain my thinking; later I will give an MWE. When c is fixed I have a 2D matrix with dimensions t, w.
Now I repeat the entire process (coming below!) for all of the w dimension.
If the value of u is zero, then I find the next non zero entry in this same t dimension. I save both this entry as well as the corresponding t index. If the value of u is non zero, I simply store this value and the corresponding t index. Call the index as i - note i would be of dimension (c,t,w). The last entry of every u(c,:,w) is guaranteed to be non zero.
Example if the u(c,:,w) vector is [ 3 0 4 2 0 1], then the corresponding i values are [1,3,3,4,6,6].
Now I take these entries and define a new 3D array of dimension (c,t,w) as follows. Taking my B array, I compute the following (this is not valid syntax, just a way to explain the idea): B(c,t,w)/u(c,i(c,t,w),w). That is, I divide each B value by the u value at the non-zero index from i that I computed.
For the above example, the denominator would be [3,4,4,2,1,1]. I hope that makes sense!!
QUESTION:
Since this process simply repeats for all c, I can do a very fast vectorized calculation for a single c. But for multiple c I do not know how to avoid the for loop. I don't know how to vectorize calculations across dimensions.
Here is what I did, where c_size is the dimension of c.
for c=c_size:-1:1
    uu=squeeze(pop(c,:,:)) ; % EXTRACT A 2D MATRIX FROM pop.
    BB=squeeze(B(c,:,:)) ; % EXTRACT A 2D MATRIX FROM B
    ii = nan(size(uu)); % Start with all nan values
    [dum_row, ~] = find(uu); % Get row indices of non-zero values
    ii(uu ~= 0) = dum_row; % Place row indices in locations of non-zero values
    ii = cummin(ii, 1, 'reverse'); % Column-wise cumulative minimum, starting from bottom
    dum_i = ii+(time_size+1).*repmat(0:(scenario_size-1), time_size+1, 1); % Create linear index
    ben(c,:,:) = BB(dum_i)./uu(dum_i);
    i(c,:,:) = ii ;
    clear dum_i dum_row uu BB ii
end
The central question is to avoid this for loop.
Related questions:
Vectorizable FIND function with if statement MATLAB
Efficiently finding non zero numbers from a large matrix
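One possible way to drop the loop over c is to run the cumulative minimum along the t dimension of the full 3D array and then gather the denominators with linear indices. The sketch below is only an illustration under stated assumptions: `u` and `B` are made-up C-by-T-by-W arrays, the last entry along t is non-zero as the question guarantees, and the division follows the textual description B(c,t,w)/u(c,i(c,t,w),w). The names `cc`, `tt`, `ww`, `lin` and `den` are hypothetical helpers.

```matlab
% Hypothetical inputs (C = 2, T = 6, W = 2); the first row reuses the
% example vector from the question.
u = zeros(2, 6, 2);
u(1, :, 1) = [3 0 4 2 0 1];
u(2, :, 1) = [0 0 5 0 0 2];
u(:, :, 2) = 1;
B = ones(size(u));

[C, T, W] = size(u);
[cc, tt, ww] = ndgrid(1:C, 1:T, 1:W);   % subscript grids for every element

ii = nan(C, T, W);                      % start with all NaN
ii(u ~= 0) = tt(u ~= 0);                % t index wherever u is non-zero
ii = cummin(ii, 2, 'reverse');          % next non-zero t index at or after t

lin = sub2ind([C T W], cc, ii, ww);     % linear index of u(c, ii, w)
den = u(lin);                           % denominators
ben = B ./ den;                         % B(c,t,w) / u(c, ii(c,t,w), w)
```

For the example vector this gives the indices [1 3 3 4 6 6] and the denominators [3 4 4 2 1 1]. If B should also be read at the shifted index, as the loop in the question does, `B(lin) ./ u(lin)` would be the corresponding variant.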
I want to multiply every column of a M × N matrix by corresponding element of a vector of size N.
I know it's possible using a for loop, but I'm looking for a simpler way of doing it.
I think this is what you want:
mat1=randi(10,[4 5]);
vec1=randi(10,[1 5]);
result=mat1.*repmat(vec1,[size(mat1,1),1]);
repmat will replicate vec1 along the rows of mat1. Then we can do element-wise multiplication (.*) to "multiply every column of a M × N matrix by corresponding element of a vector of size N".
Edit: Just to add to the computational aspect. There is an alternative to repmat that I would like you to know. Matrix indexing can achieve the same behavior as repmat and be faster. I have adopted this technique from here.
Observe that you can write the following statement
repmat(vec1,[size(mat1,1),1]);
as
vec1([1:size(vec1,1)]'*ones(1,size(mat1,1)),:);
If you look closely, the expression boils down to vec1([1]'*[1 1 1 1],:); which in turn is:
vec1([1 1 1 1],:);
thereby achieving the same behavior as repmat while being faster. I ran three solutions 100000 times each, namely,
Solution using repmat : 0.824518 seconds
Solution using indexing technique explained above : 0.734435 seconds
Solution using bsxfun provided by @LuisMendo : 0.683331 seconds
You can observe that bsxfun is slightly faster.
Although you can do it with repmat (as in @Parag's answer), it's often more efficient to use bsxfun. It also has the advantage that the code (last line) is the same for a row and for a column vector.
%// Example data
M = 4;
N = 5;
matrix = rand(M,N);
vector = rand(1,N); %// or size M,1
%// Computation
result = bsxfun(@times, matrix, vector); %// bsxfun does an "implicit" repmat
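On MATLAB R2016b and later, the element-wise operators expand singleton dimensions on their own (implicit expansion), so the same product can be written without bsxfun or repmat. A minimal sketch with arbitrary example values:

```matlab
% Example data (the values are arbitrary, chosen to be easy to read).
matrix = magic(4);            % 4-by-4
vector = [1 10 100 1000];     % 1-by-4, one factor per column

% Implicit expansion: the row vector is broadcast down the rows.
result = matrix .* vector;

% Same result as the explicit bsxfun call.
reference = bsxfun(@times, matrix, vector);
```

On older releases this line raises a dimension-mismatch error, in which case bsxfun remains the portable choice.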
I have the following code. How can I simplify it using the function, as it currently runs pretty slowly? Assume X is 10x7, Y is 4x7, and D is a matrix that stores the correlation between each pair of vectors. If the solution is to use the xcorr2 function, can someone show me how it is done?
for i = 1:4
    for j = 1:10
        D(j,i) = corr2(X(j,:),Y(i,:));
    end
end
Use pdist2 (Statistics toolbox) with 'correlation' option. It's faster than your code (even with preallocation), and requires just one line:
D = 1-pdist2(X,Y,'correlation');
Here is how I would do it:
First of all, store/process your matrix transposed. This makes for easier use of the correlation function.
Now assuming you have matrices X and Y and want to get the correlations between combinations of columns, this is easily achieved with a single loop:
Take the first column of X
use corrcoef to determine the correlation with all columns of Y at once.
As long as there is one, take the next column of X
Alternatively, you can check whether it helps to replace corr2 in your original code with corr, xcorr or corrcoef and see which one runs fastest.
With corrcoef you can do this without a loop, and without using a toolbox:
D = corrcoef([X', Y']);
D = D(1 : size(X, 1), end - size(Y, 1) + 1 : end);
A drawback is that more coefficients are computed than necessary.
The transpose is necessary because your matrices do not follow the MATLAB convention of enumerating samples with the first index and variables with the second.
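To avoid computing the unneeded coefficients, the Pearson correlation can also be written as a single matrix product of centered, unit-length rows. This is a sketch using base MATLAB only, with random stand-in data of the sizes given in the question and assuming R2016b or later for implicit expansion; the names `Xc`, `Yc`, `Xn`, `Yn` are hypothetical:

```matlab
% Hypothetical data with the sizes from the question.
X = rand(10, 7);
Y = rand(4, 7);

% Center each row and scale it to unit length.
Xc = X - mean(X, 2);
Yc = Y - mean(Y, 2);
Xn = Xc ./ sqrt(sum(Xc.^2, 2));
Yn = Yc ./ sqrt(sum(Yc.^2, 2));

% Pearson correlation of every row of X with every row of Y (10-by-4).
D = Xn * Yn';

% Cross-check against the corrcoef approach shown above.
Dfull = corrcoef([X', Y']);
Dref = Dfull(1:size(X, 1), end - size(Y, 1) + 1:end);
maxErr = max(abs(D(:) - Dref(:)));
```

A constant row has zero variance and would yield NaN here, just as it does with corrcoef.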