I want to brainstorm an idea in MATLAB with you. Given a matrix with many columns (14K) and few rows (7), where columns are items and rows are features of the items, I would like to compute the similarity between all items and keep it in a structure which is:
1. Easy to compute
2. Easy to access
For 1., I came up with the idea of using pdist(), which is very fast:
A % my matrix
S = pdist(A') % computes the similarity between all columns very fast
However, accessing S is not convenient. I would prefer to access the similarity between items i and j as S(i,j):
S(4,5) % is the similarity between item 4 and 5
In its original definition, S is an array, not a matrix. Is making it a 2D matrix a bad idea storage-wise? Could we think of a cool idea that helps me quickly find which similarity corresponds to which items?
Thank you.
You can use pdist2(A',A'). What is returned is essentially the distance matrix in its standard form, where element (i,j) is the dissimilarity (or similarity) between the i-th and j-th patterns.
Also, if you want to use pdist(), which is fine, you can convert the resulting array into the well-known square distance matrix using the function squareform().
So, in conclusion, if A is your dataset and S the distance matrix, you can use either
S=pdist(A');
S=squareform(S);
or
S=pdist2(A',A');
Now, regarding the storage point of view, you will certainly notice that such a matrix is symmetric. What MATLAB essentially proposes with the array returned by pdist() is to save space: because the matrix is symmetric, you can store half of it in a vector. Indeed, the array S has m(m-1)/2 elements, whereas the matrix form has m^2 elements (if m is the number of patterns in your training set). On the other hand, the vector is certainly trickier to access, whereas the matrix form is absolutely straightforward.
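If you want the storage savings of the vector form but still need pair access, you can compute the index into the condensed vector yourself. Here is a minimal sketch (the helper name pdistIndex is my own, not a built-in):
function idx = pdistIndex(i, j, m)
    % Index into the condensed vector returned by pdist for the pair (i,j),
    % assuming i < j and m items in total; swap i and j first if i > j.
    idx = (i-1)*m - i*(i-1)/2 + (j-i);
end
With that, S(pdistIndex(4,5,size(A,2))) gives the same value as the (4,5) entry of squareform(S) without ever materializing the m^2 matrix.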
I'm not completely sure I understand what your question is, but if you want to access S(i, j) easily, then the function squareform() is made for this:
S = squareform(pdist(A'));
I'm surprised by the low speed of a MATLAB matrix arithmetic operation in my code, shown below:
pitemp=zeros(nz,na,nb,nk,nxi,nann,nbn,nkn);
pitemp=alphal^(alphal*alphares)*alpham^(alpham*alphares)*expz.^alphares.*expk.^((eta-alphal-alpham)*alphares)+(1-delta)*expk-expkn...
-ximgrid.*(expkn./expk-1+delta>zeta | expkn./expk-1+delta<-zeta)+rd*amgrid-rl*bmgrid-amgridn+bmgridn;
pitemp, expz, expk, expkn, amgrid, bmgrid, amgridn, and bmgridn are all 8-D arrays. It seems it is not the logical-operator part that slows things down; I just don't see any reason why this line could take up to 10 seconds...
Does anyone see where the problem is? Thanks a lot in advance! I'm really being killed by the slow performance of just this one line...
High-dimensional arrays can indeed be slow to work with. I will show this by comparing the speed of a 2D matrix and your 8D array; the two have approximately the same number of elements.
function m = fRef1()
    m = rand(5000,10000); % generate a random 5000-by-10000 matrix
end
Running the timeit function,
timeit(@fRef1,1)
The matrix generation takes 0.7058s for the 2D matrix.
function m = fRef2()
    m = rand(10,10,10,10,5,10,10,10); % same number of elements, spread over 8 dimensions
end
timeit(@fRef2,1)
And for the 8D array it takes 0.7277s, which is about the same speed.
Now test a simple element-wise matrix operation:
function M = f1()
m = rand(5000,10000);
M = m.^2.*m+m;
end
This takes 0.9449s with timeit. Subtracting the generation time measured with fRef1, the matrix operation itself takes about 0.24s.
Now compare to the 8D matrix
function M = f2()
m = rand(10,10,10,10,5,10,10,10);
M = m.^2.*m+m;
end
This takes 1.2553s with timeit. Removing the time from fRef2 gives the time for the matrix operation alone: 0.5276s, which is about twice the time for the 2D matrix. So can we do better? The answer is yes! Since the operations are done element-wise, the result is independent of the shape of the array. Let us then reshape the array into a form that MATLAB finds more suitable.
function M = f3()
    m = rand(10,10,10,10,5,10,10,10);
    m = m(:);                 % flatten to a column vector
    M = m.^2.*m + m;
    M = reshape(M,10,10,10,10,5,10,10,10); % reshape back to the original 8D shape
end
timeit gives a result of 0.9494s. Removing the time for the creation of m gives 0.2217s, which is about the same as for the 2D matrix.
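Applied to the expression in the question, the same trick would look roughly like the sketch below (all the 8-D array operands, including ximgrid, and the scalars are assumed to exist in your workspace, as in the original code):
sz = size(expz);
f = @(x) x(:);   % flatten each 8-D operand to a column vector
pitemp = alphal^(alphal*alphares)*alpham^(alpham*alphares)*f(expz).^alphares.*f(expk).^((eta-alphal-alpham)*alphares) ...
    + (1-delta)*f(expk) - f(expkn) ...
    - f(ximgrid).*(f(expkn)./f(expk)-1+delta>zeta | f(expkn)./f(expk)-1+delta<-zeta) ...
    + rd*f(amgrid) - rl*f(bmgrid) - f(amgridn) + f(bmgridn);
pitemp = reshape(pitemp, sz);   % restore the original 8-D shape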
Tested on Windows 7, Intel Core i5-2540M @ 2.60GHz, MATLAB 2014b.
I need to create a function that takes a component matrix as a parameter and returns a matrix. Apparently this function should normalise my data.
There are other instructions along with this step in my project, such as:
Take the matrix and calculate the mean value along a certain column.
Calculate the difference between the measurement and this mean.
Subtract this difference from each measurement.
Return corrected matrix to the script.
Place corrected matrix in a variable within the script.
(I don't know if this is what the function is supposed to do or anything. I'm completely lost, and any help would be appreciated. Thanks!)
This is probably homework but I'll help you get started.
To create a function which takes a matrix and returns a matrix:
function m_out = my_function(m_in)
    % insert calculations here
end
To find the 2-norm of a matrix (which is the largest singular value):
the_norm = norm(my_matrix); % returns a scalar, 2-norm of matrix
To find the mean of a vector:
the_mean = mean(my_vector); % returns a scalar, mean of the vector
To access a specific column of a matrix:
my_col = my_matrix(:, col_number); % my_col is a vector
To access a specific row of a matrix:
my_row = my_matrix(row_num, :); % my_row is a vector
To subtract a scalar (single number) from a matrix:
new_matrix = old_matrix - single_number; % returns a matrix
To store a matrix into a variable (example):
my_matrix = [1,2,3;4,5,6;7,8,9];
Give it a try: create a function which puts all of this together.
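For reference, here is a minimal sketch of how those pieces might fit together, assuming the intent of your instructions is to mean-center each column (the function name normalise_matrix is made up):
function m_out = normalise_matrix(m_in)
    col_means = mean(m_in, 1);                          % 1-by-n row vector of column means
    m_out = m_in - repmat(col_means, size(m_in,1), 1);  % subtract each column's mean
end
Save it as normalise_matrix.m, then call it from your script and store the result in a variable: corrected = normalise_matrix(data);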
I'm implementing metric rectification of an image with projective distortion in the following manner:
From the original image I'm finding two sets of parallel lines and finding their intersection points (the vanishing points at infinity).
I'm selecting five non-collinear points on a circle to be fit to a conic, then I'm checking where that conic intersects the line at infinity using the aforementioned points.
I use those points to find the distorted dual degenerate conic.
Theoretically, since the distorted conic is determined by C*_d = H C* H' (where C* is the dual degenerate conic, ' is transpose, and H is my homography), I should be able to run SVD to determine H. Undistorted, C* is a 3x3 identity matrix with the last diagonal element zero. However, if I run SVD I don't get ones in the diagonal matrix. For some matrices I can avoid this by using Cholesky factorization instead (which factors to C*_d = HH', which, at least here, is mostly okay), but that requires a positive definite matrix. Is there a way to distribute the scale inside the diagonal matrix returned by SVD equally into the U and V' matrices while keeping them the same (e.g. U = V)?
I'm using MATLAB for this. I'm sure I'm missing something obvious...
The lack of positive definiteness of the resulting matrices was due to noise: the image used had too much radial distortion, rendering even the selection of many points on the circle fairly useless in this approach.
The point missed in the SVD approach was to remove the scale from the diagonal component by left- and right-multiplying by the square root of the diagonal matrix (with the last diagonal element set to 1, since that singular value should be zero, but a zero component there would not yield correct results).
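In MATLAB, that fix looks roughly like the following sketch (variable names are my own; Cstar_d is the distorted dual degenerate conic):
[U, D, ~] = svd(Cstar_d);   % Cstar_d is symmetric, so U and V agree up to sign
D(3,3) = 1;                 % this singular value should be ~0; set it to 1 so sqrt(D) is invertible
H = U * sqrt(D);            % then H * diag([1 1 0]) * H' reproduces Cstar_d up to noise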
Consider a matrix from which I select the same number of elements from every row. Let us say the matrix is n-by-n and from each row I take m elements (m < n).
I will build an m-by-m matrix with these selected elements. In every row I put the elements taken from the original matrix (same row index, of course).
What is the best way to achieve this?
Thank you.
One way to achieve this is illustrated here. Define an array a to play around with:
a = randi(6,6);
b = a([1 3 5],[2 4 6])
This demonstrates the use of index vectors for selecting rows and columns from one matrix into another. It depends on being able to specify the vectors you want to use as indices. You could also write:
c = a(1:2:end,2:2:end)
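If the columns you keep differ from row to row, sub2ind can gather them in one indexing operation. A sketch, where the index matrix cols is an assumed example of your per-row selection:
a = magic(6);
cols = [1 3 5; 2 4 6; 1 2 3; 4 5 6; 1 4 6; 2 3 5];  % one row of column picks per row of a
rows = repmat((1:size(a,1))', 1, size(cols,2));      % matching row indices
b = a(sub2ind(size(a), rows, cols));                 % 6-by-3 result, row i taken from row i of a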
Now, if you tell us what you mean by 'the best way', we may be able to tell you that too!
EDIT
So I read the question again; it seems that by 'best' you mean 'fastest'. I've never been concerned with measuring the speed of this sort of operation, and I await with interest one of the real MATLAB experts who lurk hereabouts providing a much cleverer answer than this.
Of course, the fastest way is to not build a submatrix at all, but to operate on the elements of the original matrix. Whether your algorithm can be adapted to avoid building a submatrix is unknown to me.
I'm trying to select a subset of features from a dataset that contains 2000 of them for 63 samples. I know how to do PCA in MATLAB: I used pcacov, and it returns the eigenvectors and the eigenvalues. However, I don't know how to select the features I want. I mean, if the features aren't labeled, how can I select my features? Or will they be returned in the same order?
PCA does not tell you which features are the most significant, but which combinations of features keep the most variance.
What PCA does is rotate your dataset in such a way that it has the most variance along the first dimension, second most along second, and so on. So, what you do when you multiply your feature vectors by the first N eigenvectors is rotate the set and keep the first N dimensions to transform your vectors into a lower-dimensional representation that keeps most of the variance.
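As a concrete sketch of that projection (X is assumed to be your 63-by-2000 samples-by-features matrix, and N a hypothetical number of components to keep):
Xc = bsxfun(@minus, X, mean(X, 1));  % center each feature
pc = pcacov(cov(Xc));                % principal component directions (eigenvectors)
N = 10;                              % hypothetical choice of dimensionality
Xlow = Xc * pc(:, 1:N);              % 63-by-N lower-dimensional representation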
how can I select my features?
If you call it like
[pc,variances,explained] = pcacov(covx)
then the principal components are the vectors in the first return argument with variances as in the second return argument. They are in correspondence and sorted from most significant to least significant.
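For example, a common sketch is to use the explained output to decide how many components to keep:
N = find(cumsum(explained) >= 95, 1);  % smallest N explaining at least 95% of the variance
reduced = pc(:, 1:N);                  % keep the first N principal components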
or will they be returned in the same order?
You can assume this if the function help says so; otherwise it's not safe to assume, and you can do something like:
[varsorted,varsortedinds] = sort(variances,'descend');
pcsorted = pc(:,varsortedinds);
And varsorted and pcsorted will be in order from most to least significant.
Edit 7 years later: I realized in re-reading the question that my answer doesn't actually answer this. I thought what was being asked was are the principal components sorted. Don Reba's answer is an answer to the actual question asked. I can't delete a selected answer though.