Vector-defined cross product application matrix and vectorization in Matlab

I ran into an operation I cannot seem to achieve via vectorization.
Let's say I want to find the matrix of the linear map defined by
h: X -> cross(V,X)
where V is a predetermined vector (both X and V are 3-by-1 vectors).
In Matlab, I would do something like
M= cross(repmat(V,1,3),eye(3,3))
to get this matrix. For instance, V=[1;2;3] yields
M =
0 -3 2
3 0 -1
-2 1 0
Let's now suppose that I have a 3-by-N matrix
V=[V_1,V_2...V_N]
with each column defining its own cross-product operation. For N=2, here's a naive try to find the two cross-product matrices that V's columns define
V=[1,2,3;4,5,6]'
M=cross(repmat(V,1,3),repmat(eye(3,3),1,2))
results in
V =
1 4
2 5
3 6
M =
0 -6 2 0 -3 5
3 0 -1 6 0 -4
-2 4 0 -5 1 0
while I was expecting
M =
0 -3 2 0 -6 5
3 0 -1 6 0 -4
-2 1 0 -5 4 0
Columns 2 and 5 are swapped.
Is there a way to achieve this without for loops?
Thanks!

First, make sure you read the documentation of cross very carefully when dealing with matrices:
It says:
C = cross(A,B,DIM), where A and B are N-D arrays, returns the cross
product of vectors in the dimension DIM of A and B. A and B must
have the same size, and both SIZE(A,DIM) and SIZE(B,DIM) must be 3.
Bear in mind that if you don't specify DIM, cross operates along the first dimension whose size is 3. Here that is dimension 1, so you're operating along the columns. In your first case, both inputs A and B are 3 x 3 matrices, so the output is the cross product of each pair of columns independently: the i'th column of the output contains the cross product of the i'th column of A and the i'th column of B. Both inputs must have 3 rows and the same number of columns.
You're getting what you expect because the first input A has [1;2;3] duplicated correctly over the columns three times. From your second piece of code, what you're expecting for V as the first input (A) looks like this:
V =
1 1 1 4 4 4
2 2 2 5 5 5
3 3 3 6 6 6
However, when you do repmat, you are in fact alternating between each column. In fact, you are getting this:
V =
1 4 1 4 1 4
2 5 2 5 2 5
3 6 3 6 3 6
repmat tiles matrices together, and you specified that you wanted to tile V horizontally three times, which interleaves the two vectors instead of grouping them. This explains why the columns are swapped: the [4;5;6] copies sit in the second, fourth and sixth columns of V when they should occupy the last three columns instead. The ordering of your input columns is the reason why the output appears swapped.
As such, you need to re-order V so that the first three columns are [1;2;3], followed by three columns of [4;5;6]. You can generate your original V matrix first, then index into it so that the first column is repeated three times, followed by the second column repeated three times:
>> V = [1,2,3;4,5,6].';
>> V = V(:, [1 1 1 2 2 2])
V =
1 1 1 4 4 4
2 2 2 5 5 5
3 3 3 6 6 6
Now use V with cross and maintain the same second input:
>> M = cross(V, repmat(eye(3), 1, 2))
M =
0 -3 2 0 -6 5
3 0 -1 6 0 -4
-2 1 0 -5 4 0
Looks good to me!
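For an arbitrary number of columns N, the same idea can be sketched as follows. This is a minimal sketch rather than the only way to do it: the names idx, x and check are introduced here for illustration, and kron is used to build the repeated column indices.

```matlab
V = [1 2 3; 4 5 6].';                 % 3-by-N, one vector per column
N = size(V, 2);
idx = kron(1:N, ones(1, 3));          % [1 1 1 2 2 2 ... N N N]
M = cross(V(:, idx), repmat(eye(3), 1, N));

% Block k, i.e. M(:, 3*k-2:3*k), is the cross-product matrix of V(:, k).
x = [7; 8; 9];                                % arbitrary test vector
check = M(:, 4:6) * x - cross(V(:, 2), x);    % should be all zeros
```

On releases that provide repelem, idx = repelem(1:N, 3) should produce the same index vector.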

dot product of matrix columns

I have a 4x8 matrix. I want to select two different columns, compute their dot product, and divide it by the product of the norms of those two columns. I then want to repeat this for all possible pairs of different columns and save the results in a new matrix. Can anyone provide MATLAB code for this purpose?
The code which I expected to give me the output is:
A=[1 2 3 4 5 6 7 8;1 2 3 4 5 6 7 8;1 2 3 4 5 6 7 8;1 2 3 4 5 6 7 8;];
for i=1:8
for j=1:7
B(:,i)=(A(:,i).*A(:,j+1))/(norm(A(:,i))*norm(A(:,j+1)));
end
end
I would approach this a different way. First, create two matrices where the corresponding columns of each one correspond to a unique pair of columns from your matrix.
The easiest way I can think of is to create all possible combinations of pairs and eliminate the duplicates. You can do this by creating a meshgrid of values, where the outputs X and Y together give you every pairing of column indices, and then selecting only the strictly lower triangular part of each matrix (everything below the main diagonal, which is what the offset of -1 in tril does). So do this:
num_columns = size(A,2);
[X,Y] = meshgrid(1:num_columns);
X = X(tril(ones(num_columns),-1)==1); Y = Y(tril(ones(num_columns),-1)==1);
In your case, here's what the grid of coordinates looks like:
>> [X,Y] = meshgrid(1:num_columns)
X =
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
Y =
1 1 1 1 1 1 1 1
2 2 2 2 2 2 2 2
3 3 3 3 3 3 3 3
4 4 4 4 4 4 4 4
5 5 5 5 5 5 5 5
6 6 6 6 6 6 6 6
7 7 7 7 7 7 7 7
8 8 8 8 8 8 8 8
As you can see, if we select out the lower triangular part of each matrix excluding the diagonal, we get all combinations of pairs that are unique, which is what I did in the last line of the code. Selecting the lower part is important because MATLAB selects values column-wise, and traversing the columns of the lower-triangular part of each matrix gives you the pairs of columns in the right order (i.e. 1-2, 1-3, ..., 1-8, 2-3, 2-4, ..., etc.)
The point of all of this is that we can then use X and Y to create two new matrices that contain the columns located at each pair of X and Y, then use dot to apply the dot product to each matrix column-wise. We also need to divide the dot product by the product of the magnitudes of the two vectors. You can't use MATLAB's built-in function norm for this because it will compute the matrix norm for matrices. Instead, square the elements of each matrix and sum over the rows of each column, multiply the two results element-wise, then take the square root. This is the last step of the process:
matrix1 = A(:,X);
matrix2 = A(:,Y);
B = dot(matrix1, matrix2, 1) ./ sqrt(sum(matrix1.^2,1).*sum(matrix2.^2,1));
I get this for B:
>> B
B =
Columns 1 through 11
1 1 1 1 1 1 1 1 1 1 1
Columns 12 through 22
1 1 1 1 1 1 1 1 1 1 1
Columns 23 through 28
1 1 1 1 1 1
Well... this isn't useful at all. Why is that? What you are actually computing is the cosine of the angle between two vectors. Since every column of A is a scalar multiple of the others, the angle between any two of them is 0, and the cosine of 0 is 1.
You should try this with different values of A so you can see for yourself that it works.
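As a quick illustration, here is a sketch with a small made-up matrix whose columns are not parallel. The values of A and the helper name mask are assumptions chosen only so the result is easy to check by hand.

```matlab
% Toy input: columns are [1;0], [0;1] and [1;1]
A = [1 0 1;
     0 1 1];

num_columns = size(A, 2);
[X, Y] = meshgrid(1:num_columns);
mask = tril(true(num_columns), -1);   % strictly lower triangular part
X = X(mask);                          % first column of each pair:  1 1 2
Y = Y(mask);                          % second column of each pair: 2 3 3

matrix1 = A(:, X);
matrix2 = A(:, Y);
B = dot(matrix1, matrix2, 1) ./ sqrt(sum(matrix1.^2, 1) .* sum(matrix2.^2, 1));
% B holds the cosines for the pairs 1-2, 1-3, 2-3
```

The first pair is orthogonal, so its cosine is 0, while the other two pairs are 45 degrees apart, giving 1/sqrt(2).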
To make this code compatible for copying and pasting, here it is:
%// Define A here:
A = repmat(1:8, 4, 1);
%// Code to produce dot products here
num_columns = size(A,2);
[X,Y] = meshgrid(1:num_columns);
X = X(tril(ones(num_columns),-1)==1); Y = Y(tril(ones(num_columns),-1)==1);
matrix1 = A(:,X);
matrix2 = A(:,Y);
B = dot(matrix1, matrix2, 1) ./ sqrt(sum(matrix1.^2,1).*sum(matrix2.^2,1));
Minor Note
If you have a lot of columns in A, this may be very memory intensive. You can get your original code to work with loops, but you need to change what you're doing at each column.
You can do something like this:
num_columns = nchoosek(size(A,2),2);
B = zeros(1, num_columns);
counter = 1;
for ii = 1 : size(A,2)
for jj = ii+1 : size(A,2)
B(counter) = dot(A(:,ii), A(:,jj), 1) / (norm(A(:,ii))*norm(A(:,jj)));
counter = counter + 1;
end
end
Note that we can use norm here because we're passing vectors as inputs into the function. We first preallocate a vector B that will contain the normalized dot products of all possible pairs. Then, we go through each pair of columns. Take note that the inner for loop starts at the outer for loop index plus 1 so you don't look at any duplicates. We take the dot product of the columns referenced by positions ii and jj, divide by the product of their norms, and store the result in B. We need an external counter so we can access the right slot to place our result in for each pair of columns.
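If duplicating the columns is the concern, another option (a different technique from the two above, sketched here under the assumption that an n-by-n intermediate matrix fits in memory) is to normalize each column once and read the cosines off the Gram matrix. The names An and G and the toy values of A are introduced here for illustration.

```matlab
% Toy input: columns are [1;0], [0;1] and [1;1]
A = [1 0 1;
     0 1 1];

An = bsxfun(@rdivide, A, sqrt(sum(A.^2, 1)));   % scale every column to unit length
G = An.' * An;                                  % G(i,j) = cosine between columns i and j
B = G(tril(true(size(G)), -1)).';               % pairs 1-2, 1-3, 2-3 (column-major order)
```

The strictly lower triangular part is read column by column, so the pairs come out in the same order as in the loop version.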

How to count patterns columnwise in Matlab?

I have a matrix S in Matlab that looks like the following:
2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1
I would like to count patterns of values column-wise. I am interested in the frequency of the numbers that follow right after the number 3 in any of the columns. For instance, the number 3 occurs three times in the first column. The first time we observe it, it is followed by 3, the second time it is followed by 3 again, and the third time it is followed by 4. Thus, the frequencies of the patterns observed in the first column would look like:
3-3: 66.66%
3-4: 33.33%
3-1: 0%
3-2: 0%
To generate the output, you could use the convenient tabulate
S = [
2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1];
idx = find(S(1:end-1,:)==3);
S2 = S(2:end,:);
tabulate(S2(idx))
Value Count Percent
1 0 0.00%
2 0 0.00%
3 4 66.67%
4 2 33.33%
Here's one approach, finding the 3's then looking at the following digits
[i,j]=find(S==3);
k=i+1<=size(S,1);
T=S(sub2ind(size(S),i(k)+1,j(k))) %// the elements of S that are just below a 3
R=arrayfun(@(x) sum(T==x)./sum(k),1:max(S(:))).' %// get the relative frequency of each digit
I'm going to restate your problem statement in a way that I can understand and my solution will reflect this new problem statement.
For a particular column, locate the locations that contain the number 3.
Look at the row immediately below these locations and look at the values at these locations
Take these values and tally up the total number of occurrences found.
Repeat these for all of the columns and update the tally, then determine the percentage of occurrences for the values.
We can do this by the following:
A = [2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1]; %// Define your matrix
[row,col] = find(A(1:end-1,:) == 3);
vals = A(sub2ind(size(A), row+1, col));
h = 100*accumarray(vals, 1) / numel(vals)
h =
0
0
66.6667
33.3333
Let's go through the above code slowly. The first few lines define your example matrix A. Next, we take a look at all of the rows except for the last row of your matrix and see where the number 3 is located with find. We skip the last row because we want to be sure we are within the bounds of your matrix. If there is a number 3 located at the last row, we would have undefined behaviour if we tried to check the values below the last because there's nothing there!
Once we do this, we take a look at those values in the matrix that are 1 row beneath those that have the number 3. We use sub2ind to help us facilitate this. Next, we use these values and tally them up using accumarray then normalize them by the total sum of the tallying into percentages.
The result would be a 4 element array that displays the percentages encountered per number.
To double check, if we look at the matrix, we see that a 3 is followed by another 3 a total of 4 times. The leading 3 is at first column, rows 3 and 4; second column, row 2; and third column, row 6. A 3 is followed by a 4 two times: the leading 3 is at first column, row 5 and second column, row 3.
In total, we have 6 numbers we counted, and so dividing by 6 gives us 4/6 or 66.67% for number 3 and 2/6 or 33.33% for number 4.
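One detail worth noting: accumarray sizes its output by the largest value it sees, so h happens to have four entries here only because a 4 follows a 3 somewhere. A small variation, sketched below, passes an explicit output size so that there is always one entry per label. Treating max(A(:)) as the number of labels is an assumption made for this sketch.

```matlab
A = [2 2 1 2
     2 3 1 1
     3 3 1 1
     3 4 1 1
     3 1 2 1
     4 1 3 1
     1 1 3 1];

[row, col] = find(A(1:end-1,:) == 3);        % 3's that have a row below them
vals = A(sub2ind(size(A), row+1, col));      % values directly below each 3
numLabels = max(A(:));                       % assumed label range: 1..max(A(:))
h = 100 * accumarray(vals, 1, [numLabels 1]) / numel(vals);
```

For this matrix the result is the same four percentages as before.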
If I got the problem statement correctly, you could efficiently implement this with MATLAB's logical indexing and an approach that is essentially of two lines -
%// Input 2D matrix
S = [
2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1]
Labels = [1:4]'; %//'# Label array
counts = histc(S([false(1,size(S,2)) ; S(1:end-1,:) == 3]),Labels)
Percentages = 100*counts./sum(counts)
Verify/Present results
The styles for presenting the output results listed next use MATLAB's table for a human-readable format of the data.
Style #1
>> table(Labels,Percentages)
ans =
Labels Percentages
______ ___________
1 0
2 0
3 66.667
4 33.333
Style #2
You can do some fancy string operations to present the results in a more "representative" manner -
>> Labels_3 = strcat('3-',cellstr(num2str(Labels','%1d')'));
>> table(Labels_3,Percentages)
ans =
Labels_3 Percentages
________ ___________
'3-1' 0
'3-2' 0
'3-3' 66.667
'3-4' 33.333
Style #3
If you want to present them in descending sorted manner based on the percentages as listed in the expected output section of the question, you can do so with an additional step using sort -
>> [Percentages,idx] = sort(Percentages,'descend');
>> Labels_3 = strcat('3-',cellstr(num2str(Labels(idx)','%1d')'));
>> table(Labels_3,Percentages)
ans =
Labels_3 Percentages
________ ___________
'3-3' 66.667
'3-4' 33.333
'3-1' 0
'3-2' 0
Bonus Stuff: Finding frequency (counts) for all cases
Now, let's suppose you would like repeat this process for say 1, 2 and 4 as well, i.e. find occurrences after 1, 2 and 4 respectively. In that case, you can iterate the above steps for all cases and for the same you can use arrayfun -
%// Get counts
C = cell2mat(arrayfun(@(n) histc(S([false(1,size(S,2)) ; S(1:end-1,:) == n]),...
1:4),1:4,'Uni',0))
%// Get percentages
Percentages = 100*bsxfun(@rdivide, C, sum(C,1))
Giving us -
Percentages =
90.9091 20.0000 0 100.0000
9.0909 20.0000 0 0
0 60.0000 66.6667 0
0 0 33.3333 0
Thus, in Percentages, the first column holds the percentages of [1,2,3,4] that occur right after there is a 1 somewhere in the input matrix. As an example, one can see that column 3 of Percentages is what you had in the sample output when looking for elements right after 3 in the input matrix.
If you want to compute frequencies independently for each column:
S = [2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1]; %// data: matrix
N = 3; %// data: number
r = max(S(:));
[R, C] = size(S);
[ii, jj] = find(S(1:end-1,:)==N); %// step 1
count = full(sparse(S(ii+1+(jj-1)*R), jj, 1, r, C)); %// step 2
result = bsxfun(@rdivide, count, sum(S(1:end-1,:)==N)); %// step 3
This works as follows:
1. find is first applied to determine row and col indices of occurrences of N in S except its last row.
2. The values in the entries right below the indices of step 1 are accumulated for each column, in variable count. The very convenient sparse function is used for this purpose. Note that this uses linear indexing into S.
3. To obtain the frequencies for each column, count is divided (with bsxfun) by the number of occurrences of N in each column.
The result in this example is
result =
0 0 0 NaN
0 0 0 NaN
0.6667 0.5000 1.0000 NaN
0.3333 0.5000 0 NaN
Note that the last column correctly contains NaNs because the frequency of the sought patterns is undefined for that column.

matlab indexing with multiple condition

I can't figure out how to create a vector based on conditions on more than one other vector. I have three vectors and I need the values of one vector where the values of the other vectors satisfy a condition.
As an example, below I would like to choose the values from vector a where vector b==2 and vector c==0. I expect [2 4].
a = [1 2 3 4 5 6 7 8 9 10];
b = [1 2 1 2 1 2 1 2 1 2];
c = [0 0 0 0 0 1 1 1 1 1]
I thought something like:
d = a(b==2) & a(c==0)
but I have d = 1 1 1 1 1 not sure why.
It seems to be a basic problem but I can't find a solution for it.
In your case you can consider using a(b==2 & c==0)
Use ismember to find the matching indices along the rows after concatenating b and c and then index to a.
Code
a(ismember([b;c]',[2 0],'rows'))
Output
ans =
2
4
You may use bsxfun too for the same result -
a(all(bsxfun(@eq,[b;c],[2 0]'),1))
Or you may just tweak your method to get the correct result -
a(b==2 & c==0)
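To see why the original attempt returned all ones, it helps to evaluate the pieces separately. The sketch below uses the vectors from the question; the names wrong, mask and d are introduced here for illustration.

```matlab
a = [1 2 3 4 5 6 7 8 9 10];
b = [1 2 1 2 1 2 1 2 1 2];
c = [0 0 0 0 0 1 1 1 1 1];

% a(b==2) is [2 4 6 8 10] and a(c==0) is [1 2 3 4 5]. Applying & to these
% two value vectors tests each pair for being nonzero, so every entry is true.
wrong = a(b==2) & a(c==0);

% Combine the conditions first, then index a once with the resulting mask.
mask = (b == 2) & (c == 0);
d = a(mask);                 % d is [2 4]
```

The key difference is that & has to act on the logical masks, which all have the same length as a, rather than on the values already extracted from a.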

Extracting unique values

I have data in two columns that looks as follows:
A B
1,265848208 3
-0,608043611 0
-0,285735893 0
0,006895134 7
0 7
-0,004526196 7
0,176326617 10
-0,159688071 2
0,22439945 2
-0,991045044 1
0,178022324 1
-0,270967397 4
0,285849994 4
1,881705539 23
1,057184204 10
NaN 10
For all unique values in B I want to extract the corresponding values in column A and move them to a new matrix. I then want to compute the mean of all the corresponding values in A and use it as a dependent variable (weighted by the number of observations per value in B) in a regression, with the common value of B being the independent variable, to reduce noise. Any help on how to do this in Matlab (except running the regression) would be great!
Thanks
Oscar
Here is an efficient solution:
X = [
1.265848208 3
-0.608043611 0
-0.285735893 0
0.006895134 7
0 7
-0.004526196 7
0.176326617 10
-0.159688071 2
0.22439945 2
-0.991045044 1
0.178022324 1
-0.270967397 4
0.285849994 4
1.881705539 23
1.057184204 10
NaN 10
];
%# unique values in B, and their indices
[valB,~,subs] = unique(X(:,2));
%# values of A for each unique number in B (cellarray)
valA = accumarray(subs, X(:,1), [], @(x) {x});
%# mean of each group
meanValA = cellfun(@nanmean, valA)
%# perform regression here...
The result:
%# B values, mean of corresponding values in A, number of A values
>> [valB meanValA cellfun(@numel,valA)]
ans =
0 -0.44689 2
1 -0.40651 2
2 0.032356 2
3 1.2658 1
4 0.0074413 2
7 0.00078965 3
10 0.61676 3
23 1.8817 1
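Note that the count column above includes the NaN entry for B = 10, while the mean ignores it. If the regression weights should only count usable observations, one possible variation is to drop the NaN rows before grouping. The sketch below uses a small made-up data set (not the data from the question), and the names valid and nObs are introduced here for illustration. It also avoids nanmean, which is not part of base MATLAB.

```matlab
% Toy data: column 1 plays the role of A, column 2 the role of B
X = [1.0 3
     2.0 3
     NaN 3
     4.0 5];

[valB, ~, subs] = unique(X(:,2));     % group index for every row
valid = ~isnan(X(:,1));               % rows with a usable A value
nGroups = numel(valB);

% Mean of the non-NaN A values per group, and how many values went into it
meanValA = accumarray(subs(valid), X(valid,1), [nGroups 1], @mean);
nObs     = accumarray(subs(valid), 1,          [nGroups 1]);
```

A group whose A values are all NaN would get a mean of 0 here (the default fill value of accumarray) together with nObs equal to 0, so such groups should be filtered out before the regression.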

How to combine vectors of different length in a cell array into matrix in MATLAB

How can I efficiently combine cell array vectors of different lengths into a matrix, padding the vectors to the maximum length with 0s or NaNs? It would be a nice option for cell2mat().
For example, if I have
C = {1:3; 1:5; 1:4};
I'd like to get either
M = [1 2 3 0 0
1 2 3 4 5
1 2 3 4 0];
or
M = [1 2 3 NaN NaN
1 2 3 4 5
1 2 3 4 NaN];
EDIT:
For a cell of row vectors as in your case, this will pad vectors with zeros to form a matrix
out=cell2mat(cellfun(@(x)cat(2,x,zeros(1,maxLength-length(x))),C,'UniformOutput',false))
out =
1 2 3 0 0
1 2 3 4 5
1 2 3 4 0
A similar question was asked earlier today, and although the question was worded slightly differently, my answer basically does what you want.
Copying the relevant parts here, a cell of uneven column vectors can be zero padded into a matrix as:
out=cell2mat(cellfun(@(x)cat(1,x,zeros(maxLength-length(x),1)),C,'UniformOutput',false));
where maxLength is assumed to be known. In your case, you have row vectors, which is just a slight modification from this.
If maxLength is not known, you can get it as
maxLength=max(cellfun(@(x)numel(x),C));
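Putting the pieces together for the NaN-padded variant asked about in the question, a minimal sketch could look like this. It assumes C is a column cell array of row vectors, as in the example; the name padRow is introduced here for illustration.

```matlab
C = {1:3; 1:5; 1:4};                         % column cell of row vectors

maxLength = max(cellfun(@numel, C));         % longest vector in the cell
padRow = @(x) [x, nan(1, maxLength - numel(x))];   % append NaNs up to maxLength
M = cell2mat(cellfun(padRow, C, 'UniformOutput', false));
```

Swapping nan for zeros in padRow gives the zero-padded version instead.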