I have a NxNx5 array T that I would like to convert into a Rx5 array TT such that the following condition is satisfied (where R is the number of non-zero entries of the array T(:,:,1)):
If T(i,j,1) == 0 then we ignore. If T(i,j,1) != 0 then I would like a row of TT whose entry is
[T(i,j,1) T(i,j,2) T(i,j,3) T(i,j,4) T(i,j,5)]
Note that T(i,j,k) (k = 2,3,4,5) could be zero. For example,
If
T(3,2,1) = 3
then I would like a row of TT to be
[3 0 2 1 5].
Some notes:
The entries of TT are all integers.
The entries ascend in order column-wise, i.e. the first column of T(:,:,1) may be
[1 2 0 0 3 4 0 0 0 5 6]'
then the next column
[7 8 0 0 0 0 0 9 10 11 12]'
I think this does what you want:
ind = find(T(:,:,1));
ind = bsxfun(@plus, ind(:), (0:size(T,3)-1)*size(T,1)*size(T,2));
result = T(ind);
This will do it:
clear
rng(343)
N=7;
K=5;
T=randi([0,4],[N,N,K])
TT=reshape(T,[N*N,K])
TT(TT(:,1)==0,:)=[] %delete rows of TT whose first column equals 0
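Both answers rely on MATLAB's column-major ordering, so the rows of the result follow the order in which find traverses T(:,:,1). A minimal sketch on a small hypothetical 2x2x3 array (the values and the names mask and flat are made up for illustration), using a logical mask instead of deleting rows:

```matlab
% Hypothetical 2x2x3 input; page 1 decides which (i,j) positions are kept
T = cat(3, [1 0; 2 3], [10 20; 30 40], [5 6; 7 8]);
[N1, N2, K] = size(T);

mask = T(:,:,1) ~= 0;           % N1-by-N2 logical, true where page 1 is non-zero
flat = reshape(T, N1*N2, K);    % row p holds T(i,j,:) for linear index p
TT   = flat(mask(:), :);        % keep only the rows selected by the mask
```

Here TT has one row per non-zero entry of T(:,:,1), in column-major order.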
Let z = [1 3 5 6]. Taking all the differences between pairs of elements,
we get:
bsxfun(@minus, z', z)
ans =
0 -2 -4 -5
2 0 -2 -3
4 2 0 -1
5 3 1 0
I now want to order these values in ascending order and remove the duplicates. So:
sort(reshape(bsxfun(@minus, z', z),1,16))
ans =
Columns 1 through 13
-5 -4 -3 -2 -2 -1 0 0 0 0 1 2 2
Columns 14 through 16
3 4 5
C = unique(sort(reshape(bsxfun(@minus, z', z),1,16)))
C =
-5 -4 -3 -2 -1 0 1 2 3 4 5
But by looking at -5 in [-5 -4 -3 -2 -1 0 1 2 3 4 5],
how can I tell where -5 comes from? By reading the matrix myself,
0 -2 -4 -5
2 0 -2 -3
4 2 0 -1
5 3 1 0
I know it comes from z(1) - z(4), i.e. row 1 column 4.
Also, 2 comes from both z(3) - z(2) and z(2) - z(1), i.e. from two places. Without reading the original matrix itself, how can we know that the 2 in [-5 -4 -3 -2 -1 0 1 2 3 4 5] comes from row 3 column 2 and row 2 column 1 of the original matrix?
So, for each element of [-5 -4 -3 -2 -1 0 1 2 3 4 5], how do we efficiently find where it comes from in the original matrix (for example, -5)? I want to know because I need to do an operation on each difference and the two indices that produce it: for -5, I compute (-5)*1*4, as z(1) - z(4) = -5. But for 2, I need to compute 2*(3*2+2*1), as z(3) - z(2) = 2 and z(2) - z(1) = 2, so the source is not unique.
Thinking it over, I probably should not reshape bsxfun(@minus, z', z) into a vector. I could also create two index arrays so that I can do operations like (-5)*1*4 above efficiently. However, this is easier said than done, and I also have to take care of non-distinct sources. Or should I do the desired operations first?
Use the third output from unique. And don't sort, unique will do that for you.
[sortedOutput,~,linearIndices] = unique(reshape(bsxfun(@minus, z', z),[1 16]))
You can reconstruct the result from bsxfun like so:
distances = reshape(sortedOutput(linearIndices),[4 4]);
If you want to know where a certain value appears, you write
targetValue = -5;
targetValueIdx = find(sortedOutput==targetValue);
linearIndexIntoDistances = find(targetValueIdx==linearIndices);
[row,col] = ind2sub([4 4],linearIndexIntoDistances);
This works because linearIndices(k) is the position in sortedOutput of the k-th element of the original vector; for example, it is 1 wherever the first value in sortedOutput appears.
If you save the result of bsxfun in an intermediate variable:
distances=bsxfun(@minus, z', z)
Then you can look for the values of C in distances using find iteratively.
[rows,cols]=find(C(i)==distances)
This will give all rows and cols if the values are repeated. You just need to then use them for your equation.
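The iterative use of find described above could be sketched as follows. The cell arrays rows and cols are names introduced here for illustration; they collect the positions of every unique difference.

```matlab
z = [1 3 5 6];
distances = bsxfun(@minus, z', z);   % distances(i,j) = z(i) - z(j)
C = unique(distances);               % sorted unique differences (column vector)

rows = cell(numel(C), 1);            % row indices for each unique value
cols = cell(numel(C), 1);            % column indices for each unique value
for i = 1:numel(C)
    [rows{i}, cols{i}] = find(distances == C(i));
end
```

For instance, C(8) is 2, and [rows{8} cols{8}] lists the positions (2,1) and (3,2).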
You can use accumarray to collect all row and column indices that correspond to the same value in the matrix of differences:
z = [1 3 5 6]; % data vector
zd = bsxfun(@minus, z.', z); % matrix of differences
[C, ~, ind] = unique(zd); % unique values and indices
[rr, cc] = ndgrid(1:numel(z)); % template for row and col indices
f = @(x){x}; % anonymous function to collect row and col indices
row = accumarray(ind, rr(:), [], f); % group row indices according to ind
col = accumarray(ind, cc(:), [], f); % same for col indices
For example, C(6) is value 0, which appears four times in zd, at positions given by row{6} and col{6}:
>> row{6}.'
ans =
3 2 1 4
>> col{6}.'
ans =
3 2 1 4
As you see, the results are not guaranteed to be sorted. If you need to sort them in linear order:
rowcol = cellfun(@(r,c)sortrows([r c]), row, col, 'UniformOutput', false);
so now
>> rowcol{6}
ans =
1 1
2 2
3 3
4 4
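If the goal is the weighted sum from the question, the grouping and the arithmetic can be done in one accumarray call. This is only a sketch, assuming the weights are products of the row and column indices, as in the 2*(3*2+2*1) example from the question; the names s and result are introduced here.

```matlab
z = [1 3 5 6];
zd = bsxfun(@minus, z.', z);            % matrix of differences
[C, ~, ind] = unique(zd);               % unique values and group labels
[rr, cc] = ndgrid(1:numel(z));          % row and column index of every entry

s = accumarray(ind(:), rr(:).*cc(:));   % sum of row*col over each group
result = C(:) .* s;                     % value times the summed index products
```

With this data, the entry of result for the difference 2 is 2*(2*1 + 3*2) = 16.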
I'm not sure I've followed exactly but some points to consider:
unique will sort the data for you by default so you don't need to call sort first
unique actually has three outputs and you can recover your original vector (i.e. with duplicates) using the third output so
[C,~,ic] = unique(reshape(bsxfun(@minus, z', z),1,16))
now you can get back to bsxfun(@minus, z', z) by calling
reshape(C(ic), numel(z), numel(z))
You might be more interested in the second output of unique which tells you what index each unique value was at in your 1-by-16 vector. It really depends on what you're trying to do though. But with this you could get a list of row column pairs to match your unique values:
[rows, cols] = ndgrid(1:4);
coords = [rows(:), cols(:)];
[C, ia] = unique(reshape(bsxfun(@minus, z', z),1,16));
coords_pairs = coords(ia,:)
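which results in (this output reflects older MATLAB releases, where unique returned the last occurrence of each value by default; recent releases return the first occurrence unless 'last' or 'legacy' is specified)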
coords_pairs =
1 4
1 3
2 4
2 3
3 4
4 4
4 3
3 2
4 2
3 1
4 1
How can I efficiently combine a cell array of row vectors, column vectors and matrices of different sizes into one matrix, padding with 0?
For example, if I have
A= {[1;2;3] [1 2 ; 1 3; 2 3] [1 2 3]};
I'd like to get either:
A=[1 0 0
2 0 0
3 0 0
1 2 0
1 3 0
2 3 0
1 2 3]
You can simply use padarray (from the Image Processing Toolbox) to pad your arrays with zeros before concatenating them with vertcat:
B = padarray(A{1},[0 3-size(A{1},2)],'post')
C = padarray(A{2},[0 3-size(A{2},2)],'post')
D = padarray(A{3},[0 3-size(A{3},2)],'post')
%//Note the 3-size(A{1},2)... The 3 comes from the number of columns you want your final matrix to be, and it cannot be smaller than the maximum value of size(A{N},2) in your case it is 3, since A{3} is 3 columns wide.
result = vertcat (B,C,D)
result =
1 0 0
2 0 0
3 0 0
1 2 0
1 3 0
2 3 0
1 2 3
You can write a loop to iterate through your cell array, or use cellfun to do the padding without an explicit loop.
In a simple loop, it looks like:
result = [];
for t = 1:size(A,2)
B = padarray(A{t},[0 3-size(A{t},2)],'post');
result = vertcat(result,B);
end
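If padarray is not available, the same padding can be sketched with zeros and cellfun. This is an illustrative variant rather than the answer's own code; the names w and padded are introduced here, and the target width is taken from the widest cell instead of being hard-coded as 3.

```matlab
A = {[1;2;3] [1 2; 1 3; 2 3] [1 2 3]};

w = max(cellfun(@(x) size(x,2), A));    % width of the widest cell
% Append zero columns on the right of each cell so that all are w wide
padded = cellfun(@(x) [x, zeros(size(x,1), w - size(x,2))], A, ...
                 'UniformOutput', false);
result = vertcat(padded{:});            % stack the padded blocks vertically
```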
I have a matrix S in Matlab that looks like the following:
2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1
I would like to count patterns of values column-wise. I am interested in the frequency of the numbers that follow right after the number 3 in any of the columns. For instance, the number 3 occurs three times in the first column. The first time we observe it, it is followed by 3; the second time it is followed by 3 again; and the third time it is followed by 4. Thus, the frequencies for the patterns observed in the first column would look like:
3-3: 66.66%
3-4: 33.33%
3-1: 0%
3-2: 0%
To generate the output, you could use the convenient tabulate
S = [
2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1];
idx = find(S(1:end-1,:)==3);
S2 = S(2:end,:);
tabulate(S2(idx))
Value Count Percent
1 0 0.00%
2 0 0.00%
3 4 66.67%
4 2 33.33%
Here's one approach, finding the 3's then looking at the following digits
[i,j]=find(S==3);
k=i+1<=size(S,1);
T=S(sub2ind(size(S),i(k)+1,j(k))) %// the elements of S that are just below a 3
R=arrayfun(@(x) sum(T==x)./sum(k),1:max(S(:))).' %// relative frequency of each digit among the values below a 3
I'm going to restate your problem statement in a way that I can understand and my solution will reflect this new problem statement.
For a particular column, locate the locations that contain the number 3.
Look at the row immediately below these locations and look at the values at these locations
Take these values and tally up the total number of occurrences found.
Repeat these for all of the columns and update the tally, then determine the percentage of occurrences for the values.
We can do this by the following:
A = [2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1]; %// Define your matrix
[row,col] = find(A(1:end-1,:) == 3);
vals = A(sub2ind(size(A), row+1, col));
h = 100*accumarray(vals, 1) / numel(vals)
h =
0
0
66.6667
33.3333
Let's go through the above code slowly. The first few lines define your example matrix A. Next, we take a look at all of the rows except for the last row of your matrix and see where the number 3 is located with find. We skip the last row because we want to be sure we are within the bounds of your matrix. If there is a number 3 located at the last row, we would have undefined behaviour if we tried to check the values below the last because there's nothing there!
Once we do this, we take a look at those values in the matrix that are 1 row beneath those that have the number 3. We use sub2ind to help us facilitate this. Next, we use these values and tally them up using accumarray then normalize them by the total sum of the tallying into percentages.
The result would be a 4 element array that displays the percentages encountered per number.
To double check, if we look at the matrix, we see that a 3 is followed by another 3 a total of 4 times: below the 3s in the first column at rows 3 and 4, in the second column at row 2, and in the third column at row 6. A 3 is followed by a 4 two times: below the 3s in the first column at row 5 and in the second column at row 3.
In total, we have 6 numbers we counted, and so dividing by 6 gives us 4/6 or 66.67% for number 3 and 2/6 or 33.33% for number 4.
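One detail worth noting: accumarray sizes its output from the largest value in vals, so if the largest label never followed a 3, the result would be shorter than expected. A hedged sketch that passes the output size explicitly (nLabels, counts and pct are names introduced here):

```matlab
A = [2 2 1 2
     2 3 1 1
     3 3 1 1
     3 4 1 1
     3 1 2 1
     4 1 3 1
     1 1 3 1];

[row, col] = find(A(1:end-1,:) == 3);        % 3s that have a row below them
vals = A(sub2ind(size(A), row+1, col));      % values directly below each 3

nLabels = max(A(:));                         % number of distinct labels, 4 here
counts  = accumarray(vals, 1, [nLabels 1]);  % tally, always nLabels entries
pct     = 100 * counts / sum(counts);        % tallies as percentages
```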
If I got the problem statement correctly, you could efficiently implement this with MATLAB's logical indexing and an approach that is essentially of two lines -
%// Input 2D matrix
S = [
2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1]
Labels = (1:4)'; %// Label array
counts = histc(S([false(1,size(S,2)) ; S(1:end-1,:) == 3]),Labels)
Percentages = 100*counts./sum(counts)
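On newer MATLAB releases, histcounts can play the role of histc here. The following is a sketch under the assumption that the labels are the integers 1 to 4, so that bin edges at the half-integers isolate each label; mask and edges are names introduced for illustration.

```matlab
S = [2 2 1 2
     2 3 1 1
     3 3 1 1
     3 4 1 1
     3 1 2 1
     4 1 3 1
     1 1 3 1];

% True at every position that sits directly below a 3
mask  = [false(1,size(S,2)); S(1:end-1,:) == 3];
edges = 0.5:1:4.5;                       % one bin per integer label 1..4
counts = histcounts(S(mask), edges).';   % column vector of tallies
Percentages = 100*counts./sum(counts);
```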
Verify/Present results
The styles for presenting the output results listed next use MATLAB's table for a well human-readable format of data.
Style #1
>> table(Labels,Percentages)
ans =
Labels Percentages
______ ___________
1 0
2 0
3 66.667
4 33.333
Style #2
You can do some fancy string operations to present the results in a more "representative" manner -
>> Labels_3 = strcat('3-',cellstr(num2str(Labels','%1d')'));
>> table(Labels_3,Percentages)
ans =
Labels_3 Percentages
________ ___________
'3-1' 0
'3-2' 0
'3-3' 66.667
'3-4' 33.333
Style #3
If you want to present them in descending sorted manner based on the percentages as listed in the expected output section of the question, you can do so with an additional step using sort -
>> [Percentages,idx] = sort(Percentages,'descend');
>> Labels_3 = strcat('3-',cellstr(num2str(Labels(idx)','%1d')'));
>> table(Labels_3,Percentages)
ans =
Labels_3 Percentages
________ ___________
'3-3' 66.667
'3-4' 33.333
'3-1' 0
'3-2' 0
Bonus Stuff: Finding frequency (counts) for all cases
Now, let's suppose you would like repeat this process for say 1, 2 and 4 as well, i.e. find occurrences after 1, 2 and 4 respectively. In that case, you can iterate the above steps for all cases and for the same you can use arrayfun -
%// Get counts
C = cell2mat(arrayfun(@(n) histc(S([false(1,size(S,2)) ; S(1:end-1,:) == n]),...
1:4),1:4,'Uni',0))
%// Get percentages
Percentages = 100*bsxfun(@rdivide, C, sum(C,1))
Giving us -
Percentages =
90.9091 20.0000 0 100.0000
9.0909 20.0000 0 0
0 60.0000 66.6667 0
0 0 33.3333 0
Thus, in Percentages, the first column holds the percentages of [1,2,3,4] that occur right after a 1 somewhere in the input matrix. As an example, column 3 of Percentages is what you had in the sample output when looking for elements right after 3 in the input matrix.
If you want to compute frequencies independently for each column:
S = [2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1]; %// data: matrix
N = 3; %// data: number
r = max(S(:));
[R, C] = size(S);
[ii, jj] = find(S(1:end-1,:)==N); %// step 1
count = full(sparse(S(ii+1+(jj-1)*R), jj, 1, r, C)); %// step 2
result = bsxfun(@rdivide, count, sum(S(1:end-1,:)==N)); %// step 3
This works as follows:
1. find is first applied to determine row and col indices of occurrences of N in S except its last row.
2. The values in the entries right below the indices of step 1 are accumulated for each column, in variable count. The very convenient sparse function is used for this purpose. Note that this uses linear indexing into S.
3. To obtain the frequencies for each column, count is divided (with bsxfun) by the number of occurrences of N in each column.
The result in this example is
result =
0 0 0 NaN
0 0 0 NaN
0.6667 0.5000 1.0000 NaN
0.3333 0.5000 0 NaN
Note that the last column correctly contains NaNs because the frequency of the sought patterns is undefined for that column.
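For comparison, the same per-column frequencies can be sketched with an explicit loop, which may be easier to follow than the sparse one-liner. This is an illustrative rewrite, not the answer's code; nCols and idx are names introduced here.

```matlab
S = [2 2 1 2
     2 3 1 1
     3 3 1 1
     3 4 1 1
     3 1 2 1
     4 1 3 1
     1 1 3 1];
N = 3;                                   % number whose successors are counted
r = max(S(:));                           % number of labels
nCols = size(S, 2);

result = nan(r, nCols);                  % NaN where N never occurs in a column
for c = 1:nCols
    idx = find(S(1:end-1,c) == N);       % rows holding N, last row excluded
    if ~isempty(idx)
        % Tally the values right below each N and normalise by their number
        result(:,c) = accumarray(S(idx+1,c), 1, [r 1]) / numel(idx);
    end
end
```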
I have two matrices and I want to find the indices of rows in Matrix B which have the same row values in Matrix A. Let me give a simple example:
A=[1,2,3; 2,3,4; 3,5,7; 1,2,3; 1,2,3; 5,8,6];
B=[1,2,3; 29,3,4; 3,59,7; 1,29,3; 1,2,3; 5,8,6;1,2,3];
For example, for the first row of matrix A, rows 1, 5, and 7 of matrix B are matches.
I have written the code below, but it doesn't return all indices that have the same row values as in matrix A; only one of them (row 7) is returned:
A_sorted = sort(A,2,'descend'); % sorting angles
B_sorted = sort(B,2,'descend'); % sorting angles
[~,indx]=ismember(A_sorted,B_sorted,'rows')
the result is
indx =
7
0
0
7
7
6
This means that for the first row of matrix A, only one row (row 7) of matrix B is reported. But as you can see, the first row of matrix A has three matching rows in matrix B (rows 1, 5, and 7).
I think the best strategy is to apply ismember to unique rows
%make matrix unique
[B_unique,B2,B3]=unique(B_sorted,'rows')
[~,indx]=ismember(A_sorted,B_unique,'rows')
%For each row in B_unique, get the corresponding indices in B_sorted
indx2=arrayfun(@(x)find(B3==x),indx,'uni',0)
If you want to compare all pairs of rows between A and B, use
E = squeeze(all(bsxfun(@eq, A, permute(B, [3 2 1])), 2));
or equivalently
E = pdist2(A,B)==0;
In your example, this gives
E =
1 0 0 0 1 0 1
0 0 0 0 0 0 0
0 0 0 0 0 0 0
1 0 0 0 1 0 1
1 0 0 0 1 0 1
0 0 0 0 0 1 0
The value E(ia,ib) tells you if the ia-th row of A equals the ib-th row of B.
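To turn E into the list of matching rows of B for each row of A, as the question asks, one option is to apply find to each row of E. A minimal sketch; the cell array matches is a name introduced here.

```matlab
A = [1,2,3; 2,3,4; 3,5,7; 1,2,3; 1,2,3; 5,8,6];
B = [1,2,3; 29,3,4; 3,59,7; 1,29,3; 1,2,3; 5,8,6; 1,2,3];

% E(ia,ib) is true when row ia of A equals row ib of B
E = squeeze(all(bsxfun(@eq, A, permute(B, [3 2 1])), 2));

% matches{ia} lists every row of B equal to row ia of A (empty if none)
matches = arrayfun(@(ia) find(E(ia,:)), 1:size(A,1), 'UniformOutput', false);
```

For the first row of A, matches{1} contains 1, 5 and 7.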