Consider the following the matrix:
1 2 1 1
1 3 1 1
2 5 2 3
2 6 2 4
2 6 2 4
2 9 0 0
3 4 5 6
3 4 1 1
3 2 0 0
3 1 1 1
.
.
.
I want to select the row(s) with the maximum value in column 2 for every unique value in column 1.
For eg.
Answer should be:
1 3 1 1
2 9 0 0
3 4 5 6
3 4 1 1
Any ideas?
Here's one way:
%// Get a unique list of column 1 without changing the order in which they appear
[C1, ~, subs] = unique(M(:,1), 'stable');
%// Get the max value from column 2 corresponding to each unique value of column 1
C2 = accumarray(subs, M(:,2), [], #max);
%// Find the desired row indices
I = ismember(M(:,1:2), [C1, C2], 'rows');
%//Extract the rows
M(I, :)
Some code to get you started:
% unique values in first column
col1 = unique(x(:,1));
% we first store results in a cell array (later converted to matrix)
xx = cell(numel(col1), 1);
for i=1:numel(col1)
% rows with the same value in column 1
rows = x(x(:,1) == col1(i),:);
% maximum value along column 2
mx = max(rows(:,2));
% store all rows with the max value (in case of ties)
xx{i} = rows(rows(:,2)==mx,:);
end
% combine all resulting rows
xx = vertcat(xx{:});
The result for the matrix you've shown:
>> xx
xx =
1 3 1 1
2 9 0 0
3 4 5 6
3 4 1 1
Related
I created a cell array of shape m x 2, each element of which is a matrix of shape d x d.
For example like this:
A = cell(8, 2);
for row = 1:8
for col = 1:2
A{row, col} = rand(3, 3);
end
end
More generally, I can represent A as follows:
where each A_{ij} is a matrix.
Now, I need to randomly pick a matrix from each row of A, because A has m rows in total, so eventually I will pick out m matrices, which we call a combination.
Obviously, since there are only two picks for each row, there are a total of 2^m possible combinations.
My question is, how to get these 2^m combinations quickly?
It can be seen that the above problem is actually finding the Cartesian product of the following sets:
2^m is actually a binary number, so we can use those to create linear indices. You'll get an array containing 1s and 0s, something like [1 1 0 0 1 0 1 0 1], which we can treat as column "indices", using a 0 to indicate the first column and a 1 to indicate the second.
m = size(A, 1);
% Build all binary numbers and create a logical matrix
bin_idx = dec2bin(0:(2^m -1)) == '1';
row = 3; % Loop here over size(bin_idx,1) for all possible permutations
linear_idx = [find(~bin_idx(row,:)) find(bin_idx(row,:))+m];
A{linear_idx} % the combination as specified by the permutation in out(row)
On my R2007b version this runs virtually instant for m = 20.
NB: this will take m * 2^m bytes of memory to store bin_idx. Where that's just 20 MB for m = 20, that's already 30 GB for m = 30, i.e. you'll be running out of memory fairly quickly, and that's for just storing permutations as booleans! If m is large in your case, you can't store all of your possibilities anyway, so I'd just select a random one:
bin_idx = rand(m, 1); % Generate m random numbers
bin_idx(bin_idx > 0.5) = 1; % Set half to 1
bin_idx(bin_idx < 0.5) = 0; % and half to 0
Old, slow answer for large m
perms()1 gives you all possible permutations of a given set. However, it does not take duplicate entries into account, so you'll need to call unique() to get the unique rows.
unique(perms([1,1,2,2]), 'rows')
ans =
1 1 2 2
1 2 1 2
1 2 2 1
2 1 1 2
2 1 2 1
2 2 1 1
The only thing left now is to somehow do this over all possible amounts of 1s and 2s. I suggest using a simple loop:
m = 5;
out = [];
for ii = 1:m
my_tmp = ones(m,1);
my_tmp(ii:end) = 2;
out = [out; unique(perms(my_tmp),'rows')];
end
out = [out; ones(1,m)]; % Tack on the missing all-ones row
out =
2 2 2 2 2
1 2 2 2 2
2 1 2 2 2
2 2 1 2 2
2 2 2 1 2
2 2 2 2 1
1 1 2 2 2
1 2 1 2 2
1 2 2 1 2
1 2 2 2 1
2 1 1 2 2
2 1 2 1 2
2 1 2 2 1
2 2 1 1 2
2 2 1 2 1
2 2 2 1 1
1 1 1 2 2
1 1 2 1 2
1 1 2 2 1
1 2 1 1 2
1 2 1 2 1
1 2 2 1 1
2 1 1 1 2
2 1 1 2 1
2 1 2 1 1
2 2 1 1 1
1 1 1 1 2
1 1 1 2 1
1 1 2 1 1
1 2 1 1 1
2 1 1 1 1
1 1 1 1 1
NB: I've not initialised out, which will be slow especially for large m. Of course out = zeros(2^m, m) will be its final size, but you'll need to juggle the indices within the for loop to account for the changing sizes of the unique permutations.
You can create linear indices from out using find()
linear_idx = [find(out(row,:)==1);find(out(row,:)==2)+size(A,1)];
A{linear_idx} % the combination as specified by the permutation in out(row)
Linear indices are row-major in MATLAB, thus whenever you need the matrix in column 1, simply use its row number and whenever you need the second column, use the row number + size(A,1), i.e. the total number of rows.
Combining everything together:
A = cell(8, 2);
for row = 1:8
for col = 1:2
A{row, col} = rand(3, 3);
end
end
m = size(A,1);
out = [];
for ii = 1:m
my_tmp = ones(m,1);
my_tmp(ii:end) = 2;
out = [out; unique(perms(my_tmp),'rows')];
end
out = [out; ones(1,m)];
row = 3; % Loop here over size(out,1) for all possible permutations
linear_idx = [find(out(row,:)==1).';find(out(row,:)==2).'+m];
A{linear_idx} % the combination as specified by the permutation in out(row)
1 There's a note in the documentation:
perms(v) is practical when length(v) is less than about 10.
I have a matrix with a shape like below. I want to delete rows with duplicate values in the first column and leaving row with the smallest number of duplicate values in the second column. my matrix
`d =
1 1
2 1
4 1
8 2
2 2
5 4
2 4
6 4
7 3
`
I want to remove duplicate number 2 in the first column and leaving the row with the smallest number of duplicate values in the second row
result required:
1 1
4 1
8 2
2 2
5 4
6 4
7 3
Thanks for the helps. best regard.
We can sort the array regards to first column and replace elements of second column by their descending count
to obtain this array:
1 3
2 3
2 3
2 2
4 3
5 3
6 3
7 1
8 2
Then if we apply unique to this array indices of desirable rows can be obtained and then then those rows can be extracted:
1 1
2 2
4 1
5 4
6 4
7 3
8 2
If oreder of original data should be preserved more step required that commented in the code.
a=[...
1 1
2 1
4 1
8 2
2 2
5 4
2 4
6 4
7 3];
%steps to replace counts of each element of column2 with it
[a2_sorted, i_a2_sorted] = sort(a(:,2));
[a2_sorted_unique, i_a2_sorted_unique] = unique(a2_sorted);
h = hist(a2_sorted, a2_sorted_unique);
%count = repelems(h, [1:numel(h); h]);%octave
count = repelem(h, h);
[~,a2_back_idx] = sort(i_a2_sorted);
count = count (a2_back_idx);
b = [a(:,1) , count.'];
%counts shoule be sorted in descending order
%because the unique function extracts last element from each category
[b_sorted i_b_sorted] =sortrows(b,[1 -2]);
[~, i_b1_sorted_unique] = unique(b_sorted(:,1));
c = [b_sorted(:,1) , a(i_b_sorted,2)];
out = c(i_b1_sorted_unique,:)
%more steps to recover the original order
[~,b_back_idx] = sort(i_b_sorted);
idx_logic = false(1,size(a,1));
idx_logic(i_b1_sorted_unique) = true;
idx_logic = idx_logic(b_back_idx);
out = c(b_back_idx(idx_logic),:)
Create a function that finds the minimal duplicate from the right column, given an index from the left column:
function Out = getMinDuplicate (Index, Data)
Candidates = Data(Data(:,1) == Index, :); Candidates = Candidates(:, 2);
Hist = histc (Data(:,2), [1 : max(Data(:,2))]);
[~,Out] = min (Hist(Candidates)); Out = Candidates(Out);
end
Call this function for all unique values in column 1:
>> [unique(d(:,1)), arrayfun(#(x) getMinDuplicate(x, d), unique(d(:,1)))]
ans =
1 1
2 2
4 1
5 4
6 4
7 3
8 2
(where d is your data array).
I think the question is pretty basic, but still keeps me busy since some time.
Lets assume we have a vector containing 4 integers randomly repetetive, like:
v = [ 1 3 3 3 4 2 1 2 3 4 3 2 1 4 3 3 4 2 2]
I am searching for the vector of all positions of each integer, e.g. for 1 it should be a vector like:
position_one = [1 7 13]
Since I want to search every row of a 100x10000 matrix I was not able to deal with linear indeces.
Thanks in advance!
Rows and columns
Since your output for every integer changes, a cell array will fit the whole task. For the whole matrix, you can do something like:
A = randi(4,10,30); % some data
Row = repmat((1:size(A,1)).',1,size(A,2)); % index of the row
Col = repmat((1:size(A,2)),size(A,1),1); % index of the column
pos = #(n) [Row(A==n) Col(A==n)]; % Anonymous function to find the indices of 'n'
than for every n you can write:
>> pos(3)
ans =
1 1
2 1
5 1
6 1
9 1
8 2
3 3
. .
. .
. .
where the first column is the row, and the second is the column for every instance of n in A.
And for all ns you can use an arrayfun:
positions = arrayfun(pos,1:max(A(:)),'UniformOutput',false) % a loop that goes over all n's
or a simple for loop (faster):
positions = cell(1,max(A(:)));
for n = 1:max(A(:))
positions(n) = {pos(n)};
end
The output in both cases would be a cell array:
positions =
[70x2 double] [78x2 double] [76x2 double] [76x2 double]
and for every n you can write positions{n}, to get for example:
>> positions{1}
ans =
10 1
2 3
5 3
3 4
5 4
1 5
4 5
. .
. .
. .
Only rows
If all you want in the column index per a given row and n, you can write this:
A = randi(4,10,30);
row_pos = #(k,n) A(k,:)==n;
positions = false(size(A,1),max(A(:)),size(A,2));
for n = 1:max(A(:))
positions(:,n,:) = row_pos(1:size(A,1),n);
end
now, positions is a logical 3-D array, that every row corresponds to a row in A, every column corresponds to a value of n, and the third dimension is the presence vector for the combination of row and n. this way, we can define R to be the column index:
R = 1:size(A,2);
and then find the relevant positions for a given row and n. For instance, the column indices of n=3 in row 9 is:
>> R(positions(9,3,:))
ans =
2 6 18 19 23 24 26 27
this would be just like calling find(A(9,:)==3), but if you need to perform this many times, the finding all indices and store them in positions (which is logical so it is not so big) would be faster.
Find linear indexes in a matrix: I = find(A == 1).
Find two dimensional indexes in matrix A: [row, col] = find(A == 1).
%Create sample matrix, with elements equal one:
A = zeros(5, 4);
A([2 10 14]) = 1
A =
0 0 0 0
1 0 0 0
0 0 0 0
0 0 1 0
0 1 0 0
Find ones as linear indexes in A:
find(A == 1)
ans =
2
10
14
%This is the same as reshaping A to a vector and find ones in the vector:
B = A(:);
find(B == 1);
B' =
0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0
Find two dimensional indexes:
[row, col] = find(A == 1)
row =
2
5
4
col =
1
2
3
You can do that with accumarray using an anonymous function as follows:
positions = accumarray(v(:), 1:numel(v), [], #(x) {sort(x.')});
For
v = [ 1 3 3 3 4 2 1 2 3 4 3 2 1 4 3 3 4 2 2];
this gives
positions{1} =
1 7 13
positions{2} =
6 8 12 18 19
positions{3} =
2 3 4 9 11 15 16
positions{4} =
5 10 14 17
I want to do the following:
I create a matrix with all possible permutations from 1:n, for example
n=4;
L=perms(1:n)';
I get as output as expected a 4-by-24 matrix:
L =
Columns 1 through 13
4 4 4 4 4 4 3 3 3 3 3 3
3 3 2 2 1 1 4 4 2 2 1 1
2 1 3 1 2 3 2 1 4 1 2 4
1 2 1 3 3 2 1 2 1 4 4 2
Columns 14 through 24
2 2 2 2 2 1 1 1 1 1 1
3 4 4 1 1 3 3 2 2 4 4
1 3 1 4 3 2 4 3 4 2 3
4 1 3 3 4 4 2 4 3 3 2
Now I want to use this matrix for the indexes of a for loop:
Using the first column, I want to feed the input of my loop the following indexes: i=4 j=3,2,1. Then for i=3 j=2,1. Then for i=2 j=1. i=1 is empty
This could be done just for the first column like this:
for u=4:-1:1
for v=u-1:-1:1
But will not work for other columns so I need to do the same but with the entries of matrix L, something like (it doesn't work in MATLB) for column i=1:
u=L(1:4,1)
v=L(u:L(4,1) , 1) %// where u corresponds to L(1,1) then L(2,1) then L(3,1)
(for all the columns it would look like:
for i=1:length(L)
for u=L(4*(i-1)+1:4*i)
for v=.. ?
)
This doesn't work because MATLAB takes the values of the entries and when I write L(1,1):L(4,1) it doesn't mean return the entries from line one to line four but rather all the numbers with increment 1 from the value of L(1,1) to the value of L(4,1) (here empty).
Any ideas ? thank you very much in advance
I believe something like this will solve you problem.
for col = 1:size(L,2)
rowIdx = 1;
for j = [L(:,col)]'
for k = [L(rowIdx:end,col)]'
% Do your stuff here
end
rowIdx = rowIdx + 1;
end
end
Notice how I use the values from columns from L directly as loop index variable. In a for loop statement you can basically write any row vector and the index takes those values. For example
for i = [1, 7, 11, 14, 23]
disp(i); % prints 1,7,11,14,23
end
This is true for arrays of objects, cell arrays, basically any single row matrix.
You can do it like this:
for col = 1:size(L, 2)
for I = 1:n-1
for J = I:n
i = L(I,col);
j = L(J,col);
%// As an example just print out the loop variable values
disp(sprintf('Col:%d\ti:%d\tj:%d\r\n',col,i,j))
end
end
end
So here's my problem i want to count the numbers of same values in a column, this my data:
a b d
2 1 5
1 3 10
1 -2 5
0 5 25
5 0 25
1 1 2
-1 -1 2
i want to count the same values of d (where d = a^2 + b^2), so this is the output i want:
a b d count
2 1 5 2
1 3 10 1
0 5 25 2
1 1 2 2
so as you can see, only positive combinations of a and b displayed. so how can i do that? thanks.
Assuming your data is a matrix, here's an accumarray-based approach. Note this doesn't address the requirement "only positive combinations of a and b displayed".
M = [2 1 5
1 3 10
1 -2 5
0 5 25
5 0 25
1 1 2
-1 -1 2]; %// data
[~, jj, kk] = unique(M(:,3),'stable');
s = accumarray(kk,1);
result = [M(jj,:) s];
Assuming your input data to be stored in a 2D array, this could be one approach -
%// Input
A =[
2 1 5
1 3 10
1 -2 5
0 5 25
5 0 25
1 1 2
-1 -1 2]
[unqcol3,unqidx,allidx] = unique(A(:,3),'stable')
counts = sum(bsxfun(#eq,A(:,3),unqcol3.'),1) %//'
out =[A(unqidx,:) counts(:)]
You can also get the counts with histc -
counts = histc(allidx,1:max(allidx))
Note on positive combinations of a and b: If you are looking to have positive combinations of A and B, you can select only those rows from A that fulfill this requirement and save back into A as a pre-processing step -
A = A(all(A(:,1:2)>=0,2),:)