split a matrix according to a column with matlab. - matlab

A = [1,4,2,5,10
2,4,5,6,2
2,1,5,6,10
2,3,5,4,2]
And I want split it into two matrix by the last column
A ->B and C
B = [1,4,2,5,10
2,1,5,6,10]
C = [2,4,5,6,2
2,3,5,4,2]
Also, this method could be applied to a big matrix, like matrix 100*22 according to the last column value into 9 groups by matlab.

Use logical indexing
B=A(A(:,end)==10,:);
C=A(A(:,end)==2,:);
returns
>> B
B =
1 4 2 5 10
2 1 5 6 10
>> C
C =
2 4 5 6 2
2 3 5 4 2
EDIT: In reply to Dan's comment here is the extension for general case
e = unique(A(:,end));
B = cell(size(e));
for k = 1:numel(e)
B{k} = A(A(:,end)==e(k),:);
end
or more compact way
B=arrayfun(#(x) A(A(:,end)==x,:), unique(A(:,end)), 'UniformOutput', false);
so for
A =
1 4 2 5 10
2 4 5 6 2
2 1 5 6 10
2 3 5 4 2
0 3 1 4 9
1 3 4 5 1
1 0 4 5 9
1 2 4 3 1
you get the matrices in elements of cell array B
>> B{1}
ans =
1 3 4 5 1
1 2 4 3 1
>> B{2}
ans =
2 4 5 6 2
2 3 5 4 2
>> B{3}
ans =
0 3 1 4 9
1 0 4 5 9
>> B{4}
ans =
1 4 2 5 10
2 1 5 6 10

Here is a general approach which will work on any number of numbers in the last column on any sized matrix:
A = [1,4,2,5,10
2,4,5,6,2
1,1,1,1,1
2,1,5,6,10
2,3,5,4,2
0,0,0,0,2];
First sort by the last column (many ways to do this, don't know if this is the best or not)
[~, order] = sort(A(:,end));
As = A(order,:);
Then create a vector of how many rows of the same number appear in that last col (i.e. how many rows per group)
rowDist = diff(find([1; diff(As(:, end)); 1]));
Note that for my example data rowDist will equal [1 3 2] as there is 1 1, 3 2s and 2 10s.
Now use mat2cell to split by these row groupings:
Ac = mat2cell(As, rowDist);
If you really want to you can now split it into separate matrices (but I doubt you would)
Ac{:}
results in
ans =
1 1 1 1 1
ans =
0 0 0 0 2
2 3 5 4 2
2 4 5 6 2
ans =
1 4 2 5 10
2 1 5 6 10
But I think you would find Ac itself more useful
EDIT:
Many solutions so might as well do a time comparison:
A = [...
1 4 2 5 10
2 4 5 6 2
2 1 5 6 10
2 3 5 4 2
0 3 1 4 9
1 3 4 5 3
1 0 4 5 9
1 2 4 3 1];
A = repmat(A, 1000, 1);
tic
for l = 1:100
[~, y] = sort(A(:,end));
As = A(y,:);
rowDist = diff(find([1; diff(As(:, end)); 1]));
Ac = mat2cell(As, rowDist);
end
toc
tic
for l = 1:100
D=arrayfun(#(x) A(A(:,end)==x,:), unique(A(:,end)), 'UniformOutput', false);
end
toc
tic
for l = 1:100
for k = 1:numel(e)
B{k} = A(A(:,end)==e(k),:);
end
end
toc
tic
for l = 1:100
Bb = sort(A(:,end));
[~,b] = histc(A(:,end), Bb([diff(Bb)>0;true]));
C = accumarray(b, (1:size(A,1))', [], #(r) {A(r,:)} );
end
toc
resulted in
Elapsed time is 0.053452 seconds.
Elapsed time is 0.17017 seconds.
Elapsed time is 0.004081 seconds.
Elapsed time is 0.22069 seconds.
So for even for a large matrix the loop method is still the fastest.

Use accumarray in combination with histc:
% Example data (from Mohsen Nosratinia)
A = [...
1 4 2 5 10
2 4 5 6 2
2 1 5 6 10
2 3 5 4 2
0 3 1 4 9
1 3 4 5 1
1 0 4 5 9
1 2 4 3 1];
% Get the proper indices to the specific rows
B = sort(A(:,end));
[~,b] = histc(A(:,end), B([diff(B)>0;true]));
% Collect all specific rows in their specific groups
C = accumarray(b, (1:size(A,1))', [], #(r) {A(r,:)} );
Results:
>> C{:}
ans =
1 3 4 5 1
1 2 4 3 1
ans =
2 3 5 4 2
2 4 5 6 2
ans =
0 3 1 4 9
1 0 4 5 9
ans =
2 1 5 6 10
1 4 2 5 10
Note that
B = sort(A(:,end));
[~,b] = histc(A(:,end), B([diff(B)>0;true]));
can also be written as
[~,b] = histc(A(:,end), unique(A(:,end)));
but unique is not built-in and is therefore likely to be slower, especially when this is all used in a loop.
Note also that the order of the rows has changed w.r.t. the order they had in the original matrix. If the order matters, you'll have to throw in another sort:
C = accumarray(b, (1:size(A,1))', [], #(r) {A(sort(r),:)} );

Related

Matlab: How to enumerate the possible ways of forming pairs from a list

Suppose I have a list of length 2k, say {1,2,...,2k}. The number of possible ways of grouping the 2k numbers into k (unordered) pairs is n(k) = 1*3* ... *(2k-1). So for k=2, we have the following three different ways of forming 2 pairs
(1 2)(3 4)
(1 3)(2 4)
(1 4)(2 3)
How can I use Matlab to create the above list, i.e., create a matrix of n(k)*(2k) such that each row contains a different way of grouping the list of 2k numbers into k pairs.
clear
k = 3;
set = 1: 2*k;
p = perms(set); % get all possible permutations
% sort each two column
[~, col] = size(p);
for i = 1: 2: col
p(:, i:i+1) = sort(p(:,i:i+1), 2);
end
p = unique(p, 'rows'); % remove the same row
% sort each row
[row, col] = size(p);
for i = 1: row
temp = reshape(p(i,:), 2, col/2)';
temp = sortrows(temp, 1);
p(i,:) = reshape(temp', 1, col);
end
pairs = unique(p, 'rows'); % remove the same row
pairs =
1 2 3 4 5 6
1 2 3 5 4 6
1 2 3 6 4 5
1 3 2 4 5 6
1 3 2 5 4 6
1 3 2 6 4 5
1 4 2 3 5 6
1 4 2 5 3 6
1 4 2 6 3 5
1 5 2 3 4 6
1 5 2 4 3 6
1 5 2 6 3 4
1 6 2 3 4 5
1 6 2 4 3 5
1 6 2 5 3 4
As someone think my former answer is not useful, i post this.
I have the following brute force way of enumerating the pairs. Not particularly efficient. It can also cause memory problem when k>9. In that case, I can just enumerate but not create Z and store the result in it.
function Z = pair2(k)
count = [2*k-1:-2:3];
tcount = prod(count);
Z = zeros(tcount,2*k);
x = [ones(1,k-2) 0];
z = zeros(1,2*k);
for i=1:tcount
for j=k-1:-1:1
if x(j)<count(j)
x(j) = x(j)+1;
break
end
x(j) = 1;
end
y = [1:2*k];
for j=1:k-1
z(2*j-1) = y(1);
z(2*j) = y(x(j)+1);
y([1 x(j)+1]) = [];
end
z(2*k-1:2*k) = y;
Z(i,:) = z;
end
k = 3;
set = 1: 2*k;
combos = combntns(set, k);
[len, ~] = size(combos);
pairs = [combos(1:len/2,:) flip(combos(len/2+1:end,:))];
pairs =
1 2 3 4 5 6
1 2 4 3 5 6
1 2 5 3 4 6
1 2 6 3 4 5
1 3 4 2 5 6
1 3 5 2 4 6
1 3 6 2 4 5
1 4 5 2 3 6
1 4 6 2 3 5
1 5 6 2 3 4
You can also use nchoosek instead of combntns. See more at combntns or nchoosek

Count repeating integers in an array

If I have this vector:
x = [1 1 1 1 1 2 2 2 3 4 4 6 6 6 6]
I would like to get the position of each unique number according to itself.
y = [1 2 3 4 5 1 2 3 1 1 2 1 2 3 4]
At the moment I'm using:
y = sum(triu(x==x.')) % MATLAB 2016b and above
It's compact but obviously not memory efficient.
For the pure beauty of MATLAB programming I would avoid using a loop. Do you have a better simple implementation ?
Context:
My final goal is to sort the vector x but with the constraint that a number that appear N times has the priority over another number that has appeared more than N times:
[~,ind] = sort(y);
x_relative_sort = x(ind);
% x_relative_sort = 1 2 3 4 6 1 2 4 6 1 2 6 1 6 1
Assuming x is sorted, here's one vectorized alternative using unique, diff, and cumsum:
[~, index] = unique(x);
y = ones(size(x));
y(index(2:end)) = y(index(2:end))-diff(index).';
y = cumsum(y);
And now you can apply your final sorting:
>> [~, ind] = sort(y);
>> x_relative_sort = x(ind)
x_relative_sort =
1 2 3 4 6 1 2 4 6 1 2 6 1 6 1
If you have positive integers you can use sparse matrix:
[y ,~] = find(sort(sparse(1:numel(x), x, true), 1, 'descend'));
Likewise x_relative_sort can directly be computed:
[x_relative_sort ,~] = find(sort(sparse(x ,1:numel(x),true), 2, 'descend'));
Just for variety, here's a solution based on accumarray. It works for x sorted and containing positive integers, as in the question:
y = cell2mat(accumarray(x(:), x(:), [], #(t){1:numel(t)}).');
You can be more memory efficient by only comparing to unique(x), so you don't have a large N*N matrix but rather N*M, where N=numel(x), M=numel(unique(x)).
I've used an anonymous function syntax to avoid declaring an intermediate matrix variable, needed as it's used twice - this can probably be improved.
f = #(X) sum(cumsum(X,2).*X); y = f(unique(x).'==x);
Here's my solution that doesn't require sorting:
x = [1 1 1 1 1 2 2 2 3 4 4 6 6 6 6 1 1 1];
y = cell2mat( splitapply(#(v){cumsum(v)},x,cumsum(logical([1 diff(x)]))) ) ./ x;
Explanation:
% Turn each group new into a unique number:
t1 = cumsum(logical([1 diff(x)]));
% x = [1 1 1 1 1 2 2 2 3 4 4 6 6 6 6 1 1 1];
% t1 = [1 1 1 1 1 2 2 2 3 4 4 5 5 5 5 6 6 6];
% Apply cumsum separately to each group:
t2 = cell2mat( splitapply(#(v){cumsum(v)},x,t1) );
% t1 = [1 1 1 1 1 2 2 2 3 4 4 5 5 5 5 6 6 6];
% t2 = [1 2 3 4 5 2 4 6 3 4 8 6 12 18 24 1 2 3];
% Finally, divide by x to get the increasing values:
y = t2 ./ x;
% x = [1 1 1 1 1 2 2 2 3 4 4 6 6 6 6 1 1 1];
% t2 = [1 2 3 4 5 2 4 6 3 4 8 6 12 18 24 1 2 3];

Find the rows of a matrix with conditions concerning the values of certain columns in matlab

As the title says, I want to find all rows in a Matlab matrix that in certain columns the values in the row are equal with the values in the previous row, or in general, equal in some row in the matrix. For example I have a matrix
1 2 3 4
1 2 8 10
4 5 7 9
2 3 6 4
1 2 4 7
and I want to find the following rows:
1 2 3 4
1 2 3 10
1 2 4 7
How do I do something like that and how do I do it generally for all the possible pairs in columns 1 and 2, and have equal values in previous rows, that exist in the matrix?
Here's a start to see if we're headed in the right direction:
>> M = [1 2 3 4;
1 2 8 10;
4 5 7 9;
2 3 6 4;
1 2 4 7];
>> N = M; %// copy M into a new matrix so we can modify it
>> idx = ismember(N(:,1:2), N(1,1:2), 'rows')
idx =
1
1
0
0
1
>> N(idx, :)
ans =
1 2 3 4
1 2 8 10
1 2 4 7
Then you can remove those rows from the original matrix and repeat.
>> N = N(~idx,:)
N =
4 5 7 9
2 3 6 4
this will give you the results
data1 =[1 2 3 4
1 2 8 10
4 5 7 9
2 3 6 4
1 2 4 7];
data2 = [1 2 3 4
1 2 3 10
1 2 4 7];
[exists,position] = ismember(data1,data2, 'rows')
where the exists vector tells you wheter the row is on the other matrix and position gives you the position...
a less elegant and simpler version would be
array_data1 = reshape (data1',[],1);
array_data2 = reshape (data2',[],1);
matchmatrix = zeros(size(data2,1),size(data1,1));
for irow1 = 1: size(data2,1)
for irow2 = 1: size(data1,1)
matchmatrix(irow1,irow2) = min(data2(irow1,:) == data1(irow2,:))~= 0;
end
end
the matchmatrix is to read as a connectivity matrix where value of 1 indicates which row of data1 matches with which row of data2

Find how many times a pair of values occurs

Suppose there are 2 vectors (i,j), A(10,1), B(10,1) which have obtain random values from [1,10] interval. eg.
A = [1 6 1 10 1 7 1 9 3 6]
B = [7 2 3 5 6 8 7 9 10 2].
I am interested in creating a new vector which will count how many values with the same i index occur. e.g.
1 and 7 ⇒ 2 occurrences
6 and 2 ⇒ 2 occurrences
1 and 3 ⇒ 1 occurrence
10 and 5 ⇒ 1 occurrence
1 and 6 ⇒ 1 occurrence
...
etc.
So that a final array/vector C occurs, with all the possible pairs and their counted occurrence with size C(number_of_pairs,2).
Use accumarray and then find:
A = [1 6 1 10 1 7 1 9 3 6];
B = [7 2 3 5 6 8 7 9 10 2]; %// data
aa = accumarray([A(:) B(:)], 1); %// how many times each pair occurs
[ii jj vv] = find(aa);
C = [ii jj vv]; %// only pairs which occurr at least once
In your example, this gives
C =
6 2 2
1 3 1
10 5 1
1 6 1
1 7 2
7 8 1
9 9 1
3 10 1
Or perhaps aa or vv are what you need; I'm not sure about what your desired output is.
Another possible approach, inspired by #Rody's answer:
mat = [A(:) B(:)];
[bb ii jj] = unique(mat, 'rows');
C = [mat(ii,:) accumarray(jj,1)];
Something like this?
A = [1 6 1 10 1 7 1 9 3 6];
B = [7 2 3 5 6 8 7 9 10 2];
%// Find the unique pairs
AB = [A;B].';
ABu = unique(AB, 'rows');
%// Count the number of occurrences of each unique pair
%// (pretty sure there's a better way to do this...)
C = arrayfun(#(ii) ...
[ABu(ii,:) sum(all(bsxfun(#eq, AB, ABu(ii,:)),2))], 1:size(ABu,1), ...
'UniformOutput', false);
C = cat(1,C{:});
Result:
C =
1 3 1
1 6 1
1 7 2
3 10 1
6 2 2
7 8 1
9 9 1
10 5 1
Something like this?
C = zeros(10);
for k = 1:10
C(A(k), B(k)) = C(A(k), B(k)) + 1
end

How do I make sums of sumatrixes in MATLAB without cycle?

I have a large matrix (time x frequency), which I want to reduce partially. I want to sum every 1000 rows (time-samples) together keepinq the frequency information, it is kind of a segmentation.
Is there any way to do it without any cycle in MATLAB?
A smaller example:
M=[1 2 3; 2 3 4; 5 8 7; 5 6 7; 1 2 3; 1 2 4];
and I want to sum every 2 rows together so, that I get:
[3 5 7; 10 14 14; 2 4 7]
Suppose you have a matrix with N rows and M columns and you want to sum every R rows together (where N is divisible by R),
>> mat = [1 2 3; 2 3 4; 5 8 7; 5 6 7; 1 2 3; 1 2 4]
mat =
1 2 3
2 3 4
5 8 7
5 6 7
1 2 3
1 2 4
>> [N, M] = size(mat); %=> [6, 3]
>> R = 2;
The following will allow you to sum groups of R rows:
>> res = reshape(mat, R, [])
res =
1 5 1 2 8 2 3 7 3
2 5 1 3 6 2 4 7 4
>> res = sum(res)
res =
3 10 2 5 14 4 7 14 7
>> res = reshape(res, [], M)
res =
3 5 7
10 14 14
2 4 7
You can also do everything in one line:
>> reshape(sum(reshape(mat, R, [])), [], M)
ans =
3 5 7
10 14 14
2 4 7