Shuffle vector elements such that two similar elements come together at most twice - matlab

For the sake of an example:
I have a vector named vec containing ten 1s and ten 2s. I am trying to arrange it randomly, with one condition: the same value must not appear more than twice in a row.
So far I generate random indexes of vec using the randperm function and shuffle vec accordingly. This is what I have:
vec = [1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2];
atmost = 2;
indexes = randperm(length(vec));
vec = vec(indexes);
>> vec =
2 1 1 1 2 1 2 1 2 1 1 1 2 2 1 2 1 2 2 2
My code randomly arranges the elements of vec but does not guarantee that the same value appears at most twice in a row. How can I do this? Any ideas?

You can first determine the lengths of the runs of one value, and then fit the other values around them.
In the explanation I'll use values of a and b in the vector, so as not to confuse the values of the elements of the vector (1 and 2) and the lengths of the runs for each element (1 and 2).
The end goal is to use repelem to construct the shuffled vector. repelem takes a vector of the elements to repeat, and a vector of how many times to repeat each element. For example, if we have:
v = [b a b a b a b a b a b a b a b a b]
n = [1 1 1 2 1 1 2 1 2 1 1 1 1 2 1 1 0]
repelem will return:
shuffled_vec = [b a b a a b a b b a b b a b a b a a b a]
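As a quick illustration of the repelem call described above, here is a minimal sketch that substitutes numbers for the symbols. The mapping b = 2 and a = 1 is an assumption made only for this example; the counts are the n vector shown above.

```matlab
% Assumed mapping for illustration: b -> 2, a -> 1
v = [2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2];
n = [1 1 1 2 1 1 2 1 2 1 1 1 1 2 1 1 0];
% Each element of v is repeated n(k) times; a count of 0 drops the element
shuffled_vec = repelem(v, n);
disp(shuffled_vec)
```

The trailing count of 0 simply means that no b follows the last run of a's.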
As a first step, I'll generate random values for the counts corresponding to the a values. In this example, that would be:
a_grouping = [1 2 1 1 1 1 2 1]
First, randomly select the number of 2's in the grouping vector. There can be at most num_total/2 of them, where num_total is the number of a's. Then add 1's to make up the desired total.
num_total = 10; % number of a's (and b's)
% There can be 0..num_total/2 groups of two a's in the string.
two_count = randi(num_total/2 + 1) - 1;
% The remainder of the groups of a's have length one.
one_count = num_total - (2*two_count);
% Build a vector of two_count 2's and one_count 1's (shuffled in the next step)
a_grouping = [repmat(2, 1, two_count), repmat(1, 1, one_count)];
This will give us something like:
a_grouping = [2 2 1 1 1 1 1 1]
Now shuffle:
a_grouping = a_grouping(randperm(numel(a_grouping)));
With the result:
a_grouping = [1 2 1 1 1 1 2 1]
Now we need to figure out where the b values go. There must be at least one b between each run of a values (and at most two), and there may be 0, 1 or 2 b values at the beginning and end of the string. So we need to generate counts for each of the x and y values below:
all_grouping = [y 1 x 2 x 1 x 1 x 1 x 1 x 2 x 1 y]
The x values must be at least 1, so we'll assign them first. Since the y values can be either 0, 1 or 2, we'll leave them set to 0.
% Between each grouping of a's, there must be at least one b.
% There can be 0, 1, or 2 b's before and after the a's,
% so make room for them as well.
b_grouping = zeros(1, numel(a_grouping) - 1 + 2);
b_grouping(2:end-1) = 1; % insert one b between each a group
For each of the remaining b's we need to assign, just select a random slot. If it's not filled yet (i.e. if it's < 2), increment the count, otherwise find a different slot.
% Assign location of the remaining b's
for s = numel(a_grouping):num_total
unassigned = true;
while unassigned
% generate random indices until we find one that's open
idx = randi(numel(b_grouping));
if b_grouping(idx) < 2
b_grouping(idx) = b_grouping(idx) + 1;
unassigned = false;
end
end
end
Now we've got separate counts for the a's and b's:
a_grouping = [1 2 1 1 1 1 2 1]
b_grouping = [1 1 1 2 2 1 1 1 0]
We'll build the value vector (v from the start of the example) and interleave the groupings (the n vector).
% Interleave the a and b values
group_values = zeros(1, numel(a_grouping) + numel(b_grouping));
group_values(1:2:end) = 2;
group_values(2:2:end) = 1;
% Interleave the corresponding groupings
all_grouping = zeros(size(group_values));
all_grouping(2:2:end) = a_grouping;
all_grouping(1:2:end) = b_grouping;
Finally, repelem puts everything together:
shuffled_vec = repelem(group_values, all_grouping)
A sample final result (taken from a separate random run, so it does not correspond to the groupings shown above):
shuffled_vec =
1 2 2 1 1 2 1 1 2 2 1 1 2 2 1 1 2 2 1 2
Full code:
num_total = 10; % number of a's (and b's)
% There can be 0..num_total/2 groups of two a's in the string.
two_count = randi(num_total/2 + 1) - 1;
% The remainder of the groups of a's have length one.
one_count = num_total - (2*two_count);
% Generate random permutation of two_count 2's and one_count 1's
a_grouping = [repmat(2, 1, two_count), repmat(1, 1, one_count)];
a_grouping = a_grouping(randperm(numel(a_grouping)));
% disp(a_grouping)
% Between each grouping of a's, there must be at least one b.
% There can be 0, 1, or 2 b's before and after the a's,
% so make room for them as well.
b_grouping = zeros(1, numel(a_grouping) - 1 + 2);
b_grouping(2:end-1) = 1; % insert one b between each a group
% Assign location of the remaining b's
for s = numel(a_grouping):num_total
unassigned = true;
while unassigned
% generate random indices until we find one that's open
idx = randi(numel(b_grouping));
if b_grouping(idx) < 2
b_grouping(idx) = b_grouping(idx) + 1;
unassigned = false;
end
end
end
% Interleave the a and b values
group_values = zeros(1, numel(a_grouping) + numel(b_grouping));
group_values(1:2:end) = 2;
group_values(2:2:end) = 1;
% Interleave the corresponding groupings
all_grouping = zeros(size(group_values));
all_grouping(2:2:end) = a_grouping;
all_grouping(1:2:end) = b_grouping;
shuffled_vec = repelem(group_values, all_grouping)
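To check whether a given vector satisfies the "at most twice in a row" condition, one option is to measure its longest run. This is only a sketch for row vectors; the helper name max_run is hypothetical and not part of the answer above.

```matlab
% Hypothetical helper: length of the longest run of equal values in a row vector.
% A run starts at position 1 and wherever the value changes.
max_run = @(x) max(diff([find([true, diff(x) ~= 0]), numel(x) + 1]));

good = [1 2 2 1 1 2 1 1 2 2 1 1 2 2 1 1 2 2 1 2]; % sample result from the answer
bad  = [2 1 1 1 2 1 2 1 2 1 1 1 2 2 1 2 1 2 2 2]; % plain randperm result from the question

disp(max_run(good) <= 2)
disp(max_run(bad) <= 2)
```

The first check passes and the second fails, because the vector from the question contains runs of three equal values.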

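This should generate fairly random vectors with at most two equal values in a row: each pair is either 1,2 or 2,1, so equal values can only meet at the boundary between two pairs. It does not cover all possibilities, though: a result can never start with two equal values, for example.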
out=[];
for i=1:10
if randi(2)==1
out=[out,1,2];
else
out=[out,2,1];
end
end
disp(out)
Example results
1,2,1,2,1,2,1,2,2,1,2,1,1,2,2,1,2,1,1,2,
2,1,2,1,1,2,2,1,2,1,1,2,2,1,2,1,1,2,1,2,
1,2,2,1,2,1,1,2,1,2,2,1,2,1,2,1,2,1,2,1,
2,1,2,1,2,1,1,2,1,2,2,1,1,2,2,1,2,1,1,2,
2,1,1,2,1,2,1,2,2,1,2,1,1,2,1,2,1,2,1,2,
2,1,1,2,2,1,2,1,1,2,1,2,2,1,1,2,1,2,2,1,
1,2,2,1,1,2,1,2,2,1,2,1,2,1,1,2,2,1,1,2,
2,1,1,2,2,1,2,1,1,2,2,1,1,2,1,2,1,2,1,2,
1,2,2,1,1,2,1,2,2,1,2,1,1,2,1,2,1,2,1,2,
1,2,1,2,2,1,2,1,2,1,2,1,1,2,1,2,2,1,2,1,

Related

Find unique rows of a cell array considering all possible permutations on each row

I have cell array A of dimension m * k.
I want to keep the rows of A unique up to an order of the k cells.
The "tricky" part is "up to an order of the k cells": consider the k cells in the ith row of A, A(i,:); there could be a row j of A, A(j,:), that is equivalent to A(i,:) up to a re-ordering of its k cells, meaning that, for example, if k=4 it could be that:
A{i,1}=A{j,2}
A{i,2}=A{j,3}
A{i,3}=A{j,1}
A{i,4}=A{j,4}
What I am doing at the moment is:
G=[0 -1 1; 0 -1 2; 0 -1 3; 0 -1 4; 0 -1 5; 1 -1 6; 1 0 6; 1 1 6; 2 -1 6; 2 0 6; 2 1 6; 3 -1 6; 3 0 6; 3 1 6];
h=7;
M=reshape(G(nchoosek(1:size(G,1),h),:),[],h,size(G,2));
A=cell(size(M,1),2);
for p=1:size(M,1)
A{p,1}=squeeze(M(p,:,:));
left=~ismember(G, A{p,1}, 'rows');
A{p,2}=G(left,:);
end
%To find equivalent rows up to order I use a double loop (VERY slow).
indices=[];
for j=1:size(A,1)
if ismember(j,indices)==0 %if we have not already identified j as a duplicate
for i=1:size(A,1)
if i~=j
if (isequal(A{j,1},A{i,1}) || isequal(A{j,1},A{i,2}))...
&&...
(isequal(A{j,2},A{i,1}) || isequal(A{j,2},A{i,2}))...
indices=[indices;i];
end
end
end
end
end
A(indices,:)=[];
It works but it is too slow. I am hoping that there is something quicker that I can use.
I'd like to propose another idea, which has some conceptual resemblance to erfan's. My idea uses hash functions, and specifically, the GetMD5 FEX submission.
The main task is how to "reduce" each row in A to a single representative value (such as a character vector) and then find unique entries of this vector.
Judging by the benchmark vs. the other suggestions, my answer doesn't perform as well as one of the alternatives, but I think its raison d'être lies in the fact that it is completely data-type agnostic (within the limitations of GetMD5 [1]), that the algorithm is very straightforward to understand, that it's a drop-in replacement as it operates on A, and that the resulting array is exactly equal to the one obtained by the original method. Of course this requires a compiler to get working and has a risk of hash collisions (which might affect the result in VERY VERY rare cases).
Here are the results from a typical run on my computer, followed by the code:
Original method timing: 8.764601s
Dev-iL's method timing: 0.053672s
erfan's method timing: 0.481716s
rahnema1's method timing: 0.009771s
function q39955559
G=[0 -1 1; 0 -1 2; 0 -1 3; 0 -1 4; 0 -1 5; 1 -1 6; 1 0 6; 1 1 6; 2 -1 6; 2 0 6; 2 1 6; 3 -1 6; 3 0 6; 3 1 6];
h=7;
M=reshape(G(nchoosek(1:size(G,1),h),:),[],h,size(G,2));
A=cell(size(M,1),2);
for p=1:size(M,1)
A{p,1}=squeeze(M(p,:,:));
left=~ismember(G, A{p,1}, 'rows');
A{p,2}=G(left,:);
end
%% Benchmark:
tic
A1 = orig_sort(A);
fprintf(1,'Original method timing:\t\t%fs\n',toc);
tic
A2 = hash_sort(A);
fprintf(1,'Dev-iL''s method timing:\t\t%fs\n',toc);
tic
A3 = erfan_sort(A);
fprintf(1,'erfan''s method timing:\t\t%fs\n',toc);
tic
A4 = rahnema1_sort(G,h);
fprintf(1,'rahnema1''s method timing:\t%fs\n',toc);
assert(isequal(A1,A2))
assert(isequal(A1,A3))
assert(isequal(numel(A1),numel(A4))) % This is the best test I could come up with...
function out = hash_sort(A)
% Hash the contents:
A_hashed = cellfun(@GetMD5,A,'UniformOutput',false);
% Sort hashes of each row:
A_hashed_sorted = A_hashed;
for ind1 = 1:size(A_hashed,1)
A_hashed_sorted(ind1,:) = sort(A_hashed(ind1,:));
end
A_hashed_sorted = cellstr(cell2mat(A_hashed_sorted));
% Find unique rows:
[~,ia,~] = unique(A_hashed_sorted,'stable');
% Extract relevant rows of A:
out = A(ia,:);
function A = orig_sort(A)
%To find equivalent rows up to order I use a double loop (VERY slow).
indices=[];
for j=1:size(A,1)
if ismember(j,indices)==0 %if we have not already identified j as a duplicate
for i=1:size(A,1)
if i~=j
if (isequal(A{j,1},A{i,1}) || isequal(A{j,1},A{i,2}))...
&&...
(isequal(A{j,2},A{i,1}) || isequal(A{j,2},A{i,2}))...
indices=[indices;i];
end
end
end
end
end
A(indices,:)=[];
function C = erfan_sort(A)
STR = cellfun(@(x) num2str((x(:)).'), A, 'UniformOutput', false);
[~, ~, id] = unique(STR);
IC = sort(reshape(id, [], size(STR, 2)), 2);
[~, col] = unique(IC, 'rows');
C = A(sort(col), :); % 'sort' makes the outputs exactly the same.
function A1 = rahnema1_sort(G,h)
idx = nchoosek(1:size(G,1),h);
%concatenate complements
M = [G(idx(1:size(idx,1)/2,:),:), G(idx(end:-1:size(idx,1)/2+1,:),:)];
%convert to cell so A1 is unique rows of A
A1 = mat2cell(M,repmat(h,size(idx,1)/2,1),repmat(size(G,2),2,1));
1 - If more complicated data types need to be hashed, one can use the DataHash FEX submission instead, which is somewhat slower.
Stating the problem: The ideal choice in identifying unique rows in an array is to use C = unique(A,'rows'). But there are two major problems here, preventing us from using this function in this case. First is that you want to count in all the possible permutations of each row when comparing to other rows. If A has 5 columns, it means checking 120 different re-arrangements per row! Sounds impossible.
The second issue is related to unique itself; It does not accept cells except cell arrays of character vectors. So you cannot simply pass A to unique and get what you expect.
Why look for an alternative? As you know, because the current approach is very slow:
With nested loop method:
------------------- Create the data (first loop):
Elapsed time is 0.979059 seconds.
------------------- Make it unique (second loop):
Elapsed time is 14.218691 seconds.
My solution:
1. Generate another cell array containing the same cells, but converted to strings (STR).
2. Find the index of all unique elements there (id).
3. Generate the associated matrix with the unique indices and sort each row (IC).
4. Find the unique rows (col).
5. Collect the corresponding rows of A (C).
And this is the code:
disp('------------------- Create the data:')
tic
G = [0 -1 1; 0 -1 2; 0 -1 3; 0 -1 4; 0 -1 5; 1 -1 6; 1 0 6; ...
1 1 6; 2 -1 6; 2 0 6; 2 1 6; 3 -1 6; 3 0 6; 3 1 6];
h = 7;
M = reshape(G(nchoosek(1:size(G,1),h),:),[],h,size(G,2));
A = cell(size(M,1),2);
for p = 1:size(M,1)
A{p, 1} = squeeze(M(p,:,:));
left = ~ismember(G, A{p,1}, 'rows');
A{p,2} = G(left,:);
end
STR = cellfun(@(x) num2str((x(:)).'), A, 'UniformOutput', false);
toc
disp('------------------- Make it unique (vectorized):')
tic
[~, ~, id] = unique(STR);
IC = sort(reshape(id, [], size(STR, 2)), 2);
[~, col] = unique(IC, 'rows');
C = A(sort(col), :); % 'sort' makes the outputs exactly the same.
toc
Performance check:
------------------- Create the data:
Elapsed time is 1.664119 seconds.
------------------- Make it unique (vectorized):
Elapsed time is 0.017063 seconds.
Although initialization needs a bit more time and memory, this method is far faster at finding unique rows while accounting for all permutations. Execution time is almost insensitive to the number of columns in A.
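To make the string-based steps easier to follow, here is a small sketch on toy data. The 3-by-2 cell array below is made up for illustration: rows 1 and 3 hold the same two cells in a different order, so only two rows should survive.

```matlab
% Toy data (assumed for illustration): row 3 is row 1 with its cells swapped
A = {[1 2; 3 4], [5 6; 7 8];
     [1 2; 3 4], [9 9; 9 9];
     [5 6; 7 8], [1 2; 3 4]};
% One string key per cell
STR = cellfun(@(x) num2str((x(:)).'), A, 'UniformOutput', false);
% Integer label per distinct key (id is returned in column-major order)
[~, ~, id] = unique(STR);
% Labels arranged like A, then sorted within each row to ignore cell order
IC = sort(reshape(id, [], size(STR, 2)), 2);
% Rows with identical sorted labels are duplicates
[~, col] = unique(IC, 'rows');
C = A(sort(col), :);
disp(size(C, 1))
```

Sorting the labels within each row is what makes the comparison independent of the order of the cells.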
It seems that G is a misleading point.
Here is result of nchoosek for a small number
idx=nchoosek(1:4,2)
idx =
1 2
1 3
1 4
2 3
2 4
3 4
first row is complement of the last row
second row is complement of one before the last row
.....
So if we extract rows {1, 2} from G, then its complement will be rows {3, 4}, and so on. In other words, if we assume G has 4 rows, then G(idx(1,:),:) is the complement of G(idx(end,:),:).
Since the rows of G are all unique, all A{m,n} always have the same size.
A{p,1} and A{p,2} are complements of each other, and the number of unique rows of A is size(idx,1)/2.
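The complement pattern described above can be checked numerically on a small case. This is a sketch with assumed sizes (n = 6, k = 3); it relies on nchoosek returning the combinations in the same order as in the 4-choose-2 example.

```matlab
n = 6;                      % assumed number of rows of G
k = n / 2;                  % rows picked per combination
idx = nchoosek(1:n, k);
m = size(idx, 1);
ok = true;
for r = 1:m
    % Row r and row m-r+1 together must contain every index exactly once
    ok = ok && isequal(sort([idx(r, :), idx(m - r + 1, :)]), 1:n);
end
disp(ok)
```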
So there is no need for any loop or further comparison:
h=7;
G = [0 -1 1; 0 -1 2; 0 -1 3; 0 -1 4; 0 -1 5; 1 -1 6; 1 0 6; ...
1 1 6; 2 -1 6; 2 0 6; 2 1 6; 3 -1 6; 3 0 6; 3 1 6];
idx = nchoosek(1:size(G,1),h);
%concatenate complements
M = [G(idx(1:size(idx,1)/2,:).',:), G(idx(end:-1:size(idx,1)/2+1,:).',:)];
%convert to cell so A1 is unique rows of A
A1 = mat2cell(M,repmat(h,size(idx,1)/2,1),repmat(size(G,2),2,1));
Update: The above method works best. However, if the idea is to get A1 from A rather than from G, I suggest the following method based on erfan's. Instead of converting the arrays to strings, we can work with the arrays directly (note that the [A.'{:}] indexing below is Octave syntax):
STR=reshape([A.'{:}],numel(A{1,1}),numel(A)).';
[~, ~, id] = unique(STR,'rows');
IC = sort(reshape(id, size(A, 2),[]), 1).';
[~, col] = unique(IC, 'rows');
C1 = A(sort(col), :);
Since I use Octave, I currently cannot run mex files, so I cannot test Dev-iL's method.
Result:
erfan method (string): 4.54718 seconds.
rahnema1 method (array): 0.012639 seconds.

Sort elements of rows in a matrix with another matrix

I have a matrix D of distances between 3 places and 4 persons
For example, D(2,3) = 10 means person 3 is 10 units away from place 2.
D=[23 54 67 32
32 5 10 2
3 11 13 5]
another matrix A with the same number of rows (3 places) and where A(i,:) correspond to the persons that picked place i
example for place 1, persons 1 and 3 picked it
no one picked place 2
and persons 2 and 4 picked place 3
A=[1 3 0
0 0 0
2 4 0]
I want to reorder each row of A by the persons who are closest to the place it represents.
In this example, for place 1, person 1 is closer to it than person 3 based on D so nothing to do.
nothing to do for place 2
and there is a change for place 3 since person 4 is closer than 2 to place 3 D(3,2)>D(3,4)
The result should be
A=[1 3
0 0
4 2 ]
each row(place) in A can have 0 or many non zeros elements in it (persons that picked it)
Basically, I want to reorder elements in each row of A based on the rows of D (the closest to the location comes first), something like this but here A and D are not of the same size (number of columns).
[SortedD,Ind] = sort(D,2)
for r = 1:size(A,1)
A(r,:) = A(r,Ind(r,:));
end
There is another MATLAB function, sortrows(C,column_index), that can do the trick. It sorts rows based on the elements in a particular column. So if you transpose your matrix A (C = A') and append the proper column, according to which you want to sort a required row, then you will get what you want.
To be more specific, you can do something like this:
clear all
D=[23 54 67 32;
32 5 10 2;
3 11 13 5];
A=[1 0;
3 0;
4 2 ];
% Sort elements in each row of the matrix A,
% because indices of elements in each row of the matrix D are always
% ascending.
A_sorted = sort(A,2);
% shifting all zeros in each row to the end
for i = 1:length(A_sorted(:,1))
num_zeros = sum(A_sorted(i,:)==0);
if num_zeros < length(A_sorted(i,:))
z = zeros(1,num_zeros);
A_sorted(i,:) = [A_sorted(i,num_zeros+1:length(A_sorted(i,:))) z];
end;
end;
% Preallocate an array for the elements of D that correspond to A_sorted.
% The matrix Dr is just a reduced version of the matrix D.
Dr = zeros(length(A_sorted(:,1)),length(A_sorted(1,:)));
% Create a matrix Dr of elements in D corresponding to the matrix A_sorted.
for i = 1:length(A_sorted(:,1)) % i = 1:3
for j = 1:length(A_sorted(1,:)) % j = 1:2
if A_sorted(i,j) == 0
Dr(i,j) = 0;
else
Dr(i,j) = D(i,A_sorted(i,j));
end;
end;
end;
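% In order to use the function SORTROWS, we need to transpose matrices.
% Continue with A_sorted (not the original A): the rows of Dr were built
% from A_sorted, so each distance must stay paired with the same person.
A = A_sorted';
Dr = Dr';
% We don't need the matrix A_sorted anymore
clear A_sorted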
% The actual sorting procedure starts here.
for i = 1:length(A(1,:)) % i = 1:3
C = zeros(length(A(:,1)),2); % buffer matrix
C(:,1) = A(:,i);
C(:,2) = Dr(:,i);
C = sortrows(C,2);
A(:,i) = C(:,1);
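% shifting all zeros in each column to the end
% (after sortrows the zeros come first, since their Dr entry is 0)
num_zeros = sum(A(:,i)==0);
if num_zeros < length(A(:,i))
z = zeros(num_zeros,1); % column of zeros, so vertical concatenation works
A(:,i) = [A(num_zeros+1:length(A(:,i)),i); z];
end;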
end;
% Transpose the matrix A back
A = A';
clear C Dr z
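For comparison, here is a shorter sketch that uses a plain loop with sort instead of sortrows. It is not the method of the answer above; it assumes the data from the question, treats zeros in A as empty slots, and writes the result to a new matrix B (a name chosen for this example).

```matlab
D = [23 54 67 32;
     32  5 10  2;
      3 11 13  5];
A = [1 3 0;
     0 0 0;
     2 4 0];
B = zeros(size(A));
for r = 1:size(A, 1)
    persons = A(r, A(r, :) > 0);          % persons who picked place r
    [~, order] = sort(D(r, persons));     % closest person first
    B(r, 1:numel(persons)) = persons(order);
end
disp(B)
```

Rows with no persons are left as zeros, and the remaining zeros stay at the end of each row.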

count co-occurrence neighbors in a vector

I have a vector : for example S=(0,3,2,0,1,2,0,1,1,2,3,3,0,1,2,3,0).
I want to count co-occurring neighbors. For example, take the first pair of neighbors "0,3": how many times does it occur up to the end of the sequence? Then do the same for the next pair "2,0", and similarly for the other pairs.
Below is a part of my code:
s=size(pic);
S=s(1)*s(2);
V = reshape(pic,1,S);
min= min(V);
Z=zeros(1,S+1);
Z(1)=min;
Z(2:end)=V;
for i=[0:2:length(Z)-1];
contj=0
for j=0;length(Z)-1;
if Z(i,i+1)= Z(j,j+1)
contj=contj+1
end
end
count(i)= contj
end
It gives me this error:
The expression to the left of the equals sign is not a valid target for an assignment
in this line:
if Z(i,i+1)= Z(j,j+1)
I read similar questions and apply the tips on it but they didn't work!
If pairs are defined without overlapping (according to comments):
S = [0,3,2,0,1,2,0,1,1,2,3,3,0,1,2,3]; %// define data
S2 = reshape(S,2,[]).'; %'// arrange in pairs: one pair in each row
[~, jj, kk] = unique(S2,'rows'); %// get unique labels for pairs
pairs = S2(jj,:); %// unique pairs
counts = accumarray(kk, 1); %// count of each pair. Or use histc(kk, 1:max(kk))
Example: with S as above (I introduce blanks to make pairs stand out),
S = [0,3, 2,0, 1,2, 0,1, 1,2, 3,3, 0,1, 2,3];
the result is
pairs =
0 1
0 3
1 2
2 0
2 3
3 3
counts =
2
1
2
1
1
1
If pairs are defined without overlapping but counted with overlapping:
S = [0,3,2,0,1,2,0,1,1,2,3,3,0,1,2,3]; %// define data
S2 = reshape(S,2,[]).'; %'// arrange in pairs: one pair in each row
[~, jj] = unique(S2,'rows'); %// get unique labels for pairs
pairs = S2(jj,:); %// unique pairs
P = [S(1:end-1).' S(2:end).']; %// all pairs, with overlapping
counts = sum(pdist2(P,pairs,'hamming')==0);
If you don't have pdist2 (Statistics Toolbox), replace last line by
counts = sum(all(bsxfun(@eq, pairs.', permute(P, [2 3 1]))), 3);
Result:
>> pairs
pairs =
0 1
0 3
1 2
2 0
2 3
3 3
>> counts
counts =
3 1 3 2 2 1
You can also do it using the sparse command:
os = - min(S) + 1; % convert S into indices
% Option 1: if you want all pairs, i.e., for S = (2,0,1) count (2,0) AND (0,1):
S = sparse( S(1:end-1) + os, S(2:end) + os, 1 );
% Option 2 (use instead of option 1): if you don't want all pairs,
% i.e., for S = (2,0,1,3) count (2,0) and (1,3) ONLY:
% S = sparse( S(1:2:end)+os, S(2:2:end) + os, 1 );
[f s c] = find(S);
f = f - os; % convert back
s = s - os;
The co-occurrences and their counts are in the pairs (f,s) and c:
>> [f s c]
ans =
2 0 2 % i.e. the pair (2,0) appears twice in S...
3 0 2
0 1 3
1 1 1
1 2 3
3 2 1
0 3 1
2 3 2
3 3 1
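For reference, the error message quoted in the question is caused by writing = (assignment) inside the if condition where a comparison (== or isequal) is needed. Separately, MATLAB indices start at 1, and the inner for statement needs a colon rather than a semicolon. A minimal loop-based sketch, assuming non-overlapping pairs and using illustrative names (num_pairs, p):

```matlab
S = [0,3,2,0,1,2,0,1,1,2,3,3,0,1,2,3];
num_pairs = floor(numel(S) / 2);
count = zeros(1, num_pairs);
for i = 1:num_pairs
    p = S(2*i-1 : 2*i);                   % i-th pair
    for j = 1:num_pairs
        if isequal(p, S(2*j-1 : 2*j))     % comparison, not assignment
            count(i) = count(i) + 1;
        end
    end
end
disp(count)
```

Here count(i) holds how often the i-th pair occurs among all pairs, so repeated pairs report the same number. The vectorized answers above are preferable for large inputs.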

For each element in vector, sum previous n elements

I am trying to write a function that, for each element, sums the previous n elements (including the element itself):
v = [1 1 1 1 1 1];
res = sumLastN(v,3);
res = [0 0 3 3 3 3];
Until now, I have written the following function
function [res] = sumLastN(vec,ppts)
if iscolumn(vec)~=1
error('First argument must be a column vector')
end
sz_x = size(vec,1);
res = zeros(sz_x,1);
if sz_x > ppts
for jj = 1:ppts
res(ppts:end,1) = res(ppts:end,1) + ...
vec(jj:end-ppts+jj,1);
end
% for jj = ppts:sz_x
% res(jj,1) = sum(vec(jj-ppts+1:jj,1));
% end
end
end
There are around 2000 vectors of about 1 million elements, so I was wondering if anyone could give me any advice of how I could speed up the function.
Using cumsum should be much faster:
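function [res] = sumLastN(vec,ppts)
% Assumes vec is a row vector with at least ppts elements
w = cumsum(vec);
% Element ppts is the sum of the first ppts values, i.e. w(ppts);
% later elements are differences of cumulative sums
res = [zeros(1,ppts-1), w(ppts), w(ppts+1:end)-w(1:end-ppts)];
end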
You basically want a moving average filter, just without the averaging.
Use a digital filter:
n = 3;
v = [1 1 1 1 1 1];
res = filter(ones(1,n),1,v)
res =
1 2 3 3 3 3
I don't get why the first two elements should be zero, but why not:
res(1:n-1) = 0
res =
0 0 3 3 3 3
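The cumsum and filter approaches can be cross-checked on non-constant data. This is a sketch assuming a row vector with at least n elements; the test values are made up.

```matlab
v = [4 1 3 2 5 6];   % assumed test data
n = 3;

% Running sum of the last n elements via cumulative sums
w = cumsum(v);
res_cumsum = [zeros(1, n-1), w(n), w(n+1:end) - w(1:end-n)];

% Same result via a moving-sum filter, with the first n-1 entries zeroed
res_filter = filter(ones(1, n), 1, v);
res_filter(1:n-1) = 0;

disp(isequal(res_cumsum, res_filter))
```

With integer data the two results match exactly. With floating-point data, compare them with a tolerance instead, because cumsum can accumulate rounding error over long vectors.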

matlab: simple matrix filtering - group size

I have a huuuge matrix storing information about X and Y coordinates of multiple particle trajectories , which in simplified version looks like that:
col 1 - track number; col 2 - frame number; col 3 - coordinate X; col 4 - coordinate Y
for example:
A =
1 1 5.14832 3.36128
1 2 5.02768 3.60944
1 3 4.85856 3.81616
1 4 5.17424 4.08384
2 1 2.02928 18.47536
2 2 2.064 18.5464
3 1 8.19648 5.31056
3 2 8.04848 5.33568
3 3 7.82016 5.29088
3 4 7.80464 5.31632
3 5 7.68256 5.4624
3 6 7.62592 5.572
Now I want to filter out trajectories shorter than, let's say, 4 frames and keep the rest, like this (note the renumbering of trajectories):
B =
1 1 5.14832 3.36128
1 2 5.02768 3.60944
1 3 4.85856 3.81616
1 4 5.17424 4.08384
2 1 8.19648 5.31056
2 2 8.04848 5.33568
2 3 7.82016 5.29088
2 4 7.80464 5.31632
2 5 7.68256 5.4624
2 6 7.62592 5.572
How can I do this efficiently? I can think of some ideas using a for loop and vertcat, but that's the slowest solution ever :/
Thanks!
This will filter out those trajectories of length less than 4:
[v, u1, w] = unique(A(:, 1), 'last');
[~, u2, ~] = unique(A(:, 1), 'first');
keys = v(find(u1 - u2 >= 3));
B = A(ismember(A(:, 1), keys), :);
This will re-number them:
[~, ~, B(:, 1)] = unique(B(:, 1));
Here is a slightly different solution than that of @Ansari:
t = 1:max(A(:,1)); %# possible track numbers
tt = t( histc(A(:,1),t) >= 4 ); %# tracks with >= 4 frames
B = A(ismember(A(:,1),tt),:); %# filter rows
[~,~,B(:,1)] = unique(B(:,1)); %# renumber track numbers
Another way to compute the indices variable tt in my code above:
tt = find( accumarray(A(:,1), 1, [], @(x)numel(x)>=4) );
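As a small end-to-end check of the filtering and renumbering, here is a sketch on a reduced version of the example data. Only the track and frame columns are kept (an assumption made to keep the example short); the coordinate columns would be carried along unchanged.

```matlab
% Track number and frame number only (coordinates omitted for brevity)
A = [1 1; 1 2; 1 3; 1 4; 2 1; 2 2; 3 1; 3 2; 3 3; 3 4; 3 5; 3 6];
counts = accumarray(A(:, 1), 1);        % number of frames per track
keep = find(counts >= 4);               % tracks with at least 4 frames
B = A(ismember(A(:, 1), keep), :);      % drop the short tracks
[~, ~, B(:, 1)] = unique(B(:, 1));      % renumber the remaining tracks 1, 2, ...
disp(B)
```

Track 2 has only two frames and is removed, and the former track 3 becomes track 2.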