count co-occurrence neighbors in a vector - matlab

I have a vector : for example S=(0,3,2,0,1,2,0,1,1,2,3,3,0,1,2,3,0).
I want to count co-occurring neighbors. For example, how many times does the first pair of neighbors, "0,3", occur up to the end of the sequence? Then the same is done for the next pair, "2,0", and similarly for the other pairs.
Below is a part of my code:
s=size(pic);
S=s(1)*s(2);
V = reshape(pic,1,S);
min= min(V);
Z=zeros(1,S+1);
Z(1)=min;
Z(2:end)=V;
for i=[0:2:length(Z)-1];
contj=0
for j=0;length(Z)-1;
if Z(i,i+1)= Z(j,j+1)
contj=contj+1
end
end
count(i)= contj
end
It gives me this error:
The expression to the left of the equals sign is not a valid target for an assignment
in this line:
if Z(i,i+1)= Z(j,j+1)
I read similar questions and applied their tips, but they didn't work!

If pairs are defined without overlapping (according to comments):
S = [0,3,2,0,1,2,0,1,1,2,3,3,0,1,2,3]; %// define data
S2 = reshape(S,2,[]).'; %'// arrange in pairs: one pair in each row
[~, jj, kk] = unique(S2,'rows'); %// get unique labels for pairs
pairs = S2(jj,:); %// unique pairs
counts = accumarray(kk, 1); %// count of each pair. Or use histc(kk, 1:max(kk))
Example: with S as above (I introduce blanks to make pairs stand out),
S = [0,3, 2,0, 1,2, 0,1, 1,2, 3,3, 0,1, 2,3];
the result is
pairs =
0 1
0 3
1 2
2 0
2 3
3 3
counts =
2
1
2
1
1
1
If pairs are defined without overlapping but counted with overlapping:
S = [0,3,2,0,1,2,0,1,1,2,3,3,0,1,2,3]; %// define data
S2 = reshape(S,2,[]).'; %'// arrange in pairs: one pair in each row
[~, jj] = unique(S2,'rows'); %// get unique labels for pairs
pairs = S2(jj,:); %// unique pairs
P = [S(1:end-1).' S(2:end).']; %// all pairs, with overlapping
counts = sum(pdist2(P,pairs,'hamming')==0);
If you don't have pdist2 (Statistics Toolbox), replace last line by
counts = sum(all(bsxfun(@eq, pairs.', permute(P, [2 3 1]))), 3);
Result:
>> pairs
pairs =
0 1
0 3
1 2
2 0
2 3
3 3
>> counts
counts =
3 1 3 2 2 1
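If the goal is to count every overlapping pair, rather than only the pairs found in the non-overlapping split, a minimal sketch could look like the following. It is not part of the answer above; it simply applies the same unique/accumarray idea from the first snippet to the overlapping pair list P.

```matlab
S = [0,3,2,0,1,2,0,1,1,2,3,3,0,1,2,3]; % same data as above
P = [S(1:end-1).' S(2:end).'];          % all overlapping pairs, one per row
[pairs, ~, kk] = unique(P, 'rows');     % distinct pairs and a label for each row of P
counts = accumarray(kk(:), 1);          % number of occurrences of each distinct pair
disp([pairs counts])                    % columns: first element, second element, count
```

The six pairs listed above keep the same counts here; the extra rows are pairs such as (1,1) that only appear when overlapping is allowed.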

You can do it using the sparse command:
os = - min(S) + 1; % offset that converts the values of S into valid indices
% if you want all pairs, i.e., for S = (2,0,1) count (2,0) AND (0,1):
M = sparse( S(1:end-1) + os, S(2:end) + os, 1 );
% if you don't want all pairs, i.e., for S = (2,0,1,3) count (2,0) and (1,3) ONLY, use this instead:
% M = sparse( S(1:2:end) + os, S(2:2:end) + os, 1 );
[f, s, c] = find(M); % sparse sums repeated (row, column) entries, so c holds the counts
f = f - os; % convert back
s = s - os;
The co-occurrences and their counts are in the pairs (f,s) and c:
>> [f s c]
ans =
2 0 2 % i.e. the pair (2,0) appears twice in S...
3 0 2
0 1 3
1 1 1
1 2 3
3 2 1
0 3 1
2 3 2
3 3 1

Related

Shuffle vector elements such that two similar elements coming together at most twice

For the sake of an example:
I have a vector named vec containing ten 1s and ten 2s. I am trying to randomly arrange it but with one condition that two same values must not come together more than twice.
What I have done till now is generating random indexes of vec using the randperm function and shuffling vec accordingly. This is what I have:
vec = [1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2];
atmost = 2;
indexes = randperm(length(vec));
vec = vec(indexes);
>> vec =
2 1 1 1 2 1 2 1 2 1 1 1 2 2 1 2 1 2 2 2
My code randomly arranges elements of vec but does not fulfill the condition of two similar values coming at most two times. How can I do this? Any ideas?
You can first determine the lengths of the runs of one value, and then fit the other values around them.
In the explanation I'll use values of a and b in the vector, so as not to confuse the values of the elements of the vector (1 and 2) and the lengths of the runs for each element (1 and 2).
The end goal is to use repelem to construct the shuffled vector. repelem takes a vector of the elements to repeat, and a vector of how many times to repeat each element. For example, if we have:
v = [b a b a b a b a b a b a b a b a b]
n = [1 1 1 2 1 1 2 1 2 1 1 1 1 2 1 1 0]
repelem will return:
shuffled_vec = [b a b a a b a b b a b b a b a b a a b a]
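As a small numeric illustration of that repelem call (the values below are made up for the example and are not taken from the answer), note how a count of 0 simply drops the corresponding element:

```matlab
v = [2 1 2 1];        % elements to repeat
n = [1 2 0 1];        % how many times to repeat each element
out = repelem(v, n);  % 2 once, 1 twice, the second 2 dropped, 1 once
disp(out)             % 2 1 1 1
```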
As a first step, I'll generate random values for the counts corresponding to the a values. In this example, that would be:
a_grouping = [1 2 1 1 1 1 2 1]
First, randomly select the number of 2's in the grouping vector. There can be at most num_total/2 of them, where num_total is the number of a's. Then add 1's to make up the desired total.
num_total = 10; % number of a's (and b's)
% There can be 0..num_total/2 groups of two a's in the string.
two_count = randi(num_total/2 + 1) - 1;
% The remainder of the groups of a's have length one.
one_count = num_total - (2*two_count);
% Build a vector of two_count 2's and one_count 1's (not shuffled yet)
a_grouping = [repmat(2, 1, two_count), repmat(1, 1, one_count)];
This will give us something like:
a_grouping = [2 2 1 1 1 1 1 1]
Now shuffle:
a_grouping = a_grouping(randperm(numel(a_grouping)));
With the result:
a_grouping = [1 2 1 1 1 1 2 1]
Now we need to figure out where the b values go. There must be at least one b between each run of a values (and at most two), and there may be 0, 1 or 2 b values at the beginning and end of the string. So we need to generate counts for each of the x and y values below:
all_grouping = [y 1 x 2 x 1 x 1 x 1 x 1 x 2 x 1 y]
The x values must be at least 1, so we'll assign them first. Since the y values can be either 0, 1 or 2, we'll leave them set to 0.
% Between each grouping of a's, there must be at least one b.
% There can be 0, 1, or 2 b's before and after the a's,
% so make room for them as well.
b_grouping = zeros(1, numel(a_grouping) - 1 + 2);
b_grouping(2:end-1) = 1; % insert one b between each a group
For each of the remaining counts we need to assign, just select a random slot. If it's not full yet (i.e. if it's < 2), increment the count; otherwise, find a different slot.
% Assign location of remaining 2's
for s = numel(a_grouping):num_total
unassigned = true;
while unassigned
% generate random indices until we find one that's open
idx = randi(numel(b_grouping));
if b_grouping(idx) < 2
b_grouping(idx) = b_grouping(idx) + 1;
unassigned = false;
end
end
end
Now we've got separate counts for the a's and b's:
a_grouping = [1 2 1 1 1 1 2 1]
b_grouping = [1 1 1 2 2 1 1 1 0]
We'll build the value vector (v from the start of the example) and interleave the groupings (the n vector).
% Interleave the a and b values
group_values = zeros(1, numel(a_grouping) + numel(b_grouping));
group_values(1:2:end) = 2;
group_values(2:2:end) = 1;
% Interleave the corresponding groupings
all_grouping = zeros(size(group_values));
all_grouping(2:2:end) = a_grouping;
all_grouping(1:2:end) = b_grouping;
Finally, repelem puts everything together:
shuffled_vec = repelem(group_values, all_grouping)
The final result is:
shuffled_vec =
1 2 2 1 1 2 1 1 2 2 1 1 2 2 1 1 2 2 1 2
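One way to confirm that a result like this satisfies the constraint is to measure the run lengths directly. The snippet below is only a sketch of such a check; the names run_starts, run_lengths and is_valid are introduced here for illustration.

```matlab
vec = [1 2 2 1 1 2 1 1 2 2 1 1 2 2 1 1 2 2 1 2]; % result shown above
atmost = 2;                                        % longest run allowed
run_starts = [true, diff(vec) ~= 0];               % true where a new run begins
run_lengths = diff([find(run_starts), numel(vec) + 1]);
is_valid = all(run_lengths <= atmost);             % true if no run is too long
disp(is_valid)
```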
Full code:
num_total = 10; % number of a's (and b's)
% There can be 0..num_total/2 groups of two a's in the string.
two_count = randi(num_total/2 + 1) - 1;
% The remainder of the groups of a's have length one.
one_count = num_total - (2*two_count);
% Generate random permutation of two_count 2's and one_count 1's
a_grouping = [repmat(2, 1, two_count), repmat(1, 1, one_count)];
a_grouping = a_grouping(randperm(numel(a_grouping)));
% disp(a_grouping)
% Between each grouping of a's, there must be at least one b.
% There can be 0, 1, or 2 b's before and after the a's,
% so make room for them as well.
b_grouping = zeros(1, numel(a_grouping) - 1 + 2);
b_grouping(2:end-1) = 1; % insert one b between each a group
% Assign location of remaining 2's
for s = numel(a_grouping):num_total
unassigned = true;
while unassigned
% generate random indices until we find one that's open
idx = randi(numel(b_grouping));
if b_grouping(idx) < 2
b_grouping(idx) = b_grouping(idx) + 1;
unassigned = false;
end
end
end
% Interleave the a and b values
group_values = zeros(1, numel(a_grouping) + numel(b_grouping));
group_values(1:2:end) = 2;
group_values(2:2:end) = 1;
% Interleave the corresponding groupings
all_grouping = zeros(size(group_values));
all_grouping(2:2:end) = a_grouping;
all_grouping(1:2:end) = b_grouping;
shuffled_vec = repelem(group_values, all_grouping)
This should generate fairly random non-consecutive vectors. Whether it covers all possibilities uniformly, I'm not sure.
out=[];
for i=1:10
if randi(2)==1
out=[out,1,2];
else
out=[out,2,1];
end
end
disp(out)
Example results
1,2,1,2,1,2,1,2,2,1,2,1,1,2,2,1,2,1,1,2,
2,1,2,1,1,2,2,1,2,1,1,2,2,1,2,1,1,2,1,2,
1,2,2,1,2,1,1,2,1,2,2,1,2,1,2,1,2,1,2,1,
2,1,2,1,2,1,1,2,1,2,2,1,1,2,2,1,2,1,1,2,
2,1,1,2,1,2,1,2,2,1,2,1,1,2,1,2,1,2,1,2,
2,1,1,2,2,1,2,1,1,2,1,2,2,1,1,2,1,2,2,1,
1,2,2,1,1,2,1,2,2,1,2,1,2,1,1,2,2,1,1,2,
2,1,1,2,2,1,2,1,1,2,2,1,1,2,1,2,1,2,1,2,
1,2,2,1,1,2,1,2,2,1,2,1,1,2,1,2,1,2,1,2,
1,2,1,2,2,1,2,1,2,1,2,1,1,2,1,2,2,1,2,1,

Sort elements of rows in a matrix with another matrix

I have a matrix D of distances between 3 places and 4 persons.
For example, D(2,3) = 10 means that person 3 is 10 units away from place 2.
D=[23 54 67 32
32 5 10 2
3 11 13 5]
Another matrix A has the same number of rows (3 places), where A(i,:) lists the persons that picked place i.
For example, persons 1 and 3 picked place 1,
no one picked place 2,
and persons 2 and 4 picked place 3:
A=[1 3 0
0 0 0
2 4 0]
I want to reorder each row of A by the persons who are closest to the place it represents.
In this example, for place 1, person 1 is closer to it than person 3 according to D, so there is nothing to do.
There is nothing to do for place 2 either,
and there is a change for place 3, since person 4 is closer to place 3 than person 2: D(3,2)>D(3,4).
The result should be
A=[1 3
0 0
4 2 ]
Each row (place) in A can have zero or more nonzero elements in it (the persons that picked it).
Basically, I want to reorder the elements in each row of A based on the rows of D (the closest to the location comes first), something like this, except that here A and D do not have the same size (number of columns):
[SortedD,Ind] = sort(D,2)
for r = 1:size(A,1)
A(r,:) = A(r,Ind(r,:));
end
There is another Matlab function, sortrows(C,column_index), that can do the trick. It sorts rows based on the elements in a particular column. So if you transpose your matrix A (C = A') and append the column according to which you want to sort a given row, you will get what you want.
To be more specific, you can do something like this:
clear all
D=[23 54 67 32;
32 5 10 2;
3 11 13 5];
A=[1 0;
3 0;
4 2 ];
% Sort elements in each row of the matrix A,
% because indices of elements in each row of the matrix D are always
% ascending.
A_sorted = sort(A,2);
% shifting all zeros in each row to the end
for i = 1:length(A_sorted(:,1))
num_zeros = sum(A_sorted(i,:)==0);
if num_zeros < length(A_sorted(i,:))
z = zeros(1,num_zeros);
A_sorted(i,:) = [A_sorted(i,num_zeros+1:length(A_sorted(i,:))) z];
end;
end;
% Preallocate an array for the elements of D that correspond to A_sorted.
% The matrix Dr is just a reduced version of the matrix D.
Dr = zeros(length(A_sorted(:,1)),length(A_sorted(1,:)));
% Create a matrix Dr of elements in D corresponding to the matrix A_sorted.
for i = 1:length(A_sorted(:,1)) % i = 1:3
for j = 1:length(A_sorted(1,:)) % j = 1:2
if A_sorted(i,j) == 0
Dr(i,j) = 0;
else
Dr(i,j) = D(i,A_sorted(i,j));
end;
end;
end;
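% In order to use the function SORTROWS, we need to transpose the matrices.
% Use A_sorted rather than A, so that each person stays paired with the
% distance stored at the same position in Dr.
A = A_sorted';
Dr = Dr';
% We don't need the matrix A_sorted anymore
clear A_sorted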
% The actual sorting procedure starts here.
for i = 1:length(A(1,:)) % i = 1:3
C = zeros(length(A(:,1)),2); % buffer matrix
C(:,1) = A(:,i);
C(:,2) = Dr(:,i);
C = sortrows(C,2);
A(:,i) = C(:,1);
% shifting all zeros in each column to the end
num_zeros = sum(A(:,i)==0);
if num_zeros < length(A(:,i))
z = zeros(num_zeros,1); % column of zeros, to match the column A(:,i)
A(:,i) = [A(num_zeros+1:length(A(:,i)),i); z];
end;
end;
% Transpose the matrix A back
A = A';
clear C Dr z
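For comparison, here is a shorter sketch of the same reordering, written directly against the D and A from the question. It is an alternative to the sortrows approach above, not a restatement of it, and it assumes that zeros in A only act as padding.

```matlab
D = [23 54 67 32;
     32  5 10  2;
      3 11 13  5];
A = [1 3 0;
     0 0 0;
     2 4 0];
for r = 1:size(A,1)
    persons = A(r, A(r,:) ~= 0);        % persons who picked place r
    [~, order] = sort(D(r, persons));   % nearest person first
    % put the reordered persons first and pad the rest of the row with zeros
    A(r,:) = [persons(order), zeros(1, size(A,2) - numel(persons))];
end
disp(A)
```

Only the third row changes for this input: person 4 (distance 5) moves ahead of person 2 (distance 11).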

For each element in vector, sum previous n elements

I am trying to write a function that, for each element, sums the previous n elements:
v = [1 1 1 1 1 1];
res = sumLastN(v,3);
res = [0 0 3 3 3 3];
Until now, I have written the following function
function [res] = sumLastN(vec,ppts)
if iscolumn(vec)~=1
error('First argument must be a column vector')
end
sz_x = size(vec,1);
res = zeros(sz_x,1);
if sz_x > ppts
for jj = 1:ppts
res(ppts:end,1) = res(ppts:end,1) + ...
vec(jj:end-ppts+jj,1);
end
% for jj = ppts:sz_x
% res(jj,1) = sum(vec(jj-ppts+1:jj,1));
% end
end
end
There are around 2000 vectors of about 1 million elements each, so I was wondering if anyone could give me any advice on how to speed up the function.
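Using cumsum should be much faster (written here for a row vector vec with at least ppts elements):
function [res] = sumLastN(vec,ppts)
w = cumsum(vec);
% res(ppts) = w(ppts), and res(k) = w(k) - w(k-ppts) for k > ppts
res = [zeros(1,ppts-1), w(ppts:end) - [0, w(1:end-ppts)]];
end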
You basically want a moving average filter, just without the averaging.
Use a digital filter:
n = 3;
v = [1 1 1 1 1 1];
res = filter(ones(1,n),1,v)
res =
1 2 3 3 3 3
I don't get why the first two elements should be zero, but why not:
res(1:n-1) = 0
res =
0 0 3 3 3 3
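The cumsum and filter approaches can be cross-checked on a vector whose entries differ, so that an indexing slip would show up. This is only a sketch: the test vector is made up, and the cumulative-sum expression below is one possible way of writing the windowed difference.

```matlab
n = 3;
v = [4 1 7 2 9 3];                 % made-up test data (row vector)
w = cumsum(v);
% windowed sum from cumulative sums: w(k) - w(k-n), with w(0) taken as 0
res_cumsum = [zeros(1,n-1), w(n:end) - [0, w(1:end-n)]];
% same thing with a length-n filter of ones, zeroing the first n-1 outputs
res_filter = filter(ones(1,n), 1, v);
res_filter(1:n-1) = 0;
disp(res_cumsum)                   % 0 0 12 10 18 14
disp(isequal(res_cumsum, res_filter))
```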

Replacing zeros (or NANs) in a matrix with the previous element row-wise or column-wise in a fully vectorized way

I need to replace the zeros (or NaNs) in a matrix with the previous element row-wise, so basically I need this Matrix X
[0,1,2,2,1,0;
5,6,3,0,0,2;
0,0,1,1,0,1]
To become like this:
[0,1,2,2,1,1;
5,6,3,3,3,2;
0,0,1,1,1,1],
Please note that if the first element of a row is zero, it stays that way.
I know that this has been solved for a single row or column vector in a vectorized way and this is one of the nicest way of doing that:
id = find(X);
X(id(2:end)) = diff(X(id));
Y = cumsum(X)
The problem is that linear indexing of a matrix in Matlab/Octave is consecutive and increments column-wise, so this works for a single row or column, but the same concept cannot be applied directly to multiple rows, because each row/column starts fresh and must be treated as independent. I've tried my best and searched extensively, but couldn't find a way out. If I apply that same idea in a loop it gets too slow, because my matrices contain at least 3000 rows. Can anyone help me with this, please?
Special case when zeros are isolated in each row
You can do it using the two-output version of find to locate the zeros and NaN's in all columns except the first, and then using linear indexing to fill those entries with their row-wise preceding values:
[ii jj] = find( (X(:,2:end)==0) | isnan(X(:,2:end)) );
X(ii+jj*size(X,1)) = X(ii+(jj-1)*size(X,1));
General case (consecutive zeros are allowed on each row)
X(isnan(X)) = 0; %// handle NaN's and zeros in a unified way
aux = repmat(2.^(1:size(X,2)), size(X,1), 1) .* ...
[ones(size(X,1),1) logical(X(:,2:end))]; %// positive powers of 2 or 0
col = floor(log2(cumsum(aux,2))); %// col index
ind = bsxfun(@plus, (col-1)*size(X,1), (1:size(X,1)).'); %'// linear index
Y = X(ind);
The trick is to make use of the matrix aux, which contains 0 if the corresponding entry of X is 0 and its column number is greater than 1; or else contains 2 raised to the column number. Thus, applying cumsum row-wise to this matrix, taking log2 and rounding down (matrix col) gives the column index of the rightmost nonzero entry up to the current entry, for each row (so this is a kind of row-wise "cumulative max" function). It only remains to convert from column number to linear index (with bsxfun; could also be done with sub2ind) and use that to index X.
This is valid for moderate sizes of X only. For large sizes, the powers of 2 used by the code quickly approach realmax and incorrect indices result.
Example:
X =
0 1 2 2 1 0 0
5 6 3 0 0 2 3
1 1 1 1 0 1 1
gives
>> Y
Y =
0 1 2 2 1 1 1
5 6 3 3 3 2 3
1 1 1 1 1 1 1
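If the powers of 2 are a concern for wide matrices, the same column index can be obtained with a true cumulative maximum. The following is a sketch of that variation, assuming cummax is available (it is not in older MATLAB releases); the names keep, nr and nc are introduced here for illustration.

```matlab
X = [0 1 2 2 1 0 0;
     5 6 3 0 0 2 3;
     1 1 1 1 0 1 1];
X(isnan(X)) = 0;                               % handle NaN's and zeros in a unified way
[nr, nc] = size(X);
keep = [true(nr,1), X(:,2:end) ~= 0];          % entries that keep their own value
col = cummax(bsxfun(@times, keep, 1:nc), 2);   % rightmost kept column so far, per row
ind = bsxfun(@plus, (col-1)*nr, (1:nr).');     % convert (row, col) to linear index
Y = X(ind);
disp(Y)
```

Because col holds plain column numbers instead of exponents, the result does not depend on how close 2^size(X,2) gets to realmax.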
You can generalize your own solution as follows:
Y = X.'; %'// Make a transposed copy of X
Y(isnan(Y)) = 0;
idx = find([ones(1, size(X, 1)); Y(2:end, :)]);
Y(idx(2:end)) = diff(Y(idx));
Y = reshape(cumsum(Y(:)), [], size(X, 1)).'; %'// Reshape back into a matrix
This works by treating the input data as a long vector, applying the original solution and then reshaping the result back into a matrix. The first column is always treated as non-zero so that the values don't propagate throughout rows. Also note that the original matrix is transposed so that it is converted to a vector in row-major order.
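The single-vector trick that this builds on can be traced on a small made-up example: each nonzero entry after the first is replaced by its difference from the previous nonzero entry, so that the cumulative sum steps to the new value and holds it across the zeros.

```matlab
x = [5 0 0 3 0 2];            % made-up row vector with zeros to fill
id = find(x);                 % positions of the nonzero entries: 1 4 6
x(id(2:end)) = diff(x(id));   % x becomes [5 0 0 -2 0 -1]
y = cumsum(x);                % running sum holds each value until the next step
disp(y)                       % 5 5 5 3 3 2
```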
Modified version of Eitan's answer to avoid propagating values across rows:
Y = X'; %'
tf = Y > 0;
tf(1,:) = true;
idx = find(tf);
Y(idx(2:end)) = diff(Y(idx));
Y = reshape(cumsum(Y(:)),fliplr(size(X)))';
x=[0,1,2,2,1,0;
5,6,3,0,1,2;
1,1,1,1,0,1];
%Do it column by column is easier
x=x';
rm=0;
while 1
%fields to replace
l=(x==0);
%do nothing for the first row/column
l(1,:)=0;
rm2=sum(sum(l));
if rm2==rm
%nothing to do
break;
else
rm=rm2;
end
%replace zeros
x(l) = x(find(l)-1);
end
x=x';
I have a function I use for a similar problem for filling NaNs. This can probably be cut down or sped up further; it's extracted from pre-existing code that has a bunch more functionality (forward/backward filling, maximum distance, etc.).
X = [
0 1 2 2 1 0
5 6 3 0 0 2
1 1 1 1 0 1
0 0 4 5 3 9
];
X(X == 0) = NaN;
Y = nanfill(X,2);
Y(isnan(Y)) = 0
function y = nanfill(x,dim)
if nargin < 2, dim = 1; end
if dim == 2, y = nanfill(x',1)'; return; end
i = find(~isnan(x(:)));
j = 1:size(x,1):numel(x);
j = j(ones(size(x,1),1),:);
ix = max(rep([1; i],diff([1; i; numel(x) + 1])),j(:));
y = reshape(x(ix),size(x));
function y = rep(x,times)
i = find(times);
if length(i) < length(times), x = x(i); times = times(i); end
i = cumsum([1; times(:)]);
j = zeros(i(end)-1,1);
j(i(1:end-1)) = 1;
y = x(cumsum(j));

matlab: simple matrix filtering - group size

I have a huge matrix storing information about the X and Y coordinates of multiple particle trajectories, which in a simplified version looks like this:
col 1 - track number; col 2 - frame number; col 3 - coordinate X; col 4 - coordinate Y
for example:
A =
1 1 5.14832 3.36128
1 2 5.02768 3.60944
1 3 4.85856 3.81616
1 4 5.17424 4.08384
2 1 2.02928 18.47536
2 2 2.064 18.5464
3 1 8.19648 5.31056
3 2 8.04848 5.33568
3 3 7.82016 5.29088
3 4 7.80464 5.31632
3 5 7.68256 5.4624
3 6 7.62592 5.572
Now I want to filter out trajectories shorter than, let's say, 4 frames and keep the rest, like this (note the renumbering of trajectories):
B =
1 1 5.14832 3.36128
1 2 5.02768 3.60944
1 3 4.85856 3.81616
1 4 5.17424 4.08384
2 1 8.19648 5.31056
2 2 8.04848 5.33568
2 3 7.82016 5.29088
2 4 7.80464 5.31632
2 5 7.68256 5.4624
2 6 7.62592 5.572
How can I do this efficiently? I can think of some ideas using a for loop and vertcat, but that's the slowest solution ever :/
Thanks!
This will filter out those trajectories of length less than 4:
[v, u1, w] = unique(A(:, 1), 'last');
[~, u2, ~] = unique(A(:, 1), 'first');
keys = v(find(u1 - u2 >= 3));
B = A(ismember(A(:, 1), keys), :);
This will re-number them:
[~, ~, B(:, 1)] = unique(B(:, 1));
Here is a slightly different solution from that of @Ansari:
t = 1:max(A(:,1)); %# possible track numbers
tt = t( histc(A(:,1),t) >= 4 ); %# tracks with >= 4 frames
B = A(ismember(A(:,1),tt),:); %# filter rows
[~,~,B(:,1)] = unique(B(:,1)); %# renumber track numbers
Another way to compute the indices variable tt in my code above:
tt = find( accumarray(A(:,1), 1, [], @(x)numel(x)>=4) );
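A further variation, sketched below on the sample data from the question, skips ismember by looking up each row's track length directly. It assumes the track numbers in column 1 are positive integers, so that they can be used as indices into the counts vector.

```matlab
A = [1 1 5.14832  3.36128;
     1 2 5.02768  3.60944;
     1 3 4.85856  3.81616;
     1 4 5.17424  4.08384;
     2 1 2.02928 18.47536;
     2 2 2.064   18.5464;
     3 1 8.19648  5.31056;
     3 2 8.04848  5.33568;
     3 3 7.82016  5.29088;
     3 4 7.80464  5.31632;
     3 5 7.68256  5.4624;
     3 6 7.62592  5.572];
counts = accumarray(A(:,1), 1);      % number of frames per track number
keep = counts(A(:,1)) >= 4;          % per-row flag: does this row's track have >= 4 frames?
B = A(keep, :);                      % filter rows
[~, ~, B(:,1)] = unique(B(:,1));     % renumber track numbers as 1, 2, ...
disp(B)
```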