analyzing sequences matlab - matlab

I am trying to write a short MATLAB function that will receive a vector and return the index of the first element of the longest sequence of 1s (I can assume that the vector consists only of 1s and 0s). For example:
IndexLargeSeq([110001111100000000001111111111110000000000000000000000000000000])
will return 21 - which is the index of the first 1 of the longest sequence of 1s.
thank you
ariel

There you go:
% input:
A = [0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 1 1 1]';
% replace 0 with 2 because the next command doesn't work with '0' as values
A(A == 0) = 2;
% accumulate data sets
B = [A(diff([A; 0]) ~= 0), diff(find(diff([0; A; 0])))];
% maximize second column where first column == 1
maxSeq = max(B(B(:, 1) == 1, 2));
% get row of B where first column == 1 && second column == maxSeq
row = find(B(:,1) == 1 & B(:,2) == maxSeq, 1);
% calculate the index of the first 1s of this longest sequence:
idx = sum(B(1:(row-1),2)) + 1
idx is then the value (the index) you are looking for, and maxSeq is the length of this sequence of 1s. A has to be a column vector (note the transpose in the input line and the vertical concatenation [A; 0]).
If you want to understand how the datasets are accumulated (the command B = ...), look here: How to accumulate data-sets?.
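To make the accumulation step concrete, here is a small worked trace of the same commands on a shorter vector. The 9-element input is made up for illustration and is not from the question; the values in the comments follow from stepping through the commands by hand.

```matlab
% Hypothetical short input (column vector, as the code above requires)
A = [1 1 0 0 0 1 1 1 0]';
A(A == 0) = 2;   % zeros become 2 so leading/trailing runs differ from the 0 padding

% Column 1: value of each run; column 2: length of each run
B = [A(diff([A; 0]) ~= 0), diff(find(diff([0; A; 0])))];
% B is [1 2; 2 3; 1 3; 2 1]: two 1s, three 0s, three 1s, one 0

maxSeq = max(B(B(:, 1) == 1, 2));                 % 3
row = find(B(:, 1) == 1 & B(:, 2) == maxSeq, 1);  % 3 (the third run)
idx = sum(B(1:(row-1), 2)) + 1;                   % 2 + 3 + 1 = 6
```

The lengths of all runs before the winning one are summed, so idx lands on the first element of that run.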

Here is another option that measures the distances between the indices of the 0s. The code handles the case where there are no 1s at all (it returns an empty vector) and the case where several sequences share the longest length. x is an input row vector.
idx = find([1 ~x 1]); %# indices of 0s +1
idxdiff = diff(idx); %# lengths of sequences (+1)
maxdiff = max(idxdiff);
if maxdiff == 1
maxseqidx = []; %# no 1s at all
else
%# find all longest sequences, may be more than one
maxidx = find(idxdiff == maxdiff);
maxseqidx = idx(maxidx);
end
disp(maxseqidx)
EDIT: If x can be either row or column vector, you can change the first line to
idx = find([1; ~x(:); 1]);
The output will be a column vector in this case.
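As a quick check of the tie handling, here is the same idea traced on a made-up row vector that contains two longest runs. The input is hypothetical, and the guard for the "no 1s at all" case is left out to keep the trace short.

```matlab
x = [1 1 0 0 0 1 1 1 0 1 1 1];        % hypothetical input, two runs of length 3
idx = find([1 ~x 1]);                 % [1 4 5 6 10 14]: positions of 0s, shifted by one
idxdiff = diff(idx);                  % [3 1 1 4 4]: run lengths plus one
maxdiff = max(idxdiff);               % 4, so the longest run has 3 ones
maxseqidx = idx(idxdiff == maxdiff);  % [6 10]: start of each longest run
```

Because the padded vector shifts every zero one position to the right, idx already points at the first 1 after each zero, which is why no further offset is needed.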

Related

return the index of the last K non-zero element of each row of a matrix

Is there a vectorization way of returning the index of the last K nonzero elements of each row of a matrix?
For example, my matrix only contains 0 and 1 and the last column of each row is always 1. Then I want to find the index of the last K, where K>1, nonzero elements of each row. If a row only has M (less than K) nonzero elements, then the index for that row is just the index of the last M nonzero element. e.g.
A = [0 1 0 1;
1 1 0 1;
1 1 1 1;
0 0 0 1]
And my K = 2, then I expected to return a matrix such that
B = [0 1 0 1;
0 1 0 1;
0 0 1 1;
0 0 0 1]
Namely B is originally a zero matrix with same shape as A, then it copies each row of A where the corresponding column starts from the index of the last K non-zero element of the row of A (and if in one row of A there is only M < K non-zero element, then it starts from the index of the last M non-zero element of that row of A)
Knowing that elements are only 0 or 1, you can make a mask using cumsum on the flipped matrix A and throw away values with a cumulative sum greater than k:
A = [0 1 0 1;1 1 0 1;1 1 1 1;0 0 0 1]
k = 2;
C = fliplr(cumsum(fliplr(A), 2)); % take the cumulative sum backwards across rows
M = (C <= k); % cumsum <= k includes 0 elements too, so...
B = A .* M % multiply original matrix by mask
As mentioned in the comments (thanks @KQS!), if you're using a recent version of MATLAB, cumsum accepts an optional direction parameter, so the line that generates C can be shortened to:
C = cumsum(A, 2, 'reverse');
Results:
A =
0 1 0 1
1 1 0 1
1 1 1 1
0 0 0 1
B =
0 1 0 1
0 1 0 1
0 0 1 1
0 0 0 1
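For readers who want to see why the mask works, the intermediate matrices for this example can be written out. This is only a trace of the code above, using the fliplr form so it does not depend on the 'reverse' option.

```matlab
A = [0 1 0 1; 1 1 0 1; 1 1 1 1; 0 0 0 1];
k = 2;

C = fliplr(cumsum(fliplr(A), 2));
% C = [2 2 1 1; 3 2 1 1; 4 3 2 1; 1 1 1 1]
% each entry counts the nonzeros from that column to the end of the row

M = (C <= k);
% M = [1 1 1 1; 0 1 1 1; 0 0 1 1; 1 1 1 1]
% true wherever at most k nonzeros lie at or to the right of the entry

B = A .* M;   % zeros in A stay zero, so only the last k ones survive
```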
Knowing that find can return the indices of the last k nonzero elements, we can use bsxfun to apply find to each row of the matrix and see which elements in each row satisfy the condition. find is then used again to extract the rows and columns of the nonzero elements of the resulting matrix, which reduces the size of the data and the complexity of the operations. The result is stored in a sparse matrix and then converted to a full matrix:
A = [0 1 0 1;
1 1 0 1;
1 1 1 1;
0 0 0 1]
k = 2;
[row , col]= size(A);
last_nz = bsxfun(@(a,b)find(a,b,'last'),A',(repmat(k, 1, row))); %get indices of last two nonzero elements for each row
[~,rr,cc]=find(last_nz); %get columns and values (column indices in A) of the corresponding elements for the whole matrix
B = full(sparse(rr,cc,1));
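If the bsxfun call is hard to follow, the same use of find(..., k, 'last') can be sketched with a plain loop over rows. This is an illustrative alternative, not the answer's method, and it gives up vectorization in exchange for readability.

```matlab
A = [0 1 0 1; 1 1 0 1; 1 1 1 1; 0 0 0 1];
k = 2;

B = zeros(size(A));
for r = 1:size(A, 1)
    c = find(A(r, :), k, 'last');  % columns of the last k nonzeros (fewer if the row has fewer)
    B(r, c) = 1;                   % A holds only 0/1, so writing 1 copies the entries
end
```

Rows with fewer than k nonzero elements need no special case, because find simply returns as many indices as it can.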

How to find a non-zero number between two zeros in a cell array in matlab

I have a cell array (11000x500) with three different type of elements.
1) Non-zero doubles
2) zero
3) Empty cell
I would like to find all occurrences of a non-zero number between two zeros.
E.g. A = {123 13232 132 0 56 0 12 0 0 [] [] []};
I need the following output
out = logical([0 0 0 0 1 0 1 0 0 0 0 0]);
I used cellfun and isequal like this
out = cellfun(@(c)(~isequal(c,0)), A);
and got the following output
out = logical([1 1 1 0 1 0 1 0 0 1 1 1]);
I need help with the next step, where I can ignore the consecutive 1s and keep only the 1s that sit between two 0s.
Could someone please help me with this?
Thanks!
Here is a quick way to do it (and other manipulations of binary data) using your out:
out = logical([1 1 1 0 1 0 1 0 0 1 1 1]);
d = diff([out(1) out]); % find all switches between 1 to 0 or 0 to 1
len = 1:length(out); % make a list of all indices in 'out'
idx = [len(d~=0)-1 length(out)]; % the index of the end of each group
counts = [idx(1) diff(idx)]; % the number of elements in the group
elements = out(idx); % the type of element (0 or 1)
singles = idx(counts==1 & elements==1)
and you will get:
singles =
5 7
from here you can continue and create the output as you need it:
out = false(size(out)); % create an output vector
out(singles) = true % fill with '1' by singles
and you get:
out =
0 0 0 0 1 0 1 0 0 0 0 0
You can use conv to find the elements with 0 neighbors (notice that the ~ has been removed from isequal):
out = cellfun(@(c)(isequal(c,0)), A); % find 0 elements
out = double(out); % cast to double for conv
% elements that have more than one 0 neighbor
between0 = conv(out, [1 -1 1], 'same') > 1;
between0 =
0 0 0 0 1 0 1 0 0 0 0 0
(Convolution kernel corrected to fix a bug found by @TasosPapastylianou where 3 consecutive zeros would result in True.)
That's if you want a logical vector. If you want the indices, just add find:
between0 = find(conv(out, [1 -1 1], 'same') > 1);
between0 =
5 7
Here is another solution. It avoids your initial logical matrix completely, since I don't think you need it.
A = {123 13232 132 0 56 0 12 0 0 [] [] []};
N = length(A);
B = A; % helper array
for I = 1 : N
if isempty (B{I}), B{I} = nan; end; % convert empty cells to nans
end
B = [nan, B{:}, nan]; % pad, and collect into array
C = zeros (1, N); % preallocate your answer array
for I = 1 : N;
if ~any (isnan (B(I:I+2))) && isequal (logical (B(I:I+2)), logical ([0,1,0]))
C(I) = 1;
end
end
C = logical(C)
C =
0 0 0 0 1 0 1 0 0 0 0 0
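The three approaches above can also be expressed with two logical masks and shifted comparisons, which keeps empty cells from ever counting as a "number". This is a sketch under the assumption that A is a 1-by-N cell row; the names isZero and isNum are made up for illustration.

```matlab
A = {123 13232 132 0 56 0 12 0 0 [] [] []};

isZero = cellfun(@(c) isequal(c, 0), A);                  % cells holding exactly 0
isNum  = cellfun(@(c) ~isempty(c) && ~isequal(c, 0), A);  % non-empty, non-zero cells

% An interior element qualifies when it is a number and both neighbours are zeros
between0 = false(size(A));
between0(2:end-1) = isZero(1:end-2) & isNum(2:end-1) & isZero(3:end);
```

The first and last positions can never qualify, since they lack a neighbour on one side, so they are left false.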

How can I vectorise this loop in MATLAB

I have a loop that iterates over a matrix and sets all rows and columns with only one non-zero element to all zeroes.
so for example, it will transform this matrix:
A = [ 1 0 1 1
0 0 1 0
1 1 1 1
1 0 1 1 ]
to the matrix:
A' = [ 1 0 1 1
0 0 0 0
1 0 1 1
1 0 1 1 ]
row/column 2 of A only has 1 non zero element in it, so every element in row/column 2 is set to 0 in A'
(it is assumed that the matrices will always be diagonally symmetrical)
here is my non-vectorised code:
for ii = 1:length(A)
if nnz(A(ii,:)) == 1
A(ii,:) = 0;
A(:,ii) = 0;
end
end
Is there a more efficient way of writing this code in MATLAB?
EDIT:
I have been asked in the comments for some clarification, so I will oblige.
The purpose of this code is to remove edges from a graph that lead to a vertex of degree 1.
If A is the adjacency matrix representing an undirected graph G, then a row or column of that matrix with only one non-zero element represents a vertex of degree one, as it has only one edge incident to it.
My objective is to remove such edges from the graph, as these vertices will never be visited in a solution to the problem I am trying to solve, and reducing the graph will also reduce the size of the input to my search algorithm.
@TimeString, I understand that in the example you gave, recursively applying the algorithm to your matrix will result in a zero matrix; however, the matrices that I am applying it to represent large, connected graphs, so there will never be a case like that. In response to your question as to why I only check how many elements are in a row but then clear both columns and rows: this is because the matrix is always diagonally symmetrical, so I know that if something is true for a row, it will also be true for the corresponding column.
so, just to clarify using another example:
I want to turn this graph G:
represented by matrix:
A = [ 0 1 1 0
1 0 1 0
1 1 0 1
0 0 1 0 ]
to this graph G':
represented by this matrix:
A' = [ 0 1 1 0
1 0 1 0
1 1 0 0
0 0 0 0 ]
(i realise that this matrix should actually be a 3x3 matrix because point D has been removed, but i already know how to shrink the matrix in this instance, my question is about efficiently setting columns/rows with only 1 non-zero element all to 0)
i hope that is a good enough clarification..
Not sure if it's really faster (depends on Matlab's JIT) but you can try the following:
To find out which columns (equivalently, rows, since the matrix is symmetric) have more than one non zero element use:
sum(A ~= 0) > 1
The ~= 0 is probably not needed in your case since the matrix consists of 1/0 elements only (graph edges if I understand correctly).
Transform the above into a diagonal matrix in order to eliminate unwanted columns:
D = diag(sum(A~=0) > 1)
And multiply with A from left to zero rows and from right to zero columns:
res = D * A * D
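Applied to the 4-vertex graph from the question's clarification, the steps look as follows. The explicit double() conversion is an addition made here so the multiplication operates on ordinary numeric matrices; it is not part of the answer.

```matlab
A = [0 1 1 0; 1 0 1 0; 1 1 0 1; 0 0 1 0];

keep = sum(A ~= 0) > 1;    % [1 1 1 0]: vertex 4 has degree 1
D = diag(double(keep));    % identity with a 0 on the diagonal for vertex 4
res = D * A * D;           % left factor zeros row 4, right factor zeros column 4
```

The result matches the matrix A' given in the question for graph G'.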
Thanks to nimrodm's suggestion of using sum(A ~= 0) instead of nnz, i managed to find a better solution than my original one
to clear the rows with one element i use:
A(sum(A ~= 0) == 1,:) = 0;
and then to clear columns with one element:
A(:,sum(A ~= 0) == 1) = 0;
for those of you who are interested, i did a 'tic-toc' comparison on a 1000 x 1000 matrix:
% establish matrix
A = magic(1000);
rem_rows = [200,555,950];
A(rem_rows,:) = 0;
A(:,rem_rows) = 0;
% insert single element into empty rows/columns
A(rem_rows,500) = 5;
A(500,rem_rows) = 5;
% testing original version
A_temp = A;
for test = 1
tic
for ii = 1:length(A_temp)
if nnz(A_temp(ii,:)) == 1
A_temp(ii,:) = 0;
A_temp(:,ii) = 0;
end
end
toc
end
Elapsed time is 0.041104 seconds.
% testing new version
A_temp = A;
for test = 1
tic
A_temp(sum(A_temp ~= 0) == 1,:) = 0;
A_temp(:,sum(A_temp ~= 0) == 1) = 0;
toc
end
Elapsed time is 0.010378 seconds
% testing matrix operations based solution suggested by nimrodm
A_temp = A;
for test = 1
tic
B = diag(sum(A_temp ~= 0) > 1);
res = B * A_temp * B;
toc
end
Elapsed time is 0.258799 seconds
so it appears that the single line version that I came up with, inspired by nimrodm's suggestion, is the fastest
thanks for all your help!
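One caveat about the two-line version: its second line recomputes the counts after the rows have already been cleared, so a column that started with two entries can drop to one and be cleared without its matching row. The sketch below uses a made-up 5-vertex graph (edges 1-2, 2-3, 3-4, 3-5, 4-5) to show this, together with a variant that computes the mask once. Note that the mask-once variant does a single, non-cascading pass, so it is not identical to the original loop either; which behaviour is wanted depends on the application.

```matlab
A = [0 1 0 0 0; 1 0 1 0 0; 0 1 0 1 1; 0 0 1 0 1; 0 0 1 1 0];

% Two-line version: counts are recomputed between the two assignments
T = A;
T(sum(T ~= 0) == 1, :) = 0;
T(:, sum(T ~= 0) == 1) = 0;
twoLineSymmetric = isequal(T, T.');    % false: T(2,3) is 1 but T(3,2) is 0

% Mask-once variant: decide which vertices to drop before changing anything
m = sum(A ~= 0) == 1;
R = A;
R(m, :) = 0;
R(:, m) = 0;
maskOnceSymmetric = isequal(R, R.');   % true: only vertex 1 is removed
```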
Bsxfuning it -
A(bsxfun(@or,(sum(A~=0,2)==1),(sum(A~=0,1)==1))) = 0
Sample run -
>> A
A =
1 0 1 1
0 0 1 0
1 1 1 1
1 0 1 1
>> A(bsxfun(@or,(sum(A~=0,2)==1),(sum(A~=0,1)==1))) = 0
A =
1 0 1 1
0 0 0 0
1 0 1 1
1 0 1 1

Finding X binary numbers with n non-zero digits for large X. - Matlab

Is there a more efficient method for generating X binary numbers (that have n non-zero digits) for a range of 1 to N? I have developed the following solution:
Totalcombos = nchoosek(N,n);
floorExp = floor(log2(Totalcombos)); % renamed so the variable does not shadow the built-in floor
L = 2.^floorExp;
NumElem = 2^N-1;
i=0;
x=1;
%Creates Index combination LUT
while 1
%Produces Binary from 1 : NumElem
binNum= de2bi(x,N,'right-msb')';
x=x+1;
%Finds number of bits in each binary number
NumOfBits = sum(binNum);
%Creates a matrix of binary numbers from 1:NumElem with n 1's
if NumOfBits == n
i=i+1;
ISmatrixShapes{i} = binNum(:,:);
end
if i==L
break
end
end
ISmatrixShape2=cell2mat(ISmatrixShapes);
ISmatrixShape=ISmatrixShape2(:,1:L)';
Is there a way to generate these values without a massive number of loop iterations?
This generates all N-digit binary numbers that have n ones and N-n zeros:
N = 5;
n = 3;
ind = nchoosek(1:N, n);
S = size(ind,1);
result = zeros(S,N);
result(bsxfun(@plus, (ind-1)*S, (1:S).')) = 1;
It works by generating all combinations of n positions of ones out of the N possible positions (nchoosek line), and then filling those values with 1 using linear indexing (bsxfun line).
The result in this example is
result =
1 1 1 0 0
1 1 0 1 0
1 1 0 0 1
1 0 1 1 0
1 0 1 0 1
1 0 0 1 1
0 1 1 1 0
0 1 1 0 1
0 1 0 1 1
0 0 1 1 1
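The linear-indexing line can also be written with sub2ind, which some readers find easier to follow than the manual (column-1)*S + row arithmetic. This is an equivalent sketch on smaller, made-up sizes (N = 4, n = 2); the helper variable rows is introduced here for illustration.

```matlab
N = 4;
n = 2;
ind = nchoosek(1:N, n);          % each row lists the positions of the ones
S = size(ind, 1);                % number of combinations, here 6

rows = repmat((1:S).', 1, n);    % row number of each entry of ind
result = zeros(S, N);
result(sub2ind([S N], rows, ind)) = 1;
```

Every row of result then contains exactly n ones, and there are nchoosek(N, n) rows in total.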
Another, less efficient approach is to generate all permutations of a vector containing n ones and N-n zeros, and then removing duplicates:
result = unique(perms([ones(1,n) zeros(1,N-n)]), 'rows');

Midpoints of matrix rows depending on certain conditions Matlab

I have a matrix A with size 10x100 as shown below. What I want to do is:
1. I'll work row by row, and for each row I'll check the data in each column of that row.
2. Let's say I'm now in the first col cell in the first row. If the value is zero I'll move to the next col, and so on until I find a col with a non-zero value, and save its col number, e.g. col 3 ("this means that col 1&2 were zeros").
3. Now I'm in the first non-zero col in row1. I'll move to the next col until I find a col with a zero value. I'll fetch the col just before this zero one, which must be a non-zero one, and save it, e.g. col 7 ("this means that col4&5&6 are non-zeros and col8 is zero").
4. Now I want to save the middle col between these two columns, e.g. for col3 and col7 the middle col is col5, so I'll save the index row1_col5. If there are two middle values then either of them is fine.
5. I'll then move to the next col until I find a non-zero col ("do the same steps from 2-->5") until the first row is finished.
6. Move to the next row and start over again from step 2-->5.
7. There are two rules: -The first one is that I'll get the middle index of non-zero consecutive values only if there is a minimum of 3 non-zero consecutive values; if there are two non-zero consecutive values then the middle will not be calculated. -The second one is that if the number of consecutive zero values is less than 3 then they will be ignored and will be considered as non-zero values. E.g. in the example below the first row middle values are col5 and col11. In row2 col5 is counted, while no cols in row3 satisfy these conditions, and in row4 col6 or col7 will be counted.
8. After finishing all the rows I want to have a vector or array holding the positions of all the middle indexes, e.g. row1_col5 row1_col17 row2_col_10 and so on.
example:
A = [ 0 0 0 2 4 1 0 0 0 1 3 2;
0 0 0 5 1 1 1 1 0 0 0 1;
0 3 4 1 0 3 1 2 0 0 1 3;
0 0 0 0 1 3 4 5 0 0 0 0];
for the first row the middle value will be 5 and 11 and so on
Could anyone please advise how I can do this with the least processing? It can be done using loops, but is there a more efficient way of doing it? Please let me know if any clarification is needed.
Now you have clarified your question (again...) here is a solution (still using a for loop...). It includes "rule 7" - excluding runs of fewer than three elements; it also includes the second part of that rule - runs of fewer than three zeros don't count as zero. The new code looks like this:
A = [ 0 0 0 2 4 1 0 0 0 1 3 2;
0 0 0 5 1 1 1 1 0 0 0 1;
0 3 4 1 0 3 1 2 0 0 1 3;
0 0 0 0 1 3 4 5 0 0 0 0];
retVal = cell(1, size(A, 1));
for ri = 1:size(A,1)
temp = [1 0 0 0 A(ri,:) 0 0 0 1]; % pad ends with 3 zeros + 1
% so that is always a "good run"
isz = (temp == 0); % find zeros - pad "short runs of 0" with ones
diffIsZ = diff(isz);
f = find(diffIsZ == 1);
l = find(diffIsZ == -1);
shortRun = find((l-f)<3); % these are the zeros that need eliminating
for ii = 1:numel(shortRun)
temp(f(shortRun(ii))+1:l(shortRun(ii))) = 1;
end
% now take the modified row:
nz = (temp(4:end-3)~=0);
dnz = diff(nz); % find first and last nonzero elements
f = find(dnz==1);
l = find(dnz==-1);
middleValue = floor((f + l)/2);
rule7 = find((l - f) > 2);
retVal{ri} = middleValue(rule7);
end
You have to use a cell array for the return value since you don't know how many elements will be returned per row (per your updated requirement).
The code above returns the following cell array:
{[5 11], [6], [7], [7]}
I appear still not to understand your "rule 7", because you say that "no columns in row 3 satisfy this condition". But it seems to me that once we eliminate the short runs of zeros, it does. Unless I misinterpret how you want to treat a run of non-zero numbers that goes right to the edge (I assume that's OK - which is why you return 11 as a valid column in row 1; so why wouldn't you return 7 for row 3??)
Try this:
sizeA = size(A);
N = sizeA(1);
D = diff([zeros(1, N); (A.' ~= 0); zeros(1,N)]) ~= 0;
[a b] = find(D ~= 0);
c = reshape(a, 2, []);
midRow = floor(sum(c)/2);
midCol = b(1:2:length(b))
After this, midRow and midCol contain the indices of your centroids (e.g. midRow(1) = 1, midCol(1) = 4 for the example matrix you gave above).
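To see what the diff-based edge detection does, here is a single-row sketch using the first row of the question's example matrix. It only applies the "at least 3 consecutive non-zeros" rule and does not merge short runs of zeros, so it is a simplification of the full requirement; the variable names are made up.

```matlab
r = [0 0 0 2 4 1 0 0 0 1 3 2];     % first row of the question's example

nz = (r ~= 0);
d = diff([0 nz 0]);                % +1 where a run starts, -1 just after it ends
first = find(d == 1);              % [4 10]: first column of each run
last = find(d == -1) - 1;          % [6 12]: last column of each run

mid = floor((first + last) / 2);   % [5 11]: middle column of each run
mid = mid(last - first + 1 >= 3);  % keep only runs of at least 3 elements
```

Padding with a zero on both sides guarantees that every run has both a start and an end, even when it touches the edge of the row.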
If you don't mind using a for loop:
A = [ 0 0 1 1 1 0 1;
0 0 0 0 0 0 0;
0 1 1 1 1 0 0;
0 1 1 1 0 1 1;
0 0 0 0 1 0 0]; % data
sol = repmat(NaN,size(A,1),1);
for row = 1:size(A,1)
[aux_row aux_col aux_val] = find(A(row,:));
if ~isempty(aux_col)
sol(row) = aux_col(1) + floor((find(diff([aux_col 0])~=1,1)-1)/2);
% the final 0 is necessary in case the row of A ends with ones
% you can use either "floor" or "ceil"
end
end
disp(sol)
Try it and see if it does what you want. I hope the code is clear; if not, tell me