Unique vectors in cell based on tolerance


I'm trying to write a search algorithm that finds the unique columns of a cell based on a tolerance level. MATLAB's unique function (R2012a) does not accept a tolerance input. Below is the code I have so far; for now I only check uniqueness against the first column (j=1), but this needs to be extended later.
The output: I obtain a store cell which contains all the vectors except the duplicates of [0;1;0]. However, other duplicates are kept (e.g. [1;0;-0.4]).
clear all; close all; clc;
%%
tolerance=1e-6;
U_vector{1} = [0 1 0 1 1 0 1 0 1 1;
1 0 1 0 0 1 0 1 0 0;
0 -0.4238 0 0.4238 -0.4238 0 0.4238 0 0.8161001 -0.8161];
for i = 1:1:size(U_vector,2)
k=1;
store{i}(:,k) = U_vector{i}(:,k);
for j=1;%:1:(size(U_vector{i},2))
for m=j:1:(size(U_vector{i},2))
if (abs(U_vector{i}(:,j)-U_vector{i}(:,m)) >= tolerance)
k=k+1;
store{i}(:,k) = U_vector{i}(:,m);
end
end
end
end

There's an undocumented function to merge similar points, which works on rows too:
>> u = [0 1 0 1 1 0 1 0 1 1;
1 0 1 0 0 1 0 1 0 0;
0 -0.4238 0 0.4238 -0.4238 0 0.4238 0 0.8161001 -0.8161];
>> uMerged = builtin('_mergesimpts',u.',0.3).'
uMerged =
0 1.0000 1.0000 1.0000 1.0000
1.0000 0 0 0 0
0 -0.8161 -0.4238 0.4238 0.8161
Just get u = U_vector{1}; in your case, then pack the result in a cell too (out{1} = uMerged;).
Also, the function can take a vector tolerance indicating a tolerance for each column. From the command line message from this function:
Tolerance must be a scalar or a vector with the same number of columns as the first input 'X'.
So this works too:
uMerged = builtin('_mergesimpts',u.',[eps eps 0.3]).'
BTW: There will probably be an official function for this in the future, but we're not allowed to discuss :).
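For readers on newer releases: to my knowledge, R2015a and later ship a documented uniquetol function that covers this case. The following is a minimal sketch, assuming uniquetol is available in your installation; the variable names uUnique and out are only illustrative. 'DataScale' is set to 1 so that the tolerance is treated as absolute rather than scaled by the largest magnitude in the data.

```matlab
% Sample data from the question: columns are the vectors to deduplicate
u = [0 1 0 1 1 0 1 0 1 1;
     1 0 1 0 0 1 0 1 0 0;
     0 -0.4238 0 0.4238 -0.4238 0 0.4238 0 0.8161001 -0.8161];
tolerance = 1e-6;

% uniquetol compares rows when 'ByRows' is true, so transpose in and out.
% 'DataScale',1 makes the tolerance absolute (assumption: that is what is wanted).
uUnique = uniquetol(u.', tolerance, 'ByRows', true, 'DataScale', 1).';

% Pack the result back into a cell, as in the question
out{1} = uUnique;
```

Note that uniquetol returns the rows in sorted order by default, so the column order of the result will generally differ from the order of first appearance in u.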

You do not need so many nested loops. This works with the sample you provided.
It uses a working table which is reduced as duplicates are found.
for ii = 1:1:size(U_vector,2)
A = U_vector{ii} ; %// create a working copy of the current table
store{ii} = [] ; %// initialize the result cell array
endOfTable = false ;
while ~endOfTable
store{ii}(:,end+1) = A(:,1) ; %// save the first column of the table
idx = logical( sum( abs( bsxfun(@minus,A(:,2:end),A(:,1))) >= tolerance ) ) ; %// find the indices of the columns not within the tolerance
A = A(:, [false idx] ) ; %// remove the duplicate columns in A
if size(A,2) < 2 ; endOfTable = true ; end %// exit loop if we reached the last column
end
%// store last column if it remained unmatched
if size(A,2) == 1
store{ii}(:,end+1) = A(:,1) ;
end
end
Which output with your data:
>> store{1}
ans =
0 1.0000 1.0000 1.0000 1.0000
1.0000 0 0 0 0
0 -0.4238 0.4238 0.8161 -0.8161

What about this?!:
u = U_vector{1}; % U_vector{1} is already a numeric matrix, so cell2mat is not needed
i=1;
while i<=size(u,2)
test=repmat(u(:,i),1,size(u,2)); % compare matrix entries to current column i
same = abs(u-test) < tolerance; % entries that match column i within the tolerance
differentCols = ~all(same); % column indices that are not equal to column i
differentCols(i)=1; % ensure one such column stays in u
u=u(:,differentCols); % new u-> keep different columns
i=i+1; % next column
end
u % print u
Seems to work for me, but no guarantees.

Related

Replace repeated value based on sequence size - Matlab

I have a 2D matrix composed of ones and zeros.
mat = [0 0 0 0 1 1 1 0 0
1 1 1 1 1 0 0 1 0
0 0 1 0 1 1 0 0 1];
I need to find all consecutive repetitions of ones in each row and replace all ones with zeros only when the sequence size is smaller than 5 (5 consecutive ones):
mat = [0 0 0 0 0 0 0 0 0
1 1 1 1 1 0 0 0 0
0 0 0 0 0 0 0 0 0];
Any suggestion on how to approach this problem would be very welcome.

You can use diff to find the start and end points of the runs of 1, and some logic based on that to zero out the runs which are too short. Please see the below code with associated comments
% Input matrix of 0s and 1s
mat = [0 0 0 0 1 1 1 0 0
1 1 1 1 1 0 0 1 0
0 0 1 0 1 1 0 0 1];
% Minimum run length of 1s to keep
N = 5;
% Get the start and end points of the runs of 1. Add in values from the
% original matrix to ensure that start and end points are always paired
d = [mat(:,1),diff(mat,1,2),-mat(:,end)];
% Find those start and end points. Use the transpose during the find to
% flip rows/cols and search row-wise relative to input matrix.
[cs,r] = find(d.'>0.5); % Start points
[ce,~] = find(d.'<-0.5); % End points
c = [cs, ce]; % Column number array for start/end
idx = diff(c,1,2) < N; % From column number, check run length vs N
% Loop over the runs which didn't satisfy the threshold and zero them
for ii = find(idx.')
mat(r(ii),c(ii,1):c(ii,2)-1) = 0;
end
If you want to throw legibility out of the window, this can be condensed for a slightly faster and denser version, based on the exact same logic:
[c,r] = find([mat(:,1),diff(mat,1,2),-mat(:,end)].'); % find run start/end points
for ii = 1:2:numel(c) % Loop over runs
if c(ii+1)-c(ii) < N % Check if run is shorter than the threshold length
mat(r(ii),c(ii):c(ii+1)-1) = 0; % Zero the run if so
end
end
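
The start/end pairing that both versions above rely on can be checked on a single row. This is only an illustrative sketch; the names row, starts, ends and runLengths are mine and do not appear in the answer.

```matlab
% Second row of the sample matrix: one run of 5 ones and one run of 1
row = [1 1 1 1 1 0 0 1 0];

% Pad the diff with the first value and the negated last value, so a run
% touching either edge still gets both a start (+1) and an end (-1)
d = [row(1), diff(row), -row(end)];

starts = find(d > 0);          % columns where a run begins
ends   = find(d < 0);          % one past the column where a run ends
runLengths = ends - starts;    % length of each run

disp([starts; ends; runLengths]);
```

For this row the runs start at columns 1 and 8 and have lengths 5 and 1, so only the second run would be zeroed with N = 5.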

The vectorized solution by @Wolfie is nice and concise, but a bit hard to understand and far from the wording of the problem. Here is a direct translation of the problem using loops. It has the advantage of being easier to understand, and it is slightly faster with fewer memory allocations, which means it will also work for huge inputs.
[m,n] = size(mat);
for i = 1:m
j = 1;
while j <= n
seqSum = 1;
if mat(i,j) == 1
for k = j+1:n
if mat(i,k) == 1
seqSum = seqSum + 1;
else
break
end
end
if seqSum < 5
mat(i,j:j+seqSum-1) = 0;
end
end
j = j + seqSum;
end
end

Remove column mean from nonzero entries of a column

Given a sparse matrix A in MATLAB and a vector m holding the mean of the nonzero elements in each of its columns, is there any way to subtract each column's mean from the nonzero elements of that column without looping over the columns?
I am looking for efficient solutions. Using 'bsxfun' could be one solution if it is possible to use.
Thanks

You can use the second output of find to get the column indices; use those to index into m to do the subtraction; and put the results back into A using logical indexing:
A = sparse([0 0 0 0; 1 0 3 2; 2 1 0 5]); %// example data
m = [1.5 1 3 3.5]; %// vector of mean of nonzero elements of each column
m = m(:);
[~, jj, vv] = find(A);
A(logical(A)) = vv - m(jj);
Original A:
>> full(A)
ans =
0 0 0 0
1 0 3 2
2 1 0 5
Final A:
>> full(A)
ans =
0 0 0 0
-0.5000 0 0 -1.5000
0.5000 0 0 1.5000
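
If m is not already available, it can be derived from the same find call. The following is a sketch under that assumption; the names cnt, tot and B are hypothetical, and rebuilding the matrix with sparse(...) instead of assigning through logical indexing is my variation, not part of the answer above.

```matlab
A = sparse([0 0 0 0; 1 0 3 2; 2 1 0 5]);   % example data from the answer
nCols = size(A, 2);

% Row index, column index and value of every nonzero entry
[ii, jj, vv] = find(A);

% Per-column count and sum of the nonzero entries
cnt = accumarray(jj, 1,  [nCols 1]);
tot = accumarray(jj, vv, [nCols 1]);

% Mean of the nonzeros in each column; max(cnt,1) avoids 0/0 for empty columns
m = tot ./ max(cnt, 1);

% Rebuild a sparse matrix with the column mean removed from each nonzero
B = sparse(ii, jj, vv - m(jj), size(A,1), nCols);
disp(full(B));
```

Entries that become exactly zero after the subtraction (columns 2 and 3 here) are simply not stored in B.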

How can I vectorise this loop in MATLAB

I have a loop that iterates over a matrix and sets all rows and columns with only one non-zero element to all zeroes.
so for example, it will transform this matrix:
A = [ 1 0 1 1
0 0 1 0
1 1 1 1
1 0 1 1 ]
to the matrix:
A' = [ 1 0 1 1
0 0 0 0
1 0 1 1
1 0 1 1 ]
row/column 2 of A only has 1 non zero element in it, so every element in row/column 2 is set to 0 in A'
(it is assumed that the matrices will always be diagonally symmetrical)
here is my non-vectorised code:
for ii = 1:length(A)
if nnz(A(ii,:)) == 1
A(ii,:) = 0;
A(:,ii) = 0;
end
end
Is there a more efficient way of writing this code in MATLAB?
EDIT:
I have been asked in the comments for some clarification, so I will oblige.
The purpose of this code is to remove edges from a graph that lead to a vertex of degree 1.
If A is the adjacency matrix representing an undirected graph G, then a row or column of that matrix which has only one non-zero element represents a vertex of degree one, as it has only one edge incident to it.
My objective is to remove such edges from the graph, as these vertices will never be visited in a solution to the problem I am trying to solve, and reducing the graph will also reduce the size of the input to my search algorithm.
@TimeString, I understand that in the example you gave, recursively applying the algorithm to your matrix will result in a zero matrix; however, the matrices that I am applying it to represent large, connected graphs, so there will never be a case like that. In response to your question as to why I only check the number of elements in a row but then clear both columns and rows: this is because the matrix is always diagonally symmetrical, so I know that if something is true for a row, it is also true for the corresponding column.
so, just to clarify using another example:
I want to turn this graph G:
represented by matrix:
A = [ 0 1 1 0
1 0 1 0
1 1 0 1
0 0 1 0 ]
to this graph G':
represented by this matrix:
A' = [ 0 1 1 0
1 0 1 0
1 1 0 0
0 0 0 0 ]
(i realise that this matrix should actually be a 3x3 matrix because point D has been removed, but i already know how to shrink the matrix in this instance, my question is about efficiently setting columns/rows with only 1 non-zero element all to 0)
i hope that is a good enough clarification..

Not sure if it's really faster (depends on Matlab's JIT) but you can try the following:
To find out which columns (equivalently, rows, since the matrix is symmetric) have more than one non zero element use:
sum(A ~= 0) > 1
The ~= 0 is probably not needed in your case since the matrix consists of 1/0 elements only (graph edges if I understand correctly).
Transform the above into a diagonal matrix in order to eliminate unwanted columns:
D = diag(sum(A~=0) > 1)
And multiply with A from left to zero rows and from right to zero columns:
res = D * A * D
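
As a quick check, the steps above can be run on the first sample matrix from the question. This is a minimal sketch; the variable name keep and the explicit double(...) conversion are my additions (the conversion is only a precaution so that the product is computed in double precision).

```matlab
A = [1 0 1 1;
     0 0 1 0;
     1 1 1 1;
     1 0 1 1];

% Columns (equivalently rows, since A is symmetric) with more than one nonzero
keep = sum(A ~= 0) > 1;

% Diagonal selector: 1 on the diagonal for vertices to keep, 0 otherwise
D = diag(double(keep));

% Left-multiplying zeroes the unwanted rows, right-multiplying the columns
res = D * A * D;
disp(res);
```

Row and column 2 come out as all zeros, which matches the A' given in the question.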

Thanks to nimrodm's suggestion of using sum(A ~= 0) instead of nnz, i managed to find a better solution than my original one
to clear the rows with one element i use:
A(sum(A ~= 0) == 1,:) = 0;
and then to clear columns with one element:
A(:,sum(A ~= 0) == 1) = 0;
for those of you who are interested, i did a 'tic-toc' comparison on a 1000 x 1000 matrix:
% establish matrix
A = magic(1000);
rem_rows = [200,555,950];
A(rem_rows,:) = 0;
A(:,rem_rows) = 0;
% insert single element into empty rows/columns
A(rem_rows,500) = 5;
A(500,rem_rows) = 5;
% testing original version
A_temp = A;
for test = 1
tic
for ii = 1:length(A_temp)
if nnz(A_temp(ii,:)) == 1
A_temp(ii,:) = 0;
A_temp(:,ii) = 0;
end
end
toc
end
Elapsed time is 0.041104 seconds.
% testing new version
A_temp = A;
for test = 1
tic
A_temp(sum(A_temp ~= 0) == 1,:) = 0;
A_temp(:,sum(A_temp ~= 0) == 1) = 0;
toc
end
Elapsed time is 0.010378 seconds
% testing matrix operations based solution suggested by nimrodm
A_temp = A;
for test = 1
tic
B = diag(sum(A_temp ~= 0) > 1);
res = B * A_temp * B;
toc
end
Elapsed time is 0.258799 seconds
so it appears that the single line version that I came up with, inspired by nimrodm's suggestion, is the fastest
thanks for all your help!

Bsxfuning it -
A(bsxfun(@or,(sum(A~=0,2)==1),(sum(A~=0,1)==1))) = 0
Sample run -
>> A
A =
1 0 1 1
0 0 1 0
1 1 1 1
1 0 1 1
>> A(bsxfun(@or,(sum(A~=0,2)==1),(sum(A~=0,1)==1))) = 0
A =
1 0 1 1
0 0 0 0
1 0 1 1
1 0 1 1

How to count the number of 1's from the total matrix

I have code like below:
N=10;
R=[1 1 1 1 1 0 0 0 0 0;1 1 1 1 1 1 1 1 1 1];
p=[0.1,0.2,0.01];
B = zeros(N , N);
B(1:N,1:N) = eye(N);
C=[B;R];
for q=p(1:length(p))
Rp=C;
for i=1:N
if(rand < p)
Rp(i,:) = 0;
end
end
end
In this code I vary the value of p, so for each value of p I get a different Rp. Now I want to get the total number of 1's in each Rp matrix; for example, I might get 5 for p1 and 4 for p2.
For example
Rp1=[1 0 0 0 0;0 1 0 0 0;0 0 0 0 0],
Rp2=[1 0 0 0 0;0 1 0 0 0;1 0 0 0 0],
Rp3=[0 0 0 0 0;0 1 0 0 0;0 0 0 0 0],
So total result will be 2,3,1.
I want to get this result.

If the matrix contains only 0 and 1 you are trying to count the nonzero values and there is a function for that called nnz
n = nnz(Rp);
As I mentioned in the comments you should replace
if(rand < p)
with
if(rand < q)
Then you can add the number of nonzero values to a vector like
r = [];
for q=p(1:length(p))
Rp=C;
for i=1:N
if(rand < q)
Rp(i,:) = 0;
end
end
r = [r nnz(Rp)];
end
Then r will contain your desired result. There are many ways to improve your code as mentioned in other answers and comments.

Assuming Rp is your matrix, then simply do one of the following:
If your matrix only contains zeros and ones
sum(Rp(:))
Or if your matrix contains multiple values:
sum(Rp(:)==1)
Note that for two dimensional matrices sum(Rp(:)) is the same as sum(sum(Rp))
I think your real question is how to save this result. You can do this by assigning it to an indexed variable, for example:
S(count) = sum(Rp(:));
This will require you to add a count variable that increases by one at every step of the loop. It is good practice (and more efficient) to initialize your variable properly before the loop:
S = zeros(length(p),1);
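
Putting these pieces together, a possible sketch of the whole loop could look like this. It assumes the setup from the question; indexing p by count and zeroing the selected rows in one vectorized step (the variable drop) are my variations rather than anything stated in the answers.

```matlab
N = 10;
R = [1 1 1 1 1 0 0 0 0 0; 1 1 1 1 1 1 1 1 1 1];
p = [0.1, 0.2, 0.01];
C = [eye(N); R];

S = zeros(length(p), 1);          % preallocate one result per value of p
for count = 1:length(p)
    Rp = C;
    % Each of the first N rows is cleared with probability p(count)
    drop = rand(N, 1) < p(count);
    Rp(drop, :) = 0;
    S(count) = nnz(Rp);           % number of 1's left in this Rp
end
disp(S);
```

Because the rows are cleared at random, the values in S change from run to run; they can never exceed nnz(C).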

If you need to count the 1's in any matrix M you should be able to do sum(M(:)==1)

Creating matrix of maximum values indices in MATLAB

Using MATLAB, I have an array of values of size 8 rows x N columns. I need to create a matrix of the same size, that counts maximum values in each column and puts 1 in the cell that contains maximum value, and 0 elsewhere.
A little example. Lets assume we have an array of values D:
D =
0.0088358 0.0040346 0.40276 0.0053221
0.017503 0.011966 0.015095 0.017383
0.14337 0.38608 0.16509 0.15763
0.27546 0.25433 0.2764 0.28442
0.01629 0.0060465 0.0082339 0.0099775
0.034521 0.01196 0.016289 0.021012
0.12632 0.13339 0.11113 0.10288
0.3777 0.19219 0.005005 0.40137
Then, the output matrix for such matrix D would be:
0 0 1 0
0 0 0 0
0 1 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
1 0 0 1
Is there a way to do it without catching vector of indices from max function and then putting ones in the right place using for loop?

A one-line answer:
M = D==repmat(max(D),size(D,1),1)
or more elegantly:
M = bsxfun(@eq, D, max(D))
Update:
According to the comments, if you want to be on the safe side and catch the accidental non-unique maximums, add the following statement:
M( cumsum(M)>1 ) = false
which will ensure that in the case of multiple maximums, only the first to occur has a corresponding one in the output matrix (this is equivalent to the behavior of the max() function's returned index).
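
The effect of that extra statement can be seen on a small matrix with tied maxima. This is only an illustrative sketch; the data and the name Mraw are made up for the example.

```matlab
% Column 1 has its maximum (2) in rows 2 and 3; column 2 has 3 in rows 1 and 2
D = [1 3;
     2 3;
     2 0];

% Mark every entry equal to its column maximum (ties are all marked)
Mraw = bsxfun(@eq, D, max(D));

% Keep only the first marked entry in each column
M = Mraw;
M(cumsum(M) > 1) = false;

disp(Mraw);
disp(M);
```

Before the correction each column has two ones; afterwards only the first occurrence in each column remains.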

There are probably better ways to do it, my first approach is:
D = rand(8,4)
[val, sub] = max(D)
ind = sub2ind( size(D), sub, 1:4 )
res = false( size(D) )
res( ind ) = true

I have written an extension to the original problem that can handle arbitrary multidimensional arrays and search for the maximum along any specified dimension.
I used it to solve for the Nash equilibrium in game theory. Hope others will find it helpful.
A = rand([3 3 2]);
i = 1; % specify the dimension of A through which we find the maximum
% the following codes find the maximum number of each column of A
% and create a matrix M of the same size with A
% which puts 1 in the cell that contains maximum value, and 0 elsewhere.
[Amax pos] = max(A, [], i);
% pos is now a 1x3x2 array (the ith dimension is collapsed to size 1 by the max function)
sub = cell(1, ndims(A));
[sub{:}] = ind2sub(size(pos), (1:length(pos(:)))');
sub{i} = pos(:);
ind = sub2ind(size(A), sub{:});
M = false(size(A));
M(ind) = true;
Example:
A(:,:,1) =
0.0292 0.4886 0.4588
0.9289 0.5785 0.9631
0.7303 0.2373 0.5468
A(:,:,2) =
0.5211 0.6241 0.3674
0.2316 0.6791 0.9880
0.4889 0.3955 0.0377
M(:,:,1) =
0 0 0
1 1 1
0 0 0
M(:,:,2) =
1 0 0
0 1 1
0 0 0