Find all subsets of a finite metric space for which the sum of distances is less than a given number - matlab

I have five elements A, B, C, D and E.
The distance between each of the elements is given by the matrix below:
Distances =
[0 5 3 8 15;
5 0 7 5 20;
3 7 0 12 12;
8 5 12 0 8;
7 20 12 8 0]
I want to choose all combinations of elements such that the sum of distances is less than 10.
It can be done recursively by:
First find sets of 2-item eligible combinations.
Then, find sets of 3-item eligible combinations by adding another item to the previously-found eligible 2-item combinations.
Etc.
Doing it by hand for the above example I get the following combinations:
A,
B,
C,
D,
E,
A B,
A C,
A D,
B C,
B D,
D E,
A B C
How would I do this systematically in Octave, if the number of elements is large (say 250)?

Several general points:
Since the original question was tagged with matlab, I will show a solution which I tested there.
This solution uses the functions VChooseK and VChooseKRO found on FEX, which need to be compiled into MEX using an appropriate compiler.
Although the question talks about distances, and it makes little sense to add up disconnected paths (e.g. A->C + B->D), the question does not explicitly rule them out, so the solution below outputs them as well.
The solution is shown for the example given in the OP. It should be modified slightly to output readable results for 250 nodes (i.e. change the node "names" from letters to numbers, since 26 < 250).
Outputs are currently only printed. Some modifications need to be made (in the form of temporary variables) if further computations are required on the result.
function q41308927
%% Initialization:
nodes = char((0:4) + 'A');
D = [0 5 3 8 15;
5 0 7 5 20;
3 7 0 12 12;
8 5 12 0 8;
7 20 12 8 0];
thresh = 10;
d = triu(D); % The problem is symmetric (undirected), so we only consider the upper half.
% Also keep track of the "letter form":
C = reshape(cellstr(VChooseKRO(nodes,2)), size(D)).'; % "C" for "Combinations"
%{
C =
5×5 cell array
'AA' 'AB' 'AC' 'AD' 'AE'
'BA' 'BB' 'BC' 'BD' 'BE'
'CA' 'CB' 'CC' 'CD' 'CE'
'DA' 'DB' 'DC' 'DD' 'DE'
'EA' 'EB' 'EC' 'ED' 'EE'
%}
C = C(d>0); d = d(d>0);
assert(numel(C) == numel(d)); % This is important to check
%% Find eligible sets of size n
for k = 1:numel(nodes)
  if numel(d)<k
    break
  end
  % Enumerate combinations:
  C = C(VChooseK(1:numel(C),k));
  d = sum(VChooseK(d,k),2);
  % Filter combinations:
  if any(d < thresh)
    C(d >= thresh,:) = [];
    d = d(d < thresh);
    disp(sortrows(C)); % This is just to show it works like the manual example
  else
    break
  end
end
The output of the above is:
'AB'
'AC'
'AD'
'BC'
'BD'
'DE'
'AB' 'AC'
'AC' 'BD'
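If compiling the FEX MEX files is not an option, the pair-of-edges step above can be approximated with the built-in nchoosek. The following is only a sketch for the small example, not the answer's method: the variable names (U, w, keep, pairs, ok, names) are made up here, and nchoosek is used as a stand-in for VChooseK on the assumption that the input is small enough for it.

```matlab
D = [0 5 3 8 15; 5 0 7 5 20; 3 7 0 12 12; 8 5 12 0 8; 7 20 12 8 0];
thresh = 10;

U = triu(D, 1);                    % upper half only, each pair once
[r, c] = find(U > 0);              % end points of each edge
w = U(U > 0);                      % edge lengths, same column-major order

keep = w < thresh;                 % single edges below the threshold
r = r(keep); c = c(keep); w = w(keep);

pairs = nchoosek(1:numel(w), 2);   % all pairs of remaining edges (assumed small)
total = w(pairs(:,1)) + w(pairs(:,2));
ok = pairs(total < thresh, :);     % pairs of edges below the threshold

names = char('A' + [r c] - 1);     % letter form of each edge, e.g. 'AB'
for m = 1:size(ok, 1)
  fprintf('%s %s\n', names(ok(m,1),:), names(ok(m,2),:));
end
```

For the example matrix this prints AB AC and AC BD, the same two-edge combinations as the output above.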

This is a plain Octave (or Matlab) solution. The matrix Distances is as in the question. The algorithm builds a 0-1 matrix a in which each column encodes a set with sum of distances less than limit (for example 10).
The matrix a is initialized to the identity, because every one-element subset is admissible (its sum of distances is 0). Then each column is picked in turn, c = a(:,m);, and the possibility of adding another element is investigated (cand = c; cand(k) = 1; means adding the k-th element). Without loss of generality, it is enough to consider adding only elements that come after the last element currently in the set.
For a symmetric Distances matrix, the product cand'*Distances*cand is twice the sum of pairwise distances within the candidate set cand. So, if it is less than twice the limit, the column is appended: a = [a cand];. At the end, the matrix a is displayed in transposed form, for readability.
limit = 10;
n = length(Distances);
a = eye(n, n);
col1 = 1;
col2 = n;
while (col1 <= col2)
  for m = col1:col2
    c = a(:,m);
    for k = find(c, 1, 'last')+1:n
      cand = c;
      cand(k) = 1;
      if cand'*Distances*cand < 2*limit
        a = [a cand];
      end
    end
  end
  col1 = col2 + 1;
  col2 = size(a, 2);
end
disp(a')
Output:
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 0 1
1 1 0 0 0
1 0 1 0 0
1 0 0 1 0
0 1 1 0 0
0 1 0 1 0
0 0 0 1 1
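The quadratic-form test used in the loop can be checked by hand on one candidate set. The sketch below uses the set {A, B, C}; the names idx, pairSum and quadForm are introduced here only for illustration. Note that the factor of two relies on the entries involved being symmetric.

```matlab
Distances = [0 5 3 8 15; 5 0 7 5 20; 3 7 0 12 12; 8 5 12 0 8; 7 20 12 8 0];
cand = [1 1 1 0 0]';                               % indicator of the set {A, B, C}
idx = find(cand);

pairSum  = sum(sum(triu(Distances(idx, idx), 1))); % each pair once: 5 + 3 + 7
quadForm = cand' * Distances * cand;               % each pair counted twice

fprintf('pairwise sum = %d, quadratic form = %d\n', pairSum, quadForm);
```

This prints a pairwise sum of 15 and a quadratic form of 30. Since 15 is not less than 10, the set {A, B, C} does not appear in the output above, even though the question lists it.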
With a 250 by 250 matrix, the performance will greatly depend on how large limit is. If it is so large that all or most of 2^250 sets qualify, this is going to run out of memory. (2^250 is more than 10^75, so I don't think you'd ever want to see anywhere near that many sets).
And this is to have output in a more readable form:
for m = 1:size(a, 2)
  disp(find(a(:,m))')
end
Output:
1
2
3
4
5
1 2
1 3
1 4
2 3
2 4
4 5

Adding numbers based on 2 matrices according to a constraint matlab

I have 2 matrices; Matrix A and Matrix B.
Matrix A = [1 3 6 2 7;
2 1 5 3 4;
8 3 7 2 1]
Matrix B = [0 0 1 0 0;
0 0 0 0 1;
0 1 0 0 0]
For each '1' in matrix B, I want to check the value at the same position in matrix A. If that value is greater than or equal to 6, leave it as it is. If it is smaller than 6, go to the next smaller number in the same row of matrix A, put a '1' at its position in matrix B, add the two numbers, and recheck whether the sum is equal to or greater than 6, and so on.
As you can see in row 2 of matrix B, the 1 is at the position of the 4 in matrix A. Since 4 is less than 6, I go to the next smaller number, which in this case is 3, and add 3 and 4 together. This gives 7, which is greater than 6, so we stop. So here, for example, the output matrix will be:
Matrix output = [0 0 1 0 0;
0 0 0 1 1;
0 1 0 1 1]
The steps:
Go to the number that is just smaller than it. In this case go to 3 as it is the one that is just smaller than the 4. I can explain more:
check the place of the 1 in Matrix B and see its value in Matrix A.
If the number in Matrix A is greater than or equal to 6, leave it as it is, leave the 1 in Matrix B as it is, and go to the next row.
If the number in Matrix A is smaller than 6, we want to add another number to it so that the sum is equal to or greater than 6.
This other number is the one that is just smaller than it. For example, if the row is [2 5 6 1 3], the 1 is at the position of the 5, and 5 is less than the constraint, then we have to go to the 3, as it is the one that is just smaller than the 5, and add them together.
After adding them, put 1's in the places of both numbers and check the constraint again. If the sum satisfies the constraint, leave them and go to the next row. If not, go to the number that is just smaller again and do the same.
Thank you so much.
This code is working when matrix B is empty and it puts the 1 in the place of the highest number and it checks the constraint. If it is less than the number it will go to the second highest number and add and recheck and so on.. But what I want now is to solve it with predefined 0s and 1s
B=zeros(size(A));
for k=1:size(A,1)
a=A(k,:)
[b,ia]=sort(a,'descend')
c=cumsum(b)
jj=find(c>=6,1)
idx=ia(1:jj)
B(k,idx)=1
end
This one took a while, but I think I got it in the end...
Most of the process is done without loops. The exception is the final stage, which plugs the indices into B row by row; that could also be done with an arrayfun. There might be a few redundant steps, but I think it is pretty fast.
C = A';
D = B' > 0 ;
E = repmat(max(C(D),1),[1 size(A,2)]);
F = A-E<=0;
G = A.*F;
[H ind] = sort (G,2,'descend');
I = (cumsum(H,2) >=6)*-1 +1;
Indent = ones(size(A,1),1);
J = [Indent I];
K = J(:,1:size(A,2)).*ind;
for t= 1:size(A,1)
B(t,K(t,K(t,:)~=0)) = 1;
end
>> B =
0 0 1 0 0
0 0 0 1 1
0 1 0 1 1
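For comparison, the steps described in the question can also be written as a plain loop over rows. This is only a sketch, assuming each row of B contains exactly one predefined 1 and that the constraint is 6; the names limit, out, start, cand, vals, order and cnt are made up here and are not part of the answer above.

```matlab
A = [1 3 6 2 7; 2 1 5 3 4; 8 3 7 2 1];
B = [0 0 1 0 0; 0 0 0 0 1; 0 1 0 0 0];
limit = 6;                                  % assumed constraint from the question

out = B;
for r = 1:size(A, 1)
  start = A(r, find(B(r,:), 1));            % value under the predefined 1
  cand = find(A(r,:) <= start);             % columns holding that value or smaller
  [vals, order] = sort(A(r,cand), 'descend');
  cnt = find(cumsum(vals) >= limit, 1);     % shortest prefix reaching the limit
  if isempty(cnt)
    cnt = numel(vals);                      % limit unreachable: mark all candidates
  end
  out(r, cand(order(1:cnt))) = 1;
end
disp(out)
```

For the example data, out equals the expected output matrix from the question.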

How to traverse two same size matrices and compare them

I have two matrices filled with 0s and 1s
e.g.
A = [ 0 0 1 0,
1 0 1 0 ]
B = [ 1 1 1 1
0 0 0 0 ]
and I'd like to compare the values from the same position against each other and return a 2x2 matrix
R = [ TP(1) FN(3)
FP(2) TN(2) ]
TP = returns the amount of times A has the value 1, and B has the value 1
FN = returns the amount of times A has the value 0, and B has the value 1
FP = returns the amount of times A has the value 1, and B has the value 0
TN = returns the amount of times A has the value 0, and B has the value 0
How do I get each individual number in A and B?
Approach #1: Comparison based using bsxfun -
pA = [1 0 1 0] %// pattern for A
pB = [1 1 0 0] %// pattern for B
%// Find matches for A against pattern-A and pattern-B for B using bsxfun(@eq.
%// Then, perform AND for detecting combined matches
matches = bsxfun(@eq,A(:),pA) & bsxfun(@eq,B(:),pB)
%// Sum up the matches to give us the desired counts
R = reshape(sum(matches),2,[]).'
Output -
R =
1 3
2 2
Approach #2: Finding decimal numbers -
Step-1: Find decimal numbers corresponding to the combined A's and B's
>> dec_nums = histc(bin2dec(num2str([B(:) A(:)],'%1d')),0:3)
dec_nums =
2
2
3
1
Step-2: Re-order the decimal numbers such that they line up as needed in the problem
>> R = reshape(flipud(dec_nums),2,[])'
R =
1 3
2 2
Use logical operators & and ~ applied on the linearized versions of A and B, and then nnz (or sum) to count the true values:
R = [nnz(A(:)&B(:)) nnz(~A(:)&B(:)); nnz(A(:)&~B(:)) nnz(~A(:)&~B(:))];
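As a quick check, the logical-operator version can be run on the matrices from the question. This is a minimal sketch; the intermediate names TP, FN, FP and TN follow the question's definitions and are spelled out only for readability.

```matlab
A = [0 0 1 0; 1 0 1 0];
B = [1 1 1 1; 0 0 0 0];

TP = nnz( A(:) &  B(:));   % A is 1 and B is 1
FN = nnz(~A(:) &  B(:));   % A is 0 and B is 1
FP = nnz( A(:) & ~B(:));   % A is 1 and B is 0
TN = nnz(~A(:) & ~B(:));   % A is 0 and B is 0

R = [TP FN; FP TN];
disp(R)
```

The four counts add up to numel(A), and R comes out as [1 3; 2 2], matching the other two approaches.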

Finding all possible “lists” of possible pairs in Matlab

I have been thinking about a problem for the last few days but as I am a beginner in MATLAB, I have no clue how to solve it. Here is the background. Suppose that you have a symmetric N×N matrix where each element is either 0 or 1, and N = (1,2,...,n).
For example:
A =
0 1 1 0
1 0 0 1
1 0 0 0
0 1 0 0
If A(i,j) == 1, then it is possible to form the pair (i,j) and if A(i,j)==0 then it is NOT possible to form the pair (i,j). For example, (1,2) is a possible pair, as A(1,2)==A(2,1)==1 but (3,4) is NOT a possible pair as A(3,4)==A(4,3)==0.
Here is the problem. Suppose that a member of the set N can only form a pair with at most one other distinct member of the set N (i.e., if 1 forms a pair with 2, then 1 cannot form a pair with 3). How can I find all possible “lists” of possible pairs? In the above example, one “list” would consist only of the pair (1,2). If this pair is formed, then it is not possible to form any other pairs. Another “list” would be: ((1,3),(2,4)). I have searched the forum and found that the latter “list” is the maximal matching that can be found, e.g., by using a bipartite graph approach. However, I am not necessarily interested only in finding the maximal matching; I am interested in finding ALL possible “lists” of possible pairs.
Another example:
A =
0 1 1 1
1 0 0 1
1 0 0 0
1 1 0 0
In this example, there are three possible lists:
(1,2)
((1,3),(2,4))
(1,4)
I hope that you can understand my question, and I apologize if I am unclear. I appreciate all the help I can get. Many thanks!
This might be a fast approach.
Code
%// Given data, A
A =[ 0 1 1 1;
1 0 0 1;
1 0 0 0;
1 1 0 0];
%%// The lists will be stored in 'out' as a cell array and can be accessed as out{1}, out{2}, etc.
out = cell(size(A,1)-1,1);
%%// Code that detects the lists using "selective" diagonals
for k = 1:size(A,1)-1
[x,y] = find(triu(A,k).*(~triu(ones(size(A)),k+1)));
out(k) = {[x y]};
end
out(cellfun('isempty',out))=[]; %%// Remove empty lists
%%// Verification - Print out the lists
for k = 1:numel(out)
disp(out{k})
end
Output
1 2
1 3
2 4
1 4
EDIT 1
Basically, I calculate all the pairwise indices of the matrix that satisfy the criteria set in the question and then simply map them over the given matrix. Finding the "valid" indices is obviously the tedious part, and the brute-force approach used in this code also becomes expensive for input matrices of size more than 10.
Code
%// Given data, A
A = [0 1 1 1; 1 0 1 1; 1 1 0 1; 1 1 1 0]
%%// Get all pairwise combinations starting with 1
all_combs = sortrows(perms(1:size(A,1)));
all_combs = all_combs(all_combs(:,1)==1,:);
%%// Get the "valid" indices
all_combs_diff = diff(all_combs,1,2);
valid_ind_mat = all_combs(all(all_combs_diff(:,1:2:end)>0,2),:);
valid_ind_mat = valid_ind_mat(all(diff(valid_ind_mat(:,1:2:end),1,2)>0,2),:);
%%// Map the ones of A onto the valid indices to get the lists in a matrix and then cell array
out_cell = mat2cell(valid_ind_mat,repmat(1,[1 size(valid_ind_mat,1)]),repmat(2,[1 size(valid_ind_mat,2)/2]));
A_masked = A(sub2ind(size(A),valid_ind_mat(:,1:2:end),valid_ind_mat(:,2:2:end)));
out_cell(~A_masked)={[]};
%%// Remove empty lists
out_cell(all(cellfun('isempty',out_cell),2),:)=[];
%%// Verification - Print out the lists
disp('Lists =');
for k1 = 1:size(out_cell,1)
disp(strcat(' List',num2str(k1),':'));
for k2 = 1:size(out_cell,2)
if ~isempty(out_cell{k1,k2})
disp(out_cell{k1,k2})
end
end
end
Output
A =
0 1 1 1
1 0 1 1
1 1 0 1
1 1 1 0
Lists =
List1:
1 2
3 4
List2:
1 3
2 4
List3:
1 4
2 3
I'm sure there's a faster way to do it, but here's the obvious solution:
%// Set top half to 0, and find indices of all remaining 1's
A(triu(A)==1) = 0;
[ii,jj] = find(A);
%// Put these in a matrix for further processing
P = [ii jj];
%// Sort indices into 'lists' of the kind you defined
X = repmat({}, size(P,1),1);
for ii = 1:size(P,1)-1
  X{ii}{1} = P(ii,:);
  for jj = ii+1:size(P,1)
    if ~any(ismember(P(ii,:), P(jj,:)))
      X{ii}{end+1} = P(jj,:);
    end
  end
end
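Another way to look at the problem is as an enumeration of matchings: grow each list one pair at a time, only adding pairs that share no member with the pairs already in it. The sketch below does this iteratively and then keeps the lists that cannot be extended any further, which is what the question's second example appears to ask for (that reading is an assumption). All names (E, M, queue, lists, isMaximal) are made up for illustration, and the number of lists can grow very quickly with the size of A.

```matlab
A = [0 1 1 1; 1 0 0 1; 1 0 0 0; 1 1 0 0];

[r, c] = find(triu(A, 1));        % every admissible pair once, with r < c
E = [r c];                        % one pair per row
nE = size(E, 1);

M = {};                           % each cell: row indices into E of one list
queue = num2cell(1:nE);           % start from every single pair
while ~isempty(queue)
  cur = queue{1};
  queue(1) = [];
  M{end+1} = cur;
  used = E(cur, :);               % members already paired in this list
  for e = cur(end)+1:nE           % only later pairs, to avoid duplicates
    if ~any(ismember(E(e,:), used(:)))
      queue{end+1} = [cur e];
    end
  end
end

% Keep only lists to which no further pair can be added
isMaximal = false(size(M));
for m = 1:numel(M)
  used = E(M{m}, :);
  addable = ~ismember(E(:,1), used(:)) & ~ismember(E(:,2), used(:));
  isMaximal(m) = ~any(addable);
end
lists = M(isMaximal);

for m = 1:numel(lists)
  disp(E(lists{m}, :))
end
```

For the second example in the question this leaves three lists: (1,2), (1,4) and ((1,3),(2,4)). Dropping the isMaximal filter returns every valid list, including the single pairs (1,3) and (2,4).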

Creating a variable with unequal rows

I want to create a variable that finds a pattern (let's say [1 1]) in different rows of a matrix (A). Of course there aren't an equal number of occurrences of this string in each row.
A = [ 0 0 0 1 1
1 1 1 0 0
0 1 0 1 1
1 1 1 0 0
0 1 0 0 1
1 0 1 1 1
0 1 0 1 0
1 1 1 0 1];
I could do:
for i = 1:n
var(i,:) = strfind(A(i,:),[1 1]);
end
but then both sides of the equation won't be equal.
ERROR: ??? Subscripted assignment dimension mismatch.
I try to preallocate. I create a matrix with what I think would be the maximum number of occurrences of this string in each row of matrix A (let's say 50).
for i = 1:n
var(i, :) = NaN(1,50)
end
That's followed by the previous bit of code and it's no good either.
I've also tried:
for i = 1:n
var(i,1:numel(strfind(A(i,:),[1 1])) = strfind(A(i,:),[1 1])
end
Error: The expression to the left of the equals sign is not a valid
target for an assignment.
How should I go about doing this?
The output I expect is a matrix var(i,:) that gives me the position in the matrix where each of these patterns occur. It works fine for just one row.
For example:
var(1,:) = [1 2 5 8 10 22 48]
var(2,:) = [2 3 4 7 34 45 NaN]
var(3,:) = [4 5 21 32 33 NaN]
Thanks!
In your first try, you tried to build a matrix whose rows have different lengths.
In your second try, you pre-allocated, but then overwrote the pre-allocated rows by redefining var(i,:) when you tried to store your desired result there.
In your third try, you were simply missing one closing parenthesis ")" at the end of the left-hand expression.
This code should work (it combines your 2nd and 3rd attempts, with pre-allocation for all n rows and the parentheses fixed):
n = size(A,1);
var = NaN(n,50);
for i = 1:n
  idx = strfind(A(i,:),[1 1]);
  var(i,1:numel(idx)) = idx;
end
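If the maximum number of occurrences is not known in advance, a cell array avoids guessing the width. The following is a rough sketch rather than a definitive solution: pos and padded are hypothetical names, and the element-wise comparison is used here as a stand-in for strfind on the [1 1] pattern, in case strfind does not accept numeric input in your environment.

```matlab
A = [0 0 0 1 1; 1 1 1 0 0; 0 1 0 1 1; 1 1 1 0 0; ...
     0 1 0 0 1; 1 0 1 1 1; 0 1 0 1 0; 1 1 1 0 1];
n = size(A, 1);

pos = cell(n, 1);                 % one variable-length result per row
for i = 1:n
  % start positions of the pattern [1 1] in row i
  pos{i} = find(A(i,1:end-1) == 1 & A(i,2:end) == 1);
end

% Optional: pad to a rectangular matrix with NaN
maxLen = max(cellfun(@numel, pos));
padded = NaN(n, maxLen);
for i = 1:n
  padded(i, 1:numel(pos{i})) = pos{i};
end
disp(padded)
```

Rows without any match stay all NaN in padded, and pos{i} keeps the exact positions without padding.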

cumsum only within groups?

Let's say I have 2 vectors:
a=[0 1 0 1 1 0 1 0 0 0 1 1 1];
b=[1 1 1 1 1 1 2 2 2 3 3 3 3];
For every group of numbers in b I want to sum the corresponding entries of a, so that the result looks like this:
c=[1 3;2 1;3 3]
That means that for group 1 in b I have three ones in a, for group 2 in b I have only a single one in a, etc.
There have been some complicated answers so far. Try accumarray(b',a').
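To make the accumarray suggestion concrete, here is a small sketch on the vectors from the question. It assumes b contains positive integer group labels; the name counts is introduced only for illustration.

```matlab
a = [0 1 0 1 1 0 1 0 0 0 1 1 1];
b = [1 1 1 1 1 1 2 2 2 3 3 3 3];

counts = accumarray(b(:), a(:));    % sum of a within each group of b
c = [(1:max(b)).' counts];          % prepend the group labels
disp(c)
```

Here counts is [3; 1; 3], so c matches the desired [1 3; 2 1; 3 3].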
If you're looking for a solution where b can be anything, then a combination of hist and unique will help:
num = unique(b(logical(a))); %# identify the numbers in b with non-zero counts
cts = hist(b(logical(a)),num); %# count
c = [num(:),cts(:)]; %# combine.
If you want the first column of c to go from 1 to the maximum of b, then you can rewrite the first line as num=1:max(b), and you'll also get rows in c where the counts are zero.
Assuming that b is monotonically increasing by 1:
c = cell2mat(transpose(arrayfun( @(x) [ x sum(a(find( b == x ))) ], min(b):max(b), 'UniformOutput',false)))
should give the right answer in a one liner format, or:
for ii=min(b):max(b)
II = find( b == ii );
v = sum(a(II));
c(ii,:) = [ii v];
end
which is a bit easier to read. Hope this helps.