Random binary matrix with two non-trivial constraints - matlab

I need to generate a random matrix of K columns and N rows containing ones and zeroes, such that:
a) Each row contains exactly k ones.
b) Each row is different from the other (combinatorics imposes that if N > nchoosek(K, k) there will be nchoosek(K,k) rows).
Assume I want N = 10000 (out of all the possible nchoosek(K, k) = 27405 combinations) different 1×K vectors (with K = 30) containing k (with k = 4) ones and K - k zeroes.
This code:
clear all; close
N=10000; K=30; k=4;
M=randi([0 1],N,K);
plot(sum(M,2)) % condition a) not satisfied
satisfies neither a) nor b).
This code:
clear all; close;
N=10000;
NN=N; K=30; k=4;
tempM=zeros(NN,K);
for ii=1:NN
ttmodel=tempM(ii,:);
ttmodel(randsample(K,k,false))=1; %satisfies condition a)
tempM(ii,:)=ttmodel;
end
Check=bi2de(tempM); %from binary to decimal
[tresh1,ind,tresh2] = unique(Check);%drop the vectors that appear more than once in the matrix
M=tempM(ind,:); %and satisfies condition b)
plot(sum(M,2)) %verify that condition a) is satisfied
%Effective draws, Wanted draws, Number of possible combinations to draw from
[sum(sum(M,2)==k) N nchoosek(K,k) ]
satisfies condition a) and partially condition b). I say partially because, unless NN>>N, the final matrix will contain fewer than N distinct rows.
Is there a better and faster way (one that possibly avoids the for loop and the need for NN>>N) to solve the problem?

First, draw k distinct column positions for the ones in each of the N rows (this guarantees condition a); two rows can still coincide, so condition b) has to be checked separately, e.g. with unique(..., 'rows')):
[~, cols] = sort(rand(N, K), 2); % each row is a random permutation of 1:K
cols = cols(:, 1:k);
Then generate the matching row indices:
rows = meshgrid(1:N, 1:k)';
and finally create the sparse matrix with:
A = sparse(rows, cols, 1, N, K);
To obtain the full form of the matrix, use full(A).
Example
K = 10;
k = 4;
N = 5;
[~, cols] = sort(rand(N, K), 2);
cols = cols(:, 1:k);
rows = meshgrid(1:N, 1:k)';
A = sparse(rows, cols, 1, N, K);
full(A)
The result I got is:
ans =
1 1 0 0 0 0 0 1 0 1
0 0 1 1 0 1 0 0 0 1
0 0 0 1 1 0 1 0 1 0
0 1 0 0 0 0 1 0 1 1
1 1 1 0 0 1 0 0 0 0
This computation should be pretty fast even for large values of K and N. For K = 30, k = 4, N = 10000 the result was obtained in less than 0.01 seconds.

You could use randperm(n) to generate random sequences of integers from 1 to n, and store the nonrepeated sequences as rows in a matrix M until size(unique(M,'rows'),1)==size(M,1). Then you could use M to index a logical matrix with the appropriate number of true values in each row.
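The "keep drawing until all rows are unique" idea above can be sketched as follows. This is a minimal illustration and not the answerer's exact code: the top-up loop, the sort(rand(...)) trick for drawing positions, and the variable names cand and need are assumptions. It also assumes N <= nchoosek(K,k); otherwise the loop never terminates.

```matlab
K = 30; k = 4; N = 10000;           % values from the question
M = false(0, K);                    % rows collected so far
while size(M, 1) < N
    need = N - size(M, 1);          % how many rows are still missing
    % k distinct random column positions per candidate row
    [~, order] = sort(rand(need, K), 2);
    cand = false(need, K);
    cand(sub2ind([need K], repmat((1:need).', 1, k), order(:, 1:k))) = true;
    % merge and drop duplicates (unique also sorts the rows)
    M = unique([M; cand], 'rows');
end
M = M(randperm(N), :);              % undo the sorting introduced by unique
```

Each pass only draws as many rows as are still missing, so there is no need to choose NN>>N in advance.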

If you have enough memory for nchoosek(K,k) integers, build an array of those and use a partial Fisher-Yates shuffle to get a proper uniformly random subset of N of them. Now, given the array of N integers, interpret each as the rank of the combination representing each row of your final array. If you use colexicographical ordering of combinations, computing the combination from a rank is pretty simple (though it evaluates lots of binomial coefficients, so it pays to have a fast function for that).
I'm not a Matlab guy, but I've done things similar to this in C. This code, for example:
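/* r: 0-based rank; n: starts at the set size; binomial(n, i) is assumed to return 0 when n < i */
for (i = k; i >= 1; --i) {
    while ((b = binomial(n, i)) > r) --n;  /* largest n with C(n, i) <= r */
    buf[i-1] = n;                          /* position of the i-th one */
    r -= b;
}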
will fill the array buf[] with indices from 0 to n-1 for the rth combination of k out of n elements in colex order. You would interpret these as the positions of the 1s in your row.
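A rough MATLAB counterpart of this rank-based approach might look like the sketch below. It is an illustration under assumptions: randperm(total, N) stands in for the partial Fisher-Yates shuffle, and the lookup table binom and the vectorized search are my own choices, not part of the answer.

```matlab
K = 30; k = 4; N = 10000;
total = nchoosek(K, k);              % number of possible rows (27405 here)
r = randperm(total, N).' - 1;        % N distinct 0-based ranks
% binom(n+1, i) holds C(n, i), with 0 where n < i
binom = zeros(K + 1, k);
for n = 0:K
    for i = 1:min(n, k)
        binom(n + 1, i) = nchoosek(n, i);
    end
end
M = false(N, K);
for i = k:-1:1
    % largest n in 0..K-1 with C(n, i) <= r, for every rank at once;
    % C(n, i) is nondecreasing in n, so counting the matches is enough
    n = sum(bsxfun(@le, binom(1:K, i).', r), 2) - 1;
    M(sub2ind([N K], (1:N).', n + 1)) = true;   % 0-based position -> column
    r = r - binom(n + 1, i);
end
```

Because the ranks are distinct and unranking is one-to-one, the rows of M are distinct by construction and each contains exactly k ones.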

Related

Matrix logical matlab

I am trying to get M. To do this, MATLAB needs to relate column 1 of A to column 1 of B^T and build a matrix M of 1s and 0s depending on whether A[i,j] and B^T[i,j] are both equal to 1.
A = [1 0 1; 0 1 1; 0 0 1 ];
B = [0 0 1 ; 0 1 0; 1 1 1];
for i = 1:3
for j =1:3
if A(i,j) == BT(i,j) && A(i,j)==1;
Z(i,j) = 1
end
end
end
When you use "if A(i,j) == BT(i,j) && A(i,j)==1;" you are comparing individual elements. Instead you want to compare columns:
A(:, i) and BT(:, j).
Precisely, you want
for i = 1:3
for j = 1:3
M(i,j) = any( A(:,i) & BT(:,j) );
end
end
OR
You are comparing the columns of BT and the columns of A.
That is to say, the rows of B and the columns of A. You want to see if there are any occasions when both of the elements are 1. Thus you can compare the products of the terms in the rows of B and columns of A.
i.e. M = logical(B * A) should also give you the desired output.
NOTE that the data in B are different in your image examples and in your code.
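As a quick check of the two approaches above, the sketch below runs both on the A and B from the question. The names BT, M and P are illustrative. One detail worth noting: element (i,j) of the loop result corresponds to element (j,i) of B * A, so in general the loop matches the transpose of the product; for this particular data B * A happens to be symmetric, so both agree.

```matlab
A = [1 0 1; 0 1 1; 0 0 1];
B = [0 0 1; 0 1 0; 1 1 1];
BT = B.';                          % transpose of B, as used in the loop
M = false(3);
for i = 1:3
    for j = 1:3
        % true if column i of A and column j of BT share a 1 in some row
        M(i,j) = any(A(:,i) & BT(:,j));
    end
end
P = logical(B * A);                % matrix-product version
sameAsTranspose = isequal(M, P.'); % holds for any A and B
sameAsProduct   = isequal(M, P);   % holds here because B*A is symmetric
```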

Combination and Multiplying Rows of array in matlab

I have a matrix (89x42) of 0's and 1's that I'd like to multiply combinations of rows together.
For example, for matrix
input = [1 0 1
0 0 0
1 1 0];
and with 2 combinations, I want an output of
output = [0 0 0; % (row1*row2)
1 0 0; % (row1*row3)
0 0 0] % (row2*row3)
Which rows to multiply is dictated by "n Choose 2" (nCk), or all possible combinations of the rows n taken k at a time. In this case k=2.
Currently I am using a loop and it works fine for the 89C2 combinations of rows, but when I run it with 89C3 it takes far too long to run.
What would be the most efficient way to do this program so I can do more than 2 combinations?
You can do it using nchoosek and element-wise multiplication.
inp = [1 0 1; 0 0 0; 1 1 0]; %Input matrix
C = nchoosek(1:size(inp,1),2); %Number of rows taken 2 at a time
out = inp(C(:,1),:) .* inp(C(:,2),:); %Multiplying those rows to get the desired output
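The same idea can be extended to combinations of more than two rows by looping over the k columns of the index matrix rather than over the combinations themselves. This is a sketch under that assumption; the loop variable c and the generalisation are not part of the answer above.

```matlab
inp = [1 0 1; 0 0 0; 1 1 0];            % input matrix from the question
k = 2;                                  % rows per combination (e.g. 3 for 89C3)
C = nchoosek(1:size(inp,1), k);         % one combination of row indices per row
out = inp(C(:,1), :);                   % start with the first row of each combination
for c = 2:k
    out = out .* inp(C(:,c), :);        % multiply in the c-th row of each combination
end
```

The loop runs only k times regardless of how many combinations there are, so the work per iteration is a single element-wise product over all combinations.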
Several things you can do:
Use logical ("binary") arrays (or even sparse logical arrays) instead of double arrays.
Use optimized combinatorial functions.
bitand or and instead of times (where applicable).
Vectorize:
function out = q44417404(I,k)
if nargin == 0
rng(44417404);
I = randi(2,89,42)-1 == 1;
k = 3;
end
out = permute(prod(reshape(I(nchoosek(1:size(I,1),k).',:).',size(I,2),k,[]),2),[3,1,2]);

How to zero out the centre k by k matrix in an input matrix with odd number of columns and rows

I am trying to solve this problem:
Write a function called cancel_middle that takes A, an n-by-m
matrix, as an input where both n and m are odd numbers and k, a positive
odd integer that is smaller than both m and n (the function does not have to
check the input). The function returns the input matrix with its center k-by-k
matrix zeroed out.
Check out the following run:
>> cancel_middle(ones(5),3)
ans =
1 1 1 1 1
1 0 0 0 1
1 0 0 0 1
1 0 0 0 1
1 1 1 1 1
My code works only when k=3. How can I generalize it for all odd values of k? Here's what I have so far:
function test(n,m,k)
A = ones(n,m);
B = zeros(k);
A((end+1)/2,(end+1)/2)=B((end+1)/2,(end+1)/2);
A(((end+1)/2)-1,((end+1)/2)-1)= B(1,1);
A(((end+1)/2)-1,((end+1)/2))= B(1,2);
A(((end+1)/2)-1,((end+1)/2)+1)= B(1,3);
A(((end+1)/2),((end+1)/2)-1)= B(2,1);
A(((end+1)/2),((end+1)/2)+1)= B(2,3);
A(((end+1)/2)+1,((end+1)/2)-1)= B(3,1);
A(((end+1)/2)+1,((end+1)/2))= B(3,2);
A((end+1)/2+1,(end+1)/2+1)=B(3,3)
end
You can simplify your code. Have a look at Matrix Indexing in MATLAB: "one or both of the row and column subscripts can be vectors", i.e. you can address a whole submatrix at once. Then you only need to get the indices right: since n, m and k are all odd, n-k and m-k are even and give the number of rows and columns left over around the centre block. Dividing by 2 gives the padding on the top/bottom and left/right, and adding 1 (MATLAB indexing is 1-based) gives the first row and column of the block.
% Generate test data
n = 13;
m = 11;
A = reshape( 1:m*n, n, m )
k = 3;
% Do the calculations
start_row = (n-k)/2 + 1
start_col = (m-k)/2 + 1
A( start_row:start_row+k-1, start_col:start_col+k-1 ) = zeros( k )
function b = cancel_middle(a,k)
[n,m] = size(a);
start_row = (n-k)/2 + 1;
start_column = (m-k)/2 + 1;
end_row = (n-k)/2 + k;
end_column = (m-k)/2 + k;
a(start_row:end_row,start_column:end_column) = 0;
b = a;
end
I wrote a function in an m-file called cancel_middle. It sets the central k-by-k submatrix to zeros and leaves the rest of the matrix unchanged.
It is a general function that takes two inputs: the matrix you want to modify and the order k of the submatrix.

Replacing zeros (or NANs) in a matrix with the previous element row-wise or column-wise in a fully vectorized way

I need to replace the zeros (or NaNs) in a matrix with the previous element row-wise, so basically I need this Matrix X
[0,1,2,2,1,0;
5,6,3,0,0,2;
0,0,1,1,0,1]
To become like this:
[0,1,2,2,1,1;
5,6,3,3,3,2;
0,0,1,1,1,1],
please note that if the first row element is zero it will stay like that.
I know that this has been solved for a single row or column vector in a vectorized way, and this is one of the nicest ways of doing that:
id = find(X);
X(id(2:end)) = diff(X(id));
Y = cumsum(X)
The problem is that linear indexing of a matrix in Matlab/Octave is consecutive and runs column-wise, so this works for a single row or column, but the same concept cannot be applied directly to multiple rows because each row/column starts fresh and must be treated as independent. I've tried my best and searched all over Google but couldn't find a way out. If I apply the same idea in a loop it gets too slow because my matrices contain at least 3000 rows. Can anyone help me out, please?
Special case when zeros are isolated in each row
You can do it using the two-output version of find to locate the zeros and NaN's in all columns except the first, and then using linear indexing to fill those entries with their row-wise preceding values:
[ii jj] = find( (X(:,2:end)==0) | isnan(X(:,2:end)) );
X(ii+jj*size(X,1)) = X(ii+(jj-1)*size(X,1));
General case (consecutive zeros are allowed on each row)
X(isnan(X)) = 0; %// handle NaN's and zeros in a unified way
aux = repmat(2.^(1:size(X,2)), size(X,1), 1) .* ...
[ones(size(X,1),1) logical(X(:,2:end))]; %// positive powers of 2 or 0
col = floor(log2(cumsum(aux,2))); %// col index
ind = bsxfun(@plus, (col-1)*size(X,1), (1:size(X,1)).'); %'// linear index
Y = X(ind);
The trick is to make use of the matrix aux, which contains 0 if the corresponding entry of X is 0 and its column number is greater than 1; or else contains 2 raised to the column number. Thus, applying cumsum row-wise to this matrix, taking log2 and rounding down (matrix col) gives the column index of the rightmost nonzero entry up to the current entry, for each row (so this is a kind of row-wise "cumulative max" function). It only remains to convert from column number to linear index (with bsxfun; could also be done with sub2ind) and use that to index X.
This is valid only when X has a moderate number of columns. For many columns, the powers of 2 used by the code exceed the 53-bit precision of doubles (and eventually approach realmax), and incorrect indices result.
Example:
X =
0 1 2 2 1 0 0
5 6 3 0 0 2 3
1 1 1 1 0 1 1
gives
>> Y
Y =
0 1 2 2 1 1 1
5 6 3 3 3 2 3
1 1 1 1 1 1 1
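The "cumulative max" described above can also be computed directly with cummax on the column indices, which avoids the powers of 2 altogether. This is a sketch, not part of the original answer; it assumes cummax is available (it is in recent MATLAB and Octave releases), and colIdx and last are illustrative names.

```matlab
X = [0 1 2 2 1 0 0;
     5 6 3 0 0 2 3;
     1 1 1 1 0 1 1];
X(isnan(X)) = 0;                     % handle NaNs and zeros in a unified way
[nr, nc] = size(X);
colIdx = repmat(1:nc, nr, 1);        % column number of every entry
colIdx(X == 0) = 0;                  % zero entries do not count as a source
colIdx(:, 1) = 1;                    % the first column always keeps its own value
last = cummax(colIdx, 2);            % column of the most recent nonzero entry
Y = X(sub2ind([nr nc], repmat((1:nr).', 1, nc), last));
```

Since only integer column numbers are involved, the number of columns is not limited by floating-point precision.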
You can generalize your own solution as follows:
Y = X.'; %'// Make a transposed copy of X
Y(isnan(Y)) = 0;
idx = find([ones(1, size(X, 1)); Y(2:end, :)]);
Y(idx(2:end)) = diff(Y(idx));
Y = reshape(cumsum(Y(:)), [], size(X, 1)).'; %'// Reshape back into a matrix
This works by treating the input data as a long vector, applying the original solution and then reshaping the result back into a matrix. The first column is always treated as non-zero so that values don't propagate from one row into the next. Also note that the original matrix is transposed so that it is converted to a vector in row-major order.
Modified version of Eitan's answer to avoid propagating values across rows:
Y = X'; %'
tf = Y > 0;
tf(1,:) = true;
idx = find(tf);
Y(idx(2:end)) = diff(Y(idx));
Y = reshape(cumsum(Y(:)),fliplr(size(X)))';
x=[0,1,2,2,1,0;
5,6,3,0,1,2;
1,1,1,1,0,1];
%Do it column by column is easier
x=x';
rm=0;
while 1
%fields to replace
l=(x==0);
%do nothing for the first row/column
l(1,:)=0;
rm2=sum(sum(l));
if rm2==rm
%nothing to do
break;
else
rm=rm2;
end
%replace zeros
x(l) = x(find(l)-1);
end
x=x';
I have a function I use for a similar problem for filling NaNs. This can probably be cut down or sped up further - it's extracted from pre-existing code that has a bunch more functionality (forward/backward filling, maximum distance etc).
X = [
0 1 2 2 1 0
5 6 3 0 0 2
1 1 1 1 0 1
0 0 4 5 3 9
];
X(X == 0) = NaN;
Y = nanfill(X,2);
Y(isnan(Y)) = 0
function y = nanfill(x,dim)
if nargin < 2, dim = 1; end
if dim == 2, y = nanfill(x',1)'; return; end
i = find(~isnan(x(:)));
j = 1:size(x,1):numel(x);
j = j(ones(size(x,1),1),:);
ix = max(rep([1; i],diff([1; i; numel(x) + 1])),j(:));
y = reshape(x(ix),size(x));
function y = rep(x,times)
i = find(times);
if length(i) < length(times), x = x(i); times = times(i); end
i = cumsum([1; times(:)]);
j = zeros(i(end)-1,1);
j(i(1:end-1)) = 1;
y = x(cumsum(j));

How do I compute the sum of a subset of elements in a matrix?

I want to calculate the sum of the elements in a matrix that are divisible by 2. How do I do it? And how do I output the answer in a co-ordinate form?
If you have a matrix M, you can find a logical index (i.e. mask) for where the even elements are by using the MOD function, which can operate on an entire matrix without needing loops. For entries in the matrix that are even the remainder will be 0 after dividing by 2:
index = (mod(M,2) == 0);
You can get the row and column indices of these even entries using the function FIND:
[rowIndices,colIndices] = find(index);
And you can get the sum of the even elements by indexing M with the logical mask from above to extract the even entries and using the SUM function to add them up:
evenSum = sum(M(index));
Here's an example with a matrix M created using the function MAGIC:
>> M = magic(3)
M =
8 1 6
3 5 7
4 9 2
>> index = (mod(M,2) == 0)
index =
1 0 1 %# A matrix the same size as M with
0 0 0 %# 1 (i.e. "true") where entries of M are even
1 0 1 %# and 0 (i.e. "false") elsewhere
>> evenSum = sum(M(index))
evenSum =
20
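If "co-ordinate form" means listing each even entry together with its position, the pieces above can be combined as in this sketch. That reading of the question is an assumption, as is the [row, column, value] layout of coords.

```matlab
M = magic(3);                        % same example matrix as above
index = (mod(M,2) == 0);             % logical mask of the even entries
[rowIndices, colIndices] = find(index);
% one line per even entry: [row, column, value], in column-major order
coords = [rowIndices, colIndices, M(index)];
evenSum = sum(coords(:, 3));         % same value as sum(M(index))
```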
This is the matrix M with only its even values:
(mod(M,2) == 0).*M
You can sum it with sum(sum((mod(M,2) == 0).*M)); a single sum(...) only gives the per-column sums (not sure what "co-ordinate form" means).
Some pseudo-code. Pretty much loop through each column for each of the rows.
sum = 0
for(i = 0; i < matrix.num_rows; i++) {
    for(j = 0; j < matrix.num_cols; j++) {
        if(matrix[i][j] % 2 == 0)
            sum += matrix[i][j]
    }
}
Not sure what you mean by Coordinate form though.