How to create all permutations of a 2-column cell-array? - matlab

I created a cell array of shape m x 2, each element of which is a matrix of shape d x d.
For example like this:
A = cell(8, 2);
for row = 1:8
for col = 1:2
A{row, col} = rand(3, 3);
end
end
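More generally, I can represent A as follows:
$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \\ \vdots & \vdots \\ A_{m1} & A_{m2} \end{pmatrix}$$
where each A_{ij} is a matrix.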
Now, I need to randomly pick a matrix from each row of A, because A has m rows in total, so eventually I will pick out m matrices, which we call a combination.
Obviously, since there are only two picks for each row, there are a total of 2^m possible combinations.
My question is, how to get these 2^m combinations quickly?
It can be seen that the above problem is actually finding the Cartesian product of the following sets:
$$\{A_{11}, A_{12}\} \times \{A_{21}, A_{22}\} \times \cdots \times \{A_{m1}, A_{m2}\}$$

Each of the 2^m combinations corresponds to an m-digit binary number, so we can use those numbers to create linear indices. Each number gives you an array containing 1s and 0s, something like [1 1 0 0 1 0 1 0 1], which we can treat as column "indices", using a 0 to indicate the first column and a 1 to indicate the second.
m = size(A, 1);
% Build all binary numbers and create a logical matrix
bin_idx = dec2bin(0:(2^m -1)) == '1';
row = 3; % Loop here over size(bin_idx,1) for all possible permutations
linear_idx = [find(~bin_idx(row,:)) find(bin_idx(row,:))+m];
A{linear_idx} % the combination as specified by the bit pattern in bin_idx(row,:)
On my R2007b version this runs virtually instant for m = 20.
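As a rough sketch of the loop hinted at above, the following collects every combination in row order. It uses sub2ind instead of find, which is a swapped-in way of building the same linear indices; the names combos, cols and lin are illustrative, and m is kept small so that all 2^m combinations fit in memory.

```matlab
% Small example so that all 2^m combinations fit comfortably in memory.
m = 3;
A = cell(m, 2);
for r = 1:m
    for c = 1:2
        A{r, c} = rand(3, 3);
    end
end

bin_idx = dec2bin(0:(2^m - 1)) == '1';   % 2^m-by-m logical matrix
combos  = cell(2^m, m);                  % combos{k, r}: matrix picked from row r
for k = 1:2^m
    cols = bin_idx(k, :) + 1;            % 1 = first column, 2 = second column
    lin  = sub2ind(size(A), 1:m, cols);  % one linear index per row, in row order
    combos(k, :) = A(lin);
end
```

Each row of combos is one combination, with the picked matrices listed in the same order as the rows of A.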
NB: this will take m * 2^m bytes of memory to store bin_idx. Where that's just 20 MB for m = 20, that's already 30 GB for m = 30, i.e. you'll be running out of memory fairly quickly, and that's for just storing permutations as booleans! If m is large in your case, you can't store all of your possibilities anyway, so I'd just select a random one:
% Random logical row vector: true picks the second column, false the first.
% Each entry is true with probability 0.5, so roughly half are set.
bin_idx = rand(1, m) > 0.5;
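If you do want to visit specific combinations without holding the whole bin_idx matrix, one option (a sketch, not part of the original answer) is to compute the bit pattern for a single combination number on demand with bitget. The names k, bits and cols are illustrative.

```matlab
m = 5;                      % number of rows in A (small here for illustration)
k = 19;                     % any integer in 0:(2^m - 1)
bits = bitget(k, m:-1:1);   % most significant bit first, like dec2bin(k, m)
cols = bits + 1;            % 1 = first column, 2 = second column
```

Looping k over 0:(2^m - 1) then visits every combination while only ever storing one row of bits.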
Old, slow answer for large m
perms() [1] gives you all possible permutations of a given set. However, it does not take duplicate entries into account, so you'll need to call unique() to get the unique rows.
unique(perms([1,1,2,2]), 'rows')
ans =
1 1 2 2
1 2 1 2
1 2 2 1
2 1 1 2
2 1 2 1
2 2 1 1
The only thing left now is to somehow do this over all possible amounts of 1s and 2s. I suggest using a simple loop:
m = 5;
out = [];
for ii = 1:m
my_tmp = ones(m,1);
my_tmp(ii:end) = 2;
out = [out; unique(perms(my_tmp),'rows')];
end
out = [out; ones(1,m)]; % Tack on the missing all-ones row
out =
2 2 2 2 2
1 2 2 2 2
2 1 2 2 2
2 2 1 2 2
2 2 2 1 2
2 2 2 2 1
1 1 2 2 2
1 2 1 2 2
1 2 2 1 2
1 2 2 2 1
2 1 1 2 2
2 1 2 1 2
2 1 2 2 1
2 2 1 1 2
2 2 1 2 1
2 2 2 1 1
1 1 1 2 2
1 1 2 1 2
1 1 2 2 1
1 2 1 1 2
1 2 1 2 1
1 2 2 1 1
2 1 1 1 2
2 1 1 2 1
2 1 2 1 1
2 2 1 1 1
1 1 1 1 2
1 1 1 2 1
1 1 2 1 1
1 2 1 1 1
2 1 1 1 1
1 1 1 1 1
NB: I've not preallocated out, so it grows inside the loop, which will be slow, especially for large m. Its final size is known, so out = zeros(2^m, m) would work, but you'll need to juggle the indices within the for loop to account for the changing sizes of the unique permutations.
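A minimal sketch of that index juggling, assuming a running row counter (here called next, an illustrative name) is acceptable:

```matlab
m = 5;
out  = zeros(2^m, m);       % final size is known in advance
next = 1;                   % first free row in out
for ii = 1:m
    my_tmp = ones(1, m);
    my_tmp(ii:end) = 2;
    block  = unique(perms(my_tmp), 'rows');
    n_rows = size(block, 1);
    out(next:next + n_rows - 1, :) = block;
    next = next + n_rows;
end
out(next, :) = ones(1, m);  % the all-ones row fills the last slot
```

After the loop, next equals 2^m, so the all-ones row lands exactly in the last preallocated row.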
You can create linear indices from out using find()
linear_idx = [find(out(row,:)==1).';find(out(row,:)==2).'+size(A,1)];
A{linear_idx} % the combination as specified by the permutation in out(row)
Linear indices are column-major in MATLAB, thus whenever you need the matrix in column 1, simply use its row number, and whenever you need the second column, use the row number + size(A,1), i.e. the total number of rows.
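A tiny illustration of that indexing rule, using a made-up 3-by-2 cell array of labels instead of matrices so that the result is easy to read:

```matlab
% Hypothetical 3-by-2 cell array; the labels say where each element lives.
A = {'r1c1', 'r1c2'; 'r2c1', 'r2c2'; 'r3c1', 'r3c2'};
m = size(A, 1);

first  = A{2};                    % row 2, column 1 -> 'r2c1'
second = A{2 + m};                % row 2, column 2 -> 'r2c2'
lin    = sub2ind(size(A), 2, 2);  % same linear index as 2 + m
```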
Combining everything together:
A = cell(8, 2);
for row = 1:8
for col = 1:2
A{row, col} = rand(3, 3);
end
end
m = size(A,1);
out = [];
for ii = 1:m
my_tmp = ones(m,1);
my_tmp(ii:end) = 2;
out = [out; unique(perms(my_tmp),'rows')];
end
out = [out; ones(1,m)];
row = 3; % Loop here over size(out,1) for all possible permutations
linear_idx = [find(out(row,:)==1).';find(out(row,:)==2).'+m];
A{linear_idx} % the combination as specified by the permutation in out(row)
[1] There's a note in the documentation:
perms(v) is practical when length(v) is less than about 10.

Related

How can I calculate the relative frequency of a row in a data set using Matlab?

I am new to Matlab and I have a basic question.
I have this data set:
1 2 3
4 5 7
5 2 7
1 2 3
6 5 3
I am trying to calculate the relative frequencies from the dataset above
specifically calculating the relative frequency of x=1, y=2 and z=3
my code is:
data = load('datasetReduced.txt')
X = data(:, 1)
Y = data(:, 2)
Z = data(:, 3)
f = 0;
for i=1:5
if X == 1 & Y == 2 & Z == 3
s = 1;
else
s = 0;
end
f = f + s;
end
f
r = f/5
it is giving me a 0 result.
How can the code be corrected??
thanks,
Shosho
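Your code returns 0 because X == 1 & Y == 2 & Z == 3 compares the entire columns at once, and an if statement only executes its branch when every element of the condition is true; the loop variable i is never used to index the data. Inside the loop the test would have to be X(i) == 1 && Y(i) == 2 && Z(i) == 3. Separately, if your real data are not integers, comparing floating point numbers using the == operator is likely to fail due to floating point errors.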
A faster way to do this would be to use ismember with the 'rows' option which will result in a logical array that you can then sum to get the total number of rows that matched and divide by the total number of rows.
tf = ismember(data, [1 2 3], 'rows');
relFreq = sum(tf) / numel(tf);
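For comparison, here is a sketch of the question's loop with per-row indexing next to the ismember version, using the five rows from the question typed in directly rather than loaded from a file:

```matlab
data = [1 2 3
        4 5 7
        5 2 7
        1 2 3
        6 5 3];

% Loop version: test one row at a time.
f = 0;
for i = 1:size(data, 1)
    if data(i, 1) == 1 && data(i, 2) == 2 && data(i, 3) == 3
        f = f + 1;
    end
end
r = f / size(data, 1);

% Vectorised version.
tf = ismember(data, [1 2 3], 'rows');
relFreq = sum(tf) / numel(tf);
```

Both give 2 matching rows out of 5, i.e. a relative frequency of 0.4.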
I think you want to count the frequency of each value, so try this:
data = [1 2 3
4 5 7
5 2 7
1 2 3
6 5 3];
[counts,centers] = hist(data , unique(data))
Here centers holds the unique values and counts holds how often each of them occurs in each column. The result should be as follows:
counts =
2 0 0
0 3 0
0 0 3
1 0 0
1 2 0
1 0 0
0 0 2
centers =
1 2 3 4 5 6 7
This means you have 7 unique values, from 1 to 7: there are two 1s in the first column and none in the second or third, and so on.
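If the goal is the frequency of each whole row rather than of each value per column, a possible variant (an assumption about what is wanted, not something stated in the question) combines unique with the 'rows' option and accumarray:

```matlab
data = [1 2 3
        4 5 7
        5 2 7
        1 2 3
        6 5 3];

[uniqueRows, ~, ic] = unique(data, 'rows');  % ic maps each row to its unique row
counts  = accumarray(ic(:), 1);              % occurrences of each unique row
relFreq = counts / size(data, 1);            % relative frequency per unique row
```

Here uniqueRows has four rows, and the row [1 2 3] gets a count of 2 and a relative frequency of 0.4.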

MATLAB generate all ways that n items can be put into m bins?

I want to find all ways that n items can be split among m bins. For example, for n=3 and m=3 the output would be (the order doesn't matter):
[3 0 0
0 3 0
0 0 3
2 1 0
1 2 0
0 1 2
0 2 1
1 0 2
2 0 1
1 1 1]
The algorithm should be as efficient as possible, preferably vectorized/using inbuilt functions rather than for loops. Thank you!
This should be pretty efficient.
It works by generating all possible splittings of the real interval [0, n] at m−1 integer-valued, possibly coincident split points. The lengths of the resulting subintervals give the solution.
For example, for n=4 and m=3, some of the possible ways to split the interval [0, 4] at m−1 points are:
Split at 0, 0: this gives subintervals of lengths 0, 0, 4.
Split at 0, 1: this gives subintervals of lengths 0, 1, 3.
...
Split at 4, 4: this gives subintervals of lengths 4, 0, 0.
Code:
n = 4; % number of items
m = 3; % number of bins
x = bsxfun(@minus, nchoosek(0:n+m-2,m-1), 0:m-2); % split points
x = [zeros(size(x,1),1) x n*ones(size(x,1),1)]; % add start and end of interval [0, n]
result = diff(x.').'; % compute subinterval lengths
The result is in lexicographical order.
As an example, for n = 4 items in m = 3 bins the output is
result =
0 0 4
0 1 3
0 2 2
0 3 1
0 4 0
1 0 3
1 1 2
1 2 1
1 3 0
2 0 2
2 1 1
2 2 0
3 0 1
3 1 0
4 0 0
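As a sanity check on the approach, the sketch below reruns the same code and verifies three properties: the number of rows equals the "stars and bars" count nchoosek(n+m-1, m-1), every row sums to n, and no entry is negative. The checks themselves are an addition for illustration.

```matlab
n = 4;   % number of items
m = 3;   % number of bins

x = bsxfun(@minus, nchoosek(0:n+m-2, m-1), 0:m-2);   % split points
x = [zeros(size(x,1),1) x n*ones(size(x,1),1)];      % add interval ends
result = diff(x.').';                                % subinterval lengths

nExpected = nchoosek(n + m - 1, m - 1);   % stars-and-bars count, 15 here
rowsOk    = size(result, 1) == nExpected;
sumsOk    = all(sum(result, 2) == n);
signsOk   = all(result(:) >= 0);
```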
I'd like to suggest a solution based on an external function and accumarray (it should work starting R2015a because of repelem):
n = uint8(4); % number of items
m = uint8(3); % number of bins
whichBin = VChooseKR(1:m,n).'; % see FEX link below. Transpose saves us a `reshape()` later.
result = accumarray([repelem(1:size(whichBin,2),n).' whichBin(:)],1);
Where VChooseKR(V,K) creates a matrix whose rows are all combinations created by choosing K elements of the vector V with repetitions.
Explanation:
The output of VChooseKR(1:m,n) for m=3 and n=4 is:
1 1 1 1
1 1 1 2
1 1 1 3
1 1 2 2
1 1 2 3
1 1 3 3
1 2 2 2
1 2 2 3
1 2 3 3
1 3 3 3
2 2 2 2
2 2 2 3
2 2 3 3
2 3 3 3
3 3 3 3
All we need to do now is "histcount" the numbers on each row using positive integer bins to get the desired result. The first output row would be [4 0 0] because all 4 elements go in the 1st bin. The second row would be [3 1 0] because 3 elements go in the 1st bin and 1 in the 2nd, etc.
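To see the counting step in isolation without the external VChooseKR function, the sketch below feeds accumarray three hand-written rows in the same format. The explicit output size [nCombos m] is an addition, so that trailing empty bins are still reported.

```matlab
n = 4;   % number of items
m = 3;   % number of bins

% Three hand-written rows in the format VChooseKR would return, transposed
% so that each column holds the bin numbers of one combination.
whichBin = [1 1 1 1; 1 1 1 2; 1 2 3 3].';
nCombos  = size(whichBin, 2);

rowIdx = repelem(1:nCombos, n).';   % combination number for every item
result = accumarray([rowIdx whichBin(:)], 1, [nCombos m]);
```

The three rows of result are [4 0 0], [3 1 0] and [1 1 2].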

Transform a matrix to a stacked vector where all zeroes after the last non-zero value per row are removed

I have a matrix with some zero values I want to erase.
a=[ 1 2 3 0 0; 1 0 1 3 2; 0 1 2 5 0]
>>a =
1 2 3 0 0
1 0 1 3 2
0 1 2 5 0
However, I want to erase only the ones after the last non-zero value of each line.
This means that I want to retain 1 2 3 from the first line, 1 0 1 3 2 from the second and 0 1 2 5 from the third.
I want to then store the remaining values in a vector. In the case of the example this would result in the vector
b=[1 2 3 1 0 1 3 2 0 1 2 5]
The only way I figured out involves a for loop that I would like to avoid:
b=[];
for ii=1:size(a,1)
l=max(find(a(ii,:)));
b=[b a(ii,1:l)];
end
Is there a way to vectorize this code?
There are many possible ways to do this, here is my approach:
arotate = a' %//transpose a so that each row of a becomes a column
b=flipud(arotate) %//flip the matrix upside down
c= flipud(cumsum(b,1)) %//cumulative sum down each column, then flip it back
arotate(c==0)=[]
arotate =
1 2 3 1 0 1 3 2 0 1 2 5
=========================EDIT=====================
just realized cumsum can have direction parameter so this should do:
arotate = a'
b = cumsum(arotate,1,'reverse')
arotate(b==0)=[]
This direction parameter was not available on my 2010b version, but should be there for you if you are using 2013a or above.
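One caveat with summing the values themselves is that negative entries could cancel out to zero. A hedged variant that sums a nonzero mask instead, and uses flipud so that it does not depend on the 'reverse' option, might look like this (keep is an illustrative name):

```matlab
a = [1 2 3 0 0; 1 0 1 3 2; 0 1 2 5 0];

at   = a.';                                        % one column per row of a
keep = flipud(cumsum(flipud(at ~= 0), 1)) > 0;     % true up to the last nonzero
b    = at(keep).';                                 % stacked row vector
```

For the example matrix this gives b = [1 2 3 1 0 1 3 2 0 1 2 5].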
Here's an approach using bsxfun's masking capability -
M = size(a,2); %// Save size parameter
at = a.'; %// Transpose input array, to be used for masked extraction
%// Index IDs of last non-zero for each row when looking from right side
[~,idx] = max(fliplr(a~=0),[],2);
%// Create a mask of elements that are to be picked up in a
%// transposed version of the input array using BSXFUN's broadcasting
out = at(bsxfun(@le,(1:M)',M+1-idx'))
Sample run (to showcase mask usage) -
>> a
a =
1 2 3 0 0
1 0 1 3 2
0 1 2 5 0
>> M = size(a,2);
>> at = a.';
>> [~,idx] = max(fliplr(a~=0),[],2);
>> bsxfun(@le,(1:M)',M+1-idx') %// mask to be used on transposed version
ans =
1 1 1
1 1 1
1 1 1
0 1 1
0 1 0
>> at(bsxfun(@le,(1:M)',M+1-idx')).'
ans =
1 2 3 1 0 1 3 2 0 1 2 5
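On R2016b or later, the same mask can be built with implicit expansion instead of bsxfun. The sketch below also handles all-zero rows, for which max would otherwise report the first position and keep the whole row; the extra zero row in the example and the name lastCol are illustrative additions.

```matlab
a = [1 2 3 0 0; 1 0 1 3 2; 0 0 0 0 0; 0 1 2 5 0];

M  = size(a, 2);
at = a.';
[~, idx] = max(fliplr(a ~= 0), [], 2);   % position of last nonzero, from the right
lastCol  = M + 1 - idx;                  % column of last nonzero, from the left
lastCol(~any(a, 2)) = 0;                 % all-zero rows contribute nothing

mask = (1:M).' <= lastCol.';             % implicit expansion (R2016b+)
out  = at(mask).';
```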

Modify parts of a matrix based on linear equation on row and column numbers

For example:
>> tmp = ones(5,5)
tmp =
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
I want a command like:
tmp(colNum - 2*rowNum > 0) = 0
that modifies entries of tmp when the column number is more than twice the row number e.g. it should produce:
tmp =
1 1 0 0 0
1 1 1 1 0
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
As a second example, tmp(colNum - rowNum == 0) = 0 should set the diagonal elements of tmp to be zero.
A possibly more efficient solution is to use bsxfun like so
nRows = 5;
nCols = 5;
bsxfun(@(col,row)~(col - 2*row > 0), 1:nCols, (1:nRows)')
You can generalize this to just accept a function so it becomes
bsxfun(@(col,row)~f(col,row), 1:nCols, (1:nRows)')
And now just replace f with exactly the way you specify the equation in your question i.e.
f = @(colNum, rowNum)(colNum - 2*rowNum > 0)
or
f = @(colNum, rowNum)(colNum - rowNum == 0)
Of course, it might make more sense to specify your function to accept (row,col) instead of (col,row), as that's how MATLAB indexes.
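Put together, one way to apply this to an existing matrix is to use the condition itself (without the negation) as a logical index. This is a sketch of how the pieces might be combined rather than the answerer's exact usage:

```matlab
nRows = 5;
nCols = 5;
tmp = ones(nRows, nCols);

f = @(colNum, rowNum) (colNum - 2*rowNum > 0);   % condition from the question
mask = bsxfun(f, 1:nCols, (1:nRows)');           % nRows-by-nCols logical mask
tmp(mask) = 0;                                   % zero the entries that satisfy f
```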
You can use meshgrid to generate a grid of 2D coordinates, then use this to impose any condition you wish. The variant you seek outputs 2 2D matrices where the first matrix gives you the column locations and the second matrix outputs the row locations.
For example, given your situation above:
>> [X,Y] = meshgrid(1:5, 1:5)
X =
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
Y =
1 1 1 1 1
2 2 2 2 2
3 3 3 3 3
4 4 4 4 4
5 5 5 5 5
You can see that each pair of entries at the same position in X and Y gives you the 2D location (column, row) of that position, as if you were envisioning a 2D grid.
Therefore, you would do something like this for your first situation:
[X,Y] = meshgrid(1:5,1:5); % Generate 2D coordinates
tmp = ones(5); % Generate desired matrix
tmp(X > 2*Y) = 0; % Set desired locations to 0
We get:
>> tmp
tmp =
1 1 0 0 0
1 1 1 1 0
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
Finally for your second example:
[X,Y] = meshgrid(1:5,1:5); % Generate 2D coordinates
tmp = ones(5); % Generate desired matrix
tmp(X == Y) = 0; % Set desired locations to 0
We get:
>> tmp
tmp =
0 1 1 1 1
1 0 1 1 1
1 1 0 1 1
1 1 1 0 1
1 1 1 1 0
Simply put, generate a grid of 2D coordinates, then use those directly to index into your desired matrix using logical / Boolean conditions to set the desired locations to 0.
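On R2016b or later, implicit expansion lets you skip the explicit grid: comparing a row vector of column numbers with a column vector of row numbers yields the same logical mask. A minimal sketch, with cols and rows as illustrative names:

```matlab
cols = 1:5;         % column numbers as a row vector
rows = (1:5).';     % row numbers as a column vector

tmp1 = ones(5);
tmp1(cols > 2*rows) = 0;    % first example: column more than twice the row

tmp2 = ones(5);
tmp2(cols == rows) = 0;     % second example: the diagonal
```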

Position from reordering in ascending order in Matlab?

I have a matrix in Matlab of dimension mxn, e.g.
A= [ 1 1 1;
1 1 1;
2 2 2;
0 0 1]
I want to order the rows of A in ascending order and get the position of each row within this order. If I use
[~,~,jj] = unique(A,'rows');
I get
jj=[2;2;3;1]
What I want to get is jj=[2;3;4;1] (or jj=[3;2;4;1]), i.e. even if the first two rows of A are equivalent they should not be associated to the same position jj.
Check sortrows. This sorts your array row-based and gives you an array index that tells you where each row was initially.
[B,index] = sortrows(A);
B =
0 0 1
1 1 1
1 1 1
2 2 2
index =
4
1
2
3
And, as @Divakar pointed out:
[~,out] = intersect(index,1:4);
out =
2 3 4 1
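The step from index to out is an inverse permutation, so another way to get it (a sketch, equivalent for this purpose) is to scatter the positions 1..n back through index:

```matlab
A = [1 1 1
     1 1 1
     2 2 2
     0 0 1];

[~, index] = sortrows(A);        % index(k): original row that ends up k-th
out = zeros(size(index));
out(index) = 1:numel(index);     % out(r): position of original row r
```

This yields out = [2; 3; 4; 1], the same positions as the intersect call.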
If the elements are non-negative integers only, this could be another way -
key = A*((max(A(:))+1).^(size(A,2)-1:-1:0)).'; %// one scalar key per row, first column most significant
[~,idx] = sort(key,1)
[~,out] = sort(idx)
Sample run -
>> A
A =
1 1 1
1 1 1
2 2 2
0 0 1
>> key = A*((max(A(:))+1).^(size(A,2)-1:-1:0)).';
[~,idx] = sort(key,1);
[~,out] = sort(idx)
out =
2
3
4
1