Randomly select subset of all possible combinations in Matlab?

I need to select random combinations of k elements from a set of n elements, where n can be fairly large. Given the size of the set, it is not feasible to simply use combnk or nchoosek to generate all possible combinations, and select randomly from those.
Is there an easy way to generate a unique random subset of M of those combinations?
When n is small, the following works:
M = 20; %want to pick M random combinations
n = 10; %number of elements
k = 5; %number of elements in each combination
allCombos = nchoosek([1:n], k); %for large n this is not feasible
numCombos = nchoosek(n,k);
permutationsToUse = randperm(numCombos, M);
randomCombos = allCombos(permutationsToUse, :);
When n is large, this is no longer feasible.

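You can try using randi to generate random rows of 7 integers from 1 to Nelements (7 plays the role of k here) and then keep only the unique rows. Note that randi may repeat a value within a row, and the same values in a different order count as a different row, so strictly these are ordered selections with replacement rather than combinations: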
Nelements=100;
M=10;
combsubset=randi(Nelements,[M 7]);
combsubset=unique(combsubset,'rows');
If you want to get exactly M combinations you can use a loop:
Nelements=100;
M=10;
combsubset=[];
while(size(combsubset,1)<M)
combsubset=[combsubset;randi(Nelements,[M 7])];
combsubset=unique(combsubset,'rows');
end
combsubset=combsubset(1:M,:);
If you want to reuse this to get other combinations you can pretty much use the same code:
Nelements=100;
Mtotal=20;
M=10;
while(size(combsubset,1)<Mtotal)
combsubset=[combsubset;randi(Nelements,[M 7])];
combsubset=unique(combsubset,'rows');
end
combsubset=combsubset(1:Mtotal,:);
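If you need true combinations (no repeated element within a row, order ignored), one option is to draw each row with randperm and sort it. The following is only a minimal sketch using the question's names n, k and M; it assumes the two-argument randperm (R2011b or later) and that M does not exceed nchoosek(n,k). It draws only as many rows as are still missing, so nothing is ever truncated after unique has sorted the rows:

```matlab
n = 100;   % number of elements
k = 5;     % number of elements in each combination
M = 20;    % number of distinct combinations wanted

randomCombos = zeros(0, k);
while size(randomCombos, 1) < M
    missing = M - size(randomCombos, 1);
    batch = zeros(missing, k);
    for r = 1:missing
        % k distinct elements, sorted so that order does not matter
        batch(r, :) = sort(randperm(n, k));
    end
    % discard duplicate rows (this also sorts the rows)
    randomCombos = unique([randomCombos; batch], 'rows');
end
```

The rows come out sorted; if that matters, shuffle them afterwards with randomCombos(randperm(M), :).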
EDIT: Another method for your needs would be to order the combinations to be able to get only a given subset. One method to order them can be explained with the following example: if you have three indices i,j,k ranging from 0 to N-1 you can use a unique index n=i*N*N+j*N+k to go over all the possibilities. Then if you want to get the nth vector:
k=mod(n,N);
j=mod((n-k)/N,N);
i=mod((((n-k)/N)-j)/N,N);
I do not know if you will find this more elegant but with the help of a little function that uses recursion you could easily get a fixed subset of your combinations.
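As a quick check of the indexing idea, here is a small round trip for three indices. The values of N, i0, j0 and k0 are made up for illustration. Keep in mind that this enumerates ordered triples, including ones with repeated values, so you would still have to skip indices that do not correspond to a valid combination:

```matlab
N = 4;                      % each index runs from 0 to N-1
i0 = 2; j0 = 1; k0 = 3;     % example triple (hypothetical values)
n = i0*N*N + j0*N + k0;     % encode the triple as a single index

% decode, using the formulas given above
k = mod(n, N);
j = mod((n - k)/N, N);
i = mod(((n - k)/N - j)/N, N);

disp([i j k])               % recovers the original triple 2 1 3
```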

Related

How to save matrices from for loop into another matrix

I have a 5-by-200 matrix where columns i:50:200, for i=1:50, are related to each other, so for example the matrix columns 1,51,101,151 are related to each other, and columns 49,99,149,199 are also related to each other.
I want to use a for-loop to create another matrix that re-sorts the previous matrix based on this relationship.
My code is
values=zeros(5,200);
for j=1:50
for m=1:4:200
a=factor_mat(:,j:50:200)
values(:,m)=a
end
end
However, the code does not work.
Here's what's happening. Let's say we're on the first iteration of the outer loop, so j == 1. This effectively gives you:
j = 1;
for m=1:4:200
a=factor_mat(:,j:50:200)
values(:,m)=a;
end
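So you're creating the same submatrix for a (j doesn't change) 50 times and trying to store it at different places in the values matrix. On top of that, a is a 5-by-4 block while values(:,m) is a single column, so the assignment fails with a dimension mismatch. This isn't really what you want to do.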
To create each 4-column submatrix once and store them in 50 different places, you need to use j to tell you which of the 50 you're currently processing:
for j=1:50
a=factor_mat(:,j:50:200);
m=j*4; %// This gives us the **end** of the current range
values(:,m-3:m)=a;
end
I've used a little trick here, because the indices of Matlab arrays start at 1 rather than 0. I've calculated the index of the last column we want to insert. For the first group, this is column 4. Since j == 1, j * 4 == 4. Then I subtract 3 to find the first column index.
That will fix the problem you have with your loops. But loops aren't very Matlab-ish. They used to be very slow; now they're adequate. But they're still not the cool way to do things.
To do this without loops, you can use reshape and permute:
a=reshape(factor_mat,[],50,4);
b=permute(a,[1,3,2]);
values=reshape(b,[],200);
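If you want to convince yourself that both versions do the same thing, a small check like the one below may help. The contents of factor_mat are stand-in data invented for the test, and the idx variant is just a third way of writing the same reordering as a column index vector:

```matlab
factor_mat = reshape(1:5*200, 5, 200);   % stand-in data (hypothetical)

% loop version from above
values_loop = zeros(5, 200);
for j = 1:50
    m = j*4;                              % end of the current 4-column range
    values_loop(:, m-3:m) = factor_mat(:, j:50:200);
end

% reshape/permute version from above
values_perm = reshape(permute(reshape(factor_mat, [], 50, 4), [1 3 2]), [], 200);

% the same reordering expressed as a column index vector
idx = reshape(reshape(1:200, 50, 4).', 1, []);   % 1 51 101 151 2 52 102 152 ...
values_idx = factor_mat(:, idx);

disp(isequal(values_loop, values_perm) && isequal(values_loop, values_idx))
```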

Retrieve a specific permutation without storing all possible permutations in Matlab

I am working on 2D rectangular packing, trying to minimize the length of the infinite sheet (the width is constant) by changing the order in which parts are placed. For example, we could place 11 parts in 11! ways.
I could label those parts and save all possible permutations using perms function and run it one by one, but I need a large amount of memory even for 11 parts. I'd like to be able to do it for around 1000 parts.
Luckily, I don't need every possible sequence. I would like to index each permutation to a number. Test a random sequence and then use GA to converge the results to find the optimal sequence.
Therefore, I need a function that returns the same specific permutation every time it is called with the same index, unlike randperm.
For example, function(5,6) should always return, say, [1 4 3 2 5 6] for 6 parts. I don't need the sequences in a specific order, but the function should give the same sequence for the same index, and a different sequence for any other index.
So far, I have used randperm to generate random sequences for around 2000 iterations and picked the best sequence by comparing lengths, but this works only for a small number of parts. Also, randperm may produce repeated sequences instead of unique ones.
Here's a picture of what I have done.
I can't save the outputs of randperm because I won't have a searchable function space. I don't want to find the length of the sheet for all sequences. I only need do it for certain sequence identified by certain index determined by genetic algorithm. If I use randperm, I won't have the sequence for all indexes (even though I only need some of them).
For example, take some function, 'y = f(x)', in the range [0,10] say. For each value of x, I get a y. Here y is my sheet length. x is the index of permutation. For any x, I find its sequence (the specific permutation) and then its corresponding sheet length. Based on the results of some random values of x, GA will generate me a new list of x to find a more optimal y.
I need a function that duplicates perms without actually storing the values. (I assume perms follows the same order each time it is run, because perms(1:4) yields the same result however many times it is called.)
Is there a way to write such a function? If not, how do I solve my problem?
Edit (how I approached the problem):
In a genetic algorithm, you need to cross over parents (permutations), but if you cross over permutations directly, you get repeated numbers. For example, crossing over 1 2 3 4 with 3 2 1 4 may result in something like 3 2 3 4. To avoid the repetition, I thought of indexing each parent with a number, converting the number to binary form, crossing over the binary indices to get a new binary number, converting it back to decimal and finding its specific permutation. Later on, I discovered I could just use ordered crossover on the permutations themselves instead of crossing over their indices.
More details on Ordered Crossover could be found here
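For readers who have not seen it, the idea behind ordered crossover can be sketched roughly as follows. This is a simplified variant written for illustration, not necessarily the exact scheme used in the question: a slice of the first parent is copied in place, and the remaining positions are filled with the missing parts in the order they appear in the second parent. The parents and cut points are made-up values:

```matlab
p1 = [1 2 3 4 5 6];           % first parent (hypothetical)
p2 = [3 6 1 5 2 4];           % second parent (hypothetical)
a = 2; b = 4;                 % cut points, fixed here for illustration

child = zeros(size(p1));
child(a:b) = p1(a:b);                        % copy the slice from the first parent
rest = p2(~ismember(p2, p1(a:b)));           % remaining parts, in the second parent's order
child([1:a-1, b+1:numel(p1)]) = rest;        % fill the free positions left to right

disp(child)                   % 6 2 3 4 1 5, a valid permutation with no repeats
```

In a real run the cut points would be random, for example cuts = sort(randperm(numel(p1), 2)).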
Below are two functions that together generate permutations in lexicographical order and return the nth permutation.
For example, I can call
nth_permutation(5, [1 2 3 4])
And the output will be [1 4 2 3]
Intuitively, the time this method takes is linear in n. The size of the set doesn't matter. I benchmarked nth_permutation(n, 1:1000) averaged over 100 iterations and got the following graph
So timewise it seems okay.
function [permutation] = nth_permutation(n, set)
%%NTH_PERMUTATION Generates n permutations of set in lexicographical order and
%%outputs the last one
%% set is a 1 by m matrix
set = sort(set);
permutation = set; %First permutation
for ii=2:n
permutation = next_permute(permutation);
end
end
function [p] = next_permute(p)
%Following algorithm from https://en.wikipedia.org/wiki/Permutation#Generation_in_lexicographic_order
%Find the largest index k such that p[k] < p[k+1]
larger = p(1:end-1) < p(2:end);
k = max(find(larger));
%If no such index exists, the permutation is the last permutation.
if isempty(k)
display('Last permutation reached');
return
end
%Find the largest index l greater than k such that p[k] < p[l].
larger = [false(1, k) p(k+1:end) > p(k)];
l = max(find(larger));
%Swap the value of p[k] with that of p[l].
p([k, l]) = p([l, k]);
%Reverse the sequence from p[k + 1] up to and including the final element p[n].
p(k+1:end) = p(end:-1:k+1);
end
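Because the functions above step through every permutation up to the nth, very large indices get slow. A different technique, the factorial number system (Lehmer code), computes the nth lexicographic permutation directly. The script below is only a sketch of that alternative, not part of the functions above. It assumes that n and the factorials involved are exactly representable as doubles, which limits it to sets of roughly 18 elements; larger sets would need the index stored as a vector of digits or in arbitrary precision. The variable names are my own:

```matlab
elements = sort([1 2 3 4]);   % elements to permute
n = 5;                        % 1-based index of the wanted permutation

m = numel(elements);
idx = n - 1;                  % zero-based rank
pool = elements;              % elements not yet placed
permutation = zeros(1, m);
for pos = 1:m
    f = factorial(m - pos);           % number of permutations of the remaining tail
    d = floor(idx / f);               % which pool element goes here (0-based)
    idx = idx - d*f;
    permutation(pos) = pool(d + 1);
    pool(d + 1) = [];                 % remove the element that was just placed
end

disp(permutation)             % 1 4 2 3, same as nth_permutation(5, [1 2 3 4])
```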

Changing numbers for given indices between matrices

I'm struggling with one of my matlab assignments. I want to create 10 different models. Each of them is based on the same original array of dimensions 1x100 m_est. Then with for loop I am choosing 5 random values from the original model and want to add the same random value to each of them. The cycle repeats 10 times choosing different values each time and adding different random number. Here is a part of my code:
steps=10;
for s=1:steps
for i=1:1:5
rl(s,i)=m_est(randi(numel(m_est)));
rl_nr(s,i)=find(rl(s,i)==m_est);
a=-1;
b=1;
r(s)=(b-a)*rand(1,1)+a;
end
pert_layers(s,:)=rl(s,:)+r(s);
M=repmat(m_est',s,1);
end
for k=steps
for m=1:1:5
M_pert=M;
M_pert(1:k,rl_nr(k,1:m))=pert_layers(1:k,1:m);
end
end
In matrix M I am storing 10 initial models and want to replace the random numbers with indices from rl_nr matrix into those stored in pert_layers matrix. However, the last loop responsible for assigning values from pert_layers to rl_nr indices does not work properly.
Does anyone know how to solve this?
Best regards
Your code uses a lot of loops and in this particular circumstance, it's quite inefficient. It's better if you actually vectorize your code. As such, let me go through your problem description one point at a time and let's code up each part (if applicable):
I want to create 10 different models. Each of them is based on the same original array of dimensions 1x100 m_est.
I'm interpreting this as you having an array m_est of 100 elements, and with this array, you wish to create 10 different "models", where each model is 5 elements sampled from m_est. rl will store these values from m_est while rl_nr will store the indices / locations of where these values originated from. Also, for each model, you wish to add a random value to every element that is part of this model.
Then with for loop I am choosing 5 random values from the original model and want to add the same random value to each of them.
Instead of doing this with a for loop, generate all of your random indices in one go. Since you have 10 steps, and we wish to sample 5 points per step, you have 10*5 = 50 points in total. As such, why don't you use randperm instead? randperm is exactly what you're looking for, and we can use this to generate unique random indices so that we can ultimately use this to sample from m_est. randperm generates a vector from 1 to N but returns a random permutation of these elements. This way, you only get numbers enumerated from 1 to N exactly once and we will ensure no repeats. As such, simply use randperm to generate 50 elements, then reshape this array into a matrix of size 10 x 5, where the number of rows tells you the number of steps you want, while the number of columns is the total number of points per model. Therefore, do something like this:
num_steps = 10;
num_points_model = 5;
ind = randperm(numel(m_est));
ind = ind(1:num_steps*num_points_model);
rl_nr = reshape(ind, num_steps, num_points_model);
rl = m_est(rl_nr);
The first two lines are pretty straightforward. We are just declaring the total number of steps you want to take, as well as the total number of points per model. Next, we generate a random permutation of length 100, whose elements are the numbers 1 to 100 in random order. You'll notice that each value in the range 1 to 100 appears exactly once in this vector. Because you only want 50 points in total, simply subset this vector so that we keep the first 50 random indices generated by randperm. These random indices get stored in ind.
Next, we simply reshape ind into a 10 x 5 matrix to get rl_nr, which holds the indices used to select entries from m_est. Finally, rl will be a matrix of the same size as rl_nr, but it will contain the actual values sampled from m_est at those indices.
Now, the final step would be to add the same random number to each model. You can certainly use repmat to replicate a random column vector of 10 elements 5 times, so that we have 5 columns, and then add this matrix to rl, so something like:
a = -1;
b = 1;
r = (b-a)*rand(num_steps, 1) + a;
r = repmat(r, 1, num_points_model);
M_pert = rl + r;
Now M_pert is the final result you want, where we take each model that is stored in rl and add the same random value to each corresponding model in the matrix. However, if I can suggest something more efficient, I would suggest you use bsxfun instead, which does this replication under the hood. Essentially, the above code would be replaced with:
a = -1;
b = 1;
r = (b-a)*rand(num_steps, 1) + a;
M_pert = bsxfun(@plus, rl, r);
Much easier to read, and less code. M_pert will contain your models in each row, with the same random value added to each particular model. In MATLAB R2016b and later, implicit expansion means rl + r gives the same result without bsxfun.
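If what you actually need is the full 10 x 100 matrix with only the selected entries perturbed, as the question's M suggests, one possible way is to write the values back with sub2ind. This is a sketch under that assumption; m_est here is stand-in data, and M_full, rows and lin are names I made up:

```matlab
m_est = linspace(1, 10, 100);    % stand-in for the original 1x100 model (hypothetical values)
num_steps = 10;
num_points_model = 5;

ind = randperm(numel(m_est));
rl_nr = reshape(ind(1:num_steps*num_points_model), num_steps, num_points_model);
r = 2*rand(num_steps, 1) - 1;                   % one offset in [-1, 1] per model

M = repmat(m_est(:).', num_steps, 1);           % 10 unperturbed copies, one per row
rows = repmat((1:num_steps).', 1, num_points_model);
lin = sub2ind(size(M), rows, rl_nr);            % linear indices of the entries to perturb
M_full = M;
M_full(lin) = M(lin) + repmat(r, 1, num_points_model);
```

Row s of M_full is then the original model with the 5 entries listed in rl_nr(s,:) shifted by r(s).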
The cycle repeats 10 times choosing different values each time and adding different random number.
Already done in the above steps.
I hope you didn't find it an imposition to completely rewrite your code so that it's more vectorized, but I think this was a great opportunity to show you some of the more advanced functions that MATLAB has to offer, as well as more efficient ways to generate your random values, rather than looping and generating the values one at a time.
Hopefully this will get you started. Good luck!

How to generate unique random numbers in Matlab?

I need to generate m unique random numbers in range 1 to n. Currently what I have implemented is:
round(rand(1,m)*(n-1)+1)
However, some numbers are repeated in the array. How can I get only unique numbers?
You can use randperm.
From the description:
p = randperm(n,k) returns a row vector containing k unique integers
selected randomly from 1 to n inclusive.
Thus, randperm(6,3)
might be the vector
[4 2 5]
Update
The two-argument version of randperm only appeared in R2011b, so in an earlier version of MATLAB the call above raises an error. In that case, use:
A = randperm(n);
A = A(1:m);
As pointed out above, in Matlab versions older than R2011b randperm only accepts one input argument. In that case the easiest approach, assuming you have the Statistics Toolbox, is to use randsample:
randsample(n,m)
The randperm approach described by @Stewie appears to be the way to go in most cases. However, if your randperm accepts only one input argument and n is really large, it may not be feasible to use randperm on all numbers and select the first few. In this case here is what you can do:
Generate an integer between 1 and n
Generate an integer between 1 and n-1, this is the choice out of the available integers.
Repeat until you have m numbers
This can be done with randi and could even be vectorized by just drawing a lot of random numbers at each step until the unique amount is correct.
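A vectorized take on this idea could look something like the sketch below: draw as many numbers as are still missing, throw away repeats, and loop until there are m of them. It assumes randi is available (R2008b or later) and that m is small compared to n, so only a few passes are needed:

```matlab
n = 1e9;   % upper bound of the range
m = 5;     % how many unique integers are wanted

picked = [];
while numel(picked) < m
    draw = randi(n, 1, m - numel(picked));   % draw only as many as are still missing
    picked = unique([picked draw]);          % discard repeats (this also sorts)
end
picked = picked(randperm(m));                % optional: undo the sorting
```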
Use Shuffle, from the MATLAB File Exchange.
Index = Shuffle(n, 'index', m);
This can be done by sorting a random vector of floats:
[~,idx]=sort(rand(1,n));
output=idx(1:m);

Generating all possible combinations for selecting a particular number of rows in a large matrix

I would like to generate all possible combinations for selecting rows in batches of, let's say, size k. For example, matrix A has 3 rows and I want all combinations for batch size 2, i.e. rows (1,2), (1,3), (2,3). What would be the simplest way to do it? Then I would like to use them for some operation like myfunction();
I think nchoosek function does the trick of selecting the combination but then how can I use each row of the output from nchoosek as index for my matrix?
If you want to use each combination one by one you can do something like this:
A = rand(3);
comb = nchoosek(1:size(A, 1), 2); % each row of comb is one pair of row indices
for i = 1:size(comb, 1)
myfunction(A(comb(i, :), :));
end
A(comb(i, :), :) is a k x n matrix (here 2 x 3) holding the rows of the i-th combination.