From the integers 1,...,N I would like to take k random distinct combinations without repetition of size p. For example, if N=10, k=4 and p=3, a possible outcome would be:
1 4 9
9 4 2
3 5 2
1 8 4
But not:
1 4 9
9 4 2
3 5 3
1 9 4
For two reasons:
[1 4 9] and [1 9 4] are the same combination.
[3 5 3] is not without repetition.
Note that generating all possible combinations and then (randomly) picking k of them easily runs into memory problems.
Okay, I have found a solution that works for me. My main concerns were:
I want the k combinations to be random.
processing time.
The below function samples a single random combination of size p, (namely row = randperm(N,p)) each iteration and adds that combination if it isn't already present.
Of the three parameters, mainly k influences the processing time. For not too large k, this code runs in a matter of seconds. The most extreme case I myself will encounter is N = 10^6, k = 2000, p = 10, and it still runs in 1 second.
I hope this also helps other people, as I've come across this question on multiple sites, without a satisfactory answer.
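function C = kcombsn(N,k,p)
% Draw k distinct random combinations of size p from 1..N (one per row).
% Note: the loop never terminates if k exceeds nchoosek(N,p).
C = randperm(N,p);
Csort = sort(C,2);   % sorted copy, so the order within a row is ignored
while size(C,1) < k
    row = randperm(N,p);
    row_sort = sort(row);
    % accept the row only if this combination is not already present
    if isempty(intersect(row_sort,Csort,'rows'))
        C = [C; row];
        Csort = [Csort; row_sort];
    end
end
end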
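For anyone who wants to try the idea without saving a function file, here is a minimal script sketch of the same rejection approach. It is not the File Exchange code: the preallocated arrays, the `count` variable and the use of `ismember` instead of `intersect` are my own illustrative choices, and it assumes k does not exceed nchoosek(N,p).

```matlab
% Sketch only: same rejection idea as kcombsn, with preallocated storage.
N = 10; k = 4; p = 3;          % example sizes from the question
C = zeros(k, p);               % accepted combinations, in drawn order
Csort = zeros(k, p);           % sorted copies used for the duplicate check
count = 0;                     % number of combinations accepted so far
while count < k
    row = randperm(N, p);      % p distinct integers from 1..N
    key = sort(row);           % order-independent representation
    if count == 0 || ~ismember(key, Csort(1:count, :), 'rows')
        count = count + 1;
        C(count, :) = row;
        Csort(count, :) = key;
    end
end
disp(C)
```

Preallocating avoids growing C inside the loop, which may matter for larger k; for small k the difference is probably negligible.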
Edit:
I also posted the code on the MATLAB File Exchange.
I am familiar with Matlab but am still having trouble with vectorized methods in my intuition, so I was wondering if anyone could demonstrate how they would manage this problem.
I have an array, for example A = [1 1 2 2 1 3 3 3 4 3 4 4 5].
I want to return an array B such that each element is the index of A's most 'recent' element with a different value than the previous ones.
So for our array A, B would equal [x x 2 2 4 5 5 5 8 9 10 10 12], where the x's can be any consistent value you like, because there is no previous index satisfying those characteristics.
I know how I would code it as a for-loop, and I bet the for-loop is probably faster, but can anyone vectorize this so that it runs faster than the for-loop?
Here's my for-loop:
prev=0;
B=zeros(length(A),1);
for i=2:length(A)
if A(i-1)~=A(i)
prev=i-1;
end
B(i)=prev;
end
Find the indices of the entries where the value changes:
ind = find(diff(A) ~= 0);
The values that should appear in B are therefore:
val = [0 ind];
Construct the diff of B: fill in the difference between the values that should appear at the right places:
Bd = zeros(size(A));
Bd(ind + 1) = diff(val);
Now use cumsum to construct B:
B = cumsum(Bd)
Not sure whether this results in a speed-up though.
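To check that the cumsum construction reproduces the loop, the two can be run side by side on the example array. This is only a sketch: the names B_loop and B_vec are mine, and the loop result is stored as a row vector here so that it can be compared directly.

```matlab
A = [1 1 2 2 1 3 3 3 4 3 4 4 5];

% Loop version from the question (stored as a row for comparison).
prev = 0;
B_loop = zeros(size(A));
for i = 2:length(A)
    if A(i-1) ~= A(i)
        prev = i-1;
    end
    B_loop(i) = prev;
end

% Vectorized version: place the jumps at the change points, then accumulate.
ind = find(diff(A) ~= 0);        % positions i where A(i) ~= A(i+1)
Bd = zeros(size(A));
Bd(ind + 1) = diff([0 ind]);     % step sizes between consecutive change points
B_vec = cumsum(Bd);

disp(isequal(B_loop, B_vec))
```

Whether the vectorized form is actually faster would have to be measured, for example with tic/toc on a large A.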
I am supposed to determine the probability of 4 of a kind in a 5 card poker draw using Matlab.
I understand the first thing I have to do is generate a deck and shuffle the cards, then draw 5 cards.
I am having trouble with determining whether the hand is 4 of a kind or not.
I have written the code below, which works for shuffling the deck and drawing 5 cards.
I have tried to use an if statement to determine if the hand is a 4 of a kind or not, but it does not work. My reasoning behind the if statement was that if I already had a sorted vector, the only two possibilities would be the first 4 or the last 4 numbers should all equal each other
Ex. AAAA_
_2222
Any advice on how to determine 4 of a kind would be very helpful :)
DECK = ['AH';'2H';'3H';'4H';'5H';'6H';'7H';'8H';'9H';'TH';'JH';'QH';'KH'; ...
'AS';'2S';'3S';'4S';'5S';'6S';'7S';'8S';'9S';'TS';'JS';'QS';'KS'; ...
'AD';'2D';'3D';'4D';'5D';'6D';'7D';'8D';'9D';'TD';'JD';'QD';'KD'; ...
'AC';'2C';'3C';'4C';'5C';'6C';'7C';'8C';'9C';'TC';'JC';'QC';'KC'];
%deck of 52 cards
total_runs=10000;
n=0;
for i=1:total_runs
index=randperm(52);
shuffle=DECK(index);
%shuffles the 52 columns
b=shuffle(1:5);
%chooses the first 5 cards
d=sort(b);
if d(1)==d(2)==d(3)==d(4)||d(2)==d(3)==d(4)==d(5)
%attempt to determine 4 of a kind
disp(d);
n=n+1;
end
end
prob=n/total_runs
You can't chain comparisons like that. You wrote:
d(1)==d(2)==d(3)==d(4)
But d(1) == d(2) evaluates to a logical, either true or false. That won't be equal to d(3).
Since they're sorted, you can just test
d(1)==d(4) || d(2)==d(5)
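A small sketch of why the chained form fails, using a made-up hand of rank characters (the hand '7K777' and the variable names are just for illustration):

```matlab
d = sort('7K777');                   % sorted ranks: '7777K'
chained = d(1)==d(2)==d(3)==d(4);    % evaluated left to right
fixed = d(1)==d(4) || d(2)==d(5);    % valid because d is sorted
disp([chained fixed])
```

Here d(1)==d(2) is true, which is then compared with the character code of d(3), so the chained expression ends up false even though the hand is a four of a kind.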
I was racking my brain over this for the last 30 minutes, and I began to wonder: why do we need to specify the suit? He can simply use a 1x52 vector of [1 through 13 ... 1 through 13] and draw with randperm(52,5). Or as follows:
DECK = [1 2 3 4 5 6 7 8 9 10 11 12 13 1 2 3 4 5 6 7 8 9 10 11 12 13 ...
1 2 3 4 5 6 7 8 9 10 11 12 13 1 2 3 4 5 6 7 8 9 10 11 12 13];
draw = randperm(52,5);
for k = 1:5;
hand(k) = DECK(draw(k));
end
Then you can compare each of the first two entries of hand against the whole hand:
for i=1:2
if sum(hand(i)==hand) == 4
n = n+1;
end
end
I think this way is short enough, though it would be better to compare column or row values. This takes about 1 second to run N=100,000 iterations on a 5th-gen i5. When I set it to 10 million iterations, I get about 0.04% success, which is considerably higher than the theoretical 0.02401%.
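For reference, the theoretical figure can be computed directly by counting hands: 13 choices for the rank of the four, times 48 choices for the remaining card, out of all 5-card hands. A short sketch (variable names are mine):

```matlab
n_four  = 13 * 48;              % four-of-a-kind hands
n_hands = nchoosek(52, 5);      % all 5-card hands
p_exact = n_four / n_hands;
fprintf('%.5f%%\n', 100 * p_exact);
```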
My first attempt came out like this:
hand = randperm(52,5);
for k=1:5
match = 0;
for i=1:3
if sum(hand(k)+13*i == hand) > 0
match = match+1;
end
end
if match == 3
four = four +1;
end
end
prob = four/N;
I like this one because I don't need to waste space with a large vector; however, it takes more processing power because of the 15 loops/more comparisons. I'm getting about 0.024% success over N=100,000 iterations for this one, which is almost on-the-dot with theory. The idea with the inner-most loop is that one of the cards in a four-of-a-kind will be equal to another card when you add 13*a to it, where a = 1,2,3. This method took me almost an hour to write since I was getting a little deep with the loops.
Please let me know of any concerns with the code, it's greatly appreciated.
edit: Haha, I just realized that I am double-counting hands with my first script. Let's do it like this:
for i=1:2
if sum(hand(i)==hand) == 4
n = n+1;
end
end
should be:
if sum(hand(1)==hand) == 4
n = n+1;
elseif sum(hand(2)==hand) == 4
n = n+1;
end
something like that.
Thanks for posting an interesting question.
I find mixing strings and integers a bit awkward to work with in MATLAB.
However this problem is solvable if we consider only integers from 1 to 52.
% 1 through 52
% ['AH';'2H';'3H';'4H';'5H';'6H';'7H';'8H';'9H';'TH';'JH';'QH';'KH'; ...
% 'AS';'2S';'3S';'4S';'5S';'6S';'7S';'8S';'9S';'TS';'JS';'QS';'KS'; ...
% 'AD';'2D';'3D';'4D';'5D';'6D';'7D';'8D';'9D';'TD';'JD';'QD';'KD'; ...
% 'AC';'2C';'3C';'4C';'5C';'6C';'7C';'8C';'9C';'TC';'JC';'QC';'KC'];
%deck of 52 cards . . from wikipedia
total_runs=2598960;
n=0;
for i=1:total_runs
index=randperm(52,5);
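value = mod(index-1, 13);
%rank of each card, 0 through 12 (13 ranks per suit)
if sum(value == mode(value)) == 4
%the most frequent rank occurs exactly four times
n=n+1;
end
end
prob=n/total_runs
EDIT:
My earlier test, length(unique(value)) == 2, is not enough: a full house also has exactly two distinct ranks, so that test counts full houses too and gives between 0.1% and 0.2% instead of the theoretical 0.024%.
The modulus has to be 13, not 14, because each suit has 13 ranks and cards 1, 14, 27 and 40 must all map to the same rank.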
How can I construct a scrambled matrix with 128 rows and 32 columns in vb.net or Matlab?
Entries of the matrix are numbers between 1 and 32 with the condition that each row mustn't contain duplicate elements and rows mustn't be duplicates.
This is similar to @thewaywewalk's answer, but makes sure that the matrix has no repeated rows by testing for them and, if there are any, generating a new matrix:
done = 0;
while ~done
[~, matrix] = sort(rand(128,32),2);
%// generate each row as a random permutation, independently of other rows.
%// This line was inspired by randperm code
done = size(unique(matrix,'rows'),1) == 128;
%// in the event that there are repeated rows: generate matrix again
end
If my computations are correct, the probability that the matrix has repeated rows (and thus has to be generated again) is less than
>> 128*127/factorial(32)
ans =
6.1779e-032
Hey, it's more likely that a cosmic ray will spoil a given run of the program! So I guess you can safely remove the while loop :-)
With randperm you can generate one row:
row = randperm(32)
If this vector weren't so long, you could just use perms to find all permutations:
B = perms(randperm(32))
but it's memory-wise too much! ( 32! = 2.6313e+35 rows )
so you can use a little loop:
N = 200;
A = zeros(N,32);
for ii = 1:N
A(ii,:) = randperm(32);
end
B = unique(A, 'rows');
B = B(1:128,:);
In my tests it was sufficient to use N = 128 directly and skip the last two lines, because with 2.6313e+35 possible permutations the probability of getting a valid matrix on the first try is very high. To be sure that there are no duplicate rows, choose a higher N and keep the first 128 unique rows. If the input vector is relatively short and the number of desired rows is close to the total number of possible permutations, use the proposed perms(randperm( n )).
A small example for integers from 1 to 4 and a selection of 10 out of 24 possible permutations:
N = 20;
A = zeros(N,4);
for ii = 1:N
A(ii,:) = randperm(4);
end
B = unique(A, 'rows');
B = B(1:10,:);
returns:
B =
1 2 3 4
1 2 4 3
1 3 4 2
2 3 1 4
2 3 4 1
2 4 1 3
2 4 3 1
3 1 2 4
3 1 4 2
3 2 1 4
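One thing to be aware of: unique(A,'rows') returns the rows in sorted order, so taking the first 10 keeps the lexicographically smallest ones, as the output above shows. If that bias matters, a possible variant (a sketch, assuming a version of unique that supports the 'stable' option) keeps the rows in the order they were drawn. The min(...) guard is my addition for the case where fewer than Q unique rows were generated.

```matlab
n = 4; Q = 10; N = 2*Q;              % same sizes as the small example
A = zeros(N, n);
for ii = 1:N
    A(ii,:) = randperm(n);
end
B = unique(A, 'rows', 'stable');     % drop duplicates, keep drawn order
B = B(1:min(Q, size(B,1)), :);       % at most Q rows
disp(B)
```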
some additional remarks for the choice of N:
I made some test runs, where I used the loop above to find all permutations like perms does. For vector lengths of n=4 to n=7 and in each case N = factorial(n): 60-80% of the rows are unique.
So for small n I would recommend choosing N as follows to be on the safe side:
N = min( [Q factorial(n)] )*2;
where Q is the number of permutations you want. For bigger n you either run out of memory while searching for all permutations, or the desired subset is so small compared to the number of all possible permutations that repetition is very unlikely! (Cosmic Ray theory linked by Luis Mendo)
Your requirements are very loose and allow many different possibilities. The most efficient solution I can think of that meets these requirements is as follows:
p = perms(1:6);
[p(1:128,:) repmat(7:32,128,1)]
I am new to matlab and just wondering if you guys can help me out with this problem.
For instance, I have two matrices:
A = [X1 X2 X3 X4]
B = [Y1; Y2; Y3]
now what I really want to achieve is to multiply these two matrices in this way:
[X1Y1 X2Y1 X3Y1 X4Y1;
X1Y2 X2Y2 X3Y2 X4Y2;
X1Y3 X2Y3 X3Y3 X4Y3;
.... and so on]
I tried using A(1,:).*B(:,1) but matlab is saying that matrix dimensions must agree.
I just don't know how to do this in MATLAB, although it is possible in Excel.
This is a simple outer product. kron is not needed (although it will work.) bsxfun is wild overkill, although will yield what you have asked for. repmat is inappropriate, because while it will help you do what you wish, it replicates the arrays in memory, using more resources than are needed. (Avoid using inefficient programming styles when there are good ones immediately at your disposal.)
All you need use is the simple * operator.
A is a row vector. B a column vector.
C = B*A
will yield the result C(i,j)=B(i)*A(j), which is exactly what you are looking for. Note that this works because B is 3x1 and A is 1x4, so the "inner" dimensions of B and A do conform.
In MATLAB, IF you are unsure if something works, TRY IT!
A = [1 2 3 4];
B = [1;2;3];
C = B*A
C =
1 2 3 4
2 4 6 8
3 6 9 12
See that kron did indeed work, although I'd bet that use of kron here is probably less efficient than is the simple outer product multiply.
C = kron(B,A)
C =
1 2 3 4
2 4 6 8
3 6 9 12
As well, bsxfun will work here too, although since we are using a general tool to do something that a basic operator will do, I'd bet it is slightly less efficient.
C = bsxfun(@times,B,A)
C =
1 2 3 4
2 4 6 8
3 6 9 12
The WORST choice is repmat. Again, since it artificially replicates the vectors in memory FIRST, it must go out and grab big chunks of memory in the case of large vectors.
C = repmat(B,1,4).*repmat(A,3,1)
C =
1 2 3 4
2 4 6 8
3 6 9 12
I suppose for completeness, you could also have used meshgrid or ndgrid. See that it is doing exactly what repmat did, but here it explicitly creates new matrices. Again, this is a poor programming style when there are good tools to do exactly what you wish.
[BB,AA] = ndgrid(B,A)
BB =
1 1 1 1
2 2 2 2
3 3 3 3
AA =
1 2 3 4
1 2 3 4
1 2 3 4
C = BB.*AA
C =
1 2 3 4
2 4 6 8
3 6 9 12
What you need to understand is exactly why each of these tools COULD have been used for the job, and why they are different.
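As a quick sanity check, the approaches above can be compared in one script. This is only a sketch with variable names of my own choosing; it confirms that the results agree and says nothing about speed or memory use.

```matlab
A = [1 2 3 4];
B = [1; 2; 3];

C_mtimes = B * A;                                          % outer product
C_kron   = kron(B, A);
C_bsxfun = bsxfun(@times, B, A);
C_repmat = repmat(B, 1, numel(A)) .* repmat(A, numel(B), 1);
[BB, AA] = ndgrid(B, A);
C_ndgrid = BB .* AA;

same = isequal(C_mtimes, C_kron, C_bsxfun, C_repmat, C_ndgrid);
disp(same)
```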
In Matlab there is * and .* and they are very different.
* is normal matrix multiplication which is what you want i.e. B*A, note the B must come first as the inner dimension must match. You can multiply a column by a row but not a row by a column (unless they have the same number of elements).
.* is element by element multiplication in which case the matrices must be exactly the same size and shape so for example [1 2 3].*[4 5 6] = [1*4 2*5 3*6] = [4 10 18]
Do not do a ".*". You should rather do a "*".
The ".*" is for index-by-index multiplication and would have given you [X1Y1 X2Y2 X3Y3] had the vectors been equal in size.
If you do the regular multiplication "*", this is actually matrix multiplication.
I think you just need matrix multiplication with the column vector first. A(1,:) is a row vector and B(:,1) is a column vector, so element-wise .* fails on their sizes in MATLAB versions without implicit expansion. This should work:
C = B(:,1)*A(1,:);
The expression is:
for i=1:n
X(:,i) = [P{i}(:)];
end
where X is a DxN matrix and P is a cell-array.
reshape(cat(3,P{:}),[numel(P{1}) n])
Of course, the above solution is just for fun. I would recommend profiling both solutions and only using this one if it has a significant performance advantage.
Maintenance and readability are also very important factors to consider when writing code.
If you obtained the cell array via mat2cell, you may be wanting to arrange blocks of an image into the columns of an array X. This can be achieved in a single step using the command IM2COL
%# rearrange the large array so that each column of X
%# corresponds to the 4 pixels of each 2-by-2 block
X = im2col(largeArray,[2 2],'distinct');
You might be able to get away with:
P{1} = [ 1 2; 3 4];
P{2} = [ 7 8; 9 10];
P{3} = [ 11 12; 13 14];
X = [P{:}]
X =
1 2 7 8 11 12
3 4 9 10 13 14
Then some sort of reshape() to get to where you want to be.
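To make that last step concrete, here is a sketch comparing the original loop with the cat/reshape one-liner and with a reshape of the concatenated matrix. It assumes all cells have the same size and that n is numel(P); the names X_loop, X_cat and X_hcat are mine.

```matlab
P = {[1 2; 3 4], [7 8; 9 10], [11 12; 13 14]};
n = numel(P);                          % assumption: one column per cell
D = numel(P{1});                       % elements per cell

% Loop from the question.
X_loop = zeros(D, n);
for i = 1:n
    X_loop(:, i) = P{i}(:);
end

% One-liner: stack along the third dimension, then reshape.
X_cat = reshape(cat(3, P{:}), [D n]);

% Horizontal concatenation followed by reshape.
X_hcat = reshape([P{:}], D, []);

disp(isequal(X_loop, X_cat, X_hcat))
```

All three rely on MATLAB's column-major storage, which is why each cell ends up as one column of X.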