How can we efficiently generate k random and non-consecutive samples out of [1,...,N]?
Non-desired Example with (N=10, k=4):
2,3,8,10
This is not a desired example since 2 and 3 are consecutive.
Desired Example with (N=10, k=4):
2,6,8,10
This is a good example since the difference between every pair of samples is greater than 1
sort(randperm(N-(k-1),k))+[0:(k-1)]
There is a simple observation behind this solution: if you take any sorted solution to your problem and subtract [0:(k-1)], you end up with a random choice of k numbers out of N-(k-1).
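A minimal Python sketch of the same observation, for readers without MATLAB. The function name `sample_nonconsecutive` and the feasibility check are assumptions added for illustration, not part of the original answer: draw k distinct values from 1..N-(k-1), sort them, then add the offsets 0..k-1.

```python
import random

def sample_nonconsecutive(N, k):
    """Return k sorted values from 1..N with no two consecutive."""
    if k > (N + 1) // 2:
        raise ValueError("k is too large for N")
    # Choose k distinct values from the compressed range 1..N-(k-1).
    base = sorted(random.sample(range(1, N - k + 2), k))
    # Adding the offsets 0..k-1 forces a gap of at least 2 between neighbors.
    return [b + i for i, b in enumerate(base)]

print(sample_nonconsecutive(10, 4))
```

Because the mapping between sorted k-subsets of 1..N-(k-1) and valid answers is one-to-one, every valid answer should be equally likely.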
Let S denote the set of all k-element vectors with values taken from [1,...,N] that don't have any consecutive values. To randomly sample with a uniform distribution over S, you can use a rejection method:
Sample uniformly on a larger sample space, T.
If the sample belongs to the target region S, accept the sample. Else return to step 1 (the sample is rejected).
In Matlab, it's easy to generate uniformly distributed k-element vectors with values taken from [1,...,N] without replacement (function randsample). So this is used as the sample space T:
k = 4;
N = 10;
result = [1 1]; % // just to get while loop started
while any(diff(result)<=1) % // does not meet condition: try again
result = sort(randsample(N, k).'); %'// random sample without replacement
end
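For comparison, the same rejection method can be sketched in Python. The function name `rejection_sample` and the up-front feasibility check are assumptions for illustration; `random.sample` plays the role of randsample here.

```python
import random

def rejection_sample(N, k):
    """Uniform sample over sorted k-subsets of 1..N with no consecutive values."""
    if k > (N + 1) // 2:
        raise ValueError("no valid sample exists for this N and k")
    while True:
        # Step 1: sample uniformly from T (all k-subsets of 1..N).
        candidate = sorted(random.sample(range(1, N + 1), k))
        # Step 2: accept only if the candidate lies in the target region S.
        if all(b - a > 1 for a, b in zip(candidate, candidate[1:])):
            return candidate

print(rejection_sample(10, 4))
```

The expected number of retries grows as k approaches the largest feasible value, so this is best suited to small k relative to N.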
A Python class which correctly checks every pair of samples. You're responsible for not passing it a set of numbers that is impossible, though (like N = 10, k = 100).
>>> import random
>>> class NonConsecutiveSampler(object):
...     def __init__(self, N):
...         self.num = N
...     def get_samples(self, k):
...         possibilities = list(range(1, self.num + 1))
...         samples = []
...         while len(samples) < k:
...             r = random.sample(possibilities, 1)[0]
...             samples.append(r)
...             # Remove r and both of its neighbors from the pool.
...             for i in range(r - 1, r + 2):
...                 if i in possibilities:
...                     possibilities.remove(i)
...         samples.sort()
...         return samples
...
>>> n = NonConsecutiveSampler(10)
>>> n.get_samples(4)
[2, 5, 8, 10]
>>> n.get_samples(4)
[1, 5, 7, 10]
>>> n.get_samples(4)
[3, 6, 8, 10]
>>> n.get_samples(4)
[1, 3, 5, 8]
EDIT: Made it much more efficient
Sometimes it's faster and easier to generate more samples than you need and then throw away the undesirable values.
One (slow) example.
vec = randi(100,1,1);
for j = 2:50
    vec(j) = randi(100,1,1);         % draw a candidate
    while abs(vec(j)-vec(j-1)) < 2   % too close to the previous value: redraw
        vec(j) = randi(100,1,1);
    end
end
Another way. Suppose you want 50 samples
vec = randi(100,100,1);
badindex = find(abs(vec(1:99)-vec(2:100)) < 2);
vec(badindex) = vec(badindex-1)+vec(badindex+1);
% if you don't want big values,
vec(vec>100) = vec (vec>100) -100; % to ensure, I hope, that neighbors
% are nonconsecutive
(This would be easier in R) .
You can make the increment between samples evenly distributed between 2 and N-1 (to avoid consecutive and repeated numbers):
N=10;
k=4;
increments = floor(rand(1,k)*(N-2))+2 %// increments allowed are from 2 to N-1 inclusive
out = mod(cumsum(increments), N)+1 %// sum increments
Same in python:
from numpy import cumsum, floor, mod, random
N=5
k=100
increments = floor(random.rand(1,k)*(N-2))+2
out = mod(cumsum(increments), N)+1
print(out)
[ 5. 3. 1. 5. 2. 4. 3. 2. 4. 2. 4. 3. 1. 5. 4. 3. 5. 4.
2. 5. 4. 2. 5. 2. 4. 1. 5. 4. 1. 5. 3. 1. 3. 2. 4. 1.
5. 4. 1. 3. 5. 4. 3. 5. 2. 1. 3. 2. 4. 3. 1. 4. 2. 1.
3. 2. 1. 4. 3. 2. 1. 3. 5. 3. 5. 4. 2. 4. 2. 1. 3. 2.
1. 3. 5. 2. 5. 4. 3. 1. 4. 1. 4. 3. 5. 4. 2. 1. 5. 2.
1. 5. 4. 2. 4. 3. 5. 2. 4. 1.]
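Over 100 iterations, even with the range limited to 1..5, no number is immediately repeated. Note, however, that an increment of N-1 is the same as -1 modulo N, so descending consecutive pairs such as 5, 4 still appear in the output above; restricting the increments to 2..N-2 avoids them.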
A solution in MATLAB (perhaps inelegant) could be something like this:
N = 10;
k = 4;
out = zeros(1,k);
vec = 1 : N;
for idx = 1 : k
ind = randi(numel(vec), 1);
left = max(ind-1, 1); right = min(numel(vec), ind+1);
out(idx) = vec(ind);
to_remove = ind;
if vec(left) == vec(ind)-1
to_remove = [to_remove left];
end
if vec(right) == vec(ind)+1
to_remove = [to_remove right];
end
vec(to_remove) = [];
end
We first declare N and k, then declare an output array of zeros that is k long. We then generate a sampling vector vec that initially goes from 1 up to N. Next, for each value we want to place into the output, we generate a random position to sample from the vector, then look at the positions to its left and right, clamped so that we stay within the boundaries of the array. We only remove the left or right neighbor if its value is exactly one less or one more than the sampled value (thanks beaker!); otherwise that neighbor is not consecutive with the sample and stays available.
We use this location and sample from this vector, place the value at this location to the output, then remove the indices in this vector that are to the left, to the right, and the actual index itself from this vector. This removes the possibility of sampling from those values again. We repeat this until we run out of values to place in the output.
Here are a few trial runs:
>> out
out =
9 7 1 5
>> out
out =
7 1 4 10
>> out
out =
10 8 1 6
>> out
out =
10 4 8 1
A not-particularly-elegant python solution:
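import random

def nonseq(n, k):
    out = [random.randint(1, n)]  # randint is inclusive, so this samples 1..n
    while len(out) < k:
        x = random.randint(1, n)
        # Only the previous value is checked, so the output is an unsorted
        # sequence and non-adjacent entries may still repeat.
        if abs(x - out[-1]) > 1:
            out.append(x)
    return out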
This is an elegant recursive version. I just added a check on k and N to avoid infinite recursion: if k > (N+1)/2, no solution exists.
The result is guaranteed random.
import random

def myFunc(N, k):
    if k > (N + 1) / 2:
        return "k too big for N"
    returnValue = sorted(random.sample(range(1, N + 1), k))
    # Gaps between neighboring sorted values; a gap of 1 means consecutive.
    toTest = [x - returnValue[i - 1] for i, x in enumerate(returnValue)][1:]
    if 1 in toTest:
        return myFunc(N, k)
    else:
        return returnValue

print(myFunc(10, 4))
My implementation:
def ncsample(population, k):
    import random
    if k > 0:
        i = random.randrange(0, len(population) - 2*(k-1))
        return [population[i]] + ncsample(population[i+2:], k-1)
    else:
        return []
Note: it randomly finds the sequence in one shot (no rejection sampling in a while loop).
MATLAB implementation:
function r = ncsample(population, k)
if k > 0
i = randi(length(population) - 2*(k-1));
r = [population(i) ncsample(population((i+2):end), k-1)];
else
r = [];
end
end
Some tests:
>> for i=1:10; fprintf('%s\n',sprintf('%d ', ncsample(1:10, 4))); end
1 5 7 9
3 5 8 10
3 5 8 10
4 6 8 10
2 6 8 10
1 4 8 10
1 4 7 9
3 6 8 10
1 6 8 10
2 4 7 9
Related
I have a matrix like this:
fd =
x y z
2 5 10
2 6 10
3 5 11
3 9 11
4 3 11
4 9 12
5 4 12
5 7 13
6 1 13
6 5 13
I have two parts of my problem:
1) I want to calculate the difference of each two elements in a column.
So I tried the following code:
for i= 1:10
n=10-i;
for j=1:n
sdiff1 = diff([fd(i,1); fd(i+j,1)],1,1);
sdiff2 = diff([fd(i,2); fd(i+j,2)],1,1);
sdiff3 = diff([fd(i,3); fd(i+j,3)],1,1);
end
end
I want all the differences such as:
x1-x2, x1-x3, x1-x4....x1-x10
x2-x3, x2-x4.....x2-x10
.
.
.
.
.
x9-x10
same for y and z value differences
Then all the values should stored in sdiff1, sdiff2 and sdiff3
2) what I want next is for same z values, I want to keep the original data points. For different z values, I want to merge those points which are close to each other. By close I mean,
if abs(sdiff3)== 0
keep the original data
for abs(sdiff3) > 1
if abs(sdiff1) < 2 & abs(sdiff2) < 2
then I need mean x, mean y and mean z of the points.
So I tried the whole programme as:
for i= 1:10
n=10-i;
for j=1:n
sdiff1 = diff([fd(i,1); fd(i+j,1)],1,1);
sdiff2 = diff([fd(i,2); fd(i+j,2)],1,1);
sdiff3 = diff([fd(i,3); fd(i+j,3)],1,1);
if (abs(sdiff3(:,1)))> 1
continue
mask1 = (abs(sdiff1(:,1)) < 2) & (abs(sdiff2(:,1)) < 2) & (abs(sdiff3(:,1)) > 1);
subs1 = cumsum(~mask1);
xmean1 = accumarray(subs1,fd(:,1),[],@mean);
ymean1 = accumarray(subs1,fd(:,2),[],@mean);
zmean1 = accumarray(subs1,fd(:,3),[],@mean);
fd = [xmean1(subs1) ymean1(subs1) zmean1(subs1)];
end
end
end
My final output should be:
2.5 5 10.5
3.5 9 11.5
5 4 12
5 7 13
6 1 13
where the points (1,2,3), (4,6) and (5,7,10) are merged to their mean position (according to the threshold difference < 2), whereas the 8th and 9th points keep their original data.
I am stuck in finding the differences for each two elements of a column and storing them. My code is not giving me the desired output.
Can somebody please help?
Thanks in advance.
This can be greatly simplified using vectorised notation. You can do for instance
fd(:,1) - fd(:,2)
to get the difference between columns 1 and 2 (diff(fd(:,[1 2]), 1, 2) gives the same values with the opposite sign, since it subtracts column 1 from column 2). You can make this more elegant/harder to read and debug with pdist, but if you only have three columns it's probably more trouble than it's worth.
I suspect your first problem is with the third argument to diff. If you use diff(X, 1, 1) it will do the first order diff in direction 1, which is to say between adjacent rows (downwards). diff(X, 1, 2) will do it between adjacent columns (rightwards), which is what you want. Matlab uses the opposite convention to spreadsheets in that it indexes rows first then columns.
Once you have your diffs you can then test the elements:
thesame = find(sdiff3 < 2); % for example
this will yield a vector of the row indices of sdiff3 where the value is less than 2. Then you can use
fd(thesame,:)
to select the elements of fd at those indexes. To remove matching rows you would do the opposite test
notthesame = find(sdiff3 >= 2);
to find the ones to keep, then extract those into a new array
keepers = fd(notthesame,:);
These won't give you the exact solution but it'll get you on the right track. For the syntax of these commands and lots of examples you can run e.g. doc diff in the command window.
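The question asks for the differences between every pair of elements within a column (x1-x2, x1-x3, ..., x9-x10). A minimal Python sketch of that pairwise computation follows, using the fd data from the question. The helper name `pairwise_differences` and the dictionary layout keyed by 1-based row pairs are assumptions for illustration.

```python
from itertools import combinations

fd = [
    (2, 5, 10), (2, 6, 10), (3, 5, 11), (3, 9, 11), (4, 3, 11),
    (4, 9, 12), (5, 4, 12), (5, 7, 13), (6, 1, 13), (6, 5, 13),
]

def pairwise_differences(rows, col):
    """Map each 1-based row pair (i, j), i < j, to rows[i][col] - rows[j][col]."""
    return {
        (i + 1, j + 1): rows[i][col] - rows[j][col]
        for i, j in combinations(range(len(rows)), 2)
    }

sdiff1 = pairwise_differences(fd, 0)  # x differences
sdiff2 = pairwise_differences(fd, 1)  # y differences
sdiff3 = pairwise_differences(fd, 2)  # z differences

print(len(sdiff1))     # 45 pairs for 10 rows
print(sdiff1[(1, 3)])  # x1 - x3 = 2 - 3 = -1
```

Storing every pair, rather than overwriting sdiff1 on each loop iteration as in the question's code, is what makes the later thresholding step possible.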
What I'm trying to accomplish is the following:
I wish to create a vector of integers, from a relatively small range, and ensure that none of the integers will be followed by the same integer.
i.e., This is a "legal" vector:
[ 1 3 4 2 5 3 2 3 5 4 ]
and this is an "illegal" vector (since 5 follows 5):
[ 1 3 4 2 5 5 2 3 5 4 ]
I've experimented with randi, and all sorts of variations with randperm, and I always get stuck when i try to generate a vector of around 100 elements, from a small range (i.e., integers between 1 and 5).
The function just runs for too long.
Here's one of the attempts that i've made:
function result = nonRepeatingRand(top, count)
result = randi(top, 1, count);
while any(diff(result) == 0)
result = randi(top, 1, count);
end
end
Any and all help will be much appreciated. Thanks !
The kind of sequence you are looking for can be defined by generating differences from 1 to top - 1 and then computing the cumulative sum modulo top, starting from a random initial value:
function result = nonRepeatingRand(top, count)
diff = randi(top - 1, 1, count);
result = rem(cumsum(diff) + randi(1, 1, count) - 1, top) + 1;
end
On my machine, this generates a non-repeating sequence of 10 million numbers out of 1:5 in 0.58 seconds.
You can use the following code to generate non-repeating random numbers from 1 to M:
randperm(M);
and for K non-repeating random numbers from 1 to M:
randperm(M, K);
enjoy
Do not regenerate the sequence every time, but fix the repetitions. E.g.:
function result = nonRepeatingRand(top, count)
result = randi(top, 1, count);
ind = (diff(result) == 0);
while any(ind)
result(ind) = [];
result(end + 1 : count) = randi(top, 1, count - numel(result));
ind = (diff(result) == 0);
end
end
On my machine, this generates a non-repeating sequence of 10 million numbers out of 1:5 in 1.6 seconds.
Taking the idea from A. Donda but fixing the implementation:
r=[randi(top,1,1),randi(top - 1, 1, count-1)];
d=rem(cumsum(r)-1,top)+1;
The first element of r is a randomly chosen element to start with. The following elements of r randomly choose the difference to the previous element, using modulo arithmetic.
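The cumulative-sum idea can be sketched in Python as follows. The function name mirrors the MATLAB one, and the input checks are assumptions added for illustration. The first step picks the starting value and each later step is an offset between 1 and top-1, so consecutive outputs can never be equal.

```python
import random
from itertools import accumulate

def non_repeating_rand(top, count):
    """Return count integers from 1..top with no two adjacent values equal."""
    if top < 2:
        raise ValueError("top must be at least 2")
    if count < 1:
        return []
    # First step picks the start (1..top); later steps are offsets 1..top-1.
    steps = [random.randint(1, top)] + [
        random.randint(1, top - 1) for _ in range(count - 1)
    ]
    # No offset is 0 modulo top, so the running sum never repeats a value.
    return [(s - 1) % top + 1 for s in accumulate(steps)]

print(non_repeating_rand(5, 20))
```

This runs in a single pass with no retries, which is why it scales to long sequences over a small range.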
How about this?
top = 5;
count = 100;
n1 = nan;
out = [];
for t = 1: count
n2 = randi(top);
while n1 == n2
n2 = randi(top);
end
out = [out, n2];
n1 = n2;
end
I would like to generate a random number between 1 and 10 using for example randi([1,10]) but I would like to exclude a single number, say 7 - this number would always change and be specified in a variable called b.
Is that possible to do somehow?
Use randsample. For instance, to generate a number between 1 and 10, excluding 7, do the following:
b = 7;
x = randsample(setdiff(1:10, b), 1);
Here setdiff is used to exclude the value of b from the vector 1:10.
If you don't have the Statistics Toolbox installed, you won't be able to use randsample, so use rand:
v = setdiff(1:10, b);
x = v(ceil(numel(v) * rand));
For those without the statistics toolbox:
b = 7;
pop = 1:10;
pop(b) = [];
then
pop(randperm(9,1))
or for n random integers from the population:
pop(randi(numel(pop), 1, n))
As @EitanT mentioned, you can use randsample to do so, but I think a simpler approach should do for you:
>> b = 7;
>> randsample([1:b-1,b+1:10],1)
This simply samples a random value from the array [1:b-1,b+1:10] which would here be
1 2 3 4 5 6 8 9 10
Or similarly, if the randsample function is unavailable, as @EitanT mentioned:
v = [1:b-1,b+1:10];
x = v(ceil(numel(v) * rand));
I have a matrix and I want to compare its rows to the rows of another matrix and check whether any rows match.
For example:
A = [ 1 2 3;...
4 5 6;...
7 8 9 ];
B = [ 54 23 13;...
54 32 12;...
1.1 2.2 2.9];
I need to detect that row 1 of matrix A matches row 3 of matrix B. The rows are not exactly equal because I want to allow a margin of ±10 percent.
Thank you very much.
This code is untested, but should do it:
valid = all(abs(A(1,:) - B(3,:)) ./ A(1,:) < 0.1)
An explanation:
A(1,:) takes the first row of A, and B(3,:) takes the third row of B.
abs(...) takes the absolute value.
abs(...) ./ A(1,:) gives the percentage change
< 0.1 ensures that each element is less than 10%.
all(...) aggregates the values from the last step and tests that they're all true.
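The steps above can be sketched in Python and extended to search all row pairs. The names `rows_match` and `find_matches` and the small `slack` term are assumptions for illustration; the slack absorbs floating-point rounding, since 1.1 - 1 is slightly more than 0.1 in binary floating point.

```python
def rows_match(a_row, b_row, margin=0.1, slack=1e-12):
    """True if every element of b_row is within margin (relative) of a_row."""
    return all(
        abs(a - b) <= margin * abs(a) + slack
        for a, b in zip(a_row, b_row)
    )

def find_matches(A, B, margin=0.1):
    """Return 1-based (row of A, row of B) pairs that match within margin."""
    return [
        (i, j)
        for i, a_row in enumerate(A, start=1)
        for j, b_row in enumerate(B, start=1)
        if rows_match(a_row, b_row, margin)
    ]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[54, 23, 13], [54, 32, 12], [1.1, 2.2, 2.9]]
print(find_matches(A, B))  # [(1, 3)]
```

Multiplying the margin by the reference value instead of dividing the difference by it also avoids a division by zero when a row of A contains 0.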
In general, if you don't know which row of A may match a row of B, you can use a for loop, which is an extension of Fabian's answer:
for i = 1:size(A,1)
match(:,i) = sum(abs(ones(size(A,1),1)*A(i,:) - B) ./ (ones(size(A,1),1)*A(i,:)) <= 0.100001, 2) == size(A,2)*ones(size(A,1),1);
end
match(i,j) == 1 if ith row of B matches with jth row of A
I asked this question in other forums and got the answer that works best for me:
margin = 0.1;
A = [1 2 3; 4 5 6; 7 8 9];
B = [7 8 10; 4 5 12; 1.1 2.2 2.9; 1.101 2 3; 6.3 7.2 9.9];
k = 0;
for i = 1:size(A,1)
for j = 1:size(B,1)
if all(abs((A(i,:)-B(j,:))./A(i,:)) <= margin+eps)
k = k+1;
match(:,k) = [i;j];
end
end
end
fprintf('A row %d matches B row %d.\n',match)
I would like to thank you all for your answers. I would give you all accepted answers, but I think this is the best code for me.
I need to generate some 5x6 matrices in MATLAB. They need to consist of randomly generated integers in the range 1-6, however, an integer cannot occur more than once in a particular row or column.
Here is the script I am currently using to generate random 5x6 matrices:
mat=zeros(5,6);
rows=5;
columns=6;
low=1;
high=6;
for i=1:rows
for j=1:columns
mat(i,j)=round(rand*(high-low)+low);
end
end
disp(mat)
But I don't know how to insert the rule about repeats into this.
I'm sure this is a relatively simple problem but I'm very new to MATLAB and haven't been able to generate something that satisfies these conditions. I'd be grateful for any assistance anyone can give.
Try this:
m = zeros(5,6);
for row = 1:5
flag = 1;
while(flag)
flag = 0;
R = randperm(6);
for testRow = 1:row
flag = flag + sum(m(testRow,:) == R);
end;
if (~flag)
m(row,:) = R;
end;
end;
end;
m
Don't try to fill the matrix with completely random ints all at once. The likelihood of that being a valid puzzle grid is vanishingly low.
Instead, use the same method as used by Sudoku generators - start with a blank matrix and fill in elements one at a time, as restricted by your rules.
Where you have more than one choice for the entry, pick one of them at random.
You might progress something like this (4x4 example for brevity - allowable numbers 1-4)
x x x x
x x x x
x x x x
x x x x
Pick first number by dice roll: 3.
3 x x x
x x x x
x x x x
x x x x
Pick second number from list of allowable numbers: [1, 2, 4].
3 1 x x
x x x x
x x x x
x x x x
Pick third number from list of allowable numbers, [1, 4]:
3 1 4 x
x x x x
x x x x
x x x x
And so on.
If your "list of allowable numbers" at some insertion step is an empty set, then your matrix can't be salvaged and you may need to start again.
Also a 10x10 matrix with 5 unique integers is clearly impossible - insert some logic to test for this trivial error case.
Edit: Since it's not homework in the traditional sense, and since it was an interesting problem....
function arena = generate_arena(num_rows, num_cols, num_symbols)
% Generate an "arena" by repeatedly calling generate_arena_try
% until it succeeds.
arena = 0;
number_of_tries = 0;
while ~(arena)
arena = generate_arena_try(num_rows, num_cols, num_symbols);
number_of_tries = number_of_tries + 1;
end
sprintf('Generating this matrix took %i tries.', number_of_tries)
end
function arena = generate_arena_try(num_rows, num_cols, num_symbols)
% Attempts to generate a num_rows by num_cols matrix of random integers
% from the range 1:num_symbols, with no symbols repeated in each row or
% column.
%
% returns 0 on failure, or the random matrix if it succeeds.
arena = zeros(num_rows, num_cols);
symbols = 1:num_symbols;
for n = 1:num_rows
for m = 1:num_cols
current_row = arena(n,:);
current_col = arena(:,m);
% find elements in $symbols that are not in the current row or col
choices = setdiff ( symbols, [current_row current_col'] );
if isempty(choices)
arena = 0;
return;
end
% Pick one of the valid choices at random.
arena(n,m) = choices(randi(length(choices)));
end
end
return;
end
Invocation and output are like:
>> generate_arena(5,6,6)
ans =
Generating this matrix took 5 tries.
ans =
2 3 6 4 5 1
6 1 5 3 4 2
1 5 4 2 6 3
4 6 2 1 3 5
3 4 1 5 2 6
Don't say I never gave you nothing. ;)
Here's another way of doing it:
Start off with a known valid solution, say this one:
>> A = mod(meshgrid(1:size) - meshgrid(1:size)', size) + 1
A =
1 2 3 4 5 6
6 1 2 3 4 5
5 6 1 2 3 4
4 5 6 1 2 3
3 4 5 6 1 2
2 3 4 5 6 1
Then swap rows and columns at random. You can prove that each swap preserves the "no-repeats" property in each row and column.
Say you swap row 1 and row 2. You haven't changed the contents of the rows, so the "no repeats in each row" property remains true. Similarly, you haven't changed the contents of any of the columns - just the ordering - so the "no repeats in each column" property also remains true.
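The invariance argument can be checked with a short Python sketch. The helper names `cyclic_square`, `shuffle_rows_and_columns` and `has_no_repeats` are assumptions for illustration. It builds the same cyclic starting grid, permutes rows and columns at random, keeps five rows to get a 5x6 grid, and verifies that no value repeats in any row or column.

```python
import random

def cyclic_square(n):
    """n-by-n grid where entry (r, c) is ((c - r) mod n) + 1."""
    return [[(c - r) % n + 1 for c in range(n)] for r in range(n)]

def shuffle_rows_and_columns(grid):
    """Return a copy of grid with its rows and columns randomly permuted."""
    n_cols = len(grid[0])
    row_order = random.sample(range(len(grid)), len(grid))
    col_order = random.sample(range(n_cols), n_cols)
    return [[grid[r][c] for c in col_order] for r in row_order]

def has_no_repeats(grid):
    """True if no value repeats within any row or any column."""
    rows_ok = all(len(set(row)) == len(row) for row in grid)
    cols_ok = all(len(set(col)) == len(col) for col in zip(*grid))
    return rows_ok and cols_ok

# Dropping a row cannot introduce a repeat, so a 5x6 grid stays valid.
arena = shuffle_rows_and_columns(cyclic_square(6))[:5]
print(has_no_repeats(arena))  # True
```

Applying one random row permutation and one random column permutation is equivalent to any sequence of individual swaps, so repeated shuffling adds nothing further.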
Here is what I came up with:
function arena = gen_arena_2 (size)
arena = mod(meshgrid(1:size) - meshgrid(1:size)', size) + 1;
%repeatedly shuffle rows and columns
for i = 1:10
arena = arena(:,randperm(size))'; %shuffle columns and transpose
end
end
Example usage:
>> gen_arena_2(6)
ans =
3 5 4 2 1 6
6 2 1 5 4 3
5 1 6 4 3 2
4 6 5 3 2 1
1 3 2 6 5 4
2 4 3 1 6 5
I'm not sure this is "as random" as the other way, but it is fast and it doesn't need any logic to detect a failure, because it will (provably) always produce a correct result.