Efficiently generating unique pairs of integers - matlab

In MATLAB, I would like to generate n pairs of random integers in the range [1, m], where each pair is unique. For uniqueness, I consider the order of the numbers in the pair to be irrelevant such that [3, 10] is equal to [10, 3].
Also, each pair should consist of two distinct integers; i.e. [3, 4] is ok but [3, 3] would be rejected.
EDIT: Each possible pair should be chosen with equal likelihood.
(Obviously a constraint on the parameters is that n <= m(m-1)/2.)
I have been able to successfully do this when m is small, like so:
m = 500; n = 10; % setting parameters
A = ((1:m)'*ones(1, m)); % each column has the numbers 1 -> m
idxs1 = squareform(tril(A', -1))';
idxs2 = squareform(tril(A, -1))';
all_pairs = [idxs1, idxs2]; % this contains all possible pairs
idx_to_use = randperm( size(all_pairs, 1), n ); % choosing random n pairs
pairs = all_pairs(idx_to_use, :)
pairs =
254 414
247 334
111 146
207 297
45 390
229 411
9 16
75 395
12 338
25 442
However, the matrix A is of size m x m, meaning when m becomes large (e.g. upwards of 10,000), MATLAB runs out of memory.
I considered generating a load of random numbers randi(m, [n, 2]), and repeatedly rejecting the rows which repeated, but I was concerned about getting stuck in a loop when n was close to m(m-1)/2.
Is there an easier, cleaner way of generating unique pairs of distinct integers?

Easy, peasy, when viewed in the proper way.
You wish to generate n pairs of integers, [p,q], such that p and q lie in the interval [1,m], and p
How many possible pairs are there? The total number of pairs is just m*(m-1)/2. (I.e., the sum of the numbers from 1 to m-1.)
So we could generate n random integers in the range [1,m*(m-1)/2]. Randperm does this nicely. (Older matlab releases do not allow the second argument to randperm.)
k = randperm(m/2*(m-1),n);
(Note that I've written this expression with m in a funny way, dividing by 2 in perhaps a strange place. This avoids precision problems for some values of m near the upper limits.)
Now, if we associate each possible pair [p,q] with one of the integers in k, we can work backwards, from the integers generated in k, to a pair [p,q]. Thus the first few pairs in that list are:
{[1,2], [1,3], [2,3], [1,4], [2,4], [3,4], ..., [m-1,m]}
We can think of them as the elements in a strictly upper triangular array of size m by m, thus those elements above the main diagonal.
q = floor(sqrt(8*(k-1) + 1)/2 + 1/2);
p = k - q.*(q-1)/2;
See that these formulas recover p and q from the unrolled elements in k. We can convince ourselves that this does indeed work, but perhaps a simple way here is just this test:
k = 1:21;
q = floor(sqrt(8*(k-1) + 1)/2 + 3/2);
p = k - (q-1).*(q-2)/2;
[k;p;q]'
ans =
1 1 2
2 1 3
3 2 3
4 1 4
5 2 4
6 3 4
7 1 5
8 2 5
9 3 5
10 4 5
11 1 6
12 2 6
13 3 6
14 4 6
15 5 6
16 1 7
17 2 7
18 3 7
19 4 7
20 5 7
21 6 7
Another way of testing it is to show that all pairs get generated for a small case.
m = 5;
n = 10;
k = randperm(m/2*(m-1),n);
q = floor(sqrt(8*(k-1) + 1)/2 + 3/2);
p = k - (q-1).*(q-2)/2;
sortrows([p;q]',[2 1])
ans =
1 2
1 3
2 3
1 4
2 4
3 4
1 5
2 5
3 5
4 5
Yup, it looks like everything works perfectly. Now try it for some large numbers for m and n to test the time used.
tic
m = 1e6;
n = 100000;
k = randperm(m/2*(m-1),n);
q = floor(sqrt(8*(k-1) + 1)/2 + 3/2);
p = k - (q-1).*(q-2)/2;
toc
Elapsed time is 0.014689 seconds.
This scheme will work for m as large as roughly 1e8, before it fails due to precision errors in double precision. The exact limit should be m no larger than 134217728 before m/2*(m-1) exceeds 2^53. A nice feature is that no rejection for repeat pairs need be done.

This is more of a general approach rather then a matlab solution.
How about you do the following first you fill a vector like the following.
x[n] = rand()
x[n + 1] = x[n] + rand() %% where rand can be equal to 0.
Then you do the following again
x[n][y] = x[n][y] + rand() + 1
And if
x[n] == x[n+1]
You would make sure that the same pair is not already selected.
After you are done you can run a permutation algorithm on the matrix if you want them to be randomly spaced.
This approach will give you all the possibility or 2 integer pairs, and it runs in O(n) where n is the height of the matrix.

The following code does what you need:
n = 10000;
m = 500;
my_list = unique(sort(round(rand(n,2)*m),2),'rows');
my_list = my_list(find((my_list(:,1)==my_list(:,2))==0),:);
%temp = my_list; %In case you want to check what you initially generated.
while(size(my_list,1)~=n)
%my_list = unique([my_list;sort(round(rand(1,2)*m),2)],'rows');
%Changed as per #jucestain's suggestion.
my_list = unique([my_list;sort(round(rand((n-size(my_list,1)),2)*m),2)],'rows');
my_list = my_list(find((my_list(:,1)==my_list(:,2))==0),:);
end

Related

Repeat row vector as matrix with different offsets in each row

I want to repeat a row vector to create a matrix, in which every row is a slightly modified version of the original vector.
For example, if I have a vector v = [10 20 30 40 50], I want every row of my matrix to be that vector, but with a random number added to every element of the vector to add some fluctuations.
My matrix should look like this:
M = [10+a 20+a 30+a 40+a 50+a;
10+b 20+b 30+b 40+b 50+b;
... ]
Where a, b, ... are random numbers between 0 and 2, for an arbitrary number of matrix rows.
Any ideas?
In Matlab, you can add a column vector to a matrix. This will add the vector elements to each of the row values accordingly.
Example:
>> M = [1 2 3; 4 5 6; 7 8 9];
>> v = [1; 2; 3];
>> v + M
ans =
2 3 4
6 7 8
10 11 12
Note that in your case v is a row vector, so you should transpose it first (using v.').
As Sardar Usama and Wolfie note, this method of adding is only possible since MATLAB version R2016b, for earlier versions you will need to use bsxfun:
>> % instead of `v + M`
>> bsxfun(#plus, v, M)
ans =
2 4 6
5 7 9
8 10 12
If you have a MATLAB version earlier than 2016b (when implicit expansion was introduced, as demonstrated in Daan's answer) then you should use bsxfun.
v = [10 20 30 40 50]; % initial row vector
offsets = rand(3,1); % random values, add one per row (this should be a column vector)
output = bsxfun(#plus,offsets,v);
Result:
>> output =
10.643 20.643 30.643 40.643 50.643
10.704 20.704 30.704 40.704 50.704
10.393 20.393 30.393 40.393 50.393
This can be more easily understood with less random inputs!
v = [10 20 30 40 50];
offsets = [1; 2; 3];
output = bsxfun(#plus,offsets,v);
>> output =
11 21 31 41 51
12 22 32 42 52
13 23 33 43 53
Side note: to get an nx1 vector of random numbers between 0 and 2, use
offsets = rand(n,1)*2

Weighted Random number?

How can I randomize and generate numbers from 0-50 in matrix of 5x5 with SUM or each row printed on the right side?
+
is there any way to give weight to individual numbers before generating the numbers?
Please help
Thanks!
To generate a random matrix of integers between 0 and 50 (sampled with replacement) you could use
M = randint(5,5,[0,50])
To print the matrix with the sum of each row execute the following command
[M sum(M,2)]
To use a different distribution there are a number of techniques but one of the easiest is to use the datasample function from the Statistics and Machine Learning toolbox.
% sample from a truncated Normal distribution. No need to normalize
x = 0:50;
weights = exp(-0.5*(x-25).^2 / 5^2);
M = reshape(datasample(x,25,'Weights',weights),[5,5])
Edit:
Based on your comment you want to perform random sampling without replacement. You can perform such a random sampling without replacement if the weights are non-negative integers by simulating the classic ball-urn experiment.
First create an array containing the appropriate number of each value.
Example: If we have the values 0,1,2,3,4 with the following weights
w(0) = 2
w(1) = 3
w(2) = 5
w(3) = 4
w(4) = 1
Then we would first create the urn array
>> urn = [0 0 1 1 1 2 2 2 2 2 3 3 3 3 4];
then, we would shuffle the urn using randperm
>> urn_shuffled = urn(randperm(numel(urn)))
urn_shuffled =
2 0 4 3 0 3 2 2 3 3 1 2 1 2 1
To pick 5 elements without replacement we would simple select the first 5 elements of urn_shuffled.
Rather than typing out the entire urn array, we can construct it programatically given an array of weights for each value. For example
weight = [2 3 5 4 1];
urn = []
v = 0
for w = weight
urn = [urn repmat(v,1,w)];
v = v + 1;
end
In your case, the urn will contain many elements. Once you shuffle you would select the first 25 elements and reshape them into a matrix.
>> M = reshape(urn_shuffled(1:25),5,5)
To draw random integer uniformly distributed numbers, you can use the randi function:
>> randi(50,[5,5])
ans =
34 48 13 28 13
33 18 26 7 41
9 30 35 8 13
6 12 45 13 47
25 38 48 43 18
Printing the sum of each row can be done by using the sum function with 2 as the dimension argument:
>> sum(ans,2)
ans =
136
125
95
123
172
For weighting the various random numbers, see this question.

Multiplying matrix Matlab

I have a matrix M[1,98] and a matrix N[1,x], let's assume in this case x =16.
What I want is to multiply N by M , make the sum by element, and increment the matrix M. With the finality of getting an output of [1,98].
It's a bit confusing. An example:
M=[2 3 4 5 6 7]
N=[1 2 3]
it1=(2*1)+(3*2)+(4*3)+(5*0)+...=20
it2=(3*1)+(4*2)+(5*3)+(6*0)+...=26
it3=..
Output=[20 26 ... ... ... ...]
Like that until the end but considering the size of the matrix N variable. M has always the same size.
That's a convolution:
result = conv(M, N(end:-1:1), 'valid');
To achieve the result you want you need to flip the second vector and keep only the "valid" part of the convolution (no border effects).
In your example:
>> M = [2 3 4 5 6 7];
>> N = [1 2 3];
>> result = conv(M, N(end:-1:1), 'valid')
result =
20 26 32 38

creating diagonal matrix with selected elements

I have a 4x5 matrix called A from which I want to select randomly 3 rows, then 4 random columns and then select those elements which coincide in those selected rows and columns so that I have 12 selected elements.Then I want to create a diagonal matrix called B which will have entries either 1 or 0 so that multiplication of that B matrix with reshaped A matrix (20x1) will give me those selected 12 elements of A.
How can I create that B matrix? Here is my code:
A=1:20;
A=reshape(A,4,5);
Mr=4;
Ma=3;
Na=4;
Nr=5;
M=Ma*Mr;
[S1,S2]=size(A);
N=S1*S2;
y2=zeros(size(A));
k1=randperm(S1);
k1=k1(1:Ma);
k2=randperm(S2);
k2=k2(1:Mr);
y2(k1,k2)=A(k1,k2);
It's a little hard to understand what you want and your code isn't much help but I think I've a solution for you.
I create a matrix (vector) of zeros of the same size as A and then use bsxfun to determine the indexes in this vector (which will be the diagonal of B) that should be 1.
>> A = reshape(1:20, 4, 5);
>> R = [1 2 3]; % Random rows
>> C = [2 3 4 5]; % Random columns
>> B = zeros(size(A));
>> B(bsxfun(#plus, C, size(A, 1)*(R-1).')) = 1;
>> B = diag(B(:));
>> V = B*A(:);
>> V = V(V ~= 0)
V =
2
3
4
5
6
7
8
9
10
11
12
13
Note: There is no need for B = diag(B(:)); we could have simply used element by element multiplication in Matlab.
>> V = B(:).*A(:);
>> V = V(V ~= 0)
Note: This may be overly complex or very poorly put together and there is probably a better way of doing it. It's my first real attempt at using bsxfun on my own.
Here is a hack but since you are creating y2 you might as well just use it instead of creating the useless B matrix. The bsxfun answer is much better.
A=1:20;
A=reshape(A,4,5);
Mr=4;
Ma=3;
Na=4;
Nr=5;
M=Ma*Mr;
[S1,S2]=size(A);
N=S1*S2;
y2=zeros(size(A));
k1=randperm(S1);
k1=k1(1:Ma);
k2=randperm(S2);
k2=k2(1:Mr);
y2(k1,k2)=A(k1,k2);
idx = reshape(y2 ~= 0, numel(y2), []);
B = diag(idx);
% "diagonal" matrix 12x20
B = B(all(B==0,2),:) = [];
output = B * A(:)
output =
1
3
4
9
11
12
13
15
16
17
19
20
y2 from example.
y2 =
1 0 9 13 17
0 0 0 0 0
3 0 11 15 19
4 0 12 16 20

Matlab: How to locate a data in Matrix B based on the info in Matrix A? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Sort a matrix with another matrix
The Matrix A ('10 x 1000' all numbers) look like this:
score(1.1) score(1.2) score(1.3)....score(1.1000)
score(2.1) score(2.2) score(2.3)....score(1.1000)
...
The Matrix B ('1 x 1000' all numbers):
Return(1) Return(2) Return(3) .....Return(1.1000)
Every time I sort a row of Matrix A, I want to sort Matrix B based on the order of the sorted row in Matrix A. Because there are 10 rows in Matrix A, Matrix B will be sorted 10 times and generate a new Matrix C ('10 x 1000') like this: (I am looking for a script to generate this Matrix C)
Return(3) Return(25) Return(600) .......Return(1000)
Return(36)Return(123) Return(2)........Return(212)
....
....
This should do what you want:
A = randn(10,1000);
B = randn(1,1000);
C = zeros(size(A));
for i = 1:10
[a idx] = sort(A(1,:));
A(i,:) = a;
C(i,:) = B(idx);
end
Now the rows of A are sorted, and the rows of C contain the corresponding sorted B.
This solution is a bit more compact, and it's also good to get used to doing this kind of solution for efficiency when your matrices get big. You can solve your problem with two ideas:
In [a, ix] = sort(X), a is a column-sorted version of X, and ix stores which rows moved where in each column. Thus if we do [a, ix] = sort(X.').'; (where the dot-apostrophe is the transpose) we can sort the rows.
B(ix) where ix is a bunch of indeces will make a matrix the same size as ix with the i-jth element being B at ix(i,j)
Then you just need to reshape it. So you can do:
A = rand(4,8);
B = rand(1,8);
n = size(A,1);
m = size(A,2);
[~,ix] = sort(A.');
C = reshape(B(ix'),n,m);
If I understand your question correctly the following should work. Using some sample scores:
>> score = [1 4 7 9; 3 5 1 7; 9 3 1 6]
score =
1 4 7 9
3 5 1 7
9 3 1 6
and sample return vector:
>> r = [10 20 30 40]
r =
10 20 30 40
Transpose the scores and sort since the SORT command works on columns of a matrix. We're only interested in the indices of the sorted values:
>> [~, ix] = sort(score')
ix =
1 3 3
2 1 2
3 2 4
4 4 1
Now transpose these indices and use them to reference the return values:
>> answer = r(ix)'
answer =
10 20 30 40
30 10 20 40
30 20 40 10