filling a matrix with random integers from a range according to a rule - matlab

I'm using the matrix as an initial population for multiobjective optimization using NSGA-II in MATLAB. My chromosome vector, C, has size 1x192, and each gene must be an integer in the range 0<=gene<=40. The rule is that the sum of each group of 6 genes must be less than or equal to 40, that is:
sum(reshape(C,6,[]))<=40
I've used the following code, but it outputs either an all-zero population matrix (the population matrix is the vertical concatenation of 500 chromosomes) or a matrix that does not satisfy the rule:
X=zeros(500,192);
while i<501
r=randi(40,6,32);
if nnz(((sum(r))./40)>1)==0
X(i,:)=reshape(r,1,[]);
i=i+1;
clear r;
else
clear r;
end
end
It is also taking forever to exit the while loop.
What am I doing wrong here? Is there another way of doing the above?
I've also tried this:
i=1;
while i<17500
r=randi([1,40],6,1);
s=sum(r);
if s<=40
X(:,i)=r;
i=i+1;
else
clear r;
end
end
X=unique(X','rows')';
A=X(:,randperm(size(X,2)));
A=X(randperm(size(X,1)),:);
The above tries to create random columns that will be reshaped into the population matrix. But the numbers are repeating; i.e., in the 17500 columns (16448 after removing duplicate columns) the numbers 37 and 40 never occur. Is there any way I can optimize the spread of the generated random numbers?
#0x90
I have a vector, called 'chromosome', of size 1x192, and each successive group of 6 members (called a phenotype) must sum to 40 or less. To make it clearer:
That is, each P must be an integer in the range 0 to 40 inclusive and the sum at each phenotype must be <=40. I need 500 chromosomes like this.
I hope it makes sense now. ><

You should use randi([min,max],n,m). randint is going to be deprecated.
>> r = randi([1,4],3,2)
r =
3 3
2 2
4 4
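One reason the first loop in the question takes so long is that it only accepts a whole 6-by-32 draw when all 32 columns pass at once, which is extremely unlikely. A possible alternative, sketched below, avoids rejection entirely by using the "stars and bars" construction: choosing 6 distinct slots out of 46 and taking the gaps between them gives 6 non-negative integers whose sum is at most 40. The variable names (nChrom, nGroups, groupSz, maxSum) are my own and not from the question.

```matlab
nChrom  = 500;   % chromosomes in the population
nGroups = 32;    % groups of 6 genes per chromosome (32*6 = 192)
groupSz = 6;     % genes per group
maxSum  = 40;    % upper bound on each group sum

X = zeros(nChrom, nGroups*groupSz);
for c = 1:nChrom
    genes = zeros(groupSz, nGroups);
    for g = 1:nGroups
        % stars and bars: pick 6 distinct "bar" slots out of 46
        bars = sort(randperm(maxSum + groupSz, groupSz));
        % the gaps before each bar are the gene values;
        % the gap after the last bar is unused slack
        genes(:, g) = diff([0, bars]).' - 1;
    end
    X(c, :) = reshape(genes, 1, []);
end

% sanity check: sums of consecutive groups of 6 genes in every chromosome
groupSums = sum(reshape(X.', groupSz, []), 1);
```

Every valid group of 6 is equally likely under this scheme, so large values such as 37 or 40 still appear only rarely: a gene can be that large only when the other five genes in its group are nearly all zero.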

Related

Produce 6 different numbers using only "randi" and some loops

I want to use only the "randi" function to produce 6 different random numbers in MATLAB, where each number is in the range 1 to 12.
number=randi([1,12],1,6)
c=0;
for n=1:6%when "n" is 1 to 6
while c <= 6 %while c is less equal than 6,do the program below
c = c + 1; %c=c+1
if number(n) == number(c) %when the nth element is equal to cth element
number(n) = randi(12); %produce a random integer in the nth element
c = 0; %the reason why i set c=0 again is because i want to check again whether the new random integer is the same as cth element or not
end
end
end
final_number=number
But the result still contains duplicates, like
1 "2" 6 11 "2" 3
5 "8" "8" 12 3 1
How do I improve my code to produce 6 different numbers? I don't want to rely too much on MATLAB's convenience functions, so I have also tagged this question with C. I hope someone can help me improve this.
If you're trying to reproduce randsample (or randperm), why not just reproduce the algorithm MATLAB uses? (As far as we can tell...)
This is the Fisher-Yates shuffle. If you have a vector v, each iteration selects a random, previously unused element and puts it at the end of the unselected elements. If you do k iterations, the last k elements of the list are your random sample. If k equals the number of elements in v, you've shuffled the entire array.
function sample = fisher_yates_sample(v, k)
% Select k random elements without replacement from vector v
% if k == numel(v), this is simply a fisher-yates shuffle
for n = 0:k-1
randnum = randi(numel(v)-n); % choose from unused values
% swap elements v(end-n) and v(randnum)
v([end-n, randnum]) = v([randnum, end-n]);
end
sample = v(end-k+1:end);
end
Unlike MATLAB's version, mine requires a vector as input, so to get 6 random values in the range 1:12 you'd call the function like this:
>> fisher_yates_sample(1:12,6)
ans =
5 11 6 10 8 4
Since you're re-selecting single random numbers whenever one occurs multiple times, why not just re-select all the numbers at once?
% Initial selecting of random numbers.
number = randi([1, 12], 1, 6)
% While the amount of unique elements in numbers is less than 6:
while (numel(unique(number)) < 6)
% Re-select random numbers.
number = randi([1, 12], 1, 6)
end
And since you wrote that you specifically want to use randi, I guess there is a reason you don't want to use randperm(12, 6)?
What you are looking for is randperm. It produces a random permutation of a range of integers, so that if you select the first k numbers, you are sure that you get k unique integers in the range [1;n].
In your case, simply call:
randperm(12,6)
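If the loop-based, randi-only approach from the question is what's wanted, one possible fix is to redraw each new number until it differs from everything already accepted. This is only a sketch; the variable name candidate is my own.

```matlab
number = zeros(1, 6);
for n = 1:6
    candidate = randi(12);
    % redraw while the candidate matches any value already accepted
    % (for n = 1 the comparison set is empty, so the first draw is kept)
    while any(number(1:n-1) == candidate)
        candidate = randi(12);
    end
    number(n) = candidate;
end
final_number = number
```

Unlike the original loop, each new value is compared against all previously accepted values, and an accepted value is never changed afterwards.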

Matlab: Array of random integers with no direct repetition

For my experiment I have 20 categories which contain 9 pictures each. I want to show these pictures in a pseudo-random sequence where the only constraint to randomness is that one image may not be followed directly by one of the same category.
So I need something similar to
r = randi([1 20],1,180);
just with an added constraint of two numbers not directly following each other. E.g.
14 8 15 15 7 16 6 4 1 8 is not legitimate, whereas
14 8 15 7 15 16 6 4 1 8 would be.
An alternative way I was thinking of was naming the categories A,B,C,...T, have them repeat 9 times and then shuffle the bunch. But there you run into the same problem I think?
I am an absolute Matlab beginner, so any guidance will be welcome.
The following uses modulo operations to make sure each value is different from the previous one:
m = 20; %// number of categories
n = 180; %// desired number of samples
x = [randi(m)-1 randi(m-1, [1 n-1])];
x = mod(cumsum(x), m) + 1;
How the code works
In the third line, the first entry of x is a random value between 0 and m-1. Each subsequent entry represents the change that, modulo m, will give the next value (this is done in the fourth line).
The key is to choose that change between 1 and m-1 (not between 0 and m-1), to assure consecutive values will be different. In other words, given a value, there are m-1 (not m) choices for the next value.
After the modulo operation, 1 is added to transform the range of resulting values from 0,...,m-1 to 1,...,m.
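To make the mechanism concrete, here is a tiny worked example with hand-picked (not random) steps, assuming m = 5. The variable name steps is my own.

```matlab
m = 5;
steps = [2 1 4 3];               % first entry in 0..m-1, the rest in 1..m-1
x = mod(cumsum(steps), m) + 1;   % cumsum = [2 3 7 10], mod 5 = [2 3 2 0]
disp(x)                          % x is [3 4 3 1]
```

Because every step after the first is between 1 and m-1, it can never be a multiple of m, so no two neighbouring entries of x are equal.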
Test
Take all (n-1) pairs of consecutive entries in the generated x vector and count occurrences of all (m^2) possible combinations of values:
count = accumarray([x(1:end-1); x(2:end)].', 1, [m m]);
imagesc(count)
axis square
colorbar
The following image has been obtained for m=20; n=1e6;. It is seen that all combinations are (more or less) equally likely, except for pairs with repeated values, which never occur.
You could look for the repetitions iteratively and put a new set of integers from the same interval [1 20] only into those places where repetitions have occurred. We continue to do so until there are no repetitions left -
interval = [1 20]; %// interval from which the random integers are chosen
r = randi(interval,1,180); %// create the first batch of numbers
idx = diff(r)==0; %// logical array, where 1s denote repetitions
while nnz(idx)~=0
rN = randi(interval,1,nnz(idx)); %// new random integers for the positions where repetitions have occurred
r(find(idx)+1) = rN; %// place the new random integers at those positions
idx = diff(r)==0; %// re-check for repetitions
end
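Neither approach above fixes how often each category appears. If every one of the 20 categories must occur exactly 9 times, as the question describes, a simple (if brute-force) option is to shuffle the full list of labels and reshuffle until no two neighbours match. This is a sketch under that assumption; it may need many reshuffles before it succeeds, but each one is cheap. The names nCat, nPerCat, labels and seq are my own.

```matlab
nCat    = 20;                          % number of categories
nPerCat = 9;                           % pictures per category
labels  = repmat(1:nCat, 1, nPerCat);  % each category exactly 9 times

seq = labels(randperm(numel(labels)));
while any(diff(seq) == 0)              % reject shuffles with adjacent repeats
    seq = labels(randperm(numel(labels)));
end
```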

Find median value of the largest clump of similar values in an array in the most computationally efficient manner

Sorry for the long title, but that about sums it up.
I am looking to find the median value of the largest clump of similar values in an array in the most computationally efficient manner.
for example:
H = [99,100,101,102,103,180,181,182,5,250,17]
I would be looking for the 101.
The array is not sorted, I just typed it in the above order for easier understanding.
The array is of a constant length and you can always assume there will be at least one clump of similar values.
What I have been doing so far is basically computing the standard deviation with one of the values removed and finding the value which corresponds to the largest reduction in STD and repeating that for the number of elements in the array, which is terribly inefficient.
for j = 1:7
G = double(H);
for i = 1:7
G(i) = NaN;
T(i) = nanstd(G);
end
best = find(T==min(T));
H(best) = NaN;
end
x = find(H==max(H));
Any thoughts?
This possibility bins your data and looks for the bin with most elements. If your distribution consists of well separated clusters this should work reasonably well.
H = [99,100,101,102,103,180,181,182,5,250,17];
nbins = length(H); % <-- set # of bins here
[v bins]=hist(H,nbins);
[vm im]=max(v); % find max in histogram
bl = bins(2)-bins(1); % bin size
bm = bins(im); % position of bin with max #
ifb =find(abs(H-bm)<bl/2) % elements within bin
median(H(ifb)) % median of the elements in that bin
Output:
ifb = 1 2 3 4 5
H(ifb) = 99 100 101 102 103
median = 101
The more challenging parameters to set are the number of bins and the size of the region to look around the most populated bin. In the example you provided neither of these is so critical, you could set the number of bins to 3 (instead of length(H)) and it still would work. Using length(H) as the number of bins is in fact a little extreme and probably not a good general choice. A better choice is somewhere between that number and the expected number of clusters.
It may help for certain distributions to change bl within the find expression to a value you judge better in advance.
I should also note that there are clustering methods (kmeans) that may work better, but perhaps less efficiently. For instance this is the output of [H' kmeans(H',4) ]:
99 2
100 2
101 2
102 2
103 2
180 3
181 3
182 3
5 4
250 3
17 1
In this case I decided in advance to attempt grouping into 4 clusters.
Using kmeans you can get an answer as follows:
nbin = 4;
km = kmeans(H',nbin);
[mv iv]=max(histc(km,[1:nbin]));
H(km==iv)
median(H(km==iv))
Notice however that kmeans does not necessarily return the same value every time it is run, so you might need to average over a few iterations.
I timed the two methods and found that kmeans takes ~10 X longer. However, it is more robust since the bin sizes adapt to your problem and do not need to be set beforehand (only the number of bins does).
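Another option, not covered above, is to sort the values and split them wherever the gap between neighbours exceeds a threshold; the largest resulting group is the clump. This is only a sketch: tol is an assumed parameter that you would have to choose for your own data.

```matlab
H = [99,100,101,102,103,180,181,182,5,250,17];
tol = 5;                                      % assumed maximum gap inside a clump
s = sort(H);
breaks = [0, find(diff(s) > tol), numel(s)];  % group boundaries in sorted order
[~, k] = max(diff(breaks));                   % index of the largest group
clump = s(breaks(k)+1 : breaks(k+1));         % values 99..103 for this H
m = median(clump)                             % 101 for this H
```

This needs one sort and a few vectorized passes, and it gives the same result on every run, but it is only as good as the choice of tol.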

MATLAB vector: prevent consecutive values from same range

Okay, this might seem like a weird question, but bear with me.
So I have a random vector in a .m file, with certain constraints built into it. Here is my code:
randvecall = randsample(done, done, true);
randvec = randvecall([1;diff(randvecall(:))]~=0);
"Done" is just the range of values we take the sample from, so don't worry about that. As you can see, this randsamples from a range of values, and then prunes this random vector with the diff function, so that consecutive duplicate values are removed. There is still the potential for duplicate values in the vector, but they simply cannot be consecutive.
This is all well and good, and works perfectly fine.
So, say, randvec looks like this:
randvec =
54
47
52
26
39
2
14
51
24
6
19
56
34
46
12
7
41
18
29
7
It is actually a lot longer, with something like 60-70 values, but you get the point.
What I want to do is add a little extra constraint on to this vector. When I sample from this vector, the values are classified according to their range. So values from 1-15 are category 1, 16-30 are category 2, and so on. The reasons for this are unimportant, but it is a pretty important part of the program. So if you look at the values I provided above, you see a section like this:
7
41
18
29
7
This is actually bad for my program. Because the value ranges are treated separately, 41, 18, and 29 are used differently than 7 is. So, for all intents and purposes, 7 is appearing consecutively in my script. What I want to do is somehow parse/modify/whatever the vector when it is generated so that the same number from a certain range cannot appear twice "in a row," regardless of how many other numbers from different ranges are between them. Does this make sense/did I describe this well? So, I want MATLAB to search the vector, and for all values within certain ranges (1-15,16-30,31-45,46-60) make sure that "consecutive" values from the same range are not identical.
So, then, that is what I want to do. This may not by any means be the best way to do this, so any advice/alternatives are, of course, appreciated. I know I can do this better with multiple vectors, but for various reasons I need this to be a single, long vector (the way my script is designed it just wouldn't work if I had a separate vector for each range of values).
What you may want to do is create four random vectors, one for each category, ensure that they do not contain any two consecutive equal values, and then build your final random vector by ordered picking of values from random categories, i.e.
%# make a 50-by-nCategories array of random numbers
categories = [1,16,31,46;15,30,45,60]; %# category min/max
nCategories = size(categories,2);
randomCategories = zeros(50,nCategories);
for c=1:nCategories
%# draw 100 numbers for good measure
tmp = randi(categories(:,c),[100 1]);
tmp(diff(tmp)==0) = []; %# remove consecutive duplicates
%# store
randomCategories(:,c) = tmp(1:50);
end
%# select from which bins to pick. Use half the numbers, so that we don't force the
%# numbers of entries per category to be exactly equal
bins = randi(nCategories,[100,1]);
%# combine the output, i.e. replace e.g. the numbers
%# '3' in 'bins' with the consecutive entries
%# from the third category
out = zeros(100,1);
for c = 1:nCategories
cIdx = find(bins==c);
out(cIdx) = randomCategories(1:length(cIdx),c);
end
First we assign each element the bin number of the range it lies in:
[~,bins] = histc(randvec, [1 16 31 46 61]);
Next we loop over each range and find the elements in that category. For example, for the first range of 1-15, we get:
>> ind = find(bins==1); %# bin#1 of 1-15
>> x = randvec(ind)
x =
2
14
6
12
7
7
Now you can apply the same process of detecting consecutive duplicates:
>> idx = ([1;diff(x)] == 0)
idx =
0
0
0
0
0
1
>> problematicIndices = ind(idx) %# indices into the vector: randvec
Do this for all ranges, and collect those problematic indices. Next decide how you want to deal with them (remove them, generate other numbers in their place, etc...)
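Put together, the loop over all ranges might look like the sketch below. It uses the sample randvec from the question and the same histc edges as above; the names edges, dup and problematicIndices are my own, and what to do with the flagged positions is left open.

```matlab
randvec = [54;47;52;26;39;2;14;51;24;6;19;56;34;46;12;7;41;18;29;7];
edges = [1 16 31 46 61];                 % category boundaries
[~, bins] = histc(randvec, edges);       % category number of each element

problematicIndices = [];
for b = 1:numel(edges)-1
    ind = find(bins == b);               % positions belonging to category b
    if isempty(ind), continue; end
    x = randvec(ind);
    dup = [false; diff(x) == 0];         % repeats of the previous value in this category
    problematicIndices = [problematicIndices; ind(dup)]; %#ok<AGROW>
end
problematicIndices                       % 20 for this sample (the second 7)
```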
If I understand your problem correctly, I think this is one solution. It uses unique, but applies it to each of the subranges of the vector. The values that are duplicated within a range of indices are identified so you can deal with them.
cat_inds = [1,16,31,46,60]; % need to include last element
for i=2:numel(cat_inds)
randvec_part = randvec( cat_inds(i-1):cat_inds(i) );
% Find the indices for the first unique elements in this part of the array
[~,uniqInds] = unique(randvec_part,'first');
% this binary vector identifies the indices that are duplicated in
% this part of randvec
%
% NB: they are indices into randvec_part
%
inds_of_duplicates = ~ismember(1:numel(randvec_part), uniqInds);
% code to deal with the problem indices goes here. Modify randvec_part accordingly...
% Write it back to the original vector (assumes that the length is the same)
randvec( cat_inds(i-1):cat_inds(i) ) = randvec_part;
end
Here's a different approach from what everyone else has been tossing up. The premise I'm working on here is that you want a random arrangement of values in a vector without repetition. I'm not sure what other constraints you are applying prior to the point where we are giving our input.
My thought is to use the randperm function.
Here's some sample code showing how it would work:
%randvec is your vector of random values
randvec2 = unique(randvec); % This will return the sorted list of values from randvec.
randomizedvector = randvec2(randperm(length(randvec2)));
% Note: if randvec is multidimensional you'll have to use numel instead of length
At this point randomizedvector should contain all the unique values from the initial randvec, but 'shuffled' or re-randomized after the unique function call. You could also just seed randvec differently to avoid needing the unique function call, as simply calling randperm(n) will return a randomized vector with values ranging from 1 to n.
Just an off the wall 2 cents there =P enjoy!

extracting data from an existing matrix

I have a matrix containing 4320 entries
for example:
P=[ 26 29 31 33 35 26 29 ..........25]
I want to create 180 matrices, each containing 24 entries, i.e.
the 1st matrix contains the 1st 24 entries,
the 2nd matrix contains the 2nd 24 entries, and so on.
I know a simple method, but it will take a long time:
P1=P(1:24);P2=P(25:48),..........P180=P(4297:4320)
and it is difficult since the original matrix P has a huge number of entries.
Thanks
I'm going to go ahead and assume this is MATLAB-related, in which case you'd use the reshape function:
Px = reshape(P, 24, []);
Px will now be a proper matrix, and you can access each of the 180 "matrices" (actually row vectors, you seem to be confusing the two) by simple MATLAB syntax:
P100 = Px(:,100);
You can loop through the items in the index, counting up, creating a new matrix every 24 entries. Modular arithmetic might help:
foreach (var currentIndexInLargerMatrix : int = 0 to 4319)
begin
matrixToPutItIn := currentIndexInLargerMatrix div 24;
indexInNewMatrix := currentIndexInLargerMatrix mod 24;
end
in many languages the modulus (remainder) operator is either "mod" or "%".
"div" here denotes integer division. Most languages just use the virgule (slash) "/".
This obviously isn't complete code, but should get you started.
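In MATLAB terms, the index arithmetic above could be sketched as follows. P is filled with placeholder data here, and since MATLAB indexing is 1-based, 1 is added to both computed indices. The names blockLen, row and col are my own.

```matlab
P = 1:4320;                           % placeholder data, assumed for illustration
blockLen = 24;                        % entries per sub-matrix
Px = zeros(blockLen, numel(P)/blockLen);
for k = 0:numel(P)-1                  % zero-based position, as in the pseudocode
    col = floor(k / blockLen) + 1;    % which block the entry goes into ("div")
    row = mod(k, blockLen) + 1;       % position inside that block ("mod")
    Px(row, col) = P(k + 1);
end
% the loop reproduces what reshape does in a single call
assert(isequal(Px, reshape(P, blockLen, [])));
```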
I think You's answer is the best way to approach your problem, where each submatrix is stored as a row or column in a larger matrix and is retrieved by simply indexing into that larger matrix.
However, if you really want/need to create 180 separate variables labeled P1 through P180, the way to do this has been discussed in other questions, like this one. In your case, you could use the function EVAL like so:
for iMatrix = 1:180 %# Loop 180 times
tempP = P((1:24)+24*(iMatrix-1)); %# Get your submatrix
eval(['P' int2str(iMatrix) ' = tempP;']); %# Create a new variable
end
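If separate containers are wanted but 180 named variables are not, a cell array may be a reasonable middle ground. A minimal sketch, assuming P is a 1-by-4320 row vector (placeholder data here); the name Pcell is my own:

```matlab
P = 1:4320;                                          % placeholder data
Pcell = mat2cell(P, 1, repmat(24, 1, numel(P)/24));  % 1-by-180 cell array
P100 = Pcell{100};                                   % same as P(2377:2400)
```

Each block is then reached by index as Pcell{k} instead of through a variable name built with eval.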