I have a set that contains 0 and 8 other elements, 9 elements in total. I want to choose 3 random values from this set and create a 3x1 column matrix, and repeat this for all possible choices in the set. How can I do this?
As @Picket said in a comment,
The way RandomSample works will ensure it will not output the same choice twice in a single call
If your list is small, you can generate all subsets and sample it.
Example
RandomSample[Subsets[{a, b, c, d, e, f}, {3}], 7]
will generate all (20) subsets with 3 (distinct) elements and then pick 7 different ones uniformly (there are options to weight each member differently, choose the random generator, etc.).
RandomSample[Flatten[Permutations /@ Subsets[{a, b, c, d, e, f}, {3}], 1], 13]
will generate all (120) possible ordered selections of 3 distinct elements among a set of 6 elements and give a sample of 13 distinct elements of this list.
If what you want is a random ordering of all possible subsets of size 3, or of all ordered selections of size 3 without duplicates, just ask the same way but with the exact number of such sets.
myset = { foo, foo2, foo3, foo5 };
RandomSample[Subsets[myset, {3}], Binomial[Length[myset],3 ]]
RandomSample[Flatten[Permutations /@ Subsets[myset, {3}], 1], 3!*Binomial[Length[myset], 3]]
(If you ask for more than the exact number of possibilities, RandomSample will complain.)
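For readers working outside Mathematica, the same enumerate-then-sample idea can be sketched in Python. This is only an illustration and assumes the set is small enough to enumerate; the names items, all_subsets and all_ordered are made up for the example.

```python
import itertools
import random

items = ["a", "b", "c", "d", "e", "f"]

# All 3-element subsets (order ignored): Binomial(6, 3) = 20 of them.
all_subsets = list(itertools.combinations(items, 3))

# All ordered selections of 3 distinct elements: 6 * 5 * 4 = 120 of them.
all_ordered = list(itertools.permutations(items, 3))

# random.sample draws without replacement, so no choice is repeated.
seven_subsets = random.sample(all_subsets, 7)
thirteen_ordered = random.sample(all_ordered, 13)

# Asking for every element gives a random ordering of all possibilities.
shuffled_subsets = random.sample(all_subsets, len(all_subsets))
```

As with RandomSample, random.sample raises an error if asked for more elements than the list contains.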
Now, if your initial set is large enough that generating the set of all subsets is impractical in time and memory, take advantage of representing a selection by a number, even if it is not perfect in terms of uniform distribution. Say that your initial set has 20 distinct elements. A three-digit number in base 20 can represent any selection of 3. To account for the need to filter out the few with one digit appearing more than once, compare the count of all three-digit numbers with the count of valid selections:
20^3/(3!*Binomial[20, 3]) // N
1.16959
You are probably safe generating 25% more numbers than you need and filtering out the ones with repetition:
Cases[IntegerDigits[RandomSample[0 ;; 20^3-1, Ceiling[31*(1 + 1/4)] ], 20, 3], _?(Length[Union[#]] == 3 &), 1, 31]
This generates a random sample of 39 distinct 3-digit numbers in base 20 and selects the first 31 with no repeated digit, in the form of a list of 3-coordinate vectors.
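The same oversample-and-filter idea can be sketched in Python. This is a rough illustration, not a translation of the Mathematica call: sample_ordered_triples is a hypothetical helper, and the 1.25 oversampling factor is simply the 25% margin suggested above.

```python
import math
import random

def sample_ordered_triples(n_elements, how_many, oversample=1.25):
    """Draw distinct ordered triples of distinct indices in range(n_elements)."""
    total = n_elements ** 3
    draw = min(total, math.ceil(how_many * oversample))
    # Distinct codes, each standing for one 3-digit number in base n_elements.
    codes = random.sample(range(total), draw)
    triples = []
    for code in codes:
        # Decode the number into its three base-n digits.
        digits = (
            code // n_elements ** 2,
            (code // n_elements) % n_elements,
            code % n_elements,
        )
        # Keep only the codes whose digits are all different.
        if len(set(digits)) == 3:
            triples.append(digits)
    # May hold fewer than how_many triples if too many codes were rejected.
    return triples[:how_many]

result = sample_ordered_triples(20, 31)
```

Because the filtering is random, the margin is not a guarantee: the result can occasionally contain fewer triples than requested, in which case one can draw again with a larger factor.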
Within a simple range, I am trying to constrain the number of successive assignments for a variable. The value should be between 6 and 12, or 0. For example, a hospital has 24 shifts and an employee should work between 6 and 12 hours, or not at all.
# Build shifts
shifts = {}
for n in all_nurses:
    for d in all_days:
        for s in all_shifts:
            shifts[(n, d, s)] = model.NewBoolVar('shift_n%id%is%i' % (n, d, s))
# Count successive occurrences
for e_count in all_nurses:
    s_count = 0
    while s_count < len(all_shifts):
        model.Add(sum(shifts[e_count, s_count] for s in range(e_count, e_count + 6)) == 6)  # min
        model.Add(sum(shifts[e_count, s_count] for s in range(e_count, e_count + 12)) <= 12)  # max
Unfortunately this doesn't work, since it increases the value by only one. What would be the best approach to check how many hours have been assigned and increase s_count by that value?
If you just want to constrain the sum, you should use this method
model.AddLinearExpressionInDomain(sum(bool_vars), cp_model.Domain.FromIntervals([[0, 0], [6, 12]]))
If you want to constrain the length of a sequence, you should look at the shift_scheduling example
In particular, the soft sequence constraint.
The idea is the following: for every starting point, you want to forbid 010, 0110, 01110, ..., 0111110 and 01111111111110 (0110 means work[start] is false, work[start + 1] is true, work[start + 2] is true, work[start + 3] is false).
To forbid a sequence, just add a nogood, that is, a clause (an AddBoolOr) containing the negation of the pattern.
In my example: bool_or(work[start], work[start + 1].Not(), work[start + 2].Not(), work[start + 3]).
Loop over all starting points and all patterns. And pay attention to the boundary conditions.
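The loop over starting points and patterns can be sketched without any solver. The block below is plain Python, not OR-tools code: forbidden_clauses and satisfies are hypothetical helpers, each nogood is returned as a list of (index, required_value) literals, and in a CP-SAT model each such list would become one AddBoolOr call. Two assumptions are mine rather than the answer's: a run cut off by either end of the horizon is allowed to be short, and too-long runs are excluded by forbidding max_len + 1 consecutive ones.

```python
def forbidden_clauses(horizon, min_len, max_len):
    """Return nogood clauses as lists of (index, required_value) literals."""
    clauses = []
    # Runs that are too short: pattern 0 1^k 0 for k = 1 .. min_len - 1.
    for k in range(1, min_len):
        # The pattern has k + 2 positions, so it must start early enough to fit.
        for start in range(horizon - k - 1):
            clause = [(start, True)]                               # work[start]
            clause += [(start + i, False) for i in range(1, k + 1)]  # .Not()
            clause.append((start + k + 1, True))                   # closing 0
            clauses.append(clause)
    # Runs that are too long: forbid max_len + 1 ones in a row.
    for start in range(horizon - max_len):
        clauses.append([(start + i, False) for i in range(max_len + 1)])
    return clauses

def satisfies(work, clauses):
    """A clause holds if at least one literal matches the assignment."""
    return all(any(work[i] == value for i, value in clause) for clause in clauses)

# The 0110 example from the text: with min_len = 3 it appears as a nogood.
example = forbidden_clauses(4, 3, 4)
```

With horizon 4 and min_len 3, the list contains [(0, True), (1, False), (2, False), (3, True)], which is the bool_or from the example written as literals.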
I am using cp_model to solve a problem very similar to the multiple-knapsack problem (https://developers.google.com/optimization/bin/multiple_knapsack). Just like in the example code, I use some boolean variables to encode membership:
# Variables
# x[i, j] = 1 if item i is packed in bin j.
x = {}
for i in data['items']:
    for j in data['bins']:
        x[(i, j)] = solver.IntVar(0, 1, 'x_%i_%i' % (i, j))
What is specific to my problem is that there are a large number of fungible items. There may be 5 items of type 1 and 10 items of type 2. Any item is exchangeable with items of the same type. Using the boolean variables to encode the problem implicitly assumes that the order of the assignment for items of the same type matters. But in fact the order does not matter, and it only takes up unnecessary computation time.
I am wondering if there is any way to design the model so that it accurately expresses that we are allocating from fungible pools of items to save computation.
Instead of creating 5 Boolean variables for the 5 items of type 'i' in bin 'b', just create one integer variable count[i][b], ranging from 0 to 5, for the number of items of type 'i' in bin 'b'. Then add the constraint: sum over b of count[i][b] == number of items of type 'i'.
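To see why the integer counts help, it is enough to count assignments. The following plain-Python sketch is not solver code: the numbers (5 items of one type, 3 bins) are made up for illustration, and every item is assumed to go into exactly one bin.

```python
from itertools import product

n_items, n_bins = 5, 3   # 5 interchangeable items of one type, 3 bins

# Boolean encoding: every individual item picks a bin.
per_item = list(product(range(n_bins), repeat=n_items))

# Count encoding: only how many items of the type land in each bin.
per_count = [
    counts
    for counts in product(range(n_items + 1), repeat=n_bins)
    if sum(counts) == n_items   # sum over bins == number of items of the type
]

# Many per-item assignments collapse onto the same count vector.
collapsed = {tuple(a.count(b) for b in range(n_bins)) for a in per_item}

print(len(per_item), len(per_count))
```

The per-item encoding has 3^5 = 243 assignments while the count encoding has only 21, and every per-item assignment maps onto one of those 21 count vectors.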
I am trying something (in NetLogo), but it is not working. I want the position of a value in a list of numbers, and I want to use that position to retrieve a name from a list of names.
So if I have a list like [1 2 3 4] and a list with ["chicken" "duck" "monkey" "dog"]
I want my number 2 to correspond with "duck".
So far, my zq is a list of numbers and my usedstrategies is a list of names.
let m precision (max zq) 1
let l position m zq
let p (position l zq) usedstrategies
But when I try this the result will be false, because l is not part of usedstrategies.
Ideas?
You need the item primitive to select from the list after matching on the other list. I am not sure what the precision line is for. However, here is a self-contained piece of code that I think demonstrates what you want to do. Note that NetLogo counts positions from 0, not 1. I also used arbitrary numbers in the list so you don't get confused between the number in the list and its position.
to testme
  let usedstrategies (list "chicken" "duck" "monkey" "dog")
  let zq (list 5 6 7 8)
  let strategynum position 7 zq
  let thisstrategy item strategynum usedstrategies
  type "Selected strategy number " type strategynum
  type " which is " print thisstrategy
end
Jen's solution is perfectly fine, but I think this could also be a good use case for the table extension. Here is an example:
extensions [table]

to demo
  let usedstrategies ["chicken" "duck" "monkey" "dog"]
  let zq [5 6 7 8]
  let strategies table:from-list (map list zq usedstrategies)
  ; get item corresponding with number 7:
  print table:get strategies 7
end
A "table", here, is a data structure where a set of keys are associated with values. Here, your numbers are the keys and the strategies are the values.
If you try to get an item for which there is no key in the table (e.g., table:get strategies 9), you'll get the following error:
Extension exception: No value for 9 in table.
Here is a bit more detail about how the code works.
To construct the table, we use the table:from-list reporter, which takes a list of lists as input and gives you back a table where the first item of each sublist is used as a key and the second item is used as a value.
To construct our list of lists, we use the map primitive. This part is a bit trickier to understand. The map primitive needs two kinds of inputs: one or more lists, and a reporter to be applied to elements of these lists. The reporter comes first, and the whole expression needs to be inside parentheses:
(map list zq usedstrategies)
This expression "zips" our two lists together: it takes the first element of zq and the first element of usedstrategies, passes them to the list reporter, which constructs a list with these two elements, and adds that result to a new list. It then takes the second element of zq and the second element of usedstrategies and does the same thing with them, until we have a list that looks like:
[[5 "chicken"] [6 "duck"] [7 "monkey"] [8 "dog"]]
Note that the zipping expression could also have been written:
(map [ [a b] -> list a b ] zq usedstrategies)
...but it's a more roundabout way to do it. The list reporter by itself is already what we want; there is no need to construct a separate anonymous reporter that does the same thing.
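For readers who know Python better than NetLogo, the same zip-then-look-up idea can be sketched there for comparison. This is only an analogy: pairs and strategies are names chosen for the example, and a Python dict stands in for the NetLogo table.

```python
zq = [5, 6, 7, 8]
usedstrategies = ["chicken", "duck", "monkey", "dog"]

# zip pairs up the first elements, then the second elements, and so on,
# much like (map list zq usedstrategies) does in NetLogo.
pairs = [list(pair) for pair in zip(zq, usedstrategies)]

# The first item of each pair becomes the key, the second the value.
strategies = dict(pairs)

print(strategies[7])
```

Looking up a key that is missing (for example strategies[9]) raises a KeyError, which plays the same role as the "No value for 9 in table" error above.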
I am looking to take one particular number, or a range of numbers, from a set of numbers.
Example
A = [-10,-2,-3,-8, 0 ,1, 2, 3, 4 ,5,7, 8, 9, 10, -100];
How can I take just the number 5 from the set of numbers above, and
how can I take a range of numbers, for example from -3 to 4, from A?
Please help.
Thanks
I don't know what you are trying to accomplish by this. But you could check each entry of the set and test if it's in the specified range of numbers. The test for a single number could be accomplished by testing each number explicitly, or as a special case of the range check where the lower and the upper bound are the same number.
Looping and testing works no matter what the programming language is, although most programming languages have built-in methods for accomplishing this type of task (so you may want to specify which language you are supposed to use for your homework):
def get_element(values):
    # Walk the list, keeping track of the current position.
    for index, element in enumerate(values):
        if element == 5:
            # Your "5" is in element and at values[index].
            return element, index
    return None  # there is no 5 in the list
Getting a range:
def get_range(values):
    # Keep searching for a -3.
    for start, element in enumerate(values):
        if element == -3:
            subset = []
            # Collect everything from the -3 up to and including the next 4.
            for value in values[start:]:
                subset.append(value)
                if value == 4:
                    return subset
            # If we met -3 but never met 4, then there is no such range.
            return None
    return None
If run against A, subset would be [-3, -8, 0, 1, 2, 3, 4]; this is a "first matched, first grabbed" poor man's algorithm. On sorted sets the algorithms can get smarter and faster.
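As one illustration of the smarter approach on sorted data, a binary search can find both ends of the range directly. Note the assumption: this sketch selects by value rather than by position, so "from -3 to 4" is read as every value v with -3 <= v <= 4, which is a different interpretation from the positional scan above. The helper name values_between is made up for the example.

```python
from bisect import bisect_left, bisect_right

A = [-10, -2, -3, -8, 0, 1, 2, 3, 4, 5, 7, 8, 9, 10, -100]
sorted_a = sorted(A)

def values_between(sorted_values, low, high):
    """Return every value v with low <= v <= high from a sorted list."""
    first = bisect_left(sorted_values, low)    # first position with v >= low
    last = bisect_right(sorted_values, high)   # first position with v > high
    return sorted_values[first:last]

print(values_between(sorted_a, -3, 4))   # a range of values
print(values_between(sorted_a, 5, 5))    # a single number as a degenerate range
```

Each lookup takes time logarithmic in the length of the list, plus the time to copy the slice, instead of scanning every element.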
I am facing the problem of having several integers from which I have to generate a single one. For example:
Int 1: 14
Int 2: 4
Int 3: 8
Int 4: 4
Hash Sum: 43
I have some restrictions on the values: the maximum value that an attribute can have is 30, and the sum of all of them is always 30. The attributes are also always positive.
The key is that I want to generate the same hash sum for similar integers, for example if I have the integers, 14, 4, 10, 2 then I want to generate the same hash sum, in the case above 43. But of course if the integers are very different (4, 4, 2, 20) then I should have a different hash sum. Also it needs to be fast.
Ideally I would like the output of the hash sum to be between 0 and 512, and it should be evenly distributed. With my restrictions I can have around 5K different possibilities, so what I would like to have is around 10 per bucket.
I am sure there are many algorithms that do this, but I could not find a way of googling this thing. Can anyone please post an algorithm to do this?
Some more information
The whole thing with this is that those integers are attributes for a function. I want to store the values of the function in a table, but I do not have enough memory to store all the different options. That is why I want to generalize between similar attributes.
The reason why 10, 5, 15 is totally different from 5, 10, 15 is that, if you imagine this in 3D, they are two totally different points.
Some more information 2
Some answers try to solve the problem using hashing, but I do not think it is that complex. Thanks to one of the comments I have realized that this is a clustering problem. If we have only 3 attributes and we imagine the problem in 3D, all I need is to divide the space into blocks.
In fact this can be solved with rules of this type
if (att[0] < 5 && att[1] < 5 && att[2] < 5 && att[3] < 5)
    Block = 21
if ((5 < att[0] && att[0] < 10) && (5 < att[1] && att[1] < 10) && (5 < att[2] && att[2] < 10) && (5 < att[3] && att[3] < 10))
    Block = 45
The problem is that I need a fast and general way to generate those ifs; I cannot write out all the possibilities.
The simple solution:
Convert the integers to strings separated by commas, and hash the resulting string using a common hashing algorithm (md5, sha, etc).
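A minimal Python sketch of this string-based approach follows. The helper name string_hash is made up, and the 512 buckets are taken from the question. Note that a cryptographic hash spreads inputs evenly but makes no attempt to put similar inputs into the same bucket.

```python
import hashlib

def string_hash(values, buckets=512):
    """Hash a sequence of integers via its comma-separated string form."""
    text = ",".join(str(v) for v in values)
    digest = hashlib.md5(text.encode("ascii")).digest()
    # Interpret the 16-byte digest as one big integer and fold it into range.
    return int.from_bytes(digest, "big") % buckets

print(string_hash([14, 4, 8, 4]))
```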
If you really want to roll-your-own, I would do something like:
Generate large prime P
Generate random numbers 0 < a[i] < P (for each dimension you have)
To generate hash, calculate: sum(a[i] * x[i]) mod P
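The three steps above can be sketched in Python as follows. This is an illustration under assumptions: the prime 2^31 - 1 is just one convenient choice of P, and make_hash is a hypothetical helper that fixes the random coefficients once so the hash stays consistent between calls.

```python
import random

P = 2_147_483_647  # the Mersenne prime 2**31 - 1, used here as the large prime

def make_hash(dimensions, seed=None):
    """Build a hash x -> sum(a[i] * x[i]) mod P with fixed random a[i]."""
    rng = random.Random(seed)
    # One random coefficient 0 < a[i] < P per dimension.
    coeffs = [rng.randrange(1, P) for _ in range(dimensions)]

    def h(x):
        return sum(a * v for a, v in zip(coeffs, x)) % P

    return h

hash4 = make_hash(4, seed=1)
print(hash4([14, 4, 8, 4]))
```

To land in a table of a given size, the result can be reduced further, for example with a final modulo by the number of buckets.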
Given the inputs a, b, c, and d, each ranging in value from 0 to 30 (5 bits), the following will produce a number in the range of 0 to 255 (8 bits).
bucket = ((a & 0x18) << 3) | ((b & 0x18) << 1) | ((c & 0x18) >> 1) | ((d & 0x18) >> 3)
Whether the general approach is appropriate depends on how the question is interpreted. The 3 least significant bits are dropped, grouping 0-7 in the same set, 8-15 in the next, and so forth.
0-7,0-7,0-7,0-7 -> bucket 0
0-7,0-7,0-7,8-15 -> bucket 1
0-7,0-7,0-7,16-23 -> bucket 2
...
24-30,24-30,24-30,24-30 -> bucket 255
Trivially tested with:
#include <stdio.h>

int main(void)
{
    /* Print the bucket for every combination of inputs in 0..30. */
    for (int a = 0; a <= 30; a++)
        for (int b = 0; b <= 30; b++)
            for (int c = 0; c <= 30; c++)
                for (int d = 0; d <= 30; d++) {
                    int bucket = ((a & 0x18) << 3) |
                                 ((b & 0x18) << 1) |
                                 ((c & 0x18) >> 1) |
                                 ((d & 0x18) >> 3);
                    printf("%d, %d, %d, %d -> %d\n",
                           a, b, c, d, bucket);
                }
    return 0;
}
You want a hash function that depends on the order of inputs and where similar sets of numbers will generate the same hash? That is, you want 50 5 5 10 and 5 5 10 50 to generate different values, but you want 52 7 4 12 to generate the same hash as 50 5 5 10? A simple way to do something like this is:
long hash = 13;
for (int i = 0; i < array.length; i++) {
    hash = hash * 37 + array[i] / 5;
}
This is imperfect, but should give you an idea of one way to implement what you want. It will treat the values 50 - 54 as the same value, but it will treat 49 and 50 as different values.
If you want the hash to be independent of the order of the inputs (so the hash of 5 10 20 and 20 10 5 are the same) then one way to do this is to sort the array of integers into ascending order before applying the hash. Another way would be to replace
hash = hash * 37 + array[i] / 5;
with
hash += array[i] / 5;
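The grouping behaviour of both variants is easy to try out. Below is a small Python sketch of the two loops (ordered_hash and unordered_hash are names chosen for the example, and // is Python's integer division, matching the integer / above for non-negative values). It also shows the imperfection mentioned earlier: 4 and 5 fall into different groups, so 52 7 4 12 does not actually collide with 50 5 5 10, while 52 7 6 12 does.

```python
def ordered_hash(values, step=5):
    """Order-dependent variant: hash = hash * 37 + value / step."""
    h = 13
    for v in values:
        h = h * 37 + v // step
    return h

def unordered_hash(values, step=5):
    """Order-independent variant: hash += value / step."""
    h = 13
    for v in values:
        h += v // step
    return h

same = ordered_hash([50, 5, 5, 10]) == ordered_hash([52, 7, 6, 12])
boundary = ordered_hash([50, 5, 5, 10]) == ordered_hash([52, 7, 4, 12])
print(same, boundary)
```

Here same is True and boundary is False, because 4 // 5 is 0 while 5 // 5 is 1.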
EDIT: Taking into account your comments in response to this answer, it sounds like my attempt above may serve your needs well enough. It won't be ideal, nor perfect. If you need high performance you have some research and experimentation to do.
To summarize, order is important, so 5 10 20 differs from 20 10 5. Also, you would ideally store each "vector" separately in your hash table, but to handle space limitations you want to store some groups of values in one table entry.
An ideal hash function would return a number evenly spread across the possible values based on your table size. Doing this right depends on the expected size of your table and on the number of and expected maximum value of the input vector values. If you can have negative values as "coordinate" values then this may affect how you compute your hash. If, given your range of input values and the hash function chosen, your maximum hash value is less than your hash table size, then you need to change the hash function to generate a larger hash value.
You might want to try using vectors to describe each number set as the hash value.
EDIT:
Since you don't say why you want to avoid running the function itself, I'm guessing it's long-running. You also haven't described the breadth of the argument set, so here are two options.
If every value is expected then a full lookup table in a database might be faster.
If you're expecting repeated calls with the same arguments and little overall variation, then you could look at memoizing, so that only the first run for an argument set is expensive and each additional request is fast, with less memory usage.
You would need to define what you mean by "similar". Hashes are generally designed to create unique results from unique input.
One approach would be to normalize your input and then generate a hash from the results.
Generating the same hash sum is called a collision, and is a bad thing for a hash to have. It makes it less useful.
If you want similar values to give the same output, you can divide the input by however close you want them to count. If the order makes a difference, use a different divisor for each number. The following function does what you describe:
int SqueezedSum( int a, int b, int c, int d )
{
    return (a/11) + (b/7) + (c/5) + (d/3);
}
This is not a hash, but does what you describe.
You want to look into geometric hashing. In "standard" hashing you want
1. a short key
2. inverse resistance
3. collision resistance
With geometric hashing you substitute number 3 with something which is almost the opposite; namely, close initial values give close hash values.
Another way to view my problem is through multidimensional scaling (MS). In MS we start with a matrix of items and assign each item a location in an N-dimensional space, reducing the number of dimensions in this way.
http://en.wikipedia.org/wiki/Multidimensional_scaling