rand(n) returns a number between 0 and n. Will rand work as expected, with regard to "randomness", for all arguments up to the integer limit on my platform?
This is going to depend on your randbits value:
rand calls your system's random number generator (or whichever one was
compiled into your copy of Perl). For this discussion, I'll call that
generator RAND to distinguish it from rand, Perl's function. RAND produces
an integer from 0 to 2**randbits - 1, inclusive, where randbits is a small
integer. To see what it is in your perl, use the command 'perl
-V:randbits'. Common values are 15, 16, or 31.
When you call rand with an argument arg, perl takes that value as an
integer and calculates this value.
    rand(arg) = (arg * RAND) / 2**randbits
This value will always fall in the range required.
0 <= rand(arg) < arg
But as arg becomes large in comparison to 2**randbits, things become
problematic. Let's imagine a machine where randbits = 15, so RAND ranges
from 0..32767. That is, whenever we call RAND, we get one of 32768
possible values. Therefore, when we call rand(arg), we get one of 32768
possible values.
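To make the counting argument concrete, here is a small Python sketch. It is a model of the formula above, not Perl's actual implementation. It assumes randbits = 15 and enumerates every possible RAND; the names scaled and reachable_integers are made up for this illustration.

```python
RANDBITS = 15  # assumed value, as in the example above


def scaled(arg, raw, randbits=RANDBITS):
    """Model of rand(arg) for one raw generator output `raw`."""
    return arg * raw / 2 ** randbits


def reachable_integers(arg, randbits=RANDBITS):
    """Every integer that int(rand(arg)) could produce under this model."""
    return {int(scaled(arg, raw, randbits)) for raw in range(2 ** randbits)}


small = reachable_integers(1000)
large = reachable_integers(100000)
print(len(small))  # all 1000 integers in [0, 1000) are reachable
print(len(large))  # only 32768 of the 100000 integers in [0, 100000)
```

Under these assumptions, a small argument covers its whole range, while an argument larger than 2**randbits leaves most integers unreachable.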
It depends on the number of bits used by your system's (pseudo)random number generator. You can find this value via
perl -V:randbits
or within a program via
use Config;
my $randbits = $Config{randbits};
rand can generate 2^randbits distinct random numbers. While you can generate numbers larger than 2^randbits, you can't generate all of the integer values in the range [0, N) when N > 2^randbits.
Values of N which aren't a power of two can also be problematic, as the distribution of (integer truncated) random values won't quite be flat. Some values will be slightly over-represented, others slightly under-represented.
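The uneven distribution can be seen by brute force. The following Python sketch assumes a generator with 15 random bits and counts how many raw outputs land on each truncated value for N = 3; truncated_counts is a hypothetical helper, not part of Perl.

```python
from collections import Counter

RANDBITS = 15  # assumed width of the underlying generator


def truncated_counts(n, randbits=RANDBITS):
    """Count how many raw generator outputs map to each integer in [0, n)."""
    return Counter(int(n * raw / 2 ** randbits) for raw in range(2 ** randbits))


counts = truncated_counts(3)
print(sorted(counts.items()))
```

With these assumptions, 0 and 1 each receive 10923 of the 32768 raw values and 2 receives 10922, because 32768 is not divisible by 3. For a power of two such as N = 4 the counts come out equal.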
It's worth noting that randbits is a paltry 15 on Windows. This means you can only get 32768 (2**15) distinct values. You can improve the situation by making multiple calls to rand and combining the values:
use Config;
use constant RANDBITS => $Config{randbits};
use constant RAND_MAX => 2**RANDBITS;

sub double_rand {
    my $max = shift || 1;
    my $iv =
          int rand(RAND_MAX) << RANDBITS
        | int rand(RAND_MAX);
    return $max * ($iv / 2**(2*RANDBITS));
}
Assuming randbits = 15, double_rand mimics randbits = 30, providing 1073741824 (2**30) possible distinct values. This alleviates (but can never eliminate) both of the problems mentioned above.
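The bit-packing step is easier to inspect in isolation. Here is a rough Python rendering of the same idea, assuming two independent 15-bit draws; random.getrandbits(15) merely stands in for a narrow generator (Python's own generator is not limited to 15 bits), and combine is a name invented for this sketch.

```python
import random

RANDBITS = 15  # assumed width of one raw draw


def combine(high, low, randbits=RANDBITS):
    """Pack two randbits-wide integers into one value of twice the width."""
    return (high << randbits) | low


def double_rand(maximum=1.0, randbits=RANDBITS):
    """Return a float in [0, maximum) built from two narrow draws."""
    high = random.getrandbits(randbits)
    low = random.getrandbits(randbits)
    return maximum * combine(high, low, randbits) / 2 ** (2 * randbits)


print(combine(2 ** 15 - 1, 2 ** 15 - 1))  # largest packed value: 2**30 - 1
print(double_rand(10))
```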
We are talking about big random integers and whether it is possible to get them. It should be noted that the concatenation of two random integers is also a random integer. So if your system, for any reason, cannot go beyond 999999999999, then just write
$bigrand = int(rand(999999999999)).int(rand(999999999999));
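and you'll get a random integer of (at most) twice the length. Note that the second half loses its leading zeros when it is turned into a string, so pad it to a fixed width (for example with sprintf) if every value in the range must be reachable.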
(Actually this is not a numeric answer to the question “how big a rand number can be” but rather the answer “you can get as big as you want, just concatenate small numbers”.)
I have a 16-bit WORD and I want to read the status of a specific bit or several bits.
I've tried a method that divides the word by the value of the bit that I want, converts the result to two values (an integer and a real), and compares the two. If they are not equal, it evaluates to false. This appears to work only if I am looking for the bit that is the last 'TRUE' bit in the word. If there are any successive TRUE bits, it fails. Perhaps I just haven't done it right. I don't have the ability to use code, just basic math, boolean operations, and type conversion. Any ideas? I hope this isn't a dumb question, but I have a feeling it is.
eg:
WORD 0010000100100100 = 9348
I want to know the value of bit 2. How can I determine it from 9348?
There are many ways, depending on what operations you can use. It appears you don't have much to choose from. But this should work, using just integer division and multiplication, and a test for equality.
(pseudocode):
x = 9348         (binary 0010000100100100, bit 0 = 0, bit 1 = 0, bit 2 = 1, ...)
x = x / 4        (now x is 1000010010010000)
y = (x / 2) * 2  (y is 0000010010010000)
if (x == y) {
    (bit 2 must have been 0)
} else {
    (bit 2 must have been 1)
}
Every time you divide by 2, you move the bits to the left one position (in your representation, which lists bit 0 first). Every time you multiply by 2, you move the bits to the right one position. Odd numbers will have 1 in the least significant position. Even numbers will have 0 in the least significant position. If you divide an odd number by 2 in integer math, and then multiply by 2, you lose the odd bit if there was one. So the idea above is to first move the bit you want to know about into the least significant position. Then, divide by 2 and then multiply by 2. If the result is the same as what you had before, then there must have been a 0 in the bit you care about. If the result is not the same as what you had before, then there must have been a 1 in the bit you care about.
Having explained the idea, we can simplify to
((x / 8) * 2) <> (x / 4)
which will resolve to true if the bit was set, and false if the bit was not set.
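For readers who can run code, the divide-then-multiply test can be checked mechanically. This is a minimal Python sketch of the idea using only integer division, multiplication and a comparison; bit_is_set is a hypothetical name, and bits are numbered from 0 at the least significant end.

```python
def bit_is_set(x, n):
    """Test bit n of x (bit 0 = least significant) with integer division only."""
    shifted = x // 2 ** n  # move bit n into the lowest position
    # Halving and doubling drops the lowest bit, so a mismatch means it was 1.
    return (shifted // 2) * 2 != shifted


print(bit_is_set(9348, 2))  # True
print(bit_is_set(9348, 1))  # False
```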
AND the word with a mask [1].
In your example, you're interested in bit 2 (counting from bit 0), so the mask (in binary) is
00000100. (Which is 4 in decimal.)
In binary, your word 9348 is 0010010010000100 [2]
    0010010010000100 (your word)
AND 0000000000000100 (mask)
    ----------------
    0000000000000100 (result of ANDing your word and the mask)
Because the value is not zero, the bit is set. If it were zero, the bit would not be set.
This technique works for extracting one bit at a time. You can however use it repeatedly with different masks if you're interested in extracting multiple bits.
[1] For more information on masking techniques see http://en.wikipedia.org/wiki/Mask_(computing)
[2] See http://www.binaryhexconverter.com/decimal-to-binary-converter
The nth bit (counting from 0) is equal to floor(word / 2^n) mod 2: integer-divide the word by 2^n and take the remainder modulo 2.
I think you'll have to test each bit, 0 through 15 inclusive.
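As an illustration of testing each bit in turn, the following Python sketch applies the divide-and-mod rule to every position of a 16-bit word. The function names are invented for the example, and bit 0 is taken to be the least significant bit.

```python
def nth_bit(word, n):
    """Return bit n of word (0 or 1) using integer division and modulo."""
    return (word // 2 ** n) % 2


def set_bits(word, width=16):
    """List the positions of all bits that are 1 in a word of `width` bits."""
    return [n for n in range(width) if nth_bit(word, n) == 1]


print(set_bits(9348))  # [2, 7, 10, 13]
```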
You could try 9348 AND 4 (equivalent of 1<<2 - index of the bit you wanted)
9348 AND 4
should give 4 if bit is set, 0 if not.
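On a system that does offer bitwise operators, the masking approach is a one-liner. The Python sketch below is only an illustration; masked and all_set are hypothetical helper names, and bit indices start at 0.

```python
def masked(word, bit_index):
    """Return word AND (1 << bit_index): the bit's value if set, else 0."""
    return word & (1 << bit_index)


def all_set(word, mask):
    """True if every bit that is 1 in mask is also 1 in word."""
    return word & mask == mask


print(masked(9348, 2))             # 4, so bit 2 is set
print(masked(9348, 1))             # 0, so bit 1 is not set
print(all_set(9348, 0b10000100))   # True: bits 2 and 7 are both set
```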
So here is what I have come up with: 3 solutions. One is Hatchet's as proposed above, and his answer helped me immensely with actually understanding HOW this works, which is of utmost importance to me! The proposed AND masking solutions could have worked if my system supports bitwise operators, but it apparently does not.
Original technique:
( ( ( INT ( TAG / BIT ) ) / 2 ) - ( INT ( ( INT ( TAG / BIT ) ) / 2 ) ) <> 0 )
Explanation:
in the first part of the equation, integer division is performed on TAG/BIT, then REAL division by 2. In the second part, integer division is performed TAG/BIT, then integer division again by 2. The difference between these two results is compared to 0. If the difference is not 0, then the formula resolves to TRUE, which means the specified bit is also TRUE.
eg: 9348/4 = 2337 w/ integer division. Then 2337/2 = 1168.5 w/ REAL division but 1168 w/ integer division. 1168.5-1168 <> 0, so the result is TRUE.
My modified technique:
( INT ( TAG / BIT ) / 2 ) <> ( INT ( INT ( TAG / BIT ) / 2 ) )
Explanation:
effectively the same as above, but instead of subtracting the two results and comparing them to 0, I am just comparing the two results themselves. If they are not equal, the formula resolves to TRUE, which means the specified bit is also TRUE.
eg: 9348/4 = 2337 w/ integer division. Then 2337/2 = 1168.5 w/ REAL division but 1168 w/ integer division. 1168.5 <> 1168, so the result is TRUE.
Hatchet's technique as it applies to my system:
( INT ( TAG / BIT )) <> ( INT ( INT ( TAG / BIT ) / 2 ) * 2 )
Explanation:
in the first part of the equation, integer division is performed on TAG/BIT. In the second part, integer division is performed TAG/BIT, then integer division again by 2, then multiplication by 2. The two results are compared. If they are not equal, the formula resolves to TRUE, which means the specified bit is also TRUE.
eg: 9348/4 = 2337. Then 2337/2 = 1168 w/ integer division. Then 1168x2=2336. 2337 <> 2336 so the result is TRUE. As Hatchet stated, this method 'drops the odd bit'.
Note - 9348/4 = 2337 w/ both REAL and integer division, but it is important that these parts of the formula use integer division and not REAL division (12164/32 = 380 w/ integer division and 380.125 w/ REAL division)
I feel it important to note for any future readers that the BIT value in the equations above is not the bit number, but the actual value of the resulting decimal if the bit in the desired position was the only TRUE bit in the binary string (bit 2 = 4 (2^2), bit 6 = 64 (2^6))
This explanation may be a bit too verbose for some, but may be perfect for others :)
Please feel free to comment/critique/correct me if necessary!
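The three formulas above can be compared side by side. In this Python sketch, `/` plays the role of REAL division and int() plays the role of the INT conversion; the function names are made up, and `bit` is the bit's value (4 for bit 2), not its number. It assumes non-negative 16-bit inputs.

```python
def original(tag, bit):
    whole = int(tag / bit)                 # INT(TAG / BIT)
    return (whole / 2) - int(whole / 2) != 0


def modified(tag, bit):
    whole = int(tag / bit)
    return (whole / 2) != int(whole / 2)


def hatchet(tag, bit):
    whole = int(tag / bit)
    return whole != int(whole / 2) * 2


for tag, bit in [(9348, 4), (9348, 2), (12164, 32)]:
    print(tag, bit, original(tag, bit), modified(tag, bit), hatchet(tag, bit))
```

For these inputs all three agree: TRUE for 9348 with BIT = 4, FALSE for the other two pairs.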
I just needed to resolve an integer status code to a bit state in order to interface with some hardware. Here's a method that works for me:
private bool resolveBitState(int value, int bitNumber)
{
    return (value & (1 << (bitNumber - 1))) != 0;
}
I like it, because it's non-iterative, requires no cast operations and essentially translates directly to machine code operations like Shift, And and Comparison, which probably means it's really optimal.
To explain in a little more detail, I'm comparing the bitwise value to a mask for the bit I am interested in (value & mask) using an AND operation. If the bitwise AND operation result is zero, then the bit is not set (return false). If the AND operation result is not zero, then the bit is set (return true). The result of the AND operation is either zero or the value of the bit (1, 2, 4, 8, 16, 32...). Hence the boolean evaluation comparing the AND operation result and 0. The mask is created by taking the number 1 and shifting it left (bit wise), by the appropriate number of binary places (1 << n). The number of places is the number of the bit targeted minus 1. If it's bit #1, I want to shift the 1 left by 0 and if it's #2, I want to shift it left 1 place, etc.
I'm surprised no one rates my solution. I think it's the most logical and succinct... and it works.
When the numbers are really small, Matlab automatically shows them formatted in Scientific Notation.
Example:
A = rand(3) / 10000000000000000;
A =
1.0e-016 *
0.6340 0.1077 0.6477
0.3012 0.7984 0.0551
0.5830 0.8751 0.9386
Is there some in-built function which returns the exponent? Something like: getExponent(A) = -16?
I know this is sort of a stupid question, but I need to check hundreds of matrices and I can't seem to figure it out.
Thank you for your help.
Basic math can tell you that:
floor(log10(N))
The log base 10 of a number tells you approximately how many digits before the decimal are in that number.
For instance, 99987123459823754 is 9.998E+016
log10(99987123459823754) is 16.9999441, the floor of which is 16 - which can basically tell you "the exponent in scientific notation is 16, very close to being 17".
Floor always rounds down, so you don't need to worry about small exponents:
0.000000000003754 = 3.754E-012
log10(0.000000000003754) = -11.425
floor(log10(0.000000000003754)) = -12
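The same arithmetic, written as a small Python sketch rather than Matlab (sci_exponent is a hypothetical name). It assumes a nonzero input and takes the absolute value so that negative numbers work too; values that sit extremely close to an exact power of ten may be affected by floating-point rounding.

```python
import math


def sci_exponent(x):
    """Exponent e such that x = m * 10**e with 1 <= |m| < 10 (x must be nonzero)."""
    return math.floor(math.log10(abs(x)))


print(sci_exponent(99987123459823754))  # 16
print(sci_exponent(0.000000000003754))  # -12
```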
You can use log10(A). The exponent used to print out will be the largest magnitude exponent in A. If you only care about small numbers (< 1), you can use
min(floor(log10(A(:))))
but if it is possible for them to be large too, you'd want something like:
a = log10(A(:));
[v, i] = max(ceil(abs(a)));
exponent = v * sign(a(i));
(A(:) flattens the matrix, so min and max run over all elements instead of column by column.)
this finds the maximum absolute exponent, and returns that. So if A = [1e-6 1e20], it will return 20.
I'm actually not sure quite how Matlab decides what exponent to use when printing out. Obviously, if A is close to 1 (e.g. A = [100, 203]) then it won't use an exponent at all but this solution will return 2. You'd have to play around with it a bit to work out exactly what the rules for printing matrices are.
I have a very long integer, represented by an array of unsigned chars.
Example: the integer 1234 in base 10 is represented in the array as [4,3,2,1], as [2,2,3,2] in base 8, and as [2,13,4] in base 16.
Now I want to convert my integer from base n to base m. In my pursuit of an answer I came across Wallar's algorithm, originally from here.
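def baseExpansion(n, c, b):
    # n: digits in base c, most significant first; returns digits in base b
    j = 0
    base10 = sum(pow(c, len(n) - k - 1) * n[k] for k in range(len(n)))
    # integer division (//) keeps the arithmetic exact for large values
    while base10 // pow(b, j) != 0:
        j = j + 1
    return [(base10 // pow(b, j - p)) % b for p in range(1, j + 1)]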
At first I thought this was my answer, but unfortunately it is not. The problem is that the algorithm computes the sum. In my case this is a problem because the variable base10 is a 32-bit unsigned integer. Therefore, when my integer, represented as an array, has more than 10 digits, the algorithm can no longer convert it. Does anyone have a solution?
Here's the schoolbook algorithm for what you're trying to do. Start with a representation of zero (in the output base) and call it the running total. Then, for each digit of the number to be converted, starting with the most significant and going to the least significant:
1) multiply the running total by the base of the source number, and
2) add the digit to the running total.
All you need now are algorithms for the multiplication and the addition, and you can do both at once:
1) set a variable, call it "carry", to the current source digit;
2) for each digit in your new number, starting with the least significant and going to the most significant:
2a) set carry to the current digit in the new number times the source base, plus carry,
2b) set the current digit to carry mod the output base,
2c) set carry to carry divided by the output base (integer division);
3) while carry is not zero, append carry mod the output base as a new most significant digit and divide carry by the output base.
And that should do it. There is an implementation of what you are trying to do somewhere here: http://www.cis.ksu.edu/~howell/calculator/comparison.html
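A minimal Python sketch of the schoolbook procedure described above, assuming digits are stored least significant first as in the question. Each running-total digit is multiplied by the source base and reduced modulo the output base, so no intermediate value grows beyond roughly source base times output base. The name convert_base is invented for this example.

```python
def convert_base(digits, src_base, dst_base):
    """Convert LSB-first `digits` in src_base to LSB-first digits in dst_base."""
    total = [0]  # running total, stored LSB-first in dst_base
    for digit in reversed(digits):  # most significant source digit first
        carry = digit
        # total = total * src_base + digit, done digit by digit
        for i in range(len(total)):
            carry += total[i] * src_base
            total[i] = carry % dst_base
            carry //= dst_base
        # whatever is left over becomes new most significant digits
        while carry:
            total.append(carry % dst_base)
            carry //= dst_base
    return total


print(convert_base([4, 3, 2, 1], 10, 8))   # [2, 2, 3, 2]
print(convert_base([4, 3, 2, 1], 10, 16))  # [2, 13, 4]
```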
I am facing the problem of having several integers, and I have to generate one using them. For example.
Int 1: 14
Int 2: 4
Int 3: 8
Int 4: 4
Hash Sum: 43
I have some restrictions on the values: the maximum value that an attribute can have is 30, and the sum of all of them is always 30. The attributes are always positive.
The key is that I want to generate the same hash sum for similar integers, for example if I have the integers, 14, 4, 10, 2 then I want to generate the same hash sum, in the case above 43. But of course if the integers are very different (4, 4, 2, 20) then I should have a different hash sum. Also it needs to be fast.
Ideally I would like the output of the hash sum to be between 0 and 512, and it should be evenly distributed. With my restrictions I can have around 5K different possibilities, so what I would like to have is around 10 per bucket.
I am sure there are many algorithms that do this, but I could not find a way of googling it. Can anyone please post an algorithm to do this?
Some more information
The whole thing with this is that those integers are attributes for a function. I want to store the values of the function in a table, but I do not have enough memory to store all the different options. That is why I want to generalize between similar attributes.
The reason why 10, 5, 15 is totally different from 5, 10, 15 is that, if you imagine this in 3D, they are two entirely different points.
Some more information 2
Some answers try to solve the problem using hashing, but I do not think it is that complex. Thanks to one of the comments I have realized that this is a clustering problem. If we have only 3 attributes and we imagine the problem in 3D, all I need is to divide the space into blocks.
In fact this can be solved with rules of this type
if (att[0] < 5 && att[1] < 5 && att[2] < 5 && att[3] < 5)
Block = 21
if ( (5 < att[0] < 10) && (5 < att[1] < 10) && (5 < att[2] < 10) && (5 < att[3] < 10))
Block = 45
The problem is that I need a fast and general way to generate those ifs; I cannot write out all the possibilities.
The simple solution:
Convert the integers to strings separated by commas, and hash the resulting string using a common hashing algorithm (md5, sha, etc).
If you really want to roll-your-own, I would do something like:
Generate large prime P
Generate random numbers 0 < a[i] < P (for each dimension you have)
To generate hash, calculate: sum(a[i] * x[i]) mod P
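The roll-your-own recipe can be sketched in a few lines of Python. The prime and the coefficients below are arbitrary values fixed for reproducibility (the recipe calls for random ones), and linear_hash is a hypothetical name.

```python
P = 1_000_003  # a prime chosen for illustration
COEFFS = [48271, 69621, 16807, 40692]  # arbitrary fixed values with 0 < a[i] < P


def linear_hash(values, coeffs=COEFFS, prime=P):
    """Compute sum(a[i] * x[i]) mod P for one input vector."""
    return sum(a * x for a, x in zip(coeffs, values)) % prime


print(linear_hash([14, 4, 8, 4]))
print(linear_hash([14, 4, 10, 2]))
```

A hash of this kind spreads inputs out; it does not by itself map similar inputs to the same value, so the two vectors above produce different results.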
Given the inputs a, b, c, and d, each ranging in value from 0 to 30 (5 bits), the following will produce a number in the range of 0 to 255 (8 bits).
bucket = ((a & 0x18) << 3) | ((b & 0x18) << 1) | ((c & 0x18) >> 1) | ((d & 0x18) >> 3)
Whether the general approach is appropriate depends on how the question is interpreted. The 3 least significant bits are dropped, grouping 0-7 in the same set, 8-15 in the next, and so forth.
0-7,0-7,0-7,0-7 -> bucket 0
0-7,0-7,0-7,8-15 -> bucket 1
0-7,0-7,0-7,16-23 -> bucket 2
...
24-30,24-30,24-30,24-30 -> bucket 255
Trivially tested with:
for (int a = 0; a <= 30; a++)
    for (int b = 0; b <= 30; b++)
        for (int c = 0; c <= 30; c++)
            for (int d = 0; d <= 30; d++) {
                int bucket = ((a & 0x18) << 3) |
                             ((b & 0x18) << 1) |
                             ((c & 0x18) >> 1) |
                             ((d & 0x18) >> 3);
                printf("%d, %d, %d, %d -> %d\n",
                       a, b, c, d, bucket);
            }
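To judge how evenly the buckets fill under the question's constraint, the same formula can be run over only those inputs whose attributes sum to 30. This Python sketch assumes four non-negative attributes (zero allowed), which is one reading of the question; occupancy is a name invented here.

```python
from collections import Counter
from itertools import product


def bucket(a, b, c, d):
    """Keep the top two bits of each 5-bit input and pack them into 8 bits."""
    return ((a & 0x18) << 3) | ((b & 0x18) << 1) | ((c & 0x18) >> 1) | ((d & 0x18) >> 3)


def occupancy(total=30):
    """Count inputs per bucket over all (a, b, c, d) >= 0 with a+b+c+d == total."""
    counts = Counter()
    for a, b, c in product(range(total + 1), repeat=3):
        d = total - a - b - c
        if d >= 0:
            counts[bucket(a, b, c, d)] += 1
    return counts


counts = occupancy()
print(len(counts), "buckets used for", sum(counts.values()), "inputs")
```

Under this reading there are 5456 inputs in total, and many of the 256 buckets can never be hit; bucket 255, for instance, would need every attribute to be at least 24.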
You want a hash function that depends on the order of inputs and where similar sets of numbers will generate the same hash? That is, you want 50 5 5 10 and 5 5 10 50 to generate different values, but you want 52 7 4 12 to generate the same hash as 50 5 5 10? A simple way to do something like this is:
long hash = 13;
for (int i = 0; i < array.length; i++) {
    hash = hash * 37 + array[i] / 5;
}
This is imperfect, but should give you an idea of one way to implement what you want. It will treat the values 50 - 54 as the same value, but it will treat 49 and 50 as different values.
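The bucket boundaries are easy to see when the loop is run on a few inputs. Here is a Python transcription of the loop (quantized_hash is a hypothetical name; Python integers do not overflow, unlike a Java long, which does not matter for inputs this short).

```python
def quantized_hash(values, step=5):
    """Order-dependent hash of the inputs after integer-dividing each by `step`."""
    h = 13
    for v in values:
        h = h * 37 + v // step
    return h


base = quantized_hash([50, 5, 5, 10])
print(quantized_hash([52, 7, 5, 12]) == base)   # True: every value stays in its bucket
print(quantized_hash([52, 7, 4, 12]) == base)   # False: 4 and 5 fall in different buckets
print(quantized_hash([5, 5, 10, 50]) == base)   # False: order matters
```

The second comparison shows the imperfection mentioned above: values that are numerically close but straddle a multiple of 5 end up with different hashes.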
If you want the hash to be independent of the order of the inputs (so the hash of 5 10 20 and 20 10 5 are the same) then one way to do this is to sort the array of integers into ascending order before applying the hash. Another way would be to replace
hash = hash * 37 + array[i] / 5;
with
hash += array[i] / 5;
EDIT: Taking into account your comments in response to this answer, it sounds like my attempt above may serve your needs well enough. It won't be ideal, nor perfect. If you need high performance you have some research and experimentation to do.
To summarize, order is important, so 5 10 20 differs from 20 10 5. Also, you would ideally store each "vector" separately in your hash table, but to handle space limitations you want to store some groups of values in one table entry.
An ideal hash function would return a number evenly spread across the possible values based on your table size. Doing this right depends on the expected size of your table and on the number of and expected maximum value of the input vector values. If you can have negative values as "coordinate" values then this may affect how you compute your hash. If, given your range of input values and the hash function chosen, your maximum hash value is less than your hash table size, then you need to change the hash function to generate a larger hash value.
You might want to try using vectors to describe each number set as the hash value.
EDIT:
Since you're not describing why you don't want to run the function itself, I'm guessing it's long-running. You also haven't described the breadth of the argument set.
If every value is expected then a full lookup table in a database might be faster.
If you're expecting repeated calls with the same arguments and little overall variation, then you could look at memoizing, so that only the first run for an argument set is expensive and each additional request is fast, with less memory usage.
You would need to define what you mean by "similar". Hashes are generally designed to create unique results from unique input.
One approach would be to normalize your input and then generate a hash from the results.
Generating the same hash sum is called a collision, and is a bad thing for a hash to have. It makes it less useful.
If you want similar values to give the same output, you can divide each input by the granularity within which values should count as the same. If the order makes a difference, use a different divisor for each number. The following function does what you describe:
int SqueezedSum( int a, int b, int c, int d )
{
    return (a/11) + (b/7) + (c/5) + (d/3);
}
This is not a hash, but does what you describe.
You want to look into geometric hashing. In "standard" hashing you want
1. a short key
2. inverse resistance
3. collision resistance
With geometric hashing you substitute number 3 with something which is almost the opposite; namely, close initial values give close hash values.
Another way to view my problem is using multidimensional scaling (MS). In MS we start with a matrix of items, and what we want is to assign each item a location in an N-dimensional space, reducing the number of dimensions in this way.
http://en.wikipedia.org/wiki/Multidimensional_scaling
I am new to iPhone programming. I have 10 numbers, say (1,2,3,4,5,6,7,8,9,10). I want to randomly choose 1 number from these 10. How can I choose a random number from a set of numbers?
If you simply want a value between 1 and 10, you can use the standard C rand() method. This returns an integer between zero and RAND_MAX.
To get a value between 0 and 9 you can use the % operator. So to get a value between 1 and 10 you can use:
rand()%10 + 1
If you don't want the same series of pseudo random numbers each time, you'll need to use srand to seed the random number generator. A good value to seed it with would be the current time.
If you're asking about choosing a number from a list of arbitrary (and possibly non consecutive) numbers, you could use the following.
int numbers[] = {2,3,5,7,11,13,17,19,23,29};
int randomChoice = numbers[rand()%10];
To generate a random number you can use the random() function. Without seeding, however, it produces the same sequence of numbers every time the program runs. To get a fresh sequence, call srandom(time(NULL)) once before the first call to random() (srand seeds rand(), not random()). If you create numbers in a for(int i = 0; ...) loop, seed once before the loop rather than reseeding on every iteration.
Something like this:
- (IBAction)generate:(id)sender
{
    // Generate a number between 1 and 10 inclusive
    int generated;
    generated = (random() % 10) + 1;
}