If I have a calculation that is performed using long (64-bit integer) values with a final result that never exceeds 52 bits (the precision of a double-precision floating point), what is the best way to implement this calculation using double such that it always yields the same answer?
The difficulty comes in when performing, for example, a multiplication of two large numbers: If the long overflows, it drops the highest order bits, but if the double "overflows" (i.e. the result ends up with more than 52 significant bits), it drops the lowest order bits.
As an example, let's take the next(...) function in Java's Random class (edited a bit):
protected int next(int bits) {
seed = (seed * 0x5DEECE66DL + 0xBL) & ((1L << 48) - 1); // seed is a long
return (int) (seed >>> (48 - bits));
}
(Notice how the multiplication will likely overflow a long, but the result is ANDed down to 48 bits)
If I needed to exactly replicate the behavior of this function (or any other using long) in a language without the long data-type (e.g. JavaScript), what would be the most efficient way of doing this? One of the GWT-implementations of this function splits the seed into two halves of 24 bits each to eliminate overflows (again edited a bit):
double hi = seedhi * 0xECE66D + seedlo * 0x5DE; // seedhi and seedlo are doubles
double lo = seedlo * 0xECE66D + 0xB;
double carry = Math.floor(lo >> 24);
hi += carry;
lo &= (1L << 24) - 1;
hi &= (1L << 24) - 1;
seedhi = hi;
seedlo = lo;
Is this the best one can do, or are there some tricks/hacks to make this more elegant?
Related
I don't have much experience working with these low level bytes and numbers, so I've come here for help. I'm connecting to a bluetooth thermometer in my Flutter app, and I get an array of numbers formatted like this according to their documentation. I'm attempting to convert these numbers to a plain temperature double, but can't figure out how. This is the "example" the company gives me. However when I get a reading of 98.5 on the thermometer I get a response as an array of [113, 14, 0, 254]
Thanks for any help!
IEEE-11073 is a commonly used format in medical devices. The table you quoted has everything in it for you to decode the numbers, though might be hard to decipher at first.
Let's take the first example you have: 0xFF00016C. This is a 32-bit number and the first byte is the exponent, and the last three bytes are the mantissa. Both are encoded in 2s complement representation:
Exponent, 0xFF, in 2's complement this is the number -1
Mantissa, 0x00016C, in 2's complement this is the number 364
(If you're not quite sure how numbers are encoded in 2's complement, please ask that as a separate question.)
The next thing we do is to make sure it's not a "special" value, as dictated in your table. Since the exponent you have is not 0 (it is -1), we know that you're OK. So, no special processing is needed.
Since the value is not special, its numeric value is simply: mantissa * 10^exponent. So, we have: 364*10^-1 = 36.4, as your example shows.
Your second example is similar. The exponent is 0xFE, and that's the number -2 in 2's complement. The mantissa is 0x000D97, which is 3479 in decimal. Again, the exponent isn't 0, so no special processing is needed. So you have: 3479*10^-2 = 34.79.
You say for the 98.5 value, you get the byte-array [113, 14, 0, 254]. Let's see if we can make sense of that. Your byte array, written in hex is: [0x71, 0x0E, 0x00, 0xFE]. I'm guessing you receive these bytes in the "reverse" order, so as a 32-bit hexadecimal this is actually 0xFE000E71.
We proceed similarly: Exponent is again -2, since 0xFE is how you write -2 in 2's complement using 8-bits. (See above.) Mantissa is 0xE71 which equals 3697. So, the number is 3697*10^-2 = 36.97.
You are claiming that this is actually 98.5. My best guess is that you are reading it in Fahrenheit, and your device is reporting in Celcius. If you do the math, you'll find that 36.97C = 98.55F, which is close enough. I'm not sure how you got the 98.5 number, but with devices like this, this outcome seems to be within the precision you can about expect.
Hope this helps!
Here is something that I used to convert sfloat16 to double in dart for our flutter app.
double sfloat2double(ieee11073) {
var reservedValues = {
0x07FE: 'PositiveInfinity',
0x07FF: 'NaN',
0x0800: 'NaN',
0x0801: 'NaN',
0x0802: 'NegativeInfinity'
};
var mantissa = ieee11073 & 0x0FFF;
if (reservedValues.containsKey(mantissa)){
return 0.0; // basically error
}
if ((ieee11073 & 0x0800) != 0){
mantissa = -((ieee11073 & 0x0FFF) + 1 );
}else{
mantissa = (ieee11073 & 0x0FFF);
}
var exponent = ieee11073 >> 12;
if (((ieee11073 >> 12) & 0x8) != 0){
exponent = -((~(ieee11073 >> 12) & 0x0F) + 1 );
}else{
exponent = ((ieee11073 >> 12) & 0x0F);
}
var magnitude = pow(10, exponent);
return (mantissa * magnitude);
}
I don't have much experience working with these low level bytes and numbers, so I've come here for help. I'm connecting to a bluetooth thermometer in my Flutter app, and I get an array of numbers formatted like this according to their documentation. I'm attempting to convert these numbers to a plain temperature double, but can't figure out how. This is the "example" the company gives me. However when I get a reading of 98.5 on the thermometer I get a response as an array of [113, 14, 0, 254]
Thanks for any help!
IEEE-11073 is a commonly used format in medical devices. The table you quoted has everything in it for you to decode the numbers, though might be hard to decipher at first.
Let's take the first example you have: 0xFF00016C. This is a 32-bit number and the first byte is the exponent, and the last three bytes are the mantissa. Both are encoded in 2s complement representation:
Exponent, 0xFF, in 2's complement this is the number -1
Mantissa, 0x00016C, in 2's complement this is the number 364
(If you're not quite sure how numbers are encoded in 2's complement, please ask that as a separate question.)
The next thing we do is to make sure it's not a "special" value, as dictated in your table. Since the exponent you have is not 0 (it is -1), we know that you're OK. So, no special processing is needed.
Since the value is not special, its numeric value is simply: mantissa * 10^exponent. So, we have: 364*10^-1 = 36.4, as your example shows.
Your second example is similar. The exponent is 0xFE, and that's the number -2 in 2's complement. The mantissa is 0x000D97, which is 3479 in decimal. Again, the exponent isn't 0, so no special processing is needed. So you have: 3479*10^-2 = 34.79.
You say for the 98.5 value, you get the byte-array [113, 14, 0, 254]. Let's see if we can make sense of that. Your byte array, written in hex is: [0x71, 0x0E, 0x00, 0xFE]. I'm guessing you receive these bytes in the "reverse" order, so as a 32-bit hexadecimal this is actually 0xFE000E71.
We proceed similarly: Exponent is again -2, since 0xFE is how you write -2 in 2's complement using 8-bits. (See above.) Mantissa is 0xE71 which equals 3697. So, the number is 3697*10^-2 = 36.97.
You are claiming that this is actually 98.5. My best guess is that you are reading it in Fahrenheit, and your device is reporting in Celcius. If you do the math, you'll find that 36.97C = 98.55F, which is close enough. I'm not sure how you got the 98.5 number, but with devices like this, this outcome seems to be within the precision you can about expect.
Hope this helps!
Here is something that I used to convert sfloat16 to double in dart for our flutter app.
double sfloat2double(ieee11073) {
var reservedValues = {
0x07FE: 'PositiveInfinity',
0x07FF: 'NaN',
0x0800: 'NaN',
0x0801: 'NaN',
0x0802: 'NegativeInfinity'
};
var mantissa = ieee11073 & 0x0FFF;
if (reservedValues.containsKey(mantissa)){
return 0.0; // basically error
}
if ((ieee11073 & 0x0800) != 0){
mantissa = -((ieee11073 & 0x0FFF) + 1 );
}else{
mantissa = (ieee11073 & 0x0FFF);
}
var exponent = ieee11073 >> 12;
if (((ieee11073 >> 12) & 0x8) != 0){
exponent = -((~(ieee11073 >> 12) & 0x0F) + 1 );
}else{
exponent = ((ieee11073 >> 12) & 0x0F);
}
var magnitude = pow(10, exponent);
return (mantissa * magnitude);
}
This question already has answers here:
How to use Bitxor for Double Numbers?
(2 answers)
Closed 9 years ago.
I have two matrices a = [120.23, 255.23669877,...] and b = [125.000083, 800.0101010,...] with double numbers in [0, 999]. I want to use bitxor for a and b. I can not use bitxor with round like this:
result = bitxor(round(a(1,j),round(b(1,j))
Because the decimal parts 0.23 and 0.000083 ,... are very important to me. I thought maybe I could do a = a*10^k and b = b*10^k and use bitxor and after that result/10^k (because I want my result's range to also be [0, 999]. But I do not know the maximum length of the number after the decimal point. Does k = 16 support the max range of double numbers in Matlab? Does bitxor support two 19-digit numbers? Is there a better solution?
This is not really an answer, but a very long comment with embedded code. I don't have a current matlab installation, and in any case don't know enough to answer the question in that context. Instead, I've written a Java program that I think may do what you are asking for. It uses two Java classes, BigInteger and BigDecimal. BigInteger is an extended integer format. BigDecimal is the combination of a BigInteger and a decimal scale.
Conversion from double to BigDecimal is exact. Conversion in the opposite direction may require rounding.
The function xor in my program converts each of its operands to BigDecimal. It finds a number of decimal digits to move the decimal point by to make both operands integers. After scaling, it converts to BigInteger, does the actual xor, and converts back to BigDecimal undoing the scaling.
The main point of this is for you to look at the results, and see whether they are what you want, and would be useful to you if you could do the same thing in Matlab. Explaining any ways in which the results are not what you want may help clarify your requirements for the Matlab experts.
Here is some test output. The top and bottom rows of each block are in decimal. The middle row is the scaled integer versions of the inputs, in hex.
Testing operands 1.100000000000000088817841970012523233890533447265625, 2
2f0a689f1b94a78f11d31b7ab806d40b1014d3f6d59 xor 558749db77f70029c77506823d22bd0000000000000 = 7a8d21446c63a7a6d6a61df88524690b1014d3f6d59
1.1 xor 2.0 = 2.8657425494106605
Testing operands 100, 200.0004999999999881765688769519329071044921875
2cd76fe086b93ce2f768a00b22a00000000000 xor 59aeee72a26b59f6380fcf078b92c4478e8a13 = 7579819224d26514cf676f0ca932c4478e8a13
100.0 xor 200.0005 = 261.9771865509636
Testing operands 120.3250000000000028421709430404007434844970703125, 120.75
d2c39898113a28d484dd867220659fbb45005915 xor d3822c338b76bab08df9fee485d1b00000000000 = 141b4ab9a4c926409247896a5b42fbb45005915
120.325 xor 120.75 = 0.7174277813579485
Testing operands 120.2300000000000039790393202565610408782958984375, 120.0000830000000036079654819332063198089599609375
d298ff20fbed5fd091d87e56002df79fc7007cb7 xor d231e5f39e1db18654cb8c43d579692616a16a1f = a91ad365f0ee56c513f215d5549eb9d1a116a8
120.23 xor 120.000083 = 0.37711627930683345
Here is the Java program:
import java.math.BigDecimal;
import java.math.BigInteger;
public class Test {
public static double xor(double a, double b) {
BigDecimal ad = new BigDecimal(a);
BigDecimal bd = new BigDecimal(b);
/*
* Shifting the decimal point right by scale will make both operands
* integers.
*/
int scale = Math.max(ad.scale(), bd.scale());
/*
* Scale both operands by, in effect, multiplying by the same power of 10.
*/
BigDecimal aScaled = ad.movePointRight(scale);
BigDecimal bScaled = bd.movePointRight(scale);
/*
* Convert the operands to integers, treating any rounding as an error.
*/
BigInteger aInt = aScaled.toBigIntegerExact();
BigInteger bInt = bScaled.toBigIntegerExact();
BigInteger resultInt = aInt.xor(bInt);
System.out.println(aInt.toString(16) + " xor " + bInt.toString(16) + " = "
+ resultInt.toString(16));
/*
* Undo the decimal point shift, in effect dividing by the same power of 10
* as was used to scale to integers.
*/
BigDecimal result = new BigDecimal(resultInt, scale);
return result.doubleValue();
}
public static void test(double a, double b) {
System.out.println("Testing operands " + new BigDecimal(a) + ", " + new BigDecimal(b));
double result = xor(a, b);
System.out.println(a + " xor " + b + " = " + result);
System.out.println();
}
public static void main(String arg[])
{
test(1.1, 2.0);
test(100, 200.0005);
test(120.325, 120.75);
test(120.23, 120.000083);
}
}
"But I do not know the max length of number after point ..."
In double precision floating-point you have 15–17 significant decimal digits. If you give bitxor double inputs these must be less than intmax('uint64'): 1.844674407370955e+19. The largest double, realmax (= 1.797693134862316e+308), is much bigger than this, so you can't represent everything in the the way you're using. For example, this means that your value of 800.0101010*10^17 won't work.
If your range is [0, 999], one option is to solve for the largest fractional exponent k and use that: log(double(intmax('uint64'))/999)/log(10) (= 16.266354234268810).
I want to convert the decimal number 27 into binary such a way that , first the digit 2 is converted and its binary value is placed in an array and then the digit 7 is converted and its binary number is placed in that array. what should I do?
thanks in advance
That's called binary-coded decimal. It's easiest to work right-to-left. Take the value modulo 10 (% operator in C/C++/ObjC) and put it in the array. Then integer-divide the value by 10 (/ operator in C/C++/ObjC). Continue until your value is zero. Then reverse the array if you need most-significant digit first.
If I understand your question correctly, you want to go from 27 to an array that looks like {0010, 0111}.
If you understand how base systems work (specifically the decimal system), this should be simple.
First, you find the remainder of your number when divided by 10. Your number 27 in this case would result with 7.
Then you integer divide your number by 10 and store it back in that variable. Your number 27 would result in 2.
How many times do you do this?
You do this until you have no more digits.
How many digits can you have?
Well, if you think about the number 100, it has 3 digits because the number needs to remember that one 10^2 exists in the number. On the other hand, 99 does not.
The answer to the previous question is 1 + floor of Log base 10 of the input number.
Log of 100 is 2, plus 1 is 3, which equals number of digits.
Log of 99 is a little less than 2, but flooring it is 1, plus 1 is 2.
In java it is like this:
int input = 27;
int number = 0;
int numDigits = Math.floor(Log(10, input)) + 1;
int[] digitArray = new int [numDigits];
for (int i = 0; i < numDigits; i++) {
number = input % 10;
digitArray[numDigits - i - 1] = number;
input = input / 10;
}
return digitArray;
Java doesn't have a Log function that is portable for any base (it has it for base e), but it is trivial to make a function for it.
double Log( double base, double value ) {
return Math.log(value)/Math.log(base);
}
Good luck.
The problem in general:
I have a big 2d point space, sparsely populated with dots.
Think of it as a big white canvas sprinkled with black dots.
I have to iterate over and search through these dots a lot.
The Canvas (point space) can be huge, bordering on the limits
of int and its size is unknown before setting points in there.
That brought me to the idea of hashing:
Ideal:
I need a hash function taking a 2D point, returning a unique uint32.
So that no collisions can occur. You can assume that the number of
dots on the Canvas is easily countable by uint32.
IMPORTANT: It is impossible to know the size of the canvas beforehand
(it may even change),
so things like
canvaswidth * y + x
are sadly out of the question.
I also tried a very naive
abs(x) + abs(y)
but that produces too many collisions.
Compromise:
A hash function that provides keys with a very low probability of collision.
Cantor's enumeration of pairs
n = ((x + y)*(x + y + 1)/2) + y
might be interesting, as it's closest to your original canvaswidth * y + x but will work for any x or y. But for a real world int32 hash, rather than a mapping of pairs of integers to integers, you're probably better off with a bit manipulation such as Bob Jenkin's mix and calling that with x,y and a salt.
a hash function that is GUARANTEED collision-free is not a hash function :)
Instead of using a hash function, you could consider using binary space partition trees (BSPs) or XY-trees (closely related).
If you want to hash two uint32's into one uint32, do not use things like Y & 0xFFFF because that discards half of the bits. Do something like
(x * 0x1f1f1f1f) ^ y
(you need to transform one of the variables first to make sure the hash function is not commutative)
Like Emil, but handles 16-bit overflows in x in a way that produces fewer collisions, and takes fewer instructions to compute:
hash = ( y << 16 ) ^ x;
You can recursively divide your XY plane into cells, then divide these cells into sub-cells, etc.
Gustavo Niemeyer invented in 2008 his Geohash geocoding system.
Amazon's open source Geo Library computes the hash for any longitude-latitude coordinate. The resulting Geohash value is a 63 bit number. The probability of collision depends of the hash's resolution: if two objects are closer than the intrinsic resolution, the calculated hash will be identical.
Read more:
https://en.wikipedia.org/wiki/Geohash
https://aws.amazon.com/fr/blogs/mobile/geo-library-for-amazon-dynamodb-part-1-table-structure/
https://github.com/awslabs/dynamodb-geo
Your "ideal" is impossible.
You want a mapping (x, y) -> i where x, y, and i are all 32-bit quantities, which is guaranteed not to generate duplicate values of i.
Here's why: suppose there is a function hash() so that hash(x, y) gives different integer values. There are 2^32 (about 4 billion) values for x, and 2^32 values of y. So hash(x, y) has 2^64 (about 16 million trillion) possible results. But there are only 2^32 possible values in a 32-bit int, so the result of hash() won't fit in a 32-bit int.
See also http://en.wikipedia.org/wiki/Counting_argument
Generally, you should always design your data structures to deal with collisions. (Unless your hashes are very long (at least 128 bit), very good (use cryptographic hash functions), and you're feeling lucky).
Perhaps?
hash = ((y & 0xFFFF) << 16) | (x & 0xFFFF);
Works as long as x and y can be stored as 16 bit integers. No idea about how many collisions this causes for larger integers, though. One idea might be to still use this scheme but combine it with a compression scheme, such as taking the modulus of 2^16.
If you can do a = ((y & 0xffff) << 16) | (x & 0xffff) then you could afterward apply a reversible 32-bit mix to a, such as Thomas Wang's
uint32_t hash( uint32_t a)
a = (a ^ 61) ^ (a >> 16);
a = a + (a << 3);
a = a ^ (a >> 4);
a = a * 0x27d4eb2d;
a = a ^ (a >> 15);
return a;
}
That way you get a random-looking result rather than high bits from one dimension and low bits from the other.
You can do
a >= b ? a * a + a + b : a + b * b
taken from here.
That works for points in positive plane. If your coordinates can be in negative axis too, then you will have to do:
A = a >= 0 ? 2 * a : -2 * a - 1;
B = b >= 0 ? 2 * b : -2 * b - 1;
A >= B ? A * A + A + B : A + B * B;
But to restrict the output to uint you will have to keep an upper bound for your inputs. and if so, then it turns out that you know the bounds. In other words in programming its impractical to write a function without having an idea on the integer type your inputs and output can be and if so there definitely will be a lower bound and upper bound for every integer type.
public uint GetHashCode(whatever a, whatever b)
{
if (a > ushort.MaxValue || b > ushort.MaxValue ||
a < ushort.MinValue || b < ushort.MinValue)
{
throw new ArgumentOutOfRangeException();
}
return (uint)(a * short.MaxValue + b); //very good space/speed efficiency
//or whatever your function is.
}
If you want output to be strictly uint for unknown range of inputs, then there will be reasonable amount of collisions depending upon that range. What I would suggest is to have a function that can overflow but unchecked. Emil's solution is great, in C#:
return unchecked((uint)((a & 0xffff) << 16 | (b & 0xffff)));
See Mapping two integers to one, in a unique and deterministic way for a plethora of options..
According to your use case, it might be possible to use a Quadtree and replace points with the string of branch names. It is actually a sparse representation for points and will need a custom Quadtree structure that extends the canvas by adding branches when you add points off the canvas but it avoids collisions and you'll have benefits like quick nearest neighbor searches.
If you're already using languages or platforms that all objects (even primitive ones like integers) has built-in hash functions implemented (Java platform Languages like Java, .NET platform languages like C#. And others like Python, Ruby, etc ).
You may use built-in hashing values as a building block and add your "hashing flavor" in to the mix. Like:
// C# code snippet
public class SomeVerySimplePoint {
public int X;
public int Y;
public override int GetHashCode() {
return ( Y.GetHashCode() << 16 ) ^ X.GetHashCode();
}
}
And also having test cases like "predefined million point set" running against each possible hash generating algorithm comparison for different aspects like, computation time, memory required, key collision count, and edge cases (too big or too small values) may be handy.
the Fibonacci hash works very well for integer pairs
multiplier 0x9E3779B9
other word sizes 1/phi = (sqrt(5)-1)/2 * 2^w round to odd
a1 + a2*multiplier
this will give very different values for close together pairs
I do not know about the result with all pairs