I've coded this simple function to round doubles to a custom step size.
The normal .round() function returns an int and can only round to the nearest 1.
My function returns a double and can round to the nearest 100.0, 5.0, 1.0, 0.1 or 0.23, you get the idea.
But when I put in certain doubles the result doesn't quite work out and is a very tiny fraction off.
I think this has something to do with how computers do floating-point calculations, but I need an efficient way to get around it.
void main() {
  stepround(61.337551616741315, 0.1); // this should be 61.3 but is 61.300000000000004
}

/// rounds a double with given steps/precision
double stepround(double value, double steps) {
  double rounded = (value / steps).round() * steps;
  print(value.toString() + " rounded to the nearest " + steps.toString() + " is " + rounded.toString());
  return rounded;
}
As mentioned in the comments, the cause of this issue is the way that computers deal with floating-point numbers. Please refer to the links in the comments for further explanation.
However, in a nutshell, the problem mostly shows up when dividing or multiplying decimals by decimals. Therefore we can create a method similar to the one you created, but with a different approach: we take the precision as an int.
I.e.: 0.1 => 10; 0.001 => 1000
double stepround(double value, int place) {
  return (value * place).round() / place;
}
Example
// This will return 61.3
stepround(61.337551616741315, 10);
// This will return 61.34
stepround(61.337551616741315, 100);
// This will return 61.338
stepround(61.337551616741315, 1000);
This method works because the tiny fraction introduced by the multiplication is removed by round(), and the subsequent division by an integer doesn't reintroduce the problem.
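If you need to keep arbitrary step sizes such as 0.25 rather than powers of ten, another option (a sketch only, shown in Java because the idea is language-independent; the BigDecimal class used here is not part of the Dart answer above) is to do the rounding in decimal arithmetic, so no binary floating-point error can creep in:

import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical variant: take the step as a decimal string (e.g. "0.1" or "0.25")
// and round in decimal arithmetic instead of binary doubles.
public static double stepround(double value, String step) {
    BigDecimal v = new BigDecimal(Double.toString(value));
    BigDecimal s = new BigDecimal(step);
    // Divide, round to the nearest whole number of steps, then scale back.
    return v.divide(s, 0, RoundingMode.HALF_UP).multiply(s).doubleValue();
}

// stepround(61.337551616741315, "0.1")  -> 61.3
// stepround(61.337551616741315, "0.25") -> 61.25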
I have a method which accepts a BigDecimal. I want to make sure its decimal value has no precision loss, i.e. that the input to the BigDecimal is stored exactly as it is. My understanding is that precision loss happens when I convert a decimal value that has no exact double representation (such as 0.1) to a BigDecimal via a double, and that it is therefore recommended to create the BigDecimal from the string representation instead. Basically, I want to verify that the BigDecimal was constructed using BigDecimal(String) for such values.
As per my understanding after going through the docs, a decimal value that loses precision during the BigDecimal conversion is one whose exact representation won't fit in the 64 bits of a double. Example: 0.1. So the string and double representations of such a BigDecimal won't match. Is it enough to say that precision loss has occurred when the string and double values don't match?
Eg:
BigDecimal decimal = new BigDecimal(0.1);
System.out.println(decimal.toString()) // prints 0.1000000000000000055511151231257827021181583404541015625
System.out.println(decimal.doubleValue()) // prints 0.1.
The String and double values of the BigDecimal differ, and so precision loss happened.
This idea breaks down if you allow the BigDecimal to be the result of arithmetic.
If you are going to require the BigDecimal to be the direct, unmodified result of converting a decimal string, it would be much simpler to require a String argument and convert it to a BigDecimal inside your method.
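A minimal sketch of that simpler alternative (the method name is just for illustration):

import java.math.BigDecimal;

// Accept the decimal as text and convert inside the method, so no double is
// ever involved and the BigDecimal holds exactly what the caller wrote.
public static BigDecimal parseExact(String decimalText) {
    return new BigDecimal(decimalText); // throws NumberFormatException if the text is malformed
}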
The following program is an attempt to implement and test your validity check. The variable third was calculated without any involvement of doubles, using only decimal strings and BigDecimal, but fails the test.
import java.math.BigDecimal;
import java.math.RoundingMode;

public strictfp class Test {
    public static void main(String[] args) {
        BigDecimal third = BigDecimal.ONE.divide(new BigDecimal("3"), 30, RoundingMode.HALF_EVEN);
        testIt(new BigDecimal("0.1"));
        testIt(new BigDecimal(0.1));
        testIt(third);
    }

    static void testIt(BigDecimal in) {
        System.out.println(in + " " + isValid(in));
    }

    static boolean isValid(BigDecimal in) {
        double d = in.doubleValue();
        String s1 = in.toString();
        String s2 = Double.toString(d);
        return s1.equals(s2);
    }
}
Output:
0.1 true
0.1000000000000000055511151231257827021181583404541015625 false
0.333333333333333333333333333333 false
I am calculating the intersection point of two lines given in the polar coordinate system:
typedef ap_fixed<16,3,AP_RND> t_lines_angle;
typedef ap_fixed<16,14,AP_RND> t_lines_rho;
bool get_intersection(
    hls::Polar_<t_lines_angle, t_lines_rho>* lineOne,
    hls::Polar_<t_lines_angle, t_lines_rho>* lineTwo,
    Point* point)
{
    float angleL1 = lineOne->angle.to_float();
    float angleL2 = lineTwo->angle.to_float();
    t_lines_angle rhoL1 = lineOne->rho.to_float();
    t_lines_angle rhoL2 = lineTwo->rho.to_float();
    t_lines_angle ct1 = cosf(angleL1);
    t_lines_angle st1 = sinf(angleL1);
    t_lines_angle ct2 = cosf(angleL2);
    t_lines_angle st2 = sinf(angleL2);
    t_lines_angle d = ct1*st2 - st1*ct2;

    // we make sure that the lines intersect
    // which means that parallel lines are not possible
    point->X = (int)((st2*rhoL1 - st1*rhoL2) / d);
    point->Y = (int)((-ct2*rhoL1 + ct1*rhoL2) / d);

    return true;
}
After synthesis for our FPGA I saw that the 4 implementations of the float sine (and cosine) take 4800 LUTs per implementation, which sums up to 19000 LUTs for these 4 functions. I want to reduce the LUT count by using a fixed-point sine. I already found an implementation of CORDIC, but I am not sure how to use it. The input of the function is an integer but I have an ap_fixed datatype. How can I map this ap_fixed to an integer? And how can I map my 3.13 fixed point to the required 2.14 fixed point?
With the help of one of my colleagues I figured out a quite easy solution that does not require any hand-written implementations or manipulation of the fixed-point data:
use #include "hls_math.h" and the hls::sinf() and hls::cosf() functions.
It is important to say that the input of the functions should be ap_fixed<32, I> where I <= 32. The output of the functions can be assigned to a different type, e.g. ap_fixed<16, I>.
Example:
void CalculateSomeTrig(ap_fixed<16,5>* angle, ap_fixed<16,5>* output)
{
    ap_fixed<32,5> functionInput = *angle;
    *output = hls::sinf(functionInput);
}
LUT consumption:
In my case the LUT consumption was reduced to 400 LUTs for each implementation of the function.
You can use bit-slicing to get the fraction and the integer parts of the ap_fixed variable, and then manipulate them to get the new ap_fixed. Perhaps something like:
constexpr int max(int a, int b) { return a > b ? a : b; }

template <int W2, int I2, int W1, int I1>
ap_fixed<W2, I2> convert(ap_fixed<W1, I1> f)
{
    // Read fraction part as integer:
    ap_fixed<max(W2, W1) + 1, max(I2, I1) + 1> result = f(W1 - I1 - 1, 0);
    // Shift by the original number of bits in the fraction part
    result >>= W1 - I1;
    // Add the integer part
    result += f(W1 - 1, W1 - I1);
    return result;
}
I haven't tested this code well, so take it with a grain of salt.
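For the 3.13-to-2.14 part of the question specifically: only the number of fractional bits changes by one, so on raw 16-bit words the conversion is just a one-bit left shift, valid as long as the value fits in the narrower integer range. A rough sketch with plain integers (Java here, purely to show the bit-level idea; with ap_fixed, a plain assignment between the two types should already perform an equivalent conversion for you):

// Reinterpret a raw 16-bit word from Q3.13 (13 fractional bits) as Q2.14 (14 fractional bits).
// Illustrative only; the value must fit in the 2 integer bits of the target format.
static short q3_13ToQ2_14(short raw) {
    int shifted = raw << 1; // one extra fractional bit = multiply the raw word by 2
    if (shifted > Short.MAX_VALUE || shifted < Short.MIN_VALUE) {
        throw new ArithmeticException("value out of Q2.14 range");
    }
    return (short) shifted;
}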
If I have a calculation that is performed using long (64-bit integer) values with a final result that never exceeds 52 bits (the precision of a double-precision floating point), what is the best way to implement this calculation using double such that it always yields the same answer?
The difficulty comes in when performing, for example, a multiplication of two large numbers: If the long overflows, it drops the highest order bits, but if the double "overflows" (i.e. the result ends up with more than 52 significant bits), it drops the lowest order bits.
As an example, let's take the next(...) function in Java's Random class (edited a bit):
protected int next(int bits) {
    seed = (seed * 0x5DEECE66DL + 0xBL) & ((1L << 48) - 1); // seed is a long
    return (int) (seed >>> (48 - bits));
}
(Notice how the multiplication will likely overflow a long, but the result is ANDed down to 48 bits)
If I needed to exactly replicate the behavior of this function (or any other using long) in a language without the long data-type (e.g. JavaScript), what would be the most efficient way of doing this? One of the GWT-implementations of this function splits the seed into two halves of 24 bits each to eliminate overflows (again edited a bit):
double hi = seedhi * 0xECE66D + seedlo * 0x5DE; // seedhi and seedlo are doubles
double lo = seedlo * 0xECE66D + 0xB;
double carry = Math.floor(lo / 0x1000000);      // i.e. lo >> 24, done arithmetically
hi += carry;
lo = lo % 0x1000000;                            // i.e. lo & ((1L << 24) - 1)
hi = hi % 0x1000000;                            // i.e. hi & ((1L << 24) - 1)
seedhi = hi;
seedlo = lo;
Is this the best one can do, or are there some tricks/hacks to make this more elegant?
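For reference, here is a minimal, self-contained sketch of the 24-bit-split approach from the snippet above, written in plain Java with the 48-bit LCG state held in two doubles (class and method names are made up; all intermediate values stay below 2^53, so the double arithmetic is exact). The main method checks it against java.util.Random:

import java.util.Random;

// A sketch (not GWT's actual code) of emulating Random.next() using only doubles,
// with the 48-bit seed split into two 24-bit halves.
public class DoubleLcg {
    private double seedHi, seedLo; // each holds an integer in [0, 2^24)

    DoubleLcg(long seed) {
        long s = (seed ^ 0x5DEECE66DL) & ((1L << 48) - 1); // same scrambling as java.util.Random
        seedHi = s >>> 24;
        seedLo = s & ((1L << 24) - 1);
    }

    int next(int bits) {
        // 0x5DEECE66D == 0x5DE * 2^24 + 0xECE66D
        double lo = seedLo * 0xECE66D + 0xB;
        double hi = seedHi * 0xECE66D + seedLo * 0x5DE + Math.floor(lo / 0x1000000);
        seedLo = lo % 0x1000000; // keep the low 24 bits
        seedHi = hi % 0x1000000; // keep the low 24 bits (mod 2^48 overall)
        // Emulate (seed >>> (48 - bits)), then truncate to 32 bits like a cast to int.
        double shifted = Math.floor((seedHi * 0x1000000 + seedLo) / (double) (1L << (48 - bits)));
        return (int) (long) shifted;
    }

    public static void main(String[] args) {
        DoubleLcg emulated = new DoubleLcg(42);
        Random reference = new Random(42);
        for (int i = 0; i < 5; i++) {
            System.out.println(emulated.next(32) + " == " + reference.nextInt());
        }
    }
}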
The problem in general:
I have a big 2D point space, sparsely populated with dots. Think of it as a big white canvas sprinkled with black dots. I have to iterate over and search through these dots a lot. The Canvas (point space) can be huge, bordering on the limits of int, and its size is unknown before setting points in there.
That brought me to the idea of hashing:
Ideal:
I need a hash function taking a 2D point and returning a unique uint32, so that no collisions can occur. You can assume that the number of dots on the Canvas is easily countable by uint32.
IMPORTANT: It is impossible to know the size of the canvas beforehand (it may even change), so things like canvaswidth * y + x are sadly out of the question.
I also tried a very naive abs(x) + abs(y), but that produces too many collisions.
Compromise:
A hash function that provides keys with a very low probability of collision.
Cantor's enumeration of pairs
n = ((x + y)*(x + y + 1)/2) + y
might be interesting, as it's closest to your original canvaswidth * y + x but will work for any non-negative x or y. But for a real-world int32 hash, rather than a mapping of pairs of integers to integers, you're probably better off with a bit manipulation such as Bob Jenkins' mix, calling it with x, y and a salt.
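For reference, a minimal sketch of that pairing used as a 32-bit hash (illustrative only; it is a true pairing only for non-negative coordinates, and the intermediate stays exact as long as x + y is below roughly 3 billion):

// Cantor's enumeration of pairs, n = ((x + y) * (x + y + 1) / 2) + y,
// truncated to 32 bits for use as a hash value.
static int cantorHash(int x, int y) {
    long sum = (long) x + y;
    long n = sum * (sum + 1) / 2 + y;
    return (int) n; // keep the low 32 bits
}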
A hash function that is GUARANTEED collision-free is not a hash function :)
Instead of using a hash function, you could consider using binary space partition trees (BSPs) or XY-trees (closely related).
If you want to hash two uint32's into one uint32, do not use things like Y & 0xFFFF because that discards half of the bits. Do something like
(x * 0x1f1f1f1f) ^ y
(you need to transform one of the variables first to make sure the hash function is not commutative)
Like Emil, but handles 16-bit overflows in x in a way that produces fewer collisions, and takes fewer instructions to compute:
hash = ( y << 16 ) ^ x;
You can recursively divide your XY plane into cells, then divide these cells into sub-cells, etc.
In 2008, Gustavo Niemeyer invented his Geohash geocoding system.
Amazon's open-source Geo Library computes the hash for any longitude-latitude coordinate. The resulting Geohash value is a 63-bit number. The probability of collision depends on the hash's resolution: if two objects are closer than the intrinsic resolution, the calculated hash will be identical.
Read more:
https://en.wikipedia.org/wiki/Geohash
https://aws.amazon.com/fr/blogs/mobile/geo-library-for-amazon-dynamodb-part-1-table-structure/
https://github.com/awslabs/dynamodb-geo
Your "ideal" is impossible.
You want a mapping (x, y) -> i where x, y, and i are all 32-bit quantities, which is guaranteed not to generate duplicate values of i.
Here's why: suppose there were a function hash() such that hash(x, y) gives a different integer value for every pair. There are 2^32 (about 4 billion) values for x, and 2^32 values for y, so there are 2^64 (about 16 million trillion) possible pairs that would each need a distinct result. But there are only 2^32 possible values in a 32-bit int, so by the pigeonhole principle some pairs must collide.
See also http://en.wikipedia.org/wiki/Counting_argument
Generally, you should always design your data structures to deal with collisions. (Unless your hashes are very long (at least 128 bit), very good (use cryptographic hash functions), and you're feeling lucky).
Perhaps?
hash = ((y & 0xFFFF) << 16) | (x & 0xFFFF);
This works as long as x and y can be stored as 16-bit integers. No idea how many collisions it causes for larger integers, though. One idea might be to still use this scheme but combine it with a compression step, such as reducing x and y modulo 2^16 first.
If you can do a = ((y & 0xffff) << 16) | (x & 0xffff) then you could afterward apply a reversible 32-bit mix to a, such as Thomas Wang's
uint32_t hash(uint32_t a)
{
    a = (a ^ 61) ^ (a >> 16);
    a = a + (a << 3);
    a = a ^ (a >> 4);
    a = a * 0x27d4eb2d;
    a = a ^ (a >> 15);
    return a;
}
That way you get a random-looking result rather than high bits from one dimension and low bits from the other.
You can do
a >= b ? a * a + a + b : a + b * b
taken from here.
That works for points in the positive plane. If your coordinates can be on the negative axes too, then you will have to do:
A = a >= 0 ? 2 * a : -2 * a - 1;
B = b >= 0 ? 2 * b : -2 * b - 1;
A >= B ? A * A + A + B : A + B * B;
But to restrict the output to uint you will have to keep an upper bound on your inputs, and if you do, it turns out that you know the bounds after all. In other words, in programming it's impractical to write such a function without having some idea of the integer type your inputs and output fit in, and for every integer type there is definitely a lower bound and an upper bound.
public uint GetHashCode(whatever a, whatever b)
{
    if (a > ushort.MaxValue || b > ushort.MaxValue ||
        a < ushort.MinValue || b < ushort.MinValue)
    {
        throw new ArgumentOutOfRangeException();
    }

    return (uint)(a * short.MaxValue + b); // very good space/speed efficiency
    // ...or whatever your function is.
}
If you want the output to be strictly uint for an unknown range of inputs, then there will be a reasonable amount of collisions depending on that range. What I would suggest is a function that can overflow but is unchecked. Emil's solution is great; in C#:
return unchecked((uint)((a & 0xffff) << 16 | (b & 0xffff)));
See Mapping two integers to one, in a unique and deterministic way for a plethora of options.
Depending on your use case, it might be possible to use a Quadtree and replace points with the string of branch names. It is actually a sparse representation for points and will need a custom Quadtree structure that extends the canvas by adding branches when you add points off the canvas, but it avoids collisions and you get benefits like quick nearest-neighbor searches.
If you're using a language or platform in which all objects (even primitive ones like integers) have built-in hash functions (Java-platform languages like Java, .NET-platform languages like C#, and others like Python, Ruby, etc.), you may use the built-in hash values as a building block and add your own "hashing flavor" into the mix. Like:
// C# code snippet
public class SomeVerySimplePoint {
    public int X;
    public int Y;

    public override int GetHashCode() {
        return (Y.GetHashCode() << 16) ^ X.GetHashCode();
    }
}
It may also be handy to have test cases, like a predefined million-point set, that run against each candidate hash algorithm and compare aspects such as computation time, memory required, key collision count, and behavior on edge cases (very big or very small values).
The Fibonacci hash works very well for integer pairs. The multiplier is 0x9E3779B9; for other word sizes, use 1/phi = (sqrt(5)-1)/2 multiplied by 2^w and rounded to an odd number. Then hash the pair as a1 + a2*multiplier. This gives very different values for pairs that are close together; I do not know about the result over all pairs.
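A minimal sketch of that suggestion for 32-bit words (illustrative only):

// Fibonacci-style pair hash: the multiplier is 2^32 / phi, rounded to an odd value.
static int fibPairHash(int a1, int a2) {
    final int MULTIPLIER = 0x9E3779B9;
    return a1 + a2 * MULTIPLIER; // wraps modulo 2^32
}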