PostgreSQL rounding error | Sum of differences between values and difference between sum of values are not equal

I am trying to evaluate numbers up to 3 decimal places of precision. The formulas are fairly simple, but the sum of the differences between values and the difference between the sums of values are not equal.
The dataset contains nearly 2050 entries.
result_column_1 = SUM(column1) - SUM(column2)
result_column_2 = SUM(column1-column2)
I am getting a huge discrepancy between result_column_1 and result_column_2.
column1 and column2 are already truncated (not rounded) to 3 decimal places.
1. ROUND() - didn't work
2. CAST to DECIMAL, NUMERIC - didn't work
3. TRUNC() - didn't work
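A minimal setup that reproduces what I'm seeing (hypothetical table and random data, assuming the columns are double precision):

create table demo (column1 double precision, column2 double precision);

insert into demo
select trunc((random() * 1000)::numeric, 3),
       trunc((random() * 1000)::numeric, 3)
from generate_series(1, 2050);

-- the two aggregates accumulate floating-point error in different orders
select sum(column1) - sum(column2) as result_column_1,
       sum(column1 - column2)      as result_column_2
from demo;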

Related

How to get the Pentaho table values with decimals

I am using Pentaho. I round the query values as decimals, but some column values are not coming out as decimals.
Table output:
NORMAL (06:00--17:00) 3 14341.54 43024.62
OFF_PEAK (22:00--06:00) 3 7002.39 21007.170000000002
PEAK (17:00--22:00) 3 9362.95 28088.850000000002
Required output:
NORMAL (06:00--17:00) 3 14341.54 43024.62
OFF_PEAK (22:00--06:00) 3 7002.39 21007.17
PEAK (17:00--22:00) 3 9362.95 28088.85
Defining the column format %.2f will work to show 2 decimal places. You just need to define a column format for every column: %s for strings, %d for integers, and %.2f to get 2 decimals, etc.
You can have a look at my sample image for the correct configuration.
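Alternatively, the stray trailing digits can be removed in the query itself before Pentaho formats anything. A hedged sketch, assuming a PostgreSQL source and hypothetical table/column names:

-- cast to numeric and round, so the database returns exactly two
-- decimals instead of a binary double like 21007.170000000002
select tariff,
       meter_count,
       round(amount::numeric, 2)       as amount,
       round(total_amount::numeric, 2) as total_amount
from readings;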

Postgres Custom float type that is always truncated to 2 decimals after point

Can I create a custom data type in Postgres such that every time I insert or update a float into it, it is truncated to 2 decimals after the dot?
create table money(
formatted moneys_type
);
insert into money values (30.122323213);
Select * from money;
Returns
30.12
Update: I didn't use numeric or decimal because they round up (1.999 => 2.00) instead of truncating.
See documentation on Numeric Types / Arbitrary Precision Numbers.
The precision of a numeric is the total count of significant digits in
the whole number, that is, the number of digits to both sides of the
decimal point. The scale of a numeric is the count of decimal digits
in the fractional part, to the right of the decimal point. So the
number 23.5141 has a precision of 6 and a scale of 4. Integers can be
considered to have a scale of zero.
...
To declare a column of type numeric use the syntax:
NUMERIC(precision, scale)
The maximum allowed precision when explicitly specified in the type declaration is 1000.
So you can use
NUMERIC(1000, 2)
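A quick sketch of how that behaves, including the truncation caveat from the question (numeric rounds on assignment, so trunc() must be applied explicitly where truncation is wanted):

create table money (
    formatted numeric(1000, 2)
);
insert into money values (30.122323213);     -- stored as 30.12
insert into money values (1.999);            -- assignment rounds: stored as 2.00
insert into money values (trunc(1.999, 2));  -- truncate first: stored as 1.99
select * from money;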

How is eps() calculated in MATLAB?

The eps routine in MATLAB essentially returns the positive distance between adjacent floating point numbers. It can take an optional argument, too.
My question: How does MATLAB calculate this value? (Does it use a lookup table, or does it use some algorithm to calculate it at runtime, or something else...?)
Related: how could it be calculated in any language providing bit access, given a floating point number?
Wikipedia has quite the page on it.
Specifically for MATLAB it's 2^(-52), as MATLAB uses double precision by default: one bit for the sign, 11 for the exponent and the remaining 52 for the fraction.
The MATLAB documentation on floating point numbers also shows this.
d = eps(x), where x has data type single or double, returns the positive distance from abs(x) to the next larger floating-point number of the same precision as x.
As not all numbers are equally closely spaced on the number line, different numbers will show different distances to the next floating-point number within the same precision. Take 1.0 and 0.9 as an example; their bit representations are:
1.0 = 0 01111111111 0000000000000000000000000000000000000000000000000000
0.9 = 0 01111111110 1100110011001100110011001100110011001100110011001101
The sign for both is positive (0), the exponents are not equal and of course their fractions are vastly different. This means that the gaps to the next floating-point numbers differ as well:
dec2bin(typecast(eps(1.0), 'uint64'), 64) = 0 01111001011 0000000000000000000000000000000000000000000000000000
dec2bin(typecast(eps(0.9), 'uint64'), 64) = 0 01111001010 0000000000000000000000000000000000000000000000000000
which are not the same, hence eps(0.9)~=eps(1.0).
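You can check this directly at the MATLAB prompt:

eps(1.0)   % 2.2204e-16 = 2^(-52)
eps(0.9)   % 1.1102e-16 = 2^(-53)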
Here is some insight into eps which will help you to write an algorithm.
See that eps(1) = 2^(-52). Now, say you want to compute the eps of 17179869183.9. Note that I have chosen a number which is 0.1 less than 2^34 (in other words, something like 2^(33.9999...)). To compute the eps of this, take log2 of the number, which would be ~33.9999... as mentioned before. Take floor() of it and add that to -52, since eps(1) = 2^(-52) and the given number is ~2^(33.999...). Therefore, log2(eps(17179869183.9)) = -52 + 33 = -19, i.e. eps(17179869183.9) = 2^(-19).
If you take a number which is fractionally more than 2^34, e.g. 17179869184.1, then log2(eps(17179869184.1)) = -18. This also shows that the eps value changes at the numbers that are integer powers of your base (or radix), in this case 2. Since the eps value only changes at those numbers, we take the floor of the power. You will be able to get the correct eps for any number this way. I hope it is clear.
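A sketch of that algorithm as a MATLAB one-liner (a hypothetical helper; it ignores zero, denormals and Inf/NaN):

my_eps = @(x) 2^(floor(log2(abs(x))) - 52);   % 52 fraction bits in a double

my_eps(17179869183.9) == eps(17179869183.9)   % true, both are 2^(-19)
my_eps(17179869184.1) == eps(17179869184.1)   % true, both are 2^(-18)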
MATLAB uses (along with other languages) the IEEE 754 standard for representing real floating-point numbers.
In this format the bits allocated for approximating the actual¹ real number, usually 32 for single or 64 for double precision, are grouped into 3 groups:
1 bit for determining the sign, s.
8 (or 11) bits for exponent, e.
23 (or 52) bits for the fraction, f.
Then a real number, n, is approximated by the following three-term relation:
n = (-1)^s * 2^(e - bias) * (1 + f)
where the bias offsets the values of the exponent negatively², so that the same unsigned exponent field can describe numbers both smaller and larger than 1.
Now, the gap reflects the fact that real numbers do not map perfectly to their finite, 32- or 64-bit, representations; moreover, a whole range of real numbers that differ by an absolute value < eps maps to a single value in computer memory. That is, if you assign values around val to variables var_1 ... var_n:
var_1 = val - offset
...
var_i = val;
...
var_n = val + offset
where
offset < eps(val) / 2
Then:
var_1 = var_2 = ... = var_i = ... = var_n.
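For example:

val = 1.0;
val + eps(val)/4 == val   % true:  the offset is absorbed, val is unchanged
val + eps(val)   == val   % false: one full gap away is a distinct double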
The gap is determined by the second term, containing the exponent (or characteristic), in the above relation³:
2^(e - bias)
This term determines the "scale" of the "line" on which the approximated numbers are located: the larger the numbers, the larger the distance between them and the less precise they are; and vice versa, the smaller the numbers, the more densely located their representations are and, consequently, the more accurate.
In practice, to determine the gap of a specific number, eps(number), you can start by adding/subtracting a gradually increasing small number until the stored value of the number of interest changes; this gives you the gap in that (positive or negative) direction, i.e. eps(number) / 2.
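A sketch of that probing approach (assuming a positive x; the exact offset at which the stored value first changes also depends on the round-half-to-even tie rule):

x = 0.9;
offset = 2^(-60);         % start well below the expected gap
while x + offset == x     % keep doubling until x visibly changes
    offset = offset * 2;
end
log2(offset)              % -54, i.e. offset = eps(0.9)/2 = 2^(-54)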
To check possible implementations of MATLAB's eps (or ULP, "unit in the last place", as it is called in other languages), you could search for ULP implementations in C, C++ or Java, which are the languages MATLAB is written in.
1. Real numbers are infinitely precise, i.e. they could be written with arbitrary precision, with any number of digits after the decimal point.
2. Usually around the half: in single precision the 8 exponent bits encode values from 0 to 2^8 - 1 = 255; around the half in our case is 127, i.e. 2^(e - 127).
3. It can be thought that 2^(e - bias) represents the most significant digits of the number, i.e. the digits that contribute to describing how big the number is, as opposed to the least significant digits that contribute to describing its precise location. Then the larger the term containing the exponent, the smaller the significance of the 23 bits of the fraction.

Maximum double value (float) possible in MATLAB (64-bit)

I'm aware that double is the default data-type in MATLAB.
When you compare two double numbers that have no fractional part, MATLAB is accurate up to the 17th digit place in my testing.
a=12345678901234567 ; b=12345678901234567; isequal(a,b) --> TRUE
a=123456789012345671; b=123456789012345672; isequal(a,b) --> printed as TRUE
I have found a conservative estimate to be to use (non-floating) numbers only up to the 13th digit, as other functions can become unreliable beyond that (such as ismember, or MEX functions like ismembc).
Is there a similar cutoff for floating values? E.g., if I use shares-outstanding for a company which can be very very large with decimal places, when do I start losing decimal accuracy?
a = 1234567.89012345678 ; b = 1234567.89012345679 ; isequal(a,b) --> printed as TRUE
a = 123456789012345.678 ; b = 123456789012345.677 ; isequal(a,b) --> printed as TRUE
isequal may not be the right tool for comparing such numbers. What I'm really asking is: up to how many places should I trust my decimal values once the integer part of a number starts growing?
It's usually not a good idea to test the equality of floating-point numbers. The behavior of binary floating-point numbers can differ drastically from what you may expect from base-10 decimals. Consider the example:
>> isequal(0.1, 0.3/3)
ans =
0
Ultimately, you have 53 bits of precision. This means that integers can be represented exactly (with no loss in accuracy) up to 2^53 (which is a little over 9 x 10^15). After that, well:
>> (2^53 + 1) - 2^53
ans =
0
>> 2^53 + (1 - 2^53)
ans =
1
For non-integers, you are almost never going to represent them exactly, even simple-looking decimals such as 0.1 (as shown in that first example). However, double precision still guarantees you at least 15 significant figures of precision.
This means that if you take any number and round it to the nearest number representable as a double-precision floating point, then this new number will match your original number at least up to the first 15 digits (regardless of where these digits are with respect to the decimal point).
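A quick illustration of that guarantee, with hypothetical numbers:

a = 1234567.89012345;   % 15 significant digits
b = 1234567.89012346;   % differs only in the 15th significant digit
isequal(a, b)           % false: a 15th-digit difference always survives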
You might want to use variable precision arithmetic (VPA) in MATLAB. It computes expressions exactly up to a given digit count, which may be quite large. See the documentation for vpa.
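For instance (this assumes the Symbolic Math Toolbox is installed):

digits(30);      % carry 30 significant decimal digits
vpa(sym(1)/3)    % 0.333333333333333333333333333333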
Check out the MATLAB function flintmax which tells you the maximum consecutive integers that can be stored in either double or single precision. From that page:
flintmax returns the largest consecutive integer in IEEE® double
precision, which is 2^53. Above this value, double-precision format
does not have integer precision, and not all integers can be
represented exactly.
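A quick check at the prompt:

flintmax('double')   % 9.0072e+15, i.e. 2^53
flintmax('single')   % 16777216,   i.e. 2^24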

How many distinct values can be stored in floating-point formats?

My assumptions for IEEE 754-2008:
binary16 - 2^16 distinct values,
binary32 - 2^32 distinct values,
...
binary128 - 2^128 distinct values.
Is this correct?
This is a trick question.
The floating-point formats define some special values. Whether you count these as distinct depends on your point of view. The following is for double-precision (binary64):
There are two representations of 0: with the sign bit 0 or 1 and both the exponent and mantissa all zero. The values are distinguishable by the fact that 1/+0 = infinity and 1/-0 = -infinity. But they compare equal.
There are 2 infinities, where the first 12 bits are 0x7ff or 0xfff and the mantissa is all zero. These are not finite real numbers, but they are values.
There is a whole range of Not-a-Number (NaN) values, having the sign+exponent bits 0x7ff or 0xfff and a nonzero mantissa (by convention, the most significant mantissa bit distinguishes quiet NaNs from signaling ones). Again, these are not real numbers but they are distinguishable values.
So, to summarize:
The total number of distinguishable values (real numbers or otherwise) is 2^64.
The number of distinct real numbers, excluding infinities and counting zero only once, is 2*(2^11-1)*2^52-1 = 18,437,736,874,454,810,623.
For binary16, the number of distinct real numbers is 2*(2^5-1)*2^10-1 = 63,487. For binary32, it's 2*(2^8-1)*2^23-1=4,278,190,079. For binary128, it's 2*(2^15-1)*2^112-1 or about 3.4*10^38.
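For the binary64 case, that count can be checked with 64-bit integer arithmetic in MATLAB (after factoring, the formula is (2^11 - 1) * 2^53 - 1; plain doubles would lose the trailing -1):

(uint64(2)^11 - 1) * uint64(2)^53 - 1   % 18437736874454810623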