According to the docs, the Postgres double precision type implements IEEE 754: “The data types real and double precision are inexact, variable-precision numeric types. On all currently supported platforms, these types are implementations of IEEE Standard 754 for Binary Floating-Point Arithmetic (single and double precision, respectively), to the extent that the underlying processor, operating system, and compiler support it.”
My question, then, is: how do I check that "the underlying processor, operating system, and compiler support it"? Is it at all common for this not to be the case?
Clarification
I want to check whether my specific Postgres instance is compliant. Are there tests I can run as SQL queries to verify or disprove this?
It is very common for today's computers to use IEEE 754 floating point numbers; so common, in fact, that all of the wide range of platforms supported by PostgreSQL do. The only way to check is to look at the specification and documentation of the hardware or software in question.
PostgreSQL uses whatever the compiler and machine provide for the C data type double. It does not implement floating point arithmetic itself.
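That said, if you want some reassurance from SQL itself, you can run a few spot checks whose results are fully determined by IEEE 754 binary64 arithmetic. This is a sketch, not a proof of conformance; a passing result only shows the behavior expected on conforming platforms:

-- 0.1 + 0.2 is not exactly 0.3 in binary64: expect false
SELECT 0.1::float8 + 0.2::float8 = 0.3::float8 AS is_exact;

-- the machine epsilon of binary64 is 2^-52: expect true, true
SELECT 1.0::float8 + 2.0::float8 ^ (-53) = 1.0 AS rounds_away,
       1.0::float8 + 2.0::float8 ^ (-52) > 1.0 AS is_representable;

-- IEEE special values should parse
SELECT 'Infinity'::float8, '-Infinity'::float8, 'NaN'::float8;

Note that even on a conforming platform, PostgreSQL documents one deliberate deviation: for sorting and indexing purposes it treats NaN as equal to NaN and greater than all other values.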
Related
I want to convert data (double precision, 15 decimal digits) to another type (quadruple precision, 34 decimal digits). So I used the vpa function like this:
data = sin(2*pi*frequency*time);
quad_data = vpa(data,34);
But the type of the result is sym, not double. And when I checked the sym data, each cell contained a 1x1 sym. I tried to apply the fft function to quad_data, but it didn't work. Is there any way to increase the precision of the double type from 15 to 34 decimal digits?
The only numeric floating point types that MATLAB currently supports are double, single, and half. Extended precision types can be achieved via the Symbolic Math Toolbox (e.g., vpa) or 3rd-party code (e.g., John D'Errico's FEX submission, the High Precision Floating (HPF) class). But even then, only a subset of floating point functions will typically be supported. If a function you are trying to use doesn't support the variable type, then you would have to supply your own implementation.
Also, you are not building vpa objects properly in the first place. Typically you would convert the operands to vpa first and then do arithmetic on them. Doing the arithmetic in double precision first, as you are doing with data, and then converting to extended precision vpa just adds garbage to the values. E.g., set digits first and then use vpa('pi') to get the full extended precision value of pi as a vpa variable.
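For instance, a sketch of that order of operations (the frequency and time values here are placeholders), assuming the Symbolic Math Toolbox is available:

digits(34);                        % set the working precision first
f = vpa('50');                     % placeholder frequency, created as vpa
t = vpa('0.001');                  % placeholder time
quad_data = sin(2*vpa('pi')*f*t);  % all operands are vpa, so nothing is rounded through double

Note that the result is still a sym array, so double-only functions such as fft will not accept it; that limitation is separate from how the values are built.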
There is a commercial 3rd-party toolbox for this purpose, called the Multiprecision Computing Toolbox for MATLAB.
This tool implements many of the mathematical operations you would expect from double inputs, and according to benchmarks on the website, it's much faster than vpa.
Disclosure: I am not affiliated with the creators of this tool in any way, however I can say that we had a good experience with this tool for one of our lab's projects.
The other suggestion I can give is to do the high-precision arithmetic in another language/environment to which MATLAB provides interfaces (e.g., C, Python, Java) and which has a quad data type implemented.
Some of the values I need to read in my .ksy file are doubles, which I assume are binary64 structures. The native data types for a float won't stretch that far. Has anyone managed to represent this data type in Kaitai?
"binary64" is a normal IEEE 754 double-precision floats, occupying 64 bits = 8 bytes.
These are supported by the vast majority of languages and, consequently, Kaitai Struct offers built-in support for them as type: f8 (float, 8 bytes long).
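For example, a minimal .ksy sketch (the id and field names here are made up) that reads one such value:

meta:
  id: example        # hypothetical format name
  endian: le
seq:
  - id: my_double    # one IEEE 754 binary64 value
    type: f8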
If you're instead interested in larger floating point values (binary128, binary256, i.e. quadruple or octuple precision), there is no built-in support for them in KS due to the lack of standard support for these types in most target languages. If you want something like that, the recommended way is to implement one as an opaque type in the target language of your choice. That will likely require you to bring in an external library which implements the type via software emulation, as hardware support seems to be almost non-existent in commodity CPUs (like Intel or ARM) as of 2020.
For more details on these, see issue #101.
I have seen that my scientific calculator stores 99 digits after the decimal point. Why don't programming languages use such precision? Moreover, how can I achieve such precision if I want to?
This is a good question!
To understand what happens, you must first familiarize yourself with how computers store floating point numbers: http://grouper.ieee.org/groups/754/
Typically, programming languages offer two binary floating-point representations: one that uses 32 bits (single precision) and one that uses 64 bits (double precision).
If you need more precision, you need a different representation; for example, you can implement a division algorithm that produces a result with whatever precision you need.
You can take a look at Java's implementation of BigDecimal.
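As a sketch of the idea (the 99-digit precision is chosen only to match the calculator example):

import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class Precision {
    public static void main(String[] args) {
        // double carries only ~16 significant decimal digits
        System.out.println(1.0 / 3.0);  // 0.3333333333333333

        // BigDecimal lets you choose the precision explicitly
        MathContext mc = new MathContext(99, RoundingMode.HALF_EVEN);
        BigDecimal third = BigDecimal.ONE.divide(BigDecimal.valueOf(3), mc);
        System.out.println(third);      // 99 significant digits of 1/3
    }
}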
I'm very new to Ada and was trying to see if it offers a double precision type. I see that we have Float, and
Put( Integer'Image( Float'digits ) );
on my machine gives a value of 6, which is not enough for numerical computations.
Does Ada have double and long double types as in C?
Thanks a lot...
It is a wee bit more complicated than that.
The only predefined floating-point type that compilers have to support is Float. Compilers may optionally support Short_Float and Long_Float. You should be able to look in Appendix F of your compiler documentation to see what it supports.
In practice, your compiler almost certainly defines Float as a 32-bit IEEE float and Long_Float as a 64-bit one. Note that C pretty much works this way too with its float and double; C doesn't actually define the sizes of those either.
If you absolutely must have a certain precision (e.g., you are sharing the data with something external that must use IEEE 64-bit), then you should probably define your own float type with exactly that precision. That ensures your code is either portable to any platform or compiler you move it to, or fails with a compile error so you can fix the issue.
You can create a float of any precision you like. For a long float it would be:
type My_Long_Float is digits 11;
Wiki Books is a good reference for things like this.
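As a sketch of the "exact precision" approach above (the procedure and type names are made up): requesting 15 decimal digits typically maps to a 64-bit IEEE double on common compilers, and the declaration is rejected at compile time if the target has no such type.

with Ada.Text_IO; use Ada.Text_IO;

procedure Check_Precision is
   --  Fails to compile if the target has no type with 15 digits
   type IEEE_Double is digits 15;
begin
   Put_Line (Integer'Image (IEEE_Double'Digits));  --  prints 15
end Check_Precision;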
I encountered some weird behaviour in Perl. The following subtraction should yield zero as its result (which it does in Python):
print 7.6178E-01 - 0.76178
-1.11022302462516e-16
Why does this occur, and how can I avoid it?
P.S. The effect appears on "v5.10.0 built for x86_64-linux-gnu-thread-multi" (Ubuntu 9.04) and "v5.8.9 built for darwin-2level" (Mac OS 10.6).
It's not that scientific notation affects the precision; this is a limitation of representing floating point numbers in binary. See the answers in perlfaq4. This is a problem for any language that relies on the underlying architecture for number storage.
Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)?
Why is int() broken?
If you need better number handling, check out the bignum pragma.
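For example, a minimal sketch with the bignum pragma, which makes numeric literals exact decimal Math::BigFloat objects:

use bignum;                    # literals become exact Math::BigFloat values
print 7.6178E-01 - 0.76178;    # now prints 0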