How to discard the garbage values of a double?

I have a long double value x whose actual value is 3.00. But in the debugger I see 3.00000000000012312414, that is, some garbage after the 13th decimal place. How do I discard this garbage? If I only had to print it, I could write printf("%.10Lf", x); but I have to check whether the number is an integer. One way to compare is long long xt = x; if (x == xt), and I only care about the digits before the 12th decimal place. But with the garbage digits this isn't working. How do I do that?

That is not garbage; it is how floating-point numbers are stored. To compare floating-point numbers you always need to define an acceptable error e. When you compare x to y, the numbers are considered equal if abs(x - y) < e. Also take a look at this document

You should use some epsilon to compare floating-point values.
The value 3.00000000000012312414 did not come out as exactly 3.000000000000000000 because floating-point arithmetic is not that precise: 3 itself is representable, but the calculation that produced your value accumulated a small rounding error.
If it should be exactly 3, you have to declare it as an integer.
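
The epsilon comparison described in both answers can be sketched as follows. This is a minimal illustration in Python (whose float is an IEEE 754 double) rather than the asker's C; the helper name is_near_integer and the tolerance 1e-9 are assumptions chosen for the example, not values from the question.

```python
def is_near_integer(x: float, eps: float = 1e-9) -> bool:
    """Return True when x lies within eps of the nearest integer.

    eps is a hypothetical tolerance; pick one that matches the
    accuracy of the computation that produced x.
    """
    return abs(x - round(x)) < eps


x = 3.00000000000012312414   # the value seen in the debugger

print(int(x) == x)           # exact comparison: False
print(is_near_integer(x))    # tolerance comparison: True
print(is_near_integer(3.5))  # False
```

The same test in C would compare fabsl(x - roundl(x)) against the tolerance.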

Related

Negative 0 in CGPoint's x or y value

When multiplying a zero x or y value of a CGPoint variable with -1, the resulting value is -0. Is this intended behavior?
I think that's odd; it took me quite some time to find that out. I am using Swift 3.1 in Xcode 8.3.2.
CGPoint's x and y are CGFloats, which means they are floating point scalar values. And in floating point arithmetic, zero also has a sign, so that multiplying it by a negative value results in a zero with opposite sign.
For further reading you can check these:
https://developer.apple.com/reference/swift/floatingpointclassification
https://en.wikipedia.org/wiki/Signed_zero
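
The signed-zero behaviour is a property of IEEE 754 arithmetic rather than of CGPoint, so it can be reproduced in any language that uses it. Here is a small sketch in Python rather than Swift; the variable names are illustrative only.

```python
import math

zero = 0.0
neg = zero * -1.0                # multiplying zero by a negative value

print(neg)                       # -0.0
print(neg == zero)               # True: the two zeros compare equal
print(math.copysign(1.0, neg))   # -1.0: the sign bit is nevertheless set
```

Because the two zeros compare equal, ordinary equality checks are not affected; the sign only shows up when printing or when the sign bit is inspected directly.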

Maximum double value (float) possible in MATLAB (64-bit)

I'm aware that double is the default data-type in MATLAB.
When you compare two double numbers that have no fractional part, MATLAB is accurate up to the 17th digit in my testing.
a=12345678901234567 ; b=12345678901234567; isequal(a,b) --> TRUE
a=123456789012345671; b=123456789012345672; isequal(a,b) --> printed as TRUE
I have found that a conservative rule is to use integer-valued numbers of at most 13 digits, as other functions can become unreliable beyond that (such as ismember, or the MEX functions ismembc etc.).
Is there a similar cutoff for values with a fractional part? E.g., if I use shares outstanding for a company, which can be very large and have decimal places, when do I start losing decimal accuracy?
a = 1234567.89012345678 ; b = 1234567.89012345679 ; isequal(a,b) --> printed as TRUE
a = 123456789012345.678 ; b = 123456789012345.677 ; isequal(a,b) --> printed as TRUE
isequal may not be the right tool for comparing such numbers. I'm more concerned with how many decimal places I can trust once the integer part of a number starts growing.
It's usually not a good idea to test the equality of floating-point numbers. The behavior of binary floating-point numbers can differ drastically from what you may expect from base-10 decimals. Consider the example:
>> isequal(0.1, 0.3/3)
ans =
0
Ultimately, you have 53 bits of precision. This means that integers can be represented exactly (with no loss in accuracy) up to the number 2^53 (which is a little over 9 x 10^15). After that, well:
>> (2^53 + 1) - 2^53
ans =
0
>> 2^53 + (1 - 2^53)
ans =
1
For non-integers, you are almost never going to be representing them exactly, even for simple-looking decimals such as 0.1 (as shown in that first example). However, a double still guarantees you at least 15 significant figures of precision.
This means that if you take any number and round it to the nearest number representable as a double-precision floating point, then this new number will match your original number at least up to the first 15 digits (regardless of where these digits are with respect to the decimal point).
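
The limits described above can be checked numerically. The following is a sketch in Python rather than MATLAB (both use IEEE 754 doubles); it assumes Python 3.9 or later for math.ulp, which returns the spacing between a double and the next larger one. The constant name LIMIT is illustrative.

```python
import math
import sys

LIMIT = 2.0 ** 53               # largest consecutive integer, 9007199254740992

print((LIMIT + 1) - LIMIT)      # 0.0: LIMIT + 1 rounds back to LIMIT
print(LIMIT + (1 - LIMIT))      # 1.0: every intermediate value is exact
print(sys.float_info.dig)       # 15: decimal digits guaranteed to survive

# Spacing between neighbouring doubles near the values from the question.
for value in (1234567.89012345678, 123456789012345.678):
    print(value, math.ulp(value))
```

The spacing is 2^-32 (about 2.3e-10) near 1234567.89 and 2^-6 (0.015625) near 123456789012345.678. The pairs in the question differ by less than that spacing, which is why isequal reports them as equal.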
You might want to use variable-precision arithmetic (VPA) in MATLAB. It evaluates expressions to a given number of digits, which may be quite large. See here.
Check out the MATLAB function flintmax, which tells you the largest consecutive integer that can be stored in either double or single precision. From that page:
flintmax returns the largest consecutive integer in IEEE® double
precision, which is 2^53. Above this value, double-precision format
does not have integer precision, and not all integers can be
represented exactly.

How to stop matlab truncating long numbers

These two long numbers are the same except for the last digit.
test = [];
test(1) = 33777100285870080;
test(2) = 33777100285870082;
but the last digit is lost when the numbers are put in the array:
unique(test)
ans = 3.3777e+16
How can I prevent this? The numbers are ID codes and losing the last digit is screwing everything up.
MATLAB uses a 64-bit floating-point representation for numbers by default. These have roughly 16 decimal digits of precision, and your numbers exceed that.
Use something like uint64 to store your numbers:
> test = [uint64(33777100285870080); uint64(33777100285870082)];
> disp(test(1));
33777100285870080
> disp(test(2));
33777100285870082
This is really a rounding error, not a display error. To get the correct strings for output purposes, use int2str, because, again, num2str uses a 64-bit floating point representation, and that has rounding errors in this case.
To add more explanation to @rubenvb's solution, your values are greater than flintmax for IEEE 754 double-precision floating point, i.e., greater than 2^53. Beyond this point not all integers can be exactly represented as doubles. See also this related question.
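
The collision between the two ID codes can be reproduced outside MATLAB. This sketch uses Python, whose int type is exact and whose float is an IEEE 754 double; the names id_a and id_b are illustrative.

```python
id_a = 33777100285870080
id_b = 33777100285870082

print(id_a > 2 ** 53)                    # True: both exceed flintmax
print(id_a == id_b)                      # False: exact integers stay distinct
print(float(id_a) == float(id_b))        # True: both round to the same double
print(len({float(id_a), float(id_b)}))   # 1, matching the unique(test) result
```

Doubles near 3.4e16 are spaced 4 apart, so two IDs that differ by 2 cannot both be stored. An integer type such as uint64, or a string, keeps them distinct.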

How to use Bitxor for Double Numbers?

I want to use xor on my double numbers in MATLAB, but bitxor only works on integers. Is there a function that converts a double to an int in MATLAB?
The functions you are looking for might be int8(number), int16(number), uint32(number). Any of them will convert a double to an integer, but you must pick the one that best suits the result you want to achieve. Remember that you cannot cast from double to integer without rounding the number.
If I understood you correctly, you could create a function that simply moves the decimal point out of the double by multiplying your starting value by 2^n, then casts it to an integer using any of the functions mentioned earlier, performs whatever operation you want, and finally returns the decimal point to its original position by dividing the number by 2^n.
Multiplying the starting value by 2^n is a hack that will decrease the rounding error.
The ideal value for n would be the number of binary digits after the point that you need to keep, provided this number is relatively small.
Please also specify why you are trying to do this. This doesn't seem to be the optimal solution.
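
The scale, cast, operate and unscale idea above can be sketched as follows, in Python rather than MATLAB. The function name xor_fixed_point and the parameter frac_bits are hypothetical, and the sketch assumes non-negative inputs small enough that the scaled values remain exact.

```python
def xor_fixed_point(a: float, b: float, frac_bits: int) -> float:
    """XOR two non-negative values after scaling them by 2**frac_bits.

    frac_bits is the number of binary digits after the point to keep;
    anything finer than that is lost when rounding to an integer.
    """
    scale = 2 ** frac_bits
    ia = round(a * scale)       # move the point right, then cast to integer
    ib = round(b * scale)
    return (ia ^ ib) / scale    # XOR, then move the point back


print(xor_fixed_point(1.5, 2.25, 4))   # 3.75, since 24 ^ 36 == 60 and 60 / 16 == 3.75
```

In MATLAB the corresponding steps would be a multiplication by 2^n, a cast such as uint32, a call to bitxor, and a division by 2^n.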
You can just cast to an integer:
a = 1.003
int8(a)
ans =
1
That gives you an 8-bit signed integer; you can also get other sizes, e.g. int16, or unsigned types, e.g. uint8, depending on what you want to do.

fortran90 reading array with real numbers

I have a list of real data in a file. The data looks like this:
25.935
25.550
24.274
29.936
23.122
27.360
28.154
24.320
28.613
27.601
29.948
29.367
I wrote Fortran 90 code to read this data into an array, as below:
PROGRAM autocorr
implicit none
INTEGER, PARAMETER :: TRUN=4000,TCOR=1800
real,dimension(TRUN) :: angle
real :: temp, temp2, average1, average2
integer :: i, j, p, q, k, count1, t, count2
REAL, DIMENSION(0:TCOR) :: ACF
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
open(100, file="fort.64",status="old")
do k = 1,TRUN
read(100,*) angle(k)
end do
Then, when I print again to see the values, I get
25.934999
25.549999
24.274000
29.936001
23.122000
27.360001
28.153999
24.320000
28.613001
27.601000
29.948000
29.367001
32.122002
33.818001
21.837000
29.283001
26.489000
24.010000
27.698000
30.799999
36.157001
29.034000
34.700001
26.058001
29.114000
24.177000
25.209000
25.820999
26.620001
29.761000
May I know why the values now have 6 decimal places?
How can I avoid this effect so that it doesn't affect the calculation results?
Appreciate any help.
Thanks
You don't show the statement you use to write the values out again. I suspect, therefore, that you've used Fortran's list-directed output, something like this:
write(output_unit,*) angle(k)
If you have done this, you have surrendered control of how many digits the program displays to the compiler. That's what the use of * in place of an explicit format means: the standard says that the compiler can use any reasonable representation of the number.
What you are seeing, therefore, is your numbers displayed with 8 significant figures, which is about what single-precision floating-point numbers provide. If you wanted to display the numbers with only 3 digits after the decimal point, you could write
write(output_unit,'(f8.3)') angle(k)
or some variation thereof.
You've declared angle to be of type real; unless you've overridden the default with a compiler flag, this means that you are using single-precision IEEE 754 floating-point numbers (on anything other than an exotic computer). Bear in mind too that most real (in the mathematical sense) numbers do not have an exact representation in floating point, and that the nearest single-precision value to 25.935, printed to six decimal places, is 25.934999; the other numbers you print are likewise the floating-point approximations to the numbers your program reads.
If you really want to compute your results with a lower precision, then you are going to have to employ some clever programming techniques.
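
The single-precision rounding described in this answer can be reproduced without a Fortran compiler. This sketch uses Python's struct module to round a value to IEEE 754 single precision and then formats it in two ways; the helper name to_single is an assumption made for the example.

```python
import struct


def to_single(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


for text in ("25.935", "25.550", "24.274", "29.936"):
    v = to_single(float(text))
    # Six decimals expose the stored approximation; three decimals,
    # like the f8.3 edit descriptor, show the value that was read.
    print(f"{v:.6f}  {v:.3f}")   # first line: 25.934999  25.935
```

The stored values do not change between the two columns; only the number of digits displayed does, which is why an explicit format removes the apparent noise without affecting the calculation.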