difference results between 10**-2 and E-2 - fortran90

The following program print 1 for 100E-2 and gives 0 for 100*10**(-2), that means that
the operator exponent doesnot work for negative **, is that correct.
Thanks in advance
program testme
implicit none
print*,100E-2
print*,100*10**(-2)
end program

You'll notice that the second print statement prints 0 -- no decimal pt, etc. Eg, integer zero. That's because 10 by itself is an integer literal, and raising that to the negative 2 power correctly gives zero; multiplying it by integer 100 still gives integer zero.
If you instead use
print*,100*10.**(-2)
you'll get the answer you expect.
The issue doesn't arise with 100e-2 because any number expressed with scientific notation is a floating point (real) literal.

Related

MATLAB numeric precision when generating a numeric sequence

I was testing a operation like this:
[input] 3.9/0.1 : 4.1/0.1
[output] 39 40
don't know why 4.1/0.1 is approximated to 40. If I add a round(), it will go as expected:
[input] 3.9/0.1 : round(4.1/0.1)
[output] 39 40 41
What's wrong with the first operation?
In this Q&A I go into detail on how the colon operator works in MATLAB to create a range. But the detail that causes the issue described in this question is not covered there.
That post includes the full code for a function that imitates exactly what the colon operator does. Let's follow that code. We start with start = 3.9/0.1, which is exactly 39, and stop = 4.1/0.1, which, due to rounding errors, is just slightly smaller than 41, and step = 1 (the default if it's not given).
It starts by computing a tolerance:
tol = 2.0*eps*max(abs(start),abs(stop));
This tolerance is intended to be used so that the stop value, if within tol of an exact number of steps, is still used, if the last step would step over it. Without a tolerance, it would be really difficult to build correct sequences using floating-point end points and step sizes.
However, then we get this test:
if start == floor(start) && step == 1
% Consecutive integers.
n = floor(stop) - start;
elseif ...
If the start value is an exact integer, and the step size is 1, then it forces the sequence to be an integer sequence. Unfortunately, it does so by taking the number of steps as the distance between floor(stop) and start. That is, it is not using the tolerance computed earlier in determining the right stop! If stop is slightly above an integer, that integer will be in the range. If stop is slightly below an integer (as in the case of the OP), that integer will not be part of the range.
It could be debated whether MATLAB should round the stop number up in this case or not. MATLAB chose not to. All of the sequences produced by the colon operator use the start and stop values exactly as given by the user. It leaves it up to the user to ensure the bounds of the sequence are as required.
However, if the colon operator hadn't special-cased the sequence of integers, the result would have been less surprising in this case. Let's add a very small number to the start value, so it's not an integer:
>> a = 3.9/0.1 : 4.1/0.1
a =
39 40
>> b = 3.9/0.1 + eps(39) : 4.1/0.1
b =
39.0000 40.0000 41.0000
Floating-point numbers suffer from loss of precision when represented with a fixed number of bits (64-bit in MATLAB by default). This is because there are infinite number of real numbers (even within a small range of say 0.0 to 0.1). On the other hand, a n-bit binary pattern can represent a finite 2^n distinct numbers. Hence, not all the real numbers can be represented. The nearest approximation will be used instead, resulted in loss of accuracy.
The closest representable value for 4.1/0.1 in the computer as a 64-bit double precision floating point number is actually,
4.1/0.1 ≈ 40.9999999999999941713291207...
So, in essence, 4.1/0.1 < 41.0 and that's what you get from the range. If you subtract, for example, 41 - 4.1/0.1 = 7.105427357601002e-15. But when you round, you get the closest value of 41.0 as expected.
The representation scheme for 64-bit double-precision according to the IEEE-754 standard:
The most significant bit is the sign bit (S), with 0 for positive numbers and 1 for negative numbers.
The following 11 bits represent exponent (E).
The remaining 52 bits represents fraction (F).

MATLAB's num2str is inconsistent

I want to receive the string representation of a number, 2 points after the dot.
I'm using MATLAB R2015a, and noticed that num2str function returns inconsistent results:
for 0.511 I get the required result (0.51):
num2str(0.511,2)
ans =
0.51
for 1.711 I get 1.7 instead of 1.71:
num2str(1.511,2)
ans =
1.5
Anyone knows why?
the 'precision' scalar input to num2str is the number of significant digits. if you want to have 2 figures after the decimal point use the 'formatSpec' string argument:
num2str(0.511,'%.2f')
0.51
num2str(1.511,'%.2f')
1.51
From the documentation for num2str():
s = num2str(A,precision) returns a character array that represents the numbers with the maximum number of significant digits specified by precision.
In other words, the second parameter controls the total number of significant figures, not the number of decimal places.
If you want to round to two decimal places, then use the round() function. You can try rounding the number before calling num2str():
num2str(round(1.511,2))

Maximum double value (float) possible in MATLAB (64-bit)

I'm aware that double is the default data-type in MATLAB.
When you compare two double numbers that have no floating part, MATLAB is accurate upto the 17th digit place in my testing.
a=12345678901234567 ; b=12345678901234567; isequal(a,b) --> TRUE
a=123456789012345671; b=123456789012345672; isequal(a,b) --> printed as TRUE
I have found a conservative estimate to be use numbers (non-floating) upto only 13th digit as other functions can become unreliable after it (such as ismember, or the MEX functions ismembc etc).
Is there a similar cutoff for floating values? E.g., if I use shares-outstanding for a company which can be very very large with decimal places, when do I start losing decimal accuracy?
a = 1234567.89012345678 ; b = 1234567.89012345679 ; isequal(a,b) --> printed as TRUE
a = 123456789012345.678 ; b = 123456789012345.677 ; isequal(a,b) --> printed as TRUE
isequal may not be right tool to use for comparing such numbers. I'm more concerned about up to how many places should I trust my decimal values once the integer part of a number starts growing?
It's usually not a good idea to test the equality of floating-point numbers. The behavior of binary floating-point numbers can differ drastically from what you may expect from base-10 decimals. Consider the example:
>> isequal(0.1, 0.3/3)
ans =
0
Ultimately, you have 53 bits of precision. This means that integers can be represented exactly (with no loss in accuracy) up to the number 253 (which is a little over 9 x 1015). After that, well:
>> (2^53 + 1) - 2^53
ans =
0
>> 2^53 + (1 - 2^53)
ans =
1
For non-integers, you are almost never going to be representing them exactly, even for simple-looking decimals such as 0.1 (as shown in that first example). However, it still guarantees you at least 15 significant figures of precision.
This means that if you take any number and round it to the nearest number representable as a double-precision floating point, then this new number will match your original number at least up to the first 15 digits (regardless of where these digits are with respect to the decimal point).
You might want to use variable precision arithmetics (VPA) in matlab. It computes expressions exactly up to a given digit count, which may be quite large. See here.
Check out the MATLAB function flintmax which tells you the maximum consecutive integers that can be stored in either double or single precision. From that page:
flintmax returns the largest consecutive integer in IEEE® double
precision, which is 2^53. Above this value, double-precision format
does not have integer precision, and not all integers can be
represented exactly.

How to stop matlab truncating long numbers

These two long numbers are the same except for the last digit.
test = [];
test(1) = 33777100285870080;
test(2) = 33777100285870082;
but the last digit is lost when the numbers are put in the array:
unique(test)
ans = 3.3777e+16
How can I prevent this? The numbers are ID codes and losing the last digit is screwing everything up.
Matlab uses 64-bit floating point representation by default for numbers. Those have a base-10 16-digit precision (more or less) and your numbers seem to exceed that.
Use something like uint64 to store your numbers:
> test = [uint64(33777100285870080); uint64(33777100285870082)];
> disp(test(1));
33777100285870080
> disp(test(2));
33777100285870082
This is really a rounding error, not a display error. To get the correct strings for output purposes, use int2str, because, again, num2str uses a 64-bit floating point representation, and that has rounding errors in this case.
To add more explanation to #rubenvb's solution, your values are greater than flintmax for IEEE 754 double precision floating-point, i.e, greater than 2^53. After this point not all integers can be exactly represented as doubles. See also this related question.

How to use Bitxor for Double Numbers?

I want to use xor for my double numbers in matlab,but bitxor is only working for int numbers. Is there a function that could convert double to int in Matlab?
The functions You are looking for might be: int8(number), int16(number), uint32(number) Any of them will convert Double to an Integer, but You must pick the best one for the result You want to achieve. Remember that You cannot cast from Double to Integer without rounding the number.
If I understood You correcly, You could create a function that would simply remove the "comma" from the Double number by multiplying your starting value by 2^n and then casting it to Integer using any of the functions mentioned earlier, performing whatever you want and then returning comma to its original position by dividing the number by 2^n
Multiplying the starting value by 2^n is a hack that will decrease the rounding error.
The perfect value for n would be the number of digits after the comma if this number is relatively small.
Please also specify, why are You trying to do this? This doesn't seem to be the optimal solution.
You can just cast to an integer:
a = 1.003
int8(a)
ans =
1
That gives you an 8 bit signed integer, you can also get other size i.e. int16 or else unsigned i.e. uint8 depending on what you want to do