I need a very precise number, but ipython is rounding it wrong [duplicate]

I have two integer values a and b, but I need their ratio in floating point. I know that a < b and I want to calculate a / b, so if I use integer division I'll always get 0 with a remainder of a.
How can I force c to be a floating point number in Python 2 in the following?
c = a / b
In 3.x, the behaviour is reversed; see "Why does integer division yield a float instead of another integer?" for the opposite, 3.x-specific problem.

In Python 2, division of two ints produces an int. In Python 3, it produces a float. We can get the new behaviour by importing from __future__.
>>> from __future__ import division
>>> a = 4
>>> b = 6
>>> c = a / b
>>> c
0.66666666666666663

You can cast to float by doing c = a / float(b). If the numerator or denominator is a float, then the result will be a float as well.
A caveat: as commenters have pointed out, this won't work if b might be something other than an integer or floating-point number (or a string representing one). If you might be dealing with other types (such as complex numbers) you'll need to either check for those or use a different method.
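For example (a minimal sketch of the cast approach under Python 2; a and b are the two ints from the question):

a = 4
b = 6
c = a / float(b)  # float(b) forces true division even in Python 2
print c           # 0.666666666667 (Python 2's str shows 12 digits)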

How can I force division to be floating point in Python?
I have two integer values a and b, but I need their ratio in floating point. I know that a < b and I want to calculate a/b, so if I use integer division I'll always get 0 with a remainder of a.
How can I force c to be a floating point number in Python in the following?
c = a / b
What is really being asked here is:
"How do I force true division such that a / b will return a fraction?"
Upgrade to Python 3
In Python 3, to get true division, you simply do a / b.
>>> 1/2
0.5
Floor division, the classic division behavior for integers, is now a // b:
>>> 1//2
0
>>> 1//2.0
0.0
However, you may be stuck using Python 2, or you may be writing code that must work in both 2 and 3.
If Using Python 2
In Python 2, it's not so simple. Some ways of dealing with classic Python 2 division are better and more robust than others.
Recommendation for Python 2
You can get Python 3 division behavior in any given module with the following import at the top:
from __future__ import division
which then applies Python 3 style division to the entire module. It also works in a Python shell, from the point you enter it onward. In Python 2:
>>> from __future__ import division
>>> 1/2
0.5
>>> 1//2
0
>>> 1//2.0
0.0
This is really the best solution as it ensures the code in your module is more forward compatible with Python 3.
Other Options for Python 2
If you don't want to apply this to the entire module, you're limited to a few workarounds. The most popular is to coerce one of the operands to a float. One robust solution is a / (b * 1.0). In a fresh Python shell:
>>> 1/(2 * 1.0)
0.5
Also robust is operator.truediv(a, b) from the operator module, but this is likely slower because it's a function call:
>>> from operator import truediv
>>> truediv(1, 2)
0.5
Not Recommended for Python 2
Commonly seen is a / float(b). This will raise a TypeError if b is a complex number. Since division with complex numbers is defined, it makes sense to me to not have division fail when passed a complex number for the divisor.
>>> 1 / float(2)
0.5
>>> 1 / float(2j)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can't convert complex to float
It doesn't make much sense to me to purposefully make your code more brittle.
You can also run Python with the -Qnew flag, but this has the downside of executing all modules with the new Python 3 behavior, and some of your modules may expect classic division, so I don't recommend this except for testing. But to demonstrate:
$ python -Qnew -c 'print 1/2'
0.5
$ python -Qnew -c 'print 1/2j'
-0.5j

c = a / (b * 1.0)

In Python 3.x, the single slash (/) always means true (non-truncating) division. (The // operator is used for truncating division.) In Python 2.x (2.2 and above), you can get this same behavior by putting a
from __future__ import division
at the top of your module.
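A quick demo of the two operators (under Python 3 semantics, or Python 2 after the future import above):

print(7 / 2)    # 3.5 -> true division
print(7 // 2)   # 3   -> truncating (floor) division
print(-7 // 2)  # -4  -> // floors toward negative infinity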

Making either operand of the division a floating-point number produces a floating-point result.
Example:
>>> 4.0/3
1.3333333333333333
or,
>>> 4 / 3.0
1.3333333333333333
or,
>>> 4 / float(3)
1.3333333333333333
or,
>>> float(4) / 3
1.3333333333333333

Add a dot (.) to make a literal a floating point number:
>>> 4/3.
1.3333333333333333

This will also work:
>>> u=1./5
>>> print u
0.2

If you want to use "true" (floating point) division by default, there is a command line flag:
python -Q new foo.py
There are some drawbacks (from the PEP):
It has been argued that a command line option to change the default is evil. It can certainly be dangerous in the wrong hands: for example, it would be impossible to combine a 3rd party library package that requires -Qnew with another one that requires -Qold.
You can learn more about the other flag values that change or warn about the behavior of division by looking at the python man page.
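For example, -Qwarn keeps the classic behavior but emits a warning for each int division (a sketch; the exact warning text may vary across 2.x versions):

$ python -Qwarn -c 'print 1/2'
-c:1: DeprecationWarning: classic int division
0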
For full details on division changes read: PEP 238 -- Changing the Division Operator

from operator import truediv
c = truediv(a, b)
where a is the dividend and b is the divisor.
This function is handy when you want the quotient of two integers as a float.

Related

Alternate approach for pdist() from scipy in Julia?

My objective is to replicate the functionality of pdist() from SciPy in Julia.
I tried using the Distances.jl package to perform pairwise computation of distances between observations. However, the results are not the same, as seen in the example below.
Python Example:
from scipy.spatial.distance import pdist
a = [[1,2], [3,4], [5,6], [7,8]]
b = pdist(a)
print(b)
output --> array([2.82842712, 5.65685425, 8.48528137, 2.82842712, 5.65685425, 2.82842712])
Julia Example:
using Distances
a = [1 2; 3 4; 5 6; 7 8]
dist_function(x) = pairwise(Euclidean(), x, dims = 1)
dist_function(a)
output -->
4×4 Array{Float64,2}:
0.0 2.82843 5.65685 8.48528
2.82843 0.0 2.82843 5.65685
5.65685 2.82843 0.0 2.82843
8.48528 5.65685 2.82843 0.0
With reference to above examples:
Does pdist() from SciPy in Python have its metric set to 'euclidean' by default?
How should I approach this problem to replicate the results in Julia?
Please suggest a solution to resolve this problem.
Documentation reference for pdist() :--> https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.pdist.html
Thanks in advance!!
According to the documentation page you linked, to get the same form as Julia from Python (yes, I know, this is the reverse of your question), you can pass the result to squareform. That is, in your example, add
from scipy.spatial.distance import squareform
squareform(b)
Also, yes, from the same documentation page, you can see that the 'metric' parameter defaults to 'euclidean' if not explicitly defined.
For the reverse situation, note that the Python vector is simply the off-diagonal elements of one triangle of the matrix (since for a 'proper' distance metric, the resulting distance matrix is symmetric).
So you can collect the elements of that triangle into a vector, as the sketch below also illustrates.
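To make both directions concrete, here is a minimal Python sketch (assuming SciPy is installed; squareform converts between the condensed vector and the full symmetric matrix in either direction):

from scipy.spatial.distance import pdist, squareform

a = [[1, 2], [3, 4], [5, 6], [7, 8]]
condensed = pdist(a)            # flat vector of pairwise distances (SciPy default)
square = squareform(condensed)  # 4x4 symmetric matrix, like Julia's pairwise()
back = squareform(square)       # and back to the condensed vector again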
For (1), the answer is yes as per the documentation you linked, which says at the top
scipy.spatial.distance.pdist(X, metric='euclidean', *args, **kwargs)
indicating that the metric arg is indeed set to 'euclidean' by default.
I'm not sure I understand your second question - the results are the same? The only difference to me seems to be that scipy returns the upper triangular as a vector, so if it's just about doing this have a look at: https://discourse.julialang.org/t/vector-of-upper-triangle/7764

Evaluate Irrational with Julia

Julia has the built-in constant pi, with type Irrational.
julia> pi
π = 3.1415926535897...
julia> π
π = 3.1415926535897...
julia> typeof(pi)
Irrational{:π}
Coming from SymPy, which has the N() function, I would like to evaluate pi (or other Irrationals, such as e, golden, etc.) to n digits.
In [5]: N(pi, n=50)
Out[5]: 3.1415926535897932384626433832795028841971693993751
Is this possible? I am assuming that pi is based on its mathematical definition, rather than stored to just thirteen decimal places.
Sure, you can set the BigFloat precision and use big(π). Note that the precision is binary; it's counted in bits. You should be safe if you set the precision to at least (log2(10) + 1) times the number of digits you need.
Example:
julia> using Printf   # @printf lives in the Printf stdlib on Julia >= 0.7

julia> setprecision(BigFloat, 2000) do
           @printf "%.200f" big(π)
       end
3.14159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196
Here I've set the precision a little higher than it needs to be for just 200 digits.
The digits are computed in the GNU MPFR library.
Julia has an interface for the SymPy package:
# Pkg.add("SymPy") ## initial installation of package
using SymPy
julia> N(PI, 50)
3.14159265358979323846264338327950288419716939937508
Note that SymPy uses the uppercase PI to distinguish its object from the lowercase pi in native Julia. You'll also need the Python SymPy library installed on your computer to get full functionality here.
Other comments:
see here for a longer tutorial on Julia SymPy with a lot of good examples.
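For reference, the call the questioner mentions on the Python side looks like this (a sketch, assuming the Python SymPy package is installed):

from sympy import N, pi

print(N(pi, 50))  # 3.1415926535897932384626433832795028841971693993751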

Detailed implementation of IEEE754 in MATLAB?

In MATLAB,
>> format hex; 3/10, 3*0.1
ans =
3fd3333333333333
ans =
3fd3333333333334
>> 3/10 - 3*0.1
ans =
bc90000000000000
Is this result predictable? I.e., can I follow some rules of floating point arithmetic and get 3/10 = 3fd3333333333333 and 3*0.1 = 3fd3333333333334 again by hand?
The rules are:
In MATLAB, unless specified otherwise (via constructors), all literals are double precision in the sense of the IEEE 754 standard: http://www.mathworks.com/help/matlab/matlab_prog/floating-point-numbers.html
All arithmetic operations are executed according to the usual precedence rules: http://www.mathworks.com/help/matlab/matlab_prog/operator-precedence.html
When mixing another numeric type with double in an arithmetic operation, MATLAB converts the double to the other numeric type before executing the operation (as opposed to C, for example, which does it the other way around).
By using these rules you can pretty much predict the results of any arithmetic expression (always little endian memory layout, bit patterns are two's complement for signed integers and IEEE 754 for floats). The alternative is to let MATLAB apply the rules for you; the results will be consistent and repeatable.
The reason is that when creating the binary representation for 0.1 a roundup has occurred, introducing a small error:
>> 0.1
ans =
3fb999999999999a
There should be infinitely many of those 9s at the end, but we cut the expansion off and round up the last digit. The error is small but becomes significant when you multiply by 3:
>> 3*0.1
ans =
3fd3333333333334
When correctly calculated by division this last digit shouldn't be 4:
>> 3/10
ans =
3fd3333333333333
It is interesting to see that this error is not big enough to cause a problem when we multiply by some other number smaller than 3 (the threshold is not exactly 3 though):
>> 2.9/10
ans =
3fd28f5c28f5c28f
>> 2.9*0.1
ans =
3fd28f5c28f5c28f
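The same comparison can be reproduced outside MATLAB; here is a minimal Python 3 sketch (float.hex() shows the same IEEE 754 doubles, in C99 hex-float notation rather than raw bytes):

print((3 / 10).hex())    # 0x1.3333333333333p-2
print((3 * 0.1).hex())   # 0x1.3333333333334p-2 -> one ulp higher
print(3 / 10 - 3 * 0.1)  # -5.551115123125783e-17, i.e. -2**-54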

mod() operation weird behavior

I use mod() to test whether a number's 0.01 digit is 2 or not.
if mod(5.02*100, 10) == 2
...
end
The result is that mod(5.02*100, 10) == 2 returns 0 (false);
however, mod(1.02*100, 10) == 2 and mod(20.02*100, 10) == 2 both return 1 (true).
The result of mod(5.02*100, 10) - 2 is
ans =
-5.6843e-14
Could it be that this is a bug in MATLAB?
The version I used is R2013a. version 8.1.0
This is not a bug in MATLAB. It is a limitation of floating point arithmetic and of conversion between binary and decimal numbers. Even a simple decimal number such as 0.1 cannot be exactly represented as a binary floating point number with finite precision.
Computer floating point arithmetic is typically not exact. Although we are used to dealing with numbers in decimal format (base 10), computers store and process numbers in binary format (base 2). The IEEE standard for double precision floating point representation (see http://en.wikipedia.org/wiki/Double-precision_floating-point_format; this is what MATLAB uses) specifies the use of 64 bits to represent a binary number: 1 bit is used for the sign, 52 bits are used for the mantissa (the actual digits of the number), and 11 bits are used for the exponent and its sign (which specifies where the decimal place goes).
When you enter a number into MATLAB, it is immediately converted to binary representation for all manipulations and arithmetic and then converted back to decimal for display and output.
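To make that layout concrete, here is a minimal Python sketch that pulls the three fields out of a double (0.1's 64-bit pattern is 0x3fb999999999999a, the same value MATLAB's format hex displays):

import struct

bits = struct.unpack('<Q', struct.pack('<d', 0.1))[0]  # raw 64-bit pattern
sign = bits >> 63                    # 1 bit
exponent = (bits >> 52) & 0x7FF      # 11 bits, biased by 1023
mantissa = bits & ((1 << 52) - 1)    # 52 bits
print(hex(bits))                     # 0x3fb999999999999a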
Here's what happens in your example:
Convert to binary (keeping only up to 52 digits):
5.02 => 1.01000001010001111010111000010100011110101110000101e2
100 => 1.1001e6
10 => 1.01e3
2 => 1.0e1
Perform multiplication:
1.01000001010001111010111000010100011110101110000101 e2
x 1.1001 e6
--------------------------------------------------------------
0.000101000001010001111010111000010100011110101110000101
0.101000001010001111010111000010100011110101110000101
+ 1.01000001010001111010111000010100011110101110000101
-------------------------------------------------------------
1.111101011111111111111111111111111111111111111111111101e8
Cutting off at 52 digits gives 1.111101011111111111111111111111111111111111111111111e8
Note that this is not the same as 1.11110110e8 which would be 502.
Perform modulo operation: (there may actually be additional error here depending on what algorithm is used within the mod() function)
mod( 1.111101011111111111111111111111111111111111111111111e8, 1.01e3) = 1.111111111111111111111111111111111111111111100000000e0
The error is exactly -2^-44, which is -5.6843x10^-14. The conversion between decimal and binary and the rounding due to finite precision have caused a small error. In some cases you get lucky and the rounding errors cancel out, so you might still get the 'right' answer, which is why you got what you expected for mod(1.02*100, 10), but in general you cannot rely on this.
To use mod() correctly to test a particular digit of a number, use round() to round the product to the nearest whole number first, compensating for the floating point error:
mod(round(5.02*100), 10) == 2
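The same pitfall, and the same fix, reproduce in Python (a minimal sketch):

print(5.02 * 100)                    # 501.99999999999994, not exactly 502
print((5.02 * 100) % 10 == 2)        # False, as in MATLAB
print(round(5.02 * 100) % 10 == 2)   # True -> round first, then test the digit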
What you're encountering is a floating point error or artifact, like the commenters say. This is not a Matlab bug; it's just how floating point values work. You'd get the same results in C or Java. Floating point values are "approximate" types, so exact equality comparisons using == without some rounding or tolerance are prone to error.
>> isequal(1.02*100, 102)
ans =
1
>> isequal(5.02*100, 502)
ans =
0
It's not the case that 5.02 is the only number this happens for; many values are affected. Here's an example that picks out several of them.
x = 1.02:1000.02;
ix = mod(x .* 100, 10) ~= 2;
disp(x(ix))
To understand the details of what's going on here (and in many other situations you'll encounter working with floats), have a read through the Wikipedia entry for "floating point", or my favorite article on it, "What Every Computer Scientist Should Know About Floating-Point Arithmetic". (That title is hyperbole; this article goes deep and I don't understand half of it. But it's a great resource.) This stuff is particularly relevant to Matlab because Matlab does everything in floating point by default.

Why is modulus different in different programming languages?

Perl
print 2 % -18;
-->
-16
Tcl
puts [expr {2 % -18}]
-->
-16
but VBScript
wscript.echo 2 mod -18
-->
2
Why the difference?
The wikipedia answer is fairly helpful here.
A short summary is that, given integers a and n (with n nonzero), we can write
a = qn + r
where q and r are integers and 0 <= |r| < |n|.
Almost every programming language will require that (a/n) * n + (a%n) = a, so the definition of modulus nearly always depends on the definition of integer division. There are two choices for integer division by negative numbers: 2/-18 = 0 or 2/-18 = -1. Depending on which one holds in your language, the result of the % operator will usually change accordingly.
This is because 2 = (-1) * -18 + (-16) and 2 = 0 * -18 + 2.
For Perl the situation is complicated. The manual page says: "Note that when use integer is in scope, "%" gives you direct access to the modulus operator as implemented by your C compiler. This operator is not as well defined for negative operands, but it will execute faster. " So it can choose either option for Perl (like C) if use integer is in scope. If use integer is not in scope, the manual says " If $b is negative, then $a % $b is $a minus the smallest multiple of $b that is not less than $a (i.e. the result will be less than or equal to zero). "
Wikipedia's "Modulo operation" page explains it quite well. I won't try to do any better here, as I'm likely to make a subtle but important mistake.
The rub of it is that you can define "remainder" or "modulus" in different ways, and different languages have chosen different options to implement.
When you divide a number by a divisor and one of them is negative, there are at least two ways to split the result into a quotient and a remainder such that quotient * divisor + remainder = number: you can round the quotient either towards negative infinity or towards zero.
Many languages just choose one.
I can't resist pointing out that Common Lisp offers both.
Python, of course, explicitly informs you:
>>> divmod(2,-18)
(-1, -16)
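A short Python sketch of the two conventions side by side (// and % floor towards negative infinity; math.fmod truncates towards zero, like C and VBScript):

import math

print(2 // -18, 2 % -18)  # -1 -16 -> floored division: Perl/Tcl/Python behavior
print(math.fmod(2, -18))  # 2.0    -> truncated division: C/VBScript behavior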