I observed something really strange. If you run this code in Swift:
Int(Float(Int.max))
It crashes with the error message:
fatal error: Float value cannot be converted to Int because the result would be greater than Int.max
This is really counter-intuitive, so I expanded the expression into 3 lines and tried to see what happens in each step in a playground:
let a = Int.max
let b = Float(a)
let c = Int(b)
It crashes with the same message. This time, I see that a is 9223372036854775807 and b is 9.223372e+18. It is obvious that a is greater than b by 36854775807. I also understand that floating points are inaccurate, so I expected something less than Int.max, with the last few digits being 0.
I also tried this with Double; it crashes too.
Then I thought, maybe this is just how floating point numbers behave, so I tested the same thing in Java:
long a = Long.MAX_VALUE;
float b = (float)a;
long c = (long)b;
System.out.println(c);
It prints the expected 9223372036854775807!
What is wrong with Swift?
There aren't enough bits in the mantissa of a Double or Float to accurately represent 19 significant digits, so you are getting a rounded result.
If you print the Float using String(format:) you can see a more accurate representation of the value of the Float:
let a = Int.max
print(a) // 9223372036854775807
let b = Float(a)
print(String(format: "%.1f", b)) // 9223372036854775808.0
So the value represented by the Float is 1 larger than Int.max.
Many values will be converted to the same Float value. The question becomes how much you would have to reduce Int.max before it converts to a different Double or Float value.
Starting with Double:
var y = Int.max
while Double(y) == Double(Int.max) {
y -= 1
}
print(Int.max - y) // 512
So with Double, the last 512 Ints all convert to the same Double.
Float has fewer bits to represent the value, so there are more values that all map to the same Float. Stepping by 1000 instead of 1 so that the loop runs in reasonable time:
var y = Int.max
while Float(y) == Float(Int.max) {
y -= 1000
}
print(Int.max - y) // 274877907000
So, your expectation that a Float could accurately represent a specific Int was misplaced.
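The same gaps can also be read off without a loop, by asking for the next representable value below the rounded one. This is only a minimal sketch, assuming a 64-bit platform where Int.max is 2^63 - 1; the variable names are illustrative:

```swift
// Float(Int.max) and Double(Int.max) both round up to 2^63.
let d = Double(Int.max)
let f = Float(Int.max)

// nextDown is the largest representable value below the receiver,
// and it is small enough to convert back to Int without trapping.
let doubleBelow = Int(d.nextDown)
let floatBelow = Int(f.nextDown)

// Distance from Int.max down to the nearest smaller representable value.
print(Int.max - doubleBelow)   // 1023
print(Int.max - floatBelow)    // 549755813887
```

The spacing just below 2^63 is therefore 1024 for Double and 2^39 for Float. The upper half of each gap rounds up to 2^63, which agrees with the 512 and (allowing for the step of 1000) 274877907000 found by the loops.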
Follow up question from the comments:
If float does not have enough bits to represent Int.max, how is it
able to represent a number one larger than that?
Floating point numbers are represented as two parts: mantissa and exponent. The mantissa represents the significant digits (in binary) and the exponent represents the power of 2. As a result, a floating point number can exactly express a power of 2 by having a mantissa of 1 with an exponent that represents the power.
Numbers that are not exact powers of 2 may have a binary pattern that contains more digits than can be represented in the mantissa. This is the case for Int.max (which is 2^63 - 1) because in binary that is 111111111111111111111111111111111111111111111111111111111111111 (63 1's). A Float, which is 32 bits wide and keeps only 24 significant bits, cannot store a mantissa of 63 bits, so the value has to be rounded or truncated. In the case of Int.max, rounding up by 1 results in the value
1000000000000000000000000000000000000000000000000000000000000000. Starting from the left, there is only 1 significant bit to be represented by the mantissa (the trailing 0's come for free), so this number is a mantissa of 1 and an exponent of 63 (that is, 2^63).
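The mantissa and exponent can be inspected directly through the standard library's significand and exponent properties. A small sketch, assuming a 64-bit platform; twoTo63 is just an illustrative name:

```swift
let b = Float(Int.max)

// significand lies in 1.0..<2.0 and exponent is the power of 2,
// so that b == significand * 2^exponent.
print(b.significand)   // 1.0
print(b.exponent)      // 63

// 2^63 written as a hexadecimal floating-point literal.
let twoTo63: Float = 0x1p63
print(b == twoTo63)    // true

// Float keeps 24 significant bits (23 stored plus one implicit).
print(Float.significandBitCount + 1)   // 24
```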
See @MartinR's answer for an explanation of what Java is doing.
Swift and Java behave differently when converting a "too large" floating point
number to an integer. Java clamps any floating point value
larger than Long.MAX_VALUE = 2^63-1 to Long.MAX_VALUE:
long c = (long)(1.0E+30f);
System.out.println(c);
// 9223372036854775807
Swift expects that the value is in the range of Int, and aborts
with a runtime error otherwise:
/// Creates a new instance by rounding the given floating-point value toward
/// zero.
///
/// - Parameter other: A floating-point value. When `other` is rounded toward
/// zero, the result must be within the range `Int.min...Int.max`.
public init(_ value: Float)
Example:
let c = Int(Float(1.0E30))
print(c)
// fatal error: Float value cannot be converted to Int because the result would be greater than Int.max
The same happens with your value Float(Int.max), which is the
floating point representable value closest to Int.max and happens
to be larger than Int.max.
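If a crash is not acceptable, the conversion can be guarded. The sketch below shows two options: the failable Int(exactly:) initializer from the standard library, and a hypothetical helper clampedInt (not part of Swift) that imitates Java's clamping, including Java's rule that NaN converts to 0:

```swift
// Failable conversion: nil when the value is not exactly representable as Int.
let exact = Int(exactly: Float(Int.max))
print(exact == nil)   // true

// Hypothetical helper that clamps out-of-range values instead of trapping.
func clampedInt(_ value: Float) -> Int {
    if value.isNaN { return 0 }                     // Java converts NaN to 0
    if value >= Float(Int.max) { return Int.max }   // Float(Int.max) is 2^63
    if value <= Float(Int.min) { return Int.min }   // Float(Int.min) is -2^63
    return Int(value)                               // in range: rounds toward zero
}

print(clampedInt(Float(Int.max)) == Int.max)   // true
print(clampedInt(1.0e30) == Int.max)           // true
print(clampedInt(42.9))                        // 42
```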
Related
I don't have much experience working with these low level bytes and numbers, so I've come here for help. I'm connecting to a bluetooth thermometer in my Flutter app, and I get an array of numbers formatted like this according to their documentation. I'm attempting to convert these numbers to a plain temperature double, but can't figure out how. This is the "example" the company gives me. However when I get a reading of 98.5 on the thermometer I get a response as an array of [113, 14, 0, 254]
Thanks for any help!
IEEE-11073 is a commonly used format in medical devices. The table you quoted has everything in it for you to decode the numbers, though it might be hard to decipher at first.
Let's take the first example you have: 0xFF00016C. This is a 32-bit number; the first byte is the exponent, and the last three bytes are the mantissa. Both are encoded in 2's complement representation:
Exponent, 0xFF, in 2's complement this is the number -1
Mantissa, 0x00016C, in 2's complement this is the number 364
(If you're not quite sure how numbers are encoded in 2's complement, please ask that as a separate question.)
The next thing we do is to make sure it's not a "special" value, as dictated in your table. Since the exponent you have is not 0 (it is -1), we know that you're OK. So, no special processing is needed.
Since the value is not special, its numeric value is simply: mantissa * 10^exponent. So, we have: 364*10^-1 = 36.4, as your example shows.
Your second example is similar. The exponent is 0xFE, and that's the number -2 in 2's complement. The mantissa is 0x000D97, which is 3479 in decimal. Again, the exponent isn't 0, so no special processing is needed. So you have: 3479*10^-2 = 34.79.
You say for the 98.5 value, you get the byte-array [113, 14, 0, 254]. Let's see if we can make sense of that. Your byte array, written in hex is: [0x71, 0x0E, 0x00, 0xFE]. I'm guessing you receive these bytes in the "reverse" order, so as a 32-bit hexadecimal this is actually 0xFE000E71.
We proceed similarly: Exponent is again -2, since 0xFE is how you write -2 in 2's complement using 8-bits. (See above.) Mantissa is 0xE71 which equals 3697. So, the number is 3697*10^-2 = 36.97.
You are claiming that this is actually 98.5. My best guess is that you are reading it in Fahrenheit, and your device is reporting in Celsius. If you do the math, you'll find that 36.97C = 98.55F, which is close enough. I'm not sure how you got the 98.5 number, but with devices like this, this outcome seems to be within the precision you can expect.
Hope this helps!
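The decoding steps above can be sketched in a few lines. This is only an illustration, written in Swift rather than Dart: decodeFloat11073 is a hypothetical name, the byte order (least significant mantissa byte first, exponent byte last) is the guess made above, and special values are not handled:

```swift
import Foundation

// Hypothetical helper: decodes a 32-bit IEEE-11073 FLOAT from four bytes,
// assuming the mantissa arrives least significant byte first, exponent last.
func decodeFloat11073(_ bytes: [UInt8]) -> Double? {
    guard bytes.count == 4 else { return nil }

    // Exponent: one byte, 8-bit two's complement.
    let exponent = Int(Int8(bitPattern: bytes[3]))

    // Mantissa: three bytes, 24-bit two's complement.
    let raw = Int32(bytes[0]) | (Int32(bytes[1]) << 8) | (Int32(bytes[2]) << 16)
    let mantissa = (raw << 8) >> 8   // shift up and back down to sign-extend

    // Special values (NaN, infinities) are not handled in this sketch.
    return Double(mantissa) * pow(10.0, Double(exponent))
}

if let celsius = decodeFloat11073([113, 14, 0, 254]) {
    let fahrenheit = celsius * 9 / 5 + 32
    print(celsius, fahrenheit)   // roughly 36.97 and 98.55
}
```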
Here is something that I used to convert sfloat16 to double in Dart for our Flutter app.
import 'dart:math';

double sfloat2double(int ieee11073) {
  // Special mantissa codes; none of these is an ordinary number.
  const reservedValues = {
    0x07FE: 'PositiveInfinity',
    0x07FF: 'NaN',
    0x0800: 'NaN',
    0x0801: 'NaN',
    0x0802: 'NegativeInfinity'
  };
  var mantissa = ieee11073 & 0x0FFF;
  if (reservedValues.containsKey(mantissa)) {
    return 0.0; // basically error
  }
  // Sign-extend the 12-bit two's complement mantissa.
  if ((mantissa & 0x0800) != 0) {
    mantissa = -((~mantissa & 0x0FFF) + 1);
  }
  // Sign-extend the 4-bit two's complement exponent.
  var exponent = (ieee11073 >> 12) & 0x0F;
  if ((exponent & 0x08) != 0) {
    exponent = -((~exponent & 0x0F) + 1);
  }
  // pow returns num, so convert explicitly to double.
  return (mantissa * pow(10, exponent)).toDouble();
}
I am trying to get a random decimal from 0.75 to 1.25
let incomeCalc = Decimal((arc4random_uniform(50)+75)/100)
print("incomeCalc")
print(incomeCalc)
Why does this print 0?
arc4random_uniform returns an integer type (UInt32), so you are doing integer math. You need to be doing floating point math.
let incomeCalc = Decimal(Double((arc4random_uniform(50)+75))/100)
By casting the value before you do the division, you get a Double result which is passed to your Decimal initializer.
Or you can do:
let incomeCalc = Decimal((arc4random_uniform(50)+75))/100
which creates the Decimal before the division is done.
You can also use the code below, which gets a random number from 75 to 124 (arc4random_uniform(50) returns values from 0 to 49) and then divides it by 100:
let incomeCalc = Decimal((arc4random_uniform(50)+75)) / 100
print("incomeCalc")
print(incomeCalc)
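For comparison, here is a sketch of the same idea using Int.random(in:), assuming Swift 4.2 or later. The closed range includes 125, so 1.25 is a possible result; the names hundredths and truncated are only illustrative:

```swift
import Foundation

// Random whole number of hundredths; the closed range includes 75 and 125.
let hundredths = Int.random(in: 75...125)

// Integer division happens first here, so the fraction is lost (0 or 1).
let truncated = Decimal(hundredths / 100)

// Converting to Decimal before dividing keeps the fraction.
let incomeCalc = Decimal(hundredths) / 100

print(truncated, incomeCalc)
```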
Paste the following code into a playground:
5.0 / 100
func test(anything: Float) -> Float {
return anything / 100
}
test(anything: 5.0)
The first line should return 0.05 as expected. The function test returns 0.0500000007450581. Why?
It has nothing to do with functions. Your first example is using type Double which represents floating point numbers more precisely by using 64 bits. If you were to change your second example to:
func test(anything: Double) -> Double {
return anything / 100
}
test(anything: 5.0)
You would get the result you expect. Float uses only 32 bits of data, thus it provides a less precise representation of the number. Also, floating point numbers are stored as binary values and frequently are only an approximation of the base 10 representation. That is why 0.05 is showing up as 0.0500000007450581 when stored as a Float.
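The difference can be seen without any function by storing the same quotient in both types and widening the Float to a Double, which preserves its stored value exactly. A minimal sketch with illustrative names:

```swift
let asFloat: Float = 5.0 / 100
let asDouble: Double = 5.0 / 100

// Widening a Float to a Double is exact, so this shows what the Float really holds.
let widened = Double(asFloat)

print(widened)               // close to, but not equal to, 0.05
print(asDouble)              // 0.05
print(widened == asDouble)   // false
```

Neither type stores 0.05 exactly; Double is simply close enough that the shortest printed form is 0.05.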