How can I check whether a Double value overflows? - scala

I want to check if adding some value to a double value exceeds the Double limits or not. I tried this:
object Hello {
  def main(args: Array[String]): Unit = {
    var t = Double.MaxValue
    var t2 = t + 100000000
    if (t2 > 0) {
      println("t2 > 0: " + t2)
    } else
      println("t2 <= 0: " + t2)
  }
}
The output I get is
t2 > 0: 1.7976931348623157E308
What I actually want is to sum billions of values and check whether or not the running sum overflows at any time.

The first part of your question seems to stem from a misunderstanding of floating-point numbers.
IEEE-754 floating-point numbers do not wrap around like some finite-size integers would. Instead, they "saturate" at Double.PositiveInfinity, which represents mathematical (positive) infinity. Double.MaxValue is the largest finite positive value of doubles. The next Double after that is Double.PositiveInfinity. Adding any double (other than Double.NegativeInfinity or NaNs) to Double.PositiveInfinity yields Double.PositiveInfinity.
scala> Double.PositiveInfinity + 1
res0: Double = Infinity
scala> Double.PositiveInfinity - 1
res1: Double = Infinity
scala> Double.PositiveInfinity + Double.NaN
res2: Double = NaN
scala> Double.PositiveInfinity + Double.NegativeInfinity
res3: Double = NaN
Floating-point numbers get fewer and farther between as their magnitude grows. Double.MaxValue + 100000000 evaluates to Double.MaxValue as a result of roundoff error: Double.MaxValue is so much larger than 100000000 that the former "swallows up" the latter if you try to add them. You would need to add a Double of the order of math.pow(2, -52) * Double.MaxValue to Double.MaxValue in order to get Double.PositiveInfinity:
scala> math.pow(2,-52) * Double.MaxValue + Double.MaxValue
res4: Double = Infinity
Now, you write
What I actually want is to sum billions of values and check whether or not the running sum overflows at any time.
One possible approach is to define a function that adds the numbers recursively but stops if the running sum is an infinity or a NaN, and wraps the result in an Either[String, Double]:
import scala.collection.immutable

def sumToEither(xs: immutable.Seq[Double]): Either[String, Double] = {
  @annotation.tailrec
  def go(ys: immutable.Seq[Double], acc: Double): Double =
    if (ys.isEmpty || acc.isInfinite || acc.isNaN) acc
    else go(ys.tail, ys.head + acc)
  go(xs, 0.0) match {
    case x if x.isInfinite => Left("overflow")
    case x if x.isNaN => Left("NaN")
    case x => Right(x)
  }
}
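For example, a quick standalone check (the definition is repeated here so the snippet runs on its own):

```scala
import scala.collection.immutable

// Same sumToEither as above, repeated so this sketch is self-contained.
def sumToEither(xs: immutable.Seq[Double]): Either[String, Double] = {
  @annotation.tailrec
  def go(ys: immutable.Seq[Double], acc: Double): Double =
    if (ys.isEmpty || acc.isInfinite || acc.isNaN) acc
    else go(ys.tail, ys.head + acc)
  go(xs, 0.0) match {
    case x if x.isInfinite => Left("overflow")
    case x if x.isNaN      => Left("NaN")
    case x                 => Right(x)
  }
}

println(sumToEither(List(1.0, 2.0, 3.0)))                    // Right(6.0)
println(sumToEither(List(Double.MaxValue, Double.MaxValue))) // Left(overflow)
```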

In response to your question in the comments:
Actually, I want to get the total of billions of values and check if the total overflows anytime or not. Could you please tell a way to check that?
If the total overflows, the result will be an infinity (positive or negative), or NaN (if at some point you have added a positive and a negative infinity): the easiest way is to check total.isInfinity || total.isNaN.
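A minimal sketch of that check (the input values here are just for illustration):

```scala
// Sum a large stream of values, then test whether the total saturated.
val values = Iterator.fill(1000)(Double.MaxValue / 10)
val total = values.sum
val overflowed = total.isInfinity || total.isNaN
println(overflowed) // true: eleven additions already push the sum to Infinity
```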

Related

find out if a number is a good number in scala

Hi I am new to scala functional programming methodology. I want to input a number to my function and check if it is a good number or not.
A number is a good number if its every digit is larger than the sum of digits which are on the right side of that digit. 
For example:
9620  is good as (2 > 0, 6 > 2+0, 9 > 6+2+0)
steps I am using to solve this is
1. converting a number to string and reversing it
2. storing all digits of the reversed number as elements of a list
3. applying for loop from i equals 1 to length of number - 1
4. calculating sum of first i digits as num2
5. extracting the ith digit from the list as digit1, which is one position ahead of the first i digits whose sum we calculated (since the list starts from zero).
6. comparing the outputs of steps 4 and 5: if num2 is greater than digit1, we break out of the for loop and print that it is not a good number.
please find my code below
val num1 = 9521.toString.reverse
val list1 = num1.map(_.asDigit).toList
for (i <- 1 to num1.length - 1) {
  val num2 = num1.take(i).map(_.asDigit).sum
  val digit1 = list1(i)
  if (num2 > digit1) {
    print("number is not a good number")
    break
  }
}
I know this is not the most optimized way to solve this problem. Also I am looking for a way to code this using tail recursion where I pass two numbers and get all the good numbers falling in between those two numbers.
Can this be done in more optimized way?
Thanks in advance!
No String conversions required.
val n = 9620
val isGood = Stream.iterate(n)(_ / 10)
  .takeWhile(_ > 0)
  .map(_ % 10)
  .foldLeft((true, -1)) { case ((bool, sum), digit) =>
    (bool && digit > sum, sum + digit)
  }._1
Here is a purely numeric version using a recursive function.
def isGood(n: Int): Boolean = {
  @annotation.tailrec
  def loop(n: Int, sum: Int): Boolean =
    (n == 0) || (n % 10 > sum && loop(n / 10, sum + n % 10))
  loop(n / 10, n % 10)
}
This should compile into an efficient loop.
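For instance (the definition is repeated so the snippet runs on its own):

```scala
// The numeric isGood from above, repeated so this runs standalone.
def isGood(n: Int): Boolean = {
  @annotation.tailrec
  def loop(n: Int, sum: Int): Boolean =
    (n == 0) || (n % 10 > sum && loop(n / 10, sum + n % 10))
  loop(n / 10, n % 10)
}

println(isGood(9620)) // true
println(isGood(9600)) // false
```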
Using this function (this is the efficient way, because forall does not traverse the entire list of digits: it stops immediately when the condition v(i) > v.drop(i+1).sum becomes false while traversing the vector v from left to right):
def isGood(n: Int) = {
  val v1 = n.toString.map(_.asDigit)
  val v = if (v1.last != 0) v1 else v1.dropRight(1)
  (0 to v.size - 1).forall(i => v(i) > v.drop(i + 1).sum)
}
If we want to find good numbers in an interval of integers ranging from n1 to n2 we can use this function:
def goodNums(n1:Int,n2:Int) = (n1 to n2).filter(isGood(_))
In Scala REPL:
scala> isGood(9620)
res51: Boolean = true
scala> isGood(9600)
res52: Boolean = false
scala> isGood(9641)
res53: Boolean = false
scala> isGood(9521)
res54: Boolean = true
scala> goodNums(412,534)
res66: scala.collection.immutable.IndexedSeq[Int] = Vector(420, 421, 430, 510, 520, 521, 530, 531)
scala> goodNums(3412,5334)
res67: scala.collection.immutable.IndexedSeq[Int] = Vector(4210, 5210, 5310)
This is a more functional way. pairs is a list of tuples between a digit and the sum of the following digits. It is easy to create these tuples with drop, take and slice (a combination of drop and take) methods.
Finally I can represent my condition in an expressive way with forall method.
val n = 9620
val str = n.toString
val pairs = for { x <- 1 until str.length } yield (str.slice(x - 1, x).toInt, str.drop(x).map(_.asDigit).sum)
pairs.forall { case (a, b) => a > b }
If you want to be functional and expressive, avoid break. If you need to check a condition for each element, it is a good idea to move your problem to collections, so you can use forall.
It is not the case here, but if you want performance (i.e. you don't want to create the entire pairs collection when the condition is already false for the first element) you can change your source collection from a Range to a Stream:
(1 until str.length).toStream
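For reference, the pairs-based check above can be wrapped into a self-contained function (naming it isGood, to mirror the other answers):

```scala
// The pairs-based check from above, packaged as a function.
def isGood(n: Int): Boolean = {
  val str = n.toString
  val pairs = for { x <- 1 until str.length }
    yield (str.slice(x - 1, x).toInt, str.drop(x).map(_.asDigit).sum)
  pairs.forall { case (a, b) => a > b }
}

println(isGood(9620)) // true
println(isGood(9600)) // false
```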
Functional style tends to prefer monadic type things, such as maps and reduces. To make this look functional and clear, I'd do something like:
def isGood(value: Int) =
  value.toString.reverse.map(digit => Some(digit.asDigit))
    .reduceLeft[Option[Int]] {
      case (sum, Some(digit)) =>
        sum.collectFirst { case s if s < digit => s + digit }
    }.isDefined
Instead of using tail recursion to calculate this for ranges, just generate the range and then filter over it:
def goodInRange(low: Int, high: Int) = (low to high).filter(isGood(_))

Averaging a very long List[Double] Without getting infinity in Scala

I have a very long list of doubles that I need to average, but I can't sum them within the Double data type, so when I go to divide I still get Infinity.
def applyToMap(list: Map[String, List[Map[String, String]]], f: Map[String, String] => Double): Map[String, Double] = {
  val mSLD = list.mapValues(lm => lm.map(f))
  mSLD.mapValues(ld => ld.sum / ld.size)
}
This leaves me with a Map[String, Double] where every value is Infinity.
You could use fold to compute an average as you go. Rather than doing sum / size you should count your way through the items with n, and for each one adjust your accumulator with acc = (acc * n/(n+1)) + (item * 1/(n+1))
Here’s the general scala code:
val average = seq.foldLeft((0.0, 1)) ((acc, i) => ((acc._1 + (i - acc._1) / acc._2), acc._2 + 1))._1
Taken from here.
You'd probably still have precision difficulty if the list is really long, as you'd be dividing by an increasingly large number. To be really safe you should break the list into sublists and compute the average of the averages of the sublists. Make sure the sublists are all the same length, though, or do a weighted average based on their size.
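A hedged sketch of that sublist idea (safeAverage and the chunk size are made-up names; equal-sized chunks are assumed, otherwise a size-weighted average would be needed):

```scala
// Divide before summing at both levels so no partial sum can overflow.
// Assumes xs.size is a multiple of chunkSize; unequal chunks would need
// a weighted average instead.
def safeAverage(xs: List[Double], chunkSize: Int): Double = {
  val chunkAvgs = xs.grouped(chunkSize).toList.map(c => c.map(_ / c.size).sum)
  chunkAvgs.map(_ / chunkAvgs.size).sum
}

val avg = safeAverage(List.fill(100)(Double.MaxValue / 2), 10)
println(avg.isInfinity) // false: a naive sum of these 100 values would be Infinity
```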
Interested in implementing gandaliter's solution, I came up with the following. (Since I'm not on the best of terms with Doubles, I tried to find an easy-to-follow numeric sequence using Bytes.) First, I generate 10 Bytes in the range 75..125, so that every value stays below Byte.MaxValue while being close to it, with an average of 100 for a simple sanity check:
val rnd = util.Random
val is=(1 to 10).map (i => (rnd.nextInt (50)+75).toByte)
// = Vector(99, 122, 99, 105, 102, 104, 122, 99, 87, 114)
The 1st algo multiplies before division (which increases the danger to exceed MaxByte), the 2nd divides before multiplication, which leads to rounding errors.
def slidingAvg0(sofar: Byte, x: Byte, cnt: Byte): (Byte, Byte) = {
  val acc: Byte = ((sofar * cnt).toByte / (cnt + 1).toByte + (x / (cnt + 1).toByte).toByte).toByte
  println(acc)
  (acc, (cnt + 1).toByte)
}

def slidingAvg1(sofar: Byte, x: Byte, cnt: Byte): (Byte, Byte) = {
  val acc: Byte = (((sofar / (cnt + 1).toByte).toByte * cnt).toByte + (x / (cnt + 1).toByte).toByte).toByte
  println(acc)
  (acc, (cnt + 1).toByte)
}
This is a foldLeft in Scala (written with the /: operator):
((is.head, 1.toByte) /: is.tail) { case ((sofar, cnt), x) => slidingAvg0 (sofar, x, cnt)}
110
21
41
2
18
32
8
16
0
scala> ((is.head, 1.toByte) /: is.tail) { case ((sofar, cnt), x) => slidingAvg1 (sofar, x, cnt)}
110
105
104
100
97
95
89
81
83
Since 10 values is far too few to rely on the average being close to 100, let's see the sum as Int:
is.map (_.toInt).sum
res65: Int = 1053
The drift is pretty significant (the result should be about 105, but is 0 and 83, respectively).
Whether the findings are transferable from Bytes/Ints to Doubles is another question. And I'm not 100% confident that my braces mirror the evaluation order, but IMHO, for multiplication/division of the same precedence it is left to right.
So the original formulas were:
acc = (acc * n/(n+1)) + (item * 1/(n+1))
acc = (acc /(n+1) *n) + (item/(n+1))
If I understand the OP correctly, the amount of data doesn't seem to be a problem; otherwise it wouldn't fit into memory.
So I'll concentrate on the data types only.
Summary
My suggestion is to go with BigDecimal instead of Double, especially if you are adding reasonably high values.
The only significant drawbacks are the performance and a small amount of cluttered syntax.
Alternatively, you can rescale your input upfront, but this will degrade precision and requires special care with post-processing.
Double breaks at some scale
scala> :paste
// Entering paste mode (ctrl-D to finish)
val res0 = (Double.MaxValue + 1) == Double.MaxValue
val res1 = Double.MaxValue/10 == Double.MaxValue
val res2 = List.fill(11)(Double.MaxValue/10).sum
val res3 = List.fill(10)(Double.MaxValue/10).sum == Double.MaxValue
val res4 = (List.fill(10)(Double.MaxValue/10).sum + 1) == Double.MaxValue
// Exiting paste mode, now interpreting.
res0: Boolean = true
res1: Boolean = false
res2: Double = Infinity
res3: Boolean = true
res4: Boolean = true
Take a look at these simple Double arithmetic examples in your Scala REPL:
Double.MaxValue + 1 is numerically cancelled out and nothing gets added, so the result is still the same as Double.MaxValue
Double.MaxValue/10 behaves as expected and doesn't equal Double.MaxValue
Adding Double.MaxValue/10 eleven times produces an overflow to Infinity
Adding Double.MaxValue/10 ten times doesn't break the arithmetic and evaluates to Double.MaxValue again
Adding 1 to that ten-fold sum changes nothing: it still equals Double.MaxValue
BigDecimal works on all scales but is slower
scala> :paste
// Entering paste mode (ctrl-D to finish)
val res0 = (BigDecimal(Double.MaxValue) + 1) == BigDecimal(Double.MaxValue)
val res1 = BigDecimal(Double.MaxValue)/10 == BigDecimal(Double.MaxValue)
val res2 = List.fill(11)(BigDecimal(Double.MaxValue)/10).sum
val res3 = List.fill(10)(BigDecimal(Double.MaxValue)/10).sum == BigDecimal(Double.MaxValue)
val res4 = (List.fill(10)(BigDecimal(Double.MaxValue)/10).sum + 1) == BigDecimal(Double.MaxValue)
// Exiting paste mode, now interpreting.
res0: Boolean = false
res1: Boolean = false
res2: scala.math.BigDecimal = 197746244834854727000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
res3: Boolean = true
res4: Boolean = false
Now compare these results with the ones above from Double.
As you can see everything works as expected.
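Applied to the averaging problem from the question, the BigDecimal route could look like this (a sketch with a made-up function name, not the asker's exact code):

```scala
// Sum as BigDecimal (34 significant digits with the default MathContext),
// divide once at the end, and convert back to Double.
def avgViaBigDecimal(xs: List[Double]): Double =
  (xs.foldLeft(BigDecimal(0))((acc, x) => acc + BigDecimal(x)) / xs.size).toDouble

val a = avgViaBigDecimal(List.fill(100)(Double.MaxValue / 2))
println(a.isInfinity) // false: the intermediate sum never overflows
```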
Rescaling reduces precision and can be tedious
When working with astronomic or microscopic scales it is likely to happen that numbers will overflow or underflow quickly.
Then it is appropriate to work with other units than the base units to compensate this.
E.g. with km instead of m.
However, you then have to take special care when multiplying those numbers in formulas, because the scale factor squares under multiplication. Naively multiplying the raw numbers gives 10 * 10 = 100, but the unit of the result is km^2, not km:
10 km * 10 km = 10,000 m * 10,000 m = 100,000,000 m^2 = 100 km^2
So keep this in mind.
Another trap is dealing with very diverse datasets, where numbers exist at all kinds of scales and quantities.
When scaling down your input domain you will lose precision, and small numbers may be cancelled out.
In some scenarios these numbers don't need to be considered because of their small impact.
However, when these small numbers occur at a high frequency and are ignored all the time, you introduce a large error in the end.
So keep this in mind as well ;)
Hope this helps

custom absolute function is not working for large long values in Scala

I have written a custom function to get the absolute value of a Long number. Below is the code:
def absolute(x: Long): Long = x match {
  case y: Long if y < 0 => -1 * y
  case y if y >= 0 => y
}
println(absolute(-9223372036854775808L))
println(absolute(-2300L))
Below is the output of above program
-9223372036854775808
2300
I am not sure why it is not working for very big long values. Any suggestions?
This is just a case of integer overflow:
scala> Long.MaxValue
res0: Long = 9223372036854775807
scala> Long.MinValue
res1: Long = -9223372036854775808
Thus when you negate -9223372036854775808 you are overflowing the Long by 1 unit, causing it to wrap around (to itself!).
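You can see the wrap-around directly, without any custom function:

```scala
// Two's complement: -Long.MinValue cannot be represented, so it wraps to itself.
println(-Long.MinValue == Long.MinValue)     // true
println(-1 * Long.MinValue == Long.MinValue) // true
println(math.abs(Long.MinValue) < 0)         // true: abs can return a negative!
```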
Also there is no need for a match here:
scala> def abs(x: Long): Long = if (x < 0) -x else x
abs: (x: Long)Long
Why not use scala.math.abs?
See scala.math

Scalacheck number generator between 0 <= x < 2^64

I'm trying to write a good number generator that covers uint64_t in C. Here is what I have so far:
def uInt64s : Gen[BigInt] = Gen.choose(0,64).map(pow2(_) - 1)
It is a good start, but it only generates numbers 2^n - 1. Is there a more effective way to generate random BigInts while preserving the number range 0 <= n < 2^64?
Okay, maybe I am missing something here, but isn't it as simple as this?
def uInt64s: Gen[BigInt] =
  Gen.chooseNum(Long.MinValue, Long.MaxValue)
    .map(x => BigInt(x) + BigInt(2).pow(63))
Longs already have the correct number of bits - just adding 2^63 so Long.MinValue becomes 0 and Long.MaxValue becomes 2^64 - 1. And doing the addition with BigInts of course.
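The endpoints of that mapping can be sanity-checked with plain BigInt arithmetic, no ScalaCheck needed:

```scala
// Long.MinValue maps to 0 and Long.MaxValue maps to 2^64 - 1,
// so the generator covers exactly the uint64_t range.
val offset = BigInt(2).pow(63)
println(BigInt(Long.MinValue) + offset)                          // 0
println(BigInt(Long.MaxValue) + offset == BigInt(2).pow(64) - 1) // true
```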
I was curious about the distribution of generated values. Apparently the distribution of chooseNum is not uniform, since it prefers special values, but the edge cases for Longs are probably also interesting for UInt64s:
/** Generates numbers within the given inclusive range, with
* extra weight on zero, +/- unity, both extremities, and any special
* numbers provided. The special numbers must lie within the given range,
* otherwise they won't be included. */
def chooseNum[T](minT: T, maxT: T, specials: T*)(
With ScalaCheck...
Generating a number from 0..Long.MaxValue is easy.
Generating an unsigned long from 0..Long.MaxValue..2^64-1 is not so easy.
Tried:
❌ Gen.chooseNum(BigInt(0),BigInt(2).pow(64)-1) Does not work: At this time there is not an implicit defined for BigInt.
❌ Arbitrary.arbBigInt.arbitrary Does not work: It's type BigInt but still limited to the range of signed Long.
✔ Generate a Long as BigInt and shift left arbitrarily to make a UInt64. Works: Taking Rickard Nilsson's ScalaCheck code as a guide, this passed the test.
This is what I came up with:
// Generate a long and map to type BigInt
def genBigInt: Gen[BigInt] = Gen.chooseNum(0, Long.MaxValue) map (x => BigInt(x))
// Take genBigInt and shift left by a chooseNum(0, 64) number of positions
def genUInt64: Gen[BigInt] = for {
  bi <- genBigInt
  n <- Gen.chooseNum(0, 64)
  x = bi << n
  if x >= 0 && x < BigInt(2).pow(64)
} yield x
...
// Use the generator, genUInt64()
As noted elsewhere in this question, the distribution of the BigInts generated is not even. The preferred generator is @stholzm's solution:
def genUInt64b : Gen[BigInt] =
Gen.chooseNum(Long.MinValue,Long.MaxValue) map (x =>
BigInt(x) + BigInt(2).pow(63))
it is simpler, the numbers fed to ScalaCheck will be more evenly distributed, it is faster, and it passes the tests.
A simpler and more efficient alternative to @stholzm's answer is as follows:
val myGen = {
val offset = -BigInt(Long.MinValue)
Arbitrary.arbitrary[Long].map { BigInt(_) + offset }
}
Generate an arbitrary Long;
Convert it to a BigInt;
Add the appropriate offset, i.e. -BigInt(Long.MinValue).
Tests in the REPL:
scala> myGen.sample
res0: Option[scala.math.BigInt] = Some(9223372036854775807)
scala> myGen.sample
res1: Option[scala.math.BigInt] = Some(12628207908230674671)
scala> myGen.sample
res2: Option[scala.math.BigInt] = Some(845964316914833060)
scala> myGen.sample
res3: Option[scala.math.BigInt] = Some(15120039215775627454)
scala> myGen.sample
res4: Option[scala.math.BigInt] = Some(0)
scala> myGen.sample
res5: Option[scala.math.BigInt] = Some(13652951502631572419)
Here is what I have so far; I'm not entirely happy with it:
/**
 * Chooses a BigInt in the range of 0 <= bigInt < 2^^64
 * @return
 */
def bigInts: Gen[BigInt] = for {
  bigInt <- Arbitrary.arbBigInt.arbitrary
  exponent <- Gen.choose(1, 2)
} yield bigInt.pow(exponent)

def positiveBigInts: Gen[BigInt] = bigInts.filter(_ >= 0)

def bigIntsUInt64Range: Gen[BigInt] = positiveBigInts.filter(_ < (BigInt(1) << 64))

/**
 * Generates a number in the range 0 <= x < 2^^64
 * then wraps it in a UInt64
 * @return
 */
def uInt64s: Gen[UInt64] = for {
  bigInt <- bigIntsUInt64Range
} yield UInt64(bigInt)
Since it appears that Arbitrary.arbBigInt.arbitrary only covers the range -2^63 <= x <= 2^63, I square x some of the time to get a number larger than 2^63.
Feel free to comment if you see a place where improvements can be made or a bug fixed.

Why does scala return an out of range value in this modulo operation?

This is a piece of code to generate random Long values within a given range, simplified for clarity:
def getLong(min: Long, max: Long): Long = {
  if (min > max) {
    throw new IncorrectBoundsException
  }
  val rangeSize = max - min + 1L
  val randValue = math.abs(Random.nextLong())
  val result = (randValue % rangeSize) + min
  result
}
I know the results of this aren't uniform and this wouldn't work correctly for some values of min and max, but that's beside the point.
In the tests it turned out, that the following assertion isn't always true:
getLong(-1L, 1L) >= -1L
More specifically the returned value is -3. How is that even possible?
As it turns out, math.abs(x: Long): Long isn't guaranteed to always return non-negative values! There is no Long value that could represent math.abs(Long.MinValue), so instead of throwing an exception, math.abs returns Long.MinValue:
scala> Long.MinValue
res27: Long = -9223372036854775808
scala> math.abs(Long.MinValue)
res28: Long = -9223372036854775808
scala> math.abs(Long.MinValue) % 3
res29: Long = -2
scala> math.abs(Long.MinValue) % 3 + (-1)
res30: Long = -3
Which is, in my opinion, a very good example of why one should be using ScalaCheck to test at least parts of their codebase.
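One way to sidestep the trap (a sketch, not the only possible fix) is to drop math.abs entirely and use Math.floorMod, which returns a non-negative result for a positive divisor even when fed Long.MinValue:

```scala
import scala.util.Random

// floorMod(n, d) lies in [0, d) for d > 0, even for n == Long.MinValue.
// Note: rangeSize itself still overflows for the full Long range;
// the simplified bounds-checking of the original is kept here.
def getLong(min: Long, max: Long): Long = {
  require(min <= max, "min must not exceed max")
  val rangeSize = max - min + 1L
  Math.floorMod(Random.nextLong(), rangeSize) + min
}

val samples = List.fill(10000)(getLong(-1L, 1L))
println(samples.forall(x => x >= -1L && x <= 1L)) // true
```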