Scala can't do an addition on a Long - scala

I'm not able to do an addition on the Long type.
Scala or the processor doesn't handle the sign correctly:
scala> var i="-1014570924054025346".toLong
i: Long = -1014570924054025346
scala> i=i+92233720368547758L
i: Long = -922337203685477588
scala> var i=9223372036854775807L
i: Long = 9223372036854775807
scala> i=i+5
i: Long = -9223372036854775804
The first test, where the negative number doesn't become positive, is a problem for me.

I have not fully understood the question, but for the first example you get the expected result. In the second example, the Long happens to be the maximum value for a Long (i.e. Long.MaxValue), so essentially when you add another positive number, it overflows:
scala> Long.MaxValue
res4: Long = 9223372036854775807
scala> Long.MaxValue + 1
res7: Long = -9223372036854775808 // which is Long.MinValue
scala> Long.MinValue + 4
res8: Long = -9223372036854775804 // which is the result that you get
In other words:
9223372036854775807L + 5
is equivalent to:
Long.MaxValue + 5
which is equivalent to:
Long.MinValue + 4 // because (Long.MaxValue + 1) = Long.MinValue
which equals -9223372036854775804
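If you would rather have such an overflow fail loudly instead of wrapping around silently, one option (not part of the original answer) is the JDK's Math.addExact, which throws an ArithmeticException on overflow. A minimal sketch:
import scala.util.Try

// Math.addExact (available since Java 8) throws instead of wrapping around.
val checked: Try[Long] = Try(Math.addExact(Long.MaxValue, 5L))
// checked is a Failure(java.lang.ArithmeticException) rather than a silently wrapped Long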

If you really need to use such big numbers, you might try using BigInt
scala> val x = BigInt(Long.MaxValue)
x: scala.math.BigInt = 9223372036854775807
scala> x + 1
res6: scala.math.BigInt = 9223372036854775808
scala> x + 5
res11: scala.math.BigInt = 9223372036854775812
scala> x + 10
res8: scala.math.BigInt = 9223372036854775817
scala> x * 1000
res10: scala.math.BigInt = 9223372036854775807000
scala> x * x
res9: scala.math.BigInt = 85070591730234615847396907784232501249
scala> x * x * x * x
res13: scala.math.BigInt = 7237005577332262210834635695349653859421902880380109739573089701262786560001
scala>
The documentation on BigInt is rather, err, small. However, I believe that it is basically an arbitrary-precision integer (it can hold as many digits as you need). Having said that, there will probably be a limit at some point. There is a comment on BigDecimal - which has more documentation - that at about 4,934 digits there might be some deviation between BigDecimal and BigInt.
I will leave it to someone else to work out whether or not x ^ 4 is the value shown above.
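(As a quick check, BigInt itself has a pow method, so x ^ 4 can be compared against the repeated multiplication directly:)
val x = BigInt(Long.MaxValue)
x.pow(4) == x * x * x * x // true, both give the value shown above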
Oh, I almost forgot your negative number test. I aligned the sum with the initialisation to make it easier to visualise that the result appears to be correct:
scala> val x = BigInt("-1014570924054025346")
x: scala.math.BigInt = -1014570924054025346
scala> x + 92233720368547758L
res15: scala.math.BigInt = -922337203685477588
scala>
As for Ints, Longs and similar data types, they are limited in size by the number of bits they occupy: in Scala an Int is 32 bits and a Long is 64 bits.
It is easier to visualise when you look at them in hexadecimal. A signed Byte (at 8 bits) has a maximum positive value of 0x7F (127). When you add one to it, you get 0x80 (-128). This is because we use the "Most Significant Bit" as an indicator of whether the number is positive or negative.
If the same byte was interpreted as unsigned, then 0x7F (127) would still become 0x80 when 1 is added to it. However, since we are interpreting it as unsigned, this would be equivalent to 128. We can keep adding one until we get to 0xFF (255) at which point if we add another 1 we will end up at 0x00 again which is of course 0.
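Here is that wrap-around spelled out in Scala (my illustration; note that Byte arithmetic is promoted to Int, so we truncate back with .toByte):
val b: Byte = 0x7F              // 127, the maximum positive Byte
val wrapped = (b + 1).toByte    // -128 (0x80), i.e. Byte.MinValue
val unsigned = wrapped & 0xFF   // 128 when the same bit pattern is read as unsigned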
Here are some references that explain this in much more detail:
Wikipedia - Two's complement
Cornell University - What is two's complement?
Stack Overflow - What is 2's complement?

Related

Averaging a very long List[Double] without getting Infinity in Scala

I have a very long list of doubles that I need to average but I can't sum them within the double data type so when I go to divide I still get Infinity.
def applyToMap(list: Map[String, List[Map[String, String]]], f: Map[String, String] => Double): Map[String, Double] = {
  val mSLD = list.mapValues(lm => lm.map(f))
  mSLD.mapValues(ld => ld.sum / ld.size)
}
This leaves me with a Map[String, Double] that are all Key -> Infinity
You could use fold to compute an average as you go. Rather than doing sum / size you should count your way through the items with n, and for each one adjust your accumulator with acc = (acc * n/(n+1)) + (item * 1/(n+1))
Here’s the general scala code:
val average = seq.foldLeft((0.0, 1)) ((acc, i) => ((acc._1 + (i - acc._1) / acc._2), acc._2 + 1))._1
Taken from here.
You'd probably still have precision difficulty if the list is really long, as you'd be dividing by an increasingly large number. To be really safe you should break the list into sublists and compute the average of the averages of the sublists. Make sure the sublists are all the same length, though, or do a weighted average based on their sizes.
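A rough sketch of that sublist idea (the helper name and chunk size are mine, and it is only exact when all chunks end up the same length):
// Average each chunk first (the per-chunk sums stay small),
// then average the chunk averages.
def chunkedAverage(xs: List[Double], chunkSize: Int = 1000): Double = {
  val chunkAverages = xs.grouped(chunkSize).map(c => c.sum / c.size).toList
  chunkAverages.sum / chunkAverages.size
}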
Interested in implementing gandaliter's solution, I came up with the following. (Since I'm not the best of friends with Doubles, I tried to find an easy-to-follow numeric example with Bytes.) First, I generate 10 Bytes in the range 75..124, so every value is close to, but below, Byte.MaxValue, and the average is roughly 100 for a simple sanity check:
val rnd = util.Random
val is = (1 to 10).map(i => (rnd.nextInt(50) + 75).toByte)
// = Vector(99, 122, 99, 105, 102, 104, 122, 99, 87, 114)
The 1st algorithm multiplies before dividing (which increases the danger of exceeding Byte.MaxValue); the 2nd divides before multiplying, which leads to rounding errors.
def slidingAvg0(sofar: Byte, x: Byte, cnt: Byte): (Byte, Byte) = {
  val acc: Byte = ((sofar * cnt).toByte / (cnt + 1).toByte + (x / (cnt + 1).toByte).toByte).toByte
  println(acc)
  (acc, (cnt + 1).toByte)
}

def slidingAvg1(sofar: Byte, x: Byte, cnt: Byte): (Byte, Byte) = {
  val acc: Byte = (((sofar / (cnt + 1).toByte).toByte * cnt).toByte + (x / (cnt + 1).toByte).toByte).toByte
  println(acc)
  (acc, (cnt + 1).toByte)
}
This is foldLeft in Scala:
((is.head, 1.toByte) /: is.tail) { case ((sofar, cnt), x) => slidingAvg0 (sofar, x, cnt)}
110
21
41
2
18
32
8
16
0
scala> ((is.head, 1.toByte) /: is.tail) { case ((sofar, cnt), x) => slidingAvg1 (sofar, x, cnt)}
110
105
104
100
97
95
89
81
83
Since 10 values are far too few to rely on the average being close to 100, let's look at the sum as an Int:
is.map (_.toInt).sum
res65: Int = 1053
The drift is pretty significant (it should be about 105, but it is 0 and 83, respectively).
Whether the findings transfer from Bytes/Ints to Doubles is another question. And I'm not 100% confident that my parentheses mirror the evaluation order, but IMHO, for multiplication/division of the same precedence it is left to right.
So the original formulas were:
acc = (acc * n/(n+1)) + (item * 1/(n+1))
acc = (acc /(n+1) *n) + (item/(n+1))
If I understand the OP correctly, the amount of data doesn't seem to be the problem, otherwise it wouldn't fit into memory.
So I will concentrate on the data types only.
Summary
My suggestion is to go with BigDecimal instead of Double.
Especially if you are adding reasonably high values.
The only significant drawbacks are the performance cost and a small amount of cluttered syntax.
Alternatively you can rescale your input upfront, but this will degrade precision and requires special care with post-processing.
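As a minimal sketch of the BigDecimal route applied to the OP's averaging step (the function name is mine, not from the question):
// Sum as BigDecimal so the intermediate sum cannot overflow to Infinity,
// then convert back to Double only at the very end.
def bigDecimalAverage(ds: List[Double]): Double =
  if (ds.isEmpty) 0.0
  else (ds.map(BigDecimal(_)).sum / ds.size).toDouble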
Double breaks at some scale
scala> :paste
// Entering paste mode (ctrl-D to finish)
val res0 = (Double.MaxValue + 1) == Double.MaxValue
val res1 = Double.MaxValue/10 == Double.MaxValue
val res2 = List.fill(11)(Double.MaxValue/10).sum
val res3 = List.fill(10)(Double.MaxValue/10).sum == Double.MaxValue
val res4 = (List.fill(10)(Double.MaxValue/10).sum + 1) == Double.MaxValue
// Exiting paste mode, now interpreting.
res0: Boolean = true
res1: Boolean = false
res2: Double = Infinity
res3: Boolean = true
res4: Boolean = true
Take a look at these simple Double arithmetic examples in your Scala REPL:
Double.MaxValue + 1 numerically cancels out and nothing gets added, so it is still the same as Double.MaxValue.
Double.MaxValue/10 behaves as expected and does not equal Double.MaxValue.
Adding Double.MaxValue/10 eleven times overflows to Infinity.
Adding Double.MaxValue/10 ten times doesn't break the arithmetic and evaluates to Double.MaxValue again.
The summed Double.MaxValue/10 then behaves exactly like Double.MaxValue itself: adding 1 to it changes nothing.
BigDecimal works on all scales but is slower
scala> :paste
// Entering paste mode (ctrl-D to finish)
val res0 = (BigDecimal(Double.MaxValue) + 1) == BigDecimal(Double.MaxValue)
val res1 = BigDecimal(Double.MaxValue)/10 == BigDecimal(Double.MaxValue)
val res2 = List.fill(11)(BigDecimal(Double.MaxValue)/10).sum
val res3 = List.fill(10)(BigDecimal(Double.MaxValue)/10).sum == BigDecimal(Double.MaxValue)
val res4 = (List.fill(10)(BigDecimal(Double.MaxValue)/10).sum + 1) == BigDecimal(Double.MaxValue)
// Exiting paste mode, now interpreting.
res0: Boolean = false
res1: Boolean = false
res2: scala.math.BigDecimal = 197746244834854727000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
res3: Boolean = true
res4: Boolean = false
Now compare these results with the ones above from Double.
As you can see everything works as expected.
Rescaling reduces precision and can be tedious
When working with astronomic or microscopic scales it is likely that numbers will overflow or underflow quickly.
Then it is appropriate to work with units other than the base units to compensate for this.
E.g. with km instead of m.
However, then you will have to take special care when multiplying those numbers in formulas.
10 km * 10 km = 100 km²
which in base units is
10,000 m * 10,000 m = 100,000,000 m², i.e. the rescaling factor enters squared.
So keep this in mind.
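A tiny hypothetical example of that bookkeeping (the numbers and the factor are made up for illustration):
val sideMeters = 10000.0            // 10 km expressed in the base unit
val sideKm     = sideMeters * 1e-3  // rescaled input: 10.0
val areaKm2    = sideKm * sideKm    // 100.0, but the unit is now km^2
// converting the area back to m^2 needs the factor squared: 1e3 * 1e3 = 1e6
val areaM2     = areaKm2 * 1e6      // 1.0E8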
Another trap is when dealing with very diverse datasets where numbers exist in all kinds of scales and quantities.
When scaling down your input domain you will lose precision and small numbers may be cancelled out.
In some scenarios these numbers don't need to be considered because of their small impact.
However, when these small numbers occur with high frequency and are ignored every time, you will introduce a large error in the end.
So keep this in mind as well ;)
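A REPL-sized illustration of that cancellation effect (my example):
// A Double carries roughly 15-16 significant decimal digits, so adding a
// small number to a huge one can be a complete no-op:
val big = 1e17
big + 1.0 == big   // true: the +1 is cancelled out entirely
// Ignore such a "negligible" contribution a billion times and you have silently dropped 1e9.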
Hope this helps

custom absolute function is not working for large long values in Scala

I have written a custom function to get the absolute value of a Long. Below is the code:
def absolute(x: Long): Long = x match {
  case y: Long if (y < 0) => -1 * y
  case y if (y >= 0) => y
}
println(absolute(-9223372036854775808L))
println(absolute(-2300L))
Below is the output of the above program:
-9223372036854775808
2300
I am not sure why it is not working for very big Long values. Any suggestions on the same?
This is just a case of integer overflow:
scala> Long.MaxValue
res0: Long = 9223372036854775807
scala> Long.MinValue
res1: Long = -9223372036854775808
Thus when you negate -9223372036854775808 you are overflowing the Long by 1 unit, causing it to wrap around (to itself!).
Also there is no need for a match here:
scala> def abs(x: Long): Long = if (x < 0) -x else x
abs: (x: Long)Long
Why not use scala.math.abs?
See scala.math
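Note that scala.math.abs has the same corner case for Long.MinValue. If you want the overflow to surface instead of silently getting a negative result, one option (my sketch, not from the answers above) is the JDK's Math.negateExact, which throws for Long.MinValue:
// negateExact (JDK 8+) throws ArithmeticException instead of wrapping around.
def absOrThrow(x: Long): Long =
  if (x < 0) Math.negateExact(x) else x

absOrThrow(-2300L)        // 2300
// absOrThrow(Long.MinValue) would throw java.lang.ArithmeticException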

How to convert Hex string to Hex value in scala?

I am new to Scala and was trying out a few basic concepts. I have an integer value x that I am trying to convert into a hex value using the following command:
val y = Integer.toHexString(x)
This value gives me the hex number in a string format. However I want to get the hex value as a value and not a string. I could write some code for it, but I was wondering if there was some direct command available to do this? Any help is appreciated.
Edit: For example, with an integer value of, say, x = 38:
val y = Integer.toHexString(38)
y is "26" which is a string. I want to use the hex value 0x26 (not the string) to do bitwise AND operations.
Hex is simply a representation of a numerical value in base 16. You don't need a numeric value in hexadecimal representation to do bitwise operations on it. In memory, a 32-bit integer is stored in binary, which is just another way of representing that same number, only in a different base. For example, if you have the number 4 (0100 in binary, 0x4 in hex) as a variable in Scala, you can do bitwise AND on it using the & operator:
scala> val y = 4
y: Int = 4
scala> y & 6
res0: Int = 4
scala> y & 2
res1: Int = 0
scala> y & 0x4
res5: Int = 4
Same goes for bitwise OR (|) operations:
scala> y | 2
res2: Int = 6
scala> y | 4
res3: Int = 4
You do not need to convert the integer to a "hex value" to do bitwise operations. You can just do:
val x = 38
val y = 64
x | y
In fact, there is no such thing as a "hex value" in memory. Every integer is stored in binary. If you want to write an integer literal in hex, you can prefix it with 0x:
val x = 0x38 // 56 in decimal.
x | 0x10 // Turn on 5th bit.
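And if you ever do start from a hex string such as "26", you can parse it back into a plain Int with the JDK's Integer.parseInt and a radix of 16, then use it in bitwise expressions as usual:
val fromHex = Integer.parseInt("26", 16)   // 38
fromHex & 0x0F                             // 6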

Why does scala return an out of range value in this modulo operation?

This is a piece of code to generate random Long values within a given range, simplified for clarity:
def getLong(min: Long, max: Long): Long = {
  if (min > max) {
    throw new IncorrectBoundsException
  }
  val rangeSize = max - min + 1L
  val randValue = math.abs(Random.nextLong())
  val result = (randValue % rangeSize) + min
  result
}
I know the results of this aren't uniform and this wouldn't work correctly for some values of min and max, but that's beside the point.
In the tests it turned out, that the following assertion isn't always true:
getLong(-1L, 1L) >= -1L
More specifically the returned value is -3. How is that even possible?
As it turns out, math.abs(x: Long): Long isn't guaranteed to always return non-negative values! There is no Long value that could represent math.abs(Long.MinValue), so instead of throwing an exception, math.abs returns Long.MinValue:
scala> Long.MinValue
res27: Long = -9223372036854775808
scala> math.abs(Long.MinValue)
res28: Long = -9223372036854775808
scala> math.abs(Long.MinValue) % 3
res29: Long = -2
scala> math.abs(Long.MinValue) % 3 + (-1)
res30: Long = -3
Which is, in my opinion, a very good example of why one should be using ScalaCheck to test at least parts of their codebase.
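For completeness, one way around the math.abs trap (my sketch, not part of the answer) is to mask the sign bit away; on Scala 2.13+ you can also simply use Random.between, which handles the range for you:
import scala.util.Random

// Masking with Long.MaxValue clears the sign bit, so the value is
// non-negative even when nextLong() returns Long.MinValue.
def getLongSafe(min: Long, max: Long): Long = {
  require(min <= max)
  val rangeSize = max - min + 1L
  val nonNegative = Random.nextLong() & Long.MaxValue
  (nonNegative % rangeSize) + min
}

// Scala 2.13+: Random.between(minInclusive, maxExclusive)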

Avoiding Overflow

This Stack Overflow post discusses the potential problem of numeric overflow when not appending L to a number:
Here's an example from the REPL:
scala> 100000 * 100000 // no type specified, so the numbers are Ints
res0: Int = 1410065408
One way to avoid this problem is to use the L suffix:
scala> 100000L * 100000L
res1: Long = 10000000000
Or to specify the number's type:
scala> val x: Long = 100000
x: Long = 100000
scala> x * x
res2: Long = 10000000000
What's considered the best practice to properly specify a number's type?
You should always use L if you are using a Long. Otherwise, you can still have problems:
scala> val x: Long = 10000000000
<console>:1: error: integer number too large
val x: Long = 10000000000
^
scala> val x = 10000000000L
x: Long = 10000000000
The conversion due to the type ascription happens only after the literal has been interpreted as an Int.
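Note that the ascription also doesn't rescue you from Int arithmetic that happens before the widening, which is why putting L (or an explicit toLong) on the operands is the safer habit:
// The multiplication is still carried out on Ints and overflows
// before the result is widened to Long:
val bad: Long  = 100000 * 100000    // 1410065408
val good: Long = 100000L * 100000   // 10000000000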