New to Scala and am trying to come up with a library in Scala to check if the double value being passed is of a certain precision and scale. What I noticed was that if the value being passed is 1.00001 then I get the value as that in my called function, but if the value being passed is 0.00001 then I get the value as 1.0E-5, Is there any way to preserve the number in Scala?
def checkPrecisionAndScaleFormat(precision: Int, scale: Int)(valueToCheck: Double): Boolean = {
val value = BigDecimal(valueToCheck)
value.precision <= precision && value.scale <= scale
}
What I noticed was that if the value being passed is 1.00001 then I get the value as that in my called function, but if the value being passed is 0.00001 then I get the value as 1.0E-5
From your phrasing, it seems like you see 1.00001 and 1.0E-5 when debugging (either by printing or in the debugger). It's important to understand that
this doesn't correspond to any internal difference, it's just how Double.toString is defined in Java.
when you do something like val x = 1.00001, the value isn't exactly 1.00001 but the closest number representable as a Double: 1.000010000000000065512040237081237137317657470703125. The easiest way to see the exact value is actually looking at BigDecimal.exact(valueToCheck).
The only way to preserve the number is not to work with Double to begin with. If it's passed as a string, create the BigDecimal from the string. If it's the result of some calculations as a double, consider doing them on BigDecimals instead. But string representation of a Double simply doesn't carry the information you want.
Related
I am new to Scala and I would like to understand some basic stuff.
First of all, I need to calculate the average of a certain column of a DataFrame and use the result as a double type variable.
After some Internet research I was able to calculate the average and at the same time pass it into a List type Any by using the following command:
val avgX_List = mainDataFrame.groupBy().agg(mean("_c1")).collect().map(_(0)).toList
where "_c1" is the second column of my dataframe. This line of code returns a List with type List[Any].
To pass the result into a variable I used the following command:
var avgX = avgX_List(0)
hoping that the var avgX would be type double automatically but that didn't happen obviously.
So now let the questions begin:
What does map(_(0)) do? I know the basic definition of the map() transformation but I can't find an explanation with this exact argument
I know that by using .toList method in the end of the command my result will be a List with type Any. Is there a way that I could change this into List which contains type Double elements? Or even convert this one
Do you think that it would be much more appropriate to pass the column of my Dataframe into a List[Double] and then calculate the average of its elements?
Is the solution I showed above at any point of view correct based on my problem? I know that "it is working" is different from "correct solution"?
Summing up, I need to calculate the average of a certain column of a Dataframe and have the result as a double type variable.
Note that: I am Greek and I find it hard sometimes to understand some English coding "slang".
map(_(0)) is a shortcut for map( (r: Row) => r(0) ), which is in turn a shortcut for map( (r: Row) => r.apply(0) ). The apply method returns Any, and so you are losing the right type. Try using map(_.getAs[Double](0)) or map(_.getDouble(0)) instead.
Collecting all entries of the column and then computing the average would be highly counterproductive, because you'd have to send huge amounts of data to the master node, and then do all the calculations on this single central node. That would be the exact opposite of what Spark is good for.
You also don't need collect(...).toList, because you can access the 0-th entry directly (it doesn't matter whether you get it from an Array or from a List). Since you are collapsing everything into a single Row anyway, you could get rid of the map step entirely by reordering the methods a little bit:
val avgX = mainDataFrame.groupBy().agg(mean("_c1")).collect()(0).getDouble(0)
It can be written even shorter using the first method:
val avgX = mainDataFrame.groupBy().agg(mean("_c1")).first().getDouble(0)
#Any dataType in Scala can't be directly converted to Double.
#Use toString & then toDouble on final captured result.
#Eg-
#scala> x
#res22: Any = 1.0
#scala> x.toString.toDouble
#res23: Double = 1.0
#Note- Instead of using map().toList() directly use (0)(0) to get the final value from your resultset.
#TestSample(Scala)-
val wa = Array("one","two","two")
val wrdd = sc.parallelize(wa,3).map(x=>(x,1))
val wdf = wrdd.toDF("col1","col2")
val x = wdf.groupBy().agg(mean("col2")).collect()(0)(0).toString.toDouble
#O/p-
#scala> val x = wdf.groupBy().agg(mean("col2")).collect()(0)(0).toString.toDouble
#x: Double = 1.0
When I do example from tutorial, I get some issue from constants variables topic.
If someone explain my example I'll be appreciate for this.
When you don't specify a type, a floating point number literal will be inferred to be of type Double.
Double, as its name suggests, has double precision than Float. So when you do:
let a = 64.1
The actual value in memory may be something like 64.099999999999991. Since Double shows only 16 significant digits, it shows 64.09999999999999, rounding off the last "1".
Why does let b: Float = 64.1 show the correct number?
When you specify the type to float, the precision decreases. Float only shows 8 significant digits. That's 64.099999, but there's a "9" straight after that, so it rounds it up to get 64.1.
This has nothing to do with explicitly stating the variable type. Try specifying it to be a Double:
let b: Double = 64.1
It will still show 64.09999999999999.
Looking at various posts on this topic but still no luck. Is there a simple way to make division/conversion when dividing Double (or Float) with Int? Here is a simple example in playground returning and error "Double is not convertible to UInt8".
var score:Double = 3.00
var length:Int = 2 // it is taken from some an array lenght and does not return decimal or float
var result:Double = (score / length )
Cast the int to double with var result:Double=(score/Double(length))
What this will do is before computing the division it will create a new Double variable with int inside parentheses hence constructor like syntax.
You cannot combine or use different variable types together.
You need to convert them all to the same type, to be able to divide them together.
The easiest way I see to make that happen, would be to make the Int a Double.
You can do that quite simply do that by adding a ".0" on the end of the Integer you want to convert.
Also, FYI:
Floats are pretty rarely used, so unless you're using them for something specific, its also just more fluid to use more common variables.
I am confused by what is returned when performing number operations in Swift between various types. Consider the following:
var castedFoo = Float(7.0/5.0) // returns 1.39999997...
var specifiedTypeFoo:Float = 7/5.0 //returns 1.39999997...
var foo = (7/5.0) //returns 1.4
What separates the first two from the last one? They are all returning floats, so why is the value from the last one rounded? I understand that the first is casted and the second explicitly specified to be a Float, but the last one also returns a Float value. So what makes the difference here?
According to Swift documentation,
Unless otherwise specified, the default type of a floating-point literal is the Swift standard library type Double, which represents a 64-bit floating-point number.
In other words, the literal 5.0 is of type Double.
Your first two examples set the result type to Float; your last example keeps the type of the result a Double, because the result of the division of an Int and a Double is a Double. Because of that difference, the last result has higher precision.
When i am checking values inside scala interpreter like:
scala> 1==1.0000000000000001
res1: Boolean = true
scala> 1==1.000000000000001
res2: Boolean = false
Here I am not getting clear view related with "how does scala compiler interpret these as integer or floating points or doubles(and comparing)" .
It is not really Scala related, it is more of a ieee-754 floating-point arithmetic issue. First of all when comparing Int with Double it will cast Int to Double (always safe). The second case is obvious - values are different.
What happens with the first case is that Double type is not capable of storing that many significant digits (17 in your case, 64-bit floating point can store up-to 16 decimal digits) so it rounds the value to 1. And 1 == 1.