Scala - error: type not found - scala

I am new to Scala and I have an error that I cannot understand. Here is my RDD of Ints (the numbers from 1 to 100):
val rdd = sc.parallelize(1 to 100)
Next I wrote a function that returns the maximum value:
rdd.reduce((x, y) => x > y ? x : y)
But I always get this error:
<console>:30: error: not found: type y
rdd.reduce((x, y) => x > y ? x : y)
^
I don't really know what the error means, so I can't find a solution. But if I write my function like this, it works:
rdd.reduce((x, y) => if(x > y) x else y)
Thank you for your answers !

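Scala has no ? : ternary operator. The parser treats everything before the colon as an expression and the y after it as a type ascription, which is why it complains that the type y is not found. Use if instead: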
rdd.reduce((x, y) => if (x > y) x else y)
Or use max instead of building it on your own:
rdd.reduce((x, y) => x max y)
Or with _ syntax for anonymous function:
rdd.reduce(_ max _)
Or skip reduce entirely and use the built-in max:
rdd.max
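As a rough sketch, the four variants above can be checked on a plain Scala collection without Spark. The object name MaxDemo and the List stand-in for the RDD are assumptions made for illustration only:

```scala
object MaxDemo {
  // Plain Scala stand-in for the RDD in the question (assumption: no Spark needed here).
  val numbers: List[Int] = (1 to 100).toList

  // if is an expression in Scala, so it plays the role of the ternary operator.
  val viaIf: Int = numbers.reduce((x, y) => if (x > y) x else y)

  // Int has a max method, usable in infix position.
  val viaMaxMethod: Int = numbers.reduce((x, y) => x max y)

  // Placeholder syntax for the same anonymous function.
  val viaPlaceholder: Int = numbers.reduce(_ max _)

  // Built-in maximum of the collection.
  val viaBuiltIn: Int = numbers.max

  def main(args: Array[String]): Unit =
    println(List(viaIf, viaMaxMethod, viaPlaceholder, viaBuiltIn))
}
```

All four values should agree, since each one computes the maximum of the same list.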

Related

add values from struct keys spark

I have the following code:
val df2 = df.withColumn("col", expr("transform(col, x -> struct(x.amt as amt))"))
Output: [{"amt": 10000}, {"amt": 20000}]
I want to add all the values for amt key. So I am getting all the values into a list as below:
df.withColumn("list_val", expr("transform(col, x -> x.amt)"))
Output: [10000,20000]
To sum the values, I have the following code, but I get the error "cannot resolve aggregate":
.withColumn("amount", aggregate($"list_val", lit(0), (x, y) => (x + y)))
How do I fix this code or is there any better way to add the values?
In Spark 2.4, aggregate is only available as a Spark SQL function, so it has to be used inside expr. It is also safer to add a type cast so that the accumulator and element types cannot mismatch:
df.withColumn("amount", expr("aggregate(list_val, 0, (x, y) -> (x + int(y)))"))
// for float type; for double type, replace "float" with "double"
df.withColumn("amount", expr("aggregate(list_val, float(0), (x, y) -> (x + float(y)))"))
In the Scala API (org.apache.spark.sql.functions.aggregate, available from Spark 3.0), that would be:
df.withColumn("amount", aggregate($"list_val", lit(0), (x, y) => x + y.cast("int")))
df.withColumn("amount", aggregate($"list_val", lit(0f), (x, y) => x + y.cast("float")))
df.withColumn("amount", aggregate($"list_val", lit(0.0), (x, y) => x + y.cast("double")))
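Conceptually, the aggregate higher-order function is a left fold over the array: it starts from the initial value and merges one element at a time. A minimal sketch of the same idea on a plain Scala list, with no Spark involved (FoldDemo and its field names are illustrative assumptions, not part of any Spark API):

```scala
object FoldDemo {
  // Stand-in for the list_val column of a single row.
  val listVal: List[Int] = List(10000, 20000)

  // Same shape as aggregate(list_val, 0, (x, y) -> x + y):
  // initial accumulator 0, then add each element in turn.
  val amount: Int = listVal.foldLeft(0)((acc, y) => acc + y)

  // With a Double accumulator the element is converted explicitly,
  // mirroring the cast in the SQL expression.
  val amountDouble: Double = listVal.foldLeft(0.0)((acc, y) => acc + y.toDouble)
}
```

The accumulator type is fixed by the initial value, which is why the initial value and the cast have to agree in the Spark version as well.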

Adding elements to a list in a for loop

var locations: List[Location] = List[Location]()
for (x <- 0 to 10; y <- 0 to 10) {
  println("x: " + x + " y: " + y)
  locations ::: List(Location(x, y))
  println(locations)
}
The code above is supposed to concatenate some lists. But the result is an empty list. Why?
Your mistake is on the line locations ::: List(Location(x, y)). This concatenates the lists but then does nothing with the result, because ::: returns a new list and leaves locations unchanged. If you replace it with locations = locations ::: List(Location(x, y)), you will get your desired result.
However there are more idiomatic ways to solve this problem in Scala. In Scala, writing immutable code is the preferred style (i.e. use val rather than var where possible).
Here's a couple of ways to do it:
Using yield:
val location = for (x <- 0 to 10; y <- 0 to 10) yield Location(x, y)
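Using tabulate (with two dimensions it returns a nested List[List[Location]], so flatten it to get a flat list):
val location = List.tabulate(11, 11) { case (x, y) => Location(x, y) }.flatten
Even shorter:
val location = List.tabulate(11, 11)(Location).flatten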
Edit: just noticed you had 0 to 10 which is inclusive-inclusive. 0 until 10 is inclusive-exclusive. I've changed the args to tabulate to 11.
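To compare the approaches side by side, here is a small self-contained sketch. The Location case class is not shown in the question, so its definition here is an assumption, as are the names LocationDemo, viaVar, viaYield and viaTabulate:

```scala
object LocationDemo {
  // Assumed shape of Location; the question does not show its definition.
  final case class Location(x: Int, y: Int)

  // The original loop, fixed by assigning the concatenation result back.
  def viaVar: List[Location] = {
    var locations = List.empty[Location]
    for (x <- 0 to 10; y <- 0 to 10) {
      locations = locations ::: List(Location(x, y))
    }
    locations
  }

  // for/yield builds the collection directly, without mutation.
  val viaYield: List[Location] =
    (for (x <- 0 to 10; y <- 0 to 10) yield Location(x, y)).toList

  // Two-dimensional tabulate is nested, so it is flattened here.
  val viaTabulate: List[Location] =
    List.tabulate(11, 11)((x, y) => Location(x, y)).flatten
}
```

All three produce the same 121 locations in the same order. Appending with ::: inside a loop copies the list on every step, which is one more reason to prefer the immutable versions.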

Use of the forall construct in Stainless

I'm trying to prove in Stainless that if two lists have the same contents and one list is bounded by x, then the other list is also bounded by x. To do so, I'm told to use the construct:
forall(x => list.content.contains(x) ==> p(x))
The lemma would be written (in a verbose way) as:
def lowerBoundLemma(l1: List[BigInt],l2: List[BigInt],x:BigInt) : Boolean = {
  require(l1.content == l2.content && forall(y => l1.content.contains(y) ==> y <= x))
  forall(z => l2.content.contains(z) ==> z <= x) because{
    forall(z => l2.content.contains(z) ==> z <= x) ==| l1.content == l2.content |
    forall(z => l1.content.contains(z) ==> z <= x) ==| trivial |
    forall(y => l1.content.contains(z) ==> y <= x)
  }
}.holds
The problem is that I get the following errors:
exercise.scala:12:48: error: missing parameter type
require(l1.content == l2.content && forall(y => l1.content.contains(y) ==> y <= x))
Once I add the type to y, I get this error (pointing to the opening parenthesis of the contains call):
exercise.scala:12:81: error: ')' expected but '(' found.
require(l1.content == l2.content && forall(y : BigInt => l1.content.contains(y) ==> y <= x))
Any idea why this is happening?
I also tried the syntax l.forall(_ <= x) but I get errors when combining with constructs like because and ==| of the type: because is not a member of Boolean.
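The issues you are facing come from the Scala compiler frontend to Stainless. In Scala, the syntax for a closure with a specified parameter type is (x: Type) => body (note the extra parentheses!). Without them, y : BigInt => ... is parsed as y ascribed with a function type starting at BigInt =>, which is why the parser fails at the next opening parenthesis.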
If you want to use because and ==|, you'll have to add import stainless.proof._ at the beginning of your source file.
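The parenthesization rule is plain Scala and can be tried without Stainless. The following is only a sketch using the standard library's List.forall rather than Stainless's forall; the object and value names are made up for illustration:

```scala
object ClosureSyntaxDemo {
  val l1: List[BigInt] = List(BigInt(1), BigInt(2), BigInt(3))
  val x: BigInt = BigInt(5)

  // Typed parameter: the parentheses around (y: BigInt) are required.
  val allBounded: Boolean = l1.forall((y: BigInt) => y <= x)

  // Untyped parameter: allowed here because the compiler can infer
  // the parameter type from the signature of List.forall.
  val sameWithInference: Boolean = l1.forall(y => y <= x)
}
```

The "missing parameter type" error in the question appears when no such inference is possible, so the explicit (y: BigInt) form is the one to use there.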

scala parallel collections not consistent

I am getting inconsistent answers from the following code which I find odd.
import scala.math.pow
val p = 2
val a = Array(1,2,3)
println(a.par
  .aggregate("0")((x, y) => s"$y pow $p; ", (x, y) => x + y))
for (i <- 1 to 100) {
  println(a.par
    .aggregate(0.0)((x, y) => pow(y, p), (x, y) => x + y) == 14)
}
a.map(x => pow(x,p)).sum
The a.par.aggregate call sometimes computes 14 and sometimes 10. Can anyone explain why the result is inconsistent?
In your "seqop" function, that is the first function you pass to aggregate, you define the logic that is used to combine elements within the same partition. Your function looks like this:
(x, y) => pow(y, p)
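The problem is that you don't accumulate the results within a partition. Instead, you throw away your accumulator x, so each partition only contributes the square of its last element. Every time you get 10 as a result, 2 and 3 ended up in the same partition and the calculation 2^2 was dropped, leaving 1 + 9.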
If you change your function to take the accumulated value into account, you will get 14 every time:
(x, y) => x + pow(y, p)
The correct way to use aggregate is
a.par.aggregate(0.0)(
(acc, value) => acc + pow(value, 2), (acc1, acc2) => acc1 + acc2
)
By using (x, y) => pow(y, 2), you did not add the item to the accumulator; you replaced the accumulator with pow(y, 2).
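To see why the result depends on how the collection is split, the partitioning can be simulated by hand with ordinary lists. This is only a sketch of the idea, not how the parallel collections library is implemented; aggregateOver and the chosen splits are hypothetical names and values:

```scala
object AggregateDemo {
  // Fold each simulated partition with seqop starting from 0.0,
  // then combine the partial results by summing them.
  def aggregateOver(partitions: List[List[Int]])(seqop: (Double, Int) => Double): Double =
    partitions.map(_.foldLeft(0.0)(seqop)).sum

  // The seqop from the question: ignores the accumulator.
  val broken: (Double, Int) => Double = (_, y) => math.pow(y, 2)

  // The corrected seqop: adds to the accumulator.
  val fixed: (Double, Int) => Double = (acc, y) => acc + math.pow(y, 2)

  // Three possible ways Array(1, 2, 3) could be split into partitions.
  val splits: List[List[List[Int]]] = List(
    List(List(1), List(2), List(3)),
    List(List(1), List(2, 3)),
    List(List(1, 2), List(3))
  )

  val brokenResults: List[Double] = splits.map(p => aggregateOver(p)(broken))
  val fixedResults: List[Double] = splits.map(p => aggregateOver(p)(fixed))
}
```

With the broken seqop the three splits give 14, 10 and 13 respectively, while the fixed seqop gives 14 for every split.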

syntax explanation for pattern matching a list in scala

I was reading this blog post and I was not able to understand a part of the code.
object O {
  def maximum(x: List[Int]): Int = x match {
    case Nil => error("maximum undefined for empty list")
    case x :: y :: ys => maximum((if(x > y) x else y) :: ys)
    case x :: _ => x
  }
}
Please explain the code maximum((if(x > y) x else y) :: ys).
How can the if condition be part of the argument to the method maximum?
As I understand it, an if condition is not exactly a parameter.
In Scala, if is an expression, not a statement.
Try this in the REPL:
scala> val x=1; val y=0
x: Int = 1
y: Int = 0
scala> val test=if(x > y) x else y
test: Int = 1
The if expression evaluates to 1, and 1 is assigned to test. In Java, this would be expressed with the conditional operator: (x > y) ? x : y
Now, you have a function called maximum that takes a List[Int] as a parameter.
maximum((if(x > y) x else y) :: ys) calls maximum (recursively) with a list obtained by prepending the larger of x and y (whichever the if evaluates to) to ys.
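A runnable sketch of the same function, with the recursion traced in comments. It uses sys.error in place of the bare error call from the blog snippet and renames the parameter to xs to avoid shadowing; both changes, and the object name MaximumDemo, are assumptions made for this illustration:

```scala
object MaximumDemo {
  def maximum(xs: List[Int]): Int = xs match {
    case Nil => sys.error("maximum undefined for empty list")
    // Keep the larger of the first two elements and recurse on a shorter list.
    case x :: y :: rest => maximum((if (x > y) x else y) :: rest)
    // Exactly one element left: that is the maximum.
    case x :: _ => x
  }

  // Trace for List(3, 7, 2):
  //   maximum(List(3, 7, 2)) -> maximum(7 :: List(2))
  //   maximum(List(7, 2))    -> maximum(7 :: Nil)
  //   maximum(List(7))       -> 7
  def main(args: Array[String]): Unit =
    println(maximum(List(3, 7, 2)))
}
```

Each recursive call shortens the list by one element, so the function terminates once a single element remains.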