In Scala 3, I'm able to write a polymorphic function literal like this:
val y = [C <: Int] => (x: C) => x * 2
When I try to generalise it with a second, nested type parameter clause:
val z = [C <: Int] => ([D <: Int] => (x: C, y: D) => x * y)
I got the following error:
DependentPoly.scala:19:37: Implementation restriction: polymorphic function literals must have a value parameter
So is this feature not implemented? Or am I not writing it properly?
Implementation restriction: polymorphic function literals must have a value parameter means that
val y = [C <: Int] => foo[C]
is illegal (for example for def foo[C <: Int]: C => Int = _ * 2) while
val y = [C <: Int] => (x: C) => x * 2
is legal.
Similarly,
val z = [C <: Int] => [D <: Int] => (x: C, y: D) => x * y
val z = [C <: Int] => [D <: Int] => (x: C) => (y: D) => x * y
are illegal while
val z = [C <: Int, D <: Int] => (x: C, y: D) => x * y
val z = [C <: Int, D <: Int] => (x: C) => (y: D) => x * y
val z = [C <: Int] => (x: C) => [D <: Int] => (y: D) => x * y
val z = [C <: Int] => (_: C) => [D <: Int] => (x: C, y: D) => x * y
val z = [C <: Int] => (_: C) => [D <: Int] => (x: C) => (y: D) => x * y
are legal.
This is because polymorphic function literals are implemented in terms of the PolyFunction trait, whose apply method takes a value parameter:
trait PolyFunction:
  def apply[A](x: A): B[A]
https://docs.scala-lang.org/scala3/reference/new-types/polymorphic-function-types.html
https://github.com/lampepfl/dotty/pull/4672
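Both the workaround for the illegal foo case and the legal single-clause form can be exercised directly; a minimal sketch, assuming Scala 3:

```scala
def foo[C <: Int]: C => Int = _ * 2

// `[C <: Int] => foo[C]` is rejected, but giving the literal an
// explicit value parameter and applying foo inside it is legal:
val y = [C <: Int] => (x: C) => foo[C](x)

// Legal: a single type parameter clause with two parameters,
// followed by a value parameter clause.
val z = [C <: Int, D <: Int] => (x: C, y: D) => x * y

println(y(21)) // prints 42
println(z(3, 4)) // prints 12
```

At the call site the type arguments are inferred from the value arguments, so both y and z are used like ordinary functions.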
I have been trying to learn pattern matching and pairs in Scala, and to use them to implement merge sort. But the pattern match that should extract the head and tail of each list doesn't work. What am I missing in the code below?
def merge(xs: List[Int], ys: List[Int]): List[Int] =
  (xs, ys) match {
    case (x: Int, y: Int) :: (xs1: List[Int], ys1: List[Int]) =>
      if (x < y) x :: merge(xs1, ys)
      else y :: merge(xs, ys1)
    case (x: List[Int], Nil) => x
    case (Nil, y: List[Int]) => y
  }
You have an error in your first case pattern. Change it to
case (x :: xs1, y :: ys1)
from
case (x: Int, y: Int) :: (xs1: List[Int], ys1: List[Int])
You are matching on a tuple of Lists, but your original pattern would only fit a List of tuples.
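With that pattern in place, the asker's merge becomes a working function; a small sketch of the corrected code:

```scala
def merge(xs: List[Int], ys: List[Int]): List[Int] =
  (xs, ys) match {
    // Bind head and tail of each list with :: inside the tuple pattern.
    case (x :: xs1, y :: ys1) =>
      if (x < y) x :: merge(xs1, ys)
      else y :: merge(xs, ys1)
    // When either side is exhausted, the other side is the result.
    case (x, Nil) => x
    case (Nil, y) => y
  }

println(merge(List(1, 3, 5), List(2, 4))) // prints List(1, 2, 3, 4, 5)
```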
I am able to calculate the mean word length per starting letter for this Spark RDD
val animals23 = sc.parallelize(List(("a","ant"), ("c","crocodile"), ("c","cheetah"), ("c","cat"), ("d","dolphin"), ("d","dog"), ("g","gnu"), ("l","leopard"), ("l","lion"), ("s","spider"), ("t","tiger"), ("w","whale")), 2)
either with
animals23.
  aggregateByKey((0,0))(
    (x, y) => (x._1 + y.length, x._2 + 1),
    (x, y) => (x._1 + y._1, x._2 + y._2)
  ).
  map(x => (x._1, x._2._1.toDouble / x._2._2.toDouble)).
  collect
or with
animals23.
  combineByKey(
    (x: String) => (x.length, 1),
    (x: (Int, Int), y: String) => (x._1 + y.length, x._2 + 1),
    (x: (Int, Int), y: (Int, Int)) => (x._1 + y._1, x._2 + y._2)
  ).
  map(x => (x._1, x._2._1.toDouble / x._2._2.toDouble)).
  collect
each resulting in
Array((a,3.0), (c,6.333333333333333), (d,5.0), (g,3.0), (l,5.5), (w,5.0), (s,6.0), (t,5.0))
What I do not understand: Why am I required to explicitly state the types in the functions in the second example while the first example's functions can do without?
I am talking about
(x, y) => (x._1 + y.length, x._2 + 1),
(x, y) => (x._1 + y._1, x._2 + y._2)
vs
(x:(Int, Int), y:String) => (x._1 + y.length, x._2 + 1),
(x:(Int, Int), y:(Int, Int)) => (x._1 + y._1, x._2 + y._2)
and it might be more a Scala than a Spark question.
Why am I required to explicitly state the types in the functions in
the second example while the first example's functions can do without?
Because in the first example, the compiler is able to infer the type of seqOp based on the first argument list supplied. aggregateByKey is using currying:
def aggregateByKey[U](zeroValue: U)
    (seqOp: (U, V) ⇒ U,
     combOp: (U, U) ⇒ U)
    (implicit arg0: ClassTag[U]): RDD[(K, U)]
The way type inference works in Scala is that the compiler infers the types in the second argument list based on the first. So in the first example, it knows that seqOp is a function ((Int, Int), String) => (Int, Int); the same goes for combOp.
By contrast, combineByKey has only a single argument list:
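The effect is easy to reproduce without Spark; a minimal sketch (aggregate and aggregateFlat are illustrative names, not Spark's API):

```scala
// Curried: U is fixed to (Int, Int) by the first argument list,
// so the lambda in the second list needs no annotations.
def aggregate[U](zero: U)(seqOp: (U, Int) => U): U =
  List(1, 2, 3).foldLeft(zero)(seqOp)

val curried = aggregate((0, 0))((acc, x) => (acc._1 + x, acc._2 + 1))

// Single argument list: the lambda is type-checked together with the
// zero value, so its parameters must be annotated (or U given explicitly).
def aggregateFlat[U](zero: U, seqOp: (U, Int) => U): U =
  List(1, 2, 3).foldLeft(zero)(seqOp)

val flat = aggregateFlat((0, 0), (acc: (Int, Int), x: Int) => (acc._1 + x, acc._2 + 1))
```

Removing the annotations from the aggregateFlat call makes it fail to compile, mirroring the combineByKey case.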
def combineByKey[C](createCombiner: (V) ⇒ C,
    mergeValue: (C, V) ⇒ C,
    mergeCombiners: (C, C) ⇒ C): RDD[(K, C)]
And without explicitly stating the types, the compiler doesn't know what to infer x and y to.
What you can do to help the compiler is to explicitly specify the type arguments:
animals23
  .combineByKey[(Int, Int)](
    x => (x.length, 1),
    (x, y) => (x._1 + y.length, x._2 + 1),
    (x, y) => (x._1 + y._1, x._2 + y._2))
  .map(x => (x._1, x._2._1.toDouble / x._2._2.toDouble))
  .collect
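As an aside, the same per-key mean can be sketched with plain Scala collections (no Spark), which shows what the combiner functions compute:

```scala
val animals = List(("a", "ant"), ("c", "crocodile"), ("c", "cheetah"), ("c", "cat"))

// Group by the starting letter, then divide the total word length
// by the word count for each key.
val means = animals
  .groupBy(_._1)
  .map { case (k, vs) => (k, vs.map(_._2.length).sum.toDouble / vs.size) }
// e.g. means("a") == 3.0
```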
Here is a function I wrote for combining the elements of two Lists using an accumulator and tail recursion:
val l1 = List(1, 2, 3) //> l1 : List[Int] = List(1, 2, 3)
val l2 = List(1, 2, 3) //> l2 : List[Int] = List(1, 2, 3)
def func(l1: List[Int], l2: List[Int], acc: List[Int]): List[Int] = {
  (l1, l2) match {
    case (Nil, Nil) => acc.reverse
    case (h1 :: t1, h2 :: t2) => {
      func(t1, t2, h1 :: h2 :: acc)
    }
  }
} //> func: (l1: List[Int], l2: List[Int], acc: List[Int])List[Int]
func(l1, l2, List()) //> res0: List[Int] = List(1, 1, 2, 2, 3, 3)
This is my understanding of how the accumulator grows across the calls:
func(..., 1 :: 1 :: Nil)
func(..., 2 :: 2 :: 1 :: 1 :: Nil)
func(..., 3 :: 3 :: 2 :: 2 :: 1 :: 1 :: Nil)
So the call order is why I must call reverse on acc in the base case, so that the result preserves the ordering of the initial lists' elements. To try to reduce the steps required to combine the lists, I tried adding the elements like this:
func(t1, t2, acc :: h1 :: h2)
instead of
func(t1, t2, h1 :: h2 :: acc)
but receive a compile-time error:
value :: is not a member of Int
So it seems I cannot add these elements to the List this way?
When you write x :: y, y must be a list and x the element you want to prepend.
You can use acc :+ h1 :+ h2 to append h1 and h2 to acc, but note that adding elements to the end of the list is a relatively expensive operation (linear with the length of the list).
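The appended variant the answer suggests, applied to the asker's function (note each :+ copies the whole accumulator, so building a list this way is quadratic overall):

```scala
def func(l1: List[Int], l2: List[Int], acc: List[Int]): List[Int] =
  (l1, l2) match {
    case (Nil, Nil) => acc // no reverse needed now
    // :+ appends each head to the end of the accumulator.
    case (h1 :: t1, h2 :: t2) => func(t1, t2, acc :+ h1 :+ h2)
  }

println(func(List(1, 2, 3), List(1, 2, 3), Nil)) // prints List(1, 1, 2, 2, 3, 3)
```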
Why do both of the following foldLeft calls result in the same output?
#1
scala> List(1,2,3).foldLeft(List[Int]())( (acc, el) => acc :+ el)
res114: List[Int] = List(1, 2, 3)
And, now using _ :+ _ as the (B, A) => B argument.
#2
scala> List(1,2,3).foldLeft(List[Int]())(_ :+ _)
res115: List[Int] = List(1, 2, 3)
In particular, the lack of explicitly appending to the accumulator in the second case confuses me.
_ :+ _ is simply shorthand for (x1, x2) => x1 :+ x2, just as list.map(_.toString) is shorthand for list.map(x => x.toString).
See more on the placeholder syntax here.
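The expansion can be checked directly; both forms denote the same function value (a small sketch):

```scala
// Placeholder form: each underscore stands for the next parameter, in order.
val f1: (List[Int], Int) => List[Int] = _ :+ _
// Explicit form:
val f2: (List[Int], Int) => List[Int] = (acc, el) => acc :+ el

val a = List(1, 2, 3).foldLeft(List[Int]())(f1)
val b = List(1, 2, 3).foldLeft(List[Int]())(f2)
// a == b == List(1, 2, 3)
```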