I have a question about assignment via pattern matching in Scala. Let's say I have the following:
a. Seq.unapplySeq(List(1))
b. val Seq(z) = List(1)
c. val c = List(1) match {
case Seq(e) => e
}
d. List.unapplySeq(Seq(1))
e. val List(a) = Seq(1)
f. val b = Seq(1) match {
case List(e) => e
}
Only (d) fails to compile; the others compile and run correctly.
I know that unapplySeq of List is defined in SeqFactory as:
abstract class SeqFactory[CC[X] <: Seq[X] with GenericTraversableTemplate[X, CC]] extends GenSeqFactory[CC] with TraversableFactory[CC] {
def unapplySeq[A](x: CC[A]): Some[CC[A]] = Some(x)
}
Because CC is List, Seq in (d) won't type check.
Seems like (a), (b) and (c) are in one group and (d), (e) and (f) are in the other.
In my understanding, the destructuring in (f) will actually call (d), because what the pattern matching in (f) does is use List to destructure Seq(1).
My question is why (e) and (f) still work when (d) does not compile.
(e) is translated to (f), correct. But if you look at https://www.scala-lang.org/files/archive/spec/2.11/08-pattern-matching.html#pattern-sequences, the only requirement for unapplySeq is on the result type, not the argument type. So my guess (the specification doesn't actually specify this and I can't check at the moment) is that (f) tests that its argument is a List before calling unapplySeq, i.e. internally it is something like
Seq(1) match {
case canUnapplySeq: List[Int] =>
... List.unapplySeq(canUnapplySeq) // (d) only invoked here
}
Note that the argument applies to unapply as well.
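The type-test-before-unapplySeq behaviour guessed at above can be observed with a hand-written extractor. This is a minimal sketch: ListOnly and firstViaListOnly are hypothetical names introduced for illustration, and the extractor only mimics the shape of List.unapplySeq (a parameter type narrower than the scrutinee's static type).

```scala
// Hypothetical extractor whose unapplySeq accepts only List[Int],
// just as List.unapplySeq accepts only List[A].
object ListOnly {
  def unapplySeq(xs: List[Int]): Option[Seq[Int]] = Some(xs)
}

// The scrutinee has static type Seq[Int], so calling ListOnly.unapplySeq(s)
// directly would not compile. The pattern still compiles: the match first
// checks at runtime that s is a List and only then calls unapplySeq.
def firstViaListOnly(s: Seq[Int]): String = s match {
  case ListOnly(e) => s"single-element List holding $e"
  case _           => "no match"
}
```

With this sketch, a Vector argument simply falls through to the default case instead of causing a compile error.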
I may have an answer for you which doesn't talk about SeqFactory:
(e) compiles and runs because Seq(1) is a shortcut for Seq.apply(1), which conveniently always gives a List instance, i.e. here List(1)*. Thus it is comparable to case (b).
(f) compiles and runs for the aforesaid reason (Seq(1) gives a List(1), which matches case List(e)).
(d) doesn't compile because what I said about case (e) in the first point concerns the instance and not the type, i.e. Seq(1) is an instance of List but has static type Seq. Here's a demonstration:
List.unapplySeq(Seq(1))
is equivalent to
List.unapplySeq(List(1).asInstanceOf[Seq[Int]])
and give:
*error: type mismatch;*
whereas:
List.unapplySeq(List(1))
is equivalent to:
List.unapplySeq(Seq(1).asInstanceOf[List[Int]])
and both give:
Some(List(1))
But, and here's the difficulty, you have a point about the curiosity of pattern matching in (f): it behaves as if Scala internally typed the value as a List and not as a Seq.
Here's my explanation: pattern matching doesn't care about the static type of its input. It's as simple as that (cf. https://www.scala-lang.org/files/archive/spec/2.11/08-pattern-matching.html#variable-patterns for example). Here's a demonstration:
var foo: Seq[Int] = List(1)
foo match { case _: List[Int] => true } // gives true
foo = 1 to 3
foo match { case _: List[Int] => true } // throws scala.MatchError
Hope it helps.
*To be clearer: if another Scala implementation returned, say, Vector(1) from this apply, the error would show up not at compile time but at runtime (Alexey Romanov made a good point about this).
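The footnote's point can be checked without changing the library, by giving the match a Seq that is explicitly backed by a Vector. A small sketch, assuming nothing beyond the standard library; extractWithList is a hypothetical helper that wraps the match from case (f) in Try so the failure shows up as a value:

```scala
import scala.util.Try

// Both values have static type Seq[Int]; only the runtime class differs.
val backedByList: Seq[Int] = List(1)
val backedByVector: Seq[Int] = Vector(1)

// Same shape as case (f). It compiles for any Seq[Int]; whether it succeeds
// is decided at runtime. Try turns a MatchError into a Failure.
def extractWithList(s: Seq[Int]): Try[Int] = Try(s match { case List(e) => e })
```

Here extractWithList(backedByList) succeeds, while extractWithList(backedByVector) is a Failure holding a scala.MatchError.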
I just started learning Scala and found a piece of code that works just fine, but I just don't get why...
sealed abstract class Nat
case class Zero() extends Nat
case class Succ(n: Nat) extends Nat
def add(n: Nat, m: Nat): Nat = {
n match {
case Zero() => m
case Succ(prev) => add(prev, Succ(m))
}
}
The members of Nat and Zero are defined in an extra file (and used later on) like this:
val zero = Zero()
val one = Succ(zero)
val two = Succ(one)
val three = Succ(Succ(one))
val four = Succ(Succ(two))
My question now is: in the second case, 'prev' never got defined. What happens here? The math behind it is clear to me (n+m == (n-1)+(m+1), repeat until n == Zero()). OK so far. But all that is defined is Succ(), not some kind of Prev()?
In this case, prev is declared in the case statement, here:
case Succ(prev) => add(prev, Succ(m))
When you type case Succ(prev) ... you are using pattern matching and saying: if n is of type Succ, call its n parameter prev and then return add(...).
So basically you are naming the n parameter of the Succ class prev, to use it after the arrow =>.
This Scala feature can even be used with regexes, where the groups you capture are put into the variables you define.
More info on the docs: https://docs.scala-lang.org/tour/pattern-matching.html
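The regex remark above can be illustrated with scala.util.matching.Regex, which works as an extractor in patterns. A minimal sketch; the IsoDate pattern and the yearOf helper are made-up names for illustration:

```scala
// Three capture groups; .r turns the string into a scala.util.matching.Regex.
val IsoDate = """(\d{4})-(\d{2})-(\d{2})""".r

// The pattern succeeds only if the whole string matches; each group is then
// bound to a name, just as prev is bound in case Succ(prev).
def yearOf(s: String): Option[Int] = s match {
  case IsoDate(year, _, _) => Some(year.toInt)
  case _                   => None
}
```

For example, yearOf("2016-09-21") binds year to "2016", while a string that does not match the pattern falls through to None.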
Scala gives you concise syntax so instead of having to write out something like
if (n.isInstanceOf[Succ]) {
val x = n.asInstanceOf[Succ]
val prev = x.n
add(prev, ...)
}
we can reason at a higher level by considering the structure of data and write simply
case Succ(prev) => add(prev, ...)
case classes in scala automatically define a method called unapply.
Here is the Scala 2 Doc on Case Classes
It is this unapply method that enables this kind of pattern matching.
If I define a case class with a member called value, I can extract that value by utilizing this unapply to obtain it:
case class Number( value: Int )
val valueOfNumber: Int = Number(5).value
println(valueOfNumber) // 5
// Using unapply
val testNumber: Number = Number(200)
val Number(numberValue) = testNumber
println(numberValue) // 200
When you write case Succ(prev) => add(prev, Succ(m)), you are extracting the value n of Succ as prev by matching against the signature of the generated unapply method.
Hence prev is defined: it is the value n contained in the matched Succ.
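To see that nothing magical is going on, one can write the extractor by hand for an ordinary class. This is a sketch of roughly what a case class provides for pattern matching, not the compiler's exact output; Celsius and describe are hypothetical names:

```scala
// A plain (non-case) class: no extractor is generated for it.
class Celsius(val degrees: Int)

// Hand-written extractor. A pattern Celsius(d) calls this method and,
// when it returns Some(v), binds d to v.
object Celsius {
  def unapply(c: Celsius): Option[Int] = Some(c.degrees)
}

def describe(x: Any): String = x match {
  case Celsius(d) => s"$d degrees" // d is introduced here, like prev in Succ(prev)
  case _          => "not a temperature"
}
```

The name d exists only because the pattern introduces it, which is exactly how prev comes into being in the add function.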
Consider the following example:
case class C[T](x:T) {
def f(t:T) = println(t)
type ValueType = T
}
val list = List(1 -> C(2), "hello" -> C("goodbye"))
for ((a,b) <- list) {
b.f(a)
}
In this example, I know (runtime guarantee) that the type of a will be some T, and b will have type C[T] with the same T. Of course, the compiler cannot know that, hence we get a typing error in b.f(a).
To tell the compiler that this invocation is OK, we need to do a typecast à la b.f(a.asInstanceOf[T]). Unfortunately, T is not known here. So my question is: How do I rewrite b.f(a) in order to make this code compile?
I am looking for a solution that does not involve complex constructions (to keep the code readable), and that is "clean" in the sense that we should not rely on type erasure to make it work (see the first approach below).
I have some working approaches, but I find them unsatisfactory for various reasons.
Approaches I tried:
b.asInstanceOf[C[Any]].f(a)
This works, and is reasonably readable, but it is based on a "lie". b is not of type C[Any], and the only reason we do not get a runtime error is because we rely on the limitations of the JVM (type erasure). I think it is good style only to use x.asInstanceOf[X] when we know that x is really of type X.
b.f(a.asInstanceOf[b.ValueType])
This should work according to my understanding of the type system. I have added the member ValueType to the class C in order to be able to explicitly refer to the type parameter T. However, in this approach we get a mysterious error message:
Error:(9, 22) type mismatch;
found : b.ValueType
(which expands to) _1
required: _1
b.f(a.asInstanceOf[b.ValueType])
^
Why? It seems to complain that we expect type _1 but got type _1! (But even if this approach works, it is limited to the cases where we have the possibility to add a member ValueType to C. If C is some existing library class, we cannot do that either.)
for ((a,b) <- list.asInstanceOf[List[(T,C[T]) forSome {type T}]]) {
b.f(a)
}
This one works, and is semantically correct (i.e., we do not "lie" when invoking asInstanceOf). The limitation is that this is somewhat unreadable. Also, it is somewhat specific to the present situation: if a,b do not come from the same iterator, then where can we apply this type cast? (This code also has the side effect of being too complex for IntelliJ IDEA 2016.2, which highlights it as an error in the editor.)
val (a2,b2) = (a,b).asInstanceOf[(T,C[T]) forSome {type T}]
b2.f(a2)
I would have expected this one to work since a2,b2 now should have types T and C[T] for the same existential T. But we get a compile error:
Error:(10, 9) type mismatch;
found : a2.type (with underlying type Any)
required: T
b2.f(a2)
^
Why? (Besides that, the approach has the disadvantage of incurring runtime costs (I think) because of the creation and destruction of a pair.)
b match {
case b : C[t] => b.f(a.asInstanceOf[t])
}
This works. But enclosing the code in a match makes it much less readable. (And it is also too complicated for IntelliJ.)
The cleanest solution is, IMO, the one you found with the type-capture pattern match. You can make it concise, and hopefully readable, by integrating the pattern directly inside your for comprehension, as follows:
for ((a, b: C[t]) <- list) {
b.f(a.asInstanceOf[t])
}
Fiddle: http://www.scala-js-fiddle.com/gist/b9030033133ee94e8c18ad772f3461a0
If you are not in a for comprehension already, unfortunately the corresponding pattern assignment does not work:
val (c, d: C[t]) = (a, b)
d.f(c.asInstanceOf[t])
That's because t is not in scope anymore on the second line. In that case, you would have to use the full pattern matching.
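For the case outside a for comprehension, the full pattern match can be tucked into a small helper so the call site stays readable. A sketch under two assumptions: callF is a hypothetical helper name, and f is changed to return a String instead of printing so the outcome can be inspected.

```scala
// Variant of the question's C: f returns a String rather than printing.
case class C[T](x: T) {
  def f(t: T): String = s"f($t)"
}

// The type pattern C[t] binds the type variable t for the body of the case,
// so the cast targets the same type that d.f expects.
def callF(a: Any, b: C[_]): String = b match {
  case d: C[t] => d.f(a.asInstanceOf[t])
}

val list = List(1 -> C(2), "hello" -> C("goodbye"))
val results: List[String] = list.map { case (a, b) => callF(a, b) }
```

The cast is still unchecked at runtime because of erasure, so this only stays sound as long as the pairing of a and b really holds.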
Maybe I'm confused about what you are trying to achieve, but this compiles:
case class C[T](x:T) {
def f(t:T) = println(t)
type ValueType = T
}
type CP[T] = (T, C[T])
val list = List[CP[T forSome {type T}]](1 -> C(2), "hello" -> C("goodbye"))
for ((a,b) <- list) {
b.f(a)
}
Edit
If the type of the list itself is out of your control, you can still cast it to this "correct" type.
case class C[T](x:T) {
def f(t:T) = println(t)
type ValueType = T
}
val list = List(1 -> C(2), "hello" -> C("goodbye"))
type CP[T] = (T, C[T])
for ((a,b) <- list.asInstanceOf[List[CP[T forSome { type T }]]]) {
b.f(a)
}
Great question! Lots to learn here about Scala.
Other answers and comments have already addressed most of the issues here, but I'd like to address a few additional points.
You asked why this variant doesn't work:
val (a2,b2) = (a,b).asInstanceOf[(T,C[T]) forSome {type T}]
b2.f(a2)
You aren't the only person who's been surprised by this; see e.g. this recent very similar issue report: SI-9899.
As I wrote there:
I think this is working as designed as per SLS 6.1: "The following skolemization rule is applied universally for every expression: If the type of an expression would be an existential type T, then the type of the expression is assumed instead to be a skolemization of T."
Basically, every time you write a value-level expression that the compiler determines to have an existential type, the existential type is instantiated. b2.f(a2) has two subexpressions with existential type, namely b2 and a2, so the existential gets two different instantiations.
As for why the pattern-matching variant works, there isn't explicit language in SLS 8 (Pattern Matching) covering the behavior of existential types, but 6.1 doesn't apply because a pattern isn't technically an expression, it's a pattern. The pattern is analyzed as a whole and any existential types inside only get instantiated (skolemized) once.
As a postscript, note that yes, when you play in this area, the error messages you get are often confusing or misleading and ought to be improved. See for example https://github.com/scala/scala-dev/issues/205
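One way to make the "instantiated once" idea concrete is to pass the existentially typed value to a generic method, so that a single type parameter stands for the unknown type. This is a sketch, not something from the issue report: Entry and callF are hypothetical, and wrapping the pair in a class is an assumption made to keep the example free of forSome.

```scala
// Variant of the question's C: f returns a String rather than printing.
case class C[T](x: T) {
  def f(t: T): String = s"f($t)"
}

// Hypothetical wrapper tying a key to a handler with the same T.
case class Entry[T](key: T, handler: C[T])

// Inside callF there is one T, shared by e.key and e.handler.
def callF[T](e: Entry[T]): String = e.handler.f(e.key)

val entries: List[Entry[_]] = List(Entry(1, C(2)), Entry("hello", C("goodbye")))

// e has an existential type; passing the single expression e to callF lets
// the compiler pick one instantiation for T. Writing e.handler.f(e.key)
// directly here is rejected under Scala 2, for the reason quoted above.
val results: List[String] = entries.map(e => callF(e))
```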
A wild guess, but is it possible that you need something like this:
case class C[+T](x:T) {
def f[A >: T](t: A) = println(t)
}
val list = List(1 -> C(2), "hello" -> C("goodbye"))
for ((a,b) <- list) {
b.f(a)
}
?
It will type check.
I'm not quite sure what "runtime guarantee" means here. Usually it means that you are trying to fool the type system (e.g. with asInstanceOf), but then all bets are off and you shouldn't expect the type system to be of any help.
UPDATE
Just to illustrate why type casting is evil:
case class C[T <: Int](x:T) {
def f(t: T) = println(t + 1)
}
val list = List("hello" -> C(2), 2 -> C(3))
for ((a, b: C[t]) <- list) {
b.f(a.asInstanceOf[t])
}
It compiles and fails at runtime (not surprisingly).
UPDATE2
Here's what generated code looks like for the last snippet (with C[t]):
...
val a: Object = x1._1();
val b: Test$C = x1._2().$asInstanceOf[Test$C]();
if (b.ne(null))
{
<synthetic> val x2: Test$C = b;
matchEnd4({
x2.f(scala.Int.unbox(a));
scala.runtime.BoxedUnit.UNIT
})
}
...
Type t simply vanished (as it should have) and Scala is trying to convert a to the upper bound of T in C, i.e. Int. If there is no upper bound it's going to be Any (but then method f is nearly useless unless you cast again or use something like println, which takes Any).
I have a number of functions that return Option values, like this
case class A()
case class B()
case class C()
def optionA(): Option[A] = None
def optionB(): Option[B] = Some(B())
def optionC(): Option[C] = Some(C())
What I want to do is, I want to run these functions in sequence, but only until one of the functions returns an Option with a value (a Some). Then I want to have that value returned, without running the remaining functions.
This is my current implementation
val res:Option[Any] = Stream(
() => optionA(),
() => optionB(),
() => optionC()
) .map(f => f())
.filter(opt => opt.isDefined)
.head
For the function implementations above, this applies optionA and optionB, gives me a Some(B()), and it never runs optionC, which is what I want.
But I'd like to know if there is a better/simpler/alternative implementation.
Something like val findFirst = optionA compose optionB compose optionC?
optionA().orElse(optionB()).orElse(optionC())
orElse will not evaluate its argument if this is defined.
Or if you have already the options in a collection/stream, you might do
options.find(_.isDefined).flatten
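The laziness of orElse can be verified by recording which functions actually run. A small sketch; the calls log and the String payloads are additions for illustration and replace the question's case classes:

```scala
import scala.collection.mutable.ListBuffer

// Log of which functions were invoked, in order.
val calls = ListBuffer.empty[String]

def optionA(): Option[String] = { calls += "A"; None }
def optionB(): Option[String] = { calls += "B"; Some("b") }
def optionC(): Option[String] = { calls += "C"; Some("c") }

// orElse takes its argument by name, so optionC() is never invoked
// once optionB() has produced a Some.
val first: Option[String] = optionA().orElse(optionB()).orElse(optionC())
```

After this runs, first is Some("b") and calls contains only "A" and "B".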
Say you now have a collection of Options, you can then do this:
coll.foldLeft[Option[Int]](None)(_ orElse _)
Which will return you the first non-None value in the collection
Note that I explicitly specify the result type of the fold, because Scala can't infer a usable one from the initial value alone: a bare None is typed as Option[Nothing] (more precisely None.type), not Option[Int].
If you have a giant list of options, it might be helpful to write
coll.view.foldLeft[Option[Int]](None)(_ orElse _)
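Keep in mind that foldLeft visits every element of the collection it is given. If the point is to avoid running the expensive computations at all, one possible alternative is to keep them as functions and pull from an Iterator, which stops as soon as a value is found. A sketch with made-up producers; producer and evaluated are hypothetical names:

```scala
import scala.collection.mutable.ListBuffer

// Records which producers were actually run.
val evaluated = ListBuffer.empty[Int]

// Hypothetical producer: yields a value only for i >= 2.
def producer(i: Int): () => Option[Int] =
  () => { evaluated += i; if (i >= 2) Some(i * 10) else None }

val producers: List[() => Option[Int]] =
  List(producer(0), producer(1), producer(2), producer(3))

// Iterator.map is lazy and collectFirst stops at the first Some,
// so producer(3) is never invoked.
val firstDefined: Option[Int] =
  producers.iterator.map(p => p()).collectFirst { case Some(v) => v }
```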
While going through Functional Programming in Scala, I came across the following code snippet:
def foldRight[B](z: => B)(f: (A, => B) => B): B = uncons match {
case Some((h,t)) => f(h,t.foldRight(z)(f))
case None => z
}
The authors then go ahead and state the following:
This looks very similar to the foldRight we wrote for List, but
notice how our combining function, f, is non-strict in its second
parameter. If f chooses not to evaluate its second parameter, this
terminates the traversal early. We can see this by using foldRight to
implement exists, which checks to see if any value in the Stream
matches a given predicate.
Then the author states the following:
def exists(p: A => Boolean): Boolean =
foldRight(false)((a, b) => p(a) || b)
My question is: how does the combining function f cause early termination in the exists method? I don't think I was able to understand how that happens from the text.
In f(h,t.foldRight(z)(f)), the first argument provided to f is h, and the second argument is t.foldRight(z)(f). foldRight is defined so that the second parameter of its f argument is a by-name parameter, which will not be evaluated until needed (and will be evaluated every time it is needed). So in f: (A, =>B) => B, the first argument of type A is a normal argument, but the second one of type B is a by-name argument.
So pretend you defined f like this:
def f(a: A, b: => Boolean): Boolean = predicate(a) || b
If predicate(a) is true then b is never needed and will never be evaluated. That is the way the or operator works.
So say exists is applied to some Stream. For the first uncons-ed element h that satisfies the predicate (where p(h) is true), this code:
uncons match {
case Some((h,t)) => f(h,t.foldRight(z)(f))
case None => z
}
Is the same as this code (we assume we have a non-empty stream, so we can remove the second case):
f(h,t.foldRight(z)(f))
Which is then equivalent to (expanding f):
p(h) || t.foldRight(z)(f)
But p(h) is true so:
true || t.foldRight(z)(f)
And that's the same as just true and no need to continue calling foldRight, so early termination happens!
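The early termination can be observed on an infinite stream. The following is a minimal sketch of a lazy stream, not the book's implementation: MyStream, cons, from and the visited log are hypothetical names, and only foldRight and exists follow the snippets quoted above.

```scala
import scala.collection.mutable.ListBuffer

trait MyStream[+A] {
  def uncons: Option[(A, MyStream[A])]

  // The second parameter of f is by-name: the recursive call
  // t.foldRight(z)(f) runs only if f actually uses that argument.
  def foldRight[B](z: => B)(f: (A, => B) => B): B = uncons match {
    case Some((h, t)) => f(h, t.foldRight(z)(f))
    case None         => z
  }

  def exists(p: A => Boolean): Boolean =
    foldRight(false)((a, b) => p(a) || b)
}

object MyStream {
  // Head and tail are by-name and computed at most once, on first uncons.
  def cons[A](h: => A, t: => MyStream[A]): MyStream[A] = new MyStream[A] {
    lazy val uncons: Option[(A, MyStream[A])] = Some((h, t))
  }
  // Infinite stream n, n + 1, n + 2, ...
  def from(n: Int): MyStream[Int] = cons(n, from(n + 1))
}

// Records every element the predicate was asked about.
val visited = ListBuffer.empty[Int]
val found: Boolean = MyStream.from(1).exists { n => visited += n; n == 3 }
```

Although the stream never ends, exists returns as soon as the predicate holds, and visited ends up containing only 1, 2 and 3.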
I'm trying to understand the traverseImpl implementation in scalaz-seven:
def traverseImpl[F[_], A, B](l: List[A])(f: A => F[B])(implicit F: Applicative[F]) = {
DList.fromList(l).foldr(F.point(List[B]())) {
(a, fbs) => F.map2(f(a), fbs)(_ :: _)
}
}
Can someone explain how the List interacts with the Applicative? Ultimately, I'd like to be able to implement other instances for Traverse.
An applicative lets you apply a function in a context to a value in a context. So for instance, you can apply some((i: Int) => i + 1) to some(3) and get some(4). Let's forget that for now. I'll come back to that later.
List has two representations, it's either Nil or head :: tail. You may be used to fold over it using foldLeft but there is another way to fold over it:
def foldr[A, B](l: List[A], acc0: B, f: (A, B) => B): B = l match {
case Nil => acc0
case x :: xs => f(x, foldr(xs, acc0, f))
}
Given List(1, 2) we fold over the list applying the function starting from the right side - even though we really deconstruct the list from the left side!
f(1, f(2, acc0))
This can be used to compute the length of a list. Given List(1, 2):
foldr(List(1, 2), 0, (i: Int, acc: Int) => 1 + acc)
// returns 2
This can also be used to create another list:
foldr[Int, List[Int]](List(1, 2), List[Int](), _ :: _)
//List[Int] = List(1, 2)
So given an empty list and the :: function we were able to create another list. What if our elements are in some context? If our context is an applicative then we can still apply our elements and :: in that context. Let's continue with List(1, 2) and Option as our applicative. We start with some(List[Int]()) and want to apply the :: function in the Option context. This is what F.map2 does: it takes two values in their Option context, lifts the provided two-argument function into the Option context, and applies it to them.
So outside the context we have (2, Nil) => 2 :: Nil
In context we have: (Some(2), Some(Nil)) => Some(2 :: Nil)
Going back to the original question:
// do a foldr
DList.fromList(l).foldr(F.point(List[B]())) {
// starting with an empty list in its applicative context F.point(List[B]())
(a, fbs) => F.map2(f(a), fbs)(_ :: _)
// Apply the `::` function to the two values in the context
}
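The same fold can be tried without scalaz. Below is a self-contained sketch: the Applicative trait is a hypothetical minimal stand-in with just point and map2 (it is not scalaz's definition), it uses the standard foldRight instead of DList, and traverseList and parseInt are names made up for the example.

```scala
import scala.util.Try

// Minimal stand-in for an applicative: only what the fold needs.
trait Applicative[F[_]] {
  def point[A](a: => A): F[A]
  def map2[A, B, C](fa: F[A], fb: F[B])(f: (A, B) => C): F[C]
}

implicit val optionApplicative: Applicative[Option] = new Applicative[Option] {
  def point[A](a: => A): Option[A] = Some(a)
  def map2[A, B, C](fa: Option[A], fb: Option[B])(f: (A, B) => C): Option[C] =
    for (a <- fa; b <- fb) yield f(a, b)
}

// Same shape as traverseImpl: start from an empty list in the context,
// then cons each f(a) onto the accumulated result inside the context.
def traverseList[F[_], A, B](l: List[A])(f: A => F[B])(implicit F: Applicative[F]): F[List[B]] =
  l.foldRight(F.point(List.empty[B]))((a, fbs) => F.map2(f(a), fbs)(_ :: _))

def parseInt(s: String): Option[Int] = Try(s.toInt).toOption

val allParsed = traverseList(List("1", "2", "3"))(s => parseInt(s))
val oneFailed = traverseList(List("1", "x", "3"))(s => parseInt(s))
```

With Option as the context, a single None makes the whole result None, which is the "effect" being accumulated here.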
I am not sure why the difference list DList is used. What I see is that it uses trampolines, so hopefully that makes this implementation work without blowing the stack, but I have not tried so I don't know.
The interesting part about implementing the right fold like this is that I think it gives you an approach to implement traverse for algebraic data types using catamorphisms.
For instance given:
trait Tree[+A]
object Leaf extends Tree[Nothing]
case class Node[A](a: A, left: Tree[A], right: Tree[A]) extends Tree[A]
Fold would be defined like this (which is really following the same approach as for List):
def fold[A, B](tree: Tree[A], valueForLeaf: B, functionForNode: (A, B, B) => B): B = {
tree match {
case Leaf => valueForLeaf
case Node(a, left, right) => functionForNode(a,
fold(left, valueForLeaf, functionForNode),
fold(right, valueForLeaf, functionForNode)
)
}
}
And traverse would use that fold with F.point(Leaf) and apply it to Node.apply. Though there is no F.map3 so it may be a bit cumbersome.
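Here is how that could look when spelled out. It is a sketch under stated assumptions: Applicative is a hypothetical minimal trait (not scalaz's), map3 is a helper built here from two map2 calls, and traverseTree and tenfoldIfPositive are made-up names.

```scala
trait Applicative[F[_]] {
  def point[A](a: => A): F[A]
  def map2[A, B, C](fa: F[A], fb: F[B])(f: (A, B) => C): F[C]
}

implicit val optionApplicative: Applicative[Option] = new Applicative[Option] {
  def point[A](a: => A): Option[A] = Some(a)
  def map2[A, B, C](fa: Option[A], fb: Option[B])(f: (A, B) => C): Option[C] =
    for (a <- fa; b <- fb) yield f(a, b)
}

trait Tree[+A]
object Leaf extends Tree[Nothing]
case class Node[A](a: A, left: Tree[A], right: Tree[A]) extends Tree[A]

def fold[A, B](tree: Tree[A], valueForLeaf: B, functionForNode: (A, B, B) => B): B =
  tree match {
    case Leaf => valueForLeaf
    case Node(a, left, right) =>
      functionForNode(a,
        fold[A, B](left, valueForLeaf, functionForNode),
        fold[A, B](right, valueForLeaf, functionForNode))
  }

// map3 expressed with two map2 calls: pair up the first two values,
// then combine the pair with the third.
def map3[F[_], A, B, C, D](fa: F[A], fb: F[B], fc: F[C])(f: (A, B, C) => D)(implicit F: Applicative[F]): F[D] =
  F.map2(F.map2(fa, fb)((a, b) => (a, b)), fc)((ab, c) => f(ab._1, ab._2, c))

// Leaves become F.point(Leaf); nodes are rebuilt with Node inside the context.
def traverseTree[F[_], A, B](tree: Tree[A])(f: A => F[B])(implicit F: Applicative[F]): F[Tree[B]] =
  fold[A, F[Tree[B]]](
    tree,
    F.point(Leaf: Tree[B]),
    (a, left, right) => map3(f(a), left, right)((b, l, r) => Node(b, l, r): Tree[B]))

def tenfoldIfPositive(n: Int): Option[Int] = if (n > 0) Some(n * 10) else None

val tree: Tree[Int] = Node(1, Node(2, Leaf, Leaf), Leaf)
val traversed = traverseTree(tree)(n => tenfoldIfPositive(n))
```

Note that this fold is not stack-safe for deep trees; it only illustrates how the data constructors drive the traversal.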
This is not so easy to grasp. I recommend reading the article linked at the beginning of my blog post on the subject.
I also did a presentation on the subject during the last Functional Programming meeting in Sydney and you can find the slides here.
If I can try to explain in a few words, traverse is going to traverse each element of the list one by one, eventually re-constructing the list (_ :: _) but accumulating/executing some kind of "effects" as given by the F Applicative. If F is State it keeps track of some state. If F is the applicative corresponding to a Monoid it aggregates some kind of measure for each element of the list.
The main interaction between the list and the applicative is in the map2 application, where it receives an F[B] element and attaches it to the other F[List[B]] elements, using the definition of F as an Applicative and the List constructor :: as the specific function to apply.
From there you see that implementing other instances of Traverse is only about applying the data constructors of the data structure you want to traverse. If you have a look at the linked powerpoint presentation, you'll see some slides with a binary tree traversal.
List#foldRight blows the stack for large lists. Try this in a REPL:
List.range(0, 10000).foldRight(())((a, b) => ())
Typically, you can reverse the list, use foldLeft, then reverse the result to avoid this problem. But with traverse we really have to process the elements in the correct order, to make sure that the effect is treated correctly. DList is a convenient way to do this, by virtue of trampolining.
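The reverse-then-foldLeft workaround mentioned above can be sketched like this. foldRightViaReverse is a hypothetical helper name; the sketch shows only the generic trick, not what scalaz does.

```scala
// Right fold expressed as a left fold over the reversed list.
// foldLeft is a loop, so the stack depth does not grow with the list length.
def foldRightViaReverse[A, B](l: List[A], z: B)(f: (A, B) => B): B =
  l.reverse.foldLeft(z)((acc, a) => f(a, acc))

// Rebuilding a list with :: gives the original order back,
// as a right fold should.
val rebuilt: List[Int] = foldRightViaReverse(List(1, 2, 3), List.empty[Int])(_ :: _)

// A million elements are handled without deep recursion.
val bigSum: Long = foldRightViaReverse(List.range(0, 1000000), 0L)((a, acc) => a + acc)
```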
In the end, these tests must pass:
https://github.com/scalaz/scalaz/blob/scalaz-seven/tests/src/test/scala/scalaz/TraverseTest.scala#L13
https://github.com/scalaz/scalaz/blob/scalaz-seven/tests/src/test/scala/scalaz/std/ListTest.scala#L11
https://github.com/scalaz/scalaz/blob/scalaz-seven/core/src/main/scala/scalaz/Traverse.scala#L76