Lazy foldRight early termination confusion - scala

While going through Functional Programming in Scala, I came across the following code snippet:
def foldRight[B](z: => B)(f: (A, => B) => B): B = uncons match {
case Some((h,t)) => f(h,t.foldRight(z)(f))
case None => z
}
The authors then go ahead and state the following:
This looks very similar to the foldRight we wrote for List, but
notice how our combining function, f, is non-strict in its second
parameter. If f chooses not to evaluate its second parameter, this
terminates the traversal early. We can see this by using foldRight to
implement exists, which checks to see if any value in the Stream
matches a given predicate.
Then the author states the following:
def exists(p: A => Boolean): Boolean =
foldRight(false)((a, b) => p(a) || b)
My question is: how does the combining function f cause early termination in the exists method? I was not able to work out how that happens from the text.

In f(h,t.foldRight(z)(f)), the first argument provided to f is h, the second argument is t.foldRight(z)(f). foldRight is defined so that the second parameter of its f argument is a by-name parameter, which is not evaluated until it is needed (and is re-evaluated every time it is needed). So in f: (A, =>B) => B, the first parameter of type A is a normal (strict) parameter, but the second one of type B is a by-name parameter.
So pretend you defined f like this:
def f(a: A, b: => Boolean): Boolean = predicate(a) || b
If predicate(a) is true then b is never needed and will never be evaluated. That is the way the or operator works.
So say exists is applied to some Stream. For the first uncons-ed element that satisfies the predicate (that is, where p(h) is true), this code:
uncons match {
case Some((h,t)) => f(h,t.foldRight(z)(f))
case None => z
}
Is the same as this code (we assume we have a non-empty stream, so we can remove the second case):
f(h,t.foldRight(z)(f))
Which is then equivalent to (expanding f):
p(h) || t.foldRight(z)(f)
But p(h) is true so:
true || t.foldRight(z)(f)
And that's the same as just true and no need to continue calling foldRight, so early termination happens!
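To make the early termination concrete, here is a minimal sketch. It assumes a simplified stream type: MyStream, Cons, Empty and from are illustrative names, not the book's definitions. Because f takes its second parameter by name, exists can return on an infinite stream.

```scala
object LazyFoldDemo {
  // Simplified lazy stream (hypothetical, for illustration only).
  sealed trait MyStream[+A] {
    // f's second parameter is by-name: the recursive call below is only
    // run if f actually evaluates that parameter.
    def foldRight[B](z: => B)(f: (A, => B) => B): B = this match {
      case Cons(h, t) => f(h(), t().foldRight(z)(f))
      case Empty      => z
    }

    // || short-circuits, so b (the rest of the fold) is skipped
    // as soon as p(a) is true.
    def exists(p: A => Boolean): Boolean =
      foldRight(false)((a, b) => p(a) || b)
  }
  case object Empty extends MyStream[Nothing]
  final case class Cons[+A](h: () => A, t: () => MyStream[A]) extends MyStream[A]

  // Infinite stream n, n + 1, n + 2, ...
  def from(n: Int): MyStream[Int] = Cons(() => n, () => from(n + 1))
}
```

With these definitions, LazyFoldDemo.from(1).exists(_ > 3) returns true after testing only the elements 1 to 4. With a strict second parameter the same call would never return.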

Related

How does map work on Options in Scala?

I have these two functions
def pattern(s: String): Option[Pattern] =
try {
Some(Pattern.compile(s))
} catch {
case e: PatternSyntaxException => None
}
and
def mkMatcher(pat: String): Option[String => Boolean] =
pattern(pat) map (p => (s: String) => p.matcher(s).matches)
Map is the higher-order function that applies a given function to each element of a list.
Given that statement, I don't understand how map works here.
Map is the higher-order function that applies a given function to each element of a list.
This is an uncommonly restrictive definition of map.
At any rate, it works because it was defined by someone who did not hold to that.
For example, that someone wrote something akin to
sealed trait Option[+A] {
def map[B](f: A => B): Option[B] = this match {
case Some(value) => Some(f(value))
case None => None
}
}
as part of the standard library. This makes map applicable to Option[A].
It was defined because it makes sense to map many kinds of data structures not just lists.
Mapping is a transformation applied to the elements held by the data structure.
It applies a function to each element.
Option[A] can be thought of as a trivial sequence. It either has zero or one elements. To map it means to apply the function on its element if it has one.
Now it may not make much sense to use this facility all of the time, but there are cases where it is useful.
For example, it is one of a few distinct methods that, when present, enable for expressions to operate on a type. Option[A] can be used in for expressions, which can be convenient.
For example
val option: Option[Int] = Some(2)
val squared: Option[Int] = for {
n <- option
if n % 2 == 0
} yield n * n
Interestingly, this implies that filter is also defined on Option[A].
If you just have a simple value it may well be clearer to use a less general construct.
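As a rough illustration of why the for expression above needs map and a filtering method on Option, here is what the compiler does with it. The object and method names are mine; the rewriting of a guard into withFilter followed by map is the standard desugaring.

```scala
object OptionForDemo {
  // The for expression from the answer, as a function.
  def viaFor(option: Option[Int]): Option[Int] =
    for {
      n <- option
      if n % 2 == 0
    } yield n * n

  // The same thing written with the method calls the compiler generates.
  def desugared(option: Option[Int]): Option[Int] =
    option.withFilter(n => n % 2 == 0).map(n => n * n)
}
```

Both functions give Some(4) for Some(2), and None for Some(3) or None.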
Map works the same way here as it does with other collection types like List and Vector. It applies your function to the contents of the collection, potentially changing the element type but keeping the collection type the same.
In many cases you can treat an Option just like a collection with either 0 or 1 elements. You can do a lot of the same operations on Option that you can on other collections.
You can modify the value
val opt = Option(1)
opt.map(_ + 3)
opt.map(_ * math.Pi)
opt.filter(_ == 1)
opt.collect({case i if i > 0 => i.toString })
opt.foreach(println)
and you can test the value
opt.contains(3)
opt.forall(_ > 0)
opt.exists(_ > 0)
opt.isEmpty
Using these methods you rarely need to use a match statement to unpick an Option.
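Applied to the code in the question, a small sketch (OptionMapDemo and the sample patterns are only for illustration): map runs the function on the compiled Pattern if there is one, and otherwise the None is passed through untouched.

```scala
import java.util.regex.{Pattern, PatternSyntaxException}

object OptionMapDemo {
  def pattern(s: String): Option[Pattern] =
    try Some(Pattern.compile(s))
    catch { case _: PatternSyntaxException => None }

  // If pattern(pat) is Some(p), the result is Some(matcher function);
  // if it is None, map does nothing and the result is None.
  def mkMatcher(pat: String): Option[String => Boolean] =
    pattern(pat).map(p => (s: String) => p.matcher(s).matches())

  // Mapping again to actually use the matcher, still without a match statement.
  def matches(pat: String, input: String): Option[Boolean] =
    mkMatcher(pat).map(m => m(input))
}
```

Here matches("a+b", "aaab") is Some(true), matches("a+b", "xyz") is Some(false), and matches("(unclosed", "x") is None because the pattern does not compile.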

Pattern Matching Assignment in Scala

I have a problem about assignment via pattern matching in scala. Let's say I have the following:
a. Seq.unapplySeq(List(1))
b. val Seq(z) = List(1)
c. val c = List(1) match {
case Seq(e) => e
}
d. List.unapplySeq(Seq(1))
e. val List(a) = Seq(1)
f. val b = Seq(1) match {
case List(e) => e
}
Only (d) doesn't compile and others compile and run right.
I know that unapplySeq of List is defined in SeqFactory as:
abstract class SeqFactory[CC[X] <: Seq[X] with GenericTraversableTemplate[X, CC]] extends GenSeqFactory[CC] with TraversableFactory[CC] {
def unapplySeq[A](x: CC[A]): Some[CC[A]] = Some(x)
}
Because CC is List, Seq in (d) won't type check.
Seems like (a), (b) and (c) are in one group and (d), (e) and (f) are in the other.
In my understanding, the destructuring in (f) actually calls (d), because the pattern match in (f) uses List to destructure Seq(1).
My question is why (e) and (f) still work even though (d) does not compile.
(e) is translated to (f), correct. But if you look at https://www.scala-lang.org/files/archive/spec/2.11/08-pattern-matching.html#pattern-sequences, the only requirement for unapplySeq is on the result type, not the argument type. So my guess (the specification doesn't actually specify this and I can't check at the moment) is that (f) tests that its argument is a List before calling unapplySeq, i.e. it's internally something like
Seq(1) match {
case canUnapplySeq: List[Int] =>
... List.unapplySeq(canUnapplySeq) // (d) only invoked here
}
Note that the argument applies to unapply as well.
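The guess above can be observed with a hand-written extractor. This is only a sketch: OnlyList is a made-up object whose unapplySeq accepts a List, like List.unapplySeq does, while the scrutinee is statically a Seq.

```scala
object PatternTypeTestDemo {
  // Hypothetical extractor: its unapplySeq only accepts a List argument.
  object OnlyList {
    def unapplySeq[A](xs: List[A]): Option[Seq[A]] = Some(xs)
  }

  // xs has static type Seq[Int], yet the pattern compiles: the match first
  // checks at run time that xs is a List and only then calls unapplySeq.
  def single(xs: Seq[Int]): Option[Int] = xs match {
    case OnlyList(e) => Some(e)
    case _           => None
  }
}
```

single(List(7)) gives Some(7), while single(Vector(7)) falls through to the default case and gives None. A direct call OnlyList.unapplySeq(Vector(7)) does not compile, which mirrors case (d).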
I may have an answer for you which doesn't talk about SeqFactory:
(e) compiles and runs because Seq(1) is a shortcut for Seq.apply(1), which conveniently always gives a List instance, i.e. here List(1)*. Thus it is comparable to case (b).
(f) compiles and runs for the same reason (Seq(1) gives a List(1), which matches case List(e)).
(d) doesn't compile because what I said about case (e) in the first point concerns the runtime instance and not the static type, i.e. Seq(1) is an instance of List but has type Seq. Here's a demonstration:
List.unapplySeq(Seq(1))
is equivalent to
List.unapplySeq(List(1).asInstanceOf[Seq[Int]])
and gives:
error: type mismatch;
whereas:
List.unapplySeq(List(1))
is equivalent to:
List.unapplySeq(Seq(1).asInstanceOf[List[Int]])
and both give:
Some(List(1))
But here's the difficulty, and you have a point about the curious pattern matching in (f): it behaves as if Scala internally typed the value as a List and not as a Seq.
Here's my explanation: pattern matching doesn't care about the static type of its input. It's as simple as that (cf. https://www.scala-lang.org/files/archive/spec/2.11/08-pattern-matching.html#variable-patterns for example). Here's a demonstration:
var foo: Seq[Int] = List(1)
foo match { case _: List[Int] => true } // gives true
foo = 1 to 3
foo match { case _: List[Int] => true } // throws scala.MatchError
Hope it helps.
*To be clearer: if, in another Scala implementation, this apply returned a Vector(1) instead, the error would show up not at compile time but at run time (Alexey Romanov made a good point about this).

Iterate over Option instances until the first non-empty is found

I have a number of functions that return Option values, like this
case class A()
case class B()
case class C()
def optionA(): Option[A] = None
def optionB(): Option[B] = Some(B())
def optionC(): Option[C] = Some(C())
What I want to do is, I want to run these functions in sequence, but only until one of the functions returns an Option with a value (a Some). Then I want to have that value returned, without running the remaining functions.
This is my current implementation
val res:Option[Any] = Stream(
() => optionA(),
() => optionB(),
() => optionC()
) .map(f => f())
.filter(opt => opt.isDefined)
.head
For the function implementations above, this applies optionA and optionB, gives me a Some(B()), and it never runs optionC, which is what I want.
But I'd like to know if there is a better, simpler, or alternative implementation.
Something like val findFirst = optionA compose optionB compose optionC?
optionA().orElse(optionB()).orElse(optionC())
orElse will not evaluate its argument if this is defined.
Or if you have already the options in a collection/stream, you might do
options.find(_.isDefined).flatten
Say you now have a collection of Options, you can then do this:
coll.foldLeft[Option[Int]](None)(_ orElse _)
Which will return you the first non-None value in the collection
Note that I explicitly specify the result type of the fold, because Scala can't infer it from the initial value alone (None on its own has type None.type, which is an Option[Nothing], not an Option[Int]).
If you have a giant list of options, it might be helpful to write
coll.view.foldLeft[Option[Int]](None)(_ orElse _)
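To check which functions actually run, here is a small sketch. The calls log and the String payloads are my additions for illustration, and viaIterator is an alternative I am assuming is acceptable, since Iterator.map is lazy and collectFirst stops at the first match.

```scala
object FirstDefinedDemo {
  // Records which functions were invoked (illustration only).
  var calls: List[String] = Nil

  def optionA(): Option[String] = { calls = calls :+ "A"; None }
  def optionB(): Option[String] = { calls = calls :+ "B"; Some("b") }
  def optionC(): Option[String] = { calls = calls :+ "C"; Some("c") }

  // orElse takes its argument by name, so optionC() is never called
  // once optionB() has produced a Some.
  def viaOrElse(): Option[String] =
    optionA().orElse(optionB()).orElse(optionC())

  // The functions are only invoked as the iterator is consumed.
  def viaIterator(): Option[String] =
    Iterator(() => optionA(), () => optionB(), () => optionC())
      .map(f => f())
      .collectFirst { case Some(v) => v }
}
```

In both variants the result is Some("b") and the log ends up as List("A", "B"), so optionC is skipped.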

Scala - Make signature of function parameter f of higher order function g dependent on varargs of g

I am trying to define a higher order function f which accepts a variable number of parameters args of type Wrapper[T]* and a function parameter g in Scala.
The function f should decapsulate each object passed in args and then call g with the decapsulated parameters. Therefore, g has to accept exactly the same number of parameters of type T as args contains.
The closest thing I could achieve was to pass a Seq[T] to g and to use pattern matching inside of g. Like the following:
f("This", "Is", "An", "Example")(x => x match {
case Seq(a: String, b: String, c: String) => // Do something.
})
With f defined like:
def f[T, V](args: Wrapper[T]*)
(g: (Seq[T]) => (V)) : V = {
val params = args.map(x => x.unwrap())
g(params)
}
How is it possible to accomplish a thing like this without pattern matching?
It is possible to omit the types in the signature of g by using type inference, but only if the number of parameters is fixed. How could this be done in this case?
It is possible to pass different types of parameters into varargs, if a type wildcard is used args: Wrapper[_]*. Additionally, casting the result of x.unwrap to AnyRef and using pattern matching in g is necessary. This, however, completely breaks type inference and type safety. Is there a better way to make mixing types in the varargs possible in this special case?
I am also considering the use of Scala macros to accomplish these tasks.
Did I get you right? I replaced your Wrapper with some known type, but that doesn't seem to be essential.
def f[T, V](args: T*)(g: PartialFunction[Seq[T], V]): V = g(args)
So later you can do this:
f(1,2,3) { case Seq(a,b,c) => c } // Int = 3
Okay, I've made my own Wrapper to be totally clear:
case class Wrapper[T](val x:T) {
def unwrap = x
}
def f[V](args: Wrapper[_]*)(g: PartialFunction[Seq[_], V]): V =
g(args.map(_.unwrap))
f(Wrapper("1"), Wrapper(1), Wrapper(BigInt(1))) {
case Seq(s: String, i: Int, b: BigInt) => (s, i, b)
} // res3: (String, Int, BigInt) = (1,1,1)
Regarding your concerns about type safety and conversions: as you can see, there aren't any explicit conversions in the code above, and since you are going to pattern-match with explicitly defined types, you need not worry about these things: if items of an unexpected type show up in your input, a scala.MatchError will be thrown.
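A quick sketch of that failure mode, under the same definitions. fOpt is a hypothetical variant I am adding, not part of the answer: it uses PartialFunction.lift to turn the MatchError into a None.

```scala
object VarargsDemo {
  final case class Wrapper[T](x: T) { def unwrap: T = x }

  // Same shape as the answer: g is applied to the unwrapped values and
  // throws scala.MatchError if no case fits.
  def f[V](args: Wrapper[_]*)(g: PartialFunction[Seq[_], V]): V =
    g(args.map(_.unwrap))

  // Hypothetical variant: lift makes the mismatch visible as None.
  def fOpt[V](args: Wrapper[_]*)(g: PartialFunction[Seq[_], V]): Option[V] =
    g.lift(args.map(_.unwrap))
}
```

For example, fOpt(Wrapper(1)) { case Seq(s: String) => s } gives None, where the same call through f throws a scala.MatchError.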

Explain Traverse[List] implementation in scalaz-seven

I'm trying to understand the traverseImpl implementation in scalaz-seven:
def traverseImpl[F[_], A, B](l: List[A])(f: A => F[B])(implicit F: Applicative[F]) = {
DList.fromList(l).foldr(F.point(List[B]())) {
(a, fbs) => F.map2(f(a), fbs)(_ :: _)
}
}
Can someone explain how the List interacts with the Applicative? Ultimately, I'd like to be able to implement other instances for Traverse.
An applicative lets you apply a function in a context to a value in a context. So for instance, you can apply some((i: Int) => i + 1) to some(3) and get some(4). Let's forget that for now. I'll come back to that later.
List has two representations, it's either Nil or head :: tail. You may be used to fold over it using foldLeft but there is another way to fold over it:
def foldr[A, B](l: List[A], acc0: B, f: (A, B) => B): B = l match {
case Nil => acc0
case x :: xs => f(x, foldr(xs, acc0, f))
}
Given List(1, 2) we fold over the list applying the function starting from the right side - even though we really deconstruct the list from the left side!
f(1, f(2, acc0))
This can be used to compute the length of a list. Given List(1, 2):
foldr(List(1, 2), 0, (i: Int, acc: Int) => 1 + acc)
// returns 2
This can also be used to create another list:
foldr[Int, List[Int]](List(1, 2), List[Int](), _ :: _)
//List[Int] = List(1, 2)
So given an empty list and the :: function we were able to create another list. What if our elements are in some context? If our context is an applicative then we can still apply our elements and :: in that context. Let's continue with List(1, 2) and Option as our applicative. We start with some(List[Int]()) and want to apply the :: function in the Option context. This is what F.map2 does. It takes two values in their Option context, puts the provided two-argument function into the Option context, and applies it to them.
So outside the context we have (2, Nil) => 2 :: Nil
In context we have: (Some(2), Some(Nil)) => Some(2 :: Nil)
Going back to the original question:
// do a foldr
DList.fromList(l).foldr(F.point(List[B]())) {
// starting with an empty list in its applicative context F.point(List[B]())
(a, fbs) => F.map2(f(a), fbs)(_ :: _)
// Apply the `::` function to the two values in the context
}
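To see the same fold without any scalaz machinery, here is a minimal sketch specialised to Option. The assumptions are mine: map2 is hand-written rather than the scalaz method, and the standard List.foldRight stands in for DList, so it does not address stack safety.

```scala
object TraverseOptionDemo {
  // map2 for Option: apply f only if both values are present.
  def map2[A, B, C](fa: Option[A], fb: Option[B])(f: (A, B) => C): Option[C] =
    for { a <- fa; b <- fb } yield f(a, b)

  // Right fold starting from an empty list inside the Option context,
  // consing each f(a) onto the accumulated result via map2.
  def traverse[A, B](l: List[A])(f: A => Option[B]): Option[List[B]] =
    l.foldRight(Option(List.empty[B]))((a, fbs) => map2(f(a), fbs)(_ :: _))

  // Example effectful function: parsing may fail.
  def parseInt(s: String): Option[Int] = scala.util.Try(s.toInt).toOption
}
```

traverse(List("1", "2"))(parseInt) gives Some(List(1, 2)), and a single failing element, as in List("1", "x"), makes the whole result None.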
I am not sure why the difference list (DList) is used. What I see is that it uses trampolines, so hopefully that makes this implementation work without blowing the stack, but I have not tried it, so I don't know.
The interesting part about implementing the right fold like this is that I think it gives you an approach to implement traverse for algebraic data types using catamorphisms.
For instance given:
trait Tree[+A]
object Leaf extends Tree[Nothing]
case class Node[A](a: A, left: Tree[A], right: Tree[A]) extends Tree[A]
Fold would be defined like this (which is really following the same approach as for List):
def fold[A, B](tree: Tree[A], valueForLeaf: B, functionForNode: (A, B, B) => B): B = {
tree match {
case Leaf => valueForLeaf
case Node(a, left, right) => functionForNode(a,
fold(left, valueForLeaf, functionForNode),
fold(right, valueForLeaf, functionForNode)
)
}
}
And traverse would use that fold with F.point(Leaf) and apply it to Node.apply. Though there is no F.map3 so it may be a bit cumbersome.
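Here is a rough sketch of that idea, specialised to Option so that a for comprehension can play the role of the missing map3. The assumptions are mine: the tree is redeclared as a sealed trait with a case object and a covariant Node, and traverse is written for Option only, not for an arbitrary Applicative.

```scala
object TreeTraverseDemo {
  sealed trait Tree[+A]
  case object Leaf extends Tree[Nothing]
  final case class Node[+A](a: A, left: Tree[A], right: Tree[A]) extends Tree[A]

  def fold[A, B](tree: Tree[A], valueForLeaf: B, functionForNode: (A, B, B) => B): B =
    tree match {
      case Leaf => valueForLeaf
      case Node(a, left, right) =>
        functionForNode(
          a,
          fold(left, valueForLeaf, functionForNode),
          fold(right, valueForLeaf, functionForNode))
    }

  // Leaf is lifted into the context; each node is rebuilt inside the context.
  def traverse[A, B](tree: Tree[A])(f: A => Option[B]): Option[Tree[B]] =
    fold[A, Option[Tree[B]]](
      tree,
      Some(Leaf),
      (a, left, right) => for { b <- f(a); l <- left; r <- right } yield Node(b, l, r))
}
```

Traversing Node(1, Leaf, Node(2, Leaf, Leaf)) with a function that returns Some(n * 10) for positive n rebuilds the tree as Some(Node(10, Leaf, Node(20, Leaf, Leaf))), and any element mapped to None makes the whole result None.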
This is not something that is easy to grasp. I recommend reading the article linked at the beginning of my blog post on the subject.
I also did a presentation on the subject during the last Functional Programming meeting in Sydney and you can find the slides here.
If I can try to explain in a few words, traverse is going to traverse each element of the list one by one, eventually re-constructing the list (_ :: _) but accumulating/executing some kind of "effects" as given by the F Applicative. If F is State it keeps track of some state. If F is the applicative corresponding to a Monoid it aggregates some kind of measure for each element of the list.
The main interaction between the list and the applicative is the map2 application: it receives an F[B] element and attaches it to the other elements, accumulated as an F[List[B]], using the definition of F as an Applicative and the List constructor :: as the specific function to apply.
From there you see that implementing other instances of Traverse is only about applying the data constructors of the data structure you want to traverse. If you have a look at the linked powerpoint presentation, you'll see some slides with a binary tree traversal.
List#foldRight blows the stack for large lists. Try this in a REPL:
List.range(0, 10000).foldRight(())((a, b) => ())
Typically, you can reverse the list, use foldLeft, then reverse the result to avoid this problem. But with traverse we really have to process the elements in the correct order, to make sure that the effect is treated correctly. DList is a convenient way to do this, by virtue of trampolining.
In the end, these tests must pass:
https://github.com/scalaz/scalaz/blob/scalaz-seven/tests/src/test/scala/scalaz/TraverseTest.scala#L13
https://github.com/scalaz/scalaz/blob/scalaz-seven/tests/src/test/scala/scalaz/std/ListTest.scala#L11
https://github.com/scalaz/scalaz/blob/scalaz-seven/core/src/main/scala/scalaz/Traverse.scala#L76