Scala: Breaking out of foldLeft

Suppose we have a Seq: val ourSeq = Seq(10,5,3,5,4).
I want to return a new list that reads from the left and stops when it sees a duplicate number (e.g. Seq(10,5,3), since 5 is repeated).
I was thinking of using foldLeft like this:
ourSeq.foldLeft(Seq())(op = (temp, curr) => {
if (!temp.contains(curr)) {
temp :+ curr
} else break
})
but as far as I understand, there is no way to break out of a foldLeft?

Although it can be accomplished with a foldLeft() without any breaking out, I would argue that fold is the wrong tool for the job.
I'm rather fond of unfold(), which was introduced in Scala 2.13.0.
val ourSeq = Seq(10,5,3,5,4)
Seq.unfold((Set.empty[Int],ourSeq)){ case (seen,ns) =>
Option.when(ns.nonEmpty && !seen(ns.head)) {
(ns.head, (seen+ns.head, ns.tail))
}
}
//res0: Seq[Int] = Seq(10, 5, 3)
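For comparison, the foldLeft variant mentioned at the top of this answer could look roughly like the sketch below. It is only an illustration: the accumulator carries a hypothetical `done` flag, and the fold still visits every element after the first duplicate, which is why fold is an awkward fit here.

```scala
val ourSeq = Seq(10, 5, 3, 5, 4)

// Accumulator: (elements kept so far, whether a duplicate has already been hit).
val (prefix, _) = ourSeq.foldLeft((Vector.empty[Int], false)) {
  case ((kept, done), curr) =>
    if (done || kept.contains(curr)) (kept, true) // once stopped, ignore the rest
    else (kept :+ curr, false)                    // still collecting the prefix
}
// prefix == Vector(10, 5, 3)
```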

You are correct that it's not possible to break out of foldLeft. It would theoretically be possible to get the correct result with foldLeft, but you're still going to iterate the whole data structure. It'll be better to use an algorithm that already understands how to terminate early, and since you want to take a prefix, takeWhile will suffice.
import scala.collection.mutable.Set
val ourSeq = Seq(10, 5, 3, 5, 4)
val seen: Set[Int] = Set()
val untilDups = ourSeq.takeWhile((x) => {
if (seen contains x) {
false
} else {
seen += x
true
}
})
print(untilDups)
If you wanted to be totally immutable about this, you could wrap the whole thing in some kind of lazy fold that uses an immutable Set to keep its data. And that's certainly how I'd do it in Haskell. But this is Scala; we have mutability, and we may as well use it locally when it suits us.
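A minimal sketch of that immutable alternative, assuming Scala 2.13 and using a lazy Iterator rather than a hand-written lazy fold. The name `prefixBeforeDuplicate` is made up for illustration.

```scala
def prefixBeforeDuplicate[A](xs: Seq[A]): List[A] = {
  // Lazily builds the set of elements seen *before* each position:
  // Set(), Set(x0), Set(x0, x1), ...
  val seenBefore = xs.iterator.scanLeft(Set.empty[A])(_ + _)

  xs.iterator
    .zip(seenBefore)                           // pair each element with what came before it
    .takeWhile { case (x, seen) => !seen(x) }  // stop at the first repeated element
    .map(_._1)
    .toList
}

val untilDups = prefixBeforeDuplicate(Seq(10, 5, 3, 5, 4))
// untilDups == List(10, 5, 3)
```

Because both iterators are lazy, nothing past the first duplicate is examined, and no mutable state is visible to the caller.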

This can be done using a recursive function:
def uniquePrefix[T](ourSeq: Seq[T]): List[T] = {
  @annotation.tailrec
  def loop(rem: List[T], res: List[T]): List[T] =
    rem match {
      case hd :: tail if !res.contains(hd) =>
        loop(tail, res :+ hd)
      case _ =>
        res
    }
  loop(ourSeq.toList, Nil)
}
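This appears more complicated, but once you are familiar with the general pattern, recursive functions are simple to write and more powerful than fold operations.
If you are working with large collections, this version is more efficient: it runs in effectively O(n), because it tracks seen elements in a Set and prepends to the result instead of calling contains and :+ on a List: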
def distinctPrefix[T](ourSeq: Seq[T]): List[T] = {
  @annotation.tailrec
  def loop(rem: List[T], found: Set[T], res: List[T]): List[T] =
    rem match {
      case hd :: tail if !found.contains(hd) =>
        loop(tail, found + hd, hd +: res)
      case _ =>
        res.reverse
    }
  loop(ourSeq.toList, Set.empty, Nil)
}
This version works with any Seq and there are other options using Iterator etc. as described in the comments. You would need to be more specific about the type of the collection in order to create an optimised algorithm.
def uniquePrefix[T](ourSeq: Seq[T]): List[T] = {
  @annotation.tailrec
  def loop(rem: Seq[T], res: List[T]): List[T] =
    rem.take(1) match {
      case Seq(hd) if !res.contains(hd) =>
        loop(rem.drop(1), res :+ hd)
      case _ =>
        res
    }
  loop(ourSeq, Nil)
}

Another option is to use the inits method:
ourSeq.inits.dropWhile(curr => curr.distinct.size != curr.size).next()
Code run at Scastie.
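To see why this works, it may help to look at what inits produces. The sketch below is illustrative only; it uses find in place of dropWhile(...).next(), which should behave the same here because the empty prefix always qualifies.

```scala
val ourSeq = Seq(10, 5, 3, 5, 4)

// inits yields every prefix, longest first, ending with the empty prefix:
// (10,5,3,5,4), (10,5,3,5), (10,5,3), (10,5), (10), ()
val prefixes = ourSeq.inits.toList

// The first prefix whose elements are all distinct is the longest one without a duplicate.
val result = ourSeq.inits.find(p => p.distinct.size == p.size)
// result == Some(Seq(10, 5, 3))
```

Note that every candidate prefix is checked with distinct, so the cost grows roughly quadratically with the length of the sequence. That is fine for short inputs but worth keeping in mind for long ones.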

Related

How do you run a computation, that may fail, over a list of elements so that it terminates as soon as a failure is detected?

My computation that can fail is halve() below. It returns Left(errorMessage) to indicate failure and Right(value) to indicate the successful halving of a number.
def halve(n: Int): Either[String, Int] =
if (n % 2 == 0) {
Right(n / 2)
} else {
Left("cannot halve odd number")
}
I'd like to apply the halve function to a list of Ints such that as soon as the first call to halve fails (e.g. when called with an odd number), the halveAll function immediately stops iterating over the numbers in ns and returns Left(errorMessage).
Here is one way to achieve this:
def halveAll(ns: List[Int]): Either[String, List[Int]] =
try {
Right(
for {
n <- ns
Right(halved) = halve(n)
} yield n
)
} catch {
case ex: MatchError =>
Left("cannot match an odd number")
}
I would prefer an approach that does not use exceptions. Is there an idiomatic way of achieving this? I'd prefer the approach to use only functionality in the Scala 2.x standard library. If Cats or scalaz has an elegant solution, I'd be interested in hearing about it though.
Thank you!
Example usage of the halveAll function:
val allEven = List(2, 4, 6, 8)
val evenAndOdd = List(2, 4, 6, 7, 8)
println(halveAll(allEven))
println(halveAll(evenAndOdd))

This has been asked a dozen times but I am too lazy to search for a duplicate.
Have you ever heard the FP meme "the answer is always traverse"? Well, you are now part of that, since that is exactly the function you want.
Thus, if you have cats in scope then you only need to do this:
import cats.syntax.all._
def halveAll(ns: List[Int]): Either[String, List[Int]] =
ns.traverse(halve)
If you don't already have it in scope and don't want to add it just for a single function, then you may use the foldLeft from Gaël J's answer, or implement the recursion yourself if you really want to stop iterating, like this:
def traverse[A, E, B](list: List[A])(f: A => Either[E, B]): Either[E, List[B]] = {
  @annotation.tailrec
  def loop(remaining: List[A], acc: List[B]): Either[E, List[B]] =
    remaining match {
      case a :: tail =>
        f(a) match {
          case Right(b) =>
            loop(remaining = tail, b :: acc)
          case Left(e) =>
            Left(e)
        }
      case Nil =>
        Right(acc.reverse)
    }
  loop(remaining = list, acc = List.empty)
}
Disclaimer: What follows is only my opinion.
I have heard the argument about not including cats for a single function many times. People simply don't realize that it is not just one function but probably many of them across the rest of the codebase, which ultimately means you are probably re-implementing many bits of the library in a worse way and with less testing.

The typical Scala approach (without libs) for this would be using foldLeft or a variant like this:
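def halveAll(ns: List[Int]): Either[String, List[Int]] = {
  // The explicit type argument keeps the accumulator typed as Either, not Right
  ns.foldLeft[Either[String, List[Int]]](Right(List.empty[Int])) { (acc, n) =>
    for { // for-comprehension on Either
      accVal <- acc
      x <- halve(n)
    } yield accVal :+ x
  }
}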
As soon as a Left is produced by halve, it will continue iterating but will not call halve on the remaining items.
If you really need to not iterate anymore, you can use a recursive approach instead.
I guess it depends on the size of the list, but iterating over it should not be that costly most of the time.
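The claim that halve is no longer called after the first Left can be checked with a small sketch. The `calls` counter is a hypothetical addition for illustration only, and the explicit type argument on foldLeft is there so that the accumulator is typed as Either rather than Right.

```scala
var calls = 0 // counts how often halve is actually invoked (illustration only)

def halve(n: Int): Either[String, Int] = {
  calls += 1
  if (n % 2 == 0) Right(n / 2) else Left("cannot halve odd number")
}

def halveAll(ns: List[Int]): Either[String, List[Int]] =
  ns.foldLeft[Either[String, List[Int]]](Right(List.empty[Int])) { (acc, n) =>
    for {
      accVal <- acc      // once acc is a Left, the next line is skipped
      x      <- halve(n)
    } yield accVal :+ x
  }

val result = halveAll(List(2, 4, 7, 8, 10))
// result == Left("cannot halve odd number"), and calls == 3
```

The fold still walks over 8 and 10, but each of those steps only passes the existing Left along.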

SCALA: Fold method with conditions

I am still learning the basics of Scala, so I am asking for your understanding. Is there any way to use the fold method to print only the names beginning with "A"?
object Scala {
  val names: List[String] = List("Adam", "Mick", "Ann")
  def main(args: Array[String]): Unit = {
    println(names.foldLeft("my list of items starting with A: ")(_ + _))
  }
}

Have a look at the signature of foldLeft
def foldLeft[B](z: B)(op: (B, A) => B): B
where
z is the initial value
op is a function taking two arguments, namely accumulated result so far B, and the next element to be processed A
returns the accumulated result B
Now consider this concrete implementation
val names: List[String] = List("Adam", "Mick", "Ann")
val predicate: String => Boolean = str => str.startsWith("A")
names.foldLeft(List.empty[String]) { (accumulated: List[String], next: String) =>
if (predicate(next)) accumulated.prepended(next) else accumulated
}
here
z = List.empty[String]
op = (accumulated: List[String], next: String) => if (predicate(next)) accumulated.prepended(next) else accumulated
Usually we would write this inline and rely on type inference so we do not have to write out full types all the time, so it becomes
names.foldLeft(List.empty[String]) { (acc, next) =>
if (next.startsWith("A")) next :: acc else acc
}
// val res1: List[String] = List(Ann, Adam)
One of the key ideas when working with List is to always prepend an element instead of appending it like this
names.foldLeft(List.empty[String]) { (accumulated: List[String], next: String) =>
if (predicate(next)) accumulated.appended(next) else accumulated
}
because prepending is much more efficient. However note how this makes the accumulated result in reverse order, so
List(Ann, Adam)
instead of perhaps required
List(Adam, Ann)
so often-times we perform one last traversal by calling reverse like so
names.foldLeft(List.empty[String]) { (acc, next) =>
if (next.startsWith("A")) next :: acc else acc
}.reverse
// val res1: List[String] = List(Adam, Ann)
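As a possible alternative to the final reverse, a foldRight processes the elements from the right, so prepending already yields the original order. This is only a sketch of that idea, not part of the answer above; the value name `startingWithA` is made up.

```scala
val names: List[String] = List("Adam", "Mick", "Ann")

// foldRight visits "Ann" first and "Adam" last, so prepending keeps the input order.
val startingWithA = names.foldRight(List.empty[String]) { (next, acc) =>
  if (next.startsWith("A")) next :: acc else acc
}
// startingWithA == List("Adam", "Ann")
```

For plain filtering like this, names.filter(_.startsWith("A")) gives the same result and is usually the simpler choice; the fold is mainly useful for seeing how the accumulator works.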

The answer from @Mario Galic is a good one and should be accepted. (It's the polite thing to do).
Here's a slightly different way to filter for starts-with-A strings.
val names: List[String] = List("Adam", "Mick", "Ann")
println(names.foldLeft("my list of items starting with A: "){
case (acc, s"A$nme") => acc + s"A$nme "
case (acc, _ ) => acc
})
//output: "my list of items starting with A: Adam Ann"

How to return early without return statement?

How to write an early-return piece of code in scala with no returns/breaks?
For example
for i in 0..10000000
if expensive_operation(i)
return i
return -1

How about
input.find(expensiveOperation).getOrElse(-1)
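A self-contained sketch of this, where `expensiveOperation` is a cheap stand-in predicate invented for the example and the input is a Range:

```scala
// Hypothetical stand-in for the real expensive check.
def expensiveOperation(i: Int): Boolean = i * i > 50

// find stops at the first element that satisfies the predicate.
val result = (0 until 10000000).find(expensiveOperation).getOrElse(-1)
// result == 8

// When nothing matches, the fallback value is returned instead.
val missing = (0 until 5).find(expensiveOperation).getOrElse(-1)
// missing == -1
```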

You can use dropWhile.
Here is an example:
Seq(2,6,8,3,5).dropWhile(_ % 2 == 0).headOption.getOrElse(default = -1) // -> 3
And here you find more scala-takewhile-example
With your example
(0 to 10000000).dropWhile(!expensive_operation(_)).headOption.getOrElse(default = -1)

Since you asked for intuition to solve this problem generically, let me start from the basics.
Scala is (among other things) a functional programming language, and as such it has a very important concept: we write programs by composing expressions rather than statements.
Thus, for us the concept of a return value means the evaluation of an expression.
(Note this is related to the concept of referential transparency).
val a = expr // a is bound to the evaluation of expr,
val b = (a, a) // and they are interchangeable, thus b === (expr, expr)
How does this relate to your question? We do not really have control structures, only complex expressions. For example, an if:
val a = if (expr) exprA else exprB // if itself is an expression, that returns other expressions.
Thus instead of doing something like this:
def foo(a: Int): Int = {
  if (a != 0) {
    val b = a * a
    return b
  }
  return -1
}
We would do something like:
def foo(a: Int): Int =
if (a != 0)
a * a
else
-1
Because we can bind the whole if expression itself as the body of foo.
Now, returning to your specific question: how can we return early from a loop?
The answer is that you can't, at least not without mutation. But you can use a higher-level concept: instead of iterating, you can traverse something, and you can do that using recursion.
Thus, let's implement the find proposed by @Thilo ourselves, as a tail-recursive function.
(It is very important that the function is tail-recursive, so that the compiler optimizes it into something equivalent to a while loop; that way we will not blow up the stack).
def find(start: Int, end: Int, step: Int = 1)(predicate: Int => Boolean): Option[Int] = {
  @annotation.tailrec
  def loop(current: Int): Option[Int] =
    if (current == end)
      None // Base case.
    else if (predicate(current))
      Some(current) // Early return.
    else
      loop(current + step) // Recursive step.
  loop(current = start)
}
find(0, 10000)(_ == 10)
// res: Option[Int] = Some(10)
Or we may generalize this a little bit more, let's implement find for Lists of any kind of elements.
def find[T](list: List[T])(predicate: T => Boolean): Option[T] = {
  @annotation.tailrec
  def loop(remaining: List[T]): Option[T] =
    remaining match {
      case Nil => None
      case t :: _ if (predicate(t)) => Some(t)
      case _ :: tail => loop(remaining = tail)
    }
  loop(remaining = list)
}

This is not necessarily the best solution from a practical perspective but I still wanted to add it for educational purposes:
import scala.annotation.tailrec
def expensiveOperation(i: Int): Boolean = ???
@tailrec
def findFirstBy[T](f: (T) => Boolean)(xs: Seq[T]): Option[T] = {
  xs match {
    case Seq() => None
    case Seq(head, _*) if f(head) => Some(head)
    case Seq(_, tail @ _*) => findFirstBy(f)(tail)
  }
}
val result = findFirstBy(expensiveOperation)(Range(0, 10000000)).getOrElse(-1)
Please prefer collections methods (dropWhile, find, ...) in your production code.

There are a lot of better answers here, but I think a 'while' could work just fine in that situation.
So, this code
for i in 0..10000000
if expensive_operation(i)
return i
return -1
could be rewritten as
var i = 0
var result = false
while (!result && i < 10000000) {
  result = expensive_operation(i)
  if (!result) i = i + 1
}
After the 'while', the variable 'result' tells whether the search succeeded; if it did, 'i' holds the matching index.

What would be the good name for this operation?

I see that the Scala standard library lacks a method to get the ranges of consecutive elements in a collection that satisfy a predicate:
def <???>(p: A => Boolean): List[List[A]] = {
val buf = collection.mutable.ListBuffer[List[A]]()
var elems = this.dropWhile(e => !p(e))
while (elems.nonEmpty) {
buf += elems.takeWhile(p)
elems = elems.dropWhile(e => !p(e))
}
buf.toList
}
What would be a good name for such a method? And is my implementation good enough?

I'd go for chunkWith or chunkBy
As for your implementation, I think this cries out for recursion! See if you can fill out this
@tailrec def chunkBy[A](l: List[A], acc: List[List[A]] = Nil)(p: A => Boolean): List[List[A]] = l match {
  case Nil => acc
  case l =>
    val next = l dropWhile !p
    val (chunk, rest) = next span p
    chunkBy(rest, chunk :: acc)(p)
}
Why recursion? It's much easier to understand the algorithm and more likely to be bug-free (given the absence of vars).
The syntax !p for the negation of a predicate is achieved via an implicit conversion
implicit def PredicateW[A](p: A => Boolean) = new {
def unary_! : A => Boolean = a => !p(a)
}
I generally keep this around as it's astoundingly useful

How about:
def chunkBy[K](f: A => K): Map[K, List[List[A]]] = ...
Similar to groupBy but keeps contiguous chunks as chunks.
Using this, you can do xs.chunkBy(p)(true) to get what you want.

You probably want to call it splitWith because split is the string operation that more-or-less does that, and it's similar to splitAt.
Incidentally, here's a very compact implementation (though it does a lot of unnecessary work, so it's not a good implementation for speed; yours is fine for that):
def splitWith[A](xs: List[A])(p: A => Boolean) = {
(xs zip xs.scanLeft(1){ (i,x) => if (p(x) == ((i&1)==1)) i+1 else i }.tail).
filter(_._2 % 2 == 0).groupBy(_._2).toList.sortBy(_._1).map(_._2.map(_._1))
}

Just a little refinement of oxbow's code, this way the signature is lighter
def chunkBy[A](xs: List[A])(p: A => Boolean): List[List[A]] = {
  @tailrec
  def recurse(todo: List[A], acc: List[List[A]]): List[List[A]] = todo match {
    case Nil => acc
    case _ =>
      val next = todo dropWhile (!p(_))
      val (chunk, rest) = next span p
      recurse(rest, acc ::: List(chunk))
  }
  recurse(xs, Nil)
}
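One caveat with the refinement above: when the list ends with elements that do not satisfy p, the last step appears to add an empty chunk. The variant below is a hedged sketch, not the original author's code. It matches on the list after the non-matching elements have been dropped, and it prepends chunks and reverses once at the end instead of appending with :::.

```scala
import scala.annotation.tailrec

def chunkBy[A](xs: List[A])(p: A => Boolean): List[List[A]] = {
  @tailrec
  def recurse(todo: List[A], acc: List[List[A]]): List[List[A]] =
    todo.dropWhile(!p(_)) match {
      case Nil  => acc.reverse               // nothing matching is left: finish
      case next =>
        val (chunk, rest) = next.span(p)     // chunk is non-empty here
        recurse(rest, chunk :: acc)          // prepend now, reverse once at the end
    }
  recurse(xs, Nil)
}

val chunks = chunkBy(List(1, 2, 4, 5, 6, 8, 9))(_ % 2 == 0)
// chunks == List(List(2, 4), List(6, 8))
```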

Scala - Recursion of an anonymous function

I'm working through the scala labs stuff and I'm building out a function that will, in the end, return something like this:
tails(List(1,2,3,4)) = List(List(1,2,3,4), List(2,3,4), List(3,4), List(4), List())
I got this working by using two functions and using some recursion on the second one.
def tails[T](l: List[T]): List[List[T]] = {
if ( l.length > 1 )trailUtil(List() ::: List(l))
else List() ::: List(l);
}
def trailUtil[T](l:List[List[T]]) : List[List[T]] = {
if ( l.last.length == 0)l
else trailUtil(l :+ l.last.init);
}
This is all good and great, but it's bugging me that I need two functions to do this. I tried replacing trailUtil(List() ::: List(l)) with an anonymous function, but I got this error from the IDE: type mismatch; found :List[List[T]] required:Int.
val ret : List[List[T]] = (ll:List[List[T]]) => {
if ( ll.last.length == 0) ll else ret(ll :+ ll.last.init)
}
ret(List() ::: List(1))
Could someone please point out what I am doing wrong, or suggest a better way of doing this?
(I did look at this SO post, but the different types are just not working for me.)

What about this:
def tails[T](l: List[T]): List[List[T]] =
l match {
case h :: tail => l :: tails(tail)
case Nil => List(Nil)
}
And a little bit less idiomatic version:
def tails[T](input: List[T]): List[List[T]] =
if(input.isEmpty)
List(List())
else
input :: tails(input.tail)
BTW try to avoid List.length, it runs in O(n) time.
UPDATE: as suggested by tenshi, tail-recursive solution:
@tailrec def tails[T](l: List[T], init: List[List[T]] = Nil): List[List[T]] =
  l match {
    case h :: tail => tails(tail, l :: init)
    case Nil => init
  }
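Note that the accumulator version above builds its result back to front and does not include the final empty list. If the order from the question is needed, one possible adjustment (a sketch; the name `tailsInOrder` is made up) is to add the empty list and reverse once at the end:

```scala
import scala.annotation.tailrec

@tailrec
def tailsInOrder[T](l: List[T], acc: List[List[T]] = Nil): List[List[T]] =
  l match {
    case _ :: tail => tailsInOrder(tail, l :: acc) // collect suffixes, newest first
    case Nil       => (Nil :: acc).reverse         // add the empty list, restore order
  }

val allTails = tailsInOrder(List(1, 2, 3, 4))
// allTails == List(List(1, 2, 3, 4), List(2, 3, 4), List(3, 4), List(4), List())
```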

You can actually define a def inside another def. This lets you define a function that has a name, which can be referenced and used for recursion. Here is how tails can be implemented:
def tails[T](l: List[T]) = {
  @annotation.tailrec
  def ret(ll: List[List[T]]): List[List[T]] =
    if (ll.last.isEmpty) ll
    else ret(ll :+ ll.last.tail)
  ret(l :: Nil)
}
This implementation is also tail-recursive. I added the @annotation.tailrec annotation to ensure that it really is (the code will not compile if it's not).
You can also use the built-in function tails (see ScalaDoc):
List(1,2,3,4).tails.toList
tails returns an Iterator, so you need to convert it to a list (like I did) if you want one. Also, the result will contain one extra empty list at the end (in my example the result would be List(List(1, 2, 3, 4), List(2, 3, 4), List(3, 4), List(4), List())), so you need to deal with it.

What you are doing wrong is this:
val ret : List[List[T]]
So ret is a list of list of T. Then you do this:
ret(ll :+ ll.last.init)
That means you are calling the apply method on a list of lists of T. The apply method for lists takes an Int parameter and returns the element at that index. For example:
scala> List("first", "second", "third")(2)
res0: java.lang.String = third
I assume you wanted to write val ret: List[List[T]] => List[List[T]], that is, a function that takes a List[List[T]] and returns a List[List[T]]. You'd have other problems then, because val is referring to itself in its definition. To get around that, you could replace it with a lazy val:
def tails[T](l: List[T]): List[List[T]] = {
lazy val ret : List[List[T]] => List[List[T]] = { (ll:List[List[T]]) =>
if ( ll.last.length == 0) ll
else ret(ll :+ ll.last.init)
}
if ( l.length > 1 )ret(List() ::: List(l))
else List() ::: List(l);
}
But, of course, the easy solution is to put one def inside the other, like tenshi suggested.

You can also use folding:
val l = List(1,2,3,4)
l.foldLeft(List[List[Int]](l))( (outerList,element) => {
println(outerList)
outerList.head.tail :: outerList
})
The first parameter list is your start value/accumulator. The second is the function that combines the accumulator with each element of the list in turn; its result becomes the accumulator for the next element. I included a println so you can see the accumulator as the list is iterated over.