I'm working through the Functional Programming in Scala book, and am on the section about lazy evaluation. There is an exercise to implement a takeWhile function using foldRight. I was able to complete it successfully, but when I added print statements I saw that it seems to be doing processing I wouldn't expect. I am very confused by this.
Here is the code:
trait McStream[+A] {
def uncons: Option[(A, McStream[A])]
def isEmpty: Boolean = uncons.isEmpty
def toList: List[A] = {
uncons match {
case Some(head -> tail) => head :: tail.toList
case None => Nil
}
}
def foldRight[B](z: => B)(f: (A, => B) => B): B = {
uncons match {
case Some(head -> tail) =>
println(s"Inside foldRight, head is $head")
f(head, tail.foldRight(z)(f))
case None => z
}
}
// TODO how does evaluate? Trace steps
// TODO it seems to be storing a deferred takeWhile in the `b` variable that evaluates during the cons
def takeWhile(p: A => Boolean): McStream[A] = {
foldRight(McStream.empty[A]) { (a, b) =>
println(s"a is $a, b is $b")
if (p(a)) {
McStream.cons(a, b)
} else {
McStream.empty
}
}
}
}
With a helper object for constructors:
object McStream {
def empty[A]: McStream[A] = new McStream[A] {
override def uncons: Option[(A, McStream[A])] = None
}
def cons[A](hd: => A, tl: => McStream[A]): McStream[A] = {
new McStream[A] {
lazy val uncons = Some((hd, tl))
}
}
def apply[A](as: A*): McStream[A] = {
if (as.isEmpty) empty
else cons(as.head, apply(as.tail: _*))
}
}
Here is the test I'm running:
"take while a predicate is matched" in {
val stream = McStream(1, 2)
stream.takeWhile(_ < 2).toList shouldEqual List(1)
}
And here is the output I get:
Inside foldRight, head is 1
Inside foldRight, head is 2
a is 2, b is McStream(None)
a is 1, b is McStream(None)
Inside foldRight, head is 2
a is 2, b is McStream(None)
I'm confused about the last two lines. To me, it seems like it should recurse all the way to the end of the list, and then either connect the currently processed tail to the next element if the predicate matches, or replace it with an empty McStream otherwise. At that point, it should just return the list, without doing the additional foldRight and evaluation.
Here is the evaluation order as far as I can understand it:
Stream(1, Stream(2, Stream.empty)).takeWhile(_ < 2)
should print Inside foldRight, head is 1
Stream(2, Stream.empty).takeWhile(_ < 2)
should print Inside foldRight, head is 2
Stream.empty.takeWhile(_ < 2)
End of recursion, starts to return
Stream(2, Stream.empty).takeWhile(_ < 2)
should print a is 2, b is Stream.empty
Stream(1, Stream.empty).takeWhile(_ < 2)
should print a is 1, b is Stream.empty
1 #:: Stream.empty
Thanks in advance!
It turns out that the very mechanism I was using to understand the evaluation (the println statements) was forcing evaluation and causing the behaviour above. When I remove the print statements, it evaluates as it should.
Don't use println in your lazy evaluations!
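To make the cause concrete: in takeWhile, b is a by-name parameter, so every reference to it re-runs the rest of the fold. Interpolating $b in the println is one reference, and the tl that cons later forces (when toList calls uncons) is a second one, which is where the extra two output lines come from. Here is a minimal sketch of that behaviour outside the stream code; ByNameDemo, expensive, useTwice and useTwiceMemoized are names made up for this illustration.

```scala
object ByNameDemo {
  // Counts how many times the by-name argument is actually evaluated.
  var evaluations = 0

  def expensive(): Int = {
    evaluations += 1
    42
  }

  // `b` is by-name: each reference re-runs the argument expression.
  def useTwice(b: => Int): Int = {
    val first = b  // evaluation 1 (comparable to the println in takeWhile)
    val second = b // evaluation 2 (comparable to cons forcing its tail later)
    first + second
  }

  // Binding the by-name argument to a lazy val memoizes it:
  // the argument expression runs at most once.
  def useTwiceMemoized(b: => Int): Int = {
    lazy val cached = b
    cached + cached
  }
}
```

If you do want to log inside a lazy fold, binding the by-name value to a lazy val first keeps the extra evaluation from happening, though printing it still forces it once.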
How do I rewrite the following loop (pattern) into Scala, either using built-in higher order functions or tail recursion?
This is an example of an iteration pattern where you do a computation (a comparison, for example) on two list elements, but only if the second one comes after the first one in the original input. Note that a +1 step is used here, but in general it could be +n.
public <T, U> List<U> mapNext(List<T> list) {
    List<U> results = new ArrayList<>();
    for (int i = 0; i < list.size() - 1; i++) {
        for (int j = i + 1; j < list.size(); j++) {
            // doSomething stands for the pairwise computation
            results.add(doSomething(list.get(i), list.get(j)));
        }
    }
    return results;
}
So far, I've come up with this in Scala:
def mapNext[T, U](list: List[T])(f: (T, T) => U): List[U] = {
@scala.annotation.tailrec
def loop(ix: List[T], jx: List[T], res: List[U]): List[U] = (ix, jx) match {
case (_ :: _ :: is, Nil) => loop(ix, ix.tail, res)
case (i :: _ :: is, j :: Nil) => loop(ix.tail, Nil, f(i, j) :: res)
case (i :: _ :: is, j :: js) => loop(ix, js, f(i, j) :: res)
case _ => res
}
loop(list, Nil, Nil).reverse
}
Edit:
To all contributors, I only wish I could accept every answer as solution :)
Here's my stab. I think it's pretty readable. The intuition is: for each head of the list, apply the function to the head and each member of the tail. Then recurse on the tail of the list.
def mapNext[U, T](list: List[U], fun: (U, U) => T): List[T] = list match {
case Nil => Nil
case (first :: Nil) => Nil
case (first :: rest) => rest.map(fun(first, _: U)) ++ mapNext(rest, fun)
}
Here's a sample run
scala> mapNext(List(1, 2, 3, 4), (x: Int, y: Int) => x + y)
res6: List[Int] = List(3, 4, 5, 5, 6, 7)
This one isn't explicitly tail recursive but an accumulator could be easily added to make it.
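For reference, here is one way the accumulator could be added. This is only a sketch, not the only way to do it: the wrapper object MapNextAcc and the names loop, remaining and acc are my own. Results are prepended to the accumulator, and a single reverse at the end restores the original order.

```scala
import scala.annotation.tailrec

object MapNextAcc {
  def mapNext[T, U](list: List[T])(f: (T, T) => U): List[U] = {
    @tailrec
    def loop(remaining: List[T], acc: List[U]): List[U] = remaining match {
      case Nil => acc.reverse
      case head :: tail =>
        // Prepend f(head, x) for every x after head; prepending is O(1) per element.
        loop(tail, tail.foldLeft(acc)((out, x) => f(head, x) :: out))
    }
    loop(list, Nil)
  }
}
```

Calling MapNextAcc.mapNext(List(1, 2, 3, 4))(_ + _) gives List(3, 4, 5, 5, 6, 7), the same as the sample run above.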
Recursion is certainly an option, but the standard library offers some alternatives that will achieve the same iteration pattern.
Here's a very simple setup for demonstration purposes.
val lst = List("a","b","c","d")
def doSomething(a:String, b:String) = a+b
And here's one way to get at what we're after.
val resA = lst.tails.toList.init.flatMap(tl=>tl.tail.map(doSomething(tl.head,_)))
// resA: List[String] = List(ab, ac, ad, bc, bd, cd)
This works but the fact that there's a map() within a flatMap() suggests that a for comprehension might be used to pretty it up.
val resB = for {
tl <- lst.tails
if tl.nonEmpty
h = tl.head
x <- tl.tail
} yield doSomething(h, x) // resB: Iterator[String] = non-empty iterator
resB.toList // List(ab, ac, ad, bc, bd, cd)
In both cases the toList conversion is used to get us back to the original collection type, which might not actually be necessary depending on what further processing of the collection is required.
Comeback Attempt:
After deleting my first attempt to give an answer I put some more thought into it and came up with another, at least shorter solution.
import scala.annotation.tailrec
def mapNext[T, U](list: List[T])(f: (T, T) => U): List[U] = {
@tailrec
def loop(in: List[T], out: List[U]): List[U] = in match {
case Nil => out
case head :: tail => loop(tail, out ::: tail.map { f(head, _) } )
}
loop(list, Nil)
}
I would also like to recommend the enrich my library pattern for adding the mapNext function to the List api (or with some adjustments to any other collection).
import scala.annotation.tailrec
object collection {
object Implicits {
implicit class RichList[A](private val underlying: List[A]) extends AnyVal {
def mapNext[U](f: (A, A) => U): List[U] = {
@tailrec
def loop(in: List[A], out: List[U]): List[U] = in match {
case Nil => out
case head :: tail => loop(tail, out ::: tail.map { f(head, _) } )
}
loop(underlying, Nil)
}
}
}
}
Then you can use the function like:
list.mapNext(doSomething)
Again, there is a downside, as concatenating lists is relatively expensive.
However, variable assignments inside for comprehensions can be quite inefficient, too (as the Dotty improvement task "Scala Wart: Convoluted de-sugaring of for-comprehensions" suggests).
UPDATE
Now that I'm into this, I simply cannot let go :(
Concerning 'Note that the +1 step is used here, but in general, it could be +n.'
I extended my proposal with some parameters to cover more situations:
import scala.annotation.tailrec
object collection {
object Implicits {
implicit class RichList[A](private val underlying: List[A]) extends AnyVal {
def mapNext[U](f: (A, A) => U): List[U] = {
@tailrec
def loop(in: List[A], out: List[U]): List[U] = in match {
case Nil => out
case head :: tail => loop(tail, out ::: tail.map { f(head, _) } )
}
loop(underlying, Nil)
}
def mapEvery[U](step: Int)(f: A => U) = {
@tailrec
def loop(in: List[A], out: List[U]): List[U] = {
in match {
case Nil => out.reverse
case head :: tail => loop(tail.drop(step), f(head) :: out)
}
}
loop(underlying, Nil)
}
def mapDrop[U](drop1: Int, drop2: Int, step: Int)(f: (A, A) => U): List[U] = {
@tailrec
def loop(in: List[A], out: List[U]): List[U] = in match {
case Nil => out
case head :: tail =>
loop(tail.drop(drop1), out ::: tail.drop(drop2).mapEvery(step) { f(head, _) } )
}
loop(underlying, Nil)
}
}
}
}
list // [a, b, c, d, ...]
.indices // [0, 1, 2, 3, ...]
.flatMap { i =>
val elem = list(i) // Don't redo access every iteration of the below map.
list.drop(i + 1) // Take only the inputs that come after the one we're working on
.map(doSomething(elem, _))
}
// Or with a for-comprehension
for {
index <- list.indices
thisElem = list(index)
thatElem <- list.drop(index + 1)
} yield doSomething(thisElem, thatElem)
You start, not with the list, but with its indices. Then, you use flatMap, because each index goes to a list of elements. Use drop to take only the elements after the element we're working on, and map that list to actually run the computation. Note that this has terrible time complexity, because most operations here, indices/length, flatMap, map, are O(n) in the list size, and drop and apply are O(n) in the argument.
You can get better performance if you a) stop using a linked list (List is good for LIFO, sequential access, but Vector is better in the general case), and b) make this a tiny bit uglier:
val len = vector.length
(0 until len)
.flatMap { thisIdx =>
val thisElem = vector(thisIdx)
((thisIdx + 1) until len)
.map { thatIdx =>
doSomething(thisElem, vector(thatIdx))
}
}
// Or
val len = vector.length
for {
thisIdx <- 0 until len
thisElem = vector(thisIdx)
thatIdx <- (thisIdx + 1) until len
thatElem = vector(thatIdx)
} yield doSomething(thisElem, thatElem)
If you really need to, you can generalize either version of this code to all IndexedSeqs, by using some implicit CanBuildFrom parameters, but I won't cover that.
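If always getting a Vector back is acceptable, a simpler middle ground is to accept any IndexedSeq and skip the builder machinery altogether. A sketch under that assumption; PairsIndexed and its mapNext are names I made up for this illustration:

```scala
object PairsIndexed {
  // Accepts any IndexedSeq (Vector, ArraySeq, Range, ...) but always returns a Vector,
  // so no CanBuildFrom / builder plumbing is needed.
  def mapNext[A, B](xs: IndexedSeq[A])(f: (A, A) => B): Vector[B] = {
    val len = xs.length
    val pairs = for {
      i <- 0 until len
      j <- (i + 1) until len // only elements that come after position i
    } yield f(xs(i), xs(j))
    pairs.toVector
  }
}
```

For example, PairsIndexed.mapNext(Vector("a", "b", "c", "d"))(_ + _) yields Vector(ab, ac, ad, bc, bd, cd).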
In scalaz 7.2.6, I want to implement sequence on Disjunction, such that if there is one or more lefts, it returns a list of those, instead of taking only the first one (as in Disjunction.sequenceU):
import scalaz._, Scalaz._
List(1.right, 2.right, 3.right).sequence
res1: \/-(List(1, 2, 3))
List(1.right, "error2".left, "error3".left).sequence
res2: -\/(List(error2, error3))
I've implemented it as follows and it works, but it looks ugly. Is there a getRight method (such as in scala Either class, Right[String, Int](3).right.get)? And how to improve this code?
implicit class RichSequence[L, R](val l: List[\/[L, R]]) {
def getLeft(v: \/[L, R]):L = v match { case -\/(x) => x }
def getRight(v: \/[L, R]):R = v match { case \/-(x) => x }
def sequence: \/[List[L], List[R]] =
if (l.forall(_.isRight)) {
l.map(e => getRight(e)).right
} else {
l.filter(_.isLeft).map(e => getLeft(e)).left
}
}
Playing around I've implemented a recursive function for that, but the best option would be to use separate:
implicit class RichSequence[L, R](val l: List[\/[L, R]]) {
def sequence: \/[List[L], List[R]] = {
def seqLoop(left: List[L], right: List[R], list: List[\/[L, R]]): \/[List[L], List[R]] =
list match {
case (h :: t) =>
h match {
case -\/(e) => seqLoop(left :+ e, right, t)
case \/-(s) => seqLoop(left, right :+ s, t)
}
case Nil =>
if(left.isEmpty) \/-(right)
else -\/(left)
}
seqLoop(List(), List(), l)
}
def sequenceSeparate: \/[List[L], List[R]] = {
val (left, right) = l.separate[\/[L, R], L, R]
if(left.isEmpty) \/-(right)
else -\/(left)
}
}
The first one just collects the results and decides at the end what to do with them. The second is basically the same, except that the recursive function is replaced by the much simpler separate. I didn't think about performance here: I've used :+, which is O(n) on a List, so if you care, prepend instead (and reverse at the end) or use some other collection.
You may also want to take a look at Validation and ValidationNel, which, unlike Disjunction, accumulate failures.
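For comparison, the same "collect every left, otherwise every right" logic can be sketched without scalaz. Note that this swaps in the standard library's Either and List#partitionMap (so it assumes Scala 2.13 or later) in place of \/ and separate; SequenceAll and sequenceAll are names made up for this illustration.

```scala
object SequenceAll {
  // Assumes Scala 2.13+, where partitionMap is available on collections.
  def sequenceAll[L, R](xs: List[Either[L, R]]): Either[List[L], List[R]] = {
    // Split into all the Left values and all the Right values, preserving order.
    val (lefts, rights) = xs.partitionMap(identity)
    if (lefts.isEmpty) Right(rights) else Left(lefts)
  }
}
```

With this, List(Right(1), Left("error2"), Left("error3")) becomes Left(List("error2", "error3")), and a list of only Right values becomes a Right of the collected values.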
I need a maxBy that returns all max values in case of equality.
Here is the signature and a first implementation:
def maxBy[A, B](as: Seq[A])(f: A => B)(implicit cmp: Ordering[B]) : Seq[A] =
as.groupBy(f).toList.maxBy(_._1)._2
Example:
maxBy(Seq(("a", "a1"),("a", "a2"),("b", "b1"),("b", "b2")))(_._1)
res6: Seq[(String, String)] = List(("b", "b1"), ("b", "b2"))
Updated with @thearchetypepaul comment
def maxBy[A, B](l: Seq[A])(f: A => B)(implicit cmp: Ordering[B]) : Seq[A] = {
l.foldLeft(Seq.empty[A])((b, a) =>
b.headOption match {
case None => Seq(a)
case Some(v) => cmp.compare(f(a), f(v)) match {
case -1 => b
case 0 => b.+:(a)
case 1 => Seq(a)
}
}
)
}
Is there a better way?
(1) Ordering#compare promises to denote the three possible results by a negative, positive, or zero number, not -1, 1, or 0 necessarily.
(2) Option#fold is generally (though not universally) considered to be more idiomatic than pattern matching.
(3) You are calling f possibly multiple times per element. TraversableOnce#maxBy used to do this before it was fixed in 2.11.
(4) You only accept Seq. The Scala library works hard to use CanBuildFrom to generalize the algorithms; you might want to as well.
(5) You can use the syntactic sugar B : Ordering if you would like.
(6) You prepend to the Seq. This is faster than appending, since prepending is O(1) for List and appending is O(n). But you wind up with the results in reverse order. foldRight will correct this. (Or you can call reverse on the final result.)
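Before the more general versions, here is a minimal sketch that keeps the original Seq-only signature and applies points (1), (3) and (6): it looks only at the sign of compare, calls f once per element, and prepends before doing one final reverse. The names MaxAllBy and maxAllBy are made up for this illustration.

```scala
object MaxAllBy {
  def maxAllBy[A, B](as: Seq[A])(f: A => B)(implicit ord: Ordering[B]): List[A] = {
    // State: the best key seen so far (if any) and the elements achieving it, newest first.
    val (_, best) = as.foldLeft((Option.empty[B], List.empty[A])) {
      case ((None, _), a) => (Some(f(a)), List(a))
      case ((bestKey @ Some(k), acc), a) =>
        val key = f(a) // f is called exactly once per element
        val c = ord.compare(key, k) // only the sign of c is meaningful
        if (c > 0) (Some(key), List(a)) // strictly better: start over
        else if (c == 0) (bestKey, a :: acc) // tie: keep it as well
        else (bestKey, acc)
    }
    best.reverse // undo the prepending so ties keep their input order
  }
}
```

On the example from the question this returns List(("b", "b1"), ("b", "b2")), and an empty input gives Nil.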
If you want to allow the use of CanBuildFrom,
import scala.collection.TraversableLike
import scala.collection.generic.CanBuildFrom
def maxBy[A, Repr, That, B : Ordering](elements: TraversableLike[A, Repr])(f: A => B)(implicit bf: CanBuildFrom[Repr, A, That]): That = {
val b = bf()
elements.foldLeft(Option.empty[B]) { (best, element) =>
val current = f(element)
val result = best.fold(0)(implicitly[Ordering[B]].compare(current, _))
if (result > 0) {
b.clear()
}
if (result >= 0) {
b += element
Some(current)
} else {
best
}
}
b.result
}
If you want to work with TraversableOnce,
def maxBy[A, B : Ordering](elements: TraversableOnce[A])(f: A => B): Seq[A] = {
elements.foldRight((Option.empty[B], List.empty[A])) { case (element, (best, elements)) =>
val current = f(element)
val result = best.fold(0)(implicitly[Ordering[B]].compare(current, _))
if (result > 0) {
(Some(current), List(element))
} else {
(best, if (result == 0) element +: elements else elements)
}
}._2
}
If the dataset is small, performance shouldn't concern you much, and you can just sort, reverse, and takeWhile.
def maxBy[A,B:Ordering](l:List[A], f: A => B): List[A] = {
l.sortBy(f).reverse match {
case Nil => Nil
case h :: t => h :: t.takeWhile(x => f(x) == f(h))
}
}
where the f should be an isomorphism on A.
You can also save f(h) in a val before the comparison, so it is not recomputed for every element.
I implemented a simple method to generate Cartesian product on several Seqs like this:
object RichSeq {
implicit def toRichSeq[T](s: Seq[T]) = new RichSeq[T](s)
}
class RichSeq[T](s: Seq[T]) {
import RichSeq._
def cartesian(ss: Seq[Seq[T]]): Seq[Seq[T]] = {
ss.toList match {
case Nil => Seq(s)
case s2 :: Nil => {
for (e <- s) yield s2.map(e2 => Seq(e, e2))
}.flatten
case s2 :: tail => {
for (e <- s) yield s2.cartesian(tail).map(seq => e +: seq)
}.flatten
}
}
}
Obviously, this one is really slow, as it calculates the whole product at once. Did anyone implement a lazy solution for this problem in Scala?
UPD
OK, So I implemented a reeeeally stupid, but working version of an iterator over a Cartesian product. Posting here for future enthusiasts:
import scala.collection.mutable.ArrayBuffer
object RichSeq {
implicit def toRichSeq[T](s: Seq[T]) = new RichSeq(s)
}
class RichSeq[T](s: Seq[T]) {
def lazyCartesian(ss: Seq[Seq[T]]): Iterator[Seq[T]] = new Iterator[Seq[T]] {
private[this] val seqs = s +: ss
private[this] var indexes = Array.fill(seqs.length)(0)
private[this] val counts = Vector(seqs.map(_.length - 1): _*)
private[this] var current = 0
def next(): Seq[T] = {
val buffer = ArrayBuffer.empty[T]
if (current != 0) {
throw new NoSuchElementException("no more elements to traverse")
}
val newIndexes = ArrayBuffer.empty[Int]
var inside = 0
for ((index, i) <- indexes.zipWithIndex) {
buffer.append(seqs(i)(index))
newIndexes.append(index)
if ((0 to i).forall(ind => newIndexes(ind) == counts(ind))) {
inside = inside + 1
}
}
current = inside
if (current < seqs.length) {
for (i <- (0 to current).reverse) {
if ((0 to i).forall(ind => newIndexes(ind) == counts(ind))) {
newIndexes(i) = 0
} else if (newIndexes(i) < counts(i)) {
newIndexes(i) = newIndexes(i) + 1
}
}
current = 0
indexes = newIndexes.toArray
}
buffer.result()
}
def hasNext: Boolean = current != seqs.length
}
}
Here's my solution to the given problem. Note that the laziness simply comes from using .view on the "root collection" of the for comprehension.
scala> def combine[A](xs: Traversable[Traversable[A]]): Seq[Seq[A]] =
| xs.foldLeft(Seq(Seq.empty[A])){
| (x, y) => for (a <- x.view; b <- y) yield a :+ b }
combine: [A](xs: Traversable[Traversable[A]])Seq[Seq[A]]
scala> combine(Set(Set("a","b","c"), Set("1","2"), Set("S","T"))) foreach (println(_))
List(a, 1, S)
List(a, 1, T)
List(a, 2, S)
List(a, 2, T)
List(b, 1, S)
List(b, 1, T)
List(b, 2, S)
List(b, 2, T)
List(c, 1, S)
List(c, 1, T)
List(c, 2, S)
List(c, 2, T)
To obtain this, I started from the function combine defined in https://stackoverflow.com/a/4515071/53974, passing it the function (a, b) => (a, b). However, that didn't quite work directly, since that code expects a function of type (A, A) => A. So I just adapted the code a bit.
These might be a starting point:
Cartesian product of two lists
Expand a Set[Set[String]] into Cartesian Product in Scala
https://stackoverflow.com/questions/6182126/im-learning-scala-would-it-be-possible-to-get-a-little-code-review-and-mentori
What about:
def cartesian[A](list: List[Seq[A]]): Iterator[Seq[A]] = {
if (list.isEmpty) {
Iterator(Seq())
} else {
list.head.iterator.flatMap { i => cartesian(list.tail).map(i +: _) }
}
}
Simple and lazy ;)
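To see the laziness in action, here is a small sketch that asks for only the first three combinations of a product with 10^10 elements. The function is the same as above apart from formatting; the wrapper object LazyCartesianDemo and the values big and firstThree are names made up for this illustration.

```scala
object LazyCartesianDemo {
  def cartesian[A](list: List[Seq[A]]): Iterator[Seq[A]] =
    if (list.isEmpty) Iterator(Seq.empty[A])
    else list.head.iterator.flatMap(i => cartesian(list.tail).map(i +: _))

  // Ten sequences of ten digits each: 10^10 combinations, far too many to materialize.
  val big: List[Seq[Int]] = List.fill(10)(0 to 9)

  // Only the combinations actually requested are ever built.
  val firstThree: List[Seq[Int]] = cartesian(big).take(3).toList
}
```

One thing to be aware of: cartesian(list.tail) is rebuilt for every element of the head, so laziness is bought at the cost of some repeated work.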
def cartesian[A](list: List[List[A]]): List[List[A]] = {
list match {
case Nil => List(List())
case h :: t => h.flatMap(i => cartesian(t).map(i :: _))
}
}
See https://stackoverflow.com/a/8318364/312172 for how to translate a number into an index into all possible values, without generating every element.
This technique can be used to implement a stream.
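I won't reproduce the linked answer here, but the general idea can be sketched as follows, assuming it amounts to reading the running number as a mixed-radix value with one digit per input sequence. IndexedCartesian, elementAt, size and lazyProduct are names made up for this illustration.

```scala
object IndexedCartesian {
  // Decodes `index` as a mixed-radix number: one digit per sequence, and each
  // digit selects an element. The last sequence varies fastest, which matches
  // the order produced by nested loops.
  def elementAt[A](seqs: Vector[IndexedSeq[A]], index: Long): Vector[A] = {
    val (_, picked) = seqs.foldRight((index, List.empty[A])) {
      case (s, (rest, acc)) =>
        val digit = (rest % s.length).toInt
        (rest / s.length, s(digit) :: acc)
    }
    picked.toVector
  }

  // Total number of combinations (1 for no sequences, 0 if any sequence is empty).
  def size[A](seqs: Vector[IndexedSeq[A]]): Long =
    seqs.map(_.length.toLong).product

  // Lazily enumerates the product by counting from 0 until size.
  def lazyProduct[A](seqs: Vector[IndexedSeq[A]]): Iterator[Vector[A]] = {
    val total = size(seqs)
    Iterator.iterate(0L)(_ + 1).takeWhile(_ < total).map(elementAt(seqs, _))
  }
}
```

Because each combination is computed directly from its index, you can also jump straight to any position with elementAt instead of iterating up to it.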