How to idiomatically iteratively flatMap a collection against its own members? - scala

A simple class with flatMap/map that does nothing but lazily store a value:
[Note1: this class could be replaced with any class with flatMap/map. Option is only one concrete example, this question is in regards to the general case]
[Note2: scalaz is an interesting library, but this question is not in regards to it. If there is not a std scala library solution other than what I have posted below, that is acceptable.]
class C[A](value : => A) {
def flatMap[B](f: A => C[B]) : C[B] = { f(value) }
def map[B](f: A => B) : C[B] = { new C(f(value)) }
override def toString = s"C($value)"
}
object C {
def apply[A](value : => A) = new C[A](value)
}
A function that iteratively applies flatMap to its members:
def invert[A](xs: Traversable[C[A]], acc: List[A] = Nil) : C[List[A]] =
if(xs.nonEmpty) {
xs.head flatMap { a => invert(xs.tail, a :: acc) }
} else {
C(acc.reverse)
}
Function in action:
scala> val l = List(C(1),C(2),C(3))
l: List[C[Int]] = List(C(1), C(2), C(3))
scala> invert(l)
res4: C[List[Int]] = C(List(1, 2, 3))
Is there a way rewrite "invert" idiomatically? Also, is there a functional "verb" that captures what I'm doing here?

The problem with your solution is that it will give a stack overflow for large lists, as it is fully (not just tail) recursive. I'd fold instead:
def invert[A](xs: Traversable[C[A]]) =
(C(List[A]()) /: xs){ (c,x) => c.flatMap(l => x.map(_ :: l)) }.map(_.reverse)

You might make invert a bit clearer with a pattern match, but it's essentially the same code:
xs match {
case x0 :: xMore => x0.flatMap(a => invert(xMore, a :: acc))
case Nil => C(acc.reverse)
}

Related

Rewriting imperative for loop to declarative style in Scala

How do I rewrite the following loop (pattern) into Scala, either using built-in higher order functions or tail recursion?
This the example of an iteration pattern where you do a computation (comparison, for example) of two list elements, but only if the second one comes after first one in the original input. Note that the +1 step is used here, but in general, it could be +n.
public List<U> mapNext(List<T> list) {
List<U> results = new ArrayList();
for (i = 0; i < list.size - 1; i++) {
for (j = i + 1; j < list.size; j++) {
results.add(doSomething(list[i], list[j]))
}
}
return results;
}
So far, I've come up with this in Scala:
def mapNext[T, U](list: List[T])(f: (T, T) => U): List[U] = {
#scala.annotation.tailrec
def loop(ix: List[T], jx: List[T], res: List[U]): List[U] = (ix, jx) match {
case (_ :: _ :: is, Nil) => loop(ix, ix.tail, res)
case (i :: _ :: is, j :: Nil) => loop(ix.tail, Nil, f(i, j) :: res)
case (i :: _ :: is, j :: js) => loop(ix, js, f(i, j) :: res)
case _ => res
}
loop(list, Nil, Nil).reverse
}
Edit:
To all contributors, I only wish I could accept every answer as solution :)
Here's my stab. I think it's pretty readable. The intuition is: for each head of the list, apply the function to the head and every other member of the tail. Then recurse on the tail of the list.
def mapNext[U, T](list: List[U], fun: (U, U) => T): List[T] = list match {
case Nil => Nil
case (first :: Nil) => Nil
case (first :: rest) => rest.map(fun(first, _: U)) ++ mapNext(rest, fun)
}
Here's a sample run
scala> mapNext(List(1, 2, 3, 4), (x: Int, y: Int) => x + y)
res6: List[Int] = List(3, 4, 5, 5, 6, 7)
This one isn't explicitly tail recursive but an accumulator could be easily added to make it.
Recursion is certainly an option, but the standard library offers some alternatives that will achieve the same iteration pattern.
Here's a very simple setup for demonstration purposes.
val lst = List("a","b","c","d")
def doSomething(a:String, b:String) = a+b
And here's one way to get at what we're after.
val resA = lst.tails.toList.init.flatMap(tl=>tl.tail.map(doSomething(tl.head,_)))
// resA: List[String] = List(ab, ac, ad, bc, bd, cd)
This works but the fact that there's a map() within a flatMap() suggests that a for comprehension might be used to pretty it up.
val resB = for {
tl <- lst.tails
if tl.nonEmpty
h = tl.head
x <- tl.tail
} yield doSomething(h, x) // resB: Iterator[String] = non-empty iterator
resB.toList // List(ab, ac, ad, bc, bd, cd)
In both cases the toList cast is used to get us back to the original collection type, which might not actually be necessary depending on what further processing of the collection is required.
Comeback Attempt:
After deleting my first attempt to give an answer I put some more thought into it and came up with another, at least shorter solution.
def mapNext[T, U](list: List[T])(f: (T, T) => U): List[U] = {
#tailrec
def loop(in: List[T], out: List[U]): List[U] = in match {
case Nil => out
case head :: tail => loop(tail, out ::: tail.map { f(head, _) } )
}
loop(list, Nil)
}
I would also like to recommend the enrich my library pattern for adding the mapNext function to the List api (or with some adjustments to any other collection).
object collection {
object Implicits {
implicit class RichList[A](private val underlying: List[A]) extends AnyVal {
def mapNext[U](f: (A, A) => U): List[U] = {
#tailrec
def loop(in: List[A], out: List[U]): List[U] = in match {
case Nil => out
case head :: tail => loop(tail, out ::: tail.map { f(head, _) } )
}
loop(underlying, Nil)
}
}
}
}
Then you can use the function like:
list.mapNext(doSomething)
Again, there is a downside, as concatenating lists is relatively expensive.
However, variable assignemends inside for comprehensions can be quite inefficient, too (as this improvement task for dotty Scala Wart: Convoluted de-sugaring of for-comprehensions suggests).
UPDATE
Now that I'm into this, I simply cannot let go :(
Concerning 'Note that the +1 step is used here, but in general, it could be +n.'
I extended my proposal with some parameters to cover more situations:
object collection {
object Implicits {
implicit class RichList[A](private val underlying: List[A]) extends AnyVal {
def mapNext[U](f: (A, A) => U): List[U] = {
#tailrec
def loop(in: List[A], out: List[U]): List[U] = in match {
case Nil => out
case head :: tail => loop(tail, out ::: tail.map { f(head, _) } )
}
loop(underlying, Nil)
}
def mapEvery[U](step: Int)(f: A => U) = {
#tailrec
def loop(in: List[A], out: List[U]): List[U] = {
in match {
case Nil => out.reverse
case head :: tail => loop(tail.drop(step), f(head) :: out)
}
}
loop(underlying, Nil)
}
def mapDrop[U](drop1: Int, drop2: Int, step: Int)(f: (A, A) => U): List[U] = {
#tailrec
def loop(in: List[A], out: List[U]): List[U] = in match {
case Nil => out
case head :: tail =>
loop(tail.drop(drop1), out ::: tail.drop(drop2).mapEvery(step) { f(head, _) } )
}
loop(underlying, Nil)
}
}
}
}
list // [a, b, c, d, ...]
.indices // [0, 1, 2, 3, ...]
.flatMap { i =>
elem = list(i) // Don't redo access every iteration of the below map.
list.drop(i + 1) // Take only the inputs that come after the one we're working on
.map(doSomething(elem, _))
}
// Or with a monad-comprehension
for {
index <- list.indices
thisElem = list(index)
thatElem <- list.drop(index + 1)
} yield doSomething(thisElem, thatElem)
You start, not with the list, but with its indices. Then, you use flatMap, because each index goes to a list of elements. Use drop to take only the elements after the element we're working on, and map that list to actually run the computation. Note that this has terrible time complexity, because most operations here, indices/length, flatMap, map, are O(n) in the list size, and drop and apply are O(n) in the argument.
You can get better performance if you a) stop using a linked list (List is good for LIFO, sequential access, but Vector is better in the general case), and b) make this a tiny bit uglier
val len = vector.length
(0 until len)
.flatMap { thisIdx =>
val thisElem = vector(thisIdx)
((thisIdx + 1) until len)
.map { thatIdx =>
doSomething(thisElem, vector(thatIdx))
}
}
// Or
val len = vector.length
for {
thisIdx <- 0 until len
thisElem = vector(thisIdx)
thatIdx <- (thisIdx + 1) until len
thatElem = vector(thatIdx)
} yield doSomething(thisElem, thatElem)
If you really need to, you can generalize either version of this code to all IndexedSeqs, by using some implicit CanBuildFrom parameters, but I won't cover that.

scalaz, Disjunction.sequence returning a list of lefts

In scalaz 7.2.6, I want to implement sequence on Disjunction, such that if there is one or more lefts, it returns a list of those, instead of taking only the first one (as in Disjunction.sequenceU):
import scalaz._, Scalaz._
List(1.right, 2.right, 3.right).sequence
res1: \/-(List(1, 2, 3))
List(1.right, "error2".left, "error3".left).sequence
res2: -\/(List(error2, error3))
I've implemented it as follows and it works, but it looks ugly. Is there a getRight method (such as in scala Either class, Right[String, Int](3).right.get)? And how to improve this code?
implicit class RichSequence[L, R](val l: List[\/[L, R]]) {
def getLeft(v: \/[L, R]):L = v match { case -\/(x) => x }
def getRight(v: \/[L, R]):R = v match { case \/-(x) => x }
def sequence: \/[List[L], List[R]] =
if (l.forall(_.isRight)) {
l.map(e => getRight(e)).right
} else {
l.filter(_.isLeft).map(e => getLeft(e)).left
}
}
Playing around I've implemented a recursive function for that, but the best option would be to use separate:
implicit class RichSequence[L, R](val l: List[\/[L, R]]) {
def sequence: \/[List[L], List[R]] = {
def seqLoop(left: List[L], right: List[R], list: List[\/[L, R]]): \/[List[L], List[R]] =
list match {
case (h :: t) =>
h match {
case -\/(e) => seqLoop(left :+ e, right, t)
case \/-(s) => seqLoop(left, right :+ s, t)
}
case Nil =>
if(left.isEmpty) \/-(right)
else -\/(left)
}
seqLoop(List(), List(), l)
}
def sequenceSeparate: \/[List[L], List[R]] = {
val (left, right) = l.separate[\/[L, R], L, R]
if(left.isEmpty) \/-(right)
else -\/(left)
}
}
The first one just collects results and at the end decide what to do with those, the second its basically the same with the exception that the recursive function is much simpler, I didn't think about performance here, I've used :+, if you care use prepend or some other collection.
You may also want to take a look at Validation and ValidationNEL which unlike Disjunction accumulate failures.

How to implement maxBy with multiple max in Scala

I need a maxBy that returns all max values in case of equality.
Here is the signature and a first implementation :
def maxBy[A, B](as: Seq[A])(f: A => B)(implicit cmp: Ordering[B]) : Seq[A] =
as.groupBy(f).toList.maxBy(_._1)._2
Example :
maxBy(Seq(("a", "a1"),("a", "a2"),("b", "b1"),("b", "b2")))(_._1)
res6: Seq[(String, String)] = List(("b", "b1"), ("b", "b2"))
Updated with #thearchetypepaul comment
def maxBy[A, B](l: Seq[A])(f: A => B)(implicit cmp: Ordering[B]) : Seq[A] = {
l.foldLeft(Seq.empty[A])((b, a) =>
b.headOption match {
case None => Seq(a)
case Some(v) => cmp.compare(f(a), f(v)) match {
case -1 => b
case 0 => b.+:(a)
case 1 => Seq(a)
}
}
)
}
Is there a better way ?
(1) Ordering#compare promises to denote the three possible results by a negative, positive, or zero number, not -1, 1, or 0 necessarily.
(2) Option#fold is generally (though not universally) considered to be more idiomatic than pattern matching.
(3) You are calling f possibly multiple times per element. TraversableOnce#maxBy used to do this before it was fixed in 2.11.
(4) You only accept Seq. The Scala library works hard to use CanBuildFrom to generalize the algorithms; you might want to as well.
(5) You can use the syntactic sugar B : Ordering if you would like.
(6) You prepend to the Seq. This is faster than appending, since prepending is O(1) for List and appending is O(n). But you wind up with the results in reverse order. foldRight will correct this. (Or you can call reverse on the final result.)
If you want to allow the use of CanBuildFrom,
def maxBy[A, Repr, That, B : Ordering](elements: TraversableLike[A, Repr])(f: A => B)(implicit bf: CanBuildFrom[Repr, A, That]): That = {
val b = bf()
elements.foldLeft(Option.empty[B]) { (best, element) =>
val current = f(element)
val result = best.fold(0)(implicitly[Ordering[B]].compare(current, _))
if (result > 0) {
b.clear()
}
if (result >= 0) {
b += element
Some(current)
} else {
best
}
}
b.result
}
If you want to work with TraversableOnce,
def maxBy[A, B : Ordering](elements: TraversableOnce[A])(f: A => B): Seq[A] = {
elements.foldRight((Option.empty[B], List.empty[A])) { case (element, (best, elements)) =>
val current = f(element)
val result = best.fold(0)(implicitly[Ordering[B]].compare(current, _))
if (result > 0) {
(Some(current), List(element))
} else {
(best, if (result == 0) element +: elements else elements)
}
}._2
}
if the dataset is small the performance shouldn't concern you that much
and you can just sort, reverse, and takeWhile.
def maxBy[A,B:Ordering](l:List[A], f: A => B): List[A] = {
l.sortBy(f).reverse match {
case Nil => Nil
case h :: t => h :: t.takeWhile(x => f(x) == f(h))
}
}
where the f should be an isomorphism on A.
and you can also save f(h) before comparison

What would be the good name for this operation?

I see that Scala standard library misses the method to get ranges of objects in the collection, that satisfy the predicate:
def <???>(p: A => Boolean): List[List[A]] = {
val buf = collection.mutable.ListBuffer[List[A]]()
var elems = this.dropWhile(e => !p(e))
while (elems.nonEmpty) {
buf += elems.takeWhile(p)
elems = elems.dropWhile(e => !p(e))
}
buf.toList
}
What would be the good name for such method? And is my implementation good enough?
I'd go for chunkWith or chunkBy
As for your implementation, I think this cries out for recursion! See if you can fill out this
#tailrec def chunkBy[A](l: List[A], acc: List[List[A]] = Nil)(p: A => Boolean): List[List[A]] = l match {
case Nil => acc
case l =>
val next = l dropWhile !p
val (chunk, rest) = next span p
chunkBy(rest, chunk :: acc)(p)
}
Why recursion? It's much easier to understand the algorithm and more likely to be bug-free (given the absence of vars).
The syntax !p for the negation of a predicate is achieved via an implicit conversion
implicit def PredicateW[A](p: A => Boolean) = new {
def unary_! : A => Boolean = a => !p(a)
}
I generally keep this around as it's astoundingly useful
How about:
def chunkBy[K](f: A => K): Map[K, List[List[A]]] = ...
Similar to groupBy but keeps contiguous chunks as chunks.
Using this, you can do xs.chunkBy(p)(true) to get what you want.
You probably want to call it splitWith because split is the string operation that more-or-less does that, and it's similar to splitAt.
Incidentally, here's a very compact implementation (though it does a lot of unnecessary work, so it's not a good implementation for speed; yours is fine for that):
def splitWith[A](xs: List[A])(p: A => Boolean) = {
(xs zip xs.scanLeft(1){ (i,x) => if (p(x) == ((i&1)==1)) i+1 else i }.tail).
filter(_._2 % 2 == 0).groupBy(_._2).toList.sortBy(_._1).map(_._2.map(_._1))
}
Just a little refinement of oxbow's code, this way the signature is lighter
def chunkBy[A](xs: List[A])(p: A => Boolean): List[List[A]] = {
#tailrec
def recurse(todo: List[A], acc: List[List[A]]): List[List[A]] = todo match {
case Nil => acc
case _ =>
val next = todo dropWhile (!p(_))
val (chunk, rest) = next span p
recurse(rest, acc ::: List(chunk))
}
recurse(xs, Nil)
}

Replacing imperative PriorityQueue in my algorithm

I currently have a method that uses a scala.collection.mutable.PriorityQueue to combine elements in a certain order. For instance the code looks a bit like this:
def process[A : Ordering](as: Set[A], f: (A, A) => A): A = {
val queue = new scala.collection.mutable.PriorityQueue[A]() ++ as
while (queue.size > 1) {
val a1 = queue.dequeue
val a2 = queue.dequeue
queue.enqueue(f(a1, a2))
}
queue.dequeue
}
The code works as written, but is necessarily pretty imperative. I thought of using a SortedSet instead of the PriorityQueue, but my attempts make the process look a lot messier. What is a more declarative, succinct way of doing what I want to do?
If f doesn't produce elements that are already in the Set, you can indeed use a SortedSet. (If it does, you need an immutable priority queue.) A declarative wayto do this would be:
def process[A:Ordering](s:SortedSet[A], f:(A,A)=>A):A = {
if (s.size == 1) s.head else {
val fst::snd::Nil = s.take(2).toList
val newSet = s - fst - snd + f(fst, snd)
process(newSet, f)
}
}
Tried to improve #Kim Stebel's answer, but I think imperative variant is still more clear.
def process[A:Ordering](s: Set[A], f: (A, A) => A): A = {
val ord = implicitly[Ordering[A]]
#tailrec
def loop(lst: List[A]): A = lst match {
case result :: Nil => result
case fst :: snd :: rest =>
val insert = f(fst, snd)
val (more, less) = rest.span(ord.gt(_, insert))
loop(more ::: insert :: less)
}
loop(s.toList.sorted(ord.reverse))
}
Here's a solution with SortedSet and Stream:
def process[A : Ordering](as: Set[A], f: (A, A) => A): A = {
Stream.iterate(SortedSet.empty ++ as)( ss =>
ss.drop(2) + f(ss.head, ss.tail.head))
.takeWhile(_.size > 1).last.head
}