How do I rewrite the following loop (pattern) into Scala, either using built-in higher order functions or tail recursion?
This is an example of an iteration pattern where you do a computation (a comparison, for example) on two list elements, but only if the second one comes after the first one in the original input. Note that a +1 step is used here, but in general it could be +n.
public List<U> mapNext(List<T> list) {
    List<U> results = new ArrayList<>();
    for (int i = 0; i < list.size() - 1; i++) {
        // pair element i with every element that comes after it
        for (int j = i + 1; j < list.size(); j++) {
            results.add(doSomething(list.get(i), list.get(j)));
        }
    }
    return results;
}
So far, I've come up with this in Scala:
def mapNext[T, U](list: List[T])(f: (T, T) => U): List[U] = {
@scala.annotation.tailrec
def loop(ix: List[T], jx: List[T], res: List[U]): List[U] = (ix, jx) match {
case (_ :: _ :: is, Nil) => loop(ix, ix.tail, res)
case (i :: _ :: is, j :: Nil) => loop(ix.tail, Nil, f(i, j) :: res)
case (i :: _ :: is, j :: js) => loop(ix, js, f(i, j) :: res)
case _ => res
}
loop(list, Nil, Nil).reverse
}
Edit:
To all contributors, I only wish I could accept every answer as solution :)
Here's my stab. I think it's pretty readable. The intuition is: for each head of the list, apply the function to the head and every other member of the tail. Then recurse on the tail of the list.
def mapNext[U, T](list: List[U], fun: (U, U) => T): List[T] = list match {
case Nil => Nil
case (first :: Nil) => Nil
case (first :: rest) => rest.map(fun(first, _: U)) ++ mapNext(rest, fun)
}
Here's a sample run
scala> mapNext(List(1, 2, 3, 4), (x: Int, y: Int) => x + y)
res6: List[Int] = List(3, 4, 5, 5, 6, 7)
This one isn't tail recursive, but an accumulator could easily be added to make it so.
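The accumulator variant mentioned above could look roughly like this. It is only a sketch: `mapNextAcc` is a made-up name, and the results are built in reverse and flipped once at the end so that each step is a cheap prepend.

```scala
import scala.annotation.tailrec

// Hypothetical accumulator-based variant of mapNext (name is illustrative only).
def mapNextAcc[A, B](list: List[A])(f: (A, A) => B): List[B] = {
  @tailrec
  def loop(rest: List[A], acc: List[B]): List[B] = rest match {
    case Nil => acc.reverse // results were prepended, so restore the order once
    case head :: tail =>
      // pair head with every later element, prepending each result to acc
      loop(tail, tail.foldLeft(acc)((a, x) => f(head, x) :: a))
  }
  loop(list, Nil)
}
```

Called as `mapNextAcc(List(1, 2, 3, 4))(_ + _)`, it should give the same pairs, in the same order, as the sample run above.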
Recursion is certainly an option, but the standard library offers some alternatives that will achieve the same iteration pattern.
Here's a very simple setup for demonstration purposes.
val lst = List("a","b","c","d")
def doSomething(a:String, b:String) = a+b
And here's one way to get at what we're after.
val resA = lst.tails.toList.init.flatMap(tl=>tl.tail.map(doSomething(tl.head,_)))
// resA: List[String] = List(ab, ac, ad, bc, bd, cd)
This works but the fact that there's a map() within a flatMap() suggests that a for comprehension might be used to pretty it up.
val resB = for {
tl <- lst.tails
if tl.nonEmpty
h = tl.head
x <- tl.tail
} yield doSomething(h, x) // resB: Iterator[String] = non-empty iterator
resB.toList // List(ab, ac, ad, bc, bd, cd)
In both cases toList is used to get us back to the original collection type, which might not actually be necessary, depending on what further processing the collection needs.
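The question also mentions that the +1 offset could be +n. Assuming that means "pair each element with those at least n positions later", the tails approach might be generalized as sketched below. `mapFrom` and its parameter `n` are hypothetical names, not part of any library.

```scala
// Hypothetical generalization: pair each element with the elements
// that sit at least n positions after it.
def mapFrom[A, B](list: List[A], n: Int)(f: (A, A) => B): List[B] =
  for {
    tl <- list.tails.toList // every suffix of the list, ending with Nil
    if tl.nonEmpty          // skip the final empty suffix
    x  <- tl.drop(n)        // elements n or more positions after tl.head
  } yield f(tl.head, x)
```

With `n = 1` this reproduces the behaviour of the examples above.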
Comeback Attempt:
After deleting my first attempt to give an answer I put some more thought into it and came up with another, at least shorter solution.
def mapNext[T, U](list: List[T])(f: (T, T) => U): List[U] = {
@tailrec
def loop(in: List[T], out: List[U]): List[U] = in match {
case Nil => out
case head :: tail => loop(tail, out ::: tail.map { f(head, _) } )
}
loop(list, Nil)
}
I would also like to recommend the enrich my library pattern for adding the mapNext function to the List api (or with some adjustments to any other collection).
object collection {
object Implicits {
implicit class RichList[A](private val underlying: List[A]) extends AnyVal {
def mapNext[U](f: (A, A) => U): List[U] = {
@tailrec
def loop(in: List[A], out: List[U]): List[U] = in match {
case Nil => out
case head :: tail => loop(tail, out ::: tail.map { f(head, _) } )
}
loop(underlying, Nil)
}
}
}
}
Then you can use the function like:
list.mapNext(doSomething)
Again, there is a downside, as concatenating lists is relatively expensive.
However, variable assignments inside for comprehensions can be quite inefficient, too (as the Dotty improvement task "Scala Wart: Convoluted de-sugaring of for-comprehensions" suggests).
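If the cost of repeated ::: concatenation matters, one option is to collect results in a mutable buffer, at the price of giving up the purely functional style. The following is a sketch only, and `mapNextBuffered` is a made-up name.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical buffer-based variant: appending to a ListBuffer is constant time,
// so no intermediate lists are copied.
def mapNextBuffered[A, B](list: List[A])(f: (A, A) => B): List[B] = {
  val out = ListBuffer.empty[B]
  var rest = list
  while (rest.nonEmpty) {
    val head = rest.head
    rest = rest.tail
    rest.foreach(x => out += f(head, x)) // pair head with every later element
  }
  out.toList
}
```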
UPDATE
Now that I'm into this, I simply cannot let go :(
Concerning 'Note that the +1 step is used here, but in general, it could be +n.'
I extended my proposal with some parameters to cover more situations:
object collection {
object Implicits {
implicit class RichList[A](private val underlying: List[A]) extends AnyVal {
def mapNext[U](f: (A, A) => U): List[U] = {
@tailrec
def loop(in: List[A], out: List[U]): List[U] = in match {
case Nil => out
case head :: tail => loop(tail, out ::: tail.map { f(head, _) } )
}
loop(underlying, Nil)
}
def mapEvery[U](step: Int)(f: A => U) = {
@tailrec
def loop(in: List[A], out: List[U]): List[U] = {
in match {
case Nil => out.reverse
case head :: tail => loop(tail.drop(step), f(head) :: out)
}
}
loop(underlying, Nil)
}
def mapDrop[U](drop1: Int, drop2: Int, step: Int)(f: (A, A) => U): List[U] = {
@tailrec
def loop(in: List[A], out: List[U]): List[U] = in match {
case Nil => out
case head :: tail =>
loop(tail.drop(drop1), out ::: tail.drop(drop2).mapEvery(step) { f(head, _) } )
}
loop(underlying, Nil)
}
}
}
}
list // [a, b, c, d, ...]
.indices // [0, 1, 2, 3, ...]
.flatMap { i =>
val elem = list(i) // Don't redo access every iteration of the below map.
list.drop(i + 1) // Take only the inputs that come after the one we're working on
.map(doSomething(elem, _))
}
// Or with a monad-comprehension
for {
index <- list.indices
thisElem = list(index)
thatElem <- list.drop(index + 1)
} yield doSomething(thisElem, thatElem)
You start, not with the list, but with its indices. Then, you use flatMap, because each index goes to a list of elements. Use drop to take only the elements after the element we're working on, and map that list to actually run the computation. Note that this has terrible time complexity, because most operations here, indices/length, flatMap, map, are O(n) in the list size, and drop and apply are O(n) in the argument.
You can get better performance if you a) stop using a linked list (List is good for LIFO, sequential access, but Vector is better in the general case), and b) make this a tiny bit uglier:
val len = vector.length
(0 until len)
.flatMap { thisIdx =>
val thisElem = vector(thisIdx)
((thisIdx + 1) until len)
.map { thatIdx =>
doSomething(thisElem, vector(thatIdx))
}
}
// Or
val len = vector.length
for {
thisIdx <- 0 until len
thisElem = vector(thisIdx)
thatIdx <- (thisIdx + 1) until len
thatElem = vector(thatIdx)
} yield doSomething(thisElem, thatElem)
If you really need to, you can generalize either version of this code to all IndexedSeqs, by using some implicit CanBuildFrom parameters, but I won't cover that.
Slightly simplifying, my problem comes from a list of strings input that I want to parse with a function parse returning Either[String,Int].
Then list.map(parse) returns a list of Eithers. The next step in the program is to format an error message summing up all the errors or passing on the list of parsed integers.
Let's call the solution I'm looking for partitionEithers.
Calling
partitionEithers(List(Left("foo"), Right(1), Left("bar")))
Would give
(List("foo", "bar"),List(1))
Finding something like this in the standard library would be best. Failing that some kind of clean, idiomatic and efficient solution would be best. Also some kind of efficient utility function I could just paste into my projects would be ok.
I was very confused by these 3 earlier questions. As far as I can tell, none of those questions matches my case, but some answers there seem to contain valid answers to this question.
Scala collections offer a partition function:
val eithers: List[Either[String, Int]] = List(Left("foo"), Right(1), Left("bar"))
eithers.partition(_.isLeft) match {
case (leftList, rightList) =>
(leftList.map(_.left.get), rightList.map(_.right.get))
}
=> res0: (List[String], List[Int]) = (List(foo, bar),List(1))
UPDATE
If you want to wrap it in a (maybe even somewhat type safer) generic function:
def partitionEither[Left : ClassTag, Right : ClassTag](in: List[Either[Left, Right]]): (List[Left], List[Right]) =
in.partition(_.isLeft) match {
case (leftList, rightList) =>
(leftList.collect { case Left(l: Left) => l }, rightList.collect { case Right(r: Right) => r })
}
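Assuming Scala 2.13 or later (an assumption; the answers here predate it), the standard library has `partitionMap`, which splits a collection according to a function returning an `Either`. With `identity` as that function it does what the question asks. The names `errors` and `values` are illustrative only.

```scala
val eithers: List[Either[String, Int]] = List(Left("foo"), Right(1), Left("bar"))

// partitionMap sends Left values to the first list and Right values to the second,
// without the unsafe .left.get / .right.get calls.
val (errors, values) = eithers.partitionMap(identity)
```

On older Scala versions this method is not available, so one of the fold-based answers is the fallback.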
You could use separate from MonadPlus (scalaz) or MonadCombine (cats):
import scala.util.{Either, Left, Right}
import scalaz.std.list._
import scalaz.std.either._
import scalaz.syntax.monadPlus._
val l: List[Either[String, Int]] = List(Right(1), Left("error"), Right(2))
l.separate
// (List[String], List[Int]) = (List(error),List(1, 2))
I don't really get the amount of contortions of the other answers. So here is a one liner:
scala> val es:List[Either[Int,String]] =
List(Left(1),Left(2),Right("A"),Right("B"),Left(3),Right("C"))
es: List[Either[Int,String]] = List(Left(1), Left(2), Right(A), Right(B), Left(3), Right(C))
scala> es.foldRight( (List[Int](), List[String]()) ) {
case ( e, (ls, rs) ) => e.fold( l => ( l :: ls, rs), r => ( ls, r :: rs ) )
}
res5: (List[Int], List[String]) = (List(1, 2, 3),List(A, B, C))
Here is an imperative implementation mimicking the style of Scala collection internals.
I wonder if there should be something like this in there, since I, at least, run into this from time to time.
import collection._
import generic._
def partitionEithers[L, R, E, I, CL, CR]
(lrs: I)
(implicit evI: I <:< GenTraversableOnce[E],
evE: E <:< Either[L, R],
cbfl: CanBuildFrom[I, L, CL],
cbfr: CanBuildFrom[I, R, CR])
: (CL, CR) = {
val ls = cbfl()
val rs = cbfr()
ls.sizeHint(lrs.size)
rs.sizeHint(lrs.size)
lrs.foreach { e => evE(e) match {
case Left(l) => ls += l
case Right(r) => rs += r
} }
(ls.result(), rs.result())
}
partitionEithers(List(Left("foo"), Right(1), Left("bar"))) == (List("foo", "bar"), List(1))
partitionEithers(Set(Left("foo"), Right(1), Left("bar"), Right(1))) == (Set("foo", "bar"), Set(1))
You can use foldRight.
def f(s: Seq[Either[String, Int]]): (Seq[String], Seq[Int]) = {
s.foldRight((Seq[String](), Seq[Int]())) { case (c, r) =>
c match {
case Left(le) => (le +: r._1, r._2)
case Right(ri) => (r._1 , ri +: r._2)
}
}
}
val eithers: List[Either[String, Int]] = List(Left("foo"), Right(1), Left("bar"))
scala> f(eithers)
res0: (Seq[String], Seq[Int]) = (List(foo, bar),List(1))
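For comparison, a foldLeft version has to prepend to the accumulators and reverse them at the end to keep the original order. This is a sketch, and `partitionLeft` is a hypothetical name.

```scala
// Hypothetical foldLeft-based variant; it builds both lists in reverse,
// then restores the input order with a single reverse each.
def partitionLeft[L, R](xs: Seq[Either[L, R]]): (List[L], List[R]) = {
  val (revLefts, revRights) =
    xs.foldLeft((List.empty[L], List.empty[R])) {
      case ((ls, rs), Left(l))  => (l :: ls, rs)
      case ((ls, rs), Right(r)) => (ls, r :: rs)
    }
  (revLefts.reverse, revRights.reverse)
}
```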
A simple class with flatMap/map that does nothing but lazily store a value:
[Note1: this class could be replaced with any class with flatMap/map. Option is only one concrete example, this question is in regards to the general case]
[Note2: scalaz is an interesting library, but this question is not in regards to it. If there is not a std scala library solution other than what I have posted below, that is acceptable.]
class C[A](value : => A) {
def flatMap[B](f: A => C[B]) : C[B] = { f(value) }
def map[B](f: A => B) : C[B] = { new C(f(value)) }
override def toString = s"C($value)"
}
object C {
def apply[A](value : => A) = new C[A](value)
}
A function that iteratively applies flatMap to its members:
def invert[A](xs: Traversable[C[A]], acc: List[A] = Nil) : C[List[A]] =
if(xs.nonEmpty) {
xs.head flatMap { a => invert(xs.tail, a :: acc) }
} else {
C(acc.reverse)
}
Function in action:
scala> val l = List(C(1),C(2),C(3))
l: List[C[Int]] = List(C(1), C(2), C(3))
scala> invert(l)
res4: C[List[Int]] = C(List(1, 2, 3))
Is there a way to rewrite "invert" idiomatically? Also, is there a functional "verb" that captures what I'm doing here?
The problem with your solution is that it will give a stack overflow for large lists, as it is fully (not just tail) recursive. I'd fold instead:
def invert[A](xs: Traversable[C[A]]) =
(C(List[A]()) /: xs){ (c,x) => c.flatMap(l => x.map(_ :: l)) }.map(_.reverse)
You might make invert a bit clearer with a pattern match, but it's essentially the same code:
xs match {
case x0 :: xMore => x0.flatMap(a => invert(xMore, a :: acc))
case Nil => C(acc.reverse)
}
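As for the "verb": turning a collection of wrapped values into one wrapped collection is usually called sequence (the standard library's `Future.sequence` has the same shape). Below is a sketch for the class C from the question. The `get` accessor is an addition made here only so the result can be inspected, and `sequence` is written for List rather than Traversable to keep it short.

```scala
// Minimal copy of the question's class, plus a hypothetical `get` accessor.
class C[A](value: => A) {
  def flatMap[B](f: A => C[B]): C[B] = f(value)
  def map[B](f: A => B): C[B] = new C(f(value))
  def get: A = value // assumption: not in the original class
}

// "sequence": List[C[A]] => C[List[A]], built with a right fold so no reverse is needed.
def sequence[A](xs: List[C[A]]): C[List[A]] =
  xs.foldRight(new C(List.empty[A])) { (c, acc) =>
    for { a <- c; as <- acc } yield a :: as
  }
```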
I see that the Scala standard library lacks a method to get the ranges of objects in a collection that satisfy a predicate:
def <???>(p: A => Boolean): List[List[A]] = {
val buf = collection.mutable.ListBuffer[List[A]]()
var elems = this.dropWhile(e => !p(e))
while (elems.nonEmpty) {
buf += elems.takeWhile(p)
elems = elems.dropWhile(p).dropWhile(e => !p(e)) // skip the chunk just taken, then the gap after it
}
buf.toList
}
What would be the good name for such method? And is my implementation good enough?
I'd go for chunkWith or chunkBy
As for your implementation, I think this cries out for recursion! See if you can fill out this
@tailrec def chunkBy[A](l: List[A], acc: List[List[A]] = Nil)(p: A => Boolean): List[List[A]] = l match {
case Nil => acc
case l =>
val next = l dropWhile !p
val (chunk, rest) = next span p
chunkBy(rest, chunk :: acc)(p)
}
Why recursion? It's much easier to understand the algorithm and more likely to be bug-free (given the absence of vars).
The syntax !p for the negation of a predicate is achieved via an implicit conversion
implicit def PredicateW[A](p: A => Boolean) = new {
def unary_! : A => Boolean = a => !p(a)
}
I generally keep this around as it's astoundingly useful
How about:
def chunkBy[K](f: A => K): Map[K, List[List[A]]] = ...
Similar to groupBy but keeps contiguous chunks as chunks.
Using this, you can do xs.chunkBy(p)(true) to get what you want.
You probably want to call it splitWith because split is the string operation that more-or-less does that, and it's similar to splitAt.
Incidentally, here's a very compact implementation (though it does a lot of unnecessary work, so it's not a good implementation for speed; yours is fine for that):
def splitWith[A](xs: List[A])(p: A => Boolean) = {
(xs zip xs.scanLeft(1){ (i,x) => if (p(x) == ((i&1)==1)) i+1 else i }.tail).
filter(_._2 % 2 == 0).groupBy(_._2).toList.sortBy(_._1).map(_._2.map(_._1))
}
Just a little refinement of oxbow's code, this way the signature is lighter
def chunkBy[A](xs: List[A])(p: A => Boolean): List[List[A]] = {
@tailrec
def recurse(todo: List[A], acc: List[List[A]]): List[List[A]] = todo match {
case Nil => acc
case _ =>
val next = todo dropWhile (!p(_))
val (chunk, rest) = next span p
recurse(rest, acc ::: List(chunk))
}
recurse(xs, Nil)
}
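One detail to watch in the recursive versions: if the input ends with elements that fail the predicate, the last step can produce an empty chunk. A sketch that matches on the list after skipping the gap avoids this, and also prepends and reverses instead of appending. The name `chunksWhere` is hypothetical.

```scala
import scala.annotation.tailrec

// Hypothetical variant: skip the non-matching gap first, then decide whether anything is left.
def chunksWhere[A](xs: List[A])(p: A => Boolean): List[List[A]] = {
  @tailrec
  def recurse(todo: List[A], acc: List[List[A]]): List[List[A]] =
    todo.dropWhile(x => !p(x)) match {
      case Nil  => acc.reverse // nothing matching remains, so no empty chunk is added
      case next =>
        val (chunk, rest) = next.span(p) // chunk is non-empty because next.head satisfies p
        recurse(rest, chunk :: acc)
    }
  recurse(xs, Nil)
}
```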
With the intention of learning and further to this question, I've remained curious of the idiomatic alternatives to explicit recursion for an algorithm that checks whether a list (or collection) is ordered. (I'm keeping things simple here by using an operator to compare and Int as type; I'd like to look at the algorithm before delving into the generics of it)
The basic recursive version would be (by @Luigi Plinge):
def isOrdered(l:List[Int]): Boolean = l match {
case Nil => true
case x :: Nil => true
case x :: xs => x <= xs.head && isOrdered(xs)
}
A poorly performing idiomatic way would be:
def isOrdered(l: List[Int]) = l == l.sorted
An alternative algorithm using fold:
def isOrdered(l: List[Int]) =
l.foldLeft((true, None:Option[Int]))((x,y) =>
(x._1 && x._2.map(_ <= y).getOrElse(true), Some(y)))._1
It has the drawback that it will compare for all n elements of the list even if it could stop earlier after finding the first out-of-order element. Is there a way to "stop" fold and therefore making this a better solution?
Any other (elegant) alternatives?
This will exit after the first element that is out of order. It should thus perform well, but I haven't tested that. It's also a lot more elegant in my opinion. :)
def sorted(l:List[Int]) = l.view.zip(l.tail).forall(x => x._1 <= x._2)
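Note that `l.tail` throws on an empty list. A variant that tolerates empty input, sketched here with iterators and `drop(1)` (the name `isOrderedSafe` is made up), keeps the early exit because `forall` stops at the first failing pair.

```scala
// Hypothetical empty-safe variant: drop(1) on an empty iterator is simply empty.
def isOrderedSafe(l: List[Int]): Boolean =
  l.iterator.zip(l.iterator.drop(1)).forall { case (a, b) => a <= b }
```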
By "idiomatic", I assume you're talking about McBride and Paterson's "Idioms" in their paper Applicative Programming With Effects. :o)
Here's how you would use their idioms to check if a collection is ordered:
import scalaz._
import Scalaz._
case class Lte[A](v: A, b: Boolean)
implicit def lteSemigroup[A:Order] = new Semigroup[Lte[A]] {
def append(a1: Lte[A], a2: => Lte[A]) = {
lazy val b = a1.v lte a2.v
Lte(if (!a1.b || b) a1.v else a2.v, a1.b && b && a2.b)
}
}
def isOrdered[T[_]:Traverse, A:Order](ta: T[A]) =
ta.foldMapDefault(x => some(Lte(x, true))).fold(_.b, true)
Here's how this works:
Any data structure T[A] where there exists an implementation of Traverse[T], can be traversed with an Applicative functor, or "idiom", or "strong lax monoidal functor". It just so happens that every Monoid induces such an idiom for free (see section 4 of the paper).
A monoid is just an associative binary operation over some type, and an identity element for that operation. I'm defining a Semigroup[Lte[A]] (a semigroup is the same as a monoid, except without the identity element) whose associative operation tracks the lesser of two values and whether the left value is less than the right value. And of course Option[Lte[A]] is just the monoid generated freely by our semigroup.
Finally, foldMapDefault traverses the collection type T in the idiom induced by the monoid. The result b will contain true if each value was less than all the following ones (meaning the collection was ordered), or None if the T had no elements. Since an empty T is sorted by convention, we pass true as the second argument to the final fold of the Option.
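The "summarize each element, then combine summaries associatively" idea can be sketched in plain Scala without scalaz. This is not the Lte semigroup above but a simpler stand-in: `Run` and `combine` are hypothetical names, and `Option` supplies the missing identity through `reduceOption`. Unlike the zip-based answers, it does not stop early.

```scala
// Summary of a run of elements: its first and last value, and whether it is ordered.
final case class Run(first: Int, last: Int, ordered: Boolean)

// Associative combination: both halves must be ordered and must join up correctly.
def combine(a: Run, b: Run): Run =
  Run(a.first, b.last, a.ordered && b.ordered && a.last <= b.first)

def isOrderedBySummary(xs: Seq[Int]): Boolean =
  xs.map(x => Run(x, x, ordered = true))
    .reduceOption(combine) // None for an empty collection
    .forall(_.ordered)     // an empty collection counts as ordered
```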
As a bonus, this works for all traversable collections. A demo:
scala> val b = isOrdered(List(1,3,5,7,123))
b: Boolean = true
scala> val b = isOrdered(Seq(5,7,2,3,6))
b: Boolean = false
scala> val b = isOrdered(Map((2 -> 22, 33 -> 3)))
b: Boolean = true
scala> val b = isOrdered(some("hello"))
b: Boolean = true
A test:
import org.scalacheck._
scala> val p = forAll((xs: List[Int]) => (xs /== xs.sorted) ==> !isOrdered(xs))
p:org.scalacheck.Prop = Prop
scala> val q = forAll((xs: List[Int]) => isOrdered(xs.sorted))
q: org.scalacheck.Prop = Prop
scala> p && q check
+ OK, passed 100 tests.
And that's how you do idiomatic traversal to detect if a collection is ordered.
I'm going with this, which is pretty similar to Kim Stebel's, as a matter of fact.
def isOrdered(list: List[Int]): Boolean = (
list
sliding 2
map {
case List(a, b) => () => a <= b
case _ => () => true // a single-element list yields a window of size 1
}
forall (_())
)
In case you missed missingfaktor's elegant solution in the comments above:
Scala < 2.13.0
(l, l.tail).zipped.forall(_ <= _)
Scala 2.13.x+
l.lazyZip(l.tail).forall(_ <= _)
This solution is very readable and will exit on the first out-of-order element.
The recursive version is fine, but limited to List (with limited changes, it would work well on LinearSeq).
If it was implemented in the standard library (would make sense) it would probably be done in IterableLike and have a completely imperative implementation (see for instance method find)
You can interrupt the foldLeft with a return (in which case you need only the previous element and not boolean all along)
import Ordering.Implicits._
def isOrdered[A: Ordering](seq: Seq[A]): Boolean = {
if (!seq.isEmpty)
seq.tail.foldLeft(seq.head){(previous, current) =>
if (previous > current) return false; current
}
true
}
but I don't see how it is any better or even idiomatic than an imperative implementation. I'm not sure I would not call it imperative actually.
Another solution could be
def isOrdered[A: Ordering](seq: Seq[A]): Boolean =
! seq.sliding(2).exists{s => s.length == 2 && s(0) > s(1)}
Rather concise, and maybe that could be called idiomatic, I'm not sure. But I think it is not too clear. Moreover, all of those methods would probably perform much worse than the imperative or tail recursive version, and I do not think they have any added clarity that would buy that.
Also you should have a look at this question.
To stop iteration, you can use Iteratee:
import scalaz._
import Scalaz._
import IterV._
import math.Ordering
import Ordering.Implicits._
implicit val ListEnumerator = new Enumerator[List] {
def apply[E, A](e: List[E], i: IterV[E, A]): IterV[E, A] = e match {
case List() => i
case x :: xs => i.fold(done = (_, _) => i,
cont = k => apply(xs, k(El(x))))
}
}
def sorted[E: Ordering] : IterV[E, Boolean] = {
def step(is: Boolean, e: E)(s: Input[E]): IterV[E, Boolean] =
s(el = e2 => if (is && e < e2)
Cont(step(is, e2))
else
Done(false, EOF[E]),
empty = Cont(step(is, e)),
eof = Done(is, EOF[E]))
def first(s: Input[E]): IterV[E, Boolean] =
s(el = e1 => Cont(step(true, e1)),
empty = Cont(first),
eof = Done(true, EOF[E]))
Cont(first)
}
scala> val s = sorted[Int]
s: scalaz.IterV[Int,Boolean] = scalaz.IterV$Cont$$anon$2@5e9132b3
scala> s(List(1,2,3)).run
res11: Boolean = true
scala> s(List(1,2,3,0)).run
res12: Boolean = false
You can split the List into two parts and check whether the last element of the first part is no greater than the first element of the second part. If so, you can check both parts in parallel. Here is the schematic idea, first without parallelism:
def isOrdered (l: List [Int]): Boolean = l.size/2 match {
case 0 => true
case m => {
val low = l.take (m)
val high = l.drop (m)
low.last <= high.head && isOrdered (low) && isOrdered (high)
}
}
And now with parallel, and using splitAt instead of take/drop:
def isOrdered (l: List[Int]): Boolean = l.size/2 match {
case 0 => true
case m => {
val (low, high) = l.splitAt (m)
low.last <= high.head && ! List (low, high).par.exists (x => isOrdered (x) == false)
}
}
def isSorted[A <: Ordered[A]](sequence: List[A]): Boolean = {
sequence match {
case Nil => true
case x::Nil => true
case x::y::rest => (x < y) && isSorted(y::rest)
}
}
Explain how it works.
My solution combines missingfaktor's solution with Ordering:
def isSorted[T](l: Seq[T])(implicit ord: Ordering[T]) = (l, l.tail).zipped.forall(ord.lt(_, _))
and you can use your own comparison method. E.g.
isSorted(dataList)(Ordering.by[Post, Date](_.lastUpdateTime))
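A self-contained sketch of that usage follows. The `Post` class here is a hypothetical stand-in with a `Long` timestamp instead of a `Date`, and this variant compares with `lteq`, so equal neighbours count as sorted (the version above uses `lt`, which requires strictly increasing values).

```scala
// Hypothetical data type for illustration only.
final case class Post(title: String, lastUpdateTime: Long)

def isSortedBy[T](l: Seq[T])(implicit ord: Ordering[T]): Boolean =
  l.iterator.zip(l.iterator.drop(1)).forall { case (a, b) => ord.lteq(a, b) }

val posts  = List(Post("a", 1L), Post("b", 2L), Post("c", 2L))
val byTime = Ordering.by[Post, Long](_.lastUpdateTime)
```

Passing the ordering explicitly, as in `isSortedBy(posts)(byTime)`, overrides whatever implicit ordering would otherwise be picked up.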