Given val as: Seq[Int] = ...
A lot of the time I need to apply an operation to each pair of consecutive elements.
By the way, I don't like this:
for (i <- 1 until as.size) {
// do something with as(i) and as(i - 1)
}
Or this alternative:
as.tail.foldLeft((0, as.head)) { (acc, e) =>
// do something with acc._2 and e
// and try to not forget returning (_, e)
}
How do I write better code for this scenario?
You could zip the sequence as with its own tail:
for ((prev, curr) <- as zip as.tail) {
// do something with `prev` and `curr`
}
Or you could use sliding:
for (window <- as.sliding(2)) {
val prev = window(0)
val curr = window(1)
// do something with `prev` and `curr`
}
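Both forms above assume at least two elements: `as.tail` throws on an empty sequence, and `sliding(2)` yields a single one-element window for a one-element sequence, so `window(1)` fails. Here is a minimal sketch of variants that tolerate short inputs. The helper name `consecutivePairs` is made up for illustration.

```scala
// Hypothetical helper: pairs every element with its successor.
// drop(1) returns an empty sequence for empty input, where tail would throw.
def consecutivePairs[A](as: Seq[A]): Seq[(A, A)] =
  as.zip(as.drop(1))

val squares = Seq(1, 4, 9, 16)

// Differences between neighbours, via zip.
val diffs = consecutivePairs(squares).map { case (prev, curr) => curr - prev }

// Same result via sliding; collect skips any window that is not exactly two long.
val viaSliding = squares.sliding(2).collect { case Seq(prev, curr) => curr - prev }.toList
```

With zero or one element, both variants simply produce an empty result.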
Here's one way to supply the head of your sequence to every subsequent element.
val sq:Seq[Int] = Seq(. . .)
sq.headOption.fold(sq){hd =>
sq.tail.map(/*map() or fold() with the hd value*/)
}
Note that this is safe for collections of 1 or zero elements.
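As a concrete instance of the pattern above, here is a small sketch that expresses every later element relative to the head. The name `relativeToHead` and the subtraction are illustrative choices, not part of the original answer.

```scala
// Hypothetical example: offset of each subsequent element from the first one.
def relativeToHead(sq: Seq[Int]): Seq[Int] =
  sq.headOption.fold(sq) { hd =>
    sq.tail.map(_ - hd) // tail is safe here because a head exists
  }

val offsets = relativeToHead(Seq(10, 12, 15))
```

An empty input comes back unchanged, and a single-element input yields an empty result because there are no subsequent elements.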
You can write your own fold that also passes the previous element to the function.
It is safe for collections with one or zero elements.
def foldLeftWithPrevious[A, B](as: Seq[A], accumulator: B)(f: (B, A, A) => B): B = {
@scala.annotation.tailrec
def foldLeftInner(list2: Seq[A], previous: A, accumulator: B, f: (B, A, A) => B): B = {
if (list2.isEmpty) accumulator
else foldLeftInner(list2.tail, list2.head, f(accumulator, previous, list2.head), f)
}
if (as.length <= 1) accumulator
else foldLeftInner(as.tail, as.head, accumulator, f)
}
Feel free to test it with this snippet.
val foldLeftTest = Seq(1)
foldLeftWithPrevious(foldLeftTest, 0)((accum, previous, current) => {
println("accum = " + accum)
println("previous = " + previous)
println("current = " + current)
println("accum will be... " + accum + " + " + previous + " + " + current)
println("which is... " + (accum + previous + current))
accum + previous + current
})
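For comparison, a fold with access to the previous element can probably also be built from standard combinators. This is only a sketch: the name `foldLeftWithPrev` is made up here, and it allocates the zipped pairs, which the hand-written recursion avoids.

```scala
// Hypothetical alternative: zip with the shifted sequence, then fold over the pairs.
def foldLeftWithPrev[A, B](as: Seq[A], zero: B)(f: (B, A, A) => B): B =
  as.zip(as.drop(1)).foldLeft(zero) { case (acc, (previous, current)) =>
    f(acc, previous, current)
  }

// Sum of absolute jumps between neighbours: |4 - 1| + |2 - 4|
val totalJump = foldLeftWithPrev(Seq(1, 4, 2), 0)((acc, p, c) => acc + math.abs(c - p))
```

As with the recursive version, inputs of length zero or one return the initial accumulator untouched.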
I am very curious how Scala desugars the following for-comprehension:
for {
a <- Option(5)
b = a * 2
c <- if (b == 10) Option(100) else None
} yield b + c
My difficulty comes from having both b and c in the yield, because they seem to be bound at different steps.
This is the sanitized output of desugar - a command available in Ammonite REPL:
Option(5)
.map { a =>
val b = a * 2;
(a, b)
}
.flatMap { case (a, b) =>
(if (b == 10) Option(100) else None)
.map(c => b + c)
}
Both b and c can be present in yield because it does not desugar to chained calls to map/flatMap, but rather to nested calls.
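To check the nesting claim, the comprehension and a hand-written version of the desugaring can be compared directly. This is a sketch for experimentation; the value names `viaFor` and `byHand` are made up here.

```scala
// The original comprehension.
val viaFor: Option[Int] = for {
  a <- Option(5)
  b = a * 2
  c <- if (b == 10) Option(100) else None
} yield b + c

// Hand-written equivalent: the final map sits inside the flatMap,
// which is why b is still in scope when c is bound.
val byHand: Option[Int] =
  Option(5)
    .map { a => val b = a * 2; (a, b) }
    .flatMap { case (_, b) =>
      (if (b == 10) Option(100) else None).map(c => b + c)
    }
```

Both expressions evaluate to Some(110).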
You can even ask the compiler. The following command:
scala -Xprint:parser -e "for {
a <- Option(5)
b = a * 2
c <- if (b == 10) Option(100) else None
} yield b + c"
yields this output:
[[syntax trees at end of parser]] // scalacmd7617799112170074915.scala
package <empty> {
object Main extends scala.AnyRef {
def <init>() = {
super.<init>();
()
};
def main(args: Array[String]): scala.Unit = {
final class $anon extends scala.AnyRef {
def <init>() = {
super.<init>();
()
};
Option(5).map(((a) => {
val b = a.$times(2);
scala.Tuple2(a, b)
})).flatMap(((x$1) => x$1: @scala.unchecked match {
case scala.Tuple2((a @ _), (b @ _)) => if (b.$eq$eq(10))
Option(100)
else
None.map(((c) => b.$plus(c)))
}))
};
new $anon()
}
}
}
Taking only the piece you are interested in and improving the readability, you get this:
Option(5).map(a => {
val b = a * 2
(a, b)
}).flatMap(_ match {
case (_, b) =>
if (b == 10)
Option(100)
else
None.map(c => b + c)
})
Edit
As reported in a comment, literally translating from the compiler output seems to highlight a bug in how the desugared expression is rendered. The sum should be mapped over the result of the if expression, rather than over the None in the else branch:
Option(5).map(a => {
val b = a * 2
(a, b)
}).flatMap(_ match {
case (_, b) =>
(if (b == 10) Option(100) else None).map(c => b + c)
})
It's probably worth it to ask the compiler team if this is a bug.
These two snippets are equivalent:
scala> for {
| a <- Option(5)
| b = a * 2
| c <- if (b == 10) Option(100) else None
| } yield b + c
res70: Option[Int] = Some(110)
scala> for {
| a <- Option(5)
| b = a * 2
| if (b == 10)
| c <- Option(100)
| } yield b + c
res71: Option[Int] = Some(110)
Since no collection yielding multiple values is involved, there is only one big step (or, arguably, three to four small steps). If a had been None, the whole comprehension would have terminated early, yielding None.
The desugaring is a flatMap/withFilter/map.
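Spelled out for the guard form, the chain looks roughly like the sketch below. The function names and the `start` parameter are made up so that the failing case can be exercised too.

```scala
// Guard form of the comprehension, parameterised on the starting value.
def viaGuard(start: Int): Option[Int] = for {
  a <- Option(start)
  b = a * 2
  if b == 10
  c <- Option(100)
} yield b + c

// Hand-written chain: map carries (a, b) forward, withFilter applies the guard,
// and the final map is nested inside flatMap.
def byHandGuard(start: Int): Option[Int] =
  Option(start)
    .map { a => val b = a * 2; (a, b) }
    .withFilter { case (_, b) => b == 10 }
    .flatMap { case (_, b) => Option(100).map(c => b + c) }
```

When the guard fails, for instance with a start value of 6, both versions return None.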
I wanted to memoize this:
def fib(n: Int): Int = if (n <= 1) 1 else fib(n - 1) + fib(n - 2) // a recursive method needs an explicit result type
println(fib(100)) // times out
So I wrote this, which surprisingly compiles and works (surprising because fib references itself in its own declaration):
import scala.collection.mutable
case class Memo[A,B](f: A => B) extends (A => B) {
private val cache = mutable.Map.empty[A, B]
def apply(x: A) = cache getOrElseUpdate (x, f(x))
}
val fib: Memo[Int, BigInt] = Memo {
case 0 => 0
case 1 => 1
case n => fib(n-1) + fib(n-2)
}
println(fib(100)) // prints 100th fibonacci number instantly
But when I try to declare fib inside of a def, I get a compiler error:
def foo(n: Int) = {
val fib: Memo[Int, BigInt] = Memo {
case 0 => 0
case 1 => 1
case n => fib(n-1) + fib(n-2)
}
fib(n)
}
The above fails to compile with the error: forward reference extends over definition of value fib
case n => fib(n-1) + fib(n-2)
Why does declaring the val fib inside a def fail, while declaring it outside, in the class/object scope, works?
To clarify why I might want to declare the recursive memoized function in the def scope, here is my solution to the subset sum problem:
/**
* Subset sum algorithm - can we achieve sum t using elements from s?
*
* @param s set of integers
* @param t target
* @return true iff there exists a subset of s that sums to t
*/
def subsetSum(s: Seq[Int], t: Int): Boolean = {
val max = s.scanLeft(0)((sum, i) => (sum + i) max sum) //max(i) = largest sum achievable from first i elements
val min = s.scanLeft(0)((sum, i) => (sum + i) min sum) //min(i) = smallest sum achievable from first i elements
val dp: Memo[(Int, Int), Boolean] = Memo { // dp(i,x) = can we achieve x using the first i elements?
case (_, 0) => true // 0 can always be achieved using empty set
case (0, _) => false // if empty set, non-zero cannot be achieved
case (i, x) if min(i) <= x && x <= max(i) => dp(i-1, x - s(i-1)) || dp(i-1, x) // try with/without s(i-1)
case _ => false // outside range otherwise
}
dp(s.length, t)
}
I found a better way to memoize using Scala:
import scala.collection.mutable
def memoize[I, O](f: I => O): I => O = new mutable.HashMap[I, O]() {
override def apply(key: I) = getOrElseUpdate(key, f(key))
}
Now you can write fibonacci as follows:
lazy val fib: Int => BigInt = memoize {
case 0 => 0
case 1 => 1
case n => fib(n-1) + fib(n-2)
}
Here's one with multiple arguments (the choose function):
lazy val c: ((Int, Int)) => BigInt = memoize {
case (_, 0) => 1
case (n, r) if r > n/2 => c(n, n - r)
case (n, r) => c(n - 1, r - 1) + c(n - 1, r)
}
And here's the subset sum problem:
// is there a subset of s which has sum = t
def isSubsetSumAchievable(s: Vector[Int], t: Int) = {
// f is (i, j) => Boolean i.e. can the first i elements of s add up to j
lazy val f: ((Int, Int)) => Boolean = memoize {
case (_, 0) => true // 0 can always be achieved using empty list
case (0, _) => false // we can never achieve non-zero if we have empty list
case (i, j) =>
val k = i - 1 // try the kth element
f(k, j - s(k)) || f(k, j)
}
f(s.length, t)
}
EDIT: As discussed below, here is a thread-safe version:
def memoize[I, O](f: I => O): I => O = new mutable.HashMap[I, O]() {self =>
override def apply(key: I) = self.synchronized(getOrElseUpdate(key, f(key)))
}
A class- or trait-level val compiles to a combination of a method and a private field, so a recursive definition is allowed.
Local vals, on the other hand, are just regular variables, so a recursive definition is not allowed.
By the way, even if the def you defined worked, it wouldn't do what you expect. On every invocation of foo a new function object fib will be created and it will have its own backing map. What you should be doing instead is this (if you really want a def to be your public interface):
private val fib: Memo[Int, BigInt] = Memo {
case 0 => 0
case 1 => 1
case n => fib(n-1) + fib(n-2)
}
def foo(n: Int) = {
fib(n)
}
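If the memoized function really has to live inside the method, the local-val restriction described above can be sidestepped by declaring it as a local lazy val, which is what other answers in this thread do. A minimal sketch follows, reusing the shape of the Memo class from the question. Note that every call to foo still builds a fresh cache, as pointed out above.

```scala
import scala.collection.mutable

// Same shape as the Memo class from the question.
case class Memo[A, B](f: A => B) extends (A => B) {
  private val cache = mutable.Map.empty[A, B]
  def apply(x: A): B = cache.getOrElseUpdate(x, f(x))
}

def foo(n: Int): BigInt = {
  // A local lazy val may refer to itself; a plain local val may not.
  lazy val fib: Memo[Int, BigInt] = Memo {
    case 0 => 0
    case 1 => 1
    case k => fib(k - 1) + fib(k - 2)
  }
  fib(n)
}
```

Here the cache only speeds up the recursion within a single call; it is discarded when foo returns.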
Scalaz has a solution for that, why not reuse it?
import scalaz.Memo
lazy val fib: Int => BigInt = Memo.mutableHashMapMemo {
case 0 => 0
case 1 => 1
case n => fib(n-2) + fib(n-1)
}
You can read more about memoization in Scalaz.
A mutable HashMap isn't thread-safe. Also, writing separate case clauses for the base conditions seems like unnecessary special handling; instead, a Map can be preloaded with the initial values and passed to the Memoizer. The Memoizer accepts a memo (an immutable Map) and a formula, and returns a recursive function.
Memoizer would look like
def memoize[I,O](memo: Map[I, O], formula: (I => O, I) => O): I => O
Now, given the following Fibonacci formula,
def fib(f: Int => Int, n: Int) = f(n-1) + f(n-2)
fibonacci with Memoizer can be defined as
val fibonacci = memoize( Map(0 -> 0, 1 -> 1), fib)
where the context-agnostic, general-purpose Memoizer is defined as
def memoize[I, O](map: Map[I, O], formula: (I => O, I) => O): I => O = {
var memo = map
def recur(n: I): O = {
if( memo contains n) {
memo(n)
} else {
val result = formula(recur, n)
memo += (n -> result)
result
}
}
recur
}
Similarly, for factorial, a formula is
def fac(f: Int => Int, n: Int): Int = n * f(n-1)
and factorial with Memoizer is
val factorial = memoize( Map(0 -> 1, 1 -> 1), fac)
Inspiration: Memoization, Chapter 4 of JavaScript: The Good Parts by Douglas Crockford
I have been blocked on this for about 4 hours now. I want to get a List of Pair[String, Int] ordered by their Int value. The function partition works fine, and so should bestN, but when loading this into my interpreter I get:
<console>:15: error: could not find implicit value for evidence parameter of type Ordered[T]
on my predicate. Does someone see what the problem is? I am really desperate at the moment...
This is the code:
def partition[T : Ordered](pred: (T)=>Boolean, list:List[T]): Pair[List[T],List[T]] = {
list.foldLeft(Pair(List[T](),List[T]()))((pair,x) => if(pred(x))(pair._1, x::pair._2) else (x::pair._1, pair._2))
}
def bestN[T <% Ordered[T]](list:List[T], n:Int): List[T] = {
list match {
case pivot::other => {
println("pivot: " + pivot)
val (smaller,bigger) = partition(pivot <, list)
val s = smaller.size
println(smaller)
if (s == n) smaller
else if (s+1 == n) pivot::smaller
else if (s < n) bestN(bigger, n-s-1)
else bestN(smaller, n)
}
case Nil => Nil
}
}
class OrderedPair[T, V <% Ordered[V]] (t:T, v:V) extends Pair[T,V](t,v) with Ordered[OrderedPair[T,V]] {
def this(p:Pair[T,V]) = this(p._1, p._2)
override def compare(that:OrderedPair[T,V]) : Int = this._2.compare(that._2)
}
Edit: The first function divides a List into two by applying the predicate to every member; the bestN function should return a List of the lowest n members of the given list. The class is there to make Pairs comparable. In this case, what I want to do is:
val z = List(Pair("alfred",1),Pair("peter",4),Pair("Xaver",1),Pair("Ulf",2),Pair("Alfons",6),Pair("Gulliver",3))
With this List, I want to get, for example, from:
bestN(z, 3)
the result:
(("alfred",1), ("Xaver",1), ("Ulf",2))
It looks like you don't need an Ordered T on your partition function, since it just invokes the predicate function.
The following presumably doesn't work correctly; it merely compiles. Other matters for code review would be the extra braces and the like.
package evident
object Test extends App {
def partition[T](pred: (T)=>Boolean, list:List[T]): Pair[List[T],List[T]] = {
list.foldLeft(Pair(List[T](),List[T]()))((pair,x) => if(pred(x))(pair._1, x::pair._2) else (x::pair._1, pair._2))
}
def bestN[U,V<%Ordered[V]](list:List[(U,V)], n:Int): List[(U,V)] = {
list match {
case pivot::other => {
println(s"pivot: $pivot and rest ${other mkString ","}")
def cmp(a: (U,V), b: (U,V)) = (a: OrderedPair[U,V]) < (b: OrderedPair[U,V])
val (smaller,bigger) = partition(((x:(U,V)) => cmp(x, pivot)), list)
//val (smaller,bigger) = list partition ((x:(U,V)) => cmp(x, pivot))
println(s"smaller: ${smaller mkString ","} and bigger ${bigger mkString ","}")
val s = smaller.size
if (s == n) smaller
else if (s+1 == n) pivot::smaller
else if (s < n) bestN(bigger, n-s-1)
else bestN(smaller, n)
}
case Nil => Nil
}
}
implicit class OrderedPair[T, V <% Ordered[V]](tv: (T,V)) extends Pair(tv._1, tv._2) with Ordered[OrderedPair[T,V]] {
override def compare(that:OrderedPair[T,V]) : Int = this._2.compare(that._2)
}
val z = List(Pair("alfred",1),Pair("peter",4),Pair("Xaver",1),Pair("Ulf",2),Pair("Alfons",6),Pair("Gulliver",3))
println(bestN(z, 3))
}
I found the partition function hard to read; you need a function to partition all the parens. Here are a couple of formulations, which also use the convention that results accepted by the filter go left, rejects go right.
def partition[T](p: T => Boolean, list: List[T]) =
((List.empty[T], List.empty[T]) /: list) { (s, t) =>
if (p(t)) (t :: s._1, s._2) else (s._1, t :: s._2)
}
def partition2[T](p: T => Boolean, list: List[T]) =
((List.empty[T], List.empty[T]) /: list) {
case ((is, not), t) if p(t) => (t :: is, not)
case ((is, not), t) => (is, t :: not)
}
// like List.partition
def partition3[T](p: T => Boolean, list: List[T]) = {
import collection.mutable.ListBuffer
val is, not = new ListBuffer[T]
for (t <- list) (if (p(t)) is else not) += t
(is.toList, not.toList)
}
This might be closer to what the original code intended:
def bestN[U, V <% Ordered[V]](list: List[(U,V)], n: Int): List[(U,V)] = {
require(n >= 0)
require(n <= list.length)
if (n == 0) Nil
else if (n == list.length) list
else list match {
case pivot :: other =>
println(s"pivot: $pivot and rest ${other mkString ","}")
def cmp(x: (U,V)) = x._2 < pivot._2
val (smaller, bigger) = partition(cmp, other) // other partition cmp
println(s"smaller: ${smaller mkString ","} and bigger ${bigger mkString ","}")
val s = smaller.size
if (s == n) smaller
else if (s == 0) pivot :: bestN(bigger, n - 1)
else if (s < n) smaller ::: bestN(pivot :: bigger, n - s)
else bestN(smaller, n)
case Nil => Nil
}
}
Arrow notation is more usual:
val z = List(
"alfred" -> 1,
"peter" -> 4,
"Xaver" -> 1,
"Ulf" -> 2,
"Alfons" -> 6,
"Gulliver" -> 3
)
I suspect I am missing something, but I'll post a bit of code anyway.
For bestN, you know you can just do this?
val listOfPairs = List(Pair("alfred",1),Pair("peter",4),Pair("Xaver",1),Pair("Ulf",2),Pair("Alfons",6),Pair("Gulliver",3))
val bottomThree = listOfPairs.sortBy(_._2).take(3)
Which gives you:
List((alfred,1), (Xaver,1), (Ulf,2))
And for the partition function, you can just do this (say you wanted all pairs lower than 4):
val partitioned = listOfPairs.partition(_._2 < 4)
Which gives (all lower than 4 on the left, the rest on the right):
(List((alfred,1), (Xaver,1), (Ulf,2), (Gulliver,3)),List((peter,4), (Alfons,6)))
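The sortBy approach can be wrapped in a generic function. The sketch below uses an `Ordering` context bound, because `Ordering` is the type class for which the standard library provides implicit instances; a bound written as `T : Ordered` asks for an implicit `Ordered[T]` value, which is what the error in the question complains about. The name `smallestNBy` is made up for illustration.

```scala
// Hypothetical helper: the n smallest elements according to a key.
// K: Ordering requests an implicit Ordering[K], which exists for Int, String, and so on.
def smallestNBy[T, K: Ordering](list: List[T], n: Int)(key: T => K): List[T] =
  list.sortBy(key).take(n)

val z = List("alfred" -> 1, "peter" -> 4, "Xaver" -> 1, "Ulf" -> 2, "Alfons" -> 6, "Gulliver" -> 3)

// Lowest three by the Int component; sortBy is stable, so ties keep their input order.
val lowestThree = smallestNBy(z, 3)(_._2)
```

This sorts the whole list, so it costs O(n log n), whereas a selection approach like the one in the question aims for less work. For short lists the difference rarely matters.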
Just to share with you: this works! Thanks a lot to everyone who helped me, you're all great!
object Test extends App {
def partition[T](pred: (T)=>Boolean, list:List[T]): Pair[List[T],List[T]] = {
list.foldLeft(Pair(List[T](),List[T]()))((pair,x) => if(pred(x))(pair._1, x::pair._2) else (x::pair._1, pair._2))
}
def bestN[U,V<%Ordered[V]](list:List[(U,V)], n:Int): List[(U,V)] = {
list match {
case pivot::other => {
def cmp(a: (U,V), b: (U,V)) = (a: OrderedPair[U,V]) <= (b: OrderedPair[U,V])
val (smaller,bigger) = partition(((x:(U,V)) => cmp(pivot, x)), list)
val s = smaller.size
//println(n + " :" + s)
//println("size:" + smaller.size + "Pivot: " + pivot + " Smaller part: " + smaller + " bigger: " + bigger)
if (s == n) smaller
else if (s+1 == n) pivot::smaller
else if (s < n) bestN(bigger, n-s)
else bestN(smaller, n)
}
case Nil => Nil
}
}
class OrderedPair[T, V <% Ordered[V]](tv: (T,V)) extends Pair(tv._1, tv._2) with Ordered[OrderedPair[T,V]] {
override def compare(that:OrderedPair[T,V]) : Int = this._2.compare(that._2)
}
implicit final def OrderedPair[T, V <% Ordered[V]](p : Pair[T, V]) : OrderedPair[T,V] = new OrderedPair(p)
val z = List(Pair("alfred",1),Pair("peter",1),Pair("Xaver",1),Pair("Ulf",2),Pair("Alfons",6),Pair("Gulliver",3))
println(bestN(z, 3))
println(bestN(z, 4))
println(bestN(z, 1))
}
I implemented a simple method to generate Cartesian product on several Seqs like this:
object RichSeq {
implicit def toRichSeq[T](s: Seq[T]) = new RichSeq[T](s)
}
class RichSeq[T](s: Seq[T]) {
import RichSeq._
def cartesian(ss: Seq[Seq[T]]): Seq[Seq[T]] = {
ss.toList match {
case Nil => Seq(s)
case s2 :: Nil => {
for (e <- s) yield s2.map(e2 => Seq(e, e2))
}.flatten
case s2 :: tail => {
for (e <- s) yield s2.cartesian(tail).map(seq => e +: seq)
}.flatten
}
}
}
Obviously, this one is really slow, as it calculates the whole product at once. Has anyone implemented a lazy solution for this problem in Scala?
UPD
OK, So I implemented a reeeeally stupid, but working version of an iterator over a Cartesian product. Posting here for future enthusiasts:
import scala.collection.mutable.ArrayBuffer
object RichSeq {
implicit def toRichSeq[T](s: Seq[T]) = new RichSeq(s)
}
class RichSeq[T](s: Seq[T]) {
def lazyCartesian(ss: Seq[Seq[T]]): Iterator[Seq[T]] = new Iterator[Seq[T]] {
private[this] val seqs = s +: ss
private[this] var indexes = Array.fill(seqs.length)(0)
private[this] val counts = Vector(seqs.map(_.length - 1): _*)
private[this] var current = 0
def next(): Seq[T] = {
val buffer = ArrayBuffer.empty[T]
if (current != 0) {
throw new NoSuchElementException("no more elements to traverse")
}
val newIndexes = ArrayBuffer.empty[Int]
var inside = 0
for ((index, i) <- indexes.zipWithIndex) {
buffer.append(seqs(i)(index))
newIndexes.append(index)
if ((0 to i).forall(ind => newIndexes(ind) == counts(ind))) {
inside = inside + 1
}
}
current = inside
if (current < seqs.length) {
for (i <- (0 to current).reverse) {
if ((0 to i).forall(ind => newIndexes(ind) == counts(ind))) {
newIndexes(i) = 0
} else if (newIndexes(i) < counts(i)) {
newIndexes(i) = newIndexes(i) + 1
}
}
current = 0
indexes = newIndexes.toArray
}
buffer.result()
}
def hasNext: Boolean = current != seqs.length
}
}
Here's my solution to the given problem. Note that the laziness comes simply from using .view on the "root collection" of the for-comprehension.
scala> def combine[A](xs: Traversable[Traversable[A]]): Seq[Seq[A]] =
| xs.foldLeft(Seq(Seq.empty[A])){
| (x, y) => for (a <- x.view; b <- y) yield a :+ b }
combine: [A](xs: Traversable[Traversable[A]])Seq[Seq[A]]
scala> combine(Set(Set("a","b","c"), Set("1","2"), Set("S","T"))) foreach (println(_))
List(a, 1, S)
List(a, 1, T)
List(a, 2, S)
List(a, 2, T)
List(b, 1, S)
List(b, 1, T)
List(b, 2, S)
List(b, 2, T)
List(c, 1, S)
List(c, 1, T)
List(c, 2, S)
List(c, 2, T)
To obtain this, I started from the function combine defined in https://stackoverflow.com/a/4515071/53974, passing it the function (a, b) => (a, b). However, that didn't quite work directly, since that code expects a function of type (A, A) => A. So I just adapted the code a bit.
These might be a starting point:
Cartesian product of two lists
Expand a Set[Set[String]] into Cartesian Product in Scala
https://stackoverflow.com/questions/6182126/im-learning-scala-would-it-be-possible-to-get-a-little-code-review-and-mentori
What about:
def cartesian[A](list: List[Seq[A]]): Iterator[Seq[A]] = {
if (list.isEmpty) {
Iterator(Seq())
} else {
list.head.iterator.flatMap { i => cartesian(list.tail).map(i +: _) }
}
}
Simple and lazy ;)
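One way to see the laziness of this iterator-based approach is to take a few tuples from a product that is far too large to materialise. The sketch below restates the function with pattern matching (same idea, not the answer's exact text) and uses twenty ranges of ten elements each, chosen arbitrarily.

```scala
// Iterator-based Cartesian product; tuples are built only when requested.
def cartesian[A](list: List[Seq[A]]): Iterator[Seq[A]] =
  list match {
    case Nil          => Iterator(Seq.empty[A])
    case head :: tail => head.iterator.flatMap(i => cartesian(tail).map(i +: _))
  }

// 10^20 tuples in total, but only the first two are ever constructed.
val firstTwo = cartesian(List.fill(20)(0 to 9)).take(2).toList
```

The first tuple is twenty zeros and the second differs only in its last position, because the last sequence varies fastest.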
def cartesian[A](list: List[List[A]]): List[List[A]] = {
list match {
case Nil => List(List())
case h :: t => h.flatMap(i => cartesian(t).map(i :: _))
}
}
See https://stackoverflow.com/a/8318364/312172 for how to translate a number into an index of all possible values, without generating every element.
This technique can be used to implement a stream.
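A rough sketch of that idea, written from the description above rather than taken from the linked answer: treat the position in the product as a mixed-radix number whose digits select one element from each sequence. The names `nthProduct` and `lazyProduct` are made up for illustration.

```scala
// Hypothetical helper: the k-th tuple of the product (0-based, last sequence varies fastest),
// computed directly without generating the tuples before it.
def nthProduct[A](seqs: Seq[IndexedSeq[A]], k: Long): Seq[A] = {
  val total = seqs.map(_.length.toLong).product
  require(k >= 0 && k < total, s"index $k out of range [0, $total)")
  // Walk from the last sequence to the first, peeling off one "digit" per sequence.
  val (_, picked) = seqs.foldRight((k, List.empty[A])) { case (s, (rest, acc)) =>
    (rest / s.length, s((rest % s.length).toInt) :: acc)
  }
  picked
}

// Lazy enumeration built on top of the index decoding.
def lazyProduct[A](seqs: Seq[IndexedSeq[A]]): Iterator[Seq[A]] = {
  val total = seqs.map(_.length.toLong).product
  Iterator.iterate(0L)(_ + 1).takeWhile(_ < total).map(nthProduct(seqs, _))
}

val letters = Vector("a", "b", "c")
val digits  = Vector("1", "2")
val fourth  = nthProduct(Seq(letters, digits), 3) // order: a1, a2, b1, b2, ...
```

Because each tuple is computed from its index alone, this also allows random access and skipping ahead, at the cost of redoing the division work for every tuple. The total count is held in a Long, so very large products would overflow in this sketch.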