Scala grouping a collection by a finite sequence of values

For
val xs = (1 to 9).toArray
we can group, for instance, every two consecutive items like this:
xs.grouped(2)
Yet, given a finite sequence of values, for instance
val gr = Seq(3,2,1)
how can we group xs based on gr so that
xs.grouped(gr)
res: Array(Array(1,2,3), Array(4,5), Array(6), Array(7,8,9))

Please consider the following solution:
def groupBySeq[T](arr: Array[T], gr: Seq[Int]) = {
  val r = gr.foldLeft((arr, List[Array[T]]())) {
    case ((rest, acc), item) => (rest.drop(item), rest.take(item) :: acc)
  }
  (r._1 :: r._2).reverse
}
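A quick check with the values from the question (my own run, converting the inner arrays to lists for readable output):
val xs = (1 to 9).toArray
val gr = Seq(3, 2, 1)
groupBySeq(xs, gr).map(_.toList)
// List(List(1, 2, 3), List(4, 5), List(6), List(7, 8, 9))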

The following function generates the result you are looking for, although I suspect there may be a better way:
def grouped[T](what: Seq[T], by: Seq[Int]) = {
  def go(left: Seq[T], nextBy: Int, acc: List[Seq[T]]): List[Seq[T]] =
    (left.length, by(nextBy % by.length)) match {
      case (n, sz) if n <= sz => left :: acc
      case (n, sz)            => go(left.drop(sz), nextBy + 1, left.take(sz) :: acc)
    }
  go(what, 0, Nil).reverse
}
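For the same input this version should give (a quick check of mine, passing a List so the output is easy to read):
grouped((1 to 9).toList, Seq(3, 2, 1))
// List(List(1, 2, 3), List(4, 5), List(6), List(7, 8, 9))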

Related

Lists Merge and reduce without using inbuilt functions

Consider an example
val a= List(1,2,3)
val b= List(4,5,6)
I want a mergeReduce function that takes two lists and two functions, where one function acts as the merge and the other reduces the merged result to an Integer, in a general form.
merge by multiplying corresponding elements of the two lists and then reduce using add
merge using max then get the min of the generated list
mergeReduce(a,b,product,add) = 32
mergeReduce(a,b,max,min) = 4
This can be achieved using inbuilt functions, but is there a better way to do it recursively, without using those functions?
Here is your mergeReduce() (as I understand it).
def mergeReduce(a: List[Int], b: List[Int],
                f: (Int, Int) => Int, g: (Int, Int) => Int): Int =
  a.zip(b).map(f.tupled).reduce(g)
val a= List(1,2,3)
val b= List(4,5,6)
mergeReduce(a,b,_*_,_+_) // 32
mergeReduce(a,b,math.max,math.min) // 4
So, what are the "inbuilt" functions you want to replace? And why do you want to replace them?
Here then is a version without map, reduce, zip, and tupled.
def mergeReduce(lsta: List[Int], lstb: List[Int],
                f: (Int, Int) => Int, g: (Int, Int) => Int): Int = {
  def merg(x: List[Int], y: List[Int], acc: List[Int] = Nil): List[Int] =
    if (x.isEmpty || y.isEmpty) acc.reverse
    else merg(x.tail, y.tail, f(x.head, y.head) :: acc)
  def reduc(z: List[Int]): Int = z match {
    case Nil         => -1 // error
    case i :: Nil    => i
    case a :: b :: c => reduc(g(a, b) :: c)
  }
  reduc(merg(lsta, lstb))
}
This uses .isEmpty, .reverse, .head, .tail, and .unapply (the method by which pattern matching is accomplished). Still too much "inbuilt"?
I think this is what you are looking for. It performs merge and reduce in a single pass, using only the basic List operations:
import scala.annotation.tailrec

def mergeReduce[T](a: List[T], b: List[T], merge: (T, T) => T, reduce: (T, T) => T): T = {
  @tailrec
  def loop(a: List[T], b: List[T], res: T): T =
    (a, b) match {
      case (a :: at, b :: bt) => loop(at, bt, reduce(res, merge(a, b)))
      case _                  => res
    }
  loop(a.tail, b.tail, merge(a.head, b.head))
}
This will fail if either list is Nil and will silently discard the values from the longer list if the lengths are not the same.
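A quick check against the expected results from the question (my own example; the explicit parameter types are there to help inference with the generic signature):
val a = List(1, 2, 3)
val b = List(4, 5, 6)
mergeReduce(a, b, (x: Int, y: Int) => x * y, (x: Int, y: Int) => x + y)     // 32
mergeReduce(a, b, (x: Int, y: Int) => x max y, (x: Int, y: Int) => x min y) // 4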

Rewriting imperative for loop to declarative style in Scala

How do I rewrite the following loop (pattern) into Scala, either using built-in higher order functions or tail recursion?
This the example of an iteration pattern where you do a computation (comparison, for example) of two list elements, but only if the second one comes after first one in the original input. Note that the +1 step is used here, but in general, it could be +n.
public List<U> mapNext(List<T> list) {
    List<U> results = new ArrayList<>();
    for (int i = 0; i < list.size() - 1; i++) {
        for (int j = i + 1; j < list.size(); j++) {
            results.add(doSomething(list.get(i), list.get(j)));
        }
    }
    return results;
}
So far, I've come up with this in Scala:
def mapNext[T, U](list: List[T])(f: (T, T) => U): List[U] = {
  @scala.annotation.tailrec
  def loop(ix: List[T], jx: List[T], res: List[U]): List[U] = (ix, jx) match {
    case (_ :: _ :: is, Nil)      => loop(ix, ix.tail, res)
    case (i :: _ :: is, j :: Nil) => loop(ix.tail, Nil, f(i, j) :: res)
    case (i :: _ :: is, j :: js)  => loop(ix, js, f(i, j) :: res)
    case _                        => res
  }
  loop(list, Nil, Nil).reverse
}
Edit:
To all contributors, I only wish I could accept every answer as solution :)
Here's my stab. I think it's pretty readable. The intuition is: for each head of the list, apply the function to the head and every other member of the tail. Then recurse on the tail of the list.
def mapNext[U, T](list: List[U], fun: (U, U) => T): List[T] = list match {
  case Nil             => Nil
  case (first :: Nil)  => Nil
  case (first :: rest) => rest.map(fun(first, _: U)) ++ mapNext(rest, fun)
}
Here's a sample run
scala> mapNext(List(1, 2, 3, 4), (x: Int, y: Int) => x + y)
res6: List[Int] = List(3, 4, 5, 5, 6, 7)
This one isn't explicitly tail recursive but an accumulator could be easily added to make it.
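For illustration, here is a tail-recursive sketch of how that accumulator might look (mapNextTR is just a name I picked; it assumes the same semantics as above):
import scala.annotation.tailrec

def mapNextTR[U, T](list: List[U], fun: (U, U) => T): List[T] = {
  @tailrec
  def loop(rest: List[U], acc: List[T]): List[T] = rest match {
    case first :: tail if tail.nonEmpty =>
      // prepend this head's results reversed, so one final reverse restores the order
      loop(tail, tail.map(fun(first, _)).reverse ::: acc)
    case _ => acc.reverse
  }
  loop(list, Nil)
}

mapNextTR(List(1, 2, 3, 4), (x: Int, y: Int) => x + y) // List(3, 4, 5, 5, 6, 7)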
Recursion is certainly an option, but the standard library offers some alternatives that will achieve the same iteration pattern.
Here's a very simple setup for demonstration purposes.
val lst = List("a","b","c","d")
def doSomething(a:String, b:String) = a+b
And here's one way to get at what we're after.
val resA = lst.tails.toList.init.flatMap(tl=>tl.tail.map(doSomething(tl.head,_)))
// resA: List[String] = List(ab, ac, ad, bc, bd, cd)
This works but the fact that there's a map() within a flatMap() suggests that a for comprehension might be used to pretty it up.
val resB = for {
  tl <- lst.tails
  if tl.nonEmpty
  h = tl.head
  x <- tl.tail
} yield doSomething(h, x)  // resB: Iterator[String] = non-empty iterator
resB.toList                // List(ab, ac, ad, bc, bd, cd)
In both cases the toList conversion is used to get us back to the original collection type, which might not actually be necessary depending on what further processing of the collection is required.
Comeback Attempt:
After deleting my first attempt to give an answer I put some more thought into it and came up with another, at least shorter solution.
import scala.annotation.tailrec

def mapNext[T, U](list: List[T])(f: (T, T) => U): List[U] = {
  @tailrec
  def loop(in: List[T], out: List[U]): List[U] = in match {
    case Nil          => out
    case head :: tail => loop(tail, out ::: tail.map { f(head, _) })
  }
  loop(list, Nil)
}
I would also like to recommend the enrich my library pattern for adding the mapNext function to the List api (or with some adjustments to any other collection).
object collection {
  object Implicits {
    import scala.annotation.tailrec
    implicit class RichList[A](private val underlying: List[A]) extends AnyVal {
      def mapNext[U](f: (A, A) => U): List[U] = {
        @tailrec
        def loop(in: List[A], out: List[U]): List[U] = in match {
          case Nil          => out
          case head :: tail => loop(tail, out ::: tail.map { f(head, _) })
        }
        loop(underlying, Nil)
      }
    }
  }
}
Then you can use the function like:
list.mapNext(doSomething)
Again, there is a downside, as concatenating lists is relatively expensive.
However, variable assignments inside for comprehensions can be quite inefficient, too (as this improvement task for Dotty, Scala Wart: Convoluted de-sugaring of for-comprehensions, suggests).
UPDATE
Now that I'm into this, I simply cannot let go :(
Concerning 'Note that the +1 step is used here, but in general, it could be +n.'
I extended my proposal with some parameters to cover more situations:
object collection {
  object Implicits {
    import scala.annotation.tailrec
    implicit class RichList[A](private val underlying: List[A]) extends AnyVal {
      def mapNext[U](f: (A, A) => U): List[U] = {
        @tailrec
        def loop(in: List[A], out: List[U]): List[U] = in match {
          case Nil          => out
          case head :: tail => loop(tail, out ::: tail.map { f(head, _) })
        }
        loop(underlying, Nil)
      }
      def mapEvery[U](step: Int)(f: A => U) = {
        @tailrec
        def loop(in: List[A], out: List[U]): List[U] = in match {
          case Nil          => out.reverse
          case head :: tail => loop(tail.drop(step), f(head) :: out)
        }
        loop(underlying, Nil)
      }
      def mapDrop[U](drop1: Int, drop2: Int, step: Int)(f: (A, A) => U): List[U] = {
        @tailrec
        def loop(in: List[A], out: List[U]): List[U] = in match {
          case Nil => out
          case head :: tail =>
            loop(tail.drop(drop1), out ::: tail.drop(drop2).mapEvery(step) { f(head, _) })
        }
        loop(underlying, Nil)
      }
    }
  }
}
list                   // [a, b, c, d, ...]
  .indices             // [0, 1, 2, 3, ...]
  .flatMap { i =>
    val elem = list(i) // Don't redo access every iteration of the below map.
    list.drop(i + 1)   // Take only the inputs that come after the one we're working on
      .map(doSomething(elem, _))
  }
// Or with a monad-comprehension
for {
  index    <- list.indices
  thisElem  = list(index)
  thatElem <- list.drop(index + 1)
} yield doSomething(thisElem, thatElem)
You start, not with the list, but with its indices. Then, you use flatMap, because each index goes to a list of elements. Use drop to take only the elements after the element we're working on, and map that list to actually run the computation. Note that this has terrible time complexity, because most operations here, indices/length, flatMap, map, are O(n) in the list size, and drop and apply are O(n) in the argument.
You can get better performance if you a) stop using a linked list (List is good for LIFO, sequential access, but Vector is better in the general case), and b) make this a tiny bit uglier
val len = vector.length
(0 until len)
  .flatMap { thisIdx =>
    val thisElem = vector(thisIdx)
    ((thisIdx + 1) until len)
      .map { thatIdx =>
        doSomething(thisElem, vector(thatIdx))
      }
  }
// Or
val len = vector.length
for {
  thisIdx  <- 0 until len
  thisElem  = vector(thisIdx)
  thatIdx  <- (thisIdx + 1) until len
  thatElem  = vector(thatIdx)
} yield doSomething(thisElem, thatElem)
If you really need to, you can generalize either version of this code to all IndexedSeqs, by using some implicit CanBuildFrom parameters, but I won't cover that.

Group List elements with a distance less than x

I'm trying to figure out a way to group all the objects in a list depending on an x distance between the elements.
For instance, if distance is 1 then
List(2,3,1,6,10,7,11,12,14)
would give
List(List(1,2,3), List(6,7), List(10,11,12), List(14))
I can only come up with tricky approaches and loops but I guess there must be a cleaner solution.
You may try to sort your list and then use a foldLeft on it. Basically something like this:
def sort = {
  val l = List(2,3,1,6,10,7,11,12,14)
  val dist = 1
  l.sorted.foldLeft(List(List.empty[Int])) { (list, n) =>
    val last = list.head
    last match {
      case h :: q if Math.abs(last.head - n) > dist => List(n) :: list
      case _                                        => (n :: last) :: list.tail
    }
  }
}
The result seems to be okay, but reversed. Calling reverse on the lists where needed, the code becomes
val l = List(2,3,1,6,10,7,11,12,14)
val dist = 1
val res = l.sorted.foldLeft(List(List.empty[Int])) { (list, n) =>
  val last = list.head
  last match {
    case h :: q if Math.abs(last.head - n) > dist => List(n) :: (last.reverse :: list.tail)
    case _                                        => (n :: last) :: list.tail
  }
}.reverse
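For the sample input this should print the groups in the expected order (my own check):
println(res) // List(List(1, 2, 3), List(6, 7), List(10, 11, 12), List(14))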
The cleanest answer would rely upon a method that probably should be called groupedWhile which would split exactly where a condition was true. If you had this method, then it would just be
def byDist(xs: List[Int], d: Int) = groupedWhile(xs.sorted)((l,r) => r - l <= d)
But we don't have groupedWhile.
So let's make one:
def groupedWhile[A](xs: List[A])(p: (A, A) => Boolean): List[List[A]] = {
  val yss = List.newBuilder[List[A]]
  val ys = List.newBuilder[A]
  (xs.take(1) ::: xs, xs).zipped.foreach { (l, r) =>
    if (!p(l, r)) {
      yss += ys.result
      ys.clear
    }
    ys += r
  }
  ys.result match {
    case Nil =>
    case zs  => yss += zs
  }
  yss.result.dropWhile(_.isEmpty)
}
Now that you have the generic capability, you can get the specific one easily.
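Putting the two together on the question's data (my own check):
byDist(List(2, 3, 1, 6, 10, 7, 11, 12, 14), 1)
// List(List(1, 2, 3), List(6, 7), List(10, 11, 12), List(14))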

Cartesian product stream scala

I had a simple task: find the combination which occurs most often when we roll 4 six-sided dice and remove the one with the fewest points.
So, the question is: are there any Scala core classes to generate streams of cartesian products? If not, how can it be implemented in the most simple and effective way?
Here is the code and comparison with naive implementation in Scala:
object D extends App {
  def dropLowest(a: List[Int]) = {
    a diff List(a.min)
  }
  def cartesian(to: Int, times: Int): Stream[List[Int]] = {
    def stream(x: List[Int]): Stream[List[Int]] = {
      if (hasNext(x)) x #:: stream(next(x)) else Stream(x)
    }
    def hasNext(x: List[Int]) = x.exists(n => n < to)
    def next(x: List[Int]) = {
      def add(current: List[Int]): List[Int] = {
        if (current.head == to) 1 :: add(current.tail) else current.head + 1 :: current.tail // here is a possible bug when we get maximal value, don't reuse this method
      }
      add(x.reverse).reverse
    }
    stream(Range(0, times).map(t => 1).toList)
  }
  def getResult(list: Stream[List[Int]]) = {
    list.map(t => dropLowest(t).sum).groupBy(t => t).map(t => (t._1, t._2.size)).toMap
  }
  val list1 = cartesian(6, 4)
  val list = for (i <- Range(1, 7); j <- Range(1, 7); k <- Range(1, 7); l <- Range(1, 7)) yield List(i, j, k, l)
  println(getResult(list1))
  println(getResult(list.toStream) equals getResult(list1))
}
Thanks in advance
I think you can simplify your code by using flatMap:
val stream = (1 to 6).toStream

def cartesian(times: Int): Stream[Seq[Int]] = {
  if (times == 0) {
    Stream(Seq())
  } else {
    stream.flatMap { i => cartesian(times - 1).map(i +: _) }
  }
}
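As a quick sanity check (my own, not from the original answer), the stream is produced lazily in lexicographic order and has 6^4 elements for four dice:
cartesian(2).take(3).toList // List(List(1, 1), List(1, 2), List(1, 3))
cartesian(4).size           // 1296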
Maybe a little bit more efficient (memory-wise) would be using Iterators instead:
val pool = (1 to 6)

def cartesian(times: Int): Iterator[Seq[Int]] = {
  if (times == 0) {
    Iterator(Seq())
  } else {
    pool.iterator.flatMap { i => cartesian(times - 1).map(i +: _) }
  }
}
or even more concise, by replacing the recursive calls with a fold:
def cartesian[A](list: Seq[Seq[A]]): Iterator[Seq[A]] =
  list.foldLeft(Iterator(Seq[A]())) {
    case (acc, l) => acc.flatMap(i => l.map(_ +: i))
  }
and then:
cartesian(Seq.fill(4)(1 to 6)).map(dropLowest).toSeq.groupBy(i => i.sorted).mapValues(_.size).toSeq.sortBy(_._2).foreach(println)
(Note that you cannot use groupBy on Iterators, so Streams or even Lists are the way to go here; the above code is still valid, since toSeq on an Iterator actually returns a lazy Stream.)
If you are considering stats on the sums of dice instead of combinations, you can update the dropLowest function:
def dropLowest(l: Seq[Int]) = l.sum - l.min

Is there a generic way to memoize in Scala?

I wanted to memoize this:
def fib(n: Int): Int = if (n <= 1) 1 else fib(n-1) + fib(n-2)
println(fib(100)) // times out
So I wrote this and this surprisingly compiles and works (I am surprised because fib references itself in its declaration):
import scala.collection.mutable

case class Memo[A, B](f: A => B) extends (A => B) {
  private val cache = mutable.Map.empty[A, B]
  def apply(x: A) = cache getOrElseUpdate (x, f(x))
}

val fib: Memo[Int, BigInt] = Memo {
  case 0 => 0
  case 1 => 1
  case n => fib(n-1) + fib(n-2)
}
println(fib(100)) // prints 100th fibonacci number instantly
But when I try to declare fib inside of a def, I get a compiler error:
def foo(n: Int) = {
  val fib: Memo[Int, BigInt] = Memo {
    case 0 => 0
    case 1 => 1
    case n => fib(n-1) + fib(n-2)
  }
  fib(n)
}
The above fails to compile with error: forward reference extends over definition of value fib
case n => fib(n-1) + fib(n-2)
Why does declaring the val fib inside a def fail, while it works outside, in class/object scope?
To clarify why I might want to declare the recursive memoized function in def scope, here is my solution to the subset sum problem:
/**
 * Subset sum algorithm - can we achieve sum t using elements from s?
 *
 * @param s set of integers
 * @param t target
 * @return true iff there exists a subset of s that sums to t
 */
def subsetSum(s: Seq[Int], t: Int): Boolean = {
  val max = s.scanLeft(0)((sum, i) => (sum + i) max sum) // max(i) = largest sum achievable from first i elements
  val min = s.scanLeft(0)((sum, i) => (sum + i) min sum) // min(i) = smallest sum achievable from first i elements
  val dp: Memo[(Int, Int), Boolean] = Memo { // dp(i,x) = can we achieve x using the first i elements?
    case (_, 0) => true  // 0 can always be achieved using empty set
    case (0, _) => false // if empty set, non-zero cannot be achieved
    case (i, x) if min(i) <= x && x <= max(i) => dp(i-1, x - s(i-1)) || dp(i-1, x) // try with/without s(i-1)
    case _ => false      // outside range otherwise
  }
  dp(s.length, t)
}
I found a better way to memoize using Scala:
import scala.collection.mutable

def memoize[I, O](f: I => O): I => O = new mutable.HashMap[I, O]() {
  override def apply(key: I) = getOrElseUpdate(key, f(key))
}
Now you can write fibonacci as follows:
lazy val fib: Int => BigInt = memoize {
  case 0 => 0
  case 1 => 1
  case n => fib(n-1) + fib(n-2)
}
Here's one with multiple arguments (the choose function):
lazy val c: ((Int, Int)) => BigInt = memoize {
  case (_, 0) => 1
  case (n, r) if r > n/2 => c(n, n - r)
  case (n, r) => c(n - 1, r - 1) + c(n - 1, r)
}
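For example (my own check), choosing 3 out of 10:
c((10, 3)) // 120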
And here's the subset sum problem:
// is there a subset of s which has sum = t
def isSubsetSumAchievable(s: Vector[Int], t: Int) = {
  // f is (i, j) => Boolean i.e. can the first i elements of s add up to j
  lazy val f: ((Int, Int)) => Boolean = memoize {
    case (_, 0) => true  // 0 can always be achieved using empty list
    case (0, _) => false // we can never achieve non-zero if we have empty list
    case (i, j) =>
      val k = i - 1      // try the kth element
      f(k, j - s(k)) || f(k, j)
  }
  f(s.length, t)
}
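A couple of quick checks (my own examples):
isSubsetSumAchievable(Vector(1, 3, 9), 4) // true  (1 + 3)
isSubsetSumAchievable(Vector(1, 3, 9), 6) // false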
EDIT: As discussed below, here is a thread-safe version
def memoize[I, O](f: I => O): I => O = new mutable.HashMap[I, O]() { self =>
  override def apply(key: I) = self.synchronized(getOrElseUpdate(key, f(key)))
}
A class/trait level val compiles to a combination of a method and a private variable, hence a recursive definition is allowed.
Local vals, on the other hand, are just regular variables, so a recursive definition is not allowed.
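One way to see the difference in practice (a sketch of mine, reusing the Memo class from the question): marking the local definition lazy sidesteps the forward-reference restriction, because a lazy val is compiled to an accessor backed by a flag rather than a plain local variable.
def foo(n: Int) = {
  lazy val fib: Memo[Int, BigInt] = Memo {
    case 0 => 0
    case 1 => 1
    case n => fib(n-1) + fib(n-2)
  }
  fib(n) // compiles; but note the cache still only lives for this single call to foo
}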
By the way, even if the def you defined worked, it wouldn't do what you expect. On every invocation of foo a new function object fib will be created and it will have its own backing map. What you should be doing instead is this (if you really want a def to be your public interface):
private val fib: Memo[Int, BigInt] = Memo {
  case 0 => 0
  case 1 => 1
  case n => fib(n-1) + fib(n-2)
}

def foo(n: Int) = {
  fib(n)
}
Scalaz has a solution for that, why not reuse it?
import scalaz.Memo
lazy val fib: Int => BigInt = Memo.mutableHashMapMemo {
  case 0 => 0
  case 1 => 1
  case n => fib(n-2) + fib(n-1)
}
You can read more about memoization in Scalaz.
Mutable HashMap isn't thread safe. Also, defining separate case statements for the base conditions seems like unnecessary special handling; instead, the Map can be pre-loaded with the initial values and passed to the Memoizer. The following would be the signature of the Memoizer: it accepts a memo (an immutable Map) and a formula, and returns a recursive function.
Memoizer would look like
def memoize[I,O](memo: Map[I, O], formula: (I => O, I) => O): I => O
Now, given the following Fibonacci formula,
def fib(f: Int => Int, n: Int) = f(n-1) + f(n-2)
fibonacci with Memoizer can be defined as
val fibonacci = memoize( Map(0 -> 0, 1 -> 1), fib)
where context agnostic general purpose Memoizer is defined as
def memoize[I, O](map: Map[I, O], formula: (I => O, I) => O): I => O = {
  var memo = map
  def recur(n: I): O = {
    if (memo contains n) {
      memo(n)
    } else {
      val result = formula(recur, n)
      memo += (n -> result)
      result
    }
  }
  recur
}
Similarly, for factorial, a formula is
def fac(f: Int => Int, n: Int): Int = n * f(n-1)
and factorial with Memoizer is
val factorial = memoize( Map(0 -> 1, 1 -> 1), fac)
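A quick sanity check of both (my own examples):
fibonacci(10) // 55
factorial(5)  // 120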
Inspiration: Memoization, Chapter 4 of JavaScript: The Good Parts by Douglas Crockford