Recursively accumulate elements of collection


I have a collection of elements with an integer range such as
case class Element(id: Int, from: Int, to: Int)
val elementColl: Traversable[Element]
and I want to accumulate them into
case class ElementAcc(ids: List[Int], from: Int, to: Int)
according to the following algorithm:
Take one Element from my elementColl and use it to create a new ElementAcc which has the same from/to as the Element taken.
Iterate over remaining elements in elementColl to look for an Element that has an overlapping integer range with our ElementAcc.
If one is found, add it to ElementAcc and expand the integer range of ElementAcc to include the range of the new Element
If none is found, repeat the process above on the remaining elements of elementColl that have not yet been assigned to an ElementAcc
This should result in a collection of ElementAccs. While just recursively adding elements to an accumulator seems easy enough, I don't know how to handle the shrinking size of elementColl so that I don't add the same Element to multiple ElementAccs.
Edit: I think I was unclear regarding the extension of the range, so let me clarify this with an example:
My accumulator currently has a range from 1 to 5. An Element with a range from 6 to 8 does not overlap with the accumulator range and thus will not be included. An Element with a range of 4 to 7 does overlap, will be included and the resulting accumulator has a range from 1 to 7.

I'll go like this:
1) Write a function that takes an ElementAcc and an Element and returns an ElementAcc.
The function would look like:
def extend(acc: ElementAcc, e: Element): ElementAcc = {
  // Two inclusive ranges overlap when each one starts before the other ends.
  // This also covers an Element whose range completely encloses the accumulator.
  if (e.from <= acc.to && acc.from <= e.to)
    ElementAcc(e.id :: acc.ids, math.min(acc.from, e.from), math.max(acc.to, e.to))
  else acc
}
foldLeft is often a good solution when accumulating objects.
It needs an initial value for the accumulator and a function that takes an accumulator and an element and returns an accumulator. It then accumulates all elements of the traversable.
EDIT:
2) To accumulate on different lists you would have to create another function to combine a List[ElementAcc] and an Element :
def overlap(acc: ElementAcc, e: Element): Boolean = {
  // true when the inclusive ranges share at least one integer,
  // including the case where e completely encloses acc
  e.from <= acc.to && acc.from <= e.to
}
def dispatch(accList: List[ElementAcc], e: Element): List[ElementAcc] = accList match {
case Nil => List(ElementAcc(List(e.id), e.from, e.to))
case acc :: tail =>
if (overlap(acc, e)) extend(acc, e) :: tail
else acc :: dispatch(tail, e)
}
3) And it's used with a foldLeft:
val a = Element(0, 0, 5)
val b = Element(1, 3, 8)
val c = Element(2, 20, 30)
val sorted = List(a, b, c).foldLeft(List[ElementAcc]())(dispatch)
sorted: List[ElementAcc] = List(ElementAcc(List(1, 0),0,8), ElementAcc(List(2),20,30))
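One thing to watch for with dispatch: an Element that overlaps two existing accumulators is added to the first one only, so the two accumulators are never merged. If that matters, a possible alternative is to sort by from first and fold once. The following is only a sketch under that assumption; accumulate is a hypothetical name, and the case classes are repeated so the snippet stands alone.

```scala
case class Element(id: Int, from: Int, to: Int)
case class ElementAcc(ids: List[Int], from: Int, to: Int)

// Hypothetical helper: sort by range start, then grow the most recent
// accumulator while the next element starts inside its (inclusive) range.
def accumulate(elements: Iterable[Element]): List[ElementAcc] =
  elements.toList.sortBy(_.from).foldLeft(List.empty[ElementAcc]) {
    case (acc :: done, e) if e.from <= acc.to =>
      ElementAcc(e.id :: acc.ids, acc.from, math.max(acc.to, e.to)) :: done
    case (accs, e) =>
      // no overlap with the latest accumulator: start a new one
      ElementAcc(List(e.id), e.from, e.to) :: accs
  }.reverse

val merged = accumulate(List(Element(0, 0, 5), Element(1, 10, 15), Element(2, 4, 11)))
// Element 2 bridges the other two ranges, so a single accumulator 0..15 remains
```

Because the input is sorted, each Element only needs to be compared with the most recent accumulator.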

Related

Scala partition sorted list elements based on distance

I am new to Scala and functional programming. I have a task where I want to partition a Scala list into a list of sub-lists such that the distance between neighbouring elements in any sub-list is less than 2. I found some code online that can do this, but I don't understand how it works internally. Can someone give a detailed explanation?
def partition(input: List[Int], prev: Int,
splits: List[List[Int]]): List[List[Int]] = {
input match {
case Nil => splits
case h :: t if h-prev < 2 => partition(t, h, (h :: splits.head) :: splits.tail)
case h :: t => partition(t, h, List(h) :: splits)
}
}
val input = List(1,2,3,5,6,7,10)
partition(input,input.head,List(List.empty[Int]))
The result is as follows:
List[List[Int]] = List(List(10), List(7, 6, 5), List(3, 2, 1))
which is the desired outcome.

This code assumes the original list is ordered from smallest to largest.
It works recursively: in each call, input is what is still left of the list, prev holds the previous head of the list (initially input.head), and splits holds the splits built so far.
In each call, we look at the input (what is left of the list).
If it is empty (Nil), we have finished splitting and return the splits.
The other two cases use pattern matching to break the input into a head and a tail (h and t respectively).
The second case uses a guard condition (the if) to check whether the head of the input belongs in the latest split; if it does, it is prepended to that split.
The last case creates a new split.
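For comparison, the same head/guard/new-split logic can be sketched with foldLeft, reversing at the end so the groups come out in ascending order. This is only an illustration; partitionAsc is a hypothetical name and not part of the code being explained.

```scala
// Hypothetical foldLeft variant of the recursive partition above.
// The accumulator holds the groups newest-first, each group newest-first.
def partitionAsc(input: List[Int]): List[List[Int]] =
  input.foldLeft(List.empty[List[Int]]) {
    // x is close to the last element of the latest group: extend that group
    case ((last :: run) :: done, x) if x - last < 2 => (x :: last :: run) :: done
    // otherwise x starts a new group
    case (done, x) => List(x) :: done
  }.map(_.reverse).reverse

val groups = partitionAsc(List(1, 2, 3, 5, 6, 7, 10))
// groups == List(List(1, 2, 3), List(5, 6, 7), List(10))
```

Unlike the recursive version, this one needs no prev argument and also accepts an empty list.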

def partition(input :List[Int] // a sorted List of Ints
,prev :Int // Int previously added to the accumulator
,splits :List[List[Int]] // accumulator of Ints for eventual output
): List[List[Int]] = { // the output (same type as accumulator)
input match { // what does input look like?
case Nil => splits // input is empty, return the accumulator
// input has a head and tail, head is close to previous Int
case h :: t if h-prev < 2 =>
// start again with new input (current tail), new previous (current head),
// and the current head inserted into accumulator
partition(t, h, (h :: splits.head) :: splits.tail)
// input has a head and tail, head is not close to previous Int
case h :: t =>
// start again with new input (current tail), new previous (current head),
// and the current head is the start of a new sub-list in the accumulator
partition(t, h, List(h) :: splits)
}
}

Scalaz Tree - finding the min/max depth from a TreeLoc set

In Scalaz I have a Tree[A] like;
'A'.node('B'.leaf, 'C'.node('D'.leaf), 'E'.leaf)
Now lets say I have a function which recurses through the tree and returns a TreeLoc;
def getCharLoc(c: Char) = tree.loc.find(_.getLabel == c)
Then I do something like
Seq('D','E').flatMap(getCharLoc)
How could I find the lowest and/or the highest loc in the tree? In the above example 'D' is the lowest/deepest location and 'E' is the highest/shallowest location.
I was thinking each loc has a .path method which returns a Stream from the loc to the root. Calling .length on this would give a count of the depth which could be compared in a fold left but it feels clunky.
How can I achieve this?

I was able to count parents with a tail recursive function, not sure if you would consider that more or less "clunky":
val tree = 'A'.node('B'.leaf, 'C'.node('D'.leaf), 'E'.leaf)
@tailrec def countParents(loc: Option[TreeLoc[Char]], acc: Int = 0): Int =
  loc >>= { _.parent } match {
    case None => acc
    case next @ _ => countParents(next, acc + 1)
  }
println(countParents(tree.loc.find(_.getLabel == 'D'))) // 2
println(countParents(tree.loc.find(_.getLabel == 'E'))) // 1

Find min and max elements of array

I want to find the min and max elements of an array using for comprehension. Is it possible to do that with one iteration of array to find both min element and max element?
I am looking for a solution without using scala provided array.min or max.

You can get min and max values of an Array[Int] with reduceLeft function.
scala> val a = Array(20, 12, 6, 15, 2, 9)
a: Array[Int] = Array(20, 12, 6, 15, 2, 9)
scala> a.reduceLeft(_ min _)
res: Int = 2
scala> a.reduceLeft(_ max _)
res: Int = 20
See this link for more information and examples of reduceLeft method: http://alvinalexander.com/scala/scala-reduceleft-examples

Here is a concise and readable solution, that avoids the ugly if statements :
def minMax(a: Array[Int]) : (Int, Int) = {
if (a.isEmpty) throw new java.lang.UnsupportedOperationException("array is empty")
a.foldLeft((a(0), a(0)))
{ case ((min, max), e) => (math.min(min, e), math.max(max, e))}
}
Explanation: foldLeft is a standard method on many Scala collections. It lets you pass an accumulator to a callback function that is called for each element of the array.
Take a look at the Scaladoc for further details.

def findMinAndMax(array: Array[Int]) = { // a non-empty array
val initial = (array.head, array.head) // a tuple representing min-max
// foldLeft takes an initial value of type of result, in this case a tuple
// foldLeft also takes a function of 2 parameters.
// the 'left' parameter is an accumulator (foldLeft -> accum is left)
// the other parameter is a value from the collection.
// the function2 should return a value which replaces accumulator (in next iteration)
// when the next value from collection will be picked.
// so on till all values are iterated, in the end accum is returned.
array.foldLeft(initial) { case (acc @ (min, max), x) =>
if (x < min) (x, max)
else if (x > max) (min, x)
else acc // unchanged: reuse the existing tuple
}
}

Following on from the other answers - a more general solution is possible, that works for other collections as well as Array, and other contents as well as Int:
def minmax[B >: A, A](xs: Iterable[A])(implicit cmp: Ordering[B]): (A, A) = {
if (xs.isEmpty) throw new UnsupportedOperationException("empty.minmax")
val initial = (xs.head, xs.head)
xs.foldLeft(initial) { case ((min, max), x) =>
(if (cmp.lt(x, min)) x else min, if (cmp.gt(x, max)) x else max) }
}
For example:
minmax(List(4, 3, 1, 2, 5)) //> res0: (Int, Int) = (1,5)
minmax(Vector('Z', 'K', 'B', 'A')) //> res1: (Char, Char) = (A,Z)
minmax(Array(3.0, 2.0, 1.0)) //> res2: (Double, Double) = (1.0,3.0)
(It's also possible to write this a bit more concisely using cmp.min() and cmp.max(), but only if you remove the B >: A type bound, which makes the function less general).
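Following the remark above, a more concise variant without the B >: A bound might look like this. It is only a sketch; minmaxSimple is a hypothetical name, and it relies on the min and max methods of scala.math.Ordering.

```scala
// Hypothetical simplified variant: the Ordering is fixed to the element type A,
// so cmp.min and cmp.max can be used directly.
def minmaxSimple[A](xs: Iterable[A])(implicit cmp: Ordering[A]): (A, A) = {
  if (xs.isEmpty) throw new UnsupportedOperationException("empty.minmax")
  xs.foldLeft((xs.head, xs.head)) { case ((lo, hi), x) =>
    (cmp.min(lo, x), cmp.max(hi, x))
  }
}

val ints = minmaxSimple(List(4, 3, 1, 2, 5))            // (1, 5)
val chars = minmaxSimple(Vector('Z', 'K', 'B', 'A'))    // ('A', 'Z')
```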

Consider this (for non-empty orderable arrays),
val ys = xs.sorted
val (min,max) = (ys.head, ys.last)

val xs: Array[Int] = ???
var min: Int = Int.MaxValue
var max: Int = Int.MinValue
for (x <- xs) {
if (x < min) min = x
if (x > max) max = x
}

I'm super late to the party on this one, but I'm new to Scala and thought I'd contribute anyway. Here is a solution using tail recursion:
@tailrec
def max(list: List[Int], currentMax: Int = Int.MinValue): Int = {
if(list.isEmpty) currentMax
else if ( list.head > currentMax) max(list.tail, list.head)
else max(list.tail,currentMax)
}

Of all the answers to this question that I reviewed, DNA's solution was the closest to "Scala idiomatic" I could find. However, it can be slightly improved by:
Performing as few comparisons as needed (important for very large collections)
Providing ideal ordering consistency by only using the Ordering.lt method
Avoiding throwing an Exception
Making the code more readable for those new to and learning Scala
The comments should help clarify the changes.
def minAndMax[B>: A, A](iterable: Iterable[A])(implicit ordering: Ordering[B]): Option[(A, A)] =
if (iterable.nonEmpty)
Some(
iterable.foldLeft((iterable.head, iterable.head)) {
case (minAndMaxTuple, element) =>
val (min, max) =
minAndMaxTuple //decode reference to tuple
if (ordering.lt(element, min))
(element, max) //if replacing min, it isn't possible max will change so no need for the max comparison
else
if (ordering.lt(max, element))
(min, element)
else
minAndMaxTuple //use original reference to avoid instantiating a new tuple
}
)
else
None
And here's the solution expanded to return the lower and upper bounds of a 2d space in a single pass, again using the above optimizations:
def minAndMax2d[B >: A, A](iterable: Iterable[(A, A)])(implicit ordering: Ordering[B]): Option[((A, A), (A, A))] =
if (iterable.nonEmpty)
Some(
iterable.foldLeft(((iterable.head._1, iterable.head._1), (iterable.head._2, iterable.head._2))) {
case ((minAndMaxTupleX, minAndMaxTupleY), (elementX, elementY)) =>
val ((minX, maxX), (minY, maxY)) =
(minAndMaxTupleX, minAndMaxTupleY) //decode reference to tuple
(
if (ordering.lt(elementX, minX))
(elementX, maxX) //if replacing minX, it isn't possible maxX will change so no need for the maxX comparison
else
if (ordering.lt(maxX, elementX))
(minX, elementX)
else
minAndMaxTupleX //use original reference to avoid instantiating a new tuple
, if (ordering.lt(elementY, minY))
(elementY, maxY) //if replacing minY, it isn't possible maxY will change so no need for the maxY comparison
else
if (ordering.lt(maxY, elementY))
(minY, elementY)
else
minAndMaxTupleY //use original reference to avoid instantiating a new tuple
)
}
)
else
None

You could always write your own foldLeft function - that will guarantee one iteration and known performance.
val array = Array(3,4,62,8,9,2,1)
if(array.isEmpty) throw new IllegalArgumentException // Just so we can safely call array.head
val (minimum, maximum) = array.foldLeft((array.head, array.head)) { // We start off with the first element as both min and max
case ((min, max), next) =>
if(next > max) (min, next)
else if(next < min) (next, max)
else (min, max)
}
println(minimum, maximum) //1, 62

scala> val v = Vector(1,2)
scala> v.max
res0: Int = 2
scala> v.min
res1: Int = 1
You could use the min and max methods of Vector

Convert normal recursion to tail recursion

I was wondering if there is some general method to convert a "normal" recursion with foo(...) + foo(...) as the last call to a tail-recursion.
For example (scala):
def pascal(c: Int, r: Int): Int = {
if (c == 0 || c == r) 1
else pascal(c - 1, r - 1) + pascal(c, r - 1)
}
A general solution for functional languages to convert recursive function to a tail-call equivalent:
A simple way is to wrap the non tail-recursive function in the Trampoline monad.
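// Trampoline is assumed here to be the Scalaz one
// (scalaz.Free.Trampoline plus the helpers on scalaz.Trampoline)
def pascalM(c: Int, r: Int): Trampoline[Int] = {
  if (c == 0 || c == r) Trampoline.done(1)
  else for {
    // suspend the recursive calls to pascalM itself, not to the plain pascal
    a <- Trampoline.suspend(pascalM(c - 1, r - 1))
    b <- Trampoline.suspend(pascalM(c, r - 1))
  } yield a + b
}
val result = pascalM(5, 10).run // column must not exceed row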
So pascalM no longer recurses on the call stack. Instead, the Trampoline monad builds a nested structure of the computations that need to be done. Finally, run is a tail-recursive function that walks through this tree-like structure, interpreting it, and returns the value once it reaches the base case.
A paper by Rúnar Bjarnason on the subject of trampolines: Stackless Scala With Free Monads
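If you would rather not pull in an extra library, the same idea can be sketched with scala.util.control.TailCalls from the standard library. Note that this swaps in TailCalls for the Trampoline monad used above; pascalT is a hypothetical name.

```scala
import scala.util.control.TailCalls._

// Each recursive call is wrapped in tailcall, so nothing recurses on the
// JVM stack; result interprets the suspended computation iteratively.
def pascalT(c: Int, r: Int): TailRec[Int] =
  if (c == 0 || c == r) done(1)
  else for {
    a <- tailcall(pascalT(c - 1, r - 1))
    b <- tailcall(pascalT(c, r - 1))
  } yield a + b

val value = pascalT(2, 4).result // 6
```

This only addresses stack depth. The number of calls is still exponential, so it is no faster than the original doubly-recursive version.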

In cases where there is a simple modification to the value of a recursive call, that operation can be moved to the front of the recursive function. The classic example of this is Tail recursion modulo cons, where a simple recursive function in this form:
def recur[A](...):List[A] = {
...
x :: recur(...)
}
which is not tail recursive, is transformed into
def recur[A](...): List[A] = {
  def consRecur(..., consA: A): List[A] = {
    consA :: ...
    ...
    consRecur(..., ...)
  }
  ...
  consRecur(..., ...)
}
Alexlv's example is a variant of this.
This is such a well known situation that some compilers (I know of Prolog and Scheme examples but Scalac does not do this) can detect simple cases and perform this optimisation automatically.
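As a concrete illustration of the pattern above, here is a small sketch with hypothetical names (incrementAll, incrementAllTR). Since Scala's immutable List cannot be filled in from the front, this version accumulates in reverse and reverses once at the end, which is the usual accumulator-style stand-in for tail recursion modulo cons rather than the compiler optimisation itself.

```scala
import scala.annotation.tailrec

// Not tail recursive: the cons happens after the recursive call returns.
def incrementAll(xs: List[Int]): List[Int] = xs match {
  case Nil    => Nil
  case h :: t => (h + 1) :: incrementAll(t)
}

// Tail recursive: the cons is moved in front of the recursive call by
// consing onto an accumulator, which is reversed when the input runs out.
def incrementAllTR(xs: List[Int]): List[Int] = {
  @tailrec
  def loop(rest: List[Int], acc: List[Int]): List[Int] = rest match {
    case Nil    => acc.reverse
    case h :: t => loop(t, (h + 1) :: acc)
  }
  loop(xs, Nil)
}
```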
Problems that combine multiple calls to a recursive function have no such simple solution. The TRMC optimisation is useless here, as you are simply moving the first recursive call to another non-tail position. The only way to reach a tail-recursive solution is to remove all but one of the recursive calls; how to do this is entirely context dependent and requires finding a different approach to solving the problem.
As it happens, in some ways your example is similar to the classic Fibonacci sequence problem; in that case the naive but elegant doubly-recursive solution can be replaced by one which loops forward from the 0th number.
def fib (n: Long): Long = n match {
case 0 | 1 => n
case _ => fib( n - 2) + fib( n - 1 )
}
def fib (n: Long): Long = {
def loop(current: Long, next: => Long, iteration: Long): Long = {
if (n == iteration)
current
else
loop(next, current + next, iteration + 1)
}
loop(0, 1, 0)
}
For the Fibonacci sequence, this is the most efficient approach (a streams based solution is just a different expression of this solution that can cache results for subsequent calls). Now, you can also solve your problem by looping forward from c0/r0 (well, c0/r2) and calculating each row in sequence - the difference being that you need to cache the entire previous row. So while this has a similarity to fib, it differs dramatically in the specifics and is also significantly less efficient than your original, doubly-recursive solution.
Here's an approach for your pascal triangle example which can calculate pascal(30,60) efficiently:
def pascal(column: Long, row: Long):Long = {
type Point = (Long, Long)
type Points = List[Point]
type Triangle = Map[Point,Long]
def above(p: Point) = (p._1, p._2 - 1)
def aboveLeft(p: Point) = (p._1 - 1, p._2 - 1)
def find(ps: Points, t: Triangle): Long = ps match {
// Found the ultimate goal
case (p :: Nil) if t contains p => t(p)
// Found an intermediate point: pop the stack and carry on
case (p :: rest) if t contains p => find(rest, t)
// Hit a triangle edge, add it to the triangle
case ((c, r) :: _) if (c == 0) || (c == r) => find(ps, t + ((c,r) -> 1))
// Triangle contains (c - 1, r - 1)...
case (p :: _) if t contains aboveLeft(p) => if (t contains above(p))
// And it contains (c, r - 1)! Add to the triangle
find(ps, t + (p -> (t(aboveLeft(p)) + t(above(p)))))
else
// Does not contain(c, r -1). So find that
find(above(p) :: ps, t)
// If we get here, we don't have (c - 1, r - 1). Find that.
case (p :: _) => find(aboveLeft(p) :: ps, t)
}
require(column >= 0 && row >= 0 && column <= row)
(column, row) match {
case (c, r) if (c == 0) || (c == r) => 1
case p => find(List(p), Map())
}
}
It's efficient, but I think it shows how ugly complex recursive solutions can become as you deform them to become tail recursive. At this point, it may be worth moving to a different model entirely. Continuations or monadic gymnastics might be better.
You want a generic way to transform your function. There isn't one. There are helpful approaches, that's all.

I don't know how theoretical this question is, but a recursive implementation won't be efficient even with tail-recursion. Try computing pascal(30, 60), for example. I don't think you'll get a stack overflow, but be prepared to take a long coffee break.
Instead, consider using a Stream or memoization:
val pascal: Stream[Stream[Long]] =
(Stream(1L)
#:: (Stream from 1 map { i =>
// compute row i
(1L
#:: (pascal(i-1) // take the previous row
sliding 2 // and add adjacent values pairwise
collect { case Stream(a,b) => a + b }).toStream
++ Stream(1L))
}))
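The memoization alternative mentioned above could look roughly like this. It is a sketch with hypothetical names (pascalMemo, cache); it is not tail recursive, so the recursion depth still grows with the row number, but each (column, row) pair is computed only once.

```scala
import scala.collection.mutable

def pascalMemo(c: Int, r: Int): Long = {
  require(c >= 0 && r >= 0 && c <= r)
  // cache of already computed entries, keyed by (column, row)
  val cache = mutable.Map.empty[(Int, Int), Long]
  def go(col: Int, row: Int): Long =
    if (col == 0 || col == row) 1L
    else cache.get((col, row)) match {
      case Some(v) => v
      case None =>
        val v = go(col - 1, row - 1) + go(col, row - 1)
        cache((col, row)) = v
        v
    }
  go(c, r)
}

val small = pascalMemo(5, 10) // 252
val large = pascalMemo(30, 60) // returns quickly, unlike the naive recursion
```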

The accumulator approach
def pascal(c: Int, r: Int): Int = {
def pascalAcc(acc:Int, leftover: List[(Int, Int)]):Int = {
if (leftover.isEmpty) acc
else {
val (c1, r1) = leftover.head
// Edge.
if (c1 == 0 || c1 == r1) pascalAcc(acc + 1, leftover.tail)
// Safe checks.
else if (c1 < 0 || r1 < 0 || c1 > r1) pascalAcc(acc, leftover.tail)
// Add 2 other points to accumulator.
else pascalAcc(acc, (c1 , r1 - 1) :: ((c1 - 1, r1 - 1) :: leftover.tail ))
}
}
pascalAcc(0, List ((c,r) ))
}
It does not overflow the stack, but, as Aaron mentioned, it is not fast for a large row and column.

Yes, it's possible. Usually it's done with the accumulator pattern: an internally defined function takes one additional argument, the so-called accumulator. Take counting the length of a list as an example.
For example normal recursive version would look like this:
def length[A](xs: List[A]): Int = if (xs.isEmpty) 0 else 1 + length(xs.tail)
That's not a tail-recursive version. In order to eliminate the final addition we have to accumulate the value as we go, for example with the accumulator pattern:
def length[A](xs: List[A]) = {
def inner(ys: List[A], acc: Int): Int = {
if (ys.isEmpty) acc else inner(ys.tail, acc + 1)
}
inner(xs, 0)
}
It's a bit longer to code, but I think the idea is clear. Of course you can do it without the inner function, but in that case you have to provide the initial acc value manually.

I'm pretty sure it's not possible in the general case in the simple way you're looking for, but it would depend on how elaborate you permit the changes to be.
A tail-recursive function must be re-writable as a while-loop, but try implementing, for example, a Fractal Tree using while-loops. It's possible, but you need to use an array or collection to store the state for each point, which substitutes for the data otherwise stored in the call stack.
It's also possible to use trampolining.
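To make the while-loop point concrete for the pascal example from the question, here is a rough sketch (pascalLoop and pending are hypothetical names). The pending list plays the role of the call stack; the algorithm is still the exponential one, so this avoids stack overflow but not the long running time.

```scala
def pascalLoop(c: Int, r: Int): Long = {
  require(c >= 0 && r >= 0 && c <= r)
  // explicit work list standing in for the call stack
  var pending: List[(Int, Int)] = List((c, r))
  var total = 0L
  while (pending.nonEmpty) {
    val (c1, r1) = pending.head
    pending = pending.tail
    if (c1 == 0 || c1 == r1) total += 1 // edge of the triangle contributes 1
    else pending = (c1 - 1, r1 - 1) :: (c1, r1 - 1) :: pending
  }
  total
}

val six = pascalLoop(2, 4) // 6
```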

It is indeed possible. The way I'd do this is to begin with List(1) and keep recursing until you get to the row you want.
It is worth noticing that you can optimize it: if c == 0 or c == r the value is one, and to calculate, let's say, column 3 of the 100th row you only need the first few elements of each previous row.
A working tail recursive solution would be this:
def pascal(c: Int, r: Int): Int = {
@tailrec
def pascalAcc(c: Int, r: Int, acc: List[Int]): List[Int] = {
if (r == 0) acc
else pascalAcc(c, r - 1,
// from let's say 1 3 3 1 builds 0 1 3 3 1 0 , takes only the
// subset that matters (if asking for col c, no cols after c are
// used) and uses sliding to build (0 1) (1 3) (3 3) etc.
(0 +: acc :+ 0).take(c + 2)
.sliding(2, 1).map { x => x.reduce(_ + _) }.toList)
}
if (c == 0 || c == r) 1
else pascalAcc(c, r, List(1))(c)
}
The @tailrec annotation makes the compiler check that the function is actually tail recursive.
It could probably be optimized further: since the rows are symmetric, if c > r/2 then pascal(c, r) == pascal(r - c, r). That is left to the reader ;)
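One way the symmetry optimization could be sketched (pascalSym is a hypothetical name, and it uses Long and Vector rather than the Int and List of the answer above):

```scala
import scala.annotation.tailrec

def pascalSym(c: Int, r: Int): Long = {
  require(c >= 0 && r >= 0 && c <= r)
  // rows are symmetric, so use whichever column is closer to the left edge
  val k = math.min(c, r - c)

  @tailrec
  def rows(remaining: Int, row: Vector[Long]): Vector[Long] =
    if (remaining == 0) row
    else {
      val padded = 0L +: row :+ 0L
      // add adjacent values pairwise, keeping only columns 0..k
      val next = padded.zip(padded.tail).map { case (a, b) => a + b }.take(k + 1)
      rows(remaining - 1, next)
    }

  rows(r, Vector(1L))(k)
}

val four = pascalSym(3, 4) // 4, computed as column 1 of row 4
```

Truncating each row to k + 1 columns is safe because column i of the next row depends only on columns i - 1 and i of the current one.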

How to functionally merge overlapping number-ranges from a List

I have a number of range-objects which I need to merge so that all overlapping ranges disappear:
case class Range(from:Int, to:Int)
val rangelist = List(Range(3, 40), Range(1, 45), Range(2, 50), etc)
Here are the ranges:
3 40
1 45
2 50
70 75
75 90
80 85
100 200
Once finished we would get:
1 50
70 90
100 200
Imperative Algorithm:
Pop() the first range-obj and iterate through the rest of the list comparing it with each of the other ranges.
if there is an overlapping item,
merge them together ( This yields a new Range instance ) and delete the 2 merge-candidates from the source-list.
At the end of the list add the Range object (which could have changed numerous times through merging) to the final-result-list.
Repeat this with the next of the remaining items.
Once the source-list is empty we're done.
To do this imperatively one must create a lot of temporary variables, indexed loops etc.
So I'm wondering if there is a more functional approach?
At first sight the source-collection must be able to act like a Stack in providing pop() PLUS
giving the ability to delete items by index while iterating over it, but then that would not be that functional anymore.

Try tail-recursion. (Annotation is needed only to warn you if tail-recursion optimization doesn't happen; the compiler will do it if it can whether you annotate or not.)
import annotation.{tailrec => tco}
@tco final def collapse(rs: List[Range], sep: List[Range] = Nil): List[Range] = rs match {
case x :: y :: rest =>
if (y.from > x.to) collapse(y :: rest, x :: sep)
else collapse( Range(x.from, x.to max y.to) :: rest, sep)
case _ =>
(rs ::: sep).reverse
}
def merge(rs: List[Range]): List[Range] = collapse(rs.sortBy(_.from))

I love these sorts of puzzles:
case class Range(from:Int, to:Int) {
assert(from <= to)
/** Returns true if given Range is completely contained in this range */
def contains(rhs: Range) = from <= rhs.from && rhs.to <= to
/** Returns true if given value is contained in this range */
def contains(v: Int) = from <= v && v <= to
}
def collapse(rangelist: List[Range]) =
// sorting the list puts overlapping ranges adjacent to one another in the list
// foldLeft runs a function on successive elements. it's a great way to process
// a list when the results are not a 1:1 mapping.
rangelist.sortBy(_.from).foldLeft(List.empty[Range]) { (acc, r) =>
acc match {
case head :: tail if head.contains(r) =>
// r completely contained; drop it
head :: tail
case head :: tail if head.contains(r.from) =>
// partial overlap; expand head to include both head and r
Range(head.from, r.to) :: tail
case _ =>
// no overlap; prepend r to list
r :: acc
}
}

Here's my solution:
def merge(ranges:List[Range]) = ranges
.sortWith{(a, b) => a.from < b.from || (a.from == b.from && a.to < b.to)}
.foldLeft(List[Range]()){(buildList, range) => buildList match {
case Nil => List(range)
case head :: tail => if (head.to >= range.from) {
Range(head.from, head.to.max(range.to)) :: tail
} else {
range :: buildList
}
}}
.reverse
merge(List(Range(1, 3), Range(4, 5), Range(10, 11), Range(1, 6), Range(2, 8)))
//List[Range] = List(Range(1,8), Range(10,11))

I ran into this need for Advent of Code 2022, Day 15, where I needed to merge a list of inclusive ranges. I had to slightly modify the solution for inclusiveness:
import annotation.{tailrec => tco}
@tco final def collapse(rs: List[Range], sep: List[Range] = Nil): List[Range] = rs match {
case x :: y :: rest =>
if (y.start - 1 > x.end) collapse(y :: rest, x :: sep)
else collapse(Range.inclusive(x.start, x.end max y.end) :: rest, sep)
case _ =>
(rs ::: sep).reverse
}