Group by margin - scala

I have a sequence of Int numbers:
val numbers = Seq(5, 3, 4, 1)
I need to group them according to their difference. The difference has to be less than or equal to a certain threshold; let it be 2 for this example. So the possible groups would be:
(5, 3, 4) (1)
(1, 3) (5, 4)
I don't really care which of these constellations of groups I get. Each element may be used only once. I also need to retain the index, so before grouping I would need a zipWithIndex.
Is there a clever way to do such grouping?

Ok then. Idea of the algorithm:
Take the next element in numbers. Check whether it belongs to a previously created group. If it does, add it to that group. If not, add a new group with the element.
I use IndexedSeq because I want indexing to be O(1).
It is kinda long, but I can't think of something better at the moment. I hope I understood you correctly with your idea of "difference".
val numbers = Seq(5, 3, 4, 1)
def group(seq: Seq[Int], threshold: Int) = seq.zipWithIndex.foldLeft(IndexedSeq.empty[IndexedSeq[(Int,Int)]])((result, elem) => {
(0 until result.size).find(
i => result(i).forall(num => (num._1 - elem._1).abs <= threshold)).map(
i => result.updated(i, result(i) :+ elem))
.getOrElse(result :+ IndexedSeq(elem))
})
println(group(numbers, 2)) //result Vector(Vector((5,0), (3,1), (4,2)), Vector((1,3)))
Edit: I forgot you wanted zipWithIndex.

Since you're working with indices of elements anyway, you may not care about working with indices of the groups as well, in which case Kigyo's answer is probably the right one.
One of the nice things about functional programming is that it can often free you from working with indices, though, so for the sake of completeness, here's an implementation using span that doesn't need to track the indices of groups (first for the simple form without element indices):
val numbers = Seq(5, 3, 4, 1)
numbers.foldLeft(List.empty[List[Int]]) {
case (acc, x) => acc.span(_.exists(y => math.abs(x - y) > 2)) match {
case (bad, picked :: rest) => (x :: picked) :: rest ::: bad
case (bad, _) => List(x) :: bad
}
}
If you haven't already zipWithIndex-ed numbers, you can also take care of that during the fold without too much extra fuss:
val numbers = Seq(5, 3, 4, 1)
numbers.foldLeft(List.empty[List[(Int, Int)]], 0) {
case ((acc, i), x) => acc.span(_.exists(y => math.abs(x - y._1) > 2)) match {
case (bad, picked :: rest) => (((x, i) :: picked) :: rest ::: bad, i + 1)
case (bad, _) => (List((x, i)) :: bad, i + 1)
}
}._1
This returns List(List((1, 3)), List((4, 2), (3, 1), (5, 0))) as expected, and saves you an iteration through the sequence with very little extra verbosity.
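If the order in which groups are formed doesn't matter, another option is to sort first and then cut a new group whenever the current value is more than the threshold above the smallest value of the group being built. This is only a sketch of that idea, not something from the answers above; the names GroupByMargin and groupSorted are made up for illustration.

```scala
object GroupByMargin {
  // Hypothetical helper: sorts (value, index) pairs by value, then starts a new
  // group whenever the value exceeds the group's smallest value by more than
  // `threshold`. Groups are built by prepending, so the smallest value of the
  // current group is its last element.
  def groupSorted(numbers: Seq[Int], threshold: Int): List[List[(Int, Int)]] =
    numbers.zipWithIndex.sortBy(_._1).foldLeft(List.empty[List[(Int, Int)]]) {
      case (current :: done, elem) if elem._1 - current.last._1 <= threshold =>
        (elem :: current) :: done
      case (groups, elem) =>
        List(elem) :: groups
    }.map(_.reverse).reverse
}

// GroupByMargin.groupSorted(Seq(5, 3, 4, 1), 2)
// List(List((1,3), (3,1)), List((4,2), (5,0)))
```

Because the input is sorted, every pair inside a group differs by at most the threshold, and the original indices survive via zipWithIndex.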

Related

How do you write a function to divide the input list into three sublists?

Write a function to divide the input list into three sublists.
The first sub-list is to include all the elements whose indexes satisfy the equation i mod 3 = 1.
The second sub-list is to include all the elements whose indexes satisfy the equation i mod 3 = 2.
The third sub-list is to contain the remaining elements.
The order of the elements must be maintained. Return the result as three lists.
Write a function using tail and non-tail recursion.
My attempt: I'm very confused about how to increase the index so it can go through the list. Any recommendations on how to make it recursive while increasing the index each time?
def divide(list: List[Int]): (List[Int], List[Int], List[Int]) = {
var index:Int =0
def splitList(remaining: List[Int], firstSubList: List[Int], secondSubList: List[Int], thirdSubList: List[Int], index:Int): (List[Int], List[Int], List[Int]) = {
if(remaining.isEmpty) {
return (List[Int](), List[Int](), List[Int]())
}
val splitted = splitList(remaining.tail, firstSubList, secondSubList, thirdSubList, index)
val firstList = if (index % 3 == 1) List() ::: splitted._1 else splitted._1
val secondList = if (index % 3 == 2) List() ::: splitted._2 else splitted._2
val thirdList = if((index% 3 != 1) && (index % 3 != 2)) List() ::: splitted._3 else splitted._3
index +1
(firstSubList ::: firstList, secondSubList ::: secondList, thirdSubList ::: thirdList)
}
splitList(list, List(), List(), List(), index+1)
}
println(divide(List(0,11,22,33)))
Generalizing the requirement a little, here's one approach using a simple recursive function to compose a Map of Lists by modulo n of the original list indexes:
def splitList[T](list: List[T], n: Int): Map[Int, List[T]] = {
@scala.annotation.tailrec
def loop(zls: List[(T, Int)], lsMap: Map[Int, List[T]]): Map[Int, List[T]] =
zls match {
case Nil =>
lsMap.map{ case (i, ls) => (i, ls.reverse) }
case (x, i) :: rest =>
val j = i % n
loop(rest, lsMap + (j -> (x :: lsMap.getOrElse(j, Nil))))
}
loop(list.zipWithIndex, Map.empty[Int, List[T]])
}
splitList(List(0, 11, 22, 33, 44, 55, 66), 3)
// Map(0 -> List(0, 33, 66), 1 -> List(11, 44), 2 -> List(22, 55))
splitList(List('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'), 4)
// Map(0 -> List(a, e, i), 1 -> List(b, f), 2 -> List(c, g), 3 -> List(d, h))
To do this in real life, just label each value with its index and then group by that index modulo 3:
def divide[T](list: List[T]) = {
val g = list.zipWithIndex.groupMap(_._2 % 3)(_._1)
(g.getOrElse(1, Nil), g.getOrElse(2, Nil), g.getOrElse(0, Nil))
}
If you insist on a recursive version, it might look like this:
def divide[T](list: List[T]) = {
def loop(rem: List[T]): (List[T], List[T], List[T]) =
rem match {
case a::b::c::tail =>
val rem = loop(tail)
(b +: rem._1, c +: rem._2, a +: rem._3)
case a::b::Nil =>
(List(b), Nil, List(a))
case a::Nil =>
(Nil, Nil, List(a))
case Nil =>
(Nil, Nil, Nil)
}
loop(list)
}
Tail recursion would look like this:
def divide[T](list: List[T]) = {
@annotation.tailrec
def loop(rem: List[T], res: (List[T], List[T], List[T])): (List[T], List[T], List[T]) =
rem match {
case a::b::c::tail =>
loop(tail, (res._1 :+ b, res._2 :+ c, res._3 :+ a))
case a::b::Nil =>
(res._1 :+ b, res._2, res._3 :+ a)
case a::Nil =>
(res._1, res._2, res._3 :+ a)
case Nil =>
res
}
loop(list, (Nil, Nil, Nil))
}
And if you care about efficiency, this version would build the lists in the other order and reverse them when returning the result.
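For what it's worth, the "build in the other order and reverse" idea could look roughly like this. It is only a sketch: the object name DivideReversed and the accumulator names r1, r2, r0 are made up here, and the case structure is copied from the tail-recursive version above.

```scala
import scala.annotation.tailrec

object DivideReversed {
  def divide[T](list: List[T]): (List[T], List[T], List[T]) = {
    // r1, r2, r0 collect the elements with index % 3 == 1, 2, 0 in reverse
    // order, so every step is an O(1) prepend instead of an O(n) append.
    @tailrec
    def loop(rem: List[T], r1: List[T], r2: List[T], r0: List[T]): (List[T], List[T], List[T]) =
      rem match {
        case a :: b :: c :: tail =>
          loop(tail, b :: r1, c :: r2, a :: r0)
        case a :: b :: Nil =>
          ((b :: r1).reverse, r2.reverse, (a :: r0).reverse)
        case a :: Nil =>
          (r1.reverse, r2.reverse, (a :: r0).reverse)
        case Nil =>
          (r1.reverse, r2.reverse, r0.reverse)
      }
    loop(list, Nil, Nil, Nil)
  }
}

// DivideReversed.divide(List(0, 11, 22, 33))
// (List(11), List(22), List(0, 33))
```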
Your problem is that you put index+1 into a wrong place. Try swapping it around: put index+1 into the call where you have index, and index into the other one. Also remove the "standalone" index+1 statement in the middle, it doesn't do anything anyway.
That should make your code work ... but it is still not very good. A couple of problems with it (besides it being badly structured, non-idiomatic, and hard to read, which is kinda subjective):
It is not tail-recursive and effectively creates another copy of the entire list on the stack. This may be problematic when the list is long.
It concatenates (potentially long) lists. This is a bad idea. List in Scala is a singly linked list: you have its head readily available, but to get to the end you need to spend O(N) cycles iterating through each node. Thus things like foo:::bar in an iterative function instantly make any algorithm (at least) quadratic.
The usual "trick" to avoid the last problem is prepending elements to the output one by one, and then reversing the result at the end. The first one can be avoided with tail recursion. The "non-idiomatic" and "hard to read" problems are mostly addressed by using a match expression in this case:
def split3(
in: List[Int],
one: List[Int],
two: List[Int],
three: List[Int],
index: Int = 0
): (List[Int], List[Int], List[Int]) = (in, index % 3) match {
case (Nil, _) => (one.reverse, two.reverse, three.reverse)
case (head::tail, 1) => split3(tail, head::one, two, three, index+1)
case (head::tail, 2) => split3(tail, one, head::two, three, index+1)
case (head::tail, _) => split3(tail, one, two, head::three, index+1)
}
Now, this is a fine solution, albeit a little repetitive to my demanding eye ... But if you want to be clever and really unleash the full power of the Scala standard library, forget recursion; you don't really need it in this case.
If you knew that number of elements in the list was always divisible by 3,
you could just do list.grouped(3).toSeq.transpose: break the list into groups of three (each group will have index%3=0 as first element, index%3=1 as second, index%3=2 as the third), and then transpose will turn a list of lists of 3 into a list of three lists where the first one contains all the first elements, the second - all the seconds etc. (I know, you wanted them in a different order, but that's trivial). If you are having trouble understanding what I am talking about, just try running it on some lists, and look at the results.
This would be a really elegant solution ... if it worked :/ The problem is that it only works when you have 3*n elements in the original list. If not, transpose will fail on the last group, because it doesn't have 3 elements like all the others. Can we fix it? Well ... that's where the cleverness comes in.
val (triplets, tails) = list.grouped(3).toSeq.partition(_.size == 3)
triplets
.transpose
.padTo(3, Nil)
.zip(tails.flatten.map(Seq(_)).padTo(3, Nil))
.map { case (head, tail) => head ++ tail }
Basically, it is doing the same thing as the one-liner I described above (break into groups of 3 and transpose), but adds special handling for the case when the last group has less than three elements - it splits it out and pads with required number of empty lists, then just appends the result to transposed triplets.
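To try it out, the snippet can be wrapped in a function that also puts the three lists into the order the question asks for. This is a sketch under my own naming (TransposeSplit, divide and columns are not from the answer); the pipeline itself is the one shown above.

```scala
object TransposeSplit {
  def divide[T](list: List[T]): (List[T], List[T], List[T]) = {
    // Full groups of three go to `triplets`; a shorter last group (if any) goes to `tails`.
    val (triplets, tails) = list.grouped(3).toSeq.partition(_.size == 3)
    val columns = triplets
      .transpose
      .padTo(3, Nil)
      .zip(tails.flatten.map(Seq(_)).padTo(3, Nil))
      .map { case (head, tail) => (head ++ tail).toList }
    // columns(k) holds the elements whose index satisfies i % 3 == k.
    (columns(1), columns(2), columns(0))
  }
}

// TransposeSplit.divide(List(0, 11, 22, 33))
// (List(11), List(22), List(0, 33))
```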

Subsequence in a sequence of numbers in scala

For example, I have:
List(1,3,2,5)
How do I get all of these:
List(1), List(3), List(2), List(5), List(1,3), List(1,2), List(1,5), List(3,2), List(3,5), List(2,5), List(1,3,2), List(1,3,5), List(1,2,5), List(3,2,5), List(1,3,2,5)
I need this for the Longest Increasing Subsequence problem, and the answer for this example would be: (1,3,5)
And I want to use it for bigger Lists.
You want all the combinations() from 1 to array length.
val arr = Array(4, 3, 1)
arr.indices.flatMap(x => arr.combinations(x + 1))
//res0: Seq[Array[Int]] = Vector(Array(4), Array(3), Array(1), Array(4, 3), Array(4, 1), Array(3, 1), Array(4, 3, 1))
update
This will give you all possible combinations, retaining original order and duplicate elements.
def subseqs[A](seq :Seq[A]) :List[Seq[A]] = seq match {
case hd +: tl =>
val res = subseqs(tl)
Seq(hd) :: res ++ res.map(hd +: _)
case Seq() => Nil
}
The result is a List of 2^n - 1 possible sub-sequences. So for a collection of 8 elements you'll get 255 sub-sequences.
This, of course, is going to be way too tedious and inefficient for your purposes. Generating all possible sub-sequences in order to find the Longest Increasing is a little like washing all the clothes in your neighborhood so you'll have clean socks in the morning.
Much better to generate only the increasing sub-sequences and find the longest from that set (9 lines of code).
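One way to read "generate only the increasing sub-sequences" is a fold that extends every increasing sub-sequence found so far with the next element when that keeps it increasing. The following is my own sketch of that reading, not the code the answer refers to; IncreasingSubseqs, increasing and longestIncreasing are made-up names. It can still be exponential for mostly sorted input.

```scala
object IncreasingSubseqs {
  // All non-empty strictly increasing sub-sequences. Each one is stored in
  // reverse (last element first) so comparing and extending are O(1).
  def increasing(seq: Seq[Int]): List[List[Int]] =
    seq.foldLeft(List.empty[List[Int]]) { (acc, x) =>
      val extended = acc.collect { case sub @ (last :: _) if last < x => x :: sub }
      List(x) :: extended ::: acc
    }

  // Picks a longest one (the first found on ties) and restores its order.
  def longestIncreasing(seq: Seq[Int]): List[Int] =
    increasing(seq)
      .reduceOption((a, b) => if (b.length > a.length) b else a)
      .map(_.reverse)
      .getOrElse(Nil)
}

// IncreasingSubseqs.longestIncreasing(List(1, 3, 2, 5))
// List(1, 2, 5)
```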
[ Update in response to updated question ]
[ Thanks to @jwvh for spotting error in original version ]
This method will generate all possible sub-sequences of a List:
def subsequences[T](list: List[T]): List[List[T]] =
list match {
case Nil =>
List(List())
case hd :: tl =>
val subs = subsequences(tl)
subs.map(hd +: _) ++ subs
}
Note that this is not efficient for a number of reasons, so it is not a good way to solve the "longest increasing subsequence problem".
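For comparison, the usual way to avoid enumerating sub-sequences is the textbook O(n^2) dynamic programming approach: for each position, keep the longest increasing sub-sequence that ends there. This is a sketch with made-up names (Lis, longestIncreasing, best), not part of the original answer.

```scala
object Lis {
  def longestIncreasing(xs: Vector[Int]): List[Int] = {
    // best(i) is a longest increasing sub-sequence ending at xs(i),
    // stored in reverse so its head is xs(i).
    val best = xs.indices.foldLeft(Vector.empty[List[Int]]) { (acc, i) =>
      val candidates = acc.filter(_.head < xs(i))
      val longestPrev = if (candidates.isEmpty) Nil else candidates.maxBy(_.length)
      acc :+ (xs(i) :: longestPrev)
    }
    if (best.isEmpty) Nil else best.maxBy(_.length).reverse
  }
}

// Lis.longestIncreasing(Vector(4, 1, 2, 3))
// List(1, 2, 3)
```

For the question's List(1,3,2,5) this yields a sub-sequence of length 3; which of the tied candidates comes back depends on how maxBy breaks ties.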
[ Original answer ]
This function will generate all the non-empty contiguous sub-sequences of any sequence:
def subsequences[T](seq: Seq[T]) =
seq.tails.flatMap(_.inits).filter(_.nonEmpty)
This returns an Iterator so it creates each sub-sequence in turn which reduces memory usage.
Note that this will generate all the contiguous sub-sequences and will preserve the order of the values, unlike solutions using combinations or Set.
You can use this in your "longest increasing subsequence problem" like this:
def isAscending(seq: Seq[Int]): Boolean =
seq.length <= 1 || seq.sliding(2).forall(x => x(0) < x(1))
subsequences(a).filter(isAscending).maxBy(_.length)
The result will be the longest run of consecutive ascending values in the input a.

Scala check value inside mapping

All right, so I don't know if this is possible, but let's say we have the following list:
List(1, 2, 3, 1)
If I want to apply a map over this, is there a way for me to check whether I've already seen a value before? E.g. on the 4th value (the 2nd 1) it would notice that it already came across 1 and then throw an error or something.
This would be the role of a foldLeft stage:
List(1, 2, 3, 1).foldLeft(List[Int]()) {
// The item has already been encountered:
case (uniqueItems, b) if uniqueItems.contains(b) => {
// If as stated, you want to throw an exception, that's where you could do it
uniqueItems
}
// New item not seen yet:
case (uniqueItems, b) => uniqueItems :+ b
}
foldLeft traverses a sequence and, at each new element, computes a result based on the result for the previous ones.
For each element, the pattern (uniqueItems, b) should be understood this way: uniqueItems is the "accumulator" (it's initialized as List[Int]()) and will be updated (or not) for each item of the list, and b is the new item of the list which is currently being processed.
By the way, this example is an (inefficient) distinct over a list.
List(1, 2, 3, 1).distinct.map (n => n*n)
// res163: List[Int] = List(1, 4, 9)
This code removes duplicates, then performs the mappings in a self documenting, brief manner.
fold is probably the way to go. The problem is that each iteration has to carry both the memory of previous elements as well as the map() results as it is being built.
List(1, 2, 3, 11).foldRight((Set[Int](),List[String]())) {case (i, (st, lst)) =>
if (st(i)) throw new Error //duplicate encountered
else (st + i, i.toString :: lst) //add to memory and map result
}._2 //pull the map result from the tuple
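If throwing is not desirable, the same "carry the seen values along with the mapped results" idea can report the first duplicate as a value instead. This is only a sketch; MapWithDuplicateCheck and mapUnique are hypothetical names, and unlike the foldRight above it walks the list from the left, so it reports the second occurrence of a repeated value.

```scala
object MapWithDuplicateCheck {
  // Maps each element with `f`, stopping at the first repeated input value.
  // Left(x) carries the duplicate; Right(list) carries the mapped results.
  def mapUnique[A, B](xs: List[A])(f: A => B): Either[A, List[B]] = {
    @annotation.tailrec
    def loop(rest: List[A], seen: Set[A], out: List[B]): Either[A, List[B]] =
      rest match {
        case Nil                => Right(out.reverse)
        case x :: _ if seen(x)  => Left(x)
        case x :: tail          => loop(tail, seen + x, f(x) :: out)
      }
    loop(xs, Set.empty, Nil)
  }
}

// MapWithDuplicateCheck.mapUnique(List(1, 2, 3, 1))(n => n * n)  // Left(1)
// MapWithDuplicateCheck.mapUnique(List(1, 2, 3))(n => n * n)     // Right(List(1, 4, 9))
```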

Kadane's Algorithm in Scala

Does anyone have a Scala implementation of Kadane's algorithm done in a functional style?
Edit Note: The definition on the link has changed in a way that invalidated answers to this question -- which goes to show why questions (and answers) should be self-contained instead of relying on external links. Here's the original definition:
In computer science, the maximum subarray problem is the task of finding the contiguous subarray within a one-dimensional array of numbers (containing at least one positive number) which has the largest sum. For example, for the sequence of values −2, 1, −3, 4, −1, 2, 1, −5, 4; the contiguous subarray with the largest sum is 4, −1, 2, 1, with sum 6.
What about this, if an empty subarray is allowed or the input array cannot be all negative:
numbers.scanLeft(0)((acc, n) => math.max(0, acc + n)).max
Or, failing the conditions above, this (which assumes the input is non-empty):
numbers.tail.scanLeft(numbers.head)((acc, n) => (acc + n).max(n)).max
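To see the difference between the two, here they are wrapped in functions and applied to the sequence from the question and to an all-negative input. The wrapper names (KadaneScan, maxSubarrayAllowEmpty, maxSubarrayNonEmpty) are made up for this sketch; the bodies are the two one-liners above.

```scala
object KadaneScan {
  // Allows the empty subarray, so the result is never below 0.
  def maxSubarrayAllowEmpty(numbers: Seq[Int]): Int =
    numbers.scanLeft(0)((acc, n) => math.max(0, acc + n)).max

  // Requires a non-empty input; also handles all-negative input.
  def maxSubarrayNonEmpty(numbers: Seq[Int]): Int =
    numbers.tail.scanLeft(numbers.head)((acc, n) => (acc + n).max(n)).max
}

// val example = Seq(-2, 1, -3, 4, -1, 2, 1, -5, 4)
// KadaneScan.maxSubarrayAllowEmpty(example)         // 6
// KadaneScan.maxSubarrayNonEmpty(example)           // 6
// KadaneScan.maxSubarrayAllowEmpty(Seq(-3, -1, -2)) // 0
// KadaneScan.maxSubarrayNonEmpty(Seq(-3, -1, -2))   // -1
```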
I prefer the folding solution to the scan solution -- though there's certainly elegance to the latter. Anyway,
numbers.foldLeft(0 -> 0) {
case ((maxUpToHere, maxSoFar), n) =>
val maxEndingHere = 0 max maxUpToHere + n
maxEndingHere -> (maxEndingHere max maxSoFar)
}._2
The following code returns the start and end index as well as the sum:
import scala.math.Numeric.Implicits.infixNumericOps
import scala.math.Ordering.Implicits.infixOrderingOps
case class Sub[T: Numeric](start: Int, end: Int, sum: T)
def maxSubSeq[T](arr: collection.IndexedSeq[T])(implicit n: Numeric[T]) =
arr
.view
.zipWithIndex
.scanLeft(Sub(-1, -1, n.zero)) {
case (p, (x, i)) if p.sum > n.zero => Sub(p.start, i, p.sum + x)
case (_, (x, i)) => Sub(i, i, x)
}
.drop(1)
.maxByOption(_.sum)

How do I populate a list of objects with new values

Apologies: I'm well noob
I have an items class
class item(ind:Int,freq:Int,gap:Int){}
I have an ordered list of ints
val listVar = a.toList
where a is an array
I want a list of items called metrics where
ind is the (unique) integer
freq is the number of times that ind appears in list
gap is the minimum gap between ind and the number in the list before it
so far I have:
def metrics = for {
n <- 0 until 255
listVar filter (x == n) count > 0
}
yield new item(n, (listVar filter == n).count,0)
It's crap and I know it - any clues?
Well, some of it is easy:
val freqMap = listVar groupBy identity mapValues (_.size)
This gives you ind and freq. To get gap I'd use a fold:
val gapMap = listVar.sliding(2).foldLeft(Map[Int, Int]()) {
case (map, List(prev, ind)) =>
map + (ind -> (map.getOrElse(ind, Int.MaxValue) min ind - prev))
}
Now you just need to unify them:
freqMap.keys.map( k => new item(k, freqMap(k), gapMap.getOrElse(k, 0)) )
Ideally you want to traverse the list only once and in the course for each different Int, you want to increment a counter (the frequency) as well as keep track of the minimum gap.
You can use a case class to store the frequency and the minimum gap, the value stored will be immutable. Note that minGap may not be defined.
case class Metric(frequency: Int, minGap: Option[Int])
In the general case you can use a Map[Int, Metric] to lookup the Metric immutable object. Looking for the minimum gap is the harder part. To look for gap, you can use the sliding(2) method. It will traverse the list with a sliding window of size two allowing to compare each Int to its previous value so that you can compute the gap.
Finally you need to accumulate and update the information as you traverse the list. This can be done by folding each element of the list into your temporary result until you traverse the whole list and get the complete result.
Putting things together:
listVar.sliding(2).foldLeft(
Map[Int, Metric]().withDefaultValue(Metric(0, None))
) {
case (map, List(a, b)) =>
val metric = map(b)
val newGap = metric.minGap match {
case None => math.abs(b - a)
case Some(gap) => math.min(gap, math.abs(b - a))
}
val newMetric = Metric(metric.frequency + 1, Some(newGap))
map + (b -> newMetric)
case (map, List(a)) =>
map + (a -> Metric(1, None))
case (map, _) =>
map
}
Result for listVar: List[Int] = List(2, 2, 4, 4, 0, 2, 2, 2, 4, 4)
scala.collection.immutable.Map[Int,Metric] = Map(2 -> Metric(4,Some(0)),
4 -> Metric(4,Some(0)), 0 -> Metric(1,Some(4)))
You can then turn the result into your desired item class using map.toSeq.map { case (i, m) => new item(i, m.frequency, m.minGap.getOrElse(-1)) }, where map is the result of the fold above.
You can also create your item objects directly in the process, but I thought the code would be harder to read.
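One thing to watch with the sliding(2) fold: it only counts the second element of each window, so the very first element of the list is not counted as an occurrence (2 appears five times in the example but is reported with frequency 4). If you want it counted, a possible variation is to pair each element with an optional predecessor. This is a sketch under that assumption; the Metrics object, the metrics function and withPrev are names I made up.

```scala
object Metrics {
  final case class Metric(frequency: Int, minGap: Option[Int])

  def metrics(xs: List[Int]): Map[Int, Metric] = {
    // Pair every element with its predecessor; the first element has none.
    val withPrev = xs.zip(None :: xs.map(Option(_)))
    withPrev.foldLeft(Map.empty[Int, Metric]) { case (acc, (x, prev)) =>
      val old = acc.getOrElse(x, Metric(0, None))
      val gap = prev.map(p => math.abs(x - p))
      // Smallest of the previously known gap and the new one, if either exists.
      val minGap = (old.minGap.toList ++ gap).reduceOption(_ min _)
      acc + (x -> Metric(old.frequency + 1, minGap))
    }
  }
}

// Metrics.metrics(List(2, 2, 4, 4, 0, 2, 2, 2, 4, 4))
// Map(2 -> Metric(5,Some(0)), 4 -> Metric(4,Some(0)), 0 -> Metric(1,Some(4)))
```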