Looking for an FP ranking implementation which handles ties (i.e. equal values) - scala

Starting from a sorted sequence of values, my goal is to assign a rank to each value, using identical ranks for equal values (aka ties):
Input: Vector(1, 1, 3, 3, 3, 5, 6)
Output: Vector((0,1), (0,1), (1,3), (1,3), (1,3), (2,5), (3,6))
A few type aliases for readability:
type Rank = Int
type Value = Int
type RankValuePair = (Rank, Value)
An imperative implementation using a mutable rank variable could look like this:
var rank = 0
val ranked1: Vector[RankValuePair] = for ((value, index) <- values.zipWithIndex) yield {
if ((index > 0) && (values(index - 1) != value)) rank += 1
(rank, value)
}
// ranked1: Vector((0,1), (0,1), (1,3), (1,3), (1,3), (2,5), (3,6))
To hone my FP skills, I was trying to come up with a functional implementation:
val ranked2: Vector[RankValuePair] = values.sliding(2).foldLeft((0 , Vector.empty[RankValuePair])) {
case ((rank: Rank, rankedValues: Vector[RankValuePair]), Vector(currentValue, nextValue)) =>
val newRank = if (nextValue > currentValue) rank + 1 else rank
val newRankedValues = rankedValues :+ (rank, currentValue)
(newRank, newRankedValues)
}._2
// ranked2: Vector((0,1), (0,1), (1,3), (1,3), (1,3), (2,5))
It is less readable, and – more importantly – is missing the last value (due to using sliding(2) on an odd number of values).
How could this be fixed and improved?

This works well for me:
// scala
val vs = Vector(1, 1, 3, 3, 3, 5, 6)
val rank = vs.distinct.zipWithIndex.toMap
val result = vs.map(i => (rank(i), i))
The same in Java 8 using Javaslang:
// java(slang)
Vector<Integer> vs = Vector(1, 1, 3, 3, 3, 5, 6);
Function<Integer, Integer> rank = vs.distinct().zipWithIndex().toMap(t -> t);
Vector<Tuple2<Integer, Integer>> result = vs.map(i -> Tuple(rank.apply(i), i));
The output of both variants is
Vector((0, 1), (0, 1), (1, 3), (1, 3), (1, 3), (2, 5), (3, 6))
*) Disclosure: I'm the creator of Javaslang

This is nice and concise but it assumes that your Values don't go negative. (Actually it just assumes that they can never start with -1.)
val vs: Vector[Value] = Vector(1, 1, 3, 3, 3, 5, 6)
val rvps: Vector[RankValuePair] =
vs.scanLeft((-1,-1)){ case ((r,p), v) =>
if (p == v) (r, v) else (r + 1, v)
}.tail
edit
Modification that makes no assumptions, as suggested by @Kolmar.
vs.scanLeft((0,vs.headOption.getOrElse(0))){ case ((r,p), v) =>
if (p == v) (r, v) else (r + 1, v)
}.tail

Here's an approach with recursion, pattern matching and guards.
The interesting part is where the head and head of the tail (h and ht respectively) are de-constructed from the list and an if checks if they are equal. The logic for each case adjusts the rank and proceeds on the remaining part of the list.
def rank(xs: Vector[Value]): List[RankValuePair] = {
def rankR(xs: List[Value], acc: List[RankValuePair], rank: Rank): List[RankValuePair] = xs match{
case Nil => acc.reverse
case h :: Nil => rankR(Nil, (rank, h) :: acc, rank)
case h :: ht :: t if (h == ht) => rankR(xs.tail, (rank, h) :: acc, rank)
case h :: ht :: t if (h != ht) => rankR(xs.tail, (rank, h) :: acc, rank + 1)
}
rankR(xs.toList, List[RankValuePair](), 0)
}
Output:
scala> rank(xs)
res14: List[RankValuePair] = List((0,1), (0,1), (1,3), (1,3), (1,3), (2,5), (3,6))

This is a modification of the solution by @jwvh that doesn't make any assumptions about the values:
val vs = Vector(1, 1, 3, 3, 3, 5, 6)
vs.sliding(2).scanLeft(0, vs.head) {
case ((rank, _), Seq(a, b)) => (if (a != b) rank + 1 else rank, b)
}.toVector
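Note that it throws if vs is empty (vs.head fails), so check for empty input beforehand: if (vs.isEmpty) Vector.empty else ... (using vs.headOption getOrElse 0 avoids the exception, but then an empty input produces a spurious (0,0) element). It also fails with a MatchError for a single-element vs, because sliding(2) then yields one window of size 1, which Seq(a, b) does not match.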
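A minimal sketch of a variant that tolerates empty and single-element input, assuming the input is already sorted. The name rankSorted is made up for illustration; it scans over the tail instead of over sliding windows, so no window pattern can fail:

```scala
// Hypothetical helper: ranks an already-sorted vector, giving equal values equal ranks.
def rankSorted(vs: Vector[Int]): Vector[(Int, Int)] =
  vs.headOption match {
    case None => Vector.empty // nothing to rank
    case Some(first) =>
      // Seed with the first value at rank 0, then bump the rank whenever the value changes.
      vs.tail.scanLeft((0, first)) { case ((rank, prev), v) =>
        (if (v != prev) rank + 1 else rank, v)
      }
  }
```

For Vector(1, 1, 3, 3, 3, 5, 6) this should give the same seven pairs as the other answers.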

import scala.annotation.tailrec
type Rank = Int
// defined type alias Rank
type Value = Int
// defined type alias Value
type RankValuePair = (Rank, Value)
// defined type alias RankValuePair
def rankThem(values: List[Value]): List[RankValuePair] = {
// Assumes that the "values" are sorted
@tailrec
def _rankThem(currentRank: Rank, currentValue: Value, ranked: List[RankValuePair], values: List[Value]): List[RankValuePair] = values match {
case value :: tail if value == currentValue => _rankThem(currentRank, value, (currentRank, value) +: ranked, tail)
case value :: tail if value > currentValue => _rankThem(currentRank + 1, value, (currentRank + 1, value) +: ranked, tail)
case Nil => ranked.reverse
}
_rankThem(0, Int.MinValue, List.empty[RankValuePair], values.sorted)
}
// rankThem: rankThem[](val values: List[Value]) => List[RankValuePair]
val valueList = List(1, 1, 3, 3, 5, 6)
// valueList: List[Int] = List(1, 1, 3, 3, 5, 6)
val rankValueList = rankThem(valueList)
// rankValueList: List[RankValuePair] = List((1,1), (1,1), (2,3), (2,3), (3,5), (4,6))

val list = List(1, 1, 3, 3, 5, 6)
val result = list
.groupBy(identity)
.mapValues(_.size)
.toArray
.sortBy(_._1)
.zipWithIndex
.flatMap(tuple => List.fill(tuple._1._2)((tuple._2, tuple._1._1)))
result: Array[(Int, Int)] = Array((0,1), (0,1), (1,3), (1,3), (2,5), (3,6))
The idea is using groupBy to find identical elements and find their occurrences and then sort and then flatMap. Time complexity I would say is O(nlogn), groupBy is O(n), sort is O(nlogn), fl

Related

Reduce sequence by parts

I have a sequence Seq[T] and I want to do partial reduce. For example for a Seq[Int] I want to get Seq[Int] consisting of the longest partial sums of monotonic regions. For example:
val s = Seq(1, 2, 4, 3, 2, -1, 0, 6, 8)
groupMonotionic(s) = Seq(1 + 2 + 4, 3 + 2 + (-1), 0 + 6 + 8)
I was looking for a method like a conditional fold with the signature fold(z: B)((B, T) => B, (T, T) => Boolean), where the predicate states where to terminate the current sum aggregation, but there seems to be nothing like that in the subtrait hierarchy of Seq.
What would be a solution using Scala Collection API and without using mutable variables?
Here is one way amongst many to do this (using Scala 2.13's List#unfold):
// val items = Seq(1, 2, 4, 3, 2, -1, 0, 6, 8)
items match {
case first :: _ :: _ => // If there are at least 2 items
List
.unfold(items.sliding(2).toList) { // We slid items to work on pairs of consecutive items
case Nil => // No more items to unfold
None // None signifies the end of the unfold
case rest @ Seq(a, b) :: _ => // We span based on the sign of a-b
Some(rest.span(x => (x.head - x.last).signum == (a-b).signum))
}
.map(_.map(_.last)) // back from slided pairs
match { case head :: rest => (first :: head) :: rest }
case _ => // If there is 0 or 1 item
items.map(List(_))
}
// List(List(1, 2, 4), List(3, 2, -1), List(0, 6, 8))
List.unfold iterates as long as the unfolding function returns Some. It starts from an initial state, here the list of consecutive pairs still to be unfolded. At each iteration, we span the state (the remaining pairs) based on the sign of the difference of the first pair. The unfolded element is the leading run of pairs sharing the same monotonicity, and the new state is the remaining pairs.
List#span splits a list into a tuple whose first part is the longest prefix of elements satisfying the predicate and whose second part is the rest of the elements. This fits the return type expected from List.unfold's unfolding function, Option[(A, S)] (in this case both A and S are lists of sliding pairs).
Int.signum returns -1, 0 or 1 depending on the sign of the integer it is applied to.
Note that the first item has to be put back into the result, since it has no predecessor to determine its signum (match { case head :: rest => (first :: head) :: rest }).
To apply the reducing function (in this case a sum), we can map the final result: .map(_.sum)
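The three building blocks described above can be tried in isolation. This is only an illustrative sketch on a made-up list (xs and the other value names are not from the answer), and it needs Scala 2.13+ for List.unfold:

```scala
// Made-up sample data for illustration.
val xs = List(1, 1, 2, 2, 2, 3)

// span: longest prefix satisfying the predicate, plus the remainder.
val (prefix, remainder) = xs.span(_ == 1)

// unfold: keep splitting off the leading run of equal values until the state is empty.
val runs: List[List[Int]] = List.unfold(xs) {
  case Nil              => None                     // empty state ends the unfold
  case state @ (h :: _) => Some(state.span(_ == h)) // (emitted run, next state)
}

// math.signum maps an Int to -1, 0 or 1.
val signs = List(-7, 0, 42).map(x => math.signum(x))
```

Here runs should come out as the list split into runs of equal values, which is the same unfold-plus-span shape the answer uses on pairs.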
Works in Scala 2.13+ with cats
import scala.util.chaining._
import cats.data._
import cats.implicits._
val s = List(1, 2, 4, 3, 2, -1, 0, 6, 8)
def isLocalExtrema(a: List[Int]) =
a.max == a(1) || a.min == a(1)
implicit class ListOps[T](ls: List[T]) {
def multiSpanUntil(f: T => Boolean): List[List[T]] = ls.span(f) match {
case (h, Nil) => List(h)
case (h, t) => (h ::: t.take(1)) :: t.tail.multiSpanUntil(f)
}
}
def groupMonotionic(groups: List[Int]) = groups match {
case Nil => Nil
case x if x.length < 3 => List(groups.sum)
case _ =>
groups
.sliding(3).toList
.map(isLocalExtrema)
.pipe(false :: _ ::: List(false))
.zip(groups)
.multiSpanUntil(!_._1)
.pipe(Nested.apply)
.map(_._2)
.value
.map(_.sum)
}
println(groupMonotionic(s))
//List(7, 4, 14)
Here's one way using foldLeft to traverse the numeric list with a Tuple3 accumulator (listOfLists, prevElem, prevTrend) that stores the previous element and previous trend to conditionally assemble a list of lists in the current iteration:
val list = List(1, 2, 4, 3, 2, -1, 0, 6, 8)
val isUpward = (a: Int, b: Int) => a < b
val initTrend = isUpward(list.head, list.tail.head)
val monotonicLists = list.foldLeft( (List[List[Int]](), list.head, initTrend) ){
case ((lol, prev, prevTrend), curr) =>
val currTrend = isUpward(curr, prev)
if (currTrend == prevTrend)
((curr :: lol.head) :: lol.tail , curr, currTrend)
else
(List(curr) :: lol , curr, currTrend)
}._1.reverse.map(_.reverse)
// monotonicLists: List[List[Int]] = List(List(1, 2, 4), List(3, 2, -1), List(0, 6, 8))
To sum the individual nested lists:
monotonicLists.map(_.sum)
// res1: List[Int] = List(7, 4, 14)

if condition for partial argument in map

I understand how to use if in map. For example, val result = list.map(x => if (x % 2 == 0) x * 2 else x / 2).
However, I want to use if for only part of the arguments.
val inputColumns = List(
List(1, 2, 3, 4, 5, 6), // first "column"
List(4, 6, 5, 7, 12, 15) // second "column"
)
inputColumns.zipWithIndex.map{ case (col, idx) => if (idx == 0) col * 2 else col / 10}
<console>:1: error: ';' expected but integer literal found.
inputColumns.zipWithIndex
res4: List[(List[Int], Int)] = List((List(1, 2, 3, 4, 5, 6),0), (List(4, 6, 5, 7, 12, 15),1))
I have searched the error info but have not found a solution.
Why is my code not 'legal' in Scala? Is there a better way to write it? Basically, I want to pattern match on one argument and then do something with the others.
To explain your problem another way, inputColumns has type List[List[Int]]. You can verify this in the Scala REPL:
$ scala
Welcome to Scala 2.12.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_161).
Type in expressions for evaluation. Or try :help.
scala> val inputColumns = List(
| List(1, 2, 3, 4, 5, 6), // first "column"
| List(4, 6, 5, 7, 12, 15) // second "column"
| )
inputColumns: List[List[Int]] = List(List(1, 2, 3, 4, 5, 6), List(4, 6, 5, 7, 12, 15))
Now, when you call .zipWithIndex on that list, you end up with a List[(List[Int], Int)] - that is, a list of tuples, in which the first element of each tuple is a List[Int] (the column) and the second is an Int (the index):
scala> inputColumns.zipWithIndex
res0: List[(List[Int], Int)] = List((List(1, 2, 3, 4, 5, 6),0), (List(4, 6, 5, 7, 12, 15),1))
Consequently, when you try to apply a map function to this list, col is a List[Int] and not an Int, and so col * 2 makes no sense - you're multiplying a List[Int] by 2. You then also try to divide the list by 10, obviously.
scala> inputColumns.zipWithIndex.map{ case(col, idx) => if(idx == 0) col * 2 else col / 10 }
<console>:13: error: value * is not a member of List[Int]
inputColumns.zipWithIndex.map{ case(col, idx) => if(idx == 0) col * 2 else col / 10 }
^
<console>:13: error: value / is not a member of List[Int]
inputColumns.zipWithIndex.map{ case(col, idx) => if(idx == 0) col * 2 else col / 10 }
^
In order to resolve this, it depends what you're trying to achieve. If you want a single list of integers, and then zip those so that each value has an associated index, you should call flatten on inputColumns before calling zipWithIndex. This will result in List[(Int, Int)], where the first value in the tuple is the column value, and the second is the index. Your map function will then work correctly without modification:
scala> inputColumns.flatten.zipWithIndex.map{ case(col, idx) => if(idx == 0) col * 2 else col / 10 }
res3: List[Int] = List(2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1)
Of course, you no longer have separate columns.
If you wish each value in each list to have an associated index, you need to firstly map inputColumns into two zipped lists, using inputColumns.map(_.zipWithIndex) to create a List[List[(Int, Int)]] - a list of a list of (Int, Int) tuples:
scala> inputColumns.map(_.zipWithIndex)
res4: List[List[(Int, Int)]] = List(List((1,0), (2,1), (3,2), (4,3), (5,4), (6,5)), List((4,0), (6,1), (5,2), (7,3), (12,4), (15,5)))
We can now apply your original map function to the result of the zipWithIndex operation:
scala> inputColumns.map(_.zipWithIndex.map { case (col, idx) => if(idx == 0) col * 2 else col / 10 })
res5: List[List[Int]] = List(List(2, 0, 0, 0, 0, 0), List(8, 0, 0, 0, 1, 1))
The result is another List[List[Int]] with each internal list being the results of your map operation on the original two input columns.
On the other hand, if idx is meant to be the index of the column, and not of each value, and you want to multiply all of the values in the first column by 2 and divide all of the values in the other columns by 10, then you need to change your original map function to map across each column, as follows:
scala> inputColumns.zipWithIndex.map {
| case (col, idx) => {
| if(idx == 0) col.map(_ * 2) // Multiply values in first column by 2
| else col.map(_ / 10) // Divide values in all other columns by 10
| }
| }
res5: List[List[Int]] = List(List(2, 4, 6, 8, 10, 12), List(0, 0, 0, 0, 1, 1))
Let me know if you require any further clarification...
UPDATE:
The use of case in map is a common Scala shorthand. If a higher-order function takes a single argument, something such as this:
def someHOF[A, B](x: A => B) = //...
and you call that function like this (with what Scala calls a pattern-matching anonymous function - a block consisting solely of a list of case clauses, often loosely referred to as a partial function):
someHOF {
case expr1 => //...
case expr2 => //...
...
}
then Scala treats it as a kind-of shorthand for:
someHOF {a =>
a match {
case expr1 => //...
case expr2 => //...
...
}
}
or, being slightly more terse,
someHOF {
_ match {
case expr1 => //...
case expr2 => //...
...
}
}
For a List, for example, you can use it with functions such as map, flatMap, filter, etc.
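As a small sketch of the same shorthand with other higher-order functions (the pairs list and value names are made up for illustration):

```scala
// Made-up sample data for illustration.
val pairs = List((1, "a"), (2, "b"), (3, "c"))

// filter: keep the pairs whose number is even.
val evens = pairs.filter { case (n, _) => n % 2 == 0 }

// flatMap: expand each pair into n copies of its string.
val repeated = pairs.flatMap { case (n, s) => List.fill(n)(s) }
```

In both calls the single case clause just destructures the tuple, exactly as it does with map.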
In the case of your map function, the sole argument is a tuple, and the sole case statement acts to break open the tuple and expose its contents. That is:
val l = List((1, 2), (3, 4), (5, 6))
l.map { case(a, b) => println(s"First is $a, second is $b") }
is equivalent to:
l.map {x =>
x match {
case (a, b) => println(s"First is $a, second is $b")
}
}
and both will output:
First is 1, second is 2
First is 3, second is 4
First is 5, second is 6
Note: This latter is a bit of a dumb example, since map is supposed to map (i.e. change) the values in the list into new values in a new list. If all you were doing was printing the values, this would be better:
val l = List((1, 2), (3, 4), (5, 6))
l.foreach { case(a, b) => println(s"First is $a, second is $b") }
You are trying to multiply a list by 2 when you do col * 2, since col is List(1, 2, 3, 4, 5, 6) when idx is 0. That is not possible, and the same applies to col / 10 in the else branch.
If you are trying to multiply the elements of the first list by 2 and divide the elements of the remaining lists by 10, then you should do the following:
inputColumns.zipWithIndex.map{ case (col, idx) => if (idx == 0) col.map(_*2) else col.map(_/10)}
An even better approach would be to use match and case:
inputColumns.zipWithIndex.map(x => x._2 match {
case 0 => x._1.map(_*2)
case _ => x._1.map(_/10)
})

How do I do a Map comprehension with Scala?

With Python, I can do something like
listOfLists = [('a', -1), ('b', 0), ('c', 1)]
my_dict = {foo: bar for foo, bar in listOfLists}
my_dict == {'a': -1, 'b': 0, 'c': 1} => True
I know this as a dictionary comprehension. When I look for this operation with Scala, I find this incomprehensible document (pun intended).
Is there an idiomatic way to do this with Scala?
Bonus question: Can I filter with this operation as well like my_dict = {foo: bar for foo, bar in listOfLists if bar > 0}?
First, let's parse your Python code to figure out what it's doing.
my_dict = {
foo: bar <-- Key, value names
for foo, bar <-- Destructuring a list
in listOfLists <-- This is where they came from
}
So you can see that even in this very short example there's actually considerable redundancy and plenty of potential for failure if listOfLists isn't actually what it says it is.
If listOfLists actually is a list of pairs (key, value), then in Scala it's trivial:
listOfPairs.toMap
If, on the other hand, it really is lists, and you want to pull off the first one to make the key and save the rest as a value, it would be something like
listOfLists.map(x => x.head -> x.tail).toMap
You can select some of them by using collect instead. For instance, maybe you only want the lists of length 2 (you could use if x.head > 0 to get your example), in which case you could write
listOfLists.collect{
case x if x.length == 2 => x.head -> x.last
}.toMap
or if it is literally a List, you could also
listOfLists.collect{
case key :: value :: Nil => key -> value
}.toMap
I'll compare list comprehension in Scala 2.x and Python 3.x
1. Sequence
In python:
xs = [x*x for x in range(5)]
#xs = [0, 1, 4, 9, 16]
ys = list(map(lambda x: x*x, range(5)))
#ys = [0, 1, 4, 9, 16]
In Scala:
scala> val xs = for(x <- 0 until 5) yield x*x
xs: scala.collection.immutable.IndexedSeq[Int] = Vector(0, 1, 4, 9, 16)
scala> val ys = (0 until 5) map (x => x*x)
ys: scala.collection.immutable.IndexedSeq[Int] = Vector(0, 1, 4, 9, 16)
Or, if you really want a list (note that breakOut only exists up to Scala 2.12; it was removed in 2.13):
scala> import collection.breakOut
scala> val xs: List[Int] = (for(x <- 0 until 5) yield x*x)(breakOut)
xs: List[Int] = List(0, 1, 4, 9, 16)
scala> val ys: List[Int] = (0 until 5).map(x => x*x)(breakOut)
ys: List[Int] = List(0, 1, 4, 9, 16)
scala> val zs = (for(x <- 0 until 5) yield x*x).toList
zs: List[Int] = List(0, 1, 4, 9, 16)
2. Set
In Python
s1 = { x//2 for x in range(10) }
#s1 = {0, 1, 2, 3, 4}
s2 = set(map(lambda x: x//2, range(10)))
#s2 = {0, 1, 2, 3, 4}
In Scala
scala> val s1 = (for(x <- 0 until 10) yield x/2).toSet
s1: scala.collection.immutable.Set[Int] = Set(0, 1, 2, 3, 4)
scala> val s2: Set[Int] = (for(x <- 0 until 10) yield x/2)(breakOut)
s2: Set[Int] = Set(0, 1, 2, 3, 4)
scala> val s3: Set[Int] = (0 until 10).map(_/2)(breakOut)
s3: Set[Int] = Set(0, 1, 2, 3, 4)
scala> val s4 = (0 until 10).map(_/2).toSet
s4: scala.collection.immutable.Set[Int] = Set(0, 1, 2, 3, 4)
3. Dict
In Python:
pairs = [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]
d1 = {k: v*2 for k, v in pairs}
#d1 = {1: 'aa', 2: 'bb', 3: 'cc', 4: 'dd'}
d2 = dict([(k*2, v) for k, v in pairs])
#d2 = {2: 'a', 4: 'b', 6: 'c', 8: 'd'}
In Scala
scala> val pairs = Seq(1->"a", 2->"b", 3->"c", 4->"d")
pairs: Seq[(Int, String)] = List((1,a), (2,b), (3,c), (4,d))
scala> val d1 = (for((k, v) <- pairs) yield (k, v*2)).toMap
d1: scala.collection.immutable.Map[Int,String] = Map(1 -> aa, 2 -> bb, 3 -> cc, 4 -> dd)
scala> val d2 = Map(pairs map { case(k, v) => (k*2, v) } :_*)
d2: scala.collection.immutable.Map[Int,String] = Map(2 -> a, 4 -> b, 6 -> c, 8 -> d)
scala> val d3 = pairs map { case(k, v) => (k*2, v) } toMap
d3: scala.collection.immutable.Map[Int,String] = Map(2 -> a, 4 -> b, 6 -> c, 8 -> d)
scala> val d4: Map[Int, String] = (for((k, v) <- pairs) yield (k, v*2))(breakOut)
d4: Map[Int,String] = Map(1 -> aa, 2 -> bb, 3 -> cc, 4 -> dd)
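On Scala 2.13 or later, where breakOut is no longer available, the same target collections can be built with to(...) or with a view. This is a sketch assuming 2.13+ collections, not part of the original answer; the value names are made up:

```scala
// Assumes Scala 2.13+ (or Scala 3).
val squares: List[Int] = (0 until 5).map(x => x * x).to(List)

// A view avoids building the intermediate IndexedSeq before converting.
val halves: Set[Int] = (0 until 10).view.map(_ / 2).to(Set)

val pairs = Seq(1 -> "a", 2 -> "b", 3 -> "c", 4 -> "d")
val doubledKeys: Map[Int, String] =
  pairs.view.map { case (k, v) => (k * 2, v) }.toMap
```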
Here are a few examples:
val listOfLists = Vector(Vector(1,2), Vector(3,4), Vector(5,6))
val m1 = listOfLists.map { case Seq(a,b) => (a,b) }.toMap
val m2 = listOfLists.collect { case Seq(a,b) if b>0 => (a,b) }.toMap
val m3 = (for (Seq(a,b) <- listOfLists) yield (a,b)).toMap
val m4 = (for (Seq(a,b) <- listOfLists if b>0) yield (a,b)).toMap
val m5 = Map(listOfLists.map { case Seq(a,b) => (a,b) }: _*)
val m6 = Map(listOfLists.collect { case Seq(a,b) => (a,b) }: _*)
val m7 = Map((for (Seq(a,b) <- listOfLists) yield (a,b)): _*)
val m8 = Map((for (Seq(a,b) <- listOfLists if b>0) yield (a,b)): _*)
You can create a Map using .toMap or Map(xs: _*). The collect method lets you filter as you map. And a for-comprehension uses syntax most similar to your example.
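Applied to the data from the question, a sketch could look like this. It assumes the input is a list of tuples rather than nested lists, and the value names are made up for illustration:

```scala
// Assumed input: the pairs from the question, as Scala tuples.
val listOfPairs = List(("a", -1), ("b", 0), ("c", 1))

// {foo: bar for foo, bar in listOfLists}
val myDict: Map[String, Int] =
  (for ((foo, bar) <- listOfPairs) yield foo -> bar).toMap

// {foo: bar for foo, bar in listOfLists if bar > 0}
val positives: Map[String, Int] =
  (for ((foo, bar) <- listOfPairs if bar > 0) yield foo -> bar).toMap

// The same filter expressed with collect.
val positivesViaCollect: Map[String, Int] =
  listOfPairs.collect { case (foo, bar) if bar > 0 => foo -> bar }.toMap
```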

How to remove 2 or more duplicates from list and maintain their initial order?

Let's assume we have a Scala list:
val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
We can easily remove duplicates using the following code:
l1.distinct
or
l1.toSet.toList
But what if we want to remove duplicates only if there are more than 2 of them? That is, if there are more than 2 elements with the same value, we keep only the first two and remove the rest.
I could achieve it with following code:
l1.groupBy(identity).mapValues(_.take(2)).values.toList.flatten
that gave me the result:
List(2, 2, 5, 1, 1, 3, 3)
Elements are removed, but the order of the remaining elements differs from the order in which they appeared in the initial list. How can I do this operation and keep the order of the original list?
So the result for l1 should be:
List(1, 2, 3, 1, 3, 2, 5)
Not the most efficient.
scala> val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
l1: List[Int] = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
scala> l1.zipWithIndex.groupBy( _._1 ).map(_._2.take(2)).flatten.toList.sortBy(_._2).unzip._1
res10: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
My humble answer:
def distinctOrder[A](x:List[A]):List[A] = {
@scala.annotation.tailrec
def distinctOrderRec(list: List[A], covered: List[A]): List[A] = {
(list, covered) match {
case (Nil, _) => covered.reverse
case (lst, c) if c.count(_ == lst.head) >= 2 => distinctOrderRec(list.tail, covered)
case _ => distinctOrderRec(list.tail, list.head :: covered)
}
}
distinctOrderRec(x, Nil)
}
With the results:
scala> val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
l1: List[Int] = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
scala> distinctOrder(l1)
res1: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
On Edit: Right before I went to bed I came up with this!
l1.foldLeft(List[Int]())((total, next) => if (total.count(_ == next) >= 2) total else total :+ next)
With an answer of:
res9: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
Not the prettiest. I look forward to seeing the other solutions.
def noMoreThan(xs: List[Int], max: Int) =
{
def op(m: Map[Int, Int], a: Int) = {
m updated (a, m(a) + 1)
}
xs.scanLeft( Map[Int,Int]().withDefaultValue(0) ) (op).tail
.zip(xs)
.filter{ case (m, a) => m(a) <= max }
.map(_._2)
}
scala> noMoreThan(l1, 2)
res0: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
More straightforward version using foldLeft:
l1.foldLeft(List[Int]()){(acc, el) =>
if (acc.count(_ == el) >= 2) acc else el::acc}.reverse
Similar to how distinct is implemented, with a multiset instead of a set:
def noMoreThan[T](list : List[T], max : Int) = {
val b = List.newBuilder[T]
val seen = collection.mutable.Map[T,Int]().withDefaultValue(0)
for (x <- list) {
if (seen(x) < max) {
b += x
seen(x) += 1
}
}
b.result()
}
Based on experquisite's answer, but using foldLeft:
def noMoreThanBis(xs: List[Int], max: Int) = {
val initialState: (Map[Int, Int], List[Int]) = (Map[Int, Int]().withDefaultValue(0), Nil)
val (_, result) = xs.foldLeft(initialState) { case ((count, res), x) =>
if (count(x) >= max)
(count, res)
else
(count.updated(x, count(x) + 1), x :: res)
}
result.reverse
}
distinct is defined for SeqLike as
/** Builds a new $coll from this $coll without any duplicate elements.
* $willNotTerminateInf
*
* @return A new $coll which contains the first occurrence of every element of this $coll.
*/
def distinct: Repr = {
val b = newBuilder
val seen = mutable.HashSet[A]()
for (x <- this) {
if (!seen(x)) {
b += x
seen += x
}
}
b.result()
}
We can define our function in very similar fashion:
def distinct2[A](ls: List[A]): List[A] = {
val b = List.newBuilder[A]
val seen1 = mutable.HashSet[A]()
val seen2 = mutable.HashSet[A]()
for (x <- ls) {
if (!seen2(x)) {
b += x
if (!seen1(x)) {
seen1 += x
} else {
seen2 += x
}
}
}
b.result()
}
scala> distinct2(l1)
res4: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
This version uses internal state, but is still pure. It is also quite easy to generalise for arbitrary n (currently 2), but specific version is more performant.
You can implement the same function with folds carrying the "what is seen once and twice" state with you. Yet the for loop and mutable state does the same job.
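A sketch of that fold-based variant, carrying the "seen once" and "seen twice" sets in the accumulator. The name distinct2Fold is made up for illustration:

```scala
// Hypothetical fold-based counterpart: keeps at most two occurrences of each
// element, in their original order, without any mutable state.
def distinct2Fold[A](ls: List[A]): List[A] = {
  val init = (Set.empty[A], Set.empty[A], List.empty[A])
  val (_, _, kept) = ls.foldLeft(init) {
    case ((once, twice, out), x) =>
      if (twice(x)) (once, twice, out)              // third or later occurrence: drop
      else if (once(x)) (once, twice + x, x :: out) // second occurrence: keep, mark as seen twice
      else (once + x, twice, x :: out)              // first occurrence: keep, mark as seen once
  }
  kept.reverse // elements were prepended, so restore the original order
}
```

It trades the two mutable HashSets for immutable Sets, so it is likely somewhat slower than the builder version.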
How about this:
list
.zipWithIndex
.groupBy(_._1)
.toSeq
.flatMap { _._2.take(2) }
.sortBy(_._2)
.map(_._1)
It's a bit ugly, but it's relatively fast:
val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
l1.foldLeft((Map[Int, Int](), List[Int]())) { case ((m, ls), x) => {
val z = m + ((x, m.getOrElse(x, 0) + 1))
(z, if (z(x) <= 2) x :: ls else ls)
}}._2.reverse
Gives: List(1, 2, 3, 1, 3, 2, 5)
Here is a recursive solution (it will stack overflow for large lists):
def filterAfter[T](l: List[T], max: Int): List[T] = {
require(max > 1)
//keep the state of seen values
val seen = Map[T, Int]().withDefaultValue(0)//init to 0
def filterAfter(l: List[T], seen: Map[T, Int]): (List[T], Map[T, Int]) = {
l match {
case x :: xs =>
if (seen(x) < max) {
//Update the state and pass to next
val pair = filterAfter(xs, seen updated (x, seen(x) + 1))
(x::pair._1, pair._2)
} else {
//already seen max times, so skip it
filterAfter(xs, seen)
}
case _ => (l, seen)//empty, terminate recursion
}
}
//call inner recursive function
filterAfter(l, seen)._1
}
Here is canonical Scala code to reduce three or more in a row to two in a row:
def checkForTwo(candidate: List[Int]): List[Int] = {
candidate match {
case x :: y :: z :: tail if x == y && y == z =>
checkForTwo(y :: z :: tail)
case x :: tail =>
x :: checkForTwo(tail)
case Nil =>
Nil
}
}
It looks at the first three elements of the list, and if they are the same, drops the first one and repeats the process. Otherwise, it passes items on through.
Solution with groupBy and filter, without any sorting (so it's O(N), sorting will give you additional O(Nlog(N)) in typical case):
val li = l1.zipWithIndex
val pred = li.groupBy(_._1).flatMap(_._2.lift(1)) //1 is your "2", but - 1
for ((x, i) <- li if !pred.get(x).exists(_ < i)) yield x
I prefer an approach with an immutable Map:
def noMoreThan[T](list: List[T], max: Int): List[T] = {
def go(tail: List[T], freq: Map[T, Int]): List[T] = {
tail match {
case h :: t =>
if (freq(h) < max)
h :: go(t, freq + (h -> (freq(h) + 1)))
else go(t, freq)
case _ => Nil
}
}
go(list, Map[T, Int]().withDefaultValue(0))
}

my combinations function returns an empty list

I am working on S-99: Ninety-Nine Scala Problems and already stuck at question 26.
Generate the combinations of K distinct objects chosen from the N elements of a list.
After wasting a couple hours, I decided to peek at a solution written in Haskell:
combinations :: Int -> [a] -> [[a]]
combinations 0 _ = [ [] ]
combinations n xs = [ y:ys | y:xs' <- tails xs
, ys <- combinations (n-1) xs']
It looks pretty straightforward so I decided to translate into Scala. (I know that's cheating.) Here's what I got so far:
def combinations[T](n: Int, ls: List[T]): List[List[T]] = (n, ls) match {
case (0, _) => List[List[T]]()
case (n, xs) => {
for {
y :: xss <- allTails(xs).reverse
ys <- combinations((n - 1), xss)
} yield y :: ys
}
}
My helper function:
def allTails[T](ls: List[T]): List[List[T]] = {
ls./:(0, List[List[T]]())((acc, c) => {
(acc._1 + 1, ls.drop(acc._1) :: acc._2)
})._2 }
allTails(List(0, 1, 2, 3)).reverse
//> res1: List[List[Int]] = List(List(0, 1, 2, 3), List(1, 2, 3), List(2, 3), List(3))
However, my combinations returns an empty list. Any idea?
Other solutions with explanation are very welcome as well. Thanks
Edit: The description of the question
Generate the combinations of K distinct objects chosen from the N elements of a list.
In how many ways can a committee of 3 be chosen from a group of 12 people? We all know that there are C(12,3) = 220 possibilities (C(N,K) denotes the well-known binomial coefficient). For pure mathematicians, this result may be great. But we want to really generate all the possibilities.
Example:
scala> combinations(3, List('a, 'b, 'c, 'd, 'e, 'f))
res0: List[List[Symbol]] = List(List('a, 'b, 'c), List('a, 'b, 'd), List('a, 'b, 'e), ...
As Noah pointed out, my problem is that a for over an empty list doesn't yield anything. However, the hacky workaround that Noah suggested is wrong: it adds an empty list to the result of every recursion step. Anyway, here is my final solution. I changed the base case to "case (1, xs)" (n matches 1).
def combinations[T](n: Int, ls: List[T]): List[List[T]] = (n, ls) match {
case (1, xs) => xs.map(List(_))
case (n, xs) => {
val tails = allTails(xs).reverse
for {
y :: xss <- tails
ys <- combinations((n - 1), xss)
} yield y :: ys
}
}
//combinations(3, List(1, 2, 3, 4))
//List(List(1, 2, 3), List(1, 2, 4), List(1, 3, 4), List(2, 3, 4))
//combinations(2, List(0, 1, 2, 3))
//List(List(0, 1), List(0, 2), List(0, 3), List(1, 2), List(1, 3), List(2, 3))
def allTails[T](ls: List[T]): List[List[T]] = {
ls./:(0, List[List[T]]())((acc, c) => {
(acc._1 + 1, ls.drop(acc._1) :: acc._2)
})._2
}
//allTails(List(0,1,2,3))
//List(List(3), List(2, 3), List(1, 2, 3), List(0, 1, 2, 3))
You made a mistake when translating the Haskell version here:
case (0, _) => List[List[T]]()
This returns an empty list. Whereas the Haskell version
combinations 0 _ = [ [] ]
returns a list with a single element, and that element is an empty list.
This is essentially saying that there is one way to choose zero items, and that is important because the code builds on this case recursively for the cases where we choose more items. If there were no ways to select zero items, then there would also be no ways to select one item and so on. That's what's happening in your code.
If you fix the Scala version to do the same as the Haskell version:
case (0, _) => List(List[T]())
it works as expected.
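For comparison, here is a compact sketch of the same recursion with the corrected base case. It uses the standard library's tails instead of the allTails helper and a flatMap instead of the for comprehension; that is a substitution for illustration, not the asker's code:

```scala
// Sketch: choose n elements from ls, preserving the order of ls.
def combinations[T](n: Int, ls: List[T]): List[List[T]] =
  if (n == 0) List(Nil) // exactly one way to choose nothing: the empty combination
  else
    ls.tails.toList.flatMap {
      // Pick y as the next element, then choose the remaining n - 1 from what follows it.
      case y :: rest => combinations(n - 1, rest).map(y :: _)
      case Nil       => Nil // the empty tail offers no element to pick
    }
```

With this base case, combinations(3, ...) on a 12-element list should produce the C(12,3) = 220 results mentioned in the problem statement.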
Your problem is using the for comprehension with lists. If the for detects an empty list, then it short circuits and returns an empty list instead of 'cons'ing your head element. Here's an example:
scala> for { xs <- List() } yield println("It worked!") // This never prints
res0: List[Unit] = List()
So, a kind of hacky work around for your combinations function would be:
def combinations[T](n: Int, ls: List[T]): List[List[T]] = (n, ls) match {
case (0, _) => List[List[T]]()
case (n, xs) => {
val tails = allTails(xs).reverse
println(tails)
for {
y :: xss <- tails
ys <- Nil :: combinations((n - 1), xss) //Now we're sure to keep evaluating even with an empty list
} yield y :: ys
}
}
scala> combinations(2, List(1, 2, 3))
List(List(1, 2, 3), List(2, 3), List(3))
List(List(2, 3), List(3))
List(List(3))
List()
res5: List[List[Int]] = List(List(1), List(1, 2), List(1, 3), List(2), List(2, 3), List(3))
One more way of solving it.
def combinations[T](n: Int, ls: List[T]): List[List[T]] = {
var ms: List[List[T]] = List[List[T]]();
val len = ls.size
if (n > len)
throw new Error();
else if (n == len)
List(ls)
else if (n == 1)
ls map (a => List(a))
else {
for (i <- n to len) {
val take: List[T] = ls take i;
val temp = combinations(n - 1, take.init) map (a => take.last :: a)
ms = ms ::: temp
}
ms
}
}
So combinations(2, List(1, 2, 3)) gives: List[List[Int]] = List(List(2, 1), List(3, 1), List(3, 2))