Functional "Find pairs that add up to X" with linear time complexity - scala

I am trying to implement the "find pairs that add up to X" functionally with linear time complexity, for which I have the following:
def pairs(nums: List[Int], sum: Int): List[(Int, Int)] = {
def pairsR(nums: List[Int], sum: Int, start: Int, end: Int, acc: List[(Int, Int)]): List[(Int, Int)] = {
val newAcc = nums(start) + nums(end) match {
case n if n == sum => ( (nums(start), nums(end)) :: acc, start + 1, end - 1)
case n if n < sum => (acc, start + 1, end)
case n if n > sum => (acc, start, end - 1)
}
if(start < end) pairsR(nums, sum, newAcc._2, newAcc._3, newAcc._1)
else newAcc._1
}
pairsR(nums, sum, 0, nums.length - 1, List())
}
This would work if I were looking only for the first pair that adds up to X (assuming I returned after finding the first occurrence). But because I am trying to find all pairs, I get a spurious extra pair, as seen here (note that the list contains only a single 5; I am guessing that because both pointers arrive at 5 at the same time, it gets counted twice):
pairs(List(1,2,3,4,5,6,7,8,9), 10) should equalTo (List( (1, 9), (2, 8), (3, 7), (4, 6) ))
Yet I get the following failure:
List((5,5), (4,6), (3,7), (2,8), (1,9)) is not equal to List((1,9),
(2,8), (3,7), (4,6))
Is it simply impossible to do this in linear time when you are looking for ALL pairs (and not just the first)? I know it's possible with a HashSet, but I wanted to know whether the "pointers" approach can work.

Here is an updated version of your code:
def pairs(nums: List[Int], sum: Int): List[(Int, Int)] = {
  val numsArr = nums.toArray
  def pairsR(start: Int, end: Int, acc: List[(Int, Int)]): List[(Int, Int)] =
    // test the pointers before indexing, so an empty list cannot go out of bounds
    if (start >= end) acc.reverse
    else numsArr(start) + numsArr(end) match {
      case `sum` => pairsR(start + 1, end - 1, (numsArr(start), numsArr(end)) :: acc)
      case n if n < sum => pairsR(start + 1, end, acc)
      case _ => pairsR(start, end - 1, acc)
    }
  pairsR(0, numsArr.length - 1, Nil)
}
Test:
pairs((1 to 9).toList, 10)
Result:
res0: List[(Int, Int)] = List((1,9), (2,8), (3,7), (4,6))
Some notes:
When the start and end pointers meet somewhere in the middle of the array, it's time to end the recursion and return acc. This condition must be checked first, so that the general logic is not applied once the pointers have met.
Because results are prepended to acc, acc ends up in reverse order, so it's reasonable to reverse it before returning.
There is no need to pass values that never change, such as nums and sum, as arguments of the inner recursive function.
As people suggested in the comments, if you need a read-only collection with constant-time indexed access, your choice is Array. Vector will also work, but it's slower.
As a side note, current implementation may produce incorrect results if you have duplicate elements in your input list:
pairs(List(1,1,1,2,2), 3)
Result:
res1: List[(Int, Int)] = List((1,2), (1,2))
The easiest way to fix that is to preprocess the input list by removing duplicate elements, so that the result contains only distinct pairs. However, if you want to include all pairs of elements with the same value but different indices, that cannot be done in linear time: consider N elements that all have the value X and a target sum of 2X. There are N(N-1)/2 such pairs, so the length of the result grows as N^2.
Also, this algorithm requires input data to be sorted. There is an easy change to the algorithm to make it work on unsorted data (using counting sort).
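The deduplicate-then-scan idea above can be sketched as follows. This is only an illustration under stated assumptions: the name distinctPairs is hypothetical, and the sketch sorts the input itself with a comparison sort, so the total cost is O(n log n) rather than linear unless the input is already sorted or a counting sort is substituted.

```scala
import scala.annotation.tailrec

// Hypothetical helper: returns each distinct pair of values that adds up to `sum`.
def distinctPairs(nums: List[Int], sum: Int): List[(Int, Int)] = {
  // Remove duplicate values, then sort so the two-pointer scan is valid.
  val arr = nums.distinct.sorted.toArray

  @tailrec
  def loop(lo: Int, hi: Int, acc: List[(Int, Int)]): List[(Int, Int)] =
    if (lo >= hi) acc.reverse // pointers met: nothing left to pair up
    else {
      val s = arr(lo) + arr(hi)
      if (s == sum) loop(lo + 1, hi - 1, (arr(lo), arr(hi)) :: acc)
      else if (s < sum) loop(lo + 1, hi, acc) // need a larger sum
      else loop(lo, hi - 1, acc)              // need a smaller sum
    }

  loop(0, arr.length - 1, Nil)
}

val fromDuplicates = distinctPairs(List(1, 1, 1, 2, 2), 3) // List((1,2))
val fromUnsorted = distinctPairs(List(9, 1, 8, 2, 5), 10)  // List((1,9), (2,8))
```

Because every value occurs at most once after distinct, a pair such as (5,5) can never be reported here, which matches the "distinct pairs" reading of the problem.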

Related

What's the idiomatic way to take top n values according to some criteria?

I have the following code:
Sighting.all
.iterator
.map(s => (s, haversineDistance(s, ourLocation)))
.toSeq
.sortBy(_._2)
.take(5)
As expected, it returns the 5 sightings closest to ourLocation.
However, for a very large number of sightings, it does not scale well. We can instead just go through all sightings O(N) and find the 5 closest ones, instead of sorting them all and thus doing O(N*logN). How to do so idiomatically?
As with your previous questions, fold might be of use. In this case I'd be tempted to fold over a PriorityQueue initialized to values larger than the expected data set.
import scala.collection.mutable.PriorityQueue
...
.iterator
.foldLeft(PriorityQueue((999,"x"),(999,"x"),(999,"x"),(999,"x"),(999,"x")){
case (pq, s) => pq.+=((haversineDistance(s, ourLocation), s)).tail
}
The result is a PriorityQueue of 5 (distance, sighting) tuples, but only the 5 smallest distances.
You can avoid sorting the big list by iterating through each of the elements in the list just once while maintaining a 5-element list as follows:
Keep the 5-element list sorted by distance in descending order so that its head element has the longest distance (Note that since 5 is small the cost of sorting is negligible)
In each iteration, if the current element in the original list has its distance shorter than that of the head element in the 5-element list, replace the head element with the current element; otherwise keep the current 5-element list
Upon completing the iterations, the 5-element list will consist of elements with the shortest distances and a final sorting by distance in ascending order will give the top5 list:
val list = Sighting.all.
iterator.
map(s => (s, haversineDistance(s, ourLocation))).
toSeq
// For example ...
res1: list = List(
("a", 5), ("b", 2), ("c", 12), ("d", 9), ("e", 6), ("f", 15),
("g", 9), ("h", 7), ("i", 6), ("j", 3), ("k", 10), ("l", 5)
)
val top5 = list.drop(5).
foldLeft( list.take(5).sortWith(_._2 > _._2) )(
(l, e) => if (e._2 < l.head._2)
(e :: l.tail).sortWith(_._2 > _._2)
else
l
).
sortBy(_._2)
// top5: List[(String, Int)] = List((b,2), (j,3), (l,5), (a,5), (e,6))
[UPDATE]
Below is a verbose version of the above top5 value assignment which hopefully makes the foldLeft expression look less overwhelming.
val initialTop5Sorted = list.take(5).sortWith(_._2 > _._2)
val originalListTail = list.drop(5)
def updateTop5Sorted = ( list: List[(String, Int)], element: (String, Int) ) => {
if (element._2 < list.head._2)
(element :: list.tail).sortWith(_._2 > _._2)
else
list
}
val top5 = originalListTail.
foldLeft( initialTop5Sorted )( updateTop5Sorted ).
sortBy(_._2)
Here's signature of foldLeft for your reference:
def foldLeft[B](z: B)(op: (B, A) => B): B
Here's a slightly different approach:
def topNBy[A, B : Ordering](xs: Iterable[A], n: Int, f: A => B): List[A] = {
val q = new scala.collection.mutable.PriorityQueue[A]()(Ordering.by(f))
for (x <- xs) {
q += x
if (q.size > n) {
q.dequeue()
}
}
q.dequeueAll.toList.reverse
}
fold is useful, and worth getting comfortable with, but if you're not creating a new object to act on in each iteration, and just modifying an existing one, it's no better than a for-loop. And I'd prefer relying on PriorityQueue to do the sorting rather than rolling your own, especially given it's an efficient O(log n) implementation. Functional purists might balk at this for being more imperative, but to me it's worth it for readability and conciseness. The mutable state is limited to a single local data structure.
You could even put it in an implicit class:
implicit class IterableWithTopN[A](xs: Iterable[A]) {
def topNBy[B : Ordering](n: Int, f: A => B): List[A] = {
...
}
}
And then use it like:
Sighting.all.topNBy(5, s => haversineDistance(s, ourLocation))
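For readers who want to try the extension-method form without the Sighting class, here is a self-contained sketch. The wrapper object TopN and the sample data are assumptions made for illustration; the method body is the same bounded-heap loop as in topNBy above.

```scala
import scala.collection.mutable.PriorityQueue

// Hypothetical wrapper object, so the implicit class can be imported where needed.
object TopN {
  implicit class IterableWithTopN[A](xs: Iterable[A]) {
    def topNBy[B: Ordering](n: Int, f: A => B): List[A] = {
      // Max-heap keyed by f: the head is always the largest of the elements kept so far.
      val q = PriorityQueue.empty[A](Ordering.by(f))
      for (x <- xs) {
        q += x
        if (q.size > n) q.dequeue() // evict the current largest, keeping the n smallest
      }
      q.dequeueAll.toList.reverse   // dequeueAll yields largest first, so reverse it
    }
  }
}

import TopN._

// Made-up (name, distance) pairs standing in for sightings.
val sample = List(("a", 5), ("b", 2), ("c", 12), ("j", 3))
val closest = sample.topNBy(2, _._2) // List((b,2), (j,3))
```

The heap never holds more than n + 1 elements, so the pass costs O(N log n) for N inputs, without sorting the whole collection.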

Scala: partitioning number into n almost equal-length ranges

I am trying to implement a function:
def NumberPartition(InputNum:Int,outputListSize:Int):List[Range]
such that:
NumberPartition(8,3)=List(Range(0,3),Range(3,6),Range(6,8))
i.e. it creates n-1 equal-length ranges (length = ceil(InputNum/outputListSize)), plus a last (or first) one that is slightly smaller.
I want to use this function for agglomeration of an embarrassingly-parallel program consisting of n subroutines that are going to be batch-handled by n tasks/threads.
What is the most idiomatic way of doing this in Scala?
I think using Range steps could be helpful:
def rangeHeads(n:Int,len:Int):Range=Range(0,n,ceil(n/len))//type conversion for ceil() omitted here.
rangeHeads(8,3)//Range(0, 3, 6)
I just need a function that does (1,2,3,4)->((1,2),(2,3),(3,4))
While this isn't the exact output you are seeking, perhaps this will be good guidance:
scala> def numberPartition(inputNum: Int, outputListSize: Int): List[List[Int]] = {
(0 to inputNum).toList.grouped(outputListSize).toList
}
numberPartition: numberPartition[](val inputNum: Int,val outputListSize: Int) => List[List[Int]]
scala> numberPartition(8, 3)
res0: List[List[Int]] = List(List(0, 1, 2), List(3, 4, 5), List(6, 7, 8))
def roundedUpIntDivide(a:Int,b:Int):Int=a/b + (if(a%b==0) 0 else 1)
def partitionToRanges(n:Int,len:Int): List[(Int, Int)] ={
(Range(0,n,roundedUpIntDivide(n,len)):+n)
.sliding(2)
.map(x => (x(0),x(1)))
.toList
}
Thanks to @jwvh for suggesting sliding()
def numberPartition(inputNum: Int, outputListSize: Int): List[(Int, Int)] = {
val range = 0.until(inputNum).by(inputNum/outputListSize + 1).:+(inputNum)
range.zip(range.tail).toList
}
Note:
0.until(inputNum) is an exclusive range [0, inputNum);
.by(inputNum/outputListSize + 1) is the step;
.:+(inputNum) appends the upper bound back onto the range;
range.zip(range.tail) builds pairs of consecutive elements, equivalent to .sliding(2).map { case Seq(x,y) => (x,y) }
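The answers above return lists or tuples, whereas the question asked for List[Range]. A minimal sketch of that variant is below, assuming a step of ceil(inputNum / outputListSize) computed with integer arithmetic; the require check and the math.max guard are additions for illustration. Note that with this step size the function can return fewer than outputListSize ranges (for example 4 items split 3 ways gives two ranges of length 2).

```scala
def numberPartition(inputNum: Int, outputListSize: Int): List[Range] = {
  require(inputNum >= 0 && outputListSize > 0, "sizes must be non-negative / positive")
  // Integer ceiling of inputNum / outputListSize; at least 1 so `by` gets a valid step.
  val step = math.max(1, (inputNum + outputListSize - 1) / outputListSize)
  (0 until inputNum by step)
    .map(start => start until math.min(start + step, inputNum)) // clip the last range
    .toList
}

val parts = numberPartition(8, 3) // ranges covering 0-2, 3-5 and 6-7
```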

fold left operation in Scala?

I am having difficulty understanding how fold left works in Scala.
The following code computes for each unique character in the list chars the number of
times it occurs. For example, the invocation
times(List('a', 'b', 'a'))
should return the following (the order of the resulting list is not important):
List(('a', 2), ('b', 1))
def times(chars: List[Char]): List[(Char, Int)] = {
def incr(acc: Map[Char,Int], c: Char) = {
val count = (acc get c).getOrElse(0) + 1
acc + ((c, count));
}
val map = Map[Char, Int]()
(map /: chars)(incr).iterator.toList
}
I am just confused as to what the last line of this function is actually doing?
Any help would be great.
Thanks.
foldLeft in Scala works like this.
Suppose you have a list of integers:
val nums = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
val res = nums.foldLeft(0)((m: Int, n: Int) => m + n)
You will get res = 55.
Let's visualise it:
val res1 = nums.foldLeft(0) { (m: Int, n: Int) => println("m: " + m + " n: " + n);
m + n }
m: 0 n: 1
m: 1 n: 2
m: 3 n: 3
m: 6 n: 4
m: 10 n: 5
m: 15 n: 6
m: 21 n: 7
m: 28 n: 8
m: 36 n: 9
m: 45 n: 10
So we can see that we need to pass the initial accumulator value as the first argument of foldLeft. The accumulated value is held in 'm', and the next element of the list arrives in 'n'.
Finally, we get the accumulator back as the result.
Let's start from the "last line" which you are asking about: as the Map trait extends Iterable which in turn extends Traversable where the operator /: is explained, the code (map /: chars)(incr) does fold-left over chars, with the initial value of the accumulator being the empty mapping from characters to integers, applying incr to each intermediate value of acc and each element c of chars.
For example, when chars is List('a', 'b', 'a', 'c'), the fold-left expression (map /: chars)(incr) equals incr(incr(incr(incr(Map[Char, Int](), 'a'), 'b'), 'a'), 'c').
Now, as for what incr does: it takes an intermediate mapping acc from characters to integers, along with a character c, and increments by 1 the integer corresponding to c in the mapping. (Strictly speaking, the mapping is immutable and therefore never mutated: instead, a new, updated mapping is created and returned. Also, getOrElse(0) says that, if c does not exist in acc, the integer to be incremented is considered 0.)
As a whole, given List('a', 'b', 'a', 'c') as chars for example, the final mapping would be List(('a', 2), ('b', 1), ('c', 1)) when converted to a list by toList.
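One way to watch those intermediate mappings is to swap the fold for scanLeft, which returns every accumulator value instead of only the last one. This is a sketch for illustration; incr is restated here with the same behavior as in the question so that the snippet stands alone.

```scala
// Same behavior as incr in the question: bump the count for c in an immutable Map.
def incr(acc: Map[Char, Int], c: Char): Map[Char, Int] =
  acc + (c -> (acc.getOrElse(c, 0) + 1))

val chars = List('a', 'b', 'a', 'c')

// scanLeft keeps the initial value plus one accumulator per element.
val steps = chars.scanLeft(Map.empty[Char, Int])(incr)
// steps(0) is the empty map, steps(1) is Map(a -> 1), steps(2) is Map(a -> 1, b -> 1),
// steps(3) is Map(a -> 2, b -> 1), steps(4) is Map(a -> 2, b -> 1, c -> 1)

// The last step is exactly what the fold returns.
val folded = chars.foldLeft(Map.empty[Char, Int])(incr)
```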
I rewrote your function in a more verbose way:
def times(chars: List[Char]): List[(Char, Int)] = {
chars
.foldLeft(Map[Char, Int]()){ (acc, c) =>
acc + ((c, acc.getOrElse(c, 0) + 1))
}
.toList
}
Let's see the first steps on times("aba".toList)
First invocation:
(Map(), 'a') => Map() ++ Map('a' -> 1)
Second invocation:
(Map('a' -> 1), 'b') => Map('a' -> 1) ++ Map('b' -> 1)
Third invocation:
(Map('a' -> 1, 'b' ->1), 'a') =>
Map('a' -> 1, 'b' ->1) ++ Map('a' -> 2) =>
Map('a' -> 2, 'b' ->1)
The actual implementation in the scala codebase is very concise:
def foldLeft[B](z: B)(f: (B, A) => B): B = {
var acc = z
var these = this
while (!these.isEmpty) {
acc = f(acc, these.head)
these = these.tail
}
acc
}
Let me rename stuff for clarity:
def foldLeft[B](initialValue: B)(f: (B, A) => B): B = {
  // Notice that both accumulator and collectionCopy are `var`s! They are reassigned each time in the loop.
  var accumulator = initialValue
  // no elements are copied: this is a second reference that walks down the collection
  var collectionCopy = this // the function is inside a collection class, so **this** is the collection
  while (!collectionCopy.isEmpty) {
    accumulator = f(accumulator, collectionCopy.head)
    collectionCopy = collectionCopy.tail
  }
  accumulator
}
Edit after comment:
Let us now revisit the OP's function and rewrite it in an imperative manner (i.e. non-functional, which apparently is the source of confusion):
(map /: chars)(incr) is exactly equivalent to chars.foldLeft(map)(incr), which can be imperatively rewritten as:
def foldLeft(initialValue: Map[Char,Int])(incrFunction: (Map[Char,Int], Char) => Map[Char,Int]): Map[Char,Int] = {
  // Notice that both accumulator and charList are `var`s! They are reassigned each time in the loop.
  var accumulator = initialValue
  // no elements are copied: this is a second reference that walks down the list
  var charList: List[Char] = this // the function is inside a collection class, so **this** is the collection
  while (!charList.isEmpty) {
    accumulator = incrFunction(accumulator, charList.head)
    charList = charList.tail
  }
  accumulator
}
I hope this makes the concept of foldLeft clearer.
So it is essentially an abstraction over an imperative while loop, that accumulates some value by traversing the collection and updating the accumulator. The accumulator is updated using a user-provided function that takes the previous value of the accumulator and the current item of the collection.
Its very description hints that it is a great tool to compute all sorts of aggregates on a collection, like sum, max etc. Yeah, scala collections actually provide all these functions, but they serve as a good example use case.
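As a small sketch of such aggregates, here are a sum and a maximum written directly with foldLeft. The sample list and the value names are made up for illustration; in real code the built-in sum and max would normally be used.

```scala
val nums = List(3, 9, 4, 7)

// Sum: start from 0 and add each element to the accumulator.
val total = nums.foldLeft(0)(_ + _) // 23

// Max: start from "nothing seen yet", so an empty list is handled without throwing.
val largest = nums.foldLeft(Option.empty[Int]) { (best, n) =>
  Some(best.fold(n)(b => math.max(b, n)))
} // Some(9)
```

Using Option for the maximum is a design choice: there is no sensible initial Int for an empty list, so the accumulator itself records whether anything has been seen.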
On the specifics of your question, let me point out that this can be easily done using groupBy:
def times(l: List[Char]) = l.groupBy(c => c).mapValues(_.size).toList
times(List('a','b','a')) // outputs List[(Char, Int)] = List((b,1), (a,2))
.groupBy(c => c) gives you Map[Char,List[Char]] = Map(b -> List(b), a -> List(a, a))
Then we use .mapValues(_.size) to map the values of the map to the size of the grouped sub-collections: Map[Char,Int] = Map(b -> 1, a -> 2).
Finally, you convert the map to a list of key-value tuples with .toList to get the final result.
Lastly, if you don't care about the order of the output list as you said, then leaving the output as a Map[Char,Int] conveys better this decision (instead of converting it to a list).

Scala List Operation

Given a List of Int and a variable X of type Int, what is the best functional way in Scala to retain only those values from the beginning of the list whose running sum is less than or equal to X?
This is pretty close to a one-liner:
def takeWhileLessThan(x: Int)(l: List[Int]): List[Int] =
l.scan(0)(_ + _).tail.zip(l).takeWhile(_._1 <= x).map(_._2)
Let's break that into smaller pieces.
First you use scan to create a list of cumulative sums. Here's how it works on a small example:
scala> List(1, 2, 3, 4).scan(0)(_ + _)
res0: List[Int] = List(0, 1, 3, 6, 10)
Note that the result includes the initial value, which is why we take the tail in our implementation.
scala> List(1, 2, 3, 4).scan(0)(_ + _).tail
res1: List[Int] = List(1, 3, 6, 10)
Now we zip the entire thing against the original list. Taking our example again, this looks like the following:
scala> List(1, 2, 3, 4).scan(0)(_ + _).tail.zip(List(1, 2, 3, 4))
res2: List[(Int, Int)] = List((1,1), (3,2), (6,3), (10,4))
Now we can use takeWhile to take as many values as we can from this list before the cumulative sum is greater than our target. Let's say our target is 5 in our example:
scala> res2.takeWhile(_._1 <= 5)
res3: List[(Int, Int)] = List((1,1), (3,2))
This is almost what we want—we just need to get rid of the cumulative sums:
scala> res2.takeWhile(_._1 <= 5).map(_._2)
res4: List[Int] = List(1, 2)
And we're done. It's worth noting that this isn't very efficient, since it computes the cumulative sums for the entire list, etc. The implementation could be optimized in various ways, but as it stands it's probably the simplest purely functional way to do this in Scala (and in most cases the performance won't be a problem, anyway).
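One possible optimization, sketched here as an assumption rather than as part of the answer above, is to run the same pipeline on iterators so that cumulative sums are only computed until the limit is exceeded. The name takeWhileSumAtMost is hypothetical.

```scala
def takeWhileSumAtMost(x: Int)(l: List[Int]): List[Int] =
  l.iterator
    .scanLeft(0)(_ + _)   // running sums, produced lazily, starting with the initial 0
    .drop(1)              // skip the initial 0, like .tail in the strict version
    .zip(l.iterator)      // pair each running sum with the element that produced it
    .takeWhile(_._1 <= x) // stop pulling elements once the sum exceeds x
    .map(_._2)
    .toList

val kept = takeWhileSumAtMost(5)(List(1, 2, 3, 4)) // List(1, 2)
```

The shape of the code is unchanged; only the evaluation strategy differs, since Iterator operations are applied element by element on demand.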
In addition to Travis' answer (and for the sake of completeness), you can always implement these type of operations as a foldLeft:
def takeWhileLessThanOrEqualTo(maxSum: Int)(list: Seq[Int]): Seq[Int] = {
// Tuple3: the sum of elements so far; the accumulated list; whether we have gone over maxSum, in other words whether we are finished
val startingState = (0, Seq.empty[Int], false)
val (_, accumulatedNumbers, _) = list.foldLeft(startingState) {
case ((sum, accumulator, finished), nextNumber) =>
if(!finished) {
if (sum + nextNumber > maxSum) (sum, accumulator, true) // We are over the sum limit, finish
else (sum + nextNumber, accumulator :+ nextNumber, false) // We are still under the limit, add it to the list and sum
} else (sum, accumulator, finished) // We are in a finished state, just keep iterating over the list
}
accumulatedNumbers
}
This only iterates over the list once, so it should be more efficient, but is more complicated and requires a bit of reading code to understand.
I will go with something like this, which is more functional and should be efficient.
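def takeSumLessThan(x: Int, l: List[Int]): List[Int] = l match {
  // keep the head only while it still fits in the remaining budget x,
  // so the element that would push the sum past the limit is not included
  case lh :: lt if lh <= x => lh :: takeSumLessThan(x - lh, lt)
  case _ => List()
}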
Edit 1 : Adding tail recursion and implicit for shorter call notation
import scala.annotation.tailrec
implicit class MyList(l: List[Int]) {
  def takeSumLessThan(x: Int): List[Int] = {
    @tailrec
    def f(x: Int, l: List[Int], acc: List[Int]): List[Int] = l match {
      // prepend in O(1) and reverse once at the end, instead of appending with ++
      case lh :: lt if lh <= x => f(x - lh, lt, lh :: acc)
      case _ => acc.reverse
    }
    f(x, l, Nil)
  }
}
Now you can use this like
List(1,2,3,4,5,6,7,8).takeSumLessThan(10)

Listing combinations WITH repetitions in Scala

Trying to learn a bit of Scala and ran into this problem. I found a solution for all combinations without repetitions here and I somewhat understand the idea behind it, but some of the syntax is messing me up. I also don't think the solution is appropriate for a case WITH repetitions. I was wondering if anyone could suggest a bit of code that I could work from. I have plenty of material on combinatorics and understand the problem and iterative solutions to it; I am just looking for the scala-y way of doing it.
Thanks
I understand your question now. I think the easiest way to achieve what you want is to do the following:
def mycomb[T](n: Int, l: List[T]): List[List[T]] =
n match {
case 0 => List(List())
case _ => for(el <- l;
sl <- mycomb(n-1, l dropWhile { _ != el } ))
yield el :: sl
}
def comb[T](n: Int, l: List[T]): List[List[T]] = mycomb(n, l.distinct) // distinct replaces removeDuplicates from older Scala versions
The comb method just calls mycomb with duplicates removed from the input list. Removing the duplicates means it is then easier to test later whether two elements are 'the same'. The only change I have made to your mycomb method is that when the method is being called recursively I strip off the elements which appear before el in the list. This is to stop there being duplicates in the output.
> comb(3, List(1,2,3))
> List[List[Int]] = List(
List(1, 1, 1), List(1, 1, 2), List(1, 1, 3), List(1, 2, 2),
List(1, 2, 3), List(1, 3, 3), List(2, 2, 2), List(2, 2, 3),
List(2, 3, 3), List(3, 3, 3))
> comb(6, List(1,2,1,2,1,2,1,2,1,2))
> List[List[Int]] = List(
List(1, 1, 1, 1, 1, 1), List(1, 1, 1, 1, 1, 2), List(1, 1, 1, 1, 2, 2),
List(1, 1, 1, 2, 2, 2), List(1, 1, 2, 2, 2, 2), List(1, 2, 2, 2, 2, 2),
List(2, 2, 2, 2, 2, 2))
Meanwhile, combinations have become integral part of the scala collections:
scala> val li = List (1, 1, 0, 0)
li: List[Int] = List(1, 1, 0, 0)
scala> li.combinations (2) .toList
res210: List[List[Int]] = List(List(1, 1), List(1, 0), List(0, 0))
As we see, it doesn't allow repetition, but to allow them is simple with combinations though: Enumerate every element of your collection (0 to li.size-1) and map to element in the list:
scala> (0 to li.length-1).combinations (2).toList .map (v=>(li(v(0)), li(v(1))))
res214: List[(Int, Int)] = List((1,1), (1,0), (1,0), (1,0), (1,0), (0,0))
I wrote a similar solution to the problem in my blog: http://gabrielsw.blogspot.com/2009/05/my-take-on-99-problems-in-scala-23-to.html
First I thought of generating all the possible combinations and removing the duplicates (or using sets, which take care of the duplicates themselves), but as the problem was specified with lists and generating all the possible combinations would be too much, I came up with a recursive solution to the problem:
to get the combinations of size n, take one element of the set and append it to all the combinations of sets of size n-1 of the remaining elements, union the combinations of size n of the remaining elements.
That's what the code does
//P26
def combinations[A](n:Int, xs:List[A]):List[List[A]]={
def lift[A](xs:List[A]):List[List[A]]=xs.foldLeft(List[List[A]]())((ys,y)=>(List(y)::ys))
(n,xs) match {
case (1,ys)=> lift(ys)
case (i,xs) if (i==xs.size) => xs::Nil
case (i,ys)=> combinations(i-1,ys.tail).map(zs=>ys.head::zs):::combinations(i,ys.tail)
}
}
How to read it:
I had to create an auxiliary function that "lift" a list into a list of lists
The logic is in the match statement:
If you want all the combinations of size 1 of the elements of the list, just create a list of lists in which each sublist contains an element of the original one (that's the "lift" function)
If the combinations are the total length of the list, just return a list in which the only element is the element list (there's only one possible combination!)
Otherwise, take the head and tail of the list, calculate all the combinations of size n-1 of the tail (recursive call) and append the head to each one of the resulting lists (.map(ys.head::zs) ) concatenate the result with all the combinations of size n of the tail of the list (another recursive call)
Does it make sense?
The question was rephrased in one of the answers -- I hope the question itself gets edited too. Someone else answered the proper question. I'll leave that code below in case someone finds it useful.
That solution is confusing as hell, indeed. An ordered selection without repetitions is called a permutation. It could go like this:
def perm[T](n: Int, l: List[T]): List[List[T]] =
n match {
case 0 => List(List())
case _ => for(el <- l;
sl <- perm(n-1, l filter (_ != el)))
yield el :: sl
}
If the input list is not guaranteed to contain unique elements, as suggested in another answer, it can be a bit more difficult. Instead of filter, which removes all elements, we need to remove just the first one.
def perm[T](n: Int, l: List[T]): List[List[T]] = {
def perm1[T](n: Int, l: List[T]): List[List[T]] =
n match {
case 0 => List(List())
case _ => for(el <- l;
(hd, tl) = l span (_ != el);
sl <- perm(n-1, hd ::: tl.tail))
yield el :: sl
}
perm1(n, l).distinct // distinct replaces removeDuplicates from older Scala versions
}
Just a bit of explanation. In the for, we take each element of the list, and return lists composed of it followed by the permutation of all elements of the list except for the selected element.
For instance, if we take List(1,2,3), we'll compose lists formed by 1 and perm(List(2,3)), 2 and perm(List(1,3)) and 3 and perm(List(1,2)).
Since we are doing arbitrary-sized permutations, we keep track of how long each subpermutation can be. If a subpermutation is size 0, it is important we return a list containing an empty list. Notice that this is not an empty list! If we returned Nil in case 0, there would be no element for sl in the calling perm, and the whole "for" would yield Nil. This way, sl will be assigned Nil, and we'll compose a list el :: Nil, yielding List(el).
I was thinking about the original problem, though, and I'll post my solution here for reference. If you meant not having duplicated elements in the answer as a result of duplicated elements in the input, just add a removeDuplicates as shown below.
def comb[T](n: Int, l: List[T]): List[List[T]] =
n match {
case 0 => List(List())
case _ => for(i <- (0 to (l.size - n)).toList;
l1 = l.drop(i);
sl <- comb(n-1, l1.tail))
yield l1.head :: sl
}
It's a bit ugly, I know. I have to use toList to convert the range (returned by "to") into a List, so that "for" itself would return a List. I could do away with "l1", but I think this makes more clear what I'm doing. Since there is no filter here, modifying it to remove duplicates is much easier:
def comb[T](n: Int, l: List[T]): List[List[T]] = {
def comb1[T](n: Int, l: List[T]): List[List[T]] =
n match {
case 0 => List(List())
case _ => for(i <- (0 to (l.size - n)).toList;
l1 = l.drop(i);
sl <- comb(n-1, l1.tail))
yield l1.head :: sl
}
comb1(n, l).distinct // distinct replaces removeDuplicates from older Scala versions
}
Daniel -- I'm not sure what Alex meant by duplicates, it may be that the following provides a more appropriate answer:
def perm[T](n: Int, l: List[T]): List[List[T]] =
  n match {
    case 0 => List(List())
    // distinct and indexOf replace removeDuplicates and findIndexOf from older Scala versions
    case _ => for(el <- l.distinct;
                  sl <- perm(n-1, l.slice(0, l.indexOf(el)) ++ l.slice(1 + l.indexOf(el), l.size)))
              yield el :: sl
  }
Run as
perm(2, List(1,2,2,2,1))
this gives:
List(List(1, 2), List(1, 1), List(2, 1), List(2, 2))
as opposed to:
List(
List(1, 2), List(1, 2), List(1, 2), List(2, 1),
List(2, 1), List(2, 1), List(2, 1), List(2, 1),
List(2, 1), List(1, 2), List(1, 2), List(1, 2)
)
The nastiness inside the nested perm call is removing a single 'el' from the list, I imagine there's a nicer way to do that but I can't think of one.
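One candidate for that nicer way, offered as a sketch rather than as the author's intent, is the standard diff method: l.diff(List(el)) removes exactly one occurrence of el (the first) and leaves the rest of the list in order. The function is named permDiff here only to keep it apart from the versions above.

```scala
def permDiff[T](n: Int, l: List[T]): List[List[T]] =
  if (n == 0) List(Nil)
  else
    for {
      el <- l.distinct                        // try each distinct value once per position
      sl <- permDiff(n - 1, l.diff(List(el))) // diff drops a single occurrence of el
    } yield el :: sl

val oneRemoved = List(1, 2, 2, 2, 1).diff(List(2)) // List(1, 2, 2, 1)
val perms = permDiff(2, List(1, 2, 2, 2, 1))
// List(List(1, 2), List(1, 1), List(2, 1), List(2, 2))
```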
This solution was posted on Rosetta Code: http://rosettacode.org/wiki/Combinations_with_repetitions#Scala
def comb[A](as: List[A], k: Int): List[List[A]] =
(List.fill(k)(as)).flatten.combinations(k).toList
It is really not clear what you are asking for. It could be one of a few different things. The first would be simple combinations of different elements in a list. Scala offers that with the combinations() method from collections. If the elements are distinct, the behavior is exactly what you expect from the classical definition of "combinations": for n-element combinations of p elements there will be p!/(n!(p-n)!) combinations in the output.
If there are repeated elements in the list, though, Scala will generate combinations with the item appearing more than once in the combinations. But just the different possible combinations, with the element possibly replicated as many times as they exist in the input. It generates only the set of possible combinations, so repeated elements, but not repeated combinations. I'm not sure if underlying it there is an iterator to an actual Set.
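A short illustration of both behaviors, using made-up inputs: with distinct elements the count matches p!/(n!(p-n)!), and with a repeated element the element may appear as often as it occurs in the input, but no combination is listed twice.

```scala
// Distinct input: 3 elements taken 2 at a time gives 3!/(2! * 1!) = 3 combinations.
val fromDistinct = List('a', 'b', 'c').combinations(2).toList
// List(List(a, b), List(a, c), List(b, c))

// Repeated input: 1 occurs twice, so List(1, 1) is possible, but List(1, 2) appears only once.
val fromRepeated = List(1, 1, 2).combinations(2).toList
// List(List(1, 1), List(1, 2))
```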
Now what you actually mean if I understand correctly is combinations from a given set of different p elements, where an element can appear repeatedly n times in the combination.
Well, coming back a little, to generate combinations when there are repeated elements in the input, and you wanna see the repeated combinations in the output, the way to go about it is just to generate it by "brute-force" using n nested loops. Notice that there is really nothing brute about it, it is just the natural number of combinations, really, which is O(p^n) for small n, and there is nothing you can do about it. You only should be careful to pick these values properly, like this:
val a = List(1,1,2,3,4)
def comb = for (i <- 0 until a.size - 1; j <- i+1 until a.size) yield (a(i), a(j))
resulting in
scala> comb
res55: scala.collection.immutable.IndexedSeq[(Int, Int)] = Vector((1,1), (1,2), (1,3), (1,4), (1,2), (1,3), (1,4), (2,3), (2,4), (3,4))
This generates the combinations from these repeated values in a, by first creating the intermediate combinations of 0 until a.size as (i, j)...
Now to create the "combinations with repetitions" you just have to change the indices like this:
val a = List('A','B','C')
def comb = for (i <- 0 until a.size; j <- i until a.size) yield (a(i), a(j))
will produce
Vector((A,A), (A,B), (A,C), (B,B), (B,C), (C,C))
But I'm not sure what's the best way to generalize this to larger combinations.
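One possible generalization to combinations of any size k, given as a sketch (the name combsWithRep is hypothetical): either reuse the head element and keep it available for further picks, or drop it for good and continue with the tail. This assumes the input elements are distinct, as in the 'A','B','C' example.

```scala
def combsWithRep[A](k: Int, xs: List[A]): List[List[A]] =
  if (k == 0) List(Nil) // exactly one way to pick nothing
  else xs match {
    case Nil => Nil     // nothing left to pick from
    case h :: t =>
      // pick h (it stays available for the next pick) or skip h permanently
      combsWithRep(k - 1, xs).map(h :: _) ::: combsWithRep(k, t)
  }

val pairsWithRep = combsWithRep(2, List('A', 'B', 'C'))
// List(List(A, A), List(A, B), List(A, C), List(B, B), List(B, C), List(C, C))
```

For k = 2 this produces the same pairs, in the same order, as the two nested index loops above, only as lists instead of tuples.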
Now I close with what I was looking for when I found this post: a function to generate the combinations from an input that contains repeated elements, with intermediary indices generated by combinations(). It is nice that this method produces a list instead of a tuple, so that means we can actually solve the problem using a "map of a map", something I'm not sure anyone else has proposed here, but that is pretty nifty and will make your love for FP and Scala grow a bit more after you see it!
def comb[N](p:Seq[N], n:Int) = (0 until p.size).combinations(n) map { _ map p }
results in
scala> val a = List(1,1,2,3)
scala> comb(a, 2).toList
res60: List[scala.collection.immutable.IndexedSeq[Int]] = List(Vector(1, 1), Vector(1, 2), Vector(1, 3), Vector(1, 2), Vector(1, 3), Vector(2, 3))