How the get the index of the duplicate pair in the scala list - scala

I have a scala list like this below:
slist = List("a","b","c","a","d","c","a")
I want to get the index of the same element pair in this list.
For example,the result of this slist is
(0,3),(0,6),(3,6),(2,5)
which (0,3) means the slist(0)==slist(3)
(0,6) means the slist(0)==slist(6)
and so on.
So is there any method to do this in scala?
Thanks very much

There's simpler approaches but starting with zipWithIndex leads down this path. zipWithIndex returns a Tuple2 with the index and one of the letters. From there we groupBy the letter to get a map of the letter to it's indices and filter the ones with more than one value. Lastly, we have this MapLike.DefaultValuesIterable(List((a,0), (a,3), (a,6)), List((c,2), (c,5)))
which we take the indices from and make combinations.
scala> slist.zipWithIndex.groupBy(zipped => zipped._1).filter(t => t._2.size > 1).values.flatMap(xs => xs.map(t => t._2).combinations(2))
res40: Iterable[List[Int]] = List(List(0, 3), List(0, 6), List(3, 6), List(2, 5))

Indexing a List is rather inefficient so I recommend a transition to Vector and then back again (if needed).
val svec = slist.toVector
svec.indices
.map(x => (x,svec.indexOf(svec(x),x+1)))
.filter(_._2 > 0)
.toList
//res0: List[(Int, Int)] = List((0,3), (2,5), (3,6))

val v = slist.toVector; val s = v.size
for(i<-0 to s-1;j<-0 to s-1;if(i<j && v(i)==v(j))) yield (i,j)
In Scala REPL:
scala> for(i<-0 to s-1;j<-0 to s-1;if(i<j && v(i)==v(j))) yield (i,j)
res34: scala.collection.immutable.IndexedSeq[(Int, Int)] = Vector((0,3), (0,6), (2,5), (3,6))

Related

combine two lists with same keys

Here's a quite simple request to combine two lists as following:
scala> list1
res17: List[(Int, Double)] = List((1,0.1), (2,0.2), (3,0.3), (4,0.4))
scala> list2
res18: List[(Int, String)] = List((1,aaa), (2,bbb), (3,ccc), (4,ddd))
The desired output is as:
((aaa,0.1),(bbb,0.2),(ccc,0.3),(ddd,0.4))
I tried:
scala> (list1 ++ list2)
res23: List[(Int, Any)] = List((1,0.1), (2,0.2), (3,0.3), (4,0.4),
(1,aaa), (2,bbb), (3,ccc), (4,ddd))
But:
scala> (list1 ++ list2).groupByKey
<console>:10: error: value groupByKey is not a member of List[(Int,
Any)](list1 ++ list2).groupByKey
Any hints? Thanks!
The method you're looking for is groupBy:
(list1 ++ list2).groupBy(_._1)
If you know that for each key you have exactly two values, you can join them:
scala> val pairs = List((1, "a1"), (2, "b1"), (1, "a2"), (2, "b2"))
pairs: List[(Int, String)] = List((1,a1), (2,b1), (1,a2), (2,b2))
scala> pairs.groupBy(_._1).values.map {
| case List((_, v1), (_, v2)) => (v1, v2)
| }
res0: Iterable[(String, String)] = List((b1,b2), (a1,a2))
Another approach using zip is possible if the two lists contain the same keys in the same order:
scala> val l1 = List((1, "a1"), (2, "b1"))
l1: List[(Int, String)] = List((1,a1), (2,b1))
scala> val l2 = List((1, "a2"), (2, "b2"))
l2: List[(Int, String)] = List((1,a2), (2,b2))
scala> l1.zip(l2).map { case ((_, v1), (_, v2)) => (v1, v2) }
res1: List[(String, String)] = List((a1,a2), (b1,b2))
Here's a quick one-liner:
scala> list2.map(_._2) zip list1.map(_._2)
res0: List[(String, Double)] = List((aaa,0.1), (bbb,0.2), (ccc,0.3), (ddd,0.4))
If you are unsure why this works then read on! I'll expand it step by step:
list2.map(<function>)
The map method iterates over each value in list2 and applies your function to it. In this case each of the values in list2 is a Tuple2 (a tuple with two values). What you are wanting to do is access the second tuple value. To access the first tuple value use the ._1 method and to access the second tuple value use the ._2 method. Here is an example:
val myTuple = (1.0, "hello") // A Tuple2
println(myTuple._1) // prints "1.0"
println(myTuple._2) // prints "hello"
So what we want is a function literal that takes one parameter (the current value in the list) and returns the second tuple value (._2). We could have written the function literal like this:
list2.map(item => item._2)
We don't need to specify a type for item because the compiler is smart enough to infer it thanks to target typing. A really helpful shortcut is that we can just leave out item altogether and replace it with a single underscore _. So it gets simplified (or cryptified, depending on how you view it) to:
list2.map(_._2)
The other interesting part about this one liner is the zip method. All zip does is take two lists and combines them into one list just like the zipper on your favorite hoodie does!
val a = List("a", "b", "c")
val b = List(1, 2, 3)
a zip b // returns ((a,1), (b,2), (c,3))
Cheers!

How to sum adjacent elements in scala

I want to sum adjacent elements in scala and I'm not sure how to deal with the last element.
So I have a list:
val x = List(1,2,3,4)
And I want to sum adjacent elements using indices and map:
val size = x.indices.size
val y = x.indices.map(i =>
if (i < size - 1)
x(i) + x(i+1))
The problem is that this approach creates an AnyVal elemnt at the end:
res1: scala.collection.immutable.IndexedSeq[AnyVal] = Vector(3, 5, 7, ())
and if I try to sum the elements or another numeric method of the collection, it doesn't work:
error: could not find implicit value for parameter num: Numeric[AnyVal]
I tried to filter out the element using:
y diff List(Unit) or y diff List(AnyVal)
but it doesn't work.
Is there a better approach in scala to do this type of adjacent sum without using a foor loop?
For a more functional solution, you can use sliding to group the elements together in twos (or any number of them), then map to their sum.
scala> List(1, 2, 3, 4).sliding(2).map(_.sum).toList
res80: List[Int] = List(3, 5, 7)
What sliding(2) will do is create an intermediate iterator of lists like this:
Iterator(
List(1, 2),
List(2, 3),
List(3, 4)
)
So when we chain map(_.sum), we will map each inner List to it's own sum. toList will convert the Iterator back into a List.
You can try pattern matching and tail recursion also.
import scala.annotation.tailrec
#tailrec
def f(l:List[Int],r :List[Int]=Nil):List[Int] = {
l match {
case x :: xs :: xss =>
f(l.tail, r :+ (x + xs))
case _ => r
}
}
scala> f(List(1,2,3,4))
res4: List[Int] = List(3, 5, 7)
With a for comprehension by zipping two lists, the second with the first item dropped,
for ( (a,b) <- x zip x.drop(1) ) yield a+b
which results in
List(3, 5, 7)

How do I yield pairwise combinations of a collection, ignoring order?

Let's assume I have a collection (let's use a set):
scala> val x = Set(1, 2, 3)
x: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
I can get all pairwise combinations with the following code:
scala> for {
| a <- x
| b <- x
| if a != b
| } yield (a, b)
res9: scala.collection.immutable.Set[(Int, Int)] = Set((3,1), (3,2), (1,3), (2,3), (1,2), (2,1))
The problem is that I only want to get all pairwise combinations where order is ignored (so the combination (1, 2) is equivalent to (2, 1)). So I'd like to return Set((3, 1), (3, 2), (1, 2)).
Do not assume that the elements of the collection will be integers. They may be any arbitrary type.
Any ideas?
Edit: Python's itertools.combinations performs the exact functionality I'm looking for. I just want an idiomatic way to do it in Scala :)
Scala has a combinations method too, but it's only defined on Seq, not Set. So turn your set into a Seq first, and the following will give you an Iterator[Seq[Int]]:
x.toSeq.combinations(2)
If you really want tuples, add map {case Seq(a,b) => (a,b)} to the above.
If you don't mind having a Set of Sets instead of a Set of tuples it is easy:
scala> val s3 = for {
| a <- x
| b <- x
| if(a != b)
| } yield ( Set(a,b))
s3: scala.collection.immutable.Set[scala.collection.immutable.Set[Int]] = Set(Set(1, 2), Set(1, 3), Set(2, 3))

Scala generate unique pairs from list

Input :
val list = List(1, 2, 3, 4)
Desired output :
Iterator((1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4))
This code works :
for (cur1 <- 0 until list.size; cur2 <- (cur1 + 1) until list.size)
yield (list(cur1), list(cur2))
but it not seems optimal, is there any better way of doing it ?
There's a .combinations method built-in:
scala> List(1,2,3,4).combinations(2).toList
res0: List[List[Int]] = List(List(1, 2), List(1, 3), List(1, 4), List(2, 3), List(2, 4), List(3, 4))
It returns an Iterator, but I added .toList just for the purpose of printing the result. If you want your results in tuple form, you can do:
scala> List(1,2,3,4).combinations(2).map{ case Seq(x, y) => (x, y) }.toList
res1: List[(Int, Int)] = List((1,2), (1,3), (1,4), (2,3), (2,4), (3,4))
You mentioned uniqueness as well, so you could apply .distinct to your input list uniqueness isn't a precondition of your function, because .combination will not deduplicate for you.
.combinations is the proper way for generating unique arbitrary groups of any size, another alternative solution that does not check the uniqueness in first place is using foldLeft that way:
val list = (1 to 10).toList
val header :: tail = list
tail.foldLeft((header, tail, List.empty[(Int, Int)])) {
case ((header, tail, res), elem) =>
(elem, tail.drop(1), res ++ tail.map(x => (header, x)))
}._3
Will produce:
res0: List[(Int, Int)] = List((1,2), (1,3), (1,4), (1,5), (1,6), (1,7), (1,8), (1,9), (1,10), (2,3), (2,4), (2,5), (2,6), (2,7), (2,8), (2,9), (2,10), (3,4), (3,5), (3,6), (3,7), (3,8), (3,9), (3,10), (4,5), (4,6), (4,7), (4,8), (4,9), (4,10), (5,6), (5,7), (5,8), (5,9), (5,10), (6,7), (6,8), (6,9), (6,10), (7,8), (7,9), (7,10), (8,9), (8,10), (9,10))
If you expect there to be duplicates then you can turn the output list into a set and bring it back into a list, but you will lose the ordering then. Thus not the recommended way if you want to have uniqueness, but should be preferred if you want to generate all of the pairs included equal elements.
E.g. I used it in the field of machine learning for generating all of the products between each pair of variables in the feature space and if two or more variables have the same value I still want to produce a new variable corresponding to their product even though those newly generated "interaction variables" will have duplicates.

Listing combinations WITH repetitions in Scala

Trying to learn a bit of Scala and ran into this problem. I found a solution for all combinations without repetions here and I somewhat understand the idea behind it but some of the syntax is messing me up. I also don't think the solution is appropriate for a case WITH repetitions. I was wondering if anyone could suggest a bit of code that I could work from. I have plenty of material on combinatorics and understand the problem and iterative solutions to it, I am just looking for the scala-y way of doing it.
Thanks
I understand your question now. I think the easiest way to achieve what you want is to do the following:
def mycomb[T](n: Int, l: List[T]): List[List[T]] =
n match {
case 0 => List(List())
case _ => for(el <- l;
sl <- mycomb(n-1, l dropWhile { _ != el } ))
yield el :: sl
}
def comb[T](n: Int, l: List[T]): List[List[T]] = mycomb(n, l.removeDuplicates)
The comb method just calls mycomb with duplicates removed from the input list. Removing the duplicates means it is then easier to test later whether two elements are 'the same'. The only change I have made to your mycomb method is that when the method is being called recursively I strip off the elements which appear before el in the list. This is to stop there being duplicates in the output.
> comb(3, List(1,2,3))
> List[List[Int]] = List(
List(1, 1, 1), List(1, 1, 2), List(1, 1, 3), List(1, 2, 2),
List(1, 2, 3), List(1, 3, 3), List(2, 2, 2), List(2, 2, 3),
List(2, 3, 3), List(3, 3, 3))
> comb(6, List(1,2,1,2,1,2,1,2,1,2))
> List[List[Int]] = List(
List(1, 1, 1, 1, 1, 1), List(1, 1, 1, 1, 1, 2), List(1, 1, 1, 1, 2, 2),
List(1, 1, 1, 2, 2, 2), List(1, 1, 2, 2, 2, 2), List(1, 2, 2, 2, 2, 2),
List(2, 2, 2, 2, 2, 2))
Meanwhile, combinations have become integral part of the scala collections:
scala> val li = List (1, 1, 0, 0)
li: List[Int] = List(1, 1, 0, 0)
scala> li.combinations (2) .toList
res210: List[List[Int]] = List(List(1, 1), List(1, 0), List(0, 0))
As we see, it doesn't allow repetition, but to allow them is simple with combinations though: Enumerate every element of your collection (0 to li.size-1) and map to element in the list:
scala> (0 to li.length-1).combinations (2).toList .map (v=>(li(v(0)), li(v(1))))
res214: List[(Int, Int)] = List((1,1), (1,0), (1,0), (1,0), (1,0), (0,0))
I wrote a similar solution to the problem in my blog: http://gabrielsw.blogspot.com/2009/05/my-take-on-99-problems-in-scala-23-to.html
First I thought of generating all the possible combinations and removing the duplicates, (or use sets, that takes care of the duplications itself) but as the problem was specified with lists and all the possible combinations would be too much, I've came up with a recursive solution to the problem:
to get the combinations of size n, take one element of the set and append it to all the combinations of sets of size n-1 of the remaining elements, union the combinations of size n of the remaining elements.
That's what the code does
//P26
def combinations[A](n:Int, xs:List[A]):List[List[A]]={
def lift[A](xs:List[A]):List[List[A]]=xs.foldLeft(List[List[A]]())((ys,y)=>(List(y)::ys))
(n,xs) match {
case (1,ys)=> lift(ys)
case (i,xs) if (i==xs.size) => xs::Nil
case (i,ys)=> combinations(i-1,ys.tail).map(zs=>ys.head::zs):::combinations(i,ys.tail)
}
}
How to read it:
I had to create an auxiliary function that "lift" a list into a list of lists
The logic is in the match statement:
If you want all the combinations of size 1 of the elements of the list, just create a list of lists in which each sublist contains an element of the original one (that's the "lift" function)
If the combinations are the total length of the list, just return a list in which the only element is the element list (there's only one possible combination!)
Otherwise, take the head and tail of the list, calculate all the combinations of size n-1 of the tail (recursive call) and append the head to each one of the resulting lists (.map(ys.head::zs) ) concatenate the result with all the combinations of size n of the tail of the list (another recursive call)
Does it make sense?
The question was rephrased in one of the answers -- I hope the question itself gets edited too. Someone else answered the proper question. I'll leave that code below in case someone finds it useful.
That solution is confusing as hell, indeed. A "combination" without repetitions is called permutation. It could go like this:
def perm[T](n: Int, l: List[T]): List[List[T]] =
n match {
case 0 => List(List())
case _ => for(el <- l;
sl <- perm(n-1, l filter (_ != el)))
yield el :: sl
}
If the input list is not guaranteed to contain unique elements, as suggested in another answer, it can be a bit more difficult. Instead of filter, which removes all elements, we need to remove just the first one.
def perm[T](n: Int, l: List[T]): List[List[T]] = {
def perm1[T](n: Int, l: List[T]): List[List[T]] =
n match {
case 0 => List(List())
case _ => for(el <- l;
(hd, tl) = l span (_ != el);
sl <- perm(n-1, hd ::: tl.tail))
yield el :: sl
}
perm1(n, l).removeDuplicates
}
Just a bit of explanation. In the for, we take each element of the list, and return lists composed of it followed by the permutation of all elements of the list except for the selected element.
For instance, if we take List(1,2,3), we'll compose lists formed by 1 and perm(List(2,3)), 2 and perm(List(1,3)) and 3 and perm(List(1,2)).
Since we are doing arbitrary-sized permutations, we keep track of how long each subpermutation can be. If a subpermutation is size 0, it is important we return a list containing an empty list. Notice that this is not an empty list! If we returned Nil in case 0, there would be no element for sl in the calling perm, and the whole "for" would yield Nil. This way, sl will be assigned Nil, and we'll compose a list el :: Nil, yielding List(el).
I was thinking about the original problem, though, and I'll post my solution here for reference. If you meant not having duplicated elements in the answer as a result of duplicated elements in the input, just add a removeDuplicates as shown below.
def comb[T](n: Int, l: List[T]): List[List[T]] =
n match {
case 0 => List(List())
case _ => for(i <- (0 to (l.size - n)).toList;
l1 = l.drop(i);
sl <- comb(n-1, l1.tail))
yield l1.head :: sl
}
It's a bit ugly, I know. I have to use toList to convert the range (returned by "to") into a List, so that "for" itself would return a List. I could do away with "l1", but I think this makes more clear what I'm doing. Since there is no filter here, modifying it to remove duplicates is much easier:
def comb[T](n: Int, l: List[T]): List[List[T]] = {
def comb1[T](n: Int, l: List[T]): List[List[T]] =
n match {
case 0 => List(List())
case _ => for(i <- (0 to (l.size - n)).toList;
l1 = l.drop(i);
sl <- comb(n-1, l1.tail))
yield l1.head :: sl
}
comb1(n, l).removeDuplicates
}
Daniel -- I'm not sure what Alex meant by duplicates, it may be that the following provides a more appropriate answer:
def perm[T](n: Int, l: List[T]): List[List[T]] =
n match {
case 0 => List(List())
case _ => for(el <- l.removeDuplicates;
sl <- perm(n-1, l.slice(0, l.findIndexOf {_ == el}) ++ l.slice(1 + l.findIndexOf {_ == el}, l.size)))
yield el :: sl
}
Run as
perm(2, List(1,2,2,2,1))
this gives:
List(List(2, 2), List(2, 1), List(1, 2), List(1, 1))
as opposed to:
List(
List(1, 2), List(1, 2), List(1, 2), List(2, 1),
List(2, 1), List(2, 1), List(2, 1), List(2, 1),
List(2, 1), List(1, 2), List(1, 2), List(1, 2)
)
The nastiness inside the nested perm call is removing a single 'el' from the list, I imagine there's a nicer way to do that but I can't think of one.
This solution was posted on Rosetta Code: http://rosettacode.org/wiki/Combinations_with_repetitions#Scala
def comb[A](as: List[A], k: Int): List[List[A]] =
(List.fill(k)(as)).flatten.combinations(k).toList
It is really not clear what you are asking for. It could be one of a few different things. First would be simple combinations of different elements in a list. Scala offers that with the combinations() method from collections. If elements are distinct, the behavior is exactly what you expect from classical definition of "combinations". For n-element combinations of p elements there will be p!/n!(p-n)! combinations in the output.
If there are repeated elements in the list, though, Scala will generate combinations with the item appearing more than once in the combinations. But just the different possible combinations, with the element possibly replicated as many times as they exist in the input. It generates only the set of possible combinations, so repeated elements, but not repeated combinations. I'm not sure if underlying it there is an iterator to an actual Set.
Now what you actually mean if I understand correctly is combinations from a given set of different p elements, where an element can appear repeatedly n times in the combination.
Well, coming back a little, to generate combinations when there are repeated elements in the input, and you wanna see the repeated combinations in the output, the way to go about it is just to generate it by "brute-force" using n nested loops. Notice that there is really nothing brute about it, it is just the natural number of combinations, really, which is O(p^n) for small n, and there is nothing you can do about it. You only should be careful to pick these values properly, like this:
val a = List(1,1,2,3,4)
def comb = for (i <- 0 until a.size - 1; j <- i+1 until a.size) yield (a(i), a(j))
resulting in
scala> comb
res55: scala.collection.immutable.IndexedSeq[(Int, Int)] = Vector((1,1), (1,2), (1,3), (1,4), (1,2), (1,3), (1,4), (2,3), (2,4), (3,4))
This generates the combinations from these repeated values in a, by first creating the intermediate combinations of 0 until a.size as (i, j)...
Now to create the "combinations with repetitions" you just have to change the indices like this:
val a = List('A','B','C')
def comb = for (i <- 0 until a.size; j <- i until a.size) yield (a(i), a(j))
will produce
List((A,A), (A,B), (A,C), (B,B), (B,C), (C,C))
But I'm not sure what's the best way to generalize this to larger combinations.
Now I close with what I was looking for when I found this post: a function to generate the combinations from an input that contains repeated elements, with intermediary indices generated by combinations(). It is nice that this method produces a list instead of a tuple, so that means we can actually solve the problem using a "map of a map", something I'm not sure anyone else has proposed here, but that is pretty nifty and will make your love for FP and Scala grow a bit more after you see it!
def comb[N](p:Seq[N], n:Int) = (0 until p.size).combinations(n) map { _ map p }
results in
scala> val a = List('A','A','B','C')
scala> comb(a, 2).toList
res60: List[scala.collection.immutable.IndexedSeq[Int]] = List(Vector(1, 1), Vector(1, 2), Vector(1, 3), Vector(1, 2), Vector(1, 3), Vector(2, 3))