Generic function for multiple generators in for comprehensions in Scala - scala

Let's say I want to create all possible combinations of letters "a" and "b". For combinations of length 2 using for-comprehensions it will be:
for {
x <- Seq("a", "b")
y <- Seq("a", "b")
} yield x + y
And for combinations of length 3 it will be:
for {
x <- Seq("a", "b")
y <- Seq("a", "b")
z <- Seq("a", "b")
} yield x + y + z
Pretty similar. Is it possible to abstract this pattern and write a generic function?
I can think of a signature like this:
def genericCombine[A,B](length: Int, elements: Seq[A])(reducer: Seq[A] => B): Seq[B]
How can I parametrize the number of generators used in a for comprehension?

This is more like permutations with replacement than combinations, and a recursive implementation is fairly straightforward:
def select(n: Int)(input: List[String]): List[String] =
if (n == 1) input else for {
c <- input
s <- select(n - 1)(input)
} yield c + s
Which works as expected:
scala> select(2)(List("a", "b"))
res0: List[String] = List(aa, ab, ba, bb)
scala> select(3)(List("a", "b"))
res1: List[String] = List(aaa, aab, aba, abb, baa, bab, bba, bbb)
(In a real application you should of course check for invalid input: for example, with n < 1 and a non-empty input the recursion never reaches the base case.)
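The signature proposed in the question can be implemented along the same lines. The following is a minimal sketch rather than the answerer's code: it builds the sequences with a foldLeft instead of explicit recursion, and the object name GenericCombine and the require check are assumptions added for illustration.

```scala
object GenericCombine {
  // Builds every sequence of `length` elements drawn (with repetition) from
  // `elements`, then applies `reducer` to each sequence.
  def genericCombine[A, B](length: Int, elements: Seq[A])(reducer: Seq[A] => B): Seq[B] = {
    require(length >= 0, "length must be non-negative")
    // Start from the single empty sequence and append one element per step,
    // which plays the role of one generator in the hand-written for comprehension.
    val sequences = (1 to length).foldLeft(Seq(Seq.empty[A])) { (acc, _) =>
      for (prefix <- acc; e <- elements) yield prefix :+ e
    }
    sequences.map(reducer)
  }
}
```

Calling GenericCombine.genericCombine(2, Seq("a", "b"))(_.mkString) should give the same four strings as the two-generator version in the question.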

Related

Type mismatch in Scala's for-comprehension

I have tried to define a recursive Scala function that looks something like this:
def doSomething: (List[List[(Int, Int)]], List[(Int, Int)], Int, Int) => List[Int] =
(als, rs, d, n) =>
if (n == 0) {
for (entry <- rs if (entry._1 == d)) yield entry._2
} else {
for (entry <- rs; adj <- als(entry._1)) yield doSomething(als, rs.::((adj._1, adj._2 + entry._2)), d, n - 1)
}
Now, the compiler tells me:
<console>:17: error: type mismatch;
found : List[List[Int]]
required: List[Int]
for (entry <- rs; adj <- als(entry._1)) yield doSomething(als, rs.::((adj._1, adj._2 + entry._2)), d, n - 1)
^
I cannot figure out what the problem is. I'm sure that I'm using <- correctly. On the other hand, I'm a Scala newbie coming from the Java world...
Regarding the types of the input:
als : List[List[(Int,Int)]],
rs : List[(Int,Int)],
d and n : Int
The compiler error appears as soon as I tell IntelliJ to send my code to the Scala console.
When you yield an A when iterating on a List, you return a List[A]. doSomething returns a List[Int], so by yielding that you return a List[List[Int]]. You can unroll that like this:
def doSomethingElse(als: List[List[(Int, Int)]], rs: List[(Int, Int)], d: Int, n: Int): List[Int] =
if (n == 0) {
for ((k, v) <- rs if k == d) yield v
} else {
for {
(k, v) <- rs
(adjk, adjv) <- als(k)
item <- doSomethingElse(als, (adjk, adjv + v) :: rs, d, n - 1)
} yield item
}
Notice that I also used method notation for brevity, destructured the pairs, and leveraged the right-associativity of methods whose names end in : for readability. Feel free to use whatever convention you prefer, but I don't really see a reason for having a method that returns a constant function (you might just use a val to declare it instead).
As a further note, you are using random access on a linear sequence (als(k)); you may want to consider an indexed sequence (like a Vector) instead. More info on the complexity characteristics of the Scala Collection API can be found here.
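To make the difference between List[List[Int]] and List[Int] concrete, here is a small self-contained sketch. The helper expand is hypothetical and only stands in for a call that returns a List[Int], as the recursive doSomething call does.

```scala
object FlattenDemo {
  // Hypothetical helper standing in for the recursive call: returns a List[Int].
  def expand(n: Int): List[Int] = List(n, n * 10)

  // Yielding a List[Int] per element produces a List[List[Int]].
  val nested: List[List[Int]] = for (x <- List(1, 2)) yield expand(x)

  // Drawing from the inner list with another generator flattens the result.
  val flat: List[Int] = for { x <- List(1, 2); y <- expand(x) } yield y

  // Roughly what the compiler generates for `flat`.
  val desugared: List[Int] = List(1, 2).flatMap(x => expand(x).map(y => y))
}
```

The extra generator is what turns the outer map into a flatMap, which is why the unrolled version type-checks.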
For testing purposes, I created some sample data that matches the input types:
val als = List(List((1,2), (3,4)), List((1,2), (3,4)), List((1,2), (3,4)))
//als: List[List[(Int, Int)]] = List(List((1,2), (3,4)), List((1,2), (3,4)), List((1,2), (3,4)))
val rs = List((1,2), (2,3))
//rs: List[(Int, Int)] = List((1,2), (2,3))
val d = 1
//d: Int = 1
val n = 3
//n: Int = 3
In your doSomething function, when n == 0 you are doing
for (entry <- rs if (entry._1 == d)) yield entry._2
//res0: List[Int] = List(2)
You can see that the return type is List[Int]
In the else branch you call doSomething recursively.
I created a dummy version of your doSomething as a method, since your definition does not name its input parameters:
def dosomething(nn: Int)={
for (entry <- rs if (entry._1 == d)) yield entry._2
}
and I call that method inside the nested for comprehension:
for (entry <- rs; adj <- als(entry._1)) yield dosomething(0)
//res1: List[List[Int]] = List(List(2), List(2), List(2), List(2))
You can clearly see that the nested for comprehension returns List[List[Int]],
and that is what the compiler is telling you:
error: type mismatch;
found : List[List[Int]]
required: List[Int]
I hope the answer is helpful

N to N matching in scala

I have two lists, namely
val a = List(1,2,3)
val b = List(4,5)
I want to perform an N-to-N bipartite mapping and get the output
List((1,4),(1,5),(2,4),(2,5),(3,4),(3,5))
How can I do this?
Assuming that B = List(4,5), you can use a for comprehension to achieve your goal:
val A = List(1,2,3)
val B = List(4,5)
val result = for(a <- A; b <- B) yield {
(a, b)
}
The output is
result:List[(Int, Int)] = List((1,4), (1,5), (2,4), (2,5), (3,4), (3,5))
Consider also
a.flatMap(x => b.map(y => (x,y)))
though it is not as concise as a for comprehension.
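Both forms can be wrapped in a small generic helper. This is only a sketch: the names CrossDemo, cross and crossViaFlatMap are made up for illustration and do not come from the answers above.

```scala
object CrossDemo {
  // Pairs every element of `as` with every element of `bs`.
  def cross[A, B](as: List[A], bs: List[B]): List[(A, B)] =
    for (a <- as; b <- bs) yield (a, b)

  // The same thing written with flatMap and map, which is roughly what
  // the for comprehension desugars to.
  def crossViaFlatMap[A, B](as: List[A], bs: List[B]): List[(A, B)] =
    as.flatMap(a => bs.map(b => (a, b)))
}
```

With the lists from the question, both functions should return the six pairs in the same order.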

Does Scala have a statement equivalent to ML's "as" construct?

In ML, one can assign names for each element of a matched pattern:
fun findPair n nil = NONE
| findPair n (head as (n1, _))::rest =
if n = n1 then (SOME head) else (findPair n rest)
In this code, I defined an alias for the first pair of the list and matched the contents of the pair. Is there an equivalent construct in Scala?
You can do variable binding with the @ symbol, e.g.:
scala> val wholeList @ List(x, _*) = List(1,2,3)
wholeList: List[Int] = List(1, 2, 3)
x: Int = 1
I'm sure you'll get a more complete answer later as I'm not sure how to write it recursively like your example, but maybe this variation would work for you:
scala> val pairs = List((1, "a"), (2, "b"), (3, "c"))
pairs: List[(Int, String)] = List((1,a), (2,b), (3,c))
scala> val n = 2
n: Int = 2
scala> pairs find {e => e._1 == n}
res0: Option[(Int, String)] = Some((2,b))
OK, next attempt at direct translation. How about this?
scala> def findPair[A, B](n: A, p: List[Tuple2[A, B]]): Option[Tuple2[A, B]] = p match {
| case Nil => None
| case head::rest if head._1 == n => Some(head)
| case _::rest => findPair(n, rest)
| }
findPair: [A, B](n: A, p: List[(A, B)])Option[(A, B)]
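For a translation that stays closer to the ML original, the pattern binder @ can be used inside the cons pattern itself, so the whole pair and its first component are named in one step. This is a sketch, assuming the same argument order as the findPair above; the object name is made up.

```scala
object FindPairDemo {
  // `head @ (n1, _)` binds the whole pair to `head` and its first
  // component to `n1`, like `head as (n1, _)` in ML.
  @annotation.tailrec
  def findPair[A, B](n: A, pairs: List[(A, B)]): Option[(A, B)] = pairs match {
    case Nil => None
    case (head @ (n1, _)) :: rest =>
      if (n1 == n) Some(head) else findPair(n, rest)
  }
}
```

The parentheses around head @ (n1, _) are needed so that the binder applies to the pair rather than to the whole cons pattern.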

Cartesian product of two lists

Given a map where a digit is associated to several characters
scala> val conversion = Map("0" -> List("A", "B"), "1" -> List("C", "D"))
conversion: scala.collection.immutable.Map[java.lang.String,List[java.lang.String]] =
Map(0 -> List(A, B), 1 -> List(C, D))
I want to generate all possible character sequences based on a sequence of digits. Examples:
"00" -> List("AA", "AB", "BA", "BB")
"01" -> List("AC", "AD", "BC", "BD")
I can do this with for comprehensions
scala> val number = "011"
number: java.lang.String = 011
Create a sequence of possible characters per index
scala> val values = number map { case c => conversion(c.toString) }
values: scala.collection.immutable.IndexedSeq[List[java.lang.String]] =
Vector(List(A, B), List(C, D), List(C, D))
Generate all the possible character sequences
scala> for {
| a <- values(0)
| b <- values(1)
| c <- values(2)
| } yield a+b+c
res13: List[java.lang.String] = List(ACC, ACD, ADC, ADD, BCC, BCD, BDC, BDD)
Here things get ugly and it will only work for sequences of three digits. Is there any way to achieve the same result for any sequence length?
The following suggestion does not use one generator per digit in a for-comprehension. I don't think that is a good idea anyway, because, as you noticed, you'd be tied to a certain length of your cartesian product.
scala> def cartesianProduct[T](xss: List[List[T]]): List[List[T]] = xss match {
| case Nil => List(Nil)
| case h :: t => for(xh <- h; xt <- cartesianProduct(t)) yield xh :: xt
| }
cartesianProduct: [T](xss: List[List[T]])List[List[T]]
scala> val conversion = Map('0' -> List("A", "B"), '1' -> List("C", "D"))
conversion: scala.collection.immutable.Map[Char,List[java.lang.String]] = Map(0 -> List(A, B), 1 -> List(C, D))
scala> cartesianProduct("01".map(conversion).toList)
res9: List[List[java.lang.String]] = List(List(A, C), List(A, D), List(B, C), List(B, D))
Why not tail-recursive?
Note that the recursive function above is not tail-recursive. This isn't a problem, because xss will be short unless it contains a lot of singleton lists: the size of the result grows exponentially with the number of non-singleton elements of xss.
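If the explicit recursion is a concern, the same product can be written as a fold. The following is a sketch under that assumption, not part of the answer above; expand is a hypothetical helper that joins each result into a String as the question asks.

```scala
object CartesianFold {
  // Same result as the recursive version: start from the single empty
  // combination and prepend each element of each list, right to left.
  def cartesianProduct[T](xss: List[List[T]]): List[List[T]] =
    xss.foldRight(List(List.empty[T])) { (xs, acc) =>
      for (x <- xs; rest <- acc) yield x :: rest
    }

  val conversion: Map[Char, List[String]] =
    Map('0' -> List("A", "B"), '1' -> List("C", "D"))

  // Hypothetical helper: maps each digit to its candidates and joins
  // every combination into one String.
  def expand(number: String): List[String] =
    cartesianProduct(number.toList.map(conversion)).map(_.mkString)
}
```

For the three-digit input from the question, CartesianFold.expand("011") should produce the same eight strings as the hand-written for comprehension.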
I could come up with this:
val conversion = Map('0' -> Seq("A", "B"), '1' -> Seq("C", "D"))
def permut(str: Seq[Char]): Seq[String] = str match {
case Seq() => Seq.empty
case Seq(c) => conversion(c)
case Seq(head, tail @ _*) =>
val t = permut(tail)
conversion(head).flatMap(pre => t.map(pre + _))
}
permut("011")
I just did that as follows and it works
def cross(a:IndexedSeq[Tree], b:IndexedSeq[Tree]) = {
a.map (p => b.map( o => (p,o))).flatten
}
Ignore the Tree type that I am dealing with; it works for arbitrary collections too.

Multiple yields in sequence comprehension?

I'm trying to learn Scala and tried to write a sequence comprehension that extracts unigrams, bigrams and trigrams from a sequence. E.g., [1,2,3,4] should be transformed to (not Scala syntax)
[1; _,1; _,_,1; 2; 1,2; _,1,2; 3; 2,3; 1,2,3; 4; 3,4; 2,3,4]
In Scala 2.8, I tried the following:
def trigrams(tokens : Seq[T]) = {
var t1 : Option[T] = None
var t2 : Option[T] = None
for (t3 <- tokens) {
yield t3
yield (t2,t3)
yield (t1,t2,Some(t3))
t1 = t2
t2 = t3
}
}
But this doesn't compile as, apparently, only one yield is allowed in a for-comprehension (no block statements either). Is there any other elegant way to get the same behavior, with only one pass over the data?
You can't have multiple yields in a for loop because for loops are syntactic sugar for the map (or flatMap) operations:
for (i <- collection) yield( func(i) )
translates into
collection map {i => func(i)}
Without a yield at all
for (i <- collection) func(i)
translates into
collection foreach {i => func(i)}
So the entire body of the for loop is turned into a single closure, and the presence of the yield keyword determines whether the function called on the collection is map or foreach (or flatMap). Because of this translation, the following are forbidden:
Using imperative statements next to a yield to determine what will be yielded.
Using multiple yields
(Not to mention that your proposed version will return a List[Any], because the tuples and the 1-gram are all of different types. You probably want to get a List[List[Int]] instead.)
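Since a for comprehension with several generators becomes flatMap calls, the effect of several yields per element can be emulated with an inner generator that enumerates what should be emitted. This is a minimal sketch with made-up names (MultiYield, ngramsUpTo3); unlike the example in the question it omits the padded n-grams with placeholders, and it uses slice rather than a strict single pass.

```scala
object MultiYield {
  // For each position i, emit the 1-, 2- and 3-gram ending at i
  // (skipping those that would start before the sequence begins).
  def ngramsUpTo3[T](tokens: Seq[T]): Seq[Seq[T]] =
    for {
      i <- tokens.indices
      n <- 1 to 3            // the inner generator stands in for "multiple yields"
      if i - n + 1 >= 0
    } yield tokens.slice(i - n + 1, i + 1)
}
```

Every element of the result has the same type, Seq[T], so nothing is widened to Any.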
Try the following instead (which puts the n-grams in the order they appear):
val basis = List(1,2,3,4)
val slidingIterators = 1 to 4 map (basis sliding _)
for {onegram <- basis
ngram <- slidingIterators if ngram.hasNext}
yield (ngram.next)
or
import scala.collection.mutable.ListBuffer
val basis = List(1,2,3,4)
val slidingIterators = 1 to 4 map (basis sliding _)
val first = slidingIterators.head
val buf = new ListBuffer[List[Int]]
// keep going while the 1-gram iterator still has elements
while (first.hasNext)
for (i <- slidingIterators)
if (i.hasNext)
buf += i.next()
If you prefer the n-grams to be in length order, try:
val basis = List(1,2,3,4)
(1 to 4) flatMap { n => (basis sliding n).toList }
scala> val basis = List(1, 2, 3, 4)
basis: List[Int] = List(1, 2, 3, 4)
scala> val nGrams = (basis sliding 1).toList ::: (basis sliding 2).toList ::: (basis sliding 3).toList
nGrams: List[List[Int]] = ...
scala> nGrams foreach (println _)
List(1)
List(2)
List(3)
List(4)
List(1, 2)
List(2, 3)
List(3, 4)
List(1, 2, 3)
List(2, 3, 4)
I guess I should have given this more thought.
def trigrams[T](tokens : Seq[T]) : Seq[(Option[T],Option[T],T)] = {
var t1 : Option[T] = None
var t2 : Option[T] = None
for (t3 <- tokens)
yield {
val tri = (t1,t2,t3)
t1 = t2
t2 = Some(t3)
tri
}
}
Then extract the unigrams and bigrams from the trigrams. But can anyone explain to me why 'multi-yields' are not permitted, and if there's any other way to achieve their effect?
val basis = List(1, 2, 3, 4)
val nGrams = basis.map(x => (x)) ::: (for (a <- basis; b <- basis) yield (a, b)) ::: (for (a <- basis; b <- basis; c <- basis) yield (a, b, c))
nGrams: List[Any] = ...
nGrams foreach (println(_))
1
2
3
4
(1,1)
(1,2)
(1,3)
(1,4)
(2,1)
(2,2)
(2,3)
(2,4)
(3,1)
(3,2)
(3,3)
(3,4)
(4,1)
(4,2)
(4,3)
(4,4)
(1,1,1)
(1,1,2)
(1,1,3)
(1,1,4)
(1,2,1)
(1,2,2)
(1,2,3)
(1,2,4)
(1,3,1)
(1,3,2)
(1,3,3)
(1,3,4)
(1,4,1)
(1,4,2)
(1,4,3)
(1,4,4)
(2,1,1)
(2,1,2)
(2,1,3)
(2,1,4)
(2,2,1)
(2,2,2)
(2,2,3)
(2,2,4)
(2,3,1)
(2,3,2)
(2,3,3)
(2,3,4)
(2,4,1)
(2,4,2)
(2,4,3)
(2,4,4)
(3,1,1)
(3,1,2)
(3,1,3)
(3,1,4)
(3,2,1)
(3,2,2)
(3,2,3)
(3,2,4)
(3,3,1)
(3,3,2)
(3,3,3)
(3,3,4)
(3,4,1)
(3,4,2)
(3,4,3)
(3,4,4)
(4,1,1)
(4,1,2)
(4,1,3)
(4,1,4)
(4,2,1)
(4,2,2)
(4,2,3)
(4,2,4)
(4,3,1)
(4,3,2)
(4,3,3)
(4,3,4)
(4,4,1)
(4,4,2)
(4,4,3)
(4,4,4)
You could try a functional version without assignments:
def trigrams[T](tokens : Seq[T]) = {
val s1 = tokens.map { Some(_) }
val s2 = None +: s1
val s3 = None +: s2
s1 zip s2 zip s3 map {
case ((t1, t2), t3) => (List(t1), List(t1, t2), List(t1, t2, t3))
}
}