Scala: Elegant solution to iterate over (List, List)

I'm trying to come up with an "elegant" solution to iterate over two lists (pairs of values), and perform some tests on the resulting values.
Any ideas? Here's what I have so far, but I get "value filter is not a member of (List[Int], List[Int])", which surprises me; I thought this would work. I also feel like there must be a much cleaner way to express this in Scala.
val accounts = random(count = 100, minimum = 1, maximum = GPDataTypes.integer._2)
val ids = random(count = 100, minimum = 1, maximum = GPDataTypes.integer._2)
for ((id, accountId) <- (ids, accounts)) {
val g = new GPGlimple(Some(id), Some(timestamp), accountId, false, false, 2)
println(g)
g.accountId mustEqual accountId
g.id mustEqual id
g.created.get must beLessThan(System.currentTimeMillis)
g.layers must beNone
g.version must be equalTo 2
}

The simplest solution for this is zip:
(ids zip accounts)
The documentation for zip says:
Returns a list formed from this list and another iterable collection by combining corresponding elements in pairs.
In other words, zip will return a list of tuples.
The zipped method could also work here:
(ids, accounts).zipped
You can find the zipped source for 2-tuples here. Note that this is made available through an enrichment of (T, U) where T is implicitly viewable as a TraversableLike and U is implicitly viewable as an IterableLike. That method returns a ZippedTraversable2, which is a minimal interface that encapsulates this sort of zipped return, and behaves more efficiently for large sequences by inhibiting the creation of intermediary collections. These are generally more performant because they use iterators internally, as can be seen in the source.
Note that the returns here are of different types, which could affect downstream behavior. One important difference is that the normal combinator methods on ZippedTraversable2 are slightly different from those on a Traversable of tuples. The methods on ZippedTraversable2 generally expect a function of 2 arguments, while those on a Traversable of tuples will expect a function with a single argument that is a tuple. For example, you can check this in the REPL for the foreach method:
val s1 = List(1, 2, 3)
val s2 = List('a', 'b', 'c')
(s1 -> s2).zipped.foreach _
// ((Int, Char) => Any) => Unit = <function1>
(s1 zip s2).foreach _
// (((Int, Char)) => Any) => Unit = <function1>
//Notice the extra parens here, signifying a method with a tuple argument
This difference means that you sometimes have to use a different syntax when using zip and zipped:
(s1 zip s2).map { x => x._1 + x._2 }
(s1, s2).zipped.map { x => x._1 + x._2 } //This won't work! The method doesn't expect a tuple argument
//conversely
(s1, s2).zipped.map { (x, y) => x + y }
(s1 zip s2).map { (x, y) => x + y } //This won't work! The method doesn't expect 2 arguments
//Added note: methods with 2 arguments can often use the more concise underscore notation:
(s1, s2).zipped.map { _ + _ }
Note that if you use the case notation, you're covered either way:
//case works for both syntaxes
(s1, s2).zipped.map { case (x, y) => x + y }
(s1 zip s2).map { case (x, y) => x + y }
This works since the compiler understands this notation for methods with either two arguments, or a single tuple argument, as explained in section 8.5 of the spec:
val f: (Int, Int) => Int = { case (a, b) => a + b }
val g: ((Int, Int)) => Int = { case (a, b) => a + b }

Use zip:
for ((id, accountId) <- ids.zip(accounts)) {
// ...
}
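To tie both answers back to the question, here is a minimal sketch, assuming Scala 2.13 or later. Glimple is a hypothetical stand-in for the asker's GPGlimple class, and lazyZip is the 2.13 replacement for the zipped method discussed above.

```scala
object ZipIteration {
  // Hypothetical stand-in for the question's GPGlimple class (an assumption).
  final case class Glimple(id: Option[Int], accountId: Int, version: Int)

  // zip pairs elements by position and stops at the end of the shorter list.
  def build(ids: List[Int], accounts: List[Int]): List[Glimple] =
    for ((id, accountId) <- ids.zip(accounts)) yield Glimple(Some(id), accountId, 2)

  // lazyZip (Scala 2.13+) yields the same pairs without first building
  // an intermediate list of tuples; its map takes a two-argument function.
  def buildLazily(ids: List[Int], accounts: List[Int]): List[Glimple] =
    ids.lazyZip(accounts).map((id, accountId) => Glimple(Some(id), accountId, 2))
}
```

Both methods return one Glimple per (id, accountId) pair, so the per-element checks from the question can run inside either loop body.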

Related

SCALA: Fold method with conditions

I am still learning the basics of Scala, so I am asking for your understanding. Is there any way to use the fold method to print only the names beginning with "A"?
object Scala {
  val names: List[String] = List("Adam", "Mick", "Ann");
  def main(args: Array[String]) {
    println(names.foldLeft("my list of items starting with A: ")(_+_));
  }
}
Have a look at the signature of foldLeft
def foldLeft[B](z: B)(op: (B, A) => B): B
where
z is the initial value
op is a function taking two arguments, namely accumulated result so far B, and the next element to be processed A
returns the accumulated result B
Now consider this concrete implementation
val names: List[String] = List("Adam", "Mick", "Ann")
val predicate: String => Boolean = str => str.startsWith("A")
names.foldLeft(List.empty[String]) { (accumulated: List[String], next: String) =>
if (predicate(next)) accumulated.prepended(next) else accumulated
}
here
z = List.empty[String]
op = (accumulated: List[String], next: String) => if (predicate(next)) accumulated.prepended(next) else accumulated
Usually we would write this inlined and rely on type inference so we do not have to write out full types all the time, so it becomes
names.foldLeft(List.empty[String]) { (acc, next) =>
if (next.startsWith("A")) next :: acc else acc
}
// val res1: List[String] = List(Ann, Adam)
One of the key ideas when working with List is to always prepend an element rather than append one, as in
names.foldLeft(List.empty[String]) { (accumulated: List[String], next: String) =>
if (predicate(next)) accumulated.appended(next) else accumulated
}
because prepending is much more efficient. However note how this makes the accumulated result in reverse order, so
List(Ann, Adam)
instead of perhaps required
List(Adam, Ann)
so often-times we perform one last traversal by calling reverse like so
names.foldLeft(List.empty[String]) { (acc, next) =>
if (next.startsWith("A")) next :: acc else acc
}.reverse
// val res1: List[String] = List(Adam, Ann)
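As a rough comparison, the sketch below puts the foldLeft-plus-reverse version next to two alternatives that are not part of the answer above. One uses foldRight, which visits elements from right to left so that prepending keeps the original order. The other uses filter. The object and method names are made up for illustration.

```scala
object StartsWithA {
  val names: List[String] = List("Adam", "Mick", "Ann")

  // foldLeft with prepend, then one reverse to restore the original order.
  def viaFoldLeft(xs: List[String]): List[String] =
    xs.foldLeft(List.empty[String]) { (acc, next) =>
      if (next.startsWith("A")) next :: acc else acc
    }.reverse

  // foldRight processes the last element first, so prepending
  // already yields the elements in their original order.
  def viaFoldRight(xs: List[String]): List[String] =
    xs.foldRight(List.empty[String]) { (next, acc) =>
      if (next.startsWith("A")) next :: acc else acc
    }

  // filter states the intent directly when no other accumulation is needed.
  def viaFilter(xs: List[String]): List[String] =
    xs.filter(_.startsWith("A"))
}
```

All three return List(Adam, Ann) for the names in the question.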
The answer from @Mario Galic is a good one and should be accepted. (It's the polite thing to do).
Here's a slightly different way to filter for starts-with-A strings.
val names: List[String] = List("Adam", "Mick", "Ann")
println(names.foldLeft("my list of items starting with A: "){
case (acc, s"A$nme") => acc + s"A$nme "
case (acc, _ ) => acc
})
//output: "my list of items starting with A: Adam Ann"

Return all combinations for nested lists

I have the following data structure
val list = List(1,2,
List(3,4),
List(5,6,7)
)
I want to get this as a result
List(
List(1,2,3,5), List(1,2,3,6), List(1,2,3,7),
List(1,2,4,5), List(1,2,4,6), List(1,2,4,7)
)
Number of sub-lists in the input and number of elements in them can vary
P.S.
I'm trying to use this as a first step
list.map{
case x => List(x)
case list:List => list
}
and some for comprehension, but it won't work because I don't know how many elements each sublist of the result will have
Types like List[Any] are most often avoided in Scala – so much of the power of the language comes from its smart type system, and this kind of type impedes this. So your instinct to turn the list into a normalized List[List[Int]] is spot on:
val normalizedList = list.map {
case x: Int => List(x)
case list: List[Int @unchecked] => list
}
Note that this will eventually throw a runtime exception if list includes a List of some type other than Int, such as List[String], due to type erasure. This is exactly the kind of problem that arises when failing to use strong types! You can read more about strategies for dealing with type erasure here.
Once you have a normalized List[List[Int]], then you can use foldLeft to build the combinations. You are also correct in seeing that a for comprehension can work well here:
normalizedList.foldLeft(List(List.empty[Int])) { (acc, next) =>
for {
combo <- acc
num <- next
} yield (combo :+ num)
}
In each iteration of the foldLeft, we consider one more sublist (next) from the normalizedList. We look at each combination thus far constructed (each combo in acc), and then for each number num in next, we make a new combination by appending it to combo.
As you might know, for comprehensions are really syntactic sugar for map, flatMap, and filter operations. So we can also express this with those more primitive methods:
normalizedList.foldLeft(List(List.empty[Int])) { (acc, next) =>
acc.flatMap { combo =>
next.map { num => combo :+ num }
}
}
You can even use the (somewhat silly) /: alias for foldLeft (deprecated in Scala 2.13), switch the order of the maps, and use underscore syntax for ultimate brevity:
(List(List[Int]()) /: normalizedList) { (acc, next) => next.flatMap { num => acc.map(_ :+ num) } }
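Putting the normalization and the fold together, a minimal end-to-end sketch might look like the following. The object and method names are hypothetical, and the normalization here silently drops non-Int elements inside sublists, which is an assumption made for the example rather than something the answer prescribes.

```scala
object Combinations {
  // Turns a mixed List[Any] into List[List[Int]]: a bare Int becomes a
  // one-element list; inside sublists only Int elements are kept.
  // Any other top-level element causes a MatchError.
  def normalize(mixed: List[Any]): List[List[Int]] =
    mixed.map {
      case x: Int      => List(x)
      case xs: List[_] => xs.collect { case i: Int => i }
    }

  // Extends every combination built so far with every element of the next sublist.
  def combinations(lists: List[List[Int]]): List[List[Int]] =
    lists.foldLeft(List(List.empty[Int])) { (acc, next) =>
      for {
        combo <- acc
        num   <- next
      } yield combo :+ num
    }
}
```

For the input from the question, Combinations.combinations(Combinations.normalize(List(1, 2, List(3, 4), List(5, 6, 7)))) yields the six lists requested, from List(1, 2, 3, 5) through List(1, 2, 4, 7).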
val list = List(1,2,
List(3,4),
List(5,6,7)
)
def getAllCombinations(list: List[Any]) : List[List[Int]] ={
//normalize head so it is always a List
val headList: List[Int] = list.head match {
case i:Int => List(i)
case l:List[Int] => l
}
if(list.tail.nonEmpty){
// recursion for tail combinations
val tailCombinations : List[List[Int]] = getAllCombinations(list.tail)
//combine head combinations with tail combinations
headList.flatMap(
{i:Int => tailCombinations.map(
{l=>List(i).++(l)}
)
}
)
}
else{
headList.map(List(_))
}
}
print(getAllCombinations(list))
This can be achieved with the use of a foldLeft, as well. In the code below, each item of the outer List is folded into the List of Lists by combining each current list with each new item.
val list = List(1,2, List(3,4), List(5,6,7) )
val lxl0 = List( List[Int]() ) //start value for foldLeft
val lxl = list.foldLeft( lxl0 )( (lxl, i) => {
i match {
case i:Int => for( l <- lxl ) yield l :+ i
case newl:List[Int] => for( l <- lxl;
i <- newl ) yield l :+ i
}
})
lxl.map( _.mkString(",") ).foreach( println(_))
While I didn't use the map that you desired, I do believe that the code may be changed to do the map and make all elements List[Int]. Then, that may simplify the foldLeft to simply do the for-comprehension. I was not able to get that to work immediately, though ;)

How to generate transitive closure of set of tuples?

What is the best way to generate transitive closure of a set of tuples?
Example:
Input Set((1, 2), (2, 3), (3, 4), (5, 0))
Output Set((1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (5, 0))
//one transitive step
def addTransitive[A, B](s: Set[(A, B)]) = {
s ++ (for ((x1, y1) <- s; (x2, y2) <- s if y1 == x2) yield (x1, y2))
}
//repeat until we don't get a bigger set
def transitiveClosure[A,B](s:Set[(A,B)]):Set[(A,B)] = {
val t = addTransitive(s)
if (t.size == s.size) s else transitiveClosure(t)
}
println(transitiveClosure(Set((1,2), (2,3), (3,4))))
This is not a very efficient implementation, but it is simple.
With the help of unfold,
def unfoldRight[A, B](seed: B)(f: B => Option[(A, B)]): List[A] = f(seed) match {
case Some((a, b)) => a :: unfoldRight(b)(f)
case None => Nil
}
def unfoldLeft[A, B](seed: B)(f: B => Option[(B, A)]) = {
def loop(seed: B)(ls: List[A]): List[A] = f(seed) match {
case Some((b, a)) => loop(b)(a :: ls)
case None => ls
}
loop(seed)(Nil)
}
it becomes rather simple:
def transitiveClosure(input: Set[(Int, Int)]) = {
val map = input.toMap
def closure(seed: Int) = unfoldRight(map get seed) {
case Some(`seed`) => None
case Some(n) => Some(seed -> n -> (map get n))
case _ => None
}
map.keySet flatMap closure
}
Another way of writing closure is this:
def closure(seed: Int) = unfoldRight(seed) {
case n if map.get(n) != Some(seed) => map get n map (x => seed -> x -> x)
case _ => None
}
I'm not sure which way I like best, myself. I like the elegance of testing for Some(seed) to avoid loops, but, by the same token, I also like the elegance of mapping the result of map get n.
Neither version returns seed -> seed for loops, so you'll have to add that if needed. Here:
def closure(seed: Int) = unfoldRight(map get seed) {
case Some(`seed`) => Some(seed -> seed -> None)
case Some(n) => Some(seed -> n -> (map get n))
case _ => None
}
Model the problem as a directed graph as follows:
Represent the numbers in the tuples as vertices in a graph.
Then each tuple (x, y) represents a directed edge from x to y. After that, use Warshall's Algorithm to find the transitive closure of the graph.
For the resulting graph, each directed edge is then converted to an (x, y) tuple. That is the transitive closure of the set of tuples.
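The graph-based description above can be sketched directly on the set of tuples, without building an explicit adjacency matrix. This is only an illustration of Warshall's recurrence under the assumption that both tuple components have the same type. The object name is made up.

```scala
object Warshall {
  // Warshall's recurrence: for each intermediate vertex k, every vertex i
  // that reaches k is connected to every vertex j reachable from k.
  def transitiveClosure[A](edges: Set[(A, A)]): Set[(A, A)] = {
    // Every value that appears in any tuple is a vertex.
    val vertices: Set[A] = edges.flatMap { case (x, y) => Set(x, y) }

    vertices.foldLeft(edges) { (reach, k) =>
      val into  = reach.collect { case (i, `k`) => i } // sources of edges ending at k
      val outOf = reach.collect { case (`k`, j) => j } // targets of edges starting at k
      reach ++ (for (i <- into; j <- outOf) yield (i, j))
    }
  }
}
```

For the input Set((1, 2), (2, 3), (3, 4), (5, 0)) this returns the seven pairs listed in the question. Each of the n vertices is used as an intermediate exactly once, so the loop runs n times regardless of path lengths.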
Assuming that what you have is a DAG (there are no cycles in your example data), you could use the code below. It expects the DAG as a Map from T to List[T], which you could get from your input using
input.groupBy(_._1).map { case (x, pairs) => x -> pairs.map(_._2).toList }
Here's the transitive closure:
def transitiveClosure[T]( dag: Map[ T, List[T] ] ) = {
var tc = Map.empty[ T, List[T] ]
def getItemTC( item:T ): List[T] = tc.get(item) match {
case None =>
// vertices without outgoing edges are not keys of dag, so default to Nil
val itemTC = dag.getOrElse(item, Nil) flatMap ( x => x::getItemTC(x) )
tc += ( item -> itemTC )
itemTC
case Some(itemTC) => itemTC
}
dag.keys foreach getItemTC
tc
}
This code figures out the closure for each element just once. However:
This code can cause a stack overflow if there are long enough paths through the DAG (the recursion is not tail recursion).
For a large graph, you would probably be better off making tc a mutable Map and then converting it at the end if you wanted an immutable Map.
If your elements were really small integers as in your example, you could improve performance significantly by using Arrays rather than Maps, although doing so would complicate some things.
To eliminate the stack overflow problem (for DAGs), you could do a topological sort, reverse it, and process the items in order. But see also this page:
best known transitive closure algorithm for graph

Cartesian Product and Map Combined in Scala

This is a followup to: Expand a Set of Sets of Strings into Cartesian Product in Scala
The idea is you want to take:
val sets = Set(Set("a","b","c"), Set("1","2"), Set("S","T"))
and get back:
Set("a&1&S", "a&1&T", "a&2&S", ..., "c&2&T")
A general solution is:
def combine[A](f:(A, A) => A)(xs:Iterable[Iterable[A]]) =
xs.reduceLeft { (x, y) => x.view.flatMap {a => y.map(f(a, _)) } }
used as follows:
val expanded = combine{(x:String, y:String) => x + "&" + y}(sets).toSet
Theoretically, there should be a way to take input of type Set[Set[A]] and get back a Set[B]. That is, to convert the type while combining the elements.
An example usage would be to take in sets of strings (as above) and output the lengths of their concatenation. The f function in combine would be something of the form:
(a:Int, b:String) => a + b.length
I was not able to come up with an implementation. Does anyone have an answer?
If you really want your combiner function to do the mapping, you can use a fold but as Craig pointed out you'll have to provide a seed value:
def combine[A, B](f: B => A => B, zero: B)(xs: Iterable[Iterable[A]]) =
xs.foldLeft(Iterable(zero)) {
(x, y) => x.view flatMap { y map f(_) }
}
The fact that you need such a seed value follows from the combiner/mapper function type (B, A) => B (or, as a curried function, B => A => B). Clearly, to map the first A you encounter, you're going to need to supply a B.
You can make it somewhat simpler for callers by using a Zero type class:
trait Zero[T] {
def zero: T
}
object Zero {
implicit object IntHasZero extends Zero[Int] {
val zero = 0
}
// ... etc ...
}
Then the combine method can be defined as:
def combine[A, B : Zero](f: B => A => B)(xs: Iterable[Iterable[A]]) =
xs.foldLeft(Iterable(implicitly[Zero[B]].zero)) {
(x, y) => x.view flatMap { y map f(_) }
}
Usage:
combine((b: Int) => (a: String) => b + a.length)(sets)
Scalaz provides a Zero type class, along with a lot of other goodies for functional programming.
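For readers who want to see the seed-based fold produce concrete values, here is a small sketch that passes the zero explicitly instead of using a type class. The parameter order and the uncurried f are choices made for this example, not the answer's exact signature, and Lists are used instead of Sets so that the output order is predictable.

```scala
object CombineLengths {
  // Folds over the outer collection; each step pairs every accumulated B
  // with every A of the next inner collection.
  def combine[A, B](zero: B)(f: (B, A) => B)(xs: Iterable[Iterable[A]]): Iterable[B] =
    xs.foldLeft(Iterable(zero)) { (acc, next) =>
      acc.flatMap(b => next.map(a => f(b, a)))
    }

  val sets: List[List[String]] = List(List("a", "bb"), List("1", "22"))

  // Total length of each concatenation: "a1", "a22", "bb1", "bb22".
  val lengths: List[Int] =
    combine(0)((b: Int, a: String) => b + a.length)(sets).toList
  // lengths == List(2, 3, 3, 4)
}
```

With a Set[Set[String]] as input the same call works, but the order of the results is then unspecified.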
The problem that you're running into is that reduce(Left|Right) takes a function (A, A) => A which doesn't allow you to change the type. You want something more like foldLeft which takes (B, A) ⇒ B, allowing you to accumulate an output of a different type. folds need a seed value though, which can't be an empty collection here. You'd need to take xs apart into a head and tail, map the head iterable to be Iterable[B], and then call foldLeft with the mapped head, the tail, and some function (B, A) => B. That seems like more trouble than it's worth though, so I'd just do all the mapping up front.
def combine[A, B](f: (B, B) => B)(g: (A) => B)(xs:Iterable[Iterable[A]]) =
xs.map(_.map(g)).reduceLeft { (x, y) => x.view.flatMap {a => y.map(f(a, _)) } }
val sets = Set(Set(1, 2, 3), Set(3, 4), Set(5, 6, 7))
val expanded = combine{(x: String, y: String) => x + "&" + y}{(i: Int) => i.toString}(sets).toSet

How to interpret scaladoc?

How does foldRight[B](B) from scaladoc match the actual call foldRight(0)
args is an array of integers in string representation
val elems = args map Integer.parseInt
elems.foldRight(0) (_ + _)
Scaladoc says:
scala.Iterable.foldRight[B](B)((A, B) => B) : B
Combines the elements of this list together using the binary function f, from right to left, and starting with the value z.
@note Will not terminate for infinite-sized collections.
@return f(a0, f(a1, f(..., f(an, z)...))) if the list is [a0, a1, ..., an].
And, less importantly, what do the periods after f(an, z) mean?
As Steve said, the "..." is just an ellipsis, indicating a variable number of parameters that are not being shown.
Let's go to the Scaladoc, and show this step by step:
def foldRight[B](z: B)(op: (A, B) ⇒ B): B
That doesn't show enough. What is A? That is defined in the Iterable class (or whatever other class it is defined for):
trait Iterable[+A] extends AnyRef // Scala 2.7
trait Iterable[+A] extends Traversable[A] with GenericTraversableTemplate[A, Iterable] with IterableLike[A, Iterable[A]] // scala 2.8
Ok, so A is the type of the collection. In your example, A would stand for Int:
val elems = args map Integer.parseInt
Next, [B]. That's a type parameter. Basically, the following two calls are identical in practice, but the first has the type parameter inferred by the compiler:
elems.foldRight(0) (_ + _)
elems.foldRight[Int](0) (_ + _)
If you used 0L instead of 0, then B would stand for Long instead. If you passed a "" instead of 0, then B would stand for String. You can try these out, they all will work.
So, B is Int and z is 0. Note that there are two sets of parentheses in the declaration. That means the function is curried. It receives two sets of parameters, as well as the type parameter ([B]). What that means is that you can omit the second set of parameters, and that will return a function which takes that second set of parameters and returns the expected result. For example:
val elemsFolder: ((Int, Int) => Int) => Int = elems.foldRight(0)
Which you could then call like this:
elemsFolder(_ + _)
Anyway, the second set receives op, which is expected to be of type (A, B) => B. Or, in other words, a function which receives two parameters -- the first being the same type as the elements of the collection, and the second being the same type as z -- and returns a result of the same type as the second parameter. Since both A and B are Int, it will be a function of (Int, Int) => Int. If you passed "", then it would be a function of type (Int, String) => String.
Finally, the return type of foldRight is B, which means that whatever the type of z is, that will be the type returned by foldRight.
As for how foldRight works, it goes a bit like this:
def foldRight[B](z: B)(op: (A, B) => B): B = {
  var acc: B = z
  var it = this.reverse.elements // this.reverse.iterator on Scala 2.8
  while (!it.isEmpty) {
    // the element comes first and the accumulated value second
    acc = op(it.next, acc)
  }
  return acc
}
Which, I hope should be easy enough to understand.
Everything you need to know about foldLeft and foldRight can be gleaned from the following:
scala> List("1", "2", "3").foldRight("0"){(a, b) => "f(" + a + ", " + b + ")"}
res21: java.lang.String = f(1, f(2, f(3, 0)))
scala> List("1", "2", "3").foldLeft("0"){(a, b) => "f(" + a + ", " + b + ")"}
res22: java.lang.String = f(f(f(0, 1), 2), 3)
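The nesting shown by those two REPL results matters as soon as the operation is not commutative. The sketch below uses subtraction to show this, and adds a hand-written recursive foldRight over List (a hypothetical helper, not the library's implementation) that mirrors the f(a0, f(a1, ... f(an, z))) shape from the Scaladoc.

```scala
object FoldOrder {
  // Recursive definition that follows the Scaladoc formula literally:
  // op(a0, op(a1, ... op(an, z))). Not stack-safe for very long lists.
  def foldRightRec[A, B](xs: List[A], z: B)(op: (A, B) => B): B = xs match {
    case Nil          => z
    case head :: tail => op(head, foldRightRec(tail, z)(op))
  }

  val xs: List[Int] = List(1, 2, 3)

  val right: Int = xs.foldRight(0)(_ - _) // 1 - (2 - (3 - 0)) = 2
  val left: Int  = xs.foldLeft(0)(_ - _)  // ((0 - 1) - 2) - 3 = -6

  // B may differ from the element type: here A is Int and B is String.
  val digits: String = xs.foldRight("")((a, acc) => a.toString + acc) // "123"
}
```

Note how the element is the first argument of op in foldRight and the second in foldLeft.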