flatten and flatMap in scala - scala

I'll like to check if I have understood flatten and flatMap functions correctly.
1) Am I correct that flatten works only when a collection constitutes of other collections. Eg flatten would work on following lists
//list of lists
val l1 = List(List(1,1,2,-1,3,1,-4,5), List("a","b"))
//list of a set, list and map
val l2 = List(Set(1,2,3), List(4,5,6), Map('a'->"x",'b'->"y"))
But flatten will not work on following
val l3 = List(1,2,3)
val l4 = List(1,2,3,List('a','b'))
val s1 = "hello world"
val m1 = Map('h'->"h", 'e'->"e", 'l'->"l",'o'->"0")
'flatten' method would create a new list consisting of all elements by removing the hierarchy. Thus it sort of 'flattens' the collection and brings all elements at the same level.
l1.flatten
res0: List[Any] = List(1, 1, 2, -1, 3, 1, -4, 5, a, b)
l2.flatten
res1: List[Any] = List(1, 2, 3, 1, 5, 6, (a,x), (b,y))
2) 'flatMap' first applies a method to elements of a list and then flattens the list. As we noticed above, the flatten method works if lists have a hierarchy (contain other collections). Thus it is important that the method we apply to the elements returns a collection otherwise flatMap will not work
//we have list of lists
val l1 = List(List(1,1,2,-1,3,1,-4,5), List("a","b"))
l1 flatMap(x=>x.toSet)
res2: List[Any] = List(5, 1, -4, 2, 3, -1, a, b)
val l2 = List(Set(1,2,3), List(1,5,6), Map('a'->"x",'b'->"y"))
l2.flatMap(x=>x.toSet)
res3: List[Any] = List(1, 2, 3, 1, 5, 6, (a,x), (b,y))
val s1 = "hello world"
s1.flatMap(x=>Map(x->x.toString))
We notice above that s1.flatten didn't work but s1.flatMap did. This is because, in s1.flatMap, we convert elements of a String (characters) into a Map which is a collection. Thus the string got converted into a collection of Maps like (Map('h'->"h"), Map('e'->"e"), Map('l'->"l"),Map ('l'->"l"),Map('o'->"o")....) Thus flatten will work now. Note that the Map created is not Map('h'->"h", 'e'->"e", 'l'->"l",....).

Take a look at the full signature for flatten:
def flatten[B](implicit asTraversable: (A) ⇒ GenTraversableOnce[B]): List[B]
As you can see, flatten takes an implicit parameter. That parameter provides the rules for how to flatten the given collection types. If the compiler can't find an implicit in scope then it can be provided explicitly.
flatten can flatten almost anything as long as you provide the rules to do so.

Flatmap is basically a map operation followed by a flatten

Related

Does scala collection's flatten keep order?

I have looked at the documentation of flatten and the example seems to indicate the order of the elements in the result maintains the order of the input. Is there a documentation or source code we can reference to make sure that it is the case? Or, is the documentation "Converts this collection of traversable collections into a collection formed by the elements of these traversable collections" enough to confirm this?
Update: My original question was not clear enough. I wanted to ask about the collections that maintain order internally (like List) and we use the default implicit traversable in flatten(). prayagupd has answered this question.
If you read the flatten function of scala 2.12.x, you can see it sequentially adding given inputs to a new collection.
//a sequential view of the collection
private def sequential: TraversableOnce[A] = this.asInstanceOf[GenTraversableOnce[A]].seq
def flatten[B](implicit asTraversable: A => /*<:<!!!*/ GenTraversableOnce[B]): CC[B] = {
val b = genericBuilder[B]
for (xs <- sequential)
b ++= asTraversable(xs).seq
b.result()
}
you can verify with example as well,
scala> List(List("order1", "order2"), List("order10", "order11")).flatten
res1: List[String] = List(order1, order2, order10, order11)
The order remains same even if you provide your own traversable,
scala> val asTraversable: List[String] => List[String] = list => list.map(elem => s"mutated $elem")
asTraversable: List[String] => List[String] = $$Lambda$1271/1988351538#513bec8c
scala> List(List("order1", "order2"), List("order10", "order11")).flatten(asTraversable)
res2: List[String] = List(mutated order1, mutated order2, mutated order10, mutated order11)
NOTE: above only applies to underlying data-structure that maintain the order.
For example Set does not maintain order
scala> Set(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).seq
res3: scala.collection.immutable.Set[Int] = Set(5, 10, 1, 6, 9, 2, 7, 3, 8, 4)

Scala: flatMap with tuples

Why is the below statement valid for .map() but not for .flatMap()?
val tupled = input.map(x => (x*2, x*3))
//Compilation error: cannot resolve reference flatMap with such signature
val tupled = input.flatMap(x => (x*2, x*3))
This statement has no problem, though:
val tupled = input.flatMap(x => List(x*2, x*3))
Assuming input if of type List[Int], map takes a function from Int to A, whereas flatMap takes a function from Int to List[A].
Depending on your use case you can choose either one or the other, but they're definitely not interchangeable.
For instance, if you are merely transforming the elements of a List you typically want to use map:
List(1, 2, 3).map(x => x * 2) // List(2, 4, 6)
but you want to change the structure of the List and - for example - "explode" each element into another list then flattening them, flatMap is your friend:
List(1, 2, 3).flatMap(x => List.fill(x)(x)) // List(1, 2, 2, 3, 3, 3)
Using map you would have had List(List(1), List(2, 2), List(3, 3, 3)) instead.
To understand how this is working, it can be useful to explicitly unpack the functions you're sending to map and flatMap and examine their signatures. I've rewritten them here, so you can see that f is a function mapping from Int to an (Int, Int) tuple, and g is a function that maps from Int to a List[Int].
val f: (Int) => (Int, Int) = x => (x*2, x*3)
val g: (Int) => List[Int] = x => List(x*2, x*3)
List(1,2,3).map(f)
//res0: List[(Int, Int)] = List((2,3), (4,6), (6,9))
List(1,2,3).map(g)
//res1: List[List[Int]] = List(List(2, 3), List(4, 6), List(6, 9))
//List(1,2,3).flatMap(f) // This won't compile
List(1,2,3).flatMap(g)
//res2: List[Int] = List(2, 3, 4, 6, 6, 9)
So why won't flatMap(f) compile? Let's look at the signature for flatMap, in this case pulled from the List implementation:
final override def flatMap[B, That](f : scala.Function1[A, scala.collection.GenTraversableOnce[B]])(...)
This is a little difficult to unpack, and I've elided some of it, but the key is the GenTraversableOnce type. List, if you follow it's chain of inheritance up, has this as a trait it is built with, and thus a function that maps from some type A to a List (or any object with the GenTraversableOnce trait) will be a valid function. Notably, tuples do not have this trait.
That is the in-the-weeds explanation why the typing is wrong, and is worth explaining because any error that says 'cannot resolve reference with such a signature' means that it can't find a function that takes the explicit type you're offering. Types are very often inferred in Scala, and so you're well-served to make sure that the type you're giving is the type expected by the method you're calling.
Note that flatMap has a standard meaning in functional programming, which is, roughly speaking, any mapping function that consumes a single element and produces n elements, but your final result is the concatenation of all of those lists. Therefore, the function you pass to flatMap will always expect to produce a list, and no flatMap function would be expected to know how to act on single elements.

How map function does sorting in scala

The following code snippet
val keys=List(3,2,1,0)
val unsorted=List(1, 2, 3, 4)
val sorted =keys map unsorted
does sorting based on the keys.
Normally, map method takes a lambda and apply it to every element in the source. Instead the above code takes a list and does a sorting based on the index.
What is happening in this particular situation?
List is a partial function that goes from its indices to its values. You could look at it like this:
scala> List(3,2,1,0) map (i => List(1,2,3,4)(i))
res0: List[Int] = List(4, 3, 2, 1)
to verify
scala> val f: Int => Int = List(1, 2, 3, 4)
f: Int => Int = List(1, 2, 3, 4)
It does not sorting them, map takes a function, lets see:
that statement translates to:
keys.map(unsorted.apply)
Scala's List.apply(n: Int) function return nth element of list, in this case map makes it:
unsorted(3), unsorted(2), unsorted(1), unsorted(4)
which is
4,3,2,1 which is sorted only by chance in this case.
Test this:
scala> val keys = List(3,0,1,2)
keys: List[Int] = List(3, 0, 1, 2)
scala> val unsorted = List(10,20,30,40)
unsorted: List[Int] = List(10, 20, 30, 40)
scala> keys map unsorted
res0: List[Int] = List(40, 10, 20, 30)
In keys map unsorted for each value in keys we call the apply method in list unsorted which yields the element at a key position.
Swap elements in keys and note the outcome, for instance note the last key,
val keys=List(2,1,0,3)
val sorted = keys map unsorted
sorted: List[Int] = List(3, 2, 1, 4)
The function .map has not specific relation with sorting.
Your snipet is a specific case that happens to sort. If desugaring, it does as follows.
/* keys */List(3,2,1,0).map { i: Int =>
unsorted.apply(i)
}
It applies unsorted.apply with each value of keys (used as indexes).
It's not sorting at all. It is normal situation: map method takes a lambda and apply it to every element in the source. Just change your sorted list and you'll see it.

using for comprehension for 2-argument map function

I've tried to use zipped method for tuple to browse temporary zipped list
give it be something like it:
val l1 : List[Int] = List(1,2,3)
val l2 : List[Int] = List(2,3,1)
val l3 : List[Int] = for ( (a,b) <- (l1,l2).zipped ) yield a+b
It is a synthetic example and may be replaced with just map function, but I want to use it in more complicated for expressions.
It gives me error: wrong number of parameters; expected = 2 which make sense since (l1,l2).zipped.map has two arguments. What is the right way to translate two-argument map function inside for comprehension?
You cannot translate the zipped version into a for statement because the for is just
(l1,l2).zipped.map{ _ match { case (a,b) => a+b } }
and zipped's map requires two arguments, not one. For doesn't know about maps that take two arguments, but it does know how to do matches. Tuples are exactly what you need to convert two arguments into one, and zip will create them for you:
for ((a,b) <- (l1 zip l2)) yield a+b
at the cost of creating an extra object every iteration. In many cases this won't matter; when it does, you're better off writing it out in full. Actually, you're probably better yet using Array, at least if primitives feature heavily, so you can avoid boxing, and work off of indices.
scala> val l1 : List[Int] = List(1,2,3)
l1: List[Int] = List(1, 2, 3)
scala> val l2 : List[Int] = List(2,3,1)
l2: List[Int] = List(2, 3, 1)
scala> val l3 : List[Int] = for ( (a,b) <- (l1,l2).zip) yield a+b
l3: List[Int] = List(3, 5, 4)

Scala: Can I rely on the order of items in a Set?

This was quite an unplesant surprise:
scala> Set(1, 2, 3, 4, 5)
res18: scala.collection.immutable.Set[Int] = Set(4, 5, 1, 2, 3)
scala> Set(1, 2, 3, 4, 5).toList
res25: List[Int] = List(5, 1, 2, 3, 4)
The example by itself suggest a "no" answer to my question. Then what about ListSet?
scala> import scala.collection.immutable.ListSet
scala> ListSet(1, 2, 3, 4, 5)
res21: scala.collection.immutable.ListSet[Int] = Set(1, 2, 3, 4, 5)
This one seems to work, but should I rely on this behavior?
What other data structure is suitable for an immutable collection of unique items, where the original order must be preserved?
By the way, I do know about distict method in List. The problem is, I want to enforce uniqueness of items (while preserving the order) at interface level, so using distinct would mess up my neat design..
EDIT
ListSet doesn't seem very reliable either:
scala> ListSet(1, 2, 3, 4, 5).toList
res28: List[Int] = List(5, 4, 3, 2, 1)
EDIT2
In my search for a perfect design I tried this:
scala> class MyList[A](list: List[A]) { val values = list.distinct }
scala> implicit def toMyList[A](l: List[A]) = new MyList(l)
scala> implicit def fromMyList[A](l: MyList[A]) = l.values
Which actually works:
scala> val l1: MyList[Int] = List(1, 2, 3)
scala> l1.values
res0: List[Int] = List(1, 2, 3)
scala> val l2: List[Int] = new MyList(List(1, 2, 3))
l2: List[Int] = List(1, 2, 3)
The problem, however, is that I do not want to expose MyList outside the library. Is there any way to have the implicit conversion when overriding? For example:
trait T { def l: MyList[_] }
object O extends T { val l: MyList[_] = List(1, 2, 3) }
scala> O.l mkString(" ") // Let's test the implicit conversion
res7: String = 1 2 3
I'd like to do it like this:
object O extends T { val l = List(1, 2, 3) } // Doesn't work
That depends on the Set you are using. If you do not know which Set implementation you have, then the answer is simply, no you cannot be sure. In practice I usually encounter the following three cases:
I need the items in the Set to be ordered. For this I use classes mixing in the SortedSet trait which when you use only the Standard Scala API is always a TreeSet. It guarantees the elements are ordered according to their compareTo method (see the Ordered trat). You get a (very) small performance penalty for the sorting as the runtime of inserts/retrievals is now logarithmic, not (almost) constant like with the HashSet (assuming a good hash function).
You need to preserve the order in which the items are inserted. Then you use the LinkedHashSet. Practically as fast as the normal HashSet, needs a little more storage space for the additional links between elements.
You do not care about order in the Set. So you use a HashSet. (That is the default when using the Set.apply method like in your first example)
All this applies to Java as well, Java has a TreeSet, LinkedHashSet and HashSet and the corresponding interfaces SortedSet, Comparable and plain Set.
It is my belief that you should never rely on the order in a set. In no language.
Apart from that, have a look at this question which talks about this in depth.
ListSet will always return elements in the reverse order of insertion because it is backed by a List, and the optimal way of adding elements to a List is by prepending them.
Immutable data structures are problematic if you want first in, first out (a queue). You can get O(logn) or amortized O(1). Given the apparent need to build the set and then produce an iterator out of it (ie, you'll first put all elements, then you'll remove all elements), I don't see any way to amortize it.
You can rely that a ListSet will always return elements in last in, first out order (a stack). If that suffices, then go for it.