How does flatMap on a Map work in Scala?

This is my code
def testMap() = {
  val x = Map(
    1 -> Map(
      2 -> 3,
      3 -> 4
    ),
    5 -> Map(
      6 -> 7,
      7 -> 8
    )
  )
  for {
    (a, v) <- x
    (b, c) <- v
  } yield {
    a
  }
}
The code above gives
List(1, 1, 5, 5)
If I change the value yielded by the for comprehension from a to (a, b), the result is
Map(1 -> 3, 5 -> 7)
If I change (a, b) to (a, b, c), the result is
List((1,2,3), (1,3,4), (5,6,7), (5,7,8))
My question: what mechanism determines the result type of this for comprehension?

If you look at the details of the map method in the API documentation, you will find that it has a second, implicit parameter of type CanBuildFrom.
An instance of CanBuildFrom defines how a particular collection is built when you map over some other collection and produce a given element type.
In the case where you get a Map as the result, you are mapping over a Map and producing pairs (Tuple2). So the compiler searches for a CanBuildFrom instance that can handle that.
To find such an instance, the compiler looks in several places, e.g. the current scope, the class the method is invoked on, and its companion object.
In this case it finds an implicit called canBuildFrom in the companion object of Map that is suitable and can be used to build a Map as the result. So it tries to infer Map as the result type and, as this succeeds, uses this instance.
When you yield single values or triples instead, the instance found in the companion of Map does not have the required type, so the search continues up the inheritance tree and finds one in the companion object of Iterable. The instance there can build an Iterable of an arbitrary element type, so the compiler uses that.
So why do you get a List? Because that happens to be the implementation used there; the type system only guarantees you an Iterable.
If you want an Iterable instead of a Map, you can pass a CanBuildFrom instance explicitly (only possible if you call map and flatMap directly) or simply ascribe the expected type. You will also notice that you cannot request a List, even though you get one at runtime.
This won't work:
val l: List[Int] = Map(1->2).map(x=>3)
This however will:
val l: Iterable[Int] = Map(1->2).map(x=>3)
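The rule described above can be checked with a small sketch. The object and value names are made up for illustration, and the snippet deliberately avoids naming CanBuildFrom, so it should not depend on a particular collections implementation:

```scala
object ResultTypes {
  val m = Map(1 -> 2, 3 -> 4)

  // Yielding pairs: a Map can be rebuilt, so the static type is Map.
  val pairs: Map[Int, Int] = m.map { case (k, v) => (k, v * 10) }

  // Yielding single values: only an Iterable is promised.
  val singles: Iterable[Int] = m.map { case (k, _) => k }

  // Yielding triples: again only an Iterable.
  val triples: Iterable[(Int, Int, Int)] = m.map { case (k, v) => (k, v, k + v) }

  def main(args: Array[String]): Unit = {
    println(pairs)
    println(singles)
    println(triples)
  }
}
```

Changing the annotation on singles or triples to List[...] should be rejected by the compiler, which matches the "only an Iterable is guaranteed" point.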

To add to @dth's answer: if you want a list, you can do:
val l = Map(1->2,3->4).view.map( ... ).toList
Here map is applied to a lazy IterableView and also returns an IterableView; the actual construction is triggered by toList.
Note: not using view can also lead to surprising behavior. Example:
val m = Map(2->2,3->3)
val l = m.map{ case (k,v) => (k/2,v) }.toList
// List((1,3))
val l = m.view.map{ case (k,v) => (k/2,v) }.toList
// List((1,2), (1,3))
Here, omitting the .view makes map build a Map, which overwrites duplicate keys (and does additional, unnecessary work).
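Applied to the for comprehension from the question, the same idea might look like the sketch below. It uses toList rather than view (my substitution, not something the answers above prescribe), and the object name is hypothetical:

```scala
object KeepAllPairs {
  val x = Map(
    1 -> Map(2 -> 3, 3 -> 4),
    5 -> Map(6 -> 7, 7 -> 8)
  )

  // Both generators run over Lists, so nothing is collected into a Map
  // along the way and pairs with the same first component all survive.
  val pairs: List[(Int, Int)] =
    for {
      (a, v) <- x.toList
      (b, _) <- v.toList
    } yield (a, b)

  def main(args: Array[String]): Unit = println(pairs)
}
```

With the Maps left as they are, the same comprehension yields Map(1 -> 3, 5 -> 7), as shown in the question.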

Related

Is there an efficient way to avoid repeated evaluation with mapValues?

The mapValues method creates a new Map that modifies the results of queries to the original Map by applying the given function. If the same value is queried twice, the function passed to mapValues is called twice.
For example:
case class A(i: Int) {
  print("A")
}
case class B(a: A) {
  print("B")
}
case class C(b: B) {
  print("C")
}
val map = Map("One" -> 1)
  .mapValues(A)
  .mapValues(B)
  .mapValues(C)
val a = map.get("One")
val b = map.get("One")
This will print ABCABC because a new set of case classes is created each time the value is queried.
How can I efficiently make this into a concrete Map that has pre-computed the mapValues functions? Ideally I would like a mechanism that does nothing if the Map already has concrete values.
I know that I can call map.map(identity) but this would re-compute the index for the Map which seems inefficient. The same is true if the last mapValues is converted to a map.
The view method will turn a strict Map into a non-strict Map, but there does not seem to be a method to do the opposite.
You can call force on the view to force evaluation:
scala> val strictMap = map.view.force
ABCstrictMap: scala.collection.immutable.Map[String,C] = Map(One -> C(B(A(1))))
scala> strictMap.get("One")
res1: Option[C] = Some(C(B(A(1))))
scala> strictMap.get("One")
res2: Option[C] = Some(C(B(A(1))))
I'd be careful about assuming that this will perform better than a simple map, though. Even if it does, the difference is likely to be negligible compared to the noise, and it comes with the inconvenience of cross-building for 2.11 or 2.12 and for future Scala versions that will fix mapValues and change the view system entirely.
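For comparison, the "simple map" alternative could be sketched as follows. The counter and the expensive function are hypothetical and exist only to make the number of evaluations observable:

```scala
object StrictValues {
  // Hypothetical instrumentation: counts how often the function runs.
  var calls = 0
  def expensive(i: Int): String = { calls += 1; s"v$i" }

  // A plain map over the entries builds a strict Map once, up front.
  val strict: Map[String, String] =
    Map("One" -> 1).map { case (k, v) => k -> expensive(v) }

  // Repeated lookups read the stored value; the function is not re-run.
  val first: Option[String] = strict.get("One")
  val second: Option[String] = strict.get("One")
}
```

This does rebuild the map's index, which is the cost the question mentions; whether that matters in practice is something to measure rather than assume.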

What do -> and ! mean in Scala

I am reading some Scala code. What does the -> mean in the following context?
var queries = { "Select apple from farm" -> None, "Select orange from fram" -> None, "Select blueberry from anotherFarm" -> Some( #randomStuff ) }
It looks like a list of lambda functions but I thought it should be => in that case instead of ->.
Also,
what does this single line code mean?
def onConnection(id) = { application ! turnOnApplication(id) }
Specifically, I am confused with the use of !. It doesn't seem to be a "NOT" as it is in most languages
The -> symbol is one way to define a tuple in Scala. The below are all equivalent:
val apples1 = "Select apple from farm" -> None
val apples2 = ("Select apple from farm" -> None)
val apples3 = ("Select apple from farm", None)
As for the !:
def onConnection(id) = { application ! turnOnApplication(id) }
! in Scala can be the negation operator, but ! in the above code snippet looks like tell from Akka (Akka is the main actor library for Scala). This pattern is used to send a message to an actor. So if application is a reference to an actor, the code snippet sends the result of turnOnApplication(id) to the application actor. From the linked documentation:
"!" means “fire-and-forget”, e.g. send a message asynchronously and return immediately. Also known as tell.
The thin arrow -> is Tuple syntax. It's just a different way of writing Tuples. I.e.
val x: (Int, String) = 3 -> "abc"
Is the same as writing:
val x: (Int, String) = (3, "abc")
The arrow syntax is provided by the implicit class ArrowAssoc, which defines a method def ->[B](y: B): (A, B). ArrowAssoc is part of Predef, which is imported into every Scala source file. You can find the docs here.
The parenthesis syntax, meanwhile, is syntactic sugar handled by the compiler.
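A short sketch of the equivalence (the object and value names are illustrative only):

```scala
object ArrowSketch {
  val a: (Int, String) = 3 -> "abc"
  val b: (Int, String) = (3, "abc")

  // -> associates to the left, so chaining nests pairs
  // instead of building a triple.
  val nested: ((Int, Int), Int) = 1 -> 2 -> 3

  // Inside a Map literal, each `key -> value` is just a pair,
  // so both spellings can be mixed.
  val m: Map[String, Int] = Map("one" -> 1, ("two", 2))
}
```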
You can form a tuple using two syntaxes:
1) Using comma
val tuple = (1, 2)
2) Using -> (arrow)
val tuple = 1 -> 2
In the Scala REPL:
scala> val tuple = (1, 2)
tuple: (Int, Int) = (1,2)
scala> val tuple = 1 -> 2
tuple: (Int, Int) = (1,2)
The "Finding symbols" page classifies -> as a method provided by an implicit conversion. To find such methods, look at the methods tagged implicit that take, as a parameter, an object of the type the method is being called on. For example:
"a" -> 1 // Look for an implicit from String, AnyRef, Any or type parameter
In the above case, -> is defined in the class ArrowAssoc through the method any2ArrowAssoc that takes an object of type A, where A is an unbounded type parameter to the same method.
Tutorialspoint defines ! as the logical NOT operator: it reverses the logical state of its operand, so if a condition is true, logical NOT makes it false.
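Both readings of ! mentioned in this thread come down to ordinary methods. The sketch below uses a hypothetical Inbox class as a stand-in; it is not Akka's ActorRef and only shows that a method may be named ! and called in infix position:

```scala
object BangSketch {
  // Hypothetical stand-in for an actor reference; this is NOT Akka.
  final class Inbox {
    private val buffer = scala.collection.mutable.ListBuffer.empty[Any]
    // A method literally named `!`, usable as an infix operator.
    def !(message: Any): Unit = { buffer += message; () }
    def received: List[Any] = buffer.toList
  }

  val application = new Inbox
  application ! "turnOn" // same call as application.!("turnOn")
  application.!(42)

  // Prefix `!` on a Boolean is plain negation (the method unary_!).
  val negated: Boolean = !true
}
```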

Why do we need implicit parameters in Scala?

I am new to scala, and today when I came across this akka source code I was puzzled:
def traverse[A, B](in: JIterable[A], fn: JFunc[A, Future[B]],
                   executor: ExecutionContext): Future[JIterable[B]] = {
  implicit val d = executor
  scala.collection.JavaConversions.iterableAsScalaIterable(in).foldLeft(
    Future(new JLinkedList[B]())) { (fr, a) ⇒
      val fb = fn(a)
      for (r ← fr; b ← fb) yield { r add b; r }
    }
}
Why is the code intentionally written using implicit parameters? Why can't it be written as:
scala.collection.JavaConversions.iterableAsScalaIterable(in).foldLeft(
Future(new JLinkedList[B](),executor))
without declaring a new implicit variable d? Is there any advantage to doing this? So far I only find that implicits increase the ambiguity of the code.
I can give you 3 reasons.
1) It hides boilerplate code.
Let's sort some lists:
import math.Ordering
List(1, 2, 3).sorted(Ordering.Int) // Fine. I can tell compiler how to sort ints
List("a", "b", "c").sorted(Ordering.String) // .. and strings.
List(1 -> "a", 2 -> "b", 3 -> "c").sorted(Ordering.Tuple2(Ordering.Int, Ordering.String)) // Not so fine...
With implicit parameters:
List(1, 2, 3).sorted // Compiler knows how to sort ints
List(1 -> "a", 2 -> "b", 3 -> "c").sorted // ... and some other types
2) It allows you to create APIs with generic methods:
scala> (70 to 75).map{ _.toChar }
res0: scala.collection.immutable.IndexedSeq[Char] = Vector(F, G, H, I, J, K)
scala> (70 to 75).map{ _.toChar }(collection.breakOut): String // You can change default behaviour.
res1: String = FGHIJK
3) It allows you to focus on what really matters:
Future(new JLinkedList[B]())(executor) // matters: what to do - `new JLinkedList[B]()`. doesn't: how to do it - `executor`
It's not so bad, but what if you need 2 futures:
val f1 = Future(1)(executor)
val f2 = Future(2)(executor) // You have to specify the same executor every time.
Implicit creates "context" for all actions:
implicit val d = executor // All `Future` in this scope will be created with this executor.
val f1 = Future(1)
val f2 = Future(2)
3.5) Implicit parameters allow type-level programming. See shapeless.
About "ambiguity of the code":
You don't have to use implicits, alternatively you can specify all parameters explicitly. It looks ugly sometimes (see sorted example), but you can do it.
If you can't find which implicit values are used as parameters, you can ask the compiler:
>echo object Test { List( (1, "a") ).sorted } > test.scala
>scalac -Xprint:typer test.scala
You'll find math.this.Ordering.Tuple2[Int, java.lang.String](math.this.Ordering.Int, math.this.Ordering.String) in output.
In the code from Akka you linked, it is true that executor could just be passed explicitly. But if more than one Future were used throughout this method, declaring an implicit parameter would definitely make sense, to avoid passing it around many times.
So I would say that in the code you linked, the implicit parameter was used just to follow a code style. It would be ugly to make an exception to it.
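The trade-off can be sketched with the standard scala.concurrent API. The object and method names are hypothetical. Note that map and flatMap on a Future also take an implicit ExecutionContext, so the explicit variant has to repeat the executor for every combinator, and a for comprehension only compiles when one is available implicitly:

```scala
import scala.concurrent.{ExecutionContext, Future}

object ImplicitExecutor {
  // Explicit style: the executor is threaded through every call by hand.
  def sumExplicit(executor: ExecutionContext): Future[Int] = {
    val f1 = Future(1)(executor)
    val f2 = Future(2)(executor)
    f1.flatMap(a => f2.map(b => a + b)(executor))(executor)
  }

  // Implicit style: declared once, then picked up by Future.apply
  // and by the flatMap/map calls the for comprehension expands to.
  def sumImplicit(executor: ExecutionContext): Future[Int] = {
    implicit val ec: ExecutionContext = executor
    val f1 = Future(1)
    val f2 = Future(2)
    for (a <- f1; b <- f2) yield a + b
  }
}
```

This may be part of why the Akka snippet declares implicit val d: its for comprehension over fr and fb needs an executor in implicit scope.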
Your question intrigued me, so I searched a bit on the net. Here's what I found on this blog: http://daily-scala.blogspot.in/2010/04/implicit-parameters.html
What is an implicit parameter?
An implicit parameter is a parameter to method or constructor that is marked as implicit. This means that if a parameter value is not supplied then the compiler will search for an "implicit" value defined within scope (according to resolution rules.)
Why use an implicit parameter?
Implicit parameters are very nice for simplifying APIs. For example the collections use implicit parameters to supply CanBuildFrom objects for many of the collection methods. This is because normally the user does not need to be concerned with those parameters. Another example is supplying an encoding to an IO library so the encoding is defined once (perhaps in a package object) and all methods can use the same encoding without having to define it for every method call.
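The encoding example from the quote might look roughly like this. Encoding and toBytes are hypothetical names invented for the sketch, not part of any real IO library:

```scala
object EncodingSketch {
  // Hypothetical wrapper type, so the implicit is not just any String.
  final case class Encoding(name: String)

  // Hypothetical library method: the encoding is an implicit parameter.
  def toBytes(text: String)(implicit enc: Encoding): Array[Byte] =
    text.getBytes(enc.name)

  // Defined once; every call in scope picks it up.
  implicit val defaultEncoding: Encoding = Encoding("UTF-8")

  val viaImplicit: Array[Byte] = toBytes("abc")

  // The parameter can still be passed explicitly to override the default.
  val viaExplicit: Array[Byte] = toBytes("abc")(Encoding("UTF-16BE"))
}
```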

Scala SortedMap.map method returns non-sorted map when static type is Map

I encountered some unauthorized strangeness working with Scala's SortedMap[A,B]. If I declare the reference to SortedMap[A,B] "a" to be of type Map[A,B], then map operations on "a" will produce a non-sorted map implementation.
Example:
import scala.collection.immutable._
object Test extends App {
  val a: Map[String, String] = SortedMap[String, String]("a" -> "s", "b" -> "t", "c" -> "u", "d" -> "v", "e" -> "w", "f" -> "x")
  println(a.getClass+": "+a)
  val b = a map {x => x} // identity
  println(b.getClass+": "+b)
}
The output of the above is:
class scala.collection.immutable.TreeMap: Map(a -> s, b -> t, c -> u, d -> v, e -> w, f -> x)
class scala.collection.immutable.HashMap$HashTrieMap: Map(e -> w, f -> x, a -> s, b -> t, c -> u, d -> v)
The order of key/value pairs before and after the identity transformation is not the same.
The strange thing is that removing the type declaration from "a" makes this issue go away. That's fine in a toy example, but makes SortedMap[A,B] unusable for passing to methods that expect Map[A,B] parameters.
In general, I would expect higher order functions such as "map" and "filter" to not change the fundamental properties of the collections they are applied to.
Does anyone know why "map" is behaving like this?
The map method, like most of the collection methods, isn't defined specifically for SortedMap. It is defined on a higher-level class (TraversableLike) and uses a "builder" to turn the mapped result into the correct return type.
So how does it decide what the "correct" return type is? Well, it tries to give you back the return type that it started out as. When you tell Scala that you have a Map[String,String] and ask it to map, then the builder has to figure out how to "build" the type for returning. Since you told Scala that the input was a Map[String,String], the builder decides to build a Map[String,String] for you. The builder doesn't know that you wanted a SortedMap, so it doesn't give you one.
The reason it works when you leave off the Map[String,String] type annotation is that Scala infers that the type of a is SortedMap[String,String]. Thus, when you call map, you are calling it on a SortedMap, and the builder knows to construct a SortedMap for returning.
As far as your assertion that methods shouldn't change "fundamental properties", I think you're looking at it from the wrong angle. The methods will always give you back an object that conforms to the type that you specify. It's the type that defines the behavior of the builder, not the underlying implementation. When you think about it like that, it's the type that forms the contract for how methods should behave.
Why might we want this?
Why is this the preferred behavior? Let's look at a concrete example. Say we have a SortedMap[Int,String]
val sortedMap = SortedMap[Int, String](1 -> "s", 2 -> "t", 3 -> "u", 4 -> "v")
If I were to map over it with a function that modifies the keys, I run the risk of losing elements when their keys clash:
scala> sortedMap.map { case (k, v) => (k / 2, v) }
res3: SortedMap[Int,String] = Map(0 -> s, 1 -> u, 2 -> v)
But hey, that's fine. It's a Map after all, and I know it's a Map, so I should expect that behavior.
Now let's say we have a function that accepts an Iterable of pairs:
def f(iterable: Iterable[(Int, String)]) =
iterable.map { case (k, v) => (k / 2, v) }
Since this function has nothing to do with Maps, it would be very surprising if the result of this function ever had fewer elements than the input. After all, map on an Iterable should produce the mapped version of each element. But a Map is an Iterable of pairs, so we can pass it into this function. So what happens in Scala when we do?
scala> f(sortedMap)
res4: Iterable[(Int, String)] = List((0,s), (1,t), (1,u), (2,v))
Look at that! No elements lost! In other words, Scala won't surprise us by violating our expectations about how map on an Iterable should work. If the builder instead tried to produce a SortedMap based on the fact that the input was a SortedMap, then our function f would have surprising results, and this would be bad.
So the moral of the story is: Use the types to tell the collections framework how to deal with your data. If you want your code to be able to expect that a map is sorted, then you should type it as SortedMap.
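A compact way to see the static type steering the result is the sketch below. The names are illustrative, and the commented-out line is one I expect the compiler to reject rather than something taken from the question:

```scala
import scala.collection.immutable.SortedMap

object StaticTypeSketch {
  val sorted: SortedMap[String, String] =
    SortedMap("c" -> "u", "a" -> "s", "b" -> "t")

  // Same object, but the compiler now only knows it is a Map.
  val asMap: Map[String, String] = sorted

  // Static type SortedMap: the result is typed as a SortedMap.
  val stillSorted: SortedMap[String, String] =
    sorted.map { case (k, v) => (k, v) }

  // Static type Map: the result is only promised to be a Map.
  val plainMap: Map[String, String] =
    asMap.map { case (k, v) => (k, v) }

  // Expected not to compile, because asMap.map only yields a Map:
  // val wrong: SortedMap[String, String] = asMap.map { case (k, v) => (k, v) }
}
```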
The signature of map is:
def map[B, That](f: ((A, B)) ⇒ B)(implicit bf: CanBuildFrom[Map[A, B], B, That]): That
The implicit parameter bf is used to build the resulting collection. So in your example, since the type of a is Map[String, String], the type of bf is:
val cbf = implicitly[CanBuildFrom[Map[String, String], (String, String), Map[String, String]]]
Which just builds a Map[String, String] which doesn't have any of the properties of the SortedMap. See:
cbf() ++= List("b" -> "c", "e" -> "g", "a" -> "b") result
For more information, see this excellent article: http://docs.scala-lang.org/overviews/core/architecture-of-scala-collections.html
As dyross points out, it's the Builder, which is chosen (via the CanBuildFrom) on the basis of the target type, that determines the class of the collection you get out of a map operation. Now this might not be the behaviour that you wanted, but it does, for example, allow you to select the target type:
val b: SortedMap[String, String] = a.map(x => x)(collection.breakOut)
(breakOut gives a generic CanBuildFrom whose type is determined by context, i.e. our type annotation.)
So you could add some type parameters that allow you to accept any sort of Map or Traversable (see this question), which would allow you to do a map operation in your method while retaining the correct type information, but as you can see it's not straightforward.
I think a much simpler approach is instead to define functions that you apply to your collections using the collections' map, flatMap etc methods, rather than by sending the collection itself to a method.
i.e. instead of
def f[Complex type parameters](xs: ...)(complex implicits) = ...
val result = f(xs)
do
val f: X => Y = ...
val results = xs map f
In short: you explicitly declared a to be of type Map, and the Scala collections framework tries very hard for higher order functions such as map and filter to not change the fundamental properties of the collections they are applied to, therefore it will also return a Map since that is what you explicitly told it you wanted.
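If the caller needs a SortedMap back from a reference typed as Map, one alternative to the breakOut approach shown earlier is to rebuild it explicitly. This is a sketch of my own under that assumption, not something the answers above propose, and unlike breakOut it creates an intermediate Map first:

```scala
import scala.collection.immutable.SortedMap

object RebuildSorted {
  // Statically a Map, even though the runtime object is sorted.
  val a: Map[String, String] = SortedMap("b" -> "t", "a" -> "s", "c" -> "u")

  // Map first (static type Map), then pour the pairs into an empty
  // SortedMap so that the result is statically sorted again.
  val b: SortedMap[String, String] =
    SortedMap.empty[String, String] ++ a.map { case (k, v) => (k.toUpperCase, v) }
}
```

Because the intermediate result is a Map, keys that collide after the transformation are already merged before the SortedMap is built.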

How do I form the union of scala SortedMaps?

(I'm using Scala nightlies, and see the same behaviour in 2.8.0b1 RC4. I'm a Scala newcomer.)
I have two SortedMaps that I'd like to form the union of. Here's the code I'd like to use:
import scala.collection._
object ViewBoundExample {
  class X
  def combine[Y](a: SortedMap[X, Y], b: SortedMap[X, Y]): SortedMap[X, Y] = {
    a ++ b
  }
  implicit def orderedX(x: X): Ordered[X] = new Ordered[X] { def compare(that: X) = 0 }
}
The idea here is that the 'implicit' statement means Xs can be converted to Ordered[X]s, and then it makes sense to combine SortedMaps into another SortedMap, rather than just a map.
When I compile, I get
sieversii:scala-2.8.0.Beta1-RC4 scott$ bin/scalac -version
Scala compiler version 2.8.0.Beta1-RC4 -- Copyright 2002-2010, LAMP/EPFL
sieversii:scala-2.8.0.Beta1-RC4 scott$ bin/scalac ViewBoundExample.scala
ViewBoundExample.scala:8: error: type arguments [ViewBoundExample.X] do not
conform to method ordered's type parameter bounds [A <: scala.math.Ordered[A]]
a ++ b
^
one error found
It seems my problem would go away if that type parameter bound was [A <% scala.math.Ordered[A]], rather than [A <: scala.math.Ordered[A]]. Unfortunately, I can't even work out where the method 'ordered' lives! Can anyone help me track it down?
Failing that, what am I meant to do to produce the union of two SortedMaps? If I remove the return type of combine (or change it to Map) everything works fine --- but then I can't rely on the return being sorted!
Currently, what you are using is the scala.collection.SortedMap trait, whose ++ method is inherited from the MapLike trait. Therefore, you see the following behaviour:
scala> import scala.collection.SortedMap
import scala.collection.SortedMap
scala> val a = SortedMap(1->2, 3->4)
a: scala.collection.SortedMap[Int,Int] = Map(1 -> 2, 3 -> 4)
scala> val b = SortedMap(2->3, 4->5)
b: scala.collection.SortedMap[Int,Int] = Map(2 -> 3, 4 -> 5)
scala> a ++ b
res0: scala.collection.Map[Int,Int] = Map(1 -> 2, 2 -> 3, 3 -> 4, 4 -> 5)
scala> b ++ a
res1: scala.collection.Map[Int,Int] = Map(1 -> 2, 2 -> 3, 3 -> 4, 4 -> 5)
The return type of ++ is Map[Int, Int], because that is the only type that makes sense for the ++ method of a MapLike object to return. It seems that ++ keeps the sorted property of the SortedMap, which I guess is because ++ uses abstract methods to do the concatenation, and those abstract methods are defined so as to keep the order of the map.
To have the union of two sorted maps, I suggest you use scala.collection.immutable.SortedMap.
scala> import scala.collection.immutable.SortedMap
import scala.collection.immutable.SortedMap
scala> val a = SortedMap(1->2, 3->4)
a: scala.collection.immutable.SortedMap[Int,Int] = Map(1 -> 2, 3 -> 4)
scala> val b = SortedMap(2->3, 4->5)
b: scala.collection.immutable.SortedMap[Int,Int] = Map(2 -> 3, 4 -> 5)
scala> a ++ b
res2: scala.collection.immutable.SortedMap[Int,Int] = Map(1 -> 2, 2 -> 3, 3 -> 4, 4 -> 5)
scala> b ++ a
res3: scala.collection.immutable.SortedMap[Int,Int] = Map(1 -> 2, 2 -> 3, 3 -> 4, 4 -> 5)
This implementation of the SortedMap trait declares a ++ method which returns a SortedMap.
Now a couple of answers to your questions about the type bounds:
Ordered[T] is a trait which, if mixed into a class, specifies that instances of that class can be compared using <, >, <= and >=. You just have to define the abstract method compare(that: T), which returns a negative number for this < that, a positive number for this > that, and 0 for this == that. All the other methods are then implemented in the trait based on the result of compare.
T <% U represents a view bound in Scala. This means that type T is either a subtype of U or can be converted to U by an implicit conversion in scope. The code works if you put <% but not with <:, as X is not a subtype of Ordered[X] but can be implicitly converted to Ordered[X] using the orderedX implicit conversion.
Edit: Regarding your comment. If you are using the scala.collection.immutable.SortedMap, you are still programming to an interface not to an implementation, as the immutable SortedMap is defined as a trait. You can view it as a more specialised trait of scala.collection.SortedMap, which provides additional operations (like the ++ which returns a SortedMap) and the property of being immutable. This is in line with the Scala philosophy - prefer immutability - therefore I don't see any problem of using the immutable SortedMap. In this case you can guarantee the fact that the result will definitely be sorted, and this can't be changed as the collection is immutable.
Though, I still find it strange that scala.collection.SortedMap does not provide a ++ method which returns a SortedMap as a result. All the limited testing I have done seems to suggest that concatenating two scala.collection.SortedMaps does indeed produce a map which keeps the sorted property.
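The view bound described above is shorthand for an extra implicit parameter of function type. A minimal sketch, with a hypothetical method larger and a rank field added to X so that the comparison is meaningful:

```scala
import scala.language.implicitConversions

object ViewBoundSketch {
  final class X(val rank: Int)

  // X does not extend Ordered[X]; it can only be converted to one.
  implicit def orderedX(x: X): Ordered[X] = new Ordered[X] {
    def compare(that: X): Int = java.lang.Integer.compare(x.rank, that.rank)
  }

  // Roughly what `def larger[T <% Ordered[T]](a: T, b: T)` expands to:
  // the conversion arrives as an implicit function parameter.
  def larger[T](a: T, b: T)(implicit view: T => Ordered[T]): T =
    if (a > b) a else b

  val winner: Int = larger(new X(1), new X(2)).rank
}
```

A bound written as T <: Ordered[T] would reject X here, since no conversion is consulted for subtype bounds.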
Have you picked a tough nut to crack as a beginner to Scala! :-)
Ok, brief tour, don't expect to fully understand it right now. First, note that the problem happens at the method ++. Searching for its definition, we find it at the trait MapLike, receiving either an Iterator or a Traversable. Since b is a SortedMap, it is the Traversable version that is being used.
Note in its extensive type signature that there is a CanBuildFrom being passed. It is being passed implicitly, so you don't normally need to worry about it. However, to understand what is going on, this time you do.
You can locate CanBuildFrom by either clicking on it where it appears in the definition of ++, or by filtering. As mentioned by Randall in the comments, there's an unmarked blank field on the upper left of the scaladoc page. You just have to click there and type, and it will return matches for whatever you typed.
So, look up the trait CanBuildFrom on ScalaDoc and select it. It has a large number of subclasses, each one responsible for building a specific type of collection. Search for and click on the subclass SortedMapCanBuildFrom. This is the class of the object you need to produce a SortedMap from a Traversable. Note on the instance constructor (the constructor for the class) that it receives an implicit Ordering parameter. Now we are getting closer.
This time, use the filter to search for Ordering. Its companion object (click on the small "o" next to the name) hosts an implicit that will generate Orderings, as companion objects are examined for implicits generating instances or conversions for that class. It is defined inside the trait LowPriorityOrderingImplicits, which object Ordering extends, and looking at it you'll see the method ordered[A <: Ordered[A]], which will produce the Ordering required... or would produce it, if only there wasn't a problem.
One might assume the implicit conversion from X to Ordered[X] would be enough, just as I had before looking more carefully into this. That, however, is a conversion of objects, and ordered expects to receive a type which is a subtype of Ordered[X]. While one can convert an object of type X to an object of type Ordered[X], X, itself, is not a subtype of Ordered[X], so it can't be passed as a parameter to ordered.
On the other hand, you can create an implicit val Ordering[X], instead of the def Ordered[X], and you'll get around the problem. Specifically:
import scala.collection._
object ViewBoundExample {
  class X
  def combine[Y](a: SortedMap[X, Y], b: SortedMap[X, Y]): SortedMap[X, Y] = {
    a ++ b
  }
  // A single implicit Ordering[X] replaces the X => Ordered[X] conversion
  implicit val orderingX = new Ordering[X] { def compare(x: X, y: X) = 0 }
}
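The snippet above treats all X values as equal. A variant with a meaningful ordering could look like the following sketch, which assumes a hypothetical id field on X and uses the immutable SortedMap recommended in the other answer:

```scala
import scala.collection.immutable.SortedMap

object UnionSketch {
  // Hypothetical key type with a field to order by.
  final case class X(id: Int)

  // One Ordering instance serves all X values.
  implicit val orderingX: Ordering[X] = Ordering.by[X, Int](_.id)

  def combine[Y](a: SortedMap[X, Y], b: SortedMap[X, Y]): SortedMap[X, Y] =
    a ++ b

  val left = SortedMap(X(3) -> "c", X(1) -> "a")
  val right = SortedMap(X(2) -> "b", X(1) -> "z")

  // Keys come out ordered by id; on a clash the right operand wins.
  val union: SortedMap[X, String] = combine(left, right)
}
```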
I think most people's initial reaction to Ordered/Ordering must be one of perplexity: why have two classes for the same thing? The former extends java.lang.Comparable, whereas the latter extends java.util.Comparator. Alas, the type signature for compare pretty much sums up the main difference:
def compare(that: A): Int // Ordered
def compare(x: T, y: T): Int // Ordering
The use of an Ordered[A] requires for either A to extend Ordered[A], which would require one to be able to modify A's definition, or to pass along a method which can convert an A into an Ordered[A]. Scala is perfectly capable of doing the latter easily, but then you have to convert each instance before comparing.
On the other hand, the use of Ordering[A] requires the creation of a single object, such as demonstrated above. When you use it, you just pass two objects of type A to compare -- no objects get created in the process.
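The two styles side by side, as a sketch with made-up types Version and Point:

```scala
object OrderedVsOrdering {
  // Ordered: the class itself has to extend the trait.
  final case class Version(major: Int) extends Ordered[Version] {
    def compare(that: Version): Int = java.lang.Integer.compare(major, that.major)
  }

  // Ordering: a separate object; Point itself needs no changes.
  final case class Point(x: Int, y: Int)
  val byX: Ordering[Point] = new Ordering[Point] {
    def compare(a: Point, b: Point): Int = java.lang.Integer.compare(a.x, b.x)
  }

  // Ordered gives the instances comparison operators directly.
  val olderFirst: Boolean = Version(1) < Version(2)

  // sorted wants an Ordering; for a subtype of Ordered one is derived
  // implicitly, while for Point it is passed in explicitly here.
  val sortedVersions: List[Version] = List(Version(2), Version(1)).sorted
  val sortedPoints: List[Point] = List(Point(3, 0), Point(1, 9)).sorted(byX)
}
```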
So there are some performance gains to be had, but there is a much more important reason for Scala's preference for Ordering over Ordered. Look again on the companion object to Ordering. You'll note that there are several implicits for many of Scala classes defined in there. You may recall I mentioned earlier that an implicit for class T will be searched for inside the companion object of T, and that's exactly what is going on.
This could be done for Ordered as well. However, and this is the sticking point, that means every method supporting both Ordering and Ordered would fail! That's because Scala would look for an implicit to make it work, and would find two: one for Ordering, one for Ordered. Being unable to decide which one you wanted, Scala gives up with an error message. So, a choice had to be made, and Ordering had more going for it.
Duh, I forgot to explain why the signature isn't defined as ordered[A <% Ordered[A]], instead of ordered[A <: Ordered[A]]. I suspect doing so would cause the double implicits failure I have mentioned before, but I'll ask the guy who actually did this stuff and had the double implicit problems whether this particular method is problematic.