Type alias for immutable collections - scala

What is the best way to resolve the compilation error in the example below? Assume that 'm' must be of type GenMap and I do not have control over the arguments of myFun.
import scala.collection.GenMap
object Test {
def myFun(m: Map[Int, String]) = m
val m: GenMap[Int, String] = Map(1 -> "One", 2 -> "two")
//Build error here on m.seq
// Found scala.collection.Map[Int, String]
// Required scala.collection.immutable.Map[Int, String]
val result = myFun(m.seq)
}
EDIT:
I should have been clearer. In my actual use-case I don't have control over myFun, so I have to pass it a Map. The 'm' also arises from another scala component as a GenMap. I need to convert one to another, but there appears to be a conflict between collection.Map and collection.immutable.Map

m.seq.toMap will solve your problem.
According to its signature in the API documentation, toMap returns a scala.collection.immutable.Map, which is exactly the type your error message says is required. The scala.collection.Map returned by the seq method is a more general trait: it is a parent not only of the immutable map but also of the mutable and concurrent maps.
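As a minimal sketch of that conversion (the object name ToMapSketch and the sample values are only illustrative, and a plain scala.collection.Map stands in for the result of m.seq so the snippet does not depend on GenMap):

```scala
import scala.collection.mutable

object ToMapSketch {
  // Same shape as myFun in the question: it insists on the immutable Map.
  def myFun(m: scala.collection.immutable.Map[Int, String]): scala.collection.immutable.Map[Int, String] = m

  // Statically this is only the general scala.collection.Map,
  // so passing it to myFun directly would not compile.
  val general: scala.collection.Map[Int, String] =
    mutable.Map(1 -> "One", 2 -> "two")

  // toMap copies the entries into a scala.collection.immutable.Map.
  val result = myFun(general.toMap)
}
```

The copy is linear in the size of the map, which is the price of getting the narrower type.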

Related

Scala: list to set using flatMap

I have a class with a field of type Set[String], and a list of objects of this class. I'd like to collect all the strings from all the sets of these objects into one set. Here is how I can already do it:
case class MyClass(field: Set[String])
val list = List(
MyClass(Set("123")),
MyClass(Set("456", "798")),
MyClass(Set("123", "798"))
)
list.flatMap(_.field).toSet // Set(123, 456, 798)
It works, but I think I can achieve the same using only flatMap, without the toSet invocation. I tried this, but it gave a compilation error:
// error: Cannot construct a collection of type Set[String]
// with elements of type String based on a collection of type List[MyClass].
list.flatMap[String, Set[String]](_.field)
If I change the type of list to Set (i.e., val list = Set(...)), then this flatMap invocation works.
So, can I somehow use Set.canBuildFrom or any other CanBuildFrom object to invoke flatMap on a List, so that I get a Set as the result?
The CanBuildFrom instance you want is called breakOut and has to be provided as a second parameter:
import scala.collection.breakOut
case class MyClass(field: Set[String])
val list = List(
MyClass(Set("123")),
MyClass(Set("456", "798")),
MyClass(Set("123", "798"))
)
val s: Set[String] = list.flatMap(_.field)(breakOut)
Note that the explicit type annotation on the variable s is mandatory: that is how the result type is chosen.
Edit:
If you're using Scalaz or cats, you can use foldMap as well:
import scalaz._, Scalaz._
list.foldMap(_.field)
This does essentially what mdms answer proposes, except the Set.empty and ++ parts are already baked in.
The way flatMap works in Scala is that it removes exactly one level of wrapping when both wrappers are of the same type, i.e. List[List[String]] -> flatMap -> List[String].
If you apply flatMap to two different wrapper types, the result always has the type of the outer wrapper, i.e. List[Set[String]] -> flatMap -> List[String].
If you want the result to have a different wrapper type, i.e. List[Set[String]] -> flatMap -> Set[String], you have two options:
Explicitly convert one wrapper type to the other, i.e. list.flatMap(_.field).toSet, or
Provide an implicit conversion, i.e. implicit def listToSet(list: List[String]): Set[String] = list.toSet, and then you can write val set: Set[String] = list.flatMap(_.field)
Only then will you get the result you are after.
Conclusion: flatMap on two nested wrapper types always returns the outer wrapper type, i.e. List[Set[String]] -> flatMap -> List[String]. If you want a different type, you have to convert the result, either implicitly or explicitly.
You could maybe provide a specific CanBuildFrom, but why not use a fold instead?
list.foldLeft(Set.empty[String]){case (set, myClass) => set ++ myClass.field}
Still just one pass through the collection, and if you are sure the list is not empty, you could even use reduceLeft instead.
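For reference, here is a small self-contained sketch of both variants. The object name FoldSketch is just for illustration, and the reduceLeft form shown is one possible reading of the suggestion above: it first maps each object to its set, so it builds an intermediate list.

```scala
object FoldSketch {
  final case class MyClass(field: Set[String])

  val list = List(
    MyClass(Set("123")),
    MyClass(Set("456", "798")),
    MyClass(Set("123", "798"))
  )

  // One pass, starting from an empty set and merging each field into it.
  val viaFold: Set[String] =
    list.foldLeft(Set.empty[String])((acc, c) => acc ++ c.field)

  // Only safe on a non-empty list: reduceLeft throws on an empty one.
  val viaReduce: Set[String] =
    list.map(_.field).reduceLeft(_ ++ _)
}
```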

Scala - Multiple ways of initializing containers

I am new to Scala and was wondering what is the difference between initializing a Map data structure using the following three ways:
private val currentFiles: HashMap[String, Long] = new HashMap[String, Long]()
private val currentJars = new HashMap[String, Long]
private val currentVars = Map[String, Long]()
There are two different parts to your question.
First, the difference between using an explicit type or not (cases 1 and 2) applies to any class, not just containers.
val x = 1
Here the type is not explicit, and the compiler will try to figure it out using type inference. The type of x will be Int.
val x: Int = 1
Same as above, but now explicit. If whatever is on the right of = does not conform to Int, you will get a compiler error.
val x: Any = 1
Here we still store a 1, but the static type of the variable is a supertype, thanks to polymorphism.
The second part of your question is about initialization. The basic form is the same as in Java:
val x = new scala.collection.mutable.ListBuffer[Int]()
This calls the class constructor and returns a new instance of exactly that class. Note that it needs a concrete class: List itself is abstract, so new List[Int]() does not compile.
Now, there is a special method called .apply that you can define and call with just parentheses, like this:
val x = Seq[Int]()
This is a shortcut for this:
val x = Seq.apply[Int]()
Notice that this is a method on the Seq object. Its return type is whatever the method declares; it is just another method. It is mostly used to return a new instance of the given type, but there are no guarantees, so check the documentation to be sure of the contract.
In the case of val x = Map[String, Long](), the implementation returns an instance of one of the immutable Map implementations. immutable.HashMap is the general-purpose default, but small maps, including the empty one, get specialized classes.
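To make the apply shortcut concrete, here is a small sketch with a made-up class. Box and its companion object are hypothetical names used only for illustration; they are not part of the standard library.

```scala
object ApplySketch {
  // A plain class with an ordinary constructor.
  final class Box(val items: List[Int])

  // The companion object defines apply, so Box(...) works without `new`.
  object Box {
    def apply(items: Int*): Box = new Box(items.toList)
  }

  val viaSugar = Box(1, 2, 3)           // sugar for Box.apply(1, 2, 3)
  val viaApply = Box.apply(1, 2, 3)     // the explicit call
  val viaNew   = new Box(List(1, 2, 3)) // the constructor, Java style
}
```

All three values hold the same items; only the first two go through the companion's apply method, which is free to return whatever its signature promises.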
Map and HashMap are almost equivalent, but not exactly the same thing.
Map is a trait and HashMap is a class, although under the hood they may be the same thing (scala.collection.immutable.HashMap); more on that later.
When using
private val currentVars = Map[String, Long]()
You get a Map instance. In Scala, () is syntactic sugar: under the hood you are actually calling the apply() method of the Map object. This is equivalent to:
private val currentVars = Map.apply[String, Long]()
Using
private val currentJars = new HashMap[String, Long]()
You get a HashMap instance.
In the remaining statement, which carries an explicit type:
private val currentJars: HashMap[String, Long] = new HashMap[String, Long]()
You are simply no longer relying on type inference. This is exactly the same as the second statement:
private val currentJars: HashMap[String, Long] = new HashMap[String, Long]()
private val currentJars = new HashMap[String, Long]() // same thing
When / Which I use / Why
As for type inference, I would recommend relying on it. IMHO in this case it removes verbosity where it is not really needed. But if you really miss Java-like code, then include the type :) .
Now, about the two constructors...
Map vs HashMap
Short answer
You should probably always go with Map(): it is shorter, already imported, and returns a trait (like a Java interface). This last reason is nice because when passing this Map around you won't rely on implementation details, since Map is just an interface for what you want or need.
On the other hand, HashMap is an implementation.
Long answer
Map is not always a HashMap.
As seen in Programming in Scala, Map.apply[K, V]() can return a different class depending on how many key-value pairs you pass to it:
Number of elements   Implementation
0                    scala.collection.immutable.EmptyMap
1                    scala.collection.immutable.Map1
2                    scala.collection.immutable.Map2
3                    scala.collection.immutable.Map3
4                    scala.collection.immutable.Map4
5 or more            scala.collection.immutable.HashMap
When you have fewer than 5 elements you get a special class for each of these small collections, and when you have an empty Map, you get a singleton object.
This is done mostly to get better performance.
You can try it out in the REPL:
import scala.collection.immutable.HashMap
val m2 = Map(1 -> 1, 2 -> 2)
m2.isInstanceOf[HashMap[Int, Int]]
// false
val m5 = Map(1 -> 1, 2 -> 2, 3 -> 3, 4 -> 4, 5 -> 5, 6 -> 6)
m5.isInstanceOf[HashMap[Int, Int]]
// true
If you are really curious you can even take a look at the source code.
So, even for performance, you should probably stick with Map().

How to combine Maps with different value types in Scala

I have the following code which is working:
case class Step() {
def bindings(): Map[String, Any] = ???
}
class Builder {
private val globalBindings = scala.collection.mutable.HashMap.empty[String, Any]
private val steps = scala.collection.mutable.ArrayBuffer.empty[Step]
private def context: Map[String, Any] =
globalBindings.foldLeft(Map[String, Any]())((l, r) => l + r) ++ Map[String, Any]("steps" -> steps.foldLeft(Vector[Map[String, Any]]())((l, r) => l.+:(r.bindings)))
}
But I think it could be simplified so as to not need the first foldLeft in the 'context' method.
The desired result is to produce a map where the entry values are either a String, an object upon which toString will be invoked later, or a function which returns a String.
Is this the best I can do with Scala's type system or can I make the code clearer?
TIA
First of all, the toMap method on mutable.HashMap returns an immutable.Map. You can also use map instead of the inner foldLeft together with toVector if you really need a vector, which might be unnecessary. Finally, you can just use + to add the desired key-value pair of "steps" to the map.
So your whole method body could be:
globalBindings.toMap + ("steps" -> steps.map(_.bindings).toVector)
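Here is a runnable sketch of that one-liner. The Step class below is a simplified stand-in with a name field, since the bindings implementation in the question is left as ???, and the sample data is made up.

```scala
import scala.collection.mutable

object ContextSketch {
  // Stand-in for the question's Step; bindings is filled in for illustration.
  final case class Step(name: String) {
    def bindings: Map[String, Any] = Map("name" -> name)
  }

  val globalBindings = mutable.HashMap[String, Any]("env" -> "test")
  val steps = mutable.ArrayBuffer(Step("a"), Step("b"))

  // toMap gives an immutable copy; + adds the single "steps" entry.
  def context: Map[String, Any] =
    globalBindings.toMap + ("steps" -> steps.map(_.bindings).toVector)
}
```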
I'd also note that you should be wary of using types like Map[String, Any] in Scala. So much of the power of Scala comes from its type system, which can be used to great effect in many such situations, that these types are often considered unidiomatic. Of course, there are situations where this approach makes the most sense, and without more context it is hard to tell whether that is true here.
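If the three kinds of values listed in the question really are the only ones, one possible typed alternative is a small sealed hierarchy. This is only a sketch under that assumption; Binding, Text, Shown and Deferred are hypothetical names, not something from the original code.

```scala
object BindingSketch {
  // One case per kind of value the question describes.
  sealed trait Binding { def render: String }
  final case class Text(value: String) extends Binding {
    def render: String = value
  }
  final case class Shown(value: Any) extends Binding {
    def render: String = value.toString // toString invoked later, on demand
  }
  final case class Deferred(thunk: () => String) extends Binding {
    def render: String = thunk()
  }

  val bindings: Map[String, Binding] = Map(
    "title" -> Text("report"),
    "count" -> Shown(42),
    "stamp" -> Deferred(() => "generated")
  )

  // No casts are needed to turn every entry into a String.
  val rendered: Map[String, String] =
    bindings.map { case (k, b) => k -> b.render }
}
```

The trade-off is a little more ceremony when building the map, in exchange for the compiler knowing exactly what can be stored in it.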

Scala SortedMap.map method returns non-sorted map when static type is Map

I encountered some unexpected strangeness working with Scala's SortedMap[A,B]. If I declare the reference to SortedMap[A,B] "a" to be of type Map[A,B], then map operations on "a" will produce a non-sorted map implementation.
Example:
import scala.collection.immutable._
object Test extends App {
val a: Map[String, String] = SortedMap[String, String]("a" -> "s", "b" -> "t", "c" -> "u", "d" -> "v", "e" -> "w", "f" -> "x")
println(a.getClass+": "+a)
val b = a map {x => x} // identity
println(b.getClass+": "+b)
}
The output of the above is:
class scala.collection.immutable.TreeMap: Map(a -> s, b -> t, c -> u, d -> v, e -> w, f -> x)
class scala.collection.immutable.HashMap$HashTrieMap: Map(e -> w, f -> x, a -> s, b -> t, c -> u, d -> v)
The order of key/value pairs before and after the identity transformation is not the same.
The strange thing is that removing the type declaration from "a" makes this issue go away. That's fine in a toy example, but makes SortedMap[A,B] unusable for passing to methods that expect Map[A,B] parameters.
In general, I would expect higher order functions such as "map" and "filter" to not change the fundamental properties of the collections they are applied to.
Does anyone know why "map" is behaving like this?
The map method, like most of the collection methods, isn't defined specifically for SortedMap. It is defined on a higher-level class (TraversableLike) and uses a "builder" to turn the mapped result into the correct return type.
So how does it decide what the "correct" return type is? Well, it tries to give you back the return type that it started out as. When you tell Scala that you have a Map[String,String] and ask it to map, then the builder has to figure out how to "build" the type for returning. Since you told Scala that the input was a Map[String,String], the builder decides to build a Map[String,String] for you. The builder doesn't know that you wanted a SortedMap, so it doesn't give you one.
The reason it works when you leave off the Map[String,String] type annotation is that Scala infers that the type of a is SortedMap[String,String]. Thus, when you call map, you are calling it on a SortedMap, and the builder knows to construct a SortedMap for returning.
As for your assertion that methods shouldn't change "fundamental properties", I think you're looking at it from the wrong angle. The methods will always give you back an object that conforms to the type that you specify. It's the type that defines the behavior of the builder, not the underlying implementation. When you think about it like that, it's the type that forms the contract for how methods should behave.
Why might we want this?
Why is this the preferred behavior? Let's look at a concrete example. Say we have a SortedMap[Int,String]
val sortedMap = SortedMap[Int, String](1 -> "s", 2 -> "t", 3 -> "u", 4 -> "v")
If I were to map over it with a function that modifies the keys, I run the risk of losing elements when their keys clash:
scala> sortedMap.map { case (k, v) => (k / 2, v) }
res3: SortedMap[Int,String] = Map(0 -> s, 1 -> u, 2 -> v)
But hey, that's fine. It's a Map after all, and I know it's a Map, so I should expect that behavior.
Now let's say we have a function that accepts an Iterable of pairs:
def f(iterable: Iterable[(Int, String)]) =
iterable.map { case (k, v) => (k / 2, v) }
Since this function has nothing to do with Maps, it would be very surprising if the result of this function ever had fewer elements than the input. After all, map on an Iterable should produce the mapped version of each element. But a Map is an Iterable of pairs, so we can pass it into this function. So what happens in Scala when we do?
scala> f(sortedMap)
res4: Iterable[(Int, String)] = List((0,s), (1,t), (1,u), (2,v))
Look at that! No elements lost! In other words, Scala won't surprise us by violating our expectations about how map on an Iterable should work. If the builder instead tried to produce a SortedMap based on the fact that the input was a SortedMap, then our function f would have surprising results, and this would be bad.
So the moral of the story is: Use the types to tell the collections framework how to deal with your data. If you want your code to be able to expect that a map is sorted, then you should type it as SortedMap.
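The effect of the static type can be checked with a short sketch. The object and value names are illustrative, and the assertions deliberately avoid depending on the iteration order of the non-sorted result, since that order is an implementation detail.

```scala
import scala.collection.immutable.SortedMap

object StaticTypeSketch {
  val sorted: SortedMap[String, String] =
    SortedMap("b" -> "t", "a" -> "s", "c" -> "u")

  // Same object, but the compiler now only knows it is a Map.
  val asMap: Map[String, String] = sorted

  // Mapping through the SortedMap type keeps the result sorted.
  val fromSorted = sorted.map { case (k, v) => (k, v) }

  // Mapping through the Map type builds a general-purpose Map.
  val fromMap = asMap.map { case (k, v) => (k, v) }
}
```

Both results contain the same entries, so they compare equal as maps; only the first is still a SortedMap.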
The signature of map is:
def map[C, That](f: ((A, B)) => C)(implicit bf: CanBuildFrom[Map[A, B], C, That]): That
The implicit parameter bf is used to build the resulting collection. So in your example, since the type of a is Map[String, String], the type of bf is:
val cbf = implicitly[CanBuildFrom[Map[String, String], (String, String), Map[String, String]]]
Which just builds a Map[String, String] which doesn't have any of the properties of the SortedMap. See:
(cbf() ++= List("b" -> "c", "e" -> "g", "a" -> "b")).result()
For more information, see this excellent article: http://docs.scala-lang.org/overviews/core/architecture-of-scala-collections.html
As dyross points out, it's the Builder, which is chosen (via the CanBuildFrom) on the basis of the target type, that determines the class of the collection that you get out of a map operation. Now this might not be the behaviour that you wanted, but it does, for example, allow you to select the target type:
val b: SortedMap[String, String] = a.map(x => x)(collection.breakOut)
(breakOut gives a generic CanBuildFrom whose type is determined by context, i.e. our type annotation.)
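Since breakOut only exists up to Scala 2.12 (it was removed in 2.13), a version-independent way to get a sorted result back is to rebuild it explicitly. This is a sketch of a different technique rather than an equivalent of breakOut: it creates an intermediate Map first, so it costs an extra pass. The names are illustrative.

```scala
import scala.collection.immutable.SortedMap

object RebuildSortedSketch {
  // Declared as Map, as in the question, although the object is sorted.
  val a: Map[String, String] =
    SortedMap("b" -> "t", "a" -> "s", "c" -> "u")

  // The static type is Map, so this result is not guaranteed to be sorted.
  val mapped: Map[String, String] =
    a.map { case (k, v) => (k, v.toUpperCase) }

  // Adding the entries to an empty SortedMap restores the ordering.
  val b: SortedMap[String, String] =
    SortedMap.empty[String, String] ++ mapped
}
```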
So you could add some type parameters that allow you to accept any sort of Map or Traversable (see this question), which would allow you to do a map operation in your method while retaining the correct type information, but as you can see it's not straightforward.
I think a much simpler approach is instead to define functions that you apply to your collections using the collections' map, flatMap etc methods, rather than by sending the collection itself to a method.
i.e. instead of
def f[Complex type parameters](xs: ...)(complex implicits) = ...
val result = f(xs)
do
val f: X => Y = ...
val results = xs map f
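A concrete sketch of the second style, with made-up names: the function is defined once as a plain value, and each collection's own map decides what gets built.

```scala
import scala.collection.immutable.SortedMap

object FunctionFirstSketch {
  // A plain function on entries; it knows nothing about collection types.
  val upperValue: ((String, String)) => (String, String) = {
    case (k, v) => (k, v.toUpperCase)
  }

  val sorted: SortedMap[String, String] = SortedMap("b" -> "t", "a" -> "s")

  // Applied to a SortedMap, the result is still a SortedMap.
  val mappedSorted: SortedMap[String, String] = sorted.map(upperValue)

  // Applied to a List of pairs, the result is a List of pairs.
  val mappedList: List[(String, String)] = sorted.toList.map(upperValue)
}
```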
In short: you explicitly declared a to be of type Map, and the Scala collections framework tries very hard for higher order functions such as map and filter to not change the fundamental properties of the collections they are applied to, therefore it will also return a Map since that is what you explicitly told it you wanted.

What is the best way to create and pass around dictionaries containing multiple types in scala?

By dictionary I mean a lightweight map from names to values that can be used as the return value of a method.
Options that I'm aware of include making case classes, creating anon objects, and making maps from Strings -> Any.
Case classes require mental overhead to create (names), but are strongly typed.
Anon objects don't seem that well documented and it's unclear to me how to use them as arguments since there is no named type.
Maps from String -> Any require casting for retrieval.
Is there anything better?
Ideally these could be built from json and transformed back into it when appropriate.
I don't need static typing (though it would be nice, I can see how it would be impossible) - but I do want to avoid explicit casting.
Here's the fundamental problem with what you want:
def get(key: String): Option[T] = ...
val r = map.get("key")
The type of r will be defined from the return type of get -- so, what should that type be? From where could it be defined? If you make it a type parameter, then it's relatively easy:
import scala.collection.mutable.{Map => MMap}
val map: MMap[String, (Manifest[_], Any)] = MMap.empty
def get[T : Manifest](key: String): Option[T] = map.get(key).filter(_._1 <:< manifest[T]).map(_._2.asInstanceOf[T])
def put[T : Manifest](key: String, obj: T) = map(key) = manifest[T] -> obj
Example:
scala> put("abc", 2)
scala> put("def", true)
scala> get[Boolean]("abc")
res2: Option[Boolean] = None
scala> get[Int]("abc")
res3: Option[Int] = Some(2)
The problem, of course, is that you have to tell the compiler what type you expect to be stored on the map under that key. Unfortunately, there is simply no way around that: the compiler cannot know what type will be stored under that key at compile time.
Any solution you take you'll end up with this same problem: somehow or other, you'll have to tell the compiler what type should be returned.
Now, this shouldn't be a burden in a Scala program. Take that r above... you'll then use that r for something, right? That something you are using it for will have methods appropriate to some type, and since you know what the methods are, then you must also know what the type of r must be.
If this isn't the case, then there's something fundamentally wrong with the code -- or, perhaps, you haven't progressed from wanting the map to knowing what you'll do with it.
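The same idea can be sketched with ClassTag in place of Manifest. This is a swapped-in variant, not the code from the answer above: it compares runtime classes for exact equality instead of checking a subtype relation, and it cannot tell List[Int] from List[String] because type arguments are erased. TypedDictSketch and its members are illustrative names.

```scala
import scala.collection.mutable
import scala.reflect.ClassTag

object TypedDictSketch {
  // Each value is stored together with the tag of the type it was put as.
  private val store = mutable.Map.empty[String, (ClassTag[_], Any)]

  def put[T](key: String, value: T)(implicit tag: ClassTag[T]): Unit =
    store(key) = (tag, value)

  // Returns the value only if it was stored under the requested type.
  def get[T](key: String)(implicit tag: ClassTag[T]): Option[T] =
    store.get(key).collect {
      case (storedTag, value) if storedTag == tag => value.asInstanceOf[T]
    }
}
```

As with the Manifest version, the caller still has to name the expected type at every lookup.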
So you want to parse JSON and turn it into objects that resemble the JavaScript objects described in the JSON input? If you want static typing, case classes are pretty much your only option and there are already libraries handling this, for example lift-json.
Another option is to use Scala 2.9's experimental support for dynamic typing. That will give you elegant syntax at the expense of type safety.
You can use the approach I've seen in the casbah library, where you explicitly pass a type parameter into the get method and cast the actual value inside the get method. Here is a quick example:
case class MultiTypeDictionary(m: Map[String, Any]) {
def getAs[T <: Any](k: String)(implicit mf: Manifest[T]): T =
cast(m.get(k).getOrElse {throw new IllegalArgumentException})(mf)
private def cast[T <: Any : Manifest](a: Any): T =
a.asInstanceOf[T]
}
implicit def map2multiTypeDictionary(m: Map[String, Any]) =
MultiTypeDictionary(m)
val dict: MultiTypeDictionary = Map("1" -> 1, "2" -> 2.0, "3" -> "3")
val a: Int = dict.getAs("1")
val b: Int = dict.getAs("2") // ClassCastException
val c: Int = dict.getAs("4") // IllegalArgumentException
Note that there are no real compile-time checks, so you have to deal with all the drawbacks of exceptions.
UPD Working MultiTypeDictionary class
If you have only a limited number of types which can occur as values, you can use some kind of union type (a.k.a. disjoint type), having e.g. a Map[Foo, Bar | Baz | Buz | Blargh]. If you have only two possibilities, you can use Either[A,B], giving you a Map[Foo, Either[Bar, Baz]]. For three types you might cheat and use Map[Foo, Either[Bar, Either[Baz,Buz]]], but this syntax obviously doesn't scale well. If you have more types you can use things like...
http://cleverlytitled.blogspot.com/2009/03/disjoint-bounded-views-redux.html
http://svn.assembla.com/svn/metascala/src/metascala/OneOfs.scala
http://www.chuusai.com/2011/06/09/scala-union-types-curry-howard/
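For the two-type case mentioned above, a minimal Either-based sketch could look like this. The keys, values and the describe helper are made up for illustration.

```scala
object EitherValuesSketch {
  // Every value is either an Int (Left) or a String (Right).
  val settings: Map[String, Either[Int, String]] = Map(
    "port" -> Left(8080),
    "host" -> Right("localhost")
  )

  // Pattern matching recovers the concrete type without any cast.
  def describe(key: String): String = settings.get(key) match {
    case Some(Left(n))  => s"number $n"
    case Some(Right(s)) => s"text $s"
    case None           => "missing"
  }
}
```

The compiler checks that both alternatives are handled, which is what the String -> Any map cannot offer.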