scala collection conversions - scala

What is the most effective way to convert between different scala.collection objects?
E.g.
val a=scala.collection.mutable.ListBuffer(1,2,0,3)
And I want to get scala.collection.mutable.ArrayBuffer.
According to http://docs.scala-lang.org/resources/images/collections.mutable.png it should be possible by converting to Buffer first and then to ArrayBuffer. Correct?
In general, can I convert between any two Scala collections through their common ancestor? (In the previous example the common ancestor is Buffer.)
PS: I read http://docs.scala-lang.org/overviews/collections/introduction.html but couldn't find anything about general conversions between various types (I'm aware of the .toArray-like methods).
thx

Syntax-wise, the most effective option should be the to method introduced in 2.10:
def to[Col[_]]: Col[A]
Converts this collection into another by copying all elements.
Note: will not terminate for infinite-sized collections.
Use it as a.to[scala.collection.mutable.ArrayBuffer].
Efficiency-wise, unless you do an upcast-like conversion, where you treat a subtype as a more general collection, converting involves copying the elements. In your example, it does not matter whether you turn the list buffer into a buffer and then into an array buffer or do it directly using to: the elements have to be copied from a linked list into an array either way.
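A minimal sketch of the difference between an upcast and a real conversion, using the ListBuffer from the question. It assumes Scala 2.13 or later, where the call is spelled to(ArrayBuffer); on 2.10 through 2.12 the equivalent is a.to[ArrayBuffer] as shown above. The value names are only illustrative.

```scala
import scala.collection.mutable.{ArrayBuffer, Buffer, ListBuffer}

val a = ListBuffer(1, 2, 0, 3)

// Upcast: nothing is copied, asBuffer is the same object seen through a wider type.
val asBuffer: Buffer[Int] = a

// Real conversion: every element is copied into a new array-backed buffer.
// Scala 2.13+ spelling; on 2.10-2.12 write a.to[ArrayBuffer] instead.
val asArrayBuffer: ArrayBuffer[Int] = a.to(ArrayBuffer)

// Alternative that does not depend on the `to` syntax: append into an empty target.
val copied: ArrayBuffer[Int] = (ArrayBuffer.empty[Int] ++= a)
```

Going through Buffer first would not save the copy: the upcast is free, and the copy into the array happens in the final step regardless.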

Answering question number 2:
Welcome to Scala version 2.10.0 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_07).
Type in expressions to have them evaluated.
Type :help for more information.
scala> import collection.mutable._
import collection.mutable._
scala> List(1,2,3,4,5)
res0: List[Int] = List(1, 2, 3, 4, 5)
scala> res0.to[ArrayBuffer]
res1: scala.collection.mutable.ArrayBuffer[Int] = ArrayBuffer(1, 2, 3, 4, 5)
scala> res0.to[ListBuffer]
res2: scala.collection.mutable.ListBuffer[Int] = ListBuffer(1, 2, 3, 4, 5)
You can convert them as you'd like as long as the target is compatible, i.e. its type constructor takes exactly one type parameter:
scala> res0.to[Map]
<console>:12: error: scala.collection.mutable.Map takes two type parameters, expected: one
res0.to[Map]
^
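If a Map is what you are after, one possible route (a sketch, not the only way) is to build key/value pairs first and convert those. Pairing each element with its index via zipWithIndex is just an assumption for illustration; any pairing would do.

```scala
import scala.collection.mutable

val xs = List("a", "b", "c")

// A Map needs key/value pairs, so pair each element with something first.
val indexed: List[(String, Int)] = xs.zipWithIndex

// Immutable Map via toMap, which is available on any collection of pairs.
val immutableMap: Map[String, Int] = indexed.toMap

// Mutable Map by appending the pairs to an empty map.
val mutableMap: mutable.Map[String, Int] = (mutable.Map.empty[String, Int] ++= indexed)
```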

Related

Set.map now returns a Set warning scala

I am getting this warning on mapping on sets, but I want to maintain the set output. How do we go about fixing this? I do not want to suppress the warning.
warning: method map in trait SetLike has changed semantics in version 2.8.0:
Set.map now returns a Set, so it will discard duplicate values.
someSet.map(_.fullName)
signature
def mapSet(someSet: Set[People]): Set[String]
Using Scala v2.7 REPL:
scala> scala.collection.mutable.Set("a","aa","b").map{x => x.size}
res0: Iterable[Int] = ArrayBuffer(1, 1, 2)
Using Scala v2.8 REPL:
scala> scala.collection.mutable.Set("a","aa","b").map{x => x.size}
res0: scala.collection.mutable.Set[Int] = Set(1, 2)
This difference between Scala 2.7 and 2.8 mutable Set (precisely speaking SetLike) can change program semantics, so the migration warning is shown.
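A small sketch of what the warning is about, using an immutable Set for brevity. If the duplicates matter, converting to a sequence before mapping keeps them; if a Set result is what you want (as in the mapSet signature above), the 2.8 behaviour is already the desired one.

```scala
val names = Set("a", "aa", "b")

// Mapping a Set yields a Set, so equal results collapse into one element.
val sizesAsSet: Set[Int] = names.map(_.size)

// Converting to a List first keeps one result per input element.
val sizesAsList: List[Int] = names.toList.map(_.size)
```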

Lists in Scala - plus colon vs double colon (+: vs ::)

I am a little bit confused about the +: and :: operators.
It looks like both of them give the same results.
scala> List(1,2,3)
res0: List[Int] = List(1, 2, 3)
scala> 0 +: res0
res1: List[Int] = List(0, 1, 2, 3)
scala> 0 :: res0
res2: List[Int] = List(0, 1, 2, 3)
To my novice eye, the source code for both methods looks similar (the plus-colon method has an additional condition on generics and uses builder factories).
Which one of these methods should be used and when?
+: works with any kind of collection, while :: is a specific implementation for List.
If you look at the source for +: closely, you will notice that it actually calls :: when the expected return type is List. That is because :: is implemented more efficiently for the List case: it simply connects the new head to the existing list and returns the result, which is a constant-time operation, as opposed to the linear-time copy of the entire collection in the generic case of +:.
+:, on the other hand, takes a CanBuildFrom, so you can do fancy (albeit not as nice-looking in this case) things like:
import scala.collection.breakOut
val foo: Array[String] = List("foo").+:("bar")(breakOut)
(It's pretty useless in this particular case, as you could start with the needed type to begin with, but the idea is that you can prepend an element to a collection and change its type in one "go", avoiding an additional copy.)
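A short sketch of the practical difference. The value names are illustrative only.

```scala
val xs = List(1, 2, 3)

// :: exists only on List and prepends in constant time.
val viaCons: List[Int] = 0 :: xs

// +: is the generic prepend; on a List it gives the same result.
val viaPlusColon: List[Int] = 0 +: xs

// The list built with :: reuses the original list as its tail instead of copying it.
val sharesTail: Boolean = viaCons.tail eq xs

// +: also works on other sequences, where :: is not available.
val vec: Vector[Int] = 0 +: Vector(1, 2, 3)
```

As a rule of thumb under these assumptions: use :: when you know you have a List (it also works in pattern matches), and +: when the code should work for any Seq.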

Can views be used with parallel collections?

The idiom for finding a result within a mapping of a collection goes something like this:
list.view.map(f).find(p)
where list is a List[A], f is an A => B, and p is a B => Boolean.
Is it possible to use view with parallel collections? I ask because I'm getting some very odd results:
Welcome to Scala version 2.9.1.final (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0).
Type in expressions to have them evaluated.
Type :help for more information.
scala> val f : Int => Int = i => {println(i); i + 10}
f: Int => Int = <function1>
scala> val list = (1 to 10).toList
list: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
scala> list.par.view.map(f).find(_ > 5)
1
res0: Option[Int] = Some(11)
scala> list.par.view.map(f).find(_ > 5)
res1: Option[Int] = None
See "A Generic Parallel Collection Framework", the paper by Martin Odersky et al. that discusses the new parallel collections. Page 8 has a section "Parallel Views" that talks about how view and par can be used together, and how this can give the performance benefits of both views and parallel computation.
As for your specific example, that is definitely a bug. The exists method also breaks, and once it breaks on one list it breaks for all other lists, so I think the problem is that operations that may be aborted partway through (find and exists can stop once they have an answer) manage to break the thread pool in some way. It could be related to the bug with exceptions being thrown inside functions passed to parallel collections. If so, it should be fixed in 2.10.
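For reference, here is a sketch of what the sequential idiom list.view.map(f).find(p) from the question buys you, without any parallel collections involved. The countCalls helper is hypothetical and exists only to count how often the mapping function runs; the predicate _ > 13 is chosen so that the search stops at the fourth element.

```scala
val list = (1 to 10).toList

// Hypothetical helper: runs a search with a counting version of i => i + 10
// and returns the search result together with the number of calls made.
def countCalls(search: (Int => Int) => Option[Int]): (Option[Int], Int) = {
  var calls = 0
  val f: Int => Int = i => { calls += 1; i + 10 }
  val result = search(f)
  (result, calls)
}

// With a view, f is applied lazily and only until the predicate first holds.
val lazyRun = countCalls(f => list.view.map(f).find(_ > 13))

// Without a view, map processes the whole list before find starts.
val strictRun = countCalls(f => list.map(f).find(_ > 13))
```

Both runs find the same element; the view version simply stops calling f as soon as find has its answer.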

What is the Scala equivalent of C++ typeid?

For example, if I do
scala> val a = Set(1,2,3)
a: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
in the REPL, I want to see the most refined type of "a" in order to know whether it's really a HashSet. In C++, typeid(a).name() would do it. What is the Scala equivalent?
scala> val a = Set(1,2,3)
a: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
scala> a.getClass.getName
res0: java.lang.String = scala.collection.immutable.Set$Set3
(Yes, it really is an instance of an inner class called Set3--it's a set specialized for 3 elements. If you make it a little larger, it'll be a HashTrieSet.)
Edit: #pst also pointed out that the type information [Int] was erased; this is how JVM generics work. However, the REPL keeps track since the compiler still knows the type. If you want to get the type that the compiler knows, you can define
def manifester[A: ClassManifest](a: A) = implicitly[ClassManifest[A]]
and then you'll get something whose toString is the same as what the REPL reports. Between the two of these, you'll get as much type information as there is to be had. Of course, since the REPL already does this for you, you normally don't need to use this. But if for some reason you want to, the erased types are available from .typeArguments from the manifest.
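A sketch of the runtime-class check outside the REPL. The name of the class used for larger sets differs between library versions (the answer above mentions HashTrieSet; newer standard libraries use a class named HashSet directly), so the second result is only described loosely in the comments.

```scala
val small = Set(1, 2, 3)

// Runtime class of a three-element immutable Set: the specialised Set3.
val smallClass: String = small.getClass.getName

val large = (1 to 100).toSet

// Larger sets are hash-based; the exact class name depends on the Scala version.
val largeClass: String = large.getClass.getName
```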

In Scala 2, type inference fails on Set made with .toSet?

Why is type inference failing here?
scala> val xs = List(1, 2, 3, 3)
xs: List[Int] = List(1, 2, 3, 3)
scala> xs.toSet map(_*2)
<console>:9: error: missing parameter type for expanded function ((x$1) => x$1.$times(2))
xs.toSet map(_*2)
However, if xs.toSet is assigned, it compiles.
scala> xs.toSet
res42: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
scala> res42 map (_*2)
res43: scala.collection.immutable.Set[Int] = Set(2, 4, 6)
Also, going the other way, converting from Set to List and mapping on the List compiles.
scala> Set(5, 6, 7)
res44: scala.collection.immutable.Set[Int] = Set(5, 6, 7)
scala> res44.toList map(_*2)
res45: List[Int] = List(10, 12, 14)
Q: Why doesn't toSet do what I want?
A: That would be too easy.
Q: But why doesn't this compile? List(1).toSet.map(x => ...)
A: The Scala compiler is unable to infer that x is an Int.
Q: What, is it stupid?
A: Well, List[A].toSet doesn't return an immutable.Set[A]. It returns an immutable.Set[B] for some unknown B >: A.
Q: How was I supposed to know that?
A: From the Scaladoc.
Q: But why is toSet defined that way?
A: You might be assuming immutable.Set is covariant, but it isn't. It's invariant. And the return type of toSet is in covariant position, so the return type can't be allowed to be invariant.
Q: What do you mean, "covariant position"?
A: Let me Wikipedia that for you: http://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science) . See also chapter 19 of Odersky, Venners & Spoon.
Q: I understand now. But why is immutable.Set invariant?
A: Let me Stack Overflow that for you: Why is Scala's immutable Set not covariant in its type?
Q: I surrender. How do I fix my original code?
A: This works: List(1).toSet[Int].map(x => ...). So does this: List(1).toSet.map((x: Int) => ...)
(with apologies to Friedman & Felleisen. thx to paulp & ijuma for assistance)
EDIT: There is valuable additional information in Adriaan's answer and in the discussion in the comments both there and here.
The type inference does not work properly as the signature of List#toSet is
def toSet[B >: A]: scala.collection.immutable.Set[B]
and the compiler would need to infer the types in two places in your call. An alternative to annotating the parameter in your function would be to invoke toSet with an explicit type argument:
xs.toSet[Int] map (_*2)
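The workarounds mentioned in these answers, collected in one sketch. The value names are illustrative; each variant gives the compiler the element type before it has to type the closure.

```scala
val xs = List(1, 2, 3, 3)

// 1. Pass the type argument to toSet explicitly.
val viaTypeArg: Set[Int] = xs.toSet[Int].map(_ * 2)

// 2. Annotate the parameter of the function literal.
val viaAnnotatedParam: Set[Int] = xs.toSet.map((x: Int) => x * 2)

// 3. Split the expression so the type of ys is fixed before map is typed.
val ys = xs.toSet
val viaTwoSteps: Set[Int] = ys.map(_ * 2)
```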
UPDATE:
Regarding your question why the compiler can infer it in two steps, let's just look at what happens when you type the lines one by one:
scala> val xs = List(1,2,3)
xs: List[Int] = List(1, 2, 3)
scala> val ys = xs.toSet
ys: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
Here the compiler will infer the most specific type for ys which is Set[Int] in this case. This type is known now, so the type of the function passed to map can be inferred.
If you filled in all possible type parameters in your example the call would be written as:
xs.toSet[Int].map[Int,Set[Int]](_*2)
where the second type parameter is used to specify the type of the returned collection (for details look at how Scala collections are implemented). This means I even underestimated the number of types the compiler has to infer.
In this case it may seem easy to infer Int but there are cases where it is not (given the other features of Scala like implicit conversions, singleton types, traits as mixins etc.). I don't say it cannot be done - it's just that the Scala compiler does not do it.
I agree it would be nice to infer "the only possible" type, even when calls are chained, but there are technical limitations.
You can think of inference as a breadth-first sweep over the expression, collecting constraints (which arise from subtype bounds and required implicit arguments) on type variables, followed by solving those constraints. This approach allows, e.g., implicits to guide type inference. In your example, even though there is a single solution if you only look at the xs.toSet subexpression, later chained calls could introduce constraints that make the system unsatisfiable. The downside of leaving the type variables unsolved is that type inference for closures requires the target type to be known, and will thus fail (it needs something concrete to go on: the required type of the closure and the types of its arguments must not both be unknown).
Now, when delaying solving the constraints makes inference fail, we could backtrack, solve all the type variables, and retry, but this is tricky to implement (and probably quite inefficient).