Why aren't toList and friends deprecated? - scala

Prior to version 2.10 of Scala sequence types had methods like toList and toArray for converting from one type to another. As of Scala 2.10 we have to[_], e.g. to[List], which appears to subsume toList and friends and also give us the ability to convert to new types like Vector and presumably even to our own collection types. And of course it gives you the ability to convert to a type which you know only as a type parameter, e.g. to[A] -- nice!
But why weren't the old methods deprecated? Are they faster? Are there cases where toList works but to[List] does not? Should we prefer one over the other where both work?

toList is implemented in TraversableOnce as to[List], so there won't be any noticeable performance difference.
However, toArray is (very slightly) more efficient than to[Array] as the former allocates an array of the right size while the latter first creates an array and then sets the size hint (as it does for every target collection type). This should not make a difference in a real application unless you are converting data to arrays in a tight loop.
The old methods could easily be deprecated, and I bet they will in the future, but people are so used to them that deprecating them right away would probably make some people angry.

On issue seems to be that you cannot use to[] in postfix notation:
scala> Array(1,2) toList
res2: List[Int] = List(1, 2)
scala> Array(1,2) to[List]
<console>:1: error: ';' expected but '[' found.
Array(1,2) to[List]
scala> Array(1,2).to[List]
res3: List[Int] = List(1, 2)

Related

Scala - covariant type in mutable collections

I am new in Scala world and now I am reading the book called "Scala in Action" (by Nilanjan Raychaudhuri), namely the part called "Mutable object need to be invariant" on page 97 and I don't understand the following part which is taken directly from the mentioned book.
Assume ListBuffer is covariant and the following code snippet works without any compilation problem:
scala> val mxs: ListBuffer[String] = ListBuffer("pants")
mxs: scala.collection.mutable.ListBuffer[String] =
ListBuffer(pants)
scala> val everything: ListBuffer[Any] = mxs
scala> everything += 1
res4: everything.type = ListBuffer(1, pants)
Can you spot the problem? Because everything is of the type Any, you can store an
integer value into a collection of strings. This is a disaster waiting to happen. To avoid these kinds of problems, it’s always a
good idea to make mutable objects invariant.
I would have the following questions..
1) What type of everything is in reality? String or Any? The declaration is "val everything: ListBuffer[Any]" and hence I would expect Any and because everything should be type of Any then I don't see any problems to have Integer and String in one ListBuffer[Any]. How can I store integer value into collection of strings how they write??? Why disaster??? Why should I use List (which is immutable) instead of ListBuffer (which is mutable)? I see no difference. I found a lot of answers that mutably collections should have type invariant and that immutable collections should have covariant type but why?
2) What does the last part "res4: everything.type = ListBuffer(1, pants)" mean? What does "everything.type" mean? I guess that everything does not have any method/function or variable called type.. Why is there no ListBuffer[Any] or ListBuffer[String]?
Thanks a lot,
Andrew
1 This doesn't look like a single question, so I have to subdivide it further:
"In reality" everything is ListBuffer[_], with erased parameter type. Depending on the JVM, it holds either 32 or 64 bit references to some objects. The types ListBuffer[String] and ListBuffer[Any] is what the compiler knows about it at compile time. If it "knows" two contradictory things, then it's obviously very bad.
"I don't see any problems to have Integer and String in
one ListBuffer[Any]". There is no problem to have Int and String in ListBuffer[Any], because ListBuffer is invariant. However, in your hypothetical example, ListBuffer is covariant, so you are storing an Int in a ListBuffer[String]. If someone later gets an Int from a ListBuffer[String], and tries to interpret it as String, then it's obviously very bad.
"How can I store integer value into collection
of strings how they write?" Why would you want to do something that is obviously very bad, as explained above?
"Why disaster???" It wouldn't be a major disaster. Java has been living with covariant arrays forever. It's does not lead to cataclysms, it's just bad and annoying.
"Why should I use List (which is immutable) instead of ListBuffer (which is mutable)?" There is no absolute imperative that tells you to always use List and to never use ListBuffer. Use both when it is appropriate. In 99.999% of cases, List is of course more appropriate, because you use Lists to represent data way more often than you design complicated algorithms that require local mutable state of a ListBuffer.
"I found a lot of answers that mutably collections
should have type invariant and that immutable collections should
have covariant type but why?". This is wrong, you are over-simplifying. For example, intensional immutable sets should be neither covariant, nor invariant, but contravariant. You should use covariance, contravariance, and invariance when it's appropriate. This little silly illustration has proven unreasonably effective for explaining the difference, maybe you too find it useful.
2 This is a singleton type, just like in the following example:
scala> val x = "hello"
x: String = hello
scala> val y: x.type = x
y: x.type = hello
Here is a longer discussion about the motivation for this.
I agree with most of what #Andrey is saying I would just add that covariance and contravariance belong exclusively to immutable structures, the exercisce that the books proposes is just a example so people can understand but it is not possible to implement a mutable structure that is covariant, you won't be able to make it compile.
As an exercise you could try to implement a MutableList[+A], you'll find out that there is not way to do this without tricking the compiler putting asInstanceOf everywhere

How does Array.toList work in Scala?

I'm looking at this snippet:
val here: Array[Int] = rdd.collect()
println(here.toList)
... looking at the source for toList (on TraversableOnce) and Array (Array does not inherit TraversableOnce though), but I can't find the connection that will make Scala consider an Array as a TraversableOnce - if that is even what happens. Is there some implicit at work here? Is there a conversion via ArraySeq or WrappedArray? How does that toList work?
Array does not extend TraversableOnce, but it is implicitly convertible to IndexedSeq, which does!
This means that internally, the array is converted to a WrappedArray, then toList is called on this.
See here for more info:
http://www.scala-lang.org/docu/files/collections-api/collections_38.html

Is there any smart implementation of flatMap that is able to infer common type characteristics for its input and output?

Consider the following type:
case class Subscriber(books: List[String])
And its instance wrapped in an option:
val s = Option(Subscriber(List("one", "two", "three")))
And an attempt to println all books for a subscriber:
s.flatMap(_.books).foreach(println(_))
This fails, due to:
Error: type mismatch;
found : List[String]
required: Option[?]
result.flatMap(_.books).foreach(println(_))
^
This is kind of expected, because flatMap must return types compatible to its source object and one can easily avoid this error by doing:
s.toList.flatMap(_.books).foreach(println(_))
I could also avoid flatMap, but its not the point.
But isn't there some smart method of achieving this without explicit toList conversion? Intuitively, None and List.empty have a lot in common. And during compilation of s.flatMap, s is implicitly converted to Traversable.
Something in scalaz, maybe?
I think smart implementation of flatamap is a bit missleading.
Such an implementation, as flatmap is well defined, would break the monadic behavior one expects when using flatmap and be in fact broken.
The problem with the smart method is
How should the compiler know what kind of type to use.
How should it be possible to infer if you wanted a List[_] or and Option[_] as a result?
What to do with other types, unknown types and what kind of conversion to apply on them ?
You could achieve the same result you would with the .list in that way , because for given example types don't matter (except you are aiming for a list of Units) at all as there is just output as a side effect:
s foreach ( _.books foreach println )
A direction to go, might be an implicit conversion using structural types but you would get a performance penalty.
The smarter way to do that would be is by using map.
val s = Option(Subscriber(List("one", "two", "three")))
s.map(_.books.map(println))
foreach has side effects while map doesn't

When applying `map` to a `Set` you sometimes want the result not to be a set but overlook this

Or how to avoid accidental removal of duplicates when mapping a Set?
This is a mistake I'm doing very often. Look at the following code:
def countSubelements[A](sl: Set[List[A]]): Int = sl.map(_.size).sum
The function shall count the accumulated size of all the contained lists. The problem is that after mapping the lists to their lengths, the result is still a Set and all lists of size 1 are reduced to a single representative.
Is it just me having this problem? Is there something I can do to prevent this happening? I think I'd love to have two methods mapToSet and mapToSeq for Set. But there is no way to enforce this, and sometimes you don't locally notice that you are working with a Set.
Maybe it's even possible that you were writing code for a Seq and something changes in another class and the underlying object becomes a Set?
Maybe something like a best practise to not let this situation arise at all?
Remote edits break my code
Imagine the following situation:
val totalEdges = graph.nodes.map(_.getEdges).map(_.size).sum / 2
You fetch a collection of Node objects from a graph, use them to get their adjacent edges and sum over them. This works if graph.nodes returns a Seq.
And it breaks if someone decides to make Graph return its nodes as a Set; without this code looking suspicious (at least not to me, do you expect every collection could possibly end up being a Set?) and without touching it.
It seems there will be many possible "gotcha's" if one expects a Seq and gets a Set. It's not a surprise that method implementations can depend on the type of the object and (with overloading) the arguments. With Scala implicits, the method can even depend on the expected return type.
A way to defend against surprises is to explicitly label types. For example, annotating methods with return types, even if it's not required. At least this way, if the type of graph.nodes is changed from Seq to Set, the programmer is aware that there's potential breakage.
For your specific issue, why not define your ownmapToSeq method,
scala> def mapToSeq[A, B](t: Traversable[A])(f: A => B): Seq[B] =
t.map(f)(collection.breakOut)
mapToSeq: [A, B](t: Traversable[A])(f: A => B)Seq[B]
scala> mapToSeq(Set(Seq(1), Seq(1,2)))(_.sum)
res1: Seq[Int] = Vector(1, 3)
scala> mapToSeq(Seq(Seq(1), Seq(1,2)))(_.sum)
res2: Seq[Int] = Vector(1, 3)
The advantage of using breakOut: CanBuildFrom is that the conversion from a Set to a Seq has no additional overhead.
You can make use the pimp my library pattern to make mapToSeq appear to be part of the Traversable trait, inherited by Seq and Set.
Workarounds:
def countSubelements[A](sl: Set[List[A]]): Int = sl.toSeq.map(_.size).sum
def countSubelements[A](sl: Set[List[A]]): Int = sl.foldLeft(0)(_ + _.size)

Where does Array get its toList method

Through searches, I understand the way (or, a way) to convert an Array to a List is like so:
val l = Array(1, 2, 3).toList
But not only can I not find the toList method in Array's API docs, I can't find it in anything that seems to be an ancestor or inherited trait of Array.
Using the newer 2.9 API docs, I see that toList exists in these things:
ImmutableMapAdaptor ImmutableSetAdaptor IntMap List ListBuffer LongMap
MutableList Option ParIterableLike PriorityQueue Stack StackProxy
StreamIterator SynchronizedSet SynchronizedStack TraversableForwarder
TraversableOnce TraversableOnceMethods TraversableProxyLike
But I can't understand how toList gets from one of these to be part of Array. Can anyone explain this?
toList and similar methods not natively found on Java arrays (including our old favourites, map, flatMap, filter etc.) come from s.c.m.ArrayOps, which arrays acquire via implicit conversions in scala.Predef. Look for implicit methods whose names end with ArrayOps and you'll see where the magic comes from.