I'm looking at this snippet:
val here: Array[Int] = rdd.collect()
println(here.toList)
... and at the source for toList (on TraversableOnce) and for Array (which does not inherit from TraversableOnce), but I can't find the connection that makes Scala treat an Array as a TraversableOnce, if that is even what happens. Is there some implicit at work here? Is there a conversion via ArraySeq or WrappedArray? How does that toList work?
Array does not extend TraversableOnce, but it is implicitly convertible to IndexedSeq, which does!
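When an Array is used where a Seq is expected, it is wrapped in a WrappedArray. For a method call such as toList, though, the compiler picks the higher-priority implicit conversion to ArrayOps (defined in scala.Predef), and toList is called on that.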
See here for more info:
http://www.scala-lang.org/docu/files/collections-api/collections_38.html
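To make the wrapping visible, here is a minimal sketch (the object and value names are made up for illustration). It assigns an Array to a value typed as scala.collection.Seq, which triggers the implicit wrapper, and then checks that the wrapper shares storage with the array while toList produces an independent copy.

```scala
object ArrayWrapDemo {
  val arr: Array[Int] = Array(1, 2, 3)

  // The expected type is a Seq, so the compiler inserts an implicit wrapper
  // around the array (no elements are copied).
  val wrapped: scala.collection.Seq[Int] = arr

  // toList is not defined on Array itself; it is reached through an implicit
  // conversion and builds a new immutable List.
  val listBeforeUpdate: List[Int] = arr.toList

  // Mutate the underlying array after both values have been created.
  arr(0) = 10

  // The wrapper sees the change, the List copy does not.
  val wrapperSeesUpdate: Boolean = wrapped.head == 10
  val listUnchanged: Boolean = listBeforeUpdate.head == 1
}
```

The typed val is only there to force the wrapping conversion; in everyday code you just call arr.toList and let the compiler choose the conversion.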
The following code converts a Scala List into a java.util.List (tested in Scala 2.11):
import scala.collection.JavaConverters._
val a = List(1, 2, 3)
val b = a.asJava
However, the conversion result seems incomplete, because some methods in java.util.List do not work:
scala> b.remove(2)
java.lang.UnsupportedOperationException
at java.util.AbstractList.remove(AbstractList.java:161)
... 29 elided
My workaround is as follows:
val c = new java.util.ArrayList(a.asJava)
This works, but it seems redundant from an API-design perspective.
Is this the correct way of using the asJava method?
Why does Scala's JavaConverters produce an incomplete result?
Because some methods in java.util.List do not work.
These methods are explicitly optional, because java.util.List covers both mutable and immutable lists, and some implementations in the Java standard library don't support them either:
Removes the first occurrence of the specified element from this list, if it is present (optional operation)...
Throws: ... UnsupportedOperationException - if the remove operation is not supported by this list
Same for other Java collection interfaces. So the result does completely satisfy the interface.
It appears that what you're getting from asJava is an immutable list, since you started with an immutable Scala list. Try the following:
val a = List(1,2,3).to[scala.collection.mutable.ListBuffer].asJava
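The behaviour described in these answers can be checked with a small sketch. It assumes scala.collection.JavaConverters as used in the question (Scala 2.11/2.12; later versions deprecate it in favour of scala.jdk.CollectionConverters), and the object and value names are illustrative only.

```scala
import scala.collection.JavaConverters._
import scala.collection.mutable.ListBuffer
import scala.util.Try

object AsJavaDemo {
  // Wrapping an immutable Scala List gives a read-only java.util.List view.
  val readOnlyView: java.util.List[Int] = List(1, 2, 3).asJava

  // remove(int) is an optional operation, so the view rejects it.
  val removeRejected: Boolean = Try(readOnlyView.remove(0)).isFailure

  // Copying into an ArrayList (the workaround from the question) yields a
  // fully mutable Java list that is independent of the Scala List.
  val mutableCopy = new java.util.ArrayList[Int](readOnlyView)
  mutableCopy.remove(0) // removes the element at index 0

  // Wrapping a mutable Scala buffer gives a view that does support removal,
  // and the change is visible in the underlying buffer.
  val buffer: ListBuffer[Int] = ListBuffer(1, 2, 3)
  val bufferView: java.util.List[Int] = buffer.asJava
  bufferView.remove(0)
}
```

Whether to copy or to wrap a mutable buffer depends on whether the Java side should see later changes made on the Scala side: the copy is detached, the wrapper is not.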
Prior to version 2.10 of Scala, sequence types had methods like toList and toArray for converting from one type to another. As of Scala 2.10 we have to[_], e.g. to[List], which appears to subsume toList and friends and also gives us the ability to convert to new types like Vector and presumably even to our own collection types. And of course it gives you the ability to convert to a type which you know only as a type parameter, e.g. to[A] -- nice!
But why weren't the old methods deprecated? Are they faster? Are there cases where toList works but to[List] does not? Should we prefer one over the other where both work?
toList is implemented in TraversableOnce as to[List], so there won't be any noticeable performance difference.
However, toArray is (very slightly) more efficient than to[Array] as the former allocates an array of the right size while the latter first creates an array and then sets the size hint (as it does for every target collection type). This should not make a difference in a real application unless you are converting data to arrays in a tight loop.
The old methods could easily be deprecated, and I bet they will in the future, but people are so used to them that deprecating them right away would probably make some people angry.
One issue seems to be that you cannot use to[] in postfix notation:
scala> Array(1,2) toList
res2: List[Int] = List(1, 2)
scala> Array(1,2) to[List]
<console>:1: error: ';' expected but '[' found.
Array(1,2) to[List]
scala> Array(1,2).to[List]
res3: List[Int] = List(1, 2)
If we are supposed to use Vector as the default Sequence type, why are there no methods toVector (like toList, toArray) in standard collection types?
In the prototyping stage, is it okay to conform all collections to the Seq type and use toSeq on all collection-returns (cast everything to Seq)?
Normally you should be more concerned with the interface that the collection implements rather than its concrete type, i.e. you should think in terms of Seq, LinearSeq, and IndexedSeq rather than List, Array, and Vector, which are concrete implementations. So arguably there shouldn't be toList and toArray either, but I guess they're there because they're so fundamental.
The toIndexedSeq method in practice provides you with a Vector, unless a collection overrides this in order to provide a more efficient implementation. You can also make a Vector with Vector() ++ c where c is your collection.
Scala 2.10 will come with a method .to[C[_]] so that you can write .to[List], .to[Array], .to[Vector], or any other compatible C.
Scala 2.10 adds not only .to[Vector], but .toVector, as well:
In TraversableOnce trait inherited by collections and iterators:
abstract def toVector: Vector[A]
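For comparison, here are the three routes to a Vector mentioned in these answers, side by side. This is only a sketch with made-up names; it assumes Scala 2.10 or later for toVector.

```scala
object ToVectorDemo {
  val source: List[Int] = List(1, 2, 3)

  // toIndexedSeq promises only an IndexedSeq; in practice the default
  // implementation is a Vector unless the collection overrides it.
  val viaIndexedSeq: IndexedSeq[Int] = source.toIndexedSeq

  // Appending the collection to an empty Vector also builds a Vector.
  val viaConcat: Vector[Int] = Vector() ++ source

  // toVector states the target type directly.
  val viaToVector: Vector[Int] = source.toVector
}
```

If the code only needs indexed access, typing the result as IndexedSeq keeps it independent of the concrete implementation, which is the point made above about preferring interfaces.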
I'm using a library (JXPath) to query a graph of beans in order to extract matching elements. However, JXPath returns groups of matching elements as an instance of java.util.Iterator and I'd rather convert it into an immutable Scala list. Is there any simpler way of doing this than iterating over the iterator and creating a new immutable list at each iteration step?
You might want to rethink the need for a List. It feels very familiar when coming from Java, and List is the default implementation of an immutable Seq, but it often isn't the best choice of collection.
The operations that list is optimal for are those already available via an iterator (basically taking consecutive head elements and prepending elements). If an iterator doesn't already give you what you need, then I can pretty much guarantee that a List won't be your best choice - a vector would be more appropriate.
Having got that out the way... The recommended technique to convert between Java and Scala collections (since Scala 2.8.1) is via scala.collection.JavaConverters. This gives you more control than JavaConversions and avoids some possible implicit conflicts.
You won't have a direct implicit conversion this way. Instead, you get asScala and asJava methods pimped onto collections, allowing you to perform the conversions explicitly.
To convert a Java iterator to a Scala iterator:
javaIterator.asScala
To convert a Java iterator to a Scala List (via the scala iterator):
javaIterator.asScala.toList
You may also want to consider converting toSeq instead of toList. In the case of iterators, this'll return a Stream - allowing you to retain the lazy behaviour of iterators within the richer Seq interface.
EDIT:
At the time of writing there's no toVector method (one was added in Scala 2.10), but, as Daniel pointed out, there's a toIndexedSeq method that will return a Vector as the default IndexedSeq subclass (just as List is the default Seq).
javaIterator.asScala.toIndexedSeq
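Put together, the conversions above look roughly like this. The sketch assumes scala.collection.JavaConverters as recommended in this answer; the iterator is a hypothetical stand-in built from a small Java list, since the real one would come from JXPath.

```scala
import scala.collection.JavaConverters._

object JavaIteratorDemo {
  // Hypothetical stand-in for the iterator returned by JXPath.
  // It is a def because an iterator can only be consumed once.
  def javaIterator: java.util.Iterator[String] =
    java.util.Arrays.asList("a", "b", "c").iterator()

  // Explicit wrap to a Scala Iterator, then build an immutable List.
  val asList: List[String] = javaIterator.asScala.toList

  // Same wrap, but build an indexed sequence instead.
  val asIndexedSeq: IndexedSeq[String] = javaIterator.asScala.toIndexedSeq
}
```

Each conversion needs a fresh iterator: calling toList on an already exhausted iterator would just produce an empty List.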
EDIT: You should probably look at Kevin Wright's answer, which provides a better solution available since Scala 2.8.1, with less implicit magic.
You can import the implicit conversions from scala.collection.JavaConversions and then create a new Scala collection seamlessly, e.g. like this:
import collection.JavaConversions._
println(List() ++ javaIterator)
Your Java iterator is converted to a Scala iterator by JavaConversions.asScalaIterator. A Scala iterator with elements of type A implements TraversableOnce[A], which is the argument type needed to concatenate collections with ++.
If you need another collection type, just change List() to whatever you need (e.g., IndexedSeq() or collection.mutable.Seq(), etc.).
Through searches, I understand the way (or, a way) to convert an Array to a List is like so:
val l = Array(1, 2, 3).toList
But not only can I not find the toList method in Array's API docs, I can't find it in anything that seems to be an ancestor or inherited trait of Array.
Using the newer 2.9 API docs, I see that toList exists in these things:
ImmutableMapAdaptor ImmutableSetAdaptor IntMap List ListBuffer LongMap
MutableList Option ParIterableLike PriorityQueue Stack StackProxy
StreamIterator SynchronizedSet SynchronizedStack TraversableForwarder
TraversableOnce TraversableOnceMethods TraversableProxyLike
But I can't understand how toList gets from one of these to be part of Array. Can anyone explain this?
toList and similar methods not natively found on Java arrays (including our old favourites, map, flatMap, filter etc.) come from s.c.m.ArrayOps, which arrays acquire via implicit conversions in scala.Predef. Look for implicit methods whose names end with ArrayOps and you'll see where the magic comes from.
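To see where the magic comes from, you can call the conversion by hand. The sketch below assumes Predef.intArrayOps, the Int-specific member of that family of implicit methods; the object and value names are illustrative.

```scala
object ArrayOpsDemo {
  val arr: Array[Int] = Array(1, 2, 3)

  // What you normally write: the compiler inserts the conversion for you.
  val implicitly: List[Int] = arr.toList

  // Roughly what the compiler expands it to: an explicit call to the
  // ArrayOps conversion from scala.Predef, followed by toList.
  val explicitly: List[Int] = Predef.intArrayOps(arr).toList

  // The other "collection" methods on arrays arrive the same way.
  val doubled: Array[Int] = Predef.intArrayOps(arr).map(_ * 2)
}
```

This is also why toList does not show up in Array's own API docs: it lives on ArrayOps, and arrays only borrow it through the implicit conversion.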