What is the Scala equivalent of C++ typeid? - scala

For example, if I do
scala> val a = Set(1,2,3)
a: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
in the REPL, I want to see the most refined type of "a" in order to know whether it's really a HashSet. In C++, typeid(a).name() would do it. What is the Scala equivalent?

scala> val a = Set(1,2,3)
a: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
scala> a.getClass.getName
res0: java.lang.String = scala.collection.immutable.Set$Set3
(Yes, it really is an instance of an inner class called Set3--it's a set specialized for 3 elements. If you make it a little larger, it'll be a HashTrieSet.)
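For instance (on the Scala versions current when this answer was written; the exact class name is an implementation detail and differs between versions):
scala> Set(1,2,3,4,5).getClass.getName
res1: java.lang.String = scala.collection.immutable.HashSet$HashTrieSet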
Edit: @pst also pointed out that the type information [Int] was erased; this is how JVM generics work. However, the REPL keeps track since the compiler still knows the type. If you want to get the type that the compiler knows, you can
def manifester[A: ClassManifest](a: A) = implicitly[ClassManifest[A]]
and then you'll get something whose toString is the same as what the REPL reports. Between the two of these, you'll get as much type information as there is to be had. Of course, since the REPL already does this for you, you normally don't need to use this. But if for some reason you want to, the erased types are available from .typeArguments from the manifest.
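For example, a rough sketch of how the two pieces fit together (on the 2.9-era API this answer targets, where ClassManifest still carries type arguments; the comments show the expected results):
val a = Set(1, 2, 3)
def manifester[A: ClassManifest](a: A) = implicitly[ClassManifest[A]]
val m = manifester(a)
m.toString          // scala.collection.immutable.Set[Int] -- what the compiler knew
m.typeArguments     // List(Int) -- the erased type arguments
a.getClass.getName  // scala.collection.immutable.Set$Set3 -- the runtime class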

Why the variation in operators?

Long time lurker, first time poster.
In Scala, I'm looking for the advantages of varying the operators depending on the operand type. For example, why was this:
Vector(1, 2, 3) :+ 4
determined to be an advantage over:
Vector(1, 2, 3) + 4
Or:
4 +: Vector(1,2,3)
over:
Vector(4) + Vector(1,2,3)
Or:
Vector(1,2,3) ++ Vector(4,5,6)
over:
Vector(1,2,3) + Vector(4,5,6)
So, here we have :+, +:, and ++ when + alone could have sufficed. I'm new to Scala, and I'll go along with it, but this seems unnecessary and obfuscated for a language that tries to be clean with its syntax.
I've done quite a few Google and Stack Overflow searches and have only found questions about specific operators, and about operator overloading in general, but no background on why it was necessary to split +, for example, into multiple variations.
FWIW, I could overload the operators using implicit classes, such as below, but I imagine that would only cause confusion (and tsk-tsks) from experienced Scala programmers using/reading my code.
object AddVectorDemo {
  implicit class AddVector(vector: Vector[Any]) {
    def +(that: Vector[Any]) = vector ++ that
    def +(that: Any) = vector :+ that
  }
  def main(args: Array[String]): Unit = {
    val u = Vector(1,2,3)
    val v = Vector(4,5,6)
    println(u + v)
    println(u + v + 7)
  }
}
Outputs:
Vector(1, 2, 3, 4, 5, 6)
Vector(1, 2, 3, 4, 5, 6, 7)
The answer requires a surprisingly long detour through variance. I'll try to make it as short as possible.
First, note that you can add anything to an existing Vector:
scala> Vector(1)
res0: scala.collection.immutable.Vector[Int] = Vector(1)
scala> res0 :+ "fish"
res1: scala.collection.immutable.Vector[Any] = Vector(1, fish)
Why can you do this? Well, if B extends A and we want to be able to use Vector[B] where Vector[A] is called for, we need to allow Vector[B] to add the same sorts of things that Vector[A] can add. But everything extends Any, so we need to allow addition of anything that Vector[Any] can add, which is everything.
Making Vector and most other non-Set collections covariant is a design decision, but it's what most people expect.
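Concretely, the widening is visible in the (slightly simplified) signature of :+, where the compiler picks the least supertype B that fits both the element type and the argument:
def :+[B >: A](elem: B): Vector[B]
val v: Vector[Int] = Vector(1)
val w: Vector[Any] = v :+ "fish"   // B is inferred as Any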
Now, let's try adding a vector to a vector.
scala> res0 :+ Vector("fish")
res2: scala.collection.immutable.Vector[Any] = Vector(1, Vector(fish))
scala> res0 ++ Vector("fish")
res3: scala.collection.immutable.Vector[Any] = Vector(1, fish)
If we only had one operation, +, we wouldn't be able to specify which one of these things we meant. And we really might mean to do either. They're both perfectly sensible things to try. We could try to guess based on types, but in practice it's better to just ask the programmer to explicitly say what they mean. And since there are two different things to mean, there need to be two ways to ask.
Does this come up in practice? With collections of collections, yes, all the time. For example, using your + method:
scala> Vector(Vector(1), Vector(2))
res4: Vector[Vector[Int]] = Vector(Vector(1), Vector(2))
scala> res4 + Vector(3)
res5: Vector[Any] = Vector(Vector(1), Vector(2), 3)
That's probably not what I wanted.
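For comparison, the distinct operator lets you say exactly which of the two you mean:
scala> res4 :+ Vector(3)
res6: Vector[Vector[Int]] = Vector(Vector(1), Vector(2), Vector(3))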
It's a fair question, and I think it has a lot to do with legacy code and Java compatibility. Scala copied Java's + for String concatenation, which has complicated things.
This + allows us to do:
(new Object) + "foobar" //"java.lang.Object#5bb90b89foobar"
So what should we expect if we had + for List and we did List(1) + "foobar"? One might expect List(1, "foobar") (of type List[Any]), just like we get if we use :+, but the Java-inspired String-concatenation overload would complicate this, since the compiler would fail to resolve the overload.
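For illustration, here is what that String-concatenation + already does to a List in Scala 2, via Predef's any2stringadd (deprecated in 2.13 but still present):
scala> List(1) + "foobar"
res0: String = List(1)foobar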
Odersky even once commented:
One should never have a + method on collections that are covariant in their element type. Sets and maps are non-variant, that's why they can have a + method. It's all rather delicate and messy. We'd be better off if we did not try to duplicate Java's + for String concatenation. But when Scala got designed the idea was to keep essentially all of Java's expression syntax, including String +. And it's too late to change that now.
There is some discussion (although in a different context) in the answers to this similar question.

Set sequencing type puzzle

Last night in responding to this question, I noticed the following:
scala> val foo: Option[Set[Int]] = Some(Set(1, 2, 3))
foo: Option[Set[Int]] = Some(Set(1, 2, 3))
scala> import scalaz._, Scalaz._
import scalaz._
import Scalaz._
scala> foo.sequenceU
res0: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
That is, if foo is an optional set of integers, sequencing it returns a set of integers.
This isn't what I expected at first, since sequencing an F[G[A]] should return a G[F[A]] (assuming that F is traversable and G is an applicative functor). In this case, though, the Option layer just disappears.
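For comparison, the usual shape: sequencing a List[Option[Int]] flips it into an Option[List[Int]]:
scala> List(1.some, 2.some).sequence
res1: Option[List[Int]] = Some(List(1, 2))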
I know this probably has something to do with some interaction between one of the supertypes of Set and the Unapply machinery that makes sequenceU work, and when I can find a few minutes I'm planning to work through the types and write up a description of what's going on.
It seems like a potentially interesting little puzzle, though, and I thought I'd post it here in case someone can beat me to an answer.
Wow, yeah. Here's what I can surmise is happening: since Set doesn't have an Applicative of its own, we are getting the Monoid#applicative instance instead:
scala> implicitly[Unapply[Applicative, Set[Int]]].TC
res0: scalaz.Applicative[_1.M] forSome { val _1: scalaz.Unapply[scalaz.Applicative,Set[Int]] } = scalaz.Monoid$$anon$1@7f5d0856
Since Monoid is defined for types of kind * and applicative is defined for types of kind * -> *, the definition of Applicative in Monoid sorta wedges in an ignored type parameter using a type lambda:
final def applicative: Applicative[({type λ[α]=F})#λ] = new Applicative[({type λ[α]=F})#λ] with SemigroupApply...
Notice there that the type parameter α of λ is thrown away, so when Applicative#point is called, which becomes Monoid#zero, instead of it being a Monoid[Set[Option[Int]]] it is a Monoid[Set[Int]].
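Here is a minimal, self-contained sketch of that construction (illustrative names only, not the actual scalaz source):
trait Applicative[F[_]] {
  def point[A](a: => A): F[A]
  def ap[A, B](fa: => F[A])(f: => F[A => B]): F[B]
}
trait Monoid[F] {
  def zero: F
  def append(f1: F, f2: => F): F
  // The type lambda throws its parameter away: F[α] is always just F, so
  // point ignores its argument and returns zero, and ap appends the two values.
  def applicative: Applicative[({ type λ[α] = F })#λ] =
    new Applicative[({ type λ[α] = F })#λ] {
      def point[A](a: => A): F = zero
      def ap[A, B](fa: => F)(f: => F): F = append(f, fa)
    }
}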
larsh points out that this has the interesting side effect of allowing sequenceU to be (ab)used as sum:
scala> List(1,2,3).sequenceU
res3: Int = 6

scala collection conversions

What is the most effective way of converting between different scala.collection objects?
E.g.
val a = scala.collection.mutable.ListBuffer(1,2,0,3)
And I want to get scala.collection.mutable.ArrayBuffer.
According to http://docs.scala-lang.org/resources/images/collections.mutable.png it should be possible by converting to Buffer and to ArrayBuffer afterwards. Correct?
In general, can I make any conversion between Scala collections through their common ancestor? (In the previous example the common ancestor is Buffer.)
PS I read http://docs.scala-lang.org/overviews/collections/introduction.html but couldn't find anything about general conversions between the various types (I'm aware of .toArray-like methods).
thx
Syntax-wise, the most effective should be the to method introduced in 2.10:
def to[Col[_]]: Col[A]
Converts this collection into another by copying all elements.
Note: will not terminate for infinite-sized collections.
Use it as a.to[scala.collection.mutable.ArrayBuffer].
Efficiency-wise, unless you do an upcast-like conversion where you turn a subtype into a more general collection, converting will involve copying the elements. In your example, it does not matter whether you turn the list buffer into a buffer and then into an array buffer or do it directly using to: either way the elements are copied out of a linked list into an array.
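As a small sketch of that difference (using the 2.10–2.12 to syntax from above; the type ascription is the "upcast-like" case):
import scala.collection.mutable
val a = mutable.ListBuffer(1, 2, 0, 3)
val asSeq: collection.Seq[Int] = a            // upcast: no copying, just a wider static type
val asArrayBuffer = a.to[mutable.ArrayBuffer] // real conversion: elements are copied into an array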
Answering question number 2:
Welcome to Scala version 2.10.0 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_07).
Type in expressions to have them evaluated.
Type :help for more information.
scala> import collection.mutable._
import collection.mutable._
scala> List(1,2,3,4,5)
res0: List[Int] = List(1, 2, 3, 4, 5)
scala> res0.to[ArrayBuffer]
res1: scala.collection.mutable.ArrayBuffer[Int] = ArrayBuffer(1, 2, 3, 4, 5)
scala> res0.to[ListBuffer]
res2: scala.collection.mutable.ListBuffer[Int] = ListBuffer(1, 2, 3, 4, 5)
You can convert between them however you like, as long as the types are compatible:
scala> res0.to[Map]
<console>:12: error: scala.collection.mutable.Map takes two type parameters, expected: one
res0.to[Map]
^

Should Scala's map() behave differently when mapping to the same type?

In the Scala Collections framework, I think there are some behaviors that are counterintuitive when using map().
We can distinguish two kinds of transformations on (immutable) collections: those whose implementation calls newBuilder to recreate the resulting collection, and those that go through an implicit CanBuildFrom to obtain the builder.
The first category contains all transformations where the type of the contained elements does not change. They are, for example, filter, partition, drop, take, span, etc. These transformations are free to call newBuilder and to recreate the same collection type as the one they are called on, no matter how specific: filtering a List[Int] can always return a List[Int]; filtering a BitSet (or the RNA example structure described in this article on the architecture of the collections framework) can always return another BitSet (or RNA). Let's call them the filtering transformations.
The second category of transformations needs CanBuildFroms to be more flexible, as the type of the contained elements may change, and as a result the type of the collection itself may not be reusable: a BitSet cannot contain Strings; an RNA contains only Bases. Examples of such transformations are map, flatMap, collect, scanLeft, ++, etc. Let's call them the mapping transformations.
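For reference, the (abridged) signatures in 2.x's TraversableLike make the difference concrete: filter can always rebuild the same representation type Repr, while map asks an implicit CanBuildFrom what to build:
def filter(p: A => Boolean): Repr
def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That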
Now here's the main issue to discuss. No matter what the static type of the collection is, all filtering transformations will return the same collection type, while the collection type returned by a mapping operation can vary depending on the static type.
scala> import collection.immutable.TreeSet
import collection.immutable.TreeSet
scala> val treeset = TreeSet(1,2,3,4,5) // static type == dynamic type
treeset: scala.collection.immutable.TreeSet[Int] = TreeSet(1, 2, 3, 4, 5)
scala> val set: Set[Int] = TreeSet(1,2,3,4,5) // static type != dynamic type
set: Set[Int] = TreeSet(1, 2, 3, 4, 5)
scala> treeset.filter(_ % 2 == 0)
res0: scala.collection.immutable.TreeSet[Int] = TreeSet(2, 4) // fine, a TreeSet again
scala> set.filter(_ % 2 == 0)
res1: scala.collection.immutable.Set[Int] = TreeSet(2, 4) // fine
scala> treeset.map(_ + 1)
res2: scala.collection.immutable.SortedSet[Int] = TreeSet(2, 3, 4, 5, 6) // still fine
scala> set.map(_ + 1)
res3: scala.collection.immutable.Set[Int] = Set(4, 5, 6, 2, 3) // uh?!
Now, I understand why this works like this. It is explained there and there. In short: the implicit CanBuildFrom is inserted based on the static type, and, depending on the implementation of its def apply(from: Coll) method, may or may not be able to recreate the same collection type.
Now my only point is, when we know that we are using a mapping operation yielding a collection with the same element type (which the compiler can statically determine), we could mimic the way the filtering transformations work and use the collection's native builder. We can reuse BitSet when mapping to Ints, create a new TreeSet with the same ordering, etc.
Then we would avoid cases where
for (i <- set) {
  val x = i + 1
  println(x)
}
does not print the incremented elements of the TreeSet in the same order as
for (i <- set; x = i + 1)
  println(x)
So:
Do you think this would be a good idea to change the behavior of the mapping transformations as described?
What are the inevitable caveats I have grossly overlooked?
How could it be implemented?
I was thinking about something like an implicit sameTypeEvidence: A =:= B parameter, maybe with a default value of null (or rather an implicit canReuseCalleeBuilderEvidence: B <:< A = null), which could be used at runtime to give more information to the CanBuildFrom, which in turn could be used to determine the type of builder to return.
I looked again at it, and I think your problem doesn't arise from a particular deficiency of the Scala collections, but rather from a missing builder for TreeSet, because the following does work as intended:
val list = List(1,2,3,4,5)
val seq1: Seq[Int] = list
seq1.map( _ + 1 ) // yields List
val vector = Vector(1,2,3,4,5)
val seq2: Seq[Int] = vector
seq2.map( _ + 1 ) // yields Vector
So the reason is that TreeSet is missing a specialised companion object/builder:
seq1.companion.newBuilder[Int] // ListBuffer
seq2.companion.newBuilder[Int] // VectorBuilder
treeset.companion.newBuilder[Int] // Set (oops!)
So my guess is that if you make proper provision for such a companion for your RNA class, you may find that both map and filter work as you wish...?
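For concreteness, a rough sketch of such a companion, paraphrased from the collections-architecture article the question links to (the packing/decoding details are elided with ???):
import scala.collection.IndexedSeqLike
import scala.collection.generic.CanBuildFrom
import scala.collection.mutable.{ArrayBuffer, Builder}
abstract class Base
final class RNA private (val groups: Array[Int], val length: Int)
    extends IndexedSeq[Base] with IndexedSeqLike[Base, RNA] {
  override def newBuilder: Builder[Base, RNA] = RNA.newBuilder
  def apply(idx: Int): Base = ???   // decoding elided
}
object RNA {
  def fromSeq(buf: Seq[Base]): RNA = ???   // packing elided
  def newBuilder: Builder[Base, RNA] =
    new ArrayBuffer[Base] mapResult fromSeq
  // This is the piece the answer says TreeSet's companion effectively lacks:
  // with it, mapping Base => Base can rebuild an RNA instead of falling back
  // to a general Seq or Set.
  implicit def canBuildFrom: CanBuildFrom[RNA, Base, RNA] =
    new CanBuildFrom[RNA, Base, RNA] {
      def apply(): Builder[Base, RNA] = newBuilder
      def apply(from: RNA): Builder[Base, RNA] = newBuilder
    }
}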

In Scala 2, type inference fails on Set made with .toSet?

Why is type inference failing here?
scala> val xs = List(1, 2, 3, 3)
xs: List[Int] = List(1, 2, 3, 3)
scala> xs.toSet map(_*2)
<console>:9: error: missing parameter type for expanded function ((x$1) => x$1.$times(2))
xs.toSet map(_*2)
However, if xs.toSet is assigned, it compiles.
scala> xs.toSet
res42: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
scala> res42 map (_*2)
res43: scala.collection.immutable.Set[Int] = Set(2, 4, 6)
Also, going the other way, converting the Set to a List and mapping over the List compiles.
scala> Set(5, 6, 7)
res44: scala.collection.immutable.Set[Int] = Set(5, 6, 7)
scala> res44.toList map(_*2)
res45: List[Int] = List(10, 12, 14)
Q: Why doesn't toSet do what I want?
A: That would be too easy.
Q: But why doesn't this compile? List(1).toSet.map(x => ...)
A: The Scala compiler is unable to infer that x is an Int.
Q: What, is it stupid?
A: Well, List[A].toSet doesn't return an immutable.Set[A]. It returns an immutable.Set[B] for some unknown B >: A.
Q: How was I supposed to know that?
A: From the Scaladoc.
Q: But why is toSet defined that way?
A: You might be assuming immutable.Set is covariant, but it isn't; it's invariant. And toSet's result type is a covariant position in List[+A], so the covariant element type A can't appear there as the type argument of the invariant Set; toSet has to widen it to some B >: A.
Q: What do you mean, "covariant position"?
A: Let me Wikipedia that for you: http://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science) . See also chapter 19 of Odersky, Venners & Spoon.
Q: I understand now. But why is immutable.Set invariant?
A: Let me Stack Overflow that for you: Why is Scala's immutable Set not covariant in its type?
Q: I surrender. How do I fix my original code?
A: This works: List(1).toSet[Int].map(x => ...). So does this: List(1).toSet.map((x: Int) => ...)
(with apologies to Friedman & Felleisen. thx to paulp & ijuma for assistance)
EDIT: There is valuable additional information in Adriaan's answer and in the discussion in the comments both there and here.
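To see the invariance point from the dialogue in code, a minimal sketch (not the real List source):
trait MyList[+A] {
  // def toSet: Set[A]       // does not compile: covariant A occurs in invariant position
  def toSet[B >: A]: Set[B]  // so the element type has to be widened, as in the real List#toSet
}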
The type inference does not work properly as the signature of List#toSet is
def toSet[B >: A]: scala.collection.immutable.Set[B]
and the compiler would need to infer the types in two places in your call. An alternative to annotating the parameter in your function would be to invoke toSet with an explicit type argument:
xs.toSet[Int] map (_*2)
UPDATE:
Regarding your question why the compiler can infer it in two steps, let's just look at what happens when you type the lines one by one:
scala> val xs = List(1,2,3)
xs: List[Int] = List(1, 2, 3)
scala> val ys = xs.toSet
ys: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
Here the compiler will infer the most specific type for ys which is Set[Int] in this case. This type is known now, so the type of the function passed to map can be inferred.
If you filled in all possible type parameters in your example the call would be written as:
xs.toSet[Int].map[Int,Set[Int]](_*2)
where the second type parameter is used to specify the type of the returned collection (for details look at how Scala collections are implemented). This means I even underestimated the number of types the compiler has to infer.
In this case it may seem easy to infer Int, but there are cases where it is not (given the other features of Scala like implicit conversions, singleton types, traits as mixins, etc.). I'm not saying it cannot be done; it's just that the Scala compiler does not do it.
I agree it would be nice to infer "the only possible" type, even when calls are chained, but there are technical limitations.
You can think of inference as a breadth-first sweep over the expression, collecting constraints (which arise from subtype bounds and required implicit arguments) on type variables, followed by solving those constraints. This approach allows, e.g., implicits to guide type inference. In your example, even though there is a single solution if you only look at the xs.toSet subexpression, later chained calls could introduce constraints that make the system unsatisfiable. The downside of leaving the type variables unsolved is that type inference for closures requires the target type to be known, and will thus fail (it needs something concrete to go on: the required type of the closure and the types of its arguments must not both be unknown).
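A small example of why the compiler doesn't eagerly commit to B = Int for the xs.toSet subexpression: the surrounding context is allowed to push B higher.
val s: Set[Any] = List(1, 2, 3).toSet   // compiles; here B is inferred as Any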
Now, when delaying solving the constraints makes inference fail, we could backtrack, solve all the type variables, and retry, but this is tricky to implement (and probably quite inefficient).