HMap seems to be the perfect data structure for my use case, however, I can't get it working:
case class Node[N](node: N)
class ImplVal[K, V]
implicit val iv1 = new ImplVal[Int, Node[Int]]
implicit val iv2 = new ImplVal[Int, Node[String]]
implicit val iv3 = new ImplVal[String, Node[Int]]
val hm = HMap[ImplVal](1 -> Node(1), 2 -> Node("two"), "three" -> Node(3))
My first question is whether it is possible to create those implicits vals automatically. For sure, for typical combinations I could create them manually, but I'm wondering if there is something more generic, less boilerplate way.
Next question is, how to get values out of the map:
val res1 = hm.get(1) // (1) ambiguous implicit values: both value iv2 [...] and value iv1 [...] match expected type ImplVal[Int,V]`
To me, Node[Int] (iv1) and Node[String] (iv2) look pretty different :) I thought, despite the JVM type erasure limitations, Scala could differentiate here. What am I missing? Do I have to use other implicit values to make the difference clear?
The explicit version works:
val res2 = hm.get[Int, Node[Int]](1) // (2) works
Of course, in this simple case, I could add the type information to the get call. But in the following case, where only the keys are known in advance, I don't know how to do it:
def get[T <: HList](keys: T): HList = // return associated values for keys
Is there any simple solution to this problem?
BTW, what documentation about Scala's type system (or Shapeless or in functional programming in general) could be recommended to understand the whole topic better as I have to admit, I'm lacking some background for this topic.
The type of the key determines the type of the value. You have Int keys corresponding to both Node[Int] and Node[String] values, hence the ambiguity. You might find this article helpful in explaining the general mechanism underlying this.
Related
Assuming both type A and B are erased and unknown at run time.
Is there a way to construct TypeTag[Map[A,B]]?
Preferrably using only explicit constructor, as in my real code both A and B are wildcard (it is possible to assign type parameters to 2 functions invoking them but why bother extracting 2 methods when they are only used once).
Thank you for your idea. Any suggestion is appreciated.
Preferrably using only explicit constructor, as in my real code both A and B are wildcard
It isn't completely clear, but do you mean that you have tagA: TypeTag[_] and tagB: TypeTag[_]? If so, then you can do
(tagA, tagB) match {
case (tagA: TypeTag[a], tagB: TypeTag[b]) =>
implicit val tagA1 = tagA
implicit val tagB1 = tagB
typeTag[Map[a, b]]
}
Somewhat ugly and boilerplaty, but I don't think there is a better way currently (and would be happy to learn otherwise).
I've looked around and found several other examples of this, but I don't really understand from those answers what's actually going on.
I'd like to understand why the following code fails to compile:
val df = readFiles(sqlContext).
withColumn("timestamp", udf(UDFs.parseDate _)($"timestamp"))
Giving the error:
Error:(29, 58) not enough arguments for method udf: (implicit evidence$2: reflect.runtime.universe.TypeTag[java.sql.Date], implicit evidence$3: reflect.runtime.universe.TypeTag[String])org.apache.spark.sql.UserDefinedFunction.
Unspecified value parameter evidence$3.
withColumn("timestamp", udf(UDFs.parseDate _)($"timestamp")).
^
Whereas this code does compile:
val parseDate = udf(UDFs.parseDate _)
val df = readFiles(sqlContext).
withColumn("timestamp", parseDate($"timestamp"))
Obviously I've found a "workaround" but I'd really like to understand:
What this error really means. The info I have found on TypeTags and ClassTags has been really difficult to understand. I don't come from a Java background, which perhaps doesn't help, but I think I should be able to grasp it…
If I can achieve what I want without a separate function definition
The error message is a bit mis-leading indeed; the reason for it is that the function udf takes an implicit parameter list but you are passing an actual parameter. Since I don't know much about spark and since the udf signature is a bit convoluted I'll try to explain what is going on with a simplified example.
In practice udf is a function that given some explicit parameters and an implicit parameter list gives you another function; let's define the following function that given a pivot of type T for which we have an implicit Ordering will give as a function that allows us to split a sequence in two, one containing elements smaller than pivot and the other containing elements that are bigger:
def buildFn[T](pivot: T)(implicit ev: Ordering[T]): Seq[T] => (Seq[T], Seq[T]) = ???
Let's leave out the implementation as it's not important. Now, if I do the following:
val elements: Seq[Int] = ???
val (small, big) = buildFn(10)(elements)
I will make the same kind of mistake that you are showing in your code, i.e. the compiler will think that I am explicitly passing elements as the implicit parameter list and this won't compile. The error message of my example will be somewhat different from the one you have because in my case the number of parameters I am mistakenly passing for the implicit parameter list matches the expected one and then the error will be about types not lining up.
Instead, if I write it as:
val elements: Seq[Int] = ???
val fn = buildFn(10)
val (small, big) = fn(elements)
In this case the compiler will correctly pass the implicit parameters to the function. I don't know of any way to circumvent this problem, unless you want to pass the actual implicit parameters explicitly but I find it quite ugly and not always practical; for reference this is what I mean:
val elements: Seq[Int] = ???
val (small, big) = buildFn(10)(implicitly[Ordering[Int]])(elements)
I want to map from class tokens to instances along the lines of the following code:
trait Instances {
def put[T](key: Class[T], value: T)
def get[T](key: Class[T]): T
}
Can this be done without having to resolve to casts in the get method?
Update:
How could this be done for the more general case with some Foo[T] instead of Class[T]?
You can try retrieving the object from your map as an Any, then using your Class[T] to “cast reflectively”:
trait Instances {
private val map = collection.mutable.Map[Class[_], Any]()
def put[T](key: Class[T], value: T) { map += (key -> value) }
def get[T](key: Class[T]): T = key.cast(map(key))
}
With help of a friend of mine, we defined the map with keys as Manifest instead of Class which gives a better api when calling.
I didnt get your updated question about "general case with some Foo[T] instead of Class[T]". But this should work for the cases you specified.
object Instances {
private val map = collection.mutable.Map[Manifest[_], Any]()
def put[T: Manifest](value: T) = map += manifest[T] -> value
def get[T: Manifest]: T = map(manifest[T]).asInstanceOf[T]
def main (args: Array[String] ) {
put(1)
put("2")
println(get[Int])
println(get[String])
}
}
If you want to do this without any casting (even within get) then you will need to write a heterogeneous map. For reasons that should be obvious, this is tricky. :-) The easiest way would probably be to use a HList-like structure and build a find function. However, that's not trivial since you need to define some way of checking type equality for two arbitrary types.
I attempted to get a little tricky with tuples and existential types. However, Scala doesn't provide a unification mechanism (pattern matching doesn't work). Also, subtyping ties the whole thing in knots and basically eliminates any sort of safety it might have provided:
val xs: List[(Class[A], A) forSome { type A }] = List(
classOf[String] -> "foo", classOf[Int] -> 42)
val search = classOf[String]
val finalResult = xs collect { case (`search`, result) => result } headOption
In this example, finalResult will be of type Any. This is actually rightly so, since subtyping means that we don't really know anything about A. It's not why the compiler is choosing that type, but it is a correct choice. Take for example:
val xs: List[(Class[A], A) forSome { type A }] = List(classOf[Boolean] -> 'bippy)
This is totally legal! Subtyping means that A in this case will be chosen as Any. It's hardly what we want, but it is what you will get. Thus, in order to express this constraint without tracking all of the types individual (using a HMap), Scala would need to be able to express the constraint that a type is a specific type and nothing else. Unfortunately, Scala does not have this ability, and so we're basically stuck on the generic constraint front.
Update Actually, it's not legal. Just tried it and the compiler kicked it out. I think that only worked because Class is invariant in its type parameter. So, if Foo is a definite type that is invariant, you should be safe from this case. It still doesn't solve the unification problem, but at least it's sound. Unfortunately, type constructors are assumed to be in a magical super-position between co-, contra- and invariance, so if it's truly an arbitrary type Foo of kind * => *, then you're still sunk on the existential front.
In summary: it should be possible, but only if you fully encode Instances as a HMap. Personally, I would just cast inside get. Much simpler!
By dictionary I mean a lightweight map from names to values that can be used as the return value of a method.
Options that I'm aware of include making case classes, creating anon objects, and making maps from Strings -> Any.
Case classes require mental overhead to create (names), but are strongly typed.
Anon objects don't seem that well documented and it's unclear to me how to use them as arguments since there is no named type.
Maps from String -> Any require casting for retrieval.
Is there anything better?
Ideally these could be built from json and transformed back into it when appropriate.
I don't need static typing (though it would be nice, I can see how it would be impossible) - but I do want to avoid explicit casting.
Here's the fundamental problem with what you want:
def get(key: String): Option[T] = ...
val r = map.get("key")
The type of r will be defined from the return type of get -- so, what should that type be? From where could it be defined? If you make it a type parameter, then it's relatively easy:
import scala.collection.mutable.{Map => MMap}
val map: MMap[String, (Manifest[_], Any) = MMap.empty
def get[T : Manifest](key: String): Option[T] = map.get(key).filter(_._1 <:< manifest[T]).map(_._2.asInstanceOf[T])
def put[T : Manifest](key: String, obj: T) = map(key) = manifest[T] -> obj
Example:
scala> put("abc", 2)
scala> put("def", true)
scala> get[Boolean]("abc")
res2: Option[Boolean] = None
scala> get[Int]("abc")
res3: Option[Int] = Some(2)
The problem, of course, is that you have to tell the compiler what type you expect to be stored on the map under that key. Unfortunately, there is simply no way around that: the compiler cannot know what type will be stored under that key at compile time.
Any solution you take you'll end up with this same problem: somehow or other, you'll have to tell the compiler what type should be returned.
Now, this shouldn't be a burden in a Scala program. Take that r above... you'll then use that r for something, right? That something you are using it for will have methods appropriate to some type, and since you know what the methods are, then you must also know what the type of r must be.
If this isn't the case, then there's something fundamentally wrong with the code -- or, perhaps, you haven't progressed from wanting the map to knowing what you'll do with it.
So you want to parse json and turn it into objects that resemble the javascript objets described in the json input? If you want static typing, case classes are pretty much your only option and there are already libraries handling this, for example lift-json.
Another option is to use Scala 2.9's experimental support for dynamic typing. That will give you elegant syntax at the expense of type safety.
You can use approach I've seen in the casbah library, when you explicitly pass a type parameter into the get method and cast the actual value inside the get method. Here is a quick example:
case class MultiTypeDictionary(m: Map[String, Any]) {
def getAs[T <: Any](k: String)(implicit mf: Manifest[T]): T =
cast(m.get(k).getOrElse {throw new IllegalArgumentException})(mf)
private def cast[T <: Any : Manifest](a: Any): T =
a.asInstanceOf[T]
}
implicit def map2multiTypeDictionary(m: Map[String, Any]) =
MultiTypeDictionary(m)
val dict: MultiTypeDictionary = Map("1" -> 1, "2" -> 2.0, "3" -> "3")
val a: Int = dict.getAs("1")
val b: Int = dict.getAs("2") //ClassCastException
val b: Int = dict.getAs("4") //IllegalArgumetExcepton
You should note that there is no real compile-time checks, so you have to deal with all exceptions drawbacks.
UPD Working MultiTypeDictionary class
If you have only a limited number of types which can occur as values, you can use some kind of union type (a.k.a. disjoint type), having e.g. a Map[Foo, Bar | Baz | Buz | Blargh]. If you have only two possibilities, you can use Either[A,B], giving you a Map[Foo, Either[Bar, Baz]]. For three types you might cheat and use Map[Foo, Either[Bar, Either[Baz,Buz]]], but this syntax obviously doesn't scale well. If you have more types you can use things like...
http://cleverlytitled.blogspot.com/2009/03/disjoint-bounded-views-redux.html
http://svn.assembla.com/svn/metascala/src/metascala/OneOfs.scala
http://www.chuusai.com/2011/06/09/scala-union-types-curry-howard/
Is this an opportunity to make things a bit more efficient (for the prorammer): I find it gets a bit tiresome having to wrap things in Some, e.g. Some(5). What about something like this:
implicit def T2OptionT( x : T) : Option[T] = if ( x == null ) None else Some(x)
You would lose some type safety and possibly cause confusion.
For example:
val iThinkThisIsAList = 2
for (i <- iThinkThisIsAList) yield { i + 1 }
I (for whatever reason) thought I had a list, and it didn't get caught by the compiler when I iterated over it because it was auto-converted to an Option[Int].
I should add that I think this is a great implicit to have explicitly imported, just probably not a global default.
Note that you could use the explicit implicit pattern which would avoid confusion and keep code terse at the same time.
What I mean by explicit implicit is rather than have a direct conversion from T to Option[T] you could have a conversion to a wrapper object which provides the means to do the conversion from T to Option[T].
class Optionable[T <: AnyRef](value: T) {
def toOption: Option[T] = if ( value == null ) None else Some(value)
}
implicit def anyRefToOptionable[T <: AnyRef](value: T) = new Optionable(value)
... I might find a better name for it than Optionable, but now you can write code like:
val x: String = "foo"
x.toOption // Some("foo")
val y: String = null
x.toOption // None
I believe that this way is fully transparent and aids in the understanding of the written code - eliminating all checks for null in a nice way.
Note the T <: AnyRef - you should only do this implicit conversion for types that allow null values, which by definition are reference types.
The general guidelines for implicit conversions are as follows:
When you need to add members to a type (a la "open classes"; aka the "pimp my library" pattern), convert to a new type which extends AnyRef and which only defines the members you need.
When you need to "correct" an inheritance hierarchy. Thus, you have some type A which should have subclassed B, but didn't for some reason. In that case, you can define an implicit conversion from A to B.
These are the only cases where it is appropriate to define an implicit conversion. Any other conversion runs into type safety and correctness issues in a hurry.
It really doesn't make any sense for T to extend Option[T], and obviously the purpose of the conversion is not simply the addition of members. Thus, such a conversion would be inadvisable.
It would seem that this could be confusing to other developers, as they read your code.
Generally, it seems, implicit works to help cast from one object to another, to cut out confusing casting code that can clutter code, but, if I have some variable and it somehow becomes a Some then that would seem to be bothersome.
You may want to put some code showing it being used, to see how confusing it would be.
You could also try to overload the method :
def having(key:String) = having(key, None)
def having(key:String, default:String) = having(key, Some(default))
def having(key: String, default: Option[String]=Option.empty) : Create = {
keys += ( (key, default) )
this
}
That looks good to me, except it may not work for a primitive T (which can't be null). I guess a non-specialized generic always gets boxed primitives, so probably it's fine.