I have the following method:
def lift[P <: Product, L <: HList](params: P)(implicit hl: Generic.Aux[P, L]) = {
directive[L](_(hl to params))
}
and it works perfectly if I pass two or more arguments:
val result = lift("string", 'a', 10) // compiles
val result2 = lift(true, 5) // compiles
But when I'm passing a single argument, the implicit can't be resolved:
val fails = lift("string")
It can't find a Generic implicit for [String, Nothing]. Why does it work in the other cases?
You're seeing the result of auto-tupling, which is a Scala (mis-)feature that causes lift(true, 5) to be parsed as lift((true, 5)) when there's no lift method taking the appropriate number of arguments (two, in this case). The compiler won't automatically wrap a single value in a Tuple1, however; you just get a compiler error.
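To see auto-tupling in isolation, here is a minimal sketch (show is an illustrative name; recent Scala 2 compilers warn about this adaptation, and Scala 3 restricts it):

def show(p: Any): Unit = println(p)

show((1, "a")) // explicit tuple: prints (1,a)
show(1, "a")   // no two-argument show exists, so this is auto-tupled to show((1, "a"))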
See for example this answer for more details about auto-tupling, and this thread for some reasons auto-tupling is a terrible thing to include in your language.
There are a couple of possible workarounds. The first would be to create an implicit conversion from values to Tuple1, as suggested in this answer. I wouldn't personally recommend this approach—every implicit conversion you introduce into your code is another mine in the minefield.
Instead I'd suggest avoiding auto-tupling altogether. Write out lift((true, 5)) explicitly; you get a lot of extra clarity at the cost of only a couple of extra characters. Unfortunately there's no comparable literal support for Tuple1, so you have to write out lift(Tuple1("string")), but even that's not too bad, and if you really wanted you could define a new liftOne method that would do it for you.
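For example, here is a minimal sketch of such a liftOne, delegating to the lift above (the name and signature are illustrative):

import shapeless.{Generic, HList}

// Wrap the single value in a Tuple1 and let lift do the rest.
def liftOne[A, L <: HList](value: A)(implicit hl: Generic.Aux[Tuple1[A], L]) =
  lift(Tuple1(value))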
I am a Scala noob reading through a parsing library, and have reached some syntax I do not understand:
def parseA[_: P] = P("a")
val Parsed.Success(value, successIndex) = parse("a", parseA(_))
I want to be able to combine these lines into one, i.e.
val Parsed.Success(value, successIndex) = parse("a", P("a"))
but this gives a compile error:
Error:(8, 61) overloaded method value P with alternatives:
[T](t: fastparse.P[T])(implicit name: sourcecode.Name, implicit ctx: fastparse.P[_])fastparse.P[T] <and>
=> fastparse.ParsingRun.type
cannot be applied to (String)
Error occurred in an application involving default arguments.
val Parsed.Success(value, successIndex) = parse(source, P("a"))
How should this line be written? And can you name the syntax concepts involved to maximise my learning?
_: P is the same as (implicit ctx: P[_]); it means that the method is asking for an implicit parameter of type P[_] (the underscore means it does not care about the inner type; see What are all the uses of an underscore in Scala?).
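Roughly, the context bound desugars into an explicit implicit parameter; the two declarations below are interchangeable for practical purposes (parseA2 is an illustrative name):

def parseA[_: P]: P[Unit] = P("a")                // context-bound shorthand
def parseA2(implicit ctx: P[_]): P[Unit] = P("a") // explicit implicit parameter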
P("a") is calling this method, which requires such implicit in scope, and that is why in your second example it fails to compile, because it did not find the implicit parameter.
The features used here are implicits, existential types, and macros.
All of them are very advanced techniques. If you are just starting, I would suggest leaving them for later.
Implicits are very important and useful, I would start from there, but first make sure you feel comfortable with "normal" Scala (whatever that means).
For the second question, I think this should work.
def program[_: P] = parse("a", P("a"))
val Parsed.Success(value, successIndex) = program
HMap seems to be the perfect data structure for my use case, however, I can't get it working:
case class Node[N](node: N)
class ImplVal[K, V]
implicit val iv1 = new ImplVal[Int, Node[Int]]
implicit val iv2 = new ImplVal[Int, Node[String]]
implicit val iv3 = new ImplVal[String, Node[Int]]
val hm = HMap[ImplVal](1 -> Node(1), 2 -> Node("two"), "three" -> Node(3))
My first question is whether it is possible to create those implicit vals automatically. For typical combinations I could, of course, create them manually, but I'm wondering if there is a more generic, less boilerplate-heavy way.
Next question is, how to get values out of the map:
val res1 = hm.get(1) // (1) ambiguous implicit values: both value iv2 [...] and value iv1 [...] match expected type ImplVal[Int,V]
To me, Node[Int] (iv1) and Node[String] (iv2) look pretty different :) I thought, despite the JVM type erasure limitations, Scala could differentiate here. What am I missing? Do I have to use other implicit values to make the difference clear?
The explicit version works:
val res2 = hm.get[Int, Node[Int]](1) // (2) works
Of course, in this simple case, I could add the type information to the get call. But in the following case, where only the keys are known in advance, I don't know how to do it:
def get[T <: HList](keys: T): HList = // return associated values for keys
Is there any simple solution to this problem?
BTW, what documentation about Scala's type system (or Shapeless, or functional programming in general) would you recommend to understand the whole topic better? I have to admit I'm lacking some background in this area.
The type of the key determines the type of the value. You have Int keys corresponding to both Node[Int] and Node[String] values, hence the ambiguity. You might find this article helpful in explaining the general mechanism underlying this.
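For illustration, here is a sketch in which each key type determines exactly one value type, so lookups resolve unambiguously (KV is an illustrative name; Node is the case class from the question):

import shapeless.HMap

case class Node[N](node: N)
class KV[K, V]

// One relation per key type: Int keys map only to Node[Int],
// String keys only to Node[String].
implicit val intKey: KV[Int, Node[Int]] = new KV[Int, Node[Int]]
implicit val stringKey: KV[String, Node[String]] = new KV[String, Node[String]]

val hm = HMap[KV](1 -> Node(1), "two" -> Node("two"))
val a: Option[Node[Int]] = hm.get(1)        // unambiguous
val b: Option[Node[String]] = hm.get("two") // unambiguous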
I've looked around and found several other examples of this, but I don't really understand from those answers what's actually going on.
I'd like to understand why the following code fails to compile:
val df = readFiles(sqlContext).
withColumn("timestamp", udf(UDFs.parseDate _)($"timestamp"))
Giving the error:
Error:(29, 58) not enough arguments for method udf: (implicit evidence$2: reflect.runtime.universe.TypeTag[java.sql.Date], implicit evidence$3: reflect.runtime.universe.TypeTag[String])org.apache.spark.sql.UserDefinedFunction.
Unspecified value parameter evidence$3.
withColumn("timestamp", udf(UDFs.parseDate _)($"timestamp")).
^
Whereas this code does compile:
val parseDate = udf(UDFs.parseDate _)
val df = readFiles(sqlContext).
withColumn("timestamp", parseDate($"timestamp"))
Obviously I've found a "workaround" but I'd really like to understand:
What this error really means. The info I have found on TypeTags and ClassTags has been really difficult to understand. I don't come from a Java background, which perhaps doesn't help, but I think I should be able to grasp it…
If I can achieve what I want without a separate function definition
The error message is indeed a bit misleading; the reason for it is that udf takes an implicit parameter list, but you are passing it an explicit parameter. Since I don't know much about Spark, and since the udf signature is a bit convoluted, I'll try to explain what is going on with a simplified example.
In practice udf is a function that, given some explicit parameters and an implicit parameter list, gives you another function. Let's define a function that, given a pivot of type T for which we have an implicit Ordering, gives us a function that splits a sequence in two: one part containing the elements smaller than the pivot, the other containing the bigger ones:
def buildFn[T](pivot: T)(implicit ev: Ordering[T]): Seq[T] => (Seq[T], Seq[T]) = ???
Let's leave out the implementation as it's not important. Now, if I do the following:
val elements: Seq[Int] = ???
val (small, big) = buildFn(10)(elements)
I will make the same kind of mistake that you are showing in your code, i.e. the compiler will think that I am explicitly passing elements as the implicit parameter list, and this won't compile. The error message in my example will be somewhat different from yours because in my case the number of parameters I am mistakenly passing for the implicit parameter list matches the expected one, so the error will be about types not lining up.
Instead, if I write it as:
val elements: Seq[Int] = ???
val fn = buildFn(10)
val (small, big) = fn(elements)
In this case the compiler will correctly pass the implicit parameters to the function. I don't know of any way to circumvent this problem, unless you want to pass the actual implicit parameters explicitly, but I find that quite ugly and not always practical; for reference, this is what I mean:
val elements: Seq[Int] = ???
val (small, big) = buildFn(10)(implicitly[Ordering[Int]])(elements)
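For what it's worth, selecting apply explicitly also avoids the separate val, since the selection forces the compiler to complete buildFn(10) before applying the result; a sketch on the same toy example (untested against the actual Spark API):

// .apply makes the compiler resolve the implicit list of buildFn(10)
// first, then apply the resulting function to elements.
val (small, big) = buildFn(10).apply(elements)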
In Scala, you can declare a variable by specifying the type, like this: (method 1)
var x : String = "Hello World"
or you can let Scala automatically detect the variable type (method 2)
var x = "Hello World"
Why would you use method 1? Does it have a performance benefit?
And once the variable has been declared, will it behave exactly the same in all situations whether it has been declared by method 1 or method 2?
Type inference is done at compile time - it's essentially the compiler figuring out what you mean, filling in the blanks, and then compiling the resulting code.
What this means is that there can be no runtime cost to type inference. The compile time cost, however, can sometimes be prohibitive and require you to explicitly annotate some of your expressions.
You will not see any performance difference between these two variants.
They will both be compiled to the same code.
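You can check this yourself by disassembling the compiled class; a sketch (Demo is an illustrative name):

// Demo.scala
object Demo {
  var x1: String = "Hello World" // method 1: explicit annotation
  var x2 = "Hello World"         // method 2: inferred
}
// Compile and compare:
//   scalac Demo.scala && javap -p Demo$
// Both fields come out as java.lang.String, with identical accessors.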
The other answers assume that the compiler inferred what you think it inferred.
It is easy to demonstrate that specifying the type in a definition will set the expected type for the RHS of the definition and guide type inference.
For example, in this method that builds a collection of something, A is inferred to be Nothing, which may not be what you wanted:
scala> def build[A, B, C <: Iterable[B]](bs: B*)(implicit cbf: CanBuildFrom[A, B, C]): C = {
| val b = cbf(); println(b.getClass); b ++= bs; b.result }
build: [A, B, C <: Iterable[B]](bs: B*)(implicit cbf: scala.collection.generic.CanBuildFrom[A,B,C])C
scala> val xs = build(1,2,3)
class scala.collection.immutable.VectorBuilder
xs: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 2, 3)
scala> val xs: List[Int] = build(1,2,3)
class scala.collection.mutable.ListBuffer
xs: List[Int] = List(1, 2, 3)
scala> val xs: Seq[Int] = build(1,2,3)
class scala.collection.immutable.VectorBuilder
xs: Seq[Int] = Vector(1, 2, 3)
Obviously, it matters for runtime performance whether you get a List or a Vector.
This is a lame example, but in many expressions you wouldn't notice the type of an intermediate collection unless it caused a performance problem.
Sample conversations:
https://groups.google.com/forum/#!msg/scala-language/mQ-bIXbC1zs/wgSD4Up5gYMJ
http://grokbase.com/p/gg/scala-user/137mgpjg98/another-funny-quirk
Why is Seq.newBuilder returning a ListBuffer?
https://groups.google.com/forum/#!topic/scala-user/1SjYq_qFuKk
In the simple example you gave, there is no difference in the generated byte code, and therefore no difference in performance. It would also make no noticeable difference in compilation speed.
In more complex code (likely involving implicits) you could run into cases where compile-time performance would be noticeably improved by specifying some types. However, I would completely ignore this until and unless you run into it -- specify types or not for other, better reasons.
More in line with your question, there is one very important case where it is a good idea to specify the type to ensure good run-time performance. Consider this code:
val x = new AnyRef { def sayHi() = println("Howdy!") }
x.sayHi
That code uses reflection to call sayHi, and that's a huge performance hit. Recent versions of Scala will warn you about this code for that reason, unless you have enabled the language feature for it:
warning: reflective access of structural type member method sayHi should be enabled
by making the implicit value scala.language.reflectiveCalls visible.
This can be achieved by adding the import clause 'import scala.language.reflectiveCalls'
or by setting the compiler option -language:reflectiveCalls.
See the Scala docs for value scala.language.reflectiveCalls for a discussion
why the feature should be explicitly enabled.
You might then change the code to this, which does not make use of reflection:
trait Talkative extends AnyRef { def sayHi(): Unit }
val x = new Talkative { def sayHi() = println("Howdy!") }
x.sayHi
For this reason you generally want to specify the type of the variable when you are defining classes this way; that way if you inadvertently add a method that would require reflection to call, you'll get a compilation error -- the method won't be defined for the variable's type. So while it is not the case that specifying the type makes the code run faster, it is the case that if the code would be slow, specifying the type makes it fail to compile.
val x: AnyRef = new AnyRef { def sayHi() = println("Howdy!") }
x.sayHi // ERROR: sayHi is not defined on AnyRef
There are of course other reasons why you might want to specify a type. They are required for the formal parameters of methods/functions, and for the return types of methods that are recursive or overloaded.
Also, you should always specify return types for methods in a public API (unless they are just trivially obvious), or you might end up with different method signatures than you intended, and then risk breaking existing clients of your API when you fix the signature.
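For example, a contrived sketch of how an inferred return type can leak into your API:

// The inferred return type here is List[Int]; clients may come to rely on it.
def ids = List(1, 2, 3)

// A later, seemingly equivalent change silently alters the signature:
// def ids = Vector(1, 2, 3) // now returns Vector[Int], breaking those clients

// Annotating the return type pins down the contract:
def idsStable: Seq[Int] = List(1, 2, 3)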
You may of course want to deliberately widen a type so that you can assign other types of things to a variable later, e.g.
var shape: Shape = new Circle(1.0)
shape = new Square(1.0)
But in these cases there is no performance impact.
It is also possible that specifying a type will cause a conversion, and of course that will have whatever performance impact the conversion imposes.
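For instance, a sketch where the ascribed type triggers an implicit conversion (Meters and the conversion are illustrative):

import scala.language.implicitConversions

case class Meters(value: Double)
implicit def doubleToMeters(d: Double): Meters = Meters(d)

val m: Meters = 2.0 // the expected type triggers the conversion,
                    // so this line pays for an extra Meters allocation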
Can anyone explain the compile error below? Interestingly, if I change the return type of the get() method to String, the code compiles just fine. Note that the thenReturn method has two overloads: a unary method and a varargs method that takes at least one argument. It seems to me that if the invocation is ambiguous here, then it would always be ambiguous.
More importantly, is there any way to resolve the ambiguity?
import org.scalatest.mock.MockitoSugar
import org.mockito.Mockito._
trait Thing {
def get(): java.lang.Object
}
new MockitoSugar {
val t = mock[Thing]
when(t.get()).thenReturn("a")
}
error: ambiguous reference to overloaded definition,
both method thenReturn in trait OngoingStubbing of type
(java.lang.Object,java.lang.Object*)org.mockito.stubbing.OngoingStubbing[java.lang.Object]
and method thenReturn in trait OngoingStubbing of type
(java.lang.Object)org.mockito.stubbing.OngoingStubbing[java.lang.Object]
match argument types (java.lang.String)
when(t.get()).thenReturn("a")
Well, it is ambiguous. I suppose Java semantics allow for it, and it might merit a ticket asking for Java semantics to be applied in Scala.
The source of the ambiguity is this: a vararg parameter may receive any number of arguments, including 0. So, when you write thenReturn("a"), do you mean to call the thenReturn which receives a single argument, or the thenReturn that receives one object plus a vararg, passing 0 arguments to the vararg?
Now, when this kind of thing happens, Scala tries to find which method is "more specific". Anyone interested in the details should look that up in Scala's specification, but here is the explanation of what happens in this particular case:
object t {
def f(x: AnyRef) = 1 // A
def f(x: AnyRef, xs: AnyRef*) = 2 // B
}
if you call f("foo"), both A and B
are applicable. Which one is more
specific?
it is possible to call B with parameters of type (AnyRef), so A is
as specific as B.
it is possible to call A with parameters of type (AnyRef,
Seq[AnyRef]) thanks to tuple
conversion, Tuple2[AnyRef,
Seq[AnyRef]] conforms to AnyRef. So
B is as specific as A. Since both are
as specific as the other, the
reference to f is ambiguous.
As to the "tuple conversion" thing, it is one of the most obscure syntactic sugars of Scala. If you make a call f(a, b), where a and b have types A and B, and there is no f accepting (A, B) but there is an f which accepts (Tuple2(A, B)), then the parameters (a, b) will be converted into a tuple.
For example:
scala> def f(t: Tuple2[Int, Int]) = t._1 + t._2
f: (t: (Int, Int))Int
scala> f(1,2)
res0: Int = 3
Now, there is no tuple conversion going on when thenReturn("a") is called. That is not the problem. The problem is that, given that tuple conversion is possible, neither version of thenReturn is more specific, because any parameter passed to one could be passed to the other as well.
In the specific case of Mockito, it's possible to use the alternate API methods designed for use with void methods:
doReturn("a").when(t).get()
Clunky, but it'll have to do, as Martin et al don't seem likely to compromise Scala in order to support Java's varargs.
Well, I figured out how to resolve the ambiguity (seems kind of obvious in retrospect):
when(t.get()).thenReturn("a", Array[Object](): _*)
As Andreas noted, if the ambiguous method requires a null reference rather than an empty array, you can use something like
v.overloadedMethod(arg0, null.asInstanceOf[Array[Object]]: _*)
to resolve the ambiguity.
If you look at the standard library APIs you'll see this issue handled like this:
def meth(t1: Thing): OtherThing = { ... }
def meth(t1: Thing, t2: Thing, ts: Thing*): OtherThing = { ... }
By doing this, no call (with at least one Thing parameter) is ambiguous without extra fluff like Array[Thing](): _*.
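A sketch of why this pattern resolves cleanly (Api and meth are illustrative names, specialized here to String):

object Api {
  def meth(t1: String): Int = 1
  def meth(t1: String, t2: String, ts: String*): Int = 2
}

Api.meth("a")           // unambiguous: only the first overload takes one argument
Api.meth("a", "b")      // second overload, with an empty varargs tail
Api.meth("a", "b", "c") // second overload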
I had a similar problem using Oval (oval.sf.net) trying to call its validate() method.
Oval defines two validate() methods:
public List<ConstraintViolation> validate(final Object validatedObject)
public List<ConstraintViolation> validate(final Object validatedObject, final String... profiles)
Trying this from Scala:
validator.validate(value)
produces the following compiler error:
both method validate in class Validator of type (x$1: Any,x$2: <repeated...>[java.lang.String])java.util.List[net.sf.oval.ConstraintViolation]
and method validate in class Validator of type (x$1: Any)java.util.List[net.sf.oval.ConstraintViolation]
match argument types (T)
var violations = validator.validate(entity);
Oval needs the varargs parameter to be null, not an empty array, so I finally got it to work with this:
validator.validate(value, null.asInstanceOf[Array[String]]: _*)