New to Scala, have searched far and wide for clarification on some ScalaMock syntax. As per this guide, I keep seeing the following general testing pattern:
(myClass.myMethod _).expects()
What exactly is happening here? What function does the class/method/space/underscore serve? How does the compiler treat this?
The appended _ forces the conversion of a method into a function.
To understand why this is necessary, let's rebuild a tiny piece of ScalaMock, namely the expects method. The expects method is invoked on methods of mocked objects, but methods and functions do not have an expects method to begin with. Therefore, we have to use the "pimp my library" pattern (an implicit wrapper class) to attach the method expects to functions. We could do something like this:
implicit class ExpectsOp[A, B](f: A => B) {
def expects(a: A): Unit = println("it compiles, ship it...")
}
Now let's define a class Bar with method baz:
class Bar {
def baz(i: Int): Int = i * i
}
and also an instance of Bar:
val bar = new Bar
Let's see what happens if you try to invoke expects on bar.baz:
(bar.baz).expects(42)
error: missing argument list for method baz in class Bar
Unapplied methods are only converted to functions when a function type is expected. You can make this conversion explicit by writing baz _ or baz(_) instead of baz.
So, it doesn't work without explicit conversion into a function, and we have to enforce this conversion by appending an _:
(bar.baz _).expects(42) // prints: "it compiles, ship it..."
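The same conversion can be seen without any mocking library. Here is a minimal sketch that can be pasted into a REPL or script; the class Calc and the value names are made up for illustration and have nothing to do with ScalaMock. It shows that the trailing underscore turns the method into a function value, and that an expected function type triggers the same conversion without the underscore.

```scala
// Hypothetical class, used only for illustration.
class Calc {
  def square(i: Int): Int = i * i
}

val calc = new Calc

// Explicit eta-expansion: the method becomes a function value.
val f = calc.square _

// Implicit eta-expansion: a function type is expected here,
// so the underscore is not needed.
val g: Int => Int = calc.square

val viaF = f(4)
val viaG = g(5)
```

In the ScalaMock case there is no expected function type at the point where .expects is selected, which is why the underscore has to be written out.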
While reading Functional Programming in Scala by Chiusano and Bjarnason, I encountered the following code in chapter 9, Parser Combinators:
trait Parsers[ParseError, Parser[+_]] { self =>
...
def or[A](s1: Parser[A], s2: Parser[A]): Parser[A]
implicit def string(s: String): Parser[String]
implicit def operators[A](p: Parser[A]) = ParserOps[A](p)
implicit def asStringParser[A](a: A)(implicit f: A => Parser[String]):
ParserOps[String] = ParserOps(f(a))
case class ParserOps[A](p: Parser[A]) {
def |[B>:A](p2: Parser[B]): Parser[B] = self.or(p,p2)
def or[B>:A](p2: => Parser[B]): Parser[B] = self.or(p,p2)
}
}
I understand that if there is a type mismatch or a missing parameter during compilation, the Scala compiler looks for an implicit function that converts the non-matching type to the desired type, or for an implicit value in scope with the desired type that fits the missing parameter, respectively.
If a string occurs in a place that requires a Parser[String], the string function in the above trait should be invoked to convert the string to a Parser[String].
However, I've difficulties understanding the operators and asStringParser functions. These are the questions that I have:
For the implicit operators function, why isn't there a return type?
Why is ParserOps defined as a case class and why can't the | or or function be defined in the Parsers trait itself?
What exactly is the asStringParser trying to accomplish? What is its purpose here?
Why is self needed? The book says, "Use self to explicitly disambiguate reference to the or method on the trait," but what does it mean?
I'm truly enjoying the book but the use of advanced language-specific constructs in this chapter is hindering my progress. It would be of great help if you can explain to me how this code works. I understand that the goal is to make the library "nicer" to use through operators like | and or, but don't understand how this is done.
Every method has a return type. In this case, it's ParserOps[A]. You don't have to write it out explicitly, because in this case it can be inferred automatically.
Probably because of the automatically provided ParserOps.apply factory method in the companion object: you don't need to write val in the constructor, and you don't need the new keyword to instantiate ParserOps. It is not used in pattern matching, though, so an ordinary (non-case) class would work just as well.
It's the "pimp-my-library"-pattern. It attaches methods | and or to Parser, without forcing Parser to inherit from anything. In this way, you can later declare Parser to be something like ParserState => Result[A], but you will still have methods | and or available (even though Function1[ParserState, Result[A]] does not have them).
You could put | and or directly in Parsers, but then you would have to use the syntax
|(a, b)
or(a, b)
instead of the much nicer
a | b
a or b
There are no "real operators" in Scala, everything is a method. If you want to implement a method that behaves as if it were an infix operator, you do exactly what is done in the book.
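To make this concrete, here is a stripped-down sketch of the same pattern. It is not the book's code: the object ParserSketch, the toy Parser representation (a function from the whole input to an Option) and the helper literal are all assumptions made for illustration. It also shows why the wrapper has to qualify the call to the two-argument or: inside ParserOps, a bare or would refer to the wrapper's own one-argument method.

```scala
object ParserSketch {
  // Hypothetical, deliberately simplistic representation:
  // the whole input either matches (Some) or does not (None).
  type Parser[A] = String => Option[A]

  def literal(s: String): Parser[String] =
    input => if (input == s) Some(s) else None

  // The primitive combinator is an ordinary two-argument method.
  def or[A](p1: Parser[A], p2: => Parser[A]): Parser[A] =
    input => p1(input).orElse(p2(input))

  // The implicit wrapper attaches infix methods to any Parser[A],
  // even though Function1 itself has neither `|` nor `or`.
  implicit class ParserOps[A](p: Parser[A]) {
    // Qualified calls: an unqualified `or` here would mean ParserOps.or.
    def |[B >: A](p2: => Parser[B]): Parser[B] = ParserSketch.or[B](p, p2)
    def or[B >: A](p2: => Parser[B]): Parser[B] = ParserSketch.or[B](p, p2)
  }
}

import ParserSketch._

val yesOrNo = literal("yes") | literal("no")
val matched = yesOrNo("no")
val unmatched = yesOrNo("maybe")
val viaOr = (literal("a") or literal("b"))("b")
```

In the book the qualifier is self rather than an object name, because the combinators live in a trait; the idea is the same.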
I'm using Scala 2.12.1. In the Interpreter I make an Int val:
scala> val someInt = 3
someInt: Int = 3
Then I tried to use the eta expansion and get the following error:
scala> someInt.== _
<console>:13: error: ambiguous reference to overloaded definition,
both method == in class Int of type (x: Char)Boolean
and method == in class Int of type (x: Byte)Boolean
match expected type ?
someInt.== _
^
I see in the scaladoc that the Int class has more than 2 overloaded methods.
Question: is there a particular reason that the error message shows only 2 overloaded methods as opposed to listing all of them?
By the way, to specify which method you want to use the syntax is this:
scala> someInt.== _ : (Double => Boolean)
res9: Double => Boolean = $$Lambda$1103/1350894905@65e21ce3
The choice of the two methods that are listed seems to be more or less arbitrary. For example, this snippet:
class A
class B
class C
class Foo {
def foo(thisWontBeListed: C): Unit = {}
def foo(thisWillBeListedSecond: B): Unit = {}
def foo(thisWillBeListedFirst: A): Unit = {}
}
val x: Foo = new Foo
x.foo _
fails to compile with the error message:
error: ambiguous reference to overloaded definition,
both method foo in class Foo of type (thisWillBeListedFirst: this.A)Unit
and method foo in class Foo of type (thisWillBeListedSecond: this.B)Unit
match expected type ?
x.foo _
That is, it simply picks the last two methods that have been added to the body of the class and lists them in the error message. Maybe those methods are stored in a List in reverse order, and the first two items are picked to compose the error message.
Why does it do that? Because it has been programmed to do so; I'd take it as a given fact.
What was the main reason why it was programmed to do exactly this and not something else? That would be a primarily opinion-based question, and probably nobody except the authors of the Scala compiler themselves could give a definitive answer to that. I can think of at least three good reasons why only two conflicting methods are listed:
It's faster: why search for all conflicts, if it is already clear that a particular line of code does not compile? There is simply no reason to waste any time enumerating all possible ways how a particular line of code could be wrong.
The full list of conflicting methods is usually not needed anyway: the error messages are already quite long, and can at times be somewhat cryptic. Why aggravate it by printing an entire wall of error messages for a single line?
Implementation is easier: whenever you write a language interpreter of some sort, you quickly notice that returning all errors is somewhat more difficult than returning just the first error. Maybe in this particular case, it was decided not to bother collecting all possible conflicts.
PS: The order of the methods in the source code of Int seems to be different, but I don't know exactly what this "source" code has to do with the actual Int implementation. It seems to be a generated file without any implementations that exists just so that scaladoc has something to process; the real implementation is elsewhere.
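For completeness, the trick of supplying an expected type also works on a user-defined class. This is a small variation of the Foo example with only two overloads; the String return values and the names fooA and fooB are made up for illustration.

```scala
class A
class B

class Foo {
  def foo(a: A): String = "A overload"
  def foo(b: B): String = "B overload"
}

val x = new Foo

// A bare `x.foo _` would be ambiguous; an expected type picks one overload.
val fooA: A => String = x.foo _
val fooB = x.foo _ : (B => String)

val resultA = fooA(new A)
val resultB = fooB(new B)
```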
I understand the difference between zero-parameter and parameterless methods, but what I don't really understand is the language design choice that made parameterless methods necessary.
Disadvantages I can think of:
It's confusing. Every week or two there are questions here or on the Scala mailing list about it.
It's complicated; we also have to distinguish between () => X and => X.
It's ambiguous: does x.toFoo(y) mean what it says, or x.toFoo.apply(y)? (Answer: it depends on what overloads there are on x's toFoo method and on Foo's apply method, but if there's a clash you don't see an error until you try to call it.)
It messes up operator style method calling syntax: there is no symbol to use in place of the arguments, when chaining methods, or at the end to avoid semicolon interference. With zero-arg methods you can use the empty parameter list ().
Currently, you can't have both defined in a class: you get an error saying the method is already defined. They also both convert to a Function0.
Why not just make methods def foo and def foo() exactly the same thing, and allow them to be called with or without parentheses? What are the upsides of how it is?
Currying, That's Why
Daniel did a great job at explaining why parameterless methods are necessary. I'll explain why they are regarded distinctly from zero-parameter methods.
Many people view the distinction between parameterless and zero-parameter functions as some vague form of syntactic sugar. In truth it is purely an artifact of how Scala supports currying (for completeness, see below for a more thorough explanation of what currying is, and why we all like it so much).
Formally, a function may have zero or more parameter lists, with zero or more parameters each.
This means the following are valid: def a, def b(), but also the contrived def c()() and def d(x: Int)()()(y: Int) etc...
A function def foo = ??? has zero parameter lists. A function def bar() = ??? has precisely one parameter list, with zero parameters. Introducing additional rules that conflate the two forms would have undermined currying as a consistent language feature: def a would be equivalent in form to def b() and def c()() both; def d(x: Int)()()(y: Int) would be equivalent to def e()(x: Int)(y: Int)()().
One case where currying is irrelevant is when dealing with Java interop. Java does not support currying, so there's no problem with introducing syntactic sugar for zero-parameter methods like "test".length() (which directly invokes java.lang.String#length()) to also be invoked as "test".length.
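As a quick sanity check, the shapes mentioned above can be written down and called, each with exactly the parameter lists it declares. This is only a sketch; the bodies and the name results are placeholders.

```scala
def a: Int = 1                          // no parameter list
def b(): Int = 2                        // one empty parameter list
def c()(): Int = 3                      // two empty parameter lists
def d(x: Int)()()(y: Int): Int = x + y  // four lists, two of them empty

// Every call supplies one argument list per declared parameter list.
val results = List(a, b(), c()(), d(1)()()(2))
```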
A quick explanation of currying
Scala supports a language feature called 'currying', named after mathematician Haskell Curry.
Currying allows you to define functions with several parameter lists, e.g.:
def add(a: Int)(b: Int): Int = a + b
add(2)(3) // 5
This is useful, because you can now define inc in terms of a partial application of add:
def inc: Int => Int = add(1)
inc(2) // 3
Currying is most often seen as a way of introducing control structures via libraries, e.g.:
def repeat(n: Int)(thunk: => Any): Unit = (1 to n) foreach { _ => thunk }
repeat(2) {
println("Hello, world")
}
// Hello, world
// Hello, world
As a recap, see how repeat opens up another opportunity to use currying:
def twice: (=> Any) => Unit = repeat(2)
twice {
println("Hello, world")
}
// ... you get the picture :-)
One nice thing about an issue coming up periodically on the ML is that there are periodic answers.
Who can resist a thread called "What is wrong with us?"
https://groups.google.com/forum/#!topic/scala-debate/h2Rej7LlB2A
From: martin odersky
Date: Fri, Mar 2, 2012 at 12:13 PM
Subject: Re: [scala-debate] what is wrong with us...
What some people think is "wrong with us" is that we are trying bend
over backwards to make Java idioms work smoothly in Scala. The
principaled thing would have been to say def length() and def length
are different, and, sorry, String is a Java class so you have to write
s.length(), not s.length. We work really hard to paper over it by
admitting automatic conversions from s.length to s.length(). That's
problematic as it is. Generalizing that so that the two are identified
in the type system would be a sure way to doom. How then do you
disambiguate:
type Action = () => ()
def foo: Action
Is then foo of type Action or ()? What about foo()?
Martin
My favorite bit of paulp fiction from that thread:
On Fri, Mar 2, 2012 at 10:15 AM, Rex Kerr <ich...#gmail.com> wrote:
>This would leave you unable to distinguish between the two with
>structural types, but how often is the case when you desperately
>want to distinguish the two compared to the case where distinguishing
>between the two is a hassle?
/** Note to maintenance programmer: It is important that this method be
* callable by classes which have a 'def foo(): Int' but not by classes which
* merely have a 'def foo: Int'. The correctness of this application depends
* on maintaining this distinction.
*
* Additional note to maintenance programmer: I have moved to zambia.
* There is no forwarding address. You will never find me.
*/
def actOnFoo(...)
So the underlying motivation for the feature is to generate this sort of ML thread.
One more bit of googlology:
On Thu, Apr 1, 2010 at 8:04 PM, Rex Kerr <[hidden email]> wrote: On
Thu, Apr 1, 2010 at 1:00 PM, richard emberson <[hidden email]> wrote:
I assume "def getName: String" is the same as "def getName(): String"
No, actually, they are not. Even though they both call a method
without parameters, one is a "method with zero parameter lists" while
the other is a "method with one empty parameter list". If you want to
be even more perplexed, try def getName()(): String (and create a
class with that signature)!
Scala represents parameters as a list of lists, not just a list, and
List() != List(List())
It's kind of a quirky annoyance, especially since there are so few
distinctions between the two otherwise, and since both can be
automatically turned into the function signature () => String.
True. In fact, any conflation between parameterless methods and
methods with empty parameter lists is entirely due to Java interop.
They should be different but then dealing with Java methods would be
just too painful. Can you imagine having to write str.length() each
time you take the length of a string?
Cheers
First off, () => X and => X have absolutely nothing to do with parameterless methods.
Now, it looks pretty silly to write something like this:
var x() = 5
val y() = 2
x() = x() + y()
Now, if you don't follow what the above has to do with parameterless methods, then you should look up the uniform access principle. All of the above are method declarations, and all of them can be replaced by def. That is, assuming you remove their parentheses.
Besides the convention already mentioned (side-effecting versus side-effect-free), the distinction helps in several cases:
Usefulness of having empty-paren
// short apply syntax
object A {
def apply() = 33
}
object B {
def apply = 33
}
A() // works
B() // does not work
// using in place of a curried function
object C {
def m()() = ()
}
val f: () => () => Unit = C.m
Usefulness of having no-paren
// val <=> def, var <=> two related defs
trait T { def a: Int; def a_=(v: Int): Unit }
trait U { def a(): Int; def a_=(v: Int): Unit }
def tt(t: T): Unit = t.a += 1 // works
def tu(u: U): Unit = u.a += 1 // does not work
// avoiding clutter with apply the other way round
object D {
def a = Vector(1, 2, 3)
def b() = Vector(1, 2, 3)
}
D.a(0) // works
D.b(0) // does not work
// object can stand for no-paren method
trait E
trait F { def f: E }
trait G { def f(): E }
object H extends F {
object f extends E // works
}
object I extends G {
object f extends E // does not work
}
Thus in terms of regularity of the language, it makes sense to have the distinction (especially for the last shown case).
I would say both are possible because you can access mutable state with a parameterless method:
class X(private var x: Int) {
def inc(): Unit = { x += 1 }
def value = x
}
The method value does not have side effects (it only accesses mutable state). This behavior is explicitly mentioned in Programming in Scala:
Such parameterless methods are quite common in Scala. By contrast, methods defined with empty parentheses, such as def height(): Int, are called empty-paren methods. The recommended convention is to use a parameterless method whenever there are no parameters and the method accesses mutable state only by reading fields of the containing object (in particular, it does not change mutable state).
This convention supports the uniform access principle [...]
To summarize, it is encouraged style in Scala to define methods that take no parameters and have no side effects as parameterless methods, i.e., leaving off the empty parentheses. On the other hand, you should never define a method that has side-effects without parentheses, because then invocations of that method would look like a field selection.
What exactly is the difference between:
scala> def foo = 5
foo: Int
and
scala> def foo() = 5
foo: ()Int
Seems that in both cases, I end up with a variable foo which I can refer to without parenthesis, always evaluating to 5.
You're not defining a variable in either case. You're defining a method. The first method has no parameter lists, the second has one parameter list, which is empty. The first of these should be called like this
val x = foo
while the second should be called like this
val x = foo()
However, the Scala compiler will let you call methods with one empty parameter list without the parentheses, so either form of call will work for the second method (at least in Scala 2; Scala 3 is stricter about this for methods defined in Scala). Methods without parameter lists cannot be called with the parentheses.
The preferred Scala style is to define and call no-argument methods which have side effects with the parentheses. No-argument methods without side effects should be defined and called without the parentheses.
If you actually wish to define a variable, the syntax is
val foo = 5
Before anything else is said, def does not define a field, it defines a method.
In the second case, you can omit the parentheses because of a specific feature of Scala. There are two differences of interest here: one mechanical, and one of recommended usage.
Beginning with the latter, it is recommended usage to use an empty parameter list when there are side effects. One classic example is close(). You'd omit the parentheses if there are no side effects to calling the method.
Now, as a practical difference -- beyond possible weird syntactic mix-ups in corner cases (I'm not saying there are, just conjecturing) -- structural types must follow the correct convention.
For example, Source had a close method without parentheses, meaning a structural type of def close(): Unit would not accept Source. Likewise, if I define a structural method as def close: Unit, then Java closeable objects will not be accepted.
What does it mean when I use def to define a field in Scala
You can't define a field using def.
Seems that in both cases, I end up with a variable foo which I can refer to without parenthesis, always evaluating to 5.
No, in both cases you end up with a method foo, which you can call without parentheses.
To see that, you can use javap:
// Main.scala
object Main {
def foo1 = 5
def foo2() = 5
}
F:\MyProgramming\raw>scalac main.scala
F:\MyProgramming\raw>javap Main
Compiled from "main.scala"
public final class Main extends java.lang.Object{
public static final int foo2();
public static final int foo1();
}
However, see http://tommy.chheng.com/index.php/2010/03/when-to-call-methods-with-or-without-parentheses-in-scala/
Additionally to the answers already given I'd like to stress two points:
The possibility to define methods without a parameter list is a way to realize the Uniform Access Principle. It makes it possible to hide the difference between fields and methods, which makes later implementation changes easier.
You can call a method defined as def foo() = 5 using foo, but you can't call a method defined as def foo = 5 using foo()
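The first point can be sketched as follows. The trait Shape and its two implementations are hypothetical; the point is only that a parameterless def can be implemented either as a computed method or as a stored val, and the call site looks the same in both cases.

```scala
trait Shape {
  def area: Double  // parameterless: callers cannot tell a field from a method
}

// Implemented as a computed method ...
class Rect(w: Double, h: Double) extends Shape {
  def area: Double = w * h
}

// ... or as a stored value; the call sites do not change.
class UnitSquare extends Shape {
  val area: Double = 1.0
}

val shapes: List[Shape] = List(new Rect(2.0, 3.0), new UnitSquare)
val areas = shapes.map(_.area)
```

Had the trait declared def area(): Double instead, the val in UnitSquare would not be accepted as an implementation.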
I'm surprised that nobody mentioned anything about the laziness difference.
While a val is evaluated only once, at the time of definition, a def is evaluated only when we access it, and it is evaluated again every time we access it. See the example below:
scala> def foo = {
| println("hi")
| 5
| }
foo: Int
scala> val onlyOnce = foo
hi
onlyOnce: Int = 5
scala> def everyTime = foo
everyTime: Int
scala> onlyOnce
res0: Int = 5
scala> onlyOnce
res1: Int = 5
scala> everyTime
hi
res2: Int = 5
scala> everyTime
hi
res3: Int = 5
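The same behaviour can be checked without reading REPL output by counting evaluations. This is a sketch; the counter evaluations and the other names are made up for illustration.

```scala
var evaluations = 0

def compute: Int = {
  evaluations += 1  // side effect, so that evaluations can be counted
  5
}

val onlyOnce = compute   // the body runs here, once
def everyTime = compute  // the body runs on each access

val afterDefinitions = evaluations  // only the val has run the body so far
val sum1 = onlyOnce + onlyOnce      // reading a val does not re-run the body
val afterVal = evaluations
val sum2 = everyTime + everyTime    // each access runs the body again
val afterDef = evaluations
```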
Can anyone explain the compile error below? Interestingly, if I change the return type of the get() method to String, the code compiles just fine. Note that the thenReturn method has two overloads: a unary method and a varargs method that takes at least one argument. It seems to me that if the invocation is ambiguous here, then it would always be ambiguous.
More importantly, is there any way to resolve the ambiguity?
import org.scalatest.mock.MockitoSugar
import org.mockito.Mockito._
trait Thing {
def get(): java.lang.Object
}
new MockitoSugar {
val t = mock[Thing]
when(t.get()).thenReturn("a")
}
error: ambiguous reference to overloaded definition,
both method thenReturn in trait OngoingStubbing of type
(java.lang.Object,java.lang.Object*)org.mockito.stubbing.OngoingStubbing[java.lang.Object]
and method thenReturn in trait OngoingStubbing of type
(java.lang.Object)org.mockito.stubbing.OngoingStubbing[java.lang.Object]
match argument types (java.lang.String)
when(t.get()).thenReturn("a")
Well, it is ambiguous. I suppose Java semantics allow for it, and it might merit a ticket asking for Java semantics to be applied in Scala.
The source of the ambiguity is this: a vararg parameter may receive any number of arguments, including 0. So, when you write thenReturn("a"), do you mean to call the thenReturn which receives a single argument, or do you mean to call the thenReturn that receives one object plus a vararg, passing 0 arguments to the vararg?
Now, when this kind of thing happens, Scala tries to find which method is "more specific". Anyone interested in the details should look that up in Scala's specification, but here is the explanation of what happens in this particular case:
object t {
def f(x: AnyRef) = 1 // A
def f(x: AnyRef, xs: AnyRef*) = 2 // B
}
if you call f("foo"), both A and B are applicable. Which one is more specific?
it is possible to call B with parameters of type (AnyRef), so A is as specific as B.
it is possible to call A with parameters of type (AnyRef, Seq[AnyRef]) thanks to tuple conversion, Tuple2[AnyRef, Seq[AnyRef]] conforms to AnyRef. So B is as specific as A. Since both are as specific as the other, the reference to f is ambiguous.
As to the "tuple conversion" thing, it is one of the most obscure pieces of syntactic sugar in Scala. If you make a call f(a, b), where a and b have types A and B, and there is no f accepting (A, B) but there is an f which accepts a single Tuple2[A, B], then the arguments (a, b) will be converted into a tuple.
For example:
scala> def f(t: Tuple2[Int, Int]) = t._1 + t._2
f: (t: (Int, Int))Int
scala> f(1,2)
res0: Int = 3
Now, there is no tuple conversion going on when thenReturn("a") is called. That is not the problem. The problem is that, given that tuple conversion is possible, neither version of thenReturn is more specific, because any parameter passed to one could be passed to the other as well.
In the specific case of Mockito, it's possible to use the alternate API methods designed for use with void methods:
doReturn("a").when(t).get()
Clunky, but it'll have to do, as Martin et al don't seem likely to compromise Scala in order to support Java's varargs.
Well, I figured out how to resolve the ambiguity (seems kind of obvious in retrospect):
when(t.get()).thenReturn("a", Array[Object](): _*)
As Andreas noted, if the ambiguous method requires a null reference rather than an empty array, you can use something like
v.overloadedMethod(arg0, null.asInstanceOf[Array[Object]]: _*)
to resolve the ambiguity.
If you look at the standard library APIs you'll see this issue handled like this:
def meth(t1: Thing): OtherThing = { ... }
def meth(t1: Thing, t2: Thing, ts: Thing*): OtherThing = { ... }
By doing this, no call (with at least one Thing parameter) is ambiguous without extra fluff like Array[Thing](): _*.
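A runnable sketch of that overloading scheme, using a hypothetical Joiner object instead of Thing and OtherThing: because the varargs overload requires at least two explicit arguments, a one-argument call can only match the unary overload.

```scala
object Joiner {
  // Exactly one argument.
  def join(first: String): String = first

  // Two or more arguments: this overload needs at least two explicit
  // arguments, so it never competes with the unary overload.
  def join(first: String, second: String, rest: String*): String =
    (first +: second +: rest).mkString("+")
}

val one = Joiner.join("a")
val two = Joiner.join("a", "b")
val four = Joiner.join("a", "b", "c", "d")
```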
I had a similar problem using Oval (oval.sf.net) when trying to call its validate() method.
Oval defines 2 validate() methods:
public List<ConstraintViolation> validate(final Object validatedObject)
public List<ConstraintViolation> validate(final Object validatedObject, final String... profiles)
Trying this from Scala:
validator.validate(value)
produces the following compiler-error:
both method validate in class Validator of type (x$1: Any,x$2: <repeated...>[java.lang.String])java.util.List[net.sf.oval.ConstraintViolation]
and method validate in class Validator of type (x$1: Any)java.util.List[net.sf.oval.ConstraintViolation]
match argument types (T)
var violations = validator.validate(entity);
Oval needs the varargs parameter to be null, not an empty array, so I finally got it to work with this:
validator.validate(value, null.asInstanceOf[Array[String]]: _*)