Scala methods with no arguments evaluating on definition in Zeppelin

I'm writing a method in zeppelin that will update several DataFrames, to be called as part of initializing my code.
The pattern we're following is to define all initialization methods in their own paragraphs, and then call them as part of a block.
def init(nc: NotebookContext) = {
method1()
method2()
}
However, for most definition signatures of methods without parameters, it appears that zeppelin is actually calling and evaluating the last method in a paragraph. This is a problem, because when the method is called later, it means the transformations have been applied to the DataFrame twice, which is not desired.
Is this a function of scala, or a quirk of zeppelin, or both? Why do some of these declarations evaluate immediately, while others wait to be called?
Assume the below methods are each defined in their own zeppelin paragraph
def runsAutomatically(): Unit = { println("test") }
//runsAutomatically: ()Unit
//test
def runsAutomatically2 = { println("test2") }
//runsAutomatically2: Unit
//test2
def waitsForDefinition= () => { println("test") }
//waitsForDefinition: () => Unit
I understand that in Scala there is a difference between methods with no parameter list and methods with a single empty parameter list, but I don't know why these different versions would change when things get executed.
Finally if done in a single paragraph:
def runsAutomatically(): Unit = { println("test") }
def runsAutomatically2 = { println("test2") }
//runsAutomatically: ()Unit
//runsAutomatically2: Unit
//test2
Is this just a quirk of Zeppelin, or something about Scala I'm missing?

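In Scala, a def without a parameter list is not a val: its body runs every time the name is referenced, not when it is defined. What the missing parameter list changes is that merely mentioning the name is already a call. In Scala 2 the same holds for a method with an empty parameter list, which can be called without the parentheses.
So anything that evaluates the bare name runsAutomatically or runsAutomatically2 as an expression will run the body, which may be what the notebook does with the last definition in a paragraph. waitsForDefinition only returns a function value, which does nothing until it is applied with ().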

You are right, none of these methods should get evaluated automatically. At least, not in pure Scala.
def runsAutomatically(): Unit = { println("test") }
def runsAutomatically2 = { println("test2") }
def waitsForDefinition= () => { println("test") }
You should blame Zeppelin for that, then. What version of Zeppelin are you using? I don't see this problem in Zeppelin 0.9.0, so maybe an upgrade would be an option for you?
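To see what plain Scala does with the three declaration styles, here is a minimal sketch. The object EvalDemo and its counter are hypothetical names introduced only for illustration; the counter records how often each body actually runs.

```scala
object EvalDemo {
  var counter = 0

  def emptyParens(): Unit = { counter += 1 }               // empty parameter list
  def noParens: Unit = { counter += 1 }                    // no parameter list
  def functionValue: () => Unit = () => { counter += 1 }   // returns a function value

  def run(): List[Int] = {
    counter = 0
    val afterDefinition = counter   // defining the methods ran nothing
    noParens                        // referencing a parameterless def calls it
    val afterNoParens = counter
    val f = functionValue           // only builds the function value
    val afterReference = counter
    f()                             // applying the function value runs its body
    val afterApply = counter
    emptyParens()                   // explicit call
    val afterCall = counter
    List(afterDefinition, afterNoParens, afterReference, afterApply, afterCall)
  }
}
```

If the counter moves only on the lines marked as calls, then any extra evaluation seen in a notebook comes from the notebook's interpreter rather than from the language.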

Related

Scala: wrapper function that returns a generic type

I am trying to create a generic wrapper function that can be wrapped around any method that returns an object. Very similar to the answer in this SO question. I tried the following:
def wrapper_function[T](f: => T): T = {
println("Executing now");
val ret: T = f;
println("Execution finished");
ret
}
def multiply2( x: Int ): Int = wrapper_function {
println("inside multiply2");
return x*2
}
However, I am observing that nothing is getting executed after the function call inside the wrapper function. Specifically, "Execution finished" is not getting printed.
scala> val x = multiply2(4)
Executing now
inside multiply2
x: Int = 8
I am using scala 2.11.8
Am I doing something wrong here? This is puzzling and I would appreciate some help.
I believe your problem is the "return" statement.
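return in Scala doesn't work the same as in Java. You can take a look at this answer, but basically a return inside a block that is passed as a function or by-name argument is a non-local return: it returns from the enclosing method (multiply2 here), and Scala 2 implements it by throwing a control exception that unwinds the stack through wrapper_function.
Consider that when you declare f: => T, the block is passed unevaluated and runs at the point where f is referenced. The block contains a return, so evaluating f leaves wrapper_function immediately and the value is returned from multiply2; the rest of the wrapper never runs. Without the return, the block's result would simply be assigned to ret.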
In general, if you are using return in scala at the end of a function or block, you are almost always doing something wrong...
Assaf Mendelson's answer is correct for most situations. However, it does not help in scenarios where you don't own the code of the inner function that you are wrapping, or when there is a legitimate case for using return in the inner function (see here).
For those cases, it will work by executing the inner function in a try-finally block:
def wrapper_function[T](f: => T): T = {
println("Executing now");
val ret: T = try f finally {
println("Execution finished");
}
ret
}
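The effect of try/finally on a block that uses return can be checked with a small sketch. NonLocalReturnDemo, wrapped, timesTwo and the log buffer are hypothetical names for illustration; the log stands in for the println calls so the order of events can be inspected. This assumes Scala 2 semantics, where the return is a non-local return (newer Scala 3 versions deprecate the construct).

```scala
import scala.collection.mutable.ListBuffer

object NonLocalReturnDemo {
  val log = ListBuffer.empty[String]

  // Wrapper that records an event before and after running the by-name block
  def wrapped[T](f: => T): T = {
    log += "start"
    try f finally { log += "end" }   // finally runs even when f exits via return
  }

  // The return inside the block leaves timesTwo itself, not just the block
  def timesTwo(x: Int): Int = wrapped {
    log += "body"
    return x * 2
  }
}
```

Calling NonLocalReturnDemo.timesTwo(4) should leave all three events in the log, in order, because the finally clause runs while the stack unwinds through the wrapper.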

Scala procedure and function differences

I am learning Scala and running the code below. I know that functions which do not return anything are called procedures in Scala, but why does an extra () appear in the output? In the procedure I am just printing the value of 'value'.
Can someone explain this?
class Sample{
private var value = 1
def test() {value += 2; println(value)}
def test2() = value
}
object Main2 extends App {
val my_counter = new Sample()
println(my_counter.test())
println(my_counter.test2())
}
3
()
3
The so-called "procedure syntax" is just "syntactic sugar" for a method that returns Unit (what you would call void in Java).
def sayHello(toWhom: String) {
println(s"hello $toWhom")
}
Is semantically equivalent (and gets actually translated) to:
def sayHello(toWhom: String): Unit = {
println(s"hello $toWhom")
}
Notice the explicit type and the equal sign right after the method signature.
The type Unit has a single value, which is written () (and read unit, just like its type). That's what you see: the method test prints value and then produces () of type Unit, which you then go on to print to the screen yourself.
As noted in a comment, the "procedure syntax" is deprecated and has been dropped in Scala 3.
Procedure syntax compiles to a method that returns Unit.
Calling toString on the Unit value produces "()".
You are printing out the result of test (which is Unit), so you see its string representation, (), in the output.
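As a small illustration of the point about Unit, the sketch below rewrites the question's two methods with explicit result types. UnitDemo and printedForm are hypothetical names added for the example.

```scala
object UnitDemo {
  private var value = 1

  // Explicit form of the procedure syntax: the result type is Unit
  def test(): Unit = { value += 2 }

  // With "=" and a non-Unit result, the value itself is returned
  def test2(): Int = value

  // The string form of the Unit value returned by test(); this is what println shows
  def printedForm: String = test().toString
}
```

Here printedForm holds the text that println(my_counter.test()) would print after the method's own output.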

Mocking a function with pass-by-name arguments

My actual use-case is unit testing code involving finagle FuturePool: I want to make sure, FuturePool.apply was actually invoked, so that the task was executed in the correct instance of the pool.
The problem I am running into however seems more generic, so I will illustrate it on an abstract example, not related to finagle or futures.
Suppose, I have these two classes:
class Foo {
def apply(f: => String) = f
}
class Bar(val foo: Foo) {
def doit(f: => String) = foo(f)
}
Bar has an instance of Foo, that knows how to run functions, and I want to test that it is actually using it for execution:
describe("Bar") {
it("should use the right foo") {
val foo = mock[Foo]
when(foo.apply(any)).thenAnswer( new Answer[String] {
def answer(invocation: InvocationOnMock): String =
invocation.getArgumentAt(0, classOf[Function0[String]]).apply()
})
new Bar(foo).doit("foo") should equal("foo")
}
}
This does not work: .doit returns null, apparently because Mockito does not realize the call was stubbed. It seems that any is not matching Function0 in this case (replacing it with any[Function0[String]] does not help either).
I also tried it another way:
it("should Foo!") {
val foo = Mockito.spy(new Foo)
new Bar(foo).doit("foo") should equal("foo")
verify(foo).apply(any)
}
This also does not work, and kinda confirms my suspicion about any not working in this case:
Argument(s) are different! Wanted:
foo$1.apply(
($anonfun$apply$mcV$sp$7) <function0>
);
Actual invocation has different arguments:
foo$1.apply(
($anonfun$apply$mcV$sp$6) <function0>
);
Any ideas about a good way to get around this?
This signature:
def apply(f: => String)
is known as "call by name": the argument expression is passed unevaluated and is evaluated each time f is used inside the method. Scala compiles it to a Function0, which is specific to Scala and not well supported by Mockito.
There is a host of workarounds to this:
Is there a way to match on a call-by-name argument of a Mockito mock object in Specs?
How to mock a method with functional arguments in Scala?
How do you mock scala call-by name in Mockito
The one by Eric looks the simplest and what you may want.
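If a mocking library is not strictly required, one alternative that sidesteps the matcher problem is a hand-written test double. This is a different technique from the Mockito workarounds linked above, not a reproduction of them. Foo and Bar are taken from the question (with result types spelled out); RecordingFoo and its calls counter are hypothetical names for illustration.

```scala
class Foo {
  def apply(f: => String): String = f
}

class Bar(val foo: Foo) {
  def doit(f: => String): String = foo(f)
}

// Hypothetical test double: counts invocations, then delegates to the real Foo
class RecordingFoo extends Foo {
  var calls = 0
  override def apply(f: => String): String = {
    calls += 1
    super.apply(f)
  }
}
```

A test can then construct new Bar(new RecordingFoo), call doit, and check both the result and that calls is 1, without any argument matching on the by-name parameter.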

What's the difference between using and not using a "=" in Scala defs?

What is the difference between the two defs below?
def someFun(x:String) { x.length }
AND
def someFun(x:String) = { x.length }
As others already pointed out, the former is a syntactic shortcut for
def someFun(x:String): Unit = { x.length }
Meaning that the value of x.length is discarded and the function returns Unit (or () if you prefer) instead.
I'd like to stress that this has been deprecated since Oct 29, 2013 (https://github.com/scala/scala/pull/3076/), but the warning only shows up if you compile with the -Xfuture flag.
scala -Xfuture -deprecation
scala> def foo {}
<console>:1: warning: Procedure syntax is deprecated. Convert procedure `foo` to method by adding `: Unit =`.
def foo {}
foo: Unit
So you should never use the so-called procedure syntax.
Martin Odersky himself pointed this out in his Scala Days 2013 keynote, and it has been discussed on the Scala mailing list.
The syntax is very inconsistent, and it's very common for a beginner to hit this issue when learning the language. For these reasons it's very likely that it will be removed from the language at some point.
Without the equals it is implicitly typed to return Unit (or "void"): the result of the body is fixed - not inferred - and any would-be return value is discarded.
That is, def someFun(x:String) { x.length } is equivalent to def someFun(x:String): Unit = { x.length }, neither of which are very useful here because the function causes no side-effects and returns no value.
The "equals form" without the explicit Unit (or other type) has the return type inferred; in this case that would be def someFun(x:String): Int = { x.length } which is more useful, albeit not very exciting.
I prefer to specify the return type for all exposed members, which helps to ensure API/contract stability and arguably adds clarity. For "void" methods this is trivially done by using the procedure form, without the equals, although which is better is a stylistic debate. Opponents might argue that having the different forms leads to needless questions like this one ;-)
The former is
def someFun(x: String): Unit = {
x.length
() // return unit
}
And the latter is
def someFun(x: String): Int = {
x.length // returned
}
Note that the Scala Style Guide always recommends using '=', both in
method declaration
Methods should be declared according to the following pattern:
def foo(bar: Baz): Bin = expr
function declaration
Function types should be declared with a space between the parameter type, the arrow and the return type:
def foo(f: Int => String) = ...
def bar(f: (Boolean, Double) => List[String]) = ...
As of Scala 2.10, using the equals sign is preferred. In fact, you must use the equals sign in all definitions except those returning Unit.
For methods returning Unit the two forms behave the same, but the first one is no longer recommended and should not be used.

why does this scala by-name parameter behave weirdly

OK the question might not say much, but here's the deal:
I'm learning Scala and decided to make a utility class "FuncThread" with a method that takes a by-name parameter (I guess it's called that because it's like a function without a parameter list) and then starts a thread with a Runnable, which in turn executes the passed function. I wrote the class as follows:
class FuncThread {
  // Runs the by-name argument on a new thread; func is evaluated inside run()
  def runInThread(func: => Unit): Unit = {
    val thread = new Thread(new Runnable {
      def run(): Unit = func
    })
    thread.start()
  }
}
Then I wrote a junit test as follows:
@Test
def weirdBehaivorTest()
{
var executed = false
val util = new FuncThread()
util.runInThread
{
executed = true
}
//the next line makes the test pass....
//val nonSense : () => Unit = () => { Console println "???" }
assertTrue(executed)
}
If I uncomment the second commented line, the test passes, but if it remains commented the test fails. Is this the correct behaviour? How and when do by-name parameters get executed?
I know Scala has the actors library but I wanted to try this since I've always wanted to do this in Java
Is this just a race condition? runInThread starts the thread, but your assertion tests 'executed' before the other thread has set it to true. Adding your extra line means more code (and so more time) is executed before the assertion, making it more likely that 'executed' has already been set to true.
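One way to remove the race, sketched under the assumption that the helper may be changed to return the Thread it starts, is to join the thread before asserting. FuncThreadDemo and runAndWait are hypothetical names; an AtomicBoolean replaces the plain var so the flag is safely shared between threads.

```scala
import java.util.concurrent.atomic.AtomicBoolean

object FuncThreadDemo {
  // Hypothetical variant of runInThread that returns the started thread
  def runInThread(func: => Unit): Thread = {
    val thread = new Thread(new Runnable {
      def run(): Unit = func   // the by-name argument is evaluated here, on the new thread
    })
    thread.start()
    thread
  }

  def runAndWait(): Boolean = {
    val executed = new AtomicBoolean(false)
    val thread = runInThread { executed.set(true) }
    thread.join()              // wait until the block has finished running
    executed.get()
  }
}
```

With the join in place, the result no longer depends on how much other code happens to run before the check.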
It's also worth noting that (as of Scala 2.8), the construct you were trying to write is available in the standard library
import scala.actors.Futures._
future{
executed = true
}
This construct is actually more powerful than what you're describing: the computation on the other thread can return a value, and the caller can wait for that value.
import scala.actors.Futures._
//forks off expensive calculation
val expensiveToCalculateNumber:Future[Int] = future{
bigExpensiveCalculation()
}
// do a lot of other stuff
//print out the result of the expensive calculation if it's ready, otherwise wait until it is
println( expensiveToCalculateNumber());
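The scala.actors package was deprecated in Scala 2.10, so on current versions the same idea is usually written with scala.concurrent.Future. The sketch below assumes the default global execution context; FutureDemo, compute and the body of bigExpensiveCalculation are hypothetical stand-ins for illustration.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureDemo {
  // Hypothetical stand-in for the expensive calculation in the answer above
  def bigExpensiveCalculation(): Int = (1 to 1000).sum

  def compute(): Int = {
    // Forks the calculation onto the global execution context
    val expensiveToCalculateNumber: Future[Int] = Future {
      bigExpensiveCalculation()
    }
    // ... other work could happen here ...
    // Blocks until the result is ready, or fails after the timeout
    Await.result(expensiveToCalculateNumber, 10.seconds)
  }
}
```

Blocking with Await.result is acceptable in a test or a script; in application code, composing the Future with map or foreach is generally preferred.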