`def` vs `val` vs `lazy val` evaluation in Scala - scala

Am I right understanding that
def is evaluated every time it gets accessed
lazy val is evaluated once it gets accessed
val is evaluated once it gets into the execution scope?

Yes, but there is one nice trick: if you have lazy value, and during first time evaluation it will get an exception, next time you'll try to access it will try to re-evaluate itself.
Here is example:
scala> import io.Source
import io.Source
scala> class Test {
| lazy val foo = Source.fromFile("./bar.txt").getLines
| }
defined class Test
scala> val baz = new Test
baz: Test = Test#ea5d87
//right now there is no bar.txt
scala> baz.foo
java.io.FileNotFoundException: ./bar.txt (No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:137)
...
// now I've created empty file named bar.txt
// class instance is the same
scala> baz.foo
res2: Iterator[String] = empty iterator

Yes, though for the 3rd one I would say "when that statement is executed", because, for example:
def foo() {
new {
val a: Any = sys.error("b is " + b)
val b: Any = sys.error("a is " + a)
}
}
This gives "b is null". b is never evaluated and its error is never thrown. But it is in scope as soon as control enters the block.

I would like to explain the differences through the example that i executed in REPL.I believe this simple example is easier to grasp and explains the conceptual differences.
Here,I am creating a val result1, a lazy val result2 and a def result3 each of which has a type String.
A). val
scala> val result1 = {println("hello val"); "returns val"}
hello val
result1: String = returns val
Here, println is executed because the value of result1 has been computed here. So, now result1 will always refer to its value i.e "returns val".
scala> result1
res0: String = returns val
So, now, you can see that result1 now refers to its value. Note that, the println statement is not executed here because the value for result1 has already been computed when it was executed for the first time. So, now onwards, result1 will always return the same value and println statement will never be executed again because the computation for getting the value of result1 has already been performed.
B). lazy val
scala> lazy val result2 = {println("hello lazy val"); "returns lazy val"}
result2: String = <lazy>
As we can see here, the println statement is not executed here and neither the value has been computed. This is the nature of lazyness.
Now, when i refer to the result2 for the first time, println statement will be executed and value will be computed and assigned.
scala> result2
hello lazy val
res1: String = returns lazy val
Now, when i refer to result2 again, this time around, we will only see the value it holds and the println statement wont be executed. From now on, result2 will simply behave like a val and return its cached value all the time.
scala> result2
res2: String = returns lazy val
C). def
In case of def, the result will have to be computed everytime result3 is called. This is also the main reason that we define methods as def in scala because methods has to compute and return a value everytime it is called inside the program.
scala> def result3 = {println("hello def"); "returns def"}
result3: String
scala> result3
hello def
res3: String = returns def
scala> result3
hello def
res4: String = returns def

One good reason for choosing def over val, especially in abstract classes (or in traits that are used to mimic Java's interfaces), is, that you can override a def with a val in subclasses, but not the other way round.
Regarding lazy, there are two things I can see that one should have in mind. The first is that lazy introduces some runtime overhead, but I guess that you would need to benchmark your specific situation to find out whether this actually has a significant impact on the runtime performance. The other problem with lazy is that it possibly delays raising an exception, which might make it harder to reason about your program, because the exception is not thrown upfront but only on first use.

You are correct. For evidence from the specification:
From "3.3.1 Method Types" (for def):
Parameterless methods name expressions that are re-evaluated each time
the parameterless method name is referenced.
From "4.1 Value Declarations and Definitions":
A value definition val x : T = e defines x as a name of the value that results from
the evaluation of e.
A lazy value definition evaluates its right hand side e the first
time the value is accessed.

def defines a method. When you call the method, the method ofcourse runs.
val defines a value (an immutable variable). The assignment expression is evaluated when the value is initialized.
lazy val defines a value with delayed initialization. It will be initialized when it's first used, so the assignment expression will be evaluated then.

A name qualified by def is evaluated by replacing the name and its RHS expression every time the name appears in the program. Therefore, this replacement will be executed every where the name appears in your program.
A name qualified by val is evaluated immediately when control reaches its RHS expression. Therefore, every time the name appears in the expression, it will be seen as the value of this evaluation.
A name qualified by lazy val follows the same policy as that of val qualification with an exception that its RHS will be evaluated only when the control hits the point where the name is used for the first time

Should point out a potential pitfall in regard to usage of val when working with values not known until runtime.
Take, for example, request: HttpServletRequest
If you were to say:
val foo = request accepts "foo"
You would get a null pointer exception as at the point of initialization of the val, request has no foo (would only be know at runtime).
So, depending on the expense of access/calculation, def or lazy val are then appropriate choices for runtime-determined values; that, or a val that is itself an anonymous function which retrieves runtime data (although the latter seems a bit more edge case)

Related

Calling function library scala

I'm looking to call the ATR function from this scala wrapper for ta-lib. But I can't figure out how to use wrapper correctly.
package io.github.patceev.talib
import com.tictactec.ta.lib.{Core, MInteger, RetCode}
import scala.concurrent.Future
object Volatility {
def ATR(
highs: Vector[Double],
lows: Vector[Double],
closes: Vector[Double],
period: Int = 14
)(implicit core: Core): Future[Vector[Double]] = {
val arrSize = highs.length - period + 1
if (arrSize < 0) {
Future.successful(Vector.empty[Double])
} else {
val begin = new MInteger()
val length = new MInteger()
val result = Array.ofDim[Double](arrSize)
core.atr(
0, highs.length - 1, highs.toArray, lows.toArray, closes.toArray,
period, begin, length, result
) match {
case RetCode.Success =>
Future.successful(result.toVector)
case error =>
Future.failed(new Exception(error.toString))
}
}
}
}
Would someone be able to explain how to use function and print out the result to the console.
Many thanks in advance.
Regarding syntax, Scala is one of many languages where you call functions and methods passing arguments in parentheses (mostly, but let's keep it simple for now):
def myFunction(a: Int): Int = a + 1
myFunction(1) // myFunction is called and returns 2
On top of this, Scala allows to specify multiple parameters lists, as in the following example:
def myCurriedFunction(a: Int)(b: Int): Int = a + b
myCurriedFunction(2)(3) // myCurriedFunction returns 5
You can also partially apply myCurriedFunction, but again, let's keep it simple for the time being. The main idea is that you can have multiple lists of arguments passed to a function.
Built on top of this, Scala allows to define a list of implicit parameters, which the compiler will automatically retrieve for you based on some scoping rules. Implicit parameters are used, for example, by Futures:
// this defines how and where callbacks are run
// the compiler will automatically "inject" them for you where needed
implicit val ec: ExecutionContext = concurrent.ExecutionContext.global
Future(4).map(_ + 1) // this will eventually result in a Future(5)
Note that both Future and map have a second parameter list that allows to specify an implicit execution context. By having one in scope, the compiler will "inject" it for you at the call site, without having to write it explicitly. You could have still done it and the result would have been
Future(4)(ec).map(_ + 1)(ec)
That said, I don't know the specifics of the library you are using, but the idea is that you have to instantiate a value of type Core and either bind it to an implicit val or pass it explicitly.
The resulting code will be something like the following
val highs: Vector[Double] = ???
val lows: Vector[Double] = ???
val closes: Vector[Double] = ???
implicit val core: Core = ??? // instantiate core
val resultsFuture = Volatility.ATR(highs, lows, closes) // core is passed implicitly
for (results <- resultsFuture; result <- results) {
println(result)
}
Note that depending on your situation you may have to also use an implicit ExecutionContext to run this code (because you are extracting the Vector[Double] from a Future). Choosing the right execution context is another kind of issue but to play around you may want to use the global execution context.
Extra
Regarding some of the points I've left open, here are some pointers that hopefully will turn out to be useful:
Operators
Multiple Parameter Lists (Currying)
Implicit Parameters
Scala Futures

Strange NPE with io.circe.Decoder

I have 2 variables declared as follows,
implicit val decodeURL: Decoder[URL] = Decoder.decodeString.emapTry(s => Try(new URL(s))) // #1
implicit val decodeCompleted = Decoder[List[URL]].prepare(_.downField("completed")) // #2
Both lines compile and run.
However, if I annotate #2 with type i.e. implicit val decodeCompleted: Decoder[List[URL]] = Decoder[List[URL]].prepare(_.downField("completed")). It compiles and #2 will throw NullPointerException (NPE) during runtime.
How could this happen? I don't know if this is Circe or just plain Scala issue. Why #2 is different from #1? Thanks
The issue is that you are supposed to use implicits with annotation always.
Now, when you use them when they are not annotated you get into some sort on undefined/invalid behavior zone. That is why with unannotated happens sth like this:
I want to use Decoder[List[URL]]
there is no (annotated) Decoder[List[URL]] implicit in the scope
let's derive it normally (no need for generic.auto._ because the definition for that is in a companion object)
once derived you call on it .prepare(_.downField("completed"))
the final result is of type Decoder[List[URL]], so that is inferred type of decodeCompleted
Now, what happens if you annotate?
I want to use Decoder[List[URL]]
there is decodeCompleted declared as something that fulfills that definition
let use decodeCompleted value
but decodeCompleted wasn't initialized! in fact we are initializing it right now!
as a result you end up with decodeCompleted = null
This is virtually equal to:
val decodeCompleted = decodeCompleted
except that the layer of indirection get's in the way of discovering the absurdity of this by compiler. (If you replaced val with def you would end up with an infinite recursion and stack overflow):
# implicit val s: String = implicitly[String]
s: String = null
# implicit def s: String = implicitly[String]
defined function s
# s
java.lang.StackOverflowError
ammonite.$sess.cmd1$.s(cmd1.sc:1)
ammonite.$sess.cmd1$.s(cmd1.sc:1)
ammonite.$sess.cmd1$.s(cmd1.sc:1)
ammonite.$sess.cmd1$.s(cmd1.sc:1)
Yup, its messed up by compiler. You did nothing wrong and in a perfect world it would work.
Scala community mitigates that by distincting:
auto derivation - when you need implicit somewhere and it is automatically derived without you defining a variable for it
semiauto derivation - when you derive into a value, and make that value implicit
In the later case, you usually have some utilities like:
import io.circe.generic.semiauto._
implicit val decodeCompleted: Decoder[List[URL]] = deriveDecoder[List[URL]]
It works because it takes DerivedDecoder[A] implicit and then it extracts Decoder[A] from it, so you never end up with implicit val a: A = implicitly[A] scenario.
Indeed the problem is that you introduce a recursive val, like #MateuszKubuszok explained.
The most straightforward—although slightly ugly—workaround is:
implicit val decodeCompleted: Decoder[List[URL]] = {
val decodeCompleted = null
Decoder[List[URL]].prepare(_.downField("completed"))
}
By shadowing decodeCompleted in the right-hand side, implicit search will no longer consider it as a candidate inside that code block, because it can no longer be referenced.

Ambigious semantics for by-name parameters?

Consider the following function:
import java.util.concurrent.Callable;
def callable[T]( operation: =>T) : Callable[T] = {
new Callable[T] {
def call : T = operation
}
}
In the REPL, this code does what I want:
scala> val myCallable = callable {
| println("Side effect!");
| "Hi!"
| }
myCallable: java.util.concurrent.Callable[String] = $anon$1#11ba4552
scala> myCallable.call
Side effect!
res3: String = Hi!
scala> myCallable.call
Side effect!
res4: String = Hi!
The by-name parameter is not evaluated until the function 'call' is called, and is re-evaluated every time that function is called. That is the behavior I want.
But in the spec, it says the following about by-name parameters:
"the corresponding argument is not evaluated at the point of function application, but instead is evaluated at each use within the function."
From this description, it is unclear that I can rely on the behavior I want. What does a "use within the function mean"? How do I know that this refers to the point at which my Callable is called (sometime in the indefinite future), rather than the point at which it's defined (very much "within the function")?
The code is doing what I want. But I'd rest easier if I was sure this behavior is reliable, not a bug that might be "fixed" in some future version of scala.
This is not a bug - that behaviour is as intended. You can think of "evaluated at each use within the function" recursively as "evaluated at each use within an expression in the function when that expression is evaluated".
"The function" is the function you're passing your parameter to. What this passage tries to warn you about is this:
scala> def byName(arg: => String) = arg + arg
byName: (arg: => String)java.lang.String
scala> byName({println("hi") ; "foo" })
hi
hi
res0: java.lang.String = foofoo
i.e. your side effect will happen every time you reference the argument. Since you're only doing it once, that's not that relevant to your case (except for the point of evaluation, which is inside the function, not at the call site).
To expand on the previous answers and clarify the way to avoid this, you can capture the value in a val inside the function if you only want it to be evaluated once. By doing this you are causing evaluation of the "by name" parameter and are using the computed value more than once rather than causing 2 evaluations of the same expression.
scala> def byName(arg: => String) = {val computedArg = arg; computedArg + computedArg}
byName: (arg: => String)java.lang.String
scala> byName({"println("hi") ; "foo" })
hi
res0: java.lang.String = foofoo
In case you need to do that in the future...
Lastly, to get truly lazy evaluation of a method parameter (zero or one evaluations only), you can do this:
def m1(i: => Int) = {
lazy val li = i
...
// any number of static or dynamic references to li
...
}

Scala's lazy arguments: How do they work?

In the file Parsers.scala (Scala 2.9.1) from the parser combinators library I seem to have come across a lesser known Scala feature called "lazy arguments". Here's an example:
def ~ [U](q: => Parser[U]): Parser[~[T, U]] = { lazy val p = q // lazy argument
(for(a <- this; b <- p) yield new ~(a,b)).named("~")
}
Apparently, there's something going on here with the assignment of the call-by-name argument q to the lazy val p.
So far I have not been able to work out what this does and why it's useful. Can anyone help?
Call-by-name arguments are called every time you ask for them. Lazy vals are called the first time and then the value is stored. If you ask for it again, you'll get the stored value.
Thus, a pattern like
def foo(x: => Expensive) = {
lazy val cache = x
/* do lots of stuff with cache */
}
is the ultimate put-off-work-as-long-as-possible-and-only-do-it-once pattern. If your code path never takes you to need x at all, then it will never get evaluated. If you need it multiple times, it'll only be evaluated once and stored for future use. So you do the expensive call either zero (if possible) or one (if not) times, guaranteed.
The wikipedia article for Scala even answers what the lazy keyword does:
Using the keyword lazy defers the initialization of a value until this value is used.
Additionally, what you have in this code sample with q : => Parser[U] is a call-by-name parameter. A parameter declared this way remains unevaluated, until you explicitly evaluate it somewhere in your method.
Here is an example from the scala REPL on how the call-by-name parameters work:
scala> def f(p: => Int, eval : Boolean) = if (eval) println(p)
f: (p: => Int, eval: Boolean)Unit
scala> f(3, true)
3
scala> f(3/0, false)
scala> f(3/0, true)
java.lang.ArithmeticException: / by zero
at $anonfun$1.apply$mcI$sp(<console>:9)
...
As you can see, the 3/0 does not get evaluated at all in the second call. Combining the lazy value with a call-by-name parameter like above results in the following meaning: the parameter q is not evaluated immediately when calling the method. Instead it is assigned to the lazy value p, which is also not evaluated immediately. Only lateron, when p is used this leads to the evaluation of q. But, as p is a val the parameter q will only be evaluated once and the result is stored in p for later reuse in the loop.
You can easily see in the repl, that the multiple evaluation can happen otherwise:
scala> def g(p: => Int) = println(p + p)
g: (p: => Int)Unit
scala> def calc = { println("evaluating") ; 10 }
calc: Int
scala> g(calc)
evaluating
evaluating
20

What does it mean when I use def to define a field in Scala?

What exactly is the difference between:
scala> def foo = 5
foo: Int
and
scala> def foo() = 5
foo: ()Int
Seems that in both cases, I end up with a variable foo which I can refer to without parenthesis, always evaluating to 5.
You're not defining a variable in either case. You're defining a method. The first method has no parameter lists, the second has one parameter list, which is empty. The first of these should be
called like this
val x = foo
while the second should be called like this
val x = foo()
However, the Scala compiler will let you call methods with one empty parameter list without the parentheses, so either form of call will work for the second method. Methods without parameter lists cannot be called with the parentheses
The preferred Scala style is to define and call no-argument methods which have side-effects with the parentheses. No-argument methods without side-effects should be defined and called without the parentheseses.
If you actually which to define a variable, the syntax is
val foo = 5
Before anything else is said, def does not define a field, it defines a method.
In the second case, you can omit parenthesis because of a specific feature of Scala. There are two differences of interest here: one mechanical, and one of recommended usage.
Beginning with the latter, it is recommended usage to use empty parameter list when there are side effects. One classic example is close(). You'd omit parenthesis if there are no side effects to calling the element.
Now, as a practical difference -- beyond possible weird syntactic mix-ups in corner cases (I'm not saying there are, just conjecturing) -- structural types must follow the correct convention.
For example, Source had a close method without parenthesis, meaning a structural type of def close(): Unit would not accept Source. Likewise, if I define a structural method as def close: Unit, then Java closeable objects will not be accepted.
What does it mean when I use def to define a field in Scala
You can't define a field using def.
Seems that in both cases, I end up with a variable foo which I can refer to without parenthesis, always evaluating to 5.
No, in both cases you end up with a method foo, which you can call without parentheses.
To see that, you can use javap:
// Main.scala
object Main {
def foo1 = 5
def foo2() = 5
}
F:\MyProgramming\raw>scalac main.scala
F:\MyProgramming\raw>javap Main
Compiled from "main.scala"
public final class Main extends java.lang.Object{
public static final int foo2();
public static final int foo1();
}
However, see http://tommy.chheng.com/index.php/2010/03/when-to-call-methods-with-or-without-parentheses-in-scala/
Additionally to the answers already given I'd like to stress two points:
The possibility to define methods without a parameter list is a way to realize the Uniform Access Principle. This allows to hide the difference between fields and methods, which makes later implementation changes easier.
You can call a method defined as def foo() = 5 using foo, but you can't call a method defined as def foo = 5 using foo()
I'm surprised that nobody mentioned anything about the laziness difference.
While val is evaluated only once at the time of definition, def is evaluated only when we access it and evaluated every-time we access it. See example below:
scala> def foo = {
| println("hi")
| 5
| }
foo: Int
scala> val onlyOnce = foo
scala> def everyTime = foo
scala> onlyOnce
res0: Int = 5
scala> onlyOnce
res1: Int = 5
scala> everyTime
hi
res2: Int = 5
scala> everyTime
hi
res3: Int = 5