Scala shell: declaring the same variable name multiple times - scala

In the Scala shell, I can declare the same variable multiple times without getting any error or warning.
For example:
scala> val a = 1
a: Int = 1
scala> val a = 2
a: Int = 2
scala> val a = 1
a: Int = 1
scala> lazy val a = 1
a: Int = <lazy>
Here the variable name "a" is declared multiple times with var, val, and lazy val.
So I would like to know:
How does the Scala compiler treat this? E.g. between val a = 1 and var a = 2, which takes precedence?
Why does the Scala shell accept the same variable name being declared multiple times?
How do I know whether a declared variable is mutable or immutable, given that the same name has been declared as both var and val?
Note: In IntelliJ I am able to declare the same variable multiple times and I don't see an error, but when accessing it the IDE shows the error "Cannot resolve variable". So what is the use of declaring the same variable multiple times?

In the REPL, there is often experimenting and prototyping taking place, and redefining a val is most often not a mistake but intentional.
The definition that takes precedence is the last one you typed that compiled successfully.
scala> val a: Int = 7
a: Int = 7
scala> val a: Int = "foo"
<console>:12: error: type mismatch;
found : String("foo")
required: Int
val a: Int = "foo"
^
scala> a
res7: Int = 7
If you aren't sure whether a name is already in use, you can just type the name (like a in my case) and get feedback. For undeclared values, you get:
scala> b
<console>:13: error: not found: value b
b
^
But if you paste a block of code with :paste, conflicting duplicate names won't work and the whole block is discarded.
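A sketch of that failure mode (exact messages vary between Scala versions):
scala> :paste
// Entering paste mode (ctrl-D to finish)

val a = 1
val a = 2

// Exiting paste mode, now interpreting.

<console>:12: error: a is already defined as value a
val a = 2
    ^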

Related

Scala: a template for function to accept only a certain arity and a certain output?

I have a class where all of its functions have the same arity and the same type of output. (Why? Each function is a separate processor that is applied to a Spark DataFrame and yields another DataFrame.)
So, the class looks like this:
class Processors {
  def p1(df: DataFrame): DataFrame = {...}
  def p2(df: DataFrame): DataFrame = {...}
  def p3(df: DataFrame): DataFrame = {...}
  ...
}
I then apply all the methods to a given DataFrame by mapping over Processors.getClass.getMethods, which allows me to add more processors without changing anything else in the code.
What I'd like to do is define a template to the methods under Processors which will restrict all of them to accept only one DataFrame and return a DataFrame. Is there a way to do this?
Restricting what kind of functions can be added to a "list" is possible by using an appropriate container class, instead of a generic class, to hold the restricted methods. The container of restricted methods can then be part of some new class or object, or part of the main program.
What you lose below by using containers (e.g. a Map with string keys and restricted values) to hold specific kinds of functions is compile-time checking of the names of the methods, e.g. calling triple vs. trilpe.
The restriction of a function to take a type T and return that same type T can be defined as a type F[T] using Function1 from the Scala standard library. Function1[A,B] allows any single-parameter function with input type A and output type B, but we want these input/output types to be the same, so:
type F[T] = Function1[T,T]
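Equivalently, using Scala's arrow syntax for function types:
type F[T] = T => T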
For a container, I will demonstrate scala.collection.mutable.ListMap[String, F[T]], assuming the following requirements:
- string names reference the functions (doThis, doThat, instead of 1, 2, 3...)
- functions can be added to the list later (mutable)
though you could choose some other mutable or immutable collection class (e.g. Vector[F[T]] if you only want to number the methods) and still benefit from the restriction on what kind of functions future developers can put into the container.
A type alias for the container can be defined as:
type TaskMap[T] = ListMap[String, F[T]]
For your specific application you would then instantiate this as:
val Processors: TaskMap[DataFrame] = ListMap(
  "p1" -> ((df: DataFrame) => {...code for p1 goes here...}),
  "p2" -> ((df: DataFrame) => {...code for p2 goes here...}),
  "p3" -> ((df: DataFrame) => {...code for p3 goes here...})
)
and then to call one of these functions you use
Processors("p2")(someDF)
For simplicity of demonstration, let's forget about DataFrames for a moment and consider whether this scheme works with integers.
Consider the short program below. The collection "myTasks" can only contain functions from Int to Int. All of the lines below have been tested in the Scala interpreter, v2.11.6, so you can follow along line by line.
import scala.collection.mutable.ListMap
type F[T] = Function1[T,T]
type TaskMap[T] = ListMap[String, F[T]]
val myTasks: TaskMap[Int] = ListMap(
  "negate" -> ((x: Int) => -x),
  "triple" -> ((x: Int) => 3 * x)
)
We can add a new function to the container that adds 7, and name it "add7":
myTasks += ( "add7" -> ((x:Int)=>(x+7)) )
and the Scala interpreter responds with:
res0: myTasks.type = Map(add7 -> <function1>, negate -> <function1>, triple -> <function1>)
But we can't add a function named "half", because it would return a Double, and a Double is not an Int, so it should trigger a type error:
myTasks += ( "half" -> ((x:Int)=>(0.5*x)) )
Here we get this error message:
scala> myTasks += ( "half" -> ((x:Int)=>(0.5*x)) )
<console>:12: error: type mismatch;
found : Double
required: Int
myTasks += ( "half" -> ((x:Int)=>(0.5*x)) )
^
In a compiled application, this would be found at compile time.
How to call the functions stored this way is a bit more verbose for single calls, but can be very convenient.
Suppose we want to call "triple" on 10.
We can't write
triple(10)
<console>:9: error: not found: value triple
Instead it is
myTasks("triple")(10)
res4: Int = 30
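And as mentioned above, a mistyped key is caught only at runtime, not at compile time (a sketch; stack trace omitted):
scala> myTasks("trilpe")(10)
java.util.NoSuchElementException: key not found: trilpe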
Where this notation becomes more useful is if you have a list of tasks to perform but only want to allow tasks listed in myTasks.
Suppose we want to run all the tasks on the input value 10:
myTasks mapValues { _ apply 10 }
res9: scala.collection.Map[String,Int] =
Map(add7 -> 17, negate -> -10, triple -> 30)
Suppose we want to triple, then add7, then negate
If each result is desired separately, as above, that becomes:
List("triple","add7","negate") map myTasks.apply map { _ apply 10 }
res11: List[Int] = List(30, 17, -10)
But "triple, then add 7, then negate" could also be describing a series of steps to do 10, i.e. we want -((3*10)+7)" and scala can do that too
val myProgram = List("triple","add7","negate")
myProgram map myTasks.apply reduceLeft { _ andThen _ } apply 10
res12: Int = -37
This opens the door to writing an interpreter for your own customizable set of tasks, because we can also write:
val magic = myProgram map myTasks.apply reduceLeft { _ andThen _ }
and magic is then a function from Int to Int that can take arbitrary Ints or otherwise do work as a function should.
scala> magic(1)
res14: Int = -10
scala> magic(2)
res15: Int = -13
scala> magic(3)
res16: Int = -16
scala> List(10,20,30) map magic
res17: List[Int] = List(-37, -67, -97)
Is this what you mean?
class Processors {
  type Template = DataFrame => DataFrame
  val p1: Template = ...
  val p2: Template = ...
  val p3: Template = ...
  def applyAll(df: DataFrame): DataFrame =
    p1(p2(p3(df)))
}
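For completeness, here is a minimal self-contained sketch combining the two ideas, with DataFrame stubbed out as a plain type alias since Spark isn't on the classpath here (the stub and all names are illustrative assumptions, not the questioner's actual code):
import scala.collection.mutable.ListMap

object ProcessorsDemo {
  type DataFrame = Map[String, Seq[Int]] // stand-in for Spark's DataFrame, illustration only
  type Template = DataFrame => DataFrame

  // every entry is forced by its declared type to be DataFrame => DataFrame
  val processors: ListMap[String, Template] = ListMap(
    "p1" -> (df => df),                                         // placeholder: identity
    "p2" -> (df => df.map { case (k, v) => k -> v.map(_ + 1) }) // placeholder: bump every value
  )

  // run every processor in sequence over the input
  def applyAll(df: DataFrame): DataFrame =
    processors.values.foldLeft(df)((acc, p) => p(acc))
}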

Scala recursive val behaviour

What do you think prints out?
val foo: String = "foo" + foo
println(foo)
val foo2: Int = 3 + foo2
println(foo2)
Answer:
foonull
3
Why? Is there a part of the specification that describes/explains this?
EDIT: To clarify my astonishment - I do realize that foo is undefined at val foo: String = "foo" + foo, and that's why it has the default value null (zero for integers). But this doesn't seem very "clean", and I see opinions here that agree with me. I was hoping the compiler would stop me from doing something like that. It does make sense in some particular cases, such as when defining Streams, which are lazy by nature, but for strings and integers I would expect it either to stop me due to reassignment to a val, or to tell me that I'm trying to use an undefined value, just as if I had written val foo = whatever (given that whatever was never defined).
To further complicate things, @dk14 points out that this behaviour is only present for values represented as fields, and doesn't happen within blocks, e.g.
val bar: String = {
  val foo: String = "foo" + foo // error: forward reference extends...
  "bar"
}
Yes, see the SLS Section 4.2.
foo is a String which is a reference type. It has a default value of null. foo2 is an Int which has a default value of 0.
In both cases, you are referring to a val which has not been initialized, so the default value is used.
In both cases, foo and foo2 respectively have their default values according to the JVM specification: null for a reference type, and 0 for an int (or Int, as Scala spells it).
Scala's Int is backed by a primitive type (int) underneath, and its default value (before initialization) is 0.
On the other hand, String is an Object which is indeed null when not initialized.
Unfortunately, Scala is not as safe as, say, Haskell, because of compromises with Java's OOP model ("Object-Oriented Meets Functional"). There is a safe subset of Scala called Scalazzi, and some sbt/scalac plugins can give you more warnings/errors:
https://github.com/puffnfresh/wartremover (doesn't check for the case you've found though)
Returning to your case: this happens only for values represented as fields, and doesn't happen when you're inside a function/method/block:
scala> val foo2: Int = 3 + foo2
foo2: Int = 3
scala> {val foo2: Int = 3 + foo2 }
<console>:14: error: forward reference extends over definition of value foo2
{val foo2: Int = 3 + foo2 }
^
scala> def method = {val a: Int = 3 + a}
<console>:12: error: forward reference extends over definition of value a
def method = {val a: Int = 3 + a}
The reason for this situation is integration with Java: a val compiles to a final field on the JVM, so Scala preserves all the initialization specifics of JVM classes (I believe this is part of JSR-133, the Java Memory Model). Here is a more complicated example that shows this behavior:
scala> object Z{ val a = b; val b = 5}
<console>:12: warning: Reference to uninitialized value b
object Z{ val a = b; val b = 5}
^
defined object Z
scala> Z.a
res12: Int = 0
So here, at least, you can see the warning that you didn't get in the first place.
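Making the forward-referenced value lazy (or simply reordering the definitions) avoids the uninitialized read; a quick sketch (result numbering will vary):
scala> object Z{ lazy val a = b; val b = 5}
defined object Z
scala> Z.a
res13: Int = 5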

Nulls in Scala ...why is this possible?

I was coding in Scala and doing some quick refactoring in IntelliJ when I stumbled upon the following piece of weirdness...
package misc
/**
 * Created by abimbola on 05/10/15.
 */
object WTF extends App {
  val name: String = name
  println(s"Value is: $name")
}
I then noticed that the compiler didn't complain, so I decided to try running this, and I got a very interesting output:
Value is: null
Process finished with exit code 0
Can anyone tell me why this works?
EDIT:
First problem: the value name is assigned a reference to itself even though it does not exist yet. Why exactly does the Scala compiler not explode with errors?
Why is the value of the assignment null?
1.) Why does the compiler not explode
Here is a reduced example. This compiles because, given the type, a default value can be inferred:
class Example { val x: Int = x }
scalac Example.scala
Example.scala:1: warning: value x in class Example does nothing other than call itself recursively
class Example { val x: Int = x }
This does not compile because no default value can be inferred:
class ExampleDoesNotCompile { def x = x }
scalac ExampleDoesNotCompile.scala
ExampleDoesNotCompile.scala:1: error: recursive method x needs result type
class ExampleDoesNotCompile { def x = x }
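For completeness: once you add an explicit result type, the recursive def compiles, but calling it just recurses until the stack overflows. A sketch:
class ExampleCompilesNow { def x: Int = x }
// compiles; calling (new ExampleCompilesNow).x recurses forever,
// typically ending in a StackOverflowError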
1.1 What happens here
My interpretation (so beware): the uniform access principle kicks in.
The assignment to the val x calls the accessor x(), which returns the uninitialized value of x.
So x is set to the default value.
class Example { val x: Int = x }
[[syntax trees at end of cleanup]] // Example.scala
package <empty> {
  class Example extends Object {
    private[this] val x: Int = _;
    <stable> <accessor> def x(): Int = Example.this.x;
    def <init>(): Example = {
      Example.super.<init>();
      Example.this.x = Example.this.x();
      ()
    }
  }
}
2.) Why the value is null
The default values are determined by the environment Scala is compiled for.
In the example you have given, it looks like you are running on the JVM. The default value for Object there is null.
So when you do not provide a value, the default value is used as a fallback.
Default values on the JVM:
byte    0
short   0
int     0
long    0L
float   0.0f
double  0.0d
char    '\u0000'
boolean false
Object  null // Strings are objects.
Also, the default value is a valid value for the given type.
Here is an example in the REPL:
scala> val x : Int = 0
x: Int = 0
scala> val x : Int = null
<console>:10: error: an expression of type Null is ineligible for implicit conversion
val x : Int = null
^
scala> val x : String = null
x: String = null
why exactly does the Scala compiler not explode with errors?
Because this problem can't be solved in the general case. Do you know the halting problem? The halting problem says that it is not possible to write an algorithm that finds out whether a program ever halts. Since the problem of finding out whether a recursive definition would result in a null assignment can be reduced to the halting problem, it is not solvable either.
Now, it would be quite easy to forbid recursive definitions altogether; this is, for example, done for values that are not class values:
scala> def f = { val k: String = k+"abc" }
<console>:11: error: forward reference extends over definition of value k
def f = { val k: String = k+"abc" }
^
For class values this feature is not forbidden, for a few reasons:
- Their scope is not limited.
- The JVM initializes them with a default value (which is null for reference types).
- Recursive values are useful.
Your use case is trivial, as is this:
scala> val k: String = k+"abc"
k: String = nullabc
But what about this:
scala> object X { val x: Int = Y.y+1 }; object Y { val y: Int = X.x+1 }
defined object X
defined object Y
scala> X.x
res2: Int = 2
scala> Y.y
res3: Int = 1
scala> object X { val x: Int = Y.y+1 }; object Y { val y: Int = X.x+1 }
defined object X
defined object Y
scala> Y.y
res4: Int = 2
scala> X.x
res5: Int = 1
Or this:
scala> val f: Stream[BigInt] = 1 #:: 1 #:: f.zip(f.tail).map { case (a,b) => a+b }
f: Stream[BigInt] = Stream(1, ?)
scala> f.take(10).toList
res7: List[BigInt] = List(1, 1, 2, 3, 5, 8, 13, 21, 34, 55)
As you can see, it is quite easy to write programs where it is no longer obvious which value they will produce. And since the halting problem is not solvable, we cannot let the compiler do the work for us in non-trivial cases.
This also means that trivial cases, like the one shown in your question, could be hardcoded in the compiler. But since there can't exist an algorithm that can detect all possible trivial cases, any case ever found would have to be hardcoded in the compiler (not to mention that a definition of a "trivial case" does not exist). Therefore it wouldn't be wise to even start hardcoding some of these cases; it would ultimately result in a compiler that is slower and more difficult to maintain.
One could argue that for a use case that burns every second user, it would be wise to at least hardcode such an extreme scenario. On the other hand, some people just need to be burned in order to learn something new. ;)
I think @Andreas' answer already has the necessary info. I'll just try to provide additional explanation:
When you write val name: String = name at the class level, this does a few different things at the same time:
- create the field name
- create the getter name()
- create code for the assignment name = name, which becomes part of the primary constructor
This is what's made explicit by Andreas' 1.1
package <empty> {
  class Example extends Object {
    private[this] val x: Int = _;
    <stable> <accessor> def x(): Int = Example.this.x;
    def <init>(): Example = {
      Example.super.<init>();
      Example.this.x = Example.this.x();
      ()
    }
  }
}
The syntax is not Scala; it is (as suggested by [[syntax trees at end of cleanup]]) a textual representation of what the compiler will later convert into bytecode. Some unfamiliar syntax aside, we can interpret this like the JVM would:
- the JVM creates an object. At this point, all fields have default values. val x: Int = _; is like int x; in Java, i.e. the JVM's default value is used, which is 0 for I (i.e. int in Java, or Int in Scala)
- the constructor is called for the object
- (the super constructor is called)
- the constructor calls x()
- x() returns x, which is 0
- x is assigned 0
- the constructor returns
As you can see, after the initial parsing step there is nothing in the syntax tree that seems immediately wrong, even though the original source code looks wrong. I wouldn't say that this is the behavior I expect, so I would imagine one of three things:
- Either the Scala devs saw it as too intricate to recognize and forbid,
- or it's a regression and simply wasn't found as a bug,
- or it's a "feature" and there is a legitimate need for this behavior.
(The ordering reflects my opinion of likeliness, in decreasing order.)
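As an aside, scalac's -Xcheckinit flag wraps field accessors so that reading an uninitialized val throws a scala.UninitializedFieldError at runtime instead of silently yielding a default value. A sketch (Scala 2; message abbreviated):
$ scala -Xcheckinit
scala> class Example { val x: Int = x }
defined class Example
scala> new Example
scala.UninitializedFieldError: Uninitialized field: ...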

How to get default constructor parameter using reflection?

This seemed easy to figure out, but now I am confused:
scala> class B(i:Int)
defined class B
scala> classOf[B].getDeclaredFields
res12: Array[java.lang.reflect.Field] = Array()
Note this:
scala> class C(i:Int){
| val j = 3
| val k = -1
| }
defined class C
scala> classOf[C].getDeclaredFields
res15: Array[java.lang.reflect.Field] = Array(private final int C.j, private final int C.k)
If you declare i as a val or a var, or if you make B a case class, then you'll see:
scala> classOf[B].getDeclaredFields
res1: Array[java.lang.reflect.Field] = Array(private final int B.i)
If you do neither, no method or field named i is generated, because it's just a constructor parameter that's never used; there is no reason for it to result in a method or field existing.
Note that the Scala compiler never generates public fields, only private ones; access from outside is meant to go through the accessor method named i.
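You can see the accessor side of this with getDeclaredMethods; a sketch (output formatting may vary by Scala version):
scala> class B(val i: Int)
defined class B
scala> classOf[B].getDeclaredMethods
res2: Array[java.lang.reflect.Method] = Array(public int B.i())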

Getting a null with a val depending on abstract def in a trait [duplicate]

This question already has answers here:
Scala - initialization order of vals
(3 answers)
Closed 7 years ago.
I'm seeing some initialization weirdness when mixing vals and defs in my trait. The situation can be summarized with the following example.
I have a trait which provides an abstract field, let's call it fruit, which should be implemented in child classes. It also uses that field in a val:
scala> class FruitTreeDescriptor(fruit: String) {
| def describe = s"This tree has loads of ${fruit}s"
| }
defined class FruitTreeDescriptor
scala> trait FruitTree {
| def fruit: String
| val descriptor = new FruitTreeDescriptor(fruit)
| }
defined trait FruitTree
When overriding fruit with a def, things work as expected:
scala> object AppleTree extends FruitTree {
| def fruit = "apple"
| }
defined object AppleTree
scala> AppleTree.descriptor.describe
res1: String = This tree has loads of apples
However, if I override fruit using a val...
scala> object BananaTree extends FruitTree {
| val fruit = "banana"
| }
defined object BananaTree
scala> BananaTree.descriptor.describe
res2: String = This tree has loads of nulls
What's going on here?
In simple terms, at the point where you're calling:
val descriptor = new FruitTreeDescriptor(fruit)
the constructor for BananaTree has not yet had the chance to run. This means the value of fruit is still null, even though it's a val.
This is a subcase of the well-known quirk of non-declarative initialization of vals, which can be illustrated with a simpler example:
class A {
  val x = a
  val a = "String"
}
scala> new A().x
res1: String = null
(Although thankfully, in this particular case, the compiler will detect something being afoot and will present a warning.)
To avoid the problem, declare fruit as a lazy val, so that it is evaluated on first access.
The problem is the initialization order. val fruit = ... is being initialized after val descriptor = ..., so at the point when descriptor is being initialized, fruit is still null. You can fix this by making fruit a lazy val, because then it will be initialized on first access.
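A sketch of the fix in the REPL (result numbering will vary):
scala> object BananaTree extends FruitTree {
     |   lazy val fruit = "banana"
     | }
defined object BananaTree
scala> BananaTree.descriptor.describe
res3: String = This tree has loads of bananas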
Your descriptor field is initialized earlier than the fruit field, because a trait is initialized earlier than the class that extends it. null is a field's value before initialization; that's why you get it. In the def case it's just a method call instead of an access to a field, so everything is fine (a method's code may be called several times; no initialization is involved). See http://docs.scala-lang.org/tutorials/FAQ/initialization-order.html
Why is def so different? Because a def may be called several times, but a val only once (so its first and only call is actually the initialization of the field).
The typical solution to such a problem is to use a lazy val instead; it will be initialized when you really need it. One more solution is early initializers.
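For reference, the early initializer syntax looks like this (a sketch; early initializers were deprecated in Scala 2.13 and removed in Scala 3):
object BananaTree extends { val fruit = "banana" } with FruitTree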
Another, simpler example of what's going on:
scala> class A {val a = b; val b = 5}
<console>:7: warning: Reference to uninitialized value b
class A {val a = b; val b = 5}
^
defined class A
scala> (new A).a
res2: Int = 0 //null
Speaking more generally, Scala could in theory analyze the dependency graph between fields (which field needs which other field) and start initialization from the leaf nodes. But in practice every module is compiled separately, and the compiler might not even know those dependencies (it might even be Java calling Scala, which calls Java), so it just does sequential initialization.
So, because of that, it can't even detect simple loops:
scala> class A {val a: Int = b; val b: Int = a}
<console>:7: warning: Reference to uninitialized value b
class A {val a: Int = b; val b: Int = a}
^
defined class A
scala> (new A).a
res4: Int = 0
scala> class A {lazy val a: Int = b; lazy val b: Int = a}
defined class A
scala> (new A).a
java.lang.StackOverflowError
Actually, such a loop (inside one module) could theoretically be detected in a separate build step, but it wouldn't help much, as such loops are usually pretty obvious.