Scala val has to be guarded with synchronized for concurrent access? - scala

As I read, Scala immutable val doesn't get translated to Java final for various reasons. Does this mean that accessing a val from another thread must be guarded with synchronization in order to guarantee visibility?

The assignment to a val itself is fine from a multi-threading point of view, because you have to assign a val its value when you declare it and that value can't be changed afterwards (so if you write val s="hello", s is "hello" from its birth on: no thread will ever read another value).
There are a couple of caveats, however:
1 - if you assign an instance of a mutable class to val, val by itself will not "protect" the internal state of the class from changing.
class Foo(s:String) { var thisIsMutable=s }
// you can then do this
val x = new Foo("hello")
x.thisIsMutable="goodbye"
// note that val guarantees that x is still the same instance of Foo
// reassigning x = new Foo("goodbye") would be illegal
2 - you (or one of your libraries...) can change a val via reflection. If this happens, two threads could indeed read different values for your val.
import java.lang.reflect.Field
class Foo { val foo=true } // foo is immutable
object test {
def main(args: Array[String]): Unit = {
val f = new Foo
println("foo is " + f.foo) // "foo is true"
val fld = f.getClass.getDeclaredField("foo")
fld.setAccessible(true)
fld.setBoolean(f, false)
println("foo is " + f.foo) // "foo is false"
}
}

As object members, once initialized, vals never change their values during the lifetime of the object. As such, their values are guaranteed to be visible to all threads provided that the reference to the object didn't escape in the constructor. And, in fact, they get Java final modifiers as illustrated below:
object Obj {
val r = 1
def foo: Unit = {
val a = 1
def bar = a
bar
}
}
Using javap:
...
private final int r;
...
public void foo();
...
0: iconst_1
1: istore_1
2: aload_0
3: iload_1
4: invokespecial #31; //Method bar$1:(I)I
7: pop
...
private final int bar$1(int);
...
0: iload_1
1: ireturn
...
As method locals, they are used only within the method, or they're being passed to a nested method or a closure as arguments (see lifted bar$1 above). A closure might be passed on to another thread, but it will only have a final field with the value of the local val. Therefore, they are visible from the point where they are created to all other threads and synchronization is not necessary.
Note that this says nothing about the object the val points to - it itself may be mutable and warrant synchronization.
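The two points above (a captured local val is safe to read from another thread, while a mutable object behind a val still needs its own thread safety) can be sketched as follows. This is a minimal illustration, not part of the original answer: the names ValSharingDemo and demo are hypothetical, and AtomicInteger is just one possible way to make the shared object safe.

```scala
import java.util.concurrent.atomic.{AtomicInteger, AtomicReference}

object ValSharingDemo {
  // Returns (the value of the local val as seen by a worker thread, the final counter value).
  def demo(): (String, Int) = {
    val greeting = "hello"                  // local val, captured by the closures below
    val counter = new AtomicInteger(0)      // the reference is fixed, the object is mutable
    val seen = new AtomicReference[String]()

    val workers = (1 to 2).map { _ =>
      new Thread(new Runnable {
        def run(): Unit = {
          seen.set(greeting)                // reading the captured val needs no lock
          var i = 0
          while (i < 1000) {                // mutation goes through the atomic, not a plain var
            counter.incrementAndGet()
            i += 1
          }
        }
      })
    }
    workers.foreach(_.start())
    workers.foreach(_.join())
    (seen.get(), counter.get())
  }
}
```

Replacing the AtomicInteger with a plain var counter would make the final count unreliable, even though the val holding it never changes.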
In most cases the above cannot be violated via reflection - the Scala val member declaration actually generates a getter with the same name and a private field which the getter accesses. Trying to use reflection to modify the field will result in a NoSuchFieldException. The only way you could modify it is to add a specialized annotation to your class, which will make the specialized fields protected and hence accessible to reflection. I cannot currently think of any other situation that could change something declared as val...

Related

Why does scala allow private case class fields?

In Scala, it is legal to write case class Foo(private val bar: Any, private val baz: Any).
This works like one might hope, Foo(1, 2) == Foo(1, 2) and a copy method is generated as well. In my testing, marking the entire constructor as private marks the copy method as private, which is great.
However, either way, you can still do this:
Foo(1, 2) match { case Foo(bar, baz) => bar } // 1
So by pattern matching you can extract the value. That seems to render the private in private val bar: Any more of a suggestion. I saw someone say that "If you want encapsulation, a case class is not the right abstraction", which is a valid point I reckon I agree with. But that raises the question, why is this syntax even allowed if it is arguably misleading?
It's because case classes were not primarily intended to be used with private constructors or fields in the first place. Their main purpose is to model immutable data. As you saw, there are workarounds to get at the fields, so a private constructor or private fields on a case class are usually a code smell.
Still, the syntax is allowed because the code is syntactically and semantically correct as far as the compiler is concerned. From the programmer's point of view, though, the question is whether it makes sense to use it like that. Probably not.
Marking the constructor of a case class as private does not have the effect you want, and it does not make the copy method private, at least not in Scala 2.13.
LATER EDIT: In Scala 3, marking the constructor as private does make the apply and copy methods private. This change was also developed for Scala 2 and can be found here - but it was delayed to the future 2.14 release. It can't go into Scala 2.13 because it's a breaking change.
case class Foo private (bar: Int, baz: Int)
The only thing I can't do is this:
val foo1 = new Foo(1, 2) // no constructor accessible from here
But I can do this:
val foo2 = Foo(1, 2) // ok
val foo3 = Foo.apply(1, 2) // ok
val foo4 = foo2.copy(4) // ok - Foo(4,2)
A case class, as its name implies, is precisely intended to be pattern matched or "caseable" - that means you can use it like this:
case Foo(x, y) => // do something with x and y
Or this:
val foo2 = Foo(1, 2)
val Foo(x, y) = foo2 // equivalent to the previous
Otherwise, why would you mark it as a case class instead of a plain class?
Sure, one could argue a case class is more convenient because it comes with a lot of methods (rest assured, all that comes with an overhead at runtime as a trade-off for all those created methods) but if privacy is what you are after, they don't bring anything to the table on that.
A case class creates a companion object with an unapply method which, when given an instance of your case class, deconstructs it into its initialized fields. This is also called an extractor method.
The companion object and its class can access each other's members - that includes private constructors and fields. That is how it was designed. Now, apply and unapply are public methods defined on the companion object, which means you can still create new objects using apply, and if your fields are private, you can still access them through unapply.
Still, you can define both yourself in the companion object (replacing the generated versions) if you really want your case class to be private. Most of the time, though, it won't make sense to do so unless you have some really specific requirements:
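To make the extractor mechanism concrete, here is a rough sketch of what a companion object does, written by hand for a plain (non-case) class. The names ExtractorDemo, Point and sum are hypothetical and only serve the illustration; the code the compiler generates for a real case class differs in detail.

```scala
object ExtractorDemo {
  // A plain class: the compiler generates no apply/unapply for it.
  class Point(private val x: Int, private val y: Int)

  object Point {
    // Factory method, so Point(1, 2) works without `new`.
    def apply(x: Int, y: Int): Point = new Point(x, y)

    // Extractor: the companion may read the private fields,
    // and a public unapply hands them to any pattern match.
    def unapply(p: Point): Option[(Int, Int)] = Some((p.x, p.y))
  }

  // Pattern matching goes through Point.unapply, not through the private fields directly.
  def sum(p: Point): Int = p match {
    case Point(a, b) => a + b
  }
}
```

Because unapply is public, the private modifiers on x and y do not stop callers from reading the values through a pattern match.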
case class Foo2 private (private val bar: Int, private val baz: Int)
object Foo2 {
private def apply(bar: Int, baz: Int) = new Foo2(bar, baz)
private def unapply(f: Foo2): Option[(Int, Int)] = Some((f.bar, f.baz))
}
val foo11 = new Foo2(1, 2) // won't compile
val foo22 = Foo2(1, 2) // won't compile
val foo33 = Foo2.apply(1, 2) // won't compile
val Foo2(bar, baz) = foo22 // won't compile
println(Foo2(1, 2) == Foo2(1, 2)) // won't compile
val sum = foo22 match {
case Foo2(x, y) => x + y // won't compile
}
Still, you can see the contents of Foo2 by printing it, because case classes also override toString and you can't make that private, so you'll have to override it yourself to print something else. I will leave that to you to try out.
print(foo11) // Foo2(1,2)
As you see, a case class brings multiple access points to its constructor and fields. This example was just for understanding the concept. It is not an example of good design. Usually in OOP, you need an instance of some class to perform operations on it. So a class that you cannot instantiate at all is no more useful than a Scala object. If you find yourself blocking all ways to create an instance of some class or case class, that's a sign you probably need an object instead, since an object is already a singleton in Scala.
Adding to the previous answer: you can make the copy method inaccessible by adding a private method named copy:
case class Foo3(private val x: Int) {
private def copy: Foo3 = this
}
Foo3(1).copy(x = 2) // won't compile ("method ... cannot be accessed")

Scala NullPointerException during initialization

Consider the following case (this is a simplified version of what I have. The numberOfScenarios is the most important variable here: I usually use a hardcoded number instead of it, but I'm trying to see if it's possible to calculate the value):
object ScenarioHelpers {
val identifierList = (1 to Scenarios.numberOfScenarios).toArray
val concurrentIdentifierQueue = new ConcurrentLinkedQueue[Int](identifierList.toSeq)
}
abstract class AbstractScenario {
val identifier = ScenarioHelpers.concurrentIdentifierQueue.poll()
}
object Test1 extends AbstractScenario {
val scenario1 = scenario("test scenario 1").exec(/..steps../)
}
object Test2 extends AbstractScenario {
val scenario2 = scenario("test scenario 2").exec(/..steps../)
}
object Scenarios {
val scenarios = List(Test1.scenario1, Test2.scenario2)
val numberOfScenarios = scenarios.length
}
object TestPreparation {
val feeder = ScenarioHelpers.identifierList.map(n => Map("counter" -> n))
val prepScenario = scenario("test preparation")
.feed(feeder)
.exec(/..steps../)
}
Not sure if it matters, but the simulation starts with executing the TestPreparation.prepScenario.
I see that this code contains a circular dependency which makes this case impossible in and of itself. But I get a NullPointerException on the line in AbstractScenario where identifier is being initialized.
I don't fully understand all this, but I think it has something to do with the vals being simply declared at first and the initialization does not happen until later. So when identifier is being initialized, the concurrentIdentifierQueue is not yet initialized and is therefore null.
I'm just trying to understand the reasons behind the NullPointerException and also if there's any way to get this working with minimal changes? Thank you.
NPEs during trait initialization are a very common problem.
The most robust way to resolve them is to avoid implementation inheritance altogether.
If that is not possible for some reason, you can declare the problematic fields as lazy val or def instead of val.
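The difference between val and lazy val during initialization can be sketched with a small, self-contained example. The names InitOrderDemo, Named, EagerImpl and LazyImpl are hypothetical and unrelated to the code in the question.

```scala
object InitOrderDemo {
  trait Named {
    def name: String
    // Evaluated while the trait is being initialized,
    // i.e. before the body of the concrete subclass has run.
    val greeting: String = "hello " + name
  }

  // `name` is a plain val: its field is still null when `greeting` is computed.
  class EagerImpl extends Named {
    val name: String = "eager"
  }

  // `name` is a lazy val: it is computed on first access, so `greeting` sees the real value.
  class LazyImpl extends Named {
    lazy val name: String = "lazy"
  }
}
```

Here new InitOrderDemo.EagerImpl().greeting yields "hello null", while the LazyImpl variant yields "hello lazy". In the eager case, calling a method on the uninitialized value instead of concatenating it would throw a NullPointerException.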
You answered that yourself:
I see that this code contains a circular dependency which makes this case impossible in and of itself. But I get a NullPointerException on the line in AbstractScenario where identifier is being initialized.
val feeder = ScenarioHelpers.identifierList... calls ScenarioHelpers initialization
val identifierList = (1 to Scenarios.numberOfScenarios).toArray calls Scenarios initialization
val scenarios = List(Test1.scenario1, Test2.scenario2) calls Test1 initialization, including AbstractScenario
At this point, val identifier = ScenarioHelpers.concurrentIdentifierQueue.poll() runs while ScenarioHelpers is still initializing, so concurrentIdentifierQueue (like identifierList) is still null.
You have to obtain numberOfScenarios in a non-cyclic way. Personally, I would remove identifierList and assign identifier some other way, for example by incrementing a counter.
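The counter suggestion could look roughly like the sketch below. It is an assumption about how the idea might be applied, not the asker's actual code: IdentifierDemo and IdentifierSource are hypothetical names, and the Gatling-specific scenario definitions are left out.

```scala
import java.util.concurrent.atomic.AtomicInteger

object IdentifierDemo {
  // Hands out 1, 2, 3, ... without needing to know the number of scenarios up front,
  // so nothing here depends on Scenarios and the initialization cycle disappears.
  object IdentifierSource {
    private val counter = new AtomicInteger(0)
    def next(): Int = counter.incrementAndGet()
  }

  abstract class AbstractScenario {
    val identifier: Int = IdentifierSource.next()
  }

  object Test1 extends AbstractScenario
  object Test2 extends AbstractScenario
}
```

Which scenario receives which number depends on the order in which the objects are first accessed, so this only fits if the identifiers merely have to be unique.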

What is the purpose of final val in Scala? [duplicate]

What is the reason for vals not (?) being automatically final in singleton objects? E.g.
object NonFinal {
val a = 0
val b = 1
def test(i: Int) = (i: @annotation.switch) match {
case `a` => true
case `b` => false
}
}
results in:
<console>:12: error: could not emit switch for @switch annotated match
def test(i: Int) = (i: @annotation.switch) match {
^
Whereas
object Final {
final val a = 0
final val b = 1
def test(i: Int) = (i: @annotation.switch) match {
case `a` => true
case `b` => false
}
}
Compiles without warnings, so presumably generates the faster pattern matching table.
Having to add final seems pure annoying noise to me. Isn't an object final per se, and thus also its members?
This is addressed explicitly in the specification, and they are automatically final:
Members of final classes or objects are implicitly also final, so
the final modifier is generally redundant for them, too. Note, however, that
constant value definitions (§4.1) do require an explicit final modifier, even if
they are defined in a final class or object.
Your final-less example compiles without errors (or warnings) with 2.10-M7, so I'd assume that there's a problem with the @switch checking in earlier versions, and that the members are in fact final.
Update: Actually this is more curious than I expected—if we compile the following with either 2.9.2 or 2.10-M7:
object NonFinal {
val a = 0
}
object Final {
final val a = 0
}
javap does show a difference:
public final class NonFinal$ implements scala.ScalaObject {
public static final NonFinal$ MODULE$;
public static {};
public int a();
}
public final class Final$ implements scala.ScalaObject {
public static final Final$ MODULE$;
public static {};
public final int a();
}
You see the same thing even if the right-hand side of the value definitions isn't a constant expression.
So I'll leave my answer, but it's not conclusive.
You're not asking "why aren't they final", you're asking "why aren't they inlined." It just happens that final is how you cue the compiler that you want them inlined.
The reason they are not automatically inlined is separate compilation.
object A { final val x = 55 }
object B { def f = A.x }
When you compile this, B.f returns 55, literally:
public int f();
0: bipush 55
2: ireturn
That means if you recompile A, B will be oblivious to the change. If x is not marked final in A, B.f looks like this instead:
0: getstatic #19 // Field A$.MODULE$:LA$;
3: invokevirtual #22 // Method A$.x:()I
6: ireturn
Also, to correct one of the other answers, final does not mean immutable in Scala.
To address the central question about final on an object, I think this clause from the spec is more relevant:
A constant value definition is of the form final val x = e
where e is a constant expression (§6.24). The final modifier must be present and no type annotation may be given. References to the constant value x are themselves treated as constant expressions; in the generated code they are replaced by the definition’s right-hand side e.
Of significance:
No type annotation may be given
The expression e is used in the generated code (by my reading, as the original unevaluated constant expression)
It sounds to me like the compiler is required by the spec to use these more like macro replacements rather than values that are evaluated in place at compile time, which could have impacts on how the resulting code runs.
I think it is particularly interesting that no type annotation may be given.
This, I think points to our ultimate answer, though I cannot come up with an example that shows the runtime difference for these requirements. In fact, in my 2.9.2 interpreter, I don't even get the enforcement of the first rule.
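As a concrete illustration of constant value definitions, the sketch below declares three of them and matches on them under @switch. The names SwitchDemo, Red, Green, Blue and isWarm are hypothetical, and whether the compiler actually emits a JVM switch instruction depends on the compiler version, so treat this as an illustration rather than a guarantee.

```scala
import scala.annotation.switch

object SwitchDemo {
  // Constant value definitions: `final`, no type annotation, constant right-hand side.
  final val Red = 0
  final val Green = 1
  final val Blue = 2

  // Capitalized names are stable identifier patterns, so no backticks are needed.
  def isWarm(code: Int): Boolean = (code: @switch) match {
    case Red   => true
    case Green => false
    case Blue  => false
    case _     => false
  }
}
```

Adding a type annotation (final val Red: Int = 0) turns Red into an ordinary value rather than a constant value definition, which is the rule the quoted spec clause describes.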

Why are `private val` and `private final val` different?

I used to think that private val and private final val are same, until I saw section 4.1 in Scala Reference:
A constant value definition is of the form
final val x = e
where e is a constant expression (§6.24). The final modifier must be present and no type annotation may be given. References to the constant value x are themselves treated as constant expressions; in the generated code they are replaced by the definition’s right-hand side e.
And I have written a test:
class PrivateVal {
private val privateVal = 0
def testPrivateVal = privateVal
private final val privateFinalVal = 1
def testPrivateFinalVal = privateFinalVal
}
javap -c output:
Compiled from "PrivateVal.scala"
public class PrivateVal {
public int testPrivateVal();
Code:
0: aload_0
1: invokespecial #19 // Method privateVal:()I
4: ireturn
public int testPrivateFinalVal();
Code:
0: iconst_1
1: ireturn
public PrivateVal();
Code:
0: aload_0
1: invokespecial #24 // Method java/lang/Object."<init>":()V
4: aload_0
5: iconst_0
6: putfield #14 // Field privateVal:I
9: return
}
The byte code is just as Scala Reference said: private val is not private final val.
Why doesn't scalac just treat private val as private final val? Is there any underlying reason?
So, this is just a guess, but it was a perennial annoyance in Java that final static variables with a literal on the right-hand side get inlined into bytecode as constants. That engenders a performance benefit sure, but it causes binary compatibility of the definition to break if the "constant" ever changed. When defining a final static variable whose value might need to change, Java programmers have to resort to hacks like initializing the value with a method or constructor.
A val in Scala is already final in the Java sense. It looks like Scala's designers are using the redundant modifier final to mean "permission to inline the constant value". So Scala programmers have complete control over this behavior without resorting to hacks: if they want an inlined constant, a value that should never change but is fast, they write "final val". If they want the flexibility to change the value without breaking binary compatibility, they write just "val".
I think the confusion here arises from conflating immutability with the semantics of final. vals can be overridden in child classes and therefore can't be treated as final unless marked as such explicitly.
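The point about overriding can be shown with a short sketch. OverrideDemo, Base and Child are hypothetical names. The example uses a public val, since a private val cannot be overridden in the first place.

```scala
object OverrideDemo {
  class Base {
    val limit: Int = 10
    // Reads `limit` through its accessor, which a subclass may override.
    def describe: String = "limit=" + limit
  }

  class Child extends Base {
    override val limit: Int = 20
  }
}
```

Because describe goes through the overridable accessor, the compiler cannot simply inline 10 into Base.describe. Declaring the member as final val in Base would make the override in Child a compile error.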
@Brian The REPL provides class scope at the line level. See:
scala> $iw.getClass.getPackage
res0: Package = package $line3
scala> private val x = 5
<console>:5: error: value x cannot be accessed in object $iw
lazy val $result = `x`
scala> private val x = 5; println(x);
5

Scala: collecting updates/changes of immutable state

I'm currently trying to apply a more functional programming style to a project involving low-level (LWJGL-based) GUI development. Obviously, in such a case it is necessary to carry around a lot of state, which is mutable in the current version. My goal is to eventually have a completely immutable state, in order to avoid state changes as side effect. I studied scalaz's lenses and state monads for awhile, but my main concern remains: All these techniques rely on copy-on-write. Since my state has both a large number of fields and also some fields of considerable size, I'm worried about performance.
To my knowledge the most common approach to modify immutable objects is to use the generated copy method of a case class (this is also what lenses do under the hood). My first question is, how this copy method is actually implemented? I performed a few experiments with a class like:
case class State(
innocentField: Int,
largeMap: Map[Int, Int],
largeArray: Array[Int]
)
By benchmarking and also by looking at the output of -Xprof it looks like updating someState.copy(innocentField = 42) actually performs a deep copy and I observe a significant performance drop when I increase the size of largeMap and largeArray. I was somehow expecting that the newly constructed instance shares the object references of the original state, since internally the reference should just get passed to the constructor. Can I somehow force or disable this deep copy behaviour of the default copy?
While pondering on the copy-on-write issue, I was wondering whether there are more general solutions to this problem in FP, which store changes of immutable data in a kind of incremental way (in the sense of "collecting updates" or "gathering changes"). To my surprise I could not find anything, so I tried the following:
// example state with just two fields
trait State {
def getName: String
def getX: Int
def setName(updated: String): State = new CachedState(this) {
override def getName: String = updated
}
def setX(updated: Int): State = new CachedState(this) {
override def getX: Int = updated
}
// convenient modifiers
def modName(f: String => String) = setName(f(getName))
def modX(f: Int => Int) = setX(f(getX))
def build(): State = new BasicState(getName, getX)
}
// actual (full) implementation of State
class BasicState(
val getName: String,
val getX: Int
) extends State
// CachedState delegates all getters to another state
class CachedState(oldState: State) extends State {
def getName = oldState.getName
def getX = oldState.getX
}
This allows something like the following:
var s: State = new BasicState("hello", 42)
// updating single fields does not copy
s = s.setName("world")
s = s.setX(0)
// after a certain number of "wrappings"
// we can extract (i.e. copy) a normal instance
val ns = s.setName("ok").setX(40).modX(_ + 2).build()
My question now is: What do you think of this design? Is this some kind of FP design pattern that I'm not aware of (apart from the similarity to the Builder pattern)? Since I have not found anything similar, I'm wondering if there is some major issue with this approach? Or are there any more standard ways to solve the copy-on-write bottleneck without giving up immutability?
Is there even a possibility to unify the get/set/mod functions in some way?
Edit:
My assumption that copy performs a deep copy was indeed wrong.
This is basically the same as views and is a type of lazy evaluation; this type of strategy is more or less the default in Haskell, and is used in Scala a fair bit (see e.g. mapValues on maps, grouped on collections, pretty much anything on Iterator or Stream that returns another Iterator or Stream, etc.). It is a proven strategy to avoid extra work in the right context.
But I think your premise is somewhat mistaken.
case class Foo(bar: Int, baz: Map[String,Boolean]) {}
Foo(1,Map("fish"->true)).copy(bar = 2)
does not in fact cause the map to be copied deeply. It just sets references. Proof in bytecode:
62: astore_1
63: iconst_2 // This is bar = 2
64: istore_2
65: aload_1
66: invokevirtual #72; //Method Foo.copy$default$2:()Lscala/collection/immutable/Map;
69: astore_3 // That was baz
70: aload_1
71: iload_2
72: aload_3
73: invokevirtual #76; //Method Foo.copy:(ILscala/collection/immutable/Map;)LFoo;
And let's see what that copy$default$2 thing does:
0: aload_0
1: invokevirtual #50; //Method baz:()Lscala/collection/immutable/Map;
4: areturn
Just returns the map.
And copy itself?
0: new #2; //class Foo
3: dup
4: iload_1
5: aload_2
6: invokespecial #44; //Method "<init>":(ILscala/collection/immutable/Map;)V
9: areturn
Just calls the regular constructor. No cloning of the map.
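The same conclusion can be checked without reading bytecode by comparing references with eq. This is a small sketch that reuses the State shape from the question; the wrapper object CopySharingDemo and the sample values are made up for the illustration.

```scala
object CopySharingDemo {
  case class State(
    innocentField: Int,
    largeMap: Map[Int, Int],
    largeArray: Array[Int]
  )

  val original: State = State(1, Map(1 -> 1, 2 -> 2), Array(1, 2, 3))

  // Only innocentField changes; the other constructor arguments default to the old fields.
  val updated: State = original.copy(innocentField = 42)

  // `eq` is reference equality: true means both states point at the very same object.
  val sharesMap: Boolean = updated.largeMap eq original.largeMap
  val sharesArray: Boolean = updated.largeArray eq original.largeArray
}
```

Both flags are true, so the cost of copy does not depend on the size of the map or the array.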
So when you copy, you create exactly one object: a new copy of what you're copying, with its fields filled in. If you have a large number of fields, your view will be faster, since you still create one new object (two if you use the function-application version, because the function object has to be created as well) but it has only one field. Otherwise it should be about the same.
So, yes, good idea potentially, but benchmark carefully to be sure it's worth it in your case--you have to write a fair bit of code by hand instead of letting the case class do it all for you.
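On the side question of unifying get/set/mod: one possible direction, sketched here under the assumption that plain copy is fast enough, is a tiny hand-rolled lens. This is not the scalaz API; Lens, LensDemo, nameLens and xLens are hypothetical names introduced for the illustration.

```scala
object LensDemo {
  // A minimal lens: a getter plus a copying setter; `mod` is derived from the two.
  final case class Lens[S, A](get: S => A, set: (S, A) => S) {
    def mod(s: S)(f: A => A): S = set(s, f(get(s)))
  }

  final case class State(name: String, x: Int)

  val nameLens: Lens[State, String] =
    Lens[State, String](_.name, (s, n) => s.copy(name = n))

  val xLens: Lens[State, Int] =
    Lens[State, Int](_.x, (s, v) => s.copy(x = v))
}
```

With this, xLens.mod(state)(_ + 2) plays the role of modX, and each field needs one lens definition instead of separate get, set and mod methods.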
I tried to write a (quite rough) test to time your case class copy operation.
object CopyCase {
def main(args: Array[String]) = {
val testSizeLog = byTen(10 #:: Stream[Int]()).take(6).toList
val testSizeLin = (100 until 1000 by 100) ++ (1000 until 10000 by 1000) ++ (10000 to 40000 by 10000)
//warmUp
runTest(testSizeLin)
//test with logarithmic size increments
val times = runTest(testSizeLog)
//test with linear size increments
val timesLin = runTest(testSizeLin)
times.foreach(println)
timesLin.foreach(println)
}
//The case class to test for copy
case class State(
innocentField: Int,
largeMap: Map[Int, Int],
largeArray: Array[Int]
)
//executes the test
def runTest(sizes: Seq[Int]) =
for {
s <- sizes
st = State(s, largeMap(s), largeArray(s))
//(time, state) = takeTime (st.copy(innocentField = 42)) //single run for each size
(time, state) = mean(st.copy(innocentField = 42))(takeTime) //mean time on multiple runs for each size
} yield (s, time)
//Creates the stream of 10^n with n = Naturals+{0}
def byTen(s: Stream[Int]): Stream[Int] = s.head #:: byTen(s map (_ * 10))
//append the execution time to the result
def takeTime[A](thunk: => A): (Double, A) = {
import System.{currentTimeMillis => millis, nanoTime => nanos}
val t0:Double = nanos
val res = thunk
val time = ((nanos - t0) / 1000)
(time, res)
}
//does a mean on multiple runs of the first element of the pair
def mean[A](thunk: => A)(fun: (=> A) => (Double, A)) = {
val population = 50
val mean = ((for (n <- 1 to population) yield fun(thunk)) map (_._1) ).sum / population
(mean, fun(thunk)._2)
}
//Build collections for the requested size
def largeMap(size: Int) = (for (i <- (1 to size)) yield (i, i)).toMap
def largeArray(size: Int) = Array.fill(size)(1)
}
On this machine:
CPU: 64bits dual-core-i5 3.10GHz
RAM: 8GB ram
OS: win7
Java: 1.7
Scala: 2.9.2
I have the following results, which look pretty regular to me.
(size, microseconds to copy)
(10,0.4347000000000001)
(100,0.4412600000000001)
(1000,0.3953200000000001)
(10000,0.42161999999999994)
(100000,0.4478600000000002)
(1000000,0.42816000000000015)
(100,0.4084399999999999)
(200,0.41494000000000014)
(300,0.42156000000000016)
(400,0.4281799999999999)
(500,0.42160000000000003)
(600,0.4347200000000001)
(700,0.43466000000000016)
(800,0.41498000000000007)
(900,0.40178000000000014)
(1000,0.44134000000000007)
(2000,0.42151999999999995)
(3000,0.42148)
(4000,0.40842)
(5000,0.38860000000000006)
(6000,0.4413600000000001)
(7000,0.4743200000000002)
(8000,0.44795999999999997)
(9000,0.45448000000000005)
(10000,0.45448)
(20000,0.4281600000000001)
(30000,0.46768)
(40000,0.4676200000000001)
Maybe you have different performance measurements in mind.
Or could it be that your profiled times are actually spent on generating the Map and the Array, instead of copying the case class?