Updating Scala collections thread-safely - scala

i was wondering if there is any 'easy' way to update immutable scala collections safely. Consider following code:
class a {
private var x = Map[Int,Int]()
def update(p:(Int,Int)) { x = x + (p) }
}
This code is not thread safe, correct? By that i mean that if we have two threads invoking update method and lets say that x is map containing { 1=>2 } and thread A invokes update((3,4)) and only manages to execute the x + (p) part of the code. Then rescheduling occurs and thread B invokes update((13,37)) and successfully updates the variable x. The thread A continues and finishes.
After all this finishes, value x would equal map containing { 1=>2, 3=>4 }, correct? Instead of desired { 1=>2, 3=>4, 13=>37 }. Is there a simple way to fix that? I hope it's undestandable what I'm asking :)
Btw, i know there are solutions like Akka STM but i would prefer not to use those, unless necessary.
Thanks a lot for any answer!
edit: Also, i would prefer solution without locking. Eeeew :)

In your case, as MaurĂ­cio wrote, your collection is already thread safe because it is immutable. The only problem is reassigning the var, which may not be an atomic operation. For this particular problem, the easiest option is to use of the nice classes in java.util.concurrent.atomic, namely AtomicReference.
import java.util.concurrent.atomic.AtomicReference
class a {
private val x = new AtomicReference(Map[Int,Int]())
def update(p:(Int,Int)) {
while (true) {
val oldMap = x.get // get old value
val newMap = oldMap + p // update
if (x.compareAndSet(oldMap, newMap))
return // exit if update was successful, else repeat
}
}
}

The collection itself is thread safe as it has no shared mutable state, but your code is not and there is no way to fix this without locking, as you do have shared mutable state. Your best option is to lock the method itself marking it as synchronized.
The other solution would be use a mutable concurrent map, possibly java.util.concurrent.ConcurrentMap.

out of the box
atomicity
lock-free
O(1)
Check this out: http://www.scala-lang.org/api/2.11.4/index.html#scala.collection.concurrent.TrieMap

Re. Jean-Philippe Pellet's answer: you can make this a little bit more re-usable:
def compareAndSetSync[T](ref: AtomicReference[T])(logic: (T => T)) {
while(true) {
val snapshot = ref.get
val update = logic(snapshot)
if (ref.compareAndSet(snapshot, update)) return
}
}
def compareSync[T,V](ref: AtomicReference[T])(logic: (T => V)): V = {
var continue = true
var snapshot = ref.get
var result = logic(snapshot)
while (snapshot != ref.get) {
snapshot = ref.get
result = logic(snapshot)
}
result
}

Related

Need to know if assigning variable in Scala future iteration is correct

Given the following Scala code:
object TestFutures6 extends App {
val f0 = Future { 0 }
val f1 = Future { 1 }
val f2 = Future { 2 }
val fx = Seq(f0,f1,f2)
var x: String = _
println("x="+x)
fx.map {
seq => seq.map { i =>
println("i="+i)
if (x == null) {
x = "abc"
println("x="+x)
}
}
}
Thread.sleep(5000)
}
I need to know if assigning "abc" to variable x is subject to a run condition, as the iterations in seq.map may run in parallel. When I run the code I only get "abc" printed once, but I'm wondering if there may be a problem.
Also, I need to run this code in Play with the following context
import play.api.libs.concurrent.Execution.Implicits.defaultContext
There is technically a potential race condition here, though it may never actually occur.
Essentially you are relying on the fact that these two lines will never be "interleaved" among different threads:
if (x == null) {
x = "abc"
If two threads execute the null check before either executes setting x to abc, then they will both enter that block.
That might not happen in this toy example since your futures are quite simple and just complete immediately, but this is very bad practice and should not be in any production code.
That's certainly not thread safe, and in general the behavior would be non-deterministic. You may not see an issue when you run it, but we don't know what execution context you're using.
In general threads and statefulness don't go together. If you must have mutable state use a library/technique designed for statefulness (e.g. actors).

Scala: what is the interest in using Iterators?

I have used Iterators after have worked with Regexes in Scala but I don't really understand the interest.
I know that it has a state and if I call the next() method on it, it will output a different result every time, but I don't see anything I can do with it and that is not possible with an Iterable.
And it doesn't seem to work as Akka Streams (for example) since the following example directly prints all the numbers (without waiting one second as I would expect it):
lazy val a = Iterator({Thread.sleep(1000); 1}, {Thread.sleep(1000); 2}, {Thread.sleep(1000); 3})
while(a.hasNext){ println(a.next()) }
So what is the purpose of using Iterators?
Perhaps, the most useful property of iterators is that they are lazy.
Consider something like this:
(1 to 10000)
.map { x => x * x }
.map { _.toString }
.find { _ == "4" }
This snippet will square 10000 numbers, then generate 10000 strings, and then return the second one.
This on the other hand:
(1 to 10000)
.iterator
.map { x => x * x }
.map { _.toString }
.find { _ == "4" }
... only computes two squares, and generates two strings.
Iterators are also often useful when you need to wrap around some poorly designed (java?) objects in order to be able to handle them in functional style:
val rs: ResultSet = jdbcQuery.executeQuery()
new Iterator {
def next = rs
def hasNext = rs.next
}.map { rs =>
fetchData(rs)
}
Streams are similar to iterators - they are also lazy, and also useful for wrapping:
Stream.continually(rs).takeWhile { _.next }.map(fetchData)
The main difference though is that streams remember the data that gets materialized, so that you can traverse them more than once. This is convenient, but may be costly if the original amount of data is very large, especially, if it gets filtered down to much smaller size:
Source
.fromFile("huge_file.txt")
.getLines
.filter(_ == "")
.toList
This only uses, roughly (ignoring buffering, object overhead, and other implementation specific details), the amount of memory, necessary to keep one line in memory, plus however many empty lines there are in the file.
This on the other hand:
val reader = new FileReader("huge_file.txt")
Stream
.continually(reader.readLine)
.takeWhile(_ != null)
.filter(_ == "")
.toList
... will end up with the entire content of the huge_file.txt in memory.
Finally, if I understand the intent of your example correctly, here is how you could do it with iterators:
val iterator = Seq(1,2,3).iterator.map { n => Thread.sleep(1000); n }
iterator.foreach(println)
// Or while(iterator.hasNext) { println(iterator.next) } as you had it.
There is a good explanation of what iterator is http://www.scala-lang.org/docu/files/collections-api/collections_43.html
An iterator is not a collection, but rather a way to access the
elements of a collection one by one. The two basic operations on an
iterator it are next and hasNext. A call to it.next() will return the
next element of the iterator and advance the state of the iterator.
Calling next again on the same iterator will then yield the element
one beyond the one returned previously. If there are no more elements
to return, a call to next will throw a NoSuchElementException.
First of all you should understand what is wrong with your example:
lazy val a = Iterator({Thread.sleep(1); 1}, {Thread.sleep(1); 2},
{Thread.sleep(2); 3}) while(a.hasNext){ println(a.next()) }
if you look at the apply method of Iterator, you'll see there are no calls by name,so all Thread.sleep are calling at the same time when apply method calls. Also Thread.sleep takes parameter of time to sleep in milliseconds, so if you want to sleep your thread on one second you should pass Thread.sleep(1000).
The companion object has additional methods which allow you do the next:
val a = Iterator.iterate(1)(x => {Thread.sleep(1000); x+1})
Iterator is very useful when you need to work with large data. Also you can implement your own:
val it = new Iterator[Int] {
var i = -1
def hasNext = true
def next(): Int = { i += 1; i }
}
I don't see anything I can do with it and that is not possible with an Iterable
In fact, what most collection can do can also be done with Array, but we don't do that because it's much less convenient
So same reason apply to iterator, if you want to model a mutable state, then iterator makes more sense.
For example, Random is implemented in a way resemble to iterator because it's use case fit more naturally in iterator, rather than iterable.

Does scala have a lazy evaluating wrapper?

I want to return a wrapper/holder for a result that I want to compute only once and only if the result is actually used. Something like:
def getAnswer(question: Question): Lazy[Answer] = ???
println(getAnswer(q).value)
This should be pretty easy to implement using lazy val:
class Lazy[T](f: () => T) {
private lazy val _result = Try(f())
def value: T = _result.get
}
But I'm wondering if there's already something like this baked into the standard API.
A quick search pointed at Streams and DelayedLazyVal but neither is quite what I'm looking for.
Streams do memoize the stream elements, but it seems like the first element is computed at construction:
def compute(): Int = { println("computing"); 1 }
val s1 = compute() #:: Stream.empty
// computing is printed here, before doing s1.take(1)
In a similar vein, DelayedLazyVal starts computing upon construction, even requires an execution context:
val dlv = new DelayedLazyVal(() => 1, { println("started") })
// immediately prints out "started"
There's scalaz.Need which I think you'd be able to use for this.

Scala actor kills itself inconsistently

I am a newbie to scala and I am writing scala code to implement pastry protocol. The protocol itself does not matter. There are nodes and each node has a routing table which I want to populate.
Here is the part of the code:
def act () {
def getMatchingNode (initialMatch :String) : Int = {
val len = initialMatch.length
for (i <- 0 to noOfNodes-1) {
var flag : Int = 1
for (j <- 0 to len-1) {
if (list(i).key.charAt(j) == initialMatch(j)) {
continue
}
else {
flag = 0
}
}
if (flag == 1) {
return i
}
}
return -1
}
// iterate over rows
for (ii <- 0 to rows - 1) {
for (jj <- 0 to 15) {
var initialMatch = ""
for (k <- 0 to ii-1) {
initialMatch = initialMatch + key.charAt(k)
}
initialMatch += jj
println("initialMatch",initialMatch)
if (getMatchingNode(initialMatch) != -1) {
Routing(0)(jj) = list(getMatchingNode(initialMatch)).key
}
else {
Routing(0)(jj) = "NULL"
}
}
}
}// act
The problem is when the function call to getMatchingNode takes place then the actor dies suddenly by itself. 'list' is the list of all nodes. (list of node objects)
Also this behaviour is not consistent. The call to getMatchingNode should take place 15 times for each actor (for 10 nodes).
But while debugging the actor kills itself in the getMatchingNode function call after one call or sometimes after 3-4 calls.
The scala library code which gets executed is this :
def run() {
try {
beginExecution()
try {
if (fun eq null)
handler(msg)
else
fun()
} catch {
case _: KillActorControl =>
// do nothing
case e: Exception if reactor.exceptionHandler.isDefinedAt(e) =>
reactor.exceptionHandler(e)
}
reactor.kill()
}
Eclipse shows that this code has been called from the for loop in the getMatchingNode function
def getMatchingNode (initialMatch :String) : Int = {
val len = initialMatch.length
for (i <- 0 to noOfNodes-1)
The strange thing is that sometimes the loop behaves normally and sometimes it goes to the scala code which kills the actor.
Any inputs what wrong with the code??
Any help would be appreciated.
Got the error..
The 'continue' clause in the for loop caused the trouble.
I thought we could use continue in Scala as we do in C++/Java but it does not seem so.
Removing the continue solved the issue.
From the book: "Programming in Scala 2ed" by M.Odersky
You may have noticed that there has been no mention of break or continue.
Scala leaves out these commands because they do not mesh well with function
literals, a feature described in the next chapter. It is clear what continue
means inside a while loop, but what would it mean inside a function literal?
While Scala supports both imperative and functional styles of programming,
in this case it leans slightly towards functional programming in exchange
for simplifying the language. Do not worry, though. There are many ways to
program without break and continue, and if you take advantage of function
literals, those alternatives can often be shorter than the original code.
I really suggest reading the book if you want to learn scala
Your code is based on tons of nested for loops, which can be more often than not be rewritten using the Higher Order Functions available on the most appropriate Collection.
You can rewrite you function like the following [I'm trying to make it approachable for newcomers]:
//works if "list" contains "nodes" with an attribute "node.key: String"
def getMatchingNode (initialMatch :String) : Int = {
//a new list with the corresponding keys
val nodeKeys = list.map(node => node.key)
//zips each key (creates a pair) with the corresponding index in the list and then find a possible match
val matchOption: Option[(String, Int)] = (nodeKeys.zipWithIndex) find {case (key, index) => key == initialMatch}
//we convert an eventual result contained in the Option, with the right projection of the pair (which contains the index)
val idxOption = matchOption map {case (key, index) => index} //now we have an Option[Int] with a possible index
//returns the content of option if it's full (Some) or a default value of "-1" if there was no match (None). See Option[T] for more details
idxOption.getOrElse(-1)
}
The potential to easily transform or operate on the Collection's elements is what makes continues, and for loops in general, less used in Scala
You can convert the row iteration in a similar way, but I would suggest that if you need to work a lot with the collection's indexes, you want to use an IndexedSeq or one of its implementations, like ArrayBuffer.

Scala :: PriorityQueue => update priority of specific item

I have implemented routine to extract item from head, update it's priority and put it back to the queue with non-blocking approach like this (using AtomicReference)
def head(): Entry = {
def takeAndUpdate(e: Entry, success: Boolean): Entry = {
if (success) {
return e
}
val oldQueue = queueReference.get()
val newQueue = oldQueue.clone()
val item = newQueue.dequeue().increase()
newQueue += item
takeAndUpdate(item.e, queueReference.compareAndSet(oldQueue, newQueue))
}
takeAndUpdate(null, false)
}
Now I need to find out arbitrary Entry in the queue, change it's priority and put it back to the queue. It seems that PriorityQueue doesn't support this, so which class I should use to accomplish desired behavior?
It's related to Change priority of items in a priority queue
Use an immutable tree map (immutable.TreeMap) to accomplish this. You have to find the entry you want somehow - best way to do this is to associate the part of the information (Key) you use to find the entry with a key, and the actual entry that you wish to return with the value in the map (call this Entry).
Rename queueReference with treeReference. In the part of the code where you create the collection, use immutable.TreeMap[Key, Entry](<list of elements>).
Then modify the code like this:
def updateKey(k: Key) {
#annotation.tailrec def update(): (Key, Entry) = {
val oldTree = treeReference.get()
val entry = oldTree(k)
val newPair = modifyHoweverYouWish(k, entry)
val newTree = (oldTree - k) + newPair
if (treeReference.compareAndSet(oldTree, newTree)) newPair
else update()
}
update()
}
If the compareAndSet fails, you have to repeat as you already do in your source. Best to use #tailrec as shown above, to ensure that the function is tail recursive and prevent a potential stack overflow.
Alternatively, you could use an immutable.TreeSet - if you know the exact reference to your Entry object, you can use it to remove the element via - as above and then add it back after calling increase(). The code is almost the same.