I have been reading some posts and was wondering if someone can present a situation where a TrieMap would be preferable to a HashMap.
So essentially, what architectural decision should motivate the use of a TrieMap?
As per the documentation, it's a mutable collection that can be safely used in multithreaded applications:
A concurrent hash-trie or TrieMap is a concurrent thread-safe lock-free
implementation of a hash array mapped trie. It is used to implement the
concurrent map abstraction. It has particularly scalable concurrent insert
and remove operations and is memory-efficient. It supports O(1), atomic,
lock-free snapshots which are used to implement linearizable lock-free size,
iterator and clear operations. The cost of evaluating the (lazy) snapshot is
distributed across subsequent updates, thus making snapshot evaluation horizontally scalable.
For details, see: http://lampwww.epfl.ch/~prokopec/ctries-snapshot.pdf
It also has a really nice API for caching. Say, for example, you have to calculate factorials of different numbers and sometimes want to reuse the results:
import scala.collection.concurrent.TrieMap

object o {
  val factorialsCache = new TrieMap[Int, Int]()

  def factorial(num: Int): Int = ??? // really heavy operation

  def doWorkWithFactorial(num: Int) = {
    val factRes = factorialsCache.getOrElseUpdate(num, {
      // we do not want to invoke it very often:
      // this block is executed only if there is no record in the map for this key
      factorial(num)
    })
    // start doing some work with factRes
    factRes
  }
}
Pay attention: the function above uses global state (the cache) for write operations, but it's absolutely safe to use from concurrent threads. You won't lose any data.
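The snapshot support mentioned in the documentation is also worth a quick look. A minimal sketch (snapshot and readOnlySnapshot are standard TrieMap methods; the values here are just examples):

import scala.collection.concurrent.TrieMap

val cache = TrieMap("a" -> 1, "b" -> 2)

// O(1), atomic snapshot: a consistent view unaffected by later concurrent updates
val frozen = cache.readOnlySnapshot()

cache.put("c", 3)

frozen.get("c") // None: the snapshot still reflects the state at snapshot time
frozen.get("a") // Some(1)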
Update: this question is about how we can use amortized analysis for immutable collections. Scala's immutable queue is just an example; how this immutable queue is implemented is clearly visible from the sources. And, as was pointed out in the answers, Scala's sources do not mention amortized time for it at all. But guides and internet podcasts do. And as I saw, C# has a similar immutable queue with similar statements about its amortized time.
Amortized analysis was originally invented for mutable collections. How we can apply it to Scala's mutable Queue is clear. But how can we apply it to this sequence of operations, for example?
val q0 = collection.immutable.Queue[Int]()
val q1 = q0.enqueue(1)
val h1 = q1.head
val q2 = q1.enqueue(2)
val h2 = q2.head
val (d2, q3) = q2.dequeue
val (d1, q4) = q3.dequeue
We have different immutable queues q0-q4 in the sequence. May we consider them as one single queue or not? How can we use the O(1) enqueue operations to amortize both the heavy head and the first dequeue? What method of amortized analysis can we use for this sequence? I don't know. I cannot find anything in textbooks.
Final answer:
(Thanks to all who answered my question!)
In short: Yes, and no!
"No" because a data structure may be used as immutable but not persistent. The data structure is ephemeral if making a change we forget (or destroy) all old versions of the data structure. Mutable data structures is an example. Dequeuing of immutable queue with two strict lists can be called "amortized O(1)" in such ephemeral contexts. But full persistence with forking of the immutable data structure history is desirable for many algorithms. For such algorithms the expensive O(N) operations of the immutable queue with two strict lists are not amortized O(1) operations. So the guide authors should add an asterisk and print by 6pt font in footnote: * for specially selected sequences of operations.
In the answers I was given an excellent reference: Amortization, Lazy Evaluation, and Persistence: Lists with Catenation via Lazy Linking:
We first review the basic concepts of lazy evaluation, amortization, and persistence. We next discuss why the traditional approach to amortization breaks down in a persistent setting. Then we outline our approach to amortization, which can often avoid such problems through judicious use of lazy evaluation.
We can create a fully persistent immutable queue with amortized O(1) operations. It must be specially designed and use lazy evaluation. Without such a framework, with lazy evaluation of parts of the structure and memoization of results, we cannot apply amortized analysis to fully persistent immutable data structures. (It is also possible to create a double-ended queue with all operations having worst-case constant time and without lazy evaluation, but my question was about amortized analysis.)
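To give the flavor of such a design, here is a minimal, hedged sketch in the spirit of Okasaki's banker's queue, using Scala's LazyList so the expensive reversal is suspended and, once forced, memoized (a sketch only, not a library implementation):

final class BankersQueue[A] private (
    front: LazyList[A], fsize: Int, rear: List[A], rsize: Int) {

  def enqueue(a: A): BankersQueue[A] =
    BankersQueue.balance(front, fsize, a :: rear, rsize + 1)

  def dequeue: (A, BankersQueue[A]) = // assumes the queue is non-empty
    (front.head, BankersQueue.balance(front.tail, fsize - 1, rear, rsize))
}

object BankersQueue {
  def empty[A]: BankersQueue[A] = new BankersQueue(LazyList.empty, 0, Nil, 0)

  // Invariant rsize <= fsize: every suspended reversal is "paid for" by cheap
  // operations before it can be forced, which is what lets the amortized O(1)
  // bound survive persistent (forked) usage.
  private def balance[A](f: LazyList[A], fs: Int, r: List[A], rs: Int): BankersQueue[A] =
    if (rs <= fs) new BankersQueue(f, fs, r, rs)
    else new BankersQueue(f #::: r.reverse.to(LazyList), fs + rs, Nil, 0)
}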
Original question:
According to the original definition, amortized time complexity is the worst-case complexity averaged over an allowed sequence of operations: "we average the running time per operation over a (worst-case) sequence of operations" (https://courses.cs.duke.edu/fall11/cps234/reading/Tarjan85_AmortizedComplexity.pdf). See textbooks also ("Introduction to Algorithms" by Cormen et al., for example).
The Scala collections performance guide states that two collection.immutable.Queue methods (head and tail) have amortized O(1) time complexity: https://docs.scala-lang.org/overviews/collections-2.13/performance-characteristics.html This table does not mention the complexities of the enqueue and dequeue operations, but another, unofficial guide states O(1) time complexity for enqueue and amortized O(1) time complexity for dequeue: https://www.waitingforcode.com/scala-collections/collections-complexity-scala-immutable-collections/read
But what do these statements about amortized time complexity really mean? Intuitively, they allow us to make predictions for algorithms using the data structure, such as: any sequence of N amortized O(1) operations allowed by the data structure itself has no worse than O(N) complexity for the whole sequence. Unfortunately this intuitive meaning is clearly broken for immutable collections. For example, the following function has time complexity O(n^2) for 2n amortized O(1) (according to the guides) operations:
def quadraticInTime(n: Int) = {
  var q = collection.immutable.Queue[Int]()
  for (i <- 1 to n) q = q.enqueue(i)
  List.fill(n)(q.head)
}
val tryIt = quadraticInTime(100000)
The second parameter of the List.fill method is a by-name parameter and is evaluated n times in sequence. We can also use q.dequeue._1 instead of q.head, of course, with the same result.
We can also read in "Programming in Scala" by M. Odersky et al.: "Assuming that head, tail, and enqueue operations appear with about the same frequency, the amortized complexity is hence constant for each operation. So functional queues are asymptotically just as efficient as mutable ones." This contradicts the worst-case amortized complexity property from the textbooks and is wrong for the quadraticInTime method above.
But if a data structure has O(1) cloning, we can break the assumptions of amortized time analysis for it just by executing N worst-case "heavy" operations on N copies of the data structure in sequence. And generally speaking, any immutable collection has O(1) cloning.
Question: is there a good formal definition of amortized time complexity for operations on immutable structures? The definition clearly must further limit the allowed sequences of operations to be useful.
Chris Okasaki described how to solve this problem with lazy evaluation in Amortization, Lazy Evaluation, and Persistence: Lists with Catenation via Lazy Linking from 1995. The main idea is that you can guarantee that some action is done only once by hiding it in a thunk and letting the language runtime manage evaluating it exactly once.
The docs for Queue give tighter conditions for which the asymptotic complexity holds:
Adding items to the queue always has cost O(1). Removing items has cost O(1), except in the case where a pivot is required, in which case, a cost of O(n) is incurred, where n is the number of elements in the queue. When this happens, n remove operations with O(1) cost are guaranteed. Removing an item is on average O(1).
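For intuition, here is a hypothetical, simplified sketch of the classic two-list representation behind that description (not the actual library code):

final case class TwoListQueue[A](in: List[A], out: List[A]) {
  def enqueue(a: A): TwoListQueue[A] = copy(in = a :: in) // always O(1)

  // assumes the queue is non-empty
  def dequeue: (A, TwoListQueue[A]) = out match {
    case h :: t => (h, copy(out = t)) // O(1): just pop the out list
    case Nil => // pivot: reverse in, cost O(n)
      val r = in.reverse
      (r.head, TwoListQueue(Nil, r.tail))
  }
}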
Note that removing from an immutable queue implies that, when dequeueing, subsequent operations are performed on the returned Queue. Not doing this also means that it's not actually being used as a queue:
val q = Queue.empty[Int].enqueue(0).enqueue(1)
q.dequeue._1 // 0
q.dequeue._1 // 0 again: q itself never changed
In your code, you're not actually using the Queue as a queue. Addressing this:
def f(n: Int): Unit = {
  var q = Queue.empty[Int]
  (1 to n).foreach { i => q = q.enqueue(i) }
  List.fill(n) {
    val (head, nextQ) = q.dequeue
    q = nextQ
    head
  }
}
def time(block: => Unit): Unit = {
  val startNanos = System.nanoTime()
  block
  println(s"Block took ${System.nanoTime() - startNanos}ns")
}
scala> time(f(10000))
Block took 2483235ns
scala> time(f(20000))
Block took 5039420ns
Note also that if we do an approximately equal number of enqueue and head operations on the same scala.collection.immutable.Queue, the head operations are in fact constant time even without amortization:
import scala.util.Try

val n = 10000
val q = Queue.empty[Int]
(1 to n).foreach { i => q.enqueue(i) } // note: enqueue results are discarded, so q stays empty
List.fill(n)(Try(q.head))
I am new to Scala.
I am trying to figure out how to ensure thread safety with functions in a Scala object (a.k.a. a singleton).
From what I have read so far, it seems that I should keep visibility to function scope (or below) and use immutable variables wherever possible. However, I have not seen examples of where thread safety is violated, so I am not sure what other precautions should be taken.
Can someone point me to a good discussion of this issue, preferably with examples of where thread safety is violated?
Oh man. This is a huge topic. Here's a Scala-based intro to concurrency, and Oracle's Java lessons actually have a pretty good intro as well. What follows is a brief intro that motivates why concurrent reading and writing of shared state (of which Scala objects are a particular case) is a problem, and provides a quick overview of common solutions.
There are two (fundamentally related) classes of problems when it comes to thread safety and state mutation:
Clobbering (missing) writes
Inaccurate (changing out from under you) reads
Let's look at each of these in turn.
First, clobbering writes:
object WritesExample {
  var myList: List[Int] = List.empty
}
Imagine we had two threads concurrently accessing WritesExample, each of which executes the following updateList:
def updateList(x: WritesExample.type): Unit =
  WritesExample.myList = 1 :: WritesExample.myList
You'd probably hope that when both threads are done, WritesExample.myList has a length of 2. Unfortunately, that might not be the case if both threads read WritesExample.myList before the other thread has finished its write. If WritesExample.myList is empty when both threads read it, then both will write back a list of length 1, with one write overwriting the other, so that in the end WritesExample.myList has a length of only one. Hence we've effectively lost a write we were supposed to execute. Not good.
Now let's look at inaccurate reads.
object ReadsExample {
  val myMutableList = collection.mutable.MutableList.empty[Int]
}
Once again, let's say we had two threads concurrently accessing ReadsExample. This time each of them executes updateList2 repeatedly.
def updateList2(x: ReadsExample.type): Unit =
  ReadsExample.myMutableList += ReadsExample.myMutableList.length
In a single-threaded context, you would expect updateList2, when repeatedly called, to simply generate an ordered list of incrementing numbers, e.g. 0, 1, 2, 3, 4, .... Unfortunately, when multiple threads access ReadsExample.myMutableList with updateList2 at the same time, it's possible that between when ReadsExample.myMutableList.length is read and when the write is finally persisted, ReadsExample.myMutableList has already been modified by another thread. So in theory you could see something like 0, 0, 1, 1, or, if one thread takes longer to write than another, potentially 0, 1, 2, 1 (where the slower thread finally writes to the list after the other thread has already accessed and written to the list three times).
What happened is that the read was inaccurate/out-of-date; the actual data structure that was updated was different from the one that was read, i.e. was changed out from under you in the middle of things. This is also a huge source of bugs because many invariants you might expect to hold (e.g. every number in the list corresponds exactly to its index or every number appears only once) hold in a single-threaded context, but fail in a concurrent context.
Now that we've motivated some of the problems, let's dive into some of the solutions. You mentioned immutability so let's talk about that first. You might notice that in my example of clobbering writes I use an immutable data structure whereas in my inconsistent reads example I use a mutable data structure. That is intentional. They are in a sense dual to one another.
With immutable data structures you cannot have an "inaccurate" read in the sense I laid out above because you never mutate data structures, but rather place a new copy of a data structure in the same location. The data structure cannot change out from under you because it cannot change! However you can lose a write in the process by placing a version of a data structure back to its original location that does not incorporate a change made previously by another process.
With mutable data structures on the other hand, you cannot lose a write because all writes are in-place mutations of the data structure, but you can end up executing a write to a data structure whose state differs from when you analyzed it to formulate the write.
If it's a "pick your poison" kind of scenario, why do you often hear advice to go with immutable data structures to help with concurrency? Well immutable data structures make it easier to ensure invariants about the state being modified hold even if writes are lost. For example, if I rewrote the ReadsList example to use an immutable List (and a var instead), then I could confidently say that the integer elements of the list will always correspond to the indices of the list. This means that your program is much less likely to enter an inconsistent state (e.g. it's not hard to imagine that a naive mutable set implementation could end up with non-unique elements when mutated concurrently). And it turns out that modern techniques for dealing with concurrency usually are pretty good at dealing with missing writes.
Let's look at some of those approaches that deal with shared state concurrency. At their hearts they can all be summed up as various ways of serializing read/write pairs.
Locks (a.k.a. directly try to serialize read/write pairs): This is usually the one you'll hear first as a fundamental way of dealing with concurrency. Every process that wants to access state first places a lock on it. Any other process is now excluded from accessing that state. The process then writes to that state and on completion releases the lock. Other processes are now free to repeat this. In our WritesExample, updateList would acquire the lock before executing and release it afterwards; this would prevent other processes from reading WritesExample.myList until the write was completed, thereby preventing them from seeing old versions of myList that would lead to clobbering writes (note that there are more sophisticated locking schemes that allow for simultaneous reads, but let's stick with the basics for now).
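As an illustration, here is a minimal sketch of the lock-based fix for WritesExample, using the intrinsic monitor that every JVM object carries:

object WritesExample {
  private var myList: List[Int] = List.empty

  // synchronized serializes the read/write pair: a thread must hold the
  // object's monitor for the whole read-modify-write, so no write is lost
  def updateList(): Unit = this.synchronized {
    myList = 1 :: myList
  }
}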
Locks often do not scale well to multiple pieces of state. With multiple locks, often you need to acquire and release locks in a certain order otherwise you can end up deadlocking or livelocking.
The Oracle and Twitter docs linked at the beginning have good overviews of this approach.
Describe Your Action, Don't Execute It (a.k.a. build up a serial representation of your actions and have someone else process it): Instead of accessing and modifying state directly, you describe an action of how to do this and then give it to someone else to actually execute the action. For example, you might pass messages to an object (e.g. actors in Scala) that queues up these requests and then executes them one-by-one on some internal state that it never directly exposes to anyone else. In the particular case of actors, this improves the situation over locks by removing the need to explicitly acquire and release locks. As long as you encapsulate all the state you need to access at once in a single object, message passing works great. Actors break down when you distribute state across multiple objects (and as such this is heavily discouraged in this paradigm).
Akka actors are one good example of this in Scala.
Transactions (a.k.a. temporarily isolate some reads and writes from others and let the isolation system serialize things for you): Wrap all your reads/writes in transactions that ensure that during the course of your reads and writes your view of the world is isolated from any other changes. There are usually two ways of achieving this. Either you go for an approach similar to locks where you prevent other people from accessing the data while a transaction is running, or you restart a transaction from the very beginning whenever you detect that a change has occurred to the shared state and throw away any progress you've made (usually the latter, for performance reasons). On the one hand, transactions, unlike locks and actors, scale to disparate pieces of state very well. Just wrap all your accesses in transactions and you're good to go. On the other hand, your reads and writes have to be side-effect-free because they might be thrown away and retried many times, and you can't really undo most side effects.
And if you're really unlucky, although you usually can't truly deadlock with a good implementation of transactions, a long-lived transaction can constantly be interrupted by other short-lived transactions such that it keeps getting thrown away and retried and never actually succeeds (which amounts to something like livelocking). In effect you're giving up direct control of serialization order and hoping your transaction system orders things sensibly.
Scala's STM library is a good example of this approach.
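A minimal sketch of this style, assuming the scala-stm library's Ref and atomic API:

import scala.concurrent.stm._

object TransferExample {
  val from = Ref(100L)
  val to   = Ref(0L)

  // Either both updates commit or the transaction retries from scratch; no
  // other thread can observe the money "in flight" between the accounts.
  def transfer(amount: Long): Unit = atomic { implicit txn =>
    from() = from() - amount
    to()   = to() + amount
  }
}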
Remove Shared State: The final "solution" is to rethink the problem altogether and try to think about whether you truly need global, shared state that is writable. If you don't need writable shared state, then concurrency problems go away altogether!
Everything in life is about trade-offs and concurrency is no exception. When thinking about concurrency first understand what state you have and what invariants you want to preserve about that state. Then use that to guide your decision as to what kind of tools you want to use to tackle the problem.
The Thread Safety Problem section within this Scala concurrency article might be of interest to you. In essence, it illustrates the thread safety problem using a simple example and outlines 3 different approaches to tackle the problem, namely synchronization, volatile and AtomicReference:
When you enter synchronized points, access volatile references, or dereference AtomicReferences, Java forces the processor to flush their cache lines and provide a consistent view of data.
There is also a brief overview comparing the cost of the 3 approaches:
AtomicReference is the most costly of these two choices since you
have to go through method dispatch to access values. volatile and
synchronized are built on top of Java’s built-in monitors. Monitors
cost very little if there’s no contention. Since synchronized allows
you more fine-grained control over when you synchronize, there will be
less contention so synchronized tends to be the cheapest option.
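For illustration, a hedged sketch of the AtomicReference approach applied to the list-prepend example from earlier in this thread (a classic compare-and-set retry loop):

import java.util.concurrent.atomic.AtomicReference

object AtomicWritesExample {
  private val myList = new AtomicReference[List[Int]](List.empty)

  // Read a snapshot, compute the new list, and install it only if no other
  // thread changed the reference in the meantime; otherwise retry.
  @annotation.tailrec
  def updateList(): Unit = {
    val current = myList.get()
    if (!myList.compareAndSet(current, 1 :: current)) updateList()
  }
}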
This is not specific to Scala: if your object contains state that can be modified concurrently, thread safety can be violated depending on the implementation. For example:
object BankAccount {
  private var balance: Long = 0L

  def deposit(amount: Long): Unit = balance += amount
}
In this case the object is not thread safe. There are a lot of approaches to make it thread safe, for example using Akka, or synchronized blocks. For simplicity I will write it using a synchronized block:
object BankAccount {
  private var balance: Long = 0L

  def deposit(amount: Long): Unit =
    this.synchronized {
      balance += amount
    }
}
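Since the state here is a single primitive counter, another option worth sketching is java.util.concurrent.atomic, which avoids the lock entirely:

import java.util.concurrent.atomic.AtomicLong

object BankAccount {
  private val balance = new AtomicLong(0L)

  // addAndGet performs the read-modify-write atomically, without a lock
  def deposit(amount: Long): Unit = balance.addAndGet(amount)
}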
I have an object Person; this object goes through millions of unique transformations, and these transformations have to be done in order, sequentially.
Traditionally, I can do this:
var bob = new Person()
birth(bob)
oneYearOld(bob)
middleSchool(bob)
highSchool(bob)
// ...million more transformations
So my questions are: how can I do this functionally (no side effects/pure functions) in Scala? And if I use val bob, will I be creating millions of copies of bob, which may lead to memory issues?
You can use function composition
i(h(g(f(bob))))
One way to do this would be to drop all of your transformations into a sequence (or Buffer or List or ...) and then apply them:
val finalBob = functionsSeq.foldLeft(bob) { case (xformBob, fn) => fn(xformBob) }
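Here is a self-contained sketch of that approach, with a hypothetical immutable Person and a few stand-in transformations (names invented for illustration):

case class Person(age: Int = 0, stage: String = "unborn")

val birth: Person => Person        = _.copy(stage = "born")
val oneYearOld: Person => Person   = _.copy(age = 1)
val middleSchool: Person => Person = _.copy(age = 12, stage = "middle school")

val transformations: Seq[Person => Person] = Seq(birth, oneYearOld, middleSchool)

// Thread bob through every transformation in order; this is equivalent to
// Function.chain(transformations)(Person())
val finalBob = transformations.foldLeft(Person())((bob, f) => f(bob))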
Function composition is for clarity of expression and elegance. Each of the transformations will involve generating a new instance off of the (immutable) existing one: that is a primary objective of pure functional programming.
The FP approach does not save you on memory consumption / object generation. But, as commenters noted, these objects are on the heap, short-lived, and easily reclaimed by the garbage collector.
I just finished Martin Odersky's scala class at Coursera. Scala being my first FP language, I was excited by the idea of limiting mutable state. This allows for much easier concurrency and also makes the code super maintainable.
While learning all this, I realized you could guarantee the immutability of an object as long as it had no mutable variables and only referenced immutable objects. So now I can do everything by creating a new state instead of modifying an old one, using tail recursion when possible.
Great. But I can only take this so far up the chain. At some point, my application needs to be able to modify some existing state. I know where to put in concurrency control at that point: locks, blah blah. I'm still defaulting to the standard multi-threaded concurrency control I've always used.
Oh Scala community, is there a better way? Monads maybe?
EDIT: this question is a bit general, so I wanted to give a use case:
I have a machine learning algorithm that stores several collections of data. They have functions that return updated representations of the data (training, etc.), all immutable. Ultimately I can keep this return-updated-state pattern going up the chain to the actual object running the simulation. This object has a state that is mutable and holds references to the collections. I may want to distribute it across multiple cores or multiple systems.
This is a bit of a subjective question, so I won't attempt to answer the 'which is best' part of it. If your chief concern is state in the context of multithreaded concurrency, then one option may be Software Transactional Memory.
There is an implementation (see the quickstart) of STM as provided by Akka. Depending on your use case, it might be heavyweight or overkill, but then again, it might be preferable to a mess of locks. Unlike locks, STM tends to be optimistic, in the same way as database transactions are. As with database transactions, you make changes to shared state explicitly in a transactional context, and the changes you describe will be committed atomically or re-attempted if a conflict is detected. Basically you have to wrap all your state in Refs, which can be manipulated only in an 'atomic' block - implemented as a method that takes a closure within which you manipulate your Refs, and ScalaSTM ensures that the whole set of operations on your state either succeeds or fails - there will be no half-way or inconsistent changes.
This leverages Scala's implicit parameters - all operations on Refs require a transaction object as an argument, and this is received by the closure given to atomic and can be declared implicit, so all the code within atomic can be written in a very natural yet safe style.
The catch is that, for this to be useful, you do need to use the transactional data structures provided; so that will mean using TSet instead of Set, TMap instead of Map. These provide all-or-nothing update semantics when used in the transactional context (within an atomic block). They are very much like Clojure's persistent collections. You can also build your own transactional data structures out of Refs for use within these atomic blocks.
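For a flavor of what that looks like, a brief sketch using scala-stm's TMap (hedged: based on the ScalaSTM quickstart API; the recordWin function is invented for illustration):

import scala.concurrent.stm._

val scores = TMap.empty[String, Int]

// Both updates commit together or not at all; a conflicting concurrent
// transaction causes a retry rather than a torn read/write.
def recordWin(winner: String, loser: String): Unit = atomic { implicit txn =>
  scores.put(winner, scores.get(winner).getOrElse(0) + 1)
  scores.put(loser, scores.get(loser).getOrElse(0) - 1)
}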
If you are not averse to parentheses, the Clojure explanation of refs is really good: http://clojure.org/refs
Depending on your use case you might be able to stick with deeply immutable object structures which you partially copy instead of actually mutating them (similar to an "updated" immutable list that shares a suffix with its original list). So-called lenses are a nice way of dealing with such structures, read about them in this SO question or in this blog post.
Sticking with immutable structures of course only works if you don't want changes to be globally observable. An example where immutable structures are most likely not an option is two concurrent clients working on a shared list, where the modifications done by client A must be observable by client B, and vice versa.
I suggest the best way is to store the mutable state inside an Akka actor, and use message passing in and out of the actor to read and update it. Use immutable data structures.
I have a StorageActor as follows. The variable entityMap gets updated every time something is stored via the StoreEntity message. Also, it doesn't need to be volatile and still works.
The Akka actor is the place where things can change; messages are passed in and out of it, into the pure functional world.
import akka.actor.Actor
import java.util.UUID
import com.orsa.minutesheet.entity.Entity

case class EntityRef(entity: Option[Entity])
case class FindEntity(uuid: UUID)
case class StoreEntity[T >: Entity](uuid: UUID, entity: Option[T])

class StorageActor extends Actor {
  private var entityMap = Map[UUID, Entity]()

  private def findEntityByUUID(uuid: UUID): Option[Entity] = entityMap.get(uuid)

  def receive = {
    case FindEntity(uuid) => sender ! EntityRef(findEntityByUUID(uuid))
    case StoreEntity(uuid, entity) =>
      entity match {
        case Some(store) => entityMap += uuid -> store.asInstanceOf[Entity]
        case None        => entityMap -= uuid
      }
  }
}
I want to use parallel arrays for a task, and before I start with the coding, I'd be interested in knowing if this small snippet is thread-safe:
import collection.mutable._

val listBuffer = ListBuffer[String]("one", "two", "three", "four", "five", "six", "seven", "eight", "nine")
val jSyncList = java.util.Collections.synchronizedList(new java.util.ArrayList[String]())

listBuffer.par.foreach { e =>
  println("processed :" + e)
  // using sleep here to simulate a random delay
  Thread.sleep((scala.math.random * 1000).toLong)
  jSyncList.add(e)
}

jSyncList.toArray.foreach(println)
Are there better ways of processing something with parallel collections and accumulating the results elsewhere?
The code you posted is perfectly safe; I'm not sure about the premise though: why do you need to accumulate the results of a parallel collection in a non-parallel one? One of the whole points of the parallel collections is that they look like other collections.
I think that parallel collections also provide a seq method to switch to sequential ones. So you should probably use that!
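For instance, a minimal sketch of that approach using the poster's data:

// process in parallel, then switch back to a sequential collection with .seq
val results = listBuffer.par.map(e => "processed :" + e).seq
results.foreach(println) // map preserves element order, unlike the foreach above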
For this pattern to be safe:
listBuffer.par.foreach { e => f(e) }
f has to be able to run concurrently in a safe way. I think the same rules that you need for safe multi-threading apply (access to shared state needs to be thread-safe, the order of the f calls for different e won't be deterministic, and you may run into deadlocks as you start synchronizing your statements in f).
Additionally, I'm not clear on what guarantees the parallel collections give you about the underlying collection being modified while being processed, so a mutable list buffer, which can have elements added/removed, is possibly a poor choice. You never know when the next coder will call something like foo(listBuffer) before your foreach and pass that reference to another thread which may mutate the list while it's being processed.
Other than that, I think that for any f that takes a long time, can be called concurrently, and where e can be processed out of order, this is a fine pattern:
immutCol.par.foreach { e => threadSafeOutOfOrderProcessingOf(e) }
Disclaimer: I have not tried parallel collections myself, but I'm looking forward to SO questions/answers showing us what works well.
The synchronizedList should be safe, though the printlns may give unexpected results - you have no guarantees of the order in which items will be printed, or even that your printlns won't be interleaved mid-character.
A synchronized list is also unlikely to be the fastest way you can do this. A safer solution is to map over an immutable collection (Vector is probably your best bet here), then print all the lines (in order) afterwards:
val input = Vector("one", "two", "three", "four", "five", "six", "seven", "eight", "nine")

val output = input.par.map { e =>
  val msg = "processed :" + e
  // using sleep here to simulate a random delay
  Thread.sleep((math.random * 1000).toLong)
  msg
}

println(output mkString "\n")
You'll also note that this code has about as much practical usefulness as your example :)
This code is plain weird -- why add stuff in parallel to something that needs to be synchronized? You'll add contention and gain absolutely nothing in return.
The principle of the thing - accumulating results from parallel processing - is better achieved with stuff like fold, reduce or aggregate.
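For example, a small sketch; note that the operation passed to fold must be associative so that chunks computed on different threads can be combined in any grouping:

// sum over a parallel range: no shared mutable state, no synchronization
val sum = (1 to 1000000).par.fold(0)(_ + _)

// aggregate allows a per-chunk accumulator of a different type,
// combined at the end with the second function
val totalLength = Vector("one", "two", "three").par.aggregate(0)(_ + _.length, _ + _)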
The code you've posted is safe - there will be no errors due to inconsistent state of your array list, because access to it is synchronized.
However, parallel collections process items concurrently (at the same time) AND out of order. Out of order means that the 54th element may be processed before the 2nd element - so your synchronized array list will contain items in no predefined order.
In general it's better to use map, filter and other functional combinators to transform a collection into another collection - these will ensure that the ordering guarantees are preserved if a collection has some (like Seqs do). For example:
ParArray(1, 2, 3, 4).map(_ + 1)
always returns ParArray(2, 3, 4, 5).
However, if you need a specific thread-safe collection type such as a ConcurrentSkipListMap or a synchronized collection to be passed to some method in some API, modifying it from a parallel foreach is safe.
Finally, a note - parallel collections provide parallel bulk operations on data. Mutable parallel collections are not thread-safe in the sense that you can add elements to them from different threads. Mutable operations like insertion into a map or appending to a buffer still have to be synchronized.