How is instruction interleaving done? - operating-system

The example of a race condition given in Operating System Concepts is:
count++ could be implemented as
register1 = count
register1 = register1 + 1
count = register1
count-- could be implemented as
register2 = count
register2 = register2 - 1
count = register2
Consider this execution interleaving:
s0: producer executes register1 = count
s1: producer executes register1 = register1 + 1
s2: consumer executes register2 = count
s3: consumer executes register2 = register2 - 1
s4: producer executes count = register1
s5: consumer executes count = register2
How is the interleaving of instructions decided? Is it random, or is some algorithm used? And who decides it?

In this case it likely refers to the way the two scheduled entities are given control of the processor, so the scheduler decides.

You can think of it as being random. The example is an extremely simplified explanation used just to illustrate the concept; there is really much more going on than that.
Have a look at this answer: Usage of registers by the compiler in multithreaded program
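If you want to see the effect yourself, here is a minimal sketch (my own, not from the book) of the same lost-update race in Scala: two threads repeatedly run the unsynchronized read-modify-write sequences on a shared counter, and the final value depends entirely on how their instructions interleave.
object RaceDemo extends App {
  var count = 0
  // Each thread performs many unsynchronized read-modify-write updates.
  val producer = new Thread(() => for (_ <- 1 to 100000) count += 1)
  val consumer = new Thread(() => for (_ <- 1 to 100000) count -= 1)
  producer.start(); consumer.start()
  producer.join(); consumer.join()
  // With no synchronization, updates get lost: the result is usually not 0
  // and differs from run to run.
  println(count)
}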

Related

Kafka - Consume until empty

I have a use case where it is paramount not to continue until all consumer records in a KafkaConsumer have been fetched. In this use case there will be nothing going into the pipeline. What is the proper way to make absolutely sure that there is nothing at all left to fetch?
Kafka is designed to handle infinite streams of data, so "consume all" really just means that nobody has sent any data for some period of time (1 minute, 1 hour, etc.) - that period is up to you.
You can use something like this (sketch; consumer is an already-configured KafkaConsumer):
int emptyCount = 0;
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
    if (records.isEmpty()) {
        emptyCount++;
        if (emptyCount >= 100) {
            break;  // ~50 seconds with nothing to fetch: treat the topic as drained
        }
        continue;
    }
    emptyCount = 0;
    // ...process records...
}
You can tune the poll timeout and the number of empty cycles to reach the necessary wait period.
If you are using kafka-console-consumer, you can specify the timeout-ms argument to define how long it will wait before it considers that no more messages are coming.
--timeout-ms <Integer: timeout_ms>   If specified, exit if no message is
                                     available for consumption for the
                                     specified interval.
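For example (broker address and topic name are placeholders):
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic my-topic --from-beginning --timeout-ms 60000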

How are values transferred across threads in Scala futures

How is num accessed by the new thread? The Future will execute on a new thread, so how is the value of num, which is in the stack frame of the main thread, accessible in the new thread? Where is it stored?
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object TestFutureActor extends App {
  var num = 10
  val addNum = Future {
    num = num + 2
    num
  }
}
This is really a question about the JVM memory model. Your variable's value can be read from different places: it may be in a processor cache or in main memory, for example. Parallel reads and writes are unsynchronized in general. For example, the outcome below is perfectly legal on the JVM:
var a, b = 0
Thread 1 | Thread 2
a = 1    | val c = b  // c = 2
b = 2    | val d = a  // d = 0
If you need to synchronize actions on a shared variable, you need a happens-before relationship between them.
It's a very deep question; I think you should read some articles about the Java memory model for a deeper understanding:
http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html
https://shipilev.net/blog/2014/jmm-pragmatics/
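As a small illustration of the point above (my own sketch, not from the question), one way to make the Future's write reliably visible to the main thread is to use an AtomicInteger and wait on the Future, which gives you the happens-before edge you need:
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

object SafeShare extends App {
  val num = new AtomicInteger(10)
  // The write happens on a thread-pool thread, not on the main thread's stack.
  val addNum = Future { num.addAndGet(2) }
  // Await.result orders the Future's completion before this read, so 12 is printed.
  println(Await.result(addNum, 1.second))
}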

Unexpected spark caching behavior

I've got a Spark program that essentially does this:
def foo(a: RDD[...], b: RDD[...]) = {
  val c = a.map(...)
  c.persist(StorageLevel.MEMORY_ONLY_SER)
  var current = b
  for (_ <- 1 to 10) {
    val next = some_other_rdd_ops(c, current)
    next.persist(StorageLevel.MEMORY_ONLY)
    current.unpersist()
    current = next
  }
  current.saveAsTextFile(...)
}
The strange behavior I'm seeing is that the Spark stages corresponding to val c = a.map(...) happen 10 times. I would have expected that to happen only once because of the immediate caching on the next line, but that's not the case. When I look in the "Storage" tab of the running job, very few of the partitions of c are cached.
Also, 10 copies of that stage immediately show as "active". 10 copies of the stage corresponding to val next = some_other_rdd_ops(c, current) show up as pending, and they roughly alternate execution.
Am I misunderstanding how to get Spark to cache RDDs?
Edit: here is a gist containing a program to reproduce this: https://gist.github.com/jfkelley/f407c7750a086cdb059c. It expects as input the edge list of a graph (with edge weights). For example:
a b 1000.0
a c 1000.0
b c 1000.0
d e 1000.0
d f 1000.0
e f 1000.0
g h 1000.0
h i 1000.0
g i 1000.0
d g 400.0
Lines 31-42 of the gist correspond to the simplified version above. I get 10 stages corresponding to line 31 when I would only expect 1.
The problem here is that calling cache is lazy. Nothing will be cached until an action is triggered and the RDD is evaluated. All the call does is set a flag in the RDD to indicate that it should be cached when evaluated.
Unpersist, however, takes effect immediately. It clears the flag indicating that the RDD should be cached and also begins a purge of data from the cache. Since you only have a single action at the end of your application, this means that by the time any of the RDDs are evaluated, Spark does not see that any of them should be persisted!
I agree that this is surprising behaviour. The way that some Spark libraries (including the PageRank implementation in GraphX) work around this is by explicitly materializing each RDD between the calls to cache and unpersist. For example, in your case you could do the following:
def foo(a: RDD[...], b: RDD[...]) = {
  val c = a.map(...)
  c.persist(StorageLevel.MEMORY_ONLY_SER)
  var current = b
  for (_ <- 1 to 10) {
    val next = some_other_rdd_ops(c, current)
    next.persist(StorageLevel.MEMORY_ONLY)
    next.foreachPartition(x => {}) // materialize before unpersisting
    current.unpersist()
    current = next
  }
  current.saveAsTextFile(...)
}
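To see the lazy behaviour in isolation, here is a minimal sketch (the RDD a and expensiveTransform are placeholders): persist only records the intent, and the data is materialized by the first action.
import org.apache.spark.storage.StorageLevel

val c = a.map(expensiveTransform)         // placeholder RDD and function
c.persist(StorageLevel.MEMORY_ONLY_SER)   // only sets the flag; nothing is computed yet
c.count()                                 // first action: the map runs and the partitions are cached
c.count()                                 // second action: served from the cache, the map is not re-run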
Caching doesn't reduce stages, it just won't recompute the stage every time.
In the first iteration, in the stage's "Input Size" you can see that the data is coming from Hadoop, and that it reads shuffle input. In subsequent iterations, the data is coming from memory and no more shuffle input. Also, execution time is vastly reduced.
New map stages are created whenever shuffles have to be written, for example when there's a change in partitioning, in your case adding a key to the RDD.

GraphX: Wrong output without cache()

I'm doing the following:
var count = 1 // start non-zero so the loop body runs at least once
while (count > 0) {
  val messages = graph.vertices.flatMap {
    // Create messages for other nodes
  }
  // Cache, which is critical for the correct execution
  messages.cache()
  count = messages.count()
  val msgType1 = messages.filter(...)
  val msgType2 = messages.filter(...)
  println(count)
  // Should be exactly messages.count()
  println(msgType1.count() + " " + msgType2.count())
  println("---")
}
If I'm executing it exactly like this then the output is:
8
6 2
---
11
3 8
---
0
0 0
---
which add up exactly to the message count.
If I remove the messages.cache() after the flatMap operation, then the filtering of the messages is wrong after counting them. It looks like the counting clears the messages, or something like that.
The output is then:
8
0 0
---
0
0 0
---
Why is that happening? Is it okay that my program only works if I use the cache operation at that point, or should it also work without caching the messages?
My problem was that if flatMap() was called once per loop iteration, the output was correct.
If it was called twice in one iteration (which can happen if the messages have to be recomputed), then the first output was correct and the following ones were not, because the operations inside my flatMap() may only be executed once per node, not multiple times.
So if I call cache(), the flatMap is executed only once. Without cache it is re-executed for every count() operation, so the first count was correct and the following two were wrong.
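A toy sketch of that effect (not the original graph code; sc is a SparkContext, e.g. from the shell): a non-deterministic flatMap is re-evaluated by every action unless the RDD is cached, so the counts below need not add up.
import scala.util.Random

val messages = sc.parallelize(1 to 1000).flatMap { i =>
  // Non-deterministic transformation: each evaluation can produce different output.
  if (Random.nextBoolean()) Iterator(i) else Iterator.empty
}
// messages.cache()   // uncomment to evaluate the flatMap once and make the counts consistent
val total = messages.count()
val even  = messages.filter(_ % 2 == 0).count()
val odd   = messages.filter(_ % 2 != 0).count()
println(s"$total vs ${even + odd}")  // without cache() these can differ, because each count() recomputes messages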

Martin Odersky : Working hard to keep it simple

I was watching the talk "Working Hard to Keep It Simple" given by Martin Odersky (recommended by Odersky himself in the Coursera Scala course), and I am quite curious about one aspect of it:
var x = 0
async { x = x + 1 }
async { x = x * 2 }
So I get that it can give 2 if the first statement executes first and then the second one:
x = 0;
x = x + 1;
x = x * 2; // this will be x = 2
I get how it can give 1:
x = 0;
x = x * 2;
x = x + 1 // this will be x = 1
However, how can it result in 0? Is it possible that the statements don't execute at all?
Sorry for such an easy question, but I'm really stuck on it.
You need to think about interleaved execution. Remember that the CPU needs to read the value of x before it can work on it. So imagine the following:
Thread 1 reads x (reading 0)
Thread 2 reads x (reading 0)
Thread 1 writes x + 1 (writing 1)
Thread 2 writes x * 2 (writing 0)
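To see this concretely, here is a small experiment (my own sketch, not from the talk) that runs the two updates concurrently many times and collects the observed final values; with enough runs you can see 0 alongside 1 and 2.
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

object InterleavingDemo extends App {
  val results = (1 to 10000).map { _ =>
    var x = 0
    val f1 = Future { x = x + 1 }
    val f2 = Future { x = x * 2 }
    Await.ready(f1, 1.second)
    Await.ready(f2, 1.second)
    x
  }.toSet
  println(results)  // typically a subset of Set(0, 1, 2), depending on scheduling
}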
I know this has already been answered, but maybe this is still useful:
Think of it as a sequence of atomic operations. The processor is doing one atomic operation at a time.
Here we have the following:
Read x
Write x
Add 1
Multiply 2
The following two sequences are guaranteed to happen in this order "within themselves":
Read x, Add 1, Write x
Read x, Multiply 2, Write x
However, if you are executing them in parallel, the time of execution of each atomic operation relative to any other atomic operation in the other sequence is random i.e. these two sequences interleave.
One of the possible orders of execution will produce 0, as given in the answer by Paul Butcher.
Here is an illustration I found on the internet:
Each blue/purple block is one atomic operation; you can see how you can get different results depending on the order of the blocks.
To solve this problem you can use the synchronized keyword.
My understanding is that if you mark two blocks of code (e.g. two methods) with synchronized within the same object, then each block owns the lock of that object while it executes, so the other block cannot run until the first has finished. However, if the two synchronized blocks are on two different objects, they can execute in parallel.
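Here is a small sketch of that idea (the names are mine, not from the talk): both updates synchronize on the same object, so the read-modify-write of one block can no longer interleave with the other, and the result 0 becomes impossible.
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object SyncCounter {
  private var x = 0
  // Both methods lock the same object, so their bodies execute atomically
  // with respect to each other.
  def addOne(): Unit = this.synchronized { x = x + 1 }
  def double(): Unit = this.synchronized { x = x * 2 }
  def value: Int = this.synchronized { x }
}

object SyncDemo extends App {
  Future { SyncCounter.addOne() }
  Future { SyncCounter.double() }
  Thread.sleep(100)
  println(SyncCounter.value)  // 1 or 2 depending on the order, but never 0
}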