In a nutshell:
I have one counter variable that is accessed from many threads. Although I've implemented read/write protection across threads, the variable still seems, inconsistently, to be written to simultaneously, leading to incorrect results from the counter.
Getting into the weeds:
I'm using a "for loop" that triggers roughly 100 URL requests in the background, each in its own “DispatchQueue.global(qos: .userInitiated).async” queue.
These processes are async; once they finish, they update a “counter” variable. This variable is supposed to be protected across threads, meaning it’s always accessed from one thread and it’s accessed synchronously. However, something is wrong: from time to time the variable is accessed simultaneously by two threads, so the counter does not update correctly. Here's an example. Let's imagine we have 5 URLs to fetch:
We start with the Counter variable at 5.
1 URL Request Finishes -> Counter = 4
2 URL Request Finishes -> Counter = 3
3 URL Request Finishes -> Counter = 2
4 URL Request Finishes (and for some reason, I assume because the variable is accessed at the same time) -> Counter = 2
5 URL Request Finishes -> Counter = 1
As you can see, this leads to the counter being 1, instead of 0, which then affects other parts of the code. This error happens inconsistently.
Here is the multi-thread protection I use for the counter variable:
Dedicated Global Queue
// Background queue to synchronize data access
fileprivate let globalBackgroundSyncronizeDataQueue = DispatchQueue(label: "globalBackgroundSyncronizeSharedData")
Variable is always accessed via accessor:
var numberOfFeedsToFetch_Value: Int = 0
var numberOfFeedsToFetch: Int {
    set (newValue) {
        globalBackgroundSyncronizeDataQueue.sync() {
            self.numberOfFeedsToFetch_Value = newValue
        }
    }
    get {
        return globalBackgroundSyncronizeDataQueue.sync {
            numberOfFeedsToFetch_Value
        }
    }
}
I assume I may be missing something, but I've used profiling and all seems to be good. I've also checked the documentation and I seem to be doing what it recommends. I'd really appreciate your help.
Thanks!!
Answer from Apple Forums: https://forums.developer.apple.com/message/322332#322332:
The individual accessors are thread safe, but an increment operation
isn't atomic given how you've written the code. That is, while one
thread is getting or setting the value, no other threads can also be
getting or setting the value. However, there's nothing preventing
thread A from reading the current value (say, 2), thread B reading the
same current value (2), each thread adding one to this value in their
private temporary, and then each thread writing their incremented
value (3 for both threads) to the property. So, two threads
incremented but the property did not go from 2 to 4; it only went from
2 to 3. You need to do the whole increment operation (get, increment
the private value, set) in an atomic way such that no other thread can
read or write the property while it's in progress.
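The point of the quoted answer can be sketched in a few lines. This is a Python illustration rather than Swift, and the names `Counter`, `decrement` and `run_requests` are made up for the example. The read, the subtraction and the write happen under one lock acquisition, so two threads can never both start from the same value. In the Swift code above, the analogous change would be to perform the whole `numberOfFeedsToFetch_Value -= 1` inside a single `globalBackgroundSyncronizeDataQueue.sync { }` block instead of going through the getter and then the setter.

```python
import threading


class Counter:
    """Counter whose read-modify-write steps run under one lock."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def decrement(self):
        # Read, subtract and write while holding the lock, so no other
        # thread can interleave between the read and the write.
        with self._lock:
            self._value -= 1
            return self._value

    @property
    def value(self):
        with self._lock:
            return self._value


def run_requests(n):
    """Simulate n finished requests, each decrementing the counter once."""
    counter = Counter(n)
    threads = [threading.Thread(target=counter.decrement) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With this shape, 100 simulated requests starting from a counter of 100 always end at 0, because no decrement can be lost.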
Related
I have a model with a queue and two machines, one of which is used just in case of overcrowding of the queue in front of these resources.
My model has a simple Queue and a Delay block and I tried to mutate the Delay capacity based on a previous queue length using a function like this (written in Delay block capacity text field):
if (queue.size() > 5)
    return 2;
else
    return 1;
But it doesn't seem to work. Is it possible to change the number of resources dynamically based on a condition?
The capacity value in the Delay block is only considered at the beginning of the simulation, so it only acts as the initial value.
To change the capacity later, you can put some code in the "on enter" and "on exit" actions of the Queue block:
delay.set_capacity(queue.size() > 5 ? 2 : 1);
Something like that.
I'm trying to make 3 async requests and use semaphores to know when all of them have loaded.
I initialize the semaphore this way:
let sem = dispatch_semaphore_create(2);
Then I send the code that waits on the semaphore to the background:
let backgroundQueue = dispatch_get_global_queue(QOS_CLASS_BACKGROUND, 0)
dispatch_async(backgroundQueue) { [unowned self] () -> Void in
    println("Waiting for filters load")
    dispatch_semaphore_wait(sem, DISPATCH_TIME_FOREVER);
    println("Loaded")
}
Then I signal it 3 times (in each request's onSuccess and onFailure):
dispatch_semaphore_signal(sem)
But by the time the signal code runs, execution has already passed the semaphore wait; it never waits for the semaphore count to drop.
Why?
You've specified dispatch_semaphore_create with a parameter of 2 (which is like calling dispatch_semaphore_signal twice), and then signal it three more times (for a total of five), but you appear to have only one wait (which won't wait at all because you started your semaphore with a count of 2).
That's obviously not going to work. Even if you fixed that (e.g. use zero for the creation of the semaphore and then issue three waits), this whole approach is inadvisable because you're unnecessarily tying up a thread waiting for the other requests to finish.
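The counting argument above can be checked with any counting semaphore. The following is a small Python sketch using `threading.Semaphore`, not the GCD API; the helper names `can_pass_immediately` and `waits_satisfied` are invented for the illustration.

```python
import threading


def can_pass_immediately(initial_count):
    """Return True if one wait on a fresh semaphore would not block."""
    sem = threading.Semaphore(initial_count)
    # A non-blocking acquire succeeds only if the count is above zero.
    return sem.acquire(blocking=False)


def waits_satisfied(initial_count, signals, waits):
    """Count how many of `waits` waits get through after `signals` signals."""
    sem = threading.Semaphore(initial_count)
    for _ in range(signals):
        sem.release()
    return sum(1 for _ in range(waits) if sem.acquire(blocking=False))
```

A semaphore created with a count of 2 lets the first wait through at once, which matches the behaviour in the question. Starting from 0, three waits are satisfied only after three signals.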
This is a textbook candidate for dispatch groups. So you would generally use the following:
Create a dispatch_group_t:
dispatch_group_t group = dispatch_group_create();
Then call dispatch_group_enter three times, once before each request.
In each of the three onSuccess/onFailure block pairs, call dispatch_group_leave in both blocks.
Create a dispatch_group_notify block that will be performed when all of the requests are done.
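The enter/leave/notify bookkeeping in the steps above can be sketched outside GCD as a counter plus a callback. The Python below is only an analogy to show the pattern: `Group`, `load_all` and the fake requests are hypothetical names, not part of any Apple API.

```python
import threading


class Group:
    """Minimal stand-in for a dispatch group: enter, leave, notify."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = 0
        self._callback = None

    def enter(self):
        with self._lock:
            self._pending += 1

    def leave(self):
        with self._lock:
            self._pending -= 1
            if self._pending > 0 or self._callback is None:
                return
            callback, self._callback = self._callback, None
        callback()  # run outside the lock once the last task has left

    def notify(self, callback):
        with self._lock:
            if self._pending > 0:
                self._callback = callback
                return
        callback()  # nothing pending, so fire right away


def load_all(request_names):
    group = Group()
    finished = []
    all_done = threading.Event()

    def request(name):
        try:
            finished.append(name)  # stand-in for the real network call
        finally:
            group.leave()  # leave on success and on failure alike

    threads = []
    for name in request_names:
        group.enter()  # one enter before each request
        threads.append(threading.Thread(target=request, args=(name,)))
    group.notify(all_done.set)
    for t in threads:
        t.start()
    # Waiting here is only so the demo can return a result; real code would
    # do its follow-up work inside the notify callback instead.
    all_done.wait(timeout=5)
    return sorted(finished), all_done.is_set()
```

The callback runs exactly once, after the last `leave`, and no thread has to sit blocked on a semaphore while the requests are in flight.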
from random import randrange
from time import sleep
#import thread
from threading import Thread
from Queue import Queue

'''The idea is that there is a Seeker method that would search a location
for tasks, I have no idea how many tasks there will be, could be 1 could be 100.
Each task needs to be put into a thread, does its thing and finishes. I have
stripped down a lot of what this is really supposed to do just to focus on the
correct queuing and threading aspect of the program. The locking was just
me experimenting with locking'''

class Runner(Thread):
    current_queue_size = 0

    def __init__(self, queue):
        self.queue = queue
        data = queue.get()
        self.ID = data[0]
        self.timer = data[1]
        #self.lock = data[2]
        Runner.current_queue_size += 1
        Thread.__init__(self)

    def run(self):
        #self.lock.acquire()
        print "running {ID}, will run for: {t} seconds.".format(ID = self.ID,
                                                                t = self.timer)
        print "Queue size: {s}".format(s = Runner.current_queue_size)
        sleep(self.timer)
        Runner.current_queue_size -= 1
        print "{ID} done, terminating, ran for {t}".format(ID = self.ID,
                                                           t = self.timer)
        print "Queue size: {s}".format(s = Runner.current_queue_size)
        #self.lock.release()
        sleep(1)
        self.queue.task_done()

def seeker():
    '''Gathers data that would need to enter its own thread.
    For now it just uses a count and random numbers to assign
    both a task ID and a time for each task'''
    queue = Queue()
    queue_item = {}
    count = 1
    #lock = thread.allocate_lock()
    while (count <= 40):
        random_number = randrange(1,350)
        queue_item[count] = random_number
        print "{count} dict ID {key}: value {val}".format(count = count, key = random_number,
                                                          val = random_number)
        count += 1
    for n in queue_item:
        #queue.put((n,queue_item[n],lock))
        queue.put((n,queue_item[n]))
        '''I assume it is OK to put a tuple in and pull it out later'''
        worker = Runner(queue)
        worker.setDaemon(True)
        worker.start()
    worker.join()
    '''Which one of these is necessary and why? The queue object
    joining or the thread object'''
    #queue.join()

if __name__ == '__main__':
    seeker()
I have put most of my questions in the code itself, but to go over the main points (Python2.7):
I want to make sure I am not creating some massive memory leak for myself later.
I have noticed that when I run it at a count of 40 in PuTTY or VNC on
my Linux box, I don't always get all of the output, but when
I use IDLE and Aptana on Windows, I do.
Yes, I understand that the point of Queue is to stagger your
threads so you are not flooding your system's memory, but the tasks at
hand are time sensitive, so they need to be processed as soon as they
are detected regardless of how many or how few there are; I have
found that when I have Queue I can clearly dictate when a task has
finished as opposed to letting the garbage collector guess.
I still don't know why I am able to get away with using either the
.join() on the thread or queue object.
Tips, tricks, general help.
Thanks for reading.
If I understand you correctly, you need a thread to monitor something to see if there are tasks that need to be done. If a task is found, you want it to run in parallel with the seeker and the other currently running tasks.
If this is the case then I think you might be going about this wrong. Take a look at how the GIL works in Python. I think what you might really want here is multiprocessing.
Take a look at this from the pydocs:
CPython implementation detail: In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation). If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing. However, threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously.
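The quoted passage notes that threading is still appropriate for I/O-bound tasks, which is what the sleeping tasks in the question resemble. Below is a minimal sketch of one thread per task, written for Python 3 rather than the question's Python 2.7. `TaskTracker`, `run_task` and this version of `seeker` are hypothetical names, and `time.sleep` stands in for the real work. The shared counter is updated under a lock, and every thread is joined rather than only the last one created.

```python
import threading
import time


class TaskTracker:
    """Tracks how many tasks are running, using a lock for every update."""

    def __init__(self):
        self._lock = threading.Lock()
        self.running = 0
        self.peak = 0
        self.completed = []

    def run_task(self, task_id, seconds):
        with self._lock:
            self.running += 1
            self.peak = max(self.peak, self.running)
        time.sleep(seconds)  # stand-in for I/O-bound work
        with self._lock:
            self.running -= 1
            self.completed.append(task_id)


def seeker(tasks):
    """Start one thread per (task_id, seconds) pair and wait for all of them."""
    tracker = TaskTracker()
    threads = [
        threading.Thread(target=tracker.run_task, args=(task_id, seconds))
        for task_id, seconds in tasks
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # join every thread, not only the last one created
    return tracker
```

Joining only the last thread lets the main thread exit while earlier daemon threads are still running, which can cut their output short.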
We have an HTTP endpoint that takes a long time to run and can also be called concurrently by users. As part of this request, we update the model inside a synchronized block so that other (possibly concurrent) requests pick up that change.
E.g.
MyModel m = null;
synchronized (lockObject) {
    m = MyModel.findById(id);
    if (m.status == PENDING) {
        m.status = ACTIVE;
    } else {
        //render a response back to user that the operation is not allowed
    }
    m.save(); //Is not expected to be called unless we set m.status = ACTIVE
}
//Long running operation continues here. It can involve further changes to instance "m"
The reason for the synchronized block is to ensure that even concurrent requests pick up the latest status. However, the underlying JPA does not commit my changes (m.save()) until the request is complete. Since this is a long-running request, I do not want to wait until the request is complete, yet I still want other callers to see the change in status. I tried calling "m.em().flush(); JPA.em().getTransaction().commit();" after m.save(), but that makes the transaction unavailable for subsequent actions in the same request. Can I just call "JPA.em().getTransaction().begin();" and let Play handle the transaction from then on? If not, what is the best way to handle this use case?
UPDATE:
Based on the response, I modified my code as follows:
MyModel m = null;
synchronized (lockObject) {
    m = MyModel.findById(id);
    if (m.status == PENDING) {
        m.status = ACTIVE;
    } else {
        //render a response back to user that the operation is not allowed
    }
    m.save(); //Is not expected to be called unless we set m.status = ACTIVE
}
new MyModelUpdateJob(m.id).now();
And in my job, I have the following line:
doJob() {
    MyModel m = MyModel.findById(id);
    print m.status; //This still prints the old status as if m.save() had no effect...
}
What am I missing?
Put your update code in a job and call
new MyModelUpdateJob(id).now().get();
so that the update is done in another transaction, which is committed at the end of the job.
Ouch, as soon as you add more Play servers, you will be in trouble. You may want to try optimistic locking in your example or (and I advise against it) pessimistic locking... ick.
However, looking at your code, maybe read the article Building on Quicksand. I am not sure you need a synchronized block in that case at all. Try to aim for being idempotent.
In your case, if
1. user 1 and user 2 both call that method while the status is pending, then it goes to active (idempotent).
Whether user 1 or user 2 wins, that would be just as if you had the synchronized block anyway.
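One way to get a single winner for the PENDING to ACTIVE transition without an in-process lock is a conditional update, a form of compare-and-set. This is a technique swapped in for illustration, not something the answer spells out. The sketch uses Python with an in-memory SQLite database rather than Play and JPA, and the table name `my_model` and the helpers `make_db` and `try_activate` are assumptions made for the example.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one PENDING row (id 1)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_model (id INTEGER PRIMARY KEY, status TEXT)")
    conn.execute("INSERT INTO my_model (id, status) VALUES (1, 'PENDING')")
    conn.commit()
    return conn


def try_activate(conn, model_id):
    """Move the row from PENDING to ACTIVE; return True only for the winner."""
    cur = conn.execute(
        "UPDATE my_model SET status = 'ACTIVE' "
        "WHERE id = ? AND status = 'PENDING'",
        (model_id,),
    )
    conn.commit()  # commit right away so other callers see the new status
    # rowcount is 1 if this call performed the transition, 0 otherwise.
    return cur.rowcount == 1
```

The first caller gets True and can start the long-running work. Any later caller gets False and can be told the operation is not allowed, and the row ends up ACTIVE either way.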
I am sure, however, that you have a more complex scenario not shown here, but do read that article, Building on Quicksand, as it really changes the traditional way of thinking and describes how Google, Amazon and other very large scale systems operate.
Another option for distributed transactions across Play servers is ZooKeeper, which the large NoSQL players use, but only as a last resort ;) ;)
later,
Dean
// down = acquire the resource
// up = release the resource
typedef int semaphore;
semaphore resource_1;
semaphore resource_2;

void process_A(void) {
    down(&resource_1);
    down(&resource_2);
    use_both_resources();
    up(&resource_2);
    up(&resource_1);
}

void process_B(void) {
    down(&resource_2);
    down(&resource_1);
    use_both_resources();
    up(&resource_1);
    up(&resource_2);
}
Why does this code cause a deadlock?
If we change the code of process_B so that both processes ask for the resources in the same order:
void process_B(void) {
    down(&resource_1);
    down(&resource_2);
    use_both_resources();
    up(&resource_2);
    up(&resource_1);
}
Then there is no deadlock.
Why so?
Imagine that process A is running, tries to get resource_1, and gets it.
Now process B takes control, tries to get resource_2, and gets it. Process B then tries to get resource_1 and does not get it, because it belongs to process A. So process B goes to sleep.
Process A gets control again and tries to get resource_2, but it belongs to process B. Now it goes to sleep too.
At this point, process A is waiting for resource_2 and process B is waiting for resource_1.
If you change the order, process B will never lock resource_2 unless it gets resource_1 first, and the same goes for process A.
They will never be deadlocked.
A necessary condition for a deadlock is a cycle of resource acquisitions. The first example constructs such a cycle, 1->2->1. The second example acquires the resources in a fixed order, which makes a cycle, and hence a deadlock, impossible.
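The fixed-order rule can be tried out with two ordinary locks. This is a Python sketch with `threading.Lock` standing in for the semaphores; the names `use_both` and `run_processes` are made up for the example. Both workers always take `resource_1` before `resource_2`, so neither can hold the second lock while waiting for the first.

```python
import threading

resource_1 = threading.Lock()
resource_2 = threading.Lock()


def use_both(name, log):
    # Every thread takes resource_1 before resource_2, so no thread can
    # hold resource_2 while it waits for resource_1.
    with resource_1:
        with resource_2:
            log.append(name)


def run_processes(rounds=1000):
    """Run workers A and B; return (entries logged, any thread still stuck)."""
    log = []

    def worker(name):
        for _ in range(rounds):
            use_both(name, log)

    threads = [threading.Thread(target=worker, args=(n,)) for n in ("A", "B")]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=10)
    return len(log), any(t.is_alive() for t in threads)
```

If one worker instead took `resource_2` first, the cycle described above would become possible, and a run could hang with both threads still alive.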