Can a parallel pool be started asynchronously? - matlab

When a MATLAB parallel pool is started for the first time it typically takes a few seconds. In a user-interactive application there is hence an incentive to make sure there is a parallel pool running before the first demand for computational tasks arrives, so the process of starting a parallel pool isn't added to the total time to respond to the request.
However every programmatic action such as parpool that I've seen that will start a parallel pool blocks execution until the pool is done starting up. This means even if the user has no need to call upon the parallel pool for some time, they cannot do anything else like begin setting up their computationally expensive request – filling in a user interface for instance – until the parallel pool is done starting.
This is very frustrating! If it was any other time-consuming preparatory action, once a parallel pool was in place it could be done in the background using parfeval and wouldn't obstruct the user's workflow until any request that actually called upon the completion of that preparation. But because this task actually addresses the lack of a running parallel pool, it seems users must wait for something they may not actually need to use until long after the task is complete.
Is there any way around this apparent limitation on usability?

There is currently no way to launch a parallel pool in the background. There are a couple of potential mitigations that might help:
Don't ever call parpool explicitly. Instead, rely on automatic pool creation, which starts the pool only when you first hit a parallel language construct such as parfor, parfeval, or spmd.
If you're using a cluster which might not be able to service your request for workers for some time, you could use batch to launch the whole computation in the background. (This is probably only appropriate if you've got a fairly long-running computation).

Related

Matlab - batch jobs won't leave queued status

I've got some code that, as it iterates through a loop, grows by some percent in what is to be processed each time. The first few iterations take 4 seconds, but by the 100th, they're taking minutes, and this is for a light selection of parameters, as I intend to do 350 iterations. To do serious research with this would take enormous time, and it's really inconvenient that simply running a script ties Matlab's hands behind its back until it's all done, and on top of that it hardly ever uses more than one core at a time.
I understand that turning on a Parallel Pool will enable parallel processing. Even if I can't convert any of the for loops into parfor loops, I understand that running a script as a batch job sends that process into the background, and I can do other things with the Matlab interface and the other 7 processors while I wait for this one to finish.
However, though I have the local parallel pool up and running, and I've checked the syntax for starting a batch job, it's not leaving the "queued" status. The first time I typed in batch('Script4') and hit Enter, and then realized I must have a variable name for the job, so then I did run1 = batch('Script4'). I typed get(run1,'State'), and also checked the Job Monitor, and both told me that its state was "queued".
I did some googling before I came here, and while I found some Q&As of similar experiences, they seemed to be solved by things like waiting for the pool to stop using the whole CPU as it starts up. But I started my pool up a long time ago (and it is still running at this moment!), and when I entered the first batch command, my first clue that something was wrong was that Windows Task Manager said all 8 cores were at 0%.
Is there something I need to call or maybe adjust before it will start executing the queued jobs?
I'm using Matlab R2015a on Windows 7 Enterprise.
I think the problem here is that you're trying to run batch jobs while the parallel pool is open. (Unfortunately, this is a common misunderstanding.) Basically, the parallel pool and your batch job are both trying to consume local workers. However, because you opened the parallel pool first, it's consuming all the local workers, and the batch job cannot proceed. You should have seen a warning when you submitted the batch job, like this:
>> parpool('local');
Starting parallel pool (parpool) using the 'local' profile ... connected to 4 workers.
>> j = batch(@rand, 1, {});
Warning: This job will remain queued until the Parallel Pool is closed.
There are two possible fixes. The first is simple:
delete(gcp('nocreate'))
will ensure no parallel pool is open, and your batch submissions should proceed. The second is more appropriate if your tasks are relatively short-lived - you can use parfeval to submit work to an open parallel pool:
f = parfeval(@rand, 1); % initiate 'rand' on the parallel pool workers
fetchOutputs(f); % wait for completion, and retrieve the result

Should we use thread pool for long running threads?

Should we use a thread pool for long running threads or start our own threads? Is there some design pattern?
Unfortunately, it depends. There is no hard and fast rule saying that you should always employ thread pools.
Thread pools offer two main things:
Delegated creation/reuse of threads.
Back-pressure
IMO, it's the back-pressure property that's interesting, but often the most poorly understood. Your machine runs on a limited set of resources. If you have (say) 8 CPU cores and they are all busy working, you would like some way to signal that adding more work (submitting more tasks) isn't going to help, at least not in terms of latency.
This is why java.util.concurrent.ThreadPoolExecutor, the standard java.util.concurrent.ExecutorService implementation, lets you supply a java.util.concurrent.BlockingQueue of your choice. When a bounded queue fills up and the pool is already at its maximum size, further submissions are rejected: by default execute throws RejectedExecutionException, and with ThreadPoolExecutor.CallerRunsPolicy the submitting thread runs the task itself, which slows the producer down until the pool has managed to complete tasks in progress.
Whether or not to have long-running threads inside the thread pool depends on what they're doing. If a thread is constantly busy (meaning its task will never complete) then it will always occupy a slot in the thread pool, which is kind of pointless.
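A minimal sketch of back-pressure with a bounded queue, assuming a plain ThreadPoolExecutor with its default rejection handler (AbortPolicy). The class name BackPressureDemo, the pool size of 1 and the queue capacity of 1 are illustrative choices, not anything from the question:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BackPressureDemo {
    /**
     * Submits `tasks` jobs that all block until released, to a pool with
     * one worker thread and room for one queued job. Returns how many
     * submissions the pool refused.
     */
    static int countRejected(int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1)); // bounded queue: capacity 1
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocked = () -> {
            try {
                release.await(); // keep the worker busy until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        int rejected = 0;
        for (int i = 0; i < tasks; i++) {
            try {
                pool.execute(blocked);
            } catch (RejectedExecutionException e) {
                rejected++; // the pool is signalling that it is saturated
            }
        }

        release.countDown(); // let the accepted jobs finish
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("rejected: " + countRejected(5));
    }
}
```

With five submissions, one job runs, one waits in the queue and the remaining three are rejected. Passing new ThreadPoolExecutor.CallerRunsPolicy() as an extra constructor argument would instead make the submitting thread run the overflow itself.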
Regarding delegated creation/reuse of threads, maybe you could have two pools: one for long-running tasks and one for other tasks. Or perhaps a long-running thread pool with a single slot. That will prevent two long-running tasks from running at the same time, provided that is what you want.
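The two-pool idea could be sketched roughly as follows. The class name TwoPoolsDemo, the short-task pool size of 4 and the 20 ms sleep standing in for long-running work are all assumptions made for illustration:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class TwoPoolsDemo {
    /**
     * Runs `longTasks` slow jobs on a single-slot pool and `shortTasks`
     * quick jobs on a separate pool. Returns
     * {peak number of slow jobs running at once, quick jobs completed}.
     */
    static int[] run(int longTasks, int shortTasks) {
        ExecutorService longRunning = Executors.newSingleThreadExecutor(); // one slot
        ExecutorService shortLived = Executors.newFixedThreadPool(4);      // arbitrary size
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        AtomicInteger shortDone = new AtomicInteger();

        for (int i = 0; i < longTasks; i++) {
            longRunning.execute(() -> {
                // Record how many slow jobs are active right now.
                peak.accumulateAndGet(running.incrementAndGet(), Math::max);
                try {
                    Thread.sleep(20); // stand-in for long-running work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                }
            });
        }
        for (int i = 0; i < shortTasks; i++) {
            shortLived.execute(shortDone::incrementAndGet);
        }

        longRunning.shutdown();
        shortLived.shutdown();
        try {
            longRunning.awaitTermination(5, TimeUnit.SECONDS);
            shortLived.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] {peak.get(), shortDone.get()};
    }

    public static void main(String[] args) {
        int[] r = run(3, 10);
        System.out.println("peak long tasks at once: " + r[0]
                + ", short tasks done: " + r[1]);
    }
}
```

Because the long-running pool has exactly one thread, the slow jobs run one after another while the short tasks proceed independently on their own pool.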
As you can see, there is no single good answer. It really boils down to what you are trying to achieve and how you want to use the resources at hand.

How can I create a Scheduled Task that will run every Second in MarkLogic?

MarkLogic Scheduled Tasks cannot be configured to run at an interval less than a minute.
Is there any way I can execute an XQuery module at an interval of 1 second?
NOTE:
Consider the situation where the Task Server is fully loaded: I need to make sure that the task scheduled every second gets a Task Server thread whenever it needs one.
Please let me know if there is anything in MarkLogic that can be used to achieve this.
Wanting rapid-fire scheduled tasks may be a hint that the design needs rethinking.
Even running a task once a minute can be risky, and needs careful thought to manage the possibilities of overlapping tasks and runaway tasks. If the application design calls for a scheduled task to run once a second, I would raise that as a potentially serious problem. Back up a few steps, and if necessary ask a new question about the higher-level problem that led to looking at scheduled tasks.
There was a sub-question about managing queue priority for tasks. Task priorities can handle some of that. There are two priorities: normal and higher. The Task Server empties the higher-priority queue first, then the normal queue. But each queue is still a simple queue, and there's no way to change priorities after a task has been spawned. So if you always queue tasks with priority=higher, then they'll all be in the higher priority queue and they'll all run in order. You can play some games with techniques like using server fields as signals to already-running tasks. But wanting to reorder tasks within a queue could be another hint that the design needs rethinking.
If, after careful thought about all the pitfalls and dangers, I decided I needed a rapid-fire task of some kind, I would probably do it using external requests. Pick any scripting language and write a simple while loop with an HTTP request to the MarkLogic cluster. Even so, spend some time thinking about overlapping requests and locking. What happens if the request times out on the client side? Will it keep running on the server? Will that lead to overlapping requests and require deadlock resolution? Could it lead to runaway resource consumption?
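A rough sketch of such an external loop, written in Java rather than a scripting language. Everything named here is hypothetical: the class PollingLoopDemo, the URL http://localhost:8000/hypothetical-task.xqy, and the 500 ms timeouts. The loop issues one request at a time, so the client never overlaps its own calls, but a client-side timeout does not by itself stop work that is already running on the server.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class PollingLoopDemo {
    /**
     * Calls `request` up to `iterations` times, strictly one call at a
     * time, aiming for one call per `periodMillis`. Returns the number of
     * calls that completed without throwing.
     */
    static int runLoop(Runnable request, int iterations, long periodMillis) {
        int succeeded = 0;
        for (int i = 0; i < iterations; i++) {
            long started = System.nanoTime();
            try {
                request.run();
                succeeded++;
            } catch (RuntimeException e) {
                // Report the failure and carry on with the next tick.
                System.err.println("request failed: " + e.getMessage());
            }
            long elapsedMillis = (System.nanoTime() - started) / 1_000_000;
            long remaining = periodMillis - elapsedMillis;
            if (remaining > 0) {
                try {
                    Thread.sleep(remaining); // sleeps on the client only
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return succeeded;
    }

    /** Sends one GET request to a hypothetical module URL with short timeouts. */
    static void callModule(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(500);
            conn.setReadTimeout(500);
            try {
                conn.getResponseCode(); // sends the request and waits for the status line
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical endpoint; replace with a real module URL.
        runLoop(() -> callModule("http://localhost:8000/hypothetical-task.xqy"), 5, 1000);
    }
}
```

If a request takes longer than the period, the next one simply starts late instead of piling up behind it.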
Avoid any ideas that use xdmp:sleep. That will tie up a Task Server thread during the sleep period, and then you'll have two problems.

Scala parallel collections, threads termination, and sbt

I am using parallel collections, and when my application terminates, sbt issues:
Not interrupting system thread Thread[process reaper,10,system]
It issues this message one time per core (minus one to be precise).
I have seen in the sbt code that this is by design, but I am not sure why the threads don't terminate along with my application. Any insight would be appreciated if you were unlucky enough to come across the same...
Parallel collections by default are backed by ForkJoinTasks.defaultForkJoinPool, which is a lazy val, so it's created the first time it's used.
Like any ForkJoinPool, it runs until explicitly shut down. The pool has no way of knowing whether it's going to receive any new tasks, and thread creation is relatively expensive, so it would be wasteful for the pool to shut down when it was empty only to start up again as soon as new tasks are added. So its threads hang around unless and until the pool is explicitly shut down.
As a design decision the JVM doesn't kill other threads just because the main thread terminates; in some programming styles the main thread terminates relatively early (e.g. think about web servers where the main thread sets up everything, starts a pool of dispatcher threads, and then exits, but the web server continues to run indefinitely).
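A small Java sketch of that behaviour, assuming plain java.lang.Thread objects rather than Scala parallel collections. The class name OutlivingThreadDemo and the thread names are made up; the "starter" thread plays the role of a main thread that sets things up and exits.

```java
import java.util.concurrent.CountDownLatch;

final class OutlivingThreadDemo {
    /**
     * A "starter" thread launches a non-daemon worker and then finishes.
     * Returns true if the worker is still alive after the starter has
     * terminated.
     */
    static boolean workerOutlivesStarter() {
        CountDownLatch stop = new CountDownLatch(1);
        Thread[] worker = new Thread[1];

        Thread starter = new Thread(() -> {
            worker[0] = new Thread(() -> {
                try {
                    stop.await(); // stand-in for a dispatcher loop
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "dispatcher");
            worker[0].setDaemon(false); // non-daemon: keeps the JVM alive
            worker[0].start();
        }, "starter");

        starter.start();
        try {
            starter.join(); // the starter is finished from here on
            boolean outlived = !starter.isAlive() && worker[0].isAlive();
            stop.countDown(); // now let the worker finish too
            worker[0].join();
            return outlived;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("worker outlived starter: " + workerOutlivesStarter());
    }
}
```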
You could call ForkJoinTasks.defaultForkJoinPool.shutdown() once you know you're not going to do any more parallel operations, or you could create parallel collections using a custom pool that's explicitly controlled from your code.
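The "custom pool that's explicitly controlled from your code" option might look like the following in Java. This is an analogue built on java.util.concurrent, not the Scala parallel collections API, and the class name ExplicitShutdownDemo and the pool size of 2 are assumptions:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ExplicitShutdownDemo {
    /** Computes 1^2 + 2^2 + ... + n^2 on a pool owned by this method. */
    static long sumOfSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final long v = i;
                Callable<Long> job = () -> v * v;
                parts.add(pool.submit(job));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // wait for each partial result
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            // Without this call the pool's non-daemon threads would linger.
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(4));
    }
}
```

Since the pool is created and shut down in the same method, no worker threads are left behind once the result has been returned.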

Networking using run loop

I have an application which uses some external library for analytics. The problem is that I suspect it does some things synchronously, which blocks my thread and makes the watchdog kill my app after 10 seconds (0x8badf00d code). It is really hard to reproduce (I cannot), but there are quite a few cases "in the wild".
I've read some documentation which suggested that, instead of creating another thread, I should use run loops. Unfortunately, the more I read about them, the more confused I get. And the last thing I want to do is release a fix which will break even more things :/
What I am trying to achieve is:
From the main thread, add a task to the run loop which calls just one function: initMyAnalytics(). My thread continues running, even if initMyAnalytics() gets locked waiting for network data. After initMyAnalytics() finishes, it quietly quits and never gets called again (so it doesn't loop or anything).
Any ideas how to achieve it? Code examples are welcome ;)
Regards!
You don't need to use a run loop in that case. The purpose of a run loop is to process events from various sources sequentially on a particular thread and to stay idle when there is nothing to do. Of course, you can detach a thread, create a run loop, add a source for your function and run the run loop until the function ends. In the same way, you can use a semi-trailer truck to carry your groceries home.
Here, what you need are dispatch queues. Dispatch queues are First-In-First-Out data structures that run tasks asynchronously. In contrast to run loops, a dispatch queue isn't tied to a particular thread: the working threads are automatically created and terminated as and when required.
As you only have one task to execute, you don't need to create a dispatch queue. Instead you will use an existing global concurrent queue. A concurrent queue executes one or more tasks concurrently, which is perfectly fine in our case. But if we had many tasks to execute and wanted each task to wait for its predecessor to end, we would need to create a serial queue.
So all you have to do is:
create a task for your function by enclosing it into a Block
get a global queue using dispatch_get_global_queue
add the task to the queue using dispatch_async.
// Run initMyAnalytics() on a background thread so the calling thread is not blocked.
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    initMyAnalytics();
});
DISPATCH_QUEUE_PRIORITY_DEFAULT is a macro that evaluates to 0. You can get different global queues with different priorities. The second parameter is reserved for future use and should always be 0.