boost::asio io_service stop specific thread - threadpool

I've got a boost::asio based thread pool running on N threads.
It's used mainly for IO tasks (DB data storing/retrieval). It also launches a self-diagnostic timer job to check how 'busy' the pool is (it calculates the millisecond difference between 'time added' and 'time handler called').
So the question is: is there any way to stop M of N threads (for cases when the load is very low and the pool does not need so many threads)?
When the load is high (as determined by the diagnostic task), a new thread is added:
_workers.emplace_back(srv::unique_ptr<srv::thread>(new srv::thread([this]
{
    _service.run();
})));
(the srv namespace is used to switch quickly between boost and std)
But when the 'peak load' has passed, I need some way to stop the additional threads. Is there any solution for this?

What you are looking for is a way to interrupt a thread that is waiting on the io_service. You can implement some sort of interruption mechanism using exceptions.
class worker_interrupted : public std::runtime_error
{
public:
    worker_interrupted()
        : runtime_error("thread interrupted") {}
};
_workers.emplace_back(srv::unique_ptr<srv::thread>(new srv::thread([this]
{
    try
    {
        _service.run();
    }
    catch (const worker_interrupted& interruption)
    {
        // thread function exits gracefully.
    }
})));
You could then use io_service::post to enqueue a completion handler which simply throws a worker_interrupted exception. Exceptions thrown from a handler propagate out of run(), so the thread that happens to execute the handler unwinds and exits.
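For example, a minimal sketch, assuming _service is the pool's io_service and stop_workers() is a hypothetical helper on the pool class:
// Ask M workers to exit. Each posted handler is executed by exactly one
// thread inside _service.run(); the throw unwinds that thread's run() call,
// and the catch block in the worker lambda lets the thread function return
// gracefully. A thread that has thrown never re-enters run(), so M posts
// stop M threads (assuming at least M threads are actually running).
void stop_workers(std::size_t m)
{
    for (std::size_t i = 0; i < m; ++i)
    {
        _service.post([]
        {
            throw worker_interrupted();
        });
    }
}
Remember to join and erase the corresponding entries in _workers afterwards, or the pool will accumulate dead thread objects.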

Related

FreeRTOS mutex/binary semaphore and deadlock

I am new to FreeRTOS, so I started with what I think is a great tutorial, the one presented by Shawn Hymel. I'm also implementing the code that I'm writing on an ESP32 DevkitC V4.
However, I think that I don't understand the difference between binary semaphores and mutexes. When I run this code, which tries to avoid deadlock between two tasks that use two mutexes to protect a critical section (as shown in the tutorial):
// Use only core 1 for demo purposes
#if CONFIG_FREERTOS_UNICORE
static const BaseType_t app_cpu = 0;
#else
static const BaseType_t app_cpu = 1;
#endif

// Settings
TickType_t mutex_timeout = 1000 / portTICK_PERIOD_MS;
// Timeout for any task that tries to take a mutex!

// Globals
static SemaphoreHandle_t mutex_1;
static SemaphoreHandle_t mutex_2;

//**********************************************************
// Tasks

// Task A (high priority)
void doTaskA(void *parameters) {
  while (1) {
    // Take mutex 1
    if (xSemaphoreTake(mutex_1, mutex_timeout) == pdTRUE) {
      Serial.println("Task A took mutex 1");
      vTaskDelay(1 / portTICK_PERIOD_MS);
      // Take mutex 2
      if (xSemaphoreTake(mutex_2, mutex_timeout) == pdTRUE) {
        Serial.println("Task A took mutex 2");
        // Critical section protected by 2 mutexes
        Serial.println("Task A doing work");
        vTaskDelay(500 / portTICK_PERIOD_MS); // simulate that critical section takes 500ms
      } else {
        Serial.println("Task A timed out waiting for mutex 2. Trying again...");
      }
    } else {
      Serial.println("Task A timed out waiting for mutex 1. Trying again...");
    }
    // Return mutexes
    xSemaphoreGive(mutex_2);
    xSemaphoreGive(mutex_1);
    Serial.println("Task A going to sleep");
    vTaskDelay(500 / portTICK_PERIOD_MS); // Wait to let other task execute
  }
}

// Task B (low priority)
void doTaskB(void *parameters) {
  while (1) {
    // Take mutex 2 and wait to force deadlock
    if (xSemaphoreTake(mutex_2, mutex_timeout) == pdTRUE) {
      Serial.println("Task B took mutex 2");
      vTaskDelay(1 / portTICK_PERIOD_MS);
      if (xSemaphoreTake(mutex_1, mutex_timeout) == pdTRUE) {
        Serial.println("Task B took mutex 1");
        // Critical section protected by 2 mutexes
        Serial.println("Task B doing work");
        vTaskDelay(500 / portTICK_PERIOD_MS); // simulate that critical section takes 500ms
      } else {
        Serial.println("Task B timed out waiting for mutex 1");
      }
    } else {
      Serial.println("Task B timed out waiting for mutex 2");
    }
    // Return mutexes
    xSemaphoreGive(mutex_1);
    xSemaphoreGive(mutex_2);
    Serial.println("Task B going to sleep");
    vTaskDelay(500 / portTICK_PERIOD_MS); // Wait to let other task execute
  }
}

void setup() {
  Serial.begin(115200);
  vTaskDelay(1000 / portTICK_PERIOD_MS);
  Serial.println();
  Serial.println("---FreeRTOS Deadlock Demo---");
  // Create mutexes
  mutex_1 = xSemaphoreCreateMutex();
  mutex_2 = xSemaphoreCreateMutex();
  // Start task A (high priority)
  xTaskCreatePinnedToCore(doTaskA, "Task A", 1500, NULL, 2, NULL, app_cpu);
  // Start task B (low priority)
  xTaskCreatePinnedToCore(doTaskB, "Task B", 1500, NULL, 1, NULL, app_cpu);
  vTaskDelete(NULL);
}

void loop() {
}
My ESP32 starts rebooting automatically after both tasks take their first mutex, displaying this message:
---FreeRTOS Deadlock Demo---
Task A took mutex 1
Task B took mutex 2
Task A timed out waiting for mutex 2. Trying again...
assert failed: xQueueGenericSend queue.c:832 (pxQueue->pcHead != ((void *)0) || pxQueue->u.xSemaphore.xMutexHolder == ((void *)0) || pxQueue->u.xSemaphore.xMutexHolder == xTaskGetCurrentTaskHandle())
I am unable to interpret the error. However, when I change the definition of the mutexes to binary semaphores in setup():
// create binary semaphores instead of mutexes
mutex_1 = xSemaphoreCreateBinary();
mutex_2 = xSemaphoreCreateBinary();
The code runs fine on the ESP32. Would anyone please explain to me why this happens? Many thanks, and sorry if the question wasn't adequately phrased, as this is my first one.
One of the key differences between semaphores and mutexes is the concept of ownership. Semaphores don't have an owning thread: any task may give a semaphore, regardless of which task took it. Mutexes, on the other hand, are owned by the task that takes them and can only be given back by that task.
In your code above, mutex_1 is taken by Task A and mutex_2 is taken by Task B. At this point, Task A is trying to take mutex_2. As a real mutex, it is owned by Task B, so Task A cannot take it, and the two tasks deadlock. If these were semaphores, ownership wouldn't matter, and either task could give either one, which is what clears the deadlock in your binary-semaphore version.
The error plays into that. After Task A times out waiting for mutex_2, it falls through to the 'Return mutexes' block and calls xSemaphoreGive(mutex_2) first. It never took mutex_2 and is not its owner, so the kernel hits the assert you see: a task must not give a mutex it doesn't own. (Giving mutex_1 would have been fine, since Task A does own that one.)
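The usual fix is to give back only what you actually took. A minimal sketch of Task A with each xSemaphoreGive moved inside the branch whose take succeeded (Task B would be changed the same way):
void doTaskA(void *parameters) {
  while (1) {
    if (xSemaphoreTake(mutex_1, mutex_timeout) == pdTRUE) {
      Serial.println("Task A took mutex 1");
      vTaskDelay(1 / portTICK_PERIOD_MS);
      if (xSemaphoreTake(mutex_2, mutex_timeout) == pdTRUE) {
        Serial.println("Task A doing work");
        vTaskDelay(500 / portTICK_PERIOD_MS);
        xSemaphoreGive(mutex_2); // owned: taken above
      } else {
        Serial.println("Task A timed out waiting for mutex 2. Trying again...");
      }
      xSemaphoreGive(mutex_1); // owned: taken above
    } else {
      Serial.println("Task A timed out waiting for mutex 1. Trying again...");
    }
    Serial.println("Task A going to sleep");
    vTaskDelay(500 / portTICK_PERIOD_MS);
  }
}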
If you want to read a little more about the differences between mutexes and semaphores, you can check out this article.

Why does IoScheduler use a ScheduledExecutorService with a core pool size of 1?

I found that IoScheduler.createWorker() will create a NewThreadWorker immediately if there is no cached NewThreadWorker. This may result in an OutOfMemoryError.
If I submit 1000 pieces of work to the IoScheduler at once, it will create 1000 NewThreadWorkers and ScheduledExecutorServices.
private void submitWorkers(int workerCount) {
    for (int i = 0; i < workerCount; i++) {
        Single.fromCallable(new Callable<String>() {
            @Override
            public String call() throws Exception {
                Thread.sleep(1000);
                return "String-call(): " + Thread.currentThread().hashCode();
            }
        })
        .subscribeOn(Schedulers.io())
        .subscribe(new Consumer<String>() {
            @Override
            public void accept(String s) throws Exception {
                // TODO
            }
        });
    }
}
If I set workerCount to 1000, I receive an OutOfMemoryError. I want to know why IoScheduler uses a NewThreadWorker with a ScheduledExecutorService that only executes a single piece of work.
Every time new work comes in, it creates a NewThreadWorker and ScheduledExecutorService if there is no cached NewThreadWorker. Why is it designed this way?
The standard workers of RxJava each use a dedicated thread to avoid excessive thread hopping and work migration in flows.
The standard IO scheduler uses an unbounded number of worker threads because its main use is to allow blocking operations to block a worker thread while other operations commence on other worker threads. The difference from newThread is that thread reuse is allowed once a worker is returned to the internal pool.
If there were a limit on the number of threads, it would drastically increase the likelihood of deadlocks due to resource exhaustion. Also, unlike the computation scheduler, there is no good default for this limit: 1, 10, 100, 1000?
There are several ways to work around this problem, such as:
use Schedulers.from() with an arbitrary ExecutorService, which you can limit and configure as you wish (see the sketch after this list),
use ParallelScheduler from the Extensions project and define an arbitrarily large but fixed pool of workers.
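A minimal sketch of the first workaround, assuming RxJava 2 (the class name, the pool size of 16, and the shutdown handling are illustrative choices, not part of the original answer):
import io.reactivex.Single;
import io.reactivex.schedulers.Schedulers;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BoundedIoExample {
    public static void main(String[] args) throws InterruptedException {
        // A fixed, bounded pool instead of the unbounded IO scheduler.
        ExecutorService exec = Executors.newFixedThreadPool(16);

        Single.fromCallable(() -> {
            Thread.sleep(1000); // simulated blocking IO
            return "done on " + Thread.currentThread().getName();
        })
        .subscribeOn(Schedulers.from(exec))
        .subscribe(System.out::println);

        Thread.sleep(1500); // let the async work finish before exiting
        exec.shutdown();
    }
}
Submitting 1000 such tasks now queues the work on 16 threads instead of spawning 1000, at the cost of the deadlock risk under resource exhaustion described above.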

JeroMQ shutdown correctly

I am wondering how to shut down JeroMQ properly. So far I know three methods that all have their pros and cons, and I have no clue which one is the best.
The situation:
Thread A: owns context, shall provide start/stop methods
Thread B: actual listener thread
My current method:
Thread A
static ZContext CONTEXT = new ZContext();
B listener;
Thread thread;

public void start() {
    listener = new B();
    thread = new Thread(listener);
    thread.start();
}

public void stop() throws InterruptedException {
    listener.stopping = true;
    thread.join();
}
Thread B
volatile boolean stopping = false; // volatile so the write from thread A is visible here
ZMQ.Socket socket;

public void run() {
    socket = CONTEXT.createSocket(ROUTER);
    ... // socket setup
    socket.setReceiveTimeout(10);
    while (!stopping) {
        socket.recv();
    }
    if (NUM_SOCKETS >= 1) {
        CONTEXT.destroySocket(socket);
    } else {
        CONTEXT.destroy();
    }
}
This works just great. 10 ms to shut down is no problem for me, but it unnecessarily increases the CPU load when no messages are received. At the moment I prefer this one.
The second method shares the socket between the two threads:
Thread A
static ZContext CONTEXT = new ZContext();
ZMQ.Socket socket;
B listener;
Thread thread;

public void start() {
    socket = CONTEXT.createSocket(ROUTER);
    ... // socket setup
    listener = new B(socket);
    thread = new Thread(listener);
    thread.start();
}

public void stop() {
    listener.stopping = true;
    CONTEXT.destroySocket(socket);
}
Thread B
volatile boolean stopping = false;
ZMQ.Socket socket;

public void run() {
    try {
        while (!stopping) {
            socket.recv();
        }
    } catch (ClosedSelectorException e) {
        // socket closed by A
        socket = null;
    }
    if (socket != null) {
        // close socket myself
        if (NUM_SOCKETS >= 1) {
            CONTEXT.destroySocket(socket);
        } else {
            CONTEXT.destroy();
        }
    }
}
This works like a charm too, but even when recv is already blocking, the exception sometimes does not get thrown. If I wait one millisecond after starting thread A, the exception is always thrown. I don't know if this is a bug or just an effect of my misuse, since I share the socket.
"revite" asked this question before (https://github.com/zeromq/jeromq/issues/116) and got an answer which is the third solution:
https://github.com/zeromq/jeromq/blob/master/src/test/java/guide/interrupt.java
Summary:
They call ctx.term() and interrupt the thread blocking in socket.recv().
This works fine, but I do not want to terminate my whole context, just this single socket. I would have to use one context per socket, so I would not be able to use inproc.
Summary
At the moment I have no clue how to get thread B out of its blocking state other than using timeouts, sharing the socket, or terminating the whole context.
What is the correct way of doing this?
It is often mentioned that you can just destroy the ZMQ context and anything sharing that context will exit. However, this creates a nightmare, because your exiting code has to do its best to avoid a minefield of accidental calls into dead socket objects.
Attempting to close the socket doesn't work either, because sockets are not thread safe and you'll end up with crashes.
ANSWER: The best way is to do as the ZeroMQ guide suggests for any use of sockets from multiple threads: use ZMQ sockets, not thread mutexes/locks/etc. Set up an additional listener socket that you connect and send something to on shutdown, and have your run() use a JeroMQ Poller to check which of the two sockets received anything; if the additional socket receives something, exit. A sketch follows below.
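A minimal sketch of that approach, assuming a recent JeroMQ (the SocketType enum and ZContext.createPoller come from newer releases; the inproc address and message contents are arbitrary):
// Thread B: poll a data socket and an inproc control socket.
ZMQ.Socket data = CONTEXT.createSocket(SocketType.ROUTER);
// ... socket setup
ZMQ.Socket control = CONTEXT.createSocket(SocketType.PAIR);
control.bind("inproc://shutdown"); // bind before thread A connects

ZMQ.Poller poller = CONTEXT.createPoller(2);
poller.register(data, ZMQ.Poller.POLLIN);
poller.register(control, ZMQ.Poller.POLLIN);

while (poller.poll() >= 0) {
    if (poller.pollin(1)) {
        break; // anything on the control socket means "shut down"
    }
    if (poller.pollin(0)) {
        byte[] msg = data.recv();
        // handle the message
    }
}
CONTEXT.destroySocket(data);
CONTEXT.destroySocket(control);

// Thread A: signal shutdown from its own socket.
ZMQ.Socket signal = CONTEXT.createSocket(SocketType.PAIR);
signal.connect("inproc://shutdown");
signal.send("STOP");
Because both sockets live in the same context, inproc works, and neither context termination nor socket sharing is needed.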
Old question, but just in case...
I'd recommend checking out the ZThread source. You should be able to create an instance of IAttachedRunnable that you can pass to the fork method; the run method of your instance will be passed a PAIR socket and execute in another thread, while fork will return the connected PAIR socket to use for communicating with the PAIR socket that your IAttachedRunnable got.
Check out the jeromq source here: even when you're doing a "blocking" recv, you're still burning CPU the entire time (the thread never sleeps). If you're worried about that, have the second thread sleep between polling and let the parent thread interrupt it. Something like (just the relevant portions):
Thread A
public void stop() {
    thread.interrupt();
    thread.join();
}
Thread B
while (!Thread.interrupted()) {
    socket.recv(); // do whatever
    try {
        Thread.sleep(10); // milliseconds
    } catch (InterruptedException e) {
        break;
    }
}
Also, with regard to your second solution, in general you should not share sockets between threads - the zeromq guide is pretty clear on this - "Don't share ØMQ sockets between threads. ØMQ sockets are not threadsafe." Remember that a major use for ZMQ is IPC - threads communicating through connected sockets, not sharing the same end of one socket. No need for things like shared boolean stop variables.

Memory Leak using Windows ThreadPool API

I am using Windows thread pools in my application, and am experiencing a memory leak of 136 bytes for every call to CreateThreadpoolWork(), as seen via UMDH:
+ 1257728 ( 1286424 - 28696) 9459 allocs BackTraceB0035CC
+ 9248 ( 9459 - 211) BackTraceB0035CC allocations
ntdll!RtlUlonglongByteSwap+B52
ntdll!TpAllocWork+8D
KERNEL32!CreateThreadpoolWork+25
... My Code ...
I am using a cleanup group, so per the documentation I am not calling CloseThreadpoolWork().
My code for handling the thread pool is:
typedef PTP_WORK ThreadHandle_t;
typedef PTP_WORK_CALLBACK THREAD_ENTRY_POINT_T;

static PTP_POOL pool = NULL;
static TP_CALLBACK_ENVIRON CallBackEnviron;
static PTP_CLEANUP_GROUP cleanupgroup = NULL;

int mtInitialize()
{
    InitializeThreadpoolEnvironment(&CallBackEnviron);
    pool = CreateThreadpool(NULL);
    if (NULL == pool)
    {
        return -1;
    }
    cleanupgroup = CreateThreadpoolCleanupGroup();
    if (NULL == cleanupgroup)
    {
        return -1;
    }
    SetThreadpoolCallbackPool(&CallBackEnviron, pool);
    SetThreadpoolCallbackCleanupGroup(&CallBackEnviron, cleanupgroup, NULL);
    return 0; // Success
}

void mtDestroy()
{
    CloseThreadpoolCleanupGroupMembers(cleanupgroup, FALSE, NULL);
    CloseThreadpoolCleanupGroup(cleanupgroup);
    DestroyThreadpoolEnvironment(&CallBackEnviron);
    CloseThreadpool(pool);
}

// Create thread
ThreadHandle_t mtRunThread(THREAD_ENTRY_POINT_T entry_point, void *thread_args)
{
    PTP_WORK work = CreateThreadpoolWork(entry_point, thread_args, &CallBackEnviron);
    if (NULL == work)
    {
        // CreateThreadpoolWork() failed.
        return 0;
    }
    SubmitThreadpoolWork(work);
    return work;
}

// Wait for a thread to finish
void mtWaitForThread(ThreadHandle_t thread)
{
    WaitForThreadpoolWorkCallbacks(thread, FALSE);
}
Am I doing something wrong?
Any ideas why I'm leaking memory?
I'm guessing you figured it out, given your comment, but the problem is that you only call CloseThreadpoolCleanupGroupMembers() in mtDestroy().
If you have a persistent thread pool, the memory will not be freed unless you call CloseThreadpoolCleanupGroupMembers() periodically. Your code and comments suggest that you do keep a persistent pool, though I can't confirm this without the code responsible for creating and destroying it.
My recommendation for persistent thread pools is to just call CloseThreadpoolWork() in the callback functions. Microsoft's recommendations work better if you're creating and destroying thread pools, but CloseThreadpoolWork() is simpler and easier than periodically calling CloseThreadpoolCleanupGroupMembers() if you're maintaining one thread pool for the life of your application.
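A minimal sketch of that pattern, reusing the callback signature implied by THREAD_ENTRY_POINT_T above (MyWorkCallback and do_the_actual_work are illustrative names, not part of the original code):
// Work callback that releases its own work object when done. Per the
// quote below, the pool revokes the object's cleanup-group membership
// before closing it, so a later CloseThreadpoolCleanupGroupMembers()
// won't try to close it again.
VOID CALLBACK MyWorkCallback(PTP_CALLBACK_INSTANCE instance, PVOID context, PTP_WORK work)
{
    do_the_actual_work(context); // hypothetical worker routine

    // Safe to call from inside the callback; the object is freed once the
    // callback returns. Don't use this handle again afterwards, which in
    // the code above means not passing it to mtWaitForThread().
    CloseThreadpoolWork(work);
}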
By the way, it's safe to do both as long as you tell CloseThreadpoolCleanupGroupMembers() to cancel any pending callbacks (pass fCancelPendingCallbacks as TRUE) to ensure CloseThreadpoolWork() is called on any cleaned up work items:
You can revoke the work object’s membership only by closing it, which
can be done on an individual basis with the CloseThreadpoolWork
function. The thread pool knows that the work object is a member of
the cleanup group and revokes its membership before closing it. This
ensures that the application doesn’t crash when the cleanup group
later attempts to close all of its members. The inverse isn’t true: If
you first instruct the cleanup group to close all of its members and
then call CloseThreadpoolWork on the now invalid work object, your
application will crash.
From Windows with C++ - Thread Pool Cancellation and Cleanup

iphone - how do I make a thread run faster

I have two methods that I need to run; let's call them metA and metB.
When I started coding this app, I called both methods without using threads, but the app started freezing, so I decided to go with threads.
metA and metB are called by touch events, so they can occur at any time in any order. They don't depend on each other.
My problem is the time it takes for either thread to start running. There's a lag between the time the thread is created with
[NSThread detachNewThreadSelector:@selector(.... bla bla
and the time the thread starts running.
I suppose this time is related to the amount of time required by iOS to create the thread itself. How can I speed this up? If I pre-create both threads, how do I make them just do their stuff when needed and never terminate? I mean, a kind of sleeping thread that is always alive, works when asked, and sleeps after that?
thanks.
If you want to avoid the expensive startup time of creating new threads, create both threads at startup as you suggested. To have them only run when needed, you can have them wait on a condition variable. Since you're using the NSThread class for threading, I'd recommend using the NSCondition class for condition variables (an alternative would be to use the POSIX threading (pthread) condition variables, pthread_cond_t).
One thing you'll have to be careful of is if you get another touch event while the thread is still running. In that case, I'd recommend using a queue to keep track of work items, and then the touch event handler can just add the work item to the queue, and the worker thread can process them as long as the queue is not empty.
Here's one way to do this:
typedef struct WorkItem
{
    // information about the work item
    ...
    struct WorkItem *next; // linked list of work items
} WorkItem;

WorkItem *workQueue = NULL;        // head of linked list of work items
WorkItem *workQueueTail = NULL;    // tail of linked list of work items
NSCondition *workCondition = NULL; // condition variable for the queue

...

-(id) init
{
    if((self = [super init]))
    {
        // Make sure this gets initialized before the worker thread starts
        // running
        workCondition = [[NSCondition alloc] init];
        // Start the worker thread
        [NSThread detachNewThreadSelector:@selector(threadProc:)
                                 toTarget:self withObject:nil];
    }
    return self;
}

// Suppose this function gets called whenever we receive an appropriate touch
// event
-(void) onTouch
{
    // Construct a new work item. Note that this must be allocated on the
    // heap (*not* the stack) so that it doesn't get destroyed before the
    // worker thread has a chance to work on it.
    WorkItem *workItem = (WorkItem *)malloc(sizeof(WorkItem));
    // fill out the relevant info about the work that needs to get done here
    ...
    workItem->next = NULL;
    // Lock the mutex & add the work item to the tail of the queue (we
    // maintain the invariant that
    // (workQueueTail == NULL || workQueueTail->next == NULL) always holds)
    [workCondition lock];
    if(workQueueTail != NULL)
        workQueueTail->next = workItem;
    else
        workQueue = workItem;
    workQueueTail = workItem;
    [workCondition unlock];
    // Finally, signal the condition variable to wake up the worker thread
    [workCondition signal];
}

-(void) threadProc:(id)arg
{
    // Loop & wait for work to arrive. Note that the condition variable must
    // be locked before it can be waited on. You may also want to add
    // another variable that gets checked every iteration so this thread can
    // exit gracefully if need be.
    while(1)
    {
        [workCondition lock];
        while(workQueue == NULL)
        {
            [workCondition wait];
            // The work queue should have something in it, but there are rare
            // edge cases that can cause spurious signals. So double-check
            // that it's not empty.
        }
        // Dequeue the work item & unlock the mutex so we don't block the
        // main thread more than we have to
        WorkItem *workItem = workQueue;
        workQueue = workQueue->next;
        if(workQueue == NULL)
            workQueueTail = NULL;
        [workCondition unlock];
        // Process the work item here
        ...
        free(workItem); // don't leak memory
    }
}
If you can target iOS 4 and higher, consider using blocks with a Grand Central Dispatch async queue, which runs them on background threads that the queue manages; or, for backwards compatibility, use NSOperations inside an NSOperationQueue to have bits of work performed for you in the background. You can specify exactly how many background threads an NSOperationQueue should support if both operations have to run at the same time.
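A minimal GCD sketch of that approach (onTouchA is an illustrative handler name; metA is one of the two methods from the question):
// Runs metA on a GCD-managed background thread. There is no visible
// thread-creation lag, because GCD dispatches onto pooled threads.
- (void)onTouchA
{
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        [self metA]; // the heavy work happens off the main thread
        dispatch_async(dispatch_get_main_queue(), ^{
            // update the UI with the results here, back on the main thread
        });
    });
}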