How to cancel queue_delayed_work immediately? - linux-device-driver

My driver calls queue_delayed_work to schedule some work:
queue_delayed_work(queue, work, 60000);
It then stops the work once certain conditions are met:
cancel_delayed_work_sync(work);
flush_delayed_work(work);
flush_workqueue(queue);
And the worker function:
static void worker(struct work_struct *work) {
printk("this is worker function!\n");
...
queue_delayed_work(queue, work, 60000);
}
But I find that the worker function can still be triggered after I stop the work queue (cancel and flush).
Why does this happen, and how can I avoid it?
Thank you!

If you need to cancel a work entry submitted to the shared queue, you may use cancel_delayed_work.
Flushing the shared workqueue requires a separate function:
void flush_scheduled_work(void)

Related

SubscribeOn does not change the thread pool for the whole chain

I want to trigger a longer-running operation via a REST request and WebFlux. The call should just return a notice that the operation has started. I want to run the long-running operation on a different scheduler (e.g. Schedulers.single()). To achieve that I used subscribeOn:
Mono<RecalculationRequested> recalculateAll() {
return provider.size()
.doOnNext(size -> log.info("Size: {}", size))
.doOnNext(size -> recalculate(size))
.map(RecalculationRequested::new);
}
private void recalculate(int toRecalculateSize) {
Mono.just(toRecalculateSize)
.flatMapMany(this::toPages)
.flatMap(page -> recalculate(page))
.reduce(new RecalculationResult(), RecalculationResult::increment)
.subscribeOn(Schedulers.single())
.subscribe(result -> log.info("Result of recalculation - success:{}, failed: {}",
result.getSuccess(), result.getFailed()));
}
private Mono<RecalculationResult> recalculate(RecalculationPage pageToRecalculate) {
return provider.findElementsToRecalculate(pageToRecalculate.getPageNumber(), pageToRecalculate.getPageSize())
.flatMap(this::recalculateSingle)
.reduce(new RecalculationResult(), RecalculationResult::increment);
}
private Mono<RecalculationResult> recalculateSingle(ElementToRecalculate elementToRecalculate) {
return recalculationTrigger.recalculate(elementToRecalculate)
.doOnNext(result -> {
log.info("Finished recalculation for element: {}", elementToRecalculate);
})
.doOnError(error -> {
log.error("Error during recalculation for element: {}", elementToRecalculate, error);
});
}
From the above I want to call:
private void recalculate(int toRecalculateSize)
in a different thread. However, it does not run on the single-threaded scheduler; it uses a different thread pool. I would expect subscribeOn to change the thread for the whole chain. What should I change to execute it on the single-threaded scheduler, and why?
Just to mention - method:
provider.findElementsToRecalculate(...)
uses WebClient to get elements.
One caveat of subscribeOn is it does what it says: it runs the act of "subscribing" on the provided Scheduler. Subscribing flows from bottom to top (the Subscriber subscribes to its parent Publisher), at runtime.
Usually you see in documentation and presentations that subscribeOn affects the whole chain. That is because most operators / sources will not themselves change threads, and by default will start sending onNext/onComplete/onError signals from the thread from which they were subscribed to.
But as soon as one operator switches threads in that top-to-bottom data path, the reach of subscribeOn stops there. A typical example is a publishOn in the chain.
The source of data in this case is reactor-netty and netty, which operate on their own threads and thus act as if there was a publishOn at the source.
For WebFlux, I'd say favor using publishOn in the main chain of operators, or alternatively use subscribeOn inside of inner chains, like inside flatMap.
As per the documentation, operators prefixed with doOn are sometimes referred to as having a "side-effect". They let you peek inside the sequence's events without modifying them.
If you want to chain the 'recalculate' step after 'provider.size()' do it with flatMap.

Is it possible to avoid nested RetryLoop.callWithRetry calls so I get a consistent timeout?

I've configured a reasonable timeout using BoundedExponentialBackoffRetry, and generally it works as I'd expect if ZK is down when I make a call like "create.forPath". But if ZK is unavailable when I call acquire on an InterProcessReadWriteLock, it takes far longer before it finally times out.
I call acquire, which is wrapped in "RetryLoop.callWithRetry", and it goes on to call findProtectedNodeInForeground, which is also wrapped in "RetryLoop.callWithRetry". If I've configured the BoundedExponentialBackoffRetry to retry 20 times, the inner loop retries 20 times for every one of the 20 outer retries, so it retries 400 times.
We really need a consistent timeout after which we fail. Have I done anything wrong, or is there any way around this? If not, I guess I'll call the troublesome methods in a new thread that I can kill after my own timeout.
Here is sample code to recreate it. I set breakpoints at the lines following the comments, bring ZK down, then let it continue and take the stack trace while it is retrying.
public class GoCurator {
public static void main(String[] args) throws Exception {
CuratorFramework cf = CuratorFrameworkFactory.newClient(
"localhost:2181",
new BoundedExponentialBackoffRetry(200, 10000, 20)
);
cf.start();
String root = "/myRoot";
if(cf.checkExists().forPath(root) == null) {
// Stacktrace A showing what happens if ZK is down for this call
cf.create().forPath(root);
}
InterProcessReadWriteLock lock = new InterProcessReadWriteLock(cf, "/grant/myLock");
// See stacktrace B showing the nested re-try if ZK is down for this call
lock.readLock().acquire();
lock.readLock().release();
System.out.println("done");
}
}
Stacktrace A (if ZK is down when I'm calling create().forPath). This shows the single retry loop, so it exits after the correct number of attempts:
java.lang.Thread.State: WAITING
at java.lang.Object.wait(Object.java:-1)
at java.lang.Object.wait(Object.java:502)
at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1499)
at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1487)
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2617)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:242)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:231)
at org.apache.curator.connection.StandardConnectionHandlingPolicy.callWithRetry(StandardConnectionHandlingPolicy.java:64)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:100)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl.pathInForeground(GetChildrenBuilderImpl.java:228)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:219)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:41)
at com.gebatech.curator.GoCurator.main(GoCurator.java:25)
Stacktrace B (if ZK is down when I call InterProcessReadWriteLock#readLock#acquire). This shows the nested re-try loop so it doesn't exit until 20*20 attempts.
java.lang.Thread.State: WAITING
at sun.misc.Unsafe.park(Unsafe.java:-1)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
at org.apache.curator.CuratorZookeeperClient.internalBlockUntilConnectedOrTimedOut(CuratorZookeeperClient.java:434)
at org.apache.curator.connection.StandardConnectionHandlingPolicy.callWithRetry(StandardConnectionHandlingPolicy.java:56)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:100)
at org.apache.curator.framework.imps.CreateBuilderImpl.findProtectedNodeInForeground(CreateBuilderImpl.java:1239)
at org.apache.curator.framework.imps.CreateBuilderImpl.access$1700(CreateBuilderImpl.java:51)
at org.apache.curator.framework.imps.CreateBuilderImpl$17.call(CreateBuilderImpl.java:1167)
at org.apache.curator.framework.imps.CreateBuilderImpl$17.call(CreateBuilderImpl.java:1156)
at org.apache.curator.connection.StandardConnectionHandlingPolicy.callWithRetry(StandardConnectionHandlingPolicy.java:64)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:100)
at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:1153)
at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:607)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:597)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:575)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:51)
at org.apache.curator.framework.recipes.locks.StandardLockInternalsDriver.createsTheLock(StandardLockInternalsDriver.java:54)
at org.apache.curator.framework.recipes.locks.LockInternals.attemptLock(LockInternals.java:225)
at org.apache.curator.framework.recipes.locks.InterProcessMutex.internalLock(InterProcessMutex.java:237)
at org.apache.curator.framework.recipes.locks.InterProcessMutex.acquire(InterProcessMutex.java:89)
at com.gebatech.curator.GoCurator.main(GoCurator.java:29)
This turns out to be a real, longstanding problem with how Curator uses retries. I have a fix and PR ready here: https://github.com/apache/curator/pull/346 - I'd appreciate more eyes on it.
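To make the 20 x 20 arithmetic from the question concrete, here is a minimal plain-Java simulation. It does not use Curator at all: callWithRetry below is a hypothetical stand-in for a retry loop, and the always-failing task stands in for a ZooKeeper call while ZK is down. It only shows how attempts multiply when one retry loop is nested inside another.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

public class NestedRetryDemo {

    // Hypothetical stand-in for a retry loop (NOT Curator's RetryLoop):
    // runs the task up to maxAttempts times and rethrows the last failure.
    static <T> T callWithRetry(int maxAttempts, Callable<T> task) throws Exception {
        Exception last = new IllegalArgumentException("maxAttempts must be >= 1");
        for (int i = 0; i < maxAttempts; i++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    // Counts how often the innermost operation runs when it always fails
    // and one retry loop is nested inside another.
    static int countAttempts(int outerAttempts, int innerAttempts) {
        AtomicInteger attempts = new AtomicInteger();
        Callable<Void> failingCall = () -> {
            attempts.incrementAndGet();
            throw new IllegalStateException("simulated connection loss");
        };
        Callable<Void> innerLoop = () -> callWithRetry(innerAttempts, failingCall);
        try {
            callWithRetry(outerAttempts, innerLoop);
        } catch (Exception expected) {
            // every attempt fails by design
        }
        return attempts.get();
    }

    public static void main(String[] args) {
        System.out.println(countAttempts(20, 20)); // prints 400
    }
}
```

With a backoff sleep between attempts, the total wait grows by roughly the same factor, which matches the much longer timeout seen on acquire.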

Callback when the query has finished processing in Siddhi

I am writing a small CEP program using Siddhi. I can add a callback that fires whenever a given filter outputs data, like this:
executionPlanRuntime.addCallback("query1", new QueryCallback() {
@Override
public void receive(long timeStamp, Event[] inEvents, Event[] removeEvents) {
EventPrinter.print(inEvents);
System.out.println("data received after processing");
}
});
But is there a way to know that the filter has finished processing and won't trigger the above callback any more? Something like didFinish. I think that would be the ideal place for shutting down the SiddhiManager and ExecutionPlanRuntime instances.
No. There is no such functionality, and it cannot be supported in the future either. The rationale is that in real-time stream processing, queries process the incoming stream and emit an output stream. There is no concept of 'finished processing'; a query keeps processing events as long as there is input.
Since your requirement is to shut down SiddhiManager and ExecutionPlanRuntime, the recommended way is to do this inside a cleanup method of your program. Alternatively, you can write some Java code inside the callback to count responses, or wait for a timeout, and then call shutdown. Hope this helps!
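The "count responses or wait for a timeout" idea could look roughly like the sketch below. It is plain Java with no Siddhi calls: onReceive stands in for the body of QueryCallback.receive(...), the shutdown Runnable is a placeholder for whatever shuts down your runtime and manager, and expectedEvents is an assumption (your program has to know how many outputs it expects).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class CallbackCountdownDemo {

    // Waits until the latch reaches zero or the timeout elapses, then runs the
    // shutdown action in either case. Returns true only if every expected
    // callback arrived before the timeout.
    static boolean awaitThenShutdown(CountDownLatch remaining, long timeoutMillis, Runnable shutdown) {
        try {
            return remaining.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        } finally {
            shutdown.run();
        }
    }

    public static void main(String[] args) {
        int expectedEvents = 3; // assumption: the expected number of outputs is known
        CountDownLatch remaining = new CountDownLatch(expectedEvents);

        // Stand-in for the callback body: one countDown per callback invocation.
        Runnable onReceive = remaining::countDown;

        // Simulate the engine delivering outputs from another thread.
        Thread engine = new Thread(() -> {
            for (int i = 0; i < expectedEvents; i++) {
                onReceive.run();
            }
        });
        engine.start();

        // Real code would shut down the runtime and the manager in this Runnable.
        boolean complete = awaitThenShutdown(remaining, 1000,
                () -> System.out.println("shutting down"));
        System.out.println("all events received: " + complete);
    }
}
```

The timeout matters because, as said above, the engine itself never signals completion; without it the program would wait forever if fewer outputs arrive than expected.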

boost::asio::io_service::run() is not exiting when I call boost::asio::io_service::stop()

Hi, I have written a simple application that uses the asynchronous socket functions, and I am having problems closing the socket.
I start a 5 second timer before calling async_connect on the socket. In some cases the connection does not complete and the timer expires. When the timer expires I close the socket with tcp_socket.close(). However, my connect handler is never called with the boost::asio::error::operation_aborted error, even when I cancel the socket instead of closing it. The same thing happens for all subsequent async connect calls.
Even though I close the TCP socket and destroy the client_session object, the join() call on the worker thread never returns, which means io_service::run() is still running. I don't know why this is happening; I have tried many other approaches and still face the same problem.
I don't understand what the problem is. All suggestions and solutions will be appreciated.
My real code looks something like this:
class client_session
{
public:
client_session(boost::asio::io_service& io_service_) : tcp_socket_(io_service_),
timer_(io_service_)
{
}
~client_session()
{
tcp_socket_.close();
}
void OnTimerExpired(const boost::system::error_code& err)
{
if( err ) tcp_socket_.close();
}
//Its just for example this will be called from upper layer of my module. giving some information about the server.
void connect()
{
//Here am starting the timer
timer_.expires_from_now(boost::posix_time::seconds(2));
timer_.async_wait(boost::bind(&client_session::OnTimerExpired, this, boost::asio::placeholders::error));
.......
tcp_socket_.async_connect(iterator->endpoint(), boost::bind(&client_session::OnConnect, this, boost::asio::placeholders::error));
......
}
void OnConnect(const boost::system::error_code& err)
{
//Cancelling the timer
timer_.cancel();
.....
//Register for write to send the request to server
......
}
private:
tcp::socket tcp_socket_;
deadline_timer timer_;
};
int main()
{
boost::asio::io_service tcp_io_service;
boost::asio::io_service::work tcp_work(tcp_io_service);
boost::thread* worker = new boost::thread(&boost::asio::io_service::run,&tcp_io_service);
client_session* csession = new client_session(tcp_io_service);
csession->connect();
sleep(10);
tcp_io_service.stop();
delete csession;
worker->join(); //Here it is not coming out of join because io_service::run() has not exited yet.
cout<<"Termination successful"<<endl;
}
There seem to be a couple of different things wrong with the posted code. I would suggest starting with smaller steps, along these lines:
1. Start and stop the asio worker thread cleanly (see the explanation below).
2. Add code to start the timer: handle OnTimerExpired correctly and check the error code.
3. Add the code for async_connect: when the connect handler is called, cancel the timer and check the error code.
4. Add the other asynchronous operations, etc.
For one, when you cancel the timer in the connect handler, the OnTimerExpired handler will be invoked with boost::asio::error::operation_aborted and you then close the socket, which is probably not what you want.
Further, you give the io_service work, yet still call stop. Generally, if you give the io_service work, you want to stop the execution thread by removing the work (for example, by storing the work object in a smart pointer and resetting it) and letting the asynchronous operations already started finish cleanly.

Code with a potential deadlock

// down = acquire the resource
// up = release the resource
typedef int semaphore;
semaphore resource_1;
semaphore resource_2;
void process_A(void) {
down(&resource_1);
down(&resource_2);
use_both_resources();
up(&resource_2);
up(&resource_1);
}
void process_B(void) {
down(&resource_2);
down(&resource_1);
use_both_resources();
up(&resource_1);
up(&resource_2);
}
Why can this code cause a deadlock?
If we change the code of process_B so that both processes ask for the resources in the same order:
void process_B(void) {
down(&resource_1);
down(&resource_2);
use_both_resources();
up(&resource_2);
up(&resource_1);
}
Then there is no deadlock.
Why so?
Imagine that process A is running, tries to get resource_1, and gets it.
Now process B takes control, tries to get resource_2, and gets it. Process B then tries to get resource_1 and does not get it, because it belongs to process A. Process B goes to sleep.
Process A gets control again and tries to get resource_2, but it belongs to process B, so process A goes to sleep too.
At this point, process A is waiting for resource_2 and process B is waiting for resource_1.
If you change the order, process B will never lock resource_2 unless it gets resource_1 first, and the same holds for process A.
They will never deadlock.
A necessary condition for a deadlock is a cycle of resource acquisitions. The first example constructs such a cycle, 1->2->1. The second example acquires the resources in a fixed order, which makes a cycle, and hence a deadlock, impossible.
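As a runnable illustration of the fixed-order rule, here is a small Java sketch. It swaps the semaphores from the question for java.util.concurrent ReentrantLock objects, and the class and method names are made up for the example. Both threads take resource1 before resource2, so neither can hold resource2 while waiting for resource1, and the run always finishes.

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderingDemo {
    static final ReentrantLock resource1 = new ReentrantLock();
    static final ReentrantLock resource2 = new ReentrantLock();
    static int uses = 0; // only touched while both locks are held

    // Every caller acquires resource1 first and resource2 second,
    // and releases them in the reverse order.
    static void useBothResources() {
        resource1.lock();
        try {
            resource2.lock();
            try {
                uses++;
            } finally {
                resource2.unlock();
            }
        } finally {
            resource1.unlock();
        }
    }

    // Runs two "processes" (threads) that each use both resources
    // `iterations` times, and returns the total number of uses.
    static int runBoth(int iterations) {
        uses = 0;
        Runnable process = () -> {
            for (int i = 0; i < iterations; i++) {
                useBothResources();
            }
        };
        Thread a = new Thread(process, "process_A");
        Thread b = new Thread(process, "process_B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return uses;
    }

    public static void main(String[] args) {
        System.out.println(runBoth(100000)); // prints 200000
    }
}
```

If one of the two threads instead locked resource2 first, the interleaving described above could leave each thread holding one lock and waiting for the other.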