I have been through a lot of material explaining the fine line between non-blocking and asynchronous I/O, but most of it pertains to server-side programming, where it makes sense. But could the non-blocking vs. asynchronous I/O distinction also be relevant for a REST client?
I have spent a lot of time on the net but, I'm afraid, am still unable to grasp the difference, or rather the significance, of having a non-blocking REST client. I am mainly concerned with Jersey Client. Yes, the API says that it supports an asynchronous client and that a blocking get() call on the returned Future can be avoided by polling Future.isDone() (ref: https://jersey.java.net/documentation/latest/async.html), but if my REST client makes a POST request, how is that handled? Is it just asynchronous, or is it asynchronous and non-blocking as well? I would be very grateful if anyone could offer an insight.
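For context, this is the kind of call I mean, as a minimal sketch using the standard JAX-RS 2.x async invoker (the endpoint URL and payload here are made up):
import java.util.concurrent.Future;
import javax.ws.rs.client.Client;
import javax.ws.rs.client.ClientBuilder;
import javax.ws.rs.client.Entity;
import javax.ws.rs.client.InvocationCallback;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

public class AsyncPostExample {
    public static void main(String[] args) {
        Client client = ClientBuilder.newClient();
        // async() hands the request to the client's internal executor and
        // returns a Future immediately, so this thread is not held up.
        Future<Response> future = client
                .target("http://example.com/api/things") // made-up endpoint
                .request(MediaType.APPLICATION_JSON)
                .async()
                .post(Entity.json("{\"name\":\"demo\"}"), new InvocationCallback<Response>() {
                    @Override
                    public void completed(Response response) {
                        System.out.println("POST completed: " + response.getStatus());
                    }

                    @Override
                    public void failed(Throwable throwable) {
                        throwable.printStackTrace();
                    }
                });
        // ... other work can happen here; future.get() would block,
        // future.isDone() would not.
    }
}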
Thanks and regards
I recommend reading this article from Microsoft on the topic.
The term "non-blocking" really means the same thing whether it is happening on the server side or the client side. As the article above points out, blocking means that the thread you make the call on stops processing until the call has finished. The Task async pattern described in the article makes it possible to await a Rest call instead of blocking the thread. Generally, if a method returns a Task, and the method is suffixed with async, it will be a non-blocking call. This means that it can be ran concurrently with other calls so that calls do not need to be ran in a sequence. They can be ran in parallel.
Here is an example where non-blocking REST calls are made in parallel and awaited together so that they do not block each other. A simple for loop, with no explicit parallelism, achieves this because it kicks the calls off in sequence but does not block on any of them.
var tasks = new List<Task<Response<Person>>>();
var client = clientFactory.CreateClient();
// Start all 100 requests without awaiting any of them; each GetAsync
// returns a Task immediately while the request runs in the background.
for (var i = 0; i < 100; i++)
{
    tasks.Add(client.GetAsync<Person>(new Uri("JsonPerson", UriKind.Relative)));
}
// Await all the in-flight requests together; the calling thread is freed
// rather than blocked while they complete.
var results = await Task.WhenAll(tasks);
I have a reactive pipeline to process incoming requests. For each request I need to call a business-relevant function (doSomeRelevantProcessing).
After that is done, I need to notify some external service about what happened. That part of the pipeline should not increase the overall response time.
Also, notifying this external system is not business critical: giving a quick response after the main part of the pipeline is finished is more important than making sure the notification is successful.
As far as I have learned, the only way to run something in the background without slowing down the overall process is to subscribe to it directly inside the pipeline, thus achieving fire-and-forget behaviour.
Is there a good alternative to subscribing inside the flatMap?
I am a little worried about what might happen if notifying the external service takes longer than the original processing and a lot of requests come in at once. Could this lead to memory exhaustion, or cause the overall process to block?
fun runPipeline(incoming: Mono<Request>) = incoming
    .flatMap { doSomeRelevantProcessing(it) } // this should not be delayed
    .flatMap { doBackgroundJob(it) } // this can take a moment, but is not super critical

fun doSomeRelevantProcessing(request: Request) = Mono.just(request) // do some processing

fun doBackgroundJob(request: Request) = Mono.deferContextual { ctx: ContextView ->
    val notification = "notification" // build an object from context
    // this uses non-blocking HTTP (i.e. webclient), so it can take a second or so
    notifyExternalService(notification).subscribeOn(Schedulers.boundedElastic()).subscribe()
    Mono.just(Unit)
}

fun notifyExternalService(notification: String) = Mono.just(Unit) // might take a while
I'm answering this assuming that you notify the external service using purely reactive mechanisms, i.e. you're not wrapping a blocking service. If you are wrapping a blocking call, the answer would be different: you would be bound by the size of your bounded elastic thread pool, which could quickly become overwhelmed if you have hundreds of requests a second coming in.
(Assuming you are using reactive mechanisms, there is also no need for the .subscribeOn(Schedulers.boundedElastic()) in your example. It is not buying you anything, as that scheduler is designed for wrapping legacy blocking services.)
Could this lead to memory exhaustion
That is only a possibility in really extreme cases; the memory used by each individual request will be tiny. It is almost certainly not worth worrying about: if you start seeing memory issues here, you will almost certainly be hit by other issues elsewhere first.
That being said, I would recommend adding .timeout(Duration.ofSeconds(5)) or similar before the inner subscribe() call, to make sure the requests are killed off after a while if they have not completed for any reason. This will prevent them from building up.
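A minimal sketch of where that timeout would sit, written against the plain Java Reactor API (the Kotlin version is the same operator chain; notifyExternalService here is the stand-in from your question):
import java.time.Duration;
import reactor.core.publisher.Mono;

public class Notifier {
    // Fire-and-forget with a safety net: the timeout cancels the inner
    // subscription if the notification has not completed within 5 seconds,
    // so abandoned requests cannot pile up.
    void fireAndForgetNotify(String notification) {
        notifyExternalService(notification)
                .timeout(Duration.ofSeconds(5))
                .subscribe(
                        result -> { },                   // ignore the result
                        error -> error.printStackTrace() // or log the failure
                );
    }

    Mono<String> notifyExternalService(String notification) {
        return Mono.just(notification); // stand-in, as in the question
    }
}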
...or [can this cause] the overall process to block?
This one is easier - a short no, it can't.
I am trying to understand the difference between the 2 methods, in terms of functionality.
class MyService(blockService: BlockService) {

  def doSomething1(): Future[Boolean] = {
    // do
    // some non-blocking
    // stuff
    val result = blockService.block()
    Future.successful(result)
  }

  def doSomething2(): Future[Boolean] = {
    Future {
      // do
      // some non-blocking
      // stuff
      blockService.block()
    }
  }
}
To my understanding, the difference between the two is which thread actually gets blocked.
So if thread_1 executes doSomething1, thread_1 will be the one that is blocked, while if thread_1 executes doSomething2, the body will run on a new thread, thread_2, and thread_2 is the one that is blocked.
Is this true?
If so, is there really no preferred way to write this code? If I don't care which thread eventually gets blocked, the end result is the same.
doSomething1 seems like a weird way to write this code; I would choose doSomething2.
Does that make sense?
Yes, doSomething1 and doSomething2 block different threads, but depending on your scenario this is an important decision.
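For anyone more comfortable with Java than Scala, here is the same distinction sketched with CompletableFuture (BlockService is a hypothetical stand-in for the service from the question):
import java.util.concurrent.CompletableFuture;

class MyJavaService {
    interface BlockService { boolean block(); } // hypothetical blocking service

    private final BlockService blockService;

    MyJavaService(BlockService blockService) {
        this.blockService = blockService;
    }

    // Analogue of doSomething1: block() runs on the caller's thread; only
    // the already-computed result is wrapped in a completed future.
    CompletableFuture<Boolean> doSomething1() {
        boolean result = blockService.block(); // the caller is blocked here
        return CompletableFuture.completedFuture(result);
    }

    // Analogue of doSomething2: block() is submitted to another thread (the
    // common pool by default), so the caller returns immediately.
    CompletableFuture<Boolean> doSomething2() {
        return CompletableFuture.supplyAsync(blockService::block);
    }
}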
As @AndreasNeumann said, you can have different execution contexts in doSomething2. Imagine that the main execution context is the one receiving HTTP requests from your users. Blocking threads in this context is bad because you can easily exhaust the execution context and impact requests that have nothing to do with doSomething.
The Play docs have a good explanation of the possible problems with blocking code:
If you plan to write blocking IO code, or code that could potentially do a lot of CPU intensive work, you need to know exactly which thread pool is bearing that workload, and you need to tune it accordingly. Doing blocking IO without taking this into account is likely to result in very poor performance from Play framework, for example, you may see only a few requests per second being handled, while CPU usage sits at 5%. In comparison, benchmarks on typical development hardware (eg, a MacBook Pro) have shown Play to be able to handle workloads in the hundreds or even thousands of requests per second without a sweat when tuned correctly.
In your case, both methods are executed using Play's default thread pool. I suggest you take a look at the recommended best practices and see whether you need a different execution context or not. I also suggest you read the Akka docs about Dispatchers and Futures to gain a better understanding of what executing Futures means and of how blocking and non-blocking code behaves.
This approach makes sense if you use different execution contexts in the second method: for example, one for answering requests and another for blocking operations. That way you use the normal Play execution context to keep your application running and answering requests, and run the blocking operations on a separate one.
def doSomething2(): Future[Boolean] = Future {
  // blocking { ... } marks this call as blocking so the execution context
  // can compensate, e.g. by spawning an extra thread
  blocking { blockService.block() }
}(mySpecialExecutionContextForBlockingOperations)
For a little more information: http://docs.scala-lang.org/overviews/core/futures.html#blocking
You are correct. I don't see a point in doSomething1. It simply complicates the interface for the caller while not providing the benefits of an asynchronous API.
Does BlockService perform a blocking operation? If it does then, as @Andreas pointed out, it is meaningful to wrap the call in blocking { ... } so that the blocking operation is moved onto a thread meant for it.
Basically, I'm looking to respond to a SOAP request immediately, but also kick off further processing. What I'm seeing is that the response is not sent until the route ends. In other words:
from("cxf:bean:someEndpoint")
.to("seda:replySOAP")
.to("direct:ABCMessage");
from("seda:replySOAP")
.to("bean:soapReply?method=process").end();
from("direct:ABCMessage")
.process(new ConvertABCToNZFCY())
.to("bean:prelimNZFCYCall")
.end();
This does not generate the response until "direct:ABCMessage" has completed. I would have thought seda implied asynchronous processing. I have also tried "vm:replySOAP", pointing to a separate CamelContext, and this did not help.
I have also tried multicast, to no avail:
from("cxf:bean:someEndpoint")
.multicast().parallelProcessing()
.to("seda:replySOAP")
.to("direct:ABCMessage");
What DOES work for me is wireTap, but it does not seem elegant:
from("cxf:bean:someEndpoint")
.wireTap("direct:ABCMessage")
.to("direct:replySOAP");
Must I use JMS?
Thanks!
The behavior you see is due to the
.to("direct:ABCMessage");
in the routes. direct: is synchronous, i.e. an InOut exchange pattern, so the route waits for it to complete before replying. JMS can be used, but that may be overkill if you are using it only to avoid wireTap. Why do you think wireTap is not elegant?
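wireTap is in fact Camel's built-in fire-and-forget EIP: it sends a copy of the exchange to the tapped endpoint on a separate thread and does not wait for it. Roughly, reusing the endpoints from your question, this is all you need:
from("cxf:bean:someEndpoint")
    // a copy of the exchange is processed elsewhere; the route does not wait
    .wireTap("direct:ABCMessage")
    // the SOAP reply is built and returned right away
    .to("bean:soapReply?method=process");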
I've read through the boost::asio documentation (which appears silent on async clients) and looked through the questions here, but I can't seem to see the forest for the trees.
I've got a simulation that has a main loop that looks like this:
for (;;)
{
    a = do_stuff1();
    do_stuff2(a);
}
Easy enough.
What I'd like to do, is modify it so that I have:
for (;;)
{
    a = do_stuff1();
    check_for_new_received_udp_data(&b);
    modify_a_with_data_from_b(a, b);
    do_stuff2(a);
}
Where I have the following requirements:
I cannot lose data just because I wasn't actively listening, i.e. I don't want to lose packets because I was in do_stuff2() instead of check_for_new_received_udp_data() at the time the server sent the packet.
I can't have check_for_new_received_udp_data() block for more than about 2ms, since the main for loop needs to execute at 60Hz.
The server will be running elsewhere and has a completely erratic schedule. Sometimes there will be no data, other times I may get the same packet repeatedly.
I've played with the async UDP, but that requires calling io_service.run(), which blocks indefinitely, so that doesn't really help me.
I thought about timing out a blocking socket read, but it seems you have to cheat and get out of the boost calls to do that, so that's a non-starter.
Is the answer going to involve threading? Either way, could someone kindly point me to an example that is somewhat similar? Surely this has been done before.
To avoid blocking in io_service::run(), you can use io_service::poll_one(): call it once per iteration of your 60 Hz loop, and it will run at most one ready handler and return immediately rather than block.
Regarding losing UDP packets, I think you are out of luck. UDP does not guarantee delivery, and any part of the network may decide to drop UDP packets when there is a lot of traffic. If you need to ensure delivery, you either need to implement some sort of flow control or just use TCP.
I think your problem is that you're still thinking synchronously. You need to think asynchronously.
An async read on the UDP socket will call your handler when data arrives.
Within that handler, do your processing on the incoming data. Keep in mind that while you're processing, if you have a single thread, nothing else dispatches. This can be perfectly OK (UDP messages will still be queued in the network stack...).
As a result of this processing you could start other asynchronous operations.
If you need to do work in parallel that is essentially unrelated or offline, that will involve threads: create a thread that calls io_service.run().
If you need to do periodic work in an async framework, use timers.
In your particular example we can rearrange things like this (pseudo-code):
read_handler( ... )
{
    modify_a_with_data_from_b(a, b);
    do_stuff2(a);
    a = do_stuff1();
    udp->async_read( ..., read_handler );  // re-arm the read for the next packet
}

periodic_handler( ... )
{
    // do periodic stuff
    timer.async_wait( ..., periodic_handler );  // re-arm the timer
}

main()
{
    ...
    a = do_stuff1();
    udp->async_read( ..., read_handler );
    timer.async_wait( ..., periodic_handler );
    io_service.run();
}
Now, I'm sure there are other requirements that aren't evident from your question, but you'll need to figure out an asynchronous answer to them; this is just an idea. Also ask yourself whether you really need an asynchronous framework, or whether you can just use the synchronous socket APIs.
I have a Scala application that relies on an external RESTful webservice for some part of its functionality. We'd like to do some performance tests on the application, so we stub out the webservice with an internal class that fakes the response.
One thing we would like to keep in order to make the performance test as realistic as possible is the network lag and response time from the remote host. This is between 50 and 500 msec (we measured).
Our first attempt was simply to do a Thread.sleep(random.nextInt(450) + 50); however, I don't think that's accurate: we use NIO, which is non-blocking, whereas Thread.sleep is blocking and ties up the whole thread.
Is there a (relatively easy / short) way to stub a method that contacts an external resource, then returns and calls a callback object when ready? The bit of code we would like to replace with a stub implementation is as follows (using Sonatype's AsyncHttpClient), where we wrap its completion handler object in one of our own that does some processing:
def getActualTravelPlan(trip: Trip, completionHandler: AsyncRequestCompletionHandler) {
  val client = clientFactory.getHttpClient
  val handler = new TravelPlanCompletionHandler(completionHandler)
  // non-blocking call here.
  client.prepareGet(buildApiURL(trip)).setRealm(realm).execute(handler)
}
Our current implementation does a Thread.sleep in the method, but that's, like I said, blocking and thus wrong.
Use a ScheduledExecutorService. It allows you to schedule things to run at some time in the future, and Executors has factory methods for creating one fairly simply.
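A rough sketch of what the stub could look like (in Java for brevity; the Trip and completion-handler types from your question are simplified here to a String and a Runnable):
import java.util.Random;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class FakeTravelPlanService {
    private final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
    private final Random random = new Random();

    // Returns immediately; the completion handler fires after a random
    // 50-500 ms delay, mimicking the measured network lag without blocking
    // the calling thread the way Thread.sleep would.
    public void getActualTravelPlan(String trip, Runnable completionHandler) {
        long delayMs = 50 + random.nextInt(450);
        scheduler.schedule(completionHandler, delayMs, TimeUnit.MILLISECONDS);
    }
}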