Making parallel requests from within Spring Boot + WebFlux - reactive-programming

I'm playing around with Spring Boot 2 with the WebFlux stack.
Within the application I'm looking to make multiple HTTP requests in parallel to downstream services to reduce the overall response time back to the client. Is this possible without playing around with threads?
I'm currently using org.springframework.web.reactive.function.client.WebClient but am open to other clients that would support this, or even RxJava.

I managed to achieve it with something like the below. It's a naive example, but the async HTTP requests are made in downstream.request1() and downstream.request2(). If there is a more elegant way to achieve this I'd be interested.
#GetMapping("/sample")
public Mono<String> getMultipleRequests() {
Mono<String> monoResponse1 = downstream.request1();
Mono<String> monoResponse2 = downstream.request2();
return Mono.zip(monoResponse1, monoResponse2)
.flatMap(a -> myTransform(a));
}
private Mono<String> myTransform(Tuple2<String, String> tuple) {
String t1 = tuple.getT1();
String t2 = tuple.getT2();
return Mono.just(t1 + t2);
}

Related

Why does my Spring WebFlux controller return data on first request only?

I am working on a web application where the user's connection times out after a specific time (say 20 seconds). For long-running requests I have to return a default message ("your request is under process") and then send an email to the user with the actual result.
I couldn't do this with Spring Web because I didn't know how to specify a timeout in the controller (with customized messages per request) and at the same time let other requests come through and be processed too. That's why I used Spring WebFlux, which has a timeout operator for both Mono and Flux types.
To make the requested process run in a different thread, I have used Sinks: one to receive requests and one to publish the results. My problem is that the response sink can only return one result, and subsequent calls to the URL return an empty response. For example, the first call to /reactive/getUser/123456789 returns the user object, but subsequent calls return empty.
I'm not sure if the problem is with the Sink I have used or with how I am getting data from it. In the sample code I have used responseSink.asFlux().next(), but I have also tried .single(), .toMono(), and .take(1), to no avail; I get the same result.
#RequestMapping("/reactive")
#RestController
class SampleController #Autowired constructor(private val externalService: ExternalService) {
private val requestSink = Sinks.many().multicast().onBackpressureBuffer<String>()
private val responseSink = Sinks.many().multicast().onBackpressureBuffer<AppUser>()
init {
requestSink.asFlux()
.map { phoneNumber -> externalService.findByIdOrNull(phoneNumber) }
.doOnNext {
if (it != null) {
responseSink.tryEmitNext(it)
} else {
responseSink.tryEmitError(Throwable("didn't find a value for that phone number"))
}
}
.subscribe()
}
#GetMapping("/getUser/{phoneNumber}")
fun getUser(#PathVariable phoneNumber: String): Mono<String> {
requestSink.tryEmitNext(phoneNumber)
return responseSink.asFlux()
.next()
.map { it.toString() }
.timeout(Duration.ofSeconds(20), Mono.just("processing your request"))
}
}

Use exchange message inside the .to() method in Apache Camel

I'm new to Camel and would like to change my route dynamically according to some logic performed beforehand:
camelContext.addRoutes(new RouteBuilder() {
    public void configure() {
        PropertiesComponent pc = getContext().getComponent("properties", PropertiesComponent.class);
        pc.setLocation("classpath:application.properties");

        log.info("About to start route: Kafka Server -> Log ");

        from("kafka:{{consumer.topic}}?brokers={{kafka.host}}:{{kafka.port}}"
                + "&maxPollRecords={{consumer.maxPollRecords}}"
                + "&consumersCount={{consumer.consumersCount}}"
                + "&seekTo={{consumer.seekTo}}"
                + "&groupId={{consumer.group}}"
                + "&valueDeserializer=" + BytesDeserializer.class.getName())
            .routeId("FromKafka")
            .process(new Processor() {
                @Override
                public void process(Exchange exchange) throws Exception {
                    System.out.println(" message: " + exchange.getIn().getBody());
                    Bytes body = exchange.getIn().getBody(Bytes.class);
                    HashMap data = (HashMap) SerializationUtils.deserialize(body.get());
                    // do some work on data;
                    Map messageBusDetails = new HashMap();
                    messageBusDetails.put("topicName", "someTopic");
                    messageBusDetails.put("producerOption", "bla");
                    exchange.getOut().setHeader("kafka", messageBusDetails);
                    exchange.getOut().setBody(SerializationUtils.serialize(data));
                }
            })
            .choice()
                .when(header("kafka"))
                    .to("kafka:" + **getHeader("kafka").get("topicName")**)
                    .log("${body}");
    }
});
getHeader("kafka").get("topicName")
this is what im trying to achieve.
But i dont know how to access the headers value ( which is a map - cause a kafka producer might have more configuration) inside the .to()
I understand i might be using it totally wrong... buts thats what i managed to understand until now...
The main goal is to have multiple message busses as .from()
and multiple message bus options in the .to() that will be decided via an external source (like config file) that way the same route will apply to many logic scenarios
and i thought the choice() method is the best answer
Thanks!
Instead of to(), you can use toD(), which is the "Dynamic To"
See this for details
And for the syntax to use to pull in headers and other values, see the Simple expression language page.
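For example (a sketch, assuming the "kafka" header is a Map as in the question and that Simple's OGNL-style map lookup is available in your Camel version):

.choice()
    .when(header("kafka"))
        // toD() evaluates the Simple expression per exchange, at runtime
        .toD("kafka:${header.kafka[topicName]}")
        .log("${body}");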

Non-blocking functional methods with Reactive Mongo and WebClient

I have a microservice which reads objects from a database using a ReactiveMongoRepository interface.
The goal is to take each one of those objects and push it to an AWS Lambda function (after converting it to a DTO). If the result of that Lambda function is in the 200 range, mark the object as a success; otherwise ignore it.
In the old days of a simple MongoRepository and a RestTemplate this would be a trivial task. However, I'm trying to understand this Reactive deal and avoid blocking.
Here is the code I've come up with, I know I'm blocking on the webClient, but how do I avoid that?
@Override
public Flux<Video> index() {
    return videoRepository.findAllByIndexedIsFalse().flatMap(video -> {
        final SearchDTO searchDTO = SearchDTO.builder()
                .name(video.getName())
                .canonicalPath(video.getCanonicalPath())
                .objectID(video.getObjectID())
                .userId(video.getUserId())
                .build();

        // Blocking call
        final HttpStatus httpStatus = webClient.post()
                .uri(URI.create(LAMBDA_ENDPOINT))
                .body(BodyInserters.fromObject(searchDTO)).exchange()
                .block()
                .statusCode();

        if (httpStatus.is2xxSuccessful()) {
            video.setIndexed(true);
        }
        return videoRepository.save(video);
    });
}
I'm calling the above from a scheduled task, and I don't really care about the actual result of the index() method, just what happens along the way.
@Scheduled(fixedDelay = 60000)
public void indexTask() {
    indexService
        .index()
        .log()
        .subscribe();
}
I've read a bunch of blog posts etc. on the subject, but they're all just simple CRUD operations without anything happening in the middle, so they don't really give me a full picture of how to implement these things.
Any help?
Your solution is actually quite close.
In those cases, you should try to decompose the reactive chain into steps and not hesitate to turn bits into independent methods for clarity.
@Override
public Flux<Video> index() {
    Flux<Video> unindexedVideos = videoRepository.findAllByIndexedIsFalse();
    return unindexedVideos.flatMap(video -> {
        final SearchDTO searchDTO = SearchDTO.builder()
                .name(video.getName())
                .canonicalPath(video.getCanonicalPath())
                .objectID(video.getObjectID())
                .userId(video.getUserId())
                .build();

        Mono<ClientResponse> indexedResponse = webClient.post()
                .uri(URI.create(LAMBDA_ENDPOINT))
                .body(BodyInserters.fromObject(searchDTO)).exchange()
                .filter(res -> res.statusCode().is2xxSuccessful());

        return indexedResponse.flatMap(response -> {
            video.setIndexed(true);
            return videoRepository.save(video);
        });
    });
}
My approach, maybe a little bit more readable. But I admit I didn't run it, so there's no 100% guarantee that it will work.
public Flux<Video> index() {
    return videoRepository.findAll()
            .flatMap(this::callLambda)
            .flatMap(videoRepository::save);
}

private Mono<Video> callLambda(final Video video) {
    SearchDTO searchDTO = new SearchDTO(video);
    return webClient.post()
            .uri(URI.create(LAMBDA_ENDPOINT))
            .body(BodyInserters.fromObject(searchDTO))
            .exchange()
            .map(ClientResponse::statusCode)
            .filter(HttpStatus::is2xxSuccessful)
            .map(t -> {
                video.setIndexed(true);
                return video;
            });
}
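As an aside (not part of either answer above): on newer Spring versions, exchange() and BodyInserters.fromObject(...) are deprecated, so the same step could be sketched with retrieve(). Note that retrieve() turns non-2xx responses into errors by default, hence the onErrorResume to keep the "ignore failures" behaviour; this is only a sketch, not verified against the poster's setup.

private Mono<Video> callLambda(final Video video) {
    SearchDTO searchDTO = new SearchDTO(video);
    return webClient.post()
            .uri(URI.create(LAMBDA_ENDPOINT))
            .bodyValue(searchDTO)                              // replaces BodyInserters.fromObject(...)
            .retrieve()
            .toBodilessEntity()                                // only the status code matters here
            .filter(entity -> entity.getStatusCode().is2xxSuccessful())
            .map(entity -> {
                video.setIndexed(true);
                return video;
            })
            .onErrorResume(WebClientResponseException.class, e -> Mono.empty());
}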

How do I collect from a flux without closing the stream

My use case is to create a reactive endpoint like this:
public Flux<ServerEvent> getEventFlux(Long forId) {
    ServicePoller poller = new ServicePollerImpl();
    Map<String, Object> params = new HashMap<>();
    params.put("id", forId);

    Flux<Long> interval = Flux.interval(Duration.ofMillis(pollDuration));
    Flux<ServerEvent> serverEventFlux =
        Flux.fromStream(
            poller.getEventStream(url, params) // poll a given endpoint after a fixed duration
        );
    Flux<ServerEvent> sourceFlux = Flux.zip(interval, serverEventFlux)
        .map(Tuple2::getT2); // zip the two streams
    /* Here I want to store data from sourceFlux into a collection whenever some data
       arrives, without disturbing the downstream processing in Spring, so that I can
       access the collection later on without polling again. */
    return sourceFlux;
}
This sends the data back to the front end as soon as it is available. However, my second use case is to pool that data as it arrives into a separate collection, so that if a similar request arrives later on, I can offload the whole data from the pool without hitting the service again.
I tried to subscribe, buffer, cache and collect the flux before returning the original flux from the controller, but all of that seems to close the stream, hence Spring can't process it.
What are my options to tap into the flux and store values into a collection as and when they arrive, without closing the flux stream?
Exception encountered :
java.lang.IllegalStateException: stream has already been operated upon or closed
    at java.util.stream.AbstractPipeline.spliterator(AbstractPipeline.java:343) ~[na:1.8.0_171]
    at java.util.stream.ReferencePipeline.iterator(ReferencePipeline.java:139) ~[na:1.8.0_171]
    at reactor.core.publisher.FluxStream.subscribe(FluxStream.java:57) ~[reactor-core-3.1.7.RELEASE.jar:3.1.7.RELEASE]
    at reactor.core.publisher.Flux.subscribe(Flux.java:6873) ~[reactor-core-3.1.7.RELEASE.jar:3.1.7.RELEASE]
    at reactor.core.publisher.FluxZip$ZipCoordinator.subscribe(FluxZip.java:573) ~[reactor-core-3.1.7.RELEASE.jar:3.1.7.RELEASE]
    at reactor.core.publisher.FluxZip.handleBoth(FluxZip.java:326) ~[reactor-core-3.1.7.RELEASE.jar:3.1.7.RELEASE]
poller.getEventStream returns a Java 8 stream that can be consumed only once. You can either convert the stream to a collection first or defer the execution of poller.getEventStream by using a supplier:
Flux.fromStream(
() -> poller.getEventStream(url, params)
);
Solution that worked for me, as suggested by @a better oliver:
public Flux<ServerEvent> getEventFlux(Long forId) {
    ServicePoller poller = new ServicePollerImpl();
    Map<String, Object> params = new HashMap<>();
    params.put("id", forId);

    Flux<Long> interval = Flux.interval(Duration.ofMillis(pollDuration));
    Flux<ServerEvent> serverEventFlux =
        Flux.fromStream(
            () -> poller.getEventStream(url, params)
                        .peek(se -> reactSink.addtoSink(forId, se))
        );
    Flux<ServerEvent> sourceFlux = Flux.zip(interval, serverEventFlux)
        .map(Tuple2::getT2);
    return sourceFlux;
}
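The same tap could also be hung on the Reactor side with doOnNext, which observes each element without consuming the stream. A minimal sketch, reusing the poster's reactSink helper:

Flux<ServerEvent> sourceFlux = Flux.zip(interval, serverEventFlux)
        .map(Tuple2::getT2)
        .doOnNext(se -> reactSink.addtoSink(forId, se)); // side effect per element; the stream stays open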

How to implement distributed rate limiter?

Let's say, I have P processes running some business logic on N physical machines. These processes call some web service S, say. I want to ensure that not more than X calls are made to the service S per second by all the P processes combined.
How can such a solution be implemented?
Google Guava's RateLimiter works well for processes running on a single box, but not in a distributed setup.
Are there any standard, ready-to-use solutions available for Java (maybe based on ZooKeeper)?
Thanks!
Bucket4j is a Java implementation of the token-bucket rate-limiting algorithm. It works both locally and distributed (on top of JCache). For the distributed use case you are free to choose any JCache implementation, like Hazelcast or Apache Ignite. See this example of using Bucket4j in a cluster.
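For the local case, a bucket looks roughly like this (a sketch against the classic Bucket4j builder API; the distributed variant builds the bucket on top of a JCache/Hazelcast/Ignite cache instead, and the limit of 100/second is just a placeholder):

import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.Bucket;
import io.github.bucket4j.Bucket4j;

import java.time.Duration;

public class ServiceSClient {
    // allow at most 100 calls per second through this bucket
    private final Bucket bucket = Bucket4j.builder()
            .addLimit(Bandwidth.simple(100, Duration.ofSeconds(1)))
            .build();

    public void callServiceS() {
        if (bucket.tryConsume(1)) {
            // make the call to service S
        } else {
            // throttle: reject, queue, or retry later
        }
    }
}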
I have been working on an open-source solution for this kind of problem.
Limitd is a "server" for limits. The limits are implemented using the Token Bucket Algorithm.
Basically you define limits in the service configuration like this:
buckets:
  "request to service a":
    per_minute: 10
  "request to service b":
    per_minute: 5
The service is run as a daemon listening on a TCP/IP port.
Then your application does something along these lines:
var limitd = new Limitd('limitd://my-limitd-address');

limitd.take('request to service a', 'app1', 1, function (err, result) {
  if (result.conformant) {
    console.log('everything is okay - this should be allowed');
  } else {
    console.error('too many calls to this thing');
  }
});
We are currently using this for rate-limiting and debouncing some application events.
The server is on:
https://github.com/auth0/limitd
We are planning to work on several SDKs, but for now we only have Node.js and a partially implemented Go client:
https://github.com/limitd
https://github.com/jdwyah/ratelimit-java provides distributed rate limits that should do just this. You can configure your limit as S per second / minute etc., and choose the burst size / refill rate of the leaky bucket that is under the covers.
Simple rate limiting in Java, for when you want to allow at most 3 transactions every 3 seconds. If you want to centralize this, store the tokens array in ElastiCache or any database, and in place of the synchronized block you will have to implement a lock flag as well.
import java.util.Date;

public class RateLimiter implements Runnable {

    // each slot holds the timestamp of the last token handed out from that slot
    private long[] tokens = new long[3];

    public static void main(String[] args) {
        RateLimiter rateLimiter = new RateLimiter();
        for (int i = 0; i < 20; i++) {
            Thread thread = new Thread(rateLimiter, "Thread-" + i);
            thread.start();
        }
    }

    @Override
    public void run() {
        long currentStartTime = System.currentTimeMillis();
        while (true) {
            if (System.currentTimeMillis() - currentStartTime > 100000) {
                throw new RuntimeException("timed out");
            } else {
                if (getToken()) {
                    System.out.println(Thread.currentThread().getName() +
                            " at " +
                            new Date(System.currentTimeMillis()) + " says hello");
                    break;
                }
            }
        }
    }

    // a token is available if any slot is unused or was last used more than 3 seconds ago
    synchronized private boolean getToken() {
        for (int i = 0; i < 3; i++) {
            if (tokens[i] == 0 || System.currentTimeMillis() - tokens[i] > 3000) {
                tokens[i] = System.currentTimeMillis();
                return true;
            }
        }
        return false;
    }
}
So with any distributed rate-limiting architecture, you'll need a single backend store that acts as the single source of truth to track the number of requests. You can always use ZooKeeper as an in-memory datastore for this out of convenience, although there are better choices such as Redis.
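As an illustration of that idea (not taken from the answers above), a fixed-window counter over Redis is often the simplest starting point. This sketch uses the Jedis client, with a hypothetical key scheme and limit:

import redis.clients.jedis.Jedis;

public class RedisRateLimiter {
    private final Jedis jedis;
    private final int maxCallsPerSecond;

    public RedisRateLimiter(Jedis jedis, int maxCallsPerSecond) {
        this.jedis = jedis;
        this.maxCallsPerSecond = maxCallsPerSecond;
    }

    /** Returns true if the call is allowed in the current one-second window. */
    public boolean tryAcquire(String serviceName) {
        // one key per service per second; all P processes share the same counter
        String key = "ratelimit:" + serviceName + ":" + (System.currentTimeMillis() / 1000);
        long count = jedis.incr(key);          // atomic increment across all processes
        if (count == 1) {
            jedis.expire(key, 2);              // let old windows expire on their own
        }
        return count <= maxCallsPerSecond;
    }
}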