Spring webflux/reactor using @Scheduled to read database and perform some tasks - reactive-programming

I am new to Spring WebFlux. My current Spring Boot application uses a scheduler (annotated with @Scheduled) to read a list of records from the DB, call a REST API concurrently in batches, and then write to an event stream.
I want to achieve the same in Spring WebFlux.
Should I use @Scheduled or use schedulePeriodically from WebFlux?
How can I batch items from the DB into smaller sets (say 10 items) and call the REST API concurrently?
At present the app fetches at most 100 records per scheduler run and then processes them. I am planning to move to R2DBC; if I do, can I limit the flow of data to, say, 100 records?
Thanks

1. Should I use @Scheduled or use schedulePeriodically from Webflux?
@Scheduled is an annotation from the Spring Framework scheduling package (org.springframework.scheduling.annotation), while schedulePeriodically is a method on Reactor's Scheduler, so the two are not directly comparable. I don't see any problem with using the annotation, since it is part of the core framework.
2. How can I batch items from DB into smaller sets (say 10 items) and concurrently call rest api?
By using the Flux#buffer functions which will emit a list of items when the buffer is full.
Flux.just("1", "2", "3", "4")
.buffer(2)
.doOnNext(list -> {
System.out.println(list.size());
}).subscribe()
Will print 2 each time.
3. At present the app fetches max 100 records in one scheduler run and then it processes them. I am planning to shift to r2dbc, if I do so, can I limit the flow of data like 100?
You can, as described above: fetch, then buffer the results into lists of 100. You can then either wrap each list in its own Flux and emit the items again, or process each list of 100 as a unit. It is up to you.
There are many buffer variants; check them out in the Javadoc under
Flux#buffer
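The three points above (cap a run at 100 records, split into batches of 10, call the API concurrently within a batch) can be illustrated without Reactor. The following is only a plain-JDK sketch of the semantics, not how Reactor implements them: CompletableFuture stands in for the reactive types, callApi is a hypothetical placeholder for the REST call, and join() blocks, which a real reactive pipeline would avoid. In Reactor terms it corresponds roughly to take(100), then buffer(10), then a flatMap over each batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

final class BatchSketch {

    // Splits items into consecutive batches of at most `size` elements,
    // the same grouping Flux#buffer(size) yields for a finite source.
    static <T> List<List<T>> partition(List<T> items, int size) {
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += size) {
            int to = Math.min(from + size, items.size());
            batches.add(new ArrayList<>(items.subList(from, to)));
        }
        return batches;
    }

    // Hypothetical stand-in for the REST call; not a real HTTP client.
    static CompletableFuture<String> callApi(int id) {
        return CompletableFuture.supplyAsync(() -> "result-" + id);
    }

    // Processes at most `limit` rows, `batchSize` concurrent calls at a time.
    static List<String> processRun(List<Integer> rows, int limit, int batchSize) {
        List<Integer> limited = rows.subList(0, Math.min(limit, rows.size()));
        List<String> results = new ArrayList<>();
        for (List<Integer> batch : partition(limited, batchSize)) {
            // Start every call of the batch before waiting on any of them.
            List<CompletableFuture<String>> calls = batch.stream()
                    .map(BatchSketch::callApi)
                    .collect(Collectors.toList());
            // Blocking wait, for illustration only.
            for (CompletableFuture<String> call : calls) {
                results.add(call.join());
            }
        }
        return results;
    }
}
```

Calling BatchSketch.processRun(rows, 100, 10) on a longer list touches only the first 100 rows, in ten batches of ten.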

Flux.buffer collects the emitted items and emits them as lists of the given buffer size.
For batching, you can also use Flux.expand or Mono.expand. You only have to provide a condition inside expand that decides whether to run it again or to end it.
Here are the examples:
public static void main(String[] args) {
List<String> list = new ArrayList<>();
list.add("1");
Flux.just(list)
.buffer(2)
.doOnNext(ls -> {
System.out.println(ls.getClass());
// Buffering a list returns the list of list of String
System.out.println(ls);
}).subscribe();
Flux.just(list)
.expand(listObj -> {
// Condition to finally end the batch
if (listObj.size()>4) {
return Flux.empty();
}
// Can return the size of data as much as you require
list.add("a");
return Flux.just(listObj);
}).map(ls -> {
// Here it returns list of String which was the original object type not list of list as in case of buffer
System.out.println(ls.getClass());
System.out.println(ls);
return ls;
}).subscribe();
}
Output:
class java.util.ArrayList
[[1]] /// Output of buffer list of list
class java.util.ArrayList
[1]
class java.util.ArrayList
[1, a]
class java.util.ArrayList
[1, a, a]
class java.util.ArrayList
[1, a, a, a]
class java.util.ArrayList
[1, a, a, a, a]

Related

Compare List<String> to Flux<String> in non blocking way

How to compare List to Flux in non blocking way
Below the code in blocking way
public static void main(String[] args) {
List<String> all = List.of("A", "B", "C", "D");
Flux<String> valid = Flux.just("A", "D");
Map<Boolean, List<String>> collect = all.stream()
.collect(Collectors.groupingBy(t -> valid.collectList().block().contains(t)));
System.out.println(collect.get(Boolean.TRUE));
System.out.println(collect.get(Boolean.FALSE));
}
How can I get this working in a non-blocking way?
The above is an example of what I am trying to do in a web application. I receive a list of objects, which is List<String> all. Then I query the database, which returns a Flux<String>. The Flux returned by the database will be a subset of all. I need to prepare two lists: the items that are present in the valid Flux, and the items that are not.
EDIT:
I converted the Flux to a Mono and the List to a Mono:
public static void main(String[] args) {
Mono<List<String>> all = Mono.just(List.of("A", "B", "C", "D"));
Mono<List<String>> valid = Mono.just(List.of("A", "D"));
var exist = all.flatMap(a -> valid.map(v -> a.stream().collect(Collectors.groupingBy(v::contains))));
System.out.println(exist.block().get(Boolean.TRUE));
System.out.println(exist.block().get(Boolean.FALSE));
}
There is no straightforward way to achieve this in reactive programming without breaking some of its semantics.
If you reflect on what reactive programming tries to achieve and on your problem statement, you will notice that the two do not fit together well.
Reactive programming, as the name suggests, is about reacting to events, which in your case would be the valid items emitted from your datastore. Typically you would compute something for each emitted valid item and then emit the result (or some other transformation) downstream. Unfortunately, you cannot compute the intersection and the difference of all and valid without waiting for valid to complete at some point (otherwise, how would you know that an item you assumed invalid will not still be emitted by the valid publisher?).
So, to achieve the desired behavior, you will have to rely on memory to buffer the items and then run your checks.
Retrieving valid items should be achievable using the filterWhen operator paired with the hasElement one:
Flux<String> validItems = Flux.fromIterable(all)
.filterWhen(valid::hasElement);
To retrieve the invalid items, you can merge all and validItems, collect the result into a list, and keep only the elements that appear exactly once (this assumes all itself contains no duplicates):
Flux<String> inValidItems = Flux.fromIterable(all)
.mergeWith(validItems)
.collectList()
.flatMapIterable(list -> list.stream().filter(item -> Collections.frequency(list, item) == 1).collect(Collectors.toList()));
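Once the valid items have been fully collected (for example inside a map on the collected list, as in the question's EDIT), the split itself is ordinary collection code. Here is a minimal plain-JDK sketch of that step; the class and method names are made up for illustration. It uses Collectors.partitioningBy instead of groupingBy because partitioningBy always returns both the true and the false key, so get(false) never returns null when every item happens to be valid.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

final class ValiditySplitSketch {

    // Splits `all` into items contained in `valid` (key true)
    // and items not contained in it (key false).
    static Map<Boolean, List<String>> split(List<String> all, Collection<String> valid) {
        // A HashSet makes each membership check O(1) on average.
        Set<String> validSet = new HashSet<>(valid);
        return all.stream().collect(Collectors.partitioningBy(validSet::contains));
    }
}
```

With all = [A, B, C, D] and valid = [A, D], split(...).get(true) is [A, D] and split(...).get(false) is [B, C].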

Make iterations of a loop sequentially in Mutiny

I am new in the reactive programming world. I am currently working in a Java reactive application using the Mutiny library.
I need to develop a loop that waits for the previous iteration to finish in order to start the next one. For instance:
List<Uni<T>> uniList = new ArrayList<>();
for (T item : items) { //items is an already fulfilled collection
uniList.add(this.doSomethingAndReturnInUni(item));
}
return Uni.combine().all().unis(uniList).combinedWith(unisToCombine -> {
List<T> list = new ArrayList<>();
unisToCombine.forEach(x ->list.add(x));
return list;
});
The for loop in the example generates a thread per iteration. I am wondering how to make the i-th call to doSomethingAndReturnInUni() wait until the (i-1)-th call has fired its event, that is, how to make the for loop sequential. Is it possible to subscribe to those events in such a way?
Could you try something like this?
Builder<Item> builder = Uni.join().builder();
for (Item item : items) {
builder.add(this.doSomethingAndReturnInUni(item));
}
return builder.joinAll().andCollectFailures()
.flatMap(itemList -> do whatever you need ...) //itemList type is List<Item>
I don't know why you are using Uni here, since a Uni represents a single operation. For loops you should use Multi, which lets you handle back pressure and request the next item only when the previous one has finished. A Multi can be processed sequentially or in parallel.
See https://quarkus.io/blog/mutiny-back-pressure/
I've done the same using Multi; see the generateData() method here:
https://github.com/Serkan80/quarkus-quickstarts/blob/development/redis-streams-quickstart/weather-producer/src/main/java/org/acme/redis/streams/producer/ValuesGenerator.java
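The "start call i only after call i-1 has finished" requirement from the question can also be shown as a chain of asynchronous steps. The sketch below deliberately uses the JDK's CompletableFuture as a stand-in for Uni, so it is not Mutiny code; runSequentially and its parameters are hypothetical names. The point is only the shape: each iteration is composed onto the result of the previous one instead of being started inside the loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

final class SequentialSketch {

    // Runs task(item) for each item, starting the next task only after
    // the previous one has completed, and collects results in order.
    static <T, R> CompletableFuture<List<R>> runSequentially(
            List<T> items, Function<T, CompletableFuture<R>> task) {
        CompletableFuture<List<R>> chain =
                CompletableFuture.completedFuture(new ArrayList<>());
        for (T item : items) {
            // task.apply(item) is invoked lazily, when the chain reaches this step.
            chain = chain.thenCompose(results ->
                    task.apply(item).thenApply(result -> {
                        results.add(result);
                        return results;
                    }));
        }
        return chain;
    }
}
```

The loop itself returns immediately; it only builds the chain, and the tasks run one after another as each predecessor completes.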

Spring Webflux - how to get value from Flux without block() operations

I wonder how to write non-blocking code with Webflux.
Here is what I want to do:
1. Get all Products by ProductProperties field (returned as Flux)
2. Get a list of values from Flux<Product>.availabilityCalendar
3. Use the data retrieved in step 2 and fetch some other data (returned as Flux<>) - everything should be a non-blocking operation.
How to do that? How to get values from Flux<Object> and then fetch some other data returned as Flux<> avoiding blocking operations like Flux.block() to retrieve data that are needed in the next step to fetch final data to return?
public Flux<Product> getAllProductsByAvailability(Flux<ProductProperties> productProperties,
Map<String, String> searchParams) {
productProperties
.flatMap(property -> productRepository.findByProductPropertiesId(property.getId())) //1. return Products
.flatMap(product -> Flux.just(product.getAvailabilityCalendar())) //2. how to get Product.availabilityCalendar list as non-blocking operation to work with this data afterwards?
(...)
where:
productRepository.findByProductPropertiesId returns Flux
Product has field: ArrayList<ProductAvailability> availabilityCalendar
Is it a good approach?
Thank you!
Like this. Here I check that the tags are valid:
Flux.fromIterable(vo.getTags())
.flatMap((tag) -> tagService.findByCode(tag.getCode()).map(TagBo::createByVo)).filter(Objects::nonNull).collectList().doOnNext(l->vo.setTags(l));
By using the doOnNext operator (Flux has no onNext operator):
productRepository.findByProductPropertiesId(property.getId())
.doOnNext(product -> {
// Do things here (side effects only; use map or flatMap to transform the value)
})

Concat multiple reactive requests to one Mono

I noticed in the reactive libraries there are Tuples, but what do I do if there are more than 8 Tuples?
https://projectreactor.io/docs/core/release/api/reactor/util/function/Tuples.html#fromArray-java.lang.Object:A-
Example code that seems to work, but is there a better way to use some sort of collector?
private Mono<List<String>> getContent(List<String> ids) {
List<String> allContent = new ArrayList<>();
Mono<List<String>> allContentMono = Mono.empty();
for(String id : ids) {
allContentMono = callApi(id)
.flatMap(result -> result.bodyToMono(String.class))
.map(str -> {
allContent.add(str);
return allContent;
});
}
return allContentMono;
}
Why did the tuple size stop at 8? (haven't looked around for the documentation on why, but not my main concern)
Thanks
zip (which uses TupleN) is for when you want to create values by composition, out of a combination of sources. E.g. out of a Flux<FirstName> and a Flux<LastName> you want a Flux<FullName> that emits one FullName for each incoming FirstName/LastName pair.
For your use case, where you want to execute multiple calls (possibly in parallel) and collect the results in a list, flatMap is enough:
private Mono<List<String>> getContent(List<String> ids) {
return Flux
.fromIterable(ids)
.flatMap(id -> callApi(id))
.flatMap(response -> response.bodyToMono(String.class))
.collectList();
}
Tuple is an immutable, fixed-size data structure, used by zip as convenience when you don't want to create a dedicated POJO. It doesn't make sense to try and support unlimited sizes so we stopped at eight. There is a zip variant that will aggregate more than 8 sources, but will make you work with an Object[] instead of a Tuple.
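The "fan out, then collect into one list" shape does not depend on a fixed arity, which is why no Tuple is needed for it. As a rough illustration only, here is the same shape with the JDK's CompletableFuture in place of Reactor's Mono and Flux; callApi is a hypothetical stand-in for the HTTP call. One difference worth noting: this sketch returns results in the order of ids, whereas Reactor's flatMap emits results as they arrive.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

final class CollectSketch {

    // Hypothetical stand-in for the remote call.
    static CompletableFuture<String> callApi(String id) {
        return CompletableFuture.supplyAsync(() -> "content-" + id);
    }

    // Starts one call per id and completes with all results, in id order.
    static CompletableFuture<List<String>> getContent(List<String> ids) {
        List<CompletableFuture<String>> calls = ids.stream()
                .map(CollectSketch::callApi)
                .collect(Collectors.toList());
        return CompletableFuture
                .allOf(calls.toArray(new CompletableFuture<?>[0]))
                // Every future is complete here, so join() does not block.
                .thenApply(done -> calls.stream()
                        .map(CompletableFuture::join)
                        .collect(Collectors.toList()));
    }
}
```

The list of ids can have any length, including more than eight entries.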

can i conditionally "merge" a Single with an Observable?

i'm a RxJava newcomer, and i'm having some trouble wrapping my head around how to do the following.
i'm using Retrofit to invoke a network request that returns me a Single<Foo>, which is the type i ultimately want to consume via my Subscriber instance (call it SingleFooSubscriber)
Foo has an internal property items typed as List<String>.
if Foo.items is not empty, i would like to invoke separate, concurrent network requests for each of its values. (the actual results of these requests are inconsequential for SingleFooSubscriber as the results will be cached externally).
SingleFooSubscriber.onComplete() should be invoked only when Foo and all Foo.items have been fetched.
fetchFooCall
.subscribeOn(Schedulers.io())
// Approach #1...
// the idea here would be to "merge" the results of both streams into a single
// reactive type, but i'm not sure how this would work given that the item emissions
// could be far greater than one. using zip here i don't think it would ever
// complete.
.flatMap { foo ->
if(foo.items.isNotEmpty()) {
Observable.zip(
Observable.fromIterable(foo.items),
Observable.just(foo),
{ source1, source2 ->
// hmmmm...
}
).toSingle()
} else {
Single.just(foo)
}
}
// ...or Approach #2...
// i think this would result in the streams for Foo and items being handled sequentially,
// which is not really ideal because
// 1) i think it would entail nested streams (i get the feeling i should be using flatMap
// instead)
// 2) and i'm not sure SingleFooSubscriber.onComplete() would depend on the completion of
// the stream for items
.doOnSuccess { data ->
if(data.items.isNotEmpty()) {
// hmmmm...
}
}
.observeOn(AndroidSchedulers.mainThread())
.subscribe(
{ data -> /* onSuccess() */ },
{ error -> /* onError() */ }
)
any thoughts on how to approach this would be greatly appreciated!
bonus points: in trying to come up with a solution to this, i've begun to question the decision to use the Single reactive type vs the Observable reactive type. most (all, except this one Foo.items case?) of my streams actually revolve around consuming a single instance of something, so i leaned toward Single to represent my streams as i thought it would add some semantic clarity around the code. anybody have any general guidance around when to use one vs the other?
You need to nest flatMaps and then convert back to Single:
retrofit.getMainObject()
.flatMap(v ->
Flowable.fromIterable(v.items)
.flatMap(w ->
retrofit.getItem(w.id).doOnNext(x -> w.property = x)
)
.ignoreElements()
.toSingleDefault(v)
)