RxObservable that repeats itself until an expected value is found - rx-java2

The goal of this function is to create a stream that emits values periodically until it encounters one that matches a predicate.
Here is some skeleton code that I've come up with:
import io.reactivex.Observable
import java.util.concurrent.ConcurrentHashMap

class Watcher<T : Any>(
    /**
     * Emits the data associated with the provided id
     */
    private val callable: (id: String) -> T,
    /**
     * Checks if the provided value marks the observable as complete
     */
    private val predicate: (id: String, value: T) -> Boolean
) {

    private val watchPool: MutableMap<String, Observable<T>> = ConcurrentHashMap()

    fun watch(id: String): Observable<T> {
        // reuse observable if it exists
        val existing = watchPool[id]
        if (existing != null)
            return existing
        val value = callable(id)
        if (predicate(id, value)) return Observable.just(value)
        // create a new observable to fetch until complete,
        // then remove it from the map once complete
        val observable = Observable.fromCallable<T> {
            callable(id)
        }.repeatWhen { /* What to put here? */ }.doOnComplete {
            watchPool.remove(id)
        }.distinctUntilChanged()
        watchPool[id] = observable
        return observable
    }
}
As an example, if I have the following enum:
enum class Stage {
    CREATED, PROCESSING, DELIVERING, FINISHED
}
Given some callable that retrieves the current stage, I should be able to pass the callable and a predicate checking whether stage == FINISHED, and poll until I get the FINISHED event.
The issue I have is in generating an observable when the event received is not a final event. In that case, the observable should continue to poll for events until it either receives an event matching the predicate or has no more subscribers.
This observable should:
Not poll until it receives at least one subscriber
Poll every x seconds
Mark itself as complete if predicate returns true
Complete itself if it ever goes from >0 subscribers to 0 subscribers
The watch pool is simply there to ensure that two threads watching the same id do not double the polling rate. Observables are removed from the map once complete just so the pool doesn't pile up, and for the same reason, observables that emit only a single value are never stored.
How do I go about adding the functionality for the points added above?
I will link to one existing RxJava GitHub issue that I found useful, but as far as I'm aware, it doesn't allow for predicates that deal with the value emitted by the callable.
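One possible way to fill the repeatWhen placeholder in the skeleton above, sketched with RxJava 2 (pollUntil and its parameters are illustrative names, not from the question): the handler receives a notification each time the upstream completes, so delaying that notification re-subscribes after the interval, and because fromCallable is cold, nothing is polled until there is at least one subscriber.

import io.reactivex.Observable
import java.util.concurrent.TimeUnit

// Sketch: poll fetch() every intervalSeconds until done(value) is true
fun <T : Any> pollUntil(
    intervalSeconds: Long,
    fetch: () -> T,
    done: (T) -> Boolean
): Observable<T> =
    Observable.fromCallable { fetch() }
        .repeatWhen { completions -> completions.delay(intervalSeconds, TimeUnit.SECONDS) }
        .takeUntil { value -> done(value) }    // inclusive: emits the matching value, then completes
        .distinctUntilChanged()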

I ended up just using takeUntil, along with Observable's interval method for polling.
import io.reactivex.Observable
import io.reactivex.schedulers.Schedulers
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.TimeUnit

abstract class RxWatcher<in T : Any, V : Any> {

    /**
     * Emits the data associated with the provided id
     * At a reasonable point, emissions should return a value that returns true with [isCompleted]
     * This method should be thread safe, and the output should not depend on the number of times this method is called
     */
    abstract fun emit(id: T): V

    /**
     * Checks if the provided value marks the observable as complete
     * Must be thread safe
     */
    abstract fun isCompleted(id: T, value: V): Boolean

    /**
     * Polling interval in ms
     */
    open val pollingInterval: Long = 1000

    /**
     * Duration between events in ms after which the observable should time out
     * If this is less than or equal to [pollingInterval], it will be ignored
     */
    open val timeoutDuration: Long = 5 * 60 * 1000

    private val watchPool: MutableMap<T, Observable<V>> = ConcurrentHashMap()

    /**
     * Returns an observable that will emit items every [pollingInterval] ms until it [isCompleted]
     *
     * The observable is reused while polling is in progress, so the polling frequency remains constant
     * regardless of the number of subscribers
     */
    fun watch(id: T): Observable<V> {
        // reuse observable if it exists
        val existing = watchPool[id]
        if (existing != null)
            return existing
        val value = emit(id)
        if (isCompleted(id, value)) return Observable.just(value)
        // create a new observable to fetch until complete,
        // then remove it from the map once complete
        val observable = Observable.interval(pollingInterval, TimeUnit.MILLISECONDS, Schedulers.io()).map {
            emit(id)
        }.takeUntil {
            isCompleted(id, it)
        }.doOnComplete {
            watchPool.remove(id)
        }.distinctUntilChanged().run {
            if (timeoutDuration > pollingInterval) timeout(timeoutDuration, TimeUnit.MILLISECONDS)
            else this
        }
        watchPool[id] = observable
        return observable
    }

    /**
     * Clears the observables from the watch pool
     * Note that existing subscribers will not be affected
     */
    fun clear() {
        watchPool.clear()
    }
}
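For illustration, a minimal usage sketch against the Stage enum from the question; OrderStageWatcher and its fetch lambda are made-up names, not part of the original answer:

// Hypothetical subclass that polls until an order reaches Stage.FINISHED
class OrderStageWatcher(
    private val fetchStage: (String) -> Stage   // stand-in for the real lookup
) : RxWatcher<String, Stage>() {
    override fun emit(id: String): Stage = fetchStage(id)
    override fun isCompleted(id: String, value: Stage): Boolean = value == Stage.FINISHED
    override val pollingInterval: Long = 2000
}

// Subscribing starts the polling; the stream completes once FINISHED is observed
// (a timeout produces an error instead)
val watcher = OrderStageWatcher { id -> /* look up the current stage for id */ Stage.CREATED }
watcher.watch("order-1").subscribe(
    { stage -> println("order-1 is now $stage") },
    { error -> println("polling failed: $error") },
    { println("order-1 reached a final stage") }
)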

Related

How atomic update is implemented in Coroutines

Here's the update function from Coroutines' StateFlow. I have a few questions:
How is it atomic? There are multiple operations inside, so how can atomicity be guaranteed without mutual exclusion?
Why is it in a while(true) loop? Under what condition is the loop necessary?
Can the loop be endless?
/**
 * Updates the [MutableStateFlow.value] atomically using the specified [function] of its value.
 *
 * [function] may be evaluated multiple times, if [value] is being concurrently updated.
 */
public inline fun <T> MutableStateFlow<T>.update(function: (T) -> T) {
    while (true) {
        val prevValue = value
        val nextValue = function(prevValue)
        if (compareAndSet(prevValue, nextValue)) {
            return
        }
    }
}
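To make the compare-and-swap mechanics concrete, here is a minimal sketch of the same retry loop written against a plain AtomicReference (the updateValue name is made up for illustration): compareAndSet succeeds only if the value is still the one that was read, so when another thread wins the race the loop re-reads and retries, and it would only spin indefinitely if other threads kept winning on every single attempt.

import java.util.concurrent.atomic.AtomicReference

// Same retry-on-conflict idea as StateFlow.update, sketched with AtomicReference
fun <T> AtomicReference<T>.updateValue(function: (T) -> T) {
    while (true) {
        val prev = get()                   // read the current value
        val next = function(prev)          // compute the replacement (may run more than once)
        if (compareAndSet(prev, next)) {   // succeeds only if no one changed the value in between
            return                         // our update won; done
        }
        // another thread updated the value first: loop and retry with the fresh value
    }
}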

Preventing KStream from emitting old unchanged aggregate value

I have a KStream pipeline which groups by key, windows on some interval, and then applies a custom aggregation:
KStream<String, Integer> input = /* define input stream */

/* group by key and then apply windowing */
KTable<Windowed<String>, MyAggregate> aggregateTable =
        input.groupByKey()
             .windowedBy(/* window definition here */)
             .aggregate(MyAggregate::new, (key, value, agg) -> agg.addAndReturn(value));

// I need to get a changelog of aggregateTable, so:
aggregateTable.toStream().to("output-topic");
The problem is that the majority of the input records will not change the internal state of the MyAggregate object. The structure is similar to:
class MyAggregate {
    private Set<Integer> checkBeforeInsert = /* some predefined values */
    private List<Integer> actualState = new ArrayList<>();

    public MyAggregate addAndReturn(Integer value) {
        /* for 99% of records the if check passes */
        if (checkBeforeInsert.contains(value)) {
            /* do nothing and return. Note that the state hasn't been changed */
            return this;
        } else {
            actualState.add(value);
            return this;
        }
    }
}
However, KStream has no clue that the aggregate object hasn't changed; it still stores the aggregate (which is the same as the old one), propagates that same old value to the changelog topic, and triggers aggregateTable.toStream() with the same old value.
The semantics of my application still work (the rest of the application is aware that unchanged state might arrive), but this creates a huge amount of noise traffic on intermediate topics. I need a way to tell KStream whether an aggregate has really changed and should be stored, or whether it is the same as the previous record and should simply be ignored.
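One possible workaround, sketched below in Kotlin rather than taken from the question: let the aggregate remember whether the last addAndReturn call actually mutated its state, and filter on that flag after toStream() so unchanged results are dropped before they reach the output topic. The changed property is an invented addition, and this only reduces downstream traffic; the state store itself is still updated on every record.

// Hypothetical variant of MyAggregate that remembers whether the last call changed anything
class MyAggregateWithFlag {
    private val checkBeforeInsert: Set<Int> = setOf(/* some predefined values */)
    private val actualState = mutableListOf<Int>()

    var changed: Boolean = false
        private set

    fun addAndReturn(value: Int): MyAggregateWithFlag {
        changed = !checkBeforeInsert.contains(value)
        if (changed) actualState.add(value)   // only mutate (and flag) when the value is new
        return this
    }
}

// Wiring sketch: only forward aggregates whose last update really changed something, e.g.
// aggregateTable.toStream().filter { _, agg -> agg.changed }.to("output-topic")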

Resettable Single Rx pattern

I have the following design I'd like to create, but I'm not sure which Rx pattern matches it. The goal is more or less along the lines of a Single, but with a conditional check.
There is one Observable<String>, and the possibility of any number of observers.
When a request is first made, the observable will execute a network request taking in the string, then emit the result (much like a Completable/Single)
Any subsequent call with the same key will return the same result immediately
However, if 5 minutes has passed and the same call is made, we will refetch the data as it may have expired, then emit it to any listeners. This result will be saved for another 5 minutes, and the cycle repeats.
All data is stored based on the key sent, much like a flyweight pattern. Expiration is based off of the last request time of the specific key.
My initial thought was to just make my own class with a concurrent hashmap. However, that would mean handling a lot of the threading mechanics myself. I feel like RxJava would be a great fit for this, but I'm not sure whether such a pattern exists. Does anyone have an idea?
I get that a Single<T> is meant to retrieve only a single response, so my terminology may not be exact.
The following is my attempt, which I will be updating as I go
import io.reactivex.Single
import io.reactivex.schedulers.Schedulers
import java.util.concurrent.TimeUnit

/**
 * Created by Allan Wang on 07/01/18.
 *
 * Reactive flyweight to help deal with prolonged executions
 * Each call will output a [Single], which may be new if none exists or the old one is invalidated,
 * or reused if an old one is still valid
 *
 * Types:
 * T    input argument for caller
 * C    condition to check against for validity
 * R    response within reactive output
 */
abstract class RxFlyweight<in T : Any, C : Any, R : Any> {

    /**
     * Given an input, emit the desired response
     * This will be executed in a separate thread
     */
    protected abstract fun call(input: T): R

    /**
     * Given an input and condition, check if
     * we may use cached data or if we need to make a new request
     * Return [true] to use the cache, [false] otherwise
     */
    protected abstract fun validate(input: T, cond: C): Boolean

    /**
     * Given an input, create a new condition to be used
     * for future requests
     */
    protected abstract fun cache(input: T): C

    private val conditionals = mutableMapOf<T, C>()
    private val sources = mutableMapOf<T, Single<R>>()
    private val lock = Any()

    /**
     * Entry point to give an input and receive a [Single]
     * Note that the observer is not bound to any particular thread,
     * as that depends on [createNewSource]
     */
    operator fun invoke(input: T): Single<R> {
        synchronized(lock) {
            val source = sources[input]
            // update condition and retrieve the old one
            val condition = conditionals.put(input, cache(input))
            // check whether the existing observable can be reused
            if (source != null && condition != null && validate(input, condition))
                return source
            val newSource = createNewSource(input).cache()
            sources[input] = newSource
            return newSource
        }
    }

    /**
     * Open source creator
     * The result will then be wrapped with [Single.cache]
     * If you don't have a need for caching,
     * you likely won't have a need for flyweights
     */
    protected open fun createNewSource(input: T): Single<R> =
            Single.fromCallable { call(input) }
                    .timeout(20, TimeUnit.SECONDS)
                    .subscribeOn(Schedulers.io())

    fun reset() {
        synchronized(lock) {
            sources.clear()
            conditionals.clear()
        }
    }
}
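As a usage illustration only (not part of the original attempt), a time-based subclass could use the timestamp of the last request as the condition and treat entries older than five minutes as expired; TimedFlyweight and fetch are made-up names:

// Hypothetical flyweight: caches network results per key for 5 minutes
class TimedFlyweight<R : Any>(
    private val fetch: (String) -> R            // the actual (blocking) network call
) : RxFlyweight<String, Long, R>() {

    override fun call(input: String): R = fetch(input)

    // the condition is the time of the last request for this key
    override fun cache(input: String): Long = System.currentTimeMillis()

    // reuse the cached Single while it is younger than 5 minutes
    override fun validate(input: String, cond: Long): Boolean =
        System.currentTimeMillis() - cond < 5 * 60 * 1000
}

// flyweight("some-key") returns the same cached Single when invoked again within 5 minutes,
// and creates a fresh one (refetching the data) once that window has passed.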

PublishSubject calls map for each present observer

I'm using a PublishSubject along with the map operator:
@Test
public void testMapWithMultipleObservers() {
    PublishSubject<Integer> subject = PublishSubject.create();

    Func1<Integer, Integer> action = spy(new Func1<Integer, Integer>() {
        @Override
        public Integer call(Integer integer) {
            return integer;
        }
    });

    Observable<Integer> observable = subject.asObservable().map(action);

    observable.subscribe(mock(Observer.class));
    observable.subscribe(mock(Observer.class));

    subject.onNext(1);

    verify(action, times(2)).call(anyInt());
    // however, I need it to be times(1)
}
The desired behaviour is to perform an action after the subject produces a value. I've tried doOnEach, doOnNext, and map, and in each case the action is performed once for each present observer (for 100 observers the action would be performed 100 times), while I need it to be performed once per emission.
Could you suggest anything?
Thanks.
The quickest option would be to use share()
Observable<Integer> observable =
        subject
            .map(action)
            .share();
You don't need the asObservable() call. It is used to return a Subject from an API and prevent the caller from casting it back to a Subject. For example:
Observable<Integer> getSubjectAsObservable() {
    return subject.asObservable();
}
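A minimal Kotlin sketch of the share() suggestion above, using RxJava 2 (where a Subject is already an Observable, so there is no asObservable()); the counter is only there to make the single map invocation visible:

import io.reactivex.subjects.PublishSubject
import java.util.concurrent.atomic.AtomicInteger

fun main() {
    val subject = PublishSubject.create<Int>()
    val mapCount = AtomicInteger()              // counts how often map actually runs

    // share() multicasts one upstream subscription to all observers
    val observable = subject
        .map { value -> mapCount.incrementAndGet(); value }
        .share()

    observable.subscribe { println("observer 1 got $it") }
    observable.subscribe { println("observer 2 got $it") }

    subject.onNext(1)
    println("map invoked ${mapCount.get()} time(s)")    // prints 1, not 2
}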

Delay items emission until item is emitted from another observable

Playing with RxJava now and stumbled upon the following problem:
I have 2 different streams:
Stream with items
Stream (with just 1 item) which emits transformation information for the first stream.
So essentially I have a stream of items, and I want all of those items to be combined with that single item from the 2nd stream:
----a1----a2----a3----a4----a5----|--------------->
-------------b1--|----------------------------------->
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
------------a1b1-a2b1-a3b1-a4b1-a5b1-------->
It looks really similar to the combineLatest operator, but combineLatest will ignore all items from the first stream except the one closest to the item from the second stream. That means I will not receive a1b1 - the first resulting item emitted will be a2b1.
I also looked at the delay operator, but it doesn't allow me to specify a closing stream like the buffer operator does.
Is there any fancy operator which solves the problem above?
There are several ways of making this happen:
1) flatMap over b if you don't need to start a upfront
b.flatMap(bv -> a.map(av -> together(av, bv)));
2) You can, of course, cache, but it will retain your a values for the entire duration of the stream.
3) Use groupBy a bit unconventionally: its GroupedObservable caches values until the single subscriber arrives, replays the cached values, and then continues as a regular direct observable (letting the previously cached values go).
Observable<Long> source = Observable.timer(1000, 1000, TimeUnit.MILLISECONDS)
        .doOnNext(v -> System.out.println("Tick"))
        .take(10);

Observable<String> other = Observable.just("-b").delay(5000, TimeUnit.MILLISECONDS)
        .doOnNext(v -> System.out.println("Tack"));

source.groupBy(v -> 1)
        .flatMap(g ->
            other.flatMap(b -> g.map(a -> a + b))
        ).toBlocking().forEach(System.out::println);
It works as follows:
Get hold of a GroupedObservable by grouping everything from source into group 1.
When the group g arrives, we 'start observing' the other observable.
Once other fires its element, we take it, map it over the group, and 'start observing' it as well, giving us the final sequence of a + b values.
I've added doOnNexts so you can see the source is really active before the other fires its "Tack".
AFAIK, there is no built-in operator to achieve the behavior you've described. You can always implement a custom operator or build it on top of existing operators. I think the second option is easier to implement, and here is the code:
public static <L, R, T> Observable<T> zipper(final Observable<? extends L> left, final Observable<? extends R> right, final Func2<? super L, ? super R, ? extends T> function) {
    return Observable.defer(new Func0<Observable<T>>() {
        @Override
        public Observable<T> call() {
            final SerialSubscription subscription = new SerialSubscription();
            final ConnectableObservable<? extends R> cached = right.replay();
            return left.flatMap(new Func1<L, Observable<T>>() {
                @Override
                public Observable<T> call(final L valueLeft) {
                    return cached.map(new Func1<R, T>() {
                        @Override
                        public T call(final R valueRight) {
                            return function.call(valueLeft, valueRight);
                        }
                    });
                }
            }).doOnSubscribe(new Action0() {
                @Override
                public void call() {
                    subscription.set(cached.connect());
                }
            }).doOnUnsubscribe(new Action0() {
                @Override
                public void call() {
                    subscription.unsubscribe();
                }
            });
        }
    });
}
If you have any questions regarding the code, I can explain it in details.
UPDATE
Regarding the question of how my solution is different from the following one:
left.flatMap(valueLeft -> right.map(valueRight -> together(valueLeft, valueRight)));
Parallel execution - in my implementation both the left and right observables execute in parallel. The right observable doesn't have to wait for the left one to emit its first item.
Caching - my solution subscribes only once to the right observable and caches its result. That's why b1 will always be the same for all aX items. The solution provided by akarnokd subscribes to the right observable every time the left one emits an item. That means:
There is no guarantee that b1 won't change its value. For example, for the following observable you will get a different b for each a.
final Observable<Double> right = Observable.defer(new Func0<Observable<Double>>() {
    @Override
    public Observable<Double> call() {
        return Observable.just(Math.random());
    }
});
If the right observable is a time-consuming operation (e.g. a network call), you will have to wait for it to complete every time the left observable emits a new item.
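For completeness, a minimal sketch of the same caching idea written in Kotlin against RxJava 2 (zipWithCachedRight and together are placeholder names, not from the answer): cache() subscribes to the right observable once and replays its value, so every left item is paired with the same b and the expensive work is not repeated.

import io.reactivex.Observable
import java.util.concurrent.TimeUnit

fun <L, R, T> zipWithCachedRight(
    left: Observable<L>,
    right: Observable<R>,
    together: (L, R) -> T          // placeholder combiner
): Observable<T> {
    // cache() runs the right observable once and replays its value to every flatMap below
    val cachedRight = right.cache()
    return left.flatMap { a -> cachedRight.map { b -> together(a, b) } }
}

fun main() {
    val items = Observable.interval(100, TimeUnit.MILLISECONDS).take(5)
    val single = Observable.fromCallable { Math.random() }   // same value for all pairs thanks to cache()

    zipWithCachedRight(items, single) { a, b -> "$a-$b" }
        .blockingForEach { println(it) }
}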