Delay item emission until an item is emitted from another observable - reactive-programming

Playing with RxJava now and stumbled upon the following problem:
I have 2 different streams:
Stream with items
Stream (with just 1 item) which emits transformation information for the first stream.
So essentially I have a stream of items, and I want all of those items to be combined with that single item from the 2nd stream:
----a1----a2----a3----a4----a5----|--------------->
-------------b1--|----------------------------------->
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
------------a1b1-a2b1-a3b1-a4b1-a5b1-------->
It looks really similar to the combineLatest operator, but combineLatest will ignore all items from the first stream except the one closest to the item from the second stream. That means I would never receive a1b1 - the first resulting item would be a2b1.
I also looked at the delay operator, but it doesn't let me specify a closing stream the way the buffer operator does.
Is there any fancy operator which solves the problem above?

There are several ways of making this happen:
1) flatMap over b if you don't need to start a upfront
b.flatMap(bv -> a.map(av -> together(av, bv)));
2) You can, of course, cache a, but that will retain all of your a values for the entire duration of the stream.
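A minimal sketch of option 2 (names and element types are made up for illustration): cache() retains every a item and replays them to the mapping that runs once b emits.
Observable<String> cachedA = a.cache();
cachedA.subscribe(av -> { }, e -> { });   // optional: start collecting a right away

Observable<String> result = b.flatMap(bv ->
        cachedA.map(av -> av + bv));      // a1b1, a2b1, a3b1, ...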
3) Use groupBy a bit unconventionally: its GroupedObservable caches values until its single subscriber arrives, replays the cached values to it, and then continues as a regular direct observable (letting all previously cached values go).
Observable<Long> source = Observable.timer(1000, 1000, TimeUnit.MILLISECONDS)
        .doOnNext(v -> System.out.println("Tick"))
        .take(10);

Observable<String> other = Observable.just("-b")
        .delay(5000, TimeUnit.MILLISECONDS)
        .doOnNext(v -> System.out.println("Tack"));

source.groupBy(v -> 1)
        .flatMap(g ->
                other.flatMap(b -> g.map(a -> a + b))
        )
        .toBlocking()
        .forEach(System.out::println);
It works as follows:
Get hold of a GroupedObservable by grouping everything from source into group 1.
When the group g arrives, we 'start observing' the other observable.
Once other fires its element, we take it, map it over the group and 'start observing' the group as well, giving us the final sequence of a + b values.
I've added the doOnNext calls so you can see the source is really active before the other fires its "Tack".

AFAIK, there is no built-in operator to achieve the behavior you've described. You can always implement a custom operator or build it on top of existing operators. I think the second option is easier to implement, and here is the code:
public static <L, R, T> Observable<T> zipper(final Observable<? extends L> left,
        final Observable<? extends R> right,
        final Func2<? super L, ? super R, ? extends T> function) {
    return Observable.defer(new Func0<Observable<T>>() {
        @Override
        public Observable<T> call() {
            final SerialSubscription subscription = new SerialSubscription();
            final ConnectableObservable<? extends R> cached = right.replay();
            return left.flatMap(new Func1<L, Observable<T>>() {
                @Override
                public Observable<T> call(final L valueLeft) {
                    return cached.map(new Func1<R, T>() {
                        @Override
                        public T call(final R valueRight) {
                            return function.call(valueLeft, valueRight);
                        }
                    });
                }
            }).doOnSubscribe(new Action0() {
                @Override
                public void call() {
                    subscription.set(cached.connect());
                }
            }).doOnUnsubscribe(new Action0() {
                @Override
                public void call() {
                    subscription.unsubscribe();
                }
            });
        }
    });
}
If you have any questions regarding the code, I can explain it in detail.
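A hypothetical usage sketch (the items/config names and values are made up for illustration): every item gets paired with the single, cached value from the right observable, no matter which side emits first.
Observable<String> items = Observable.just("a1", "a2", "a3", "a4", "a5");
Observable<String> config = Observable.just("b1").delay(1, TimeUnit.SECONDS);

// Prints a1b1 through a5b1 once config has emitted its single value.
zipper(items, config, (item, cfg) -> item + cfg)
        .toBlocking()
        .forEach(System.out::println);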
UPDATE
Regarding the question of how my solution differs from the following one:
left.flatMap(valueLeft -> right.map(valueRight -> together(valueLeft, valueRight)));
Parallel execution - in my implementation both the left and right observables execute in parallel. The right observable doesn't have to wait for the left one to emit its first item.
Caching - my solution subscribes only once to the right observable and caches its result. That's why b1 will always be the same for all aXXX items. The solution provided by akarnokd subscribes to the right observable every time the left one emits an item. That means:
There is no guarantee that b1 won't change its value. For example, with the following observable you will get a different b for each a.
final Observable<Double> right = Observable.defer(new Func0<Observable<Double>>() {
    @Override
    public Observable<Double> call() {
        return Observable.just(Math.random());
    }
});
If the right observable is a time consuming operation (e.g. network call), you will have to wait for its completion every time the left observable emits a new item.

Related

Kafka streams event deduplication keeping last event in window

I'm using Kafka Streams to deduplicate events over short time windows (<= 1 minute).
First I tried to tackle the problem with the DSL API, using the .suppress(Suppressed.untilWindowCloses(...)) operator, but given that wall-clock time is not yet supported (I've seen KIP-424), this operator is not viable for my use case.
Then I followed this official Confluent example, in which the low-level Processor API is used. It was working fine but has one major limitation for my use case: the single event (obtained by deduplication) is emitted at the beginning of the time window, and subsequent duplicated events are "suppressed". In my use case I need the reverse of that, meaning that a single event should be emitted at the end of the window.
I'm asking for suggestions on how to implement this use case with Processor API.
My idea was to use the Processor API with a custom Transformer and a Punctuator.
The transformer would store the distinct keys received in a WindowStore without returning any KeyValue. At the same time, I'd schedule a punctuator running at an interval equal to the size of the window in the WindowStore. This punctuator will iterate over the elements in the store and forward them downstream.
The following are some core parts of the logic:
DeduplicationTransformer (slightly modified from official Confluent example):
@Override
@SuppressWarnings("unchecked")
public void init(final ProcessorContext context) {
    this.context = context;
    eventIdStore = (WindowStore<E, V>) context.getStateStore(this.storeName);

    // Schedule punctuator for this transformer.
    context.schedule(Duration.ofMillis(this.windowSizeMs), PunctuationType.WALL_CLOCK_TIME,
            new DeduplicationPunctuator<E, V>(eventIdStore, context, this.windowSizeMs));
}

@Override
public KeyValue<K, V> transform(final K key, final V value) {
    final E eventId = idExtractor.apply(key, value);
    if (eventId == null) {
        return KeyValue.pair(key, value);
    } else {
        if (!isDuplicate(eventId)) {
            rememberNewEvent(eventId, value, context.timestamp());
        }
        return null;
    }
}
DeduplicationPunctuator:
public DeduplicationPunctuator(WindowStore<E, V> eventIdStore, ProcessorContext context,
        long retainPeriodMs) {
    this.eventIdStore = eventIdStore;
    this.context = context;
    this.retainPeriodMs = retainPeriodMs;
}

@Override
public void punctuate(long invocationTime) {
    LOGGER.info("Punctuator invoked at {}, searching from {}",
            new Date(invocationTime), new Date(invocationTime - retainPeriodMs));

    KeyValueIterator<Windowed<E>, V> it =
            eventIdStore.fetchAll(invocationTime - retainPeriodMs, invocationTime + retainPeriodMs);
    while (it.hasNext()) {
        KeyValue<Windowed<E>, V> next = it.next();
        LOGGER.info("Punctuator running on {}", next.key.key());
        context.forward(next.key.key(), next.value);

        // Delete from store with tombstone
        eventIdStore.put(next.key.key(), null, invocationTime);
        context.commit();
    }
    it.close();
}
Is this a valid approach?
I'm running some integration tests with the previous code, and I have some synchronization issues. How can I be sure that the start of the window will coincide with the Punctuator's scheduled interval?
Also, as an alternative approach, I was wondering (I've googled with no result) whether there is any event triggered by a window closing to which I can attach a callback, in order to iterate over the store and publish only the distinct events.
Thanks.

How to create a multicast observable that activates on subscribe?

I want to fuse the inputs of several Android sensors and expose the output as an observable (or at least something that can be subscribed to) that supports multiple simultaneous observers. What's the idiomatic way to approach this? Is there a class in the standard library that would make a good starting point?
I was thinking of wrapping a PublishSubject in an object with delegates for one or more subscribe methods that test hasObservers to activate the sensors, and wrapping the returned Disposable in a proxy that tests hasObservers to deactivate them. Something like this, although it already has some obvious problems:
public class SensorSubject<T> {
    private final PublishSubject<T> mSubject = PublishSubject.create();

    public Disposable subscribe(final Consumer<? super T> consumer) {
        final Disposable d = mSubject.subscribe(consumer);
        if (mSubject.hasObservers()) {
            // activate sensors
        }
        return new Disposable() {
            @Override
            public void dispose() {
                // possible race conditions!
                if (!isDisposed()) {
                    d.dispose();
                    if (!mSubject.hasObservers()) {
                        // deactivate sensors
                    }
                }
            }

            @Override
            public boolean isDisposed() {
                return d.isDisposed();
            }
        };
    }
}
The idiomatic way to do that in RxJava would be to use a hot observable.
A cold observable performs its work when someone subscribes to it and emits all items to that subscriber, so it's a 1-to-1 relation.
A hot observable performs its work and emits items independently of individual subscriptions. So if you subscribe too late, you might miss values that were emitted earlier. This is a 1-to-many relation, aka multicast - which is what you want.
The usual way to do it is Flowable.publish(), which makes the Flowable multicast but requires calling the connect() method to start emitting values.
In your case you can also add refCount(), which gives you the desired functionality - it subscribes to the source Flowable when there is at least one subscriber and unsubscribes when everyone has unsubscribed.
Because publish().refCount() is a pretty popular combination, there is a shortcut for it - share(). And as far as I understand, this is exactly what you want.
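A minimal sketch of the three variants, assuming a hypothetical sensorFlowable source of SensorEvent items (the same operators also exist on Observable):
// publish() returns a ConnectableFlowable; nothing flows until connect() is called.
ConnectableFlowable<SensorEvent> published = sensorFlowable.publish();
published.connect();

// refCount() connects on the first subscription and disconnects when the last subscriber is gone.
Flowable<SensorEvent> refCounted = sensorFlowable.publish().refCount();

// share() is simply a shortcut for publish().refCount().
Flowable<SensorEvent> shared = sensorFlowable.share();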
Edit by asker: This code incorporates this answer and David Karnok's comment in the form of a Dagger 2 provider method. SimpleMatrix is from EJML. This seems to be doing what I asked for.
@Provides
@Singleton
@Named(MAGNETOMETER)
public Observable<SimpleMatrix> magnetometer(final SensorManager sensorManager) {
    final PublishSubject<SimpleMatrix> ps = PublishSubject.create();
    final Sensor sensor = sensorManager.getDefaultSensor(TYPE_MAGNETIC_FIELD);
    final SensorEventListener listener = new SensorEventAdapter() {
        @Override
        public void onSensorChanged(final SensorEvent event) {
            ps.onNext(new SimpleMatrix(1, 3, true, event.values));
        }
    };
    return ps.doOnSubscribe(s -> {
        sensorManager.registerListener(listener, sensor, SENSOR_DELAY_NORMAL);
    }).doOnDispose(() -> {
        sensorManager.unregisterListener(listener);
    }).share();
}

Does a FlowableOperator inherently support backpressure?

I've implemented a FlowableOperator as described in the RxJava 2 wiki (https://github.com/ReactiveX/RxJava/wiki/Writing-operators-for-2.0#operator-targeting-lift), except that I perform some testing in the onNext() operation, something like this:
public final class MyOperator implements FlowableOperator<Integer, Integer> {
    ...

    static final class Op implements FlowableSubscriber<Integer>, Subscription {
        @Override
        public void onNext(Integer v) {
            if (v % 2 == 0) {
                child.onNext(v * v);
            }
        }
        ...
    }
}
This operator is part of a chain where I have a Flowable created with the DROP backpressure strategy. In essence, it looks almost like this:
Flowable.<Integer>create(emitter -> myAction(), DROP)
        .filter(v -> v > 2)
        .lift(new MyOperator())
        .subscribe(n -> doSomething(n));
I've run into the following issue:
backpressure occurs, so doSomething(n) cannot keep up with the upstream items
items are dropped due to the backpressure strategy chosen
but doSomething(n) never receives new items after the drop has been performed, even once doSomething(n) is ready to deal with new items
Reading back the excellent blog post http://akarnokd.blogspot.fr/2015/05/pitfalls-of-operator-implementations.html by David Karnok, it seems that I need to add a request(1) in the onNext() method. But that was with RxJava 1...
So, my question is: is this fix enough in RxJava 2 to deal with my backpressure issue? Or does my operator have to implement all the atomics and drain machinery described in https://github.com/ReactiveX/RxJava/wiki/Writing-operators-for-2.0#atomics-serialization-deferred-actions to properly handle my backpressure issue?
Note: I've added the request(1) and it seems to work. But I can't figure out whether it's enough or whether my operator needs the tricky queue-drain and atomics stuff.
Thanks in advance!
Does a FlowableOperator inherently support backpressure?
FlowableOperator is an interface that is called for a given downstream Subscriber and should return a new Subscriber that wraps the downstream and modulates the Reactive Streams events passing in one or both directions. Backpressure support is the responsibility of the Subscriber implementation, not this particular functional interface. It could have been Function<Subscriber, Subscriber> but a separate named interface was deemed more usable and less prone to overload conflicts.
need to add a request(1) in the onNext() [...]
But I can't figure out whether it's enough or whether my operator needs the tricky stuff of queue-drain and atomics.
Yes, you have to do that in RxJava 2 as well. Since RxJava 2's Subscriber is not a class, it doesn't have v1's convenience request method. You have to save the Subscription in onSubscribe and call upstream.request(1) on the appropriate path in onNext. For your case, it should be quite enough.
I've updated the wiki with a new section explaining this case explicitly:
https://github.com/ReactiveX/RxJava/wiki/Writing-operators-for-2.0#replenishing
final class FilterOddSubscriber implements FlowableSubscriber<Integer>, Subscription {

    final Subscriber<? super Integer> downstream;

    Subscription upstream;

    // ...

    @Override
    public void onSubscribe(Subscription s) {
        if (upstream != null) {
            s.cancel();
        } else {
            upstream = s;                   // <-------------------------
            downstream.onSubscribe(this);
        }
    }

    @Override
    public void onNext(Integer item) {
        if (item % 2 != 0) {
            downstream.onNext(item);
        } else {
            upstream.request(1);            // <-------------------------
        }
    }

    @Override
    public void request(long n) {
        upstream.request(n);
    }

    // the rest omitted for brevity
}
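A hedged usage sketch of the class above with lift(), assuming the omitted parts of FilterOddSubscriber are filled in (a constructor that stores the downstream Subscriber, plus onError/onComplete forwarding and cancel()):
// lift() hands each downstream Subscriber to the operator, which wraps it.
FlowableOperator<Integer, Integer> filterOdd = downstream -> new FilterOddSubscriber(downstream);

Flowable.range(1, 10)
        .lift(filterOdd)
        .subscribe(System.out::println);   // 1, 3, 5, 7, 9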
Yes you have to do the tricky stuff...
I would avoid writing operators unless you are very sure about what you are doing - nearly everything can be achieved with the default operators...
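For instance, the even-squaring logic of MyOperator from the question could most likely be expressed with built-in operators instead of lift(), and filter/map already handle the request replenishing for you - a sketch:
Flowable.<Integer>create(emitter -> myAction(), DROP)
        .filter(v -> v > 2)
        .filter(v -> v % 2 == 0)   // same test as MyOperator's onNext
        .map(v -> v * v)           // same transformation as MyOperator's onNext
        .subscribe(n -> doSomething(n));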
Writing operators, source-like (fromEmitter) or intermediate-like (flatMap) has always been a hard task to do in RxJava. There are many rules to obey, many cases to consider but at the same time, many (legal) shortcuts to take to build a well performing code. Now writing an operator specifically for 2.x is 10 times harder than for 1.x. If you want to exploit all the advanced, 4th generation features, that's even 2-3 times harder on top (so 30 times harder in total).
The tricky stuff is explained here: https://github.com/ReactiveX/RxJava/wiki/Writing-operators-for-2.0

RxJava 2, zipped iterable and interval, executes only a single mapped observable

I have the following scenario I need to achieve:
perform a network call for each request object in a list, with a 1-second delay between calls
and I have the following implementation using RxJava 2:
emit an interval stream
emit an iterable stream
zip them to emit each item from the iterable source
which so far has no problem, and I fully understand how it works. Now I integrated the above with the following:
map each item emitted from zip into a new observable that defer/postpone an observable source for a network call
each mapped-emitted observable will perform an individual network call for each request
and I ended up with the following code:
Observable
        .zip(Observable.interval(1, TimeUnit.SECONDS),
                Observable.fromIterable(iterableRequests),
                new BiFunction<Long, RequestInput, RequestResult>() {
            @Override
            public RequestResult apply(@NonNull Long aLong,
                    @NonNull final RequestInput request) throws Exception {
                return request;
            }
        })
        .map(new Function<RequestResult, ObservableSource<?>>() {
            @Override
            public ObservableSource<?> apply(@NonNull RequestResult requestResult) throws Exception {
                // map each requestResult into this observable and perform a new stream
                return Observable
                        .defer(new Callable<ObservableSource<?>>() {
                            // return a postponed observable for each subscriber
                        })
                        .retryWhen(new Function<Observable<Throwable>, ObservableSource<?>>() {
                            // return throwable observable
                        });
            }
        })
        .subscribe(new Observer<ObservableSource<?>>() {
            //.. onSubscribe {}
            //.. onError {}
            //.. onComplete {}

            @Override
            public void onNext(ObservableSource<?> observableSource) {
                // actual subscription for each of the Observable.defer inside
                // so it will start to emit and perform the necessary operation
            }
        });
but the problem is, it executes the Observable.defer source only ONCE, although it keeps on iterating (I put a Log inside the map operator to see the iteration).
Can anyone please guide me on how I can achieve what I want? I've gone through a lot of paper, drawing a lot of marble diagrams, just to see where I'm at with my code.
I don't know whether the diagram I created illustrates what I want; if it does, I don't know why the sample code doesn't behave the way the diagram portrays it.
Any help would be greatly appreciated.
The first part is fine, but the map step is a bit unnecessary. What you are doing is mapping each RequestResult to an Observable and then manually subscribing to it in Observer.onNext(). Actually, the defer is not necessary either, as you're creating a separate Observable for each RequestResult with different data; the defer happens on each subscribe you do in onNext(), and the map happens, as you observed, for each emission of the zipped RequestResult.
What you probably need is a simple flatMap() to map each RequestResult value to a separate Observable that performs the network request; it will merge the result of each request back into the stream, so you just need to handle the final value emissions for each request instead of subscribing manually to each Observable.
Just keep in mind that ordering might be lost in case some requests take longer than the delay between them.
Observable
        .zip(Observable.interval(1, TimeUnit.SECONDS),
                Observable.fromIterable(iterableRequests),
                new BiFunction<Long, RequestInput, RequestResult>() {
            @Override
            public RequestResult apply(@NonNull Long aLong,
                    @NonNull final RequestInput request) throws Exception {
                return request;
            }
        })
        .flatMap(new Function<RequestResult, ObservableSource<?>>() {
            @Override
            public ObservableSource<?> apply(RequestResult requestResult) throws Exception {
                return createObservableFromRequest(requestResult)
                        .retryWhen(new Function<Observable<Throwable>, ObservableSource<?>>() {
                            // return throwable observable
                        });
            }
        })
        .subscribe(new Observer<Object>() {
            //.. onSubscribe {}
            //.. onError {}
            //.. onComplete {}

            @Override
            public void onNext(Object networkResult) {
                // do something with each network request result emission
            }
        });
I managed to make it work: somewhere inside the Observable.defer, my retrofit client was null,
retrofitClient.getApiURL().post(request); // client was null
my retrofitClient was null (I looked through the code and noticed it was not initialized; I initialized it properly and made it work).
Now, can anybody tell me why Rx didn't throw an exception back to the original observable stream? No NullPointerException occurred - I'm confused.

Understanding RxJava: Differences between Runnable callback

I'm trying to understand RxJava and I'm sure this question is nonsense... I have this code using RxJava:
public Observable<T> getData(int id) {
    if (dataAlreadyLoaded()) {
        return Observable.create(new Observable.OnSubscribe<T>() {
            @Override
            public void call(Subscriber<? super T> subscriber) {
                T data = getDataFromMemory(id);
                subscriber.onNext(data);
            }
        });
    }
    return Observable.create(new Observable.OnSubscribe<T>() {
        @Override
        public void call(Subscriber<? super T> subscriber) {
            T data = getDataFromRemoteService(id);
            subscriber.onNext(data);
        }
    });
}
And, for instance, I could use it this way:
Action1<String> action = new Action1<String>() {
    @Override
    public void call(String s) {
        // Do something with s
    }
};

getData(3).subscribe(action);
and here is another version with a callback that implements Runnable:
public void getData(int id, MyClassRunnable callback) {
    if (dataAlreadyLoaded()) {
        T data = getDataFromMemory(id);
        callback.setData(data);
        callback.run();
    } else {
        T data = getDataFromRemoteService(id);
        callback.setData(data);
        callback.run();
    }
}
And I would use it this way:
getData(3, new MyClassRunnable()); //Do something in run method
What are the differences? Why is the first one better?
The question is not about the framework itself but the paradigm. I'm trying to understand the use cases of reactive.
I appreciate any help. Thanks.
First of all, your RxJava version is much more complex than it needs to be. Here's a much simpler version:
public Observable<T> getData(int id) {
    return Observable.fromCallable(() ->
            dataAlreadyLoaded() ? getDataFromMemory(id) : getDataFromRemoteService(id)
    );
}
Regardless, the problem you present is so trivial that there is no discernible difference between the two solutions. It's like asking which one is better for assigning integer values - var = var + 1 or var++. In this particular case they are identical, but when using assignment there are many more possibilities (adding values other than one, subtracting, multiplying, dividing, taking into account other variables, etc).
So what is it you can do with reactive? I like the summary on reactivex's website:
Easily create event streams or data streams. For a single piece of data this isn't so important, but when you have a stream of data the paradigm makes a lot more sense.
Compose and transform streams with query-like operators. In your above example there are no operators and a single stream. Operators let you transform data in handy ways, and combining multiple callbacks is much harder than combining multiple Observables.
Subscribe to any observable stream to perform side effects. You're only listening to a single event. Reactive is well-suited for listening to multiple events. It's also great for things like error handling - you can create a long sequence of events, but any errors are forwarded to the eventual subscriber.
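As a small illustration of that last point, here is a minimal sketch (with made-up values) of an error raised in the middle of a chain being delivered to the eventual subscriber's onError:
Observable.just("1", "two", "3")
        .map(Integer::parseInt)   // "two" throws NumberFormatException
        .subscribe(
                n -> System.out.println("got " + n),
                e -> System.out.println("error reached the subscriber: " + e));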
Let's look at a more concrete example with a bit more intrigue: validating an email and a password. You've got two text fields and a button. You want the button to become enabled once there is an email (let's say matching .*@.*) and a password (of at least 8 characters) entered.
I've got two Observables that represent whatever the user has currently entered into the text fields:
Observable<String> email = /* you figure this out */;
Observable<String> password = /* and this, too */;
For validating each input, I can map the input String to true or false.
Observable<Boolean> validEmail = email.map(str -> str.matches(".*@.*"));
Observable<Boolean> validPw = password.map(str -> str.length() >= 8);
Then I can combine them to determine if I should enable the button or not:
Observable.combineLatest(validEmail, validPw, (b1, b2) -> b1 && b2)
.subscribe(enableButton -> /* enable button based on bool */);
Now, every time the user types something new into either text field, the button's state gets updated. I've set up the logic so that the button simply reacts to the state of the text fields.
This simple example doesn't show it all, but it shows how things get a lot more interesting after you get past a simple subscription. Obviously, you can do this without the reactive paradigm, but it's simpler with reactive operators.