Repeat last element in Flux if no elements are available upstream - reactive-programming

I am searching for a way to repeat the last element when the subscriber of a Flux requests more elements but the publisher has not supplied a new one.
Of course this approach would logically introduce eager streaming, but in my case that's exactly what I want, similarly to onBackpressureDrop and others, where an infinite demand is requested upstream.
I kind of need the exact opposite - with my subscriber being faster than the publisher.

I struggle to think of a case where it wouldn't be better for the subscriber to simply cache the last emitted value within itself and do what it needs to do there (whether that's looping, firing on a scheduled executor or something else entirely) rather than deliberately having an infinite demand on the last value emitted by the Flux.
Something akin to the following might work, but is incredibly hacky (that being said, I couldn't think of a better way):
flux.subscribe(str -> {
    Mono.just(str).repeat().takeUntilOther(flux.next())
        .subscribe(s -> {
            //Actual subscriber
        });
});
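The first suggestion above (let the subscriber cache the last emitted value and re-read it whenever it needs one) can be sketched without any special operator. This is only a minimal sketch in plain Java: the class LatestValueCache and its method current() are hypothetical names, not Reactor API. Since the class implements Consumer, an instance could presumably be handed to flux.subscribe(...) as the element callback.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical helper: remembers the most recent element pushed by a
// publisher so that a faster consumer can re-read it as often as it likes.
final class LatestValueCache<T> implements Consumer<T> {
    private final AtomicReference<T> latest = new AtomicReference<>();

    // Called from the publisher side for every new element.
    @Override
    public void accept(T value) {
        latest.set(value);
    }

    // Called from the consumer side whenever it wants a value;
    // empty until the first element has arrived.
    public Optional<T> current() {
        return Optional.ofNullable(latest.get());
    }
}
```

The consumer then decides for itself how often to call current() (in a loop, on a scheduled executor, and so on), which keeps the "repeat" behavior out of the stream.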

Related

Parallel design of program working with Flink and scala

This is the context:
There is an input event stream.
There are some methods to apply to the stream, each of which applies different logic to evaluate each event and decide whether it is a "good" or a "bad" event.
An event is truly "good" only if it passes all of the methods; otherwise it is a "bad" event.
There is an output event stream that carries the result for each event together with its eventID.
To solve this problem, I have two ideas:
We can apply each method sequentially to each event. But this is a kind of batch processing and doesn't use the advantages of stream processing; it also takes Time(M(ethod)1) + Time(M2) + Time(M3) + ..., which may not be suitable for real-time processing.
We can pass the input stream to each method and run the methods in parallel; each method saves the bad events into permanent storage, and the Main method then queries the permanent storage to get the result for each event. But this raises some problems:
How to execute the methods in parallel in the programming language (e.g. Scala), and what about the performance (network, CPUs, memory)?
How to solve the synchronization problem? The methods certainly need some time to calculate and save the flag into permanent storage, but Main needs much less time to query the flag, so a delay issue occurs.
etc.
This is not strictly a technical or design question; I would like to hear your ideas. Do you have other approaches, or ideas for solving the problems above? Looking forward to your opinions.
Parallel streams, each doing the full set of evaluations sequentially, is the more straightforward solution. But if that introduces too much latency, then you can fan out the evaluations to be done in parallel, and then bring the results back together again to make a decision.
To do the fan-out, look at the split operation on DataStream, or use side outputs. But before doing this n-way fan-out, make sure that each event has a unique ID. If necessary, add a field containing a random number to each event to use as the unique ID. Later we will use this unique ID as a key to gather back together all of the partial results for each event.
Once the event stream is split, each copy of the stream can use a MapFunction to compute one of the evaluation methods.
Gathering all of these separate evaluations of a given event back together is a bit more complex. One reasonable approach here is to union all of the result streams together, and then key the unioned stream by the unique ID described above. This will bring together all of the individual results for each event. Then you can use a RichFlatMapFunction (using Flink's keyed, managed state) to gather the results for the separate evaluations in one place. Once the full set of evaluations for a given event has arrived at this stateful flatmap operator, it can compute and emit the final result.
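The gathering step described above can be sketched in plain Java. This is not Flink API: the class VerdictGatherer, its method onPartialResult and the parameter expectedEvaluations are hypothetical names, and the HashMap merely stands in for the keyed, managed state that a RichFlatMapFunction would use after the unioned stream has been keyed by the unique ID.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the stateful flatmap operator: collects the
// partial verdicts for each event ID and produces a final verdict once
// all expected evaluations have arrived.
final class VerdictGatherer {
    // Per-event progress: how many results arrived and whether all passed so far.
    private static final class Progress {
        int received;
        boolean allPassed = true;
    }

    private final int expectedEvaluations;
    private final Map<String, Progress> pending = new HashMap<>();

    VerdictGatherer(int expectedEvaluations) {
        this.expectedEvaluations = expectedEvaluations;
    }

    // Feed one partial result. Returns the final verdict (true = "good")
    // when this was the last outstanding evaluation for the event,
    // otherwise an empty Optional.
    Optional<Boolean> onPartialResult(String eventId, boolean passed) {
        Progress p = pending.computeIfAbsent(eventId, id -> new Progress());
        p.received++;
        p.allPassed &= passed;
        if (p.received < expectedEvaluations) {
            return Optional.empty();
        }
        pending.remove(eventId); // release the state once the event is decided
        return Optional.of(p.allPassed);
    }
}
```

Partial results for different events may arrive interleaved; because the state is looked up by event ID, each event is decided independently of the others.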

Converting Rx-Observables to Twitter Futures in Scala

I want to implement the following functions in the most reactive way. I need these for implementing the bijections for automatic conversion between the said types.
def convertScalaRXObservableToTwitterFuture[A](a: Observable[A]): TwitterFuture[A] = ???
def convertScalaRXObservableToTwitterFutureList[A](a: Observable[A]): TwitterFuture[List[A]] = ???
I came across this article on a related subject but I can't get it working.
Unfortunately, the claim in that article is not correct: there can't be a true bijection between Observable and anything like Future. Observable is a more powerful abstraction that can represent things a Future cannot. For example, an Observable might represent an infinite sequence; see Observable.interval. Obviously there is no way to represent something like this with a Future. The documentation of the Observable.toList call used in that article explicitly mentions this:
Returns a Single that emits a single item, a list composed of all the items emitted by the finite source ObservableSource.
and later it says:
Sources that are infinite and never complete will never emit anything through this operator and an infinite source may lead to a fatal OutOfMemoryError.
Even if you limit yourself to finite Observables, Future still can't fully express the semantics of Observable. Consider Observable.intervalRange, which generates a limited range one by one over some time period. With Observable, the first event comes after initialDelay and then you get an event each period. With Future you get only one event, and only when the sequence is fully generated, i.e. when the Observable has completed. This means that by transforming Observable[A] into Future[List[A]] you immediately lose the main benefit of Observable, its reactivity: you can't process events one by one; you have to process them all in a single batch.
To sum up the claim at the first paragraph of the article:
convert between the two, without loosing asynchronous and event-driven nature of them.
is false because the conversion Observable[A] -> Future[List[A]] loses exactly the "event-driven nature" of Observable, and there is no way to work around this.
P.S. Actually, the fact that Future is less powerful than Observable should not be a big surprise. If it were not, why would anybody create Observable in the first place?
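To make the loss of per-event reactivity concrete, here is a minimal sketch that uses the JDK's java.util.concurrent.Flow and CompletableFuture as stand-ins for Observable and Twitter's Future. That substitution is an assumption made to keep the example self-contained; it is not the RxScala or Twitter util API, and the class name ToListSubscriber is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Flow;

// Hypothetical adapter: buffers every item and completes a future with the
// whole list only when the source signals completion.
final class ToListSubscriber<T> implements Flow.Subscriber<T> {
    private final List<T> buffer = new ArrayList<>();
    private final CompletableFuture<List<T>> result = new CompletableFuture<>();

    CompletableFuture<List<T>> future() {
        return result;
    }

    @Override
    public void onSubscribe(Flow.Subscription subscription) {
        subscription.request(Long.MAX_VALUE); // unbounded demand: take everything
    }

    @Override
    public void onNext(T item) {
        buffer.add(item); // items are only stored, never handed on one by one
    }

    @Override
    public void onError(Throwable error) {
        result.completeExceptionally(error);
    }

    @Override
    public void onComplete() {
        result.complete(new ArrayList<>(buffer)); // the single "event" of the future
    }
}
```

The future can only be completed from onComplete (or onError), so a source that never completes never produces a result, and even a finite source delivers all of its items to the caller in one batch.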

Resubscribing same shared observable inside flatMap of own data emits no data. By design?

I'm trying to migrate to rx-java2 and came across a problem with resubscribing to a shared observable inside its own flatMap. I need this pattern for a get-update-refresh chain:
Get current data from network (shared observable to avoid multiple network requests if source is being subscribed by several observers at the same time).
Modify the data and send it back to server (completable)
Get the data again after update completes
The whole thing looks like this:
@Test
fun sharedTest() {
    val o = Observable.just(1).share()
    assertEquals(1, o
        .take(1)
        .flatMap({
            Completable.complete()
                .andThen(o) })
        .blockingFirst())
}
The test fails with: java.util.NoSuchElementException
If o is not shared everything works.
That behavior seems to be because the latter subscriber comes when a single value of original has already been dispatched and only onComplete event is to be seen.
Does anybody know whether this behavior is by design and documented somewhere? There is a workaround of course, but I need to know the cause, as this is a bit annoying. The approach worked in Rx 1.x.
Currently using version 2.1.3
Edit:
Seems to be no legitimate way to "restart" a shared observable and its side-effects as there is no guarantee other subscribers are not listening at the moment.
Take a look at the marble diagram for 'share' and you'll see why it behaves like that: Observable.share().
share() only emits items that are emitted after the subscription; it does not re-emit previously emitted items. Take a look at Observable.replay() for the behavior you expect.

How to emulate a BehaviorSubject with a connectable Observable in RX-Scala

Is there a way to make an Observable emulate a BehaviorSubject (but without the Observer interface) in rx-scala? I.e. make it an Observable with memory, so that it can have multiple subscriptions, and on each new subscription, it produces the last emitted value?
Observable.publish() does half the job, but it doesn't emit the last value. Observable.cache.publish() on the other hand replays all values - I would need something like that, but which only replays the last emitted value, to handle infinite streams.
Rx-Java solutions also accepted, although the native Scala form is preferred!
How about simply using the existing BehaviorSubject Scala implementation? As you can see, it's certainly available in 0.16.0, and I'm certain 0.15.0 includes it as well.
With the Scala bindings, use observable.replay(1).refCount.
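What "replay only the last emitted value" means for a late subscriber can be illustrated with a small plain-Java sketch. It is not how RxScala or RxJava implement replay or BehaviorSubject: the class LatestReplayingSource and its methods are hypothetical, and the sketch is single-threaded and ignores completion, errors, unsubscription and reference counting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical, single-threaded illustration of "replay the last value":
// every new subscriber first receives the most recent item (if any) and
// then all later items.
final class LatestReplayingSource<T> {
    private final List<Consumer<? super T>> subscribers = new ArrayList<>();
    private T last;          // most recent item, unset until the first emit
    private boolean hasLast; // distinguishes "nothing emitted yet" from a value

    void subscribe(Consumer<? super T> subscriber) {
        if (hasLast) {
            subscriber.accept(last); // replay exactly one item, not the full history
        }
        subscribers.add(subscriber);
    }

    void emit(T item) {
        last = item;
        hasLast = true;
        for (Consumer<? super T> s : subscribers) {
            s.accept(item);
        }
    }
}
```

Because only one item is retained, memory use stays constant even for an infinite stream, which is the property asked for in the question.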

Requesting a clear, picturesque explanation of Reactive Extensions (RX)?

For a long time now I have been trying to wrap my head around RX. And, to be honest, I am never sure whether I got it or not.
Today, I found an explanation on http://reactive-extensions.github.com/RxJS/ which - in my opinion - is horrible. It says:
RxJS is to events as promises are to async.
Great. This is a sentence so full of complexity that if you do not have the slightest idea of what RX is about, after that sentence you are quite as dumb as before.
And this is basically my problem: All the explanations in the usual places you find about RX make (at least me) feel dumb. They explain RX as a highly sophisticated concept with lots of highly complicated words and terms and whatnot, and I am never quite sure what it is about.
So my question is: How would you explain RX to someone who is five years old? I'd like a clear, picturesque explanation of what it is, what it is good for, and what its main concepts are.
So, LINQ (in JavaScript, these are high-level array methods like map, filter, reduce, etc - if you're not a C# dev, just replace that whenever I mention 'LINQ') gives you a bunch of tools that you can apply to Sequences ("Lists" in a crude sense), in order to filter and transform an input into an output (aka "A list that's actually interesting to me"). But what is a list?
What is a List?
A List is some elements, in a particular order. I can take any list and transform it into a better list with LINQ.
(Not necessarily sorted order, but an order).
An Event is a List
But what about an Event? Let's subscribe to an event:
OnKeyUp += (o,e) => Console.WriteLine(e.Key)
>>> 'H'
>>> 'e'
>>> 'l'
>>> 'l'
>>> 'o'
Hm. That looks like some things, in a particular order. It now suddenly dawns upon you, a list and an event are the same thing!
If Lists and Events are the Same....
...then why can't I transform and filter input events into more interesting events? That's what Rx is. It takes everything you know about dealing with sequences, including all of the LINQ operators like Select and Where and Aggregate, and applies it to events.
Easy peasy.
A Callback is a Sequence Too
Isn't a Callback just basically an Event that only happens once? Isn't it basically just like a List with one item? Turns out it is, and one of the interesting things about Rx is that it lets us treat Events and Callbacks (and things like Geolocation requests) with the same language (i.e. we can combine the two, or wait for either one or the other, etc.).
Along with Paul's excellent answer I'd like to add the concept of pulling vs pushing data.
Pipeline
Let's take the example of some code that generates a series of numbers and outputs the result. If you think of this as a stream, on one end you have a producer that is creating new numbers for you, and on the other end you have a consumer that is doing something with those numbers.
Pull - Primes List
Let's say the producer is generating a list of prime numbers. Normally you would have some function that yields a list of numbers, and every time it returned it would pass the next value it has calculated through the pipe to the consumer, which would output that number to the screen.
Prime Generator ---> Console.WriteLine
In this scenario it is easy to see that the producer is doing most of the work, and the consumer would be sitting around waiting for the producer to send the next value. The consumer is pulling on the pipeline, waiting for the producer to return the next value.
Push - Progress percent events from a fast process (Reactive)
Ok, let's say you have a function that is processing 1,000,000 items. Each item takes milliseconds to process, and then the function yields out a percentage value of how far it has gotten. So lots of progress values, very fast.
At the other end of the pipeline you have a progress bar. Now if the progress bar were to handle every update, the UI would block trying to keep up with the stream of values.
1-Million-Items-Processor ---> Progress Bar
In this scenario the data is being pushed through the pipeline by the producer and then the consumer is blocking because too much data is being pushed for it to handle.
Reactive allows you to put in delays, windows, or to sample the pipeline depending on how you wish to consume the data. In this case I would sample the data every second before updating the progress bar.
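The sampling idea can be sketched in plain Java over a recorded list of timestamped updates. This is only an illustration of what sampling does to the data, not an Rx operator: the names ProgressSampler, Update and sample are hypothetical, and the sketch assumes the updates are ordered by time with non-negative timestamps.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of sampling: from a fast stream of progress
// updates, keep only the latest update of each time window.
final class ProgressSampler {
    // One pushed progress event: when it arrived and the percentage it carried.
    static final class Update {
        final long timeMillis;
        final int percent;

        Update(long timeMillis, int percent) {
            this.timeMillis = timeMillis;
            this.percent = percent;
        }
    }

    // Returns the last update seen in each window of periodMillis milliseconds.
    static List<Update> sample(List<Update> updates, long periodMillis) {
        List<Update> sampled = new ArrayList<>();
        Update lastInWindow = null;
        long currentWindow = -1;
        for (Update u : updates) {
            long window = u.timeMillis / periodMillis;
            if (lastInWindow != null && window != currentWindow) {
                sampled.add(lastInWindow); // window closed: emit its latest value
            }
            lastInWindow = u;
            currentWindow = window;
        }
        if (lastInWindow != null) {
            sampled.add(lastInWindow); // flush the final window
        }
        return sampled;
    }
}
```

With a period of 1000 ms, the progress bar would be redrawn at most once per second, however many updates the processor pushes in between.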
Lists vs Events
So lists and events are kinda the same. The difference is whether the data is pulled or pushed through the system. With lists the data is pulled. With events the data is pushed.