Message processing throttling/backpressure

I have a source of messages, which is an Observable. For every message I would like to make an HTTP call, which produces another Observable, so I combine them with flatMap and then sink the results to a subscriber. Here is the code for this scenario:
Rx.Observable.interval(1000)
  .flatMap (tick) ->
    # returns an `Observable`
    loadMessages()
  .flatMap (message) ->
    # also returns an `Observable`
    makeHttpRequest(message)
  .subscribe (result) ->
    console.info "Processed: ", result
This example is written in CoffeeScript, but I think the problem statement is valid for any other Rx implementation.
The issue I have with this approach is that loadMessages produces a lot of messages very quickly. This means that I make a lot of HTTP requests in a very short period of time. This is not acceptable in my situation, so I would like to limit the number of parallel HTTP requests to 10 or so. In other words, I would like to throttle the pipeline or apply some kind of backpressure when making HTTP requests.
Is there a standard approach or best practice in Rx for dealing with this kind of situation?
Currently I have implemented a very simple (and pretty suboptimal) backpressure mechanism that ignores a tick if the system has too many messages in processing. It looks like this (simplified version):
Rx.Observable.interval(1000)
  .filter (tick) ->
    stats.applyBackpressureBasedOnTheMessagesInProcessing()
  .do (tick) ->
    stats.messageIn()
  .flatMap (tick) ->
    # returns an `Observable`
    loadMessages()
  .flatMap (message) ->
    # also returns an `Observable`
    makeHttpRequest(message)
  .do (tick) ->
    stats.messageOut()
  .subscribe (result) ->
    console.info "Processed: ", result
I'm not sure, though, whether this can be done better, or whether Rx already has some mechanism in place to deal with this kind of requirement.

This isn't strictly backpressure; it is just limiting concurrency. Here's an easy way to do it (ignore my possibly wrong syntax, coding via TextArea):
Rx.Observable.interval(1000)
  .flatMap (tick) ->
    # returns an `Observable`
    loadMessages()
  .map (message) ->
    # also returns an `Observable`, but only when
    # someone first subscribes to it
    Rx.Observable.defer ->
      makeHttpRequest(message)
  .merge 10 # at a time
  .subscribe (result) ->
    console.info "Processed: ", result
In C#, the equivalent is to replace SelectMany with Select(Defer(x)).Merge(n). Merge(int) subscribes to at most n Observables at a time and buffers the rest until a slot frees up. The Defer is there so that no work starts until Merge(n) subscribes to the inner Observable.
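For readers who want to see the "at most n in flight, the rest wait" idea outside of Rx, here is a minimal sketch in Python with asyncio. It is not the Rx operator itself: a semaphore plays the role of the concurrency limit, and make_http_request, process_all and the stats dictionary are hypothetical names made up for the illustration (the "request" is only a short sleep).

```python
import asyncio

MAX_IN_FLIGHT = 10  # same role as the 10 passed to merge above


async def make_http_request(message, stats):
    # Hypothetical stand-in for a real HTTP call; it only records concurrency.
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.01)  # simulated network latency
    stats["in_flight"] -= 1
    return f"done:{message}"


async def process_all(messages, limit=MAX_IN_FLIGHT):
    stats = {"in_flight": 0, "peak": 0}
    gate = asyncio.Semaphore(limit)

    async def guarded(message):
        # The request coroutine is only created once a slot is free,
        # which mirrors deferring the work until merge subscribes.
        async with gate:
            return await make_http_request(message, stats)

    # gather preserves input order in its result list.
    results = await asyncio.gather(*(guarded(m) for m in messages))
    return results, stats["peak"]
```

Calling asyncio.run(process_all(range(50))) returns the 50 results together with the highest number of requests that were running at once, which should never exceed the limit.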

In RxJS you can use the backpressure submodule:
http://rxjs.codeplex.com/SourceControl/latest#src/core/backpressure/
Disclaimer: I have never used the JS version of Rx, but you did ask for a standard way of implementing backpressure, and the core library seems to support it. Rx for C# does not have this support yet; I'm not sure why.

It sounds like you want to pull from a queue rather than push your HTTP requests. Is Rx really the right choice of technology here?
EDIT:
In general, I would not design a solution using Rx where I had complete imperative control over the source events. It's just not a reactive scenario.
The backpressure module in RxJS is clearly written to deal with situations where you don't own the source stream. Here you do.
TPL Dataflow sounds like a far better fit here.
If you must use Rx, you could set up a loop like this. To limit processing to X concurrent events, set up a Subject to act as your message source and imperatively push (OnNext) X messages into it. In your subscriber, push a new message to the Subject each time the OnNext handler finishes an item, until the source is exhausted. This guarantees a maximum of X messages in flight.
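The "pull the next message only when one finishes" loop can be sketched without Rx as a small worker pool. This is a Python asyncio illustration of the pull-based idea, not the Subject-based code described above; handle and drain are hypothetical names, and the handler just sleeps briefly.

```python
import asyncio


async def handle(message):
    # Hypothetical handler standing in for the HTTP call.
    await asyncio.sleep(0.01)
    return message * 2


async def drain(messages, max_in_flight=3):
    queue = asyncio.Queue()
    for m in messages:
        queue.put_nowait(m)

    results = []
    in_flight = 0
    peak = 0

    async def worker():
        nonlocal in_flight, peak
        # Each worker takes the next message only after finishing the
        # previous one, so at most max_in_flight are ever being handled.
        while not queue.empty():
            message = queue.get_nowait()
            in_flight += 1
            peak = max(peak, in_flight)
            results.append(await handle(message))
            in_flight -= 1

    await asyncio.gather(*(worker() for _ in range(max_in_flight)))
    return results, peak
```

The design choice here is that the consumers drive the pace: nothing is taken from the queue unless a worker is free, so no separate throttling step is needed.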

Related

Buffering slow subscribers in Swift combine

I'm currently struggling to get a desired behaviour when using Combine. I've previously used an Rx framework and believe (from what I remember) that the described scenario is possible there by specifying backpressure strategies for buffering.
The issue is that I have a publisher that publishes values very rapidly and two subscribers to it: one that can react just as fast as the values are published (cool beans), and a second one that runs some CPU-expensive processing.
I know that in order to support the second, slower subscriber I need to buffer values, but I don't seem to be able to make this happen. Here is what I have so far:
let subject = PassthroughSubject<Int, Never>()
// publish some values
Task {
    for i in 0... {
        subject.send(i)
    }
}
subject
    .print("fast")
    .sink { _ in }
subject
    .map { n -> Int in
        sleep(1) // CPU intensive work here
        return n
    }
    .print("slow")
    .sink { _ in }
Originally I thought I could use .buffer(..) on the slow subscriber, but that doesn't appear to work here. What seems to happen is that the subject dispatches to each subscriber in turn, and only after a subscriber finishes does it demand more from the publisher. In this case that seems to block the .send(..) call in the publishing loop.
Any advice would be greatly appreciated 👍

Why/How should I use Publish without Connect?

Why/how should I use .Publish() without a Connect or RefCount call following? What does it do? Example code:
var source = new Subject<int>();
var pairs = source.Publish(_source => _source
    .Skip(1)
    .Zip(_source, (newer, older) => (older, newer))
);
pairs.Subscribe(p => Console.WriteLine(p));
source.OnNext(1);
source.OnNext(2);
source.OnNext(3);
source.OnNext(4);
How is pairs different from pairs2 here:
var pairs2 = source
    .Skip(1)
    .Zip(source, (newer, older) => (older, newer));

The Publish<TSource, TResult>(Func<IObservable<TSource>, IObservable<TResult>> selector) overload is poorly documented. Lee Campbell doesn't cover it in introtorx.com. It doesn't return an IConnectableObservable, which is what most people associate with Publish, and therefore doesn't require or support a Connect or RefCount call.
This form of Publish is basically a form of defensive coding against possible side effects in a source observable. It subscribes once to the source and can then safely 'multicast' all messages via the passed-in parameter. If you look at the question code, there's only one mention of source and two mentions of _source. _source here is the safely multicasted observable; source is the unsafe one.
In the above example, the source is a simple Subject, so it's not really unsafe, and therefore Publish has no effect. However, if you were to replace source with this:
var source = Observable.Create<int>(o =>
{
    Console.WriteLine("Print me once");
    o.OnNext(1);
    o.OnNext(2);
    o.OnNext(3);
    o.OnNext(4);
    return System.Reactive.Disposables.Disposable.Empty;
});
...you would find "Print me once" printed once with pairs (correct), and twice with pairs2. This effect has similar implications where your observable wraps things like DB queries, web requests, network calls, file reads, and other side-effecting code that you want to happen only once and not multiple times.
TL;DR: If you have an observable query that references an observable twice, it is best to wrap that observable in a Publish call.
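The "subscribe once, fan out to several consumers" effect has a rough pull-based analogue that may make it easier to see. The sketch below uses Python generators and itertools.tee instead of Rx; it is an analogy under that assumption, not how Publish is implemented, and source, pairs_shared, pairs_unshared and side_effects are hypothetical names.

```python
import itertools

side_effects = []


def source():
    # Stands in for an observable whose subscription has a side effect.
    side_effects.append("Print me once")
    yield from (1, 2, 3, 4)


def pairs_shared():
    # One underlying iteration, fanned out to two consumers
    # (roughly the role Publish(selector) plays above).
    older, newer = itertools.tee(source())
    next(newer, None)  # Skip(1)
    return list(zip(older, newer))


def pairs_unshared():
    # Two independent iterations, so the side effect runs twice.
    older, newer = source(), source()
    next(newer, None)  # Skip(1)
    return list(zip(older, newer))
```

Both functions produce the same (older, newer) pairs; the difference is only in how many times the body of source runs, which is recorded in side_effects.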

Confused about Observable vs. Single in functions like readCharacteristic()

In the RxJava2 version of RxAndroidBle, the functions readCharacteristic() and writeCharacteristic() return Single<byte[]>.
The example code to read a characteristic is:
device.establishConnection(false).flatMap(rxBleConnection -> rxBleConnection.readCharacteristic(characteristicUUID))
But the documentation for flatMap() says the mapping function is supposed to return an ObservableSource. Here, it returns a Single. How can this work?
Update: I looked at possibilities using operators like .single() and .singleOrError() but they all seem to require that the upstream emits one item and then completes. But establishConnection() doesn't ever complete. (This is one reason I suggested that perhaps establishConnection() should be reimagined as a Maybe, and some other way be provided to disconnect rather than just unsubscribing.)

You're totally correct, this example cannot be compiled. It's probably a leftover from the RxJava1 version, where Single didn't exist.
A simple fix with the same result is to use RxJava2's flatMapSingle, for instance:
device.establishConnection(false)
    .flatMapSingle(rxBleConnection -> rxBleConnection.readCharacteristic(characteristicUUID))
flatMapSingle accepts a function that returns a Single, and emits each Single's success value on the resulting Observable.
The point is that RxJava2 has more specific reactive types, each of which exposes the series of emissions you can expect from it. Some methods now return a Single because that matches the logical shape of their stream (readCharacteristic()), and some return an Observable because they emit more than one item (establishConnection(), whose connection status can change over time).
But RxJava2 also provides many operators to convert between the different types, and which one to use really depends on your needs and scenario.

Thanks Rob!
In fact, the README was outdated and needed some polishing here and there. Please have a look and see if it's OK now.

I think I found the answer I was looking for. The crucial point:
Single.fromObservable(observableSource) doesn't do anything until it receives the second item from observableSource! Assuming that the first item it receives is a valid emission, then if the second item is:
onComplete(), it passes the first item to onSuccess();
onNext(), it signals IndexOutOfBoundsException since a Single can't emit more than one item;
onError(), it presumably forwards the error downstream.
Now, device.establishConnection() is a 1-item, non-completing Observable. The RxBleConnection it emits is flatMapped to a Single with readCharacteristic(). But (another gotcha), flatMapSingle subscribes to these Singles and combines them into an Observable, which doesn't complete until the source establishConnection() does. But the source doesn't ever complete! Therefore the Single we're trying to create won't emit anything, since it doesn't receive that necessary second item.
The solution is to force the generation of onComplete() after the first (and only) item, which can be done with take(1). This will satisfy the Single we're creating, and cause it to emit the Characteristic value we're interested in. Hope that's clear.
The code:
Single<byte[]> readCharacteristicSingle( RxBleDevice device, UUID characteristicUUID ) {
    return Single.fromObservable(
        device.establishConnection( false )
            .flatMapSingle( connection -> connection.readCharacteristic( characteristicUUID ) )
            .take( 1L ) // make flatMapSingle's output Observable complete after the first emission
                        // (this makes the Single call onSuccess())
    );
}
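To see why the completion signal matters, here is a small model of the same situation in Python with async generators. It is only an illustration under simplifying assumptions: establish_connection, take and single are hypothetical stand-ins for the RxJava calls, and this single collects the whole stream before checking its length rather than failing on the second item.

```python
import asyncio


async def establish_connection():
    # Hypothetical stand-in: emits one "connection" and then never completes.
    yield "connection"
    await asyncio.Event().wait()  # blocks forever


async def take(source, n):
    # Completes after n items instead of waiting for the source to finish.
    # Assumes n >= 1.
    count = 0
    async for item in source:
        yield item
        count += 1
        if count >= n:
            return


async def single(source):
    # Resolves only once the source has completed with exactly one item.
    items = [item async for item in source]
    if len(items) != 1:
        raise ValueError("expected exactly one item")
    return items[0]


async def demo():
    try:
        await asyncio.wait_for(single(establish_connection()), timeout=0.05)
        without_take = "completed"
    except asyncio.TimeoutError:
        without_take = "timed out"
    with_take = await single(take(establish_connection(), 1))
    return without_take, with_take
```

Running asyncio.run(demo()) shows the two cases side by side: without take the single never resolves within the timeout, and with take(…, 1) it resolves as soon as the first item arrives.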

Flatmap my observable to subject

The question is a little tricky.
I am trying to implement the Observable interface. Within it, I need to start listening to another PublishSubject once the observable meets some condition, so I wrote some code like this:
public myAPI() {
    return restAPI.call()
        .flatMap { ret ->
            if (ret == success) return myPublishSubject
        }
}
Does this guarantee that the subscriber starts subscribing to the PublishSubject only after the restAPI call has completed successfully?

The flatMap's Function callback is invoked when there is a value from upstream, in this case, the restAPI.call().
However, note that mapping to a PublishSubject late can result in items being missed. To avoid such problems, you can consider using BehaviorSubject that retains the last item it received so the flatMap can emit immediately upon subscribing to it.
In addition, repeatedly mapping to the same Subject can result in memory leaks and item duplication. Unfortunately, you'd have to complete the Subject in order to release it, but then it becomes unusable for dispatching further events. takeUntil may help in this case though.
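The difference between subscribing late to a PublishSubject and to a BehaviorSubject can be shown with two toy classes. This is a simplified Python model, not the RxJava implementation (no completion, errors or unsubscription); PublishLikeSubject, BehaviorLikeSubject and late_subscriber_sees are hypothetical names.

```python
class PublishLikeSubject:
    """Forwards only the items that arrive after a subscriber has joined."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def on_next(self, item):
        for callback in self._subscribers:
            callback(item)


class BehaviorLikeSubject(PublishLikeSubject):
    """Also replays the most recent item to each new subscriber."""

    _EMPTY = object()

    def __init__(self):
        super().__init__()
        self._latest = self._EMPTY

    def subscribe(self, callback):
        if self._latest is not self._EMPTY:
            callback(self._latest)
        super().subscribe(callback)

    def on_next(self, item):
        self._latest = item
        super().on_next(item)


def late_subscriber_sees(subject):
    subject.on_next("early")  # emitted before the REST call finishes
    received = []
    subject.subscribe(received.append)  # the flatMap subscribes only now
    subject.on_next("late")
    return received
```

With the publish-like subject the late subscriber only sees items emitted after it joined, while the behavior-like subject hands it the most recent earlier item first.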

Nested react (or receive) in Scala

I have an intermediate remote actor (B) that is supposed to forward messages back and forth between A and C (like A <-> B <-> C). In B's code I have something like
loop {
  react {
    case msg =>
      val A = sender
      // 2) Should this be synchronous with !?
      C ! msg
      // 1) What's better, react or receive?
      react {
        case response => A ! response
      }
  }
}
3 Questions:
1) What's better react or receive (to nest within a react)?
2) Given that a response will be sent back, should !? be used instead of !
3) Any other recommendation for this scenario?
Thank you all!

As far as the standard Actor model is concerned, messages must be handled atomically (i.e. you cannot receive and process a message while you are processing another one, and that's exactly what you'd like to do here).
However, Scala Actors have relaxed semantics, which may allow you to do that.
For Question 1, you should be clear about the differences between react and receive. Anyway, you can easily use react (as used here http://www.scala-lang.org/docu/files/actors-api/actors_api_guide.html)
Alternatively, you could avoid nesting. After your actor has sent the request, its state should change so that in the next loop cycle it looks for the reply.
You may also want to upgrade to Scala 2.10, which integrates actors from Akka; that model is clearer and easier to use.