Rate-limiting multiple observables created by multiple threads using RxJava - reactive-programming

I'm developing a simple REST application that uses RxJava to send requests to a remote server (1). For each incoming request to the REST API, a request is sent (using RxJava and RxNetty) to (1). Everything is working fine, but now I have a new use case:
In order not to bombard (1) with too many requests, I need to implement rate limiting. One way to solve this (I assume) would be to add each Observable created when sending a request to (1) into another Observable (2) that does the actual rate limiting. (2) would then act more or less like a queue and process the outbound requests as fast as possible (but not faster than the rate limit). Here's some pseudo-code:
Observable<MyResponse> r1 = createRequestToExternalServer() // In thread 1
Observable<MyResponse> r2 = createRequestToExternalServer() // In thread 2
// Somehow send r1 and r2 to the "rate limiter" observable, (2)
rateLimiterObservable.sample(1 / rate, TimeUnit.MILLISECONDS)
How would I use Rx/RxJava to solve this?

I'd use a hot timer along with an atomic counter that keeps track of the remaining connections for the current interval:
int rate = 5;
long interval = 1000;
AtomicInteger remaining = new AtomicInteger(rate);
ConnectableObservable<Long> timer = Observable
.interval(interval, TimeUnit.MILLISECONDS)
.doOnNext(e -> remaining.set(rate))
.publish();
timer.connect();
Observable<Integer> networkCall = Observable.just(1).delay(150, TimeUnit.MILLISECONDS);
Observable<Integer> limitedNetworkCall = Observable
.defer(() -> {
if (remaining.getAndDecrement() > 0) { // "> 0" so that a counter that has gone negative keeps rejecting
return networkCall;
}
return Observable.error(new RuntimeException("Rate exceeded"));
});
Observable.interval(100, TimeUnit.MILLISECONDS)
.flatMap(t -> limitedNetworkCall.onErrorReturn(e -> -1))
.take(20)
.toBlocking()
.forEach(System.out::println);
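The counter logic behind this answer can be tried out without RxJava. Below is a minimal sketch in plain Java; `FixedWindowLimiter` and its method names are made up for illustration and are not part of RxJava. It checks `> 0`, because a `!= 0` comparison would start admitting calls again once the counter drops below zero.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: a fixed-window permit counter, independent of RxJava.
final class FixedWindowLimiter {
    private final int rate;
    private final AtomicInteger remaining;

    FixedWindowLimiter(int rate) {
        this.rate = rate;
        this.remaining = new AtomicInteger(rate);
    }

    // Returns true while permits are left in the current window.
    // The counter may go negative; "> 0" keeps rejecting in that case.
    boolean tryAcquire() {
        return remaining.getAndDecrement() > 0;
    }

    // Meant to be called by a timer at the start of every window.
    void reset() {
        remaining.set(rate);
    }
}

final class FixedWindowLimiterDemo {
    public static void main(String[] args) {
        FixedWindowLimiter limiter = new FixedWindowLimiter(5);
        for (int i = 1; i <= 8; i++) {
            System.out.println("call " + i + " allowed: " + limiter.tryAcquire());
        }
        limiter.reset(); // the timer tick in the answer plays this role
        System.out.println("after reset allowed: " + limiter.tryAcquire());
    }
}
```

With a rate of 5, the first five calls in a window are admitted and the rest are rejected until the next reset. Note that this is a fixed window, so up to twice the rate can pass in a short burst around a window boundary.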

Related

How can I change the period for Flowable.interval

Is there a way to change the Flowable.interval period at runtime?
LOGGER.info("Start generating bullshit for 7 seconds:");
Flowable.interval(3, TimeUnit.SECONDS)
.map(tick -> random.nextInt(100))
.subscribe(tick -> LOGGER.info("tick = " + tick));
TimeUnit.SECONDS.sleep(7);
LOGGER.info("Change interval to 2 seconds:");
I have a workaround, but the best way would be to create a new operator.
How does this solution work?
You have a trigger source that emits values indicating when to start a new interval and with which period. The source is switchMapped, with an interval as the inner stream. The inner stream takes the value emitted by the upstream source as its new period.
switchMap
When the source emits a period (Long), the switchMap lambda is invoked and the returned Flowable is subscribed to immediately. When a new value arrives at the switchMap, the currently subscribed inner interval Flowable is unsubscribed from and the lambda is invoked again. The newly returned interval Flowable is then subscribed to.
This means that on each emission from the source, a new interval is created.
How does it behave?
When the interval is subscribed to and is about to emit a value, and a new value is emitted from the source, the inner stream (the interval) is unsubscribed from, so that value is no longer emitted. The new interval Flowable is subscribed to and emits values according to its configuration.
Solution
lateinit var scheduler: TestScheduler
@Before
fun init() {
scheduler = TestScheduler()
}
@Test
fun `62232235`() {
val trigger = PublishSubject.create<Long>()
val switchMap = trigger.toFlowable(BackpressureStrategy.LATEST)
// emit a seed value from upstream so that at least one interval emits values even when the upstream source does not provide an initial value.
.startWith(3)
.switchMap {
Flowable.interval(it, TimeUnit.SECONDS, scheduler)
.map { tick: Long? ->
tick
}
}
val test = switchMap.test()
scheduler.advanceTimeBy(10, TimeUnit.SECONDS)
test.assertValues(0, 1, 2)
// send new onNext value at absolute time 10
trigger.onNext(10)
// the inner stream is unsubscribed and a new stream with interval(10) is subscribed to. Therefore the first value will be emitted at 20 (current: 10 + 10 configured)
scheduler.advanceTimeTo(21, TimeUnit.SECONDS)
// if the switch did not happen, there would be 7 values
test.assertValues(0, 1, 2, 0)
}
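To make the timing in this test easier to follow, here is a small virtual-time model in plain Java. It is only a sketch of the behaviour described above, not RxJava code, and `SwitchingIntervalModel` is a made-up name. It assumes that a tick falling exactly on the moment of a switch is dropped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical virtual-time model of "switchMap over interval"; it does not use RxJava.
final class SwitchingIntervalModel {

    // Each trigger is {timeSeconds, periodSeconds}; triggers must be sorted by time.
    // Returns the tick values emitted up to and including endTime.
    static List<Long> ticksUntil(long[][] triggers, long endTime) {
        List<Long> ticks = new ArrayList<>();
        for (int i = 0; i < triggers.length; i++) {
            long start = triggers[i][0];
            long period = triggers[i][1];
            if (period <= 0) {
                throw new IllegalArgumentException("period must be positive");
            }
            // The inner interval is dropped as soon as the next trigger arrives.
            long nextStart = (i + 1 < triggers.length) ? triggers[i + 1][0] : Long.MAX_VALUE;
            long tick = 0; // every new interval counts from 0 again
            for (long t = start + period; t <= endTime && t < nextStart; t += period) {
                ticks.add(tick++);
            }
        }
        return ticks;
    }

    public static void main(String[] args) {
        // Seed period 3 at t=0, then switch to period 10 at t=10, observed until t=21.
        long[][] triggers = { {0, 3}, {10, 10} };
        System.out.println(ticksUntil(triggers, 21));
        // Without the switch, the 3-second interval alone would keep ticking.
        System.out.println(ticksUntil(new long[][] { {0, 3} }, 21).size());
    }
}
```

The first call yields the ticks 0, 1, 2 (at 3, 6 and 9 seconds) followed by a fresh 0 at 20 seconds; the second shows the 7 values that the unswitched interval would have produced by second 21.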

How to match events in two causal streams to detect excessive lagging

I have two hot observables, which are respectively a stream Q of requests to a network server, and a stream R of replies from the server. The replies are always delivered in the order of requests, and every request is going to receive exactly one reply eventually. Thus the first event in R, R1, is the reply to the first event in Q, Q1, and so on. I need to detect when a reply Rn takes longer than a defined timeout and signal this timeout condition.
Q --1---2---------3-------> // Requests Q1, Q2...
R ----1-------------------> // Replies
Out ------------------O-|> // Oops: Reply R2 to Q2 did not arrive within time τ.
|<----τ---->|
Events Qn and Rn do not contain any identifying information (think of plain colorless round marbles), and the indices in the diagram are just sequential numbers introduced for explanation.
I seem unable to solve this riddle. I tried the approach below, but it appears I am matching the latest request Qi to the latest response Rj. In the sample Q contains 5 requests, spaced 500ms apart, and replies in R come 750ms apart, starting at 200ms, but only 4 of them (the 5th is delayed indefinitely). The code does not detect that, since that last reply R4 comes within the set timeout of 1000ms after the latest request Q5 (in 200ms, actually).
var Q = Observable.Interval(TimeSpan.FromMilliseconds(500)).Select(_ => Unit.Default)
.Take(5).Concat(Observable.Never<Unit>());
var R = Observable.Interval(TimeSpan.FromMilliseconds(750)).Select(_ => Unit.Default)
.Delay(TimeSpan.FromMilliseconds(200))
.Take(4).Concat(Observable.Never<Unit>());
var dq = Q.Select(v => Observable.Return(v).Delay(TimeSpan.FromMilliseconds(1000)));
var dr = Observable.Zip(Q, R, (_1,_2) => Observable.Never<Unit>());
Observable.Merge(dq, dr).Dump().Switch().Dump();
I believe that you want to be notified that request 4 has timed out (its reply was due at 3s but arrives at 3.2s), and also request 5, as its reply never arrives:
void Main()
{
var scheduler = new TestScheduler();
var requests = scheduler.CreateHotObservable<int>(
ReactiveTest.OnNext(0500.Ms(), 1),
ReactiveTest.OnNext(1000.Ms(), 2),
ReactiveTest.OnNext(1500.Ms(), 3),
ReactiveTest.OnNext(2000.Ms(), 4),
ReactiveTest.OnNext(2500.Ms(), 5));
var responses = scheduler.CreateHotObservable<Unit>(
ReactiveTest.OnNext(0950.Ms(), Unit.Default),
ReactiveTest.OnNext(1700.Ms(), Unit.Default),
ReactiveTest.OnNext(2450.Ms(), Unit.Default),
ReactiveTest.OnNext(3200.Ms(), Unit.Default));
var expected = scheduler.CreateHotObservable<int>(
ReactiveTest.OnNext(3000.Ms(), 4),
ReactiveTest.OnNext(3500.Ms(), 5)
);
var observer = scheduler.CreateObserver<int>();
var query = responses
.Select((val, idx)=>idx)
.Publish(responseIdxs =>
{
return requests.SelectMany((q, qIdx) =>
Observable.Timer(TimeSpan.FromSeconds(1), scheduler)
.TakeUntil(responseIdxs.Where(rIdx => qIdx == rIdx))
.Select(_ => q));
});
query.Subscribe(observer);
scheduler.Start();
//This test passes
ReactiveAssert.AreElementsEqual(
expected.Messages,
observer.Messages);
}
// Define other methods and classes here
public static class TickExtensions
{
public static long Ms(this int ms)
{
return TimeSpan.FromMilliseconds(ms).Ticks;
}
}
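The pairing rule used by the query (the n-th reply belongs to the n-th request) can also be checked offline on recorded timestamps. This is a rough Java sketch under that assumption; `ReplyTimeoutModel` is a made-up name, and treating a reply that lands exactly on the deadline as late is an arbitrary choice of mine.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical offline check: pairs the n-th request with the n-th reply by position.
final class ReplyTimeoutModel {

    // requestTimes[i] is matched with replyTimes[i]; requests beyond the end of
    // replyTimes never received a reply. Returns 1-based numbers of late requests.
    static List<Integer> timedOut(long[] requestTimes, long[] replyTimes, long timeoutMs) {
        List<Integer> late = new ArrayList<>();
        for (int i = 0; i < requestTimes.length; i++) {
            long deadline = requestTimes[i] + timeoutMs;
            boolean answeredInTime = i < replyTimes.length && replyTimes[i] < deadline;
            if (!answeredInTime) {
                late.add(i + 1);
            }
        }
        return late;
    }

    public static void main(String[] args) {
        long[] requests = {500, 1000, 1500, 2000, 2500};
        long[] replies = {950, 1700, 2450, 3200};
        System.out.println(timedOut(requests, replies, 1000));
    }
}
```

With the timestamps from the test above and a 1000 ms timeout, requests 4 and 5 are reported, which matches the `expected` sequence in the answer.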

Aggregate resource requests & dispatch responses to each subscriber

I'm fairly new to RxJava and struggling with a use case that seems quite common to me:
Gather multiple requests from different parts of the application, aggregate them, make a single resource call and dispatch the results to each subscriber.
I've tried a lot of different approaches, using subjects, connectable observables, deferred observables... none did the trick so far.
I was quite optimistic about this approach but turns out it fails just like the others :
//(...)
static HashMap<String, String> requests = new HashMap<>();
//(...)
@Test
public void myTest() throws InterruptedException {
TestScheduler scheduler = new TestScheduler();
Observable<String> interval = Observable.interval(10, TimeUnit.MILLISECONDS, scheduler)
.doOnSubscribe(() -> System.out.println("new subscriber!"))
.doOnUnsubscribe(() -> System.out.println("unsubscribed"))
.filter(l -> !requests.isEmpty())
.doOnNext(aLong -> System.out.println(requests.size() + " requests to send"))
.flatMap(aLong -> {
System.out.println("requests " + requests);
return Observable.from(requests.keySet()).take(10).distinct().toList();
})
.doOnNext(strings -> System.out.println("calling aggregate for " + strings + " (from " + requests + ")"))
.flatMap(Observable::from)
.doOnNext(s -> {
System.out.println("----");
System.out.println("removing " + s);
requests.remove(s);
})
.doOnNext(s -> System.out.println("remaining " + requests));
TestSubscriber<String> ts1 = new TestSubscriber<>();
TestSubscriber<String> ts2 = new TestSubscriber<>();
TestSubscriber<String> ts3 = new TestSubscriber<>();
TestSubscriber<String> ts4 = new TestSubscriber<>();
Observable<String> defer = buildObservable(interval, "1");
defer.subscribe(ts1);
Observable<String> defer2 = buildObservable(interval, "2");
defer2.subscribe(ts2);
Observable<String> defer3 = buildObservable(interval, "3");
defer3.subscribe(ts3);
scheduler.advanceTimeBy(200, TimeUnit.MILLISECONDS);
Observable<String> defer4 = buildObservable(interval, "4");
defer4.subscribe(ts4);
scheduler.advanceTimeBy(100, TimeUnit.MILLISECONDS);
ts1.awaitTerminalEvent(1, TimeUnit.SECONDS);
ts2.awaitTerminalEvent(1, TimeUnit.SECONDS);
ts3.awaitTerminalEvent(1, TimeUnit.SECONDS);
ts4.awaitTerminalEvent(1, TimeUnit.SECONDS);
ts1.assertValue("1");
ts2.assertValue("2"); //fails (test stops here)
ts3.assertValue("3"); //fails
ts4.assertValue("4"); //fails
}
public Observable<String> buildObservable(Observable<String> interval, String key) {
return Observable.defer(() -> {
System.out.printf("creating observable for key " + key);
return Observable.create(subscriber -> {
requests.put(key, "xxx");
interval.doOnNext(s -> System.out.println("filtering : key/val " + key + "/" + s))
.filter(s1 -> s1.equals(key))
.doOnError(subscriber::onError)
.subscribe(s -> {
System.out.println("intern " + s);
subscriber.onNext(s);
subscriber.onCompleted();
subscriber.unsubscribe();
});
});
}
)
;
}
Output :
creating observable for key 1new subscriber!
creating observable for key 2new subscriber!
creating observable for key 3new subscriber!
3 requests to send
requests {3=xxx, 2=xxx, 1=xxx}
calling aggregate for [3, 2, 1] (from {3=xxx, 2=xxx, 1=xxx})
----
removing 3
remaining {2=xxx, 1=xxx}
filtering : key/val 1/3
----
removing 2
remaining {1=xxx}
filtering : key/val 1/2
----
removing 1
remaining {}
filtering : key/val 1/1
intern 1
creating observable for key 4new subscriber!
1 requests to send
requests {4=xxx}
calling aggregate for [4] (from {4=xxx})
----
removing 4
remaining {}
filtering : key/val 1/4
The test fails at the second assertion (ts2 not receiving "2")
Turns out the pseudo-aggregation works as expected, but the values are not dispatched to the corresponding subscribers (only the first subscriber receives it)
Any idea why?
Also, I feel like I'm missing the obvious here. If you think of a better approach, I'm more than willing to hear about it.
EDIT : Adding some context regarding what I want to achieve.
I have a REST API exposing data via multiple endpoints (eg. user/{userid}). This API also makes it possible to aggregate requests (eg. user/user1 & user/user2) and get the corresponding data in one single http request instead of two.
My goal is to be able to automatically aggregate the requests made from different parts of my application in a given time frame (say 10ms) with a max batch size (say 10), make an aggregate http request, then dispatch the results to the corresponding subscribers.
Something like this :
// NOTE: those calls can be fired from anywhere in the app, and randomly combined. The timing and order is completely unpredictable
//ts : 0ms
api.call(userProfileRequest1).subscribe(this::show);
api.call(userProfileRequest2).subscribe(this::show);
//--> after 10ms, should fire one single http aggregate request with those 2 calls, map the response items & send them to the corresponding subscribers (that will show the right user profile)
//ts : 20ms
api.call(userProfileRequest3).subscribe(this::show);
api.call(userProfileRequest4).subscribe(this::show);
api.call(userProfileRequest5).subscribe(this::show);
api.call(userProfileRequest6).subscribe(this::show);
api.call(userProfileRequest7).subscribe(this::show);
api.call(userProfileRequest8).subscribe(this::show);
api.call(userProfileRequest9).subscribe(this::show);
api.call(userProfileRequest10).subscribe(this::show);
api.call(userProfileRequest11).subscribe(this::show);
api.call(userProfileRequest12).subscribe(this::show);
//--> should fire a single http aggregate request RIGHT AWAY (we hit the max batch size) with the 10 items, map the response items & send them to the corresponding subscribers (that will show the right user profile)
The test code I wrote (with just strings) and pasted at the top of this question is meant to be a proof of concept for my final implementation.
Your Observable is not well constructed
public Observable<String> buildObservable(Observable<String> interval, String key) {
return interval.doOnSubscribe(() -> System.out.printf("creating observable for key " + key))
.doOnSubscribe(() -> requests.put(key, "xxx"))
.doOnNext(s -> System.out.println("filtering : key/val " + key + "/" + s))
.filter(s1 -> s1.equals(key));
}
Subscribing inside a subscriber is often a sign of bad design.
I'm not sure I understand what you want to achieve, but I think my code should be pretty close to yours.
Please note that for all side effects I use the do* methods (like doOnNext, doOnSubscribe) to make it explicit that I want to perform a side effect.
I replaced your defer call by returning the interval directly: since you want to emit all interval events from the custom observable built in your defer call, returning the interval observable is simpler.
Please note that you are filtering your interval Observable:
Observable<String> interval = Observable.interval(10, TimeUnit.MILLISECONDS, scheduler)
.filter(l -> !requests.isEmpty()).
// ...
So the interval only emits downstream while the requests map is non-empty; as soon as the map has been emptied, it stops emitting.
I don't understand what you want to achieve with the requests map, but please note that you may want to avoid side effects, and updating this map is clearly a side effect.
Update regarding comments
You may want to use the buffer operator to aggregate requests, and then perform the requests in bulk:
PublishSubject<String> subject = PublishSubject.create();
TestScheduler scheduler = new TestScheduler();
Observable<Pair> broker = subject.buffer(100, TimeUnit.MILLISECONDS, 10, scheduler)
.flatMapIterable(list -> list) // you can bulk calls here
.flatMap(id -> Observable.fromCallable(() -> api.call(id)).map(response -> Pair.of(id, response)));
TestSubscriber<Object> ts1 = new TestSubscriber<>();
TestSubscriber<Object> ts2 = new TestSubscriber<>();
TestSubscriber<Object> ts3 = new TestSubscriber<>();
TestSubscriber<Object> ts4 = new TestSubscriber<>();
broker.filter(pair -> pair.id.equals("1")).take(1).map(pair -> pair.response).subscribe(ts1);
broker.filter(pair -> pair.id.equals("2")).take(1).map(pair -> pair.response).subscribe(ts2);
broker.filter(pair -> pair.id.equals("3")).take(1).map(pair -> pair.response).subscribe(ts3);
broker.filter(pair -> pair.id.equals("4")).take(1).map(pair -> pair.response).subscribe(ts4);
subject.onNext("1");
subject.onNext("2");
subject.onNext("3");
scheduler.advanceTimeBy(1, TimeUnit.SECONDS);
ts1.assertValue("resp1");
ts2.assertValue("resp2");
ts3.assertValue("resp3");
ts4.assertNotCompleted();
subject.onNext("4");
scheduler.advanceTimeBy(1, TimeUnit.SECONDS);
ts4.assertValue("resp4");
ts4.assertCompleted();
If you want to perform network request collapsing, you may want to check Hystrix: https://github.com/Netflix/Hystrix
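For the "collect, make one bulk call, hand each caller its own result" part, here is a rough sketch in plain Java using `CompletableFuture` instead of RxJava. Everything in it (`RequestBatcher`, `call`, `flush`, the bulk function) is hypothetical and only illustrates the idea; in a real application a timer would call `flush()` every 10 ms or so.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical request collapser; the bulk call is passed in as a function.
final class RequestBatcher {
    private final int maxBatchSize;
    private final Function<List<String>, Map<String, String>> bulkCall;
    private final Map<String, CompletableFuture<String>> pending = new LinkedHashMap<>();
    private int bulkCalls = 0;

    RequestBatcher(int maxBatchSize, Function<List<String>, Map<String, String>> bulkCall) {
        this.maxBatchSize = maxBatchSize;
        this.bulkCall = bulkCall;
    }

    // Registers a request; callers asking for the same key share one future.
    // Flushes right away once the batch has reached its maximum size.
    synchronized CompletableFuture<String> call(String key) {
        CompletableFuture<String> result =
                pending.computeIfAbsent(key, k -> new CompletableFuture<>());
        if (pending.size() >= maxBatchSize) {
            flush();
        }
        return result;
    }

    // Sends everything collected so far in one bulk call and completes each
    // caller's future with its own response (callbacks run on this thread).
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        Map<String, CompletableFuture<String>> batch = new LinkedHashMap<>(pending);
        pending.clear();
        bulkCalls++;
        Map<String, String> responses = bulkCall.apply(new ArrayList<>(batch.keySet()));
        batch.forEach((key, future) -> future.complete(responses.get(key)));
    }

    synchronized int bulkCalls() {
        return bulkCalls;
    }

    public static void main(String[] args) {
        RequestBatcher api = new RequestBatcher(10, keys -> {
            Map<String, String> responses = new HashMap<>();
            for (String key : keys) {
                responses.put(key, "resp" + key); // stand-in for the aggregate HTTP request
            }
            return responses;
        });
        api.call("1").thenAccept(r -> System.out.println("subscriber 1 got " + r));
        api.call("2").thenAccept(r -> System.out.println("subscriber 2 got " + r));
        api.flush(); // a periodic timer would do this in a real application
    }
}
```

The point of the design is that each caller holds its own future, so there is no need to filter a shared stream by key afterwards. Error handling and completing futures outside the lock are left out to keep the sketch short.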

Throttling messages from RabbitMQ using RxJava

I'm using RxJava to pull out values from RabbitMQ. Here's the code:
val amqp = new RabbitQueue("queueName")
val obs = Observable[String](subscr => while (true) subscr onNext amqp.next)
obs subscribe (
s => println(s"String from rabbitmq: $s"),
error => amqp.connection.close
)
It works fine but now I have a requirement that a value should be pulled at most once per second while all the values should be preserved (so debounce won't do since it drops intermediary values).
It should be like amqp.next blocks thread so we're waiting... (RabbitMQ got two messages in queue) pulled a 1st message... wait 1 second... pulled a 2nd message... wait indefinitely for the next message...
How can I achieve this using rx methods?
Alternatively, you could create an observable from a timer like this. I personally find this more elegant.
RabbitQueue amqp = new RabbitQueue("queueName");
Observable.timer(0, 1, TimeUnit.SECONDS)
.map(tick -> amqp.next())
.subscribe(...)
One option may be to use the Schedulers API in combination with a PublishSubject as the observable.
Unfortunately, I don't know Scala syntax but here is the Java version you should be able to convert:
RabbitQueue amqp = new RabbitQueue("queueName");
Scheduler.Worker worker = Schedulers.newThread().createWorker();
PublishSubject<String> obs = PublishSubject.create();
worker.schedulePeriodically(new Action0() {
@Override
public void call() {
obs.onNext(amqp.next());
}
}, 1, 1, TimeUnit.SECONDS);
Your subscribe code from above would remain the same:
obs subscribe (
s => println(s"String from rabbitmq: $s"),
error => amqp.connection.close
)
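It may help to pin down the timing the question asks for before picking an operator: every message is kept, and each pull happens no earlier than one second after the previous pull. The following plain Java sketch just computes those pull times from arrival times; `ThrottledPullModel` is a made-up name and nothing here talks to RabbitMQ.

```java
import java.util.Arrays;

// Hypothetical timing model: no message is dropped, pulls are at least minGapMs apart.
final class ThrottledPullModel {

    // arrivalMs must be sorted ascending; returns the time at which each message is pulled.
    static long[] pullTimes(long[] arrivalMs, long minGapMs) {
        long[] pulls = new long[arrivalMs.length];
        for (int i = 0; i < arrivalMs.length; i++) {
            // A message cannot be pulled before it arrives, nor sooner than
            // minGapMs after the previous pull.
            long earliest = (i == 0) ? arrivalMs[0] : pulls[i - 1] + minGapMs;
            pulls[i] = Math.max(arrivalMs[i], earliest);
        }
        return pulls;
    }

    public static void main(String[] args) {
        // Two messages already queued at t=0, a third arriving at t=5000 ms.
        long[] arrivals = {0, 0, 5000};
        System.out.println(Arrays.toString(pullTimes(arrivals, 1000)));
    }
}
```

For two queued messages and a late third one, the pulls happen at 0 ms, 1000 ms and 5000 ms, which is the "pull, wait a second, pull, then wait for the next message" behaviour described in the question.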

rx reactive extension: how to have each subscriber get a different value (the next one) from an observable?

Using reactive extension, it is easy to subscribe 2 times to the same observable.
When a new value is available in the observable, both subscribers are called with this same value.
Is there a way to have each subscriber get a different value (the next one) from this observable ?
Example of what I'm after:
source sequence: [1,2,3,4,5,...] (infinite)
The source is constantly adding new items at an unknown rate.
I'm trying to execute a lengthy async action for each item using N subscribers.
1st subscriber: 1,2,4,...
2nd subscriber: 3,5,...
...
or
1st subscriber: 1,3,...
2nd subscriber: 2,4,5,...
...
or
1st subscriber: 1,3,5,...
2nd subscriber: 2,4,6,...
I would agree with Asti.
You could use Rx to populate a Queue (Blocking Collection) and then have competing consumers read from the queue. This way if one process was for some reason faster it could pick up the next item potentially before the other consumer if it was still busy.
However, if you want to do it anyway, against good advice :), then you could just use the Select overload that provides the index of each element. You can then pass that down to your subscribers and they can filter on a modulus. (Yuck! Leaky abstractions, magic numbers, potentially blocking, potential side effects to the source sequence, etc.)
var source = Observable.Interval(TimeSpan.FromSeconds(1))
.Select((element, index) => new { Index = index, Element = element });
var subscription1 = source.Where(x=>x.Index%2==0).Subscribe(x=>DoWithThing1(x.Element));
var subscription2 = source.Where(x=>x.Index%2==1).Subscribe(x=>DoWithThing2(x.Element));
Also remember that the work done on the OnNext handler if it is blocking will still block the scheduler that it is on. This could affect the speed of your source/producer. Another reason why Asti's answer is a better option.
Ask if that is not clear :-)
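For comparison, the queue-with-competing-consumers idea looks roughly like this in plain Java. This is only a sketch with made-up names (`CompetingConsumers`, `drain`); it drains a pre-filled queue and stops when the queue is empty, whereas a long-running consumer would block on `take()` instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical demo of competing consumers sharing one queue.
final class CompetingConsumers {

    // Drains the queue with the given number of workers and returns,
    // per worker, the items that worker ended up handling.
    static List<List<Integer>> drain(BlockingQueue<Integer> queue, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<List<Integer>>> futures = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                futures.add(pool.submit(() -> {
                    List<Integer> mine = new ArrayList<>();
                    Integer item;
                    // poll() hands each item to exactly one worker; whoever is free takes the next.
                    while ((item = queue.poll()) != null) {
                        mine.add(item);
                    }
                    return mine;
                }));
            }
            List<List<Integer>> handled = new ArrayList<>();
            for (Future<List<Integer>> future : futures) {
                handled.add(future.get());
            }
            return handled;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        for (int i = 1; i <= 10; i++) {
            queue.add(i);
        }
        List<List<Integer>> handled = drain(queue, 2);
        System.out.println("worker 1: " + handled.get(0));
        System.out.println("worker 2: " + handled.get(1));
    }
}
```

How the items are split between the workers varies from run to run, which is exactly the point: no item is handled twice, and a faster worker simply picks up more of them.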
How about:
IObservable<TRet> SomeLengthyOperation(T input)
{
return Observable.Defer(() => Observable.Start(() => {
return someCalculatedValueThatTookALongTime;
}, Scheduler.TaskPoolScheduler));
}
someObservableSource
.SelectMany(x => SomeLengthyOperation(x))
.Subscribe(x => Console.WriteLine("The result was {0}", x));
You can even limit the number of concurrent operations:
someObservableSource
.Select(x => SomeLengthyOperation(x))
.Merge(4 /* at a time */)
.Subscribe(x => Console.WriteLine("The result was {0}", x));
For Merge(4) to work, it's important that the Observable returned by SomeLengthyOperation be a cold Observable, which is what Defer achieves here: it keeps Observable.Start from running until someone subscribes.
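The same "at most 4 at a time" cap can be illustrated outside Rx with a fixed-size thread pool. The sketch below is plain Java with made-up names (`BoundedConcurrency`, `peakConcurrency`); it is an analogy to Merge(4), not a translation of it. Work only starts when a pool thread picks it up, which plays the role that deferring the start plays above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: run many lengthy operations, but never more than `limit` at once.
final class BoundedConcurrency {

    // Runs `tasks` simulated operations and returns the highest number
    // that were observed running at the same time.
    static int peakConcurrency(int tasks, int limit) {
        ExecutorService pool = Executors.newFixedThreadPool(limit);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(() -> {
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(5); // stand-in for the lengthy operation
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    running.decrementAndGet();
                }));
            }
            for (Future<?> future : futures) {
                future.get(); // wait for every operation to finish
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency: " + peakConcurrency(20, 4));
    }
}
```

The reported peak never exceeds the limit, however many operations are queued; the remaining ones wait until a slot frees up.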