What is the best way to "rate limit" consuming of an Observable? - system.reactive

I have a bunch of events coming in and I have to execute ALL of them without a loss, but I want to make sure that they are buffered and consumed at the appropriate time slots. Anyone have a solution?
I can't find any operators in Rx that can do that without the loss of the events (Throttle - looses events). I've also considered Buffered, Delay, etc... Can't find a good solution.
I've tried to put a timer in the middle, but somehow it doesn't work at all:
GetInitSequence()
.IntervalThrottle(TimeSpan.FromSeconds(5))
.Subscribe(
item =>
{
Console.WriteLine(DateTime.Now);
// Process item
}
);
public static IObservable<T> IntervalThrottle<T>(this IObservable<T> source, TimeSpan dueTime)
{
return Observable.Create<T>(o =>
{
return source.Subscribe(x =>
{
new Timer(state =>
o.OnNext((T)state), x, dueTime, TimeSpan.FromMilliseconds(-1));
}, o.OnError, o.OnCompleted);
});
}

The question is not 100% clear so I'm making some presumptions.
Observable.Delay is not what you want because that will create a delay from when each event arrives, rather than creating even time intervals for processing.
Observable.Buffer is not what you want because that will cause all events in each given interval to be passed to you, rather than one at a time.
So I believe you're looking for a solution that creates some sort of metronome that ticks away, and gives you an event per tick. This can be naively constructed using Observable.Interval for the metronome and Zip for connecting it to your source:
var source = GetInitSequence();
var trigger = Observable.Interval(TimeSpan.FromSeconds(5));
var triggeredSource = source.Zip(trigger, (s,_) => s);
triggeredSource.Subscribe(item => Console.WriteLine(DateTime.Now));
This will trigger every 5 seconds (in the example above), and give you the original items in sequence.
The only problem with this solution is that if you don't have any more source elements for (say) 10 seconds, when the source elements arrive they will be immediately sent out since some of the 'trigger' events are sitting there waiting for them. Marble diagram for that scenario:
source: -a-b-c----------------------d-e-f-g
trigger: ----o----o----o----o----o----o----o
result: ----a----b----c-------------d-e-f-g
This is a very reasonable issue. There are two questions here already that tackle it:
Rx IObservable buffering to smooth out bursts of events
A way to push buffered events in even intervals
The solution provided is a main Drain extension method and secondary Buffered extension. I've modified these to be far simpler (no need for Drain, just use Concat). Usage is:
var bufferedSource = source.StepInterval(TimeSpan.FromSeconds(5));
The extension method StepInterval:
public static IObservable<T> StepInterval<T>(this IObservable<T> source, TimeSpan minDelay)
{
return source.Select(x =>
Observable.Empty<T>()
.Delay(minDelay)
.StartWith(x)
).Concat();
}

I know this could just be too simple, but would this work?
var intervaled = source.Do(x => { Thread.Sleep(100); });
Basically this just puts a minimum delay between values. Too simplistic?

Along the lines of Enigmativity's answer, if all you want to do is just Delay all of the values by a TimeSpan, I cant see why Delay is not the operator you want
GetInitSequence()
.Delay(TimeSpan.FromSeconds(5)) //ideally pass an IScheduler here
.Subscribe(
item =>
{
Console.WriteLine(DateTime.Now);
// Process item
}
);

How about Observable.Buffer? This should return all the events in the 1s window as a single event.
var xs = Observable.Interval(TimeSpan.FromMilliseconds(100));
var bufferdStream = xs.Buffer(TimeSpan.FromSeconds(5));
bufferdStream.Subscribe(item => { Console.WriteLine("Number of events in window: {0}", item.Count); });
It might be what you're asking isnt that clear. What is your code supposed to do? It looks like you're just delaying by creating a timer for each event. It also breaks the semantics of the observable as the next and complete could occur before the next.
Note this is also only as accurate at the timer used. Typically the timers are accurate to at most 16ms.
Edit:
your example becomes, and item contains all the events in the window:
GetInitSequence()
.Buffer(TimeSpan.FromSeconds(5))
.Subscribe(
item =>
{
Console.WriteLine(DateTime.Now);
// Process item
}
);

Related

process Swift DispatchQueue without affecting resource

I have a Swift DispatchQueue that receives data at 60fps.
However, depending on phones or amount of data received, the computation of those data becomes expensive to process at 60fps. In actuality, it is okay to process only half of them or as much as the computation resource allows.
let queue = DispatchQueue(label: "com.test.dataprocessing")
func processData(data: SomeData) {
queue.async {
// data processing
}
}
Does DispatchQueue somehow allow me to drop some data if a resource is limited? Currently, it is affecting the main UI of SceneKit. Or, is there something better than DispatchQueue for this type of task?
There are a couple of possible approaches:
The simple solution is to keep track of your own Bool as to whether your task is in progress or not, and when you have more data, only process it if there's not one already running:
private var inProgress = false
private var syncQueue = DispatchQueue(label: Bundle.main.bundleIdentifier! + ".sync.progress") // for reasons beyond the scope of this question, reader-writer with concurrent queue is not appropriate here
func processData(data: SomeData) {
let isAlreadyRunning = syncQueue.sync { () -> Bool in
if self.inProgress { return true }
self.inProgress = true
return false
}
if isAlreadyRunning { return }
processQueue.async {
defer {
self.syncQueue.async { self.inProgress = false }
}
// process `data`
}
}
All of that syncQueue stuff is to make sure that I have thread-safe access to the inProgress property. But don't get lost in those details; use whatever synchronization mechanism you want (e.g. a lock or whatever). All we want to make sure is that we have thread-safe access to the Bool status flag.
Focus on the basic idea, that we'll keep track of a Bool flag to know whether the processing queue is still tied up processing the prior set of SomeData. If it is busy, return immediately and don't process this new data. Otherwise, go ahead and process it.
While the above approach is conceptually simple, it won't offer great performance. For example, if your processing of data always takes 0.02 seconds (50 times per second) and your input data is coming in at a rate of 60 times per second, you'll end up getting 30 of them processed per second.
A more sophisticated approach is to use a GCD user data source, something that says "run the following closure when the destination queue is free". And the beauty of these dispatch user data sources is that it will coalesce them together. These data sources are useful for decoupling the speed of inputs from the processing of them.
So, you first create a data source that simply indicates what should be done when data comes in:
private var dataToProcess: SomeData?
private lazy var source = DispatchSource.makeUserDataAddSource(queue: processQueue)
func configure() {
source.setEventHandler() { [unowned self] in
guard let data = self.syncQueue.sync(execute: { self.dataToProcess }) else { return }
// process `data`
}
source.resume()
}
So, when there's data to process, we update our synchronized dataToProcess property and then tell the data source that there is something to process:
func processData(data: SomeData) {
syncQueue.async { self.dataToProcess = data }
source.add(data: 1)
}
Again, just like the previous example, we're using syncQueue to synchronize our access to some property across multiple threads. But this time we're synchronizing dataToProcess rather than the inProgress state variable we used in the first example. But the idea is the same, that we must be careful to synchronize our interation with a property across multiple threads.
Anyway, using this pattern with the above scenario (input coming in at 60 fps, whereas processing can only process 50 per second), the resulting performance much closer to the theoretical max of 50 fps (I got between 42 and 48 fps depending upon the queue priority), rather than 30 fps.
The latter process can conceivably lead to more frames (or whatever you're processing) to be processed per second and results in less idle time on the processing queue. The following image attempts to graphically illustrate how the two alternatives compare. In the former approach, you'll lose every other frame of data, whereas the latter approach will only lose a frame of data when two separate sets of input data came in prior to the processing queue becoming free and they were coalesced into a single call to the dispatch source.

RxJS combineLatest: how to get emit after just one value changes?

I'm trying to learn the RxJS library. One of the cases I don't quite understand is described in this jsfiddle (code also below).
var A= new Rx.Subject();
var B= new Rx.Subject();
A.onNext(0);
// '.combineLatest' needs all the dependency Observables to get emitted, before its combined signal is emitted.
//
// How to have a combined signal emitted when any of the dependencies change (using earlier given values for the rest)?
//
A.combineLatest( B, function (a,b) { return a+b; } )
.subscribe( function (v) { console.log( "AB: "+ v ); } );
B.onNext("a");
A.onNext(1);
I'd like to get two emits to the "AB" logging. One from changing B to "a" (A already has the value 0). Another from changing A to 1.
However, only changes that occur after a subscribe seem to matter (even though A has a value and thus the combined result could be computed).
Should I use "hot observables" for this, or some other method than .combineLatest?
My problem in the actual code (bigger than this sample) is that I need to make separate initialisations after the subscribes, which cuts stuff in two separate places instead of having the initial values clearly up front.
Thanks
I think you have misunderstood how the Subjects work. Subjects are hot Observables. They do not hold on to values, so if they receive an onNext with no subscribers than that value will be lost to the world.
What you are looking for is a either the BehaviorSubject or the ReplaySubject both of which hold onto past values that re-emit them to new subscribers. In the former case you always construct it with an initial value
//All subscribers will receive 0
var subject = new Rx.BehaviorSubject(0);
//All subscribers will receive 1
//Including all future subscribers
subject.onNext(1);
in the latter you set the number of values to be replayed for each subscription
var subject = new Rx.ReplaySubject(1);
//All new subscribers will receive 0 until the subject receives its
//next onNext call
subject.onNext(0);
Rewriting your example it could be:
var A= new Rx.BehaviorSubject(0);
var B= new Rx.Subject();
// '.combineLatest' needs all the dependency Observables to get emitted, before its combined signal is emitted.
//
// How to have a combined signal emitted when any of the dependencies change (using earlier given values for the rest)?
//
A.combineLatest( B, function (a,b) { return a+b; } )
.subscribe( function (v) { console.log( "AB: "+ v ); } );
B.onNext("a");
A.onNext(1);
//AB: 0a
//AB: 1a
On another note, realizing of course that this is all new to you, in most cases you should not need to use a Subject directly as it generally means that you are trying to wrangle Rx into the safety of your known paradigms. You should ask yourself, where is your data coming from? How is it being created? If you ask those questions enough, following your chain of events back up to the source, 9 out of 10 times you will find that there is probably an Observable wrapper for it.

RXJS : Idiomatic way to create an observable stream from a paged interface

I have paged interface. Given a starting point a request will produce a list of results and a continuation indicator.
I've created an observable that is built by constructing and flat mapping an observable that reads the page. The result of this observable contains both the data for the page and a value to continue with. I pluck the data and flat map it to the subscriber. Producing a stream of values.
To handle the paging I've created a subject for the next page values. It's seeded with an initial value then each time I receive a response with a valid next page I push to the pages subject and trigger another read until such time as there is no more to read.
Is there a more idiomatic way of doing this?
function records(start = 'LATEST', limit = 1000) {
let pages = new rx.Subject();
this.connect(start)
.subscribe(page => pages.onNext(page));
let records = pages
.flatMap(page => {
return this.read(page, limit)
.doOnNext(result => {
let next = result.next;
if (next === undefined) {
pages.onCompleted();
} else {
pages.onNext(next);
}
});
})
.pluck('data')
.flatMap(data => data);
return records;
}
That's a reasonable way to do it. It has a couple of potential flaws in it (that may or may not impact you depending upon your use case):
You provide no way to observe any errors that occur in this.connect(start)
Your observable is effectively hot. If the caller does not immediately subscribe to the observable (perhaps they store it and subscribe later), then they'll miss the completion of this.connect(start) and the observable will appear to never produce anything.
You provide no way to unsubscribe from the initial connect call if the caller changes its mind and unsubscribes early. Not a real big deal, but usually when one constructs an observable, one should try to chain the disposables together so it call cleans up properly if the caller unsubscribes.
Here's a modified version:
It passes errors from this.connect to the observer.
It uses Observable.create to create a cold observable that only starts is business when the caller actually subscribes so there is no chance of missing the initial page value and stalling the stream.
It combines the this.connect subscription disposable with the overall subscription disposable
Code:
function records(start = 'LATEST', limit = 1000) {
return Rx.Observable.create(observer => {
let pages = new Rx.Subject();
let connectSub = new Rx.SingleAssignmentDisposable();
let resultsSub = new Rx.SingleAssignmentDisposable();
let sub = new Rx.CompositeDisposable(connectSub, resultsSub);
// Make sure we subscribe to pages before we issue this.connect()
// just in case this.connect() finishes synchronously (possible if it caches values or something?)
let results = pages
.flatMap(page => this.read(page, limit))
.doOnNext(r => this.next !== undefined ? pages.onNext(this.next) : pages.onCompleted())
.flatMap(r => r.data);
resultsSub.setDisposable(results.subscribe(observer));
// now query the first page
connectSub.setDisposable(this.connect(start)
.subscribe(p => pages.onNext(p), e => observer.onError(e)));
return sub;
});
}
Note: I've not used the ES6 syntax before, so hopefully I didn't mess anything up here.

rx reactive extension: how to have each subscriber get a different value (the next one) from an observable?

Using reactive extension, it is easy to subscribe 2 times to the same observable.
When a new value is available in the observable, both subscribers are called with this same value.
Is there a way to have each subscriber get a different value (the next one) from this observable ?
Ex of what i'm after:
source sequence: [1,2,3,4,5,...] (infinite)
The source is constantly adding new items at an unknown rate.
I'm trying to execute a lenghty async action for each item using N subscribers.
1st subscriber: 1,2,4,...
2nd subscriber: 3,5,...
...
or
1st subscriber: 1,3,...
2nd subscriber: 2,4,5,...
...
or
1st subscriber: 1,3,5,...
2nd subscriber: 2,4,6,...
I would agree with Asti.
You could use Rx to populate a Queue (Blocking Collection) and then have competing consumers read from the queue. This way if one process was for some reason faster it could pick up the next item potentially before the other consumer if it was still busy.
However, if you want to do it, against good advice :), then you could just use the Select operator that will provide you with the index of each element. You can then pass that down to your subscribers and they can fiter on a modulus. (Yuck! Leaky abstractions, magic numbers, potentially blocking, potentiall side effects to the source sequence etc)
var source = Obserservable.Interval(1.Seconds())
.Select((i,element)=>{new Index=i, Element=element});
var subscription1 = source.Where(x=>x.Index%2==0).Subscribe(x=>DoWithThing1(x.Element));
var subscription2 = source.Where(x=>x.Index%2==1).Subscribe(x=>DoWithThing2(x.Element));
Also remember that the work done on the OnNext handler if it is blocking will still block the scheduler that it is on. This could affect the speed of your source/producer. Another reason why Asti's answer is a better option.
Ask if that is not clear :-)
How about:
IObservable<TRet> SomeLengthyOperation(T input)
{
return Observable.Defer(() => Observable.Start(() => {
return someCalculatedValueThatTookALongTime;
}, Scheduler.TaskPoolScheduler));
}
someObservableSource
.SelectMany(x => SomeLengthyOperation(input))
.Subscribe(x => Console.WriteLine("The result was {0}", x);
You can even limit the number of concurrent operations:
someObservableSource
.Select(x => SomeLengthyOperation(input))
.Merge(4 /* at a time */)
.Subscribe(x => Console.WriteLine("The result was {0}", x);
It's important for the Merge(4) to work, that the Observable returned by SomeLengthyOperation be a Cold Observable, which is what the Defer does here - it makes the Observable.Start not happen until someone Subscribes.

Rx Breaking Changes

I know that SL5 has a new property to count MouseClicks, but with help, I got this working when SL4 came out. Now I moved to a new machine, downloaded RX, and I understand the RX went through a few changes that has broken this code. I have tried, but I can't seem to navigate the transition away from FastSubject.
I would really like to fully understand the use of Subject here, and how to update the call to work with the current version of Rx.
public static IObservable<TSource> MonitorForDoubleClicks<TSource>(this IObservable<TSource> source, TimeSpan doubleClickSpeed, IScheduler scheduler)
{
return source.Multicast<TSource, TSource, TSource>(
() => new FastSubject<TSource>(),
values =>
{
return values
.TimeInterval(scheduler) //injects a timestamp event arguments
.Skip(1) // in order to determine an interval we need two of these, so this keeps the event in the collection, but does not process the first one in
.Where(interval => interval.Interval <= doubleClickSpeed) //second event has arrived, so we can test the interval
.RemoveTimeInterval() //take the time argument out of the event args
.Take(1) //we take one of the events (the latest) and throw it
.Repeat(); //keep the observer alive forever
});
FastSubject is just Subject now, all subjects are fast :) However, this is a weird way to check for double clicks.
How about just (warning: Coding via TextArea):
return source.Timestamp(scheduler)
.Buffer(/*buffer of*/2, 1 /*advanced at a time*/)
.Where(x => x[1].Timestamp - x[0].Timestamp < doubleClickSpeed)
.Select(x => x[1]);