Why/How should I use Publish without Connect? - system.reactive

Why/how should I use .Publish() without a Connect or RefCount call following? What does it do? Example code:
var source = new Subject<int>();
var pairs = source.Publish(_source => _source
.Skip(1)
.Zip(_source, (newer, older) => (older, newer))
);
pairs.Subscribe(p => Console.WriteLine(p));
source.OnNext(1);
source.OnNext(2);
source.OnNext(3);
source.OnNext(4);
How is pairs different from pairs2 here:
var pairs2 = source
.Skip(1)
.Zip(source, (newer, older) => (older, newer));

The Publish<TSource, TResult>(Func<IObservable<TSource>, IObservable<TResult>> selector) overload is poorly documented. Lee Campbell doesn't cover it on introtorx.com. It doesn't return an IConnectableObservable, which is what most people associate with Publish, and therefore neither requires nor supports a Connect or RefCount call.
This form of Publish is basically defensive coding against possible side effects in a source observable. It subscribes once to the source, then safely multicasts all messages through the parameter passed to the selector. If you look at the question code, there's only one mention of source and two mentions of _source. _source here is the safely multicasted observable; source is the unsafe one.
In the above example, the source is a simple Subject, so it's not really unsafe, and therefore Publish has no effect. However, if you were to replace source with this:
var source = Observable.Create<int>(o =>
{
Console.WriteLine("Print me once");
o.OnNext(1);
o.OnNext(2);
o.OnNext(3);
o.OnNext(4);
return System.Reactive.Disposables.Disposable.Empty;
});
...you would find "Print me once" printed once with pairs (correct), and twice with pairs2. This effect has similar implications where your observable wraps things like DB queries, web requests, network calls, file reads, and other side-effecting code that you want to happen only once and not multiple times.
TL;DR: If you have an observable query that references an observable twice, it is best to wrap that observable in a Publish call.
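The difference between one shared subscription and two independent ones can be sketched without Rx at all. The following is a minimal plain-Python model, not Rx.NET code: make_cold_source, skip, zip_with, publish and pair_query are hypothetical helpers written only for this illustration, and the model ignores completion, errors and disposal.

```python
from typing import Callable, List

Observer = Callable[[int], None]

def make_cold_source(log: List[str]):
    """Model of a cold observable: every subscription re-runs the side effect."""
    def subscribe(observer: Observer) -> None:
        log.append("Print me once")  # side effect, executed once per subscription
        for value in (1, 2, 3, 4):
            observer(value)
    return subscribe

def skip(source, count: int):
    """Drop the first `count` values of `source`."""
    def subscribe(observer):
        seen = 0
        def on_next(value):
            nonlocal seen
            seen += 1
            if seen > count:
                observer(value)
        source(on_next)
    return subscribe

def zip_with(left, right, combine):
    """Pair the n-th value of `left` with the n-th value of `right`."""
    def subscribe(observer):
        left_queue, right_queue = [], []
        def drain():
            while left_queue and right_queue:
                observer(combine(left_queue.pop(0), right_queue.pop(0)))
        def on_left(value):
            left_queue.append(value)
            drain()
        def on_right(value):
            right_queue.append(value)
            drain()
        left(on_left)
        right(on_right)
    return subscribe

def publish(source, selector):
    """Model of Publish(selector): subscribe to `source` once, fan values out."""
    def subscribe(observer):
        listeners = []
        def shared(listener):  # subscribing to the shared stream only registers
            listeners.append(listener)
        selector(shared)(observer)  # wire up the query against the shared stream
        def fan_out(value):
            for listener in list(listeners):
                listener(value)
        source(fan_out)  # the single real subscription
    return subscribe

def pair_query(src):
    return zip_with(skip(src, 1), src, lambda newer, older: (older, newer))

log_published, pairs = [], []
publish(make_cold_source(log_published), pair_query)(pairs.append)

log_direct, pairs2 = [], []
pair_query(make_cold_source(log_direct))(pairs2.append)

print(pairs, len(log_published))
print(pairs2, len(log_direct))
```

In this model both queries yield the same pairs, but the side effect is logged once for the published version and twice for the direct one.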

Related

Confused about Observable vs. Single in functions like readCharacteristic()

In the RxJava2 version of RxAndroidBle, the functions readCharacteristic() and writeCharacteristic() return Single<byte[]>.
The example code to read a characteristic is:
device.establishConnection(false).flatMap(rxBleConnection -> rxBleConnection.readCharacteristic(characteristicUUID))
But the documentation for flatMap() says the mapping function is supposed to return an ObservableSource. Here, it returns a Single. How can this work?
Update: I looked at possibilities using operators like .single() and .singleOrError() but they all seem to require that the upstream emits one item and then completes. But establishConnection() doesn't ever complete. (This is one reason I suggested that perhaps establishConnection() should be reimagined as a Maybe, and some other way be provided to disconnect rather than just unsubscribing.)
You're totally correct, this example cannot compile. It's probably a leftover from the RxJava1 version of the library, which did not use Single.
A simple fix with the same result is to use RxJava2's flatMapSingle, for instance:
device.establishConnection(false)
.flatMapSingle(rxBleConnection -> rxBleConnection.readCharacteristic(characteristicUUID))
flatMapSingle accepts a Single as the return value of the mapping function, and emits the success value of each inner Single as an item of the resulting Observable.
The point is that RxJava2 has more specific reactive types that express the series of emissions to expect from a source. Some methods now return Single because that matches their logical operation (readCharacteristic() produces one value), while others return Observable because they can emit more than one item (establishConnection() represents a connection whose status can change over time).
RxJava2 also provides many operators to convert between the different types; which one to use really depends on your needs and scenario.
Thanks Rob!
In fact, the README was outdated and needed some polishing here and there. Please have a look and see if it's ok now.
I think I found the answer I was looking for. The crucial point:
Single.fromObservable(observableSource) doesn't do anything until it receives the second item from observableSource! Assuming that the first item it receives is a valid emission, then if the second item is:
onComplete(), it passes the first item to onSuccess();
onNext(), it signals IndexOutOfBoundsException since a Single can't emit more than one item;
onError(), it presumably forwards the error downstream.
Now, device.establishConnection() is a 1-item, non-completing Observable. The RxBleConnection it emits is flatMapped to a Single with readCharacteristic(). But (another gotcha) flatMapSingle subscribes to these Singles and combines them into an Observable, which doesn't complete until the source establishConnection() does. But the source never completes! Therefore the Single we're trying to create won't emit anything, since it never receives that necessary second signal.
The solution is to force the generation of onComplete() after the first (and only) item, which can be done with take(1). This will satisfy the Single we're creating, and cause it to emit the Characteristic value we're interested in. Hope that's clear.
The code:
Single<byte[]> readCharacteristicSingle( RxBleDevice device, UUID characteristicUUID ) {
return Single.fromObservable(
device.establishConnection( false )
.flatMapSingle( connection -> connection.readCharacteristic( characteristicUUID ) )
.take( 1L ) // make flatMapSingle's output Observable complete after the first emission
// (this makes the Single call onSuccess())
);
}
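The reasoning above (the Single waits for a terminal signal, and take(1) supplies it) can be modeled in a few lines. This is a plain-Python sketch of the described behavior, not RxJava code: single_from and take are hypothetical helpers, signals are represented as simple tuples, and the error payloads are placeholder strings rather than the exceptions RxJava would raise.

```python
from typing import Any, List, Optional, Tuple

# A signal is ("next", value), ("complete", None) or ("error", reason).
Signal = Tuple[str, Any]

def single_from(signals: List[Signal]) -> Optional[Tuple[str, Any]]:
    """Outcome of the modeled Single, or None while it is still waiting."""
    has_first = False
    first = None
    for kind, payload in signals:
        if kind == "next":
            if has_first:
                return ("error", "more than one item")
            first, has_first = payload, True
        elif kind == "complete":
            return ("success", first) if has_first else ("error", "no items")
        else:
            return ("error", payload)
    return None  # no terminal signal seen yet, so nothing is emitted

def take(signals: List[Signal], count: int) -> List[Signal]:
    """Model of take(count) for count >= 1: complete right after `count` items."""
    out: List[Signal] = []
    taken = 0
    for kind, payload in signals:
        out.append((kind, payload))
        if kind != "next":
            return out  # upstream terminated on its own
        taken += 1
        if taken == count:
            out.append(("complete", None))
            return out
    return out

# One item and no completion, like the flatMapSingle output described above.
never_completing = [("next", b"\x2a")]

waiting = single_from(never_completing)
resolved = single_from(take(never_completing, 1))
too_many = single_from([("next", 1), ("next", 2)])
print(waiting, resolved, too_many)
```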

Auto-complete a Single created from another Observable

I have a long-running operation that returns a value in code I don't control. I need that value to be published to things that ask for it. For this purpose I am using a BehaviorSubject:
var subject: Subject<Value>? = null
fun retrieveValue(): Single<Value> {
if (subject == null) {
subject = BehaviorSubject.create<Value>()
someOtherThing.retrieveValueAsync { value ->
subject.onNext(value)
}
}
return subject.singleOrError()
}
This lets me perform the operation only once and send the result as a single to all future interested parties. However, it does not work. The single will not emit a value until I call:
subject.onComplete()
But this is a problem because once the subject is completed future things can no longer subscribe to it.
What is the appropriate way to cache a value from another observable and pass it to a Single? If there was a way to have a subject automatically complete once its source observable emitted a value that would work. Single.cache() also looks promising, but I'm unsure how I would handle the fact that my value comes in asynchronously in that case.
It feels like I'm missing something silly.
There is a SingleSubject for this case.
If you don't want experimental code in your codebase, you can use ReplaySubject.createWithSize(1) and call onComplete without losing the last value, then convert it to Single.
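To illustrate what a single-value, replaying subject does conceptually, here is a plain-Python sketch rather than RxJava: OneShotValue and ValueRepository are hypothetical names, the model is not thread-safe, and it ignores errors. Early subscribers are queued until the value arrives, late subscribers get the cached value immediately, and the long-running operation is started only once.

```python
from typing import Any, Callable, List, Optional

class OneShotValue:
    """Caches a single value and replays it to current and future subscribers."""

    def __init__(self) -> None:
        self._has_value = False
        self._value: Any = None
        self._waiting: List[Callable[[Any], None]] = []

    def subscribe(self, on_success: Callable[[Any], None]) -> None:
        if self._has_value:
            on_success(self._value)  # late subscriber: replay immediately
        else:
            self._waiting.append(on_success)

    def on_success(self, value: Any) -> None:
        if self._has_value:
            return  # only the first value counts
        self._has_value, self._value = True, value
        waiting, self._waiting = self._waiting, []
        for callback in waiting:
            callback(value)

class ValueRepository:
    """Starts the asynchronous retrieval once and shares its result."""

    def __init__(self, retrieve_async: Callable[[Callable[[Any], None]], None]) -> None:
        self._retrieve_async = retrieve_async
        self._cached: Optional[OneShotValue] = None
        self.calls = 0

    def retrieve_value(self) -> OneShotValue:
        if self._cached is None:
            self._cached = OneShotValue()
            self.calls += 1
            self._retrieve_async(self._cached.on_success)
        return self._cached

# Stand-in for the long-running operation: it just remembers its callback.
pending: List[Callable[[Any], None]] = []
repo = ValueRepository(pending.append)

received: List[Any] = []
repo.retrieve_value().subscribe(received.append)   # subscribes before the value exists
pending[0]("result")                               # the operation finishes
repo.retrieve_value().subscribe(received.append)   # subscribes after the value exists
print(received, repo.calls)
```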

Subscription without any parameters

I need to write some RxScala code that creates an Observable from a text field in Scala Swing.
After looking on Github, I wrote this code, which seems to work:
def textValues: Observable[String] = Observable.create[String](observer => {
observer.onNext(field.text)
Subscription()
})
Where field.text gives the current state of the text from that field.
I don't understand this line however:
Subscription()
What is this Subscription useful for? It doesn't take any parameters so I'm assuming that it can't be used to unsubscribe from the observable I created, since it doesn't even reference this observable.
So what is a Subscription() good for?
Why doesn't it take any parameters?
First, to answer your question about the line:
Subscription()
It creates an empty subscription, that is, a subscription which does nothing when unsubscribe is called on it (instead of deregistering the event listener).
Second, note that you're using the "old fashioned" way of creating Observables:
@deprecated("Use [[Observable.apply]] instead", "0.26.2")
def create[T](f: Observer[T] => Subscription): Observable[T]
The new way is this one:
def apply[T](f: Subscriber[T] => Unit): Observable[T]
It corresponds to RxJava's Observable.create.
If you want to register unsubscribe actions with this method, you should use Subscriber.add(Subscription).
And third, let me point you to RxSwing, which defines all kinds of Observables for Swing Events, so that you don't have to re-implement them yourself. It's written in Java, but it's easily compatible with Scala, as you can see, for instance, in this example.

Reactive extensions documentation

I encountered an RX method that I have not seen before and went to MSDN to take a look at any documentation that there may be.
The method was
public static IObservable<IList<TSource>> CombineLatest<TSource> (this IEnumerable<IObservable<TSource>> sources)
I am familiar with a CombineLatest that takes an Observable as a parameter and a transform function and returns an Observable.
Unfortunately MSDN only has a page for "Observable.CombineLatest<TFirst, TSecond, TResult> Method".
Am I looking in the wrong place or am I missing a better place to see RX API documentation, other than MSDN?
The MSDN documentation is awful. The two best sites I have found are http://introtorx.com and http://reactivex.io. Neither site has info on that overload, though. :)
Basically, once each observable has emitted a value, CombineLatest emits a new list every time one of the streams emits. The list contains the latest value from each of the child streams. If a stream completes, its last value keeps being used in later lists. If a stream errors, the combined stream errors.
You can run the following in LinqPad to demonstrate what it does:
var letters = new Subject<string>();
var numbers = new Subject<string>();
var observables = new IObservable<string>[] { letters, numbers };
var combineLatest = observables.CombineLatest();
combineLatest.Dump("CombineLatest");
numbers.OnNext("0");
numbers.OnNext("1");
letters.OnNext("A");
letters.OnNext("B");
numbers.OnNext("2");
letters.OnNext("C");
letters.OnNext("D");
numbers.OnCompleted();
letters.OnNext("E");
letters.OnError(new Exception());
Marble diagram:
Letters: ------A--B-----C--D-----E--x
Numbers: 0--1--------2--------|
--------
Combine: ------A1-B1-B2-C2-D2----E2-x
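The same sequence can be replayed with a small plain-Python model of the rule described above. This is a sketch rather than Rx.NET: combine_latest is a hypothetical helper, the events are given as (source index, value) pairs in arrival order, and completion and error signals are left out.

```python
from typing import List, Tuple

def combine_latest(events: List[Tuple[int, str]], source_count: int) -> List[List[str]]:
    """Emit a list of the latest values once every source has produced one."""
    latest = [None] * source_count
    seen = [False] * source_count
    emitted = []
    for index, value in events:
        latest[index] = value
        seen[index] = True
        if all(seen):  # nothing is emitted until each source has a value
            emitted.append(list(latest))
    return emitted

LETTERS, NUMBERS = 0, 1
events = [
    (NUMBERS, "0"), (NUMBERS, "1"),
    (LETTERS, "A"), (LETTERS, "B"),
    (NUMBERS, "2"),
    (LETTERS, "C"), (LETTERS, "D"),
    (LETTERS, "E"),  # numbers has completed by now; its last value "2" is reused
]
combined = ["".join(pair) for pair in combine_latest(events, 2)]
print(combined)
```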

How to buffer based on time and count, but stopping the timer if no events occur

I'm producing a sequence of 50 items every three seconds. I then want to batch them into buffers of at most 20 items, but without waiting more than one second before I release a buffer.
That works great!
But since the interval never dies, Buffer keeps firing empty batch chunks...
How can I avoid that? Sure, Where(buf => buf.Count > 0) should help, but that seems like a hack.
Observable
.Interval(TimeSpan.FromSeconds(3))
.Select(n => Observable.Repeat(n, 50))
.Merge()
.Buffer(TimeSpan.FromSeconds(1), 20)
.Subscribe(e => Console.WriteLine(e.Count));
Output:
0-0-0-20-20-10-0-20-20-10-0-0-20-20
The Where filter you propose is a sound approach, I'd go with that.
You could wrap the Buffer and Where into a single helper method named to make the intent clearer perhaps, but rest assured the Where clause is idiomatic Rx in this scenario.
Think of it this way; an empty Buffer is relaying information that no events occurred in the last second. While you can argue that this is implicit, it would require extra work to detect this if Buffer didn't emit an empty list. It just so happens it's not information you are interested in - so Where is an appropriate way to filter this information out.
A lazy timer solution
Following from your comment ("...the timer... be[ing] lazily initiated...") you can do this to create a lazy timer and omit the zero counts:
var source = Observable.Interval(TimeSpan.FromSeconds(3))
.Select(n => Observable.Repeat(n, 50))
.Merge();
var xs = source.Publish(pub =>
pub.Buffer(() => pub.Take(1).Delay(TimeSpan.FromSeconds(1))
.Merge(pub.Skip(19)).Take(1)));
xs.Subscribe(x => Console.WriteLine(x.Count));
Explanation
Publishing
This query requires subscribing to the source events multiple times. To avoid unexpected side-effects, we use Publish to give us pub which is a stream that multicasts the source creating just a single subscription to it. This replaces the older Publish().RefCount() technique that achieved the same end, effectively giving us a "hot" version of the source stream.
In this case, this is necessary to ensure the subsequent buffer closing streams produced after the first will start with the current events - if the source was cold they would start over each time. I wrote a bit about publishing here.
The main query
We use an overload of Buffer that accepts a factory function that is called for every buffer emitted to obtain an observable stream whose first event is a signal to terminate the current buffer.
In this case, we want to terminate the buffer when either the first event into the buffer has been there for a full second, or when 20 events have appeared from the source - whichever comes first.
To achieve this we Merge streams that describe each case - the Take(1).Delay(...) combo describes the first condition, and the Skip(19).Take(1) describes the second.
However, I would still test performance the easy way, because I still suspect this is overkill, but a lot depends on the precise details of the platform and scenario etc.
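The closing rule described above (close a buffer one second after its first element arrives, or as soon as it holds 20 elements) can be simulated offline on timestamped events. This is a plain-Python sketch of the rule, not of Rx scheduling: buffer_lazy is a hypothetical helper, the timestamps are made-up seconds, and a partial buffer is only flushed when the next event arrives or the input ends.

```python
from typing import List, Tuple

def buffer_lazy(events: List[Tuple[float, int]], max_delay: float, max_count: int) -> List[List[int]]:
    """Batch (timestamp, value) events; the timer starts with a buffer's first element."""
    buffers: List[List[int]] = []
    current: List[int] = []
    deadline = 0.0
    for timestamp, value in events:
        if current and timestamp >= deadline:
            buffers.append(current)  # the timer for the open buffer has expired
            current = []
        if not current:
            deadline = timestamp + max_delay  # lazy timer: armed by the first element
        current.append(value)
        if len(current) == max_count:
            buffers.append(current)  # count limit reached
            current = []
    if current:
        buffers.append(current)  # flush what is left at the end of the input
    return buffers

# Two bursts of 50 items, three seconds apart, as in the question.
events = [(0.0, 0)] * 50 + [(3.0, 1)] * 50
counts = [len(batch) for batch in buffer_lazy(events, max_delay=1.0, max_count=20)]
print(counts)
```

Because no timer runs while the source is quiet, no empty buffers appear in the result.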
After using the accepted answer for quite a while I would now suggest a different implementation (inspired by James's Skip / Take approach and this answer):
var source = Observable.Interval(TimeSpan.FromSeconds(3))
.Select(n => Observable.Repeat(n, 50))
.Merge();
var xs = source.BufferOmitEmpty(TimeSpan.FromSeconds(1), 20);
xs.Subscribe(x => Console.WriteLine(x.Count));
With an extension method BufferOmitEmpty like:
public static IObservable<IList<TSource>> BufferOmitEmpty<TSource>(this IObservable<TSource> observable, TimeSpan maxDelay, int maxBufferCount)
{
return observable
.GroupByUntil(x => 1, g => Observable.Timer(maxDelay).Merge(g.Skip(maxBufferCount - 1).Take(1).Select(x => 1L)))
.Select(x => x.ToArray())
.Switch();
}
It is 'lazy' because no groups are created as long as there are no elements on the source sequence, so there are no empty buffers. As in Tom's answer, there is another nice advantage over the Buffer / Where implementation: the buffer is started when the first element arrives. So elements that follow each other within the buffer time after a quiet period are processed in the same buffer.
Why not to use the Buffer method
Three problems occurred when I was using the Buffer approach (they might be irrelevant for the scope of the question, so this is a warning to people who, like me, use Stack Overflow answers in different contexts):
1. Because of the Delay, one thread is used per subscriber.
2. In scenarios with long-running subscribers, elements from the source sequence can be lost.
3. With multiple subscribers it sometimes creates buffers with a count greater than maxBufferCount.
(I can supply sample code for 2. and 3., but I'm unsure whether to post it here or in a different question because I cannot fully explain why it behaves this way.)
RxJs5 has hidden features buried in its source code. It turns out this is pretty easy to achieve with bufferTime.
From the source code, the signature looks like this:
export function bufferTime<T>(this: Observable<T>, bufferTimeSpan: number, bufferCreationInterval: number, maxBufferSize: number, scheduler?: IScheduler): Observable<T[]>;
So your code would be like this:
observable.bufferTime(1000, null, 20)