Idiomatic way to handle a stream end with RxJS - system.reactive

I need to perform some action when a stream ends. What is the idiomatic way to do that?
Currently I use the code below:
source.subscribe(undefined, undefined, function() {
socket.send({type: 'end'});
});

There are several ways to accomplish this:
Use subscribeOnCompleted() instead of passing empty values to .subscribe().
Use tapOnCompleted(), which does the same but doesn't start the sequence, so you can inject it partway through a chain.
Use .finally(), which runs when the sequence terminates (normally or otherwise).
Your example shows a side effect, but if you were cleaning up resources it would be more semantic to use .using(), which takes a disposable and ties it to the lifetime of the subscription.
In order, these look like:
source.subscribeOnCompleted(() => socket.send({type: 'end'}));
source.tapOnCompleted(() => socket.send({type: 'end'})).subscribe()
source.finally(() => socket.send({type: 'end'})).subscribe()
Rx.Observable.using(() => createResource(), (resource) => source).subscribe()
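To make the difference between the "on completed" options and .finally() concrete, here is a minimal dependency-free sketch. It does not use RxJS: the Observer and Source types and the onCompleted / onTerminate helpers are hypothetical stand-ins invented for illustration, and disposal (which the real .finally() also reacts to) is not modelled.

```typescript
// Hypothetical observer shape for this sketch; this is NOT the RxJS API.
interface Observer<T> {
  next?: (value: T) => void;
  error?: (err: unknown) => void;
  complete?: () => void;
}

// A source is just a function that pushes events into an observer.
type Source<T> = (observer: Observer<T>) => void;

// Runs `action` only on normal completion (the role tapOnCompleted plays).
function onCompleted<T>(source: Source<T>, action: () => void): Source<T> {
  return (observer) =>
    source({
      next: (v) => observer.next?.(v),
      error: (e) => observer.error?.(e),
      complete: () => {
        action();
        observer.complete?.();
      },
    });
}

// Runs `action` on any termination, normal or error (the role finally plays).
function onTerminate<T>(source: Source<T>, action: () => void): Source<T> {
  return (observer) =>
    source({
      next: (v) => observer.next?.(v),
      error: (e) => {
        observer.error?.(e);
        action();
      },
      complete: () => {
        observer.complete?.();
        action();
      },
    });
}

// One source that completes normally and one that fails.
const ok: Source<number> = (o) => { o.next?.(1); o.complete?.(); };
const failing: Source<number> = (o) => { o.next?.(1); o.error?.(new Error("boom")); };

const log: string[] = [];
onCompleted(ok, () => log.push("completed:ok"))({});
onCompleted(failing, () => log.push("completed:failing"))({});
onTerminate(ok, () => log.push("terminated:ok"))({});
onTerminate(failing, () => log.push("terminated:failing"))({});
// log: ["completed:ok", "terminated:ok", "terminated:failing"]
```

Note that "completed:failing" never appears: if the 'end' message must also be sent when the stream errors, the finally-style hook is the one to reach for.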

Related

Why/How should I use Publish without Connect?

Why/how should I use .Publish() without a Connect or RefCount call following? What does it do? Example code:
var source = new Subject<int>();
var pairs = source.Publish(_source => _source
.Skip(1)
.Zip(_source, (newer, older) => (older, newer))
);
pairs.Subscribe(p => Console.WriteLine(p));
source.OnNext(1);
source.OnNext(2);
source.OnNext(3);
source.OnNext(4);
How is pairs different from pairs2 here:
var pairs2 = source
.Skip(1)
.Zip(source, (newer, older) => (older, newer));
The Publish<TSource, TResult>(Func<IObservable<TSource>, IObservable<TResult>> selector) overload is poorly documented. Lee Campbell doesn't cover it in introtorx.com. It doesn't return an IConnectableObservable, which is what most people associate with Publish, and therefore neither requires nor supports a Connect or RefCount call.
This form of Publish is basically defensive coding against possible side effects in a source observable. It subscribes once to the source, then safely multicasts all messages through the observable passed to the selector. If you look at the question code, there's only one mention of source and two mentions of _source. _source is the safely multicast observable; source is the unsafe one.
In the above example, the source is a simple Subject, so it's not really unsafe, and therefore Publish has no effect. However, if you were to replace source with this:
var source = Observable.Create<int>(o =>
{
Console.WriteLine("Print me once");
o.OnNext(1);
o.OnNext(2);
o.OnNext(3);
o.OnNext(4);
return System.Reactive.Disposables.Disposable.Empty;
});
...you would find "Print me once" printed once with pairs (correct), and twice with pairs2. The same applies whenever your observable wraps DB queries, web requests, network calls, file reads, or other side-effecting code that you want to run only once.
TL;DR: If you have an observable query that references an observable twice, it is best to wrap that observable in a Publish call.
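The effect can be reproduced without Rx at all. The following is a rough TypeScript sketch, not the Rx.NET implementation: Cold, publish and pairwise are hypothetical names made up here, and the source is synchronous to keep things short. It counts how often the producer runs when a query references its input twice, with and without a publish-style wrapper.

```typescript
type Listener<T> = (value: T) => void;
// A "cold" source: every subscription re-runs the producer function.
type Cold<T> = (listener: Listener<T>) => void;

let sideEffects = 0;
const coldSource: Cold<number> = (listener) => {
  sideEffects += 1; // stands in for a DB query or web request
  [1, 2, 3, 4].forEach((v) => listener(v));
};

// Subscribes to `source` once and hands the selector a multicast view of it.
function publish<T, R>(source: Cold<T>, selector: (shared: Cold<T>) => Cold<R>): Cold<R> {
  return (listener) => {
    const listeners: Listener<T>[] = [];
    const shared: Cold<T> = (l) => { listeners.push(l); };
    selector(shared)(listener);                    // wire up the inner subscriptions first
    source((v) => listeners.forEach((l) => l(v))); // then subscribe to the real source once
  };
}

// Pairs each value with its predecessor by subscribing to `input` twice,
// mirroring Skip(1).Zip(...) from the question.
function pairwise(input: Cold<number>): Cold<[number, number]> {
  return (listener) => {
    const older: number[] = [];
    const newer: number[] = [];
    const emit = () => {
      while (older.length > 0 && newer.length > 0) {
        listener([older.shift()!, newer.shift()!]);
      }
    };
    let skipped = false;
    input((v) => { if (skipped) { newer.push(v); emit(); } else { skipped = true; } }); // Skip(1)
    input((v) => { older.push(v); emit(); });                                           // zip partner
  };
}

const viaPublish: [number, number][] = [];
publish(coldSource, pairwise)((p) => viaPublish.push(p));
const effectsWithPublish = sideEffects; // producer ran once

sideEffects = 0;
const direct: [number, number][] = [];
pairwise(coldSource)((p) => direct.push(p));
const effectsDirect = sideEffects; // producer ran twice
```

Both variants produce the pairs (1, 2), (2, 3), (3, 4); only the number of times the producer runs differs.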

scala save slick result into new object

Is there a way to save the result of a Slick query into a new object?
This is my Slick result; there is only one "object" in the list:
val result: Future[Seq[ProcessTemplatesModel]] = db.run(action)
The result should be mapped to ProcessTemplatesModel because I want to access the values like this:
process.title
Is this possible?
Thanks
TL;DR: you should keep the context as long as you can.
Future denotes the fact that the value will be given at some time in the future (this is what I call some context for the value).
The bad way to use it would be to block your thread, until such value is found, and then work with it.
A better way is to tell your program: "Once the value is found (whenever that is), do something with it". That's a continuation, or callback, and is implemented with map and flatMap in Scala.
Seq is another context for your value. It means that you actually have different possible values. If you want to make sure that you have at most one value, you can always do seq.headOption to switch context from Seq to Option.
The bad way to use it would be to take the first value without checking whether it exists.
A better way is to tell your program: "No matter how many values you have, do this for each of them".
Now, how do you work in context? You use the Functor and/or Monad operators: map, flatMap.
For instance, if you want to apply a function convertToSomethingElse to each element of your context, just do
result.map(list => list.map(process => convertToSomethingElse(process)))
And you'll get a Future[Seq[SomethingElse]].
Another example: if you want to save the result somewhere else, you'll probably have some IO or database operations, which may take some time and possibly fail. We will assume you have a function save(entity: ProcessTemplatesModel): Future[Boolean] that saves one of your models. The fact that the function takes some time (and is started in another thread) and may fail is visible in the return type Future[Boolean] (the Boolean is not important here; what matters is that we again have the Future context).
Here, you will have to do (assuming you just want to save the first element in your list):
val savedFirstResult: Future[Option[Boolean]] = result.flatMap { list =>
  Future.traverse(list.headOption) { process => // traverse swaps the Option and Future contexts
    save(process)
  }
}
So as you can see, we can do most of what we want by staying inside the contexts that Slick returns. You shouldn't want to leave them, because:
most of the time there's no need to: map lets you apply a function written for plain values to a value inside the context
the extracting methods are mostly unsafe: Option#get throws an exception if the Option is empty, and Await.result(future, duration) may block all computations or throw exceptions
responses in Play! can be given as Futures in a controller, using Action.async

How to write unit test when you use Future?

I've written a class with some functions that make HTTP calls and return a Future[String]. I use those functions inside a method that I need to write some tests for:
def score(rawEvent: Json) = {
httpService
.get("name", formatJsonAttribute(rawEvent.name))
.onComplete { op =>
op.map { json =>
//What must be tested
}
}
}
The function onComplete doesn't have a return type - it returns Unit. How can I replace that onComplete to make my function return something to be tested?
I completely agree with @Michal that you should always prefer map to onComplete with Futures. However, I'd like to point out that, as you said yourself, what you wish to test is not the HTTP call itself (which relies on an HTTP client you probably don't need to test, a response from a server over which you may have no control, ...), but what you do with its answer.
So why not write a test, not on the function score, but on the function you wrote in your onComplete (or map, if you decided to change it)?
That way you will be able to test it with precise values for json, that you may wish to define as the result you will get from the server, but that you can control completely (for instance, you could test border cases without forcing your server to give unusual responses).
Testing that the two (HTTP call and callback function) sit well together is not a unit-test question, but an integration-test question, and should be done only once you know that your function does what is expected of it.
At that point, you will effectively need to check the value of a Future, in which case you can use Await.result as @Michal suggested, or use the relevant constructs that your test framework provides. For instance, scalatest has an AsyncTestSuite trait for this kind of issue.
Use map instead of onComplete. It will also give you the resolved value inside the mapping function. The return type of score will then be Future[T], where T is the result type of your processing.
In the tests you can use scala.concurrent.Await.result() function.

What's the Scala.js way to make an event occur in n milliseconds' time?

Is it to use ScalaJS DOM and use the following?
org.scalajs.dom.setTimeout( () => {
// Work
}, 1000)
Is there another way or a better way within the context of ScalaJS?
Starting with Scala.js 0.6.0, there is a more standard way, and more idiomatic Scala, to do it:
import scala.scalajs.js.timers._
setTimeout(1000) { // note the absence of () =>
// work
}
See the ScalaDoc of timers.
There isn't a better way. If you want you can wrap it in a helper and call it whatever you want, but by default that's it.

How to buffer based on time and count, but stopping the timer if no events occur

I'm producing a sequence of 50 items every three seconds. I then want to batch them into chunks of at most 20 items, but without waiting more than one second before releasing the buffer.
That works great!
But since the interval never dies, Buffer keeps firing empty batch chunks...
How can I avoid that? Sure, Where(buf => buf.Count > 0) should help, but that seems like a hack.
Observable
.Interval(TimeSpan.FromSeconds(3))
.Select(n => Observable.Repeat(n, 50))
.Merge()
.Buffer(TimeSpan.FromSeconds(1), 20)
.Subscribe(e => Console.WriteLine(e.Count));
Output:
0-0-0-20-20-10-0-20-20-10-0-0-20-20
The Where filter you propose is a sound approach, I'd go with that.
You could wrap the Buffer and Where into a single helper method named to make the intent clearer perhaps, but rest assured the Where clause is idiomatic Rx in this scenario.
Think of it this way: an empty buffer relays the information that no events occurred in the last second. While you can argue that this is implicit, it would take extra work to detect if Buffer didn't emit an empty list. It just so happens that it's not information you are interested in, so Where is an appropriate way to filter it out.
A lazy timer solution
Following from your comment ("...the timer... be[ing] lazily initiated...") you can do this to create a lazy timer and omit the zero counts:
var source = Observable.Interval(TimeSpan.FromSeconds(3))
.Select(n => Observable.Repeat(n, 50))
.Merge();
var xs = source.Publish(pub =>
pub.Buffer(() => pub.Take(1).Delay(TimeSpan.FromSeconds(1))
.Merge(pub.Skip(19)).Take(1)));
xs.Subscribe(x => Console.WriteLine(x.Count));
Explanation
Publishing
This query requires subscribing to the source events multiple times. To avoid unexpected side-effects, we use Publish to give us pub which is a stream that multicasts the source creating just a single subscription to it. This replaces the older Publish().RefCount() technique that achieved the same end, effectively giving us a "hot" version of the source stream.
In this case, this is necessary to ensure the subsequent buffer closing streams produced after the first will start with the current events - if the source was cold they would start over each time. I wrote a bit about publishing here.
The main query
We use an overload of Buffer that accepts a factory function that is called for every buffer emitted to obtain an observable stream whose first event is a signal to terminate the current buffer.
In this case, we want to terminate the buffer when either the first event into the buffer has been there for a full second, or when 20 events have appeared from the source - whichever comes first.
To achieve this we Merge streams that describe each case - the Take(1).Delay(...) combo describes the first condition, and the Skip(19).Take(1) describes the second.
However, I would still measure the performance of the simple Buffer / Where approach first, because I suspect this is overkill; a lot depends on the precise details of the platform and scenario.
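The closing rule described above (the buffer opens on the first event and closes after a full second or after 20 events, whichever comes first) can be checked offline with a small simulation. This is only a sketch in TypeScript over pre-recorded timestamps, not Rx code: Timed, bufferLazily and burst are hypothetical names, and events are assumed to arrive sorted by time.

```typescript
// An event tagged with its arrival time in milliseconds.
interface Timed<T> { at: number; value: T }

// Groups events into buffers that open on the first event and close when
// either `maxDelayMs` has elapsed since that first event or `maxCount`
// events have been collected. No events means no buffers at all.
function bufferLazily<T>(events: Timed<T>[], maxDelayMs: number, maxCount: number): T[][] {
  const buffers: T[][] = [];
  let current: T[] = [];
  let deadline = 0;
  for (const e of events) { // assumed sorted by `at`
    if (current.length > 0 && e.at >= deadline) {
      buffers.push(current); // the time limit passed before this event arrived
      current = [];
    }
    if (current.length === 0) deadline = e.at + maxDelayMs; // first event starts the clock
    current.push(e.value);
    if (current.length === maxCount) {
      buffers.push(current); // the count limit was reached
      current = [];
    }
  }
  if (current.length > 0) buffers.push(current); // flush whatever is left
  return buffers;
}

// Two bursts of 50 items, three seconds apart, as in the question.
const burst = (at: number): Timed<number>[] =>
  Array.from({ length: 50 }, (_, i) => ({ at, value: i }));
const counts = bufferLazily([...burst(0), ...burst(3000)], 1000, 20).map((b) => b.length);
// counts: [20, 20, 10, 20, 20, 10], with no zero-length buffers in between
```

A live version would need a real timer to close the trailing partial buffer instead of the final flush used here.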
After using the accepted answer for quite a while, I would now suggest a different implementation (inspired by James's Skip / Take approach and this answer):
var source = Observable.Interval(TimeSpan.FromSeconds(3))
.Select(n => Observable.Repeat(n, 50))
.Merge();
var xs = source.BufferOmitEmpty(TimeSpan.FromSeconds(1), 20);
xs.Subscribe(x => Console.WriteLine(x.Count));
With an extension method BufferOmitEmpty like:
public static IObservable<IList<TSource>> BufferOmitEmpty<TSource>(this IObservable<TSource> observable, TimeSpan maxDelay, int maxBufferCount)
{
return observable
.GroupByUntil(x => 1, g => Observable.Timer(maxDelay).Merge(g.Skip(maxBufferCount - 1).Take(1).Select(x => 1L)))
.Select(x => x.ToArray())
.Switch();
}
It is 'lazy' because no groups are created as long as there are no elements on the source sequence, so there are no empty buffers. As in Tom's answer, there is another nice advantage over the Buffer / Where implementation: the buffer starts when the first element arrives. So elements that follow each other within the buffer time after a quiet period are processed in the same buffer.
Why not to use the Buffer method
Three problems occurred when I was using the Buffer approach (they might be irrelevant to the scope of the question, so this is a warning to people who, like me, use Stack Overflow answers in different contexts):
1. Because of the Delay, one thread is used per subscriber.
2. In scenarios with long-running subscribers, elements from the source sequence can be lost.
3. With multiple subscribers, it sometimes creates buffers with a count greater than maxBufferCount.
(I can supply sample code for 2. and 3., but I'm unsure whether to post it here or in a different question, because I cannot fully explain why it behaves this way.)
RxJS 5 has features that are buried in its source code. It turns out this is pretty easy to achieve with bufferTime.
From the source code, the signature looks like this:
export function bufferTime<T>(this: Observable<T>, bufferTimeSpan: number, bufferCreationInterval: number, maxBufferSize: number, scheduler?: IScheduler): Observable<T[]>;
So your code would be like this:
observable.bufferTime(1000, null, 20)