Is there an Rx operator for combining the latest from streams 1 and 2 only when stream 2 emits things? - system.reactive

Here's my attempt at drawing the marble diagram --
STREAM 1 = A----B----C---------D------>
(magical operator)
STREAM 2 = 1----------2-----3-----4--->
STREAM 3 = 1A---------2C----3C----4D-->
I am looking for an operator that generates stream 3 from streams 1 and 2: whenever stream 2 emits, its value is combined with the latest value from stream 1. combineLatest is similar to what I want, but I only want stream 3 to emit when stream 2 emits, not when stream 1 does. Does an operator like this exist?

There is an operator that does what you need: one overload of sample takes another observable instead of a duration as a parameter. The documentation is here: https://github.com/ReactiveX/RxJava/wiki/Filtering-Observables#sample-or-throttlelast
The usage (I'll give examples in scala):
import rx.lang.scala.Observable
import scala.concurrent.duration
import duration._
def o = Observable.interval(100.milli)
def sampler = Observable.interval(180.milli)
// Often, you just need the sampled observable
o.sample(sampler).take(10).subscribe(x ⇒ println(x + ", "))
Thread.sleep(2000)
// or, as for your use case
o.combineLatest(sampler).sample(sampler).take(10).subscribe(x ⇒ println(x + ", "))
Thread.sleep(2000)
The output:
0,
2,
4,
6,
7,
9,
11,
13,
15,
16,
(2,0),
(4,1),
(6,2),
(7,3),
(9,4),
(11,5),
(13,6),
(15,7),
(16,8),
(18,9),
There is a slight catch in that duplicate entries from the sampled observable are swallowed (see discussion at https://github.com/ReactiveX/RxJava/issues/912). Other than that, I think it is exactly what you are looking for.

withLatestFrom seems to fit exactly what I was looking for - http://rxmarbles.com/#withLatestFrom
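To make the semantics concrete, here is a minimal plain-Scala model of what withLatestFrom does, written without any Rx library. The name withLatestFromModel and the Either encoding (Left for an emission from stream 1, Right for an emission from stream 2) are assumptions made for this sketch, not part of any Rx API:

```scala
// Hypothetical model, not an Rx API: events are listed in the order they occur.
// Left(c)  = stream 1 emitted c
// Right(n) = stream 2 emitted n
def withLatestFromModel(events: Seq[Either[Char, Int]]): Seq[String] = {
  val (_, out) = events.foldLeft((Option.empty[Char], Vector.empty[String])) {
    // Stream 1 emits: remember the value, produce nothing.
    case ((_, acc), Left(c)) => (Some(c), acc)
    // Stream 2 emits: combine with the latest stream 1 value, if there is one yet.
    case ((latest, acc), Right(n)) => (latest, acc ++ latest.map(c => s"$n$c"))
  }
  out
}
```

Feeding it the events in the order A, 1, B, C, 2, 3, D, 4 yields 1A, 2C, 3C, 4D, which matches STREAM 3 in the marble diagram.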

As far as I know there isn't a single existing operator that will do what you want. However you can compose one by using CombineLatest and DistinctUntilChanged as follows:
var joined = Observable.CombineLatest(sourceA, sourceB, (a,b) => new { A = a, B = b })
.DistinctUntilChanged(pair => pair.B);
EDIT:
The above will work as long as the values for STREAM 1 change each time. If they do not, then use the following, which is less clear but works in all situations (that I've tested, anyway).
var joined = Observable.Join(
sourceB,
sourceA,
_ => Observable.Return(Unit.Default),
_ => sourceA,
(a, b) => new { A = a, B = b });
The Join operator is never intuitive to me, the best explanation I've found is here.
In answer to @Matthew's comment:
var buttonClicks = Observable.FromEventPattern<MouseButtonEventArgs>(this,
"MouseLeftButtonDown")
.Select(_ => Unit.Default);
var sequence = Observable.Interval(TimeSpan.FromSeconds(1));
var joined = Observable.Join(
buttonClicks,
sequence,
_ => Observable.Return(Unit.Default),
_ => sequence,
(b, s) => s); // No info in button click here

Here is a fairly simple way to do it:
var query = stream2.Zip(
stream1.MostRecent(' '),
(s2,s1) => string.Format("{0}{1}", s2, s1));
MostRecent can be supplied a "zero" value which is used in the event stream1 has not emitted yet. This could be null for reference types, but I used a char for stream1 so supplied a space.

I think that the Switch operator is the key here.
Try this:
var query =
stream1
.Select(s1 => stream2.Select(s2 => new { s1, s2 }))
.Switch();
The following test code:
query
.Select(s => String.Format("{0}{1}", s.s2, s.s1))
.Subscribe(Console.WriteLine);
stream1.OnNext('A');
stream2.OnNext(1);
stream1.OnNext('B');
stream1.OnNext('C');
stream2.OnNext(2);
stream2.OnNext(3);
stream1.OnNext('D');
stream2.OnNext(4);
Gives these results:
1A
2C
3C
4D
Please let me know if this is correct.

A solution
public static IObservable<TR> Sample<TSource, TSampler, TR>
(this IObservable<TSource> source,
IObservable<TSampler> sampler,
Func<TSource, TSampler, TR> combiner)
{
return source.Publish
(rs => sampler
.Zip
( rs.MostRecent(default(TSource))
, (samplerElement, sourceElement)
=> combiner(sourceElement, samplerElement)
)
.SkipUntil(rs)
);
}
with a test case because this kind of thing is tricky to get right.
public class SampleSpec : ReactiveTest
{
TestScheduler _Scheduler = new TestScheduler();
[Fact]
public void ShouldWork()
{
var sampler = _Scheduler.CreateColdObservable
( OnNext(10, "A")
, OnNext(20, "B")
, OnNext(30, "C")
, OnNext(40, "D")
, OnNext(50, "E")
, OnNext(60, "F")
);
var source = _Scheduler.CreateColdObservable
( Enumerable
.Range(5,100)
.Where(i=>i%10!=0)
.Select(i=>OnNext(i,i)).ToArray());
var sampled = source.Sample
(sampler, Tuple.Create);
var actual = _Scheduler.Start
(() =>
sampled
, created: 0
, subscribed: 1
, disposed: 1000);
actual.Messages.Count()
.Should()
.Be(6);
var messages = actual.Messages.Take(6)
.Select(v => v.Value.Value)
.ToList();
messages[0].Should().Be(Tuple.Create(9,"A"));
messages[1].Should().Be(Tuple.Create(19,"B"));
messages[2].Should().Be(Tuple.Create(29, "C"));
messages[3].Should().Be(Tuple.Create(39, "D"));
messages[4].Should().Be(Tuple.Create(49, "E"));
messages[5].Should().Be(Tuple.Create(59, "F"));
}
}

Related

How can I emit periodic results over an iteration?

I might have something like this:
val found = source.toCharArray.foreach{ c =>
// Process char c
// Sometimes (e.g. on newline) I want to emit a result to be
// captured in 'found'. There may be 0 or more captured results.
}
This shows my intent. I want to iterate over some collection of things. Whenever the need arises I want to "emit" a result to be captured in found. It's not a direct 1-for-1 like map. collect() is a "pull", applying a partial function over the collection. I want a "push" behavior, where I visit everything but push out something when needed.
Is there a pattern or collection method I'm missing that does this?
Apparently, you have a Collection[Thing], and you want to obtain a new Collection[Event] by emitting a Collection[Event] for each Thing. That is, you want a function
(Collection[Thing], Thing => Collection[Event]) => Collection[Event]
That's exactly what flatMap does.
You can write it down with nested fors where the second generator defines what "events" have to be "emitted" for each input from the source. For example:
val input = "a2ba4b"
val result = (for {
c <- input
emitted <- {
if (c == 'a') List('A')
else if (c.isDigit) List.fill(c.toString.toInt)('|')
else Nil
}
} yield emitted).mkString
println(result)
prints
A||A||||
because each 'a' emits an 'A', each digit emits the right amount of tally marks, and all other symbols are ignored.
There are several other ways to express the same thing, for example, the above expression could also be rewritten with an explicit flatMap and with a pattern match instead of if-else:
println(input.flatMap{
case 'a' => "A"
case d if d.isDigit => "|" * (d.toString.toInt)
case _ => ""
})
I think you are looking for a way to build a Stream for your condition. Streams are lazy and are computed only when required.
val sourceString = "sdfdsdsfssd\ndfgdfgd\nsdfsfsggdfg\ndsgsfgdfgdfg\nsdfsffdg\nersdff\n"
val sourceStream = sourceString.toCharArray.toStream
def foundStreamCreator( source: Stream[Char], emmitBoundaryFunction: Char => Boolean): Stream[String] = {
def loop(sourceStream: Stream[Char], collector: List[Char]): Stream[String] =
sourceStream.isEmpty match {
case true => collector.mkString.reverse #:: Stream.empty[String]
case false => {
val char = sourceStream.head
emmitBoundaryFunction(char) match {
case true =>
collector.mkString.reverse #:: loop(sourceStream.tail, List.empty[Char])
case false =>
loop(sourceStream.tail, char :: collector)
}
}
}
loop(source, List.empty[Char])
}
val foundStream = foundStreamCreator(sourceStream, c => c == '\n')
val foundIterator = foundStream.toIterator
foundIterator.next()
// res0: String = sdfdsdsfssd
foundIterator.next()
// res1: String = dfgdfgd
foundIterator.next()
// res2: String = sdfsfsggdfg
It looks like foldLeft to me:
val found = ((List.empty[String], "") /: source.toCharArray) {case ((agg, tmp), char) =>
if (char == '\n') (tmp :: agg, "") // <- emit
else (agg, tmp + char)
}._1
Where you keep collecting items in a temporary location and then emit it when you run into a character signifying something. Since I used List you'll have to reverse at the end if you want it in order.
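A minimal sketch of the same fold with the final reverse included. The function name emitOnNewline is made up for this example, and flushing the text after the last newline is an assumption on my part (the version above silently drops it):

```scala
// Hypothetical helper: emits one result per newline-terminated chunk.
def emitOnNewline(source: String): List[String] = {
  val (emitted, pending) = source.foldLeft((List.empty[String], "")) {
    case ((agg, tmp), '\n') => (tmp :: agg, "")   // emit the collected chunk
    case ((agg, tmp), char) => (agg, tmp + char)  // keep collecting
  }
  // Assumption: a trailing chunk without a newline is also emitted.
  val all = if (pending.nonEmpty) pending :: emitted else emitted
  // Results were prepended, so reverse to restore source order.
  all.reverse
}
```

Prepending to a List and reversing once at the end keeps the fold linear in the number of emitted results.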

scala observable unify observable with a sequence without intermediate datastructure update

I have a code which calls couchbase for getting some rows as following:
val gotValues: Observable[JsonDocument] = Observable.from(rowKeys).flatMap(id =>
couchbaseBucket.async().get(id))
If I have 1,2,3,4,5,6 as input row-keys and only rows 1,2,3 exist in the DB then the observable would get notified only about 1,2,3.
My requirement, however, is to return a map with 1,2,3 mapped to true (exists in the DB) and 4,5,6 mapped to false (does not exist in the DB). I managed to do that with a Scala observable, but I'm using an intermediate map data structure to build the full map containing all ids. Below is sample code that simulates my problem.
object Main extends App {
import rx.lang.scala.Observable
import java.util.concurrent.ConcurrentHashMap
val idsToFetch = Seq(1,2,3,4,5,6)
println(isInDBOrNot()) // {1=true, 2=true, 3=true, 4=false, 5=false, 6=false}
private def isInDBOrNot(): ConcurrentHashMap[Int, Boolean] = {
val inAndNotInDB = new java.util.concurrent.ConcurrentHashMap[Int, Boolean]
// - How can I avoid the additional data structure?
// - In this case a map, so that the function will return
// a map with all numbers and for each if exist in DB?
// - I mean I want the function to return a map I don't
// want to populate that map inside the observer,
// it's like a mini side effect I would rather simply
// manipulate the stream.
Observable.from(idsToFetch)
.filterNot(x => x == 4 || x == 5 || x == 6) // Simulate fetch from DB, 4,5,6 do not exist in DB, so not returned.
.subscribe(
x => inAndNotInDB.put(x, true),
e => println(e),
() => idsToFetch.filterNot(inAndNotInDB.containsKey)
.foreach(inAndNotInDB.put(_, false)) // mark all non-found as false.
)
inAndNotInDB
}
}
Is there any way to do that without the intermediate map (that is, without populating an intermediate data structure, only by manipulating the stream)? It does not look clean! Thanks.
Your problem seems to arise from the fact that you use flatMap: if there is no data in the DB for a given id, you get an empty Observable, and flatMap simply produces no output for that id. So what you need is defaultIfEmpty, which RxScala exposes as orElse. You can use orElse to return a default value inside flatMap. To modify your example:
def fetchFromDb(id: Int): Observable[String] = {
if (id <= 3)
Observable.just(s"Document #$id")
else
Observable.empty
}
def gotValue(idsToFetch: Seq[Int]): Observable[(Int, Boolean)] = {
Observable.from(idsToFetch).flatMap((id: Int) => fetchFromDb(id).map(_ => (id, true)).orElse((id, false)))
}
println(gotValue(Seq(1, 2, 3, 4, 5, 6)).toBlocking.toList)
which prints
List((1,true), (2,true), (3,true), (4,false), (5,false), (6,false))
Or you can use Option to return Some(JsonDocument) or None such as
def gotValueEx(idsToFetch: Seq[Int]): Observable[(Int, Option[String])] = {
Observable.from(idsToFetch).flatMap((id: Int) => fetchFromDb(id).map(doc => (id, Option(doc))).orElse((id, None)))
}
println(gotValueEx(Seq(1, 2, 3, 4, 5, 6)).toBlocking.toList)
which prints
List((1,Some(Document #1)), (2,Some(Document #2)), (3,Some(Document #3)), (4,None), (5,None), (6,None))
One way of doing this is the following:
(1) convert sequence of ids to Observable and map it with
id => (id, false)
... so you'll get an observable of type Observable[(Int, Boolean)] (let's call this new observable first).
(2) fetch data from the database and map every fetched row to the form:
(some_id, true)
... inside an Observable[(Int, Boolean)] (let's call this observable last).
(3) concat first and last.
(4) call toMap on the result of (3). Entries coming from first are overwritten in the process by entries from last that have the same key (this will be your resultObservable).
(5) (possibly) collect the first and only element of the observable (your map). You might not want to do this at all, but if you do, you should really understand the implications of blocking to collect the result at this point. In any case, this step depends on your application specifics (how threading/scheduling/IO is organized), but a brute-force approach should look something like this (refer to this demo for more specifics):
Await.result(resultObservable.toBlocking.toFuture, 2 seconds)
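Steps (1) to (4) can be sketched with plain Scala collections standing in for the observables. This is only an analogue under that assumption: existenceMap and its parameters are hypothetical names, and found plays the role of the ids the database actually returned:

```scala
// Plain-collections analogue of steps (1)-(4); no Rx involved.
def existenceMap(ids: Seq[Int], found: Seq[Int]): Map[Int, Boolean] = {
  val first = ids.map(id => id -> false)   // (1) every requested id, marked absent
  val last  = found.map(id => id -> true)  // (2) every fetched id, marked present
  // (3) concatenate, (4) toMap: for duplicate keys the later entry wins,
  // so ids that were found end up as true.
  (first ++ last).toMap
}
```

The ordering of the concatenation matters: putting last before first would mark every id as false.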
How about this:
Observable.from(idsToFetch)
.filterNot(x => x == 4 || x == 5 || x == 6) // simulate the DB fetch: 4, 5, 6 are not found
.foldLeft(idsToFetch.map(_ -> false).toMap) { (m, id) => m + (id -> true) }

Handle Akka stream's first element specially

Is there an idiomatic way of handling Akka stream's Source first element in a special way? What I have now is:
var firstHandled = false
source.map { elem =>
if(!firstHandled) {
//handle specially
firstHandled = true
} else {
//handle normally
}
}
Thanks
While I would generally go with Ramon's answer, you could also use prefixAndTail, with a prefix of 1, together with flatMapConcat to achieve something similar:
val src = Source(List(1, 2, 3, 4, 5))
val fst = Flow[Int].map(i => s"First: $i")
val rst = Flow[Int].map(i => s"Rest: $i")
val together = src.prefixAndTail(1).flatMapConcat { case (head, tail) =>
// `head` is a Seq of the prefix elements, which in our case is
// just the first one. We can convert it to a source of just
// the first element, processed via our fst flow, and then
// concatenate `tail`, which is the remainder...
Source(head).via(fst).concat(tail.via(rst))
}
Await.result(together.runForeach(println), 10.seconds)
// First: 1
// Rest: 2
// Rest: 3
// Rest: 4
// Rest: 5
This of course works not just for the first item, but for the first N items, with the proviso that those items will be taken up as a strict collection.
Using zipWith
You could zip the original Source with a Source of Booleans that only returns true the first time. This zipped Source can then be processed.
First we'll need a Source that emits the Booleans:
//true, false, false, false, ...
def firstTrueIterator() : Iterator[Boolean] =
(Iterator single true) ++ (Iterator continually false)
def firstTrueSource : Source[Boolean, _] =
Source fromIterator firstTrueIterator
We can then define a function that handles the two different cases:
type Data = ???
type OutputData = ???
def processData(data : Data, firstRun : Boolean) : OutputData =
if(firstRun) { ... }
else { ... }
This function can then be used in a zipWith of your original Source:
val originalSource : Source[Data,_] = ???
val contingentSource : Source[OutputData,_] =
originalSource.zipWith(firstTrueSource)(processData)
Using Stateful Flow
You could create a Flow that contains state similar to the example in the question but with a more functional approach:
def firstRunner(firstCall : (Data) => OutputData,
otherCalls : (Data) => OutputData) : (Data) => OutputData = {
var firstRun = true
(data : Data) => {
if(firstRun) {
firstRun = false
firstCall(data)
}
else
otherCalls(data)
}
}//end def firstRunner
def firstRunFlow(firstCall : (Data) => OutputData,
otherCalls : (Data) => OutputData) : Flow[Data, OutputData, _] =
Flow[Data] map firstRunner(firstCall, otherCalls)
This Flow can then be applied to your original Source:
def firstElementFunc(data : Data) : OutputData = ???
def remainingElsFunc(data : Data) : OutputData = ???
val firstSource : Source[OutputData, _] =
originalSource via firstRunFlow(firstElementFunc, remainingElsFunc)
"Idiomatic Way"
Answering your question directly requires dictating the "idiomatic way". I answer that part last because it is the least verifiable by the compiler and is therefore closer to opinion. I would never claim to be a valid classifier of idiomatic code.
My personal experience with akka-streams has been that it is best to switch my perspective to imagining an actual stream (I think of a train with boxcars) of Data elements. Do I need to break it up into multiple fixed size trains? Do only certain boxcars make it through? Can I attach another train side-by-side that contains Boolean cars which can signal the front? I would prefer the zipWith method due to my regard of streams (trains). My initial approach is always to use other stream parts connected together.
Also, I find it best to embed as little code in an akka Stream component as possible. firstTrueIterator and processData have no dependency on akka at all. Conversely, the firstTrueSource and contingentSource definitions have virtually no logic. This allows you to test the logic independently of a clunky ActorSystem, and the guts can be used in Futures or Actors.
You can use prepend to prepend a source to a flow. Just prepend a single-item source to the flow; after it is drained, the rest of the original source will continue.
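As a rough illustration of testing that logic without an ActorSystem, the same zip can be exercised on plain iterators. In this sketch Data is assumed to be Int and OutputData to be String, and labelElements is a hypothetical helper; only firstTrueIterator and processData mirror the definitions above:

```scala
// true, false, false, false, ...
def firstTrueIterator(): Iterator[Boolean] =
  Iterator.single(true) ++ Iterator.continually(false)

// Assumed concrete types for the sketch: Data = Int, OutputData = String.
def processData(data: Int, firstRun: Boolean): String =
  if (firstRun) s"First: $data" else s"Rest: $data"

// Hypothetical stand-in for originalSource.zipWith(firstTrueSource)(processData).
def labelElements(elems: Seq[Int]): List[String] =
  elems.iterator
    .zip(firstTrueIterator()) // stops when the finite side is exhausted
    .map { case (data, firstRun) => processData(data, firstRun) }
    .toList
```

Calling labelElements(Seq(1, 2, 3)) gives List("First: 1", "Rest: 2", "Rest: 3").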
https://doc.akka.io/docs/akka/current/stream/operators/Source-or-Flow/prepend.html
Source(List(1, 2, 3))
.prepend(Source.single(0))
.runWith(Sink.foreach(println))
0
1
2
3
While I prefer the approach with zip, one can also use statefulMapConcat:
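source
  .statefulMapConcat { () =>
    // The state is created once per materialization of the stream.
    var firstRun = true
    elem => {
      if (firstRun) {
        firstRun = false
        // handle the first element specially here
        List(elem)
      } else {
        // handle every other element normally here
        List(elem)
      }
    }
  }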

How to produce one Observable from another

Let's say we have source Observable of Ints:
val source:Observable[Int]
I would like to create another Observable, producing values whose difference to first appeared value at source is greater than 10:
def detect() = Observable[Int](
subscriber =>
if (!subscriber.isUnsubscribed) {
var start:Option[Int] = None
source.subscribe(
item => {
if (start.isEmpty) {
start = Option(item)
}
else {
start.filter(v => Math.abs(item - v) > 10).foreach {
item => subscriber.onNext(item)
}
}
}
)
subscriber.onCompleted()
}
)
Here I've used var start to hold first value of source Observable.
Is there a way to simplify this code? I don't like this approach with assigning value to a var
Here is what I came up with:
import rx.lang.scala.Observable
val source = Observable.from(List(5, 2, 3, 16, -40, 2, -70, 50))
source.scan(Option.empty[(Int, Int)]) { (acc, next) =>
acc.map(_.copy(_2 = next)) orElse Some((next, next))
}.collect {
case Some((start, current)) if math.abs(start - current) > 10 => current
}.subscribe(x => println(x))
prints
16
-40
-70
50
Basically, scan keeps an accumulator that is either uninitialized (None) or holds a pair: the first value and the last element emitted from source. Then we collect only those elements that meet your predicate.
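The same accumulator can be tried out on a plain List with scanLeft, which makes it easy to check without an Rx dependency. The name farFromFirst and the threshold parameter are assumptions for this sketch:

```scala
// Plain-collections analogue of the scan/collect pipeline above.
def farFromFirst(source: List[Int], threshold: Int = 10): List[Int] =
  source
    .scanLeft(Option.empty[(Int, Int)]) { (acc, next) =>
      // Keep the first value in _1, track the current value in _2.
      acc.map(_.copy(_2 = next)).orElse(Some((next, next)))
    }
    .collect {
      case Some((start, current)) if math.abs(start - current) > threshold => current
    }
```

For List(5, 2, 3, 16, -40, 2, -70, 50) this returns List(16, -40, -70, 50), the same values the Observable version prints.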
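You just need to apply the filter operator, which produces a new Observable that mirrors the emissions of the source Observable but skips those for which the predicate tests false. Note that the predicate needs the first value of source, which has to be captured separately (it is called first below):
val filtered = source.filter(v => Math.abs(v - first) > 10)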

Split Rx Observable into multiple streams and process individually

Here is a picture of what I am attempting to accomplish.
--a-b-c-a--bbb--a
split into
--a-----a-------a --> a stream
----b------bbb--- --> b stream
------c---------- --> c stream
Then, be able to
a.subscribe()
b.subscribe()
c.subscribe()
So far, everything I have found has split the stream using a groupBy(), but then collapsed everything back into a single stream and process them all in the same function. What I want to do is process each derived stream in a different way.
The way I'm doing it right now is doing a bunch of filters. Is there a better way to do this?
Easy as pie, just use filter
An example in scala
import rx.lang.scala.Observable
val o: Observable[String] = Observable.just("a", "b", "c", "a", "b", "b", "b", "a")
val hotO: Observable[String] = o.share
val aSource: Observable[String] = hotO.filter(x ⇒ x == "a")
val bSource: Observable[String] = hotO.filter(x ⇒ x == "b")
val cSource: Observable[String] = hotO.filter(x ⇒ x == "c")
aSource.subscribe(o ⇒ println("A: " + o), println, () ⇒ println("A Completed"))
bSource.subscribe(o ⇒ println("B: " + o), println, () ⇒ println("B Completed"))
cSource.subscribe(o ⇒ println("C: " + o), println, () ⇒ println("C Completed"))
You just need to make sure that the source observable is hot. The easiest way is to share it.
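To see why a shared (hot) source matters, here is a toy push-based subject in plain Scala. MiniSubject and splitDemo are hypothetical names for this sketch and are not RxScala types; the point is only that one upstream emission is pushed to every filtered branch:

```scala
import scala.collection.mutable.ListBuffer

// Toy hot source: every value passed to onNext goes to all current listeners.
final class MiniSubject[A] {
  private val listeners = ListBuffer.empty[A => Unit]
  def subscribe(f: A => Unit): Unit = { listeners += f; () }
  def onNext(a: A): Unit = listeners.foreach(listener => listener(a))
  // A filtered branch is itself a subject fed by this one.
  def filter(p: A => Boolean): MiniSubject[A] = {
    val child = new MiniSubject[A]
    subscribe(a => if (p(a)) child.onNext(a))
    child
  }
}

def splitDemo(inputs: Seq[String]): (List[String], List[String], List[String]) = {
  val hot = new MiniSubject[String]
  val as = ListBuffer.empty[String]
  val bs = ListBuffer.empty[String]
  val cs = ListBuffer.empty[String]
  // Each branch gets its own handler, as in the answer above.
  hot.filter(_ == "a").subscribe(x => as += x)
  hot.filter(_ == "b").subscribe(x => bs += x)
  hot.filter(_ == "c").subscribe(x => cs += x)
  inputs.foreach(hot.onNext) // a single pass over the source feeds all branches
  (as.toList, bs.toList, cs.toList)
}
```

All three branches must be subscribed before the source starts emitting; anything pushed earlier is lost, which is the usual trade-off with hot sources.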
You don't have to collapse Observables from groupBy. You can instead subscribe to them.
Something like this:
String[] inputs= {"a", "b", "c", "a", "b", "b", "b", "a"};
Action1<String> a = s -> System.out.print("-a-");
Action1<String> b = s -> System.out.print("-b-");
Action1<String> c = s -> System.out.print("-c-");
Observable
.from(inputs)
.groupBy(s -> s)
.subscribe((g) -> {
if ("a".equals(g.getKey())) {
g.subscribe(a);
}
if ("b".equals(g.getKey())) {
g.subscribe(b);
}
if ("c".equals(g.getKey())) {
g.subscribe(c);
}
});
If statements look kinda ugly but at least you can handle each stream separately. Maybe there is a way of avoiding them.
I have been thinking about this, and Tomas's solution is OK, but the issue is that it converts the stream to a hot observable.
You can use share in combination with defer in order to get a cold observable with other streams.
For example (Java):
var originalObservable = ...; // some source
var coldObservable = Observable.defer(() -> {
var shared = originalObservable.share();
var aSource = shared.filter(x -> x.equals("a"));
var bSource = shared.filter(x -> x.equals("b"));
var cSource = shared.filter(x -> x.equals("c"));
// some logic for sources
return shared;
});
In RxJava there is a special version of publish operator that takes a function.
ObservableTransformer {
it.publish { shared ->
Observable.merge(
shared.ofType(x).compose(transformerherex),
shared.ofType(y).compose(transformerherey)
)
}
}
This splits the event stream by type. Then you can process them separately by composing with different transformers. All of them share single subscription.