An observable that emits a tuple of the latest values of N other observables?

Is there an operation like zip that does not wait for an entire new tuple to gather, but instead emits a tuple on each change?
For example, if it has just emitted 1A and B arrives on the second observable, it immediately emits 1B, which is the "latest" tuple.
At the beginning, the operation should wait until all N elements have arrived.

What you're looking for is sometimes called zipLatest; ReactiveX libraries usually provide it as combineLatest (combine_latest in RxPY).
Here's an example implementation in Python:
from typing import Any, Callable, Optional, Tuple

import rx
import rx.operators as ops

def zip_latest(*xss: rx.Observable) -> rx.Observable:
    helper = ZipLatestHelper(len(xss))
    return mux(*xss).pipe(
        ops.map(helper.process),
        # Drop the placeholder emitted while some stream has no value yet
        ops.filter(lambda x: x is not None),
    )

def mux(*xss: rx.Observable) -> rx.Observable:
    # Tag every item with the index of the stream it came from, then merge
    def pair_index(i: int) -> Callable[[Any], Tuple[int, Any]]:
        def inner(x: Any) -> Tuple[int, Any]:
            return i, x
        return inner
    paired = [xs.pipe(ops.map(pair_index(i))) for i, xs in enumerate(xss)]
    return rx.from_iterable(paired).pipe(ops.merge_all())

class ZipLatestHelper:
    def __init__(self, num_streams):
        self.latest = [None for _ in range(num_streams)]
        self.ready = set()

    def process(self, pair: Tuple[int, Any]) -> Optional[Tuple[Any, ...]]:
        i, x = pair
        self.latest[i] = x
        self.ready.add(i)
        # Emit a tuple only once every stream has produced at least one item
        return (
            tuple(self.latest) if len(self.ready) == len(self.latest) else None
        )
And usage:
from time import sleep

zipped = zip_latest(
    rx.interval(0.5).pipe(ops.map(lambda i: f"A{i}")),
    rx.interval(0.3).pipe(ops.map(lambda i: f"B{i}")),
)
zipped.subscribe(print)
sleep(10)
With output:
('A0', 'B0')
('A0', 'B1')
('A0', 'B2')
('A1', 'B2')
('A1', 'B3')
('A2', 'B3')
('A2', 'B4')
('A2', 'B5')
Caveats:
May not be thread-safe.
What should one do if a stream receives OnCompleted? Should the zipped stream continue emitting items, or should it issue an OnCompleted?
What should one do if a stream receives OnError?
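On the thread-safety caveat, a minimal sketch of one way to guard the helper's state, assuming items may arrive from several scheduler threads. The class name LockedZipLatestHelper is hypothetical and not part of RxPY. The lock only makes the update and the snapshot atomic; it does not by itself guarantee the order in which tuples are delivered downstream.

```python
import threading
from typing import Any, List, Optional, Set, Tuple


class LockedZipLatestHelper:
    """Hypothetical variant of ZipLatestHelper whose state is guarded by a lock."""

    def __init__(self, num_streams: int) -> None:
        self._lock = threading.Lock()
        self._latest: List[Any] = [None] * num_streams
        self._ready: Set[int] = set()

    def process(self, pair: Tuple[int, Any]) -> Optional[Tuple[Any, ...]]:
        i, x = pair
        with self._lock:
            # Update and snapshot under the same lock so a tuple never mixes
            # a half-applied update from another thread.
            self._latest[i] = x
            self._ready.add(i)
            if len(self._ready) == len(self._latest):
                return tuple(self._latest)
            return None


helper = LockedZipLatestHelper(2)
results = [helper.process(p) for p in [(0, "A0"), (1, "B0"), (1, "B1"), (0, "A1")]]
print(results)
```

The first call returns None because the second stream has not produced anything yet; every later call returns the current tuple.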

Related

Grouping a stream of elements into multiple streams

Let's say we have a case class MyCaseClass(name: String, value: Int). Given an fs2.Stream[F, MyCaseClass] I want to group elements with the same name
val sourceStream: fs2.Stream[F, MyCaseClass] = //
val groupedSameNameStream: fs2.Stream[F, fs2.Stream[F, MyCaseClass]] = //
The reason I need to do this is that I want to apply an effectful transformation
val transform: MyCaseClass => F[Unit] = //
to all elements of a stream and in case one group fails the other should keep working.
Is something like this possible to do?
This is possible, with caveats.
It's relatively straightforward to do this if you accept having a Map with an unbounded number of keys, and an unbounded number of associated Queues for each.
We've used code based on a gist by GitHub user kiambogo in production (though ours has been tweaked), and it works fine:
import fs2.{Pipe, Stream}
import fs2.concurrent.Queue
import cats.implicits._
import cats.effect.Concurrent
import cats.effect.concurrent.Ref

def groupBy[F[_], A, K](selector: A => F[K])(implicit F: Concurrent[F]): Pipe[F, A, (K, Stream[F, A])] = {
  in =>
    Stream.eval(Ref.of[F, Map[K, Queue[F, Option[A]]]](Map.empty)).flatMap { st =>
      // Terminate every per-key stream by enqueueing None into its queue
      val cleanup = {
        import alleycats.std.all._
        st.get.flatMap(_.traverse_(_.enqueue1(None)))
      }
      (in ++ Stream.eval_(cleanup))
        .evalMap { el =>
          (selector(el), st.get).mapN { (key, queues) =>
            queues.get(key).fold {
              for {
                newQ <- Queue.unbounded[F, Option[A]] // Create a new queue
                _ <- st.modify(x => (x + (key -> newQ), x)) // Update the ref of queues
                _ <- newQ.enqueue1(el.some)
              } yield (key -> newQ.dequeue.unNoneTerminate).some
            }(_.enqueue1(el.some) as None)
          }.flatten
        }.unNone.onFinalize(cleanup)
    }
}
If we assume an overhead of 64 bytes for each Map entry (I believe this is a considerable overestimate), then a cardinality of 100,000 unique keys gives us approximately 6.1 MiB, well within a reasonable size for a JVM process.
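The estimate above is plain arithmetic and can be checked in a few lines. This is only a sketch of the calculation; the 64-byte figure is the assumption stated in the text, not a measured value.

```python
BYTES_PER_ENTRY = 64    # assumed per-entry overhead, taken from the text
UNIQUE_KEYS = 100_000   # cardinality used in the text

total_bytes = BYTES_PER_ENTRY * UNIQUE_KEYS
total_mib = total_bytes / (1024 * 1024)
print(f"{total_bytes} bytes = {total_mib:.1f} MiB")  # 6400000 bytes = 6.1 MiB
```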

Understanding this Mutable Recursive Function using the uJson library

I am trying to implement an insert function using the ujson library:
Here is my attempt:
import ujson.{Obj, Value}
import upickle.default._
object Example extends App {
  def r = transform(
    List(Map(
      "a" -> Map("b" -> Obj("c" -> List(1,2,3), "d" -> List(2,4,6))),
      "e" -> Map("f" -> Obj("g" -> List(1,2,3)))))
  ).to(Value)
  def insert(j: ujson.Value, k: String, v: ujson.Value): Unit = j match {
    case a: ujson.Arr => a.arr.foreach(e => insert(e, k, v))
    case o: ujson.Obj =>
      if (o.obj.keySet contains k) o.obj(k) = v
      else o.obj.values.foreach(e => insert(e, k, v))
    case _ => Nil
  }
  println(r)
  insert(r, "b", transform(None).to(Value))
  println(r)
}
However, this gives me output that is unchanged:
[{"a":{"b":{"c":[1,2,3],"d":[2,4,6]}},"e":{"f":{"g":[1,2,3]}}}]
[{"a":{"b":{"c":[1,2,3],"d":[2,4,6]}},"e":{"f":{"g":[1,2,3]}}}]
Given that the Value type is mutable, why does this not mutate and update the key, k, with value v for json value object r?
You are creating the Value anew every time you call r, so any change you make to it is discarded.
You create one copy when you call println(r).
Then you create a separate copy with insert(r, "b", transform(None).to(Value)), mutate it, and discard it.
Then you create a third copy with another println(r).
If you want to refer to the same object, use val instead of def.
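The same effect can be reproduced outside Scala. The sketch below is a Python analogy, not ujson code: make_doc is a hypothetical stand-in for the def r in the question, and binding its result to a name once plays the role of val.

```python
def make_doc():
    # Like a Scala `def`: builds a brand-new structure on every call.
    return [{"a": {"b": 1}}]


# Mutating the result of a fresh call changes a temporary that is then dropped.
make_doc()[0]["a"]["b"] = None
fresh = make_doc()
print(fresh)  # [{'a': {'b': 1}}]

# Like a Scala `val`: evaluate once, then keep working on the same object.
doc = make_doc()
doc[0]["a"]["b"] = None
print(doc)    # [{'a': {'b': None}}]
```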

Splitting a Monix Observable

I would like to write a split function for monix.reactive.Observable. It should split a source Observable[A] into a new pair (Observable[A], Observable[A]), based on the value of a predicate, evaluated against each element in the source. I would like the split to work independently of whether the source Observable is hot or cold. In the case where the source is cold, the new pair of Observables should also be cold and where the source is hot the new pair of Observables will be hot. I would like to know if such an implementation is possible and, if so, how (I have pasted a failing testcase below).
The signature, as a method on an implicit class, would look like, or similar to
/**
 * Split an observable by a predicate, placing values for which the predicate returns true
 * to the right (and values for which the predicate returns false to the left).
 * This is consistent with the convention adopted by Either.cond.
 */
def split(p: T => Boolean)(implicit scheduler: Scheduler, taskLike: TaskLike[Future]): (Observable[T], Observable[T]) = {
  splitEither[T, T](elem => Either.cond(p(elem), elem, elem))
}
Currently, I have a naive implementation that consumes the source elements and pushes them to PublishSubject. The new pair of Observables is thus hot. My tests for a cold Observable are failing.
import monix.eval.TaskLike
import monix.execution.{Ack, Scheduler}
import monix.reactive.{Observable, Observer}
import monix.reactive.subjects.PublishSubject
import scala.concurrent.Future

object ObservableOps {
  implicit class ObservableExtensions[T](o: Observable[T]) {
    /**
     * Split an observable by a predicate, placing values for which the predicate returns true
     * to the right (and values for which the predicate returns false to the left).
     * This is consistent with the convention adopted by Either.cond.
     */
    def split(p: T => Boolean)(implicit scheduler: Scheduler, taskLike: TaskLike[Future]): (Observable[T], Observable[T]) = {
      splitEither[T, T](elem => Either.cond(p(elem), elem, elem))
    }

    /**
     * Split an observable into a pair of Observables, one left, one right, according
     * to a determinant function.
     */
    def splitEither[U, V](f: T => Either[U, V])(implicit scheduler: Scheduler, taskLike: TaskLike[Future]): (Observable[U], Observable[V]) = {
      val l = PublishSubject[U]()
      val r = PublishSubject[V]()
      o.subscribe(new Observer[T] {
        override def onNext(elem: T): Future[Ack] = {
          f(elem) match {
            case Left(u) => l.onNext(u)
            case Right(v) => r.onNext(v)
          }
        }
        override def onError(ex: Throwable): Unit = {
          l.onError(ex)
          r.onError(ex)
        }
        override def onComplete(): Unit = {
          l.onComplete()
          r.onComplete()
        }
      })
      (l, r)
    }
  }
}
//////////
import ObservableOps._
import monix.execution.Scheduler.Implicits.global
import monix.reactive.Observable
import monix.reactive.subjects.PublishSubject
import org.scalatest.FlatSpec
import org.scalatest.Matchers._
import org.scalatest.concurrent.ScalaFutures._

class ObservableOpsSpec extends FlatSpec {
  val isEven: Int => Boolean = _ % 2 == 0

  "Observable Ops" should "split a cold observable" in {
    val o = Observable(1, 2, 3, 4, 5)
    val (l, r) = o.split(isEven)
    l.toListL.runToFuture.futureValue shouldBe List(1, 3, 5)
    r.toListL.runToFuture.futureValue shouldBe List(2, 4)
  }

  "Observable Ops" should "split a hot observable" in {
    val o = PublishSubject[Int]()
    val (l, r) = o.split(isEven)
    val lbuf = l.toListL.runToFuture
    val rbuf = r.toListL.runToFuture
    Observable.fromIterable(1 to 5).mapEvalF(i => o.onNext(i)).subscribe()
    o.onComplete()
    lbuf.futureValue shouldBe List(1, 3, 5)
    rbuf.futureValue shouldBe List(2, 4)
  }
}
I expect both testcases above to pass but "Observable Ops" should "split a cold observable" is failing.
Edit: working code
An implementation that passes both test cases is as follows:
import monix.execution.Scheduler
import monix.reactive.Observable

object ObservableOps {
  implicit class ObservableExtension[T](o: Observable[T]) {
    /**
     * Split an observable by a predicate, placing values for which the predicate returns true
     * to the right (and values for which the predicate returns false to the left).
     * This is consistent with the convention adopted by Either.cond.
     */
    def split(
        p: T => Boolean
    )(implicit scheduler: Scheduler): (Observable[T], Observable[T]) = {
      splitEither[T, T](elem => Either.cond(p(elem), elem, elem))
    }

    /**
     * Split an observable into a pair of Observables, one left, one right, according
     * to a determinant function.
     */
    def splitEither[U, V](
        f: T => Either[U, V]
    )(implicit scheduler: Scheduler): (Observable[U], Observable[V]) = {
      val oo = o.map(f)
      val l = oo.collect {
        case Left(u) => u
      }
      val r = oo.collect {
        case Right(v) => v
      }
      (l, r)
    }
  }
}
class ObservableOpsSpec extends FlatSpec {
  val isEven: Int => Boolean = _ % 2 == 0

  "Observable Ops" should "split a cold observable" in {
    val o = Observable(1, 2, 3, 4, 5)
    val o2 = o.publish
    val (l, r) = o2.split(isEven)
    val x = l.toListL.runToFuture
    val y = r.toListL.runToFuture
    o2.connect()
    x.futureValue shouldBe List(1, 3, 5)
    y.futureValue shouldBe List(2, 4)
  }

  "Observable Ops" should "split a hot observable" in {
    val o = PublishSubject[Int]()
    val (l, r) = o.split(isEven)
    val lbuf = l.toListL.runToFuture
    val rbuf = r.toListL.runToFuture
    Observable.fromIterable(1 to 5).mapEvalF(i => o.onNext(i)).subscribe()
    o.onComplete()
    lbuf.futureValue shouldBe List(1, 3, 5)
    rbuf.futureValue shouldBe List(2, 4)
  }
}
A cold observable is, by definition, lazily evaluated for each subscriber. You can't split it without either evaluating everything twice or converting it into a hot one.
If you don't mind evaluating everything twice, just use .filter two times.
If you don't mind converting to hot, do it with .publish (or .publish.refCount so you don't need to connect manually).
If you want to preserve cold/hot property and process two pieces in parallel, there's a publishSelector method that lets you treat any observable like a hot one in a limited scope:
coldOrHot.publishSelector { totallyHot =>
  val s1 = totallyHot.filter(...).flatMap(...) // any processing
  val s2 = totallyHot.filter(...).mapEval(...) // any processing 2
  Observable(s1, s2).merge
}
Its limitation, apart from scope, is that the result of the inner lambda has to be another Observable (which will be returned from publishSelector), so you can't have a helper with the signature you want. But the result will still be cold if the original was cold.

Chaining a number of transitions with the state Monad

I am starting to use the state monad to clean up my code. I have got it working for my problem where I process a transaction called CDR and modify the state accordingly.
It is working perfectly fine for individual transactions, using this function to perform the state update.
def addTraffic(cdr: CDR): Network => Network = ...
Here is an example:
scala> val processed: (CDR) => State[Network, Long] = cdr =>
     |   for {
     |     m <- init
     |     _ <- modify(Network.addTraffic(cdr))
     |     p <- get
     |   } yield p.count
processed: CDR => scalaz.State[Network,Long] = $$Lambda$4372/1833836780@1258d5c0
scala> val r = processed(("122","celda 1", 3))
r: scalaz.State[Network,Long] = scalaz.IndexedStateT$$anon$13@4cc4bdde
scala> r.run(Network.empty)
res56: scalaz.Id.Id[(Network, Long)] = (Network(Map(122 -> (1,0.0)),Map(celda 1 -> (1,0.0)),Map(1 -> Map(1 -> 3)),1,true),1)
What I want to do now is chain a number of transactions from an iterator. I have found something that works quite well, but its state transitions take no inputs (the state changes through an RNG):
import scalaz._
import scalaz.std.list.listInstance
type RNG = scala.util.Random
val f = (rng:RNG) => (rng, rng.nextInt)
val intGenerator: State[RNG, Int] = State(f)
val rng42 = new scala.util.Random
val applicative = Applicative[({type l[Int] = State[RNG,Int]})#l]
// To generate the first 5 Random integers
val chain: State[RNG, List[Int]] = applicative.sequence(List.fill(5)(intGenerator))
val chainResult: (RNG, List[Int]) = chain.run(rng42)
chainResult._2.foreach(println)
I have tried unsuccessfully to adapt this, but I cannot get the type signatures to match because my state function requires the cdr (transaction) as input.
Thanks
TL;DR
You can use traverse from the Traverse type class on a collection (e.g. List) of CDRs, using a function with this signature: CDR => State[Network, Long]. The result will be a State[Network, List[Long]]. Alternatively, if you don't care about the List[Long], you can use traverse_ instead, which will return State[Network, Unit]. Finally, should you want to "aggregate" the results T as they come along, and T forms a Monoid, you can use foldMap from Foldable, which will return State[Network, T], where T is the combined (i.e. folded) result of all the Ts in your chain.
A code example
Now some more details, with code examples. I will answer this using Cats State rather than Scalaz, as I never used the latter, but the concept is the same and, if you still have problems, I will dig out the correct syntax.
Assume that we have the following data types and imports to work with:
import cats.implicits._
import cats.data.State
case class Position(x : Int = 0, y : Int = 0)
sealed trait Move extends Product
case object Up extends Move
case object Down extends Move
case object Left extends Move
case object Right extends Move
As should be clear, a Position represents a point in a 2D plane and a Move can move that point up, down, left or right.
Now, let's create a method that will allow us to see where we are at a given time:
def whereAmI : State[Position, String] = State.inspect{ s => s.toString }
and a method to change our position, given a Move:
def move(m : Move) : State[Position, String] = State{ s =>
  m match {
    case Up => (s.copy(y = s.y + 1), "Up!")
    case Down => (s.copy(y = s.y - 1), "Down!")
    case Left => (s.copy(x = s.x - 1), "Left!")
    case Right => (s.copy(x = s.x + 1), "Right!")
  }
}
Notice that this will return a String, with the name of the move followed by an exclamation mark. This is just to simulate the type change from Move to something else, and show how the results will be aggregated. More on this in a bit.
Now let's try to play with our methods:
val positions : State[Position, List[String]] = for{
  pos1 <- whereAmI
  _ <- move(Up)
  _ <- move(Right)
  _ <- move(Up)
  pos2 <- whereAmI
  _ <- move(Left)
  _ <- move(Left)
  pos3 <- whereAmI
} yield List(pos1,pos2,pos3)
And we can feed it an initial Position and see the result:
positions.runA(Position()).value // List(Position(0,0), Position(1,2), Position(-1,2))
(you can ignore the .value there, it's a quirk due to the fact that State[S,A] is really just an alias for StateT[Eval,S,A])
As you can see, this behaves as you would expect, and you can create different "blueprints" (e.g. sequences of state modifications), which will be applied once an initial state is provided.
Now, to actually answer your question: say we have a List[Move] and we want to apply the moves sequentially to an initial state and get the result. We use traverse from the Traverse type class.
val moves = List(Down, Down, Left, Up)
val result : State[Position, List[String]] = moves.traverse(move)
result.run(Position()).value // (Position(-1,-1),List(Down!, Down!, Left!, Up!))
Alternatively, should you not need the A at all (the List in your case), you can use traverse_ instead of traverse, and the result type will be:
val result_ : State[Position, Unit] = moves.traverse_(move)
result_.run(Position()).value // (Position(-1,-1),())
Finally, if your A type in State[S,A] forms a Monoid, then you could also use foldMap from Foldable to combine (e.g. fold) all As as they are calculated. A trivial example (probably useless, because this will just concatenate all Strings) would be this:
val result : State[Position,String] = moves.foldMap(move)
result.run(Position()).value // (Position(-1,-1),Down!Down!Left!Up!)
Whether this final approach is useful or not to you, really depends on what A you have and if it makes sense to combine it.
And this should be all you need in your scenario.
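To make the mechanics of traverse concrete without any library, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the Cats or Scalaz implementation: a "state action" is modelled as a function from a state to a (new state, value) pair, and the names move and traverse are hypothetical helpers that mirror the Scala ones above.

```python
from typing import Callable, List, Sequence, Tuple

Position = Tuple[int, int]
# A state action: takes the current state, returns (new state, result value).
Action = Callable[[Position], Tuple[Position, str]]

DELTAS = {"Up": (0, 1), "Down": (0, -1), "Left": (-1, 0), "Right": (1, 0)}


def move(m: str) -> Action:
    dx, dy = DELTAS[m]

    def run(pos: Position) -> Tuple[Position, str]:
        return (pos[0] + dx, pos[1] + dy), f"{m}!"

    return run


def traverse(items: Sequence[str], f: Callable[[str], Action]):
    # Build one action that runs f(item) for each item in order, threading
    # the state from one step to the next and collecting the results.
    def run(state: Position) -> Tuple[Position, List[str]]:
        results = []
        for item in items:
            state, value = f(item)(state)
            results.append(value)
        return state, results

    return run


final, log = traverse(["Down", "Down", "Left", "Up"], move)((0, 0))
print(final, log)  # (-1, -1) ['Down!', 'Down!', 'Left!', 'Up!']
```

Nothing runs until the initial state (0, 0) is supplied, which is the same "blueprint" behaviour described for State above.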

Scala populate map with random values

I need to create a test for various collections based on Map and HashMap.
I have two functions that create test data, e.g.:
def f1: String = { ... }
def f2: String = { ... }
these functions create random data every time they are called.
My map is:
val m:Map[String,String] = ...
What I am trying to accomplish is to construct an immutable map with 10000 random items generated by calling f1/f2. In pseudocode:
for 1 to 10000
  add-key-value-to-map (key = f1(), value = f2() )
end for
How can I accomplish this in Scala without destroying and rebuilding the map 10000 times?
EDIT:
Since it wasn't clear in the original post above, I am trying to run this with various types of maps (Map, HashMap, TreeMap).
List.fill(10000)((f1, f2)).toMap
You can use List.fill to create a List of (String, String) pairs and then call .toMap on it:
scala> def f1 = util.Random.alphanumeric take 5 mkString
f1: String
scala> def f2 = util.Random.alphanumeric take 5 mkString
f2: String
scala> val m = List.fill(5)(f1 -> f2).toMap
m: scala.collection.immutable.Map[String,String] =
Map(T7hD8 -> BpAa1, uVpno -> 6sMjc, wdaRP -> XSC1V, ZGlC0 -> aTwBo, SjfOr -> hdzIN)
Alternatively you could use Map/HashMap/TreeMap's .apply function:
scala> val m = collection.immutable.TreeMap(List.fill(5)(f1 -> f2) : _*)
m: scala.collection.immutable.TreeMap[String,String] =
Map(3cieU -> iy0KV, 8oUb1 -> YY6NC, 95ol4 -> Sf9qp, GhXWX -> 8U8wt, ZD8Px -> STMOC)
val m = (1 to 10000).foldLeft(Map.empty[String,String]) { (m, _) => m + (f1 -> f2) }
Using tabulate as follows,
Seq.tabulate(10000)(_ => f1 -> f2).toMap
It is unclear whether the random key generator may produce duplicate keys, in which case 10000 iterations would not suffice to produce a map of that size.
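The effect of duplicate keys is easy to see with a deliberately tiny key space. The following is a sketch in Python rather than Scala; random_key is a hypothetical stand-in for f1/f2, and the one-letter keys are an assumption chosen to force collisions. The second half shows one way around the problem: keep inserting until the map reaches the target size.

```python
import random
import string

rng = random.Random(0)  # seeded only so the sketch is repeatable


def random_key(length: int = 1) -> str:
    # Hypothetical stand-in for f1/f2; one-letter keys collide quickly.
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


pairs = [(random_key(), random_key()) for _ in range(100)]
collapsed = dict(pairs)  # later pairs overwrite earlier ones with the same key
print(len(pairs), len(collapsed))

# Loop on the size of the map instead of on the number of draws.
target = 20
exact = {}
while len(exact) < target:
    exact[random_key()] = random_key()
print(len(exact))
```

With only 26 possible keys, the first map can never hold more than 26 entries no matter how many pairs are generated, whereas the loop stops exactly at the requested size (provided the key space is at least that large).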
An intuitive approach,
(1 to 10000).map(_ => f1 -> f2).toMap
Using a recursive function instead of generating a range to iterate over, (although numerous intermediate Maps are created),
def g(i: Int): Map[String,String] = {
  if (i<=0)
    Map()
  else
    Map(f1 -> f2) ++ g(i-1)
}