How to create Source and push elements to it manually? - scala

I want to create custom StatefulStage which should works like groupBy method and emit Source[A, Unit] elements but I don't understand how to create instance of Source[A, Unit] and push incoming element to it. Here is stub:
class GroupBy[A, Mat]() extends StatefulStage[A, Source[A, Unit]] {
override def initial: StageState[A, Source[A, Unit]] = new StageState[A, Source[A, Unit]] {
override def onPush(elem: A, ctx: Context[Source[A, Unit]]): SyncDirective = {
val source: Source[A, Unit] = ... // Need to create source here
// and push `elem` to `source` here
emit(List(source).iterator, ctx)
}
}
}
You can use the following snippet for test GroupBy flow (it should print events from created stream):
case class Tick()
case class Event(timestamp: Long, sessionUid: String, traffic: Int)
implicit val system = ActorSystem()
import system.dispatcher
implicit val materializer = ActorMaterializer()
var rnd = Random
rnd.setSeed(1)
val eventsSource = Source
.tick(FiniteDuration(0, SECONDS), FiniteDuration(1, SECONDS), () => Tick)
.map {
case _ => Event(System.currentTimeMillis / 1000, s"session-${rnd.nextInt(5)}", rnd.nextInt(10) * 10)
}
val flow = Flow[Event]
.transform(() => new GroupByUntil)
.map {
(source) => source.runForeach(println)
}
eventsSource
.via(flow)
.runWith(Sink.ignore)
.onComplete(_ => system.shutdown())
Can anybody explain me how to do it?
UPDATE:
I wrote the following onPush method base on this answer but it didn't print events. As I understand I can push element to source only when it running as part of flow but I want to run flow outside of GroupBy in test snippet. If I run flow in GroupBy as in this example then it will process events and send them to Sink.ignore. I think this is a reason why my test snippet didn't print events.
override def onPush(elem: A, ctx: Context[Source[A, Unit]]): SyncDirective = {
val source: Source[A, ActorRef] = Source.actorRef[A](1000, OverflowStrategy.fail)
val flow = Flow[A].to(Sink.ignore).runWith(source)
flow ! elem
emit(List(source.asInstanceOf[Source[A, Unit]]).iterator, ctx)
}
So, how to fix it?

Related

fs2 - Sharing a Ref with 2 streams

I'm trying to share a Ref[F, A] between 2 concurrent streams. Below is a simplified example of the actual scenario.
class Container[F[_]](implicit F: Sync[F]) {
private val counter = Ref[F].of(0)
def incrementBy2 = counter.flatMap(c => c.update(i => i + 2))
def printCounter = counter.flatMap(c => c.get.flatMap(i => F.delay(println(i))))
}
In the main function,
object MyApp extends IOApp {
def run(args: List[String]): IO[ExitCode] = {
val s = for {
container <- Ref[IO].of(new Container[IO]())
} yield {
val incrementBy2 = Stream.repeatEval(
container.get
.flatTap(c => c.incrementBy2)
.flatMap(c => container.update(_ => c))
)
.metered(2.second)
.interruptScope
val printStream = Stream.repeatEval(
container.get
.flatMap(_.printCounter)
)
.metered(1.seconds)
incrementBy2.concurrently(printStream)
}
Stream.eval(s)
.flatten
.compile
.drain
.as(ExitCode.Success)
}
}
The updates made by the incrementBy2 are not visible in printStream.
How can I fix this?
I would appreciate any help to understand the mistake in this code.
Thanks
Your code is broken since the class definition, you are not even updating the same Ref
Remember that the point of IO is to be a description of a computation, so Ref[F].of(0) returns a program that when evaluated will create a new Ref, calling multiple flatMaps on it will result in multiple Refs being created.
Also, your is not doing tagless final in the right way (and some may argue that even the right way is not worth it: https://alexn.org/blog/2022/04/18/scala-oop-design-sample/)
This is how I would write your code:
trait Counter {
def incrementBy2: IO[Unit]
def printCounter: IO[Unit]
}
object Counter {
val inMemory: IO[Counter] =
IO.ref(0).map { ref =>
new Counter {
override final val incrementBy2: IO[Unit] =
ref.update(c => c + 2)
override final val printCounter: IO[Unit] =
ref.get.flatMap(IO.println)
}
}
}
object Program {
def run(counter: Counter): Stream[IO, Unit] =
Stream
.repeatEval(counter.printCounter)
.metered(1.second)
.concurrently(
Stream.repeatEval(counter.incrementBy2).metered(2.seconds)
).interruptAfter(10.seconds)
}
object Main extends IOApp.Simple {
override final val run: IO[Unit] =
Stream
.eval(Counter.inMemory)
.flatMap(Program.run)
.compile
.drain
}
PS: I would actually not have printCounter but getCounter
and do the printing in the Program
You can see the code running here.

IO and Future[Option] monad transformers

I'm trying to figure out how to write this piece of code in an elegant pure-functional style using scalaz7 IO and monad transformers but just can't get my head around it.
Just imagine I have this simple API:
def findUuid(request: Request): Option[String] = ???
def findProfile(uuid: String): Future[Option[Profile]] = redisClient.get[Profile](uuid)
Using this API I can easily write impure function with OptionT transformer like this:
val profileT = for {
uuid <- OptionT(Future.successful(findUuid(request)))
profile <- OptionT(findProfile(uuid))
} yield profile
val profile: Future[Option[Profile]] = profileT.run
As you have noticed - this function contains findProfile() with a side-effect. I want to isolate this effect inside of the IO monad and interpret outside of the pure function but don't know how to combine it all together.
def findProfileIO(uuid: String): IO[Future[Option[Profile]]] = IO(findProfile(uuid))
val profileT = for {
uuid <- OptionT(Future.successful(findUuid(request)))
profile <- OptionT(findProfileIO(uuid)) //??? how to put Option inside of the IO[Future[Option]]
} yield profile
val profile = profileT.run //how to run transformer and interpret IO with the unsafePerformIO()???
Any peaces of advice on how it might be done?
IO is meant more for synchronous effects. Task is more what you want!
See this question and answer: What's the difference between Task and IO in Scalaz?
You can convert your Future to Task and then have an API like this:
def findUuid(request: Request): Option[String] = ???
def findProfile(uuid: String): Task[Option[Profile]] = ???
This works because Task can represent both synchronous and asynchronous operations, so findUuid can also be wrapped in Task instead of IO.
Then you can wrap these in OptionT:
val profileT = for {
uuid <- OptionT(Task.now(findUuid(request)))
profile <- OptionT(findProfileIO(uuid))
} yield profile
Then at the end somewhere you can run it:
profileT.run.attemptRun
Check out this link for converting Futures to Tasks and vice versa: Scalaz Task <-> Future
End up with this piece of code, thought it might be useful for someone (Play 2.6).
Controller's method is a pure function since Task evaluation takes place outside of the controller inside of PureAction ActionBuilder. Thanks to Luka's answer!
Still struggling with new paradigm of Action composition in Play 2.6 though, but this is another story.
FrontendController.scala:
def index = PureAction.pure { request =>
val profileOpt = (for {
uuid <- OptionT(Task.now(request.cookies.get("uuid").map(t => uuidKey(t.value))))
profile <- OptionT(redis.get[Profile](uuid).asTask)
} yield profile).run
profileOpt.map { profileOpt =>
Logger.info(profileOpt.map(p => s"User logged in - $p").getOrElse("New user, suggesting login"))
Ok(views.html.index(profileOpt))
}
}
Actions.scala
Convenient action with Task resolution at the end
class PureAction #Inject()(parser: BodyParsers.Default)(implicit ec: ExecutionContext) extends ActionBuilderImpl(parser) {
self =>
def pure(block: Request[AnyContent] => Task[Result]): Action[AnyContent] = composeAction(new Action[AnyContent] {
override def parser: BodyParser[AnyContent] = self.parser
override def executionContext: ExecutionContext = self.ec
override def apply(request: Request[AnyContent]): Future[Result] = {
val taskResult = block(request)
taskResult.asFuture //End of the world lives here
}
})
}
Converters.scala
Task->Future and Future->Task implicit converters
implicit class FuturePimped[+T](root: => Future[T]) {
import scalaz.Scalaz._
def asTask(implicit ec: ExecutionContext): Task[T] = {
Task.async { register =>
root.onComplete {
case Success(v) => register(v.right)
case Failure(ex) => register(ex.left)
}
}
}
}
implicit class TaskPimped[T](root: => Task[T]) {
import scalaz._
val p: Promise[T] = Promise()
def asFuture: Future[T] = {
root.unsafePerformAsync {
case -\/(ex) => p.failure(ex); ()
case \/-(r) => p.success(r); ()
}
p.future
}
}

ReactiveMongo with Akka Streams Custom Source

I am using reactivemongo-akka-stream and trying to transform the AkkaStreamCursor.documentSource. I am and am seeing two problems:
the documentation states that this operation returns a Source[T, NotUsed] but collection.find(query).cursor[BSONDocument].doccumentSource() returns a Source[BSONDocument, Future[State]]. Is there a way to avoid the State object?
Assuming I use Future[State] I am looking to get source of class as below
case class Inner(foo: String, baz: Int)
case class Outer(bar: Inner)
//
implicit object InnerReader extends BSONDocumentReader[Inner]//defined
val getCollection: Future[BSONCollection] = connection.database("db").map(_.collection("things")
def stream()(implicit m: Materializer): Source[Outer, Future[State]] = {
getCollection.map(_.find().cursor[Inner]().documentSource()).map(_.via(Flow[Inner].map(in => Outer(in))))
But instead of getting back a Future[Source[Outer, Future[State]] that i could deal with, this returns a Future[Source[Inner, Future[State]]#Repr[Outer]]
How can bson readers be used with this library?
As per the suggestion by cchantep, I needed to use fromFuture with flatMapConcat:
def stream()(implicit m: Materializer): Source[Outer, NotUsed] = {
val foo = getCollection.map(x => col2Source(x))
fix(foo).via(flowmap(map = Outer(_)))
}
def col2Source(col: BSONCollection): Source[Inner, Future[State]] = {
val cursor: AkkaStreamCursor[Inner] =
col.find(BSONDocument.empty).cursor[Inner]()
cursor.documentSource()
}
def flowmap[In, Out](
map: (In) => Out
): Flow[In, Out, NotUsed] = Flow[In].map(e => map(e))
def fix[Out, Mat](futureSource: Future[Source[Out, Mat]]): Source[Out, NotUsed] = {
Source.fromFuture(futureSource).flatMapConcat(identity)
}

In akka-stream how to create a unordered Source from a futures collection

I need to create an akka.stream.scaladsl.Source[T, Unit] from a collection of Future[T].
E.g., having a collection of futures returning integers,
val f1: Future[Int] = ???
val f2: Future[Int] = ???
val fN: Future[Int] = ???
val futures = List(f1, f2, fN)
how to create a
val source: Source[Int, Unit] = ???
from it.
I cannot use Future.sequence combinator, since then I would wait for each future to complete before getting anything from the source. I want to get results in any order as soon as any future completes.
I understand that Source is a purely functional API and it should not run anything before somehow materializing it. So, my idea is to use an Iterator (which is lazy) to create a source:
Source { () =>
new Iterator[Future[Int]] {
override def hasNext: Boolean = ???
override def next(): Future[Int] = ???
}
}
But that would be a source of futures, not of actual values. I could also block on next using Await.result(future) but I'm not sure which tread pool's thread will be blocked. Also this will call futures sequentially, while I need parallel execution.
UPDATE 2: it turned out there was a much easier way to do it (thanks to Viktor Klang):
Source(futures).mapAsync(1)(identity)
UPDATE: here is what I've got based on #sschaef answer:
def futuresToSource[T](futures: Iterable[Future[T]])(implicit ec: ExecutionContext): Source[T, Unit] = {
def run(actor: ActorRef): Unit = {
futures.foreach { future =>
future.onComplete {
case Success(value) =>
actor ! value
case Failure(NonFatal(t)) =>
actor ! Status.Failure(t) // to signal error
}
}
Future.sequence(futures).onSuccess { case _ =>
actor ! Status.Success(()) // to signal stream's end
}
}
Source.actorRef[T](futures.size, OverflowStrategy.fail).mapMaterializedValue(run)
}
// ScalaTest tests follow
import scala.concurrent.ExecutionContext.Implicits.global
implicit val system = ActorSystem()
implicit val materializer = ActorMaterializer()
"futuresToSource" should "convert futures collection to akka-stream source" in {
val f1 = Future(1)
val f2 = Future(2)
val f3 = Future(3)
whenReady {
futuresToSource(List(f1, f2, f3)).runFold(Seq.empty[Int])(_ :+ _)
} { results =>
results should contain theSameElementsAs Seq(1, 2, 3)
}
}
it should "fail on future failure" in {
val f1 = Future(1)
val f2 = Future(2)
val f3 = Future.failed(new RuntimeException("future failed"))
whenReady {
futuresToSource(List(f1, f2, f3)).runWith(Sink.ignore).failed
} { t =>
t shouldBe a [RuntimeException]
t should have message "future failed"
}
}
Creating a source of Futures and then "flatten" it via mapAsync:
scala> Source(List(f1,f2,fN)).mapAsync(1)(identity)
res0: akka.stream.scaladsl.Source[Int,Unit] = akka.stream.scaladsl.Source#3e10d804
One of the easiest ways to feed a Source is through an Actor:
import scala.concurrent.Future
import akka.actor._
import akka.stream._
import akka.stream.scaladsl._
implicit val system = ActorSystem("MySystem")
def run(actor: ActorRef): Unit = {
import system.dispatcher
Future { Thread.sleep(100); actor ! 1 }
Future { Thread.sleep(200); actor ! 2 }
Future { Thread.sleep(300); actor ! 3 }
}
val source = Source
.actorRef[Int](0, OverflowStrategy.fail)
.mapMaterializedValue(ref ⇒ run(ref))
implicit val m = ActorMaterializer()
source runForeach { int ⇒
println(s"received: $int")
}
The Actor is created through the Source.actorRef method and made available through the mapMaterializedValue method. run simply takes the Actor and sends all the completed values to it, which can then be accessed through source. In the example above, the values are sent directly in the Future, but this can of course be done everywhere (for example in the onComplete call on the Future).

How to add elements to Source dynamically?

I have example code to generate an unbound source and working with it:
object Main {
def main(args : Array[String]): Unit = {
implicit val system = ActorSystem("Sys")
import system.dispatcher
implicit val materializer = ActorFlowMaterializer()
val source: Source[String] = Source(() => {
Iterator.continually({ "message:" + ThreadLocalRandom.current().nextInt(10000)})
})
source.runForeach((item:String) => { println(item) })
.onComplete{ _ => system.shutdown() }
}
}
I want to create class which implements:
trait MySources {
def addToSource(item: String)
def getSource() : Source[String]
}
And I need use it with multiple threads, for example:
class MyThread(mySources: MySources) extends Thread {
override def run(): Unit = {
for(i <- 1 to 1000000) { // here will be infinite loop
mySources.addToSource(i.toString)
}
}
}
And expected full code:
object Main {
def main(args : Array[String]): Unit = {
implicit val system = ActorSystem("Sys")
import system.dispatcher
implicit val materializer = ActorFlowMaterializer()
val sources = new MySourcesImplementation()
for(i <- 1 to 100) {
(new MyThread(sources)).start()
}
val source = sources.getSource()
source.runForeach((item:String) => { println(item) })
.onComplete{ _ => system.shutdown() }
}
}
How to implement MySources?
One way to have a non-finite source is to use a special kind of actor as the source, one that mixes in the ActorPublisher trait. If you create one of those kinds of actors, and then wrap with a call to ActorPublisher.apply, you end up with a Reactive Streams Publisher instance and with that, you can use an apply from Source to generate a Source from it. After that, you just need to make sure your ActorPublisher class properly handles the Reactive Streams protocol for sending elements downstream and you are good to go. A very trivial example is as follows:
import akka.actor._
import akka.stream.actor._
import akka.stream.ActorFlowMaterializer
import akka.stream.scaladsl._
object DynamicSourceExample extends App{
implicit val system = ActorSystem("test")
implicit val materializer = ActorFlowMaterializer()
val actorRef = system.actorOf(Props[ActorBasedSource])
val pub = ActorPublisher[Int](actorRef)
Source(pub).
map(_ * 2).
runWith(Sink.foreach(println))
for(i <- 1 until 20){
actorRef ! i.toString
Thread.sleep(1000)
}
}
class ActorBasedSource extends Actor with ActorPublisher[Int]{
import ActorPublisherMessage._
var items:List[Int] = List.empty
def receive = {
case s:String =>
if (totalDemand == 0)
items = items :+ s.toInt
else
onNext(s.toInt)
case Request(demand) =>
if (demand > items.size){
items foreach (onNext)
items = List.empty
}
else{
val (send, keep) = items.splitAt(demand.toInt)
items = keep
send foreach (onNext)
}
case other =>
println(s"got other $other")
}
}
With Akka Streams 2 you can use a sourceQueue : How to create a Source that can receive elements later via a method call?
As I mention in this answer, the SourceQueue is the way to go, and since Akka 2.5 there is a handy method preMaterialize which eliminates the need to create a composite source first.
I give an example in my other answer.