Writing an Observable to file - monix

I currently have the following code:
val writer: PrintWriter = ???
val linesObservable: Observable[String] = ???
val future: CancelableFuture[Unit] = linesObservable.foreach(writer.write)
writer.close()
My goal is to get rid of all side effects in the above snippet.
From a functional perspective, writer should act as resource / bracket for future.
Because I am using Monix version 3.0.0-RC2, there are new bracket methods available for Observable and Task (maybe even more classes), which could be what I'm looking for - but I don't quite see how yet.

Yes, you are correct. It's better to acquire the PrintWriter as a resource so that it is closed cleanly once it's done writing.
val writer: Resource[Task, PrintWriter] =
Resource.make[Task, PrintWriter](???)(pw => Task.delay(pw.close()))
val linesObservable: Observable[String] = ???
writer.use { pw =>
linesObservable.foreachL(pw.write)
}
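To make the ordering guarantee concrete without any library, here is a minimal plain-Scala sketch of what the bracket pattern promises: the writer is closed only after the write has finished, and also when the write throws. It is an analogy for the Resource semantics above, not how Monix or cats-effect implement them. BracketSketch, withWriter and writeLines are hypothetical names chosen for illustration.

```scala
import java.io.{File, PrintWriter}

object BracketSketch {
  // Acquire a PrintWriter, run `use`, and always close the writer afterwards,
  // whether `use` returns normally or throws.
  def withWriter[A](file: File)(use: PrintWriter => A): A = {
    val pw = new PrintWriter(file, "UTF-8")
    try use(pw)
    finally pw.close()
  }

  // Write every line, each followed by a line separator.
  // The file is closed only once all lines have been written.
  def writeLines(file: File, lines: Iterator[String]): Unit =
    withWriter(file)(pw => lines.foreach(line => pw.println(line)))
}
```

In the original snippet the close happens right after foreach returns its future, so the writer may be closed while lines are still being emitted. Tying the release step to the end of the use step is what removes that race.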

Related

How to add proper error handling to cats-effect's Resource

I am trying to get some basic file IO (write/read) in a purely functional way using cats-effect. After following this tutorial, here is what I ended up with for reading a file:
private def readFile(): IO[String] = for {
lines <- bufferedReader(new File(filePath)).use(readAllLines)
} yield lines.mkString
def bufferedReader(f: File): Resource[IO, BufferedReader] =
Resource.make {
IO(new BufferedReader(new FileReader(f)))
} { fileReader =>
IO(fileReader.close()).handleErrorWith(_ => IO.unit)
}
Now in the handleErrorWith function I could log any error occurring, but how can I add proper error handling to this (e.g. return a Resource[IO, Either[CouldNotReadFileError, BufferedReader]])?
Proper error handling can be added via the use of .attempt on the returned IO value:
import scala.collection.JavaConverters._
val resourceOrError: IO[Either[Throwable, String]] = bufferedReader(new File(""))
.use(resource => IO(resource.lines().iterator().asScala.mkString))
.attempt
If you want to lift that into your own ADT, you can use leftMap:
import cats.syntax.either._
final case class CouldNotReadError(e: Throwable)
val resourceOrError: IO[Either[CouldNotReadError, String]] =
bufferedReader(new File(""))
.use(resource => IO(resource.lines().iterator().asScala.mkString))
.attempt
.map(_.leftMap(CouldNotReadError))
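The same shape can be sketched with the standard library alone, which may help when reasoning about what the result type means. Here Try(...).toEither plays the role of .attempt and left.map plays the role of leftMap. This is only an eager, non-IO analogy, and ReadErrorSketch and readFile are hypothetical names.

```scala
import scala.io.Source
import scala.util.Try

object ReadErrorSketch {
  // Domain error wrapping whatever went wrong while reading.
  final case class CouldNotReadError(cause: Throwable)

  // Read the whole file; any non-fatal exception ends up on the Left side.
  def readFile(path: String): Either[CouldNotReadError, String] =
    Try {
      val src = Source.fromFile(path, "UTF-8")
      try src.mkString
      finally src.close() // release the handle even if reading fails
    }.toEither.left.map(e => CouldNotReadError(e))
}
```

Callers can then pattern match on Left(CouldNotReadError(cause)) and Right(contents) instead of catching exceptions.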
Additionally, you might be interested in the ZIO data type, which provides cats-effect instances and has a slightly different shape, IO[E, A], where E captures the error type.

FS2 stream to unread InputStream

I'd like to convert fs2.Stream to java.io.InputStream so I can pass that input stream to an http framework (Finch and Akka Http).
I found a fs2.io.toInputStream, but this doesn't work (it prints nothing):
import java.io.{ByteArrayInputStream, InputStream}
import cats.effect.IO
import scala.concurrent.ExecutionContext.Implicits.global
object IOTest {
def main(args: Array[String]): Unit = {
val is: InputStream = new ByteArrayInputStream("test".getBytes)
val stream: fs2.Stream[IO, Byte] = fs2.io.readInputStream(IO(is), 128)
val test: Seq[InputStream] = stream.through(fs2.io.toInputStream).compile.toList.unsafeRunSync()
println(scala.io.Source.fromInputStream(test.head).mkString)
}
}
As far as I understand, when I run .unsafeRunSync() it consumes the whole stream, so even though it returns a Seq[InputStream], the underlying input stream is already consumed.
Is there any way I can convert fs2.Stream[IO, Byte] to java.io.InputStream without it being consumed?
Thanks!
The problem is that compile is being invoked prematurely. I'm sure that under the hood fs2.io.toInputStream does the correct thing and brackets the created InputStream, which means that the InputStream must be accessed inside the Stream itself (e.g., in a map/flatMap call):
val wire: fs2.Stream[IO, Byte] = ???
val result: fs2.Stream[IO, String] = for {
is <- wire.through(fs2.io.toInputStream)
str = scala.io.Source.fromInputStream(is).mkString //<--- use the InputStream here
} yield str
println( result.compile.lastOrError.unsafeRunSync() ) //<--- compile at the _very_ end
Outputs:
test
It looks like Finch has fs2 support (https://github.com/finagle/finch/tree/master/fs2). Akka also has its own stream implementation, and there are fs2 - Akka Streams interop libraries such as https://github.com/krasserm/streamz/tree/master/streamz-converter
So I recommend taking a look at these implementations, because they take care of the resource life cycle. You probably don't need the whole library, but it can serve as a guideline.
And if you are starting in the "safe zone" with fs2, why move out of there? :)

How to abruptly stop an akka stream Runnable Graph?

I am not able to figure out how to stop an Akka Streams RunnableGraph immediately. How can I use a KillSwitch to achieve this? I started with Akka Streams only a few days ago. In my case I am reading lines from a file, doing some operations in a flow, and writing to a sink. What I want is to stop reading the file immediately whenever I choose, which I hope would stop the whole running graph. Any ideas on this would be greatly appreciated.
Thanks in advance.
Since Akka Streams 2.4.3, there is an elegant way to stop the stream from the outside via KillSwitch.
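Consider the following example, which stops the stream after 10 seconds.
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, DelayOverflowStrategy, KillSwitches}
import akka.stream.scaladsl.{Flow, Keep, Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._
import scala.util.Random
object ExampleStopStream extends App {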
implicit val system = ActorSystem("streams")
implicit val materializer = ActorMaterializer()
import system.dispatcher
val source = Source.
fromIterator(() => Iterator.continually(Random.nextInt(100))).
delay(500.millis, DelayOverflowStrategy.dropHead)
val square = Flow[Int].map(x => x * x)
val sink = Sink.foreach(println)
val (killSwitch, done) =
source.via(square).
viaMat(KillSwitches.single)(Keep.right).
toMat(sink)(Keep.both).run()
system.scheduler.scheduleOnce(10.seconds) {
println("Shutting down...")
killSwitch.shutdown()
}
done.foreach { _ =>
println("I'm done")
Await.result(system.terminate(), 1.seconds)
}
}
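As a rough mental model of what the kill switch does in that graph, the following plain-Scala sketch wraps a pull-based iterator with a stage that an outside caller can shut down. This is only an analogy and not Akka's implementation: a real KillSwitch also propagates completion and cancellation through the graph. KillSwitchSketch, SimpleKillSwitch and guard are hypothetical names.

```scala
import java.util.concurrent.atomic.AtomicBoolean

object KillSwitchSketch {
  final class SimpleKillSwitch {
    private val stopped = new AtomicBoolean(false)

    // Called from outside the pipeline, possibly from another thread.
    def shutdown(): Unit = stopped.set(true)

    // Pass elements through until shutdown() has been called.
    // The flag is checked lazily, each time the consumer asks for an element.
    def guard[A](upstream: Iterator[A]): Iterator[A] =
      upstream.takeWhile(_ => !stopped.get())
  }
}
```

The design point carried over from KillSwitches.single is that the switch sits inside the pipeline as a stage, while the handle used to trigger it is handed to code outside the pipeline.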
Another way is to have a service or shutdown hook call cancel on the Cancellable that the graph materializes:
val graph=
Source.tick(FiniteDuration(0,TimeUnit.SECONDS), FiniteDuration(1,TimeUnit.SECONDS), Random.nextInt).to(Sink.foreach(println))
val cancellable=graph.run()
cancellable.cancel()
The call to cancellable.cancel() can be registered with ActorSystem.registerOnTermination.

How do I dynamically add Source to existing Graph?

What is an alternative to dynamically changing a running graph? Here is my situation. I have a graph that ingests articles into a DB. Articles come from 3 plugins in different formats, so I have several flows:
val converterFlow1: Flow[ImpArticle, Article, NotUsed]
val converterFlow2: Flow[NewsArticle, Article, NotUsed]
val sinkDB: Sink[Article, Future[Done]]
// These are being created every time I poll plugins
val sourceContentProvider : Source[ImpArticle, NotUsed]
val sourceNews : Source[NewsArticle, NotUsed]
val sourceCit : Source[Article, NotUsed]
val merged = Source.combine(
sourceContentProvider.via(converterFlow1),
sourceNews.via(converterFlow2),
sourceCit)(Merge(_))
val res = merged
.buffer(10, OverflowStrategy.backpressure)
.toMat(sinkDB)(Keep.both)
.run()
The problem is that I get data from the content provider once every 24 hours and from news once every 2 hours, while the last source may produce data at any time because it comes from humans.
I realize that graphs are immutable, but how can I periodically attach new instances of Source to my graph so that I have a single point of throttling for the ingestion process?
UPDATE: You could say my data is a stream of Sources, three sources in my case. But I cannot change that because I get instances of Source from external classes (so-called plugins). These plugins work independently from my ingestion class. I can't combine them into one gigantic class to have a single Source.
Okay, in general the correct way would be to join a stream of sources into a single source, i.e. go from Source[Source[T, _], Whatever] to Source[T, Whatever]. This can be done with flatMapConcat or with flatMapMerge. Therefore, if you can get a Source[Source[Article, NotUsed], NotUsed], you can use one of flatMap* variants and obtain a final Source[Article, NotUsed]. Do it for each of your sources (no pun intended), and then your original approach should work.
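The flattening idea can be illustrated with plain iterators, where flatMap over an iterator of iterators is the sequential counterpart of flatMapConcat: each inner source is drained fully before the next one starts. This is an analogy only, with no backpressure or asynchrony, and FlattenSketch and concatAll are hypothetical names.

```scala
object FlattenSketch {
  final case class Article(text: String)

  // Go from a "source of sources" to a single source,
  // emitting all elements of one inner source before moving to the next.
  def concatAll[A](sources: Iterator[Iterator[A]]): Iterator[A] =
    sources.flatMap(inner => inner)
}
```

flatMapMerge differs in that it runs several inner sources at once and interleaves their elements, so ordering across sources is not preserved.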
I've implemented code based on the answer given by Vladimir Matveev and want to share it with others, since it looks like a common use case to me.
I knew about Source.queue which Viktor Klang mentioned but I wasn't aware of flatMapConcat. It's pure awesomeness.
implicit val system = ActorSystem("root")
implicit val executor = system.dispatcher
implicit val materializer = ActorMaterializer()
case class ImpArticle(text: String)
case class NewsArticle(text: String)
case class Article(text: String)
val converterFlow1: Flow[ImpArticle, Article, NotUsed] = Flow[ImpArticle].map(a => Article("a:" + a.text))
val converterFlow2: Flow[NewsArticle, Article, NotUsed] = Flow[NewsArticle].map(a => Article("a:" + a.text))
val sinkDB: Sink[Article, Future[Done]] = Sink.foreach { a =>
Thread.sleep(1000)
println(a)
}
// These are being created every time I poll plugins
val sourceContentProvider: Source[ImpArticle, NotUsed] = Source(List(ImpArticle("cp1"), ImpArticle("cp2")))
val sourceNews: Source[NewsArticle, NotUsed] = Source(List(NewsArticle("news1"), NewsArticle("news2")))
val sourceCit: Source[Article, NotUsed] = Source(List(Article("a1"), Article("a2")))
val (queue, completionFut) = Source
.queue[Source[Article, NotUsed]](10, OverflowStrategy.backpressure)
.flatMapConcat(identity)
.buffer(2, OverflowStrategy.backpressure)
.toMat(sinkDB)(Keep.both)
.run()
queue.offer(sourceContentProvider.via(converterFlow1))
queue.offer(sourceNews.via(converterFlow2))
queue.offer(sourceCit)
queue.complete()
completionFut.onComplete {
case Success(res) =>
println(res)
system.terminate()
case Failure(ex) =>
ex.printStackTrace()
system.terminate()
}
Await.result(system.whenTerminated, Duration.Inf)
I'd still check the success of the Future returned by queue.offer, but in my case these calls will be pretty infrequent.
If you cannot model it as a Source[Source[_,_],_], then I'd consider using a Source.queue[Source[T,_]](queueSize, overflowStrategy).
What you'll have to be careful about, though, is what happens if submission fails.

How to close enumerated file?

Say, in an action I have:
val linesEnu = {
val is = new java.io.FileInputStream(path)
val isr = new java.io.InputStreamReader(is, "UTF-8")
val br = new java.io.BufferedReader(isr)
import scala.collection.JavaConversions._
val rows: scala.collection.Iterator[String] = br.lines.iterator
Enumerator.enumerate(rows)
}
Ok.feed(linesEnu).as(HTML)
How to close readers/streams?
There is an onDoneEnumerating callback that functions like finally (it will always be called, whether or not the Enumerator fails). You can close the streams there.
val linesEnu = {
val is = new java.io.FileInputStream(path)
val isr = new java.io.InputStreamReader(is, "UTF-8")
val br = new java.io.BufferedReader(isr)
import scala.collection.JavaConversions._
val rows: scala.collection.Iterator[String] = br.lines.iterator
Enumerator.enumerate(rows).onDoneEnumerating {
is.close()
// ... Anything else you want to execute when the Enumerator finishes.
}
}
The IO tools provided by Enumerator give you this kind of resource management out of the box. For example, if you create an enumerator with fromStream, the stream is guaranteed to get closed after running (even if you only read a single line, etc.).
So for example you could write the following:
import play.api.libs.iteratee._
val splitByNl = Enumeratee.grouped(
Traversable.splitOnceAt[Array[Byte], Byte](_ != '\n'.toByte) &>>
Iteratee.consume()
) compose Enumeratee.map(new String(_, "UTF-8"))
def fileLines(path: String): Enumerator[String] =
Enumerator.fromStream(new java.io.FileInputStream(path)).through(splitByNl)
It's a shame that the library doesn't provide a linesFromStream out of the box, but I personally would still prefer to use fromStream with hand-rolled splitting, etc. over using an iterator and providing my own resource management.