How to create a stream with Scalaz-Stream?

It must be damn simple. But for some reason I cannot make it work.
If I do io.linesR(...), I have a stream of lines of the file, it's ok.
If I do Process.emitAll(...), I have a stream of pre-defined values. It also works.
But what I actually need is to produce values for scalaz-stream asynchronously (well, from Akka actor).
I have tried:
async.unboundedQueue[String]
async.signal[String]
Then I called queue.enqueueOne(...).run or signal.set(...).run and listened to queue.dequeue or signal.discrete, just with .map and .to, using an example that was proven to work with another kind of stream, either with Process.emitAll or with lines from the file.
What is the secret? What is the preferred way to create a channel to be streamed later? How to feed it with values from another context?
Thanks!

If the values are produced asynchronously but in a way that can be driven from the stream, I've found it easiest to use the "primitive" await method and construct the process "by hand". You need an indirectly recursive function:
def processStep(v: Int): Process[Future, Int] =
  // mapTo narrows the Future[Any] that ask (?) returns down to Int
  Process.emit(v) ++ Process.await((myActor ? NextValuePlease()).mapTo[Int])(w => processStep(w))
But if you need a truly async process, driven from elsewhere, I've never done that.
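For the truly async case, the queue approach described in the question does fit: the producer side (e.g. an Akka actor) runs the enqueue task, and the consumer side is the dequeue process. A rough sketch, assuming scalaz-stream's async.unboundedQueue API (method names such as enqueueOne, dequeue, close and Task's run vary a little between versions):

import scalaz.concurrent.Task
import scalaz.stream.{async, io}

val queue = async.unboundedQueue[String]

// producer side, e.g. called from inside an actor's receive:
def publish(line: String): Unit = queue.enqueueOne(line).run

// consumer side: queue.dequeue is a Process[Task, String] you can map/to/run as usual;
// running this Task starts the consumption
val consumer: Task[Unit] =
  queue.dequeue
    .map(_.toUpperCase)
    .to(io.stdOutLines)
    .run

// when the producer is done, close the queue so the consumer terminates
def finish(): Unit = queue.close.run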

Related

How to convert `fs2.Stream[IO, T]` to `Iterator[T]` in Scala

Need to fill in the methods next and hasNext and preserve laziness
new Iterator[T] {
  val stream: fs2.Stream[IO, T] = ...
  def next(): T = ???
  def hasNext: Boolean = ???
}
But I cannot figure out how on earth to do this from an fs2.Stream. All the methods on a Stream (or on the "compiled" thing) are fairly useless.
If this is simply impossible to do in a reasonable amount of code, then that itself is a satisfactory answer and we will just rip out fs2.Stream from the codebase - just want to check first!
fs2.Stream, while similar in concept to Iterator, cannot be converted to one while preserving laziness. I'll try to elaborate on why...
Both represent a pull-based series of items, but the way in which they represent that series and implement the laziness differs too much.
As you already know, Iterator represents its pull in terms of the next() and hasNext methods, both of which are synchronous and blocking. To consume the iterator and return a value, you can directly call those methods e.g. in a loop, or use one of its many convenience methods.
fs2.Stream supports two capabilities that make it incompatible with that interface:
cats.effect.Resource can be included in the construction of a Stream. For example, you could construct a fs2.Stream[IO, Byte] representing the contents of a file. When consuming that stream, even if you abort early or do some strange flatMap, the underlying Resource is honored and your file handle is guaranteed to be closed. If you were trying to do the same thing with an Iterator, the "abort early" case would pose problems, forcing you to do something like Iterator[Byte] with Closeable, where the caller would have to make sure to .close() it, or some other pattern.
Evaluation of "effects". In this context, effects are types like IO or Future, where the process of obtaining the value may perform some possibly-asynchronous action, and may perform side-effects. Asynchrony poses a problem when trying to force the process into a synchronous interface, since it forces you to block your current thread to wait for the asynchronous answer, which can cause deadlocks if you aren't careful. Libraries like cats-effect strongly discourage you from calling methods like unsafeRunSync.
fs2.Stream does allow for some special cases that prevent the inclusion of Resource and Effects, via its Pure type alias which you can use in place of IO. That gets you access to Stream.PureOps, but that only gets you methods that consume the whole stream by building a collection; the laziness you want to preserve would be lost.
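A tiny sketch of the Pure case (assuming a recent fs2 version); note that the whole stream is materialized, so the laziness is gone:

import fs2.{Pure, Stream}

// a pure stream carries no effects and no resources, so it can be consumed eagerly
val pure: Stream[Pure, Int] = Stream(1, 2, 3)
val asList: List[Int] = pure.toList        // List(1, 2, 3)
val asIterator: Iterator[Int] = asList.iterator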
Side note: you can convert an Iterator to a Stream.
The only way to "convert" a Stream to an Iterator is to consume it to some collection type via e.g. .compile.toList, which would get you an IO[List[T]], then .map(_.iterator) that to get an IO[Iterator[T]]. But ultimately that doesn't fit what you're asking for since it forces you to consume the stream to a buffer, breaking laziness.
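A minimal sketch of both directions, assuming cats-effect IO and a recent fs2 (the exact fromIterator signature differs between fs2 versions):

import cats.effect.IO
import fs2.Stream

// Stream -> Iterator: only possible by buffering the whole stream first, so laziness is lost
def streamToIterator[T](s: Stream[IO, T]): IO[Iterator[T]] =
  s.compile.toList.map(_.iterator)

// Iterator -> Stream: this direction is fine and stays lazy
def iteratorToStream[T](it: Iterator[T]): Stream[IO, T] =
  Stream.fromIterator[IO](it, 64) // recent fs2 versions also take a chunk size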
@Dima mentioned the "XY Problem", which was poorly received since they didn't really elaborate (initially) on the incompatibility, but they're right. It would be helpful to know why you're trying to make a Stream-to-Iterator conversion, in case there's some other approach that would serve your overall goal instead.

What are some best practices to mix async libraries with sync code in Scala

I'm working on Scala code where a 3rd party library returns a Future[Boolean], while I need to consume this future in my Scala code, which is written in a fully synchronous manner.
Currently I'm doing Await.result on the 3rd party lib operation to make sure it returns just a boolean. Is there a better way to handle this? My Scala code needs a boolean value for further operations.
As Luis noted in the comments, in general there's no alternative to Awaiting on the Future.
That said, you may have some choice about where to Await.
For instance, if you have code like
val result = Await.result(someFuture, Duration.Inf)
f(result)
It may be more useful to run f in Future land with
Await.result(someFuture.map(f), Duration.Inf)
If f happens to block, then it may be worth either wrapping f in blocking or explicitly using an ExecutionContext which will handle a lot of its threads being blocked (e.g. one that can have more threads than cores) for the map.
In general, you'll want to move Awaits as far toward the outermost edge of your code as you can, even shifting that edge outward if you can.
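A small illustrative sketch that combines the two points (thirdPartyCheck and f are made-up names standing in for the library call and your downstream step):

import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

def thirdPartyCheck(): Future[Boolean] = Future(true) // stand-in for the 3rd party call

def f(ok: Boolean): Int =
  blocking { if (ok) 1 else 0 } // `blocking` hints that this step may park a thread

// map inside Future land, then Await once at the outer edge
val result: Int = Await.result(thirdPartyCheck().map(f), Duration.Inf)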

scala save slick result into new object

Is there a way to save the result of a Slick query into a new object?
This is my Slick result; there is only one "object" in the list:
val result: Future[Seq[ProcessTemplatesModel]] = db.run(action)
The result should be mapped to ProcessTemplatesModel, because I want to access the values like this:
process.title
Is this possible?
Thanks
TL;DR: you should keep the context as long as you can.
Future denotes the fact that the value will be given at some time in the future (this is what I call some context for the value).
The bad way to use it would be to block your thread, until such value is found, and then work with it.
A better way is to tell your program: "Once the value is found (whenever that is), do something with it". That's a continuation, or callback, and is implemented with map and flatMap in Scala.
Seq is another context for your value. It means that you actually have different possible values. If you want to make sure that you have at most one value, you can always do seq.headOption to switch context from Seq to Option.
The bad way to use it would be to take the first value without bothering checking if it exists or not.
A better way is to tell your program: "No matter how many values you have, do this for each of them".
Now, how do you work in context? You use the Functor and/or Monad operators: map, flatMap.
For instance, if you want to apply a function convertToSomethingElse to each element of your context, just do
result.map(list => list.map(process => convertToSomethingElse(process)))
And you'll get a Future[Seq[SomethingElse]].
Another example: if you want to save the result somewhere else, you'll probably have some IO, or database operations, which may take some time, and possibly fail. We will assume you have a function save(entity: ProcessTemplatesModel): Future[Boolean] that allows you to save one of your models. The fact that the function will take some time (and that it will be started in another thread) and possibly fail is visible in the return type Future[Boolean] (Boolean is not important here, it's the fact that we have again the Future context that matters).
Here, you will have to do (assuming you just want to save the first element in your list):
val savedFirstResult: Future[Option[Boolean]] = result.flatMap { list =>
  // traverse the head of the list (if any), switching between the Future and Option contexts
  Future.traverse(list.headOption.toList)(save).map(_.headOption)
}
So as you can see, we can do most of what we want by staying inside the contexts that are returned by Slick. You shouldn't want to get outside of them because
most of the time there's no need to: map lets you apply, inside the context, a function written for plain values outside the context
extracting methods are unsafe most of the time: Option#get throws an exception if the Option is empty, and Await.result(future, duration) may block all computations or throw exceptions
responses in Play! can be given as Futures in a controller, using Action.async
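As a rough sketch of that last point (the controller class and action names are hypothetical; result, db.run(action) and process.title come from the question):

import play.api.mvc._
import scala.concurrent.{ExecutionContext, Future}

class ProcessController(cc: ControllerComponents)(implicit ec: ExecutionContext)
    extends AbstractController(cc) {

  def show = Action.async {
    val result: Future[Seq[ProcessTemplatesModel]] = db.run(action) // as in the question
    result.map { list =>
      list.headOption match {
        case Some(process) => Ok(process.title)
        case None          => NotFound("no process template found")
      }
    }
  }
}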

Serving multiple result Futures as soon as available to a client

I have a page that is populated by data that I get using different calls to distant servers. Some requests take longer than others; the way I do things now is that I do all the calls at once, wrap the whole thing in a Future, then put the whole thing in an Action.async for Play to handle.
This, theoretically, does the job but I don't want my users to be waiting a long time and instead start loading the page part by part. Meaning that as soon as data is available for a given request to a distant server, it should be sent to the client as Json or whatever.
I was able to partially achieve this using EventSource by modifying Play's event-source sample by doing something like this:
Ok.chunked((enumerator1 &> EventSource()) >- (enumerator2 &> EventSource())).as("text/event-stream")
and the enumerators as follows:
val enumerator1: Enumerator[String] = Enumerator.generateM {
  Future[Option[String]] { Thread.sleep(1500); Some("Hello") }
}
val enumerator2: Enumerator[String] = Enumerator.generateM {
  Future[Option[String]] { Thread.sleep(2000); Some("World!") }
}
As you probably have guessed, I was expecting to have "Hello" after 1.5s and then "World!" 0.5s later sent to the client, but I ended up receiving "Hello" every 1.5s and "World!" every 2s.
My questions are:
Is there a way to stop sending an information once it has been correctly delivered to the client using the method above?
Is there a better way to achieve what I want?
You don't want generateM; it's for building enumerators that can return multiple values. generateM takes a function that either returns a Some, to produce the next value for the Enumerator, or None, to signal that the Enumerator is complete. Because your function always returns Some, you create Enumerators that are infinite in length.
You just want to convert a Future into an Enumerator, to create an Enumerator with a single element:
Enumerator.flatten(future.map(Enumerator(_)))
Also, you can interleave your enumerators and then feed the result into EventSource(). The parentheses are unnecessary as well (methods whose names start with > bind more tightly than methods starting with &).
enumerator1 >- enumerator2 &> EventSource()
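Putting both points together, a sketch of the reworked version (import paths vary a little between Play versions, and the last line assumes you are inside a controller action):

import play.api.libs.iteratee.Enumerator
import play.api.libs.EventSource
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future

// single-element enumerators: each value is emitted exactly once, when its Future completes
val enumerator1: Enumerator[String] =
  Enumerator.flatten(Future { Thread.sleep(1500); "Hello" }.map(Enumerator(_)))
val enumerator2: Enumerator[String] =
  Enumerator.flatten(Future { Thread.sleep(2000); "World!" }.map(Enumerator(_)))

// interleave first, then adapt to server-sent events
Ok.chunked(enumerator1 >- enumerator2 &> EventSource()).as("text/event-stream")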

Akka actor forward message with continuation

I have an actor which takes the result from another actor and applies some check on it.
class Actor1(actor2: Actor2) extends Actor {
  def receive = {
    case SomeMessage =>
      val r = actor2 ? NewMessage()
      r.map(someTransform).pipeTo(sender)
  }
}
Now if I make an ask of Actor1, two futures are generated, which doesn't seem overly efficient. Is there a way to provide a forward with some kind of continuation, or some other approach I could use here?
case SomeMessage => actor2.forward(NewMessage, someTransform)
Futures are executed in an ExecutionContext, which is like a thread pool. Creating a new future is not as expensive as creating a new thread, but it has its cost. The best way to work with futures is to create as many as needed and compose them in a way that things that can be computed in parallel are computed in parallel, if the necessary resources are available. This way you will make the best use of your machine.
You mentioned that akka documentation discourages excessive use of futures. I don't know where you read this, but what I think it means is to prefer transforming futures rather than creating your own. This is exactly what you are doing by using map. Also, it may mean that if you create a future where it is not needed you are adding unnecessary overhead.
In your case you have a call that returns a future and you need to apply someTransform and return the result. Using map is the way to go.
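A sketch of what that can look like end to end, reusing the names from the question (SomeResult and the message definitions are hypothetical placeholders for whatever Actor2 actually replies with):

import akka.actor.{Actor, ActorRef}
import akka.pattern.{ask, pipe}
import akka.util.Timeout
import scala.concurrent.duration._

case object SomeMessage
case class NewMessage()
case class SomeResult(value: Int) // hypothetical reply type from Actor2

class Actor1(actor2: ActorRef) extends Actor {
  import context.dispatcher                          // ExecutionContext for map/pipeTo
  implicit val timeout: Timeout = Timeout(5.seconds)

  private def someTransform(r: SomeResult): Int = r.value // stand-in for the real transform

  def receive = {
    case SomeMessage =>
      // one ask produces the Future; map attaches someTransform as a continuation;
      // the mapped Future is cheap composition, not a second independent computation
      (actor2 ? NewMessage()).mapTo[SomeResult].map(someTransform).pipeTo(sender())
  }
}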