Doobie - lifting arbitrary effect into ConnectionIO CE3 - scala

I am trying to migrate project from cats-effect 2 to cats-effect 3, i am using doobie for interacting with database. Previously i could lift ConnectionIO to IO as it was described, but with the upgrade i didn't find any implementation of LiftIO[ConnectionIO], how can be same achieved with CE3?

There's two approaches here. Broadly, there's the "don't do that" option (recommended), and the other way would be to use WeakAsync
"Don't do that"
Generally, it's not recommended to interleave arbitrary IO into a ConnectionIO program, because ConnectionIO is transactional, and you cannot roll back an IO. A ConnectionIO might run multiple times, for example to retry a recoverable failure.
When possible, it's a good idea to refactor the code to not do the non-transactional operation inside a transaction, but instead to perform it afterwards
However, when that's not practical, ConnectionIO provides a Sync instance, and so there's a number of things you can do with it easily that don't require lifting from IO in the first place:
printing to the console can be done by summoning Console[ConnectionIO]
Getting the current time can be done with Clock[ConnectionIO]
You can create a log4cats Logger[ConnectionIO] as needed using the appropriate factory for your backend
Generating a UUID value can be done with UUIDGen[ConnectionIO]
Using WeakAsync
Doobie's WeakAsync provides a way to get a Resource[F, F ~> ConnectionIO]
Note that because this is a resource, you must take care to complete the transaction inside of use - the lifecycle of the FunctionK from the resource will be shut down once use completes.
Usually that means something like this:
def doStuff(rows: List[Int]): F[Unit] = ???
WeakAsync.liftK[F, ConnectionIO].use { fk =>
val transaction = for {
rows <- sql"select 1".query[Int].to[List]
_ <- fk(doStuff(rows))
} yield ()
transaction.transact(xa)
}

I found the way to achieve it by
def f()(implicit liftToConnectionIO: FunctionK[IO, ConnectionIO]): IO[Unit] = IO.unit
implicit val liftToConnectionIO = WeakAsync.liftK[IO, ConnectionIO]
liftToConnectionIO.use(implicit lift => f())

Related

Looking for something like a TestFlow analogous to TestSink and TestSource

I am writing a class that takes a Flow (representing a kind of socket) as a constructor argument and that allows to send messages and wait for the respective answers asynchronously by returning a Future. Example:
class SocketAdapter(underlyingSocket: Flow[String, String, _]) {
def sendMessage(msg: MessageType): Future[ResponseType]
}
This is not necessarily trivial because there may be other messages in the socket stream that are irrelevant, so some filtering is required.
In order to test the class I need to provide something like a "TestFlow" analogous to TestSink and TestSource. In fact I can create a flow by combining both. However, the problem is that I only obtain the actual probes upon materialization and materialization happens inside the class under test.
The problem is similar to the one I described in this question. My problem would be solved if I could materialize the flow first and then pass it to a client to connect to it. Again, I'm thinking about using MergeHub and BroadcastHub and again I see the problem that the resulting stream would behave differently because it is not linear anymore.
Maybe I misunderstood how a Flow is supposed to be used. In order to feed messages into the flow when sendMessage() is called, I need a certain kind of Source anyway. Maybe a Source.actorRef(...) or Source.queue(...), so I could pass in the ActorRef or SourceQueue directly. However, I'd prefer if this choice was up to the SocketAdapter class. Of course, this applies to the Sink as well.
It feels like this is a rather common case when working with streams and sockets. If it is not possible to create a "TestFlow" like I need it, I'm also happy with some advice on how to improve my design and make it better testable.
Update: I browsed through the documentation and found SourceRef and SinkRef. It looks like these could solve my problem but I'm not sure yet. Is it reasonable to use them in my case or are there any drawbacks, e.g. different behaviour in the test compared to production where there are no such refs?
Indirect Answer
The nature of your question suggests a design flaw which you are bumping into at testing time. The answer below does not address the issue in your question, but it demonstrates how to avoid the situation altogether.
Don't Mix Business Logic with Akka Code
Presumably you need to test your Flow because you have mixed a substantial amount of logic into the materialization. Lets assume you are using raw sockets for your IO. Your question suggests that your flow looks like:
val socketFlow : Flow[String, String, _] = {
val socket = new Socket(...)
//business logic for IO
}
You need a complicated test framework for your Flow because your Flow itself is also complicated.
Instead, you should separate out the logic into an independent function that has no akka dependencies:
type MessageProcessor = MessageType => ResponseType
object BusinessLogic {
val createMessageProcessor : (Socket) => MessageProcessor = {
//business logic for IO
}
}
Now your flow can be very simple:
val socket : Socket = new Socket(...)
val socketFlow = Flow.map(BusinessLogic.createMessageProcessor(socket))
As a result: your unit testing can exclusively work with createMessageProcessor, there's no need to test akka Flow because it is a simple veneer around the complicated logic that is tested independently.
Don't Use Streams For Concurrency Around 1 Element
The other big problem with your design is that SocketAdapter is using a stream to process just 1 message at a time. This is incredibly wasteful and unnecessary (you're trying to kill a mosquito with a tank).
Given the separated business logic your adapter becomes much simpler and independent of akka:
class SocketAdapter(messageProcessor : MessageProcessor) {
def sendMessage(msg: MessageType): Future[ResponseType] = Future {
messageProcessor(msg)
}
}
Note how easy it is to use Future in some instances and Flow in other scenarios depending on the need. This comes from the fact that the business logic is independent of any concurrency framework.
This is what I came up with using SinkRef and SourceRef:
object TestFlow {
def withProbes[In, Out](implicit actorSystem: ActorSystem,
actorMaterializer: ActorMaterializer)
:(Flow[In, Out, _], TestSubscriber.Probe[In], TestPublisher.Probe[Out]) = {
val f = Flow.fromSinkAndSourceMat(TestSink.probe[In], TestSource.probe[Out])
(Keep.both)
val ((sinkRefFuture, (inProbe, outProbe)), sourceRefFuture) =
StreamRefs.sinkRef[In]()
.viaMat(f)(Keep.both)
.toMat(StreamRefs.sourceRef[Out]())(Keep.both)
.run()
val sinkRef = Await.result(sinkRefFuture, 3.seconds)
val sourceRef = Await.result(sourceRefFuture, 3.seconds)
(Flow.fromSinkAndSource(sinkRef, sourceRef), inProbe, outProbe)
}
}
This gives me a flow I can completely control with the two probes but I can pass it to a client that connects source and sink later, so it seems to solve my problem.
The resulting Flow should only be used once, so it differs from a regular Flow that is rather a flow blueprint and can be materialized several times. However, this restriction applies to the web socket flow I am mocking anyway, as described here.
The only issue I still have is that some warnings are logged when the ActorSystem terminates after the test. This seems to be due to the indirection introduced by the SinkRef and SourceRef.
Update: I found a better solution without SinkRef and SourceRef by using mapMaterializedValue():
def withProbesFuture[In, Out](implicit actorSystem: ActorSystem,
ec: ExecutionContext)
: (Flow[In, Out, _],
Future[(TestSubscriber.Probe[In], TestPublisher.Probe[Out])]) = {
val (sinkPromise, sourcePromise) =
(Promise[TestSubscriber.Probe[In]], Promise[TestPublisher.Probe[Out]])
val flow =
Flow
.fromSinkAndSourceMat(TestSink.probe[In], TestSource.probe[Out])(Keep.both)
.mapMaterializedValue { case (inProbe, outProbe) =>
sinkPromise.success(inProbe)
sourcePromise.success(outProbe)
()
}
val probeTupleFuture = sinkPromise.future
.flatMap(sink => sourcePromise.future.map(source => (sink, source)))
(flow, probeTupleFuture)
}
When the class under test materializes the flow, the Future is completed and I receive the test probes.

Are there better monadic abstraction alternative for representing long running, async task?

A Future is good at representing a single asynchronous task that will / should be completed within some fixed amount of time.
However, there exist another kind of asynchronous task, one where it's not possible / very hard to know exactly when it will finish. For example, the time taken for a particular string processing task might depend on various factors such as the input size.
For these kind of task, detecting failure might be better by checking if the task is able to make progress within a reasonable amount of time instead of by setting a hard timeout value such as in Future.
Are there any libraries providing suitable monadic abstraction of such kind of task in Scala?
You could use a stream of values like this:
sealed trait Update[T]
case class Progress[T](value: Double) extends Update[T]
case class Finished[T](result: T) extends Update[T]
let your task emit Progress values when it is convenient (e.g. every time a chunk of the computation has finished), and emit one Finished value once the complete computation is finished. The consumer could check for progress values to ensure that the task is still making progress. If a consumer is not interested in progress updates, you can just filter them out. I think this is more composable than an actor-based approach.
Depending on how much performance or purity you need, you might want to look at akka streams or scalaz streams. Akka streams has a pure DSL for building flow graphs, but allows mutability in processing stages. Scalaz streams is more functional, but has lower performance last I heard.
You can break your work into chunks. Each chunk is a Future with a timeout - your notion of reasonable progress. Chain those Futures together to get a complete task.
Example 1 - both chunks can run in parallel and don't depend on each other (embarassingly parallel task):
val chunk1 = Future { ... } // chunk 1 starts execution here
val chunk2 = Future { ... } // chunk 2 starts execution here
val result = for {
c1 <- chunk1
c2 <- chunk2
} yield combine(c1, c2)
Example 2 - second chunk depends on the first:
val chunk1 = Future { ... } // chunk 1 starts execution here
val result = for {
c1 <- chunk1
c2 <- Future { c1 => ... } // chunk 2 starts execution here
} yield combine(c1, c2)
There are obviously other constructs to help you when you have many Futures like sequence.
The article "The worst thing in our Scala code: Futures" by Ken Scambler points the need for a separation of concerns:
scala.concurrent.Future is built to work naturally with scala.util.Try, so our code often ends up clumsily using Try everywhere to represent failure, using raw exceptions as failure values even where no exceptions get thrown.
To do anything with a scala.concurrent.Future, you need to lug around an implicit ExecutionContext. This obnoxious dependency needs to be threaded through everywhere they are used.
So if your code does not depend directly on Future, but on simple Monad properties, you can abstract it with a Monad type:
trait Monad[F[_]] {
def flatMap[A,B](fa: F[A], f: A => F[B]): F[B]
def pure[A](a: A): F[A]
}
// read the type parameter as “for all F, constrained by Monad”.
final def load[F[_]: Monad](pageUrl: URL): F[Page]

Akka actor forward message with continuation

I have an actor which takes the result from another actor and applies some check on it.
class Actor1(actor2:Actor2) {
def receive = {
case SomeMessage =>
val r = actor2 ? NewMessage()
r.map(someTransform).pipeTo(sender)
}
}
now if I make an ask of Actor1, we now have 2 futures generated, which doesnt seem overly efficient. Is there a way to provide a foward with some kind of continuation, or some other approach I could use here?
case SomeMessage => actor2.forward(NewMessage, someTransform)
Futures are executed in an ExecutionContext, which are like thread pools. Creating a new future is not as expensive as creating a new thread, but it has its cost. The best way to work with futures is to create as much as needed and compose then in a way that things that can be computed in parallel are computed in parallel if the necessary resources are available. This way you will make the best use of your machine.
You mentioned that akka documentation discourages excessive use of futures. I don't know where you read this, but what I think it means is to prefer transforming futures rather than creating your own. This is exactly what you are doing by using map. Also, it may mean that if you create a future where it is not needed you are adding unnecessary overhead.
In your case you have a call that returns a future and you need to apply sometransform and return the result. Using map is the way to go.

When could Futures be more appropriate than Actors (or vice versa) in Scala?

Suppose I need to run a few concurrent tasks.
I can wrap each task in a Future and wait for their completion. Alternatively I can create an Actor for each task. Each Actor would execute its task (e.g. upon receiving a "start" message) and send the result back.
I wonder when I should use the former (with Futures) and the latter (with Actors) approach and why the Future approach is considered better for the case described above.
Because it is syntactically simpler.
val tasks: Seq[() => T] = ???
val futures = tasks map {
t => future { t() }
}
val results: Future[Seq[T]] = Future.sequence(futures)
The results future you can then wait on using Await.result or you can map it further/use it in for-comprehension or install callbacks on it.
Compare that to instantiating all the actors, sending messages to them, coding their receive blocks, receiving responses from them and shutting them down -- that would generally require more boilerplate.
As a general rule, use the simplest concurrency model that fits your application, rather than the most powerful. Ordering from simplest to most complex would be sequential programming->parallel collections->futures->stateless actors->stateful actors->threads with software transactional memory->threads with explicit locking->threads with lock-free algorithms. Pick the first one in this list that solves your problem. The farther down that list you go, the greater the complexities and risks, so you're better off trading simplicity for conceptual power.
I tend to think that actors are useful when you have interacting threads. In your case, it appears to be that all the jobs are independent; I would use futures.

How to create non-blocking methods in Scala?

What is a good way of creating non-blocking methods in Scala? One way I can think of is to create a thread/actor and the method just send a message to the thread and returns. Is there a better way of creating a non-blocking method?
Use scala.actors.Future:
import actors._
def asyncify[A, B](f: A => B): A => Future[B] = (a => Futures.future(f(a)))
// normally blocks when called
def sleepFor(seconds: Int) = {
Thread.sleep(seconds * 1000)
seconds
}
val asyncSleepFor = asyncify(sleepFor)
val future = asyncSleepFor(5) // now it does NOT block
println("waiting...") // prints "waiting..." rightaway
println("future returns %d".format(future())) // prints "future returns 5" after 5 seconds
Overloaded "asyncify" that takes a function with more than one parameter is left as an exercise.
One caveat, however, is exception handling. The function that is being "asyncified" has to handle all exceptions itself by catching them. Behavior for exceptions thrown out of the function is undefined.
Learn about actors.
It depends on your definition of "blocking." Strictly speaking, anything that requires acquisition of a lock is blocking. All operations that are dependent on an actor's internal state acquire a lock on the actor. This includes message sends. If lots of threads try to send a message to an actor all-at-once they have to get in line.
So if you really need non-blocking, there are various options in java.util.concurrent.
That being said, from a practical perspective actors give you something that close enough to non-blocking because none of the synchronized operations do a significant amount of work, so chances are actors meet your need.