Retrying Monix Task - why Task.defer is required here? - scala

I recently spotted a case I can't fully understand while working with Monix Task:
There are two functions (in queue msg handler):
def handle(msg: RollbackMsg): Task[Unit] = {
logger.info(s"Attempting to rollback transaction ${msg.lockId}")
Task.defer(doRollback(msg)).onErrorRestart(5).foreachL { _ =>
logger.info(s"Transaction ${msg.lockId} rolled back")
}
}
private def doRollback(msg: RollbackMsg): Task[Unit] =
(for {
originalLock <- findOrigLock(msg.lockId)
existingClearanceOpt <- findExistingClearance(originalLock)
_ <- clearLock(originalLock, existingClearanceOpt)
} yield ()).transact(xa)
The internals of doRollback's for-comprehension are all a set of doobie calls returning ConnectionIO[_] monad and then transact is run on it turning the composition into Monix Task.
Now, as seen in handle function I'd like entire process to retry 5 times in case of failure. The mysterious part is that this simple call:
doRollback(msg).onErrorRestart(5)
doesn't really restart the operation on exception (verified in tests). In order to get this retry behaviour I have to explicitly wrap it in Task.defer, or have it already within Task "context" in any other way.
And this is the point I don't fully get: why is it so? doRollback already gives me Task instance, so I should be able to call onErrorRestart on it, no? If it's not the case how can I be sure that a Task instance i get from "somewhere" is ok to be restarted or not?
What am I missing here?

Related

How to cancel a future action if another future did failed?

I have 2 futures (2 actions on db tables) and I want that before saving modifications to check if both futures have finished successfully.
Right now, I start second future inside the first (as a dependency), but I know is not the best option. I know I can use a for-comprehension to execute both futures in parallel, but even if one fail, the other will be executed (not tested yet)
firstFuture.dropColumn(tableName) match {
case Success(_) => secondFuture.deleteEntity(entity)
case Failure(e) => throw new Exception(e.getMessage)
}
// the first future alters a table, drops a column
// the second future deletes a row from another table
in this case, if first future is executed successfully, the second can fail. I want to revert the update of first future. I heard about SQL transactions, seems to be something like that, but how?
val futuresResult = for {
first <- firstFuture.dropColumn(tableName)
second <- secondFuture.deleteEntity(entity)
} yield (first, second)
A for-comprehension is much better in my case because I don't have dependencies between these two futures and can be executed in parallel, but this not solve my problem, the result can be (success, success) or (failed, success) for example.
Regarding Future running sequentially vs in parallel:
This is a bit tricky because Scala Future is designed to be eager. There are some other constructs across various Scala libraries that handle synchronous and asynchronous effects, such as cats IO, Monix Task, ZIO etc. which are designed in a lazy way, and they don't have this behaviour.
The thing with Future being eager is that it will start the computation as soon as it is can. Here "start" means schedule it on an ExecutionContext that is either selected explicitly or present implicitly. While it's technically possible that the execution is stalled a bit in case the scheduler decides to do so, it will most likely be started almost instantaneously.
So if you have a value of type Future, it's going to start running then and there. If you have a lazy value of type Future, or a function / method that returns a value of type Future, then it's not.
But even if all you have are simple values (no lazy vals or defs), if the Future definition is done inside the for-comprehension, then it means it's part of a monadic flatMap chain (if you don't understand this, ignore it for now) and it will be run in sequence, not in parallel. Why? This is not specific to Futures; every for-comprehension has the semantics of being a sequential chain in which you can pass the result of the previous step to the next step. So it's only logical that you can't run something in step n + 1 if it depends on something from step n.
Here's some code to demonstrate this.
val program = for {
_ <- Future { Thread.sleep(5000); println("f1") }
_ <- Future { Thread.sleep(5000); println("f2") }
} yield ()
Await.result(program, Duration.Inf)
This program will wait five seconds, then print "f1", then wait another five seconds, and then print "f2".
Now let's take a look at this:
val f1 = Future { Thread.sleep(5000); println("f1") }
val f2 = Future { Thread.sleep(5000); println("f2") }
val program = for {
_ <- f1
_ <- f2
} yield ()
Await.result(program, Duration.Inf)
The program, however, will print "f1" and "f2" simultaneously after five seconds.
Note that the sequence semantics are not really violated in the second case. f2 still has the opportunity to use the result of f1. But f2 is not using the result of f1; it's a standalone value that can be computed immediately (defined with a val). So if we change val f2 to a function, e.g. def f2(number: Int), then the execution changes:
val f1 = Future { Thread.sleep(5000); println("f1"); 42 }
def f2(number: Int) = Future { Thread.sleep(5000); println(number) }
val program = for {
number <- f1
_ <- f2(number)
} yield ()
As you would expect, this will print "f1" after five seconds, and only then will the other Future start, so it will print "42" after another five seconds.
Regarding transactions:
As #cbley mentioned in the comment, this sounds like you want database transactions. For example, in SQL databases this has a very specific meaning and it ensures the ACID properties.
If that's what you need, you need to solve it on the database layer. Future is too generic for that; it's just an effect type that models sync and async computations. When you see a Future value, just by looking at the type, you can't tell if it's the result of a database call or, say, some HTTP call.
For example, doobie describes every database query as a ConnectionIO type. You can have multiple queries lined up in a for-comprehension, just how you would have with Future:
val program = for {
a <- database.getA()
_ <- database.write("foo")
b <- database.getB()
} yield {
// use a and b
}
But unlike our earlier examples, here getA() and getB() don't return a value of type Future[A], but ConnectionIO[A]. What's cool about that is that doobie completely takes care of the fact that you probably want these queries to be run in a single transaction, so if getB() fails, "foo" will not be committed to the database.
So what you would do in that case is obtain the full description of your set of queries, wrap it into a single value program of type ConnectionIO, and once you want to actually run the transaction, you would do something like program.transact(myTransactor), where myTransactor is an instance of Transactor, a doobie construct that knows how to connect to your physical database.
And once you transact, your ConnectionIO[A] is turned into a Future[A]. If the transaction failed, you'll have a failed Future, and nothing will be really committed to your database.
If your database operations are independent of each other and can be run in parallel, doobie will also allow you to do that. Committing transactions via doobie, both in sequence and in parallel, is quite nicely explained in the docs.

scala - scalatest - eventually trait wins over fail assertion?

by documentation,
eventually trait
Invokes the passed by-name parameter repeatedly until it either
succeeds, or a configured maximum amount of time has passed, sleeping
a configured interval between attempts.
but fail,
fail to fail a test unconditionally;
so i want to use eventually in order to wait until a successful status arrived, but use fail to fail the test if i already know that the test must to fail
e.g.
converting a video with ffmpeg i will wait until conversion is not completed but if conversion reach "error" status i want to make the test fail
with this test
test("eventually fail") {
eventually (timeout(Span(30, Seconds)), interval(Span(15, Seconds))) {
println("Waiting... ")
assert(1==1)
fail("anyway you must fail")
}
}
i understand that i cannot make a test "fail unconditionally" inside eventually cicle : it looks like eventually will ignore "fail" until timeout.
is this a correct behaviour?
so, in the assertion scalatest documentation, fail should not "fail test unconditionally" but it "throw exception"?
It's the same because the only way to fail a test in Scalatest is to throw an exception.
Look at the source:
def eventually[T](fun: => T)(implicit config: PatienceConfig): T = {
val startNanos = System.nanoTime
def makeAValiantAttempt(): Either[Throwable, T] = {
try {
Right(fun)
}
catch {
case tpe: TestPendingException => throw tpe
case e: Throwable if !anExceptionThatShouldCauseAnAbort(e) => Left(e)
}
}
...
So if you want to get your failure through, you could use pending instead of fail (but of course, the test will be reported as pending, not failed). Or write your own version of eventually which lets more exceptions through.

Scalaz Task not starting

I'm trying to make an asyncronous call using scalaz Task.
For some strange reason whenever I attempt to call methodA, although I get a scalaz.concurrent.Task returned I never see the 2nd print statement which leads me to believe that the asynchronous call is never returned. Is there something that I'm missing that is required in order for me to start the task?
for {
foo <- println("we are in this for loop!").asAsyncSuccess()
authenticatedUser <- methodA(parameterA)
foo2 <- println("we are out of this this call!").asAsyncSuccess()
} yield {
authenticatedUser
}
def methodA(parameterA: SomeType): Task[User]
First off, Scalaz Task is lazy: it won't start when you create it or get from some method. It's by design and allows Tasks to combine well. It's not intended to start until you fully assemble your entire program into single main Task, so when you're done, you need to manually start an aggregated Task using one of the perform methods, like unsafePerformSync that will wait for the result of aggregate async computation on a current thread:
(for {
foo <- println("we are in this for loop!").asAsyncSuccess()
authenticatedUser <- methodA(parameterA)
foo2 <- println("we are out of this this call!").asAsyncSuccess()
} yield {
authenticatedUser
}).unsafePerformSync
As you can see from your original code, you've never started any Task. However, you've mentioned that first println printed its message to console. This may be related to asAsyncSuccess method: I've not found it in Scalaz, so I assume it's in implicit class somewhere in your code:
implicit class TaskOps[A](val a: A) extends AnyVal {
def asAsyncSuccess(): Task[A] = Task(a)
}
If you write a helper implicit class like this, it won't convert an effectful expression into lazy Task action, despite the fact that Task.apply
is call-by-name. It will only convert its result, which happens to be just Unit, because its own parameter a is not call-by-name. To work as intended, it needs to be like the following:
implicit class TaskOps[A](a: => A) {
def asAsyncSuccess(): Task[A] = Task(a)
}
You may also ask why first println prints something while second isn't. This is related to for-comprehensions desugaring into method calls. Specifically, compiler turns your code into this:
println("we are in this for loop!").asAsyncSuccess().flatMap(foo =>
methodA(parameterA).flatMap(authenticatedUser =>
println("we are out of this this call!").asAsyncSuccess().map(foo2 =>
authenticatedUser)))
If asAsyncSuccess doesn't respect intended Task semantics, first println gets executed immediately. But its result is then wrapped into Task that is, essentially, just a wrapper over start function, and flatMap implementation on it just combines such functions together without running them. So, methodA or second println won't be executed until you call unsafePerformSync or other perform helper method. Note, however, that second println will still be executed earlier than it should according to Task semantics, just not that earlier.

Can't seem to get Future to run callback in Scala

(I've not included the imports so as not to clutter this question)
(This is the simplest possible Scala App (created using scala-minimal template on Typesafe Activator))
I'm trying to run a query against an Elasticsearch Server.
I've run the same code on sbt console and I can see the results alright.
However, when I run the following code, I see "END" (code after the callbacks) being printed, but neither the Success callback nor the Failure callback get run.
I'm a Scala noob, so maybe I'm doing something wrong here? This code compiles. (Just to let you know all the imports are there)
object Hello{
def main(args: Array[String]): Unit = {
val client = ElasticClient.remote("vm-3bsa", 9300)
val res:Future[SearchResponse] = client.execute{ search in "vulnerabilities/3bsa" query "css" }
res onComplete{
case Success(s) => println(s)
case Failure(t) => println("An error has occured: " + t)
}
println("END")
//EDIT start
Await.result(res,10.seconds)
//EDIT end
}
}
FINAL EDIT
Instead of using onComplete, it works if I, instead, print result of the call to Await.result:
val await=Await.result(res,10.seconds)
println(await)
// results shown
The main thread will register your onComplete, println("END") and then exit, this makes the program terminate so therefore you never see your onComplete callback.
You can use Await.result(future, timeout) to block the main thread to keep it alive until the answer arrives. In a server context that would be a big no-no but in a small app like this it is not a problem blocking one thread.

Sleeping actors?

What's the best way to have an actor sleep? I have actors set up as agents which want to maintain different parts of a database (including getting data from external sources). For a number of reasons (including not overloading the database or communications and general load issues), I want the actors to sleep between each operation. I'm looking at something like 10 actor objects.
The actors will run pretty much infinitely, as there will always be new data coming in, or sitting in a table waiting to be propagated to other parts of the database etc. The idea is for the database to be as complete as possible at any point in time.
I could do this with an infinite loop, and a sleep at the end of each loop, but according to http://www.scala-lang.org/node/242 actors use a thread pool which is expanded whenever all threads are blocked. So I imagine a Thread.sleep in each actor would be a bad idea as would waste threads unnecessarily.
I could perhaps have a central actor with its own loop that sends messages to subscribers on a clock (like async event clock observers)?
Has anyone done anything similar or have any suggestions? Sorry for extra (perhaps superfluous) information.
Cheers
Joe
There was a good point to Erlang in the first answer, but it seems disappeared. You can do the same Erlang-like trick with Scala actors easily. E.g. let's create a scheduler that does not use threads:
import actors.{Actor,TIMEOUT}
def scheduler(time: Long)(f: => Unit) = {
def fixedRateLoop {
Actor.reactWithin(time) {
case TIMEOUT => f; fixedRateLoop
case 'stop =>
}
}
Actor.actor(fixedRateLoop)
}
And let's test it (I did it right in Scala REPL) using a test client actor:
case class Ping(t: Long)
import Actor._
val test = actor { loop {
receiveWithin(3000) {
case Ping(t) => println(t/1000)
case TIMEOUT => println("TIMEOUT")
case 'stop => exit
}
} }
Run the scheduler:
import compat.Platform.currentTime
val sched = scheduler(2000) { test ! Ping(currentTime) }
and you will see something like this
scala> 1249383399
1249383401
1249383403
1249383405
1249383407
which means our scheduler sends a message every 2 seconds as expected. Let's stop the scheduler:
sched ! 'stop
the test client will begin to report timeouts:
scala> TIMEOUT
TIMEOUT
TIMEOUT
stop it as well:
test ! 'stop
There's no need to explicitly cause an actor to sleep: using loop and react for each actor means that the underlying thread pool will have waiting threads whilst there are no messages for the actors to process.
In the case that you want to schedule events for your actors to process, this is pretty easy using a single-threaded scheduler from the java.util.concurrent utilities:
object Scheduler {
import java.util.concurrent.Executors
import scala.compat.Platform
import java.util.concurrent.TimeUnit
private lazy val sched = Executors.newSingleThreadScheduledExecutor();
def schedule(f: => Unit, time: Long) {
sched.schedule(new Runnable {
def run = f
}, time , TimeUnit.MILLISECONDS);
}
}
You could extend this to take periodic tasks and it might be used thus:
val execTime = //...
Scheduler.schedule( { Actor.actor { target ! message }; () }, execTime)
Your target actor will then simply need to implement an appropriate react loop to process the given message. There is no need for you to have any actor sleep.
ActorPing (Apache License) from lift-util has schedule and scheduleAtFixedRate Source: ActorPing.scala
From scaladoc:
The ActorPing object schedules an actor to be ping-ed with a given message at specific intervals. The schedule methods return a ScheduledFuture object which can be cancelled if necessary
There unfortunately are two errors in the answer of oxbow_lakes.
One is a simple declaration mistake (long time vs time: Long), but the second is some more subtle.
oxbow_lakes declares run as
def run = actors.Scheduler.execute(f)
This however leads to messages disappearing from time to time. That is: they are scheduled but get never send. Declaring run as
def run = f
fixed it for me. It's done the exact way in the ActorPing of lift-util.
The whole scheduler code becomes:
object Scheduler {
private lazy val sched = Executors.newSingleThreadedScheduledExecutor();
def schedule(f: => Unit, time: Long) {
sched.schedule(new Runnable {
def run = f
}, time - Platform.currentTime, TimeUnit.MILLISECONDS);
}
}
I tried to edit oxbow_lakes post, but could not save it (broken?), not do I have rights to comment, yet. Therefore a new post.