Running a block with a timeout in Scala without using Futures? - scala

Is there a way to running a function with a timeout in Scala, without Futures?
For example, something like:
val result = runWithTimeout (timeout, function}
The reason to avoid Future, is because we work in a complex running environment and I would like to avoid spanning the threads and managing ExecutionContext.

In general it's not possible in JVM without applying some effect system like cats-effect, fs2, monix or Zio.
We need to have another thread to trigger the time timeout. Another issue, is when you calling with runWithTimeout(Thread.sleep(1 hours)), it will release the caller thread but it will eat the thread from the thread pool for 1 hour.
Still, to go this way, you can do something like:
object Execution {
def runWithTimeout[T](timeout: Duration)(f: => T)(implicit ec: ExecutionContext): Option[T] = {
try {
Some(Await.result(Future(f), timeout))
} catch {
case e: TimeoutException => None
}
}
}

Related

Avoid deadlock with Future { blocking {} }

I am using the following code to get a JDBC connection, inside a blocking block, and pass that connection to a fn: Connection => Future[_]. After fn finishes I'd like to commit/rollback the transaction and close the connection.
def withTransactionAsync[T](fn: Connection => Future[T]): Future[T] =
Future {
blocking {
ds.getConnection
}
}.flatMap { conn =>
fn(conn)
.map { r => conn.commit(); conn.close(); r }
.recoverWith {
case e: Throwable =>
conn.rollback()
conn.close()
throw e
}
}
I am using a separate execution context based on a ForkJoinPool.
With enough calls, this code goes into a deadlock. Intuitively, this makes sense. The first future, with the getConnection call, gets blocked while waiting for available connections, while available connections are waiting for available threads in the ExecutionContext to run the commit(); close() block to free the connection and free a thread in the execution context for getConnection to run. I verified this is the case with a thread dump.
The only way I found around this problem is to run everything on the same Future {} and therefore avoid switching the context:
def withTransactionAsync[T](fn: Connection => Future[T]): Future[T] =
Future {
blocking {
val conn = ds.getConnection
try {
conn.setAutoCommit(false)
val r = Await.result(fn(conn), Duration.Inf)
conn.commit()
r
} catch {
case e: Throwable =>
conn.rollback()
throw e
} finally
conn.close()
}
}
But this way I am blocking on Await.result. I suppose this is not a big problem because I am blocking inside a blocking block, but I am afraid this would have unforeseen consequences and is not necessarily what the caller of this API expects.
Is there a way around this deadlock without using Await and just rely on Future composition?
I suppose a case could be made that this this function not be accepting Connection => Future[T] but only a Connection => T, but I'd like to keep that API.
If I increase the size of the ForkJoinPool enough, it works, but that size is difficult to calculate/predict for all workloads and I don't want to have a ForkJoinPool many times the size of my database pool.
As mentioned in the comments, fn is blocking code. But it is not inside a blocking clause, so it will tie up one of the main threads in the pool. If this happens enough times, the pool will run out of threads and the system will deadlock.
So the call to fn and the code that follows needs to be inside a blocking clause so that a separate thread is created for it and the main threads remain available for non-blocking code.
Given the amount of blocking code, it is probably worth looking at a Task model with a thread per connection rather than a thread per pending operation, so that the number of threads is constrained. This is basically a work-around for the fact that getConnection is synchronous, which is a problem with HikariCP.

Scala style with Futures

Looking for the best way to write a chain of functions that need to run async one after the other. Given these two options:
Option 1
def operation1(): Unit = {...}
def operation2(): Unit = {...}
def foo(): Future[Unit] =
Future {
operation1()
operation2()
} onComplete {case _ => println("done!")}
Option 2
def operation1(): Future[Unit] = {...}
def operation2(): Future[Unit] = {...}
def foo(): Future[Unit] = {
operation1()
.flatMap {case _ => operation2() }
.onComplete {case _ => println("done!")}
}
Are there any advantages/disadvantages of one over the other?
I believe that the Option 1 will run the two functions on the same background thread. Is that also the case for the Option 2?
Are there any good practices for this?
Another question, given this function:
def foo: Future[A]
if I want to cast the result to unit, is this the best way to do it:
foo map { _ => () }
Thanks!
The potential advantage of Option 1 over Option 2 is that, it guarantees operation2 will run right after operation1 - if it didn't fail with an exception - whereas, in Option 2, you might have exhausted your thread pool available threads when the flatMap is to be done.
Yes, Option1 will run the operations in the same thread for sure. Option 2 will try to run them in two threads as long as there are enough of them available.
flatMap[S](f: (T) ⇒ Future[S])(implicit executor: ExecutionContext): Future[S]
You did have to declare an implicit execution context, or import it: That determines which pool you are using. If you imported the default global executor then your pool is a fork join based one with - by default - as many threads as cores you machine has.
The first option is like having a thread running both operations, one after the another whereas the second option runs the first operation in a thread and then tries to get another thread from your ExecutionContext to run the second operation.
The best practice is to use what you need:
Do you want to make sure that operation2 run in a context where no more threads are available in the execution context? If the answer is yes, then use Option1. Otherwise, you can use Option2
In relation to your last question: What you're doing in your proposed snippet is not casting, your are mapping a function which provides an Unit value for any value of type A. The effect is that you get a future of type Unit which is useful to check its completion state. That is the best way to get what you seem to want.
Be aware, however, of the fact that, as well as with flatMap, the execution of that "transformation function" will be run in a different thread provided by the implicit executor in your context. That is why map has an implicit parameter executor too:
def map[S](f: (T) ⇒ S)(implicit executor: ExecutionContext): Future[S]

Alternative to await.ready

I have the following code in Scala:
val status: Future[String] = Await.ready(Http(address OK as.String), 1 second)
I'm making a http call and I'm waiting for an answer for a second.
I was told it's not good practice to block using Await.ready.
I'm curious what I can use instead.
Can I use for comprehensions? How?
It generally bad to block on an asynchronous operation, otherwise, why make it asynchronous at all?
You can use various implementations with Future[T], such as registering a continuation to invoke when the result arrives. For example, let's assume you want to parse the String result into a Foo object. You'd get:
val result: Future[Foo] = Http(address OK as.String).map {
s => parseJson[Foo](s)
}
Note that when working with Future[T], you'll end up bubbling them up the call chain of the execution, unless you synchronously block.
Same can be achieved with for comprehension:
for {
s <- Http(address OK as.String)
} yield (parseJson[Foo](s))
Using Await.ready is not a good practice because its blocking. In most of the cases you can compose and transform the futures to achieve the desired result.
But You can use blocking when its absolutely necessary. Here is my answer about blocking and its consequences When to and when not use blocking
Non-Blocking wait
def afterSomeTime(code: => Unit)(duration: FiniteDuration): Unit = {
someActorSystem.scheduler.scheduleOnce(duration) {
code
}
}
Above function will call the code after given duration, you can use any other timer implementation instead of Akka scheduler
case class TimeoutException(msg: String) extends Exception(msg)
def timeout[T](future: => Future[T])(duration: FiniteDuration)(implicit ec: ExecutionContext): Future[T] = {
val promise = Promise[T]()
future.onComplete(promise tryComplete)
afterSomeTime {
promise tryFailure TimeoutException(s"Future timeout after ${duration.toString()}")
}(duration)
promise.future
}

How to run futures on the current actor's dispatcher in Akka

Akka's documentation warns:
When using future callbacks, such as onComplete, onSuccess, and onFailure, inside actors you need to carefully avoid closing over the containing actor’s reference, i.e. do not call methods or access mutable state on the enclosing actor from within the callback
It seems to me that if I could get the Future which wants to access the mutable state to run on the same dispatcher that arranges for mutual exclusion of threads handling actor messages then this issue could be avoided. Is that possible? (Why not?)
The ExecutionContext provided by context.dispatcher is not tied to the actor messages dispatcher, but what if it were? i.e.
class MyActorWithSafeFutures {
implicit def safeDispatcher = context.dispatcherOnMessageThread
var successCount = 0
var failureCount = 0
override def receive: Receive = {
case MakeExternalRequest(req) =>
val response: Future[Response] = someClient.makeRequest(req)
response.onComplete {
case Success(_) => successCount += 1
case Failure(_) => failureCount += 1
}
response pipeTo sender()
}
}
}
Is there any way to do that in Akka?
(I know that I could convert the above example to do something like self ! IncrementSuccess, but this question is about mutating actor state from Futures, rather than via messages.)
It looks like I might be able to implement this myself, using code like the following:
class MyActorWithSafeFutures {
implicit val executionContext: ExecutionContextExecutor = new ExecutionContextExecutor {
override def execute(runnable: Runnable): Unit = {
self ! runnable
}
override def reportFailure(cause: Throwable): Unit = {
throw new Error("Unhandled throwable", cause)
}
}
override def receive: Receive = {
case runnable: Runnable => runnable.run()
... other cases here
}
}
Would that work? Why doesn't Akka offer that - is there some huge drawback I'm not seeing?
(See https://github.com/jducoeur/Requester for a library which does just this in a limited way -- for Asks only, not for all Future callbacks.)
Your actor is executing its receive under one of the dispatcher's threads, and you want to spin off a Future that's firmly attached to this particular thread? In that case the system can't reuse this thread to run a different actor, because that would mean the thread was unavailable when you wanted to execute the Future. If it happened to use that same thread to execute someClient, you might deadlock with yourself. So this thread can no longer be used freely to run other actors - it has to belong to MySafeActor.
And no other threads can be allowed to freely run MySafeActor - if they were, two different threads might try to update successCount at the same time and you'd lose data (e.g. if the value is 0 and two threads both try to do successCount +=1, the value can end up as 1 rather that 2). So to do this safely, MySafeActor has to have a single Thread that's used for itself and its Future. So you end up with MySafeActor and that Future being tightly, but invisibly, coupled. The two can't run at the same time and could deadlock against each other. (It's still possible for a badly-written actor to deadlock against itself, but the fact that all the code using that actor's "imaginary mutex" is in a single place makes it easier to see potential problems).
You could use traditional multithreading techniques - mutexes and the like - to allow the Future and MySafeActor to run concurrently. But what you really want is to encapsulate successCount in something that can be used concurrently but safely - some kind of... Actor?
TL;DR: either the Future and the Actor: 1) may not run concurrently, in which case you may deadlock 2) may run concurrently, in which case you will corrupt data 3) access state in a concurrency-safe way, in which case you're reimplementing Actors.
You could use a PinnedDispatcher for your MyActorWithSafeFutures actor class which would create a thread pool with exactly one thread for each instance of the given class, and use context.dispatcher as execution context for your Future.
To do this you have to put something like this in your application.conf:
akka {
...
}
my-pinned-dispatcher {
executor = "thread-pool-executor"
type = PinnedDispatcher
}
and to create your actor:
actorSystem.actorOf(
Props(
classOf[MyActorWithSafeFutures]
).withDispatcher("my-pinned-dispatcher"),
"myActorWithSafeFutures"
)
Although what you are trying to achieve breaks completely the purpose of the actor model. The actor state should be encapsulated, and actor state changes should be driven by incoming messages.
This does not answer your question directly, but rather offers an alternative solution using Akka Agents:
class MyActorWithSafeFutures extends Actor {
var successCount = Agent(0)
var failureCount = Agent(0)
def doSomethingWithPossiblyStaleCounts() = {
val (s, f) = (successCount.get(), failureCount.get())
statisticsCollector ! Ratio(f/s+f)
}
def doSomethingWithCurrentCounts() = {
val (successF, failureF) = (successCount.future(), failureCount.future())
val ratio : Future[Ratio] = for {
s <- successF
f <- failureF
} yield Ratio(f/s+f)
ratio pipeTo statisticsCollector
}
override def receive: Receive = {
case MakeExternalRequest(req) =>
val response: Future[Response] = someClient.makeRequest(req)
response.onComplete {
case Success(_) => successCount.send(_ + 1)
case Failure(_) => failureCount.send(_ + 1)
}
response pipeTo sender()
}
}
The catch is that if you want to operate on the counts that would result if you were using #volatile, then you need to operate inside a Future, see doSomethingWithCurrentCounts().
If you are fine with having values which are eventually consistent (there might be pending updates scheduled for the Agents), then something like doSometinghWithPossiblyStaleCounts() is fine.
#rkuhn explains why this would be a bad idea on the akka-user list:
My main consideration here is that such a dispatcher would make it very convenient to have multiple concurrent entry points into the Actor’s behavior, where with the current recommendation there is only one—the active behavior. While classical data races are excluded by the synchronization afforded by the proposed ExecutionContext, it would still allow higher-level races by suspending a logical thread and not controlling the intermediate execution of other messages. In a nutshell, I don’t think this would make the Actor easier to reason about, quite the opposite.

Sleeping actors?

What's the best way to have an actor sleep? I have actors set up as agents which want to maintain different parts of a database (including getting data from external sources). For a number of reasons (including not overloading the database or communications and general load issues), I want the actors to sleep between each operation. I'm looking at something like 10 actor objects.
The actors will run pretty much infinitely, as there will always be new data coming in, or sitting in a table waiting to be propagated to other parts of the database etc. The idea is for the database to be as complete as possible at any point in time.
I could do this with an infinite loop, and a sleep at the end of each loop, but according to http://www.scala-lang.org/node/242 actors use a thread pool which is expanded whenever all threads are blocked. So I imagine a Thread.sleep in each actor would be a bad idea as would waste threads unnecessarily.
I could perhaps have a central actor with its own loop that sends messages to subscribers on a clock (like async event clock observers)?
Has anyone done anything similar or have any suggestions? Sorry for extra (perhaps superfluous) information.
Cheers
Joe
There was a good point to Erlang in the first answer, but it seems disappeared. You can do the same Erlang-like trick with Scala actors easily. E.g. let's create a scheduler that does not use threads:
import actors.{Actor,TIMEOUT}
def scheduler(time: Long)(f: => Unit) = {
def fixedRateLoop {
Actor.reactWithin(time) {
case TIMEOUT => f; fixedRateLoop
case 'stop =>
}
}
Actor.actor(fixedRateLoop)
}
And let's test it (I did it right in Scala REPL) using a test client actor:
case class Ping(t: Long)
import Actor._
val test = actor { loop {
receiveWithin(3000) {
case Ping(t) => println(t/1000)
case TIMEOUT => println("TIMEOUT")
case 'stop => exit
}
} }
Run the scheduler:
import compat.Platform.currentTime
val sched = scheduler(2000) { test ! Ping(currentTime) }
and you will see something like this
scala> 1249383399
1249383401
1249383403
1249383405
1249383407
which means our scheduler sends a message every 2 seconds as expected. Let's stop the scheduler:
sched ! 'stop
the test client will begin to report timeouts:
scala> TIMEOUT
TIMEOUT
TIMEOUT
stop it as well:
test ! 'stop
There's no need to explicitly cause an actor to sleep: using loop and react for each actor means that the underlying thread pool will have waiting threads whilst there are no messages for the actors to process.
In the case that you want to schedule events for your actors to process, this is pretty easy using a single-threaded scheduler from the java.util.concurrent utilities:
object Scheduler {
import java.util.concurrent.Executors
import scala.compat.Platform
import java.util.concurrent.TimeUnit
private lazy val sched = Executors.newSingleThreadScheduledExecutor();
def schedule(f: => Unit, time: Long) {
sched.schedule(new Runnable {
def run = f
}, time , TimeUnit.MILLISECONDS);
}
}
You could extend this to take periodic tasks and it might be used thus:
val execTime = //...
Scheduler.schedule( { Actor.actor { target ! message }; () }, execTime)
Your target actor will then simply need to implement an appropriate react loop to process the given message. There is no need for you to have any actor sleep.
ActorPing (Apache License) from lift-util has schedule and scheduleAtFixedRate Source: ActorPing.scala
From scaladoc:
The ActorPing object schedules an actor to be ping-ed with a given message at specific intervals. The schedule methods return a ScheduledFuture object which can be cancelled if necessary
There unfortunately are two errors in the answer of oxbow_lakes.
One is a simple declaration mistake (long time vs time: Long), but the second is some more subtle.
oxbow_lakes declares run as
def run = actors.Scheduler.execute(f)
This however leads to messages disappearing from time to time. That is: they are scheduled but get never send. Declaring run as
def run = f
fixed it for me. It's done the exact way in the ActorPing of lift-util.
The whole scheduler code becomes:
object Scheduler {
private lazy val sched = Executors.newSingleThreadedScheduledExecutor();
def schedule(f: => Unit, time: Long) {
sched.schedule(new Runnable {
def run = f
}, time - Platform.currentTime, TimeUnit.MILLISECONDS);
}
}
I tried to edit oxbow_lakes post, but could not save it (broken?), not do I have rights to comment, yet. Therefore a new post.