I need to place a timeout on a Future in a cross-platform JVM / JS application. This timeout would only be used in tests, so a blocking solution wouldn't be that bad.
I implemented the following snippet to make the future time out on the JVM:
def runWithTimeout[T](timeoutMillis: Int)(f: => Future[T]) : Future[T] =
Await.ready(f, Duration.create(timeoutMillis, java.util.concurrent.TimeUnit.MILLISECONDS))
This doesn't work on Scala.js, as it has no implementation of Await. Is there any other solution to add a Timeout to a Future that works in both Scala.js and Scala JVM?
Your code doesn't really add a timeout to the existing future; that's not possible. What you're doing is setting a timeout for waiting on that future at that specific point. You can reproduce that in a different, fully asynchronous way by creating a future that:
resolves to the result of f if it finishes within the given timeout
otherwise resolves to a failed TimeoutException
import scala.concurrent._
import scala.concurrent.duration.FiniteDuration
import scala.scalajs.js

def timeoutFuture[T](f: Future[T], timeout: FiniteDuration)(
    implicit ec: ExecutionContext): Future[T] = {
  val p = Promise[T]()
  // Fail the promise when the timer fires, unless f has already completed it
  val timeoutHandle = js.timers.setTimeout(timeout) {
    p.tryFailure(new TimeoutException)
  }
  f.onComplete { result =>
    p.tryComplete(result)
    // f has completed, so the pending timer is no longer needed
    js.timers.clearTimeout(timeoutHandle)
  }
  p.future
}
The above is written for Scala.js. You can write an equivalent one for the JVM, and place them in platform-dependent sources.
Alternatively, you can probably write something equivalent in terms of java.util.Timer, which is supported both on JVM and JS.
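As a rough sketch of that java.util.Timer variant (the object name TimerTimeout and the single shared daemon timer are assumptions made for illustration, not part of the original answer), the same promise-racing idea could look like this:

```scala
import java.util.{Timer, TimerTask}
import java.util.concurrent.TimeoutException
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration.FiniteDuration

object TimerTimeout {
  // Assumption: one shared daemon timer is good enough for test code.
  private val timer = new Timer(true)

  def timeoutFuture[T](f: Future[T], timeout: FiniteDuration)(
      implicit ec: ExecutionContext): Future[T] = {
    val p = Promise[T]()
    // Fails the promise when the timer fires, unless f completed it first.
    val task = new TimerTask {
      override def run(): Unit = {
        p.tryFailure(new TimeoutException(s"Future timed out after $timeout"))
        ()
      }
    }
    timer.schedule(task, timeout.toMillis)
    f.onComplete { result =>
      p.tryComplete(result)
      task.cancel() // f has completed, so the timer task is no longer needed
    }
    p.future
  }
}
```

Whichever of the two completes the promise first wins; the later tryComplete or tryFailure call is simply ignored.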
Suppose I've got a simple blocking HTTP client like this:
def httpGet(url: URL): Future[String] = ???
Now I want to use httpGet to call a server with a request rate limit; e.g. 1000 requests/sec. Since the standard library does not provide a rate limiter I will use RateLimiter of Guava:
import com.google.common.util.concurrent.RateLimiter
import scala.concurrent.{ExecutionContext, Future, blocking}
def throttled[A](fut: => Future[A], rateLimiter: RateLimiter)
(implicit ec: ExecutionContext): Future[A] = {
Future(blocking(rateLimiter.acquire())).flatMap(_ => fut)
}
implicit val ec = ExecutionContext.global
val rateLimiter = RateLimiter.create(1000.0) // permits per second; Java methods cannot take named arguments
val throttledFuture = throttled(httpGet(url), rateLimiter)
Does it make sense?
Would you use another execution context to execute rateLimiter.acquire()?
Since you're using blocking around the acquire, it's okay, IMO.
Depending on how much work gets done in the thread which calls httpGet, if you're on Scala 2.13 it might be worth considering using the parasitic execution context.
Style nit, but it might be worth taking advantage of Scala's ability to use braces around a single-argument parameter list:
def throttled[A](rateLimiter: RateLimiter)(fut: => Future[A])(implicit ec: ExecutionContext): Future[A]
val throttledFuture = throttled(rateLimiter) { httpGet(url) }
I have a use case in Databricks where an API call has to be made for each record of a dataset of URLs. The dataset has around 100K records.
The maximum allowed concurrency is 3.
I did the implementation in Scala and ran it in a Databricks notebook. Apart from the one element left pending in the queue, I feel something is missing here.
Are a blocking queue and a thread pool the right way to tackle this problem?
In the code below I have replaced reading from the dataset with sampling from a Seq.
Any help or thoughts will be much appreciated.
import java.time.LocalDateTime
import java.util.concurrent.{ArrayBlockingQueue,BlockingQueue}
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit;
var inpQueue:BlockingQueue[(Int, String)] = new ArrayBlockingQueue[(Int, String)](1)
val inpDS = Seq((1,"https://google.com/2X6barD"), (2,"https://google.com/3d9vCgW"), (3,"https://google.com/2M02Xz0"), (4,"https://google.com/2XOu2uL"), (5,"https://google.com/2AfBWF0"), (6,"https://google.com/36AEKsw"), (7,"https://google.com/3enBxz7"), (8,"https://google.com/36ABq0x"), (9,"https://google.com/2XBjmiF"), (10,"https://google.com/36Emlen"))
val pool = Executors.newFixedThreadPool(3)
var i = 0
inpDS.foreach{
ix => {
inpQueue.put(ix)
val t = new ConsumerAPIThread()
t.setName("MyThread-"+i+" ")
pool.execute(t)
}
i = i+1
}
println("Final Queue Size = " +inpQueue.size+"\n")
class ConsumerAPIThread() extends Thread
{
var name =""
override def run()
{
val urlDetail = inpQueue.take()
print(this.getName()+" "+ Thread.currentThread().getName() + " popped "+urlDetail+" Queue Size "+inpQueue.size+" \n")
triggerAPI((urlDetail._1, urlDetail._2))
}
def triggerAPI(params:(Int,String)){
try{
val result = scala.io.Source.fromURL(params._2)
println("" +result)
}catch{
case ex:Exception => {
println("Exception caught")
}
}
}
def ConsumerAPIThread(s:String)
{
name = s;
}
}
So, you have two requirements: the functional one is that you want to process asynchronously the items in a list, the non-functional one is that you want to not process more than three items at once.
Regarding the latter, the nice thing is that, as you have already shown in your question, Java natively exposes a nicely packaged Executor that runs tasks on a thread pool with a fixed size, elegantly allowing you to cap the concurrency level if you work with threads.
Moving to the functional requirement, Scala helps by having something that does precisely that as part of its standard API. In particular it uses scala.concurrent.Future, so in order to use it we'll have to reframe triggerAPI in terms of Future. The content of the function is not particularly relevant, so we'll mostly focus on its (revised) signature for now:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext
def triggerAPI(params: (Int, String))(implicit ec: ExecutionContext): Future[Unit] =
Future {
// some code that takes some time to run...
}
Notice that triggerAPI now returns a Future. A Future can be thought of as a read handle to something that is eventually going to be computed. In particular, this is a Future[Unit], where Unit stands for "we don't particularly care about the output of this function, but mostly about its side effects".
Furthermore, notice that the method now takes an implicit parameter, namely an ExecutionContext. The ExecutionContext is used to provide Futures with some form of environment where the computation happens. Scala has an API to create an ExecutionContext from a java.util.concurrent.ExecutorService, so this will come in handy to run our computation on the fixed thread pool, running no more than three callbacks at any given time.
Before moving forward, if you have questions about Futures, ExecutionContexts and implicit parameters, the Scala documentation is your best source of knowledge.
Now that we have the new triggerAPI method, we can use Future.traverse (the latest Scala version at the time of writing is 2.13, but to the best of my knowledge Spark users are stuck on 2.12 for the time being).
The tl;dr of Future.traverse is that it takes some form of container and a function that takes the items in that container and returns a Future of something else. The function will be applied to each item in the container and the result will be a Future of the container of the results. In your case: the container is a List, the items are (Int, String) and the something else you return is a Unit.
This means that you can simply call it like this:
Future.traverse(inpDS)(triggerAPI)
And triggerAPI will be applied to each item in inpDS.
By making sure that the execution context backed by the thread pool is in the implicit scope when calling Future.traverse, the items will be processed with the desired thread pool.
The result of the call is Future[List[Unit]], which is not very interesting and can simply be discarded (as you are only interested in the side effects).
That was a lot of talk; if you want to play around with the code I described, you can do so on Scastie.
For reference, this is the whole implementation:
import java.util.concurrent.{ExecutorService, Executors}
import scala.concurrent.duration.DurationLong
import scala.concurrent.Future
import scala.concurrent.{ExecutionContext, ExecutionContextExecutorService}
import scala.util.control.NonFatal
import scala.util.{Failure, Success, Try}

val datasets = List(
  (1, "https://google.com/2X6barD"),
  (2, "https://google.com/3d9vCgW"),
  (3, "https://google.com/2M02Xz0"),
  (4, "https://google.com/2XOu2uL"),
  (5, "https://google.com/2AfBWF0"),
  (6, "https://google.com/36AEKsw"),
  (7, "https://google.com/3enBxz7"),
  (8, "https://google.com/36ABq0x"),
  (9, "https://google.com/2XBjmiF")
)

val executor: ExecutorService = Executors.newFixedThreadPool(3)
implicit val executionContext: ExecutionContextExecutorService = ExecutionContext.fromExecutorService(executor)

def triggerAPI(params: (Int, String))(implicit ec: ExecutionContext): Future[Unit] =
  Future {
    val (index, _) = params
    println(s"+ started processing $index")
    val start = System.nanoTime() / 1000000
    Iterator.from(0).map(_ + 1).drop(100000000).take(1).toList.head // a noticeably slow operation
    val end = System.nanoTime() / 1000000
    val duration = (end - start).millis
    println(s"- finished processing $index after $duration")
  }

Future.traverse(datasets)(triggerAPI).onComplete {
  case result =>
    println("* processing is over, shutting down the executor")
    executionContext.shutdown()
}
You need to shut down the Executor once your job is done; otherwise it will keep waiting for new tasks.
Try adding pool.shutdown() at the end of your program.
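A minimal sketch of that shutdown sequence, assuming a fixed pool of three threads as in the question (PoolShutdownDemo, runAll and the 10-second limit are made-up names and values for illustration): shutdown() stops the pool from accepting new tasks while letting queued ones finish, and awaitTermination blocks until they are done or the timeout expires.

```scala
import java.util.concurrent.{Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object PoolShutdownDemo {
  // Returns (whether all tasks finished before the timeout, how many tasks ran).
  def runAll(taskCount: Int): (Boolean, Int) = {
    val pool = Executors.newFixedThreadPool(3) // at most 3 tasks run at once
    val processed = new AtomicInteger(0)
    (1 to taskCount).foreach { _ =>
      pool.execute(new Runnable {
        override def run(): Unit = { processed.incrementAndGet(); () }
      })
    }
    pool.shutdown() // no new tasks accepted; already-submitted tasks still run
    val finished = pool.awaitTermination(10, TimeUnit.SECONDS)
    (finished, processed.get)
  }
}
```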
I have a Future in a cross-platform JVM / JS application. On the JVM, the future is polled in the following way:
val load = Future(loadSometing())
if (load.isCompleted) {
val loaded = Await.result(load, Duration.Inf)
// now process it
}
This cannot work with Scala.js, as Scala.js does not implement Await. In my case, however, I am not using Await to wait, only to get a result I know is already there. I know a proper solution is to make the code fully async and to perform the processing in the Future handler (map or onComplete), but even knowing this is not the proper way, can a Future's result be polled somehow in Scala.js?
Use Future.value to poll a Future without waiting/blocking:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.scalajs.js
val f = Future { 42 }
println(f.value)
js.timers.setTimeout(500) {
println(f.value)
}
will print
None
Some(Success(42))
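Since Future.value returns an Option[Try[T]], a polling helper can distinguish all three states by pattern matching. The following is only a sketch (FuturePolling and describe are hypothetical names) and uses nothing platform-specific:

```scala
import scala.concurrent.Future
import scala.util.{Failure, Success}

object FuturePolling {
  // Inspects the future's current state without waiting or blocking.
  def describe[T](f: Future[T]): String = f.value match {
    case None             => "pending"
    case Some(Success(v)) => s"done: $v"
    case Some(Failure(e)) => s"failed: ${e.getMessage}"
  }
}
```

For an already completed future such as Future.successful(42), describe returns "done: 42"; for the future of a promise that nobody has completed yet, it returns "pending".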
I have a sequence of functions that return a future. I want to execute them sequentially i.e. after the first function future is complete, execute the next function and so on. Is there a way to do it?
ops: Seq[() => Future[Unit]]
You can combine all the futures into a single one with a foldLeft and flatMap:
def executeSequentially(ops: Seq[() => Future[Unit]])(
implicit exec: ExecutionContext
): Future[Unit] =
ops.foldLeft(Future.successful(()))((cur, next) => cur.flatMap(_ => next()))
foldLeft ensures the order from left to right and flatMap gives sequential execution. Functions are executed with the ExecutionContext, so calling executeSequentially is not blocking. And you can add callbacks or await on the resulting Future when/if you need it.
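To see the ordering in action, here is a small JVM-only sketch (it uses Await and Thread.sleep purely for the demonstration; SequentialDemo and completionOrder are hypothetical names). Earlier operations deliberately take longer, yet they still complete in list order because each one is only started after the previous future has finished:

```scala
import scala.collection.mutable.ListBuffer
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object SequentialDemo {
  def executeSequentially(ops: Seq[() => Future[Unit]])(
      implicit exec: ExecutionContext): Future[Unit] =
    ops.foldLeft(Future.successful(()))((cur, next) => cur.flatMap(_ => next()))

  // Runs three operations and returns the order in which they completed.
  def completionOrder(): List[Int] = {
    implicit val ec: ExecutionContext = ExecutionContext.global
    val log = ListBuffer.empty[Int]
    val ops: Seq[() => Future[Unit]] = Seq(1, 2, 3).map { i =>
      () => Future {
        Thread.sleep((4 - i) * 20L) // op 1 is the slowest, op 3 the fastest
        log.synchronized { log += i }
        ()
      }
    }
    Await.result(executeSequentially(ops), 5.seconds)
    log.synchronized { log.toList } // List(1, 2, 3)
  }
}
```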
If you are using Twitter Futures, then I guess you won't need to pass ExecutionContext, but the general idea with foldLeft and flatMap should still work.
If given a Seq[Future[T]] you can convert it to a Future[Seq[T]] like so:
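val a: Seq[Future[T]] = ???
val result: Future[Seq[T]] = Future.sequence(a)
A little less boilerplate than the above :) Note, though, that the futures in a Seq[Future[T]] have already been started, so this only gathers their results; it does not run them one after another.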
I believe this should do it:
import scala.concurrent.{Await, Future}
import scala.concurrent.duration.Duration
def runSequentially(ops: Seq[() => Future[Unit]]): Unit = {
ops.foreach(f => Await.result(f(), Duration.Inf))
}
If you want to wait less than Duration.Inf, or stop at the first failure, that should be easy to add.
I have a Java Future object which I would like to convert into a Scala Future.
Looking at the j.u.c.Future API, there is nothing much that I could use other than the isDone method. Is this isDone method blocking?
Currently this is what I have in my mind:
val p = Promise()
if (javaFuture.isDone())
p.success(javaFuture.get)
Is there a better way to do this?
Starting with Scala 2.13, the standard library includes scala.jdk.FutureConverters, which provides conversions between Java's CompletionStage/CompletableFuture and Scala's Future:
import scala.jdk.FutureConverters._
// val javaFuture = java.util.concurrent.CompletableFuture.completedFuture(12)
val scalaFuture = javaFuture.asScala
// scalaFuture: scala.concurrent.Future[Int] = Future(Success(12))
How about just wrapping it (I'm assuming there's an implicit ExecutionContext here):
val scalaFuture = Future {
javaFuture.get
}
EDIT:
A simple polling strategy could look like this (with java.util.concurrent.Future imported as F):
def pollForResult[T](f: F[T]): Future[T] = Future {
Thread.sleep(500)
f
}.flatMap(f => if (f.isDone) Future { f.get } else pollForResult(f))
This will check if the Java future is done every 500ms. Obviously the total blocking time is the same as above (rounded up to the nearest 500ms) but this solution will allow other tasks to be interleaved in the same thread of the ExecutionContext.
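For comparison, the plain wrapping approach can be sketched end to end as follows. The object name JavaFutureBridge and the demo executor are assumptions for illustration; the blocking marker is an optional hint that lets execution contexts such as the global one compensate for the thread parked in get():

```scala
import java.util.concurrent.{Callable, Executors, Future => JFuture}
import scala.concurrent.{ExecutionContext, Future, blocking}

object JavaFutureBridge {
  // Occupies one thread of the execution context until the Java future completes.
  def toScala[T](jf: JFuture[T])(implicit ec: ExecutionContext): Future[T] =
    Future(blocking(jf.get()))

  // Hypothetical demo: obtain a Java Future from a single-thread executor.
  def demo()(implicit ec: ExecutionContext): Future[Int] = {
    val pool = Executors.newSingleThreadExecutor()
    val jf: JFuture[Int] = pool.submit(new Callable[Int] {
      override def call(): Int = 12
    })
    pool.shutdown() // the already-submitted task still runs to completion
    toScala(jf)
  }
}
```

If get() throws, for example an ExecutionException wrapping the task's failure, the resulting Scala Future fails with that exception.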
For those reading this question now, if you're using Scala 2.13 and above, use:
import scala.jdk.FutureConverters._
And convert using completableFuture.asScala
If you're using Scala 2.12 and below, use
import scala.compat.java8.FutureConverters._
And convert using: toScala(completableFuture) or completableFuture.toScala
Also, in Scala 2.12 make sure you're using the correct artifact:
org.scala-lang.modules:scala-java8-compat_2.12:0.9.0
Now, if for some reason what you have is actually a Future and not a CompletableFuture, which should be a rare case nowadays, first follow one of the answers to "Transform Java Future into a CompletableFuture".
Use FutureConverters (from the scala-java8-compat module) to convert a Java Future to a Scala Future.
Consider an example that converts a Java Future[Int] to a Scala Future[Int]:
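import java.util.concurrent.CompletableFuture
import java.util.function.Supplier
import scala.compat.java8.FutureConverters
import scala.concurrent.Future

val javaFuture: java.util.concurrent.Future[Int] = ???
// Some method call which returns Java's Future[Int]

// supplyAsync runs the supplier on a pool thread, which blocks in javaFuture.get until the result is ready
val scalaFuture: Future[Int] =
  FutureConverters.toScala(CompletableFuture.supplyAsync(new Supplier[Int] {
    override def get(): Int = javaFuture.get
  }))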
The same approach works for any custom type instead of Int.