What is the most efficient way of running futures in Scala?

Currently we use :
val simpleOps: ExecutionContext = Akka.system(app).dispatchers.lookup("akka.actor.simple-ops")
Then we implicitly import this when we create and compose our futures. Other than that, we currently don't use Akka.
There are easier ways to get an ExecutionContext, but I am not sure they would run on Java's Fork/Join pool, which is somewhat more performant than a regular Java ExecutorService.
Is Akka the only way to get an FJP-powered ExecutionContext?
Are there any other ways to get an ExecutionContext that is as performant as Akka's FJP-backed MessageDispatcher?

Scala futures already use ForkJoinPool under the hood (specifically, they use a Scala-specific fork of Java's ForkJoinPool).
See https://github.com/scala/scala/blob/v2.10.1/src/library/scala/concurrent/impl/ExecutionContextImpl.scala#L1
In particular, notice that DefaultThreadFactory extends ForkJoinPool.ForkJoinWorkerThreadFactory:
class DefaultThreadFactory(daemonic: Boolean) extends ThreadFactory with ForkJoinPool.ForkJoinWorkerThreadFactory
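If you want an FJP-backed ExecutionContext without pulling in Akka, a minimal sketch (the pool size here is an arbitrary choice) could look like this:
import java.util.concurrent.ForkJoinPool
import scala.concurrent.{ ExecutionContext, ExecutionContextExecutorService }

// Build an ExecutionContext directly on top of a JDK ForkJoinPool.
val fjPool = new ForkJoinPool(Runtime.getRuntime.availableProcessors())
implicit val fjContext: ExecutionContextExecutorService =
  ExecutionContext.fromExecutorService(fjPool)
Note that scala.concurrent.ExecutionContext.Implicits.global is already FJP-backed by default, so for many applications no custom wiring is needed at all.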

Related

Do I need to use ThreadPoolTaskSupport?

I'm facing a situation in Scala 2.11 where I need a parallel collection to use a specific threading mechanism. The Scala code depends on a platform written in Java that creates a ThreadPoolExecutor to manage threads; the Scala code needs to hook into the same pool it creates. From this doc, I'll need to set up a TaskSupport in the parallel collection. I can do this by constructing a ThreadPoolTaskSupport from the ThreadPoolExecutor. However, ThreadPoolTaskSupport is marked as deprecated (note that ThreadPoolExecutor is not marked as deprecated in Java).
Is this a case where I just need to use the deprecated Scala class, or is there another way around it?
You can use the ThreadPoolExecutor to create a scala.concurrent.ExecutionContext (since a ThreadPoolExecutor is a java.util.concurrent.ExecutorService) and then use that ExecutionContext to construct an ExecutionContextTaskSupport:
import scala.collection.parallel.{ ExecutionContextTaskSupport, TaskSupport }
import scala.concurrent.ExecutionContext
import java.util.concurrent.ThreadPoolExecutor

def ectsFromTPE(tpe: ThreadPoolExecutor): TaskSupport =
  new ExecutionContextTaskSupport(
    environment = ExecutionContext.fromExecutorService(tpe)
  )
You can then use the resulting TaskSupport like any other:
import scala.collection.parallel.immutable.ParRange
val nn = ParRange(0, Int.MaxValue, 1, true)
nn.taskSupport = ectsFromTPE(???)
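For illustration only, here is a hypothetical stand-in for the platform-provided pool (in the question's setting the ThreadPoolExecutor comes from the Java platform, so you would hook into that instance instead of creating one):
import java.util.concurrent.{ LinkedBlockingQueue, ThreadPoolExecutor, TimeUnit }

// Hypothetical pool standing in for the one created by the Java platform.
val platformPool = new ThreadPoolExecutor(2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue[Runnable]())
val pr = ParRange(0, 1000, 1, false)
pr.taskSupport = ectsFromTPE(platformPool)
println(pr.map(_ * 2).sum)  // the parallel map now runs on platformPool's threads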

Defining the future implicit context in Play for Scala

In addition to the future's execution context provided by Scala:
import scala.concurrent.ExecutionContext.Implicits.global
Play provides another execution context:
import play.api.libs.concurrent.Execution.Implicits.defaultContext
When to use each in Play for Scala?
You can find an answer here:
Play's internal execution context
That question is not a complete duplicate, but it is very close, and the answer there covers your question as well.
In short:
You must not use import scala.concurrent.ExecutionContext.Implicits.global in Play.
Response to the comment
The quote from the answer:
Instead, you would use play.api.libs.concurrent.Execution.Implicits.defaultContext, which uses an ActorSystem.
scala.concurrent.ExecutionContext.Implicits.global is an ExecutionContext defined in the Scala standard library. It is a special ForkJoinPool that uses the blocking method to handle potentially blocking code in order to spawn new threads in the pool. You really shouldn't use this in a Play application, as Play will have no control over it. It also has the potential to spawn a lot of threads and use a ton of memory, if you're not careful.
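As a small sketch of what that recommendation looks like in practice inside a Play 2.x application (the computation here is just a placeholder):
import play.api.libs.concurrent.Execution.Implicits.defaultContext
import scala.concurrent.Future

// Runs on Play's default dispatcher rather than Scala's global pool.
val answer: Future[Int] = Future { 40 + 2 }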
As a general rule, if you need an ExecutionContext inside a method or class, require it as an implicit parameter (Scala) or a normal parameter (Java). Convention is to put this parameter last.
This rule allows the caller/creator to control where/how/when asynchronous effects are evaluated.
The main exception to this rule is when you already have an ExecutionContext and do not wish for the caller/creator to be in control of where the effects are evaluated.
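A minimal sketch of that convention (the method name and body are hypothetical, just to show the shape):
import scala.concurrent.{ ExecutionContext, Future }

// The caller supplies the ExecutionContext, so it decides where this work runs.
def fetchLength(url: String)(implicit ec: ExecutionContext): Future[Int] =
  Future(url.length)  // placeholder work standing in for real I/O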

Scalaz and main method

I'm trying to learn Scalaz with a toy project of mine, I used monads in Haskell and now I want to learn how to use them in Scala with Scalaz.
The big question is, how does one use the IO monad in Scala's main method?
In Haskell, the main function has type IO (), while in Scala main returns Unit.
The solution I found so far is to create another function foo of type IO[Unit] and call foo.unsafePerformIO() in the main method. But this makes me cringe.
What could be a solution?
Scalaz provides a SafeApp trait that allows you to replace Scala's side-effectful main method with a wrapper that looks more like Haskell's main:
import scalaz._, Scalaz._, effect.{ IO, SafeApp }

object MyMain extends SafeApp {
  override def runl(args: List[String]): IO[Unit] = IO(println("hello world"))
}
Now MyMain can be used like any other JVM class with a static main.
I don't personally use SafeApp much, but it's there if you want to avoid calling unsafePerformIO by hand.
Scala's native main method is meant to be side-effectful; calling unsafePerformIO in it is completely safe.
In fact, considering most Scala projects aren't 100% pure/Scalaz code, this approach is probably the most idiomatic one. Someone might have provided an "elegant" wrapper for it, but it wouldn't add any value beyond cosmetics. And, again, most of the time you'd be embedding a Scalaz IO action inside more mainstream, non-pure, and possibly even non-functional Scala code anyway.
Furthermore, in general, and even in Haskell, the unsafe functions are typically just makeSureYouKnowWhatYoureDoing functions.
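A minimal sketch of that plain-main approach, mirroring what the question describes (names are illustrative):
import scalaz.effect.IO

object MyMainPlain {
  def program: IO[Unit] = IO(println("hello world"))

  // Run the IO action once, at the edge of the program.
  def main(args: Array[String]): Unit = program.unsafePerformIO()
}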

How do I create a `Scheduler` for `observeOn` method?

I'm using RxJava in my Scala project and I need to execute my Observable in a separate thread. I know in order to achieve this I need to call observeOn method on it and pass an instance of rx.lang.scala.Scheduler as an argument.
But how can I create that instance? I did not find any apparent ways of instantiating of rx.lang.scala.Scheduler trait. For example, I have this code:
Observable.from(List(1,2,3)).observeOn(scheduler)
Can someone provide an example of a working scheduler variable that will do the trick?
A trait is not instantiable.
You need to use one of the subclasses of the trait listed under "Known Subclasses" in the API documentation.
All schedulers are in the following package:
import rx.lang.scala.schedulers._
For blocking IO operations, use the IO scheduler:
Observable.from(List(1,2,3)).observeOn(IOScheduler())
For computational work, use the computation scheduler:
Observable.from(List(1,2,3)).observeOn(ComputationScheduler())
To execute on the current thread:
Observable.from(List(1,2,3)).observeOn(ImmediateScheduler())
To execute on a new thread:
Observable.from(List(1,2,3)).observeOn(NewThreadScheduler())
To queue work on the current thread, to be executed after the current task:
Observable.from(List(1,2,3)).observeOn(TrampolineScheduler())
If you want to use your own custom thread pool:
import java.util.concurrent.Executors
import scala.concurrent.ExecutionContext

val threadPoolExecutor = Executors.newFixedThreadPool(2)
val executionContext = ExecutionContext.fromExecutor(threadPoolExecutor)
val customScheduler = ExecutionContextScheduler(executionContext)
Observable.from(List(1,2,3)).observeOn(customScheduler)
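To check which thread the emissions actually land on, a small sketch like this can help:
// Print the thread name for each value delivered via the custom scheduler.
Observable.from(List(1, 2, 3))
  .observeOn(customScheduler)
  .subscribe(i => println(s"$i on ${Thread.currentThread.getName}"))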

Does Scala Futures/ExecutionContext have something like C#'s ConfigureAwait

C#'s Tasks have ConfigureAwait(false) for libraries to prevent synchronization to (for example) the UI-thread which is not always necessary:
http://msdn.microsoft.com/en-us/magazine/hh456402.aspx
In .NET I believe there can only be one SynchronizationContext, so it's clear on which thread pool a Task should execute its continuation.
For a library, when you can't assume the caller is in a web request (in .NET, HttpContext.Current.Items flows), a command-line app (normal multithreading), or XAML/Windows Forms (a single UI thread), it's almost always better to use ConfigureAwait(false), so the awaiter knows it can just execute the continuation on whatever thread was used to call it. (This is only bad if the library runs blocking code, which could lead to thread starvation on the thread pool where the initial workload was started; let's assume we don't do that.)
The point is that, from a library's perspective, you don't want to use a thread from the caller's thread pool to synchronize a continuation; you just want the continuation to run on whatever thread is available. This saves a context switch and keeps load off the UI thread, for example.
In Scala, every operation on a Future (notably map) needs an ExecutionContext (passed implicitly). This makes managing thread pools incredibly easy, which I like a lot more than .NET's somewhat strange TaskFactorys (which nobody seems to use; everyone just uses the default TaskFactory).
My question is: does Scala have the same problem as .NET with respect to sometimes-unnecessary context switches, and if so, is there a way, similar to ConfigureAwait, to avoid them?
A concrete example in Scala where I wonder about this:
def trace[T](message: => String)(block: => Future[T]): Future[T] = {
  if (!logger.isTraceEnabled) block
  else {
    val startedAt = System.currentTimeMillis()
    block.map { result =>
      val timeTaken = System.currentTimeMillis() - startedAt
      logger.trace(s"$message took ${timeTaken}ms")
      result
    }
  }
}
I'm using Play and I generally import Play's default implicit ExecutionContext.
The map on block needs to run on an execution context.
If I wrote this piece of Scala in a library, I would add an implicit executionContext parameter:
def trace[T](message: => String)(block: => Future[T])(implicit executionContext: ExecutionContext): Future[T] = {
instead of importing Play's default ExecutionContext in the library.
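Putting the two pieces together, a minimal sketch of the library-friendly version (logger is whatever logging facade the surrounding code already uses):
import scala.concurrent.{ ExecutionContext, Future }

def trace[T](message: => String)(block: => Future[T])(implicit executionContext: ExecutionContext): Future[T] = {
  if (!logger.isTraceEnabled) block
  else {
    val startedAt = System.currentTimeMillis()
    block.map { result =>
      val timeTaken = System.currentTimeMillis() - startedAt
      logger.trace(s"$message took ${timeTaken}ms")
      result
    }
  }
}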