I'm facing a situation in Scala 2.11 where I need a parallel collection to use a specific threading mechanism. The Scala code depends on a platform written in Java that creates a ThreadPoolExecutor to manage threads; the Scala code needs to hook into the same pool it creates. From this doc, I'll need to set up a TaskSupport in the parallel collection. I can do this by constructing a ThreadPoolTaskSupport from the ThreadPoolExecutor. However, ThreadPoolTaskSupport is marked as deprecated (note that ThreadPoolExecutor is not marked as deprecated in Java).
Is this a case where I just need to use the deprecated Scala class, or is there another way around it?
You can use the ThreadPoolExecutor to create a scala.concurrent.ExecutionContext (since a ThreadPoolExecutor is a java.util.concurrent.ExecutorService), and then use that ExecutionContext to construct an ExecutionContextTaskSupport:
import scala.collection.parallel.{ExecutionContextTaskSupport, TaskSupport}
import scala.concurrent.ExecutionContext
import java.util.concurrent.ThreadPoolExecutor
def ectsFromTPE(tpe: ThreadPoolExecutor): TaskSupport =
  new ExecutionContextTaskSupport(
    environment = ExecutionContext.fromExecutorService(tpe)
  )
You can then use the resulting TaskSupport like any other:
import scala.collection.parallel.immutable.ParRange
val nn = ParRange(0, Int.MaxValue, 1, true)
nn.taskSupport = ectsFromTPE(???)
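For illustration, here is a self-contained sketch of wiring this up; the ThreadPoolExecutor below is constructed locally only so the example runs on its own, whereas in your application it would be the one created by the Java platform:

import java.util.concurrent.{LinkedBlockingQueue, ThreadPoolExecutor, TimeUnit}
import scala.collection.parallel.immutable.ParRange

// Stand-in for the executor the Java platform would normally hand you.
val tpe = new ThreadPoolExecutor(
  4, 4, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue[Runnable]())

val numbers = ParRange(0, 1000, 1, true)
numbers.taskSupport = ectsFromTPE(tpe)
println(numbers.map(_ * 2).sum)  // the parallel operations now run on tpe's threads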
Related
If I import collection.mutable.Stack should I use Stack[] or mutable.Stack[]() and are there any differences between the two?
You cannot do both at the same time. There are two possible ways.
1.
Importing scala.collection.mutable.Stack
import scala.collection.mutable.Stack
val stack: Stack[Int] = new Stack[Int]
2.
Importing scala.collection.mutable
import scala.collection.mutable
val stack: mutable.Stack[Int] = new mutable.Stack[Int]
In the first example you are importing scala.collection.mutable.Stack, so you can use the Stack object and its functions directly. In the second one, you are importing scala.collection.mutable, which gives you access to the members of the mutable package; you then refer to them as mutable.xxx.
If you generally follow the functional principle of using immutable data structures as much as possible, I would import scala.collection.mutable and use mutable.Stack to make it clear that you are using the mutable version in this particular case.
This convention is also described in the Scala documentation for collections.
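For example, with that convention a short usage would look like this (a minimal sketch; the stack and its contents are just illustrative):

import scala.collection.mutable

// The mutable. prefix makes it obvious at the call site that this is the mutable variant.
val undoStack = mutable.Stack[String]()
undoStack.push("insert 'a'")
undoStack.push("insert 'b'")
println(undoStack.pop())  // prints: insert 'b'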
In addition to the default execution context for Futures provided by Scala:
import scala.concurrent.ExecutionContext.Implicits.global
Play provides another execution context:
import play.api.libs.concurrent.Execution.Implicits.defaultContext
When to use each in Play for Scala?
You can find an answer here:
Play's internal execution context
That question is not a complete duplicate, but it is very close, and the answer there covers your question as well.
In short:
You must not use import scala.concurrent.ExecutionContext.Implicits.global in Play.
Response to the comment
The quote from the answer:
Instead, you would use play.api.libs.concurrent.Execution.Implicits.defaultContext, which uses an ActorSystem.

scala.concurrent.ExecutionContext.Implicits.global is an ExecutionContext defined in the Scala standard library. It is a special ForkJoinPool that uses the blocking method to handle potentially blocking code in order to spawn new threads in the pool. You really shouldn't use this in a Play application, as Play will have no control over it. It also has the potential to spawn a lot of threads and use a ton of memory, if you're not careful.
As a general rule, if you need an ExecutionContext inside a method or class, require it as an implicit parameter (Scala) or a normal parameter (Java). Convention is to put this parameter last.
This rule allows the caller/creator to control where/how/when asynchronous effects are evaluated.
The main exception to this rule is when you already have an ExecutionContext and do not wish for the caller/creator to be in control of where the effects are evaluated.
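As a minimal sketch of that rule (the method and names here are hypothetical), the ExecutionContext is requested as the last, implicit parameter, and the caller decides which one to supply, e.g. Play's defaultContext:

import scala.concurrent.{ExecutionContext, Future}

// The caller controls where the asynchronous work runs by choosing the implicit ec.
def fetchUserName(userId: String)(implicit ec: ExecutionContext): Future[String] =
  Future {
    // ... lookup goes here ...
    s"name-of-$userId"
  }

// At a Play call site you would bring Play's context into scope instead of the global one:
// import play.api.libs.concurrent.Execution.Implicits.defaultContext
// fetchUserName("42")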
Let's say I have a method that updates some date in the DB:
def updateLastConsultationDate(userId: String): Unit = ???
How can I easily throttle/debounce that method so that it won't run more than once an hour per user?
I'd like the simplest possible solution, not based on any event-bus, actor lib or persistence layer. I'd like an in-memory solution (and I am aware of the risks).
I've seen solutions for throttling in Scala based on Akka's Throttler, but starting to use actors just to throttle method calls really looks like overkill to me. Isn't there a very simple way to do that?
Edit: since it seems this wasn't clear enough, here's a visual representation of what I want, implemented in JS. As you can see, throttling is not only about filtering out subsequent calls, but also about postponing calls (also called trailing events in js/lodash/underscore). The solution I'm looking for can't be based on purely synchronous code.
This sounds like a great job for a ReactiveX-based solution. On Scala, Monix is my favorite one. Here's the Ammonite REPL session illustrating it:
import $ivy.`io.monix::monix:2.1.0` // I'm using Ammonite's magic imports; it's equivalent to adding "io.monix" %% "monix" % "2.1.0" to your libraryDependencies in SBT
import scala.concurrent.duration.DurationInt
import monix.reactive.subjects.ConcurrentSubject
import monix.reactive.Consumer
import monix.execution.Scheduler.Implicits.global
import monix.eval.Task
class DbUpdater {
  val publish = ConcurrentSubject.publish[String]
  val throttled = publish.throttleFirst(1.hour)
  val cancelHandle = throttled.consumeWith(
    Consumer.foreach(userId =>
      println(s"update your database with $userId here")))
    .runAsync

  def updateLastConsultationDate(userId: String): Unit = {
    publish.onNext(userId)
  }

  def stop(): Unit = cancelHandle.cancel()
}
Yes, and with Scala.js this code will work in the browser, too, if it's important for you.
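For completeness, a quick usage sketch of the class above (the user id is just illustrative):

val updater = new DbUpdater
// Only the first call within each one-hour window reaches the consumer;
// later calls in that window are dropped by throttleFirst.
updater.updateLastConsultationDate("user-1")
updater.updateLastConsultationDate("user-1") // throttled away
updater.stop()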
Since you ask for the simplest possible solution, you can store a mutable map, val lastUpdateByUser: mutable.Map[String, Long], which you would consult before allowing an update
if (lastUpdateByUser.getOrElse(userName, 0L) + 60*60*1000 < System.currentTimeMillis) updateLastConsultationDate(...)
and update when a user actually performs an update
lastUpdateByUser(userName) = System.currentTimeMillis
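Putting those two pieces together, here is a slightly fuller sketch of my own (using a concurrent TrieMap so racing callers don't corrupt the map; the names and the stubbed update are illustrative):

import scala.collection.concurrent.TrieMap

object ConsultationThrottle {
  // Stub standing in for the real method from the question.
  def updateLastConsultationDate(userId: String): Unit =
    println(s"updating last consultation date for $userId")

  private val lastUpdateByUser = TrieMap.empty[String, Long]
  private val OneHourMillis = 60 * 60 * 1000L

  // Runs the update only if more than an hour has passed for this user.
  // The check and the write are not atomic, so two racing callers could both
  // get through; acceptable for the "in-memory, aware of the risks" case.
  def maybeUpdate(userId: String): Unit = {
    val now = System.currentTimeMillis
    if (lastUpdateByUser.getOrElse(userId, 0L) + OneHourMillis < now) {
      lastUpdateByUser(userId) = now
      updateLastConsultationDate(userId)
    }
  }
}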
One way to throttle would be to maintain a count in a Redis instance. Doing so would ensure the DB isn't updated more often than allowed, no matter how many Scala processes you are running, because the state is stored outside of the process.
I'm using RxJava in my Scala project and I need to execute my Observable in a separate thread. I know in order to achieve this I need to call observeOn method on it and pass an instance of rx.lang.scala.Scheduler as an argument.
But how can I create that instance? I did not find any apparent way of instantiating the rx.lang.scala.Scheduler trait. For example, I have this code:
Observable.from(List(1,2,3)).observeOn(scheduler)
Can someone provide an example of working scheduler variable that will do the trick?
A trait is not instantiable.
You need to use one of the subclasses of the trait listed under "Known Subclasses" in the API documentation.
All schedulers are in the package
import rx.lang.scala.schedulers._
For blocking IO operations, use IO scheduler
Observable.from(List(1,2,3)).observeOn(IOScheduler())
For computational work, use computation scheduler
Observable.from(List(1,2,3)).observeOn(ComputationScheduler())
To execute on the current thread
Observable.from(List(1,2,3)).observeOn(ImmediateScheduler())
To execute on a new thread
Observable.from(List(1,2,3)).observeOn(NewThreadScheduler())
To queue work on the current thread, to be executed after the current work completes
Observable.from(List(1,2,3)).observeOn(TrampolineScheduler())
If you want to use your own custom thread pool:
import java.util.concurrent.Executors
import scala.concurrent.ExecutionContext

val threadPoolExecutor = Executors.newFixedThreadPool(2)
val executionContext = ExecutionContext.fromExecutor(threadPoolExecutor)
val customScheduler = ExecutionContextScheduler(executionContext)
Observable.from(List(1,2,3)).observeOn(customScheduler)
Currently we use:
val simpleOps: ExecutionContext = Akka.system(app).dispatchers.lookup("akka.actor.simple-ops")
We then implicitly import this when we create and compose our Futures. Other than that we currently don't use Akka.
There are easier ways to get an ExecutionContext, but I am not sure they would run on a Java fork/join pool, which is a bit more performant than a regular Java ExecutorService.
Is Akka the only way to get an FJP-powered ExecutionContext?
Are there any other ways to get an ExecutionContext that are as performant as Akka's FJP-backed MessageDispatcher?
Scala Futures already use a ForkJoinPool under the hood (specifically, they use a Scala-specific fork of Java's ForkJoinPool).
See https://github.com/scala/scala/blob/v2.10.1/src/library/scala/concurrent/impl/ExecutionContextImpl.scala#L1
In particular, notice that DefaultThreadFactory extends ForkJoinPool.ForkJoinWorkerThreadFactory:
class DefaultThreadFactory(daemonic: Boolean) extends ThreadFactory with ForkJoinPool.ForkJoinWorkerThreadFactory
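If you want explicit control over the pool without Akka, here is a minimal sketch of my own (not taken from the linked source) that backs an ExecutionContext with a plain java.util.concurrent.ForkJoinPool:

import java.util.concurrent.ForkJoinPool
import scala.concurrent.{ExecutionContext, Future}

// A fork/join pool with parallelism 8, wrapped as an ExecutionContext.
val fjPool = new ForkJoinPool(8)
implicit val fjContext: ExecutionContext = ExecutionContext.fromExecutorService(fjPool)

Future {
  42 // work scheduled on the fork/join pool
}.foreach(println)

Note that this uses Java's own ForkJoinPool rather than the Scala fork mentioned above, but it gives you an FJP-backed context without pulling in Akka.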