Using a separate ExecutionContext for Slick

Using a separate ExecutionContext for Slick - scala

I'm using Play 2.5 with Slick. The docs on this topic simply state that everything is managed by Slick and Play's Slick module. This example however, prints Dispatcher[akka.actor.default-dispatcher]:
class MyDbioImpl #Inject()(protected val dbConfigProvider: DatabaseConfigProvider)(implicit ec: ExecutionContext)
with HasDatabaseConfigProvider[JdbcProfile] {
import profile.api._
def selectSomeStuff(): Future[MyResult] = db.run {
println(ec)
[...]
}
}
Since the execution context is printed inside db.run, it seems like all of my database access will also be executed on the default execution context.
I found this answer to an older question which, at the time, solved the problem. But this solution is since deprecated, it is suggested to use dependency injection to acquire the application context. When I try to do this, I get an error saying that play.akka.actor.slick-context does not exist...
class MyDbioProvider #Inject()(actorSystem: ActorSystem,
protected val dbConfigProvider: DatabaseConfigProvider)
extends Provider[MyDbioImpl] {
override def get(): MyDbioImpl = {
val ec = actorSystem.dispatchers.lookup("play.akka.actor.slick-context")
new MyDbioImpl(dbConfigProvider)(ec)
}
}
Edit:
Is Slick's execution context a "normal" execution context which is defined in a config file somewhere? Where does the context switch take place? I assumed the entry point to the "database world" is at db.run.

According to Slick:
Every Database contains an AsyncExecutor that manages the thread pool
for asynchronous execution of Database I/O Actions. Its size is the
main parameter to tune for the best performance of the Database
object. It should be set to the value that you would use for the size
of the connection pool in a traditional, blocking application (see
About Pool Sizing in the HikariCP documentation for further
information). When using Database.forConfig, the thread pool is
configured directly in the external configuration file together with
the connection parameters. If you use any other factory method to get
a Database, you can either use a default configuration or specify a
custom AsyncExecutor.
Basically it says you don't need to create an isolated ExecutionContext since Slick already isolates a thread pool internally. Any call you make to Slick is non-blocking thus you should use the default ExecutionContext.

Slick's implementation of this can be seen in the BasicBackend.scala file: the runInContextSafe method. The code is as follows:
val promise = Promise[R]
val runnable = new Runnable {
override def run() = {
try {
promise.completeWith(runInContextInline(a, ctx, streaming, topLevel, stackLevel = 1))
} catch {
case NonFatal(ex) => promise.failure(ex)
}
}
}
DBIO.sameThreadExecutionContext.execute(runnable)
promise.future
As shown above, Promise is used here, and then its code is executed quickly using its internal thread pool, and the Future object of the Promise is returned. Therefore, when Await.result/ready is executed, the Promise here is probably already executed by Slick's internal thread, so it is enough to get the result, and it is possible to execute Await.result/ready in an environment such as Play. Of non-blocking.
For details, please refer to Scala's documentation on Future and Promise: https://docs.scala-lang.org/overviews/core/futures.html

Related

Understanding Apache Spark RDD task serialization

I am trying to understand how task serialization works in Spark and am a bit confused by some mixed results I'm getting in a test I've written.
I have some test code (simplified for sake of post) that does the following over more than one node:
object TestJob {
def run(): Unit = {
val rdd = ...
val helperObject = new Helper() // Helper does NOT impl Serializable and is a vanilla class
rdd.map(element => {
helperObject.transform(element)
}).collect()
}
}
When I execute run(), the job bombs out with a "task not serializable" exception as expected since helperObject is not serializable. HOWEVER, when I alter it a little, like this:
trait HelperComponent {
val helperObject = new Helper()
}
object TestJob extends HelperComponent {
def run(): Unit = {
val rdd = ...
rdd.map(element => {
helperObject.transform(element)
}).collect()
}
}
The job executes successfully for some reason. Could someone help me to understand why this might be? What exactly gets serialized by Spark and sent to the workers in each case above?
I am using Spark version 2.1.1.
Thank you!

Could someone help me to understand why this might be?
In your first snippet, helperObject is a local variable declared inside run. As such, it will be closed over (lifted) by the function such that where ever this code executes, all information would be available, and because of that Sparks ClosureCleaner yells at you for trying to serialize it.
In your second snippet, the value is no longer a local variable in the method scope, it is part of the class instance (technically, this is an object declaration but it will be transformed into a JVM class after all).
This is meaningful in Spark for the reason that all worker nodes in the cluster contain the JARs needed to execute your code. Thus, instead of serializing TestObject in its entirety for rdd.map, when Spark spins up an Executor process in one of your workers, it will load TestObject locally via a ClassLoader, and create an instance of it, just like every other JVM class in a non distributed application.
To conclude, the reason you don't see this blowing up is because the class is no longer serialized due to the changes in the way you've declared the type instance.

How do I create thread pools in Play 2.5.x?

I am currently on Play 2.4.2 and have successfully created thread pools using the following below:
package threads
import scala.concurrent.ExecutionContext
import play.api.libs.concurrent.Akka
import play.api.Play.current
object Contexts {
implicit val db: ExecutionContext = Akka.system.dispatchers.lookup("contexts.db-context")
implicit val pdf: ExecutionContext = Akka.system.dispatchers.lookup("contexts.pdf-context")
implicit val email: ExecutionContext = Akka.system.dispatchers.lookup("contexts.email-context")
}
and then in the code with...
Future{....}(threads.Contexts.db)
We are ready to upgrade to Play 2.5 and having trouble understanding the documentation. The documentation for 2.4.2 uses Akka.system.dispatchers.lookup, which we use without issue. The documentation for 2.5.x uses app.actorSystem.dispatchers.lookup. As far as I know, I have to inject the app into a Class, not an Object. Yet the documentation clearly uses an Object for the example!
Has anyone successfully created thread pools in Play 2.5.x that can help out? Is it as simple as changing Contexts to a class, then injecting it wherever I would like to use this threading? Seems odd since to use the default ExecutionContext I just have to make an implicit import.
Also, we are using Play scala.

If you simply change your Contexts to a class, then you will have to deal with how to get an instance of that class.
In my opinion, if you have a number of thread pools that you want to make use of, named bindings are the way to go. In the below example, I will show you how you could accomplish this with guice.
Note that guice injects depedencies at runtime, but it is also possible to inject dependencies at compile time.
I'm going to show it with the db context as an example. First, this is how you will be using it:
class MyService #Inject() (#Named("db") dbCtx: ExecutionContext) {
// make db access here
}
And here's how you could define the binding:
bind[ExecutionContext].qualifiedWith("db").toProvider[DbExecutionContextProvider]
And somewhere define the provider:
class DbExecutionContextProvider #Inject() (actorSystem: ActorSystem) extends Provider[ExecutionContext] {
override def get(): ExecutionContext = actorSystem.dispatchers.lookup("contexts.db-context")
}
You will have to do this for each of your contexts. I understand this may be a little cumbersome and there may actually be more elegant ways to define the bindings in guice.
Note that I have not tried this out. One issue you might stumble upon could be that you'll end up with conflicts, because play already defines a binding for the ExecutionContext in their BuiltinModule. You may need to override the binding to work around that.

How can I perform session based logging in Play Framework

We are currently using the Play Framework and we are using the standard logging mechanism. We have implemented a implicit context to support passing username and session id to all service methods. We want to implement logging so that it is session based. This requires implementing our own logger. This works for our own logs but how do we do the same for basic exception handling and logs as a result. Maybe there is a better way to capture this then with implicits or how can we override the exception handling logging. Essentially, we want to get as many log messages to be associated to the session.

It depends if you are doing reactive style development or standard synchronous development:
If standard synchronous development (i.e. no futures, 1 thread per request) - then I'd recommend you just use MDC, which adds values onto Threadlocal for logging. You can then customise the output in logback / log4j. When you get the username / session (possibly in a Filter or in your controller), you can then set the values there and then and you do not need to pass them around with implicits.
If you are doing reactive development you have a couple options:
You can still use MDC, except you'd have to use a custom Execution Context that effectively copies the MDC values to the thread, since each request could in theory be handled by multiple threads. (as described here: http://code.hootsuite.com/logging-contextual-info-in-an-asynchronous-scala-application/)
The alternative is the solution which I tend to use (and close to what you have now): You could make a class which represents MyAppRequest. Set the username, session info, and anything else, on that. You can continue to pass it around as an implicit. However, instead of using Action.async, you make your own MyAction class which an be used like below
myAction.async { implicit myRequest => //some code }
Inside the myAction, you'd have to catch all Exceptions and deal with future failures, and do the error handling manually instead of relying on the ErrorHandler. I often inject myAction into my Controllers and put common filter functionality in it.
The down side of this is, it is just a manual method. Also I've made MyAppRequest hold a Map of loggable values which can be set anywhere, which means it had to be a mutable map. Also, sometimes you need to make more than one myAction.async. The pro is, it is quite explicit and in your control without too much ExecutionContext/ThreadLocal magic.
Here is some very rough sample code as a starter, for the manual solution:
def logErrorAndRethrow(myrequest:MyRequest, x:Throwable): Nothing = {
//log your error here in the format you like
throw x //you can do this or handle errors how you like
}
class MyRequest {
val attr : mutable.Map[String, String] = new mutable.HashMap[String, String]()
}
//make this a util to inject, or move it into a common parent controller
def myAsync(block: MyRequest => Future[Result] ): Action[AnyContent] = {
val myRequest = new MyRequest()
try {
Action.async(
block(myRequest).recover { case cause => logErrorAndRethrow(myRequest, cause) }
)
} catch {
case x:Throwable =>
logErrorAndRethrow(myRequest, x)
}
}
//the method your Route file refers to
def getStuff = myAsync { request:MyRequest =>
//execute your code here, passing around request as an implicit
Future.successful(Results.Ok)
}

Scala and Slick: DatabaseConfigProvider in standalone application

I have an Play 2.5.3 application which uses Slick for reading an object from DB.
The service classes are built in the following way:
class SomeModelRepo #Inject()(protected val dbConfigProvider: DatabaseConfigProvider) {
val dbConfig = dbConfigProvider.get[JdbcProfile]
import dbConfig.driver.api._
val db = dbConfig.db
...
Now I need some standalone Scala scripts to perform some operations in the background. I need to connect to the DB within them and I would like to reuse my existing service classes to read objects from DB.
To instantiate a SomeModelRepo class' object I need to pass some DatabaseConfigProvider as a parameter. I tried to run:
object SomeParser extends App {
object testDbProvider extends DatabaseConfigProvider {
def get[P <: BasicProfile]: DatabaseConfig[P] = {
DatabaseConfigProvider.get("default")(Play.current)
}
}
...
val someRepo = new SomeModelRepo(testDbProvider)
however I have an error: "There is no started application" in the line with "(Play.current)". Moreover the method current in object Play is deprecated and should be replaced with DI.
Is there any way to initialize my SomeModelRepo class' object within the standalone object SomeParser?
Best regards

When you start your Play application, the PlaySlick module handles the Slick configurations for you. With it you have two choices:
inject DatabaseConfigProvider and get the driver from there, or
do a global lookup via DatabaseConfigProvider.get[JdbcProfile](Play.current), which is not preferred.
Either way, you must have your Play app running! Since this is not the case with your standalone scripts you get the error: "There is no started application".
So, you will have to use Slick's default approach, by instantiating db directly from config:
val db = Database.forConfig("default")
You have lot's of examples at Lightbend's templates.
EDIT: Sorry, I didn't read the whole question. Do you really need to have it as another application? You can run your background operations when your app starts, like here. In this example, InitialData class is instantiated as eager singleton, so it's insert() method is run immediately when app starts.

Scala Play Slick RejectedExecutionException on ScalaTest runs

My FlatSpec tests are throwing:
java.util.concurrent.RejectedExecutionException: Task slick.backend.DatabaseComponent$DatabaseDef$$anon$2#dda460e rejected from java.util.concurrent.ThreadPoolExecutor#4f489ebd[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 2]
But only when I run more than one suite, on the second suite forward; it seems there's something that isn't reset between tests. I'm using OneAppPerSuite to provide the app context. Whenever I use OneAppPerTest, it fails again after the first test/Suite.
I have a override def beforeEach = tables.foreach ( _.truncate ) set up to clear the tables, where truncate just deletes all from a table: Await.result (db.run (q.delete), Timeout.Inf)
I have the following setup for my DAO layer:
SomeMappedDaoClass extends SomeCrudBase with HasDatabaseConfig
where
trait SomeCrudBase { self: HasDatabaseConfig =>
override lazy val dbConfig = DatabaseConfigProvider.get[JdbcProfile](Play.current)
implicit lazy val context = Akka.system.dispatchers.lookup("db-context")
}
And in application.conf
db-context {
fork-join-executor {
parallelism-factor = 5
parallelism-max = 100
}
}
I was refactoring the code to move away from Play's Guice DI. Before, when it had #Inject() (val dbConfigProvider: DatabaseConfigProvider) and extended HasDatabaseConfigProvider instead on the DAO classes, everything worked perfectly. Now it doesn't, and I don't know why.
Thank you in advance!

Just out of interest is SomeMappedDaoClass an object? (I know it says class but...).
When testing the Play framework I have run into this kind of issue when dealing with objects that setup connections to parts of the Play Framework.
Between tests and between test files the Play app is killed and restarted, however, the objects created persist (because they are objects, they are initialised once within a JVM context--I think).
This can result in an object with a connection (be it for slick, an actor, anything...) that is referencing the first instance of the app used in a test. When the app is terminated and a new test starts a new app, that connection is now pointing to nothing.

I came across the same issue and in my case, the above answers did not work out.
My Solution -
implicit val app = new FakeApplication(additionalConfiguration = inMemoryDatabase())
Play.start(app)
Add above code in your first test case and don't add Play.stop(app). As all the test cases are already refering the first application, it should not be terminated. This worked for me.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Using a separate ExecutionContext for Slick - scala

Related

Understanding Apache Spark RDD task serialization

How do I create thread pools in Play 2.5.x?

How can I perform session based logging in Play Framework

Scala and Slick: DatabaseConfigProvider in standalone application

Scala Play Slick RejectedExecutionException on ScalaTest runs

Categories

Resources