I am doing a disk intensive operation and I want to use my own thread-pool for it and not the default one.
I read the following link, and I am facing the exact same problem
Akka :: dispatcher [%name%] not configured, using default-dispatcher
But my config file is slightly different, I have tried the suggestion but it not working.
My application.conf in play has the following
jpa-execution-context {
thread-pool-executor {
core-pool-size-factor = 10.0
core-pool-size-max = 10
}
}
And then in my test code I do the following, but I get an exception. Here is the test method
private def testContext():Future[Int] = {
val system = ActorSystem.create()
val a = ActorSystem.create()
implicit val executionContext1 = system.dispatchers.lookup("jpa-execution-context")
Future{logger.error("inside my new thread pool wonderland");10}{executionContext1}
}
Here is the exception:
akka.ConfigurationException: Dispatcher [jpa-execution-context] not configured
I think you forgot a few elements in your configuration:
jpa-execution-context {
type = Dispatcher
executor = "thread-pool-executor"
thread-pool-executor {
core-pool-size-factor = 10.0
core-pool-size-max = 10
}
}
Doc link: http://doc.akka.io/docs/akka/current/scala/dispatchers.html#types-of-dispatchers
Related
I'm using Akka for several different Actors. The work done by these Actors is non-blocking. I noticed something odd - the number of dispatchers scales with the number of Actors I'm creating. If I create hundreds of actors, I find myself with hundreds of dispatchers, sometimes over 1000.
This is, even though, most of the dispatchers look like this:
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000003d503de50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
(basically, doing nothing most of the time)
I initialize dispatchers with calls like this:
ActorMaterializer(ActorMaterializerSettings(system).withDispatcher(s"akka.dispatchers.$dispatcherName"))(system))
My dispatcher configuration is below (we have different dispatchers for different actors):
dispatchers {
connector-actor-dispatcher {
type = Dispatcher
executor = "thread-pool-executor"
thread-pool-executor {
fixed-pool-size = 200
}
throughput = 1
}
http-actor-dispatcher {
type = Dispatcher
executor = "fork-join-executor"
fork-join-executor {
parallelism-min = 1
parallelism-factor = 1.0
parallelism-max = 64
task-peeking-mode = "FIFO"
}
throughput = 1
}
commands-dispatcher {
type = Dispatcher
executor = "fork-join-executor"
fork-join-executor {
parallelism-min = 1
parallelism-factor = 1.0
parallelism-max = 64
task-peeking-mode = "FIFO"
}
throughput = 1
}
http-server-dispatcher {
type = Dispatcher
executor = "thread-pool-executor"
thread-pool-executor {
core-pool-size-factor = 1
}
throughput = 1
}
http-client-dispatcher-low {
type = Dispatcher
executor = "thread-pool-executor"
thread-pool-executor {
core-pool-size-factor = 1
}
throughput = 1
}
http-client-dispatcher-high {
type = Dispatcher
executor = "thread-pool-executor"
thread-pool-executor {
core-pool-size-factor = 1
}
throughput = 1
}
http-client-dispatcher-parser {
type = Dispatcher
executor = "thread-pool-executor"
thread-pool-executor {
fixed-pool-size = 200
}
throughput = 1
}
}
}
What am I missing?
It seems that this mostly answered in the comments above, but to collate them into an "Answer": it appears that "so many dispatchers" are getting created because you are creating them explicitly in your config.
Also, when you give an example of a "dispatcher" you are actually showing a thread stack trace. So you might be confusing threads with dispatchers. And many of the the dispatchers you are creating have large thread counts. As #Tim says "If you are creating hundreds of actors and each actor has its own dispatcher with many threads, you are going to get a lot of threads!"
But that's really a tuning question. The answer to the direct question is that generally (with the exception of the system dispatcher), dispatchers are generally only created when you specifically ask for them to be created. And it appears that you are creating many of them. And that you also have enormous thread counts for each dispatcher.
As discussed, the general best practice is to have one dispatcher for non-blocking actors and another dispatcher with blocking actors. Each dispatcher will also generally only need a small number of threads. There are some edge cases where you might want additional dedicated dispatchers for particularly sensitive or badly behaving actors, but it depends on your actors and your application.
I use dbcp2.BasicDataSource as database-connection-pool. The database query is used in some map function to get additional info of sensors; I found out that, when the flink job restarts due to exceptions, the old DB connections are still active on the server side.
flink version 1.7
BasicDataSource construct code here
object DbHelper extends Lazing with Logging {
private lazy val connectionPool: BasicDataSource = createDataSource()
private def createDataSource(): BasicDataSource = {
val conn_str = props.getProperty("db.url")
val conn_user = props.getProperty("db.user")
val conn_pwd = props.getProperty("db.pwd")
val initialSize = props.getProperty("db.initial.size", "3").toInt
val bds = new BasicDataSource
bds.setDriverClassName("org.postgresql.Driver")
bds.setUrl(conn_str)
bds.setUsername(conn_user)
bds.setPassword(conn_pwd)
bds.setInitialSize(initialSize)
bds
}
}
Change your map function to a RichMapFunction. Override the close() method of the RichMapFunction and put the code to close your database connection there. You should likely be putting the code to open the connection in the open() method as well.
I am running a Spark application (version 1.6.0) on a Hadoop cluster with Yarn (version 2.6.0) in client mode. I have a piece of code that runs a long computation, and I want to kill it if it takes too long (and then run some other function instead).
Here is an example:
val conf = new SparkConf().setAppName("TIMEOUT_TEST")
val sc = new SparkContext(conf)
val lst = List(1,2,3)
// setting up an infite action
val future = sc.parallelize(lst).map(while (true) _).collectAsync()
try {
Await.result(future, Duration(30, TimeUnit.SECONDS))
println("success!")
} catch {
case _:Throwable =>
future.cancel()
println("timeout")
}
// sleep for 1 hour to allow inspecting the application in yarn
Thread.sleep(60*60*1000)
sc.stop()
The timeout is set for 30 seconds, but of course the computation is infinite, and so Awaiting on the result of the future will throw an Exception, which will be caught and then the future will be canceled and the backup function will execute.
This all works perfectly well, except that the canceled job doesn't terminate completely: when looking at the web UI for the application, the job is marked as failed, but I can see there are still running tasks inside.
The same thing happens when I use SparkContext.cancelAllJobs or SparkContext.cancelJobGroup. The problem is that even though I manage to get on with my program, the running tasks of the canceled job are still hogging valuable resources (which will eventually slow me down to a near stop).
To sum things up: How do I kill a Spark job in a way that will also terminate all running tasks of that job? (as opposed to what happens now, which is stopping the job from running new tasks, but letting the currently running tasks finish)
UPDATE:
After a long time ignoring this problem, we found a messy but efficient little workaround. Instead of trying to kill the appropriate Spark Job/Stage from within the Spark application, we simply logged the stage ID of all active stages when the timeout occurred, and issued an HTTP GET request to the URL presented by the Spark Web UI used for killing said stages.
I don't know it this answers your question.
My need was to kill jobs hanging for too much time (my jobs extract data from Oracle tables, but for some unknonw reason, seldom the connection hangs forever).
After some study, I came to this solution:
val MAX_JOB_SECONDS = 100
val statusTracker = sc.statusTracker;
val sparkListener = new SparkListener()
{
override def onJobStart(jobStart : SparkListenerJobStart)
{
val jobId = jobStart.jobId
val f = Future
{
var c = MAX_JOB_SECONDS;
var mustCancel = false;
var running = true;
while(!mustCancel && running)
{
Thread.sleep(1000);
c = c - 1;
mustCancel = c <= 0;
val jobInfo = statusTracker.getJobInfo(jobId);
if(jobInfo!=null)
{
val v = jobInfo.get.status()
running = v == JobExecutionStatus.RUNNING
}
else
running = false;
}
if(mustCancel)
{
sc.cancelJob(jobId)
}
}
}
}
sc.addSparkListener(sparkListener)
try
{
val df = spark.sql("SELECT * FROM VERY_BIG_TABLE") //just an example of long-running-job
println(df.count)
}
catch
{
case exc: org.apache.spark.SparkException =>
{
if(exc.getMessage.contains("cancelled"))
throw new Exception("Job forcibly cancelled")
else
throw exc
}
case ex : Throwable =>
{
println(s"Another exception: $ex")
}
}
finally
{
sc.removeSparkListener(sparkListener)
}
For the sake of future visitors, Spark introduced the Spark task reaper since 2.0.3, which does address this scenario (more or less) and is a built-in solution.
Note that is can kill an Executor eventually, if the task is not responsive.
Moreover, some built-in Spark sources of data have been refactored to be more responsive to spark:
For the 1.6.0 version, Zohar's solution is a "messy but efficient" one.
According to setJobGroup:
"If interruptOnCancel is set to true for the job group, then job cancellation will result in Thread.interrupt() being called on the job's executor threads."
So the anno function in your map must be interruptible like this:
val future = sc.parallelize(lst).map(while (!Thread.interrupted) _).collectAsync()
I have successfully implemented a simple web service using Spray and Slick that passes an incoming request through a Spark ML Prediction Pipeline. Everything was working fine until I tried to add a data layer. I have chosen Slick it seems to be popular.
However, I can't quite get it to work right. I have been basing most of my code on the Hello-Slick Activator Template. I use a DAO object like so:
object dataDAO {
val datum = TableQuery[Datum]
def dbInit = {
val db = Database.forConfig("h2mem1")
try {
Await.result(db.run(DBIO.seq(
datum.schema.create
)), Duration.Inf)
} finally db.close
}
def insertData(data: Data) = {
val db = Database.forConfig("h2mem1")
try {
Await.result(db.run(DBIO.seq(
datum += data,
datum.result.map(println)
)), Duration.Inf)
} finally db.close
}
}
case class Data(data1: String, data2: String)
class Datum(tag: Tag) extends Table[Data](tag, "DATUM") {
def data1 = column[String]("DATA_ONE", O.PrimaryKey)
def data2 = column[String]("DATA_TWO")
def * = (data1, data2) <> (Data.tupled, Data.unapply)
}
I initialize my database in my Boot object
object Boot extends App {
implicit val system = ActorSystem("raatl-demo")
Classifier.initializeData
PredictionDAO.dbInit
// More service initialization code ...
}
I try to add a record to my database before completing the service request
val predictionRoute = {
path("data") {
get {
parameter('q) { query =>
// do Spark stuff to get prediction
DataDAO.insertData(data)
respondWithMediaType(`application/json`) {
complete {
DataJson(data1, data2)
}
}
}
}
}
When I send a request to my service my application crashes
java.lang.OutOfMemoryError: PermGen space
I suspect I'm implementing the Slick API incorrectly. its hard to tell from the documentation, because it stuffs all the operations into a main method.
Finally, my conf is the same as the activator ui
h2mem1 = {
url = "jdbc:h2:mem:raatl"
driver = org.h2.Driver
connectionPool = disabled
keepAliveConnection = true
}
Has anyone encountered this before? I'm using Slick 3.1
java.lang.OutOfMemoryError: PermGen space is normally not a problem with your usage, here is what oracle says about this.
The detail message PermGen space indicates that the permanent generation is full. The permanent generation is the area of the heap where class and method objects are stored. If an application loads a very large number of classes, then the size of the permanent generation might need to be increased using the -XX:MaxPermSize option.
I do not think this is because of incorrect implementation of the Slick API. This probably happens because you are using multiple frameworks that loads many classes.
Your options are:
Increase the size of perm gen size -XX:MaxPermSize
Upgrade to Java 8. The perm gen space is now replaced with MetaSpace which is tuned automatically
what I want to do is:
1) create a master actor on a server which can dynamically create 10 remote actors on 10 different machine
2) master actor distribute the task to 10 remote actors
3) when every remote actor finish their work, they send the results to the master actor
4) master actor shut down the whole system
my problems are:
1) I am not sure how to config the master actor and below is my server part code:
class MasterAppliation extends Bootable{
val hostname = InetAddress.getLocalHost.getHostName
val config = ConfigFactory.parseString(
s"""
akka{
actor{
provider = "akka.remote.RemoteActorRefProvider"
deployment {
/remotemaster {
router = "round-robin"
nr-of-instances = 10
target {
nodes = ["akka.tcp://remotesys#host1:2552", "akka.tcp://remotesys#host2:2552", ....... akka.tcp://remotesys#host10:2552"]
}
}
}
remote{
enabled-transports = ["akka.remote.netty.tcp"]
netty.tcp{
hostname = "$hostname"
port = 2552
}
}
}""")
val system = ActorSystem("master", ConfigFactory.load(config))
val master = system.actorOf(Props(new master), name = "master")
def dosomething = master ! Begin()
def startup() {}
def shutdown() {
system.shutdown()
}
}
class master extends Actor {
val addresses = for(i <- 1 to 10)
yield AddressFromURIString(s"akka://remostsys#host$i:2552")
val routerRemote = context.actorOf(Props[RemoteMaster].withRouter(
RemoteRouterConfig(RoundRobinRouter(12), addresses)))
def receive = {
case Begin=>{
for(i <- 1 to 10) routerRemote ! Work(.....)
}
case Result(root) ........
}
}
object project1 {
def main(args: Array[String]) {
new MasterAppliation
}
}
2) I do not know how to create a remote actor on remote client. I read this tutorial. Do I need
to write the client part similar to the server part, which means I need create an object which is responsible to create a remote actor? But that also means when I run the client part, the remote actor is already created ! I am really confused.
3) I do not how to shut down the whole system. In the above tutorial, I find there is a function named shutdown(), but I never see anyone call it.
This is my first time to write a distributed program in Scala and AKKA. So I really need your help.
Thanks a lot.
Setting up the whole thing for the first time is a pain but if you do it once you will have a good skeleton that you will user on regular basis.
I've written in comment below the question user clustering not remoting.
Here is how I do it:
I set up an sbt root project with three sub-projects.
common
frontend
backend
In common you put everything that is common to both projects e.g. the messages that they share, actor classes that are created in frontend and deployed to backend.
Put a reference.conf to common project, here is mine:
akka {
loglevel = INFO
actor {
provider = "akka.cluster.ClusterActorRefProvider"
debug {
lifecycle = on
}
}
cluster {
seed-nodes = [
"akka.tcp://application#127.0.0.1:2558",
"akka.tcp://application#127.0.0.1:2559"
]
}
}
Now in the frontend:
akka {
remote {
log-remote-lifecycle-events = off
netty.tcp {
hostname = "127.0.0.1"
port = 2558
}
}
cluster {
auto-down = on
roles = [frontend]
}
}
and the backend
akka {
remote {
log-remote-lifecycle-events = off
netty.tcp {
hostname = "127.0.0.1"
port = 0
}
}
cluster {
auto-down = on
roles = [backend]
}
}
This will work like this:
You start the fronted part first which will control the cluster.
Then you can start any number of backends you want that will join automatically (look at the port, it's 0 so it will be chosen randomly).
Now you need to add the whole logic to the frontend main:
Create the actor system with name application:
val system = ActorSystem("application")
Do the same at the backend main.
Now write your code in fronted so it will create your workers with a router, here's my example code:
context.actorOf(ServiceRuntimeActor.props(serviceName)
.withRouter(
ClusterRouterConfig(ConsistentHashingRouter(),
ClusterRouterSettings(
totalInstances = 10, maxInstancesPerNode = 3,
allowLocalRoutees = false, useRole = Some("backend"))
)
),
name = shortServiceName)
just change your ServiceRuntimeActor to name of your worker. It will deploy workers to all backends that you've started and limit this to max 3 per node and max 10 in total.
Hope this will help.