I have Play 2 app.
And code something like that.
So, I run app in start mode and run mode.
val calendar = Calendar.getInstance()
calendar.set(Calendar.HOUR_OF_DAY, 19)
calendar.set(Calendar.MINUTE, 0)
calendar.set(Calendar.SECOND, 0)
calendar.set(Calendar.MILLISECOND, 0)
val now = new Date()
val timeDifference = calendar.getTime.getTime - now.getTime
if (timeDifference >= 0) {
val initialDelay = Duration.create(timeDifference, duration.MILLISECONDS)
Akka.system.scheduler.scheduleOnce(initialDelay, new Runnable {
override def run() {
LOGGER.error(System.lineSeparator() +
"=================================================" + System.lineSeparator() +
"| Forming and send report |" + System.lineSeparator() +
"=================================================" + System.lineSeparator()
)
}
})
}
Why scheduleOnce runs on shutdown?
The behavior you are seeing here actually effects all tasks scheduled in the default scheduler (LightArrayResolverScheduler) whether they were scheduled with schedule or scheduleOnce. If you look at the close method on LightArrayResolverScheduler, you will see this logic:
override def close(): Unit = Await.result(stop(), getShutdownTimeout) foreach {
task =>
try task.run() catch {
case e: InterruptedException => throw e
case _: SchedulerException => // ignore terminated actors
case NonFatal(e) => log.error(e, "exception while executing timer task")
}
}
So basically, it looks like during a shutdown of the timer that it will collect all outstanding tasks and execute them serially. I suppose this would not be a big deal if your scheduling was just sending messages to actors (as opposed to using a Runnable) as those actors would hopefully be terminated by this point and result in nothing happening (dead letters).
If you want to avoid this behavior, you could store the result of the call to scheduleOnce and explicitly cancel it first before you initiate your shutdown process.
Related
I am running a Spark application (version 1.6.0) on a Hadoop cluster with Yarn (version 2.6.0) in client mode. I have a piece of code that runs a long computation, and I want to kill it if it takes too long (and then run some other function instead).
Here is an example:
val conf = new SparkConf().setAppName("TIMEOUT_TEST")
val sc = new SparkContext(conf)
val lst = List(1,2,3)
// setting up an infite action
val future = sc.parallelize(lst).map(while (true) _).collectAsync()
try {
Await.result(future, Duration(30, TimeUnit.SECONDS))
println("success!")
} catch {
case _:Throwable =>
future.cancel()
println("timeout")
}
// sleep for 1 hour to allow inspecting the application in yarn
Thread.sleep(60*60*1000)
sc.stop()
The timeout is set for 30 seconds, but of course the computation is infinite, and so Awaiting on the result of the future will throw an Exception, which will be caught and then the future will be canceled and the backup function will execute.
This all works perfectly well, except that the canceled job doesn't terminate completely: when looking at the web UI for the application, the job is marked as failed, but I can see there are still running tasks inside.
The same thing happens when I use SparkContext.cancelAllJobs or SparkContext.cancelJobGroup. The problem is that even though I manage to get on with my program, the running tasks of the canceled job are still hogging valuable resources (which will eventually slow me down to a near stop).
To sum things up: How do I kill a Spark job in a way that will also terminate all running tasks of that job? (as opposed to what happens now, which is stopping the job from running new tasks, but letting the currently running tasks finish)
UPDATE:
After a long time ignoring this problem, we found a messy but efficient little workaround. Instead of trying to kill the appropriate Spark Job/Stage from within the Spark application, we simply logged the stage ID of all active stages when the timeout occurred, and issued an HTTP GET request to the URL presented by the Spark Web UI used for killing said stages.
I don't know it this answers your question.
My need was to kill jobs hanging for too much time (my jobs extract data from Oracle tables, but for some unknonw reason, seldom the connection hangs forever).
After some study, I came to this solution:
val MAX_JOB_SECONDS = 100
val statusTracker = sc.statusTracker;
val sparkListener = new SparkListener()
{
override def onJobStart(jobStart : SparkListenerJobStart)
{
val jobId = jobStart.jobId
val f = Future
{
var c = MAX_JOB_SECONDS;
var mustCancel = false;
var running = true;
while(!mustCancel && running)
{
Thread.sleep(1000);
c = c - 1;
mustCancel = c <= 0;
val jobInfo = statusTracker.getJobInfo(jobId);
if(jobInfo!=null)
{
val v = jobInfo.get.status()
running = v == JobExecutionStatus.RUNNING
}
else
running = false;
}
if(mustCancel)
{
sc.cancelJob(jobId)
}
}
}
}
sc.addSparkListener(sparkListener)
try
{
val df = spark.sql("SELECT * FROM VERY_BIG_TABLE") //just an example of long-running-job
println(df.count)
}
catch
{
case exc: org.apache.spark.SparkException =>
{
if(exc.getMessage.contains("cancelled"))
throw new Exception("Job forcibly cancelled")
else
throw exc
}
case ex : Throwable =>
{
println(s"Another exception: $ex")
}
}
finally
{
sc.removeSparkListener(sparkListener)
}
For the sake of future visitors, Spark introduced the Spark task reaper since 2.0.3, which does address this scenario (more or less) and is a built-in solution.
Note that is can kill an Executor eventually, if the task is not responsive.
Moreover, some built-in Spark sources of data have been refactored to be more responsive to spark:
For the 1.6.0 version, Zohar's solution is a "messy but efficient" one.
According to setJobGroup:
"If interruptOnCancel is set to true for the job group, then job cancellation will result in Thread.interrupt() being called on the job's executor threads."
So the anno function in your map must be interruptible like this:
val future = sc.parallelize(lst).map(while (!Thread.interrupted) _).collectAsync()
We run this code:
scheduler.schedule(1 minute, 1 minute) { triggerOperations.tick() }
when starting our application, where scheduler is an Akka actorSystem.scheduler.
If tick() throws an exception, then it is never called again!
I checked the documentation, but cannot find any statement that this is expected. Mostly the description is "Schedules a function to be run repeatedly with an initial delay and a frequency", with no mention that if the function throws an excpetion the task will stop firing.
Our akka version is 2.3.2.
http://doc.akka.io/docs/akka/2.3.4/scala/scheduler.html
http://doc.akka.io/api/akka/2.0/akka/actor/Scheduler.html
Is this behavior expected? Is it documented anywhere?
When in doubt, go to source. The code is a bit terse, but this fragment:
override def run(): Unit = {
try {
runnable.run()
val driftNanos = clock() - getAndAdd(delay.toNanos)
if (self.get != null)
swap(schedule(executor, this, Duration.fromNanos(Math.max(delay.toNanos - driftNanos, 1))))
} catch {
case _: SchedulerException ⇒ // ignore failure to enqueue or terminated target actor
}
}
shows that if your runnable throws, scheduler does not reschedule the next execution (which happens inside swap as far as I understand).
I'm having some strange behavior in my Play 2.2 app and I'm not sure how to go about debugging it. My code was running fine until I started using iteratees.
My actor creates an enumerator like below and sends it back to the caller:
val emailFeed = Concurrent.unicast[Message] (
onStart = {
pushee => {
log.debug("Pushing 1")
pushee.push(messages.apply(0))
log.debug("Pushed 1")
}
},
onComplete = {
log.debug("Done with pushee")
},
onError = {
(msg, in) => log.error(msg)
}
the caller then consumes it by :
f.map { reply =>
val emailFeed = reply.asInstanceOf[Enumerator[Message]]
val iter = Iteratee.fold[Message,String] ("") {
(result, msg) => {
Logger.debug("appending subj")
result ++ msg.getSubject()
Logger.debug(" subj appended")
result
}
}
val result: Future[String] = emailFeed |>>> iter
Await.result(result, 10 minutes)
The problem is Await.result times out. I have stepped through the iteratee code and see it being called. There is only 1 chunk to process so it is quick. I also see the enumerator and iteratee debug stmts in the console. I just don't know why it doesn't complete:
[debug] application - Pushing 1
[debug] application - Pushed 1
[debug] application - appending subj
[debug] application - subj appended
Apparently, there isn't much feedback in Play if you forget to end your enumerator properly. You must call:
pushee.end
or
pushee.eofAndEnd
to trigger the enumerator to close and the Future to complete.
I have a moderate number of long-running Actors and I wish to write a synchronous function that returns the first one of these that completes. I can do it with a spin-wait on futures (e.g.,:
while (! fs.exists(f => f.isSet) ) {
Thread.sleep(100)
}
val completeds = fs.filter(f => f.isSet)
completeds.head()
), but that seems very "un-Actor-y"
The scala.actors.Futures class has two methods awaitAll() and awaitEither() that seem awfully close; if there were an awaitAny() I'd jump on it. Am I missing a simple way to do this or is there a common pattern that is applicable?
A more "actorish" way of waiting for completion is creating an actor in charge of handling completed result (lets call it ResultHandler)
Instead of replying, workers send their answer to ResultHandler in fire-and-forget manner. The latter will continue processing the result while other workers complete their job.
The key for me was the discovery that every (?) Scala object is, implicitly, an Actor, so you can use Actor.react{ } to block. Here is my source code:
import scala.actors._
import scala.actors.Actor._
//Top-level class that wants to return the first-completed result from some long-running actors
class ConcurrentQuerier() {
//Synchronous function; perhaps fulfilling some legacy interface
def synchronousQuery : String = {
//Instantiate and start the monitoring Actor
val progressReporter = new ProgressReporter(self) //All (?) objects are Actors
progressReporter.start()
//Instantiate the long-running Actors, giving each a handle to the monitor
val lrfs = List (
new LongRunningFunction(0, 2000, progressReporter), new LongRunningFunction(1, 2500, progressReporter), new LongRunningFunction(3, 1500, progressReporter),
new LongRunningFunction(4, 1495, progressReporter), new LongRunningFunction(5, 1500, progressReporter), new LongRunningFunction(6, 5000, progressReporter) )
//Start 'em
lrfs.map{ lrf =>
lrf.start()
}
println("All actors started...")
val start = System.currentTimeMillis()
/*
This blocks until it receives a String in the Inbox.
Who sends the string? A: the progressReporter, which is monitoring the LongRunningFunctions
*/
val s = receive {
case s:String => s
}
println("Received " + s + " after " + (System.currentTimeMillis() - start) + " ms")
s
}
}
/*
An Actor that reacts to a message that is a tuple ("COMPLETED", someResult) and sends the
result to this Actor's owner. Not strictly necessary (the LongRunningFunctions could post
directly to the owner's mailbox), but I like the idea that monitoring is important enough
to deserve its own object
*/
class ProgressReporter(val owner : Actor) extends Actor {
def act() = {
println("progressReporter awaiting news...")
react {
case ("COMPLETED", s) =>
println("progressReporter received a completed signal " + s);
owner ! s
case s =>
println("Unexpected message: " + s ); act()
}
}
}
/*
Some long running function
*/
class LongRunningFunction(val id : Int, val timeout : Int, val supervisor : Actor) extends Actor {
def act() = {
//Do the long-running query
val s = longRunningQuery()
println(id.toString + " finished, sending results")
//Send the results back to the monitoring Actor (the progressReporter)
supervisor ! ("COMPLETED", s)
}
def longRunningQuery() : String = {
println("Starting Agent " + id + " with timeout " + timeout)
Thread.sleep(timeout)
"Query result from agent " + id
}
}
val cq = new ConcurrentQuerier()
//I don't think the Actor semantics guarantee that the result is absolutely, positively the first to have posted the "COMPLETED" message
println("Among the first to finish was : " + cq.synchronousQuery)
Typical results look like:
scala ActorsNoSpin.scala
progressReporter awaiting news...
All actors started...
Starting Agent 1 with timeout 2500
Starting Agent 5 with timeout 1500
Starting Agent 3 with timeout 1500
Starting Agent 4 with timeout 1495
Starting Agent 6 with timeout 5000
Starting Agent 0 with timeout 2000
4 finished, sending results
progressReporter received a completed signal Query result from agent 4
Received Query result from agent 4 after 1499 ms
Among the first to finish was : Query result from agent 4
5 finished, sending results
3 finished, sending results
0 finished, sending results
1 finished, sending results
6 finished, sending results
first of all, i'm learning scala and new to the java world.
I want to create a console and run this console as a service that you could start and stop.
I was able to run a ConsoleReader into an Actor but i don't know how to stop properly the ConsoleReader.
Here is the code :
import eu.badmood.util.trace
import scala.actors.Actor._
import tools.jline.console.ConsoleReader
object Main {
def main(args:Array[String]){
//start the console
Console.start(message => {
//handle console inputs
message match {
case "exit" => Console.stop()
case _ => trace(message)
}
})
//try to stop the console after a time delay
Thread.sleep(2000)
Console.stop()
}
}
object Console {
private val consoleReader = new ConsoleReader()
private var running = false
def start(handler:(String)=>Unit){
running = true
actor{
while (running){
handler(consoleReader.readLine("\33[32m> \33[0m"))
}
}
}
def stop(){
//how to cancel an active call to ConsoleReader.readLine ?
running = false
}
}
I'm also looking for any advice concerning this code !
The underlying call to read a characters from the input is blocking. On non-Windows platform, it will use System.in.read() and on Windows it will use org.fusesource.jansi.internal.WindowsSupport.readByte.
So your challenge is to cause that blocking call to return when you want to stop your console service. See http://www.javaspecialists.eu/archive/Issue153.html and Is it possible to read from a InputStream with a timeout? for some ideas... Once you figure that out, have read return -1 when your console service stops, so that ConsoleReader thinks it's done. You'll need ConsoleReader to use your version of that call:
If you are on Windows, you'll probably need to override tools.jline.AnsiWindowsTerminal and use the ConsoleReader constructor that takes a Terminal (otherwise AnsiWindowsTerminal will just use WindowsSupport.readByte` directly)
On unix, there is one ConsoleReader constructor that takes an InputStream, you could provide your own wrapper around System.in
A few more thoughts:
There is a scala.Console object already, so for less confusion name yours differently.
System.in is a unique resource, so you probably need to ensure that only one caller uses Console.readLine at a time. Right now start will directly call readLine and multiple callers can call start. Probably the console service can readLine and maintain a list of handlers.
Assuming that ConsoleReader.readLine responds to thread interruption, you could rewrite Console to use a Thread which you could then interrupt to stop it.
object Console {
private val consoleReader = new ConsoleReader()
private var thread : Thread = _
def start(handler:(String)=>Unit) : Thread = {
thread = new Thread(new Runnable {
override def run() {
try {
while (true) {
handler(consoleReader.readLine("\33[32m> \33[0m"))
}
} catch {
case ie: InterruptedException =>
}
}
})
thread.start()
thread
}
def stop() {
thread.interrupt()
}
}
You may overwrite your ConsoleReader InputStream. IMHO this is reasonable well because of STDIN is a "slow" stream. Please improve example for your needs. This is only sketch, but it works:
def createReader() =
terminal.synchronized {
val reader = new ConsoleReader
terminal.enableEcho()
reader.setBellEnabled(false)
reader.setInput(new InputStreamWrapper(reader.getInput())) // turn on InterruptedException for InputStream.read
reader
}
with InputStream wrapper:
class InputStreamWrapper(is: InputStream, val timeout: Long = 50) extends FilterInputStream(is) {
#tailrec
final override def read(): Int = {
if (is.available() != 0)
is.read()
else {
Thread.sleep(timeout)
read()
}
}
}
P.S. I tried to use NIO - a lot of troubles with System.in (especially crossplatform). I returned to this variant. CPU load is near 0%. This is suitable for such interactive application.