I'm trying to get familiar with Slick 3.0 and Futures (using Scala 2.11.6). I use simple code based on Slick's Multi-DB Cake Pattern example. Why does the following code terminate with an exception, and how can I fix it?
import scala.concurrent.Await
import scala.concurrent.duration._
import slick.jdbc.JdbcBackend.Database
import scala.concurrent.ExecutionContext.Implicits.global
class Dispatcher(db: Database, dal: DAL) {
import dal.driver.api._
def init() = {
db.run(dal.create)
try db.run(dal.stuffTable += Stuff(23,"hi"))
finally db.close
val x = {
try db.run(dal.stuffTable.filter(_.serial === 23).result)
finally db.close
}
// This crashes:
val result = Await.result(x, 2 seconds)
}
}
Execution fails with:
java.util.concurrent.RejectedExecutionException: Task slick.backend.DatabaseComponent$DatabaseDef$$anon$2@5c73f637 rejected from java.util.concurrent.ThreadPoolExecutor@4129c44c[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 2]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
at scala.concurrent.impl.ExecutionContextImpl$$anon$1.execute(ExecutionContextImpl.scala:136)
at slick.backend.DatabaseComponent$DatabaseDef$class.runSynchronousDatabaseAction(DatabaseComponent.scala:224)
at slick.jdbc.JdbcBackend$DatabaseDef.runSynchronousDatabaseAction(JdbcBackend.scala:38)
at slick.backend.DatabaseComponent$DatabaseDef$class.runInContext(DatabaseComponent.scala:201)
at slick.jdbc.JdbcBackend$DatabaseDef.runInContext(JdbcBackend.scala:38)
at slick.backend.DatabaseComponent$DatabaseDef$class.runInternal(DatabaseComponent.scala:75)
at slick.jdbc.JdbcBackend$DatabaseDef.runInternal(JdbcBackend.scala:38)
at slick.backend.DatabaseComponent$DatabaseDef$class.run(DatabaseComponent.scala:72)
at slick.jdbc.JdbcBackend$DatabaseDef.run(JdbcBackend.scala:38)
at Dispatcher.init(Dispatcher.scala:15)
at SlickDemo$.main(SlickDemo.scala:16)
at SlickDemo.main(SlickDemo.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
I think something is off in what you are trying to do: Slick's run method no longer returns Unit or fails with an exception, as it used to in previous versions. run now returns a Future, so if you want to run actions in sequence you need to flatMap the steps, or use a for-comprehension:
def init() = {
val results = for {
_ <- db.run(dal.create)
_ <- db.run(dal.stuffTable += Stuff(23, "hi"))
r <- db.run(dal.stuffTable.filter(_.serial === 23).result)
} yield r
}
I am not sure that you really need to use db.close that way: that may actually be what is causing the error (i.e. the db is closed concurrently with the future that runs the actual queries, so the execution can't happen).
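To illustrate the ordering issue without Slick, here is a minimal sketch using only the standard library. FakeDb is a hypothetical stand-in (not Slick's API) whose run schedules work on an internal pool and whose close shuts that pool down; the assumption, suggested by the [Terminated, ...] state in the stack trace, is that closing the real database does something similar. The point is that close is only called after the combined future has completed:

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical stand-in for a database handle (NOT Slick's API):
// `run` schedules work on an internal pool, `close` shuts that pool down.
final class FakeDb {
  private val pool = Executors.newFixedThreadPool(2)
  private val ec = ExecutionContext.fromExecutorService(pool)
  def run[A](action: => A): Future[A] = Future(action)(ec)
  def close(): Unit = pool.shutdown()
}

object CloseAfterCompletion {
  def demo(): Int = {
    import ExecutionContext.Implicits.global
    val db = new FakeDb
    // Each step starts only after the previous one has completed.
    val result = for {
      _ <- db.run(())   // stands in for dal.create
      _ <- db.run(())   // stands in for the insert
      r <- db.run(23)   // stands in for the query
    } yield r
    // Block for the final result first, and close the handle only afterwards.
    try Await.result(result, 2.seconds)
    finally db.close()
  }
}
```

Calling close() before the steps have been scheduled would instead leave later run calls with a pool that no longer accepts tasks, which is the situation the RejectedExecutionException above points to.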
If you want to handle errors use Future's capabilities, e.g.:
results.onFailure { case NonFatal(ex) => /* do something with the exception */ () }
I've inherited some code from an ex-coworker where he started using futures (in Scala) to process some data in Databricks.
I split it into chunks that complete in a similar time period. However, there is no output, and I know the code isn't using onSuccess, Await, or anything similar.
The thing is, the code finishes running (without returning any output), but the block in Databricks keeps executing through the Thread.sleep() part.
I'm new to Scala and futures and am not sure how I can just exit the notebook once all the futures finish running (should I just use dbutils.notebook.exit() after the future blocks?)
Code is below:
import scala.concurrent.{Future, blocking, Await}
import java.util.concurrent.Executors
import scala.concurrent.ExecutionContext
import com.databricks.WorkflowException
val numNotebooksInParallel = 15
// If you create too many notebooks in parallel the driver may crash when you submit all of the jobs at once.
// This code limits the number of parallel notebooks.
implicit val ec = ExecutionContext.fromExecutor(Executors.newFixedThreadPool(numNotebooksInParallel))
val ctx = dbutils.notebook.getContext()
// The simplest interface we can have, but it doesn't
// have protection against submitting too many notebooks in parallel at once
println("starting parallel jobs... hang tight")
Future {
process("pro","bseg")
process("prc","bkpf")
process("prc","bseg")
process("pr4","bkpf")
process("pr4","bseg")
println("done with future1")
}
Future {
process("pr5","bkpf")
process("pr5","bseg")
process("pri","bkpf")
process("pri","bseg")
process("pr9","bkpf")
println("done with future2")
}
Future {
process("pr9","bseg")
process("prl","bkpf")
process("prl","bseg")
process("pro","bkpf")
println("done with future3")
}
println("finished futures - yay! :)")
Thread.sleep(5*60*60*1000)
println("thread timed out after 5 hrs... hope it all finished.")
One would typically save the futures as values:
val futs = Seq(
Future {
process("pro","bseg")
// and so on
},
// then the other futures
)
and then operate on the futures:
import scala.concurrent.Await
import scala.concurrent.duration._
Await.result(Future.sequence(futs), 5.hours)
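The future returned by Future.sequence completes once they've all succeeded, or fails as soon as it encounters one that failed (the remaining futures keep running, but are no longer awaited). If you want to wait for all of them even if one fails, you could do something like
Await.result(
futs.foldLeft(Future.unit) { (acc, f) =>
// chain on the accumulator so every future is awaited, ignoring failures
acc.flatMap(_ => f.map(_ => ()).recover {
case _ => ()
})
},
5.hours
)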
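If the individual outcomes matter, a variation on the same idea is to turn every future into one that always succeeds with a Try and then sequence those. The following is a minimal sketch under that assumption; AwaitAll and awaitAll are hypothetical names, not part of the standard library:

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}
import scala.util.control.NonFatal

object AwaitAll {
  // Waits for every future and returns each outcome as a Try,
  // so one failure does not hide the results of the others.
  def awaitAll[A](futs: Seq[Future[A]], timeout: Duration)(
      implicit ec: ExecutionContext): Seq[Try[A]] = {
    val settled: Seq[Future[Try[A]]] = futs.map { f =>
      f.map[Try[A]](Success(_)).recover { case NonFatal(e) => Failure(e) }
    }
    Await.result(Future.sequence(settled), timeout)
  }
}
```

In the notebook this could be called as AwaitAll.awaitAll(futs, 5.hours), after which the cell can inspect the failures instead of sleeping for a fixed time.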
I want to use the combinator orElse on ZIO Fibers.
From docs:
If the first fiber succeeds, the composed fiber will succeed with its result; otherwise, the composed fiber will complete with the exit value of the second fiber (whether success or failure).
import zio._
import zio.console._
object MyApp extends App {
def f1 :Task[Int] = IO.fail(new Exception("f1 fail"))
def f2 :Task[Int] = IO.succeed(2)
val myAppLogic =
for {
f1f <- f1.fork
f2f <- f2.fork
ff = f1f.orElse(f2f)
r <- ff.join
_ <- putStrLn(s"Result is [$r]")
} yield ()
def run(args: List[String]) =
myAppLogic.fold(_ => 1, _ => 0)
}
I run it with sbt in console. And output:
[info] Running MyApp
Fiber failed.
A checked error was not handled.
java.lang.Exception: f1 fail
at MyApp$.f1(MyApp.scala:6)
at MyApp$.<init>(MyApp.scala:11)
at MyApp$.<clinit>(MyApp.scala)
at MyApp.main(MyApp.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
Result is [2]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sbt.Run.invokeMain(Run.scala:93)
at sbt.Run.run0(Run.scala:87)
at sbt.Run.execute$1(Run.scala:65)
at sbt.Run.$anonfun$run$4(Run.scala:77)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
at sbt.util.InterfaceUtil$$anon$1.get(InterfaceUtil.scala:10)
at sbt.TrapExit$App.run(TrapExit.scala:252)
at java.lang.Thread.run(Thread.java:748)
Fiber:Id(1574829590403,2) was supposed to continue to: <empty trace>
Fiber:Id(1574829590403,2) ZIO Execution trace: <empty trace>
Fiber:Id(1574829590403,2) was spawned by:
Fiber:Id(1574829590397,1) was supposed to continue to:
a future continuation at MyApp$.myAppLogic(MyApp.scala:12)
a future continuation at MyApp$.run(MyApp.scala:19)
Fiber:Id(1574829590397,1) ZIO Execution trace: <empty trace>
Fiber:Id(1574829590397,1) was spawned by:
Fiber:Id(1574829590379,0) was supposed to continue to:
a future continuation at zio.App.main(App.scala:57)
a future continuation at zio.App.main(App.scala:56)
[Fiber:Id(1574829590379,0) ZIO Execution trace: <empty trace>
I can see the result of the second fiber: Result is [2]
But why does it output these unnecessary exception/warning messages?
By default a fiber failure warning is generated when a fiber that is not joined back fails so that errors do not get lost. But as you correctly note in some cases this is not necessary as the error is handled internally by the program logic, in this case by the orElse combinator. We have been working through a couple of other cases of spurious warnings being generated and I just opened a ticket for this one here. I expect we will have this resolved in the next release.
This happens because the default instance of Platform being created by zio.App has a default which reports uninterrupted, failed fibers to the console:
def reportFailure(cause: Cause[_]): Unit =
if (!cause.interrupted)
System.err.println(cause.prettyPrint)
To avoid this, you can provide your own Platform instance which doesn't do so:
import zio._
import zio.console._
import zio.internal.{Platform, PlatformLive}
object MyApp extends App {
override val Platform: Platform = PlatformLive.Default.withReportFailure(_ => ())
def f1: Task[Int] = IO.fail(new Exception("f1 fail"))
def f2: Task[Int] = IO.succeed(2)
val myAppLogic =
for {
f1f <- f1.fork
f2f <- f2.fork
ff = f1f.orElse(f2f)
r <- ff.join
_ <- putStrLn(s"Result is [$r]")
} yield ()
def run(args: List[String]) =
myAppLogic.fold(_ => 1, _ => 0)
}
Which yields:
Result is [2]
As @Adam Fraser noted, this will probably get fixed in an upcoming release.
Edit:
Should be fixed after https://github.com/zio/zio/pull/2339 was merged
This code works as expected:
it("should propagate exceptions") {
intercept[RuntimeException] {
val future = Future { Thread.sleep(10); sys.error("whoops"); 22 }
Await.result(future, Duration.Inf)
}.getMessage should equal ("whoops")
}
But this doesn't:
it("should propagate errors") {
intercept[StackOverflowError] {
val future = Future { Thread.sleep(10); throw new StackOverflowError("dang"); 22 }
Await.result(future, Duration.Inf)
}.getMessage should equal ("dang")
}
The future in this second test never returns. Why doesn't an Error subclass (as opposed to an Exception subclass) terminate my future? How should I handle Errors?
EDIT: This is possibly related, but not identical, to Why does Scala Try not catching java.lang.StackOverflowError?. I'm not using Try here. The core issue is that the Future never returns at all; I can't catch any error from it because it just hangs.
The reporter facility is meant for catastrophes like this. It just hooks into the thread's UncaughtExceptionHandler, and it looks like it works out of the box only with the default thread factory:
scala 2.13.0-M5> import concurrent._,java.util.concurrent.Executors
import concurrent._
import java.util.concurrent.Executors
scala 2.13.0-M5> val ec = ExecutionContext.fromExecutor(null, e => println(s"Handle: $e"))
ec: scala.concurrent.ExecutionContextExecutor = scala.concurrent.impl.ExecutionContextImpl$$anon$3@5e7c141d[Running, parallelism = 4, size = 0, active = 0, running = 0, steals = 0, tasks = 0, submissions = 0]
scala 2.13.0-M5> val f = Future[Int](throw new NullPointerException)(ec)
f: scala.concurrent.Future[Int] = Future(<not completed>)
scala 2.13.0-M5> f
res0: scala.concurrent.Future[Int] = Future(Failure(java.lang.NullPointerException))
scala 2.13.0-M5> val f = Future[Int](throw new StackOverflowError)(ec)
Handle: java.lang.StackOverflowError
f: scala.concurrent.Future[Int] = Future(<not completed>)
whereas
scala 2.13.0-M5> val ec = ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor, e => println(s"Handle: $e"))
ec: scala.concurrent.ExecutionContextExecutor = scala.concurrent.impl.ExecutionContextImpl@317a118b
scala 2.13.0-M5> val f = Future[Int](throw new StackOverflowError)(ec)
f: scala.concurrent.Future[Int] = Future(<not completed>)
Exception in thread "pool-1-thread-1" java.lang.StackOverflowError
at $line14.$read$$iw$$iw$$iw$$iw$.$anonfun$f$1(<console>:1)
at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
at scala.util.Success.$anonfun$map$1(Try.scala:261)
at scala.util.Success.map(Try.scala:209)
at scala.concurrent.impl.Promise$Transformation.doMap(Promise.scala:420)
at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:402)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
You could construct a rig that registers a future when it runs, and a safe await that knows when threads have blown up. Maybe you want to retry an algorithm with a lower max recursion depth, for example.
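One simple reading of such a rig, sketched here under stated assumptions rather than as the answerer's actual design: run the body on the ExecutionContext yourself, catch every Throwable, and complete a Promise with fatal errors boxed in an ExecutionException so that Await.result returns instead of hanging. SafeFuture is a hypothetical helper name. Catching a genuine StackOverflowError or other VirtualMachineError is not guaranteed to be safe, so treat this as an illustration only:

```scala
import java.util.concurrent.ExecutionException
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.util.control.NonFatal

object SafeFuture {
  // Like Future.apply, but a fatal Throwable thrown by `body` completes the
  // future with a boxed failure instead of leaving it incomplete forever.
  def apply[A](body: => A)(implicit ec: ExecutionContext): Future[A] = {
    val p = Promise[A]()
    ec.execute(new Runnable {
      def run(): Unit = {
        try p.success(body)
        catch {
          case NonFatal(e)      => p.failure(e) // ordinary exceptions pass through unchanged
          case fatal: Throwable =>              // box fatal errors so awaiting callers see them
            p.failure(new ExecutionException("Boxed fatal error", fatal))
        }
        ()
      }
    })
    p.future
  }
}
```

A caller awaiting the result can then catch the ExecutionException, inspect getCause, and, for example, retry with a lower maximum recursion depth.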
As pointed out in the comments, this is a duplicate of Why does Scala Try not catching java.lang.StackOverflowError?
According to the Scala documentation:
Note: only non-fatal exceptions are caught by the combinators on Try (see scala.util.control.NonFatal). Serious system errors, on the other hand, will be thrown.
In other words, Try does not catch every Throwable: fatal Errors such as StackOverflowError pass straight through.
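A small sketch of that difference, assuming Scala 2.11 or later (where NonFatal treats StackOverflowError as fatal). TryFatalDemo and describe are hypothetical names used only for this illustration:

```scala
import scala.util.{Failure, Success, Try}

object TryFatalDemo {
  // Runs `body` inside Try and reports whether Try captured the outcome
  // or whether an Error escaped it and had to be caught by a plain try/catch.
  def describe(body: => Int): String =
    try {
      Try(body) match {
        case Success(v) => s"Success($v)"
        case Failure(e) => s"Failure(${e.getClass.getSimpleName})"
      }
    } catch {
      case e: Error => s"escaped Try: ${e.getClass.getSimpleName}"
    }
}
```

Here TryFatalDemo.describe(throw new IllegalStateException("x")) ends up in the Failure branch, while TryFatalDemo.describe(throw new StackOverflowError("dang")) only reaches the outer catch.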
Also to answer your question about how error handling is usually done.
In Scala you can use try / catch for code that can cause exceptions (very similar to Java):
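import java.io.IOException
try {
// ... your dangerous code in here
} catch {
// the more specific exception comes first
case ioe: IOException => println(s"I/O problem: ${ioe.getMessage}")
case e: Exception => println(s"Other failure: ${e.getMessage}")
}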
And you should always have the more specific exceptions first.
The code you provided would look something like this:
https://scastie.scala-lang.org/2DJXJ6ESS9ySJZSwSodmZg
I also tried out your code, and it definitely produces the StackOverflowError.
But it can't be caught properly, as the link mentioned above explains.
def fixture =
new {
val xyz = new XYZ(spark)
}
val fList: scala.collection.mutable.MutableList[scala.concurrent.Future[Dataset[Row]]] = scala.collection.mutable.MutableList[scala.concurrent.Future[Dataset[Row]]]() //mutable List of future means List[Future]
test("test case") {
val tasks = for (i <- 1 to 10) {
fList ++ scala.collection.mutable.MutableList[scala.concurrent.Future[Dataset[Row]]](Future {
println("Executing task " + i )
val ds = read(fixture.etlSparkLayer,i)
ds
})
}
Thread.sleep(1000*4200)
val futureOfList = Future.sequence(fList)//list of Future job in Future sequence
println(Await.ready(futureOfList, Duration.Inf))
val await_result: Seq[Dataset[Row]] = Await.result(futureOfList, Duration.Inf)
println("Squares: " + await_result)
futureOfList.onComplete {
case Success(x) => println("Success!!! " + x)
case Failure(ex) => println("Failed !!! " + ex)
}
}
I am running a test case that builds a list of Futures and combines them with Future.sequence. I am trying to execute the same function multiple times in parallel using Scala Futures. On my system only 4 jobs start at a time; once those 4 complete, the next 4 start, and so on until all the jobs are done. So how can I start more than 4 jobs at a time, and how can I make the main thread wait for all the Futures to complete? I tried Await.result and Await.ready but was not able to control the main thread, so I am using Thread.sleep for that. The program reads from an RDBMS table and writes to Elasticsearch. The main issue is how to control the main thread.
Assuming that you use the scala.concurrent.ExecutionContext.Implicits.global ExecutionContext you can tune the number of threads as described here:
https://github.com/scala/scala/blob/2.12.x/src/library/scala/concurrent/impl/ExecutionContextImpl.scala#L100
Specifically, the following System Properties: scala.concurrent.context.minThreads, scala.concurrent.context.numThreads, scala.concurrent.context.maxThreads, and scala.concurrent.context.maxExtraThreads.
Otherwise, you can rewrite your code to something like this:
import scala.collection.immutable
import scala.concurrent.duration._
import scala.concurrent._
import java.util.concurrent.Executors
test("test case") {
implicit val ec = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(NUMBEROFTHREADSYOUWANT))
val aFuture = Future.traverse(1 to 10) {
i => Future {
println("Executing task " + i)
read(fixture.etlSparkLayer,i) // If this is a blocking operation you may want to consider wrapping it in a `blocking {}`-block.
}
}
aFuture.onComplete(_ => ec.shutdownNow()) // Only for this test, and to make sure the pool gets cleaned up
val await_result: immutable.Seq[Dataset[Row]] = Await.result(aFuture, 60.minutes) // Or other timeout
println("Squares: " + await_result)
}
I have a question about scala's Future.
Currently I have a program that runs through a directory and checks if there is a document.
If there is a file, the program should convert this file into a ".pdf".
My code looks like this (it's pseudocode):
for(file <- directory) {
if(timestamp > filetimestamp) {
Future {
// do a convert job that returns UNIT
}
}
}
Is this valid code or do I need to wait for the return value?
Are there any alternatives that are as lightweight as Futures?
To transform the result inside a Future, simply use map and flatMap. The functions you pass are run asynchronously once the preceding Future has completed, and they are type safe.
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
for(file <- directory) {
if(timestamp > filetimestamp) {
val future = Future {
// do a convert job that returns Unit
} map { _ =>
// whatever you want to do once the conversion has finished
}
}
}
Warning!
If any Future throws a "NonFatal" error, it will be swallowed. This is a serious gotcha when using Future[Unit]: if no code ever evaluates the future, errors can disappear down a black hole. (It affects any Future[_], but if you are returning a value, you normally do something with it, so the error is discovered.)
scala> import scala.concurrent.ExecutionContext.Implicits.global
scala> scala.concurrent.Future { throw new IllegalArgumentException("foo") }
res16: scala.concurrent.Future[Nothing] = scala.concurrent.impl.Promise$DefaultPromise@64dd3f78
scala> import scala.concurrent.ExecutionContext.Implicits.global
scala> scala.concurrent.Future { throw new IllegalArgumentException("foo"); 42 }
res11: scala.concurrent.Future[Int] = scala.concurrent.impl.Promise$DefaultPromise@65c8295b
An alternative which accomplishes the same thing, but does not hide the error:
scala> val context = scala.concurrent.ExecutionContext.Implicits.global
context: scala.concurrent.ExecutionContextExecutor = scala.concurrent.impl.ExecutionContextImpl@1fff4cac
scala> context.execute(new Runnable {
| def run() = throw new IllegalArgumentException("foo")
| })
java.lang.IllegalArgumentException: foo
at $line48.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$anon$1.run(<console>:34)
at $line48.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$anon$1.run(<console>:33)
at scala.concurrent.impl.ExecutionContextImpl$$anon$3.exec(ExecutionContextImpl.scala:107)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Yes, that is valid. All the Futures you create will be run, and their return values will be discarded and garbage collected.
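If you do fire and forget, one way to keep failures from disappearing is to attach the error handling to the Future itself. This is a minimal sketch; LoggedFuture and the report callback are hypothetical names for illustration, and only non-fatal exceptions are handled:

```scala
import scala.concurrent.{ExecutionContext, Future}
import scala.util.control.NonFatal

object LoggedFuture {
  // Runs `body` asynchronously and hands any non-fatal exception to `report`
  // (for example a logger) instead of letting it vanish with the discarded Future.
  def apply(body: => Unit)(report: Throwable => Unit)(
      implicit ec: ExecutionContext): Future[Unit] =
    Future(body).recover { case NonFatal(e) => report(e) }
}
```

In the directory loop this could be used as LoggedFuture { /* convert the file */ } (e => println(s"conversion failed: $e")), with an implicit ExecutionContext in scope.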