Getting a spurious fibre trace with ZIO and a for comprehension - Scala

I have just started to look at ZIO to try to improve my Scala, beginning by updating some of my old code. I have wrapped some legacy code which returns a configuration wrapped in an Option, and I'm converting that to a ZIO which I'm then using in a for comprehension.
The code works as expected but if I return None I get:
Fiber failed.
A checked error was not handled.
None
Fiber:Id(1612612180323,1) was supposed to continue to:
a future continuation at tryzio.MyApp$.run(MyApp.scala:94)
a future continuation at zio.ZIO.exitCode(ZIO.scala:543)
Fiber:Id(1612612180323,1) execution trace:
at zio.ZIO$.fromOption(ZIO.scala:3246)
at tryzio.MyApp$.run(MyApp.scala:93)
The code works as expected for both a Some and a None, but I get the spurious Fibre messages for a None.
The code that generates this is really very simple:
def run(args: List[String]) = ({
  for {
    cmdln <- ZIO.fromOption(getConfig(args.toArray))
    _     <- putStrLn("Model Data Builder")
    _     <- putStrLn(s"\nVerbose: ${cmdln.verbose}")
  } yield ()
}).exitCode
But I have clearly missed something obvious! I'm new to ZIO, so please use small words when explaining my lack of understanding. Do I need to join all fibres? I have tried to catch the None and then exit, but my code never stops running if I do that - very strange.

.exitCode caught an error (the empty configuration) which wasn't handled in your code, printed debug information to stderr, and exited the program with status 1. So it works as expected.
I agree that the error message is a bit misleading and should start with something business-related rather than "Fiber failed."
You might file a ticket on GitHub.
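If you would rather handle the missing configuration yourself instead of letting .exitCode report it, something along these lines should work - a minimal sketch against ZIO 1.x (the version shown in the trace), where Config and getConfig are hypothetical stand-ins for the wrapped legacy code:
import zio._
import zio.console._

object MyApp extends App {
  final case class Config(verbose: Boolean)

  // Hypothetical stand-in for the wrapped legacy call
  def getConfig(args: Array[String]): Option[Config] = ???

  def run(args: List[String]): URIO[ZEnv, ExitCode] =
    (for {
      cmdln <- ZIO.fromOption(getConfig(args.toArray))
                 .orElseFail(new RuntimeException("no configuration supplied"))
      _     <- putStrLn("Model Data Builder")
      _     <- putStrLn(s"\nVerbose: ${cmdln.verbose}")
    } yield ()).foldM(
      e => putStrLn(s"Exiting: ${e.getMessage}").ignore.as(ExitCode.failure),
      _ => ZIO.succeed(ExitCode.success)
    )
}
orElseFail turns the None into a more descriptive error, and foldM decides the exit code explicitly, so nothing is left unhandled for the runtime to dump as a fiber trace.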

Related

Exception handling in Apache Spark

I have been researching the proper way of handling exceptions in Apache Spark jobs. I have read through different questions on Stack Overflow but I still haven't reached a conclusion. From my point of view there are three ways of handling exceptions:
A try/catch block surrounding the lambda function that is going to perform the computation. This is tricky because the block has to be placed around the code that triggers the lazy computation. If an error happens, I assume there won't be any RDD to work with (taken from this blog entry):
val lines: RDD[String] = sc.textFile("large_file.txt")
val tokens =
  lines.flatMap(_ split " ")
       .map(s => s(10))
try {
  // This try-catch block catches all the exceptions thrown by the
  // preceding transformations.
  tokens.saveAsTextFile("/some/output/file.txt")
} catch {
  case e: StringIndexOutOfBoundsException =>
    // Doing something in response to the exception
}
A try/catch block inside the lambda function: this implies deciding on the correct output for a caught exception inside the lambda function, roughly along these lines:
rdd.map { record =>
  Try(fn(record)) match {
    case Success(value) => Some(value)   // keep the successful result
    case Failure(_)     => None          // stands in for a "record with error flag"
  }
}.filter(_.isDefined)
Let the exception propagate. The task will fail and the Spark framework will relaunch it. This works when the error is caused by something outside the code's scope, e.g. a memory issue or a momentarily lost connection to another service.
What's the correct way of handling exceptions?
I guess it depends on what you want to achieve with the RDD operation. If an error in one of the RDD records means that the output is not valid, then option 1 is the way to go. If we expect some of the records to fail, we go for option 2. Option 3 doesn't even require a choice, as it is the normal behaviour of the platform.
In the past we did not bother with the try/catch approach except for input parameter checking.
For the rest we just relied on checking the return code, as in:
spark-submit --master yarn ... bla bla
ret_val=$?
...
Why? Because in general something needs to be corrected and the job then has to start over; it's hard to correct certain things dynamically. Your scheduling tool (Rundeck, Airflow, et al.) can pick up the return code as well.
More advanced restart options are possible but quickly get convoluted, though they could be done - as you allude to in option 2. I have never seen that done, though.

Scala - differences between eventually timeout and Thread.sleep()

I have some async (ZIO) code which I need to test. If I write the testing part using Thread.sleep() it works fine and I always get a response:
for {
  saved  <- database.save(smth)
  result <- eventually {
    Thread.sleep(20000)
    database.search(...)
  }
} yield result
But if I write the same logic using timeout and interval from eventually, then it never works correctly (I get timeouts):
for {
  saved  <- database.save(smth)
  result <- eventually(timeout(Span(20, Seconds)), interval(Span(20, Seconds))) {
    database.search(...)
  }
} yield result
I do not understand why timeout and interval work differently than Thread.sleep. They should be doing exactly the same thing. Can someone explain it to me and tell me how I should change this code so that I do not need to use Thread.sleep()?
Assuming database.search(...) returns a ZIO value, eventually { database.search(...) } most probably succeeds immediately on the first try: it successfully creates a description of a task to query the database, and the database is then queried without any retry logic.
Regarding how to make it work:
val search: ZIO[Any, Throwable, String] = ???
val retried: ZIO[Any with Clock, Throwable, Option[String]] =
  search
    .retry(Schedule.spaced(Duration.fromMillis(1000)))
    .timeout(Duration.fromMillis(20000))
Something like that should work. But I believe that more elegant solutions exist.
The other answer from @simpadjo addresses the "what" quite succinctly. I'll add some additional context as to why you might see this behavior.
for {
  saved  <- database.save(smth)
  result <- eventually {
    Thread.sleep(20000)
    database.search(...)
  }
} yield result
There are three different technologies being mixed here which is causing some confusion.
First is ZIO, an asynchronous programming library that uses its own custom runtime and execution model to perform tasks. The second is eventually, which comes from ScalaTest and is useful for checking asynchronous computations by effectively polling the state of a value. And thirdly, there is Thread.sleep, a Java API that literally suspends the current thread and prevents task progression until the timer expires.
eventually uses a simple retry mechanism that differs based on whether you are using a normal value or a Future from the Scala standard library. Basically, it runs the code in the block and, if it throws, it sleeps the current thread and then retries based on some interval configuration, eventually timing out. Notably, in this case the behavior is entirely synchronous, meaning that as long as the value in the {} doesn't throw an exception it won't keep retrying.
Thread.sleep is a heavyweight operation and in this case it effectively blocks the function passed to eventually from progressing for 20 seconds, meaning that by the time database.search is called the save operation has likely completed.
The second variant is different: it executes the code in the eventually block immediately, and if it throws an exception it will attempt it again based on the interval/timeout logic that you provide. In this scenario the save may not have completed (or propagated, if the database is eventually consistent). Because you are returning a ZIO, which is designed not to throw, and eventually doesn't understand ZIO, it will simply return the search attempt with no retry logic.
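To make the "retries only when the block throws" point concrete, here is a small ScalaTest-only sketch (no ZIO involved; the attempts counter is made up purely for illustration):
import org.scalatest.concurrent.Eventually._

var attempts = 0
// eventually re-runs the block because the failing assert throws;
// a ZIO value returned from the block never throws, so it would never be retried.
eventually {
  attempts += 1
  assert(attempts >= 3)
}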
The accepted answer:
val retried: ZIO[Any with Clock, Throwable, Option[String]] =
  search
    .retry(Schedule.spaced(Duration.fromMillis(1000)))
    .timeout(Duration.fromMillis(20000))
works because retry and timeout are built-in ZIO operators which do understand how to actually retry and time out a ZIO, meaning that if search fails, the retry will handle it until it succeeds (or the timeout expires).
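For completeness, here is a hedged sketch of how the retried search might slot back into the original save-then-search flow, written against ZIO 1.x (save and search are hypothetical stand-ins for the database calls):
import zio._
import zio.clock.Clock
import zio.duration._

def save(smth: String): Task[Unit]      = ???   // hypothetical database.save
def search(query: String): Task[String] = ???   // hypothetical database.search

def saveThenFind(smth: String, query: String): ZIO[Clock, Throwable, Option[String]] =
  for {
    _      <- save(smth)
    result <- search(query)
                .retry(Schedule.spaced(1.second))   // retry a *failed* search every second
                .timeout(20.seconds)                // give up after 20 seconds overall
  } yield result
Note that retry only fires when the effect fails; if search succeeds with an empty result rather than failing, you would first need to turn "not found" into a failure (for example with filterOrFail) before the retry schedule applies.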

How to write async Scala.js test (e.g. using ScalaTest)?

Some of my code is async, and I want to test that this code's execution has resulted in the correct state. I do not have a reference to a Future or a JS Promise that I could map over – the async code in question lives inside a JS library that I'm using, and it just calls setTimeout(setSomeState, 0), which is why my only recourse is to test the state asynchronously, after a short delay (10 ms).
This is my best attempt:
import org.scalatest.{Assertion, AsyncFunSpec, Matchers}
import scala.concurrent.Promise
import scala.scalajs.js
import scala.scalajs.concurrent.JSExecutionContext

class FooSpec extends AsyncFunSpec with Matchers {
  implicit override def executionContext = JSExecutionContext.queue

  it("async works") {
    val promise = Promise[Assertion]()
    js.timers.setTimeout(10) {
      promise.success {
        println("FOO")
        assert(true)
      }
    }
    promise.future
  }
}
This works when the assertion succeeds – with assert(true). However, when the assertion fails (e.g. if you replace it with assert(false)), the test suite freezes up. sbt just stops printing anything and hangs indefinitely; the test suite never completes. In the case of such a failure, the FooSpec: line does get printed, but not the name of the test ("async works"), nor the "FOO" string.
If I comment out the executionContext line, I get the "Queue is empty while future is not completed, this means you're probably using a wrong ExecutionContext for your task, please double check your Future." error which is explained in detail in one of the links below.
I think these links are relevant to this problem:
https://github.com/scalatest/scalatest/issues/910
https://github.com/scalatest/scalatest/issues/1039
But I couldn't figure out a solution that would work.
Should I be building the Future[Assertion] in a different way, maybe?
I'm not tied to ScalaTest, but judging by the comments in one of the links above it seems that uTest has a similar problem except it tends to ignore the tests instead of stalling the test suite.
I just want to make assertions after a short delay; it seems like this should definitely be possible. Any advice on how to accomplish that would be much appreciated.
As was explained to me in this scala.js gitter thread, I'm using Promise.success incorrectly. That method expects a value to complete the promise with, but assert(false) throws an exception; it does not return a value of type Assertion.
Since in my code assert(false) is evaluated before Promise.success is called, the exception is thrown before the promise has a chance to complete. However, the exception is thrown in an asynchronous callback to setTimeout, so it is invisible to ScalaTest. ScalaTest is then left waiting for a promise.future that never completes (because the underlying promise never completes).
Basically, my code is equivalent to this:
val promise = Promise[Assertion]()
js.timers.setTimeout(10) {
  println("FOO")
  val successValue = assert(false) // exception thrown here
  promise.success(successValue)    // this line does not get executed
}
promise.future
Instead, I should have used Promise.complete, which expects a Try. Try.apply accepts its argument by name, meaning it is evaluated inside Try.apply, so any exception thrown by the assertion is captured as a Failure instead of escaping.
So the working code looks like this:
it("async works") {
val promise = Promise[Assertion]()
js.timers.setTimeout(10) {
promise.complete(Try({
println("FOO")
assert(true)
})
})
promise.future
}
The real answer here is: you should try to get the "async" part out of your unit test setup.
Anything dealing with waits, sleeps, and so on adds a level of complexity that you do not want to have in your unit tests.
As you are not showing the production code you are using, I can only make some suggestions on how to approach this topic at a general level.
Example: when you build your threading on top of Java's ExecutorService, you can go for a same-thread executor service; then your unit tests run on a single thread, and many things become much easier.
Long story short: look into the "concept" that gives you "async" in your solution, and see whether there are ways to avoid the "really async" part - but of course without (!) making test-only changes to your production code.
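As a rough illustration of that "same-thread" idea (computeAsync is a made-up example, not the asker's code), an ExecutionContext that runs tasks on the calling thread makes Future-based code effectively synchronous in tests:
import scala.concurrent.{ExecutionContext, Future}
import scala.util.Success

// An ExecutionContext that runs every task immediately on the calling thread.
val sameThread: ExecutionContext = new ExecutionContext {
  def execute(runnable: Runnable): Unit = runnable.run()
  def reportFailure(cause: Throwable): Unit = throw cause
}

// Production code takes the ExecutionContext as a parameter...
def computeAsync(x: Int)(implicit ec: ExecutionContext): Future[Int] = Future(x * 2)

// ...and the test injects the same-thread context, so the Future is already
// completed by the time the assertion runs - no sleeps or timeouts needed.
implicit val ec: ExecutionContext = sameThread
assert(computeAsync(21).value.contains(Success(42)))
This does not help when the asynchrony comes from a JS library's setTimeout, as in the question, but it shows the general shape of keeping the "really async" part out of the test.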

Why am I getting an (inconsistent) compiler error when using a for comprehension with a result of Try[Unit]?

A common pattern in my code base is to define methods returning Try[Unit] to indicate the method is doing something (like writing a transaction log to a SaaS REST endpoint) where there is no interesting return result and I want to capture and emit any errors encountered to stdout. I have read on another StackOverflow thread that this is entirely acceptable (also see the 2nd comment on the question).
I am specifically and explicitly avoiding actually throwing and catching an exception which is an extraordinarily expensive activity on most JVM implementations. This means any answer which uses the Try() idiom won't help resolve my issue.
I have the following code (vastly simplified from my real life scenario):
def someMethod(transactionToLog: String): Try[Unit] = {
  val result =
    for {
      _ <- callToSomeMethodReturningATryInstance
    } yield Unit
  result match {
    case Success(_) => () // nothing to do
    case Failure(e) => println(s"oopsie - ${e.getMessage}")
  }
  result
}
Sometimes this code compiles just fine. Sometimes it doesn't. When it doesn't, it gives me the following compilation error:
Error:(row, col) type mismatch;
found : scala.util.Try[Unit.type]
required: scala.util.Try[Unit]
result
^
Sometimes the IntelliJ syntax highlighter shows the code as fine. Other times it shows an error (roughly the same as the one above). Sometimes I get the compiler error but not a highlighter error, and other times it compiles fine and I get only a highlighter error. And thus far, I am having a difficult time finding a perfect "example" to capture the compiler error.
I attempted to "cure" the issue by adding .asInstanceOf[Unit] to the result of the for comprehension. That got it past the compiler. However, now I am getting a runtime exception of java.lang.ClassCastException: scala.Unit$ cannot be cast to scala.runtime.BoxedUnit. Naturally, this is infinitely worse than the original compilation error.
So, two questions:
Assuming Try[Unit] isn't valid, what is the Scala idiomatic (or even just preferred) way to specify a Try which returns no useful Success result?
Assuming Try[Unit] is valid, how do I get past the compilation error (described above)?
SIDENOTE:
I must have hit this problem a year ago and didn't remember the details. I created the solution below which I have all over my code bases. However, recently I started using Future[Unit] in a number of places. And when I first tried Try[Unit], it appeared to work. However, the number of times it is now causing both compilation and IDE highlighting issues has grown quite a bit. So, I want to make a final decision about how to proceed which is consistent with whatever exists (or is even emerging) and is Scala idiomatic.
package org.scalaolio.util

/** This package object serves to ease Scala interactions with mutating
  * methods.
  */
package object Try_ {
  /** Placeholder type for when a mutator method does not intend to
    * return anything with a Success, but needs to be able to return a
    * Failure with an unthrown Exception.
    */
  sealed case class CompletedNoException private[Try_] ()

  /** Placeholder instance for when a mutator method needs to indicate
    * success via Success(completedNoExceptionSingleton)
    */
  val completedNoExceptionSingleton = new CompletedNoException()
}
Replace yield Unit with yield ().
Unit is a type - () is a value (and the only possible value) of type Unit.
That is why for { _ <- expr } yield Unit results in a Try[Unit.type], just as yield List would result in a Try[List.type] (the singleton type of the companion object).
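Applied to the method above, the fixed version looks like this (callToSomeMethodReturningATryInstance is still just a stand-in):
import scala.util.{Failure, Success, Try}

def callToSomeMethodReturningATryInstance: Try[String] = ???  // stand-in

def someMethod(transactionToLog: String): Try[Unit] = {
  val result: Try[Unit] =
    for {
      _ <- callToSomeMethodReturningATryInstance
    } yield ()                      // () is the Unit value; `yield Unit` would infer Try[Unit.type]
  result match {
    case Success(_) => ()           // nothing interesting to report
    case Failure(e) => println(s"oopsie - ${e.getMessage}")
  }
  result
}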

Catching unhandled errors in Scala futures

If a Scala future fails, and there is no continuation that "observes" that failure (or the only continuations use map/flatMap and don't run in case of failure), then errors go undetected. I would like such errors to be at least logged, so I can find bugs.
I use the term "observed error" because in .Net Tasks there is the chance to catch "unobserved task exceptions", when the Task object is collected by the GC. Similarly, with synchronous methods, uncaught exceptions that terminate the thread can be logged.
In Scala futures, to 'observe' a failure would mean that some continuation or other code reads the Exception stored in the future value before that future is disposed. I'm aware that finalization is not deterministic or reliable, and presumably that's why it's not used to catch unhandled errors, although .Net does succeed in doing this.
Is there a way to achieve this in Scala? If not, how should I organize my code to prevent unhandled error bugs?
Today I have andThen checkResult appended to various futures. But it's hard to know when to use this and when not to: if a library method returns a Future, it shouldn't checkResult and log errors itself, because the library user may handle the failure, so the responsibility falls onto the user. As I edit code I sometimes need to add checks and sometimes to remove them, and such manual management is surely wrong.
I have concluded there is no way to generically notice unhandled errors in Scala futures.
You can just use Future.recover in the function that returns the Future.
So for instance, you could just "log" the error and rethrow the original exception, in the simplest case:
def libraryFunction(): Future[Int] = {
  val f = ...
  f.recover {
    case NonFatal(t) =>
      println("Error : " + t)
      throw t
  }
}
Note the use of NonFatal to match all the exception types it is sensible to catch.
That recover block could equally return an alternative result if you wish.
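If you want the logging without altering the returned Future at all (closer to the andThen checkResult approach mentioned in the question), a small sketch along these lines also works - logged is a made-up helper name:
import scala.concurrent.{ExecutionContext, Future}
import scala.util.Failure
import scala.util.control.NonFatal

// Runs a logging side effect on failure and passes the original result through untouched.
def logged[T](label: String)(f: Future[T])(implicit ec: ExecutionContext): Future[T] =
  f.andThen {
    case Failure(NonFatal(t)) => println(s"$label failed: $t")
  }

// Usage: logged("library call")(libraryFunction())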