Reducing code repetition with recurring code in if-else - scala

I'm wondering if the following short snippet, which contains repetition, can be made more DRY. I seem to run into this kind of construction quite often.
Say I want some computation to be done either synchronously or asynchronously, and the choice is made at runtime.
for(i <- 1 to reps) {
Thread.sleep(expDistribution.sample().toInt)
if (async) {
Future {
sqlContext.sql(query).collect()
}
} else {
sqlContext.sql(query).collect()
}
}
It feels clumsy repeating the call to the sqlContext. Is there an idiom for this trivial recurring construct?

You can "store" your computation in a local def and then evaluate it either synchronously or asynchronously:
def go = sqlContext.sql(query).collect()
if(async) Future(go) else Future.successful(go)
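The local-def idea can also be wrapped in a small helper so that the if/else lives in one place. This is a minimal sketch, assuming a hypothetical helper named runMaybeAsync (it is not part of any library). It uses Future.fromTry instead of Future.successful so that an exception thrown on the synchronous branch ends up inside the returned Future rather than escaping to the caller:

```scala
import scala.concurrent.{ExecutionContext, Future}
import scala.util.Try

object MaybeAsync {
  // Hypothetical helper (an assumption, not a library function).
  // async = true:  `body` runs on the given execution context.
  // async = false: `body` runs on the calling thread and its outcome
  //                (value or exception) is wrapped in a completed Future.
  def runMaybeAsync[T](async: Boolean)(body: => T)(implicit ec: ExecutionContext): Future[T] =
    if (async) Future(body) else Future.fromTry(Try(body))
}
```

With this in scope, the loop body from the question could be written as runMaybeAsync(async) { sqlContext.sql(query).collect() }.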

You can execute a Future in the current thread by using MoreExecutors.directExecutor(), which is implemented in the Guava library.
(If you don't want to use the Guava library, see this question.)
Using this method, you can switch the execution context according to the async flag.
Here's the sample code.
You can see that setting the async flag to false makes each Future execute in order.
import com.google.common.util.concurrent.MoreExecutors
import scala.concurrent.{Future,ExecutionContext}
import java.lang.Thread
object Main {
def main(args: Array[String]): Unit = {
val async = false // change this to switch sync/async
implicit val ec = if(async){
ExecutionContext.Implicits.global // use thread pool
}else{
ExecutionContext.fromExecutor(MoreExecutors.directExecutor) // execute directly in the current thread
}
println(Thread.currentThread.getId)
Future{
Thread.sleep(1000)
println(Thread.currentThread.getId)
}
Future{
Thread.sleep(2000)
println(Thread.currentThread.getId)
}
Thread.sleep(4000) // if you are doing asynchronously, you need this.
}
}
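For readers who would rather not pull in Guava, the same switch can be sketched with a hand-written execution context. This is only an illustration under that assumption: the names DirectContext, direct and contextFor are hypothetical, and the direct context simply runs every task on whichever thread submits it.

```scala
import scala.concurrent.ExecutionContext

object DirectContext {
  // Runs each submitted task immediately on the submitting thread.
  val direct: ExecutionContext = new ExecutionContext {
    def execute(runnable: Runnable): Unit = runnable.run()
    def reportFailure(cause: Throwable): Unit = cause.printStackTrace()
  }

  // Chooses between the global thread pool and the direct context.
  def contextFor(async: Boolean): ExecutionContext =
    if (async) ExecutionContext.global else direct
}
```

Declaring implicit val ec = DirectContext.contextFor(async) would then play the same role as the Guava-based context in the sample above.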

Related

Is map of Future lazy or not?

Basically I mean:
for(v <- Future(long time operation)) yield v*someOtherValue
This expression returns another Future, but the question is: is the v*someOtherValue operation lazy or not? Will this expression block while getting the value of Future(long time operation)?
Or it is like a chain of callbacks?
A short experiment can test this question.
import concurrent._;
import concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
object TheFuture {
def main(args: Array[String]): Unit = {
val fut = for (v <- Future { Thread.sleep(2000) ; 10 }) yield v * 10;
println("For loop is finished...")
println(Await.ready(fut, Duration.Inf).value.get);
}
}
If we run this, we see For loop is finished... almost immediately, and then two seconds later, we see the result. So the act of performing map or similar operations on a future is not blocking.
A map (or, equivalently, your for comprehension) on a Future is not lazy: it will be executed as soon as possible on another thread. However, since it runs on another thread, it isn't blocking, either.
If you want to do the definition and execution of the Future separately, then you have to use something like a Monix Task.
https://monix.io/api/3.0/monix/eval/Task.html
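Without an extra library, a rough approximation of "define now, run later" is to wrap the Future in a function value, so that nothing starts until the function is called. This is a sketch only, with hypothetical names (DeferredFuture, runs, deferred). Unlike a Monix Task, each call starts a fresh computation and there is no cancellation.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object DeferredFuture {
  // Counts how many times the computation has actually been started.
  val runs = new AtomicInteger(0)

  // Defining this value starts nothing; the Future is only created
  // (and therefore only begins executing) when deferred() is called.
  val deferred: () => Future[Int] =
    () => Future { runs.incrementAndGet(); 10 }.map(_ * 10)
}
```

Calling DeferredFuture.deferred() is the point at which the work is scheduled; until then runs stays at zero.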

Alternative to await.ready

I have the following code in Scala:
val status: Future[String] = Await.ready(Http(address OK as.String), 1 second)
I'm making an HTTP call and waiting up to one second for an answer.
I was told it's not good practice to block using Await.ready.
I'm curious what I can use instead.
Can I use for comprehensions? How?
It is generally bad to block on an asynchronous operation; otherwise, why make it asynchronous at all?
You can use the various combinators on Future[T], such as map, to register a continuation that is invoked when the result arrives. For example, let's assume you want to parse the String result into a Foo object. You'd get:
val result: Future[Foo] = Http(address OK as.String).map {
s => parseJson[Foo](s)
}
Note that when working with Future[T], you'll end up bubbling them up the call chain of the execution, unless you synchronously block.
The same can be achieved with a for comprehension:
for {
s <- Http(address OK as.String)
} yield (parseJson[Foo](s))
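A for comprehension becomes more useful once several futures are combined. The sketch below assumes two hypothetical asynchronous calls, fetchA and fetchB, standing in for real requests. Both futures are created before the for comprehension so that they run concurrently; creating them inside it would run them one after the other.

```scala
import scala.concurrent.{ExecutionContext, Future}

object Combine {
  // Hypothetical stand-ins for real asynchronous calls.
  def fetchA()(implicit ec: ExecutionContext): Future[Int] = Future(40)
  def fetchB()(implicit ec: ExecutionContext): Future[Int] = Future(2)

  def sum()(implicit ec: ExecutionContext): Future[Int] = {
    val a = fetchA() // started here
    val b = fetchB() // started here as well, independently of a
    for {
      x <- a
      y <- b
    } yield x + y // still a Future; no thread is blocked
  }
}
```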
Using Await.ready is not good practice because it blocks. In most cases you can compose and transform the futures to achieve the desired result.
But you can block when it's absolutely necessary. Here is my answer about blocking and its consequences: When to and when not use blocking
Non-Blocking wait
def afterSomeTime(code: => Unit)(duration: FiniteDuration): Unit = {
someActorSystem.scheduler.scheduleOnce(duration) {
code
}
}
The function above runs the code after the given duration. You can use any other timer implementation instead of the Akka scheduler.
case class TimeoutException(msg: String) extends Exception(msg)
def timeout[T](future: => Future[T])(duration: FiniteDuration)(implicit ec: ExecutionContext): Future[T] = {
val promise = Promise[T]()
future.onComplete(promise.tryComplete) // whichever of the future or the timer finishes first completes the promise
afterSomeTime {
promise tryFailure TimeoutException(s"Future timeout after ${duration.toString()}")
}(duration)
promise.future
}
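The same promise-based timeout can be sketched without Akka by using a ScheduledExecutorService from the JDK as the timer. This is an illustration under that assumption rather than the answer's own implementation. The names FutureTimeout and withTimeout are hypothetical, and it fails with java.util.concurrent.TimeoutException instead of a custom case class.

```scala
import java.util.concurrent.{Executors, ThreadFactory, TimeUnit, TimeoutException}
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration.FiniteDuration

object FutureTimeout {
  // One daemon thread used only to fire timeouts, so it never keeps the JVM alive.
  private val timer = Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "future-timeout")
      t.setDaemon(true)
      t
    }
  })

  def withTimeout[T](future: Future[T], duration: FiniteDuration)(implicit ec: ExecutionContext): Future[T] = {
    val promise = Promise[T]()
    // Fail the promise once the duration has elapsed, unless it is already completed.
    val pending = timer.schedule(new Runnable {
      def run(): Unit = {
        promise.tryFailure(new TimeoutException(s"Future timed out after $duration"))
        ()
      }
    }, duration.toMillis, TimeUnit.MILLISECONDS)
    // If the original future finishes first, drop the timer and forward its result.
    future.onComplete { result =>
      pending.cancel(false)
      promise.tryComplete(result)
    }
    promise.future
  }
}
```

Note that the underlying computation is not stopped when the timeout fires; only the returned Future fails early.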

Scala parallel calculation and interrupt when one method returns a result

Given several methods:
def methodA(args:Int):Int={
//too long calculation
result
}
def methodB(args:Int):Int={
//too long calculation
result
}
def methodC(args:Int):Int={
//too long calculation
result
}
They take the same set of arguments and return a result of the same type.
I need to run the methods in parallel and, as soon as one of them returns a result, interrupt the others.
This was an interesting question. I'm not too experienced, so there is probably a more Scala-ish way to do this, for starters with some of the cancellable future implementations. However, I traded brevity for readability.
import java.util.concurrent.{Callable, FutureTask}
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global
def longRunningTask(foo: Int => String)(arg: Int): FutureTask[String] = {
new FutureTask[String](new Callable[String]() {
def call(): String = {
foo(arg)
}
})
}
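def killTasks(tasks: Seq[FutureTask[_]]): Unit = {
// cancel(true) interrupts a task that is already running; cancel(false) would only stop tasks that have not started yet
tasks.foreach(_.cancel(true))
}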
val task1 = longRunningTask(methodA)(3)
val task2 = longRunningTask(methodB)(4)
val task1Future = Future {
task1.run()
task1.get()
}
val task2Future = Future {
task2.run()
task2.get()
}
val firstFuture = Future.firstCompletedOf(List(task1Future, task2Future))
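// foreach runs the callback only if the first future completes successfully
// (onSuccess is deprecated since Scala 2.12 and removed in 2.13)
firstFuture.foreach { result =>
println(result)
killTasks(List(task1, task2))
}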
So, what did I do here? I created a helper method which creates a FutureTask (from the Java API) that executes an Int => String operation on some Int argument. I also created a helper method which kills all FutureTasks in a collection.
After that I created two long-running tasks with two different methods and inputs (that's your part).
The next part is a bit ugly: basically, I run each FutureTask inside a Scala Future and call get to fetch the result.
After that I use the firstCompletedOf method from the Future companion to handle the first task that finishes, and once that result has been processed I kill the rest of the tasks.
I'll probably make an effort to make this code a bit better, but start with this.
You should also try this with different ExecutionContexts and parallelism levels.
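The steps described above can be condensed into one generic helper. This is a sketch under stated assumptions: FirstResult and firstOf are hypothetical names, and interruption only takes effect if the computations react to Thread.interrupt (for example because they sleep, block on I/O, or check the interrupt flag themselves).

```scala
import java.util.concurrent.{Callable, FutureTask}
import scala.concurrent.{ExecutionContext, Future}

object FirstResult {
  // Runs all computations in parallel and returns the first one to complete.
  // Once a winner exists, the remaining tasks are cancelled with interruption.
  def firstOf[T](computations: Seq[() => T])(implicit ec: ExecutionContext): Future[T] = {
    val tasks = computations.map { c =>
      new FutureTask[T](new Callable[T] { def call(): T = c() })
    }
    // Each Scala Future drives one FutureTask and then reads its result.
    val futures = tasks.map(t => Future { t.run(); t.get() })
    val first = Future.firstCompletedOf(futures)
    // Cancelling a task that has already finished is a no-op.
    first.onComplete(_ => tasks.foreach(_.cancel(true)))
    first
  }
}
```

A call such as FirstResult.firstOf(Seq(() => methodA(3), () => methodB(4))) would race the two methods from the question. Be aware that firstCompletedOf also completes on the first failure, not only on the first success.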

Do Futures always end up not returning anything?

Given that we must avoid...
1) Modifying state
2) Blocking
...what is a correct end-to-end usage for a Future?
The general practice in using Futures seems to be transforming them into other Futures by using map, flatMap etc. but it's no good creating Futures forever.
Will there always be a call to onComplete somewhere, with methods writing the result of the Future to somewhere external to the application (e.g. web socket; the console; a message broker) or is there a non-blocking way of accessing the result?
All of the information on Futures in the Scaladocs - http://docs.scala-lang.org/overviews/core/futures.html seem to end up writing to the console. onComplete doesn't return anything, so presumably we have to end up doing some "fire-and-forget" IO.
e.g. a call to println
f onComplete {
case Success(number) => println(number)
case Failure(err) => println("An error has occurred: " + err.getMessage)
}
But what about in more complex cases where we want to do more with the result of the Future?
As an example, in the Play framework Action.async can return a Future[Result] and the framework handles the rest. Will it eventually have to expect never to get a result from the Future?
We know the user needs to be returned a Result, so how can a framework do this using only a Unit method?
Is there a non-blocking way to retrieve the value of a future and use it elsewhere within the application, or is a call to Await inevitable?
Best practice is to use callbacks such as onComplete, onSuccess and onFailure for side-effecting operations, e.g. logging, monitoring, I/O.
If you need to continue with the result of your Future computation, as opposed to performing a side-effecting operation, you should use map to get access to the result of your computation and compose over it.
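To make that split concrete, here is a minimal end-to-end sketch. The names Pipeline, fetchNumber, render, response and run are hypothetical stand-ins, not part of any framework. The value stays inside a Future while it is transformed, and only the outermost layer registers a side-effecting callback, much as a framework would when it finally writes the response.

```scala
import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}

object Pipeline {
  // Hypothetical asynchronous source and pure formatting step.
  def fetchNumber()(implicit ec: ExecutionContext): Future[Int] = Future(21)
  def render(n: Int): String = s"result = $n"

  // Composition: every step returns a new Future and nothing blocks.
  def response()(implicit ec: ExecutionContext): Future[String] =
    fetchNumber()
      .map(_ * 2)
      .map(render)
      .recover { case e => s"error: ${e.getMessage}" }

  // Edge of the application: the only place where a side effect happens.
  def run()(implicit ec: ExecutionContext): Unit =
    response().onComplete {
      case Success(text) => println(text)
      case Failure(err)  => println("unexpected: " + err.getMessage)
    }
}
```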
onComplete returns Unit, yes. That's because it only registers a callback, which is triggered asynchronously. You need to register a callback in order to gather the result.
From your referenced scaladoc (with my comments):
// first assign the future with expected return type to a variable.
val f: Future[List[String]] = Future {
session.getRecentPosts
}
// immediately register the callbacks
f onFailure {
case t => println("An error has occurred: " + t.getMessage)
}
f onSuccess {
case posts => for (post <- posts) println(post)
}
Or instead of println-ing you could do something with the result:
f onSuccess {
case posts: List[String] => someFunction(posts)
}
Try this out:
import scala.concurrent.duration._
import scala.concurrent._
import scala.concurrent.ExecutionContext.Implicits.global
val f: Future[Int] = Future { 43 }
val result: Int = Await.result(f, 1.second) // a zero timeout would throw a TimeoutException unless f had already completed
So what is going on here?
You're defining a computation to be executed on a different thread.
So your Future { 43 } returns immediately.
Then you can wait for it and gather the result (via Await.result), or define a computation on it without waiting for it to be completed (via map etc.).
Actually, the kind of Future you are talking about is used for side effects. The result returned by a Future depends on its type:
val f = Future[Int] { 42 }
For example, I could send the result of a Future[Int] to another Future:
val f2 = f.flatMap(integer => Future{ println(integer) }) // may print 42
As you know, a future is a process that happens concurrently. So you can get its result in the future (that is, using methods such as onComplete) or by explicitly blocking the current thread until it gets a value:
import scala.concurrent.Await
import akka.util.Timeout
import scala.concurrent.duration._
implicit val timeout = Timeout(5 seconds)
val integer = Await.result(Future { 42 }, timeout.duration)
Usually, when you start dealing with asynchronous processes, you have to think in terms of reactions which may never occur. Using chained Futures is like declaring a possible chain of events which could be broken at any moment. Therefore, waiting for a Future's value is definitely not good practice, as you may never get it:
val integer = Await.result(Future { throw new RuntimeException() }, timeout.duration) // will throw an uncaught exception
Try to think more in terms of events, than in procedures.

why does this scala by-name parameter behave weirdly

OK the question might not say much, but here's the deal:
I'm learning Scala and decided to write a utility class, "FuncThread", with a method that receives a by-name parameter (I guess it's called that because it's a function without a parameter list) and then starts a thread with a Runnable, which in turn executes the passed function. I wrote the class as follows:
class FuncThread
{
def runInThread( func: => Unit)
{
val thread = new Thread(new Runnable()
{
def run()
{
func
}
})
thread.start()
}
}
Then I wrote a junit test as follows:
@Test
def weirdBehaivorTest()
{
var executed = false
val util = new FuncThread()
util.runInThread
{
executed = true
}
//the next line makes the test pass....
//val nonSense : () => Unit = () => { Console println "???" }
assertTrue(executed)
}
If I uncomment the second commented line, the test passes, but if it remains commented out, the test fails. Is this the correct behaviour? How and when do by-name parameter functions get executed?
I know Scala has the actors library, but I wanted to try this since I've always wanted to do this in Java.
Is this just a race condition? runInThread starts the thread, but your assertion tests 'executed' before the other thread sets it to true. Adding your extra line means more code (and so more time) is executed before the assertion, making it more likely that 'executed' has already been set to true.
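One way to remove the race is to wait for the thread before asserting. The sketch below is an assumption about how the utility could be changed, not the original class: runInThread returns the Thread it started so that the caller can join it, and the names FuncThreadDemo and executedAfterJoin are hypothetical.

```scala
object FuncThreadDemo {
  // Same idea as the original runInThread, but the started thread is returned.
  def runInThread(func: => Unit): Thread = {
    val thread = new Thread(new Runnable {
      def run(): Unit = func // the by-name argument is evaluated here, on the new thread
    })
    thread.start()
    thread
  }

  def executedAfterJoin(): Boolean = {
    var executed = false
    val thread = runInThread { executed = true }
    thread.join() // blocks until run() has finished, so the write is visible afterwards
    executed
  }
}
```

In the test, calling join() on the returned thread before assertTrue(executed) makes the outcome deterministic.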
It's also worth noting that (as of Scala 2.8) the construct you were trying to write is available in the standard library:
import scala.actors.Futures._
future{
executed = true
}
This construct is actually more powerful than what you're describing: the computation running in the other thread can return a value, which can be waited for.
import scala.actors.Futures._
//forks off expensive calculation
val expensiveToCalculateNumber:Future[Int] = future{
bigExpensiveCalculation()
}
// do a lot of other stuff
//print out the result of the expensive calculation if it's ready, otherwise wait until it is
println( expensiveToCalculateNumber());