Basically, I'm running two Cassandra queries that return futures; then I need to do some computation on the results and return a value (an average).
Here is my code:
object TestWrapFuture {
def main(args: Array[String]) {
val category = 5392
ExtensiveComputation.average(category).onComplete {
case Success(s) => println(s)
case Failure(f) => throw new Exception(f)
}
}
}
class ExtensiveComputation {
val volume = new ListBuffer[Int]()
def average(categoryId: Int): Future[Double] = {
val productsByCategory = Product.findProductsByCategory(categoryId)
productsByCategory.map { prods =>
for (prod <- prods if prod._2) {
Sku.findSkusByProductId(prod._1).map { skus =>
skus.foreach(sku => volume += (sku.height.get * sku.width.get * sku.length.get))
}
}
val average = volume.sum / volume.length
average
}
}
}
object ExtensiveComputation extends ExtensiveComputation
So what is the problem?
The skus.foreach calls append the result values to a ListBuffer. Since everything is async, when I try to obtain the result in my main, I get an error saying I can't divide by zero.
Indeed, since Sku.findSkusByProductId returns a Future, volume is still empty when I try to compute the average.
Should I block on something before this computation, or should I do something else?
EDIT
Well, I tried to block like this:
val volume = new ListBuffer[Int]()
def average(categoryId: Int): Future[Double] = {
val productsByCategory = Product.findProductsByCategory(categoryId)
val blocked = productsByCategory.map { prods =>
for (prod <- prods if prod._2) {
Sku.findSkusByProductId(prod._1).map { skus =>
skus.foreach(sku => volume += (sku.height.get * sku.width.get * sku.length.get))
}
}
}
Await.result(blocked, Duration.Inf)
val average = volume.sum / volume.length
Future.successful(average)
}
Then I got two different results from this piece of code:
Sku.findSkusByProductId(prod._1).map { skus =>
skus.foreach(sku => volume += (sku.height.get * sku.width.get * sku.length.get))
}
1 - When there are just a few SKUs (around 50) to look up in Cassandra, it runs and gives me the result
2 - When there are many (around 1000), it gives me
java.lang.ArithmeticException: / by zero
EDIT 2
I tried this code as @Olivier Michallat proposed:
def average(categoryId: Int): Future[Double] = {
val productsByCategory = Product.findProductsByCategory(categoryId)
productsByCategory.map { prods =>
for (prod <- prods if prod._2) findBlocking(prod._1)
volume.sum / volume.length
}
}
def findBlocking(productId: Long) = {
val future = Sku.findSkusByProductId(productId).map { skus =>
skus.foreach(sku => volume += (sku.height.get * sku.width.get * sku.length.get))
}
Await.result(future, Duration.Inf)
}
And the following as @kolmar proposed:
def average(categoryId: Int): Future[Int] = {
for {
prods <- Product.findProductsByCategory(categoryId)
filtered = prods.filter(_._2)
skus <- Future.traverse(filtered)(p => Sku.findSkusByProductId(p._1))
} yield {
val volumes = skus.flatten.map(sku => sku.height.get * sku.width.get * sku.length.get)
volumes.sum / volumes.size
}
}
Both work when there are only a few SKUs to find (around 50), but both fail when there are many (around 1000), throwing ArithmeticException: / by zero
It seems the computation has not finished before the future is returned...
You need to wait until all the futures generated by findSkusByProductId have completed before you compute the average. So accumulate all these futures in a Seq, call Future.sequence on it to get a Future[Seq], then map that future to a function that computes the average. Then replace productsByCategory.map with a flatMap.
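The Future.sequence approach described above could look roughly like the sketch below. The Sku case class and the two find... functions are hypothetical stand-ins for the Cassandra queries in the question (they return canned data), and the global execution context is assumed only to keep the example self-contained.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SequenceSketch {
  // Hypothetical stand-ins for the Cassandra-backed queries in the question.
  final case class Sku(height: Int, width: Int, length: Int)

  def findProductsByCategory(categoryId: Int): Future[Seq[(Long, Boolean)]] =
    Future(Seq((1L, true), (2L, false), (3L, true)))

  def findSkusByProductId(productId: Long): Future[Seq[Sku]] =
    Future(Seq(Sku(1, 2, 3), Sku(2, 2, 2)))

  def average(categoryId: Int): Future[Double] =
    // flatMap, because the function passed in returns a Future itself
    findProductsByCategory(categoryId).flatMap { prods =>
      // Keep every SKU future instead of firing it and forgetting it.
      val skuFutures: Seq[Future[Seq[Sku]]] =
        prods.filter(_._2).map(p => findSkusByProductId(p._1))

      // Completes only once every SKU query has completed.
      Future.sequence(skuFutures).map { skuLists =>
        val volumes = skuLists.flatten.map(s => s.height * s.width * s.length)
        if (volumes.isEmpty) 0.0 else volumes.sum.toDouble / volumes.size
      }
    }

  def main(args: Array[String]): Unit =
    println(Await.result(average(5392), 5.seconds))
}
```

No shared mutable ListBuffer is needed: the volumes are computed from the values the futures deliver, so there is nothing to read too early. Returning 0.0 for an empty result is an arbitrary choice made for this sketch.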
Since you have to call a function that returns a Future on a sequence of arguments, it's better to use Future.traverse for that.
For example:
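import scala.concurrent.Future
// an implicit ExecutionContext must also be in scope

object ExtensiveComputation {
  def average(categoryId: Int): Future[Double] = {
    for {
      products <- Product.findProductsByCategory(categoryId)
      filtered = products.filter(_._2)
      // one query per product; the combined future completes only when all of them have
      skus <- Future.traverse(filtered)(p => Sku.findSkusByProductId(p._1))
    } yield {
      // skus holds one collection of SKUs per product, so flatten before mapping
      val volumes = skus.flatten.map { sku =>
        sku.height.get * sku.width.get * sku.length.get }
      // convert before dividing to avoid integer division;
      // with no volumes at all this yields NaN rather than an ArithmeticException
      volumes.sum.toDouble / volumes.size
    }
  }
}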
I'm trying to execute a function within a given time frame, but if the computation times out I want to get a partial result instead of only an exception.
The attached code solves it.
The timedRun function is from "Computation with time limit".
Is there a better approach?
package ga
object Ga extends App {
//this is the ugly...
var bestResult = "best result";
try {
val result = timedRun(150)(bestEffort())
} catch {
case e: Exception =>
print ("timed at = ")
}
println(bestResult)
//dummy function
def bestEffort(): String = {
var res = 0
for (i <- 0 until 100000) {
res = i
bestResult = s" $res"
}
" " + res
}
//This is the elegant part from stackoverflow gruenewa
@throws(classOf[java.util.concurrent.TimeoutException])
def timedRun[F](timeout: Long)(f: => F): F = {
import java.util.concurrent.{ Callable, FutureTask, TimeUnit }
val task = new FutureTask(new Callable[F]() {
def call() = f
})
new Thread(task).start()
task.get(timeout, TimeUnit.MILLISECONDS)
}
}
I would introduce a small intermediate class for more explicitly communicating the partial results between threads. That way you don't have to modify non-local state in any surprising ways. Then you can also just catch the exception within the timedRun method:
// @volatile so that writes from the worker thread are visible to the caller
class Result[A](@volatile var result: A)
val result = timedRun(150)("best result")(bestEffort)
println(result)
//dummy function
def bestEffort(r: Result[String]): Unit = {
var res = 0
for (i <- 0 until 100000) {
res = i
r.result = s" $res"
}
r.result = " " + res
}
def timedRun[A](timeout: Long)(initial: A)(f: Result[A] => _): A = {
import java.util.concurrent.{ Callable, FutureTask, TimeUnit }
val result = new Result(initial)
val task = new FutureTask(new Callable[A]() {
def call() = { f(result); result.result }
})
new Thread(task).start()
try {
task.get(timeout, TimeUnit.MILLISECONDS)
} catch {
case e: java.util.concurrent.TimeoutException => result.result
}
}
It's admittedly a bit awkward since you don't usually have the "return value" of a function passed in as a parameter. But I think it's the least-radical modification of your code that makes sense. You could also consider modeling your computation as something that returns a Stream or Iterator of partial results, and then essentially do .takeWhile(notTimedOut).last. But how feasible that is really depends on the actual computation.
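As a rough illustration of the Iterator idea, assuming the computation can be expressed as a lazy sequence of successively better results (the bestWithin helper is a hypothetical name, not from any library):

```scala
object PartialResults {
  // Consumes `steps` until the deadline passes and returns the last value seen,
  // or `initial` if no step finished in time.
  def bestWithin[A](timeoutMillis: Long)(initial: A)(steps: Iterator[A]): A = {
    val deadline = System.nanoTime + timeoutMillis * 1000000L
    steps
      .takeWhile(_ => deadline - System.nanoTime > 0)
      .foldLeft(initial)((_, next) => next)
  }

  def main(args: Array[String]): Unit =
    // Iterator.from(0) stands in for a real stream of improving results.
    println(bestWithin(150)(-1)(Iterator.from(0)))
}
```

The deadline is only checked between steps, so this works when each individual step is short; a single long-running step cannot be interrupted this way.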
First, you need one of the solutions for recovering after a future times out; unfortunately, none is built into Scala:
See: Scala Futures - built in timeout?
For example:
def withTimeout[T](fut:Future[T])(implicit ec:ExecutionContext, after:Duration) = {
val prom = Promise[T]()
val timeout = TimeoutScheduler.scheduleTimeout(prom, after)
val combinedFut = Future.firstCompletedOf(List(fut, prom.future))
fut onComplete{case result => timeout.cancel()}
combinedFut
}
Then it is easy:
var bestResult = "best result"
val expensiveFunction = Future {
var res = 0
for (i <- 0 until 10000) {
Thread.sleep(10)
res = i
bestResult = s" $res"
}
" " + res
}
val timeoutFuture = withTimeout(expensiveFunction) recover {
case _: TimeoutException => bestResult
}
println(Await.result(timeoutFuture, 1 seconds))
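TimeoutScheduler in the snippet above comes from the linked answer and is not defined here. As a rough, dependency-free sketch of the same idea, a ScheduledExecutorService from the JDK can fail a Promise after the delay; the object name and the explicit `after` parameter are choices made for this example only.

```scala
import java.util.concurrent.{Executors, ThreadFactory, TimeUnit, TimeoutException}
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object TimeoutSketch {
  // Single daemon thread that only fires timeouts.
  private val scheduler = Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
    def newThread(r: Runnable): Thread = { val t = new Thread(r); t.setDaemon(true); t }
  })

  def withTimeout[T](fut: Future[T], after: FiniteDuration): Future[T] = {
    val prom = Promise[T]()
    val task = scheduler.schedule(new Runnable {
      def run(): Unit = { prom.tryFailure(new TimeoutException(s"timed out after $after")); () }
    }, after.toMillis, TimeUnit.MILLISECONDS)
    fut.onComplete(_ => task.cancel(false)) // no need to fire once the work is done
    Future.firstCompletedOf(Seq(fut, prom.future))
  }

  @volatile private var bestResult = "best result"

  def main(args: Array[String]): Unit = {
    val slow = Future {
      for (i <- 0 until 10000) { Thread.sleep(10); bestResult = s" $i" }
      bestResult
    }
    // Fall back to the latest partial result when the timeout wins the race.
    val guarded = withTimeout(slow, 200.millis).recover { case _: TimeoutException => bestResult }
    println(Await.result(guarded, 1.second))
  }
}
```

As in the original snippet, the timeout only decides which result is reported; the slow future itself keeps running in the background.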
I have two Scala functions that are expensive to run. Each one looks like the code below: it keeps improving the value of a variable. I'd like to run them simultaneously and, after 5 minutes (or some other time), terminate both functions and take their latest values up to that point.
def func1(n: Int): Double = {
var a = 0.0D
while (not terminated) {
/// improve value of 'a' with algorithm 1
}
}
def func2(n: Int): Double = {
var a = 0.0D
while (not terminated) {
/// improve value of 'a' with algorithm 2
}
}
I would like to know how I should structure my code for doing that and what is the best practice here? I was thinking about running them in two different threads with a timeout and return their latest value at time out. But it seems there can be other ways for doing that. I am new to Scala so any insight would be tremendously helpful.
It is not hard. Here is one way of doing it:
@volatile var terminated = false
def func1(n: Int): Double = {
var a = 0.0D
while (!terminated) {
a = 0.0001 + a * 0.99999; //some useless formula1
}
a
}
def func2(n: Int): Double = {
var a = 0.0D
while (!terminated) {
a += 0.0001 //much simpler formula2, just for testing
}
a
}
def main(args: Array[String]): Unit = {
val f1 = Future { func1(1) } //work starts here
val f2 = Future { func2(2) } //and here
//aggregate results into one common future
val aggregatedFuture = for{
f1Result <- f1
f2Result <- f2
} yield (f1Result, f2Result)
Thread.sleep(500) //wait here for some calculations in ms
terminated = true //this is where we actually command to stop
//since looping to while() takes time, we need to wait for results
val res = Await.result(aggregatedFuture, 50.millis)
//just a printout
println("results:" + res)
}
But, of course, you may want to look at your while loops and restructure them into more manageable, chainable calculations.
Output: results:(9.999999999933387,31206.34691883926)
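One possible way to make the loops more manageable, sketched under the assumption that each algorithm can be written as a single improvement step: the loop and the stop condition move into a small helper, and a Deadline replaces the shared flag. The names improve and runBoth are made up for this example.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ImproveUntil {
  // Repeatedly applies `step` to the current value until `stop()` returns true.
  def improve(start: Double)(step: Double => Double)(stop: () => Boolean): Double = {
    var a = start
    while (!stop()) a = step(a)
    a
  }

  // Runs both algorithms in parallel for roughly `budget` and returns their latest values.
  def runBoth(budget: FiniteDuration): (Double, Double) = {
    val deadline = budget.fromNow
    val stop = () => deadline.isOverdue()
    val f1 = Future(improve(0.0)(a => 0.0001 + a * 0.99999)(stop)) // formula 1 from above
    val f2 = Future(improve(0.0)(_ + 0.0001)(stop))                // formula 2 from above
    // Both loops stop on their own, so the extra second is only a safety margin.
    Await.result(f1.zip(f2), budget + 1.second)
  }

  def main(args: Array[String]): Unit =
    println("results:" + runBoth(500.millis))
}
```

Because each loop checks the deadline itself, no sleep in the caller and no shared mutable flag are required.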
I am not 100% sure if this is something you would want to do, but here is one approach (not for 5 minutes, but you can change that) :
object s
{
def main(args: Array[String]): Unit = println(run())
def run(): (Int, Int) =
{
val (s, numNanoSec, seedVal) = (System.nanoTime, 500000000L, 0)
Seq(f1 _, f2 _).par.map(f =>
{
var (i, id) = f(seedVal)
while (System.nanoTime - s < numNanoSec)
{
i = f(i)._1
}
(i, id)
}).seq.maxBy(_._1)
}
def f1(a: Int): (Int, Int) = (a + 1, 1)
def f2(a: Int): (Int, Int) = (a + 2, 2)
}
Output:
me@ideapad:~/junk> scala s.scala
(34722678,2)
me@ideapad:~/junk> scala s.scala
(30065688,2)
me@ideapad:~/junk> scala s.scala
(34650716,2)
Of course this all assumes you have at least two threads available to distribute tasks to.
You can use Future with Await result to do that:
def fun2(): Double = {
var a = 0.0f
val f = Future {
// improve a with algorithm 2
a
}
try {
Await.result(f, 5 minutes)
} catch {
case e: TimeoutException => a
}
}
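Use Await.result to wait for the algorithm with a timeout; when the timeout is hit, return a directly. Note that the timeout only stops the waiting: the Future body keeps running unless it checks a stop condition itself.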
Given two scala play enumerators A and B that each provide sorted integers, is there a way to derive an enumerator of integers that exist in B that don't exist in A?
For example:
val A: Enumerator[Int] = Enumerator(1,3,5,9,11,13)
and
val B: Enumerator[Int] = Enumerator(1,3,5,7,9,11,13)
I would somehow get:
val C: Enumerator[Int] // This enumerator will output 7
Doing it in a reactive way with enumerators/iteratees/enumeratees is preferred.
One solution I've thought of is to interleave the enumerators and somehow use Iteratee.fold to maintain a buffer to compare the two streams but that seems like it should be unnecessary.
I had a somewhat similar question:
How to merge 2 Enumerators in one, based on merge rule
I modified the given answer to fit your needs:
object Disjunction {
def disjunction[E: Ordering](enumA: Enumerator[E], enumB: Enumerator[E])(implicit ec: ExecutionContext) = new Enumerator[E] {
def apply[A](iter: Iteratee[E, A]) = {
case class IterateeReturn(o: Option[(Promise[Promise[IterateeReturn]], E)])
val failP: Promise[Nothing] = Promise() // Fail promise
val failPF: Future[Nothing] = failP.future // Fail promise future
val initState1: Future[Seq[IterateeReturn]] = Future.traverse(Seq(enumA, enumB)) {
enum =>
val p: Promise[IterateeReturn] = Promise[IterateeReturn]()
// The flow to transform Enumerator in IterateeReturn form
enum.run(Iteratee.foldM(p)({
(oldP: Promise[IterateeReturn], elem: E) =>
val p = Promise[Promise[IterateeReturn]]()
// Return IterateeReturn pointing to the next foldM Future reference, and current element
oldP success IterateeReturn(Some(p, elem))
// Return new Future as a Result of foldM
p.future
}) map ({
promise => promise success IterateeReturn(None) // Finish last promise with empty IterateeReturn
})
) onFailure {
// In case of failure main flow needs to be informed
case t => failP failure t
}
p.future
}
val initState: Future[List[(Promise[Promise[IterateeReturn]], E)]] = initState1 map (_.map(_.o).flatten.toList)
val newEnum: Enumerator[Option[E]] = Enumerator.unfoldM(initState) { fstate =>
// Whichever happens first: fstate completes, or a failure occurred during iteration
Future.firstCompletedOf(Seq(fstate, failPF)) map { state =>
// state is List[(Promise[Promise[IterateeReturn]], E)]
// sort elements by E
if (state.isEmpty) {
None
} else if (state.length == 1) {
val (oldP, elem) = state.head
val p = Promise[IterateeReturn]()
oldP success p
// Return newState, with this iterator moved
val newState: Future[List[(Promise[Promise[IterateeReturn]], E)]] = p.future.map(ir => ir.o.map(List(_)).getOrElse(Nil))
Some(newState, Some(elem))
} else {
val sorted = state.sortBy(_._2)
val (firstP, fe) = sorted.head
val (secondP, se) = sorted.tail.head
if (fe != se) {
// Move first and combine with the second
val p = Promise[IterateeReturn]()
firstP success p
val newState: Future[List[(Promise[Promise[IterateeReturn]], E)]] = p.future.map(ir => ir.o.map(List(_, (secondP, se))).getOrElse(List((secondP, se))))
// Return new state
Some(newState, Some(fe))
} else {
// Move future 1
val p1 = Promise[IterateeReturn]()
firstP success p1
val fState: Future[Option[(Promise[Promise[IterateeReturn]], E)]] = p1.future.map(ir => ir.o)
// Move future 2
val p2 = Promise[IterateeReturn]()
secondP success p2
val sState: Future[Option[(Promise[Promise[IterateeReturn]], E)]] = p2.future.map(ir => ir.o)
// Combine in new state
val newState = Future.sequence(List(fState, sState)).map(_.flatten)
// Return
Some(newState , None)
}
}
}
}
newEnum &>
Enumeratee.filter(_.isDefined) &>
Enumeratee.map(_.get) apply iter
}
}
}
I checked, it works.
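To make the comparison logic easier to follow in isolation, here is a minimal sketch of "elements of sorted B that are not in sorted A" on plain Scala iterators. It deliberately leaves out Enumerators, Iteratees and all asynchrony, so it illustrates the merge rule only and is not a replacement for the code above.

```scala
object SortedDifference {
  // Elements of `b` that do not occur in `a`; both inputs must be sorted ascending.
  def difference[E](a: Iterator[E], b: Iterator[E])(implicit ord: Ordering[E]): Iterator[E] = {
    val as = a.buffered
    b.filter { x =>
      // Skip everything in `a` that is smaller than x ...
      while (as.hasNext && ord.lt(as.head, x)) as.next()
      // ... and keep x only if `a` has no equal element at this position.
      !(as.hasNext && ord.equiv(as.head, x))
    }
  }

  def main(args: Array[String]): Unit = {
    val a = Iterator(1, 3, 5, 9, 11, 13)
    val b = Iterator(1, 3, 5, 7, 9, 11, 13)
    println(difference(a, b).toList) // List(7)
  }
}
```

Each input is traversed once and only the current head of A is buffered, which is the same property the Enumerator version aims for.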
I am doing the exercises from Learning Concurrent Programming in Scala.
The exercise question is in the code comment below.
The program prints the HTML contents correctly for a valid URL with a sufficiently long timeout.
The program prints "Error occured" for a valid URL with a low timeout.
However, for an invalid URL "Error occured" is not printed. What is the problem with the code below?
/*
* Implement a command-line program that asks the user to input a URL of some website,
* and displays the HTML of that website. Between the time that the user hits ENTER and
* the time that the HTML is retrieved, the program should repetitively print a . to the
* standard output every 50 milliseconds, with a two seconds timeout. Use only futures
* and promises, and avoid the synchronization primitives from the previous chapters.
* You may reuse the timeout method defined in this chapter.
*/
object Excersices extends App {
val timer = new Timer()
def timeout(t: Long = 1000): Future[Unit] = {
val p = Promise[Unit]
val timer = new Timer(true)
timer.schedule(new TimerTask() {
override def run() = {
p success ()
timer cancel()
}
}, t)
p future
}
def printDot = println(".")
val taskOfPrintingDot = new TimerTask {
override def run() = printDot
}
println("Enter a URL")
val lines = io.Source.stdin.getLines()
val url = if (lines hasNext) Some(lines next) else None
timer.schedule(taskOfPrintingDot, 0L, 50.millisecond.toMillis)
val timeOut2Sec = timeout(2.second.toMillis)
val htmlContents = Future {
url map { x =>
blocking {
Source fromURL (x) mkString
}
}
}
Future.firstCompletedOf(Seq(timeOut2Sec, htmlContents)) map { x =>
timer cancel ()
x match {
case Some(x) =>
println(x)
case _ =>
println("Error occured")
}
}
Thread sleep 5000
}
As @Gábor Bakos said, an exception produces a Failure, which is not handled by map:
val fut = Future { Some(Source fromURL ("hhhttp://google.com")) }
scala> fut map { x => println(x) } //nothing printed
res12: scala.concurrent.Future[Unit] = scala.concurrent.impl.Promise$DefaultPromise@5e025724
To handle the failure, use the recover method:
scala> fut recover { case failure => None } map { x => println(x) }
None
res13: scala.concurrent.Future[Unit] = scala.concurrent.impl.Promise$DefaultPromise@578afc83
In your context it's something like:
Future.firstCompletedOf(Seq(timeOut2Sec, htmlContents)) recover {case x => println("Error:" + x); None} map { x => ...}
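A small self-contained sketch of the difference, with no network access: fetch is a hypothetical stand-in for the Source.fromURL call and simply throws for anything that does not look like an http URL.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object RecoverSketch {
  // Hypothetical stand-in for the download: fails the future for a malformed URL.
  def fetch(url: String): Future[Option[String]] = Future {
    if (!url.startsWith("http")) throw new java.net.MalformedURLException(url)
    Some(s"<html>$url</html>")
  }

  // recover turns the Failure into a successful None, so the map below always runs.
  def describe(url: String): Future[String] =
    fetch(url)
      .recover { case _ => None }
      .map {
        case Some(html) => html
        case None       => "Error occurred"
      }

  def main(args: Array[String]): Unit = {
    println(Await.result(describe("http://example.org"), 1.second))
    println(Await.result(describe("hhhttp://google.com"), 1.second))
  }
}
```

Without the recover step, the second call would produce a failed future and the function passed to map would never be invoked, which matches the behaviour described in the question.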
The complete code after using recover as advised by @dk14:
object Exercises extends App {
val timer = new Timer()
def timeout(t: Long = 1000): Future[Unit] = {
val p = Promise[Unit]
val timer = new Timer(true)
timer.schedule(new TimerTask() {
override def run() = {
p success ()
timer cancel ()
}
}, t)
p future
}
def printDot = println(".")
val taskOfPrintingDot = new TimerTask {
override def run() = {
printDot
}
}
println("Enter a URL")
val lines = io.Source.stdin.getLines()
val url = if (lines hasNext) Some(lines next) else None
timer.schedule(taskOfPrintingDot, 0L, 50.millisecond.toMillis)
val timeOut2Sec = timeout(2.second.toMillis)
val htmlContents = Future {
url map { x =>
blocking {
Source fromURL (x) mkString
}
}
}
Future.firstCompletedOf(Seq(timeOut2Sec, htmlContents))
.recover { case x => println("Error:" + x); None }
.map { x =>
timer cancel ()
x match {
case Some(x) =>
println(x)
case _ =>
println("Timeout occurred")
}
}
Thread sleep 5000
}
I'm new to Scala and functional programming. I'm trying out the usual beginner applications and scripts (admittedly over-engineering them a bit).
Anyway, I have this code for a calculator that takes arguments and a switch that dictates the operation to apply to the arguments.
object Main {
def main(args: Array[String]): Unit = {
var calc = calculate( "" , _:List[Int])
var values:List[Int] = List()
if(args.size < 1) println("No arguments supplied") else{
args collect {_ match{
case arg if arg.contains("-") => {
if(values.size>0){
calc(values)
values = List()}
calc = calculate( arg , _:List[Int])
}
case arg => {
try{
val value=arg.toInt
values = values ::: List(value)
}catch{case e:NumberFormatException=>println("\nError:Invalid argument:\""+arg+"\"\nCannot convert to Integer.\n")}
}
}
}
calc(values)
}
}
def sum(values:List[Int]) { println("The sum is:"+(values.foldLeft(0)((sum,value) => sum + value))) }
def subtract(values:List[Int]) {
val initial:Int = values(0)
var total:Int = 0
for(i <- 1 until values.size){
total = total + values(i)
}
val diff:Int = initial - total
println("The difference is:"+diff)
}
def calculate(operation:String,values:List[Int]) = operation match {
case "-sum" => sum(values)
case "-diff" => subtract(values)
case _ => println("Default operation \"Sum\" will be applied");sum(values)
}
}
One thing I'd like to find out is whether there's a better way to do this, for example by removing the try/catch statement.
A better way to compose this application would be very welcome.
How about this one?
object Main extends App {
  require(args.nonEmpty, "Please, supply more arguments")

  @annotation.tailrec
  def parseArguments(arguments: List[String], operation: String, values: List[Int] = Nil): Unit =
    arguments match {
      case op :: unprocessed if op.startsWith("-") =>
        // a new switch: apply the previous operation to the values collected so far
        if (values.nonEmpty) calculate(operation, values.reverse)
        parseArguments(unprocessed, op)
      case maybeNumber :: unprocessed =>
        val newValues = try {
          maybeNumber.toInt :: values // prepended, so reversed before use
        } catch {
          case _: NumberFormatException =>
            println("\nError:Invalid argument:\"" + maybeNumber + "\"\nCannot convert to Integer.\n")
            values
        }
        parseArguments(unprocessed, operation, newValues)
      case Nil =>
        // done processing: apply the last operation
        if (values.nonEmpty) calculate(operation, values.reverse)
    }

  parseArguments(args.toList, "")

  def diff(values: List[Int]) = {
    val initial :: tail = values
    initial - tail.sum
  }

  def calculate(operation: String, values: List[Int]) = operation match {
    case "-sum" => println("The sum is " + values.sum)
    case "-diff" => println("The difference is: " + diff(values))
    case _ =>
      println("""Default operation "Sum" will be applied""")
      println("The sum is " + values.sum)
  }
}
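Regarding the try/catch specifically: one option, shown here only as a sketch, is scala.util.Try, which turns the exception into a value that can be pattern matched. The parseInts helper below is a made-up name for illustration and is separate from the Main object above.

```scala
import scala.util.Try

object ParseSketch {
  // Splits raw arguments into the integers that parse and the tokens that do not.
  def parseInts(args: List[String]): (List[Int], List[String]) = {
    val parsed = args.map(a => a -> Try(a.toInt).toOption)
    val values  = parsed.collect { case (_, Some(n)) => n }
    val invalid = parsed.collect { case (a, None) => a }
    (values, invalid)
  }

  def main(args: Array[String]): Unit = {
    val (values, invalid) = parseInts(List("1", "x", "3"))
    invalid.foreach(a => println("Error: invalid argument \"" + a + "\", cannot convert to Integer."))
    println("The sum is " + values.sum)
  }
}
```

Keeping parsing separate from printing also makes the parsing step easy to test on its own.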