How to buffer emissions with a custom weigher function - scala

I need functionality like monix.Observable.bufferTimedAndCounted but with a custom "weigher". I found the bufferTimedWithPressure operator, which allows using an item's weight:
val subj = PublishSubject[String]()
subj
.bufferTimedWithPressure(1.seconds, 5, _ => 3)
.subscribe(s => {
println(s)
Future.successful(Ack.Continue)
})
for (i <- 1 to 60) {
Thread.sleep(100)
subj.onNext(i.toString)
}
But emission happens only once per specified duration. I need behavior like bufferTimedAndCounted, where emission also happens as soon as the buffer is full. How can I achieve that?

I copied BufferTimedObservable from the Monix sources and changed it slightly to add a weight function (note: I have not tested it in all cases):
import java.util.concurrent.TimeUnit
import monix.execution.Ack.{Continue, Stop}
import monix.execution.cancelables.{CompositeCancelable, MultiAssignCancelable}
import monix.execution.{Ack, Cancelable}
import monix.reactive.Observable
import monix.reactive.observers.Subscriber
import scala.collection.mutable.ListBuffer
import scala.concurrent.Future
import scala.concurrent.duration.{Duration, FiniteDuration, MILLISECONDS}
/**
* Copied from the Monix sources, adapted to use size (weight) instead of count
*
*/
final class BufferTimedWithWeigthObservable[+A](source: Observable[A], timespan: FiniteDuration, maxSize: Int, sizeOf: A => Int)
extends Observable[Seq[A]] {
require(timespan > Duration.Zero, "timespan must be strictly positive")
require(maxSize >= 0, "maxSize must be non-negative")
def unsafeSubscribeFn(out: Subscriber[Seq[A]]): Cancelable = {
val periodicTask = MultiAssignCancelable()
val connection = source.unsafeSubscribeFn(new Subscriber[A] with Runnable {
self =>
implicit val scheduler = out.scheduler
private[this] val timespanMillis = timespan.toMillis
// MUST BE synchronized by `self`
private[this] var ack: Future[Ack] = Continue
// MUST BE synchronized by `self`
private[this] var buffer = ListBuffer.empty[A]
// MUST BE synchronized by `self`
private[this] var currentSize = 0
private[this] var sizeOfLast = 0
private[this] var expiresAt = scheduler.clockMonotonic(MILLISECONDS) + timespanMillis
locally {
// Scheduling the first tick, in the constructor
periodicTask := out.scheduler.scheduleOnce(timespanMillis, TimeUnit.MILLISECONDS, self)
}
// Runs periodically, every `timespan`
def run(): Unit = self.synchronized {
val now = scheduler.clockMonotonic(MILLISECONDS)
// Do we still have time remaining?
if (now < expiresAt) {
// If we still have time remaining, it's either a scheduler
// problem, or we rushed to signaling the bundle upon reaching
// the maximum size in onNext. So we sleep some more.
val remaining = expiresAt - now
periodicTask := scheduler.scheduleOnce(remaining, TimeUnit.MILLISECONDS, self)
} else if (buffer != null) {
// The timespan has passed since the last signal so we need
// to send the current bundle
sendNextAndReset(now, byPeriod = true).syncOnContinue(
// Schedule the next tick, but only after we are done
// sending the bundle
run())
}
}
// Must be synchronized by `self`
private def sendNextAndReset(now: Long, byPeriod: Boolean = false): Future[Ack] = {
val prepare = if (byPeriod) buffer else buffer.dropRight(1)
// Reset
if (byPeriod) {
buffer = ListBuffer.empty[A]
currentSize = 0
sizeOfLast = 0
} else {
buffer = buffer.takeRight(1)
currentSize = sizeOfLast
}
// Setting the time of the next scheduled tick
expiresAt = now + timespanMillis
ack = ack.syncTryFlatten.syncFlatMap {
case Continue => out.onNext(prepare)
case Stop => Stop
}
ack
}
def onNext(elem: A): Future[Ack] = self.synchronized {
val now = scheduler.clockMonotonic(MILLISECONDS)
buffer.append(elem)
sizeOfLast = sizeOf(elem)
currentSize = currentSize + sizeOfLast
// Flush when the timespan has expired or the accumulated weight exceeds maxSize
if (expiresAt <= now || (maxSize > 0 && maxSize < currentSize)) {
sendNextAndReset(now)
}
else {
Continue
}
}
def onError(ex: Throwable): Unit = self.synchronized {
periodicTask.cancel()
ack = Stop
buffer = null
out.onError(ex)
}
def onComplete(): Unit = self.synchronized {
periodicTask.cancel()
if (buffer.nonEmpty) {
val bundleToSend = buffer.toList
// In case the last onNext isn't finished, then
// we need to apply back-pressure, otherwise this
// onNext will break the contract.
ack.syncOnContinue {
out.onNext(bundleToSend)
out.onComplete()
}
} else {
// We can just stream directly
out.onComplete()
}
// GC relief
buffer = null
// Ensuring that nothing else happens
ack = Stop
}
})
CompositeCancelable(connection, periodicTask)
}
}
How to use it:
object MonixImplicits {
implicit class RichObservable[+A](source: Observable[A]) {
def bufferTimedAndSized(timespan: FiniteDuration, maxSize: Int, sizeOf: A => Int): Observable[Seq[A]] = {
new BufferTimedWithWeigthObservable(source, timespan, maxSize, sizeOf)
}
}
}
import MonixImplicits._
someObservable.bufferTimedAndSized(1.seconds, 5, item => item.size)
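Leaving the timer aside, the size-based flush rule used in onNext above can be illustrated on a plain collection. This is only a minimal sketch and not part of Monix: `WeightChunking` and `chunkByWeight` are hypothetical names, and unlike the observable above it skips the empty bundle that would appear to be emitted when a single element already exceeds `maxSize`.

```scala
object WeightChunking {
  // Splits items into bundles whose total weight stays within maxSize.
  // The element that would push a bundle over the limit starts the next bundle.
  def chunkByWeight[A](items: Seq[A], maxSize: Int)(sizeOf: A => Int): List[List[A]] = {
    val (closed, open, _) =
      items.foldLeft((Vector.empty[List[A]], Vector.empty[A], 0)) {
        case ((done, current, size), elem) =>
          val weight = sizeOf(elem)
          if (current.nonEmpty && size + weight > maxSize)
            (done :+ current.toList, Vector(elem), weight) // flush, keep elem for the next bundle
          else
            (done, current :+ elem, size + weight)         // still fits, keep accumulating
      }
    // Emit whatever is left, mirroring the final flush in onComplete
    if (open.isEmpty) closed.toList else (closed :+ open.toList).toList
  }
}
```

For example, `WeightChunking.chunkByWeight(Seq("a", "bb", "ccc", "d"), 3)(_.length)` yields `List(List("a", "bb"), List("ccc"), List("d"))`.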

Related

Get partial result on Scala time limited best effort computation

I am trying to execute a function within a given time frame, but if the computation times out, I want a partial result instead of an exception.
The attached code solves it.
The timedRun function is from "Computation with time limit".
Is there a better approach?
package ga
object Ga extends App {
//this is the ugly...
var bestResult = "best result";
try {
val result = timedRun(150)(bestEffort())
} catch {
case e: Exception =>
print ("timed at = ")
}
println(bestResult)
//dummy function
def bestEffort(): String = {
var res = 0
for (i <- 0 until 100000) {
res = i
bestResult = s" $res"
}
" " + res
}
//This is the elegant part from stackoverflow gruenewa
@throws(classOf[java.util.concurrent.TimeoutException])
def timedRun[F](timeout: Long)(f: => F): F = {
import java.util.concurrent.{ Callable, FutureTask, TimeUnit }
val task = new FutureTask(new Callable[F]() {
def call() = f
})
new Thread(task).start()
task.get(timeout, TimeUnit.MILLISECONDS)
}
}
I would introduce a small intermediate class for more explicitly communicating the partial results between threads. That way you don't have to modify non-local state in any surprising ways. Then you can also just catch the exception within the timedRun method:
class Result[A](var result: A)
val result = timedRun(150)("best result")(bestEffort)
println(result)
//dummy function
def bestEffort(r: Result[String]): Unit = {
var res = 0
for (i <- 0 until 100000) {
res = i
r.result = s" $res"
}
r.result = " " + res
}
def timedRun[A](timeout: Long)(initial: A)(f: Result[A] => _): A = {
import java.util.concurrent.{ Callable, FutureTask, TimeUnit }
val result = new Result(initial)
val task = new FutureTask(new Callable[A]() {
def call() = { f(result); result.result }
})
new Thread(task).start()
try {
task.get(timeout, TimeUnit.MILLISECONDS)
} catch {
case e: java.util.concurrent.TimeoutException => result.result
}
}
It's admittedly a bit awkward since you don't usually have the "return value" of a function passed in as a parameter. But I think it's the least-radical modification of your code that makes sense. You could also consider modeling your computation as something that returns a Stream or Iterator of partial results, and then essentially do .takeWhile(notTimedOut).last. But how feasible that is really depends on the actual computation.
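The Iterator idea mentioned at the end of this answer could look roughly like the following. It is a sketch under assumptions: `PartialResults` and `lastBefore` are hypothetical names, the computation is assumed to be expressible as an iterator of successive partial results, and the deadline is only checked between steps, so a single long-running step is not interrupted.

```scala
object PartialResults {
  // Consumes partial results until the deadline passes and returns the last one seen.
  // Returns None if no partial result was produced in time.
  def lastBefore[A](partials: Iterator[A], timeoutMillis: Long): Option[A] = {
    val deadline = System.nanoTime() + timeoutMillis * 1000000L
    partials
      .takeWhile(_ => System.nanoTime() < deadline)       // stop once time is up
      .foldLeft(Option.empty[A])((_, next) => Some(next)) // keep only the latest value
  }
}
```

With the dummy loop from the question, the call might be `PartialResults.lastBefore((0 until 100000).iterator.map(i => s" $i"), 150)`.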
First, you need one of the solutions for recovering after a future has timed out, which unfortunately is not built into Scala:
See: Scala Futures - built in timeout?
For example:
def withTimeout[T](fut:Future[T])(implicit ec:ExecutionContext, after:Duration) = {
val prom = Promise[T]()
val timeout = TimeoutScheduler.scheduleTimeout(prom, after)
val combinedFut = Future.firstCompletedOf(List(fut, prom.future))
fut onComplete{case result => timeout.cancel()}
combinedFut
}
Then it is easy:
var bestResult = "best result"
val expensiveFunction = Future {
var res = 0
for (i <- 0 until 10000) {
Thread.sleep(10)
res = i
bestResult = s" $res"
}
" " + res
}
val timeoutFuture = withTimeout(expensiveFunction) recover {
case _: TimeoutException => bestResult
}
println(Await.result(timeoutFuture, 1 seconds))
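The `TimeoutScheduler` used in `withTimeout` comes from the linked answer and is not shown here. As a rough stand-in, the same idea can be sketched with only the JDK's ScheduledExecutorService. Everything named below (`TimeoutSketch`, the daemon timer thread, the explicit `after` parameter) is an assumption made for illustration, not the linked implementation.

```scala
import java.util.concurrent.{Executors, ThreadFactory, TimeUnit, TimeoutException}
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration.FiniteDuration

object TimeoutSketch {
  // A single daemon thread whose only job is to fire timeouts
  private val timer = Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "timeout-sketch")
      t.setDaemon(true)
      t
    }
  })

  // Completes with fut's result, or fails with TimeoutException once `after` has elapsed
  def withTimeout[T](fut: Future[T], after: FiniteDuration)(implicit ec: ExecutionContext): Future[T] = {
    val prom = Promise[T]()
    val pending = timer.schedule(new Runnable {
      def run(): Unit = { prom.tryFailure(new TimeoutException(s"no result after $after")); () }
    }, after.toMillis, TimeUnit.MILLISECONDS)
    // Cancel the pending timeout as soon as the real future finishes
    fut.onComplete(_ => pending.cancel(false))
    Future.firstCompletedOf(List(fut, prom.future))
  }
}
```

As in the answer above, the caller can then attach `recover { case _: TimeoutException => bestResult }` to fall back to the partial result. Note that the underlying computation keeps running after the timeout; only the returned future is completed early.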

Scala future-derived observable callbacks not getting called

Using scala 2.11.7, rxscala_2.11 0.25.0, rxjava 1.0.16, my oddFutures callbacks don't get called in AsyncDisjointedChunkMultiprocessing.process():
package jj.async
import scala.concurrent.Future
import scala.concurrent.ExecutionContext
import rx.lang.scala.Observable
import jj.async.helpers._
/* Problem: How to multi-process records asynchronously in chunks.
Processing steps:
- fetch finite # of records from a repository (10 at-a-time (<= 10 for last batch) because of downstream limitations)
- process ea. chunk through a filter asynchronously (has 10-record input limit)
- compute the reverse of the filtered result
- enrich (also has 10-record input limit) filtered results asynchronously
- return enriched filtered results once all records are processed
*/
object AsyncDisjointedChunkMultiprocessing {
private implicit val ec = ExecutionContext.global
def process(): List[Enriched] = {
@volatile var oddsBuffer = Set[Int]()
@volatile var enrichedFutures = Observable just Set[Enriched]()
oddFutures.foreach(
odds =>
if (odds.size + oddsBuffer.size >= chunkSize) {
val chunkReach = chunkSize - oddsBuffer.size
val poors = oddsBuffer ++ odds take chunkReach
enrichedFutures = enrichedFutures + poors
oddsBuffer = odds drop chunkReach
} else {
oddsBuffer ++= odds
},
error => throw error,
() => enrichedFutures + oddsBuffer)
enrichedFutures.toBlocking.toList.flatten
}
private def oddFutures: Observable[Set[Int]] =
Repository.query(chunkSize) { chunk =>
evenFuture(chunk) map {
filtered => chunk -- filtered
}
}
private def evenFuture(chunk: Set[Int]): Future[Set[Int]] = {
checkSizeLimit(chunk)
Future { Remote even chunk }
}
}
class Enriched(i: Int)
object Enriched {
def apply(i: Int) = new Enriched(i)
def enrich(poors: Set[Int]): Set[Enriched] = {
checkSizeLimit(poors);
Thread.sleep(1000)
poors map { Enriched(_) }
}
}
object Repository {
def query(fetchSize: Int)(f: Set[Int] => Future[Set[Int]]): Observable[Set[Int]] = {
implicit val ec = ExecutionContext.global
Observable.from {
Thread.sleep(20)
f(Set(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
Thread.sleep(20)
f(Set(11, 12, 13, 14, 15, 16, 17, 18, 19, 20))
Thread.sleep(15)
f(Set(21, 22, 23, 24, 25))
}
}
}
package object helpers {
val chunkSize = 10
implicit class EnrichedObservable(enrichedObs: Observable[Set[Enriched]]) {
def +(poors: Set[Int]): Observable[Set[Enriched]] = {
enrichedObs merge Observable.just {
Enriched.enrich(poors)
}
}
}
def checkSizeLimit(set: Set[_ <: Any]) =
if (set.size > chunkSize) throw new IllegalArgumentException(s"$chunkSize-element limit violated: ${set.size}")
}
// unmodifiable
object Remote {
def even = { xs: Set[Int] =>
Thread.sleep(1500)
xs filter { _ % 2 == 0 }
}
}
Is there something wrong w/ the way I'm creating my Observable.from(Future) in Repository.query()?
The problem is that I am trying to create an observable from multiple futures but Observable.from(Future) only provides for a singular future (the compiler did not complain because I carelessly omitted the separating commas thereby usurping an unsuspecting overload). My sol'n:
object Repository {
def query(f: Set[Int] => Future[Set[Int]])(fetchSize: Int = 10): Observable[Future[Set[Int]]] =
// observable (as opposed to list) because modeling a process
// where the total result size is unknown beforehand.
// Also, not creating or applying because it blocks the futures
(1 to 21 by fetchSize).foldLeft(Observable just Future((Set[Int]()))) { (obs, i) =>
obs + f(DataSource.fetch(i)())
}
}
object DataSource {
def fetch(begin: Int)(fetchSize: Int = 10) = {
val end = begin + fetchSize
Thread.sleep(200)
(for {
i <- begin until end
} yield i).toSet
}
}
where:
implicit class FutureObservable(obs: Observable[Future[Set[Int]]]) {
def +(future: Future[Set[Int]]) =
obs merge (Observable just future)
}
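The mix-up described in this answer can be reproduced without RxScala: a block in braces is a single expression whose value is its last statement, so every statement runs but only the last result is passed on. In the sketch below, `collect` is a hypothetical stand-in for `Observable.from`, and `start` stands in for launching a future.

```scala
object LastExpressionSketch {
  // Stand-in for Observable.from: accepts any number of arguments
  def collect(items: String*): List[String] = items.toList

  def demo(): (List[String], List[String], List[String]) = {
    val started = scala.collection.mutable.ListBuffer.empty[String]
    // Stand-in for starting a future: records the side effect and returns a handle
    def start(name: String): String = { started += name; name }

    // Without commas this is ONE block argument; only the last value reaches collect
    val fromBlock = collect { start("a"); start("b"); start("c") }
    // With commas, all three values are passed
    val fromArgs = collect(start("d"), start("e"), start("f"))
    (fromBlock, fromArgs, started.toList)
  }
}
```

Here `fromBlock` is `List("c")` even though "a" and "b" were also started, which matches the symptom in the question: the earlier futures run, but the observable only ever sees the last one.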

Why does Future.firstCompletedOf not invoke callback on timeout?

I am doing the exercises from Learning Concurrent Programming in Scala.
The exercise question is in the code comment.
The program prints the HTML contents correctly for a valid URL and a sufficiently long timeout.
The program prints "Error occured" for a valid URL and a low timeout.
However, for an invalid URL, "Error occured" is not printed. What is the problem with the code below?
/*
* Implement a command-line program that asks the user to input a URL of some website,
* and displays the HTML of that website. Between the time that the user hits ENTER and
* the time that the HTML is retrieved, the program should repetitively print a . to the
* standard output every 50 milliseconds, with a two seconds timeout. Use only futures
* and promises, and avoid the synchronization primitives from the previous chapters.
* You may reuse the timeout method defined in this chapter.
*/
object Excersices extends App {
val timer = new Timer()
def timeout(t: Long = 1000): Future[Unit] = {
val p = Promise[Unit]
val timer = new Timer(true)
timer.schedule(new TimerTask() {
override def run() = {
p success ()
timer cancel()
}
}, t)
p future
}
def printDot = println(".")
val taskOfPrintingDot = new TimerTask {
override def run() = printDot
}
println("Enter a URL")
val lines = io.Source.stdin.getLines()
val url = if (lines hasNext) Some(lines next) else None
timer.schedule(taskOfPrintingDot, 0L, 50.millisecond.toMillis)
val timeOut2Sec = timeout(2.second.toMillis)
val htmlContents = Future {
url map { x =>
blocking {
Source fromURL (x) mkString
}
}
}
Future.firstCompletedOf(Seq(timeOut2Sec, htmlContents)) map { x =>
timer cancel ()
x match {
case Some(x) =>
println(x)
case _ =>
println("Error occured")
}
}
Thread sleep 5000
}
As @Gábor Bakos said, the exception produces a Failure, which is not handled by map:
val fut = Future { Some(Source fromURL ("hhhttp://google.com")) }
scala> fut map { x => println(x) } //nothing printed
res12: scala.concurrent.Future[Unit] = scala.concurrent.impl.Promise$DefaultPromise@5e025724
To handle the failure, use the recover method:
scala> fut recover { case failure => None } map { x => println(x) }
None
res13: scala.concurrent.Future[Unit] = scala.concurrent.impl.Promise$DefaultPromise@578afc83
In your context it's something like:
Future.firstCompletedOf(Seq(timeOut2Sec, htmlContents)) recover {case x => println("Error:" + x); None} map { x => ...}
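The same recover-then-map pattern, written as a small self-contained sketch. `RecoverSketch` and `describe` are hypothetical names; the point is only that a failed future skips `map` unless the failure is first turned into a value.

```scala
import scala.concurrent.{ExecutionContext, Future}

object RecoverSketch {
  // Turns any exception into None, then renders either the page or an error message
  def describe(page: Future[Option[String]])(implicit ec: ExecutionContext): Future[String] =
    page
      .recover { case _: Exception => None } // a Failure becomes a successful None
      .map {
        case Some(html) => html
        case None       => "Error occurred"
      }
}
```

Without the `recover` step, a future failed by an invalid URL stays failed after `map`, so the function passed to `map` never runs and nothing is printed.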
The complete code after using recover as advised by @dk14:
object Exercises extends App {
val timer = new Timer()
def timeout(t: Long = 1000): Future[Unit] = {
val p = Promise[Unit]
val timer = new Timer(true)
timer.schedule(new TimerTask() {
override def run() = {
p success ()
timer cancel ()
}
}, t)
p future
}
def printDot = println(".")
val taskOfPrintingDot = new TimerTask {
override def run() = {
printDot
}
}
println("Enter a URL")
val lines = io.Source.stdin.getLines()
val url = if (lines hasNext) Some(lines next) else None
timer.schedule(taskOfPrintingDot, 0L, 50.millisecond.toMillis)
val timeOut2Sec = timeout(2.second.toMillis)
val htmlContents = Future {
url map { x =>
blocking {
Source fromURL (x) mkString
}
}
}
Future.firstCompletedOf(Seq(timeOut2Sec, htmlContents))
.recover { case x => println("Error:" + x); None }
.map { x =>
timer cancel ()
x match {
case Some(x) =>
println(x)
case _ =>
println("Timeout occurred")
}
}
Thread sleep 5000
}

How to do thread intercommunication with shared object in Scala?

In the code below, we first create something by calling the create method and destroy it by calling the destroy method. Each method first changes the status of the object, then asynchronously waits until the operation's duration has elapsed. Assume that creating takes 10 seconds and destroying takes 5 seconds. When I call destroy, it sets the destroying boolean to true, and the other thread, which is waiting out the creation duration, should then break out of the loop in waitUntil. But it does not work properly. When I call create, it changes the status and then starts a thread that waits until the creation duration is up. After calling create, I call destroy as below:
map(0, 0).create(....)
map(0, 0).destroy(....)
But changing the destroying variable in destroy did not affect the other thread, which is in the creation block: it does not break and continues until the end of the duration.
@volatile
protected var status: UnitStatus = UnitStatus.NEED_TO_CREATE
@volatile
protected var destroying = false
def destroy(f: Int => Unit): Unit = status match {
case UnitStatus.DESTROYED => {
throw new UnitAlreadyDestroyedException
}
case UnitStatus.DESTROYING =>
case UnitStatus.PREPARED_TO_DESTROY => {
destroying = true
status = UnitStatus.DESTROYING
async {
waitUntil(destroyDuration, UnitStatus.DESTROYED) {
f
}
}
}
}
def create(f: Int => Unit): Unit = status match {
case UnitStatus.DESTROYED => throw new UnitAlreadyDestroyedException
case UnitStatus.NEED_TO_CREATE => {
ResourcesContainer -= creationCost
status = UnitStatus.CONSTRUCTION
async {
waitUntil(creationDuration, UnitStatus.READY) {
f
}
}
}
case _ => throw new UnitAlreadyCreatedException
}
def waitUntil(seconds: Int, finalState: UnitStatus)(f: Int => Unit): Unit = {
var timeElapse = 0
var percent: Int = 0
breakable {
while (timeElapse < seconds) {
val destroyState = isInDestroyState
if (destroying && !destroyState) {
break() // **program does not enter to this part of code**
}
timeElapse += 1
percent = ((timeElapse.asInstanceOf[Float] / seconds) * 100).toInt
f(percent)
Thread.sleep(1000)
}
if (status != null) {
status = finalState
}
}
}
def async[T](fn: => Unit): Unit = scala.actors.Actor.actor {
fn
}
def isInDestroyState: Boolean = {
status == UnitStatus.DESTROYED ||
status == UnitStatus.DESTROYING ||
status == UnitStatus.PREPARED_TO_DESTROY
}
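Reading the posted code, destroy appears to set destroying = true and status = DESTROYING in the same branch, so the loop condition `destroying && !destroyState` would not hold once both writes are visible, which may be why break() is never reached. One possible alternative is to give each wait its own cancellation flag that depends on nothing else. The sketch below assumes that this is the intent; `CancellableWait` and its parameters are hypothetical names, not part of the original code.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// A wait loop that reports progress and can be stopped from another thread
final class CancellableWait(steps: Int, stepMillis: Long) {
  private val cancelled = new AtomicBoolean(false)

  // Safe to call from any thread; the loop notices it before its next step
  def cancel(): Unit = cancelled.set(true)

  // Returns true if every step ran, false if the wait was cancelled early
  def run(onProgress: Int => Unit): Boolean = {
    var done = 0
    while (done < steps && !cancelled.get()) {
      Thread.sleep(stepMillis)
      done += 1
      onProgress(done * 100 / steps)
    }
    done == steps
  }
}
```

In this shape, create would keep a reference to its `CancellableWait` and destroy would call `cancel()` on it before starting its own wait, and the final status would be set only when `run` returns true.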

Converting Scala @suspendable Method into a Future

Suppose I have a sleep function:
def sleep(delay:Int) : Unit @suspendable = {
....
}
Is it possible to have a function future that creates an async version of the sleep function that can be awaited on synchronously?
def future(targetFunc: (Int => Unit @suspendable)) : (Int => Future) = {
....
}
class Future {
def await : Unit @suspendable = {
....
}
}
you should be able to do something like this:
reset {
val sleepAsync = future(sleep)
val future1 = sleepAsync(2000)
val future2 = sleepAsync(3000)
future1.await
future2.await
/* finishes after a delay of 3000 */
}
The two calls to sleepAsync should appear to return straight away, and the two calls to Future#await should appear to block. Of course, they all really fall off the end of reset, and the code after is responsible for calling the continuation after the delay.
Otherwise, is there an alternative method for running two @suspendable functions in parallel and waiting for both of them to complete?
I have a compilable gist with a skeleton of what i want to do: https://gist.github.com/1191381
object Forks {
import scala.util.continuations._
case class Forker(forks: Vector[() => Unit @suspendable]) {
def ~(block: => Unit @suspendable): Forker = Forker(forks :+ (() => block))
def joinIf(pred: Int => Boolean): Unit @suspendable = shift { k: (Unit => Unit) =>
val counter = new java.util.concurrent.atomic.AtomicInteger(forks.size)
forks foreach { f =>
reset {
f()
if (pred(counter.decrementAndGet)) k()
}
}
}
def joinAll() = joinIf(_ == 0)
def joinAny() = joinIf(_ == forks.size - 1)
}
def fork(block: => Unit @suspendable): Forker = Forker(Vector(() => block))
}
Using fork(), we can now wait on many "suspendables". Use ~() to chain suspendables together, joinAll() to wait for all of them, and joinAny() to wait for just one. Use joinIf() to customize the join strategy.
object Tests extends App {
import java.util.{Timer, TimerTask}
import scala.util.continuations._
implicit val timer = new Timer
def sleep(ms: Int)(implicit timer: Timer): Unit @suspendable = {
shift { k: (Unit => Unit) =>
timer.schedule(new TimerTask {
def run = k()
}, ms)
}
}
import Forks._
reset {
fork {
println("sleeping for 2000 ms")
sleep(2000)
println("slept for 2000 ms")
} ~ {
println("sleeping for 4000 ms")
sleep(4000)
println("slept for 4000 ms")
} joinAll()
println("and we are done")
}
println("outside reset")
readLine
timer.cancel
}
and this is the output. program starts at time T:
sleeping for 2000 ms
sleeping for 4000 ms
outside reset <<<<<< T + 0 second
slept for 2000 ms <<<<<< T + 2 seconds
slept for 4000 ms <<<<<< T + 4 seconds
and we are done <<<<<< T + 4 seconds
I'm not sure that I completely understand the question, but here's a try:
import scala.util.continuations._
class Future(thread: Thread) {
def await = thread.join
}
object Future {
def sleep(delay: Long) = Thread.sleep(delay)
def future[A,B](f: A => B) = (a: A) => shift { k: (Future => Unit) =>
val thread = new Thread { override def run() { f(a) } }
thread.start()
k(new Future(thread))
}
def main(args:Array[String]) = reset {
val sleepAsync = future(sleep)
val future1 = sleepAsync(2000) // returns right away
val future2 = sleepAsync(3000) // returns right away
future1.await // returns after two seconds
future2.await // returns after an additional one second
// finished after a total delay of three seconds
}
}
Here, a Future instance is nothing more than a handle on a Thread, so you can use its join method to block until it finishes.
The future function takes a function of type A => B, and returns a function which, when supplied with an A will kick off a thread to run the "futured" function, and wrap it up in a Future, which is injected back into the continuation, thereby assigning it to val future1.
Is this anywhere close to what you were going for?
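For comparison, the "start both, then wait for both" shape asked about in this thread can also be sketched with standard-library futures. This is a different technique from the continuation-based code above, not a translation of it; `ParallelSleep`, `sleepAsync` and `both` are hypothetical names.

```scala
import scala.concurrent.{ExecutionContext, Future, blocking}

object ParallelSleep {
  // Starts the sleep on the execution context and returns immediately
  def sleepAsync(ms: Long)(implicit ec: ExecutionContext): Future[Long] =
    Future { blocking(Thread.sleep(ms)); ms }

  // Both sleeps are started before either is waited on, so they overlap
  def both(first: Long, second: Long)(implicit ec: ExecutionContext): Future[(Long, Long)] = {
    val a = sleepAsync(first)
    val b = sleepAsync(second)
    a.zip(b) // completes once both futures have completed
  }
}
```

Blocking on the combined future, for example with `Await.result(ParallelSleep.both(2000, 3000), 5.seconds)`, should take roughly as long as the longer sleep, provided the execution context can run the two sleeps concurrently.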