Offer to queue with some initial display - scala

I want to offer to queue a string sent in load request after some initial delay say 10 seconds.
If the subsequent request is made with some short interval delay(1 second) then everything works fine, but if it is made continuously like from a script then there is no delay.
Here is the sample code.
def load(randomStr :String) = Action { implicit request =>
Source.single(randomStr)
.delay(10 seconds, DelayOverflowStrategy.backpressure)
.map(x =>{
println(x)
queue.offer(x)
})
.runWith(Sink.ignore)
Ok("")
}

I am not entirely sure that this is the correct way of doing what you want. There are some things you need to reconsider:
a delayed source has an initial buffer capacity of 16 elements. You can increase this with addAttributes(initialBuffer)
In your case the buffer cannot actually become full because every time you provide just one element.
Who is the caller of the Action? You are defining a DelayOverflowStrategy.backpressure strategy but is the caller able to handle this?
On every call of the action you are creating a Stream consisting of one element, how is the backpressure here helping? It is applied on the stream processing and not on the offering to the queue

Related

Akka streams - shutdown stream with grouping without losing data

I have a source that groups elements and a sink that makes a batch request,
I'm using KillSwitch to be able to shutdown the graph at some arbitrary point in time. The problem that records of the latest incomplete batch that source outputs are getting lost when switch.shutdown() is being called
val source = Source.tick(10.millis, 10.millis, "tick").grouped(500)
val (switch, _) = source.viaMat(KillSwitches.single)(Keep.right)
.toMat(sink)(Keep.both).run()
Thread.sleep(3000) // wait some arbitrary time
switch.shutdown()
Is there a way to 'flush out' the incomplete batch when shutdown happens?
The behaviour of the kill switch shutdown is positional, as per its docs
After calling [[UniqueKillSwitch#shutdown()]] the running instance of
the [[Graph]] of [[FlowShape]] that materialized to the
[[UniqueKillSwitch]] will complete its downstream and cancel its
upstream (unless if finished or failed already in which case the
command is ignored).
See also more docs here.
Now the grouped stage will emit a partially filled group only at completion time, but not when cancelled.
This means that the graph below (grouped before killswitch) will behave like you observed
val switch =
Source.tick(10.millis, 175.millis, "tick")
.grouped(10)
.viaMat(KillSwitches.single)(Keep.right)
.toMat(Sink.foreach(println))(Keep.left)
.run()
whilst the graph below (grouped after killswitch) will emit partial groups downstream at completion
val switch =
Source.tick(10.millis, 175.millis, "tick")
.viaMat(KillSwitches.single)(Keep.right)
.grouped(10)
.toMat(Sink.foreach(println))(Keep.left)
.run()

Proper way to stop Akka Streams on condition

I have been successfully using FileIO to stream the contents of a file, compute some transformations for each line and aggregate/reduce the results.
Now I have a pretty specific use case, where I would like to stop the stream when a condition is reached, so that it is not necessary to read the whole file but the process finishes as soon as possible. What is the recommended way to achieve this?
If the stop condition is "on the outside of the stream"
There is a advanced building-block called KillSwitch that you could use to do this: http://doc.akka.io/japi/akka/2.4.7/akka/stream/KillSwitches.html The stream would get shut down once the kill switch is notified.
It has methods like abort(reason) / shutdown etc, see here for it's API: http://doc.akka.io/japi/akka/2.4.7/akka/stream/SharedKillSwitch.html
Reference documentation is here: http://doc.akka.io/docs/akka/2.4.8/scala/stream/stream-dynamic.html#kill-switch-scala
Example usage would be:
val countingSrc = Source(Stream.from(1)).delay(1.second,
DelayOverflowStrategy.backpressure)
val lastSnk = Sink.last[Int]
val (killSwitch, last) = countingSrc
.viaMat(KillSwitches.single)(Keep.right)
.toMat(lastSnk)(Keep.both)
.run()
doSomethingElse()
killSwitch.shutdown()
Await.result(last, 1.second) shouldBe 2
If the stop condition is inside the stream
You can use takeWhile to express any condition really, though sometimes take or limit may be also enough "take 10 lnes".
If your logic is very advanced, you could build a special stage that handles that special logic using statefulMapConcat that allows to express literally anything - so you could complete the stream whenever you want to "from the inside".

Rx Extensions - Proper way to use delay to avoid unnecessary observables from executing?

I'm trying to use delay and amb to execute a sequence of the same task separated by time.
All I want is for a download attempt to execute some time in the future only if the same task failed before in the past. Here's how I have things set up, but unlike what I'd expect, all three downloads seem to execute without delay.
Observable.amb([
Observable.catch(redditPageStream, Observable.empty()).delay(0 * 1000),
Observable.catch(redditPageStream, Observable.empty()).delay(30 * 1000),
Observable.catch(redditPageStream, Observable.empty()).delay(90 * 1000),
# Observable.throw(new Error('Failed to retrieve reddit page content')).delay(10000)
# Observable.create(
# (observer) ->
# throw new Error('Failed to retrieve reddit page content')
# )
]).defaultIfEmpty(Observable.throw(new Error('Failed to retrieve reddit page content')))
full code can be found here. src
I was hoping that the first successful observable would cancel out the ones still in delay.
Thanks for any help.
delay doesn't actually stop the execution of what ever you are doing it just delays when the events are propagated. If you want to delay execution you would need to do something like:
redditPageStream.delaySubscription(1000)
Since your source is producing immediately the above will delay the actual subscription to the underlying stream to effectively delay when it begins producing.
I would suggest though that you use one of the retry operators to handle your retry logic though rather than rolling your own through the amb operator.
redditPageStream.delaySubscription(1000).retry(3);
will give you a constant retry delay however if you want to implement the linear backoff approach you can use the retryWhen() operator instead which will let you apply whatever logic you want to the backoff.
redditPageStream.retryWhen(errors => {
return errors
//Only take 3 errors
.take(3)
//Use timer to implement a linear back off and flatten it
.flatMap((e, i) => Rx.Observable.timer(i * 30 * 1000));
});
Essentially retryWhen will create an Observable of errors, each event that makes it through is treated as a retry attempt. If you error or complete the stream then it will stop retrying.

Scalaz-stream chunking UP to N

Given a queue like so:
val queue: Queue[Int] = async.boundedQueue[Int](1000)
I want to pull off this queue and stream it into a downstream Sink, in chunks of UP to 100.
queue.dequeue.chunk(100).to(downstreamConsumer)
works sort of, but it will not empty the queue if I have say 101 messages. There will be 1 message left over, unless another 99 are pushed in. I want to take as many as I can off the queue up to 100, as fast as my downstream process can handle.
Is there an existing combinator available?
For this, you may need to monitor size of the queue when dequeueing from it. then, if the size reaches 0 you simply won't wait to any more elements. In fact you may implement elastic sizing of the batching based on the size of queue. I.e. :
val q = async.unboundedQueue[String]
val deq:Process[Task,(String,Int)] = q.dequeue zip q.size
val elasticChunk: Process1[(String,Int), Vector[String]] = ???
val downstreamConsumer : Sink[Task,Vector[String]] = ???
deq.pipe(elasticChunk) to downstreamConsumer
I actually solved this a different way then I had intended.
The scalaz-stream queue now contains a dequeueBatch method that allows dequeuing all values in the queue, up to N, or blocks.
https://github.com/scalaz/scalaz-stream/issues/338

Control flow of messages in Akka actor

I have an actor using Akka which performs an action that takes some time to complete, because it has to download a file from the network.
def receive = {
case songId: String => {
Future {
val futureFile = downloadFile(songId)
for (file <- futureFile) {
val fileName = doSomenthingWith(file)
otherActor ! fileName
}
}
}
}
I would like to control the flow of messages to this actor. If I try to download too many files simultaneously, I have a network bottleneck. The problem is that I am using a Future inside the actor receive, so, the methods exits and the actor is ready to process a new message. If I remove the Future, I will download only one file per time.
What is the best way to limit the number of messages being processed per unit of time? Is there a better way to design this code?
There is a contrib project for Akka that provides a throttle implementation (http://letitcrash.com/post/28901663062/throttling-messages-in-akka-2). If you sit this in front of the actual download actor then you can effectively throttle the rate of messages going into that actor. It's not 100% perfect in that if the download times are taking longer than expected you could still end up with more downloads then might be desired, but it's a pretty simple implementation and we use it quite a bit to great effect.
Another option could be to use a pool of download actors and remove the future and allow the actors to perform this blocking so that they are truly handling only one message at a time. Because you are going to let them block, I would suggest giving them their own Dispatcher (ExecutionContext) so that this blocking does not negatively effect the main Akka Dispatcher. If you do this, then the pool size itself represents your max allowed number of simultaneous downloads.
Both of these solutions are pretty much "out-of-the-box" solutions that don't require much custom logic to support your use case.
Edit
I also thought it would be good to mention the Work Pulling Pattern as well. With this approach you could still use a pool and then a single work distributer in front. Each worker (download actor) could perform the download (still using a Future) and only request new work (pull) from the work distributer when that Future has fully completed meaning the download is done.
If you have an upper bound on the amount of simultanious downloads you want to happen you can 'ack' back to the actor saying that a download completed and to free up a spot to download another file:
case object AckFileRequest
class ActorExample(otherActor:ActorRef, maxFileRequests:Int = 1) extends Actor {
var fileRequests = 0
def receive = {
case songId: String if fileRequests < maxFileRequests =>
fileRequests += 1
val thisActor = self
Future {
val futureFile = downloadFile(songId)
//not sure if you're returning the downloaded file or a future here,
//but you can move this to wherever the downloaded file is and ack
thisActor ! AckFileRequest
for (file <- futureFile) {
val fileName = doSomenthingWith(file)
otherActor ! fileName
}
}
case songId: String =>
//Do some throttling here
val thisActor = self
context.system.scheduler.scheduleOnce(1 second, thisActor, songId)
case AckFileRequest => fileRequests -= 1
}
}
In this example, if there are too many file requests then we put this songId request on hold and queue it back up for processing 1 second later. You can obviously change this however you see fit, maybe you can just send the message straight back to the actor in a tight loop or do some other throttling, depends on your use case.
There is a contrib implementation of message Throttling, as described here.
The code is very simple:
// A simple actor that prints whatever it receives
class Printer extends Actor {
def receive = {
case x => println(x)
}
}
val printer = system.actorOf(Props[Printer], "printer")
// The throttler for this example, setting the rate
val throttler = system.actorOf(Props(classOf[TimerBasedThrottler], 3 msgsPer 1.second))
// Set the target
throttler ! SetTarget(Some(printer))
// These three messages will be sent to the printer immediately
throttler ! "1"
throttler ! "2"
throttler ! "3"
// These two will wait at least until 1 second has passed
throttler ! "4"
throttler ! "5"