Queue nowait in Python with single producer and consumer

I am running the code below as an example in Python 3.8.10. I was curious to see what happens if we don't lock the queue when there is one writer and one reader. I would have expected some corruption with multithreading when a large amount of data is read, but that doesn't seem to be the case: the result looks correct with or without the sleep in the code. Does anyone have an idea why? It shouldn't be thread-safe, even with just one writer and one reader...
from threading import Thread
from queue import Queue
import time

v1 = "hello" * 10000
v2 = "good" * 10000
v3 = "world" * 10000

def producer(q: Queue):
    while True:
        q.put(v1, block=False)
        q.put(v2, block=False)
        q.put(v3, block=False)
        time.sleep(0.3)

def consumer(q: Queue):
    while True:
        try:
            val = q.get_nowait()
            print(val)
            print("\n")
        except Exception:
            pass

if __name__ == "__main__":
    q = Queue()
    p = Thread(target=producer, args=(q,), daemon=True)
    c = Thread(target=consumer, args=(q,), daemon=True)
    c.start()
    p.start()
    p.join()

queue.Queue is thread-safe by design: it does its own internal locking, so you never see corruption even with a concurrent producer and consumer. The block parameter has nothing to do with thread safety; it only controls whether put/get wait for space or items to become available (block=True) or raise queue.Full/queue.Empty immediately (block=False). Blocking mode simply avoids the busy-waiting loop you see in the consumer.

Related

Await for a Sequence of Futures with timeout without failing on TimeoutException

I have a sequence of Scala Futures of the same type.
After some limited time, I want to get a result for the entire sequence: some futures may have succeeded, some may have failed, and some may not have completed yet; the ones that haven't completed should be considered failed.
I don't want to use Await on each future sequentially.
I looked at this question: Scala waiting for sequence of futures
and tried to use the solution from there, namely:
private def lift[T](futures: Seq[Future[T]])(implicit ex: ExecutionContext) =
  futures.map(_.map { Success(_) }.recover { case t => Failure(t) })

def waitAll[T](futures: Seq[Future[T]])(implicit ex: ExecutionContext) =
  Future.sequence(lift(futures))

val futures: Seq[Future[MyObject]] = ...
val segments = Await.result(waitAll(futures), waitTimeoutMillis millis)
but I'm still getting a TimeoutException, I guess because some of the futures haven't completed yet.
and that answer also states,
Now Future.sequence(lifted) will be completed when every future is completed, and will represent successes and failures using Try.
But I want my Future to be completed after the timeout has passed, not when every future in the sequence has completed. What else can I do?
If I used raw Future (rather than some IO monad which has this functionality built-in, or some Akka utils for exactly that), I would hack together a utility like:
// make each separate future time out
object FutureTimeout {

  // separate EC for waiting
  private val timeoutEC: ExecutionContext = ...

  private def timeout[T](delay: Long): Future[T] = Future {
    blocking {
      Thread.sleep(delay)
    }
    throw new Exception("Timeout")
  }(timeoutEC)

  def apply[T](fut: Future[T], delay: Long)(
      implicit ec: ExecutionContext
  ): Future[T] = Future.firstCompletedOf(Seq(
    fut,
    timeout(delay)
  ))
}
and then:
Future.sequence(
  futures
    .map(FutureTimeout(_, delay))
    .map(_.map(Success(_)).recover { case e => Failure(e) })
)
Since each future would terminate at most after delay we would be able to collect them into one result right after that.
You have to remember though that no matter how you trigger a timeout, you have no guarantee that the timed-out Future stops executing. It could keep running on some thread somewhere; it's just that you would no longer be waiting for its result. firstCompletedOf just makes this race explicit.
Some other utilities (e.g. Cats Effect IO) allow you to cancel computations (which is used in races like this one), but you still have to remember that the JVM cannot arbitrarily "kill" a running thread, so cancellation can only happen after one stage of the computation has completed and before the next one starts (e.g. between .maps or .flatMaps).
If you aren't afraid of adding external deps, there are other (and more reliable, since Thread.sleep is just a temporary ugly hack) ways of timing out a Future, like the Akka utils. See also other questions like this.
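For illustration, here is a minimal non-blocking sketch along those lines using akka.pattern.after (our own helper, assuming an ActorSystem named system is in scope):

import akka.actor.ActorSystem
import scala.concurrent.{ExecutionContext, Future, TimeoutException}
import scala.concurrent.duration.FiniteDuration

// Completes with the original future's result, or fails with a
// TimeoutException once `delay` has passed, whichever happens first.
def withTimeout[T](fut: Future[T], delay: FiniteDuration)(
    implicit system: ActorSystem, ec: ExecutionContext
): Future[T] =
  Future.firstCompletedOf(Seq(
    fut,
    akka.pattern.after(delay, using = system.scheduler)(
      Future.failed(new TimeoutException(s"timed out after $delay"))
    )
  ))

Unlike the Thread.sleep hack, this schedules a timer tick instead of parking a thread for the whole delay.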
Here is a solution using Monix:
import monix.eval.Task
import monix.execution.Scheduler
import scala.concurrent.duration._
import scala.util.{Failure, Success}

// It's safe to use a single thread here because the timeout tasks are very fast.
val timeoutScheduler = Scheduler.singleThread("timeout")

def sequenceDiscardTimeouts[T](tasks: Task[T]*): Task[Seq[T]] = {
  Task
    .parSequence(
      tasks
        .map(t =>
          t.map(Success.apply) // Map to Success so we can collect the value
            .timeout(500.millis)
            .executeOn(timeoutScheduler) // Run timeouts on a dedicated scheduler that won't be blocked by blocking/IO work, if you have any
            .onErrorRecoverWith { ex =>
              println("timed-out")
              Task.pure(Failure(ex)) // Assumes any error is a timeout; you could match only the timeout exception here
            }
        )
    )
    .map { res =>
      res.collect { case Success(r) => r }
    }
}
Testing code:
import scala.util.Random

implicit val mainScheduler = Scheduler.fixedPool(name = "main", poolSize = 10)

def slowTask(msg: String) = {
  Task.sleep(Random.nextLong(1000).millis) // Sleep here to emulate a slow task
    .map(_ => msg)
}

val app = sequenceDiscardTimeouts(
  slowTask("1"),
  slowTask("2"),
  slowTask("3"),
  slowTask("4"),
  slowTask("5"),
  slowTask("6")
)

val started: Long = System.currentTimeMillis()
app.runSyncUnsafe().foreach(println)
println(s"Done in ${System.currentTimeMillis() - started} millis")
The output differs from run to run, but it should look something like the following:
timed-out
timed-out
timed-out
3
4
5
Done in 564 millis
Please note the usage of two separate schedulers. This is to ensure that timeouts fire even if the main scheduler is busy with business logic. You can test this by reducing poolSize for the main scheduler.

How to read only Successful values from a Seq of Futures

I am learning Akka/Scala and am trying to read only those Futures that succeeded from a Seq[Future[Int]], but can't get anything to work.
I simulated an array of 10 Future[Int], some of which fail depending on the value FailThreshold takes (all fail for 10 and none fail for 0).
I then try to read them into an ArrayBuffer (I could not find a way to return an immutable structure with the values).
Also, there isn't a filter on Success/Failure, so I had to run onComplete on each future and update the buffer as a side effect.
Even when FailThreshold=0 and the Seq has every Future set to Success, the array buffer is sometimes empty, and different runs return arrays of different sizes.
I tried a few other suggestions from the web, like using Future.sequence on the list, but this throws an exception if any of the futures fail.
import akka.actor._
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._
import scala.util.{Failure, Success}
import concurrent.ExecutionContext.Implicits.global

case object AskNameMessage

implicit val timeout = Timeout(5, SECONDS)
val FailThreshold = 0

class HeyActor(num: Int) extends Actor {
  def receive = {
    case AskNameMessage => if (num < FailThreshold) { Thread.sleep(1000); sender ! num } else sender ! num
  }
}

class FLPActor extends Actor {
  def receive = {
    case t: IndexedSeq[Future[Int]] => {
      println(t)
      val b = scala.collection.mutable.ArrayBuffer.empty[Int]
      t.foldLeft(b) { case (bf, ft) =>
        ft.onComplete { case Success(v) => bf += ft.value.get.get }
        bf
      }
      println(b)
    }
  }
}

val system = ActorSystem("AskTest")
val flm = (0 to 10).map( n => system.actorOf(Props(new HeyActor(n)), name = "futureListMake" + n) )
val flp = system.actorOf(Props(new FLPActor), name = "futureListProcessor")

// val delay = akka.pattern.after(500 millis, using = system.scheduler)(Future.failed( new IllegalArgumentException("DONE!") ))
val delay = akka.pattern.after(500 millis, using = system.scheduler)(Future.successful(0))

val seqOfFtrs = (0 to 10).map( n => Future.firstCompletedOf( Seq(delay, flm(n) ? AskNameMessage) ).mapTo[Int] )

flp ! seqOfFtrs
The receive in FLPActor mostly gets
Vector(Future(Success(0)), Future(Success(1)), Future(Success(2)), Future(Success(3)), Future(Success(4)), Future(Success(5)), Future(Success(6)), Future(Success(7)), Future(Success(8)), Future(Success(9)), Future(Success(10)))
but the array buffer b has a varying number of values and is empty at times.
Can someone please point out the gaps here:
why would the array buffer have varying sizes even when all the Futures have resolved to Success, and
what is the correct pattern to use when we want to ask different actors with a timeout and use only the asks that returned successfully for further processing?
Instead of directly sending the IndexedSeq[Future[Int]], you should transform it to a Future[IndexedSeq[Int]] and then pipe it to the next actor. You don't send Futures directly to an actor; you pipe them.
HeyActor can stay unchanged.
After
val seqOfFtrs = (0 to 10).map( (n) => Future.firstCompletedOf( Seq(delay, flm(n) ? AskNameMessage) ).mapTo[Int] )
do a recover, and use Future.sequence to turn it into one Future:
val oneFut = Future.sequence(seqOfFtrs.map(f=>f.map(Some(_)).recover{ case (ex: Throwable) => None})).map(_.flatten)
If you don't understand the business with Some, None, and flatten, then make sure you understand the Option type. One way to remove values from a sequence is to map values in the sequence to Option (either Some or None) and then to flatten the sequence. The None values are removed and the Some values are unwrapped.
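As a tiny standalone illustration of that idiom (our example, not from the answer):

val xs: Seq[Option[Int]] = Seq(Some(1), None, Some(3))
xs.flatten // Seq(1, 3): the None is removed and the Some values are unwrapped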
After you have transformed your data into a single Future, pipe it over to FLPActor:
oneFut pipeTo flp
FLPActor should be rewritten with the following receive function:
def receive = {
case printme: IndexedSeq[Int] => println(printme)
}
In Akka, modifying some state in the main thread of your actor from a Future or the onComplete of a Future is a big no-no. In the worst case, it results in race conditions. Remember that each Future runs on its own thread, so running a Future inside an actor means you have concurrent work being done in different threads. Having the Future directly modify some state in your actor while the actor is also processing some state is a recipe for disaster. In Akka, you process all changes to state directly in the primary thread of execution of the main actor. If you have some work done in a Future and need to access that work from the main thread of an actor, you pipe it to that actor. The pipeTo pattern is functional, correct, and safe for accessing the finished computation of a Future.
To answer your question about why FLPActor is not printing out the IndexedSeq correctly: you are printing out the ArrayBuffer before your Futures have been completed. onComplete isn't the right idiom to use in this case, and you should avoid it in general as it isn't good functional style.
Don't forget the import akka.pattern.pipe for the pipeTo syntax.
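Putting the pieces of this answer together, the reworked flow might look like this (a sketch; it assumes the system, flm, flp, and delay definitions from the question are still in scope):

import akka.pattern.pipe
import scala.concurrent.ExecutionContext.Implicits.global

val seqOfFtrs = (0 to 10).map( n =>
  Future.firstCompletedOf( Seq(delay, flm(n) ? AskNameMessage) ).mapTo[Int] )

// One Future holding only the successful results
val oneFut = Future.sequence(
  seqOfFtrs.map(f => f.map(Some(_)).recover { case _: Throwable => None })
).map(_.flatten)

// FLPActor receives a plain IndexedSeq[Int] once everything has resolved
oneFut pipeTo flp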

What's the best way to wrap a monix Task with a start time and end time print the difference?

This is what I'm trying right now, but it only prints "hey" and not the metrics.
I don't want to add metric-related stuff to the main function.
import java.util.Date
import monix.eval.Task
import monix.execution.Scheduler.Implicits.global
import scala.concurrent.Await
import scala.concurrent.duration.Duration

class A {
  def fellow(): Task[Unit] = {
    val result = Task {
      println("hey")
      Thread.sleep(1000)
    }
    result
  }
}

trait AA extends A {
  override def fellow(): Task[Unit] = {
    println("AA")
    val result = super.fellow()
    val start = new Date()
    result.foreach { e =>
      println("AA", new Date().getTime - start.getTime)
    }
    result
  }
}

val a = new A with AA
val res: Task[Unit] = a.fellow()
Await.result(res.runAsync, Duration.Inf)
You can describe a function such as this:
import java.util.concurrent.TimeUnit

def measure[A](task: Task[A], logMillis: Long => Task[Unit]): Task[A] =
  Task.deferAction { sc =>
    val start = sc.clockMonotonic(TimeUnit.MILLISECONDS)
    val stopTimer = Task.suspend {
      val end = sc.clockMonotonic(TimeUnit.MILLISECONDS)
      logMillis(end - start)
    }
    // redeemWith takes the error handler first, then the success handler
    task.redeemWith(
      e => stopTimer.flatMap(_ => Task.raiseError(e)),
      a => stopTimer.map(_ => a)
    )
  }
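A quick usage sketch of our own: wrap fellow() and log via a println suspended in Task, so the measurement itself stays pure:

val timedFellow: Task[Unit] =
  measure(new A().fellow(), millis => Task(println(s"took $millis ms")))

// Nothing runs until the task is executed, e.g. via timedFellow.runSyncUnsafe()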
Some pieces of advice:
- Task values should be pure, along with the functions returning Tasks; functions that trigger side effects and return Task as results are broken.
- Task is not a 1:1 replacement for Future; when describing a Task, all side effects should be suspended (wrapped) in Task.
- foreach triggers the Task's evaluation, and that's not good, because it triggers the side effects; I was thinking of deprecating and removing it, since its presence is tempting.
- Stop using trait inheritance and just use plain functions; unless you deeply understand OOP and subtyping, it's best to avoid it if possible. And if you're into the Cake pattern, stop doing it and maybe join a support group 🙂
- Never measure time durations via new Date(). You need a monotonic clock for that, and on top of the JVM that's System.nanoTime, which can be accessed via Monix's Scheduler by clockMonotonic, as exemplified above, the Scheduler being given to you via deferAction.
- Stop blocking threads, because that's error prone: instead of Thread.sleep, do Task.sleep, and all Await.result calls are problematic unless they are in main or in some other place where dealing with asynchrony is not possible.
Hope this helps.
Cheers,
As #Pierre mentioned, the latest version of Monix Task has Task.timed, so you can write:
for {
  timed <- task.timed
  (duration, t) = timed
} yield ...
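For completeness, a standalone sketch of the same idea (assuming a Monix version that ships Task.timed):

val timedRun: Task[Unit] =
  new A().fellow().timed.flatMap { case (duration, _) =>
    Task(println(s"took ${duration.toMillis} ms"))
  }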

Akka Stream return object from Sink

I've got a SourceQueue. When I offer an element to it, I want it to pass through the stream, and when it reaches the Sink I want the output returned to the code that offered the element (similar to how Sink.head returns an element to the RunnableGraph.run() call).
How do I achieve this? A simple example of my problem would be:
val source = Source.queue[String](100, OverflowStrategy.fail)
val flow = Flow[String].map(element => s"Modified $element")
val sink = Sink.ReturnTheStringSomehow
val graph = source.via(flow).to(sink).run()
val x = graph.offer("foo")
println(x) // Output should be "Modified foo"
val y = graph.offer("bar")
println(y) // Output should be "Modified bar"
val z = graph.offer("baz")
println(z) // Output should be "Modified baz"
Edit: For the example I have given in this question, Vladimir Matveev provided the best answer. However, it should be noted that this solution only works if the elements go into the sink in the same order they were offered to the source. If this cannot be guaranteed, the order of the elements in the sink may differ and the outcome might be different from what is expected.
I believe it is simpler to use the already existing primitive for pulling values from a stream, called Sink.queue. Here is an example:
val source = Source.queue[String](128, OverflowStrategy.fail)
val flow = Flow[String].map(element => s"Modified $element")
val sink = Sink.queue[String]().withAttributes(Attributes.inputBuffer(1, 1))
val (sourceQueue, sinkQueue) = source.via(flow).toMat(sink)(Keep.both).run()
def getNext: String = Await.result(sinkQueue.pull(), 1.second).get
sourceQueue.offer("foo")
println(getNext)
sourceQueue.offer("bar")
println(getNext)
sourceQueue.offer("baz")
println(getNext)
It does exactly what you want.
Note that setting the inputBuffer attribute for the queue sink may or may not be important for your use case - if you don't set it, the buffer will be zero-sized and the data won't flow through the stream until you invoke the pull() method on the sink.
sinkQueue.pull() yields a Future[Option[T]], which will be completed successfully with Some if the sink receives an element or with a failure if the stream fails. If the stream completes normally, it will be completed with None. In this particular example I'm ignoring this by using Option.get but you would probably want to add custom logic to handle this case.
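For instance, one way to handle all three outcomes without Option.get (our sketch, not part of the answer):

import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Failure, Success}

sinkQueue.pull().onComplete {
  case Success(Some(element)) => println(element)          // got an element
  case Success(None)          => println("stream completed")
  case Failure(ex)            => println(s"stream failed: $ex")
}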
Well, you know what the offer() method returns if you take a look at its definition :) What you can do is create a Source.queue[(Promise[String], String)], write a helper function that pushes a pair into the stream via offer (making sure offer doesn't fail because the queue might be full), then complete the promise inside your stream and use the promise's future to catch the completion event in external code.
I do that to throttle the rate to an external API used from multiple places in my project.
Here is how it looked in my project before Typesafe added Hub sources to Akka:
import scala.concurrent.{Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import java.util.concurrent.ConcurrentLinkedDeque
import akka.stream.scaladsl.{Keep, Sink, Source}
import akka.stream.{OverflowStrategy, QueueOfferResult}
import scala.util.Success

private val queue = Source.queue[(Promise[String], String)](100, OverflowStrategy.backpressure)
  .toMat(Sink.foreach({ case (p, param) =>
    p.complete(Success(param.reverse))
  }))(Keep.left)
  .run

private val futureDeque = new ConcurrentLinkedDeque[Future[String]]()

private def sendQueuedRequest(request: String): Future[String] = {
  val p = Promise[String]
  val offerFuture = queue.offer(p -> request)

  def addToQueue(future: Future[String]): Future[String] = {
    futureDeque.addLast(future)
    future.onComplete(_ => futureDeque.remove(future))
    future
  }

  offerFuture.flatMap {
    case QueueOfferResult.Enqueued =>
      addToQueue(p.future)
  }.recoverWith {
    case ex =>
      val first = futureDeque.pollFirst()
      if (first != null)
        addToQueue(first.flatMap(_ => sendQueuedRequest(request)))
      else
        sendQueuedRequest(request)
  }
}
I realize that a blocking synchronized queue may be a bottleneck and may grow indefinitely, but because the API calls in my project are made only from other Akka streams, which are backpressured, I never have more than a dozen items in futureDeque. Your situation may differ.
If you create a MergeHub.source[(Promise[String], String)]() instead, you'll get a reusable sink. Then, every time you need to process an item, you create a complete graph and run it. In that case you won't need the hacky Java container to queue requests.
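A rough sketch of that MergeHub variant (our reading of the suggestion; assumes an implicit Materializer is in scope, and perProducerBufferSize is an arbitrary choice):

import akka.NotUsed
import akka.stream.scaladsl.{MergeHub, Sink, Source}
import scala.concurrent.{Future, Promise}
import scala.util.Success

// Materialize the hub once; the materialized value is a Sink that
// can be attached to any number of times.
val reusableSink: Sink[(Promise[String], String), NotUsed] =
  MergeHub.source[(Promise[String], String)](perProducerBufferSize = 16)
    .to(Sink.foreach { case (p, param) => p.complete(Success(param.reverse)) })
    .run()

def process(request: String): Future[String] = {
  val p = Promise[String]
  // Each call runs a tiny graph that feeds one element into the shared hub.
  Source.single(p -> request).runWith(reusableSink)
  p.future
}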

Scala future sequence and timeout handling

There are some good hints on how to combine futures with timeouts.
However, I'm curious how to do this with Future.sequence over a sequence of futures.
My first approach looks like this:
import scala.concurrent._
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits._

object FutureSequenceScala extends App {
  println("Creating futureList")

  val timeout = 2 seconds
  val futures = List(1000, 1500, 1200, 800, 2000) map { ms =>
    val f = future {
      Thread sleep ms
      ms toString
    }
    Future firstCompletedOf Seq(f, fallback(timeout))
  }

  println("Creating waitinglist")
  val waitingList = Future sequence futures
  println("Created")

  val results = Await result (waitingList, timeout * futures.size)
  println(results)

  def fallback(timeout: Duration) = future {
    Thread sleep (timeout toMillis)
    "-1"
  }
}
Is there a better way to handle timeouts in a sequence of futures or is this a valid solution?
There are a few things in your code that you might want to reconsider. For starters, I'm not a huge fan of submitting tasks into the ExecutionContext whose sole purpose is to simulate a timeout, and which use Thread.sleep to do so. The sleep call is blocking, and you probably want to avoid having a task in the execution context that blocks purely for the sake of waiting a fixed amount of time. I'm going to steal from my answer here and suggest that for pure timeout handling you should use something like I outlined in that answer. The HashedWheelTimer is a highly efficient timer implementation that is much better suited to timeout handling than a task that just sleeps.
Now, if you go that route, the next change I would suggest concerns handling the individual timeout-related failures for each future. If you want an individual failure to completely fail the aggregate Future returned from the sequence call, then do nothing extra. If you don't want that to happen, and instead want a timeout to return some default value, then you can use recover on the Future like this:
withTimeout(someFuture).recover {
  case ex: TimeoutException => someDefaultValue
}
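For reference, a hedged sketch of what such a withTimeout helper could look like on top of Netty's HashedWheelTimer (the timer the answer refers to); the helper name and wiring here are our own:

import io.netty.util.{HashedWheelTimer, Timeout => NettyTimeout, TimerTask}
import scala.concurrent.{ExecutionContext, Future, Promise, TimeoutException}
import scala.concurrent.duration.FiniteDuration

val timer = new HashedWheelTimer()

def withTimeout[T](fut: Future[T], after: FiniteDuration)(
    implicit ec: ExecutionContext
): Future[T] = {
  val p = Promise[T]()
  // Schedule a single tick that fails the promise if it wins the race.
  val handle = timer.newTimeout(new TimerTask {
    def run(t: NettyTimeout): Unit =
      p.tryFailure(new TimeoutException(s"timed out after $after"))
  }, after.length, after.unit)
  // If the future finishes first, cancel the tick and propagate its result.
  fut.onComplete { result => handle.cancel(); p.tryComplete(result) }
  p.future
}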
Once you've done that, you can take advantage of the non-blocking callbacks and do something like this:
waitingList onComplete {
  case Success(results) => // handle success
  case Failure(ex) => // handle fail
}
Each future has a timeout and thus will not just run infinitely, so there is no need, IMO, to block there and provide an additional layer of timeout handling via the atMost param to Await.result. But I guess this assumes you are okay with the non-blocking approach. If you really need to block there, then you should not be waiting timeout * futures.size amount of time. These futures are running in parallel; the timeout there should only need to be as long as the individual timeouts for the futures themselves (or just slightly longer to account for any delays in cpu/timing). It certainly should not be the timeout multiplied by the total number of futures.
Here's a version that shows how bad your blocking fallback is.
Notice that the executor is single threaded and you're creating many fallbacks.
#cmbaxter is right, your master timeout shouldn't be timeout * futures.size, it should be bigger!
#cmbaxter is also right that you want to think non-blocking. Once you do that and you want to impose timeouts, you will pick a timer component for that; see his linked answer (also linked from your linked answer).
That said, I still like my answer from your link, insofar as sitting in a loop waiting for the next thing that should time out is really simple.
It just takes a list of futures with their timeouts and a fallback value.
Maybe there is a use case for that, such as a simple app that just blocks for some results (like your test) and must not exit before the results are in.
import scala.concurrent._
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext
import java.util.concurrent.Executors
import java.lang.System.{ nanoTime => now }

object Test extends App {
  //implicit val xc = ExecutionContext.global
  implicit val xc = ExecutionContext fromExecutorService (Executors.newSingleThreadExecutor)

  def timed[A](body: => A): A = {
    val start = now
    val res = body
    val end = now
    Console println (Duration fromNanos (end - start)).toMillis + " " + res
    res
  }

  println("Creating futureList")
  val timeout = 1500 millis
  val futures = List(1000, 1500, 1200, 800, 2000) map { ms =>
    val f = future {
      timed {
        blocking(Thread sleep ms)
        ms toString
      }
    }
    Future firstCompletedOf Seq(f, fallback(timeout))
  }

  println("Creating waitinglist")
  val waitingList = Future sequence futures
  println("Created")

  timed {
    val results = Await result (waitingList, 2 * timeout * futures.size)
    println(results)
  }
  xc.shutdown

  def fallback(timeout: Duration) = future {
    timed {
      blocking(Thread sleep (timeout toMillis))
      "-1"
    }
  }
}
What happened:
Creating futureList
Creating waitinglist
Created
1001 1000
1500 -1
1500 1500
1500 -1
1200 1200
1500 -1
800 800
1500 -1
2000 2000
1500 -1
List(1000, 1500, 1200, 800, 2000)
14007 ()
Monix Task has timeout support:
import monix.execution.Scheduler.Implicits.global
import monix.eval._
import scala.concurrent.duration._

println("Creating futureList")
val tasks = List(1000, 1500, 1200, 800, 2000).map { ms =>
  Task {
    Thread.sleep(ms)
    ms.toString
  }.timeoutTo(2.seconds, Task.now("-1"))
}

println("Creating waitinglist")
val waitingList = Task.gather(tasks) // runs in parallel; Task.sequence would be a true/literal "sequencing" operation
println("Created")

// waitingList is a Task, not a Future, so run it directly; each task
// already carries its own timeout, so no extra Await bookkeeping is needed
val results = waitingList.runSyncUnsafe()
println(results)