scalaz-stream queue without hanging - scala

I have a two-part question, so let me give some background first. I know that is possible to do something similar to what I want like this:
import scalaz.concurrent._
import scalaz.stream._
val q = async.unboundedQueue[Int]
val p: Process[Task, Int] = q.dequeue
q.enqueueAll(1 to 2).run
val p1: Process1[Int, Int] = process1.take(1)
p.pipe(p1).map(x => println(s"Answer: $x")).run.run
// Answer: 1
p.pipe(p1).map(x => println(s"Answer: $x")).run.run
// Answer: 2
p.pipe(p1).map(x => println(s"Answer: $x")).run.run
// hangs awaiting next input
Is there some other p1 that I could use that would give me the output below without hanging (it would be like process1.awaitOption)?
Answer: Some(1)
Answer: Some(2)
Answer: None
If yes, I think it would be easy to answer the next question. Is there some other p1 that I could use that would give me the output below without hanging (it would be like process1.chunkAll)?
Answer: Seq(1, 2)
Answer: Seq()
Answer: Seq()
Edit:
To complement the question to make it more understandable. If I have a loop like this:
for (i <- 1 to 4) {
p.pipe(p1).map(x => println(s"Answer: $x")).run.run
}
The result could be:
Answer: Seq()
// if someone pushes some values into the queue, like: q.enqueueAll(1 to 2).run
Answer: Seq(1, 2)
Answer: Seq()
Answer: Seq()
I hope it's clear now what I am trying to do. The problem is that I don't have control of the loop and I must not block it if there's no values in the queue.

I am not sure if i understand the semantics you trying to have, but generally the process may be interrupted (that means cancelled to wait for some value) by either closing queue externally, or by using wye. interrupt.
When you would like to have process terminate instead of awaiting next enqueued value? If let say you would like to have this on empty queue, there is "size" process and you may use that to interrupt awaiting queue if the size is empty, something like:
val empty : Process[Task,Boolean] = q.size.continuous.map(_ <= 0)
val deq : Process[Task,Int] = empty.wye(q.enqueue)(wye.interrupt)

Although I couldn't make Pavel's answer work the way I wanted, it was the turning point and I could use his advice to use the size signal.
I'm posting my answer here in case anyone need it:
import scalaz.concurrent._
import scalaz.stream._
val q = async.unboundedQueue[Int]
val p: Process[Task, Int] = q.size.continuous.take(1).flatMap { n => q.dequeue |> process1.take(n) }
q.enqueueAll(1 to 2).run
p.map(x => println(s"Answer: $x")).run.run
// Answer: 1
// Answer: 2
p.map(x => println(s"Answer: $x")).run.run
// not hanging awaiting next input
p.map(x => println(s"Answer: $x")).run.run
// not hanging awaiting next input
q.enqueueAll(1 to 2).run
p.map(x => println(s"Answer: $x")).run.run
// Answer: 1
// Answer: 2
I realize that it's not exactly answering the question since I don't have an explicit p1, but it's fine for my purposes.

Related

Akka SourceQueue to send list elements

I have a List[String] and a Source.queue. I would like to offer this queue string elements after some interval of time. Something like this :
val data : List[String] = ""
val tick = Source.tick(0 second, 1 second, "tick")
tick.runForeach(t => queue.offer(data(??))
Can someone help me out?
Edit : I have found a way but looking for more elegant way
val tick = Source.tick(0 second, 2 second, "tick").zipWithIndex.limit(data.length)
tick.runForeach(t => {
queue.offer(data(t._2.toInt))
})
To send elements from the List[String] to the queue in specific time intervals for each element, use Source#delay in the following manner:
val data: List[String] = ???
Source(data)
.delay(2.seconds, DelayOverflowStrategy.backpressure)
.withAttributes(Attributes.inputBuffer(1, 1))
.mapAsync(1)(x => queue.offer(x))
.runWith(Sink.ignore)
Set the input buffer size to one with withAttributes because the default value is 16, and use DelayOverflowStrategy.backpressure. Also, use mapAsync since the offer method returns a Future.
Alternatively, use Source#throttle:
Source(data)
.throttle(1, 2.seconds, 1, ThrottleMode.Shaping)
.mapAsync(1)(x => queue.offer(x))
.runWith(Sink.ignore)

For loop containing Scala Futures modifying a List

Let's say I have a ListBuffer[Int] and I iterate it with a foreach loop, and each loop will modify this list from inside a Future (removing the current element), and will do something special when the list is empty. Example code:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.collection.mutable.ListBuffer
val l = ListBuffer(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
l.foreach(n => Future {
println(s"Processing $n")
Future {
l -= n
println(s"Removed $n")
if (l.isEmpty) println("List is empty!")
}
})
This is probably going to end very badly. I have a more complex code with similar structure and same needs, but I do not know how to structure it so I can achieve same functionality in a more reliable way.
The way you present your problem is really not in the functional paradigm that scala is intended for.
What you seem to want, is to do a list of asynchronous computations, do something at the end of each one, and something else when every one is finished. This is pretty simple if you use continuations, which are simple to implement with map and flatMap methods on Future.
val fa: Future[Int] = Future { 1 }
// will apply the function to the result when it becomes available
val fb: Future[Int] = fa.map(a => a + 1)
// will start the asynchronous computation using the result when it will become available
val fc: Future[Int] = fa.flatMap(a => Future { a + 2 })
Once you have all this, you can easily do something when each of your Future completes (successfully):
val myFutures: List[Future[Int]] = ???
myFutures.map(futInt => futInt.map(int => int + 2))
Here, I will add 2 to each value I get from the different asynchronous computations in the List.
You can also choose to wait for all the Futures in your list to complete by using Future.sequence:
val myFutureList: Future[List[Int]] = Future.sequence(myFutures)
Once again, you get a Future, which will be resolved when each of the Futures inside the input list are successfully resolved, or will fail whenever one of your Futures fails. You'll then be able to use map or flatMap on this new Future, to use all the computed values at once.
So here's how I would write the code you proposed:
val l = 1 to 10
val processings: Seq[Future[Unit]] = l.map {n =>
Future(println(s"processing $n")).map {_ =>
println(s"finished processing $n")
}
}
val processingOver: Future[Unit] =
Future.sequence(processings).map { (lu: Seq[Unit]) =>
println(s"Finished processing ${lu.size} elements")
}
Of course, I would recommend having real functions rather than procedures (returning Unit), so that you can have values to do something with. I used println to have a code which will produce the same output as yours (except for the prints, which have a slightly different meaning, since we are not mutating anything anymore).

handle multiple Future in Scala

I want to create a list of Future, each of which could pass or fail and collate results from successful Future. How can I do this?
val futures2:List[Future[Int]] = List(Future{1}, Future{2},Future{throw new Exception("error")})
Questions
1) I want to wait for each future to finish
2) I want to collect sum of return values from each success future and ignore the ones which failed (so I should get 3).
One thing that you need to understand is that... Avoid trying to "get" values from a Future or Futures.
You can keep on operating in the Futuristic land.
val futureList = List(
Future(1),
Future(2),
Future(throw new Exception("error"))
)
// addd 1 to futures
// map will propagate errors to transformed futures
// only successful futures will result in +1, rest will stay with errors
val tranformedFutureList = futureList
.map(future => future.map(i => i + 1))
// print values of futures
// simimlar to map... for each will work only with successful futures
val unitFutureList = futureList
.map(future => future.foreach(i => println(i)))
// now lets give you sum of your "future" values
val sumFuture = futureList
.foldLeft(Future(0))((facc, f) => f.onComplete({
case Success(i) => facc.map(acc => acc + i)
case Failure(ex) => facc
})
And since OP (#Manu Chanda) asked about "getting" a value from a Promise, I am adding some bits about what Promise are in Scala.
So... first lets talk how to think about a Future in Scala.
If you see a Future[Int] then try to think of it as an ongoing computation which is "supposed to produce" an Int. Now that computation can successfully complete and result in a Success[Int] or a throw an exception and result in a Failure[Throwable]. And thus you see the functions such as onComplete, recoverWith, onFailure which seem like talking about a computation.
val intFuture = Future {
// all this inside Future {} is going to run in some other thread
val i = 5;
val j = i + 10;
val k = j / 5;
k
}
Now... what is a Promise.
Well... as the name indicates... a Promise[Int] is a promise of an Int value... nothing more.
Just like when a parent promises a certain toy to their child. Note that in this case... the parent has not necessarily started working on getting that toy, they have just promised that they will.
To complete the promise... they will first have to start working to complete it... got to market... buy from shop... come back home.Or... sometimes... they are busy so... they will ask someone else to bring that toy and keep doing their work... that other guy will try to bring that toy to parent (he may fail to buy it) and then they will complete the promise with whatever result they got from him.
So... basically a Promise wraps a Future inside of it. And that "wrapped" Future "value" can be considered as the value of the Promise.
so...
println("Well... The program wants an 'Int' toy")
// we "promised" our program that we will give it that int "toy"
val intPromise = Promise[Int]()
// now we can just move on with or life
println("Well... We just promised an 'Int' toy")
// while the program can make plans with how will it play with that "future toy"
val intFuture = intPromise.future
val plusOneIntFuture = intFuture.map(i => i + 1)
plusOneIntFuture.onComplete({
case Success(i) => println("Wow... I got the toy and modified it to - " + i)
case Failure(ex) => println("I did not get they toy")
})
// but since we at least want to try to complete our promise
println("Now... I suppose we need to get that 'Int' toy")
println("But... I am busy... I can not stop everything else for that toy")
println("ok... lets ask another thread to get that")
val getThatIntFuture = Future {
println("Well... I am thread 2... trying to get the int")
val i = 1
println("Well... I am thread 2... lets just return this i = 1 thingy")
i
}
// now lets complete our promise with whatever we will get from this other thread
getThatIntFuture.onComplete(intTry => intPromise.complete(intTry))
The above code will result in following output,
Well... The program wants an 'Int' toy
Well... We just promised an 'Int' toy
Now... I suppose we need to get that 'Int' toy
But... I am busy... I can not stop everything else for that toy
Well... I am thread 2... trying to get the int
Well... I am thread 2... lets just return this i = 1 thingy
Wow... I got the toy and modified it to - 2
Promise don't help you in "getting" a value from a Future. Asynchronous processes (or Future in Scala) are just running in another timeline... you can not "get" their "value" in your time-line unless you work on aligning your timeline with the process's time-line itself.

How do you stop building an Option[Collection] upon reaching the first None?

When building up a collection inside an Option, each attempt to make the next member of the collection might fail, making the collection as a whole a failure, too. Upon the first failure to make a member, I'd like to give up immediately and return None for the whole collection. What is an idiomatic way to do this in Scala?
Here's one approach I've come up with:
def findPartByName(name: String): Option[Part] = . . .
def allParts(names: Seq[String]): Option[Seq[Part]] =
names.foldLeft(Some(Seq.empty): Option[Seq[Part]]) {
(result, name) => result match {
case Some(parts) =>
findPartByName(name) flatMap { part => Some(parts :+ part) }
case None => None
}
}
In other words, if any call to findPartByName returns None, allParts returns None. Otherwise, allParts returns a Some containing a collection of Parts, all of which are guaranteed to be valid. An empty collection is OK.
The above has the advantage that it stops calling findPartByName after the first failure. But the foldLeft still iterates once for each name, regardless.
Here's a version that bails out as soon as findPartByName returns a None:
def allParts2(names: Seq[String]): Option[Seq[Part]] = Some(
for (name <- names) yield findPartByName(name) match {
case Some(part) => part
case None => return None
}
)
I currently find the second version more readable, but (a) what seems most readable is likely to change as I get more experience with Scala, (b) I get the impression that early return is frowned upon in Scala, and (c) neither one seems to make what's going on especially obvious to me.
The combination of "all-or-nothing" and "give up on the first failure" seems like such a basic programming concept, I figure there must be a common Scala or functional idiom to express it.
The return in your code is actually a couple levels deep in anonymous functions. As a result, it must be implemented by throwing an exception which is caught in the outer function. This isn't efficient or pretty, hence the frowning.
It is easiest and most efficient to write this with a while loop and an Iterator.
def allParts3(names: Seq[String]): Option[Seq[Part]] = {
val iterator = names.iterator
var accum = List.empty[Part]
while (iterator.hasNext) {
findPartByName(iterator.next) match {
case Some(part) => accum +:= part
case None => return None
}
}
Some(accum.reverse)
}
Because we don't know what kind of Seq names is, we must create an iterator to loop over it efficiently rather than using tail or indexes. The while loop can be replaced with a tail-recursive inner function, but with the iterator a while loop is clearer.
Scala collections have some options to use laziness to achieve that.
You can use view and takeWhile:
def allPartsWithView(names: Seq[String]): Option[Seq[Part]] = {
val successes = names.view.map(findPartByName)
.takeWhile(!_.isEmpty)
.map(_.get)
.force
if (!names.isDefinedAt(successes.size)) Some(successes)
else None
}
Using ifDefinedAt avoids potentially traversing a long input names in the case of an early failure.
You could also use toStream and span to achieve the same thing:
def allPartsWithStream(names: Seq[String]): Option[Seq[Part]] = {
val (good, bad) = names.toStream.map(findPartByName)
.span(!_.isEmpty)
if (bad.isEmpty) Some(good.map(_.get).toList)
else None
}
I've found trying to mix view and span causes findPartByName to be evaluated twice per item in case of success.
The whole idea of returning an error condition if any error occurs does, however, sound more like a job ("the" job?) for throwing and catching exceptions. I suppose it depends on the context in your program.
Combining the other answers, i.e., a mutable flag with the map and takeWhile we love.
Given an infinite stream:
scala> var count = 0
count: Int = 0
scala> val vs = Stream continually { println(s"Compute $count") ; count += 1 ; count }
Compute 0
vs: scala.collection.immutable.Stream[Int] = Stream(1, ?)
Take until a predicate fails:
scala> var failed = false
failed: Boolean = false
scala> vs map { case x if x < 5 => println(s"Yup $x"); Some(x) case x => println(s"Nope $x"); failed = true; None } takeWhile (_.nonEmpty) map (_.get)
Yup 1
res0: scala.collection.immutable.Stream[Int] = Stream(1, ?)
scala> .toList
Compute 1
Yup 2
Compute 2
Yup 3
Compute 3
Yup 4
Compute 4
Nope 5
res1: List[Int] = List(1, 2, 3, 4)
or more simply:
scala> var count = 0
count: Int = 0
scala> val vs = Stream continually { println(s"Compute $count") ; count += 1 ; count }
Compute 0
vs: scala.collection.immutable.Stream[Int] = Stream(1, ?)
scala> var failed = false
failed: Boolean = false
scala> vs map { case x if x < 5 => println(s"Yup $x"); x case x => println(s"Nope $x"); failed = true; -1 } takeWhile (_ => !failed)
Yup 1
res3: scala.collection.immutable.Stream[Int] = Stream(1, ?)
scala> .toList
Compute 1
Yup 2
Compute 2
Yup 3
Compute 3
Yup 4
Compute 4
Nope 5
res4: List[Int] = List(1, 2, 3, 4)
I think your allParts2 function has a problem as one of the two branches of your match statement will perform a side effect. The return statement is the not-idiomatic bit, behaving as if you are doing an imperative jump.
The first function looks better, but if you are concerned with the sub-optimal iteration that foldLeft could produce you should probably go for a recursive solution as the following:
def allParts(names: Seq[String]): Option[Seq[Part]] = {
#tailrec
def allPartsRec(names: Seq[String], acc: Seq[String]): Option[Seq[String]] = names match {
case Seq(x, xs#_*) => findPartByName(x) match {
case Some(part) => allPartsRec(xs, acc +: part)
case None => None
}
case _ => Some(acc)
}
allPartsRec(names, Seq.empty)
}
I didn't compile/run it but the idea should be there and I believe it is more idiomatic than using the return trick!
I keep thinking that this has to be a one- or two-liner. I came up with one:
def allParts4(names: Seq[String]): Option[Seq[Part]] = Some(
names.map(findPartByName(_) getOrElse { return None })
)
Advantage:
The intent is extremely clear. There's no clutter and there's no exotic or nonstandard Scala.
Disadvantages:
The early return violates referential transparency, as Aldo Stracquadanio pointed out. You can't put the body of allParts4 into its calling code without changing its meaning.
Possibly inefficient due to the internal throwing and catching of an exception, as wingedsubmariner pointed out.
Sure enough, I put this into some real code, and within ten minutes, I'd enclosed the expression inside something else, and predictably got surprising behavior. So now I understand a little better why early return is frowned upon.
This is such a common operation, so important in code that makes heavy use of Option, and Scala is normally so good at combining things, I can't believe there isn't a pretty natural idiom to do it correctly.
Aren't monads good for specifying how to combine actions? Is there a GiveUpAtTheFirstSignOfResistance monad?

Strange behavior with reduce and fold on identical code block

EDIT
Ok, #dhg discovered that dot-method syntax required if the code block to fold() is not bound to a val (why with reduce() in the same code block one can use space-method syntax, I don't know). At any rate, the end result is the nicely concise:
result.map { row =>
addLink( row.href, row.label )
}.fold(NodeSeq.Empty)(_++_)
Which negates to some degree the original question; i.e. in many cases one can higher-order away either/or scenarios and avoid "fat", repetitive if/else statements.
ORIGINAL
Trying to reduce if/else handling when working with possibly empty collections like List[T]
For example, let's say I need to grab the latest news articles to build up a NodeSeq of html news <li><a>links</a></li>:
val result = dao.getHeadlines // List[of model objects]
if(result.isEmpty) NodeSeq.Empty
else
result map { row =>
addLink( row.href, row.label ) // NodeSeq
} reduce(_ ++ _)
This is OK, pretty terse, but I find myself wanting to go ternary style to address these only-will-ever-be either/or cases:
result.isEmpty ? NodeSeq.Empty :
result map { row =>
addLink( row.href, row.label )
} reduce(_ ++ _)
I've seen some old postings on pimping ternary onto boolean, but curious to know what the alternatives are, if any, to streamline if/else?
match {...} is, IMO, a bit bloated for this scenario, and for {...} yield doesn't seem to help much either.
You don't need to check for emptiness at all. Just use fold instead of reduce since fold allows you to specify a default "empty" value:
scala> List(1,2,3,4).map(_ + 1).fold(0)(_+_)
res0: Int = 14
scala> List[Int]().map(_ + 1).fold(0)(_+_)
res1: Int = 0
Here's an example with a List of Seqs:
scala> List(1,2).map(Seq(_)).fold(Seq.empty)(_++_)
res14: Seq[Int] = List(1, 2)
scala> List[Int]().map(Seq(_)).fold(Seq.empty)(_++_)
res15: Seq[Int] = List()
EDIT: Looks like the problem in your sample has to do with the dropping of dot (.) characters between methods. If you keep them in, it all works:
scala> List(1,2,3).map(i => node).fold(NodeSeq.Empty)(_ ++ _)
res57: scala.xml.NodeSeq = NodeSeq(<li>Link</li>, <li>Link</li>, <li>Link</li>)