Is there any more functional alternative in Scala for an infinite loop?
while(true) {
if (condition) {
// Do something
} else {
Thread.sleep(interval);
}
}
You can do it recursively
import scala.annotation.tailrec
@tailrec
def loop(): Nothing = {
if (condition) {
// Do something
} else {
Thread.sleep(interval);
}
loop()
}
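A minimal sketch of the same recursive shape with an explicit stop check, which makes the loop easy to exercise in a test. The names PollLoop, pollLoop, shouldStop, hasWork, work and intervalMs are hypothetical and not part of the answer above; the return type becomes Unit because this variant can terminate.

```scala
import scala.annotation.tailrec

object PollLoop {
  // Hypothetical helper: keeps polling until `shouldStop()` returns true.
  // When there is work it runs `work()`, otherwise it sleeps for `intervalMs`.
  @tailrec
  def pollLoop(shouldStop: () => Boolean, hasWork: () => Boolean, intervalMs: Long)(work: () => Unit): Unit =
    if (!shouldStop()) {
      if (hasWork()) work() else Thread.sleep(intervalMs)
      pollLoop(shouldStop, hasWork, intervalMs)(work)
    }
}
```

Passing `() => false` as shouldStop gives back the endless behaviour of the original loop.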
One thing you can do is use a higher-order function like Stream.continually and pair it with a for comprehension:
import scala.util.Random
import scala.collection.immutable.Stream.continually
def rollTheDice: Int = Random.nextInt(6) + 1
for (n <- continually(rollTheDice)) {
println(s"the dice rolled $n")
}
This example itself is not purely functional due to the non-referentially transparent nextInt method, but it's a possible construct that may help you think about function composition rather than using side effects.
EDIT (2020-12-24)
As correctly pointed out in a recent comment, "[a]s of 2.13, Stream is deprecated. But the same method does exist in LazyList (import scala.collection.immutable.LazyList.continually)".
The following will work from 2.13 onward:
import scala.util.Random
import scala.collection.immutable.LazyList.continually
def rollTheDice: Int = Random.nextInt(6) + 1
for (n <- continually(rollTheDice)) {
println(s"the dice rolled $n")
}
You can see it in action and play around with it here on Scastie.
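As a hedged variation, Iterator.continually offers the same construct without depending on Stream or LazyList, and combining it with takeWhile gives a loop that ends. The object name DiceDemo, the method rollsBeforeSix and the injected Random are assumptions made for illustration.

```scala
import scala.util.Random

object DiceDemo {
  // Same dice as above, but with the generator passed in so that runs can be seeded.
  def rollTheDice(rng: Random): Int = rng.nextInt(6) + 1

  // Rolls until the first 6 shows up; the 6 itself is not included in the result.
  def rollsBeforeSix(rng: Random): List[Int] =
    Iterator.continually(rollTheDice(rng)).takeWhile(_ != 6).toList
}
```

Dropping the takeWhile and calling foreach instead turns it back into an endless loop.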
I guess infinite tail recursion:
import scala.annotation.tailrec
@tailrec
def loop(): Nothing = {
if (condition) {
// Do something
} else {
Thread.sleep(interval);
}
loop()
}
Just to add to Stefano's great answer, in case someone is looking for a use case like mine:
I was working on tasks from a Kafka Streams course and needed to create an infinite stream of mock events to Kafka, with some fields being completely random (amounts) and others rotated within a specific list (names).
The same approach with continually can be used by passing a method to it (via eta expansion) and then traversing the bound variable:
for {record <- continually(newRandomTransaction _)
name <- List("John", "Stephane", "Alice")} {
producer.send(record(name))
}
where the signature of newRandomTransaction is as follows:
def newRandomTransaction(name: String): ProducerRecord[String, String] = {
...
}
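A self-contained sketch of the same pattern without Kafka, assuming a hypothetical newEvent function in place of newRandomTransaction and plain strings in place of ProducerRecord. It uses Iterator.continually so the result can be truncated with take.

```scala
object RotatingNames {
  // Hypothetical stand-in for newRandomTransaction: a random amount attached to the given name.
  def newEvent(name: String): String = s"$name:${scala.util.Random.nextInt(100)}"

  // Every element of the infinite iterator is the function itself (the method is
  // eta-expanded because a function type is expected; `newEvent _` is the explicit form).
  // The inner generator then applies it to each name in turn.
  def events(names: List[String]): Iterator[String] =
    for {
      make <- Iterator.continually[String => String](newEvent)
      name <- names
    } yield make(name)
}
```

For example, `RotatingNames.events(List("John", "Stephane", "Alice")).take(6)` cycles through the three names twice.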
Using Scala 2.13.0:
implicit val ec = ExecutionContext.global
val arr = (0 until 20).toIterator
.map { x =>
Thread.sleep(500);
println(x);
x
}
val fss = arr.map { slowX =>
Future { blocking { slowX } }
}
Await.result(Future.sequence(fss), Inf)
problem
arr is an iterator where each item needs 500ms of processing time. We map the iterator with Future { blocking { ... } } with the purpose of making the processing parallel (using the global execution context). Finally we run Future.sequence to consume the iterator.
Given the definition of Future.apply[T](body: =>T) and blocking[T](body: =>T), body is passed lazily, which means that body will be processed in the Future. If we inject that in the definition of Iterator.map, we get def next() = Future{blocking(self.next())}, so each item of the iterator should be processed in the Future.
However, when I try this example, I can see that the iterator is consumed sequentially, which is not what I expected!
Is that a Scala bug? Or am I missing something?
No, it's not a bug, because:
val arr = (0 until 20).toIterator
// This map is invoked first and runs sequentially, because it executes on the calling thread.
.map { x =>
Thread.sleep(500);
println(x);
x
}
// This also runs sequentially, because the upstream map has already executed on the calling thread.
// So "Future { blocking { slowX } }" could be replaced with "Future.successful(slowX)",
// because no computation is left to execute.
val fss = arr.map { slowX =>
Future { blocking { slowX } }
}
If you want to run this completely asynchronously, you can do something like:
def heavyCalculation(x: Int) = {
Thread.sleep(500);
println(x);
x
}
val result = Future.traverse((0 until 20).toList) { x =>
Future(blocking(heavyCalculation(x)))
}
Await.result(result, 1.minute) // requires import scala.concurrent.duration._
Working Scastie example: https://scastie.scala-lang.org/3v06NpypRHKYkqBgzaeVXg
First, this is not a proper benchmark; you haven't actually shown formal proof that this is sequential and not parallel (although it is "obvious" from the source code that it isn't parallel).
Second, an Iterator of Futures is probably a bad idea; at this point, it may make sense to look into a streaming solution like Akka-Streams, fs2, Monix or ZIO.
Third, what is even the point of having a bunch of blocking futures? You aren't actually winning much.
Fourth, the problem is that the second map is not passed the block of the first map, just its result. So you actually did the sleep before creating the Future.
Fifth, you probably want to do this instead:
val result = Future.traverse(data) { elem =>
Future {
blocking {
// Process elem here.
}
}
}
Await.result(result, Inf)
The other answers were pointing in the right direction, but the formal answer is the following: the signature of Iterator.map(f: A => B) tells us that A is computed before f is applied to it (because it is not => A). Therefore, next() is computed in the main thread.
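The point about strict arguments can be checked by recording which thread does the work. This is a minimal sketch under the assumption that the caller is not itself a thread of the global execution context; the names WhereDoesItRun, strict and deferred are hypothetical.

```scala
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object WhereDoesItRun {
  private def threadName(): String = Thread.currentThread().getName

  // Strict: the upstream map body runs on the caller's thread when the iterator is
  // advanced, before Future.apply is even entered. The Future only wraps a finished value.
  def strict(): List[String] = {
    val upstream = Iterator(1, 2, 3).map(_ => threadName())
    val futures = upstream.map(name => Future(blocking(name))).toList
    Await.result(Future.sequence(futures), 10.seconds)
  }

  // Deferred: the work sits inside the Future body, so the execution context runs it.
  def deferred(): List[String] =
    Await.result(Future.traverse(List(1, 2, 3))(_ => Future(blocking(threadName()))), 10.seconds)
}
```

Comparing the names returned by strict() with the caller's own thread name shows that the upstream work never left the calling thread.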
I'm beginning with Scala. I have a program with a method containing a while loop that runs until the program ends.
But for my test, I need to execute this method only once (or twice). In Java, I would have used a mutable variable and decremented it in order to stop the processing.
Or maybe a condition inside my while loop that I override for my test.
def receive = {
val iterator = stream.iterator()
while (iterator.hasNext && my_condition()) {
something_to_do
}
}
I know it's a stupid question, but could you please advise me?
Try:
iterator.takeWhile(my_condition).foreach(something_to_do)
or:
iterator.take(n).foreach(something_to_do)
if you just want the first n entries.
Or, if something_to_do returns a result (rather than Unit), and you want to return an iterator of those results, you can use:
iterator.takeWhile(my_condition).map(something_to_do)
(or .take(n).map(...) )
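A minimal sketch of both variants as reusable helpers. Since my_condition() in the question takes no arguments, it is wrapped as `_ => keepGoing()` to fit takeWhile, which expects a function of the element. The names TakeDemo, runWhile, runAtMost and keepGoing are hypothetical.

```scala
object TakeDemo {
  // Runs `action` on elements of `it` for as long as the no-argument check passes.
  def runWhile[A](it: Iterator[A], keepGoing: () => Boolean)(action: A => Unit): Unit =
    it.takeWhile(_ => keepGoing()).foreach(action)

  // Runs `action` on at most the first `n` elements; handy for a test that should loop once or twice.
  def runAtMost[A](it: Iterator[A], n: Int)(action: A => Unit): Unit =
    it.take(n).foreach(action)
}
```

In a test, runAtMost with n = 1 or 2 replaces the mutable countdown variable mentioned in the question.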
Consider this for comprehension,
for (_ <- iterator if my_condition()) something_to_do
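where each iterated value is ignored (note the _) and the todo part is invoked for each element for which the condition holds. Note that a guard filters rather than stops: the iterator is still consumed to the end, even after the condition has turned false.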
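A guard in a for comprehension and takeWhile can keep the same elements while pulling different amounts from the source, which matters for an endless iterator. The sketch below counts the pulls for each form; the names GuardVsTakeWhile, pulledWithGuard and pulledWithTakeWhile are hypothetical.

```scala
object GuardVsTakeWhile {
  // Returns the kept elements together with the number of elements pulled from the source.
  def pulledWithGuard(): (List[Int], Int) = {
    var pulled = 0
    val src = Iterator(1, 2, 3, 4, 5).map { x => pulled += 1; x }
    val kept = (for (x <- src if x < 3) yield x).toList // guard: filters, never stops early
    (kept, pulled)
  }

  def pulledWithTakeWhile(): (List[Int], Int) = {
    var pulled = 0
    val src = Iterator(1, 2, 3, 4, 5).map { x => pulled += 1; x }
    val kept = src.takeWhile(_ < 3).toList // stops at the first element that fails the test
    (kept, pulled)
  }
}
```

Both keep 1 and 2, but the guard version walks all five elements while takeWhile stops after pulling the third.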
I think an approach like the following is acceptable:
import akka.actor.{Props, Actor}
import scala.io.Source
object TestableActor {
def props = Props(new TestableActor())
def testProps = Props(new TestableActor(true))
case class Message(stream: Stream)
}
class TestableActor(doOnce: Boolean = false) extends Actor {
import TestableActor._
val stream: Stream = ???
def receive = {
case Message(stream) =>
val iterator = stream.iterator
if(doOnce) {
something_to_do
} else {
while (iterator.hasNext && my_condition()) {
something_to_do
}
}
}
def my_condition(): Boolean = ???
def something_to_do: Unit = ???
}
In your production code, use
context.actorOf(TestableActor.props)
In your test use
TestActorRef[TestableActor](TestableActor.testProps)
Suppose this API is given and we cannot change it:
object ProviderAPI {
trait Receiver[T] {
def receive(entry: T)
def close()
}
def run(r: Receiver[Int]) {
new Thread() {
override def run() {
(0 to 9).foreach { i =>
r.receive(i)
Thread.sleep(100)
}
r.close()
}
}.start()
}
}
In this example, ProviderAPI.run takes a Receiver, calls receive(i) 10 times and then closes. Typically, ProviderAPI.run would call receive(i) based on a collection which could be infinite.
This API is intended to be used in imperative style, like an external iterator. If our application needs to filter, map and print this input, we need to implement a Receiver which mixes all these operations:
object Main extends App {
class MyReceiver extends ProviderAPI.Receiver[Int] {
def receive(entry: Int) {
if (entry % 2 == 0) {
println("Entry#" + entry)
}
}
def close() {}
}
ProviderAPI.run(new MyReceiver())
}
Now, the question is how to use the ProviderAPI in a functional style, as an internal iterator (without changing the implementation of ProviderAPI, which is given to us). Note that ProviderAPI could also call receive(i) infinitely many times, so collecting everything in a list is not an option (also, we should handle each result one by one, instead of collecting all the input first and processing it afterwards).
I am asking how to implement such a ReceiverToIterator, so that we can use the ProviderAPI in functional style:
object Main extends App {
val iterator = new ReceiverToIterator[Int] // how to implement this?
ProviderAPI.run(iterator)
iterator
.view
.filter(_ % 2 == 0)
.map("Entry#" + _)
.foreach(println)
}
Update
Here are four solutions:
IteratorWithSemaphorSolution: The workaround solution I proposed first attached to the question
QueueIteratorSolution: Using the BlockingQueue[Option[T]] based on the suggestion of nadavwr.
It allows the producer to continue producing up to queueCapacity before being blocked by the consumer.
PublishSubjectSolution: Very simple solution, using PublishSubject from Netflix RxJava-Scala API.
SameThreadReceiverToTraversable: Very simple solution, by relaxing the constraints of the question
Updated: BlockingQueue of 1 entry
What you've implemented here is essentially Java's BlockingQueue, with a queue size of 1.
Main characteristic: uber-blocking. A slow consumer will kill your producer's performance.
Update: @gzm0 mentioned that BlockingQueue doesn't cover EOF. You'll have to use BlockingQueue[Option[T]] for that.
Update: Here's a code fragment. It can be made to fit with your Receiver.
Some of it inspired by Iterator.buffered. Note that peek is a misleading name, as it may block -- and so will hasNext.
// fairness enabled -- you probably want to preserve order...
// alternatively, disable fairness and increase buffer to be 'big enough'
private val queue = new java.util.concurrent.ArrayBlockingQueue[Option[T]](1, true)
// the following block provides you with a potentially blocking peek operation
// it should `queue.take` when the previous peeked head has been invalidated
// specifically, it will `queue.take` and block when the queue is empty
private var head: Option[T] = _
private var headDefined: Boolean = false
private def invalidateHead() { headDefined = false }
private def peek: Option[T] = {
if (!headDefined) {
head = queue.take()
headDefined = true
}
head
}
def iterator = new Iterator[T] {
// potentially blocking; only false upon taking `None`
def hasNext = peek.isDefined
// peeks and invalidates head; throws NoSuchElementException as appropriate
def next: T = {
val opt = peek; invalidateHead()
if (opt.isEmpty) throw new NoSuchElementException
else opt.get
}
}
Alternative: Iteratees
Iterator-based solutions will generally involve more blocking. Conceptually, you could use continuations on the thread doing the iteration to avoid blocking the thread, but continuations mess with Scala's for-comprehensions, so no joy down that road.
Alternatively, you could consider an iteratee-based solution. Iteratees are different than iterators in that the consumer isn't responsible for advancing the iteration -- the producer is. With iteratees, the consumer basically folds over the entries pushed by the producer over time. Folding each next entry as it becomes available can take place in a thread pool, since the thread is relinquished after each fold completes.
You won't get nice for-syntax for iteration, and the learning curve is a little challenging, but if you feel confident using a foldLeft you'll end up with a non-blocking solution that is reasonably easy on the eye.
To read more about iteratees, I suggest taking a peek at PlayFramework 2.X's iteratee reference. The documentation describes their stand-alone iteratee library, which is 100% usable outside the context of Play. Scalaz 7 also has a comprehensive iteratee library.
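To illustrate only the producer-driven fold idea, here is a minimal sketch. It is not an iteratee library and does not mirror the Play or Scalaz APIs: real iteratees are immutable and return a continuation at each step, whereas this hypothetical FoldingConsumer simply keeps mutable state that each pushed entry advances.

```scala
object PushFold {
  // Hypothetical minimal consumer: holds a fold state that the producer advances by pushing entries.
  // No thread ever waits for the next entry; each push does one fold step and returns.
  final class FoldingConsumer[E, A](zero: A)(step: (A, E) => A) {
    private var state: A = zero
    private var closed = false

    // Called by the producer for every entry; entries pushed after close() are ignored.
    def push(entry: E): Unit = synchronized { if (!closed) state = step(state, entry) }

    // Called by the producer at end of input; returns the final fold result.
    def close(): A = synchronized { closed = true; state }
  }
}
```

A Receiver could forward receive to push and close to close, so the filter and map logic lives in the step function.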
IteratorWithSemaphorSolution
The first workaround solution that I proposed attached to the question.
I moved it here as an answer.
import java.util.concurrent.Semaphore
object Main extends App {
val iterator = new ReceiverToIterator[Int]
ProviderAPI.run(iterator)
iterator
.filter(_ % 2 == 0)
.map("Entry#" + _)
.foreach(println)
}
class ReceiverToIterator[T] extends ProviderAPI.Receiver[T] with Iterator[T] {
var lastEntry: T = _
var waitingToReceive = new Semaphore(1)
var waitingToBeConsumed = new Semaphore(1)
var eof = false
waitingToReceive.acquire()
def receive(entry: T) {
println("ReceiverToIterator.receive(" + entry + "). START.")
waitingToBeConsumed.acquire()
lastEntry = entry
waitingToReceive.release()
println("ReceiverToIterator.receive(" + entry + "). END.")
}
def close() {
println("ReceiverToIterator.close().")
eof = true
waitingToReceive.release()
}
def hasNext = {
println("ReceiverToIterator.hasNext().START.")
waitingToReceive.acquire()
waitingToReceive.release()
println("ReceiverToIterator.hasNext().END.")
!eof
}
def next = {
println("ReceiverToIterator.next().START.")
waitingToReceive.acquire()
if (eof) { throw new NoSuchElementException }
val entryToReturn = lastEntry
waitingToBeConsumed.release()
println("ReceiverToIterator.next().END.")
entryToReturn
}
}
QueueIteratorSolution
The second workaround solution that I proposed attached to the question. I moved it here as an answer.
Solution using the BlockingQueue[Option[T]] based on the suggestion of nadavwr.
It allows the producer to continue producing up to queueCapacity before being blocked by the consumer.
I implement a QueueToIterator that uses an ArrayBlockingQueue with a given capacity.
BlockingQueue has a take() method, but not a peek or hasNext, so I need an OptionNextToIterator as follows:
trait OptionNextToIterator[T] extends Iterator[T] {
def getOptionNext: Option[T] // abstract
def hasNext = { ... }
def next = { ... }
}
Note: I am using the synchronized block inside OptionNextToIterator, and I am not sure it is totally correct
Solution:
import java.util.concurrent.ArrayBlockingQueue
object Main extends App {
val receiverToIterator = new ReceiverToIterator[Int](queueCapacity = 3)
ProviderAPI.run(receiverToIterator)
Thread.sleep(3000) // test that ProviderAPI.run can produce 3 items ahead before being blocked by the consumer
receiverToIterator.filter(_ % 2 == 0).map("Entry#" + _).foreach(println)
}
class ReceiverToIterator[T](val queueCapacity: Int = 1) extends ProviderAPI.Receiver[T] with QueueToIterator[T] {
def receive(entry: T) { queuePut(entry) }
def close() { queueClose() }
}
trait QueueToIterator[T] extends OptionNextToIterator[T] {
val queueCapacity: Int
val queue = new ArrayBlockingQueue[Option[T]](queueCapacity)
var queueClosed = false
def queuePut(entry: T) {
if (queueClosed) { throw new IllegalStateException("The queue has already been closed."); }
queue.put(Some(entry))
}
def queueClose() {
queueClosed = true
queue.put(None)
}
def getOptionNext = queue.take
}
trait OptionNextToIterator[T] extends Iterator[T] {
def getOptionNext: Option[T]
var answerReady: Boolean = false
var eof: Boolean = false
var element: T = _
def hasNext = {
prepareNextAnswerIfNecessary()
!eof
}
def next = {
prepareNextAnswerIfNecessary()
if (eof) { throw new NoSuchElementException }
val retVal = element
answerReady = false
retVal
}
def prepareNextAnswerIfNecessary() {
if (answerReady) {
return
}
synchronized {
getOptionNext match {
case None => eof = true
case Some(e) => element = e
}
answerReady = true
}
}
}
PublishSubjectSolution
A very simple solution using PublishSubject from Netflix RxJava-Scala API:
// libraryDependencies += "com.netflix.rxjava" % "rxjava-scala" % "0.20.7"
import rx.lang.scala.subjects.PublishSubject
class MyReceiver[T] extends ProviderAPI.Receiver[T] {
val channel = PublishSubject[T]()
def receive(entry: T) { channel.onNext(entry) }
def close() { channel.onCompleted() }
}
object Main extends App {
val myReceiver = new MyReceiver[Int]()
ProviderAPI.run(myReceiver)
myReceiver.channel.filter(_ % 2 == 0).map("Entry#" + _).subscribe{n => println(n)}
}
ReceiverToTraversable
This Stack Overflow question came up when I wanted to list and process an svn repository using the svnkit.com API as follows:
SvnList svnList = new SvnOperationFactory().createList();
svnList.setReceiver(new ISvnObjectReceiver<SVNDirEntry>() {
public void receive(SvnTarget target, SVNDirEntry dirEntry) throws SVNException {
// do something with dirEntry
}
});
svnList.run();
The API uses a callback function, and I wanted to use a functional style instead, as follows:
svnList
.filter(e => "pom.xml".compareToIgnoreCase(e.getName()) == 0)
.map(_.getURL)
.map(getMavenArtifact)
.foreach(insertArtifact)
I thought of having a class ReceiverToIterator[T] extends ProviderAPI.Receiver[T] with Iterator[T],
but this required the svnkit api to run in another thread.
That's why I asked how to solve this problem with a ProviderAPI.run method that runs in a new thread. But that was not very wise: if I had explained the real case, someone might have found a better solution sooner.
Solution
If we tackle the real problem (so, no need of using a thread for the svnkit),
a simpler solution is to implement a scala.collection.Traversable instead of a scala.collection.Iterator.
While Iterator requires a next and hasNext def, Traversable requires a foreach def,
which is very similar to the svnkit callback!
Note that by using view, we make the transformers lazy, so elements are passed one by one through the whole chain to foreach(println).
This allows processing an infinite collection.
object ProviderAPI {
trait Receiver[T] {
def receive(entry: T)
def close()
}
// Later I found out that I don't need a thread
def run(r: Receiver[Int]) {
(0 to 9).foreach { i => r.receive(i); Thread.sleep(100) }
}
}
object Main extends App {
new ReceiverToTraversable[Int](r => ProviderAPI.run(r))
.view
.filter(_ % 2 == 0)
.map("Entry#" + _)
.foreach(println)
}
class ReceiverToTraversable[T](val runProducer: (ProviderAPI.Receiver[T] => Unit)) extends Traversable[T] {
override def foreach[U](f: (T) => U) = {
object MyReceiver extends ProviderAPI.Receiver[T] {
def receive(entry: T) = f(entry)
def close() = {}
}
runProducer(MyReceiver)
}
}
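On Scala 2.13 and later, Traversable is a deprecated alias for Iterable, so a class that only defines foreach no longer satisfies it. A hedged alternative is a small hand-rolled push pipeline in which foreach is the only primitive. The names PushPipeline, Push and produce are hypothetical, and produce stands in for a callback API that runs on the caller's thread.

```scala
object PushPipeline {
  // Hypothetical minimal push-based collection: `run` feeds every element to the callback it is given.
  final class Push[T](run: (T => Unit) => Unit) {
    def foreach(f: T => Unit): Unit = run(f)

    // map and filter only wrap the callback, so each element flows through the whole chain one by one.
    def map[U](g: T => U): Push[U] = new Push[U](k => run(t => k(g(t))))
    def filter(p: T => Boolean): Push[T] = new Push[T](k => run(t => if (p(t)) k(t)))
  }

  // Stand-in for ProviderAPI.run (assumption: same thread, ten entries).
  def produce(receive: Int => Unit): Unit = (0 to 9).foreach(receive)
}
```

Usage mirrors the example above: `new PushPipeline.Push[Int](PushPipeline.produce).filter(_ % 2 == 0).map("Entry#" + _).foreach(println)`.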
I have a Traversable, and I want to make it into a Java Iterator. My problem is that I want everything to be lazily done. If I do .toIterator on the traversable, it eagerly produces the result, copies it into a List, and returns an iterator over the List.
I'm sure I'm missing something simple here...
Here is a small test case that shows what I mean:
class Test extends Traversable[String] {
def foreach[U](f : (String) => U) {
f("1")
f("2")
f("3")
throw new RuntimeException("Not lazy!")
}
}
val a = new Test
val iter = a.toIterator
The reason you can't lazily get an iterator from a traversable is that you intrinsically can't. Traversable defines foreach, and foreach runs through everything without stopping. No laziness there.
So you have two options, both terrible, for making it lazy.
First, you can iterate through the whole thing each time. (I'm going to use the Scala Iterator, but the Java Iterator is basically the same.)
class Terrible[A](t: Traversable[A]) extends Iterator[A] {
private var i = 0
def hasNext = i < t.size // This could be O(n)!
def next: A = {
val a = t.slice(i,i+1).head // Also could be O(n)!
i += 1
a
}
}
If you happen to have efficient indexed slicing, this will be okay. If not, each "next" will take time linear in the length of the iterator, for O(n^2) time just to traverse it. But this is also not necessarily lazy; if you insist that it must be you have to enforce O(n^2) in all cases and do
class Terrible[A](t: Traversable[A]) extends Iterator[A] {
private var i = 0
def hasNext: Boolean = {
var j = 0
t.foreach { a =>
j += 1
if (j>i) return true
}
false
}
def next: A = {
var j = 0
t.foreach{ a =>
j += 1
if (j>i) { i += 1; return a }
}
throw new NoSuchElementException("Terribly empty")
}
}
This is clearly a terrible idea for general code.
The other way to go is to use a thread and block the traversal of foreach as it's going. That's right, you have to do inter-thread communication on every single element access! Let's see how that works--I'm going to use Java threads here since Scala is in the middle of a switch to Akka-style actors (though any of the old actors or the Akka actors or the Scalaz actors or the Lift actors or (etc.) will work)
class Horrible[A](t: Traversable[A]) extends Iterator[A] {
private val item = new java.util.concurrent.SynchronousQueue[Option[A]]()
private class Loader extends Thread {
override def run() { t.foreach{ a => item.put(Some(a)) }; item.put(None) }
}
private val loader = new Loader
loader.start
private var got: Option[A] = null
def hasNext: Boolean = {
if (got==null) got = item.take() // blocks until the loader thread hands over the next element
got.isDefined
}
def next = {
if (got==null) got = item.take() // blocking take; poll could return null here
val ans = got.get
got = null
ans
}
}
This avoids the O(n^2) disaster, but ties up a thread and has desperately slow element-by-element access. I get about two million accesses per second on my machine, as compared to >100M for a typical traversable. This is clearly a horrible idea for general code.
So there you have it. Traversable is not lazy in general, and there is no good way to make it lazy without compromising performance tremendously.
I've run into this problem before and as far as I can tell, no one's particularly interested in making it easier to get an Iterator when all you've defined is foreach.
But as you've noted, toStream is the problem, so you could just override that:
class Test extends Traversable[String] {
def foreach[U](f: (String) => U) {
f("1")
f("2")
f("3")
throw new RuntimeException("Not lazy!")
}
override def toStream: Stream[String] = {
"1" #::
"2" #::
"3" #::
Stream[String](throw new RuntimeException("Not lazy!"))
}
}
Another alternative would be to define an Iterable instead of a Traversable, and then you'd get the iterator method directly. Could you explain a bit more what your Traversable is doing in your real use case?
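A minimal sketch of the Iterable route, assuming Scala 2.13 or later for scala.jdk.CollectionConverters. The class LazyTest and the object LazyTestDemo are hypothetical; the iterator throws only if a fourth element is requested, so taking three elements succeeds.

```scala
import scala.jdk.CollectionConverters._

class LazyTest extends Iterable[String] {
  // Elements are produced one at a time, on demand.
  def iterator: Iterator[String] = new Iterator[String] {
    private var i = 0
    def hasNext: Boolean = true
    def next(): String = {
      i += 1
      if (i <= 3) i.toString else throw new RuntimeException("Not lazy!")
    }
  }

  // Iterable.toString would traverse the elements, so keep it from touching them.
  override def toString: String = "LazyTest(<lazy>)"
}

object LazyTestDemo {
  // Converts to a Java iterator and pulls exactly three elements.
  def firstThree(): List[String] = {
    val javaIter: java.util.Iterator[String] = new LazyTest().iterator.asJava
    List(javaIter.next(), javaIter.next(), javaIter.next())
  }
}
```

Because nothing is copied up front, the exception is never reached unless the consumer asks for a fourth element.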
How might one implement C# yield return using Scala continuations? I'd like to be able to write Scala Iterators in the same style. A stab is in the comments on this Scala news post, but it doesn't work (tried using the Scala 2.8.0 beta). Answers in a related question suggest this is possible, but although I've been playing with delimited continuations for a while, I can't seem to exactly wrap my head around how to do this.
Before we introduce continuations we need to build some infrastructure.
Below is a trampoline that operates on Iteration objects.
An iteration is a computation that can either Yield a new value or it can be Done.
sealed trait Iteration[+R]
case class Yield[+R](result: R, next: () => Iteration[R]) extends Iteration[R]
case object Done extends Iteration[Nothing]
def trampoline[R](body: => Iteration[R]): Iterator[R] = {
def loop(thunk: () => Iteration[R]): Stream[R] = {
thunk.apply match {
case Yield(result, next) => Stream.cons(result, loop(next))
case Done => Stream.empty
}
}
loop(() => body).iterator
}
The trampoline uses an internal loop that turns the sequence of Iteration objects into a Stream.
We then get an Iterator by calling iterator on the resulting stream object.
By using a Stream our evaluation is lazy; we don't evaluate our next iteration until it is needed.
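Since Stream is deprecated as of Scala 2.13, the same trampoline can be sketched with LazyList. This is an adaptation under that assumption, not the code the answer was written with; the wrapper object Trampoline is hypothetical.

```scala
sealed trait Iteration[+R]
final case class Yield[+R](result: R, next: () => Iteration[R]) extends Iteration[R]
case object Done extends Iteration[Nothing]

object Trampoline {
  def trampoline[R](body: => Iteration[R]): Iterator[R] = {
    // Each step runs one thunk; the tail of the LazyList is only built when it is asked for.
    def loop(thunk: () => Iteration[R]): LazyList[R] = thunk() match {
      case Yield(result, next) => result #:: loop(next)
      case Done                => LazyList.empty
    }
    loop(() => body).iterator
  }
}
```

As with the Stream version, the first Iteration is evaluated when trampoline is called and the following ones only as the iterator advances.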
The trampoline can be used to build an iterator directly.
val itr1 = trampoline {
Yield(1, () => Yield(2, () => Yield(3, () => Done)))
}
for (i <- itr1) { println(i) }
That's pretty horrible to write, so let's use delimited continuations to create our Iteration objects automatically.
We use the shift and reset operators to break the computation up into Iterations,
then use trampoline to turn the Iterations into an Iterator.
import scala.continuations._
import scala.continuations.ControlContext.{shift,reset}
def iterator[R](body: => Unit @cps[Iteration[R],Iteration[R]]): Iterator[R] =
trampoline {
reset[Iteration[R],Iteration[R]] { body ; Done }
}
def yld[R](result: R): Unit @cps[Iteration[R],Iteration[R]] =
shift((k: Unit => Iteration[R]) => Yield(result, () => k(())))
Now we can rewrite our example.
val itr2 = iterator[Int] {
yld(1)
yld(2)
yld(3)
}
for (i <- itr2) { println(i) }
Much better!
Now here's an example from the C# reference page for yield that shows some more advanced usage.
The types can be a bit tricky to get used to, but it all works.
def power(number: Int, exponent: Int): Iterator[Int] = iterator[Int] {
def loop(result: Int, counter: Int): Unit @cps[Iteration[Int],Iteration[Int]] = {
if (counter < exponent) {
yld(result)
loop(result * number, counter + 1)
}
}
loop(number, 0)
}
for (i <- power(2, 8)) { println(i) }
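Where the continuations plugin is not available for your compiler, this particular generator can be approximated with the standard library alone. The sketch below is a swapped-in technique (Iterator.iterate), not a continuation-based yield; the object name PowerNoCps is hypothetical.

```scala
object PowerNoCps {
  // Lazily yields number^1, number^2, ..., number^exponent, like the loop above.
  def power(number: Int, exponent: Int): Iterator[Int] =
    Iterator.iterate(number)(_ * number).take(exponent)
}
```

This covers generators whose next value depends only on the previous one; more irregular control flow is where a yield-style construct pays off.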
I managed to discover a way to do this, after a few more hours of playing around. I thought this was simpler to wrap my head around than all the other solutions I've seen thus far, though I did afterward very much appreciate Rich's and Miles' solutions.
def loopWhile(cond: =>Boolean)(body: =>(Unit @suspendable)): Unit @suspendable = {
if (cond) {
body
loopWhile(cond)(body)
}
}
class Gen {
var prodCont: Unit => Unit = { x: Unit => prod }
var nextVal = 0
def yld(i: Int) = shift { k: (Unit => Unit) => nextVal = i; prodCont = k }
def next = { prodCont(); nextVal }
def prod = {
reset {
// following is generator logic; can be refactored out generically
var i = 0
i += 1
yld(i)
i += 1
yld(i)
// scala continuations plugin can't handle while loops, so need own construct
loopWhile (true) {
i += 1
yld(i)
}
}
}
}
val it = new Gen
println(it.next)
println(it.next)
println(it.next)