Maintaining a growing list functionally - scala

I'm writing an application (while learning Scala and functional programming) that has a growing list that needs to be accessed. I have a way of implementing it, but it doesn't seem like a good approach.
I have a class that, in another thread, polls an external source and appends each response to a list. Another function occasionally takes the latest response from that list and appends it to a second list.
The only way I've been able to come up with is to have the polling function mutate the list. I declare it when the class is created, either as var stream = Stream() or as a val holding a collection.mutable list, and then have the polling function update it with stream = stream :+ response.
This doesn't seem like a good way to do it.
class example extends Runnable {
  var responses: Stream[response] = Stream()
  var validResponses: Stream[response] = Stream()

  def run() {
    while (true) {
      val r = pollFunction
      responses = responses :+ r
      Thread.sleep(time)
    }
  }

  def get: response = {
    lazy val r = responses.last
    if (isValid(r)) validResponses = validResponses :+ r
    validResponses.last
  }
}
I just happened to use Stream, but that could be replaced with a collection.mutable type or something else. I'm not sure whether there is a more functional way of doing something like this?
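For illustration only (this is not from the question), a minimal sketch of one thread-safe variant of the same idea: the only mutable thing is a single @volatile var that is rebound to a new immutable list, written by the polling thread alone and safely visible to reader threads. Poller is a hypothetical name; pollFunction, time and the response type are taken from the question's code.

class Poller extends Runnable {
  // Only the polling thread writes; @volatile makes the latest list visible to readers.
  @volatile private var responses: List[response] = Nil

  def run(): Unit = {
    while (true) {
      val r = pollFunction
      responses = r :: responses // rebind the var to a new immutable list; nothing is mutated in place
      Thread.sleep(time)
    }
  }

  // Most recent response, if any has arrived yet.
  def latest: Option[response] = responses.headOption
}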

Related

How to handle asynchronous API response in scala

I have an API that I need to query from Scala. The API returns a code that is equal to 1 when the results are ready.
I thought about a loop that runs until the code is 1, along these lines:
var code = -1
while (code != 1) {
  var response = parse(Http(URL).asString.body)
  code = response.get("code").get.asInstanceOf[BigInt].toInt
}
println(response)
But this code returns:
error: not found: value response
So I thought about doing the following:
var code = -1
var res = null.asInstanceOf[Map[String, Any]]
while (code != 1) {
  var response = parse(Http(URL).asString.body)
  code = response.get("code").get.asInstanceOf[BigInt].toInt
  res = response
}
println(res)
And it works, but I would like to know whether this is really the most Scala-friendly way to do it.
How can I properly use, outside the loop, a variable that is assigned inside it?
When you say API, do you mean an HTTP API that you call through an HTTP library in Scala, or some class/API written in Scala? If you have to keep checking then you have to keep checking, I suppose.
If you're using a Scala framework like Akka or Play, they have solutions for asynchronously polling or scheduling jobs in the background as part of their offering, which you can read about.
If you're writing a Scala script, then from a design perspective I would run the script every minute and, instead of the while loop, simply exit whenever the code is not yet 1. Otherwise I'd essentially do what you've done.
Another library that could help in a Scala script is fs2 or ZIO, both of which let you set up tasks that poll periodically; see the sketch below.
This is a very open-ended question about designing apps that poll, so a specific answer is hard to give.
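As a rough illustration of the fs2 suggestion (a sketch only, assuming cats-effect IO and a hypothetical poll action that returns the parsed code):

import cats.effect.IO
import fs2.Stream
import scala.concurrent.duration._

// Hypothetical action: call the API once and return the parsed "code" field.
def poll: IO[Int] = IO { /* Http(URL) ... parse ... */ 1 }

// Poll every 30 seconds and stop after the first response whose code is 1.
val untilReady: IO[Option[Int]] =
  Stream.awakeEvery[IO](30.seconds)
    .evalMap(_ => poll)
    .takeThrough(_ != 1)
    .compile
    .last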
You can just use simple recursion:
def runUntil[A](block: => A)(cond: A => Boolean): A = {
  @annotation.tailrec
  def loop(current: A): A =
    if (cond(current)) current
    else loop(current = block)

  loop(current = block)
}
// Which can be used like:
val response = runUntil {
  parse(Http(URL).asString.body)
} { res =>
  res.getCode == 1
}
println(response)
And, if your real code uses some kind of effect type like IO or Future:
// Actually cats already provides this; it is called iterateUntil.
def runUntil[A](program: IO[A])(cond: A => Boolean): IO[A] = {
  def loop: IO[A] =
    program.flatMap { a =>
      if (cond(a)) IO.pure(a)
      else loop
    }

  loop
}
// Used like:
val request = IO {
  parse(Http(URL).asString.body)
}

val response = runUntil(request) { res =>
  res.getCode == 1
}

response.flatMap(IO.println)
Note: for Future you would need to take a () => Future[A] instead, so that the operation can be re-executed on every retry.
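For completeness, a minimal sketch of that Future variant (an assumption of this answer, not code from it; it needs an implicit ExecutionContext in scope):

import scala.concurrent.{ExecutionContext, Future}

// The program is passed as () => Future[A] so a fresh Future is created on each retry.
def runUntil[A](program: () => Future[A])(cond: A => Boolean)(implicit ec: ExecutionContext): Future[A] =
  program().flatMap { a =>
    if (cond(a)) Future.successful(a)
    else runUntil(program)(cond)
  }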

Akka Stream return object from Sink

I've got a SourceQueue. When I offer an element to it, I want it to pass through the stream, and when it reaches the Sink I want the output returned to the code that offered the element (similar to how Sink.head returns an element to the RunnableGraph.run() call).
How do I achieve this? A simple example of my problem would be:
val source = Source.queue[String](100, OverflowStrategy.fail)
val flow = Flow[String].map(element => s"Modified $element")
val sink = Sink.ReturnTheStringSomehow
val graph = source.via(flow).to(sink).run()
val x = graph.offer("foo")
println(x) // Output should be "Modified foo"
val y = graph.offer("bar")
println(y) // Output should be "Modified bar"
val z = graph.offer("baz")
println(z) // Output should be "Modified baz"
Edit: For the example I have given in this question Vladimir Matveev provided the best answer. However, it should be noted that this solution only works if the elements are going into the sink in the same order they were offered to the source. If this cannot be guaranteed the order of the elements in the sink may differ and the outcome might be different from what is expected.
I believe it is simpler to use the already existing primitive for pulling values from a stream, called Sink.queue. Here is an example:
val source = Source.queue[String](128, OverflowStrategy.fail)
val flow = Flow[String].map(element => s"Modified $element")
val sink = Sink.queue[String]().withAttributes(Attributes.inputBuffer(1, 1))
val (sourceQueue, sinkQueue) = source.via(flow).toMat(sink)(Keep.both).run()
def getNext: String = Await.result(sinkQueue.pull(), 1.second).get
sourceQueue.offer("foo")
println(getNext)
sourceQueue.offer("bar")
println(getNext)
sourceQueue.offer("baz")
println(getNext)
It does exactly what you want.
Note that setting the inputBuffer attribute for the queue sink may or may not be important for your use case - if you don't set it, the buffer will be zero-sized and the data won't flow through the stream until you invoke the pull() method on the sink.
sinkQueue.pull() yields a Future[Option[T]], which will be completed successfully with Some if the sink receives an element or with a failure if the stream fails. If the stream completes normally, it will be completed with None. In this particular example I'm ignoring this by using Option.get but you would probably want to add custom logic to handle this case.
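For example, a minimal sketch of handling that Option explicitly instead of calling .get, reusing sinkQueue and the Await/duration setup from the snippet above (getNextOrFail is a hypothetical helper name):

def getNextOrFail(): String =
  Await.result(sinkQueue.pull(), 1.second) match {
    case Some(element) => element // the sink received an element
    case None          => sys.error("stream completed before another element arrived")
  }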
Well, you know what the offer() method returns if you take a look at its definition :) What you can do is create a Source.queue[(Promise[String], String)], write a helper function that pushes a (promise, value) pair to the stream via offer (making sure the offer doesn't fail because the queue is full), complete the promise inside the stream, and use the promise's future to observe the completion in external code.
I do that to throttle the rate of calls to an external API that is used from multiple places in my project.
Here is how it looked in my project before Typesafe added Hub sources to Akka:
import scala.concurrent.Promise
import scala.concurrent.Future
import java.util.concurrent.ConcurrentLinkedDeque
import akka.stream.scaladsl.{Keep, Sink, Source}
import akka.stream.{OverflowStrategy, QueueOfferResult}
import scala.util.Success

private val queue = Source.queue[(Promise[String], String)](100, OverflowStrategy.backpressure)
  .toMat(Sink.foreach({ case (p, param) =>
    p.complete(Success(param.reverse))
  }))(Keep.left)
  .run

private val futureDeque = new ConcurrentLinkedDeque[Future[String]]()

private def sendQueuedRequest(request: String): Future[String] = {
  val p = Promise[String]
  val offerFuture = queue.offer(p -> request)

  def addToQueue(future: Future[String]): Future[String] = {
    futureDeque.addLast(future)
    future.onComplete(_ => futureDeque.remove(future))
    future
  }

  offerFuture.flatMap {
    case QueueOfferResult.Enqueued =>
      addToQueue(p.future)
  }.recoverWith {
    case ex =>
      val first = futureDeque.pollFirst()
      if (first != null)
        addToQueue(first.flatMap(_ => sendQueuedRequest(request)))
      else
        sendQueuedRequest(request)
  }
}
I realize that the deque may be a bottleneck and may grow indefinitely, but because the API calls in my project are made only from other Akka streams, which are backpressured, I never have more than a dozen items in futureDeque. Your situation may differ.
If you create MergeHub.source[(Promise[String], String)]() instead, you'll get a reusable sink. Then, every time you need to process an item, you create a complete graph and run it. In that case you won't need a hacky Java container to queue requests.
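A rough sketch of that MergeHub variant (assumptions of this answer, not code from it: an implicit Materializer is in scope and sendRequest is a hypothetical helper):

import akka.NotUsed
import akka.stream.scaladsl.{MergeHub, Sink, Source}
import scala.concurrent.{Future, Promise}
import scala.util.Success

// Materialized once; the resulting Sink can then be reused by any number of producers.
val reusableSink: Sink[(Promise[String], String), NotUsed] =
  MergeHub.source[(Promise[String], String)](perProducerBufferSize = 16)
    .to(Sink.foreach { case (p, param) => p.complete(Success(param.reverse)) })
    .run()

def sendRequest(request: String): Future[String] = {
  val p = Promise[String]
  Source.single(p -> request).runWith(reusableSink) // attach a tiny source to the shared sink
  p.future
}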

Trying to understand Scala enumerator/iteratees

I am new to Scala and Play!, but have a reasonable amount of experience building webapps with Django and Python, and of programming in general.
I've been doing an exercise of my own to try to improve my understanding: simply pull some records from a database and output them as a JSON array. I'm trying to use the Enumerator/Iteratee functionality to do this.
My code follows:
TestObjectController.scala:
def index = Action {
  db.withConnection { conn =>
    val stmt = conn.createStatement()
    val result = stmt.executeQuery("select * from datatable")
    logger.debug(result.toString)

    val resultEnum: Enumerator[TestDataObject] = Enumerator.generateM {
      logger.debug("called enumerator")
      result.next() match {
        case true =>
          val obj = TestDataObject(result.getString("name"), result.getString("object_type"),
            result.getString("quantity").toInt, result.getString("cost").toFloat)
          logger.info(obj.toJsonString)
          Future(Some(obj))
        case false =>
          logger.warn("reached end of iteration")
          stmt.close()
          null
      }
    }

    val consume: Iteratee[TestDataObject, Seq[TestDataObject]] = {
      Iteratee.fold[TestDataObject, Seq[TestDataObject]](Seq.empty[TestDataObject]) { (result, chunk) => result :+ chunk }
    }

    val newIteree = Iteratee.flatten(resultEnum(consume))
    val eventuallyResult: Future[Seq[TestDataObject]] = newIteree.run
    eventuallyResult.onSuccess { case x => println(x) }
    Ok("")
  }
}
TestDataObject.scala:
package models

import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule

case class TestDataObject(name: String, objtype: String, quantity: Int, cost: Float) {
  def toJsonString: String = {
    val mapper = new ObjectMapper()
    mapper.registerModule(DefaultScalaModule)
    mapper.writeValueAsString(this)
  }
}
I have two main questions:
How do I signal that the input is complete from the Enumerator callback? The documentation says "this method takes a callback function e: => Future[Option[E]] that will be called each time the iteratee this Enumerator is applied to is ready to take some input", but I am unable to pass any kind of EOF that I've found because it's the wrong type. Wrapping it in a Future does not help, and instinctively I am not sure that's the right approach anyway.
How do I get the final result out of the Future to return from the controller view? My understanding is that I would effectively need to pause the main thread to wait for the subthreads to complete, but the only thing I've found in the Future class is the onSuccess callback; how can I then return that from the view? Does Iteratee.run block until all input has been consumed?
A couple of sub-questions as well, to help my understanding:
Why do I need to wrap my object in Some() when it's already in a Future? What exactly does Some() represent?
When I run the code for the first time, I get a single record logged from logger.info and then it reports "reached end of iteration". Subsequent runs in the same session call nothing. I am closing the statement though, so why do I get no results the second time around? I was expecting it to loop indefinitely as I don't know how to signal the correct termination for the loop.
Thanks a lot in advance for any answers, I thought I was getting the hang of this but obviously not yet!
How do i signal that the input is complete from the Enumerator callback?
You return a Future(None).
How do I get the final result out of the Future to return from the controller view?
You can use Action.async (doc):
def index = Action.async {
  db.withConnection { conn =>
    ...
    val eventuallyResult: Future[Seq[TestDataObject]] = newIteree.run
    eventuallyResult map { data =>
      Ok(...)
    }
  }
}
Why do I need to wrap my object in Some() when it's already in a Future? What exactly does Some() represent?
The Future represents the (potentially asynchronous) processing to obtain the next element. The Option represents the availability of the next element: Some(x) if another element is available, None if the enumeration is completed.
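Putting the two answers together, a sketch (not the answer's own code) of how the question's enumerator could signal completion, keeping the same result and stmt values as in the question:

val resultEnum: Enumerator[TestDataObject] = Enumerator.generateM {
  if (result.next()) {
    val obj = TestDataObject(result.getString("name"), result.getString("object_type"),
      result.getString("quantity").toInt, result.getString("cost").toFloat)
    Future(Some(obj)) // Some: another element is available
  } else {
    stmt.close()
    Future(None) // None: the enumeration is complete
  }
}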

How can I use and return Source queue to caller without materializing it?

I'm trying to use the new Akka Streams and wonder how I can use and return a Source queue to the caller without materializing it in my code.
Imagine we have a library that makes a number of async calls and returns the results via a Source. The function looks like this:
def findArticlesByTitle(text: String): Source[String, SourceQueue[String]] = {
  val source = Source.queue[String](100, backpressure)
  source.mapMaterializedValue { case queue =>
    val url = s"http://.....&term=$text"
    httpclient.get(url).map(httpResponseToSprayJson[SearchResponse]).map { v =>
      v.idlist.foreach { id =>
        queue.offer(id)
      }
      queue.complete()
    }
  }
  source
}
and the caller might use it like this:
// There is implicit ActorMaterializer somewhere
val stream = plugin.findArticlesByTitle(title)
val results = stream.runFold(List[String]())((result, article) => article :: result)
When I run this, the code inside mapMaterializedValue is never executed.
I can't understand why I don't have access to the SourceQueue instance, if it is supposed to be up to the caller to decide how to materialize the source.
How should I implement this ?
In your code example you're returning source instead of the return value of source.mapMaterializedValue (the method call doesn't mutate the Source object).
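A sketch of the corrected function based on that observation (not code from the answer; note that the callback also has to yield queue at the end so the materialized value stays a SourceQueue[String], and httpclient, httpResponseToSprayJson, SearchResponse and backpressure are the question's own names):

def findArticlesByTitle(text: String): Source[String, SourceQueue[String]] =
  Source.queue[String](100, backpressure).mapMaterializedValue { queue =>
    val url = s"http://.....&term=$text"
    httpclient.get(url).map(httpResponseToSprayJson[SearchResponse]).map { v =>
      v.idlist.foreach(id => queue.offer(id))
      queue.complete()
    }
    queue // keep the SourceQueue as the materialized value
  }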

Closing a resource stored in Option[ ]

I have a resource object stored in an option.
private var ochan: Option[Channel] = None
At some point during program execution, ochan is set to Some(channel). I'd like to close the channel (by invoking its close method) and set the option to None in one fell swoop.
Currently I have:
def disconnect = ochan = { ochan.foreach{_.close}; None }
And previously I had:
def disconnect = ochan = ochan.flatMap{ o => o.close; None }
Is there a better way to do this?
I'd write it like this:
def disconnect = ochan = ochan match {
  case Some(ch) => ch.close(); None
  case None => None // do nothing
}
instead of using foreach or flatMap. In my opinion, this solution shows more clearly and explicitly what happens. The solution with foreach or flatMap requires an extra mental jump: you have to know what these methods do on an Option.
I don't know that it's better but it's shorter (once you've defined the implicit):
implicit def closer(o: Option[Channel]) = new {
  def close(): Option[Channel] = { o.foreach(_.close); None }
}
def disconnect = ochan = ochan.close
There is no big difference between a var holding an immutable value and a val holding a mutable one. So why not encapsulate the behavior in a separate class, since you want mutability anyway?
class ChannelOption {
  private var _channel: Option[Channel] = None

  def channel = _channel
  def channel_=(ch: Option[Channel]) { _channel.foreach(_.close); _channel = ch }
}
Usage:
private val ochan = new ChannelOption
ochan.channel = Some(getAChannel)
ochan.channel.foreach(useChannel)
ochan.channel = Some(getAnotherChannel) //this automatically closes the first channel
ochan.channel = None //this automatically closes the second channel
It's not thread safe! Remember to use @volatile (not needed here, since we use synchronization) and do something like this (this is why I don't like imperative code):
private val lock = new Object

def disconnect() { // function has side effects: remember the parentheses!
  lock.synchronized { // synchronization matters: you don't want to close the channel twice
    ochan.foreach(_.close()) // again: side effects -> parens
    ochan = None // clearing the option is what prevents a second close
  }
}
And if you don't use parallel programming, you are doing something wrong.
You could define ochan_= so that assigning a new value to ochan closes the old channel (similar to std::auto_ptr<> in C++), but I don't see how you can encapsulate that in a subclass of Option[Channel], because the storage is in your class. The solution wouldn't change the code much at all; it would just make disconnect an ordinary assignment to ochan.
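For illustration, a minimal sketch of that setter idea (an assumption of this answer: ochan lives in the enclosing class, and the _ochan backing field name is hypothetical):

private var _ochan: Option[Channel] = None

def ochan: Option[Channel] = _ochan
def ochan_=(next: Option[Channel]): Unit = {
  _ochan.foreach(_.close()) // every assignment closes the previously held channel
  _ochan = next
}

// disconnect is then just an assignment:
def disconnect(): Unit = { ochan = None }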
I guess this could work:
def disconnect {
  ochan = {
    ochan.get.close
    None
  }
}
or
def disconnect {
  ochan.get.close
  ochan = None
}
Anyway, since there is a mutating operation, it will always take two steps (one for close and one for the assignment of None).