Trying to understand Scala enumerator/iteratees

I am new to Scala and Play!, but have a reasonable amount of experience of building webapps with Django and Python and of programming in general.
I've been doing an exercise of my own to try to improve my understanding: simply pull some records from a database and output them as a JSON array. I'm trying to use the Enumerator/Iteratee functionality to do this.
My code follows:
TestObjectController.scala:
def index = Action {
db.withConnection { conn=>
val stmt = conn.createStatement()
val result = stmt.executeQuery("select * from datatable")
logger.debug(result.toString)
val resultEnum:Enumerator[TestDataObject] = Enumerator.generateM {
logger.debug("called enumerator")
result.next() match {
case true =>
val obj = TestDataObject(result.getString("name"), result.getString("object_type"),
result.getString("quantity").toInt, result.getString("cost").toFloat)
logger.info(obj.toJsonString)
Future(Some(obj))
case false =>
logger.warn("reached end of iteration")
stmt.close()
null
}
}
val consume:Iteratee[TestDataObject,Seq[TestDataObject]] = {
Iteratee.fold[TestDataObject,Seq[TestDataObject]](Seq.empty[TestDataObject]) { (result,chunk) => result :+ chunk }
}
val newIteree = Iteratee.flatten(resultEnum(consume))
val eventuallyResult:Future[Seq[TestDataObject]] = newIteree.run
eventuallyResult.onSuccess { case x=> println(x)}
Ok("")
}
}
TestDataObject.scala:
package models
case class TestDataObject (name: String, objtype: String, quantity: Int, cost: Float){
def toJsonString: String = {
val mapper = new ObjectMapper()
mapper.registerModule(DefaultScalaModule)
mapper.writeValueAsString(this)
}
}
I have two main questions:
How do I signal that the input is complete from the Enumerator callback? The documentation says "this method takes a callback function e: => Future[Option[E]] that will be called each time the iteratee this Enumerator is applied to is ready to take some input." but I am unable to pass any kind of EOF that I've found because it's the wrong type. Wrapping it in a Future does not help, and instinctively I am not sure that's the right approach anyway.
How do I get the final result out of the Future to return from the controller view? My understanding is that I would effectively need to pause the main thread to wait for the subthreads to complete, but the only examples I've seen, and the only thing I've found in the Future class, is the onSuccess callback. How can I then return that from the view? Does Iteratee.run block until all input has been consumed?
A couple of sub-questions as well, to help my understanding:
Why do I need to wrap my object in Some() when it's already in a Future? What exactly does Some() represent?
When I run the code for the first time, I get a single record logged from logger.info and then it reports "reached end of iteration". Subsequent runs in the same session call nothing. I am closing the statement though, so why do I get no results the second time around? I was expecting it to loop indefinitely as I don't know how to signal the correct termination for the loop.
Thanks a lot in advance for any answers, I thought I was getting the hang of this but obviously not yet!

How do i signal that the input is complete from the Enumerator callback?
You return a Future(None).
How do I get the final result out of the Future to return from the controller view?
You can use Action.async:
def index = Action.async {
db.withConnection { conn=>
...
val eventuallyResult:Future[Seq[TestDataObject]] = newIteree.run
eventuallyResult map { data =>
Ok(...)
}
}
}
Why do I need to wrap my object in Some() when it's already in a Future? What exactly does Some() represent?
The Future represents the (potentially asynchronous) processing to obtain the next element. The Option represents the availability of the next element: Some(x) if another element is available, None if the enumeration is completed.
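To make the Future[Option[E]] contract concrete without depending on Play, here is a minimal sketch in plain Scala. The `drain` helper and the `DrainSketch` object are hypothetical names invented for illustration; this is not Play's implementation, it only mimics how a callback of this shape is interpreted: keep pulling while the result is Some(x), stop at None.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object DrainSketch {
  // Hypothetical helper (not part of Play): calls `next` repeatedly
  // and collects elements until the callback yields None.
  def drain[A](next: () => Future[Option[A]]): Future[Vector[A]] = {
    def loop(acc: Vector[A]): Future[Vector[A]] =
      next().flatMap {
        case Some(a) => loop(acc :+ a)          // another element: keep pulling
        case None    => Future.successful(acc)  // end of input: stop
      }
    loop(Vector.empty)
  }

  def main(args: Array[String]): Unit = {
    // The iterator stands in for the JDBC ResultSet from the question.
    val rows = Iterator("a", "b", "c")
    val all = drain(() => Future(if (rows.hasNext) Some(rows.next()) else None))
    println(Await.result(all, 5.seconds)) // prints Vector(a, b, c)
  }
}
```

Note that the end-of-input branch returns a Future wrapping None; returning null, as in the question, gives the consumer nothing it can pattern match on.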

Related

How to handle asynchronous API response in scala

I have an API that I need to query in scala. API returns a code that would be equal to 1 when results are ready.
I thought about an until loop to handle as the following:
var code= -1
while(code!=1){
var response = parse(Http(URL).asString.body)
code = response.get("code").get.asInstanceOf[BigInt].toInt
}
println(response)
But this code returns:
error: not found: value response
So I thought about doing the following:
var code = -1
var res = null.asInstanceOf[Map[String, Any]]
while(code!=1){
var response = parse(Http(URL).asString.body)
code = response.get("code").get.asInstanceOf[BigInt].toInt
res = response
}
println(res)
And it works. But I would like to know whether this is really the most idiomatic Scala way to do it.
How can I properly use a variable outside of the loop in which it was assigned?

When you say API, do you mean you use a http api and you're using a http library in scala, or do you mean there's some class/api written up in scala? If you have to keep checking then you have to keep checking I suppose.
If you're using a Scala framework like Akka or Play, it will have facilities for asynchronously polling or scheduling background jobs, which you can read about in its documentation.
If you're writing a Scala script, then from a design perspective I would either run the script every minute and, instead of looping, simply exit whenever the code is not yet 1, or otherwise do essentially what you've done.
Libraries such as fs2 or ZIO could also help a Scala script, since they let you set up tasks that poll periodically.
This appears to be a very open question about designing apps which do polling. A specific answer is hard to give.

You can just use simple recursion:
def runUntil[A](block: => A)(cond: A => Boolean): A = {
@annotation.tailrec
def loop(current: A): A =
if (cond(current)) current
else loop(current = block)
loop(current = block)
}
// Which can be used like:
val response = runUntil {
parse(Http(URL).asString.body)
} { res =>
res.getCode == 1
}
println(response)
And, if your real code uses some kind of effect type like IO or Future:
// Actually cats already provides this, is called iterateUntil
def runUntil[A](program: IO[A])(cond: A => Boolean): IO[A] = {
def loop: IO[A] =
program.flatMap { a =>
if (cond(a)) IO.pure(a)
else loop
}
loop
}
// Used like:
val request = IO {
parse(Http(URL).asString.body)
}
val response = runUntil(request) { res =>
res.getCode == 1
}
response.flatMap(IO.println)
Note, for Future you would need to use () => Future instead to be able to re-execute the operation.
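As a rough sketch of that note, assuming the standard library's scala.concurrent.Future and the global execution context: a Future starts running as soon as it is created, so the loop takes a thunk `() => Future[A]` and invokes it again on every retry. The object name `RunUntilFuture` and the counter are made up for the example.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object RunUntilFuture {
  // `program` is a thunk, so each call starts a fresh Future.
  def runUntil[A](program: () => Future[A])(cond: A => Boolean): Future[A] =
    program().flatMap { a =>
      if (cond(a)) Future.successful(a)
      else runUntil(program)(cond) // re-run the operation and check again
    }

  def main(args: Array[String]): Unit = {
    // Stand-in for the HTTP call: the "code" becomes 3 on the third attempt.
    val attempts = new AtomicInteger(0)
    val response = runUntil(() => Future(attempts.incrementAndGet()))(_ == 3)
    println(Await.result(response, 5.seconds)) // prints 3
  }
}
```

In real polling code you would probably also want a delay between attempts and an upper bound on retries; both are left out here to keep the sketch short.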

Scala Aggregate result from multiple Future calls

Consider a Model for Master/Slave election for a cluster.
Member{ id: Long, isMaster: Boolean }
I have a Dao/Repo with following methods:
MemberDao.findById(id:Long):Future[Option[Member]]
MemberDao.update(id:Long, member: Member):Future[Unit]
MemberDao.all() : Future[List[Member]]
Within the MemberService, I'm trying to write a function to set isMaster to false for all existing members, and I'm ending up with this crazily bloated code:
class MemberService ... {
def demoteAllMembers() : Future[Boolean] = {
val futures = memberDao.all.map{ memberFuture =>
memberFuture.map{ member =>
memberDao.findById(member.id).map { existingMemberFuture =>
existingMemberFuture.map { existingMember =>
memberDao.update(existingMember.id, existingMember.copy(isMaster = false))
}
}
}
val results = Await.result(futures, 10 seconds)
// return something here
}
}
}
My Questions are:
1. How should the return statement be written to handle success / errors? e.g. On success, return Future(true) and on failure, return Future(false)
2. Is this way of repetitively mapping future the correct way of doing async programming in scala? I understand this could be written differently in Actor paradigm and probably much better, but in case of OOP, is this the best Scala can do?
Thanks.

Why are you doing MemberDao.findById when you are already holding a member in hand??? (You are also treating the return as a Member, while it should really be an Option[Member]).
Also, update does not need to take an id as a separate parameter (there is one available inside member).
You don't need to Await your result, because your function is returning a Future, and you don't need to return a Boolean: just throw an exception to signal failure.
Consider something like this:
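def demoteAllMembers: Future[Unit] = memberDao.all.flatMap { members =>
  // map (not foreach) so there is a List[Future[Unit]] to sequence;
  // assumes update takes just the member, as suggested above
  Future.sequence(members.map { member =>
    memberDao.update(member.copy(isMaster = false))
  })
}.map(_ => ())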
Not all that bloated, is it? :)
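For anyone who wants to run the idea, here is a self-contained sketch. The in-memory `memberDao` (backed by a TrieMap) and the `demotedOk` helper are assumptions made up for the example, not the asker's real DAO. It also shows one way to get the Future[Boolean] asked about in question 1, by deriving it from success or failure at the call site with recover.

```scala
import scala.collection.concurrent.TrieMap
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.control.NonFatal

object DemoteSketch {
  final case class Member(id: Long, isMaster: Boolean)

  // Hypothetical in-memory DAO standing in for the real repository.
  object memberDao {
    private val store = TrieMap(1L -> Member(1L, true), 2L -> Member(2L, false))
    def all(): Future[List[Member]] = Future(store.values.toList)
    def update(member: Member): Future[Unit] = Future { store.update(member.id, member) }
  }

  // Fails (with the underlying exception) if any single update fails.
  def demoteAllMembers(): Future[Unit] =
    memberDao.all().flatMap { members =>
      Future.sequence(members.map(m => memberDao.update(m.copy(isMaster = false))))
    }.map(_ => ())

  // If a Boolean is really wanted, derive it from success or failure.
  def demotedOk(): Future[Boolean] =
    demoteAllMembers().map(_ => true).recover { case NonFatal(_) => false }

  def main(args: Array[String]): Unit = {
    println(Await.result(demotedOk(), 5.seconds))       // prints true
    println(Await.result(memberDao.all(), 5.seconds))   // every member has isMaster = false
  }
}
```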

Function return type error with future.onComplete

I am using Scala.
I have a method that returns an object. In this method I am using a future with an onComplete callback:
def xyzFunction (id : Int) : Abc = {
var abcObj = new Abc
var RetunedLists = new MutableList[ArtistImpl]()
val future:Future[MutableList[Abc]] = ask(SomeActor,Message(id)).mapTo[MutableList[Abc]]
future.onComplete {
case Success(result) =>
RetunedLists = result
abcObj = RetunedLists.get(0)
println("name : " + abcObj.name)
case Failure(e) =>
println("printStackTrace"+e.printStackTrace)
}
abcObj
}
The problem is that when I run the code, it prints the name on the console, but the object that this function returns is empty.
Please help!

The problem is that the future hasn't completed by the time that xyzFunction completes. This means that abcObj hasn't been set (in the future.onComplete block), so it is still equal to its initial value (from the line var abcObj = new Abc).
To ensure xyzFunction returns a valid value for abcObj, you can wait on the future to complete (eg. via Await.result(future, timeoutValue)).
Better, though, would be to return a Future[Abc], chaining results as futures (using map, flatMap, and similar methods) all the way up the line and resolving as late as possible. For example, if working with the Play framework, use Action.async and let Play handle resolving the future internally for you.
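A minimal sketch of the second approach follows. The `fetchAll` function is a hypothetical stand-in for the `ask(...).mapTo[...]` call in the question, and `Abc` is reduced to a one-field case class; the point is only that the method returns the Future and transforms the eventual value with map, instead of mutating a var from a callback.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ReturnFutureSketch {
  final case class Abc(name: String)

  // Hypothetical stand-in for ask(SomeActor, Message(id)).mapTo[...].
  def fetchAll(id: Int): Future[List[Abc]] =
    Future(List(Abc(s"artist-$id")))

  // Return the Future itself; headOption avoids failing on an empty list.
  def xyzFunction(id: Int): Future[Option[Abc]] =
    fetchAll(id).map(_.headOption)

  def main(args: Array[String]): Unit = {
    // Block only at the very edge of the program, if at all.
    println(Await.result(xyzFunction(7), 5.seconds)) // prints Some(Abc(artist-7))
  }
}
```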

How to page REST calls in a future with Dispatch and Scala

I use Scala and Dispatch to get JSON from a paged REST API. The reason I use futures with Dispatch here is because I want to execute the calls to fetchIssuesByFile in parallel, because that function could result in many REST calls for every lookupId (1x findComponentKeyByRest, n x fetchIssuePage, where n is the number of pages yielded by the REST API).
Here's my code so far:
def fetchIssuePage(componentKey: String, pageIndex: Int): Future[json4s.JValue]
def extractPagingInfo(json: org.json4s.JValue): PagingInfo
def extractIssues(json: org.json4s.JValue): Seq[Issue]
def findAllIssuesByRest(componentKey: String): Future[Seq[Issue]] = {
Future {
var paging = PagingInfo(pageIndex = 0, pages = 0)
val allIssues = mutable.ListBuffer[Issue]()
do {
fetchIssuePage(componentKey, paging.pageIndex) onComplete {
case Success(json) =>
allIssues ++= extractIssues(json)
paging = extractPagingInfo(json)
case _ => //TODO error handling
}
} while (paging.pageIndex < paging.pages)
allIssues // (1)
}
}
def findComponentKeyByRest(lookupId: String): Future[Option[String]]
def fetchIssuesByFile(lookupId: String): Future[(String, Option[Seq[Issue]])] =
findComponentKeyByRest(lookupId).flatMap {
case Some(key) => findAllIssuesByRest(key).map(
issues => // (2) process issues
)
case None => //TODO error handling
}
The actual problem is that I never get the collected issues from findAllIssuesByRest (1) (i.e., the sequence of issues is always empty) when I try to process them at (2). Any ideas? Also, the pagination code isn't very functional, so I'm also open to ideas on how to improve this with Scala.
Any help is much appreciated.
Thanks,
Michael

I think you could do something like:
def findAllIssuesByRest(componentKey: String): Future[Seq[Issue]] =
// fetch first page
fetchIssuePage(componentKey, 0).flatMap { json =>
val nbPages = extractPagingInfo(json).pages // get the total number of pages
val firstIssues = extractIssues(json) // the first issues
// get the other pages
Future.sequence((1 to nbPages).map(page => fetchIssuePage(componentKey, page)))
// get the issues from the other pages
.map(pages => pages.map(extractIssues))
// combine first issues with other issues
.map(issues => (firstIssues +: issues).flatten)
}

That's because fetchIssuePage returns a Future that the code doesn't await the result for.
Solution would be to build up a Seq of Futures from the fetchIssuePage calls. Then Future.sequence the Seq to produce a single future. Return this instead. This future will fire when they're all complete, ready for your flatMap code.
Update: Although Michael understood the above well (see comments), I thought I'd put in a much simplified code for the benefit of other readers, just to illustrate the point:
def fetch(n: Int): Future[Int] = Future { n+1 }
val fetches = Seq(1, 2, 3).map(fetch)
// a simplified version of your while loop, just to illustrate the point
Future.sequence(fetches).flatMap(results => ...)
// where results will be Seq[Int] - i.e., 2, 3, 4

Simple Scala actor question

I'm sure this is a very simple question, but embarrassed to say I can't get my head around it:
I have a list of values in Scala.
I would like to use actors to make some (external) calls with each value, in parallel.
I would like to wait until all values have been processed, and then proceed.
There's no shared values being modified.
Could anyone advise?
Thanks

There's an actor-using class in Scala that's made precisely for this kind of problem: Futures. This problem would be solved like this:
// This assigns futures that will execute in parallel
// In the example, the computation is performed by the "process" function
val tasks = list map (value => scala.actors.Futures.future { process(value) })
// The result of a future may be extracted with the apply() method, which
// will block if the result is not ready.
// Since we do want to block until all results are ready, we can call apply()
// directly instead of using a method such as Futures.awaitAll()
val results = tasks map (future => future.apply())
There you go. Just that.
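The scala.actors package used above was deprecated in Scala 2.10 and is no longer part of the standard library in later versions. On a current Scala, roughly the same thing can be sketched with scala.concurrent.Future. Here `process` is a placeholder for the external call, and `processAll` and `ParallelCalls` are names invented for the example.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ParallelCalls {
  // Placeholder for the (external) call made for each value.
  def process(value: Int): Int = value * 2

  def processAll(values: List[Int]): List[Int] = {
    // Each Future starts running as soon as it is created.
    val tasks: List[Future[Int]] = values.map(v => Future(process(v)))
    // Combine them into one Future and wait until all results are ready.
    Await.result(Future.sequence(tasks), 10.seconds)
  }

  def main(args: Array[String]): Unit =
    println(processAll(List(1, 2, 3))) // prints List(2, 4, 6)
}
```

Future.sequence preserves the order of the input list, so the results line up with the original values even though the calls may finish in any order.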

Create workers and ask them for futures using !!; then read off the results (which will be calculated and come in in parallel as they're ready; you can then use them). For example:
object Example {
import scala.actors._
class Worker extends Actor {
def act() { Actor.loop { react {
case s: String => reply(s.length)
case _ => exit()
}}}
}
def main(args: Array[String]) {
val arguments = args.toList
val workers = arguments.map(_ => (new Worker).start)
val futures = for ((w,a) <- workers zip arguments) yield w !! a
val results = futures.map(f => f() match {
case i: Int => i
case _ => throw new Exception("Whoops--didn't expect to get that!")
})
println(results)
workers.foreach(_ ! None)
}
}
This does a very inexpensive computation (calculating the length of a string), but you can put something expensive there to make sure it really does happen in parallel (the last thing in that case of the act block should be to reply with the answer). Note that we also include a case for the worker to shut itself down, and when we're all done, we tell the workers to shut down. (In this case, any non-string message shuts down the worker.)
And we can try this out to make sure it works:
scala> Example.main(Array("This","is","a","test"))
List(4, 2, 1, 4)
