Get Future objects from Future Options in Scala - scala

I am new to Scala from Java so the functional programming thing is still a bit difficult for me to understand. I have a project in Play framework. I need to query the database to get rows with ids and display them in a html template.
Here is my code
def search(query: String) = Action.async{ request =>
val result = SearchEngine.searchResult(query)
val docs = result.map(DocumentService.getDocumentByID(_).map(doc => doc))
val futures = Future.sequence(docs)
futures.map{documents =>
Ok(views.html.results(documents.flatten))
}
}
getDocumentByID returns a Future[Options[Document]] object, but my results template takes Array[Document] so I have tried to no avail to transform the Future[Options[Document]] to Array[Document]
The current code I have is the closest I have been, but it still does not compile. This is the error:
Error:(36, -1) Play 2 Compiler:
found : Array[scala.concurrent.Future[Option[models.Document]]]
required: M[scala.concurrent.Future[A]]

Try to collect only the Somes from the Future returned by the getDocumentByID
val docs = result.map { res =>
val f: Future[Option[Document]] = DocumentService.getDocumentByID(res)
f.collect { case Some(doc) => doc }
}.toList
val futures = Future.seqence(docs) //notice that docs is converted to list from array in the previous line
General suggestion
Do not use Arrays. Arrays are mutable and they do not grow dynamically.
So it is advisable to avoid using Array in concurrent/parallel code.

Related

Iterate sequence of future in sequence of futures

I am working on a PlayFramework application written in Scala.
Problem is that in a rest controller I need a list of elements (books) and for each element list of its subelements (chapters).
Book repository:
def findAll(): Future[Seq[Book]]
Chapter repository:
def findByBookId(bookId: UUID): Future[Seq[Chapter]]
I wanted to do something like
val books = bookRepository.findAll
val result = for {
bookList <- books
book <- bookList
chapters <- chapterRepository.findByBookdId(book.id)
} yield (book, chapters)
I want to have a tuple of book and its chapters so I can latter map it to a json. What I am doing wrong, because I get error:
[error] required: scala.collection.GenTraversableOnce[?]
Or what would be a better approach how to iterate over future of collection and for each element load another future of collection?
See if this gets you close to what you're after.
// found books (future)
val fBooks :Future[Seq[Book]] = bookRepository.findAll()
// chapters in fBooks order (future)
val fChptrs :Future[Seq[Seq[Chapter]]] = fBooks.flatMap {books =>
Future.sequence(books.map(book =>findByBookId(book.id)))
}
// combine book with its chapters
val result :Future[Seq[(Book, Seq[Chapter])]] = for {
books <- fBooks
chptrs <- fChptrs
} yield books zip chptrs
You can only use one type of monad in a for comprehension, since it is just syntactic sugar for flatMap and map.
See here the whole description: https://stackoverflow.com/a/33402543/2750966
You should use the same type throughout the for-comprehension.
You can't have books:Future[Seq[Books] and bookList: Seq[Book] in same for-comprehension.
Also you can convert Seq[Future[Something]] to Future[Seq[Something]] using Future.sequence
So, something like this should work
val res: Future[(Seq[Book], Seq[Chapter])] =
for {
bookList <- books
chapters <- Future.sequence(bookList.map(book => findByBookId(book.id)))
} yield (bookList, chapters.flatten)

Any better, more idiomatic way to convert SQL ResultSet to a Scala List or other collection type?

I'm using the following naive code to convert a ResultSet to a Scala List:
val rs = pstmt.executeQuery()
var nids = List[String]()
while (rs.next()) {
nids = nids :+ rs.getString(1)
}
rs.close()
Is there a better approach, something more idiomatic to Scala, that doesn't require using a mutable object?
Why don't you try this:
new Iterator[String] {
def hasNext = resultSet.next()
def next() = resultSet.getString(1)
}.toStream
Taken from this answer here
I have a similar problem and my solution is:
Iterator.from(0).takeWhile(_ => rs.next()).map(_ => rs.getString(1)).toList
Hope that will help.
Using ORM tool like slick or Quill as mention in comment section is considered to be better approach.
If you want to use the Scala Code for processing the ResultSet.You can use the tailRecursion.
#scala.annotation.tailrec
def getResult(resultSet: ResultSet, list: List[String] = Nil): List[String] = {
if (resultSet.next()) {
val value = resultSet.getString(0)
getResult(resultSet, value :: list)
}
else {
list
}
}
This method returns the list of string that contains column value at 0th position. This process is pure immutable so you don't need to worry.On the plus side this method is tail recursion so Scala will internally optimize this method accordingly.
Thanks

How to get a result from Enumerator/Iteratee?

I am using play2 and reactivemongo to fetch a result from mongodb. Each item of the result needs to be transformed to add some metadata. Afterwards I need to apply some sorting to it.
To deal with the transformation step I use enumerate():
def ideasEnumerator = collection.find(query)
.options(QueryOpts(skipN = page))
.sort(Json.obj(sortField -> -1))
.cursor[Idea]
.enumerate()
Then I create an Iteratee as follows:
val processIdeas: Iteratee[Idea, Unit] =
Iteratee.foreach[Idea] { idea =>
resolveCrossLinks(idea) flatMap { idea =>
addMetaInfo(idea.copy(history = None))
}
}
Finally I feed the Iteratee:
ideasEnumerator(processIdeas)
And now I'm stuck. Every example I saw does some println inside foreach, but seems not to care about a final result.
So when all documents are returned and transformed how do I get a Sequence, a List or some other datatype I can further deal with?
Change the signature of your Iteratee from Iteratee[Idea, Unit] to Iteratee[Idea, Seq[A]] where A is the type. Basically the first param of Iteratee is Input type and second param is Output type. In your case you gave the Output type as Unit.
Take a look at the below code. It may not compile but it gives you the basic usage.
ideasEnumerator.run(
Iteratee.fold(List.empty[MyObject]) { (accumulator, next) =>
accumulator + resolveCrossLinks(next) flatMap { next =>
addMetaInfo(next.copy(history = None))
}
}
) // returns Future[List[MyObject]]
As you can see, Iteratee is a simply a state machine. Just extract that Iteratee part and assign it to a val:
val iteratee = Iteratee.fold(List.empty[MyObject]) { (accumulator, next) =>
accumulator + resolveCrossLinks(next) flatMap { next =>
addMetaInfo(next.copy(history = None))
}
}
and feel free to use it where ever you need to convert from your Idea to List[MyObject]
With the help of your answers I ended up with
val processIdeas: Iteratee[Idea, Future[Vector[Idea]]] =
Iteratee.fold(Future(Vector.empty[Idea])) { (accumulator: Future[Vector[Idea]], next:Idea) =>
resolveCrossLinks(next) flatMap { next =>
addMetaInfo(next.copy(history = None))
} flatMap (ideaWithMeta => accumulator map (acc => acc :+ ideaWithMeta))
}
val ideas = collection.find(query)
.options(QueryOpts(page, perPage))
.sort(Json.obj(sortField -> -1))
.cursor[Idea]
.enumerate(perPage).run(processIdeas)
This later needs a ideas.flatMap(identity) to remove the returning Future of Futures but I'm fine with it and everything looks idiomatic and elegant I think.
The performance gained compared to creating a list and iterate over it afterwards is negligible though.

How to create a play.api.libs.iteratee.Enumerator which inserts some data between the items of a given Enumerator?

I use Play framework with ReactiveMongo. Most of ReactiveMongo APIs are based on the Play Enumerator. As long as I fetch some data from MongoDB and return it "as-is" asynchronously, everything is fine. Also the transformation of the data, like converting BSON to String, using Enumerator.map is obvious.
But today I faced a problem which at the bottom line narrowed to the following code. I wasted half of the day trying to create an Enumerator which would consume items from the given Enumerator and insert some items between them. It is important not to load all the items at once, as there could be many of them (the code example has only two items "1" and "2"). But semantically it is similar to mkString of the collections. I am sure it can be done very easily, but the best I could come with - was this code. Very similar code creating an Enumerator using Concurrent.broadcast serves me well for WebSockets. But here even that does not work. The HTTP response never comes back. When I look at Enumeratee, it looks that it is supposed to provide such functionality, but I could not find the way to do the trick.
P.S. Tried to call chan.eofAndEnd in Iteratee.mapDone, and chunked(enums >>> Enumerator.eof instead of chunked(enums) - did not help. Sometimes the response comes back, but does not contain the correct data. What do I miss?
def trans(in:Enumerator[String]):Enumerator[String] = {
val (res, chan) = Concurrent.broadcast[String]
val iter = Iteratee.fold(true) { (isFirst, curr:String) =>
if (!isFirst)
chan.push("<-------->")
chan.push(curr)
false
}
in.apply(iter)
res
}
def enums:Enumerator[String] = {
val en12 = Enumerator[String]("1", "2")
trans(en12)
//en12 //if I comment the previous line and uncomment this, it prints "12" as expected
}
def enum = Action {
Ok.chunked(enums)
}
Here is my solution which I believe to be correct for this type of problem. Comments are welcome:
def fill[From](
prefix: From => Enumerator[From],
infix: (From, From) => Enumerator[From],
suffix: From => Enumerator[From]
)(implicit ec:ExecutionContext) = new Enumeratee[From, From] {
override def applyOn[A](inner: Iteratee[From, A]): Iteratee[From, Iteratee[From, A]] = {
//type of the state we will use for fold
case class State(prev:Option[From], it:Iteratee[From, A])
Iteratee.foldM(State(None, inner)) { (prevState, newItem:From) =>
val toInsert = prevState.prev match {
case None => prefix(newItem)
case Some(prevItem) => infix (prevItem, newItem)
}
for(newIt <- toInsert >>> Enumerator(newItem) |>> prevState.it)
yield State(Some(newItem), newIt)
} mapM {
case State(None, it) => //this is possible when our input was empty
Future.successful(it)
case State(Some(lastItem), it) =>
suffix(lastItem) |>> it
}
}
}
// if there are missing integers between from and to, fill that gap with 0
def fillGap(from:Int, to:Int)(implicit ec:ExecutionContext) = Enumerator enumerate List.fill(to-from-1)(0)
def fillFrom(x:Int)(input:Int)(implicit ec:ExecutionContext) = fillGap(x, input)
def fillTo(x:Int)(input:Int)(implicit ec:ExecutionContext) = fillGap(input, x)
val ints = Enumerator(10, 12, 15)
val toStr = Enumeratee.map[Int] (_.toString)
val infill = fill(
fillFrom(5),
fillGap,
fillTo(20)
)
val res = ints &> infill &> toStr // res will have 0,0,0,0,10,0,12,0,0,15,0,0,0,0
You wrote that you are working with WebSockets, so why don't you use dedicated solution for that? What you wrote is better for Server-Sent-Events rather than WS. As I understood you, you want to filter your results before sending them back to client? If its correct then you Enumeratee instead of Enumerator. Enumeratee is transformation from-to. This is very good piece of code how to use Enumeratee. May be is not directly about what you need but I found there inspiration for my project. Maybe when you analyze given code you would find best solution.

Scala Parallel Collections- How to return early?

I have a list of possible input Values
val inputValues = List(1,2,3,4,5)
I have a really long to compute function that gives me a result
def reallyLongFunction( input: Int ) : Option[String] = { ..... }
Using scala parallel collections, I can easily do
inputValues.par.map( reallyLongFunction( _ ) )
To get what all the results are, in parallel. The problem is, I don't really want all the results, I only want the FIRST result. As soon as one of my input is a success, I want my output, and want to move on with my life. This did a lot of extra work.
So how do I get the best of both worlds? I want to
Get the first result that returns something from my long function
Stop all my other threads from useless work.
Edit -
I solved it like a dumb java programmer by having
#volatile var done = false;
Which is set and checked inside my reallyLongFunction. This works, but does not feel very scala. Would like a better way to do this....
(Updated: no, it doesn't work, doesn't do the map)
Would it work to do something like:
inputValues.par.find({ v => reallyLongFunction(v); true })
The implementation uses this:
protected[this] class Find[U >: T](pred: T => Boolean, protected[this] val pit: IterableSplitter[T]) extends Accessor[Option[U], Find[U]] {
#volatile var result: Option[U] = None
def leaf(prev: Option[Option[U]]) = { if (!pit.isAborted) result = pit.find(pred); if (result != None) pit.abort }
protected[this] def newSubtask(p: IterableSplitter[T]) = new Find(pred, p)
override def merge(that: Find[U]) = if (this.result == None) result = that.result
}
which looks pretty similar in spirit to your #volatile except you don't have to look at it ;-)
I took interpreted your question in the same way as huynhjl, but if you just want to search and discardNones, you could do something like this to avoid the need to repeat the computation when a suitable outcome is found:
class Computation[A,B](value: A, function: A => B) {
lazy val result = function(value)
}
def f(x: Int) = { // your function here
Thread.sleep(100 - x)
if (x > 5) Some(x * 10)
else None
}
val list = List.range(1, 20) map (i => new Computation(i, f))
val found = list.par find (_.result.isDefined)
//found is Option[Computation[Int,Option[Int]]]
val result = found map (_.result.get)
//result is Option[Int]
However find for parallel collections seems to do a lot of unnecessary work (see this question), so this might not work well, with current versions of Scala at least.
Volatile flags are used in the parallel collections (take a look at the source for find, exists, and forall), so I think your idea is a good one. It's actually better if you can include the flag in the function itself. It kills referential transparency on your function (i.e. for certain inputs your function now sometimes returns None rather than Some), but since you're discarding the stopped computations, this shouldn't matter.
If you're willing to use a non-core library, I think Futures would be a good match for this task. For instance:
Akka's Futures include Futures.firstCompletedOf
Twitter's Futures include Future.select
...both of which appear to enable the functionality you're looking for.