Composing Futures with For Comprehension - scala

I have a Play Framework application using ReactiveMongo with MongoDB, and I have the following code:
def categories(id: String): Future[Vector[Category]] = {...}
....
val categoriesFuture = categories(id)
for {
categories: Vector[Category] <- categoriesFuture
categoryIdsWithoutPerson: Vector[BSONObjectID] <- findCategoryIdsWithoutPerson(categories.map(_.id), personId) //Returns Future[Vector[BSONObjectID]]
categoriesWithoutPerson: Vector[Category] <- categories.filter(category => categoryIdsWithoutPerson.contains(category.id)) //Play cites the error here
} yield categoryIdsWithoutPerson
To explain this code, I fetch a Vector of Categories wrapped in a Future because that's how ReactiveMongo rolls. In the for comprehension, I use that Vector to then fetch a list of ids from the database. Finally, I use a filter call to keep only those categories whose ids can be found in that id list.
It all seems fairly straightforward. The problem is that Play gives me the following compilation error on the last line of the for comprehension:
pattern type is incompatible with expected type;
found : Vector[com.myapp.Category]
required: com.myapp.Category
I am not sure why the required type is a single instance of Category.
I could use some insight into what I am doing wrong and/or if there is a simpler or more idiomatic way of accomplishing this.

It looks like you're trying to compose Futures with Vector. All generators in a Scala for comprehension have to be of the same higher-kinded type, which in your case is Future. When you unroll the 'sugar' of the for comprehension, each generator becomes a flatMap call, except the last one, which becomes a map.
for {
categories <- categoriesFuture
// I'm not sure what the return type is here, but I'm guessing it's a future as well
categoryIdsWithoutPerson <- findCategoryIdsWithoutPerson(categories.map(_.id), personId)
// Here we use = to get around need to flatMap
categoriesWithoutPerson = categories.filter(category => categoryIdsWithoutPerson.contains(category.id))
} yield categoryIdsWithoutPerson
Your code de-sugared:
categoriesFuture.flatMap(categories =>
findCategoryIdsWithoutPerson(categories.map(_.id), personId).
flatMap(categoryIdsWithoutPerson =>
categories.filter(category => categoryIdsWithoutPerson.contains(category.id)).
map(_ => categoryIdsWithoutPerson)))
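To make the `=` versus `<-` distinction concrete, here is a minimal, self-contained sketch. Category, categories() and findCategoryIdsWithoutPerson are hypothetical in-memory stand-ins for the ReactiveMongo-backed ones in the question (plain Int ids instead of BSONObjectID), and it assumes the filtered categories are what you ultimately want to yield:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical stand-in for the question's Category.
final case class Category(id: Int, name: String)

// Stand-in for the database call that returns Future[Vector[Category]].
def categories(): Future[Vector[Category]] =
  Future.successful(Vector(Category(1, "a"), Category(2, "b"), Category(3, "c")))

// Stand-in for the second query; here it simply keeps the odd ids.
def findCategoryIdsWithoutPerson(ids: Vector[Int]): Future[Vector[Int]] =
  Future.successful(ids.filter(_ % 2 == 1))

val result: Future[Vector[Category]] =
  for {
    cats             <- categories()                            // Future generator
    idsWithoutPerson <- findCategoryIdsWithoutPerson(cats.map(_.id)) // Future generator
    // Plain value, so bind it with `=` rather than `<-`.
    catsWithoutPerson = cats.filter(c => idsWithoutPerson.contains(c.id))
  } yield catsWithoutPerson

// Blocking is only for demonstration; a Play action would return the Future.
val filtered: Vector[Category] = Await.result(result, 5.seconds)
```

Every `<-` line has a Future on its right-hand side, and the Vector operation is bound with `=`, so nothing asks the compiler to flatMap a Future into a Vector.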

Related

How to return successfully parsed rows that converted into my case class

I have a file in which each row is a JSON array.
I am reading each line of the file, converting each row into a JSON array, and then converting each element into a case class using spray-json.
I have this so far:
for (line <- source.getLines().take(10)) {
val jsonArr = line.parseJson.convertTo[JsArray]
for (ele <- jsonArr.elements) {
val tryUser = Try(ele.convertTo[User])
}
}
How could I convert this entire process into a single line statement?
val users: Seq[User] = source.getLines.take(10).map(line => line.parseJson.convertTo[JsonArray].elements.map(ele => Try(ele.convertTo[User])
The error is:
found : Iterator[Nothing]
Note: I used Scala 2.13.6 for all my examples.
There is a lot to unpack in these few lines of code. First of all, I'll share some code that we can use to generate some meaningful input to play around with.
object User {
import scala.util.Random
private def randomId: Int = Random.nextInt(900000) + 100000
private def randomName: String = Iterator
.continually(Random.nextInt(26) + 'a')
.map(_.toChar)
.take(6)
.mkString
def randomJson(): String = s"""{"id":$randomId,"name":"$randomName"}"""
def randomJsonArray(size: Int): String =
Iterator.continually(randomJson()).take(size).mkString("[", ",", "]")
}
final case class User(id: Int, name: String)
import scala.util.{Try, Success, Failure}
import spray.json._
import DefaultJsonProtocol._
implicit val UserFormat = jsonFormat2(User.apply)
This is just some scaffolding to define some User domain object and come up with a way to generate a JSON representation of an array of such objects so that we can then use a JSON library (spray-json in this case) to parse it back into what we want.
Now, going back to your question. This is a possible way to massage your data into its parsed representation. It may not fit 100% what you are trying to do, but there's some nuance in the data types involved and how they work:
val parsedUsers: Iterator[Try[User]] =
for {
line <- Iterator.continually(User.randomJsonArray(4)).take(10)
element <- line.parseJson.convertTo[JsArray].elements
} yield Try(element.convertTo[User])
First difference: notice that I use the for comprehension in a form in which the "outcome" of an iteration is not a side effect (for (something) { do something }) but an actual value (for (something) yield { return a value }).
Second difference: I explicitly asked for an Iterator[Try[User]] rather than a Seq[User]. We could go deep down a rabbit hole on the topic of why the types are what they are here, but the simple explanation is that a for ... yield expression:
returns the same type as the one in the first line of the generation -- if you start with a val ns: Iterator[Int]; for (n<- ns) ... you'll get an iterator at the end
if you nest generators, they need to be of the same type as the "outermost" one
You can read more on for comprehensions on the Tour of Scala and the Scala Book.
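As a small illustration of those two rules, here is a sketch using only standard-library collections (the value names are made up for the example):

```scala
// The result type follows the first generator: a List in, a List out.
val fromList: List[Int] = for (n <- List(1, 2, 3)) yield n * 2

// Start from an Iterator and you get an Iterator back.
val fromIterator: Iterator[Int] = for (n <- Iterator(1, 2, 3)) yield n * 2

// Nested generators of the same kind as the outermost one compose cleanly.
val nested: List[(Int, Char)] =
  for {
    n <- List(1, 2)
    c <- List('a', 'b')
  } yield (n, c)
```

The explicit type annotations are there only to make the compiler confirm the result types.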
One possible way of consuming parsedUsers is the following:
for (user <- parsedUsers) {
user match {
case Success(user) => println(s"parsed object $user")
case Failure(error) => println(s"ERROR: '${error.getMessage}'")
}
}
As for how to turn this into a "one liner", for comprehensions are syntactic sugar applied by the compiler, which turns every nested call into a flatMap and the final one into a map, as in the following example (which yields a result equivalent to the for comprehension above and is very close to what the compiler does automatically):
val parsedUsers: Iterator[Try[User]] = Iterator
.continually(User.randomJsonArray(4))
.take(10)
.flatMap(line =>
line.parseJson
.convertTo[JsArray]
.elements
.map(element => Try(element.convertTo[User]))
)
One note that I would like to add is that you should be mindful of readability. Some teams prefer for comprehensions, others prefer manually rolling out their own flatMap/map chains. Coder's discretion is advised.
You can play around with this code here on Scastie (and here is the version with the flatMap/map calls).
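If what you ultimately want is only the successfully parsed rows (the Seq[User] from the question), one option is to collect the Success cases. This is a sketch with a stand-in parser (String to Int via toInt) instead of spray-json, so the names below are illustrative only:

```scala
import scala.util.{Success, Try}

// Stand-in input: one element cannot be parsed.
val raw: Iterator[String] = Iterator("1", "2", "oops", "4")

// Same shape as parsedUsers above: an Iterator of Try values.
val parsed: Iterator[Try[Int]] = raw.map(s => Try(s.toInt))

// Keep only the successes and materialise the iterator into a List.
val successes: List[Int] = parsed.collect { case Success(n) => n }.toList
```

Note that this silently drops the failures; whether that is acceptable depends on your use case.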

Scala adding elements to seq and handling futures, maps, and async behavior

I'm still a newbie in scala and don't quite yet understand the concept of Futures/Maps/Flatmaps/Seq and how to use them properly.
This is what I want to do (pseudo code):
def getContentComponents: Action[AnyContent] = Action.async {
contentComponentDTO.list().map( //Future[Seq[ContentComponentModel]] Get all contentComponents
contentComponents => contentComponents.map( //Iterate over [Seq[ContentComponentModel]
contentComponent => contentComponent.typeOf match { //Match the type of the contentComponent
case 1 => contentComponent.pictures :+ contentComponentDTO.getContentComponentPicture(contentComponent.id.get) //Future[Option[ContentComponentPictureModel]] add to _.pictures seq
case 2 => contentComponent.videos :+ contentComponentDTO.getContentComponentVideo(contentComponent.id.get) //Future[Option[ContentComponentVideoModel]] add to _.videos seq
}
)
Ok(Json.toJson(contentComponents)) //Return all the contentComponents in the end
)
}
I want to add a Future[Option[Foo]] to contentComponent.pictures: Option[Seq[Foo]] like so:
case 2 => contentComponent.pictures :+ contentComponentDTO.getContentComponentPicture(contentComponent.id.get) //contentComponent.pictures is Option[Seq[Foo]]
and return the whole contentComponent back to the front-end via json in the end.
I know this might be far away from the actual code in the end, but I hope you got the idea. Thanks!
I'll ignore your code and focus on what is short and makes sense:
I want to add a Future[Option[Foo]] to contentComponent.pictures: Option[Seq[Foo]] like so:
Let's do this, focusing on code readability:
// what you already have
val someFuture: Future[Option[Foo]] = ???
val pics: Option[Seq[Foo]] = contentComponent.pictures
// what I'm adding
val result: Future[Option[Seq[Foo]]] = someFuture.map {
case None => pics
case Some(newElement) =>
pics match {
case None => Some(Seq(newElement)) // not sure what you want here if pics is empty...
case Some(picsSequence) => Some(picsSequence :+ newElement)
}
}
And to show an example of flatMap, let's say you need the value of the result future inside another future; just do:
val otherFuture: Future[Any] = ???
val everything: Future[Option[Seq[Foo]]] = otherFuture.flatMap { otherResult =>
// do something with otherResult i.e., the code above could be pasted in here...
result
}
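For something you can paste into a REPL, here is a self-contained sketch of the same idea. Foo and appendFetched are hypothetical names, and it assumes that an absent pictures value should become a one-element sequence when a new element arrives:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical element type.
final case class Foo(name: String)

// Appends the fetched element (if any) to the existing optional sequence.
def appendFetched(
    pics: Option[Seq[Foo]],
    fetched: Future[Option[Foo]]
): Future[Option[Seq[Foo]]] =
  fetched.map {
    case None             => pics
    case Some(newElement) => Some(pics.getOrElse(Seq.empty) :+ newElement)
  }

// Blocking is only for demonstration purposes.
val updated: Option[Seq[Foo]] =
  Await.result(appendFetched(Some(Seq(Foo("a"))), Future.successful(Some(Foo("b")))), 5.seconds)
val unchanged: Option[Seq[Foo]] =
  Await.result(appendFetched(None, Future.successful(None)), 5.seconds)
```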
My answer will attempt to help with some of the conceptual sub-questions which form parts of your overall larger question.
flatMap and for-yield
One of the points of flatMap is to help with the problem of the Pyramid of Doom. This happens when you have
structures nested within structures nested within structures ...
doA().map { resultOfA =>
doB(resultOfA).map { resultOfB =>
doC(resultOfB).map { resultOfC =>
...
}
}
}
If you use for-yield you get flatMap out of the box and it allows you to
flatten the pyramid
so that your code looks more like a linear structure
for {
resultOfA <- doA
resultOfB <- doB(resultOfA)
resultOfC <- doC(resultOfB)
...
} yield {...}
There is a rule of thumb in software engineering that deeply nested structures are harder to debug and reason about, so
we strive to minimise the nesting. You will hit this issue especially when dealing with Futures.
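Here is a runnable sketch of the two shapes side by side. doA, doB and doC are placeholder functions that complete immediately; real ones would do asynchronous work:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Placeholder asynchronous steps.
def doA(): Future[Int]       = Future.successful(1)
def doB(a: Int): Future[Int] = Future.successful(a + 1)
def doC(b: Int): Future[Int] = Future.successful(b * 10)

// The pyramid: each step nests inside the previous one.
val nested: Future[Int] =
  doA().flatMap { a =>
    doB(a).flatMap { b =>
      doC(b).map { c =>
        a + b + c
      }
    }
  }

// The same computation, flattened by for-yield.
val flat: Future[Int] =
  for {
    a <- doA()
    b <- doB(a)
    c <- doC(b)
  } yield a + b + c

val nestedValue: Int = Await.result(nested, 5.seconds)
val flatValue: Int   = Await.result(flat, 5.seconds)
```

Both versions compute the same value; only the shape of the code differs.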
Mapping over Future vs. mapping over sequence
Mapping is usually first thought in terms of iteration over a sequence, which might lead to understanding of
mapping over a Future in terms of iterating over a sequence of one. My advice would be not to use the iteration concept when
trying to understand mapping over Futures, Options etc. In these cases it might be better to think of mapping as a process of destructing the structure
so that you get at the element inside the structure. One could visualise mapping as
breaking the shell of a walnut so you get at the delicious kernel inside and then rebuilding the shell.
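A tiny sketch of the walnut idea, using only the standard library (the value names are illustrative):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// map reaches inside the Option, transforms the value, and rebuilds the Option.
val maybeLength: Option[Int] = Some("walnut").map(_.length)

// An empty shell stays empty; the function is never called.
val stillEmpty: Option[Int] = Option.empty[String].map(_.length)

// The same shape works for Future: the result is again a Future.
val eventualLength: Future[Int] = Future.successful("walnut").map(_.length)
val length: Int = Await.result(eventualLength, 5.seconds)
```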
Futures and monads
As you try to learn more about Futures and when you begin to deal with types like Future[Option[SomeType]] you will inevitably
stumble upon documentation about monads and its cryptic terminology might scare you away. If this happens, it might help to think of monads (of which Future is a particular instance) as simply
something you can stick into a for-yield so that you can get at the
delicious walnut kernels whilst avoiding the pyramid of doom.

Inspection error in scala method / play framework / rest

I'm still learning scala so this might be a question with an easy answer, but I've been stuck on writing a single method over and over for almost a day, unable to get this code to compile.
I'm playing with the Play Framework and a reactive mongo template to learn how Scala and Play work.
I have a controller with a few methods, endpoints for a REST service.
The issue is about the following method, which accepts a list of json objects and updates those objects using the mongo reactive driver. The class has one member, citiesFuture which is of type Future[JSONCollection].
The original class code which I'm adding this method to can be found here for context: CityController on github
def updateAll() = Action.async(parse.json) { request =>
Json.fromJson[List[City]](request.body) match {
case JsSuccess(givenCities, _) =>
citiesFuture onComplete[Future[Result]] { cities =>
val updateFutures: List[Future[UpdateWriteResult]] = for {
city <- givenCities
} yield cities.get.update(City.getUniqueQuery(city), Json.obj("$set" -> city))
val promise: Promise[Result] = Promise[Result] {
Future.sequence(updateFutures) onComplete[Result] {
case s @ Success(_) =>
var count = 0
for {
updateWriteResult <- s.value
} yield count += updateWriteResult.n
promise success Ok(s"Updated $count cities")
case Failure(_) =>
promise success InternalServerError("Error updating cities")
}
}
promise.future
}
case JsError(errors) =>
Future.successful(BadRequest("Could not build a city from the json provided. " + Errors.show(errors)))
}
}
I've managed to get this far with a lot of trial and error, and I'm starting to understand how some of the mechanics of Scala and Futures work, I think :) I think I'm close, but my IDE still gives me a single Inspection error just at the single closing curly brace above the line promise.future.
The error reads: Expression of type Unit doesn't conform to expected type Nothing.
I've checked the expected return values for the Promise and onComplete code blocks, but I don't believe they expect Nothing as a return type.
Could somebody please explain to me what I'm missing, and also, I'm sure this can be done better, so let me know if you have any tips I can learn from!
You're kinda on the right track but as @cchantep said, once you're operating in Future-land, it would be very unusual to need to create your own with Promise.future.
In addition, it's actually quite unusual to see onComplete being used - idiomatic Scala generally favors the "higher-level" abstraction of mapping over Futures. I'll attempt to demonstrate how I'd write your function in a Play controller:
Firstly, the "endpoint" just takes care of one thing - interfacing with the outside world - i.e. the JSON-parsing part. If everything converts OK, it calls a private method (performUpdateAll) that actually does the work:
def updateAll() = Action.async(parse.json) { request =>
Json.fromJson[List[City]](request.body) match {
case JsSuccess(givenCities, _) =>
performUpdateAll(givenCities)
case JsError(errors) =>
Future.successful(BadRequest("Could not build a city from the json provided. "))
}
}
Next, we have the private function that performs the update of multiple cities. Again, trying to abide by the Single Responsibility Principle (in a functional sense - one function should do one thing), I've extracted out updateCity which knows how to update exactly one city and returns a Future[UpdateWriteResult]. A nice side-effect of this is code-reuse; you may find you'll be able to use such a function elsewhere.
private def performUpdateAll(givenCities:List[City]):Future[Result] = {
val updateFutures = givenCities.map { city =>
updateCity(city)
}
Future.sequence(updateFutures).map { listOfResults =>
if (listOfResults.forall(_.ok)) {
val count = listOfResults.map(_.n).sum
Ok(s"Updated $count cities")
} else {
InternalServerError("Error updating cities")
}
}
}
As far as I can tell, this will work in exactly the same way as you intended yours to work. But by using Future.map instead of its lower-level counterpart Future.onComplete and matching on Success and Failure you get much more succinct code where (in my opinion) it's much easier to see the intent because there's less boilerplate around it.
We still check that every update worked, with this:
if (listOfResults.forall(_.ok))
which I would argue reads pretty well - all the results have to be OK!
The other little trick I did to tidy up was replace your "counting" logic which used a mutable variable, with a one-liner:
var count = 0
for {
updateWriteResult <- s.value
} yield count += updateWriteResult.n
Becomes:
val count = listOfResults.map(_.n).sum
i.e. convert the list of results to a list of integers (the n in the UpdateWriteResult) and then use the built-in sum function available on lists to do the rest.
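If you want to try the Future.sequence/map/sum pattern without a database, here is a minimal sketch. WriteResult and updateCity are hypothetical stand-ins for UpdateWriteResult and the real update call, and the result is a plain String rather than a Play Result. The recover step is an addition of mine, on the assumption that a failed Future should also produce the error message:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.control.NonFatal

// Hypothetical stand-in for ReactiveMongo's UpdateWriteResult.
final case class WriteResult(ok: Boolean, n: Int)

// Stand-in update: pretends every city was updated successfully.
def updateCity(city: String): Future[WriteResult] =
  Future.successful(WriteResult(ok = true, n = 1))

def performUpdateAll(cities: List[String]): Future[String] =
  Future
    .sequence(cities.map(updateCity)) // List[Future[_]] becomes Future[List[_]]
    .map { results =>
      if (results.forall(_.ok)) s"Updated ${results.map(_.n).sum} cities"
      else "Error updating cities"
    }
    .recover { case NonFatal(_) => "Error updating cities" } // failed Future

// Blocking is only for demonstration purposes.
val message: String = Await.result(performUpdateAll(List("a", "b", "c")), 5.seconds)
```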

Spark Cassandra Connector: for comprehension error (type mismatch)

Problem
Maybe this is due to my lack of Scala knowledge, but it seems like adding another level to the for comprehension should just work. If the first for comprehension line is commented out, the code works. I ultimately want a Set[Int] instead of '1 to 2', but it serves to show the problem. The first two lines of the for should not need a type specifier, but I include it to show that I've tried the obvious.
Tools/Jars
IntelliJ 2016.1
Java 8
Scala 2.10.5
Cassandra 3.x
spark-assembly-1.6.0-hadoop2.6.0.jar (pre-built)
spark-cassandra-connector_2.10-1.6.0-M1-SNAPSHOT.jar (pre-built)
spark-cassandra-connector-assembly-1.6.0-M1-SNAPSHOT.jar (I built)
Code
case class NotifHist(intnotifhistid:Int, eventhistids:Seq[Int], yosemiteid:String, initiatorname:String)
case class NotifHistSingle(intnotifhistid:Int, inteventhistid:Int, dataCenter:String, initiatorname:String)
object SparkCassandraConnectorJoins {
def joinQueryAfterMakingExpandedRdd(sc:SparkContext, orgNodeId:Int) {
val notifHist:RDD[NotifHistSingle] = for {
orgNodeId:Int <- 1 to 2 // comment out this line and it works
notifHist:NotifHist <- sc.cassandraTable[NotifHist](keyspace, "notifhist").where("intorgnodeid = ?", orgNodeId)
eventHistId <- notifHist.eventhistids
} yield NotifHistSingle(notifHist.intnotifhistid, eventHistId, notifHist.yosemiteid, notifHist.initiatorname)
...etc...
}
Compilation Output
Information:3/29/16 8:52 AM - Compilation completed with 1 error and 0 warnings in 1s 507ms
/home/jpowell/Projects/SparkCassandraConnector/src/com/mir3/spark/SparkCassandraConnectorJoins.scala
**Error:(88, 21) type mismatch;
found : scala.collection.immutable.IndexedSeq[Nothing]
required: org.apache.spark.rdd.RDD[com.mir3.spark.NotifHistSingle]
orgNodeId:Int <- 1 to 2
^**
Later
@slouc Thanks for the comprehensive answer. I was using the for comprehension's syntactic sugar to also keep state from the second statement to fill elements in the NotifHistSingle ctor, so I don't see how to get the equivalent map/flatmap to work. Therefore, I went with the following solution:
def joinQueryAfterMakingExpandedRdd(sc:SparkContext, orgNodeIds:Set[Int]) {
def notifHistForOrg(orgNodeId:Int): RDD[NotifHistSingle] = {
for {
notifHist <- sc.cassandraTable[NotifHist](keyspace, "notifhist").where("intorgnodeid = ?", orgNodeId)
eventHistId <- notifHist.eventhistids
} yield NotifHistSingle(notifHist.intnotifhistid, eventHistId, notifHist.yosemiteid, notifHist.initiatorname)
}
val emptyTable:RDD[NotifHistSingle] = sc.emptyRDD[NotifHistSingle]
val notifHistForAllOrgs:RDD[NotifHistSingle] = orgNodeIds.foldLeft(emptyTable)((accum, oid) => accum ++ notifHistForOrg(oid))
}
For comprehension is actually syntax sugar; what's really going on underneath is a series of chained flatMap calls, with a single map at the end which replaces yield. Scala compiler translates every for comprehension like this. If you use if conditions in your for comprehension, they are translated into filters, and if you don't yield anything foreach is used. For more information, see here.
So, to explain on your case - this:
val notifHist:RDD[NotifHistSingle] = for {
orgNodeId:Int <- 1 to 2 // comment out this line and it works
notifHist:NotifHist <- sc.cassandraTable[NotifHist](keyspace, "notifhist").where("intorgnodeid = ?", orgNodeId)
eventHistId <- notifHist.eventhistids
} yield NotifHistSingle(...)
is actually translated by the compiler to this:
val notifHist:RDD[NotifHistSingle] = (1 to 2)
  .flatMap(orgNodeId => sc.cassandraTable[NotifHist](keyspace, "notifhist").where("intorgnodeid = ?", orgNodeId)
    .flatMap(notifHist => notifHist.eventhistids
      .map(eventHistId => NotifHistSingle(...))))
You are getting the error if you include the 1 to 2 line because that makes your for comprehension operate on a sequence (a Range, to be more precise, which is an IndexedSeq). So when invoking flatMap(), the compiler expects you to follow up with a function that transforms each element of that sequence to a GenTraversableOnce. If you take a closer look at the signature of flatMap on your sequence (most IDEs will display it just by hovering over it) you can see it for yourself:
def flatMap[B, That](f: A => GenTraversableOnce[B])(implicit bf: CanBuildFrom[Repr, B, That]): That
This is the problem. The compiler doesn't know how to flatMap the range 1 to 2 using a function that returns a CassandraRDD. It wants a function that returns a GenTraversableOnce. If you remove the 1 to 2 line then you remove this restriction.
Bottom line - if you want to use a for comprehension and yield values out of it, you have to obey the type rules. It's impossible to flatten a sequence consisting of elements which are not sequences and cannot be turned into sequences.
You can always map instead of flatMap since map is less restrictive (it requires A => B instead of A => GenTraversableOnce[B]). This means that instead of getting all results in one giant sequence, you will get a sequence where each element is a group of results (one group for each query). You can also play around with the types, trying to get a GenTraversableOnce from your query result (e.g. invoking sc.cassandraTable().where().toArray or something; I don't really work with Cassandra so I don't know).
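To illustrate the flatMap versus map difference without Spark, here is a sketch in which a plain Seq stands in for the RDD. NotifHist is trimmed to two fields and query is a hypothetical in-memory replacement for the cassandraTable(...).where(...) call:

```scala
// Trimmed-down, hypothetical version of the question's case class.
final case class NotifHist(id: Int, eventIds: Seq[Int])

// Stand-in for the Cassandra query; returns a plain Seq instead of an RDD.
def query(orgNodeId: Int): Seq[NotifHist] =
  Seq(NotifHist(orgNodeId, Seq(10, 20)))

// All generators are sequences, so the for comprehension type-checks
// and everything ends up in one flat sequence.
val flattened: Seq[(Int, Int)] =
  for {
    org <- 1 to 2
    nh  <- query(org)
    e   <- nh.eventIds
  } yield (nh.id, e)

// Using map for the outer step keeps one group of results per query.
val grouped: Seq[Seq[(Int, Int)]] =
  (1 to 2).map { org =>
    for { nh <- query(org); e <- nh.eventIds } yield (nh.id, e)
  }
```

With a real RDD in the middle, only the grouped shape (or a fold over the groups, as in the question's follow-up) would compile.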

Scala's for-comprehension `if` statements

Is it possible in scala to specialize on the conditions inside an if within a for comprehension? I'm thinking along the lines of:
val collection: SomeGenericCollection[Int] = ...
trait CollectionFilter
case object Even extends CollectionFilter
case object Odd extends CollectionFilter
val evenColl = for { i <- collection if(Even) } yield i
//evenColl would be a SomeGenericEvenCollection instance
val oddColl = for { i <- collection if(Odd) } yield i
//oddColl would be a SomeGenericOddCollection instance
The gist is that by yielding i, I get a new collection of a potentially different type (hence me referring to it as "specialization")- as opposed to just a filtered-down version of the same GenericCollection type.
The reason I ask is that I saw something that I couldn't figure out (an example can be found on line 33 of this ScalaQuery example). What it does is create a query for a database (i.e. SELECT ... FROM ... WHERE ...), where I would have expected it to iterate over the results of said query.
So, I think you are asking if it is possible for the if statement in a for-comprehension to change the result type. The answer is "yes, but...".
First, understand how for-comprehensions are expanded. There are questions here on Stack Overflow discussing it, and there are parameters you can pass to the compiler so it will show you what's going on.
Anyway, this code:
val evenColl = for { i <- collection if(Even) } yield i
Is translated as:
val evenColl = collection.withFilter(i => Even).map(i => i)
So, if the withFilter method changes the collection type, it will do what you want -- in this simple case. On more complex cases, that alone won't work:
for {
x <- xs
y <- ys
if cond
} yield (x, y)
is translated as
xs.flatMap(x => ys.withFilter(y => cond).map(y => (x, y)))
In which case flatMap is deciding what type will be returned. If it takes the cue from what result was returned, then it can work.
Now, on Scala Collections, withFilter doesn't change the type of the collection. You could write your own classes that would do that, however.
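As a rough sketch of such a class, here is a made-up pair of types (Numbers and FilteredNumbers are not from any library) where withFilter returns a different type, so the for comprehension with an if yields that other type:

```scala
// The result of filtering is a different type, with its own map.
final case class FilteredNumbers(values: List[Int]) {
  def map(f: Int => Int): FilteredNumbers = FilteredNumbers(values.map(f))
}

// The source collection only needs withFilter for this particular for comprehension.
final case class Numbers(values: List[Int]) {
  def withFilter(p: Int => Boolean): FilteredNumbers =
    FilteredNumbers(values.filter(p))
}

// Desugars to Numbers(...).withFilter(i => i % 2 == 0).map(i => i)
val evens: FilteredNumbers =
  for { i <- Numbers(List(1, 2, 3, 4)) if i % 2 == 0 } yield i
```

This filters eagerly for simplicity; the standard collections' withFilter is lazy.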
Yes, you can. Please refer to this tutorial for an easy example. The ScalaQuery example you cited is also iterating over the collection; it then uses that data to build the query.