Why is a Thread.sleep or closing the connection required after waiting for a remove call to complete? - scala

Once again I'm asking you to share your wisdom with me, the Scala padawan!
I'm playing with ReactiveMongo in Scala, and while writing a test with ScalaTest I ran into the following issue.
First the code:
"delete" when {
"passing an existent id" should {
"succeed" in {
val testRecord = TestRecord(someString)
Await.result(persistenceService.persist(testRecord), Duration.Inf)
Await.result(persistenceService.delete(testRecord.id), Duration.Inf)
Thread.sleep(1000) // Why do I need this to make the test succeed?
val thrownException = intercept[RecordNotFoundException] {
Await.result(persistenceService.read(testRecord.id), Duration.Inf)
}
thrownException.getMessage should include(testRecord._id.toString)
}
}
}
And the read and delete methods with the code initializing connection to db (part of the constructor):
class MongoPersistenceService[R](url: String, port: String, databaseName: String, collectionName: String) {
val driver = MongoDriver()
val parsedUri: Try[MongoConnection.ParsedURI] = MongoConnection.parseURI("%s:%s".format(url, port))
val connection: Try[MongoConnection] = parsedUri.map(driver.connection)
val mongoConnection = Future.fromTry(connection)
def db: Future[DefaultDB] = mongoConnection.flatMap(_.database(databaseName))
def collection: Future[BSONCollection] = db.map(_.collection(collectionName))
def read(id: BSONObjectID): Future[R] = {
val query = BSONDocument("_id" -> id)
val readResult: Future[R] = for {
coll <- collection
record <- coll.find(query).requireOne[R]
} yield record
readResult.recover {
case NoSuchResultException => throw RecordNotFoundException(id)
}
}
def delete(id: BSONObjectID): Future[Unit] = {
val query = BSONDocument("_id" -> id)
// first read then call remove. Read will throw if not present
read(id).flatMap { (_) => collection.map(coll => coll.remove(query)) }
}
}
To make my test pass, I had to add a Thread.sleep right after waiting for the delete to complete. Knowing this is an evil usually punished by many lashes, I want to learn and find the proper fix here.
While trying other things, I found that closing the connection to the db entirely, instead of sleeping, also did the trick...
What am I misunderstanding here? Should a connection to the db be opened and closed for each call? Can't many actions, like adding, removing, and updating records, share one connection?
Note that everything works fine when I remove the read call in my delete function. Also, by closing the connection I mean calling close on the MongoDriver from my test and also stopping and restarting the embedded Mongo that I'm using in the background.
Thanks for helping guys.

Warning: this is a blind guess, I've no experience with MongoDB on Scala.
You may have forgotten to flatMap
Take a look at this bit:
collection.map(coll => coll.remove(query))
Since collection is Future[BSONCollection] per your code and remove returns Future[WriteResult] per the docs, the actual type of this expression is Future[Future[WriteResult]].
Now, you have annotated your function as returning Future[Unit]. Scala often produces Unit as a return value by throwing away possibly meaningful values, which is what it does in your case:
read(id).flatMap { (_) =>
collection.map(coll => {
coll.remove(query) // we didn't wait for removal
() // before returning unit
})
}
So your code should probably be
read(id).flatMap(_ => collection.flatMap(_.remove(query).map(_ => ())))
Or a for-comprehension:
for {
_ <- read(id)
coll <- collection
_ <- coll.remove(query)
} yield ()
You can make Scala warn you about discarded values by adding a compiler flag (assuming SBT):
scalacOptions += "-Ywarn-value-discard"
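The difference between map and flatMap here can be reproduced without MongoDB. The following is a minimal sketch that uses only scala.concurrent; the names remove, collection and removalDone are hypothetical stand-ins for the ReactiveMongo calls, not the real API. The map variant writes out by hand what value discarding amounts to.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object NestedFutureDemo {
  // Hypothetical stand-in for coll.remove(query): the "removal" finishes
  // only when this promise is completed.
  val removalDone: Promise[Unit] = Promise[Unit]()
  def remove(): Future[Unit] = removalDone.future

  // Hypothetical stand-in for the Future[BSONCollection] in the question.
  def collection: Future[String] = Future.successful("coll")

  // map: the inner future is created and then thrown away, so the
  // outer future completes without waiting for the removal.
  def deleteWithMap(): Future[Unit] =
    collection.map { _ =>
      remove() // result discarded
      ()       // Unit returned immediately
    }

  // flatMap: the outer future completes only when the removal does.
  def deleteWithFlatMap(): Future[Unit] =
    collection.flatMap(_ => remove())
}
```

Awaiting deleteWithMap() returns even though removalDone has not been completed, which is the situation the Thread.sleep was papering over. The future from deleteWithFlatMap() stays incomplete until removalDone.success(()) is called.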

Related

Slick update operation returns before object is flushed in the database

I am experiencing a scenario where when I fetch an object immediately after updating it, sometimes the result I get from the DB does not contain the most recent changes.
This has led me to think that the update thread returns before the object is actually committed in the DB. Is this expected behavior?
I would think that the update method would only return after the changes have been successfully flushed to the DB; however, it looks like this is not guaranteed.
Below is pseudo code demonstrating what I am talking about.
def processObject = {
for {
objectId: Option[Long] <- saveObjectInDb
_ <- {
//perform other synchronous business logic and then update created object details
dao.findById(objectId.get).map { objectOption: Option[MyObject] =>
dao.update(objectOption.get.copy(processingStep = "third-step"))
}
}
mostRecentMyObject <- dao.findById(objectId.get)
} yield mostRecentMyObject
}
Below is what my update logic looks like:
def update(myObject: MyObject): Future[Int] = {
db.run(table.filter(_.id === myObject.id).update(myObject))
}
The problem is that you are not considering the inner Future returned by the update method.
Given the signature of findById:
def findById(id: Long): Future[Option[MyObject]]
the snippet:
dao.findById(objectId.get).map { objectOption: Option[MyObject] =>
dao.update(objectOption.get.copy(processingStep = "third-step"))
}
will give an object of type Future[Future[Int]].
You should use flatMap instead of map over the findById future, like so:
dao.findById(objectId.get).flatMap { objectOption: Option[MyObject] =>
dao.update(objectOption.get.copy(processingStep = "third-step"))
}
This flattens the result to a single future (Future[Int]), so you can be sure the object is read back only after the update has completed.
Moreover you can rewrite this as:
def processObject = {
for {
objectId: Option[Long] <- saveObjectInDb
objectOption <- dao.findById(objectId.get)
_ <- dao.update(objectOption.get.copy(processingStep = "third-step"))
mostRecentMyObject <- dao.findById(objectId.get)
} yield mostRecentMyObject
}
because, in a for-comprehension, <- is syntactic sugar for flatMap.
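To make the equivalence concrete, here is a small sketch that runs without Slick. InMemoryDao is a hypothetical in-memory stand-in for the dao in the question (it is not Slick's API), and the 50 ms sleep only simulates database latency.

```scala
import scala.collection.concurrent.TrieMap
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

final case class MyObject(id: Long, processingStep: String)

// Hypothetical in-memory DAO, only for illustration.
final class InMemoryDao {
  private val store = TrieMap(1L -> MyObject(1L, "first-step"))

  def findById(id: Long): Future[Option[MyObject]] = Future(store.get(id))

  def update(obj: MyObject): Future[Int] = Future {
    Thread.sleep(50) // simulated latency
    store.put(obj.id, obj)
    1 // number of rows updated
  }
}

object SequencingDemo {
  // for-comprehension: each step starts only after the previous one completed.
  def withFor(dao: InMemoryDao, id: Long): Future[Option[MyObject]] =
    for {
      found  <- dao.findById(id)
      _      <- dao.update(found.get.copy(processingStep = "third-step"))
      latest <- dao.findById(id)
    } yield latest

  // The same chain written with flatMap and map by hand.
  def withFlatMap(dao: InMemoryDao, id: Long): Future[Option[MyObject]] =
    dao.findById(id).flatMap { found =>
      dao.update(found.get.copy(processingStep = "third-step")).flatMap { _ =>
        dao.findById(id).map(latest => latest)
      }
    }
}
```

Both variants yield the object with processingStep set to "third-step", because the final findById is only issued once the update future has completed.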

Run transactionally and retrieve result in Future

How do I run a transactional statement in Slick 3.1.x and capture the result in a Future (without using Await)?
This works (but uses Await)
val action = db.run((for {
_ <- table1.filter(_.id1 === id).delete
_ <- table2.filter(_.id2=== id).delete
} yield ()).transactionally)
val result = Await.result(action, Duration.Inf)
However this does not print anything:
val future = db.run((for {
_ <- table1.filter(_.id1 === id).delete
_ <- table2.filter(_.id2=== id).delete
} yield ()).transactionally)
future.map { result => println("result:"+result) }
UPDATE
This is the real code taken from the program that doesn't work. It prints "1" but it never prints "2"
case class UserRole (sk: Int, name: String)
class UserRoleDB(tag: Tag) extends Table[UserRole](tag, "user_roles") {
def sk = column[Int]("sk", O.PrimaryKey)
def name = column[String]("name")
def * = (sk, name) <> ((UserRole.apply _).tupled, UserRole.unapply)
}
class Test extends Controller {
def index = Action.async { request =>
val db = Database.forConfig("db1")
val userRoles = TableQuery[UserRoleDB]
val ur = UserRole(1002,"aaa")
try {
val action = (for {
userRole2 <- userRoles += ur
} yield (userRole2)).transactionally
val future = db.run(action)
println(1)
// val result = Await.result(future, Duration.Inf)
future.map { result => {
println(2)
Ok("Finished OK")
}
}
}
finally db.close
}
}
Coming from the other question you asked: you are opening and then immediately closing the db connection in the finally clause. Therefore your async db operation runs against a closed db connection. That's also why it works with Await, since that blocks the execution of db.close until you have received the result set.
So how to fix this?
Either move db.close into future.map or, better, let play-slick handle db connections for you.
Side note
You should close your other question and update this thread accordingly instead.
Your second example is fine. My guess is that you are running it either in a standalone program or in a test, and it simply finishes before the future has a chance to be executed.
Try adding a sleep after the code in your second sample and you'll see the message printed. The sleep is definitely not something you would do in your actual code, but it will show you that it works as it should.
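The point about db.close running before the async operation finishes can be reproduced without Slick or Play. This is only a sketch: FakeDb is a hypothetical stand-in that fails when it is closed before its work runs, which is an assumption made for illustration and not a description of how Slick itself reacts.

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical database handle, only for illustration.
final class FakeDb {
  private val closed = new AtomicBoolean(false)

  def run(): Future[String] = Future {
    Thread.sleep(100) // simulated query latency
    if (closed.get) throw new IllegalStateException("db already closed")
    else "row inserted"
  }

  def close(): Unit = closed.set(true)
}

object CloseTimingDemo {
  // finally runs as soon as the Future has been created,
  // long before the simulated query finishes.
  def closeInFinally(): Future[String] = {
    val db = new FakeDb
    try db.run() finally db.close()
  }

  // andThen runs its side effect after the Future has completed,
  // whether it succeeded or failed, and passes the result through.
  def closeAfterCompletion(): Future[String] = {
    val db = new FakeDb
    db.run().andThen { case _ => db.close() }
  }
}
```

In this sketch closeInFinally() ends in a failed future, while closeAfterCompletion() delivers the result and still closes the handle.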

Trying to understand Scala enumerator/iteratees

I am new to Scala and Play!, but have a reasonable amount of experience of building webapps with Django and Python and of programming in general.
I've been doing an exercise of my own to try to improve my understanding: simply pull some records from a database and output them as a JSON array. I'm trying to use the Enumerator/Iteratee functionality to do this.
My code follows:
TestObjectController.scala:
def index = Action {
db.withConnection { conn=>
val stmt = conn.createStatement()
val result = stmt.executeQuery("select * from datatable")
logger.debug(result.toString)
val resultEnum:Enumerator[TestDataObject] = Enumerator.generateM {
logger.debug("called enumerator")
result.next() match {
case true =>
val obj = TestDataObject(result.getString("name"), result.getString("object_type"),
result.getString("quantity").toInt, result.getString("cost").toFloat)
logger.info(obj.toJsonString)
Future(Some(obj))
case false =>
logger.warn("reached end of iteration")
stmt.close()
null
}
}
val consume:Iteratee[TestDataObject,Seq[TestDataObject]] = {
Iteratee.fold[TestDataObject,Seq[TestDataObject]](Seq.empty[TestDataObject]) { (result,chunk) => result :+ chunk }
}
val newIteree = Iteratee.flatten(resultEnum(consume))
val eventuallyResult:Future[Seq[TestDataObject]] = newIteree.run
eventuallyResult.onSuccess { case x=> println(x)}
Ok("")
}
}
TestDataObject.scala:
package models
case class TestDataObject (name: String, objtype: String, quantity: Int, cost: Float){
def toJsonString: String = {
val mapper = new ObjectMapper()
mapper.registerModule(DefaultScalaModule)
mapper.writeValueAsString(this)
}
}
I have two main questions:
How do I signal that the input is complete from the Enumerator callback? The documentation says "this method takes a callback function e: => Future[Option[E]] that will be called each time the iteratee this Enumerator is applied to is ready to take some input." but I am unable to pass any kind of EOF that I've found because it's the wrong type. Wrapping it in a Future does not help, and instinctively I am not sure that's the right approach.
How do I get the final result out of the Future to return from the controller view? My understanding is that I would effectively need to pause the main thread to wait for the subthreads to complete, but the only thing I've seen in examples and found in the Future class is the onSuccess callback. How can I then return that from the view? Does Iteratee.run block until all input has been consumed?
A couple of sub-questions as well, to help my understanding:
Why do I need to wrap my object in Some() when it's already in a Future? What exactly does Some() represent?
When I run the code for the first time, I get a single record logged from logger.info and then it reports "reached end of iteration". Subsequent runs in the same session call nothing. I am closing the statement though, so why do I get no results the second time around? I was expecting it to loop indefinitely as I don't know how to signal the correct termination for the loop.
Thanks a lot in advance for any answers, I thought I was getting the hang of this but obviously not yet!
How do I signal that the input is complete from the Enumerator callback?
You return a Future(None).
How do I get the final result out of the Future to return from the controller view?
You can use Action.async (doc):
def index = Action.async {
db.withConnection { conn=>
...
val eventuallyResult:Future[Seq[TestDataObject]] = newIteree.run
eventuallyResult map { data =>
Ok(...)
}
}
}
Why do I need to wrap my object in Some() when it's already in a Future? What exactly does Some() represent?
The Future represents the (potentially asynchronous) processing to obtain the next element. The Option represents the availability of the next element: Some(x) if another element is available, None if the enumeration is completed.
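The Future[Option[E]] contract can be illustrated without Play. The sketch below is not the Iteratee API: drain is a hypothetical helper that keeps calling a callback until it yields None, which mirrors how an enumerator built from such a callback knows when to stop.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object GenerateSketch {
  // Hypothetical helper: call `next` repeatedly, collecting elements
  // until the callback signals the end of input with None.
  def drain[E](next: () => Future[Option[E]])(implicit ec: ExecutionContext): Future[Vector[E]] = {
    def loop(acc: Vector[E]): Future[Vector[E]] =
      next().flatMap {
        case Some(e) => loop(acc :+ e)          // another element is available
        case None    => Future.successful(acc)  // enumeration is complete
      }
    loop(Vector.empty)
  }

  // Plays the role of the JDBC ResultSet in the question.
  def example(): Future[Vector[String]] = {
    val rows = Iterator("a", "b", "c")
    drain[String](() => Future(if (rows.hasNext) Some(rows.next()) else None))
  }
}
```

Returning null instead of Future(None), as in the question, gives the consumer nothing to flatMap over, which is why the end of input has to be expressed as a completed future containing None.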

How can I integrate MongoDB Scala Async driver with Akka Streams?

I'm migrating my old Casbah Mongo drivers to the new async Scala drivers and I'm trying to use this in an Akka stream, and the stream is getting stuck.
I have a GraphStage with createLogic() defined. The code is below. This worked fine with Casbah and I'd hoped the async nature of the new mongo drivers would be a great fit, but here's what happens...
If I stream 2 records through this code, the first record flows through and triggers the next step. See output below ('HERE IN SEND' confirms it got through). The second record seems to go through the right steps in BlacklistFilter but Akka never flows to the SEND step.
Any ideas why this is not working with the new drivers?
object BlacklistFilter {
type FilterShape = FanOutShape2[QM[RenderedExpression], QM[RenderedExpression], QM[Unit]]
}
import BlacklistFilter._
case class BlacklistFilter(facilities: Facilities, helloConfig: HelloConfig)(implicit asys: ActorSystem) extends GraphStage[FilterShape] {
val outPass: Outlet[QM[RenderedExpression]] = Outlet("Pass")
val outFail: Outlet[QM[Unit]] = Outlet("Fail")
val reIn: Inlet[QM[RenderedExpression]] = Inlet("Command")
override val shape: FilterShape = new FanOutShape2(reIn, outPass, outFail)
override def createLogic(inheritedAttributes: Attributes): GraphStageLogic = new GraphStageLogic(shape) {
override def preStart(): Unit = pull(reIn)
setHandler(reIn, new InHandler {
override def onPush(): Unit = {
val cmd = grab(reIn)
val re: RenderedExpression = cmd.body
val check = re.recipient.contacts(re.media).toString
// NEW NON-BLOCKING CODE
//-------------------------------------
facilities.withMongo(helloConfig.msgDB, helloConfig.blacklistColl) { coll =>
var found: Option[Document] = None
coll.find(Document("_id" -> check)).first().subscribe(
(doc: Document) => {
found = Some(doc)
println("BLACKLIST FAIL! " + check)
emit(outFail, cmd)
// no pull() here as this happens on complete below
},
(e: Throwable) => {
// Log something here!
emit(outFail, cmd)
pull(reIn)
},
() => {
if (found.isEmpty) {
println("BLACKLIST OK. " + check)
emit(outPass, cmd)
}
pull(reIn)
println("Pulled reIn...")
}
)
}
// OLD BLOCKING CASBAH CODE THAT WORKED
//-------------------------------------
// await(facilities.mongoAccess().mongo(helloConfig.msgDB, helloConfig.blacklistColl)(_.findOne(MongoDBObject("_id" -> check)))) match {
// case Some(_) => emit(outFail, cmd)
// case None => emit(outPass, cmd)
// }
// pull(reIn)
}
override def onUpstreamFinish(): Unit = {} // necessary for some reason!
})
setHandler(outPass, eagerTerminateOutput)
setHandler(outFail, eagerTerminateOutput)
}
}
Output:
BLACKLIST OK. jsmith#yahoo.com
Pulled reIn...
HERE IN SEND (TemplateRenderedExpression)!!!
ACK!
BLACKLIST OK. 919-919-9119
Pulled reIn...
You can see from the output that the first record flowed nicely to the SEND/ACK steps. The second record printed the BLACKLIST message, meaning it emitted outPass then called pull on reIn... but then nothing happens downstream.
Anyone know why this would work differently in Akka Streams than the Casbah version that worked fine (code shown commented out)?
(I could just convert the Mongo call to a Future and Await on it, and that should work like the old code, but that kinda defeats the whole point of going async!)
Well then... "never mind"! :-)
The code above seemed like it should work. I then noticed the Akka guys have just released a new version (2.0.1). I'm not sure what tweaks lay within, but whatever it was, the code above now works as I'd hoped w/o modification.
Left this post up just in case anyone hits a similar problem.

Scala work with multiple futures

I have two sources of data, each of which returns a List[Int]
// the first works with a PostgreSQL database
object Db {
def get: Future[List[Int]] = // impl
}
// the second is a remote service
object Rs {
def get: Future[List[Int]] = // impl
}
I want to return the two lists, but I do not know how to deal with exceptions:
Db may throw ConnectionUnavailable
Remote service: Bad Request or Internal Server Error
Both: TimeoutException
When I have results only from the db, I want to return those. If I have results from both the db and the remote service, I want to return the sum of the two lists.
How should I handle this case?
val total: Future[Int] =
Db.get.flatMap { dbResults =>
Rs.get.map { remoteResults =>
dbResults.sum + remoteResults.sum
}
}
or equivalently
val total: Future[Int] = for {
dbResults <- Db.get
remoteResults <- Rs.get
} yield dbResults.sum + remoteResults.sum
I explicitly annotated the result type for the sake of clarity but it's not necessary.
total is a Future[Int] holding either a successful or a failed computation. If you need to handle errors you can attach an onFailure handler to it. For instance
total.onFailure {
case e: TimeoutException => // ...
case e: ConnectionError => // ...
}
(the names of the exceptions are made up)
You need to combine flatMap and recover:
for {
db <- Db.get
rs <- Rs.get.recover {
case e =>
logger.error("Error requesting external service", e)
List.fill(db.length)(0)
}
} yield (db, rs).zipped.map(_+_).sum
You can tweak the transformations if you like (I assumed that you meant element-wise sum of lists), but the basic idea stays the same - if you want to "ignore" the failure of some future, you need to call recover on it.
If you want, you can extract the recovering function from the for comprehension, but recover still has to be called inside of it:
def handler(n: Int): PartialFunction[Throwable, List[Int]] = {
case e =>
logger.error("Error requesting external service", e)
List.fill(n)(0)
}
for {
db <- Db.get
rs <- Rs.get.recover(handler(db.length))
} yield (db, rs).zipped.map(_+_).sum
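For experimenting with the recover approach, here is a self-contained sketch. It keeps the assumption above of an element-wise sum, takes the two futures as parameters so that failures can be simulated, and uses zip instead of zipped; the name total and its parameter names are illustrative only.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.control.NonFatal

object RecoverDemo {
  // A failure of the remote service is replaced by a list of zeros,
  // so the db results survive on their own. A failure of the db
  // future is not recovered and fails the whole computation.
  def total(db: Future[List[Int]], rs: Future[List[Int]]): Future[Int] =
    for {
      d <- db
      r <- rs.recover { case NonFatal(_) => List.fill(d.length)(0) }
    } yield d.zip(r).map { case (a, b) => a + b }.sum
}
```

With db results List(1, 2, 3) and a failed remote call the total is 6; when the remote service answers List(10, 20, 30) it is 66.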