Make CRUD operations with ReactiveMongo - scala

I recently started learning Scala and am trying to create a simple API using Akka HTTP and ReactiveMongo.
I am having problems with simple operations. I have spent a lot of time digging through the docs, official tutorials, Stack Overflow, etc. I am probably missing something very simple.
My code:
object MongoDB {
val config = ConfigFactory.load()
val database = config.getString("mongodb.database")
val servers = config.getStringList("mongodb.servers").asScala
val credentials = List(Authenticate(database, config.getString("mongodb.userName"), config.getString("mongodb.password")))
val driver = new MongoDriver
val connection = driver.connection(servers, authentications = credentials)
//val db = connection.database(database)
}
Now I would like to perform basic CRUD operations. I have tried different code snippets but can't get any of them working.
Here are some examples:
object TweetManager {
import MongoDB._
//taken from docs
val collection = connection.database("test").
map(_.collection("tweets"))
val document1 = BSONDocument(
"author" -> "Tester",
"body" -> "test"
)
//taken from the reactivemongo tutorial; it had an extra BSONCollection parameter, but I can't find a way of getting it
def insertDoc1(doc: BSONDocument): Future[Unit] = {
//another try of getting the collection
//def collection = for ( db1 <- db) yield db1.collection[BSONCollection]("tweets")
val writeRes: Future[WriteResult] = collection.insert(doc)
writeRes.onComplete { // Dummy callbacks
case Failure(e) => e.printStackTrace()
case Success(writeResult) =>
println(s"successfully inserted document with result: $writeResult")
}
writeRes.map(_ => {}) // in this example, do nothing with the success
}
}
insertDoc1(document1)
I can't do any operation on the collection. IDE gives me: "cannot resolve symbol". Compiler gives error:
value insert is not a member of scala.concurrent.Future[reactivemongo.api.collections.bson.BSONCollection]
What is the correct way of doing it?

You are trying to call the insert operation on a Future[Collection] rather than on the underlying collection. Calling an operation on Future[T] rather than on T is a general mistake, not one specific to ReactiveMongo.
It's recommended to have a look at the documentation.
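A minimal sketch of the point above, using only the Scala standard library. FakeCollection and its insert method are hypothetical stand-ins rather than ReactiveMongo's API; the sketch only illustrates that an operation defined on T has to be reached through flatMap (or a for-comprehension) when you hold a Future[T].

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object InsertSketch {
  // Hypothetical stand-in for a driver collection; NOT ReactiveMongo's API.
  final class FakeCollection(val name: String) {
    // Pretends to write the document and reports one inserted document.
    def insert(doc: Map[String, String]): Future[Int] = Future.successful(1)
  }

  // Like connection.database(...).map(_.collection(...)), this is a Future.
  def collection: Future[FakeCollection] =
    Future.successful(new FakeCollection("tweets"))

  // collection.insert(doc) would not compile: insert is a member of
  // FakeCollection, not of Future[FakeCollection].
  // flatMap reaches the collection and flattens the Future returned by insert.
  def insertDoc(doc: Map[String, String]): Future[Int] =
    collection.flatMap(_.insert(doc))
}
```

The same shape should apply to the question's code: resolve the collection inside a flatMap or a for-comprehension, then call insert on the resolved value.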

Related

using akka streams to go over mongo collection

I have a collection of people in Mongo, and I want to go over each person in the collection as a stream. For each person, I want to call a method that performs an API call, changes the model, and inserts the result into a new collection in Mongo.
It looks like this:
def processPeople()(implicit m: Materializer): Future[Unit] = {
val peopleSource: Source[Person, Future[State]] = collection.find(json()).cursor[Person]().documentSource()
peopleSource.runWith(Sink.seq[Person]).map(people => {
people.foreach(person => {
changeModelAndInsertToNewCollection(person)
})
})
}
But this is not working. The part that changes the model seems to work, but the insert into Mongo does not.
It also looks like the method does not start right away; some processing goes on in the background for about a minute before it starts. Do you see the issue?
Solution 1 :
def changeModelAndInsertToNewCollection(person:Person) : Future[Boolean] ={
//Todo : call mongo api to update the person
???
}
def processPeople()(implicit m: Materializer): Future[Done] = {
val numberOfConcurrentUpdate = 10
val peopleSource: Source[Person, Future[State]] =
collection
.find(json())
.cursor[Person]()
.documentSource()
peopleSource
.mapAsync(numberOfConcurrentUpdate)(changeModelAndInsertToNewCollection)
.withAttributes(ActorAttributes.supervisionStrategy(Supervision.restartingDecider))
.runWith(Sink.ignore)
}
Solution 2 :
Use Alpakka as the Akka Streams connector for Mongo:
val source: Source[Document, NotUsed] =
MongoSource(collection.find(json()).cursor[Person]().documentSource())
source.runWith(MongoSink.updateOne(2, collection))
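Independently of the streaming approach, one likely reason the original version misbehaves is that foreach discards the value returned for each person. Assuming, as Solution 1 does, that changeModelAndInsertToNewCollection returns a Future, the outer Future can complete before any insert has finished. The sketch below shows the difference with plain Futures; Person, the in-memory queue, and the method body are hypothetical stand-ins, not the asker's code.

```scala
import java.util.concurrent.ConcurrentLinkedQueue
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object ProcessPeopleSketch {
  final case class Person(name: String)

  // Hypothetical in-memory stand-in for the target Mongo collection.
  val inserted = new ConcurrentLinkedQueue[String]()

  // Stand-in for the real method; assumed to return a Future, as in Solution 1.
  def changeModelAndInsertToNewCollection(person: Person): Future[Boolean] =
    Future(inserted.add(person.name.toUpperCase))

  // Future.traverse keeps every per-person Future, so the result completes
  // only after all inserts have completed (or fails if any of them fails).
  def processPeople(people: List[Person]): Future[Unit] =
    Future.traverse(people)(changeModelAndInsertToNewCollection).map(_ => ())
}
```

Note that Future.traverse starts all the work at once, so it does not bound concurrency the way mapAsync does. For a large collection, the streaming version in Solution 1 is the better fit.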

Why a Thread.sleep or closing the connection is required after waiting for a remove call to complete?

I'm again seeking you to share your wisdom with me, the scala padawan!
I'm playing with ReactiveMongo in Scala, and while writing a test using ScalaTest, I ran into the following issue.
First the code:
"delete" when {
"passing an existent id" should {
"succeed" in {
val testRecord = TestRecord(someString)
Await.result(persistenceService.persist(testRecord), Duration.Inf)
Await.result(persistenceService.delete(testRecord.id), Duration.Inf)
Thread.sleep(1000) // Why do I need this to make the test succeed?
val thrownException = intercept[RecordNotFoundException] {
Await.result(persistenceService.read(testRecord.id), Duration.Inf)
}
thrownException.getMessage should include(testRecord._id.toString)
}
}
}
And the read and delete methods with the code initializing connection to db (part of the constructor):
class MongoPersistenceService[R](url: String, port: String, databaseName: String, collectionName: String) {
val driver = MongoDriver()
val parsedUri: Try[MongoConnection.ParsedURI] = MongoConnection.parseURI("%s:%s".format(url, port))
val connection: Try[MongoConnection] = parsedUri.map(driver.connection)
val mongoConnection = Future.fromTry(connection)
def db: Future[DefaultDB] = mongoConnection.flatMap(_.database(databaseName))
def collection: Future[BSONCollection] = db.map(_.collection(collectionName))
def read(id: BSONObjectID): Future[R] = {
val query = BSONDocument("_id" -> id)
val readResult: Future[R] = for {
coll <- collection
record <- coll.find(query).requireOne[R]
} yield record
readResult.recover {
case NoSuchResultException => throw RecordNotFoundException(id)
}
}
def delete(id: BSONObjectID): Future[Unit] = {
val query = BSONDocument("_id" -> id)
// first read then call remove. Read will throw if not present
read(id).flatMap { (_) => collection.map(coll => coll.remove(query)) }
}
}
So to make my test pass, I had to add a Thread.sleep right after waiting for the delete to complete. Knowing that this is an evil usually punished by many lashes, I want to learn and find the proper fix here.
While trying other things, I found that, instead of waiting, entirely closing the connection to the db also did the trick...
What am I misunderstanding here? Should a connection to the db be opened and closed for each call to it, rather than performing many actions like adding, removing, and updating records with one connection?
Note that everything works fine when I remove the read call in my delete function. Also, by closing the connection, I mean calling close on the MongoDriver from my test and also stopping and restarting the embedded Mongo that I'm using in the background.
Thanks for helping guys.
Warning: this is a blind guess, I've no experience with MongoDB on Scala.
You may have forgotten to flatMap
Take a look at this bit:
collection.map(coll => coll.remove(query))
Since collection is a Future[BSONCollection] per your code and remove returns a Future[WriteResult] per the docs, the actual type of this expression is Future[Future[WriteResult]].
Now, you have annotated your function as returning Future[Unit]. Scala often produces Unit as a return value by throwing away possibly meaningful values, which is what it does in your case:
read(id).flatMap { (_) =>
collection.map(coll => {
coll.remove(query) // we didn't wait for removal
() // before returning unit
})
}
So your code should probably be
read(id).flatMap(_ => collection.flatMap(_.remove(query).map(_ => ())))
Or a for-comprehension:
for {
_ <- read(id)
coll <- collection
_ <- coll.remove(query)
} yield ()
You can make Scala warn you about discarded values by adding a compiler flag (assuming SBT):
scalacOptions += "-Ywarn-value-discard"
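To make the difference concrete, here is a small self-contained sketch using only the standard library. FakeCollection is a hypothetical stand-in for BSONCollection, and its remove method simulates a slow write; the names are illustrative and not part of ReactiveMongo.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object DeleteSketch {
  // Hypothetical stand-in for BSONCollection; NOT ReactiveMongo's API.
  final class FakeCollection {
    @volatile var removed = false
    // Simulates a write that takes a little while to finish.
    def remove(): Future[Int] = Future {
      Thread.sleep(100)
      removed = true
      1
    }
  }

  // Mirrors the original code: the Future returned by remove is dropped,
  // so the result can complete before the removal has happened.
  def deleteWithMap(collection: Future[FakeCollection]): Future[Unit] =
    collection.map { coll => coll.remove(); () }

  // flatMap chains the removal, so the result completes only afterwards.
  def deleteWithFlatMap(collection: Future[FakeCollection]): Future[Unit] =
    collection.flatMap(_.remove().map(_ => ()))
}
```

Awaiting deleteWithFlatMap guarantees that removed is true afterwards. Awaiting deleteWithMap gives no such guarantee, which would explain why a Thread.sleep appeared to fix the test.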

ReactiveMongo database dump with Play Framework 2.5

I'm trying to dump my Mongo database into a JSON object, but because my queries to the database are asynchronous, I'm having problems.
Each collection in my database contains user data and each collection name is a user name.
So, when I want to get all my users data I recover all the collection names and then loop over them to recover each collection one by one.
def databaseDump(prom : Promise[JsObject]) = {
val res = for{
dbUsers <- getUsers
} yield dbUsers
var rebuiltJson = Json.obj()
var array = JsArray()
res.map{ users =>
users.map{ userNames =>
if(userNames.size == 0){
prom failure new Throwable("Empty database")
}
var counter = 0
userNames.foreach { username =>
getUserTables(username).map { tables =>
/* Add data to array*/
...
counter += 1
if(counter == userNames.size){
/*Add data to new json*/
...
prom success rebuiltJson
}
}
}
}
}
This kind of works, but sometimes the promise is successfully completed even though not all of the data has been retrieved yet. This is because my counter variable is not a reliable solution.
Is there a way to loop over all the users, query the database, and wait for all the data to be retrieved before successfully completing the promise? I tried to use a for-comprehension but didn't find a way to do it. Is there a way to dump a whole Mongo DB into one JSON object: { username : data, username : data ..} ?
The users/tables terminology was getting me confused, so I wrote a new function that dumps a database into a single JsObject.
// helper function to find all documents inside a collection c
// and return them as a single JsArray
def getDocs(c: JSONCollection)(implicit ec: ExecutionContext) = c.find(Json.obj()).cursor[JsObject]().jsArray()
def dumpToJsObject(db: DefaultDB)(implicit ec: ExecutionContext): Future[JsObject] = {
// get a list of all collections in the db
val collectionNames = db.collectionNames
val collections = collectionNames.map(_.map(db.collection[JSONCollection](_)))
// each entry is a tuple collectionName -> content (as JsArray)
val namesToDocs = collections.flatMap {
colls => Future.sequence(colls.map(c => getDocs(c).map(c.name -> _)))
}
// convert to a single JsObject
namesToDocs.map(JsObject(_))
}
I haven't tested it yet (I will do so later), but this function should at least give you the general idea. You get the list of all collections inside the database. For each collection, you perform a query to get all documents inside that collection. The list of documents is converted into a JsArray, and finally all collections are composed to a single JsObject with the collection names as keys.
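The core of that idea is Future.sequence, which replaces the mutable counter and the Promise. The sketch below shows the same composition with the standard library only; the in-memory data map and the collectionNames and docs helpers are hypothetical stand-ins for the database calls, and plain strings stand in for JSON documents.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object DumpSketch {
  // Hypothetical in-memory stand-in for the database contents.
  private val data: Map[String, Seq[String]] =
    Map("alice" -> Seq("a1", "a2"), "bob" -> Seq("b1"))

  // Stand-in for db.collectionNames.
  def collectionNames: Future[List[String]] =
    Future.successful(data.keys.toList.sorted)

  // Stand-in for querying all documents of one collection.
  def docs(name: String): Future[Seq[String]] = Future(data(name))

  // One Future per collection, combined into a single Future that completes
  // only when every per-collection query has completed.
  def dump: Future[Map[String, Seq[String]]] =
    collectionNames
      .flatMap(names => Future.sequence(names.map(n => docs(n).map(n -> _))))
      .map(_.toMap)
}
```

Because the combined Future fails if any single query fails, the caller gets error handling without extra bookkeeping.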
If the goal is to write the data to an output stream (local file or network) with side effects, the following approach can be used.
import scala.concurrent.{ ExecutionContext, Future }
import reactivemongo.bson.BSONDocument
import reactivemongo.api.{ Cursor, MongoDriver, MongoConnection }
val mongoUri = "mongodb://localhost:27017/my_db"
val driver = new MongoDriver
val maxDocs = Int.MaxValue // max per collection
// Requires an ExecutionContext in scope
// (e.g. `import scala.concurrent.ExecutionContext.Implicits.global`)
def dump()(implicit ec: ExecutionContext): Future[Unit] = for {
uri <- Future.fromTry(MongoConnection.parseURI(mongoUri))
con = driver.connection(uri)
dn <- Future(uri.db.get)
db <- con.database(dn)
cn <- db.collectionNames
_ <- Future.sequence(cn.map { collName =>
println(s"Collection: $collName")
db.collection(collName).find(BSONDocument.empty). // findAll
cursor[BSONDocument]().foldWhile({}, maxDocs) { (_, doc) =>
// Replace println by appropriate side-effect
Cursor.Cont(println(s"- ${BSONDocument pretty doc}"))
}
})
} yield ()
If using the JSON serialization pack, just replace BSONDocument with JsObject (e.g. BSONDocument.empty ~> Json.obj()).
If testing from the Scala REPL, after pasting the previous code, it can be executed as follows.
dump().onComplete {
case result =>
println(s"Dump result: $result")
//driver.close()
}

Slick - What if database does not contain result

I am trying to build a simple RESTful service that performs CRUD operations on a database and returns JSON. I have a service adhering to an API like this
GET mydomain.com/predictions/some%20string
I use a DAO which contains the following method that I have created to retrieve the associated prediction:
def getPrediction(rawText: String): Prediction = {
val predictionAction = predictions.filter{_.rawText === rawText}.result
val header = predictionAction.head
val f = db.run(header)
f.onComplete{case pred => pred}
throw new Exception("Oops")
}
However, this can't be right, so I started reading about Option. I changed my code accordingly:
def getPrediction(rawText: String): Option[Prediction] = {
val predictionAction = predictions.filter{_.rawText === rawText}.result
val header = predictionAction.headOption
val f = db.run(header)
f.onSuccess{case pred => pred}
None
}
This still doesn't feel quite right. What is the best way to invoke these filters, return the results, and handle any uncertainty?
I think the best way to rewrite your code is like this:
def getPrediction(rawText: String): Future[Option[Prediction]] = {
db.run(predictions.filter(_.rawText === rawText).result.headOption)
}
In other words, return a Future instead of the plain result. This way, the database actions will execute asynchronously, which is the preferred way for both Play and Akka.
The client code will then work with the Future. For instance, a Play action would look like this:
def prediction = Action.async {
predictionDao.getPrediction("some string").map { pred =>
Ok(views.html.predictions.show(pred))
}.recover {
case ex =>
logger.error(ex)
BadRequest()
}
}
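The Play action above passes the Option straight to the view. If the missing case should produce its own response, the caller can pattern match on the Option inside map. Below is a minimal sketch of that handling with the standard library only; the in-memory table, the Prediction fields, and the string responses are hypothetical stand-ins for Slick and Play types.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.control.NonFatal

object PredictionSketch {
  // Hypothetical model; the real Prediction fields are not shown in the question.
  final case class Prediction(rawText: String, label: String)

  // Hypothetical in-memory stand-in for the predictions table.
  private val table = Seq(Prediction("some string", "positive"))

  // Same shape as the DAO method: an absent row is None, not an exception.
  def getPrediction(rawText: String): Future[Option[Prediction]] =
    Future.successful(table.find(_.rawText == rawText))

  // Strings stand in for HTTP responses (Ok, NotFound, InternalServerError).
  def respond(rawText: String): Future[String] =
    getPrediction(rawText)
      .map {
        case Some(p) => s"200 ${p.label}"
        case None    => "404 not found"
      }
      .recover { case NonFatal(_) => "500 error" }
}
```

With this split, None covers "no such row" while recover covers genuine failures such as a lost database connection.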

Facing Issue in using Scala + Slick + MySQL+ Akka + Stream

Problem statement: We add all incoming request parameters of a user for a particular module to a MySQL table, one row per request (this is a huge amount of data). Now we want to design a process that reads each record from this table, gets more information about the user's request by calling third-party APIs, and then stores the returned meta information in another table.
Current attempts:
I am using Scala + Slick to do this. As the amount of data to read is huge, I want to read this table one row at a time and process it. I tried using Slick + Akka Streams, but I am getting 'java.util.concurrent.RejectedExecutionException'.
The following is the rough logic that I have tried:
implicit val system = ActorSystem("Example")
import system.dispatcher
implicit val materializer = ActorMaterializer()
val future = db.stream(SomeQuery.result)
Source.fromPublisher(future).map(row => {
dataEnrichmentAPI.process(row)
}).runForeach(id => println("Processed row : "+ id))
dataEnrichmentAPI.process : This function makes a third-party REST call and also runs some DB queries to get the required data. Each DB query is run using the 'db.run' method, and the function waits until it finishes (using Await).
e.g.,
def process(row: RequestRecord): Int = {
// SomeQuery2 = Check if data is already there in DB
val retId: Seq[Int] = Await.result(db.run(SomeQuery2.result), Duration.Inf)
if(retId.isEmpty){
val metaData = RestCall()
// SomeQuery3 = Store this metaData in DB
Await.result(db.run(SomeQuery3.result), Duration.Inf)
return metaData.id;
}else{
// SomeQuery4 = Get meta data id
return Await.result(db.run(SomeQuery4.result), Duration.Inf)
}
}
I am getting this exception where I make the blocking call to the DB. I don't think I can get rid of it, as the return value is required for the later flow to continue.
Is the blocking call the reason behind this exception?
What is the best practice for solving this kind of problem?
Thanks.
I don't know if this is your problem (too few details), but you should never block.
Speaking of best practices, use async stages instead.
This is more or less what your code would look like without using Await.result:
def process(row: RequestRecord): Future[Int] = {
db.run(SomeQuery2.result) flatMap {
case retId if retId.isEmpty =>
// what is this? is it a sync call? if it's a rest call it should return a future
val metaData = RestCall()
db.run(SomeQuery3.result).map(_ => metaData.id)
case _ => db.run(SomeQuery4.result)
}
}
Source.fromPublisher(db.stream(SomeQuery.result))
// choose your own parallelism
.mapAsync(2)(dataEnrichmentAPI.process)
.runForeach(id => println("Processed row : "+ id))
This way you will be handling backpressure and parallelism explicitly and idiomatically.
Try to never call Await.result in production code, and compose futures only with map, flatMap, and for-comprehensions.