I am on Ubuntu 20.04. I want to write some data to MongoDB from Scala. Here's what I have:
import org.mongodb.scala.bson.collection.immutable.{Document => MongoDocument}
import org.mongodb.scala.{MongoClient, MongoCollection, MongoDatabase}
object Application extends App {
val mongoClient: MongoClient = MongoClient()
// Use a Connection String
//val mongoClient: MongoClient = MongoClient("mongodb://localhost")
val database: MongoDatabase = mongoClient.getDatabase("mydb")
val collection: MongoCollection[MongoDocument] = database.getCollection("user")
val doc: MongoDocument = MongoDocument("_id" -> 0, "name" -> "MongoDB", "type" -> "database",
"count" -> 1, "info" -> MongoDocument("x" -> 203, "y" -> 102))
collection.insertOne(doc)
val documents = (1 to 100) map { i: Int => MongoDocument("i" -> i) }
collection.insertMany(documents)
}
The error (not even an error, INFO level) I get:
Nov 16, 2020 1:42:08 AM com.mongodb.diagnostics.logging.JULLogger log
INFO: Cluster created with settings {hosts=[localhost:27017],
mode=SINGLE, requiredClusterType=UNKNOWN,
serverSelectionTimeout='30000 ms', maxWaitQueueSize=500}
And nothing happens to the database. No data appears there. No errors, no insertions into Mongo, nothing.
I used primarily these sources as examples:
https://mongodb.github.io/mongo-scala-driver/2.9/getting-started/quick-tour/
https://blog.knoldus.com/how-scala-interacts-with-mongodb/
MongoDB is up and its status is active. Inserting data from the terminal works. So the program's behavior is strange. I've searched everywhere for an answer but can't find one. Your help would be much appreciated. Thank you!
Thanks to @Luis Miguel Mejía Suárez for the help. Here's what I have done so far: I added an Observer implementation and a Promise, following this post: "Scala script wait for mongo to complete task". This is what I have now:
val mongoClient: MongoClient = MongoClient("mongodb://localhost")
val database: MongoDatabase = mongoClient.getDatabase("mydb")
val collection: MongoCollection[MongoDocument] = database.getCollection("user")
val doc: MongoDocument = MongoDocument("name" -> "MongoDB", "type" -> "database",
"count" -> 1, "info" -> MongoDocument("x" -> 203, "y" -> 102))
val observable: Observable[Completed] = collection.insertOne(doc)
val promise = Promise[Boolean]
observable.subscribe(new Observer[Completed] {
override def onNext(result: Completed): Unit = println("Inserted")
override def onError(e: Throwable): Unit = {
println("Failed")
promise.success(false)
}
override def onComplete(): Unit = {
println("Completed")
promise.success(true)
}
})
val future = promise.future
Await.result(future, Duration(5, java.util.concurrent.TimeUnit.SECONDS))
mongoClient.close()
Generally speaking, it works in most cases. However, I haven't handled the insertMany case, where the program has to wait for the last element to be inserted. My implementation does not work properly with it.
P.S. Turns out insertMany also works fine with this example, I just tested it with the wrong data.
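On waiting for the last element of a multi-element result: one option is to buffer every onNext and complete the Promise only in onComplete. Below is a minimal sketch using only the standard library. SimpleObserver and emitAsync are hypothetical stand-ins that mimic the three callbacks shown above; they are not driver APIs, and the real driver's threading may differ.

```scala
import scala.collection.mutable.ListBuffer
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object CollectAllResults {
  // Hypothetical stand-in with the same three callbacks as the Observer used above.
  trait SimpleObserver[T] {
    def onNext(result: T): Unit
    def onError(e: Throwable): Unit
    def onComplete(): Unit
  }

  // Hypothetical asynchronous source: emits each item on another thread, then completes.
  def emitAsync[T](items: Seq[T])(observer: SimpleObserver[T])(implicit ec: ExecutionContext): Unit = {
    Future {
      try { items.foreach(item => observer.onNext(item)); observer.onComplete() }
      catch { case e: Throwable => observer.onError(e) }
    }
    ()
  }

  // Buffers every onNext and completes the Future only when the source signals completion.
  def collect[T](subscribe: SimpleObserver[T] => Unit): Future[Seq[T]] = {
    val promise = Promise[Seq[T]]()
    val buffer = ListBuffer.empty[T]
    subscribe(new SimpleObserver[T] {
      def onNext(result: T): Unit = buffer.synchronized { buffer += result; () }
      def onError(e: Throwable): Unit = { promise.tryFailure(e); () }
      def onComplete(): Unit = { promise.trySuccess(buffer.synchronized(buffer.toList)); () }
    })
    promise.future
  }

  def main(args: Array[String]): Unit = {
    import ExecutionContext.Implicits.global
    val all = Await.result(collect[Int](observer => emitAsync(1 to 100)(observer)), 5.seconds)
    println(s"received ${all.size} items")
  }
}
```

Failing the Promise with the exception, rather than completing it with false, lets Await.result rethrow the original error to the caller.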
There is an example of a MongoDB transaction in Scala:
https://github.com/mongodb/mongo-scala-driver/blob/r2.4.0/driver/src/it/scala/org/mongodb/scala/DocumentationTransactionsExampleSpec.scala
But it's not clear how to roll back the transaction in case of failure.
Here is the code I copied from the official example and modified slightly so that the transaction fails on the second insertion (it inserts two documents with the same _id). The problem is that the first document is still persisted, and I need the WHOLE transaction to be rolled back.
import org.mongodb.scala._
import scala.concurrent.Await
import scala.concurrent.duration.Duration
object Application extends App {
val mongoClient: MongoClient = MongoClient("mongodb://localhost:27018")
val database = mongoClient.getDatabase("hr")
val employeesCollection = database.getCollection("employees")
// Implicit functions that execute the Observable and return the results
val waitDuration = Duration(5, "seconds")
implicit class ObservableExecutor[T](observable: Observable[T]) {
def execute(): Seq[T] = Await.result(observable.toFuture(), waitDuration)
}
implicit class SingleObservableExecutor[T](observable: SingleObservable[T]) {
def execute(): T = Await.result(observable.toFuture(), waitDuration)
}
updateEmployeeInfoWithRetry(mongoClient).execute()
Thread.sleep(3000)
/// -------------------------
def updateEmployeeInfo(database: MongoDatabase, observable: SingleObservable[ClientSession]): SingleObservable[ClientSession] = {
observable.map(clientSession => {
val eventsCollection = database.getCollection("events")
val transactionOptions = TransactionOptions.builder().readConcern(ReadConcern.SNAPSHOT).writeConcern(WriteConcern.MAJORITY).build()
clientSession.startTransaction(transactionOptions)
eventsCollection.insertOne(clientSession, Document("_id" -> "123", "employee" -> 3, "status" -> Document("new" -> "Inactive", "old" -> "Active")))
.subscribe((res: Completed) => println(res))
// THIS SHOULD FAIL, SINCE THERE IS ALREADY DOCUMENT WITH ID = 123, but PREVIOUS OPERATION SHOULD BE ALSO ROLLED BACK.
// I COULD NOT FIND THE WAY HOW TO ROLLBACK WHOLE TRANSACTION IF ONE OF OPERATIONS FAILED
eventsCollection.insertOne(clientSession, Document("_id" -> "123", "employee" -> 3, "status" -> Document("new" -> "Inactive", "old" -> "Active")))
.subscribe((res: Completed) => println(res))
// I'VE TRIED VARIOUS THINGS (INCLUDING CODE BELOW)
// .subscribe(new Observer[Completed] {
// override def onNext(result: Completed): Unit = println("onNext")
//
// override def onError(e: Throwable): Unit = clientSession.abortTransaction()
//
// override def onComplete(): Unit = println("complete")
// })
clientSession
})
}
def commitAndRetry(observable: SingleObservable[Completed]): SingleObservable[Completed] = {
observable.recoverWith({
case e: MongoException if e.hasErrorLabel(MongoException.UNKNOWN_TRANSACTION_COMMIT_RESULT_LABEL) => {
println("UnknownTransactionCommitResult, retrying commit operation ...")
commitAndRetry(observable)
}
case e: Exception => {
println(s"Exception during commit ...: $e")
throw e
}
})
}
def runTransactionAndRetry(observable: SingleObservable[Completed]): SingleObservable[Completed] = {
observable.recoverWith({
case e: MongoException if e.hasErrorLabel(MongoException.TRANSIENT_TRANSACTION_ERROR_LABEL) => {
println("TransientTransactionError, aborting transaction and retrying ...")
runTransactionAndRetry(observable)
}
})
}
def updateEmployeeInfoWithRetry(client: MongoClient): SingleObservable[Completed] = {
val database = client.getDatabase("hr")
val updateEmployeeInfoObservable: Observable[ClientSession] = updateEmployeeInfo(database, client.startSession())
val commitTransactionObservable: SingleObservable[Completed] =
updateEmployeeInfoObservable.flatMap(clientSession => clientSession.commitTransaction())
val commitAndRetryObservable: SingleObservable[Completed] = commitAndRetry(commitTransactionObservable)
runTransactionAndRetry(commitAndRetryObservable)
}
}
How do I roll back the whole transaction if any operation fails?
From the source code of the Scala driver at https://github.com/mongodb/mongo-scala-driver/blob/r2.6.0/driver/src/main/scala/org/mongodb/scala/ClientSessionImplicits.scala
it appears that there is an abortTransaction() method defined alongside commitTransaction().
On another note, a single replica set transaction in MongoDB 4.0 is currently aborted automatically if it is not committed within 60 seconds (configurable). From the MongoDB Multi-Document ACID Transactions blog post:
By default, MongoDB will automatically abort any multi-document transaction that runs for more than 60 seconds. Note that if write volumes to the server are low, you have the flexibility to tune your transactions for a longer execution time.
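The control flow this implies is: run the operations one after another, commit only if all of them succeeded, and abort otherwise. Below is a minimal sketch of that flow using only the standard library. FakeSession is a hypothetical in-memory stand-in, not the driver's ClientSession. With the real driver, the assumption is that each insert's result is chained (for example through toFuture() and flatMap) instead of being subscribed to independently, so that a failed insert reaches the abortTransaction() branch before any commit is attempted.

```scala
import scala.collection.mutable
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.{Failure, Success}
import ExecutionContext.Implicits.global

object AbortOnFailure {
  // Hypothetical in-memory stand-in for a session: writes are staged until commit.
  final class FakeSession(store: mutable.Map[String, String]) {
    private val staged = mutable.LinkedHashMap.empty[String, String]

    def insert(id: String, value: String): Future[Unit] = Future {
      if (store.contains(id) || staged.contains(id))
        throw new IllegalStateException(s"duplicate _id: $id")
      staged.put(id, value)
      ()
    }
    def commit(): Future[Unit] = Future { store ++= staged; staged.clear() }
    def abort(): Future[Unit] = Future { staged.clear() }
  }

  // Runs `body`; commits only if every step succeeded, otherwise aborts and re-raises the error.
  def inTransaction[A](session: FakeSession)(body: FakeSession => Future[A]): Future[A] =
    body(session).transformWith {
      case Success(result) => session.commit().map(_ => result)
      case Failure(error)  => session.abort().transformWith(_ => Future.failed(error))
    }

  def main(args: Array[String]): Unit = {
    val store = mutable.Map.empty[String, String]
    val outcome = inTransaction(new FakeSession(store)) { s =>
      for {
        _ <- s.insert("123", "first")
        _ <- s.insert("123", "second") // fails: same _id as the first insert
      } yield ()
    }
    val failed = Await.ready(outcome, 5.seconds).value.exists(_.isFailure)
    println(s"failed=$failed, stored=${store.size}")
  }
}
```

The for-comprehension is what makes the second insert wait for the first and stops the chain at the first failure; two independent subscribe calls give no such ordering.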
I'm practicing on a project that needs a database connection. I'm using the Play Framework combined with Scala and MongoDB.
I'm also using mongo-scala-driver and following its documentation.
I wrote exactly the same code:
println("start")
val mongoClient: MongoClient = MongoClient("mongodb://localhost:27017/Sandbox")
val database: MongoDatabase = mongoClient.getDatabase("test")
val collection: MongoCollection[Document] = database.getCollection("test")
val doc: Document = Document("_id" -> 0, "name" -> "MongoDB", "type" -> "database", "count" -> 1, "info" -> Document("x" -> 203, "y" -> 102))
collection.insertOne(doc).subscribe(new Observer[Completed] {
override def onSubscribe(subscription: Subscription): Unit = println("Subscribed")
override def onNext(result: Completed): Unit = println("Inserted")
override def onError(e: Throwable): Unit = println("Failed")
override def onComplete(): Unit = println("Completed")
})
mongoClient.close()
println("end")
Nothing is inserted into the database, and the only output I get in the log is this:
start
Subscribed
end
I've been looking on Stack Overflow for similar questions, but nothing I found worked for me.
You are inserting the document asynchronously.
That is why you must define the three callback functions onNext, onError and onComplete.
But you don't give the insertion time to execute before the connection is closed.
Try adding a delay before closing the connection. For example, simply add
Thread.sleep(1000)
before
mongoClient.close()
You also don't need to redefine onSubscribe().
Unless you want to control demand manually as you move through the list of documents returned by your requests, there is no need to override onSubscribe(). The default definition of onSubscribe() is perfectly usable for simple requests, so in your case you don't need to override it.
The following code works:
println("start")
val mongoClient: MongoClient = MongoClient("mongodb://DB01-MongoDB:27017/Sandbox")
val database: MongoDatabase = mongoClient.getDatabase("test")
val collection: MongoCollection[Document] = database.getCollection("test")
val doc: Document = Document("_id" -> 0,
"name" -> "MongoDB",
"type" -> "database",
"count" -> 1,
"info" -> Document("x" -> 203, "y" -> 102))
collection
.insertOne(doc)
.subscribe(new Observer[Completed] {
override def onNext(result: Completed): Unit = println("Inserted")
override def onError(e: Throwable): Unit = println("Failed")
override def onComplete(): Unit = println("Completed")
})
Thread.sleep(1000)
mongoClient.close()
println("end")
The problem was the Observer: I had imported it from org.mongodb.async.client, but the correct one is org.mongodb.scala.
Hope this helps someone else.
The above solution may work, but you pay one second on every insert (or any other call). Another solution is to make use of the callback:
val insertObservable = collection.insertOne(doc)
insertObservable.subscribe(new Observer[Completed] {
override def onNext(result: Completed): Unit = println("Inserted")
// Close on failure too, otherwise the client stays open when the insert fails
override def onError(e: Throwable): Unit = mongoClient.close()
override def onComplete(): Unit = mongoClient.close()
})
Once the insert completes, the connection is closed right away, without wasting a second.
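If the calling thread still has to wait for the work (for example in a short-lived program), a latch can replace the fixed sleep: the caller blocks only until the callback has fired. This is a minimal standard-library sketch. FakeClient and runThenClose are hypothetical names used for illustration, and a Future stands in for the driver's asynchronous insert.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}

object CloseWhenDone {
  // Hypothetical stand-in for a client; counts how many times close() ran.
  final class FakeClient {
    val closeCalls = new AtomicInteger(0)
    def close(): Unit = { closeCalls.incrementAndGet(); () }
  }

  // Runs `operation` asynchronously and closes the client on success and on failure alike.
  // The returned latch opens once the client has been closed.
  def runThenClose(client: FakeClient)(operation: () => Unit)(implicit ec: ExecutionContext): CountDownLatch = {
    val done = new CountDownLatch(1)
    Future(operation()).onComplete { outcome =>
      outcome match {
        case Success(_) => println("Completed")
        case Failure(e) => println(s"Failed: ${e.getMessage}")
      }
      client.close()
      done.countDown()
    }
    done
  }

  def main(args: Array[String]): Unit = {
    import ExecutionContext.Implicits.global
    val client = new FakeClient
    val done = runThenClose(client)(() => Thread.sleep(50)) // simulated insert
    // Returns as soon as the work is done; the 5 seconds are only an upper bound.
    val finished = done.await(5, TimeUnit.SECONDS)
    println(s"finished=$finished, closeCalls=${client.closeCalls.get}")
  }
}
```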
I'm following the MongoDB Scala Driver Quick Tour guide and trying to insert a document, but I keep seeing the following message whenever I do so:
INFO: No server chosen by WritableServerSelector from cluster
description ClusterDescription{type=UNKNOWN, connectionMode=SINGLE,
serverDescriptions=[ServerDescription{address=ds155695.mlab.com:55695,
type=UNKNOWN, state=CONNECTING}]}. Waiting for 30000 ms before timing
out
Here is what my code looks like
val url: String = "mongodb://heroku_#######:##############ds155695.mlab.com:55695/heroku_#########"
val mongoClient: MongoClient = MongoClient(url)
val db: MongoDatabase = mongoClient.getDatabase("heroku_#####")
val collection: MongoCollection[Document] = db.getCollection("omens")
println(collection)
val doc: Document = Document("_id" -> 3434, "name" -> "xxxxxx", "type" -> "yyyyyy")
val observable: Observable[Completed] = collection.insertOne(doc)
observable.subscribe(new Observer[Completed] {
override def onNext(result: Completed): Unit = println("Inserted")
override def onError(e: Throwable): Unit = println("Failed")
override def onComplete(): Unit = println("Completed")
})
If I change my code to use Futures, I get the same message. However, if I use Await to explicitly wait for a while, it works. The following is defined in Helpers.scala, as suggested by the quick tour.
trait ImplicitObservable[C] {
val observable: Observable[C]
val converter: (C) => String
def results(): Seq[C] = Await.result(observable.toFuture(), Duration(10, TimeUnit.SECONDS))
def headResult() = Await.result(observable.head(), Duration(10, TimeUnit.SECONDS))
def printResults(initial: String = ""): Unit = {
if (initial.length > 0) print(initial)
results().foreach(res => println(converter(res)))
}
def printHeadResult(initial: String = ""): Unit = println(s"${initial}${converter(headResult())}")
}
And if I do the following, it'll work and insert the document.
val result = collection.insertOne(doc).results()
But I find that a little suboptimal and would rather use Futures or an Observable. Can someone point out what I'm doing wrong?
Here is the full stack trace
INFO: Exception in monitor thread while connecting to server ds155695.mlab.com:55695
com.mongodb.MongoInterruptedException: Opening the AsynchronousSocketChannelStream failed
at com.mongodb.connection.FutureAsyncCompletionHandler.get(FutureAsyncCompletionHandler.java:59)
at com.mongodb.connection.FutureAsyncCompletionHandler.getOpen(FutureAsyncCompletionHandler.java:44)
at com.mongodb.connection.AsynchronousSocketChannelStream.open(AsynchronousSocketChannelStream.java:62)
at com.mongodb.connection.InternalStreamConnection.open(InternalStreamConnection.java:115)
at com.mongodb.connection.DefaultServerMonitor$ServerMonitorRunnable.run(DefaultServerMonitor.java:113)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1302)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
at com.mongodb.connection.FutureAsyncCompletionHandler.get(FutureAsyncCompletionHandler.java:57)
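On using Futures without Await around every call: the results can be transformed and combined without blocking, with a single wait at the very end of the program so the JVM stays alive until the work has finished. This is a minimal standard-library sketch. insertAsync is a hypothetical stand-in for an asynchronous insert, not a driver call.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import ExecutionContext.Implicits.global

object ComposeThenWaitOnce {
  // Hypothetical stand-in for an asynchronous insert that reports the inserted _id.
  def insertAsync(id: Int): Future[Int] = Future { Thread.sleep(20); id }

  // Non-blocking pipeline: transform the result and turn failures into a message.
  def insertAndDescribe(id: Int): Future[String] =
    insertAsync(id)
      .map(inserted => s"Inserted $inserted")
      .recover { case e => s"Failed: ${e.getMessage}" }

  def main(args: Array[String]): Unit = {
    val pipeline = Future.sequence(Seq(3434, 3435).map(insertAndDescribe))
    // The only blocking call, placed at the edge of the program.
    Await.result(pipeline, 10.seconds).foreach(println)
  }
}
```

In a long-running application (a web server, for instance) the final Await is usually unnecessary, because the process does not exit while the Future is still running.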
I'm writing a simple Scala script which is supposed to insert some data into a Mongo collection. The problem is that the script exits before Mongo finishes its task. What is the idiomatic/best approach to deal with this problem, given the following script?
#!/usr/bin/env scalas
/***
scalaVersion := "2.12.2"
libraryDependencies ++= {
Seq(
"org.mongodb.scala" %% "mongo-scala-driver" % "2.1.0"
)
}
*/
import org.mongodb.scala._
val mongoClient: MongoClient = MongoClient("mongodb://localhost")
val database: MongoDatabase = mongoClient.getDatabase("dev")
val doc: Document = Document("name" -> "MongoDB", "type" -> "database",
"count" -> 1, "info" -> Document("x" -> 203, "y" -> 102))
val collection: MongoCollection[Document] = database.getCollection("test")
val subscription = new Observer[Completed] {
override def onNext(result: Completed): Unit = println("Inserted")
override def onError(e: Throwable): Unit = println("Failed"+e.toString)
override def onComplete(): Unit = println("Completed")
}
collection.insertOne(doc).subscribe(subscription)
The script above produces the following error when executed:
com.mongodb.MongoInterruptedException: Interrupted acquiring a permit to retrieve an item from the pool
However, if I add Thread.sleep(3000) in the end it completes just fine.
I recommend using a Promise object to signal completion of asynchronous jobs.
http://www.scala-lang.org/api/2.12.1/scala/concurrent/Promise.html
The program then exits once the asynchronous job finishes. If the timeout elapses first, Await.result throws a TimeoutException.
val promise = Promise[Boolean]
...
override def onError(e: Throwable): Unit = {
println("Failed"+e.toString)
promise.success(false)
}
override def onComplete(): Unit = {
println("Completed")
promise.success(true)
}
val future = promise.future
Await.result(future, Duration(10, java.util.concurrent.TimeUnit.SECONDS))
//after completion, the program would exit.
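The wait above can end in three ways: onComplete ran, onError ran, or neither ran before the timeout. Below is a minimal standard-library sketch that tells them apart. The describe helper is a hypothetical name, and the promises in main are completed by hand to simulate the callbacks.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object AwaitOutcomes {
  // Blocks for at most `limit` and reports how the wait ended.
  def describe(future: Future[Boolean], limit: FiniteDuration): String =
    Try(Await.result(future, limit)) match {
      case Success(true)                => "completed"
      case Success(false)               => "failed (reported by onError)"
      case Failure(_: TimeoutException) => "timed out"
      case Failure(e)                   => s"failed: ${e.getMessage}"
    }

  def main(args: Array[String]): Unit = {
    val finished = Promise[Boolean]().success(true)  // as if onComplete had run
    val errored  = Promise[Boolean]().success(false) // as if onError had run
    val pending  = Promise[Boolean]()                // neither callback ever runs
    println(describe(finished.future, 1.second))
    println(describe(errored.future, 1.second))
    println(describe(pending.future, 100.millis))
  }
}
```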
I want to insert a new entry into an embedded field of an existing document in MongoDB using Spark.
Example
Blog {
id:"001",
title:"This is a test blog",
content:"....",
comments:[{title:"comment1",content:".."},{title:"comment2",content:"..."}]
}
What I want to do
db.blogs.update({id:"001"}, {$push:{comments:{title:"commentX",content:".."}}});
Is this currently possible in this library? If not, can you please point me in the right direction?
Thanks in advance.
I was able to do the operation I wanted using the Casbah library for Spark-MongoDB.
import java.sql.Timestamp
import java.util.Date
import com.mongodb.casbah.MongoClient
import com.mongodb.casbah.commons.MongoDBObject
import com.mongodb.casbah.query.Imports._
object TestCasbah {
def main(args: Array[String]) {
val mongoClient = MongoClient("172.18.96.45", 27017)
val db = mongoClient("agentCallRecord")
val coll = db("CallDetails")
val query = MongoDBObject("agentId" -> "agent_1")
val callRatingMongoObject = MongoDBObject("audioId" -> 12351,"startTime" -> new Timestamp(new Date().getTime).toString, "endTime" -> new Timestamp(new Date().getTime).toString, "totalScore" -> 1, "sentiment" -> "NEGATIVE")
val update = $push("callRating" -> callRatingMongoObject)
coll.update(query, update)
}
}