update + push support in spark-mongodb library? - mongodb

I want to insert a new entry into an embedded array field of an existing document in MongoDB using Spark.
Example
Blog {
id:"001",
title:"This is a test blog",
content:"....",
comments:[{title:"comment1",content:".."},{title:"comment2",content:"..."}]
}
What I want to do
db.blogs.update({id:"001"}, {$push:{comments:{title:"commentX",content:".."}}});
Is this currently possible with this library? If not, can you please point me in the right direction?
Thanks in Advance

I was able to do the operation I wanted using the Casbah library (a Scala driver for MongoDB):
import java.sql.Timestamp
import java.util.Date
import com.mongodb.casbah.MongoClient
import com.mongodb.casbah.commons.MongoDBObject
import com.mongodb.casbah.query.Imports._
object TestCasbah {
  def main(args: Array[String]): Unit = {
    val mongoClient = MongoClient("172.18.96.45", 27017)
    val db = mongoClient("agentCallRecord")
    val coll = db("CallDetails")
    // Select the document to update
    val query = MongoDBObject("agentId" -> "agent_1")
    val callRatingMongoObject = MongoDBObject(
      "audioId" -> 12351,
      "startTime" -> new Timestamp(new Date().getTime).toString,
      "endTime" -> new Timestamp(new Date().getTime).toString,
      "totalScore" -> 1,
      "sentiment" -> "NEGATIVE")
    // Append the new entry to the embedded "callRating" array
    val update = $push("callRating" -> callRatingMongoObject)
    coll.update(query, update)
  }
}

Related

Scala does not write into MongoDB

I am on Ubuntu 20.04. I want to write some data in Scala to MongoDB. Here's what I have:
import org.mongodb.scala.bson.collection.immutable.{Document => MongoDocument}
import org.mongodb.scala.{MongoClient, MongoCollection, MongoDatabase}
object Application extends App {
val mongoClient: MongoClient = MongoClient()
// Use a Connection String
//val mongoClient: MongoClient = MongoClient("mongodb://localhost")
val database: MongoDatabase = mongoClient.getDatabase("mydb")
val collection: MongoCollection[MongoDocument] = database.getCollection("user")
val doc: MongoDocument = MongoDocument("_id" -> 0, "name" -> "MongoDB", "type" -> "database",
"count" -> 1, "info" -> MongoDocument("x" -> 203, "y" -> 102))
collection.insertOne(doc)
val documents = (1 to 100) map { i: Int => MongoDocument("i" -> i) }
collection.insertMany(documents)
}
The error (not even an error, INFO level) I get:
Nov 16, 2020 1:42:08 AM com.mongodb.diagnostics.logging.JULLogger log
INFO: Cluster created with settings {hosts=[localhost:27017],
mode=SINGLE, requiredClusterType=UNKNOWN,
serverSelectionTimeout='30000 ms', maxWaitQueueSize=500}
And nothing happens to the database. No data appears there. No errors, no insertions into Mongo, nothing.
I used primarily these sources as examples:
https://mongodb.github.io/mongo-scala-driver/2.9/getting-started/quick-tour/
https://blog.knoldus.com/how-scala-interacts-with-mongodb/
MongoDB is up and its status is active. Inserting data from the terminal works. So the program's behavior is strange. I've searched everywhere for an answer but can't seem to find one. Your help would be much appreciated. Thank you!
Thanks to @Luis Miguel Mejía Suárez's help, here's what I have done so far: I added an Observer implementation and a Promise, following this post: Scala script wait for mongo to complete task. This is what I have now:
import org.mongodb.scala.bson.collection.immutable.{Document => MongoDocument}
import org.mongodb.scala.{Completed, MongoClient, MongoCollection, MongoDatabase, Observable, Observer}
import scala.concurrent.{Await, Promise}
import scala.concurrent.duration.Duration
val mongoClient: MongoClient = MongoClient("mongodb://localhost")
val database: MongoDatabase = mongoClient.getDatabase("mydb")
val collection: MongoCollection[MongoDocument] = database.getCollection("user")
val doc: MongoDocument = MongoDocument("name" -> "MongoDB", "type" -> "database",
  "count" -> 1, "info" -> MongoDocument("x" -> 203, "y" -> 102))
val observable: Observable[Completed] = collection.insertOne(doc)
val promise = Promise[Boolean]()
// Nothing is sent to the server until the Observable is subscribed to
observable.subscribe(new Observer[Completed] {
  override def onNext(result: Completed): Unit = println("Inserted")
  override def onError(e: Throwable): Unit = {
    println("Failed")
    promise.success(false)
  }
  override def onComplete(): Unit = {
    println("Completed")
    promise.success(true)
  }
})
val future = promise.future
// Block until the insert finishes so the program does not exit first
Await.result(future, Duration(5, java.util.concurrent.TimeUnit.SECONDS))
mongoClient.close()
Generally speaking, it works in most cases. However, I didn't handle the insertMany case, where the program has to wait for the last element to be inserted; my implementation does not work properly with it.
P.S. It turns out insertMany also works fine with this example; I had just tested it with the wrong data.
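The waiting pattern described above can be sketched with the standard library alone. This is only an illustration: `insertAsync` is a hypothetical stand-in for a driver call that reports its outcome through callbacks, and it is not part of the MongoDB driver. The point is that a Promise turns the callbacks into a Future the main thread can block on.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success}

object AwaitCallbackSketch {
  // Hypothetical stand-in for an asynchronous insert that reports through callbacks.
  // It fails for an empty document so that both paths can be exercised.
  def insertAsync(doc: String)(onComplete: () => Unit, onError: Throwable => Unit): Unit =
    Future {
      if (doc.isEmpty) throw new IllegalArgumentException("empty document")
    }.onComplete {
      case Success(_) => onComplete()
      case Failure(e) => onError(e)
    }

  // Bridge the callbacks to a Future, then block until it completes or the timeout expires.
  def insertAndWait(doc: String, timeout: FiniteDuration = 5.seconds): Boolean = {
    val promise = Promise[Boolean]()
    insertAsync(doc)(
      onComplete = () => promise.trySuccess(true),
      onError = _ => promise.trySuccess(false)
    )
    Await.result(promise.future, timeout)
  }
}
```

Calling `AwaitCallbackSketch.insertAndWait("some document")` blocks until the simulated insert has finished, which is what keeps a short-lived program from exiting before the work is done.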

Getting datatype of all columns in a dataframe using scala

I have a data frame with some data loaded into it (shown as an image in the original post).
I am trying to write code that reads data from any source, loads it into a data frame, and returns each column's name and data type (the expected output was also shown as an image in the original post).
You can use the schema property and then iterate over the fields.
Example:
Seq(("A", 1))
  .toDF("Field1", "Field2")
  .schema
  .fields
  .foreach(field => println(s"${field.name}, ${field.dataType}"))
Results:
Field1, StringType
Field2, IntegerType
Make sure to take a look at the Spark ScalaDoc.
That's the closest I could get to the output.
I created a schema from a case class, then built a DataFrame from the list of schema columns, mapping each column to its type:
import java.sql.Date
object GetColumnDf {
def main(args: Array[String]): Unit = {
val spark = Constant.getSparkSess
val map = Map("Emp_ID" -> "Dimension","Cust_Name" -> "Dimension","Cust_Age" -> "Measure",
"Salary" -> "Measure","DoJ" -> "Dimension")
import spark.implicits._
val lsit = Seq(Bean123("C-1001","Jack",25,3000,new Date(2000000))).toDF().schema.fields
.map( col => (col.name,col.dataType.toString,map.get(col.name))).toList.toDF("Headers","Data_Type","Type")
lsit.show()
}
}
case class Bean123(Emp_ID: String,Cust_Name: String,Cust_Age: Int, Salary : Int,DoJ: Date)

How do I turn a Scala case class into a mongo Document

I'd like to build a generic method for transforming Scala Case Classes to Mongo Documents.
A promising Document constructor is
fromSeq(ts: Seq[(String, BsonValue)]): Document
I can turn a case class into a Map[String, Any], but then I've lost the type information I need to use the implicit conversions to BsonValues. Maybe TypeTags can help with this?
Here's what I've tried:
import org.mongodb.scala.bson.BsonTransformer
import org.mongodb.scala.bson.collection.immutable.Document
import org.mongodb.scala.bson.BsonValue
case class Person(age: Int, name: String)
//transform scala values into BsonValues
def transform[T](v: T)(implicit transformer: BsonTransformer[T]): BsonValue = transformer(v)
// turn any case class into a Map[String, Any]
def caseClassToMap(cc: Product) = {
val values = cc.productIterator
cc.getClass.getDeclaredFields.map( _.getName -> values.next).toMap
}
// transform a Person into a Document
def personToDocument(person: Person): Document = {
val map = caseClassToMap(person)
val bsonValues = map.toSeq.map { case (key, value) =>
(key, transform(value))
}
Document.fromSeq(bsonValues)
}
<console>:24: error: No bson implicit transformer found for type Any. Implement or import an implicit BsonTransformer for this type.
(key, transform(value))
def personToDocument(person: Person): Document = {
Document("age" -> person.age, "name" -> person.name)
}
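The compile error in the question arises because `transform` is called on a value whose static type is `Any`, so no implicit can be selected. One generic workaround is to match on the runtime class of each field instead. The sketch below assumes Scala 2.13 or later (for `productElementNames`) and uses a small hypothetical `Value` hierarchy in place of the driver's BsonValue types, so it only illustrates the idea and is not the driver's API.

```scala
object CaseClassFields {
  // Hypothetical miniature of a BSON value hierarchy, for illustration only.
  sealed trait Value
  final case class IntValue(v: Int) extends Value
  final case class StringValue(v: String) extends Value

  final case class Person(age: Int, name: String)

  // Recover a typed value from Any by inspecting its runtime class.
  def toValue(any: Any): Value = any match {
    case i: Int    => IntValue(i)
    case s: String => StringValue(s)
    case other     => throw new IllegalArgumentException(s"unsupported field type: ${other.getClass.getName}")
  }

  // Pair each field name with its converted value, keeping declaration order.
  def toFields(cc: Product): Seq[(String, Value)] =
    cc.productElementNames.zip(cc.productIterator.map(toValue)).toSeq
}
```

With the real driver, each branch of the match would build the corresponding BsonValue, and the resulting sequence could be passed to `Document.fromSeq`. The trade-off is that unsupported field types are only detected at runtime.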
The code below, which uses ReactiveMongo's BSON macros, works without manually converting the object.
import reactivemongo.api.bson.{BSON, BSONDocument, Macros}
case class Person(name:String = "SomeName", age:Int = 20)
implicit val personHandler = Macros.handler[Person]
val bsonPerson = BSON.writeDocument[Person](Person())
println(s"${BSONDocument.pretty(bsonPerson.getOrElse(BSONDocument.empty))}")
You can use Salat https://github.com/salat/salat. A nice example can be found here - https://gist.github.com/bhameyie/8276017. This is the piece of code that will help you -
import salat._
val dBObject = grater[Artist].asDBObject(artist)
artistsCollection.save(dBObject, WriteConcern.Safe)
I was able to serialize a case class to a BsonDocument using the org.bson.BsonDocumentWriter. The below code runs using scala 2.12 and mongo-scala-driver_2.12 version 2.6.0
My quest for this solution was aided by this answer (where they are trying to serialize in the opposite direction): Serialize to object using scala mongo driver?
import org.mongodb.scala.bson.codecs.Macros
import org.mongodb.scala.bson.codecs.DEFAULT_CODEC_REGISTRY
import org.bson.codecs.configuration.CodecRegistries.{fromRegistries, fromProviders}
import org.bson.codecs.EncoderContext
import org.bson.BsonDocumentWriter
import org.mongodb.scala.bson.BsonDocument
import org.bson.codecs.configuration.CodecRegistry
import org.bson.codecs.Codec
case class Animal(name : String, species: String, genus: String, weight: Int)
object TempApp {
  def main(args: Array[String]): Unit = {
    val jaguar = Animal("Jenny", "Jaguar", "Panthera", 190)
    val codecProvider = Macros.createCodecProvider[Animal]()
    val codecRegistry: CodecRegistry = fromRegistries(fromProviders(codecProvider), DEFAULT_CODEC_REGISTRY)
    val codec = Macros.createCodec[Animal](codecRegistry)
    val encoderContext = EncoderContext.builder.isEncodingCollectibleDocument(true).build()
    val doc = BsonDocument() // the writer fills this document in place
    val writr = new BsonDocumentWriter(doc) // need to call new since Java lib w/o companion object
    codec.encode(writr, jaguar, encoderContext)
    print(doc)
  }
}

How can I find a play form type (for handling post requests in controller) in order to map a class containing BSONObjectID type?

I'm working on a web application using Play Framework (2.2.6) / scala / mongoDB, and I have a problem with reactivemongo.bson.BSONObjectID. (I'm a beginner in both ReactiveMongo and Scala)
My controller contains this code :
val actForm = Form(tuple(
"name" -> optional(of[String]),
"shortcode" -> optional(of[String]),
"ccam" -> mapping(
"code" -> optional(of[String]),
"description" -> optional(of[String]),
"_id" -> optional(of[BSONObjectID])
)(CCAMAct.apply)(CCAMAct.unapply)
));
def addAct = AsyncStack(AuthorityKey -> Secretary) { implicit request =>
val user = loggedIn
actForm.bindFromRequest.fold(
errors => Future.successful(BadRequest(errors.errorsAsJson)), {
case (name, shortcode, ccam) =>
val newact = Json.obj(
"id" -> BSONObjectID.generate,
"name" -> name,
"shortcode" -> shortcode,
"ccam" -> ccam
)
settings.update(
Json.obj("practiceId" -> user.practiceId.get),
Json.obj("$addToSet" -> Json.obj("acts" -> Json.obj("acte" -> newact)))
).map { lastError => Ok(Json.toJson(newact)) }
})
}
The CCAMAct class is defined like this :
import models.db.Indexable
import play.api.libs.json._
import reactivemongo.bson.BSONObjectID
import reactivemongo.api.indexes.{Index, IndexType}
import models.db.{MongoModel, Indexable}
import scala.concurrent.Future
import play.modules.reactivemongo.json.BSONFormats._
import models.practice.Practice
import play.api.libs.functional.syntax._
case class CCAMAct(code:Option[String],
description:Option[String],
_id: Option[BSONObjectID] = None) extends MongoModel {}
object CCAMAct extends Indexable {
private val logger = play.api.Logger(classOf[CommonSetting]).logger
import play.api.Play.current
import play.modules.reactivemongo.ReactiveMongoPlugin._
import play.modules.reactivemongo.json.collection.JSONCollection
import scala.concurrent.ExecutionContext.Implicits.global
def ccam: JSONCollection = db.collection("ccam")
implicit val ccamFormat = Json.format[CCAMAct]
def index() = Future.sequence(
List (
Index(Seq("description" -> IndexType.Text))
).map(ccam.indexesManager.ensure)
).onComplete { indexes => logger.info("Text index on CCAM ends") }
}
The compiler then reports this error:
Cannot find Formatter type class for reactivemongo.bson.BSONObjectID. Perhaps you will need to import play.api.data.format.Formats._
"_id" -> optional(of[BSONObjectID])
^
(Of course I have already imported "play.api.data.format.Formats._")
I also tried to add a custom Formatter following answers from similar posts on the web.
object Format extends Format[BSONObjectID] {
def writes(objectId: BSONObjectID): JsValue = JsString(objectId.stringify)
def reads(json: JsValue): JsResult[BSONObjectID] = json match {
case JsString(x) => {
val maybeOID: Try[BSONObjectID] = BSONObjectID.parse(x)
if(maybeOID.isSuccess)
JsSuccess(maybeOID.get)
else {
JsError("Expected BSONObjectID as JsString")
}
}
case _ => JsError("Expected BSONObjectID as JsString")
}
}
...without any success.
[UPDATED POST]
In the end, I still cannot find a Play form type (for handling POST requests in a controller) that maps a class containing a BSONObjectID field.
Does anyone know a clean solution to this problem?
The JSON Formats for ReactiveMongo's BSON types are not provided by Play itself in Formats._, but by the ReactiveMongo plugin for Play:
import play.modules.reactivemongo.json.BSONFormats._

Using Solr from Scala/ Play

How can I use Solr from within Scala/Play? Specifically, how do I add/update documents?
Update: see my newer answer: https://stackoverflow.com/a/17315047/604511
Here is code I wrote which uses Play's JSON library and the Dispatch HTTP client. It's not perfect, but it should help you get started.
package controllers
import play.api._
import play.api.mvc._
import play.api.libs.json.Json
import play.api.libs.json.Json.toJson
import dispatch._
object Application extends Controller {
def index = Action {
val addDocument = Json.toJson(
Map(
"add" ->
Seq(
//a document
Map(
"id" -> toJson("123"),
"subject" -> toJson("you have been served")
)
)
))
val toSend = Json.stringify( addDocument)
val params = Map( "commit" -> "true", "wt" -> "json")
val headers = Map( "Content-type" -> "application/json")
val solr = host( "127.0.0.1", 8983)
val req = solr / "solr" / "update" / "json" <<?
params <:< headers << toSend
val response = Http(req)()
Ok( toSend + response.getResponseBody)
//Redirect(routes.Application.tasks)
}
def tasks = TODO
def newTask = TODO
def deleteTask(id: Long) = TODO
}
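The same request can also be sketched without Dispatch, using the JDK's `java.net.http` client (Java 11 or later). This is a substitute technique, not what the answer above uses. The host, port, path and query parameters are copied from that example, while `updateRequest` is a hypothetical helper name and the JSON body is left to the caller.

```scala
import java.net.URI
import java.net.http.HttpRequest

object SolrUpdateSketch {
  // Same endpoint and parameters as the Dispatch example above.
  val updateUri: URI = URI.create("http://127.0.0.1:8983/solr/update/json?commit=true&wt=json")

  // Build (but do not send) a POST request carrying an already-serialized JSON body.
  def updateRequest(jsonBody: String): HttpRequest =
    HttpRequest.newBuilder(updateUri)
      .header("Content-Type", "application/json")
      .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
      .build()
}
```

To send it, pass the request to `java.net.http.HttpClient.newHttpClient().send(request, java.net.http.HttpResponse.BodyHandlers.ofString())`, which needs a reachable Solr server.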
You might consider using the SolrJ Java library, which communicates with the Solr server over a binary protocol that performs better than the XML-based approach.
Adding a document to the index is done like this:
http://wiki.apache.org/solr/Solrj#Adding_Data_to_Solr
Not directly related to updating documents, but a nice Scala query DSL for Solr, built by Foursquare, is described in their engineering blog article: http://engineering.foursquare.com/2011/08/29/slashem-a-rogue-like-type-safe-scala-dsl-for-querying-solr/