I get the following list of documents back from MongoDB when I query for "campaignID":"DEMO-1".
[
{
"_id": {
"$oid": "56be0e8b3cf8a2d4f87ddb97"
},
"campaignID": "DEMO-1",
"revision": 1,
"action": [
"kick",
"punch"
],
"transactionID": 20160212095539543
},
{
"_id": {
"$oid": "56c178215886447ea261710f"
},
"transactionID": 20160215000257159,
"campaignID": "DEMO-1",
"revision": 2,
"action": [
"kick"
],
"transactionID": 20160212095539578
}
]
Now, for a given campaignID, I need to find all its versions (revision in my case) and change the action field to the String value dead. I read the docs, but the examples there are too simple to be helpful in my case. This is what the docs say:
val selector = BSONDocument("name" -> "Jack")
val modifier = BSONDocument(
"$set" -> BSONDocument(
"lastName" -> "London",
"firstName" -> "Jack"),
"$unset" -> BSONDocument(
"name" -> 1))
// get a future update
val futureUpdate = collection.update(selector, modifier)
I can't just follow the docs, because there it's easy to create a new BSON document by hardcoding the exact fields and then use it as the modifier. In my case I need to find the documents first and then modify the action field on the fly because, unlike in the docs, my action field can have different values.
Here's my code so far which obviously does not compile:
def updateDocument(campaignID: String) ={
val timeout = scala.concurrent.duration.Duration(5, "seconds")
val collection = db.collection[BSONCollection](collectionName)
val selector = BSONDocument("action" -> "dead")
val modifier = collection.find(BSONDocument("campaignID" -> campaignID)).cursor[BSONDocument]().collect[List]()
val updatedResults = Await.result(modifier, timeout)
val mod = BSONDocument(
"$set" -> updatedResults(0),
"$unset" -> BSONDocument(
"action" -> **<???>** ))
val futureUpdate = collection.update(selector, updatedResults(0))
futureUpdate
}
This worked for me as an answer to my own question. Thanks @cchantep for helping me out.
val collection = db.collection[BSONCollection](collectionName)
val selector = BSONDocument("campaignID" -> campaignID)
val mod = BSONDocument("$set" -> BSONDocument("action" -> "dead"))
val futureUpdate = collection.update(selector, mod, multi = true)
If you have a look at the BSON documentation, you can see that BSONArray can be used to pass a sequence of BSON values.
BSONDocument("action" -> BSONArray("kick", "punch"))
If the values are in a List[T], and a BSONWriter[_ <: BSONValue, T] is available for T, then the list can be converted to a BSONArray.
BSONDocument("action" -> List("kick", "punch"))
// works because a `BSONWriter` is provided for `String`
I have a Scala Spark application in which I would like to unset certain fields for all documents in a Mongo collection before I load updated data into the collection.
Let's say I have a data source like this and I want to remove the "rank" field from all documents (some may have this field and some may not).
[
{
"_id": 123,
"value": "a"
},
{
"_id": 234,
"value": "b",
"rank": 1
},
...
]
I know MongoDB has an $unset operator, but I don't see any documentation in the Mongo Spark connector on how to do something like this with Spark.
I've tried filtering out the field and dropping it in the Dataset before I save to Mongo but I run into the following error:
com.mongodb.MongoBulkWriteException: Bulk write operation error on server localhost:58200. Write errors: [BulkWriteError{index=0, code=9, message=''$set' is empty. You must specify a field like so: {$set: {<field>: ...}}', details={}}].
at com.mongodb.connection.BulkWriteBatchCombiner.getError(BulkWriteBatchCombiner.java:173)
...
I have the following definitions:
case class Item(_id: Int, rank: Option[Int])
val idCol = new ColumnName("_id")
val rankCol = new ColumnName("rank")
and a function that does something like this in the same class:
def resetRanks(): Unit = {
val records = MongoSpark
.load[Item](
sparkSession,
ReadConfig(
Map(
"collection" -> mongoConfig.collection,
"database" -> mongoConfig.db,
"uri" -> mongoConfig.uri
),
Some(ReadConfig(sparkSession))
)
)
.select(idCol, rankCol)
.repartition(sparkConfig.partitionSize, $"_id")
.where(rankCol.isNotNull)
.drop(rankCol)
MongoSpark.save(
records,
WriteConfig(
Map(
"collection" -> mongoConfig.collection,
"database" -> mongoConfig.db,
"forceInsert" -> "false",
"ordered" -> "true",
"replaceDocument" -> "false", // not replacing docs since there are other fields I'd like to keep intact that I won't be modifying
"uri" -> mongoConfig.uri,
"writeConcern.w" -> "majority"
),
Some(WriteConfig(sparkSession))
)
)
}
I'm using MongoSparkConnector v2.4.2.
I also saw this thread, which seemed to suggest that I get the above error because I can't have null fields, but I need to unset these fields, so I'm at a loss as to how to go about it.
Any tips or pointers are appreciated.
You can try something like this: drop the column from the DataFrame and write the result to a new collection. One issue I observed is that when I tried to write to the same collection, my collection was getting dropped; perhaps you can take the research from there.
Here I am using the DataFrameWriter save function directly. You can use the conventional MongoSpark.save() function along with a WriteConfig if you prefer.
I am using Spark 3.1.2, Mongo-Spark Connector 3.0.1, and Mongo 4.2.6.
case class Item(id: Int, rank: Option[Int], value: String = "abc")
def main(args: Array[String]): Unit = {
val sparkSession = getSparkSession(args)
val items = MongoSpark.load[Item](sparkSession, ReadConfig(Map("collection" -> "items"), Some(ReadConfig(sparkSession))))
items.show()
val dropped = items.drop("rank")
dropped.write.option("collection", "items-updated").mode("overwrite").format("mongo").save()
dropped.show()
}
I am not currently able to run a Raw Command in ReactiveMongo 0.12.5 using the Play JSON Plugin. The documentation (Run a raw command) is not currently accessible but from a cached page in my browser I can see the following:
import scala.concurrent.{ ExecutionContext, Future }
import play.api.libs.json.{ JsObject, Json }
import reactivemongo.play.json._
import reactivemongo.api.commands.Command
def rawResult(db: reactivemongo.api.DefaultDB)(implicit ec: ExecutionContext): Future[JsObject] = {
val commandDoc = Json.obj(
"aggregate" -> "orders", // we aggregate on collection `orders`
"pipeline" -> List(
Json.obj("$match" -> Json.obj("status" -> "A")),
Json.obj(
"$group" -> Json.obj(
"_id" -> "$cust_id",
"total" -> Json.obj("$sum" -> "$amount"))),
Json.obj("$sort" -> Json.obj("total" -> -1))
)
)
val runner = Command.run(JSONSerializationPack) // run is since deprecated
runner.apply(db, runner.rawCommand(commandDoc)).one[JsObject] // one is since deprecated
}
However I am not looking to return a JsObject (or anything in fact) - I actually want to update all documents in another collection as this previous answer illustrates. My issue is that both methods contain deprecated functions and so I have put together a combination to (possibly) work with JSON Collections (as mentioned):
def bulkUpdateScoreBA(scoreBAs: List[ScoreBA]) = {
def singleUpdate(scoreBA: ScoreBA) = Json.obj(
("q" -> Json.obj("_id" ->
Json.obj("$oid" -> scoreBA.idAsString(scoreBA._id))
)),
("u" ->
Json.obj("$set" ->
Json.obj("scoreBA" -> scoreBA.scoreBA)
)
)
)
val commandJson = Json.obj(
"update" -> "rst",
"updates" -> Json.arr(scoreBAs.map(singleUpdate)),
"ordered" -> false,
"writeConcern" -> Json.obj("w" -> "majority", "wtimeout" -> 5000)
)
val runner = Command.CommandWithPackRunner(JSONSerializationPack)
runner.apply(db, runner.rawCommand(commandJson)) // ?? how to get a Future[Unit] here
}
However I need this to return a Future[Unit] so that I can call it from the controller but I cannot find how this is done or even if what I have done so far is the best way. Any help is appreciated!
The Scaladoc for bulk update is available (since 0.12.7), with an example in the tests.
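As for the Future[Unit] part of the question: whichever driver call ends up being used, any Future[A] can be turned into a Future[Unit] by mapping the result away. The sketch below uses only the Scala standard library; `runCommand` is a hypothetical stand-in for the actual command call, and its String result type is an assumption made purely for illustration.

```scala
import scala.concurrent.{ Await, ExecutionContext, Future }
import scala.concurrent.duration._

object FutureUnitDemo {
  // Hypothetical stand-in for the raw command call; the real call would
  // return whatever result type the driver produces.
  def runCommand()(implicit ec: ExecutionContext): Future[String] =
    Future("""{"ok": 1}""")

  // Discard the successful result; a failed Future stays failed,
  // so errors still propagate to the caller.
  def runAndDiscard()(implicit ec: ExecutionContext): Future[Unit] =
    runCommand().map(_ => ())

  def main(args: Array[String]): Unit = {
    import ExecutionContext.Implicits.global
    // Blocking here is only for the demo; a controller would return the Future.
    Await.result(runAndDiscard(), 5.seconds)
    println("done")
  }
}
```

Because `map` only runs on success, a controller calling `runAndDiscard()` can still handle failures with `recover`.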
I have no idea how I should use play-reactivemongo's JSONFindAndModifyCommand.
I need to make an upsert query by some field. I could first remove any existing entry and then insert, but from what I found, the findAndModify command has an upsert: Boolean option that achieves the same result.
Suppose I have two play.api.libs.json.JsObjects: query and object.
val q = (k: String) => Json.obj("sha256" -> k)
val obj = (k: String, v: String) => Json.obj(
"sha256" -> k,
"value" -> v
)
Then I do:
db.collection.findAndModify(
q(someSha256),
what?!,
...
)
I use play2-reactivemongo 0.11.9
Thanks!
The simplest way is to use the collection operations findAndUpdate or findAndRemove, e.g.
val person: Future[BSONDocument] = collection.findAndUpdate(
BSONDocument("name" -> "James"),
BSONDocument("$set" -> BSONDocument("age" -> 17)),
fetchNewObject = true)
// on success, returns the updated document:
// { "age": 17 }
This is a Scala question.
I currently have the following two collection objects:
val keywordLookup = Map("a" -> "1111",
"b" -> "2222",
"c" -> "3333",
"d" -> "4444",
"e" -> "5555")
val keywordList = Set("1111", "3333")
keywordLookup is a lookup object. keywordList contains the values whose entries I need to find in keywordLookup.
I would like to get the following result:
Map("a" -> "1111", "c" -> "3333")
val filtered = keywordLookup.filter(kv => keywordList.contains(kv._2))
filtered is the Map you want as output.
keywordLookup.filter(x => keywordList.contains(x._2))
Using flatMap with find:
keywordList.flatMap (k => keywordLookup.find( _._2 == k)).toMap
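A minimal, self-contained sketch comparing the answers above, using only the Scala standard library. The `duplicated` map is a hypothetical addition (not from the question) that shows where the approaches differ: `filter` keeps every key whose value is in the set, while `find` stops at the first matching entry for each value.

```scala
object KeywordFilterDemo {
  val keywordLookup: Map[String, String] = Map(
    "a" -> "1111", "b" -> "2222", "c" -> "3333", "d" -> "4444", "e" -> "5555")
  val keywordList: Set[String] = Set("1111", "3333")

  // Keep every entry whose value appears in keywordList.
  def byFilter(lookup: Map[String, String]): Map[String, String] =
    lookup.filter { case (_, v) => keywordList.contains(v) }

  // For each keyword, take the first entry carrying that value (if any).
  def byFind(lookup: Map[String, String]): Map[String, String] =
    keywordList.flatMap(k => lookup.find(_._2 == k)).toMap

  // Hypothetical data: two keys ("a" and "z") share the value "1111".
  val duplicated: Map[String, String] = keywordLookup + ("z" -> "1111")

  def main(args: Array[String]): Unit = {
    println(byFilter(keywordLookup) == byFind(keywordLookup)) // same result here
    println(byFilter(duplicated).size) // counts "a", "z" and "c"
    println(byFind(duplicated).size)   // one entry per keyword
  }
}
```

With unique values the two approaches agree; `filter` is a single pass over the map, whereas the `find` variant scans the map once per keyword.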
I think there should be an easy solution around, but I wasn't able to find it.
I start accessing data from MongoDB with the following in Scala:
val search = MongoDBObject("_id" -> new ObjectId("xxx"))
val fields = MongoDBObject("community.member.name" -> 1, "community.member.age" -> 1)
for (res <- mongoColl.find(search, fields)) {
var memberInfo = res.getAs[BasicDBObject]("community").get
println(memberInfo)
}
and get a BasicDBObject as result:
{
"member" : [
{
"name" : "John Doe",
"age" : "32",
},{
"name" : "Jane Doe",
"age" : "29",
},
...
]
}
I know that I can access values with getAs[String], but this is not working here...
Does anyone have an idea? I have been searching for a solution for several hours...
If you are working with complex MongoDB objects, you can use Salat, which provides simple case class serialization.
Sample with your data:
case class Community(members:Seq[Member], _id: ObjectId = new ObjectId)
case class Member(name:String, age:Int)
val mongoColl: MongoCollection = _
val dao = new SalatDAO[Community, ObjectId](mongoColl) {}
val community = Community(Seq(Member("John Doe", 32), Member("Jane Doe", 29)))
dao.save(community)
for {
c <- dao.findOneById(community._id)
m <- c.members
} println("%s (%s)" format (m.name, m.age))
I think you should try
val member = memberInfo.as[MongoDBList]("member").as[BasicDBObject](0)
println(member("name"))
This problem does not really have to do with MongoDB, but rather with your data structure. Your JSON/BSON data structure includes
An object community, which includes
An array of members
Each member has the properties name and age.
Your problem is completely equivalent to the following:
case class Community(members:List[Member])
case class Member(name:String, age:Int)
val a = List(member1,member2)
// a.name does not compile, name is a property defined on a member, not on the list
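To make the point concrete, here is a small sketch using only the Scala standard library. The sample members come from the question's data (with ages as Int, as in the case classes above); the object and value names are illustrative assumptions.

```scala
object CommunityDemo {
  case class Member(name: String, age: Int)
  case class Community(members: List[Member])

  val community: Community =
    Community(List(Member("John Doe", 32), Member("Jane Doe", 29)))

  // `community.members.name` would not compile: `name` lives on Member,
  // not on List. Map over the list to reach each member instead.
  val names: List[String] = community.members.map(_.name)

  // The same traversal as a for comprehension, reading several fields per member.
  val summaries: List[String] =
    for (m <- community.members) yield s"${m.name} (${m.age})"

  def main(args: Array[String]): Unit = {
    names.foreach(println)
    summaries.foreach(println)
  }
}
```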
Yes, you can do this neatly with for comprehensions. You could do the following:
for { record <- mongoColl.find(search,fields).toList
community <- record.getAs[MongoDBObject]("community")
member <- community.getAs[MongoDBObject]("member")
name <- member.getAs[String]("name") } yield name
This would work just to get the name. To get multiple values, I think you would do:
for { record <- mongoColl.find(search,fields).toList
community <- record.getAs[MongoDBObject]("community")
member <- community.getAs[MongoDBObject]("member")
field <- List("name","age") } yield member.get(field).toString