MongoDB upsert embedded document

I have a document per day per meter. How can I add another subdocument to the data array, and create the whole document if it doesn't exist?
{
"key": "20120418_123456789",
"data":[
{
"Meter": 123456789,
"Dt": ISODate("2011-12-29T16:00:00.0Z"),
"Energy": 25,
"PMin": 11,
"PMax": 16
}
],
"config": {"someparam": 4.5}
}
Can I use upsert for that purpose?
If the document already exists, the result should be:
{
"key": "20120418_123456789",
"data":[
{
"Meter": 123456789,
"Dt": ISODate("2011-12-29T16:00:00.0Z"),
"Energy": 25,
"PMin": 11,
"PMax": 16
},
{
"Meter": 123456789,
"Dt": ISODate("2011-12-29T16:15:00.0Z"),
"Energy": 22,
"PMin": 13,
"PMax": 17
}
],
"config": {"someparam": 4.5}
}
Thanks in advance

I think what you want is the $addToSet command - that will push an element to an array only if it does not already exist. I've simplified your example a bit for brevity:
db.meters.findOne()
{
"_id" : ObjectId("4f8e95a718bc9c7da1e6511a"),
"config" : {
"someparam" : 4.5
},
"data" : [
{
"Meter" : 123456789,
}
],
"key" : "20120418_123456789"
}
Now run:
db.meters.update({"key" : "20120418_123456789"}, {"$addToSet": {"data" : {"Meter" : 1234}}})
And we get the updated version:
db.meters.findOne()
{
"_id" : ObjectId("4f8e95a718bc9c7da1e6511a"),
"config" : {
"someparam" : 4.5
},
"data" : [
{
"Meter" : 123456789,
},
{
"Meter" : 1234
}
],
"key" : "20120418_123456789"
}
Run the same command again and the result is unchanged.
Note: you are likely going to keep growing these documents. If this array is unbounded, updating it this way will cause frequent (and relatively expensive) document moves - have a look here for ideas on how to mitigate that:
http://www.mongodb.org/display/DOCS/Padding+Factor#PaddingFactor-ManualPadding
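To answer the upsert part of the question directly, here is a minimal sketch of my own (not part of the original answer), assuming MongoDB 2.4+ for $setOnInsert. With upsert: true, the whole document is created on the first reading for a given key, and later readings are simply appended by $addToSet:
db.meters.update(
    { "key": "20120418_123456789" },
    {
        // appended on every run, but only if this exact subdocument is not already present
        "$addToSet": {
            "data": {
                "Meter": 123456789,
                "Dt": ISODate("2011-12-29T16:15:00.0Z"),
                "Energy": 22,
                "PMin": 13,
                "PMax": 17
            }
        },
        // only applied when the upsert actually inserts a new document
        "$setOnInsert": { "config": { "someparam": 4.5 } }
    },
    { "upsert": true }
)
Running this against a missing key inserts { key, data: [reading], config }; running it against an existing document only grows the data array.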

Related

Complex MongoDB query?

I'm pretty brand new to Mongo and queries, so with that said: I'm trying to build a query that finds results matching any of three dog breeds, additionally checks two other conditions, and finally sorts everything by age. All the data comes from a CSV file (screenshot); there aren't any subdocuments in any of the entries.
db.animals.find({
"animal_id" : 1,
"breed" : "Labrador Retriever Mix",
"breed" : "Chesapeake Bay Retriever",
"breed" : "Newfoundland",
$and : [ { "age_upon_outcome_in_weeks" :{"$lt" : 156, "$gte" : 26} ],
$and: {"sex_upon_outcome" : "Intact Female"}}).sort({"age_upon_outcome_in_weeks" : 1})
This is throwing a number of errors, such as:
Error: error: {
"ok" : 0,
"errmsg" : "$and must be an array",
"code" : 2,
"codeName" : "BadValue"
}
What am I messing up? Or is there a better way to do it?
As mentioned by takis in the comments, you cannot repeat a key in a mongo query - you have to imagine that your query document becomes a JSON object, and each time a key is repeated it replaces the previous one. To get around this problem, MongoDB supports the $or and $and operators. For complex queries like this one, I would recommend starting with a global $and whose elements each contain either a single constraint or a $or constraint. Your query becomes this:
db.coll.find({
"$and": [
{ "animal_id": 1 },
{ "age_upon_outcome_in_weeks": { "$lt": 156, "$gte": 26 } },
{ "sex_upon_outcome": "Intact Female" },
{ "$or": [
{ "breed": "Labrador Retriever Mix" },
{ "breed": "Chesapeake Bay Retriever" },
{ "breed": "Chesapeake Bay Retriever" },
{ "breed": "Newfoundland" }
]
}
]
})
.sort({"age_upon_outcome_in_weeks" : 1})
--- edit
You can also consider using $in instead of $or:
db.coll.find({
"animal_id": 1,
"age_upon_outcome_in_weeks": { "$lt": 156, "$gte": 26 },
"sex_upon_outcome": "Intact Female",
"breed": { "$in": [
"Labrador Retriever Mix",
"Chesapeake Bay Retriever",
"Chesapeake Bay Retriever",
"Newfoundland"
] }
})
.sort({"age_upon_outcome_in_weeks" : 1})

How to combine Documents in aggregation pipeline with MongoDB Java driver 3.6?

I am using an aggregation pipeline with the MongoDB Java driver version 3.6. If I have documents that look something like:
doc1 --
{
"CAR": {
"VIN": "ASDF1234",
"YEAR": "2018",
"MAKE": "Honda",
"MODEL": "Accord"
},
"FEATURES": [
{
"AUDIO": "MP3",
"TIRES": "All Season",
"BRAKES": "ABS"
}
]
}
doc2 --
{
"CAR": {
"VIN": "ASDF1234",
"AVAILABILITY": "In Stock"
}
}
And if I submit a query like:
collection.aggregate(
Arrays.asList(
Aggregates.match(
and(
in("CAR.VIN", vinList),
or(
eq("CAR.MAKE", carMake),
eq("CAR.AVAILABILITY", carAvailability),
)
)
)
)
)
Let us assume that for every VIN there are exactly two different records matching the "CAR.VIN" criterion, so I am going to get two results. Rather than deal with two results each time, I would like to merge the documents so that the result looks like this:
{
"CAR": {
"VIN": "ASDF1234",
"YEAR": "2018",
"MAKE": "Honda",
"MODEL": "Accord",
"AVAILABILITY": "In Stock"
},
"FEATURES": [
{
"AUDIO": "MP3",
"TIRES": "All Season",
"BRAKES": "ABS"
}
]
}
The example where I have two and only two results trivializes my need for this. Imagine that vinList is a list of 10000 values, and it might return 2 x 10000 documents. When I return an AggregateIterable to the client that is calling my code, I do not want to impose the requirement that they have to group or collate the results in any way, but that they will receive one document for each result that has all of the information that they will want to parse, cleanly and easily.
Of course, people will suggest that the data is simply combined into one document with all of the data in the MongoDB collection. For reasons that I cannot control, there are two separate documents corresponding to each VIN in the same collection, and that is something that I am unable to change. There is a value in our system that makes this more reasonable than it might seem, so please don't focus on this apparent problem with the data.
I am trying, with not much luck, to utilize the Aggregates.group() operation to merge the fields in my aggregation pipeline. Accumulators.push seems to be the closest operation to what I need, but I do not want to complicate the document structure with extra arrays, etc. Is there a straightforward approach that I am not seeing?
You can try $mergeObjects, added in MongoDB 3.6:
db.cc.aggregate(
[
{
$group: {
_id : "$CAR.VIN",
CAR : {$mergeObjects : "$CAR"},
FEATURES : {$mergeObjects : {$arrayElemAt : ["$FEATURES", 0 ]}}
}
}
]
).pretty()
Result:
{
"_id" : "ASDF1234",
"CAR" : {
"VIN" : "ASDF1234",
"YEAR" : "2018",
"MAKE" : "Honda",
"MODEL" : "Accord",
"AVAILABILITY" : "In Stock"
},
"FEATURES" : {
"AUDIO" : "MP3",
"TIRES" : "All Season",
"BRAKES" : "ABS"
}
}
To get FEATURES as an array, use $push instead:
db.cc.aggregate(
[
{
$group: {
_id : "$CAR.VIN",
CAR : {$mergeObjects : "$CAR"},
FEATURES : {$push : {$arrayElemAt : ["$FEATURES", 0 ]}}
}
}
]
).pretty()
Result:
{
"_id" : "ASDF1234",
"CAR" : {
"VIN" : "ASDF1234",
"YEAR" : "2018",
"MAKE" : "Honda",
"MODEL" : "Accord",
"AVAILABILITY" : "In Stock"
},
"FEATURES" : [
{
"AUDIO" : "MP3",
"TIRES" : "All Season",
"BRAKES" : "ABS"
},
null
]
}
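As an optional follow-up (my addition, not part of the original answer), the null that shows up in FEATURES when a document has no FEATURES array can be dropped with a $filter stage after the $group. This assumes MongoDB 3.6+, in line with the question:
db.cc.aggregate(
[
    {
        $group: {
            _id : "$CAR.VIN",
            CAR : {$mergeObjects : "$CAR"},
            FEATURES : {$push : {$arrayElemAt : ["$FEATURES", 0 ]}}
        }
    },
    {
        // keep only the non-null entries that $push produced
        $addFields: {
            FEATURES: {
                $filter: {
                    input: "$FEATURES",
                    as: "f",
                    cond: { $ne: ["$$f", null] }
                }
            }
        }
    }
]
).pretty()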

Adding additional fields to an embedded document

I make extensive use of embedded documents in my MongoDB database and I'm running into speed problems when trying to add additional data:
As an example I have a document that looks a bit like this:
{
"date" : <<the date>>,
"name" : "thisName",
"basket": [
{
"stock": "IBM",
"quantity": 1000.0,
"profit" : 10.0
},
...
{
"stock": "MSFT",
"quantity": 2000.0,
"profit" : 30.0
}
]
}
What I want to do is to add 5 new fields in the embedded documents like this:
{
"date" : <<the date>>,
"name" : "thisName",
"basket": [
{
"stock": "IBM",
"quantity": 1000.0,
"profit" : 10.0,
"new_1" : 10.0,
"new_2" : 10.0,
"new_3" : 10.0,
"new_4" : 10.0,
"new_5" : 10.0
},
...
{
"stock": "MSFT",
"quantity": 2000.0,
"profit" : 30.0,
"new_1" : 10.0,
"new_2" : 10.0,
"new_3" : 10.0,
"new_4" : 10.0,
"new_5" : 10.0
}
]
}
I started doing this using find() and update_one() in a for loop, identifying each embedded document explicitly and updating it with "$set". This approach works but it is very slow. If my collection were small I'm sure this wouldn't matter, but as it is, it's huge (hundreds of millions of documents). It's probably so slow because the entire document has to be moved every time I add a set of fields. With that in mind, I attempted to add the new fields to all the embedded documents in one go. I did this by leaving the find query empty and removing the positional $ from the "$set" command. A little like this (in pymongo):
bulk.find({"date": dates[i],
"strategyId": strategyInfo[siOffset[l]][ID]
}).update({
"$set": {
"basket.new_1": 0.0,
"basket.new_2": 0.0,
"basket.new_3": 0.0,
"basket.new_4": 0.0,
"basket.new_5": 0.0
}
})
This approach throws an error: cannot use the part (basket of basket.new_5) to traverse the element ({basket:......
Is anyone able to give some insight as to what I'm doing wrong? Is it even possible to do this?
You can use a recursive function like this. First, find all the documents that need updating, then update them one at a time:
db.collection('game_users').find(
    { "date": dates[i], "strategyId": strategyInfo[siOffset[l]][ID] }
).toArray(function(err, data) {
    var i = 0;
    function data_Update() {
        if (i != data.length) {
            // Build a $set that targets each element of "basket" by index;
            // a plain "basket.new_1" path fails because basket is an array.
            var setFields = {};
            data[i].basket.forEach(function(elem, j) {
                setFields["basket." + j + ".new_1"] = 0.0;
                setFields["basket." + j + ".new_2"] = 0.0;
                setFields["basket." + j + ".new_3"] = 0.0;
                setFields["basket." + j + ".new_4"] = 0.0;
                setFields["basket." + j + ".new_5"] = 0.0;
            });
            db.collection('game_users').update(
                { "_id": data[i]._id },     // update this document only
                { "$set": setFields },
                function(err, resp) {
                    i++;
                    data_Update();          // move on to the next document
                }
            );
        }
    }
    data_Update();                          // start the loop
});
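Alternatively - this is my addition, not part of the original answer - if the server is MongoDB 3.6 or newer (the question does not say which version is in use), the all-positional operator $[] can set the new fields on every element of basket in a single update, with no per-document looping at all:
// Sketch assuming MongoDB 3.6+; adjust the filter to the documents you want to
// touch (the question filtered by date and strategyId). $[] applies the $set to
// every element of the "basket" array in every matching document.
db.collection.updateMany(
    { "basket": { "$exists": true } },
    { "$set": {
        "basket.$[].new_1": 0.0,
        "basket.$[].new_2": 0.0,
        "basket.$[].new_3": 0.0,
        "basket.$[].new_4": 0.0,
        "basket.$[].new_5": 0.0
    } }
)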

Something is wrong with the '$' index for updating MongoDB data

Given a MongoDB document of
{
"_id" : ObjectId("552f283dd951e49c6f2f451d"),
"uuid" : "1-2-1-b95a4040-e29d-11e4-bce8-0381ce4bc8a5",
"sub" : [
{
"prod" : 30,
"var" : 0,
"status" : "Test",
"files" : [
{
"filePath" : "20150415/2-1/21001429153881552f2859699769.82145796.jpg"
},
{
"filePath" : "20150415/2-1/21001429153880552f28589ca9a8.67013013.jpg"
}
]
},
{
"prod" : 10,
"var" : 0,
"status" : "Pending",
"files" : []
}
],
"process_marker" : 3
}
I want to update "sub.status" where "uuid" : "1-2-1-b95a4040-e29d-11e4-bce8-0381ce4bc8a5", "sub.prod": 10 and "sub.var": 0.
Normally we'd use the "$" positional operator to modify the matching array element, as seen below:
use targetDB
db.collection.update({
"uuid" : "1-2-1-b95a4040-e29d-11e4-bce8-0381ce4bc8a5",
"sub.prod":10,
"sub.var":0
},{
"$set":{"sub.$.status":"MyNewValue"}
})
==== BUT THE CODE ABOVE DOES NOT UPDATE THE CORRECT $ TARGET ====
It updates the "prod":30, "var":0 set... Why is that?
If we limit the query conditions for two key value pairs as seen below, the correct array set is updated
use targetDB
db.collection.update({
"uuid" : "1-2-1-b95a4040-e29d-11e4-bce8-0381ce4bc8a5",
"sub.prod":10
},{
"$set":{"sub.$.status":"MyNewValue"}
})
==== THE CODE ABOVE UPDATES THE CORRECT $ TARGET ====
I'm confused that a more detailed find query would result in updating the wrong array set. Is this a bug or did I do something wrong?
MongoDB version : 3.0.2
There is no bug in MongoDB. The problem with your query is that you matched first on "sub.prod": 10 and then on "sub.var": 0, and for the update the positional $ takes its position from the latest of those conditions - here "sub.var": 0 - so it points at the first array element matching "sub.var": 0, which is the "prod" : 30 element. That is why only the "prod" : 30 element was updated (see the MongoDB documentation on the positional $ operator for more detail).
In this case you should use $elemMatch with $and conditions, as below:
db.collectionName.update({
"uuid": "1-2-1-b95a4040-e29d-11e4-bce8-0381ce4bc8a5",
"sub": {
"$elemMatch": {
"$and": [{
"prod": 10
}, {
"var": 0
}]
}
}
}, {
"$set": {
"sub.$.status": "MyNewValue"
}
})
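A small equivalent sketch of my own, not from the original answer: $elemMatch implicitly ANDs the conditions listed inside it, so the explicit $and can be dropped:
db.collectionName.update({
    "uuid": "1-2-1-b95a4040-e29d-11e4-bce8-0381ce4bc8a5",
    // both conditions must hold on the same array element
    "sub": { "$elemMatch": { "prod": 10, "var": 0 } }
}, {
    "$set": { "sub.$.status": "MyNewValue" }
})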

Compare array elements, remove the one with the lowest score

There are 200 documents in the school db. From each document I must remove the scores array element that has "type" : "homework" and the lowest score.
{
"_id" : 0,
"name" : "aimee Zank",
"scores" :
[
{
"type" : "exam",
"score" : 1.463179736705023
},
{
"type" : "quiz",
"score" : 11.78273309957772
},
{
"type" : "homework",
"score" : 6.676176060654615
},
{
"type" : "homework",
"score" : 35.8740349954354
}
]
}
For example, here
{
"type" : "homework",
"score" : 6.676176060654615
}
must be removed, since 6.67 < 35.87.
I sorted all the documents like this:
db.students.find({"scores.type":"homework"}).sort({"scores.score":1})
But I do not know how to then remove the array element that has the lowest score and type : homework.
NOTE: how can I solve it without using the aggregation framework, e.g. by sorting and then updating?
This can be done in a couple of steps. The first step is to grab a list of the documents with the minimum homework score, using the aggregation framework with the $match, $unwind and $group operators to find the minimum score for each document:
lowest_scores_docs = db.school.aggregate([
    { "$match": { "scores.type": "homework" } },
    { "$unwind": "$scores" },
    { "$match": { "scores.type": "homework" } },
    { "$group": { "_id": "$_id", "lowest_score": { "$min": "$scores.score" } } }
])
The second step is to loop through the dictionary above and use the $pull operator in the update query to remove the element from the array as follows:
for result in lowest_scores_docs["result"]:
    db.school.update(
        { "_id": result["_id"] },
        { "$pull": { "scores": { "score": result["lowest_score"] } } }
    )
Alternatively, without the aggregation framework, you can iterate over the students with a cursor in pymongo and pull the lowest homework score per document:
import pymongo
import sys

# connect to the db on the standard port
connection = pymongo.MongoClient("mongodb://localhost")
db = connection.school      # attach to the school db
students = db.students      # specify the collection

try:
    cursor = students.find({})
    for doc in cursor:
        # collect this student's homework scores
        hw_scores = []
        for item in doc["scores"]:
            if item["type"] == "homework":
                hw_scores.append(item["score"])
        if not hw_scores:
            continue            # no homework entries, nothing to remove
        hw_scores.sort()
        hw_min = hw_scores[0]
        # pull the array element holding the lowest homework score
        students.update({"_id": doc["_id"]},
                        {"$pull": {"scores": {"score": hw_min}}})
except:
    print("Error trying to read collection:", sys.exc_info()[0])