Upsert mongo array of object with condition - mongodb

I have a collection with some documents, and each document has a list of objects representing a temporal interval in epoch time with a keyword. I want to update the end value of the interval of each object in the right document where the end value is greater than or equal to a given value. If nothing can be updated, I want to insert a new interval with both start and end set to the new value and with the keyword used in the query.
I'm using NiFi to perform this task, with the update block and upsert enabled. I can use aggregation too, but it should be possible to do it with just an upsert.
This is what I have in place right now
QUERY
{
"_id":"docid",
"thearray.keyword": "red",
"thearray.end": {$gte :minUpdatable}
}
and this as the update body:
UPDATE BODY
{
"thearray.$[].end": valueToUpdate,
"$setOnInsert":{"$push": {"thearray":{"keyword":"red","start":valueToUpdate,"end":valueToUpdate}}}
}
}
This is the document data structure:
INITIAL STATUS
{
_id:"docid",
otherInfo:"",
thearray:[
{keyword:"red",start:8,end:15},
{keyword:"blue",start:8,end:15},
{keyword:"red",start:8,end:9},
{keyword:"red",start:9,end:16},
...
]
}
EXAMPLE 1: In this situation, from the initial status, updating with "docid" as the document, "red" as keyword, 12 as the lowest possible end value to update (minUpdatable) and 22 as the value to set, the document should become something like:
{
_id:"docid",
otherInfo:"",
thearray:[
{keyword:"red",start:8,end:22},
{keyword:"blue",start:8,end:15},
{keyword:"red",start:8,end:9},
{keyword:"red",start:9,end:22},
...
]
}
EXAMPLE 2: In the same situation, from the initial status, updating with "docid" as the document, "red" as keyword, 33 as the lowest possible end value to update and 39 as the value to set, the document should result in something like this (the same goes for any query for which nothing can be updated):
{
_id:"docid",
otherInfo:"",
thearray:[
{keyword:"red",start:8,end:15},
{keyword:"blue",start:8,end:15},
{keyword:"red",start:8,end:9},
{keyword:"red",start:9,end:16},
...,
{keyword:"red",start:33,end:33}
]
}
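One possible approach, sketched under assumptions rather than as a confirmed solution: since aggregation is an option, an update written as an aggregation pipeline (MongoDB 4.2+) can match on _id alone and decide inside the pipeline whether to rewrite the matching intervals or append a new one. Keeping the array conditions out of the filter matters because, with upsert enabled, a filter that does not match an existing "docid" document makes MongoDB try to insert a second document with the same _id. The helper name buildIntervalUpsert and the collection name in the comment are hypothetical, the appended interval follows the prose description (start and end both set to the new value), and whether the NiFi processor accepts a pipeline as the update body is not something this sketch verifies.

```javascript
// Hypothetical helper: builds the arguments for
// db.<collection>.updateOne(filter, update, options).
function buildIntervalUpsert(docId, keyword, minUpdatable, newEnd) {
  // Treat a missing array (the insert case of the upsert) as empty.
  const intervals = { $ifNull: ["$thearray", []] };

  // True for an element with the keyword whose end is >= minUpdatable.
  // $literal keeps a keyword starting with "$" from being read as a field path.
  const isUpdatable = {
    $and: [
      { $eq: ["$$item.keyword", { $literal: keyword }] },
      { $gte: ["$$item.end", minUpdatable] },
    ],
  };

  // Does at least one element qualify?
  const anyUpdatable = {
    $anyElementTrue: [
      { $map: { input: intervals, as: "item", in: isUpdatable } },
    ],
  };

  // Same array, with "end" overwritten on qualifying elements only.
  const withNewEnd = {
    $map: {
      input: intervals,
      as: "item",
      in: {
        $cond: [
          isUpdatable,
          { $mergeObjects: ["$$item", { end: newEnd }] },
          "$$item",
        ],
      },
    },
  };

  // Same array, with one new interval appended.
  const withAppended = {
    $concatArrays: [
      intervals,
      [{ keyword: { $literal: keyword }, start: newEnd, end: newEnd }],
    ],
  };

  return {
    filter: { _id: docId },
    update: [
      { $set: { thearray: { $cond: [anyUpdatable, withNewEnd, withAppended] } } },
    ],
    options: { upsert: true },
  };
}

const args = buildIntervalUpsert("docid", "red", 12, 22);
console.log(JSON.stringify(args, null, 2));
// In mongosh (the collection name "intervals" is an assumption):
// db.intervals.updateOne(args.filter, args.update, args.options)
```

The function only builds plain objects, so it can be inspected or logged before anything is sent to the server.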

Related

Upsert timeseries in Mongodb v5 - v6

I'm reading the documentation about Timeseries in Mongodb v5 - v6 and I don't understand if it's possible to upsert a record after it has been saved; for example if I have a record like this (the "name" field is the "metadata" ):
{
_id: ObjectId("6560a0ef02a1877734a9df66")
timestamp: 2022-11-24T01:00:00.000Z,
name: 'sensor1',
pressure: 5,
temperature: 25
}
is it possible to update the value of the "pressure" field after the record has been saved?
From the official mongo documentation, inside the "Time Series Collection Limitations" section, I read that: The update command may only modify the metaField field value.
Is there a way to upsert other fields as well? Thanks a lot.
No, updating the pressure field in your example is impossible with update alone, and upsert doesn't exist for time series collections.
The only functions currently available for time series collections are Delete and Update, but they only work on the metaField values, so in your example, we can only update/rename 'sensor1'.
The only workaround I know to update values is as follows:
Get a copy of all documents matched on the metaField values.
Update desired values on the copied documents.
Delete the original documents from the database
Insert your new copy of the documents into the database.
Here's a way to update values in a time series collection, using the MongoDB Shell (mongosh).
First, we create a test collection. The important part here is the metaField named "metadata". This field will be an object/dictionary that stores multiple fields.
db.createCollection(
"test_coll",
{
timeseries: {
timeField: "timestamp",
metaField: "metadata",
granularity: "hours"
}
}
)
Then we add some test data to the collection. Note that 'metadata' is an object/dictionary that stores two fields named sensorName and sensorLocation.
db.test_coll.insertMany( [
{
"metadata": { "sensorName": "sensor1", "sensorLocation": "outside"},
"timestamp": ISODate("2022-11-24T01:00:00.000Z"),
"pressure": 5,
"temperature": 32
},
{
"metadata": { "sensorName": "sensor1", "sensorLocation": "outside" },
"timestamp": ISODate("2022-11-24T02:00:00.000Z"),
"pressure": 6,
"temperature": 35
},
{
"metadata": { "sensorName": "sensor2", "sensorLocation": "inside" },
"timestamp": ISODate("2022-11-24T01:00:00.000Z"),
"pressure": 7,
"temperature": 72
},
] )
In your example we want to update the 'pressure' field which currently holds the pressure value of 5. So, we need to find all documents where the metaField 'metadata.sensorName' has a value of 'sensor1' and store all the found documents in a variable called old_docs.
var old_docs = db.test_coll.find({ "metadata.sensorName": "sensor1" })
Next, we loop through the documents (old_docs), updating them as needed. We add the documents (updated or not) to a variable named updated_docs. In this example, we are looping through all 'sensor1' documents, and if the timestamp is equal to '2022-11-24T01:00:00.000Z' we update the 'pressure' field with the value 555 ( which was initially 5 ). Alternatively, we could search for a specific _id here instead of a particular timestamp.
Note that there is a 'pressure' value of 7 at the timestamp 2022-11-24T01:00:00.000Z as well, but its value will remain the same because we are only looping through the 'sensor1' documents, so the document with sensorName set to sensor2 will not be updated.
var updated_docs = [];
while (old_docs.hasNext()) {
var doc = old_docs.next();
if (doc.timestamp.getTime() == ISODate("2022-11-24T01:00:00.000Z").getTime()){
print(doc.pressure)
doc.pressure = 555
}
updated_docs.push(doc)
}
We now have a copy of all the documents for 'sensor1' and we have updated our desired fields.
Next, we delete all documents with the metaField 'metadata.sensorName' equal to 'sensor1' (on an actual database, please don't forget to back up first).
db.test_coll.deleteMany({ "metadata.sensorName": "sensor1" })
And finally, we insert our updated documents into the database.
db.test_coll.insertMany(updated_docs)
This workaround will update values, but it will not upsert them.

how can I make the "updated" of mongodb stop when updating a field of a nested array?

I have a database like this:
{
"universe":"comics",
"saga":[
{
"name":"x-men",
"characters":[
{
"character":"wolverine",
"picture":"618035022351.png"
},
{
"character":"wolverine",
"picture":"618035022352.png"
}
]
}
]
},
{
"universe":"dc",
"saga":[
{
"name":"spiderman",
"characters":[
{
"character":"venom",
"picture":"618035022353.png"
}
]
}
]
}
And with this code, I update the elements where character is wolverine:
db.getCollection('collection').findOneAndUpdate(
{
"universe": "comics"
},
{
$set: {
"saga.$[outer].characters.$[inner].character": "lobezno",
"saga.$[outer].characters.$[inner].picture": "618035022354.png"
}
},
/*{
"saga.characters": 1
},*/
{
"arrayFilters": [
{
"outer.name": "x-men"
},
{
"inner.character": "wolverine"
}
],
"multi":false
}
)
I want to just update the first object where there is a match, and stop it.
For example, if I have an array of 100,000 elements and the matching object is in the tenth position, it will update that record, but it will keep going through the entire array, which seems inefficient to me since the update has already been done.
Note: if I did the update using an _id inside of universe.saga.characters instead of doing the update using the name, it would still loop through the rest of the elements.
How can I do it?
Update using arrayFilters conditions
I don't think it finds and updates by looping, and it should not matter if the document has 100,000 subdocuments. The documentation for $[<identifier>] has a nice explanation and mentions:
Use $[<identifier>] to define an identifier that updates only those array elements that match the corresponding filter document in arrayFilters.
In the update document, use the $[<identifier>] filtered positional operator to define an identifier, which you then reference in the array filter documents. Note that you cannot have an array filter document for an identifier that is not included in the update document.
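To make the matching rule concrete: $[<identifier>] applies the change to every element that satisfies its filter, not only to the first one. The sketch below is a plain-JavaScript model of that behaviour for the question's path "saga.$[outer].characters.$[inner]"; it is not MongoDB code, and the function name applyFilteredSet is hypothetical.

```javascript
// Plain-JavaScript model (not MongoDB itself) of how
// "saga.$[outer].characters.$[inner]" selects elements with the
// arrayFilters { "outer.name": ... } and { "inner.character": ... }.
function applyFilteredSet(doc, outerName, innerCharacter, changes) {
  let touched = 0;
  for (const saga of doc.saga) {
    if (saga.name !== outerName) continue; // outer.name filter
    for (const entry of saga.characters) {
      if (entry.character !== innerCharacter) continue; // inner.character filter
      Object.assign(entry, changes); // the $set part
      touched += 1;
    }
  }
  return touched;
}

// Sample document copied from the question.
const doc = {
  universe: "comics",
  saga: [
    {
      name: "x-men",
      characters: [
        { character: "wolverine", picture: "618035022351.png" },
        { character: "wolverine", picture: "618035022352.png" },
      ],
    },
  ],
};

const touched = applyFilteredSet(doc, "x-men", "wolverine", {
  character: "lobezno",
  picture: "618035022354.png",
});
console.log(touched, doc.saga[0].characters);
```

With the sample data both wolverine entries are changed, so the filter needs a more specific condition (for example a unique picture or _id value) if only one element should be touched.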
Update using _id
Your point,
Note: if I did the update using an _id inside of universe.saga.characters instead of doing the update using the name, it would still loop through the rest of the elements.
MongoDB will certainly use the _id index. There is a nice answer to the question "MongoDB Update Query Performance" that will give you a better idea of this point.
Update using indexed fields
You can create an index that matches the query section of your update command. "MongoDB Indexes and Indexing Strategies" explains why indexes are important.
Let's look at two examples based on your case:
Example 1: If the document has 2 subdocuments, and you run the update and check it with explain("executionStats"), assume it takes 1 second.
Quick test: Mongo Playground (this platform does not support update queries)
Example 2: If the document has 1,000 subdocuments, and you run the update and check it with explain("executionStats"), it may take more than 1 second.
If you provide an index on the fields (universe, saga.characters.character and saga.characters.picture), it should take less time than without an index; the main benefit of an index is that it points directly to the indexed fields.
Quick test: Mongo Playground (this platform does not support update queries)
Create Index for your fields
db.maxData.createIndex({
"universe": 1,
"saga.characters.character": 1,
"saga.characters.picture": 1
})
To experiment further, run the two examples above with and without the index and compare the executionStats output for more clarity.

Update field only if document updated in MongoDB

I am calling findAndModify() using the $max function to set the value of a field to the largest value.
For example, as shown in the MongoDB documentation.
db.scores.update( { _id: 1 }, { $max: { highScore: 950 } } )
I'd like to also set a lastUpdatedTimestamp only if the document is updated. I can't just perform a $set because that will always change the last updated timestamp. Is there a good mechanism within MongoDB to set another value only if the document is updated? Something similar to $setOnInsert but for any update.
If there isn't what might be a good approach here? Right now I'm thinking I could perform a regular find. Then do a local comparison. If the new value is greater than the old, then there is a good possibility that the update will update the document. So I just include the $set for the lastUpdatedTimestamp.
You can filter for records whose highScore is less than your input value and update only those. That way lastUpdatedTimestamp is set only when the record is actually updated.
db.scores.findAndModify({
query: { _id: 1, highScore: { $lt: 950 } }, // _id as in the question, so only that document is considered
update: { $set: { "highScore" : 950, "lastUpdatedTimestamp" : new Date() } }
})
As I understand it, you want to update your document only if highScore can be updated.
The document gets its score field and lastUpdatedTimestamp updated only if its stored score is lower than the new score value.
The best way is to put the new score in the filter so that it matches only documents where the old score < the new score.
Do it like this:
db.scores.update(
{_id :4,highScore:{$lt:900}},
{$set:{highScore:900},
$currentDate: { lastModified: true }})
or set the modification time yourself (new_time is a value you supply), with an update document like
{$set:{highScore:900, lastupdatetime: new_time}}
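The idea shared by both answers, moving the comparison into the filter so that the timestamp is written only when the score really changes, can be sketched as a small helper. The function name buildHighScoreUpdate is hypothetical, and the sketch assumes the field names from the question (highScore, lastUpdatedTimestamp). One caveat: unlike $max, a { $lt: ... } filter does not match documents that have no highScore field yet.

```javascript
// Hypothetical helper: the "is the new score higher?" check lives in the
// filter, so $set and $currentDate run together or not at all.
function buildHighScoreUpdate(id, newScore) {
  return {
    filter: { _id: id, highScore: { $lt: newScore } },
    update: {
      $set: { highScore: newScore },
      $currentDate: { lastUpdatedTimestamp: true },
    },
  };
}

const { filter, update } = buildHighScoreUpdate(1, 950);
console.log(JSON.stringify({ filter, update }));
// In mongosh:
// const result = db.scores.updateOne(filter, update);
// result.modifiedCount tells whether the stored score was lower and got replaced.
```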

Mongo $set removing other fields in object on update

I am working on a mongo statement for adding and updating values into an object within my document.
Here is my current statement. field and value changes depending what is getting passed in:
db.collection.update(id, {
$set: {
analysis : {[field]: value}
}
});
Here is an example of what a document could look like(there are potentially 20+ fields in analysis)
{
_id
analysis:{
interest_rate: 22
sales_cost: 4000
value: 300
}
}
The problem is that every time I update the object, all fields are removed except the field I updated.
so if
field = interest_rate
and the new
value = 33
my document would end up looking like this and all the other fields in analysis would be removed:
{
_id
analysis:{
interest_rate: 33
}
}
Is there a way to update fields within an object like this to keep the code simple or will I have to write out update statements for each individual field?
You should use the dot notation to build the path when you're trying to update nested field. Try:
let fieldPath = 'analysis.' + field; // for instance "analysis.interest_rate"
db.collection.update(id, {
$set: {
[fieldPath]: value // computed key: uses the value of fieldPath, not the literal name "fieldPath"
}
});
Otherwise you're just replacing existing analysis object.
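A detail worth spelling out: for the dot-notation path to be used as the key, the variable has to appear as a computed property name (square brackets) in the object literal; otherwise JavaScript takes the variable's name itself as the key. A minimal sketch, where buildNestedSet is a hypothetical helper:

```javascript
// Hypothetical helper: builds a $set document that touches one nested field only.
function buildNestedSet(parent, field, value) {
  const fieldPath = `${parent}.${field}`; // e.g. "analysis.interest_rate"
  return { $set: { [fieldPath]: value } }; // brackets => computed key
}

const fieldPath = "analysis.interest_rate";
const literalKey = { fieldPath: 33 };    // key is the text "fieldPath"
const computedKey = { [fieldPath]: 33 }; // key is "analysis.interest_rate"

console.log(Object.keys(literalKey));  // [ 'fieldPath' ]
console.log(Object.keys(computedKey)); // [ 'analysis.interest_rate' ]
console.log(JSON.stringify(buildNestedSet("analysis", "interest_rate", 33)));
```

The resulting document can be passed as the update argument, so the other fields inside analysis stay as they are.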

Mongo add timestamp field from existing date field

I currently have a collection with documents like the following:
{ foo: 'bar', timeCreated: ISODate("2012-06-28T06:51:48.374Z") }
I would now like to add a timestampCreated key to the documents in this collection, to make querying by time easier.
I was able to add the new column with an update and $set operation and set the timestamp value, but it appears to be setting the current timestamp when I use this:
db.reports.update({}, {
$set : {
timestampCreated : new Timestamp(new Date('$.timeCreated'), 0)
}
}, false, true);
However, I have not been able to figure out a way to add this column and set its value to the timestamp of the existing 'timeCreated' field.
Do a find for all the documents, limiting to just the id and timeCreated fields. Then loop over that and generate the timestampCreated value, and do an update on each.
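A minimal sketch of that find-and-loop approach, under a few assumptions: the helper name buildTimestampOps is hypothetical, the new value is stored as milliseconds since the epoch (the same quantity $toLong yields for a Date), and the fetched timeCreated values arrive as JavaScript Date objects.

```javascript
// Hypothetical helper: turns fetched documents into bulkWrite operations,
// one updateOne per document that has a Date in timeCreated.
function buildTimestampOps(docs) {
  return docs
    .filter((doc) => doc.timeCreated instanceof Date)
    .map((doc) => ({
      updateOne: {
        filter: { _id: doc._id },
        update: { $set: { timestampCreated: doc.timeCreated.getTime() } },
      },
    }));
}

// Stand-in data; the second document has no timeCreated and is skipped.
const sample = [
  { _id: 1, timeCreated: new Date("2012-06-28T06:51:48.374Z") },
  { _id: 2 },
];
const ops = buildTimestampOps(sample);
console.log(JSON.stringify(ops));
// In mongosh, roughly:
// const docs = db.reports.find({}, { timeCreated: 1 }).toArray();
// db.reports.bulkWrite(buildTimestampOps(docs));
```

Batching the updates through bulkWrite avoids one round trip per document, though for large collections the documents would need to be processed in chunks.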
Use updateMany() which can accept aggregate pipelines (starting from MongoDB 4.2) and thus take advantage of the $toLong operator which converts a Date into the number of milliseconds since the epoch.
Also use the $type query in the update filter to limit only documents with the timeCreated field and of Date type:
db.reports.updateMany(
{ 'timeCreated': {
'$exists': true,
'$type': 9
} },
[
{ '$set': {
'timestampCreated': { '$toLong': '$timeCreated' }
} }
]
)