Upsert time series in MongoDB v5 / v6

I'm reading the documentation about time series collections in MongoDB v5 / v6, and I don't understand whether it's possible to upsert a record after it has been saved; for example, if I have a record like this (the "name" field is the metaField):
{
    _id: ObjectId("6560a0ef02a1877734a9df66"),
    timestamp: 2022-11-24T01:00:00.000Z,
    name: 'sensor1',
    pressure: 5,
    temperature: 25
}
Is it possible to update the value of the "pressure" field after the record has been saved?
From the official MongoDB documentation, in the "Time Series Collection Limitations" section, I read: "The update command may only modify the metaField field value."
Is there a way to upsert other fields as well? Thanks a lot.

No, updating the pressure field in your example is impossible with update alone, and upsert is not supported for time series collections.
The only write operations currently available on time series collections (besides insert) are delete and update, and updates may only modify the metaField values, so in your example we can only update/rename 'sensor1'.
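For instance, a minimal sketch of the one kind of update that is allowed, renaming the metaField value from your example (the collection name coll is an assumption here):

// Allowed: the update touches only the metaField ("name" in the question's example)
db.coll.updateMany(
    { "name": "sensor1" },
    { $set: { "name": "sensorA" } }
)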
The only workaround I know of to update values is as follows:
1) Get a copy of all documents matched on the metaField values.
2) Update the desired values on the copied documents.
3) Delete the original documents from the database.
4) Insert your new copy of the documents into the database.
Here's a way to update values on a time series collection using the MongoDB Shell (mongosh).
First, we create a test collection. The important part here is the metaField named "metadata": this field is an object that stores multiple fields.
db.createCollection(
    "test_coll",
    {
        timeseries: {
            timeField: "timestamp",
            metaField: "metadata",
            granularity: "hours"
        }
    }
)
Then we add some test data to the collection. Note that 'metadata' is an object that stores two fields, named sensorName and sensorLocation.
db.test_coll.insertMany( [
    {
        "metadata": { "sensorName": "sensor1", "sensorLocation": "outside" },
        "timestamp": ISODate("2022-11-24T01:00:00.000Z"),
        "pressure": 5,
        "temperature": 32
    },
    {
        "metadata": { "sensorName": "sensor1", "sensorLocation": "outside" },
        "timestamp": ISODate("2022-11-24T02:00:00.000Z"),
        "pressure": 6,
        "temperature": 35
    },
    {
        "metadata": { "sensorName": "sensor2", "sensorLocation": "inside" },
        "timestamp": ISODate("2022-11-24T01:00:00.000Z"),
        "pressure": 7,
        "temperature": 72
    }
] )
In your example we want to update the 'pressure' field, which currently holds the value 5. So we need to find all documents where the metaField 'metadata.sensorName' has a value of 'sensor1' and store them in a variable called old_docs:
var old_docs = db.test_coll.find({ "metadata.sensorName": "sensor1" })
Next, we loop through the documents in old_docs, updating them as needed, and push each document (updated or not) into a variable named updated_docs. In this example we loop through all 'sensor1' documents, and if the timestamp equals '2022-11-24T01:00:00.000Z' we set the 'pressure' field to 555 (it was initially 5). Alternatively, we could match a specific _id here instead of a timestamp.
Note that there is also a 'pressure' value of 7 at timestamp 2022-11-24T01:00:00.000Z, but it will remain unchanged, because we are only looping through the 'sensor1' documents; the document with sensorName set to 'sensor2' is never touched.
var updated_docs = [];
while (old_docs.hasNext()) {
    var doc = old_docs.next();
    if (doc.timestamp.getTime() == ISODate("2022-11-24T01:00:00.000Z").getTime()) {
        print(doc.pressure);
        doc.pressure = 555;
    }
    updated_docs.push(doc);
}
We now have a copy of all the 'sensor1' documents, with our desired fields updated.
Next, we delete all documents whose metaField 'metadata.sensorName' equals 'sensor1' (on a real database, please don't forget to back up first):
db.test_coll.deleteMany({ "metadata.sensorName": "sensor1" })
And finally, we insert our updated documents into the database.
db.test_coll.insertMany(updated_docs)
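To confirm the change, you can re-query the collection; the 'sensor1' document at 2022-11-24T01:00:00.000Z should now show a pressure of 555:

db.test_coll.find({ "metadata.sensorName": "sensor1" })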
This workaround will update values, but it will not upsert them.

Related

What will "request.resource.data" contain if I only update a single field of a document among several fields in a Firestore database?

Suppose I have several fields in the favorites document, and it looks like this:
{
    favorites: {
        "color": "Red",
        "place": "ocean",
        "time": "morning"
    }
}
Now, I want to update a single field in this document. If I write the code below to update a single field, will request.resource.data contain only this field, or will it also contain the other fields in the document?
db.collection("users")
.document("favorites")
.update({
"color": "Red"
});
The request.resource variable contains the document as it will exist after the update, if that update is allowed.
So request.resource.data contains your updated color, as well as the place and time values from the existing document.

Managing schema changes with MongoDB

How do I handle it if the document structure changes after going to production?
Suppose I had 500 documents like this:
{
    name: "n1",
    height: "h1"
}
Later, if I decide to add all new documents in the format below:
{
    name: "n501",
    height: "h501",
    weight: "w501"
}
I am using cursor.All(&userDetails) in Go to decode (deserialize) the query output into the struct userDetails. If I modify the structure of further documents and of userDetails accordingly, will it fail for the first 500 documents?
How do I handle this change?
If you add a new field to your struct, querying old documents will not fail. Since the old documents do not have the new field saved in MongoDB, querying them will give you struct values where the new field holds its zero value: e.g., if its type is string, it will be the empty string "", and if it's an int field, it will be 0.
If it bothers you that the old documents do not have this new field, you may extend them in the mongo console like this:
db.mycoll.updateMany({ "weight": {$exists:false} }, { $set: {"weight": ""} } )
This command adds a new weight field to the old documents where the field did not exist, setting it to the empty string.

Query in MongoDB / Mongoose

There is a MongoDB database containing a collection of products, which was created by importing from a CSV file, with a unique _id per document. In products, each document has a field articul corresponding to the manufacturer's article number, and a field size indicating the size of the product. Since one product can come in several sizes, the import creates multiple documents that share the same articul but have different size values.
How do I make a selection from products and create another collection holding one document per unique articul that contains all the size values for that articul?
What you are looking for is aggregation. You can group the documents on the article names and save the result in a different collection:
db.products.aggregate([
    { $group: { _id: '$articul', sizes: { $addToSet: '$size' } } },
    { $out: 'articles' }
])
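Given, say, 'Banana' products imported in sizes 8 and 9, each document in the resulting articles collection would be shaped like this (hypothetical data):

{ "_id": "Banana", "sizes": [ 8, 9 ] }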
To store a new size for an existing articul:
db.products.updateOne(
    { "articul": "Banana" },
    { $addToSet: { size: 9 } }
)
If nothing matched the query above, you need to create a new insert query:
db.products.insertOne({
"articul": "Apple", "size": [8]
});
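Alternatively, a sketch of collapsing those two steps into a single upsert (assuming size is stored as an array, as in the insert above):

// If a document with this articul exists, the size is added to its set;
// otherwise a new document { articul: "Apple", size: [8] } is inserted automatically.
db.products.updateOne(
    { "articul": "Apple" },
    { $addToSet: { "size": 8 } },
    { upsert: true }
)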

Delete a MongoDB subdocument by value

I have a collection containing documents that look like this:
{
    "user": "foo",
    "topics": {
        "Topic AB": {
            "score": 20,
            "frequency": 3,
            "last_seen": 40
        },
        "Topic BD": {
            "score": 10,
            "frequency": 2,
            "last_seen": 38
        },
        "Topic TF": {
            "score": 19,
            "frequency": 6,
            "last_seen": 20
        }
    }
}
I want to remove subdocuments whose last_seen value is less than 30.
I don't want to use arrays here since I'm using $inc to update the subdocuments in conjunction with upsert (which doesn't support the $ notation).
The real question here is how can I delete a key depending on its value. Using $unset simply drops a subdocument regardless of what it contains.
I'm afraid I don't think this is possible with your current design. If you know the name of the key whose last_seen value you wish to test, for example Topic TF, you can do:
db.topics.update(
    { "topics.Topic TF.last_seen": { "$lt": 30 } },
    { "$unset": { "topics.Topic TF": 1 } }
)
However, with an embedded document structure, if you don't know the name of the key that you want to query against then you can't run the query. If the Topic XX keys are only known by what's in the document, you'd have to pull the whole document to find out what keys to test, and at that point you ought to just manipulate the document client-side and then update by _id.
The best option is to use arrays. The $ positional operator works with upserts; it just has a serious gotcha that, in the case of an insert, the $ will be interpreted as part of the field name instead of as an operator, so I understand your conclusion that it doesn't seem feasible. I'm not quite sure how you are using upsert such that arrays seem like they won't work, though. Could you give more detail there? I'll try to help come up with a reasonable workaround to use arrays and $ with your use case.
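For illustration, a sketch of what the removal looks like after an array-based redesign (hypothetical shape: topics becomes an array of subdocuments, each carrying a name field alongside score, frequency, and last_seen):

// Redesigned document:
// { "user": "foo", "topics": [ { "name": "Topic AB", "score": 20, "frequency": 3, "last_seen": 40 }, ... ] }
// Remove every topic whose last_seen is below 30, across all users:
db.topics.updateMany(
    {},
    { $pull: { "topics": { "last_seen": { $lt: 30 } } } }
)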

How do I manage a sublist in Mongodb?

I have different types of data that would be difficult to model and scale with a relational database (e.g., a product type).
I'm interested in using MongoDB to solve this problem.
I am referencing the documentation on MongoDB's website:
http://docs.mongodb.org/manual/tutorial/model-referenced-one-to-many-relationships-between-documents/
For the data type that I am storing, I also need to maintain a relational list of IDs where this particular product is available (e.g., store location IDs).
In their example regarding "one-to-many relationships with embedded documents", they have the following:
{
    name: "O'Reilly Media",
    founded: 1980,
    location: "CA",
    books: [12346789, 234567890, ...]
}
I am currently importing the data with a spreadsheet, and want to use a batchInsert.
To avoid duplicates, I assume that:
1) I need to ensure an index on the ID, and ignore errors on the insert?
2) Do I then need to loop through all the IDs to insert each new related ID into books?
Your question could possibly be defined a little better, but let's consider the case where you have rows in a spreadsheet or another source that are all de-normalized in some way. In a JSON representation, the rows would look something like this:
{
    "publisher": "O'Reilly Media",
    "founded": 1980,
    "location": "CA",
    "book": 12346789
},
{
    "publisher": "O'Reilly Media",
    "founded": 1980,
    "location": "CA",
    "book": 234567890
}
In order to get those rows into the structure you want, one way to do this is to use the "upsert" functionality of the .update() method. Assuming you have some way of looping the input values, and that they are identified with some structure, an analog to this would be something like:
books.forEach(function(book) {
    db.publishers.update(
        { "name": book.publisher },
        {
            "$setOnInsert": {
                "founded": book.founded,
                "location": book.location
            },
            "$addToSet": { "books": book.book }
        },
        { "upsert": true }
    );
});
This essentially simplifies the code so that MongoDB does all of the data collection work for you. Where the "name" of the publisher is considered to be unique, the statement first searches for a document in the collection that matches the query condition given, on "name".
In the case where that document is not found, a new document is inserted. The database or driver takes care of creating the new _id value for this document, and your query "condition" is automatically written into the new document as well, since it is an implied value that should exist.
The usage of the $setOnInsert operator is to say that those fields will only be set when a new document is created. The final part uses $addToSet in order to "push" the book values that have not already been found into the "books" array (or set).
The reason for the separation is for when a document is actually found to exist with the specified "publisher" name. In this case, all of the fields under the $setOnInsert will be ignored as they should already be in the document. So only the $addToSet operation is processed and sent to the server in order to add the new entry to the "books" array (set) and where it does not already exist.
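After processing the two example rows above, the publishers collection would therefore contain a single document shaped roughly like this (with whatever _id the driver generated):

{
    "_id": ObjectId("..."),
    "name": "O'Reilly Media",
    "founded": 1980,
    "location": "CA",
    "books": [ 12346789, 234567890 ]
}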
So that would be simplified logic compared to aggregating the new records in code before sending a new insert operation. However, it is not very "batch"-like, as you are still sending one operation to the server for each row.
This is fixed in MongoDB version 2.6 and above as there is now the ability to do "batch" updates. So with a similar analog:
var batch = [];
books.forEach(function(book) {
    batch.push({
        "q": { "name": book.publisher },
        "u": {
            "$setOnInsert": {
                "founded": book.founded,
                "location": book.location
            },
            "$addToSet": { "books": book.book }
        },
        "upsert": true
    });
    // Flush to the server once 500 operations are queued
    if ( batch.length % 500 == 0 ) {
        db.runCommand({ "update": "publishers", "updates": batch });
        batch = [];
    }
});
// Send any remaining operations
if ( batch.length > 0 ) {
    db.runCommand({ "update": "publishers", "updates": batch });
}
What this does is queue up all of the constructed update statements and send them to the server in a single call, with a sensible number of operations per batch, in this case one call for every 500 items processed. The actual limit is the BSON document maximum of 16MB, so this can be tuned appropriately for your data.
If your MongoDB version is lower than 2.6, then you either use the first form or do something similar to the second form using the existing batch insert functionality. But if you choose to insert, then you need to do all of the pre-aggregation work within your code.
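A minimal sketch of that pre-2.6 alternative, aggregating the rows client-side and then issuing one batch insert (this assumes the publisher names are all new, since a plain insert cannot merge into existing documents):

var byPublisher = {};
books.forEach(function(book) {
    // Create the publisher document on first sight, then collect its books
    if (!byPublisher[book.publisher]) {
        byPublisher[book.publisher] = {
            "name": book.publisher,
            "founded": book.founded,
            "location": book.location,
            "books": []
        };
    }
    byPublisher[book.publisher].books.push(book.book);
});
// Turn the map into an array and insert all documents in one batch
var docs = Object.keys(byPublisher).map(function(k) { return byPublisher[k]; });
db.publishers.insert(docs);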
All of these methods are of course supported by the PHP driver, so it is just a matter of adapting this to your actual code and deciding which course you want to take.