I have created a dummy collection and now want to import it. Each item has "created" and "updated" fields. What can I put in the source/JSON file so that MongoDB will use the current date and time as the value on import?
This won't work:
"created" : Date()
mongoimport is intended for importing existing data in CSV, TSV, or JSON format. If you want to insert new fields (such as a created timestamp), you will have to set a value for them.
For example, if you want to set the created timestamp to the current time, you could get a unix timestamp from the command line (which will be seconds since the epoch):
$ date +%s
1349960286
The JSON <date> representation that mongoimport expects is a 64-bit signed integer representing milliseconds since the epoch. You'll need to multiply the Unix-time seconds value by 1000 and include it in your JSON file:
{ "created": Date(1349960286000) }
An alternative approach would be to add the created timestamps to documents after they have been inserted.
For example:
db.mycoll.update(
    { created: { $exists: false } },     // Query criteria
    { $set: { created: new Date() } },   // Add 'created' timestamp
    false,                               // upsert
    true                                 // update all matching documents
)
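With MongoDB 3.2+ shells and drivers, the same backfill can also be written with updateMany (a sketch equivalent to the call above):
db.mycoll.updateMany(
    { created: { $exists: false } },      // only documents missing the field
    { $set: { created: new Date() } }     // add the 'created' timestamp
)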
As Stennie correctly pointed out, you cannot do this with just mongoimport or mongorestore: they only restore previously dumped data. The correct way is to restore the data and then run an update on the restored documents.
With MongoDB 2.6 you can do this easily using the $currentDate operator, which sets a field to the current date or timestamp.
In your case you need something like
db.users.update(
    {},
    {
        $currentDate: {
            created: true,
            updated: true
        }
    },
    { multi: true }    // update every matching document, not just the first
)
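If you only want to stamp documents that do not have these fields yet, an updateMany variant along these lines should also work (a sketch, not the only option):
db.users.updateMany(
    { created: { $exists: false } },                    // only documents without a 'created' field
    { $currentDate: { created: true, updated: true } }
)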
I'm reading the documentation about time series collections in MongoDB v5 - v6, and I don't understand whether it's possible to upsert a record after it has been saved; for example, if I have a record like this (the "name" field is the "metadata"):
{
    _id: ObjectId("6560a0ef02a1877734a9df66"),
    timestamp: 2022-11-24T01:00:00.000Z,
    name: 'sensor1',
    pressure: 5,
    temperature: 25
}
is it possible to update the value of the "pressure" field after the record has been saved?
From the official mongo documentation, inside the "Time Series Collection Limitations" section, I read that: The update command may only modify the metaField field value.
Is there a way to update other fields as well? Thanks a lot.
No, updating the pressure field in your example is impossible with update alone, and upsert doesn't exist for time series collections.
The only functions currently available for time series collections are Delete and Update, but they only work on the metaField values, so in your example, we can only update/rename 'sensor1'.
The only workaround I know to update values is as follows:
1. Get a copy of all documents matched on the metaField values.
2. Update desired values on the copied documents.
3. Delete the original documents from the database.
4. Insert your new copy of the documents into the database.
Here's a way to update values in a time series collection, using the MongoDB Shell (mongosh).
First, we create a test database. The important part here is the metaField named "metadata." This field will be an object/dictionary that stores multiple fields.
db.createCollection(
    "test_coll",
    {
        timeseries: {
            timeField: "timestamp",
            metaField: "metadata",
            granularity: "hours"
        }
    }
)
Then we add some test data to the collection. Note the 'metadata' is an object/dictionary that stores two fields named sensorName and sensorLocation.
db.test_coll.insertMany( [
{
"metadata": { "sensorName": "sensor1", "sensorLocation": "outside"},
"timestamp": ISODate("2022-11-24T01:00:00.000Z"),
"pressure": 5,
"temperature": 32
},
{
"metadata": { "sensorName": "sensor1", "sensorLocation": "outside" },
"timestamp": ISODate("2022-11-24T02:00:00.000Z"),
"pressure": 6,
"temperature": 35
},
{
"metadata": { "sensorName": "sensor2", "sensorLocation": "inside" },
"timestamp": ISODate("2022-11-24T01:00:00.000Z"),
"pressure": 7,
"temperature": 72
},
] )
In your example we want to update the 'pressure' field which currently holds the pressure value of 5. So, we need to find all documents where the metaField 'metadata.sensorName' has a value of 'sensor1' and store all the found documents in a variable called old_docs.
var old_docs = db.test_coll.find({ "metadata.sensorName": "sensor1" })
Next, we loop through the documents (old_docs), updating them as needed. We add the documents (updated or not) to a variable named updated_docs. In this example, we are looping through all 'sensor1' documents, and if the timestamp is equal to '2022-11-24T01:00:00.000Z' we update the 'pressure' field with the value 555 ( which was initially 5 ). Alternatively, we could search for a specific _id here instead of a particular timestamp.
Note that there is also a 'pressure' value of 7 at the timestamp 2022-11-24T01:00:00.000Z, but it will remain the same because we are only looping through the 'sensor1' documents, so the document with sensorName set to sensor2 will not be updated.
var updated_docs = [];
while (old_docs.hasNext()) {
    var doc = old_docs.next();
    if (doc.timestamp.getTime() == ISODate("2022-11-24T01:00:00.000Z").getTime()) {
        print(doc.pressure)
        doc.pressure = 555
    }
    updated_docs.push(doc)
}
We now have a copy of all the documents for 'sensor1' and we have updated our desired fields.
Next, we delete all documents with the metaField 'metadata.sensorName' equal to 'sensor1' (on an actual database, please don't forget to back up first).
db.test_coll.deleteMany({ "metadata.sensorName": "sensor1" })
And finally, we insert our updated documents into the database.
db.test_coll.insertMany(updated_docs)
This workaround will update values, but it will not upsert them.
In the database, dates are stored in the format below. Here are two fields with the same name but different values:
packageDeliveredTime : 2020-08-21 2:39:37
packageDeliveredTime : 2020-08-21 09:3:45
Due to the above format, some of the APIs and database queries are not filtering the data correctly, and many records like this have been stored in the database.
The above date format needs to be updated as below:
packageDeliveredTime : 2020-08-21 02:39:37
packageDeliveredTime : 2020-08-21 09:03:45
You should never store date/time values as strings; it's a design flaw. Always store proper Date objects.
To convert the strings to dates, the $dateFromString function is not sufficient. You can use the moment.js library to do this:
db.collection.find({ packageDeliveredTime: { $exists: true, $type: "string" } }).forEach(function (doc) {
    db.collection.updateOne(
        { _id: doc._id },
        { $set: { packageDeliveredTime: moment(doc.packageDeliveredTime).toDate() } }
    );
})
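If moment.js is not available in your shell session, a plain-JavaScript sketch like the one below can do the zero-padding itself; it assumes every value follows the "YYYY-MM-DD H:M:S" pattern and represents UTC (drop the trailing "Z" if the strings are in the server's local time zone):
db.collection.find({ packageDeliveredTime: { $exists: true, $type: "string" } }).forEach(function (doc) {
    var parts = doc.packageDeliveredTime.split(" ");        // ["2020-08-21", "2:39:37"]
    var time = parts[1].split(":").map(function (p) {       // zero-pad hour/minute/second
        return p.length === 1 ? "0" + p : p;
    }).join(":");
    db.collection.updateOne(
        { _id: doc._id },
        { $set: { packageDeliveredTime: new Date(parts[0] + "T" + time + "Z") } }
    );
})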
I want to depict the following use case using MongoDb:
I want to read from a collection and memorize that particular point in time.
The next time I write to that collection, I want the write to fail if another document has been added to the collection in between.
Using a timestamp property on the documents would be ok.
Is this possible?
One trick is to use findAndModify.
Assume that at the time of reading, the most recent timestamp on a document is oldTimestamp:
db.collection.findAndModify({
    query: { timestamp: { $gt: oldTimestamp } },
    new: true,      // Return modified / inserted document
    upsert: true,   // Update if match found, insert otherwise
    update: {
        $setOnInsert: { ..your document... }
    }
})
This will not insert your document if another document is inserted between your read and write operation.
However, it won't directly tell you whether your document was inserted or not.
You should compare the returned document with your proposed document to find that out.
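For example, one way to tell the two cases apart is to generate the _id yourself before writing and check whose document came back (a sketch with illustrative field names, not the only way to do it):
var proposed = { _id: ObjectId(), payload: "...", timestamp: new Date() };

var result = db.collection.findAndModify({
    query: { timestamp: { $gt: oldTimestamp } },
    new: true,      // return the inserted (or already-existing) document
    upsert: true,
    update: { $setOnInsert: proposed }
});

if (result._id.toString() === proposed._id.toString()) {
    // our document was inserted
} else {
    // another document was written in between; 'result' is that document
}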
If you are using the Node.js driver, the correct pattern is:
collection.findAndModify(criteria[, sort[, update[, options]]], callback)
According to the example, our query should be:
db.collection('test').findAndModify(
    { timestamp: { $gt: oldTimestamp } },     // query; timestamp is a property of your document, often the created time
    [['timestamp', 'desc']],                  // sort order
    { $setOnInsert: { ..your document.. } },  // update, applied only when a new document is inserted
    {
        new: true,
        upsert: true
    },                                        // options
    function (err, object) {
        if (err) {
            console.warn(err.message);        // log any error from the operation
        } else {
            console.dir(object);
        }
    }
);
This can be achieved using a timestamp property in every document. You can take a look at the Mongoose pre-save path validation hook. Using this hook, you can write something like this:
YourSchema.path('timestamp').validate(function (value, done) {
    this.model(YourSchemaModelName).count({ timestamp: { $gt: value } }, function (err, count) {
        if (err) {
            return done(err);
        }
        // a non-zero count means a document with a greater timestamp already exists,
        // so validation fails (done(false))
        done(!count);
    });
}, 'Greater timestamp already exists');
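For context, a minimal sketch of the schema and model this snippet assumes (names are illustrative, and the validator above would be attached to the schema before the model is compiled):
var mongoose = require('mongoose');

var YourSchemaModelName = 'YourModel';
var YourSchema = new mongoose.Schema({
    timestamp: { type: Date, default: Date.now }
});

// ... attach the 'timestamp' path validator shown above here ...

mongoose.model(YourSchemaModelName, YourSchema);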
Sounds like you'll need to do some sort of optimistic locking at the collection level. I understand you are writing new documents but never updating existing ones in this collection?
You could add an index on the timestamp field, and your application would need to track the latest version of this value. Then, before attempting a new write, you could look up the latest value from the collection with a query like
db.collection.find({}, {timestamp: 1, _id:0}).sort({timestamp:-1}).limit(1)
which would project just the maximum timestamp value using a covered query, which is pretty efficient.
From that point on, it's up to your application logic to handle the 'conflict'.
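A sketch of what that application logic might look like in the shell (field names are illustrative; note that the check-then-insert sequence below is not atomic by itself):
function latestTimestamp() {
    var doc = db.collection.find({}, { timestamp: 1, _id: 0 })
                           .sort({ timestamp: -1 })
                           .limit(1)
                           .toArray()[0];
    return doc ? doc.timestamp : null;
}

var seenAtRead = latestTimestamp();   // remembered when the collection was read

// ... later, just before writing:
var latestNow = latestTimestamp();
if (latestNow === null || (seenAtRead && latestNow.getTime() === seenAtRead.getTime())) {
    db.collection.insert({ payload: "...", timestamp: new Date() });   // nothing new appeared in between
} else {
    // conflict: another document was inserted in between
}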
MongoDB provides a way to update a date field by the system on update operations: https://docs.mongodb.com/manual/reference/operator/update/currentDate/. Is there any equivalent to this for insert operations?
You may try a few things if you do not want to handle this from code (I executed the code below directly in the mongo shell):
If you want to use $currentDate, use update with upsert: true:
db.orders.update(
    { "_id": ObjectId() },
    {
        $currentDate: {
            createtime: true
        }
    },
    { upsert: true }
)
Note that the ObjectId will now be generated on the app server instead of the date/time (unless you use a raw command).
Use a new Timestamp or Date object directly:
db.orders.insert(
    { "createtime": new Timestamp() }
)
The problem with most drivers will then be making sure the new object is created on the MongoDB server, not on the machine where the code is running. Your driver hopefully allows you to run a raw insert command.
Both approaches avoid time differences / time-sync issues between application server machines.
$currentDate is an update operator which populates a date field with the current date through an update operation.
To auto-populate a date field when inserting a new MongoDB document, please try executing the following code snippet:
var current_date=new Date();
db.collection.insert({datefield:current_date})
In the above code snippet, the statement
new Date()
creates a new JavaScript Date object, which consists of a year, a month, a day, an hour, a minute, a second, and milliseconds.
If you want to populate this value on the server side, and are concerned about it being passed by the client, you can add properties to the data object used in the insert statement only when it is about to be saved. This way, you can guarantee that it will always be added with the server's date, not the client's:
Client side
...
let data = { info1: 'value1', info2: 'value2'}
someApi.addInfo(data);
...
Server side
function addInfo(data){
...
data['creationDate'] = new Date();
db.collection.insertOne(data);
...
}
Result will be:
{
info1: 'value1',
info2: 'value2',
creationDate: ISODate("2018-09-15T21:42:13.815Z")
}
If you are passing multiple values for insertion (using insertMany) you have to loop over the items and add this property for all of them.
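For example, an insertMany counterpart of the server-side function above could look like this (a sketch; the function name is illustrative):
function addManyInfo(items) {
    var now = new Date();                 // one server-side timestamp for the whole batch
    items.forEach(function (item) {
        item.creationDate = now;
    });
    db.collection.insertMany(items);
}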
You can also use this approach when updating documents if for some reason you can't use the $currentDate operator, just be sure you are not replacing any existing properties in the data passed to mongodb.
Since MongoDB 3.6 you can use change streams:
https://emptysqua.re/blog/driver-features-for-mongodb-3-6/#change-streams
To use one, you create a change stream object with the watch() method and react to each change; for example, in Python with PyMongo:
from datetime import datetime

from pymongo import MongoClient


def update_ts_by(change):
    # Only react to update events; other operation types have no 'updateDescription'
    if change["operationType"] != "update":
        return
    update_fields = change["updateDescription"]["updatedFields"].keys()
    print("update_fields: {}".format(update_fields))
    collection = change["ns"]["coll"]
    db_name = change["ns"]["db"]
    key = change["documentKey"]  # {'_id': ...} of the changed document
    # Skip changes that only touched 'update_ts' itself, otherwise we would loop forever
    if len(update_fields) == 1 and "update_ts" in update_fields:
        pass
    else:
        client[db_name][collection].update_one(key, {"$set": {"update_ts": datetime.now()}})


client = MongoClient("172.17.0.2")
db = client["Data"]
change_stream = db.watch()
for change in change_stream:
    print(change)
    update_ts_by(change)
Note: to use the change_stream object, your MongoDB instance must run as a replica set.
This can also be done with a 1-node replica set (almost no change from standalone use):
Mongo DB - difference between standalone & 1-node replica set
I currently have a collection with documents like the following:
{ foo: 'bar', timeCreated: ISODate("2012-06-28T06:51:48.374Z") }
I would now like to add a timestampCreated key to the documents in this collection, to make querying by time easier.
I was able to add the new column with an update and $set operation and set a timestamp value, but it appears to be setting the current timestamp using this:
db.reports.update({}, {
    $set: {
        timestampCreated: new Timestamp(new Date('$.timeCreated'), 0)
    }
}, false, true);
However, I have not been able to figure out a way to add this column and set its value to the timestamp of the existing 'timeCreated' field.
Do a find for all the documents, limiting the results to just the _id and timeCreated fields. Then loop over that, generate the timestampCreated value, and do an update on each.
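A sketch of that loop in the mongo shell, storing timestampCreated as milliseconds since the epoch (the same representation $toLong produces in the next answer):
db.reports.find(
    { timeCreated: { $exists: true, $type: 9 } },   // BSON type 9 = Date
    { timeCreated: 1 }                              // _id is returned by default
).forEach(function (doc) {
    db.reports.update(
        { _id: doc._id },
        { $set: { timestampCreated: doc.timeCreated.getTime() } }
    );
});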
Use updateMany(), which can accept aggregation pipelines (starting from MongoDB 4.2), and take advantage of the $toLong operator, which converts a Date into the number of milliseconds since the epoch.
Also use the $type query operator in the update filter to limit the update to documents that have the timeCreated field and are of Date type:
db.reports.updateMany(
    { 'timeCreated': {
        '$exists': true,
        '$type': 9          // BSON type 9 = Date
    } },
    [
        { '$set': {
            'timestampCreated': { '$toLong': '$timeCreated' }
        } }
    ]
)