I am working with MongoDB and have a document structure like this:
{
    "_id" : ObjectId("5e9fef05c0228a50befba0d7"),
    "name" : "sanghm",
    "id" : "3456",
    "dep" : {
        "dep1" : "ops",
        "dep2" : "analytics"
    },
    "data" : [
        {
            "date" : "25-apr-2020",
            "log" : [
                {
                    "machine" : "windows-user1",
                    "task" : "excel",
                    "time" : "10:00am"
                },
                {
                    "machine" : "windows-user1",
                    "task" : "email",
                    "time" : "11:00am"
                }
            ]
        }
    ]
}
To push every new task to the database, I'm using a query like:
date = '25-apr-2020'
db.inventory.update(
    { id: '3456' },
    { $push: { 'data.$[t].log': { machine: 'windows-user1', task: 'new task', time: '3:00pm' } } },
    { arrayFilters: [ { 't.date': date } ] }
)
This query pushes the new task onto the log array for the matching date.
If the date changes to '26-apr-2020', can the same query first add a new date entry and then push data into it?
I have read about {upsert: true} but don't know how to use it in my case.
The double nesting is weird. Usually all of the events would be stored flat in the top-level data array, and the grouping by date (if desired) would be produced at query time.
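For illustration, a minimal sketch of that flat layout (an assumption about the restructuring, reusing the field names from the sample document; not the only possible shape):
// Flat layout: push one entry per task into the top-level "data" array;
// no arrayFilters or upsert gymnastics are needed.
db.inventory.update(
    { id: '3456' },
    { $push: { data: { date: '26-apr-2020', machine: 'windows-user1', task: 'new task', time: '3:00pm' } } }
)

// Reproduce the per-date grouping at query time.
db.inventory.aggregate([
    { $match: { id: '3456' } },
    { $unwind: '$data' },
    { $group: {
        _id: { id: '$id', date: '$data.date' },
        log: { $push: { machine: '$data.machine', task: '$data.task', time: '$data.time' } }
    } }
])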
I have a mongo DB aggregation pipeline that performs the following steps:
Sorts a list of user stats objects by timestamp
Groups the results by user ID
Sorts by a specified stat's name
Pages the results via skip and limit stages
In plain English, this pipeline returns a page from a list of user stats sorted by a specified stat. Each user can have multiple stats objects, so I group to return only the most recent stats object for each user.
In Mongo Shell, this looks like:
db.getCollection("stats").aggregate(
[
{ "$sort" : { "Timestamp" : -1.0 } },
{
"$group" : {
"_id" : "$UserId",
"UserId" : { "$last" : "$UserId" },
"StatsOverall" : { "$last" : "$StatsOverall" },
"Timestamp" : { "$last" : "$Timestamp" }
}
},
{ "$sort" : { "StatsOverall.Rank" : -1.0 } },
{ "$skip" : specifiedPageNumber },
{ "$limit" : specifiedNumResultsPerPage }
]
);
This works fine.
I now want to modify this query to be able to search for a user by name and get back the entire page that user is contained on. (This is for a leaderboard.) So, if the user is on page 5 of the leaderboard, I want to return the entirety of page 5.
However, I'm having trouble seeing a solution that doesn't require me to either load all of the users into memory and page them there (awful idea), or go back and forth to the database iterating through pages (almost as awful).
Is there some way I can modify my aggregation pipeline to do all this at the database level?
EDIT: As requested, added some sample data and the expected result.
Sample data looks something like this... I've omitted some fields that aren't relevant. The initial data is a collection of user's stats, where each user can have more than one object. My existing pipeline returns the 1 most recent stats object for each user sorted by a specified stat name.
{
    "_id" : "5c611e71ab0ffc430410e0ba",
    "UserId" : "5c611e71ab0ffc430410e0ba",
    "StatsOverall" : {
        "Rank" : NumberInt(1000),
        "GamesLost" : NumberInt(30),
        "GamesWon" : NumberInt(50)
    },
    "Timestamp" : "2019-02-10T21:35:06.599Z"
}
// ----------------------------------------------
{
    "_id" : "5c6238658966ae5860795879",
    "UserId" : "5c6238658966ae5860795879",
    "StatsOverall" : {
        "Rank" : NumberInt(413),
        "GamesLost" : NumberInt(2),
        "GamesWon" : NumberInt(141)
    },
    "Timestamp" : "2019-02-10T21:35:06.599Z"
}
// many objects like this
The expected result looks like this:
{
    "_id" : "5c611e71ab0ffc430410e0ba",
    "UserId" : "5c611e71ab0ffc430410e0ba",
    "StatsOverall" : {
        "Rank" : NumberInt(1000),
        "GamesLost" : NumberInt(30),
        "GamesWon" : NumberInt(50)
    },
    "Timestamp" : "2019-02-10T21:35:06.599Z"
}
It returns the exact same type of object, sorted the same way as the existing pipeline; however, I want to return only the page the user is on. In the example result, assume the page size is just 1 result per page. So, the result would contain the 1 page that the user with the given UserId is on. In my sample result, that ID would be 5c611e71ab0ffc430410e0ba.
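One way to keep this entirely at the database level (a hedged sketch, not from the original thread: targetUserId and pageSize are placeholder variables, and it requires MongoDB 3.4+ for $indexOfArray) is to push the ranked users into a single array, locate the target user's index, and slice out that page:
db.getCollection("stats").aggregate([
    { "$sort" : { "Timestamp" : -1.0 } },
    { "$group" : {
        "_id" : "$UserId",
        "UserId" : { "$last" : "$UserId" },
        "StatsOverall" : { "$last" : "$StatsOverall" },
        "Timestamp" : { "$last" : "$Timestamp" }
    } },
    { "$sort" : { "StatsOverall.Rank" : -1.0 } },
    // gather the whole ranked list into one document
    { "$group" : { "_id" : null, "users" : { "$push" : "$$ROOT" } } },
    // index of the target user in the ranked list (-1 if absent)
    { "$addFields" : { "idx" : { "$indexOfArray" : [ "$users.UserId", targetUserId ] } } },
    // slice out the page containing that index
    { "$project" : {
        "_id" : 0,
        "page" : { "$slice" : [
            "$users",
            { "$multiply" : [ { "$floor" : { "$divide" : [ "$idx", pageSize ] } }, pageSize ] },
            pageSize
        ] }
    } }
]);
Two caveats: the intermediate single document must stay under the 16MB BSON limit, and if the user is not found idx is -1, so that case should be handled before trusting the slice arithmetic.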
I have a MongoDB collection where some documents have arrays of objects. One of the fields of these objects is a timestamp.
The problem is that historically some of the timestamp values are Strings (e.g. '2018-02-25T13:33:56.675000') or Dates, and some of them are Doubles (e.g. 1528108521726.26).
I have to convert all of them to Double.
I've built the query to get all the documents with the problematic type:
db.getCollection('Cases').find({sent_messages: {$elemMatch:{timestamp: {$type:[2, 9]}}}})
And I also know how to convert Date-string to double using JS:
new Date("2018-02-18T06:39:20.797Z").getTime()
> 1518935960797
But I can't build the proper query to perform the update.
Here is an example of such a document:
{
"_id" : ObjectId("6c88f656532aab00050dc023"),
"created_at" : ISODate("2018-05-18T03:43:18.986Z"),
"updated_at" : ISODate("2018-05-18T06:39:20.798Z"),
"sent_messages" : [
{
"timestamp" : ISODate("2018-02-18T06:39:20.797Z"),
"text" : "Hey",
"sender" : "me"
}
],
"status" : 1
}
After the update it should be:
{
"_id" : ObjectId("6c88f656532aab00050dc023"),
"created_at" : ISODate("2018-05-18T03:43:18.986Z"),
"updated_at" : ISODate("2018-05-18T06:39:20.798Z"),
"sent_messages" : [
{
"timestamp" : 1518935960797.00,
"text" : "Hey",
"sender" : "me"
}
],
"status" : 1
}
As per your question, you first fetch the matching records:
db.getCollection('Cases').find({sent_messages: {$elemMatch:{timestamp: {$type:[2, 9]}}}})
Then convert the date in JS:
new Date("2018-02-18T06:39:20.797Z").getTime()
And then update with a query like this (note: the positional $ operator needs the array field in the query, and the value should be a number, not a string):
db.getCollection('Cases').updateOne(
    { _id: ObjectId("6c88f656532aab00050dc023"), "sent_messages.timestamp": { $type: [2, 9] } },
    { $set: { "sent_messages.$.timestamp": 218392712937.0 } }
)
And if you want to update all records then you should write some forEach mechanism; I think you already have this implemented.
Hope this helps.
Finally I just did it with JS code that can be run in the mongo console:
db.getCollection('Cases').find({sent_messages: {$elemMatch: {timestamp: {$type: [2, 9]}}}}).forEach(function(doc) {
    print('=================');
    print(JSON.stringify(doc));
    doc.sent_messages.forEach(function(msg) {
        // getTime() handles both ISO strings and Date objects
        var dbl = new Date(msg.timestamp).getTime();
        print(dbl);
        msg.timestamp = dbl;
    });
    print(JSON.stringify(doc));
    db.Cases.save(doc);
});
Thanks all for your help!
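For reference, on MongoDB 4.2+ the same conversion can be done entirely server-side with a pipeline update; a hedged sketch (not part of the original thread):
db.getCollection('Cases').updateMany(
    { sent_messages: { $elemMatch: { timestamp: { $type: [2, 9] } } } },
    [ { $set: {
        sent_messages: {
            $map: {
                input: "$sent_messages",
                as: "msg",
                in: { $mergeObjects: [ "$$msg", {
                    // $toDate parses ISO strings and passes Dates through;
                    // $toDouble then yields milliseconds since the epoch
                    timestamp: { $toDouble: { $toDate: "$$msg.timestamp" } }
                } ] }
            }
        }
    } } ]
)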
I have a JSON file with a horrific data structure
{ "#timestamp" : "20160226T065604,39Z",
"#toplevelentries" : "941",
"viewentry" : [ { "#noteid" : "161E",
"#position" : "1",
"#siblings" : "941",
"entrydata" : [
and entrydata is a list of 941 entries, each of which looks like this:
{ "#columnnumber" : "0",
"#name" : "$Created",
"datetime" : { "0" : "20081027T114133,55+01" }
},
{ "#columnnumber" : "1",
"#name" : "WriteLog",
"textlist" : { "text" : [ { "0" : "2008.OCT.28 12:54:39 CET # EMI" },
{ "0" : "2008.OCT.28 12:56:13 CET # EMI" },
There are many more columns. The structure is always this:
{
"#columnnumber": "17",
"#name": "PublicDocument",
"text": {
"0": "TMI-1-2005.pdf"
}
}
There's a column number which we can throw away, a #name which is the important part, and then one of text, datetime or textlist fields whose value is always this weird subdocument with a 0 key holding the actual value.
All 941 entries have the same number of these column entries, and each column entry always has the same structure. I.e. if "#columnnumber": "13" has #name: foo, then it will always be foo, and if it has a datetime key then it will always have a datetime key, never a text or textlist. This monster was born out of a SQL or similar database somewhere at the very far end of the pipeline, but I have no access to the source beyond this. The goal is to revert the transformation and make it into something a SELECT statement would produce (except textlist, although I guess array_agg and similar could produce that too).
Is there a way to get 941 separate JSON entries out of MongoDB looking like:
{
$Created: "20081027T114133,55+01",
WriteLog: ["2008.OCT.28 12:54:39 CET # EMI", "2008.OCT.28 12:56:13 CET # EMI"],
PublicDocument: "TMI-1-2005.pdf"
}
Is viewentry also a list?
If you do an aggregate on the collection and $unwind on viewentry.entrydata, you will get one document for every entrydata. It should then be possible to do a $project to reformat these documents and produce the output you need.
This is a nice challenge.
To get output like this:
{
    "_id" : "161E",
    "field" : [
        {
            "name" : "$Created",
            "datetime" : {
                "0" : "20081027T114133,55+01"
            }
        },
        {
            "name" : "WriteLog",
            "textlist" : {
                "text" : [
                    { "0" : "2008.OCT.28 12:54:39 CET# EMI" },
                    { "0" : "2008.OCT.28 12:56:13 CET# EMI" }
                ]
            }
        }
    ]
}
use this aggregation pipeline:
db.chx.aggregate([
    { $unwind: "$viewentry" },
    { $unwind: "$viewentry.entrydata" },
    { $group: {
        "_id" : "$viewentry.#noteid",
        field : { $push : {
            "name" : "$viewentry.entrydata.#name",
            datetime : "$viewentry.entrydata.datetime",
            textlist : "$viewentry.entrydata.textlist"
        } }
    } }
]).pretty()
The next step should be extracting the log entries, but I have no idea, as my brain is already fried tonight - so I will probably return later...
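Picking up where that pipeline leaves off, a hedged sketch of the full flattening (my assumption, not the original answer: it relies on $arrayToObject, MongoDB 3.4.4+, and on the fact that a path like textlist.text.0 collects the "0" field from every array element; also note that keys like $Created start with a dollar sign, which servers before 5.0 may refuse to store, so this is best consumed as a cursor result):
db.chx.aggregate([
    { $unwind: "$viewentry" },
    { $unwind: "$viewentry.entrydata" },
    { $project: {
        noteid: "$viewentry.#noteid",
        kv: {
            k: "$viewentry.entrydata.#name",
            // unwrap whichever of text / datetime / textlist is present
            v: { $ifNull: [ "$viewentry.entrydata.text.0",
                { $ifNull: [ "$viewentry.entrydata.datetime.0",
                             "$viewentry.entrydata.textlist.text.0" ] } ] }
        }
    } },
    { $group: { _id: "$noteid", fields: { $push: "$kv" } } },
    // turn the { k, v } pairs back into one flat row per note
    { $replaceRoot: { newRoot: { $arrayToObject: "$fields" } } }
])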
I have a collection:
gStats : {
    "_id" : "id1",
    "criteria" : [{"key1" : "value1"}, {"key2" : "value2"}],
    "groups" : [
        {"id" : "XXXX", "visited" : 100, "liked" : 200},
        {"id" : "YYYY", "visited" : 30, "liked" : 400}
    ]
}
I want to be able to update a document in the stats array for a given array of criteria (exact match).
I tried to do this in 2 steps:
Pull the stat document from the array for a given "id":
db.gStats.update({
"criteria" : {$size : 2},
"criteria" : {$all : [{"key1" : "2096955"},{"value1" : "2015610"}]}
},
{
$pull : {groups : {"id" : "XXXX"}}
}
)
Push the new document:
db.gStats.findAndModify({
query : {
"criteria" : {$size : 2},
"criteria" : {$all : [{"key1" : "2015610"}, {"key2" : "2096955"}]}
},
update : {
$push : {groups : {"id" : "XXXX", "visited" : 29, "liked" : 144}}
},
upsert : true
})
The pull query works perfectly.
The push query gives an error:
2014-12-13T15:12:58.571+0100 findAndModifyFailed failed: {
    "value" : null,
    "errmsg" : "exception: Cannot create base during insert of update. Caused by :ConflictingUpdateOperators Cannot update 'criteria' and 'criteria' at the same time",
    "code" : 12,
    "ok" : 0
} at src/mongo/shell/collection.js:614
Neither query is working in reality. You cannot use a key name like "criteria" more than once unless it is under an operator such as $and. You are also specifying different fields (i.e. groups) and querying elements that do not exist in your sample document.
So it is hard to tell what you really want to do here. But the error is essentially caused by the first issue I mentioned, with a little something extra. So really your { "$size": 2 } condition is being ignored and only the second condition is applied.
A valid query form should look like this:
query: {
"$and": [
{ "criteria" : { "$size" : 2 } },
{ "criteria" : { "$all": [{ "key1": "2015610" }, { "key2": "2096955" }] } }
]
}
As each set of conditions is specified within the array provided by $and, the document structure of the query is valid and no hash key overwrites the other. That's the proper way to write your two conditions, but there is a trick to making this work where the upsert fails because those conditions do not match a document. We need to override what happens when it tries to apply the $all arguments on creation:
update: {
"$setOnInsert": {
"criteria" : [{ "key1": "2015610" }, { "key2": "2096955" }]
},
"$push": { "stats": { "id": "XXXX", "visited": 29, "liked": 144 } }
}
That uses $setOnInsert so that when the upsert is applied and a new document is created, the values specified here are used instead of the field values from the query portion of the statement.
Of course, if what you are really looking for is truly an exact match of the content in the array, then just use that for the query instead:
query: {
"criteria" : [{ "key1": "2015610" }, { "key2": "2096955" }]
}
Then MongoDB will be happy to apply those values when a new document is created and will not get confused about how to interpret the $all expression.
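Putting the answer's fragments together, a sketch of the full corrected call (assuming, per the sample document, that the array being pushed to is groups rather than stats; new: true simply returns the modified document):
db.gStats.findAndModify({
    query: {
        "$and": [
            { "criteria": { "$size": 2 } },
            { "criteria": { "$all": [{ "key1": "2015610" }, { "key2": "2096955" }] } }
        ]
    },
    update: {
        // used only when the upsert inserts a new document
        "$setOnInsert": {
            "criteria": [{ "key1": "2015610" }, { "key2": "2096955" }]
        },
        "$push": { "groups": { "id": "XXXX", "visited": 29, "liked": 144 } }
    },
    upsert: true,
    new: true
})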
I need to get a specific object from an array of arrays in MongoDB.
I need to get only the task object with _id = ObjectId("543429a2cb38b1d83c3ff2c2").
My document (projects):
{
"_id" : ObjectId("543428c2cb38b1d83c3ff2bd"),
"name" : "new project",
"author" : ObjectId("5424ac37eb0ea85d4c921f8b"),
"members" : [
ObjectId("5424ac37eb0ea85d4c921f8b")
],
"US" : [
{
"_id" : ObjectId("5434297fcb38b1d83c3ff2c0"),
"name" : "Test Story",
"author" : ObjectId("5424ac37eb0ea85d4c921f8b"),
"tasks" : [
{
"_id" : ObjectId("54342987cb38b1d83c3ff2c1"),
"name" : "teste3",
"author" : ObjectId("5424ac37eb0ea85d4c921f8b")
},
{
"_id" : ObjectId("543429a2cb38b1d83c3ff2c2"),
"name" : "jklasdfa_XXX",
"author" : ObjectId("5424ac37eb0ea85d4c921f8b")
}
]
}
]
}
Result expected:
{
"_id" : ObjectId("543429a2cb38b1d83c3ff2c2"),
"name" : "jklasdfa_XXX",
"author" : ObjectId("5424ac37eb0ea85d4c921f8b")
}
But I'm not getting it.
I'm still testing with no success:
db.projects.find({
"US.tasks._id" : ObjectId("543429a2cb38b1d83c3ff2c2")
}, { "US.tasks.$" : 1 })
I tried with $elemMatch too, but it returns nothing.
db.projects.find({
"US" : {
"tasks" : {
$elemMatch : {
"_id" : ObjectId("543429a2cb38b1d83c3ff2c2")
}
}
}
})
Can I get ONLY my expected result using find()? If not, what should I use and how?
Thanks!
You will need an aggregation for that:
db.projects.aggregate([
    { $unwind : "$US" },
    { $unwind : "$US.tasks" },
    { $match : { "US.tasks._id" : ObjectId("543429a2cb38b1d83c3ff2c2") } },
    { $project : { _id : 0, "task" : "$US.tasks" } }
])
should return
{
    "task" : {
        "_id" : ObjectId("543429a2cb38b1d83c3ff2c2"),
        "name" : "jklasdfa_XXX",
        "author" : ObjectId("5424ac37eb0ea85d4c921f8b")
    }
}
Explanation:
$unwind creates a new (virtual) document for each array element
$match is the query part of your find
$project is similar to the projection part of find, i.e. it specifies the fields you want in the results
You might want to add another $match before the first $unwind if you know which document you are searching (look at performance metrics); see the sketch below the edit note.
Edit: added a second $unwind since US is an array.
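For reference, a sketch of that suggestion applied to the pipeline above: matching first means only documents containing the task are unwound, and an index on US.tasks._id can be used:
db.projects.aggregate([
    // filter before unwinding so the index can be used
    { $match : { "US.tasks._id" : ObjectId("543429a2cb38b1d83c3ff2c2") } },
    { $unwind : "$US" },
    { $unwind : "$US.tasks" },
    // re-match to keep only the one task after unwinding
    { $match : { "US.tasks._id" : ObjectId("543429a2cb38b1d83c3ff2c2") } },
    { $project : { _id : 0, "task" : "$US.tasks" } }
])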
I don't know what you are doing (so I really can't tell and am just suggesting), but you might want to examine whether your schema (and MongoDB) is ideal for your task, because the document looks just like denormalized relational data; a relational database would probably suit you better.