MongoDB aggregation $lookup: counting how often a document is mentioned in another collection

I need to know how often a Document from Collection A is mentioned in Collection B. I am currently doing this with a $lookup aggregation and the size of the resulting array, but I guess that there is a much nicer way to do that?
Example:
Collection A
{ "_id": 1, "name": "User 1" }
{ "_id": 2, "name": "User 2" }
Collection B
{ "_id": 1, "user": 1, ... }
{ "_id": 2, "user": 1, ... }
{ "_id": 3, "user": 2, ... }
Desired result:
{ "_id": 1, "name": "User 1", "mentions": 2 }
{ "_id": 2, "name": "User 2", "mentions": 1 }
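For reference, a minimal sketch of the $lookup-plus-$size approach described above. The collection names "a" and "b" and the temporary field name "refs" are assumptions made for illustration; they do not come from the question.

```javascript
// Hypothetical names: "a" is Collection A, "b" is Collection B.
const mentionsPipeline = [
  // Collect every document in B whose `user` equals this document's `_id`.
  { $lookup: { from: "b", localField: "_id", foreignField: "user", as: "refs" } },
  // Store the length of the joined array as the mention count.
  { $addFields: { mentions: { $size: "$refs" } } },
  // Drop the temporary joined array from the output.
  { $project: { refs: 0 } }
];

// In the mongo shell this would be run as: db.a.aggregate(mentionsPipeline)
```

If documents without any mention are not needed, grouping Collection B by "user" with a { $sum: 1 } accumulator may be a cheaper alternative, at the cost of losing the "name" field unless it is joined back in.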

Related

How to get the latest record in each group in MongoDB?

I have a list of messages in the format below:
[
{
"_id": 1,
"groupID": 1,
"content": "content 1",
"createAt": 1
},
{
"_id": 2,
"groupID": 1,
"content": "content 2",
"createAt": 2
},
{
"_id": 3,
"groupID": 2,
"content": "content 3",
"createAt": 3
},
{
"_id": 4,
"groupID": 2,
"content": "content 4",
"createAt": 4
},
{
"_id": 5,
"groupID": 2,
"content": "content 5",
"createAt": 5
},
{
"_id": 6,
"groupID": 2,
"content": "content 6",
"createAt": 6
}
]
How can I get the last message (ordered by 'createAt') for each group?
Expected result:
{
"_id": 2,
"groupID": 1,
"content": "content 2",
"createAt": 2
}
{
"_id": 6,
"groupID": 2,
"content": "content 6",
"createAt": 6
}
You can $sort + $group and use $$ROOT to preserve the last document in its entirety. Then you need $replaceRoot to promote that last document to the root level:
db.collection.aggregate([
{ $sort: { createAt: 1 } },
{
$group: {
_id: "$groupID",
last: { $last: "$$ROOT" }
}
},
{
$replaceRoot: { newRoot: "$last" }
}
])

Aggregate nested ObjectId

I have documents like this:
{
"_id": "...",
"collectionName": "blabla",
"items": [ObjectID("1"), ObjectID("2")]
}
items collection:
{
"_id": "1",
"name": "item name",
"size": 3
},
{
"_id": "2",
"name": "item name 2",
"size": 4
}
Output:
{
"_id": "...",
"collectionName": "blabla",
"items": [ObjectID("1"), ObjectID("2")],
"totalSize": 7
}
I'm trying to use aggregate to sum the size of all items referenced by ObjectID.
Is this even possible? I couldn't find any information about it.
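One possible approach, sketched under assumptions: join the referenced items with $lookup and sum their "size" field. The temporary field name "itemDocs" is hypothetical, and the sketch assumes the values stored in the "items" array have the same type as "_id" in the items collection (otherwise the join matches nothing).

```javascript
// Assumes the referenced documents live in a collection named "items",
// as in the question, and that the reference types match the `_id` types.
const totalSizePipeline = [
  // With an array as localField, $lookup matches each element against foreignField.
  { $lookup: { from: "items", localField: "items", foreignField: "_id", as: "itemDocs" } },
  // "$itemDocs.size" resolves to an array of sizes, which $sum adds up.
  { $addFields: { totalSize: { $sum: "$itemDocs.size" } } },
  // Remove the temporary joined documents from the output.
  { $project: { itemDocs: 0 } }
];

// In the mongo shell, run against the collection holding the parent documents:
// db.<parentCollection>.aggregate(totalSizePipeline)
```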

MongoDB: Return deeply nested document from its id regardless of level

I have a collection consisting of documents that are made up of an id, name and an array of items; the array in each document is supposed to nest other documents of the same structure when applicable, and there is no limit to how many nested arrays of documents there are (nested documents can also have nested documents).
Using the MongoClient package, I'm trying to query my collection and return a document based on its id, regardless of its location in the data (be it at the top level or 3 levels down).
So far I can return any top-level data okay, but my query is not finding any nested data. I've seen similar questions where the data structure is limited and consistent, but as my data is dynamic and multi-layered, I haven't found a solution that fits this particular issue.
Here's my data:
[
{
"_id": 1,
"name": "Test 1",
"items": []
},
{
"_id": 2,
"name": "Test 2",
"items": [
{
"_id": 3,
"name": "Test 3",
"items": []
},
{
"_id": 4,
"name": "Test 4",
"items": [
{
"_id": 6,
"name": "Test 6",
"items": []
}
]
}
]
},
{
"_id": 5,
"name": "Test 5",
"items": []
}
]
Here's my Mongo query:
MongoClient.connect(connString, function(err, db) {
var collection = db.collection('items');
collection.findOne({_id: ObjectID("4")}, function(err, result) {
console.log(result);
});
db.close();
});
And result is intended to return the following, but returns null instead:
{
"_id": 4,
"name": "Test 4",
"items": [
{
"_id": 6,
"name": "Test 6",
"items": []
}
]
}
Can anyone share a solution as to how I can retrieve my data as intended?
It is not possible to query an arbitrarily deep nesting of documents with a single selector, because the field path changes at each level. See Query Selectors (https://docs.mongodb.com/manual/reference/operator/query/#query-selectors) for a list of selectors that can be used.
However, if it's possible to put a depth limit on the items, then you could use a query such as:
db.col.find( {$or: [ { "_id" : id }, { "items._id" : id }, { "items.items._id" : id }, { "items.items.items._id" : id } ] } );
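The $or clauses above follow a regular pattern, so they can be generated for any chosen depth. A small sketch; the helper name buildNestedIdQuery and its parameters are made up for illustration:

```javascript
// Builds { $or: [ {_id: id}, {"items._id": id}, {"items.items._id": id}, ... ] }
// covering the top level plus `maxDepth` levels of nesting.
function buildNestedIdQuery(id, maxDepth) {
  const clauses = [];
  let path = "_id";
  for (let depth = 0; depth <= maxDepth; depth++) {
    clauses.push({ [path]: id });
    // Each further level prefixes the path with another "items." segment.
    path = "items." + path;
  }
  return { $or: clauses };
}

// Same query as the one written out by hand above:
// db.col.find(buildNestedIdQuery(id, 3))
```

Note that such a query returns the top-level document that contains the match, not the nested sub-document itself.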
If it's not possible to give a depth limit, I'd advise re-modeling the documents into something like:
[
{
"_id": 1,
"name": "Test 1"
},
{
"_id": 3,
"name": "Test 3",
"parentId": 2
},
{
"_id": 6,
"name": "Test 6",
"parentId": 4
},
{
"_id": 4,
"name": "Test 4",
"parentId": 2
},
{
"_id": 2,
"name": "Test 2"
},
{
"_id": 5,
"name": "Test 5"
}
]
Then you could do a simple find by _id query:
> db.collection.find({_id: 4})
{ "_id" : 4, "name" : "Test 4", "parentId" : 2 }
If you also need to retrieve the nested documents along with the one you queried, you can use the $graphLookup aggregation stage.
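As a rough sketch of what that could look like with the parentId model above: the helper name descendantsPipeline and the output field "descendants" are assumptions made for illustration.

```javascript
// Builds a pipeline that finds one document by _id and attaches all of its
// descendants (children, grandchildren, ...) by following parentId links.
function descendantsPipeline(rootId, collectionName) {
  return [
    { $match: { _id: rootId } },
    {
      $graphLookup: {
        from: collectionName,        // search the same collection recursively
        startWith: "$_id",           // begin from the matched document's _id
        connectFromField: "_id",     // at each step, take the _id of found docs...
        connectToField: "parentId",  // ...and find docs whose parentId equals it
        as: "descendants"
      }
    }
  ];
}

// In the mongo shell: db.collection.aggregate(descendantsPipeline(4, "collection"))
```

$graphLookup returns the descendants as a flat array, so rebuilding the original nested "items" shape would still need a step on the client side.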

Determining whether a Mongo document is the first one with a particular attribute

I need to produce a report on a set of documents with timestamps between two dates. The report needs to list each document, but it also needs to include a field for each document to indicate whether it's the first document in its group, which is indicated by an attribute.
There's a slight complication in the fact that although only documents between the two dates should be included, documents before the start date need to be considered when deciding if each document is the first in its set.
E.g. given the data
{ "_id": 1, "group": "A", "timestamp": "2015-01-01" }
{ "_id": 2, "group": "B", "timestamp": "2015-01-02" }
{ "_id": 3, "group": "A", "timestamp": "2015-01-03" }
{ "_id": 4, "group": "C", "timestamp": "2015-01-04" }
{ "_id": 5, "group": "B", "timestamp": "2015-01-05" }
{ "_id": 6, "group": "C", "timestamp": "2015-01-06" }
Generating a report from 2015-01-02 to 2015-01-05 would return
{ "_id": 2, "group": "B", "timestamp": "2015-01-02", "first": 1 }
{ "_id": 3, "group": "A", "timestamp": "2015-01-03", "first": 0 }
{ "_id": 4, "group": "C", "timestamp": "2015-01-04", "first": 1 }
{ "_id": 5, "group": "B", "timestamp": "2015-01-05", "first": 0 }
Currently I'm doing this by sorting all documents by group then timestamp, then looping over the entire dataset keeping track of the previous row to decide if a row inside the date range is the first of its type. With a large dataset this is very slow - it feels as though there must be a better way involving grouping or something clever but my Mongo skills aren't up to the job - any suggestions?
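One possible server-side alternative, offered as a sketch rather than a tested solution: group by "group" to find each group's earliest document, then flag the documents that fall inside the date range. The helper name firstInGroupPipeline and the temporary fields "firstId" and "docs" are hypothetical. The sketch assumes timestamps are stored in a form that sorts chronologically, as the ISO-style strings in the example do.

```javascript
// Builds a pipeline that lists documents with start <= timestamp <= end and
// marks whether each one is the earliest document of its group overall.
function firstInGroupPipeline(start, end) {
  return [
    // Documents after the end date can never affect the result.
    { $match: { timestamp: { $lte: end } } },
    { $sort: { timestamp: 1 } },
    // Per group: remember the earliest _id and keep all documents.
    { $group: { _id: "$group", firstId: { $first: "$_id" }, docs: { $push: "$$ROOT" } } },
    { $unwind: "$docs" },
    // Only now drop documents before the start date.
    { $match: { "docs.timestamp": { $gte: start } } },
    {
      $project: {
        _id: "$docs._id",
        group: "$docs.group",
        timestamp: "$docs.timestamp",
        first: { $cond: [{ $eq: ["$docs._id", "$firstId"] }, 1, 0] }
      }
    },
    { $sort: { timestamp: 1 } }
  ];
}

// In the mongo shell: db.collection.aggregate(firstInGroupPipeline("2015-01-02", "2015-01-05"))
```

Because $push collects every document of a group into one array, very large groups could run into the document size limit; in that case a two-step approach (first compute the earliest _id per group, then query the range) may be safer.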

mongo searching more than 1 collection for same criteria

I have a Mongo DB with several collections that contain JSON documents in the format shown below:
{
"questions": [
{
"questionEntry": {
"id": 1,
"info": {
"seasonNumber": 1,
"episodeNumber": 1,
"episodeName": "Days Gone Bye"
},
"questionItem": {
"theQuestion": "q1",
"attachedElement": {
"type": 1,
"value": ""
}
},
"options": [
{
"type": 1,
"value": "o1"
},
{
"type": 1,
"value": "o1"
}
],
"answer": {
"questionId": 1,
"answer": 1
},
"metaTags": [
"Season 1",
"Episode 1",
"Rick Grimmes"
]
}
},
{
"questionEntry": {
"id": 1,
"info": {
"seasonNumber": 1,
"episodeNumber": 1,
"episodeName": "Days Gone Bye"
},
"questionItem": {
"theQuestion": "q2",
"attachedElement": {
"type": 1,
"value": ""
}
},
"options": [
{
"type": 1,
"value": "o2"
},
{
"type": 1,
"value": "o2"
}
],
"answer": {
"questionId": 1,
"answer": 1
},
"metaTags": [
"Season 1",
"Episode 1",
"Rick Grimmes",
"Glenn Rhee"
]
}
}
]
}
I'm able to search questions.questionEntry.questionItem.theQuestion for a matching value with:
db.questions.find({"questions.questionEntry.questionItem.theQuestion" : "q1"},{'questions.$':1}).pretty()
This works well for the questions collection but how would I do the same search across multiple collections?
Many thanks
To use the same query across multiple collections you may have to use the JavaScript bracket notation to access the collections in a loop. For example, the following queries the records database for all the collections (using the db.getCollectionNames() command) with the specified query:
use records
var colls = db.getCollectionNames(), // get all the collections in records db
query = {"questions.questionEntry.questionItem.theQuestion" : "q1"},
projection = {"questions.$": 1};
colls.forEach(function (collection){
var docs = db[collection].find(query, projection).toArray(); // use the bracket notation
docs.forEach(function (doc){ printjson(doc); });
})
You will have to do this yourself. There is no out-of-the-box support.
You can run the queries against MongoDB in parallel (how depends on your programming language) and merge the results into a single result set.
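In Node.js, for instance, the per-collection queries can be issued concurrently with promises rather than threads. A minimal sketch, assuming `db` is an already connected Db handle from the official Node.js driver; the function name searchCollections and the shape of its result are made up for illustration.

```javascript
// Runs the same query against several collections concurrently and resolves
// to one array of { collection, docs } entries, in the order of collectionNames.
function searchCollections(db, collectionNames, query, projection) {
  const pending = collectionNames.map((name) =>
    db.collection(name)
      .find(query, { projection })   // the driver takes projection as an option
      .toArray()
      .then((docs) => ({ collection: name, docs }))
  );
  // Resolves once every collection has answered; rejects if any query fails.
  return Promise.all(pending);
}

// Example call with the query from the question (collection names are placeholders):
// searchCollections(db, ["questions", "otherQuestions"],
//   { "questions.questionEntry.questionItem.theQuestion": "q1" },
//   { "questions.$": 1 }).then((results) => console.log(results));
```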