List of userids without duplicates in mongodb [duplicate] - mongodb

I'm trying to learn MongoDB and how it'd be useful for analytics for me. I'm simply playing around with the JavaScript console available on their website and have created the following items:
{"title": "Cool", "_id": {"$oid": "503e4dc0cc93742e0d0ccad3"}, "tags": ["twenty", "sixty"]}
{"title": "Other", "_id": {"$oid": "503e4e5bcc93742e0d0ccad4"}, "tags": ["ten", "thirty"]}
{"title": "Ouch", "_id": {"$oid": "503e4e72cc93742e0d0ccad5"}, "tags": ["twenty", "seventy"]}
{"title": "Final", "_id": {"$oid": "503e4e72cc93742e0d0ccad6"}, "tags": ["sixty", "seventy"]}
What I'd like to do is query so I get a list of unique tags for all of these objects. The result should look something like this:
["ten", "twenty", "thirty", "sixty", "seventy"]
How do I query for this? I tried distinct(), but the call always fails without even running the query.

The code that fails on their website works on an actual MongoDB instance:
> db.posts.insert({title: "Hello", tags: ["one", "five"]});
> db.posts.insert({title: "World", tags: ["one", "three"]});
> db.posts.distinct("tags");
[ "one", "three", "five"]
Weird.
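For readers without a server at hand, what distinct("tags") computes on an array field can be approximated in plain JavaScript. This is only a minimal sketch, not MongoDB's implementation: the posts array and the distinctValues helper are hypothetical names, and the order in which MongoDB returns the values may differ.

```javascript
// Sample documents from the question (without _id), held in a plain array.
const posts = [
  { title: "Cool", tags: ["twenty", "sixty"] },
  { title: "Other", tags: ["ten", "thirty"] },
  { title: "Ouch", tags: ["twenty", "seventy"] },
  { title: "Final", tags: ["sixty", "seventy"] },
];

// Hypothetical helper: collect the distinct values of `field`,
// treating each element of an array field as a separate value.
function distinctValues(docs, field) {
  const seen = new Set();
  for (const doc of docs) {
    const value = doc[field];
    if (value === undefined) continue; // skip documents without the field
    const values = Array.isArray(value) ? value : [value];
    for (const v of values) seen.add(v);
  }
  return [...seen];
}

const uniqueTags = distinctValues(posts, "tags");
console.log(uniqueTags);
```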

You can use the aggregation framework. Depending on how you'd like the results structured, you can use either
var pipeline = [
{"$unwind": "$tags" } ,
{ "$group": { _id: "$tags" } }
];
R = db.tb.aggregate( pipeline );
printjson(R);
{
"result" : [
{
"_id" : "seventy"
},
{
"_id" : "ten"
},
{
"_id" : "sixty"
},
{
"_id" : "thirty"
},
{
"_id" : "twenty"
}
],
"ok" : 1
}
or
var pipeline = [
{"$unwind": "$tags" } ,
{ "$group":
{ _id: null, tags: {"$addToSet": "$tags" } }
}
];
R = db.tb.aggregate( pipeline );
printjson(R);
{
"result" : [
{
"_id" : null,
"tags" : [
"seventy",
"ten",
"sixty",
"thirty",
"twenty"
]
}
],
"ok" : 1
}
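To see what the two pipelines above do step by step, here is a rough plain-JavaScript model of $unwind followed by the two $group variants. It is a sketch under assumptions: docs is a hypothetical in-memory stand-in for the collection, and unwind is an illustrative helper, not a MongoDB API.

```javascript
// Hypothetical in-memory stand-in for the collection.
const docs = [
  { title: "Cool", tags: ["twenty", "sixty"] },
  { title: "Other", tags: ["ten", "thirty"] },
  { title: "Ouch", tags: ["twenty", "seventy"] },
  { title: "Final", tags: ["sixty", "seventy"] },
];

// Model of $unwind: emit one copy of the document per element of the array field.
function unwind(documents, field) {
  return documents.flatMap((doc) =>
    (doc[field] || []).map((item) => ({ ...doc, [field]: item }))
  );
}

const unwound = unwind(docs, "tags");
const distinctTags = [...new Set(unwound.map((d) => d.tags))];

// First shape: one result document per distinct tag, like { $group: { _id: "$tags" } }.
const perTag = distinctTags.map((tag) => ({ _id: tag }));

// Second shape: a single document holding every distinct tag, like $addToSet.
const single = { _id: null, tags: distinctTags };

console.log(unwound.length, perTag, single);
```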

You should be able to use this:
db.mycollection.distinct("tags").sort()

Another way of getting unique array elements is with the aggregation pipeline:
db.blogs.aggregate(
[
{$group:{_id : null, uniqueTags : {$push : "$tags"}}},
{$project:{
_id : 0,
uniqueTags : {
$reduce : {
input : "$uniqueTags",
initialValue :[],
in : {$let : {
vars : {elem : { $concatArrays : ["$$this", "$$value"] }},
in : {$setUnion : "$$elem"}
}}
}
}
}}
]
)
collection
> db.blogs.find()
{ "_id" : ObjectId("5a6d53faca11d88f428a2999"), "name" : "sdfdef", "tags" : [ "abc", "def", "efg", "abc" ] }
{ "_id" : ObjectId("5a6d5434ca11d88f428a299a"), "name" : "abcdef", "tags" : [ "abc", "ijk", "lmo", "zyx" ] }
>
pipeline
> db.blogs.aggregate(
... [
... {$group:{_id : null, uniqueTags : {$push : "$tags"}}},
... {$project:{
... _id : 0,
... uniqueTags : {
... $reduce : {
... input : "$uniqueTags",
... initialValue :[],
... in : {$let : {
... vars : {elem : { $concatArrays : ["$$this", "$$value"] }},
... in : {$setUnion : "$$elem"}
... }}
... }
... }
... }}
... ]
... )
result
{ "uniqueTags" : [ "abc", "def", "efg", "ijk", "lmo", "zyx" ] }
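The $push / $reduce / $concatArrays / $setUnion combination above can be read as an ordinary fold over arrays. The following plain-JavaScript sketch mirrors that logic on the same two documents; the variable names are hypothetical, and the final sort is added only to make the printed order deterministic (the order of $setUnion output is not something to rely on).

```javascript
// The two sample documents from the shell session (without _id).
const blogs = [
  { name: "sdfdef", tags: ["abc", "def", "efg", "abc"] },
  { name: "abcdef", tags: ["abc", "ijk", "lmo", "zyx"] },
];

// Model of $group with $push: gather every tags array into one array of arrays.
const pushed = blogs.map((doc) => doc.tags);

// Model of $reduce: each step concatenates the current array with the
// accumulated value ($concatArrays) and drops duplicates ($setUnion, via a Set).
const merged = pushed.reduce(
  (value, current) => [...new Set([...current, ...value])],
  []
);

// Sorted only so that the output order is predictable.
const result = { uniqueTags: [...merged].sort() };
console.log(result);
```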

There are a couple of web Mongo consoles available:
http://try.mongodb.org/
http://www.mongodb.org/#
But if you type help in them, you will see that they support only a very small number of operations:
HELP
Note: Only a subset of MongoDB's features are provided here.
For everything else, download and install at mongodb.org.
db.foo.help() help on collection method
db.foo.find() list objects in collection foo
db.foo.save({a: 1}) save a document to collection foo
db.foo.update({a: 1}, {a: 2}) update document where a == 1
db.foo.find({a: 1}) list objects in foo where a == 1
it use to further iterate over a cursor
distinct() does not work there because the web console does not support it.

Related

Omit empty fields from MongoDB query result

Is there a way to omit empty fields (e.g. an empty string or an empty array) from the documents in MongoDB query results (find or aggregate)?
Document in DB:
{
"_id" : ObjectId("5dc3fcb388c1c7c5620ed496"),
"name": "Bill",
"emptyString" : "",
"emptyArray" : []
}
Output:
{
"_id" : ObjectId("5dc3fcb388c1c7c5620ed496"),
"name": "Bill"
}
Similar question for Elasticsearch: Omit null fields from elasticsearch results
Use the aggregation framework.
To remove a key, use $cond inside a $project stage and return $$REMOVE when the field is empty:
db.Speed.aggregate( [
{
$project: {
name: 1,
"_id": 1,
"emptyString": {
$cond: {
if: { $eq: [ "", "$emptyString" ] },
then: "$$REMOVE",
else: "$emptyString"
}
},
"emptyArray": {
$cond: {
if: { $eq: [ [], "$emptyArray" ] },
then: "$$REMOVE",
else: "$emptyArray"
}
}
}
}
] )
One way to do this is with cursor.map(), which is available on the cursors returned by both find() and aggregate().
The idea is to keep a list of the fields that are, or could be, present in the documents, and to use the delete operator to remove those fields that are empty strings or empty arrays (both have a length property) from each returned document.
Mongo Shell:
var fieldsList = ["name", "emptyString", "emptyArray"];
db.collection.find().map(function(d) {
fieldsList.forEach(function(k) {
if (
k in d &&
(Array.isArray(d[k]) ||
(typeof d[k] === "string" || d[k] instanceof String)) &&
d[k].length === 0
) {
delete d[k];
}
});
return d;
});
Test documents:
{
"_id" : ObjectId("5dc426d1f667120607ac5006"),
"name" : "Bill",
"emptyString" : "",
"emptyArray" : [ ]
}
{
"_id" : ObjectId("5dc426d1f667120607ac5007"),
"name" : "Foo",
"emptyString" : "foo",
"emptyArray" : [ ]
}
{
"_id" : ObjectId("5dc426d1f667120607ac5008"),
"name" : "Bar",
"emptyString" : "",
"emptyArray" : [
"foo",
"bar"
]
}
{
"_id" : ObjectId("5dc426d1f667120607ac5009"),
"name" : "May",
"emptyString" : "foobar",
"emptyArray" : [
"foo",
"bar"
]
}
Output:
[
{
"_id" : ObjectId("5dc426d1f667120607ac5006"),
"name" : "Bill"
},
{
"_id" : ObjectId("5dc426d1f667120607ac5007"),
"name" : "Foo",
"emptyString" : "foo"
},
{
"_id" : ObjectId("5dc426d1f667120607ac5008"),
"name" : "Bar",
"emptyArray" : [
"foo",
"bar"
]
},
{
"_id" : ObjectId("5dc426d1f667120607ac5009"),
"name" : "May",
"emptyString" : "foobar",
"emptyArray" : [
"foo",
"bar"
]
}
]
Note: if the documents have a very large number of fields, this may not be an optimal solution, since every field in fieldsList is checked for every document. You might want to limit fieldsList to the properties that are suspected to hold an empty array or string.
I think the easiest way to remove all empty string- and empty array-fields from the output is to add the aggregation stage below. (And yes, "easy" is relative, when you have to create these levels of logic to accomplish such a trivial task...)
$replaceRoot: {
newRoot: {
$arrayToObject: {
$filter: {
input: {
$objectToArray: '$$ROOT'
},
as: 'item',
cond: {
$and: [
{ $ne: [ '$$item.v', [] ] },
{ $ne: [ '$$item.v', '' ] }
]
}
}
}
}
}
Just modify the cond clause to filter out other types of fields (e.g. null).
By the way, I haven't tested the performance of this, but at least it's generic and somewhat readable.
Edit: IMPORTANT! The $replaceRoot stage prevents MongoDB from optimizing the pipeline, so if you use it in a View that you run .find() on, MongoDB will append a $match stage to the end of the View's pipeline instead of prepending an indexed search at the start of the pipeline. This has a significant impact on performance. You can safely use it in a custom pipeline though, as long as you have the $match stage before it. (At least as far as my limited MongoDB knowledge tells me.) And if anyone knows how to prepend a $match stage to a View when querying, then please leave a comment :-)
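The $objectToArray / $filter / $arrayToObject chain in the stage above has a close analogue in plain JavaScript, which may make the logic easier to follow. This is a sketch only: omitEmptyFields is a hypothetical helper, and it runs on plain objects rather than inside MongoDB.

```javascript
// Hypothetical helper mirroring the $replaceRoot stage:
// $objectToArray -> Object.entries, $filter -> filter, $arrayToObject -> Object.fromEntries.
function omitEmptyFields(doc) {
  return Object.fromEntries(
    Object.entries(doc).filter(([, v]) => {
      const isEmptyString = v === "";
      const isEmptyArray = Array.isArray(v) && v.length === 0;
      return !isEmptyString && !isEmptyArray; // keep everything else
    })
  );
}

const bill = omitEmptyFields({ name: "Bill", emptyString: "", emptyArray: [] });
const bar = omitEmptyFields({ name: "Bar", emptyString: "", emptyArray: ["foo", "bar"] });
console.log(bill, bar);
```

As with the cond clause in the pipeline, extending the predicate (for example with a check for null) is enough to drop other kinds of fields.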

How to find the records with same key value assigned to multiple values in MongoDB

I have data like the following,
Student | Subject
A | Language
A | Math
B | Science
A | Arts
C | Biology
B | History
and so on...
I want to fetch the students who have the same name but are enrolled in two different subjects, Language & Math only.
I tried to use the query:
$group:{
_id:"$student",
sub:"{$addToSet:"$subject"}
},
$match:{
sub:{$in:["Language","Math"]}
}
But I get no documents to preview in MongoDB Compass. I am working in a virtual machine; Compass is able to group Biology, History, Science and Arts, but not Language and Math. I want to get A as my output.
Thanks in loads.
The collection data and the expected output:
{ Student:"A", Subject:"Language" },
{ Student:"A", Subject:"Math" },
{ Student:"B", Subject:"Science" },
{ Student:"A", Subject:"Arts" },
{ Student:"C", Subject:"Biology" },
{ Student:"B", Subject:"History" }
I am looking to get A as my output.
You are almost there; you just need a small tweak to your aggregation pipeline:
const pipeline = [
{
$group:
{
_id: '$Student', // Group students by name
subjects: {
$addToSet: '$Subject', // Push all the subjects they take uniquely into an array
},
},
},
{
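// Keep students whose subjects are exactly Language and Math.
// $size: 2 excludes anyone with extra subjects (such as A, who also takes Arts);
// drop $size to match every student who takes at least these two.
$match: { subjects: { $all: ['Language', 'Math'], $size: 2 } },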
},
];
db.students.aggregate(pipeline);
That should give an output array like this:
[
{ "_id" : studentName1 , "subjects" : [ "Language", "Math" ] },
{ "_id" : studentName2 , "subjects" : [ "Language", "Math" ] },
....
]
You have to use an aggregation operator, $setIsSubset. The $in (aggregation) operator checks an array for a single value only. I think you are thinking of the $in query operator.
The Query:
db.student_subjects.aggregate( [
{ $group: {
_id: "$student",
studentSubjects: { $addToSet: "$subject" }
}
},
{ $project: {
subjectMatches: { $setIsSubset: [ [ "Language", "Math" ], "$studentSubjects" ] }
}
},
{ $match: {
subjectMatches: true
}
},
{ $project: {
matched_student: "$_id", _id: 0
}
}
] )
The Result:
{ "matched_student" : "A" }
NOTES:
If you replace [ "Language", "Math" ] with [ "History" ], you will get the result: { "matched_student" : "B" }.
You can also try and see other set operators (aggregation), like the $allElementsTrue. Use the best one that suits your application.
[ EDIT ADD ]
Sample data for student_subjects collection:
{ "_id" : 1, "student" : "A", "subject" : "Language" }
{ "_id" : 2, "student" : "A", "subject" : "Math" }
{ "_id" : 3, "student" : "B", "subject" : "Science" }
{ "_id" : 4, "student" : "A", "subject" : "Arts" }
{ "_id" : 5, "student" : "C", "subject" : "Biology" }
{ "_id" : 6, "student" : "B", "subject" : "History" }
The Result After Each Stage:
1st Stage: $group
{ "_id" : "C", "studentSubjects" : [ "Biology" ] }
{ "_id" : "B", "studentSubjects" : [ "History", "Science" ] }
{ "_id" : "A", "studentSubjects" : [ "Arts", "Math", "Language" ] }
2nd Stage: $project
{ "_id" : "C", "subjectMatches" : false }
{ "_id" : "B", "subjectMatches" : false }
{ "_id" : "A", "subjectMatches" : true }
3rd Stage: $match
{ "_id" : "A", "subjectMatches" : true }
4th Stage: $project
{ "matched_student" : "A" }
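The difference between the subset check ($setIsSubset) and an exact match ($all combined with $size: 2) is easy to miss, so here is a small plain-JavaScript sketch of both on the sample data. The variable names are hypothetical and nothing here talks to MongoDB; it only models the set logic.

```javascript
// Sample rows from the student_subjects collection (without _id).
const rows = [
  { student: "A", subject: "Language" },
  { student: "A", subject: "Math" },
  { student: "B", subject: "Science" },
  { student: "A", subject: "Arts" },
  { student: "C", subject: "Biology" },
  { student: "B", subject: "History" },
];

// Model of $group + $addToSet: collect each student's distinct subjects.
const subjectsByStudent = new Map();
for (const { student, subject } of rows) {
  if (!subjectsByStudent.has(student)) subjectsByStudent.set(student, new Set());
  subjectsByStudent.get(student).add(subject);
}

const wanted = ["Language", "Math"];

// Like $setIsSubset: every wanted subject is present; extra subjects are allowed.
const takesAll = [...subjectsByStudent]
  .filter(([, subjects]) => wanted.every((s) => subjects.has(s)))
  .map(([student]) => student);

// Like $all combined with $size: 2: the wanted subjects and nothing else.
const takesExactly = takesAll.filter(
  (student) => subjectsByStudent.get(student).size === wanted.length
);

console.log(takesAll);     // students taking at least Language and Math
console.log(takesExactly); // students taking only Language and Math
```

With this data, A passes the subset check but not the exact match, because A also takes Arts.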

Updating nested collection inside document - MongoDB

Given the following document inside Mongo:
{
"_id" : ObjectId("5d5e9852b2b803bfc66e74a6"),
"name" : "NAME",
"collection1" : [
{ "type" : "TYPE", "collection2" : [ ] }
]
}
I would like to add elements in the collection2 attribute. I am using the mongo console.
I tried using this query:
db.mycollection.updateOne(
{"name": "NAME"},
{$addToSet: {"collection1.$[element].collection2" : { $each: ["a", "b", "c"]}}},
{arrayFilters: [{element: 0}]}
);
I also tried to use push, but with no success.
The console returns:
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 0 }
The update didn't update the document because the arrayFilters clause did not match the document. Specifically, your example is filtering on an element in collection1 that is defined as 0, which does not exist.
Changing the update to filter on collection2 being an empty array should result in the update working as expected:
db.test.insert({
... "_id" : ObjectId("5d5e9852b2b803bfc66e74a6"),
... "name" : "NAME",
... "collection1" : [
... { "type" : "TYPE", "collection2" : [ ] }
... ]
... })
db.test.update(
... { name: "NAME" },
... { "$addToSet": { "collection1.$[element].collection2": { "$each" : [ "a", "b", "c" ] } } },
... { arrayFilters: [ { "element.collection2": [ ] } ] }
... )
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
db.test.find()
{ "_id" : ObjectId("5d5e9852b2b803bfc66e74a6"), "name" : "NAME", "collection1" : [ { "type" : "TYPE", "collection2" : [ "a", "b", "c" ] } ] }
If you know the index of the collection1 element, you can omit arrayFilters and just use the index:
db.mycollection.updateOne(
{ "name": "NAME" },
{ $addToSet: { "collection1.0.collection2": { $each: ["a", "b", "c"] }}}
);
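Why the first attempt matched the document but modified nothing can be illustrated with a plain-JavaScript model of the filtered update. This is a sketch under assumptions: addToNestedSet is a hypothetical helper that imitates $addToSet with $each on the elements selected by a filter; it is not how MongoDB implements arrayFilters.

```javascript
// In-memory copy of the document from the question (without _id).
const doc = {
  name: "NAME",
  collection1: [{ type: "TYPE", collection2: [] }],
};

// Hypothetical helper: add `values` to collection2 of every collection1 element
// accepted by `filter`, skipping values that are already present.
// Returns 1 if anything changed and 0 otherwise, like modifiedCount for one document.
function addToNestedSet(document, filter, values) {
  let modified = 0;
  for (const element of document.collection1) {
    if (!filter(element)) continue;
    for (const v of values) {
      if (!element.collection2.includes(v)) {
        element.collection2.push(v);
        modified = 1;
      }
    }
  }
  return modified;
}

// A filter like { element: 0 } asks for an array element equal to 0: nothing matches.
const first = addToNestedSet(doc, (el) => el === 0, ["a", "b", "c"]);

// A filter on the empty nested array matches the only element.
const second = addToNestedSet(doc, (el) => el.collection2.length === 0, ["a", "b", "c"]);

console.log(first, second, doc.collection1[0].collection2);
```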

MongoDB: projection $ when find document into nested arrays

I have the following document of collection "user" that contains two nested arrays:
{
"person" : {
"personId" : 78,
"firstName" : "Mario",
"surname1" : "LOPEZ",
"surname2" : "SEGOVIA"
},
"accounts" : [
{
"accountId" : 42,
"accountRegisterDate" : "2018-01-04",
"banks" : [
{
"bankId" : 1,
"name" : "Bank LTD",
},
{
"bankId" : 2,
"name" : "Bank 2 Corp",
}
]
},
{
"accountId" : 43,
"accountRegisterDate" : "2018-01-04",
"banks" : [
{
"bankId" : 3,
"name" : "Another Bank",
},
{
"bankId" : 4,
"name" : "BCT bank",
}
]
}
]
}
I'm trying to get a query that will find this document and get only this subdocument at output:
{
"bankId" : 3,
"name" : "Another Bank",
}
I'm really stuck. If I run this query:
{ "accounts.banks.bankId": "3" }
it returns the whole document. I've also tried combinations of projections with no success:
{"accounts.banks.0.$": 1} //returns two elements of array "banks"
{"accounts.banks.0": 1} //empty bank array
Maybe that's not the right way to query for this and I'm heading in the wrong direction.
Can you please help me?
You can try the following solution:
db.user.aggregate([
{ $unwind: "$accounts" },
{ $match: { "accounts.banks.bankId": 3 } },
{
$project: {
items: {
$filter: {
input: "$accounts.banks",
as: "bank",
cond: { $eq: [ "$$bank.bankId", 3 ] }
}
}
}
},
{
$replaceRoot : {
newRoot: { $arrayElemAt: [ "$items", 0 ] }
}
}
])
To be able to filter accounts by bankId you need to $unwind them. Then you can match accounts to the one having bankId equal to 3. Since banks is another nested array, you can filter it using $filter operator. This will give you one element nested in items array. To get rid of the nesting you can use $replaceRoot with $arrayElemAt.
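The same four steps can be traced in plain JavaScript on a trimmed copy of the document. This is a rough sketch only: findBank is a hypothetical helper, the document is shortened, and unlike the pipeline (which emits one result per matching account) it returns just the first match.

```javascript
// Trimmed copy of the document from the question.
const user = {
  person: { personId: 78, firstName: "Mario" },
  accounts: [
    { accountId: 42, banks: [{ bankId: 1, name: "Bank LTD" }, { bankId: 2, name: "Bank 2 Corp" }] },
    { accountId: 43, banks: [{ bankId: 3, name: "Another Bank" }, { bankId: 4, name: "BCT bank" }] },
  ],
};

// Hypothetical helper following the pipeline stages.
function findBank(doc, bankId) {
  const items = doc.accounts // $unwind: look at each account on its own
    .filter((account) => account.banks.some((b) => b.bankId === bankId)) // $match
    .map((account) => account.banks.filter((b) => b.bankId === bankId)); // $filter
  // $arrayElemAt with index 0, wrapped by $replaceRoot in the pipeline.
  return items.length > 0 ? items[0][0] : undefined;
}

const bank = findBank(user, 3);
console.log(bank);

// The comparison is type-sensitive: the string "3" does not equal the number 3.
const noBank = findBank(user, "3");
console.log(noBank);
```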

Mongodb counting array combinations

I have this kind of documents
[
{
....
tags : ["A","B"]
},
{
....
tags : ["A","B"]
},
{
....
tags : ["J","K"]
},
{
....
tags : ["A","B","C"]
}
]
With the Aggregation Framework I'd like to group by array combinations to have something like this:
[
{
_id:["A","B"],
count : 2
},
{
_id:["J","K"],
count : 1
},
{
_id:["A","B","C"],
count : 1
},
]
Is it possible to do that?
Thank you
Not sure why you didn't even think this would work:
db.collection.aggregate([
{ "$group": {
"_id": "$tags",
"count": { "$sum": 1 }
}}
])
Returns:
{ "_id" : [ "A", "B", "C" ], "count" : 1 }
{ "_id" : [ "J", "K" ], "count" : 1 }
{ "_id" : [ "A", "B" ], "count" : 2 }
MongoDB "does not care" what you throw into the value of a "field" or "property". This applies to the "grouping key" of _id in the $group operator as well. Everything is a "document" and therefore a BSON value and is therefore valid.
Anything works. So long as it's what you want.
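Grouping on a whole array can be modelled in plain JavaScript by using a serialized form of the array as the key. This is a sketch under assumptions: docs is a hypothetical in-memory sample, and JSON.stringify merely stands in for comparing the array values. As with array equality in MongoDB, element order matters here, so ["A","B"] and ["B","A"] would land in different groups.

```javascript
// Hypothetical in-memory sample matching the question's documents.
const docs = [
  { tags: ["A", "B"] },
  { tags: ["A", "B"] },
  { tags: ["J", "K"] },
  { tags: ["A", "B", "C"] },
];

// Count documents per distinct tags array, like { $group: { _id: "$tags", count: { $sum: 1 } } }.
const counts = new Map();
for (const { tags } of docs) {
  const key = JSON.stringify(tags); // order-sensitive grouping key
  counts.set(key, (counts.get(key) || 0) + 1);
}

const grouped = [...counts].map(([key, count]) => ({ _id: JSON.parse(key), count }));
console.log(grouped);
```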