Get distinct values from each field within a MongoDB collection

How can I get the distinct values of all the fields within a MongoDB collection using a single query?
{ "_id": "1", "Gender": "male", "car": "bmw" , "house":"2bhk" , "married_to": "kalpu"},
{ "_id": "2", "Gender": "female", "car": nan , "house":"3bhk", "married_to": "kalpu"},
{ "_id": "3", "Gender": "female", "car": "audi", "house":"1bhk", "married_to": "deepa"},
This is an example with a few fields. In my actual collection, each document has at least 50 fields. So how can I query efficiently to return the unique values within each of the fields? Thanks in advance for the help.
Answer expected:
for each field,
Gender:"male", "female"
car :"bmw", "audi",.....
house : "3hbk","2bhk","1bhk"
married_to: "kalpu","deepa",....
....
....
...

You can use the aggregation pipeline's $group stage with the $addToSet operator:
db.collection.aggregate([
  {
    $group: {
      _id: null,
      Gender: { "$addToSet": "$Gender" },
      car: { "$addToSet": "$car" },
      house: { "$addToSet": "$house" },
      married_to: { "$addToSet": "$married_to" }
    }
  }
])
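With at least 50 fields per document, writing the $group stage by hand gets tedious. A minimal mongosh sketch, assuming a single sample document contains every field name you care about, could build the stage dynamically:
// Sketch (mongosh): build the $group spec from the keys of a sample document.
// Assumes the sample document contains all field names of interest.
const sample = db.collection.findOne();
const groupSpec = { _id: null };
for (const key of Object.keys(sample)) {
  if (key === "_id") continue;               // keep _id: null as the group key
  groupSpec[key] = { $addToSet: "$" + key };
}
db.collection.aggregate([{ $group: groupSpec }]);
Note that every distinct value is accumulated in memory for the single group, so this works best when the number of distinct values per field is modest.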

Related

Mongodb update and delete fields that no longer exist

There is a document like this:
{
  "_id": {
    "$oid": "63ee577ca5340cd594916852"
  },
  "id": 12345,
  "price": 123,
  "oldprice": 456
}
I am performing an update with:
db.testupd.updateOne({ id: 12345 }, [{ $set: { id: 12345, price: 222 } }], { upsert: true })
It works, but the "oldprice" field is still there after the update. What I need is to delete the fields that no longer exist, because unfortunately the data source is not consistent.
How can I achieve this?
If the list of fields you want to remove is not known in advance, replacing the whole document via $merge is one option: $project to keep only the fields you want, then merge the result back with whenMatched: "replace".
db.collection.aggregate([
  {
    $match: { "id": 12345 }
  },
  {
    "$set": { "id": 12345, "price": 222 }
  },
  {
    $project: { id: 1, price: 1 }
  },
  {
    "$merge": {
      "into": "collection",
      "on": "_id",
      "whenMatched": "replace"
    }
  }
])
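For comparison, if the stale fields were known in advance, a pipeline-style update with $unset would be simpler. A minimal sketch, assuming MongoDB 4.2+ (which the pipeline update in the question already requires) and that only oldprice needs removing:
// Sketch: pipeline update, assuming the field names to drop are known.
db.testupd.updateOne(
  { id: 12345 },
  [
    { $set: { id: 12345, price: 222 } },
    { $unset: ["oldprice"] }   // drop the fields that should no longer exist
  ],
  { upsert: true }
)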

Migrate to new document structure in mongo 3.6

I have to migrate data from this structure:
{
  "_id": "some-id",
  "_class": "org.some.class",
  "number": 1015,
  "timestamp": {"$date": "2020-09-05T12:08:02.809Z"},
  "cost": 0.9200000166893005
}
to
{"_id": {
"productId": "some-id",
"countryCode": "DE"
},
"_class": "org.some.class",
"number": 1015,
"timestamp": {"$date": "2020-09-05T12:08:02.809Z"},
"cost": 0.9200000166893005
}
The change in the new documents is that the plain _id field is replaced by a compound _id object (productId: String, country: String).
The country field is to be filled in for the entire collection with a fixed value, DE.
The collection has about 40 million documents in the old format and 700k in the new format. I would like to bring these 40 million into the new form. I'm using MongoDB 3.6, so I'm a bit limited and will probably have to use the aggregation framework to create a completely new collection and then remove the old one.
I would be grateful for help on how to do it: what the query should look like, and how to keep the 700k documents that are already migrated.
What I have got so far:
db.productDetails.aggregate([
  {$match: {_id: {$exists: true}}},
  {$addFields: {"_id": {"productId": "$_id", "country": "DE"}}},
  {$project: {_id: 1, _class: 1, number: 1, timestamp: 1, cost: 1}},
  {$out: "productDetailsV2"}
])
but this solution would only work if I didn't have 700k documents in the new form.
Your query is going in the right direction. You may want to modify the $match filter a bit to better catch the old-format documents:
db.collection.aggregate([
  {
    $match: {
      "_id.country": { $exists: false }
    }
  },
  {
    $addFields: {
      "_id": {
        "productId": "$_id",
        "country": "DE"
      }
    }
  },
  {
    $project: {
      "_id": 1,
      "_class": 1,
      "number": 1,
      "timestamp": 1,
      "cost": 1
    }
  },
  {
    $out: "productDetailsV2"
  }
])
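Since $out replaces the entire target collection and MongoDB 3.6 has neither $merge nor $unionWith, the 700k documents that are already in the new format still need to be copied into productDetailsV2 separately. A minimal sketch, assuming they can be identified by the presence of _id.country:
// Sketch: after the $out above has run, copy the documents that are already
// in the new format into the same target collection.
db.productDetails.find({ "_id.country": { $exists: true } }).forEach(function (doc) {
  db.productDetailsV2.insertOne(doc);
});
For 700k documents you would probably batch these inserts, but the idea is the same.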

How to find document that contains array with two equal values?

I have a chats collection with a participants array:
[
  {
    "_id": ObjectId("5d12b2a10507cfe0bad6d93c"),
    "participants": [
      {
        "_id": ObjectId("5ce4af580507cfe0ba1c6f5b"),
        "firstname": "John",
        "lastname": "Anderson",
        "icon": "/assets/images/avatars/small/2.jpg"
      },
      {
        "_id": ObjectId("5ce4af580507cfe0ba1c6f5b"),
        "firstname": "John",
        "lastname": "Anderson",
        "icon": "/assets/images/avatars/small/2.jpg"
      }
    ]
  },
  {
    "_id": ObjectId("5d1124a50507cfe0baba7909"),
    "participants": [
      {
        "_id": ObjectId("5ce4af580507cfe0ba1c6f5b"),
        "firstname": "John",
        "lastname": "Anderson",
        "icon": "/assets/images/avatars/small/2.jpg"
      },
      {
        "_id": ObjectId("5ce54cb80507cfe0ba25d74b"),
        "firstname": "Milosh",
        "lastname": "Jersi",
        "icon": "/assets/images/avatars/small/3.jpg"
      }
    ]
  }
]
I fetch it by
req.db.collection('chats').findOne({'participants._id': {$all: [req.userID, new mongo.ObjectID(req.params.to)]}});
where req.userID is also an ObjectId and, in the self-chat case, equals the other id.
Usually a chat has different participants, but a user can also send messages to themselves, which is an option allowed in many social networks.
So in this situation, our user "John Anderson" sent a message to himself and we inserted a chat document for it.
And now I have a problem: how do I get the document whose array contains two equal values?
{'participants._id': { '$all': [ 5ce4af580507cfe0ba1c6f5b, 5ce4af580507cfe0ba1c6f5b] }}
// returns every chat that contains our id in at least one element, but we need both values to be equal
// same for $in
{'participants._id': { '$eq': [ 5ce4af580507cfe0ba1c6f5b, 5ce4af580507cfe0ba1c6f5b] }}
// returns nothing
What else can I do?
You can achieve this with the aggregation framework, using a $group stage: first $unwind the participants, group by the chat _id and use $addToSet to keep only the unique user ids in a new array, and then add a $match to keep only the documents with a single participant:
db.collection.aggregate([
  {
    "$unwind": "$participants"
  },
  {
    "$group": {
      "_id": "$_id",
      "participants": {
        "$addToSet": "$participants._id"
      }
    }
  },
  {
    "$match": {
      "participants": {
        "$size": 1
      }
    }
  }
])
Result:
[
  {
    "_id": ObjectId("5d12b2a10507cfe0bad6d93c"),
    "participants": [
      ObjectId("5ce4af580507cfe0ba1c6f5b")
    ]
  }
]
You can try it online: mongoplayground.net/p/plB-gsNIxRd
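If you only want the current user's self-chat rather than every chat with a single distinct participant, a $match on the participant id can be added in front. A minimal sketch, assuming userId holds the current user's ObjectId (req.userID in the question's setup):
// Sketch: restrict the same pipeline to chats that include the given user.
db.chats.aggregate([
  { $match: { "participants._id": userId } },   // only chats the user takes part in
  { $unwind: "$participants" },
  { $group: { _id: "$_id", participants: { $addToSet: "$participants._id" } } },
  { $match: { participants: { $size: 1 } } }    // exactly one distinct participant id
])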

Add conditional and in array of object, mongodb

I have a document like this
{
  "_id": {
    "$oid": "5c7369826023661073802f63"
  },
  "participants": [
    {
      "id": "ABC",
      "nickname": "USER1"
    },
    {
      "id": "DEF",
      "nickname": "USER2"
    }
  ]
}, ... etc.
I want to find the record that contains both of the two ids that I provide.
I tried this:
moodel.aggregate([
  {
    $match: { 'participants.id': idOne }
  },
  {
    $project: {
      list: {
        $filter: {
          input: '$list',
          as: 'item',
          cond: { $eq: ['$$item.participants.id', idTwo] }
        }
      }
    }
  }
])
but the result is:
[ { _id: 5c7369826023661073802f63, list: null }]
I want it to return only the records that match both ids.
Use $elemMatch and $in:
https://docs.mongodb.com/manual/reference/operator/query/elemMatch/
https://docs.mongodb.com/manual/reference/operator/query/in/
db.moodel.find({"participants": {$elemMatch: {id: {$in: [idOne, idTwo]}}}})
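Note that $elemMatch with $in matches chats where at least one participant has either of the two ids. If a chat must contain both ids, $all over participants.id is the usual query; a minimal sketch:
// Sketch: require both ids to appear somewhere in the participants array.
db.moodel.find({ "participants.id": { $all: [idOne, idTwo] } })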

Mongodb aggregate and return multiple document value

Assuming I have the following JSON structure, I want to group by gender and return multiple document values in the same field:
[
  {
    "id": 0,
    "age": 40,
    "name": "Tony Bond",
    "gender": "male"
  },
  {
    "id": 1,
    "age": 30,
    "name": "Nikki Douglas",
    "gender": "female"
  },
  {
    "id": 2,
    "age": 23,
    "name": "Kasey Cardenas",
    "gender": "female"
  },
  {
    "id": 3,
    "age": 25,
    "name": "Latasha Burt",
    "gender": "female"
  }
]
Now I know I can do something like this, but I need to join both age and name into one field:
.aggregate().group({ _id:'$gender', age: { $addToSet: "$age" }, name: { $addToSet: "$name"}})
Yes you can; just use a sub-document as the argument:
db.collection.aggregate([
{ "$group": {
"_id": "$gender",
"person": { "$addToSet": { "name": "$name", "age": "$age" } }
}}
])
Of course, if you don't actually expect any duplicates here, $push does the same thing:
db.collection.aggregate([
{ "$group": {
"_id": "$gender",
"person": { "$push": { "name": "$name", "age": "$age" } }
}}
])
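For the sample data above, either pipeline produces output shaped roughly like this (the order of the groups and of the elements inside person is not guaranteed):
[
  {
    "_id": "female",
    "person": [
      { "name": "Nikki Douglas", "age": 30 },
      { "name": "Kasey Cardenas", "age": 23 },
      { "name": "Latasha Burt", "age": 25 }
    ]
  },
  {
    "_id": "male",
    "person": [
      { "name": "Tony Bond", "age": 40 }
    ]
  }
]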
In addition to what Neil mentioned for the aggregation query, you have to consider that each document holding a grouped record cannot exceed 16 MB (in v2.6+, where results are returned via a cursor), and that on v2.4 and below the entire aggregation result, which was returned as a single document, cannot exceed 16 MB.
So if you have a huge data set, you have to be aware of the limitations this may impose on the model you use to aggregate the data.