Export collection and replace field with field from another collection (aggregate?) - mongodb

I'm using the MongoChef GUI, but a command-line solution is fine too.
I have a collection structured like this:
Votes
{
"_id" : "5qgfddRubJ32pS48B",
"createdBy" : "HdKRfwzGriMMZgSQu",
"fellowId" : "yCaqt5nT3LQCBLj8j",
}
I first need to look up the user in a users collection, using the createdBy field, to see whether they are verified:
Users
{
"_id": "HdKRfwzGriMMZgSQu",
"emails" : [
{
"address" : "someuser@example.com",
"verified" : true
}
]
}
and then get some more information from a third collection, using fellowId:
Fellows
{
"_id" : "yCaqt5nT3LQCBLj8j",
"title" : "Fellow Title"
}
Finally, everything should be exported as one CSV or JSON file. How can I achieve this as a mongo query/export?
The desired output would be, for example:
{
"_id" : "yCaqt5nT3LQCBLj8j",
"fellowTitle": "Fellow Title",
"isVerified" : true
}

You can run a single aggregation with two $lookup stages to join both collections:
1 $lookup to join users
1 $unwind to flatten the joined user array
1 $unwind to flatten the user's emails array (needed to check the verified flag)
1 $sort on user.emails.verified, so that verified entries come first
1 $group to keep only the first entry per vote (verified or not)
1 $lookup to join fellows
1 $unwind to flatten the joined fellow array
1 $project to shape the output however you want
1 $out to write the result to a new collection
The query is:
db.votes.aggregate([{
$lookup: {
from: "users",
localField: "createdBy",
foreignField: "_id",
as: "user"
}
}, {
$unwind: "$user"
}, {
$unwind: "$user.emails"
}, {
$sort: { "user.emails.verified": -1 }
}, {
$group: {
_id: "$_id",
createdBy: { $first: "$createdBy" },
fellowId: { $first: "$fellowId" },
user: { $first: "$user" }
}
}, {
$lookup: {
from: "fellows",
localField: "fellowId",
foreignField: "_id",
as: "fellow"
}
}, {
$unwind: "$fellow"
}, {
$project: {
"_id": 1,
"fellowTitle": "$fellow.title",
"isVerified": "$user.emails.verified"
}
}, {
$out: "results"
}])
Then export with:
mongoexport -d testDB -c results > results.json
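To sanity-check what the pipeline should produce, the same join can be sketched in plain JavaScript over the sample documents from the question. This is only an in-memory illustration, not MongoDB API: the function name joinVotes is made up for the example, and it assumes a user counts as verified when at least one of their emails is verified.

```javascript
// In-memory sketch of the votes -> users -> fellows join (not MongoDB code).
// joinVotes is a hypothetical helper name used only for this illustration.
function joinVotes(votes, users, fellows) {
  // Index both joined collections by _id, playing the role of the two $lookup stages.
  const usersById = new Map(users.map((u) => [u._id, u]));
  const fellowsById = new Map(fellows.map((f) => [f._id, f]));

  return votes
    // A plain $unwind drops documents whose lookup found nothing; mirror that here.
    .filter((v) => usersById.has(v.createdBy) && fellowsById.has(v.fellowId))
    .map((v) => {
      const user = usersById.get(v.createdBy);
      const fellow = fellowsById.get(v.fellowId);
      return {
        _id: v._id,
        fellowTitle: fellow.title,
        // Assumption: "verified" means at least one verified email address.
        isVerified: (user.emails || []).some((e) => e.verified === true)
      };
    });
}

// Sample documents taken from the question.
const votes = [
  { _id: "5qgfddRubJ32pS48B", createdBy: "HdKRfwzGriMMZgSQu", fellowId: "yCaqt5nT3LQCBLj8j" }
];
const users = [
  { _id: "HdKRfwzGriMMZgSQu", emails: [{ address: "someuser@example.com", verified: true }] }
];
const fellows = [{ _id: "yCaqt5nT3LQCBLj8j", title: "Fellow Title" }];

console.log(joinVotes(votes, users, fellows));
```

Comparing this output with a few documents from the results collection is a quick way to confirm that the $project stage maps the fields you expect.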


Mongodb combine aggregate queries

I have following collections in MongoDB
Profile Collection
> db.Profile.find()
{ "_id" : ObjectId("5ec62ccb8897af3841a46d46"), "u" : "Test User", "is_del": false }
Store Collection
> db.Store.find()
{ "_id" : ObjectId("5eaa939aa709c30ff4703ffd"), "id" : "5ec62ccb8897af3841a46d46", "a" : { "ci": "Test City", "st": "Test State" }, "ip" : false }, "op" : [ ], "b" : [ "normal" ], "is_del": false}
Item Collection
> db.Item.find()
{ "_id" : ObjectId("5ea98a25f1246b53a46b9e10"), "sid" : "5eaa939aa709c30ff4703ffd", "n" : "sample", "is_del": false}
The relations among these collections are defined as follows:
Profile -> Store: a 1:n relation. The id field in Store refers to the _id field in Profile.
Store -> Item: also a 1:n relation. The sid field in Item refers to the _id field in Store.
Now I need to write a query that finds all the stores of each profile, along with the count of Items for each store. Documents with is_del set to true must be excluded.
I am trying it following way:
Query 1 to find the count of item for each store.
Query 2 to find the store for each profile.
Then in the application logic use both the result to produce the combined output.
I have query 1 as follows:
db.Item.aggregate({$group: {_id: "$sid", count:{$sum:1}}})
Query 2 is as follows:
db.Profile.aggregate([{ "$addFields": { "pid": { "$toString": "$_id" }}}, { "$lookup": {"from": "Store","localField": "pid","foreignField": "id", "as": "stores"}}])
The is_del filter is also missing from these queries. Is there a simpler way to do all of this in a single query? If so, what would the scalability impact be?
You can use the sub-pipeline form of $lookup (let plus pipeline), available from MongoDB v3.6. Note that $toString requires MongoDB 4.0 or later.
db.Profile.aggregate([
{
$match: { is_del: false }
},
{
$lookup: {
from: "Store",
as: "stores",
let: {
pid: { $toString: "$_id" }
},
pipeline: [
{
$match: {
is_del: false,
$expr: { $eq: ["$$pid", "$id"] }
}
},
{
$lookup: {
from: "Item",
as: "items",
let: {
sid: { $toString: "$_id" }
},
pipeline: [
{
$match: {
is_del: false,
$expr: { $eq: ["$$sid", "$sid"] }
}
},
{
$count: "count"
}
]
}
},
{
$unwind: "$items"
}
]
}
}
])
To improve performance, I suggest you store the reference ids as ObjectId so you don't have to convert them in each step.
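One thing to watch in the query above: $count emits no document when a store has no matching items, so items ends up as an empty array and the final $unwind drops that store from the result. If stores with zero items should be kept, one possible variant replaces the $unwind with an $addFields that defaults the count to 0. This is a sketch that has not been run against your data, and the function name storeItemCountStages is hypothetical.

```javascript
// Hypothetical helper: returns the stages to place inside the Store $lookup
// pipeline, after its $match, instead of the Item $lookup + $unwind pair.
// Assumption: stores without items should be reported with a count of 0.
function storeItemCountStages() {
  return [
    {
      $lookup: {
        from: "Item",
        as: "items",
        let: { sid: { $toString: "$_id" } },
        pipeline: [
          { $match: { is_del: false, $expr: { $eq: ["$$sid", "$sid"] } } },
          { $count: "count" }
        ]
      }
    },
    {
      // "$items.count" is [n] when items were found and [] otherwise.
      // $arrayElemAt on [] yields a missing value, which $ifNull turns into 0.
      // Note that items becomes a plain number here, not a { count: n } document.
      $addFields: {
        items: { $ifNull: [{ $arrayElemAt: ["$items.count", 0] }, 0] }
      }
    }
  ];
}

// Print the stages so they can be pasted into the shell.
console.log(JSON.stringify(storeItemCountStages(), null, 2));
```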

MongoDB aggregate return count of documents or 0

I have the following aggregate query:
db.user.aggregate()
.match({"business_account_id" : ObjectId("5e3377bcb1dbae5124e4b6bf")})
.lookup({
'localField': 'profile_id',
'from': 'profile',
'foreignField' : '_id',
'as': 'profile'
})
.unwind("$profile")
.match({"profile.type" : "consultant"})
.group({_id:"$business_account_id", count:{$sum:1}})
My goal is to count how many consultant users belong to a given company.
With the query above, if at least one user belongs to the provided business_account_id, I get the correct count.
But if there are no users, the .match({"business_account_id" : ObjectId("5e3377bcb1dbae5124e4b6bf")}) stage returns an empty result (0 documents).
How can I get count: 0 if no users are assigned to the company?
I tried many approaches based on other threads, but I couldn't get count: 0.
UPDATE 1
A simple version of my problem:
user collection
{
"_id" : ObjectId("5e36beb7b1dbae5124e4b6dc"),
"business_account_id" : ObjectId("5e3377bcb1dbae5124e4b6bf"),
},
{
"_id" : ObjectId("5e36d83db1dbae5124e4b732"),
"business_account_id" : ObjectId("5e3377bcb1dbae5124e4b6bf"),
}
Using the following aggregate query:
db.getCollection("user").aggregate([
{ "$match" : {
"business_account_id" : ObjectId("5e3377bcb1dbae5124e4b6bf")
}
},
{ "$group" : {
"_id" : "$business_account_id",
"count" : { "$sum" : 1 }
}
}
]);
I get:
{
"_id" : ObjectId("5e3377bcb1dbae5124e4b6bf"),
"count" : 2
}
But if I query for an ObjectId that doesn't exist, such as:
db.getCollection("user").aggregate([
{ "$match" : {
"business_account_id" : ObjectId("5e335c873e8d40676928656d")
}
},
{ "$group" : {
"_id" : "$business_account_id",
"count" : { "$sum" : 1 }
}
}
]);
I get a completely empty result. I would expect to get:
{
"_id" : ObjectId("5e335c873e8d40676928656d"),
"count" : 0
}
The root of the problem is that if no document in the user collection satisfies the initial $match, there is nothing to pass to the next stage of the pipeline. If the business_account_id actually exists somewhere (perhaps in another collection?), run the aggregation against that collection so that the initial match finds at least one document. Then use $lookup to find the users. If you are using MongoDB 3.6+, you can combine the user and profile lookups. Lastly, use $size to count the elements in the users array.
(You will probably need to tweak the collection and field names)
db.businesses.aggregate([
{$match:{_id : ObjectId("5e3377bcb1dbae5124e4b6bf")}},
{$project: { _id:1 }},
{$lookup:{
from: "users",
let: {"busId":"$_id"},
as: "users",
pipeline: [
{$match: {$expr:{$eq:[
"$$busId",
"$business_account_id"
]}}},
{$lookup:{
localField: "profile_id",
from: "profile",
foreignField : "_id",
as: "profile"
}},
{$match: { "profile.type" : "consultant"}}
]
}},
{$project: {
_id: 0,
business_account_id: "$_id",
count:{$size:"$users"}
}}
])
Because your $match finds no document for a non-existent business_account_id, nothing reaches the later stages and the pipeline returns nothing.
Workaround: run two sub-pipelines in parallel with the $facet operator, so that a default value is available when the match returns no result.
Note: make sure the user collection has at least one document, otherwise this won't work.
db.user.aggregate([
{
$facet: {
not_found: [
{
$project: {
"_id": ObjectId("5e3377bcb1dbae5124e4b6bf"),
"count": { $const: 0 }
}
},
{
$limit: 1
}
],
found: [
{
"$match": {
"business_account_id": ObjectId("5e3377bcb1dbae5124e4b6bf")
}
},
{
"$group": {
"_id": "$business_account_id",
"count": { "$sum": 1 }
}
}
]
}
},
{
$replaceRoot: {
newRoot: {
$mergeObjects: [
{
$arrayElemAt: ["$not_found", 0]
},
{
$arrayElemAt: ["$found", 0]
}
]
}
}
}
])
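The last stage relies on two behaviours: $arrayElemAt yields no value when found is empty, and $mergeObjects skips such operands while letting later operands override earlier ones. The same merge logic can be sketched in plain JavaScript, where Object.assign treats undefined sources in a similar way. This is an illustration rather than MongoDB code, and mergeFacetResult is a hypothetical name.

```javascript
// Plain-JavaScript sketch of the $facet + $mergeObjects fallback (not MongoDB code).
// mergeFacetResult is a hypothetical helper name.
function mergeFacetResult(facet) {
  // facet.not_found[0] is the default document; facet.found[0] overrides it when present.
  // Object.assign skips undefined sources, much like $mergeObjects skips null operands.
  return Object.assign({}, facet.not_found[0], facet.found[0]);
}

const defaultDoc = { _id: "5e3377bcb1dbae5124e4b6bf", count: 0 };

// No user matched: only the default document survives.
const whenEmpty = mergeFacetResult({ not_found: [defaultDoc], found: [] });

// Two users matched: the grouped count replaces the default.
const whenFound = mergeFacetResult({
  not_found: [defaultDoc],
  found: [{ _id: "5e3377bcb1dbae5124e4b6bf", count: 2 }]
});

console.log(whenEmpty, whenFound);
```

The order of the operands matters: putting the default first is what lets a real count win over the 0.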

MongoDB Aggregation : Double lookup, and merge lookup response to respective object

I'm trying an aggregation but I can't find the right pipeline to do it.
So, this is a part of my document model :
//company.js
{
"_id" : "5dg8aa8c435b1e2868c841f6",
"name" : "My Corp",
"externalId" : "d7f348c9-c69b-69c4-923c-91458c53dc22",
"professionals_customers" : [
{
"company" : "6f4d01eb3b948150c2aad9c0"
},
{
"company" : "5dg7aa8c366b1e2868c841f6",
"contact" : "5df8ab5c355b1e2999c841f7"
}
],
}
I am trying to return the professionals_customers fields hydrated with data, like a classic populate would do.
The company field comes from the company collection, and contact is provided by the user collection.
The desired output should look like:
{
"professionals_customers" : [
{
"company": {
"_id": "6f4d01eb3b948150c2aad9c0",
"name": "Transtar",
"externalId": "d7f386c9-c79b-49c5-905c-90750c42dc22",
},
},
{
"company": {
"_id": "5dg7aa8c366b1e2868c841f6",
"name": "Aperture",
"externalId": "d7f386c9-c69b-49c4-905c-90750c53dc22",
},
"contact" : {
"_id": "5df8ab5c355b1e2999c841f7",
"firstname": "Caroline",
"lastname": "Glados",
"externalId": "d7f386c9-c69b-49c4-905c-90750c53dc22", //same externalId as above, the user belongs to the company
},
}
]
}
At this point I've tried multiple solutions but I can't reach my goal.
let query = [{
$match : { _id : companyId }
},{
$lookup : {
from: 'companies',
localField : 'professionals_customers.company',
foreignField : '_id',
as : 'professionalsCustomers'
}
},{
$lookup : {
from: 'users',
localField : 'professionals_customers.contact',
foreignField : '_id',
as : 'contacts'
}
}]
At this point, I have two new arrays with all the needed information, but I don't know how to group the right contact with the right company. Also, maybe it's easier to populate the data (with $lookup) while keeping the initial structure than to regroup professionalsCustomers and contacts through the shared externalId.
Additional information:
-A user that belongs to a company has the same externalId as that company.
-I don't want to use a classic populate because I need to do some other operations afterwards.
Try this query :
db.companies.aggregate([
{ $match: { _id: companyId } },
{ $unwind: "$professionals_customers" },
{
$lookup: {
from: "companies",
localField: "professionals_customers.company",
foreignField: "_id",
as: "professionals_customers.company"
}
},
{
$lookup: {
from: "users",
localField: "professionals_customers.contact",
foreignField: "_id",
as: "professionals_customers.contact"
}
},
{
$addFields: {
"professionals_customers.company": {
$arrayElemAt: ["$professionals_customers.company", 0]
},
"professionals_customers.contact": {
$arrayElemAt: ["$professionals_customers.contact", 0]
}
}
},
{
$group: { _id: "$_id", professionals_customers: { $push: "$professionals_customers" }, data: { $first: "$$ROOT" } }
},
{ $addFields: { "data.professionals_customers": "$professionals_customers" } },
{ $replaceRoot: { newRoot: "$data" } }
])
Note: if needed, convert string fields or inputs to ObjectId(). The key point is that the types must match, both between the two fields being compared and between your input and the field stored in the database.
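For reference, what the pipeline does for each array element can be sketched in plain JavaScript. This is an in-memory illustration, not MongoDB code: hydrateCustomers is a hypothetical name, and the sample documents are assembled from the question's input and desired output. Map lookups compare keys strictly, so a string id never matches a key of another type, which is the same pitfall the note on types describes.

```javascript
// In-memory sketch of hydrating professionals_customers (not MongoDB code).
// hydrateCustomers is a hypothetical helper name.
function hydrateCustomers(company, companies, users) {
  const companiesById = new Map(companies.map((c) => [c._id, c]));
  const usersById = new Map(users.map((u) => [u._id, u]));

  return company.professionals_customers.map((entry) => {
    const hydrated = {};
    // Replace each reference with the referenced document, if it exists.
    if (companiesById.has(entry.company)) {
      hydrated.company = companiesById.get(entry.company);
    }
    // contact is optional: entries without one simply have no contact field.
    if (usersById.has(entry.contact)) {
      hydrated.contact = usersById.get(entry.contact);
    }
    return hydrated;
  });
}

const myCorp = {
  _id: "5dg8aa8c435b1e2868c841f6",
  name: "My Corp",
  professionals_customers: [
    { company: "6f4d01eb3b948150c2aad9c0" },
    { company: "5dg7aa8c366b1e2868c841f6", contact: "5df8ab5c355b1e2999c841f7" }
  ]
};
const companies = [
  { _id: "6f4d01eb3b948150c2aad9c0", name: "Transtar" },
  { _id: "5dg7aa8c366b1e2868c841f6", name: "Aperture" }
];
const contacts = [
  { _id: "5df8ab5c355b1e2999c841f7", firstname: "Caroline", lastname: "Glados" }
];

const hydratedCustomers = hydrateCustomers(myCorp, companies, contacts);
console.log(JSON.stringify(hydratedCustomers, null, 2));
```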

Lookup and sort the foreign collection

I have a collection users, and each document in it has, among other properties, an array of ids of documents in another collection: workouts.
Every document in the collection workouts has a property named date.
And here's what I want to get:
For a specific user, I want to get an array of {workoutId, workoutDate} for the workouts that belong to that user, sorted by date.
This is my attempt, which is working fine.
Users.aggregate([
{
$match : {
_id : ObjectId("whateverTheUserIdIs")
}
},
{
$unwind : {
path : "$workouts"
}
}, {
$lookup : {
from : "workouts",
localField : "workouts",
foreignField : "_id",
as : "workoutDocumentsArray"
}
}, {
$project : {
_id : false,
workoutData : {
$arrayElemAt : [
"$workoutDocumentsArray",
0
]
}
}
}, {
$project : {
date : "$workoutData.date",
id : "$workoutData._id"
}
}, {
$sort : {date : -1}
}
])
However I refuse to believe I need all this for what would be such a simple query in SQL!? I believe I must at least be able to merge the two $project stages into one? But I've not been able to figure out how looking at the docs.
Thanks in advance for taking the time! ;)
====
EDIT - This is some sample data
Collection users:
[{
_id:xxx,
workouts: [2,4,6]
},{
_id: yyy,
workouts: [1,3,5]
}]
Collection workouts:
[{
_id:1,
date: 1/1/1901
},{
_id:2,
date: 2/2/1902
},{
_id:3,
date: 3/3/1903
},{
_id:4,
date: 4/4/1904
},{
_id:5,
date: 5/5/1905
},{
_id:6,
date: 6/6/1906
}]
And after running my query, for example for user xxx, I would like to get only the workouts that belong to him (whose ids appear in his workouts array), so the result I want would look like:
[{
id:6,
date: 6/6/1906
},{
id:4,
date: 4/4/1904
},{
id:2,
date: 2/2/1902
}]
You don't need to $unwind the workouts array before the $lookup, since it already contains an array of _ids. You can also use $replaceRoot instead of the two $project stages:
Users.aggregate([
{ "$match": { "_id" : ObjectId("whateverTheUserIdIs") }},
{ "$lookup": {
"from" : "workouts",
"localField" : "workouts",
"foreignField" : "_id",
"as" : "workoutDocumentsArray"
}},
{ "$unwind": "$workoutDocumentsArray" },
{ "$replaceRoot": { "newRoot": "$workoutDocumentsArray" }},
{ "$sort" : { "date" : -1 }}
])
or even with new $lookup syntax
Users.aggregate([
{ "$match" : { "_id": ObjectId("whateverTheUserIdIs") }},
{ "$lookup" : {
"from" : "workouts",
"let": { "workouts": "$workouts" },
"pipeline": [
{ "$match": { "$expr": { "$in": ["$_id", "$$workouts"] }}},
{ "$sort" : { "date" : -1 }}
],
"as" : "workoutDocumentsArray"
}},
{ "$unwind": "$workoutDocumentsArray" },
{ "$replaceRoot": { "newRoot": "$workoutDocumentsArray" }}
])
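To check the expected result against the sample data, the filter-and-sort logic can be sketched in plain JavaScript. This is an in-memory illustration, not MongoDB code: workoutsForUser is a hypothetical name, and it assumes date is stored as a real Date value rather than a string.

```javascript
// In-memory sketch of "workouts of one user, newest first" (not MongoDB code).
// workoutsForUser is a hypothetical helper name.
function workoutsForUser(user, workouts) {
  const ids = new Set(user.workouts);
  return workouts
    // Keep only workouts referenced by the user, like $match with $in.
    .filter((w) => ids.has(w._id))
    // Newest first, like $sort: { date: -1 }. Assumes date is a Date object.
    .sort((a, b) => b.date - a.date)
    .map((w) => ({ id: w._id, date: w.date }));
}

// Sample data from the question, with dates written as Date objects.
const userX = { _id: "xxx", workouts: [2, 4, 6] };
const allWorkouts = [
  { _id: 1, date: new Date("1901-01-01") },
  { _id: 2, date: new Date("1902-02-02") },
  { _id: 3, date: new Date("1903-03-03") },
  { _id: 4, date: new Date("1904-04-04") },
  { _id: 5, date: new Date("1905-05-05") },
  { _id: 6, date: new Date("1906-06-06") }
];

const sortedWorkouts = workoutsForUser(userX, allWorkouts);
console.log(sortedWorkouts.map((w) => w.id));
```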

Use array first field in mongo aggregate $lookup query to match a document

I want to use the first (0th) element of my array field to find matching sale documents with a $lookup in an aggregation. Here is my query:
db.products.aggregate([
{
$match : { _id:ObjectId("57c6957fb190ecc02e8b456b") }
},
{
$lookup : {
from : 'sale',
localField: 'categories.0',
foreignField: 'saleCategoryId',
as : 'pcSales'
}
}]);
Result :
{
"_id" : ObjectId("57c6957fb190ecc02e8b456b"),
"categories" : [
"57c54f0db190ec430d8b4571"
],
"pcSales" : [
{
"_id" : ObjectId("57c7df5f30fb6eacb3810d1b"),
"Title" : "Latest Arrivals",
}
]}
This query returns a match, but when I check, the returned document is not actually a match. I don't understand why this happens. When I remove the .0 part from the query, it returns an empty array,
like this:
{
"_id" : ObjectId("57c6957fb190ecc02e8b456b"),
"categories" : [
"57c54f0db190ec430d8b4571"
],
"pcSales" : []
}
saleCategoryId is also an array field, containing category keys.
Please help.
Because your localField is an array, you'll need to either add an $unwind stage to your pipeline before the lookup, or use $arrayElemAt in a $project stage to get the actual element of the array.
Here are two examples, one which uses the $arrayElemAt operator:
db.products.aggregate([
{ "$match" : { "_id": ObjectId("57c6957fb190ecc02e8b456b") } },
{
"$project": {
"category": { "$arrayElemAt": [ "$categories", 0 ] }
}
},
{
"$lookup": {
from : 'sale',
localField: 'category',
foreignField: 'saleCategoryId',
as : 'pcSales'
}
}
]);
and one that uses $unwind to flatten the categories array before applying the $lookup stage:
db.products.aggregate([
{ "$match" : { "_id": ObjectId("57c6957fb190ecc02e8b456b") } },
{ "$unwind": "$categories" },
{
"$lookup": {
from : 'sale',
localField: 'categories',
foreignField: 'saleCategoryId',
as : 'pcSales'
}
}
]);
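The matching rule behind the first variant can be sketched in plain JavaScript: take the first category, then keep the sales whose saleCategoryId array contains it. This is a simplified in-memory illustration, not MongoDB code. salesForFirstCategory is a hypothetical name, and the saleCategoryId values in the sample sales are assumptions, since the question does not show them.

```javascript
// In-memory sketch of "match sales on the first category" (not MongoDB code).
// salesForFirstCategory is a hypothetical helper name.
function salesForFirstCategory(product, sales) {
  // Equivalent in spirit to { $arrayElemAt: ["$categories", 0] }.
  const category = (product.categories || [])[0];
  if (category === undefined) return []; // simplification: no category, no sales
  // saleCategoryId is described as an array, so test for membership.
  return sales.filter(
    (s) => Array.isArray(s.saleCategoryId) && s.saleCategoryId.includes(category)
  );
}

const product = {
  _id: "57c6957fb190ecc02e8b456b",
  categories: ["57c54f0db190ec430d8b4571"]
};
// Assumed sample sales: the saleCategoryId values are invented for the example.
const sales = [
  { _id: "sale-1", Title: "Matching sale", saleCategoryId: ["57c54f0db190ec430d8b4571"] },
  { _id: "sale-2", Title: "Other sale", saleCategoryId: ["some-other-category"] }
];

const pcSales = salesForFirstCategory(product, sales);
console.log(pcSales.map((s) => s.Title));
```

Membership is checked with strict equality here, so the category value and the entries of saleCategoryId need to have the same type to match.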