List of Id's with no dependency in Mongo Collection - mongodb

I have a scenario in spring-mongo query. Mongo version is 3.2
Application have two collections (Collection A and Collection B).
**Sample contents**
Collection A :: {"_id":1, "name":"content 1"}...{"_id":100, "name":"content 100"}
Collection B :: {"_id":1, "name":"parent 1", "a":[1,2,58,67]}
{"_id":2, "name":"parent 2", "a":[2,85,96,99]}
Collection B holds reference ids of Collection A as a array.
Scenario:
I will past list of ids of Collection A to the query: I need to get list of ids of Collection A which are not associated anywhere in Collection B.
how to achieve this?

I am planning to proceed with Aggregation with following query.. Looking up with
preserveNullAndEmptyArrays saving my day.
db.a.aggregate([
{
$match: { "_id":{$in:["1","5","10"]} }
},
{
$lookup:
{
from: "b",
localField: "_id",
foreignField: "a",
as: "moneyid"
}
},
{
$unwind:
{"path":"$moneyid", "preserveNullAndEmptyArrays":true}
},
{
$match:{
"moneyid":{$eq:null}
}
}
])

It can be done using the following approach
1) Find Unique Id's in Collection B which are in array "a":[]
2) Execute a find query in Collection A - Use $nin in the find query and pass all the ids which you obtained from array "a":[] from Collection B
Note:If you need to find matching documents between two collections then you can use $lookup which works like a left outer join.

Related

MongoDB: Delete 'orphaned' documents?

I am trying to delete orphaned documents in mongodb, which cross collections.
In the collection 'values', I have documents like this:
value:
{
resultId: <ObjectId>
...other data..
}
which reference documents in the collection 'results':
result:
{
_id: <ObjectId> //the resultId
}
A number of 'result' documents have been deleted, resulting in orphaned 'value' documents. How can I find all orphans and delete them?
What you will want to do is build up an aggregate pipeline and use the $lookup operator to fetch the corresponding result document. Then, add a $match operator to your aggregate pipeline to filter those that don't have corresponding result object.
db.values.aggregate([
{
$lookup: {
from: "results",
localField: "resultId",
foreignField: "id",
as: "resultDocument"
}
},
{ $match: { resultDocument: {$size:0} }}
])
This way, you have identified your orphaned documents and can delete them afterwards.

Mongo lookup returning all values

I have 2 collections as shown below:
branches
{
_id: ...,
custId: "abc123",
branchCode: "AA",
...other fields
}
branchHolidays
{
_id: ...,
custId: "abc123",
holidayDate: ISODate("2019-06-01T00:00:00:0000"),
holidayStatus: "PROCESSED",
..other fields
}
Need to get all branchHolidays with the custId available in branches collection along with the branchCode from branches collection. (branches.custId = branchHolidays.custId)
For the first part of join I tried the below query but I'm getting all the fields from branchHolidays collection.
db.getCollection('branchHolidays').aggregate([
{
$lookup: {
localField: "custId",
from: "branches",
foreignField: "custId",
as: "holidays"
}
},
$match: { holidayStatus: "PROCESSED" }
])
The above query returns all the documents from branchHolidays collection.
I'm new to mongo but I'm not able to figure out what the problem is. Have gone through most of the SO queries but haven't found anything which helped.
Note: There are multiple branchCodes mapped to 1 custId in branches collection.
The $lookup stage is similar to a left outer join. The sample aggregation should return all documents from the branchHolidays collection that have holidayStatus: "PROCESSED", and each document will have an added field holidays containing all documents from the branches collection that have the same custId. For those documents that do not match any braches, the holidays field will contain an empty array.
If you want to return only document that have matching branches, match on size, like:
holidays:{$not:{$size:0}}
Also note placing the $match: { holidayStatus: "PROCESSED" } before the $lookup will avoid querying the branches collection for documents that would be eliminated, which may improve performance.

MongoDB query, filter using cursor

I have two collections, one that has _id and UserId, and another that has UserId (same unique identifier) and "other data".
I want to filter the latter collection based on a list of _ids from the former collection.
Can someone provide an example query for this scenario?
The only way to 'join' collections in MongoDB is a $lookup aggregation stage (available in version 3.2).
firstCollection.aggregate([
{ $match: { _id: {$in: [1,2,3] }}}, // filter by _ids
{
$lookup:
{
from: "secondCollection",
localField: "UserId",
foreignField: "UserId",
as: "data"
}
}
])
That will add 'data' field to the documents from the first collection which will contain all related documents from second collection. If relation is not 1:1, you can add $unwind stage to flatten results:
{$unwind: "$data"}

Mongo aggregation framework on big data

Could you please help me with mongoDB aggregation. Here is what I would like to do next:
I have collection A. A document from A represents an object like:
{
nameA: 'first',
items: [
'item1',
'item2',
'item3',
'item4'
]
}
And I have the Collection B with documents like:
[
{
item: 'item3',
info: 'info1'
},
{
item: 'item3',
info: 'info2'
},
{
item: 'item3',
info: 'info3'
}
]
I work with big data, so it would be better to do it in one query. Imagine that we already have all data from collection A. I would like to build a query on collection B to get next structure result:
{
'first'/*nameA*/: ['info1', 'info2', 'info3'],
....
}
How do I achieve the desired result with MongoDB aggregation?
As Rahul Kumar mentioned in his comment, your design is more leaning towards a relational database schema design, and it makes it quite difficult to design efficient MongoDB it.
However, it is still possible to achieve the functionality you are looking for by leveraging the $lookup stage of the aggregation framework, as follows:
db.A.aggregate([
{
$unwind: {
path: "$items"
}
},
{
$lookup: {
from: "B",
localField: "items",
foreignField: "item",
as: "item_info"
}
},
{
$unwind: {
path: "$item_info"
}
},
{
$group: {
_id: "$nameA",
item_info: { $addToSet: "$item_info.info" }
}
}
]);
In the first $unwind stage you normalize the items array on
collection A in order to be able to pass its output to the next
stage
In the $lookup stage you make a left join between two collections
that are part of the same database, in this case used to get the
item information from collection B
In the second $unwind stage you normalize the data you extracted
from collection B in order to flatten the array containing the
objects from collection B that were mapped to the corresponding
items in collection A
Finally, in the $group stage you group all the entries of the
result set by nameA and create an array of unique item information
values. If you would like to have all the duplicate occurrences of
the item information values, you can replace the $addToSet
accumulator with $push.
Below is the result of running the above aggregation pipeline on the collections that you provided:
{ "_id" : "second", "item_info" : [ "info3", "info2", "info1" ] }
{ "_id" : "first", "item_info" : [ "info3", "info2", "info1" ] }

mongodb - perform batch query

I need query data from collection a first, then according to those data, query from collection b. Such as:
For each id queried from a
query data from b where "_id" == id
In SQL, this can be done by join table a & b in a single select. But in mongodb, it needs do multi query, it seems inefficient, doesn't it? Or it can be done by just 2 queries?(one for a, another for b, rather than 1 plus n) I know NoSQL doesn't support join, but is there a way to batch execute queries in for loop into a single query?
You'll need to do it as two steps.
Look into the $in operator (reference) which allows passing an array of _ids for example. Many would suggest you do those in batches of, say, 1000 _ids.
db.myCollection.find({ _id : { $in : [ 1, 2, 3, 4] }})
It's very simple, don't make so many DB calls for each id, it is very inefficient, it is possible to execute a single query which will return all documents relevant to each of the ids in a single pass using the $in operator in MongoDB, which is synonymous to in syntax in SQL so for example if you need to find out the documents for 5 ids in a single pass then
const ids = ['id1', 'id2', 'id3', 'id4', 'id5'];
const results = db.collectionName.find({ _id : { $in : ids }})
This will get you all the relevant documents in a single pass.
This is basically a join in SQL parlance and you can do it in mongo using an aggregate query called $lookup.
For example if you had some types in collections like these:
interface IFoo {
_id: ObjectId
name: string
}
interface IBar {
_id: ObjectId
fooId: ObjectId
title: string
}
Then query with an aggregate like this:
await db.foos.aggregate([
{
$match: { _id: { $in: ids } } // query criteria here
},
{
$lookup: {
from: 'bars',
localField: '_id',
foreignField: 'fooId',
as: 'bars'
}
}
])
May produce resulting objects like this:
{
"_id": "foo0",
"name": "example foo",
"bars": [
{ _id: "bar0", "fooId": "foo0", title: "crow bar" }
]
}