Mongoose aggregate with an array of strings - MongoDB

I have a user collection in the format below:
[
{
"_id": ObjectId("5f76ac1c337b875f3aff8cad"),
"units": [ '1', '2', '3' ],
"status": "true",
"name": "xxx"
},
{
"_id": ObjectId("5f76ac1c337b875f3aff8cad"),
"units": [ '2', '3', '4' ],
"status": "true",
"name": "yyy"
},
{
"_id": ObjectId("5f76ac1c337b875f3aff8cad"),
"units": ['4', '1', '5' ],
"status": "true",
"name": "zzz"
}
]
I am trying to write a query that behaves as follows:
Input ['2', '3'] should return all 3 records.
Input ['2'] should return the first 2 records.
I am using an aggregate query with $facet because I also need the total count of records.
const id_lists = ['1', '2'];
User.aggregate([
  { "$match": { "status": "true", "units": { "$in": id_lists } } },
  {
    '$facet': {
      metadata: [{ $count: "total" }],
      users: [{ $skip: 0 }, { $limit: 10 }]
    }
  }
])
I am also adding support for sorting, field selection, text search (on the name field), pagination, etc.
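A minimal sketch of how such a pipeline could be assembled so that the filter, text search, sorting, field selection and pagination all feed one $facet stage. It is written in Python with plain dicts; the same stages can be passed to Mongoose's aggregate. The helper name build_user_pipeline and all of its parameters are hypothetical, and treating "search" as a case-insensitive $regex on name is an assumption.

```python
import re

def build_user_pipeline(units, page=1, page_size=10, search=None,
                        sort_by=None, fields=None):
    """Build a $match + $facet pipeline (hypothetical helper, not from the question)."""
    match = {"status": "true", "units": {"$in": list(units)}}
    if search:
        # Assumption: "search" means a case-insensitive substring match on name.
        match["name"] = {"$regex": re.escape(search), "$options": "i"}

    # Stages that shape one page of users; the count branch does not use them.
    users = []
    if sort_by:
        users.append({"$sort": dict(sort_by)})
    users.append({"$skip": (page - 1) * page_size})
    users.append({"$limit": page_size})
    if fields:
        users.append({"$project": {name: 1 for name in fields}})

    return [
        {"$match": match},
        {"$facet": {"metadata": [{"$count": "total"}], "users": users}},
    ]

pipeline = build_user_pipeline(["1", "2"], page=2, page_size=10,
                               search="xx", sort_by=[("name", 1)],
                               fields=["name", "units"])
```

Because the $count branch sits next to the paginated branch inside $facet, the total reflects every document that passed $match, not just the current page.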

Related

Unable to parse MongoDB to Python dataframe in multiple rows from one record document of Mongo

I have a very deeply nested array structure in Mongo.
I am trying to unwind some of the data with the aggregate function, but I am not sure what to do next to get the results in a particular format.
Here is my sample JSON:
{
"_id": {
"$oid": "asdf016303a7f6a"
},
"layout": [
{
"id": "",
"contents": [
{
"columns": [
{
"items": [
{
"Answer": [],
"stopBlurSave": false
}
]
}
],
"displayName": "Container"
},
{
"columns": [
{
"items": [
{
"isdirty": false,
"todaysDate": "03/21/2017"
},
{
"dbSettings": {
"Name": "Name1",
"Query": "Query1"
},
"new": false,
"stopBlurSave": false
}
]
},
{
"items": [
{
"showEditPanel": false,
"isdirty": false
},
{
"islive": false,
"new": false
}
]
}
],
"new": true
},
{
"showEditPanel": false,
"displayName": "Container",
"columns": [
{
"items": [
{
"dbSettings": {
"Name": "Name2",
"Query": "Query2"
},
"new": false,
"stopBlurSave": false
}
]
}
],
"new": true
}
],
"displayName": "Section"
}
],
"name": "Task Form"
}
Inside the layout.contents.columns.items array, an item may or may not have dbSettings.
I want to take the values in dbSettings and create a dataframe table like the one below:
Name | DBName | Query
Task Form | Name1 | Query1
Task Form | Name2 | Query2
Currently I am using pymongo with the aggregation below:
collection.aggregate([
{
'$match': {
'layout.contents.columns.items.dbSettings.Name': {
'$exists': True
}
}
},
{
'$unwind': {
'path': '$layout.contents.columns.items.dbSettings',
'preserveNullAndEmptyArrays': True
}
}, {
'$unwind': {
'path': '$layout.contents.columns.items.dbSettings',
'preserveNullAndEmptyArrays': True
}
},
{
'$project': {
'_id': 0,
'name': '$name',
'db': '$layout.contents.columns.items.dbSettings.Name',
'query': '$layout.contents.columns.items.dbSettings.Query'
}
}
])
There are two issues here. First, after the unwind all the data ends up in one column and the empty brackets are still there, something like [][][][],,Name1,[]Name2[]. I have tried the unwind options, but either the empty values come through or nothing comes through in the output.
Second, even after I clean everything up in the dataframe, I am still unsure how to map it to new rows: each record can have many dbSettings, and I want to replicate the record that many times (there is a 1:1 mapping between Name and Query).
Is this something Mongo aggregation can do, or do we have to do some other logic in Python? Any directions are welcome.
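One possible approach, sketched under the assumption that layout, contents, columns and items are all arrays, as in the sample: unwind each array level in its own $unwind stage, then keep only the items that carry dbSettings. The pipeline below has not been run against a live server. The flatten_db_settings helper is a hypothetical client-side fallback that builds the same rows in plain Python.

```python
# Server-side sketch: one $unwind per array level, then filter and project.
pipeline = [
    {"$unwind": "$layout"},
    {"$unwind": "$layout.contents"},
    {"$unwind": "$layout.contents.columns"},
    {"$unwind": "$layout.contents.columns.items"},
    {"$match": {"layout.contents.columns.items.dbSettings.Name": {"$exists": True}}},
    {"$project": {
        "_id": 0,
        "Name": "$name",
        "DBName": "$layout.contents.columns.items.dbSettings.Name",
        "Query": "$layout.contents.columns.items.dbSettings.Query",
    }},
]

def flatten_db_settings(doc):
    """Client-side equivalent: one row per item that carries dbSettings."""
    rows = []
    for section in doc.get("layout", []):
        for content in section.get("contents", []):
            for column in content.get("columns", []):
                for item in column.get("items", []):
                    settings = item.get("dbSettings")
                    if settings:
                        rows.append({"Name": doc.get("name"),
                                     "DBName": settings.get("Name"),
                                     "Query": settings.get("Query")})
    return rows

# Trimmed version of the sample document from the question.
sample = {
    "name": "Task Form",
    "layout": [{"contents": [
        {"columns": [{"items": [{"Answer": [], "stopBlurSave": False}]}]},
        {"columns": [{"items": [{"isdirty": False},
                                {"dbSettings": {"Name": "Name1", "Query": "Query1"}}]}]},
        {"columns": [{"items": [{"dbSettings": {"Name": "Name2", "Query": "Query2"}}]}]},
    ]}],
}
rows = flatten_db_settings(sample)
```

Either list of row dicts (from list(collection.aggregate(pipeline)) or from the helper) can be handed to pandas.DataFrame to get one dataframe row per dbSettings entry.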

MongoDB aggregate + $match + $group + Array

Here is my MongoDB query :
profiles.aggregate([{"$match":{"channels.sign_up":true}},{"$group":{"_id":"$channels.slug","user_count":{"$sum":1}}},{"$sort":{"user_count":-1}}])
Here is my Code :
$profiles = Profile::raw()->aggregate([
[
'$match' => [
'channels.sign_up' => true
]
],
[
'$group' => [
'_id' => '$channels.slug',
'user_count' => ['$sum' => 1]
]
],
[
'$sort' => [
"user_count" => -1
]
]
]);
Here is my Mongo Collection :
"channels": [
{
"id": "5ae44c1c2b807b3d1c0038e5",
"slug": "swachhata-citizen-android",
"mac_address": "A3:72:5E:DC:0E:D1",
"sign_up": true,
"settings": {
"email_notifications_preferred": true,
"sms_notifications_preferred": true,
"push_notifications_preferred": true
},
"device_token": "ff949faeca60b0f0ff949faeca60b0f0"
},
{
"id": "5ae44c1c2b807b3d1c0038f3",
"slug": "website",
"mac_address": null,
"device_token": null,
"created_at": "2018-06-19 19:15:13",
"last_login_at": "2018-06-19 19:15:13",
"last_login_ip": "127.0.0.1",
"last_login_user_agent": "PostmanRuntime/7.1.5"
}
],
Here is my response :
{
"data": [
{
"_id": [
"swachhata-citizen-android"
],
"user_count": 1
},
{
"_id": [
"icmyc-portal"
],
"user_count": 1
},
{
"_id": [
"swachhata-citizen-android",
"website",
"icmyc-portal"
],
"user_count": 1
}
]
}
What I am expecting is:
{
"data": [
{
"_id": [
"swachhata-citizen-android"
],
"user_count": 1
},
{
"_id": [
"icmyc-portal"
],
"user_count": 1
},
{
"_id": [
"website"
],
"user_count": 1
}
]
}
As you can see, channels is an array, and "sign_up" is true only for the one element through which the user registered. We have many apps, so we have to maintain more than one channel per user.
I want to know how many users registered through each channel, but the response contains all of a user's channels instead of only the channel where sign_up is true.
The count is also wrong, as I have two records where "slug": "swachhata-citizen-android" and "sign_up": true.
Any suggestions? :)
Use $unwind to turn each document that holds an array into one document per array element, so that the later stages see a single channel at a time. In your example, like this:
profiles.aggregate([
{$unwind: '$channels'},
{$match: {'channels.sign_up': true}},
{$group: {_id: '$channels.slug', user_count: {$sum: 1}}},
{$sort: {user_count: -1}}
])
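To make the effect of the stage order concrete, here is a small pure-Python emulation of $unwind, $match, $group and $sort on plain dicts. It is only an illustration of the semantics, not how MongoDB executes the pipeline, and the three profiles are hypothetical data shaped like the collection in the question.

```python
from collections import Counter

def count_signups_by_slug(profiles):
    """Emulate $unwind -> $match -> $group -> $sort on plain dicts."""
    counts = Counter()
    for profile in profiles:
        for channel in profile.get("channels", []):   # $unwind: one channel at a time
            if channel.get("sign_up") is True:         # $match on that single channel
                counts[channel["slug"]] += 1           # $group with $sum: 1
    # $sort by user_count, descending
    return [{"_id": slug, "user_count": n} for slug, n in counts.most_common()]

# Hypothetical profiles shaped like the collection in the question.
profiles = [
    {"channels": [{"slug": "swachhata-citizen-android", "sign_up": True},
                  {"slug": "website"}]},
    {"channels": [{"slug": "swachhata-citizen-android", "sign_up": True}]},
    {"channels": [{"slug": "website", "sign_up": True},
                  {"slug": "icmyc-portal"}]},
]
result = count_signups_by_slug(profiles)
```

Without the unwind step, the match would accept a whole profile as soon as any channel had sign_up set, and the group key would be the full list of slugs, which is the behaviour described in the question.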

Return object from a list if its child object contains a certain value

I've been stuck on this issue for a while. I feel like I'm close but just can't figure out the solution.
I have a condensed schema that looks like this:
{
"_id": {
"$oid": "5a423f48d3983274668097f3"
},
"id": "59817",
"key": "DW-15450",
"changelog": {
"histories": [
{
"id": "449018",
"created": "2017-12-13T11:11:26.406+0000",
"items": [
{
"field": "status",
"toString": "Released"
}
]
},
{
"id": "448697",
"created": "2017-12-08T09:54:41.822+0000",
"items": [
{
"field": "resolution",
"toString": "Fixed"
},
{
"field": "status",
"toString": "Completed"
}
]
}
]
},
"fields": {
"issuetype": {
"id": "1",
"name": "Bug"
}
}
}
And I would like to grab all changelog.histories that have a changelog.histories.items.toString value of Completed.
Below is my pipeline
"pipeline" => [
[
'$match' => [
'changelog.histories.items.toString' => 'Completed'
]
],
[
'$unwind' => '$changelog.histories'
],
[
'$project' => [
'changelog.histories' => [
'$filter' => [
'input' => '$changelog.histories.items',
'as' => 'item',
'cond' => [
'$eq' => [
'$$item.toString', 'Completed'
]
]
]
]
]
]
]
So ideally I would like the following returned
{
"id": "448697",
"created": "2017-12-08T09:54:41.822+0000",
"items": [
{
"field": "resolution",
"toString": "Fixed"
},
{
"field": "status",
"toString": "Completed"
}
]
}
You can try something like this.
db.changeLogs.aggregate([
{ $unwind: '$changelog.histories' },
{ $match: {'changelog.histories.items.toString': 'Completed'} },
{ $replaceRoot: { newRoot: "$changelog.histories" } }
]);
This solution performs a COLLSCAN, so it is expensive in case of a large collection. Should you have strict performance requirements, you can create an index as follows.
db.changeLogs.createIndex({'changelog.histories.items.toString': 1})
Then, in order to exploit the index, you have to change the query as follows.
db.changeLogs.aggregate([
{ $match: {'changelog.histories.items.toString': 'Completed'} },
{ $unwind: '$changelog.histories' },
{ $match: {'changelog.histories.items.toString': 'Completed'} },
{ $replaceRoot: { newRoot: "$changelog.histories" } }
]);
The first stage keeps only the changeLog documents that have at least one history item in the Completed state; this stage uses the index. The second stage unwinds the histories array. The third stage filters again, this time on the unwound documents, keeping those whose history has at least one item in the Completed state. Finally, the fourth stage replaces the root, returning the histories as documents.
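The four stages can be mimicked in plain Python to check what they return for the sample document. This is a sketch of the semantics only; the function name is hypothetical and the issue below is a trimmed copy of the document from the question.

```python
def completed_histories(issues):
    """Emulate $match -> $unwind -> $match -> $replaceRoot on plain dicts."""
    def has_completed(history):
        return any(item.get("toString") == "Completed"
                   for item in history.get("items", []))

    out = []
    for issue in issues:
        histories = issue.get("changelog", {}).get("histories", [])
        if not any(has_completed(h) for h in histories):   # first $match (whole document)
            continue
        for history in histories:                           # $unwind
            if has_completed(history):                      # second $match (one history)
                out.append(history)                         # $replaceRoot
    return out

# Trimmed copy of the sample document from the question.
issue = {
    "id": "59817",
    "key": "DW-15450",
    "changelog": {"histories": [
        {"id": "449018", "items": [{"field": "status", "toString": "Released"}]},
        {"id": "448697", "items": [{"field": "resolution", "toString": "Fixed"},
                                   {"field": "status", "toString": "Completed"}]},
    ]},
}
result = completed_histories([issue])
```

The document-level check is redundant for correctness here, just as the first $match is in the pipeline; its only job on the server is to let the index prune documents before the unwind.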
Edit
Based on your comment, this is an alternative solution that preserves the id and key fields in the returned documents (while still using the index).
db.changeLogs.aggregate([
{ $match: {'changelog.histories.items.toString': 'Completed'} },
{ $unwind: '$changelog.histories' },
{ $match: {'changelog.histories.items.toString': 'Completed'} },
{ $project: { _id: 0, id: 1, key: 1, changelog: 1 }}
]);

MongoDB filter for specific data in Array and return only specific fields in the output

I have the structure below in a sample collection.
{
"_id": "1",
"name": "Stock1",
"description": "Test Stock",
"lines": [
{
"lineNumber": "1",
"priceInfo": {
"buyprice": 10,
"sellprice": 15
},
"item": {
"id": "BAT10001",
"name": "CricketBat",
"description": "Cricket bat"
},
"quantity": 10
},
{
"lineNumber": "2",
"priceInfo": {
"buyprice": 10,
"sellprice": 15
},
"item": {
"id": "BAT10002",
"name": "CricketBall",
"description": "Cricket ball"
},
"quantity": 10
},
{
"lineNumber": "3",
"priceInfo": {
"buyprice": 10,
"sellprice": 15
},
"item": {
"id": "BAT10003",
"name": "CricketStumps",
"description": "Cricket stumps"
},
"quantity": 10
}
]
}
I have a scenario where I will be given lineNumber and item.id. I need to filter the above collection based on lineNumber and item.id, and project only selected fields.
Expected output below:
{
"_id": "1",
"lines": [
{
"lineNumber": "1",
"item": {
"id": "BAT10001",
"name": "CricketBat",
"description": "Cricket bat"
},
"quantity": 10
}
]
}
Note: I may not always get lineNumber. If lineNumber is null, I should filter on item.id alone and still get the output above. The main purpose is to reduce the number of fields in the output, as the collection is expected to hold a huge number of fields.
I tried the query below:
db.sample.aggregate([
{ "$match" : { "_id" : "1" } },
{ "$project" : { "lines" : { "$filter" : { "input" : "$lines" , "as" : "line" , "cond" :
{ "$and" : [ { "$eq" : [ "$$line.lineNumber" , "3"]} , { "$eq" : [ "$$line.item.id" , "BAT10001"]}]}}}}}
])
But I got all the fields; I'm not able to exclude or include the required fields.
I tried the query below and it worked for me:
db.Collection.aggregate([
{ $match: { _id: '1' } },
{
$project: {
lines: {
$map: {
input: {
$filter: {
input: '$lines',
as: 'line',
cond: {
$and: [
{ $eq: ['$$line.lineNumber', '3'] },
{ $eq: ['$$line.item.id', 'BAT10001'] },
],
},
},
},
as: 'line',
in: {
lineNumber: '$$line.lineNumber',
item: '$$line.item',
quantity: '$$line.quantity',
},
},
},
},
},
])
You can achieve it with $unwind and $group aggregation stages:
db.collection.aggregate([
{$match: {"_id": "1"}},
{$unwind: "$lines"},
{$match: {
$or: [
{"lines.lineNumber":{$exists: true, $eq: "1"}},
{"lines.item.id": "BAT10001"}
]
}},
{$group: {
_id: "$_id",
lines: { $push: {
"lineNumber": "$lines.lineNumber",
"item": "$lines.item",
"quantity": "$lines.quantity"
}}
}}
])
$match - sets the criteria for filtering documents. The first $match selects the document with _id = "1"; the second, applied after $unwind, keeps only the unwound documents whose lines.lineNumber equals "1" or whose lines.item.id equals "BAT10001".
$unwind - splits the lines array into separate documents, one per element.
$group - merges the documents back by _id and pushes an object with the lineNumber, item and quantity fields into the lines array.
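For the case where lineNumber is optional, one option is to build the $filter condition in application code and add the lineNumber test only when a value was supplied. The sketch below does this in Python with plain dicts, reusing the $filter and $map shape of the query that worked above; build_lines_pipeline and its parameters are hypothetical names.

```python
def build_lines_pipeline(doc_id, item_id, line_number=None):
    """Hypothetical helper: filter `lines` by item.id and, if given, by lineNumber."""
    conditions = [{"$eq": ["$$line.item.id", item_id]}]
    if line_number is not None:
        # Only constrain lineNumber when the caller actually supplied one.
        conditions.append({"$eq": ["$$line.lineNumber", line_number]})
    return [
        {"$match": {"_id": doc_id}},
        {"$project": {"lines": {"$map": {
            "input": {"$filter": {"input": "$lines", "as": "line",
                                  "cond": {"$and": conditions}}},
            "as": "line",
            # Keep only the fields wanted in the output.
            "in": {"lineNumber": "$$line.lineNumber",
                   "item": "$$line.item",
                   "quantity": "$$line.quantity"},
        }}}},
    ]

with_line = build_lines_pipeline("1", "BAT10001", line_number="1")
without_line = build_lines_pipeline("1", "BAT10001")
```

Both conditions are combined with $and, so a line has to match item.id and, when present, lineNumber; this differs from the $or used in the $unwind answer, which accepts a line that matches either one.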

Mongo complex aggregate (condition based on group count)

I'd need a Mongo aggregate that, given the sample data:
{
'employeeNumber': '1',
'companyId': '1',
'role': 'D',
'dateHired':ISODate("2013-11-26T00:00:00.0Z")
...
}
{
'employeeNumber': '1',
'companyId': '1',
'role': 'S',
'dateHired':ISODate("2013-11-26T00:00:00.0Z")
...
}
{
'employeeNumber': '1',
'companyId': '2',
'role': 'D',
'dateHired':ISODate("2013-11-26T00:00:00.0Z")
...
}
{
'employeeNumber': '2',
'companyId': '1',
'role': 'D',
'dateHired':ISODate("2013-11-26T00:00:00.0Z")
...
}
queries for a given companyId (e.g. companyId = 1, probably using a $match stage) and returns something like:
{
'employeeNumber': '1',
'companyId': '1',
'role': ['D', 'S'],
'dateHired':ISODate("2013-11-26T00:00:00.0Z")
...
}
notice that
{
'employeeNumber': '1',
'companyId': '2',
'role': 'D',
'dateHired':ISODate("2013-11-26T00:00:00.0Z")
...
}
is not returned.
Ideally it would return the whole object, as the collection has 10/12 fields.
With aggregation you will not get exactly the expected output, but you can get output like the following:
{ "role" : [ "D" ], "employeeNumber" : "2" }
{ "role" : [ "S" ], "employeeNumber" : "3" }
{ "role" : [ "D", "S" ], "employeeNumber" : "1" }
And the query will be like:
db.collection.aggregate([
  {
    $group: {
      _id: "$employeeNumber",
      "role": {
        "$push": "$role"
      }
    }
  }, {
    $project: {
      "employeeNumber": "$_id",
      "role": 1,
      "_id": 0
    }
  }
])
Edit (after the question was edited):
db.collection.aggregate([
  {
    $group: {
      _id: {
        employeeNumber: "$employeeNumber",
        "companyId": "$companyId"
      },
      "role": {
        "$push": "$role"
      },
      "dateHired": {
        $last: "$dateHired"
      }
    }
  }, {
    $project: {
      "employeeNumber": "$_id.employeeNumber",
      "companyId": "$_id.companyId",
      "role": 1,
      "dateHired": 1,
      "_id": 0
    }
  }
])
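The question also asks to restrict the result to one company. A sketch of that, assuming a $match on companyId placed before the $group: the pipeline is written as Python dicts and has not been run against a server, while roles_for_company is a hypothetical pure-Python emulation of the same steps on the sample records.

```python
from datetime import datetime

# Sketch: filter by company first, then collect roles per employee.
pipeline = [
    {"$match": {"companyId": "1"}},
    {"$group": {
        "_id": {"employeeNumber": "$employeeNumber", "companyId": "$companyId"},
        "role": {"$push": "$role"},
        "dateHired": {"$last": "$dateHired"},
    }},
    {"$project": {"_id": 0, "employeeNumber": "$_id.employeeNumber",
                  "companyId": "$_id.companyId", "role": 1, "dateHired": 1}},
]

def roles_for_company(records, company_id):
    """Emulate $match on companyId, then $group by employeeNumber with $push of role."""
    grouped = {}
    for rec in records:
        if rec["companyId"] != company_id:        # $match
            continue
        entry = grouped.setdefault(rec["employeeNumber"], {
            "employeeNumber": rec["employeeNumber"],
            "companyId": rec["companyId"],
            "role": [],
        })
        entry["role"].append(rec["role"])          # $push
        entry["dateHired"] = rec["dateHired"]      # $last
    return list(grouped.values())

hired = datetime(2013, 11, 26)
records = [
    {"employeeNumber": "1", "companyId": "1", "role": "D", "dateHired": hired},
    {"employeeNumber": "1", "companyId": "1", "role": "S", "dateHired": hired},
    {"employeeNumber": "1", "companyId": "2", "role": "D", "dateHired": hired},
    {"employeeNumber": "2", "companyId": "1", "role": "D", "dateHired": hired},
]
result = roles_for_company(records, "1")
```

With the match in front, the record for companyId 2 never reaches the grouping step, so employee 1 comes back once with both roles.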