Sorting by nested objects attributes in mongoose when using populate - mongodb

I'm trying to sort parent documents by an attribute from a populated child document.
so if a person schema has an attribute business and the business schema has an attribute name, I want to sort the list of person documents by the name of their business alphabetically.
This seems very possible since the relationship between the person and business is always 1 to 1. but it seems mongoose doesn't allow such sorting mechanism as whenever I pass business.name as sorting arguments it will default sort instead (same sorting as passing unknown arguments).
I'm trying to use aggregate but the docs on that are very bad and not all arguments are clear.
I would like to know if there is a way of doing.
This is my aggregate code:
let populatedArray = [
{
"path": "business",
"schema": require("../models/business").collection.name
},
{
"path": "createdBy",
"schema": require("../models/User").collection.name
},
{
"path": "schema2",
"schema": require("../models/schema2").collection.name
},
{
"path": "schema3",
"schema": require("../models/schema3").collection.name
},
{
"path": "schema4",
"schema": require("../models/schema4").collection.name
},
{
"path": "schema5",
"schema": require("../models/schema5").collection.name
},
{
"path": "schema6",
"schema": require("../models/schema6").collection.name,
"populate": [{
"path": "schema7",
"schema": require("../models/schema7").collection.name
}]
}];
populatedArray.forEach((elem)=>{
docsPromise.lookup({from:elem.schema,localField: elem.path, foreignField: '_id', as: elem.path})
docsPromise.unwind("$"+elem.path)
})
With the unwind command I get no documents. without the unwind command I get 500 documents while I only have 140 in the database. I know that aggregate is near a LEFT_JOIN on a SQL DB which can give similar result but I don't know to stop it form doing so.

Related

How to implement a RESTful API for order changes on large collection entries?

I have an endpoint that may contain a large number of resources. They are returned in a paginated list. Each resource has a unique id, a rank field and some other data.
Semantically the resources are ordered with respect to their rank. Users should be able to change that ordering. I am looking for a RESTful interface to change the rank field in many resources in a large collection.
Reordering one resource may result in a change of the rank fields of many resources. For example consider moving the least significant resource to the most significant position. Many resources may need to be "shifted down in their rank".
The collection being paginated makes the problem a little tougher. There has been a similar question before about a small collection
The rank field is an integer type. I could change its type if it results in a reasonable interface.
For example:
GET /my-resources?limit=3&marker=234 returns :
{
"pagination": {
"prevMarker": 123,
"nextMarker": 345
},
"data": [
{
"id": 12,
"rank": 2,
"otherData": {}
},
{
"id": 35,
"rank": 0,
"otherData": {}
},
{
"id": 67,
"rank": 1,
"otherData": {}
}
]
}
Considered approaches.
1) A PATCH request for the list.
We could modify the rank fields with the standard json-patch request. For example the following:
[
{
"op": "replace",
"path": "/data/0/rank",
"value": 0
},
{
"op": "replace",
"path": "/data/1/rank",
"value": 1
},
{
"op": "replace",
"path": "/data/2/rank",
"value": 2
}
]
The problems I see with this approach:
a) Using the array indexes in path in patch operations. Each resource has already a unique ID. I would rather use that.
b) I am not sure to what the array index should refer in a paginated collection? I guess it should refer to the global index once all pages are received and merged back to back.
c) The index of a resource in a collection may be changed by other clients. What the current client thinks at index 1 may not be at that index anymore. I guess one could add test operation in the patch request first. So the full patch request would look like:
[
{
"op": "test",
"path": "/data/0/id",
"value": 12
},
{
"op": "test",
"path": "/data/1/id",
"value": 35
},
{
"op": "test",
"path": "/data/2/id",
"value": 67
},
{
"op": "replace",
"path": "/data/0/rank",
"value": 0
},
{
"op": "replace",
"path": "/data/1/rank",
"value": 1
},
{
"op": "replace",
"path": "/data/2/rank",
"value": 2
}
]
2) Make the collection a "dictionary"/ json object and use a patch request for a dictionary.
The advantage of this approach over 1) is that we could use the unique IDs in path in patch operations.
The "data" in the returned resources would not be a list anymore:
{
"pagination": {
"prevMarker": 123,
"nextMarker": 345
},
"data": {
"12": {
"id": 12,
"rank": 2,
"otherData": {}
},
"35": {
"id": 35,
"rank": 0,
"otherData": {}
},
"67": {
"id": 67,
"rank": 1,
"otherData": {}
}
}
}
Then I could use the unique ID in the patch operations. For example:
{
"op": "replace",
"path": "/data/12/rank",
"value": 0
}
The problems I see with this approach:
a) The my-resources collection can be large, I am having difficulty about the meaning of a paginated json object, or a paginated dictionary. I am not sure whether an iteration order can be defined on this large object.
3) Have a separate endpoint for modifying the ranks with PUT
We could add a new endpoint like this PUT /my-resource-ranks.
And expect the complete list of the ordered id's to be passed in a PUT request. For example
[
{
"id": 12
},
{
"id": 35
},
{
"id": 67
}
]
We would make the MyResource.rank a readOnly field so it cannot be modified through other endpoints.
The problems I see with this approach:
a) The need to send the complete ordered list. In the PUT request for /my-resource-ranks we will not include any other data, but only the unique id's of resources. It is less severe than sending the full resources but still the complete ordered list can be large.
4) Avoid the MyResource.rank field and the "rank" be the order in the /my-collections response.
The returned resources would not have the "rank" field in them and they will be already sorted with respect to their rank in the response.
{
"pagination": {
"prevMarker": 123,
"nextMarker": 345
},
"data": [
{
"id": 35,
"otherData": {}
},
{
"id": 67,
"otherData": {}
},
{
"id": 12,
"otherData": {}
}
]
}
The user could change the ordering with the move operation in json-patch.
[
{
"op": "test",
"path": "/data/2/id",
"value": 12
},
{
"op": "move",
"from": "/data/2",
"path": "/data/0"
}
]
The problems I see with this approach:
a) I would prefer the freedom for the server to return to /my-collections in an "arbitrary" order from the client's point of view. As long as the order is consistent, the optimal order for a "simpler" server implementation may be different than the rank defined by the application.
b) Same concern as 1)b). Does index in the patch operation refer to the global index once all pages are received and merged back to back? Or does it refer to the index in the current page ?
Update:
Does anyone know further examples from an existing public API ? Looking for further inspiration. So far I have:
Spotify's Reorder a Playlist's Tracks
Google Tasks: change order, move
I would
Use PATCH
Define a specialized content-type specifically for updating the order.
The application/patch+json type is pretty great for doing straight-up modifications, but I think your use-case is unique enough to warrant a useful, minimal specialized content-type.

MongoDB - how to properly model relations

Let's assume we have the following collections:
Users
{
"id": MongoId,
"username": "jsloth",
"first_name": "John",
"last_name": "Sloth",
"display_name": "John Sloth"
}
Places
{
"id": MongoId,
"name": "Conference Room",
"description": "Some longer description of this place"
}
Meetings
{
"id": MongoId,
"name": "Very important meeting",
"place": <?>,
"timestamp": "1506493396",
"created_by": <?>
}
Later on, we want to return (e.g. from REST webservice) list of upcoming events like this:
[
{
"id": MongoId(Meetings),
"name": "Very important meeting",
"created_by": {
"id": MongoId(Users),
"display_name": "John Sloth",
},
"place": {
"id": MongoId(Places),
"name": "Conference Room",
}
},
...
]
It's important to return basic information that need to be displayed on the main page in web ui (so no additional calls are needed to render the table). That's why, each entry contains display_name of the user who created it and name of the place. I think that's a pretty common scenario.
Now my question is: how should I store this information in db (question mark values in Metting document)? I see 2 options:
1) Store references to other collections:
place: MongoId(Places)
(+) data is always consistent
(-) additional calls to db have to be made in order to construct the response
2) Denormalize data:
"place": {
"id": MongoId(Places),
"name": "Conference room",
}
(+) no need for additional calls (response can be constructed based on one document)
(-) data must be updated each time related documents are modified
What is the proper way of dealing with such scenario?
If I use option 1), how should I query other documents? Asking about each related document separately seems like an overkill. How about getting last 20 meetings, aggregate the list of related documents and then perform a query like db.users.find({_id: { $in: <id list> }})?
If I go for option 2), how should I keep the data in sync?
Thanks in advance for any advice!
You can keep the DB model you already have and still only do a single query as MongoDB introduced the $lookup aggregation in version 3.2. It is similar to join in RDBMS.
$lookup
Performs a left outer join to an unsharded collection in the same database to filter in documents from the “joined” collection for processing. The $lookup stage does an equality match between a field from the input documents with a field from the documents of the “joined” collection.
So instead of storing a reference to other collections, just store the document ID.

MongoDb query - aggregation, group, filter, max

I am trying to figure out specific mongoDb query, so far unsuccessfully.
Documents in my collections looks someting like this (contain more attributes, which are irrelevant for this query):
[{
"_id": ObjectId("596e01b6f4f7cf137cb3d096"),
"code": "A",
"name": "name1",
"sys": {
"cts": ISODate("2017-07-18T12:40:22.772Z"),
}
},
{
"_id": ObjectId("596e01b6f4f7cf137cb3d097"),
"code": "A",
"name": "name2",
"sys": {
"cts": ISODate("2017-07-19T12:40:22.772Z"),
}
},
{
"_id": ObjectId("596e01b6f4f7cf137cb3d098"),
"code": "B",
"name": "name3",
"sys": {
"cts": ISODate("2017-07-16T12:40:22.772Z"),
}
},
{
"_id": ObjectId("596e01b6f4f7cf137cb3d099"),
"code": "B",
"name": "name3",
"sys": {
"cts": ISODate("2017-07-10T12:40:22.772Z"),
}
}]
What I need is to get current versions of documents, filtered by code or name, or both. Current version means that from two(or more) documents with same code, I want pick the one which has latest sys.cts date value.
So, result of this query executed with filter name="name3" would be the 3rd document from previous list. Result of query without any filter would be 2nd and 3rd document.
I have an idea how to construct this query with changed data model but I was hoping someone could lead me right way without doing so.
Thank you

Mongo DB Join Collections

I am pretty new to Mongo db and coming from T-SQL background, I am finding little hard to understand how joins work in Mongo.
I have a very simple case where i have a "User Table.. err.. Collections" and "User Audit Collections"..
My User Collection looks something like this.
{
"_id": LUUID("d991e92a-766c-054e-9ad8-1c902acc6efc"),
"System": {
"VisitCount": 1
},
"UserData": {
"Uid": "46831",
"UserName": "abc.",
"FirstName": "abv",
"LastName": "test",
"EmailId": "abc#gmail.com",
"Region": "Georgia",
"Postal": "10000",
"Country": "United States",
"Phone": "800-000-1734",
}
}
and a User Audit Table :
{
"_id": LUUID("9561a583-0afe-e844-a090-43ffdab46ed2"),
"UserId": LUUID("914ed252-3fc7-d84c-9731-f382e7cf400b"),
"StartDateTime": ISODate("2016-05-12T04:07:37.299Z"),
"EndDateTime": ISODate("2016-05-12T04:07:42.715Z"),
"SaveDateTime": ISODate("2016-05-12T04:28:23.186Z"),
"Browser": {
"BrowserVersion": "50.0",
"BrowserMajorName": "Chrome",
"BrowserMinorName": "50.0"
},
"Pages": [
{
"DateTime": ISODate("2016-05-12T04:07:37.365Z"),
"Duration": 5416,
"Item": {
"_id": LUUID("f293157a-f22d-fe49-a7b0-f66f412408fe"),
"Language": "en",
"Version": 1
}"Url": {
"Path": "/"
},
"VisitPageIndex": 1
},
{
"DateTime": ISODate("2016-05-12T04:07:42.781Z"),
"Duration": 0,
"Item": {
"Version": 0
},
"SitecoreDevice": {
"_id": LUUID("df7f5dfe-c089-994d-9aa3-b5fbd009c9f3"),
"Name": "Default"
},
"MvTest": {
"ValueAtExposure": 0
},
"Url": {
"Path": "/Sample Page1"
},
"VisitPageIndex": 2
}
]
}
I need a Flat view where each row will hold all the user User information and the pages the user visited.
The Audit information can be grouped by user or repeated per user.. My main idea is to combine the User details with Page visited history.
I am looking for something like a Left outer join equivalent
something like
Select * from usertable, useraudittable
on usertable.id = userAuditTable.UserId
group by userID.
Mongo is a simple object storage database and does not offer a lot of relational operations like joins. Normally you have to do it programmatically doing multiple queries and processing the data using your application code and logic.
In Mongo 3.2 they introduced the lookup operation to the aggregation pipeline and fortunately it kinda does what you are looking for. You can use something like this (using mongo shell javascript syntax as example)
db.user.aggregate([{
$lookup: {
from: "audit",
localField: "_id",
foreignField: "UserId",
as: "VisitedPages"
}
}]);
If you are using the last version of mongo you can play with this approach otherwise you'll need to go with multiple queries on your application.
Take a look at the documentation

How to update the Embedded Data which is inside of another Embedded Data?

I have document like below in MongoDB:
{
"_id": "test",
"tasks": [
{
"Name": "Task1",
"Parameter": [
{
"Name": "para1",
"Type": "String",
"Value": "*****"
},
{
"Name": "para2",
"Type": "String",
"Value": "*****"
}
]
},
{
"Name": "Task2",
"Parameter": [
{
"Name": "para1",
"Type": "String",
"Value": "*****"
},
{
"Name": "para2",
"Type": "String",
"Value": "*****"
}
]
}
]
}
There is Embedded Data Structure (Parameter) inside of another Embedded Data Structure (Tasks). Now I want to update the para1 in Task1's Parameter.
I have tried many ways but I can only use query tasks.Parameter.name to find the para1 but cannot update it. the example in the doc are using .$. to update the value in a Embedded Data Structure but it doesn't work in my case.
Anyone have any ideas ?
MongoDB currently only supports the positional operator once, and only for the top level array. There is a ticket SERVER-831 to change this behavior for your use case. You can follow the issue there and up vote it.
However, you might be able to change your approach to accomplish what you want to do. One way is to change your schema. Collapse the tasks name into the array so the document looks like this:
{
_id:test,
tasks:
[
{
Task:1
Name:para1,
Type:String,
Value:*****
},
{
Task:1
Name:para2,
Type:String,
Value:*****
},
{
Task:2
Name:para1,
Type:String,
Value:*****
},
{
Task:2
Name:para2,
Type:String,
Value:*****
}
]
}
Another approach that may work for you is to use $pull and $push. For instance something like this to replace a task (this assumes that tasks.Parameter.Name is unique to an array of Parameters):
db.test2.update({$and: [{"tasks.Name": "Task3"}, {"tasks.Parameter.Name":"para1"}]}, {$pull: {"tasks.$.Parameter": {"Name": "para1"}}})
db.test2.update({"tasks.Name": "Task3"}, {$push: {"tasks.$.Parameter": {"Name": "para3", Type: "String", Value: 1}}})
With this solution you need to be careful with regard to concurrency, as there will be a brief moment where the document doesn't exist.