MongoDB database design for users and their data

New to MongoDB and databases in general. I'm trying to make a basic property app with Express and MongoDB for practice.
I'm looking for some help on the best way to design the schema.
Basically, my app would have landlords and tenants. Each landlord would have a bunch of properties that information is stored about. Things like lease terms, tenant name, maintenance requests, images, etc.
The tenants would be able to sign up and be associated with the property they live in. They could submit maintenance forms, etc.
Is this a good approach? Should everything be kept in the same collection? Thanks.
{
"_id": "507f1f77bcf86cd799439011",
"user": "Corey",
"password": "hashed#PASSWORD",
"email": "corey@email.com",
"role": "landlord",
"properties": [
{
"addressId": "1",
"address": "101 Main Street",
"tenant": "John Smith",
"leaseDate": "04/21/2016",
"notes": "These are my notes about this property.",
"images": [ "http://www.imagelink.com/image1", "http://www.imagelink.com/image2", "http://www.imagelink.com/image3"]
},
{
"addressId": "2",
"address": "105 Maple Street",
"tenant": "John Jones",
"leaseDate": "01/01/2018",
"notes": "These are my notes about 105 Maple Ave property.",
"images": ["http://www.imagelink.com/image1", "http://www.imagelink.com/image2", "http://www.imagelink.com/image3"],
"forms": [
{
"formType": "lease",
"leaseTerm": "12 months",
"leaseName": "John Jones",
"leaseDate": "01/01/2018"
},
{
"formType": "maintenance",
"maintenanceNotes": "Need furnace looked at. Doesn't heat properly.",
"maintenanceName": "John Jones",
"maintenanceDate": "01/04/2018",
"status": "resolved"
}
]
},
{
"addressId": "3",
"address": "110 Chestnut Street",
"tenant": "John Brown",
"leaseDate": "07/28/2014",
"notes": "These are some notes about 110 Chestnut Ave property.",
"images": [ "http://www.imagelink.com/image1", "http://www.imagelink.com/image2", "http://www.imagelink.com/image3"]
}
]
}
{
"_id": "507f1f77bcf86cd799439012",
"user": "John",
"password": "hashed#PASSWORD",
"email": "john@email.com",
"role": "tenant",
"address": "2",
"images": [ "http://www.imagelink.com/image1", "http://www.imagelink.com/image2" ]
}

For this relationship I'd suggest three collections (Landlords, Properties, and Tenants), with each tenant having a "landLordId" and a "propertyId".
The "landLordId" would simply be the ObjectId of the landlord document, and likewise "propertyId" would be the ObjectId of the property.
This will make your life easier if you plan to do any kind of roll-up reports, or if the mappings from landlords to properties or landlords to tenants are not one-to-one (for example, more than one property manager for a given property).
It also keeps things intuitive: you can store things like maintenance requests and lease terms in arrays on the tenant documents, with references to whatever else they relate to.
This layout offers the most flexibility for aggregating data in reports and queries.
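As a rough sketch of the three-collection layout suggested above, the documents might look like the following. The plain arrays stand in for MongoDB collections, the short string ids stand in for ObjectIds, and the maintenanceRequests field and the tenantsForLandlord helper are assumptions made for illustration, not part of the original answer.

```javascript
// In-memory stand-ins for the three collections; in MongoDB the _id values
// would be ObjectIds rather than short strings.
const landlords = [
  { _id: "L1", user: "Corey", role: "landlord" },
];

const properties = [
  { _id: "P1", landLordId: "L1", address: "101 Main Street" },
  { _id: "P2", landLordId: "L1", address: "105 Maple Street" },
];

const tenants = [
  {
    _id: "T1",
    user: "John",
    landLordId: "L1",
    propertyId: "P2",
    // Hypothetical field: requests kept in an array on the tenant.
    maintenanceRequests: [
      { notes: "Need furnace looked at.", date: "01/04/2018", status: "resolved" },
    ],
  },
];

// Roll-up: every tenant of a landlord together with the address they live at
// and the number of requests that are not yet resolved.
function tenantsForLandlord(landLordId) {
  return tenants
    .filter((t) => t.landLordId === landLordId)
    .map((t) => ({
      tenant: t.user,
      address: properties.find((p) => p._id === t.propertyId).address,
      openRequests: t.maintenanceRequests.filter((r) => r.status !== "resolved").length,
    }));
}

const report = tenantsForLandlord("L1");
console.log(report);
```

Because each tenant only stores ids, a property or landlord can change without touching the tenant documents.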

Related

Fast Searching Based on "custom fields" along with other fields

We have a legacy implementation of User Groups, which is way more than it implies. Users can be assigned to a group and you can create a hierarchy of groups. Groups can also have system wide permissions assigned to them, or a group can be used on some other module for permissions. You can even do a permission where it's something complicated like
((In Group1 or Group2) and (In Group3 and Group4)) or (In Group5 and (not IN Group1 or Group2))
When a permission like this is created, it will actually select all the users that match this, create a "derived group" and then assign those users to the new group.
In our new application, we have a completely different permissions system that handles these sorts of use cases pretty well, with it also being attribute based, rather than group/role based.
That being said, groups are still used for other things, other than permissions. We might build a report based upon a group, or send out emails to a group, etc. We still need this functionality.
It also looks like we're moving our current user information into MongoDB, because each client can customize the fields that are available for a user to populate: a user might have a "job title" for one client, but a "designation" or "position" for another. We call these "custom fields". The client can create as many of these fields as they want.
So that's the back story. My issue is that I don't really want to create a new "groups" implementation, since all we really need is a way to create and save filters for users, so when we need to send out an email to a specific subset of users, it will either use a filter that has already been saved, or create a new one.
So this is the original format for the user document in MongoDB:
{
"id": 123456,
"username": "john.smith@domain.com",
"first_name": "John",
"last_name": "Smith",
"email": "john.smith@domain.com",
"employee_type": "permanent",
"account": {
"enabled": true,
"locked": false,
"redeem_only": false
},
"custom_fields": {
"job_title": "Cashier",
"branch_code": "000123",
"social_team_name": "The Terrible Trolls"
}
},
{
"id": 123457,
"username": "jane.smith@domain.com",
"first_name": "Jane",
"last_name": "Smith",
"email": "john.smith@domain.com",
"employee_type": "permanent",
"account": {
"enabled": true,
"locked": false,
"redeem_only": false
},
"custom_fields": {
"job_title": "Mortgage Consultant",
"branch_code": "000123",
"social_team_name": "Team Savage"
}
},
{
"id": 123458,
"username": "morgan.jones@domain.com",
"first_name": "Morgan",
"last_name": "Jones",
"email": "morgan.jones@domain.com",
"employee_type": "permanent",
"account": {
"enabled": true,
"locked": false,
"redeem_only": false
},
"custom_fields": {
"job_title": "Regional Manager",
"branch_code": "000124",
"social_team_name": "The Terrible Trolls"
}
}
So we might want to create a filter where account.enabled = true AND employee_type = 'permanent' AND custom_fields.branch_code = '000124'. A filter could combine any of the fields in any way.
Ultimately I'm wondering whether this structure is the best way to do this. I know I can use wildcard indexes to index the custom fields, but I'm still limited in the number of indexes I can create, so if a query uses a field that isn't indexed, or we've hit our index limit, things are going to start slowing down.
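To make the idea of a saved filter concrete, here is a minimal sketch of how one might be stored and turned into a query document for the structure above. The savedFilter format, the buildQuery helper, and the users collection name are all assumptions for illustration; only equality conditions joined by AND are handled.

```javascript
// Hypothetical saved-filter format: a list of field/value equality conditions
// that are all ANDed together.
const savedFilter = {
  name: "Enabled permanent staff at branch 000124",
  conditions: [
    { field: "account.enabled", value: true },
    { field: "employee_type", value: "permanent" },
    { field: "custom_fields.branch_code", value: "000124" },
  ],
};

// Turn the saved filter into a query document that uses dot notation
// to reach into the nested account and custom_fields objects.
function buildQuery(filter) {
  const query = {};
  for (const { field, value } of filter.conditions) {
    query[field] = value;
  }
  return query;
}

const query = buildQuery(savedFilter);
console.log(query);
// The result could then be passed to something like db.users.find(query),
// where "users" is an assumed collection name.
```

Storing only the conditions means the matching users are resolved at query time, so nothing has to be rebuilt when a user's field values change.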
Another structure I saw is as follows:
{
"id": 123456,
"username": "john.smith@domain.com",
"first_name": "John",
"last_name": "Smith",
"email": "john.smith@domain.com",
"employee_type": "permanent",
"account": {
"enabled": true,
"locked": false,
"redeem_only": false
},
"custom_fields": [
{
"k": "Job Title",
"v": "Cashier"
},
{
"k": "Branch Code",
"v": "000123"
},
{
"k": "Social Team Name",
"v": "The Terrible Trolls"
}
]
},
{
"id": 123457,
"username": "jane.smith@domain.com",
"first_name": "Jane",
"last_name": "Smith",
"email": "john.smith@domain.com",
"employee_type": "permanent",
"account": {
"enabled": true,
"locked": false,
"redeem_only": false
},
"custom_fields": [
{
"k": "Job Title",
"v": "Mortgage Consultant"
},
{
"k": "Branch Code",
"v": "000123"
},
{
"k": "Social Team Name",
"v": "Team Savage"
}
]
},
{
"id": 123458,
"username": "morgan.jones@domain.com",
"first_name": "Morgan",
"last_name": "Jones",
"email": "morgan.jones@domain.com",
"employee_type": "permanent",
"account": {
"enabled": true,
"locked": false,
"redeem_only": false
},
"custom_fields": [
{
"k": "Job Title",
"v": "Regional Manager"
},
{
"k": "Branch Code",
"v": "000124"
},
{
"k": "Social Team Name",
"v": "The Terrible Trolls"
}
]
}
However, I'm not really sure whether this would be better, as the problem remains that we are limited in the number of indexes we can create.
Is there a viable solution for this (links to articles/resources would be great)? Or am I going to end up saving a "filter", selecting all the users that match it, and assigning them to that "filter" for easy lookup, only to have to rebuild it every time a user updates their information, gets promoted, or anything else changes the field values?
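For the key/value layout shown above, a query has to match k and v inside the same array element, which is what $elemMatch is for. The sketch below builds such a query; buildKeyValueQuery and the users collection name are assumptions. This layout is commonly paired with a single compound index on custom_fields.k and custom_fields.v, so that all custom fields share one index instead of needing one each, though whether that is fast enough here would need testing.

```javascript
// Build a query for the key/value layout: each custom-field condition becomes
// an $elemMatch so that k and v must match within the same array element.
function buildKeyValueQuery(fixedFields, customFields) {
  const clauses = Object.entries(customFields).map(([k, v]) => ({
    custom_fields: { $elemMatch: { k, v } },
  }));
  // Only add $and when there is at least one custom-field condition.
  return clauses.length > 0 ? { ...fixedFields, $and: clauses } : { ...fixedFields };
}

const kvQuery = buildKeyValueQuery(
  { "account.enabled": true, employee_type: "permanent" },
  { "Branch Code": "000124" }
);
console.log(JSON.stringify(kvQuery, null, 2));

// Assumed index for this layout ("users" is a hypothetical collection name):
// db.users.createIndex({ "custom_fields.k": 1, "custom_fields.v": 1 })
```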

Delete sub-document from array in array of sub documents

Let's imagine a mongo collection of, let's say, magazines. For some reason, we've ended up storing each issue of the magazine as a separate document. Each article is a subdocument inside an Articles array, and the authors of each article are represented as subdocuments inside the Writers array on the Article subdocument. Only the name and email of each author are stored inside the article, but there is a Writers array at the magazine level containing more information about each author.
{
"Title": "The Magazine",
"Articles": [
{
"Title": "Mongo Queries 101",
"Summary": ".....",
"Writers": [
{
"Name": "tom",
"Email": "tom@example.com"
},
{
"Name": "anna",
"Email": "anna@example.com"
}
]
},
{
"Title": "Why not SQL instead?",
"Summary": ".....",
"Writers": [
{
"Name": "mike",
"Email": "mike@example.com"
},
{
"Name": "anna",
"Email": "anna@example.com"
}
]
}
],
"Writers": [
{
"Name": "tom",
"Email": "tom@example.com",
"Web": "tom.example.com"
},
{
"Name": "mike",
"Email": "mike@example.com",
"Web": "mike.example.com"
},
{
"Name": "anna",
"Email": "anna@example.com",
"Web": "anna.example.com"
}
]
}
How can one author be completely removed from a magazine?
Finding magazines where the unwanted author exists is quite easy. The problem is pulling the author out of all the subdocuments.
MongoDB 3.6 introduces some new placeholder operators, $[] and $[<identifier>], and I suspect these could be used with either $pull or $pullAll, but so far I haven't had any success.
Is it possible to do this in one go? Or at least no more than two? One query for removing the author from all the articles, and one for removing the biography from the magazine?

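You can try the query below.
db.col.update(
  {},  // match every magazine document
  { "$pull": {
      // $[] applies the pull to the Writers array of every article
      "Articles.$[].Writers": { "Name": "tom", "Email": "tom@example.com" },
      // also remove the author's entry from the magazine-level Writers array
      "Writers": { "Name": "tom", "Email": "tom@example.com" }
  } },
  { "multi": true }  // update all matching documents, not just the first
);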
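To illustrate what the update above is intended to do to a single document, here is a plain JavaScript sketch that performs the same removal in memory. The removeWriter helper is hypothetical and only mimics the effect; it is not how MongoDB implements $pull.

```javascript
// Remove a writer from every article and from the magazine-level list,
// returning a new document instead of mutating the input.
function removeWriter(magazine, writer) {
  // An entry is kept unless both Name and Email match the unwanted writer.
  const isOther = (w) => !(w.Name === writer.Name && w.Email === writer.Email);
  return {
    ...magazine,
    Articles: magazine.Articles.map((article) => ({
      ...article,
      Writers: article.Writers.filter(isOther),
    })),
    Writers: magazine.Writers.filter(isOther),
  };
}

const magazine = {
  Title: "The Magazine",
  Articles: [
    {
      Title: "Mongo Queries 101",
      Writers: [
        { Name: "tom", Email: "tom@example.com" },
        { Name: "anna", Email: "anna@example.com" },
      ],
    },
    {
      Title: "Why not SQL instead?",
      Writers: [
        { Name: "mike", Email: "mike@example.com" },
        { Name: "anna", Email: "anna@example.com" },
      ],
    },
  ],
  Writers: [
    { Name: "tom", Email: "tom@example.com", Web: "tom.example.com" },
    { Name: "mike", Email: "mike@example.com", Web: "mike.example.com" },
    { Name: "anna", Email: "anna@example.com", Web: "anna.example.com" },
  ],
};

const cleaned = removeWriter(magazine, { Name: "tom", Email: "tom@example.com" });
console.log(JSON.stringify(cleaned, null, 2));
```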

REST - Resource and Collection Representations

I'm confused about the design of collection resources.
Let's say I have a user resource - represented as below.
http://www.example.com/users/{user-id}
user : {
id : "",
name : "",
age : "",
addresses : [
{
line1 : "",
line2 : "",
city : "",
state : "",
country : "",
zip : ""
}
]
}
Now, what should my users collection resource representation look like? Should it be a list of user representations (as above)? Or can it be a subset of that, like below:
http://www.example.com/users/
users : [
{
id : "",
name : "",
link : {
rel : "self",
href : "/users/{id}"
}
}
]
Should the collection resource representation include the complete representation of the contained resources, or can it be a subset?

Media types define the rules on how you can convey information. Look at the specifications for Collection+JSON and HAL for examples of how to do what you are trying to do.

That depends entirely on what you want it to do. The great thing about REST APIs is that they are so flexible. You can represent data in any way (theoretically) that you want.
Personally, I would have an attribute that allows the user to specify a subset or style of representation. For instance /users/{userid}.json?output=simple or /users/{userid}.json?subset=raw
Something along those lines would also allow you to nest representations and fine tune what you want without sacrificing flexibility:
/users/{userid}.json?output=simple&subset=raw
The sky is the limit

I would make the list service fine-grained by supporting an include parameter:
http://www.example.com/users?include=address|profile|foo|bar
Any delimiter other than & (URL-encoded where necessary), such as , or -, can be used instead of |. On the server side, check for those include values and render the JSON response accordingly.
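A minimal sketch of the server-side check described above might look like this. The user record, the BASE_FIELDS list, and the renderUser helper are assumptions for illustration; a real service would do this inside its request handler.

```javascript
// Hypothetical full user record as it might be stored on the server.
const user = {
  id: "jsmith",
  name: "John Smith",
  address: { city: "New York", state: "NY" },
  profile: { age: 25 },
};

// Fields that are always returned, plus whatever sections the caller asks for.
const BASE_FIELDS = ["id", "name"];

function renderUser(record, requestUrl) {
  const params = new URL(requestUrl).searchParams;
  // Split the include value on "|" and drop empty entries.
  const include = (params.get("include") || "").split("|").filter(Boolean);
  const result = {};
  for (const key of [...BASE_FIELDS, ...include]) {
    // Unknown section names (foo, bar) are silently ignored.
    if (key in record) result[key] = record[key];
  }
  return result;
}

const full = renderUser(user, "http://www.example.com/users?include=address|profile|foo|bar");
const basic = renderUser(user, "http://www.example.com/users");
console.log(full, basic);
```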

There isn't really a standard for this. You have options:
1. List of links
Return a list of links to the collection item resources (i.e., the user IDs).
http://www.example.com/users/
users : [
"jsmith",
"mjones",
...
]
Note that these can actually be interpreted as relative URIs, which somewhat supports the "all resources should be accessible by following URIs from the root URI" ideal.
http://www.example.com/users/ + jsmith = http://www.example.com/users/jsmith
2. List of partial resources
Return a list of partial resources (users), allowing the caller to specify which fields to include. You might also have a default selection of fields in case the user doesn't supply any - the default might even be "include all fields."
http://www.example.com/users/?field=id&field=name&field=link
users : [
{
id : "jsmith",
name : "John Smith",
link : "www.google.com"
},
...
]
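Option 2 could be sketched roughly as follows, using the repeated field parameter from the URL above. The sample records, the DEFAULT_FIELDS list, and the listUsers helper are hypothetical.

```javascript
// Hypothetical stored user records.
const users = [
  { id: "jsmith", name: "John Smith", age: 25, link: "/users/jsmith" },
  { id: "mjones", name: "M. Jones", age: 31, link: "/users/mjones" },
];

// Default projection used when the caller does not pass any field parameter.
const DEFAULT_FIELDS = ["id", "name", "link"];

function listUsers(records, requestUrl) {
  // getAll collects every occurrence of ?field=... in the query string.
  const requested = new URL(requestUrl).searchParams.getAll("field");
  const fields = requested.length > 0 ? requested : DEFAULT_FIELDS;
  return records.map((record) =>
    Object.fromEntries(fields.filter((f) => f in record).map((f) => [f, record[f]]))
  );
}

const partial = listUsers(users, "http://www.example.com/users/?field=id&field=name");
const defaults = listUsers(users, "http://www.example.com/users/");
console.log(partial, defaults);
```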

It can be a subset, but that depends on the data. Take a look at the example below.
{
"usersList": {
"users": [{
"firstName": "Venkatraman",
"lastName": "Ramamoorthy",
"age": 27,
"address": {
"streetAddress": "21 2nd Street",
"city": "New York",
"state": "NY",
"postalCode": 10021
},
"phoneNumbers": [{
"type": "mobile",
"number": "+91-9999988888"
}, {
"type": "fax",
"number": "646 555-4567"
}]
}, {
"firstName": "John",
"lastName": "Smith",
"age": 25,
"address": {
"streetAddress": "21 2nd Street",
"city": "New York",
"state": "NY",
"postalCode": 10021
},
"phoneNumbers": [{
"type": "home",
"number": "212 555-1234"
}, {
"type": "fax",
"number": "646 555-4567"
}]
}]
}
}

Advice on collection schema for MongoDB

I am developing a little league app for my son's league as a "weekend project" and as a way to learn MongoDB. I'm struggling with the best way to set up the schema in MongoDB. My biggest hangup is whether or not I should replicate some of the data. Here's my first stab at the schema:
Collection -
Player
{ "firstname": "Test",
"lastname" : "Player",
"street":"123 Lamar",
"city": "Austin",
"state":"TX" ,
"zip": "78701",
"littleleagueid": "123",
"league":"minors",
"team":"Rangers",
"parents":
[ {
"firstname": "Bob",
"lastname": "Player",
"relationship": "father",
"street":"123 Lamar",
"city": "Austin",
"state":"TX",
"zip": "78701"
},
{
"firstname": "Sally",
"lastname": "Player",
"relationship": "stepmother",
"street":"123 Lamar",
"city": "Austin",
"state":"TX",
"zip": "78701"
},
{
"firstname": "Sue",
"lastname": "Explayer",
"relationship": "mother",
"street":"456 Congress",
"city": "Austin",
"state":"TX",
"zip": "78761"}
]
}
My biggest question is whether I should embed the parents in the kids collection or separate them into their own collection. The address is being repeated multiple times. This might be the best method, but in a SQL environment I would have just pulled this into its own table.
Any and all advice would be greatly appreciated.

If you need fast queries:
I think your schema is more or less OK.
If you need fast updates:
I would instead reference the parents by their _id, turning the player collection into a people collection. That way, if a parent's details change, you update only one document rather than every player that has that parent.
Keep in mind that, even if you need fast queries, the embedded schema is not very maintainable because, as you say, it duplicates data, and this can lead to inconsistencies.
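As a rough illustration of the reference-based variant, the sketch below keeps players and parents in one people collection and links them by _id. The numeric ids, the personId field, and the parentsOf helper are assumptions; the new street name is made up purely to show an update.

```javascript
// A single "people" collection (hypothetical name); ids are simplified to numbers.
const people = [
  { _id: 1, firstname: "Bob", lastname: "Player", street: "123 Lamar", city: "Austin", state: "TX", zip: "78701" },
  { _id: 2, firstname: "Sue", lastname: "Explayer", street: "456 Congress", city: "Austin", state: "TX", zip: "78761" },
  {
    _id: 3, firstname: "Test", lastname: "Player", league: "minors", team: "Rangers",
    // References instead of embedded copies of each parent.
    parents: [
      { personId: 1, relationship: "father" },
      { personId: 2, relationship: "mother" },
    ],
  },
];

// Resolve a player's parent references into full person documents.
function parentsOf(player) {
  return player.parents.map(({ personId, relationship }) => ({
    relationship,
    ...people.find((p) => p._id === personId),
  }));
}

// A parent moving house is now a single-document update.
people.find((p) => p._id === 1).street = "789 Guadalupe";

const player = people.find((p) => p._id === 3);
const resolved = parentsOf(player);
console.log(resolved);
```

The cost is the extra lookup when reading a player, which is the query-speed trade-off mentioned above.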

MongoDB Database Structure and Best Practices Help

I'm in the process of developing Route Tracking/Optimization software for my refuse collection company and would like some feedback on my current data structure/situation.
Here is a simplified version of my MongoDB structure:
Database: data
Collections:
“customers” - data collection containing all customer data.
[
{
"cust_id": "1001",
"name": "Customer 1",
"address": "123 Fake St",
"city": "Boston"
},
{
"cust_id": "1002",
"name": "Customer 2",
"address": "123 Real St",
"city": "Boston"
},
{
"cust_id": "1003",
"name": "Customer 3",
"address": "12 Elm St",
"city": "Boston"
},
{
"cust_id": "1004",
"name": "Customer 4",
"address": "16 Union St",
"city": "Boston"
},
{
"cust_id": "1005",
"name": "Customer 5",
"address": "13 Massachusetts Ave",
"city": "Boston"
}, { ... }, { ... }, ...
]
“trucks” - data collection containing all truck data.
[
{
"truckid": "21",
"type": "Refuse",
"year": "2011",
"make": "Mack",
"model": "TerraPro Cabover",
"body": "Mcneilus Rear Loader XC",
"capacity": "25 cubic yards"
},
{
"truckid": "22",
"type": "Refuse",
"year": "2009",
"make": "Mack",
"model": "TerraPro Cabover",
"body": "Mcneilus Rear Loader XC",
"capacity": "25 cubic yards"
},
{
"truckid": "12",
"type": "Dump",
"year": "2006",
"make": "Chevrolet",
"model": "C3500 HD",
"body": "Rugby Hydraulic Dump",
"capacity": "15 cubic yards"
}
]
“drivers” - data collection containing all driver data.
[
{
"driverid": "1234",
"name": "John Doe"
},
{
"driverid": "4321",
"name": "Jack Smith"
},
{
"driverid": "3421",
"name": "Don Johnson"
}
]
“route-lists” - data collection containing all predetermined route lists.
[
{
"route_name": "monday_1",
"day": "monday",
"truck": "21",
"stops": [
{
"cust_id": "1001"
},
{
"cust_id": "1010"
},
{
"cust_id": "1002"
}
]
},
{
"route_name": "friday_1",
"day": "friday",
"truck": "12",
"stops": [
{
"cust_id": "1003"
},
{
"cust_id": "1004"
},
{
"cust_id": "1012"
}
]
}
]
"routes" - data collection containing data for all active and completed routes.
[
{
"routeid": "1",
"route_name": "monday_1",
"start_time": "04:31 AM",
"status": "active",
"stops": [
{
"customerid": "1001",
"status": "complete",
"start_time": "04:45 AM",
"finish_time": "04:48 AM",
"elapsed_time": "3"
},
{
"customerid": "1010",
"status": "complete",
"start_time": "04:50 AM",
"finish_time": "04:52 AM",
"elapsed_time": "2"
},
{
"customerid": "1002",
"status": "incomplete",
"start_time": "",
"finish_time": "",
"elapsed_time": ""
},
{
"customerid": "1005",
"status": "incomplete",
"start_time": "",
"finish_time": "",
"elapsed_time": ""
}
]
}
]
Here is the process thus far:
Each day drivers begin by Starting a New Route. Before starting a new route drivers must first input data:
driverid
date
truck
Once all data is entered correctly, the Start a New Route process will begin:
Create new object in collection “routes”
Query collection “route-lists” for “day” + “truck” match and return "stops"
Insert “route-lists” data into “routes” collection
As the driver proceeds with his daily stops/tasks, the “routes” collection will update accordingly.
On completion of all tasks, the driver will then be able to Complete the Route Process by simply changing the “status” field from “active” to “complete” in the "routes" collection.
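The process above could be sketched in plain JavaScript as follows, with arrays standing in for the “route-lists” and “routes” collections. The startRoute and completeRoute helpers are hypothetical, and passing the day explicitly (rather than deriving it from the date) is an assumption made to keep the example short.

```javascript
// In-memory stand-ins for the "route-lists" and "routes" collections.
const routeLists = [
  {
    route_name: "monday_1",
    day: "monday",
    truck: "21",
    stops: [{ cust_id: "1001" }, { cust_id: "1010" }, { cust_id: "1002" }],
  },
];
const routes = [];

// Start a New Route: find the route list for the day and truck, then copy its
// stops into a new "routes" document with every stop marked incomplete.
function startRoute({ driverid, date, day, truck }) {
  const template = routeLists.find((r) => r.day === day && r.truck === truck);
  if (!template) throw new Error(`No route list for ${day} / truck ${truck}`);
  const route = {
    routeid: String(routes.length + 1),
    route_name: template.route_name,
    driverid,
    date,
    truck,
    status: "active",
    stops: template.stops.map((s) => ({
      customerid: s.cust_id,
      status: "incomplete",
      start_time: "",
      finish_time: "",
      elapsed_time: "",
    })),
  };
  routes.push(route);
  return route;
}

// Complete the Route Process: flip the status from "active" to "complete".
function completeRoute(routeid) {
  const route = routes.find((r) => r.routeid === routeid);
  if (!route) throw new Error(`Unknown route ${routeid}`);
  route.status = "complete";
  return route;
}

const started = startRoute({ driverid: "1234", date: "2018-01-01", day: "monday", truck: "21" });
console.log(started);
```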
That about sums it up. Any feedback, opinions, comments, links, optimization tactics are greatly appreciated.
Thanks in advance for your time.

Your database schema looks to me like a 'classic' relational database schema. MongoDB is a good fit for data denormalization. I guess that when you display routes you load all the related customers, driver, and truck.
If you want to make your system really fast, you can embed everything in the routes collection.
So I suggest the following modifications to your schema:
customers - as-is
trucks - as-is
drivers - as-is
route-lists:
Embed the customer data inside stops instead of a reference. Also embed the truck. In this case the schema will be:
{
"route_name": "monday_1",
"day": "monday",
"truck": {
"_id": 1,
// here will be all truck data
},
"stops": [{
"customer": {
"_id": 1,
//here will be all customer data
}
}, {
"customer": {
"_id": 2,
//here will be all customer data
}
}]
}
routes:
When a driver starts a new route, copy the route from route-lists and additionally embed the driver information:
{
//copy all route-list data (just create a new id for the current route and keep a reference to the route-lists entry, so you will be able to sync the route with its route list)
"_id": "1",
"route_list_id": 1,
"start_time": "04:31 AM",
"status": "active",
"driver": {
//embed all driver data here
},
"stops": [{
"customer": {
//all customer data
},
"status": "complete",
"start_time": "04:45 AM",
"finish_time": "04:48 AM",
"elapsed_time": "3"
}]
}
I guess you're asking yourself what to do if a driver, customer, or other denormalized data changes in the main collection. Yes, you need to update all the denormalized copies in the other collections. You may need to update a very large number of documents (billions, depending on your system size), and that's okay. You can do it asynchronously if it takes a long time.
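The propagation step described above might be sketched like this, using an in-memory array in place of the routes collection. The propagateCustomer helper and the sample data are assumptions; in a real system this would be a multi-document update, possibly run in the background.

```javascript
// Two routes that each carry an embedded copy of customer 1001.
const routes = [
  { _id: "1", stops: [{ customer: { _id: "1001", name: "Customer 1", address: "123 Fake St" } }] },
  {
    _id: "2",
    stops: [
      { customer: { _id: "1001", name: "Customer 1", address: "123 Fake St" } },
      { customer: { _id: "1002", name: "Customer 2", address: "123 Real St" } },
    ],
  },
];

// Overwrite every embedded copy of the customer with the latest version and
// report how many copies were touched.
function propagateCustomer(allRoutes, customer) {
  let updated = 0;
  for (const route of allRoutes) {
    for (const stop of route.stops) {
      if (stop.customer._id === customer._id) {
        stop.customer = { ...customer };
        updated += 1;
      }
    }
  }
  return updated;
}

const changed = propagateCustomer(routes, { _id: "1001", name: "Customer 1", address: "1 New St" });
console.log(changed, JSON.stringify(routes, null, 2));
```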
What are the benefits of the above data structure?
Each document contains all the data you may need to display in your application. So, for instance, you don't need to load the related customers, driver, and truck when you display routes.
You can run more complex queries against your database. For example, with the embedded schema you can build a query that returns all routes containing a stop for a customer with name = "Bill" (in your current schema you would first need to load the customer by name, get the id, and then look up routes by customer id).
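The "Bill" example can be illustrated with a small in-memory sketch. The sample routes and the routesWithCustomerNamed helper are hypothetical; against MongoDB, the equivalent filter on the embedded schema would use dot notation, as noted in the final comment.

```javascript
// Routes with customer data embedded in each stop, as in the schema above.
const embeddedRoutes = [
  {
    _id: "1",
    route_name: "monday_1",
    stops: [{ customer: { _id: 1, name: "Bill" } }, { customer: { _id: 2, name: "Ann" } }],
  },
  {
    _id: "2",
    route_name: "friday_1",
    stops: [{ customer: { _id: 3, name: "Carol" } }],
  },
];

// Find every route that has at least one stop for a customer with this name.
function routesWithCustomerNamed(allRoutes, name) {
  return allRoutes.filter((route) => route.stops.some((stop) => stop.customer.name === name));
}

const matches = routesWithCustomerNamed(embeddedRoutes, "Bill").map((r) => r.route_name);
console.log(matches);

// With embedded data the MongoDB filter is a single document:
// { "stops.customer.name": "Bill" }
```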
You are probably also worried that your data can get out of sync in some cases, but to address this you just need to build a few unit tests to ensure that you update your denormalized data correctly.
I hope the above helps you see the world from the non-relational side, from a document database point of view.
