How to model a Manufacturer in DynamoDB - nosql

I come from a relational database background and I'm struggling with the concept of a single table that represents all my data in DynamoDB.
My application has Manufacturers and they give access to their staff to my portal CRM (Manufacturer Users). The Users then add their own customers to the system and log and record all their orders.
GetManufacturer()
GetManufacturerUsers()
GetManufacturerCustomers()
GetManufacturerCustomersOrders()
A Manufacturer will never see another Manufacturer's Customers' orders.
I understand the basics around PK and SK, my question is... really? Is this really single table?
Table Customer Accounts
Manufacturers (Name, logo etc)
Manufacturer Users (Users of that Manufacturer that access my system i.e. email, role)
Customers of the Manufacturer (custID, Name, Manufacturer they belong to)
Customer transaction data (lots of it)
Given the above, how would you model in DynamoDB?
Access pattern stories
Query Manufacturer (Login, Name) by Users SK ManufacturerID
Get all Customers IDs and Name by ManufacturerID Filter by UsersID
Get all Orders filtered by CustomersID and ManufacturerID
Get all Ordered Items filtered by OrdersID and CustomersID

As you're learning, NoSQL data modeling is completely different than data modeling in a SQL/relational database. Single table design requires you to think differently about your data, which can come with a steep learning curve.
Alex DeBrie, author of The DynamoDB Book, has written some of the best materials about data modeling in DynamoDB. His book is fantastic and I recommend it to anyone wanting to learn about NoSQL data modeling in DynamoDB.
In your situation, I'd start by reading an article on modeling one-to-many relationships in DynamoDB. You can also see the same material in this video presented at the AWS re:Invent 2019 conference. Both of these resources will give you a much better understanding of single table design in DynamoDB. I haven't found a better resource for jump-starting the DynamoDB learning process.
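To make the idea concrete, here is a minimal sketch of how the four access patterns from the question could map onto one table. It is plain Python that imitates a DynamoDB Query (exact match on PK, begins_with on SK) and does not call the AWS SDK. The key prefixes (MFR#, USER#, CUST#, ORDER#), the attribute names, and the sample data are all assumptions made for illustration, not a recommended final design.

```python
# Hypothetical single-table layout: every item lives in one table and is
# addressed by a partition key (PK) and a sort key (SK).
TABLE = [
    {"PK": "MFR#acme", "SK": "PROFILE", "name": "Acme", "logo": "acme.png"},
    {"PK": "MFR#acme", "SK": "USER#alice", "email": "alice@acme.example", "role": "admin"},
    {"PK": "MFR#acme", "SK": "CUST#c1", "name": "Contoso"},
    {"PK": "MFR#acme", "SK": "CUST#c2", "name": "Fabrikam"},
    {"PK": "MFR#acme", "SK": "ORDER#c1#o1", "total": 120},
    {"PK": "MFR#acme", "SK": "ORDER#c1#o2", "total": 75},
    {"PK": "MFR#globex", "SK": "CUST#c9", "name": "Initech"},
]


def query(pk, sk_prefix=""):
    """Imitate a DynamoDB Query: exact PK match, begins_with on SK."""
    matches = [i for i in TABLE if i["PK"] == pk and i["SK"].startswith(sk_prefix)]
    return sorted(matches, key=lambda i: i["SK"])  # results come back in SK order


def get_manufacturer(mfr_id):
    items = query(f"MFR#{mfr_id}", "PROFILE")
    return items[0] if items else None


def get_manufacturer_users(mfr_id):
    return query(f"MFR#{mfr_id}", "USER#")


def get_manufacturer_customers(mfr_id):
    return query(f"MFR#{mfr_id}", "CUST#")


def get_manufacturer_customer_orders(mfr_id, cust_id):
    # The trailing "#" keeps customer c1 from also matching c10, c11, ...
    return query(f"MFR#{mfr_id}", f"ORDER#{cust_id}#")
```

Because every item is partitioned by manufacturer, a query for one manufacturer cannot return another manufacturer's customers or orders. With a lot of transaction data, one partition per manufacturer may grow large, so partitioning orders by customer instead is a variation worth considering.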

Related

One-to-many relationships in NoSQL DB

I've just started learning DynamoDB and I'm facing a big problem.
Suppose I have an author table and a book table, where an author can have multiple books and each book must have an author.
So, in the NoSQL DB I just embedded the author information in the book table to solve this problem.
Sample code: https://pastebin.ubuntu.com/p/DvHpS8JQJV/
But recently I ran into a problem: if, a long time later, an admin wants to change some information about an author, such as the live attribute, how can I make that change take effect in the book table?
Note: Embedding the book collection in the author table could solve this problem, but retrieving all books with pagination and other operations could then become more difficult.
In a relational DB this is very easy to solve: just use a foreign key and retrieve the data with a join query.
How can I solve this type of problem in NoSQL or DynamoDB? Any suggestions?
You have two options.
Go with a semi-SQL design. Create separate tables for books and authors, and handle joins at the application level. It's not perfect from a performance perspective, but it's easy to start with for devs with a SQL background.
Go with single table design. This is a complex topic. There is no silver bullet to handle one-to-many relationships like in SQL. You need good understanding of your domain and single table design to do this well.
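As a rough illustration of the first option, here is a sketch in plain Python with the two tables modeled as in-memory structures. The attribute names, ids, and the books_with_author helper are hypothetical. A real implementation would read from the database instead.

```python
# Two hypothetical "tables", modeled here as plain Python structures.
authors = {"a1": {"name": "Jane Doe", "live": True}}
books = [
    {"book_id": "b1", "title": "First Book", "author_id": "a1"},
    {"book_id": "b2", "title": "Second Book", "author_id": "a1"},
]


def books_with_author(page, page_size):
    """Paginate the books, then attach author data in the application."""
    start = page * page_size
    return [
        {**book, "author": authors[book["author_id"]]}
        for book in books[start:start + page_size]
    ]


# Changing an author attribute is a single write to the author table;
# every book picks it up on the next read because nothing was copied.
authors["a1"]["live"] = False
first_page = books_with_author(page=0, page_size=1)
```

The cost is the extra lookup per page of books, which is the performance trade-off mentioned in the first option.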

What is the best way to structure shared data and access rights in a document database

I'm coming at this problem with an RDBMS background, so some of the best practices of document databases are new to me. I'm trying to understand the best way to store shared data and access rights to that data. The schema in SQL Server might look like this:
Project Table
projectId PK
ownerId FK User.userId
title
...
User Table
userId PK
name
...
ProjectShare Table
sharedById FK User.userId
sharedWithId FK User.userId
state
...
With the above tables I could query all projects that a user has access to. I could then query for all the data related to each project. Each project will have many related tables. The hierarchical nature of the data seems well suited for a document database.
How would I best structure something like this in a document database like MongoDB, CouchDB or DocumentDB?
There are indeed multiple approaches to model this data in DocumentDB.
Collections in DocumentDB can host a heterogeneous set of documents and can be partitioned for massive scale.
Depending on the query requirements, data could be denormalized in many directions - either by pivoting on project (and keeping all users associated including owners, shared by and sharedWith details) or by pivoting on users (and keeping all the projects they own, the details of the projects including information of other users who shared this project etc).
One can also control the level of denormalization by storing only a soft reference and keeping the referred information as a separate document. For instance, if we pivot by project, we could store all of the user information repeatedly in each project document, or just store the userId alone (in which case the user information is stored in a separate document). How much referred data to store depends on your query and logical integrity constraints.
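Here is a small sketch of the "pivot by project with soft references" variant, written as plain Python dictionaries rather than against any database API. The field names follow the SQL schema in the question. The ids, the "type" field, the state value, and the projects_for helper are assumptions made for illustration.

```python
# User documents are stored separately; projects only hold their ids.
users = {
    "u1": {"id": "u1", "type": "user", "name": "Ann"},
    "u2": {"id": "u2", "type": "user", "name": "Bob"},
}

# One project document with the share records embedded in it.
projects = [
    {
        "id": "p1",
        "type": "project",
        "title": "Roadmap",
        "ownerId": "u1",  # soft reference to a user document
        "shares": [
            {"sharedById": "u1", "sharedWithId": "u2", "state": "accepted"},
        ],
    },
]


def projects_for(user_id):
    """Projects a user owns or that have been shared with that user."""
    return [
        p for p in projects
        if p["ownerId"] == user_id
        or any(s["sharedWithId"] == user_id for s in p["shares"])
    ]


def owner_name(project):
    """Resolve the soft reference with a second lookup."""
    return users[project["ownerId"]]["name"]
```

Embedding a copy of the owner's name in each project would remove the second lookup at the cost of updating every project when the name changes.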

How to model mongodb for custom user data

I'm developing a CMS using MongoDB and am trying to get some modelling advice. It's multi-tenant and each tenant can create their own schema and choose which custom fields they want searchable/indexed. The only thing I'm waffling on is how to model my collections. It seems to me like it would be ideal for each tenant to have their own collection due to indexing, but I am not very experienced with MongoDB and would love to hear whether that's even a valid statement.
I'm thinking about separating each tenant's schema definitions from their data - perhaps a customSchema and customData collection for each tenant. Maybe something like customSchema_5543e1191a85d8946f0ee6fc and customData_5543e1191a85d8946f0ee6fc? The major question here is how many collections are feasible in MongoDB. I'm not clear whether there's a cap with the new WiredTiger or not. If not, would such a large number of collections have any downsides?
Or is it better to have just two collections with all tenants' data in them, along with all of their individual indexes? What are the pros and cons of this approach?
Any thoughts or suggestions are welcome, particularly if anyone has had experience doing something like this before.
Update:
My use case is a CMS where tenants can specify their own data, like in Sharepoint or Expression Engine, or most other content APIs, like Contentful or CloudCMS. A user can say, "I want to store Products, and each product has a Name, Description, Quantity, and a price". Another user could say, "I want to store bands, and each band has a Name, a HomeCity, and a whatever." The users would then want to retrieve and display that data on their pages however they like. It's a basic CMS scenario where tenants can create their own schema, then create, edit, and retrieve entries of those schemas. Tenants would need to be able to denote which fields they can search on, so this highly customizable indexing per tenant is the primary area of focus and concern in the modelling strategy.
I'm waffling between two big collections to store schemas and data, shared by all tenants, and a pair of those collections for every tenant. I just don't know the pros and cons of each of those solutions in MongoDB. I'm also open to any ideas I haven't thought of yet :)

in mongodb, how to design a bus station schema in this situation

I have a schema to build for a bus station application that stores the distance and other info between two nearby bus stations. A single station id is unlikely to work as the index or unique key. I think a better way is to use station 1 and station 2 together as the unique index key, but I'm not confident that this is the right way to do it. Should I put the two bus station ids into an array and make that array the unique index key?
That sounds very reasonable... It's a "relationship table" or a many-to-many join table with additional attributes. You would store the distance as an attribute of the M:N relationship, and the two bus station ids would form the composite primary key.
Look at the image in section 2.1.5 of this modeling guide
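As a quick illustration of the composite key idea, here is a sketch in plain Python. It assumes that the distance is the same in both directions, so the pair (S1, S2) and the pair (S2, S1) should map to the same record. The pair_key helper, the station ids, and the distance_km field are hypothetical names.

```python
def pair_key(station_a, station_b):
    """Order-independent composite key for a pair of station ids."""
    return tuple(sorted((station_a, station_b)))


# Stand-in for the relationship collection, keyed by the composite key.
distances = {}


def set_distance(station_a, station_b, km):
    distances[pair_key(station_a, station_b)] = {"distance_km": km}


def get_distance(station_a, station_b):
    return distances[pair_key(station_a, station_b)]["distance_km"]


set_distance("S2", "S1", 1.4)
set_distance("S1", "S2", 1.5)  # same pair, so this overwrites the first record
```

Sorting the two ids before building the key is one way to make sure the same pair is never stored twice. If the distance can differ by direction, the sort would be dropped and the key would simply be (from, to).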
You may want to learn a little bit about database design techniques. If so, some useful sources on databases and modeling are:
Fundamentals of Database Systems by Elmasri and Navathe - Very technical about all aspects of databases and covers modeling in depth
An Introduction to Database Systems by C. J. Date - similar to the above
A Tutorial on Normal Form (BCNF) - BCNF prescribes a means of refining your data model by iteratively applying a rule to it until it meets normal form (that is, it is efficient, barring intended redundancy).
The Wikipedia entry on BCNF - covers the same ground as the tutorial above but more concisely (perhaps focus on sections 3 and 4)
EDIT: Update relevant to Mongo DB
Actually, the above is all pretty general for database modeling. Having read some of the excellent resources on Data Modeling Considerations for MongoDB Applications, I think you need more specific guidance.
As such, I refer you to this informative SO post: how-to-organise-a-many-to-many-relationship-in-mongodb. The author gives a good explanation that sounds like what you're after. There are even references to docs and a video.

MongoDB Schema Design ordering service

I have the following objects: Company, User and Order (which contains orderlines). Users place orders with 1 or more orderlines, and these relate to a Company. The time period during which orders can be placed for this Company is only a week.
What I'm not sure about is where to place the orders array. Should it be a collection of its own containing a link to the User and a link to the Company, should it sit under the Company, or should the orders sit under the User?
Numbers-wise, I need to plan for 50k+ orders.
Query-wise, I'll mainly be looking at Orders by Company, but I would also need to find Orders by Company for a specific User.
1) For folks coming from the SQL world (such as myself), one of the hardest things to learn about MongoDB is the new style of schema design. In the SQL world, everything goes into third normal form. Folks come to think that there is a single right way to design their schema, because there typically is one.
In the MongoDB world, there is no one best schema design. More accurately, in MongoDB schema design depends on how the application is going to access the data.
2) Here are the key questions that you need to have answered in order to design a good schema for MongoDB:
How much data do you have?
What are your most common operations? Will you be mostly inserting new data, updating existing data, or doing queries?
What are your most common queries?
How many I/O operations do you expect per second?
What you're talking about here is modeling Many-to-One relationships:
Company -> User
User -> Order
Order -> Order Lines
Company -> Order
Using SQL you would create a pair of master/detail tables with a primary key/foreign key relationship. In MongoDB, you have a number of choices: you can embed the data, you can create a linked relationship, you can duplicate and denormalize the data, or you can use a hybrid approach.
The correct approach would depend on a lot of details about the use case of your application, many of which you haven't provided.
3) This is my best guess - and it's only a guess - as to a good schema for you.
a) Have separate collections for Users, Companies, and Orders.
If you're looking at 50k+ orders, there are too many to embed in a single document. Having them as a separate collection will allow you to reference them from both the Company and the User documents.
b) Have an array of references to the Order documents in both the Company and the User documents. This makes the query "Find all Orders for this Company" a single-document query.
c) If your query pattern supports it, you might also have a duplicate link from Orders back to the owning Company and/or User.
d) Assuming that the order lines are unique to the individual Order, you would embed the Order Lines in an array within the Order documents.
e) If your order lines refer back to individual Products, you might want to have a separate Product collection, and include a reference to the Product document in the order line sub-document.
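To show the shape of such documents, here is a sketch in plain Python. It illustrates points a), c), d) and e) only: it uses the back-references from Orders to Company and User described in c) rather than the arrays of references from b). All field names, ids, and the find_orders helper are assumptions made for illustration, and no MongoDB driver is involved.

```python
# Separate "collections", modeled here as plain Python structures.
companies = {"c1": {"_id": "c1", "name": "Acme"}}
users = {
    "u1": {"_id": "u1", "name": "Ann"},
    "u2": {"_id": "u2", "name": "Bob"},
}
products = {"p1": {"_id": "p1", "name": "Widget"}}

# Each order links back to its company and user (point c) and embeds
# its order lines (point d); each line references a product (point e).
orders = [
    {"_id": "o1", "company_id": "c1", "user_id": "u1",
     "lines": [{"product_id": "p1", "qty": 2}]},
    {"_id": "o2", "company_id": "c1", "user_id": "u2",
     "lines": [{"product_id": "p1", "qty": 1}]},
]


def find_orders(company_id, user_id=None):
    """Orders for a company, optionally narrowed to a single user."""
    return [
        o for o in orders
        if o["company_id"] == company_id
        and (user_id is None or o["user_id"] == user_id)
    ]
```

Both queries from the question go through the same two fields, so in MongoDB itself a single compound index on the company and user references would probably be enough to support them.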
4) Here are some good general references on MongoDB schema design.
MongoDB presentations:
http://www.10gen.com/presentations/mongosf2011/schemabasics
http://www.10gen.com/presentations/mongosv-2011/schema-design-by-example
http://www.10gen.com/presentations/mongosf2011/schemascale
Here are a couple of books about MongoDB schema design that I think you would find useful:
http://www.manning.com/banker/ (MongoDB in Action)
http://shop.oreilly.com/product/0636920018391.do
Here are some sample schema designs:
http://docs.mongodb.org/manual/use-cases/
Note that the "MongoDB in Action" book includes a sample schema for an e-commerce application, which is very similar to what you're trying to build -- I recommend you check it out.