How to design MongoDB collection relation correctly

How to design MongoDB collection relation correctly - mongodb

I am trying to create a simple DB of Customers/Products/Orders to practice my skills with aggregate and so on. I need some help on how to design the relation between those collections correctly (that i'll really be able to store the details in the right way).
I go for the minimal build and information-->
Customer:
(1) customer_id ,
(2)orders (this will be an array with all the order_id's this customer did)
Product:
(1) product_id
Order:
(1) order_id ,
(2) products (this will be an array with all the products in this order) + I wonder how I add quantity for each product ,
(3) user_id (this will store the user_id did this order) or i don't even need this field??
I hope someone could tell me if this is a right thinking, because I come from SQL and this what I would do there I guess
// I'm not using embedded document because in case i'll want to change a specific field value it would change in all places

Related

How to avoid customer's order history being changed in MongoDB?

I have two collections
Customers
Products
I have a field called "orders" in each of my customer document and what this "orders" field does is that it stores a reference to the product Id which was ordered by a customer, now my question is since I'm referencing product Id and if I update the "title" of that product then it will also update in the customer's order history since I can't embed each order information since a customer may order thousands of products and it can hit 16mb mark in no time so what's the fix for this. Thanks.

Create an Orders Collection
Store ID of the user who made the order
Store ID of the product bought

I understand you are looking up the value of the product from the customer entity. You will always get the latest price if you are not storing the order/price historical transactions. Because your data model is designed this way to retrieve the latest price information.
My suggestion.
Orders place with product and price always need to be stored in history entity or like order lines and not allow any process to change it so that when you look up products that customers brought you can always get the historical price and price change of the product should not affect the previous order. Two options.
Store the order history in the current collection customers (or top say 50 order lines if you don't need all of history(write additional logic to handle this)
if "option1" is not feasible due to large no. of orders think of creating an order lines transaction table and refer order line for the product brought via DBref or lookup command.
Note: it would have helped if you have given no. of transactions in each collection currently and its expected rate of growth of documents in the collection QoQ.

You have orders and products. Orders are referencing products. Your problem is that the products get updated and now your orders reference the new product. The easiest way to combat this issue is to store full data in each order. Store all the key product-related information.
The advantage is that this kind of solution is extremely easy to visualize and implement. The disadvantage is that you have a lot of repetitive data since most of your products probably don't get updated.
If you store a product update history based on timestamps, then you could solve your problem. Products are identified now by 3 fields. The product ID, active start date and active end date. Or you could configure products in this way: product ID = product ID + "Version X" and store this version against each order.
If you use dates, then you will query for the product and find the product version that was active during the time period that the order occurred. If you use versions against the product, then you will simply query the database for the particular version of the product itself. I haven't used mongoDb so I'm not sure how you would achieve this in mongoDb exactly. Naively however, you can modify the product ID to include the version as well using # as a delimiter possibly.
The advantage of this solution is that you don't store too much of extra data. Considering that products won't be updated too often, I feel like this is the ideal solution to your problem

Filter and display database audit / changelog (activity stream)

I'm developing an application with SQLAlchemy and PostgreSQL. Users of the system modify data in 8 or so tables. Consider this contrived example schema:
I want to add visible logging to the system to record what has changed, but not necessarily how it has changed. For example: "User A modified product Foo", "User A added user B" or "User C purchased product Bar". So basically I want to store:
Who made the change
A message describing the change
Enough information to reference the object that changed, e.g. the product_id and customer_id when an order is placed, so the user can click through to that entity
I want to show each user a list of recent and relevant changes when they log in to the application (a bit like the main timeline in Facebook etc). And I want to store subscriptions, so that users can subscribe to changes, e.g. "tell me when product X is modified", or "tell me when any products in store S are modified".
I have seen the audit trigger recipe, but I'm not sure it's what I want. That audit trigger might do a good job of recording changes, but how can I quickly filter it to show recent, relevant changes to the user? Options that I'm considering:
Have one column per ID type in the log and subscription tables, with an index on each column
Use full text search, combining the ID types as a tsvector
Use an hstore or json column for the IDs, and index the contents somehow
Store references as URIs (strings) without an index, and walk over the logs in reverse date order, using application logic to filter by URI
Any insights appreciated :)
Edit It seems what I'm talking about it an activity stream. The suggestion in this answer to filter by time first is sounding pretty good.

Since the objects all use uuid for the id field, I think I'll create the activity table like this:
Have a generic reference to the target object, with a uuid column with no foreign key, and an enum column specifying the type of object it refers to.
Have an array column that stores generic uuids (maybe as text[]) of the target object and its parents (e.g. parent categories, store and organisation), and search the array for marching subscriptions. That way a subscription for a parent category can match a child in one step (denormalised).
Put a btree index on the date column, and (maybe) a GIN index on the array UUID column.
I'll probably filter by time first to reduce the amount of searching required. Later, if needed, I'll look at using GIN to index the array column (this partially answers my question "Is there a trick for indexing an hstore in a flexible way?")
Update this is working well. The SQL to fetch a timeline looks something like this:
SELECT *
FROM (
SELECT DISTINCT ON (activity.created, activity.id)
*
FROM activity
LEFT OUTER JOIN unnest(activity.object_ref) WITH ORDINALITY AS act_ref
ON true
LEFT OUTER JOIN subscription
ON subscription.object_id = act_ref.act_ref
WHERE activity.created BETWEEN :lower_date AND :upper_date
AND subscription.user_id = :user_id
ORDER BY activity.created DESC,
activity.id,
act_ref.ordinality DESC
) AS sub
WHERE sub.subscribed = true;
Joining with unnest(...) WITH ORDINALITY, ordering by ordinality, and selecting distinct on the activity ID filters out activities that have been unsubscribed from at a deeper level. If you don't need to do that, then you could avoid the unnest and just use the array containment #> operator, and no subquery:
SELECT *
FROM activity
JOIN subscription ON activity.object_ref #> subscription.object_id
WHERE subscription.user_id = :user_id
AND activity.created BETWEEN :lower_date AND :upper_date
ORDER BY activity.created DESC;
You could also join with the other object tables to get the object titles - but instead, I decided to add a title column to the activity table. This is denormalised, but it doesn't require a complex join with many tables, and it tolerates objects being deleted (which might be the action that triggered the activity logging).

TSQL - Deleting with Inner Joins and multiple conditions

My question is a variation on one already asked and answered (TSQL Delete Using Inner Joins) but I have a different level of complexity and I couldn't see a solution to it.
My requirement is to delete Special Prices which haven't been accessed in 90 days. Special Prices are keyed on Customer ID and Product ID and the products have to matched to a Customer Order Detail table which also contains a Customer ID and a Product ID. I want to write one function that will look at the Special Price table for each Customer, compare each Product for that Customer with the Customer Order Detail table and if the Maximum Order Date is more than 90 days earlier than today, delete it from the Special Price table.
I know I can use a CURSOR (slow but effective) but would prefer to have a single query like the one in the TSQL Delete Using Inner Joins example. Any ideas and/or is more information required?

I cannot dig more on the situation of your system but i think and if it is ok for you, check MERGE STATEMENT, it might be a help instead of using cursors. check this Link MERGE STATEMENT

Need a good database design for this situation

I am making an application for a restaurant.
For some food items, there are some add-ons available - e.g. Toppings for Pizza.
My current design for Order Table-
FoodId || AddOnId
If a customer opts for multiple addons for a single food item (say Topping and Cheese Dip for a Pizza), how am I gonna manage?
Solutions I thought of -
Ids separated by commas in AddOnId column (Bad idea i guess)
Saving Combinations of all addon as a different addon in Addon Master Table.
Making another Trans table for only Addon for ordered food item.
Please suggest.
PS - I searched a lot for a similar question but cudnt find one.

Your relationship works like this:
(1 Order) has (1 or more Food Items) which have (0 or more toppings).
The most detailed structure for this will be 3 tables (in addition to Food Item and Topping):
Order
Order to Food Item
Order to Food Item to Topping
Now, for some additional details. Let's start flushing out the tables with some fields...
Order
OrderId
Cashier
Server
OrderTime
Order to Food Item
OrderToFoodItemId
OrderId
FoodItemId
Size
BaseCost
Order to Food Item to Topping
OrderToFoodItemId
ToppingId
LeftRightOrWhole
Notice how much information you can now store about an order that is not dependent on anything except that particular order?
While it may appear to be more work to maintain more tables, the truth is that it structures your data, allowing you many added advantages... not the least of which is being able to more easily compose sophisticated reports.

You want to model two many-to-many realtionships by the sound of it.
i.e. Many products (food items) can belong to many orders, and many addons can belong to many products:
Orders
Id
Products
Id
OrderLines
Id
OrderId
ProductId
Addons
Id
ProductAddons
Id
ProductId
AddonId
Option 1 is certainly a bad idea as it breaks even first normal form.

why dont you go for many-to-many relationship.
situation: one food can have many toppings, and one toppings can be in many food.
you have a food table and a toppings table and another FoodToppings bridge table.
this is just a brief idea. expand the database with your requirement

You're right, first one is a bad idea, because it is not compliant with normal form of tables and it would be hard to maintain it (e.g. if you remove some addon you would need to parse strings to remove ids from each row - really slow).
Having table you have already there is nothing wrong, but the primary key of that table will be (foodId, addonId) and not foodId itself.
Alternatively you can add another "id" not to use compound primary key.

Access 2010 - Having multiple products to one Quote ID

I have created an adaptation of the 'Goods' database that includes a quote feature. The user selects the customer (customer table), Product (product table), qty, discount ect.
The chosen entities then get saved to the quotes table and there is a 'print' function on form.
Whilst the information can be saved and the quote prints via a quote report, I'm having major difficulty in finding a way to add multiple products to a single quote.
The main objective is to be able to select various products and add their total price (product after addition of qty, discount) to a SUB TOTAL
Quote total is therefore the formula Tax+Shipping+SubTotal
any takers? :)
Hi guys,
Thanks for the response I really appreciate it. As for tax and shipping, they are just added in the form and are not pushed from anywhere else in the database. Its simply a type in form and display on report sort of thing. As you said in the answer, HansUp, the salesperson will compute it seperately and just input it.
As for tax, products will be shiped globally so the tax/vat shall be computed seperately also.
Also, each table DOES have its own unique ID.
More to the point of having QuoteProducts. I can't seem to get my head around it! Are you saying that whatever products are chosen in QuoteProducts will create a QuoteProd_ID and then that ID's total price will therefore be added to the Quote?
I tried making a subform before but through the 'multiple records' form but obviously every selection made its own ID. Is there any way you could elaborate on the Quote products part and how it allows multiple records to store to one ID? Without understanding it i'm pretty much useless.
In addition, how the multiple records are then added up to make the subtotal also baffles me. Is that done in the Quote form?
Edit 2
HALLELUJAH.
It works! I created a sum in a textbox on the footer of the subform and then pushed that into subtotal :)
I do have one slight issue:
I made a lookup&relationship for the ListPrice. I don't think its the correct way to do it as it comes up with the price of every light (i.e 10 products priced £10, £10 shows up ten times in dropdown).
Can you guys help?
List Price Problem
here's what i've tried:
1) Create >Client>Query Design
2) Show Products, QuoteDetails. For some reason, it automtically comes up with ListPrice, ProductID (as it should) and Product Name linked to ID in Products
3) Delete links with ListPrice and ProductName.
4) Show all in quoteDetails (*)
5) Create Multiple Items form
Doesnt work! What am I doing wrong?
I'm extremely grateful for both your help. If I can do anything, just shout.
Ryan

In addition to HansUp's stellar answer, you might be interested in DatabaseAnswers.org. They have a number of data models for free that might provide additional insight to your situation and possibly serve as inspiration for future projects you may encounter.
Edit 1
Forget about the form and report for a moment - they are important but not as important as the data and how you store the data.
In your current design, you have a quotes table presumably with an autonumber key field. For the purposes of this answer, this field is named Quote_ID.
The quotes table, as HansUp suggested, should store information such as the Customer_ID, Employee_ID, OrderDate and perhaps even a reference to a BillingAddress and ShippingAddress.
The quotes table SHOULD NOT store anything about the products that the customer has ordered as part of this quote.
Instead, this information should be stored in a table called QuoteProducts or QuoteDetails.
It's structure would look something like the following:
QuoteDetails_ID --> Primary Key of the table (autonumber field)
Quote_ID --> Foreign key back to the Quotes table
Product_ID --> Foreign key back to Products table
Quantity
UnitPrice
You may also want to consider a field for tax and a separate field for shipping per line item on the quote. You will inevitably run into situations where certain items are taxable in some locations and not others, etc.
This design allows a particular quote to have any number of products assigned to the quote.
Returning to your form \ reports, you would need to change your existing forms and reports to accomodate this new table design. Typically one would use a main form for the quote itself, and then a subform for the quote details (item, price, quantity, etc).
To get the quote total, you would sum the items in QUoteDetails for a particular Quote_ID.
You may also want to check out the Northwind sample database from Microsoft. From what I recall Northwind had a sample Orders system that might help make these ideas more concrete for you by seeing a working example.

For the first 3 tables mentioned in your comment, each should have a primary key: Customers, customer_id; products, product_id; and employees, employee_id.
The quotes table will have its own primary key, quote_id, and will store customer_id and employee_id as foreign keys. (I'm assuming you want employee_id to record which customer representative/salesperson created the quote.) You may also decide to include additional attributes for each quote; date and time quote prepared, for example.
The products offered for quotes will be stored in a junction table, QuoteProducts. It will have foreign keys for quote_id and product_id, with one row for each product offered in the quote. This is also where you can store the attributes quantity and discount. An additional field, unit_price, can allow you to store the product price which was effective at the time the quote was prepared ... which would be useful in case product prices change over time. I don't know whether tax should be included in this table (see below).
I also don't know how to address shipping. If all the products associated with a quote are intended to be delivered in one shipment, shipping cost could be an attribute of the quotes table. I don't know how you intend to derive that value. Seems like it might be determined by shipping method, distance, and weight. If you have the salesperson compute that value separately, and then input the value, consider how to handle the case where the product selection changes after the shipping fee has been entered.
That design is somewhat simplistic, but might be sufficient for the situation you described. However, it could get more complex. For example, if you decide to maintain a history of product price changes, you would be better off to build in provisions for that now. Also, I have no idea how tax applies in your situation --- whether it's a single rate applied to all products, varies by customer location, varies by type of customer, and/or varies by product. Your business rules for taxes will need to be accommodated in the schema design.
However, if that design works for you (test it by entering dummy data into the tables without using a form), you could create a form based on quotes with a subform based on QuoteProducts. With quote_id as the link master/child property, the subform will allow you to view all products associated with the main form's current quote_id. You can use the subform to add, remove, and/or edit products associated with that quote.
Not much I can say about the report. There is a lot of uncertainty in the preceding description. However, if your data base design allows you to build a workable form/subform, it should also support a query which gathers the same data. Use that query as the record source for the report. And use the report's sorting and grouping features to create the quote grand total.
Edit: With the main form/ subform approach, each new row in the subform should "inherit" the quote_id value of the current record in the main form. You ensure that happens by setting the link master/child properties to quote_id. Crystal Long explains that in more detail in chapter 5 of Access Basics by Crystal: PDF file. Scroll down to the heading Creating a Main Form and Subform on page 24.
Edit2: Your strategy may include storing Products.ListPrice in QuoteDetails.ListPrice. That would be useful to record the current ListPrice offered for a quote. If so, you can fetch ListPrice from Products and store it in QuoteDetails when you select the ProductID for a row in the subform. You can do that with VBA code in the after update event of the control which is bound to the ProductID field. So if that control is a combo box named cboProductID and the subform control bound to the QuoteDetails ListPrice field is a text box named txtListPrice, use code like this for cboProductID after update:
Me.txtListPrice = DLookup("ListPrice", "Products", "ProductID = " _
& Me.cboProductID)
That suggestion assumes the Products and QuoteDetails tables both include a ProductID field and its data type is numeric. And cboProductID has ProductID as its bound field and uses a query as its RowSource similar to this:
SELECT ProductID, ProductName
FROM Products
ORDER BY ProductName;

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse