MongoDB schema considering sub-documents to avoid multiple reads

I am trying to come up with a MongoDB document model and would like others' opinions. I want to have a document that represents an Employee. This collection will contain all attributes of an employee (i.e. firstName, lastName). Where I am stuck, coming from the relational realm, is the need to store a list of employees that an employee can access. In other words, let's say Employee A is a manager. I need to store the direct reports that he manages, in order to use this in various applications. In a relational database I would have a mapping table that ties an employee to many employees. Since Mongo cannot join documents, do you think I should use an embedded sub-document to store the list of accessible employees as part of the Employee document? Any other ideas?

Unless you're using employee groups (Accounting, HR, etc.), you'll probably be fine adding the employee name, Mongo ObjectId, and any other information unique to that manager/employee relationship as a sub-document in the manager's document.
With that in place, you could probably do your reporting on these relationships through a simple aggregation.
This is all IMHO, and raises the question: is "simple aggregation" another oxymoron like "military intelligence"?
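
A minimal sketch of the embedded approach described in this answer, using plain Python dicts in place of real MongoDB documents. The field names (`directReports`, `employeeId`, `since`), the sample values, and the helper `direct_report_ids` are all hypothetical and only illustrate the shape of the data.

```python
# Hypothetical manager document with embedded direct reports.
# Plain dicts stand in for MongoDB documents; no driver is involved.
manager = {
    "_id": "emp-A",
    "firstName": "Alice",
    "lastName": "Smith",
    "directReports": [
        # Each sub-document holds the report's id, a copy of the name,
        # and data specific to this manager/employee relationship.
        {"employeeId": "emp-B", "name": "Bob Jones", "since": "2012-01-15"},
        {"employeeId": "emp-C", "name": "Carol Lee", "since": "2013-06-01"},
    ],
}


def direct_report_ids(employee):
    """Return the ids of the employees this employee may access."""
    return [r["employeeId"] for r in employee.get("directReports", [])]


print(direct_report_ids(manager))
```

One read of the manager document is enough to know which employees are accessible. The copied names go stale if an employee is renamed, so they would need to be updated alongside the employee's own document.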

Related

updating and modeling nosql record

So in a traditional database I might have 2 tables, users and company:

users
id | username | companyid | email
1  | j23      | 1         | something#gmail.com
2  | fj222    | 1         | james#aol.com

company
id | ownerid | company_name
1  | 1       | A Really boring company

This is to say that users 1 and 2 are a part of company 1 (A Really boring company), and user 1 is the owner of this company.
I could easily issue an UPDATE statement in MySQL or PostgreSQL to update the company name.
But how could I model the same data from a NoSQL perspective, in something like DynamoDB or MongoDB?
Would each user record (a document in NoSQL) contain the same company table data (id, ownerid or an is-owner true/false flag, and company name)? If so, I'm unclear how to update the record for all users containing this data when the company name needs to be updated.
If you want to save the company object as JSON in each user record (for performance reasons), then indeed you have to update a lot of rows.
But the best way to achieve this is to keep a structure similar to the one you have above in MySQL. A NoSQL schema depends a lot on the queries you will be making.
For example, the schema above is great for:
- Find a particular user by username, along with his company name. First you query User by username (you can add an index), get the companyId, and then do another query on Company to fetch the name.
Let's assume the company name changes often. In this case the company name update is easy. The read needs 2 queries to get the result (but they should execute fast).
An embedded company JSON would work better for:
- Find all users from a specific city and show their company name.
Let's assume the company name changes very rarely. In this case we can't use the "relational" approach, because we would do 1 query to fetch users by city and then another query for every user found to fetch the company name.
Using the embedded approach, we need only 1 query.
To update a company name, a full (expensive) scan is needed, but that should be OK if done rarely.
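
The trade-off between the two layouts can be sketched with in-memory Python data. This is only an illustration under assumptions: the `city` field and its value are made up (the question's tables have no city column), and lists and dicts stand in for real collections.

```python
# Company data, stored once in the referenced model.
companies = {1: {"id": 1, "ownerid": 1, "company_name": "A Really boring company"}}

# Referenced model: each user stores only companyid.
users_ref = [
    {"id": 1, "username": "j23", "companyid": 1, "city": "Springfield"},
    {"id": 2, "username": "fj222", "companyid": 1, "city": "Springfield"},
]

# Embedded model: each user carries its own copy of the company name.
users_emb = [
    {**u, "company": {"id": 1, "company_name": "A Really boring company"}}
    for u in users_ref
]


def rename_referenced(company_id, new_name):
    """One write, no matter how many users belong to the company."""
    companies[company_id]["company_name"] = new_name
    return 1


def rename_embedded(company_id, new_name):
    """Scan all users; one write per user that embeds the company."""
    writes = 0
    for user in users_emb:
        if user["company"]["id"] == company_id:
            user["company"]["company_name"] = new_name
            writes += 1
    return writes


def names_by_city_referenced(city):
    """Needs an extra company lookup for every matching user."""
    return [
        (u["username"], companies[u["companyid"]]["company_name"])
        for u in users_ref
        if u["city"] == city
    ]


def names_by_city_embedded(city):
    """Single pass over users; the company name is already in place."""
    return [
        (u["username"], u["company"]["company_name"])
        for u in users_emb
        if u["city"] == city
    ]
```

The referenced model pays on reads (one lookup per user), while the embedded model pays on writes (one update per user).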
What if the company name changes often and I want to get users by city?
This becomes tricky. NoSQL is not a replacement for SQL; it has its shortcomings. The solution may be a platform-dependent feature (from MongoDB, DynamoDB, Firestore, etc.), an additional layer on top (Elasticsearch), or no solution at all (consider not using a key-value NoSQL store).
Depending on the programming language used to handle NoSQL objects/documents, you have a variety of ORM libraries to model your schema. E.g. for MongoDB plus JS/TypeScript I recommend Mongoose and its subdocuments. Here is more about it:
https://mongoosejs.com/docs/subdocs.html

Complex and multiple connected database relations in MongoDB

I am currently trying to model a MongoDB database structure where the entities are very complex in relation to each other.
With my current collections, MongoDB queries are difficult or impossible to put into a single aggregation. Incidentally, I'm not a database specialist and have been working with MongoDB for only about half a year.
To keep it as simple as possible but as detailed as necessary, this is my challenge:
I have newspaper articles that contain simple keywords, works (oeuvres, books, movies), persons, and linked combinations of works and persons. In addition, the same people appear under different names in different articles.
Later, on the person view, I want to show the following:
- the links of the person with name and work, and the respective articles
- the articles in which the person appears without a work (by name)
- the other keywords that are still in the article
In my structure I want to avoid that entities such as people occur multiple times. So these are my current collections:
Article
  - id
  - title
  - keywordRelations
KeywordRelation
  - id
  - type (single or combination)
  - simpleKeywordId (optional)
  - personNameConnectionIds (optional)
  - workIds (optional)
SimpleKeyword
  - id
  - value
PersonNameConnection
  - id
  - personId
  - nameInArticleId
Person
  - id
  - firstname
  - lastname
NameInArticle
  - id
  - name
  - type (e.g. abbreviation, synonym)
Work
  - id
  - title
To meet the requirements, I would always have to create queries that range over 3 to 4 tables. Is that possible and useful with MongoDB?
Or is there an easier way and structure to achieve that?

MongoDb conditional relationship

Suppose I have following 4 collections:
1- posts
2- companies
3- groups
4- users
Their relations are:
A company has an owner and many other members (user collection).
A group has many members (users).
A user has many posts.
A group has many posts that are published by one of its members.
A company has many posts that are published by its owner or members.
Now I have a problem storing the relation of users, companies, and groups with the posts collection.
Below is my current structure:
I have decided to have a postable field inside my post document. It has a type field that will be 'user', 'group', or 'company', and two other fields, name and id, which hold the company/group name and id when the post belongs to a company or group rather than a user (that is, type="group" || type="company").
Now how can I handle this so that id maps as an FK to both the group and company collections (one field acting as FK to two collections)?
Is it the right structure?
What you have here is a polymorphic association. In relational databases, it is commonly implemented with two fields, postable_id and postable_type. The type column defines which table to query and the id column determines the record.
You can do the same in MongoDB (in fact, that is what you came up with, minus the naming convention). But MongoDB has a special field type precisely for this type of situation: DBRef. Basically, it's an upgraded id field. It carries not only the id, but also the collection name (and, optionally, the database name).
"How can I handle this to map id as FK of group and company collection (one field FK of two collections)?"
Considering that MongoDB doesn't have joins and you have to load all references manually, I don't see how this is any different from a regular FK field. The collection name is just stored in the type field now, instead of being hardcoded.
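
Loading such a reference manually can be sketched as follows. This is an in-memory illustration, not driver code: the `collections` dict, its sample contents, the `TYPE_TO_COLLECTION` mapping, and `resolve_postable` are hypothetical names, and only the `postable` field with `type`, `id`, and `name` comes from the question.

```python
# In-memory stand-ins for the collections; the contents are made up.
collections = {
    "users": {"u1": {"name": "alice"}},
    "groups": {"g1": {"name": "Hikers"}},
    "companies": {"c1": {"name": "Acme"}},
}

# Maps the post's postable.type to the collection it refers to.
TYPE_TO_COLLECTION = {"user": "users", "group": "groups", "company": "companies"}


def resolve_postable(post):
    """Load the document that a post's polymorphic reference points at."""
    ref = post["postable"]
    collection = collections[TYPE_TO_COLLECTION[ref["type"]]]
    # Returns None when the referenced document does not exist.
    return collection.get(ref["id"])


post = {"title": "Hello", "postable": {"type": "group", "id": "g1", "name": "Hikers"}}
print(resolve_postable(post))
```

The type field picks the collection and the id picks the document, which is the same two-step lookup a DBRef would encode.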

Mongo Schema Design

I'm pretty new to Mongo. I just started a project using MongoDB as the database.
I'm not sure how I should design the following use case for a document-based database.
Use case
1. A vendor/distributor has a list of products on our system.
2. There's a standard price list of each product for all customers.
3. A vendor/distributor also has a customized price list for each product for each customer.
e.g. CustA gets productA at a different price from the standard, and that price is only available to him.
4. Some of the products are only available through a customized price, and I mark those products with the attribute public = false.
How should I work this out in a document-based database?
The current design I have is:
1. [Product Document] with an embedded document of the standard price list.
2. [Product_Price Document] with a oneToMany link to [Product Document] and a oneToMany link to [Customer Document].
3. [Customer Document].
With this model, I'm facing a problem with paged queries.
For example, I query the first 30 products sorted by name. Then I query [Product_Price Document] with the 30 matching ProductIds, so that I have the customized prices for the customer who is logged in.
The problem is that I can't query items that are customized for the user and not available to everyone.
Is there a better way to design the schema, or what should I do with the query?
I'm using PHP, Doctrine2, Symfony2.
When you query the Product_Price document, query it using both the ProductID and the current CustomerID. Or am I missing something?
Here's how I would structure it.
Have two collections:
- Products
- Vendors
Your products collection would have the list of all your products and their standard price. Your vendors collection would have an array of product IDs, each with an override price in case the vendor has a different price for that particular product.
If you are also tracking customers, then you could make that a collection too and give it something like a belongs-to relationship to the vendors.
So in short:
collection.vendor:
{"name": "foo", "products": [{"_id": mongoId, "priceOverride": 15.50}, ...]}
collection.products:
{"name": "bar", "price": 15.40}
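
A minimal sketch of how the override could be resolved in application code, with plain Python dicts in place of the two collections. The product ids, the second product, and the helper `effective_price` are hypothetical. The override field is spelled `priceOverride` in this sketch.

```python
# Standard prices, keyed by a made-up product id.
products = {
    "p1": {"name": "bar", "price": 15.40},
    "p2": {"name": "baz", "price": 9.99},
}

# Vendor document: one entry per product, override price optional.
vendor = {
    "name": "foo",
    "products": [{"_id": "p1", "priceOverride": 15.50}, {"_id": "p2"}],
}


def effective_price(vendor_doc, product_id):
    """Return the vendor's override if present, else the standard price."""
    for entry in vendor_doc["products"]:
        if entry["_id"] == product_id and "priceOverride" in entry:
            return entry["priceOverride"]
    return products[product_id]["price"]


print(effective_price(vendor, "p1"), effective_price(vendor, "p2"))
```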
Excellent resource for reading a bit more into the relationships which you can use:
Learn Mongo Interactively

No-sql relations question

I'm willing to give MongoDB and CouchDB a serious try. So far I've worked a bit with Mongo, but I'm also intrigued by Couch's RESTful approach.
Having worked for years with relational DBs, I still don't get what the best way is to get some things done with non-relational databases.
For example, if I have 1000 car shops and 1000 car types, I want to specify what kind of cars each shop sells. Each car has 100 features. Within a relational database I'd make a middle table to link each car shop with the car types it sells via IDs. What is the NoSQL approach? If every car shop sells 50 car types, it means replicating a huge amount of data if I have to store, within the car shop, all the features of all the car types it sells!
Any help appreciated.
I can only speak to CouchDB.
The best way to stick your data in the db is to not normalize it at all beyond converting it to JSON. If that data is "cars" then stick all the data about every car in the database.
You then use map/reduce to create a normalized index of the data. So, if you want an index of every car, sorted first by shop, then by car-type you would emit each car with an index of [shop, car-type].
Map/reduce seems a little scary at first, but you don't need to understand all the complicated stuff or even B-trees; all you need to understand is how the key sorting works.
http://wiki.apache.org/couchdb/View_collation
With that alone you can create amazing normalized indexes over differing documents with the map reduce system in CouchDB.
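
The idea of emitting a compound key and relying on key order can be imitated in a few lines of Python. This is not CouchDB code: `map_car` only mimics a map function, the sample cars are made up, and Python's tuple ordering merely approximates CouchDB's view collation rules.

```python
# Denormalized car documents; the values are made up for illustration.
cars = [
    {"shop": "shop-b", "car_type": "sedan", "name": "car 1"},
    {"shop": "shop-a", "car_type": "suv", "name": "car 2"},
    {"shop": "shop-a", "car_type": "coupe", "name": "car 3"},
]


def map_car(doc):
    """Mimic a map function: emit one (key, value) row per car."""
    yield (doc["shop"], doc["car_type"]), doc["name"]


# Build the "view": all emitted rows, sorted by their compound key.
view = sorted(row for doc in cars for row in map_car(doc))


def rows_for_shop(shop):
    """Return the values of every row whose key starts with the given shop."""
    return [value for (key_shop, _car_type), value in view if key_shop == shop]


print(rows_for_shop("shop-a"))
```

Because rows are ordered first by shop and then by car type, all cars of one shop sit next to each other in the index.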
In MongoDB, an often-used approach would be to store a list of the _ids of car types in each car shop. So there is no separate join table, but you are still basically doing a client-side join.
Embedded documents become more relevant for cases that aren't many-to-many like this.
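
A minimal sketch of that client-side join, using Python dicts in place of the two collections. The field name `carTypeIds`, the sample documents, and the helper `car_types_for_shop` are hypothetical.

```python
# Car types are stored once, with all their features.
car_types = {
    "t1": {"name": "type one", "features": ["abs", "airbag"]},
    "t2": {"name": "type two", "features": ["abs"]},
    "t3": {"name": "type three", "features": []},
}

# Each shop stores only the ids of the car types it sells.
car_shops = {
    "s1": {"name": "shop one", "carTypeIds": ["t1", "t2"]},
}


def car_types_for_shop(shop_id):
    """Client-side join: read the shop, then fetch each referenced car type."""
    shop = car_shops[shop_id]
    return [car_types[type_id] for type_id in shop["carTypeIds"]]


print([t["name"] for t in car_types_for_shop("s1")])
```

The features are not replicated per shop; a shop that sells 50 car types only stores 50 ids.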
Coming from an HBase/BigTable point of view, typically you would completely denormalize your data and use a "list" field, or a multidimensional map column (see this link for a better description).
  The word "column" is another loaded word like "table" and "base" which carries the emotional baggage of years of RDBMS experience. Instead, I find it easier to think about this like a multidimensional map - a map of maps if you will.
For your example of a many-to-many relationship, you can still create two tables and use your multidimensional map column to hold the relationship between the tables.
See question 20 in the Hadoop/HBase FAQ:
  Q [Michael Dagaev]: How would you design an HBase table for a many-to-many association between two entities, for example Student and Course? I would define two tables:
    Student: student id, student data (name, address, ...), courses (use course ids as column qualifiers here)
    Course: course id, course data (name, syllabus, ...), students (use student ids as column qualifiers here)
  Does it make sense?
  A [Jonathan Gray]: Your design does make sense. As you said, you'd probably have two column-families in each of the Student and Course tables. One for the data, another with a column per student or course. For example, a student row might look like:
    Student: id/row/key = 1001
    data:name = Student Name
    data:address = 123 ABC St
    courses:2001 = (if you need more information about this association, for example, if they are on the waiting list)
    courses:2002 = ...
  This schema gives you fast access to the queries, show all classes for a student (student table, courses family), or all students for a class (courses table, students family).
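
The "map of maps" view of this design can be sketched with nested Python dicts, laid out as row key, then column family, then column qualifier. This only models the data shape and is not HBase client code; the course names and the helpers `enroll`, `courses_for`, and `students_for` are hypothetical.

```python
# row key -> column family -> column qualifier -> value
student_table = {
    "1001": {
        "data": {"name": "Student Name", "address": "123 ABC St"},
        "courses": {"2001": "", "2002": "waiting list"},
    },
}
course_table = {
    "2001": {"data": {"name": "Course A"}, "students": {"1001": ""}},
    "2002": {"data": {"name": "Course B"}, "students": {"1001": "waiting list"}},
}


def enroll(student_id, course_id, note=""):
    """Record the association on both sides, one column per related row."""
    student = student_table.setdefault(student_id, {"data": {}, "courses": {}})
    course = course_table.setdefault(course_id, {"data": {}, "students": {}})
    student["courses"][course_id] = note
    course["students"][student_id] = note


def courses_for(student_id):
    """All classes for a student: read one row of the student table."""
    return sorted(student_table[student_id]["courses"])


def students_for(course_id):
    """All students for a class: read one row of the course table."""
    return sorted(course_table[course_id]["students"])


enroll("1002", "2001")
print(courses_for("1001"), students_for("2001"))
```

Each query touches a single row, at the cost of writing every association twice.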
In a relational database, the concept is very clear: one table for cars with columns like "car_id, car_type, car_name, car_price", and another table for shops with columns "shop_id, car_id, shop_name, sale_count"; the "car_id" column links the two tables together for data operations. All the columns must be well defined when creating the database.
NoSQL database systems do not require you to predefine these columns and tables. You just construct your records in a certain format, say JSON, like:
{"car": {"id": 1, "type": "auto", "name": "ford"}, "shop": {"id": 100, "name": "some_shop"}},
{"car": {"id": 2, "type": "auto", "name": "benz"}, "shop": {"id": 105, "name": "my_shop"}},
.....
After your system is online and in service, you may find there are some flaws in the design of your DB structure, and want to add a field "employee" to "shop" for future records. Then your new records look like:
{"car": {"id": 3, "type": "auto", "name": "RR"}, "shop": {"id": 108, "name": "other_shop", "employee": "Bill"}},
NoSQL systems allow you to do so without any schema change, whereas a relational database would require altering the table definition first.
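
A short sketch of what reading such mixed records could look like in application code, using Python's standard `json` module. The helper `shop_employee` is hypothetical, and the records are written as valid JSON objects.

```python
import json

# Two records with different shapes: only the newer one has "employee".
records = [
    '{"car": {"id": 1, "type": "auto", "name": "ford"},'
    ' "shop": {"id": 100, "name": "some_shop"}}',
    '{"car": {"id": 3, "type": "auto", "name": "RR"},'
    ' "shop": {"id": 108, "name": "other_shop", "employee": "Bill"}}',
]


def shop_employee(raw_record):
    """Return the shop's employee, or None for records written before the field existed."""
    doc = json.loads(raw_record)
    return doc["shop"].get("employee")


print([shop_employee(r) for r in records])
```

The store accepts both shapes, so the burden of handling the missing field moves to the code that reads the records.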