Genealogy Relationship Mapping why base on families - database-schema

I am developing a genealogy application, and am currently at the stage of modelling relationships between individuals.
Based on my research, I have noted that most of the mappings of relationships are based on families (father + mother) and so I would like to understand the underlying reasoning behind this before I adopt it blindly.
Since my project is patriarchal, I assume that as soon as a person adds a father, then that creates a new family

Most Genealogy software vendors decided to follow the model that professional genealogists use. The basis is a family group sheet, that includes the father at the top left, the mother at the top right, and the children below.
The database structure then chosen is to have records of two types: Individuals and Families. These are exemplified by the GEDCOM standard which is used to transfer genealogy data between programs.
Then they use what is called a lineage-linked data structure. This structure has two connections:
The Individual will link to the family in which they are a husband or a wife (a FAMS link) and the Family will link back to the two individuals (a HUSB and a WIFE link).
The Individual will link to the families who are their parents, either blood or adopted (a FAMC), and each Family will link back to their children (CHIL links).
Once you develop your program, make sure it will be able to read and write GEDCOM.
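As a rough illustration of the lineage-linked structure described above, here is a minimal in-memory sketch in Python. The class and method names (Individual, Family, Tree, parents_of) are made up for this example; only the link names (FAMS, FAMC, HUSB, WIFE, CHIL) come from GEDCOM. It does not read or write GEDCOM files.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Individual:
    id: str
    name: str
    fams: List[str] = field(default_factory=list)  # families where this person is a spouse
    famc: List[str] = field(default_factory=list)  # families where this person is a child


@dataclass
class Family:
    id: str
    husb: Optional[str] = None  # a family may have only one known parent
    wife: Optional[str] = None
    chil: List[str] = field(default_factory=list)


class Tree:
    """Hypothetical container that keeps both directions of each link in sync."""

    def __init__(self) -> None:
        self.individuals: Dict[str, Individual] = {}
        self.families: Dict[str, Family] = {}

    def add_individual(self, ind_id: str, name: str) -> Individual:
        person = Individual(ind_id, name)
        self.individuals[ind_id] = person
        return person

    def add_family(self, fam_id: str, husb: Optional[str] = None,
                   wife: Optional[str] = None) -> Family:
        family = Family(fam_id, husb, wife)
        self.families[fam_id] = family
        for spouse_id in (husb, wife):
            if spouse_id is not None:
                self.individuals[spouse_id].fams.append(fam_id)  # FAMS link
        return family

    def add_child(self, fam_id: str, child_id: str) -> None:
        self.families[fam_id].chil.append(child_id)        # CHIL link
        self.individuals[child_id].famc.append(fam_id)     # FAMC link

    def parents_of(self, ind_id: str) -> List[str]:
        parents: List[str] = []
        for fam_id in self.individuals[ind_id].famc:
            family = self.families[fam_id]
            parents.extend(p for p in (family.husb, family.wife) if p is not None)
        return parents


tree = Tree()
tree.add_individual("I1", "John")
tree.add_individual("I2", "Mary")
tree.add_individual("I3", "Anna")
tree.add_family("F1", husb="I1", wife="I2")
tree.add_child("F1", "I3")
print(tree.parents_of("I3"))
```

With this layout, adding only a father still creates a Family record (with the wife left empty), which matches the idea in the question that adding a father creates a new family.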

REST API filtering/search on a parent resource

Suppose I have a store such as Amazon that sells a variety of products such as computers and paintings. They are quite different from each other and have their own set of fields and logic.
In addition to the typical CRUD, I need to design a JSON API that allows me to:
A. Fetch an ungrouped list of paintings and computers. For example: [computer, painting, painting, computer, ...] ordered by date published (so with filtering capability).
B. Fetch only paintings
C. Fetch only computers
The RESTful approach will typically be something like /api/paintings and /api/computers, which works really well for segregated results.
But my main concern is operation A: getting an ungrouped list of paintings and computers sorted by date published. The way I see it, there are three approaches:
1) Create a new standalone resource called products such as /api/products which will have filtering capability and continue to use /api/resource for specific CRUD operations.
2) Create a parent products resource which will be used for filtering operations. So I can do something like /products?order_by=published_date And for more specific resources I can do something like /products/paintings or /products/computers
3) Do not have a resource for paintings or computers. Instead have one for a generic product. I will then have most logic in the api layer and reduce the complexity of the client.
I am leaning towards approach #3 but wanted to get feedback prior to implementing since this will be a core feature of the API.
I've always taken the approach that your API layer should match your object modeling. So, the answer to your question would be: it depends on the source data. Well, the source data after its object modeling.
If you have an object model for Computer and for Painting, they should be resources like you've said. Do they share any data/functions? If so, you should have an object model for that, too, perhaps: Product. Then Computer and Painting extend the Product class.
With that in mind, design the API layer to mirror it. Since Computer and Painting both extend Product, Product as a parent of the Computer and Painting resources makes sense.
In my opinion I would go for approach #3 and query the API with type of product if you search for it.
/products?type=computers&order_by=date
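To make the idea concrete, here is a small Python sketch of a Product parent class with a single listing function that filters by type and orders by publication date. The class fields, the `kind` attribute and the `list_products` function are assumptions for illustration, not part of any framework.

```python
from dataclasses import dataclass
from datetime import date
from typing import ClassVar, List, Optional


@dataclass
class Product:
    kind: ClassVar[str] = "products"  # value expected in the ?type= filter
    id: int
    published_date: date


@dataclass
class Computer(Product):
    kind: ClassVar[str] = "computers"
    cpu: str = ""


@dataclass
class Painting(Product):
    kind: ClassVar[str] = "paintings"
    artist: str = ""


def list_products(catalog: List[Product],
                  product_type: Optional[str] = None,
                  order_by: str = "published_date") -> List[Product]:
    """Return products, optionally restricted to one type, sorted by one attribute."""
    selected = [p for p in catalog if product_type is None or p.kind == product_type]
    return sorted(selected, key=lambda p: getattr(p, order_by))


catalog = [
    Computer(1, date(2020, 3, 1), cpu="example-cpu"),
    Painting(2, date(2019, 5, 20), artist="example artist"),
    Painting(3, date(2021, 1, 15)),
]

mixed = list_products(catalog)                                 # operation A
paintings_only = list_products(catalog, product_type="paintings")  # operation B
```

A handler for /products?type=paintings&order_by=published_date would then only need to pass the two query parameters through to a function like this.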

Practical usage of noSQL [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I’m starting a new web project and have to decide what database to use. I know, the question is very long but please bear with me on this.
I am very familiar with relational databases and have used frameworks like hibernate to get my data from the DB into Objects. But I have no experience with noSQL DBs. I am aware of the concepts of Document, Key-Value, etc. types.
While I do my research one question pops out every time and I don’t know how someone would handle this in noSQL DBs like MongoDB or any other Document-Typed noSQL DB where consistency takes top priority.
For example: let’s assume that we are creating a small shopping management system where customers can buy and sell stuff.
We have:
CUSTOMERs
ORDERs
PRODUCTs
A single CUSTOMER can have multiple ORDERs and an ORDER can have multiple PRODUCTs.
In a traditional RDBMS I would of course have 3 tables.
In the first version of our application, the front end for the customer should display his/her personal data, ORDERs and all the PRODUCTs he or she bought per order. Also which products are available for sale. So I guess in noSQL I would model the CUSTOMER class like this:
{
  "id": 993784,
  "firstname": "John",
  "lastname": "Doe",
  "orders": [
    {
      "id": 3234,
      "quantity": 4,
      "products": [
        {
          "id": 378234,
          "type": "TV",
          "resolution": "1920x1080",
          "screenSize": 37,
          "price": 999
        }
      ]
    }
  ],
  "products": [
    {
      "id": 7932,
      "type": "car",
      "sold": false,
      "horsepower": 90
    }
  ]
}
But later I want to extend my application to have 3 different UIs instead of only the first one:
The CUSTOMER Dashboard where a customer can view all his/her orders.
The PRODUCT Dashboard where a customer can add or remove products in his/her store.
The SOLD Dashboard where a customer can view all sold PRODUCTs ready for shipping.
One very important thing to consider (the reason why I even bother asking this question): I want to be flexible with the classes like PRODUCT because products can have different properties. For Example: A TV has screen size and resolution while a car has horsepower and other properties. And if a user adds a new product, he or she should be able to dynamically add those properties depending on what he/she knows about it.
Now to some practical use cases of two fictional users Jane and John:
Let's say Jane buys from John. Does that mean I have to create the PRODUCTs two times? One time as a child of Jane's ORDER and another time to stay in the "products" property of John?
Later Jane wants to view all products that are available from any user. Do I have to load every user to query the "products" property to generate a list of all products?
In version 2 of the application I want to enable John to view all outgoing orders (not orders he made but orders from other users who bought stuff from him) instead of viewing all sold products. How would this be done in noSQL? Would I now need to create an "outgoing" array of orders and duplicate them? (an outgoing order of Jane is an incoming order of John)
Some of you may say that noSQL is not right for this use case but isn’t that very common? Especially when we do not know what the future brings? If it does not fit for this use case, what use case would it fit into? Only baby applications (I guess not)? Wasn’t noSQL designed for more complex and flexible data?
Thank you very much for your advice and opinions!
EDIT 1:
Because this question was put on hold for being imprecise:
I made a very clear and simple example. So my question is not about the use of noSQL in general but about how to handle this specific example. How would an experienced noSQL user handle this use case? How to model this data? A recommendation to simply not use noSQL at all for this use case is also a valid answer to me.
I simply want to know how to use a noSQL database but still be able to manage entities and avoid redundancy.
For example: Are MongoDB's DBRefs/Manual refs a good way to achieve this? Performance issues because of multiple queries? What else to think about? I guess these questions can probably be answered quite well.
There probably isn't the one right answer to your question. But I'll make a start.
While it is technically possible in NoSQL to store some business entity together with all entities that are transitively linked with it (like Customer, Order, Product), it isn't always clever to do so. The traditional reasons for separating entities, namely redundancies and therefore update and delete anomalies, don't just go away because a different platform is used.
So if you stored the product description with every customer who buys or sells this product, you will get update anomalies. If you have to change the screen size from 37 to 35, you'll have to find all customer records containing this product, which can be quite cumbersome.
Also, building up such a deep nested structure favors one direction of evaluating those structures over all other directions. If you put all orders and products into the customer document, this is very fine for getting a comprehensive view for a customer: whatever she bought throughout her lifetime. But if you want to query your database by orders (which orders need to be fulfilled tonight?) or products (who ordered product 1234?) you'll have to load tons of data that are of no interest to this query.
Similar questions arise from storing all orders with a customer. Old orders will sometimes still be of interest, so they may not be deleted. But do you want to load lots of orders every time you load the customer?
This doesn't mean not to make use of the complex structuring made possible by a document store. As a rule of thumb, I would suggest: As long as the nested information belongs to the same business entity, put it into one document. If, e.g., the product description has some hierarchic structure, like nested sections consisting of text, pics, and videos, they may all go into one document. But entities with a totally different life cycle, like customers, orders, and suppliers, should be kept separate. Another indicator is references: A product will frequently be referenced as a whole, e.g. when it is ordered by a customer or ordered from a supplier. But the different parts of the product description may possibly never be referenced from the outside.
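A minimal sketch of that rule of thumb, using plain Python dictionaries as stand-ins for three separate collections. The field names `seller_id`, `buyer_id` and `lines`, Jane's record, and the helper functions are assumptions for illustration; no real database driver is involved.

```python
from typing import Any, Dict, List

# Each entity with its own life cycle lives in its own "collection".
customers: Dict[int, Dict[str, Any]] = {
    993784: {"_id": 993784, "firstname": "John", "lastname": "Doe"},
    2: {"_id": 2, "firstname": "Jane"},
}
products: Dict[int, Dict[str, Any]] = {
    378234: {"_id": 378234, "seller_id": 993784, "type": "TV",
             "resolution": "1920x1080", "screenSize": 37, "price": 999, "sold": True},
    7932: {"_id": 7932, "seller_id": 993784, "type": "car",
           "horsepower": 90, "sold": False},
}
# Orders reference products by id instead of embedding a copy of them.
orders: List[Dict[str, Any]] = [
    {"_id": 3234, "buyer_id": 2, "lines": [{"product_id": 378234, "quantity": 4}]},
]


def orders_placed_by(customer_id: int) -> List[Dict[str, Any]]:
    return [o for o in orders if o["buyer_id"] == customer_id]


def orders_received_by(seller_id: int) -> List[Dict[str, Any]]:
    """Orders containing at least one product offered by this seller."""
    return [
        o for o in orders
        if any(products[line["product_id"]]["seller_id"] == seller_id
               for line in o["lines"])
    ]


def available_products() -> List[Dict[str, Any]]:
    return [p for p in products.values() if not p.get("sold", False)]


# Changing the screen size touches exactly one document; every order that
# references product 378234 sees the new value.
products[378234]["screenSize"] = 35
```

The same order then serves both as Jane's placed order and as John's received order, without being stored twice. The price is paid at read time, where the application has to resolve the references itself.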
This rule of thumb wasn't completely precise, and it's not supposed to be. One person's business entity is another person's dumb attribute. Imagine the color of a car: For the car owner, it's just a piece of information describing a car. For the manufacturer, it's a business entity, having an availability, a price, one or more suppliers, a way of handling it, etc.
Your question also touches the aspect of dynamically adding attributes. This is often praised as one of the goodies of NoSQL, but it's no free lunch. Let's assume, as you mentioned, that the user may add attributes. That's technically possible, but how will these attributes be processed by the system? There won't be a specific view, nor specific business rules, for those attributes. So the best the system can do is offer some generic mechanism for displaying those attributes that were defined at runtime and never reflected in the program code.
This doesn't mean the feature is useless. Imagine your product description may be complex, as described above. You might build a generic mechanism to display (and edit) descriptions made up of sections, texts, images, etc., and afterwards the users may enter descriptions of unlimited width and depth. But in contrast, imagine your user will add a tiny delivery date attribute to the order. Unless the system knows specifically how to interpret this date, it will just be a dumb piece of information without any effect.
Now imagine not the user, but the developer adds new attributes. She has the opportunity to enhance the code at the same time, e.g. building some functionality around delivery dates. But this means that, although the database doesn't require it on its own, a new release of the software needs to be rolled out to make use of the new information.
The absence of a database schema even makes the programmer's task more complicated. When a relational table has a certain column, you may be sure that each of its records has this column. If you want to make sure that it has a meaningful value, make it not null, and you may be sure that each record contains a value of the correct data type. Nothing like that is guaranteed by schemaless databases. So, when reading a record, defensive programming is needed to find out which parts are present, and whether they have the expected content. The same holds for database maintenance via administrative tools. Adding an attribute and initializing it with a default value is a 2-liner in SQL, or a couple of mouse clicks in pgAdmin. For a schemaless database, you will write a short program on your own to achieve this.
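To give an idea of what that defensive programming and the "short program" might look like, here is a sketch over plain Python dictionaries standing in for documents. The helper names `read_screen_size` and `backfill_default`, and the `deliveryDate` attribute, are invented for this example.

```python
from typing import Any, Dict, List, Optional


def read_screen_size(doc: Dict[str, Any]) -> Optional[int]:
    """Return the screen size if it is present and usable, else None."""
    value = doc.get("screenSize")
    if isinstance(value, bool):       # bool is a subclass of int, so rule it out first
        return None
    if isinstance(value, int):
        return value
    if isinstance(value, str) and value.isdigit():
        return int(value)             # tolerate numbers that were stored as text
    return None


def backfill_default(docs: List[Dict[str, Any]], attribute: str, default: Any) -> int:
    """Add a missing attribute with a default value; return how many documents changed."""
    changed = 0
    for doc in docs:
        if attribute not in doc:
            doc[attribute] = default
            changed += 1
    return changed


docs = [
    {"id": 378234, "type": "TV", "screenSize": 37},
    {"id": 1, "type": "TV", "screenSize": "42"},
    {"id": 7932, "type": "car", "horsepower": 90},
]
sizes = [read_screen_size(d) for d in docs]
updated = backfill_default(docs, "deliveryDate", None)
```

In a relational database the second function corresponds roughly to adding a column with a default value.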
This doesn't mean that I dislike NoSQL databases. But I think the "schemaless" characteristic is sometimes overestimated, and I wouldn't make it the main, or only, reason to employ such a database.

MongoDB model design for meteorjs app

I'm more used to a relational database and am having a hard time thinking about how to design my database in mongoDB, and am even more unclear when taking into account some of the special considerations of database design for meteorjs, where I understand you often prefer separate collections over embedded documents/data in order to make better use of some of the benefits you get from collections.
Let's say I want to track students' progress in high school. They need to complete certain required classes each school year in order to progress to the next year (freshman, sophomore, junior, senior), and they can also complete some electives. I need to track when the students complete each requirement or elective. And the requirements may change slightly from year to year, but I need to remember for example that Johnny completed all of the freshman requirements as they existed two years ago.
So I have:
Students
Requirements
Electives
Grades (frosh, etc.)
Years
Mostly, I'm trying to think about how to set up the requirements. In a relational DB, I'd have a table of requirements, with className, grade, and year, and a table of student_requirements, that tracks the students as they complete each requirement. But I'm thinking in MongoDB/meteorjs, I'd have a model for each grade/level that gets stored with a studentID and initially instantiates with false values for each requirement, like:
{
  student: [studentID],
  class: 'freshman',
  year: 2014,
  requirements: {
    class1: false,
    class2: false
  }
}
and as the student completes a requirement, it updates like:
{
  student: [studentID],
  class: 'freshman',
  year: 2014,
  requirements: {
    class1: false,
    class2: [completionDateTime]
  }
}
So in this way, each student will collect four Requirements documents, which are somewhat dictated by their initial instantiation values. And instead of the actual requirements for each grade/year living in the database, they would essentially live in the code itself.
Some of the actions I would like to be able to support are marking off requirements across a set of students at one time, and showing a grid of users/requirements to see who needs what.
Does this sound reasonable? Or is there a better way to approach this? I'm pretty early in this application and am hoping to avoid painting myself into a corner. Any help or suggestion is appreciated. Thanks! :-)
Currently I'm thinking about my application data design too. I've read the examples in the MongoDB manual
look up MongoDB manual data model design - docs.mongodb.org/manual/core/data-model-design/
and here -> MongoDB manual one to one relationship - docs.mongodb.org/manual/tutorial/model-embedded-one-to-one-relationships-between-documents/
(sorry I can't post more than one link at the moment in an answer)
They say:
In general, use embedded data models when:
you have “contains” relationships between entities.
you have one-to-many relationships between entities. In these relationships the “many” or child documents always appear with or are viewed in the context of the “one” or parent documents.
The normalized approach uses a reference in a document, to another document. Just like in the Meteor.js book. They create a web app which shows posts, and each post has a set of comments. They use two collections, the posts and the comments. When adding a comment it's submitted together with the post_id.
So in your example you have a students collection. And each student has to fulfill requirements? And each student has his own requirements like a post has its own comments?
Then I would handle it like they did in the book. With two collections. I think that should be the normalized approach, not the embedded.
I'm a little confused myself, so maybe you can tell me, if my answer makes sense.
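For what it's worth, here is roughly how I picture the normalized version, sketched in Python with lists of dictionaries standing in for collections. The collection layout, the sample class names and the helpers `mark_completed` and `grid` are only my assumptions, not something taken from the MongoDB manual or the Meteor book.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional

# Requirements live in the database, one document per class/grade/year.
requirements: List[Dict[str, Any]] = [
    {"_id": "r1", "className": "Algebra I", "grade": "freshman", "year": 2014},
    {"_id": "r2", "className": "English 9", "grade": "freshman", "year": 2014},
]
students: List[Dict[str, Any]] = [
    {"_id": "s1", "name": "Johnny"},
    {"_id": "s2", "name": "Maria"},
]
# One document per completed (student, requirement) pair, like comments with a post_id.
completions: List[Dict[str, Any]] = []


def mark_completed(student_ids: List[str], requirement_id: str, when: datetime) -> None:
    """Mark one requirement as completed for a whole set of students at once."""
    for student_id in student_ids:
        already = any(c["student_id"] == student_id and c["requirement_id"] == requirement_id
                      for c in completions)
        if not already:
            completions.append({"student_id": student_id,
                                "requirement_id": requirement_id,
                                "completed_at": when})


def grid(grade: str, year: int) -> Dict[str, Dict[str, Optional[datetime]]]:
    """Student id -> class name -> completion time (None if still open)."""
    wanted = [r for r in requirements if r["grade"] == grade and r["year"] == year]
    done = {(c["student_id"], c["requirement_id"]): c["completed_at"] for c in completions}
    return {s["_id"]: {r["className"]: done.get((s["_id"], r["_id"])) for r in wanted}
            for s in students}


mark_completed(["s1", "s2"], "r1", datetime(2014, 10, 1))
overview = grid("freshman", 2014)
```

Because each requirement document carries its own year, the 2014 freshman requirements stay in the database unchanged even if next year's list differs.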
Maybe you can help me too? I'm trying to make an app that manages a flea market.
Users of the app create events.
The creator of the event invites users to be cashiers for that event.
Users create lists of stuff they want to sell. Max. number of lists/sellers per event. Max. number of position on a list (25/50).
Cashiers type in the positions of those lists at the event, to track what is sold.
Event creators make billings for the sold stuff of each list, to hand out the money afterwards.
I'm confused how to set up the data design. I need Events and Lists. Do I use the normalized approach, or the embedded one?
Edit:
After reading percona.com/blog/2013/08/01/schema-design-in-mongodb-vs-schema-design-in-mysql/ I found following advice:
If you read people information 99% of the time, having 2 separate collections can be a good solution: it avoids keeping in memory data is almost never used (passport information) and when you need to have all information for a given person, it may be acceptable to do the join in the application.
Same thing if you want to display the name of people on one screen and the passport information on another screen.
But if you want to display all information for a given person, storing everything in the same collection (with embedding or with a flat structure) is likely to be the best solution

Entity Framework Code First - Database schema for a catalog with product and product options

I'm trying to create an e-commerce website without using any third party components.
My biggest problem so far is designing my model/database schema.
The e-commerce solution is for a Take away.
They only really have two types of meals they sell:
Rice meals
Noodle meals
Now rice meals have a set of options, so for example a rice meal comes with either beans or plantain or both. (If both, we need to offset the price.)
Rice meals also come with a sauce, for which the customer has 3 different options. There is no price difference.
Noodle meals
You can choose a Noodle type
You Can choose a sauce that goes with it.
You can choose if you want fish or meat
Then they have other products that don't have any options.
So my question is: how can I create a flexible schema to store products, the options they have, and the possible values for those options?
I also need to work out how to store what has actually been selected by the user.
I'm using EF with code first, would love someone to give me a few tips in the right direction.
The closest thing I have come across that may be a solution is this.
http://villyblog.blogspot.co.uk/2008/11/sample-database-schema-for-catalog-with.html
Really confused about the best way to do this.
Keep it simple!
Modeling is a skill. It's about observing and filtering. Even in a relatively simple business like a Take-away there is a lot of noise and if you manage to filter the noise and keep the essence your entity model will become both robust and flexible. First focus on the absolute minimum. Let me try to show you how this could work in your case.
The filtering begins with finding the "ubiquitous" language (Evans, Domain Driven Design): the "things" the business talks about and that are candidates to become entities in the model.
You talk about meals, types, values, prices, discounts, options, products. What are candidate entities?
One important step to take is to find the real, tangible "things". Customers don't eat options. They eat meals, or products. Nor do they eat prices.
"Option" is an interesting word. It is a covert verb. It's an act of opting for some "thing". It's a common design flaw in modeling to turn actions into entities, while they should become methods working on entities. Finding these disguised verbs is very very important. Without diving too deep into this issue I can say that having actions as entities make it hard to assign the right responsibilities to classes.
Likewise, prices (values) and types are no tangible things. They are attributes of things. Turning attributes into entities is a less obvious error, but it happens. I think the model you show as example contains both of the above flaws.
So far, in fact, the only real "thing" that emerges is a Product. The rest is either action or attribute. A Product can either be a meal, or a component of a meal. So these products come in combinations, or aggregates, which can be modeled by a hierarchy.
So here's the core of your "flexible schema to store Products":
You can store all possible combinations of products in one database table. No need to store options separately. Options are products as well. It's an act of combining to design the options in a hierarchy of products.
A concrete part of the hierarchy, the rice meals, could look like this:
The business does the combining, which is designing the hierarchy. The customer does the picking of options. Here business rules come into play. Let's say one rule is that the owner can combine any products, another rule is the customer can only combine end points (the smaller gray rectangles). The parent product could contain a property telling how many of its children can be chosen.
There may be a way to build these rules into the model, but coded rules are far easier to modify than a model. Let the model just be a dumb bag of data.
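A rough Python sketch of such a product hierarchy, with the "how many children can be chosen" property and the customer rule kept in code rather than in the model. The property name `max_choices`, the function `validate_selection`, the prices and the three sauce names are placeholders I made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Product:
    name: str
    price: float = 0.0
    max_choices: Optional[int] = None          # how many children may be picked; None = no limit
    children: List["Product"] = field(default_factory=list)

    def add(self, child: "Product") -> "Product":
        self.children.append(child)
        return child

    def is_end_point(self) -> bool:
        return not self.children


def validate_selection(parent: Product, chosen: List[Product]) -> bool:
    """Customer rule: only end points of this parent, and not more than allowed."""
    for product in chosen:
        if product not in parent.children or not product.is_end_point():
            return False
    return parent.max_choices is None or len(chosen) <= parent.max_choices


# The business designs the hierarchy ...
rice_meal = Product("Rice meal", price=6.0)
side = rice_meal.add(Product("Side", max_choices=2))
beans = side.add(Product("Beans"))
plantain = side.add(Product("Plantain"))
sauce = rice_meal.add(Product("Sauce", max_choices=1))
sauce_a = sauce.add(Product("Sauce A"))
sauce_b = sauce.add(Product("Sauce B"))
sauce_c = sauce.add(Product("Sauce C"))

# ... and the customer picks end points.
both_sides_ok = validate_selection(side, [beans, plantain])
two_sauces_ok = validate_selection(sauce, [sauce_a, sauce_b])
```

In a database this maps to a single product table with a self-referencing parent key; the validation stays in code.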
Now the part
I also need to work out how to store what has actually been selected by the user.
When a customer picks options he is making a classical order with order lines. That would make for a model like
Well, this is getting a long answer. A short word on the discounts. It depends a bit on how you want to calculate them. A simple way is a product property that's simply a multiplication factor to apply to the prices of each child product when more than 1 are selected.
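As a small sketch of that discount idea: an order line that holds the chosen child products and applies the parent's multiplication factor only when more than one child is selected. The names `OrderLine` and `discount_factor` and all prices are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Product:
    name: str
    price: float


@dataclass
class OrderLine:
    parent_name: str               # the product whose children were chosen
    chosen: List[Product]
    quantity: int = 1
    discount_factor: float = 1.0   # copied from the parent product when ordering

    def total(self) -> float:
        subtotal = sum(p.price for p in self.chosen)
        if len(self.chosen) > 1:   # the factor only applies to combinations
            subtotal *= self.discount_factor
        return round(subtotal * self.quantity, 2)


beans = Product("Beans", 1.50)
plantain = Product("Plantain", 2.00)

single = OrderLine("Side", [beans], discount_factor=0.8)
combo = OrderLine("Side", [beans, plantain], quantity=2, discount_factor=0.8)
```

Here `single.total()` ignores the factor, while `combo.total()` applies it to the sum of both prices before multiplying by the quantity.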
Something like this might work, based vaguely on the link you provided:
MealType
- MealTypeID (short maybe? identity, PK)
- Name
Meal
- MealID (long, identity, PK)
- MealTypeID (FK)
- Name
- BasePrice
- IsActive (bit)
MealOption
- MealOptionID (PK) (short or int, identity)
- Name
- PriceOffset
- IsActive (bit)
MealMealOption (not the best name, but just represents a relationship between Meals and MealOptions)
- MealMealOptionID (PK, int or long, identity)
(composite foreign key with MealID and MealOptionID)
- MealID
- MealOptionID
Order
- (this holds stuff common to all orders such as billing address info, messages from the customer, etc.)
- OrderID (long, identity, PK)
- TotalCost
- TotalPriceOffset
etc...
OrderItems
- OrderItemsID (long, identity, PK)
- OrderId (FK)
- MealID (FK)
other order item-specific stuff...
OrderOptions
- OrderOptionID (long, identity, PK)
- OrderItemsID (FK)
- OrderID (FK)
- MealMealOptionID (FK)
anything else needed here...
Any table obviously will also have whatever other fields you deem necessary for that table.
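To try the layout out quickly, the tables above can be created in an in-memory SQLite database from Python. This is only a sketch of a subset of the columns: the column types, the sample rows and the prices are my assumptions, "Order" is quoted because it is a reserved word in SQL, and the redundant OrderID on OrderOptions is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE MealType (MealTypeID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Meal (
    MealID INTEGER PRIMARY KEY,
    MealTypeID INTEGER NOT NULL REFERENCES MealType(MealTypeID),
    Name TEXT NOT NULL,
    BasePrice REAL NOT NULL,
    IsActive INTEGER NOT NULL DEFAULT 1);
CREATE TABLE MealOption (
    MealOptionID INTEGER PRIMARY KEY,
    Name TEXT NOT NULL,
    PriceOffset REAL NOT NULL DEFAULT 0,
    IsActive INTEGER NOT NULL DEFAULT 1);
CREATE TABLE MealMealOption (
    MealMealOptionID INTEGER PRIMARY KEY,
    MealID INTEGER NOT NULL REFERENCES Meal(MealID),
    MealOptionID INTEGER NOT NULL REFERENCES MealOption(MealOptionID),
    UNIQUE (MealID, MealOptionID));
CREATE TABLE "Order" (OrderID INTEGER PRIMARY KEY, TotalCost REAL);
CREATE TABLE OrderItems (
    OrderItemsID INTEGER PRIMARY KEY,
    OrderID INTEGER NOT NULL REFERENCES "Order"(OrderID),
    MealID INTEGER NOT NULL REFERENCES Meal(MealID));
CREATE TABLE OrderOptions (
    OrderOptionID INTEGER PRIMARY KEY,
    OrderItemsID INTEGER NOT NULL REFERENCES OrderItems(OrderItemsID),
    MealMealOptionID INTEGER NOT NULL REFERENCES MealMealOption(MealMealOptionID));
""")

# Sample data: one rice meal with three possible options.
conn.execute("INSERT INTO MealType VALUES (1, 'Rice')")
conn.execute("INSERT INTO Meal VALUES (1, 1, 'Rice meal', 6.0, 1)")
conn.executemany("INSERT INTO MealOption VALUES (?, ?, ?, 1)",
                 [(1, "Beans", 0.0), (2, "Plantain", 0.0), (3, "Beans and plantain", 1.5)])
conn.executemany("INSERT INTO MealMealOption VALUES (?, 1, ?)", [(1, 1), (2, 2), (3, 3)])

# What the customer actually selected: the rice meal with beans and plantain.
conn.execute('INSERT INTO "Order" VALUES (1, NULL)')
conn.execute("INSERT INTO OrderItems VALUES (1, 1, 1)")
conn.execute("INSERT INTO OrderOptions VALUES (1, 1, 3)")

row = conn.execute("""
SELECT m.BasePrice + COALESCE(SUM(mo.PriceOffset), 0)
FROM OrderItems oi
JOIN Meal m ON m.MealID = oi.MealID
LEFT JOIN OrderOptions oo ON oo.OrderItemsID = oi.OrderItemsID
LEFT JOIN MealMealOption mmo ON mmo.MealMealOptionID = oo.MealMealOptionID
LEFT JOIN MealOption mo ON mo.MealOptionID = mmo.MealOptionID
WHERE oi.OrderItemsID = 1
GROUP BY oi.OrderItemsID, m.BasePrice
""").fetchone()
item_price = row[0]
```

The final query adds the price offsets of the selected options to the meal's base price for one order item.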
The answer to this question is here >
Database design for user settings
It's an ERD (entity-relationship diagram) of how to store N settings that are associated with N users, where each user can have any number of settings associated with that individual user. So you could have one user with 5 settings and another user with 10 settings.
This is very flexible and I'll be using it in the future. I have swapped the entity user and replaced it with product.
What I want to know now is HOW DO YOU STORE SELECTED SETTINGS? These tables are only able to store and show user settings and available options/settings.
Any ideas?

How to design form inheritance

I need to design about 20 forms for various business processes.
We have to do this for about 10 countries and need some form of object-oriented approach because each country has different business rules and some different bits of data. However, there is also common data between some of these countries.
For example, we have 5 bits of data, 4 are common to every country, 1 is specific for individual countries.
eg common Name, Address, Telephone, Male/Female
eg Bonus payment
It's a question of how do you manage all the code changes easily in an enterprise application without the code being too unwieldy?
It's not just the languages, which would essentially be driven by a config code that lists the names for the labels, but also that each form may take most of its design from a global form and the rest from a local form, local to the specific country.
Isn't there some way to build a dynamic form on the fly so that you have 1 form ProcessBonus for every country, that form inherits fields from MainForm, and then it checks a configuration class in the background to build the form dynamically for each country?
I'm trying to avoid having 10 form types and then another 100 local forms for each country, that would be well over 1000 forms and would be unmanageable wouldn't it?
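To illustrate what I mean by building the form from configuration, here is a rough Python sketch. The country codes, the translated label and the names `COUNTRY_CONFIG` and `build_form` are placeholders; the real application may well use a different language and UI toolkit.

```python
from typing import Dict, List

# Fields shared by every country (the "MainForm" part).
COMMON_FIELDS: List[str] = ["Name", "Address", "Telephone", "Male/Female"]

# Per-form, per-country additions and label overrides (placeholder values).
COUNTRY_CONFIG: Dict[str, Dict[str, Dict[str, object]]] = {
    "ProcessBonus": {
        "FR": {"extra_fields": ["Bonus payment"], "labels": {"Name": "Nom"}},
        "DE": {"extra_fields": [], "labels": {}},
    },
}


def build_form(form_name: str, country: str) -> List[Dict[str, str]]:
    """Return the field list for one form in one country."""
    config = COUNTRY_CONFIG.get(form_name, {}).get(country, {})
    labels = config.get("labels", {})
    fields = COMMON_FIELDS + list(config.get("extra_fields", []))
    return [{"field": name, "label": labels.get(name, name)} for name in fields]


french_bonus_form = build_form("ProcessBonus", "FR")
german_bonus_form = build_form("ProcessBonus", "DE")
```

With something like this there would be one ProcessBonus definition plus one configuration entry per country, instead of a separate form class for every combination.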
Actually what you need to achieve is multiple inheritance. .NET, PHP, Flex and some other languages don't support multiple inheritance, so interfaces are used as a workaround to achieve it.
You can find the implementation of interfaces in this link.
http://www.codersource.net/MicrosoftNet/CBasicsTutorials/CNetTutorialInterfaces.aspx