Normalization...Which table does the Address field belong in? - database-normalization

I'm trying to understand how to apply the concept of normalization. I’m working with this example that I came across online.
FIRST NORMAL FORM…
CONVERSION TO SECOND NORMAL FORM…
My question about this is: how would I know whether Address belongs in the Membership_Details_Table? Doesn’t it seem like the following would be a better schema in case a member has multiple addresses?
**Table1**
MembershipID
Salutation
FullName
**Table2**
MembershipID
Address
**Table3**
MembershipID
BooksIssued

No. The second method is just duplicating information about a particular member across two different tables. That is sometimes useful, but not for normalization.
It is possible that different members might be at the same address:
**Members**
MembershipID
Salutation
FullName
AddressId
**Addresses**
AddressId
Address
Note the difference: The addresses table has its own id.
Or it is also possible that the addresses change over time. That would suggest a type-2 table. However, your data has no dates, so that is beyond the scope of this question.
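A minimal sketch of the Members/Addresses layout above, using Python's built-in sqlite3 module. The column types and the sample rows are assumptions made up for illustration; they are not part of the original example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Addresses (
        AddressId INTEGER PRIMARY KEY,
        Address   TEXT NOT NULL
    );
    CREATE TABLE Members (
        MembershipID INTEGER PRIMARY KEY,
        Salutation   TEXT,
        FullName     TEXT NOT NULL,
        AddressId    INTEGER REFERENCES Addresses(AddressId)
    );
""")

# Two members share one address row (sample data is made up).
conn.execute("INSERT INTO Addresses VALUES (1, '12 Example Street')")
conn.executemany(
    "INSERT INTO Members VALUES (?, ?, ?, ?)",
    [(1, "Ms.", "Alice Example", 1), (2, "Mr.", "Bob Example", 1)],
)

# The address text is stored once, so one UPDATE changes it for both members.
conn.execute("UPDATE Addresses SET Address = '34 Sample Road' WHERE AddressId = 1")

rows = conn.execute("""
    SELECT m.FullName, a.Address
    FROM Members AS m
    JOIN Addresses AS a ON a.AddressId = m.AddressId
    ORDER BY m.MembershipID
""").fetchall()
print(rows)
```

Because the address lives in its own table with its own id, the members only carry a reference to it.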

Related

Using found set of records as basis for value list

Beginner question. I would like to have a value list display only the records in a found set.
For example, in a law firm database that has two tables, Clients and Cases, I can easily create a value list that displays all cases for clients.
But that is a lot of cases to pick from, and invites user mistakes. I would like the selection from the value list to be restricted to cases matched to a particular client.
I have tried this method https://support.claris.com/s/article/Creating-conditional-Value-Lists-1503692929150?language=en_US and it works up to a point, but it requires too much entry of data and too many tables.
It seems like there ought to be a simpler method using the find function. Any help or ideas greatly appreciated.

CQRS projections, joining data from different aggregates via probe commands

In CQRS, when we need to create custom-tailored projections for our read models, we usually prefer "denormalized" projections (assume we are talking about projecting onto a DB). It is not uncommon for the information needed by the application/UI to come from different aggregates (possibly from different BCs).
Imagine we need a projected table to contain a customer's information together with her full address, and that Customer and Address are different aggregates in our system (possibly in different BCs). That is, addresses are generated and maintained independently of customers. In other words, when a new customer is created, there is no guarantee that an AddressCreatedEvent will subsequently be produced by the system; this event may already have been processed prior to the creation of the customer. All we have at the time of CreateCustomerCommand is the UUID of an existing address.
We have several solutions here.
Enrich CreateCustomerCommand and the subsequent CustomerCreatedEvent to contain full address of the customer (looking up this information on the fly from the UI or the controller). This way the projection handler will just update the table directly upon receiving CustomerCreatedEvent.
Use the addrUuid provided in CustomerCreatedEvent to perform an ad-hoc query in the projection handler to get the missing part of the address information before updating the table.
These are commonly discussed solutions to this problem. However, as noted by many others, there are problems with each approach. Enriching events can be difficult to justify, as well described by Enrico Massone in this question, for example. Querying other views/projections (a kind of JOIN) will work but introduces coupling (see the same link).
I would like to describe another method here which, I believe, nicely addresses these concerns. I apologize beforehand for not giving proper credit if this is a known technique. Sincerely, I have not seen it described elsewhere (at least not as explicitly).
"A picture speaks a thousand words", as they say:
The idea is that:
We keep CreateCustomerCommand and CustomerCreatedEvent simple, with only the addrUuid attribute (no enriching).
In the API controller we send two commands to the command handler (aggregates). The first one, as usual, is CreateCustomerCommand, which creates the customer and projects the customer information together with addrUuid to the table, leaving the other columns (full address, etc.) empty for the time being. (Warning: see the update; we may have a concurrency issue here and need to issue the probe command from a Saga.)
Right after this, once we have obtained the custUuid of the newly created customer, we issue a special ProbeAddrressCommand to the Address aggregate, triggering an AddressProbedEvent that encapsulates the full state of the address together with a special attribute probeInitiatorUuid, which is, of course, our custUuid from the previous command.
The projection handler then acts upon AddressProbedEvent by simply filling in the missing pieces of information in the table, looking up the required row by matching the provided probeInitiatorUuid (i.e. custUuid) and addrUuid.
So we have two phases: create the Customer and probe for the related Address. They are depicted in the diagram as (1) and (2) respectively.
Obviously, we can send as many such "probe" commands (in parallel) as needed by our projection: ProbeBillingCommand, ProbePreferencesCommand, etc. effectively populating or "filling in" the denormalized projection with missing data from each handled "probe" event.
The advantage of this method is that we keep the commands/events in the first phase simple (only UUIDs of other aggregates) while avoiding synchronous coupling (joining) of the projections. The whole approach has a nice EDA feeling about it.
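To make the two phases concrete, here is a rough sketch of what the projection handler could look like. It is only an illustration under assumptions: the table is an in-memory dict, and the field names other than addrUuid, custUuid and probeInitiatorUuid (name, fullAddress) are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerCreatedEvent:
    custUuid: str
    name: str
    addrUuid: str


@dataclass(frozen=True)
class AddressProbedEvent:
    addrUuid: str
    fullAddress: str
    probeInitiatorUuid: str  # the custUuid that asked for the probe


class CustomerAddressProjection:
    """In-memory stand-in for the denormalized read-model table."""

    def __init__(self):
        self.rows = {}  # custUuid -> row dict

    def on_customer_created(self, event):
        # Phase 1: insert the row, leaving the address columns empty.
        self.rows[event.custUuid] = {
            "name": event.name,
            "addrUuid": event.addrUuid,
            "fullAddress": None,
        }

    def on_address_probed(self, event):
        # Phase 2: fill in the row matched by probeInitiatorUuid and addrUuid.
        row = self.rows.get(event.probeInitiatorUuid)
        if row is None or row["addrUuid"] != event.addrUuid:
            return False  # no matching row (yet), so nothing is updated
        row["fullAddress"] = event.fullAddress
        return True


projection = CustomerAddressProjection()
# A probe that arrives before the customer row exists cannot be applied.
early = projection.on_address_probed(AddressProbedEvent("addr-9", "1 Example Way", "cust-1"))
projection.on_customer_created(CustomerCreatedEvent("cust-1", "Jane Example", "addr-9"))
filled = projection.on_address_probed(AddressProbedEvent("addr-9", "1 Example Way", "cust-1"))
print(early, filled, projection.rows["cust-1"]["fullAddress"])
```

Additional probe events (billing, preferences, etc.) would each get a handler of the same shape, filling in their own columns.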
My question is then: is this a known technique? Seems like I have not seen this... And what can go wrong with this approach?
I would be more than happy to update this question with any references to other sources which describe this method.
UPDATE 1:
There is one significant flaw with this approach that I can see already: ProbeAddrressCommand must not be issued before the projection handler has had a chance to process CustomerCreatedEvent, but this is impossible to know from the API gateway (or controller).
The solution would probably involve a Saga, say CustomerAddressJoinProjectionSaga, which will start upon receiving CustomerCreatedEvent and which will only then issue ProbeAddrressCommand. The Saga will end upon registering AddressProbedEvent or, if many other aggregates are involved in probing, when all such events have been received.
So here is the updated diagram.
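In code, the Saga from this update might look roughly like the following. It is a minimal sketch under assumptions: send_command is a hypothetical stand-in for whatever command bus is in use, commands are plain dicts, and only a single probe per customer is tracked.

```python
class CustomerAddressJoinProjectionSaga:
    """Starts on CustomerCreatedEvent, ends on the matching AddressProbedEvent."""

    def __init__(self, send_command):
        self.send_command = send_command  # hypothetical command-bus callable
        self.pending = {}  # custUuid -> addrUuid still awaiting a probe result

    def on_customer_created(self, cust_uuid, addr_uuid):
        # Issue the probe only after CustomerCreatedEvent has been observed.
        self.pending[cust_uuid] = addr_uuid
        self.send_command({
            "type": "ProbeAddrressCommand",
            "addrUuid": addr_uuid,
            "probeInitiatorUuid": cust_uuid,
        })

    def on_address_probed(self, addr_uuid, probe_initiator_uuid):
        # End the saga instance only when the probe matches what was requested.
        if self.pending.get(probe_initiator_uuid) == addr_uuid:
            del self.pending[probe_initiator_uuid]
            return True
        return False


sent = []  # collects the commands the saga would dispatch
saga = CustomerAddressJoinProjectionSaga(sent.append)
saga.on_customer_created("cust-1", "addr-9")
finished = saga.on_address_probed("addr-9", "cust-1")
print(sent, finished)
```

With several probed aggregates, pending would hold a set of outstanding probes per customer, and the Saga would end once that set is empty.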
UPDATE 2:
As noted by Levi Ramsey (see answer below) my example is rather convoluted with respect to the choice of aggregates. Indeed, Customer and Address are often conceptualized as belonging together (same Aggregate Root). So it is a better illustration of the problem to think of something like Student and Course instead, assuming for the sake of simplicity that there is a straightforward relation between the two: a student is taking a course. This way it is more obvious that Student and Course are independent aggregates (students and courses can be created and maintained at different times and different places in the system).
But the question still remains: how can we obtain a projection containing the full information about a student (full name, etc.) and the courses she is registered for (title, credits, the instructor's full name, prerequisites, etc.), all in the same table, if the UI requires it?
A couple of thoughts:
I question why address needs to be a separate aggregate, much less one in a different bounded context, given the requirement that customers have an address. If customer addresses are meaningful in some other bounded context (e.g. you want to know "which addresses have more customers"), then that context can subscribe to the events from the customer service.
As an alternative, if there's a particularly strong reason to model addresses separately from customers, why not have the read side prospectively listen for events from the address aggregate and store the latest address for a given address UUID, in case there's a customer who ends up with that address? I would expect the reliability per unit of effort of that approach to be somewhat greater.
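A rough sketch of that alternative, assuming an in-memory read model and made-up handler and field names (on_address_changed, fullAddress): the read side remembers the latest address for every address UUID it sees, so the data is already there when a customer referencing it is created.

```python
class CustomerWithAddressReadModel:
    """Read side that listens to both address and customer events."""

    def __init__(self):
        self.latest_address = {}  # addrUuid -> latest full address seen
        self.customers = {}       # custUuid -> denormalized row

    def on_address_changed(self, addr_uuid, full_address):
        # Remember every address, whether or not a customer uses it yet.
        self.latest_address[addr_uuid] = full_address
        # Keep rows of customers already at this address up to date.
        for row in self.customers.values():
            if row["addrUuid"] == addr_uuid:
                row["fullAddress"] = full_address

    def on_customer_created(self, cust_uuid, name, addr_uuid):
        # The address is looked up in the read side's own store, not queried elsewhere.
        self.customers[cust_uuid] = {
            "name": name,
            "addrUuid": addr_uuid,
            "fullAddress": self.latest_address.get(addr_uuid),
        }


model = CustomerWithAddressReadModel()
model.on_address_changed("addr-9", "1 Example Way")   # arrives before any customer
model.on_customer_created("cust-1", "Jane Example", "addr-9")
first_seen = model.customers["cust-1"]["fullAddress"]
model.on_address_changed("addr-9", "2 Example Way")   # later change to the address
print(first_seen, "->", model.customers["cust-1"]["fullAddress"])
```

The cost is that the read side stores addresses no customer may ever use; in exchange, no probe commands or Saga are needed.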

EF Core multiple (dynamic) columns with equal data type

Please excuse the bad title; I couldn't think of a better one. Feel free to suggest a better one in the comments.
I have a case where I need a dynamic number of columns; in my example, addresses.
One customer wants 1 address and another wants 7 (bill address, deliver address, bid address, bill address, confirmation address ...). I do not want to create all possible columns for all application setups, but rather provide some sort of mechanism that lets every user set (in the program) how many and which pieces of information they need.
I know Entity Framework does not support dynamic tables, but I am sure many have had the same problem. Maybe string[] would suit best, but EF Core doesn't support it.
I could store all addresses in JSON, but then I lose the option to filter and sort by these columns in SQL (LINQ).
Is there any common pattern to achieve dynamic columns and keep server-side filtering (and maybe ordering)?

What emails are equivalent to each other?

I am trying to detect whether the destination of an email is someone who is in the database. The problem is that there are equivalent email addresses, so a direct comparison won't catch all cases. Some examples: foobar@gmail.com == foo.bar@gmail.com == foobar+123@gmail.com. Is there somewhere where these patterns are defined?
What you are referencing is "Subaddressing Semantics".
As far as most people are concerned: no, there aren't any additional rules you are not aware of. This is because both the "." and "+/-" are domain-specific identifiers.
Gmail chose to recognize addresses with dots as the same as those without, e.g. JohnSmith@gmail.com == John.Smith@gmail.com, because treating them as different would make things too easy for impostors. Therefore, all Gmail addresses are guaranteed to be the same with or without dots, but this guarantee does not extend to other domains.
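As an illustration, a comparison that applies the Gmail rules only to Gmail could be sketched like this (addresses written with the usual @ separator). The function name normalize_email is made up, lowercasing the Gmail local part is an assumption, and every other domain is deliberately left alone apart from lowercasing the domain name.

```python
def normalize_email(address: str) -> str:
    """Return a comparison key for an address, applying Gmail-only rules."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    domain = domain.lower()  # domain names are case-insensitive
    if domain == "gmail.com":
        # Gmail-specific: drop a "+tag" suffix and ignore dots in the local part.
        # Lowercasing the local part is an assumption about Gmail's behaviour.
        local = local.split("+", 1)[0].replace(".", "").lower()
    return f"{local}@{domain}"


same = (
    normalize_email("foobar@gmail.com")
    == normalize_email("foo.bar@gmail.com")
    == normalize_email("foobar+123@gmail.com")
)
other = normalize_email("foo.bar@Example.com")
print(same, other)
```

For any other provider you would need that provider's own rules; nothing here can be inferred from the address alone.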
If you mean to ask if there's a generic way of deciding this from the name alone, then no. Even if you used patterns to find similar email addresses, how do you know they belong to the same person? Maybe foobar123 really is different from foobar?
Now if your database tables have email addresses linked to people, you could use queries to find the people to whom they are linked and compare those.
The only other thing I can think of would be if you could do an IP lookup for email addresses, but there are a host of problems with that.

How do I deconflict multiple ABRecord when the uniqueid fails and I have people with the same first and last name?

This may be down in the weeds, but thought I'd throw this out there and ask. So I'm building an app where I need to reference information in the address book, say a person's phone number, but do not want to store the info separately. Basically, every time the user loads the app, I'll go check the address book and get the info just in case they've updated it.
Anyway, I've read this in the iPhone Programming Doc:
"Every record in the Address Book database has a unique record identifier. This identifier always refers to the same record, unless that record is deleted or the MobileMe sync data is reset. Record identifiers can be safely passed between threads. They are not guaranteed to remain the same across devices.
The recommended way to keep a long-term reference to a particular record is to store the first and last name, or a hash of the first and last name, in addition to the identifier. When you look up a record by ID, compare the record’s name to your stored name. If they don’t match, use the stored name to find the record, and store the new ID for the record."
So what I'm worried about or curious about is let's say I've stored the uids and the first and last name. Then let's say upon sync or device transfer or whatever my uids get hosed. Now let's say my address book contains two entries for user "Bob Smith".
How do you deconflict this given the uids no longer match and first/last name are the same? My guess is to end up storing other info (e.g. phone number, email, etc...) but that puts me back into the situation of not really wanting to store more info than is necessary. I'm realizing this could be a .0001% time problem, but thought I'd throw this out there to see what you all thought.
Thanks for any suggestions!
I’m trying to solve the same problem. In my solution, though, I can (or actually need to) store everything because of multi-platform/multi-device issues. But if you can afford to pick up one or two additional properties to really tell the two people apart (let’s say e-mail address and birthday), you should be able to recognize them most of the time. If not, I would wonder whether they aren’t really the same person (same name, same e-mail, same birthday… walks like a duck, quacks like a duck, is duck enough).
And I would say it’s not a .0001% problem; e.g. my father and I have the very same name (and that happens to a lot of people in my country). But we definitely don’t have the same e-mail address or birthday.
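A sketch of that lookup order in Python, using plain dicts rather than the real Address Book API: the record layout and the resolve_contact helper are hypothetical, and e-mail stands in for whichever extra property you decide to store.

```python
def resolve_contact(address_book, stored):
    """Return the record matching a stored reference, or None if still ambiguous."""
    name = (stored["first"], stored["last"])

    # 1. Try the stored record id and confirm the name still matches.
    record = address_book.get(stored["id"])
    if record is not None and (record["first"], record["last"]) == name:
        return record

    # 2. The id is stale: fall back to a search by name.
    same_name = [r for r in address_book.values()
                 if (r["first"], r["last"]) == name]
    if len(same_name) == 1:
        return same_name[0]

    # 3. Several people share the name: use the extra property to tell them apart.
    matches = [r for r in same_name if r.get("email") == stored.get("email")]
    return matches[0] if len(matches) == 1 else None


# Made-up data: the ids changed after a sync, and two people are named Bob Smith.
address_book = {
    7: {"id": 7, "first": "Bob", "last": "Smith", "email": "bob.sr@example.com"},
    8: {"id": 8, "first": "Bob", "last": "Smith", "email": "bob.jr@example.com"},
}
stored = {"id": 3, "first": "Bob", "last": "Smith", "email": "bob.jr@example.com"}
found = resolve_contact(address_book, stored)
print(found["id"] if found else None)
```

After a successful fallback, the app would store the new id, as the documentation quoted in the question recommends; when the function returns None, asking the user to pick is probably the only safe option.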