What emails are equivalent to each other? - email

I am trying to detect whether the recipient of an email is someone who is in the database. The problem is that some addresses are equivalent, so a direct comparison won't catch all cases. Some examples: foobar#gmail.com == foo.bar#gmail.com == foobar+123#gmail.com. Is there somewhere where these patterns are defined?

What you are referencing is "Subaddressing Semantics".
As far as most people are concerned: No, there aren't any additional rules you are not aware of. This is because both the "." and "+/-" are domain-specific identifiers, so each mail provider decides what they mean.
Gmail chose to recognize addresses with dots as the same as those without, e.g. JohnSmith#gmail.com == John.Smith#gmail.com, because treating them as different addresses would make it too easy for imposters. So every Gmail address is the same with or without dots, but this guarantee does not extend to other domains.
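If you only care about the domains whose rules you know, a small normalizer applied before the comparison may be enough. This is a rough sketch, not a complete rule set: `normalize_email` and `DOT_AND_PLUS_INSENSITIVE` are names made up for the example, it assumes addresses use the usual `@` separator, and it only encodes the Gmail behaviour described above (dots ignored, everything after "+" ignored).

```python
# Assumption: only domains listed here are known to ignore dots and "+tag"
# suffixes in the local part. Every other domain is compared as written.
DOT_AND_PLUS_INSENSITIVE = {"gmail.com"}


def normalize_email(address: str) -> str:
    """Return a canonical form of an address for comparison (hypothetical helper)."""
    local, sep, domain = address.strip().rpartition("@")
    if not sep:
        raise ValueError(f"not an email address: {address!r}")
    domain = domain.lower()  # domain names are case-insensitive
    if domain in DOT_AND_PLUS_INSENSITIVE:
        # Drop the "+tag" suffix, then the dots, then fold case.
        local = local.split("+", 1)[0].replace(".", "").lower()
    return f"{local}@{domain}"


same = normalize_email("foo.bar+123@gmail.com") == normalize_email("foobar@gmail.com")
```

Addresses at unknown domains pass through untouched, so the worst case is a missed match rather than two different people being merged.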

If you mean to ask if there's a generic way of deciding this from the name alone, then no. Even if you used patterns to find similar email addresses, how do you know they belong to the same person? Maybe foobar123 really is different from foobar?
Now if your database tables have email addresses linked to people, you could use queries to find the people to whom they are linked and compare those.
The only other thing I can think of would be if you could do an IP lookup for email addresses, but there are a host of problems with that.

Related

Normalization...Which table does the Address field belong in?

I'm trying to understand how to apply the concept of normalization. I’m working with this example that I came across online.
FIRST NORMAL FORM…
CONVERSION TO SECOND NORMAL FORM…
My question about this is, how would I know if Address belongs in the Membership_Details_Table? Doesn’t it seem like the following would be a better schema in case a member has multiple addresses?
**Table1**
MembershipID
Salutation
FullName
**Table2**
MembershipID
Address
**Table3**
MembershipID
BooksIssued
No. The second method is just duplicating information about a particular member across two different tables. That is sometimes useful, but not for normalization.
It is possible that different members might be at the same address:
**Members**
MembershipID
Salutation
FullName
AddressId
**Addresses**
AddressId
Address
Note the difference: The addresses table has its own id.
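To make the shared-address case concrete, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The column types and the sample rows are invented for illustration; only the table and column names come from the layout above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Addresses (
    AddressId INTEGER PRIMARY KEY,
    Address   TEXT NOT NULL
);
CREATE TABLE Members (
    MembershipID INTEGER PRIMARY KEY,
    Salutation   TEXT,
    FullName     TEXT NOT NULL,
    AddressId    INTEGER REFERENCES Addresses(AddressId)
);
-- Made-up sample data: two members living at the same address.
INSERT INTO Addresses VALUES (1, '12 Example Street');
INSERT INTO Members VALUES (1, 'Ms.', 'Alice Example', 1);
INSERT INTO Members VALUES (2, 'Mr.', 'Bob Example', 1);
""")

# The address text is stored once and reached through the AddressId key.
rows = conn.execute("""
    SELECT m.FullName, a.Address
    FROM Members AS m
    JOIN Addresses AS a ON a.AddressId = m.AddressId
    ORDER BY m.MembershipID
""").fetchall()
```

Correcting the address is then a single update in Addresses, and both members see the change.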
It is also possible that the addresses change over time. That would suggest a type-2 table, i.e. one that keeps a dated history row for each change. However, your data has no dates, so that is beyond the scope of this question.

How to create google actions with one term name

How can I name my Google Action with a single word?
I get the following error when I try:
App names with only one word, or only one word that is not a prefix (such as "the" or "an"), are not normally allowed. If you need further guidance, please contact the support.
You do exactly what it says - you contact support at the form at the bottom of https://developers.google.com/actions/support/.
In general, one-name actions won't be permitted unless you have the trademark or domain name with that same one word already. This name is a unique way to contact your Action, and for generic names or words, there would be far too many people trying to get the same one. In rare cases, it might be allowed (again - when you already have a clear claim on the word), but in general it is unlikely.

Is the MODSEQ value of an email unique to the entire mail account?

As far as I can see in the IMAP RFC, it appears to say that a MODSEQ value is unique to a folder and will never be repeated unless UIDValidity changes. However, I can't see it saying anything about the account as a whole, rather just folders.
My question is, can I use an email's MODSEQ value as a unique value across the entire account, or do I need to define my own unique value, likely something similar to:
let uid = path + MODSEQ
There is no guarantee of uniqueness across folders. This is because some servers don't know much about folders other than the ones they have open at the moment, and it was considered important to make MODSEQ easy for servers to implement.
Yes, you need your own uniqueness value.
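One way to build such a value, following the idea in the question, is a composite key rather than a concatenated string. This is only a sketch: `MessageKey` is a made-up name, the values are assumed to have been fetched already with whatever IMAP client you use, and adding UIDVALIDITY is my own addition, based on the question's remark that values can repeat once it changes.

```python
from typing import NamedTuple


class MessageKey(NamedTuple):
    """Hypothetical account-wide key: folder path plus per-folder values."""
    path: str         # folder name, e.g. "INBOX"
    uidvalidity: int  # assumption: included so a folder reset yields new keys
    modseq: int       # only meaningful within the folder


# A tuple keeps the parts separate; plain concatenation can collide.
a = MessageKey("INBOX1", 1, 23)
b = MessageKey("INBOX", 1, 123)
collides_as_string = (a.path + str(a.modseq)) == (b.path + str(b.modseq))
distinct_as_tuple = a != b
```

A NamedTuple is hashable, so it can be used directly as a dictionary key or serialized field by field into a database.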

Address Unification

I'm creating a business directory where I need to display results based on area and keywords. The problem is the scope might be across countries that have fairly irregular address structures. I currently have the following as form fields (and their respective database fields)
Fields (All required):
- Address 1
- Address 2
- Area <------key search criteria
- Keywords <------key search criteria
The problem is I'm not sure how reliable this setup is. I would have to rely on the data entry when searching to be relevant enough for it to work, and that goes against validating everything before inserting to the database. Is there a standard way of looking up areas across countries? And if so, how?
I decided to solve this by running (and verifying) addresses through batch geocoding, which converts the addresses to 'geocodes' one can use with mapping plugins. There seem to be a lot of solutions in this regard (search for "batch geocode addresses"), although you may have to research further for accuracy. I initially started with OpenLayers for mapping, but I found Leaflet faster to understand and deploy (with emphasis on mobile). I am only speaking from my own experience of learning it and being able to implement it in time.

Does it make sense to have two classes related by a 1:1 cardinality?

I am currently studying Computer Engineering and I remember a professor of a class called Introduction to Information Systems saying that relating two classes by a 1:1 cardinality does not make sense.
For example: I have the Client class and the Telephone class. Let's suppose that the client can only have one phone. The professor said that it does not make sense to create the Telephone class, and that telephone should be an attribute of the Client class. I absolutely agree with him.
But now I'm taking the Software Engineering class and the professor (not the same) did not make any comments about this issue, and now I'm really confused about this.
What is the correct approach?
I would say your Introduction to Information Systems professor was correct. And your SE professor, too (assuming his lack of comments makes him a contrarian). They are each right depending on your requirements and the domain you're working with. But without any other details, it's hard to model this for you, and I would lean towards what your Information Systems professor said. Keep in mind all those fun little principles you learned: KISS, DRY, etc., and apply them to your problem.
If Client will never ever possibly have more than one telephone number and no other entity in your domain needs a telephone number, then a separate Telephone class isn't necessary. In the real world, if your requirements are vague, find out more information from your client.
If somebody down the road decides Clients can take on more than one telephone number, or another entity is introduced into your domain that needs a telephone number, this is a fairly easy refactoring to accomplish.
So with that in mind, let's say your Client had a separate Address class that included the telephone number instead. Maybe that Address class gets re-used by another class, maybe Invoice or Shipment, where an Address could be shared or applied in both cases. In this example, you might want Address (Telephone) to be its own class.
In your example, Telephone might be a little too contrived. You'd want it to be a separate class for re-use if it had many properties (AreaCode, InternationalPrefix, Number, etc.), but if Client just needed a string-value called Telephone that a user would be typing in merely for reference, then it probably doesn't make sense to be its own class.
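The two options can be put side by side in a short sketch. The class and field names below are illustrative only (they loosely follow the AreaCode / InternationalPrefix / Number properties mentioned above), and the `formatted` method is invented to show what behaviour a separate class might carry.

```python
from dataclasses import dataclass


@dataclass
class SimpleClient:
    """Option 1: telephone is just a free-form string kept for reference."""
    name: str
    telephone: str


@dataclass(frozen=True)
class Telephone:
    """Option 2: a reusable value with its own properties and behaviour."""
    international_prefix: str
    area_code: str
    number: str

    def formatted(self) -> str:
        return f"+{self.international_prefix} {self.area_code} {self.number}"


@dataclass
class Client:
    name: str
    telephone: Telephone  # still 1:1, but Telephone can be reused elsewhere


client = Client("Example Client", Telephone("1", "555", "0100"))
label = client.telephone.formatted()
```

If nothing but Client ever needs the number and nobody needs its parts, the first form is the simpler choice; the second only pays off once the structure or the reuse is real.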
If you wish to re-use the Telephone class, it won't be very useful having it as a part of the Client class. That would be one really good reason. If you leave it in the Client class, it implies that it is intrinsically part of the Client even when you use it elsewhere, which I doubt you would ever mean.
Sometimes though, it makes sense to model 2 entities with a 1:1 relationship as separate classes. Perhaps you have a Client and you also have ClientBilling. You do not want all of your programmers to have access to the ClientBilling so you move it into its own class where it can be separately controlled.
Perhaps your structure is huge, and shipping the whole thing around isn't normally necessary. By breaking it into functional pieces, you can reduce the size of the data to only that needed for a particular function.
Perhaps the 1:1-ness isn't intrinsic to the data, and a reasonable guess would be that it will not always be that way. Your Telephone example falls into this category, I think.
I'd say 1:1 relationships (mandatory on both ends) are suspicious and should be carefully considered to be sure they are needed. Usually it's a trade-off between flexibility and simplicity of the diagram: keeping the two classes makes it easier to change the diagram in the future and adapt it to new requirements, while merging them means maintaining one class instead of two.