What really is the maximum length of an email address local part?

According to Wikipedia (https://en.wikipedia.org/wiki/Email_address) and http://isemail.info/about the maximal length of the local part of an email address is 64 characters.
However, I just received email from this address:
reply+0032ff332e028331fad75f7549ee52d90483c7aa70138a3192cf00000001123b88e492a169ce06aab82c@reply.github.com
Its local part is 90 characters long and isemail.info deems it invalid, yet the address clearly works: I can send email to it and the other party receives it.
So what gives: is the maximum length of the local part of an email address 64 characters or not? If not, what is the maximum length?

The maximum length is 64 octets.
Yet as MSalters says in comments, just because something is done doesn't mean it's legal.
Some systems accept longer local parts, others don't. In this case, GitHub tells you to send email to that address. That is bad practice on GitHub's part: their own servers may accept a longer address, but they forget that the sending client might be more pedantic and refuse to send (or worse, truncate the address).
They probably treat reply as the real local part and use +0032ff33... as an identifier but, as you point out, that makes the whole local part much (too?) long.
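The 64-octet limit is easy to check before handing an address to a stricter client. A minimal Python sketch, assuming the limit is counted in UTF-8 octets and that the last @ separates local part and domain; the function names are illustrative, not from any library:

```python
MAX_LOCAL_PART_OCTETS = 64  # the limit cited in the answer above


def local_part_octets(address: str) -> int:
    """Return the size of the local part in octets (UTF-8 encoded)."""
    # Split at the last "@" so that an "@" inside a quoted local part
    # is not mistaken for the separator.
    local_part, separator, _domain = address.rpartition("@")
    if not separator:
        raise ValueError("address has no '@' separator")
    return len(local_part.encode("utf-8"))


def local_part_within_limit(address: str) -> bool:
    """True if the local part is no longer than 64 octets."""
    return local_part_octets(address) <= MAX_LOCAL_PART_OCTETS
```

Applied to the GitHub reply address from the question (written with its usual @ separator), local_part_octets reports 90, so a strict client is within its rights to reject it.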

Related

What is the maximum url length that can be safely used in an email?

My website sends out an email with a link in it. Lately we've been getting a lot of errors indicating that the URL in the email is somehow being garbled. Unfortunately we don't have any logs that show exactly which URL the users tried to access. I've ruled out a number of possibilities (bad data, bad URL encoding, etc.). The only thing I haven't ruled out is that the URL is being truncated by our users' email clients. The URL is slightly different for each user, but it is generally 210 - 220 characters long.
My question: As a rule of thumb, what is the maximum URL length that can be safely sent to an email client, to ensure consistent behavior?
UPDATE
I know that there are a number of questions on SO related to the maximum URL length, but my question is specific to a hyperlink in an email client, and I can't seem to find that.
Good style recommendation [URL length <= recommended line length]
A URL should fit on a single line, and a single email line should be at most 78 characters (minus at least two characters for quoting in replies).
https://www.rfc-editor.org/rfc/rfc5322.txt
2.1.1. Line Length Limits
There are two limits that this specification places on the number of
characters in a line. Each line of characters MUST be no more than
998 characters, and SHOULD be no more than 78 characters, excluding
the CRLF.
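To see where a given link falls relative to these two limits, a small Python sketch can help. The two-character quoting allowance comes from the style advice above; the category labels and the function name are my own assumptions, and whether a particular client actually wraps or breaks a long line is not something the RFC text settles:

```python
RECOMMENDED_LINE_LENGTH = 78  # RFC 5322 SHOULD limit, excluding CRLF
MAXIMUM_LINE_LENGTH = 998     # RFC 5322 MUST limit, excluding CRLF
QUOTE_ALLOWANCE = 2           # room for a "> " reply prefix (style advice, not RFC)


def classify_url_line(url: str) -> str:
    """Classify a URL that occupies an email line of its own."""
    length = len(url)
    if length <= RECOMMENDED_LINE_LENGTH - QUOTE_ALLOWANCE:
        return "fits"
    if length <= MAXIMUM_LINE_LENGTH:
        return "legal but longer than recommended"
    return "exceeds the hard limit"
```

A URL of 210 - 220 characters, as in the question, lands in the middle category: permitted by the RFC, but beyond the length a line SHOULD have.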

perl gethostbyname when given IP

What happens if a malformed IP address is given to the gethostbyname function in Perl? One of our scripts behaved strangely when given a malformed IP (say 1.1.1). While debugging, I found that gethostbyname returns a value when given 1.1.1, for example. Any thoughts on this? In my opinion, gethostbyname should return undef, right?
In the beginning of IPv4, before CIDR, addresses were considered to be composed of a network part and a host part. The parts could be written more or less independently in dotted decimal form, and didn't need to be fully decomposed into bytes. So 1.1 is host 1 on network 1, equivalent to 1.0.0.1, or you can write it as one big 32-bit number: 16777217. There was a time when people used URLs like http://16777217/ to show how clever they were. That was ruined when spammers started doing it to fool filters.
Somehow, when I ping 1.1.1, it goes to 1.1.0.1. I would have guessed 1.0.1.1. I'm not sure what the rule is to decide how it's broken up exactly.
These old forms are not widely supported (or even understood) anymore, but they haven't been completely rooted out of all the tools and libraries.
P.S. On my first attempt to submit this answer, Stack Overflow said:
Your post contains a link to the invalid domain '16777217'.
Please correct it by specifying a full domain or wrapping it in a code block.
Which is sort of what I meant by "not widely supported".
Numeric IPv4 addresses can be written with 1, 2, 3 or 4 numeric components. Each non-final component represents 8 bits (1 octet), and the final component represents as many bits as are required to complete the full 32-bit address. Thus, the following all represent the local loopback address:
2130706433
127.1
127.0.1
127.0.0.1
Each component may itself be written in decimal, hex or octal; thus the following all also encode the same address:
0x7f000001
127.0x01
0177.0.1
0x7f.0.0.1
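The rule above can be sketched in Python. This is a from-scratch illustration of the rule as described, not the code any particular resolver uses; parse_component, parse_legacy_ipv4 and to_dotted_quad are hypothetical names:

```python
def parse_component(part: str) -> int:
    """Parse one component written in decimal, hex (0x...) or octal (0...)."""
    if not (part.isascii() and part.isalnum()):
        raise ValueError(f"invalid component: {part!r}")
    if part[:2].lower() == "0x":
        return int(part[2:], 16)
    if len(part) > 1 and part[0] == "0":
        return int(part, 8)
    return int(part, 10)


def parse_legacy_ipv4(text: str) -> int:
    """Return the 32-bit value of an address written with 1 to 4 components."""
    parts = text.split(".")
    if not 1 <= len(parts) <= 4:
        raise ValueError("expected between 1 and 4 components")
    values = [parse_component(part) for part in parts]

    address = 0
    for value in values[:-1]:
        # Every non-final component is exactly one octet.
        if value > 0xFF:
            raise ValueError("non-final component does not fit in 8 bits")
        address = (address << 8) | value

    # The final component fills whatever bits are left.
    remaining_bits = 8 * (4 - (len(values) - 1))
    last = values[-1]
    if last >= 1 << remaining_bits:
        raise ValueError("final component does not fit in the remaining bits")
    return (address << remaining_bits) | last


def to_dotted_quad(address: int) -> str:
    """Format a 32-bit value in the familiar four-component form."""
    return ".".join(str((address >> shift) & 0xFF) for shift in (24, 16, 8, 0))


print(to_dotted_quad(parse_legacy_ipv4("1.1.1")))  # prints 1.1.0.1
```

This also explains the ping result mentioned earlier: in 1.1.1 the first two components are one octet each and the final 1 fills the remaining 16 bits, giving 1.1.0.1 rather than 1.0.1.1.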

Is it possible to send email to an address that contains latin unicode characters with cfmail?

We need to be able to send an email with cfmail to an address that contains a Latin a with an acute accent. I assume we'll eventually have to allow other Unicode characters too. A sample address is foobár@example.com. ColdFusion throws an error on this address, which is technically valid. Since the acute a can be encoded in UTF-8, and the default encoding for cfmail is UTF-8, I'm not sure what other settings I would need to enable to make this work. Is this possible?
The error I get is Attribute validation error for tag CFMAIL.
Detail: The value of the attribute to, which is currently foobár@example.com, is invalid.
I'm neither an I18N nor an email expert, but my understanding, FWIW, is that current systems don't generally support Unicode in the local part of the email address, i.e. the mailbox name before the @. Local mail servers may support it and allow a name such as foobár internally, but if that person wants to receive mail from the outside world they will also need an ASCII alias such as foobar.
There is, however, a mechanism for supporting Unicode in the domain portion of the address, which involves conversion to an ASCII representation called Punycode. This means an address such as foo@foobár.com will be converted to foo@xn--foobr-0qa.com, which current DNS and mail systems will accept.
It's possible to do this conversion in ColdFusion by using existing Java libraries. For more detail see this question.

Exceeding Max Email Address Sizes

The RFC standard says the maximum email address length is 320 characters (actually 256 according to http://www.dominicsayers.com/isemail/). Is there any conceivable scenario where email addresses could end up being longer than this?
Read this: http://www.eph.co.uk/resources/email-address-length-faq/
The upshot of it is that you should use 254 characters to store email addresses, because that is the maximum allowed in an SMTP transaction. This is specified in RFC5321 (your article says so, and is actually quoted in mine), which is authoritative.
To be honest, even if someone had a valid email address beyond 256/320 chars it would be a major pain to use.
Anyone using an email address that is even half as long as that (128 chars) needs to trim it back!
Although on the plus side, they likely get no spam!
For example both of these would be unusable:
//long domain
joe.shmoe#someveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylogdomain.com
//long username
someveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylonguser#aregularlengthdomain.com
This guy managed to have a 345 character-long email address and make it work:
The World’s Longest Active Email Address
Admittedly, such a long email address is completely pointless.

Can an email address contain international (non-english) characters?

If it's possible, should I accept such emails from users and what problems to expect when I will be sending mails to such addresses?
Officially, per RFC 6532 - Yes.
For a quick explanation, check out wikipedia on the subject.
Update 2015: Use RFC 6532
The experimental RFC 5335 has been obsoleted by RFC 6532, which has been
set to "Category: Standards Track", making it the standard.
Section 3.2 (Syntax Extensions to RFC 5322) updates most text fields to
allow (proper) UTF-8:
The following rules extend the ABNF syntax defined in [RFC5322] and
[RFC5234] in order to allow UTF-8 content.
VCHAR =/ UTF8-non-ascii
ctext =/ UTF8-non-ascii
atext =/ UTF8-non-ascii
qtext =/ UTF8-non-ascii
text =/ UTF8-non-ascii
; note that this upgrades the body to UTF-8
dtext =/ UTF8-non-ascii
The preceding changes mean that the following constructs now
allow UTF-8:
1. Unstructured text, used in header fields like
"Subject:" or "Content-description:".
2. Any construct that uses atoms, including but not limited
to the local parts of addresses and Message-IDs. This
includes addresses in the "for" clauses of "Received:"
header fields.
3. Quoted strings.
4. Domains.
Note that header field names are not on this list; these are still
restricted to ASCII.
Please note the explicit inclusion of Domains.
And the explicit exclusion of header names.
Also note the remark about NFKC:
The UTF-8 NFKC normalization form SHOULD NOT be used because
it may lose information that is needed to correctly spell
some names in some unusual circumstances.
And the start of Section 3:
Also note that messages in this format require the use of the
SMTPUTF8 extension [RFC6531] to be transferred via SMTP.
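In practice this means a sender has to decide, per address, whether the SMTPUTF8 extension is needed at all. A minimal Python sketch, assuming that a non-ASCII domain can instead be sent in its ASCII-compatible (Punycode) form, so that only a non-ASCII local part forces the extension; needs_smtputf8 is a hypothetical helper name:

```python
def needs_smtputf8(address: str) -> bool:
    """True if the local part contains non-ASCII characters."""
    local_part, separator, _domain = address.rpartition("@")
    if not separator:
        raise ValueError("address has no '@' separator")
    # The domain is ignored here on the assumption that it can be
    # converted to an ASCII-compatible form before sending.
    return not local_part.isascii()
```

When the result is True, the receiving server has to advertise SMTPUTF8. With Python's smtplib, for instance, the sender requests it by including "SMTPUTF8" in mail_options, and the send fails if the server does not support the extension.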
The problem is that some mail clients (server tools and/or desktop tools) don't support it and throw an 'invalid email' exception when you try to send mail to an address that contains umlauts, for example.
If you want broad support, one trick is to convert the domain part of the email address to Punycode. This lets users type their addresses the usual way while you store them in the widely supported ASCII form.
Example: müller.com » xn--mller-kva.com
Both point to the same thing.
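A minimal Python sketch of that conversion, using the standard library's built-in "idna" codec (which implements the older IDNA 2003 rules). Only the domain is converted and the local part is passed through unchanged; to_ascii_domain is an illustrative name:

```python
def to_ascii_domain(address: str) -> str:
    """Return the address with its domain converted to Punycode (IDNA)."""
    local_part, separator, domain = address.rpartition("@")
    if not separator:
        raise ValueError("address has no '@' separator")
    # The "idna" codec encodes each non-ASCII label as xn--...
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local_part}@{ascii_domain}"


print(to_ascii_domain("info@müller.com"))  # prints info@xn--mller-kva.com
```

Note that this does nothing for non-ASCII characters in the local part; those still depend on support from the mail systems involved.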
I would assume yes, since a number of top-level domains already allow non-ASCII
characters in domain names and, since the domain is part of an email address, it's
perfectly possible. An example of such a domain would be www.öko.de
Short answer: yes.
They are allowed not only in the username but also in the domain name.
The answer is yes, but they need to be encoded specially.
Look at this. Read the part that refers to email-headers and RFC 2047.
Not yet. The IETF plans to do this:
H-Online article: IETF planning internationalised email addresses, here is the RFC: SMTP Extension for Internationalized Email Addresses
Quote from H-Online (as the site has since gone down):
The Internet Engineering Task Force (IETF) has published three crucial documents for the standardisation of email address headers
that include symbols outside the ASCII character set. This means that
soon you'll be able to use Chinese characters, French accents, and
German umlauts in email addresses as well as just in the body of the
message. So if your name is Zoë and you work for a company that makes
façades, you might be interested in a new email address. But
representatives of providers are already moaning. They say there would
need to be an "upgrade mania" if the Unicode standard UTF-8 is to
replace the American Standard Code for Information Interchange (ASCII)
currently used as the general email language.
RFC 5335 specifies the use of UTF-8 in practically all email headers.
Changes would have to be made to SMTP clients, SMTP servers, mail user
agents (MUAs), software for mailing lists, gateways to other media,
and everywhere else where email is processed or passed along. RFC 5336
expands the SMTP email transport protocol. At the level of the
protocol, the expansion is labelled UTF8SMTP.
A new header field will be added as a sort of "emergency parachute" to
ensure that UTF-8 emails have a soft landing if they are thrown out
before reaching the recipient by systems that have not been upgraded.
The "OldAddress" is a purely ASCII address. But OldAddress is not to
be used as a channel for a second transfer attempt, but rather to make
sure that feedback is sent home.
Finally, RFC5337 ensures that correct messages are sent pertaining to
the delivery status of non-ASCII emails. The correct address of an
unreachable addressee must be sent back, even if further transport has
been refused. The email Address Internationalization (EAI) working
group is also working on a number of "downgrade mechanisms" for
various header fields and the envelope. If possible, original header
information is to be "packaged" and preserved.
Germany's DeNIC, the registrar for the ".de" domain, is nonetheless
taking this in its stride. "There is really not much we can do",
explained DeNIC spokesperson Klaus Herzig. DeNIC is instead paying
more attention to the update that the IETF is working on for the
standard of international domains – RFC3490, or IDNA2003 as it's
sometimes known. "We are not that happy about it because there is no
backwards compatibility," Herzig explained. When the update comes,
DeNIC says it will be throwing its weight behind the symbol "ß" - also
known as eszett - which has been overlooked up to now. The German
registrar also says that it may wait a bit before switching in light
of the lack of backward compatibility. Once the new standard is
running stably and registrars and providers have adopted it, the ß
will be added.
In contrast, experts believe that Chinese registrars in China and
Taiwan will quickly implement the change for internationalised email.
Representatives of CNIC and TWNIC are authors of the standards.
Chinese users currently have to write emails in ASCII to the left of
the @ and in Chinese characters to the right of it for Chinese
domains, which have already been internationalized.
(Monika Ermert)