I'm trying to make sense of the RFC 5322 section on Line Length Limits. Is the line limit 78 chars or 998 chars? Is one for the body and the other for the headers? I can't find anything to specify that.
Each line of characters MUST be no more than 998 characters, and SHOULD be no more than 78 characters, excluding the CRLF.
It's saying that lines should preferably be no longer than 78 characters, but must never be longer than 998 characters.
In other words, 998 is a hard limit while 78 is a soft limit.
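To make the hard/soft distinction concrete, here is a minimal sketch of a checker. The function name and the three labels it returns are my own invention for illustration, not anything defined by the RFC.

```python
HARD_LIMIT = 998  # MUST not be exceeded
SOFT_LIMIT = 78   # SHOULD not be exceeded


def classify_line(line):
    """Classify one line against the 998/78 limits, excluding the CRLF."""
    # The limits do not count the terminating CRLF, so strip it first.
    if line.endswith("\r\n"):
        line = line[:-2]
    if len(line) > HARD_LIMIT:
        return "invalid"  # violates the MUST
    if len(line) > SOFT_LIMIT:
        return "long"     # legal, but violates the SHOULD
    return "ok"
```

So a 500-character line is still legal ("long"), just discouraged, while a 999-character line is not legal at all.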
I have this issue with the latest pgAdmin where PostgreSQL is automatically truncating function names shorter than 63 characters. I don't know if it's related to the language or something else, but here's a function name I'm using:
"βρες_ασθενείς_μίας_μέρας_νοσηλευτή"
postgres truncates the name to:
"βρες_ασθενείς_μίας_μέρας_νοσηλευτ"
which is 33 characters.
Did the rules of max function name size change or is there something wrong with my preferences?
Thanks for your time.
"4.1.1. Identifiers and Key Words":
The system uses no more than NAMEDATALEN-1 bytes of an identifier; longer names can be written in commands, but they will be truncated. By default, NAMEDATALEN is 64 so the maximum identifier length is 63 bytes. If this limit is problematic, it can be raised by changing the NAMEDATALEN constant in src/include/pg_config_manual.h.
Note that it says 63 bytes, not characters. If you use UTF-8, your untruncated string is 64 bytes long, which is too long. The truncated string is 62 bytes long and fits.
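You can check the byte counts yourself. The following is only a sketch in Python that mimics the observed behaviour; it is not PostgreSQL's actual implementation, and truncate_identifier is a hypothetical helper name. Each Greek letter takes 2 bytes in UTF-8 and each underscore takes 1.

```python
NAMEDATALEN = 64  # PostgreSQL's default, per the quoted documentation


def truncate_identifier(name, max_bytes=NAMEDATALEN - 1):
    """Cut a name to at most max_bytes of UTF-8 without splitting a character."""
    raw = name.encode("utf-8")
    if len(raw) <= max_bytes:
        return name
    # Cutting at byte 63 would leave half of a 2-byte character;
    # errors="ignore" drops that incomplete trailing sequence.
    return raw[:max_bytes].decode("utf-8", errors="ignore")


name = "βρες_ασθενείς_μίας_μέρας_νοσηλευτή"
print(len(name), len(name.encode("utf-8")))
short = truncate_identifier(name)
print(len(short), len(short.encode("utf-8")))
```

The last character starts at byte 63, so it does not fit in the 63-byte budget and is dropped as a whole, which is why the result is 62 bytes rather than 63.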
I'm reviewing for an exam right now and one of the review questions gives an answer that I'm not understanding.
A main memory location of a MIPS processor based computer contains the following bit pattern:
0 01111110 11100000000000000000000
a. If this is to be interpreted as a NULL-terminated string of ASCII characters, what is the string?
The answer that's given is "?p" but I'm not sure how they got that.
Thanks!
Each ASCII character here is stored in one 8-bit byte, so we can break the contents of your main memory location into bytes:
00111111
01110000
00000000
...
Null-terminated strings are terminated with none other than... a null byte! (A byte with all zeros.) So this means that your string contains two bytes that are ASCII characters. Byte one has a value of 63 and byte two has a value of 112. If you look at an ASCII chart, you'll see that 63 corresponds to '?' and 112 corresponds to 'p'.
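If you want to check this mechanically, here is a small sketch. It assumes, as the answer above does, that the bytes are read left to right from the bit pattern as written; bits_to_c_string is just a name I made up.

```python
def bits_to_c_string(bits):
    """Read a bit pattern 8 bits at a time and stop at the first null byte."""
    bits = bits.replace(" ", "")  # the spaces only mark the float fields
    chars = []
    for i in range(0, len(bits) - 7, 8):
        value = int(bits[i:i + 8], 2)
        if value == 0:  # null terminator ends the string
            break
        chars.append(chr(value))
    return "".join(chars)


print(bits_to_c_string("0 01111110 11100000000000000000000"))  # ?p
```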
I am using yoururls with Base32 encoding to send shortened links within an SMS. The URL is preceded by a message, and since an SMS is limited to 160 characters and my messages are approximately 140 characters, I need to be very careful about character count.
My question is this: how do I calculate how many URLs I can fit within a 4-character limit using Base32 encoding?
I'm not sure if you are asking about permutations.
Each character in Base32 encoding can have 32 values ([A - Z] and [2 - 7]). If you use the form http://yoursite.com/xxxx, where xxxx is the short URL, four digits can contain 32^4 permutations. That is, 1,048,576.
If you also include URLs with three digits (e.g. http://yoursite.com/xxx), you can have 32^3 more permutations. That is, 32,768. Together with the four-digit URLs, you get a total of 1,081,344.
If you also use two-digit URLs (e.g. http://yoursite.com/xx), you get an additional 1,024 URLs, totalling 1,082,368. And including single digits (e.g. http://yoursite.com/x) will give you an additional 32, totalling 1,082,400.
But you don't need to use only [A - Z] and [2 - 7]. As per RFC 3986, you can use the characters ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=. That's 84 different characters. With this:
http://yoursite.com/xxxx          49,787,136
http://yoursite.com/xxx   added   50,379,840 (+592,704)
http://yoursite.com/xx    added   50,386,896 (+  7,056)
http://yoursite.com/x     added   50,386,980 (+     84)
Even if you leave out the characters -._~:/?#[]@!$&'()*+,;=, because they would not really fit in a shortened URL, you'll still have 62 characters. With that:
http://yoursite.com/xxxx          14,776,336
http://yoursite.com/xxx   added   15,014,664 (+238,328)
http://yoursite.com/xx    added   15,018,508 (+  3,844)
http://yoursite.com/x     added   15,018,570 (+     62)
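All of the totals above come from the same sum, k^1 + k^2 + ... + k^n for an alphabet of k characters and short codes of up to n characters. A quick sketch to reproduce them (short_url_count is just an illustrative name):

```python
def short_url_count(alphabet_size, max_length):
    """Number of distinct codes of length 1..max_length over the alphabet."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))


for size in (32, 62, 84):
    print(size, f"{short_url_count(size, 4):,}")
```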
I have a question about the email addresses length.
Why do Wikipedia and some other sites say that the maximum number of characters in an email user name is 64, that the server name can have at most 255 characters, and that together user-name@server-name must not exceed 254 characters?
If the whole address is limited to 254 characters, why is the server name allowed up to 255? I don't understand that...
Could you help me please? Thanks!
The relevant SMTP standard is currently RFC 5321. Section 4.5.3 describes the limits. The length of a mail path can't exceed 256 octets. Since a mail path includes the angle brackets, in practice this means the user@host portion can't exceed 254 characters. The 64 and 255 limits apply to the user name and the server name individually; the path limit caps their combination.
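If you want to validate these limits in code, a rough sketch could look like the following. It checks lengths only, not address syntax, and address_length_ok is a hypothetical helper name.

```python
MAX_LOCAL = 64     # octets in the user-name part
MAX_DOMAIN = 255   # octets in the server-name part
MAX_ADDRESS = 254  # 256-octet path minus the two angle brackets


def address_length_ok(address):
    """Check only the length limits of user@host, nothing else."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return False
    return (
        len(local.encode("utf-8")) <= MAX_LOCAL
        and len(domain.encode("utf-8")) <= MAX_DOMAIN
        and len(address.encode("utf-8")) <= MAX_ADDRESS
    )
```

Note that with a 64-character user name, the overall cap leaves at most 189 characters for the server name, so a 255-character server name can never appear in a valid address.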
How many characters are allowed to be in the subject line of Internet email?
I had a scan of The RFC for email but could not see specifically how long it was allowed to be.
I have a colleague that wants to programmatically validate for it.
If there is no formal limit, what is a good length in practice to suggest?
See RFC 2822, section 2.1.1 to start.
There are two limits that this standard places on the number of characters in a line. Each line of characters MUST be no more than 998 characters, and SHOULD be no more than 78 characters, excluding the CRLF.
As the RFC states later, you can work around this limit (not that you should) by folding the subject over multiple lines.
Each header field is logically a single line of characters comprising the field name, the colon, and the field body. For convenience however, and to deal with the 998/78 character limitations per line, the field body portion of a header field can be split into a multiple line representation; this is called "folding". The general rule is that wherever this standard allows for folding white space (not simply WSP characters), a CRLF may be inserted before any WSP. For example, the header field:
Subject: This is a test
can be represented as:
Subject: This
 is a test
The recommendation for no more than 78 characters in the subject header sounds reasonable. No one wants to scroll to see the entire subject line, and something important might get cut off on the right.
RFC 2822 states that the subject header "has no length restriction", but to produce long headers you need to split them across multiple lines, a process called "folding".
Subject is defined as "unstructured" in RFC 5322.
Here are some quotes ([...] indicates text I omitted):
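To illustrate what folding does, here is a deliberately simplified sketch. It folds only at single spaces, cannot split a single word longer than the limit, and ignores encoded words; fold_header and unfold_header are names I made up for the example.

```python
def fold_header(line, limit=78):
    """Insert CRLF before a space so each physical line is <= limit chars."""
    words = line.split(" ")
    lines = []
    current = words[0]
    for word in words[1:]:
        if len(current) + 1 + len(word) > limit:
            lines.append(current)
            current = " " + word  # continuation lines start with the space
        else:
            current += " " + word
    lines.append(current)
    return "\r\n".join(lines)


def unfold_header(folded):
    """Undo folding by removing each CRLF that precedes whitespace."""
    return folded.replace("\r\n ", " ")


print(repr(fold_header("Subject: This is a test", limit=13)))
# 'Subject: This\r\n is a test'
```

Because the space that separated the words is kept at the start of the continuation line, unfolding gives back exactly the original logical line, however long it is.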
3.6.5. Informational Fields
The informational fields are all optional. The "Subject:" and
"Comments:" fields are unstructured fields as defined in section
2.2.1, [...]
2.2.1. Unstructured Header Field Bodies
Some field bodies in this specification are defined simply as
"unstructured" (which is specified in section 3.2.5 as any printable
US-ASCII characters plus white space characters) with no further
restrictions. These are referred to as unstructured field bodies.
Semantically, unstructured field bodies are simply to be treated as a
single line of characters with no further processing (except for
"folding" and "unfolding" as described in section 2.2.3).
2.2.3 [...] An unfolded header field has no length restriction and
therefore may be indeterminately long.
After some testing: if you send an email to an Outlook client, the subject is longer than 77 characters, and it needs "=?ISO" encoding inside the subject (in my case because of accents), then Outlook will cut the subject in the middle and mangle everything that comes after it, including the body text, attachments, etc. It all becomes a mess!
I have several examples like this one:
Subject: =?ISO-8859-1?Q?Actas de la obra N=BA.20100154 (Expediente N=BA.20100182) "NUEVA RED FERROVIARIA.=
TRAMO=20BEASAIN=20OESTE(Pedido=20PC10/00123-125),=20BEASAIN".?=
To:
As you can see, the subject line was cut at character 78 with a "=" followed by 2 or 3 line feeds, and then the rest of the subject continued badly.
This was reported to me by several customers who were all using Outlook; other email clients deal with those subjects fine.
If the subject has no ISO encoding, nothing breaks, but if you add it to comply with the RFC, you get this surprise from Outlook. But if you don't add the ISO encoding, iPhone mail will not understand the subject (and attached files with names using such characters will not work on iPhones).
Limits in the context of Unicode multi-byte character capabilities
While RFC5322 defines a limit of 1000 (998 + CRLF) characters, it does so in the context of headers limited to ASCII characters only.
RFC 6532 explains how to handle multi-byte Unicode characters.
Section 3.4 (Effects on Line Length Limits) states:
Section 2.1.1 of [RFC5322] limits lines to 998 characters and
recommends that the lines be restricted to only 78 characters. This
specification changes the former limit to 998 octets. (Note that, in
ASCII, octets and characters are effectively the same, but this is
not true in UTF-8.) The 78-character limit remains defined in terms
of characters, not octets, since it is intended to address display
width issues, not line-length issues.
So for example, because you are limited to 998 octets, you can't have 998 smiley faces in your subject line as each emoji of this type is 4 octets.
Using PHP to demonstrate:
Run php -a for an interactive terminal.
// Character count (multi-byte aware):
var_export(mb_strlen("\u{0001F602}",'UTF-8'));
// 1
// Octet (byte) count; strlen() counts bytes, not characters:
var_export(strlen("\u{0001F602}"));
// 4
// Byte-wise substring covering all four octets of the character:
var_export(substr("\u{0001F602}",0,4));
// '😂'
// Byte-wise substring truncated to 3 octets, which corrupts the character:
var_export(substr("\u{0001F602}",0,3));
// '▒'
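The same idea in Python, as a rough sketch: count octets rather than characters, and if you must truncate, do it on a character boundary so that you don't end up with a corrupted character like the one above. The helper names are hypothetical, and a real header line would also have to count the "Subject: " prefix against the limit.

```python
OCTET_LIMIT = 998  # per RFC 6532, the hard limit is counted in octets


def fits_octet_limit(text, limit=OCTET_LIMIT):
    """True if the UTF-8 encoding of text is at most limit octets."""
    return len(text.encode("utf-8")) <= limit


def truncate_to_octets(text, limit=OCTET_LIMIT):
    """Cut text to at most limit octets without splitting a character."""
    # errors="ignore" drops an incomplete multi-byte sequence at the end.
    return text.encode("utf-8")[:limit].decode("utf-8", errors="ignore")


smiley = "\U0001F602"  # 4 octets in UTF-8
print(fits_octet_limit(smiley * 249), fits_octet_limit(smiley * 250))
```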
I don't believe that there is a formal limit here, and I'm pretty sure there isn't any hard limit specified in the RFC either, as you found.
I think that some pretty common limitations for subject lines in general (not just e-mail) are:
80 Characters
128 Characters
256 Characters
Obviously, you want to come up with something that is reasonable. If you're writing an e-mail client, you may want to go with something like 256 characters, and obviously test thoroughly against big commercial servers out there to make sure they serve your mail correctly.
Hope this helps!
What's important is which mechanism you are using to send the email. Most modern libraries (e.g. System.Net.Mail) will hide the folding from you: you just pass in a very long subject line without CR, LF, or HTAB characters. If you start trying to do your own folding, all bets are off and the library will start reporting errors. So if you are having this issue, just filter out the CR, LF, and HTAB characters and let the library do the work for you. You can usually also set the text encoding as a separate field, so there is no need for ISO encoding in the subject line.
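The same approach works outside .NET. As a sketch using Python's standard email package (my choice for illustration, not something the answer above mentions), you strip the control characters yourself and leave the folding to the library. The helper names build_message and subject_lines are made up.

```python
from email import message_from_string, policy
from email.message import EmailMessage


def build_message(subject):
    """Build a message, letting the library fold the Subject header."""
    # Remove CR and LF and turn HTAB into a space; no manual folding.
    clean = subject.translate({ord("\r"): None, ord("\n"): None, ord("\t"): " "})
    msg = EmailMessage()  # default policy folds at 78 characters
    msg["Subject"] = clean
    msg.set_content("body text")
    return msg.as_string()


def subject_lines(raw):
    """Return the physical lines that make up the Subject header."""
    lines = []
    for line in raw.split("\n"):
        if line.startswith("Subject:"):
            lines.append(line)
        elif lines and line[:1] in (" ", "\t"):
            lines.append(line)  # folded continuation line
        elif lines:
            break
    return lines


long_subject = " ".join(["word"] * 40)
raw = build_message(long_subject)
print("\n".join(subject_lines(raw)))
```

Parsing the result back with message_from_string(raw, policy=policy.default) gives the subject as a single logical line again.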