Is the following sintax rfc 3261 compatible? - sip

could you please tell me if the following syntax is compatible with rfc3261? Thanks.
401 Unauthorized:
Digest realm="movistar.tel",\r\n nonce="35e21fba60422a1f2D6a1fde23235c885d287d50e45fd49d56a354",\r\n algorithm=MD5,\r\n qop="auth"
or should be
Digest realm="movistar.tel",nonce="35e21fba60422a1f2D6a1fde23235c885d287d50e45fd49d56a354",algorithm=MD5,qop="auth"
I'm trying to analyze using "https://github.com/nygge/abnfc/blob/master/samples/sip/rfc3261.abnf" without success.
Please, can you help me to understand this?. Sorry my bad english.
Thanks.

The rule is defined in section 25.1 of rfc3261
In most place CRLF can be added for readability. Make sur at least a WSP is located right after this CRLF. This helps to differentiate new headers from LWS when parsing.
SIP header field values can be folded onto multiple lines if the
continuation line begins with a space or horizontal tab. All linear
white space, including folding, has the same semantics as SP. A
recipient MAY replace any linear white space with a single SP before
interpreting the field value or forwarding the message downstream.
This is intended to behave exactly as HTTP/1.1 as described in RFC
2616 [8]. The SWS construct is used when linear white space is
optional, generally between tokens and separators.
LWS = [*WSP CRLF] 1*WSP ; linear whitespace
SWS = [LWS] ; sep whitespace
CONCLUSION: both lines seems to be valid.

Related

Are newlines in MIME headers using encoded-words legal?

RFC 2047 defines the encoded-words mechanism for encoding non-ASCII character in MIME documents. It specifies that whitespace characters (space and tabs) are not allowed inside the encoded-word.
However, RFC 5322 for parsing email MIME documents specifies that long header lines should be "folded". Should this folding take place before or after encoded-words decoding?
I recently received an email where encoded-text part of the header had a newline in it, like this:
Header: =?UTF-8?Q?=C3=A5
=C3=A4?=
Would this be valid?
Of course emails can be invalid in lots of exciting ways and the parser needs to handle that, but it's interesting to know the "correct" way. :)
I misread the question and answered as if it was a different sort of whitespace. In this case the white space appears inside the MIME word, not multiple ones separated by white space.
This sort of thing is explicitly disallowed. From the introduction to the format in RFC2047:
2. Syntax of encoded-words
An 'encoded-word' is defined by the following ABNF grammar. The
notation of RFC 822 is used, with the exception that white space
characters MUST NOT appear between components of an 'encoded-word'.
And then later on in the same section:
IMPORTANT: 'encoded-word's are designed to be recognized as 'atom's
by an RFC 822 parser. As a consequence, unencoded white space
characters (such as SPACE and HTAB) are FORBIDDEN within an
'encoded-word'. For example, the character sequence
=?iso-8859-1?q?this is some text?=
would be parsed as four 'atom's, rather than as a single 'atom' (by
an RFC 822 parser) or 'encoded-word' (by a parser which understands
'encoded-words'). The correct way to encode the string "this is some
text" is to encode the SPACE characters as well, e.g.
=?iso-8859-1?q?this=20is=20some=20text?=
The characters which may appear in 'encoded-text' are further
restricted by the rules in section 5.
Earlier answer
This sort of thing is explicitly allowed. Headers with MIME words should be 76 characters or less and folded if needed. RFC822 folded headers are indented second and any additional lines. RFC2047 headers are supposed to only indent one space. The whitespace between ?= on the first line and =? should be suppressed from output.
See the example on the bottom of page 12 of the RFC:
encoded form displayed as
---------------------------------------------------------------------
(=?ISO-8859-1?Q?a?= (ab)
=?ISO-8859-1?Q?b?=)
Any amount of linear-space-white between 'encoded-word's,
even if it includes a CRLF followed by one or more SPACEs,
is ignored for the purposes of display.

hcm cloud rest call from jersey with query parameter having space [duplicate]

This question already has answers here:
Is a URL allowed to contain a space?
(10 answers)
Closed 8 years ago.
w3fools claims that URLs can contain spaces: http://w3fools.com/#html_urlencode
Is this true? How can a URL contain an un-encoded space?
I'm under the impression the request line of an HTTP Request uses a space as a delimiter, being formatted as {the method}{space}{the path}{space}{the protocol}:
GET /index.html http/1.1
Therefore how can a URL contain a space? If it can, where did the practice of replacing spaces with + come from?
A URL must not contain a literal space. It must either be encoded using the percent-encoding or a different encoding that uses URL-safe characters (like application/x-www-form-urlencoded that uses + instead of %20 for spaces).
But whether the statement is right or wrong depends on the interpretation: Syntactically, a URI must not contain a literal space and it must be encoded; semantically, a %20 is not a space (obviously) but it represents a space.
They are indeed fools. If you look at RFC 3986 Appendix A, you will see that "space" is simply not mentioned anywhere in the grammar for defining a URL. Since it's not mentioned anywhere in the grammar, the only way to encode a space is with percent-encoding (%20).
In fact, the RFC even states that spaces are delimiters and should be ignored:
In some cases, extra whitespace (spaces, line-breaks, tabs, etc.) may
have to be added to break a long URI across lines. The whitespace
should be ignored when the URI is extracted.
and
For robustness, software that accepts user-typed URI should attempt
to recognize and strip both delimiters and embedded whitespace.
Curiously, the use of + as an encoding for space isn't mentioned in the RFC, although it is reserved as a sub-delimeter. I suspect that its use is either just convention or covered by a different RFC (possibly HTTP).
Spaces are simply replaced by "%20" like :
http://www.example.com/my%20beautiful%20page
The information there is I think partially correct:
That's not true. An URL can use spaces. Nothing defines that a space is replaced with a + sign.
As you noted, an URL can NOT use spaces. The HTTP request would get screwed over. I'm not sure where the + is defined, though %20 is standard.

rfc5322: how to fold values without spaces?

I've already seen answer on similar topic, but rfc5322 says:
Unfolding is accomplished by simply removing any CRLF that is immediately followed by WSP.
this can break source spaceless text by adding into it spaces.
so how to fold values without spaces?
The modern solution is to use RFC2047 encoding which allows completely arbitrary folding:
=?us-ascii?B?A?=
=?us-ascii?B?B?=
encodes the single string AB, preserving the lack of spaces.
The header format does not allow breaking lines without spaces.
Note that there's nothing wrong with sending long lines — the length limit in header lines is 998 characters (Section 2.3), and if you need to send a header that contains more than 998 characters with no intervening spaces, you're probably doing something wrong.

why is there a '=' at the end of a SMTP message body?

I receive email messages over sockets and see that long lines in the message body are broken up, separated by the following expression
'=\r\n'
I cannot find any documentation on this and wonder if someone just happens to know where I can find information on this behavior.
Also, please ONLY feedback on my question, no comments regarding email and sockets!
Thanks
Alex
From Wikipedia, regarding Quoted-printable:
Lines of quoted-printable encoded data must not be longer than 76 characters. To satisfy this requirement without altering the encoded text, soft line breaks may be added as desired. A soft line break consists of an "=" at the end of an encoded line, and does not appear as a line break in the decoded text.
The \r\n is likely coming from whatever is generating the content or body of the email, and is a line break also. Depending on the client used to view the message, it may or may not render as an actual line break.

What is the email subject length limit?

How many characters are allowed to be in the subject line of Internet email?
I had a scan of The RFC for email but could not see specifically how long it was allowed to be.
I have a colleague that wants to programmatically validate for it.
If there is no formal limit, what is a good length in practice to suggest?
See RFC 2822, section 2.1.1 to start.
There are two limits that this
standard places on the number of
characters in a line. Each line of
characters MUST be no more than 998
characters, and SHOULD be no more than
78 characters, excluding the CRLF.
As the RFC states later, you can work around this limit (not that you should) by folding the subject over multiple lines.
Each header field is logically a
single line of characters comprising
the field name, the colon, and the
field body. For convenience however,
and to deal with the 998/78 character
limitations per line, the field body
portion of a header field can be split
into a multiple line representation;
this is called "folding". The general
rule is that wherever this standard
allows for folding white space (not
simply WSP characters), a CRLF may be
inserted before any WSP. For
example, the header field:
Subject: This is a test
can be represented as:
Subject: This
is a test
The recommendation for no more than 78 characters in the subject header sounds reasonable. No one wants to scroll to see the entire subject line, and something important might get cut off on the right.
RFC2322 states that the subject header "has no length restriction"
but to produce long headers but you need to split it across multiple lines, a process called "folding".
subject is defined as "unstructured" in RFC 5322
here's some quotes ([...] indicate stuff i omitted)
3.6.5. Informational Fields
The informational fields are all optional. The "Subject:" and
"Comments:" fields are unstructured fields as defined in section
2.2.1, [...]
2.2.1. Unstructured Header Field Bodies
Some field bodies in this specification are defined simply as
"unstructured" (which is specified in section 3.2.5 as any printable
US-ASCII characters plus white space characters) with no further
restrictions. These are referred to as unstructured field bodies.
Semantically, unstructured field bodies are simply to be treated as a
single line of characters with no further processing (except for
"folding" and "unfolding" as described in section 2.2.3).
2.2.3 [...] An unfolded header field has no length restriction and
therefore may be indeterminately long.
after some test: If you send an email to an outlook client, and the subject is >77 chars, and it needs to use "=?ISO" inside the subject (in my case because of accents) then OutLook will "cut" the subject in the middle of it and mesh it all that comes after, including body text, attaches, etc... all a mesh!
I have several examples like this one:
Subject: =?ISO-8859-1?Q?Actas de la obra N=BA.20100154 (Expediente N=BA.20100182) "NUEVA RED FERROVIARIA.=
TRAMO=20BEASAIN=20OESTE(Pedido=20PC10/00123-125),=20BEASAIN".?=
To:
As you see, in the subject line it cutted on char 78 with a "=" followed by 2 or 3 line feeds, then continued with the rest of the subject baddly.
It was reported to me from several customers who all where using OutLook, other email clients deal with those subjects ok.
If you have no ISO on it, it doesn't hurt, but if you add it to your subject to be nice to RFC, then you get this surprise from OutLook. Bit if you don't add the ISOs, then iPhone email will not understand it(and attach files with names using such characters will not work on iPhones).
Limits in the context of Unicode multi-byte character capabilities
While RFC5322 defines a limit of 1000 (998 + CRLF) characters, it does so in the context of headers limited to ASCII characters only.
RFC 6532 explains how to handle multi-byte Unicode characters.
Section 3.4 ( Effects on Line Length Limits ) states:
Section 2.1.1 of [RFC5322] limits lines to 998 characters and
recommends that the lines be restricted to only 78 characters. This
specification changes the former limit to 998 octets. (Note that, in
ASCII, octets and characters are effectively the same, but this is
not true in UTF-8.) The 78-character limit remains defined in terms
of characters, not octets, since it is intended to address display
width issues, not line-length issues.
So for example, because you are limited to 998 octets, you can't have 998 smiley faces in your subject line as each emoji of this type is 4 octets.
Using PHP to demonstrate:
Run php -a for an interactive terminal.
// Multi-byte string length:
var_export(mb_strlen("\u{0001F602}",'UTF-8'));
// 1
// ASCII string length:
var_export(strlen("\u{0001F602}"));
// 4
// ASCII substring of four octet character:
var_export(substr("\u{0001F602}",0,4));
// '😂'
// ASCI substring of four octet character truncated to 3 octets, mutating character:
var_export(substr("\u{0001F602}",0,3));
// '▒'
I don't believe that there is a formal limit here, and I'm pretty sure there isn't any hard limit specified in the RFC either, as you found.
I think that some pretty common limitations for subject lines in general (not just e-mail) are:
80 Characters
128 Characters
256 Characters
Obviously, you want to come up with something that is reasonable. If you're writing an e-mail client, you may want to go with something like 256 characters, and obviously test thoroughly against big commercial servers out there to make sure they serve your mail correctly.
Hope this helps!
What's important is which mechanism you are using the send the email. Most modern libraries (i.e. System.Net.Mail) will hide the folding from you. You just put a very long email subject line in without (CR,LF,HTAB). If you start trying to do your own folding all bets are off. It will start reporting errors. So if you are having this issue just filter out the CR,LF,HTAB and let the library do the work for you. You can usually also set the encoding text type as a separate field. No need for iso encoding in the subject line.