Are single quotes legal in the name part of an email address? - email

For example:
jon.o'conner@example.com?

Yes, jon.o'conner@example.com is a valid email address according to RFC 5322.
From the Syntax section of the Wikipedia article on email addresses:
The local-part of the email address may use any of these ASCII characters:
Uppercase and lowercase English letters (a–z, A–Z)
Digits 0 to 9
Characters ! # $ % & ' * + - / = ? ^ _ ` { | } ~
Character . (dot, period, full stop) provided that it is not the first or last character, and provided also that it does not appear two or more times consecutively (e.g. John..Doe@example.com is not allowed).
(The syntax is formally defined in RFC 5322 section 3.4.1 and RFC 5321.)
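As a rough illustration of the rules quoted above, here is a minimal Python sketch that checks an unquoted local part against that character list and the dot restrictions. The function name is made up for this example, and it deliberately ignores quoted local parts and length limits, so treat it as a sketch rather than a full RFC 5322 validator.

```python
import string

# Characters allowed in an unquoted local part, per the list quoted above.
ALLOWED = set(string.ascii_letters + string.digits + "!#$%&'*+-/=?^_`{|}~")

def is_valid_local_part(local: str) -> bool:
    """Check the unquoted local-part rules: allowed characters plus dot placement."""
    if not local or local.startswith(".") or local.endswith("."):
        return False
    if ".." in local:
        return False
    return all(ch in ALLOWED or ch == "." for ch in local)

print(is_valid_local_part("jon.o'conner"))  # True
print(is_valid_local_part("John..Doe"))     # False
```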

Although the answer is correct according to RFC 5322, using the single quote (') causes problems in practice.
Since it is a string delimiter in many languages, too many automation and integration services fail when this character is used.
You will note that professional mail services like Gmail do not allow it.
I strongly suggest using the backtick (`) instead if you need it, but in practice it should be avoided.

The format for email addresses is defined in RFC 5322. The local part (i.e. the recipient) may use any of these ASCII characters:
Uppercase and lowercase English letters (a–z, A–Z)
Digits 0 to 9
Characters ! # $ % & ' * + - / = ? ^ _ ` { | } ~
Character . (dot, period, full stop) provided that it is not the first or last character, and provided also that it does not appear two or more times consecutively (e.g. John..Doe@example.com is not allowed).
From this, you can see that single quotes are valid in the recipient address.

Related

What is the proper way of denoting URI with diacritics (letters with accents)?

What is the correct and official way of using diacritics in URI?
I have 3 different ways shown below:
Here á = %E1, â = %E2, space = %20, comma = %2C, but this link doesn't work properly since the characters are mangled:
http://www.recordspreservation.org/cgi-bin/list_directory_1.cgi?directory=%2CBrasil%2CGoi%E1s%2CLuzi%E2nia%2CSanta%20Luzia%2CBatismos%201749-1753%2CImagens&image_name=_MG_5229.JPG
Here space = %20, comma = %2C and I don't do anything with the a's. This link works:
http://www.recordspreservation.org/cgi-bin/list_directory_1.cgi?directory=%2CBrasil%2CGoiás%2CLuziânia%2CSanta%20Luzia%2CBatismos%201749-1753%2CImagens&image_name=_MG_5229.JPG
Here space = +, comma = %2C and I don't do anything with the a's. This link works:
http://www.recordspreservation.org/cgi-bin/list_directory_1.cgi?directory=%2CBrasil%2CGoiás%2CLuziânia%2CSanta+Luzia%2CBatismos+1749-1753%2CImagens&image_name=_MG_5229.JPG
The characters in a URL string must be within a restricted subset of 7-bit ASCII, and no encoding is specified for wide characters.
Some of that set are unreserved, and may be used literally anywhere the syntax allows.
The remaining characters are reserved because they form part of the URL syntax; reserved characters must be percent-encoded if they are used outside their syntactical meaning.
Eight-bit characters that are in neither the reserved nor the unreserved categories must always be percent-encoded.
##Unreserved characters
0 to 9
A to Z
a to z
-
.
_
~
##Reserved characters
! - %21
# - %23
$ - %24
& - %26
' - %27
( - %28
) - %29
* - %2A
+ - %2B
, - %2C
/ - %2F
: - %3A
; - %3B
= - %3D
? - %3F
@ - %40
[ - %5B
] - %5D
This link doesn't work properly since the characters are mangled
That is a problem between the client and the server. It looks like you're sending ISO-8859-1 characters, in which scheme E1 and E2 correspond to a acute (á) and a circumflex (â). But if your server is expecting UTF-8 encoding, then those should appear as the byte sequences C3 A1 and C3 A2.
I can't tell what encoding is expected by your server, but it clearly isn't what you're sending. The current standard is to encode non-ASCII characters in UTF-8 and percent-encode the resulting bytes.
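To make the difference between the two encodings concrete, here is a small Python sketch (not part of the original Perl answer) using the standard library's urllib.parse.quote. The variable names are only illustrative; the directory value is the one from the question.

```python
from urllib.parse import quote

name = "Goiás"

# ISO-8859-1: á is the single byte E1.
latin1 = quote(name, encoding="iso-8859-1")
# UTF-8: á is the two-byte sequence C3 A1.
utf8 = quote(name, encoding="utf-8")

print(latin1)  # Goi%E1s
print(utf8)    # Goi%C3%A1s

# Encoding the whole query value as UTF-8; safe="" also encodes "," and "/".
directory = ",Brasil,Goiás,Luziânia,Santa Luzia,Batismos 1749-1753,Imagens"
encoded_directory = quote(directory, safe="")
print(encoded_directory)
```

Whether the server accepts the UTF-8 form still depends on what it expects, as noted above.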
###Update
The best solution is to use the URI module, which will encode character strings as necessary.
Take special note that, if you need to use UTF-8-encoded characters in your source code, as below, then you must have use utf8 at the top of your program. You also need to make sure that your editor is writing UTF-8 data to the program file.
use utf8;
use strict;
use warnings 'all';
use feature 'say';
use URI;
my $url = URI->new('http://www.recordspreservation.org/cgi-bin/list_directory_1.cgi?directory=,Brasil,Goiás,Luziânia,Santa Luzia,Batismos 1749-1753,Imagens&image_name=_MG_5229.JPG');
say $url;
###output
http://www.recordspreservation.org/cgi-bin/list_directory_1.cgi?directory=,Brasil,Goi%C3%A1s,Luzi%C3%A2nia,Santa%20Luzia,Batismos%201749-1753,Imagens&image_name=_MG_5229.JPG

YANG model Special Characters includes #

How can I use # as a special character in a name field in a YANG file?
I am using type string, which accepts every ASCII special character on the keyboard except #.
Is # some kind of keyword, or does it carry some special meaning in the YANG modeling language?
I'm assuming your issue happens during YANG modeling, not during instance document validation.
No, the # character does not have a special meaning in YANG modules. You are most likely attempting to use this character in a YANG identifier, which is not valid. YANG identifiers, such as the arguments to the container, leaf, leaf-list and list statements, have to follow this grammar:
;; An identifier MUST NOT start with (('X'|'x') ('M'|'m') ('L'|'l'))
identifier = (ALPHA / "_")
*(ALPHA / DIGIT / "_" / "-" / ".")
ALPHA = %x41-5A / %x61-7A
; A-Z / a-z
DIGIT = %x30-39
; 0-9
The first character must be an underscore or a letter, and may be followed by letters, digits, underscores, dots and hyphens. An identifier must also not start with xml regardless of letter case.
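If it helps, the grammar above can be approximated with a regular expression. This is a minimal Python sketch; the function name is invented for illustration, and it only covers the identifier rules quoted here, not the rest of YANG.

```python
import re

# First character: letter or underscore; then letters, digits, "_", "-", ".".
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")

def is_valid_yang_identifier(name: str) -> bool:
    """Apply the identifier grammar quoted above, including the 'xml' prefix rule."""
    if name[:3].lower() == "xml":
        return False
    return IDENTIFIER_RE.fullmatch(name) is not None

print(is_valid_yang_identifier("host-name"))  # True
print(is_valid_yang_identifier("host#name"))  # False
print(is_valid_yang_identifier("XmlConfig"))  # False
```

Note that this restricts identifiers only. A leaf of type string can still hold a value containing #.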

Mail subject decoding?

I have an email subject like this:
Subject: =?gbk?Q?=B3=F6=C3=C0=C1=E2=C7=BF=C1=A6=B3=E9=CA=AA=BB=FA=D2=BB=CC=A8?=
=?gbk?Q?=A3=AC=D6=E9=BA=A3=B9=E3=D6=DD=C9=FA=BB=EE=B1=D8=B1=B8?=
But I don't know what kind of encoding this is.
Could someone help? I'm new to email protocols.
This subject is encoded in GBK, an extension of the GB2312 character set for simplified Chinese characters, used in the People's Republic of China.
As defined in RFC 1342 (since superseded by RFC 2047), non-ASCII text in Internet message headers has to be represented with the MIME encoded-word syntax:
encoded-word = "=" "?" charset "?" encoding "?" encoded-text "?" "="
charset = token ; legal charsets defined by RFC 1341
encoding = token ; Either "B" or "Q"
token = 1*<Any CHAR except SPACE, CTLs, and tspecials>
tspecials = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / "\" /
<"> / "/" / "[" / "]" / "?" / "." / "="
encoded-text = 1*<Any printable ASCII character other than "?" or
; SPACE> (but see "Use of encoded-words in message
; headers", below)
The "B" encoding:
The "B" encoding is identical to the "BASE64" encoding defined by
RFC 1341.
The "Q" encoding:
The "Q" encoding is similar to the "Quoted-Printable" content-
transfer-encoding defined in RFC 1341. It is designed to allow text
containing mostly ASCII characters to be decipherable on an ASCII
terminal without decoding.
(1) Any 8-bit value may be represented by a "=" followed by two
hexadecimal digits. For example, if the character set in use
were ISO-8859-1, the "=" character would thus be encoded as
"=3D", and a SPACE by "=20". (Upper case should be used for
hexadecimal digits "A" through "F".)
(2) The 8-bit hexadecimal value 20 (e.g., ISO-8859-1 SPACE) may be
represented as "_" (underscore, ASCII 95.). (This character may
not pass through some internetwork mail gateways, but its use
will greatly enhance readability of "Q" encoded data with mail
readers that do not support this encoding.) Note that the "_"
always represents hexadecimal 20, even if the SPACE character
occupies a different code position in the character set in use.
(3) 8-bit values which correspond to printable ASCII characters other
than "=", "?", and "_" (underscore), MAY be represented as those
characters. (But see section 5 for restrictions.) In
particular, SPACE and TAB MUST NOT be represented as themselves
within encoded words.
In your subject:
Subject:
=?gbk?Q?=B3=F6=C3=C0=C1=E2=C7=BF=C1=A6=B3=E9=CA=AA=BB=FA=D2=BB=CC=A8?= =?gbk?Q?=A3=AC=D6=E9=BA=A3=B9=E3=D6=DD=C9=FA=BB=EE=B1=D8=B1=B8?=
We can see that the "Q" encoding (the Q in =?gbk?Q?), which is similar to Quoted-Printable, has been used, hence the presence of = as the escape character instead of %.
You can find an online encoder here, and an online MIME Headers Decoder here.
Finally, here is your decoded subject:
Subject: 出美菱强力抽湿机一台,珠海广州生活必备
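If you would rather decode it in code than with an online tool, Python's standard email.header module understands the encoded-word syntax. This is a minimal sketch, assuming the two encoded words from the question have been joined with a single space; the variable names are only illustrative.

```python
from email.header import decode_header, make_header

# The two encoded words from the question, separated by whitespace.
raw_subject = (
    "=?gbk?Q?=B3=F6=C3=C0=C1=E2=C7=BF=C1=A6=B3=E9=CA=AA=BB=FA=D2=BB=CC=A8?= "
    "=?gbk?Q?=A3=AC=D6=E9=BA=A3=B9=E3=D6=DD=C9=FA=BB=EE=B1=D8=B1=B8?="
)

# decode_header splits the value into (bytes, charset) parts;
# make_header turns those parts back into a Unicode string.
subject = str(make_header(decode_header(raw_subject)))
print(subject)
```

This should print the same Chinese text as the decoded subject given above.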

Extract text from email then send text

I have alerts set up with my bank for whenever a transaction occurs. I have been trying to extract only the date and the amount and forward that as a text message to myself.
Here is what the alert email looks like:
FIRSTNAME LAST NAME
A transaction has been posted to your BANKNAME ACCOUNTNAME, and is within the parameters you set for triggering this alert.
The transaction was on 06/20/2014 in the amount of ($40.00). For recent account history, including transaction descriptions and running balances, sign on to BANKNAME Account Manager (online banking) and click on the account name.
BANKNAME Disclaimer: This transmittal is intended only for the use of the individual or entity to which it is addressed and may contain information that is privileged, confidential and exempt from disclosure under applicable law. If the reader of this transmittal is not the intended recipient, or the employee or agent responsible for delivering the transmittal to the intended recipient, you are notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this e-mail in error, please immediately notify the sender by e-mail and delete this message from your computer.
I have tried grep, awk, and sed, but I can only get the entire line to display.
:~# nawk '/The transaction was on/,/For recent account history/' alert.txt
The transaction was on 06/20/2014 in the amount of ($40.00). For recent account history, including transaction descriptions and running balances, sign on to BANKNAME Account Manager (online banking) and click on the account name.
What can I do to change the command to extract only the date and the amount so that the result would look something like this:
06/20/2014 $40.00
The plan is to take that output and send it to myself as a text message.
You could try the below grep command to get the date and the amount,
$ grep -oP '\d{2}\/\d{2}\/\d{4}|\$[^\)]*' file | paste -d' ' - -
06/20/2014 $40.00
You could do it also in GNU sed,
$ sed -nr 's~^.*([0-9]{2}\/[0-9]{2}\/[0-9]{4}).*\((\$[^)]*)\).*$~\1 \2~p' file
06/20/2014 $40.00
Try
awk -v RS=' ' '/[0-9]+\/[0-9]+\/[0-9]+/ {d=$0} /\$[0-9]+\.[0-9]+/ {print d, substr($0, 2, length($0) - 3); exit}' alert.txt
Explanation:
/[0-9]+\/[0-9]+\/[0-9]+/
Matches 1 or more digits, a slash, 1 or more digits, a slash, and 1 or more digits.
[0-9] matches a single digit character in 0, 1, 2, ..., 9
+ causes the previous entity to be matched 1 or more times
\/ is a literal slash (the backslash "escapes" it so it doesn't terminate
the pattern)
/\$[0-9]+\.[0-9]+/
Matches a dollar sign, 1 or more digits, a period, and 1 or more digits.
\$ matches a literal dollar sign (a dollar sign is otherwise an anchor matching
the end of the string)
\. matches a literal period (a period otherwise matches any character)
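For comparison, the same extraction can be sketched in Python with one regular expression and two capture groups. The function name is made up for this example, and allowing commas in the amount (for values like $1,234.56) is an assumption, since the sample alert only shows ($40.00).

```python
import re

# Group 1: MM/DD/YYYY date. Group 2: dollar amount inside the parentheses.
# Allowing "," in the amount is an assumption not covered by the sample alert.
PATTERN = re.compile(r"(\d{2}/\d{2}/\d{4}).*?\((\$[\d,]+\.\d{2})\)")

def extract_date_amount(text):
    """Return 'date amount' from the alert text, or None if nothing matches."""
    match = PATTERN.search(text)
    if match is None:
        return None
    return f"{match.group(1)} {match.group(2)}"

alert_line = (
    "The transaction was on 06/20/2014 in the amount of ($40.00). "
    "For recent account history, including transaction descriptions "
    "and running balances, sign on to BANKNAME Account Manager."
)
print(extract_date_amount(alert_line))  # 06/20/2014 $40.00
```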

Amazon Signature Encoding

The Amazon documentation for "creating a signature" has some pretty specific requirements. In particular, it asks me to:
URL encode the parameter name and values according to the following rules:
Do not URL encode any of the unreserved characters that RFC 3986 defines. These unreserved characters are A-Z, a-z, 0-9, hyphen ( - ), underscore ( _ ), period ( . ), and tilde ( ~ ).
Percent encode all other characters with %XY, where X and Y are hex characters 0-9 and uppercase A-F.
Percent encode extended UTF-8 characters in the form %XY%ZA....
Percent encode the space character as %20 (and not +, as common encoding schemes do).
Does this encoding have a name?
I still don't know if the encoding has a name, but it is defined by RFC 3986. Once I knew that, finding a library was easy.
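For example, in Python the rules quoted in the question can be met with the standard library's urllib.parse.quote by marking only the unreserved punctuation as safe. This is a sketch; the function name is hypothetical and it covers only the encoding step, not the rest of the signing process.

```python
from urllib.parse import quote

def percent_encode_unreserved_only(value: str) -> str:
    """Percent-encode everything except A-Z, a-z, 0-9, '-', '_', '.', '~'."""
    # quote() encodes str input as UTF-8 and emits uppercase hex digits.
    # Letters and digits are never encoded; safe lists the other unreserved characters.
    return quote(value, safe="-_.~")

print(percent_encode_unreserved_only("a b+c~d/é"))  # a%20b%2Bc~d%2F%C3%A9
```

Passing safe explicitly matters here because the default also leaves "/" unencoded.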