postfix header_checks pcre (lookahead) - pcre

Currently a lot of spam mails containing links to trojans are going around. These mails are relatively good fakes; they mostly look like legitimate delivery announcements from delivery services like UPS or DHL.
But there is one significant difference: legitimate mails, e.g. from DHL, have a From header like From: "DHL name"<name@dhl.com>
The trojan's From header is like "DHL name"<name@any_domain.tld>
So I want to block any mail whose From header starts with From: "DHL but has any domain other than dhl after the @.
I think the following lookahead should fit: /^From: "dhl(?!.*@dhl)/ REJECT No trojans please
It doesn't work.
AFAIK the regexes in header_checks are not case-sensitive. To avoid any confusion with special characters like " and @, I tried the simpler form From: .dhl(?!.*dhl)
That doesn't work either.
Is there something wrong with my regex, or with my understanding of PCREs in Postfix?

Solved! The problem was my own fault. In main.cf, header_checks was configured with the regexp: table type rather than pcre:, so it did not support the extended syntax of PCRE (such as lookaheads).
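To check that the lookahead itself is sound, here is a minimal sketch in Python rather than Postfix. It assumes the usual @ as the address separator and uses re.IGNORECASE to imitate the case-insensitive matching mentioned in the question. The helper name is_suspicious is made up for illustration.

```python
import re

# Pattern from the question. re.IGNORECASE imitates the case-insensitive
# matching that the question expects from header_checks.
TROJAN_FROM = re.compile(r'^From: "dhl(?!.*@dhl)', re.IGNORECASE)


def is_suspicious(header_line: str) -> bool:
    """Return True when the From header claims DHL but has no @dhl domain."""
    return TROJAN_FROM.search(header_line) is not None


legit = 'From: "DHL name"<name@dhl.com>'
fake = 'From: "DHL name"<name@any_domain.tld>'

print(is_suspicious(legit))  # negative lookahead fails, so no match
print(is_suspicious(fake))   # no "@dhl" later in the line, so it matches
```

The expression only behaves this way in an engine that understands (?!...). In main.cf the table type prefix selects the engine, for example header_checks = pcre:/etc/postfix/header_checks (the path here is an assumption).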

Related

Is it possible to add another Unicode character for "at sign" without changing any code in the back-end of all the email providers?

So let's say for some reason we wanted to add another Unicode character for the at sign, and use it instead of @ with all the email providers.
Now I have three questions:
How do email providers parse the email address? Do they actually parse the written address until they see an @, with the @ symbol's Unicode code point hard-coded in the parser?
Do different service providers have different email parsers with different standards, or is there a standard parser library that every email provider uses?
Would it be possible to add another at sign symbol and use it in emails without having to make changes in all the email providers' code?
Yes, e-mail addresses are parsed using a hard-wired @ character. After almost fifty years of e-mail, there are literally millions of e-mail handling programs, and they all use this same syntax. So you're not going to be able to change this convention, and your second and third questions are moot.
E-mail addresses are parsed by dozens of different kinds of software, not just "email server" software inside "e-mail providers". Even things as trivial as client-side JavaScript highlighting for an e-mail field, of which there are easily tens of thousands around, would have to adapt.
An "@" is not a character class by itself, so even if there were a unique Unicode character class for "Unicode Separator", who would ever have written code that checks the character class of the separator? Have you ever done that, even for filtering punctuation out? (That is a real use case for the Unicode classification of characters, and even then it sees little use in real-world code.)
Now, of course, you are free to write email client code that presents the "@" as anything else when rendering e-mail data to users. Internally, if this software did not use "@", even for its own purposes, it would not work with anything else in the world, from antivirus software to text-based templates.
And finally, such a change would have little to do with Unicode itself. Unicode can standardize characters, but the e-mail protocols are a separate thing: the series of documents known as RFCs is what specifies the various internet protocols, including IMAP, POP and SMTP, the three protocols that make e-mail work. Even if new RFCs for all of these were published with a new character accepted in place of "@", it would likely take more than a decade until all the software described above was compliant enough for it to be used. (And yes, all of it would have to be changed.)
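As a rough illustration of what "hard-wired" means in practice, here is a sketch of the kind of splitting logic that a lot of address-handling code contains. The function split_address is hypothetical and not taken from any particular mail program. It looks for the literal "@" and nothing else, so a look-alike such as U+FF20 (FULLWIDTH COMMERCIAL AT) is simply not recognised.

```python
import unicodedata


def split_address(address: str):
    """Split on the last literal '@'; return None if it is missing."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return None
    return local, domain


print(split_address("user@example.com"))
print(split_address("user\uff20example.com"))  # look-alike sign, not parsed

# The Unicode general category does not single out the separator:
# "@" shares its category with ordinary punctuation such as "!".
print(unicodedata.category("@"), unicodedata.category("!"))
```

Every program with logic like this would need a code change before a second separator character could work.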

Exchange Server Transport Rule Failing Emails From .mil

I am using Exchange Server 2013 and have many transport rules set up to filter out emails from most countries outside of the US.
We recently received an email from a military address ending in .mil
The email was blocked by my transport rules but does not match any of the extensions I have listed. Except for possibly one! I have an extension to block '.il$'. So this should block ALL emails that end with ".il". However, if the transport rules use true regular expression rules, the "." would be a wildcard and match any character, including a "." itself. Is this the cause of my issue? I do not have a .mil email account to test with or I could check myself. I have added a character escape to my transport rule, making it '\.il$', hoping that it will fix this.
I read everything I can find about the regex rules for Exchange's Transport Rules, and I cannot find anything that mentions you must escape the dot. Maybe this is just a rare issue and they didn't foresee it occurring?
One of the documents I have read: https://technet.microsoft.com/en-us/library/aa997187(v=exchg.141).aspx
Long story short: YES, the dot (.) must be escaped with a \. Otherwise it is a single-character wildcard that matches any character (A-Z, a-z, 0-9, ".", ",", "/", etc.), just like in regular expressions. I assume that Microsoft applies every regular expression rule to the transport rules, but do not quote me on that.
I could not find this in any documentation I researched, and every example I have looked at on the web seems to get it wrong as well. The examples I see always say that ".com$" will block all emails from a sender ending in .com. This is true, because the unescaped dot can also match a literal dot. But it will also block any emails that end in "ecom", for example, which may become an issue if such an extension is ever released.
Sorry for answering my own question, but I want this to be here for future reference since it can't seem to be found anywhere else.
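The difference between the two patterns can be reproduced with any regular expression engine. The sketch below uses Python's re module, which is not the engine Exchange uses, so it only illustrates the general dot behaviour. The sender addresses are made-up examples.

```python
import re

unescaped = re.compile(r".il$")   # "." matches any single character
escaped = re.compile(r"\.il$")    # "\." matches only a literal dot

for sender in ("someone@army.mil", "someone@example.co.il"):
    print(sender, bool(unescaped.search(sender)), bool(escaped.search(sender)))
```

With the unescaped pattern the "m" in ".mil" is accepted by the wildcard, so both senders match. With the escaped pattern only the address ending in ".il" matches.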

postfix header_checks.pcre wrongly blocking iPhone emails

I have a postfix/dovecot mail server which has been working fine for a year or so but today one user came to me with his iPhone and said he couldn't send emails.
It turns out the emails were being rejected by my header_checks.pcre which I set up as per the example in http://www.postfix.org/header_checks.5.html
The error I got was something like:
Apr 30 09:48:28 mail06 postfix/cleanup[28849]: 53893A00CD: reject:
header Content-Type:
image/png;??name=email_logo.png;??x-apple-part-url="part22.05080008.04000601@mydomain.com"
from unknown[112.134.156.178]; from=
to= proto=ESMTP helo=<[192.168.1.12]>: 5.7.1
Attachment name
"email_logo.png;??x-apple-part-url="part22.05080008.04000601@mydomain.com"
may not end with ".com"
So it seems that the iPhone mail app was appending an "x-apple-part-url" suffix to the attachment name and the PCRE was mistakenly blocking this as a .com instead of allowing through a .png.
Does anyone know how I can safely modify the PCRE in http://www.postfix.org/header_checks.5.html to avoid this happening?
So far as I know ".com" is still a viable extension for Windows malware. The problem is that the second .* in the example PCRE in the Postfix documentation is spanning two parameters as if the .com ended the name or filename parameter.
According to RFC 2045, value := token / quoted-string. This means you need to cater for both the quoted and unquoted cases by providing appropriate character classes. You could split into two rules or, to save multiple lists of extensions, do something like:
/etc/postfix/header_checks.pcre:
    /^Content-(Disposition|Type).*name\s*=\s*
     ("(?:[^"]|\\")*|[^();:,\/<>\@\"?=<>\[\]\ ]*)
     ((?:\.|=2E)(
     ade|adp|asp|bas|bat|chm|cmd|com|cpl|crt|dll|exe|
     hlp|ht[at]|
     inf|ins|isp|jse?|lnk|md[betw]|ms[cipt]|nws|
     \{[[:xdigit:]]{8}(?:-[[:xdigit:]]{4}){3}-[[:xdigit:]]{12}\}|
     ops|pcd|pif|prf|reg|sc[frt]|sh[bsm]|swf|
     vb[esx]?|vxd|ws[cfh])(\?=)?"?)\s*(;|$)/x
        REJECT Attachment name $2$3 may not end with ".$4"
The new second line of the rule distinguishes between the quoted and unquoted cases and any closing quotation mark is absorbed into $3.
BTW I'd probably stick .mso, .xl, .ocx (obscure MS extensions) and .jar in there too. Obviously this check is useful against malware floods but doesn't substitute for an up-to-date antivirus or more detailed spam analysis.
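To see why the greedy form trips over the iPhone header while the quoted-string/token form does not, here is a reduced sketch in Python. It is not the Postfix rule itself: the extension list is cut down to three entries, the patterns are written without the /x flag, Python's re stands in for PCRE, and the header is written on one line with spaces where the log shows "??".

```python
import re

EXTS = r"(com|exe|bat)"  # shortened list, for illustration only

# Greedy shape described in the answer: the second .* can run across
# parameters until it finds a forbidden extension near the end of the line.
greedy = re.compile(
    r'^Content-(?:Disposition|Type).*name\s*=\s*"?(.*(?:\.|=2E)' + EXTS +
    r')(\?=)?"?\s*(;|$)', re.IGNORECASE)

# Shape of the answer's rule: the value is either a quoted-string or a
# token, so matching stops at the end of the name parameter.
strict = re.compile(
    r'^Content-(?:Disposition|Type).*name\s*=\s*'
    r'("(?:[^"]|\\")*|[^();:,/<>@"?=\[\] ]*)'
    r'((?:\.|=2E)' + EXTS + r'(\?=)?"?)\s*(;|$)', re.IGNORECASE)

iphone = ('Content-Type: image/png; name=email_logo.png; '
          'x-apple-part-url="part22.05080008.04000601@mydomain.com"')
bad_quoted = 'Content-Disposition: attachment; filename="invoice.com"'
bad_token = 'Content-Type: application/octet-stream; name=setup.exe'

for header in (iphone, bad_quoted, bad_token):
    print(bool(greedy.search(header)), bool(strict.search(header)))
```

The greedy pattern flags all three headers, including the harmless PNG. The strict pattern lets the PNG through and still flags the two genuinely suspicious names.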

Mandrill "reject_reason": "invalid-sender"

I'm trying to send emails using mandrill email service but I get the following error
Full Response
[
    {
        "email": "someemail@somedomain.com",
        "status": "rejected",
        "_id": "b814c2974594466cba9c904c54dca6c6",
        "reject_reason": "invalid-sender"
    }
]
Apart from the above error there are no more details about it. We are using .NET to send emails with Mandrill SMTP settings.
It'd be useful to see the call/email that's being sent. That error means that there's an invalid sender, as indicated in the reject reason field. That could be because of an invalid email address, invalidly-encoded from name, or invalid or broken encoding in other headers making it so that Mandrill can't parse the "from" header, but without seeing the actual email that you're sending, it's hard to say for sure exactly what the issue is.
You probably want to check that there's a fully-qualified domain name in the from email address, and that if the subject line is encoded, there aren't things like newline (\n) characters that break multibyte characters in the subject line. If you aren't able to identify the issue in the raw SMTP message, feel free to get in touch with support for further troubleshooting assistance.
I had the same problem. In my case, I had forgotten to fill in the template defaults "From Name" and "Subject".
I had the same problem. In my case the encoding of the headers was the problem. I changed the header encoding to UTF-8 and it worked. I was using C# SMTP and the code is below.
message.HeadersEncoding = Encoding.UTF8;
Hope it works!
For me, it was because my emails were coming from email@example.net1
Mandrill rejected me because of the 1 at the end. e+mail@example.net and email@example.neta are both valid and will be accepted.
My other tests just had blank From headers, so they were rejected as well. I didn't even realize these emails were being received by Mandrill until I logged in and checked the API logs.
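A rough pre-flight check along these lines can catch such senders before the message is submitted. Mandrill's actual validation rules are not stated anywhere in this thread, so the rule below (one "@", a dotted domain, an alphabetic last label) is only an assumption modelled on the cases mentioned above, and plausible_sender is a made-up helper name. It would, for instance, wrongly refuse punycode top-level domains.

```python
import re

# Hypothetical rule: a non-empty local part, exactly one "@", and a dotted
# domain whose last label is purely alphabetic. Mandrill's rules may differ.
SENDER_RE = re.compile(r"^[^@\s]+@[^@\s.]+(\.[^@\s.]+)*\.[A-Za-z]{2,}$")


def plausible_sender(address: str) -> bool:
    """Return True if the address passes the assumed sanity rule."""
    return bool(SENDER_RE.match(address))


for candidate in ("email@example.net1", "e+mail@example.net",
                  "email@example.neta", ""):
    print(repr(candidate), plausible_sender(candidate))
```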
I've had a similar problem recently. It was due to my use of certain characters in the message.from_name field. After searching through documentation and stack overflow, I couldn't find a list of forbidden characters, so although this doesn't necessarily pertain to your case, I thought I'd share this small list I compiled of some acceptable characters (not an exhaustive list):
a-z
A-Z
0-9
_, -, !, #, $, %, \, ^, &, *, +, =, {, }, ?, .
In JS, here's a RegExp that will match with forbidden characters (or, rather, any characters that aren't in the aforementioned list):
const pattern = /[^a-zA-Z0-9_\-!#$%\^&*+={}?.]/;
Hope this is helpful for anyone else stuck on this.
If you use the .NET SmtpClient, this may be caused by a bug in it: https://social.msdn.microsoft.com/Forums/vstudio/en-US/4d1c1752-70ba-420a-9510-8fb4aa6da046/subject-encoding-on-smtpclientmailmessage
A workaround that helped us:
use
message.SubjectEncoding = Encoding.Unicode;
instead of
message.SubjectEncoding = Encoding.UTF8;
This still applies in .NET Framework 4.7.2

Can an email address contain international (non-english) characters?

If it's possible, should I accept such emails from users and what problems to expect when I will be sending mails to such addresses?
Officially, per RFC 6532 - Yes.
For a quick explanation, check out Wikipedia on the subject.
Update 2015: Use RFC 6532
The experimental RFC 5335 has been obsoleted by RFC 6532, and
the latter has been set to "Category: Standards Track",
making it the standard.
Section 3.2 (Syntax Extensions to RFC 5322) has updated most text fields to
include (proper) UTF-8.
The following rules extend the ABNF syntax defined in [RFC5322] and
[RFC5234] in order to allow UTF-8 content.
VCHAR =/ UTF8-non-ascii
ctext =/ UTF8-non-ascii
atext =/ UTF8-non-ascii
qtext =/ UTF8-non-ascii
text =/ UTF8-non-ascii
; note that this upgrades the body to UTF-8
dtext =/ UTF8-non-ascii
The preceding changes mean that the following constructs now
allow UTF-8:
1. Unstructured text, used in header fields like
"Subject:" or "Content-description:".
2. Any construct that uses atoms, including but not limited
to the local parts of addresses and Message-IDs. This
includes addresses in the "for" clauses of "Received:"
header fields.
3. Quoted strings.
4. Domains.
Note that header field names are not on this list; these are still
restricted to ASCII.
Please note the explicit inclusion of Domains.
And the explicit exclusion of header names.
Also Note about NFKC:
The UTF-8 NFKC normalization form SHOULD NOT be used because
it may lose information that is needed to correctly spell
some names in some unusual circumstances.
And Section 3 start:
Also note that messages in this format require the use of the
SMTPUTF8 extension [RFC6531] to be transferred via SMTP.
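When deciding whether to accept such addresses, it can help to know which part is non-ASCII. The sketch below is a hypothetical helper, not part of any RFC. It assumes that a non-ASCII domain can be converted to its ASCII form before sending, while a non-ASCII local part needs the SMTPUTF8 extension along the whole delivery path.

```python
def classify(address: str) -> str:
    """Classify an address by where its non-ASCII characters are."""
    local, _, domain = address.rpartition("@")
    if not local.isascii():
        return "needs-smtputf8"      # the local part cannot be converted
    if not domain.isascii():
        return "idn-domain-only"     # the domain has an ASCII form
    return "ascii"


for address in ("user@example.com", "user@müller.com", "zoë@example.com"):
    print(address, classify(address))
```

In Python's own smtplib, for example, SMTPUTF8 is requested by passing it in mail_options, and it only works if the server advertises the extension.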
The problem is that some mail clients (server tools and/or desktop tools) don't support it and throw an 'invalid email' exception when you try to send mail to an address that contains umlauts, for example.
If you want full support, you can convert the domain part of the address to "punycode". This allows users to type in their addresses the usual way while you store them in the widely supported ASCII form.
Example: müller.com » xn--mller-kva.com
Both point to the same thing.
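The conversion can be sketched with Python's built-in "idna" codec, which implements the older IDNA 2003 rules. The helper to_ascii_domain is made up for illustration. It converts only the domain and leaves the local part untouched, since punycode is defined for domain names and not for the part before the @.

```python
def to_ascii_domain(address: str) -> str:
    """Return the address with its domain converted to punycode form."""
    local, sep, domain = address.rpartition("@")
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local}{sep}{ascii_domain}"


print(to_ascii_domain("info@müller.com"))
print(b"xn--mller-kva.com".decode("idna"))  # and back again
```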
I would assume yes since a number of top level domains already allow non ascii
characters for domains and since the domain is part of an email address, it's
perfectly possible. An example for such a domain would be www.öko.de
Short answer: yes.
They are allowed not only in the username but also in the domain name.
The answer is yes, but they need to be encoded specially.
Look at this. Read the part that refers to email-headers and RFC 2047.
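For reference, RFC 2047 encoded-words cover header text such as display names and subjects rather than the address itself. A minimal sketch with Python's standard email.header module follows; the display name is an arbitrary example.

```python
from email.header import Header, decode_header, make_header

# Encode a non-ASCII display name into an ASCII-only encoded-word.
encoded = Header("Zoë Façade", "utf-8").encode()

# Decode it again the way a receiving mail client would.
decoded = str(make_header(decode_header(encoded)))

print(encoded)
print(decoded)
```

The encoded form contains only ASCII characters, so it passes through mail systems that know nothing about UTF-8 headers.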
Not yet. The IETF plans to do this:
H-Online article: IETF planning internationalised email addresses; here is the RFC: SMTP Extension for Internationalized Email Addresses
Quote from H-Online (as it went down):
The Internet Engineering Task Force (IETF) has published three crucial documents for the standardisation of email address headers
that include symbols outside the ASCII character set. This means that
soon you'll be able to use Chinese characters, French accents, and
German umlauts in email addresses as well as just in the body of the
message. So if your name is Zoë and you work for a company that makes
façades, you might be interested in a new email address. But
representatives of providers are already moaning. They say there would
need to be an "upgrade mania" if the Unicode standard UTF-8 is to
replace the American Standard Code for Information Interchange (ASCII)
currently used as the general email language.
RFC 5335 specifies the use of UTF-8 in practically all email headers.
Changes would have to be made to SMTP clients, SMTP servers, mail user
agents (MUAs), software for mailing lists, gateways to other media,
and everywhere else where email is processed or passed along. RFC 5336
expands the SMTP email transport protocol. At the level of the
protocol, the expansion is labelled UTF8SMTP.
A new header field will be added as a sort of "emergency parachute" to
ensure that UTF-8 emails have a soft landing if they are thrown out
before reaching the recipient by systems that have not been upgraded.
The "OldAddress" is a purely ASCII address. But OldAddress is not to
be used as a channel for a second transfer attempt, but rather to make
sure that feedback is sent home.
Finally, RFC5337 ensures that correct messages are sent pertaining to
the delivery status of non-ASCII emails. The correct address of an
unreachable addressee must be sent back, even if further transport has
been refused. The email Address Internationalization (EAI) working
group is also working on a number of "downgrade mechanisms" for
various header fields and the envelope. If possible, original header
information is to be "packaged" and preserved.
Germany's DeNIC, the registrar for the ".de" domain, is nonetheless
taking this in its stride. "There is really not much we can do",
explained DeNIC spokesperson Klaus Herzig. DeNIC is instead paying
more attention to the update that the IETF is working on for the
standard of international domains – RFC3490, or IDNA2003 as it's
sometimes known. "We are not that happy about it because there is no
backwards compatibility," Herzig explained. When the update comes,
DeNIC says it will be throwing its weight behind the symbol "ß" - also
known as eszett - which has been overlooked up to now. The German
registrar also says that it may wait a bit before switching in light
of the lack of backward compatibility. Once the new standard is
running stably and registrars and providers have adopted it, the ß
will be added.
In contrast, experts believe that Chinese registrars in China and
Taiwan will quickly implement the change for internationalised email.
Representatives of CNIC and TWNIC are authors of the standards.
Chinese users currently have to write emails in ASCII to the left of
the @ and in Chinese characters to the right of it for Chinese
domains, which have already been internationalized.
(Monika Ermert)