How to put some text into procmail forwarded e-mail?

For a couple of days, I've been trying to write a procmail script.
I want to forward messages and inject some text into the message contents.
What I want to accomplish:
someone sends me an e-mail with the word "weather" in the subject
the e-mail is forwarded to the address "mymail@somedomain.com"
every forwarded e-mail gets some added text in its contents
But so far, no success.
In .procmail.log, there is the message "procmail: Missing action".
SHELL=/bin/bash
VERBOSE=off
LOGFILE=/home/test/.procmail.log
LOGDATE_=`/bin/date +%Y-%m-%d`
:0
* ^Subject:.*weather
:0 bfw
| echo "This is injected text" ; echo "" ; cat
:0 c
! mymail@somedomain.com
When I looked at the e-mail source, I saw that the text is injected, but in the wrong place.
Take a look:
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="------------148F3F0AD3D65DD3F3498ACA"
Content-Language: pl
Status:
X-EsetId: 37303A29AA1D9F60667466
This is injected text
This is a multi-part message in MIME format.
--------------148F3F0AD3D65DD3F3498ACA
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
CONTENT CONTENT CONTENT
*********************************************************
The injected text should be placed where the content is. Right now it lands above it ...

You don't explain your code, but it looks like you are attempting to use multiple actions under a single condition. Use braces for that.
:0
* ^Subject:.*weather
{
:0 bfw
| echo "This is injected text" ; echo "" ; cat
:0 c
! mymail@somedomain.com
}
Just to summarize, every recipe must have a header line (the :0 and possible flags) and an action. The conditions are optional, and there can be more than one. A block of further recipes is one form of action so that satisfies these requirements (the other action types are saving to a folder, piping to a command, or forwarding to an email address).
To inject text at the top of the first MIME body part of a multipart message, you need to do some MIME parsing. Procmail unfortunately has no explicit support for MIME, but if you know that the incoming message will always have a particular structure, you might get away with something fairly simple.
:0
* ^Subject:.*weather
{
:0fbw
* ^Mime-version: 1\.0
* ^Content-type: multipart/
| awk '/^Content-type: text\/plain;/&&!s {n=s=1} \
n&&/^$/{n=0; p=1} \
1; \
p{ print "This is injected text.\n"; p=0 }'
:0 c
! mymail@somedomain.com
}
The body (which contains all the MIME body parts, with their headers and everything) is passed to a simple Awk script, which finds the first empty line after (what we optimistically assume is) the first text/plain MIME body part header, and injects the text there. (Awk is case-sensitive, so the regex text might need to be adapted or generalized, and I have assumed the whitespace in the input message is completely regular. For a production system, these simplifying assumptions are unrealistic.)
If you need full MIME support (for example, the input message may or may not be multipart, or may contain nested multiparts), my recommendation would be to write the injection code in a modern scripting language with proper MIME support libraries. Python would be my pick, though it is still (even after the email library update in 3.6) slightly cumbersome and clumsy.
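As a rough illustration of that recommendation, here is a minimal sketch using Python's standard email package. The helper name inject_text is made up for this example, and the sketch assumes that rewriting the first non-attachment text/plain part as UTF-8 is acceptable. It is one possible approach under those assumptions, not a drop-in replacement for the recipe above.

```python
from email import message_from_bytes, policy


def inject_text(raw_message: bytes, text: str) -> bytes:
    """Prepend `text` to the first non-attachment text/plain part.

    `inject_text` is a hypothetical helper name, not part of any library.
    """
    msg = message_from_bytes(raw_message, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() == "text/plain" and not part.is_attachment():
            original = part.get_content()
            # set_content() replaces the payload and regenerates the
            # Content-* headers of this part only (assumption: UTF-8 is fine).
            part.set_content(text + "\n\n" + original, charset="utf-8")
            break
    return msg.as_bytes()
```

A script built around this could read the message from standard input (sys.stdin.buffer) and write the result to standard output, so that Procmail can call it as a filter for the whole message before the forwarding recipe runs.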

Related

How to send mail with text of different colors from Unix using command 'mail'?

I would like to send a message from a Unix server. I use the 'mail' command:
echo "MESSAGE_BODY" | mail -s "MESSAGE_TITLE" somebody@gmail.com
That works fine.
Next, I want to send a message with different colors. I tried this command:
echo "<font color="red">MESSAGE_BODY</font>" | mail -s "MESSAGE_TITLE" somebody@gmail.com
But it didn't help. How can I use colors?
A "one-liner" with the correct answer has already been posted,
but I still feel it's better to explain how and why.
The reason you can't just echo HTML code directly into your mail is that the receiving client doesn't know how to display it. It will most likely fall back to plain text, and all you would see when viewing the message is your HTML code.
What you need to do is tell the client that the content of your message is HTML. You do this by adding the correct MIME headers to the message.
Content-Type: text/html; charset=UTF-8
MIME-Version: 1.0
Notice that you can also set the charset here.
The MIME-Version header is there for better compatibility; also, some SMTP servers will give you a higher spam score if you don't obey the RFC :)
With these headers set, all "BODY" content will be treated as HTML.
I don't just want to provide a "one-liner"; I think showing more in a script makes it easier to read.
So how about this:
(
echo "From: my@email.tld";
echo "To: some@email.tld";
echo "Subject: Test html mail";
echo "Content-Type: text/html";
echo "MIME-Version: 1.0";
echo "";
echo "<strong>Testing</strong><br><font color=\"blue\">I'm Blue :)</font>";
) | sendmail -t
Well, technically it's still a one-liner :) But it looks nicer and you can see what's going on!
Bonus information
If you want to have both HTML and TEXT bodies, you need to look into multipart content types. I have included an example, but you would probably need to read up on this if you don't know much about multipart types.
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="0001boundary text"

--0001boundary text
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

The TEXT body goes here
--0001boundary text
Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

<strong>HTML code goes here</strong>
--0001boundary text--
As you can see, it's no longer a simple mail body.
But I wanted to show you how it's done in case you want to give it a go.
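If writing the multipart structure by hand feels error-prone, a MIME library can generate the boundaries and headers for you. The following is a small sketch with Python's standard email.message.EmailMessage. The function name build_html_mail and the example.com addresses are placeholders invented for this example.

```python
from email.message import EmailMessage


def build_html_mail(sender, recipient, subject, text_body, html_body):
    """Build a multipart/alternative message (hypothetical helper name)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(text_body)                      # becomes the text/plain part
    msg.add_alternative(html_body, subtype="html")  # adds the text/html part
    return msg


# Placeholder addresses; nothing is sent here, the message is only printed.
example = build_html_mail(
    "sender@example.com",
    "recipient@example.com",
    "Test html mail",
    "Testing\nI'm Blue :)",
    '<strong>Testing</strong><br><font color="blue">I\'m Blue :)</font>',
)
print(example.as_string())
```

The printed text can be piped to sendmail -t just like the shell version above. The library picks the boundary string and the transfer encodings, so the output will not look identical to the hand-written example.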
echo '<font color="green">Message body</font>' | mail -s "$(echo -e "Message title\nContent-Type: text/html")" somebody@gmail.com

How to extract email body and attachment

I am trying to extract a message from a multipart email body or from an attachment, so I used :0B to try each option, like the following:
msgID=""
#extract message in the attachment if it's plain text
:0B
* ^Content-Disposition: *attachment.*(($)[a-z0-9].*)*($)($)\/[a-z0-9+]+=*
{msgID="$MATCH"}
#extract message in the body if it's there
:0EB
* ^()\/[a-z]+[0-9]+[^\+]
{msgID = "$MATCH"}
But msgID got the same message from the body, which was inline image code. What's wrong with it? Does anyone know a better condition to filter it?
I also need to detect whether the sub-header in the body is text and base64 encoded, and then decode it. How do I express that with a regex?
:0B
* ^Content-Type:text/html;
* ^Content-Location:text_0.txt
* ^Content-Transfer-Encoding:base64
* ^Content-Disposition: *attachment.*(($)[a-z0-9].*)*($)($)\/[a-z0-9+]+=*
{ msgID= msgId =`printf '%s' "$MATCH" | base64 -d` }
It always complains that there is no match for ^Content-Type:text/html;
I'm guessing you are trying to say that there are two types of incoming messages. One looks something like this:
From: Sender <there@example.net>
To: You <AmyX@example.com>
Subject: plain text
ohmigod0
And the other is a complex MIME multipart with the same contents:
From: Sender <there@example.net>
To: Amy X <AmyX@example.com>
Subject: MIME complexity
MIME-Version: 1.0
Content-Type: multipart/related; boundary=12345
--12345
Content-type: text/plain; charset="us-ascii"
Content-transfer-encoding: base64
Content-disposition: attachment; filename="text_0.txt"
Content-location: text_0.txt
b2htaWdvZDA=
--12345--
If this is correct, you would want to create a recipe to handle the more complex case first, because it has more features -- if your regex hits, it's unlikely to be a false positive. If not, fall back to the simpler pattern, and assume there will never be any false positives on this (perhaps because this account only receives email from a single system).
# extract message in the attachment if this is a MIME message
:0B
* ^Content-Disposition: *attachment.*(($)[a-z0-9].*)*($)($)\/[a-z0-9+]+=*
{ msgID="$MATCH" } # hafta have spaces inside the braces
:0EB # else, do this: assume the first non-empty body line is msgID
* ^()\/[a-z]+[0-9]+[^\+]
{ msgID="$MATCH" } # still need spaces inside braces;
# ... and, as pointed out many times before, cannot have spaces
# around the equals sign
The regular expression for the attachment is an oversimplification, but I already showed you how to cope with a complex MIME message in a previous question of yours. If you have multiple cases (for example, a base64-encoded attachment, or just a plain-text attachment, or no MIME), I would arrange them from more complex (meaning more features in the regex) to simpler, falling back successively to regexes with a higher chance of false positives. You can chain :0E ("else") cases for as long as you like; if a regex succeeds and the following recipes are :0E recipes, they will all be skipped.
In response to your update, there are two problems with your attempt. The first, as you note, is that the first regex doesn't match. You have no space after the colon, and I'm guessing there is one in the message you are matching against. You need to understand that every character in a regex needs to match exactly, with the exception of regex metacharacters, which have special meaning. You would typically see something like this in many Procmail recipes:
* ^Content-Type:[ ]*text/html;
where the spaces between the square brackets are a space and a tab. The character class (the stuff in the square brackets) matches either character once, and the asterisk * says to repeat this pattern zero or more times. This allows for arbitrary spacing after the colon. The square brackets and the star are metacharacters. (This is very basic stuff which should be in any Procmail introduction you may have read.)
Your other problem is that each regex is applied in isolation. So your recipe says, if the Content-Type header appears anywhere in the body, and the Content-Location header appears anywhere else (typically, in another MIME header somewhere) etc. In other words, your recipe is very prone to false positives. This is why the rule I proposed earlier is so complex: It looks for these headers in sequence, in a single block, that is, in a single MIME header (though there is nothing to actually make sure that the context is a MIME body part header; more on that in a bit).
Because we want to ensure that there are four different headers, in any order, the regex for this is going to be huge: ABCD|ABDC|ACBD|ACDB|ADBC|ADCB|BACD|... where A is the Content-Type header regex, B is the Content-Location regex, etc. You could cheat a little bit and craft a single regex which matches a sequence of four matches of the same header-identifying regex -- this is unlikely to cause any false positives (there is no sane reason to have two copies of the same header) and simplifies the code significantly, though it's still complex. Pay attention here: We want to create a single regex which matches any one out of these four headers.
^Content-(Type:[ ]*text/plain;|\
Location:[ ]*text_0\.txt|\
Transfer-Encoding:[ ]*base64|\
Disposition:[ ]*attachment)
... followed by any header, repeated four times, followed by the MIME body part (which you had after the Content-Disposition header, slightly out of context, but not incorrectly per se).
(Your code has text/html but if the attachment isn't HTML, as suggested by the format and the filename, it should be text/plain; so I'm going with that instead.)
Before we go there, I'll point out that MIME parsing in Procmail is not done a lot, precisely because it tends to explode into enormously complex regular expressions. MIME has a lot of options, and you need each regex to allow for omission or inclusion of each optional element. There are options for how to encode things (base64, or quoted-printable, or not encoded at all) and options to include or omit quotes around many elements, and options to use a multipart message with one or more body parts or just put the data in the body, like in my constructed first example message (which is still technically a MIME message; its implied content type is text/plain; charset="us-ascii" and the default content transfer encoding is 7bit, which conveniently happens to be what email before MIME always had to look like).
So unless you are in this because (a) you really, really want to learn the deepest secrets of Procmail or (b) you are on a very constrained system where you have to because there is nothing else you can use, I would seriously suggest that you move to a language with a proper MIME parser. A Python script which decodes this would be just half a dozen lines or so, and you get everything normalized and decoded nicely for you with no need for you to reinvent quoted-printable decoding or character set translation. (You can still call the Python script from Procmail if you like.)
I'll also point out here that a proper MIME parser would extract the boundary= parameter from the top-level headers in a multipart message, and make sure any matching on body part headers only occurs immediately after a boundary separator. The following Procmail code does not do that, so we could get a false positive if a message contains a match somewhere else than in the MIME body part headers (such as, for example, if a bounce message contains a fragment of the MIME headers of the bounced message; in this case, you would like for the recipe not to match, but it will).
:0B
* ^(Content-(Type:[ ]*text/plain;|\
Location:[ ]*text_0\.txt|\
Transfer-Encoding:[ ]*base64|\
Disposition:[ ]*attachment).*(($)[a-z0-9].*)*)($)\
(Content-(Type:[ ]*text/plain;|\
Location:[ ]*text_0\.txt|\
Transfer-Encoding:[ ]*base64|\
Disposition:[ ]*attachment).*(($)[a-z0-9].*)*)($)\
(Content-(Type:[ ]*text/plain;|\
Location:[ ]*text_0\.txt|\
Transfer-Encoding:[ ]*base64|\
Disposition:[ ]*attachment).*(($)[a-z0-9].*)*)($)\
(Content-(Type:[ ]*text/plain;|\
Location:[ ]*text_0\.txt|\
Transfer-Encoding:[ ]*base64|\
Disposition:[ ]*attachment).*(($)[a-z0-9].*)*)($)\
($)\/[a-z0-9/+]+=*
{ msgID=`printf '%s' "$MATCH" | base64 -d` }
:0BE
* ^^\/[a-z]+[0-9]*[^\+]
{ msgID="$MATCH" }
(Unfortunately, Procmail's regex engine doesn't have the {4} repetition operator, so we have to repeat the regex literally four times!)
As noted before, Procmail, unfortunately, doesn't know anything about MIME. As far as Procmail is concerned, the top-level headers are headers, and everything else is body. There have been attempts to write MIME libraries or extensions for Procmail, but they don't tend to reduce complexity, just shuffle it around.
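For comparison, here is roughly what the "move to a language with a proper MIME parser" route could look like. This is a sketch only. The function name extract_msgid is invented for this example, and it assumes the ID is either the content of a text/plain attachment or the first non-empty line of a plain-text body, as in the two sample messages above.

```python
from email import message_from_bytes, policy


def extract_msgid(raw: bytes) -> str:
    """Return the message ID payload (hypothetical helper name).

    Prefers a text/plain attachment, which the library decodes from
    base64 or quoted-printable automatically. Otherwise falls back to
    the first non-empty line of the plain-text body.
    """
    msg = message_from_bytes(raw, policy=policy.default)
    for part in msg.walk():
        if part.is_attachment() and part.get_content_type() == "text/plain":
            return part.get_content().strip()
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body is not None else ""
    for line in text.splitlines():
        if line.strip():
            return line.strip()
    return ""
```

Because the parser only looks at real body part headers, a bounce message that merely quotes similar header lines inside its text would not be mistaken for an attachment here.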

What is this Perl string encoded in?

I'm using Mail::IMAPClient to retrieve mail headers from an IMAP server. It works great. But when the header contains any character other than [a-zA-Z0-9], I'm served strings that look like this:
Subject : Un message en =?UTF-8?B?ZnJhbsOnYWlzIMOgIGxhIGNvbg==?= (original string : "Un message en français à la con")
Body :
=C3=A9aeio=C3=B9=C3=A8=C3=A8 (original string : éaeioùèè)
What is this strange format? Is that the famous "Perl internal string" format?
What is the safest way of handling human-language text coming from IMAP servers?
The body encoding is Quoted-Printable; the header (subject) encoding is MIME "encoded-word" encoding ("B" type for base64). The best way to deal with both of them is to pass the email into a module that's capable of dealing with MIME, such as Email::MIME or the older and buggier MIME::Lite.
For example:
# $message was retrieved from IMAP
my $mime = Email::MIME->new($message);
my $subject = $mime->header('Subject'); # automatically decoded
my $body = $mime->body_str; # also automatically decoded
However if you need to deal with them outside of the context of an entire message, there are also modules like Encode::MIME::Header and MIME::QuotedPrint.
It is quoted-printable encoded, a standard encoding used in email. It has nothing to do with Perl's internal string format.
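The same idea (let a MIME-aware parser do both decodings) is not specific to Perl. As an illustration only, here is a sketch with Python's standard email package. The raw message is reconstructed from the strings in the question, and the Content-Type and Content-Transfer-Encoding headers are assumptions about what the real message carried.

```python
from email import message_from_string, policy

# Reconstructed sample; the two Content-* headers are assumed, not taken
# from the original message.
raw = (
    "Subject: Un message en =?UTF-8?B?ZnJhbsOnYWlzIMOgIGxhIGNvbg==?=\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "=C3=A9aeio=C3=B9=C3=A8=C3=A8\n"
)

msg = message_from_string(raw, policy=policy.default)
subject = str(msg["Subject"])  # encoded-word header, decoded by the parser
body = msg.get_content()       # quoted-printable body, decoded to text

print(subject)
print(body)
```

With policy.default, the header lookup and get_content() both return already-decoded Unicode text, which mirrors what header() and body_str do in the Email::MIME example.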

converting base64 encoded mail subject to text

I set out to write a simple procmail recipe that would forward the mail to me if it found the text "Unprovisioned" in the subject.
:0:
* ^Subject:.*Unprovisioned.*
! me@test.com
Unfortunately the subject field in the mail message coming from the mail server was in MIME encoded-word syntax.
The form is: "=?charset?encoding?encoded text?=".
Subject: =?UTF-8?B?QURWSVNPUlk6IEJNRFMgMTg0NSwgTkVXIFlPUksgLSBVbnByb3Zpc2lvbmVkIENvbm4gQQ==?=
=?UTF-8?B?bGVydA==?=
The above subject uses the UTF-8 charset and base64 encoding, with the text folded onto two lines. So I was wondering if there are any mechanisms/scripts/utilities to parse this and convert it to a plain string so that I could apply my procmail filter. Of course I can write a Perl script to parse this and perform the required validations, but I'm looking to avoid that if possible.
Use Encode::MIME::Header, which ships with Perl and is accessed directly through Encode:
use Encode qw(encode decode);
my $header_text = decode('MIME-Header', $header);
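If Perl is not an option, the same decoding can be sketched with Python's standard library. The helper name decode_subject is made up for this example, and the folded header value is the one from the question. How you wire such a script into procmail (for example, piping the Subject header through it and matching on the output) is left open here.

```python
from email.header import decode_header, make_header


def decode_subject(raw_value: str) -> str:
    """Decode MIME encoded-words in a header value (hypothetical helper).

    Adjacent encoded-words separated only by folding whitespace are
    joined without the whitespace, as RFC 2047 requires.
    """
    return str(make_header(decode_header(raw_value)))


folded = (
    "=?UTF-8?B?QURWSVNPUlk6IEJNRFMgMTg0NSwgTkVXIFlPUksgLSBVbnByb3Zpc2lvbmVkIENvbm4gQQ==?=\n"
    " =?UTF-8?B?bGVydA==?="
)
decoded = decode_subject(folded)
print(decoded)
print("Unprovisioned" in decoded)
```

Matching on the decoded string rather than on the raw header is what makes the "Unprovisioned" test reliable, since the base64 text depends on where the sender happened to fold the line.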

Maildrop: Filter mail by Date: header

I'm using a getmail + maildrop + mutt + msmtp chain with messages stored in Maildir. A very big inbox bothers me, so I want to organize mail by date like this:
Maildir
|-2010.11->all messages with "Date: *, * Nov 2010 *"
|-2010.12->same as above...
|-2011.01
`-2011.02
I've googled a lot and read about the mailfilter language, but it is still hard for me to write such a filter. Maildrop's mailing list archives have almost nothing on this (as far as I scanned through them). There is a semi-solution at https://unix.stackexchange.com/questions/3092/organize-email-by-date-using-procmail-or-maildrop, but I don't like it, because I want to use the "Date:" header and I want to sort by month, like "YEAR.MONTH" in digits.
Any help, thoughts, links, or materials will be appreciated.
Using mostly man pages, I came up with the following solution for use on Ubuntu 10.04. Create a mailfilter file called, for example, mailfilter-archive with the following content:
DEFAULT="$HOME/mail-archive"
MAILDIR="$DEFAULT"
# Uncomment the following to get logging output
#logfile $HOME/tmp/maildrop-archive.log
# Create maildir folder if it does not exist
`[ -d $DEFAULT ] || maildirmake $DEFAULT`
if (/^date:\s+(.+)$/)
{
datefile=`date -d "$MATCH1" +%Y-%m`
to $DEFAULT/$datefile
}
# In case the message is missing a date header, send it to a default mail file
to $DEFAULT/inbox
This uses the date command, taking the date header content as input (assuming it is in RFC-2822 format) and producing a formatted date to use as the mail file name.
Then execute the following on existing mail files to archive your messages:
cat mail1 mail2 mail3 mail4 | reformail -s maildrop mailfilter-archive
If the mail-archive contents look good, you could remove the mail1, mail2, mail3, mail4, etc. mail files.
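The date conversion step can also be sketched outside the shell. The following Python snippet turns a Date: header value into the "YEAR.MONTH" name asked for in the question. The function name folder_for and the "inbox" fallback are assumptions made for this example, and the snippet does not deliver mail itself.

```python
from email.utils import parsedate_to_datetime


def folder_for(date_header: str, fallback: str = "inbox") -> str:
    """Map an RFC 2822 Date header to a YEAR.MONTH folder name.

    `folder_for` is a hypothetical helper. Unparseable dates fall back
    to `fallback`, like the final delivery rule in the filter above.
    """
    try:
        return parsedate_to_datetime(date_header).strftime("%Y.%m")
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return fallback


print(folder_for("Tue, 16 Nov 2010 10:30:00 +0100"))
print(folder_for("not a date"))
```

The month is taken from the date as written in the header, without converting between time zones, which matches sorting by what the Date: header says.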