Search emails in Microsoft Exchange/EWS/Office 365 - rest

We would like to query our Exchange server for emails based on (for example) the Subject field. Not from a specific address or to a specific address, but rather all emails "that went through the server" and the term X appeared in the subject.
Preferably some standard way like REST / SOAP and so on with HTTPS. Thanks.

You have two options I think, at least with powershell. First and fastest is to do a messagetrace using get-transportservice | get-messagetrackinglog to see what messages have been received and sent.
https://technet.microsoft.com/en-us/library/aa997573(v=exchg.160).aspx
Other would be to search every mailbox for messages conforming to a pre-set filter. You could use get-mailbox | search-mailbox for that.
https://technet.microsoft.com/en-us/library/dd298173(v=exchg.160).aspx

eDiscovery is the most common method you would use to do that https://support.office.com/en-us/article/eDiscovery-in-Office-365-143b3ab8-8cb0-4036-a5fc-6536d837bfce . You can use ediscovery is EWS (SOAP) https://msdn.microsoft.com/en-us/library/office/jj190897(v=exchg.150).aspx the REST API's in Office365 also allow you to search at the folder level https://msdn.microsoft.com/office/office365/APi/mail-rest-operations

Related

Issue with custom exim filter

I am attempting to create some rules to help deal with the outbound spam we've seen lately from our customers being compromised. To do this I'm using an Exim filter and I need to discard the emails which includes numbers from 0-9 for example:
$sender_address: contains "#domain.com"
$header_subject: contains "^([0-9]+)\#domain\.com"
However, the filter is not working as expected. I want to discard the emails which are received from domain.com like 123#domain.com | 234567#domain.com
I tried many tricks, but none of them was working.
For the regex matching, try using matches instead of contains
$header_subject: matches "^([0-9]+)\#domain\.com"

IMAP - search for all messages in a conversation thread by references

Please don't mark this as a duplicate of 16816425 because that one has incorrect search syntax which is corrected in this question. It is also not a duplicate of 6088914 because this question is about IMAP, not Gmail's IMAP extension.
I'm working on an IMAP client, and would like to be able to find a list of all messages that are referenced in a conversation thread.
I know that the "References" header includes a list of messages referenced in a conversation, so I tried searching it like so:
C: 3 SEARCH HEADER References {48}
S: + go ahead
C> <C63EF8D6-6874-401B-9E8C-B1D63B633246#gmail.com>
But it returns nothing. I've successfully searched for a single message using the "Message-ID" header, like so:
C: 3 SEARCH HEADER Message-Id {48}
S: + go ahead
C> <8D7F7FD5-9CD8-4D4B-BC8B-E1A3BC217350#gmail.com>
Is there any way to do this using IMAP 4?
Of course, I manually checked that <8D7F7FD5-9CD8-4D4B-BC8B-E1A3BC217350#gmail.com> was in the "References:" field in some emails in that mailbox and that very email is in the same mailbox too.
The above test was done against GMAIL.
NOTE: A google search suggested that gmail doesn't support RFC5256. I am validating if searching "References:" works before falling back to modify the IMAP library to use Gmail-specific IMAP command X-GM-THRID.

Mailgun API: Batch Sending vs. Individual Calls

Background
We're building an application that will process & send emails via Mailgun. These are sometimes one-off messages, initiated by a transaction. Some emails, though, will be sent to 30k+ at once.
Eg, a newsletter to all members.
Considerations
Mailgun offers a Batch Sending option with their API. Using "Recipient Variables", you can include dynamic values that are paired with a particular user.
This Batch Sending functionality is limited, however. You cannot send more than 1,000 recipients per request, which means we have to iterate through a recipient list (on our database) for each set of 1,000. Mailgun provides an example of how this might work, using Python (scroll about 2/3 down).
Question
Are there any advantages to batch sending (ie, sending an email to a group of recipients through a single API call, using recipient variables) as opposed to making our own loop, variable substitutions and individual API calls?
I assume this is more taxing on our server, as it would be processing each message itself, instead of just offloading all that data to Mailgun's server for heavy-lifting on their end. But I also like the flexibility & simplicity of handling that on our end and sending a "fully-rendered" message to Mailgun, one at a time, without having to iterate 1k at a time.
Any thoughts on best practices, or considerations we should take into account?
Stumbled onto this today, and felt it provided a pretty good summary/answer for my original question. I wanted to post this as an answer, in case anybody else has this question and hasn't found this Mailgun post. Straight from the horse's mouth, too. The nutshell version:
For PHP, at least, the SDK has a Mailgun class, with a BatchMessage() method. This actually handles the counting of recipients for you, so you can just queue up as many email addresses as you want (ie, more than 1k) and Mailgun will fire off to the API endpoint as needed. Pretty slick!
Here's their original wording, plus a link to the page.
Sending a message with Mailgun PHP SDK + Batch Message:
Batch Message
In addition to Message Builder, we have Batch Message. This class
allows you to build a message and submit a template message in
batches, up to 1,000 recipients per post. The benefit of using this
class is that the recipients tally is monitored and will automatically
submit the message to the endpoint when you've added the 1,000th
recipient. This means you can build your message and begin iterating
through your database. Forget about sending the message, the SDK will
keep track of posting to the API when necessary.
// First, instantiate the SDK with your API credentials and define your domain.
$mgClient = new Mailgun("key-example");
$domain = "example.com";
// Next, instantiate a Message Builder object from the SDK, pass in your sending domain.
$batchMsg = $mgClient->BatchMessage($domain);
// Define the from address.
$batchMsg->setFromAddress("dwight#example.com",
array("first"=>"Dwight", "last" => "Schrute"));
// Define the subject.
$batchMsg->setSubject("Help!");
// Define the body of the message.
$batchMsg->setTextBody("The printer is on fire!");
// Next, let's add a few recipients to the batch job.
$batchMsg->addToRecipient("pam#example.com",
array("first" => "pam", "last" => "Beesly"));
$batchMsg->addToRecipient("jim#example.com",
array("first" => "Jim", "last" => "Halpert"));
$batchMsg->addToRecipient("andy#example.com",
array("first" => "Andy", "last" => "Bernard"));
// ...etc...etc...
// After 1,000 recipeints,
// Batch Message will automatically post your message to the messages endpoint.
// Call finalize() to send any remaining recipients still in the buffer.
$batchMsg->finalize();
The answer of #cdwyer and #nikoshr is very helpful, but bit legacy. Used methods in the example are deprecated. Here is current usage of lib:
$batchMessage = $this->mailgun->messages()->getBatchMessage('mydomain.com');
$batchMessage->setFromAddress('user#domain.com');
$batchMessage->setReplyToAddress('user2#domain.com');
$batchMessage->setSubject('Contact form | Company');
$batchMessage->setHtmlBody('<html>...</html>');
foreach ($recipients as $recipient) {
$batchMessage->addToRecipient($recipient);
}
$batchMessage->finalize();
More info at documentation.

Apache Camel mail to identify auto generated messages

I'm looking for a way to identify auto generated messages like Outlook's "Out of office" replies.
I stumbled upon a header called "Auto-submitted" that's supposed to do the trick, but Camel doesn't seems to provide this header in the "Message" object. Reference: http://www.iana.org/assignments/auto-submitted-keywords/auto-submitted-keywords.xml
Is it possible to know if a message is auto generated or human generated?
I don't know Apache Camel, but I can tell you that there is no simple and safe way to detect automated email messages in general. Headers like auto-submitted are an indicator, but unfortunately lots of automated scripts do not add them. I once had to write an out-of-office implementation that should not send ooo replies to any automated messages (mailing lists, spam, newsletters, etc.). Here is what I finally came up with, maybe this helps in your case as well:
Sender address regular expressions that indicate automated senders:
"^owner-"
"^request-"
"-request#"
"bounce.*#"
"-confirm#"
"-errors#"
"^no[-]?reply"
"^donotreply"
"^postmaster#"
"^mailer[-_]daemon#"
"^mailer#"
"^listserv#"
"^majordom[o]?#"
"^mailman#"
"^nobody#"
"^bounce"
"^www(-data)?#"
"^mdaemon#"
"^root#"
"^news(letter)?#"
"^webmaster#" (role address - may not be a good indicator in your case)
"^administrator#" (role address - may not be a good indicator in your case)
"^support#" (role address - may not be a good indicator in your case)
Headers that indicate automated messages if they exist:
list-help
list-unsubscribe
list-subscribe
list-owner
list-post
list-archive
list-id
mailing-List
x-facebook-notify
x-mailing-list
x-cron-env
x-autoresponse
x-eBay-mailtracker
Headers that indicate automated messages if they have a special value:
'x-spam-flag':'yes'
'x-spam-status':'yes'
'X-Spam-Flag2': 'yes'
'precedence':'(bulk|list|junk)'
'x-precedence':'(bulk|list|junk)'
'x-barracuda-spam-status':'yes'
'x-dspam-result':'(spam|bl[ao]cklisted)'
'X-Mailer':'^Mail$'
'auto-submitted':'auto-replied'

How to make an email bot that replies to users not reply to auto-responses and get itself into mail loops

I have a bot that replies to users. But sometimes when my bot sends its reply, the user or their email provider will auto-respond (vacation message, bounce message, error from mailer-daemon, etc). That is then a new message from the user (so my bot thinks) that it in turn replies to. Mail loop!
I'd like my bot to only reply to real emails from real humans. I'm currently filtering out email that admits to being bulk precedence or from a mailing list or has the Auto-Submitted header equal to "auto-replied" or "auto-generated" (see code below). But I imagine there's a more comprehensive or standard way to deal with this. (I'm happy to see solutions in other languages besides Perl.)
NB: Remember to have your own bot declare that it is autoresponding! Include
Auto-Submitted: auto-reply
in the header of your bot's email.
My original code for avoiding mail loops follows. Only reply if realmail returns true.
sub realmail {
my($email) = #_;
$email =~ /\nSubject\:\s*([^\n]*)\n/s;
my $subject = $1;
$email =~ /\nPrecedence\:\s*([^\n]*)\n/s;
my $precedence = $1;
$email =~ /\nAuto-Submitted\:\s*([^\n]*)\n/s;
my $autosub = $1;
return !($precedence =~ /bulk|list|junk/i ||
$autosub =~ /(auto\-replied|auto\-generated)/i ||
$subject =~ /^undelivered mail returned to sender$/i
);
}
(The Subject check is surely unnecessary; I just added these checks one at a time as problems arose and the above now seems to work so I don't want to touch it unless there's something definitively better.)
RFC 3834 provides some guidance for what you should do, but here are some concrete guidelines:
Set your envelope sender to a different email address than your auto-responder so bounces don't feed back into the system.
I always store in a database a key of when an email response was sent from a specific address to another address. Under no circumstance will I ever respond to the same address more than once in a 10 minute period. This alone stopped all loops, but doesn't ensure nice behavior (auto-responses to mailing lists are annoying).
Make sure you add any permutation of header that other people are matching on to stop loops. Here's the list I use:
X-Loop: autoresponder
Auto-Submitted: auto-replied
Precedence: bulk (autoreply)
Here are some header regex's I use to avoid loops and to try to play nice:
/^precedence:\s+(?:bulk|list|junk)/i
/^X-(?:Loop|Mailing-List|BeenThere|Mailman)/i
/^List-/i
/^Auto-Submitted:/i
/^Resent-/i
I also avoid responding if any of these are the envelop senders:
if ($sender eq ""
|| $sender =~ /^(?:request|owner|admin|bounce|bounces)-|-(?:request|owner|admin|bounce|bounces)\#|^(?:mailer-daemon|postmaster|daemon|majordomo|ma
ilman|bounce)\#|(?:listserv|listsrv)/i) {
That really sounds like something that's probably available as a module from CPAN, but I didn't find anything clearly relevant in five minutes of searching. Mail::Lite::Mbox::Processor looks like it might do what you want:
Mail::Lite::Message::Matcher is a
framework for automated mail
processing. For example you have a
mail server and you have a need to
process some types of incoming mail
messages automatically. For example,
you can extract automated
notifications, invoices, alerts etc.
from your mail flow and perform some
tasks based on content of those
messages.
but its docs are sparse enough that it isn't immediately obvious whether it provides those example functions itself or if you have to provide the code to drive them.
In any case, though, if you haven't already checked CPAN, that's where I would start if I wanted to do something like this.
My answer here only deals with bounces which is more straightforward.
Using DSN (Delivery Status Notification) identifier will help you detect a DSN/bounced message. It should go to Return-Path and not Reply-To.
Here's a sample of a typical DSN message. The header information includes the message id, content type has specific values (delivery-status) etc.
Not able to provide you any codes in perl, just my 2 cents of idea.
PS: Do note that not all mail servers or MTA conforms to this, but I guess most do.
There should be a standard way of dealing with this, but the problem is that you'd have to assume that systems that send auto-replies comply to that standard, when most the time, they just don't.
How do you get the address that you reply to? I hope you aren't using the From: header. Check the Reply-to: header first and if that doesn't exist, use the Return-path:.
But whatever you do, you will simply have to keep a log of what you sent to whom and throttle your bot to some sensible value of messages per time.