Why does my Outlook add in randomly stop working? - email

I have a plugin running in Outlook 2013 that is intended to filter out spam. It looks at incoming emails, runs through a half-dozen heuristics, and if 1 or 2 of them return positive, it moves the mail to another folder for inspection; if three or more are tripped, it permanently deletes the mail.
My problem is that it randomly stops working. The add in isn't marked as "disabled" or "inactive"; it just doesn't work anymore. Disabling it and then enabling it brings it back on line, and then it will work fine, including on the mail that it just failed on. And then it will fail randomly again a few messages later. No error is being raised as far as I can see; no exception is being thrown.
I feel like this has something to do with Outlook's limits for add-ins, since it is (a) non-deterministic and (b) seems to happen more often the more checks I do. But nothing I am doing is very heavy; most of it is just checking things in the email address, subject, and headers, or looking for key phrases in the body. Eyeballing it, every message appears to be processed in just a fraction of a second. I guess the first question is, is there a way to see Outlook's decision process on this, i.e. the specific perf statistics it is tracking? (ETA: Does Outlook enforce its restrictions when the add-in is invoked, or just at startup? Everything I can find only refers to startup.)
No single one of the checks seems to be responsible; I've disabled each one individually and it keeps happening.
Here's the central code. SpamMarkerCheckList is a list of delegates for the individual heuristic checks. ClearAllMailMarkers() does things like mark the mail as read, set the priority to normal, remove any flags, etc. headers is a Dictionary with the header names as the keys and lists of strings as the values.
private static void FilterInbox(object mailItem)
{
try
{
if (mailItem != null)
{
var mail = (Outlook.MailItem)mailItem;
var markers = SpamMarkersCount(mail, 3);
if (markers > 2)
{
PermanentlyDeleteEmail(mail);
}
else if (markers > 0)
{
ClearAllMailMarkers(mail);
mail.Move(_sequesteredFolder);
}
}
}
catch (Exception e)
{
MessageBox.Show("FilterSpam caught unexpected exception -- " + e);
}
}
private static int SpamMarkersCount(Outlook.MailItem mail, int threshold)
{
var spamMarkerCount = 0;
var headers = GetHeaderProperties(mail);
var emailAddressList = BuildAddressList(mail, headers);
var fullStringList = GetAllStrings(mail, headers);
foreach (var spamMarkerCheck in SpamMarkerCheckList)
{
if (spamMarkerCheck(mail, headers, emailAddressList, fullStringList))
{
spamMarkerCount++;
if (spamMarkerCount >= threshold)
{
return spamMarkerCount;
}
}
}
return spamMarkerCount;
}
The checks I am doing are:
Non-ASCII characters in the sender's name or address (e.g. hearts, shopping carts, etc.)
Indicators in the headers, like the presence of List-Unsubscribe or failed SPF, DKIM, or DMARC authentication
Email addresses that are badly formed or missing
If it was sent from a fake domain (via doing a DNS lookup)
The presence of the user's email alias in the subject or sender name ("delius1967 is a winner!")
If it was sent from a known blocked list of domains
If it contains specific phrases (e.g. "this is an advertisement")

I'd suggest checking the Windows event viewer for Outlook-specific records. Most probably your add-in fires an exception at runtime.
First and foremost, Microsoft Office applications can disable VSTO Add-ins that behave unexpectedly. If an application does not load your VSTO Add-in, the application might have hard disabled or soft disabled your VSTO Add-in.
Hard disabling can occur when a VSTO Add-in causes the application to close unexpectedly. It might also occur on your development computer if you stop the debugger while the Startup event handler in your VSTO Add-in is executing.
Soft disabling can occur when a VSTO Add-in produces an error that does not cause the application to unexpectedly close. For example, an application might soft disable a VSTO Add-in if it throws an unhandled exception while the Startup event handler is executing.
When you re-enable a soft-disabled VSTO Add-in, the application immediately attempts to load the VSTO Add-in. If the problem that initially caused the application to soft disable the VSTO Add-in has not been fixed, the application will soft disable the VSTO Add-in again. Read more about that in the How to: Re-enable a VSTO Add-in that has been disabled article.
Second, extending the add-in resiliency pillar of Outlook 2010, Outlook 2013 and later versions monitor add-in performance metrics such as add-in startup, shutdown, folder switch, item open, and invoke frequency. Outlook records the elapsed time in milliseconds for each performance monitoring metric.
For example, the startup metric measures the time required by each connected add-in during Outlook startup. Outlook then computes the median startup time over 5 successive iterations. If the median startup time exceeds 1000 milliseconds (1 second), then Outlook disables the add-in and displays a notification to the user that an add-in has been disabled. The user has the option of always enabling the add-in, in which case Outlook will not disable the add-in even if the add-in exceeds the 1000 millisecond performance threshold. Read more about that in the Performance criteria for keeping add-ins enabled section.
You may find the Outlook’s slow add-ins resiliency logic and how to always enable slow add-ins article helpful.
Finally, Outlook uses a single-threaded apartment model and may prevent any calls from secondary threads by throwing exceptions at runtime. You should use the OOM on the main thread only. If you need to do any processing on the background you may consider extracting the required plain data and passing it for processing. Extended MAPI allows running secondary threads also.

I found the proximate cause: garbage collection. After running in the debugger for quite a while, I noticed that the add-in stopped working immediately after a GC event; several more runs through show that this happens consistently.
The ultimate cause was in how I was associating the function with the folder:
private static Outlook.Account emailStore; // outside ThisAddIn_Startup
[...]
emailStore.DeliveryStore.GetDefaultFolder(Outlook.OlDefaultFolders.olFolderInbox).Items.ItemAdd += FilterInbox;
For reasons I do not fully understand, that causes a problem after GC, but this does not:
private static Outlook.Items emailItems; // outside ThisAddIn_Startup
[...]
emailItems = emailStore.DeliveryStore.GetDefaultFolder(Outlook.OlDefaultFolders.olFolderInbox).Items;
emailItems.ItemAdd += FilterInbox;
I get why having it as static is critical, but I don't understand why it makes a difference in the assignment. Anyone with a deeper understanding of C# objects who can explain, I would love to hear it.

Related

(Laravel 5) Monitor and optionally cancel an ALREADY RUNNING job on queue

I need to achieve the ability to monitor and be able to cancel an ALREADY RUNNING job on queue.
There's a lot of answers about deleting QUEUED jobs, but not on an already running one.
This is the situation: I have a "job", which consists of HUNDREDS OF THOUSANDS rows on a database, that need to be queried ONE BY ONE against a web service.
Every row needs to be picked up, queried against a web service, stored the response and its status updated.
I had that already working as a Command (launching from / outputting to console), but now I need to implement queues in order to allow piling up more jobs from more users.
So far I've seen Horizon (which doesn't runs on Windows due to missing process control libs). However, in some demos seen around it lacks (I believe) a couple things I need:
Dynamically configurable timeout (the whole job may take more than 12 hours, depending on the number of rows to process on the selected job)
Ability to CANCEL an ALREADY RUNNING job.
I also considered the option to generate EACH REQUEST as a new job instead of seeing a "job" as the whole collection of rows (this would overcome the timeout thing), but that would give me a Horizon "pending jobs" list of hundreds of thousands of records per job, and that would kill the browser (I know Redis can handle this without itching at all). Further, I guess is not possible to cancel "all jobs belonging to X tag".
I've been thinking about hitting an API route, fire the job and decouple it from the app, but I'm seeing that this requires forking processes.
For the ability to cancel, I would implement a database with job_id, and when the user hits an API to cancel a job, I'd mark it as "halted". On every loop I would check its status and if it finds "halted" then kill itself.
If I've missed any aspect just holler and I'll add it or clarify about it.
So I'm asking for an advice here since I'm new to Laravel: how could I achieve this?
So I finally came up with this (a bit clunky) solution:
In Controller:
public function cancelJob()
{
$jobs = DB::table('jobs')->get();
# I could use a specific ID and user owner filter, etc.
foreach ($jobs as $job) {
DB::table('jobs')->delete($job->id);
}
# This is a file that... well, it's self explaining
touch(base_path(config('files.halt_process_signal')));
return "Job cancelled - It will stop soon";
}
In job class (inside model::chunk() function)
# CHECK FOR HALT SIGNAL AND [OPTIONALLY] STOP THE PROCESS
if ($this->service->shouldHaltProcess()) {
# build stats, do some cleanup, log, etc...
$this->halted = true;
$this->service->stopProcess();
# This FALSE is what it makes the chunk() method to stop looping
return false;
}
In service class:
/**
* Checks the existence of the 'Halt Process Signal' file
*
* #return bool
*/
public function shouldHaltProcess() :bool
{
return file_exists($this->config['files.halt_process_signal']);
}
/**
* Stop the batch process
*
* #return void
*/
public function stopProcess() :void
{
logger()->info("=== HALT PROCESS SIGNAL FOUND - STOPPING THE PROCESS ===");
$this->deleteHaltProcessSignalFile();
return ;
}
It doesn't looks quite elegant, but it works.
I've surfed the whole web and many goes for Horizon or other tools that doesn't fit my case.
If anyone has a better way to achieve this, it's welcome to share.
Laravel queue have 3 important config:
1. retry_after
2. timeout
3. tries
See more: https://laravel.com/docs/5.8/queues
Dynamically configurable timeout (the whole job may take more than 12
hours, depending on the number of rows to process on the selected job)
I think you can config timeout + retry_after about 24h.
Ability to CANCEL an ALREADY RUNNING job.
Delete job in jobs table
Delete process by process id in your server
Hope it help you :)

Order of events reversed 'Ribbon_Load' and 'ThisAddin_Startup' Word VSTO Add-in. (Build 8201.2025 onwards)

As of Build 8201.2025 there has been an unexpected change to the order of events when loading a VSTO addin with a Ribbon in Word.
Using Office version 16.0.8067.2115 or older. When loading the addin the following order of events is observed (as has always been the case).
Ribbon_Load event
ThisAddin_Startup event
Using Office versions 8201.2025, 8201.2064 or 8201.2075 or newer the order of events is reversed which is an unexpected breaking change.
ThisAddin_Startup event
Ribbon_Load event
I have created a simple VSTO Addin using a Visual Designer Ribbon to demonstrate the issue.
>
Public Class Ribbon1
Private Sub Ribbon1_Load(ByVal sender As System.Object, ByVal e As RibbonUIEventArgs) Handles MyBase.Load
System.Diagnostics.Debug.Write("Ribbon1_Load event called.")
'Pass the Ribbon to the Addin.
ThisAddIn.MyRibbon = Me
End Sub
End Class
Public Class ThisAddIn
Public Shared Property MyRibbon As Ribbon1 = Nothing
Private Sub ThisAddIn_Startup() Handles Me.Startup
Debug.Write("ThisAddin_Startup Called")
If (MyRibbon Is Nothing) Then
Debug.Write("MyRibbon is nothing - the ribbon was not captured.")
Else
Debug.Write("Ribbon captured successfully.")
End If
End Sub
End Class
Debug output for 16.0.8067.2115 32 bit
[7772] Ribbon1_Load event called.
[7772] ThisAddin_Startup Called
[7772] Ribbon captured successfully.
Debug output for 16.0.8201.2075 32 bit
[13556] ThisAddin_Startup Called
[13556] MyRibbon is nothing - the ribbon was not captured.
[13556] Ribbon1_Load event called
I have posted this up on the Microsoft Support forums however they have stopped responding and since released this version to the Current office channel I need help from the dev community.
Has anyone found a successful workaround? This change of timing is causing alot of problems with how we initialise. It would be ideal for Microsoft Support to provide a solution or workaround until they investigate this bug.
I always got Ribbon_Load before ThisAddin_Startup because I use Ribbon XML. Ribbon UI allow less controls ... As the both are "entry" points, I suggest you to use only Ribbon1_Load at startup. Or, if you use the Ribbon XML model and you want the very very first entry point, try its constructor
I am not feeling that issue as a bug, to make Word fast many processes are asynchronous. So, in my opinion, the first of ThisAddin_Startup or Ribbon1_Load to start can accidentally change depending on many factors: System performances, Word started alone, Word started via a doc ...
Hope this helps someone! We used the following workaround successfully to work around the changed office load behavior.
Within ThisAddIn_Startup loop until the Ribbon load event has fired and the ribbon is captured.
While m_oRibbon Is Nothing
If (timeWaited >= MAX_WAIT_TIME) Then
Exit Try
End If
Threading.Thread.Sleep(50)
timeWaited = timeWaited + 50
End While

How to control the handle order of multiple coexisting VSTO Word Addins

If I have two different VSTO AddIns installed on the same Word application, is there a way to know which one of them first receives the events raised by Word. For example the DocumentOpen event?
Can I control that order?
Thanks
I didn't find anything specific about event order in multiple add-ins (but I believe I've read something about it years ago) so I did simple test
I created three Excel add-ins with this code
private void ThisAddIn_Startup(object sender, System.EventArgs e)
{
this.Application.WorkbookBeforeSave += Application_WorkbookBeforeSave;
}
void Application_WorkbookBeforeSave(Excel.Workbook Wb, bool SaveAsUI, ref bool Cancel)
{
System.Windows.Forms.MessageBox.Show("Add-in 1");
}
One of them I gave it random name as I wanted to know if Excel runs them in alphabetical order, in one of the add-in I set the Cancel = true
I don't know why but Excel always fired the ExcelAddin 2 first, then ExcelAddin 1 and finished with AAExcelAddin. I tried to rebuild (I also cleaned my solutions first) in different order but the order was still the same (note the very first addin created was the ExcelAddin 1) No matter if I run it from VS or just simply start Excel and pressed CTRL+S
Based on above I would say that you should not have any logic in your code that assume certain order of add-ins. You never know if any new add-in will break the order.
Also keep in mind that if you use any of the cancellation events (has the argument Cancel) and you cancel it (Cancel=true) then all the other add-ins will receive the event and will still run it but the Cancel flag will be set to true from the previous one.
In my case I set Cancel=true in the add-in that was fired at first (ExcelAddin 2) and even the two other add-ins received the event, they didn't save as saving was cancelled in the first one.

Moving from file-based tracing session to real time session

I need to log trace events during boot so I configure an AutoLogger with all the required providers. But when my service/process starts I want to switch to real-time mode so that the file doesn't explode.
I'm using TraceEvent and I can't figure out how to do this move correctly and atomically.
The first thing I tried:
const int timeToWait = 5000;
using (var tes = new TraceEventSession("TEMPSESSIONNAME", #"c:\temp\TEMPSESSIONNAME.etl") { StopOnDispose = false })
{
tes.EnableProvider(ProviderExtensions.ProviderName<MicrosoftWindowsKernelProcess>());
Thread.Sleep(timeToWait);
}
using (var tes = new TraceEventSession("TEMPSESSIONNAME", TraceEventSessionOptions.Attach))
{
Thread.Sleep(timeToWait);
tes.SetFileName(null);
Thread.Sleep(timeToWait);
Console.WriteLine("Done");
}
Here I wanted to make that I can transfer the session to real-time mode. But instead, the file I got contained events from a 15s period instead of just 10s.
The same happens if I use new TraceEventSession("TEMPSESSIONNAME", #"c:\temp\TEMPSESSIONNAME.etl", TraceEventSessionOptions.Create) instead.
It seems that the following will cause the file to stop being written to:
using (var tes = new TraceEventSession("TEMPSESSIONNAME"))
{
tes.EnableProvider(ProviderExtensions.ProviderName<MicrosoftWindowsKernelProcess>());
Thread.Sleep(timeToWait);
}
But here I must reenable all the providers and according to the documentation "if the session already existed it is closed and reopened (thus orphans are cleaned up on next use)". I don't understand the last part about orphans. Obviously some events might occur in the time between closing, opening and subscribing on the events. Does this mean I will lose these events or will I get the later?
I also found the following in the documentation of the library:
In real time mode, events are buffered and there is at least a second or so delay (typically 3 sec) between the firing of the event and the reception by the session (to allow events to be delivered in efficient clumps of many events)
Does this make the above code alright (well, unless the improbable happens and for some reason my thread is delayed for more than a second between creating the real-time session and starting processing the events)?
I could close the session and create a new different one but then I think I'd miss some events. Or I could open a new session and then close the file-based one but then I might get duplicate events.
I couldn't find online any examples of moving from a file-based trace to a real-time trace.
I managed to contact the author of TraceEvent and this is the answer I got:
Re the exception of the 'auto-closing and restarting' feature, it is really questions about the OS (TraceEvent simply calls the underlying OS API). Just FYI, the deal about orphans is that it is EASY for your process to exit but leave a session going. This MAY be what you want, but often it is not, and so to make the common case 'just work' if you do Create (which is the default), it will close a session if it already existed (since you asked for a new one).
Experimentation of course is the touchstone of 'truth' but I would frankly expecting unusual combinations to just work is generally NOT true.
My recommendation is to keep it simple. You need to open a new session and close the original one. Yes, you will end up with duplicates, but you CAN filter them out (after all they are IDENTICAL timestamps).
The other possibility is use SetFileName in its intended way (from one file to another). This certainly solves your problem of file size growth, and often is a good way to deal with other scenarios (after all you can start up you processing and start deleting files even as new files are being generated).

WMI and Win32_DeviceChangeEvent - Wrong event type returned?

I am trying to register to a "Device added/ Device removed" event using WMI. When I say device - I mean something in the lines of a Disk-On-Key or any other device that has files on it which I can access...
I am registering to the event, and the event is raised, but the EventType propery is different from the one I am expecting to see.
The documentation (MSDN) states : 1- config change, 2- Device added, 3-Device removed 4- Docking. For some reason I always get a value of 1.
Any ideas ?
Here's sample code :
public class WMIReceiveEvent
{
public WMIReceiveEvent()
{
try
{
WqlEventQuery query = new WqlEventQuery(
"SELECT * FROM Win32_DeviceChangeEvent");
ManagementEventWatcher watcher = new ManagementEventWatcher(query);
Console.WriteLine("Waiting for an event...");
watcher.EventArrived +=
new EventArrivedEventHandler(
HandleEvent);
// Start listening for events
watcher.Start();
// Do something while waiting for events
System.Threading.Thread.Sleep(10000);
// Stop listening for events
watcher.Stop();
return;
}
catch(ManagementException err)
{
MessageBox.Show("An error occurred while trying to receive an event: " + err.Message);
}
}
private void HandleEvent(object sender,
EventArrivedEventArgs e)
{
Console.WriteLine(e.NewEvent.GetPropertyValue["EventType"]);
}
public static void Main()
{
WMIReceiveEvent receiveEvent = new WMIReceiveEvent();
return;
}
}
Well, I couldn't find the code. Tried on my old RAC account, nothing. Nothing in my old backups. Go figure. But I tried to work out how I did it, and I think this is the correct sequence (I based a lot of it on this article):
Get all drive letters and cache
them.
Wait for the WM_DEVICECHANGE
message, and start a timer with a
timeout of 1 second (this is done to
avoid a lot of spurious
WM_DEVICECHANGE messages that start
as start as soon as you insert the
USB key/other device and only end
when the drive is "settled").
Compare the drive letters with the
old cache and detect the new ones.
Get device information for those.
I know there are other methods, but that proved to be the only one that would work consistently in different versions of windows, and we needed that as my client used the ActiveX control on a webpage that uploaded images from any kind of device you inserted (I think they produced some kind of printing kiosk).
Oh! Yup, I've been through that, but using the raw Windows API calls some time ago, while developing an ActiveX control that detected the insertion of any kind of media. I'll try to unearth the code from my backups and see if I can tell you how I solved it. I'll subscribe to the RSS just in case somebody gets there first.
Well,
u can try win32_logical disk class and bind it to the __Instancecreationevent.
You can easily get the required info
I tried this on my system and I eventually get the right code. It just takes a while. I get a dozen or so events, and one of them is the device connect code.