My question is theoretical.
I have a database with e-mails. For each e-mail I store the desired sending time (as a UNIX timestamp) and the contents of the e-mail (sender, receiver, subject, body, etc.). There is a large number of e-mails scheduled.
This is how I planned to send the e-mails so far:
I would have a worker process or server which periodically queries the database for "overdue" e-mails based on the timestamps. Then it sends those e-mails, and in the end it deletes them from the DB.
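Sketched in (simplified) Python, with a toy schema and a send() helper standing in for the actual delivery:

# Sketch of the poll-send-delete loop described above. The table layout
# email(emailid, duetime, sender, receiver, subject, body) and the send()
# helper are simplifications.
import sqlite3
import time

def run_worker(db_path, send):
    conn = sqlite3.connect(db_path)
    while True:
        rows = conn.execute(
            "SELECT emailid, sender, receiver, subject, body "
            "FROM email WHERE duetime <= ?",
            (int(time.time()),),
        ).fetchall()
        for emailid, *mail in rows:
            send(*mail)  # send the overdue e-mail
            conn.execute("DELETE FROM email WHERE emailid = ?", (emailid,))
            conn.commit()  # a crash between send() and here re-sends the mail
        time.sleep(10)  # poll interval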
I started to think about two things:
1. What if the worker dies when it has sent the e-mail but hasn't deleted it from the database? If I restart the worker, the e-mail will be sent again.
2. How do I handle a really large number of e-mails, where I have to run multiple workers? I can mark an e-mail in the database as "being sent", but how do I re-initiate sending if the responsible worker dies? I won't know whether a worker has died or is just so slow that it is still sending its messages. I'm assuming I can't get notified when a worker dies, so I can't re-send the e-mails it failed to send.
I know that e-mail sending is not as serious a matter as bank transactions, but I think there must be a good solution for this.
How is this usually done?
I would actually use a flag on each email record in the database:
Your worker (or multiple workers) updates the oldest record with its unique worker ID (e.g. a PID or an IP/PID combination).
Example for Oracle SQL:
update email set workerid = 'my-unique-worker-id' where emailid in (
  select emailid from (
    select emailid from email
    where duetime < sysdate
    and workerid is null
    order by duetime
  ) where rownum <= 1
)
This takes just one not-yet-processed record (the oldest by duetime, which has to be in the past) and sets the worker ID. The update is synchronized by the database's normal locking mechanism (only one transaction can write the row at a time).
Then you select all records with:
select * from email where workerid = 'my-unique-worker-id'
which will be either 0 or 1 record. If it is 0, there is no due mail.
If you have finished sending the e-mail, you set workerid = 'some-invalid-value' (or use another flag column to mark progress). That way the record doesn't get picked up by the next worker.
You probably won't be able to find out whether the e-mail really has been sent. If the worker dies after sending and before updating the record, there's not much you can do. To be a bit more self-sufficient, the worker could create a process file locally (e.g. an empty file with the emailid as the file name). That way you could at least detect whether the crash was just a database connection issue.
If a worker starts up and, before claiming any record, already finds a message with its own ID as the workerid, I would raise an alert/error that should be handled manually (by checking the SMTP server log and updating the record by hand).
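Put together, the whole cycle could look roughly like this; a sketch, not a definitive implementation (Python with a DB-API driver such as cx_Oracle; the 'DONE' marker and the marker-file handling are my choices):

# Sketch of the claim/send/complete cycle described above.
import os
import socket

WORKER_ID = f"{socket.gethostname()}-{os.getpid()}"  # unique worker ID

# The claim statement from above, with the worker ID as a bind variable.
CLAIM_SQL = """
update email set workerid = :wid where emailid in (
  select emailid from (
    select emailid from email
    where duetime < sysdate
    and workerid is null
    order by duetime
  ) where rownum <= 1
)"""

def startup_check(conn):
    # A record still claimed under our ID means we crashed mid-send last time.
    cur = conn.cursor()
    cur.execute("select emailid from email where workerid = :wid",
                {"wid": WORKER_ID})
    if cur.fetchone():
        raise RuntimeError("leftover claim found - check the SMTP log first")

def process_one(conn, send):
    cur = conn.cursor()
    cur.execute(CLAIM_SQL, {"wid": WORKER_ID})  # claim the oldest due record
    conn.commit()
    cur.execute("select * from email where workerid = :wid", {"wid": WORKER_ID})
    row = cur.fetchone()
    if row is None:
        return False  # no due mail right now
    emailid = row[0]
    open(f"{emailid}.sending", "w").close()  # local marker file (emailid as name)
    send(row)
    cur.execute("update email set workerid = 'DONE' where emailid = :id",
                {"id": emailid})
    conn.commit()
    os.remove(f"{emailid}.sending")
    return True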
I am receiving a large number of correlated HL7 messages in Mirth. They contain an ID which is the same for all correlated messages, and they always arrive within a minute. Multiple batches can be received at the same time. It's hard to say when a batch ends, but when there have been no more messages for a minute, it's safe to assume that the batch has finished.
How could I implement an aggregator pattern in Mirth that keeps reading and completing correlated messages and sends the completed message once no new messages with the same ID have arrived within a defined time interval?
You may drop all incoming messages into a folder and store the message ID in a Global Map. Once messages start to arrive with a message ID different from the one stored in the map (meaning that the next sequence has started), trigger another channel, either by sending it the message ID it needs to look for or in some other way. After that, replace the message ID in the Global Map with the message ID of the new sequence.
If that sounds too complicated, you may do the same, but have the second channel constantly scan the folder (File Reader) and grab only files with the same message ID that are more than a minute older than the current time (which, to my mind, is too vague a qualifier).
I've implemented this by saving all messages in a folder using an ID inside the message (that identifies the sequence) as the file name. The file gets updated with each new message. Several sequences live together in the same folder.
The next channel uses a simple File Reader that only fetches files that are a minute or more old.
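For illustration, the pick-up rule of that second channel, expressed outside Mirth (the folder path and the one-minute threshold are assumptions):

# Sketch: pick up only sequence files that have been quiet for a minute.
# The mtime is refreshed each time a new correlated message is appended,
# so an old mtime means the batch has finished.
import os
import time

def finished_sequences(folder="/data/hl7-batches", quiet_secs=60):
    now = time.time()
    for entry in os.scandir(folder):
        if entry.is_file() and now - entry.stat().st_mtime >= quiet_secs:
            yield entry.path  # file name is the sequence ID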
In our application we have a central database and many disconnected client applications with their own local databases. A client connects to the central server and the server should send them the data that have changed since the client's last connection.
Because there are too many clients, and some of them might cease to exist without notifying the server, it is not practical to keep the pending changes on the server per client.
That is why every relevant table has a column update_date, which is set to current_timestamp on every insert and every update. Deletes are handled in a similar way, with an auxiliary table for every synchronized table in which we store the primary key of the deleted row and the delete_date.
When a client connects, it sends the server its last synchronization timestamp; the server sends back all changes where update_date > last_sync, plus the current_timestamp of the transaction, which the client stores as its new last_sync.
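In simplified form, the server side of one sync looks like this (psycopg2-style placeholders; item and item_deleted stand in for the real table names):

# Sketch of one synchronization, run inside a single transaction.
def changes_since(conn, last_sync):
    cur = conn.cursor()
    cur.execute("select current_timestamp")  # becomes the client's next last_sync
    (now,) = cur.fetchone()
    cur.execute("select * from item where update_date > %s", (last_sync,))
    updates = cur.fetchall()
    cur.execute("select item_id, delete_date from item_deleted "
                "where delete_date > %s", (last_sync,))
    deletes = cur.fetchall()
    return updates, deletes, now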
The problem with this approach: suppose there is a running transaction T1 with current_timestamp = 1000, and the client connects in a transaction T2 with current_timestamp = 2000. Since T2 does not see the not-yet-committed changes made in T1, they are not sent to the client. The next time the client connects, the changes from T1 are already committed, but they are marked with update_date = 1000, so they will not be sent to a client requesting the changes made after 2000.
Any suggestions on how to make sure that the clients get all the changed records? It is acceptable for a client to get the same changes multiple times.
Personally, I would go for an audit trigger to solve this, as described here: https://wiki.postgresql.org/wiki/Audit_trigger
After that you can choose how to apply the updates (or ignore some of them if they're not relevant).
Alternatively you could try one of the standard replication modules, some of the asynchronous ones should do the trick: https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling#Comparison_matrix
Bucardo for example was specifically designed for cases like these.
I have created an NServiceBus Distributor and Worker, running on separate machines. When I run the worker, it successfully sends a message to the Distributor (and I can see it processed through the Storage queue), but for some reason an output queue is created on the Distributor called

DIRECT=TCP:xx.xx.xx.xx\PRIVATE$\order_queue$

when the queue should be called

DIRECT=OS:WORKERDNSNAME\private$\myqueue

Does anyone know why the order_queue$ queue is being created?
Shameless copy direct from an old post at pg2e.blogspot.co.uk:
Transactional queues over HTTP from private networks
When sending messages to a transactional queue over http/s from a server without a public IP address, the ACK messages may have a hard time reaching their destination. This is due to the same cause as in this post (basically, NATting causes a mismatch with the message destination address).

By default the receipts are sent to the sending computer's name, which of course will not work unless both parties reside on the same network. To fix this you have to map the receipts to the public address of the sender. This is done by creating an XML file (of any name) in C:\WINDOWS\system32\msmq\mapping with the following content.
<StreamReceiptSetup xmlns="msmq-streamreceipt-mapping.xml">
    <setup>
        <LogicalAddress>http://msmq.domain.com/*</LogicalAddress>
        <StreamReceiptURL>http://[ADDRESS_TO_SENDER]/msmq/Private$/order_queue$</StreamReceiptURL>
    </setup>
    <default>http://xxx.xx.xxx.xx/msmq/Private$/order_queue$</default>
</StreamReceiptSetup>
Explanation: All messages sent to any queue at msmq.domain.com will have their receipts sent to the given StreamReceiptURL. The order_queue$ queue is used to handle transactional control messages.
I suspect later versions of MSMQ or NServiceBus handle creating this queue automatically without you having to create the XML file yourself.
I know that Perl's MIME::Lite is deprecated, but I have to work on a hosted server where only MIME::Lite is installed. This server also limits the number of e-mails that can be sent to 500 per hour.
I have a large list of participants that need to be e-mailed instructions to complete questionnaires, plus reminders if they haven't completed their questionnaire weeks later. I have a script that checks whether they have completed their questionnaire and whether a reminder should be sent and has already been sent; if not, a reminder is sent. However, I have to limit the number of e-mails sent to 500 per hour.
Is there a way to tell MIME::Lite to send 500 e-mails, wait one hour, and then send 500 more? Or do I need to program it myself in Perl using external files: sending 500 e-mails, recording which ones have been sent and at what time? Each time the script runs, it would check again which e-mails have to be sent and when the last e-mail went out; if that was more than one hour before the current time, it would send 500 new e-mails.
Or is there any other, more convenient way?
Just to be sure: my e-mails are legitimate and expected by the users (and wanted).
MIME::Lite itself doesn't implement this, but it's easy to implement yourself. Assuming you have a sub send_to($recipient, $msg) that actually uses MIME::Lite to send the message, you can wrap it with something like:
my $msg = ...;
my @recipients = ...;

while (@recipients) {
    for (1 .. 500) {
        last unless @recipients;  # batch is implicitly over if we're out of people to send to
        send_to(shift @recipients, $msg);
    }
    sleep 3600 if @recipients;    # wait an hour before the next batch
}
Note that this is contingent upon your host server allowing you to keep a process running for enough hours to work through the entire list. If they don't, then you'll need to work up something with a database to track all recipients and which have already been mailed to.
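If you do go the database route, the bookkeeping is small enough to sketch (Python here for brevity; a DBI-based Perl version would have the same shape, and the recipients table is an assumption):

# Cron-driven variant: run once per hour, send at most 500 unsent recipients.
# Assumed table: recipients(email TEXT PRIMARY KEY, sent_at INTEGER).
import sqlite3
import time

def send_batch(db_path, send_to, msg, limit=500):
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT email FROM recipients WHERE sent_at IS NULL LIMIT ?", (limit,)
    ).fetchall()
    for (email,) in rows:
        send_to(email, msg)  # conceptually the same helper as above
        conn.execute("UPDATE recipients SET sent_at = ? WHERE email = ?",
                     (int(time.time()), email))
        conn.commit()  # commit per mail so a crash loses at most one record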
Honestly, though, it would probably be better (and likely easier) to use real mailing list software to handle this instead of writing your own semi-functional list server. Perhaps your hosting service offers mailing lists as well?
I need to implement support for multiple messages per connection in my SMTP server.
Every message ends with:
data
<<content>>
.
And logically, the protocol state should then be reset to the "after authentication" point. Is that correct?
The question: is it possible for a client to send one message's content with multiple DATA commands? Does the standard allow it?
From RFC2821 ("Simple Mail Transfer Protocol"):
The mail data is terminated by a line containing only a period, that is, the character sequence "<CRLF>.<CRLF>" (see section 4.5.2).
...
Receipt of the end of mail data indication requires the server to process the stored mail transaction information. This processing consumes the information in the reverse-path buffer, the forward-path buffer, and the mail data buffer, and on the completion of this command these buffers are cleared.
i.e. after <CRLF>.<CRLF> is received, the server consumes the mail data and clears its buffers; hence the client cannot then send more content associated with the message, since the server will have forgotten about the message.
...
Once started, a mail transaction consists of a transaction beginning command, one or more RCPT commands, and a DATA command, in that order.
...
MAIL (or SEND, SOML, or SAML) MUST NOT be sent if a mail transaction is already open, i.e., it should be sent only if no mail transaction had been started in the session, or if the previous one successfully concluded with a successful DATA command, or if the previous one was aborted with a RSET.
i.e. MAIL begins a new mail transaction, and a successful DATA command (terminated by <CRLF>.<CRLF>) concludes it; the client may then send another message.
From RFC4954 ("SMTP Service Extension for Authentication"):
After an AUTH command has been successfully completed, no more AUTH commands may be issued in the same session. After a successful AUTH command completes, a server MUST reject any further AUTH commands with a 503 reply.
i.e. authentication takes place at most once per session, and applies until the end of that session.
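To illustrate both points together (my illustration, not from the RFCs): one session performs AUTH once and may then run several complete mail transactions. Using Python's smtplib, with placeholder host, credentials and addresses:

# One authenticated SMTP session carrying several mail transactions.
import smtplib

with smtplib.SMTP("mail.example.com", 587) as s:  # placeholder host
    s.starttls()
    s.login("user", "secret")  # AUTH: at most once per session
    for rcpt in ("a@example.com", "b@example.com"):
        # Each sendmail() runs one full transaction:
        # MAIL FROM, RCPT TO, DATA ... <CRLF>.<CRLF>
        s.sendmail("me@example.com", [rcpt],
                   "Subject: test\r\n\r\nbody\r\n")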