Using a saga, given an event EventA, saga starts, it sends a command (or many).
How can we make sure that the command is sent successfully then actual logic in other micro-service did not throw, etc.
Let's have an example of email saga:
When a user register, we create a User Aggregate which publishes UserRegisteredEvent, a saga will be created and this saga is responsible to make sure that registration email is sent to user (email may contain a verification key, welcome message, etc).
Should we use :
commandGateway.sendAndWait with a try/catch -> does it scale?
commandGateway.send and use a deadline and use some kind of "fail event" like SendEmailFailedEvent -> requires to associate a "token" for commands so can associate the "associationProperty" with the correct saga
that sent SendRegistrationEmailCommand
commandGateway.send(...).handle(...) -> in handle can we reference eventGateway/commandGateway that were in MyEmailSaga?
If error we send an event? Or can we modify/call a method from the saga instance we had. If no error then other service have sent an event like "RegistrationEmailSentEvent" so saga will end.
use deadline because we just use "send" and do not handle the eventual error of the command which may have failed to be sent (other service is down, etc)
something else?
Or a combination of all?
How to handle errors below? (use deadline or .handle(...) or other)
Errors could be:
command has no handlers (no service up, etc)
command was handled but exception is raised in other service and no event is sent (no try/catch in other service)
command was handled, exception raised and caught, other service publish an event to notify it failed to send email (saga will receive event and do appropriate action depending on event type and data provided -> maybe email is wrong or does not exist so no need to retry)
other errors I missed?
#Saga
public class MyEmailSaga {
#Autowired
transient CommandGateway commandGateway;
#Autowired
transient EventGateway eventGateway;
#Autowired
transient SomeService someService;
String id;
SomeData state;
/** count retry times we send email so can apply logic on it */
int sendRetryCount;
#StartSaga
#SagaEventHandler(associationProperty = "id")
public void on(UserRegisteredEvent event) {
id = event.getApplicationId();
//state = event........
// what are the possibilities here?
// Can we use sendAndWait but it does not scale very well, right?
commandGateway.send(new SendRegistrationEmailCommand(...));
// Is deadline good since we do not handle the "send" of the command
}
// Use a #DeadlineHandler to retry ?
#DeadlineHandler(deadlineName = "retry_send_registration_email")
fun on() {
// resend command and re-schedule a deadline, etc
}
#EndSaga
#SagaEventHandler(associationProperty = "id")
public void on(RegistrationEmailSentEvent event) {
}
}
EDIT (after accepted answer):
Mainly two options (Sorry but kotlin code below):
First option
commandGateway.send(SendRegistrationEmailCommand(...))
.handle({ t, result ->
if (t != null) {
// send event (could be caught be the same saga eventually) or send command or both
}else{
// send event (could be caught be the same saga eventually) or send command or both
}
})
// If not use handle(...) then you can use thenApply as well
.thenApply { eventGateway.publish(SomeSuccessfulEvent(...)) }
.thenApply { commandGateway.send(SomeSuccessfulSendOnSuccessCommand) }
2nd option:
Use a deadline to make sure that saga do something if SendRegistrationEmailCommand failed and you did not receive any events on the failure (when you do not handle the command sent).
Can of course use deadline for other purposes.
When the SendRegistrationEmailCommand was received successfully, the receiver will publish an event so the saga will be notified and act on it.
Could be an RegistrationEmailSentEvent or RegistrationEmailSendFailedEvent.
Summary:
It seems that it is best to use handle() only if the command failed to be sent or receiver has thrown an unexpected exception, if so then publish an event for the saga to act on it.
In case of success, the receiver should publish the event, saga will listen for it (and eventually register a deadline just in case); Receiver may also send event to notify of error and do not throw, saga will also listen to this event.
ideally, you would use the asynchronous options to deal with errors. This would either be commandGateway.send(command) or commandGateway.send(command).thenApply(). If the failure are businesslogic related, then it may make sense to emit events on these failures. A plain gateway.send(command) then makes sense; the Saga can react on the events returned as a result. Otherwise, you will have to deal with the result of the command.
Whether you need to use sendAndWait or just send().then... depends on the activity you need to do when it fails. Unfortunately, when dealing with results asynchronously, you cannot safely modify the state of the Saga anymore. Axon may have persisted the state of the saga already, causing these changes to go lost. sendAndWait resolves that. Scalability is not often an issue, because different Sagas can be executed in parallel, depending on your processor configuration.
The Axon team is currently looking at possible APIs that would allow for safe asynchronous execution of logic in Sagas, while still keeping guarantees about thread safety and state persistence.
Related
Using mirth version 3.8.1. I've set up an alert for a channel's errors. When errors come from the destination transformer (which is Javascript), the alert is able to access the {messageId} variable and pull the correct id. However, when an error originates in the Javascript-based database writer, the alert just returns '{messageId}' instead of the value.
I tried a bunch of things...
The global map is accessible from the alert, but putting a message id in there would get overwritten by another processing thread.
Other destination types - http sender, tcp sender, channel writer, and even a non-javascript-based database writer destination all work.
I even stripped the database writer code down to just:
var dbConn;
dbConn = DatabaseConnectionFactory.createDatabaseConnection('com.mysql.cj.jdbc.Driver','jdbc:mysql://host:port/dbname','','');
Do I just have to raise specific exceptions within the db writer code and raise alerts when those exceptions are hit, and send the message id in the error string?
You stumbled across a bug. I opened an issue and a fix.
If not for another bug that also neglects to provide the messageId, you should be able to use alerts.sendAlert('Custom Error Message'). alerts is an instance of AlertSender from the User API that mirth creates for you. I created a fix for that as well.
The only workaround I know of at this time to manually send an alert that includes the messageId is to call the EventController directly. The caveat is that this is technically not supported as part of a public API and usage could break in future versions without notice.
com.mirth.connect.server.controllers.ControllerFactory
.getFactory()
.createEventController()
.dispatchEvent(new com.mirth.connect.donkey.server.event.ErrorEvent(
connectorMessage.getChannelId(),
connectorMessage.getMetaDataId(),
connectorMessage.getMessageId(),
com.mirth.connect.donkey.model.event.ErrorEventType.USER_DEFINED_TRANSFORMER,
connectorMessage.getConnectorName(),
null, /* connectorType */
'A TEST ERROR MESSAGE',
null /* throwable */
)
);
This will work as written from a filter, transformer, Javascript Writer, or Database Writer in javascript mode. In other contexts, connectorMessage won't be defined and you'll have to provide some of those values in a different way. If you don't need the messageId and don't want to throw an exception, just use alerts.sendAlert(errorMessage) since that doesn't require calling unsupported internal classes.
I've configured monolog to send errors via email as described in the symfony docs here: https://symfony.com/doc/4.3/logging/monolog_email.html
Works well with all errors happing during a request, as well as console command errors.
But it does not send emails for errors which occurred during the handling of a messenger message.
Errors are shown when running the consumer bin/console messenger:consume async -vv and they also show up in prod.log like this:
[2020-01-10 12:52:38] messenger.CRITICAL: Error thrown while handling message...
Thanks for any hints on how to set up monolog to get messenger errors emailed too.
In fact monolog swift_mailer type use SwiftMailerHandler
wish also implements reset interface and use memory spool by default wish keep all emails in buffer until it is destructed, so till the end of request :
onKernelTerminate
onCliTerminate
OR till reset method is called, which means that for messenger worker no emails will be send ever because ther's no instant flush - all of them will be kept in in-memory buffer, and probably lost if the process will be killed.
To solve this, you can just disable the default spool memory setting for swiftmailer.
Another solution is to flush your emails after WorkerMessageFailedEvent event gets fired, you can implement an event subscriber to do it for this.
use Symfony\Component\EventDispatcher\EventSubscriberInterface;
use Symfony\Component\Messenger\Event\WorkerMessageFailedEvent;
use Symfony\Component\Messenger\Event\WorkerMessageHandledEvent;
use Symfony\Contracts\Service\ResetInterface;
/**
* Class ServiceResetterSubscriber.
*/
class ServiceResetterSubscriber implements EventSubscriberInterface
{
protected ResetInterface $servicesResetter;
public function __construct(ResetInterface $servicesResetter)
{
$this->servicesResetter = $servicesResetter;
}
public function resetServices(): void
{
$this->servicesResetter->reset();
}
public static function getSubscribedEvents(): array
{
return [
WorkerMessageFailedEvent::class => ['resetServices', 10],
];
}
}
Register your service with the right argument:
App\EventSubscriber\ServiceResetterSubscriber:
arguments: ['#services_resetter']
By the way without this (and without buffer limit) your app will leak and no emails will be sent ever.
Another trick:
Make sure that your message implements \JsonSerializable to get the message content in your logs, because messenger uses his monolog directly and its context serializer wish use json_encode for seriliazation.
That's why we need to customize their JSON representation when encoded is done with json_encode.
I am building a bot for for Facebook Messenger using Microsoft Bot Framework. I am planning to use CosmosDB for State Management and also as my backend data store. (I am not stuck to CosmosBD and can use any other store if needed)
I need to send daily/weekly proactive messages(push notifications) to users based on their time preference. I will capturing their time preference when they first interact with the bot.
What is the best way to deliver these notifications?
As I will be storing these preferences in CosmosDB, I am thinking using ComosDB trigger of creating an Azure Function and schedule it based on the user time preference. This Azure function will make a call to my webhook which will deliver these messages. If requried, I will change Function schedule when a user changes his/her preference.
My questions are:
Is this a good approach?
Are there any other alternatives (Notifications Hub?)
I should be able to set specific times for notifications (like at the top of the hour or something like that), does it make sense to schedule an Azure Function to run at these hours rather than creating a function based on user preference (I can actually combine these two approaches too)
Thank you in advance.
First, I don't think there's any "right" answer to be given here; it's going to depend a lot on your domain's specific needs. Scale is going to play a major factor in the design of this. Will you have 100 users? 10000 users? 1mil users? I'm going to assume you want to design for maximum scale up front, but it could be overkill.
First, based on what you've described, I don't think a CosmosDB trigger is necessarily the solution to your problem because that's only going to fire when the preference data is created/updated. I assume that, from that point forward, your function needs to continuously fire at the time slot they've opted into, correct?
So let's pretend you let people choose from the 24hrs in the day. A naïve approach would be to simply use a scheduled trigger that fires up every hour, queries the CosmosDB for all the documents where the preference is set to that particular hour and then begins sending out notifications from there. The problem is how you scale from there and deal with issues of idempotency in the face of failures.
First off, a timer trigger only ever spins up one instance. If you were to just go query the CosmosDB documents and start processing them one by one in the scope of that single trigger, you'd hit a ceiling relatively quickly on how many notifications you can scale to. Instead what you'd want to do is use that timer trigger to fan out the notifications to as many "worker" function instances as possible. The timer trigger can act as the orchestrator in the sense that it can own the query against the CosmosDB and then turn each document result it finds for that particular notification time window into a message that it places on a queue to be processed by a separate function which will scale out on its own.
There are actually a couple ways you can accomplish this with Azure Functions, it really depends on how early an adopter of technology you are comfortable with being.
The first is what I would call the "manual" way which would be done by simply using the existing Azure Storage Queue extension by taking an IAsyncCollector<YourNotificationWorkerMessage> as a parameter to the timer function that's bound to the worker queue and then pumping out the messages through that. Then you write a second companion function which uses a QueueTrigger, bind it to that same queue, and it will take care of processing each message. This second function is where you get the scaling, enabling process all of the queued messages as quickly as possible based on whatever scaling parameters you choose to configure. This is the "simplest" approach
The second approach would be to adopt the newer Durable Functions extension. With that model, you don't have to directly think about creating a worker queue. You simply kick off a new instance of your orchestrator function from the timer function and the orchestrator fans out the work by invoking N "concurrent" calls to an action for each notification. Now, it happens to distribute those calls using queues under the covers, but that's an implementation detail that you need no longer maintain yourself. Additionally, if the work of delivering the notification requires more involved work and/or retry logic, you might actually consider using a sub-orchestration instead of a simple action. Finally, another added benefit of this approach, is that you can "fan back in" to your main orchestrator function once all the notifications are delivered to do some follow up work... even if that's simply some kind of event logging that the notification cycle has completed for this hour.
Now, the challenge with either of these approach is actually dealing with failure in initially fetching the candidates for notification from CosmosDB, paging through the results and making sure you actually fan all of them out in an idempotent manner. You need to deal with possible hiccups as you page and you need to deal with the fact that your whole function could be torn down and you might have to restart. Perhaps on the initial run of the 8AM notifications you got through page 273 out of 371 pages and then you got hit with a complete network connectivity fail or the VM your function was running on suffered a power failure. You could resume, but you'd need to know that you left off on page 273 and that you actually processed the 27th record out of that page and start from there. Otherwise, you risk sending double notifications to your users. Maybe that's something you can accept, maybe it's not. Maybe you're ok with the 27 notifications on that page being duplicated as long as the first 272 pages aren't. Again, this is something you need to decide for your domain, but if you want to avoid this issue your orchestrator function will need to track its progress to ensure that it doesn't send out dupes. Again I would say Durable Functions has a leg up here as it comes with the ability to configure retries. Maintaining the state of a particular run is left up to the author in either approach though.
I use pro-active dialog extensively with botframwork and messenger without any issue. During your facebook approval process you simply need to inform them you will be sending notifications trough messenger with your bot. Usually if you use it to inform your user and stay away from promotional content you should be fine.
I also use azure function to trigger the pro-active dialog from a custom controller endpoint.
Bellow sample code for azure function:
public static void Run(TimerInfo notificationTrigger, TraceWriter log)
{
try
{
//Serialize request object
string timerInfo = JsonConvert.SerializeObject(notificationTrigger);
//Create a request for bot service with security token
HttpRequestMessage hrm = new HttpRequestMessage()
{
Method = HttpMethod.Post,
RequestUri = new Uri(NotificationEndPointUrl),
Content = new StringContent(timerInfo, Encoding.UTF8, "application/json")
};
hrm.Headers.Add("Authorization", NotificationApiKey);
log.Info(JsonConvert.SerializeObject(hrm));
//Call service
using (var client = new HttpClient())
{
Task task = client.SendAsync(hrm).ContinueWith((taskResponse) =>
{
HttpResponseMessage result = taskResponse.Result;
var jsonString = result.Content.ReadAsStringAsync();
jsonString.Wait();
if (result.StatusCode != System.Net.HttpStatusCode.OK)
{
//Throw what ever problem as an exception with details
throw new Exception($"AzureFunction - ERRROR - HTTP {result.StatusCode}");
}
});
task.Wait();
}
}
catch (Exception ex)
{
//TODO log
}
}
Bellow sample code for starting the pro-active dialog:
public static async Task Resume<T, R>(string resumptionCookie) where T : IDialog<R>, new()
{
//Deserialize reference to conversation
ConversationReference conversationReference = JsonConvert.DeserializeObject<ConversationReference>(resumptionCookie);
//Generate message from bot to user
var message = conversationReference.GetPostToBotMessage();
var builder = new ContainerBuilder();
using (var scope = DialogModule.BeginLifetimeScope(Conversation.Container, message))
{
//From a cold start the service is not yet authenticated with dev bot azure services
//We thus must trust endpoint url.
if (!MicrosoftAppCredentials.IsTrustedServiceUrl(message.ServiceUrl))
{
MicrosoftAppCredentials.TrustServiceUrl(message.ServiceUrl, DateTime.MaxValue);
}
var botData = scope.Resolve<IBotData>();
await botData.LoadAsync(CancellationToken.None);
//This is our dialog stack
var task = scope.Resolve<IDialogTask>();
T dialog = scope.Resolve<T>(); //Resolve the dialog using autofac
try
{
task.Call(dialog.Void<R, IMessageActivity>(), null);
await task.PollAsync(CancellationToken.None);
}
catch (Exception ex)
{
//TODO log
}
finally
{
//flush dialog stack
await botData.FlushAsync(CancellationToken.None);
}
}
}
Your dialog needs to be registered in autofac.
Your resumptionCookie needs to be saved in your db.
You might want to check FB policy regarding proactive messages
There’s a 24h limit but it might not be totally screwed in your case
https://developers.facebook.com/docs/messenger-platform/policy/policy-overview#standard_messaging
The question assumes the use of Event Sourcing.
When rebuilding current state by replaying events, event handlers should be idempotent. For example, when a user successfully updates their username, a UsernameUpdated event might be emitted, the event containing a newUsername string property. When rebuilding current state, the appropriate event handler receives the UsernameUpdated event and sets the username property on the User object to the newUsername property of the UsernameUpdated event object. In other words, the handling of the same message multiple times always yields the same result.
However, how does such an event handler work when integrating with external services? For example, if the user wants to reset their password, the User object might emit a PasswordResetRequested event, which is handled by a portion of code that issues a 3rd party with a command to send an SMS. Now when the application is rebuilt, we do NOT want to re-send this SMS. How is this situation best avoided?
There are two messages involved in the interaction: commands and events.
I do not regard the system messages in a messaging infrastructure the same as domain events. Command message handling should be idempotent. Event handlers typically would not need to be.
In your scenario I could tell the aggregate root 100 times to update the user name:
public UserNameChanged ChangeUserName(string username, IServiceBus serviceBus)
{
if (_username.Equals(username))
{
return null;
}
serviceBus.Send(new SendEMailCommand(*data*));
return On(new UserNameChanged{ Username = userName});
}
public UserNameChanged On(UserNameChanged #event)
{
_username = #event.UserName;
return #event;
}
The above code would result in a single event so reconstituting it would not produce any duplicate processing. Even if we had 100 UserNameChanged events the result would still be the same as the On method does not perform any processing. I guess the point to remember is that the command side does all the real work and the event side is used only to change the state of the object.
The above isn't necessarily how I would implement the messaging but it does demonstrate the concept.
I think you are mixing two separate concepts here. The first is reconstructing an object where the handlers are all internal methods of the entity itself. Sample code from Axon framework
public class MyAggregateRoot extends AbstractAnnotatedAggregateRoot {
#AggregateIdentifier
private String aggregateIdentifier;
private String someProperty;
public MyAggregateRoot(String id) {
apply(new MyAggregateCreatedEvent(id));
}
// constructor needed for reconstruction
protected MyAggregateRoot() {
}
#EventSourcingHandler
private void handleMyAggregateCreatedEvent(MyAggregateCreatedEvent event) {
// make sure identifier is always initialized properly
this.aggregateIdentifier = event.getMyAggregateIdentifier();
// do something with someProperty
}
}
Surely you wouldn't put code that talks to an external API inside an aggregate's method.
The second is replaying events on a bounded context which could cause the problem you are talking about and depending on your case you may need to divide your event handlers into clusters.
See Axon frameworks documentation for this point to get a better understanding of the problem and the solution they went with.
Replaying Events on a Cluster
TLDR; store the SMS identifier within the event itself.
A core principle of event sourcing is "idempotency". Events are idempotent, meaning that processing them multiple times will have the same result as if they were processed once. Commands are "non-idempotent", meaning that the re-execution of a command may have a different result for each execution.
The fact that aggregates are identified by UUID (with a very low percentage of duplication) means that the client can generate the UUIDs of newly created aggregates. Process managers (a.k.a., "Sagas") coordinate actions across multiple aggregates by listening to events in order to issue commands, so in this sense, the process manager is also a "client". Cecause the process manager issues commands, it cannot be considered "idempotent".
One solution I came up with is to include the UUID of the soon-to-be-created SMS in the PasswordResetRequested event. This allows the process manager to only create the SMS if it does not yet already exist, hence achieving idempotency.
Sample code below (C++ pseudo-code):
// The event indicating a password reset was successfully requested.
class PasswordResetRequested : public Event {
public:
PasswordResetRequested(const Uuid& userUuid, const Uuid& smsUuid, const std::string& passwordResetCode);
const Uuid userUuid;
const Uuid smsUuid;
const std::string passwordResetCode;
};
// The user aggregate root.
class User {
public:
PasswordResetRequested requestPasswordReset() {
// Realistically, the password reset functionality would have it's own class
// with functionality like checking request timestamps, generationg of the random
// code, etc.
Uuid smsUuid = Uuid::random();
passwordResetCode_ = generateRandomString();
return PasswordResetRequested(userUuid_, smsUuid, passwordResetCode_);
}
private:
Uuid userUuid_;
string passwordResetCode_;
};
// The process manager (aka, "saga") for handling password resets.
class PasswordResetProcessManager {
public:
void on(const PasswordResetRequested& event) {
if (!smsRepository_.hasSms(event.smsUuid)) {
smsRepository_.queueSms(event.smsUuid, "Your password reset code is: " + event.passwordResetCode);
}
}
};
There are a few things to note about the above solution:
Firstly, while there is a (very) low possibility that the SMS UUIDs can conflict, it can actually happen, which could cause several issues.
Communication with the external service is prevented. For example, if user "bob" requests a password reset that generates an SMS UUID of "1234", then (perhaps 2 years later) user "frank" requests a password reset that generates the same SMS UUID of "1234", the process manager will not queue the SMS because it thinks it already exists, so frank will never see it.
Incorrect reporting in the read model. Because there is a duplicate UUID, the read side may display the SMS sent to "bob" when "frank" is viewing the list of SMSes the system sent him. If the duplicate UUIDs were generated in quick succession, it is possible that "frank" would be able to reset "bob"s password.
Secondly, moving the SMS UUID generation into the event means you must make the User aggregate aware of the PasswordResetProcessManager's functionality (but not the PasswordResetManager itself), which increases coupling. However, the coupling here is loose, in that the User is unaware of how to queue an SMS, only that an SMS should be queued. If the User class were to send the SMS itself, you could run into the situation in which the SmsQueued event is stored while the PasswordResetRequested event is not, meaning that the user will receive an SMS but the generated password reset code was not saved on the user, and so entering the code will not reset the password.
Thirdly, if a PasswordResetRequested event is generated but the system crashes before the PasswordResetProcessManager can create the SMS, then the SMS will eventually be sent, but only when the PasswordResetRequested event is re-played (which might be a long time in the future). E.g., the "eventual" part of eventual consistency could be a long time away.
The above approach works (and I can see that it should also work in more complicated scenarious, like the OrderProcessManager described here: https://msdn.microsoft.com/en-us/library/jj591569.aspx). However, I am very keen to hear what other people think about this approach.
My CustomerActivity class happens also to be a presenter in the MVP sense. In response to actions by the user, the following code is called:
context.update(customer).fire(new Receiver<CustomerProxy>() {
public void onSuccess(CustomerProxy response) {
// set the view according to the response
}
});
When the above code executes, two things happen:
I receive an updated copy of the Customer, with which I can refresh the state of the view
An EntityProxyChange event is fired
The CustomerActivity listens for EntityProxyChange events because other Activities also make changes to customer records, and I want to keep the CustomerActivity up-to-date.
EntityProxyChange.registerForProxyType(eventBus, CustomerProxy.class,
new EntityProxyChange.Handler<CustomerProxy>() {
public void onProxyChange(EntityProxyChange<CustomerProxy> event) {
fetchCustomer(event.getProxyId());
// ...but what if it was me that made the change?
}
});
Since theupdate method already returns an up-to-date Customer, I don't need to fetch the customer again during processing of the EntityProxyChange; and I don't want to incur the cost of another call to the server if I can avoid it.
I was hoping that the EntityProxyChange class would provide me with the entity's version number, which I could compare with the version of the customer I have cached. No dice.
I suppose I can set some kind of inTheMiddleOfAnUpdateOperation flag, and check it before fetching the customer. Is that what people are doing? The idea makes me gag just a little bit. Can you suggest a better way to write an activity that listens for changes and makes changes to the same entity types?
I'm not sure this would work, but you could try to create another event bus :
final EventBus customerActivityEventBus = new SimpleEventBus();
Initialize a new RequestFactory with this eventBus and use it in the CustomerActivity.
customerActivityRequestFactory = GWT.create(CustomerRequestFactory.class);
customerActivityRequestFactory.initialize(customeActivityEventBus);
CustomerRequest context = customerActivityRequestFactory.customerRequest();
Your CustomerActivity will still listen to change events from the main eventBus but will not see the events it fired.
In your other activities, listen to events either only from the customerActivityEventBus or from both, depending on what you want.
Of course, keep only one main eventBus to use for events that are not from a Request Factory (ie. ActivityManager, PlaceHistoryHandler, etc ..)