Are messages in the mailbox from Reliable Actors stateful? - azure-service-fabric

The state of Reliable Actors, including reminders, is restored whenever a primary node fails. However, I could not find any information about messages in the mailbox. What happens to these messages: are they lost, or does the actor restore them?
The only information I could find is the following:
Because the actor service itself is a reliable service, all the
application model, lifecycle, packaging, deployment, upgrade, and
scaling concepts of Reliable Services apply the same way to actor
services.
I'm not sure whether the quote above also covers messages in an actor's mailbox.
https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-reliable-actors-platform

SF Reliable Actors do not use the mailbox messaging approach that Akka uses.
Akka uses the TELL approach: messages are sent to the actor and stored in a mailbox for processing, and once processing is complete the actor sends a message back to the caller with the answer.
SF uses the ASK approach: the caller keeps waiting for the answer, so there is no mailbox. Calls are processed in the order in which they acquire the actor's lock. If the actor service fails, the pending calls and locks are dropped.
Because calls and retries to actors are managed by the caller through the ActorProxy, the proxy resends the call, which reaches a new service instance/replica and consequently gets a different position in the processing order than before.
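To make the caller-managed retry concrete, here is a minimal Python sketch. It is not the Service Fabric API: `ActorReplica`, `Proxy`, and `ReplicaDown` are hypothetical names, and the failover is simulated with a flag. It only illustrates that the pending call lives with the caller, so there is no mailbox to restore on the actor side.

```python
class ReplicaDown(Exception):
    """Raised when the replica hosting the actor is no longer reachable."""


class ActorReplica:
    """Hypothetical stand-in for one replica of an actor service."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ReplicaDown(self.name)
        return f"{self.name} handled {request}"


class Proxy:
    """Hypothetical caller-side proxy: it keeps the request and resends it."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.current = replicas[0]  # cached endpoint of the current primary

    def _resolve(self):
        # Re-resolve the endpoint by picking the first replica that is alive.
        for replica in self.replicas:
            if replica.alive:
                return replica
        raise ReplicaDown("no replica available")

    def call(self, request, max_attempts=3):
        for _ in range(max_attempts):
            try:
                return self.current.handle(request)
            except ReplicaDown:
                # The call was dropped with the failed replica; resend it
                # to whichever replica is available now.
                self.current = self._resolve()
        raise ReplicaDown("gave up after retries")
```

In this sketch a call issued after the first replica goes down is simply resent to the second one. Nothing is replayed from the failed replica.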

Related

What are the underlying mechanisms of actor->actor or service->actor calls, and how reliable are they?

I'd like more technical details about the underlying mechanisms of calling Actors in Azure Service Fabric, which I can't easily find online. Actors are known for their single-threaded scope: until a method execution has fully completed, no other client is allowed to call the actor.
To be more specific, I need to know what happens if an Actor is stuck for a while on a job initiated by one client call. How long are other clients supposed to wait until the job is done? Seconds, minutes, hours?
Is there any time-out mechanism, and if so, is it somehow configurable?
What happens if the node where the actor is located crashes? Would the client receive an immediate error, or does ActorProxy somehow handle this situation and redirect the call to a newly created instance of the Actor on a healthy node?
There are quite a few SO answers with details about actor mechanisms, and the docs cover them too. I can point you to a few:
This one does not exactly answer your question, but I described a bit how the locking works: Start a thread inside an Azure Service Fabric actor?
Q: Is there any time-out mechanism, and if so is it somehow configurable?
Yes, there is a timeout. I have answered that here: Acquisition of turn based concurrency lock for actor '{actorName}' timed out after {time}
The configuration docs are located here.
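As a rough illustration of a turn-based lock with an acquisition timeout, here is a small Python sketch built on `asyncio`. It does not use or reproduce the Service Fabric implementation: `TurnBasedActor` and `lock_timeout` are hypothetical names chosen for this example, and the timeout values are arbitrary.

```python
import asyncio


class TurnBasedActor:
    """Hypothetical actor that runs one call at a time, with a lock timeout."""

    def __init__(self, lock_timeout):
        self._lock = asyncio.Lock()
        self._lock_timeout = lock_timeout  # seconds a caller waits for its turn

    async def invoke(self, work_seconds):
        try:
            # Wait for the turn, but only up to the configured timeout.
            await asyncio.wait_for(self._lock.acquire(), self._lock_timeout)
        except asyncio.TimeoutError as error:
            raise TimeoutError("acquiring the turn-based lock timed out") from error
        try:
            await asyncio.sleep(work_seconds)  # simulated actor method body
            return "done"
        finally:
            self._lock.release()


async def demo():
    actor = TurnBasedActor(lock_timeout=0.05)
    slow_call = asyncio.create_task(actor.invoke(0.2))  # holds the lock
    await asyncio.sleep(0.01)  # let the slow call start first
    try:
        second = await actor.invoke(0)
    except TimeoutError:
        second = "timed out"
    return await slow_call, second
```

Running `asyncio.run(demo())` shows the second caller giving up while the first call is still holding the lock. The first call itself is not interrupted.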
Q: What happens if the node where the actor is located crashes? Would the client receive an immediate error, or does ActorProxy somehow handle this situation and redirect the call to a newly created instance of the Actor on a healthy node?
Generally, another replica is available when one replica goes down. New requests start moving to the new replica once SF promotes a secondary replica to Primary.
Regarding communication: by default, SF Actors use Service Fabric service remoting, the same way Reliable Services do, and the behaviour is described very well here. In summary, it retries transient failures; if the client can't connect to the service (Actor), it retries until it reaches the connection timeout.
From the docs:
The service proxy handles all failover exceptions for the service partition it is created for. It re-resolves the endpoints if there are failover exceptions (non-transient exceptions) and retries the call with the correct endpoint. The number of retries for failover exceptions is indefinite. If transient exceptions occur, the proxy retries the call.
The Actor docs have more info. In summary, there are two points to keep in mind:
Message delivery is best effort.
Actors may receive duplicate messages from the same client.
That means that if a transient failure occurs while delivering a message, the proxy retries even though the message may already have been delivered, which causes duplicate messages.
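One common way to live with duplicate deliveries is to make the receiving method idempotent. The following Python sketch assumes the caller attaches a unique request id to each logical call. `IdempotentActor`, `deposit`, and `request_id` are hypothetical names for illustration and are not part of the Actor API.

```python
class IdempotentActor:
    """Hypothetical actor that ignores repeated deliveries of the same request."""

    def __init__(self):
        self.balance = 0
        self.results = {}  # request_id -> result already returned for it

    def deposit(self, request_id, amount):
        # A retried call carries the same request_id, so return the stored
        # result instead of applying the deposit a second time.
        if request_id in self.results:
            return self.results[request_id]
        self.balance += amount
        self.results[request_id] = self.balance
        return self.balance
```

If the same request arrives twice, the balance changes only once and both deliveries return the same result. In a real actor the set of processed ids would have to be part of the persisted state and trimmed at some point.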

How to deal with long-lasting operations in Reliable Actors or stateful Reliable Service and 're-process' failed states

I'm new to Service Fabric Reliable Actors and am trying to figure out best practices for this specific scenario:
Let's say we have some legacy code from which we want to call new code built on SF Reliable Actors. Actors of a certain type, "ActorExecutor", are going to asynchronously call a third-party service that can sometimes get stuck for a pretty long time, longer than the actor's calling client is ready to wait, or can even experience prolonged underlying communication issues. We do not want the client (legacy code) to be blocked by any sort of issue in ActorExecutor; it does not expect to receive any value or status back from the actor. Should we use an SF ReliableQueue for that? Should we use some sort of actor-broker to receive requests from the client and store them in a queue: Client->ActorBroker->ActorExecutor? Could reminders be helpful here?
One more question in this regard: given that many thousands of actors might be stuck in a 'third-party incomplete call' at the same time, and we want to reactivate them and repeat the very last call, should we write a new tool for that? In NServiceBus you can create an error queue in MSMQ where all failed ('unable to process') messages land, and we were then able to simply re-process them at any time in the future. From my understanding, there is no such thing in Service Fabric and it's something we need to build on our own.
An event-driven approach can help you here. Instead of waiting for the Actor to return from its call to the service, you can enqueue a task on it to request that it perform some action. The Actor that calls the service would then function autonomously, processing items from its task queue. This allows it to perform retries and error handling. After a successful call, a new event can notify the rest of the system.
Maybe this project can help you to get started.
edits:
At this time, I don't believe you can use reliable collections in Actors. So a queue inside the state of an Actor is a regular (read-only) collection.
Process the queue using an Actor Timer. Don't use the threadpool, as it's not persistent and won't survive crashes and Actor garbage collections.
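Here is a minimal Python sketch of that idea, under stated assumptions: `QueueingActor`, `enqueue`, and `on_timer` are hypothetical names, the `pending` queue stands in for a collection kept in the actor's saved state, and `on_timer` stands in for the callback an Actor Timer would invoke. The `failed` list plays the role of a home-made error queue, since that has to be built yourself.

```python
from collections import deque


class QueueingActor:
    """Hypothetical actor that queues work and drains it on timer ticks."""

    def __init__(self, call_service, max_attempts=3):
        self.pending = deque()  # stand-in for a queue kept in actor state
        self.failed = []        # items that exhausted their retries
        self.call_service = call_service
        self.max_attempts = max_attempts

    def enqueue(self, item):
        # Returns immediately, so the client is never blocked by the service.
        self.pending.append({"item": item, "attempts": 0})

    def on_timer(self):
        # Process at most one item per tick; a failed item stays at the
        # head of the queue and is retried on the next tick.
        if not self.pending:
            return
        task = self.pending[0]
        task["attempts"] += 1
        try:
            self.call_service(task["item"])
        except Exception:
            if task["attempts"] >= self.max_attempts:
                self.failed.append(self.pending.popleft()["item"])
            return
        self.pending.popleft()
```

Items that end up in `failed` can be pushed back with `enqueue` later, which is roughly what re-processing an error queue would look like in this model.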

How to recover messages in Akka Actors now that Durable Mailboxes are removed?

I was working with the latest version of Akka when I noticed that durable mailboxes are now removed from Akka.
I need to make sure that my messages are recovered upon a restart after a crash. Is there an alternative way to work without durable mailboxes, or a custom implementation by someone else?
I also tried Akka Persistence, but it replays the messages, and I don't want to send the same messages twice in the event of a crash, given that all messages are expensive to process.
While this is not exactly a solution to work with Akka Actors, it does solve the original problem in question here.
Instead of using Akka here, I believe it's a better idea to use something like Kafka along with reactive streams with something like akka/reactive-kafka.
A system like that is very good for persistence, and offers very good semantics for preserving the message queue on a crash. This is way better than storing the message somewhere that is to be processed, and in general performs better.
It does not have to be Kafka, but any backend that can plug with a reactive stream (Akka's implementation or otherwise).
Akka Persistence replays events that were created based on received commands. Events are generated from command messages after validation and shouldn't be able to create invalid actor states.
This means that the initially received messages (commands) are not necessarily replayed; instead, you can persist events that are cheaper to apply in order to reconstruct the state of an actor after the crash. In addition, you can use snapshots to recover state directly.
Edit:
As mentioned in the comments, it is true that only the state of the actor is persisted and survives the crash. This state only reflects the consumed messages and not those that still reside in the actor's mailbox.
However, instead of pushing messages to an actor, where they would then be stored in a durable mailbox, an alternative might be for the 'recipient' to pull messages from a persistent actor that stores the list of messages as part of its state.
UntypedPersistentActorWithAtLeastOnceDelivery as part of akka persistence offers another possibility where the sender takes care of persisting messages.
I realize that those are not drop-in replacements for durable mailboxes, as they require rethinking the system. Pulling work from the consumers has worked for me so far. Initially we also considered message queue products (RabbitMQ with durable queues), but since our initial work items come from a db, we can deal with an Akka crash without durable messages.
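The pull-based idea can be sketched in a few lines of Python. This is not Akka code: `WorkStore` and `run_worker` are hypothetical names, and a plain dict stands in for the database or persistent actor that holds the work items. The key point is that an item is only removed after it has been processed, so a crash in between leaves it available for the next pull.

```python
class WorkStore:
    """Hypothetical durable store of work items; a dict stands in for a db."""

    def __init__(self):
        self.items = {}  # item_id -> payload, kept until acknowledged
        self.next_id = 0

    def add(self, payload):
        self.next_id += 1
        self.items[self.next_id] = payload
        return self.next_id

    def pull(self):
        # Return the oldest unacknowledged item without removing it.
        for item_id in sorted(self.items):
            return item_id, self.items[item_id]
        return None

    def ack(self, item_id):
        self.items.pop(item_id, None)


def run_worker(store, process):
    """Pull items until the store is empty, acknowledging each after success."""
    done = []
    while (work := store.pull()) is not None:
        item_id, payload = work
        process(payload)    # if this raises, the item stays in the store
        store.ack(item_id)  # only now is the item considered consumed
        done.append(payload)
    return done
```

Note that this gives at-least-once processing: a crash after `process` but before `ack` leads to the item being processed again, so expensive work still needs some form of deduplication.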

Routing MSMQ messages from one queue to another

Is there some standard configuration setting, service, or tool that accepts messages from one queue and moves them on to another one, automatically handling the dead message problem and providing some retry capability? I was thinking this is what "MSMQ Message Routing" does, but I can't seem to find documentation on it (except for Windows Mobile 6, and I don't know if that's relevant).
Context:
I understand that when using MSMQ you should always write to a local queue so that failure is unlikely, and then X should move that message to a remote queue. Is my understanding wrong? Is this where messaging infrastructure like Biztalk comes in? Is it unnecessary to write to a local queue first to absolutely ensure success? Am I supposed to build X myself?
As Hugh points out, you need only one MSMQ Queue to Send messages in one direction from a source to a destination. Source and destination can be on the same server, same network or across the internet, however, both source and destination must have the MSMQ service running.
If you need to do 'message' routing (e.g. a switch which processes messages from several source or destination queues, or routing a message to one or more subscribers based on the type of message, etc.), you would need more than just an MSMQ queue.
Although you certainly can use BizTalk to do message routing, this would be expensive / overkill if you didn't need to use other features of BizTalk. Would recommend you look at open source, or building something custom yourself.
But by "Routing" you might be referring to the queue redirection capability when using HTTP as the transport e.g. over the internet (e.g. here and here).
Re : Failed delivery and retry
I think you have most of the concepts - generally the message DELIVERY retry functionality should be implicit in MSMQ. If MSMQ cannot deliver the message before the defined expiry, then it will be returned on the Dead Letter Queue, and the source can then process messages from the DLQ and then 'compensate' for them (e.g. reverse the actions of the 'send', indicate failure to the user, etc).
However 'processing' type Retries in the destination will need to be performed by the destination application / listener (e.g. if the destination system is down, deadlocks, etc)
Common ways to do this include:
Using 2 Phase commit - under a distributed unit of work, pull the message off MSMQ and process it (e.g. insert data into a database, change the status of some records etc). If any failure is encountered, the message is put back onto the queue and the DB changes are rolled back.
Application level retries - i.e. on the destination system, in the event of 'retryable' type errors (timeout due to load, deadlocks etc), sleep for a few seconds and then retry the same transaction.
However, in most cases, indefinite processing retries are not desirable and you would ultimately need to admit defeat and implement a mechanism to log the message and the error and remove it from the queue.
But I wouldn't 'retry' business failures (e.g. Business Rules, Validation etc) and the behaviour should be defined in your requirements of how to handle these (e.g. account is overdrawn, message is not in a correct format or not valid, etc), e.g. by returning a "NACK" type message back to the source.
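A minimal Python sketch of this retry policy follows. It is not tied to MSMQ: `RetryableError`, `BusinessError`, and `process_with_retries` are hypothetical names, and the returned strings only label the outcome that the destination application would act on.

```python
import time


class RetryableError(Exception):
    """Transient failure such as a timeout or a deadlock."""


class BusinessError(Exception):
    """Rule or validation failure that a retry cannot fix."""


def process_with_retries(message, handler, max_attempts=3,
                         delay_seconds=2.0, sleep=time.sleep):
    """Apply handler to message, retrying only transient failures."""
    for attempt in range(1, max_attempts + 1):
        try:
            handler(message)
            return "processed"
        except BusinessError:
            # Do not retry; the caller should send a NACK back to the source.
            return "rejected"
        except RetryableError:
            if attempt < max_attempts:
                sleep(delay_seconds)  # back off before the next attempt
    # Retries exhausted: log the message and error, then take it off the queue.
    return "dead-lettered"
```

The `sleep` parameter is injectable so that the function can be exercised without actually waiting, for example with `sleep=lambda seconds: None`.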
HTH
MSMQ sends messages from one queue to another queue.
Let's say you have a queue on a remote machine. You want to send a message to that queue.
So you create a sender. A sender is an application that can use the MSMQ transport to send a message. This can be a .NET queue client (System.Messaging), a WCF service consumer (over either netMsmqBinding or msmqIntegrationBinding), BizTalk using the MSMQ adapter, etc.
When you send the message, what actually happens is:
The MSMQ queue manager on the sender machine writes the message to a temporary local queue.
The MSMQ queue manager on the sender machine connects to the MSMQ manager on the receiving machine and transmits the message.
The MSMQ queue manager on the receiving machine puts the message onto the destination queue.
In certain situations MSMQ will encounter messages which for some reason or another cannot be received on the destination queue. In these situations, if you have indicated that a message will use the dead-letter queue then MSMQ will make sure that the message is forwarded to the dead-letter queue.
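The three delivery steps listed above can be modelled with a small store-and-forward sketch in Python. This is a toy model, not the MSMQ API: `QueueManager`, `send`, `transmit`, and `accept` are hypothetical names, reachability is a simple flag, and dead-letter handling is left out.

```python
from collections import deque


class QueueManager:
    """Hypothetical stand-in for the MSMQ queue manager on one machine."""

    def __init__(self, name):
        self.name = name
        self.reachable = True
        self.outgoing = deque()  # temporary local queue on the sending side
        self.queues = {}         # destination queues hosted on this machine

    def send(self, remote, queue_name, body):
        # Step 1: the message is first written to a local outgoing queue.
        self.outgoing.append((remote, queue_name, body))
        self.transmit()

    def transmit(self):
        # Step 2: forward stored messages while the remote manager is reachable.
        while self.outgoing and self.outgoing[0][0].reachable:
            remote, queue_name, body = self.outgoing.popleft()
            remote.accept(queue_name, body)

    def accept(self, queue_name, body):
        # Step 3: the receiving manager puts the message on the destination queue.
        self.queues.setdefault(queue_name, []).append(body)
```

Because the send always succeeds locally, the sending application does not need its own local queue in front of the remote one. The message simply waits in `outgoing` until `transmit` can deliver it.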

What is Microsoft Message Queuing (MSMQ)? How does it work?

I need to work with MSMQ (Microsoft Message Queuing). What is it, what is it for, how does it work? How is it different from web services?
With all due respect to @Juan's answer, both are ways of exchanging data between two disconnected processes, i.e. interprocess communication (IPC) channels. Message queues are asynchronous, while web services are synchronous. They use different protocols and back-end services to do this, so they are completely different in implementation but similar in purpose.
You would want to use message queues when there is a possibility that the other communicating process may not be available, yet you still want to have the message sent at the time of the client's choosing. Delivery will occur when the process on the other end wakes up and receives notification of the message's arrival.
As its name states, it's just a queue manager.
You can Send objects (serialized) to the queue where they will stay until you Receive them.
It's normally used to send messages or objects between applications in a decoupled way
It has nothing to do with webservices, they are two different things
Info on MSMQ:
https://msdn.microsoft.com/en-us/library/ms711472(v=vs.85).aspx
Info on WebServices:
http://msdn.microsoft.com/en-us/library/ms972326.aspx
Transactional Queue Management 101
A transactional queue is a middleware system that asynchronously routes messages of one sort or another between hosts that may or may not be connected at any given time. This means that it must also be capable of persisting the message somewhere. Examples of such systems are MSMQ and IBM MQ.
A transactional queue can also participate in a distributed transaction, and a rollback can trigger the disposal of messages. This means that a message is guaranteed to be delivered with at-most-once semantics, or guaranteed delivery if not rolled back. The message won't be delivered if:
Host A posts the message but Host B is not connected
Something (possibly but not necessarily initiated from Host A) rolls back the transaction
B connects after the transaction is rolled back
In this case B will never be aware the message even existed unless informed through some other medium. If the transaction was rolled back, this probably doesn't matter. If B connects and collects the message before the transaction is rolled back, the rollback will also reverse the effects of the message on B.
Note that A can post the message to the queue with the guarantee of at-most-once delivery. If the transaction is committed Host A can assume that the message has been delivered by the reliable transport medium. If the transaction is rolled back, Host A can assume that any effects of the message have been reversed.
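As a rough illustration, the commit and rollback behaviour can be modelled in Python like this. `TransactionalQueue` and `QueueTransaction` are hypothetical names, and the model only covers the case where the receiver has not yet collected the message. It does not model reversing effects on Host B.

```python
class TransactionalQueue:
    """Hypothetical queue whose messages become visible only on commit."""

    def __init__(self):
        self.messages = []

    def transaction(self):
        return QueueTransaction(self)

    def receive(self):
        # Return the oldest committed message, or None if there is none.
        return self.messages.pop(0) if self.messages else None


class QueueTransaction:
    """Stages posted messages until the transaction commits or rolls back."""

    def __init__(self, queue):
        self._queue = queue
        self._staged = []

    def post(self, message):
        self._staged.append(message)

    def commit(self):
        # Only now do the staged messages reach the queue.
        self._queue.messages.extend(self._staged)
        self._staged = []

    def rollback(self):
        # Staged messages are disposed of; the receiver never sees them.
        self._staged = []
```

A message posted in a transaction that is rolled back never shows up in `receive`, while one posted in a committed transaction is delivered exactly once in this model.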
Web Services
A web service is a remote procedure call or other service (e.g. RESTful APIs) published by a (typically) HTTP server. It is a synchronous request/response protocol and has no guarantee of delivery built into the protocol. It is up to the client to validate that the service has been correctly run. Typically this will be through a reply to the request or a timeout of the call.
In the latter case, web services do not guarantee at-most-once semantics. The server can complete the service and fail to deliver a response (possibly through something outside the server going wrong). The application must be able to deal with this situation.
IIRC, RESTful services should be idempotent (the same state is achieved after any number of invocations of the same service), which is a strategy for dealing with this lack of guaranteed notification of success/failure in web service architectures. The idea is that conceptually one writes state rather than invoking a service, so one can write any number of times. This means that a lack of feedback about success can be tolerated by the application, as it can retry the posting until it gets a 'success' message from the server.
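The retry-until-success idea can be sketched in Python as follows. No real HTTP is involved: `Server`, `LostResponse`, and `put_with_retry` are hypothetical names, and a lost response is simulated by raising an exception after the write has already been applied.

```python
class LostResponse(Exception):
    """The server did the work, but the reply never reached the client."""


class Server:
    """Hypothetical server exposing an idempotent write."""

    def __init__(self, drop_first_n_responses=0):
        self.state = {}
        self.drops_left = drop_first_n_responses

    def put(self, key, value):
        # Writing state is idempotent: repeating it leaves the same result.
        self.state[key] = value
        if self.drops_left > 0:
            self.drops_left -= 1
            raise LostResponse(key)
        return "success"


def put_with_retry(server, key, value, max_attempts=5):
    """Repeat the write until a reply arrives; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            server.put(key, value)
            return attempt
        except LostResponse:
            continue  # safe to resend because the write is idempotent
    raise LostResponse(f"no reply after {max_attempts} attempts")
```

With two dropped responses the client ends up writing three times, yet the server state is the same as after a single write.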
Note that you can use Windows Communication Foundation (WCF) as an abstraction layer above MSMQ. This gives you the feel of working with a service - with only one-way operations.
For more information, see:
http://msdn.microsoft.com/en-us/library/ms789048.aspx
Actually, there is no relation between MSMQ and web services.
MSMQ is used for interprocess communication (you can also use sockets, Windows messaging, or mapped memory).
It is a Windows service that is responsible for keeping messages until someone dequeues them.
You could say it is more reliable than sockets, as messages are stored on a hard disk, but it is slower than other IPC techniques.
You can use MSMQ in .NET with a few lines of code: just declare your MessageQueue object and call its Receive and Send methods.
The message itself can be a normal string or binary data.
As everyone has explained, MSMQ is used as a queue for messages. A message can be a wrapper for actual data, an object, or anything else that you can serialize and send across the wire. MSMQ has its own limitations. MSMQ 1.0 and MSMQ 2.0 had a 4MB message limit. This restriction was lifted with MSMQ 3.0. Message-oriented middleware (MOM) is a concept that depends heavily on messaging. The Enterprise Service Bus foundation is built on messaging. All these newer technologies depend on messaging for reliable asynchronous data delivery.
MSMQ stands for Microsoft Message Queuing.
It is simply a queue that stores formatted messages so that they can be passed on to a DB (which may be on the same machine or on a server). There are different types of queues, and messages are categorized among them.
If there is a problem or error inside a message, or an invalid message is passed, it automatically goes to the dead-letter queue, which denotes that it is not to be processed further. Before a message is passed to the dead-letter queue, it is retried up to a maximum count; only if it still has not been processed is it sent to the dead-letter queue.
It is generally used for sending log messages from a client machine to a server or DB, so that if an issue occurs on the client machine, the developer or support team can go through the log to solve the problem.
MSMQ is also a service provided by Microsoft for getting records of log files.
You can get a better idea from this page: http://msdn.microsoft.com/en-us/library/ms711472(v=vs.85).aspx.