Is there a quicker way to get workflowHistory?

We use Cadence 0.11.
In our project, a frequently used service needs the workflowHistory.
So we have to call this function often:
GetWorkflowHistory(ctx context.Context, workflowID string, runID string, isLongPoll bool, filterType s.HistoryEventFilterType) HistoryEventIterator
The problem is that this call is quite slow, which makes our service slow as well. What's more, we must ensure the correctness of the workflow, so we cannot use a cache, since a cache cannot hold data that is updated frequently.
Is there a quicker way for us to get workflowHistory? Maybe a new API, or some new configuration in Cadence?

There is no faster way to get the history. The latency depends on the length and size of the history. This is a critical API in Cadence, so its performance has already been optimized as far as the architecture allows.
However, based on your use case, this is a misuse of the API. You should implement a Query handler for the workflow instead. The worker will do the same work you are doing now to parse the history.
If you prefer, you can implement a very generic Query handler that returns almost everything in the workflow events, so that a single Query handler gives you anything you would otherwise read from the history. Specifically, you can put all signals, activity inputs and outputs, etc. into a list and return the list as the query result.
Using a worker query will reduce the latency. This is because the Cadence worker caches the history. When there are no changes on the workflow, it does not fetch any more history from the server. When new events are appended, only the delta of events is transported between the worker and the server. Therefore the latency stays low regardless of the length of the history.
The only cases where the worker needs to reload the whole history are when the cache is evicted or the worker restarts. So it's always recommended to keep the history short so that a reload won't cost too many resources.
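The caching behaviour described above can be pictured with a small sketch. This is not the Cadence worker's actual code: `server`, `workerCache`, `eventsAfter` and `sync` are hypothetical names, and an in-memory slice stands in for the history service. It only illustrates why the cost of a refresh tracks the number of new events rather than the total history length.

```go
package main

import "fmt"

// event is a hypothetical stand-in for a workflow history event.
type event struct {
	ID   int
	Name string
}

// server is a hypothetical in-memory history store.
// Event IDs are assumed to be 1, 2, 3, ... in append order.
type server struct {
	events []event
}

// eventsAfter returns only the events whose ID is greater than lastID.
func (s *server) eventsAfter(lastID int) []event {
	if lastID >= len(s.events) {
		return nil
	}
	return s.events[lastID:]
}

// workerCache keeps the part of the history the worker has already seen.
type workerCache struct {
	cached []event
}

// sync fetches only the delta and reports how many events were transferred.
func (w *workerCache) sync(s *server) int {
	lastID := 0
	if n := len(w.cached); n > 0 {
		lastID = w.cached[n-1].ID
	}
	delta := s.eventsAfter(lastID)
	w.cached = append(w.cached, delta...)
	return len(delta)
}

// evict simulates a cache eviction or a worker restart.
func (w *workerCache) evict() { w.cached = nil }

func main() {
	s := &server{}
	for i := 1; i <= 1000; i++ {
		s.events = append(s.events, event{ID: i, Name: "event"})
	}
	w := &workerCache{}
	fmt.Println(w.sync(s)) // 1000: first load transfers the whole history
	fmt.Println(w.sync(s)) // 0: nothing changed, nothing transferred
	s.events = append(s.events, event{ID: 1001, Name: "event"})
	fmt.Println(w.sync(s)) // 1: only the new event is transferred
	w.evict()
	fmt.Println(w.sync(s)) // 1001: after eviction the full history is reloaded
}
```

The last call shows the expensive case from the answer: after an eviction the transfer size grows with the history, which is the reason to keep histories short.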
See more docs about query.
https://cadenceworkflow.io/docs/concepts/queries/#stack-trace-query
Golang
https://cadenceworkflow.io/docs/go-client/queries/
Java
https://cadenceworkflow.io/docs/java-client/queries/
Taking Golang as an example, if you want to return the activity result as the query output:
// Uses the "go.uber.org/cadence/workflow" package.
func MyWorkflow(ctx workflow.Context, input string) error {
	// The map must be initialized; writing to a nil map panics.
	res := make(map[string]string)
	err := workflow.SetQueryHandler(ctx, "current_state", func() (map[string]string, error) {
		return res, nil
	})
	if err != nil {
		return err
	}
	ctx = workflow.WithActivityOptions(ctx, ...)
	var actOut string
	err = workflow.ExecuteActivity(ctx, ActivityA, "my_input").Get(ctx, &actOut)
	if err != nil {
		return err
	}
	// Record the result only after the activity has succeeded.
	res["ActivityA"] = actOut
	return nil
}

Related

Implementing idempotency keys

I'm trying to get my two Golang GRPC endpoints to support idempotency keys. My service will store and read keys from Mongo (because I'm already using it for other data) as a unique index in its own Collection.
I'm thinking of two solutions, but each has its weaknesses. I know there's more complex stuff like saving the request and response and making the logic ACID. However, for my first endpoint, the only-once logic (the endpoint's code which needs to be idempotent) calls a service that sends an email, so it can't be rolled back. My second endpoint does multiple Inserts in Mongo, which it seems can be rolled back, but I'm not sure how, or whether there's another solution that would also work for the first endpoint.
Solution 1
func MyEndpoint(request Request) (Response, error) {
	doesExist, err := doesIdemKeyExist(request.IdemKey)
	if err != nil {
		// Response is a struct, so return its zero value rather than nil.
		return Response{}, status.Error(codes.Internal, "Failed to check idem key.")
	}
	if doesExist {
		return Response{}, nil
	}
	// < only-once logic >
	err = insertIdemKey(request.IdemKey)
	if err != nil {
		if mongo.IsDuplicateKeyError(err) {
			return Response{}, nil
		}
		return Response{}, status.Error(codes.Internal, "Failed to insert idem key.")
	}
	return Response{}, nil
}
The weakness here is that client could send first request to my endpoint and lose connection, then retry with second request. First request could process but not reach insertIdemKey, so second request would process too, violating idempotency.
Solution 2
func MyEndpoint(request Request) (Response, error) {
	err := insertIdemKey(request.IdemKey)
	if err != nil {
		if mongo.IsDuplicateKeyError(err) {
			return Response{}, nil
		}
		// Response is a struct, so return its zero value rather than nil.
		return Response{}, status.Error(codes.Internal, "Failed to insert idem key.")
	}
	// < only-once logic >
	return Response{}, nil
}
The weakness here is that only-once logic could have intermittent failures, such as from dependencies. Affected requests that are retried will be ignored.
What's the best solution here? Should I just compromise and go with one of these imperfect solutions?
You should use a document with a state property in MongoDB, with possible values processing and done.
When a request comes in, try to insert the document into the database with the given idemKey and state=processing. If that fails because the key already exists, then either report success (if state is done) or that it is still being processed (if state is processing). Or wait for it to complete and then report success.
If inserting the document succeeds, proceed with executing "only-once logic".
Once the "only-once logic" is done, update the document's state to state=done. If executing the logic fails, you may delete the document from the database so a subsequent request can try to execute it again.
To protect against a server failure while executing the logic, or against a failure to delete the document, you should record the start / creation timestamp too, and define an expiration. Let's say a new request comes in and the document exists in the processing state but is older than 30 seconds: you could assume it will never be completed and proceed as if the document didn't exist in the database in the first place. That is, set its creation timestamp to the current time and execute the logic, then update the state to done if the logic succeeds. MongoDB also supports auto-removal of expired documents, but note that the removal is not timed precisely.
Note that this solution isn't perfect either: if executing the logic succeeds but you can't update the document's state to done afterwards, after expiration you could end up repeating the execution. What you want is the atomic / transactional execution of your logic and a MongoDB operation, which is not possible.
If your "only-once logic" contains multiple inserts, you could use upserts (insert-or-update) so the records are not duplicated if the execution fails and you have to repeat it, or you could insert the documents with the idemKey included, so you can identify which documents were previously inserted (and either remove them first thing, or skip them and insert only the rest).
Also note that starting with MongoDB 4.0, multi-document transactions are supported (on replica sets), so you can execute multiple inserts in a single transaction.
See related question: How to synchronize two apps running in two different servers via MongoDB

Spring Boot controller preventing multiple inserts upon quick successive requests in mongodb

I have a REST API that calculates something upon a request and, if the same request is made again, returns the result from the cache, which consists of documents saved in MongoDB. To know whether two requests are the same, I hash some relevant fields in the request. But when the same request is made in quick succession, duplicate documents occur in MongoDB, which later results in "IncorrectResultSizeDataAccessException" when I try to read them.
To solve it, I tried to synchronize on the hash value in the following controller method (non-relevant parts cut out):
@PostMapping(
    path = "/{myPath}",
    consumes = {MediaType.APPLICATION_JSON_UTF8_VALUE},
    produces = {MediaType.APPLICATION_JSON_UTF8_VALUE})
@Async("asyncExecutor")
public CompletableFuture<ResponseEntity<?>> retrieveAndCache( ... a,b,c,d various request parameters) {
    //perform some validations on request...
    //hash relevant request parameters
    int hash = Objects.hash(a, b, c, d);
    synchronized (Integer.toString(hash).intern()) {
        Optional<Result> resultOpt = cacheService.findByHash(hash);
        if (resultOpt.isPresent()) {
            return CompletableFuture.completedFuture(ResponseEntity.status(HttpStatus.OK).body(resultOpt.get().getResult()));
        } else {
            Result result = ...//perform requests to external services and do some calculations...
            cacheService.save(result);
            return CompletableFuture.completedFuture(ResponseEntity.status(HttpStatus.OK).body(result));
        }
    }
}
//cacheService methods
@Transactional
public Optional<Result> findByHash(int hash) {
    return repository.findByHash(hash); //this is the part that throws the error
}
I am sure that no hash collision is occurring; it's just that when the same request is performed in quick succession, duplicate records occur. To my understanding, this shouldn't happen as long as I have only one running instance of my Spring Boot application. Do you see any reason other than multiple instances running in production?
You should check the settings of your MongoDB client.
If one thread calls the cacheService.save(result) method, and after that method returns, releases the lock, then another thread calls cacheService.findByHash(hash), it's still possible that it will not find the record that you just saved.
It's possible that e.g. the save method returns as soon as the saved object is in the transaction log, but not fully processed yet. Or the save is processed on the primary node, but the findByHash is executed on the secondary node, where it's not replicated yet.
You could use WriteConcern.MAJORITY, but I'm not 100% sure if it covers everything.
Even better is to let MongoDB do the locking by using findAndModify with FindAndModifyOptions.upsert(true), and forget about the lock in your java code.
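The difference between "find, then save" and a single atomic find-or-insert can be sketched as below. This is only an in-process analogy: a mutex-guarded map stands in for the database, and `cache` and `findOrInsert` are hypothetical names, not the Spring Data or MongoDB API. Note also that with MongoDB itself, concurrent upserts are only guaranteed not to create duplicates when the filter field is covered by a unique index.

```go
package main

import (
	"fmt"
	"sync"
)

// cache is a hypothetical stand-in for the collection of cached results.
type cache struct {
	mu      sync.Mutex
	docs    map[int]string
	inserts int // counts how many documents were actually written
}

// findOrInsert returns the stored value for hash, inserting value if none
// exists. The lookup and the insert happen under one lock, so no second
// caller can slip in between them.
func (c *cache) findOrInsert(hash int, value string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if existing, ok := c.docs[hash]; ok {
		return existing, false
	}
	c.docs[hash] = value
	c.inserts++
	return value, true
}

func main() {
	c := &cache{docs: map[int]string{}}
	var wg sync.WaitGroup
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Every request computes its own result, but only one is stored.
			c.findOrInsert(42, fmt.Sprintf("result-from-request-%d", i))
		}(i)
	}
	wg.Wait()
	fmt.Println(c.inserts, len(c.docs)) // 1 1
}
```

The trade-off in this shape is that concurrent requests may each compute a result before one of them wins the insert; the stored document stays unique, but some computation is wasted.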

MongoDB: Upon incomplete Mass delete, are the deleted values gone or they are rolled back as in a typical RDBMS Transaction? [duplicate]

I know there are similar questions here but they are either telling me to switch back to regular RDBMS systems if I need transactions or use atomic operations or two-phase commit. The second solution seems the best choice. The third I don't wish to follow because it seems that many things could go wrong and I can't test it in every aspect. I'm having a hard time refactoring my project to perform atomic operations. I don't know whether this comes from my limited viewpoint (I have only worked with SQL databases so far), or whether it actually can't be done.
We would like to pilot test MongoDB at our company. We have chosen a relatively simple project - an SMS gateway. It allows our software to send SMS messages to the cellular network and the gateway does the dirty work: actually communicating with the providers via different communication protocols. The gateway also manages the billing of the messages. Every customer who applies for the service has to buy some credits. The system automatically decreases the user's balance when a message is sent and denies the access if the balance is insufficient. Also because we are customers of third party SMS providers, we may also have our own balances with them. We have to keep track of those as well.
I started thinking about how I can store the required data with MongoDB if I cut down some complexity (external billing, queued SMS sending). Coming from the SQL world, I would create a separate table for users, another one for SMS messages, and one for storing the transactions regarding the users' balance. Let's say I create separate collections for all of those in MongoDB.
Imagine an SMS sending task with the following steps in this simplified system:
1. Check if the user has sufficient balance; deny access if there's not enough credit.
2. Send and store the message in the SMS collection with the details and cost (in the live system the message would have a status attribute, and a task would pick it up for delivery and set the price of the SMS according to its current state).
3. Decrease the user's balance by the cost of the sent message.
4. Log the transaction in the transaction collection.
Now what's the problem with that? MongoDB can do atomic updates only on one document. In the previous flow it could happen that some kind of error creeps in and the message gets stored in the database but the user's balance is not updated and/or the transaction is not logged.
I came up with two ideas:
Create a single collection for the users, and store the balance as a field, user related transactions and messages as sub documents in the user's document. Because we can update documents atomically, this actually solves the transaction problem. Disadvantages: if the user sends many SMS messages, the size of the document could become large and the 4MB document limit could be reached. Maybe I can create history documents in such scenarios, but I don't think this would be a good idea. Also I don't know how fast the system would be if I push more and more data to the same big document.
Create one collection for users, and one for transactions. There can be two kinds of transactions: credit purchases with a positive balance change and sent messages with a negative balance change. A transaction may have a subdocument; for example, for sent messages the details of the SMS can be embedded in the transaction. Disadvantages: I don't store the current user balance, so I have to calculate it every time a user tries to send a message to tell whether the message can go through or not. I'm afraid this calculation can become slow as the number of stored transactions grows.
I'm a little bit confused about which method to pick. Are there other solutions? I couldn't find any best practices online about how to work around these kinds of problems. I guess many programmers who are trying to become familiar with the NoSQL world are facing similar problems in the beginning.
As of 4.0, MongoDB will have multi-document ACID transactions. The plan is to enable those in replica set deployments first, followed by the sharded clusters. Transactions in MongoDB will feel just like transactions developers are familiar with from relational databases - they'll be multi-statement, with similar semantics and syntax (like start_transaction and commit_transaction). Importantly, the changes to MongoDB that enable transactions do not impact performance for workloads that do not require them.
For more details see here.
Having distributed transactions, doesn't mean that you should model your data like in tabular relational databases. Embrace the power of the document model and follow the good and recommended practices of data modeling.
Check this out, by Tokutek. They develop a plugin for Mongo that promises not only transactions but also a boosting in performance.
To get to the point: if transactional integrity is a must, then don't use MongoDB; use only components that support transactions. It is extremely hard to build something on top of a non-ACID-compliant component in order to provide ACID-like functionality. Depending on the individual use cases, it may make sense to separate actions into transactional and non-transactional ones in some way...
Now what's the problem with that? MongoDB can do atomic updates only on one document. In the previous flow it could happen that some kind of error creeps in and the message gets stored in the database but the user's balance is not reduced and/or the transaction is not logged.
This is not really a problem. The error you mention is either a logical error (a bug) or an I/O error (network or disk failure). That kind of error can leave both transactionless and transactional stores in an inconsistent state. For example, if the SMS has already been sent but an error occurs while storing the message, the SMS sending can't be rolled back, which means it won't be logged, the user's balance won't be reduced, etc.
The real problem here is that the user can take advantage of a race condition and send more messages than their balance allows. This also applies to an RDBMS, unless you do the SMS sending inside a transaction with the balance field locked (which would be a great bottleneck). A possible solution for MongoDB would be to use findAndModify first to reduce the balance and check it: if it's negative, disallow sending and refund the amount (atomic increment). If it's non-negative, continue sending, and in case sending fails, refund the amount. A balance history collection can also be maintained to help fix/verify the balance field.
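The "reduce first, check, refund if needed" flow from this answer might look like the following sketch. It is an analogy only: an atomic add on an in-memory counter stands in for findAndModify with an atomic increment, and `account`, `reserve`, `refund` and `sendSMS` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

var errInsufficient = errors.New("insufficient balance")

// account holds the balance in credits.
type account struct {
	balance int64
}

// reserve atomically deducts cost and inspects the resulting balance.
// If the result is negative, the deduction is refunded and the send is denied.
func (a *account) reserve(cost int64) error {
	if atomic.AddInt64(&a.balance, -cost) < 0 {
		atomic.AddInt64(&a.balance, cost)
		return errInsufficient
	}
	return nil
}

// refund atomically gives the credits back.
func (a *account) refund(cost int64) {
	atomic.AddInt64(&a.balance, cost)
}

// sendSMS reserves the credits first and refunds them if delivery fails.
func sendSMS(a *account, cost int64, deliver func() error) error {
	if err := a.reserve(cost); err != nil {
		return err
	}
	if err := deliver(); err != nil {
		a.refund(cost)
		return err
	}
	return nil
}

func main() {
	a := &account{balance: 2}
	deliver := func() error { return nil }
	fmt.Println(sendSMS(a, 1, deliver)) // <nil>
	fmt.Println(sendSMS(a, 1, deliver)) // <nil>
	fmt.Println(sendSMS(a, 1, deliver)) // insufficient balance
	fmt.Println(atomic.LoadInt64(&a.balance)) // 0
}
```

Because the check is made on the value returned by the atomic deduction itself, two concurrent senders can never both observe the same "sufficient" balance, which is the race the answer describes.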
The project is simple, but you have to support transactions for payment, which makes the whole thing difficult. So, for example, a complex portal system with hundreds of collections (forum, chat, ads, etc...) is in some respects simpler, because if you lose a forum or chat entry, nobody really cares. If, on the other hand, you lose a payment transaction, that's a serious issue.
So, if you really want a pilot project for MongoDB, choose one which is simple in that respect.
Transactions are absent in MongoDB for valid reasons. This is one of those things that make MongoDB faster.
In your case, if transactions are a must, Mongo does not seem to be a good fit.
Maybe RDBMS + MongoDB, but that will add complexity and make the application harder to manage and support.
This is probably the best blog post I found on implementing transaction-like features for MongoDB:
Syncing Flag: best for just copying data over from a master document
Job Queue: very general purpose, solves 95% of cases. Most systems need to have at least one job queue around anyway!
Two Phase Commit: this technique ensure that each entity always has all information needed to get to a consistent state
Log Reconciliation: the most robust technique, ideal for financial systems
Versioning: provides isolation and supports complex structures
Read this for more info: https://dzone.com/articles/how-implement-robust-and
This is late, but I think it will help in the future. I use Redis to build a queue to solve this problem.
Requirement:
The image below shows two actions that need to execute concurrently, but phase 2 and phase 3 of action 1 need to finish before phase 2 of action 2 starts, or the other way round (a phase can be a REST API request, a database request, executing JavaScript code...).
How a queue helps you
A queue makes sure that the blocks of code between lock() and release() in different functions never run at the same time, which isolates them.
function action1() {
  phase1();
  queue.lock("action_domain");
  phase2();
  phase3();
  queue.release("action_domain");
}
function action2() {
  phase1();
  queue.lock("action_domain");
  phase2();
  queue.release("action_domain");
}
How to build a queue
I will focus only on how to avoid the race condition when building a queue on the backend. If you don't know the basic idea of a queue, come here.
The code below only shows the concept; you need to implement it properly.
function lock() {
  if (isRunning()) {
    addIsolateCodeToQueue(); // use a callback, delegate, function pointer... depending on your language
  } else {
    setStateToRunning();
    pickOneAndExecute();
  }
}
function release() {
  setStateToRelease();
  pickOneAndExecute();
}
But isRunning(), setStateToRelease() and setStateToRunning() must themselves be isolated, or else you face the race condition again. To do this I chose Redis, for ACID purposes and scalability.
The Redis documentation says this about its transactions:
All the commands in a transaction are serialized and executed
sequentially. It can never happen that a request issued by another
client is served in the middle of the execution of a Redis
transaction. This guarantees that the commands are executed as a
single isolated operation.
P.S.:
I use Redis because my service already uses it; you can use anything else that supports isolation.
The action_domain in my code above is for the case where action 1 called by user A should block only action 2 of user A, without blocking other users. The idea is to use a unique lock key for each user.
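The per-user lock key idea can be sketched in-process as follows. This is not the Redis-backed queue from the answer: it only works inside a single process, and `keyedLock`, `get`, `lock` and `release` are hypothetical names. It shows how a distinct key per user keeps one user's critical sections serialized without blocking other users.

```go
package main

import (
	"fmt"
	"sync"
)

// keyedLock hands out one mutex per key, created on first use.
type keyedLock struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

// get returns the mutex for key, creating it if necessary.
func (k *keyedLock) get(key string) *sync.Mutex {
	k.mu.Lock()
	defer k.mu.Unlock()
	if k.locks == nil {
		k.locks = make(map[string]*sync.Mutex)
	}
	m, ok := k.locks[key]
	if !ok {
		m = &sync.Mutex{}
		k.locks[key] = m
	}
	return m
}

// lock blocks until the critical section for key is free.
func (k *keyedLock) lock(key string) { k.get(key).Lock() }

// release frees the critical section for key.
func (k *keyedLock) release(key string) { k.get(key).Unlock() }

func main() {
	var kl keyedLock
	// The map is only read concurrently; the counters are written under the lock.
	counters := map[string]*int{"userA": new(int), "userB": new(int)}
	var wg sync.WaitGroup
	for user, c := range counters {
		for i := 0; i < 100; i++ {
			wg.Add(1)
			go func(user string, c *int) {
				defer wg.Done()
				key := "action_domain:" + user
				kl.lock(key)
				*c++ // stands in for phase2() and phase3()
				kl.release(key)
			}(user, c)
		}
	}
	wg.Wait()
	fmt.Println(*counters["userA"], *counters["userB"]) // 100 100
}
```

A cross-process version needs the lock state to live in shared storage with isolated updates, which is the role Redis plays in the answer.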
Transactions are available now in MongoDB 4.0. Sample here
// Runs the txnFunc and retries if TransientTransactionError encountered
function runTransactionWithRetry(txnFunc, session) {
  while (true) {
    try {
      txnFunc(session); // performs transaction
      break;
    } catch (error) {
      // If transient error, retry the whole transaction
      if ( error.hasOwnProperty("errorLabels") && error.errorLabels.includes("TransientTransactionError") ) {
        print("TransientTransactionError, retrying transaction ...");
        continue;
      } else {
        throw error;
      }
    }
  }
}
// Retries commit if UnknownTransactionCommitResult encountered
function commitWithRetry(session) {
  while (true) {
    try {
      session.commitTransaction(); // Uses write concern set at transaction start.
      print("Transaction committed.");
      break;
    } catch (error) {
      // Can retry commit
      if (error.hasOwnProperty("errorLabels") && error.errorLabels.includes("UnknownTransactionCommitResult") ) {
        print("UnknownTransactionCommitResult, retrying commit operation ...");
        continue;
      } else {
        print("Error during commit ...");
        throw error;
      }
    }
  }
}
// Updates two collections in a transaction
function updateEmployeeInfo(session) {
  employeesCollection = session.getDatabase("hr").employees;
  eventsCollection = session.getDatabase("reporting").events;
  session.startTransaction( { readConcern: { level: "snapshot" }, writeConcern: { w: "majority" } } );
  try {
    employeesCollection.updateOne( { employee: 3 }, { $set: { status: "Inactive" } } );
    eventsCollection.insertOne( { employee: 3, status: { new: "Inactive", old: "Active" } } );
  } catch (error) {
    print("Caught exception during transaction, aborting.");
    session.abortTransaction();
    throw error;
  }
  commitWithRetry(session);
}
// Start a session.
session = db.getMongo().startSession( { mode: "primary" } );
try {
  runTransactionWithRetry(updateEmployeeInfo, session);
} catch (error) {
  // Do something with error
} finally {
  session.endSession();
}

Lagom: Asynchronous Operations in Command Handlers

In Lagom, what do you do when a command handler must perform some asynchronous operations? For example:
override def behavior = Actions().onCommand[MyCommand, Done] {
  case (cmd, ctx, state) =>
    // some complex code that performs asynchronous operations
    // (for example, querying other persistent entities or the read-side
    // by making calls that return Future[...] and composing those),
    // as summarized in a placeholder below:
    val events: Future[Seq[Event]] = ???
    events map {
      xs => ctx.thenPersistAll(xs: _*) { () => ctx.reply(Done) }
    }
}
The problem with the code like that is that the compiler expects the command handler to return Persist, not Future[Persist].
Is this done on purpose, to make sure that the events are persisted in the correct order (that is, the events generated by a prior command must be saved before the events generated by a later command)? But can't that be handled by proper management of the event offsets, so that the journal always orders them correctly, regardless of when they are actually saved?
And what does one do in situations like this, when the command handling is complex enough to require making asynchronous calls from the command handler?
There is a similar question on the mailing list with an answer from James.
https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!topic/lagom-framework/Z6lynjNTqgE
In short, your entity in a CQRS application is a consistency boundary and should only depend on data that is immediately available inside it, not outside it (no calls to external services).
What you are probably looking for is what's called Command Enrichment. You receive a request, collect some data from external services, and build a command containing everything you need to send to your entity.
You certainly should not query the read-side to make business decisions inside your write-side entity. You also should not make business decisions on data coming from other entities.
Your entity should be able to make all the decisions because it is the consistency boundary of your model.
What I've been doing in these cases is to pass the PersistentEntityRef to the asynchronous operation, so that it can issue commands to the entity, and it's those command handlers (not the one that spawned the async computation) that then persist the events.
Just bear in mind that none of this is atomic, so you have to think about what happens if the asynchronous operation fails midway through issuing the commands or if some commands succeed and some fail, etc. Presumably you'll want some retry mechanism for systemic failures. If you build your command-handlers to be idempotent, it will help you deal with the duplicates.
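The two ideas in these answers, enriching the command before it reaches the entity and making the handler idempotent so that retries are harmless, can be sketched outside Lagom as follows. This is not the Lagom API: `command`, `entity`, `enrich` and `handle` are hypothetical names, and a plain struct stands in for the persistent entity.

```go
package main

import "fmt"

// command carries everything the entity needs to decide, plus a unique ID.
type command struct {
	ID     string
	Amount int
}

// entity is a hypothetical stand-in for a persistent entity.
type entity struct {
	seen   map[string]bool // IDs of commands that were already applied
	events []string
	total  int
}

// enrich performs the external lookup outside the entity and builds a
// self-contained command from the result.
func enrich(id string, fetch func() (int, error)) (command, error) {
	amount, err := fetch()
	if err != nil {
		return command{}, err
	}
	return command{ID: id, Amount: amount}, nil
}

// handle applies a command synchronously and at most once.
// It reports whether the command changed the entity.
func (e *entity) handle(c command) bool {
	if e.seen[c.ID] {
		return false // duplicate delivery, for example after a retry
	}
	e.seen[c.ID] = true
	e.total += c.Amount
	e.events = append(e.events, "applied:"+c.ID)
	return true
}

func main() {
	e := &entity{seen: map[string]bool{}}
	externalLookup := func() (int, error) { return 5, nil }

	cmd, err := enrich("cmd-1", externalLookup)
	if err != nil {
		fmt.Println("enrichment failed:", err)
		return
	}
	e.handle(cmd)
	e.handle(cmd) // a retry of the same command is ignored
	fmt.Println(e.total, len(e.events)) // 5 1
}
```

The entity never waits on anything asynchronous here: the slow lookup happens before the command is built, and the handler only consults its own state.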

How to work around the lack of transactions in MongoDB?

I know there are similar questions here but they are either telling me to switch back to regular RDBMS systems if I need transactions or use atomic operations or two-phase commit. The second solution seems the best choice. The third I don't wish to follow because it seems that many things could go wrong and I can't test it in every aspect. I'm having a hard time refactoring my project to perform atomic operations. I don't know whether this comes from my limited viewpoint (I have only worked with SQL databases so far), or whether it actually can't be done.
We would like to pilot test MongoDB at our company. We have chosen a relatively simple project - an SMS gateway. It allows our software to send SMS messages to the cellular network and the gateway does the dirty work: actually communicating with the providers via different communication protocols. The gateway also manages the billing of the messages. Every customer who applies for the service has to buy some credits. The system automatically decreases the user's balance when a message is sent and denies the access if the balance is insufficient. Also because we are customers of third party SMS providers, we may also have our own balances with them. We have to keep track of those as well.
I started thinking about how I can store the required data with MongoDB if I cut down some complexity (external billing, queued SMS sending). Coming from the SQL world, I would create a separate table for users, another one for SMS messages, and one for storing the transactions regarding the users' balance. Let's say I create separate collections for all of those in MongoDB.
Imagine an SMS sending task with the following steps in this simplified system:
check if the user has sufficient balance; deny access if there's not enough credit
send and store the message in the SMS collection with the details and cost (in the live system the message would have a status attribute and a task would pick up it for delivery and set the price of the SMS according to its current state)
decrease the users's balance by the cost of the sent message
log the transaction in the transaction collection
Now what's the problem with that? MongoDB can do atomic updates only on one document. In the previous flow it could happen that some kind of error creeps in and the message gets stored in the database but the user's balance is not updated and/or the transaction is not logged.
I came up with two ideas:
Create a single collection for the users, and store the balance as a field, user related transactions and messages as sub documents in the user's document. Because we can update documents atomically, this actually solves the transaction problem. Disadvantages: if the user sends many SMS messages, the size of the document could become large and the 4MB document limit could be reached. Maybe I can create history documents in such scenarios, but I don't think this would be a good idea. Also I don't know how fast the system would be if I push more and more data to the same big document.
Create one collection for users, and one for transactions. There can be two kinds of transactions: credit purchase with positive balance change and messages sent with negative balance change. Transaction may have a subdocument; for example in messages sent the details of the SMS can be embedded in the transaction. Disadvantages: I don't store the current user balance so I have to calculate it every time a user tries to send a message to tell if the message could go through or not. I'm afraid this calculation can became slow as the number of stored transactions grows.
I'm a little bit confused about which method to pick. Are there other solutions? I couldn't find any best practices online about how to work around these kinds of problems. I guess many programmers who are trying to become familiar with the NoSQL world are facing similar problems in the beginning.
As of 4.0, MongoDB will have multi-document ACID transactions. The plan is to enable those in replica set deployments first, followed by the sharded clusters. Transactions in MongoDB will feel just like transactions developers are familiar with from relational databases - they'll be multi-statement, with similar semantics and syntax (like start_transaction and commit_transaction). Importantly, the changes to MongoDB that enable transactions do not impact performance for workloads that do not require them.
For more details see here.
Having distributed transactions, doesn't mean that you should model your data like in tabular relational databases. Embrace the power of the document model and follow the good and recommended practices of data modeling.
Check this out, by Tokutek. They develop a plugin for Mongo that promises not only transactions but also a boosting in performance.
Bring it to the point: if transactional integrity is a must then don't use MongoDB but use only components in the system supporting transactions. It is extremely hard to build something on top of component in order to provide ACID-similar functionality for non-ACID compliant components. Depending on the individual usecases it may make sense to separate actions into transactional and non-transactional actions in some way...
Now what's the problem with that? MongoDB can do atomic updates only on one document. In the previous flow it could happen that some kind of error creeps in and the message gets stored in the database but the user's balance is not gets reduced and/or the transaction is not gets logged.
This is not really a problem. The error you mentioned is either a logical (bug) or IO error (network, disk failure). Such kind of error can leave both transactionless and transactional stores in non-consistent state. For example, if it has already sent SMS but while storing message error occurred - it can't rollback SMS sending, which means it won't be logged, user balance won't be reduced etc.
The real problem here is the user can take advantage of race condition and send more messages than his balance allows. This also applies to RDBMS, unless you do SMS sending inside transaction with balance field locking (which would be a great bottleneck). As a possible solution for MongoDB would be using findAndModify first to reduce the balance and check it, if it's negative disallow sending and refund the amount (atomic increment). If positive, continue sending and in case it fails refund the amount. The balance history collection can be also maintained to help fix/verify balance field.
The project is simple, but you have to support transactions for payment, which makes the whole thing difficult. So, for example, a complex portal system with hundreds of collections (forum, chat, ads, etc...) is in some respect simpler, because if you lose a forum or chat entry, nobody really cares. If you, on the otherhand, lose a payment transaction that's a serious issue.
So, if you really want a pilot project for MongoDB, choose one which is simple in that respect.
Transactions are absent in MongoDB for valid reasons. This is one of those things that make MongoDB faster.
In your case, if transaction is a must, mongo seems not a good fit.
May be RDMBS + MongoDB, but that will add complexities and will make it harder to manage and support application.
This is probably the best blog I found regarding implementing transaction like feature for mongodb .!
Syncing Flag: best for just copying data over from a master document
Job Queue: very general purpose, solves 95% of cases. Most systems need to have at least one job queue around anyway!
Two Phase Commit: this technique ensures that each entity always has all the information needed to get to a consistent state
Log Reconciliation: the most robust technique, ideal for financial systems
Versioning: provides isolation and supports complex structures
Read this for more info: https://dzone.com/articles/how-implement-robust-and
This is late, but I think it will help in the future. I use Redis to build a queue that solves this problem.
Requirement:
Two actions need to execute concurrently, but phase 2 and phase 3 of action 1 must finish before phase 2 of action 2 starts, or the other way around. (A phase can be a REST API request, a database request, some JavaScript code to execute...)
How a queue helps you
A queue makes sure that the blocks of code between lock() and release() in different functions never run at the same time, which isolates them from each other.
function action1() {
    phase1();
    queue.lock("action_domain");
    phase2();
    phase3();
    queue.release("action_domain");
}
function action2() {
    phase1();
    queue.lock("action_domain");
    phase2();
    queue.release("action_domain");
}
How to build a queue
I will only focus on how to avoid the race condition when building a queue on the backend side. If you don't know the basic idea of a queue, come here.
The code below only shows the concept; you need to implement it properly yourself.
function lock() {
    if (isRunning()) {
        addIsolateCodeToQueue(); // use a callback, delegate, function pointer... depending on your language
    } else {
        setStateToRunning();
        pickOneAndExecute();
    }
}
function release() {
    setStateToRelease();
    pickOneAndExecute();
}
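As one possible concrete reading of the pseudocode above, here is a minimal in-memory sketch in JavaScript. LockQueue and its method names are made up for this illustration. It relies on JavaScript running callbacks on a single thread, so it only serializes work inside one process and does not coordinate several servers.

```javascript
// Illustrative only: a per-key lock whose waiters are queued callbacks.
function LockQueue() {
    // key -> array of waiting callbacks; a key being present means "locked".
    this.waiting = new Map();
}

// Run criticalSection now if the key is free, otherwise queue it.
LockQueue.prototype.lock = function (key, criticalSection) {
    if (this.waiting.has(key)) {
        this.waiting.get(key).push(criticalSection);
    } else {
        this.waiting.set(key, []);
        criticalSection();
    }
};

// Hand the lock to the next waiter, or free the key if nobody is waiting.
LockQueue.prototype.release = function (key) {
    var queue = this.waiting.get(key);
    if (!queue) {
        return; // nothing was locked under this key
    }
    if (queue.length === 0) {
        this.waiting.delete(key);
        return;
    }
    var next = queue.shift();
    next(); // the lock stays held on behalf of the next waiter
};
```

Each critical section is expected to call release(key) when it finishes, the same way action1 and action2 do in the earlier example.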
But isRunning(), setStateToRelease() and setStateToRunning() must themselves be isolated, or else you face a race condition again. For this I chose Redis, for ACID purposes and for its scalability.
The Redis documentation says this about its transactions:
All the commands in a transaction are serialized and executed
sequentially. It can never happen that a request issued by another
client is served in the middle of the execution of a Redis
transaction. This guarantees that the commands are executed as a
single isolated operation.
P.S.:
I use Redis because my service already uses it; you can use any other mechanism that supports isolation.
The action_domain in my code above is for the case where action 1 called by user A should block only action 2 of user A, without blocking other users. The idea is to use a unique lock key for each user.
Transactions are now available in MongoDB 4.0. Here is a sample:
// Runs the txnFunc and retries if TransientTransactionError encountered
function runTransactionWithRetry(txnFunc, session) {
    while (true) {
        try {
            txnFunc(session); // performs transaction
            break;
        } catch (error) {
            // If transient error, retry the whole transaction
            if ( error.hasOwnProperty("errorLabels") && error.errorLabels.includes("TransientTransactionError") ) {
                print("TransientTransactionError, retrying transaction ...");
                continue;
            } else {
                throw error;
            }
        }
    }
}
// Retries commit if UnknownTransactionCommitResult encountered
function commitWithRetry(session) {
    while (true) {
        try {
            session.commitTransaction(); // Uses write concern set at transaction start.
            print("Transaction committed.");
            break;
        } catch (error) {
            // Can retry commit
            if (error.hasOwnProperty("errorLabels") && error.errorLabels.includes("UnknownTransactionCommitResult") ) {
                print("UnknownTransactionCommitResult, retrying commit operation ...");
                continue;
            } else {
                print("Error during commit ...");
                throw error;
            }
        }
    }
}
// Updates two collections in a transaction
function updateEmployeeInfo(session) {
    employeesCollection = session.getDatabase("hr").employees;
    eventsCollection = session.getDatabase("reporting").events;
    session.startTransaction( { readConcern: { level: "snapshot" }, writeConcern: { w: "majority" } } );
    try {
        employeesCollection.updateOne( { employee: 3 }, { $set: { status: "Inactive" } } );
        eventsCollection.insertOne( { employee: 3, status: { new: "Inactive", old: "Active" } } );
    } catch (error) {
        print("Caught exception during transaction, aborting.");
        session.abortTransaction();
        throw error;
    }
    commitWithRetry(session);
}
// Start a session.
session = db.getMongo().startSession( { mode: "primary" } );
try {
    runTransactionWithRetry(updateEmployeeInfo, session);
} catch (error) {
    // Do something with error
} finally {
    session.endSession();
}