Check if MongoDB mutation will succeed without actually executing it - mongodb

I wonder how/if a MongoDB mutation can be simulated. By "simulated" I mean performing an insert, update or delete action without actually executing it. For example, I'd like to test if the uniqueness index will throw when trying to insert a duplicated value. I search for similar functionality to Ethereum estimate gas action which will throw on an invalid transaction before the transaction is actually sent to the network.

If you're using MongoDB 4.0 or newer, you can use transactions to simulate a dry run. Something like:
conn = pymongo.MongoClient()
with conn.start_session() as s:
s.start_transaction()
conn.test.test.insert_one({'_id':1}, session=s)
conn.test.test.delete_one({'_id':2}, session=s)
if ...dry run condition...:
s.abort_transaction()
else:
s.commit_transaction()
You can abort_transaction() for your dry run, or commit otherwise, like in a typical SQL style transaction. Similarly, a transaction will auto abort if it encounters any error.
Note that transactions require a replica set and MongoDB >= 4.0 to function. See the manual page on transactions for more details.

Related

Handling DB Failure during projection in cqrs

We are creating system using CQRS. Our projections are in mongodb. We are facing some cases. We have an event say OrderCreated. We need to produce a sequential order_no for example #3, #4 etc. We could use a projection and keep a sequence in table then called upsert method. and get a new number. Post a new command : GenerateOrderNumber. now before this post accepted hardware failure occur. If we retry we will have another number. It is not good. How to solve such use case in cqrs.
Our projections are in mongodb <...>
now before this post accepted hardware failure occur
Most likely that described issue is not about CQRS or EventSoucring itself, but related to projection storage, which is MongoDB in question above.
You are trying to perform potential atomic operation without transaction guarantees. Since hardware failure can be caused within random time, database should provide ability for rollbacking failed atomic operations in current transaction.
Best choice is native MongoDB transactions, which are available since 4.0 version - https://docs.mongodb.com/manual/core/transactions/ - and your code will look like this:
session.startTransaction( … );
try {
const lastNo = await eventsCollection.findOne( ... )
await eventsCollection.insertOne( …, lastNo +1 )
session.commitTransaction()
} catch (error) {
session.abortTransaction()
}
If you have to use older MongoDB versions, transactions still can be used. But instead of using intrinsic operator, you should manually write transaction log, and after reconnect to database perform monitoring for broken transactions and revert them manually via log.
You should do all actions via events, even generating sequence no.
In your case I suggest you using saga:
build a projection for generating order_no
fire new event OrderCreated (after this point you have Order Aggregate with some unique id)
saga, listening to this event, fire event GenerateOrderNo (get next free number from projection)
In that case, any time you ask new order_no after failure it'll be the same.
Correct me please if I understood you wrong.

Mongo transactions and updates

If I've got an environment with multiple instances of the same client connecting to a MongoDB server and I want a simple locking mechanism to ensure single client access for a short time, can I safely use my own lock object?
Say I have one object with a lockState that can be "locked" or "unlocked" and the plan is everyone checks that it is "unlocked" before doing "stuff". To lock the system I say:
db.collection.update( { "lockState": "unlocked" }, { "lockState": "locked" })
(aka UPDATE lockObj SET lockState = 'locked' WHERE lockState = 'unlocked')
If two clients try to lock the system at the same time, is it possible that both clients can end up thinking they "have the lock"?
Both clients find the record by the query parameter of the update
Client 1 updates the record (which is an atomic operation)
update returns success
Client 2 updates the document (it's already found it before client 1 modified it)
update returns success
I realize this is probably a very contrived case that would be very hard to reproduce, but is it possible or does mongo somehow make client 2's update fail?
Alternative approach
Use insert instead of update. insert is atomic and will fail if the document already exists.
To lock the system: db.locks.insert({someId: 27, state: “locked”}).
If the insert succeeds - I've got the lock and since the update was atomic, no one else can have it.
If the insert fails - someone else must have the lock.
If two clients try to lock the system at the same time, is it possible that both clients can end up thinking they "have the lock"?
No, only one client at a time writes to the lock space (Global, Database, Collection or Document depending on your version and configuration) and the operations on that lock space are sequential and one or the other (read or write, not both) per document so that other connections will not mistakenly pick up a document in a inbetween state and think that it is not locked by another client.
All operations on a single document are atomic, whether update or insert.

ActiveRecord find_or_initialize_by race conditions

I have a scenario where 2 db connections might both run Model.find_or_initialize_by(params) and raise an error: PG::UniqueViolation: ERROR: duplicate key value violates unique constraint
I'd like to update my code so it could gracefully recover from it. Something like:
record = nil
begin
record = Model.find_or_initialize_by(params)
rescue ActiveRecord::RecordNotUnique
record = Model.where(params).first
end
return record
The trouble is that there's not a nice/easy way to reproduce this on my local machine, so I'm not confident that my fix actually works.
So I thought I'd get a bit creative and try calling create 2 times (locally) in a row which should raise then PG::UniqueViolation: ERROR, then I could rescue from it and make sure everything is handled gracefully.
But I get this error: PG::InFailedSqlTransaction: ERROR: current transaction is aborted, commands ignored until end of transaction block
I get this error even when I wrap everything in individual transaction blocks
record = nil
Model.transaction do
record = Model.create(params)
end
begin
Model.transaction do
record = Model.create(params)
end
rescue ActiveRecord::RecordNotUnique
end
Model.transaction do
record = Model.where(params).first
end
return record
My questions:
What's the right way to gracefully handle the race condition I mentioned at the very beginning of this post?
How do I test this locally?
I imagine there's probably something simple that I'm missing here, but it's late and perhaps I'm not thinking too clearly.
I'm running postgres 9.3 and rails 4.
EDIT Turns out that find_or_initialize_by should have been find_or_create_by and the errors I was getting was from the actual save call that happened later on in execution. #VeryTiredWhenIWroteThis
Has this actually happenend?
Model.find_or_initialize_by(params)
should never raise an ´ActiveRecord::RecordNotUnique´ error as it is not saving anything to db. It just creates a new ActiveRecord.
However in the second snippet you are creating records.
create (without bang) does not throw exceptions caused by validations, but
ActiveRecord::RecordNotUnique is always thrown in case of a duplicate by both create and create!
If you're creating records you don't need transactions at all. As Postgres being ACID compliant guarantees that only one of the both operations succeeds and if it responds so it's changes will be durable. (a single statement query against postgres is also a transaction). So your above code is almost fine if you replace through find_or_create_by
begin
record = Model.find_or_create_by(params)
rescue ActiveRecord::RecordNotUnique
record = Model.where(params).first
end
You can test if the code behaves correctly by simply trying to create the same record twice in row. However this will not test ActiveRecord::RecordNotUnique is actually thrown correctly on race conditions.
It's also no the responsibility of your app to test and testing it is not easy. You would have to start rails in multithread mode on your machine, or test against a multi process staging rails instance. Webrick for example handles only one request at a time. You can use puma application server, however on MRI there is no true concurrency (GIL). Threads only share the GIL only on IO blocking. Because talking to Postgres is IO, i'd expect some concurrent requests, but to be 100% sure, the best testing scenario would be to deploy on passenger with multiple workers and then use jmeter to run concurrent request agains the server.

MongoDB critical section

I need to perform a few operations (read and writes) on my mongodb without having another process interrupting. It's for an online game and when a user sends resources to another the following steps are performed:
Check his resource value
Abort if it's not enough
Insert a resource transaction
Decrement his resource value
Increment the other ones resource value
I'm concerned that while checking if its enough or inserting the resource transaction some other transaction has already been inserted and the values become invalid. How can I make sure that this part is executed in this order?
I can see two ways:
Use client side transactions to hold a "lock": http://docs.mongodb.org/manual/tutorial/perform-two-phase-commits/
Or use versioning here whereby you hold a field with a $inc'd version number which gets updated every time you save and must be queried by whenever you go to save. A good example is within Vermongo: https://github.com/thiloplanz/v7files/wiki/Vermongo
Those seem to be the two most plausible ways I see of getting this done.
Transaction is a almost forbidden word when talking about mongo. But you can perform steps 1,2 and 4 using a atomic uptade with $inc using resource value as condition, and then perform steps 3 and 5. You will not have support for rolling back on step if next steps fails.
I am an engineer at Tokutek
TokuMX is a MongoDB replacement server that uses the same protocol and drivers and supports native multi-statement transactions on non-sharded setups. What you want can be accomplished with a serializable transaction, which will take document-level locks on documents you touch. This would be done something like
> db.beginTransaction("serializable");
> if (resourcesInsufficient()) { db.rollbackTransaction(); }
> // insert and update
> db.commitTransaction()
Again, this is not supported in sharding but may be useful for your application. More details, features and limitations are discussed here.

Multi-collection, multi-document 'transactions' in MongoDB

I realise that MongoDB, by it's very nature, doesn't and probably never will support these kinds of transactions. However, I have found that I do need to use them in a somewhat limited fashion, so I've come up with the following solution, and I'm wondering: is this the best way of doing it, and can it be improved upon? (before I go and implement it in my app!)
Obviously the transaction is controlled via the application (in my case, a Python web app). For each document in this transaction (in any collection), the following fields are added:
'lock_status': bool (true = locked, false = unlocked),
'data_old': dict (of any old values - current values really - that are being changed),
'data_new': dict (of values replacing the old (current) values - should be an identical list to data_old),
'change_complete': bool (true = the update to this specific document has occurred and was successful),
'transaction_id': ObjectId of the parent transaction
In addition, there is a transaction collection which stores documents detailing each transaction in progress. They look like:
{
'_id': ObjectId,
'date_added': datetime,
'status': bool (true = all changes successful, false = in progress),
'collections': array of collection names involved in the transaction
}
And here's the logic of the process. Hopefully it works in such a way that if it's interupted, or fails in some other way, it can be rolled back properly.
1: Set up a transaction document
2: For each document that is affected by this transaction:
Set lock_status to true (to 'lock' the document from being modified)
Set data_old and data_new to their old and new values
Set change_complete to false
Set transaction_id to the ObjectId of the transaction document we just made
3: Perform the update. For each document affected:
Replace any affected fields in that document with the data_new values
Set change_complete to true
4: Set the transaction document's status to true (as all data has been modified successfully)
5: For each document affected by the transaction, do some clean up:
remove the data_old and data_new, as they're no longer needed
set lock_status to false (to unlock the document)
6: Remove the transaction document set up in step 1 (or as suggested, mark it as complete)
I think that logically works in such a way that if it fails at any point, all data can be either rolled back or the transaction can be continued (depending on what you want to do). Obviously all rollback/recovery/etc. is performed by the application and not the database, by using the transaction documents and the documents in the other collections with that transaction_id.
Is there any glaring error in this logic that I've missed or overlooked? Is there a more efficient way of going about it (e.g. less writing/reading from the database)?
As a generic response multi-document commits on MongoDB can be performed as two phase commits, which have been somewhat extensively documented in the manual (See: http://docs.mongodb.org/manual/tutorial/perform-two-phase-commits/).
The pattern suggested by the manual is briefly to following:
Set up a separate transactions collection, that includes target document, source document, value and state (of the transaction)
Create new transaction object with initial as the state
Start making a transaction and update state to pending
Apply transactions to both documents (target, source)
Update transaction state to committed
Use find to determine whether documents reflect the transaction state, if ok, update transaction state to done
In addition:
You need to manually handle failure scenarios (something didn't happen as described below)
You need to manually implement a rollback, basically by introducing a name state value canceling
Some specific notes for your implementation:
I would discourage you from adding fields like lock_status, data_old, data_new into source/target documents. These should be properties of the transactions, not the documents themselves.
To generalize the concept of target/source documents, I think you could use DBrefs: http://www.mongodb.org/display/DOCS/Database+References
I don't like the idea of deleting transaction documents when they are done. Setting state to done seems like a better idea since this allows you to later debug and find out what kind of transactions have been performed. I'm pretty sure you won't run out of disk space either (and for this there are solutions as well).
In your model how do you guarantee that everything has been changed as expected? Do you inspect the changes somehow?
MongoDB 4.0 adds support for multi-document ACID transactions.
Java Example:
try (ClientSession clientSession = client.startSession()) {
clientSession.startTransaction();
collection.insertOne(clientSession, docOne);
collection.insertOne(clientSession, docTwo);
clientSession.commitTransaction();
}
Note, it works for replica set. You can still have a replica set with one node and run it on local machine.
https://stackoverflow.com/a/51396785/4587961
https://docs.mongodb.com/manual/tutorial/deploy-replica-set-for-testing/
MongoDB 4.0 is adding (multi-collection) multi-document transactions: link