Do MongoDB transactions take a snapshot for read operations?

Imagine the following scenario:
Start a session
Start a transaction for that session
Run a read on Document A
A different session updates Document A (during execution)
Write Document B based on the original read of Document A
Commit the transaction
End the session
Will the read-then-write be atomic with respect to the update on Document A, or is there a concurrency problem? I understand that a transaction snapshots all write operations, but I'm not sure what happens on the reading side.
await session.withTransaction(async () => {
  const coll1 = client.db('mydb1').collection('foo');
  const coll2 = client.db('mydb2').collection('bar');

  const docA = await coll1.findOne({ abc: 1 }, { session });
  // docA is deleted by another session at this point

  if (docA) {
    // Does this run on an outdated condition?
    await coll2.insertOne({ xyz: 999 }, { session });
  }
}, transactionOptions)
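For reference, here is a minimal sketch of what the transactionOptions above could contain if you want the transaction's reads to come from a single snapshot. The option names are from the MongoDB Node.js driver; the specific values are illustrative assumptions, not the asker's actual settings:

const transactionOptions = {
  readPreference: 'primary',
  readConcern: { level: 'snapshot' },   // reads inside the transaction see one snapshot of majority-committed data
  writeConcern: { w: 'majority' }
};

This only shows where the read concern would be configured; it doesn't by itself settle whether the insert above runs on an outdated condition.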

Related

Mongoose transactions not rolling back

The following code creates 2 documents in 2 different collections. Unless I'm misunderstanding something, 2 things aren't happening properly. The saves are happening serially, when I thought they should only happen after the transaction completes. And nothing rolls back upon error; I still see the documents in Mongo. What am I doing wrong?
const session = await dbConn.startSession();
let issue;

await dbConn.transaction(async () => {
  issue = await IssueModel.create([req.body], { session: session });
  await IssueHistoryModel.create([req.body], { session: session });
  console.log("test");
  throw new Error("oops");
});
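One hedged observation about the snippet above: the session obtained from dbConn.startSession() is not the session that dbConn.transaction() manages internally, so the two create() calls may not actually be attached to the transaction that gets aborted. Assuming Mongoose's Connection#transaction passes its managed session to the callback, a sketch of the adjusted code might look like this:

await dbConn.transaction(async (session) => {
  // use the session provided by the transaction helper, not a separately started one
  const [issue] = await IssueModel.create([req.body], { session });
  await IssueHistoryModel.create([req.body], { session });
  throw new Error("oops"); // both inserts should now be rolled back
});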

MongoDB atlas trigger - execution time limit exceeded

I'm testing out a trigger on MongoDB Atlas which runs a Realm function that adds an object to an Algolia index upon insertion into the MongoDB collection. In my case the record gets uploaded to the Algolia index successfully, but the function doesn't stop there and ends up exceeding the time limit.
The docs mention that
Function runtime is limited to 120 seconds
and that's why the function times out.
Here is my Realm function
exports = function(changeEvent) {
  const algoliasearch = require('algoliasearch');
  const client = algoliasearch(context.values.get('algolia_app'), context.values.get('algolia_key'));
  const index = client.initIndex("movies");

  changeEvent.fullDocument.objectID = changeEvent.fullDocument._id;
  delete changeEvent.fullDocument._id;

  index.saveObject(changeEvent.fullDocument)
    .then(({ objectID }) => {
      console.log('successfully inserted: ', objectID);
    })
    .catch(err => {
      console.log(err);
    });
};
Here is the result I get in the logs:
Logs:
[
"successfully inserted: 61cf0a79c577393620dd8c80"
]
Error:
execution time limit exceeded
I even tried adding return statements after the console.log calls, but still the same issue.
What am I doing wrong?
Apparently this was fixed by the MongoDB team early this March, as can be seen at https://www.mongodb.com/community/forums/t/extremely-slow-execution-of-an-external-dependency-function/16919/27.
I tested with the code below and it worked perfectly, without any timeouts this time.
I made the function async. According to the logs it didn't even take 1 second to perform the indexing.
exports = async function(changeEvent) {
  const algoliasearch = require('algoliasearch');
  const client = algoliasearch(context.values.get('algolia_app'), context.values.get('algolia_key'));
  const index = client.initIndex("movies");

  changeEvent.fullDocument.objectID = changeEvent.fullDocument._id;
  delete changeEvent.fullDocument._id;

  try {
    const result = await index.saveObject(changeEvent.fullDocument);
    console.log(Date.now(), 'successfully updated: ', result);
  } catch (e) {
    console.error(e);
  }
}
Logs (output not shown)

try_join to make mongodb transactions sent at the same time

I'm new to Rust and I'm using the default MongoDB driver
https://docs.rs/mongodb/2.0.0/mongodb/
I remember that when coding with Node.js there was a way to send the transaction's operations with Promise.all() in order to execute them all at the same time for optimization purposes, and, if there are no errors, to commit the transaction.
(Node.js example here: https://medium.com/#alkor_shikyaro/transactions-and-promises-in-node-js-ca5a3aeb6b74)
I'm trying to implement the same logic in Rust now using try_join!, but I always run into this problem:
error: cannot borrow session as mutable more than once at a time;
label: first mutable borrow occurs here
use std::time::Duration;

use mongodb::{
    bson::{doc, oid::ObjectId, Document},
    options, Client, Database,
};
use async_graphql::{
    validators::{Email, StringMaxLength, StringMinLength},
    Context, ErrorExtensions, Object, Result,
};
use futures::try_join;
// use tokio::try_join; -> same thing

#[derive(Default)]
pub struct UserMutations;

#[Object]
impl UserMutations {
    async fn user_followed<'ctx>(
        &self,
        ctx: &Context<'ctx>,
        other_user_id: ObjectId,
        current_user_id: ObjectId,
    ) -> Result<bool> {
        let mut session = Client::with_uri_str(dotenv!("URI"))
            .await
            .expect("DB not accessible!")
            .start_session(Some(session_options)) // session_options defined elsewhere
            .await?;

        session
            .start_transaction(Some(
                options::TransactionOptions::builder()
                    .read_concern(Some(options::ReadConcern::majority()))
                    .write_concern(Some(
                        options::WriteConcern::builder()
                            .w(Some(options::Acknowledgment::Majority))
                            .w_timeout(Some(Duration::new(3, 0)))
                            .journal(Some(false))
                            .build(),
                    ))
                    .selection_criteria(Some(options::SelectionCriteria::ReadPreference(
                        options::ReadPreference::Primary,
                    )))
                    .max_commit_time(Some(Duration::new(3, 0)))
                    .build(),
            ))
            .await?;

        let db = Client::with_uri_str(dotenv!("URI"))
            .await
            .expect("DB not accessible!")
            .database("database")
            .collection::<Document>("collection");

        try_join!(
            db.update_one_with_session(
                doc! { "_id": other_user_id },
                doc! { "$inc": { "following_number": -1 } },
                None,
                &mut session,
            ),
            db.update_one_with_session(
                doc! { "_id": current_user_id },
                doc! { "$inc": { "followers_number": -1 } },
                None,
                &mut session,
            )
        )?;

        Ok(true)
    }
}
849 | | &mut session,
| | ------------ first mutable borrow occurs here
... |
859 | | &mut session,
| | ^^^^^^^^^^^^ second mutable borrow occurs here
860 | | )
861 | | )?;
| |_____________- first borrow later captured here by closure
Is there any way to send the transaction operations concurrently so as not to lose time on independent mutations? Does anyone have any ideas?
Thanks in advance!
Thanks, Patrick and Zeppi, for your answers. I did some more research on this topic and also did my own testing. So, let's start.
First, my goal was to optimize transactional writes as much as possible, since I wanted the complete rollback capability that the code logic requires.
In case you missed my comments to Patrick, I'll restate them here to better reflect my way of thinking about this:
I understand why this would be a limitation for multiple reads, but if all actions are on separate collections (or are independent atomic writes to multiple documents with different payloads) I don't see why it's impossible to retain causal consistency while executing them concurrently. This kind of transaction should never create race conditions / conflicts / weird lock behaviour, and in case of error the entire transaction is rolled back before being committed anyway. Making an analogy with Git (which might be wrong), no merge conflicts are created when separate files / folders are updated. Sorry for being meticulous, this just sounds like a major speed-boost opportunity.
But after looking into it, I came across this documentation:
https://github.com/mongodb/specifications/blob/master/source/sessions/driver-sessions.rst#why-does-a-network-error-cause-the-serversession-to-be-discarded-from-the-pool
An otherwise unrelated operation that just happens to use that same
server session will potentially block waiting for the previous
operation to complete. For example, a transactional write will block a
subsequent transactional write.
Basically, this means that even if you send transaction writes concurrently, you won't gain much efficiency because MongoDB itself serializes them. I decided to check whether this was true, and since the Node.js driver setup allows sending transaction operations concurrently (as per: https://medium.com/#alkor_shikyaro/transactions-and-promises-in-node-js-ca5a3aeb6b74) I did a quick setup with Node.js pointing to the same database hosted by Atlas on the free tier.
Second, statistics and code. This is the Node.js mutation I used for the tests (each test has 4 transactional writes). I enabled GraphQL tracing to benchmark this, and here are the results of my tests:
export const testMutFollowUser = async (_parent, _args, _context, _info) => {
  try {
    const { user, dbClient } = _context;
    isLoggedIn(user);
    const { _id } = _args;

    const session = dbClient.startSession();
    const db = dbClient.db("DB");

    await verifyObjectId().required().validateAsync(_id);

    // making sure the requested user exists
    const otherUser = await db.collection("users").findOne(
      { _id: _id },
      {
        projection: { _id: 1 }
      });
    if (!otherUser)
      throw new Error("User was not found");

    const transactionResult = session.withTransaction(async () => {

      //-----using this part when doing the concurrency test------
      await Promise.all([
        await createObjectIdLink({ db_name: 'links', from: user._id, to: _id, db }),
        await db.collection('users').updateOne(
          { _id: user._id },
          { $inc: { following_number: 1 } },
        ),
        await db.collection('users').updateOne(
          { _id },
          {
            $inc: { followers_number: 1, unread_notifications_number: 1 }
          },
        ),
        await createNotification({
          action: 'USER_FOLLOWED',
          to: _id
        }, _context)
      ]);
      //-----------end of concurrency part--------------------

      //------using this part when doing the sync test--------
      // this is a helper around db.insertOne(...)
      const insertedId = await createObjectIdLink({ db_name: 'links', from: user._id, to: _id, db });

      const updDocMe = await db.collection('users').updateOne(
        { _id: user._id },
        { $inc: { following_number: 1 } },
      );
      const updDocOther = await db.collection('users').updateOne(
        { _id },
        {
          $inc: { followers_number: 1, unread_notifications_number: 1 }
        },
      );
      // this is another helper around db.insertOne(...)
      await createNotification({
        action: 'USER_FOLLOWED',
        to: _id
      }, _context);
      //-----------end of sync part---------------------------

      return true;
    }, transactionOptions);

    if (transactionResult) {
      console.log("The reservation was successfully created.");
    } else {
      console.log("The transaction was intentionally aborted.");
    }
    await session.endSession();
    return true;
  } catch (err) {
    // minimal error handling assumed here; the posted snippet ends before its catch block
    console.error(err);
    throw err;
  }
};
And related performance results:
format:
Request/Mutation/Response = Total (all in ms)
1) For sync writes in the transaction:
4/91/32 = 127
4/77/30 = 111
7/71/7 = 85
6/66/8 = 80
2/74/9 = 85
4/70/8 = 82
4/70/11 = 85
--waiting more time (~10secs)
9/73/34 = 116
totals/8 = **96.375 ms on average**
//---------------------------------
2) For concurrent writes in transaction:
3/85/7 = 95
2/81/14 = 97
2/70/10 = 82
5/81/11 = 97
5/73/15 = 93
2/82/27 = 111
5/69/7 = 81
--waiting more time (~10secs)
6/80/32 = 118
totals/8 = **96.75 ms on average**
Conclusion: the difference between the two is within the margin of error (but still on the sync side).
My assumption is that with the sync approach you spend time waiting for each DB request/response, while with the concurrent approach you wait for MongoDB to order the requests and then execute them all, which at the end of the day costs the same amount of time.
So with current MongoDB policies, I guess, the answer to my question is "there is no need for concurrency because it won't affect the performance anyway." However, it would be great if future MongoDB releases allowed parallelizing writes within a transaction, with locks at the document level (at least for the WiredTiger engine) instead of at the database level, as it currently is for transactions (because you wait for one whole write to finish before the next one starts).
Feel free to correct me if I missed/misinterpreted something. Thanks!
This limitation is actually by design. In MongoDB, client sessions cannot be used concurrently (see here and here), and so the Rust driver accepts them as &mut to prevent this from happening at compile time. The Node example is only working by chance and is definitely not recommended or supported behavior. If you would like to perform both updates as part of a transaction, you'll have to run one update after the other. If you'd like to run them concurrently, you'll need to execute them without a session or transaction.
As a side note, a client session can only be used with the client that it was created from. In the provided example, the session is being used with a different client, which will cause an error.

Download large sets of documents from MongoDB using Meteor methods

I am trying to export all the documents from a collection (which is about 12 MB) using a Meteor method, but it almost always crashes the app or never returns the results.
I am considering uploading the documents to S3 and then sending a download link to the client; however, that seems like an unnecessary extra network hop and will make the process even longer.
Is there a better way to get large sets of data from server to client?
Here is an example of that code; it is very simple.
'downloadUserActions': () => {
  if (Roles.userIsInRole(Meteor.userId(), ['admin'])) {
    const userData = userActions.find({}).fetch();
    return userData
  }
}
Thanks.
You can use an approach where you split the request into multiple ones:
get the document count
until the document count is completely fetched:
  get the current count of already fetched docs
  fetch the next batch of docs, skipping the already fetched ones
For this you need the skip option in the Mongo query in order to skip the already fetched docs.
Code example
const limit = 250

Meteor.methods({
  // get the max amount of docs
  getCount () {
    return userActions.find().count()
  },
  // get the next block of docs
  // from: skip to: skip + limit
  // example: skip = 1000, limit = 500 is
  // from: 1000 to: 1500
  downloadUserActions (skip) {
    this.unblock()
    return userActions.find({}, { skip, limit }).fetch()
  }
})
Client:
// wrap the Meteor.call into a promise
const asyncCall = (name, args) => new Promise((resolve, reject) => {
  Meteor.call(name, args, (err, res) => {
    if (err) {
      return reject(err)
    }
    return resolve(res)
  })
})

const asyncTimeout = ms => new Promise(resolve => setTimeout(() => resolve(), ms))
const fetchAllDocs = async (destination) => {
  const maxDocs = await asyncCall('getCount')
  let loadedDocs = 0

  while (loadedDocs < maxDocs) {
    const docs = await asyncCall('downloadUserActions', loadedDocs)
    docs.forEach(doc => {
      // think about using upsert to fix multiple-docs issues
      destination.insert(doc)
    })

    // increase counter (skip value)
    loadedDocs = destination.find().count()

    // wait 10ms before the next request; increase if the server needs
    // more time
    await asyncTimeout(10)
  }

  return destination
}
Use it with a local Mongo Collection on the client:
await fetchAllDocs(new Mongo.Collection(null))
After the function completes, all docs are stored in this local collection.
Play with the limit and the timeout (milliseconds) values in order to find a sweet spot between user experience and server performance.
Additional improvements
The code does not authenticate or validate requests. This is up to you!
Also, you might think about adding a failsafe mechanism in case the while loop never completes due to some unintended errors, as sketched below.
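One hedged sketch of such a failsafe, reusing the asyncCall and asyncTimeout helpers from above; the maxAttempts name and threshold are illustrative and not part of the original answer:

// Sketch: abort after a fixed number of iterations instead of looping forever.
const maxAttempts = 1000 // illustrative threshold

const fetchAllDocsSafe = async (destination) => {
  const maxDocs = await asyncCall('getCount')
  let attempts = 0

  while (destination.find().count() < maxDocs) {
    if (++attempts > maxAttempts) {
      throw new Error('fetchAllDocs: too many attempts, aborting')
    }
    const docs = await asyncCall('downloadUserActions', destination.find().count())
    docs.forEach(doc => destination.insert(doc))
    await asyncTimeout(10)
  }

  return destination
}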
Further readings
https://docs.meteor.com/api/methods.html#DDPCommon-MethodInvocation-unblock
https://docs.meteor.com/api/collections.html#Mongo-Collection
https://docs.meteor.com/api/collections.html#Mongo-Collection-find

mongoose model has no access to connection's in-progress transactions

In my current Express application I want to use the new MongoDB multi-document transactions feature.
First of all, it is important to point out how I connect to and handle the models.
My app.js (server) first connects to the db using db.connect().
I require all models in my db/index file. Since the models are initialized with the same mongoose reference, I assume that future requires of the models in different routes point to the same, connected connection. Please correct me if any of these assumptions are wrong.
I save the connection reference inside the state object and return it when needed for my transaction later.
./db/index.ts
const fs = require('fs');
const path = require('path');
const mongoose = require('mongoose');

const state = {
  connection: null,
};

// require all models
const modelFiles = fs.readdirSync(path.join(__dirname, 'models'));
modelFiles
  .filter(fn => fn.endsWith('.js') && fn !== 'index.js')
  .forEach(fn => require(path.join(__dirname, 'models', fn)));

const connect = async () => {
  state.connection = await mongoose.connect(.....);
  return;
}

const get = () => state.connection;

module.exports = {
  connect,
  get,
}
My model files contain the required schemas.
./db/models/example.model.ts
const mongoose = require('mongoose');
const Schema = mongoose.Schema;

const ExampleSchema = new Schema({ ... });

const ExampleModel = mongoose.model('Example', ExampleSchema);

module.exports = ExampleModel;
Now the route where I try to do a basic transaction:
./routes/item.route.ts
const ExampleModel = require('../db/models/example.model');

router.post('/changeQty', async (req, res, next) => {
  const connection = db.get().connection;
  const session = await connection.startSession(); // works fine
  // start a transaction
  session.startTransaction(); // also fine

  const { someData } = req.body.data;

  try {
    // just looping over the data and preparing the promises
    let promiseArr = [];
    someData.forEach(data => {
      // !!! THIS THROWS THE ERROR !!!
      let p = ExampleModel.findOneAndUpdate(
        { _id: data.id },
        { $inc: { qty: data.qty } },
        { new: true, runValidators: true }
      ).session(session).exec();
      promiseArr.push(p);
    })

    // running the promises in parallel
    await Promise.all(promiseArr);
    await session.commitTransaction();
    return res.status(..)....;
  } catch (err) {
    await session.abortTransaction();
    // MongoError: Given transaction number 1 does not match any in-progress transactions.
    return res.status(500).json({ err: err });
  } finally {
    session.endSession();
  }
})
But I always get the following error, which probably has something to do with the connection reference of my models. I assume that they don't have access to the connection which started the session, so they are not aware of the session.
MongoError: Given transaction number 1 does not match any in-progress
transactions.
Maybe I somehow need to initialize the models inside db.connect with the direct connection reference?
There is a big mistake somewhere and I hope you can lead me onto the correct path. I appreciate any help. Thanks in advance!
This is because you're doing the operations in parallel, so you've got a bunch of race conditions. Just use async/await and make your life easier:
let p = await ExampleModel.findOneAndUpdate(
  { _id: data.id },
  { $inc: { qty: data.qty } },
  { new: true, runValidators: true }
).session(session).exec();
Reference : https://github.com/Automattic/mongoose/issues/7311
If that does not work, try to execute the promises one by one rather than with Promise.all(), as sketched below.
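A minimal sketch of that suggestion, assuming the same someData, ExampleModel, and session from the question; the loop replaces the forEach/Promise.all combination so the updates run one after another on the single session:

// Run the updates sequentially on the one session instead of in parallel.
for (const data of someData) {
  await ExampleModel.findOneAndUpdate(
    { _id: data.id },
    { $inc: { qty: data.qty } },
    { new: true, runValidators: true }
  ).session(session).exec();
}

await session.commitTransaction();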