Mongoose select, populate and save behaving differently on Mac and Windows - mongodb

Here's what I did:
static populateReferralLinks() {
  return Promise.coroutine(function* () {
    let companies = yield Company.find({}, 'billing referral current_referral_program')
      .populate('billing.user', 'emails name');
    // Iterate over every company returned by the query
    for (let i = 0; i < companies.length; i++) {
      companies[i].referral.is_created = true;
      companies[i].referral.referral_email = companies[i].billing.user.emails[0].email;
      companies[i] = yield companies[i].save();
    }
    return companies;
  }).apply(this)
    .catch((err) => {
      throw err;
    });
}
I have a function in which I select only three fields to work with: billing, current_referral_program and referral.
I then populate the user using the reference stored in billing.user.
When I call this function, the line
companies[i].save();
produces the following command in the terminal on Windows:
Mongoose: companies.update(
{ _id: ObjectId("58d12e1a588a96311075c45c") },
{ '$set':
{ billing:
{ configured: false,
user: ObjectId("58d12e16588a96311075c45a") },
referral:
{ is_created: true,
referral_email: 'jadon.devesh98#gmail.com',
},
updatedAt: new Date("Wed, 22 Mar 2017 12:02:55 GMT")
}
}
)
But on the Mac, the terminal shows this command:
Mongoose: companies.update({ _id: ObjectId("58d12e1a588a96311075c45c") }) { '$set': { billing: { configured: false, user: ObjectId("58d12e16588a96311075c45a") }, current_limit: {}, current_usage: {},referral: { is_created: true, referral_email: 'jadon.devesh98#gmail.com'}}, '$unset': { updatedAt: 1 } }
I never set current_limit or current_usage to be empty. The save executes fine on Windows, but on the Mac it sets current_limit and current_usage to empty objects, overwriting those fields in my document on the Mac but not on Windows.
It should behave the same way on both operating systems, but it does not.

Apparently this problem was present in Mongoose 4.5.8 and is resolved in the latest version, 4.9.1.
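If upgrading is not possible right away, one way to avoid depending on what save() decides to write for a partially selected document is to send an explicit $set with dot-notation paths. This is only a minimal sketch under assumptions: buildReferralUpdate is a hypothetical helper, the field names are taken from the question, and Model.updateOne is assumed to be available in your Mongoose version (older 4.x releases may only have Model.update).

```javascript
// Hypothetical helper: builds an update that touches only the two referral paths,
// so no other field (current_limit, current_usage, ...) can be overwritten.
function buildReferralUpdate(company) {
  const email = company.billing.user.emails[0].email;
  return {
    filter: { _id: company._id },
    update: {
      $set: {
        'referral.is_created': true,
        'referral.referral_email': email,
      },
    },
  };
}

// Company is passed in so the sketch does not depend on a particular project setup.
async function populateReferralLinks(Company) {
  const companies = await Company.find({}, 'billing')
    .populate('billing.user', 'emails name');
  for (const company of companies) {
    const { filter, update } = buildReferralUpdate(company);
    // Assumption: updateOne exists on the model in the installed Mongoose version.
    await Company.updateOne(filter, update);
  }
  return companies.length;
}
```

Because the update names the exact paths, the command sent to MongoDB should not depend on which fields were selected.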

Related

Load a mongo collection without creating it if not exists

With the MongoDB Node.js driver v4.13, how can I load a collection without creating it if it does not exist?
In earlier versions, db.collection could be called like this:
db.collection('not_existing', { strict: true }, (err, res) => {
  if (err) {
    console.log('Collection does not exist');
  }
});
But in v4.13 the callback version of this function does not exist anymore and strict: true seems to be ignored.
const collection = await db.collection('not_existing', { strict: true });
console.log(await db.listCollections().toArray()); // lists the collection
In the mongo shell, you can use
db.getCollectionNames().filter(x => x == 'not_existing').length > 0
or
db.runCommand({ listCollections: 1, filter: { name: 'not_existing' } }).cursor.firstBatch.length > 0
to check whether a collection exists or not.
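The two snippets above are shell commands. From the Node.js driver, roughly the same check can be done with db.listCollections and a name filter. A minimal sketch, assuming db is a connected Db instance; collectionExists and getExistingCollection are hypothetical helper names, not part of the driver:

```javascript
// Returns true when a collection with the given name already exists.
// `db` is expected to be a Db instance from the MongoDB Node.js driver.
async function collectionExists(db, name) {
  const matches = await db
    .listCollections({ name }, { nameOnly: true })
    .toArray();
  return matches.length > 0;
}

// Hypothetical helper: returns the collection handle, or null if it does not exist yet.
async function getExistingCollection(db, name) {
  return (await collectionExists(db, name)) ? db.collection(name) : null;
}
```

Usage would look like `const players = await getExistingCollection(db, 'not_existing')`, followed by a null check. Note that another client could still create or drop the collection between the check and the first use.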

MongoDB query with 300k documents takes more than 30 seconds

OK, as the title says, I have a performance issue: I need to get all documents from a collection, but it takes too long. The players collection contains around 300k small documents, and the query in the service looks like this:
async getAllPlayers() {
  const players = await this.playersCollection.find({}, {projection: { playerId: 1, name: 1, surname: 1, shirtNumber: 1, position: 1 }}).toArray();
  return players;
}
The overall size is 6.4MB. I'm using the Fastify adapter, fastify-compress and the MongoDB native driver. If I remove the projection, it takes almost a minute.
Any idea how to improve this?
The best time I get is 8 seconds, where fast-json-stringify gives me a boost of more than 10 seconds over 300k records:
'use strict'
// run fresh mongo
// docker run --name temp --rm -p 27017:27017 mongo
const fastify = require('fastify')({ logger: true })
const fjs = require('fast-json-stringify')
const toString = fjs({
  type: 'object',
  properties: {
    playerId: { type: 'integer' },
    name: { type: 'string' },
    surname: { type: 'string' },
    shirtNumber: { type: 'integer' },
  }
})
fastify.register(require('fastify-mongodb'), {
  forceClose: true,
  url: 'mongodb://localhost/mydb'
})
fastify.get('/', (request, reply) => {
  const dataStream = fastify.mongo.db.collection('foo')
    .find({}, {
      limit: 300000,
      projection: { playerId: 1, name: 1, surname: 1, shirtNumber: 1, position: 1 }
    })
    .stream({
      transform(doc) {
        return toString(doc) + '\n'
      }
    })
  reply.type('application/jsonl')
  reply.send(dataStream)
})
fastify.get('/insert', async (request, reply) => {
  const collection = fastify.mongo.db.collection('foo')
  const batch = collection.initializeOrderedBulkOp();
  for (let i = 0; i < 300000; i++) {
    const player = {
      playerId: i,
      name: `Name ${i}`,
      surname: `surname ${i}`,
      shirtNumber: i
    }
    batch.insert(player);
  }
  const { result } = await batch.execute()
  return result
})
fastify.listen(8080)
In any case, you should consider:
paginating your output
or pushing the data into a bucket (like S3) and returning a URL to the client so it can download the file directly; this speeds up the process a lot and spares your Node.js process from streaming the data
Note that compression in Node.js is a heavy process, so it slows the response down a lot. An nginx proxy can handle compression instead, without the need to implement it in your business logic server.
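To illustrate the pagination suggestion, here is a minimal sketch of keyset pagination with the native driver. It assumes playerId is unique and indexed, which the question does not state; getPlayersPage and the nextCursor field are hypothetical names.

```javascript
// Fetch one page of players ordered by playerId, starting after `lastPlayerId`.
// Assumption: playerId is unique and indexed, so the range filter stays cheap.
async function getPlayersPage(collection, lastPlayerId = null, pageSize = 1000) {
  const filter = lastPlayerId === null ? {} : { playerId: { $gt: lastPlayerId } };
  const players = await collection
    .find(filter, {
      projection: { _id: 0, playerId: 1, name: 1, surname: 1, shirtNumber: 1, position: 1 },
    })
    .sort({ playerId: 1 })
    .limit(pageSize)
    .toArray();
  // A full page means there may be more data; hand back the last id as the next cursor.
  const nextCursor = players.length === pageSize ? players[players.length - 1].playerId : null;
  return { players, nextCursor };
}
```

The client passes nextCursor back as lastPlayerId on the next request and stops when it receives null. Unlike skip/limit, the cost of each page does not grow with the offset.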

Meteor Mongo Collections find forEach cursor iteration and saving to ElasticSearch Problem

I have a Meteor app which is connected to MongoDB.
In Mongo I have a collection with ~700k records.
I have a weekly cron job where I read all the records from the collection (using a Mongo cursor) and want to insert them into Elasticsearch in batches of 10k so they are indexed.
let articles = []
Collections.Articles.find({}).forEach(function(doc) {
  articles.push({
    index: {_index: 'main', _type: 'article', _id: doc.id }
  },
  doc);
  if (0 === articles.length % 10000) {
    client.bulk({ maxRetries: 5, index: 'main', type: 'article', body: articles })
    articles = []
  }
})
Since forEach is synchronous and goes over each record without waiting, while client.bulk is async, this overloads the Elasticsearch server, which crashes with an out-of-memory exception.
Is there a way to pause the forEach while the insert is being done? I tried async/await, but this does not seem to work either.
let articles = []
Collections.Articles.find({}).forEach(async function(doc) {
  articles.push({
    index: {_index: 'main', _type: 'article', _id: doc.id }
  },
  doc);
  if (0 === articles.length % 10000) {
    await client.bulk({ maxRetries: 5, index: 'main', type: 'article', body: articles })
    articles = []
  }
})
Is there any way to achieve this?
EDIT: I am trying to achieve something like this, if I use promises:
let articles = []
Collections.Articles.find({}).forEach(function(doc) {
  articles.push({
    index: {_index: 'main', _type: 'article', _id: doc.id }
  },
  doc);
  if (0 === articles.length % 10000) {
    // Pause FETCHING rows with forEach
    client.bulk({ maxRetries: 5, index: 'main', type: 'article', body: articles }).then(() => {
      console.log('inserted')
      // RESUME FETCHING rows with forEach
      console.log("RESUME READING");
    })
    articles = []
  }
})
I managed to get this working with ES2018 async iteration.
I got the idea from
Using async/await with a forEach loop
Here is the code that is working:
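let articles = []
let cursor = Collections.Articles.find({})
for await (const doc of cursor) {
  articles.push({
    index: {_index: 'main', _type: 'article', _id: doc.id }
  },
  doc);
  if (articles.length === 10000) {
    await client.bulk({ maxRetries: 5, index: 'trusted', type: 'artikel', body: articles })
    articles = []
  }
}
// Send whatever is left over after the last full batch
if (articles.length > 0) {
  await client.bulk({ maxRetries: 5, index: 'trusted', type: 'artikel', body: articles })
}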
This works correctly and it manages to insert all the records into Elastic Search without crashing.
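The same pattern can be pulled out into a small reusable helper. This is only a sketch: forEachBatch is a hypothetical name, and flush stands for whatever async call sends a batch (client.bulk in the question).

```javascript
// Sends items from any iterable or async iterable to `flush`, one batch at a time.
// The loop does not pull the next item until the pending flush has resolved.
async function forEachBatch(iterable, batchSize, flush) {
  let batch = [];
  for await (const item of iterable) {
    batch.push(item);
    if (batch.length === batchSize) {
      await flush(batch); // iteration is paused here until the batch is sent
      batch = [];
    }
  }
  if (batch.length > 0) {
    await flush(batch); // the last, partial batch
  }
}

// Possible usage with the names from the question (not executed here):
// await forEachBatch(Collections.Articles.find({}), 5000, (docs) =>
//   client.bulk({
//     maxRetries: 5,
//     body: docs.flatMap((doc) => [{ index: { _index: 'main', _type: 'article', _id: doc.id } }, doc]),
//   }))
```

Keeping the batch as a list of documents and expanding it to action/document pairs only inside flush makes the batch size mean "number of documents" rather than "number of array entries".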
If you are concerned about the unthrottled iteration, you may use the internal Meteor._sleepForMs method, which allows you to put an async timeout in your sync-styled code:
Collections.Articles.find().forEach((doc, index) => {
  console.log(index, doc._id)
  Meteor._sleepForMs(timeout)
})
Now this works fine within the Meteor environment (Meteor.startup, Meteor.methods, Meteor.publish).
Your cron job is likely not running within this environment (= Fiber), so you may write a wrapper that binds the environment:
const bound = fct => Meteor.bindEnvironment(fct)
const iterateSlow = bound(function (timeout) {
  Collections.Articles.find().forEach((doc, index) => {
    console.log(index, doc._id)
    Meteor._sleepForMs(timeout)
  })
  return true
})
iterateSlow(50) // iterates with 50ms timeout
Here is a complete minimal example that you can reproduce with a fresh project:
// create a minimal collection
const MyDocs = new Mongo.Collection('myDocs')
// fill the collection
Meteor.startup(() => {
  for (let i = 0; i < 100; i++) {
    MyDocs.insert({})
  }
})
// bind helper
const bound = fct => Meteor.bindEnvironment(fct)
// iterate docs with interval between
const iterateSlow = bound(function (timeout) {
  MyDocs.find().forEach((doc, index) => {
    console.log(index, doc._id)
    Meteor._sleepForMs(timeout)
  })
  return true
})
// simulate external environment, like when cron runs
setTimeout(() => {
  iterateSlow(50)
}, 2000)

MongoDB update function works in the terminal but not on the Meteor server

I'm trying to figure out what I'm doing wrong here with absolutely no luck.
db.attendances.update({ _id: 'hRs6LfAqPBmy4ZNuH' }, { $set: { absentParentOrGuardianDate: undefined } })
If I run this update command in the terminal shell using 'meteor mongo', absentParentOrGuardianDate is removed from the document. However, if I run the same code, slightly changed for Meteor, on the Meteor server, I get an error.
Attendances.update({ _id: 'hRs6LfAqPBmy4ZNuH' }, { $set: { absentParentOrGuardianDate: undefined } });
The error is:
{isClientSafe: true, error: 500, reason: "Internal server error", details: undefined, message: "Internal server error [500]", …}
Can someone please tell me what I'm missing here?
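import { _ } from 'meteor/underscore';
const updateDoc = {};
_.each(yourObj, (value, key) => { // I'm using underscore to help loop through the object
  // Create $set / $unset only when needed, so neither is undefined or left empty
  if (value) {
    updateDoc.$set = updateDoc.$set || {};
    updateDoc.$set[key] = value;
  } else {
    updateDoc.$unset = updateDoc.$unset || {};
    updateDoc.$unset[key] = '';
  }
});
yourCollection.update(yourSelector, updateDoc);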
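The same idea can be written as a standalone function without underscore. A minimal sketch: buildModifier is a hypothetical helper, and unlike the snippet above it only unsets keys whose value is undefined or null, so false, 0 and '' are still written with $set.

```javascript
// Hypothetical helper: turns a plain object into a MongoDB update modifier.
// Keys with an undefined or null value go to $unset, everything else to $set.
function buildModifier(fields) {
  const modifier = {};
  for (const [key, value] of Object.entries(fields)) {
    const op = value === undefined || value === null ? '$unset' : '$set';
    modifier[op] = modifier[op] || {}; // create the operator only when it is needed
    modifier[op][key] = op === '$unset' ? '' : value;
  }
  return modifier;
}

// Example with the field from the question:
const modifier = buildModifier({ absentParentOrGuardianDate: undefined });
console.log(JSON.stringify(modifier));
```

If every key is skipped the result is an empty object; in that case there is nothing to change and the update call should be skipped, since an empty object is not a valid modifier.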

Unset nested field using mongoose

Here and here are solutions for unsetting fields, which work fine unless the fields are nested. When I tried the following, null was saved in the field instead of the field being unset. How can I get it working?
PostSchema = new Schema({
    title : String
  , slug : String
  , publish : {
        done : {type:Boolean, default:false}
      , on : Date
      , by : ObjectId
    }
  , created : Date
  , ...
});
PostSchema.pre('save', function(next) {
  if(!this.isNew && this.isModified('publish') && !this.publish.done) {
    //console.log('OK I am going to unset publish.on, publish.by ');
    this.publish.on = undefined;
    this.publish.by = undefined;
  }
  // do some other stuffs
  next();
});
EDIT
I got the following log:
Mongoose: posts.update({ _id: ObjectId("53e3695289469b7136000033") }) { '$set': { lastModifiedOn: new Date("Fri, 08 Aug 2014 06:47:06 GMT"), publish: { done: false, on: undefined, by: undefined } } } {}