All, my code below is not giving me the correct number of rows. Basically I am reading file data and storing it in a MongoDB collection; might this be related to asynchronous vs. synchronous operations? I would appreciate it if someone could point me to the right resource.
collection.insertMany(jsonArray);
db.collection('collection1').count(function(err, count) {
    if (err) throw err;
    console.log('Total Rows: ' + count);
});
Total Rows: 3803
Now if I go to the MongoDB command shell, it gives me the accurate number of rows.
Most probably you are trying to fetch the count before the insert operation is complete. So, first wait for the data to be inserted and after that run the count query. Hope this helps.
Try this:
collection.insertMany(jsonArray, function(err, res) {
    if (err) {
        throw err;
    } else {
        db.collection('collection1').count(function(err, count) {
            if (err) throw err;
            console.log('Total Rows: ' + count);
        })
    }
})
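If you are on a newer Node.js driver, insertMany also returns a promise when no callback is passed, so the same "insert first, then count" ordering can be written with async/await. A minimal sketch, assuming driver 3.x+ where countDocuments replaces the deprecated count:

async function insertAndCount(collection, jsonArray) {
    // insertMany resolves only after the documents are written
    const result = await collection.insertMany(jsonArray);
    console.log('Inserted: ' + result.insertedCount);
    // the count now reflects the newly inserted documents
    const count = await collection.countDocuments();
    console.log('Total Rows: ' + count);
}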
So, I only have a few documents in my Mongo DB. For example, I have this basic find request (see below), which takes 4 seconds to return a 1.12KB JSON, before the component re-renders.
app.get('/mypath', (req, res) => {
    MongoClient.connect(urlDb, (err, db) => {
        let Mycoll = db.collection('Mycoll');
        Mycoll.find({}).toArray((err, data) => {
            if (err) throw err;
            else {
                res.status(200).json(data);
            }
            db.close(); // close only after the query has completed
        });
    });
});
Sometimes, for that same component to re-render with the same request, it takes 8 seconds (which equals an eternity for an Internet user).
Is it supposed to take this long? I can imagine a user of my app starting to think ("well, that doesn't work") and closing it right before the results show.
Is there anything you could point me to in order to optimize the performance? Any tool you would recommend for analyzing what exactly causes this bottleneck? Or anything I did wrong?
At this stage, I don't suspect React/Redux, because with no DB requests involved, my other components render fast.
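A likely culprit is opening a new MongoClient connection inside every request handler and closing it straight away; connecting once at startup and reusing the pooled connection is the usual fix. A minimal sketch, assuming Express and the 2.x driver used above:

const MongoClient = require('mongodb').MongoClient;

let db;
MongoClient.connect(urlDb, (err, database) => {
    if (err) throw err;
    db = database; // one pooled connection, reused by every request
    app.listen(3000);
});

app.get('/mypath', (req, res) => {
    db.collection('Mycoll').find({}).toArray((err, data) => {
        if (err) return res.status(500).send(err);
        res.status(200).json(data);
    });
});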
Does anyone have a suggestion about how to update a field in each document in a large collection?
I use something like this:
MyModel.find().exec(function(err, data) {
    if (err) {
        return console.log(err);
    }
    data.forEach(function(doc) {
        doc.Field = doc.Field + 1;
        doc.save(function (err) {
            if (err) {
                console.error('ERROR!');
            }
        });
    });
});
But I get FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory.
Is there a way to process the above update in chunks or something like that?
You can use the async.eachLimit method of the async library to limit the number of concurrent save operations.
For example, to limit the saves to no more than 5 outstanding at a time:
MyModel.find().exec(function(err, data) {
    if (err) {
        return console.log(err);
    }
    async.eachLimit(data, 5, function(doc, callback) {
        doc.Field = doc.Field + 1;
        doc.save(function(err) {
            if (err) {
                console.error('ERROR!');
            }
            callback(err);
        });
    });
});
However, in this case it would be much more efficient to use a single update with the $inc operator and the multi: true option to increment each doc's Field value by 1.
MyModel.update({}, {$inc: {Field: 1}}, {multi: true}, function(err) { ... });
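If you really do need per-document logic, the memory pressure can also be avoided by iterating a cursor instead of materializing every document with find().exec(). A sketch, assuming a Mongoose version where Query#cursor() and its eachAsync helper are available and doc.save() returns a promise:

MyModel.find().cursor().eachAsync(function(doc) {
    doc.Field = doc.Field + 1;
    return doc.save(); // eachAsync waits for this promise before fetching the next doc
}).then(function() {
    console.log('All documents updated');
}).catch(function(err) {
    console.error(err);
});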
You need more memory: --max_new_space_size and/or --max_old_space_size, like this:
node --max-old-space-size=4096 server.js
Currently, by default v8 has a memory limit of 512MB on 32-bit systems, and 1.4GB on 64-bit systems. The limit can be raised by setting --max_old_space_size to a maximum of ~1024 (~1GB) on 32-bit and ~4096 (~4GB) on 64-bit, but it is recommended that you split your single process into several workers if you are hitting memory limits.
router.get('/wiki/:topicname', function(req, res, next) {
    var topicname = req.params.topicname;
    console.log(topicname);
    summary.wikitext(topicname, function(err, result) {
        if (err) {
            return res.send(err);
        }
        if (!result) {
            return res.send('No article found');
        }
        var $ = cheerio.load(result);
        var db = req.db;
        var collection = db.get('try1');
        collection.insert({ "topicname" : topicname, "content": result }, function (err, doc) {
            if (err) {
                // If it failed, return error
                res.send("There was a problem adding the information to the database.");
            }
            else {
                // And forward to success page
                res.send("Added successfully");
            }
        });
    });
});
Using this code, I am trying to add the fetched content from Wikipedia into the collection try1. The message "Added successfully" is displayed, but the collection seems to be empty. The data is not inserted in the database.
The data must be there; MongoDB has { w: 1, j: true } write concern options by default, so it only returns without an error if the document was truly inserted, if there was any document to insert.
Things you should consider:
- Do NOT use the insert function; it's deprecated. Use insertOne, insertMany or bulkWrite instead. Ref.: http://mongodb.github.io/node-mongodb-native/2.1/api/Collection.html#insert
- The insert method's callback has two parameters: an error if there was an error, and a result. The result object has several properties that can be used to test the outcome of the insert; for example, result.insertedCount returns the number of inserted documents.
So, according to these, in your code you only test for an error, but you can insert zero documents without getting one.
Also, it's not clear to me where you get your database name from. Is the following correct in your code? Are you sure you are connected to the database you want to use?
var db = req.db;
Also, you don't have to enclose your property names in quotes (") in your insert method. The insert should look something like this:
col.insertOne({topicname: topicname, content: result}, function(err, r) {
    if (err) {
        console.log(err);
    } else {
        console.log(r.insertedCount);
    }
});
Start your mongod server with the correct path, i.e., the same --dbpath that you are using when you check the contents of the collection:
sudo mongod --dbpath <actual-path>
When I am inserting/updating a document in a collection, is the lock applied on the database or on the collection? Suppose I have two collections in the same database that are independent of each other, and I want to do write operations on them concurrently. Is this possible?
Here is the code I am using to test this:
var assert = require('assert'),
    MongoClient = require('mongodb').MongoClient,
    async = require('async');

var station_list = require('./station_list.json'),
    trains_list = require('./trains_list.json');

var stationList = [],
    trainsList = [];

var MONGO_URL = 'mongodb://localhost:27017/test';

for (var i = 0; i < station_list.stations.length; i++)
    stationList.push(station_list.stations[i].station_code);

for (var i = 0; i < trains_list.trains.length; i++)
    trainsList.push(trains_list.trains[i].code);

console.log('trains : ' + trainsList.length + ' stations : ' + stationList.length);

populateTrains();
populateStations();

function populateTrains() {
    async.eachSeries(trainsList, populateTrainDb, function (err) {
        assert.equal(null, err);
    });
}

function populateTrainDb(code, callback) {
    MongoClient.connect(MONGO_URL, function (err, db) {
        assert.equal(null, err);
        var jsonData = {};
        jsonData.code = code;
        db.collection('trainsCon').replaceOne(
            {'code': code}, jsonData, {upsert: true, w: 1}, function (err, res) {
                assert.equal(null, err);
                db.close();
                callback();
            });
    });
}

function populateStations() {
    async.eachSeries(stationList, populateStationDb, function (err) {
        assert.equal(null, err);
    });
}

function populateStationDb(code, callback) {
    MongoClient.connect(MONGO_URL, function (err, db) {
        assert.equal(null, err);
        var jsonData = {};
        jsonData.code = code;
        db.collection('stationsCon').replaceOne(
            {'code': code}, jsonData, {upsert: true, w: 1}, function (err, res) {
                assert.equal(null, err);
                db.close();
                callback();
            });
    });
}
The two JSON files, station_list.json and trains_list.json, have around 5000 entries. So after running the given program I get this error after a while:
C:\Users\Adnaan\Desktop\hopSmart\node_modules\mongodb\lib\server.js:242
process.nextTick(function() { throw err; })
^
AssertionError: null == { [MongoError: connect EADDRINUSE 127.0.0.1:27017]
name: 'MongoError',
message: 'connect EADDRINUSE 127.0.0.1:27017' }
at C:\Users\Adnaan\Desktop\hopSmart\testing.js:52:10
at C:\Users\Adnaan\Desktop\hopSmart\node_modules\mongodb\lib\mongo_client.js:276:20
at C:\Users\Adnaan\Desktop\hopSmart\node_modules\mongodb\lib\db.js:224:14
at null.<anonymous> (C:\Users\Adnaan\Desktop\hopSmart\node_modules\mongodb\lib\server.js:240:9)
at g (events.js:273:16)
at emitTwo (events.js:100:13)
at emit (events.js:185:7)
at null.<anonymous> (C:\Users\Adnaan\Desktop\hopSmart\node_modules\mongodb-core\lib\topologies\server.js:301:68)
at emitTwo (events.js:100:13)
at emit (events.js:185:7)
When I check the number of entries in the database, around 4000 entries had already been entered in both collections. So what I take from the above experiment is that the error might have occurred when a write to one collection was being attempted while a document was being written to the other collection.
So how should I proceed to get this concurrency without conflicting locks?
The answer to this question can be quite long and depends on various factors (MongoDB version, storage engine, type of operations you are doing, sharding, etc.). I can only recommend that you carefully read the Concurrency section of the MongoDB documentation, and in particular the lock granularity part.
Make sure to choose the right version of MongoDB first, as the behaviour varies greatly from one version to another (e.g. database-level locking pre-3.0 vs. collection-level locking for most operations post-3.0 under MMAPv1).
I don't think it's a concurrency issue with MongoDB; it could be the driver or even the test itself.
I created a sample application a couple of weeks ago to stress test MongoDB while working on a nasty bug. I used C# and MongoDB 3.0 on Windows 10. I inserted millions of documents in a multithreaded environment but couldn't crash MongoDB.
Parallel.For(0, 10000, (x =>
{
    var lstDocs = new List<BsonDocument>();
    for (var i = 0; i < 100; i++)
    {
        lstDocs.Add(new BsonDocument(doc));
    }
    collection.InsertMany(lstDocs);
    lstDocs.Clear();
}));
You can find code in gist here.
You should not be calling MongoClient.connect every time. That causes a ton of connections to open and close all the time, which is overloading mongo. You should let the MongoClient manage the connection pool. Change it so that you store the db object from MongoClient.connect. Something like this:
var db;
MongoClient.connect(url, function(err, database) {
    db = database;
});
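Applied to the code in the question, populateTrainDb would then reuse that shared db instead of connecting and closing per item; roughly:

function populateTrainDb(code, callback) {
    // no per-call connect/close; the driver's connection pool handles concurrency
    db.collection('trainsCon').replaceOne(
        {'code': code}, {code: code}, {upsert: true, w: 1}, function (err, res) {
            assert.equal(null, err);
            callback();
        });
}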
Hi, I am trying to build an application which upserts data and fetches it from MongoDB based on the user id. This approach works fine for a single user, but when I try hitting it for multiple users, say 25, the fetched data seems to be null. Below is my upsert code:
collection.update({'USER_ID': passVal.ID},
    {'RESPONSE': Data}, { upsert: true }, function (err) {
        if (err) {
            console.log("Error in saving data");
        }
        var query = collection.findOne({'USER_ID': passVal.ID});
        query.select('RESPONSE');
        query.exec(function (err, data) {
            if (err) return handleError(err);
            console.log(data.RESPONSE);
        });
    })
I always get an error in some cases, as data is null. I have written the read code in the callback of the upsert only. I am stuck here; any help regarding this would be much appreciated.
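One thing worth checking in the snippet above: with a plain replacement document like {'RESPONSE': Data}, the upserted document will not contain a USER_ID field at all, so the follow-up findOne({'USER_ID': passVal.ID}) can legitimately come back null. Using the $set operator instead keeps the query field in the upserted document; a sketch under that assumption:

collection.update({'USER_ID': passVal.ID},
    {$set: {'RESPONSE': Data}}, {upsert: true}, function (err) {
        if (err) {
            console.log("Error in saving data");
        }
        // the findOne on USER_ID in this callback can now match
    });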