Search is slow in Node Mongo API?

I have created an API to search documents in MongoDB.
My MongoDB collection contains more than 60 million documents.
The search code is:
let regexexp = new RegExp(searchquery.split('').join('\\s*'), 'i');
let s = { "name": regexexp, "status": 'AVAILABLE' };
db.collection(collectionName).find(s).count().then(function(totalcount) {
  db.collection(collectionName).find(s).limit(18).toArray(function(err, res) {
    if (err) throw err;
    let responsedata = { status: true, error: false, data: res, 'totalproductcount': totalcount };
    resolve(responsedata);
  });
});
But it takes around 1 minute 30 seconds, sometimes 2 minutes, to return just 18 documents, which is far too long for any user. Is there anything I have missed?
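For context, a pattern like the one above (every character separated by \s*, case-insensitive) is unanchored, so MongoDB cannot use an index on name and must scan all 60 million documents twice: once for the count and once for the find. A minimal sketch of an index-friendly variant, assuming prefix-style searches are acceptable (the index and the ^ anchor are illustrative, not from the original code):

// A case-sensitive, prefix-anchored regex can use an ordinary index on name:
// db.collection(collectionName).createIndex({ name: 1, status: 1 });
const escaped = searchquery.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // escape regex metacharacters
const s = { name: new RegExp('^' + escaped), status: 'AVAILABLE' };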
I have used these values for the MongoDB connection:
const option = {
  useUnifiedTopology: true,
  useNewUrlParser: true,
  socketTimeoutMS: 300000,
  poolSize: 500,
  keepAlive: 300000,
  connectTimeoutMS: 300000,
  authSource: 'admin',
};
Then I found out about aggregation in MongoDB and prepared this query:
db.collection('customers').aggregate([
  { $match: { name: "abc" } }
]).toArray(function(err, results) {
  console.log(results);
});
But it only works for an exact match. I want it to match whenever abc appears anywhere in the name field, but I couldn't get that to work.
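For reference, $match accepts the same query operators as find, so a substring match can be written with $regex rather than an exact equality; a small sketch, with 'abc' standing in for the search term:

db.collection('customers').aggregate([
  // matches any document whose name contains "abc", case-insensitively
  { $match: { name: { $regex: 'abc', $options: 'i' } } }
]).toArray(function(err, results) {
  if (err) throw err;
  console.log(results);
});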
Is there any way to get the data in minimal time? When I run the query from the mongo shell it returns data in milliseconds, but it takes this long when I hit it through my API.
If anything more is required, please ask here. Any help is appreciated.

Related

mongodb Alerts for frequent queries

I have this query that inserts when a listener is listening to a song.
const nowplayingData = { "type": "S", "station": req.params.stationname, "song": data[1], "artist": data[0], "timeplay": npdate };
LNowPlaying.findOneAndUpdate(
  nowplayingData,
  { $addToSet: { history: [uuid] } },
  { upsert: true },
  function(err) {
    if (err) {
      console.log('ERROR when submitting round');
      console.log(err);
    }
  }
);
I have been getting the following emails for the last week - they are starting to get annoying.
Mongodb Alerts
These alerts don't show anything wrong with the query or the code.
I also have the following query that checks for the latest userID matching the station name.
I believe this is the query setting off the alerts, because of how often we run the same query (it runs every 10 seconds, and up to 1,000 people may be requesting the info at the same time):
var query = LNowPlaying.findOne({ "station": req.params.stationname, "history": { $in: [y] } }).sort({ "_id": -1 });
query.exec(function (err, docs) {
  /*res.status(200).json({
    data: docs
  });*/
  console.error(docs);
  if (err) {
    console.error("error");
    res.status(200).json(err);
  }
});
I am wondering how I can make this better so that I don't get the alerts. I know I need to create an index that works, which I believe needs to cover the station name and the history array.
I have tried to create a new index using the fields station and history, but got this error:
Index build failed: 6ed6d3f5-bd61-4d70-b8ea-c62d7a10d3ba: Collection AdStitchr.NowPlaying ( 8190d374-8f26-4c31-bcf7-de4d11803385 ) :: caused by :: Field 'history' of text index contains an array in document: { _id: ObjectId('5f102ab25b43e19dabb201f5'), artist: "Cobra Dukes", song: "Leave The Light On (Hook N Sling Remix) [nG]", station: "DRN1", timeplay: new Date(1594898580000), __v: 0, history: [ "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE1OTQ5ODE0MjQsImlhdCI6MTU5NDg5NTAyNCwib2lkIjoicmFkaW9tZWRpYSJ9.ECVxBzAYZcpyueBP_Xlyncn41OgrezrOF8Dn3CdAnOU" ] }
Can you not index an array?
This is how I am trying to create the index:
my index creation
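For what it's worth, arrays can be indexed: a regular compound index on these two fields becomes a multikey index automatically. The error above is specific to text indexes, whose non-text fields may not contain arrays. A sketch of a plain index matching the frequent query (collection name taken from the error message):

// A regular (non-text) index; history being an array makes it multikey,
// which is allowed as long as only one indexed field per document is an array.
db.collection('NowPlaying').createIndex({ station: 1, history: 1 });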

How to find all the matches in a nested array with the _id with mongoose

This question may be easy for some of you, but I can't work out how this query should be written.
In the attached picture: https://i.stack.imgur.com/KzK0O.png
Number 1 is the endpoint with the query I can't get it to work.
Number 2 is the endpoint where you can see how I am storing the object match in the database.
Number 3 is the data structure in the frontend.
Number 4 is the Match mongoose model.
I am trying to get all the matches that have the _id I am sending by param in any of their members arrays.
I am trying it with $in, but I am not sure how querying a nested array of objects works.
I am very new at web development and this is quite difficult for me; any help would be highly appreciated, even some documentation for dummies, since I can't understand the official docs.
Thanks in advance
router.get("/profile/:_id", (req, res) => {
const idFromParam = req.params._id;
console.log("params", req.params._id);
Match.find({ match: [ { teams: [{ members: $in: [_id: idFromParam ] } ] ] }}).populate("users")
.then((response) => {
res.json(response);
console.log("response", response);
}) })
.catch((err) =>
res.status(500).json({ code: 500, message: "Error fetching", err })
);
});
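As a sketch of how such nested queries usually go: MongoDB's dot notation descends through arrays at each level, so (assuming the Match documents look like { teams: [{ members: [{ _id: ... }] }] }, as in the screenshots) the whole thing can be flattened to a single path; $in is only needed when matching against several values:

router.get("/profile/:_id", (req, res) => {
  const idFromParam = req.params._id;
  // "teams.members._id" reaches into both arrays and matches any member with this _id
  Match.find({ "teams.members._id": idFromParam })
    .populate("users")
    .then((response) => res.json(response))
    .catch((err) =>
      res.status(500).json({ code: 500, message: "Error fetching", err })
    );
});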

MongoDB: is it possible to capture TTL events with Change Stream to emulate a scheduler (cronjob)?

I'm new to MongoDB and I'm looking for a way to do the following:
I have a collection of a number of available "things" to be used.
The user can "save" a "thing" and decrement the number of available things.
But he only has a limited time to use it before it expires.
If it expires, the thing has to go back to the collection, incrementing it again.
It would be ideal if there were a way to monitor "expiring dates" in Mongo, but in my searches I've only found TTL (time to live) indexes for automatically deleting entire documents.
However, what I need is the "event" of the expiration. So I was wondering if it would be possible to capture this event with Change Streams; then I could use the event to increment "things" again.
Is this possible or not? Or would there be a better way of doing what I want?
I was able to use Change Streams and TTL to emulate a cronjob. I've published a post explaining what I did in detail and gave credits at:
https://www.patreon.com/posts/17697287
But, basically, anytime I need to schedule an "event" for a document, I also create an event document in parallel when creating the original. This event document will have as its _id the same id as the first document.
Also, for this event document I will set a TTL.
When the TTL expires I will capture its "delete" change with Change Streams. And then I'll use the documentKey of the change (since it's the same id as the document I want to trigger) to find the target document in the first collection, and do anything I want with the document.
I'm using Node.js with Express and Mongoose to access MongoDB.
Here is the relevant part to be added to App.js:
const mongoose = require('mongoose');
const { ReplSet } = require('mongodb-topology-manager');

run().catch(error => console.error(error));

async function run() {
  console.log(new Date(), 'start');
  const bind_ip = 'localhost';
  // Starts a 3-node replica set on ports 31000, 31001, 31002; replica set
  // name is "rs0".
  const replSet = new ReplSet('mongod', [
    { options: { port: 31000, dbpath: `${__dirname}/data/db/31000`, bind_ip } },
    { options: { port: 31001, dbpath: `${__dirname}/data/db/31001`, bind_ip } },
    { options: { port: 31002, dbpath: `${__dirname}/data/db/31002`, bind_ip } }
  ], { replSet: 'rs0' });

  // Initialize the replica set
  await replSet.purge();
  await replSet.start();
  console.log(new Date(), 'Replica set started...');

  // Connect to the replica set
  const uri = 'mongodb://localhost:31000,localhost:31001,localhost:31002/' + 'test?replicaSet=rs0';
  await mongoose.connect(uri);
  var db = mongoose.connection;
  db.on('error', console.error.bind(console, 'connection error:'));
  db.once('open', function () {
    console.log("Connected correctly to server");
  });

  // To work around "MongoError: cannot open $changeStream for non-existent database: test" for this example
  await mongoose.connection.createCollection('test');

  // *** we will add our scheduler here *** //
  var Item = require('./models/item');
  var ItemExpiredEvent = require('./models/scheduledWithin');
  let deleteOps = {
    $match: {
      operationType: "delete"
    }
  };
  ItemExpiredEvent.watch([deleteOps])
    .on('change', data => {
      // *** treat the event here *** //
      console.log(new Date(), data.documentKey);
      Item.findById(data.documentKey._id, function(err, item) {
        console.log(item);
      });
    });

  // The TTL set in ItemExpiredEvent will trigger the change stream handler above
  console.log(new Date(), 'Inserting item');
  Item.create({ foo: "foo", bar: "bar" }, function(err, item) {
    ItemExpiredEvent.create({ _id: item._id }, function(err, event) {
      if (err) console.log("error: " + err);
      console.log('event inserted');
    });
  });
}
And here is the code for models/scheduledWithin:
var mongoose = require('mongoose');
var Schema = mongoose.Schema;

var ScheduledWithin = new Schema({
  _id: mongoose.Schema.Types.ObjectId,
}, { timestamps: true });
// timestamps: true will automatically create a "createdAt" Date field

// TTL index: each event document is deleted roughly 90 seconds after creation
ScheduledWithin.index({ createdAt: 1 }, { expireAfterSeconds: 90 });

module.exports = mongoose.model('ScheduledWithin', ScheduledWithin);
Thanks for the detailed code.
I have two partial alternatives, just to give some ideas.
1. Given that we at least get the _id back: if you only need a specific key from your deleted document, you can manually specify the _id when you create it, and you'll at least have that information.
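A minimal sketch of that idea (the shape of the embedded _id is hypothetical, and the schema's _id type would need relaxing, e.g. to Schema.Types.Mixed; MongoDB allows _id to be an embedded document, just not an array):

// Carry the data you'll need later inside _id itself: a delete event only
// reports documentKey (the _id), not the rest of the deleted document.
ItemExpiredEvent.create({ _id: { itemId: item._id, kind: 'item-expiry' } }, function(err) {
  if (err) console.log("error: " + err);
});

// Later, in the change stream handler, the payload survives the deletion:
// data.documentKey._id.itemId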
2. (MongoDB 4.0) A bit more involved, this method takes advantage of the oplog history and opens a watch stream at the moment of creation (if you can calculate it), via the startAtOperationTime option.
You'll need to check how far back your oplog history goes, to see if you can use this method:
https://docs.mongodb.com/manual/reference/method/rs.printReplicationInfo/#rs.printReplicationInfo
Note: I'm using the mongodb library, not mongoose
// https://mongodb.github.io/node-mongodb-native/api-bson-generated/timestamp.html
const { Timestamp } = require('mongodb');

const MAX_TIME_SPENT_SINCE_CREATION = 60 * 10; // 10mn (in seconds), depends on your situation

const cursor = db.collection('items')
  .watch([{
    $match: {
      operationType: 'delete'
    }
  }]);

cursor.on('change', function(change) {
  // create another cursor, back in time
  const subCursor = db.collection('items')
    .watch([{
      $match: {
        operationType: 'insert'
      }
    }], {
      fullDocument: 'updateLookup',
      // clusterTime is a BSON Timestamp whose high bits hold the seconds since epoch
      startAtOperationTime: new Timestamp(0, change.clusterTime.getHighBits() - MAX_TIME_SPENT_SINCE_CREATION)
    });
  subCursor.on('change', function(creationChange) {
    // filter the insert events, until we find the creation event for our document
    if (creationChange.documentKey._id.equals(change.documentKey._id)) {
      console.log('item', JSON.stringify(creationChange.fullDocument, false, 2));
      subCursor.close();
    }
  });
});

How do I move a tailable cursor with awaitdata to the end so I just get new updates

I am trying to watch the MongoDB oplog with the node.js driver, and it works in theory, but it has quite a ramp-up time because it seems to scan the entire collection. I found this in the MongoDB docs:
Because tailable cursors do not use indexes, the initial scan for the query may be expensive; but, after initially exhausting the cursor, subsequent retrievals of the newly added documents are inexpensive.
Is there a way to quickly "exhaust" the cursor and just start tailing? It seems to me that the Meteor guys have solved this, but I have trouble understanding the difference from reading their code. This is what I currently have:
var cursorOptions = {
  tailable: true,
  awaitdata: true,
  numberOfRetries: -1
};

var oplogStream = oplogDb.collection('oplog.rs').find(
  {
    ns: { $regex: /^dbname\./ },
    op: "i",
    ts: { $gt: lastEntry.ts }
  },
  cursorOptions
).sort({ $natural: -1 }).stream();

oplogStream.on('data', publishDocument);
oplogStream.on('end', function() {
  log.error("received unexpected end event from oplog watcher.");
});
Great, 5 minutes after asking I found the answer. I'll post it here for future reference:
You have to add the oplogReplay flag and set it to true. This only works if you also do a range query on the ts field; I tried it before without the range set and it did nothing. The code above works when you add the one line highlighted below:
var cursorOptions = {
  tailable: true,
  awaitdata: true,
  oplogReplay: true, // add this line
  numberOfRetries: -1
};

MongoDb get last few documents and then await tailable cursor

I want to get the 5 most recent documents from a MongoDB collection, then keep tailing it for new documents. Can this be done with one query at all, or do I really need two queries? If two, what's the best way to achieve this without adding extra fields?
While an answer in any language is fine, here's an example node.js code snippet of what I am trying to achieve (error handling omitted, and the snippet edited based on the first answer to the question):
MongoClient.connect("mongodb://localhost:1338/mydb", function(err, db) {
  db.collection('mycollection', function(err, col) {
    col.count({}, function(err, total) {
      col.find({}, { tailable: true, awaitdata: true, timeout: false, skip: total - 5, limit: 5 }, function(err, cursor) {
        cursor.each(function(err, doc) {
          console.dir(doc); // print the document object to console
        });
      });
    });
  });
});
Problem: the above code prints all the documents starting from the first one, and then waits for more. The skip and limit options have no effect.
Question: how do I easily get the 5 latest documents, then keep tailing for more? An example in any language is fine; it does not have to be node.js.
(Answer edited; it's useful to know that this does not work with these versions.)
If the collection were not tailable, you'd need to find out how many items there are using count, and then use the skip option to skip the first count - 5 items.
However, this will NOT work: tailable and skip do not work together (MongoDB 2.4.6, node.js 0.10.18):
MongoClient.connect("mongodb://localhost:1338/mydb", function(err, db) {
  db.collection('mycollection', function(err, col) {
    col.count({}, function(err, total) {
      col.find({}, { tailable: true, awaitdata: true, timeout: false, skip: total - 5, limit: 5 }, function(err, cursor) {
        cursor.each(function(err, doc) {
          console.dir(doc);
        });
      });
    });
  });
});
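For completeness, a sketch of the two-query approach on a capped collection that should work with newer servers and drivers (collection and field names are illustrative): first fetch the last 5 documents with a normal reverse-natural-order query, then open the tailable cursor filtered to documents inserted after the last one seen. The _id filter assumes monotonically increasing ObjectIds:

const { MongoClient } = require('mongodb');

async function tailWithBackfill(db) {
  const col = db.collection('mycollection'); // must be a capped collection
  // Query 1: last 5 documents, then restore insertion order.
  const last5 = (await col.find().sort({ $natural: -1 }).limit(5).toArray()).reverse();
  last5.forEach(doc => console.dir(doc));
  // Query 2: tail everything newer than the last document printed.
  const lastId = last5.length ? last5[last5.length - 1]._id : null;
  const filter = lastId ? { _id: { $gt: lastId } } : {};
  const cursor = col.find(filter, { tailable: true, awaitData: true });
  for await (const doc of cursor) {
    console.dir(doc); // blocks awaiting new inserts
  }
}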