JavaScript Stored Function on MongoDB Server

This is about JavaScript stored functions on the MongoDB server. I understand how they work and what the use cases are, but I am doubtful about one line in the official MongoDB documentation:
"Note: We do not recommend using server-side stored functions if possible."
In fact, after the move to the V8 JavaScript engine (which improved concurrency for JavaScript execution), and given that stored functions could save us many network round trips, why is this not recommended by 10gen?

This is not recommended because the JavaScript function needs to take a write lock for the duration of its execution, meaning you can create potential bottlenecks in your write performance.

There are some disadvantages of stored procedures in general:
https://stackoverflow.com/questions/462978/when-should-you-use-stored-procedures
Yet I understand your point concerning the network roundtrips.
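For reference, this is roughly what the feature looks like in practice. A minimal sketch, assuming a legacy deployment where db.eval() is still available (it was deprecated in MongoDB 3.0 and removed in 4.2); the function name addNumbers is purely illustrative:

    // Store a function on the server (legacy shell helper).
    db.system.js.save({
        _id: "addNumbers",
        value: function (x, y) { return x + y; }
    });

    // Calling it through eval runs it inside the server's JS engine and,
    // on older servers, takes a global write lock for the duration.
    db.eval("addNumbers(3, 4)");   // => 7

    // In the mongo shell, stored functions can also be pulled into scope:
    db.loadServerScripts();
    addNumbers(3, 4);              // => 7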

Meteor website takes too much time to load

I have a webApp built using Meteor.
Following are the specifications:
Meteor version: 1.8
Mongo version: 4.0.5
Following is the list of packages I have used:
jquery#1.11.10
twbs:bootstrap#3.3.6
iron:router
reactive-var#1.0.11
fortawesome:fontawesome
blaze#2.1.8
accounts-password#1.5.1
mrt:mathjax
email#1.2.3
momentjs:moment
ian:accounts-ui-bootstrap-3#1.2.89
meteor-base#1.4.0
mongo#1.6.0
blaze-html-templates#1.0.4
session#1.2.0
tracker#1.2.0
logging#1.1.20
reload#1.2.0
ejson#1.1.0
spacebars#1.0.12
standard-minifier-css#1.5.2
standard-minifier-js#2.4.0
jss:jstree
meteorhacks:subs-manager
aldeed:template-extension
reywood:publish-composite
shell-server#0.4.0
stylus#=2.513.13
accounts-base#1.4.3
iron:middleware-stack#1.1.0
http#1.4.1
ecmascript#0.12.4
dynamic-import#0.5.0
sha#1.0.9
simple:json-routes
underscore#1.0.10
aldeed:simple-schema
rafaelhdr:google-charts
meteorhacks:aggregate
The webApp is hosted on an AWS EC2 instance with 16 GB RAM and 4 processors. The application uses the pub-sub method. Now the issue is that whenever there are more than 50 concurrent connections, CPU usage crosses sixty percent and the webApp becomes annoyingly slow to use.
As per my findings, it could be because of two reasons. Either the pub-sub schema I have used is too heavy, i.e., I have used database subscriptions extensively on each page and Meteor continuously maintains an open connection for each of them, or the extensive resource usage is coming from MongoDB itself. As per dbStats, the db uses more than 6 GB of RAM. Following are the details:
I am not sure why this behaviour occurs. The only approach I can think of is trial and error (remove subscriptions and then test), but that would be too time consuming and also not foolproof.
Could someone help me out as to how to proceed?
Depending on the way your app is designed, data-wise, there can be several reasons for this lack of performance.
A few suggestions:
check that you have indexes on your collections
avoid doing aggregation in the publication process, i.e. denormalize the db, publish an array of cursors instead, limit the size of the documents, etc.
filter out the fields you don't need in the query (see the sketch after this list)
limit the amount of data to the relevant part (lazy load and paginated subscribe)
consider global pubs/subs for the collections you use a lot, instead of reloading them too often on a route-based pattern
track small component-based subs and try to move them to a higher level, to avoid making, for instance, 30 subs instead of one
My best guess is that you probably need a mix of rationalizing the db "structure" and avoiding data aggregation as much as you can.
You may also be misusing the low-level collection API (e.g. cursor.observe()) somewhere.
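To make the last few points concrete, here is a minimal sketch of a leaner publication, assuming a hypothetical Posts collection; the collection, publication and field names are illustrative only:

    // Server: index the field you filter/sort on.
    Meteor.startup(function () {
        Posts.rawCollection().createIndex({ createdAt: -1 });
    });

    // Server: publish only the fields the templates need, paginated.
    Meteor.publish("posts.recent", function (limit) {
        limit = Math.min(limit || 20, 100);   // cap what a client may request
        return Posts.find({}, {
            fields: { title: 1, createdAt: 1, authorId: 1 },
            sort: { createdAt: -1 },
            limit: limit
        });
    });

    // Client: one higher-level subscription with a growing limit (lazy load),
    // instead of many small per-component subscriptions.
    Session.setDefault("postsLimit", 20);
    Tracker.autorun(function () {
        Meteor.subscribe("posts.recent", Session.get("postsLimit"));
    });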

MongoDB: Stored Procedures [duplicate]

This question already has answers here:
MongoDB Stored Procedure Equivalent
(3 answers)
Closed 8 years ago.
As I have heard, MongoDB can store internal procedures.
How can I use them?
The official help is very short.
Can I use stored procedures to implement small pieces of logic at this layer?
The same as Postgres PL/pgSQL.
The duplicate question (MongoDB Stored Procedure Equivalent) does explain that you can store a procedure within MongoDB and call it via the eval() command; however, it doesn't really explain why this is a bad thing.
eval gives direct access to an almost unrestricted JS environment called from MongoDB's C++ code. It is also worth mentioning that injection through unescaped parameters is very easy.
These are not stored procedures that run within MongoDB's own runtime (unlike the stored procedures you are thinking of): the JS engine is run from MongoDB, but MongoDB itself is not programmed in JS; it is programmed in C++.
They are only available from a JS context, not from MongoDB's C++ context.
By default they can take a global lock, even with the nolock option set; it all depends on the operations you call, and the JS itself is extremely slow compared to MongoDB's native runtime.
As such:
Can I use stored procedures to implement small pieces of logic at this layer?
No. It is actually implemented on a third layer, separate from MongoDB.
MongoDB is designed to run this stuff client side; there is a 90% chance you will get no real benefit from using "stored procedures". In fact, in many ACID databases they are heavily abused and used in ways that actually slow applications down and make them more prone to failure. So you need to think very carefully about whether you really need them or not.
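To illustrate the injection point mentioned above, a sketch assuming a legacy server where db.eval() still exists; userInput and someStoredFn are hypothetical:

    // Dangerous: concatenating raw user input into the evaluated string
    // lets an attacker break out of the call and run arbitrary JS.
    var userInput = 'x"); db.users.drop(); print("';   // attacker-controlled
    db.eval('someStoredFn("' + userInput + '")');

    // Safer (though the lock and speed caveats still apply): pass values
    // as arguments so they are treated as data, not code.
    db.eval(function (name) {
        return db.users.count({ name: name });
    }, userInput);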

Not recommended to use server-side functions in MongoDB, does this go for MapReduce as well?

The MongoDB documentation states that it is not recommended to use its stored functions feature. This question goes through some of the reasons, but they all seem to boil down to "eval is evil".
Are there specific reasons why server-side functions should not be used in a MapReduce query?
The system.js functions are available to Map Reduce jobs by default (https://jira.mongodb.org/browse/SERVER-8632 notes a slight glitch to that in 2.4.0rc).
They are not actually run through eval() within the native V8/SpiderMonkey environment, so technically that part of the concern is gone as well.
So no, there are no real problems; they will run as though native within that Map Reduce and should run just as fast and just as well as any other JavaScript you write. In fact, the system.js collection was originally designed to house code for map reduce jobs; it was only later use that turned it into a hack for "stored procedures".
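As a sketch of that intended use, assuming a hypothetical orders collection (the field names are made up): store a helper in system.js and reference it from the map function.

    db.system.js.save({
        _id: "taxedTotal",
        value: function (amount) { return amount * 1.2; }
    });

    db.orders.mapReduce(
        function () {
            // Functions saved in system.js are in scope inside map/reduce.
            emit(this.customerId, taxedTotal(this.amount));
        },
        function (key, values) {
            return Array.sum(values);
        },
        { out: { inline: 1 } }
    );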

Will I typically get better performance if I run an update/calc loop via JavaScript?

I have a script that loops over a set of records, performs some statistical calculations and updates the records. It's a big cursor loop: get a record, calculate statistics from its embedded documents, set fields on the record, save the record. There are fewer than 5k records being looped over, and each one embeds 90 history entries.
Question: would I get substantially better performance if I did this via JavaScript? The alternative is writing it in Ruby. My (unfounded) opinion is that since this can be done entirely in the database, I will get better performance if I send a chunk of JS to MongoDB instead of adding Ruby into the mix.
Related: is map/reduce appropriate for finding the median and mode of a set of values for many records?
The answer is really "it depends": if the fields you need for the calculations are very large, doing the calculation on the server side with JS might be a lot faster simply by cutting down on network traffic.
But executing JS on the server side also holds a write lock, so depending on how complicated the calculations are, it might be more efficient to just do your calculations on the client side and then simply update the documents.
Your best bet is to do a simple benchmark of Ruby vs. server-side JS. If you need to serve other database traffic at the same time, this should also be considered, because your lock percentage could differ between the two scenarios (you can monitor this with mongostat).
Also, keep in mind that db.eval will not work with sharding, so avoid it if you are using a sharded environment or plan to in the future.
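For comparison, a client-side (mongo shell) sketch of the loop itself, avoiding db.eval entirely; the records collection, the history array and its value field are hypothetical stand-ins for the documents described above:

    db.records.find().forEach(function (doc) {
        var values = (doc.history || []).map(function (h) { return h.value; });
        var mean = values.length
            ? values.reduce(function (a, b) { return a + b; }, 0) / values.length
            : 0;

        // Write back only the computed field instead of saving the whole doc.
        db.records.update({ _id: doc._id }, { $set: { meanValue: mean } });
    });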

Benefits of V8 JavaScript engine in MongoDB for map reduce

It sounds like the V8 JavaScript engine might be replacing SpiderMonkey in MongoDB v2.2+.
What benefits, if any, will this bring to MongoDB map-reduce performance?
For example:
Will overall JavaScript evaluation performance improve (I'm assuming this one's a given?)
Will concurrent map and reduce operations be better able to run in parallel on a single instance?
Will map-reduces still block each other?
Yes, it will help with parallelism and help performance. The SpiderMonkey engine restricts MongoDB's JavaScript execution to a single thread; however, the operations are usually short and allow other threads to interleave, so the exact impact is hard to quantify. Of course, testing is always the way to really figure out the benefits.
As you can see here: https://jira.mongodb.org/browse/SERVER-4258
And here: https://jira.mongodb.org/browse/SERVER-4191
Some of the improvements are already available for testing in the development release. To test with V8, just build using V8 as outlined here:
http://www.mongodb.org/display/DOCS/Building+with+V8
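One rough way to test is to time the same map-reduce against a SpiderMonkey build and a V8 build; this is only a sketch, and the events collection and its fields are hypothetical:

    var start = new Date();
    var res = db.events.mapReduce(
        function () { emit(this.type, 1); },
        function (key, values) { return Array.sum(values); },
        { out: { inline: 1 } }
    );
    // timeMillis is reported by the mapReduce command itself on these versions.
    print("server-reported ms: " + res.timeMillis);
    print("wall-clock ms:      " + (new Date() - start));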