Which driver should I use with MongoDB?

I'm wondering which of the following drivers is the best:
mongodb-csharp driver
simple-mongodb driver
NoRM
Which one is considered the best?

I think there are even more flavours: the one you call mongodb-csharp is actually two:
https://github.com/samus/mongodb-csharp
https://github.com/mongodb/mongo-csharp-driver
The first is a bit more mature and is used widely in the field. The second is a recent development, but it comes from 10gen, the creators of MongoDB. The implementation is looking good and is rather similar to the samus driver. If you need something in production right now, I'm not sure what to advise, but in the long run I'd go for the 10gen driver.
The one thing it currently doesn't offer is LINQ integration. This could be important to you.
I have no experience with the NoRM and simple-mongodb drivers.

I would use the official C# driver released by MongoDB.
http://www.mongodb.org/display/DOCS/CSharp+Language+Center
I've been using it and I like it so far.

Efficient paging in MongoDB using mgo.v2 and MongoDB > 4.2

I have already looked at Efficient paging in MongoDB using mgo and asked a follow-up question.
I got the excellent response provided by @icza, who shares his library https://github.com/icza/minquery.
However, as he said, "Starting with MongoDB 4.2, an index hint must be provided. Use the minquery.NewWithHint() constructor."
The problem is that the minquery.NewWithHint() constructor only seems to be available in version 2.0.0, which swapped gopkg.in/mgo.v2 support for github.com/globalsign/mgo support.
How can I solve this problem?
gopkg.in/mgo.v2 has long gone unmaintained. The easiest solution for you would be to switch to the github.com/globalsign/mgo driver. It has an identical API, so most likely you only have to change the import paths. It is still somewhat supported, but I believe it will fade away in favour of the official mongo-go driver. If you choose to switch to mongo-go, it has built-in support for specifying the index min parameter for queries. But know that the mongo-go driver has a different API.
Another option would be to fork minquery and apply the commits I made for v2.0.0, including the support for index hints.
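If you switch to github.com/globalsign/mgo, the change is mostly mechanical. Below is a minimal sketch of cursor-based paging with minquery v2; the connection string, the collection and field names, and especially the exact argument shape of NewWithHint are my assumptions, so check the minquery docs before relying on it:

    package main

    import (
        "log"

        "github.com/globalsign/mgo"      // was: gopkg.in/mgo.v2
        "github.com/globalsign/mgo/bson" // was: gopkg.in/mgo.v2/bson
        "github.com/icza/minquery"
    )

    type User struct {
        ID      bson.ObjectId `bson:"_id"`
        Name    string        `bson:"name"`
        Country string        `bson:"country"`
    }

    func main() {
        session, err := mgo.Dial("localhost")
        if err != nil {
            log.Fatal(err)
        }
        defer session.Close()

        // On MongoDB 4.2+ use NewWithHint so the wrapped query carries an
        // index hint; the hint value below (an index name) is an assumption.
        q := minquery.NewWithHint(session.DB("test"), "users",
            bson.M{"country": "US"}, "country_1_name_1__id_1").
            Sort("name", "_id").
            Limit(10)

        var users []*User
        cursor, err := q.All(&users, "country", "name", "_id")
        if err != nil {
            log.Fatal(err)
        }
        _ = cursor // feed this to q.Cursor(cursor) to fetch the next page
    }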

NodeJS binary modules vs those written in JS

Yeah, I'm just wondering whether it is a good idea to use a MySQL or MongoDB driver written in pure JS. I mean, it shouldn't be a problem when running a small app with a small DB for 100 users a month, but what about heavy load / really huge DBs?
Aren't there any professional MySQL and Mongo drivers for NodeJS that I can compile? The performance of those should be way better.
Or am I wrong about this? For example, Mongoose uses a driver written in pure JS. Is that good enough to efficiently query 500 million documents?
Any suggestion / advice would be appreciated!
Thanks
EDIT:
So thanks for the responses, guys. Well, I'm still a bit unsure about this :).
I mean, writing drivers in Python or Java or even C# surely makes sense, but those languages are much more powerful and faster than JS.
Here is what makes me worried:
My MySQL driver (written in pure JS) executes the query SHOW COLUMNS FROM Table in 300-400ms. If I execute the exact same query from MySQL shell, it takes 20ms.
I use an ORM (JugglingDB) which makes use of the https://github.com/felixge/node-mysql module. The 300 ms is the raw query execution time, as printed in debug mode.
Why do we see such a big difference? Is it the ORM, Node/JS, or is the driver too slow?
Most MongoDB drivers are written in the language they are used with: the Python driver is written in Python, the Perl driver in Perl. There are a few exceptions; the PHP driver is written in C, and the Python driver has an optional C extension to speed things up.
The node-mongodb-native driver is written entirely in JavaScript: https://github.com/mongodb/node-mongodb-native. This makes sense, as the NodeJS platform is optimised for this, and there should be no adverse effects.
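Regarding the timing question in the edit: one quick way to narrow down where the 300 ms goes is to time the same statement through the raw node-mysql driver, bypassing JugglingDB entirely. A rough sketch (the credentials and table name are made up):

    var mysql = require('mysql'); // the felixge/node-mysql module

    var conn = mysql.createConnection({
      host: 'localhost',
      user: 'root',
      password: 'secret',
      database: 'test'
    });

    conn.connect(function (err) {
      if (err) throw err;
      console.time('raw SHOW COLUMNS');
      conn.query('SHOW COLUMNS FROM MyTable', function (err, rows) {
        if (err) throw err;
        console.timeEnd('raw SHOW COLUMNS'); // compare with the ORM's 300-400 ms
        conn.end();
      });
    });

If the raw time is close to the shell's 20 ms, the overhead is in the ORM layer rather than in Node or the driver.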

About using MongoDB and LINQ. What's better or worse with NoRM?

I'd like to use MongoDB with LINQ, simply because I don't like not being able to check the query at compile time.
So I searched a bit and found NoRM. However, I am having a hard time deciding if it's "safe" to move away from the official driver.
So I was wondering if someone can tell me the key differences between the official driver and NoRM?
Also, what can NoRM do that the official driver can't?
Is it possible to implement Linq on top of the official driver ?
Thanks in advance
I suggest using the official MongoDB driver, because it will contain all the latest features and any issues will be fixed ASAP. As far as I know, the last commit in the NoRM repository was almost half a year ago. If you want LINQ support, you can use FluentMongo on top of the official driver, but I believe LINQ support should arrive in the official driver soon.
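For illustration, LINQ over the official driver via FluentMongo looks roughly like this. The AsQueryable() extension and the 1.x driver calls are written from memory, so treat the exact names as assumptions and check the FluentMongo docs:

    using System;
    using System.Linq;
    using MongoDB.Bson;
    using MongoDB.Driver;
    using FluentMongo.Linq; // provides AsQueryable() on MongoCollection

    public class Person
    {
        public ObjectId Id { get; set; }
        public string Name { get; set; }
        public int Age { get; set; }
    }

    public static class Program
    {
        public static void Main()
        {
            var server = MongoServer.Create("mongodb://localhost");
            var people = server.GetDatabase("test").GetCollection<Person>("people");

            // Checked by the compiler, translated to a MongoDB query at run time.
            var names = from p in people.AsQueryable()
                        where p.Age >= 21
                        select p.Name;

            foreach (var name in names)
                Console.WriteLine(name);
        }
    }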

MongoDB for personal non-distributed work

This might be answered here (or elsewhere) before but I keep getting mixed/no views on the internet.
I have never used anything except SQL-like databases, and then I came across NoSQL DBs (MongoDB, specifically). I tried my hand at it. I was doing it just for fun, but everywhere the talk is that it is really great when you are using it across distributed servers. So I wonder: is it any help (in a non-trivial way) for doing small projects and things mainly on a personal computer? Are there some real advantages when there is just one server?
Although it would be cool to use MapReduce (and talk about it to peers :D), won't it be overkill when used for small projects run on single servers? Or are there other advantages? I need some clear thought here. Sorry if I sounded naive.
Optional: some examples of where/how you have used it would be great.
Thanks.
IMHO, MongoDB is perfectly valid for use for single server/small projects and it's not a pre-requisite that you should only use it for "big data" or multi server projects.
If MongoDB solves a particular requirement, it doesn't matter on the scale of the project so don't let that aspect sway you. Using MapReduce may be a bit overkill/not the best approach if you truly have low volume data and just want to do some basic aggregations - these could be done using the group operator (which currently has some limitations with regard to how much data it can return).
So I guess what I'm saying in general is, use the right tool for the job. There's nothing wrong with using MongoDB on small projects/single PC. If a RDBMS like SQL Server provides a better fit for your project then use that. If a NoSQL technology like MongoDB fits, then use that.
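For reference, the group command mentioned above looks roughly like this in the mongo shell (the collection and field names are made up):

    // Count and sum orders per status without resorting to MapReduce.
    db.orders.group({
      key: { status: 1 },               // group by the status field
      initial: { count: 0, total: 0 },  // accumulator seed per group
      reduce: function (doc, acc) {     // runs once per document
        acc.count += 1;
        acc.total += doc.amount;
      }
    });

The whole result comes back as a single document, which is the size limitation referred to above.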
+1 on AdaTheDev - but there are 3 more things to note here:
Durability: From version 1.8 onwards, MongoDB has single server durability when started with --journal, so now it's more applicable to single-server scenarios
Choosing a NoSQL DB over, say, an RDBMS shouldn't be decided by the single- or multi-server setting, but by the modelling of the database. See for example 1 and 2 - it's easy to store comment-like structures in MongoDB.
MapReduce: again, it depends on the data modelling and the operation/calculation that needs to occur. Depending on the way you model your data you may or may not need to use MapReduce.

MapReduce implementation in Scala

I'd like to find a good and robust MapReduce framework that can be utilized from Scala.
To add to the answer on Hadoop: there are at least two Scala wrappers that make working with Hadoop more palatable.
Scala Map Reduce (SMR): http://scala-blogs.org/2008/09/scalable-language-and-scalable.html
SHadoop: http://jonhnny-weslley.blogspot.com/2008/05/shadoop.html
Update, 5 Oct 2011: there is also the Scoobi framework, which has awesome expressiveness.
http://hadoop.apache.org/ is language agnostic.
Personally, I've become a big fan of Spark
http://spark-project.org/
You have the ability to do in-memory cluster computing, significantly reducing the overhead you would experience from disk-intensive MapReduce operations.
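As a taste of the API, here is the canonical Spark word count in Scala; the input path and master URL are placeholders:

    import org.apache.spark.{SparkConf, SparkContext}

    object SparkWordCount {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("wordcount").setMaster("local[*]"))
        val counts = sc.textFile("input.txt")
          .flatMap(_.split("\\s+"))
          .map(word => (word, 1))
          .reduceByKey(_ + _) // no per-step HDFS writes between map and reduce
          .cache()            // keep the result in memory for further queries
        counts.take(10).foreach(println)
        sc.stop()
      }
    }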
You may be interested in scouchdb, a Scala interface to CouchDB.
Another idea is to use GridGain. ScalaDudes have an example of using GridGain with Scala. And here is another example.
A while back, I ran into exactly this problem and ended up writing a little infrastructure to make it easy to use Hadoop from Scala. I used it on my own for a while, but I finally got around to putting it on the web. It's named (very originally) ScalaHadoop.
For a Scala API on top of Hadoop, check out Scoobi; it is still in heavy development but shows a lot of promise. There is also some effort to implement distributed collections on top of Hadoop in the Scala incubator, but that effort is not usable yet.
There is also a new Scala wrapper for Cascading from Twitter, called Scalding.
After looking very briefly over the documentation for Scalding, it seems that while it makes the integration with Cascading smoother, it still does not solve what I see as the main problem with Cascading: type safety. Every operation in Cascading operates on Cascading's tuples (basically a list of field values, with or without a separate schema), which means that type errors, e.g. joining a key as a String with a key as a Long, lead to run-time failures.
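To make the point concrete, here is (roughly) the canonical fields-based Scalding word count. The fields 'line and 'word are untyped symbols over Cascading tuples, which is exactly where the run-time type errors described above come from:

    import com.twitter.scalding._

    class WordCountJob(args: Args) extends Job(args) {
      TextLine(args("input"))                                          // yields a 'line field
        .flatMap('line -> 'word) { line: String => line.split("\\s+") }
        .groupBy('word) { _.size }                                     // counts per word
        .write(Tsv(args("output")))
    }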
To further jshen's point:
Hadoop Streaming simply uses Unix standard streams: your code (in any language) just has to read from stdin and write tab-delimited records to stdout. Implement a mapper and, if needed, a reducer (and, if relevant, configure it as the combiner).
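As a sketch of how little is required, here is a word-count mapper/reducer pair in Scala that works with Hadoop Streaming; Hadoop sorts the mapper output by key before piping it into the reducer:

    // Mapper: emit "word<TAB>1" for every token on stdin.
    object StreamingMapper {
      def main(args: Array[String]): Unit =
        for {
          line <- scala.io.Source.stdin.getLines()
          word <- line.split("\\s+") if word.nonEmpty
        } println(s"$word\t1")
    }

    // Reducer: stdin arrives sorted by key, so sum each run of equal words.
    object StreamingReducer {
      def main(args: Array[String]): Unit = {
        var current: Option[String] = None
        var count = 0
        for (line <- scala.io.Source.stdin.getLines()) {
          val Array(word, n) = line.split("\t", 2)
          if (!current.contains(word)) {
            current.foreach(w => println(s"$w\t$count"))
            current = Some(word)
            count = 0
          }
          count += n.trim.toInt
        }
        current.foreach(w => println(s"$w\t$count"))
      }
    }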
I've added a MapReduce implementation using Hadoop, with a few test cases, on GitHub: https://github.com/sauravsahu02/MapReduceUsingScala.
Hope that helps. Note that the application is already tested.