What are the benefits of Mongoose vs native MongoDB? - mongodb

Previous answers to this question:
Difference between MongoDB and Mongoose
Why do we need, what advantages to use mongoose
The main reason given in these answers is "schemas". Since 3.6, mongodb has introduced its own schemas:
https://docs.mongodb.com/manual/core/schema-validation/
These are more thorough and work by default on inserts and updates.
Are there any more significant reasons to use Mongoose, as that was the main one and now it seems to have been integrated into the native API. I have also noticed that mongoose is lacking various new features implemented in mongodb.

Mongoose, the driver I'm using right now, is much more intuitive to learn if you're a beginner. Many criticize the mongoose because they claim that the creation of collection schemes is the opposite of what was thought of the mongodb and NoSQL databases; However I think that even using the native mongoDB driver you will always have to create a minimum of schematics, even for validation and have an idea of ​​what you are entering in the database. Mangusta is extremely convenient because in addition to allowing the creation of templates, it is possible to declare methods in the document and control events. In addition, the mongoose automatically performs additional validation, and has many more search functions. The real negative point of the mongoose is performance. (On this page there is the difference in performance of the two drivers https://medium.com/#bugwheels94/performance-difference-in-mongoose-vs-mongodb-60be831c69ad)
Naturally all these characteristics of the mongoose abandon the performance.
It's hard to give you a driver because it's based on the kind of project you have in mind.

Related

Mongoengine and Pymongo?

Can I use mongoengine or djongo for ODM and pymongo for interaction with the db?
I've read these two about something related to my question:
Insert data by pymongo using mongoengine ORM in pyramid
Use MongoEngine and PyMongo together
But, I couldn't find what I'm looking for (I guess).
So here's what I'm trying to find:
¿Does this practice affect the performance of my application?
¿How well recommended is it?
So, if it is recommended, and everything is right, ¿Do I need to put an extra layer of security or something?, because, I want to build an API using the serializations for models that django-rest-framework-mongoengine offers, and then do what I have to do in the view of the API endpoint.
It could be djongo or something like it, what I want is just an ODM for serializing, define a structure for the API and so on, use pymongo for queries, cause according to what I've been reading, mongoengine could make slower the interaction with the db
The term "ORM" does not apply to MongoDB since MongoDB is non-relational. The proper term is "ODM" - object-document mapper.
Generally, a MongoDB ODM is built on top of a MongoDB driver. The functionalities of the ODM and the driver are complementary - the driver provides low-level database access and the ODM provides high-level features like schema, associations, callbacks.
If you want to use the high-level features, it makes sense to use an ODM. If you don't need any of those features and just want to perform basic CRUD operations, using a driver directly is more efficient. Some applications use both of these strategies depending on the operation that needs to be performed.

Some questions about mongodb that have me puzzled

I just installed mongodb from Synaptic (Using ubuntu 11.10).
I'm following this tutorial: http://howtonode.org/express-mongodb
to build a simple blog using node.js, express.js and mongodb.
I reached the part where you can actually store post in the mongodb.
It worked for me but I'm not sure how. I didn't even start the mongodb (just installed it).
The second thing that has me puzzled is: where is that data going?
Does anyone have any clue?
EDIT:
I think I found where is the data going:
var/lib/
Where are some files that are apparently written in binary code.
MongoDB stores all your data as BSON within the directory you found. The internal Mongo processes handle the sharding of the data when necessary following your specific setup, accomodating you if you have setup distribution across several servers.
One of the most confusing things about coming over to Mongo from RDBMS is the nature of collections vs tables.
http://www.mongodb.org/display/DOCS/MongoDB%2C+CouchDB%2C+MySQL+Compare+Grid
Of Note: A Mongo collection does not need to have any schema designed, it is assumed that the client application will validate and enforce any particular scheme. The way the data comes in and is serialized is the way it will be saved in its BSON representation. It is entirely possible to have missing keys and and totally different docs in the same collection, which is why robust data checking in your app becomes so important.
Mongo collections also do not need to be saved. There isn't anything that approximates the CREATE TABLE syntax from MySQL, since there's no schema there isn't much such a command would have to describe. Your collection will be created with the first document inserted, and every document inserted will be given a unique hashed '_id' key (assuming you don't define this key in your objects using your own methodology.'
Here's a helpful resource with the basic commands.
http://cheat.errtheblog.com/s/mongo
Mongo, and NoSQL in general, is a developer lead movement trying to bridge the gap between the object-oriented application layer and the persistency layer in an app, which would traditionally have been represented by an RDBMS. Developers just would much rather work within one paradigm, and OOP has become predominant especially in the web environment.
If you're coming over from LAMP and used PhpMyAdmin in the past, I really must suggest RockMongo. This tool makes it so much easier to observe and understand the actual structure of the BSON documents stored in your server.
http://code.google.com/p/rock-php/wiki/rock_mongo
Best of luck!

NoSQL engines that support dynamic queries?

What "NoSQL" database engines support dynamic / advanced queries in a similar fashion to MongoDB (http://www.mongodb.org/display/DOCS/Advanced+Queries) ?
Specifically interested in options that support ad-hoc querying from a shell or within client languages.
None just use MongoDB ;)
Honestly, it really depends on what type of querying you plan to do. For Key/Value style queries where you plan to just pull up one document at a time, then basically all of the NoSQL DBs are good for this.
When it comes to pulling back "sets" of data or using alternate keys, then MongoDB is probably your best "crossover" here. Many NoSQL DBs have limited querying functions, especially on non-key fields. Of course, that's kind of the point of "Key-Value stores", so Mongo is kind of a mutant here.
The last I checked with Cassandra, there was definitely some "hoop-jumping" involved to really support ad-hoc non-key queries. And CouchDB seems to point to "just Map / Reduce".
That stated, I believe that there is motion from several NoSQL dbs to support such ad-hoc querying mechanism. So this answer could be completely wrong in 2 months :)

NoSQL (MongoDB) vs Lucene (or Solr) as your database [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 2 years ago.
Improve this question
With the NoSQL movement growing based on document-based databases, I've looked at MongoDB lately. I have noticed a striking similarity with how to treat items as "Documents", just like Lucene does (and users of Solr).
So, the question: Why would you want to use NoSQL (MongoDB, Cassandra, CouchDB, etc) over Lucene (or Solr) as your "database"?
What I am (and I am sure others are) looking for in an answer is some deep-dive comparisons of them. Let's skip over relational database discussions all together, as they serve a different purpose.
Lucene gives some serious advantages, such as powerful searching and weight systems. Not to mention facets in Solr (which Solr is being integrated into Lucene soon, yay!). You can use Lucene documents to store IDs, and access the documents as such just like MongoDB. Mix it with Solr, and you now get a WebService-based, load balanced solution.
You can even throw in a comparison of out-of-proc cache providers such as Velocity or MemCached when talking about similar data storing and scalability of MongoDB.
The restrictions around MongoDB reminds me of using MemCached, but I can use Microsoft's Velocity and have more grouping and list collection power over MongoDB (I think). Can't get any faster or scalable than caching data in memory. Even Lucene has a memory provider.
MongoDB (and others) do have some advantages, such as the ease of use of their API. New up a document, create an id, and store it. Done. Nice and easy.
This is a great question, something I have pondered over quite a bit. I will summarize my lessons learned:
You can easily use Lucene/Solr in lieu of MongoDB for pretty much all situations, but not vice versa. Grant Ingersoll's post sums it up here.
MongoDB etc. seem to serve a purpose where there is no requirement of searching and/or faceting. It appears to be a simpler and arguably easier transition for programmers detoxing from the RDBMS world. Unless one's used to it Lucene & Solr have a steeper learning curve.
There aren't many examples of using Lucene/Solr as a datastore, but Guardian has made some headway and summarize this in an excellent slide-deck, but they too are non-committal on totally jumping on Solr bandwagon and "investigating" combining Solr with CouchDB.
Finally, I will offer our experience, unfortunately cannot reveal much about the business-case. We work on the scale of several TB of data, a near real-time application. After investigating various combinations, decided to stick with Solr. No regrets thus far (6-months & counting) and see no reason to switch to some other.
Summary: if you do not have a search requirement, Mongo offers a simple & powerful approach. However if search is key to your offering, you are likely better off sticking to one tech (Solr/Lucene) and optimizing the heck out of it - fewer moving parts.
My 2 cents, hope that helped.
You can't partially update a document in solr. You have to re-post all of the fields in order to update a document.
And performance matters. If you do not commit, your change to solr does not take effect, if you commit every time, performance suffers.
There is no transaction in solr.
As solr has these disadvantages, some times NoSQL is a better choice.
UPDATE: Solr 4+ Started supporting commit and soft-commits. Refer to the latest document https://lucene.apache.org/solr/guide/8_5/
We use MongoDB and Solr together and they perform well. You can find my blog post here where i described how we use this technologies together. Here's an excerpt:
[...] However we observe that query performance of Solr decreases when index
size increases. We realized that the best solution is to use both Solr
and Mongo DB together. Then, we integrate Solr with MongoDB by storing
contents into the MongoDB and creating index using Solr for full-text
search. We only store the unique id for each document in Solr index
and retrieve actual content from MongoDB after searching on Solr.
Getting documents from MongoDB is faster than Solr because there is no
analyzers, scoring etc. [...]
Also please note that some people have integrated Solr/Lucene into Mongo by having all indexes be stored in Solr and also monitoring oplog operations and cascading relevant updates into Solr.
With this hybrid approach you can really have the best of both worlds with capabilities such as full text search and fast reads with a reliable datastore that can also have blazing write speed.
It's a bit technical to setup but there are lots of oplog tailers that can integrate into solr. Check out what rangespan did in this article.
http://denormalised.com/home/mongodb-pub-sub-using-the-replication-oplog.html
From my experience with both, Mongo is great for simple, straight-forward usage. The main Mongo disadvantage we've suffered is the poor performance on unanticipated queries (you cannot created mongo indexes for all the possible filter/sort combinations, you simple can't).
And here where Lucene/Solr prevails big time, especially with the FilterQuery caching, Performance is outstanding.
Since no one else mentioned it, let me add that MongoDB is schema-less, whereas Solr enforces a schema. So, if the fields of your documents are likely to change, that's one reason to choose MongoDB over Solr.
#mauricio-scheffer mentioned Solr 4 - for those interested in that, LucidWorks is describing Solr 4 as "the NoSQL Search Server" and there's a video at http://www.lucidworks.com/webinar-solr-4-the-nosql-search-server/ where they go into detail on the NoSQL(ish) features. (The -ish is for their version of schemaless actually being a dynamic schema.)
If you just want to store data using key-value format, Lucene is not recommended because its inverted index will waste too much disk spaces. And with the data saving in disk, its performance is much slower than NoSQL databases such as redis because redis save data in RAM. The most advantage for Lucene is it supports much of queries, so fuzzy queries can be supported.
MongoDB Atlas will have a lucene-based search engine soon. The big announcement was made at this week's MongoDB World 2019 conference. This is a great way to encourage more usage of their high revenue MongoDB Atlas product.
I was hoping to see it rolled into the MongoDB Enterprise version 4.2 but there's been no news of bringing it to their on-prem product line.
More info here: https://www.mongodb.com/atlas/full-text-search
The third party solutions, like a mongo op-log tail are attractive. Some thoughts or questions remain about whether the solutions could be tightly integrated, assuming a development/architecture perspective. I don't expect to see a tightly integrated solution for these features for a few reasons (somewhat speculative and subject to clarification and not up to date with development efforts):
mongo is c++, lucene/solr are java
maybe lucene could use some mongo libs
maybe mongo could rewrite some lucene algorithms, see also:
http://clucene.sourceforge.net/
http://lucy.apache.org/
lucene supports various doc formats
mongo is focused on JSON (BSON)
lucene uses immutable documents
single field updates are an issue, if they are available
lucene indexes are immutable with complex merge ops
mongo queries are javascript
mongo has no text analyzers / tokenizers (AFAIK)
mongo doc sizes are limited, that might go against the grain for lucene
mongo aggregation ops may have no place in lucene
lucene has options to store fields across docs, but that's not the same thing
solr somehow provides aggregation/stats and SQL/graph queries

What are the advantages of using a schema-free database like MongoDB compared to a relational database?

I'm used to using relational databases like MySQL or PostgreSQL, and combined with MVC frameworks such as Symfony, RoR or Django, and I think it works great.
But lately I've heard a lot about MongoDB which is a non-relational database, or, to quote the official definition,
a scalable, high-performance, open
source, schema-free, document-oriented
database.
I'm really interested in being on edge and want to be aware of all the options I'll have for a next project and choose the best technologies out there.
In which cases using MongoDB (or similar databases) is better than using a "classic" relational databases?
And what are the advantages of MongoDB vs MySQL in general?
Or at least, why is it so different?
If you have pointers to documentation and/or examples, it would be of great help too.
Here are some of the advantages of MongoDB for building web applications:
A document-based data model. The basic unit of storage is analogous to JSON, Python dictionaries, Ruby hashes, etc. This is a rich data structure capable of holding arrays and other documents. This means you can often represent in a single entity a construct that would require several tables to properly represent in a relational db. This is especially useful if your data is immutable.
Deep query-ability. MongoDB supports dynamic queries on documents using a document-based query language that's nearly as powerful as SQL.
No schema migrations. Since MongoDB is schema-free, your code defines your schema.
A clear path to horizontal scalability.
You'll need to read more about it and play with it to get a better idea. Here's an online demo:
http://try.mongodb.org/
There are numerous advantages.
For instance your database schema will be more scalable, you won't have to worry about migrations, the code will be more pleasant to write... For instance here's one of my model's code :
class Setting
include MongoMapper::Document
key :news_search, String, :required => true
key :is_availaible_for_iphone, :required => true, :default => false
belongs_to :movie
end
Adding a key is just adding a line of code !
There are also other advantages that will appear in the long run, like a better scallability and speed.
... But keep in mind that a non-relational database is not better than a relational one. If your database has a lot of relations and normalization, it might make little sense to use something like MongoDB. It's all about finding the right tool for the job.
For more things to read I'd recommend taking a look at "Why I think Mongo is to Databases what Rails was to Frameworks" or this post on the mongodb website. To get excited and if you speak french, take a look at this article explaining how to set up MongoDB from scratch.
Edit: I almost forgot to tell you about this railscast by Ryan. It's very interesting and makes you want to start right away!
The advantage of schema-free is that you can dump whatever your load is in it, and no one will ever have any ground for complaining about it, or for saying that it was wrong.
It also means that whatever you dump in it, remains totally void of meaning after you have done so.
Some would label that a gross disadvantage, some others won't.
The fact that a relational database has a well-established schema, is a consequence of the fact that it has a well-established set of extensional predicates, which are what allows us to attach meaning to what is recorded in the database, and which are also a necessary prerequisite for us to do so.
Without a well-established schema, no extensional predicates, and without extensional precicates, no way for the user to make any meaning out of what was stuffed in it.
My experience with Postgres and Mongo after working with both the databases in my projects .
Postgres(RDBMS)
Postgres is recommended if your future applications have a complicated schema that needs lots of joins or all the data have relations or if we have heavy writing. Postgres is open source, faster, ACID compliant and uses less memory on disk, and is all around good performant for JSON storage also and includes full serializability of transactions with 3 levels of transaction isolation.
The biggest advantage of staying with Postgres is that we have best of both worlds. We can store data into JSONB with constraints, consistency and speed. On the other hand, we can use all SQL features for other types of data. The underlying engine is very stable and copes well with a good range of data volumes. It also runs on your choice of hardware and operating system. Postgres providing NoSQL capabilities along with full transaction support, storing JSON documents with constraints on the fields data.
General Constraints for Postgres
Scaling Postgres Horizontally is significantly harder, but doable.
Fast read operations cannot be fully achieved with Postgres.
NO SQL Data Bases
Mongo DB (Wired Tiger)
MongoDB may beat Postgres in dimension of “horizontal scale”. Storing JSON is what Mongo is optimized to do. Mongo stores its data in a binary format called BSONb which is (roughly) just a binary representation of a superset of JSON. MongoDB stores objects exactly as they were designed. According to MongoDB, for write-intensive applications, Mongo says the new engine(Wired Tiger) gives users an up to 10x increase in write performance(I should try this), with 80 percent reduction in storage utilization, helping to lower costs of storage, achieve greater utilization of hardware.
General Constraints of MongoDb
The usage of a schema less storage engine leads to the problem of implicit schemas. These schemas aren’t defined by our storage engine but instead are defined based on application behavior and expectations.
Stand-alone NoSQL technologies do not meet ACID standards because they sacrifice critical data protections in favor of high throughput performance for unstructured applications. It’s not hard to apply ACID on NoSQL databases but it would make database slow and inflexible up to some extent. “Most of the NoSQL limitations were optimized in the newer versions and releases which have overcome its previous limitations up to a great extent”.
It's all about trade offs. MongoDB is fast but not ACID, it has no transactions. It is better than MySQL in some use cases and worse in others.
Bellow Lines Written in MongoDB: The Definitive Guide.
There are several good reasons:
Keeping different kinds of documents in the same collection can be a
nightmare for developers and admins. Developers need to make sure
that each query is only returning documents of a certain kind or
that the application code performing a query can handle documents of
different shapes. If we’re querying for blog posts, it’s a hassle to
weed out documents containing author data.
It is much faster to get a list of collections than to extract a
list of the types in a collection. For example, if we had a type key
in the collection that said whether each document was a “skim,”
“whole,” or “chunky monkey” document, it would be much slower to
find those three values in a single collection than to have three
separate collections and query for their names
Grouping documents of the same kind together in the same collection
allows for data locality. Getting several blog posts from a
collection containing only posts will likely require fewer disk
seeks than getting the same posts from a collection con- taining
posts and author data.
We begin to impose some structure on our documents when we create
indexes. (This is especially true in the case of unique indexes.)
These indexes are defined per collection. By putting only documents
of a single type into the same collection, we can index our
collections more efficiently
After a question of databases with textual storage), I glanced at MongoDB and similar systems.
If I understood correctly, they are supposed to be easier to use and setup, and much faster. Perhaps also more secure as the lack of SQL prevents SQL injection...
Apparently, MongoDB is used mostly for Web applications.
Basically, and they state that themselves, these databases aren't suited for complex queries, data-mining, etc. But they shine at retrieving quickly lot of flat data.
MongoDB supports search by fields, regular expression searches.Includes user defined java script functions.
MongoDB can be used as a file system, taking advantage of load balancing and data replication features over multiple machines for storing files.