MongoDB is slow on an 8 million row collection

I have a collection with 8 million rows, and I am new to MongoDB.
It loads 20 rows really slowly...
What should I do to speed it up?

You probably need to add an index.
Optimization Guidelines

More information is needed, such as the server's RAM.
Optimization steps:
1. Index your query field (see the shell sketch below).
2. Check the index's storage size. If the index is larger than RAM, MongoDB has to read data via disk I/O, which is much slower.
8 million documents is not large. It is not a big deal.
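
A minimal mongo shell sketch of both steps; the collection and field names are assumptions for illustration:

// 1. Index the field the query filters on ("status" is a placeholder name)
db.mycollection.createIndex({ status: 1 })
// 2. Compare each index's size in bytes against the server's available RAM
db.mycollection.stats().indexSizes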

Related

How to insert a large amount of data (about 1 million records) per minute into MongoDB?

I want to insert about 1 million records per minute into a single-server MongoDB database. I have indexes on 6 fields. When the database was empty, I could insert data rapidly, in less than a minute, into my collection (using bulk inserts and multi-processing). However, as the size of the data in the collection increased, the insertion speed decreased greatly. Any ideas on how I can handle such data insertion?
(my data is about price changes)
Thanks
Indexes are beneficial for find operations, where they allow fast retrieval of documents from the database, but indexes should be created only on those fields which are used as filters to retrieve selected information. Defining too many indexes creates overhead for insert and update operations, because with every insert and update the modified records need to be added to each index's data structure too.
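As a rough shell sketch, you can list the existing indexes and drop any whose fields no query actually filters on (the collection name and index key below are made-up examples):

// List all indexes currently defined on the collection
db.prices.getIndexes()
// Drop an index that no query uses as a filter ("unusedField" is hypothetical)
db.prices.dropIndex({ unusedField: 1 })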
Figure out what your bottleneck is and address it.
Is the server CPU-bound or disk-bound? Increase CPU speed or add IOPS to the disk accordingly.
What proportion of the time is spent on index writes? Remove all indexes and measure the insertion rate at the current data size, then add one index at a time, measuring the insertion rate after each addition (a sketch of this follows below).
Is the insertion rate decreasing linearly with data set growth? Faster or slower?
MongoDB exposes many server statistics; look through them, identify the ones relevant to throughput, and see if you spot any patterns.
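
A rough shell sketch of that measurement loop, with a hypothetical prices collection and document shape:

// Drop all secondary indexes (the _id index always remains), then time a bulk insert
db.prices.dropIndexes()
var docs = []
for (var i = 0; i < 100000; i++) {
    docs.push({ symbol: "S" + (i % 500), price: Math.random() * 100, ts: new Date() })
}
var start = new Date()
db.prices.insertMany(docs, { ordered: false })
print("docs/sec: " + Math.round(100000 / ((new Date() - start) / 1000)))
// Re-create one index, rerun the timed insert above, and compare the rates
db.prices.createIndex({ symbol: 1 })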

MongoDB Remove operation does not remove indexes

We have a collection with millions of records and the necessary indexes. We have started archiving data, and at the same time we remove the archived data from the production collection.
Now, the index size is not shrinking as the data is removed.
Is there any way to shrink the indexes along with the data? Thanks.
For example:
Before backup: number of records 58,002,174; index size 10.3 GB
After backup: number of records 169,376; index size 10.3 GB
The number of records is far smaller, but the index size didn't shrink. I need to reduce the index size.
You can rebuild the index to reduce its size after bulk deletions:
db.collection.reIndex()
See the warnings in the linked documentation regarding locking and sharding.
Or just drop the index and recreate it. That would allow you to recreate it in the background, if desired.
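A shell sketch of the drop-and-recreate approach; the index key here is an assumption for illustration:

// Drop the oversized index, then rebuild it in the background so the collection stays available
db.mycollection.dropIndex({ customerId: 1 })
db.mycollection.createIndex({ customerId: 1 }, { background: true })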
Try the compact command: https://docs.mongodb.com/manual/reference/command/compact/#dbcmd.compact
It is advertised to "rewrite and defragment all data and indexes in a collection".
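For reference, compact is invoked through runCommand against one collection at a time:

// Rewrites and defragments the collection's data and indexes in place
db.runCommand({ compact: "mycollection" })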

MongoDB cannot find() in 1 million documents

I have just started working with MongoDB.
I created 10 thousand JSON documents, and I run this search:
db.mycollection.find({"somenode1.somenode2.somenode3.somenode4.Value": "9999"}).count()
It gives the correct result. Operating time: 34 ms. Everything is OK.
Now I create a database with 1 million of the same documents. The total size of the database is 34 GB, which MongoDB splits into files of 2 GB each. I repeat the query above to count the matching documents. I waited about 2 hours for the result while 16 GB of memory were occupied; finally I shut Mongo down.
System: Windows 7 x64, 16 GB RAM.
Please tell me what I'm doing wrong. A production db will be much bigger.
In your particular case, it appears you simply do not have enough RAM. At a minimum, an index on "somenode4" would improve the query performance. Keep in mind that the indexes are going to want to be in RAM as well, so you may need more RAM anyhow. Are you on a virtual machine? If so, I recommend you increase the size of the machine to account for the size of the working set.
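A one-line shell sketch of that suggestion, using the full nested path from the query above:

// Index the nested field used in the filter so the count can run off an index scan
db.mycollection.createIndex({ "somenode1.somenode2.somenode3.somenode4.Value": 1 })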
As one of the other commenters stated, that nesting is a bit ugly, but I understand it is what you were dealt. So, other than RAM, indexing appears to be your best bet.
As part of your indexing effort, you may also want to experiment with pre-heating the indexes to ensure they are in RAM prior to that find and count(). Try executing a query that seeks something that does not exist; this should force the indexes and data into RAM before the real query runs. Depending on how often your data changes, you may want to do this once a day or more. You are essentially front-loading the slow operations.
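A sketch of that pre-heating trick against the same collection:

// Seek a value that cannot exist; this pages the index into RAM without returning documents
db.mycollection.find({ "somenode1.somenode2.somenode3.somenode4.Value": "no-such-value" }).count()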

Lucene search performance is very slow

I am using Lucene.NET. I need to index and store a very large set of documents, about 500 GB, which creates an index folder of about 500 GB. When I search this index it takes too much time, around 10 minutes. I use analyzed and not-analyzed index types on different fields, and I have 90 such fields.
Please help me. Pravin Thokal

Is MongoDB search without index really slow?

I am testing the performance of MongoDB to compare it with my current MySQL-based solution.
In a collection/table X with three attributes A, B, and C, I have attribute A indexed in both MongoDB and MySQL.
Now I load 1M records into both MongoDB and MySQL and test the search performance in this straightforward scenario.
The insert speed into MongoDB is only 10% faster than the insert into MySQL. That is OK; I knew adopting MongoDB wouldn't magically speed up my CRUD operations. But I am really surprised by the search in MongoDB without an index.
The results show that a MongoDB select on a non-indexed field is ten times slower than a select on an indexed field.
On the other hand, a MySQL (MyISAM) select on a non-indexed field is only about 70% slower than a select on an indexed field.
Last but not least, in the select-with-index scenario, MongoDB is about 30% quicker than my MySQL solution.
I want to know: are the above figures normal, especially the performance of a MongoDB select without an index?
My code looks like this (legacy MongoDB Java driver):
import com.mongodb.BasicDBObject;
import com.mongodb.DBCursor;
import com.mongodb.DBObject;

// Find all documents whose field A equals the given value
BasicDBObject query = new BasicDBObject("A", value_of_field_A);
DBCursor cursor = currentCollection.find(query);
while (cursor.hasNext()) {
    DBObject obj = cursor.next();
    // do nothing after that, only for testing purposes
}
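To check whether such a query actually uses an index, explain() in the mongo shell shows the chosen plan; a sketch, assuming the collection is named X as in the question:

// "COLLSCAN" in the winning plan means a full collection scan; "IXSCAN" means the index was used
db.X.find({ A: "some value" }).explain("executionStats")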
BTW, from a business-logic perspective, my collection could be really large (TBs and more). What would you suggest for the size of each physical collection: 10 million documents or 1 billion documents?
Thanks a lot!
------------------------------ Edit ------------------------------
I tried the insert with 10 million records on both MongoDB and MySQL, and MongoDB is about 20% faster than MySQL -- not really as much as I thought.
I am curious: if I set up MongoDB auto-sharding, will the insert speed improve? If so, do I need to put the shards on different physical machines, or can I put them on the same machine with multiple cores?
------------------------------ Update ------------------------------
First, I changed the MongoDB write concern from ACKNOWLEDGED to UNACKNOWLEDGED, and the MongoDB insert speed became 3X faster.
Later on, I made the insert program parallel (8 threads on an 8-core computer). In MongoDB's ACKNOWLEDGED mode the insert speed also improved 3X; in UNACKNOWLEDGED mode it was actually 50% slower.
For MySQL, parallel inserts increased the speed 5X, which is faster than the best insert case from MongoDB!
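For reference, a mongo shell sketch of the two write concerns being compared (collection name X assumed):

db.X.insert({ A: 1, B: 2, C: 3 }, { writeConcern: { w: 1 } })   // ACKNOWLEDGED: wait for the server's reply
db.X.insert({ A: 1, B: 2, C: 3 }, { writeConcern: { w: 0 } })   // UNACKNOWLEDGED: fire-and-forget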
MongoDB queries without an index do a full collection scan, and keep in mind that MongoDB's data size for the same records is much larger than MySQL's, since every document stores its own field names. I am guessing this is one of the causes of the slowness when doing a full scan.
Regarding queries with indexes, MongoDB may turn out faster because of caching, the absence of a complex query-optimizer plan (like MySQL's), etc.
The size of the collection is not an issue. In fact, 10 million documents can easily be handled in one collection. If you have a requirement to archive data, you can break it into smaller collections, which will make that process easier.