Is it a good idea to use Capped Collections for read queries with only a few indexes defined - mongodb

I wanted to insert around 4 million records into a normal collection, but the bulk insert was very slow, so I created a Capped Collection and loaded my data there. Someone suggested to me that there would be no performance impact, so there was no need to create indexes.
But fetching the first 25 records with some filtering is taking a lot of time. I have a few questions to understand it better.
What is the ideal situation where Capped Collections are suggested?
Can I create a compound index on a Capped Collection?
Is there any performance improvement with Capped Collections over a normal collection?

A capped collection limits how much data it stores. It does not make retrieval of the data it does store any faster.
Generally, if you need fast (or, realistically, reasonably performant) reads, you should be using indexes.
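To address the second question: capped collections do support secondary and compound indexes; they are simply not created for you. A minimal pymongo sketch, assuming hypothetical field names sensor_id and ts:

    from pymongo import MongoClient, ASCENDING, DESCENDING

    client = MongoClient("mongodb://localhost:27017")
    events = client["mydb"]["events"]  # assumed capped collection

    # Compound index; capped collections accept indexes like any other.
    events.create_index([("sensor_id", ASCENDING), ("ts", DESCENDING)])

    # A filtered "first 25" query can now use the index instead of a scan.
    for doc in events.find({"sensor_id": 42}).sort("ts", DESCENDING).limit(25):
        print(doc)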

Related

MongoDB Index - Performance considerations on collections with high write-to-read ratio

I have a MongoDB collection with around 100k inserts every day. Each document consumes around 1 MB of space and has a lot of elements.
The data is mainly stored for analytics purposes and read a couple of times each day. I want to speed up the queries by adding indexes to a few fields which are usually used for filtering, but stumbled across this statement in the mongodb documentation:
Adding an index has some negative performance impact for write operations. For collections with high write-to-read ratio, indexes are expensive since each insert must also update any indexes.
(source: MongoDB documentation)
I was wondering if this should bother me with 100k inserts vs. a couple of big read operations, and whether it would add a lot of overhead to the insert operations?
If yes, should I separate reads from writes into separate collections and duplicate the data, or are there other solutions for this?

What are the production best practices to store a large number of documents when using MongoDB?

I need to store application transaction logs and decided to use MongoDB. Every day, almost 200,000 documents are stored in a single-node MongoDB.
We have some reports and operations (if something happened, then do something) that depend on those logs, so we need to find documents matching different criteria. At that pace, will it become a problem? Will queries be slow to execute?
Any suggestions to make it efficient to use MongoDB?
By the way, all of that data is in a single collection, and the MongoDB server version is 4.2.6.
Mongo collections can grow to many terabytes without much issue. To be able to query that data in a speedy manner, you will have to analyze your queries and create indexes for the fields that are used in those queries.
Indexes are not free, though. They take both disk space and RAM, because for indexes to be useful they need to fit entirely in RAM.
In most cases, if indexes and collections grow beyond what your hardware can handle, you will have to archive/evict old data and trim down the collections.
If your queries need to include that evicted data in order to generate your reports, you will have to keep another collection with summarized values/data for the evicted records, which you combine with the present data when generating the reports.
Alternatively, sharding can help with big data, but there are some limitations on the queries you can run against sharded collections.
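As a concrete starting point for that query analysis, here is a minimal pymongo sketch (collection and field names are assumptions, not taken from the question) that checks whether a query actually uses an index:

    from pymongo import MongoClient, ASCENDING

    db = MongoClient("mongodb://localhost:27017")["logs"]

    # Index the fields the reports filter on; "app" and "ts" are assumed names.
    db["transactions"].create_index([("app", ASCENDING), ("ts", ASCENDING)])

    plan = db["transactions"].find({"app": "checkout"}).explain()
    # An "IXSCAN" stage in the winning plan means the index is used;
    # "COLLSCAN" means a full collection scan and a missing index.
    print(plan["queryPlanner"]["winningPlan"])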

MongoDB - Downside to having different documents on same collection?

What are the downsides of storing completely different documents on the same collection of MongoDB?
Unlike other questions, the documents I'm referring to are not related (like parent-child).
The motivator here is cost-reduction. Azure CosmosDB Mongo API charges and scalability are per-collection.
The size of the collection will get a lot bigger a lot faster.
Speed of queries could be impacted, as you'll have to scan more documents than required (sparse or partial indexes could mitigate this).
Index sizes will be a lot bigger and take longer to scan.
You'll need to store a discriminator with the documents so you can tell what type one document is compared to another.
If the documents are not related at all, I'd store them in completely separate collections.
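If you do mix types anyway, a hedged sketch of the discriminator approach described above (the doc_type field name and the partial index are illustrative assumptions, not an official pattern):

    from pymongo import MongoClient, ASCENDING

    coll = MongoClient("mongodb://localhost:27017")["shop"]["mixed"]

    coll.insert_many([
        {"doc_type": "order", "order_id": 1, "total": 99.5},
        {"doc_type": "customer", "customer_id": 7, "name": "Ada"},
    ])

    # A partial index covering only "order" documents keeps the index small
    # even though the collection holds unrelated document types.
    coll.create_index(
        [("order_id", ASCENDING)],
        partialFilterExpression={"doc_type": "order"},
    )

    orders = list(coll.find({"doc_type": "order", "order_id": 1}))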

How, When and Where Should MongoDB Index Types be Used?

Can anyone help me understand when it is important to use a MongoDB index and where it can be used? Also, I need the advantages and disadvantages of using MongoDB indexes.
Can anyone help me understand when it is important to use a MongoDB index and where it can be used?
Indexes provide efficient access to your data.
Without indexes in place for your queries, a query may scan far more documents than it is expected to return. Good indexes avoid full collection scans and keep the number of documents examined close to the number returned.
A well-designed set of indexes that caters to the incoming queries to your database can significantly improve the performance of your database.
Also, I need the disadvantages of using MongoDB indexes.
Indexes need memory and disk space to store. If the indexes are part of your working set, they will be kept in memory, meaning you may need sufficient RAM to hold the indexes alongside the frequently accessed data.
Every insert, update, and delete operation must also update the index data structures. Having many indexes on a collection means every write that touches an indexed key has to update each of those indexes, which adds a penalty to write operations.
A large number of compound indexes also takes more time to rebuild when restoring large datasets.
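To see what your indexes actually cost in space, a small sketch (the collection name is an assumption) using the collStats command from pymongo:

    from pymongo import MongoClient

    db = MongoClient("mongodb://localhost:27017")["mydb"]

    stats = db.command("collStats", "events")
    # Total size of all indexes, plus a per-index breakdown, in bytes.
    print(stats["totalIndexSize"])
    print(stats["indexSizes"])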

MongoDB capped collection performance

I am currently working on a time series data project to store some sensor data. To achieve maximum insertion/write throughput I used a capped collection (as per the mongodb documentation, capped collections increase read/write performance). When I tested inserting/writing a few thousand documents/records using the python driver, with a capped collection without an index against a normal collection, I couldn't see much improvement in write performance with the capped collection over the normal collection. For example, I inserted 40K records on a single thread using the pymongo driver: the capped collection took around 25.4 seconds and the normal collection took 25.7 seconds.
Could anyone please explain when we can achieve maximum insertion/write throughput with a capped collection? Is it the right choice for time series data collections?
Data stored in a capped collection is rotated once the collection exceeds its fixed size: the oldest documents are overwritten by new ones.
Capped collections don't require an index to preserve insertion order; data is retrieved in natural order, the same order in which the database refers to the documents on disk. Hence they offer high performance for insertion and for retrieval in that access pattern.
For a more detailed description of capped collections, please refer to the documentation:
https://docs.mongodb.com/manual/core/capped-collections/
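For completeness, a minimal pymongo sketch of the setup discussed above (database/collection names and the size cap are assumptions, not values from the question):

    from pymongo import MongoClient

    db = MongoClient("mongodb://localhost:27017")["sensors"]

    # Create a 100 MB capped collection; a "max" argument could additionally
    # cap the number of documents.
    db.create_collection("readings", capped=True, size=100 * 1024 * 1024)

    db["readings"].insert_many(
        [{"sensor_id": 1, "value": v} for v in range(1000)]
    )

    # Without an explicit sort, documents come back in natural (insertion) order.
    first_25 = list(db["readings"].find().limit(25))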