PostgreSQL: setting parameters for query optimization - database-performance

I am tuning the work_mem and shared_buffers parameters of PostgreSQL.
Even after increasing the values of these parameters, there is no improvement in query
performance.
Which factors do I need to adjust to improve query performance?
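For reference, this is roughly how I am changing them; a minimal sketch, where the table name and the values are placeholders rather than my exact settings:

-- in postgresql.conf (shared_buffers only takes effect after a restart):
--   shared_buffers = 4GB    # shared page cache, often around 25% of RAM
--   work_mem = 64MB         # per sort/hash node, per session, so keep it modest
-- work_mem can also be raised for a single session or query:
SET work_mem = '256MB';
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM big_table WHERE created_at >= '2018-01-01';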

Related

Increase postgresql-9.5 performance on 200+ million records

I have 200+ million records in a postgresql-9.5 table. Almost all queries are analytical queries. To optimize query performance I have so far tried indexing, and it seems that this is not sufficient. What other options do I need to look into?
Depending on the conditions in your WHERE clauses, create a partitioned table (https://www.postgresql.org/docs/10/static/ddl-partitioning.html); it can reduce query cost drastically. Also, if a certain fixed value always appears in the WHERE clause, create a partial index on the partitioned table (see the sketch below).
An important point: check the order of columns in the WHERE clause and match it when defining the index.
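A minimal sketch of both ideas, assuming a hypothetical events table partitioned by date with PostgreSQL 10's declarative partitioning (all names, ranges and the 'active' value are illustrative only):

CREATE TABLE events (
    id         bigint NOT NULL,
    event_date date   NOT NULL,
    status     text   NOT NULL
) PARTITION BY RANGE (event_date);
CREATE TABLE events_2018_q1 PARTITION OF events
    FOR VALUES FROM ('2018-01-01') TO ('2018-04-01');
-- partial index for a value that is fixed in the WHERE clause
CREATE INDEX ON events_2018_q1 (event_date, id) WHERE status = 'active';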
You should upgrade to PostgreSQL v10 so that you can use parallel query.
That enables you to run sequential and index scans with several background workers in parallel, which can speed up these operations on large tables.
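A short sketch of the knobs involved, assuming v10 and reusing the illustrative events table from above (the worker counts are examples, not tuned values):

-- in postgresql.conf:
--   max_worker_processes = 8
--   max_parallel_workers = 8
-- per session, workers allowed per Gather node:
SET max_parallel_workers_per_gather = 4;
EXPLAIN (ANALYZE) SELECT status, count(*) FROM events GROUP BY status;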
A good database layout, good indexing, lots of RAM and fast storage are also important factors for good performance of analytical queries.
If the analysis involves a lot of aggregation, consider materialized views to store the aggregates. Materialized views do take up space and need to be refreshed, but they are very useful for pre-computing aggregations.
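A sketch of that approach with the same illustrative events table (names are placeholders):

CREATE MATERIALIZED VIEW daily_counts AS
SELECT event_date, status, count(*) AS n
FROM events
GROUP BY event_date, status;
CREATE UNIQUE INDEX ON daily_counts (event_date, status);
-- re-run whenever the aggregates need to be brought up to date;
-- CONCURRENTLY requires the unique index above
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_counts;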

MongoDB can not find() in 1 million documents

I have just started working with MongoDB.
I created 10 thousand JSON documents. I run a search:
db.mycollection.find({"somenode1.somenode2.somenode3.somenode4.Value": "9999"}).count()
It gives out the correct result. Operating time: 34 ms. Everything is OK.
Now I create a database with 1 million of the same documents. The total size of the database is 34 GB, and MongoDB split it into 2 GB files. I repeat the query described above to count the matching documents. I waited about 2 hours for a result; all the memory (16 GB) was occupied, and finally I shut Mongo down.
System: Windows 7 x64, 16 GB RAM.
Please tell me what I'm doing wrong. A production db will be much bigger.
In your particular case, it appears you simply do not have enough RAM. At minimum, an index on "somenode4" would improve the query performance. Keep in mind the indexes are going to want to be in RAM as well, so you may need more RAM anyhow. Are you on a virtual machine? If so, I recommend you increase the size of the machine to account for the size of the working set.
As one of the other commenters stated, that nesting is a bit ugly, but I understand it is what you were dealt. So other than RAM, indexing appears to be your best bet.
As part of your indexing effort, you may also want to experiment with pre-heating the indexes to ensure they are in RAM prior to that find and count(). Try executing a query that looks for something that does not exist. This should force the indexes and data into RAM prior to that query. Depending on how often your data changes, you may want this done once a day or more. You are essentially front-loading the slow operations (see the shell sketch below).
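A rough sketch of both suggestions in the mongo shell, indexing the full path the query actually uses (on very old servers use ensureIndex instead of createIndex):

db.mycollection.createIndex({ "somenode1.somenode2.somenode3.somenode4.Value": 1 })
// pre-heat: look for a value that does not exist to pull the index into RAM
db.mycollection.find({ "somenode1.somenode2.somenode3.somenode4.Value": "no-such-value" }).count()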

Reorganizing indexes and database size

I have a fragmentation problem on my production database. One of my main data tables is about 6 GB (3 GB of indexes, about 9M records) in size and has 94% (!) index fragmentation.
I know that reorganizing indexes would solve this problem, BUT my database is on SQL Server 2008R2 Express, which has a 10 GB database limit, and my database is already 8 GB in size.
I have read a few blog posts about this issue, but none gave an answer for my situation.
My Question1 is:
How much of a size increase (% or in GB) can I expect after reorganizing indexes on that table?
Question2:
Will dropping the index and then rebuilding the same index take less space? Time is not a factor for me at the moment.
Extra question:
Any other suggestions for database fragmentation? I only know that shrinking should be avoided like fire ;)
Having an index on key columns will improve joins and filters by removing the need for a table scan. A well-maintained index can drastically improve performance.
It is right that GUIDs make a poor choice for indexed columns, but that by no means implies you should not create these indexes. Ideally, a data type of INT or BIGINT would be advisable.
For me, adding NEWID() as a default has shown some improvement in counteracting index fragmentation, but if all alternatives fail you may have to run index maintenance operations (rebuild, reorganize) more often than for other indexes. Reorganizing needs some working space, but in your scenario, since time is not a concern, I would disable the index, shrink the DB, and then rebuild the index (see the sketch below).
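A hedged T-SQL sketch of that sequence; the table, index, and database names are placeholders, and disabling a clustered index makes the table unusable until it is rebuilt, so test on a copy first:

-- check current fragmentation
SELECT index_id, avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.MainTable'), NULL, NULL, 'LIMITED');
ALTER INDEX IX_MainTable_Col ON dbo.MainTable DISABLE;
DBCC SHRINKDATABASE (MyDatabase);
ALTER INDEX IX_MainTable_Col ON dbo.MainTable REBUILD;
-- lighter-weight alternative that needs little extra working space
ALTER INDEX IX_MainTable_Col ON dbo.MainTable REORGANIZE;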

Is MongoDB search without index really slow?

I am testing the performance of MongoDB to compare it with my current MySQL-based solution.
In a collection/table X with three attributes A, B, and C, I have attribute A indexed in both MongoDB and MySQL.
Now I load 1M records into both MongoDB and MySQL and test the search performance in this straightforward scenario.
The insert speed in MongoDB is only 10% faster than inserting into MySQL. That is OK; I knew adopting MongoDB would not magically improve my CRUD performance, but I am really surprised by the search performance in MongoDB without an index.
The results show that a MongoDB select on a non-indexed field is ten times slower than a select on an indexed field.
On the other hand, a MySQL (MyISAM) select on a non-indexed field is only about 70% slower than a select on an indexed field.
Last but not least, in the indexed-select scenario, MongoDB is about 30% quicker than my MySQL solution.
I want to know: are the above figures normal? Especially the performance of a MongoDB select without an index?
My code looks like this:
BasicDBObject query = new BasicDBObject("A", value_of_field_A);
DBCursor cursor = currentCollection.find(query);
while (cursor.hasNext()) {
    DBObject obj = cursor.next();
    // do nothing after that, only for testing purposes
}
By the way, from a business-logic perspective, my collection could become really large (TB and more). What would you suggest for the size of each physical collection? 10 million documents or 1 billion documents?
Thanks a lot!
------------------------------ Edit ------------------------------
I tried the insert with 10 million records on both MongoDB and MySQL, and MongoDB is about 20% faster than MySQL -- not as much as I thought.
I am curious: if I set up MongoDB auto-sharding, will the insert speed improve? If so, do I need to put the shards on different physical machines, or can I put them on the same machine with multiple cores?
------------------------------ Update ------------------------------
First, I changed the MongoDB write concern from ACKNOWLEDGED to UNACKNOWLEDGED, and the MongoDB insert speed became 3X faster.
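A minimal sketch of that change with the legacy com.mongodb Java driver used above (host, database and collection names are placeholders):

MongoClient client = new MongoClient("localhost", 27017);
DB db = client.getDB("test");
DBCollection coll = db.getCollection("mycollection");
// UNACKNOWLEDGED: the driver does not wait for a server reply,
// so inserts return sooner but write errors are not reported back
coll.setWriteConcern(WriteConcern.UNACKNOWLEDGED);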
Later on, I made the insert program parallel (8 threads on an 8-core computer). In MongoDB's ACKNOWLEDGED mode, the insert also improved 3X; in UNACKNOWLEDGED mode, the speed was actually 50% slower.
For MySQL, parallel inserts increased the speed 5X, which is faster than the best insert case from MongoDB!
MongoDB queries without an index will do a full collection scan, and keep in mind that MongoDB's data size for the same records is much larger than MySQL's. I am guessing this might be one of the issues causing the slowness when doing a full scan.
Regarding queries with indexes, MongoDB may turn out faster because of caching, the lack of a complex query-optimizer plan (as in MySQL), etc.
The size of the collection is not an issue; 10 million documents can easily be handled in one collection. If you have a requirement to archive data, you can break it into smaller collections, which will make that process easier.
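To see which case a query hits, explain() shows whether the index is used; a sketch in the mongo shell, assuming the collection X and field A from the question (output field names vary by server version):

db.X.createIndex({ A: 1 })
// an index scan (IXSCAN / BtreeCursor) is the fast path; COLLSCAN means a full scan
db.X.find({ A: "some value" }).explain()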

How to improve the performance of feed system using mongodb

I have a feed system using fan-out on write. I keep a list of feed ids in a Redis sorted set and save the feed content in MongoDB, so every time I read 30 feeds I have to do 30 queries to MongoDB. Is there any way to improve this?
It depends on your database setup. MongoDB has extensive documentation about how to increase simultaneous reads and writes: MongoDB concurrency.
If you need many writes to the database with low latency, start using sharding: Deployment Sharding.
If you need to increase the number of reads, deploy each shard as a replica set and route your read queries to secondary nodes: Read Preferences (see the sketch below).
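A small sketch of routing reads to secondaries from the mongo shell; the collection and field names are placeholders, and "secondaryPreferred" is just one possible mode:

db.getMongo().setReadPref("secondaryPreferred")
db.feeds.find({ feedId: "some-feed-id" })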
Also, each query should be covered by an index (better indexing). You can check your query time by simply adding explain() after a find; it will show you the execution time and other details:
db.collection.find({a:"abcd"}).explain()
Make sure you have enough RAM so that your data set fits in RAM, or at least your indexes do, because data fetched from disk is orders of magnitude slower than data served from RAM.
Check your server status by running mongostat; it measures database performance and reports page faults, locks, query operations, and many other details.
Also measure your hardware performance with a program like iostat and make sure I/O wait stays low, below 1%.
A few good links on MongoDB deployment and performance tuning:
1. Production deployment of mongodb
2. Performance tuning of mongodb By 10gen
3. Using redis before mongodb to cache query and result object
4. Example of redis and mongo