Low index/table hit rate on Heroku Postgres database - postgresql

I see that index cache hit rate (=80%) and table cache hit rate (=93%) are lower than they should be (>99%).
All frequent queries use indexes and have execution times of about 1 ms.
Would removing unused indexes help increase the hit rates? What about adding more indexes?
Or is it time to upgrade the DB (especially by adding more RAM)?
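For context, the hit rates Heroku reports can be approximated directly from the standard statistics views; a minimal sketch (assuming only the stock pg_statio_* views, no Heroku-specific tooling):

```sql
-- Table (heap) cache hit rate across user tables
SELECT sum(heap_blks_hit)::numeric
       / nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0) AS table_hit_rate
FROM pg_statio_user_tables;

-- Index cache hit rate across user indexes
SELECT sum(idx_blks_hit)::numeric
       / nullif(sum(idx_blks_hit) + sum(idx_blks_read), 0) AS index_hit_rate
FROM pg_statio_user_indexes;
```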

Related

PostgreSQL maintenance_work_mem increase during index creation

We are running a migration job in which creating a GIN index on a JSONB column is taking too long. After investigating a bit, we think that increasing the maintenance_work_mem limit (it is 120MB now) would speed things up. But we are not sure whether this would interrupt the index creation currently in progress or restart the instance. We are running PostgreSQL on GCP.
You can change maintenance_work_mem any time without disturbing active database sessions, but it won't have any effect on CREATE INDEX statements that are already running.
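As a minimal sketch of that workflow (the table, column, and index names are hypothetical, and the memory value is an assumption to size against your instance):

```sql
-- Applies only to this session; a CREATE INDEX that is already running is unaffected.
SET maintenance_work_mem = '1GB';

-- Start (or restart) the GIN build; CONCURRENTLY avoids blocking concurrent writes.
CREATE INDEX CONCURRENTLY idx_docs_payload_gin
    ON docs USING gin (payload);
```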

Postgres Upsert - fragmentation issues

Summary
I am using Postgres UPSERTs in our ETLs and I'm experiencing issues with fragmentation and bloat on the tables I am writing to, which is slowing down all operations including reads.
Context
I have hourly batch ETLs upserting into tables (the tables hold tens of millions of rows; each batch upserts tens of thousands), and we have autovacuum configured with thresholds on AWS.
I have had to run VACUUM FULL to get the space back and prevent processes from hanging. This has been exacerbated now that the frequency of one of our ETLs has increased; it populates some core tables which are the source for a number of denormalised views.
It seems like what is happening is that the tables don't have a chance to be vacuumed before the next ETL run, creating a spiral that eventually leads to a complete slow-down.
Question!
Does upsert fundamentally have a negative impact on fragmentation, and if so, what are other people using? I am keen to implement some materialised views and move most of our indexes to the new views, retaining only the PK index on the tables we are writing to, but I'm not confident that this will resolve the bloat issue I'm seeing.
I've done a bit of reading on the issue but found nothing conclusive, for example: https://www.targeted.org/articles/databases/fragmentation.html
Thanks for your help
It depends. If there are no constraint violations, INSERT ... ON CONFLICT won't cause any bloat. If it performs an update, it will produce a dead row.
The measures you can take:
set autovacuum_vacuum_cost_delay = 0 for faster autovacuum
use a fillfactor somewhat less than 100 and have no index on the updated columns, so that you can get HOT updates, which make autovacuum unnecessary
It is not clear what you are actually seeing. Can you turn track_io_timing on, and then do an EXPLAIN (ANALYZE, BUFFERS) for the query that you think has been slowed down by bloat?
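A minimal sketch of those suggestions (the table name, query, and reloption values are hypothetical and should be adapted to the workload):

```sql
-- Per-table settings: no autovacuum cost delay, and headroom on each page for HOT updates
ALTER TABLE etl_target SET (autovacuum_vacuum_cost_delay = 0);
ALTER TABLE etl_target SET (fillfactor = 90);

-- Enable I/O timing (requires sufficient privileges), then inspect the suspect query
SET track_io_timing = on;
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, status
FROM etl_target
WHERE status = 'pending';
```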
Bloat and fragmentation aren't the same thing. Fragmentation is more an issue with indexes under some conditions, not the tables themselves.
It seems like what is happening is that tables don't have a chance to be vacuumed before the next ETL run
This one could be very easy to fix. Run a "manual" VACUUM (not VACUUM FULL) at the end or at the beginning of each ETL run. Since you have a well-defined workflow, there is no need to try to get autovacuum to do the right thing, as it should be very easy to inject manual vacuums into your workflow. Or do you think that one VACUUM per ETL is overkill?
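For example (the table name is hypothetical), the ETL job could simply finish with:

```sql
-- Plain VACUUM reclaims dead rows for reuse without the exclusive lock VACUUM FULL takes;
-- ANALYZE refreshes statistics for the next run's plans.
VACUUM (ANALYZE) etl_target;
```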

Is a reduction in free disk space a good overall indicator of a `work_mem` setting that is too low?

As I understand it (after a fair amount of searching online)...
1- If a component of a query (sort, join, etc.) uses more RAM/memory than my work_mem setting, or the total memory used by all current operations on the server exceeds available OS memory, the query will start writing to disk.
Is that true?
2- Postgres (like many other good DB engines) uses memory to cache a lot so queries go faster; therefore, the server will show low free memory even if it isn't really starved for memory. So low free memory doesn't really indicate anything other than a good DB engine and healthy utilization.
Is that true?
3- If both #1 and #2 above are true, then, holding everything else constant, if I want a broad indicator of a work_mem setting that is too low (or not enough overall OS memory), should I look at whether the server's free disk space is shrinking?
Am I thinking about this correctly?
links:
https://www.postgresql.org/docs/current/static/runtime-config-resource.html
http://patshaughnessy.net/2016/1/22/is-your-postgres-query-starved-for-memory
https://www.enterprisedb.com/monitor-cpu-and-memory-percentage-used-each-process-postgresqlppas-9
https://dba.stackexchange.com/questions/18484/tuning-postgresql-for-large-amounts-of-ram
I know I can set log_temp_files and look at individual temp files to tune the work_mem setting, but I wanted an overall gauge I could use to determine if possibly work_mem is too low before I start digging around looking at temp file sizes that exceed my work_mem setting.
I have PostgreSQL 10.
Processing a query takes a number of steps:
generate (all) possible plans
estimate the cost of executing these plans (in terms of resources: disk I/O, buffers, memory, CPU), based on tuning constants and statistics
pick the "optimal" plan, based on tuning constants
execute the chosen plan.
In most cases, a plan that is expected (step 2) to need more than your work_mem setting will not be chosen in step 3 (because "spilling to disk" is considered very expensive).
Once step 4 detects that it needs more work_mem, its only choice is to spill to disk. Shit happens... At least this doesn't rely on the OS paging out overcommitted memory.
The rules are very simple:
hash-joins are often optimal but will cost memory
don't try to use more memory than you have
if there is a difference between expected (step 2) and observed (step 4) memory, your statistics are wrong. You will be punished with a spill to disk.
a lack of usable indexes will cause hash joins or seqscans.
sorting uses work_mem, too. The mechanism is similar: bad estimates yield bad plans.
CTEs are often/always(?) materialized. This will spill to disk once your buffer space overflows.
CTEs don't have statistics, and don't have indices.
A few guidelines/advice:
use a correct data model (and don't denormalize)
use the correct PK/FKs and secondary indices.
run ANALYZE the_table_name; to gather fresh statistics after huge modifications to the table's structure or data.
Monitoring:
check the Postgres logfile
check the query plan, compare observed <--> expected (see the sketch below)
monitor the system resource usage (on Linux: via top/vmstat/iostat)
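As a concrete illustration of the spill-to-disk behaviour described above (the table, columns, and work_mem value are assumptions, not taken from the question):

```sql
-- Log every temporary file, whatever its size, so spills show up in the Postgres logfile
SET log_temp_files = 0;
SET work_mem = '4MB';

EXPLAIN (ANALYZE, BUFFERS)
SELECT customer_id, sum(amount) AS total
FROM orders
GROUP BY customer_id
ORDER BY total DESC;
-- A line like "Sort Method: external merge  Disk: ..." in the plan output means the sort
-- did not fit in work_mem and spilled to a temporary file.
```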

Is killing a "CLUSTER ON index" dangerous for database?

The whole question is in the title:
if we kill a CLUSTER query on a 100 million row table, will it be dangerous for the database?
The query has been running for 2 hours now, and I need to access the table tomorrow morning (12 hours left, hopefully).
I thought it would be far quicker; my database is running on RAID SSDs and a dual Xeon processor.
Thanks for your wise advice.
Sid
No, you can kill the cluster operation without any risk. Before the operation is done, nothing has changed in the original table and index files. From the manual:
When an index scan is used, a temporary copy of the table is created
that contains the table data in the index order. Temporary copies of
each index on the table are created as well. Therefore, you need free
space on disk at least equal to the sum of the table size and the
index sizes.
When a sequential scan and sort is used, a temporary sort file is also
created, so that the peak temporary space requirement is as much as
double the table size, plus the index sizes.
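To estimate the free space the manual is talking about, pg_total_relation_size() (table plus all of its indexes and TOAST data) is a reasonable proxy; the table name below is hypothetical:

```sql
SELECT pg_size_pretty(pg_total_relation_size('my_table')) AS table_plus_indexes;
```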
As @Frank points out, it is perfectly fine to do so.
Assuming you want to run this query in the future and assuming you have the luxury of a service window and can afford some downtime, I'd tweak some settings to boost the performance a bit.
In your configuration:
turn off fsync, for higher throughput to the file system
Fsync stands for file system sync. With fsync on, the database waits for the file system to commit on every page flush.
maximize your maintenance_work_mem
It's OK to just take all the memory available, as it will not be allocated during production hours. I don't know how big your table and the index you are working on are; things will run faster when they can be fully loaded into main memory.
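A hedged sketch of those two tweaks for a service window (the values and object names are assumptions; disabling fsync risks data loss if the server crashes, so only do it when you can restore from backup, and turn it back on immediately afterwards):

```sql
-- Session-level: give CLUSTER plenty of memory for its sort and index rebuilds
SET maintenance_work_mem = '8GB';

-- Cluster-wide: turn off fsync for the window and reload the configuration
ALTER SYSTEM SET fsync = off;
SELECT pg_reload_conf();

CLUSTER my_table USING my_index;

-- Re-enable fsync as soon as the maintenance is finished
ALTER SYSTEM RESET fsync;
SELECT pg_reload_conf();
```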

MongoDB Insert performance - Huge table with a couple of Indexes

I am testing MongoDB for use in a database with a huge table of about 30 billion records of about 200 bytes each. I understand that sharding is needed for that kind of volume, so I am trying to get 1 to 2 billion records onto one machine. I have reached 1 billion records on a machine with 2 CPUs (6 cores each) and 64 GB of RAM. I ran mongoimport without indexes, and speed was okay (average 14k records/s). I added indexes, which took a very long time, but that is okay as it is a one-time thing. Now inserting new records into the database is taking a very long time. As far as I can tell, the machine is not loaded while inserting records (CPU, RAM, and I/O are in good shape). How is it possible to speed up inserting new records?
I would recommend adding this host to MMS (http://mms.10gen.com/help/overview.html#installation) - make sure you install with munin-node support and that will give you the most information. This will allow you to track what might be slowing you down. Sorry I can't be more specific in the answer, but there are many, many possible explanations here. Some general points:
Adding indexes means that the indexes as well as your working data set will need to be in RAM now; this may have strained your resources (look for page faults)
Now that you have indexes, they must be updated when you are inserting - if everything fits in RAM this should be OK; see the first point
You should also check your Disk IO to see how that is performing - how does your background flush average look?
Are you running the correct filesystem (XFS, ext4) and a kernel version later than 2.6.25? (earlier versions have issues with fallocate())
Some good general information for follow up can be found here:
http://www.mongodb.org/display/DOCS/Production+Notes