Need a tool for PostgreSQL monitoring on Windows - postgresql

I have Postgres running on Windows and I'm trying to investigate the strange behaviour: There are 17 postgres processes, 8 out of those 17 consume ~300K memory each.
Does anybody know what such behavior is coused by?
Does anybody know about a tool to investigate the problem?

8 out of those 17 consume ~300K memory
each.
Are you 110% sure? Windows doesn't know how much memory is used from the shared buffers. Each proces could use just a few kb's and using the shared memory together with the other processes.
What problem do you have? Using memory is not a problem, memory is made to use. And 300KB each, that's just a few MB all together, if each proces is realy using 300KB.
And don't forget, PostgreSQL is a multi proces system. That's also why it scales so easy on multi core and multi processor systems.

See pgAdmin: http://www.pgadmin.org/

A tool for analyzing output from postgresql can be found at http://pgfouine.projects.postgresql.org/
pgFouine is a PostgreSQL log analyzer used to generate detailed reports from a PostgreSQL log file. pgFouine can help you to determine which queries you should optimize to speed up your PostgreSQL based application.
I don't think you can find out why you have alot of processes running, but if you feel that it might be because of database usage, this tool might help you find a cause.

Related

Postgres auto tuning

As we all know postgres performance highly depends on config params. Eg if I have ssd drive or more RAM I need to tell that postgres by changing relevant cfg param
I wonder if there is any tool (for Linux) which can suggest best postgres configuration for current hardware?
Im aware Websites (eg pgtune) where I can enter server spec and those can suggest best config
However each hardware is different (eg I might have better raid / controller or some processes what might consume more ram etc). My wish would be postgres doing self tuning, analysing query execution time available resources etc
Understand there is no such mechanism, so maybe there is some tool / script I can run which can do this job for me (checking eg disk seq. / random disk read, memory available etc) and telling me what to change in config
There are parameters that you can tweak to get better performance from postgresql.
This article gives good read about that.
There are few scripts that can do that. One that is mentioned in postgres wiki is this one.
To get more idea about what more tuning your database needs, you need to log its request and performance, after analysing those logs you can tune more params. For this there is pgbadger log analyzer.
After using database in production, you get more idea regarding what requirements you have and how to approach them rather than making changes just based on os or hardware configuration.

Are there any negative performance or functionality downsides to using pg_upgrade with --link option afterwards?

I'm about upgrade a quite large PostgreSQL cluster from 9.3 to 11.
The upgrade
The cluster is approximately 1,2Tb in size. The database has a disk system consisting of a fast HW RAID 10 array of 8 DC-edition SSDs with 192GB ram and 64 cores. I am performing the upgrade by replicating the data to a new server with streaming replication first, then upgrading that one to 11.
I tested the upgrade using pg_upgrade with the --link option, this takes less than a minute. I also tested the upgrade regularly (without --link) with many jobs, that takes several hours (+4).
Questions
Now the obvious choice is of cause for me to use the --link option, however all this makes me wonder - is there any downsides (performance or functionality wise) to using that over the regular slower method? I do not know the internal workings of postgresql data structures, but I have a feeling there could be a performance difference after the upgrade between rewriting the data entirely and to just using hard links - whatever that means?
Considerations
The only thing I can find in the documentation about the drawbacks of --link is the downside of not being able to access the old data directory after the upgrade is performed https://www.postgresql.org/docs/11/pgupgrade.htm However that is only a safety concern and not a performance drawback and doesn't really apply in my case of replicating the data first.
The only other thing I can think of is reclaiming space, with whatever performance upsides that might have. However as I understand it, that can also be achieved by running a VACUUM FULL DATABASE (or CLUSTER?) command after the --link-upgraded database has been upgraded? Also the reclaiming of space is not very impactful performance wise on an SSD as I understand.
I appreciate if anyone can help cast some light into this.
There is absolutely no downside to using hard links (with the exception you noted, that the old cluster is dead and has to be removed).
A hard link is in no way different from a normal file.
A “file” in UNIX is in reality an “inode”, a structure containing file metadata. An entry in a directory is a (hard) link to that inode.
If you create another hard link to the inode, the same file will be in two different directories, but that has no impact whatsoever on the behavior of the file.
Of course you must make sure that you don't start both the only and the new server. Instant data corruption would ensue. That's why you should remove the old cluster as soon as possible.

Postgres why is swap-usage growing? How to reduce it? - AWS RDS

Having a postgres DB on AWS-RDS the Swap Usage in constantly rising.
Why is it rising? I tried rebooting but it does not sink. AWS writes that high swap usage is "indicative of performance issues"
I am writing data to this DB. CPU and Memory do look healthy:
To be precise i have a
db.t2.micro-Instance and at the moment ~30/100 GB Data in 5 Tables - General Purpose SSD. With the default postgresql.conf.
The swap-graph looks as follows:
Swap Usage warning:
Well It seems that your queries are using a memory volume over your available. So you should look at your queries execution plan and find out largest loads. That queries exceeds the memory available for postgresql. Usually over-much joining (i.e. bad database structure, which would be better denonarmalized if applicable), or lots of nested queries, or queries with IN clauses - those are typical suspects. I guess amazon delivered as much as possible for postgresql.conf and those default values are quite good for this tiny machine.
But once again unless your swap size is not exceeding your available memory and your are on a SSD - there would be not that much harm of it
check the
select * from pg_stat_activity;
and see if which process taking long and how many processes sleeping, try to change your RDS DBparameter according to your need.
Obviously you ran out of memory. db.t2.micro has only 1GB of RAM. You should look in htop output to see which processes takes most of memory and try to optimize memory usage. Also there is nice utility called pgtop (http://ptop.projects.pgfoundry.org/) which shows current queries, number of rows read, etc. You can use it to view your postgress state in realtime. By the way, if you cannot install pgtop you can get just the same information from posgres internal tools - check out documentation of postgres stats collector https://www.postgresql.org/docs/9.6/static/monitoring-stats.html
Actually it is difficult to say what the problem is exactly but db.t2.micro is a very limited instance. You should consider taking a biggier instance especially if you are using postgres in production.

SYSIBM.SQLSTATISTICS and SYSIBM.SQLPRIMARYKEYS using most of CPU in DB2 on Windows

I have a fairly busy DB2 on Windows server - 9.7, fix pack 11.
About 60% of the CPU time used by all queries in the package cache is being used by the following two statements:
CALL SYSIBM.SQLSTATISTICS(?,?,?,?,?,?)
CALL SYSIBM.SQLPRIMARYKEYS(?,?,?,?)
I'm fairly decent with physical tuning and have spent a lot of time on SQL tuning on this system as well. The applications are all custom, and educating developers is something I also spend time on.
I get the impression that these two stored procedures are something that perhaps ODBC calls? Reading their descriptions, they also seem like things that are completely unnecessary to do the work being done. The application doesn't need to know the primary key of a table to be able to query it!
Is there anything I can tell my developers to do that will either eliminate/reduce the execution of these or cache the information so that they're not executing against the database millions of times and eating up so much CPU? Or alternately anything I can do at the database level to reduce their impact?
6.5 years later, and I have the answer to my own question. This is a side effect of using an ORM. Part of what it does is to discover the database schema. Rails also has a similar workload. In Rails, you can avoid this by using the schema cache. This becomes particularly important at scale. Not sure if there are equivalencies for other ORMs, but I hope so!

Configuration Recommendations for a PostgreSQL Installation

I have a Windows Server 2003 machine which I will be using as a Postgres database server, the machine is a Dual Core 3.0Ghz Xeon with 4 GB ECC Memory and 4 x 120GB 10K RPM SAS Drives, all stripped.
I have read that the default Postgres install is configured to run nicely on a 486 with 32MB RAM, and I have read several web pages about configuration optimizations - but was hoping for something more concrete from my Stackoverflow peeps.
Generally, its only going to serve 1 database (potentially one or two more) but the catch is that the database has 1 table in particular which is massive (hundreds of millions of records with only a few coloumn). Presently, with the default configuration, it's not slow, but I think it could potentially be even faster.
Can people please give me some guidance and recomendations for configuration settings which you would use for a server such as this.
4*stripped drive was a bad idea — if any of this drives will fail then you'll loose all data, and even SAS drives sometimes fail — with 4 drivers it is 4 times more likely than with 1 drive; you should go for RAID 1+0.
Use the latest version of Postgres, 8.3.7 now; there are many performance improvements in every major version.
Set shared_buffers parameter in postgresql.conf to about 1/4 of your memory.
Set effective_cache_size to about 1/2 of your memory.
Set checkpoint_segments to about 32 (checkpoint every 512MB) and checkpoint_completion_target to about 0.8.
Set default_statistics_target to about 100.
Migrate to Enterprise Linux or FreeBSD: Postgres works much better on Unix type systems — Windows support is a recent addition, not very mature.
You can read more on this page: Tuning Your PostgreSQL Server — PostgreSQL Wiki
My experience suggests that (within limits) the hardware is typically the least important factor in database performance.
Assuming that you have enough memory to keep commonly used data in cache, then your CPU speed may vary 10-50% between a top-of the line machine and a common or garden box.
However, a missing index in an important search, or a poorly written recursive trigger could easily make a difference of 1,000% or 10,000% or more in your response times.
Without knowing exactly your table structure and row counts, I think anybody would suggest that your your hardware looks amply sufficient. It is only your database structure which will kill you. :)
UPDATE:
Without knowing the specific queries and your index details, there's not much more we can do. And in general, even knowing the queries, it's often very difficult to optimize without actually installing and running the queries with realistic data sets.
Given the cost of the server, and the cost of your time, I think you need to invest thirty bucks in a book. Then install your database with test data, run the queries, and see what runs well and what runs badly. Fix, rinse, and repeat.
Both of these books are specific to SQL Server and both have high ratings:
http://www.amazon.com/Inside-Microsoft%C2%AE-SQL-Server-2005/dp/0735621969/ref=sr_1_1
http://www.amazon.com/Server-Performance-Tuning-Distilled-Second/dp/B001GAQ53E/ref=sr_1_5