Is it possible to monitor PostgreSQL server performance from inside a PL/PGSQL function? - postgresql

This may sound exotic, but I would like to know programmatically whether it is a 'good moment' to execute a heavy-write PL/pgSQL function on a server. By 'good moment' I mean weighing some direct or derived indicator of the load level, the concurrency level, or any other metric of the PostgreSQL server.
I am aware there are a number of advanced applications specialized in performance tracking out there, like https://www.datadoghq.com. But I just want a simple internal KPI that I can use to delay the execution of these heavy-write procedures until a 'better moment' comes.
Some of these procedures purge tables, some compute averages/sums over millions of rows, some check remote tables, etc. They may wait for minutes or hours for a 'better moment', when the pressure from concurrent users drops.
Any idea?

You can see how many other sessions are active with something like:
select count(*) from pg_stat_activity where state='active';
But you have to be a superuser, or have the pg_monitor role, or else state will be NULL for sessions of other users. If that bothers you, you could write a function with SECURITY DEFINER to allow access to this info. (You should probably be putting this into its own function anyway, which means there is no reason that it needs to be implemented in plpgsql unless that is the only language available to you.)
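For example, a minimal sketch of such a SECURITY DEFINER wrapper (the function name and the granted role are placeholders for this example):
-- Create this as a superuser or a pg_monitor member; SECURITY DEFINER makes
-- the query against pg_stat_activity run with the owner's privileges.
CREATE OR REPLACE FUNCTION active_session_count()
  RETURNS bigint
  LANGUAGE sql
  SECURITY DEFINER
  SET search_path = pg_catalog
AS $$
  SELECT count(*)
  FROM pg_stat_activity
  WHERE state = 'active'
    AND pid <> pg_backend_pid();  -- don't count the calling session itself
$$;

REVOKE ALL ON FUNCTION active_session_count() FROM PUBLIC;
GRANT EXECUTE ON FUNCTION active_session_count() TO app_role;  -- placeholder role
Your heavy-write procedure can then poll this function and sleep (or bail out) while the count is above whatever threshold you pick.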
You can also invoke arbitrary OS operations by using a suitably privileged PL language. That includes plpgsql, by abusing COPY ... FROM PROGRAM.
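For illustration, a hedged sketch of the COPY ... FROM PROGRAM route on Linux (requires superuser or membership in pg_execute_server_program; reading /proc/loadavg and the temp table name are just assumptions for this example):
-- Pull the OS load average into a temp table and parse the first field.
CREATE TEMP TABLE os_load (line text);
COPY os_load FROM PROGRAM 'cat /proc/loadavg';
-- /proc/loadavg starts with the 1-, 5- and 15-minute load averages
SELECT split_part(line, ' ', 1)::numeric AS load_1min FROM os_load;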

Related

PostgreSQL: Backend processes are active for a long time

Now I am hitting a very big roadblock.
I use PostgreSQL 10 and its new table partitioning.
Sometimes many queries don't return, and when that happens many backend processes show as active when I check them in pg_stat_activity.
First, I thought these processes were just waiting for locks, but the transactions contain only SELECT statements and no other backend runs any query that requires an ACCESS EXCLUSIVE lock. These SELECT-only queries are fine in terms of plan and usually work well, and computer resources (CPU, memory, IO, network) are not a problem either, so the transactions should never conflict. I thoroughly checked the locks of these transactions with pg_locks and pg_blocking_pids(), and in the end I couldn't find any lock that would make the queries much slower. Many of the active backends hold only ACCESS SHARE locks because they run only SELECT.
Now I think this phenomenon is not caused by locks, but by something related to the new table partitioning.
So, why are many backends active?
Could anyone help me?
Any comments are highly appreciated.
The figure below is a part of the result of pg_stat_activity.
If you want any additional information, please tell me.
EDIT
My query doesn't handle large data. The return type is like this:
uuid UUID
,number BIGINT
,title TEXT
,type1 TEXT
,data_json JSONB
,type2 TEXT
,uuid_array UUID[]
,count BIGINT
Because it has a JSONB column I cannot calculate the exact size, but it is not large JSON.
Normally these queries are moderately fast (around 1.5s), so they are absolutely no problem; the phenomenon only happens when other processes are working.
If the statistics information were wrong, the queries would always be slow.
EDIT2
These are the stats. There are almost 100 connections, so I couldn't show them all.
To me it looks like an application problem, not a PostgreSQL one. An active status means that your transaction has still not been committed.
So why might your application not send a commit to the database?
Try to review where your application code opens a transaction, reads data, commits the transaction, and rolls back the transaction.
EDIT:
By the way, to be sure, try to check resource usage before the problem appears and again when your queries start hanging. Run top and iotop to check whether postgres really starts eating your CPU or disk like crazy when the problem appears. If not, I would suggest looking for the problem in your application.
Thank you everyone.
I finally solved this problem.
I noticed that a backend process held too many locks: when I executed the query SELECT COUNT(*) FROM pg_locks WHERE pid = <pid>, the result was about 10,000.
The parameter max_locks_per_transaction is 64 and max_connections is about 800.
So, if the number of queries that hold many locks is large, the shared memory for the lock table runs short (see the shared memory calculation code inside PostgreSQL if you are interested).
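For reference, the documented sizing rule for the shared lock table, with the numbers from this setup plugged in (max_prepared_transactions assumed to be 0 here):
-- The shared lock table has room for roughly
--   max_locks_per_transaction * (max_connections + max_prepared_transactions)
-- object locks, shared by all sessions.
SELECT 64 * (800 + 0) AS approx_lock_slots;  -- about 51200 slots for this setup
A single backend holding ~10,000 locks therefore eats a large share of that budget.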
Too many locks were taken when I executed a query like SELECT * FROM (partitioned table). Imagine you have a partitioned table foo with 1000 partitions. When you execute SELECT * FROM foo WHERE partition_id = <id>, the backend process takes about 1000 table locks (plus index locks). So I changed the query from SELECT * FROM foo WHERE partition_id = <id> to SELECT * FROM foo_<partition_id>, addressing the partition directly. As a result, the problem looks solved.
You say
Sometimes many queries don't return
...the phenomenon only happens when other processes are working. If the statistics
information were wrong, the queries would always be slow.
Do they not return / run slow when you connect directly to the Postgres instance and run the query, or only when running the queries from an application? The backend processes that are running: are you able to kill them successfully with pg_terminate_backend($PID), or does that have issues?
To rule out issues with the statement itself, make sure statement_timeout is set to a reasonable amount to kill off long-running queries. Once that is ruled out, perhaps you are running into a case of the application hanging and never allowing the send calls from PostgreSQL to finish. To avoid a situation like that, if you are able to (depending on the OS), you can tune the keep-alive time: https://www.postgresql.org/docs/current/runtime-config-connection.html#GUC-TCP-KEEPALIVES-IDLE (the default is 2 hours).
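If it helps, a sketch of the settings mentioned above (the values are only examples and the role name is a placeholder; ALTER SYSTEM changes need a configuration reload):
ALTER ROLE app_user SET statement_timeout = '30s';  -- cancel statements running longer than 30s
ALTER SYSTEM SET tcp_keepalives_idle = 300;         -- start keepalive probes after 5 minutes of silence
ALTER SYSTEM SET tcp_keepalives_interval = 30;
ALTER SYSTEM SET tcp_keepalives_count = 3;
SELECT pg_reload_conf();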
Let us know if playing with any of that gives any more insight into your issue.
Sorry for the late post. As #Konstantin pointed out, this might be because of your application (which is why I asked for your EDIT2). Adding a few points:
Table partitioning has no effect on these locks; that is a totally different concept and does not hold up locks in your case.
In your application, check whether the connection is properly closed after read() and whether that happens in a finally block (from a Java perspective). I am not sure which application tier you use.
Check whether a SELECT ... FOR UPDATE or any similar statement was recently written erroneously and is causing this.
Check whether any table has grown in size recently and the queried column is not indexed. This is a very important and frequent cause of SELECT statements running for minutes. I'd also suggest using timeouts for SELECT statements in your application. https://www.postgresql.org/docs/9.5/gin-intro.html can give you a head start.
Another thing that is fishy to me is the JSONB column: maybe your JSONB values are pretty long, or the queries are unnecessarily selecting the JSONB value even when it is not required?
Finally, if you don't need the special features of the JSONB data type, you can use the JSON data type, which is faster (sometimes by as much as 50x!).
It looks like the pooled connections are not being closed properly and a few queries might be taking a huge amount of time to respond. As pointed out in other answers, this is a problem with the application and could be a connection leak. Most likely, pending transactions are piling up on top of already pending and unresolved transactions, leading to a number of unclosed transactions.
In addition, PostgreSQL generally has one or more "helper" processes like the stats collector, background writer, autovacuum daemon, walsender, etc., all of which show up as "postgres" instances.
One thing I would suggest: check in which part of the code you initiate the queries. Try to dry-run your queries outside the application and benchmark their performance.
Secondly, you can set a timeout for certain queries, if not all of them.
Thirdly, you can kill idle transactions after a certain timeout by using:
SET SESSION idle_in_transaction_session_timeout = '5min';
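If you prefer to clean up by hand instead of (or in addition to) that timeout, here is a sketch that terminates sessions sitting idle in a transaction for more than 5 minutes (requires superuser or the pg_signal_backend role; the 5-minute threshold is just an example):
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND state_change < now() - interval '5 minutes'
  AND pid <> pg_backend_pid();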
I hope it might work. Cheers!

Multiple deletes in Postgres with a commit after each one

I have about 30 million records to delete from a table, and deleting even 10,000 takes 30 minutes. I'm concerned about issuing the delete command for all 30 million records at once, so I'd like to do the delete in batches.
So my approach was to do a loop deleting a batch, then commit, then loop to delete the next batch. But that generates the following error:
ERROR: 0A000: cannot begin/end transactions in PL/pgSQL
HINT: Use a BEGIN block with an EXCEPTION clause instead.
LOCATION: exec_stmt_raise, pl_exec.c:3216
This is the code I wrote:
DO $$
BEGIN
  FOR i IN 1..30000 LOOP
    DELETE FROM my_table
    WHERE id IN (
      SELECT id
      FROM my_table
      WHERE should_delete = true
      LIMIT 1000
    );
    RAISE NOTICE 'Done with batch %', i;
    COMMIT;
  END LOOP;
END
$$;
What is an alternative way to achieve this?
A few things jump out at me:
1. Transaction overhead in PostgreSQL is significant. It's possible you would see a substantial performance improvement if you just did this all in one big transaction (a minimal sketch follows this list).
2. It sounds as if you are experimenting on a live production database. Don't do that. Set up a test instance with statistically similar (or identical) data and a similar workload, and use that. That way, if you manage to break something, you'll only hurt the test instance.
3. Once you have a test instance via (2), you can use it to test (1) and see how long it takes. Then you won't have to guess at which approach is superior.
4. You mention that other jobs run at fixed times. So this is not a database backing (something like) a web server exposed to the internet. It is a batch system which runs scheduled jobs at predetermined times. Since you already have a scheduling system, use it to schedule your deletes at times when the system is not in use (or when it is unlikely to be in use).
5. If you decide that you must use multiple transactions, use something other than PL/pgSQL to do the actual looping. You could, for instance, use a shell script or another programming language like Python or Java. Anything with Postgres bindings will do.
6. The really drastic approach is to replicate the whole database, perform the deletes on the replica, and then replace the original with the replica. The swap may require putting the database into read-only mode for a short time to avoid write inconsistencies and to allow the replica to converge. Since yours is a batch system, that might not matter. Obviously this is the most resource-intensive approach, since it requires a whole extra database.
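For point 1, a minimal sketch of the single-transaction variant, using the table and column names from the question:
BEGIN;
DELETE FROM my_table WHERE should_delete;
COMMIT;
Whether this actually beats batching is exactly the kind of thing the test instance from point 2 is for.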

Execute multiple functions together without losing performance

I have this process that has to make a series of queries, using pl/pgsql:
--process:
SELECT function1();
SELECT function2();
SELECT function3();
SELECT function4();
To be able to execute everything in one call, I created a process function as such:
CREATE OR REPLACE FUNCTION process()
  RETURNS text AS
$BODY$
BEGIN
  PERFORM function1();
  PERFORM function2();
  PERFORM function3();
  PERFORM function4();
  RETURN 'process ended';
END;
$BODY$
LANGUAGE plpgsql;
The problem is that when I sum the time each function takes by itself, the total is 200 seconds, while the function process() takes more than one hour!
Maybe it's a memory issue, but I don't know which setting in postgresql.conf I should change.
The DB is running PostgreSQL 9.4 on Debian 8.
You commented that the 4 functions have to run consecutively. So it's safe to assume that each function works with data from tables that have been modified by the previous function. That's my prime suspect.
Any Postgres function runs inside the transaction of the outer context. So all functions share the same transaction context if packed into another function. Each can see effects on data from previous functions, obviously. (Even though effects are still invisible to other concurrent transactions.) But statistics are not updated immediately.
Query plans are based on statistics about the involved objects. PL/pgSQL does not plan statements until they are actually executed, which would work in your favor. Per the documentation:
As each expression and SQL command is first executed in the function,
the PL/pgSQL interpreter parses and analyzes the command to create a
prepared statement, using the SPI manager's SPI_prepare function.
PL/pgSQL can cache query plans, but only within the same session and (in Postgres 9.2+ at least) only after a couple of executions have shown the same query plan to work best repeatedly. If you suspect this is going wrong for you, you can work around it with dynamic SQL, which forces a new plan every time:
EXECUTE 'SELECT function1()';
However, the most likely candidate I see is invalidated statistics that lead to inferior query plans. SELECT / PERFORM statements (same thing) inside the function are run in quick succession; there is no chance for autovacuum to kick in and update statistics between one function and the next. If one function substantially alters data in a table the next function works with, the next function might base its query plan on outdated information. A typical example: a table with a few rows is filled with many thousands of rows, but the next plan still thinks a sequential scan is fastest for the "small" table. You state:
when I sum the time that each function takes by itself, the total is
200 seconds, while the time that the function process() takes is more
than one hour!
What exactly does "by itself" mean? Did you run them in a single transaction or in individual transactions? Maybe even with some time in between? That would allow autovacuum to update statistics (it's typically rather quick) and possibly lead to completely different query plans based on the changed statistics.
You can inspect query plans inside plpgsql functions with auto_explain:
Postgres query plan of a UDF invocation written in pgpsql
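A sketch of turning that on for a single session (loading the module typically requires superuser; the thresholds are only examples, and it can also be preloaded via session_preload_libraries):
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '1s';     -- log plans of statements slower than 1 second
SET auto_explain.log_nested_statements = on;  -- include statements run inside functions
SET auto_explain.log_analyze = on;            -- include actual row counts and timings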
If you can identify such an issue, you can force ANALYZE in between statements. While we're at it, for just a couple of SELECT / PERFORM statements you might as well use a simpler SQL function and avoid plan caching altogether (but see below!):
CREATE OR REPLACE FUNCTION process()
RETURNS text
LANGUAGE sql AS
$func$
SELECT function1();
ANALYZE some_substantially_affected_table;
SELECT function2();
SELECT function3();
ANALYZE some_other_table;
SELECT function4();
SELECT 'process ended'; -- only last result is returned
$func$;
Also, as long as we don't see the actual code of your called functions, there can be any number of other hidden effects.
Example: you could SET LOCAL ... some configuration parameter to improve the performance of your function1(). If the functions are called in separate transactions, that won't affect the rest; the effect only lasts until the end of the transaction. But if they are called in a single transaction, it affects the rest, too ...
Basics:
Difference between language sql and language plpgsql in PostgreSQL functions
PostgreSQL Stored Procedure Performance
Plus: transactions accumulate locks, which bind an increasing amount of resources and may cause increasing friction with concurrent processes. All locks are released at the end of a transaction. It's better to run big functions in separate transactions if at all possible, not wrapped in a single function (and thus a single transaction). That last item is related to what #klin and IMSoP already covered.
Warning for future readers (2015-05-30).
The technique described in the question is one of the smartest ways to effectively block the server.
In some corporations, using this technique may be rewarded with immediate termination of the employment contract.
Attempts to improve this method are useless. It is simple, beautiful and sufficiently effective.
In an RDBMS, transaction support is very expensive. When executing a transaction, the server must create and store information on all changes made to the database, both to make these changes visible to the environment (other concurrent processes) in case of successful completion and, in case of failure, to restore the state from before the transaction as soon as possible. Therefore a natural principle affecting server performance is to include in one transaction the minimum number of database operations, i.e. only as many as necessary.
A Postgres function is executed in one transaction. Placing in it many operations that could be run independently is a serious violation of the above rule.
The answer is simple: just do not do it. A function execution is not a mere execution of a script.
In the procedural languages used to write applications there are many other possibilities to simplify the code by using functions or scripts, and there is also the possibility to run scripts with a shell.
Using a Postgres function for this purpose would make sense if it were possible to use transactions within the function. At present no such possibility exists, although discussions on this issue already have a long history (you can read about it e.g. in the Postgres mailing lists).

Does PostgreSQL cache Prepared Statements like Oracle

I have just moved to PostgreSQL after having worked with Oracle for a few years.
I have been looking into some performance issues with prepared statements in the application (Java, JDBC) with the PostgreSQL database.
Oracle caches prepared statements in its SGA - the pool of prepared statements is shared across database connections.
PostgreSQL documentation does not seem to indicate this. Here's the snippet from the documentation (https://www.postgresql.org/docs/current/static/sql-prepare.html) -
Prepared statements only last for the duration of the current database
session. When the session ends, the prepared statement is forgotten,
so it must be recreated before being used again. This also means that
a single prepared statement cannot be used by multiple simultaneous
database clients; however, each client can create their own prepared
statement to use.
I just want to make sure that I am understanding this right, because it seems so basic for a database to implement some sort of common pool of commonly executed prepared statements.
If PostgreSQL does not cache these, that would mean every application that expects a lot of database transactions needs to develop some sort of prepared statement pool that can be re-used across connections.
If you have worked with PostgreSQL before, I would appreciate any insight into this.
Yes, your understanding is correct. Typically, if you had a set of prepared queries that critical, you'd have the application call a custom function to set them up on connection.
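To make that concrete, a small sketch of "set them up on connection", e.g. run from a pool's connection-init hook (the statement and table names are made up for this example):
PREPARE find_vehicle (bigint) AS
  SELECT * FROM vehicles WHERE id = $1;

EXECUTE find_vehicle(42);

-- Prepared statements are per-session; this shows what the current session holds:
SELECT name, statement FROM pg_prepared_statements;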
There are three key reasons for this afaik:
There's a long todo list, and items get done when a developer is interested in (or paid for) tackling them. Presumably no one has thought it worth funding yet, or come up with an efficient way of doing it.
PostgreSQL runs in a much wider range of environments than Oracle. I would guess that 99% of installed systems wouldn't see much benefit from this. There are an awful lot of setups without high transaction-performance requirements, or for that matter a DBA to notice whether it's needed or not.
Planned queries don't always provide a win. There's been considerable work done on delaying planning/invalidating caches to provide as good a fit as possible to the actual data and query parameters.
I'd suspect the best place to add something like this would be in one of the connection poolers (pgbouncer/pgpool), but last time I checked such a feature wasn't there.
HTH

Parallelize TSQL CLR Procedure

I'm trying to figure out how I can parallelize some procedural code to create records in a table.
Here's the situation (sorry I can't provide much in the way of actual code):
I have to predict when a vehicle service will be needed, based upon the previous service date, the current mileage, the planned daily mileage and the difference in mileage between each service.
All in all, it's very procedural: for each vehicle I need to take into account its history, its current servicing state, the daily mileage (which can change based on ranges defined in the mileage plan), and the sequence of servicing.
Currently I'm calculating all of this in PHP, and it takes about 20 seconds for 100 vehicles. Since this may in future be expanded to several thousand, 20 seconds is far too long.
So I decided to try and do it in a CLR stored procedure. At first I thought I'd try multithreading it; however, I quickly found out that is not easy to do inside the SQL Server host. I was recommended to let T-SQL work out the parallelization itself, yet I have no idea how. If it weren't for the fact that the code needs to create records, I could define it as a function and do:
SELECT dbo.PredictServices([FleetID]) FROM Vehicles
and T-SQL should figure out it can parallelize that, but I know of no alternative for procedures.
Is there anything I can do to parallelize this?
The recommendation you received is correct: you simply don't have the .NET Framework facilities for parallelism available in a CLR stored procedure. Also, please keep in mind that the niche for CLR stored procedures is rather narrow, and they can adversely impact SQL Server's performance and scalability.
If I understand the task correctly, you need to compute a function PredictServices for some records and store the results back in the database. In this case CLR stored procedures could be an option, provided PredictServices is just data access or a straightforward transformation of data. A better practice is to create a WWF (Windows Workflow Foundation) service to perform the computations and call it from PHP. In a Workflow Service you can implement any solution, including one involving parallelism.