PostgreSQL: how to simulate long running query [duplicate] - postgresql

To test my application I need a query to run for a long time (at least a couple of minutes). Any ideas on how to create this quickly?
The application needs to query the catalog to see a list of running queries.
My application uses PostgreSQL. I am fine with creating additional dummy tables if required.

This will run for 5 minutes:
select pg_sleep(5 * 60);
The parameter for pg_sleep() is the duration in seconds.
You can also sleep until a specific timestamp using pg_sleep_until().
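For example (the timestamp here is just an illustration):
select pg_sleep_until('2016-01-01 10:00:00');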
More details in the manual:
http://www.postgresql.org/docs/9.5/static/functions-datetime.html#FUNCTIONS-DATETIME-DELAY
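While the sleep is running, the session will show up in the list of running queries. A minimal check (the filter is illustrative):
select pid, state, query from pg_stat_activity where state = 'active';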

Related

Postgres: Count all INSERT queries executed in the past 1 minute

I can count all currently active INSERT queries on the PostgreSQL server like this:
SELECT count(*) FROM pg_stat_activity WHERE query LIKE 'INSERT%';
But is there a way to count all INSERT queries executed on the server in a given period of time, e.g. in the past minute?
I have a bunch of tables into which I send a lot of inserts, and I would like to aggregate how many rows I am inserting per minute. I could code a solution for this, but it would be much easier if this could be extracted directly from the server.
Any stats like this over a given period of time would be very helpful: the average time a query takes to process, the bandwidth going through per minute, etc.
Note: I am using PostgreSQL 12
If not already done, install the pg_stat_statements extension and take snapshots of the pg_stat_statements view: the diff between two snapshots gives the number of queries executed in that interval.
Note: it doesn't save each individual query; rather, it parameterizes them and saves the aggregated result.
See https://www.citusdata.com/blog/2019/02/08/the-most-useful-postgres-extension-pg-stat-statements/
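A minimal sketch of the snapshot approach (the snapshot table name is made up; pg_stat_statements must be listed in shared_preload_libraries, which requires a restart):
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- take a snapshot of the current counters
CREATE TABLE stmt_snapshot AS
SELECT now() AS taken_at, queryid, query, calls FROM pg_stat_statements;
-- some time later: count INSERT executions since the snapshot
SELECT sum(s.calls - snap.calls) AS inserts_since_snapshot
FROM pg_stat_statements s
JOIN stmt_snapshot snap USING (queryid)
WHERE s.query LIKE 'INSERT%';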
I believe that you can use an audit trigger.
This creates an audit table registering INSERT, UPDATE, and DELETE actions, which you can adapt. Every time your database runs one of those commands, the audit table records the action, the table, and the time of the action. It is then easy to COUNT() rows for the desired table with a WHERE clause covering the last minute.
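A minimal sketch of such an audit setup (table, function, and trigger names are made up; EXECUTE FUNCTION requires PostgreSQL 11+, use EXECUTE PROCEDURE on older versions):
-- the audit table
CREATE TABLE audit_log (
    action    text,
    tablename text,
    logged_at timestamptz NOT NULL DEFAULT now()
);
-- trigger function recording the operation and the table it ran on
CREATE OR REPLACE FUNCTION log_action() RETURNS trigger AS $$
BEGIN
    INSERT INTO audit_log (action, tablename) VALUES (TG_OP, TG_TABLE_NAME);
    RETURN NULL;  -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;
-- attach it to every table you want to track
CREATE TRIGGER my_table_audit
AFTER INSERT OR UPDATE OR DELETE ON my_table
FOR EACH ROW EXECUTE FUNCTION log_action();
-- count the INSERTs of the past minute
SELECT count(*) FROM audit_log
WHERE action = 'INSERT' AND logged_at >= now() - interval '1 minute';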
I couldn't come across anything solid, so I created a table where I log the number of insert transactions using a script that runs as a cron job. It was simple enough to implement, and I get real values rather than estimates: I actually count all new rows inserted into the tables in a given interval.

Multiple updates performance improvement

I have built an application with Spring Boot and JPA to migrate a Jira postgres database.
Basically, I have 5000 users that I need to migrate. Each user means 67 update queries in different tables.
Each query uses the LOWER function to compare ignoring case.
Some pseudo-code:
for (user : users) {
    for (query : queries) {
        jdbcTemplate.execute(query.replace(user....
I ignore any errors, so if a single query fails, I still go on and execute the other 66.
I am running this in 10 separate threads, and each user takes roughly 120 seconds to migrate (20 threads resulted in a database deadlock).
At this pace it is going to take more than a day, which is not acceptable (I am running this in a test environment before doing it in production).
The queries look like this:
UPDATE table SET column = 'NEWUSERNAME' where LOWER(column) = LOWER('CURRENTUSERNAME');
Is there anything I can do to try and optimize this migration?
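One standard fix for this pattern, sketched below with placeholder names: LOWER(column) on the left-hand side can only use an index if a matching expression index exists, so without one each of these UPDATEs may be doing a full table scan.
-- expression index so WHERE LOWER(column_name) = ... can use an index scan
CREATE INDEX idx_table_column_lower ON table_name (LOWER(column_name));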
UPDATE:
I changed my approach. First, I select every element matching CURRENTUSERNAME and get its ID. Then I create the UPDATE queries using the ID in the WHERE clause.
Other than that, it is still taking a long time (4+ hours) to execute.
I am running millions of UPDATEs, one at a time. I know jdbcTemplate has a bulk method, but if a single UPDATE fails, I believe it rolls back every successful update too. Also, I am not aware of the performance improvement it would bring, if any.
So, to update the question, given that I have millions of UPDATE queries to run, what would be the best way execute them? (bulk, multi threading, something else)
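One common way to cut down the round trips, as a sketch (table, column, and values are placeholders): fold a batch of single-row updates into one statement by joining against a VALUES list keyed on the IDs already collected.
UPDATE table_name AS t
SET column_name = v.new_username
FROM (VALUES
    (101, 'NEWUSERNAME1'),
    (102, 'NEWUSERNAME2')
) AS v(id, new_username)
WHERE t.id = v.id;
If each batch runs as its own transaction, a failure only rolls back that batch rather than the whole migration.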

Query execution slow

I am using a PostgreSQL database.
When I execute the following query, it runs very slowly:
select * from <table_name>;
Previously it took 5 ms, but now it has been running for around 15 minutes and is still not finished.
Can anyone suggest possible reasons?
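A sudden jump from milliseconds to minutes for a plain SELECT often means the query is blocked rather than slow. A first check (a sketch; the wait_event columns exist in PostgreSQL 9.6 and later) is to look for lock waits in pg_stat_activity:
SELECT pid, state, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE state <> 'idle';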

Running a Postgresql Query at a specific time

Scenario
I have a table which contains tasks that need to be completed by a specific datetime. If the task is not completed by this datetime (± a variable interval), a script runs to 'escalate' this task.
This variable interval can be as small as 2 seconds or as large as 2 years.
Thoughts so far
Running a cron job every second, either via pg_cron or similar, would technically let me check the database every second, but there is a lot of wasted processing and database overhead in that, and I'd rather not do it if possible.
Triggers can be fired on row insert/update/delete, so the worst-case scenario is an external script watching for these triggers being fired.
Question
Is there a way to schedule a query to run at a specific time, ideally within PostgreSQL itself rather than via a bash/cron script? I.e.:
at 2017-09-30 09:32:00 - select * from table where datetime <= now
Edit
As it came up in the comments, PGAgent is a possibility, and the scenario for it would be:
The task is created by the user in the application and the due date is set (e.g. 2017-09-28 13:00:00). The user has an interval before/after this due date at which the task is escalated (e.g. one hour before), so at 12:00:00 on 2017-09-28 I want PGAgent (or another option) to run my SQL script that does the escalation.
The script to escalate is already written and works; the date and time for this PGAgent job to run are already calculated by another script.
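For a purely in-database one-off, a sketch (it ties up a connection until the target time, and the table/column names are placeholders) would be to combine pg_sleep_until() from the first answer on this page with the escalation query:
select pg_sleep_until('2017-09-28 12:00:00');
select * from tasks where due_at <= now();  -- or run the escalation script here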

PG-POOL settings issue

I have pgpool installed on my server, but I am facing an issue while optimising it. I have two different queries running on the same table:
The first is an UPDATE query which runs every minute with an execution time of around 30 seconds, so effectively it is running every 30 seconds.
The second is a SELECT query.
The issue I am facing is that while the UPDATE query is running and I try to run the SELECT query, pgpool processes only the UPDATE query; until the UPDATE query has finished, it won't process my SELECT query.
As of now, I am using the pgpool default settings.
Any kind of feedback will be helpful.
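If the SELECTs are being serialized behind the UPDATE on a single backend connection, the first settings worth checking in pgpool.conf are load balancing and the pool size. A sketch (values are illustrative, and whether load balancing applies depends on the pgpool mode in use):
# pgpool.conf
load_balance_mode = on      # allow read-only queries to be balanced across backends
num_init_children = 32      # number of pooled connection slots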