PostgreSQL now() returning incorrect value

I have two PostgreSQL servers - one running locally on my Windows machine and one running on a beta Linux server.
I am running this command on both (at very close to the same time):
select current_setting('TIMEZONE'), now();
Local DB result:
"US/Pacific";"2015-10-09 12:29:51.053-07"
Beta DB result:
"US/Pacific";"2015-10-09 12:23:00.121953-07"
As you can see, the time zones are the same, but the times are not - the time on the local database is correct, but the time on the beta server appears to be about 6 minutes and 51 seconds slow.
EDIT based on answers:
Also, I checked the dates and times on both operating systems and both are correct.
The selects are not part of a big transaction. I am using pgAdmin to run just those statements. Also, I ran timeofday() alongside now(), and it returned exactly the same times as the now() calls.
Any idea how this is happening?

Are you sure you checked the current date/time on the OS on both machines? It looks like the clocks on them are not synchronized...
Also, are you running that SELECT inside a long-running transaction? The now() function always returns the time "frozen" at the beginning of the transaction. To get the running time inside a transaction, use the timeofday() function.
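For example, a quick sketch you can try in psql:
begin;
select now(), timeofday();
-- wait a few seconds, then run it again:
select now(), timeofday();
commit;
Inside the transaction now() stays frozen at the transaction start time, while timeofday() keeps moving.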

You checked the dates on both machines - but did you also check the times? The best way to keep the time in sync is using NTP across all your machines. Also if you are unsure about the transactions, you can use timeofday() to get the current system time.
SELECT timeofday();

PostgreSQL directly calls the underlying operating system's date and time routines to get the timestamp, adjusting only for time zone (if needed) and epoch - which is fixed.
Most likely your clocks are not in fact in sync between the two hosts. Set up network time sync. The problem will go away. If you're really, really sure they are, check the output of the date +%H:%M:%S command on the Unix system, and time /T on the Windows command line. Are they really the same?
If they're different in PostgreSQL but not as reported by the operating system and you have set up network time sync, please report a bug to pgsql-bugs or follow up here. But please be very sure they're truly the same, and do make sure network time sync is active and working.

Related

MongoDB cluster time too far from wall clock time

I am maintaining a Rocket.Chat server which relies on a MongoDB instance internally.
Recently the Linux machine's date was accidentally changed to 2022, and since I changed it back the mongo instance refuses to start, claiming that "New cluster time 1666775701 is too far from this nodes wall clock time 1603715107".
How do I set mongo's clock back to the correct time from the mongo shell?
tldr
Start your instance with maxAcceptableLogicalClockDriftSecs large enough to cover the period. From the error message it seems 63060594 should be enough, but be generous and give it e.g. 94608000 (3 years). Back up the data after the server spins up. If you can afford it, nuke the database and restore it from the backup. Restart mongo with the normal/default maxAcceptableLogicalClockDriftSecs.
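For example, the startup could look something like this (the dbpath is just a placeholder for your actual data directory):
mongod --dbpath /var/lib/mongodb --setParameter maxAcceptableLogicalClockDriftSecs=94608000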
explanations
Mongo's causal consistency model is based on a Lamport logical clock. It is used internally in the oplog, and all replication logic relies on it.
When you accidentally change the system date to the future, the timestamp advances accordingly. When you roll the system date back to NTP-based time, you need to reset the timestamp as well. The default maxAcceptableLogicalClockDriftSecs gives you 1 year, which is more than enough for normal operations.
Adjusting this value will let you start mongo and recover its state. It won't fix any dates in user space - that's totally your responsibility. Mongo knows nothing about the meaning of your dates from your application's perspective.
You don't need to start the rocket server with custom command-line parameters - start mongod manually, back up the data, nuke the db, restore the data, stop mongod, then start the rocket server.
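Roughly, the manual sequence might look like this (the dump directory is just an example; mongorestore --drop drops each collection before restoring it):
mongodump --out /backup/rocketchat
mongorestore --drop /backup/rocketchat
Afterwards stop mongod and start it again without the custom parameter.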

Same query runs usually in seconds but then sometimes does not complete after running for hours

Appreciate this question title is a bit vague for SO. Apologies for that but I'm at a complete loss about what to do.
I created my first postgres database about a month ago. It's on Azure and has 100GB of storage of which ~40% is currently used.
I have a query that I run via a cron job each morning. Yesterday morning the query ran in under a minute. This morning however it just seemed to time out. There's no significant difference in data volume between yesterday and today.
Resource utilization from the past 7 days:
Someone suggested it might be the vacuum, which I had never heard of before. From a SO post I found on viewing vacuum stats, I ran:
select
    schemaname, relname,
    last_vacuum, last_autovacuum,
    vacuum_count, autovacuum_count
from pg_stat_user_tables;
The vacuum has not run since February 2nd.
I have access to the Azure server overview dashboard, but I'm not sure what to look for, e.g. DBforPostgreSQL/servers/ourcompany-data/overview
There's an activity log, settings, and usage alerts, but I don't know where I should search for the culprit.
I also looked at active queries to see if there were any still running that I thought were cancelled:
select * from pg_stat_activity where state = 'active';
There are no other running queries.
I have not shared my query here because I don't want to divert feedback towards optimization tips. I'm aware that I could optimize it by e.g. replacing CTEs with temp tables, but I really want to understand the variance in completion time for the same query. Right now it feels like it's random.
The database is not shared and no other processes or users are on it.
I tried running the query on yesterday's data too - same thing, the query simply does not complete.
Is there any prescribed list of things to check here? Why might a select query that typically completes in a few seconds stop completing seemingly out of the blue?

High load-avg on Heroku Postgres

Some two weeks ago, I deployed some changes to my app (Flask + SQLAlchemy on top of Postgres) to Heroku. The response time of my dynos went up soon afterwards and timeouts in responses started. Before these problems started, the current version of the app had been running flawlessly for some 2-3 months.
Naturally, I suspected my changes in the app and went through them, but none were relevant to this (changes in the front end, plain-text emails replaced with HTML ones, minor changes in the static data the app uses).
I have a copy of the app for testing purposes, so I cloned the latest backup of the production DB and started investigating (the clone was some 45GiB, compared to 56GiB of the original, but this seems to be a normal consequence of "bloating").
It turns out that even trivial requests take a ridiculous amount of time on production, while they work on testing as they should. For example, select * from A where some_id in (three, int, values) takes under 0.5 sec on testing, and some 12-15 sec on prod (A has 3M records and some_id is a foreign key to a much smaller table). Even select count(*) from A takes the same amount of time, so it's not indexing or anything like that.
This is not tied to a specific query or even a table, which removes my suspicion of my own code, as most of it was unchanged for months and worked fine until these problems started.
Looking further into this, I found that the logs contain load averages for the DB server, and my production one is showing load-avg 22 (I searched for postgres load-avg in Papertrail), and it seems to be almost constant (slowly rising over prolonged periods of time).
I upgraded the production DB from Postgres 9.6 / Standard 2 plan (although my connection count was around 105/400 and the cache hit rate was 100%) to Postgres 10 / Standard 3 plan, but this didn't make the slightest improvement. This upgrade also meant some 30-60 min of downtime. Soon after bringing the app back up, the DB server's load was high (sadly, I didn't check during the downtime). Also, the DB server's load doesn't seem to have spikes that would reflect the app's usage (the app is mostly used in the USA and EU, and the usual app load reflects that).
At this point, I am out of ideas (apart from contacting Heroku's support, which a colleague of mine will do) and would appreciate any suggestions about what to look at or try next.
I ended up upgrading from standard-2 to standard-7 and my DB's load dropped to around 0.3-0.4. I don't have an explanation for why it started so suddenly.

PostgreSQL. Slow queries in log file are fast in psql

I have an application written on Play Framework 1.2.4 with Hibernate (default C3P0 connection pooling) and a PostgreSQL database (9.1).
Recently I turned on slow-query logging (>= 100 ms) in postgresql.conf and found some issues.
But when I tried to analyze and optimize one particular query, I found that it is blazing fast in psql (0.5-1 ms) compared to 200-250 ms in the log. The same thing happened with the other queries.
The application and database server are running on the same machine and communicate over the localhost interface.
JDBC driver - postgresql-9.0-801.jdbc4
I wonder what could be wrong, because the query duration in the log reflects only database processing time, excluding external things like network round trips etc.
Possibility 1: If the slow queries occur occasionally or in bursts, it could be checkpoint activity. Enable checkpoint logging (log_checkpoints = on), make sure the log level (log_min_messages) is 'info' or lower, and see what turns up. Checkpoints that are taking a long time or happening too often suggest you probably need some checkpoint/WAL and bgwriter tuning. This isn't likely to be the cause if the same statements are always slow and others always perform well.
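For example, a sketch of the relevant postgresql.conf settings (reload the server configuration afterwards):
log_checkpoints = on
log_min_messages = info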
Possibility 2: Your query plans are different because you're running them directly in psql while Hibernate, via PgJDBC, will at least sometimes be doing a PREPARE and EXECUTE (at the protocol level so you won't see actual statements). For this, compare query performance with PREPARE test_query(...) AS SELECT ... then EXPLAIN ANALYZE EXECUTE test_query(...). The parameters in the PREPARE are type names for the positional parameters ($1,$2,etc); the parameters in the EXECUTE are values.
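Something like this, with a made-up table and parameter just to show the shape of it:
PREPARE test_query(integer) AS
    SELECT * FROM orders WHERE customer_id = $1;  -- table and column are placeholders
EXPLAIN ANALYZE EXECUTE test_query(42);
If the plan from EXECUTE is much worse than the plan you get from a plain EXPLAIN ANALYZE on the same SELECT with the value inlined, that's your culprit.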
If the prepared plan is different from the one-off plan, you can set PgJDBC's prepare threshold via connection parameters to tell it never to use server-side prepared statements.
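If I recall the driver option correctly, that can go straight into the JDBC URL, something like this (host and database name are placeholders; check the PgJDBC documentation for your driver version):
jdbc:postgresql://localhost:5432/mydb?prepareThreshold=0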
This difference between the plans of prepared and unprepared statements should go away in PostgreSQL 9.2. It's been a long-standing wart, but Tom Lane dealt with it for the upcoming release.
It's very hard to say for sure without knowing all the details of your system, but I can think of a couple of possibilities:
The query results are cached. If you run the same query twice in a short space of time, it will almost always complete much more quickly on the second pass. PostgreSQL maintains a cache of recently retrieved data for just this purpose. If you are pulling the queries from the tail of your log and executing them immediately this could be what's happening.
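One way to see whether the cache is involved (the table name here is just an example) is to look at the buffer counters:
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM my_table WHERE id = 42;
Blocks counted as "shared hit" came from PostgreSQL's buffer cache; blocks counted as "read" came from disk (or the OS cache).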
Other processes are interfering. The execution time for a query varies depending on what else is going on in the system. If the queries are taking 100ms during peak hour on your website when a lot of users are connected but only 1ms when you try them again late at night this could be what's happening.
The point is you are correct that the query duration isn't affected by which library or application is calling it, so the difference must be coming from something else. Keep looking, good luck!
There are several possible reasons. First, if the database was very busy when the slow queries executed, the queries may be slower. So you may need to observe the OS load at that moment for further analysis.
Second, the historical plan of the SQL may be different from the current session's plan. So you may need to install auto_explain to see the actual plan of the slow query.
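A minimal auto_explain setup in postgresql.conf might look like this (the threshold is only an example, shared_preload_libraries needs a server restart, and on 9.1 you may also need custom_variable_classes = 'auto_explain' - check the docs for your version):
shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = '100ms'
auto_explain.log_analyze = on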

SQL Server & ADO.NET: how to automatically cancel a long-running user query?

I have a .NET Core 2.1 application that allows users to search a large database, with the possibility of using lots of parameters. The data access is done through ADO.NET. Some of the generated queries result in long-running queries (several hours). Obviously, the user gives up on waiting, but the query chugs along in SQL Server.
I realize that the root cause is the design of the app, but I would like a quick solution for now, if possible.
I have tried many solutions, but none seem to work as expected.
What I have tried:
CommandTimeout
CommandTimeout works as expected with ExecuteNonQuery but does not work with ExecuteReader, as discussed in this forum
When you execute command.ExecuteReader(), you don't get this exception because the server responds on time. The application doesn't respond because it reads data into memory, and the ExecuteReader() method doesn't return control until all the data is read.
I have also tried using SqlDataAdapter, but this does not work either.
SQL Server query governor
SQL Server's query governor works off of the estimated execution plan, and while it does work sometimes, it does not always catch inefficient queries.
SQL Server execution time-out
Tools > Options > Query Execution > SQL Server > General
I'm not sure what this does, but after entering a value of 1, SQL Server still allows queries to run as long as they need. I tried restarting the server instance, but that did not make any difference.
Again, I realize that the cause of this problem is the way that the queries are generated, but with so many parameters and so much data, fine tuning a solution in the design of the application may take some time. As of now, we are manually killing any spid associated with this app that has run over 10 or so minutes.
EDIT:
I abandoned the hope of finding a simple solution. If you're having a similar issue, here is what we did to address it:
We created a .NET Core console app that polls the database for queries running over a certain allotted time. The app looks at the login name and how long the query has been running and determines whether to kill the process.
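For anyone wanting to build something similar, the kind of check such an app can run looks roughly like this (the login name and threshold here are placeholders, not our actual values); any session it returns can then be killed with KILL <session_id>:
SELECT r.session_id, s.login_name, r.start_time,
       DATEDIFF(MINUTE, r.start_time, GETDATE()) AS running_minutes
FROM sys.dm_exec_requests r
JOIN sys.dm_exec_sessions s ON s.session_id = r.session_id
WHERE s.login_name = 'app_search_login'
  AND DATEDIFF(MINUTE, r.start_time, GETDATE()) > 10;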
https://learn.microsoft.com/en-us/dotnet/api/system.data.sqlclient.sqlcommand.cancel?view=netframework-4.7.2
Looking through the documentation on SqlCommand.Cancel, I think it might solve your issue.
If you were to create and start a Timer before you call ExecuteReader(), you could then keep track of how long the query is running, and eventually call the Cancel method yourself.
(Note: I wanted to add this as a comment but I don't have the reputation to be allowed to yet)