Is there a PostgreSQL equivalent of SQL Server profiler? - postgresql

I need to see the queries submitted to a PostgreSQL server. Normally I would use SQL Server Profiler to do this in SQL Server land, but I have yet to find out how to do it in PostgreSQL. There appear to be quite a few paid tools; I am hoping there is an open source alternative.

You can use the log_statement config setting to get a list of all the queries sent to a server:
https://www.postgresql.org/docs/current/static/runtime-config-logging.html#guc-log-statement
Just set that and the log file path, and you'll have the list. You can also log only long-running queries by setting log_min_duration_statement.
You can then take those queries and run EXPLAIN on them to find out what's going on with them.
https://www.postgresql.org/docs/9.2/static/using-explain.html

Adding to Joshua's answer: to see which queries are currently running, simply issue the following statement at any time (e.g. in pgAdmin III's query window):
SELECT datname,procpid,current_query FROM pg_stat_activity;
Sample output:
datname | procpid | current_query
---------------+---------+---------------
mydatabaseabc | 2587 | <IDLE>
anotherdb | 15726 | SELECT * FROM users WHERE id=123 ;
mydatabaseabc | 15851 | <IDLE>
(3 rows)
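Note that the column names above only apply to older servers: as of PostgreSQL 9.2, procpid was renamed to pid and current_query to query, and a separate state column was added. A minimal sketch of picking the right statement, assuming you already have the integer from SHOW server_version_num (the helper name activity_query is made up for illustration):

```python
def activity_query(server_version_num: int) -> str:
    """Return a pg_stat_activity query suited to the given server version.

    server_version_num is the integer reported by SHOW server_version_num,
    e.g. 90105 for 9.1.5 or 130004 for 13.4.
    """
    if server_version_num >= 90200:
        # 9.2 and later: procpid -> pid, current_query -> query, new state column
        return "SELECT datname, pid, state, query FROM pg_stat_activity;"
    # Before 9.2: the column names used in the answer above
    return "SELECT datname, procpid, current_query FROM pg_stat_activity;"


print(activity_query(90105))
print(activity_query(130004))
```

On 9.2 and later, idle sessions are reported through the state column rather than as <IDLE> in the query text.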

I discovered pgBadger (https://pgbadger.darold.net/), a fantastic tool that has saved my life many times. Here is an example of a report.
If you open it and go to the 'top' menu, you can see the slowest queries and the most time-consuming queries.
From there you can drill into the details and see nice graphs that show the queries by hour, and the detail button shows the SQL text nicely formatted. The tool is free and does the job perfectly.

I need to see the queries submitted to a PostgreSQL server
As an option, if you use pgAdmin (my screenshot shows pgAdmin 4 v2.1), you can observe queries via the "Dashboard" tab.
Update, June 2022: answering the questions from the comments.
Question 1: My long SQL query gets truncated, is there any workaround?
Follow the steps below:
1. Close pgAdmin.
2. Find the postgresql.conf file. On my computer it is located at c:\Program Files\PostgreSQL\13\data\postgresql.conf. If you can't find it, running SHOW config_file; in psql prints its location.
3. Open postgresql.conf and find the setting called track_activity_query_size. The default value is 1024, which means any query text longer than 1024 bytes is truncated. Uncomment the setting and give it a larger value, for example:
track_activity_query_size = 32768
4. Restart the PostgreSQL service on your computer.
P.S. Now everything is ready, but keep in mind that this change can slightly decrease performance. From a development/debugging standpoint you won't see any difference, but don't forget to revert the setting in a 'production' environment. For more details read this article.
Question 2: I ran the function/method that triggers the SQL query but I still can't see it in pgAdmin, or sometimes I see it but it runs so quickly that I can't even expand the session on the 'Dashboard' tab.
Answer: Try running your application in 'debug' mode and set a breakpoint right before you close the connection to the database. Then, while you are debugging, click the 'refresh' button on the 'Dashboard' tab in pgAdmin.

You can use the pg_stat_statements extension.
If you run the database in Docker, just add this command to docker-compose.yml; otherwise look at the installation instructions for your setup:
command: postgres -c shared_preload_libraries=pg_stat_statements -c pg_stat_statements.track=all -c max_connections=200
And then in the db run this query:
CREATE EXTENSION pg_stat_statements;
Now, to see the statements with the highest average execution time, run:
SELECT * FROM pg_stat_statements ORDER BY total_time/calls DESC LIMIT 10;
Or play with other queries over that view to find what you are looking for.
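One caveat: in PostgreSQL 13 the total_time column of pg_stat_statements was renamed to total_exec_time, so the query above fails on newer servers. Here is a rough sketch of building the statement for either case, assuming you know the value of SHOW server_version_num (the function name and the avg_ms alias are my own, not part of the extension):

```python
def slowest_statements_sql(server_version_num: int, limit: int = 10) -> str:
    """Build a 'slowest statements on average' query for pg_stat_statements.

    server_version_num is the integer from SHOW server_version_num
    (e.g. 120008 for 12.8, 130004 for 13.4).
    """
    # PostgreSQL 13 renamed total_time to total_exec_time (both in milliseconds)
    time_col = "total_exec_time" if server_version_num >= 130000 else "total_time"
    return (
        # NULLIF guards against a division by zero for entries with calls = 0
        f"SELECT query, calls, {time_col} / NULLIF(calls, 0) AS avg_ms "
        f"FROM pg_stat_statements "
        f"ORDER BY avg_ms DESC NULLS LAST LIMIT {int(limit)};"
    )


print(slowest_statements_sql(120008))
print(slowest_statements_sql(130004, limit=5))
```

Selecting only query, calls and the average keeps the output readable; add other columns from the view as needed.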

All those tools like pgBadger or pg_stat_statements require access to the server and/or altering the server settings or server log settings, which is not such a good idea, especially if it requires a server restart, and because logging slows everything down, including production use.
In addition to that, extensions such as pg_stat_statements don't really show the queries, let alone in chronological order, and pg_stat_activity doesn't show you anything that isn't running right now, while it does show running queries from users other than you.
Instead of running any such crap, you can add a TCP-proxy in between your application and PostgreSQL-server.
Then your TCP-proxy reads all the sql-query-statements from what goes over the wire from your application to the server, and outputs it to console (or wherever). Also it forwards everything to PostgreSQL and returns the answer(s) to your application.
This way, you don't need to stop/start/restart your db-server, you don't need admin/root rights !ON THE DB-SERVER! to change the config file, and you don't need any access to the db-server. All you need to do is change the db connection string in your application (e.g. in your dev-environment) to point to the proxy server instead of the sql-server (the proxy-server then needs to point to the sql-server). Then you can see (in chronological order) what your <insert_profanity_here> application does on the database, and other people's queries don't show up (which makes it even better than sql-server-profiler). [Of course, you can also see what other people do if you put it on the db server on the old db port and assign the db a new port.]
I have implemented this with pg_proxy_net
(runs on Windows, Linux and Mac and doesn't require OS-dependencies, as it is .NET-Core-self-contained-deployment).
That way, you get approximately "the same" as you get with sql-server profiler.
Wait, if you aren't disturbed by other people's queries, what you get with pg_proxy_net is actually better than what you get with sql-server profiler.
Also, on github, I have a command-line MS-SQL-Server profiler that works on Linux/Mac.
And a GUI MS-SQL-Express-Profiler for Windows.
The funny thing is, once you have written one such tool, writing some more is just a piece of cake and done in under a day.
Also, if you want to get pg_stat_statements to work, you need to alter the config file (postgresql.conf), adding tracking and preloading libraries, and then restart the server:
CREATE EXTENSION pg_stat_statements;
-- D:\Programme\LessPortableApps\SQL_PostGreSQL\PostgreSQLPortable\Data\data\postgresql.conf
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = all
You find the documentation for the PostgreSQL protocol here:
https://www.postgresql.org/docs/current/protocol-overview.html
You can see how the data is written into the TCP-buffer by looking at the source code of a postgresql-client, e.g. FrontendMessages of Npgsql on github:
https://github.com/npgsql/npgsql/blob/main/src/Npgsql/Internal/NpgsqlConnector.FrontendMessages.cs
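To give an idea of what such a proxy has to do with the bytes it forwards, here is a minimal sketch (not the pg_proxy_net code) of pulling the SQL text out of the client-to-server stream. It assumes the buffer starts after the startup/authentication exchange, that the connection is not TLS-encrypted, and that the client encoding is UTF-8; the function name extract_queries is made up for illustration. Per the protocol docs, each regular frontend message is one type byte followed by a big-endian Int32 length that counts itself but not the type byte.

```python
import struct


def extract_queries(buf: bytes):
    """Return (sql_texts, unconsumed_bytes) from a client-to-server byte stream.

    Only simple Query ('Q') and extended-protocol Parse ('P') messages carry
    SQL text; every other message type is skipped.
    """
    queries = []
    pos = 0
    while len(buf) - pos >= 5:
        msg_type = buf[pos:pos + 1]
        (length,) = struct.unpack("!I", buf[pos + 1:pos + 5])
        if length < 4:
            raise ValueError("malformed message length")
        end = pos + 1 + length
        if end > len(buf):
            break  # incomplete message: keep the bytes until more data arrives
        payload = buf[pos + 5:end]
        if msg_type == b"Q":
            # Query: a single null-terminated SQL string
            sql = payload.split(b"\x00", 1)[0]
            queries.append(sql.decode("utf-8", "replace"))
        elif msg_type == b"P":
            # Parse: statement name, then the SQL string, both null-terminated
            _name, _, rest = payload.partition(b"\x00")
            sql = rest.split(b"\x00", 1)[0]
            queries.append(sql.decode("utf-8", "replace"))
        pos = end
    return queries, buf[pos:]
```

A real proxy would call this on the data read from the client socket, print the returned statements, prepend the unconsumed bytes to the next read, and forward the original bytes unchanged to the server.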
Also, just in case you have a .NET application (with source code) that uses Npgsql, you might want to have a look at Npgsql.OpenTelemetry.
PS:
To configure the logs, see ChartIO Tutorial and TablePlus.
Cheers !
Happy "profiling" !

Related

How to run batch "sql" using Parse server or directly on MongoDB?

I am going to use SQL terminology because I am new to Parse, apologies if that is confusing. I have a table in an app and in order to introduce new functionality I need to add a new column and set all the records to a default value. In SQL I would just run
update <table> set <column> = <value>;
Parse Server has MongoDB as the back end and I am not clear whether the correct approach is to directly access the MongoDB and run statements through the command line or if this would cause an issue with Parse. I found this helpful link for translating SQL syntax to MongoDB for that, https://docs.mongodb.com/manual/reference/sql-comparison/.
I also noticed that there were some tools, such as studio 3t, but the ones I saw all required expensive licenses. If direct MongoDB access is OK, any help understanding how to get to that would be helpful, I installed parse-server from the Bitnami stack on the AWS marketplace and to date I have only been interacting with it through the provided dashboard which doesn't have an "Update all records" option.
Right now my workaround is to write a Swift script that runs the update in a loop, but I have to think that if I had millions of records instead of thousands this would be the wrong approach. What is the proper environment and code to update my existing Parse server so that I can run something like the SQL above?

Is there any way to trace and debug a query in Postgresql?

Is there any way/tool to trace and debug a query in PostgreSQL 9.3.18?
I'm a SQL programmer and sometimes I need to trace and debug my queries and see the values of different fields at execution time. I've Googled this but didn't get any relevant result.
Any idea would be appreciated
pgAdmin (a database GUI that is sometimes bundled with PostgreSQL) includes a step-through debugger for calls to PostgreSQL database functions (as opposed to every query that goes to the server).
https://www.pgadmin.org/docs/pgadmin4/4.29/debugger.html
Before using it you have to enable it as a plugin/library in PG Admin.
The debugger steps from statement to statement, so a complex single statement will sometimes execute without letting you step through its details. Still, if you need a basic step-through of a longer multi-statement function, or want to see variable values at certain points, it can be useful. Note that this debugger applies to database functions and not to general queries.

Could I script my monthly postgres maintenance?

I have to perform a monthly maintenance to a postgres database.
I PuTTY into the system, navigate to the database and then run 3 commands on 40 different tables:
CLUSTER [table1] USING [primarykey];
ANALYZE [table1];
REINDEX TABLE [table1];
I have to wait for each command to finish executing before I can run the next one (i.e. CLUSTER, wait up to a few minutes, ANALYZE, wait, REINDEX, wait).
It's very simple to do but it takes around 30-45 minutes of me just copying and pasting 120 lines, one line at a time... is there any way to automate this process?
I have zero experience with scripting and I know very little about postgreSQL.
My question is somewhat unique because I cannot install anything in the postgreSQL database. I want to have this script localized on my computer and then be able to run it when it's time for the maintenance.
Clustering automatically reindexes the table. There is no reason to reindex the table immediately after you cluster it.
Do you actually need to do this stuff? Do you have evidence that your tables are in need of clustering? Or are you just assuming they do because of something you read on the internet, referring to a decade-old version of PostgreSQL and written by someone who didn't know what they were talking about in the first place? It is possible you really would benefit from this. It is even more possible you wouldn't, and it is just a waste of time.
If you know nothing about scripting, then you need to learn something about scripting. You should probably tag your post as being about scripting, in whichever shell/language you would like to use.
At the core, all you have to do is write a series of commands to be executed from the command line, and shove them into a text file. The easiest way is probably to install psql on your local computer, if it is not already there.
psql -c 'cluster foobar' -h thehost.example.com
psql -c 'analyze foobar' -h thehost.example.com
You might need to do some configuration to make this connection work with whatever authentication method you have in place, but without knowing which authentication method that is I can't comment further.
If the cluster for some reason fails, there is little reason to proceed to try to analyze it. (But there is also little harm in doing so). If you want to fine tune this situation, there are a variety of ways to do it, depending on which shell you are writing your script for, and what you want it to do.
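As a rough sketch of what such a script could look like in Python (my choice of language here, any shell would do): it builds one psql call per statement, drops the redundant REINDEX, and stops at the first failure. The table-to-index mapping, the index name foobar_pkey and the function names are placeholders I made up, and it assumes psql is on your PATH and can authenticate without prompting.

```python
import subprocess


def build_maintenance_commands(tables, host):
    """Return a list of psql argument lists, two per table.

    tables maps each table name to the index it should be clustered on.
    Names are inserted as-is, so only use names you trust.
    """
    commands = []
    for table, index in tables.items():
        # CLUSTER rebuilds the indexes itself, so no REINDEX afterwards
        for sql in (f"CLUSTER {table} USING {index}", f"ANALYZE {table}"):
            commands.append(["psql", "-h", host, "-c", sql])
    return commands


def run_all(commands):
    """Run the commands one after another, waiting for each to finish."""
    for argv in commands:
        # check=True raises CalledProcessError if psql exits with a
        # non-zero status, so nothing after the failing command is run
        subprocess.run(argv, check=True)


commands = build_maintenance_commands({"foobar": "foobar_pkey"}, "thehost.example.com")
for argv in commands:
    print(" ".join(argv))
```

Once the printed commands look right, calling run_all(commands) executes them for real; extend the dictionary to cover all 40 tables.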

How to find Who did some query on the database and when?

I have a SQL Server 2008 R2 with a database in it.
How can I find a certain query that was executed, and from what IP?
I have tried going through the transaction logs, but I can't understand anything there.
You should use SQL Server Profiler. It's usually installed by default - look in the SQL Server folder on the Start Menu. When you open it, start a new trace and select the database. In the Trace Properties dialog choose the TSQL template. This will then record all the queries running on the database, along with a whole lot of other stuff. It's not massively easy to track stuff down in here, but look for the BatchStarting events to find the SQL that gets run. Then you should run the procedure sp_who2 on the database so you can match up SPIDs in the profiler to logins.

Entity Framework 4.2 exec sp_executesql does not use indexes (parameter sniffing)

I'm encountering some major performance problems with simple SQL queries generated by the Entity Framework (4.2) running against SQL Server 2008 R2. In some situations (but not all), EF uses the following syntax:
exec sp_executesql 'DYNAMIC-SQL-QUERY-HERE', @param1...
In other situations it simply executes the raw SQL with the provided parameters baked into the query. The problem I'm encountering is that queries executed with sp_executesql are ignoring all indexes on my target tables, resulting in an extremely poorly performing query (confirmed by examining the execution plan in SSMS).
After a bit of research, it sounds like the issue might be caused by 'parameter sniffing'. If I append the OPTION(RECOMPILE) query hint like so:
exec sp_executesql 'DYNAMIC-SQL-QUERY-HERE OPTION(RECOMPILE)', @param1...
The indexes on the target tables are used and the query executes extremely quickly. I've also tried toggling on the trace flag used to disable parameter sniffing (4136) on the database instance (http://support.microsoft.com/kb/980653), however this didn't appear to have any effect whatsoever.
This leaves me with a few questions:
Is there any way to append the OPTION(RECOMPILE) query hint to the SQL generated by Entity Framework?
Is there any way to prevent Entity Framework from using exec sp_executesql, and instead simply run the raw SQL?
Is anyone else running into this problem? Any other hints/tips?
Additional Information:
I did restart the database instance through SSMS, however, I will try restarting the service from the service management console.
Parameterization is set to SIMPLE (is_parameterization_forced: 0)
Optimize for adhoc workloads has the following settings
value: 0
minimum: 0
maximum: 1
value_in_use: 0
is_dynamic: 1
is_advanced: 1
I should also mention that restarting the SQL Server service via the service management console AFTER enabling trace flag 4136 with the script below appears to actually clear the trace flag... perhaps I should be doing this a different way...
DBCC TRACEON(4136,-1)
tl;dr
update statistics
We had a delete query with one parameter (the primary key) that took ~7 seconds to complete when called through EF and sp_executesql. Running the query manually, with the parameter embedded in the first argument to sp_executesql, made the query run quickly (~0.2 seconds). Adding option (recompile) also worked. Of course, those two workarounds aren't available to us since we're using EF.
Probably due to cascading foreign key constraints, the execution plan for the long running query was, uhmm..., huge. When I looked at the execution plan in SSMS I noticed that the arrows between the different steps in some cases were wider than others, possibly indicating that SQL Server had trouble making the right decisions. That led me to thinking about statistics. I looked at the steps in the execution plan to see what table was involved in the suspect steps. Then I ran update statistics Table for that table. Then I re-ran the bad query. And I re-ran it again. And again just to make sure. It worked. Our perf was back to normal. (Still somewhat worse than non-sp_executesql performance, but hey!)
It turned out that this was only a problem in our development environment. (And it was a big problem because it made our integration tests take forever.) In our production environment, we had a job running that updated all statistics on a regular basis.
At this point I would recommend:
Set the optimize for ad hoc workloads setting to true.
EXEC sp_configure 'show advanced', 1;
GO
RECONFIGURE WITH OVERRIDE;
GO
EXEC sp_configure 'optimize for ad hoc', 1;
GO
RECONFIGURE WITH OVERRIDE;
GO
EXEC sp_configure 'show advanced', 0;
GO
RECONFIGURE WITH OVERRIDE;
GO
If after some time this setting doesn't seem to have helped, only then would I try the additional support of the trace flag. These are usually reserved as a last resort. Set the trace flag using the command line via SQL Server Configuration Manager, as opposed to in a query window and using the global flag. See http://msdn.microsoft.com/en-us/library/ms187329.aspx