How can I log `PREPARE` statements in PostgreSQL?

I'm using a database tool (Elixir's Ecto) which uses prepared statements for most PostgreSQL queries. I want to see exactly how and when it does that.
I found the correct Postgres config file, postgresql.conf, by running SHOW config_file; in psql. I edited it to include
log_statement = 'all'
as Dogbert suggested here.
According to the PostgreSQL 9.6 docs, this setting should cause PREPARE statements to be logged.
After restarting PostgreSQL, I can tail -f its log file (the one specified by the -r flag when running the postgres executable) and I do see entries like this:
LOG: execute ecto_728: SELECT i0."id", i0."store_id", i0."title", i0."description"
FROM "items" AS i0 WHERE (i0."description" LIKE $1)
DETAIL: parameters: $1 = '%foo%'
This corresponds to a query (albeit done over the binary protocol, I think) like
EXECUTE ecto_728('%foo%');
However, I don't see the original PREPARE that created ecto_728.
I tried dropping and recreating the database. Afterwards, the same query is executed as ecto_578, so it seems that the original prepared statement was dropped with the database and a new one was created.
But when I search the PostgreSQL log for ecto_578, I only see it being executed, not created.
How can I see PREPARE statements in the PostgreSQL log?

As you mentioned, your queries are being prepared via the extended query protocol, which is distinct from a PREPARE statement. And according to the docs for log_statement:
For clients using extended query protocol, logging occurs when an
Execute message is received
(Which is to say that logging does not occur when a Parse or Bind message is received.)
However, if you set log_min_duration_statement = 0, then:
For clients using extended query protocol, durations of the Parse,
Bind, and Execute steps are logged independently
Enabling both of these settings together will give you two log entries per Execute (one from log_statement when the message is received, and another from log_min_duration_statement once execution is complete).
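If editing postgresql.conf by hand is inconvenient, here is a minimal sketch of enabling both settings from psql instead (assumes superuser access; ALTER SYSTEM writes to postgresql.auto.conf, and both settings take effect on a configuration reload):
-- Enable statement logging and per-step duration logging (superuser)
ALTER SYSTEM SET log_statement = 'all';
ALTER SYSTEM SET log_min_duration_statement = 0;
SELECT pg_reload_conf();  -- apply without a full server restart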

Nick's answer was correct; I'm just answering to add what I learned by trying it.
First, I was able to see three separate actions in the log: a parse to create the prepared statement, a bind to give the parameters for it, and an execute to have the database actually carry it out and return results. This is described in the PostgreSQL docs for the "extended query protocol".
LOG: duration: 0.170 ms parse ecto_918: SELECT i0."id", i0."store_id",
i0."title", i0."description"
FROM "items" AS i0 WHERE (i0."description" LIKE $1)
LOG: duration: 0.094 ms bind ecto_918: SELECT i0."id", i0."store_id",
i0."title", i0."description"
FROM "items" AS i0 WHERE (i0."description" LIKE $1)
DETAIL: parameters: $1 = '%priceless%'
LOG: execute ecto_918: SELECT i0."id", i0."store_id",
i0."title", i0."description"
FROM "items" AS i0 WHERE (i0."description" LIKE $1)
DETAIL: parameters: $1 = '%priceless%'
This output was generated during a run of some automated tests. On subsequent runs, I saw the same query with a different name, e.g. ecto_1573. This was without dropping the database or even restarting the PostgreSQL process. The docs say that
If successfully created, a named prepared-statement object lasts till
the end of the current session, unless explicitly destroyed.
So these statements have to be recreated for each session, and my test suite presumably opens a new session on each run.
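A quick way to see this session scoping for yourself is the pg_prepared_statements view; a small sketch in psql (the statement name stmt is made up for the demo):
-- Prepared statements are private to, and live only as long as, the session
PREPARE stmt(int) AS SELECT $1 + 1;
SELECT name, statement, prepare_time
FROM pg_prepared_statements;  -- shows one row, for "stmt"
-- Reconnect (new session) and run the SELECT again: no rows,
-- because the prepared statement died with the old session.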

Related

how to get full statement in PostgreSQL

I wonder how to get the full query statement in PostgreSQL. When I set log_statement = 'all' in data/postgresql.conf, I can track the query records in the log/ directory, but the query is logged as:
LOG: Select * from table where id = $1
DETAIL: parameters: $1 = 55
When there are only a few parameters, that is clear enough. But if a query has many parameters, reconstructing it gets frustrating. Is there some setting in PostgreSQL I can use to directly get the full statement, such as
Select * from table where id = 55
No, there is no way to get that in the log, because that is not what arrives at the server. However, it should be easy to write a Perl (or other) script that reformats a log like that.

How to detect if Postgresql function utilizing index on the tables or not

I have created a PL/pgSQL table-returning function that executes a SELECT statement and uses the input parameter in the WHERE clause of the query.
I build the statement dynamically and execute it like this: EXECUTE sqlStmt USING empID;
sqlStmt is a variable of data type text that holds the SELECT query, which joins 3 tables.
When I execute that query in pgAdmin and analyze it, I can see that index scans on the tables are used as expected. However, when I run EXPLAIN ANALYZE SELECT * from fn_getDetails(12), the output just says "Function Scan".
How do I know whether the table indexes are used? Other SO answers suggesting the auto_explain module did not show details of the statements in the function body, and I am unable to use PREPARE inside my function body.
The direct SELECT statement takes almost the same time as the function call, within a couple of milliseconds, but how can I know whether the index was used?
auto_explain will certainly provide the requested information.
Set the following parameters:
shared_preload_libraries = 'auto_explain' # requires a restart
auto_explain.log_min_duration = 0 # log all statements
auto_explain.log_nested_statements = on # log statements in functions too
The last parameter is required for tracking SQL statements inside functions.
To activate the module, you need to restart the database.
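If a restart is not convenient for a one-off check, the module can also be loaded into a single session by a superuser; a minimal sketch:
-- Per-session alternative to shared_preload_libraries (superuser only)
LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;        -- log the plan of every statement
SET auto_explain.log_nested_statements = on;  -- include statements inside functions
SELECT * FROM fn_getDetails(12);  -- the plans of the function's internal
                                  -- statements now appear in the server log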
Of course, testing whether the index is used in a query on a small table won't give you a reliable result. You need roughly as much test data as you expect to have in reality.

How to log queries inside PLPGSQL with inlined parameter values

When a statement in my PL/pgSQL function (Postgres 9.6) is run, I see the query on one line and all the parameters on another line, i.e. two-line logging. Something like:
LOG: execute <unnamed>: SELECT * FROM table WHERE field1=$1 AND field2=$2 ...
DETAIL: parameters: $1 = '-767197682', $2 = '234324' ....
Is it possible to log the entire query in pg_log WITH the parameters already replaced in the query and log it in a SINGLE line?
Because this would make it much easier to copy/paste the query to reproduce it in another terminal, especially if queries have dozens of parameters.
The reason behind this: PL/pgSQL treats SQL statements as prepared statements internally.
First: With default settings, there is no logging of SQL statements inside PL/pgSQL functions at all. Are you using auto_explain?
Postgres query plan of a UDF invocation written in pgpsql
For the first couple of invocations in the same session, the SPI manager (Server Programming Interface) generates a fresh execution plan based on actual parameter values. Any kind of logging should report parameter values inline.
Postgres keeps track, and after a couple of invocations in the current session, if execution plans don't seem sensitive to actual parameter values, it starts reusing a generic, cached plan. Then you will see the generic plan of a prepared statement with $n parameters (like in the question).
Details in the chapter "Plan Caching" in the manual.
You can observe the effect with a simple demo. In the same session (not necessarily same transaction):
CREATE TEMP TABLE tbl AS
SELECT id FROM generate_series(1, 100) id;
PREPARE prep1(int) AS
SELECT min(id) FROM tbl WHERE id > $1;
EXPLAIN EXECUTE prep1(3); -- 1st execution
You'll see the actual value:
Filter: (id > 3)
EXECUTE prep1(1); -- several more executions
EXECUTE prep1(2);
EXECUTE prep1(3);
EXECUTE prep1(4);
EXECUTE prep1(5);
EXPLAIN EXECUTE prep1(3);
Now you'll see a $n parameter:
Filter: (id > $1)
So you can get the query with parameter values inlined on the first couple of invocations in the current session.
Or you can use dynamic SQL with EXECUTE, because, per documentation:
Also, there is no plan caching for commands executed via EXECUTE.
Instead, the command is always planned each time the statement is run.
Thus the command string can be dynamically created within the function
to perform actions on different tables and columns.
That can actually affect performance, of course.
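To illustrate the dynamic-SQL route, here is a sketch (the function f_min_id is hypothetical, written against the tbl table from the demo above) that inlines the value with format() and %L, so the statement text itself carries the literal rather than a $n parameter:
CREATE OR REPLACE FUNCTION f_min_id(_min int)
  RETURNS int
  LANGUAGE plpgsql AS
$func$
DECLARE
   _result int;
BEGIN
   -- %L quotes the value as a literal; the executed (and logged/explained)
   -- statement contains '3'::int instead of $1
   EXECUTE format('SELECT min(id) FROM tbl WHERE id > %L::int', _min)
   INTO _result;
   RETURN _result;
END
$func$;

SELECT f_min_id(3);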
Related:
PostgreSQL Stored Procedure Performance

Can I get Ecto to log raw SQL?

I am building an Ecto query like this:
from item in query,
where: like(item.description, ^"%#{text}%")
I'm concerned that this allows SQL injection in text. Before trying to fix that, I want to see how the query is actually sent to the database.
If I inspect the query or look at what is logged, I see some SQL, but it's not valid.
For instance, inspecting the query shows me this:
{"SELECT i0.\"id\", i0.\"store_id\", i0.\"title\", i0.\"description\"
FROM \"items\" AS i0 WHERE (i0.\"description\" LIKE $1)",
["%foo%"]}
When I pass this query to Repo.all, it logs this:
SELECT i0."id", i0."store_id", i0."title", i0."description"
FROM "items" AS i0 WHERE (i0."description" LIKE $1) ["%foo%"]
But if I copy and paste that into psql, PostgreSQL gives me an error:
ERROR: 42P02: there is no parameter $1
It seems as though Ecto may actually be doing a parameterized query, like this:
PREPARE bydesc(text) AS SELECT i0."id",
i0."store_id", i0."title", i0."description"
FROM "items" AS i0 WHERE (i0."description" LIKE $1);
EXECUTE bydesc('foo');
If so, I think that would prevent SQL injection. But I'm just guessing that this is what Ecto does.
How can I see the actual SQL that Ecto is executing?
Ecto uses only prepared statements. When using the Ecto query syntax, introducing SQL injection is not possible: the query syntax is verified at compile time.
Showing exactly the queries executed might be difficult for a couple of reasons:
Postgrex (and hence Ecto) uses the PostgreSQL binary protocol (instead of the more common, but less efficient, text protocol), so the PREPARE query never actually exists as a string.
In most cases, all you would see would be one initial PREPARE 64237612638712636123(...) AS ... and later a lot of EXECUTE 64237612638712636123(...), which isn't that helpful; trying to relate one to the other would be painful.
In my experience, most software of that kind uses prepared statements and logs them instead of raw queries, since that is much more helpful for understanding the behaviour of the system.
Yes, that is the exact SQL that is being executed by Ecto (it uses prepared queries through the db_connection package internally) and no SQL injection is possible in that code. This can be verified by turning on logging of all executed SQL queries by changing log_statement to all in postgresql.conf:
...
log_statement = 'all'
...
and then restarting PostgreSQL and running a query. For the following queries:
Repo.get(Post, 1)
Repo.get(Post, 2)
this is logged:
LOG: execute ecto_818: SELECT p0."id", p0."title", p0."user_id", p0."inserted_at", p0."updated_at" FROM "posts" AS p0 WHERE (p0."id" = $1)
DETAIL: parameters: $1 = '1'
LOG: execute ecto_818: SELECT p0."id", p0."title", p0."user_id", p0."inserted_at", p0."updated_at" FROM "posts" AS p0 WHERE (p0."id" = $1)
DETAIL: parameters: $1 = '2'

DB2 deadlock timeout Sqlstate: 40001, reason code 68 due to update statements called from servlet using SQL

I am calling update statements one after another from a servlet against DB2. I am getting error SQLSTATE 40001, reason code 68, which I found is due to a deadlock timeout.
How can I resolve this issue?
Can it be resolved by setting a query timeout?
If yes, how do I use it with the update statements in the servlet, and where?
Reason code 68 already tells you this is due to a lock timeout (deadlock is reason code 2). It could be due to other users running queries at the same time that use the same data you are accessing, or to your own multiple updates.
Begin by running db2pd -db locktest -locks show detail from a db2 command line to see where the locks are. You'll then need to run something like:
select tabschema, tabname, tableid, tbspaceid
from syscat.tables where tbspaceid = # and tableid = #
filling in the # symbols with the ID numbers you get from the db2pd command output.
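For example, if the db2pd output showed (hypothetically) TbspaceID 2 and TableID 5, the lookup becomes:
-- Hypothetical IDs taken from the db2pd output
SELECT tabschema, tabname, tableid, tbspaceid
FROM syscat.tables
WHERE tbspaceid = 2 AND tableid = 5;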
Once you see where the locks are, here are some tips:
Deadlock frequency can sometimes be reduced by ensuring that all applications access their common data in the same order – meaning, for example, that they access (and therefore lock) rows in Table A, followed by Table B, followed by Table C, and so on.
taken from: http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/topic/com.ibm.db2.luw.admin.trb.doc/doc/t0055074.html
recommended reading: http://www.ibm.com/developerworks/data/library/techarticle/dm-0511bond/index.html
Addendum: if your servlet or another guilty application is using SELECT statements found to be involved in the deadlock, you can try appending WITH UR to those SELECT statements, if accuracy of the newly updated (or inserted) data isn't important.
For me, the solution was adding FOR READ ONLY WITH UR at the end of all my SELECT statements. (Apparently my SELECT statements were returning so much data that they locked the tables long enough to interfere with other SQL statements.)
See https://www.ibm.com/support/knowledgecenter/SSEPEK_10.0.0/sqlref/src/tpc/db2z_sql_isolationclause.html
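For illustration, a minimal example of the clause (table and column names are made up):
-- DB2: read uncommitted rows so the SELECT does not wait on writers' locks
SELECT order_id, status
FROM orders
WHERE customer_id = 42
FOR READ ONLY WITH UR;  -- UR = Uncommitted Read isolation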