How can I execute an "EXPLAIN" statement on a prepared query with bind parameters in PostgreSQL?

I would like to be able to execute an explain statement on a query that has bind parameters. For example:
EXPLAIN SELECT * FROM metasyntax WHERE id = $1;
When I attempt to execute this, I get the following error:
ERROR: bind message supplies 0 parameters, but prepared statement "" requires 1
I understand it's telling me that it wants me to supply a value for the query. However, I may not necessarily know the answer to that. In other SQL dialects such as Oracle, it will generate an explain plan without needing me to supply a value for the parameter.
Is it possible to get an explain plan without binding to an actual value? Thanks!

Assuming the parameter is an integer:
PREPARE x(integer) AS SELECT * FROM metasyntax WHERE id = $1;
Then run the following six times, where “42” is a representative value:
EXPLAIN (ANALYZE, BUFFERS) EXECUTE x(42);
Normally PostgreSQL will switch to a generic plan for the sixth run, and you will see a plan that contains the placeholder $1.
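If running the statement six times is inconvenient, newer PostgreSQL releases offer more direct ways to see the generic plan. The following is a minimal sketch, assuming PostgreSQL 12 or later for plan_cache_mode and PostgreSQL 16 or later for the GENERIC_PLAN option; the statement name x_generic is made up for illustration.

```sql
-- PostgreSQL 12+: force the generic plan from the first execution.
-- x_generic is a hypothetical statement name, not from the question.
PREPARE x_generic(integer) AS SELECT * FROM metasyntax WHERE id = $1;
SET plan_cache_mode = force_generic_plan;
EXPLAIN EXECUTE x_generic(42);   -- a value is still required, but the plan is built without it
RESET plan_cache_mode;
DEALLOCATE x_generic;

-- PostgreSQL 16+: explain the parameterized query text directly, no value needed.
EXPLAIN (GENERIC_PLAN) SELECT * FROM metasyntax WHERE id = $1;
```

GENERIC_PLAN cannot be combined with ANALYZE, since the statement is never actually executed.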

No.
The optimiser is allowed to change the query plan based on the parameter. Imagine if the parameter was null - it's obvious that no rows will be returned, so the DB may return an empty rowset instantly.
Just use a representative value.

Related

Is it possible to construct a field in TSQL and then reference it in the same query?

To demonstrate what I need, this query doesn't work:
SELECT
dbo.expensive_function AS test,
IIF(test=1, 3, 4) AS test2
But this does:
SELECT
*,
IIF(test=1, 3, 4) AS test2
FROM (
SELECT
dbo.expensive_function AS test
) AS t
Is there a cleaner way of doing this? I'm translating legacy code from Microsoft Access to SQL Server and I'm finding that I need multiple nested tables to achieve the desired results. It's also undesirable to run the expensive function twice in the query.
There does not seem to be any correlation between your query and the expensive function.
Execute the function before the query, store the value in a variable and use the variable in the query instead of the expensive function.
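A minimal sketch of that suggestion, assuming dbo.expensive_function is a scalar function that takes no arguments and returns an int (the question does not show its signature):

```sql
-- Evaluate the function once and keep the result in a variable.
-- Assumption: the function is scalar, parameterless, and returns int.
DECLARE @test int = dbo.expensive_function();

SELECT
    @test AS test,
    IIF(@test = 1, 3, 4) AS test2;
```

If the function did depend on columns of a table, one common alternative to nesting is CROSS APPLY with a VALUES constructor, which names the expression once so later columns can reference it. Here some_table and some_column are placeholders:

```sql
-- some_table and some_column are hypothetical names for illustration.
SELECT
    v.test,
    IIF(v.test = 1, 3, 4) AS test2
FROM some_table AS s
CROSS APPLY (VALUES (dbo.expensive_function(s.some_column))) AS v(test);
```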

How to detect whether a PostgreSQL function uses the indexes on its tables

I have created a PL/pgSQL table-returning function that executes a SELECT statement and uses the input parameter in the WHERE clause of the query.
I frame the statement dynamically and execute it like this: EXECUTE sqlStmt USING empID;
sqlStmt is a variable of data type text that has the SELECT query which joins 3 tables.
When I execute that query in pgAdmin and analyze it, I can see that an 'Index Scan' on the tables is used, as expected. However, when I run EXPLAIN ANALYZE SELECT * from fn_getDetails(12), the output just says "Function Scan".
How do we know whether the table indexes are used? Other SO answers suggesting the auto_explain module did not give me details of the statements in the function body. And I am unable to use PREPARE inside my function body.
The time taken by the direct SELECT statement is almost the same as with the function, just a couple of milliseconds, but how can I know whether the index was used?
auto_explain will certainly provide the requested information.
Set the following parameters:
shared_preload_libraries = 'auto_explain' # requires a restart
auto_explain.log_min_duration = 0 # log all statements
auto_explain.log_nested_statements = on # log statements in functions too
The last parameter is required for tracking SQL statements inside functions.
To activate the module, you need to restart the database.
Of course, testing whether the index is used in a query on a small table won't give you a reliable result. You need about as much test data as you expect to have in reality.
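For ad-hoc testing, auto_explain can also be loaded into a single session instead of being preloaded for the whole server. This is a sketch, assuming the session has superuser privileges (LOAD of this module normally requires them) and that fn_getDetails is the function from the question:

```sql
LOAD 'auto_explain';                           -- session-level load, assumes superuser
SET auto_explain.log_min_duration = 0;         -- log every statement
SET auto_explain.log_analyze = on;             -- include actual row counts and timings
SET auto_explain.log_nested_statements = on;   -- include statements inside functions
SET client_min_messages = log;                 -- also show the LOG output in this client

SELECT * FROM fn_getDetails(12);
```

The plans of the statements run inside the function then appear as LOG messages, where an Index Scan node (or its absence) is visible.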

How to log queries inside PLPGSQL with inlined parameter values

When a statement in my PLPGSQL function (Postgres 9.6) is being run I can see the query on one line, and then all the parameters on another line. A 2-line logging. Something like:
LOG: execute <unnamed>: SELECT * FROM table WHERE field1=$1 AND field2=$2 ...
DETAIL: parameters: $1 = '-767197682', $2 = '234324' ....
Is it possible to log the entire query in pg_log WITH the parameters already replaced in the query and log it in a SINGLE line?
Because this would make it much easier to copy/paste the query to reproduce it in another terminal, especially if queries have dozens of parameters.
The reason behind this: PL/pgSQL treats SQL statements as prepared statements internally.
First: With default settings, there is no logging of SQL statements inside PL/pgSQL functions at all. Are you using auto_explain?
See: Postgres query plan of a UDF invocation written in pgpsql
For the first couple of invocations in the same session, the SPI manager (Server Programming Interface) generates a fresh execution plan based on the actual parameter values. Any kind of logging should then report parameter values inline.
Postgres keeps track, and after a couple of invocations in the current session, if execution plans don't seem sensitive to the actual parameter values, it starts reusing a generic, cached plan. Then you should see the generic plan of a prepared statement with $n parameters (like in the question).
Details in the chapter "Plan Caching" in the manual.
You can observe the effect with a simple demo. In the same session (not necessarily same transaction):
CREATE TEMP TABLE tbl AS
SELECT id FROM generate_series(1, 100) id;
PREPARE prep1(int) AS
SELECT min(id) FROM tbl WHERE id > $1;
EXPLAIN EXECUTE prep1(3); -- 1st execution
You'll see the actual value:
Filter: (id > 3)
EXECUTE prep1(1); -- several more executions
EXECUTE prep1(2);
EXECUTE prep1(3);
EXECUTE prep1(4);
EXECUTE prep1(5);
EXPLAIN EXECUTE prep1(3);
Now you'll see a $n parameter:
Filter: (id > $1)
So you can get the query with parameter values inlined on the first couple of invocations in the current session.
Or you can use dynamic SQL with EXECUTE, because, per documentation:
Also, there is no plan caching for commands executed via EXECUTE.
Instead, the command is always planned each time the statement is run.
Thus the command string can be dynamically created within the function
to perform actions on different tables and columns.
That can actually affect performance, of course.
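A minimal sketch of the dynamic SQL route, reusing the temp table tbl from the demo above. The function name f_min_id_logged is made up for illustration. It builds the statement with format() and %L so the value is inlined as a quoted literal, and writes the finished string to the server log on a single line before executing it:

```sql
-- Hypothetical function; tbl is the temp table created in the demo.
CREATE OR REPLACE FUNCTION f_min_id_logged(_min int)
  RETURNS int
  LANGUAGE plpgsql AS
$func$
DECLARE
   -- %L inlines the value as a properly quoted SQL literal
   _sql    text := format('SELECT min(id) FROM tbl WHERE id > %L', _min);
   _result int;
BEGIN
   RAISE LOG 'running: %', _sql;   -- one log line, parameter already inlined
   EXECUTE _sql INTO _result;      -- planned on every call, no plan caching
   RETURN _result;
END
$func$;

SELECT f_min_id_logged(3);
```

The logged string can be pasted into another session as-is. The cost is the one noted above: the statement is planned on every call.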
Related:
PostgreSQL Stored Procedure Performance

Can I get Ecto to log raw SQL?

I am building an Ecto query like this:
from item in query,
where: like(item.description, ^"%#{text}%")
I'm concerned that this allows SQL injection in text. Before trying to fix that, I want to see how the query is actually sent to the database.
If I inspect the query or look at what is logged, I see some SQL, but it's not valid.
For instance, inspecting the query shows me this:
{"SELECT i0.\"id\", i0.\"store_id\", i0.\"title\", i0.\"description\"
FROM \"items\" AS i0 WHERE (i0.\"description\" LIKE $1)",
["%foo%"]}
When I pass this query to Repo.all, it logs this:
SELECT i0."id", i0."store_id", i0."title", i0."description"
FROM "items" AS i0 WHERE (i0."description" LIKE $1) ["%foo%"]
But if I copy and paste that into psql, PostgreSQL gives me an error:
ERROR: 42P02: there is no parameter $1
It seems as though Ecto may actually be doing a parameterized query, like this:
PREPARE bydesc(text) AS SELECT i0."id",
i0."store_id", i0."title", i0."description"
FROM "items" AS i0 WHERE (i0."description" LIKE $1);
EXECUTE bydesc('%foo%');
If so, I think that would prevent SQL injection. But I'm just guessing that this is what Ecto does.
How can I see the actual SQL that Ecto is executing?
Ecto uses only prepared statements. When using ecto query syntax, introducing SQL injection is not possible. The query syntax verifies at compile-time that no SQL injection is possible.
Showing exactly the queries executed might be difficult for a couple of reasons:
Postgrex (and hence Ecto) uses the PostgreSQL binary protocol (instead of the more common, but less efficient, text protocol), so the PREPARE query never actually exists as a string.
In most cases all you would see is one initial PREPARE 64237612638712636123(...) AS ... and later a lot of EXECUTE 64237612638712636123(...), which isn't that helpful. Trying to relate one to the other would be horrible.
In my experience, most software of this kind uses prepared statements and logs them instead of raw queries, since that is much more helpful for understanding the behaviour of the system.
Yes, that is the exact SQL that is being executed by Ecto (it uses prepared queries through the db_connection package internally) and no SQL injection is possible in that code. This can be verified by turning on logging of all executed SQL queries by changing log_statement to all in postgresql.conf:
...
log_statement = 'all'
...
and then restarting PostgreSQL and running a query. For the following queries:
Repo.get(Post, 1)
Repo.get(Post, 2)
this is logged:
LOG: execute ecto_818: SELECT p0."id", p0."title", p0."user_id", p0."inserted_at", p0."updated_at" FROM "posts" AS p0 WHERE (p0."id" = $1)
DETAIL: parameters: $1 = '1'
LOG: execute ecto_818: SELECT p0."id", p0."title", p0."user_id", p0."inserted_at", p0."updated_at" FROM "posts" AS p0 WHERE (p0."id" = $1)
DETAIL: parameters: $1 = '2'
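As a side note, log_statement can be changed at run time, so a configuration reload is enough. A sketch, assuming a superuser session and PostgreSQL 9.4 or later (for ALTER SYSTEM):

```sql
ALTER SYSTEM SET log_statement = 'all';   -- written to postgresql.auto.conf
SELECT pg_reload_conf();                  -- apply without restarting the server
SHOW log_statement;

-- Revert once done, since logging every statement is noisy.
ALTER SYSTEM RESET log_statement;
SELECT pg_reload_conf();
```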

How to use a subquery with an aggregate function in Hive

SELECT peridle, CPU
FROM (SELECT MAX(peridle) FROM try2);
While executing this query in Hive I get the following error:
Parse Error: line 1:47 cannot recognize input near 'select' 'MAX' '(' in expression specification
Please suggest how to use aggregate functions in a Hive subquery.
At least two things need to be fixed here:
You are not returning fields named peridle or CPU from the sub-query, yet you are trying to select them.
Hive requires you to alias all sub-queries, even if you don't reference the alias. You can quickly do this by changing the ); at the end to ) x; (or whatever you want to call it).
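With both fixes applied, the query could look like the sketches below. The column alias max_peridle is made up for illustration, and the second form assumes that CPU is a column of try2 and that the intent is to return the row(s) holding the maximum peridle, which the question does not state.

```sql
-- Only the maximum: give the aggregate a name and the sub-query an alias.
SELECT x.max_peridle
FROM (SELECT MAX(peridle) AS max_peridle FROM try2) x;

-- Assumed intent: the row(s) of try2 whose peridle equals the maximum.
SELECT t.peridle, t.CPU
FROM try2 t
JOIN (SELECT MAX(peridle) AS max_peridle FROM try2) x
  ON t.peridle = x.max_peridle;
```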