I am investigating a possible SQL injection bug in some COBOL code. The code uses host variables to submit the statement to a DB2 database, e.g.
EXEC SQL INSERT INTO TBL (a, b, c) VALUES (:x, :y, :z) END-EXEC
Can anyone tell me whether this method is vulnerable to an SQLi attack, or whether the way COBOL/DB2 parses the host variables makes such an attack impossible to execute?
Everything I read suggests there are better ways to protect against SQLi. The IBM website does mention using host variables, but it doesn't explain whether they fully mitigate the attack.
Static statements with host variables are not susceptible to SQL injection attacks.
Non-parameterized dynamic statements are what you need to worry about...
They would look something like this (my COBOL is rusty):
STRING "INSERT INTO TBL (a,b,c) VALUES ("
       X ", "
       Y ", "
       Z ")"
       DELIMITED BY SIZE
       INTO WSQLSTMT.
EXEC SQL PREPARE MYSTMT FROM :WSQLSTMT END-EXEC.
EXEC SQL EXECUTE MYSTMT END-EXEC.
Note that you could use EXECUTE IMMEDIATE in place of the two-step PREPARE and EXECUTE, as sketched below.
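A minimal sketch (reusing WSQLSTMT from above; note that EXECUTE IMMEDIATE does not allow parameter markers, so it only suits the non-parameterized form):
EXEC SQL EXECUTE IMMEDIATE :WSQLSTMT END-EXEC.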
In contrast, a parameterized dynamic query looks like:
STRING "INSERT INTO TBL (a,b,c) VALUES (?, ?, ?)" DELIMITED BY SIZE INTO WSQLSTMT.
EXEC SQL PREPARE MYSTMT FROM :WSQLSTMT END-EXEC.
EXEC SQL EXECUTE MYSTMT USING :X, :Y, :Z END-EXEC.
In summary, a static query with host variables, like you originally posted, is SAFE, as is a parameterized dynamic query. A non-parameterized query that directly uses user input to build the SQL statement it executes is NOT SAFE.
The key thing to understand is that the statement must be compiled (prepared) before the run-time values of the variables come into play. In your original static statement, the statement is prepared automatically at compile time.
Side note: since a static statement is prepared at compile time, it performs better than a dynamic statement prepared at run time, so it's usually best to use static statements whenever possible.
When a statement in my PL/pgSQL function (Postgres 9.6) is run, I can see the query on one line and all the parameters on another, i.e. a two-line log entry. Something like:
LOG: execute <unnamed>: SELECT * FROM table WHERE field1=$1 AND field2=$2 ...
DETAIL: parameters: $1 = '-767197682', $2 = '234324' ....
Is it possible to log the entire query in pg_log WITH the parameters already replaced in the query, in a SINGLE line?
This would make it much easier to copy/paste the query to reproduce it in another terminal, especially for queries with dozens of parameters.
The reason behind this: PL/pgSQL treats SQL statements as prepared statements internally.
First: With default settings, there is no logging of SQL statements inside PL/pgSQL functions at all. Are you using auto_explain?
Postgres query plan of a UDF invocation written in plpgsql
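If you are, a minimal sketch of the relevant postgresql.conf settings (the values are illustrative, not prescriptive):
shared_preload_libraries = 'auto_explain'   # load the module at server start
auto_explain.log_min_duration = 0           # log the plan of every statement
auto_explain.log_nested_statements = on     # include statements inside functions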
For the first couple of invocations in the same session, the SPI manager (Server Programming Interface) generates a fresh execution plan based on actual parameter values. Any kind of logging should report parameter values inline.
Postgres keeps track, and after a couple of invocations in the current session, if execution plans don't seem sensitive to actual parameter values, it will start reusing a generic, cached plan. Then you should see the generic plan of a prepared statement with $n parameters (like in the question).
Details in the chapter "Plan Caching" in the manual.
You can observe the effect with a simple demo. In the same session (not necessarily the same transaction):
CREATE TEMP TABLE tbl AS
SELECT id FROM generate_series(1, 100) id;
PREPARE prep1(int) AS
SELECT min(id) FROM tbl WHERE id > $1;
EXPLAIN EXECUTE prep1(3); -- 1st execution
You'll see the actual value:
Filter: (id > 3)
EXECUTE prep1(1); -- several more executions
EXECUTE prep1(2);
EXECUTE prep1(3);
EXECUTE prep1(4);
EXECUTE prep1(5);
EXPLAIN EXECUTE prep1(3);
Now you'll see a $n parameter:
Filter: (id > $1)
So you can get the query with parameter values inlined during the first couple of invocations in the current session.
Or you can use dynamic SQL with EXECUTE because, per the documentation:
Also, there is no plan caching for commands executed via EXECUTE.
Instead, the command is always planned each time the statement is run.
Thus the command string can be dynamically created within the function
to perform actions on different tables and columns.
That can actually affect performance, of course.
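For illustration, here is a minimal sketch (function and table names are invented). With EXECUTE plus format(), the parameter value is embedded in the statement text itself, so it appears inlined wherever the statement is logged:
CREATE OR REPLACE FUNCTION min_id_above(p_min int)
  RETURNS int
  LANGUAGE plpgsql AS
$func$
DECLARE
   result int;
BEGIN
   -- %L quotes the value as a literal, inlining it into the statement text
   EXECUTE format('SELECT min(id) FROM tbl WHERE id > %L::int', p_min)
   INTO result;
   RETURN result;
END
$func$;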
Related:
PostgreSQL Stored Procedure Performance
When I'm sketching out SQL statements, I keep a file of all the queries I have used to analyse my live data. Each time I write a new statement or group of statements at the end of the file, I select them and click 'Execute' to see the results. I'm paranoid that I may forget the selection step and accidentally run every query in the file sequentially, so I head the file with the line
USE FakeDatabase
so that the queries will fail, as they will be run against a non-existent database. But no, instead I get the error
USE statement is not supported to switch between databases
(N.B. I am using SQL Server Management Studio v17.0 RC1 against a v12 Azure SQL Server database.)
What T-SQL statement can I use to prevent further execution of the T-SQL statements in a file?
USE is not supported in Azure. You can try the below, but there can be many options depending on your use case.
Replace USE FakeDatabase with the statement below:
IF db_name() <> 'FakeDatabase'
    RETURN;
You could, instead, put something like this in each script:
IF @@SERVERNAME <> 'Not-Really-My-Server'
BEGIN
    RAISERROR('Database Name Not Set', 20, -1) WITH LOG;
END
-- Rest of my query...
Severity 20 terminates the connection (and requires WITH LOG), so unlike a plain RETURN it also stops any batches that follow a GO.
As the documentation says, in the extended query protocol, "The query string contained in a Parse message cannot include more than one SQL statement".
So to support batching with prepared statements, I think the only way is to determine the batch size in advance and then create the appropriate prepared statement, e.g. given a batch size fixed at 3, prepare the statement below:
-- create table foobar(a int, b text);
insert into foobar values($1, $2), ($3, $4), ($5, $6);
Then this prepared statement must be bound with three sets of arguments.
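At the SQL level, the exchange would be roughly equivalent to the hedged sketch below (statement name invented):
PREPARE ins3(int, text, int, text, int, text) AS
    INSERT INTO foobar VALUES ($1, $2), ($3, $4), ($5, $6);
EXECUTE ins3(1, 'a', 2, 'b', 3, 'c');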
The limitation is obvious: the batch size is fixed, so if you need to do a batch of size 4, the previous prepared statement is useless and you need to recreate it.
On the other hand, in JDBC, it seems that you just need to prepare the statement below (JDBC uses ? placeholders):
insert into foobar values(?, ?);
and then call addBatch() repeatedly until you think the batch is big enough.
What final statement does the PostgreSQL JDBC driver convert this to? I'm curious.
I'm not familiar with JDBC, and although I've checked the driver source code, I still cannot understand it.
Does anybody know the answer? Thanks.
I have failed to understand the need for pgScript, which can be executed using the pgAdmin tool. When should it be used? What can it do that PL/pgSQL cannot? What is its equivalent in Microsoft SQL Server?
pgScript is a client-side scripting language, while PL/pgSQL runs on the server. This means they have entirely different use cases. For example, pgScript can manage transaction status while PL/pgSQL cannot, and PL/pgSQL can be used to extend the language of SQL while pgScript cannot.
Additionally, this means the two handle many other things quite differently, ranging from query plans to dynamic SQL, and while pgScript requires round trips between queries, PL/pgSQL does not.
One use for pgScript is to define variables and use them later in your SQL statements.
For example, you could do something like this:
declare @mytbl, @maxid;
set @mytbl = 'sometable';
set @maxid = 2500;
set @res = select count(*) from @mytbl where id <= @maxid;
print @res;
The point of this approach is to keep any variables you want to change at the top of your script, rather than having them buried deep inside complex SQL queries.
Of course, pgScript is a feature available only inside the pgAdmin III client, as Craig Ringer mentioned in his comment.
I used the DDL script generator from Toad Data Modeler.
When I ran the script using normal query execution in pgAdmin III, it kept giving me the error:
"ERROR: relation "user" already exists SQL state: 42P07".
When I ran it with Execute pgScript in pgAdmin III, it worked fine.
I have written a DB2 query to do the following:
Create a temp table
Select from a monster query / insert into the temp table
Select from the temp table / delete from old table
Select from the temp table / insert into a different table
In MSSQL, I am allowed to run the commands one after another as one long query. Failing that, I can delimit them with 'GO' commands. When I attempt this in DB2, I get the error:
DB2CLI.DLL: ERROR [42601] [IBM][CLI Driver][DB2] SQL0199N The use of the reserved
word "GO" following "" is not valid. Expected tokens may include: "".
SQLSTATE=42601
What can I use to delimit these instructions without the temp table going out of scope?
GO is something that is used in MSSQL Studio; I have my own app for running updates into live and use "GO" to break the statements apart.
Does DB2 support the semicolon (;)? This is a standard delimiter in many SQL implementations.
Have you tried using just a semicolon instead of "GO"?
This link suggests that the semicolon should work for DB2 (see the sketch below): http://www.scribd.com/doc/16640/IBM-DB2
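For illustration, a semicolon-delimited version of the steps from the question might look like this. This is a hedged sketch in DB2 for LUW syntax: the table and column names are invented, and a user temporary tablespace is assumed to exist:
DECLARE GLOBAL TEMPORARY TABLE SESSION.TMP_ROWS (a INT)
    ON COMMIT PRESERVE ROWS NOT LOGGED;
INSERT INTO SESSION.TMP_ROWS
    SELECT a FROM OLD_TBL WHERE b > 100;
DELETE FROM OLD_TBL
    WHERE a IN (SELECT a FROM SESSION.TMP_ROWS);
INSERT INTO NEW_TBL
    SELECT a FROM SESSION.TMP_ROWS;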
I would try wrapping what you are looking to do in BEGIN and END to set the scope, as sketched below.
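A hedged sketch of that idea (an anonymous compound statement, supported on recent DB2 for LUW versions; OLD_TBL is invented). Note that the script terminator must be changed, e.g. to @, because the statements inside end with semicolons:
--#SET TERMINATOR @
BEGIN ATOMIC
    DECLARE v_cnt INT;
    SET v_cnt = (SELECT COUNT(*) FROM OLD_TBL);
    DELETE FROM OLD_TBL WHERE a IS NULL;
END@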
GO is not an SQL command; it's not even a T-SQL command. It is an instruction for the parser. I don't know DB2, but I would imagine that GO is not necessary.
From Devx.com Tips:
Although GO is not a T-SQL statement, it is often used in T-SQL code, and unless you know what it is, it can be a mystery. So what is its purpose? Well, it causes all statements from the beginning of the script or from the last GO statement (whichever is closer) to be compiled into one execution plan and sent to the server independently of any other batches.
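A small illustration of that batching behaviour (T-SQL, e.g. in SSMS): a variable declared before a GO does not survive into the next batch:
DECLARE @x INT = 1;
GO
PRINT @x;  -- fails: Must declare the scalar variable "@x".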