I'm using Statement for executeUpdate and executeQuery. The string values I'm concatenating into the SQL query can contain at least the ' and \ characters, as well as Unicode.
PreparedStatement seems to do this automatically, but is there some utility function in the JDBC library to escape an arbitrary string for use in a SQL query?
Example errors I've run into:
org.postgresql.util.PSQLException: ERROR: unterminated quoted string at or near
and
org.postgresql.util.PSQLException: ERROR: invalid Unicode escape
Hint: Unicode escapes must be \uXXXX or \UXXXXXXXX.
No, it's not part of JDBC, and it's different for different database management systems. You should really use PreparedStatement for queries with parameters. This is more secure and it can perform better since the query can be compiled.
See 4.1. SQL Syntax - Lexical Structure in the PostgreSQL manual.
E.g.
The following less trivial example writes the Russian word "slon" (elephant) in Cyrillic letters:
U&"\0441\043B\043E\043D"
The way the PostgreSQL JDBC driver does it is:
QueryExecutorImpl.parseQuery()
- Breaks the string into fragments, handles single quotes, double quotes, line comments, block comments and dollar signs, and takes note of ? for substitution.
- Creates a SimpleQuery object.
SimpleParameterList
- Handles string encoding (UTF-8, bytea, etc.).
Unfortunately, to get the full string escaping (with encoding, handling of special chars, etc.), you'd probably need to somehow access and use QueryExecutorImpl and I wasn't able to figure out how to do that. PreparedStatement does internally use QueryExecutorImpl.
In conclusion, the best and easiest way is probably to use PreparedStatement.
Related
I've been working on an Express app that has a form designed to hold lines and quotes.
Some of the lines will have single quotes ('), but overall it's able to store the info and I'm able to back it up and store it without any problems. Now, when I run pg_dump to have the database written to an SQL file, the quotes seem to cause some things to appear a bit wonky in my text editor.
Would I have to create a method to change all the single quotation marks into double ones, or can I leave it as is and still be able to load it back into the database without causing major issues? I know people will continue to enter lines that contain either single or double quotation marks, so any solution or answer would help greatly.
Single quotes in character data types are no problem at all. You just need to escape them properly in string literals.
To write data with INSERT you need to quote all string literals according to SQL syntax rules. There are tools to do that for you ...
Insert text with single quotes in PostgreSQL
However, pg_dump takes care of escaping automatically. The default mode produces text output to be re-imported with COPY (much faster than INSERT), and single quotes have no special meaning there. And in (non-default) csv mode, the default quote character is double-quote (") and configurable. The manual:
QUOTE
Specifies the quoting character to be used when a data value is quoted. The default is double-quote. This must be a single one-byte character. This option is allowed only when using CSV format.
The format is defined by rules for COPY and not by SQL syntax rules.
When I run:
COPY con (date,kgs)
FROM 'H:Sir\\data\\reporting\\hi.rpt'
WITH DELIMITER ','
CSV HEADER
date AS 'Datum/Uhrzeit'
kgs AS 'Summe'
I get the error:
WARNING: nonstandard use of \\ in a string literal
LINE 2: FROM 'H:Sudhir\\Conair data\\TBreporting\\hi.txt'
^
HINT: Use the escape string syntax for backslashes, e.g., E'\\'.
I've been having this problem for quite a while. Help?
It's not an error, it's just a warning. It has nothing to do with the file content, it's related to a PostgreSQL setting and the COPY command syntax you're using.
You're using PostgreSQL after 8.1 with standard_conforming_strings turned off - either before 9.1 (which defaulted to off) or a newer version with it turned off manually.
This causes backslashes in strings, like bob\ted, to be interpreted as escapes, so that string would be bob<tab>ted with a literal tab, as \t is the escape for a tab.
Interpreting strings like this is contrary to the SQL standard, which doesn't have C-style backslash escapes. Years ago the PostgreSQL team decided to switch to the SQL standard way of doing things. For backward compatibility reasons it was done in two stages:
1. Add the standard_conforming_strings option to use the SQL-standard interpretation of strings, but have it default to off. Issue warnings when using the non-standard PostgreSQL string interpretation. Add a new E'string' style to allow applications to explicitly request escape processing in strings.
2. A few releases later, turn standard_conforming_strings on by default, once people had updated and fixed the warnings their applications produced. Supposedly.
The escape for \ is \\. So "doubling" the backslashes like you (or the tool you're using) have done is correct. PostgreSQL is showing a warning because it doesn't know whether, when you wrote H:Sir\\data\\reporting\\hi.rpt, you meant literally H:Sir\\data\\reporting\\hi.rpt (like the SQL spec says) or H:Sir\data\reporting\hi.rpt (like PostgreSQL used to do, against the standard).
Thus there's nothing wrong with your query. If you want to get rid of the warning, either turn standard_conforming_strings on, or add an explicit E prefix to your string (E'...').
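The two readings of a doubled backslash described above have a loose analogue in Go's two kinds of string literal. This is only an analogy to make the difference visible, not PostgreSQL code; the variable names are made up for the illustration.

```go
package main

import "fmt"

// Interpreted literal: backslash escapes are processed, loosely like
// E'...' (or a plain '...' with standard_conforming_strings off).
var escaped = "data\\reporting"

// Raw literal: backslashes are ordinary characters, loosely like
// a plain '...' with standard_conforming_strings on.
var standard = `data\\reporting`

func main() {
	fmt.Println(escaped)  // data\reporting
	fmt.Println(standard) // data\\reporting
}
```

The same source text yields a string that is one character shorter when escapes are processed, which is exactly the ambiguity the warning is about.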
I know I have to escape single quotes, but I was just wondering if there's any other character or text string I should guard against.
I'm working with mysql and h2 database...
If you check the MySQL function mysql_real_escape_string(), which is used by all higher-level languages, you'll see that the list of special characters is fairly long:
\
'
"
NUL (ASCII 0)
\n
\r
Control+Z
Higher-level language wrappers such as the PHP one may also protect the strings from malformed Unicode characters which may end up as a quote.
The conclusion is: do not escape strings yourself, especially not with hard-to-debug, hard-to-read, hard-to-understand regular expressions. Use the provided built-in functions or use parameterized SQL queries (where no parameter can contain anything that is interpreted as SQL by the engine). This is also stated in the h2 documentation: h2 db sql injection protection.
A simple solution for the problem above is to use a prepared statement:
This will somewhat depend on what type of information you need to obtain from the user. If you are only looking for simple text, then you might as well ignore all special characters that a user might input (if it's not too much trouble)--why allow the user to input characters that don't make sense in your query?
Some languages have functions that will take care of this for you. For example, PHP has the mysql_real_escape_string() function (http://php.net/manual/en/function.mysql-real-escape-string.php).
You are correct that single quotes (') are user input no-no's; but double quotes (") and backslashes (\) should definitely be handled as well (see the above link for which characters the PHP function escapes, since those are the most important and basic ones).
Hope this is at least a good start!
I want to find rows where a text column begins with a user-given string, e.g. SELECT * FROM users WHERE name LIKE 'rob%', but "rob" is unvalidated user input. If the user writes a string containing a special pattern character like "rob_", it will match both "robert42" and "rob_the_man". I need to be sure that the string is matched literally. How would I do that? Do I need to handle the escaping at the application level, or is there a more elegant way?
I'm using PostgreSQL 9.1 and go-pgsql for Go.
The _ and % characters have to be quoted to be matched literally in a LIKE statement, there's no way around it. The choice is about doing it client-side, or server-side (typically by using the SQL replace(), see below). Also to get it 100% right in the general case, there are a few things to consider.
By default, the quote character to use before _ or % is the backslash (\), but it can be changed with an ESCAPE clause immediately following the LIKE clause.
In any case, the quote character has to be doubled in the pattern to be matched literally as one character.
Example: ... WHERE field like 'john^%node1^^node2.uucp#%' ESCAPE '^' would match john%node1^node2.uucp# followed by anything.
There's a problem with the default choice of backslash: it's already used for other purposes when standard_conforming_strings is OFF (PG 9.1 has it ON by default, but previous versions being still in wide use, this is a point to consider).
Also, if the quoting for LIKE wildcards is done client-side in a user input injection scenario, it comes in addition to the normal string quoting already necessary on user input.
A glance at a go-pgsql example shows that it uses $N-style placeholders for variables... So here's an attempt to write it in a somewhat generic way: it works with standard_conforming_strings either ON or OFF, uses server-side replacement of [%_], an alternative quote character, quoting of the quote character, and avoids SQL injection:
db.Query("SELECT * from USERS where name like replace(replace(replace($1,'^','^^'),'%','^%'),'_','^_') ||'%' ESCAPE '^'",
variable_user_input);
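The client-side alternative mentioned earlier can be sketched roughly as follows. This is a minimal sketch, not part of go-pgsql: the helper name escapeLike and the choice of '^' as the escape character are assumptions made for the illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeLike prefixes the LIKE wildcards % and _ (and the escape
// character itself) with esc, so the input is matched literally.
// The name and signature are illustrative only.
func escapeLike(input string, esc rune) string {
	var b strings.Builder
	for _, r := range input {
		if r == '%' || r == '_' || r == esc {
			b.WriteRune(esc)
		}
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	// The trailing % is the only real wildcard in the pattern.
	pattern := escapeLike("rob_", '^') + "%"
	fmt.Println(pattern) // rob^_%
	// The pattern is still passed as a bound parameter, for example:
	//   db.Query("SELECT * FROM users WHERE name LIKE $1 ESCAPE '^'", pattern)
}
```

Because the pattern travels as a parameter, the escaping here only deals with LIKE wildcards; it does not replace the protection against SQL injection that the placeholder provides.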
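To match the underscore and the percent sign literally in a LIKE pattern, prefix them with the escape character. The doubled backslashes below assume standard_conforming_strings is off; with it on (the default since 9.1), a single backslash is what is needed: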
SELECT * FROM users WHERE name LIKE replace(replace(user_input, '_', '\\_'), '%', '\\%');
As far as I can tell, the only special characters for the LIKE operator are percent and underscore, and these can easily be escaped manually using a backslash. It's not very beautiful, but it works.
SELECT * FROM users WHERE name LIKE
regexp_replace('rob', '(%|_)', '\\\1', 'g') || '%';
I find it strange that no such function is shipped with PostgreSQL. Who wants their users to write their own patterns?
The best answer is that you shouldn't be interpolating user input into your SQL at all. Even escaping the SQL is still dangerous.
The following, which uses Go's database/sql package, illustrates a much safer way. Substitute the Prepare and Query calls with whatever your Go PostgreSQL library's equivalents are.
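// The placeholder tells the database server that the LIKE parameter
// will be supplied later, in the Query call. PostgreSQL drivers use $1;
// some other drivers use ? instead.
query := "SELECT * FROM users WHERE name LIKE $1"
// No SQL escaping is needed, since the value is never interpolated into
// the SQL string. Note that % and _ inside userInput still act as LIKE
// wildcards unless they are escaped separately.
value := userInput + "%"
// Prepare the statement; the SQL text contains no user input.
stmt, err := db.Prepare(query)
if err != nil {
	return err
}
defer stmt.Close()
// Execute it, supplying one value per placeholder.
rows, err := stmt.Query(value)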
The benefit of this is that user input can be used safely, without fear of it injecting SQL into the statements you run. You can also reuse the prepared statement for multiple queries, which can be more efficient in certain cases.
I'm working on an iPhone app which uses the SQLite database, and I'm trying to handle escape characters. I know that there is LIKE ESCAPE to handle escape characters in SELECT statements, but in my application I have SELECT, INSERT and UPDATE actions, and I really don't know how to go about handling escape characters.
If you are using the all-in-one sqlite3_exec() function, you don't need to use the sqlite3_bind* functions. Just pass the string to sqlite3_mprintf() with the %q token:
sqlite3_mprintf("%q","it's example");
and the output string is
it''s example
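What %q does to its argument can be mirrored in a few lines of Go, purely to show the transformation: each single quote is doubled, and no surrounding quotes are added. The function name doubleQuotes is made up for this sketch, and bound parameters remain the safer choice where they are available.

```go
package main

import (
	"fmt"
	"strings"
)

// doubleQuotes mirrors the effect of the %q conversion on its
// argument: every single quote is doubled. It does not add the
// surrounding quotes of the SQL literal. The name is illustrative.
func doubleQuotes(s string) string {
	return strings.ReplaceAll(s, "'", "''")
}

func main() {
	fmt.Println(doubleQuotes("it's example")) // it''s example
}
```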
Use FMDB, and then you won't have to. It has built-in parameter binding support, and that will take care of any escaping you need for you.
I believe you simply have to tell SQLite what your escape character is at the end of the SQL statement. For example:
SELECT * FROM MyTable WHERE RevenueChange LIKE '%0\%' ESCAPE '\'
The LIKE will match values such as 30%, 140%, etc. The character I used, \, could be anything.