How to escape string while matching pattern in PostgreSQL - postgresql

I want to find rows where a text column begins with a user-supplied string, e.g. SELECT * FROM users WHERE name LIKE 'rob%', but "rob" is unvalidated user input. If the user enters a string containing a special pattern character, like "rob_", it will match both "robert42" and "rob_the_man". I need to be sure that the string is matched literally. How would I do that? Do I need to handle the escaping at the application level, or is there a more elegant way?
I'm using PostgreSQL 9.1 and go-pgsql for Go.

The _ and % characters have to be quoted to be matched literally in a LIKE expression; there's no way around it. The choice is between doing it client-side or server-side (typically with the SQL replace() function, see below). To get it 100% right in the general case, there are also a few things to consider.
By default, the quote character to use before _ or % is the backslash (\), but it can be changed with an ESCAPE clause immediately following the LIKE pattern.
In any case, the quote character itself has to be doubled in the pattern to be matched literally as one character.
Example: ... WHERE field LIKE 'john^%node1^^node2.uucp#%' ESCAPE '^' would match john%node1^node2.uucp# followed by anything.
There's a problem with the default choice of backslash: it's already used for other purposes when standard_conforming_strings is OFF (PG 9.1 has it ON by default, but previous versions being still in wide use, this is a point to consider).
Also, if the quoting of LIKE wildcards is done client-side in a user-input injection scenario, it comes in addition to the normal string quoting already necessary for user input.
A glance at a go-pgsql example shows that it uses $N-style placeholders for variables. So here's an attempt to write it in a somewhat generic way: it works with standard_conforming_strings either ON or OFF, uses server-side replacement of [%_], an alternative quote character, quoting of the quote character itself, and avoids SQL injection:
db.Query("SELECT * from USERS where name like replace(replace(replace($1,'^','^^'),'%','^%'),'_','^_') ||'%' ESCAPE '^'",
variable_user_input);
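If the escaping is done client-side instead, the same three replacements can be applied in Go before the value is bound. This is a minimal sketch, assuming the query keeps ESCAPE '^' as above; escapeLike and prefixPattern are hypothetical helper names, not part of go-pgsql:

```go
package main

import (
	"fmt"
	"strings"
)

// likeEscaper prefixes the LIKE wildcards and the escape character itself
// with '^'. A Replacer scans the input once, so the '^' characters it
// inserts are never escaped a second time.
var likeEscaper = strings.NewReplacer("^", "^^", "%", "^%", "_", "^_")

// escapeLike (hypothetical helper) makes input safe to embed in a LIKE
// pattern that is used together with ESCAPE '^'.
func escapeLike(input string) string {
	return likeEscaper.Replace(input)
}

// prefixPattern (hypothetical helper) builds a "starts with" pattern.
func prefixPattern(input string) string {
	return escapeLike(input) + "%"
}

func main() {
	fmt.Println(prefixPattern("rob_")) // rob^_%
}
```

The result would then be bound as $1 in a query such as SELECT * FROM users WHERE name LIKE $1 ESCAPE '^', so the server-side replace() calls are no longer needed.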

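To match the underscore and the percent sign literally in a LIKE pattern, prefix them with the escape character, and escape the escape character itself first. With standard_conforming_strings ON (the default in 9.1), a backslash inside an ordinary string literal is written once; the doubled form '\\_' is only right when that setting is OFF:
SELECT * FROM users WHERE name LIKE replace(replace(replace(user_input, '\', '\\'), '_', '\_'), '%', '\%');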

As far as I can tell, the only special characters for the LIKE operator are percent and underscore (plus the escape character itself, backslash by default), and these can easily be escaped manually with a backslash. It's not very pretty, but it works. This assumes standard_conforming_strings is ON, the 9.1 default:
SELECT * FROM users WHERE name LIKE
regexp_replace('rob', '([%_\\])', '\\\1', 'g') || '%';
I find it strange that no such function ships with PostgreSQL. Who wants their users to write their own patterns?

The best answer is that you shouldn't be interpolating user input into your SQL at all. Even escaping the SQL is still dangerous.
The following, which uses Go's database/sql package, illustrates a much safer way. Substitute the Prepare and Query calls with whatever your Go PostgreSQL library's equivalents are. Keep in mind that a bound parameter only protects against SQL injection; % and _ inside user_input still act as LIKE wildcards unless they are escaped as described in the other answers.
// The $1 placeholder tells the database server that we will provide
// the LIKE parameter later, in the Query call.
query := "SELECT * FROM users WHERE name LIKE $1"
// No SQL quoting is needed, since this is never interpolated into the SQL string.
// The trailing % makes it a "begins with" match.
value := user_input + "%"
// Prepare the SQL string.
stmt, err := db.Prepare(query)
if err != nil {
    return err
}
// Now execute that SQL with a value for each placeholder.
rows, err := stmt.Query(value)
The benefit of this is that user input can safely be used without fear of it injecting SQL into the statements you run. You also get to reuse the prepared statement for multiple queries, which can be more efficient in certain cases.

Related

"sqlLike" and "sqlLikeCaseInsensitive" escape character?

Is there any way to escape a SQL LIKE string when using "sqlLike" and "sqlLikeCaseInsensitive"?
Example: I want a match for "abc_123". Using "_______" (7 underscores) would also return "abcX123". How can I enforce "_" as the 4th character?
If you issue the query in persistence, this is actually not an mdriven issue but an SQL issue, as mdriven converts the expression into SQL. So if you really want to restrict the results to literal underscores, take a look at this question:
Why does using an Underscore character in a LIKE filter give me all the results?
The way to escape the underscore may depend on the needs of your SQL database as the different answers indicate.

Postgres: LIKE query against a string returns no results, though results exist

I feel like I'm missing something really simple here. I'm trying to get a list of results from Postgres, and I know the rows exist since I can see them when I do a simple SELECT *. However, when I try to do a LIKE query, I get nothing back.
Sample data:
{
id: 1,
userId: 2,
cache: '{"dataset":"{\"id\":4,\"name\":\"Test\",\"directory\":\"/data/"...'
};
Example:
SELECT cache FROM storage_table WHERE cache LIKE '%"dataset":"{\"id\":4%';
Is there some escaping I need?
The LIKE operator supports escaping of wildcards (e.g. so that you can search for an underscore or a % sign). The default escape character is a backslash, so each \" in your pattern is read as an escaped " and never matches the backslash that is actually stored in the data.
To tell the LIKE operator that you do not want the backslash to act as the escape character, you need to define a different one, e.g. the ~:
SELECT cache
FROM storage_table
WHERE cache LIKE '%"dataset":"{\"id\":4%' ESCAPE '~';
You can use any character that does not appear in your data, e.g. ESCAPE '#' or ESCAPE '§'.
SQLFiddle example: http://sqlfiddle.com/#!15/7703f/1
But you should seriously consider upgrading to a more recent Postgres version that fully supports JSON. Not only will your queries be more robust, they can also be a lot faster due to the ability to index a JSON document.

How to use a text search and parameterized queries in sql without allowing regex injection

So I'm trying to wrap my head around how to combine three conflicting things (bound parameters, regex, partial-match-searching using user input) securely, and I am not sure I have discovered the right/secure way to deal with these things. This shouldn't be an uncommon concern, but the documentation that deals with the intersection of all three security factors for PDO & php is either hard to find or non-existent.
My needs are relatively simple and standard, in order of priority:
I want to prevent sql injection (currently I'm using bound parameters)
I want to prevent regex injection
I want to search with partial matching using user input strings
So for example, I want to allow a user to search through usernames with a case insensitive partial match, e.g.
A search for Xiu will bring up the username Xiu and Xiulu and also xiuislowercase
Currently I have the following statement:
select * from users where username ilike :search_string || '%'
and elsewhere, I use more complex cases using the regex operator similarly to:
select * from users where username ~* :search_string || '%'
Where :search_string is a bound parameter in PHP PDO, and the database is PostgreSQL.
This performs the right search, returns the right results, and I'm reasonably certain that it is proof against sql injection since it's a bound parameter. However, I'm not certain that it is proof against regex injection, and I have no idea how to make it proof against regex injection at the same time as having it be proof against sql-injection.
How would I completely secure it against regex injection as well, using php, PDO, and postgresql?
LIKE does not support regular expressions; it only has the limited pattern-matching metacharacters % and _ (the same is true of ILIKE). So if you escape those two characters, and the backslash escape character itself, with a backslash in the string before you pass it as the parameter value, you should be safe.
<?php
// Prefix %, _ and \ with a backslash so that LIKE treats them literally.
$search_string = preg_replace('/[%_\\\\]/', '\\\\$0', $search_string);
$pdoStmt->execute(array('search_string'=>$search_string));
Alternatively, you could compare a left substring of the username to your input; then it's a comparison against a fixed string, with no metacharacter pattern-matching features.
select * from users where left(username, :search_string_length) = :search_string
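The left-substring comparison could be driven from Go along these lines. This is a sketch under assumptions: it uses the $N placeholder style rather than PDO's named parameters, prefixArgs is a hypothetical helper, and lower() is applied on both sides to keep the case-insensitive behaviour the question asks for (Go's and PostgreSQL's case folding may differ for some non-ASCII characters). left() counts characters, not bytes, so the length is taken in runes.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// prefixQuery compares a fixed-length prefix of username with the search
// string using plain equality, so no character in the input is a wildcard.
const prefixQuery = `SELECT * FROM users WHERE lower(left(username, $1)) = $2`

// prefixArgs (hypothetical helper) returns the two values to bind:
// the prefix length in characters and the lower-cased search string.
func prefixArgs(search string) (int, string) {
	folded := strings.ToLower(search)
	return utf8.RuneCountInString(folded), folded
}

func main() {
	n, s := prefixArgs("Xiu")
	fmt.Println(prefixQuery, n, s)
}
```

Because the input is only ever compared with =, characters such as % and _ need no escaping at all in this variant.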
Re your comment:
The general rule to avoid code injection is: never execute arbitrary user input as code.
This applies to SQL injection of course, which is why we use parameters to force user input to be interpreted as values, and not modify the syntax of the SQL statement.
But it also applies to the "code" in a regular expression string within an SQL operation. A regular expression is itself a type of code logic: a very compact representation of a finite state machine for matching input.
The way to avoid code injection is this: it's okay to let user input choose code (as in whitelisting), but don't let user input be interpreted as code.
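For the ~* case from the question, one way to keep user input from being interpreted as regex code is to quote every metacharacter before binding the value. A minimal sketch in Go, assuming that regexp.QuoteMeta (which targets Go's own regex syntax) escapes a sufficient set of characters for PostgreSQL's regex flavour, where a backslash before a punctuation character makes it literal; literalPrefixRegex is a hypothetical helper:

```go
package main

import (
	"fmt"
	"regexp"
)

// literalPrefixRegex (hypothetical helper) returns a pattern that matches
// values starting with the literal text of input. QuoteMeta puts a
// backslash in front of every regex metacharacter it knows about.
func literalPrefixRegex(input string) string {
	return "^" + regexp.QuoteMeta(input)
}

func main() {
	fmt.Println(literalPrefixRegex("a.b*")) // ^a\.b\*
}
```

The returned string would still be passed as a bound parameter, e.g. to username ~* $1, so SQL injection and regex injection are handled by two separate mechanisms.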

Regular expression to prevent SQL injection

I know I have to escape single quotes, but I was just wondering if there's any other character, or text string I should guard against
I'm working with mysql and h2 database...
If you check the MySQL function mysql_real_escape_string(), which the higher-level language bindings wrap, you'll see that the list of special characters is fairly long:
\
'
"
NUL (ASCII 0)
\n
\r
Control+Z
The higher-level language wrappers, like the PHP one, may also protect the strings from malformed Unicode characters which may end up as a quote.
The conclusion is: do not escape strings yourself, especially with hard-to-debug, hard-to-read, hard-to-understand regular expressions. Use the built-in functions provided, or use parameterized SQL queries (where no parameter can contain anything interpreted as SQL by the engine). This is also stated in the h2 documentation: h2 db sql injection protection.
A simple solution for the problem above is to use a prepared statement:
This will somewhat depend on what type of information you need to obtain from the user. If you are only looking for simple text, then you might as well ignore all special characters that a user might input (if it's not too much trouble)--why allow the user to input characters that don't make sense in your query?
Some languages have functions that will take care of this for you. For example, PHP has the mysql_real_escape_string() function (http://php.net/manual/en/function.mysql-real-escape-string.php).
You are correct that single quotes (') in user input are a no-no, but double quotes (") and backslashes (\) should definitely be guarded against as well (see the above link for which characters the PHP function escapes, since those are the most important and basic ones).
Hope this is at least a good start!

What's an alternative to having to use SQLite prepared statements to escape characters

I'm working on an iPhone app which uses the SQLite database, and I'm trying to handle escape characters. I know that there is LIKE ESCAPE to handle escape characters in SELECT statements, but in my application I have SELECT, INSERT and UPDATE actions, and I really don't know how to go about handling escape characters.
If you are using the sqlite3_exec() all-in-one function, you don't need to use the sqlite3_bind* functions.
Just pass the string to sqlite3_mprintf() with the %q token:
sqlite3_mprintf("%q","it's example");
and the output string is
it''s example
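To make the effect of %q concrete, here is the same substitution sketched in Go: each single quote is doubled and nothing else is changed. quoteSQLite is a hypothetical name, and the function only illustrates the transformation; it is not a replacement for sqlite3_mprintf() or for parameter binding.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteSQLite (hypothetical helper) doubles every single quote so the
// text can sit inside a single-quoted SQL string literal.
func quoteSQLite(s string) string {
	return strings.ReplaceAll(s, "'", "''")
}

func main() {
	fmt.Println(quoteSQLite("it's example")) // it''s example
}
```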
Use FMDB, and then you won't have to. It has built-in parameter binding support, and that will take care of any escaping you need for you.
I believe you simply have to tell SQLite what your escape character is at the end of the SQL statement. For example:
SELECT * FROM MyTable WHERE RevenueChange LIKE '%0\%' ESCAPE '\'
The LIKE will match values such as 30%, 140%, etc. The escape character I used, \, could be any other character.