Using arrays with pg-promise - pg-promise

I'm using pg-promise and am not understanding how to run this query. The first query works, but I would like to use pg-promise's safe character escaping, and then I try the second query it doesn't work.
Works:
db.any(`SELECT title FROM books WHERE id = ANY ('{${ids}}') ORDER BY id`)
Doesn't work
db.any(`SELECT title FROM books WHERE id = ANY ($1) ORDER BY id`, ids)

The example has 2 problems. First, it goes against what the documentation tells you:
IMPORTANT: Never use the reserved ${} syntax inside ES6 template strings, as those have no knowledge of how to format values for PostgreSQL. Inside ES6 template strings you should only use one of the 4 alternatives - $(), $<>, $[] or $//.
Manual query formatting, like in your first example, is a very bad practice, resulting in bad things, ranging from broken queries to SQL injection.
And the second issue is that after switching to the correct SQL formatting, you should use the CSV Filter to properly format the list of values:
db.any(`SELECT title FROM books WHERE id IN ($/ids:csv/) ORDER BY id`, {ids})
or via an index variable:
db.any(`SELECT title FROM books WHERE id IN ($1:csv) ORDER BY id`, [ids])
Note that I also changed from ANY to IN operand, as we are providing a list of open values here.
And you can use filter :list interchangeably, whichever you like.

Related

Postgres fuzzy array intersection

I'm using Postgresql 13 and my problem was easily solved with #> operator like this:
select id from documents where keywords #> '{"winter", "report", "2020"}';
meaning that keywords array should contain all these elements. Also I've created a GIN index on this column.
Is it possible to achieve similar behavior even if I provide my request like '{"re", "202", "w"}' ? I heard that ngrams have semantics like this, but "intersection" capabilities of arrays are crucial for me.
In your example, the matches are all prefixes. Is that the general rule here? If so, you would probably went to use the match feature of full text search, not trigrams. It would require you reformat your data, or at least your query.
select * from
(values (to_tsvector('simple','winter report 2020'))) f(x)
where x## 're:* & 202:* & w:*'::tsquery;
If the strings can contain punctuation which you want preserved, you would need to take pains to properly format them into a quoted tsvector yourself rather than just letting to_tsvector deal with it. Using 'simple' config gets rid of the stemming and stop word removal features, which would interfere with what you want to do.

ormlite select count(*) as typeCount group by type

I want to do something like this in OrmLite
SELECT *, COUNT(title) as titleCount from table1 group by title;
Is there any way to do this via QueryBuilder without the need for queryRaw?
The documentation states that the use of COUNT() and the like necessitates the use of selectRaw(). I hoped for a way around this - not having to write my SQL as strings is the main reason I chose to use ORMLite.
http://ormlite.com/docs/query-builder
selectRaw(String... columns):
Add raw columns or aggregate functions
(COUNT, MAX, ...) to the query. This will turn the query into
something only suitable for using as a raw query. This can be called
multiple times to add more columns to select. See section Issuing Raw
Queries.
Further information on the use of selectRaw() as I was attempting much the same thing:
Documentation states that if you use selectRaw() it will "turn the query into" one that is supposed to be called by queryRaw().
What it does not explain is that normally while multiple calls to selectColumns() or selectRaw() are valid (if you exclusively use one or the other),
use of selectRaw() after selectColumns() has a 'hidden' side-effect of wiping out any selectColumns() you called previously.
I believe that the ORMLite documentation for selectRaw() would be improved by a note that its use is not intended to be mixed with selectColumns().
QueryBuilder<EmailMessage, String> qb = emailDao.queryBuilder();
qb.selectColumns("emailAddress"); // This column is not selected due to later use of selectRaw()!
qb.selectRaw("COUNT (emailAddress)");
ORMLite examples are not as plentiful as I'd like, so here is a complete example of something that works:
QueryBuilder<EmailMessage, String> qb = emailDao.queryBuilder();
qb.selectRaw("emailAddress"); // This can also be done with a single call to selectRaw()
qb.selectRaw("COUNT (emailAddress)");
qb.groupBy("emailAddress");
GenericRawResults<String[]> rawResults = qb.queryRaw(); // Returns results with two columns
Is there any way to do this via QueryBuilder without the need for queryRaw(...)?
The short answer is no because ORMLite wouldn't know what to do with the extra count value. If you had a Table1 entity with a DAO definition, what field would the COUNT(title) go into? Raw queries give you the power to select various fields but then you need to process the results.
With the code right now (v5.1), you can define a custom RawRowMapper and then use the dao.getRawRowMapper() method to process the results for Table1 and tack on the titleCount field by hand.
I've got an idea how to accomplish this in a better way in ORMLite. I'll look into it.

How do I prevent sql injection if I want to build a query in parts within the fatfree framework?

I am using the fatfree framework, and on the front-end I am using jQuery datatables plugin with server-side processing. And thus, my server-side controller may or may not receive a variable number of information, for example a variable number of columns to sort on, a variable number of filtering options and so forth. So if I don't receive any request for sorting, I don't need to have a ORDER BY portion in my query. So I want to generate the query string in parts as per certain conditions and join it at the end to get the final query for execution. But if I do it this way, I won't have any data sanitization which is really bad.
Is there a way I can use the frameworks internal sanitization methods to build the query string in parts? Also is there a better/safer way to do this than how I am approaching it?
Just use parameterized queries. They are here to prevent SQL injection.
Two possible syntaxes are allowed:
with question mark placeholders:
$db->exec('SELECT * FROM mytable WHERE username=? AND category=?',
array(1=>'John',2=>34));
with named placeholders:
$db->exec('SELECT * FROM mytable WHERE username=:name AND category=:cat',
array(':name'=>'John',':cat'=>34));
EDIT:
The parameters are here to filter the field values, not the column names, so to answer more specifically to your question:
you must pass filtering values through parameters to avoid SQL injection
you can check if column names are valid by testing them against an array
Here's a quick example:
$columns=array('category','age','weight');//columns available for filtering/sorting
$sql='SELECT * FROM mytable';
$params=array();
//filtering
$ctr=0;
if (isset($_GET['filter']) && is_array($_GET['filter'])
foreach($_GET['filter'] as $col=>$val)
if (in_array($col,$columns,TRUE)) {//test for column name validity
$sql.=($ctr?' AND ':' WHERE ')."$col=?";
$params[$ctr+1]=$val;
$ctr++;
}
//sorting
$ctr=0;
if (isset($_GET['sort']) && is_array($_GET['sort'])
foreach($_GET['sort'] as $col=>$asc)
if (in_array($col,$columns,TRUE)) {//test for column name validity
$sql.=($ctr?',':' ORDER BY ')."$col ".($asc?'ASC':'DESC');
$ctr++;
}
//execution
$db->exec($sql,$params);
NB: if column names contain weird characters or spaces, they must be quoted: $db->quote($col)

Dynamic WHERE Clause & SQL Injection

I need to create functionality for users to determine the WHERE criteria of a select - the criteria will be dynamic.
Is there a way I can achieve this without opening up my code to SQL injection?
I'm using C# / .NET Windows Application.
Using parameterized queries would go long way toward protecting you from SQL injection attacks, because most bad things happen in the value portion of your where conditions.
For exampleg given a condition a=="hello" && b=="WORLD", do this:
select a,b,c,d
from table
where a=#pa and b=#pb -- this is generated dynamically
Then, bind #pa="hello" and #pb="WORLD", and run your query.
In C#, you would start with an in-memory representation of your where clause in hand, go through it element-by-element, and produce two output objects:
A string with the where clause, where constants are replaced by automatically generated parameter references pa, pb, and so on (use your favorite naming scheme for these blind parameters: the actual names do not matter)
A dictionary of name-value pairs, where names correspond to the parameters that you've inserted in your where clause, and values that correspond to the constants that you pulled from the expression representation.
With these outputs in hand, you prepare your dynamic query using the string, add parameter values using the dictionary, and then execute the query against your RDBMS source.
DO NOT DO THIS
select a,b,c,d
from table
where a='hello' and b='WORLD' -- This dynamic query is ripe for an interjection attack
Ah two phases. Given you column names and operators are not direct user input. E.g. picked from a list or radio group etc
then
String WhereClause = String.Format("Where {0} {1} #{0}","Customer", "=");
So now you Have "Where Customer = #Customer".
Then you can add aparamer Customer and set it from the user input.
There are a few ways to attack this, depends on how complex your criteria could be though.

Parameterized SQL Columns?

I have some code which utilizes parameterized queries to prevent against injection, but I also need to be able to dynamically construct the query regardless of the structure of the table. What is the proper way to do this?
Here's an example, say I have a table with columns Name, Address, Telephone. I have a web page where I run Show Columns and populate a select drop-down with them as options.
Next, I have a textbox called Search. This textbox is used as the parameter.
Currently my code looks something like this:
result = pquery('SELECT * FROM contacts WHERE `' + escape(column) + '`=?', search);
I get an icky feeling from it though. The reason I'm using parameterized queries is to avoid using escape. Also, escape is likely not designed for escaping column names.
How can I make sure this works the way I intend?
Edit:
The reason I require dynamic queries is that the schema is user-configurable, and I will not be around to fix anything hard-coded.
Instead of passing the column names, just pass an identifier that you code will translate to a column name using a hardcoded table. This means you don't need to worry about malicious data being passed, since all the data is either translated legally, or is known to be invalid. Psudoish code:
#columns = qw/Name Address Telephone/;
if ($columns[$param]) {
$query = "select * from contacts where $columns[$param] = ?";
} else {
die "Invalid column!";
}
run_sql($query, $search);
The trick is to be confident in your escaping and validating routines. I use my own SQL escape function that is overloaded for literals of different types. Nowhere do I insert expressions (as opposed to quoted literal values) directly from user input.
Still, it can be done, I recommend a separate — and strict — function for validating the column name. Allow it to accept only a single identifier, something like
/^\w[\w\d_]*$/
You'll have to rely on assumptions you can make about your own column names.
I use ADO.NET and the use of SQL Commands and SQLParameters to those commands which take care of the Escape problem. So if you are in a Microsoft-tool environment as well, I can say that I use this very sucesfully to build dynamic SQL and yet protect my parameters
best of luck
Make the column based on the results of another query to a table that enumerates the possible schema values. In that second query you can hardcode the select to the column name that is used to define the schema. if no rows are returned then the entered column is invalid.
In standard SQL, you enclose delimited identifiers in double quotes. This means that:
SELECT * FROM "SomeTable" WHERE "SomeColumn" = ?
will select from a table called SomeTable with the shown capitalization (not a case-converted version of the name), and will apply a condition to a column called SomeColumn with the shown capitalization.
Of itself, that's not very helpful, but...if you can apply the escape() technique with double quotes to the names entered via your web form, then you can build up your query reasonably confidently.
Of course, you said you wanted to avoid using escape - and indeed you don't have to use it on the parameters where you provide the ? place-holders. But where you are putting user-provided data into the query, you need to protect yourself from malicious people.
Different DBMS have different ways of providing delimited identifiers. MS SQL Server, for instance, seems to use square brackets [SomeTable] instead of double quotes.
Column names in some databases can contain spaces, which mean you'd have to quote the column name, but if your database contains no such columns, just run the column name through a regular expression or some sort of check before splicing into the SQL:
if ( $column !~ /^\w+$/ ) {
die "Bad column name [$column]";
}